As you might have noticed, we're quite big fans of Salt. One of the things Salt enables us to do is apply to our infrastructure the practices we're used to applying to code. Let's look at TDD (Test Driven Development).

Write the test first, make it fail, implement the code, test goes green, you're done.

Apply the same thing to infrastructure and you get TDI (Test Driven Infrastructure).

So before you deploy a service, you make sure that your supervision (Shinken, Nagios, Icinga, Salt-based monitoring, etc.) is running the right check, then you deploy and your supervision goes green.

Let's take a look at website supervision. At Logilab we weren't too satisfied with how our Shinken/http_check setup was working, so we started using Uptime (Node.js + MongoDB). Uptime has a simple REST API to get and add checks, so we wrote a Salt execution module and a state module for it.
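
To make the idea concrete, here is a minimal sketch of the idempotent logic such a state module needs: look at the existing checks, and add one only if the URL is not monitored yet. This is not the actual module. The `FakeUptime` class, its method names and the keys of the returned dict are assumptions made for illustration, and the in-memory list stands in for Uptime's REST API.

```python
class FakeUptime:
    """In-memory stand-in for the Uptime REST API (hypothetical, for illustration)."""

    def __init__(self):
        self._checks = []

    def list_checks(self):
        # A real client would fetch the list of checks over HTTP.
        return list(self._checks)

    def add_check(self, url):
        # A real client would send the new check to the API.
        self._checks.append({"url": url})


def monitored(client, url):
    """Ensure `url` has a check and return a Salt-style result dict."""
    ret = {"name": url, "changes": {}, "result": True, "comment": ""}
    if any(check["url"] == url for check in client.list_checks()):
        ret["comment"] = "URL is already monitored"
        return ret
    client.add_check(url)
    ret["changes"] = {"url monitored": url}
    ret["comment"] = "Check added"
    return ret


client = FakeUptime()
first = monitored(client, "http://www.example.org")   # adds the check
second = monitored(client, "http://www.example.org")  # nothing left to do
```

Running the state twice only creates one check, which is what lets you apply it on every highstate without piling up duplicates.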

For the sites that use the apache-formula, we simply loop over the domains declared in the pillars to add checks:

{# assumes the uptime state module exposes a 'monitored' function #}
{% for domain in salt['pillar.get']('apache:sites', {}).keys() %}
uptime {{ domain }} (http):
  uptime.monitored:
    - name: http://{{ domain }}
{% endfor %}

For other URLs (specific URLs such as sitemaps), we can list them in pillars and do:

{# assumes the uptime state module exposes a 'monitored' function #}
{% for url in salt['pillar.get']('uptime:urls', []) %}
uptime {{ url }}:
  uptime.monitored:
    - name: {{ url }}
{% endfor %}
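
For readers who prefer to see what these loops expand to, the following sketch builds the same state data in plain Python. It assumes the state function is called `uptime.monitored`, and the pillar values are made-up examples shaped like the `apache:sites` and `uptime:urls` keys used above. A dict of this shape is what a `run()` function would return in an SLS file written for Salt's `#!py` renderer.

```python
def uptime_states(sites, urls):
    """Build state data: one uptime check per Apache site and per extra URL."""
    states = {}
    # One HTTP check per domain declared for Apache.
    for domain in sorted(sites):
        states["uptime {} (http)".format(domain)] = {
            "uptime.monitored": [{"name": "http://{}".format(domain)}]
        }
    # One check per explicitly listed URL (sitemaps and so on).
    for url in urls:
        states["uptime {}".format(url)] = {"uptime.monitored": [{"name": url}]}
    return states


# Hypothetical pillar data, for illustration only.
pillar = {
    "apache": {"sites": {"www.example.org": {}, "doc.example.org": {}}},
    "uptime": {"urls": ["http://www.example.org/sitemap.xml"]},
}
states = uptime_states(pillar["apache"]["sites"], pillar["uptime"]["urls"])
```

Each key of `states` is a state ID, so adding a site to the pillar is enough to get a new check on the next run.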

That's it. Monitoring comes before deployment.

We've also contributed a formula for deploying uptime.

Follow us if you are interested in Test Driven Infrastructure: we intend to write regular reports as we make progress exploring this new domain.
