Parallel SSH and system monitoring in Clojure

During my 10% time, I created two simple Clojure tools to help with basic sysadmin tasks. Today I’m open-sourcing them on GitHub and Clojars.

parallel-ssh

The first tool I built is a library for running commands in parallel on multiple servers. It takes a Bash command and a comma-separated list of server names to run the command on. Internally, one Clojure agent is spawned per server, and each agent is responsible for running the command and storing the result. After all the agents have completed, or a specified timeout is reached, the agents are dereferenced and their output is returned. Currently, I just shell out to ssh and assume that password-less login is available.
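
The agent-per-server approach described above can be sketched roughly as follows. This is a minimal illustration, not the parallel-ssh source: the names `ssh-run` and `run-on-servers`, the `:timeout` marker, and the injectable `run-fn` argument are all assumptions made for this example.

```clojure
(require '[clojure.java.shell :refer [sh]]
         '[clojure.string :as str])

(defn ssh-run
  "Shells out to the system ssh client; assumes password-less login works."
  [server command]
  (sh "ssh" server command))

(defn run-on-servers
  "Runs `command` on every server named in the comma-separated string
  `servers-csv`, using one agent per server. Waits up to `timeout-ms`,
  then returns a map of server name -> result. Servers whose agent has
  not finished by then map to :timeout (a marker invented for this sketch)."
  ([command servers-csv timeout-ms]
   (run-on-servers ssh-run command servers-csv timeout-ms))
  ([run-fn command servers-csv timeout-ms]
   (let [servers (->> (str/split servers-csv #",")
                      (map str/trim)
                      (remove str/blank?))
         ;; One agent per server, starting in a sentinel state.
         agents  (into {} (for [s servers] [s (agent ::pending)]))]
     (doseq [[server a] agents]
       ;; send-off uses the unbounded pool meant for blocking I/O.
       (send-off a (fn [_]
                     (try
                       (run-fn server command)
                       ;; Catch errors so the agent never enters a failed state.
                       (catch Exception e
                         {:error (.getMessage e)})))))
     ;; Block until every agent is done or the timeout elapses.
     (apply await-for timeout-ms (vals agents))
     (into {} (for [[server a] agents]
                [server (let [v @a] (if (= v ::pending) :timeout v))])))))
```

With this sketch, `(run-on-servers "df -h" "web1,web2" 5000)` would return a map from each server name to the map produced by `sh`. In a standalone script you would also call `(shutdown-agents)` at the end so the agent thread pools do not delay JVM exit.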

This library also has a command line interface:

I found it useful to wrap that in a Bash script that runs a command on all of our servers. This is clearly not a replacement for sophisticated server management tools like Puppet, but it is helpful when you want a quick answer to a question such as “How much disk space is free?” or “How many servers is this process running on?”

server-stats

The second tool is a micro-framework built on top of parallel-ssh. Similar to Python’s Fabric, it allows the user to define custom commands to be run on a specified group of servers. It can also respond to the results of a command based on custom triggers. Here’s an example configuration file:

First we define the ssh username to be used and our server groupings:

Now we can start adding commands. Here we add a command called ‘top’ that will be run only on web-servers and app-servers:

Note that the doc string is used in the auto-generated usage page, so you should never have to open the config file to figure out what a command does. We can now run this command from the command line:

We can make things a little more interesting by adding alert triggers. First we need to define an alert handler function. An alert handler takes three arguments: the alert message, the name of the server, and the output from the command that was run. Here we add a handler called ‘email’ that will send us an email when a trigger condition is met:

Now let’s define a command and trigger that will use this alert:

This command has an extra field called ‘alerts’; this is an array of trigger conditions for this command. The command ‘disk’ has only one trigger, which states “when the Use% column of ‘df -ah’ is greater than 85%, send an email with the message ‘Disk space over 85% full’”. Here’s a breakdown of an alert:

‘column’ is used for commands that return column-formatted output (e.g. df, iostat, top), and it instructs server-stats to look at a specific column for the value. If it is not specified, server-stats assumes the command output is a scalar value.
‘value-type’ tells server-stats how to parse the command result string into a Clojure value. Right now there are only three possible value types: percent, bool, and number.
‘handlers’ is a vector of alert handlers to call when this condition is met. In this case, it is just the email handler.
‘msg’ is the alert message that gets passed as the first argument to the handler function.
Finally, ‘trigger’ defines the condition that has to be met. It is a tuple of a comparison operator and a value to compare against.
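
To make the fields above concrete, here is a rough sketch of how an alert like this could be evaluated. It is not the server-stats implementation or its config syntax: the function names (`parse-value`, `column-values`, `check-alert`), the use of keywords for value types, and the use of a plain Clojure function as the trigger operator are all assumptions made for illustration. It also assumes the first line of column-formatted output is the header row.

```clojure
(require '[clojure.string :as str])

(defn parse-value
  "Parses a raw string as :percent, :number or :bool.
  Returns nil when a numeric value cannot be parsed (e.g. \"-\")."
  [value-type raw]
  (let [s (str/trim raw)]
    (try
      (case value-type
        :percent (Double/parseDouble (str/replace s "%" ""))
        :number  (Double/parseDouble s)
        :bool    (contains? #{"true" "yes" "1"} (str/lower-case s)))
      (catch NumberFormatException _ nil))))

(defn column-values
  "Returns the raw strings found under the header `column` in
  whitespace-separated output whose first line is the header row."
  [output column]
  (let [[header & rows] (->> (str/split-lines output)
                             (remove str/blank?)
                             (map #(str/split (str/trim %) #"\s+")))
        idx (.indexOf ^java.util.List (vec header) column)]
    (if (neg? idx)
      (throw (ex-info "Column not found" {:column column}))
      ;; Rows shorter than the header simply contribute nothing.
      (keep #(get % idx) rows))))

(defn check-alert
  "Evaluates one alert map against `output` from `server`. When any value
  satisfies the trigger, calls each handler with (msg server output).
  Returns true if the alert fired, false otherwise."
  [{:keys [column value-type trigger handlers msg]} server output]
  (let [[op threshold] trigger
        raws   (if column (column-values output column) [output])
        values (keep #(parse-value value-type %) raws)
        fired? (boolean (some #(op % threshold) values))]
    (when fired?
      (doseq [handler handlers]
        (handler msg server output)))
    fired?))

;; Hypothetical alert mirroring the 'disk' example described above.
(def disk-alert
  {:column     "Use%"
   :value-type :percent
   :trigger    [> 85]
   :handlers   [(fn [msg server _output] (println server "-" msg))]
   :msg        "Disk space over 85% full"})
```

Calling `(check-alert disk-alert "web1" df-output)` with the text output of `df -ah` would invoke the handler once if any filesystem is above 85% full. When ‘column’ is omitted, the whole output is parsed as a single scalar value.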

Additionally, you can define a global function to be called whenever a command cannot be completed successfully for some reason (e.g. the server times out). Here we send a text message using Twilio whenever that happens:

conclusion

Currently this is just used for basic server monitoring, but it could easily be used for much more advanced reactive behavior. Building this in Clojure was a lot of fun and pretty easy, since Clojure has macros, easy-to-use concurrency, higher-order functions, and full access to Java libraries. The one downside of using Clojure on the command line is that you have to eat the JVM startup time on every run.

Chris McBride | March 15, 2012

resources

http://clojars.org/server-stats

http://clojars.org/parallel-ssh

https://github.com/RJMetrics/Parallel-SSH

https://github.com/RJMetrics/Server-Stats
