Saturday, February 14, 2015

Graph Elertus and Google Nest in Nagios/check_mk

I recently got turned on to OMD/check_mk, which is probably the most user-friendly tool for Nagios. It has a nearly complete Nagios stack, and more:

  • Nagios
  • Monitoring Plugins
  • check_nrpe
  • mrpe (a check_mk clone of nrpe)
  • Icinga
  • Shinken
  • NagVis
  • pnp4nagios
  • rrdtool/rrdcached
  • Check_MK
  • MK Livestatus
  • Multisite
  • Dokuwiki
  • Thruk
  • Mod-Gearman
  • check_logfiles
  • check_oracle_health
  • check_mysql_health
  • jmx4perl
  • check_webinject
  • check_multi


It has a very simple install and works with Debian/Ubuntu/Red Hat/SUSE. I recently deleted my Icinga build and went all in with this.

As well as regular servers, VMware and other things, I'm able to graph Google Nest's temperature, humidity and "leaf" status. You can set up alerts if a reading crosses a certain threshold and keep graphs for as many years as you wish.

Check_MK with Google Nest

This is sorely lacking on Google's current site, which stores graphing information for only 10 days and doesn't provide the granular level of detail I want.

Google Nest's graphs.

I won't go into much detail on installing OMD (it's pretty straightforward) or on the Nest integration (there is a great writeup on how to do that elsewhere). Instead, I'll show a way to get another product, the Elertus, monitored with check_mk as an example of what you can do with it.

The Elertus is a wifi temperature, sound, light and water detector that runs on batteries. I wrote an article about it on this blog before. It's a little trickier to monitor than the Nest because there is no API and no way to scrape information off the device, since it only makes outbound connections.

Since my last post, last year, nothing has changed with it, nor has my opinion. It's still sluggish, with no graphing, no "all clear" alerts... and a painfully slow (and useless) app to complement it. The plus side is that it's been almost a year and the batteries still work (AA batteries). I have had no connection issues with it as far as I can tell.

Other than an alerts tab, there's not much more to the Elertus website


When I had my Icinga server, I stood up another box that basically sniffed the traffic while it was in transit to their servers (which is still sent in cleartext, btw). I cut out the bits and pieces I needed to get a basic graph up. Since I started this server, I decided to try a new method.

Using pfSense, I created a static DHCP mapping for the device and an internal NAT rule to relay the traffic from the Elertus to my own web server running Apache with mod_dumpio turned on.

While this basically kills any communication with the Elertus servers, they won't be missed. With my prior setup I was able to enjoy both my internally generated alerts and their alerts, but I found my own to be a lot more useful.

The Elertus sends out a POST to their servers as a check in.

device_type=1&posix_time=1423925134&email_id=myemail@gmail.com&mac_address=000680000000&alert_flags=&light=14&temp=298&humidity=30&battery=70&motion=0&int_contact=1&ext_contact=1&ext_temp=-1&fw_ver=4.0.1_EL_v7&debug=rssi:46, ant:I, af:, pkt:l14_t298_h30_b70_m0_i1_e1_x-1_p1423925134, wdog:1, crtry:4, queue:3, ctime:w2285_d410_n130_s205_t3040, &


Apache has a virtualhost set up to receive those incoming PHP POST requests with dumpio enabled and the trace level set to 7.


<IfModule dumpio_module>
 DumpIOInput On
 DumpIOOutput On
 LogLevel dumpio:trace7
</IfModule>

I also made sure to set custom logs just for this module, as they fill up quickly.

ErrorLog ${APACHE_LOG_DIR}/dumpio_module_error.log
CustomLog ${APACHE_LOG_DIR}/dumpio_module_access.log combined

The requests come in two at a time, sometimes more if there is any movement or light. Since there is no consistent interval at which to check the logs, I set up a cron job that runs every 5 minutes and pulls the newest log line containing POST data.

#!/bin/bash

tac /var/log/apache2/dumpio_module_error.log | grep -m1 "email" > /tmp/tempout.txt

~/scripts/temp.sh
~/scripts/humidity.sh
~/scripts/battery.sh

The scripts it calls, in turn, pull information out of the latest POST and write the raw values to separate files. I've kept it modular so I can add more checks as I need them. I'm not too concerned about the water, movement and light alerts just yet.

The temperature is in Kelvin, so I converted it to Fahrenheit.

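#!/bin/bash

# Pull the raw Kelvin reading (the 8th "="-separated field) out of the latest POST
tempk=$(awk -F "=" '/light/ {print $8}' /tmp/tempout.txt | sed 's/&.*//')
# Kelvin to Celsius
tempb=$(awk "BEGIN {print $tempk - 273.15}")
# Celsius to Fahrenheit
temp=$(echo "$tempb*1.8+32" | bc)

echo "$temp" > ~/perfdata/temp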
(I know, this can be cleaned up a lot)
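
One possible cleanup, as a rough sketch rather than what actually runs on my server: do the whole Kelvin to Fahrenheit conversion in a single awk call. The function name k_to_f is made up for this example.

```shell
#!/bin/bash

# k_to_f is a hypothetical helper: convert a Kelvin reading to Fahrenheit,
# rounded to two decimal places.
k_to_f() {
    awk -v k="$1" 'BEGIN { printf "%.2f\n", (k - 273.15) * 1.8 + 32 }'
}

# The sample POST above reports temp=298
k_to_f 298   # prints 76.73
```

Passing the value in with -v keeps it out of the awk program text, so an empty or odd reading can't break the awk syntax.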

The battery and humidity scripts are pretty much the same thing with different names. Each just requires a different {print $X} position. The light, movement and water sensors can be added just as easily, since their values are only ever 0 or 1.

Now, unlike with plain Nagios, adding devices to check_mk is a breeze. Making custom checks with RRD graphing is just as easy. You add the custom checks to the host itself, not the monitoring server.

This web server runs Ubuntu and has the check_mk agent installed. There is a folder where you can put custom scripts to make local checks, on top of what it already monitors.
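
Counting {print $X} positions gets fiddly as more sensors are added, so here is a hedged sketch of looking a value up by its key name instead. It assumes the POST body is available as one line of key=value pairs joined by "&", as in the sample above; get_field is a hypothetical name. In the real log, any prefix in front of the body should only affect the first key.

```shell
#!/bin/bash

# get_field is a hypothetical helper: print the value of one key from a
# URL-encoded body such as "light=14&temp=298&humidity=30".
get_field() {
    local key="$1" body="$2"
    # Put each key=value pair on its own line, then print everything
    # after the first "=" for the pair whose key matches exactly.
    printf '%s\n' "$body" | tr '&' '\n' |
        awk -F '=' -v k="$key" '$1 == k { print substr($0, length(k) + 2); exit }'
}

body='device_type=1&light=14&temp=298&humidity=30&battery=70&ext_temp=-1'
get_field humidity "$body"   # prints 30
get_field battery "$body"    # prints 70
```

Because the key has to match exactly, asking for temp does not accidentally pick up ext_temp.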

/usr/lib/check_mk_agent/local/

The setup is pretty straightforward if you want monitoring with performance graphs.

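#!/bin/bash

# Read the latest Fahrenheit value written by the cron job
TEMP1=$(cat ~/perfdata/temp)

echo '<<<local>>>'
# Status P, service name, then variable=value;WARN;CRIT;min;max
echo "P Temperature temp=${TEMP1};35:89;32:91;0;110"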

Any script in this folder runs whenever the check_mk agent runs, alongside the standard system-wide readings it already collects.

The output appends itself to the bottom of the generated file, as a local check like this:

<<<local>>>
P Temperature temp=73.13;35:89;32:91;0;110

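This tells check_mk that a service is available to be added, with graphing. The line breaks down as follows:

  • P tells check_mk to work out the service state from the value and the thresholds that follow
  • Temperature is the name of the service
  • temp= is the variable name
  • The rest is the output from $TEMP1, then the WARN lower:upper range, the CRIT lower:upper range, and the minimum and maximum of the value (these last two are for graphing reasons and not required)

Notice that the colons and semicolons are there for a reason: semicolons separate the fields, and a colon separates the lower and upper bound within a range.

When you scan the host in WATO on the main check_mk site, the new service should show up with basic performance graphs and be automatically added to the notification ruleset the host is part of.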
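
To make the field order concrete, here is a small sketch that builds the same kind of line for more than one sensor. The helper name local_line and the humidity thresholds are invented for illustration. The temperature numbers are the ones from my check above.

```shell
#!/bin/bash

# local_line is a hypothetical helper that prints one local check line.
# Arguments: service name, variable name, value, WARN range, CRIT range,
# graph minimum, graph maximum.
local_line() {
    printf 'P %s %s=%s;%s;%s;%s;%s\n' "$1" "$2" "$3" "$4" "$5" "$6" "$7"
}

# Sample values; in practice they would be read from the files in ~/perfdata/
local_line Temperature temp 73.13 35:89 32:91 0 110
# Made-up humidity thresholds, purely as an example
local_line Humidity humidity 30 20:60 15:70 0 100
```

The first call prints the same line as the Temperature check shown above.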



While it's not an ideal solution, it gets the job done. I wish Elertus would open their API up to developers. They would probably sell a lot more units if they did.

I recommend anyone who's given up on Nagios/Icinga to give OMD a try. It's pretty good. The documentation is a little poor (and mostly in German), but with Google Translate and some coffee, you can get through it if you've ever set up any other monitoring before.
