Nagios Check - check_hw_sensors

The check_hw_sensors plugin for Nagios monitors sysctl hw.sensors on OpenBSD.

With the new sensor framework in OpenBSD 3.9, I wanted to be able to monitor the new hw.sensors from Nagios, and this plugin is the result. It is still a work in progress, although it does seem to work just fine; I don't yet know how reliable it is. The documentation is thinner than I would like, so if there are places that need clarification, please let me know! I would be happy to accept patches. Send them to andrew+nagios@rraz.net.

What I think is really cool about this plugin is that it can use the same sensorsd.conf as sensorsd, so the two can easily be kept in sync. Since Nagios supports both warning and critical alerts, it turned out to be really handy that sensorsd ignores any additional capabilities in the file. The additional capabilities check_hw_sensors supports are described below. If you have an /etc/sensorsd.conf with the checks you want, running the plugin can be as simple as 'check_hw_sensors -f'.

Download the current version here

Please be sure to support the OpenBSD project by purchasing CDs, T-shirts, or making a donation.
These finances ensure that OpenBSD will continue to exist, and will remain free for everyone to use and reuse as they see fit.

    check_hw_sensors (-f [<FILENAME>]|(-s <hw.sensors id> -w limit -c limit))

Usage:
    -f, --filename=FILE
        FILE to load checks from (defaults to /etc/sensorsd.conf)
    -s, --sensor=ID
        ID of a single sensor.  "-s 0" means hw.sensors.0.
    -w, --warning=RANGE or single ENTRY
        Exit with WARNING status if outside of RANGE or if != ENTRY
    -c, --critical=RANGE or single ENTRY
        Exit with CRITICAL status if outside of RANGE or if != ENTRY

    -h (--help)       usage help
        

FILE is in the same format as sensorsd.conf(5) plus some additional entries. These additional entries in the file are ignored by sensorsd(8).

check_hw_sensors understands the following entries:
low, high, crit, warn, crit.low, crit.high, warn.low, warn.high
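
To illustrate the file layout, here is a minimal Python sketch (not the plugin's own code) that splits one line of the kind shown in the example file further down into a sensor name and its entries. The function name parse_sensorsd_line is hypothetical, and the sketch assumes that fields are separated by colons and that each entry is written as name=value. It does not attempt to handle a RANGE value that itself contains a colon.

```python
def parse_sensorsd_line(line):
    """Split one sensorsd.conf-style line into (sensor, {entry: value}).

    Assumption: fields are colon separated and entries look like name=value.
    """
    line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
    if not line:
        return None  # blank or comment-only line
    sensor, *fields = line.split(":")
    entries = {}
    for field in fields:
        key, sep, value = field.partition("=")
        entries[key] = value if sep else None  # entry without a value
    return sensor, entries


sensor, entries = parse_sensorsd_line(
    "hw.sensors.4:high=1.85V:warn.high=1.8V:low=1.60V:warn.low=1.65V"
)
print(sensor)
for name, value in entries.items():
    print(" ", name, "=", value)
```

Keeping the values as plain strings leaves the interpretation (temperature, voltage, indicator state) to a later step.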

The form of an ENTRY depends on the sensor type. The descriptions in sensorsd.conf(5) can be used when appropriate, or you can use the following:

The 'crit' and 'warn' entries (or -c and -w on the command line) may be a RANGE or a comma-separated list of acceptable values. The list names the values that will NOT cause the status; any value that is not in the list does. This is possibly counterintuitive, but you are more likely to know good values than bad values.
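
The "list of good values" rule can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not the plugin's implementation; status_for_list is a hypothetical name and exact string comparison is an assumption.

```python
def status_for_list(value, acceptable, status):
    """Return "OK" if value is in the comma-separated acceptable list,
    otherwise return the given status (for example "WARNING")."""
    allowed = [item.strip() for item in acceptable.split(",")]
    return "OK" if value in allowed else status


# Entries taken from the example file: crit=online,empty and warn=online
print(status_for_list("online", "online,empty", "CRITICAL"))  # OK
print(status_for_list("empty", "online,empty", "CRITICAL"))   # OK
print(status_for_list("empty", "online", "WARNING"))          # WARNING
```

With those two entries, an empty drive bay raises a warning but never a critical alert.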

A RANGE is a low ENTRY and a high ENTRY separated by a colon (:). Either side may be left blank, as in low: or :high, to check only the one limit.
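
As a rough sketch of the RANGE syntax, the following Python parses low:high, low: and :high and tests whether a reading falls outside. The names parse_range and outside are hypothetical. The sketch assumes plain numeric entries without unit suffixes, and it treats a reading exactly on a limit as inside the range, which may not match what the plugin does.

```python
def parse_range(text):
    """Parse 'low:high', 'low:' or ':high' into (low, high).

    A blank side becomes None, meaning that side is not checked.
    """
    low, sep, high = text.partition(":")
    if not sep:
        raise ValueError("not a RANGE: %r" % text)
    return (float(low) if low else None, float(high) if high else None)


def outside(value, bounds):
    """True if value is below the low limit or above the high limit."""
    low, high = bounds
    return (low is not None and value < low) or (high is not None and value > high)


print(parse_range("3000:"))                 # (3000.0, None)
print(outside(3526, parse_range("3000:")))  # False
print(outside(55, parse_range("40:50")))    # True
```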

check_hw_sensors (nagios-plugins 1.4.2) 1.14
The nagios plugins come with ABSOLUTELY NO WARRANTY. You may redistribute copies of the plugins under the terms of the GNU General Public License. For more information about these matters, see the file named COPYING.

Example sensorsd.conf

# $OpenBSD: sensorsd.conf,v 1.1 2003/10/08 20:30:04 grange Exp $
# $ RedRiver: sensorsd.conf,v 1.1 2006/05/03 21:48:42 andrew Exp $

#
# Sample sensorsd.conf file. See sensorsd.conf(5) for details.
# This one has examples for use with nagios check_hw_sensors
# Actual sensors on a 2450 are below.
#

# hw.sensors.0=esm0, Motherboard, raw, 0
# hw.sensors.1=esm0, CPU 1, OK, temp, 28.00 degC / 82.40 degF
hw.sensors.1:high=50C:warn.high=40C

# hw.sensors.2=esm0, CPU 2, OK, temp, 30.00 degC / 86.00 degF
hw.sensors.2:high=50C:warn.high=40C

# hw.sensors.3=esm0, Mainboard, OK, temp, 21.50 degC / 70.70 degF
hw.sensors.3:high=40C:warn.high=30C

# hw.sensors.4=esm0, CPU 1 Core, OK, volts_dc, 1.69 V
hw.sensors.4:high=1.85V:warn.high=1.8V:low=1.60V:warn.low=1.65V

# hw.sensors.5=esm0, CPU 2 Core, OK, volts_dc, 1.70 V
hw.sensors.5:high=1.85V:warn.high=1.8V:low=1.60V:warn.low=1.65V

# hw.sensors.6=esm0, Motherboard +5V, OK, volts_dc, 4.95 V
hw.sensors.6:high=5.1V:warn.high=5.05V:low=4.85V:warn.low=4.90V

# hw.sensors.7=esm0, Motherboard +12V, OK, volts_dc, 11.94 V
hw.sensors.7:high=12.15V:warn.high=12.1V:low=11.8V:warn.low=11.85V

# hw.sensors.8=esm0, Motherboard +3.3V, OK, volts_dc, 3.27 V
hw.sensors.8:high=3.5V:warn.high=3.4V:low=3.15V:warn.low=3.2V

# hw.sensors.9=esm0, Motherboard +2.5V, OK, volts_dc, 2.48 V
hw.sensors.9:high=2.75V:warn.high=2.6V:low=2.25V:warn.low=2.4V

# hw.sensors.10=esm0, Motherboard GTL Term, OK, volts_dc, 1.49 V
hw.sensors.10:high=1.75V:warn.high=1.6V:low=1.25V:warn.low=1.4V

# hw.sensors.11=esm0, Motherboard Battery, OK, volts_dc, 2.93 V
hw.sensors.11:high=3.1V:warn.high=3.05V:low=2.75V:warn.low=2.8V

# hw.sensors.12=esm0, Chassis Intrusion, indicator, Off
hw.sensors.12:crit=Off:warn=Off

# hw.sensors.13=esm0, Fan 1, OK, fanrpm, 3526 RPM
hw.sensors.13:low=3000:warn.low=3250

# hw.sensors.14=esm0, Fan 2, OK, fanrpm, 3569 RPM
hw.sensors.14:low=3000:warn.low=3250

# hw.sensors.15=esm0, Fan 3, OK, fanrpm, 3563 RPM
hw.sensors.15:low=3000:warn.low=3250

# hw.sensors.16=esm0, Backplane, raw, 0
# hw.sensors.17=esm0, Backplane Top, OK, temp, 14.50 degC / 58.10 degF
hw.sensors.17:high=35C:warn.high=25C

# hw.sensors.18=esm0, Backplane Bottom, OK, temp, 22.00 degC / 71.60 degF
hw.sensors.18:high=40C:warn.high=30C

# hw.sensors.19=esm0, Backplane +5V, OK, volts_dc, 4.97 V
hw.sensors.19:high=5.1V:warn.high=5.05V:low=4.85V:warn.low=4.90V

# hw.sensors.20=esm0, Backplane SCSI A Connected, indicator, On
hw.sensors.20:crit=On:warn=On

# hw.sensors.21=esm0, Backplane SCSI A External, OK, volts_dc, 4.70 V
hw.sensors.21:high=5.1V:warn.high=5.05V:low=4.60V:warn.low=4.65V

# hw.sensors.22=esm0, Backplane SCSI B Connected, indicator, Off
hw.sensors.22:crit=Off:warn=Off

# hw.sensors.23=esm0, Drive 0, drive, online
hw.sensors.23:crit=online,empty:warn=online

# hw.sensors.24=esm0, Drive 1, drive, online
hw.sensors.24:crit=online,empty:warn=online

# hw.sensors.25=esm0, Drive 2, drive, empty
hw.sensors.25:crit=online,empty:warn=online,empty

# hw.sensors.26=esm0, Drive 3, drive, empty
hw.sensors.26:crit=online,empty:warn=online,empty

# hw.sensors.27=esm0, Backplane Control 2, raw, 17
# hw.sensors.28=esm0, Backplane +3.3V, OK, volts_dc, 3.28 V
hw.sensors.28:high=3.5V:warn.high=3.4V:low=3.2V:warn.low=3.25V

# hw.sensors.29=safte0, temp0, OK, temp, 14.44 degC / 58.00 degF
hw.sensors.29:high=35C:warn.high=25C

# hw.sensors.30=safte0, temp1, OK, temp, 22.22 degC / 72.00 degF
hw.sensors.30:high=40C:warn.high=30C
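
To show how limits like these could translate into a Nagios result, here is a hedged Python sketch. It is not taken from the plugin. It assumes that low and high act as the critical limits (the same as crit.low and crit.high), that the unit suffix can simply be stripped, and that readings use the same unit as the limits. The names number and check_limits are hypothetical. The exit codes are the standard Nagios plugin return codes.

```python
import re

# Standard Nagios plugin exit codes
EXIT_CODES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}


def number(text):
    """Extract the leading number from an entry such as '50C' or '1.85V'."""
    match = re.match(r"\s*(-?\d+(?:\.\d+)?)", text)
    if match is None:
        raise ValueError("no number in %r" % text)
    return float(match.group(1))


def check_limits(reading, entries):
    """Compare a numeric reading with the limits in entries; return a status name."""
    def limit(*names):
        # First entry present wins; None means "no limit on this side".
        for name in names:
            if name in entries:
                return number(entries[name])
        return None

    def outside(low, high):
        return (low is not None and reading < low) or (high is not None and reading > high)

    # Assumption: plain low/high are treated as the critical limits.
    if outside(limit("crit.low", "low"), limit("crit.high", "high")):
        return "CRITICAL"
    if outside(limit("warn.low"), limit("warn.high")):
        return "WARNING"
    return "OK"


cpu1 = {"high": "50C", "warn.high": "40C"}  # hw.sensors.1 in the file above
for reading in (30.0, 45.0, 55.0):
    status = check_limits(reading, cpu1)
    print(reading, status, EXIT_CODES[status])
```

Checking the critical limits first means the most severe status wins when a reading violates both sets of limits.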
        

Example output of sysctl hw.sensors on that machine

hw.sensors.0=esm0, Motherboard, raw, 0
hw.sensors.1=esm0, CPU 1, OK, temp, 30.00 degC / 86.00 degF
hw.sensors.2=esm0, CPU 2, OK, temp, 31.00 degC / 87.80 degF
hw.sensors.3=esm0, Mainboard, OK, temp, 19.50 degC / 67.10 degF
hw.sensors.4=esm0, CPU 1 Core, OK, volts_dc, 1.69 V
hw.sensors.5=esm0, CPU 2 Core, OK, volts_dc, 1.70 V
hw.sensors.6=esm0, Motherboard +5V, OK, volts_dc, 4.95 V
hw.sensors.7=esm0, Motherboard +12V, OK, volts_dc, 11.93 V
hw.sensors.8=esm0, Motherboard +3.3V, OK, volts_dc, 3.27 V
hw.sensors.9=esm0, Motherboard +2.5V, OK, volts_dc, 2.48 V
hw.sensors.10=esm0, Motherboard GTL Term, OK, volts_dc, 1.49 V
hw.sensors.11=esm0, Motherboard Battery, OK, volts_dc, 2.94 V
hw.sensors.12=esm0, Chassis Intrusion, indicator, Off
hw.sensors.13=esm0, Fan 1, OK, fanrpm, 3514 RPM
hw.sensors.14=esm0, Fan 2, OK, fanrpm, 3582 RPM
hw.sensors.15=esm0, Fan 3, OK, fanrpm, 3570 RPM
hw.sensors.16=esm0, Backplane, raw, 0
hw.sensors.17=esm0, Backplane Top, OK, temp, 14.50 degC / 58.10 degF
hw.sensors.18=esm0, Backplane Bottom, OK, temp, 22.50 degC / 72.50 degF
hw.sensors.19=esm0, Backplane +5V, OK, volts_dc, 4.97 V
hw.sensors.20=esm0, Backplane SCSI A Connected, indicator, On
hw.sensors.21=esm0, Backplane SCSI A External, OK, volts_dc, 4.70 V
hw.sensors.22=esm0, Backplane SCSI B Connected, indicator, Off
hw.sensors.23=esm0, Drive 0, drive, online
hw.sensors.24=esm0, Drive 1, drive, online
hw.sensors.25=esm0, Drive 2, drive, empty
hw.sensors.26=esm0, Drive 3, drive, empty
hw.sensors.27=esm0, Backplane Control 2, raw, 17
hw.sensors.28=esm0, Backplane +3.3V, OK, volts_dc, 3.28 V
hw.sensors.29=safte0, temp0, OK, temp, 15.00 degC / 59.00 degF
hw.sensors.30=safte0, temp1, OK, temp, 22.78 degC / 73.00 degF
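
For readers who want to process output like this themselves, here is a small Python sketch that splits one line into its parts. It is based only on the sample lines above, where a sensor has either four fields (device, description, type, value) or five (with a status such as OK in the third position). The name parse_sensor_line is hypothetical, and a description containing a comma would break this naive split.

```python
def parse_sensor_line(line):
    """Parse one 'hw.sensors.N=...' line into a dict of its fields."""
    name, sep, rest = line.strip().partition("=")
    if not sep:
        raise ValueError("no '=' in %r" % line)
    fields = [field.strip() for field in rest.split(",")]
    if len(fields) == 5:
        device, description, status, kind, value = fields
    elif len(fields) == 4:
        device, description, kind, value = fields
        status = None  # raw, indicator and drive lines above carry no status
    else:
        raise ValueError("unexpected field count in %r" % line)
    return {
        "id": int(name.rsplit(".", 1)[1]),
        "device": device,
        "description": description,
        "status": status,
        "type": kind,
        "value": value,
    }


cpu = parse_sensor_line("hw.sensors.1=esm0, CPU 1, OK, temp, 30.00 degC / 86.00 degF")
print(cpu["id"], cpu["type"], cpu["value"])
drive = parse_sensor_line("hw.sensors.23=esm0, Drive 0, drive, online")
print(drive["id"], drive["status"], drive["value"])
```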
        

Andrew Fresh <andrew@mad-techies.org>

$RedRiver: index.html,v 1.1 2006/05/03 23:50:17 andrew Exp $