check_hw_sensors plugin for Nagios monitors sysctl hw.sensors on OpenBSD
With the new sensor framework in OpenBSD 3.9, I wanted to be able to monitor the new hw.sensors from Nagios, and this is what I have. I don't know how reliable it is, and I would be happy to accept patches; send them to andrew+nagios@rraz.net. I know the docs are thinner than I would like, so if there are places that need clarification, please let me know!
New in this release is the ability to check the sensors that report their status. Since many sensors support this, it can make the size of your sensorsd.conf much smaller. For example, check_hw_sensors will automatically check these two sensors:
What I think is really kewl about this plugin is that it can use the same sensorsd.conf as sensorsd, so the two can easily be kept in sync. And since Nagios supports both warning and critical alerts, it turned out really handy that sensorsd ignores any additional capabilities in the file. The additional capabilities check_hw_sensors supports are described below. If you have an /etc/sensorsd.conf with the checks you want, it can be run as simply as 'check_hw_sensors -f'. If you only want to check the sensors that report their status, you can even run it as just 'check_hw_sensors'.
TODO:
check_hw_sensors [-i] (-f [<FILENAME>]|(-s <hw.sensors id> [-w limit] [-c limit]))

Usage:
 -i, --ignore-status
    Don't check the status of sensors that report it.
 -f, --filename=FILE
    FILE to load checks from (defaults to /etc/sensorsd.conf)
 -s, --sensor=ID
    ID of a single sensor. "-s 0" means hw.sensors.0.
 -w, --warning=RANGE or single ENTRY
    Exit with WARNING status if outside of RANGE or if != ENTRY
 -c, --critical=RANGE or single ENTRY
    Exit with CRITICAL status if outside of RANGE or if != ENTRY
FILE is in the same format as sensorsd.conf(5) plus some additional entries. These additional entries in the file are ignored by sensorsd(8).
check_hw_sensors understands the following entries:
low, high, crit, warn, crit.low, crit.high, warn.low, warn.high,
ignore, status
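To make the file format concrete, here is a minimal sketch of how a line such as 'hw.sensors.3:high=28C:warn.high=25C' could be split into a sensor name and its entries. It is not the plugin's own code: the function name parse_entry_line is made up for illustration, and the sketch assumes that entries are separated by colons, that values themselves contain no colon, and that valueless entries such as 'ignore' and 'status' are simple flags.

```python
def parse_entry_line(line):
    """Split one sensorsd.conf-style line into (sensor, {entry: value}).

    Hypothetical helper for illustration; assumes ':' separates entries
    and that no value contains a ':' itself.
    """
    line = line.split("#", 1)[0].strip()  # drop comments and whitespace
    if not line:
        return None  # blank or comment-only line
    sensor, *fields = line.split(":")
    entries = {}
    for field in fields:
        key, sep, value = field.partition("=")
        # Flags such as "ignore" or "status" carry no value.
        entries[key] = value if sep else True
    return sensor, entries


if __name__ == "__main__":
    print(parse_entry_line("hw.sensors.3:high=28C:warn.high=25C"))
    print(parse_entry_line("hw.sensors.88:ignore"))
```

Comment lines, like the '# hw.sensors.3=...' lines in the example configuration, come back as None and can simply be skipped.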
An ENTRY depends on the type. The descriptions in sensorsd.conf(5) can be used when appropriate, or you can use the following:
The entries 'crit' or 'warn' (or the -c or -w on the command line) may be a RANGE or a comma separated list of acceptable values. The comma separated list of values contains a list of things that will NOT cause the status. This is possibly counterintuitive, but you are more likely to know good values than bad values.
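The "list of acceptable values" rule can be sketched in a few lines. This is only an illustration of the logic described above, not the plugin's implementation; the function name is hypothetical and the comparison is assumed to be an exact, case-sensitive string match.

```python
def triggers_status(value, acceptable):
    """Return True when value is NOT in the comma-separated acceptable list.

    Hypothetical helper; assumes exact, case-sensitive matching.
    """
    allowed = [item.strip() for item in acceptable.split(",")]
    return value not in allowed


if __name__ == "__main__":
    # With "crit=online", a drive reporting "online" is fine ...
    print(triggers_status("online", "online"))
    # ... while any other value would raise the CRITICAL status.
    print(triggers_status("unknown", "online"))
```

So an entry like 'hw.sensors.29:crit=online' reads as "go CRITICAL unless the drive is online".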
A RANGE is a low ENTRY and a high ENTRY separated by a colon (:). Either side can be left blank, as in low: or :high, to check only the one limit.
An entry marked "ignore" will cause that sensor to be skipped. This is generally used when checking the status of all sensors, to skip sensors you don't care about or that report incorrectly.
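As a rough sketch of the RANGE check, assuming plain numeric limits with no unit suffix and treating the limits themselves as inside the range (both are assumptions, and the function names are made up for illustration):

```python
def parse_range(text):
    """Parse 'low:high', 'low:' or ':high' into (low, high); None = no limit."""
    low, sep, high = text.partition(":")
    if not sep:
        raise ValueError("not a RANGE (no colon): %r" % text)
    return (float(low) if low else None, float(high) if high else None)


def outside_range(value, text):
    """Return True when value falls outside the RANGE given in text."""
    low, high = parse_range(text)
    if low is not None and value < low:
        return True
    if high is not None and value > high:
        return True
    return False


if __name__ == "__main__":
    print(outside_range(21.5, "10:25"))  # inside both limits
    print(outside_range(30.0, ":25"))    # above the high limit
    print(outside_range(5.0, "10:"))     # below the low limit
```

A real ENTRY depends on the sensor type and may carry a unit (for example 28C), which this sketch does not handle.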
If you are using --ignore-status, you can still check the status of individual sensors with a status entry.
check_hw_sensors (nagios-plugins 1.4.2) 1.17
The nagios plugins come with ABSOLUTELY NO WARRANTY. You may redistribute
copies of the plugins under the terms of the GNU General Public License.
For more information about these matters, see the file named COPYING.
# hw.sensors.3=esm0, Ambient, OK, temp, 21.50 degC / 70.70 degF
hw.sensors.3:high=28C:warn.high=25C
# hw.sensors.12=esm0, Chassis Intrusion, indicator, Off
hw.sensors.12:warn=Off
# hw.sensors.29=esm0, Drive 0, drive, online
hw.sensors.29:crit=online
# hw.sensors.30=esm0, Drive 1, drive, online
hw.sensors.30:crit=online
# hw.sensors.31=esm0, Drive 2, drive, unknown
# hw.sensors.32=esm0, Drive 3, drive, unknown
# hw.sensors.33=esm0, Drive 4, drive, online
hw.sensors.33:crit=online
# hw.sensors.34=esm0, Drive 5, drive, online
hw.sensors.34:crit=online
# hw.sensors.35=esm0, Drive 6, drive, unknown
# hw.sensors.36=esm0, Drive 7, drive, unknown
# hw.sensors.70=esm0, Power Supply 1 AC, indicator, On
hw.sensors.70:crit=On
# hw.sensors.71=esm0, Power Supply 1 SW, indicator, On
hw.sensors.71:crit=On
# hw.sensors.72=esm0, Power Supply 1 OK, indicator, On
hw.sensors.72:crit=On
# hw.sensors.74=esm0, Power Supply 1 FFAN, indicator, Off
hw.sensors.74:crit=Off
# hw.sensors.75=esm0, Power Supply 1 OTMP, indicator, Off
hw.sensors.75:crit=Off
# hw.sensors.76=esm0, Power Supply 2 AC, indicator, On
hw.sensors.76:crit=On
# hw.sensors.77=esm0, Power Supply 2 SW, indicator, On
hw.sensors.77:crit=On
# hw.sensors.78=esm0, Power Supply 2 OK, indicator, On
hw.sensors.78:crit=On
# hw.sensors.80=esm0, Power Supply 2 FFAN, indicator, Off
hw.sensors.80:crit=Off
# hw.sensors.81=esm0, Power Supply 2 OTMP, indicator, Off
hw.sensors.81:crit=Off
# hw.sensors.82=esm0, Power Supply 3 AC, indicator, On
hw.sensors.82:crit=On
# hw.sensors.83=esm0, Power Supply 3 SW, indicator, On
hw.sensors.83:crit=On
# hw.sensors.84=esm0, Power Supply 3 OK, indicator, On
hw.sensors.84:crit=On
# hw.sensors.86=esm0, Power Supply 3 FFAN, indicator, Off
hw.sensors.86:crit=Off
# hw.sensors.87=esm0, Power Supply 3 OTMP, indicator, Off
hw.sensors.87:crit=Off
# hw.sensors.88=esm0, Fan 1, CRITICAL, fanrpm, 0 RPM
hw.sensors.88:ignore
# hw.sensors.89=esm0, Fan 2, CRITICAL, fanrpm, 0 RPM
hw.sensors.89:ignore
# hw.sensors.90=esm0, Fan 3, CRITICAL, fanrpm, 0 RPM
hw.sensors.90:ignore
# hw.sensors.91=esm0, Fan 4, CRITICAL, fanrpm, 0 RPM
hw.sensors.91:ignore
# hw.sensors.92=esm0, Fan 5, CRITICAL, fanrpm, 0 RPM
hw.sensors.92:ignore
# hw.sensors.93=esm0, Fan 6, CRITICAL, fanrpm, 0 RPM
hw.sensors.93:ignore
hw.sensors.0=esm0, CPU 1, OK, temp, 36.00 degC / 96.80 degF
hw.sensors.1=esm0, CPU 2, OK, temp, 41.00 degC / 105.80 degF
hw.sensors.2=esm0, Mainboard, OK, temp, 28.00 degC / 82.40 degF
hw.sensors.3=esm0, Ambient, OK, temp, 21.50 degC / 70.70 degF
hw.sensors.4=esm0, CPU 1 Core, OK, volts_dc, 1.65 V
hw.sensors.5=esm0, CPU 2 Core, OK, volts_dc, 1.67 V
hw.sensors.6=esm0, Motherboard +5V, OK, volts_dc, 4.98 V
hw.sensors.7=esm0, Motherboard +12V, OK, volts_dc, 11.97 V
hw.sensors.8=esm0, Motherboard +3.3V, OK, volts_dc, 3.25 V
hw.sensors.9=esm0, Motherboard +2.5V, OK, volts_dc, 2.49 V
hw.sensors.10=esm0, Motherboard GTL Term, OK, volts_dc, 1.50 V
hw.sensors.11=esm0, Motherboard Battery, OK, volts_dc, 2.92 V
hw.sensors.12=esm0, Chassis Intrusion, indicator, Off
hw.sensors.13=esm0, Chassis Fan Ctrl, raw, 2
hw.sensors.14=esm0, Fan 1, OK, fanrpm, 3839 RPM
hw.sensors.15=esm0, Fan 2, OK, fanrpm, 4040 RPM
hw.sensors.16=esm0, Backplane Control, raw, 227
hw.sensors.17=esm0, Backplane Top, OK, temp, 30.00 degC / 86.00 degF
hw.sensors.18=esm0, Backplane Bottom, OK, temp, 29.50 degC / 85.10 degF
hw.sensors.19=esm0, Backplane +5V, OK, volts_dc, 4.97 V
hw.sensors.20=esm0, Backplane +12V, OK, volts_dc, 12.06 V
hw.sensors.21=esm0, Backplane Board, OK, volts_dc, 2.83 V
hw.sensors.22=esm0, Backplane Fan Control, raw, 8738
hw.sensors.23=esm0, Backplane Fan 1, OK, fanrpm, 3690 RPM
hw.sensors.24=esm0, Backplane Fan 2, OK, fanrpm, 3552 RPM
hw.sensors.25=esm0, Backplane Fan 3, OK, fanrpm, 3505 RPM
hw.sensors.26=esm0, Backplane SCSI A Connected, indicator, On
hw.sensors.27=esm0, Backplane SCSI A External, OK, volts_dc, 4.68 V
hw.sensors.28=esm0, Backplane SCSI A Internal, OK, volts_dc, 4.79 V
hw.sensors.29=esm0, Drive 0, drive, online
hw.sensors.30=esm0, Drive 1, drive, online
hw.sensors.31=esm0, Drive 2, drive, unknown
hw.sensors.32=esm0, Drive 3, drive, unknown
hw.sensors.33=esm0, Drive 4, drive, online
hw.sensors.34=esm0, Drive 5, drive, online
hw.sensors.35=esm0, Drive 6, drive, unknown
hw.sensors.36=esm0, Drive 7, drive, unknown
hw.sensors.37=esm0, Power Supply 1 +5V, volts_dc, 5.07 V
hw.sensors.38=esm0, Power Supply 1 +12V, volts_dc, 12.13 V
hw.sensors.39=esm0, Power Supply 1 +3.3V, volts_dc, 3.32 V
hw.sensors.40=esm0, Power Supply 1 -5V, volts_dc, 5.08 V
hw.sensors.41=esm0, Power Supply 1 -12V, volts_dc, 12.20 V
hw.sensors.42=esm0, Power Supply 2 +5V, volts_dc, 5.09 V
hw.sensors.43=esm0, Power Supply 2 +12V, volts_dc, 12.08 V
hw.sensors.44=esm0, Power Supply 2 +3.3V, volts_dc, 3.34 V
hw.sensors.45=esm0, Power Supply 2 -5V, volts_dc, 5.01 V
hw.sensors.46=esm0, Power Supply 2 -12V, volts_dc, 12.03 V
hw.sensors.47=esm0, Power Supply 3 +5V, volts_dc, 5.03 V
hw.sensors.48=esm0, Power Supply 3 +12V, volts_dc, 12.15 V
hw.sensors.49=esm0, Power Supply 3 +3.3V, volts_dc, 3.34 V
hw.sensors.50=esm0, Power Supply 3 -5V, volts_dc, 5.03 V
hw.sensors.51=esm0, Power Supply 3 -12V, volts_dc, 11.97 V
hw.sensors.52=esm0, System Power Supply +5V, volts_dc, 5.02 V
hw.sensors.53=esm0, System Power Supply +12V, volts_dc, 12.09 V
hw.sensors.54=esm0, System Power Supply +3.3V, volts_dc, 3.32 V
hw.sensors.55=esm0, System Power Supply -5V, volts_dc, 4.91 V
hw.sensors.56=esm0, System Power Supply -12V, volts_dc, 11.82 V
hw.sensors.57=esm0, System Power Supply +5V aux, volts_dc, 5.16 V
hw.sensors.58=esm0, Power Supply 1 +5V, OK, amps, 2.40 A
hw.sensors.59=esm0, Power Supply 1 +12V, OK, amps, 1.00 A
hw.sensors.60=esm0, Power Supply 1 +3.3V, OK, amps, 1.80 A
hw.sensors.61=esm0, Power Supply 2 +5V, OK, amps, 2.80 A
hw.sensors.62=esm0, Power Supply 2 +12V, OK, amps, 1.20 A
hw.sensors.63=esm0, Power Supply 2 +3.3V, OK, amps, 1.60 A
hw.sensors.64=esm0, Power Supply 3 +5V, OK, amps, 1.20 A
hw.sensors.65=esm0, Power Supply 3 +12V, OK, amps, 1.00 A
hw.sensors.66=esm0, Power Supply 3 +3.3V, OK, amps, 1.60 A
hw.sensors.67=esm0, Power Supply 1 Fan, OK, fanrpm, 3726 RPM
hw.sensors.68=esm0, Power Supply 2 Fan, OK, fanrpm, 4042 RPM
hw.sensors.69=esm0, Power Supply 3 Fan, OK, fanrpm, 3606 RPM
hw.sensors.70=esm0, Power Supply 1 AC, indicator, On
hw.sensors.71=esm0, Power Supply 1 SW, indicator, On
hw.sensors.72=esm0, Power Supply 1 OK, indicator, On
hw.sensors.73=esm0, Power Supply 1 ON, OK, indicator, On
hw.sensors.74=esm0, Power Supply 1 FFAN, indicator, Off
hw.sensors.75=esm0, Power Supply 1 OTMP, indicator, Off
hw.sensors.76=esm0, Power Supply 2 AC, indicator, On
hw.sensors.77=esm0, Power Supply 2 SW, indicator, On
hw.sensors.78=esm0, Power Supply 2 OK, indicator, On
hw.sensors.79=esm0, Power Supply 2 ON, OK, indicator, On
hw.sensors.80=esm0, Power Supply 2 FFAN, indicator, Off
hw.sensors.81=esm0, Power Supply 2 OTMP, indicator, Off
hw.sensors.82=esm0, Power Supply 3 AC, indicator, On
hw.sensors.83=esm0, Power Supply 3 SW, indicator, On
hw.sensors.84=esm0, Power Supply 3 OK, indicator, On
hw.sensors.85=esm0, Power Supply 3 ON, OK, indicator, On
hw.sensors.86=esm0, Power Supply 3 FFAN, indicator, Off
hw.sensors.87=esm0, Power Supply 3 OTMP, indicator, Off
hw.sensors.88=esm0, Fan 1, CRITICAL, fanrpm, 0 RPM
hw.sensors.89=esm0, Fan 2, CRITICAL, fanrpm, 0 RPM
hw.sensors.90=esm0, Fan 3, CRITICAL, fanrpm, 0 RPM
hw.sensors.91=esm0, Fan 4, CRITICAL, fanrpm, 0 RPM
hw.sensors.92=esm0, Fan 5, CRITICAL, fanrpm, 0 RPM
hw.sensors.93=esm0, Fan 6, CRITICAL, fanrpm, 0 RPM
hw.sensors.94=esm0, Fan Enclosure, raw, 42498
hw.sensors.95=safte0, fan0, OK, indicator, On
hw.sensors.96=safte0, fan1, OK, indicator, On
hw.sensors.97=safte0, fan2, OK, indicator, On
hw.sensors.98=safte0, temp0, OK, temp, 30.00 degC / 86.00 degF
hw.sensors.99=safte0, temp1, OK, temp, 29.44 degC / 85.00 degF
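
In the sysctl output above, some sensors report a status (OK, CRITICAL) and others do not, which shows up as one extra comma-separated field. The following sketch picks such a line apart. It is based only on the sample output shown here, not on the plugin's source: the function name is hypothetical, and it assumes that a line has either four or five fields and that descriptions contain no commas.

```python
def parse_sensor_line(line):
    """Parse one 'hw.sensors.N=...' line from the sample output into a dict.

    Hypothetical helper; assumes 4 fields (no status) or 5 fields
    (status present), with no commas inside the description.
    """
    name, sep, rest = line.strip().partition("=")
    if not sep:
        raise ValueError("no '=' in line: %r" % line)
    fields = [field.strip() for field in rest.split(",")]
    if len(fields) == 5:
        device, description, status, sensor_type, value = fields
    elif len(fields) == 4:
        device, description, sensor_type, value = fields
        status = None  # this sensor does not report its status
    else:
        raise ValueError("unexpected field count in: %r" % line)
    return {
        "name": name,
        "device": device,
        "description": description,
        "status": status,
        "type": sensor_type,
        "value": value,
    }


if __name__ == "__main__":
    print(parse_sensor_line(
        "hw.sensors.3=esm0, Ambient, OK, temp, 21.50 degC / 70.70 degF"))
    print(parse_sensor_line(
        "hw.sensors.12=esm0, Chassis Intrusion, indicator, Off"))
```

Sensors whose parsed status is None are the ones that would need an explicit entry in the configuration file to be checked at all.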
Andrew Fresh <andrew@mad-techies.org>
$RedRiver: index.html,v 1.3 2006/05/04 01:39:16 andrew Exp $