===================================================================
RCS file: /cvs/nagios/check_hw_sensors/index.html,v
retrieving revision 1.1
retrieving revision 1.5
diff -u -r1.1 -r1.5
--- nagios/check_hw_sensors/index.html 2006/05/04 00:50:17 1.1
+++ nagios/check_hw_sensors/index.html 2006/12/05 16:44:14 1.5
@@ -1,208 +1,251 @@
-
check_hw_sensors plugin for Nagios monitors sysctl hw.sensors on OpenBSD
-With the new sensor framework in OpenBSD 3.9, I wanted to be able to monitor the new hw.sensors from Nagios and this is what I have. It is currently a work in progress although it does seem to work just fine. The documentation is a bit thin and I don't know how reliable it is. I would be happy to accept patches. Send them to andrew+nagios@rraz.net. I know the docs aren't as good as I would like, so if there are places that need clarification, please let me know!
-What I think is really kewl about this plugin is that it can use the same sensorsd.conf as sensorsd. That means that they can be easily kept in sync. But, since Nagios supports both warning and critical alerts, it turned out really handy that sensorsd ignores any additional capabilities in the file. The additional capabilities check_hw_sensors supports are described below. If you have an /etc/sensorsd.conf with the checks you want, it can be run as simply as 'check_hw_sensors -f'.
- check_hw_sensors (-f [<FILENAME>]|(-s <hw.sensors id> -w limit -c limit))
-
-Usage:
- -f, --filename=FILE
-     FILE to load checks from (defaults to /etc/sensorsd.conf)
- -s, --sensor=ID
-     ID of a single sensor. "-s 0" means hw.sensors.0.
- -w, --warning=RANGE or single ENTRY
-     Exit with WARNING status if outside of RANGE or if != ENTRY
- -c, --critical=RANGE or single ENTRY
-     Exit with CRITICAL status if outside of RANGE or if != ENTRY
-
- -h (--help) usage help
-
-
FILE is in the same format as sensorsd.conf(5) plus some additional entries. These additional entries in the file are ignored by sensorsd(8).
- -check_hw_sensors understands the following entries:
- low, high, crit, warn, crit.low, crit.high, warn.low, warn.high
An ENTRY depends on the type. The descriptions in sensorsd.conf(5)
- can be used when appropriate, or you can use the following:
-
-
The entries 'crit' or 'warn' (or the -c or -w on the command line)
- may be a RANGE or a comma separated list of acceptable values.
- The comma separated list of values contains a list of things that
- will NOT cause the status. This is possibly counterintuitive, but
- you are more likely to know good values than bad values.
-
-A RANGE is a low ENTRY and a high ENTRY separated by a colon (:).
- It can also be low: or :high with the other side left blank to only
- make the single check.
- -
check_hw_sensors (nagios-plugins 1.4.2) 1.13
- The nagios plugins come with ABSOLUTELY NO WARRANTY. You may redistribute
- copies of the plugins under the terms of the GNU General Public License.
- For more information about these matters, see the file named COPYING.
-# $OpenBSD: index.html,v 1.1 2006/05/03 23:50:17 andrew Exp $
-# $ RedRiver: sensorsd.conf,v 1.1 2006/05/03 21:48:42 andrew Exp $
-
-#
-# Sample sensorsd.conf file. See sensorsd.conf(5) for details.
-# This one has examples for use with nagios check_hw_sensors
-# Actual sensors on a 2450 are below.
-#
-
-# hw.sensors.0=esm0, Motherboard, raw, 0
-# hw.sensors.1=esm0, CPU 1, OK, temp, 28.00 degC / 82.40 degF
-hw.sensors.1:high=50C:warn.high=40C
-
-# hw.sensors.2=esm0, CPU 2, OK, temp, 30.00 degC / 86.00 degF
-hw.sensors.2:high=50C:warn.high=40C
-
-# hw.sensors.3=esm0, Mainboard, OK, temp, 21.50 degC / 70.70 degF
-hw.sensors.3:high=40C:warn.high=30C
-
-# hw.sensors.4=esm0, CPU 1 Core, OK, volts_dc, 1.69 V
-hw.sensors.4:high=1.85V:warn.high=1.8V:low=1.60V:warn.low=1.65V
-
-# hw.sensors.5=esm0, CPU 2 Core, OK, volts_dc, 1.70 V
-hw.sensors.5:high=1.85V:warn.high=1.8V:low=1.60V:warn.low=1.65V
-
-# hw.sensors.6=esm0, Motherboard +5V, OK, volts_dc, 4.95 V
-hw.sensors.6:high=5.1V:warn.high=5.05V:low=4.90V:warn.low=4.85V
-
-# hw.sensors.7=esm0, Motherboard +12V, OK, volts_dc, 11.94 V
-hw.sensors.7:high=12.15V:warn.high=12.1V:low=11.8V:warn.low=11.85V
-
-# hw.sensors.8=esm0, Motherboard +3.3V, OK, volts_dc, 3.27 V
-hw.sensors.8:high=3.5V:warn.high=3.4V:low=3.15V:warn.low=3.2V
-
-# hw.sensors.9=esm0, Motherboard +2.5V, OK, volts_dc, 2.48 V
-hw.sensors.9:high=2.75V:warn.high=2.6V:low=2.25V:warn.low=2.4V
-
-# hw.sensors.10=esm0, Motherboard GTL Term, OK, volts_dc, 1.49 V
-hw.sensors.10:high=1.75V:warn.high=1.6V:low=1.25V:warn.low=1.4V
-
-# hw.sensors.11=esm0, Motherboard Battery, OK, volts_dc, 2.93 V
-hw.sensors.11:high=3.1V:warn.high=3.05V:low=2.75V:warn.low=2.8V
-
-# hw.sensors.12=esm0, Chassis Intrusion, indicator, Off
-hw.sensors.12:crit=Off:warn=Off
-
-# hw.sensors.13=esm0, Fan 1, OK, fanrpm, 3526 RPM
-hw.sensors.13:low=3000:warn.low=3250
-
-# hw.sensors.14=esm0, Fan 2, OK, fanrpm, 3569 RPM
-hw.sensors.14:low=3000:warn.low=3250
-
-# hw.sensors.15=esm0, Fan 3, OK, fanrpm, 3563 RPM
-hw.sensors.15:low=3000:warn.low=3250
-
-# hw.sensors.16=esm0, Backplane, raw, 0
-# hw.sensors.17=esm0, Backplane Top, OK, temp, 14.50 degC / 58.10 degF
-hw.sensors.17:high=35C:warn.high=25C
-
-# hw.sensors.18=esm0, Backplane Bottom, OK, temp, 22.00 degC / 71.60 degF
-hw.sensors.18:high=40C:warn.high=30C
-
-# hw.sensors.19=esm0, Backplane +5V, OK, volts_dc, 4.97 V
-hw.sensors.19:high=5.1V:warn.high=5.05V:low=4.90V:warn.low=4.85V
-
-# hw.sensors.20=esm0, Backplane SCSI A Connected, indicator, On
-hw.sensors.20:crit=On:warn=On
-
-# hw.sensors.21=esm0, Backplane SCSI A External, OK, volts_dc, 4.70 V
-hw.sensors.21:high=5.1V:warn.high=5.05V:low=4.60V:warn.low=4.65V
-
-# hw.sensors.22=esm0, Backplane SCSI B Connected, indicator, Off
-hw.sensors.22:crit=Off:warn=Off
-
-# hw.sensors.23=esm0, Drive 0, drive, online
-hw.sensors.23:crit=online,empty:warn=online
-
-# hw.sensors.24=esm0, Drive 1, drive, online
-hw.sensors.24:crit=online,empty:warn=online
-
-# hw.sensors.25=esm0, Drive 2, drive, empty
-hw.sensors.25:crit=online,empty:warn=online,empty
-
-# hw.sensors.26=esm0, Drive 3, drive, empty
-hw.sensors.26:crit=online,empty:warn=online,empty
-
-# hw.sensors.27=esm0, Backplane Control 2, raw, 17
-# hw.sensors.28=esm0, Backplane +3.3V, OK, volts_dc, 3.28 V
-hw.sensors.28:high=3.5V:warn.high=3.4V:low=3.2V:warn.low=3.25V
-
-# hw.sensors.29=safte0, temp0, OK, temp, 14.44 degC / 58.00 degF
-hw.sensors.29:high=35C:warn.high=25C
-
-# hw.sensors.30=safte0, temp1, OK, temp, 22.22 degC / 72.00 degF
-hw.sensors.30:high=40C:warn.high=30C
-
-
-hw.sensors.0=esm0, Motherboard, raw, 0
-hw.sensors.1=esm0, CPU 1, OK, temp, 30.00 degC / 86.00 degF
-hw.sensors.2=esm0, CPU 2, OK, temp, 31.00 degC / 87.80 degF
-hw.sensors.3=esm0, Mainboard, OK, temp, 19.50 degC / 67.10 degF
-hw.sensors.4=esm0, CPU 1 Core, OK, volts_dc, 1.69 V
-hw.sensors.5=esm0, CPU 2 Core, OK, volts_dc, 1.70 V
-hw.sensors.6=esm0, Motherboard +5V, OK, volts_dc, 4.95 V
-hw.sensors.7=esm0, Motherboard +12V, OK, volts_dc, 11.93 V
-hw.sensors.8=esm0, Motherboard +3.3V, OK, volts_dc, 3.27 V
-hw.sensors.9=esm0, Motherboard +2.5V, OK, volts_dc, 2.48 V
-hw.sensors.10=esm0, Motherboard GTL Term, OK, volts_dc, 1.49 V
-hw.sensors.11=esm0, Motherboard Battery, OK, volts_dc, 2.94 V
-hw.sensors.12=esm0, Chassis Intrusion, indicator, Off
-hw.sensors.13=esm0, Fan 1, OK, fanrpm, 3514 RPM
-hw.sensors.14=esm0, Fan 2, OK, fanrpm, 3582 RPM
-hw.sensors.15=esm0, Fan 3, OK, fanrpm, 3570 RPM
-hw.sensors.16=esm0, Backplane, raw, 0
-hw.sensors.17=esm0, Backplane Top, OK, temp, 14.50 degC / 58.10 degF
-hw.sensors.18=esm0, Backplane Bottom, OK, temp, 22.50 degC / 72.50 degF
-hw.sensors.19=esm0, Backplane +5V, OK, volts_dc, 4.97 V
-hw.sensors.20=esm0, Backplane SCSI A Connected, indicator, On
-hw.sensors.21=esm0, Backplane SCSI A External, OK, volts_dc, 4.70 V
-hw.sensors.22=esm0, Backplane SCSI B Connected, indicator, Off
-hw.sensors.23=esm0, Drive 0, drive, online
-hw.sensors.24=esm0, Drive 1, drive, online
-hw.sensors.25=esm0, Drive 2, drive, empty
-hw.sensors.26=esm0, Drive 3, drive, empty
-hw.sensors.27=esm0, Backplane Control 2, raw, 17
-hw.sensors.28=esm0, Backplane +3.3V, OK, volts_dc, 3.28 V
-hw.sensors.29=safte0, temp0, OK, temp, 15.00 degC / 59.00 degF
-hw.sensors.30=safte0, temp1, OK, temp, 22.78 degC / 73.00 degF
-
-
Andrew Fresh <andrew@mad-techies.org>
-$RedRiver$
- - +check_hw_sensors plugin for Nagios monitors sysctl hw.sensors on OpenBSD
+With the new sensor framework in OpenBSD 3.9, I wanted to be able to monitor the new hw.sensors from Nagios and this is what I have. The documentation is a bit thin and I don't know how reliable it is. I would be happy to accept patches. Send them to andrew+nagios@rraz.net. I know the docs aren't as good as I would like, so if there are places that need clarification, please let me know!
+New in this release is support for more sensor types, as well as support for OpenBSD 4.0.
+It has the ability to check the sensors that report their status. Since many sensors support this, it can make the size of your sensorsd.conf much smaller. For example, check_hw_sensors will automatically check these two sensors: +
What I think is really kewl about this plugin is that it can use the same sensorsd.conf as sensorsd. That means that they can be easily kept in sync. But, since Nagios supports both warning and critical alerts, it turned out really handy that sensorsd ignores any additional capabilities in the file. The additional capabilities check_hw_sensors supports are described below. If you have an /etc/sensorsd.conf with the checks you want, it can be run as simply as 'check_hw_sensors -f'. If you only want to check the sensors that report their status, you can even run it as just 'check_hw_sensors'.
+TODO: +
+ check_hw_sensors [-i] (-f [<FILENAME>]|(-s <hw.sensors id> [-w limit] [-c limit]))
+
+Usage:
+ -i, --ignore-status
+     Don't check the status of sensors that report it.
+ -f, --filename=FILE
+     FILE to load checks from (defaults to /etc/sensorsd.conf)
+ -s, --sensor=ID
+     ID of a single sensor. "-s 0" means hw.sensors.0.
+ -w, --warning=RANGE or single ENTRY
+     Exit with WARNING status if outside of RANGE or if != ENTRY
+ -c, --critical=RANGE or single ENTRY
+     Exit with CRITICAL status if outside of RANGE or if != ENTRY
+
+
FILE is in the same format as sensorsd.conf(5) plus some additional entries. These additional entries in the file are ignored by sensorsd(8).
+ +check_hw_sensors understands the following entries:
+ low, high, crit, warn, crit.low, crit.high, warn.low, warn.high,
+ ignore, status
An ENTRY depends on the type. The descriptions in sensorsd.conf(5)
+ can be used when appropriate, or you can use the following:
+
+
The entries 'crit' or 'warn' (or the -c or -w on the command line)
+ may be a RANGE or a comma separated list of acceptable values.
+ The comma separated list of values contains a list of things that
+ will NOT cause the status. This is possibly counterintuitive, but
+ you are more likely to know good values than bad values.
+
+A RANGE is a low ENTRY and a high ENTRY separated by a colon (:).
+ It can also be low: or :high with the other side left blank to only
+ make the single check.
+
+
An entry marked "ignore" will cause that sensor to be skipped.
+ Generally used with status checking of all sensors to ignore sensors you
+ don't care about or that report incorrectly.
+
+If you are using --ignore-status, you can still check the status of
+ individual sensors with a status entry.
+ +check_hw_sensors (nagios-plugins 1.4.2) 1.21
+ The nagios plugins come with ABSOLUTELY NO WARRANTY. You may redistribute
+ copies of the plugins under the terms of the GNU General Public License.
+ For more information about these matters, see the file named COPYING.
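The RANGE and acceptable-value rules in the help text above can be illustrated with a short Python sketch. This is not the plugin's own code (the plugin appears to be written in Perl); the function names `parse_limit`, `violates` and `status` are hypothetical, readings are assumed to be bare numbers or plain strings with no unit suffix, and treating the range bounds as inclusive is an assumption.

```python
def parse_limit(spec):
    """Split a limit into ("range", low, high) or ("values", [...])."""
    if ":" in spec:
        # RANGE form: "low:high", "low:" or ":high"
        low, _, high = spec.partition(":")
        return ("range",
                float(low) if low else None,
                float(high) if high else None)
    # Otherwise a comma separated list of ACCEPTABLE values
    return ("values", [v.strip() for v in spec.split(",")])


def violates(spec, reading):
    """True if the reading should raise the status this limit belongs to."""
    kind, *rest = parse_limit(spec)
    if kind == "range":
        low, high = rest
        value = float(reading)
        # Assumption: bounds are inclusive, only values outside them alert
        return ((low is not None and value < low)
                or (high is not None and value > high))
    (acceptable,) = rest
    # Listed values are the good ones; anything else raises the status
    return str(reading) not in acceptable


def status(reading, warn=None, crit=None):
    """Return the worst status that the reading triggers."""
    if crit is not None and violates(crit, reading):
        return "CRITICAL"
    if warn is not None and violates(warn, reading):
        return "WARNING"
    return "OK"
```

With these definitions, `status("3100", warn="3250:", crit="3000:")` gives `"WARNING"`, and `status("empty", warn="online", crit="online,empty")` also gives `"WARNING"`, because "empty" is acceptable for the critical check but not for the warning check.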
+# hw.sensors.0=esm0, CPU 1, 39.00 degC, OK
+# hw.sensors.1=esm0, CPU 2, 37.00 degC, OK
+# hw.sensors.2=esm0, Mainboard, 29.50 degC, OK
+# hw.sensors.3=esm0, CPU 1 Core, 1.74 V DC, OK
+# hw.sensors.4=esm0, CPU 2 Core, 1.73 V DC, OK
+# hw.sensors.5=esm0, Motherboard +5V, 4.94 V DC, OK
+# hw.sensors.6=esm0, Motherboard +12V, 11.90 V DC, OK
+# hw.sensors.7=esm0, Motherboard +3.3V, 3.22 V DC, OK
+# hw.sensors.8=esm0, Motherboard +2.5V, 2.49 V DC, OK
+# hw.sensors.9=esm0, Motherboard GTL Term, 1.49 V DC, OK
+# hw.sensors.10=esm0, Motherboard Battery, 2.98 V DC, OK
+# hw.sensors.11=esm0, Chassis Intrusion, Off
+hw.sensors.11:crit=Off
+# hw.sensors.12=esm0, Fan 1, 3586 RPM, OK
+# hw.sensors.13=esm0, Fan 2, 3539 RPM, OK
+# hw.sensors.14=esm0, Fan 3, 3536 RPM, OK
+# hw.sensors.15=esm0, Backplane, 0 raw
+# hw.sensors.16=esm0, Backplane Top, 28.00 degC, OK
+# hw.sensors.17=esm0, Backplane Bottom, 30.50 degC, OK
+# hw.sensors.18=esm0, Backplane +5V, 4.94 V DC, OK
+# hw.sensors.19=esm0, Backplane +12V, 11.81 V DC, OK
+# hw.sensors.20=esm0, Backplane SCSI A Connected, On
+hw.sensors.20:crit=On
+# hw.sensors.21=esm0, Backplane SCSI A External, 4.61 V DC, OK
+# hw.sensors.22=esm0, Backplane SCSI B Connected, Off
+# hw.sensors.23=esm0, Drive 0, drive online
+hw.sensors.23:crit=online
+# hw.sensors.24=esm0, Drive 1, drive online
+hw.sensors.24:crit=online
+# hw.sensors.25=esm0, Drive 2, drive online
+hw.sensors.25:crit=online
+# hw.sensors.26=esm0, Drive 3, drive online
+hw.sensors.26:crit=online
+# hw.sensors.27=esm0, Drive 4, drive online
+hw.sensors.27:crit=online
+# hw.sensors.28=esm0, Backplane Control 2, 1 raw
+# hw.sensors.29=esm0, Backplane +3.3V, 3.28 V DC, OK
+# hw.sensors.30=ami0, sd0, drive online, OK
+# hw.sensors.31=ami0, sd1, drive online, OK
+# hw.sensors.32=safte0, Temp0, 27.78 degC, OK
+# hw.sensors.33=safte0, Temp1, 30.56 degC, OK
+
+
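As an illustration of how lines like those in the sample file might be read, here is a minimal Python sketch that splits one sensorsd.conf-style line into a sensor name and its entries. The function name `parse_sensorsd_line` is hypothetical and this is not how the plugin itself is implemented. It assumes entries are separated by colons, so it does not handle a value that itself contains a colon, and it assumes an entry without "=" (such as "ignore") is a bare flag.

```python
def parse_sensorsd_line(line):
    """Return (sensor, entries) for a config line, or None for blanks and comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    # First field is the sensor name, the rest are name=value entries
    sensor, *caps = line.split(":")
    entries = {}
    for cap in caps:
        if not cap:
            continue  # tolerate a trailing ":"
        key, sep, value = cap.partition("=")
        # Assumption: an entry with no "=" is a flag, e.g. "ignore"
        entries[key] = value if sep else True
    return sensor, entries
```

For example, `parse_sensorsd_line("hw.sensors.11:crit=Off")` returns `("hw.sensors.11", {"crit": "Off"})`, while the commented sysctl lines in the sample return `None`.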
+RCS file: /cvs/scripts/Admin scripts/check_hw_sensors/check_hw_sensors,v
+Working file: check_hw_sensors
+head: 1.21
+branch:
+locks: strict
+access list:
+symbolic names:
+keyword substitution: kv
+total revisions: 21; selected revisions: 21
+description:
+----------------------------
+revision 1.21
+date: 2006/12/05 16:26:27; author: andrew; state: Exp; lines: +5 -5
+new better example for 4.0
+and fix the s/drive\s// from the data, not the type
+----------------------------
+revision 1.20
+date: 2006/12/05 00:17:47; author: andrew; state: Exp; lines: +35 -26
+Match sensors differently depending on OS Version from the Config module.
+
+Also support checks on the other sensor types and document that.
+
+and refactor the way I return a $sensor->{'status'} from ~10 lines to 1.
+----------------------------
+revision 1.19
+date: 2006/12/04 23:33:53; author: andrew; state: Exp; lines: +8 -3
+add a regex for the 'percent' type of sensor
+----------------------------
+revision 1.18
+date: 2006/12/02 02:15:17; author: andrew; state: Exp; lines: +74 -9
+fix it for the output from OpenBSD 4.0
+----------------------------
+revision 1.17
+date: 2006/10/25 23:30:23; author: andrew; state: Exp; lines: +4 -7
+get the docs up to match the new version
+----------------------------
+revision 1.16
+date: 2006/10/25 18:36:46; author: andrew; state: Exp; lines: +4 -4
+Stuff in CVS should output nagios format
+----------------------------
+revision 1.15
+date: 2006/10/25 18:35:59; author: andrew; state: Exp; lines: +73 -46
+add support for the status as reported by the sensors. it is teh r0x0r!
+----------------------------
+revision 1.14
+date: 2006/05/04 01:30:29; author: andrew; state: Exp; lines: +6 -4
+I thought I checked this in already
+----------------------------
+revision 1.13
+date: 2006/05/03 22:16:42; author: andrew; state: Exp; lines: +7 -6
+Some more fixing of the help
+----------------------------
+revision 1.12
+date: 2006/05/03 21:54:43; author: andrew; state: Exp; lines: +51 -30
+Get the help output cleaned up. Still not 100% what I want, but so far so good.
+----------------------------
+revision 1.11
+date: 2006/05/03 20:01:09; author: holligan; state: Exp; lines: +4 -4
+updated and clarified help
+----------------------------
+revision 1.10
+date: 2006/05/03 03:31:22; author: andrew; state: Exp; lines: +27 -42
+A bunch of cleanup and some kewl refactoring into loops.
+
+Still need to find a way to refactor the checks that are so similar!
+----------------------------
+revision 1.9
+date: 2006/05/03 02:26:47; author: andrew; state: Exp; lines: +8 -8
+It now doesn't do nonexistent checks (for some stuff anyway)
+Also changed the <br /> to <br>. Not valid XHTML whatever, but it does get stripped before getting sent to my pager.
+----------------------------
+revision 1.8
+date: 2006/05/02 21:23:29; author: andrew; state: Exp; lines: +6 -16
+Better looking output for the web page.
+----------------------------
+revision 1.7
+date: 2006/05/02 20:03:53; author: andrew; state: Exp; lines: +5 -5
+oops, that's an array ref!
+----------------------------
+revision 1.6
+date: 2006/05/02 19:59:47; author: andrew; state: Exp; lines: +7 -5
+Better output for the OK checks
+----------------------------
+revision 1.5
+date: 2006/05/02 19:49:29; author: andrew; state: Exp; lines: +19 -12
+Only show details for things other than OK, cuZ we are limited in the amount of data we can return :-(
+----------------------------
+revision 1.4
+date: 2006/05/02 15:54:42; author: andrew; state: Exp; lines: +44 -21
+Some cleanup, as well as making it output a single line like nagios supposedly likes.
+----------------------------
+revision 1.3
+date: 2006/05/02 01:39:23; author: andrew; state: Exp; lines: +3 -3
+fix the help getopts.
+----------------------------
+revision 1.2
+date: 2006/05/02 01:29:33; author: andrew; state: Exp; lines: +427 -11
+Adding the sensors from one of the routers, cuZ there were a lot and I can use it for testing.
+
+Also, now the check_hw_sensors now seems to be OK. I need to put it on a few machines and set up the checks now. If it works for the rest of the week, I can clean it up and mebbe put it on teh interweb and post to undeadly.
+----------------------------
+revision 1.1
+date: 2006/05/01 18:11:23; author: andrew; state: Exp;
+add this so I can check it out on a box for testing
+=============================================================================
+
+
Andrew Fresh <andrew@mad-techies.org>
+$RedRiver: index.html,v 1.4 2006/10/25 23:30:23 andrew Exp $
+ + +