
Annotation of nagios/check_hw_sensors/index.html, Revision 1.6

1.5       andrew      1: <html><head><title>Nagios Check - check_hw_sensors</title></head>
                      2:     <body>
                      3:         <h1>Nagios Check - check_hw_sensors</h1>
                      4:         <p>check_hw_sensors plugin for Nagios monitors sysctl hw.sensors on OpenBSD</p>
                      5:         <p>With the new sensor framework in OpenBSD 3.9, I wanted to be able to monitor the new hw.sensors from <a href='http://www.nagios.org/'>Nagios</a> and this is what I have.  The documentation is a bit thin and I don't know how reliable it is.  I would be happy to accept patches.  Send them to <a href='mailto:andrew+nagios@rraz.net'>andrew+nagios@rraz.net</a>.  I know the docs aren't as good as I would like, so if there are places that need clarification, please let me know!</p>
1.6     ! andrew      6:                <p>New in this release is support for the new two-level sensors in OpenBSD 4.0-current.  They seem much better, and I may change some things to support only that version once all my machines have moved to 4.1.</p>
1.5       andrew      7:                <p>It has the ability to check the sensors that report their status.  Since many sensors support this, it can make the size of your sensorsd.conf much smaller.  For example, check_hw_sensors will automatically check these two sensors:
                      8:                <ul>
                      9:                        <li>hw.sensors.76=esm0, Fan 4, 3629 RPM, OK</li>
                     10:                        <li>hw.sensors.77=esm0, Fan 5, 0 RPM, CRITICAL</li>
                     11:                </ul>
                     12:                It will report the status listed to Nagios.  For 76, it would be OK, for 77 it would be CRITICAL.  You don't need to put anything in a config file to support those.</p>
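<p>As a rough illustration of the status check described above, the sketch below pulls the trailing status out of each sensor line and reports the worst one.  It is not the plugin's actual code: the function names, the set of status labels, and the "last comma-separated field" rule are all assumptions made for this example.</p>

```python
# Severity order used to pick the worst reported status.
# The label set is an assumption for this sketch.
SEVERITY = {"OK": 0, "WARNING": 1, "CRITICAL": 2}


def parse_sensor_line(line):
    """Split 'hw.sensors.N=device, description, value, STATUS' into parts."""
    name, _, rest = line.partition("=")
    fields = [field.strip() for field in rest.split(",")]
    # Treat the last field as a status only if it is a known label.
    status = fields[-1] if fields[-1] in SEVERITY else None
    return {"id": name.strip(), "fields": fields, "status": status}


def worst_status(lines):
    """Return the most severe status among sensors that report one."""
    statuses = [parse_sensor_line(line)["status"] for line in lines]
    reported = [s for s in statuses if s is not None]
    # Sensors without a status are skipped; no reported status means OK.
    return max(reported, key=SEVERITY.get, default="OK")


sample = [
    "hw.sensors.76=esm0, Fan 4, 3629 RPM, OK",
    "hw.sensors.77=esm0, Fan 5, 0 RPM, CRITICAL",
]
print(worst_status(sample))  # CRITICAL
```

<p>With the two example sensors, 77 is CRITICAL, so that is what would be handed back to Nagios.</p>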
                     13:         <p>What I think is really kewl about this plugin is that it can use the same sensorsd.conf as sensorsd.  That means that they can be easily kept in sync.  But, since Nagios supports both warning and critical alerts, it turned out really handy that sensorsd ignores any additional capabilities in the file.  The additional capabilities check_hw_sensors supports are described below.  If you have an /etc/sensorsd.conf with the checks you want, it can be run as simply as 'check_hw_sensors -f'.  If you only want to check the sensors that report their status, you can even run it as just 'check_hw_sensors'.</p>
                     14:         <p>TODO:
                     15:         <ul>
                     16:             <li>need real documentation.</li>
                     17:                        <li>the RANGE using the colon to separate probably screws up the getcap of sensorsd.conf, so it should probably get replaced with a dash or somesuch</li>
                     18:         </ul>
                     19:         </p>
1.6     ! andrew     20:         <p><center><b><a href='check_hw_sensors-1.22.tar.gz'>Download the current version here</a></b></center></p>
1.5       andrew     21:         <h4>
                     22:           Please be sure to support the <a href="http://www.openbsd.org">OpenBSD</a>
                     23:           project by purchasing
                     24:           <a href="http://www.openbsd.org/items.html">CDs</a>,
                     25:           <a href="http://www.openbsd.org/tshirts.html">T-shirts</a>, or making a
                     26:           <a href="http://www.openbsd.org/donations.html">donation</a>.
                     27:           <br />
                     28:           These finances ensure that OpenBSD will continue to exist, and
                     29:           will remain <a href="http://www.openbsd.org/policy.html">free</a>
                     30:           for everyone to use and reuse as they see fit.
                     31:         </h4>
                     32:         <pre>
                     33:     check_hw_sensors [-i] (-f [&lt;FILENAME&gt;]|(-s &lt;hw.sensors id&gt; [-w limit] [-c limit]))
                     34:
                     35: Usage:
                     36:     -i, --ignore-status
                     37:         Don't check the status of sensors that report it.
                     38:     -f, --filename=FILE
                     39:         FILE to load checks from (defaults to /etc/sensorsd.conf)
                     40:     -s, --sensor=ID
                     41:         ID of a single sensor.  "-s 0" means hw.sensors.0.
                     42:     -w, --warning=RANGE or single ENTRY
                     43:         Exit with WARNING status if outside of RANGE or if != ENTRY
                     44:     -c, --critical=RANGE or single ENTRY
                     45:         Exit with CRITICAL status if outside of RANGE or if != ENTRY
                     46:         </pre>
                     47:         <p>FILE is in the same format as <a href='http://www.openbsd.org/cgi-bin/man.cgi?query=sensorsd.conf'>sensorsd.conf(5)</a> plus some additional entries.  These additional entries in the file are ignored by <a href='http://www.openbsd.org/cgi-bin/man.cgi?query=sensorsd'>sensorsd(8)</a>.  </p>
                     48:
                     49:         <p>check_hw_sensors understands the following entries:<br>
                     50:                low, high, crit, warn, crit.low, crit.high, warn.low, warn.high,
                     51:                ignore, status</p>
                     52:
                     53:         <p>An ENTRY depends on the type.  The descriptions in <a href='http://www.openbsd.org/cgi-bin/man.cgi?query=sensorsd.conf'>sensorsd.conf(5)</a>
                     54:         can be used when appropriate, or you can use the following:
                     55:
                     56:         <ul>
                     57:             <li>fanrpm, volts_dc, amps, watthour, amphour, integer (raw), percent, lux or timedelta<br>
                     58:             Anything that includes digits.
                     59:             Any characters that are neither a digit nor a period are
                     60:             stripped from both the value of the check and the value of
                     61:             the sensor response, and then the two resulting values are compared.</li>
                     62:
                     63:             <li>temp<br>
                     64:             Can be as above, but if the entry has an F in it,
                     65:             it compares Fahrenheit, otherwise it uses Celsius.</li>
                     66:
                     67:             <li>indicator or drive<br>
                     68:             Does a case-sensitive match against each
                     69:             entry in the comma-separated list; if the value does not match
                     70:             any of the entries, it sets the status.</li>
                     71:         </ul>
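<p>The three kinds of ENTRY comparison listed above could be sketched roughly as follows.  This is an illustration only, not the plugin's implementation: the helper names are made up, the temperature helper assumes the sensor itself reports degC, and the indicator match is assumed to be an exact string comparison.</p>

```python
import re


def numeric(text):
    """Strip everything except digits and periods, then convert to float.

    A leading minus sign is stripped as well, which follows the
    description literally.
    """
    return float(re.sub(r"[^0-9.]", "", text))


def temp_pair(sensor_text, entry_text):
    """Return (sensor, entry) in the same unit.

    Assumes the sensor reports degC; if the entry contains an 'F',
    the sensor value is converted to Fahrenheit before comparing.
    """
    sensor = numeric(sensor_text)
    entry = numeric(entry_text)
    if "F" in entry_text:
        sensor = sensor * 9 / 5 + 32
    return sensor, entry


def matches_any(sensor_text, entry_list):
    """Case-sensitive match of the sensor value against a comma-separated list."""
    allowed = [entry.strip() for entry in entry_list.split(",")]
    return sensor_text.strip() in allowed


print(numeric("3629 RPM"))           # 3629.0
print(matches_any("On", "On,Off"))   # True
print(matches_any("on", "On"))       # False
```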
                     72:
                     73:         <p>The entries 'crit' or 'warn' (or the -c or -w on the command line)
                     74:         may be a RANGE or a comma separated list of acceptable values.
                     75:         The comma separated list of values contains a list of things that
                     76:         will NOT cause the status.  This is possibly counterintuitive, but
                     77:         you are more likely to know good values than bad values.</p>
                     78:
                     79:         <p>A RANGE is a low ENTRY and a high ENTRY separated by a colon (:).
                     80:         It can also be low: or :high with the other side left blank to only
                     81:         make the single check.</p>
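<p>A minimal sketch of how a RANGE might be parsed and checked is shown here.  The function names are hypothetical, and treating a value that sits exactly on a bound as inside the range is an assumption, since the text above does not say which way that goes.</p>

```python
import re


def numeric(text):
    """Keep only digits and periods, then convert to float."""
    return float(re.sub(r"[^0-9.]", "", text))


def parse_range(text):
    """Parse 'low:high', 'low:' or ':high' into a (low, high) tuple.

    A blank side becomes None, meaning that side is not checked.
    """
    low, sep, high = text.partition(":")
    if not sep:
        raise ValueError("not a RANGE (no colon): %r" % text)
    return (
        numeric(low) if low.strip() else None,
        numeric(high) if high.strip() else None,
    )


def outside(value, low, high):
    """True if value falls outside the range; bounds count as inside (assumed)."""
    if low is not None and value < low:
        return True
    if high is not None and value > high:
        return True
    return False


low, high = parse_range("10:20")
print(outside(5, low, high))    # True
print(outside(15, low, high))   # False
```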
                     82:
                     83:                <p>An entry marked "ignore" will cause that sensor to be skipped.
                     84:                Generally used with status checking of all sensors to ignore sensors you
                     85:                don't care about or that report incorrectly.</p>
                     86:
                     87:                <p>If you are using --ignore-status, you can still check the status of
                     88:                individual sensors with a status entry.</p>
                     89:
1.6     ! andrew     90:         <p>check_hw_sensors (nagios-plugins 1.4.2) 1.22<br>
1.5       andrew     91:         The nagios plugins come with ABSOLUTELY NO WARRANTY. You may redistribute
                     92:         copies of the plugins under the terms of the GNU General Public License.
                     93:         For more information about these matters, see the file named COPYING.</p>
                     94:
                     95:         <h3>Example sensorsd.conf</h3>
                     96:         <pre>
1.6     ! andrew     97: # hw.sensors.acpibat0.volt0=7.40 V DC, (voltage), OK
        !            98: # hw.sensors.acpibat0.volt1=8.30 V DC, (current voltage), OK
        !            99: # hw.sensors.acpibat0.watthour0=57.72 Wh, (last full capacity)
        !           100: # hw.sensors.acpibat0.watthour1=0.00 Wh, (warning capacity)
        !           101: # hw.sensors.acpibat0.watthour2=0.12 Wh, (low capacity)
        !           102: # hw.sensors.acpibat0.watthour3=57.72 Wh, (remaining capacity)
        !           103: hw.sensors.acpibat0.watthour3:warn.low=50 Wh:crit.low=30 Wh
        !           104: # hw.sensors.acpibat0.raw0=2, (battery charging), OK
        !           105: # hw.sensors.acpibat0.raw1=99, (rate)
        !           106: # hw.sensors.acpiac0.indicator0=On, (power supply)
        !           107: hw.sensors.acpiac0.indicator0:crit=On
        !           108: # hw.sensors.acpitz0.temp0=62.95 degC, (zone temperature)
        !           109: hw.sensors.acpitz0.temp0:warn.high=65 degC:crit.high=75 degC
1.5       andrew    110:         </pre>
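<p>To show how the extra capabilities in the example above break down, here is a small sketch that splits each active line into a sensor name and its entries.  It is only an approximation of getcap-style parsing with a made-up function name.  Splitting on every colon also means it cannot cope with a RANGE value, which is the same problem noted in the TODO list.</p>

```python
def parse_conf(text):
    """Map each sensor name to a dict of its capabilities.

    Comment lines and blank lines are skipped.  A capability without
    '=' (such as 'ignore') is stored as True.
    """
    checks = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, *caps = line.split(":")
        entry = {}
        for cap in caps:
            if not cap.strip():
                continue  # tolerate a trailing colon
            key, sep, value = cap.partition("=")
            entry[key.strip()] = value.strip() if sep else True
        checks[name.strip()] = entry
    return checks


example = """\
# hw.sensors.acpibat0.watthour3=57.72 Wh, (remaining capacity)
hw.sensors.acpibat0.watthour3:warn.low=50 Wh:crit.low=30 Wh
hw.sensors.acpiac0.indicator0:crit=On
hw.sensors.acpitz0.temp0:warn.high=65 degC:crit.high=75 degC
"""

for sensor, caps in parse_conf(example).items():
    print(sensor, caps)
```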
                    111:                <h3>CVS log for check_hw_sensors</h3>
                    112:                <pre>
                    113: RCS file: /cvs/scripts/Admin scripts/check_hw_sensors/check_hw_sensors,v
                    114: Working file: check_hw_sensors
1.6     ! andrew    115: head: 1.22
1.5       andrew    116: branch:
                    117: locks: strict
                    118: access list:
                    119: symbolic names:
                    120: keyword substitution: kv
1.6     ! andrew    121: total revisions: 22;   selected revisions: 22
1.5       andrew    122: description:
                    123: ----------------------------
1.6     ! andrew    124: revision 1.22
        !           125: date: 2007/01/06 03:16:41;  author: andrew;  state: Exp;  lines: +11 -4
        !           126: Support the new dual level sensors
        !           127: ----------------------------
1.5       andrew    128: revision 1.21
                    129: date: 2006/12/05 16:26:27;  author: andrew;  state: Exp;  lines: +5 -5
                    130: new better example for 4.0
                    131: and fix the s/drive\s// from the data, not the type
                    132: ----------------------------
                    133: revision 1.20
                    134: date: 2006/12/05 00:17:47;  author: andrew;  state: Exp;  lines: +35 -26
                    135: Match sensors differently depending on OS Version from the Config module.
                    136:
                    137: Also support checks on the other sensor types and document that.
                    138:
                    139: and refactor the way I return a $sensor-&gt;{'status'} from ~10 lines to 1.
                    140: ----------------------------
                    141: revision 1.19
                    142: date: 2006/12/04 23:33:53;  author: andrew;  state: Exp;  lines: +8 -3
                    143: add a regex for the 'percent' type of sensor
                    144: ----------------------------
                    145: revision 1.18
                    146: date: 2006/12/02 02:15:17;  author: andrew;  state: Exp;  lines: +74 -9
                    147: fix it for the output from OpenBSD 4.0
                    148: ----------------------------
                    149: revision 1.17
                    150: date: 2006/10/25 23:30:23;  author: andrew;  state: Exp;  lines: +4 -7
                    151: get the docs up to match the new version
                    152: ----------------------------
                    153: revision 1.16
                    154: date: 2006/10/25 18:36:46;  author: andrew;  state: Exp;  lines: +4 -4
                    155: Stuff in CVS should output nagios format
                    156: ----------------------------
                    157: revision 1.15
                    158: date: 2006/10/25 18:35:59;  author: andrew;  state: Exp;  lines: +73 -46
                    159: add support for the status as reported by the sensors.  it is teh r0x0r!
                    160: ----------------------------
                    161: revision 1.14
                    162: date: 2006/05/04 01:30:29;  author: andrew;  state: Exp;  lines: +6 -4
                    163: I thought I checked this in already
                    164: ----------------------------
                    165: revision 1.13
                    166: date: 2006/05/03 22:16:42;  author: andrew;  state: Exp;  lines: +7 -6
                    167: Some more fixing of the help
                    168: ----------------------------
                    169: revision 1.12
                    170: date: 2006/05/03 21:54:43;  author: andrew;  state: Exp;  lines: +51 -30
                    171: Get the help output cleaned up.  Still not 100% what I want, but so far so good.
                    172: ----------------------------
                    173: revision 1.11
                    174: date: 2006/05/03 20:01:09;  author: holligan;  state: Exp;  lines: +4 -4
                    175: updated and clarified help
                    176: ----------------------------
                    177: revision 1.10
                    178: date: 2006/05/03 03:31:22;  author: andrew;  state: Exp;  lines: +27 -42
                    179: A bunch of cleanup and some kewl refactoring into loops.
                    180:
                    181: Still need to find a way to refactor the checks that are so similar!
                    182: ----------------------------
                    183: revision 1.9
                    184: date: 2006/05/03 02:26:47;  author: andrew;  state: Exp;  lines: +8 -8
                    185: It now doesn't do nonexistent checks (for some stuff anyway)
                    186: Also changed the &lt;br /&gt; to &lt;br&gt;.  Not valid XHTML whatever, but it does get stripped before getting sent to my pager.
                    187: ----------------------------
                    188: revision 1.8
                    189: date: 2006/05/02 21:23:29;  author: andrew;  state: Exp;  lines: +6 -16
                    190: Better looking output for the web page.
                    191: ----------------------------
                    192: revision 1.7
                    193: date: 2006/05/02 20:03:53;  author: andrew;  state: Exp;  lines: +5 -5
                    194: oops, that's an array ref!
                    195: ----------------------------
                    196: revision 1.6
                    197: date: 2006/05/02 19:59:47;  author: andrew;  state: Exp;  lines: +7 -5
                    198: Better output for the OK checks
                    199: ----------------------------
                    200: revision 1.5
                    201: date: 2006/05/02 19:49:29;  author: andrew;  state: Exp;  lines: +19 -12
                    202: Only show details for things other than OK, cuZ we are limited in the amount of data we can return :-(
                    203: ----------------------------
                    204: revision 1.4
                    205: date: 2006/05/02 15:54:42;  author: andrew;  state: Exp;  lines: +44 -21
                    206: Some cleanup, as well as making it output a single line like nagios supposedly likes.
                    207: ----------------------------
                    208: revision 1.3
                    209: date: 2006/05/02 01:39:23;  author: andrew;  state: Exp;  lines: +3 -3
                    210: fix the help getopts.
                    211: ----------------------------
                    212: revision 1.2
                    213: date: 2006/05/02 01:29:33;  author: andrew;  state: Exp;  lines: +427 -11
                    214: Adding the sensors from one of the routers, cuZ there were a lot and I can use it for testing.
                    215:
                    216: Also, now the check_hw_sensors now seems to be OK.  I need to put it on a few machines and set up the checks now.  If it works for the rest of the week, I can clean it up and mebbe put it on teh interweb and post to undeadly.
                    217: ----------------------------
                    218: revision 1.1
                    219: date: 2006/05/01 18:11:23;  author: andrew;  state: Exp;
                    220: add this so I can check it out on a box for testing
                    221: =============================================================================
                    222:                </pre>
                    223:         <p>Andrew Fresh &lt;<a href='mailto:andrew@mad-techies.org'>andrew@mad-techies.org</a>&gt;</p>
1.6     ! andrew    224:         <p><small>$RedRiver: index.html,v 1.5 2006/12/05 16:44:14 andrew Exp $</small></p>
1.5       andrew    225:     </body>
                    226: </html>
                    227:
