Tag archives: supervision

check-mk « check_pid »

This local check for check_mk is useful for services that won’t start if an old or incorrect PID file remains. I wrote it because this is exactly the case with the Postfix version bundled with Zimbra.

The code is self-explanatory:

 

#!/bin/bash
# Cyril Pawelko http://www.pawelko.net/check-mk-check_pid/
# Version 1
# Checks if the PID referenced in a PIDFILE exists
# This is a "local" check_mk check for *NIX:
# - Rename it if you need several checks on the same host
# - Copy to check-mk local checks directory (/usr/lib/check_mk_agent/local on Debian)
# - Customize with the PID file path:
PIDFILE="/opt/zimbra/data/postfix/spool/pid/master.pid"

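if [ ! -r "$PIDFILE" ]
 then echo "3 Pid_$PIDFILE - File $PIDFILE cannot be read"
 exit
fi

PID=$(cat "$PIDFILE")
# Keep digits only (drops whitespace and any stray characters)
PID=${PID//[^0-9]/}
# An empty PID would make a bare "ps" succeed, so test for it explicitly
if [ -n "$PID" ] && ps -p "$PID" > /dev/null
 then echo "0 Pid_$PIDFILE - Process $PID referenced in $PIDFILE is running"
 else echo "2 Pid_$PIDFILE - Process '$PID' referenced in $PIDFILE does not exist"
fi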

Download
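
For readers who prefer Python, the same logic can be sketched as below. This is only an illustration, not the script distributed above: the function name `check_pid` is made up for this example, and it swaps `ps` for `os.kill(pid, 0)`, which tests whether a process exists without sending it a signal.

```python
#!/usr/bin/env python3
import os

# Same path as in the bash version; adjust to your service.
PIDFILE = "/opt/zimbra/data/postfix/spool/pid/master.pid"


def check_pid(pidfile):
    """Return (status, text): 0 = OK, 2 = CRIT, 3 = UNKNOWN."""
    try:
        with open(pidfile) as handle:
            raw = handle.read()
    except OSError:
        return 3, "File %s cannot be read" % pidfile

    # Keep ASCII digits only, like the bash parameter expansion does.
    digits = "".join(ch for ch in raw if ch in "0123456789")
    if not digits or int(digits) == 0:
        return 2, "No usable PID found in %s" % pidfile

    pid = int(digits)
    try:
        os.kill(pid, 0)  # signal 0: existence test only, nothing is delivered
    except (ProcessLookupError, OverflowError):
        return 2, "Process %d referenced in %s does not exist" % (pid, pidfile)
    except PermissionError:
        pass  # the process exists but belongs to another user
    return 0, "Process %d referenced in %s is running" % (pid, pidfile)


if __name__ == "__main__":
    status, text = check_pid(PIDFILE)
    # Local check line: <status> <service name> <perfdata or "-"> <text>
    print("%d Pid_%s - %s" % (status, PIDFILE, text))
```

An empty or non-numeric PID file is reported as critical here, since a stale or damaged file is exactly what the check is meant to catch.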

Have fun !

Check_mk check for generic OID

A while ago, I wrote a simple check_mk check to monitor a single SNMP OID. It can easily be customized to quickly monitor any value, with alert levels and perfdata. Here it is:

#!/usr/bin/python
# -*- encoding: utf-8; py-indent-offset: 4 -*-

# Cyril - 28/01/2014
# Check_snmp_template for a single OID, for example .1.3.6.1.2.1.4.5.0
# How to customize :
# Break the OID in 2 parts, for instance ".1.3.6.1.2.1.4.5" and "0". Adapt baseoid and suboid.
# Adjust crit and warn values
# Replace "snmp_oid_test1" in all strings and function names
# Replace "valeur" with a description

# OID to check
baseoid = ".1.3.6.1.4.1.6486.800.1.2.1.16.1.1.1.13"
suboid = "0"
# Warn if above this value
warn = 70
# Critical if above this value
crit = 90
# OID description (for example: CPU Utilization)
valeur = "Valeur"

def inventory_snmp_oid_test1(info):
    if len(info) > 0:
        return [ (None, (warn, crit) ) ]

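def check_snmp_oid_test1(item, params, info):
    # params is the (warn, crit) tuple stored by the inventory function
    warn, crit = params
    value = int(info[0][0])
    perfdata = [ ("valeur", value, warn, crit) ]
    # Use the description configured above instead of a hard-coded label
    infotext = "%s: %i" % (valeur, value)
    if value > crit:
        return (2, infotext, perfdata)
    elif value > warn:
        return (1, infotext, perfdata)
    else:
        return (0, infotext, perfdata)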

check_info["snmp_oid_test1"] = {
        "check_function"        : check_snmp_oid_test1,
        "service_description"   : "Alcatel Switch CPU",
        "snmp_info"             : ( baseoid, [ suboid ] ),
        "has_perfdata"          : True,
        "inventory_function"    : inventory_snmp_oid_test1,
        }

# Quick and dirty scan function; testing against sysObjectID would be more efficient
snmp_scan_functions["snmp_oid_test1"] = \
        lambda oid: oid( baseoid + "." + suboid ) is not None

Download it here: check_mk_generic_snmp_oid
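
To try the threshold logic without a check_mk installation, it can be pulled out into a standalone function. This is a minimal sketch: `evaluate_levels` and `sample_info` are names invented for this example, and the `[["75"]]` shape is an assumption about how a single fetched OID arrives in `info` (one row, one column, as a string).

```python
def evaluate_levels(value, warn, crit, label="Valeur"):
    """Return (state, text, perfdata) using the same comparisons as the check."""
    perfdata = [("valeur", value, warn, crit)]
    text = "%s: %i" % (label, value)
    if value > crit:
        return (2, text, perfdata)  # CRIT: strictly above crit
    if value > warn:
        return (1, text, perfdata)  # WARN: strictly above warn
    return (0, text, perfdata)      # OK


# Assumed shape of the SNMP data for one OID: a single row with a single column.
sample_info = [["75"]]

state, text, perfdata = evaluate_levels(int(sample_info[0][0]), warn=70, crit=90)
print(state, text, perfdata)
# prints: 1 Valeur: 75 [('valeur', 75, 70, 90)]
```

Note that the comparisons are strict: a value equal to the warning level is still reported as OK.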

Dell Storage Compellent SC plugin for check_mk

I wrote a check_mk check for Dell "Compellent" storage arrays.

It was developed for the SC4020 model and also works with the SCv2000/SCv2020 (thanks Emmanuel) and the SC7020. It should work with other models (SC5020, SC8000, SC9000).

The following items can be monitored:

  • Global status, as reported by the system
  • Controllers status
  • I/O modules status
  • Disk status
  • Hardware sensors status
  • Temperatures (with performance data)
  • Power supplies status
  • Volume status
  • Servers status


Download from Check-MK Exchange

Have fun !

Update 27/06/2017: Since SCOS 7.x, temperature values are not reported correctly through SNMP, which caused the temperature checks to crash. This is a known bug that will be addressed in SCOS 7.2.10. Meanwhile, I’ll release version 1.1, which contains workarounds to avoid the crashes.
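
The workarounds in version 1.1 are not reproduced here. As a general illustration of how a check can survive a bad SNMP value, one common approach is to validate the value before converting it and to return UNKNOWN instead of raising an exception. Everything in this sketch (function names, thresholds, message text) is hypothetical and not taken from the plugin.

```python
def parse_temperature(raw):
    """Return the temperature as an int, or None when the value is unusable."""
    try:
        return int(str(raw).strip())
    except ValueError:
        return None


def check_temperature(raw, warn, crit):
    """Return (state, text); state 3 (UNKNOWN) replaces a crash on bad input."""
    value = parse_temperature(raw)
    if value is None:
        return (3, "Temperature not reported correctly by the device (got %r)" % (raw,))
    if value >= crit:
        return (2, "Temperature: %d C" % value)
    if value >= warn:
        return (1, "Temperature: %d C" % value)
    return (0, "Temperature: %d C" % value)


# Hypothetical thresholds, for illustration only.
print(check_temperature("35", warn=40, crit=50))
print(check_temperature("", warn=40, crit=50))
```

With this pattern, a device that returns an empty or garbled value shows up as an UNKNOWN service rather than as a failed check.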

 

 

Backup Exec plugin for check_mk

This check_mk plugin queries the Backup Exec database to find the job history, and returns the state of the last job execution.

It has been tested with Backup Exec 2010 and 2012. The main script requires Backup Exec 2010 or later; earlier versions from 9.0 to 12.5 should work with a separate script (see below).


It has been tested on Windows 2008 R2, and should work on Windows 2012, and maybe on Windows 2003 if PowerShell is installed.
Unlike many Backup Exec plugins for Nagios, it is not a compiled executable but a simple PowerShell script. It doesn’t parse the XML history files, so it runs very fast.
It tries to connect to the local MSSQL instance named BEDB, which is the default configuration. In most cases no configuration is needed: just copy backupexec_job.ps1 into the check_mk « plugins » subdirectory.

The following performance data is returned to create graphs: Job size, duration, rate and deduplication ratio.
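
To show how such values relate to each other, here is a small sketch that derives the rate from a job's size and duration and formats the three values as Nagios-style performance data. The plugin itself is a PowerShell script; this Python example is only an illustration, and the function names and the labels `size`, `duration` and `rate` are assumptions, not the plugin's actual output. The deduplication ratio is left out because its exact definition is not given here.

```python
def job_rate(size_bytes, duration_seconds):
    """Average throughput in bytes per second; 0.0 for a zero-length job."""
    if duration_seconds <= 0:
        return 0.0
    return size_bytes / float(duration_seconds)


def format_perfdata(size_bytes, duration_seconds):
    """Build a 'label=value' perfdata string (labels are hypothetical)."""
    rate = job_rate(size_bytes, duration_seconds)
    return "size=%dB duration=%ds rate=%.1f" % (size_bytes, duration_seconds, rate)


# A 1 GiB job that ran for 512 seconds.
print(format_perfdata(1073741824, 512))
# prints: size=1073741824B duration=512s rate=2097152.0
```

Guarding against a zero duration matters for jobs that are missed or canceled immediately, where a naive division would fail.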

[Graphs: backup rate and backup time]

 

 

Download from Check-MK Exchange

16/07/2014 : Version 1.1 – Added support for job status 9 (missed)

28/07/2014 : Version 1.2 – Added WATO plugin to treat jobs ‘completed with exceptions’ as OK

29/07/2014 : Version 1.3 – Added support for job status 21 (canceled, timed out)

02/04/2015 : For Backup Exec 12.5, you need to use backupexec_9-12.5_job on your Windows server (thanks Guillermo).

25/06/2015 : Version 1.4 – Fixed error « Check parameter definition for backupexec_job has type Dictionary, but match_type None » with check_mk 1.2.7i (thanks Jörg!). Version 1.4 no longer works with check_mk 1.2.6 or earlier versions; use backupexec_job-1.3 instead.

02/09/2015 : Version 1.5 – Fixed compatibility with both check_mk<=1.2.6 and check_mk>=1.2.5 (thanks Peter & François)

07/10/2015 : Version 1.6 – Fixed « ERROR: Skipping invalid manpage: backupexec_job (Catalog info missing) » in 1.2.7i3 (thanks to Daniel Müller)