Setting up pnp4nagios on Ubuntu 12.04 LTS Precise Pangolin

Prerequisites

nagios3 is installed.

Install

aptitude install pnp4nagios

Configuring Bulk Mode with NPCD

nagios.cfg

In /etc/nagios3/nagios.cfg, set process_performance_data=1. Check the current value with grep, then either run the sed command below or edit the file by hand.

grep process_performance_data /etc/nagios3/nagios.cfg
sed -i 's/process_performance_data=0/process_performance_data=1/' /etc/nagios3/nagios.cfg
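The sed command above only matches when the current value is exactly 0. A slightly more defensive variant could look like the sketch below. The function name enable_perfdata and the scratch file are illustrative assumptions, not part of the package; on a real system you would pass /etc/nagios3/nagios.cfg.

```shell
#!/bin/bash
# Hypothetical helper: force process_performance_data=1 in the given config file.
enable_perfdata() {
  cfg="$1"
  # Keep a backup copy next to the original before editing in place
  cp -p "$cfg" "$cfg.bak"
  # Rewrite whatever value is currently set
  sed -i 's/^process_performance_data=.*/process_performance_data=1/' "$cfg"
  # Succeed only if the setting is now present and enabled
  grep -q '^process_performance_data=1$' "$cfg"
}

# Demonstration on a scratch file (assumption: stands in for nagios.cfg)
demo_cfg=$(mktemp)
printf 'log_file=/var/log/nagios3/nagios.log\nprocess_performance_data=0\n' > "$demo_cfg"
enable_perfdata "$demo_cfg" && echo "performance data enabled in $demo_cfg"
```

The final grep makes the function fail when the setting is missing from the file altogether, which a plain sed would silently ignore.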

Create the spool directory where Nagios will write the performance data, and make it owned by the nagios user:

mkdir -p /var/spool/pnp4nagios/nagios
chown -R nagios:nagios /var/spool/pnp4nagios

Configure nagios.cfg to use the directories and files for storing the data

#
# service performance data
#
service_perfdata_file=/var/spool/pnp4nagios/nagios/service-perfdata
service_perfdata_file_template=DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tSERVICESTATE::$SERVICESTATE$\tSERVICESTATETYPE::$SERVICESTATETYPE$
service_perfdata_file_mode=a
service_perfdata_file_processing_interval=15
service_perfdata_file_processing_command=pnp-bulk-service
 
#
# host performance data
# 
host_perfdata_file=/var/spool/pnp4nagios/nagios/host-perfdata
host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$
host_perfdata_file_mode=a
host_perfdata_file_processing_interval=15
host_perfdata_file_processing_command=pnp-bulk-host
#
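The two templates above make Nagios write one tab-separated line of KEY::VALUE pairs per check result. As a rough illustration of that format, the sketch below builds a sample line and splits it back into its fields. The host name, timestamp and perfdata values are invented for the example, and parse_perfdata_line is a hypothetical helper.

```shell
#!/bin/bash
# Invented sample line in the shape produced by service_perfdata_file_template
line=$(printf 'DATATYPE::SERVICEPERFDATA\tTIMET::1356998400\tHOSTNAME::server1\tSERVICEDESC::Disks\tSERVICEPERFDATA::/=2643MB;5948;5958;0;5968')

# Hypothetical helper: print each tab-separated KEY::VALUE pair as "KEY = VALUE"
parse_perfdata_line() {
  printf '%s\n' "$1" | tr '\t' '\n' | while IFS= read -r field; do
    # Everything before the first "::" is the key, everything after it the value
    printf '%s = %s\n' "${field%%::*}" "${field#*::}"
  done
}

parsed=$(parse_perfdata_line "$line")
echo "$parsed"
```

Looking at a real line from /var/spool/pnp4nagios/nagios/service-perfdata in the same way is a quick check that the template was picked up by Nagios.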

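/etc/nagios3/conf.d/pnp4nagios.cfg should already define the pnp-bulk-service and pnp-bulk-host commands (the file is installed with the package). If it does not, paste the definitions below into that file. Check the path to process_perfdata.pl: it may differ on your system and is sometimes /usr/local/pnp4nagios/libexec. Note that the pnp-bulk-* commands run process_perfdata.pl directly, while the pnp-bulknpcd-* commands only move the files into the npcd spool directory; to hand the data to NPCD, set the two *_perfdata_file_processing_command options in nagios.cfg to the pnp-bulknpcd-* commands instead.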

define command {
    command_name    pnp-synchronous-service
    command_line    /usr/bin/perl /usr/lib/pnp4nagios/libexec/process_perfdata.pl
}

define command {
    command_name    pnp-synchronous-host
    command_line    /usr/bin/perl /usr/lib/pnp4nagios/libexec/process_perfdata.pl -d HOSTPERFDATA
}

##############################################################################

define command{
    command_name    pnp-bulk-service
    command_line    /usr/bin/perl /usr/lib/pnp4nagios/libexec/process_perfdata.pl --bulk=/var/spool/pnp4nagios/nagios/service-perfdata
}

define command{
    command_name    pnp-bulk-host
    command_line    /usr/bin/perl /usr/lib/pnp4nagios/libexec/process_perfdata.pl --bulk=/var/spool/pnp4nagios/nagios/host-perfdata
}

##############################################################################

define command{
    command_name    pnp-bulknpcd-service
    command_line    /bin/mv /var/spool/pnp4nagios/nagios/service-perfdata /var/spool/pnp4nagios/npcd/service-perfdata.$TIMET$
}

define command{
    command_name    pnp-bulknpcd-host
    command_line    /bin/mv /var/spool/pnp4nagios/nagios/host-perfdata /var/spool/pnp4nagios/npcd/host-perfdata.$TIMET$
}
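Because process_perfdata.pl can live in either of the two locations mentioned above, it may be worth checking which one exists before pasting the command definitions. The following is a minimal sketch; first_existing is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: print the first argument that exists as a path,
# or return non-zero if none of them does.
first_existing() {
  for p in "$@"; do
    if [ -e "$p" ]; then
      printf '%s\n' "$p"
      return 0
    fi
  done
  return 1
}

# The two candidate locations named in this guide
first_existing \
  /usr/lib/pnp4nagios/libexec/process_perfdata.pl \
  /usr/local/pnp4nagios/libexec/process_perfdata.pl \
  || echo "process_perfdata.pl not found in either location" >&2
```

Whatever path it prints is the one to use in every command_line above.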

Configure pnp4nagios

  • Edit /etc/default/npcd
    • Update: RUN="yes"
  • Validate: /etc/pnp4nagios/npcd.cfg or /usr/local/pnp4nagios/etc/npcd.cfg
    • No change usually required

Apache web server configuration

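Link the pnp4nagios web configuration into the Apache configuration. Ubuntu 12.04 ships Apache 2.2, which reads extra configuration from /etc/apache2/conf.d. The conf-enabled directory only exists with Apache 2.4 on later releases, so use /etc/apache2/conf-enabled/pnp4nagios.conf as the link target there.

ln -s /etc/pnp4nagios/apache.conf /etc/apache2/conf.d/pnp4nagios.conf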
service apache2 restart

Restart

Restart services

service nagios3 restart
service npcd start
service apache2 restart
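After the restart, one rough way to see whether performance data is being processed is to watch the spool directory created earlier: with a 15 second processing interval it should hold only a couple of files at any time. The sketch below counts them; count_files is a hypothetical helper, and the SPOOL_DIR override is an assumption added for testing.

```shell
#!/bin/bash
# Hypothetical helper: count regular files directly inside a directory
# (prints 0 if the directory does not exist).
count_files() {
  find "$1" -maxdepth 1 -type f 2>/dev/null | wc -l
}

# Spool directory from this guide; override with SPOOL_DIR if yours differs
spool="${SPOOL_DIR:-/var/spool/pnp4nagios/nagios}"
echo "files waiting in $spool: $(count_files "$spool")"
```

A count that keeps growing between runs suggests the processing command is not being executed.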

Use

Optional

In /etc/pnp4nagios/npcd.cfg change log_type from syslog to file

log_type = file
#log_type = syslog

Errors

A common error occurs when monitoring parameters change in Nagios: the old XML definitions no longer match the new data and have to be deleted. To determine whether there are issues, run:

grep -R "found extra data" /var/lib/pnp4nagios/perfdata/*/*.xml

For each affected check, delete both the .rrd file and the .xml file. For example, if the output is

/var/lib/pnp4nagios/perfdata/server1/Disks.xml:    <TXT>/var/lib/pnp4nagios/perfdata/server1/Disks.rrd: found extra data on update argument: 613954</TXT>

then delete as follows

rm /var/lib/pnp4nagios/perfdata/server1/Disks.*

Of course this means you lose historical data as well.
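Since deleting is destructive, it can help to list the affected file pairs first and review them before running rm. The sketch below is a dry run that deletes nothing. list_stale_pairs is a hypothetical helper, and the scratch directory with its two XML files is invented sample data standing in for /var/lib/pnp4nagios/perfdata.

```shell
#!/bin/bash
# Hypothetical helper: print the .xml/.rrd pair for every XML file under the
# given directory that reports "found extra data". Nothing is deleted.
list_stale_pairs() {
  grep -l -R --include='*.xml' "found extra data" "$1" 2>/dev/null |
  while IFS= read -r xml; do
    printf '%s\n%s\n' "$xml" "${xml%.xml}.rrd"
  done
}

# Demonstration on invented sample data
demo=$(mktemp -d)
mkdir -p "$demo/server1"
echo '<TXT>Disks.rrd: found extra data on update argument: 613954</TXT>' > "$demo/server1/Disks.xml"
echo '<TXT>successful updated</TXT>' > "$demo/server1/Load.xml"
list_stale_pairs "$demo"
```

On a real system, call list_stale_pairs /var/lib/pnp4nagios/perfdata and only remove the listed files once the output looks right.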

Alternative grep to check for other errors/information as well:

grep -R TXT /var/lib/pnp4nagios/perfdata/*/*.xml|grep -v successful

Cron job for error check

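Create /etc/cron.daily/pnp4nagios_check as below and make it executable (chmod +x):

#!/bin/bash
# Mail any non-successful PNP4Nagios processing messages to the admin.
#
PNP4LOC=/var/lib/pnp4nagios/perfdata
#
# Collect every TXT status line that does not report success
PNPERRORS=$(grep -R TXT "$PNP4LOC"/*/*.xml | grep -v successful)
# Send mail only when at least one such line was found
if [ -n "$PNPERRORS" ]; then
  echo "$PNPERRORS" | mailx -s "PNP4Nagios Error" admin@example.org
fi
#
exit 0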

Increasing RRD Resolution

The file /etc/pnp4nagios/rra.cfg holds the default resolution. The defaults roll up and aggregate the data quite quickly, and I prefer to keep more resolution over longer periods of time. The settings below make the data more fine-grained over longer time periods. Changes to rra.cfg apply only to RRD files created afterwards; existing RRD files keep the layout they were created with.

Sometimes this file is at /usr/local/pnp4nagios/etc/rra.cfg

Sometimes this file is referenced from /usr/local/pnp4nagios/etc/process_perfdata.cfg

#
# Define the default RRA Step in seconds
# More Infos on
# http://oss.oetiker.ch/rrdtool/doc/rrdcreate.en.html
#
RRA_STEP=60
#
# PNP default RRA config
#
# you will get 6 MB of data per datasource
#
# 51840 entries with 1 minute step = 36 days
#
RRA:AVERAGE:0.5:1:51840
#
# 115200 entries with 5 minute step = 400 days
#
RRA:AVERAGE:0.5:5:115200
#
# 38400 entries with 30 minute step = 800 days
#
RRA:AVERAGE:0.5:30:38400
#
# 35040 entries with 60 minute step = 4 years
#
RRA:AVERAGE:0.5:60:35040

RRA:MAX:0.5:1:51840
RRA:MAX:0.5:5:115200
RRA:MAX:0.5:30:38400
RRA:MAX:0.5:60:35040

RRA:MIN:0.5:1:51840
RRA:MIN:0.5:5:115200
RRA:MIN:0.5:30:38400
RRA:MIN:0.5:60:35040
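The retention figures in the comments follow from steps x rows x RRA_STEP. The sketch below reproduces that arithmetic for the four RRA definitions, plus a rough size estimate. The helper name rra_days is hypothetical, and the size figure assumes 8 bytes per stored value and ignores file header overhead.

```shell
#!/bin/bash
RRA_STEP=60   # seconds, as set in rra.cfg above

# Hypothetical helper: retention in whole days for steps-per-row ($1) and rows ($2)
rra_days() {
  echo $(( $1 * $2 * RRA_STEP / 86400 ))
}

# Steps and rows taken from the RRA:AVERAGE lines above
for rra in 1:51840 5:115200 30:38400 60:35040; do
  steps=${rra%%:*}
  rows=${rra##*:}
  echo "${steps} step(s) per row, ${rows} rows: $(rra_days "$steps" "$rows") days"
done

# Rough size per datasource: rows x 3 consolidation functions x 8 bytes (assumption)
total_rows=$(( 51840 + 115200 + 38400 + 35040 ))
echo "approx bytes per datasource: $(( total_rows * 3 * 8 ))"
```

The same function can be used to try out other row counts before editing rra.cfg.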

Other

