This assumes nagios3 is already installed. Install pnp4nagios:
aptitude install pnp4nagios
In /etc/nagios3/nagios.cfg, set process_performance_data=1. Use the sed command below or just edit the file.
grep process_performance_data /etc/nagios3/nagios.cfg
sed -i 's/process_performance_data=0/process_performance_data=1/' /etc/nagios3/nagios.cfg
Create the following directories where the performance data will be stored
mkdir -p /var/spool/pnp4nagios/nagios
chown -R nagios:nagios /var/spool/pnp4nagios
Configure nagios.cfg to use the directories and files for storing the data
#
# service performance data
#
service_perfdata_file=/var/spool/pnp4nagios/nagios/service-perfdata
service_perfdata_file_template=DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tSERVICESTATE::$SERVICESTATE$\tSERVICESTATETYPE::$SERVICESTATETYPE$
service_perfdata_file_mode=a
service_perfdata_file_processing_interval=15
service_perfdata_file_processing_command=pnp-bulk-service
#
# host performance data
#
host_perfdata_file=/var/spool/pnp4nagios/nagios/host-perfdata
host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$
host_perfdata_file_mode=a
host_perfdata_file_processing_interval=15
host_perfdata_file_processing_command=pnp-bulk-host
#
conf.d/pnp4nagios.cfg should already define the pnp-bulk-service and pnp-bulk-host commands (they are added by the install). If not, paste the following into /etc/nagios3/pnp4nagios.cfg. Note that the path to process_perfdata.pl may be different; it is sometimes /usr/local/pnp4nagios/libexec.
define command {
    command_name    pnp-synchronous-service
    command_line    /usr/bin/perl /usr/lib/pnp4nagios/libexec/process_perfdata.pl
}
define command {
    command_name    pnp-synchronous-host
    command_line    /usr/bin/perl /usr/lib/pnp4nagios/libexec/process_perfdata.pl -d HOSTPERFDATA
}
##############################################################################
define command {
    command_name    pnp-bulk-service
    command_line    /usr/bin/perl /usr/lib/pnp4nagios/libexec/process_perfdata.pl --bulk=/var/spool/pnp4nagios/nagios/service-perfdata
}
define command {
    command_name    pnp-bulk-host
    command_line    /usr/bin/perl /usr/lib/pnp4nagios/libexec/process_perfdata.pl --bulk=/var/spool/pnp4nagios/nagios/host-perfdata
}
##############################################################################
define command {
    command_name    pnp-bulknpcd-service
    command_line    /bin/mv /var/spool/pnp4nagios/nagios/service-perfdata /var/spool/pnp4nagios/npcd/service-perfdata.$TIMET$
}
define command {
    command_name    pnp-bulknpcd-host
    command_line    /bin/mv /var/spool/pnp4nagios/nagios/host-perfdata /var/spool/pnp4nagios/npcd/host-perfdata.$TIMET$
}
Link the pnp4nagios web configuration to apache configuration
ln -s /etc/pnp4nagios/apache.conf /etc/apache2/conf-enabled/pnp4nagios.conf
service apache2 restart
Restart services
service nagios3 restart
service npcd start
service apache2 restart
In /etc/pnp4nagios/npcd.cfg, change log_type from syslog to file:
log_type = file
#log_type = syslog
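The same change can be scripted. The following is a minimal sketch, not a tested deployment step: the helper name set_npcd_log_type_file is made up for illustration, it assumes a sed that supports -i and -E (as GNU sed does), and it is demonstrated on a throwaway file rather than the real npcd.cfg.

```shell
#!/bin/sh
# Sketch: switch npcd logging from syslog to file in an npcd.cfg-style file.
# set_npcd_log_type_file is a hypothetical helper name, not part of pnp4nagios.
set_npcd_log_type_file() {
    cfg=$1
    # Tolerate optional spaces around "=" and keep a backup as <cfg>.bak.
    sed -i.bak -E 's/^log_type[[:space:]]*=[[:space:]]*syslog[[:space:]]*$/log_type = file/' "$cfg"
    # Show the resulting directive so the change can be confirmed.
    grep -E '^log_type' "$cfg"
}

# Demo on a temporary file; on a real system pass /etc/pnp4nagios/npcd.cfg.
demo_npcd=$(mktemp)
printf 'user = nagios\nlog_type = syslog\n' > "$demo_npcd"
npcd_log_type=$(set_npcd_log_type_file "$demo_npcd")
echo "$npcd_log_type"
```

npcd most likely needs a restart (service npcd restart) before it picks up the new setting.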
A common error occurs when the monitoring parameters of a check change in Nagios: the old RRD and XML definitions no longer match the new data and have to be deleted. To determine whether there are issues, run:
grep -R "found extra data" /var/lib/pnp4nagios/perfdata/*/*.xml
For each .xml file reported, delete both the .rrd and the .xml file. For example, if the output is
/var/lib/pnp4nagios/perfdata/server1/Disks.xml: <TXT>/var/lib/pnp4nagios/perfdata/server1/Disks.rrd: found extra data on update argument: 613954</TXT>
then delete as follows
rm /var/lib/pnp4nagios/perfdata/server1/Disks.*
Of course this means you lose historical data as well.
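If losing the history outright is a concern, the files can be moved aside instead of deleted. This is only a sketch under assumptions: archive_perfdata and the archive directory are hypothetical names, and the demo runs against a temporary tree rather than /var/lib/pnp4nagios/perfdata.

```shell
#!/bin/sh
# Sketch: move a service's .rrd and .xml files aside instead of deleting them.
# archive_perfdata and the archive layout are illustrative assumptions.
archive_perfdata() {
    perfdata_dir=$1
    host=$2
    service=$3
    archive_dir=$4
    dest="$archive_dir/$host"
    mkdir -p "$dest"
    for f in "$perfdata_dir/$host/$service.rrd" "$perfdata_dir/$host/$service.xml"; do
        # Skip a missing file so a partial pair is still handled.
        if [ -e "$f" ]; then
            mv "$f" "$dest/"
        fi
    done
}

# Demo against a throwaway tree that mimics perfdata/server1/Disks.*
demo_root=$(mktemp -d)
mkdir -p "$demo_root/perfdata/server1"
touch "$demo_root/perfdata/server1/Disks.rrd" "$demo_root/perfdata/server1/Disks.xml"
archive_perfdata "$demo_root/perfdata" server1 Disks "$demo_root/archive"
ls "$demo_root/archive/server1"
```

The archived .rrd file still has the old data source layout, so it is useful for reference only; a fresh pair is created from the next check result.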
Alternative grep to check for other errors/information as well:
grep -R TXT /var/lib/pnp4nagios/perfdata/*/*.xml|grep -v successful
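Create /etc/cron.daily/pnp4nagios_check as below and make it executable (chmod +x):
#!/bin/bash
#
# Location of the per-host perfdata directories
PNP4LOC=/var/lib/pnp4nagios/perfdata
#
# Count the TXT lines that do not report success
PNPERRCNT=$(grep -R TXT "$PNP4LOC"/*/*.xml | grep -c -v successful)
if [ "$PNPERRCNT" -gt 0 ]; then
    # Mail the offending lines to the administrator
    grep -R TXT "$PNP4LOC"/*/*.xml | grep -v successful | mailx -s "PNP4Nagios Error" admin@example.org
fi
#
exit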
The /etc/pnp4nagios/rra.cfg file holds the default resolution. Sometimes this file is at /usr/local/pnp4nagios/etc/rra.cfg, and sometimes it is referenced from /usr/local/pnp4nagios/etc/process_perfdata.cfg.
The default rolls up and aggregates the data quite quickly, and I prefer to have more resolution over longer periods of time. The two configurations below make the resolution more fine-grained over longer time periods: the first stores about 6 MB per data source, the second about 2.8 MB. Changes to rra.cfg only affect RRD files created afterwards; existing files keep their layout.
#
# Define the default RRA Step in seconds
# More Infos on
# http://oss.oetiker.ch/rrdtool/doc/rrdcreate.en.html
#
RRA_STEP=60
#
# PNP default RRA config
#
# you will get 6 MB of data per datasource
#
# 51840 entries with 1 minute step = 36 days
#
RRA:AVERAGE:0.5:1:51840
#
# 115200 entries with 5 minute step = 400 days
#
RRA:AVERAGE:0.5:5:115200
#
# 38400 entries with 30 minute step = 800 days
#
RRA:AVERAGE:0.5:30:38400
#
# 35040 entries with 60 minute step = 4 years
#
RRA:AVERAGE:0.5:60:35040
RRA:MAX:0.5:1:51840
RRA:MAX:0.5:5:115200
RRA:MAX:0.5:30:38400
RRA:MAX:0.5:60:35040
RRA:MIN:0.5:1:51840
RRA:MIN:0.5:5:115200
RRA:MIN:0.5:30:38400
RRA:MIN:0.5:60:35040
#
# Define the default RRA Step in seconds
# More Infos on
# http://oss.oetiker.ch/rrdtool/doc/rrdcreate.en.html
#
RRA_STEP=60
#
# PNP default RRA config
#
# you will get 2.8 MB of data per datasource
#
# 11520 entries with 1 minute step = 8 days
#
RRA:AVERAGE:0.5:1:11520
#
# 11520 entries with 5 minute step = 40 days
#
RRA:AVERAGE:0.5:5:11520
#
# 19200 entries with 30 minute step = 400 days
#
RRA:AVERAGE:0.5:30:19200
#
# 17520 entries with 120 minute step = 4 years
#
RRA:AVERAGE:0.5:120:17520
RRA:MAX:0.5:1:11520
RRA:MAX:0.5:5:11520
RRA:MAX:0.5:30:19200
RRA:MAX:0.5:120:17520
RRA:MIN:0.5:1:11520
RRA:MIN:0.5:5:11520
RRA:MIN:0.5:30:19200
RRA:MIN:0.5:120:17520
For comparison, this is the default resolution you get on install:
#
# Define the default RRA Step in seconds
# More Infos on
# http://oss.oetiker.ch/rrdtool/doc/rrdcreate.en.html
#
RRA_STEP=60
#
# PNP default RRA config
#
# you will get 400kb of data per datasource
#
# 2880 entries with 1 minute step = 48 hours
#
RRA:AVERAGE:0.5:1:2880
#
# 2880 entries with 5 minute step = 10 days
#
RRA:AVERAGE:0.5:5:2880
#
# 4320 entries with 30 minute step = 90 days
#
RRA:AVERAGE:0.5:30:4320
#
# 5840 entries with 360 minute step = 4 years
#
RRA:AVERAGE:0.5:360:5840
RRA:MAX:0.5:1:2880
RRA:MAX:0.5:5:2880
RRA:MAX:0.5:30:4320
RRA:MAX:0.5:360:5840
RRA:MIN:0.5:1:2880
RRA:MIN:0.5:5:2880
RRA:MIN:0.5:30:4320
RRA:MIN:0.5:360:5840
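The retention figures in the comments of these files follow from RRA_STEP × steps × rows. As a rough way to check a line before putting it into rra.cfg, the arithmetic can be sketched as below; rra_days is a hypothetical helper name and it rounds down to whole days.

```shell
#!/bin/sh
# Sketch: retention in whole days for one RRA line, given RRA_STEP in seconds.
# rra_days is a hypothetical helper, not part of pnp4nagios or rrdtool.
rra_days() {
    rra_step=$1
    rra=$2
    # An RRA line has the form RRA:CF:xff:steps:rows
    rows=${rra##*:}
    rest=${rra%:*}
    steps=${rest##*:}
    # seconds per row = rra_step * steps; 86400 seconds per day
    echo $(( $rra_step * $steps * $rows / 86400 ))
}

rra_days 60 "RRA:AVERAGE:0.5:1:51840"    # prints 36
rra_days 60 "RRA:AVERAGE:0.5:5:115200"   # prints 400
rra_days 60 "RRA:AVERAGE:0.5:360:5840"   # prints 1460 (4 years)
```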
Format
'label'=value[UOM];[warn];[crit];[min];[max]
Example showing multiple data sources
Access Count is OK. Response Time is OK. HTTP 2xx Count is OK. HTTP 3xx Count is OK. HTTP 4xx Count is OK. HTTP 5xx Count is OK. Access Count=23 Response Time=179357us HTTP 2xx Count=13 HTTP 3xx Count=10 HTTP 4xx Count=0 HTTP 5xx Count=0|'Access Count'=23;1500;1600;0 'Response Time'=179357us;250000;300000;0 'HTTP 2xx Count'=13;1500;1600;0 'HTTP 3xx Count'=10;350;400;0 'HTTP 4xx Count'=0;30;50;0 'HTTP 5xx Count'=0;10;15;0
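To see how such a line breaks down into data sources, here is a small sketch that prints each label with its value (including the unit of measure). parse_perfdata is a hypothetical helper, the sample string is a shortened version of the example above, and the sketch assumes the output contains a "|" and that labels do not themselves contain single quotes.

```shell
#!/bin/sh
# Sketch: list label and value (with unit) for each data source in plugin output.
# parse_perfdata is a hypothetical helper name.
parse_perfdata() {
    # Everything after the first "|" is performance data.
    perf=${1#*\|}
    # One match per data source: quoted or unquoted label, "=", then the value fields.
    printf '%s\n' "$perf" | grep -oE "('[^']+'|[^ =']+)=[^ ]+" | while IFS= read -r item; do
        label=${item%%=*}
        fields=${item#*=}
        # The first ";"-separated field is value[UOM]; warn, crit, min, max follow.
        value=${fields%%;*}
        # Drop the single quotes around labels that contain spaces.
        label=${label#\'}
        label=${label%\'}
        printf '%s\t%s\n' "$label" "$value"
    done
}

plugin_output="Access Count is OK. Response Time is OK.|'Access Count'=23;1500;1600;0 'Response Time'=179357us;250000;300000;0 'HTTP 5xx Count'=0;10;15;0"
parsed=$(parse_perfdata "$plugin_output")
printf '%s\n' "$parsed"
```

Each data source in the performance data ends up as its own data series in PNP4Nagios, which is why label changes lead to the "found extra data" situation described earlier.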
You can customize data graphs by creating a custom PHP template and naming it appropriately. The naming convention is to use the same name as the Nagios command, with a .php extension. The default template shows the underlying Nagios command name in the graph (at the bottom right corner), which tells you what name to use. The shipped templates are typically located in /usr/local/pnp4nagios/share/templates.dist; custom templates normally go in the sibling templates directory so that upgrades do not overwrite them. Copy default.php to <command>.php and customize as required. Check pnp4nagios Templates for more information.
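The copy step could look like the sketch below. It assumes the usual layout where shipped templates live in templates.dist and custom ones in a sibling templates directory; new_pnp_template is a hypothetical helper name, and the demo uses a temporary directory in place of the real share directory.

```shell
#!/bin/sh
# Sketch: create a custom template for one check command from default.php.
# new_pnp_template is a hypothetical helper; the directory layout is an assumption.
new_pnp_template() {
    share_dir=$1
    check_command=$2
    src="$share_dir/templates.dist/default.php"
    dest="$share_dir/templates/$check_command.php"
    mkdir -p "$share_dir/templates"
    # Refuse to overwrite an existing custom template.
    if [ -e "$dest" ]; then
        echo "exists: $dest" >&2
        return 1
    fi
    cp "$src" "$dest" && echo "$dest"
}

# Demo against a throwaway tree; on a real system share_dir would be
# something like /usr/local/pnp4nagios/share.
demo_share=$(mktemp -d)
mkdir -p "$demo_share/templates.dist"
echo '<?php // default template ?>' > "$demo_share/templates.dist/default.php"
created=$(new_pnp_template "$demo_share" check_ls_memory_usage)
echo "$created"
```

The helper refuses to overwrite an existing file so that an already customized template is not lost by accident.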
Also refer to Custom Templates to change which command name the template lookup uses. The config file <check_command>.cfg in the etc/check_commands directory (usually under /usr/local/pnp4nagios) determines which template file is used. This is useful when the Nagios command is the same (such as check_nrpe) and you need to customize per sub-command (such as check_nrpe_1arg!check_ls_memory_usage). In this case, create a file check_nrpe.cfg with CUSTOM_TEMPLATE = 1 to specify that the sub-command name is used as the custom template name.