tech:linux:pnp4nagios [2018/12/06 06:27] (current)
 +====== Setting up pnp4nagios on Ubuntu 12.04 LTS Precise Pangolin ======
 +===== Prerequisites =====
 +nagios3 is installed.
 +
 +===== Install =====
 +<​code>​
 +aptitude install pnp4nagios
 +</​code>​
 +
 +===== Configuring Bulk Mode with NPCD =====
 +==== nagios.cfg ====
 +In /etc/nagios3/nagios.cfg, set process_performance_data=1, either with the sed command below or by editing the file directly.
 +<code bash>
 +grep process_performance_data /​etc/​nagios3/​nagios.cfg
 +sed -i '​s/​process_performance_data=0/​process_performance_data=1/'​ /​etc/​nagios3/​nagios.cfg
 +</​code>​
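The sed command above only matches when the current value is exactly 0. As a minimal sketch, assuming GNU sed, the hypothetical helper below (the name enable_perfdata is not part of Nagios) sets the directive whatever its current value is and appends it if it is missing:

```shell
# Sketch: force process_performance_data=1 in a Nagios main config file.
# enable_perfdata is a hypothetical helper name, not a Nagios tool.
enable_perfdata() {
  local cfg="$1"
  if grep -q '^[[:space:]]*process_performance_data=' "$cfg"; then
    # Directive present: rewrite it regardless of its current value
    sed -i 's/^[[:space:]]*process_performance_data=.*/process_performance_data=1/' "$cfg"
  else
    # Directive missing: append it
    echo 'process_performance_data=1' >> "$cfg"
  fi
}
```

It would be called as ''enable_perfdata /etc/nagios3/nagios.cfg''. Running it twice leaves a single directive in place.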
 +
 +Create the directory where the performance data will be spooled and make the nagios user its owner:
 +<code bash>
 +mkdir -p /​var/​spool/​pnp4nagios/​nagios
 +chown -R nagios:​nagios /​var/​spool/​pnp4nagios
 +</​code>​
 +
 +Configure nagios.cfg to write the performance data to files in that directory:
 +<code ini>
 +#
 +# service performance data
 +#
 +service_perfdata_file=/​var/​spool/​pnp4nagios/​nagios/​service-perfdata
 +service_perfdata_file_template=DATATYPE::​SERVICEPERFDATA\tTIMET::​$TIMET$\tHOSTNAME::​$HOSTNAME$\tSERVICEDESC::​$SERVICEDESC$\tSERVICEPERFDATA::​$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::​$SERVICECHECKCOMMAND$\tHOSTSTATE::​$HOSTSTATE$\tHOSTSTATETYPE::​$HOSTSTATETYPE$\tSERVICESTATE::​$SERVICESTATE$\tSERVICESTATETYPE::​$SERVICESTATETYPE$
 +service_perfdata_file_mode=a
 +service_perfdata_file_processing_interval=15
 +service_perfdata_file_processing_command=pnp-bulk-service
 +
 +#
 +# host performance data
 +#
 +host_perfdata_file=/​var/​spool/​pnp4nagios/​nagios/​host-perfdata
 +host_perfdata_file_template=DATATYPE::​HOSTPERFDATA\tTIMET::​$TIMET$\tHOSTNAME::​$HOSTNAME$\tHOSTPERFDATA::​$HOSTPERFDATA$\tHOSTCHECKCOMMAND::​$HOSTCHECKCOMMAND$\tHOSTSTATE::​$HOSTSTATE$\tHOSTSTATETYPE::​$HOSTSTATETYPE$
 +host_perfdata_file_mode=a
 +host_perfdata_file_processing_interval=15
 +host_perfdata_file_processing_command=pnp-bulk-host
 +#
 +</​code>​
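With the templates above, each line in the spool file is a tab-separated list of KEY::VALUE pairs. The sketch below, with the hypothetical helper name show_perfdata_fields, splits such lines into one KEY=VALUE per output line, which can help when checking what Nagios actually wrote:

```shell
# Sketch: print each KEY::VALUE field of a perfdata spool line as KEY=VALUE.
# show_perfdata_fields is a hypothetical helper; it reads lines from stdin.
show_perfdata_fields() {
  awk -F'\t' '{
    for (i = 1; i <= NF; i++) {
      sep = index($i, "::")          # position of the first "::" in the field
      if (sep > 0)
        print substr($i, 1, sep - 1) "=" substr($i, sep + 2)
    }
  }'
}
```

For example, ''show_perfdata_fields < /var/spool/pnp4nagios/nagios/service-perfdata'' would list the fields of whatever is currently waiting in the spool file.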
 +
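 +conf.d/pnp4nagios.cfg should already define the pnp-bulk-service and pnp-bulk-host commands (the package install adds them). If it does not, paste the definitions below into ''/etc/nagios3/pnp4nagios.cfg''. Check the path to ''process_perfdata.pl'': it may differ and is sometimes under ''/usr/local/pnp4nagios/libexec''.
 +Note that the pnp-bulk-* commands run ''process_perfdata.pl'' directly, while the pnp-bulknpcd-* commands only move the file into ''/var/spool/pnp4nagios/npcd'' for NPCD to process. Point the ''*_perfdata_file_processing_command'' settings in nagios.cfg at the pnp-bulknpcd-* pair if NPCD should do the work.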
 +
 +<​code>​
 +define command {
 +    command_name ​   pnp-synchronous-service
 +    command_line ​   /​usr/​bin/​perl /​usr/​lib/​pnp4nagios/​libexec/​process_perfdata.pl
 +}
 +
 +define command {
 +    command_name ​   pnp-synchronous-host
 +    command_line ​   /​usr/​bin/​perl /​usr/​lib/​pnp4nagios/​libexec/​process_perfdata.pl -d HOSTPERFDATA
 +}
 +
 +##############################################################################​
 +
 +define command{
 +    command_name ​   pnp-bulk-service
 +    command_line ​   /​usr/​bin/​perl /​usr/​lib/​pnp4nagios/​libexec/​process_perfdata.pl --bulk=/​var/​spool/​pnp4nagios/​nagios/​service-perfdata
 +}
 +
 +define command{
 +    command_name ​   pnp-bulk-host
 +    command_line ​   /​usr/​bin/​perl /​usr/​lib/​pnp4nagios/​libexec/​process_perfdata.pl --bulk=/​var/​spool/​pnp4nagios/​nagios/​host-perfdata
 +}
 +
 +##############################################################################​
 +
 +define command{
 +    command_name ​   pnp-bulknpcd-service
 +    command_line ​   /bin/mv /​var/​spool/​pnp4nagios/​nagios/​service-perfdata /​var/​spool/​pnp4nagios/​npcd/​service-perfdata.$TIMET$
 +}
 +
 +define command{
 +    command_name ​   pnp-bulknpcd-host
 +    command_line ​   /bin/mv /​var/​spool/​pnp4nagios/​nagios/​host-perfdata /​var/​spool/​pnp4nagios/​npcd/​host-perfdata.$TIMET$
 +}
 +</​code>​
 +
 +==== Config pnp4nagios ====
 +  * Edit /​etc/​default/​npcd
 +    * Update: RUN="​yes"​
 +  * Validate: /​etc/​pnp4nagios/​npcd.cfg or /​usr/​local/​pnp4nagios/​etc/​npcd.cfg
 +    * No change usually required
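The edit to /etc/default/npcd can also be scripted. This is a minimal sketch assuming GNU sed and a file that already contains a RUN= line; enable_npcd is a hypothetical helper name:

```shell
# Sketch: set RUN="yes" in an npcd defaults file.
# enable_npcd is a hypothetical helper; it assumes a RUN= line already exists.
enable_npcd() {
  local defaults="$1"
  sed -i 's/^RUN=.*/RUN="yes"/' "$defaults"
  # Return non-zero if the file still does not enable the daemon
  grep -q '^RUN="yes"$' "$defaults"
}
```

It would be called as ''enable_npcd /etc/default/npcd''. A non-zero exit status means no RUN= line was found, so the file needs editing by hand.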
 +
 +
 +===== Apache web server configuration =====
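 +Link the pnp4nagios web configuration into the Apache configuration directory. On Apache 2.2, which Ubuntu 12.04 ships, that directory is ''/etc/apache2/conf.d'' rather than ''conf-enabled''.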
 +<code bash>
 +ln -s /​etc/​pnp4nagios/​apache.conf /​etc/​apache2/​conf-enabled/​pnp4nagios.conf
 +service apache2 restart
 +</​code>​
 +
 +===== Restart =====
 +Restart services
 +<code bash>
 +service nagios3 restart
 +service npcd start
 +service apache2 restart
 +</​code>​
 +
 +===== Use =====
 +http://​localhost/​pnp4nagios/​
 +
 +
 +===== Optional =====
 +In /​etc/​pnp4nagios/​npcd.cfg change log_type from syslog to file
 +<​code>​
 +log_type = file
 +#log_type = syslog
 +</​code>​
 +
 +
 +===== Errors =====
 +The most common errors occur when monitoring parameters change in Nagios and the old XML definitions have to be deleted. To determine whether there are issues, run:
 +<code bash>
 +grep -R "found extra data" /​var/​lib/​pnp4nagios/​perfdata/​*/​*.xml
 +</​code>​
 +For each .xml file reported, delete both the .rrd and the .xml file. For example, if the output is
 +<​code>​
 +/​var/​lib/​pnp4nagios/​perfdata/​server1/​Disks.xml: ​   <​TXT>/​var/​lib/​pnp4nagios/​perfdata/​server1/​Disks.rrd:​ found extra data on update argument: 613954</​TXT>​
 +</​code>​
 +then delete as follows
 +<code bash>
 +rm /​var/​lib/​pnp4nagios/​perfdata/​server1/​Disks.*
 +</​code>​
 +Of course this means you lose historical data as well.
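Because the deletion discards history, it can help to list the affected files first. The sketch below, with the hypothetical helper name list_stale_perfdata, prints each .xml file containing the error together with its .rrd sibling and deletes nothing:

```shell
# Sketch: list the .xml/.rrd pairs whose XML records a "found extra data" error.
# list_stale_perfdata is a hypothetical helper; it only prints paths.
list_stale_perfdata() {
  local perfdata_dir="$1"
  # grep -l prints the name of each matching file once
  grep -l "found extra data" "$perfdata_dir"/*/*.xml 2>/dev/null |
  while IFS= read -r xml; do
    echo "$xml"
    echo "${xml%.xml}.rrd"   # same path with the .rrd extension
  done
}
```

Run ''list_stale_perfdata /var/lib/pnp4nagios/perfdata'', review the list, and only then remove the files.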
 +
 +Alternative grep to check for other errors/​information as well:
 +<code bash>
 +grep -R TXT /​var/​lib/​pnp4nagios/​perfdata/​*/​*.xml|grep -v successful
 +</​code>​
 +
 +
 +===== Cron job for error check =====
 +Create /​etc/​cron.daily/​pnp4nagios_check as below:
 +<code bash>
 +#!/bin/bash
 +# Mail any pnp4nagios XML status lines that do not report success.
 +PNP4LOC=/var/lib/pnp4nagios/perfdata
 +# Run the search once and reuse the output for both the test and the mail body
 +PNPERR=$(grep -R TXT "$PNP4LOC"/*/*.xml | grep -v successful)
 +if [ -n "$PNPERR" ]; then
 +  echo "$PNPERR" | mailx -s "PNP4Nagios Error" admin@example.org
 +fi
 +exit 0
 +</​code>​
 +
 +
 +===== Increasing RRD Resolution =====
 +''/etc/pnp4nagios/rra.cfg'' holds the default resolution. The defaults roll up and aggregate the data quite quickly, and I prefer more resolution over longer periods of time. The configurations below keep finer-grained data for longer. Changes to rra.cfg apply only to RRD files created afterwards; existing files keep their layout.
 +
 +Sometimes this file is at ''/​usr/​local/​pnp4nagios/​etc/​rra.cfg''​
 +
 +Sometimes this file is referenced from ''/​usr/​local/​pnp4nagios/​etc/​process_perfdata.cfg''​
 +
 +==== High Resolution ====
 +<​code>​
 +#
 +# Define the default RRA Step in seconds
 +# More Infos on
 +# http://​oss.oetiker.ch/​rrdtool/​doc/​rrdcreate.en.html
 +#
 +RRA_STEP=60
 +#
 +# PNP default RRA config
 +#
 +# you will get 6 MB of data per datasource
 +#
 +# 51840 entries with 1 minute step = 36 days
 +#
 +RRA:​AVERAGE:​0.5:​1:​51840
 +#
 +# 115200 entries with 5 minute step = 400 days
 +#
 +RRA:​AVERAGE:​0.5:​5:​115200
 +#
 +# 38400 entries with 30 minute step = 800 days
 +#
 +RRA:​AVERAGE:​0.5:​30:​38400
 +#
 +# 35040 entries with 60 minute step = 4 years
 +#
 +RRA:​AVERAGE:​0.5:​60:​35040
 +
 +RRA:​MAX:​0.5:​1:​51840
 +RRA:​MAX:​0.5:​5:​115200
 +RRA:​MAX:​0.5:​30:​38400
 +RRA:​MAX:​0.5:​60:​35040
 +
 +RRA:​MIN:​0.5:​1:​51840
 +RRA:​MIN:​0.5:​5:​115200
 +RRA:​MIN:​0.5:​30:​38400
 +RRA:​MIN:​0.5:​60:​35040
 +</​code>​
 +
 +==== Medium Resolution ====
 +<​code>​
 +#
 +# Define the default RRA Step in seconds
 +# More Infos on
 +# http://​oss.oetiker.ch/​rrdtool/​doc/​rrdcreate.en.html
 +#
 +RRA_STEP=60
 +#
 +# PNP default RRA config
 +#
 +# you will get about 1.4 MB of data per datasource
 +#
 +# 11520 entries with 1 minute step = 8 days
 +#
 +RRA:​AVERAGE:​0.5:​1:​11520
 +#
 +# 11520 entries with 5 minute step = 40 days
 +#
 +RRA:​AVERAGE:​0.5:​5:​11520
 +#
 +# 19200 entries with 30 minute step = 400 days
 +#
 +RRA:​AVERAGE:​0.5:​30:​19200
 +#
 +# 17520 entries with 120 minute step = 4 years
 +#
 +RRA:​AVERAGE:​0.5:​120:​17520
 +
 +RRA:MAX:0.5:1:11520
 +RRA:MAX:0.5:5:11520
 +RRA:MAX:0.5:30:19200
 +RRA:MAX:0.5:120:17520
 +
 +RRA:MIN:0.5:1:11520
 +RRA:MIN:0.5:5:11520
 +RRA:MIN:0.5:30:19200
 +RRA:MIN:0.5:120:17520
 +</​code>​
 +
 +==== Low Resolution ====
 +The default resolution you get on install.
 +<​code>​
 +#
 +# Define the default RRA Step in seconds
 +# More Infos on
 +# http://​oss.oetiker.ch/​rrdtool/​doc/​rrdcreate.en.html
 +#
 +RRA_STEP=60
 +#
 +# PNP default RRA config
 +#
 +# you will get 400kb of data per datasource
 +#
 +# 2880 entries with 1 minute step = 48 hours
 +#
 +RRA:​AVERAGE:​0.5:​1:​2880
 +#
 +# 2880 entries with 5 minute step = 10 days
 +#
 +RRA:​AVERAGE:​0.5:​5:​2880
 +#
 +# 4320 entries with 30 minute step = 90 days
 +#
 +RRA:​AVERAGE:​0.5:​30:​4320
 +#
 +# 5840 entries with 360 minute step = 4 years
 +#
 +RRA:​AVERAGE:​0.5:​360:​5840
 +
 +RRA:​MAX:​0.5:​1:​2880
 +RRA:​MAX:​0.5:​5:​2880
 +RRA:​MAX:​0.5:​30:​4320
 +RRA:​MAX:​0.5:​360:​5840
 +
 +RRA:​MIN:​0.5:​1:​2880
 +RRA:​MIN:​0.5:​5:​2880
 +RRA:​MIN:​0.5:​30:​4320
 +RRA:​MIN:​0.5:​360:​5840
 +</​code>​
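The retention figures in the comments of the three rra.cfg variants follow from RRA_STEP × steps per row × rows. The sketch below reproduces that arithmetic with two hypothetical helpers, rra_days and rra_kib. The size estimate assumes 8 bytes per stored value and ignores the RRD file header, so treat it as approximate:

```shell
# Sketch: check the retention and size comments in an rra.cfg.
# rra_days and rra_kib are hypothetical helper names.

# rra_days STEP_SECONDS STEPS_PER_ROW ROWS -> whole days covered by one RRA
rra_days() {
  local step="$1" steps_per_row="$2" rows="$3"
  echo $(( step * steps_per_row * rows / 86400 ))
}

# rra_kib CF_COUNT ROWS... -> approximate KiB per datasource
# (assumes 8 bytes per value; CF_COUNT is 3 for AVERAGE, MAX and MIN)
rra_kib() {
  local cf_count="$1" total=0 rows
  shift
  for rows in "$@"; do
    total=$(( total + rows ))
  done
  echo $(( total * cf_count * 8 / 1024 ))
}
```

For instance, ''rra_days 60 5 115200'' gives 400, matching the "5 minute step = 400 days" comment, and ''rra_kib 3 2880 2880 4320 5840'' gives 373, in line with the roughly 400 KB quoted for the default configuration.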
 +
 +===== Performance Data Format =====
 +Format
 +<​code>​
 +'​label'​=value[UOM];​[warn];​[crit];​[min];​[max] ​
 +</​code>​
 +
 +Example showing multiple data sources
 +<​code>​
 +Access Count is OK. Response Time is OK. HTTP 2xx Count is OK. HTTP 3xx Count is OK. HTTP 4xx Count is OK. HTTP 5xx Count is OK. Access Count=23 Response Time=179357us HTTP 2xx Count=13 HTTP 3xx Count=10 HTTP 4xx Count=0 HTTP 5xx Count=0|'​Access Count'​=23;​1500;​1600;​0 '​Response Time'​=179357us;​250000;​300000;​0 'HTTP 2xx Count'​=13;​1500;​1600;​0 'HTTP 3xx Count'​=10;​350;​400;​0 'HTTP 4xx Count'​=0;​30;​50;​0 'HTTP 5xx Count'​=0;​10;​15;​0
 +</​code>​
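To see how a plugin output line maps onto the format above, the following sketch splits the part after the "|" into one datasource per line. The helper name parse_perfdata is hypothetical, and it assumes grep supports -o and -E (as GNU grep does) and that labels contain no "=" or embedded single quotes:

```shell
# Sketch: split Nagios plugin output into tab-separated perfdata fields.
# parse_perfdata is a hypothetical helper; it reads plugin output on stdin and
# prints: label, value (with UOM), warn, crit, min, max.
parse_perfdata() {
  # Keep only the text after the first "|"; lines without one are dropped
  sed -n 's/^[^|]*|//p' |
  # One item per line: a quoted or bare label, "=", then the value fields
  grep -oE "('[^']+'|[^ =']+)=[^ ]+" |
  awk -v q="'" '{
    eq = index($0, "=")
    label = substr($0, 1, eq - 1)
    gsub(q, "", label)                      # strip the quotes around the label
    split(substr($0, eq + 1), f, ";")       # value;warn;crit;min;max
    printf "%s\t%s\t%s\t%s\t%s\t%s\n", label, f[1], f[2], f[3], f[4], f[5]
  }'
}
```

Fields that the plugin leaves out, such as max in the example above, come out empty.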
 +
 +
 +===== Performance Data Custom Graphs =====
 +You can customize graphs by creating a custom PHP template and naming it appropriately. By convention the template has the same name as the Nagios command, with a .php extension. The default template shows the underlying Nagios command name at the bottom right corner of the graph. Templates are typically located in ''/usr/local/pnp4nagios/share/templates.dist''. Copy ''default.php'' to ''<command>.php'' and customize as required. See [[http://docs.pnp4nagios.org/pnp-0.4/tpl|pnp4nagios Templates]] for more information.
 +
 +Also refer to [[http://docs.pnp4nagios.org/pnp-0.6/tpl_custom|Custom Templates]] to change which command name the template lookup uses. PNP reads ''<check_command>.cfg'' from the ''etc/check_commands'' directory (usually under /usr/local/pnp4nagios) to decide which template to use. This is useful when the Nagios command is the same for many checks (as with check_nrpe) and you need a template per sub-command (such as check_nrpe_1arg!check_ls_memory_usage). In this case create a file check_nrpe.cfg containing ''CUSTOM_TEMPLATE = 1'' so that the sub-command name is used to select the custom template.
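Starting a custom template from the default one can be sketched as below. The helper name new_template is hypothetical, and the source and destination directories are parameters because their location depends on the install:

```shell
# Sketch: start a custom template for a command from default.php.
# new_template is a hypothetical helper.
# Usage: new_template SRC_DIR DEST_DIR COMMAND
new_template() {
  local src_dir="$1" dest_dir="$2" command="$3"
  # Refuse to overwrite a template that already exists
  if [ -e "$dest_dir/$command.php" ]; then
    echo "exists: $dest_dir/$command.php" >&2
    return 1
  fi
  cp "$src_dir/default.php" "$dest_dir/$command.php"
}
```

For the check_nrpe case above, the command argument would be the sub-command name, for example check_ls_memory_usage.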
 +
 +===== Other =====
 +  * [[pnp4nagios_averages|pnp4nagios extracting averages]]
 +  * [[pnp4nagios_graphs|pnp4nagios extracting graphs]]
 +  * [[http://​docs.pnp4nagios.org/​pnp-0.6/​perfdata_format|Performance Data Format]]
 +
 +
 +
  
