tech:linux:pnp4nagios [2018/12/06 06:27] |
tech:linux:pnp4nagios [2018/12/06 06:27] (current) |
+ | ====== Setting up pnp4nagios on Ubuntu 12.04 LTS Precise Pangolin ====== | ||
+ | ===== Prerequisites ===== | ||
+ | nagios3 is installed. | ||
+ | |||
+ | ===== Install ===== | ||
+ | <code> | ||
+ | aptitude install pnp4nagios | ||
+ | </code> | ||
+ | |||
+ | ===== Configuring Bulk Mode with NPCD ===== | ||
+ | ==== nagios.cfg ==== | ||
+ | In ''/etc/nagios3/nagios.cfg'', set ''process_performance_data=1'', either with the sed command below or by editing the file by hand. | ||
+ | <code bash> | ||
+ | grep process_performance_data /etc/nagios3/nagios.cfg | ||
+ | sed -i 's/process_performance_data=0/process_performance_data=1/' /etc/nagios3/nagios.cfg | ||
+ | </code> | ||
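The sed command above only matches when the current value is exactly ''0''. A minimal sketch of a more forgiving variant follows; the function name ''enable_perfdata'' is hypothetical, and it assumes GNU sed (for ''-i'').

```shell
# Hypothetical helper (not part of pnp4nagios): force process_performance_data=1.
# $1 = path to the Nagios main config, e.g. /etc/nagios3/nagios.cfg
enable_perfdata() {
    if grep -q '^[[:space:]]*process_performance_data[[:space:]]*=' "$1"; then
        # The directive exists: rewrite it regardless of its current value.
        sed -i 's/^[[:space:]]*process_performance_data[[:space:]]*=.*/process_performance_data=1/' "$1"
    else
        # The directive is absent: append it.
        printf 'process_performance_data=1\n' >> "$1"
    fi
}
```

Commented-out lines are left alone, so a ''#process_performance_data=0'' remark in the file is not turned into an active setting.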
+ | |||
+ | Create the directory where Nagios will write the performance data: | ||
+ | <code bash> | ||
+ | mkdir -p /var/spool/pnp4nagios/nagios | ||
+ | chown -R nagios:nagios /var/spool/pnp4nagios | ||
+ | </code> | ||
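The ''pnp-bulknpcd-*'' commands further down also move files into ''/var/spool/pnp4nagios/npcd''. The package normally creates that directory, but a sketch that prepares both spool directories could look like this. The function name ''make_spool'' and its arguments are assumptions for illustration.

```shell
# Hypothetical helper: create the spool layout used on this page.
# $1 = spool root (default /var/spool/pnp4nagios), $2 = owner (default nagios:nagios)
make_spool() {
    root="${1:-/var/spool/pnp4nagios}"
    owner="${2:-nagios:nagios}"
    mkdir -p "$root/nagios" "$root/npcd" || return 1
    # Change ownership only when the account exists on this machine.
    if id "${owner%%:*}" >/dev/null 2>&1; then
        chown -R "$owner" "$root"
    else
        echo "user ${owner%%:*} not found, ownership left unchanged" >&2
    fi
}
```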
+ | |||
+ | Configure ''nagios.cfg'' to write the performance data to files in that directory: | ||
+ | <code ini> | ||
+ | # | ||
+ | # service performance data | ||
+ | # | ||
+ | service_perfdata_file=/var/spool/pnp4nagios/nagios/service-perfdata | ||
+ | service_perfdata_file_template=DATATYPE::SERVICEPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tSERVICEDESC::$SERVICEDESC$\tSERVICEPERFDATA::$SERVICEPERFDATA$\tSERVICECHECKCOMMAND::$SERVICECHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$\tSERVICESTATE::$SERVICESTATE$\tSERVICESTATETYPE::$SERVICESTATETYPE$ | ||
+ | service_perfdata_file_mode=a | ||
+ | service_perfdata_file_processing_interval=15 | ||
+ | service_perfdata_file_processing_command=pnp-bulk-service | ||
+ | |||
+ | # | ||
+ | # host performance data | ||
+ | # | ||
+ | host_perfdata_file=/var/spool/pnp4nagios/nagios/host-perfdata | ||
+ | host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$ | ||
+ | host_perfdata_file_mode=a | ||
+ | host_perfdata_file_processing_interval=15 | ||
+ | host_perfdata_file_processing_command=pnp-bulk-host | ||
+ | # | ||
+ | </code> | ||
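With this many directives it is easy to miss one. The following sketch lists every bulk-mode directive from the block above that has no active line in a given file; the function name ''missing_perfdata_keys'' is hypothetical.

```shell
# Hypothetical check: print each bulk-mode directive that has no active line in $1.
missing_perfdata_keys() {
    for key in process_performance_data \
               service_perfdata_file service_perfdata_file_template \
               service_perfdata_file_mode service_perfdata_file_processing_interval \
               service_perfdata_file_processing_command \
               host_perfdata_file host_perfdata_file_template \
               host_perfdata_file_mode host_perfdata_file_processing_interval \
               host_perfdata_file_processing_command; do
        # Only uncommented "key=value" lines count as present.
        grep -q "^[[:space:]]*${key}=" "$1" || echo "$key"
    done
}
```

An empty result from ''missing_perfdata_keys /etc/nagios3/nagios.cfg'' means all directives are set; it does not validate their values.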
+ | |||
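+ | ''conf.d/pnp4nagios.cfg'' should already define the ''pnp-bulk-service'' and ''pnp-bulk-host'' commands (the package installs them). If it does not, paste the definitions below into | ||
+ | ''/etc/nagios3/pnp4nagios.cfg''. Note the path to ''process_perfdata.pl'': it may differ and is sometimes ''/usr/local/pnp4nagios/libexec''. The ''pnp-bulk-*'' commands process the files directly, while the ''pnp-bulknpcd-*'' commands only move them into the NPCD spool directory; use the latter as the processing commands in ''nagios.cfg'' if NPCD should do the processing. | ||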
+ | |||
+ | <code> | ||
+ | define command { | ||
+ | command_name pnp-synchronous-service | ||
+ | command_line /usr/bin/perl /usr/lib/pnp4nagios/libexec/process_perfdata.pl | ||
+ | } | ||
+ | |||
+ | define command { | ||
+ | command_name pnp-synchronous-host | ||
+ | command_line /usr/bin/perl /usr/lib/pnp4nagios/libexec/process_perfdata.pl -d HOSTPERFDATA | ||
+ | } | ||
+ | |||
+ | ############################################################################## | ||
+ | |||
+ | define command{ | ||
+ | command_name pnp-bulk-service | ||
+ | command_line /usr/bin/perl /usr/lib/pnp4nagios/libexec/process_perfdata.pl --bulk=/var/spool/pnp4nagios/nagios/service-perfdata | ||
+ | } | ||
+ | |||
+ | define command{ | ||
+ | command_name pnp-bulk-host | ||
+ | command_line /usr/bin/perl /usr/lib/pnp4nagios/libexec/process_perfdata.pl --bulk=/var/spool/pnp4nagios/nagios/host-perfdata | ||
+ | } | ||
+ | |||
+ | ############################################################################## | ||
+ | |||
+ | define command{ | ||
+ | command_name pnp-bulknpcd-service | ||
+ | command_line /bin/mv /var/spool/pnp4nagios/nagios/service-perfdata /var/spool/pnp4nagios/npcd/service-perfdata.$TIMET$ | ||
+ | } | ||
+ | |||
+ | define command{ | ||
+ | command_name pnp-bulknpcd-host | ||
+ | command_line /bin/mv /var/spool/pnp4nagios/nagios/host-perfdata /var/spool/pnp4nagios/npcd/host-perfdata.$TIMET$ | ||
+ | } | ||
+ | </code> | ||
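To confirm which of these commands a file really defines, a short sketch such as the following can help. Both function names are hypothetical, and the parsing assumes one ''command_name'' per line as in the definitions above.

```shell
# Print each command_name defined in the Nagios object file $1.
defined_commands() {
    awk '$1 == "command_name" { print $2 }' "$1"
}

# Succeed only if every name given after the file is defined in it.
has_commands() {
    file="$1"; shift
    for name in "$@"; do
        if ! defined_commands "$file" | grep -qxF "$name"; then
            echo "missing: $name" >&2
            return 1
        fi
    done
}
```

For example, ''has_commands conf.d/pnp4nagios.cfg pnp-bulk-service pnp-bulk-host'' returns non-zero and names the first missing command.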
+ | |||
+ | ==== Config pnp4nagios ==== | ||
+ | * Edit /etc/default/npcd | ||
+ | * Update: RUN="yes" | ||
+ | * Validate: /etc/pnp4nagios/npcd.cfg or /usr/local/pnp4nagios/etc/npcd.cfg | ||
+ | * No change usually required | ||
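For the validation step, a sketch like the one below prints the value of a single ''key = value'' setting from ''npcd.cfg''. The function name ''npcd_setting'' is hypothetical, and ''perfdata_spool_dir'' in the usage note is the key name as I recall it from stock ''npcd.cfg'' files, so treat it as an assumption.

```shell
# Hypothetical helper: print the value of one "key = value" setting.
# $1 = config file, $2 = key name
npcd_setting() {
    awk -v key="$2" '
        /^[ \t]*#/ { next }              # skip comment lines
        index($0, "=") == 0 { next }     # skip lines without "="
        {
            k = substr($0, 1, index($0, "=") - 1)
            v = substr($0, index($0, "=") + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)   # trim blanks around the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim blanks around the value
            if (k == key) print v
        }' "$1"
}
```

''npcd_setting /etc/pnp4nagios/npcd.cfg perfdata_spool_dir'' should then print the directory that the ''pnp-bulknpcd-*'' commands move files into.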
+ | |||
+ | |||
+ | ===== Apache web server configuration ===== | ||
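+ | Link the pnp4nagios web configuration into the Apache configuration. The ''conf-enabled'' directory is the Apache 2.4 layout; with Apache 2.2, the default on Ubuntu 12.04, the target directory is ''/etc/apache2/conf.d''. | ||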
+ | <code bash> | ||
+ | ln -s /etc/pnp4nagios/apache.conf /etc/apache2/conf-enabled/pnp4nagios.conf | ||
+ | service apache2 restart | ||
+ | </code> | ||
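Because the include directory differs between Apache 2.2 (''conf.d'') and Apache 2.4 (''conf-enabled''), a sketch that picks whichever exists may be useful. The function names are hypothetical and the second argument exists only so the logic can be tried against a scratch directory.

```shell
# Hypothetical helper: print the Apache include directory that exists under $1.
apache_conf_dir() {
    root="${1:-/etc/apache2}"
    if [ -d "$root/conf-enabled" ]; then
        echo "$root/conf-enabled"      # Apache 2.4 layout
    elif [ -d "$root/conf.d" ]; then
        echo "$root/conf.d"            # Apache 2.2 layout
    else
        return 1
    fi
}

# Link the config file $1 into that directory as pnp4nagios.conf.
link_pnp_conf() {
    dir=$(apache_conf_dir "${2:-/etc/apache2}") || return 1
    ln -sfn "$1" "$dir/pnp4nagios.conf"
}
```

On the host described here the call would be ''link_pnp_conf /etc/pnp4nagios/apache.conf'', followed by the Apache restart.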
+ | |||
+ | ===== Restart ===== | ||
+ | Restart services | ||
+ | <code bash> | ||
+ | service nagios3 restart | ||
+ | service npcd start | ||
+ | service apache2 restart | ||
+ | </code> | ||
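Restarting Nagios with a broken configuration leaves monitoring down, so it can be worth checking the configuration first. The sketch below runs a restart command only when a check command succeeds; the function name ''restart_if_valid'' is hypothetical, while ''nagios3 -v'' is the standard Nagios configuration check.

```shell
# Run the restart command $2 only if the check command $1 succeeds.
# Both are passed as strings; check output goes to stderr to keep stdout clean.
restart_if_valid() {
    if sh -c "$1" >&2; then
        sh -c "$2"
    else
        echo "config check failed, not restarting" >&2
        return 1
    fi
}

# Intended use on the host described above (not executed here):
#   restart_if_valid "nagios3 -v /etc/nagios3/nagios.cfg" "service nagios3 restart"
```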
+ | |||
+ | ===== Use ===== | ||
+ | http://localhost/pnp4nagios/ | ||
+ | |||
+ | |||
+ | ===== Optional ===== | ||
+ | In /etc/pnp4nagios/npcd.cfg change log_type from syslog to file | ||
+ | <code> | ||
+ | log_type = file | ||
+ | #log_type = syslog | ||
+ | </code> | ||
+ | |||
+ | |||
+ | ===== Errors ===== | ||
+ | Errors commonly occur when monitoring parameters change in Nagios; the old XML definitions then have to be deleted. To determine whether there are issues, run: | ||
+ | <code bash> | ||
+ | grep -R "found extra data" /var/lib/pnp4nagios/perfdata/*/*.xml | ||
+ | </code> | ||
+ | For each ''.xml'' file reported, delete both the ''.xml'' file and its matching ''.rrd'' file. For example, if the output is | ||
+ | <code> | ||
+ | /var/lib/pnp4nagios/perfdata/server1/Disks.xml: <TXT>/var/lib/pnp4nagios/perfdata/server1/Disks.rrd: found extra data on update argument: 613954</TXT> | ||
+ | </code> | ||
+ | then delete as follows | ||
+ | <code bash> | ||
+ | rm /var/lib/pnp4nagios/perfdata/server1/Disks.* | ||
+ | </code> | ||
+ | Of course this means you lose historical data as well. | ||
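Since deleting the files discards history, it may be safer to list the affected pairs first and review them. A minimal sketch, with the hypothetical function name ''stale_perfdata_files''; it only prints paths and deletes nothing.

```shell
# List each .xml file that records a "found extra data" error, plus its .rrd twin.
# $1 = perfdata directory (default /var/lib/pnp4nagios/perfdata)
stale_perfdata_files() {
    dir="${1:-/var/lib/pnp4nagios/perfdata}"
    # grep -l prints only the names of matching files.
    grep -l "found extra data" "$dir"/*/*.xml 2>/dev/null | while IFS= read -r xml; do
        echo "$xml"
        rrd="${xml%.xml}.rrd"
        if [ -f "$rrd" ]; then
            echo "$rrd"
        fi
    done
}
```

After checking the list, the files can be removed by hand with ''rm'' as shown above.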
+ | |||
+ | Alternative grep to check for other errors/information as well: | ||
+ | <code bash> | ||
+ | grep -R TXT /var/lib/pnp4nagios/perfdata/*/*.xml|grep -v successful | ||
+ | </code> | ||
+ | |||
+ | |||
+ | ===== Cron job for error check ===== | ||
+ | Create /etc/cron.daily/pnp4nagios_check as below: | ||
+ | <code bash> | ||
+ | #!/bin/bash | ||
+ | # | ||
+ | PNP4LOC=/var/lib/pnp4nagios/perfdata | ||
+ | # | ||
+ | # Collect every non-successful status line once, then mail it only if there is any. | ||
+ | PNPERRS=$(grep -R TXT "$PNP4LOC"/*/*.xml | grep -v successful) | ||
+ | if [ -n "$PNPERRS" ]; then | ||
+ | printf '%s\n' "$PNPERRS" | mailx -s "PNP4Nagios Error" admin@example.org | ||
+ | fi | ||
+ | # | ||
+ | exit | ||
+ | </code> | ||
+ | |||
+ | |||
+ | ===== Increasing RRD Resolution ===== | ||
+ | ''/etc/pnp4nagios/rra.cfg'' holds the default resolution. The default consolidates the data quite quickly, and I prefer more resolution over longer periods of time. The configurations below make the data more fine-grained over longer time periods. Changes to this file only affect RRD files created afterwards; existing RRD files keep their layout. | ||
+ | |||
+ | Sometimes this file is at ''/usr/local/pnp4nagios/etc/rra.cfg'' | ||
+ | |||
+ | Sometimes this file is referenced from ''/usr/local/pnp4nagios/etc/process_perfdata.cfg'' | ||
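The retention of an RRA line is the base step times the steps per row times the number of rows. The sketch below does that arithmetic for every ''RRA:'' line on standard input; the function name ''rra_retention'' is hypothetical, and the byte estimate assumes 8 bytes per stored value and ignores header overhead.

```shell
# Print retention per RRA line read from stdin, then a rough size total.
# $1 = base step in seconds (the RRA_STEP value), default 60
rra_retention() {
    awk -F: -v step="${1:-60}" '
        $1 == "RRA" {
            # $2 = consolidation function, $4 = steps per row, $5 = number of rows
            days = $4 * $5 * step / 86400
            printf "%s step=%ds rows=%d days=%g\n", $2, $4 * step, $5, days
            rows += $5
        }
        END { printf "total rows=%d approx_bytes=%d\n", rows, rows * 8 }'
}
```

Feeding it the file directly, as in ''rra_retention 60 < /etc/pnp4nagios/rra.cfg'', skips comments and the ''RRA_STEP'' line because only lines whose first colon-separated field is ''RRA'' are counted.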
+ | |||
+ | ==== High Resolution ==== | ||
+ | <code> | ||
+ | # | ||
+ | # Define the default RRA Step in seconds | ||
+ | # More Infos on | ||
+ | # http://oss.oetiker.ch/rrdtool/doc/rrdcreate.en.html | ||
+ | # | ||
+ | RRA_STEP=60 | ||
+ | # | ||
+ | # High resolution RRA config | ||
+ | # | ||
+ | # you will get 6 MB of data per datasource | ||
+ | # | ||
+ | # 51840 entries with 1 minute step = 36 days | ||
+ | # | ||
+ | RRA:AVERAGE:0.5:1:51840 | ||
+ | # | ||
+ | # 115200 entries with 5 minute step = 400 days | ||
+ | # | ||
+ | RRA:AVERAGE:0.5:5:115200 | ||
+ | # | ||
+ | # 38400 entries with 30 minute step = 800 days | ||
+ | # | ||
+ | RRA:AVERAGE:0.5:30:38400 | ||
+ | # | ||
+ | # 35040 entries with 60 minute step = 4 years | ||
+ | # | ||
+ | RRA:AVERAGE:0.5:60:35040 | ||
+ | |||
+ | RRA:MAX:0.5:1:51840 | ||
+ | RRA:MAX:0.5:5:115200 | ||
+ | RRA:MAX:0.5:30:38400 | ||
+ | RRA:MAX:0.5:60:35040 | ||
+ | |||
+ | RRA:MIN:0.5:1:51840 | ||
+ | RRA:MIN:0.5:5:115200 | ||
+ | RRA:MIN:0.5:30:38400 | ||
+ | RRA:MIN:0.5:60:35040 | ||
+ | </code> | ||
+ | |||
+ | ==== Medium Resolution ==== | ||
+ | <code> | ||
+ | # | ||
+ | # Define the default RRA Step in seconds | ||
+ | # More Infos on | ||
+ | # http://oss.oetiker.ch/rrdtool/doc/rrdcreate.en.html | ||
+ | # | ||
+ | RRA_STEP=60 | ||
+ | # | ||
+ | # Medium resolution RRA config | ||
+ | # | ||
+ | # you will get about 1.4 MB of data per datasource | ||
+ | # | ||
+ | # 11520 entries with 1 minute step = 8 days | ||
+ | # | ||
+ | RRA:AVERAGE:0.5:1:11520 | ||
+ | # | ||
+ | # 11520 entries with 5 minute step = 40 days | ||
+ | # | ||
+ | RRA:AVERAGE:0.5:5:11520 | ||
+ | # | ||
+ | # 19200 entries with 30 minute step = 400 days | ||
+ | # | ||
+ | RRA:AVERAGE:0.5:30:19200 | ||
+ | # | ||
+ | # 17520 entries with 120 minute step = 4 years | ||
+ | # | ||
+ | RRA:AVERAGE:0.5:120:17520 | ||
+ | |||
+ | RRA:MAX:0.5:1:11520 | ||
+ | RRA:MAX:0.5:5:11520 | ||
+ | RRA:MAX:0.5:30:19200 | ||
+ | RRA:MAX:0.5:120:17520 | ||
+ | |||
+ | RRA:MIN:0.5:1:11520 | ||
+ | RRA:MIN:0.5:5:11520 | ||
+ | RRA:MIN:0.5:30:19200 | ||
+ | RRA:MIN:0.5:120:17520 | ||
+ | </code> | ||
+ | |||
+ | ==== Low Resolution ==== | ||
+ | The default resolution you get on install. | ||
+ | <code> | ||
+ | # | ||
+ | # Define the default RRA Step in seconds | ||
+ | # More Infos on | ||
+ | # http://oss.oetiker.ch/rrdtool/doc/rrdcreate.en.html | ||
+ | # | ||
+ | RRA_STEP=60 | ||
+ | # | ||
+ | # PNP default RRA config | ||
+ | # | ||
+ | # you will get 400kb of data per datasource | ||
+ | # | ||
+ | # 2880 entries with 1 minute step = 48 hours | ||
+ | # | ||
+ | RRA:AVERAGE:0.5:1:2880 | ||
+ | # | ||
+ | # 2880 entries with 5 minute step = 10 days | ||
+ | # | ||
+ | RRA:AVERAGE:0.5:5:2880 | ||
+ | # | ||
+ | # 4320 entries with 30 minute step = 90 days | ||
+ | # | ||
+ | RRA:AVERAGE:0.5:30:4320 | ||
+ | # | ||
+ | # 5840 entries with 360 minute step = 4 years | ||
+ | # | ||
+ | RRA:AVERAGE:0.5:360:5840 | ||
+ | |||
+ | RRA:MAX:0.5:1:2880 | ||
+ | RRA:MAX:0.5:5:2880 | ||
+ | RRA:MAX:0.5:30:4320 | ||
+ | RRA:MAX:0.5:360:5840 | ||
+ | |||
+ | RRA:MIN:0.5:1:2880 | ||
+ | RRA:MIN:0.5:5:2880 | ||
+ | RRA:MIN:0.5:30:4320 | ||
+ | RRA:MIN:0.5:360:5840 | ||
+ | </code> | ||
+ | |||
+ | ===== Performance Data Format ===== | ||
+ | Format | ||
+ | <code> | ||
+ | 'label'=value[UOM];[warn];[crit];[min];[max] | ||
+ | </code> | ||
+ | |||
+ | Example showing multiple data sources | ||
+ | <code> | ||
+ | Access Count is OK. Response Time is OK. HTTP 2xx Count is OK. HTTP 3xx Count is OK. HTTP 4xx Count is OK. HTTP 5xx Count is OK. Access Count=23 Response Time=179357us HTTP 2xx Count=13 HTTP 3xx Count=10 HTTP 4xx Count=0 HTTP 5xx Count=0|'Access Count'=23;1500;1600;0 'Response Time'=179357us;250000;300000;0 'HTTP 2xx Count'=13;1500;1600;0 'HTTP 3xx Count'=10;350;400;0 'HTTP 4xx Count'=0;30;50;0 'HTTP 5xx Count'=0;10;15;0 | ||
+ | </code> | ||
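To see how such a line breaks down, the following sketch splits the part after the ''|'' into one row per data source with label, value, unit, warning and critical threshold, separated by tabs. The function name ''parse_perfdata'' is hypothetical; it assumes plain decimal values (no exponent notation) and labels without ''='' or embedded quotes.

```shell
# Read plugin output on stdin; print label, value, UOM, warn, crit per data source.
parse_perfdata() {
    # Keep only lines containing "|" and drop everything up to the first one.
    sed -n 's/^[^|]*|//p' | awk -v q="'" '
    {
        s = $0
        # An item is label=value..., where the label may be wrapped in single quotes.
        pat = "(" q "[^" q "]*" q "|[^ =" q "]+)=[^ ]+"
        while (match(s, pat)) {
            item = substr(s, RSTART, RLENGTH)
            s = substr(s, RSTART + RLENGTH)
            eq = index(item, "=")
            label = substr(item, 1, eq - 1)
            gsub(q, "", label)                      # strip the quotes
            split(substr(item, eq + 1), f, ";")     # value;warn;crit;min;max
            value = f[1]
            sub(/[^0-9.+-].*$/, "", value)          # keep the numeric prefix
            uom = substr(f[1], length(value) + 1)   # the rest is the unit
            printf "%s\t%s\t%s\t%s\t%s\n", label, value, uom, f[2], f[3]
        }
    }'
}
```

Piping a check's output through ''parse_perfdata'' makes it easy to spot a missing quote or a unit that ended up attached to the wrong field.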
+ | |||
+ | |||
+ | ===== Performance Data Custom Graphs ===== | ||
+ | You can customize the graphs by creating a custom PHP template and naming it appropriately. The naming convention is to use the name of the Nagios check command with a ''.php'' extension. The default template shows the underlying Nagios command name at the bottom right corner of the graph. The stock templates are typically located in ''/usr/local/pnp4nagios/share/templates.dist''. Copy ''default.php'' to ''<command>.php'' and customize it as required; placing the copy in the sibling ''templates'' directory keeps it from being overwritten on upgrade. Check [[http://docs.pnp4nagios.org/pnp-0.4/tpl|pnp4nagios Templates]] for more information. | ||
+ | |||
+ | Also refer to [[http://docs.pnp4nagios.org/pnp-0.6/tpl_custom|Custom Templates]] to change which command name the template lookup uses. PNP reads ''<check_command>.cfg'' from the ''etc/check_commands'' directory (usually under ''/usr/local/pnp4nagios'') to determine which template to use. This is useful when the Nagios command is the same for many services (as with check_nrpe) and you need a template per sub-command (such as ''check_nrpe_1arg!check_ls_memory_usage''). In this case create a file ''check_nrpe.cfg'' with ''CUSTOM_TEMPLATE = 1'' so that the sub-command name is used as the template name. | ||
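Seeding a new template from ''default.php'' can be sketched as below. The function name ''new_template'' is hypothetical, and the layout (stock files in ''templates.dist'', custom files in a sibling ''templates'' directory) is an assumption about the installation that should be checked against the actual share directory.

```shell
# Hypothetical helper: copy default.php to <command>.php for one check command.
# $1 = check command name, $2 = PNP share directory
new_template() {
    cmd="$1"
    share="${2:-/usr/local/pnp4nagios/share}"
    src="$share/templates.dist/default.php"
    dst="$share/templates/$cmd.php"
    if [ ! -f "$src" ]; then
        echo "no default template at $src" >&2
        return 1
    fi
    if [ -e "$dst" ]; then
        # Never overwrite a template that was already customized.
        echo "$dst already exists" >&2
        return 1
    fi
    mkdir -p "$share/templates" && cp "$src" "$dst" && echo "$dst"
}
```

For example, ''new_template check_load'' prints the path of the new file, which can then be edited.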
+ | |||
+ | ===== Other ===== | ||
+ | * [[pnp4nagios_averages|pnp4nagios extracting averages]] | ||
+ | * [[pnp4nagios_graphs|pnp4nagios extracting graphs]] | ||
+ | * [[http://docs.pnp4nagios.org/pnp-0.6/perfdata_format|Performance Data Format]] | ||
+ | |||
+ | |||
+ | |||