Friday, 11 July 2014

Part 3: Big Data Cluster Monitoring (Graph Setup with Nagios)

NagiosGraph setup with Nagios Core
1) Download NagiosGraph from its website

2) Extract it

3) Move to the nagiosgraph directory

4) NagiosGraph depends on several other packages. As the root user, run the install.pl script with the --check-prereq option to check for them.
./install.pl --check-prereq

Note:
You need some standard Perl libraries, so don’t forget to check if packages are available for them. For example, I was able to install the RRDs and GD modules on my Ubuntu system as follows:
sudo apt-get install librrds-perl libgd-gd2-perl
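If you want to confirm by hand that the two Perl modules mentioned above can be loaded, a small sketch like the following may help. check_perl_module is a hypothetical helper name, and the module names RRDs and GD are assumed to be the ones provided by the two packages above; install.pl --check-prereq remains the real check.

```shell
# Hypothetical helper: succeeds only if Perl can load the given module.
check_perl_module() {
  perl -M"$1" -e 1 2>/dev/null
}

# RRDs and GD are assumed to be the modules shipped by librrds-perl and libgd-gd2-perl.
for module in RRDs GD; do
  if check_perl_module "$module"; then
    echo "$module: found"
  else
    echo "$module: missing"
  fi
done
```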


5) Once ./install.pl --check-prereq reports success, we can move forward and install NagiosGraph. Run the install.pl script with the --install argument.
sudo ./install.pl --install

6) Now we will edit some Nagios configuration files to make NagiosGraph functional. Change to the Nagios Core configuration directory and add the following lines to nagios.cfg.
cd /etc/nagios3/
sudo vi /etc/nagios3/nagios.cfg

# process nagios performance data using nagiosgraph
process_performance_data=1
service_perfdata_file=/tmp/perfdata.log
service_perfdata_file_template=$LASTSERVICECHECK$||$HOSTNAME$||$SERVICEDESC$||$SERVICEOUTPUT$||$SERVICEPERFDATA$
service_perfdata_file_mode=a
service_perfdata_file_processing_interval=30
service_perfdata_file_processing_command=process-service-perfdata-for-nagiosgraph
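
To see what this template produces, here is a small sketch that splits one line of the perfdata file into its "||"-separated fields. The sample line is made up for illustration, and perfdata_field is a hypothetical helper, not part of Nagios or NagiosGraph.

```shell
# Hypothetical helper: print field number $2 of a "||"-separated perfdata line $1.
# Field order follows the template: 1 = last check time, 2 = host name,
# 3 = service description, 4 = plugin output, 5 = performance data.
perfdata_field() {
  printf '%s\n' "$1" | awk -F'[|][|]' -v n="$2" '{ print $n }'
}

# Made-up sample line in the shape of the template above.
sample='1405065600||localhost||Check-Load||OK - load average: 0.10, 0.20, 0.30||load1=0.100;5.000;10.000;0; load5=0.200;4.000;6.000;0; load15=0.300;3.000;4.000;0;'

perfdata_field "$sample" 2   # host name
perfdata_field "$sample" 3   # service description
perfdata_field "$sample" 5   # performance data that ends up in the graph
```

This simple split assumes the plugin output itself never contains "||".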

7) Now define the processing command in the Nagios Core commands file.

sudo vi /etc/nagios3/commands.cfg

define command {
        command_name    process-service-perfdata-for-nagiosgraph
        command_line    /usr/local/nagiosgraph/bin/insert.pl
}

8) After installation, NagiosGraph provides an Apache configuration file in the /usr/local/nagiosgraph/etc/ directory. We have to include that file in our Apache configuration to make it functional.
Edit the httpd.conf or apache2.conf file of your Apache HTTPD server and add the following line at the end:
sudo vi /etc/apache2/apache2.conf
Include /usr/local/nagiosgraph/etc/nagiosgraph-apache.conf
Now restart the Apache and Nagios servers:
sudo /etc/init.d/apache2 restart
sudo /etc/init.d/nagios3 restart

9) Check that NagiosGraph is working

If everything is set up properly, you will see a NagiosGraph page showing the configuration and map rules details.

10) Append the following service template to the services_nagios2.cfg file in the Nagios Core directory /etc/nagios3/conf.d

define service {
        name        nagiosgraph
        action_url  /nagiosgraph/cgi-bin/show.cgi?host=$HOSTNAME$&service=$SERVICEDESC$
        register    0
}
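
To make the action_url above more concrete, this sketch substitutes the two macros by hand for one host and service. expand_macro is a hypothetical helper written for this post; Nagios does the substitution internally, and the sketch ignores any URL encoding Nagios may apply to the values.

```shell
# Hypothetical helper: replace the first occurrence of macro $2 in template $1 with value $3.
expand_macro() {
  case $1 in
    *"$2"*)
      before=${1%%"$2"*}   # text before the macro
      after=${1#*"$2"}     # text after the macro
      printf '%s%s%s\n' "$before" "$3" "$after"
      ;;
    *)
      printf '%s\n' "$1"
      ;;
  esac
}

template='/nagiosgraph/cgi-bin/show.cgi?host=$HOSTNAME$&service=$SERVICEDESC$'
url=$(expand_macro "$template" '$HOSTNAME$' 'localhost')
url=$(expand_macro "$url" '$SERVICEDESC$' 'Check-Load')
printf '%s\n' "$url"
```

The printed path is what the graph icon of the Check-Load service on localhost would point to.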

11) Now use the service template in localhost_nagios2.cfg
# Define a service to check the load on the local machine.
define service{
        use                             generic-service,nagiosgraph     ; Name of service template to use
        host_name                       localhost
        service_description             Check-Load
        check_command                   check_load!5.0!4.0!3.0!10.0!6.0!4.0
        }
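
The check_command value is split by Nagios at each "!" into the command name and the arguments $ARG1$, $ARG2$ and so on. The sketch below only illustrates that splitting; split_check_command is a hypothetical helper. How the six values are used depends on the check_load command definition on your system (as far as I know, the stock Ubuntu definition treats the first three as warning and the last three as critical thresholds for the 1, 5 and 15 minute load averages).

```shell
# Hypothetical helper: show how a check_command value breaks into command name and arguments.
split_check_command() {
  old_ifs=$IFS
  IFS='!'
  set -f            # no filename expansion while splitting
  set -- $1         # split the value at each "!"
  set +f
  IFS=$old_ifs
  printf 'command=%s\n' "$1"
  if [ "$#" -gt 0 ]; then shift; fi
  n=1
  for arg in "$@"; do
    printf 'ARG%d=%s\n' "$n" "$arg"
    n=$((n + 1))
  done
}

split_check_command 'check_load!5.0!4.0!3.0!10.0!6.0!4.0'
```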


12) Restart Nagios server again.
sudo /etc/init.d/nagios3 restart
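
Before a restart like the one above, it can save some trouble to let Nagios verify the edited configuration first. The sketch below assumes the Ubuntu binary name nagios3 and its -v (verify) option; verify_nagios_config is a hypothetical wrapper that refuses to report success when the binary is not present.

```shell
# Hypothetical wrapper: $1 = Nagios binary, $2 = main configuration file.
# Returns 2 if the binary is missing, otherwise the exit status of the verification run.
verify_nagios_config() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "skipped: $1 not found on this machine"
    return 2
  fi
  "$1" -v "$2"
}

if verify_nagios_config nagios3 /etc/nagios3/nagios.cfg; then
  echo "configuration OK, safe to restart Nagios"
else
  echo "not restarting: configuration was not verified"
fi
```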

13) Now click the Services option (in the left pane) of the Nagios Core web UI. You will see the Check-Load service in the list along with the other services, such as PING. The Check-Load service now has an icon beside it; click the icon to view its graph.

PNP4Nagios setup with Nagios Core

1) You can install PNP4Nagios on Ubuntu using apt-get
sudo apt-get install pnp4nagios

It installs into the following directories:
/etc/pnp4nagios: here you will find all the configuration files and templates
/usr/lib/pnp4nagios: here you will find the libraries and the process_perfdata.pl script

2) PNP supports several modes to process performance data. The modes differ in complexity and the performance to be expected.
Synchronous Mode: The “synchronous mode” is the simplest and easiest to set up. Nagios calls the Perl script process_perfdata.pl once for every service and host check to process the data. The synchronous mode works well up to about 1,000 services in a 5 minute interval.

Bulk Mode: In bulk mode Nagios writes the necessary data to a temporary file. After a defined interval expires, the file is processed in one piece and then deleted.
The number of calls to process_perfdata.pl is reduced to a fraction. Depending on the interval and the amount of collected data, there are far fewer system calls; in exchange, each run of process_perfdata.pl takes longer.
Note: In this mode you should keep an eye on the runtime of process_perfdata.pl. While it is processing data, Nagios will not execute any checks.

Bulk Mode with NPCD: Nagios again uses a temporary file to store the data and executes a command after a certain interval expires. Instead of being processed immediately by process_perfdata.pl, the file is moved to a spool directory. As moving a file within the same filesystem takes almost no time, Nagios can get back to its crucial work immediately.
The NPCD daemon (Nagios Performance C Daemon) monitors the directory for new files and passes their names to process_perfdata.pl. Processing of performance data is thus completely decoupled from Nagios. NPCD itself is able to start multiple threads for processing the data.
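
The flow of bulk mode with a spool directory can be sketched in a few lines of shell. This is only a toy model under stated assumptions: all paths are throwaway temporary paths, the function names are hypothetical, and counting lines stands in for the real work done by process_perfdata.pl.

```shell
# Throwaway working area; these are NOT the real Nagios or PNP4Nagios paths.
work=$(mktemp -d)
perfdata="$work/perfdata.log"
spool="$work/spool"
mkdir -p "$spool"

# "Nagios" side: append one line per check result to the temporary file.
record_result() {
  printf '%s\n' "$1" >> "$perfdata"
}

# Processing command in this mode: only move the file, which is nearly instant.
# $1 is a made-up sequence number used to keep spool file names unique.
move_to_spool() {
  [ -s "$perfdata" ] || return 0
  mv "$perfdata" "$spool/perfdata.$1"
}

# "NPCD" side: handle each spooled file in one piece, then delete it.
drain_spool() {
  total=0
  for f in "$spool"/perfdata.*; do
    [ -e "$f" ] || continue
    lines=$(wc -l < "$f")
    total=$((total + lines))
    rm -f "$f"
  done
  printf '%d\n' "$total"
}

record_result 'localhost Check-Load load1=0.10'
record_result 'localhost Current-Users users=2'
move_to_spool 1
record_result 'localhost Check-Load load1=0.20'
move_to_spool 2
processed=$(drain_spool)
echo "processed $processed results"
```

The point of the design is visible in move_to_spool: the monitoring side never waits for the processing, it only renames a file.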

Note: We will be using Bulk Mode. For the other configurations, please refer to the PNP4Nagios documentation.


3) Now we will edit some Nagios configuration files to make PNP4Nagios functional. Change to the Nagios Core configuration directory and add the following lines to nagios.cfg.
cd /etc/nagios3/
sudo vi /etc/nagios3/nagios.cfg

enable_environment_macros=1

# process nagios performance data (already done for nagiosgraph)
process_performance_data=1
service_perfdata_file=/tmp/perfdata.log
service_perfdata_file_template=$LASTSERVICECHECK$||$HOSTNAME$||$SERVICEDESC$||$SERVICEOUTPUT$||$SERVICEPERFDATA$
service_perfdata_file_mode=a
service_perfdata_file_processing_interval=10
service_perfdata_file_processing_command=process-service-perfdata-for-nagiosgraph

# Note: You can change the name of service_perfdata_file_processing_command (I am keeping it the same for both nagiosgraph and pnp4nagios, so that both plugins can use it for service perf data).
# In addition to the above configuration, add the following

service_perfdata_command=process-service-perfdata

# host performance data starting with Nagios 3.0
host_perfdata_file=/tmp/host-perfdat.log
host_perfdata_file_template=DATATYPE::HOSTPERFDATA\tTIMET::$TIMET$\tHOSTNAME::$HOSTNAME$\tHOSTPERFDATA::$HOSTPERFDATA$\tHOSTCHECKCOMMAND::$HOSTCHECKCOMMAND$\tHOSTSTATE::$HOSTSTATE$\tHOSTSTATETYPE::$HOSTSTATETYPE$
host_perfdata_file_mode=a
host_perfdata_file_processing_interval=15
host_perfdata_file_processing_command=process-host-perfdata-file
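
The host template above writes one tab-separated line per host check, with each field in KEY::VALUE form. As a rough illustration, the sketch below pulls single values out of such a line. The sample line is invented, and perfdata_value is a hypothetical helper, not something shipped with PNP4Nagios.

```shell
# Hypothetical helper: print the value stored under key $2 in a tab-separated KEY::VALUE line $1.
perfdata_value() {
  printf '%s\n' "$1" | awk -F'\t' -v key="$2" '{
    for (i = 1; i <= NF; i++) {
      p = index($i, "::")                      # position of the first "::" in this field
      if (p > 0 && substr($i, 1, p - 1) == key) {
        print substr($i, p + 2)                # everything after the "::"
        exit
      }
    }
  }'
}

# Invented sample line in the shape of host_perfdata_file_template.
sample=$(printf 'DATATYPE::HOSTPERFDATA\tTIMET::1405065600\tHOSTNAME::localhost\tHOSTPERFDATA::rta=0.050ms;3000.000;5000.000;0; pl=0%%;80;100;;\tHOSTCHECKCOMMAND::check-host-alive\tHOSTSTATE::UP\tHOSTSTATETYPE::HARD')

perfdata_value "$sample" HOSTNAME
perfdata_value "$sample" HOSTPERFDATA
perfdata_value "$sample" HOSTSTATE
```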


4) Now define the processing commands in the Nagios Core commands file.
sudo vi /etc/nagios3/commands.cfg

define command{
        command_name    process-service-perfdata-for-nagiosgraph
        command_line    /usr/lib/pnp4nagios/libexec/process_perfdata.pl --bulk=/tmp/perfdata.log
}
define command{
        command_name    process-host-perfdata-file
        command_line    /usr/lib/pnp4nagios/libexec/process_perfdata.pl --bulk=/tmp/host-perfdat.log
}

5) Append the following host and service templates to the services_nagios2.cfg file in the Nagios Core directory /etc/nagios3/conf.d
define host {
   name       host-pnp
   action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=_HOST_
   register   0
}

define service {
   name       srv-pnp
   action_url /pnp4nagios/index.php/graph?host=$HOSTNAME$&srv=$SERVICEDESC$
   register   0
}

6) Now use the host and service templates in localhost_nagios2.cfg
# Define a host to check the health on the local machine.
define host{
        use                     generic-host,host-pnp            ; Name of host template to use
        host_name               localhost
        alias                   localhost
        address                 127.0.0.1
        }
# Define a service to check the number of users on the local machine.
define service{
        use                             generic-service,srv-pnp         ; Name of service template to use
        host_name                       localhost
        service_description             Current Users
        check_command                   check_users!20!50
        }

  
7) Restart the Nagios server again.
sudo /etc/init.d/nagios3 restart

8) Now click the Services option (in the left pane) of the Nagios Core web UI. You will see the Current Users service in the list along with the other services. The Current Users service now has an icon beside it; click the icon to view its graph. Similarly, the localhost host will have an icon beside it.
