Difference between revisions of "LCG-FR / SA1-FR Monitoring NagiosWithQuattor"

An article from lcgwiki.
(Who is monitored)
(Configuring the Nagios server)
Line 11: Line 11:
 
An example Nagios server template is here :
 
  
object template profile_node58;
+
https://trac.lal.in2p3.fr/LCGQWG/browser/templates/trunk/clusters/example-3.1/profiles/nagios3-server.example.org.tpl
+
 
include { 'rpms/kernelupdates' }; # this includes kernel updates, no matter the OS version
+
This machine should be a UI to monitor grid services.
variable AII_KS_SRV = "192.54.208.182";
 
variable AII_ACK_SRV = AII_KS_SRV;
 
variable NFS_AUTOFS = true;
 
include { 'site/firewall/nagios_server' };
 
 
############
 
# UI functionality, needed by Nagios for grid service checks
 
  variable VOS ?= list('grif','dteam');
 
include { 'machine-types/ui' };
 
############
 
 
 
#include Nagios server
 
##############################
 
##What resources are monitored
 
variable SITES = list('dapnia');
 
include { 'config/nodes_properties' };
 
##############################
 
###Configuration, setting variables
 
variable NAGIOS_NCG_CONFIG = true;
 
variable NAGIOS_NOTIFICATIONS_ENABLED = false;
 
variable NAGIOS_NODES_PROPERTIES  = NODES_PROPS;
 
variable NAGIOS_DEFAULT_ADMIN_NAME = "dapnia";
 
variable NAGIOS_IGNORED_NODES = list("node09.datagrid.cea.fr","node19.datagrid.cea.fr","node22.datagrid.cea.fr");
 
variable NAGIOS_MONITORED_HOSTGROUPS =
 
list("WN","NFS","SEDPM","SE_DISK","SITE_BDII","MON","LFC","CE","CE-MPI","VOBOX","UI","WMS");
 
variable NAGIOS_ADMIN_CONTACTS= nlist(
 
        "tuto1"          ,"tuto1@org.fr",
 
        "tuto2"    ,"tuto2@org.fr",
 
 
        );
 
  variable NAGIOS_HTPASSWD_LOGIN ?= "grif";
 
variable NAGIOS_HTPASSWD_PASS  ?= 'xxxxxx';
 
 
##############################
 
###Functions used to configure services and hosts
 
include { 'monitoring/nagios3/server/functions' };
 
 
##############################
 
###Services configuration
 
variable TMP_SERVICE=nlist(
 
    "use","                            generic-service",
 
    "host_name","                      node07.datagrid.cea.fr",
 
    "service_description","            Workers ssh_known_hosts",
 
    "contact_groups","                  admins",
 
    "check_command","                  check_nrpe_long!check_ssh_known_hosts!60",
 
    "normal_check_interval","          60 ; check every hour",
 
    "max_check_attempts","              1",
 
);
 
variable NAGIOS_SERVICES=nagios_add_service(TMP_SERVICE);
 
variable NAGIOS_USER_DEFINED_HOST_DEPENDENCIES=nagios_add_host_service_dependency("node07.datagrid.cea.fr","nrpe  daemon","node07.datagrid.cea.fr","Workers ssh_known_hosts");
 
include { 'monitoring/nagios3/server/config' };
 
 
   
 
   
###
 
#
 
# software repositories (should be last)
 
#
 
include { 'rpms/siteupdates' };
 
include { PKG_REPOSITORY_CONFIG };
 
 
 
=== Who is monitored ===
 
  
Line 82: Line 23:
 
https://trac.lal.in2p3.fr/LCGQWG/browser/templates/trunk/sites/example/site/config
 
  
You can tune tis with:
+
You can tune this with:
  
 
NAGIOS_IGNORED_NODES  
 
Line 95: Line 36:
 
=== Proxy management ===
 
 
Need to have a valid certificate for local grid probe.
 
2 mechanisms are possible: Renewal et Retrieval sont possibles:
+
2 mechanisms are possible: Renewal and Retrieval are possible:
 
in cfg/standard/monitoring/nagios3/server/config.tpl
 
 
  include { if(NAGIOS_NCG_CONFIG && NAGIOS_MODE_PROXY_RENEW) 'monitoring/nagios3/server/vobox'};
 
Line 109: Line 50:
 
  MYPROXY_SERVER
 
 
  NAGIOS_VONAME_PROXY
 
 
  
 
== client configuration ==
 

Revision as of 21:49, 20 January 2009

Installing Nagios with Quattor

Configuring Nagios requires two sets of templates: client templates, which define the commands run on clients by the Nagios Remote Plug-in Executor (NRPE), and server templates, which configure the contacts for alarms, the hosts to be monitored, the services (also known as sensors), and so on.


Configuring the Nagios server

A Nagios server is configured through a set of standard templates in the 'monitoring/nagios3' namespace. Sensors are provided for many of the plug-ins from the SA1 repository: http://www.sysadmin.hep.ac.uk/rpms/grid-services/RPMS.monitoring/

An example Nagios server template is available here:

https://trac.lal.in2p3.fr/LCGQWG/browser/templates/trunk/clusters/example-3.1/profiles/nagios3-server.example.org.tpl

This machine should also be configured as a UI so that it can monitor grid services.

Who is monitored

Hosts that belong to a site listed in the SITES variable and that are declared in config/'sitename'_nodes_properties.tpl are monitored.

Example templates for host declarations are in LCGQWG: https://trac.lal.in2p3.fr/LCGQWG/browser/templates/trunk/sites/example/site/config

You can tune this with:

NAGIOS_IGNORED_NODES

NAGIOS_MONITORED_HOSTGROUPS


See the example profile linked above for how these variables are set.
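
As a minimal sketch of this tuning, the fragment below sets the two variables in the server profile. The variable names and host group names come from this page; the hostnames are placeholders, and the comments describe the behaviour the variable names suggest rather than anything verified against the standard templates.

```pan
# Hosts declared for the site that should nevertheless be left out
# of monitoring (placeholder hostnames).
variable NAGIOS_IGNORED_NODES = list(
    "node09.example.org",
    "node19.example.org"
);

# Only hosts in these groups are monitored.
variable NAGIOS_MONITORED_HOSTGROUPS = list("CE", "SEDPM", "SITE_BDII", "WN");
```

In the inline profile shown in the revision diff for this page, such variables are defined before 'monitoring/nagios3/server/config' is included, so the same ordering is probably the safest choice.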

What is monitored

Proxy management

A valid certificate is needed for the local grid probes. Two mechanisms are available, renewal and retrieval, and they are selected in cfg/standard/monitoring/nagios3/server/config.tpl:

include { if(NAGIOS_NCG_CONFIG && NAGIOS_MODE_PROXY_RENEW) 'monitoring/nagios3/server/vobox'};
include { if(NAGIOS_NCG_CONFIG && NAGIOS_MODE_PROXY_RETRIEVE) 'monitoring/nagios3/server/proxy_retrieval'};

The associated variables:

NAGIOS_MODE_PROXY_RENEW
NAGIOS_RENEW_PROXY 
NAGIOS_OUTPUT_PROXY
NAGIOS_MODE_PROXY_RETRIEVE
NAGIOS_MYPROXY_NAME 
MYPROXY_SERVER
NAGIOS_VONAME_PROXY
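
The fragment below is a hedged sketch of how the retrieval mechanism might be switched on from a profile. Only the variable names are taken from this page. That the mode switch is a boolean follows from its use in the `if(... && ...)` conditions above; the types and values of the other variables, and which of them belong to which mechanism, are assumptions that should be checked against the standard templates.

```pan
# Select the retrieval mechanism (boolean, as tested in config.tpl).
variable NAGIOS_MODE_PROXY_RETRIEVE = true;

# Assumption: plain strings; every value here is a placeholder.
variable MYPROXY_SERVER = "myproxy.example.org";
variable NAGIOS_VONAME_PROXY = "dteam";
variable NAGIOS_MYPROXY_NAME = "nagios-proxy";
```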

Client configuration

Variables

| NAGIOS_ADMIN_CONTACTS | admin emails for alarms |
| NAGIOS_CONFINFO_USERS | |
| NAGIOS_DEFAULT_ADMIN_NAME | |
| NAGIOS_DEFAULT_NODE_GROUP | |
| NAGIOS_HOSTCOMMANDS_USERS | |
| NAGIOS_HOSTVIEW_USERS | |
| NAGIOS_HTPASSWD_CONFIG | |
| NAGIOS_HTPASSWD_LOGIN | |
| NAGIOS_HTPASSWD_PASS | |
| NAGIOS_IGNORED_NODES | |
| NAGIOS_KNOWN_HOSTGROUPS | |
| NAGIOS_MONITORED_HOSTGROUPS | |
| NAGIOS_NCG_CONFIG | |
| NAGIOS_NODES_PROPERTIES | |
| NAGIOS_NOTIFICATIONS_ENABLED | |
| NAGIOS_RPM_VERSION | |
| NAGIOS_SERVCOMMANDS_USERS | |
| NAGIOS_SERVER | |
| NAGIOS_SERVICEEXTINFOS | |
| NAGIOS_SERVICES | |
| NAGIOS_SERVVIEW_USERS | |
| NAGIOS_SUPPORTED_OS_LIST | |
| NAGIOS_SYSCOMMAND_USERS | |
| NAGIOS_SYSINFO_USERS | |
| NAGIOS_USER_DEFINED_HOST_DEPENDENCIES | |
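
To illustrate a few of these variables, here is a short sketch modelled on the inline profile visible in the revision diff for this page. The contact names, addresses and credentials are the placeholder values used there; reading the HTPASSWD variables as the web interface login is an assumption based on their names.

```pan
# Alarm contacts: contact name mapped to e-mail address.
variable NAGIOS_ADMIN_CONTACTS = nlist(
    "tuto1", "tuto1@org.fr",
    "tuto2", "tuto2@org.fr"
);

# Keep notifications off while the setup is being tested.
variable NAGIOS_NOTIFICATIONS_ENABLED = false;

# Presumably the login used for the Nagios web interface.
variable NAGIOS_HTPASSWD_LOGIN ?= "grif";
variable NAGIOS_HTPASSWD_PASS ?= 'xxxxxx';
```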


Installation example

With Quattor

Create the server profile, using the example profile above as a model, and add it to the repository:

svn add cfg/clusters/your-3.1/profiles/profile_node58.tpl
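
For orientation, a server profile could be reduced to something like the sketch below. It is condensed from the profile that the previous revision of this page carried inline; the site name is a placeholder, the values are illustrative, and the example template linked above remains the reference.

```pan
object template profile_node58;

# The monitoring host is also a UI so that it can run grid probes.
variable VOS ?= list('dteam');
include { 'machine-types/ui' };

# Sites whose hosts are monitored (placeholder site name).
variable SITES = list('your');
include { 'config/nodes_properties' };

# Nagios server settings (illustrative values).
variable NAGIOS_NCG_CONFIG = true;
variable NAGIOS_NOTIFICATIONS_ENABLED = false;
variable NAGIOS_NODES_PROPERTIES = NODES_PROPS;

include { 'monitoring/nagios3/server/functions' };
include { 'monitoring/nagios3/server/config' };

# Software repositories (should be last).
include { PKG_REPOSITORY_CONFIG };
```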

Modify your list of machines:

vi ./cfg/sites/your/site/config/your_nodes_properties.tpl

Create your hardware template:

svn cp ./cfg/sites/your/hardware/virtual_machine_3.tpl ./cfg/sites/your/hardware/virtual_machine_13.tpl


Commit your changes:

svn ci -m 'adding Nagios server'


On the Nagios server, check the Quattor logs, then check and start the Nagios service:

vi /var/log/spma.log
vi /var/log/ncm-cdispd.log
/etc/init.d/nagios status
/etc/init.d/nagios start


Check the server certificate