Difference between revisions of "LCG-FR / SA1-FR Monitoring NagiosWithQuattor"

From lcgwiki.

Revision as of 22:34, 20 January 2009

Installing Nagios with quattor

Nagios configuration requires both a set of client templates for commands to be run on clients by the Nagios Remote Plug-in Executor (NRPE) and a set of server templates configuring contacts for alarms, hosts to be monitored, services (AKA sensors) and so on.


Configuring the Nagios server

The configuration of a Nagios server is done in a set of standard templates, in the 'monitoring/nagios3' namespace.

Repository Used

Sensors are provided for many of the plug-ins from:

-the SA1 repository: http://www.sysadmin.hep.ac.uk/rpms/grid-services/RPMS.monitoring/

-RPMs for nagios and nagios-plugins (plus dependencies) are compiled for each supported OS and placed in the « updates » repository on quattorsrv.lal.in2p3.fr. Ex.: http://quattor.web.lal.in2p3.fr/packages/os/sl440-i386/updates/

-The « nagios-grid-plugins » plug-ins are packaged as noarch RPMs in the « nagios » repository. Ex.: http://quattor.web.lal.in2p3.fr/packages/nagios/

See the template nagios3/plugins/config.tpl

Server Template

An example Nagios server template is here :

https://trac.lal.in2p3.fr/LCGQWG/browser/templates/trunk/clusters/example-3.1/profiles/nagios3-server.example.org.tpl

This machine should be a UI to monitor grid services.

Who is monitored

Hosts that belong to the site (variable SITES) and are listed in config/’sitename’_nodes_properties.tpl are monitored.

Example templates for host declarations are in LCGQWG: https://trac.lal.in2p3.fr/LCGQWG/browser/templates/trunk/sites/example/site/config

You can tune this with:

NAGIOS_IGNORED_NODES

NAGIOS_MONITORED_HOSTGROUPS


See the example server profile above.
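
As a rough illustration of this tuning, the two variables could be set in the server profile as sketched below. This is only an assumption-laden sketch: it supposes that both variables are Pan lists of strings (host names and hostgroup names respectively), and the host and hostgroup names are invented. The example profile remains the reference for the real types and values.

```pan
# Hypothetical values; the list-of-strings type is an assumption.
# Hosts that should not be monitored even though they are declared for the site.
variable NAGIOS_IGNORED_NODES = list(
    "node01.example.org",
    "node02.example.org"
);

# Hostgroups to monitor (the group names are made up for this sketch).
variable NAGIOS_MONITORED_HOSTGROUPS = list("WN", "CE");
```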

What is monitored

Services are added in the template « server/cfgfiles/services.tpl ».

A service can be added like this:
variable TMP_SERVICE=nlist(
    "use","                             generic-service",
    "host_name","                       node07.org.fr",
    "service_description","             Workers ssh_known_hosts",
    "contact_groups","                  admins",
    "check_command","                   check_nrpe_long!check_ssh_known_hosts!60",
    "normal_check_interval","           60 ; check every hour",
    "max_check_attempts","              1",
);


If the second parameter of the function nagios_add_service is « true », a dependency on the NRPE daemon is added for the service so defined, for all the nodes matching « *,!NOQUATTOR ». This probably needs to be improved.
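
Putting the two pieces together, a call might look like the sketch below. It assumes that nagios_add_service takes the service nlist as its first argument and that its result is assigned to NAGIOS_SERVICES (a variable listed in the client configuration section of this page). Neither assumption is confirmed here, so check the function definition in the monitoring/nagios3 templates before relying on it.

```pan
# Assumed call shape (not confirmed by this page):
#   first argument  = the service nlist defined above
#   second argument = true, to add the dependency on the NRPE daemon
variable NAGIOS_SERVICES = nagios_add_service(TMP_SERVICE, true);
```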


Some Nagios configuration files do not need a complex Quattor structure template and are simply created with filecopy.

Commands are added in:

monitoring/nagios3/server/cfgfiles/commands

NRPE commands are added in:

monitoring/nagios3/client/cfgfiles/nrpe_commands

It is possible to add a dependency on the NRPE daemon for services which are not defined on all the hosts (template services.tpl).


NEED SOMETHING

It is possible to make a service on one host depend on a service of another, explicitly named host:

variable NAGIOS_USER_DEFINED_HOST_DEPENDENCIES =	nagios_add_host_service_dependency(
	"node07.datagrid.cea.fr","nrpe daemon", "node07.datagrid.cea.fr","Workers ssh_known_hosts" 
);
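
The call above uses the same host for both services. A cross-host variant might look like the following sketch, written as an alternative to that call rather than an addition to it. It assumes the argument order is (host providing the service, service depended upon, dependent host, dependent service), inferred from the example above and not confirmed. The host names and the « gmond » service description are invented.

```pan
# Hypothetical hosts and service description; argument order is an assumption.
# Intended meaning: "gmond" on node08 depends on "nrpe daemon" on node07.
variable NAGIOS_USER_DEFINED_HOST_DEPENDENCIES = nagios_add_host_service_dependency(
    "node07.example.org", "nrpe daemon",
    "node08.example.org", "gmond"
);
```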

It is not possible to add dependencies between hostgroups (for the moment?).

Proxy management

A valid certificate is needed for the local grid probes. Two mechanisms are available: renewal and retrieval. They are enabled in cfg/standard/monitoring/nagios3/server/config.tpl:

include { if(NAGIOS_NCG_CONFIG && NAGIOS_MODE_PROXY_RENEW) 'monitoring/nagios3/server/vobox'};
include { if(NAGIOS_NCG_CONFIG && NAGIOS_MODE_PROXY_RETRIEVE) 'monitoring/nagios3/server/proxy_retrieval'};

The associated variables:

NAGIOS_MODE_PROXY_RENEW
NAGIOS_RENEW_PROXY 
NAGIOS_OUTPUT_PROXY
NAGIOS_MODE_PROXY_RETRIEVE
NAGIOS_MYPROXY_NAME 
MYPROXY_SERVER
NAGIOS_VONAME_PROXY
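
A sketch of how the retrieval mechanism might be selected in the server profile. The boolean flags follow from their use in the include conditions above. The string types of the other variables, their association with the retrieval mode (guessed from the order of the list), and all the values are assumptions.

```pan
# Select proxy retrieval rather than renewal.
# Both mode flags are defined because both appear in the include conditions.
variable NAGIOS_NCG_CONFIG = true;
variable NAGIOS_MODE_PROXY_RETRIEVE = true;
variable NAGIOS_MODE_PROXY_RENEW = false;

# Hypothetical values; the string types are assumptions.
variable MYPROXY_SERVER = "myproxy.example.org";
variable NAGIOS_MYPROXY_NAME = "nagios-proxy";
variable NAGIOS_VONAME_PROXY = "dteam";
```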

Client configuration

Variables

| NAGIOS_ADMIN_CONTACTS | admin emails for alarms |
| NAGIOS_CONFINFO_USERS | |
| NAGIOS_DEFAULT_ADMIN_NAME | |
| NAGIOS_DEFAULT_NODE_GROUP | |
| NAGIOS_HOSTCOMMANDS_USERS | |
| NAGIOS_HOSTVIEW_USERS | |
| NAGIOS_HTPASSWD_CONFIG | |
| NAGIOS_HTPASSWD_LOGIN | |
| NAGIOS_HTPASSWD_PASS | |
| NAGIOS_IGNORED_NODES | |
| NAGIOS_KNOWN_HOSTGROUPS | |
| NAGIOS_MONITORED_HOSTGROUPS | |
| NAGIOS_NCG_CONFIG | |
| NAGIOS_NODES_PROPERTIES | |
| NAGIOS_NOTIFICATIONS_ENABLED | |
| NAGIOS_RPM_VERSION | |
| NAGIOS_SERVCOMMANDS_USERS | |
| NAGIOS_SERVER | |
| NAGIOS_SERVICEEXTINFOS | |
| NAGIOS_SERVICEEXTINFOS | |
| NAGIOS_SERVICES | |
| NAGIOS_SERVVIEW_USERS | |
| NAGIOS_SUPPORTED_OS_LIST | |
| NAGIOS_SYSCOMMAND_USERS | |
| NAGIOS_SYSINFO_USERS | |
| NAGIOS_USER_DEFINED_HOST_DEPENDENCIES | |


Installation Example

With Quattor

Create the server profile (see the example profile above):

svn add cfg/clusters/your-3.1/profiles/profile_node58.tpl

Modify your list of machines:

vi ./cfg/sites/your/site/config/your_nodes_properties.tpl

Create your hardware template:

svn cp ./cfg/sites/your/hardware/virtual_machine_3.tpl ./cfg/sites/your/hardware/virtual_machine_13.tpl


Commit your changes:

svn ci -m 'adding nagios server'


On the Nagios server, check the deployment logs, then check and start the service:

vi /var/log/spma.log
vi /var/log/ncm-cdispd.log
/etc/init.d/nagios status
/etc/init.d/nagios start


Check the server certificate
NEED SOMETHING from node58