LCG-FR / SA1-FR Monitoring NagiosWithQuattor: Difference between revisions

An article from lcgwiki.
LEROY (talk | contribs)
Revision as of 23:34, 20 January 2009

Installing Nagios with Quattor

Nagios configuration requires both a set of client templates for commands to be run on clients by the Nagios Remote Plug-in Executor (NRPE) and a set of server templates configuring contacts for alarms, hosts to be monitored, services (AKA sensors) and so on.


Configuring the Nagios server

The configuration of a Nagios server is done in a set of standard templates, in the 'monitoring/nagios3' namespace.

Repository Used

Sensors are provided for many of the plug-ins from:

- The SA1 repository: http://www.sysadmin.hep.ac.uk/rpms/grid-services/RPMS.monitoring/

- RPMs for nagios and nagios-plugins (plus dependencies) are compiled for each supported OS and placed in the « updates » repository on quattorsrv.lal.in2p3.fr. Ex.: http://quattor.web.lal.in2p3.fr/packages/os/sl440-i386/updates/

- The « nagios-grid-plugins » plug-ins are packaged as noarch RPMs in the « nagios » repository. Ex.: http://quattor.web.lal.in2p3.fr/packages/nagios/

See the template nagios3/plugins/config.tpl

Server Template

An example Nagios server template is available here:

https://trac.lal.in2p3.fr/LCGQWG/browser/templates/trunk/clusters/example-3.1/profiles/nagios3-server.example.org.tpl

This machine should be configured as a grid UI in order to monitor grid services.

Who is monitored

Hosts that belong to the site (variable SITES) and are listed in config/’sitename’_nodes_properties.tpl are monitored.

Example templates for host declarations are in LCGQWG: https://trac.lal.in2p3.fr/LCGQWG/browser/templates/trunk/sites/example/site/config

You can tune this with:

NAGIOS_IGNORED_NODES

NAGIOS_MONITORED_HOSTGROUPS


See the example server profile above.
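
As a rough illustration of tuning the monitored set, the sketch below sets the two variables in a site or profile template. The value types (lists of strings) and the host and hostgroup names are assumptions, not taken from the templates; check the example profile for the form actually expected.

```pan
# Hypothetical values; the expected types are an assumption (lists of strings).
# Nodes to leave out of Nagios monitoring
variable NAGIOS_IGNORED_NODES ?= list(
    "node99.example.org"
);
# Hostgroups for which monitoring is enabled
variable NAGIOS_MONITORED_HOSTGROUPS ?= list(
    "WN",
    "CE"
);
```

The conditional assignment `?=` only sets a value if none was defined earlier, which keeps such defaults overridable from a profile.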

What is monitored

Services are added in the template « server/cfgfiles/services.tpl »

A service can be added like this:
variable TMP_SERVICE = nlist(
    "use","                             generic-service",
    "host_name","                       node07.org.fr",
    "service_description","             Workers ssh_known_hosts",
    "contact_groups","                  admins",
    "check_command","                   check_nrpe_long!check_ssh_known_hosts!60",
    "normal_check_interval","           60 ; check every hour",
    "max_check_attempts","              1"
);


If the second parameter of the function nagios_add_service is « true », a dependency on the NRPE daemon is added for this service on all nodes matching « *,!NOQUATTOR ». This behaviour probably needs to be improved.
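
A minimal sketch of such a call, assuming that nagios_add_service takes the service nlist as its first argument and that its result is assigned to NAGIOS_SERVICES. Neither the first argument nor the target variable is confirmed by this page; only the role of the second parameter is described here.

```pan
# Assumption: the first argument is the service definition (TMP_SERVICE).
# The second argument "true" requests the dependency on the NRPE daemon.
# Assumption: the result is stored in NAGIOS_SERVICES.
variable NAGIOS_SERVICES = nagios_add_service(TMP_SERVICE, true);
```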


Some Nagios configuration files do not need a complex Quattor structure and are simply created with filecopy.

Commands are added in:

monitoring/nagios3/server/cfgfiles/commands

NRPE commands are added in:

monitoring/nagios3/client/cfgfiles/nrpe_commands

It is possible to add a dependency on the NRPE daemon for services which are not defined on all hosts (template services.tpl).


NEED SOMETHING

It is possible to make a service on one host depend on a service of another, explicitly named host:

variable NAGIOS_USER_DEFINED_HOST_DEPENDENCIES = nagios_add_host_service_dependency(
    "node07.datagrid.cea.fr", "nrpe daemon", "node07.datagrid.cea.fr", "Workers ssh_known_hosts"
);

It is not possible to add a dependency between hostgroups (for the moment?).

Proxy management

The local grid probes need a valid certificate. Two mechanisms are available: renewal and retrieval. They are selected in cfg/standard/monitoring/nagios3/server/config.tpl:

include { if(NAGIOS_NCG_CONFIG && NAGIOS_MODE_PROXY_RENEW) 'monitoring/nagios3/server/vobox'};
include { if(NAGIOS_NCG_CONFIG && NAGIOS_MODE_PROXY_RETRIEVE) 'monitoring/nagios3/server/proxy_retrieval'};

The associated variables:

NAGIOS_MODE_PROXY_RENEW
NAGIOS_RENEW_PROXY 
NAGIOS_OUTPUT_PROXY
NAGIOS_MODE_PROXY_RETRIEVE
NAGIOS_MYPROXY_NAME 
MYPROXY_SERVER
NAGIOS_VONAME_PROXY
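
For illustration, retrieval mode might be enabled along these lines. All values are hypothetical, and the meaning attributed to each variable in the comments is inferred from its name rather than documented here.

```pan
# Both conditions of the proxy_retrieval include in config.tpl must be true
variable NAGIOS_NCG_CONFIG = true;
variable NAGIOS_MODE_PROXY_RETRIEVE = true;
# Hypothetical MyProxy server and stored credential name
variable MYPROXY_SERVER = "myproxy.example.org";
variable NAGIOS_MYPROXY_NAME = "nagios-example";
# Hypothetical VO for which the proxy is obtained
variable NAGIOS_VONAME_PROXY = "dteam";
```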

Client configuration

Variables


| NAGIOS_ADMIN_CONTACTS | admin emails for alarms |
| NAGIOS_CONFINFO_USERS | |
| NAGIOS_DEFAULT_ADMIN_NAME | |
| NAGIOS_DEFAULT_NODE_GROUP | |
| NAGIOS_HOSTCOMMANDS_USERS | |
| NAGIOS_HOSTVIEW_USERS | |
| NAGIOS_HTPASSWD_CONFIG | |
| NAGIOS_HTPASSWD_LOGIN | |
| NAGIOS_HTPASSWD_PASS | |
| NAGIOS_IGNORED_NODES | |
| NAGIOS_KNOWN_HOSTGROUPS | |
| NAGIOS_MONITORED_HOSTGROUPS | |
| NAGIOS_NCG_CONFIG | |
| NAGIOS_NODES_PROPERTIES | |
| NAGIOS_NOTIFICATIONS_ENABLED | |
| NAGIOS_RPM_VERSION | |
| NAGIOS_SERVCOMMANDS_USERS | |
| NAGIOS_SERVER | |
| NAGIOS_SERVICEEXTINFOS | |
| NAGIOS_SERVICES | |
| NAGIOS_SERVVIEW_USERS | |
| NAGIOS_SUPPORTED_OS_LIST | |
| NAGIOS_SYSCOMMAND_USERS | |
| NAGIOS_SYSINFO_USERS | |
| NAGIOS_USER_DEFINED_HOST_DEPENDENCIES | |


Installation Example

With Quattor

Create the server profile, using the example profile above as a model, and add it to the repository:

svn add cfg/clusters/your-3.1/profiles/profile_node58.tpl

Modify your list of machines:

vi ./cfg/sites/your/site/config/your_nodes_properties.tpl

Create your hardware template:

svn cp ./cfg/sites/your/hardware/virtual_machine_3.tpl ./cfg/sites/your/hardware/virtual_machine_13.tpl


Commit your changes:

svn ci -m 'adding serveur nagios'


On the Nagios server, check the SPMA and ncm-cdispd logs, then check the status of the nagios service and start it:

vi /var/log/spma.log
vi /var/log/ncm-cdispd.log
/etc/init.d/nagios status
/etc/init.d/nagios start


Check the server certificate
NEED SOMETHING from node58