Difference between revisions of "LCG-FR / SA1-FR Monitoring NagiosWithQuattor"
Revision as of 17:36, 19 January 2009
Installing Nagios with Quattor
Nagios configuration requires both a set of client templates for commands to be run on clients by the Nagios Remote Plug-in Executor (NRPE) and a set of server templates configuring contacts for alarms, hosts to be monitored, services (AKA sensors) and so on.
Configuring the Nagios server
The configuration of a Nagios server is done in a set of standard templates, in the 'monitoring/nagios3' namespace. Sensors are provided for many of the plug-ins from the SA1 repository: http://www.sysadmin.hep.ac.uk/rpms/grid-services/RPMS.monitoring/
An example Nagios server template is shown below:
object template profile_node58;

include { 'rpms/kernelupdates' }; # this includes kernel updates, no matter the OS version

variable AII_KS_SRV = "192.54.208.182";
variable AII_ACK_SRV = AII_KS_SRV;
variable NFS_AUTOFS = true;

include { 'site/firewall/nagios_server' };

############
# UI functionality, useful for Nagios grid service checks
variable VOS ?= list('grif','dteam');
include { 'machine-types/ui' };

############
# include Nagios server

##############################
## What resources are monitored
variable SITES = list('dapnia');
include { 'config/nodes_properties' };

##############################
### Configuration, setting variables
variable NAGIOS_NCG_CONFIG = true;
variable NAGIOS_NOTIFICATIONS_ENABLED = false;
variable NAGIOS_NODES_PROPERTIES = NODES_PROPS;
variable NAGIOS_DEFAULT_ADMIN_NAME = "dapnia";
variable NAGIOS_IGNORED_NODES = list("node09.datagrid.cea.fr","node19.datagrid.cea.fr","node22.datagrid.cea.fr");
variable NAGIOS_MONITORED_HOSTGROUPS = list("WN","NFS","SEDPM","SE_DISK","SITE_BDII","MON","LFC","CE","CE-MPI","VOBOX","UI","WMS");
variable NAGIOS_ADMIN_CONTACTS= nlist(
    "tuto1" ,"tuto1@org.fr",
    "tuto2" ,"tuto2@org.fr",
);
variable NAGIOS_HTPASSWD_LOGIN ?= "grif";
variable NAGIOS_HTPASSWD_PASS ?= 'xxxxxx';

##############################
### Functions used to configure services and hosts
include { 'monitoring/nagios3/server/functions' };

##############################
### Services configuration
variable TMP_SERVICE=nlist(
    "use"," generic-service",
    "host_name"," node07.datagrid.cea.fr",
    "service_description"," Workers ssh_known_hosts",
    "contact_groups"," admins",
    "check_command"," check_nrpe_long!check_ssh_known_hosts!60",
    "normal_check_interval"," 60 ; check every hour",
    "max_check_attempts"," 1",
);
variable NAGIOS_SERVICES=nagios_add_service(TMP_SERVICE);
variable NAGIOS_USER_DEFINED_HOST_DEPENDENCIES=nagios_add_host_service_dependency(
    "node07.datagrid.cea.fr","nrpe daemon",
    "node07.datagrid.cea.fr","Workers ssh_known_hosts");

include { 'monitoring/nagios3/server/config' };

###
#
# software repositories (should be last)
#
include { 'rpms/siteupdates' };
include { PKG_REPOSITORY_CONFIG };
Who is monitored
Hosts that belong to a site listed in the SITES variable and that are present in config/nodes_properties.tpl are monitored. You can tune this with the following variables:
NAGIOS_IGNORED_NODES
NAGIOS_MONITORED_HOSTGROUPS
See the profile above for example values.
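The effect of NAGIOS_IGNORED_NODES can be illustrated with a small shell sketch. This is not the code used by the monitoring/nagios3 templates: the function name `monitored_nodes` and the two plain-text input files (one host name per line) are hypothetical, and hostgroup filtering is left out.

```shell
#!/bin/sh
# Illustrative sketch only; the real selection is done by the Quattor templates.
#
# monitored_nodes ALL_NODES_FILE IGNORED_FILE
# Prints every host listed in ALL_NODES_FILE (one per line) that does not
# appear in IGNORED_FILE, mimicking the effect of NAGIOS_IGNORED_NODES.
monitored_nodes() {
    all_nodes_file=$1
    ignored_file=$2
    # -v: invert the match, -x: match whole lines only,
    # -F: treat patterns as fixed strings, -f: read patterns from a file
    grep -v -x -F -f "$ignored_file" "$all_nodes_file"
}
```

For example, `monitored_nodes all_nodes.txt ignored_nodes.txt` would list the hosts that remain once the ignored ones are removed. Both file names are placeholders.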
What is monitored
Client configuration
Variables
^ Variable ^ Description ^
| NAGIOS_ADMIN_CONTACTS | admin emails for alarms |
| NAGIOS_CONFINFO_USERS | |
| NAGIOS_DEFAULT_ADMIN_NAME | |
| NAGIOS_DEFAULT_NODE_GROUP | |
| NAGIOS_HOSTCOMMANDS_USERS | |
| NAGIOS_HOSTVIEW_USERS | |
| NAGIOS_HTPASSWD_CONFIG | |
| NAGIOS_HTPASSWD_LOGIN | |
| NAGIOS_HTPASSWD_PASS | |
| NAGIOS_IGNORED_NODES | |
| NAGIOS_KNOWN_HOSTGROUPS | |
| NAGIOS_MONITORED_HOSTGROUPS | |
| NAGIOS_NCG_CONFIG | |
| NAGIOS_NODES_PROPERTIES | |
| NAGIOS_NOTIFICATIONS_ENABLED | |
| NAGIOS_RPM_VERSION | |
| NAGIOS_SERVCOMMANDS_USERS | |
| NAGIOS_SERVER | |
| NAGIOS_SERVICEEXTINFOS | |
| NAGIOS_SERVICES | |
| NAGIOS_SERVVIEW_USERS | |
| NAGIOS_SUPPORTED_OS_LIST | |
| NAGIOS_SYSCOMMAND_USERS | |
| NAGIOS_SYSINFO_USERS | |
| NAGIOS_USER_DEFINED_HOST_DEPENDENCIES | |
^ Reserved space | 50 GB, 3.5 GB RAM, 4 CPUs |
Installation example:
With Quattor
Create the server profile (see the profile above) and add it to Subversion:
svn add cfg/clusters/your-3.1/profiles/profile_node58.tpl
Modify your list of machines:
vi ./cfg/sites/your/site/config/your_nodes_properties.tpl
Create your hardware template:
svn cp ./cfg/sites/your/hardware/virtual_machine_3.tpl ./cfg/sites/your/hardware/virtual_machine_13.tpl
Commit your changes:
svn ci -m 'adding nagios server'
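The Subversion steps above can be chained in a small helper with a dry-run mode, so the commands can be reviewed before anything is changed. This is only a sketch: the function name `stage_nagios_server` and the `SVN` override variable are hypothetical, and the edit of the nodes properties file remains a manual step.

```shell
#!/bin/sh
# Hypothetical helper, not part of Quattor or Subversion.
# SVN may be overridden, for example SVN="echo svn", to print the commands
# instead of running them (dry run).
SVN=${SVN:-svn}

# stage_nagios_server PROFILE HW_SOURCE HW_TARGET
# Adds the server profile, copies an existing hardware template to a new
# one, then shows the pending changes. It does not commit anything.
stage_nagios_server() {
    profile=$1
    hw_source=$2
    hw_target=$3
    $SVN add "$profile" &&
    $SVN cp "$hw_source" "$hw_target" &&
    $SVN status
}
```

After checking the output of `svn status`, the commit itself is still done by hand with `svn ci`.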
On the Nagios server
vi /var/log/spma.log
vi /var/log/ncm-cdispd.log
/etc/init.d/nagios status
/etc/init.d/nagios start
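The checks above can also be scripted. The sketch below makes two assumptions that the source does not confirm: that problems appear in the logs as lines containing the word "error", which is only a crude heuristic, and that the init script's `status` action returns a non-zero exit code when the service is stopped (the usual LSB convention). The function names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helpers for a quick post-installation check.

# count_log_errors LOG_FILE
# Prints the number of lines in LOG_FILE that contain "error"
# (case-insensitive). Assumes errors are logged with that word.
count_log_errors() {
    # grep -c exits with status 1 when the count is 0, hence "|| true"
    grep -c -i 'error' "$1" || true
}

# ensure_running INIT_SCRIPT
# Runs "INIT_SCRIPT start" only if "INIT_SCRIPT status" fails, assuming
# a non-zero status exit code means the service is not running.
ensure_running() {
    "$1" status >/dev/null 2>&1 || "$1" start
}
```

A typical use would be `count_log_errors /var/log/spma.log`, then `count_log_errors /var/log/ncm-cdispd.log`, followed by `ensure_running /etc/init.d/nagios`.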
Check the server certificate.