Regional Nagios

An article from lcgwiki.
Revision as of 09:11, 19 February 2010 by LEROY (talk | contribs)

Installing a Nagios box for the ROC France

1) Base installation:

a. Machine installed by the CC sysadmins: OS + VOBOX + certificate

b. Accessible via gsissh (port 1975)

2) Preliminary actions:

a. Request that the machine be authorized to retrieve the SAM tests: https://gus.fzk.de/ws/ticket_info.php?ticket=55132

b. Authorize the Nagios box to retrieve proxies in "retrieval" mode (see Appendix 0)

c. The certificate is used for:

i. Access to the GOCDB PI for ROCs (GOCDB PI level 2 required)

ii. Retrieving proxies for the local probes

d. Subscribe to the mailing list regional-nagios-admins@cern.ch (very responsive)

3) Installing Nagios

Reference: https://twiki.cern.ch/twiki/bin/view/EGEE/GridMonitoringNcgYaim

Install the packages via yum, after adding the following repos:

a. mirrors-rpmforge (rpm: rpmforge-release-0.5.1-1.el5.rf.x86_64.rpm)
b. rpmforge-testing.repo
c. rpmforge.repo
d. glite-UI.repo
e. sa1-centos5-release.repo (rpm: sa1-release-2-1.el5.noarch.rpm)
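
Before running yum, it can help to confirm that the repo files listed above are actually in place. The following is a minimal sketch, not part of the original procedure: the function name missing_repos is hypothetical, and it assumes that all five files (including mirrors-rpmforge, as installed by the rpmforge-release rpm) live in /etc/yum.repos.d.

```shell
#!/bin/sh
# Print the name of each expected repo file that is absent from the given
# directory (default: /etc/yum.repos.d). The file names are those listed
# above; their location is an assumption.
missing_repos() {
    dir=${1:-/etc/yum.repos.d}
    for f in mirrors-rpmforge rpmforge-testing.repo rpmforge.repo \
             glite-UI.repo sa1-centos5-release.repo; do
        [ -e "$dir/$f" ] || echo "$f"
    done
}
```

Calling missing_repos with no argument checks the default directory; an empty output means every listed file was found.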

Dependency problems were encountered; if needed, refer to Appendix 1 (though I had probably just forgotten to run: yum install egee-NAGIOS).

MySQL

1. Install the latest version of MySQL (server + client);

2. Set the admin password (see Appendix 2a);

3. Create the users needed by the Nagios box (see Appendix 2b);

Configuration via yaim

1. Fill in /etc/ncg/ncg.localdb with the list of sites (see Appendix 3);

2. Fill in /opt/glite/yaim/site-info.def (see Appendix 4);

3. groupadd nagios

4. Adjust the UIDs/GIDs in /opt/glite/yaim/examples/edgusers.conf (see Appendix 5);

5. Run the automatic configuration via yaim:

	/opt/glite/yaim/bin/yaim -s /opt/glite/yaim/site-info.def -c -n glite-NAGIOS
	/opt/glite/yaim/bin/yaim -s /opt/glite/yaim/site-info.def -c -n glite-NRPE
	/opt/glite/yaim/bin/yaim -s /opt/glite/yaim/site-info.def -c -n glite-UI -n glite-NAGIOS
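
The single-node yaim runs above could be wrapped so that the sequence stops at the first failure. This is only a sketch under stated assumptions: the function name configure_nodes and the YAIM / SITE_INFO override variables are hypothetical, and only the -s, -c and -n options already shown above are used. The combined run with two -n options is not covered and would still be launched by hand.

```shell
#!/bin/sh
# Paths default to those used above; set YAIM=echo for a dry run that only
# prints the arguments that would be passed.
YAIM=${YAIM:-/opt/glite/yaim/bin/yaim}
SITE_INFO=${SITE_INFO:-/opt/glite/yaim/site-info.def}

# Configure each node type in turn and stop at the first failure.
configure_nodes() {
    for node in "$@"; do
        echo "== configuring $node"
        "$YAIM" -s "$SITE_INFO" -c -n "$node" || {
            echo "yaim failed for $node" >&2
            return 1
        }
    done
}

# Example (commented out so that sourcing this file changes nothing):
# configure_nodes glite-NAGIOS glite-NRPE
```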

Tuning the configuration

1. Check in /etc/sysconfig/nagios:

LD_LIBRARY_PATH=/opt/classads/lib64:/opt/glite/lib64:/opt/globus/lib:/opt/c-ares/lib:/opt/classads/lib64  

2. Disable notifications in the configuration files:

a. /etc/nagios/nagios.cfg: set enable_notifications=0 and log_notifications=0
b. /etc/nagios/wlcg.d/host.templates.cfg
c. /etc/nagios/wlcg.d/service.templates.cfg

3. Allow only dteam/France members to view our Nagios interface. Edit the file:

/etc/voms2htpasswd.conf, with: vomss://voms.cern.ch:8443/voms/dteam?/dteam/france

This is no longer needed if site-info.def is set correctly:

VO_DTEAM_VOMS_SERVERS='vomss://voms.cern.ch:8443/voms/dteam?/dteam/france'
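
The nagios.cfg part of step 2 (disabling notifications) can be double-checked after a yaim run, since a reconfiguration may rewrite the file. A minimal sketch follows; the function name notifications_disabled is hypothetical, and the check only covers the two directives named above, not the wlcg.d template files.

```shell
#!/bin/sh
# Succeed only if both directives are set to 0 in the given nagios.cfg
# (default path as used above). Whitespace around "=" is tolerated.
notifications_disabled() {
    cfg=${1:-/etc/nagios/nagios.cfg}
    for key in enable_notifications log_notifications; do
        grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*0[[:space:]]*\$" "$cfg" || {
            echo "$key is not set to 0 in $cfg" >&2
            return 1
        }
    done
}
```

Running notifications_disabled && echo OK prints OK only when both settings are in place.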

ActiveMQ

/usr/sbin/msg-to-queue --prefix /queue/grid.probe.metricOutput.EGEE.a635834332381123c8b296d02b682f8f --broker-uri stomp://prod-grid-msg.cern.ch:6163

[root@cclcgvmli03 cron.hourly]# cat check_msg-to-queue.sh

Check the messages:

/usr/libexec/grid-monitoring/plugins/nagios/recv_from_queue -v 

Update perl-GridMon:

The problem here is in the message handler 
(/usr/lib/perl5/vendor_perl/5.8.8/GridMon/MsgHandler/MetricOutput.pm). 
Probe on WN reports hostname localhost.localdomain and serviceURI CE 
hostname. In the previous version message handler first checked hostname 
value and then serviceURI. That is the reason why Christine is seeing 
results for localhost.localdomain. However, we fixed this and the latest 
version (1.0.34) parses messages correctly.
yum update perl-GridMon

Appendix 0

[cleroy@grid08 ~]$ grep cclcgvmli03 /opt/glite/etc/myproxy-server.conf
trusted_retrievers /O=GRID-FR/C=FR/O=CNRS/OU=CC-LYON/CN=cclcgvmli03.in2p3.fr
authorized_retrievers /O=GRID-FR/C=FR/O=CNRS/OU=CC-LYON/CN=cclcgvmli03.in2p3.fr
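
The same check can be scripted on the MyProxy server. This is a sketch only: the function name is_trusted_retriever is hypothetical, and it assumes each directive sits on its own line in exactly the unquoted form shown above.

```shell
#!/bin/sh
# Succeed if the DN appears under both retriever directives in the config.
# grep -F treats the DN as a fixed string; -x requires a whole-line match.
is_trusted_retriever() {
    dn=$1
    conf=${2:-/opt/glite/etc/myproxy-server.conf}
    grep -Fxq "trusted_retrievers $dn" "$conf" &&
        grep -Fxq "authorized_retrievers $dn" "$conf"
}
```

For example: is_trusted_retriever "/O=GRID-FR/C=FR/O=CNRS/OU=CC-LYON/CN=cclcgvmli03.in2p3.fr" && echo authorized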

Appendix 1

rpm -ivh http://www.sysadmin.hep.ac.uk/rpms/egee-SA1/centos5/x86_64/sa1-release-2-1.el5.noarch.rpm
rpm -ivh  http://packages.sw.be/rpmforge-release/rpmforge-release-0.5.1-1.el5.rf.x86_64.rpm
yum install atp
yum install bouncycastle
yum install broker
yum install broker-cache
yum install dcache-srmclient
yum install dummy-ca-certs
yum install egee-NAGIOS
yum install egee-NAGIOS egee-NRPE
yum install egee-NRPE
yum install fetch-crl
yum install fipscheck fipscheck-lib
yum install glite-UI
yum install glite-security-voms-clients
yum install glite-wms-ui-commands
yum install glite-yaim-core
yum install glite-yaim-nagios
yum install httpd
yum install jdk
yum install lcg-CA
yum install lcg-CA egee-NAGIOS
yum install lcg_util
yum install mddb
yum install msg-publish-simple
yum install myproxy
yum install mysql-client
yum install mysql-server
yum install nagios-proxy-refresh
yum install perl-Config-Tiny
yum install perl-DBD-MySQL
yum install perl-rrdtool-1.3.8-2.el5.rf.x86_64
yum install python-yaml
yum install sun-jaf
yum install uberftp-client
yum install vdt_globus_rm_client
yum update glite-yaim-clients
yum update glite-yaim-core
yum update glite-yaim-nagios
yum update mysql-server
yum update perl-DBI
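
The history above repeats several packages (egee-NAGIOS appears three times, for instance). As a rough sketch, assuming the history has been saved to a text file, the hypothetical helper below reduces it to a de-duplicated package list; the rpm -ivh lines are ignored.

```shell
#!/bin/sh
# Read "yum install ..." / "yum update ..." lines on stdin and print each
# package name once, sorted in the C locale.
unique_packages() {
    awk '$1 == "yum" && ($2 == "install" || $2 == "update") {
             for (i = 3; i <= NF; i++) print $i
         }' | LC_ALL=C sort -u
}
```

The resulting list could then be passed to a single yum install, for example through xargs, rather than replaying the commands one by one.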

Appendix 2

a) MySQL admin password: use yum to get the latest version of MySQL (MySQL-server-community) and do not forget the client, which is not pulled in as a dependency. Start MySQL with --skip-grant-tables so that no password is required, then set the root password:

mysqld_safe --skip-grant-tables &
[root@cclcgvmli03 ~]# mysql -u root
mysql> UPDATE mysql.user SET Password=PASSWORD("NEW-ROOT-PASSWORD") WHERE User='root';
mysql> FLUSH PRIVILEGES;

b) Creating the users for the regional Nagios:

[root@cclcgvmli03 ~]# mysql -u root -p
mysql> GRANT SELECT, INSERT, UPDATE, DELETE ON nagios.* TO 'ndouser'@'localhost' IDENTIFIED by 'ROCfr2009';
mysql> GRANT SELECT, INSERT, UPDATE, DELETE ON atp.* TO 'atpuser'@'localhost' IDENTIFIED by 'ROCfr2009';

Appendix 3

[root@cclcgvmli03 ~]# cat /etc/ncg/ncg.localdb
#
# Local Rules file to modify NCG configuration
#
SITE!AUVERGRID
SITE!CGG-LCG2
SITE!ESRF
SITE!GRIF
SITE!IBCP-GBIO
SITE!IN2P3-CC
SITE!IN2P3-CC-PPS
SITE!IN2P3-CC-T2
SITE!IN2P3-CPPM
SITE!IN2P3-IPNL
SITE!IN2P3-IRES
SITE!IN2P3-LAPP
SITE!IN2P3-LPC
SITE!IN2P3-LPSC
SITE!IN2P3-SUBATECH
SITE!IPSL-IPGP-LCG2
SITE!M3PEC
SITE!MSFG
SITE!MSFG-MULTI
SITE!MSFG-OPEN
SITE!OBSPM
SITE!PARIS-UREC-IPV6
SITE!SN-UCAD
SITE!ROC-FR
SITE!SOLEIL
SITE!StratusLab
[root@cclcgvmli03 ~]#
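
If the site list is kept elsewhere, a file in this format can be generated rather than edited by hand. The sketch below simply mirrors the layout of the listing above (a comment header, then one SITE!<name> line per site); the function name make_localdb is hypothetical.

```shell
#!/bin/sh
# Print an ncg.localdb body with one SITE!<name> line per argument,
# mirroring the format of the listing above.
make_localdb() {
    printf '#\n# Local Rules file to modify NCG configuration\n#\n'
    for site in "$@"; do
        printf 'SITE!%s\n' "$site"
    done
}

# Example (commented out so that sourcing this file changes nothing):
# make_localdb IN2P3-CC GRIF > /etc/ncg/ncg.localdb
```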

Appendix 4

SITE_EMAIL=c.leroy@cea.fr
SITE_NAME=ROC-FR
RB_HOST=node04.datagrid.cea.fr
WMS_HOST=node04.datagrid.cea.fr
PX_HOST=myproxy.grif.fr
BDII_HOST=topbdii.grif.fr
SITE_BDII_HOST=bdii.grif.fr
MON_HOST=node06.datagrid.cea.fr
VOS="dteam"
DTEAM_GROUP_ENABLE="dteam"
VO_DTEAM_SW_DIR=$VO_SW_DIR/dteam
VO_DTEAM_DEFAULT_SE=$SE_HOST
VO_DTEAM_STORAGE_DIR=$CLASSIC_STORAGE_DIR/dteam
VO_DTEAM_VOMS_SERVERS='vomss://voms.cern.ch:8443/voms/dteam?/dteam/'
VO_DTEAM_VOMSES="'dteam lcg-voms.cern.ch 15004 /DC=ch/DC=cern/OU=computers/CN=lcg-voms.cern.ch dteam 24' 'dteam voms.cern.ch 15004 /DC=ch/DC=cern/OU=computers/CN=voms.cern.ch dteam 24'"
VO_DTEAM_VOMS_CA_DN="'/DC=ch/DC=cern/CN=CERN Trusted Certification Authority' '/DC=ch/DC=cern/CN=CERN Trusted Certification Authority'"
NAGIOS_HOST=cclcgvmli03.in2p3.fr
NAGIOS_ADMIN_DNS="/O=GRID-FR/C=FR/O=CEA/OU=IRFU/CN=Christine Leroy","/O=GRID-FR/C=FR/O=CNRS/OU=CC-LYON/CN=Nadia Lajili","/O=GRID-FR/C=FR/O=CNRS/OU=LPC/CN=Emmanuel Medernach","/O=GRID-FR/C=FR/O=CNRS/OU=CPPM/CN=Juan Carlos Carranza"
NAGIOS_NCG_ENABLE_CONFIG=true
NAGIOS_NAGIOS_ENABLE_CONFIG=true
NCG_GOCDB_ROC_NAME=France
ROC_NAME=France
NCG_PROBES_TYPE=remote,native,local
NCG_VO=dteam
NAGIOS_MYPROXY_NAME=nagios_roc_fr2
NAGIOS_MYPROXY_USER=nagios
MSG_BROKER_CACHE_NETWORK=PROD
NAGIOS_ROLE=roc
NAGIOS_HTTPD_ENABLE_CONFIG=true
NAGIOS_SUDO_ENABLE_CONFIG=true
NAGIOS_CGI_ENABLE_CONFIG=true
NCG_LDAP_FILTER=GlueSiteOtherInfo=EGEE_ROC=France
NAGIOS_DB_PASS=x
NAGIOS_NSCA_PASS=x
MYSQL_ADMIN=x
ATP_DB_PASS=x
MDDB_DB_PASS=x
MS_DB_PASS=x
MYSQL_PASSWORD=x
MYEGEE_DB_PASS=x
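
The passwords in the listing above are shown as x. Before running yaim, a quick check that no variable is missing or still carries that placeholder can save a failed configuration run. This is a minimal sketch: the function name check_site_info is hypothetical, and it assumes plain NAME=value lines as in the listing.

```shell
#!/bin/sh
# Report variables that are missing from a site-info.def, or that are still
# set to the placeholder value "x". Returns non-zero if anything is reported.
check_site_info() {
    file=$1; shift
    status=0
    for var in "$@"; do
        if ! grep -q "^${var}=" "$file"; then
            echo "missing: $var"; status=1
            continue
        fi
        # Keep the last assignment if the variable is defined more than once.
        value=$(sed -n "s/^${var}=//p" "$file" | tail -n 1)
        if [ "$value" = "x" ]; then
            echo "placeholder: $var"; status=1
        fi
    done
    return $status
}
```

For example: check_site_info /opt/glite/yaim/site-info.def NAGIOS_HOST NAGIOS_DB_PASS MYSQL_PASSWORD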

Appendix 5

[root@cclcgvmli03 ~]# cat /opt/glite/yaim/examples/edgusers.conf
11151:${DPMMGR_USER}:11151:${DPMMGR_GROUP}:DPM user:
11152:${EDG_USER}:11152,11156:${EDG_GROUP},${INFOSYS_GROUP}:EDG user:${EDG_HOME_DIR}
11153:${EDGINFO_USER}:11153,1156:${EDGINFO_USER},${INFOSYS_GROUP}:EDG info user:${EDGINFO_HOME_DIR}
11154:${RGMA_USER}:11154,1156:${RGMA_GROUP},${INFOSYS_GROUP}:RGMA user:${INSTALL_ROOT}/glite/etc/rgma
11155:${GLITE_USER}:11155:${GLITE_GROUP}:gLite user:${GLITE_HOME_DIR}
11156:${BDII_USER}:11158:${BDII_GROUP}:BDII user:${BDII_HOME_DIR}
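
Since these UIDs and GIDs are edited by hand, a consistency check can catch typing slips. The sketch below assumes the line format UID:LOGIN:GID[,GID...]:GROUP[,GROUP...]:DESCRIPTION:HOME, inferred from the listing above; the function name inconsistent_groups is hypothetical. It reports any group name that is paired with more than one GID.

```shell
#!/bin/sh
# Print every group name that is paired with more than one GID.
# Assumed line format: UID:LOGIN:GID[,GID...]:GROUP[,GROUP...]:DESCRIPTION:HOME
inconsistent_groups() {
    awk -F: '
        /^[ \t]*#/ || /^[ \t]*$/ { next }   # skip comments and blank lines
        {
            n = split($3, gids, ",")
            split($4, names, ",")
            for (i = 1; i <= n; i++) {
                if ((names[i] in seen) && seen[names[i]] != gids[i])
                    print names[i] ": " seen[names[i]] " vs " gids[i]
                else
                    seen[names[i]] = gids[i]
            }
        }
    ' "$1"
}
```

Run against the listing above, it would flag ${INFOSYS_GROUP}, which appears with both 11156 and 1156; whether that is intentional should be checked against the site's actual group IDs.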