Difference between revisions of "Atlas:Analysis Challenge ST"


Revision as of 15:55, 20 January 2009

Site Stress Test

Procedure

  • Replication of target datasets across the cloud
  • Preparation of the job
  • Generation of n jobs per site (each job processes 1 dataset)
  • Bulk submission to WMS (1 per site)
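The generation step above can be sketched as follows. This is only an illustration of "n jobs per site, one dataset per job, one bulk per site"; the function name, the dictionary layout and the site and dataset names are hypothetical, and the actual Ganga submission call is not shown.

```python
def build_bulk_submissions(site_datasets, jobs_per_site):
    """Group jobs into one bulk per site.

    site_datasets: {site name: [dataset names replicated at that site]}
    jobs_per_site: n, the maximum number of jobs generated per site
    Each job processes exactly one dataset.
    """
    bulks = {}
    for site, datasets in site_datasets.items():
        # One job per dataset, keeping at most n datasets per site.
        bulks[site] = [
            {"site": site, "dataset": ds} for ds in datasets[:jobs_per_site]
        ]
    return bulks


# Hypothetical sites and dataset names, for illustration only.
example = build_bulk_submissions(
    {"SITE_A": ["ds1", "ds2", "ds3"], "SITE_B": ["ds4"]},
    jobs_per_site=2,
)
```

Each value of the returned dictionary corresponds to one bulk submission to the WMS.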

Test conditions

  • The testing framework is Ganga-based. It currently uses the LCG backend, but it will soon be possible to use the PANDA backend as well. Metrics are collected and displayed at http://gangarobot.cern.ch/st/
  • Both POSIX I/O and "copy mode" may be used, allowing a performance comparison of the 2 modes.
  • It uses regular AOD analysis and ATLAS software release 14.2.20
  • Input DS patterns used:
   mc08.*Wmunu*.recon.AOD.e*_s*_r5*tid*
   mc08.*Zprime_mumu*.recon.AOD.e*_s*_r5*tid*
   mc08.*Zmumu*.recon.AOD.e*_s*_r5*tid*
   mc08.*T1_McAtNlo*.recon.AOD.e*_s*_r5*tid*
   mc08.*H*zz4l*.recon.AOD.e*_s*_r5*tid*
   mc08.*.recon.AOD.e*_s*_r5*tid*
  • Input datasets are read from ATLASMCDISK and outputs are stored on ATLASUSERDISK (no special requirements there). Input data access is the main issue; there is no problem with data output.
  • Required CPU time: GlueCEPolicyMaxCPUTime >= 1440 (1 day; typical duration: 5 hours)
  • Jobs run under DN : /O=GermanGrid/OU=LMU/CN=Johannes_Elmsheuser
  • LAN saturation has been observed when the network connection between WN and SE is 1 Gb.
  • It is possible for sites to limit the number of jobs sent at a time.
  • Test duration: 48 hours
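To check whether a given dataset would be selected by the input DS patterns listed above, they can be treated as shell-style wildcards. That interpretation is an assumption, and the dataset name in the example is made up for illustration. A minimal sketch:

```python
from fnmatch import fnmatchcase

# Input DS patterns copied from the list above.
INPUT_DS_PATTERNS = [
    "mc08.*Wmunu*.recon.AOD.e*_s*_r5*tid*",
    "mc08.*Zprime_mumu*.recon.AOD.e*_s*_r5*tid*",
    "mc08.*Zmumu*.recon.AOD.e*_s*_r5*tid*",
    "mc08.*T1_McAtNlo*.recon.AOD.e*_s*_r5*tid*",
    "mc08.*H*zz4l*.recon.AOD.e*_s*_r5*tid*",
    "mc08.*.recon.AOD.e*_s*_r5*tid*",
]


def matching_patterns(dataset_name):
    """Return the patterns (case-sensitive, shell-style) that the name matches."""
    return [p for p in INPUT_DS_PATTERNS if fnmatchcase(dataset_name, p)]


# Hypothetical dataset name, not taken from the test itself.
hits = matching_patterns(
    "mc08.106020.PythiaWmunu_1Lepton.recon.AOD.e352_s462_r541_tid028675"
)
```

Note that the last pattern is a catch-all for any mc08 AOD dataset with an r5* reconstruction tag, so a name that matches one of the first five patterns also matches the sixth.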

Target and metrics

  • Number of jobs: a few hundred up to 1000 jobs per site
  • Rate (evt/s) : up to 15 Hz
  • Success rate (success/failure rate) > 80 %
  • CPU utilization : CPUtime / Walltime > 50 %
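The targets above can be checked from per-site totals roughly as follows. This sketch assumes that the success rate means successful jobs divided by all finished jobs, and that the event rate is events divided by wall time; the function and field names are illustrative and are not part of the test framework.

```python
def evaluate_site(n_success, n_failed, cpu_time_s, wall_time_s, n_events):
    """Compute the metrics listed above and compare them with the targets."""
    total_jobs = n_success + n_failed
    success_rate = n_success / total_jobs if total_jobs else 0.0
    cpu_utilization = cpu_time_s / wall_time_s if wall_time_s else 0.0
    event_rate_hz = n_events / wall_time_s if wall_time_s else 0.0
    return {
        "success_rate": success_rate,
        "cpu_utilization": cpu_utilization,
        "event_rate_hz": event_rate_hz,
        # Targets from the list above: success rate > 80 %, CPUtime/Walltime > 50 %.
        "success_ok": success_rate > 0.80,
        "cpu_ok": cpu_utilization > 0.50,
    }


# Made-up numbers, for illustration only.
report = evaluate_site(
    n_success=850, n_failed=150,
    cpu_time_s=3000.0, wall_time_s=5000.0, n_events=60000,
)
```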

Results and Monitoring

More information

  • Latest news from the ADC development meeting : http://indico.cern./conferenceDisplay.py?confId=48239
    • Using a prestager (which copies the needed files to the WN in the background while processing) improves, on most sites, the CPU/Walltime ratio (up to > 90 %) and the event rate (> 20 Hz). However, lcg-cp is used for the copy...
    • Some tests were performed on the whole LCG cloud, like this one:

http://gangarobot.cern.ch/st/test_105/
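The prestager idea mentioned above (copying the next input files to the worker node in the background while the current one is being processed) can be sketched with a background thread and a bounded queue. This is not the prestager used in the tests: `shutil.copy` stands in for lcg-cp, and the function name and all parameter names are assumptions.

```python
import queue
import shutil
import threading
from pathlib import Path


def run_with_prestager(sources, scratch_dir, process, copy=shutil.copy, depth=2):
    """Process files in order while a background thread copies the next ones.

    copy(src, dst) is a stand-in for the real transfer command;
    depth bounds how many staged files may wait on local disk.
    """
    scratch = Path(scratch_dir)
    ready = queue.Queue(maxsize=depth)
    errors = []

    def stage():
        try:
            for src in sources:
                local = scratch / Path(src).name
                copy(src, local)
                ready.put(local)  # blocks while `depth` files are waiting
        except Exception as exc:  # remember the failure for the main thread
            errors.append(exc)
        finally:
            ready.put(None)  # sentinel: nothing more to stage

    worker = threading.Thread(target=stage, daemon=True)
    worker.start()

    results = []
    while True:
        local = ready.get()
        if local is None:
            break
        results.append(process(local))
        local.unlink(missing_ok=True)  # free scratch space once processed
    worker.join()
    if errors:
        raise errors[0]
    return results
```

The bounded queue keeps the copier from filling the scratch area: it runs at most `depth` files ahead of the processing loop, so the transfer overlaps with the processing of the current file.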

FR-Cloud ST summary (12/08)