Atlas:Analysis Challenge
Revision as of 09:30, 2 December 2008
Goals
- measure "real" analysis job efficiency and turnaround on several sites of a given cloud
- measure data access performance
- check load balancing between different users and different analysis tools (Ganga vs pAthena)
- check load balancing between analysis and MC production
First exercise on the FR Cloud (from December 8th)
Phase 1 : Site stress test run centrally in a controlled manner (2 days)
DA challenges were performed on the IT and DE clouds in October 2008. A proposal has been made to extend this cloud-by-cloud challenge to the FR Cloud. See the ATLAS coordination DA challenge meeting (Nov. 20): http://indico.cern.ch/conferenceDisplay.py?confId=45718
The first exercise will help to identify breaking points and bottlenecks. It is limited in time (a few days) and requires careful attention from site administrators during that period, in particular monitoring of the network (internal and external), disk and CPU. This first try (stress tests) can be run centrally in a controlled manner. The testing framework is Ganga-based. ATLAS coordination (Dan van der Ster and Johannes Elmsheuser) needs to know which sites are to be tested and when.
- See details of Site Stress test (procedure, test conditions and targets) : http://lcg.in2p3.fr/wiki/index.php/Atlas:Analysis_Challenge_ST
- See results : http://gangarobot.cern.ch/st/
- Nov. 28 submission (200 jobs in total on 12 sites): http://gangarobot.cern.ch/st/test_43/
- Planning
  - Dec. 8-9 : first round with Tokyo and possibly GRIF
  - Dec. 14 : stop of MC production
  - Dec. 15-16 : possibly LAPP, CC-IN2P3-T2 (to be confirmed), Tokyo, GRIF, and sites with 1 Gbps LAN : CPPM, NIPNE, LPC (to be contacted)
  - Dec. 17 : restart of MC production
Phase 2 : Pathena Analysis Challenge
- Dec. 17-18 : Data Analysis exercise open to physicists with their favorite applications and tools
- Physicists involved : Julien Donini, Arnaud Lucotte, Bertrand Brelier, Eric Lançon, LAL ?, LPNHE ?
Participation required at cloud and site level. Any site in the Tiers_of_ATLAS list can participate.
Sites may limit the number of jobs sent at a time. The DA team is ready to take site constraints into account and is open to any metrics.
Target and metrics
- Number of events : a few hundred up to 1000 jobs/site
- Rate (evt/s) : up to 15 Hz
- Efficiency (success/failure rate) : 80 %
- CPU utilization : CPUtime / Walltime > 50 %
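As an illustration of how the targets above could be checked for one site, here is a minimal Python sketch. It is not part of the Ganga-based test framework: the `JobRecord` fields and the functions `site_metrics` and `meets_targets` are hypothetical names introduced for this example. It also assumes that "efficiency" means the fraction of submitted jobs that succeed, and that the event rate is averaged over the wall time of successful jobs.

```python
from dataclasses import dataclass


@dataclass
class JobRecord:
    """Hypothetical per-job summary; the field names are illustrative only."""
    succeeded: bool
    events: int          # events processed by the job
    cpu_time_s: float    # CPU time consumed, in seconds
    wall_time_s: float   # wall-clock time, in seconds


def site_metrics(jobs):
    """Return (efficiency, cpu_utilization, event_rate_hz) for one site.

    Assumption: CPU utilization and event rate are computed over
    successful jobs only, as ratios of the summed quantities.
    """
    if not jobs:
        raise ValueError("no jobs to summarize")
    done = [j for j in jobs if j.succeeded]
    efficiency = len(done) / len(jobs)
    wall = sum(j.wall_time_s for j in done)
    cpu = sum(j.cpu_time_s for j in done)
    events = sum(j.events for j in done)
    cpu_utilization = cpu / wall if wall > 0 else 0.0
    event_rate_hz = events / wall if wall > 0 else 0.0
    return efficiency, cpu_utilization, event_rate_hz


def meets_targets(efficiency, cpu_utilization):
    """Check the two pass/fail targets: efficiency 80 %, CPU/wall > 50 %."""
    return efficiency >= 0.80 and cpu_utilization > 0.50


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    sample = [
        JobRecord(True, 1000, 60.0, 100.0),
        JobRecord(True, 2000, 90.0, 100.0),
        JobRecord(False, 0, 0.0, 50.0),
    ]
    eff, cpu_util, rate = site_metrics(sample)
    print(f"efficiency={eff:.2f} cpu/wall={cpu_util:.2f} rate={rate:.1f} Hz")
    print("targets met:", meets_targets(eff, cpu_util))
```

The event rate is reported but not tested against a threshold, since the page gives "up to 15 Hz" as a range rather than a pass/fail limit.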
Results
See
- ATLAS Twiki page : https://twiki.cern.ch/twiki/bin/view/Main/GangaSiteTests
- Results of analysis challenge performed on IT Cloud
- Results of DE Cloud