Atlas

Welcome to the LCG-France Atlas page

General DOMA information

DOMA_FR project

DOMA_FR tests

  • Global transfers from/to each site (transfers where source and destination are the same site are excluded)

    Direction      LAPP    LPSC    CC
    From           Link    Link    Link
    To             Link    Link    Link

  • Data transfer through LHCONE

    Transfer       Sud-East    CC
    Source         [1]         [2]
    Destination
  • Computing activity per job type (Running slots)

    Running slots    LAPP    LPSC    CC
                     [3]     [4]     [5]

  • Data access monitoring as seen by the site WNs

    Destination of access     LAPP    LPSC    CC
    Production download       Link    Link    Link
    Production upload         Link    Link    Link
    Production input          Link    Link    Link
    Production output         Link    Link    Link
    Analysis download         Link    Link    Link
    Analysis direct access    Link    Link    Link

  • Questions
    • Enabling direct access creates much more network usage -> Is it useful for ATLAS (e.g. to process urgent requests faster)? See the sketch after this list.
    • Production_output can only be done to 1 site
    • If direct access to IN2P3-CC is done with xrootd, what is the size of the portal (2 or 10 Gb/s for ATLAS)?
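
To make the first question concrete, here is a minimal sketch comparing the WAN volume pulled by one job in direct-access mode (only the fraction of each file actually read crosses the network) with copy-to-scratch mode (the whole file is transferred, as noted in the Issues list below). It only models the per-job byte count; the extra network load the question refers to also comes from remote reads happening synchronously at job runtime instead of through FTS-smoothed asynchronous transfers. All numbers (file size, read fraction, number of input files) are hypothetical placeholders, not ATLAS measurements.

    # Illustrative only: WAN volume for direct access vs copy-to-scratch.
    # All parameters below are hypothetical placeholders, not ATLAS measurements.

    def wan_volume_gb(n_files, file_size_gb, read_fraction, direct_access):
        """WAN volume (GB) pulled by one job for its input files."""
        if direct_access:
            # Only the fraction of each file actually read crosses the network.
            return n_files * file_size_gb * read_fraction
        # Copy-to-scratch: the whole file is transferred, even if partly read.
        return n_files * file_size_gb

    if __name__ == "__main__":
        n_files, file_size_gb = 10, 3.5          # assumed job input
        for frac in (0.1, 0.5, 1.0):             # assumed read fractions
            direct = wan_volume_gb(n_files, file_size_gb, frac, True)
            copied = wan_volume_gb(n_files, file_size_gb, frac, False)
            print(f"read fraction {frac:.0%}: direct {direct:.1f} GB, copy {copied:.1f} GB")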


  • Issues
    • Asynchronous transfers of input files go to the site closest/fastest relative to the input site instead of to the SE that is read_lan0 for the WN (sending them there would help reduce network occupancy between IN2P3-CC and LAPP/LPSC, since such transfers are smoothed by FTS) -> Request for a change made on 21st September (Panda level)
    • Job brokering should take into account downtimes of remote SEs (issue seen with the IN2P3-CC downtime) -> Request sent by Rod
    • The 10 Gb/s LAPP-CC connection (used for all LAPP WAN transfers to any site) can get saturated if a huge number of jobs start at the same time (no smoothing by Panda) -> No suggestion yet
    • Production_output can only be done to 1 site (more than one destination would be useful if the destination SE is in downtime while the local/remote WN is not) -> No request
    • Users can force analysis jobs to copy files instead of using direct access -> the whole file is transferred instead of a fraction
    • IN2P3-CC is running only 200-300 analysis jobs while the whole site runs 10k: the reason is the different analysis share between T1 (5%) and T2 (25%?)
    • IN2P3-CC presents remote direct read access because read_wan0=srm (see the protocol-selection sketch after this list)
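
The read_lan0/read_wan0 entries above and the AGIS protocol priorities mentioned in the next steps follow the same idea: each storage protocol is declared with a priority per activity (LAN read, WAN read, ...) and the client picks the enabled protocol with the best priority for the activity it performs. The sketch below only illustrates that selection logic with made-up protocol entries; it is not the actual Rucio/AGIS code or schema.

    # Minimal illustration of per-activity protocol priorities (AGIS-style
    # read_lan / read_wan slots). The entries are made up; this is not the
    # Rucio/AGIS schema, only the selection idea.
    from typing import Optional

    # priority 1 = most preferred, 0 = protocol not enabled for that activity
    PROTOCOLS = [
        {"scheme": "root", "priorities": {"read_lan": 1, "read_wan": 2}},
        {"scheme": "srm",  "priorities": {"read_lan": 2, "read_wan": 1}},
        {"scheme": "http", "priorities": {"read_lan": 0, "read_wan": 3}},
    ]

    def pick_protocol(activity: str) -> Optional[str]:
        """Return the scheme with the best (lowest non-zero) priority for an activity."""
        enabled = [p for p in PROTOCOLS if p["priorities"].get(activity, 0) > 0]
        if not enabled:
            return None
        return min(enabled, key=lambda p: p["priorities"][activity])["scheme"]

    if __name__ == "__main__":
        # With the made-up table above, a WAN read ends up on srm, i.e. the
        # "read_wan0=srm" situation described in the last issue.
        print("read_lan ->", pick_protocol("read_lan"))   # root
        print("read_wan ->", pick_protocol("read_wan"))   # srm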


  • Next steps
    • Deploy Rucio 1.17 to use the protocol priorities defined in AGIS (bug in 1.16). Fixing the bug means the most trusted protocol is used, which would help control the decrease of srm usage
    • Monitor job efficiency vs RTT between SE and WN -> Identify when the cache has no impact (assuming no network bandwidth limitation); see the sketch after this list
    • Understand the ATLAS job brokering algorithm
    • Get the typical transfer rate per job type (Johannes)
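
For the two monitoring-oriented next steps (job efficiency vs RTT between SE and WN, and typical transfer rate per job type), a minimal sketch of the aggregation is given below. The record layout and the sample values are hypothetical; real inputs would come from the job/transfer monitoring, not from this page.

    # Sketch: CPU efficiency binned by RTT to the SE, and mean input transfer
    # rate per job type. Records and values are hypothetical placeholders.
    from collections import defaultdict

    jobs = [
        {"type": "production", "cpu_s": 30000, "wall_s": 36000,
         "bytes_in": 8e9, "transfer_s": 1200, "rtt_ms": 0.3},
        {"type": "analysis",   "cpu_s": 4000,  "wall_s": 9000,
         "bytes_in": 2e9, "transfer_s": 900,  "rtt_ms": 8.5},
    ]

    def efficiency_by_rtt(jobs, bin_ms=5):
        """Mean CPU/wall-time efficiency per RTT bin (bin width in ms)."""
        bins = defaultdict(list)
        for j in jobs:
            bins[int(j["rtt_ms"] // bin_ms) * bin_ms].append(j["cpu_s"] / j["wall_s"])
        return {b: sum(v) / len(v) for b, v in sorted(bins.items())}

    def transfer_rate_by_type(jobs):
        """Mean input transfer rate (MB/s) per job type."""
        rates = defaultdict(list)
        for j in jobs:
            rates[j["type"]].append(j["bytes_in"] / j["transfer_s"] / 1e6)
        return {t: sum(v) / len(v) for t, v in rates.items()}

    if __name__ == "__main__":
        print("efficiency per RTT bin (ms):", efficiency_by_rtt(jobs))
        print("transfer rate per job type (MB/s):", transfer_rate_by_type(jobs))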