Difference between revisions of "Atlas:SC4-May07"

An article from lcgwiki.
= T0-T1-T2 transfer tests =
* A new DDM version (called 0.3) is under test. The first goal is to reach a steady transfer rate from CERN (T0) to the T1s.
* A new Arda monitoring page for DDM 0.3 is accessible at this [http://dashb-atlas-data-test.cern.ch/dashboard/request.py/site address].
* A [https://twiki.cern.ch/twiki/bin/view/Atlas/DDMMorningMeetings Twiki page] is maintained for the transfers as a whole.
* [http://lxarda08.cern.ch/dashboard/request.py/site DDM 0.2] (MC production transfers) and [http://dashb-atlas-data-test.cern.ch/dashboard/request.py/site DDM 0.3] (T0-T1-T2 transfers) run concurrently. Both use the same FTS instance in LYON.

== T0-LYON tests ==
=== General remarks ===
* Lyon has participated in the T0-T1 test since the beginning (beginning of May).
* The main problem was Castor stability at CERN.
* Datasets exist for 24 hours. After 24 hours the datasets are deleted, but not the physical files or the LFC entries.
* As a consequence of the previous point, since 16 May, T0->T1 subscriptions of datasets older than 12 hours are removed.
* Files are kept for 8 hours in the dcache disk area (cleaning is managed by the dcache team).
* No tape drive is dedicated to this exercise (the drives are also used for production). Dedicated drives will come with the new robot in LYON (end of June?).

=== DDM Live Plots ===
* Transfer rates to all T1s (MB/s) (Tx->T1): http://dashb-atlas-data-test.cern.ch/dashboard/templates/plots/OVERVIEW.throughput.86400.png
* Transfer rates to all components of the FR cloud: http://dashb-atlas-data-test.cern.ch/dashboard/templates/plots/LYON.throughput.86400.png

=== Network live plots ===
* [http://netstat.in2p3.fr/weathermap/graphiques/lyo-nrd.html CC -> LYON NRD]
* [http://www.renater.fr/supervision/map-Renater4/level0/reseau-map-Renater4.gif LYON -> Paris (Renater network)]
* [http://www.renater.fr/supervision/map-IDF/level0/reseau-map-IDF.gif France -> International (Paris network)]

=== Logbook ===
* Transfers started mid-April
* Status plot with one bin per day (15 May)
[[Image:OVERVIEW.throughput.2592000-150507.png]]
* Status plot with one bin per day (9 June)
[[Image:T0-T1-9june.png]]

== LYON-T2 tests ==
=== General remarks ===
* Only AOD datasets are transferred to T2s. Each dataset includes one file of 3.6 GB.
* A cron script is run from LAPP to request:
** the transfer of datasets to each T2/T3 site
** the deletion of files and LFC entries when the dataset transfer is complete or the dataset has disappeared (operated by CERN)
* Files have to be transferred before the files (in LYON) and the datasets (at CERN) disappear. As a result, from time to time no dataset fulfills these constraints.

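The timing constraint in the last remark can be illustrated with a minimal sketch. It is not the actual cron script run from LAPP. It assumes the 24-hour dataset lifetime at CERN and the 8-hour file lifetime in the LYON dcache disk area quoted in the T0-LYON remarks. The function name, its arguments and the safety margin are hypothetical.

```python
from datetime import datetime, timedelta

# Lifetimes quoted in the T0-LYON general remarks.
DATASET_LIFETIME = timedelta(hours=24)  # dataset lifetime at CERN
FILE_LIFETIME = timedelta(hours=8)      # file lifetime in the LYON dcache disk area


def is_transferable(dataset_created, file_arrived, now,
                    margin=timedelta(hours=1)):
    """Return True if a LYON->T2 transfer can still be requested.

    `margin` is a hypothetical safety margin: the time assumed to be
    needed to complete the transfer before either deadline expires.
    """
    dataset_deadline = dataset_created + DATASET_LIFETIME
    file_deadline = file_arrived + FILE_LIFETIME
    # The transfer must finish before the earlier of the two deadlines.
    return now + margin < min(dataset_deadline, file_deadline)


if __name__ == "__main__":
    now = datetime(2007, 5, 20, 12, 0)
    # Dataset created 6 h ago, file arrived in LYON 4 h ago: still eligible.
    print(is_transferable(datetime(2007, 5, 20, 6, 0),
                          datetime(2007, 5, 20, 8, 0), now))
```

When every candidate dataset fails this check, the script has nothing to request, which matches the situation described in the last remark.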
=== FTS parameters (nstream/nfile) ===
* url mode
** IN2P3-GRIF, IN2P3-CPPM, IN2P3-TOKYO: 10/10
** IN2P3-LAPP, IN2P3-LPC: 10/5
** IN2P3-NIPNE02, IN2P3-NIPNE07: 2/2
* srmcp
** IN2P3-BEIJING: 5/5

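For quick comparisons between sites, the settings above can be written down as a small lookup table. This is only a sketch and not an FTS configuration format. It assumes that nstream is the number of parallel streams per file and nfile the number of files transferred concurrently, so that their product is the maximum number of simultaneous streams towards a site. The names `FTS_CHANNELS` and `total_streams` are hypothetical.

```python
# (mode, nstream, nfile) per site, copied from the list above.
FTS_CHANNELS = {
    "IN2P3-GRIF": ("url", 10, 10),
    "IN2P3-CPPM": ("url", 10, 10),
    "IN2P3-TOKYO": ("url", 10, 10),
    "IN2P3-LAPP": ("url", 10, 5),
    "IN2P3-LPC": ("url", 10, 5),
    "IN2P3-NIPNE02": ("url", 2, 2),
    "IN2P3-NIPNE07": ("url", 2, 2),
    "IN2P3-BEIJING": ("srmcp", 5, 5),
}


def total_streams(site):
    """Maximum simultaneous streams to a site (assumed nstream * nfile)."""
    _mode, nstream, nfile = FTS_CHANNELS[site]
    return nstream * nfile


if __name__ == "__main__":
    for site in sorted(FTS_CHANNELS):
        print(site, total_streams(site))
```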
=== Logbook ===
* 12 May: First tests of LYON->T2. These tests should not affect T0->T2 transfers (dcache load).
* 15 May: Status plot
[[Image:LYON.T2.total_bytes.86400.png]]
** First comments (15 May):
*** CPPM, LAPP, LAL, SACLAY, LPNHE, LPC: OK
*** NIPNE_02, NIPNE_07: seem to have network limitations
*** TOKYO: Many transfers pending with DDM 0.2. The two DDM versions compete heavily with each other.
*** BEIJING: Problem getting the TURL
* Transfers from LYON to T2s are probably affected by the saturation of the line between the Computing Center and the NRD (RENATER node) [http://netstat.in2p3.fr/weathermap/graphiques/lyo-nrd.html link]. All T1->T2 transfers go through this line.
[[Image:Lyo-nrd-monthly-150507.gif]]

* 16 May:
** TURL problem with BEIJING solved. There is now an access rights problem.
** FTS T1->T2 behaves much better when the CC->NRD link is not saturated.

* 17 May:
** LYON->T2 transfers are subscribed but not processed by DDM 0.3. Help requested from the DDM team.

* 21 May:
** LYON->T2 restarted after intervention of the DDM team at CERN.
** Problems: write access in BEIJING; NIPNE_07 does not accept Miguel's certificate.
** Lyon->Paris and Paris->International links busy (no detailed monitoring).

* 22 May:
** Same problems as on previous days (BEIJING, NIPNE_07). For TOKYO, it remains to be understood why normal MC files (AOD, ...) are correctly transferred to TOKYO while AOD T0 files are not (at 3.6 GB, these are at least an order of magnitude bigger than MC files).
** Plots for running sites (vertical axis: MB/s): [[Image:LYON.T2.throughput.86400-22May.png]]

* 24 May:
** Network to foreign countries saturated. This should be solved with new connections in the coming weeks.
* Plots for running sites (one bin per day): [[Image:LYON.T2.throughput.604800-May24.png]]

* 26 May:
** Impossible to write to CPPM and SACLAY: transfers stopped.
** Impossible to delete files on LPNHE and LAL with lcg-del: transfers stopped.
** No saturation of the network to foreign countries, but transfers from LYONDISK to TOKYO/BEIJING are slow (< 1 MB/s). There was no such problem last year. To be understood.
** Running only with LAPP and LPC, at a higher rate (20-30 MB/s with only two sites).

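To put the rates in the 26 May entry into perspective, the following sketch estimates how long one 3.6 GB AOD file takes at a given sustained rate. It assumes decimal units (1 GB = 1000 MB) and that a single file gets the full rate. The function name is hypothetical.

```python
FILE_SIZE_MB = 3.6 * 1000  # one AOD file of 3.6 GB, assuming 1 GB = 1000 MB


def transfer_hours(rate_mb_per_s, size_mb=FILE_SIZE_MB):
    """Hours needed to move `size_mb` megabytes at a sustained rate in MB/s."""
    return size_mb / rate_mb_per_s / 3600.0


if __name__ == "__main__":
    # Rates quoted in the logbook: the < 1 MB/s ceiling and the 20-30 MB/s range.
    for rate in (1.0, 20.0, 30.0):
        print(f"{rate:5.1f} MB/s -> {transfer_hours(rate):.2f} h per file")
```

At the 1 MB/s ceiling reported for TOKYO/BEIJING a single file needs about an hour, so only a few files can be moved within the 8 hours that files stay in the dcache disk area.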
* 9 June:
** Stable period in the previous week for working sites.
* Plots for running sites (one bin per day): [[Image:T1-T2-9June.png]]

* 24 June:
** Although the T0->LYON transfer was resumed, DDM is not able to provide the list of datasets to be transferred to T2s (probably a problem with the central DDM catalog).

Latest revision as of 21:30, 24 June 2007
