Difference between revisions of "ALICE native xrootd"

An article from lcgwiki.

Revision as of 16:16, 20 November 2015

Decommissioning ALICE native xrootd servers and dealing with data loss

Aim of this documentation

The objective of this document is to detail the procedure system administrators should follow when they want to remove a xrootd server (decommissioning) or when they have lost a filesystem on a xrootd server.

This document is based on the mails exchanged on the alice-lcg-task-force@cern.ch list and on the real cases encountered at the GRIF-IPNO site. Costin Grigoras is the author of the different recommendations and tips successfully applied at IPNO.

About the examples

The examples are taken from the IPNO site, where the redirector is ipngridxrd0.in2p3.fr and the xrootd servers are ipngridxrd1, ipngridxrd2, ... On all the xrootd servers the data partition mount points follow the same naming convention: the data partitions are /grid/xrddataX {X=1..8}.

A quick presentation of the xrootd files tree

On each xrootd server there are one or more disk partitions where the data files are stored. There is also a namespace, which is a directory containing the names of the data files: these names are the ones the redirector uses. The name (or file name) itself is a symlink to the real xrootd data file. The namespace can be in a separate partition or in a subdirectory of a data partition.

In the case of IPNO, the namespace is always a subdirectory of the first data partition. Here is an example from one xrootd server:

# df -h|grep xrddata
/dev/sdb1             9.1T  5.6T  3.6T  62% /grid/xrddata1
/dev/sdb2             9.1T  5.6T  3.6T  62% /grid/xrddata2
/dev/sdb3             9.1T  5.6T  3.6T  62% /grid/xrddata3
/dev/sdb4             9.1T  5.6T  3.6T  62% /grid/xrddata4
/dev/sdc1             9.1T  5.6T  3.6T  62% /grid/xrddata5
/dev/sdc2             9.1T  5.6T  3.6T  62% /grid/xrddata6
/dev/sdc3             9.1T  5.6T  3.6T  62% /grid/xrddata7
/dev/sdc4             9.1T  5.6T  3.6T  62% /grid/xrddata8
# 
# ls -ld /grid/xrddata1/namespace
drwxr-xr-x 18 xrootd xrootd 4096 Mar 30  2015 /grid/xrddata1/namespace

The data file %grid%xrddata1%namespace%00%65278%b8f9f574-dd42-11e4-a4e6-63e8b3f6492f in the partition /grid/xrddata6 is recorded in the namespace as /grid/xrddata1/namespace/00/65278/b8f9f574-dd42-11e4-a4e6-63e8b3f6492f :

# ls -lh /grid/xrddata1/namespace/00/65278/b8f9f574-dd42-11e4-a4e6-63e8b3f6492f
lrwxrwxrwx 1 xrootd xrootd 85 Apr  7  2015 /grid/xrddata1/namespace/00/65278/b8f9f574-dd42-11e4-a4e6-63e8b3f6492f ->
 /grid/xrddata6/%grid%xrddata1%namespace%00%65278%b8f9f574-dd42-11e4-a4e6-63e8b3f6492f

# ls -lLh /grid/xrddata1/namespace/00/65278/b8f9f574-dd42-11e4-a4e6-63e8b3f6492f
-rw-rw-r-- 1 xrootd xrootd 3.6M Apr  7  2015 /grid/xrddata1/namespace/00/65278/b8f9f574-dd42-11e4-a4e6-63e8b3f6492f

To access the file in this example, the URL will be root://ipngridxrd0.in2p3.fr:1094//00/65278/b8f9f574-dd42-11e4-a4e6-63e8b3f6492f where ipngridxrd0 is the redirector here. For example to copy the file from a WN:


# xrdcp root://ipngridxrd0.in2p3.fr:1094//00/65278/b8f9f574-dd42-11e4-a4e6-63e8b3f6492f /tmp/xrd_test.dat
[3.594MB/3.594MB][100%][==================================================][3.594MB/s]  
[root@ipngrid90 ~]# ls -lh /tmp/xrd_test.dat
-rw-r--r-- 1 root root 3.6M Nov 18 10:54 /tmp/xrd_test.dat

Some observations:

  • the file name in the namespace contains the GUID of the xrootd data file (ex: b8f9f574-dd42-11e4-a4e6-63e8b3f6492f in the example above)
# basename /grid/xrddata1/namespace/00/65278/b8f9f574-dd42-11e4-a4e6-63e8b3f6492f
b8f9f574-dd42-11e4-a4e6-63e8b3f6492f
  • the xrootd data file name is built from the name in the namespace. In the example above, the xrootd data file name %grid%xrddata1%namespace%00%65278%b8f9f574-dd42-11e4-a4e6-63e8b3f6492f in the directory /grid/xrddata6/ is built from the name 00/65278/b8f9f574-dd42-11e4-a4e6-63e8b3f6492f of the namespace.
  • the xrootd data file name can be arbitrary as long as the symlink in the namespace continues to point to it. So one could do the following (to be avoided in practice, since there is no reason to do it):
# service xrdservices stop
# mv /grid/xrddata6/%grid%xrddata1%namespace%00%65278%b8f9f574-dd42-11e4-a4e6-63e8b3f6492f /grid/xrddata6/testfile.dat
# ln -fs /grid/xrddata6/testfile.dat /grid/xrddata1/namespace/00/65278/b8f9f574-dd42-11e4-a4e6-63e8b3f6492f
# service xrdservices start

Even though the xrootd data file is renamed as /grid/xrddata6/testfile.dat, xrootd will continue to see it as /grid/xrddata1/namespace/00/65278/b8f9f574-dd42-11e4-a4e6-63e8b3f6492f because we updated the symlink. This flexibility will allow the transfer of the data from one xrootd server to another even if the directory tree is not exactly the same on both servers.
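
The naming convention observed above can be reproduced with two small shell helpers. This is only an illustrative sketch: the function names xrd_data_file_name and xrd_url are hypothetical, and the rule (every "/" of the full namespace path replaced by "%") is inferred from the example on this page, not taken from the xrootd documentation.

```shell
#!/bin/sh
# Hypothetical helpers illustrating the naming convention seen in the example.

# Build the data file name from the full namespace path:
# every "/" is replaced by "%" (convention inferred from the example above).
xrd_data_file_name() {
  printf '%s\n' "$1" | tr '/' '%'
}

# Build the access URL from the redirector host, the namespace directory and
# the full namespace path of the file (port 1094 as in the example above).
xrd_url() {
  redirector=$1
  namespace=${2%/}          # drop a trailing "/" if any
  path=$3
  # Stripping the namespace prefix leaves "/00/65278/<GUID>", hence the "//".
  printf 'root://%s:1094/%s\n' "$redirector" "${path#"$namespace"}"
}

# Usage:
#   xrd_data_file_name /grid/xrddata1/namespace/00/65278/b8f9f574-dd42-11e4-a4e6-63e8b3f6492f
#   xrd_url ipngridxrd0.in2p3.fr /grid/xrddata1/namespace \
#           /grid/xrddata1/namespace/00/65278/b8f9f574-dd42-11e4-a4e6-63e8b3f6492f
```

With the file of the example, the first call gives the data file name found in /grid/xrddata6 and the second gives the URL used with xrdcp.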

Decommissioning a xrootd server

You may need to remove a xrootd server for many reasons (old hardware, frequent failures, ...). Before stopping the server and disconnecting it from the network, ALICE should be informed by sending an e-mail to alice-lcg-task-force@cern.ch. The experts from ALICE will tell you what to do to transfer the data elsewhere. There are three possibilities:

  1. you have enough space on another xrootd server on your site to transfer the data to
    • in this case you must copy the data with rsync to this xrootd server
  2. you have enough space available on your SE but no single xrootd server can receive all the data
    • in this case, ALICE will ask you to send the list of the GUIDs and sizes of the files you need to transfer and will manage the transfer to your SE
  3. your SE doesn't have enough space to store the copies of the files
    • in this case, ALICE will ask you to send the list of the GUIDs and sizes of the files you need to transfer and will manage the transfer to the SE of another ALICE site

The procedures in the different cases are detailed below.

Transfer preparation

Put the xrootd server in read-only mode

In any case, before starting the copy of the files, you must stop the xrootd services on the server being decommissioned, remount the partitions in read-only mode and restart the xrootd services.

For example, with 8 xrootd data partitions:

service xrdservices stop
for i in $(seq 1 8); do mount -o remount,ro /grid/xrddata$i; done
service xrdservices start
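
Before going further it can be worth checking that the remount really took effect. The following is a minimal sketch, assuming a Linux host where /proc/mounts lists the device, mount point, filesystem type and mount options in that order; the function name not_readonly is hypothetical.

```shell
#!/bin/sh
# Print the mount points under a given prefix that are NOT mounted read-only.
# Arguments: the mount point prefix and, optionally, a mounts table in the
# /proc/mounts format (device, mount point, fs type, options, ...).
not_readonly() {
  prefix=$1
  mounts=${2:-/proc/mounts}
  awk -v p="$prefix" '
    index($2, p) == 1 {                  # mount point starts with the prefix
      n = split($4, opts, ",")
      ro = 0
      for (i = 1; i <= n; i++) if (opts[i] == "ro") ro = 1
      if (!ro) print $2
    }' "$mounts"
}

# Usage: an empty output means every matching partition is read-only.
#   not_readonly /grid/xrddata
```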

Collecting files GUIDs and other information

If you can copy the files to another xrootd server with rsync, you don't need to collect the GUIDs of the files. But if the transfer is to be done through xrootd itself (from SE to SE), you need to collect the GUIDs of the files and provide them to the ALICE experts.

The script collect_xrootd_files_info.sh will produce 5 files in the subdirectory /var/tmp/GUIDS_$(hostname)_PID, where PID is the process ID of the script.

  • file_names : contains the names to be used in the xrootd URL of a file, when using xrdcp for example
  • guids_and_sizes : contains the GUIDs of the data files plus their sizes
  • missing_files_names : contains the list of missing files (broken symlinks or missing data files)
  • missing_files_guids : contains the GUIDs of the files in missing_files_names
  • name2file_map : contains two columns: first column = entry name in the namespace, second column = full path name of the xrootd data file

Here is the collect_xrootd_files_info.sh script:

# cat collect_xrootd_files_info.sh 
#!/bin/sh
# Collect the GUIDs + file sizes on an xrootd disk server. Also collect the 
# GUIDs of missing files.
# NB: !!! Before launching this script, make sure to first mount 
# the xrootd data partitions in read-only mode

if [ "$#" != "1" ]; then
  echo ""
  echo "Usage: $0 namespace_base_dir"
  echo ""
  echo "Example: $0 /grid/xrddata1/namespace"
  echo ""
  exit 0
fi

NAMESPACE=$1

[ ! -d "${NAMESPACE}" ] && echo "${NAMESPACE}: is not a directory" && exit 1

OUTDIR=/var/tmp/GUIDS_$(hostname)_$$
mkdir -p ${OUTDIR}

echo ""
echo "The result of this script will be stored in files in ${OUTDIR} ..."
echo ""


# Example of entry : ./04/11522/4107d468-a7f6-11df-b283-001e0bd3f44c
FILE_NAMES="${OUTDIR}/file_names"

# Example of entry: 05151ae6-76d7-11e5-aad5-8b87ecfb2d4e 19676060
GUIDS_AND_SIZES="${OUTDIR}/guids_and_sizes" # built from the details (ls -l) of the files

# Broken links: entry (symlinks) in the namespace without a data file
MISSING_FILES_NAMES="${OUTDIR}/missing_files_names"

# GUIDs of missing files
MISSING_FILES_GUIDS="${OUTDIR}/missing_files_guids"

# Name to file mapping: file with two columns: 
# name in the namespace | xrootd file full path name
NAME2FILE_MAPPING="${OUTDIR}/name2file_map"

cd ${NAMESPACE}

# File names from the namespace including missing files (broken links)
find . -type l -print > ${FILE_NAMES}

# Details on data files : ls -l
cat ${FILE_NAMES} | xargs ls -lL > ${GUIDS_AND_SIZES} 2> ${MISSING_FILES_NAMES}

# keep only GUIDs and file sizes
sed -i 's/\// /g' ${GUIDS_AND_SIZES}
cat ${GUIDS_AND_SIZES} | awk '{print $NF " " $5}' > ${GUIDS_AND_SIZES}_tmp
/bin/mv ${GUIDS_AND_SIZES}_tmp ${GUIDS_AND_SIZES}

# Save missing files GUIDs
cp  ${MISSING_FILES_NAMES} ${MISSING_FILES_GUIDS}
sed -i -e 's/\// /g' -e 's/\://g' ${MISSING_FILES_GUIDS}
cat ${MISSING_FILES_GUIDS} | awk '{print $5}' > ${MISSING_FILES_GUIDS}_tmp
/bin/mv ${MISSING_FILES_GUIDS}_tmp ${MISSING_FILES_GUIDS}

# Create name to file mapping
cat ${FILE_NAMES} | xargs ls -l | awk '{print $9 " " $NF}' > ${NAME2FILE_MAPPING}
# Exclude the missing files : broken links 
# Commented out below because it is CPU and memory hungry. The missing files can be 
# skipped another way when using the file ${NAME2FILE_MAPPING} to merge the namespace
#grep -v -f ${MISSING_FILES_GUIDS} ${NAME2FILE_MAPPING} > ${NAME2FILE_MAPPING}_tmp
#/bin/mv ${NAME2FILE_MAPPING}_tmp ${NAME2FILE_MAPPING}

Example:

# date ; sh collect_xrootd_files_info.sh /grid/xrddata1/namespace; date
Thu Nov 19 17:08:15 CET 2015

The result of this script will be stored in files in /var/tmp/GUIDS_ipngridxrd16.in2p3.fr_40663 ...

Thu Nov 19 17:18:10 CET 2015

# cd /var/tmp/GUIDS_ipngridxrd16.in2p3.fr_40663
# ls -lh *
-rw-r--r-- 1 root root  49M Nov 19 17:08 file_names
-rw-r--r-- 1 root root  46M Nov 19 17:17 guids_and_sizes
-rw-r--r-- 1 root root    0 Nov 19 17:17 missing_files_guids
-rw-r--r-- 1 root root    0 Nov 19 17:08 missing_files_names
-rw-r--r-- 1 root root 135M Nov 19 17:18 name2file_map
# 

# wc -l *
  1052454 file_names
  1052454 guids_and_sizes
        0 missing_files_guids
        0 missing_files_names
  1052454 name2file_map

# head -2 file_names
./04/45205/7cb82112-3c56-11e5-9516-23d68bd9df8f
./04/45205/816a4938-2e30-11e2-9cd8-db9bc21ad468

# head -2 guids_and_sizes
0cc502f4-1c43-11e5-b7b2-5f940fb22164 18961963
2444240a-de72-11e4-9879-079c5762f860 936798

# head -2 name2file_map
./04/00196/0cc502f4-1c43-11e5-b7b2-5f940fb22164 /grid/xrddata3/%grid%xrddata1%namespace%04%00196%0cc502f4-1c43-11e5-b7b2-5f940fb22164
./04/00196/2444240a-de72-11e4-9879-079c5762f860 /grid/xrddata3/%grid%xrddata1%namespace%04%00196%2444240a-de72-11e4-9879-079c5762f860
# 
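
The script above derives the list of missing files from the error messages printed by ls, so its result depends on the exact wording of those messages. As a cross-check, the broken symlinks can also be listed directly with find. This is a sketch under that assumption only; the function name missing_guids is hypothetical and it is not part of the original script.

```shell
#!/bin/sh
# List the GUIDs of the namespace entries whose data file is missing,
# i.e. symlinks whose target does not exist (broken links).
# Argument: the namespace base directory.
missing_guids() {
  # "test -e" follows the symlink and fails when the target is absent;
  # sed keeps only the last path component, which is the GUID.
  find "$1" -type l ! -exec test -e {} \; -print | sed 's|.*/||'
}

# Usage:
#   missing_guids /grid/xrddata1/namespace > /var/tmp/missing_guids_check
```

This runs one test process per namespace entry, so on a namespace with about a million entries it will take a while. With GNU find, the -xtype l test should select the same broken links without spawning a process per file.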

About the namespaces merging

After the xrootd data files are copied from server A to server B, the namespace on B must be merged with the namespace of A.

Easy merge

If all the data copied from any partition of A land in a directory with the same name on B, the namespace merging can be done by just copying the files from the namespace of A to the namespace of B. This is what is done at #Copy_with_preserving_absolute_file_names .

For example, let's suppose that 4 partitions A:/grid/xrddata{1,2,3,4} are copied respectively to B:/grid/xrddata{1,2,3,4}. If the namespaces are A:/grid/xrddata1/namespace and B:/grid/xrddata1/namespace, then the merge can be done by copying the files from A:/grid/xrddata1/namespace/ to B:/grid/xrddata1/namespace/ .

Less easy merge

There may be cases where the source and destination directory names differ. For example, suppose you have to do the following file copies:

  • A:/grid/xrddata1 to B:/grid/xrddata1
  • A:/grid/xrddata2 to B:/grid/xrddata7
  • A:/grid/xrddata3 to B:/grid/xrddata8

In this case, before starting the copy, you must collect the file names for each partition, because you will need to create these names in the namespace of B as symlinks pointing to the new location of the copied files. You can save the list of files in a two-column file:

  • first column = the names (symlinks) in the namespace of A
  • second column = real xrootd data file names

You will need this list to update the namespace on server B. See #Copy_when_absolute_file_names_can.27t_be_preserved for the copy and merging procedure.

Copy xrootd data with rsync to another local xrootd server

Suppose that A is the xrootd server being decommissioned and B a xrootd server having enough disk space to receive data from A. In this case you can use rsync to transfer the data from A to B.

If you have N filesystems (mounted partitions) to transfer from A to B, B should have at least N partitions with sufficient space available. Even though it is possible to split a partition of A across more than one partition of B, this should be avoided. It is much easier and safer to copy each partition of A entirely to a single partition of B.

Before starting the copy, you must stop the xrootd services on A, remount the xrootd data partitions in read-only mode and restart the xrootd services (see the section on read-only mode above).

After the data copy, the namespace on the server B must be updated to reflect its new content. The namespace on B can be updated for each copied partition or only after the last partition is copied.

After the namespace on B is up-to-date, the server A can be stopped and disconnected from the network to avoid accidental reboot.

Important notes about the namespace

Once the data are copied from A to B, the namespace on B must be updated (merging the two namespaces). There are many possible cases.

  1. Identical source and destination partition names
    If, for each partition copied from A to B, the partition name is identical on A and B, then the only thing to do after the copy of the data files is to copy the namespace from A to B
  2. Some source and destination partition names are different
    In this case, after the copy of each partition or after the last partition is copied, you must update the namespace on B by a simple copy or by making symlinks.
    • when a data partition name is unchanged during the copy, just copy the concerned file names from the namespace of A to B
    • when a data partition name is changed during the copy, you must recreate the symlinks in the namespace on B and point them to the new paths of the data files.

Copy with preserving absolute file names

This is possible only if you have the same mount point naming convention on A and B. To preserve the xrootd data file names during the copy, you must copy the data from each partition of A to a partition with the same name on B.

On server A:

Allow root ssh connections without a password from A to B if you don't want to type the root password of B each time you run rsync.

  1. ssh-keygen -t rsa
  2. ssh-copy-id root@B

Then :

  1. first copy all the xrootd data files from A to B
  2. merge the two namespaces: copy the file names from the namespace of A to the namespace of B

The following script (to be edited before running it) will do the copy and the namespace merging.

$ cat simple_xrootd_files_copy.sh 
#!/bin/sh
# This script will transfer all the xrootd files from the current 
# server to DEST_SERVEUR
# Ensure all the partitions are mounted read-only before running this script:
#   service xrdservices stop
#   for i in $(seq 1 8); do mount -o remount,ro /grid/xrddata$i; done 
#   service xrdservices start
#
# Here we suppose that the mount point naming conventions are the same
# on both servers and that the copy will preserve the path name of the files:
# the source and destination directories have the same name
# i.e copies are from src_server:/grid/xrddata$i to dest_server:/grid/xrddata$i

# !!!! Start of the area where variables must be adjusted 
DEST_SERVEUR="ipngridxrdB.in2p3.fr"

# Base name of the mount points respecting the same naming conventions on 
# both servers
# Here I have partitions /grid/xrddata1, /grid/xrddata2, ... on each server
SRC_PARTITION_PREFIX="/grid/xrddata"
DEST_PARTITION_PREFIX="/grid/xrddata"

#namespace: directory containing the namespace
SRC_NAME_SPACE="${SRC_PARTITION_PREFIX}1/namespace/"
DEST_NAME_SPACE=${SRC_NAME_SPACE}

# numbers of the partitions to copy
# Example: "1 2" means copy the partitions /grid/xrddata1 and /grid/xrddata2
SRC_PARTITION_NUM="1 2 3 4 5 6"
# !!!! End of the area where variables must be adjusted 

# Excluded files 
# You must not copy the namespace before the xrootd data files, so exclude the
# namespace now and copy it at the end
EXCLUDED_FILES="/tmp/excluded_xfr_$$"
/bin/rm -f ${EXCLUDED_FILES}
cat > ${EXCLUDED_FILES} <<EOF
*.lock
*.fail
DIR_LOCK
namespace
EOF

# Copy the xrootd data files
for n in $SRC_PARTITION_NUM
do
  p="${SRC_PARTITION_PREFIX}$n/"
  echo -n "Starting the copy of partition $p : "; date
  rsync -a --exclude-from=${EXCLUDED_FILES} $p ${DEST_SERVEUR}:$p
  echo -n "End of copy of partition $p: "; date
done

# Copy of the namespace
  echo -n "Starting the copy of the namespace ${SRC_NAME_SPACE} : "; date
  rsync -a ${SRC_NAME_SPACE} ${DEST_SERVEUR}:${DEST_NAME_SPACE}
  echo -n "End of copy of the namespace ${SRC_NAME_SPACE} : "; date


#Cleaning
/bin/rm -f ${EXCLUDED_FILES}

Copy when absolute file names can't be preserved

Suppose that we want to decommission server A having 3 data partitions /grid/xrddata{1,2,3}. We have a Server B with enough free disk space to hold the data from A in the partitions /grid/xrddata{1,7,8} respectively. We see that for two directories the source and destination names differ.

The copies to be done are:

  • copy A:/grid/xrddata1 to B:/grid/xrddata1
  • copy A:/grid/xrddata2 to B:/grid/xrddata7
  • copy A:/grid/xrddata3 to B:/grid/xrddata8

Collect the information on the xrootd files

A solution is to collect all the information about xrootd files on server A by running the script collect_xrootd_files_info.sh (see #Collecting_files_GUIDs_and_other_information).

Once the copy of the data is done, one can process the file name2file_map to extract the list of files copied from each partition of A and use this list to update the symlinks in the namespace of B (#About_the_namespaces_merging).

In our example we have to copy 3 partitions. First we do the copy, then the merge.

The xrootd data files copy

Allow root ssh connections without a password from A to B if you don't want to type the root password of B each time you run rsync.

  1. ssh-keygen -t rsa
  2. ssh-copy-id root@B

On server A, do something like:

# cat copy_xrootd_data_files.sh
#!/bin/sh
EXCLUDED_FILES="/tmp/excluded_xfr_$$"
/bin/rm -f ${EXCLUDED_FILES}
cat > ${EXCLUDED_FILES} <<EOF
*.lock
*.fail
DIR_LOCK
namespace
EOF

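# Run on server A: the source paths are local, only the destination is remote
# (rsync cannot copy between two remote hosts)
rsync -a --exclude-from=${EXCLUDED_FILES} /grid/xrddata1/ B:/grid/xrddata1/
rsync -a --exclude-from=${EXCLUDED_FILES} /grid/xrddata2/ B:/grid/xrddata7/
rsync -a --exclude-from=${EXCLUDED_FILES} /grid/xrddata3/ B:/grid/xrddata8/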

The namespace merging

Now that the data files are copied we can merge the namespaces.

Here is a possible scenario.

1) On server A, prepare the list of files to process per partition

  • From the file name2file_map, build the list of files copied from each individual partition
# for i in 1 2 3; do grep " /grid/xrddata${i}/" name2file_map > /tmp/xrddata${i}.list; done
# scp 

2) Merging the namespace for /grid/xrddata1

The source and destination partition names in rsync were the same, so we can simply copy the namespace entries from A to B, for the concerned files only.

  • On server A:
# awk '{print $1}' /tmp/xrddata1.list > /tmp/xrddata1.names
# cd /grid/xrddata1/namespace
# copy only the names (symlinks) listed in /tmp/xrddata1.names from A:/grid/xrddata1/namespace to B:/grid/xrddata1/namespace
# rsync -a --files-from=/tmp/xrddata1.names . B:/grid/xrddata1/namespace/

3) Merging the namespace for /grid/xrddata2

The files have been moved from the directory /grid/xrddata2 to the directory /grid/xrddata7, and therefore the symlinks in the namespace of B need to be created to reflect the new file locations.

  • On server B:
# sed -i 's/\/grid\/xrddata2/\/grid\/xrddata7/g' /tmp/xrddata2.list

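Then run a loop such as the following one (the names in the list are relative to the namespace directory):

cd /grid/xrddata1/namespace
while read -r name file
do
   # skip the entries whose data file does not exist on B
   [ -e "$file" ] || continue
   mkdir -p "$(dirname "$name")"
   ln -sf "$file" "$name"
done < /tmp/xrddata2.list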

4) Merging the namespace for /grid/xrddata3

The procedure is the same as for /grid/xrddata2.

As this example shows, one should avoid splitting the data of one partition of A across more than one partition of B, because the transfer and the namespace merging become more complicated and this can lead to mistakes.

What to do in case of data loss?

Due to a hardware failure, one can lose a partition and therefore all the xrootd files on this partition. In case of such a data loss, one must collect the GUIDs of the lost files (see the section "Collecting files GUIDs and other information") and inform the experts by sending an e-mail to alice-lcg-task-force@cern.ch. The experts will use the collected GUIDs to find copies of these files on other ALICE SEs. If the site still has enough disk space on its SE, ALICE will transfer the files back to the site SE. If there is no more free space on the site SE, ALICE will transfer the files to an SE of another ALICE site. ALICE will provide a link on Monalisa to follow the transfer progress (example: http://alimonitor.cern.ch/transfers/index.jsp?id=8364).

In our example we will suppose that we have a xrootd server with 3 partitions /grid/xrddata{1,2,3} and that the namespace is under /grid/xrddata1/namespace .


When one or more xrootd disk partitions are lost

Here we suppose that the namespace /grid/xrddata1/namespace is not damaged and that we lost only the partition /grid/xrddata3. From the namespace, we can easily identify the missing files and send their GUIDs to ALICE so that the transfer can be done.

Following is what to do:

# sh collect_xrootd_files_info.sh /grid/xrddata1/namespace

Then send the file "missing_files_guids" to the ALICE experts who will do the file transfer.
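
As a cross-check before sending the list, the GUIDs that were stored on the lost partition can also be extracted from the name2file_map file produced by the same script, since that file records the partition of every data file. The sketch below assumes the two-column format described earlier; the function name guids_on_partition is hypothetical.

```shell
#!/bin/sh
# Print the GUIDs of the namespace entries that point into a given partition.
# Arguments: the partition mount point and the path of a name2file_map file
# (two columns: name in the namespace, full path of the data file).
guids_on_partition() {
  awk -v p="${1%/}/" '
    index($2, p) == 1 {            # data file path starts with the partition
      n = split($1, parts, "/")
      print parts[n]               # last component of the name = GUID
    }' "$2"
}

# Usage:
#   guids_on_partition /grid/xrddata3 name2file_map | sort > xrddata3_guids
```

The trailing "/" added to the mount point keeps /grid/xrddata3 from also matching a partition such as /grid/xrddata30. If only /grid/xrddata3 was lost, the sorted output should contain the same GUIDs as missing_files_guids.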

When all the xrootd disk partitions and the namespace are lost

If you lose all the xrootd disk partitions plus the namespace on a xrootd server, you can no longer collect the GUIDs of the lost files. The steps to deduce the GUIDs of the lost files are:

When only the namespace partition is lost