Decommissioning ALICE native xrootd servers and dealing with data loss

Aim of this documentation

The objective of this document is to detail the procedure that system administrators should follow when they want to remove an xrootd server (decommissioning) or when they have lost a filesystem on an xrootd server.

This document is based on the mails exchanged on the list and on real cases encountered at the GRIF-IPNO site. Costin Grigoras is the author of the recommendations and tips that were successfully applied at IPNO.

About the examples

The examples are taken from the IPNO site, where the redirector is ipngridxrd0 and the xrootd servers are ipngridxrd1, ipngridxrd2, ... On all the xrootd servers the data partition mount points follow the same naming convention: /grid/xrddataX {X=1..8}.

A quick presentation of the xrootd files tree

On each xrootd server there are one or more disk partitions where the data files are stored. There is also a namespace, which is a directory containing the names of the data files: these names are the ones the redirector uses. Each name is a symlink to the real data file. The namespace can be in a separate partition or in a subdirectory of a data partition.

In the case of IPNO, the namespace is always a subdirectory of the first data partition. Here is an example from one xrootd server:

# df -h|grep xrddata
/dev/sdb1             9.1T  5.6T  3.6T  62% /grid/xrddata1
/dev/sdb2             9.1T  5.6T  3.6T  62% /grid/xrddata2
/dev/sdb3             9.1T  5.6T  3.6T  62% /grid/xrddata3
/dev/sdb4             9.1T  5.6T  3.6T  62% /grid/xrddata4
/dev/sdc1             9.1T  5.6T  3.6T  62% /grid/xrddata5
/dev/sdc2             9.1T  5.6T  3.6T  62% /grid/xrddata6
/dev/sdc3             9.1T  5.6T  3.6T  62% /grid/xrddata7
/dev/sdc4             9.1T  5.6T  3.6T  62% /grid/xrddata8
# ls -ld /grid/xrddata1/namespace
drwxr-xr-x 18 xrootd xrootd 4096 Mar 30  2015 /grid/xrddata1/namespace

The data file %grid%xrddata1%namespace%00%65278%b8f9f574-dd42-11e4-a4e6-63e8b3f6492f in the partition /grid/xrddata6 is recorded in the namespace as
/grid/xrddata1/namespace/00/65278/b8f9f574-dd42-11e4-a4e6-63e8b3f6492f :

# ls -lh /grid/xrddata1/namespace/00/65278/b8f9f574-dd42-11e4-a4e6-63e8b3f6492f
lrwxrwxrwx 1 xrootd xrootd 85 Apr  7  2015 /grid/xrddata1/namespace/00/65278/b8f9f574-dd42-11e4-a4e6-63e8b3f6492f -> /grid/xrddata6/%grid%xrddata1%namespace%00%65278%b8f9f574-dd42-11e4-a4e6-63e8b3f6492f

# ls -lLh /grid/xrddata1/namespace/00/65278/b8f9f574-dd42-11e4-a4e6-63e8b3f6492f
-rw-rw-r-- 1 xrootd xrootd 3.6M Apr  7  2015 /grid/xrddata1/namespace/00/65278/b8f9f574-dd42-11e4-a4e6-63e8b3f6492f
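
As the listing above suggests, the name of the data file appears to be its namespace path with every / replaced by %. The following is a minimal sketch of that mapping in both directions. The function names are hypothetical (they are not part of xrootd), and the rule is inferred from this single example, so check it against your own servers before relying on it.

```shell
# Hypothetical helpers, not xrootd tools. Assumption: the data file name
# is the namespace path with '/' replaced by '%', as in the example above.

# Data file (full path or bare name) -> namespace path.
datafile_to_namespace() {
    basename "$1" | tr '%' '/'
}

# Namespace path -> data file name (without the partition directory).
namespace_to_datafile() {
    printf '%s\n' "$1" | tr '/' '%'
}

datafile_to_namespace '/grid/xrddata6/%grid%xrddata1%namespace%00%65278%b8f9f574-dd42-11e4-a4e6-63e8b3f6492f'
namespace_to_datafile '/grid/xrddata1/namespace/00/65278/b8f9f574-dd42-11e4-a4e6-63e8b3f6492f'
```

The first call prints the namespace path of the example and the second prints the data file name. Neither function touches the filesystem.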

To access the file in this example, the URL will be root:// where ipngridxrd0 is the redirector here.
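
Because every namespace entry is a symlink into one of the data partitions, the namespace can be searched to see which entries depend on a given partition, or which entries no longer have a data file behind them. The sketch below is only an illustration of that idea: the function names are hypothetical, -lname is assumed to be available (it is a GNU find extension), and both functions are read-only.

```shell
# Hypothetical helpers, not xrootd tools. Both only read the filesystem.

# List namespace entries whose symlink target lies under a data partition.
# Assumption: GNU find (for -lname).
links_into_partition() {
    namespace_dir=$1
    partition=$2
    find "$namespace_dir" -type l -lname "$partition/*"
}

# List namespace entries whose symlink target no longer exists.
dangling_links() {
    find "$1" -type l ! -exec test -e {} \; -print
}

# Example with the IPNO layout:
# links_into_partition /grid/xrddata1/namespace /grid/xrddata6
# dangling_links /grid/xrddata1/namespace
```

On a large namespace the second function starts one test process per symlink, so it can be slow.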