# Hands-on with EOS and the CTA tape backend

05 February 2020

[Julien Leduc](mailto:julien.leduc@cern.ch) for the CTA team

---

<h2>EOS+CTA <i style="color:blue;">Architecture</i></h2>

Main difference with CASTOR:

<span class="fragment" style="color: dodgerblue"><b>EOSCTA is a pure tape system.</b></span>

<span class="fragment">Disk cache duty is consolidated in the main <b style="color: dodgerblue">EOS instance.</b></span>

<span class="fragment">Operating tape drives efficiently at full speed, full time, requires an <b style="color: crimson">SSD-based buffer.</b></span>

----

<h2>EOS+CTA <i style="color:blue;">Architecture</i></h2>

<img src="https://codimd.web.cern.ch/uploads/upload_e764d94a4ee3ac79c328ea0d21a6a128.svg" class="plain">

---

<h2>EOS+CTA <span class="fragment"><i style="color:blue;">Typical operations</i></span></h2>

<ul>
<li class="fragment">Write file to eoscta buffer <tt class="fragment">xrdcp</tt></li>
<li class="fragment">Is file on tape? <tt class="fragment">xrdfs stat -q BackupExists</tt></li>
<li class="fragment">Queue file for retrieve <tt class="fragment">xrdfs prepare -s</tt></li>
<li class="fragment">Is file in eoscta buffer?
<tt class="fragment">!xrdfs stat -q Offline | xrdfs query prepare</tt></li>
<li class="fragment">Read file from eoscta buffer <tt class="fragment">xrdcp</tt></li>
<li class="fragment">Evict file from eoscta buffer <tt class="fragment">xrdfs prepare -e</tt></li>
<li class="fragment">Delete file from namespace <tt class="fragment">xrdfs rm</tt></li>
</ul>

----

<h2>EOS+CTA <i style="color:blue;">Authentication</i></h2>

<img src="https://codimd.web.cern.ch/uploads/upload_17588d02c4abefb94ecdd58a4cd487fe.svg" class="plain">

---

<h2>EOS+CTA <span class="fragment"><i style="color:blue;">Production instances</i></span></h2>

----

## <span style="color: dodgerblue">EOSCTA</span>

Buffer servers:
- 16x2TB SSDs, 25Gb/s each
- hosting up to **1 EOSCTA instance per server**

Specific *bandwidth-oriented* EOS setup; each server runs:
- **1 EOS MGM**
- **1 EOS NAMESPACE** - *quarkdb*
- **14 EOS DISKSERVERs** - *FSTs*

----

## <span style="color: dodgerblue">EOSCTA</span>

<img src="https://codimd.web.cern.ch/uploads/upload_88f3b03ec6cf37aa8c59787b8909d6f6.svg" class="plain">

----

## <span style="color: dodgerblue">EOSCTA</span>

Deployed instances:

<ul>
<li class="fragment"><b>eosctaatlaspps</b> redundant share of CASTOR writes rule in place</li>
<li class="fragment"><b>eosctacmspps</b> tape endpoint for CMS Rucio instance</li>
<li class="fragment"><b>eosctaalicepps</b> for ALICE</li>
<li class="fragment"><b>eosctarepack</b> for CTA repack activities</li>
<li class="fragment"><b>eosctaatlas</b> ATLAS instance (2020 reprocessing campaign, then production)</li>
</ul>

---

# Play with CTA at home

----

## Kubernetes EOS CTA generic instance

<ul>
<li>Implement a framework based on a <span class="fragment highlight-red">single generic docker image</span>.</li>
<li>Use <span class="fragment highlight-blue">Kubernetes</span> to build an EOS CTA instance out of it.</li>
<li>Flexible enough to <span class="fragment highlight-red">accommodate any supported resource</span> (database,
objectstore, tape library).</li>
<li>Part of CTA code repository: <span class="fragment highlight-red">CI tests are evolving with the tested code</span>.</li>
</ul>

----

## EOS CTA generic k8s instance

<img src="https://codimd.web.cern.ch/uploads/upload_fc9e6f74e0b135d7b4f6438ed8d64e0e.svg" class="plain" height=60%>

---

# Extra slides

---

# Workflows for Archival and Retrieval

----

## Write to tape

```mermaid
sequenceDiagram
    participant Experiment
    participant FTS
    participant EOS
    participant EOSCTA
    participant Tape
    Experiment->>FTS: archive(file)
    activate EOS
    FTS->>EOSCTA: xrdcp EOS:file
    EOS->>+EOSCTA: file
    loop until timeout
        FTS->>EOSCTA: file BackupExists_bit ?
        alt BackupExists_bit=1
            EOSCTA-->>+Tape: file
            deactivate EOSCTA
            Tape->>FTS: file on tape
            FTS->>Experiment: file archival OK
            deactivate Tape
        else BackupExists_bit=0
            activate EOSCTA
            EOSCTA-xFTS: file NOT on tape
            FTS->>-EOSCTA: delete file
            FTS-xExperiment: file archival FAILED
        end
    end
    deactivate EOS
```

----

## Read from tape

```mermaid
sequenceDiagram
    participant Experiment
    participant FTS
    participant EOS
    participant EOSCTA
    participant Tape
    Experiment->>FTS: retrieve(file)
    activate Tape
    FTS->>EOSCTA: xrdfs prepare -s file
    loop until timeout
        FTS->>EOSCTA: file online ?
        alt offline_bit=0
            Tape->>+EOSCTA: file
            activate EOSCTA
            EOSCTA->>FTS: file is online
            FTS->>EOS: xrdcp EOSCTA:file
            EOSCTA->>+EOS: file
            FTS->>EOSCTA: xrdfs prepare -e
            deactivate EOSCTA
            FTS->>Experiment: file retrieval OK
            deactivate EOS
        else offline_bit=1
            EOSCTA-xFTS: file is NOT online
            FTS-xExperiment: file retrieval FAILED
        end
    end
    deactivate Tape
```

---

```graphviz
graph hierarchy {
    nodesep=1 // increases the separation between nodes
    node [color=Red, fontname=Courier, shape=box] // All nodes will have this shape and colour
    edge [color=Blue, label="25Gb/s"] // All the lines look like this

    Router [shape=circle]
    Router--{SwitchBuffer} [label="3x(2x100Gb/s)", fontsize=15, style=bold]
    Router--{SwitchTape} [label="7x20Gb/s", fontsize=15, style=bold]

    subgraph cluster_level1{
        label="EOSCTA Buffer infrastructure\n3x10 hyperconverged servers"
        color=dodgerblue
        fontcolor=dodgerblue
        SwitchBuffer
        SSD01 [color=black, shape=cylinder]
        SSDXX [color=black, shape=cylinder]
        SSD16 [color=black, shape=cylinder]
        buffersrv01
        buffersrvXX--{SSD01 SSDXX SSD16} [label=""]
    }

    subgraph cluster_level2{
        label="Tape infrastructure\nXX tapeservers"
        color=crimson
        fontcolor=crimson
        SwitchTape
        SwitchTape--{tpsrv01 tpsrvXX} [color=Blue, label="10Gb/s"]
        SwitchBuffer--{buffersrv01 buffersrvXX } [color=Blue, label="25Gb/s", style=bold]
        {rank=same; tpsrv01 tpsrvXX} // Put them on the same level
        tape [color=black, shape=Msquare]
        tpsrvXX--tape [label="360MB/s"]
    }
}
```
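
---

## Write to tape: a polling sketch

The "loop until timeout" step of the "Write to tape" workflow can be sketched as a small client-side poller around `xrdfs stat -q BackupExists`. This is only an illustration, not how FTS implements it. The function names, the default timeout and the polling interval are hypothetical, and the code assumes that `xrdfs` exits with a non-zero status when the queried flag is not set.

```python
import subprocess
import time


def xrdfs_command(endpoint, *args):
    """Build an xrdfs command line as an argument list (no shell involved)."""
    return ["xrdfs", endpoint, *args]


def is_on_tape(endpoint, path, run=subprocess.run):
    """Return True if `xrdfs stat -q BackupExists` succeeds for the file.

    Assumption: xrdfs exits non-zero when the queried flag is not set.
    """
    result = run(
        xrdfs_command(endpoint, "stat", "-q", "BackupExists", path),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def wait_for_tape_copy(endpoint, path, timeout_s=3600.0, poll_s=30.0,
                       run=subprocess.run, sleep=time.sleep,
                       clock=time.monotonic):
    """Poll until the file has a tape copy or the timeout expires.

    `run`, `sleep` and `clock` are injectable so the loop can be
    exercised without a real EOSCTA endpoint.
    """
    deadline = clock() + timeout_s
    while True:
        if is_on_tape(endpoint, path, run=run):
            return True   # file archival OK
        if clock() >= deadline:
            return False  # file archival FAILED
        sleep(poll_s)
```

A caller would pass its own endpoint and path, for example `wait_for_tape_copy("root://eoscta.example.org", "/eos/ctaeos/archive/file")` (both values are placeholders). On a `False` result the diagram has FTS delete the file from the buffer, which corresponds to `xrdfs rm` in the list of typical operations.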
{"title":"200105 EOS workshop Hands-on with EOS and the CTA tape backend","description":"Hands-on with EOS and the CTA tape backend","slideOptions":{"transition":"slide","theme":"white"}}