# Hands-on with EOS and the CTA tape backend
05 February 2020
[Julien Leduc](mailto:julien.leduc@cern.ch) for the CTA team
---
<h2>EOS+CTA <i style="color:blue;">Architecture</i></h2>
Main difference with CASTOR: <span class="fragment" style="color: dodgerblue"><b>EOSCTA is a pure tape system.</b></span>
<span class="fragment">Disk cache duty consolidated in main <b style="color: dodgerblue">EOS instance.</b></span>
<span class="fragment">Operating tape drives efficiently at full speed, full time, requires an <b style="color: crimson">SSD-based buffer.</b></span>
----
<h2>EOS+CTA <i style="color:blue;">Architecture</i></h2>
<img src="https://codimd.web.cern.ch/uploads/upload_e764d94a4ee3ac79c328ea0d21a6a128.svg" class="plain">
---
<h2>EOS+CTA <span class="fragment"><i style="color:blue;">Typical operations</i></span></h2>
<ul>
<li class="fragment">Write file to eoscta buffer <tt class="fragment">xrdcp</tt></li>
<li class="fragment">Is file on tape? <tt class="fragment">xrdfs stat -q BackupExists</tt></li>
<li class="fragment">Queue file for retrieve <tt class="fragment">xrdfs prepare -s</tt></li>
<li class="fragment">Is file in eoscta buffer? <tt class="fragment">!xrdfs stat -q Offline | xrdfs query prepare</tt></li>
<li class="fragment">Read file from eoscta buffer <tt class="fragment">xrdcp</tt></li>
<li class="fragment">Evict file from eoscta buffer <tt class="fragment">xrdfs prepare -e</tt></li>
<li class="fragment">Delete file from namespace <tt class="fragment">xrdfs rm</tt></li>
</ul>
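
A minimal sketch of the operations above as command lines, built in Python so they can be printed or handed to `subprocess.run`. The endpoint `root://eoscta.example.org` and both file paths are hypothetical placeholders, and the helper name `typical_operations` is made up for illustration. The buffer check uses the `stat -q Offline` variant from the list: the file is in the buffer when that query does *not* succeed.

```python
import shlex

# Hypothetical values for illustration only, not a real instance.
ENDPOINT = "root://eoscta.example.org"
REMOTE = "/eos/ctaexample/archive/run001.dat"
LOCAL = "/tmp/run001.dat"


def typical_operations(endpoint, remote_path, local_path):
    """Return (label, argv) pairs mirroring the operations on this slide."""
    url = f"{endpoint}/{remote_path}"  # root://host//absolute/path
    return [
        ("write file to buffer", ["xrdcp", local_path, url]),
        ("is file on tape?", ["xrdfs", endpoint, "stat", "-q", "BackupExists", remote_path]),
        ("queue file for retrieve", ["xrdfs", endpoint, "prepare", "-s", remote_path]),
        # Success here means the file is NOT in the buffer (hence the '!' on the slide).
        ("is file offline?", ["xrdfs", endpoint, "stat", "-q", "Offline", remote_path]),
        ("read file from buffer", ["xrdcp", url, local_path]),
        ("evict file from buffer", ["xrdfs", endpoint, "prepare", "-e", remote_path]),
        ("delete file from namespace", ["xrdfs", endpoint, "rm", remote_path]),
    ]


for label, argv in typical_operations(ENDPOINT, REMOTE, LOCAL):
    print(f"# {label}")
    print(" ".join(shlex.quote(arg) for arg in argv))
```

The commands are only printed here; running them requires an XRootD client and valid credentials for the target instance.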
----
<h2>EOS+CTA <i style="color:blue;">Authentication</i></h2>
<img src="https://codimd.web.cern.ch/uploads/upload_17588d02c4abefb94ecdd58a4cd487fe.svg" class="plain">
---
<h2>EOS+CTA <span class="fragment"><i style="color:blue;">Production instances</i></span></h2>
----
## <span style="color: dodgerblue">EOSCTA</span>
buffer servers:
- 16x2TB SSDs, 25Gb/s each
- hosting up to **1 EOSCTA instance per server**

Specific *bandwidth-oriented* EOS setup; each server runs:
- **1 EOS MGM**
- **1 EOS NAMESPACE** - *quarkdb*
- **14 EOS DISKSERVERs** - *FSTs*
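
A back-of-the-envelope sketch of what these numbers imply, assuming decimal units (1 TB = 10^6 MB, 1 Gb/s = 125 MB/s) and the 360MB/s tape drive rate quoted in the network diagram of this deck. It ignores protocol overhead and SSD throughput limits, so the results are upper bounds rather than measurements.

```python
# Figures taken from the slides; unit conventions are an assumption.
SSD_COUNT = 16
SSD_SIZE_TB = 2
NIC_GBIT_S = 25
DRIVE_MB_S = 360

buffer_tb = SSD_COUNT * SSD_SIZE_TB           # raw buffer capacity per server
nic_mb_s = NIC_GBIT_S * 1000 / 8              # network link in MB/s
drives_per_server = nic_mb_s / DRIVE_MB_S     # drives one link could feed at full speed
fill_seconds = buffer_tb * 1e6 / nic_mb_s     # time to fill (or drain) the buffer at line rate

print(f"buffer capacity per server: {buffer_tb} TB")
print(f"network link: {nic_mb_s:.0f} MB/s")
print(f"tape drives fed at full speed: {drives_per_server:.1f}")
print(f"time to fill buffer at line rate: {fill_seconds / 3600:.1f} h")
```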
----
## <span style="color: dodgerblue">EOSCTA</span>
<img src="https://codimd.web.cern.ch/uploads/upload_88f3b03ec6cf37aa8c59787b8909d6f6.svg" class="plain">
----
## <span style="color: dodgerblue">EOSCTA</span>
Deployed instances:
<ul>
<li class="fragment"><b>eosctaatlaspps</b> rule in place for a redundant share of CASTOR writes</li>
<li class="fragment"><b>eosctacmspps</b> tape endpoint for CMS Rucio instance</li>
<li class="fragment"><b>eosctaalicepps</b> for ALICE</li>
<li class="fragment"><b>eosctarepack</b> for CTA repack activities</li>
<li class="fragment"><b>eosctaatlas</b> ATLAS instance (2020 reprocessing campaign, then production)</li>
</ul>
---
# Play with CTA at home
----
## Kubernetes EOS CTA generic instance
<ul>
<li>Implement a framework based on a <span class="fragment highlight-red">single generic docker image</span>.</li>
<li>Use <span class="fragment highlight-blue">Kubernetes</span> to build an EOS CTA instance out of it.</li>
<li>Flexible enough to <span class="fragment highlight-red">accommodate any supported resource</span> (database, objectstore, tape library).</li>
<li>Part of CTA code repository: <span class="fragment highlight-red">CI tests evolve with the code they test</span>.</li>
</ul>
----
## EOS CTA generic k8s instance
<img src="https://codimd.web.cern.ch/uploads/upload_fc9e6f74e0b135d7b4f6438ed8d64e0e.svg" class="plain" height=60%>
---
# Extra slides
---
# Workflows for Archival and Retrieval
----
## Write to tape
```mermaid
sequenceDiagram
participant Experiment
participant FTS
participant EOS
participant EOSCTA
participant Tape
Experiment->>FTS: archive(file)
activate EOS
FTS->>EOSCTA: xrdcp EOS:file
EOS->>+EOSCTA: file
loop until timeout
FTS->>EOSCTA: file BackupExists_bit ?
alt BackupExists_bit=1
EOSCTA-->>+Tape: file
deactivate EOSCTA
Tape->>FTS: file on tape
FTS->>Experiment: file archival OK
deactivate Tape
else BackupExists_bit=0
activate EOSCTA
EOSCTA-xFTS: file NOT on tape
FTS->>-EOSCTA: delete file
FTS-xExperiment: file archival FAILED
end
end
deactivate EOS
```
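
The client side of this diagram can be sketched as a polling loop: copy the file to the buffer, poll the BackupExists bit until a timeout, and delete the file if no tape copy appears. This is a simplified reading of the diagram, not the FTS implementation. The function name, the callables and the default timings are hypothetical; in practice the callables would wrap `xrdcp`, `xrdfs stat -q BackupExists` and `xrdfs rm`.

```python
import time


def archive_and_confirm(copy_to_buffer, backup_exists, delete,
                        timeout_s=3600.0, poll_s=30.0,
                        clock=time.monotonic, sleep=time.sleep):
    """Copy a file to the buffer and wait until it is reported on tape.

    Returns True once backup_exists() is true. If the timeout expires first,
    calls delete() and returns False.
    """
    copy_to_buffer()
    deadline = clock() + timeout_s
    while True:
        if backup_exists():
            return True               # file archival OK
        if clock() >= deadline:
            break                     # timed out: give up
        sleep(poll_s)
    delete()                          # clean up the file that never reached tape
    return False                      # file archival FAILED


# Demo with stand-in callables: the tape copy shows up on the second poll.
answers = iter([False, True])
ok = archive_and_confirm(
    copy_to_buffer=lambda: print("copy file to buffer"),
    backup_exists=lambda: next(answers),
    delete=lambda: print("delete file"),
    sleep=lambda _seconds: None,      # do not actually wait in the demo
)
print("archived:", ok)
```

Injecting `clock` and `sleep` keeps the loop testable without waiting for real timeouts.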
----
## Read from tape
```mermaid
sequenceDiagram
participant Experiment
participant FTS
participant EOS
participant EOSCTA
participant Tape
Experiment->>FTS: retrieve(file)
activate Tape
FTS->>EOSCTA: xrdfs prepare -s file
loop until timeout
FTS->>EOSCTA: file online ?
alt offline_bit=0
Tape->>+EOSCTA: file
activate EOSCTA
EOSCTA->>FTS: file is online
FTS->>EOS: xrdcp EOSCTA:file
EOSCTA->>+EOS: file
FTS->>EOSCTA: xrdfs prepare -e
deactivate EOSCTA
FTS->>Experiment: file retrieval OK
deactivate EOS
else offline_bit=1
EOSCTA-xFTS: file is NOT online
FTS-xExperiment: file retrieval FAILED
end
end
deactivate Tape
```
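
A minimal sketch of the retrieval sequence as seen from the client: request staging, poll until the file is online, copy it out, then evict it from the buffer. It is a simplified reading of the diagram under stated assumptions, not the FTS implementation. The function name, the callables and the default timings are hypothetical; in practice the callables would wrap `xrdfs prepare -s`, an offline-bit check, `xrdcp` and `xrdfs prepare -e`.

```python
import time


def retrieve_from_tape(stage, is_online, copy_out, evict,
                       timeout_s=86400.0, poll_s=60.0,
                       clock=time.monotonic, sleep=time.sleep):
    """Stage a file from tape, copy it out of the buffer, then evict it.

    Returns False without copying or evicting if the file is still offline
    when the timeout expires.
    """
    stage()                               # queue the file for retrieve
    deadline = clock() + timeout_s
    while not is_online():
        if clock() >= deadline:
            return False                  # file retrieval FAILED
        sleep(poll_s)
    copy_out()                            # read the file from the buffer
    evict()                               # free the buffer space
    return True                           # file retrieval OK


# Demo with stand-in callables: the file comes online on the third poll.
states = iter([False, False, True])
ok = retrieve_from_tape(
    stage=lambda: print("request staging"),
    is_online=lambda: next(states),
    copy_out=lambda: print("copy file out of buffer"),
    evict=lambda: print("evict file"),
    sleep=lambda _seconds: None,          # do not actually wait in the demo
)
print("retrieved:", ok)
```

Evicting only after a successful copy matches the order shown in the diagram; a failed retrieval leaves nothing in the buffer to clean up.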
---
```graphviz
graph hierarchy {
nodesep=1 // increases the separation between nodes
node [color=Red, fontname=Courier, shape=box] // All nodes will have this shape and colour
edge [color=Blue, label="25Gb/s"] //All the lines look like this
Router [shape=circle]
Router--{SwitchBuffer} [label="3x(2x100Gb/s)", fontsize=15, style=bold]
Router--{SwitchTape} [label="7x20Gb/s", fontsize=15, style=bold]
subgraph cluster_level1{
label="EOSCTA Buffer infrastructure\n3x10 hyperconverged servers"
color=dodgerblue
fontcolor=dodgerblue
SwitchBuffer
SSD01 [color=black, shape=cylinder]
SSDXX [color=black, shape=cylinder]
SSD16 [color=black, shape=cylinder]
buffersrv01
buffersrvXX--{SSD01 SSDXX SSD16} [label=""]
}
subgraph cluster_level2{
label="Tape infrastructure\nXX tapeservers"
color=crimson
fontcolor=crimson
SwitchTape
SwitchTape--{tpsrv01 tpsrvXX} [color=Blue, label="10Gb/s"]
SwitchBuffer--{buffersrv01 buffersrvXX } [color=Blue, label="25Gb/s", style=bold]
{rank=same; tpsrv01 tpsrvXX} // Put them on the same level
tape [color=black, shape=Msquare]
tpsrvXX--tape [label="360MB/s"]
}
}
```
{"title":"200105 EOS workshop Hands-on with EOS and the CTA tape backend","description":"Hands-on with EOS and the CTA tape backend","slideOptions":{"transition":"slide","theme":"white"}}