---
-----
### **Evaluation of RBD Replication Options @CERN**
-----
Arthur Outhenin-Chalandre
CERN IT Fellow since March 2021
-----
#### 16 June 2021
---
### RBD Ceph @CERN
- 4 RBD clusters for OpenStack Cinder
- ~7 000 volumes (8 PB raw)
- Goal: allow users to replicate volumes between clusters (different data centre / room)
----
### Objective
* Disaster recovery solution for Cinder volumes
  * On a subset of existing images
  * Replicated on a per-image basis
* Integrated with OpenStack
  * OpenStack is the entry point to RBDs for our users
* With minimal client performance impact
* Suitable replication performance
---
### RBD replication in Ceph
* Handled by an `rbd-mirror` daemon that:
  1. reads the state of RBD images on the source cluster
  2. replays it asynchronously on the target cluster
* Two operation modes are supported (setup sketch below):
  * RBD journal
  * RBD snapshots
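
A minimal sketch of how per-image replication is wired up with the `rbd` CLI, assuming a pool named `volumes` and illustrative image and site names (in practice Cinder and our tooling drive this):

```bash
# Mirror the pool in "image" mode: each image opts in explicitly.
rbd mirror pool enable volumes image

# Exchange peer credentials between the two clusters (Octopus bootstrap tokens).
rbd mirror pool peer bootstrap create --site-name site-a volumes > /tmp/peer-token            # on the source
rbd mirror pool peer bootstrap import --site-name site-b --direction rx-tx volumes /tmp/peer-token   # on the target

# Opt an image into one of the two modes (pick one); the rbd-mirror daemon on
# the target cluster then replays its changes asynchronously.
rbd mirror image enable volumes/volume-1234 journal     # journal-based
#rbd mirror image enable volumes/volume-1234 snapshot   # ...or snapshot-based
rbd mirror image status volumes/volume-1234
```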
---
### Testbed setup
* Ceph Octopus
* 6 bare-metal machines
* 60 OSDs: 3 TB HDD for data + SSD for block.db
* 18 OSDs: 180 GB on SSD for the RBD journal
* Up to 5 client machines running multiple `fio` random-write workloads (example job below)
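
Each client ran jobs along these lines; the pool, image name, block size and job count below are illustrative, not the exact test parameters:

```bash
# Random-write workload against an RBD image through librbd (illustrative parameters).
fio --name=rbd-randwrite \
    --ioengine=rbd --clientname=admin --pool=volumes --rbdname=bench-img-1 \
    --rw=randwrite --bs=4k --iodepth=32 --numjobs=4 \
    --time_based --runtime=300 --group_reporting
```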
---
### RBD Journal (1/3)
* The RBD client (`librbd` only) writes data to the image **AND** to the journal
* Chosen first because of full OpenStack support:
  * Replication setup works out of the box
  * Upon disaster, fail over by promoting the other site (sketch below)
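
At the RBD level this amounts to roughly the following (image names are illustrative; Cinder drives these steps through its RBD driver):

```bash
# Enable journaling on the image (requires the exclusive-lock feature);
# librbd then writes every modification to the image AND to its journal.
rbd feature enable volumes/volume-1234 journaling
rbd mirror image enable volumes/volume-1234 journal

# Upon disaster, promote the image on the surviving site so it becomes writable there.
rbd mirror image promote --force volumes/volume-1234
```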
----
### RBD Journal (2/3) - Mirroring Performance
* `rbd-mirror` replays are slow with the default settings:
  * ~30 MB/s per image
  * but throughput scales well with the number of images
* Risk of lagging behind:
  * the replica gets out of date
  * the journal is not trimmed
* Relevant option for tuning (example below):
  * `rbd_journal_max_payload_bytes`
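
The option can be raised so that journal entries are split less aggressively and `rbd-mirror` needs fewer round trips per replayed byte; the value below is only an illustration, not a tested recommendation:

```bash
# Illustrative tuning: let librbd write larger journal entries.
ceph config set client rbd_journal_max_payload_bytes 8388608   # 8 MiB
```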
----
### RBD Journal (3/3) - Client Performance
* "Small writes" (4K) do not suffer much
* "Big writes" (4M) are ~50% slower
 
---
### RBD Snapshots (1/2)
* Point-in-time replication
* The mirror-snapshot diff is exported from the source and imported into the target cluster (sketch below)
* The RBD client is not involved in replication
* Performance impact should only come from:
  * snapshot trimming
  * the replay workload
* Replays are fast out of the box (~200 MB/s per image in our tests)
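
At the RBD level, snapshot-based mirroring looks roughly like this (pool/image names and the schedule interval are illustrative):

```bash
# Enable snapshot-based mirroring for one image.
rbd mirror image enable volumes/volume-1234 snapshot

# Mirror snapshots are taken on demand or on a schedule; rbd-mirror transfers
# the diff between consecutive mirror snapshots to the target cluster.
rbd mirror image snapshot volumes/volume-1234                            # one-off
rbd mirror snapshot schedule add --pool volumes --image volume-1234 1h   # hourly

# Check replication health and progress.
rbd mirror image status volumes/volume-1234
```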
----
### RBD Snapshots (2/2)
* No support in OpenStack (so far):
  * Cinder does not implement this mode
* Contributed [patches upstream](https://review.opendev.org/q/topic:rbd-snapshot-mirroring): review in progress
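
The proposed `cinder.conf` options, for the RBD volume driver and the Ceph backup driver respectively (option names may still change during review):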
```ini
[rbd-rep-1]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
replication_device = ...
rbd_replication_mode = journal/snapshot
```
```ini
[DEFAULT]
backup_driver = cinder.backup.drivers.ceph.CephBackupDriver
backup_ceph_image_snapshot_mirroring = true
```
---
### Conclusion
* Snapshot mirroring is our objective for RBD replication
* We will continue testing, and report or fix bugs and missing features along the way to make it work for us
* So far we have submitted 5 PRs to improve RBD replication observability and stability
---
## Appendix
----
### Benchmark: SSD vs HDD pool
 
 
----
### Performance impact of journaling
 
 
----
### Performance impact of journaling on SSDs
 
 
---
{"type":"slide","slideOptions":{"transition":"slide","theme":"cern5"},"slideNumber":true,"title":"RBD replication","tags":"presentation, Ceph"}