---
type: slide
slideOptions:
  transition: slide
  theme: cern5
slideNumber: true
title: RBD replication
tags: presentation, Ceph
---

-----

### **Evaluation of RBD Replication Options @CERN**

-----

Arthur Outhenin-Chalandre

CERN IT Fellow since March

-----

#### 16 June 2021

---

### RBD Ceph @CERN

- 4 RBD clusters for OpenStack Cinder
- 7 000 volumes (8 PB raw)
- Allow users to replicate between clusters (DC/room)

----

### Objective

* Disaster Recovery solution for Cinder volumes
* On a subset of existing images
* Replicate on a per-image basis
* Integrated with OpenStack
* Entry point to RBDs for our users
* With minimal performance impact
* Suitable replication performance

---

### RBD replication in Ceph

* Handled by an `rbd-mirror` daemon:
  1. Reads the state of RBD images in a source cluster
  2. Replays it asynchronously on the target cluster
* Two operation modes supported:
  * RBD journal
  * RBD snapshot

---

### Testbed setup

* Ceph Octopus
* 6 bare-metal machines
  * 60 OSDs: 3 TB HDD for data + SSD for block.db
  * 18 OSDs: 180 GB on SSD for RBD journals
* Up to 5 client machines running multiple `fio` instances with random-write workloads (an illustrative job is sketched in the appendix)

---

### RBD Journal (1/3)

* The RBD client (`librbd` only) writes data to the image **AND** to the journal
* Chosen first because of full OpenStack support:
  * Setting up replication works out of the box
  * Upon disaster, fail over and promote the other site

----

### RBD Journal (2/3) - Mirroring Performance

* `rbd-mirror` replays are slow with the default settings:
  * ~30 MB/s per image
  * but it scales well with the number of images
* Risk of lagging behind:
  * the replica gets out of date
  * the journal is not trimmed
* Relevant option for tuning: `rbd journal max payload bytes` (illustrative example in the appendix)

----

### RBD Journal (3/3) - Client Performance

* "Small writes" (4K) do not suffer much
* "Big writes" (4M) are ~50% slower

![](https://codimd.web.cern.ch/uploads/upload_861e90360596c581d32285e4e1b7220a.png =470x)
![](https://codimd.web.cern.ch/uploads/upload_0b21dd9811a03a8384e4811b6dd6b482.png =470x)

---

### RBD Snapshots (1/2)

* Point-in-time replication
  * Image snapshot diffs are exported from the source and imported into the target cluster
* The RBD client is not involved in replication
  * Performance impact should only come from:
    * snapshot trimming
    * the replay workload
* Replays are fast out of the box (200 MB/s per image in our tests)

----

### RBD Snapshots (2/2) - Mirror snapshots

* No support in OpenStack (so far):
  * Cinder does not implement this mode
  * Contributed [patches upstream](https://review.opendev.org/q/topic:rbd-snapshot-mirroring): review in progress

```
[rbd-rep-1]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
replication_device = ...
rbd_replication_mode = journal/snapshot
```

```
[DEFAULT]
backup_driver = cinder.backup.drivers.ceph.CephBackupDriver
backup_ceph_image_snapshot_mirroring=true
```
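
----

### RBD Snapshots - example setup

The exact commands are not part of these slides; below is a minimal sketch of enabling snapshot-based mirroring with the standard `rbd` CLI, assuming a hypothetical pool `volumes` and image `vol-0001`:

```
# Assumes the two clusters are already configured as mirroring peers
# (e.g. via "rbd mirror pool peer bootstrap").

# Enable mirroring in image mode on the pool (run on both clusters)
rbd mirror pool enable volumes image

# Enable snapshot-based mirroring for a single image
rbd mirror image enable volumes/vol-0001 snapshot

# Take mirror snapshots periodically so rbd-mirror has diffs to replay
rbd mirror snapshot schedule add --pool volumes --image vol-0001 1h

# Check replication progress
rbd mirror image status volumes/vol-0001
```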
---

### Conclusion

* Snapshot mirroring is our objective for RBD replication
* We will continue our tests and report/fix bugs or missing features along the way to make it work for us
* So far we have submitted 5 PRs to improve RBD replication observability and stability

---

## Appendix

----

### Benchmark: SSD vs HDD pool

![](https://codimd.web.cern.ch/uploads/upload_b58f65688bae76268bce33a9ecec8a9d.png =410x)
![](https://codimd.web.cern.ch/uploads/upload_ca4ca5ff8fc1ad951684366e77bdd0d4.png =410x)
![](https://codimd.web.cern.ch/uploads/upload_febbc86a93e359ef8185c0dbfd36d067.png =410x)
![](https://codimd.web.cern.ch/uploads/upload_9eaa22bedf39ba73aa92f4481cfb444e.png =410x)

----

### Performance impact of journaling

![](https://codimd.web.cern.ch/uploads/upload_63cf5d853df3e4a7c8c004fc8b1c6954.png =410x)
![](https://codimd.web.cern.ch/uploads/upload_5d17c58863f1f5ed2876afbb83be2e90.png =410x)
![](https://codimd.web.cern.ch/uploads/upload_8538429324839bf758c4490abd9b0c30.png =410x)
![](https://codimd.web.cern.ch/uploads/upload_ceeeb436b38c3ece7a1a259595df82de.png =410x)

----

### Performance impact of journaling on SSDs

![](https://codimd.web.cern.ch/uploads/upload_d6450790edc0068910a9e36d2b56e4d7.png =410x)
![](https://codimd.web.cern.ch/uploads/upload_142b5e049056714c4cee43a497e6038d.png =410x)
![](https://codimd.web.cern.ch/uploads/upload_2c574c53e7e22fb2f0e6024e1b9eb5f3.png =410x)
![](https://codimd.web.cern.ch/uploads/upload_2b3f138a31f45b39a3424986c48688c5.png =410x)
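
----

### Appendix: illustrative fio workload

The exact job files are not in these slides; this is a minimal sketch of the kind of random-write workload the clients ran, using fio's `rbd` ioengine (pool, image, and cephx user names are hypothetical; block size matches the 4K "small writes" case):

```
# One random-write job against an RBD image via librbd (illustrative values)
fio --name=randwrite \
    --ioengine=rbd --clientname=admin --pool=volumes --rbdname=bench-img-0 \
    --rw=randwrite --bs=4k --iodepth=32 \
    --time_based --runtime=300
```

----

### Appendix: journal tuning example

`rbd journal max payload bytes` is the tuning option mentioned on the journal mirroring slides; the command and value below are illustrative, not the settings used in our tests:

```
# Raise the maximum journal payload size for clients (example value only)
ceph config set client rbd_journal_max_payload_bytes 8388608
```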
{"type":"slide","slideOptions":{"transition":"slide","theme":"cern5"},"slideNumber":true,"title":"RBD replication","tags":"presentation, Ceph"}