Ceph and RADOS Block Devices (RBD)

Continuing our series of articles on Ceph, a massively scalable open source cloud storage platform, and its components, features, and functionality, we'll be delving a bit deeper into RADOS Block Devices (RBD).

What is a RADOS Block Device?

As we learned in our initial article on Ceph, “What is Ceph”, RADOS is an acronym for Reliable Autonomic Distributed Object Storage.  In essence, RADOS is the portion of Ceph that breaks data down into objects and distributes them across any number of servers, scaling out almost without limit.

So, what is a RADOS block device? 

When storing data in Ceph, you have multiple options.  You can use the librados API, which lets applications running on Linux store and retrieve objects in Ceph directly, with no filesystem layer in between.  You could also choose CephFS, which uses the Ceph MDS (Metadata Server) daemon to map filesystem requests down onto data objects within the array of OSDs and presents a POSIX-compliant, mountable filesystem.
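
As a rough sketch of the librados route, Ceph ships Python bindings (the rados module) that let an application store and fetch objects directly; the pool and object names below are purely illustrative:

    import rados

    # Connect to the cluster using the standard config file
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()

    # Open an I/O context on a pool (pool name is illustrative)
    ioctx = cluster.open_ioctx('rbd')

    # Write and read an object directly -- no filesystem in between
    ioctx.write_full('hello-object', b'Stored directly in RADOS')
    print(ioctx.read('hello-object'))

    ioctx.close()
    cluster.shutdown()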

Those two examples don't really help us, however, if we intend to present Ceph to virtualized environments, like virtual machines.  Ceph, given its high-performance nature and virtually limitless scalability, would seem to be a perfect fit for virtualization.  How do we get Ceph to respond to block-layer requests and form a device that can be used in virtualized environments?  Enter the RADOS Block Device.

A RADOS block device is a thin-provisioned, resizable image stored as objects striped across multiple OSDs in a Ceph cluster.  Through RADOS, Ceph block devices maintain snapshot capability, consistency, and replication across your entire cluster.

RADOS Block Devices, unlike the previous examples we gave, leverage the librbd library to present a block-device abstraction to the operating system.  From this abstraction layer, virtualization engines like KVM can attach RADOS Block Devices directly to their VMs, allowing Ceph to be leveraged for virtualization storage.
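
For a concrete taste of librbd, the Python rbd module wraps the same library that QEMU/KVM links against; the image name and size here are illustrative assumptions:

    import rados
    import rbd

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('rbd')

    # Create a thin-provisioned 10 GiB image; space is only consumed
    # as data is actually written (image name is illustrative)
    rbd.RBD().create(ioctx, 'vm-disk-01', 10 * 1024**3)

    # Open the image and write at an offset, much as a guest's block
    # driver would through the hypervisor
    image = rbd.Image(ioctx, 'vm-disk-01')
    image.write(b'boot sector data', 0)
    image.close()

    ioctx.close()
    cluster.shutdown()

A hypervisor such as KVM can then attach an image like this directly (for example via QEMU's rbd: disk syntax), so the host never needs to map a kernel block device.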

Performance, Reliability, and Failures

Performance

As with other Ceph workloads, the performance of RBD is limited only by the speed of the network and the hardware supporting the Ceph cluster.  Investments in SSDs, NVMe drives, and slower bulk storage can be combined to scale RBD images both up and out in terms of overall capacity and throughput.  Connections from your hypervisors to the Ceph cluster can be made over high-speed 10Gbps, 40Gbps, or 100Gbps Ethernet, or over InfiniBand interconnects.

Reliability

Because RADOS block devices use the same storage and reliability methodology as all other Ceph services, data from a RADOS block device is replicated at the same rate as other block and filesystem objects in the Ceph data pools.  RADOS Block Devices also benefit from the built-in snapshot functionality you would expect from enterprise-class storage systems.  Snapshots and data replication can happen on a scheduled interval, and some virtualization platforms, like Proxmox, even build support for Ceph RBD snapshots into their GUIs.
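
To make the snapshot capability concrete, here is a minimal sketch using the same Python bindings, reusing the illustrative image from earlier; the snapshot name is likewise made up:

    import rados
    import rbd

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('rbd')
    image = rbd.Image(ioctx, 'vm-disk-01')

    # Take a point-in-time snapshot; it is stored and replicated in
    # the cluster like any other data
    image.create_snap('before-upgrade')

    # Enumerate the snapshots that exist for this image
    for snap in image.list_snaps():
        print(snap['name'], snap['size'])

    image.close()
    ioctx.close()
    cluster.shutdown()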

Failures

While the performance of Ceph is undeniable, and the striping and replication of data across OSDs helps to ensure data integrity, many things can still cause an outage within a Ceph cluster.  What do you do in the event of a full halt to writes on the RBD devices backing your virtualization environment?

Ceph failures can be brought on by many things.  As we discussed in our Common Ceph Failure Scenarios article, inadequate hardware, insufficient monitor redundancy, or a miscounted number of OSDs can lead to situations where your Ceph cluster, in order to protect your data, will stop all reads and writes.  Other failure events that can lock your cluster include the following (a brief health-check sketch follows the list):

  1. Network failures
  2. Monitor Service failures
  3. Authentication issues (cephx)
  4. NTP/Time synchronization problems
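
A monitoring script can watch for these conditions by querying the monitors directly; this sketch uses the Python binding's mon_command call and assumes the JSON output format of the standard ceph health command:

    import json

    import rados

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()

    # Ask the monitors for overall cluster health as JSON
    ret, outbuf, errs = cluster.mon_command(
        json.dumps({'prefix': 'health', 'format': 'json'}), b'')

    # 'status' is HEALTH_OK, HEALTH_WARN, or HEALTH_ERR
    print(json.loads(outbuf)['status'])

    cluster.shutdown()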

Unlike other, vastly more expensive solutions, however, Ceph does more to safeguard the underlying data by forcing a full stop to operations during one of these events.  This safeguarding does present certain issues for running virtual machines, though.  When a running virtual machine using a RADOS block device as its backing storage becomes unable to read or write its virtual disk, the virtual machine can crash.  Recovering from these failures, however, is fairly straightforward and usually involves just rebooting the virtual machine once the cluster is healthy again.
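
As a sketch of that recovery path, a script might poll the same health query shown above until the cluster reports HEALTH_OK and then restart the affected guest; the libvirt domain name 'web01' is a made-up placeholder:

    import json
    import subprocess
    import time

    import rados

    def cluster_healthy(cluster):
        # True once the monitors report HEALTH_OK
        ret, outbuf, errs = cluster.mon_command(
            json.dumps({'prefix': 'health', 'format': 'json'}), b'')
        return json.loads(outbuf)['status'] == 'HEALTH_OK'

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()

    # Wait for the cluster to resume I/O before touching the guest
    while not cluster_healthy(cluster):
        time.sleep(30)

    # 'virsh reset' performs a hard reboot of a libvirt guest
    subprocess.run(['virsh', 'reset', 'web01'], check=True)

    cluster.shutdown()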

Regardless of your virtualization storage platform, be it iSCSI, NFS, or Ceph RBD, a sane backup and recovery strategy should be employed to mitigate business risk; nothing is completely without risk.

Conclusion

Ceph RADOS Block Devices are a powerful way to manage storage for virtual machines and other applications.  Ensuring a reliable, stable, and secure Ceph deployment requires attention to every aspect of the cluster deployment, including networking, physical disks, server infrastructure, and operating system configuration.

AMDS Cosmos Engineers are available to assist you in the architecture, design, selection, deployment, and ongoing maintenance of your Ceph cluster, or any related Linux, Windows, or storage project.  With extensive experience in both vendor management and open source software, we can augment your team's existing skill set to help you grow into new technology.