Raw disk performance testing is something you should not overlook when architecting a Ceph cluster. When choosing media for use as a journal or OSD, determining the raw IO characteristics of the disk, when it is used the same way Ceph will use it, is of paramount importance before tens, hundreds or thousands of disks are purchased. The point of this article is to briefly discuss how Ceph handles IO. One important goal is to estimate how far the IO performance seen through Ceph deviates from the raw IO performance of the disk.
TESTING & GRAPHING WITH FIO
For this article I assume you are familiar with FIO. You will need FIO and gnuplot installed to run these tests. I have developed an automation tool in my spare time for writing these tests. You can find it here: Ceph-Disk-Test
RBD can best be simulated by using a block size of 4M in your testing. However, it is also worth testing with smaller IOs such as 64k or 4k to capture the worst case. Below is an example test run with a Samsung Extreme USB stick to demonstrate how the results look using this automation.
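Independently of the automation tool, the block sizes mentioned above can be sketched by hand as a plain FIO job file. This is only an illustration, not the tool's own configuration: the device path /dev/sdX, the job names, the 60-second runtime, and the choice of direct sequential writes are all assumptions to adjust for your environment.

```ini
; Sketch: sequential writes at 4M, 64k and 4k against one raw device.
; WARNING: writing to a raw device destroys any data on it.
; /dev/sdX is a placeholder; replace it with the disk under test.
[global]
filename=/dev/sdX
ioengine=libaio
direct=1
rw=write
iodepth=1
runtime=60
time_based
group_reporting

[bs-4m]
bs=4M
; write a bandwidth log that can later be plotted with gnuplot
write_bw_log=bs-4m

[bs-64k]
; stonewall makes this job wait until the previous job has finished
stonewall
bs=64k
write_bw_log=bs-64k

[bs-4k]
stonewall
bs=4k
write_bw_log=bs-4k
```

Saved under a hypothetical name such as blocksize-sweep.fio, this would be run with `fio blocksize-sweep.fio`. The three jobs run one after another, so their results do not interfere with each other.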
The automation produces nice graphs like this:
Journal IO in Ceph uses the O_DIRECT and O_DSYNC flags. A journal writes with an IO depth of 1 (one IO at a time). However, if you colocate multiple journals on one drive, you should increase the IO depth to the number of journals you plan to colocate. Here is an example FIO test for testing journal performance on a disk:
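A minimal sketch of such a job file follows. The device path /dev/sdX, the job name, the 4k block size, and the 60-second runtime are placeholder assumptions rather than values taken from a specific Ceph configuration.

```ini
; Sketch of a journal-style test: direct, synchronous, sequential writes.
; WARNING: writing to a raw device destroys any data on it.
[journal-test]
; placeholder device; replace with the candidate journal disk
filename=/dev/sdX
; an asynchronous engine, so that an iodepth above 1 actually takes effect
ioengine=libaio
; O_DIRECT: bypass the page cache
direct=1
; open the device for synchronous writes
sync=1
rw=write
; assumed block size for this sketch
bs=4k
; one journal on the device; raise this to the number of colocated journals
iodepth=1
numjobs=1
runtime=60
time_based
group_reporting
```

To approximate, say, four journals sharing one drive, set `iodepth=4` and rerun. Comparing the two results shows how well the device copes with concurrent synchronous writes.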
OSDs use buffered IO, so a performance test must write more data than the test machine has RAM. Otherwise the page cache absorbs most of the writes and the results reflect memory speed rather than disk speed. Here is an example test file for an OSD:
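A minimal sketch of such a test file follows. The device path /dev/sdX, the job name, and the 64G size are placeholder assumptions. The size in particular must be chosen to exceed the RAM of your own test machine.

```ini
; Sketch of an OSD-style test: buffered, sequential writes.
; WARNING: writing to a raw device destroys any data on it.
[osd-test]
; placeholder device; replace with the candidate OSD disk
filename=/dev/sdX
ioengine=psync
; buffered IO through the page cache
direct=0
rw=write
; 4M matches the block size suggested above for simulating RBD
bs=4M
; placeholder: must be larger than the RAM in the test machine
size=64G
; flush dirty pages when the job ends, so cached writes count in the result
end_fsync=1
```

Because the job is bounded by `size` rather than by a runtime, it stops only after the full amount has been written, which keeps the page cache from hiding the true throughput of the disk.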
Research material used to structure these ceph raw disk performance tests:
Ceph IO, The Bad: