The Elusive Miss Alignment
Is it a new Miss election? Well, after doing more than a little testing I figured out I may be MISSing something… So unfortunately this is not about beautiful women, but yet another technical deep dive, this time into misalignment. In theory it is SO easy to point out what the problem is (see Throughput part 3: Data alignment). For this new blog entry I had my mind set on showing the difference between misalignment and alignment in the lab… However, this proved to be much, MUCH harder than anticipated…
Some background
First we need to get our heads around the concept of alignment (or the lack of it). The idea is simple: when I/O is performed on a RAID array with multiple striped members (like RAID0, RAID5, RAID10), data is cut up into chunks. The size of these chunks is often called the “segment size” or “stripe size”.
Because most I/O in a virtual environment is random, you really want every read or write to be performed with minimal impact on the disks. This means you ultimately want only a single disk to seek for each random I/O. That leaves all other disks in the RAID set free to seek for other random I/Os.
So let’s say you want to build a RAID set. Since your VMs perform I/O no bigger than 64KB (usually the case, but not always) you could decide to choose a segment size of 64KB. Every read performed will only hit a single disk in the RAID set, every write will impact a minimum number of disks (mirror or parity segment accesses are required too).
The story so far is simple and straightforward. Without thinking any further, you’d think you’re done by selecting a 64KB segment size. But here is where alignment comes in: for the idea above to work, you need to make sure that your 64KB I/Os are exactly ALIGNED to the segments on the physical disks. You can imagine that if you read a 64KB block, you need to read one segment (when aligned), or the halves of two segments when the block is shifted by 32KB. The latter is what is called misalignment.
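To make the one-segment versus two-segment case concrete, here is a minimal Python sketch. The helper name `segments_touched` and the sample offsets are just made up for illustration; it simply counts which 64KB segments a 64KB I/O lands on, assuming segments are numbered from byte 0 of the array.

```python
KB = 1024

def segments_touched(offset, length, segment_size):
    """Return the indices of the RAID segments covered by one I/O.

    Hypothetical helper: segments are numbered from byte 0 of the array.
    """
    first = offset // segment_size
    last = (offset + length - 1) // segment_size
    return list(range(first, last + 1))

# A 64KB read that starts exactly on a segment boundary.
aligned = segments_touched(offset=128 * KB, length=64 * KB, segment_size=64 * KB)

# The same read shifted by 32KB (the misaligned case).
misaligned = segments_touched(offset=128 * KB + 32 * KB, length=64 * KB, segment_size=64 * KB)

print(aligned)     # [2]
print(misaligned)  # [2, 3]
```

Two segment indices mean two different stripe members, and so two disks that have to seek for a single I/O.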
So in theory misalignment is a bad thing. When reading 64KB in the example above, not one but TWO disks would have to perform a seek action (which in general takes a very long time on physical disks compared to all other actions; see Throughput Part 1: The Basics). The double seek required in this case should degrade performance. Sounds easy enough, so now to actually measure this in a real life situation…
To the Lab!
The first idea appeared rock solid. Take a RAID0 set of two disks (segment size of 16KB), take a Windows 2003 server VM, and create two virtual disks on the RAID0 set. Align one virtual disk to a 64KB (65536 bytes) boundary, and format the other ‘as you normally would’ (but shouldn’t 😉 ), which on Windows 2003 results in a partition that starts at sector 63 (a 32256 byte offset), in other words misaligned. Then simply load the poor RAID0 array with enough random 16KB read workload and notice a difference between the two disks.
The idea was that the misaligned virtual disk would require both physical disks to seek to a certain cylinder for every single random I/O. The aligned disk should need only one physical disk to seek for every single random I/O, leaving the other member free to perform another random read.
Boy did it fail. Both disks showed consistent performance, with hardly any difference between them. I would have expected a noticeable difference… I suspect this is largely because the seek distance between the random reads simply was very small. Back to the drawing board!
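The arithmetic behind that expectation can be sketched in a few lines of Python. This is only a model, not what the controller actually does: it assumes the virtual disk itself starts on a segment boundary, and that the guest issues 16KB reads at 16KB multiples relative to the start of the partition. All names in it are made up for the example.

```python
SECTOR = 512
KB = 1024
SEGMENT = 16 * KB   # segment size of the RAID0 set
IO_SIZE = 16 * KB   # size of each random read

def crosses_boundary(partition_start, io_offset):
    """True if a read at io_offset (relative to the partition) spans two segments."""
    start = partition_start + io_offset
    end = start + IO_SIZE - 1
    return start // SEGMENT != end // SEGMENT

def crossing_fraction(partition_start, n_ios=1000):
    """Fraction of 16KB-aligned guest reads that touch two segments (assumed model)."""
    offsets = [i * IO_SIZE for i in range(n_ios)]
    crossing = sum(crosses_boundary(partition_start, off) for off in offsets)
    return crossing / n_ios

misaligned_start = 63 * SECTOR   # Windows 2003 default: sector 63
aligned_start = 64 * KB          # manually aligned partition

print(misaligned_start)                     # 32256
print(misaligned_start % SEGMENT)           # 15872
print(crossing_fraction(misaligned_start))  # 1.0
print(crossing_fraction(aligned_start))     # 0.0
```

Under these assumptions every single read on the misaligned disk straddles a segment boundary, and none do on the aligned disk, which is why a clear difference seemed like a safe bet.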
To the Lab! – Take Two
After numerous tests I still failed to get a significant difference between a misaligned and an aligned disk. I definitely needed to look at things in a different way. So I performed a new test.
This time I used a (4+1) RAID5 set with a 16KB segment size. I forced the controller to the “Write through” policy, which basically forces each write to go to the physical disks instead of being absorbed by the write cache. A VM performs heavy random writes at the beginning of the RAID set (lower cylinders), on either an aligned or a misaligned virtual disk. Writes are already expensive on a RAID5 set, so misalignment should make the difference even bigger.
A SECOND VM performs heavy writes at the end of the RAID set (higher cylinders). This forces the physical disks to seek across the entire platter, so seeking extra disks will have more impact on the test VM.
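Why misalignment should hurt more for RAID5 writes can be sketched with a simple count of disk operations. The Python below assumes a plain read-modify-write for small writes (read old data and old parity, write new data and new parity); a real controller may well do something smarter, so treat the numbers as an illustration only. The function name `rmw_disk_ops` is hypothetical.

```python
KB = 1024
SEGMENT = 16 * KB
DATA_DISKS = 4  # the (4+1) RAID5 set: four data segments per stripe

def rmw_disk_ops(offset, length):
    """Count disk operations for one write, assuming simple read-modify-write.

    Assumed model: every data segment touched costs one read plus one write,
    and every stripe touched costs one parity read plus one parity write.
    """
    first = offset // SEGMENT
    last = (offset + length - 1) // SEGMENT
    segments = range(first, last + 1)
    stripes = {seg // DATA_DISKS for seg in segments}
    return 2 * len(segments) + 2 * len(stripes)

shift = 63 * 512 % SEGMENT  # 15872 bytes, the sector-63 offset within a segment

print(rmw_disk_ops(0, 16 * KB))                    # 4: aligned write
print(rmw_disk_ops(shift, 16 * KB))                # 6: two segments, same stripe
print(rmw_disk_ops(3 * SEGMENT + shift, 16 * KB))  # 8: two segments, two stripes
```

In this model a misaligned 16KB write costs 6 or 8 disk operations instead of 4, and with the second VM dragging the heads across the platter, each of those extra operations is likely to come with a long seek.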
Finally, success! I managed to get this comparison graph:
Figure: Success at last! A significant difference in write performance between a misaligned virtual disk (left) and a properly aligned disk (right).
Conclusion
I am still amazed by the fact that it was SO hard to actually get a significant difference in performance by (mis)aligning my virtual disks. It makes you wonder if alignment has a measurable impact in real world scenarios… Still not convinced that aligning your disks makes so little difference, I will continue to test this. I guess I need to bring my other RAID box to life and build a bigger RAID5 or RAID0 set with a very small segment size… (maybe 4KB or something).
Any hints and tips on how to maximize the impact of aligning virtual disks are more than welcome; with really bad tuning I would expect to be able to get at least a 30-40% performance degradation when not aligned…