People are talking SO much about VMware View sizing these days. Everyone seems to have their own view on how many IOPS a vDesktop (virtual desktop) really uses. When you're off by a few IOPS times a thousand desktops, things can get pretty ugly. Everyone hammers on optimizing the templates, making sure the vDesktops do not swap themselves to death, etc. But everyone seems to forget a very important aspect…
Where to look
People are measuring all the time. Looking, checking, seeing it fail in the field, going back to the drawing board, sizing things up, trying again. This could happen in an environment where storage does not have a 1-on-1 relation with the disk drives (like when you use SSDs for caching etc). But even in straight RAID5/RAID10 configs I see it happen all the time.
How many times have you heard this one:
“You must always run a PoC, because things just don’t add up”
Well, actually things add up perfectly. Almost everyone just misses one vital step: WHERE DO YOU MEASURE.
The answer is usually simple: “In the desktop”. Seems ok, but then I ask why they are not measuring on the storage itself. The most common response I get is a blank stare.
Teaser alert: Linked Clones
So let’s say we already have XP or Windows 7 VMs we are measuring in. Inside the OS we might see 8-10 IOPS typically, and I’m not even starting on the read-write ratio (that is another blogpost). At the same time, everyone is using linked clones “because you should”, and exactly that is where things go wrong: Almost nobody takes the impact of linked clones on storage into account. This appears to be the “mysterious factor” where things tend to go wrong.
The impact of linked clones on storage I/O
In order to determine the impact a linked clone has on storage I/O, we must first of all determine what a linked clone is. To no surprise, a linked clone behaves exactly like a snapshot. This is basically because a linked clone IS a snapshot, nothing more. There are just multiple snapshots taken from the same base image (in View called the replica). Looking at one of my older blogposts, I discovered I had already done my homework on snapshots: “Performance impact when using VMware snapshots“. In this post is a very clear conclusion on the impact of VMware snapshots on the underlying storage:
“The simple fact of having a snapshot to a VM means that all Read Operations of this VM grow with a factor 2 (!). On top of that, when the VM writes a block that was not written to the snapshot before you also need twice the Write Operations on that VM. This is due to the fact that VMware has to update the table of “where to find what block” (snapshot or base disk). This impacts performance significantly!”
In case you still wonder why VMware once came up with 6 IOPS per desktop: This is exactly why. Yes, a tuned desktop might use 6 IOPS on average, but the same desktop running from a linked clone will use at least twice that amount of IOPS on the array. Exactly the 12-14 IOPS everyone uses in their sizings!
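The sizing math above can be sketched in a few lines. The numbers are illustrative (6 IOPS per tuned desktop, a flat 2x linked-clone factor as argued above); in practice only the reads are always doubled and writes are doubled on first touch, so 2x is the pessimistic-but-safe factor:

```python
# Illustrative sizing sketch: what the array sees vs. what the guest sees.
guest_iops = 6.0          # average IOPS measured INSIDE a tuned vDesktop
desktops = 1000
linked_clone_factor = 2   # reads double; first-touch writes double too

naive_total = guest_iops * desktops                        # sized in the guest
actual_total = guest_iops * linked_clone_factor * desktops # seen on the array

print(naive_total)   # 6000.0
print(actual_total)  # 12000.0
```

Size for 6000 IOPS, get hit with 12000: that is the gap a PoC keeps "mysteriously" turning up.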
If you previously did not account for the impact of linked clones on storage sizing: Be afraid and check the facts: Using VMware View’s linked clones multiplies storage read operations by a factor TWO. Period. If you did not account for this, and did not run a PoC where the “mysteriously” higher reads were spotted and corrected for, things are likely to get ugly.
Writes are impacted too, especially when you “refresh” a number of vDesktops. This is because refreshing means building a new linked clone from scratch, which in turn means a growing snapshot file, which in turn means double write operations (VMware needs to write the pointer to the data in the snapshot file and the data itself). Especially when you run your linked clones on RAID5, the impact on write performance can be much bigger than you might have expected or calculated. The randomness of these writes on the storage array will almost never deliver a full-stripe write, which means you end up with a write penalty of 4, or effectively even 8 (!!) if the block being written was not part of the linked clone yet.
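To make the penalty arithmetic concrete, here is a small sketch of the backend (disk) cost of random small writes. It assumes the classic read-modify-write behavior for non-full-stripe RAID5 writes (penalty 4, versus 2 for RAID10), and models the linked-clone first-touch case as two frontend writes (the data plus the pointer update) per guest write; the function name and numbers are illustrative:

```python
# Backend disk operations generated by random small frontend writes.
# raid_penalty: 4 for RAID5 read-modify-write, 2 for RAID10 mirroring.
# snapshot_first_touch: block not yet in the linked clone, so VMware
# writes both the data and the pointer/metadata update (2 frontend writes).
def backend_write_ops(frontend_writes, raid_penalty, snapshot_first_touch=False):
    factor = 2 if snapshot_first_touch else 1
    return frontend_writes * factor * raid_penalty

print(backend_write_ops(100, 4))                             # plain RAID5: 400
print(backend_write_ops(100, 4, snapshot_first_touch=True))  # RAID5 + first touch: 800
print(backend_write_ops(100, 2))                             # RAID10: 200
```

So 100 guest writes per second landing on not-yet-written blocks of a RAID5-backed linked clone can cost 800 disk operations, which is the effective penalty of 8 mentioned above.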