Posts Tagged ‘VMware’

No COS NICs have been added by the user – solved

While setting up UDA 2.0 (beta 14) for a customer so they can reinstall their 50+ VMware servers, I stumbled upon this message. The install would hang briefly, then proceed to a “press any key to reboot” prompt. Not too promising…

After searching the internet I found a lot of blog entries on exactly this error, but no useful hints or tips that would solve my problem. I had been checking the disk layout over and over again to make sure no mistakes were made there. I was starting to pull my hair out, because it did work previously.

Then I started thinking: the customer in question has multiple PXE servers in the same network, and special DHCP entries were created for all vmnic0 MAC addresses so that options 66 and 67 could be set to point to the UDA appliance. I suspect their DHCP server denies leases to any MAC address unknown to it, because right before the “press any key to reboot” I saw something pass by along the lines of “unable to obtain a dynamic address”. I figured that during the initial setup, the kickstart part tries to get a DHCP address using the Service Console virtual NIC (which has a different MAC address each time you reinstall). So I tried to alter the “Kernel option command-line” from this:

ks=http://[UDA_IPADDR]/kickstart/[TEMPLATE]/[SUBTEMPLATE].cfg initrd=initrd.[OS].[FLAVOR] mem=512M



to include static IP data:



ks=http://[UDA_IPADDR]/kickstart/[TEMPLATE]/[SUBTEMPLATE].cfg initrd=initrd.[OS].[FLAVOR] mem=512M ksdevice=vmnic0 ip=[IPADDR] netmask=255.255.255.0 gateway=10.11.12.254 dns=10.11.12.13



This appears to have done the trick. To be precise, the “No COS NICs have been added by the user” warning itself is not gone: it still appears, but it no longer stops the installation, and the install now continues past it. I am still unsure what the warning actually means…

Throughput part 3: Data alignment


A lot of people have discovered yet another excuse for why their environment is not quite performing as it should: misalignment. Ever since a VMware document stated that misalignment could potentially cost you up to 60% of your performance, it has become a convenient scapegoat. When you look closer, the impact is often nearly negligible, but sometimes substantial. Why is this?



Introduction

It comes up more and more in VMware environments today: “You should have aligned the partition. No wonder performance is bad.” But what exactly is misalignment, and is it really that devastating in a normal environment? The basic idea is rather simple. In RAID arrays, there is a certain segment size (see Throughput part 2: RAID types and segment sizes). Data is striped across all members of a RAID volume (a set of disks strung together to perform as one big unit). Especially when performing random I/O (and most VMware environments do), you want only a single disk to have to perform a track seek in order to get a block of data. So if your segment size on disk is 64KB and you read a block of 64KB, only one disk has to seek for the data. That is, IF you aligned your data. If somewhere in between the data is not aligned with the segments on disk, you would possibly have to read two segments, because each segment carries part of the block to be read (or written for that matter). Exactly that is called misalignment.

In most VMware environments, there are two “layers” between your VM data and the segments on disk: the VMFS and the file system inside your virtual disk. Since ESX 3.x, VMware delivers 64KB alignment of the VMFS, so the start of every 64KB VMFS block is aligned to a 64KB segment on the disks lying underneath. As soon as the blocks vSphere accesses get bigger than 64KB, you could call it sequential access, where alignment does not help anymore. For those who might wonder: VMFS block sizes (1MB … 8MB) are not related to the I/O sizes used on disk; VMFS is able to perform I/O on subsets of these blocks.

The second “layer” is more problematic: the guest file system. Under Windows Server 2003 (or earlier) and desktop releases prior to Windows 7, NTFS will misalign by default. I have never understood why, but a default NTFS partition starts at an offset of 32256 bytes, or 63 sectors; after that the actual data starts. Getting NTFS aligned is simple: instead of sector 63, start the partition at sector 128 (or any power of two above that). This is easily done for new virtual disks, but not so easy for existing ones (especially system disks).
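To make the sector math concrete, here is a quick sketch (my own illustration, not anything from VMware) that checks a few partition start sectors against an assumed 64KB segment size on the array:

  SECTOR = 512
  SEGMENT = 64 * 1024   # assumed segment size on the array

  for start_sector in (63, 128, 2048):
      offset = start_sector * SECTOR
      status = "aligned" if offset % SEGMENT == 0 else "misaligned"
      print(f"start sector {start_sector}: offset {offset} bytes -> {status}")

Sector 63 lands at 32256 bytes and never lines up with a 64KB segment; sector 128 (65536 bytes) and a 2048-sector (1MB) start both do.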



Misalignment shown graphically

A lot of people find misalignment hard to understand. A picture says a thousand words, so in order to keep this blog post somewhat shorter: pictures!




Figure 1:    Aligned VMFS and NTFS (how you should want it)

In figure 1, both VMFS and NTFS have been properly aligned, including some alignment space. In effect, for every block read from or written to the NTFS file system, only one segment on the underlying storage is touched. Thumbs up!




Figure 2:    Both VMFS and NTFS misaligned (you should never want this)

A misalignment of both VMFS and NTFS is depicted in figure 2. This is a really undesirable situation. As you can see, accessing an NTFS block requires one VMFS block to be read, sometimes even two (due to NTFS misalignment). But since VMFS is misaligned to the disk segments, every 64KB VMFS block in this example requires access to two segments on disk. This can and will hurt performance. Luckily, VMware spotted this problem relatively early, and from ESX 3.0 and up VMFS alignment happens automagically if you format the VMFS from the VI client.




Figure 3:    Aligned VMFS, but misaligned NTFS (most common situation)

Figure 3 shows the situation I mostly see in the field. VMFS is aligned (because VMFS volumes formatted from the VI client automagically align to a 64KB boundary), but NTFS is misaligned. I see this all the time in Windows 2003 / Windows XP VMs. As you can see in this example, most NTFS blocks touch only a single segment on the physical disk, but some “fall over the edge” of a 64KB segment on disk. Any action performed on those NTFS blocks will result in the reading or writing of TWO segments on the underlying disks. That is the performance impact right there.

You can probably see where this is going: if the segment size on your storage is way bigger than the block size of your VM file system, the impact is not too much of a problem. In the example in figure 3, two NTFS blocks out of every 64 blocks will be impacted, and only for random access (for sequential access your storage cache will fix the problem, since both segments on disk will be read anyway). That is an impact of 1/32nd, or 3.1%. You could possibly live with that…

Now let’s up the stakes. What if your storage array used a really small segment size on physical disk, let’s say 4KB? Take a look at figure 4:




Figure 4:    Aligned VMFS, misaligned NTFS and a small segment size

vSphere generates I/O sized for the best effectiveness: if you have a database which uses 4KB blocks and performs 100% random I/O, you get a situation like in figure 4. Every time you access a 4KB block, VMFS translates this into a 4KB I/O action to your array. Because the NTFS / database blocks are misaligned, EACH access to a 4KB block ends up on TWO disk segments. This impacts performance dramatically (up to 50% if all I/O sizes are 4KB). A similar situation occurs when your database application uses 8KB blocks; in that case every I/O touches three segments on disk instead of two, impacting performance of the disk set by 33% (if all I/O sizes are 8KB).
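If you want to play with these numbers yourself, here is a small back-of-the-envelope sketch (my own, with hypothetical helper names, not anything vSphere or your array actually runs) that counts how many disk segments a block-sized I/O touches for a given segment size and partition offset:

  KB = 1024

  def segments_touched(offset, size, segment):
      # Number of RAID segments a single I/O starting at 'offset' bytes touches.
      first = offset // segment
      last = (offset + size - 1) // segment
      return last - first + 1

  def average_segments(block, segment, partition_offset, samples=10000):
      # Average segments touched per block-sized I/O on a partition that
      # starts 'partition_offset' bytes into the disk.
      total = sum(segments_touched(partition_offset + i * block, block, segment)
                  for i in range(samples))
      return total / samples

  MISALIGNED = 63 * 512    # the classic NTFS start at sector 63
  ALIGNED = 128 * 512      # aligned start at sector 128

  for segment in (64 * KB, 4 * KB):
      a = average_segments(4 * KB, segment, ALIGNED)
      m = average_segments(4 * KB, segment, MISALIGNED)
      print(f"{segment // KB}KB segments: {a:.2f} segments per 4KB I/O aligned, {m:.2f} misaligned")

With 64KB segments the misaligned partition only occasionally touches an extra segment; with 4KB segments every single 4KB I/O lands on two segments, which is where the dramatic impact comes from. The exact percentages depend on the block sizes and offsets in your own environment.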



Why ever use a small segment size?

When you look at an EMC SAN (Clariion), the segment size is fixed at 64KB. When you look at a NetApp, the segment size is fixed at 4KB. It is pretty safe to say that the impact of misalignment will hit harder on a NetApp than on an EMC box. That is probably why NetApp hammers so hard on alignment; in a NetApp environment it really does matter, in an EMC environment a little less.

Looking at it the other way round: why would you ever use such a small segment size? Why not use a segment size of, for example, 256KB, and feast on having only 1/128th, or 0.78%, impact when not aligning? Well, using a large segment size appears to be the solution to misalignment, and in a way it is. But do not forget: every time you need to access 4KB of data, 256KB is accessed on disk. So both yes AND no: a large segment size makes alignment almost a waste of time, but it introduces other problems.

Somewhere, the “perfect segment size” should exist: best of both worlds… The problem is that this perfect segment size varies with the type of load you feed to your SAN. EMC is sure about their 64KB (since it cannot be altered), NetApp seems sure about their 4KB for the very same reason. The el-cheapo parallel-SCSI array I use for my home lab (yes, parallel SCSI indeed, and VMotion works – but that is another story) does a more generic job: for each RAID volume I am allowed to choose my segment size (called a stripe size there). Now THAT gives room for tuning! And room for failure in tuning it, at the same time…



Dedup and misalignment

Now that deduplication is the new hype, misalignment is said to impact dedup effectiveness. The answer to this, as usual, is… it depends. If you take two misaligned Windows 2003 servers from the same template, they deduplicate very effectively, since they are very alike. If you were to align one of them (leaving the second one misaligned), dedup would possibly not find a single block in common. Makes sense, right? Your alignment shifted all data within the VMDK, in effect differentiating all blocks. If you then align the second VM as well (using the same alignment boundary), dedup is once again able to work effectively.

So the final answer should be: If dedup is to be effective, either align ALL VMs, or align NONE.
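As a toy illustration of that point (a hypothetical fixed 4KB block dedup, not how any particular array implements it), the sketch below hashes the same guest data once behind the sector-63 offset and once behind a sector-128 offset; the shifted copy shares almost no block hashes with the original:

  import hashlib
  import os

  BLOCK = 4 * 1024
  payload = os.urandom(1024 * 1024)   # identical guest data inside two VMDKs

  def block_hashes(lead_in_sectors):
      # Hash a simulated disk image in fixed 4KB blocks; the zero-filled
      # lead-in models the partition starting offset.
      image = bytes(lead_in_sectors * 512) + payload
      return {hashlib.sha1(image[i:i + BLOCK]).digest()
              for i in range(0, len(image) - BLOCK + 1, BLOCK)}

  vm1 = block_hashes(63)     # misaligned VM deployed from the template
  vm2 = block_hashes(63)     # second, equally misaligned VM: blocks line up
  vm3 = block_hashes(128)    # the same data after aligning only this VM

  print("both misaligned, blocks in common:", len(vm1 & vm2))
  print("one aligned, one misaligned:      ", len(vm1 & vm3))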



How to get rid of misalignment

Let’s say you’ve found that your VMs are misaligned. If things are really bad, they are sitting on a RAID volume with a very small segment size. Alignment could save the day. So how do you go about it? Here are several solutions I’ve come across:

  1. Manually;
  2. GParted utility;
  3. Use Vizioncore’s vOptimizer;
  4. If you’re a NetApp customer and use ESX (not ESXi), use their alignment tool mbrscan/mbralign;
  5. V2V your VMs using Platespin PowerConvert and align them on the way.


Manual alignment is perfect for data drives. The idea is that you add a second data drive and create an aligned partition on it using diskpart:

  • Open a command prompt, run diskpart;
  • list disk – then select disk x;
  • list volume – then select volume x;
  • create partition primary align=64 (or any power of two above that).


After that, stop whatever service is using your data drive, copy all data, change the drive letters so your new aligned disk matches the old data drive, restart your services, and remove the original data disk from the VM. This works great for SQL, Exchange, file servers, etc. The big downside: you cannot align existing system disks using diskpart (not even from another VM; diskpart’s create partition is destructive).

GParted is a utility that is said to align your partition if you resize the partition using this tool. I never looked into it, but it is worth checking out.

Vizioncore’s vOptimizer is a very nice tool that performs the alignment for you. Basically it shuts down the VM in question and starts to move every block inside your VMDK(s), so you end up with all disks aligned. The VM is then restarted and an NTFS disk check is forced. After that you’re good to go. It has served me well on some occasions! You even get two alignments for free if you decide to give their product a spin.

NetApp customers get an alignment tool for free: mbralign. I have never used this tool, but apparently it does about the same job as vOptimizer: it shuts down your VM, aligns the disks and reboots your VM. It only works on ESX though (it installs software in the Service Console).

If you cannot live with the downtime but need to align anyway, you could consider looking at Platespin products. They can perform a “hot” V2V and align in the process. When the data move is complete, they fail over from the original VM to the newly V2V’ed VM, syncing the final changes to the destination disk(s). You end up with an aligned copy of your VM with minimal downtime.



How to prevent misalignment in the first place

Misalignment is often seen, but it is not necessary at all if you think about it before you start. A lot of people create templates; not too many align their templates… But you could! If you have a VM lying around (misaligned or not), you can attach the empty system disk of the template-to-be to it and create an aligned partition on it from that “helper” VM (see the diskpart description above). Then detach the system disk from the helper VM again and proceed to install Windows on the (now aligned) disk. Choose not to change anything about the partitioning during setup and you are good to go. Bootable XP CDs can also do the same trick here.

Now your template is aligned. The upshot: Any VM deployed from this template is too!

There is an easy way to check under Windows whether your disks are aligned. Simply run msinfo32.exe, expand Components, Storage, Disks, and find the item “Partition Starting Offset”. If it reads 32256, you’re out of luck: your partition is misaligned. If it reads 65536, you have a 64KB-aligned partition. If the value reads 1048576, the partition is aligned on a 1MB boundary (the Windows 2008 / Windows 7 default).
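If you would rather interpret that offset programmatically than by eye, this tiny sketch (again just an illustration of the numbers above) reports the largest power-of-two boundary a given “Partition Starting Offset” sits on:

  def alignment_boundary(offset_bytes):
      # Largest power-of-two boundary (in bytes) the offset is aligned to:
      # isolate the lowest set bit.
      return offset_bytes & -offset_bytes

  for offset in (32256, 65536, 1048576):
      b = alignment_boundary(offset)
      note = "misaligned (sector boundary only)" if b < 4096 else f"aligned to a {b // 1024}KB boundary"
      print(f"{offset}: {note}")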



Conclusion

Is alignment important? Well, it depends. It particularly depends on the segment size used within your storage array: the smaller the segment size, the more impact you get. Bottom line though: alignment always helps! Get off to a good start and perform alignment right from the beginning and you’ll profit ever after. If you didn’t get off to a perfect start, consider aligning your VMs afterwards. Start with the heavy random-I/O data disks for sure, but I would recommend aligning the system disks as well, using one of the tools described above.

Performance impact when using VMware snapshots

It is certainly not unheard of – “When I delete a snapshot from a VM, the thing totally freezes!“. The strange thing is, some customers have these issues, others don’t (or are not aware of them). So what really DOES happen when you clean out a snapshot? Time to investigate!

Test Setup

So how do we test the performance impact on storage while ruling out external factors? The setup I chose uses a VM with the following specs:

Read the rest of this entry »

Throughput part 1: The Basics

As I tackle more and more disk-performance-related issues, I thought it was time to create a series of blog posts about spindles, seek times, latency and all that stuff. For now part 1, which covers the basics. Things like RAID type, rotational speed and seek times basically make up “how fast you will go”. On to the dirty details!

Introduction to physical disks and their behaviour

So what is really important when looking at physical disks and their performance? First and most important, we must look at the storage system parameters in order to reduce disk latencies. To do this properly, we have to take into account the characteristics of the I/O that is being performed. Secondly, we have to look at segment sizes within the chosen RAID types (which in turn follow from the system parameters). Finally, we’ll deep-dive into alignment (which still appears to be misunderstood by a lot of people).
Read the rest of this entry »

Breaking VMware View’s sound barrier with Sun Open Storage (part 1)

A hugely underestimated requirement in larger VDI environments is disk IOPS. A lot of the larger VDI implementations have failed using SATA spindles; with 15K SAS or FC disks you get away with it most of the time (as long as you do not scale up too much). I have been looking at ways to get more done using less (especially in current times, who doesn’t!). Dataman, the Dutch company I work for (www.dataman.nl), teamed up with Sun Netherlands and their testing facility in Linlithgow, Scotland for testing. I got the honour of performing the tests, and I almost literally broke the sound barrier using Sun’s newest line of Unified Storage: the 7000 series. Why can you break the sound barrier with this type of storage? Watch the story unfold! For now part one… the intro.

What VMware View offers… And needs

Before a performance test even came to mind, I started to figure out what VMware View offers, and what it needs. It is obvious: View gives you linked-clone technology. This means that only a few full clones (called replicas) are read by a lot of virtual desktops (or vDesktops, as I will call them from now on) in parallel. So what would really help push the limits of your storage? Exactly: a very large cache, or solid-state disks. Read the rest of this entry »

esxtop advanced features

No rocket science here; esxtop has always been there. Yet a lot of people miss out on some of its great features. Hopefully this blog post will get you interested in looking at esxtop (again?) in detail!

Yesterday I attended a very interesting breakout session about esxtop and its advanced features in vSphere. Old news, you might say, but there is SO much you can do with esxtop. For example, you can export data from esxtop and import it into Windows perfmon. And if you already knew that, did you know you can now actually see which physical NIC is being used by a certain VM?

Other neat little features were shown as well. The best one: the “swcur” field is actually NOT about the current swapping activity of a VM, but about swapping that occurred in the past (yes, I too would have called it differently…). How many of you knew that one? Finally, there is a very interesting field in the storage screen (yes, for those who did not know, esxtop is not just about CPU, but also memory, storage and, new in vSphere… interrupts). This field is called “DAVG”, and it shows the actual latency seen by ESX to your storage (there is also KAVG for kernel latency and GAVG for the total latency the guest sees).

There were also a few examples of misbehaving VMs, which were very interesting to see: numbers which seemed impossible, yet were explained perfectly. I would like to vote this very last presentation at VMworld 2009 the best technical presentation I witnessed there!

I hope I got you (re)interested in esxtop. I am more of a graphical guy, so I like the performance monitor embedded within the VI client. But some things just aren’t there. So esxtop is definitely worth a(nother) look. If you’re using ESXi, make sure to download the vMA appliance, which has resxtop included (which looks a lot like esxtop on ESX).

Just for Fun – VMware just got greener

So what do you get when you mix VMware ESX with some dirt, and then add a little enthusiasm? Exactly: you get a paludarium.

The word paludarium comes from the Latin word “palus”, basically meaning mud or marsh, and it is kind of a cross between an aquarium and a terrarium. I have been building my own little world inside this glass box for the past few months. Very moist, very green. Being a VMware fan, I just had to combine these two hobbies. Why? Well, that one is obvious: because you *can*!

VM controlled paludarium

So now my tiny little jungle is fully controlled by a virtual machine. Lighting, rain, fog, even thunder! Just when everyone thought it wasn’t possible, VMware just got greener!

See my paludarium site at http://paluweb.nl for some live stats!

Long distance VMotion a fact

Today it was announced that long distance VMotion is now officially supported by VMware, up to a distance of 200 kilometers. A team-up of Cisco, VMware and EMC did some tests, proving the possibilities. Long distance VMotion is basically VMotioning between two remote datacenters, enabling follow-the-sun, follow-the-moon, or evacuating a datacenter in anticipation of a soon-to-come disaster (“the tornado is coming”).

Of course some limitations apply. Things like a maximum latency of 5 ms round trip and a minimum bandwidth of 622 Mbit/s, but still! Long distance VMotion is a fact, and I guess it will soon be accepted as an enterprise solution just like regular VMotion has been.

esXpress uses vStorage API for detecting changed blocks

Today at VMworld 2009 I joined a breakout session presented by PHD Virtual about their latest version of esXpress (3.6). Great stuff once again! Apart from the fact that esXpress is now fully functional on vSphere (still no ESXi support though), they also managed to use the vStorage API for “changed block reporting”. Basically what this means is that when you are using vSphere and doing delta or deduped backups, you no longer need to read all the blocks of a VM and then decide whether each block has changed or not. PHD managed to get esXpress to read only the changed blocks directly, by using this “cheat sheet” that VMware was so nice to make available through the vStorage API.

What this means is that backup speeds will be way higher when you do delta or deduped backups.

When you also use their dedup target, with the dedup action happening at the SOURCE, you get tremendous backup speeds, and as an added bonus you can use smaller WAN links when sending these backups offsite. Wonderful guys, you did it again!

VCP4 certified

I am not (yet?) the kind of blogger to throw everything I see around me onto my blog just because it is “new”; I think blogging should be more about things you have tested or measured.

Yet today at VMworld 2009 I make an exception: I just got my VMware VCP4 certification. Yeah!

Soon to come

    • Determining Linked Clone overhead
    • Designing the Future part1: Server-Storage fusion
    • Whiteboxing part 4: Networking your homelab
    • Deduplication: Great or greatly overrated?
    • Roads and routes
    • Stretching a VMware cluster and "sidedness"
    • Stretching VMware clusters - what noone tells you
    • VMware vSAN: What is it?
    • VMware snapshots explained
    • Whiteboxing part 3b: Using Nexenta for your homelab