Posts Tagged ‘Backup’
For a long time I have been a fan of PHDvirtual (formerly esXpress) and their way of backing up virtual environments. Their lack of ESXi support has driven a lot of people towards other vendors, and the one that is really on technology’s edge nowadays is Veeam’s Backup and Replication. Now that PHDvirtual has released their version 5.1 with ESXi support, it is high time for a shootout between the two.
Some history on drawing virtual backups
In the old ESX 3.0 and ESX 3.5 days, there was hardly any integration with 3rd party backup products. Read the rest of this entry »
Just when I thought I had done a pretty complete tuneup on the storage path from Veeam backup to an Iomega IX2-200 NAS, two things came up I wanted to test. The first one (why didn’t I think of that) is to set compression to “low”, saving CPU cycles and hopefully getting more throughput. The second one was starting a second job on the same Veeam VM to the same target storage.
Thanks to Veeam’s Happy Holidays gift, I now have a license for several Veeam products. The one I really wanted to try in my home lab was Veeam Backup and Replication.
In this blogpost, I will try various ways to connect the Veeam appliance to my Iomega IX2-200 NAS box. This setup is very tiny indeed, but it clearly shows the options you have and how they perform compared to each other. Read the rest of this entry »
PHD Virtual, creators of the backup product for VMware environments formerly known as esXpress, have introduced a new version of their PHD Virtual Backup software with support for Citrix XenServer.
The PHD Virtual Backup product is the first “virtual only” backup product with full support for Xen based virtualization I know of. But then again, I am no XenServer expert. For me, this is not too important at the moment, because I’m still very focussed on VMware only. But it is a major step for PHD Virtual; a lot of vendors are roadmapping support for all leading virtualization platforms, but today PHD Virtual actually delivers!
Keep going guys! The backup solution they have on the market right now for VMware is simply brilliant, especially in its robustness and its almost famous “install and forget” ability. I actually installed the product at several customer sites, and it started backing up. Fully automated, no headaches. No one ever bothered again. New VMs are automatically added to the list to backup if configured to do so. Simply brilliant. In many ways VMware has been looking closely as to how esXpress has been doing backups. Prove of this is VMware’s Data Recovery, which basically is a poor copy of esXpress in its way of working.
Some other vendors have been shouting about this great “hotadd feature” they now support. People tend to forget that esXpress has used similar technology for several YEARS now! Because hotadd did not exist then, they were forced to use “coldadd” , meaning their Virtual Backup Appliances (VBAs) needed to be powered down between backups (to clarify: NOT the VMs to be backed up).
Whether you use hot- or cold-add, backup speeds are great in any case. But cold-add has a drawback: The VM performing backups has to be powered up and down, reconfigured etc. That takes time. Especially now that Changed Block Tracking (CBT) is used by most vendors, a backup can take as little as 20 seconds if not too many blocks have changed within the virtual disk to backup. And this is where “cold-add” start to hurt: Reconfigure, power up and down of the VBAs for every virtual disk to backup easily takes a 3-5 minutes to complete.
PHD Virtual has been working hard on a version which is compatible with ESXi. Especially now that VMware is pushing even more towards ESXi, this is more important than ever. I hope to see a beta version soon of the ESXi compatible product; I cannot wait to test . This will also solve the “cold-add” overhead, because I’ve been told this version will use a single VBA per ESX node which hotadds multiple disks in parallel and then performs the backup also in parallel. Very exciting: Hopefully we’ll see both backup efficiency and robustness like never before. Add replication in the mix (which is a part of the product by default) and you have a superb solution at a relatively low cost.
PHD Virtual Backup with XenServer support link:
I just spotted something that had not occurred to me yet… A small new detail in vSphere 4.1 (or I just missed out on it previously)… VMware has had this “problem” for a long time: If you added a second virtual disk to a virtual machine on a datastore different from the location of the first virtual disk, vSphere used to name that new virtual disk the same as the base disk. Well not any more!
I noticed in vSphere 4.1 that this is no longer true. A second disk created on a separate datastore gets its name from the virtual machine (like it used to be), but with a trailing “_1″ in the filename.
For a long time backup vendors have been battling with this “issue” because the backup software ended up with two virtual disks that were both named the same… In a lot of environments that meant manual renaming and remounting second and third disks to VMs in order to get proper backups without having to guess which disk goes where.
Amazing that these things all of a sudden get fixed now that VMware has its own backup solution
For a long time now I have been a fan of PHDs esXpress. It is still the only VMware backup solution I know that scales, has no single points of failure and works reliable with VMware snapshots. The solution has always been “other than others”: At first it appears to be a really weird piece of software, that creates its own appliances to perform its backups. Once you get to know it, esXpress’s way of working is great. So great in fact, that VMware themselves are now adopting this very way of working with their Disaster Recovery feature in vSphere 4, maybe even stepping away from their beloved VCB (VMware Consolidated Backup).
VCB in my opinion has never been that great, apart for some special uses in special environments. esXpress fits all, from single ESX hosts to large clusters. In contrast to VMware’s Disaster Recovery which is still buggy at the time of this blog, esXpress has been on this train for years now, and definitely knows the drill. EsXpress 3.1 is not the holy grail though. Some features were just not easy to use, there was no global GUI to manage all nodes easily and there was no Data deduplication available (not that I am that big a fan data dedup for backup, but ey, everbody does it!).
Enter esXpress 3.5
To make up for most of the shortcomings, esXpress version 3.5 has been introduced. The engine itself still is pretty much the same. And exactly there lies the power of esXpress: It still WORKS. It just works, it always works. Extra features have been added in such a smart and incredible simple way, that the product remains rock stable. No “waiting for the point 1 release” needed here!
I was over at a client who suffered a SAN failure (when upgrading firmware). They were in progress of failing over to their recovery site, when the administrator got an email from one of the production ESX hosts: esXpress had successfully completed its backups. What? All LUNs appeared unavailable at the production site. This host did not have its storage devices rescanned; it still kept on ticking. I think things like this are major plusses for both VMware ESX and esXpress showing their enterprise readiness.
Finally: A working global GUI
From the initial ESX 3.5 release, PHD also released a GUI to manage all esXpress instances from one central portal. In the old 3.1 (and before) days, you ended up copying configfiles between hosts; working, but not very user friendly. You might think that adding a central GUI took a lot of deep digging in the code of esXpress. But, they surprised once again: The GUI just holds the config files and, could it be more simple, the GUI appliance introduces a small NFS store. The NFS store is automagically mounted to the ESX servers, and presto! That is where the config files can be found. EsXpress itself just has to check the share for a new config, something already (partly) in existance in the previous version.
Even better: the GUI does a great job. I had some trouble with the first versions, some manual labor was needed to get it going (like manually needing to change the time zone and not being able to add a second DNS server). All these issues are fixed now, but even those early versions were already very effective. And things have become only better since then!
Because “everybody has it”: Deduplication
What should we do without deduplication nowadays? It is a major hype around storage and backup. If you don’t have it, you’re out of business it seems. But who ever thinks about the risks and limitations involved (see: The Dedup Dillema).
The idea of deduplication is brilliant, but the implementation has to be right. I must admit, I am not a big fan of deduplication. It is still your vital data you are talking about! Admit nr.2: EsXpress 3.5 managed to change my opinion to dedup a little.
The deduplication implementation of esXpress is in style with PHDs way of working: both effective and simple. A separate appliance is installed (which is in fact the same one as the GUI appliance. At first boot of the appliance you choose what the appliance will become. Smart!). The dedup appliance (called PHDD for PHd Data Dedup) can mount a datastore or an NFS store for storing its deduped data. It performs quite well, saving diskspace as you backup more of the same (or alike) data. It is now much “cheaper” to keep more backups of your VMs.
Only few changes appear to have been made to esXpress itself to allow PHDD as a backup target, so once again, stability guaranteed.
So now all your data lives inside the PHDD appliance. Now how do I get out this data the way I want it? PHD did something clever: They added a CIFS/SAMBA interface to the appliance, allowing you to browse, copy and backup your VMs as if they weren’t deduped at all! This last feature makes the mix of backup and dedup more acceptable, even effectively useable
When will the fun EVER stop? File level restore!
The best feature of the PHDD dedup target in my opinion, next to dedup itself, is the ability to perform file level restores. At last you can get out that one single file of a full VM without having to restore the whole thing. This option is so cool, you simply browse to the appliance, select your files, and save the collection you marked as a single zip file! Couldn’t be easier, another bulls eye for PHD, even in their first release of this piece of software.
Scaling esXpress 3.5 with dedup
Not all is bright and shiny with dedup. I found it hard to scale the solution: If there is only one PHDD target, scaling ends somewhere, and a SPOF (single point of failure) is introduced. Not good (although PHD is working on a way to link the dedup appliance to a secondary one). Still, one may consider to use two or more PHDD appliances in parallel. This will work, but the dedup effectiveness will drop sharply, especially when you use DRS and all VM backups end up on all PHDD targets in time (this happens when you design the often used strategy where one assings for a backup target to each ESX server individually with failovers to others). You can make it somewhat more effective by specifying a backup target for each VM (in the local config), a best practice that also stands when using multiple FTP targets btw. This will ensure that a backup of a particular VM will always end up on the same backup target, making things clearer and making dedup more effective (although far from ideal – Every PHDD target has its own library of data, meaning that identical blocks still get stored on EACH PHDD target instead of just one).
The limitations mentioned above are not a limit of esXpress though, but more a limitation of dedup in itself. PHD choose to use online dedup (basically you dedup while you write), which will use CPU power during backup and restores. CPU power might even be the limiting factor in your backup speed. Luckily CPU power is usually available in abundance nowadays. I will dive deeper into performance and scaling of deduped installations in the next blogpost, which will hopefully prove that dedup really performs (like the setup using multiple FTP targets simultaneously described in my blogpost Scaling VMware hot-backups using esXpress).
The new version of esXpress 3.5 is in terms of speed and reliability on par with its predecessor version 3.1. It is still the only backup solution I know that has no Single Point of Failure, scales (REALLY scales) up to whatever size you want without any issues, and best of all: Once it works it KEEPS working with hardly any problems around VM snapshotting like some other backup solutions do have.
On top of all the good things that already were, a global GUI is added which manages all esXpress installs at the same time, and there is a Data Deduplication appliance which features a very well working single file restore option. I would like to have seen a file restore option in a non-dedup target as well. From what I’ve seen, online deduping costs a lot of CPU power, and the backup speeds go down because of this. Once the database is built though, things do get better (less data to backup because more and more blocks are already backup up in th dedup appliance). Still, calculations have to be done.
In a smaller environment, the dedup appliance is no match for a set of non-dedupping FTP targets. This is a drawback from which any dedup system suffers… It is just the way the “thingy” works. Still I see a solid future for esXpresses PHDD dedup targets where speed is not of the utmost importance.
Make no mistake on backup speeds: IF esXpress and its backup targets are designed and configured properly, it is by far the fastest full-VM backup solution I’ve seen. It does not mess with taking backups through the service console network, it creates Virtual Appliances runtime that perform the backups – and many in parallel. If you want to see real backup speed from esXpress, do not test it on a single VM like some people tend to do when comparing. If you do, speeds are about on par with other 3rd party vendors. But when scaled up to make 8 or more backups in parallel to several backup targets with matched bandwidth, esXpress will start to shine and leave the competition far behind.
There are a lot of ways of making backups- When using VMware Infrastructure there are even more. In this blog, I will focus on so called “hot backups”- Backups made by snapshotting the VM in question (on an ESX level), and then copying the (now temporary read-only) virtual disk files off to the backup location. And especially, how to scale these backups into larger environments.
Hot backups are created by first taking a snapshot. A snapshot is quite a nasty thing. First of all, each virtual disk that makes up a single VM have to be snapped at exactly the same time. Secondly, if at all possible, the VM should flush all pending writes to disk just before making this snapshot. Quiescing is supported in the VMware Tools (which should be inside your VM). Quiescing will flush all write buffers to disk. Effective to some extent, however not enough for database applications like SQL, Exchange or Active Directory. In those cases VSS was thought up by Microsoft. VSS will tell VSS enabled applications to flush every buffer to disk and hold any new writes for a while. Then the snapshot is made.
There is a lot of discussions about making these snapshots, and quiescing or using VSS. I will not get into that discussion, it is too much application related for my taste. For the sake of this blog, you just need to know it is there
Snapshot made – now what?
After a snapshot is made, the virtual disk files of the VM have become read-only accessible. It is time to make a copy. And this is where different backup vendors start to really do things different. As far as I have seen, there are several different ways of taking out these files:
VCB is VMware’s enabler for primarily making backups through a fibre-based SAN. VCB enables a “view” to the internals of a virtual disk (for NTFS), or give a view to an entire virtual disk file. From that point, any backup software can make a backup of these files. It is fast for a single backup stream, but requires a lot of local storage on the backup proxy, and does not easily scale up to a larger environment. The variation using the network as carrier is clearly a suboptimal solution compared to other network-based solutions.
Using the service console
This option installs a backup agent inside the service console, which takes care of the backup. Do not forget, an FTP server is also an agent in this situation. It is not a very fast option, especially since the service console network is “crippled” in performance. This scenario does not scale very well to larger environments.
And here things get interesting – Please welcome esXpress. I like to call esXpress the “Software version” of VCB. Basically, what VCB does – make a snapshot and create a view to a backup proxy – is what esXpress does as well. The backup proxy is no hardware server though, or a single VM, but numerous tiny appliances, all running together on each and every ESX host in your cluster! You guessed it – see it and love it. I did anyways.
esXpress – What the h*ll?
The first time you see esXpress in action, you might think it is a pretty strange thing – First you install it inside the service console (you guessed right – There is no ESXi support yet). Second it creates and (re)configures tiny Virtual Appliances all by itself.
When you look closer and get used to these facts – It is an awesome solution. It scales very well, each ESX server you add to your environment starts acting as a multiheaded dragon, opening 2-8 (even up to 16) parallel backup streams out of your environment.
esXpress is also the only solution I have seen which does not have a single SPOF (Single Point of Failure). ESX host failure is countered by VMware HA, and the restarted VMs on other ESX servers are backup up from the remaining hosts. Failing backup targets are automatically failed over to by esXpress to other targets.
Setting up esXpress can seem a little complex – there are numerous option you can configure. You can get it to do virtually anything. Delta backups (changed blocks only), skipping of virtual disks, different scheduling of backups per VM, compression and encryption of the backups, and SO many more. Excellent!
Finally, esXpress has the ability to perform what is called “mass restores” or “simple replication”. This function will automatically restore any new backups found on the backup target(s) to other locations. YES! You can actually create a low-cost Disaster Recovery solution with esXpress – RPO (Recovery Point Objective) is not too small (about 4-24 hours), but the RTO (Recovery Time Objective) can be small, 5-30 minutes is easily accomplished.
The real stuff: Scaling esXpress backups
Being able to create backups is a nice feature for a backup product. But what about scaling to a larger environment? esXpress, unlike most other solutions, scales VERY well. Although esXpress is able to backup to VMFS, I will focus in this blog on backing up to the network, in this case FTP servers. Why? Because it scales easily! following makes the scaling so great:
- For every ESX host you add, you add more Virtual Backup Appliances (VBAs), so increases total backup bandwidth out of your ESX environment;
- Backup uses CPU from the ESX hosts. Especially because CPU is hardly an issue nowadays, it scales with usually no extra costs for source/proxy hardware;
- Backups are made through regular VM networks (not the service console network), so you can easily add more bandwidth out of the ESX hosts and bandwidth is not crippled by ESX;
- Because each ESX server runs multiple VBAs at the same time, you can balance the network load very well across physical uplinks, even when you use PORT-ID load balancing;
- More backup target bandwidth can be realized by adding network interfaces to the FTP server(s) when they (and your switches) support load-balancing through mulitple NICs (etherchannel/port aggregation);
- More backup target bandwidth can also be realized by adding more FTP targets (esXpress can load-balance VM backups across these FTP targets).
Even better stuff: Scaling backup targets using VMware ESXi server(s)
Although ESXi is not supported with esXpress, it can very well be leveraged as a “multiple FTP server” target. If you do not want to fiddle with load-balancing NICs on physical FTP targets, why not use the free ESXi to install serveral FTP servers as VMs inside one or more ESXi hosts! By adding NICs to the ESXi server it is very easy to establish load-balancing. Especially since each ESX host delivers 2-16 data streams, IP-hash load-balancing works very well in this scenario, and is readily available in ESXi.
If you want to make high performance full-image backups of your ESX environment, you should definitely consider the use of esXpress. In my opinion, the best way to go would be to:
- Use esXpress Pro or better (more VBAs, more bandwidth, delta backups, customizing per VM);
- Reserve some extra CPU power inside your ESX hosts (only applicable if you burn a lot of CPU cycles during your backup window);
- Reserve bandwidth in physical uplinks (use a backup-specific network);
- Backup to FTP targets for optium speed (faster than SMB/NFS/SSH);
- Place multiple FTP targets as VMs on (free) ESXi hosts;
- Use multiple uplinks from these ESXi hosts using loadbalancing mechanisms inside ESXi;
- Configure each VM to use a specific FTP target as its primary target. This may seem complex, but it guarantees that backups of a single VM always land on the same FTP target (better than selecting a different primary FTP target per ESX host);
- And finally… Use non-blocking switches in your backup LAN, which preferably support etherchannel/port aggregation.
If you design your backup environment like this, you are sure to get a very nice throughput! Any comments or inquiries for support on this item are most welcome.