Whiteboxing part 3: Choosing storage for your homelab
After a long time, I want to continue my series on building out your own home lab. Up next: storage. What to choose, and how to build it? I will be focusing on shared storage in this blog post. Yes, you could use local disks just fine without sharing them outside that one box, but that is not where the hard choices lie. Once you decide you want shared storage, the questions start to pop up and decisions have to be made.
Different approaches to storage in your home lab
Before you start buying disks or SSDs, first things first: there are some base decisions to make. The most important ones:
- Will you build disks into your ESX nodes?
- Or will you buy a separate server or NAS box?
So what are the pros and cons of both approaches? For power consumption, the first option is usually cheapest: you don't need to buy or power another server or NAS box, you just add disks to your existing setup. On the other hand, adding disks to your physical boxes limits your flexibility. Try upgrading vSphere on a node that carries all your storage…!
Let’s start with your options for using local disks. After that we’ll step into the separate NAS boxes.
Building disks inside your ESX nodes: VMware vSA
Very much in line with VMware's vision is the idea of taking local storage and presenting it as shared storage. The problem: today VMware only has "first wave technology" for this, called the vSphere Storage Appliance (vSA). This appliance can combine the local disks of two or three nodes and project them out as shared storage.
The good thing about the VMware vSA is that you can bring down nodes for upgrades etc. without impacting storage availability. But this comes at a price: you obviously need to mirror storage, and you always need three nodes to build a reliable setup. Yes, you can run the vSA in a two-node setup, but you'd still need a third machine that either participates in the vSA cluster (which makes it a three-node config) or runs a piece of software that provides cluster witness functionality in case of a node failure.
The need to mirror the data across disks in different nodes is another pro or con of the setup (depending on what matters to you). This is on top of the RAID protection that has to be part of each node individually. So in a capacity-optimised scenario you'd have at least:
- RAID5 protection in each node;
- MIRROR between two nodes.
Of course this describes a SUPPORTED configuration. You can get away with RAID0 sets in your nodes and get a working vSA cluster that has only the mirroring overhead. Question is, do you want to get into that potential mess? Losing a single disk would mean losing a member, which in turn means reprotecting an entire node of the vSA cluster.
In a two-node setup both nodes mirror all available data and present it out. In a three-node setup the appliance delivers three datastores, each mirrored between two of the three nodes (so mirrors over 1&2, 2&3 and 3&1). The RAID protection inside the nodes is there to cope with any disk failure, while the mirrors provide protection against node loss (or upgrades, maintenance etc.).
For a home lab setup this quickly becomes an issue: you have to have at least two nodes with some kind of RAID controller. Even though most motherboards today offer onboard RAID, it hardly ever works under vSphere because these controllers rely on the O/S to do the actual RAID work. You guessed it: vSphere has no support for software RAID implementations. That means your nodes need a "real" RAID controller on board. Once you have this, you additionally need some kind of third machine online as well to be the cluster witness (or a true third node).
As described above, you lose a lot of capacity by using the vSA. You lose capacity to the RAID protection within the nodes (50% for RAID10, one disk for RAID5 and two disks for RAID6), and on top of all that another 50% of what remains is lost to the mirroring between boxes. With four 1TB disks per node in RAID5, for example, you would end up with 3TB of usable shared storage out of 8TB raw.
On the performance front, you lose a lot of write performance as each write has to go to both mirrors. Especially when you use RAID5 or RAID6 in the nodes, the write penalty (see Throughput part 2: RAID types and segment sizes) can be pretty devastating for your performance. When a disk has failed and you need to rebuild the RAID set, things get even worse in RAID5 or RAID6. The way to protect yourself against that would be to use RAID10 inside the nodes, but that in turn would mean mirroring two mirrors, leaving you with only 25% of your raw capacity.
Building disks inside your ESX nodes: Single-node Virtual Storage Appliances
Another way to go is to just fill one server with local disks, install a virtual appliance on that box, and feed the appliance your local drives (or big virtual disks that live on them). The virtual appliance will then apply RAID to your local disks and project them out as shared storage (usually over iSCSI or NFS).
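If you go the NFS route, mounting the appliance's export on your hosts is a one-liner from the ESXi shell. A minimal sketch (the IP address, share path and datastore name below are made up for the example):

```
# Mount the appliance's NFS export as a datastore on this ESXi host
esxcli storage nfs add --host=192.168.1.50 --share=/vmstore --volume-name=nfs-lab

# Verify the datastore shows up and is mounted
esxcli storage nfs list
```

You can of course do the same from the vSphere client; the command line just makes it easy to repeat on every node.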
For my own home lab, I looked into three options to use for a virtual appliance:
- FreeNAS 8 (based on FreeBSD, can use ZFS);
- Openfiler 2 (based on rPath Linux, uses a journalling filesystem with support for 8TB+ volumes);
- Nexenta Community Edition 3 (based on Sun's OpenSolaris, uses ZFS).
They all look the same from the outside, meaning they all run some kind of Linux or Unix operating system, they use disk logic from the O/S, and they have a GUI glued on top to make management easier. On the inside, they use very different stuff: Openfiler uses Linux, FreeNAS uses FreeBSD with ZFS, while Nexenta uses Sun’s OpenSolaris (the version that was out before they got bought by Oracle).
How did I select which one to use? Simple. I installed each appliance on my home lab system. I fed each one some thin provisioned virtual disks, and requested the appliance to bake me shared storage from those disks over NFS. Finally, I migrated a test Windows XP VM onto it and the testing could begin!
For testing I did not focus all too much on performance; for me it is more important that my data is not lost. For you that may be different, and in that case I suggest you run some tests yourself before you decide on an appliance to use. Testing for reliability was very simple in the virtual world: let it run, remove one of the virtual disks on the fly and see how it copes. Then delete that virtual disk (like a physical disk that was destroyed), insert another virtual disk of the same size on the fly, and ask the appliance to rebuild. In all tests I used RAID5 / RAIDZ1 to configure a system with a single redundant disk (N+1).
Testing was done quite quickly, because none of the appliances handled this "disaster" perfectly: one did not even detect that the disk was gone for many minutes, another refused to see my new virtual disk as a new disk. Nexenta did all as expected, but refused to initiate the rebuild from the GUI.
After some playing around, I managed to get myself into "serious" trouble trying to get it to repair the shared data. Only Nexenta came out pretty well: as soon as I went to its command line, I was able to just fire off the zpool command to initiate the rebuild onto the new device, and off it went. Did the web GUI show the rebuild? Nope. It kept reporting the degraded state (instead of rebuilding, resilvering, reconstructing or whatever), but from the command line I was able to follow the rebuild process perfectly (percentage done, time passed, time remaining and MB/s). Pretty sweet; I wish I could see that in the GUI!
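For reference, the rebuild from the command line boils down to a zpool replace. A rough sketch (the pool and device names are made up; on Nexenta you may need to run these as root):

```
# Check the pool state; after pulling a disk it should report DEGRADED
zpool status tank

# Replace the failed device with the newly inserted disk and start the resilver
zpool replace tank c1t2d0 c1t3d0

# Run status again to follow the resilver progress (percentage, speed, time left)
zpool status tank
```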
Once the rebuild was complete, the web GUI all of a sudden showed “ONLINE” instead of “DEGRADED”, and even the disks reported in the pool were the correct ones. Pretty good stuff even though I needed to use the command line to get it done.
Using a separate server for shared storage
If you do not want local storage tied to vSphere nodes that you may be upgrading or putting into maintenance mode on a regular basis, you could decide to build a separate SAN/NAS box out of a PC. That works pretty well, and gives you a LOT of potential for a very nice price.
So how do you build this? Simply look at the previous paragraph, but run the software natively instead of as a virtual appliance! Since Nexenta was my choice for the virtual appliance, I'll get into running Nexenta natively as well.
You do not really need all that much for an effective shared storage box. In general, it is even cheaper than buying a dedicated NAS box (but beware – most NAS boxes use less power).
Especially if you're a fan of DIY, it is a lot of fun to build your own dedicated NAS system. I played with Nexenta, and successfully installed it onto a USB stick and onto a CompactFlash card, while using a bunch of 250GB SATA drives as shared storage. As Nexenta uses ZFS, it helps to give it loads of memory. Then you need CPU power, some networking and of course storage. Adding one or more SSDs may increase performance even more (see the sketch below). I may dive into Nexenta sizing in more depth in a separate post if I find the time.
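To give an idea of what that looks like on the ZFS side, here is a rough sketch of a RAIDZ1 pool of SATA drives with SSDs added as write log (ZIL) and read cache (L2ARC), shared out over NFS. The pool and device names are examples only, and on Nexenta you would normally do all of this through the web GUI:

```
# Create a single-parity (RAIDZ1) pool out of four SATA drives
zpool create tank raidz1 c1t0d0 c1t1d0 c1t2d0 c1t3d0

# Add one SSD as dedicated write log (ZIL) and another as read cache (L2ARC)
zpool add tank log c2t0d0
zpool add tank cache c2t1d0

# Create a filesystem and share it over NFS for the vSphere hosts
zfs create tank/vmstore
zfs set sharenfs=on tank/vmstore
```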
When you DIY, you can very flexibly size for CPU power, memory, number of NICs (and speed) etc. This is a freedom you won’t get with most (or even any?) ready-to-go NAS boxes.
Using a ready-to-go NAS system
Quick and easy. That is all there is to say about these boxes. They use little power, and they are standalone (so you can play with your vSphere nodes all you want; the storage is completely separated from them). In general they are more expensive than building your own storage system out of a PC, unless you plan on building a really small, low-performance unit. In that case you could consider a box like the Iomega IX-2. These house only two disks and come REALLY cheap (I've seen IX-2s without disks for 99 euros!). The downsides? They are relatively slow. Not really meant to run a lot of VMs from, but more for streaming and backup.
The bigger NAS boxes (like the Iomega PX series, Qnaps and Synologies) cost way more than a PC-based NAS (prices around 700 euros and up without disks). If money is not your primary concern, then these NAS boxes are a serious option. If you want the biggest bang for the buck and most flexibility, you’re probably better off building your own PC-NAS.
Main takeaways
When choosing storage for your home lab, there are some important things to remember. First of all, where to put your disks. One way of doing this is to have local storage in one of your vSphere nodes and run a virtual appliance to share it out. Another approach is to convert a PC into a NAS box, or to buy a ready-to-go NAS box. Lastly, you could choose to build a hybrid between local and shared storage by using VMware's vSA, where each node has local storage and you can bring down nodes one by one (for maintenance etc.) without losing your shared storage.
Which road to choose will be different for everyone. How many vSphere nodes do you have? What are your capacity and performance needs? What is your budget? How "green" does your solution need to be? Will you be running the home lab 24/7 or not? All these questions will decide which path to choose.
Eric,
I have had many similar debates (internally, in my own head) about this and how best to lay it all out, and some real trouble keeping a stable lab because of these questions. In the end I got an Antec gaming machine with lots of bays and have a couple of 500GB SATA drives as well as 2 SSDs – one for booting Windows 7 and another just for VMware Workstation to use. The SATA drives are really handy for creating SDRS clusters with just raw lumps of disk for testing… not great performance, but that's not necessary… the SSD is good for low latency, so you can do both…
I had Openfiler running nested inside Workstation, presenting datastores over iSCSI to my ESXi machines which were its siblings. On 3 occasions in the last 3 months it just died and I had to rebuild it… One of the things I did was to have 2 separate partitions on my "dedicated" 230GB SSD. One I was letting Windows use (with VMware Workstation) and the other I was letting Openfiler use inside a Workstation VM with a Linux partition table… this is quite tricky and has some limitations…
By accident I came across the Microsoft iSCSI Target server.
Link here: http://www.microsoft.com/en-ie/download/details.aspx?id=19867
It's 6MB and you can create your VHDs from your existing filesystems, and it works really well. So you use a nested Windows server inside Workstation or run it natively on your main OS… either way it's a doddle to set up and more reliable… Would highly recommend it compared to Workstation!!
BTW I'm studying for VCAP-DCA and came across your fantastic vscsiStats Excel spreadsheet and blogs. Thanks so much – it's a great toolset!!!
Paul
Hi Paul,
There are literally a million ways of building your stuff. How you build is very personal: it depends on personal preference, personal knowledge and the things you want/need to do. For me personally, Workstation is not an option. I chose the physical ESX install on my nodes and work from there; I need the uptime as my lab runs 24/7. And so far it has not disappointed: the virtual storage appliances have done a more than decent job the past few months, with no real crashes or anything that required manual intervention.
When you are running Windows and/or VMware Workstation anyway, the iSCSI target server is not a bad idea – "why spend more effort" comes to mind.
A great addition, I think, to the options one has for building out storage for a home lab!