I’ll say it at the beginning of this post and again at the end: stop getting caught up in the vendor hype that their storage solves boot storms, and worry about the write IO that is created by virtual desktops, login storms, and application streaming. Disk is the main reason VDI projects fail or don’t get off the ground. If you have more than a couple hundred hosted virtual desktops you’re going to need to understand disk IO, the IO profile of your virtual desktops, the applications and events that cause it, the peaks, and how your storage system handles it.
I’m quite surprised by the number of vendors talking about how they solve boot storms or have really high read IOPS capability in a hosted virtual desktop/VDI environment. In the real world most people don’t worry about boot storms because we boot up VMs well before users arrive. Not to mention that storage manufacturers have cache on their controllers, which holds the frequently accessed blocks of data. Does this IO still show up when you look at the hypervisor and VMs? Yes, but that doesn’t mean it results in a read IO from physical disk. The other point, which conveniently goes un-talked about, presumably because it’s not something all storage vendors can address, is the percentage of IO in a VDI environment that is write IO. In fact in many cases the write IO is as much as or more than the read IO and, as you’ll read below, write IO is more expensive to handle with RAID. On top of all of that, the controller cache, which works really well when the same physical block of data is read over and over, is left barely hanging on with writes: the write IO coming into cache is unique and must be written to disk quickly, or we risk filling up the cache, causing a cache flush, and forcing new IO to go straight to disk (BAD).
Here are some great reads on VDI, IOPs, and the importance of addressing the write IO in VDI. Read the articles below, then let’s move on.
http://myvirtualcloud.net/?p=2513
http://myvirtualcloud.net/?p=2829
In order for me to explain this fully I think it’s first necessary to revisit IOPS and disk drives.
Let’s start with the math of IOPS. Disk drives, whether solid state or of the spinning platter variety, all have a limited number of I/Os per second that they are capable of. The generally accepted numbers are for random read or write performance. I’m referencing the numbers listed on Wikipedia, but you’ll see numbers that vary from these depending on the vendor. http://en.wikipedia.org/wiki/IOPS
| Device | Type | IOPS |
|---|---|---|
| 7,200 rpm SATA drives | HDD | ~75-100 IOPS |
| 10,000 rpm SATA drives | HDD | ~125-150 IOPS |
| 10,000 rpm SAS drives | HDD | ~140 IOPS |
| 15,000 rpm SAS drives | HDD | ~175-210 IOPS |
Now that we know the estimated available IOPS per drive, we need to factor in the RAID penalty. For many this may be something you were not aware of. The RAID penalty is the reason behind the advice you’ve heard to put your log files on RAID 1 or RAID 10. As you can see in the table below, the RAID penalty for write IO is lower on RAID 1 or 10, and log files are typically heavy write IO. So to get the most IOPS out of your drives you want to optimize the RAID type for the level of redundancy and the type of IO involved. The total IOPs your storage system can handle is a function of the speed of the drives, the RAID level being used, and the split of read vs write IO. Let’s say you have 10 drives averaging 150 IOPs each for a total of 1500 IOPs. Using RAID 5, those drives can handle 1500 read IOPs but only about 375 write IOPs (1500 divided by the write penalty of 4). Notice how much less write IO those drives can handle? Are you beginning to understand the scope of the problem with VDI and disk IO?
| RAID level | Read penalty (disk IOs per read) | Write penalty (disk IOs per write) |
|---|---|---|
| RAID 0 | 1 | 1 |
| RAID 1 and 10 | 1 | 2 |
| RAID 5 | 1 | 4 |
| RAID 6 | 1 | 6 |
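To make the arithmetic concrete, here is a minimal sketch of the calculation using the penalties from the table and the 10-drive example. The function names and the mixed read/write formula (raw IOPS divided by the weighted cost of an average IO) are my own illustration of the usual rule of thumb, not the output of any vendor sizing tool.

```python
# Write penalties from the RAID table (disk IOs needed per front-end write).
# Reads cost 1 disk IO at every RAID level listed.
WRITE_PENALTY = {"RAID 0": 1, "RAID 1": 2, "RAID 10": 2, "RAID 5": 4, "RAID 6": 6}


def raw_iops(drive_count: int, iops_per_drive: float) -> float:
    """Total disk IOPS the drives can deliver before any RAID penalty."""
    return drive_count * iops_per_drive


def usable_iops(raw: float, raid_level: str, write_fraction: float) -> float:
    """Front-end IOPS for a given read/write split.

    An average IO costs (read_fraction * 1 + write_fraction * penalty)
    disk IOs, so usable IOPS = raw / that cost.
    """
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    penalty = WRITE_PENALTY[raid_level]
    read_fraction = 1.0 - write_fraction
    return raw / (read_fraction + write_fraction * penalty)


if __name__ == "__main__":
    raw = raw_iops(10, 150)  # the 10 x 150 IOPS example from the post
    for write_fraction in (0.0, 0.5, 1.0):
        iops = usable_iops(raw, "RAID 5", write_fraction)
        print(f"RAID 5, {write_fraction:.0%} writes: {iops:.0f} IOPS")
```

With all reads this returns the full 1500 IOPs, with all writes it returns 375, and a 50/50 split lands at 600, which is why the write percentage of your desktop workload matters so much when sizing.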
So what are we looking for in a disk solution for VDI? At scale we need a solution that can handle read and write IO in a tier of SSD. There are many solutions on the market that use SSD as a cache, some for only read, some for both read and write. A cache of SSD which handles both read and write IO is good so long as the underlying disk can keep up during the busy time (typically during periods of heavy login or application streaming). If the cache can’t keep up then we end up going directly to the physical disks, which are probably already under heavy IO; that is the reason the cache couldn’t offload to disk fast enough in the first place.
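A rough way to reason about "can the cache keep up" is to compare the write rate coming in with the rate the disks behind the cache can absorb. The sketch below is only an illustration: the cache size, the 4 KB IO size, and the incoming write rate are made-up numbers, and real arrays coalesce and destage writes in ways this ignores. The 375 figure is the RAID 5 write capability from the 10-drive example earlier in the post.

```python
from typing import Optional


def seconds_until_cache_full(
    cache_gb: float,
    incoming_write_iops: float,
    drain_iops: float,
    io_size_kb: float = 4.0,  # assumed IO size, not a measured value
) -> Optional[float]:
    """Seconds until a write cache fills, or None if the disks keep up.

    The cache only grows when writes arrive faster than the back-end
    disks can destage them.
    """
    net_iops = incoming_write_iops - drain_iops
    if net_iops <= 0:
        return None  # disks drain at least as fast as writes arrive
    cache_kb = cache_gb * 1024 * 1024
    return cache_kb / (net_iops * io_size_kb)


if __name__ == "__main__":
    # Hypothetical login storm: 2000 write IOPS hitting a 4 GB cache in
    # front of the RAID 5 group that can sustain about 375 write IOPS.
    remaining = seconds_until_cache_full(4, 2000, 375)
    if remaining is None:
        print("cache never fills")
    else:
        print(f"cache full after about {remaining / 60:.1f} minutes")
```

Under those assumed numbers the cache lasts a little under eleven minutes, which is shorter than many login windows; once it fills, new writes are limited to what the disks can take directly.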
In my opinion the easiest thing for IT is to have a disk solution which can keep all of the heavy read and write IO in a tier of SSD in the array. These solutions don’t require a warming of cache, as the frequently used blocks for read IO are probably already there if they are indeed accessed frequently. These solutions are more easily understood and implemented. These solutions should have enough SSD to sustain the day’s write IO in SSD and then tier that data down to a slower tier if desired. The downside to this approach? It’s usually expensive…hence the reason the VDI storage marketplace is so fragmented in its approach to solving the VDI IOPs issue.
There are other solutions on the market which use SSD for read and lay down their write IO sequentially on slower spinning disk (which results in higher sustained IOPs). I haven’t seen any performance testing for these types of solutions that establishes the write IO they can sustain.
There are also some new solutions using RAM as a tier of disk and then optimizing the stream of IO to existing storage.
I’m not going to make a recommendation on the storage you should choose, just make sure you’re handling write IO in your storage solution and make sure you fully understand and test prior to production deployment.
Regardless of the solution you choose, the biggest takeaway for you should be this: stop getting caught up in the vendor hype that their storage solves boot storms, and worry about the write IO that is created by virtual desktops, login storms, and application streaming.
Dan, I liked your latest post on IOPs. However, even if you know all that stuff it is still difficult to be successful for a few reasons that come to mind.
Purchasing storage for VDI is likely different for management and procurement. We have to help management see storage as performance, not capacity. The people signing checks want to use all the available resources to maximize their investment. They do not recognize that using disk capacity beyond disk performance results in outages.
Also related to the purchasing of storage for VDI, vendors start eliminating parts to lower the cost of their solution to stay competitive. So, maybe their solution was solid, but procurement starts beating them up and the solution gets reworked and looks a lot different in the end.
When we rush to purchase storage to meet project timelines, regardless of the available information about the desktop profile for VDI, we don’t have enough time to make the best decision. Even with slower timelines, information overload lowers the odds of selecting a suitable solution. It takes a lot of valuable time to get familiar with all the reference architectures, take customer reference calls, and read blogged feedback on the available solutions.
The vendor needs to ask the right questions so they have what they need to correctly size the solution. This is not always obvious until it is too late. A trade secret VDI calculator is hard to validate.
Not recognizing when you are leaving out of the loop a valuable VAR that could have helped all the parties involved be successful.
Lastly, overcoming a normalcy bias. Virtualization for servers does not typically require the same high level of design involvement as a VDI solution. Management underestimates the importance of design features and does not want to spend more money on a solution that is sized correctly until the VDI environment experiences large and prolonged outages.
Eric Gustafson
Truer words have never been spoken!
At Quest that’s exactly what we do: cache commonly read blocks into RAM, then coalesce and serialize writes to limit the amount of disk access.
As for boot storms, it’s not just about VM creation or boot time. As we all know, we can boot VMs in chunks of 10 or 20 at a time to limit disk IO, but when doing a wholesale update to thousands of VMs, the time that takes is substantial. If 99% of the read IO is coming from RAM, the boot storm is a non-issue.
Example: I have 1000 Win7 desktops spread across 7 VDI hosts and I want to delete them all and replace them with 1000 Win7 SP1 VMs. Booting Win7 reads about 300MB of data from disk, so booting 125+ per host would require ~40GB of data to be read from disk. If we cache the boot process in RAM, we read only 300MB from disk (per host).
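The numbers in that example can be checked with a few lines. This is only a back-of-the-envelope sketch: it assumes the VMs are spread evenly across hosts, that every VM reads the same 300MB at boot, that a RAM cache serves every boot after the first on each host, and that 1GB is 1000MB. The function name is made up for illustration.

```python
import math


def boot_read_gb(total_vms: int, hosts: int, boot_read_mb: float, cached: bool) -> float:
    """Data read from disk per host (in GB) to boot that host's share of VMs.

    Without a cache every VM reads its boot blocks from disk; with a RAM
    cache only the first boot on the host goes to disk.
    """
    vms_per_host = math.ceil(total_vms / hosts)  # even spread, rounded up
    boots_hitting_disk = 1 if cached else vms_per_host
    return boots_hitting_disk * boot_read_mb / 1000


if __name__ == "__main__":
    print(boot_read_gb(1000, 7, 300, cached=False))  # uncached, per host
    print(boot_read_gb(1000, 7, 300, cached=True))   # cached, per host
```

An even spread of 1000 VMs over 7 hosts is 143 per host, so the uncached case comes out a little above the ~40GB quoted for "125+ per host", while the cached case stays at 0.3GB.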
The fundamental issue is that to deliver the IO to sustain this process we either need to stagger it out over time, or move the workloads to SAN where we can provide enough spindles to deliver the IO, as 1U servers don’t have enough drive slots to handle the required IO. This leaves us with only SAN or SSD as viable options.
Now if we cache read IO in RAM and optimize write IO, we can use commodity local disk like 6 x SAS 15KRPM drives to easily deliver the IO and provide more than enough storage for 100-200 VMs per host.
Customers don’t want to hear that they need special storage, special servers, special networks…that are all expensive, to deliver a virtual desktop, that is “supposed to be” less expensive than the physical equivalent.
Remove the requirement for these for a good percentage of the virtual desktops (the non-persistent VDI or session host workloads) and only spend the money for SAN, FC/iSCSI, clustering, HA…for the desktops that must be HA, i.e. developer desktops, and the capital expenditure for the project drops significantly.
Marry that with something that just works without any specialized virtualization skills, i.e. all the desktop team needs to know is how to install Windows and their apps while the broker configures and manages the VDI host, and you have real value.
I will continue to disagree that a boot storm is a problem for most customers; it’s not. The underlying point of my post was that too many people focus on the boot storm and read IOPs caching, while the ignored and bigger problem is write IOPs. Read caching has a lot of solutions out there; it isn’t the barrier, it’s marketing. Shared storage solutions can easily hold GBs of data in cache with no disk impact.
Optimized write IO onto spinning disk doesn’t replace the need for SSD for writes and doesn’t scale beyond the next generation of Intel processors. Distributed IO, doing it locally as opposed to putting it all back on the centralized storage solution as you are doing is great architecture. Distributed write IO is a big deal and you’ve done great work here.
My crystal ball says the hypervisor will do many of these things (read cache, write optimization) in the future. We’re on the cusp of storage vendors having SSD solutions for both read and write; time will tell if they solve the problem at a price that is cost effective…for the time being the major vendors definitely are challenged here, and your optimizations will indeed save dollars for non-persistent deployments.
Citrix VDI-in-a-Box does not have the RAM-as-disk optimizations that Quest has, but it does have an architecture based on local disk deployments. I can’t wait to see some integration of these ideas in XenDesktop and, better yet, at the hypervisor.
Thanks for the comments, I enjoy the discussion… and congratulations on the latest release of vWorkspace, it has my attention.
[...] you’ve read my other posts on about VDI and IOPs you know that I don’t believe that boot storms are a real problem in [...]
[...] you have read from a previous blog post here I take issue with storage vendors not recognizing the write IO issues that happen with Desktop [...]