Today I saw that VMware has published an updated version of its whitepaper on the CPU scheduler in VMware ESX/vSphere: The CPU Scheduler in VMware vSphere 5.1. I have used the previous whitepaper, written for ESX 4.1, in a few presentations I’ve given, and I frequently reference it when discussing VDI/Server Hosted Virtual Desktop solutions with customers. I wrote a blog post in 2011 discussing some of the key points to understand about the CPU scheduler and VDI.
I thought this update of the whitepaper made it a good time to once again focus on items that apply to all virtualized workloads but are critically important to understand when deploying desktop virtualization workloads, where user experience is of the highest importance and over-commitment ratios are high. From the whitepaper published by VMware, I think these paragraphs are the most important for desktop virtualization admins to understand:
When making scheduling decisions, the ratio of the consumed CPU resources to the entitlement is used as the priority of the world. If there is a world that has consumed less than its entitlement, the world is considered high priority and will likely be chosen to run next.
One way to understand prioritizing by the CPU scheduler is to compare it to the CPU scheduling that occurs in UNIX. The key difference between CPU scheduling in UNIX and ESXi involves how a priority is determined. In UNIX, a priority is arbitrarily chosen by the user. If one process is considered more important than others, it is given higher priority. Between two priorities, it is the relative order that matters, not the degree of the difference.
In ESXi, a priority is dynamically re-evaluated based on the consumption and the entitlement. The user controls the entitlement, but the consumption depends on many factors including scheduling, workload behavior, and system load. Also, the degree of the difference between two entitlements dictates how much CPU time should be allocated.
Those sentences are the ones all desktop virtualization administrators should reread until they understand what they mean for their environment. Here’s my translation: “Shares being equal, the more CPU resources (CPU time) you consume, the more likely another workload (world) will preempt yours.”
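To make the consumed-to-entitlement ratio concrete, here is a minimal Python sketch of the selection rule the whitepaper describes. This is a toy model, not VMware's implementation: the World class, the pick_next function, and the numbers are all made up for illustration, and the real scheduler weighs many more factors.

```python
from dataclasses import dataclass


@dataclass
class World:
    """Hypothetical stand-in for a schedulable world (for example, a vCPU)."""
    name: str
    entitlement: float  # CPU time the world is entitled to (illustrative units)
    consumed: float     # CPU time the world has consumed so far

    @property
    def ratio(self) -> float:
        # Lower ratio means the world is further below its entitlement,
        # which the whitepaper describes as higher priority.
        return self.consumed / self.entitlement


def pick_next(worlds: list[World]) -> World:
    """Choose the world with the lowest consumed/entitlement ratio."""
    return min(worlds, key=lambda w: w.ratio)


worlds = [
    World("desktop-a", entitlement=100.0, consumed=90.0),   # ratio 0.9
    World("desktop-b", entitlement=100.0, consumed=40.0),   # ratio 0.4
    World("desktop-c", entitlement=200.0, consumed=120.0),  # ratio 0.6
]

print(pick_next(worlds).name)  # desktop-b
```

Note that desktop-c has consumed the most CPU time in absolute terms, yet it still ranks ahead of desktop-a because its entitlement is twice as large. That is the "degree of the difference" point from the quote.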
When looking at the work your users do within their hosted virtual desktops, it is important to understand the applications, the ways in which users will work with them, and how real-time they are. The main thing I look for is applications that rely on audio or video to be used effectively, or whose primary purpose is audio or video.
These applications are negatively affected when the pCPU is not available and the vCPU must wait for it to become available. When you are overcommitting vCPUs to pCPUs in ratios like 8 to 1, there is a much higher chance that another vCPU will be waiting for the pCPU. Some interruption and waiting for the pCPU probably won’t be noticed, but if the other 7 vCPUs are also trying to schedule audio and video, you’re going to have serious contention on that pCPU, and it will most likely degrade the user experience.
Remember: with shares being equal and all vCPUs having work to do, the CPU scheduler will distribute CPU time equally between the vCPUs. The vSphere CPU scheduler does not see the priority of the operating system threads running inside each desktop.
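The 8 to 1 case can be sketched with a few lines of Python. This is a deliberately simplified model that I made up for illustration: one pCPU, fixed-length time slices, equal entitlements, and every vCPU always has work. It ignores co-scheduling, hyper-threading, idle time, and everything else the real scheduler deals with.

```python
def simulate(num_vcpus: int, ticks: int) -> list[int]:
    """Hand out `ticks` time slices on a single pCPU.

    With equal entitlements, the lowest consumed/entitlement ratio is simply
    the vCPU that has consumed the least so far, so that one runs next.
    """
    consumed = [0] * num_vcpus
    for _ in range(ticks):
        next_vcpu = min(range(num_vcpus), key=lambda i: consumed[i])
        consumed[next_vcpu] += 1
    return consumed


usage = simulate(num_vcpus=8, ticks=800)
print(usage)                  # [100, 100, 100, 100, 100, 100, 100, 100]
print(usage[0] / sum(usage))  # 0.125
```

In this model each busy vCPU ends up with one eighth of the pCPU, no matter what is running inside the desktop. A desktop playing audio gets the same 12.5% as a desktop running a background job, and it spends the other 87.5% of the time waiting for its turn.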