VMware announces High Availability (HA) for NVIDIA GRID vGPU VMs on ESXi with vSphere 6.5

I was very pleased to see Pat Lee from VMware’s PM team tweet about this yesterday…

[Image: Pat Lee’s tweet]

It’s something we knew VMware had added to vSphere 2016, which is supported in the GRID 4.1 (Nov 2016) release. As this is a VMware-implemented feature, we at NVIDIA had to wait for them to announce it. I think there have been a few problems staging the documentation update, which is why this has been a rather quiet feature release. I’ll update this blog with links to the documentation when it becomes available, which should be soon!

But since Pat has let the cat out of the bag… it’s probably best to answer a few basic questions straight away.

What is High Availability (HA)?

Basic HA is a feature that ensures VMs are up and running again as soon as possible in the event of host failure. The VM will automatically restart on another host if one is available with sufficient resources. For vGPU-enabled VMs, that means a host with an appropriate GPU, etc. The user will experience some downtime, but it is kept to a minimum and no manual intervention by a system administrator is needed.
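To make the restart rule concrete, here is a minimal Python sketch of "pick another live host with sufficient resources and an appropriate GPU". It is only an illustration of the idea, not VMware's actual placement logic: the `Host` and `VM` classes, the `pick_restart_host` function, the host names and the numbers are all hypothetical, and the vGPU profile name is just used as a label.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Host:
    # Hypothetical model of a host; not a VMware data structure.
    name: str
    alive: bool
    free_ram_gb: int
    free_vcpus: int
    # vGPU profile name -> number of free slots for that profile
    free_vgpu_slots: Dict[str, int] = field(default_factory=dict)


@dataclass
class VM:
    name: str
    ram_gb: int
    vcpus: int
    vgpu_profile: Optional[str] = None  # None means no vGPU needed


def pick_restart_host(vm: VM, hosts: List[Host]) -> Optional[Host]:
    """Return the first live host that can take the VM, or None."""
    for host in hosts:
        if not host.alive:
            continue  # the failed host (or any other dead one)
        if host.free_ram_gb < vm.ram_gb or host.free_vcpus < vm.vcpus:
            continue  # not enough RAM/CPU left
        if vm.vgpu_profile is not None:
            if host.free_vgpu_slots.get(vm.vgpu_profile, 0) < 1:
                continue  # no appropriate GPU capacity on this host
        return host
    return None  # nothing suitable: the VM stays down


hosts = [
    Host("esx-01", alive=False, free_ram_gb=64, free_vcpus=16,
         free_vgpu_slots={"M60-1Q": 4}),
    Host("esx-02", alive=True, free_ram_gb=64, free_vcpus=16),
    Host("esx-03", alive=True, free_ram_gb=32, free_vcpus=8,
         free_vgpu_slots={"M60-1Q": 1}),
]
vm = VM("cad-desktop-7", ram_gb=16, vcpus=4, vgpu_profile="M60-1Q")

chosen = pick_restart_host(vm, hosts)
print(chosen.name if chosen else "no host available")  # prints esx-03
```

Note that esx-02 has plenty of RAM and CPU but is skipped because it has no free vGPU slot, which is the extra condition vGPU-enabled VMs bring to HA.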

Guaranteed High Availability…

This can be provided by HA features that allow resources such as RAM/CPU to be reserved on hosts, e.g. maybe 15% of a host’s capacity, which guarantees that resources will be available to restart VMs for up to a certain number of host failures. I believe that VMware’s configuration does not extend to GPU resource reservation, so the support announced today will not offer guaranteed HA. It is a feature VMware could add in the future if they saw sufficient demand; it is not a feature engineered by NVIDIA.
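As a rough back-of-the-envelope illustration of what a percentage reservation buys you, the sketch below assumes a cluster of identical hosts and ignores fragmentation (a VM has to fit on a single host). The function name is hypothetical and this is not how any particular product computes its guarantee. If a share of every host is held back, the cluster can absorb the loss of k hosts out of n as long as k/n does not exceed that share.

```python
def tolerated_host_failures(n_hosts: int, reserved_percent: int) -> int:
    """Host failures a percentage reservation can absorb.

    Assumes identical hosts and only RAM/CPU style resources; GPU
    capacity is not part of the reservation described in this post.
    """
    if n_hosts < 1 or not 0 <= reserved_percent <= 100:
        raise ValueError("need at least one host and a percentage in 0..100")
    # Integer arithmetic avoids floating point rounding surprises.
    return (n_hosts * reserved_percent) // 100


for n in (4, 8, 20):
    print(n, "hosts at 15% reserved ->", tolerated_host_failures(n, 15))
```

With 15% reserved, an 8-host cluster covers one host failure and a 20-host cluster covers three, while a 4-host cluster covers none, so the percentage needs to be chosen with the cluster size in mind.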

Can HA provide continual up-time?

No, not alone. Many hypervisors, though, offer Fault Tolerance (FT), which can provide such support. This is a very expensive feature to use, as it relies on running essentially a duplicate VM on mirrored hardware which is phase-locked to the original (i.e. milliseconds behind). In the event of failure the user is switched to the duplicate with only a momentary glitch in user experience. It’s a feature essentially only used in a few safety / mission-critical use cases as it’s so costly to implement.

So is Fault Tolerance (FT) supported for vGPU?

No, not today. The technology to essentially snapshot a live GPU continually is not available. This is also a prerequisite for live migration/motion (e.g. vMotion) and for regular snapshots.

The Future

NVIDIA and all the partners such as Citrix and VMware appreciate that live motion and snapshotting are key enterprise datacenter needs so we continue to work towards making such technology happen (it’s very technically hard I’m told!). We all know what you want and what you want our priorities to be!!!

NVIDIA GRID is architected with a software model that lets us add support for new OSs on customers’ existing hardware, allowing them to pick up new features.

NVIDIA GRID: Linux Guest OS support for Linux distributions on Citrix and VMware

I was recently involved in a support inquiry where a user wanted to know if NVIDIA GRID vGPU was available on Linux VDAs with the Linux guest OS OpenSUSE LEAP (the answer at the time of writing is that it’s NOT!). Finding the answer was a lot harder than I expected, as both the VMware and Citrix documentation took a bit of hunting around.

Much of the marketing around Linux VDAs mentions support for “SUSE”, “CentOS” or other genres of Linux, such as this blog. It is important that customers check the official support matrix of both their hypervisor and their VDI solution, as both Citrix and VMware only certify, QA and support specific versions of Linux guest OSs (usually only enterprise-supported versions). Customers may find themselves unsupported by the virtualization vendors if they fail to check that the OS and its specific version are supported by both their hypervisor and VDI solution (especially if mixing vendors, such as Citrix XenDesktop on VMware ESXi).

Both vendors are evolving their Linux support rapidly and customers must check the documentation associated with the relevant versions of VMware/Citrix products they intend to use.

NVIDIA cannot provide support for guest OSs unsupported by the relevant virtualization vendor, and as such customers are advised to contact VMware/Citrix if they wish to use alternative versions/distributions. It is very likely many other varieties of Linux will “work”, but customers should be aware that they will be unable to obtain hypervisor or VDI support in the event of an issue.

At the time of writing Horizon 7 on ESXi supports:

  • Ubuntu 12.04 and 14.04
  • Red Hat Enterprise Linux (RHEL) 6.6 and 7.1
  • CentOS 6.6
  • NeoKylin 6 Update 1 (Chinese)
  • SUSE Linux Enterprise Desktop 11 SP3

 

At the time of writing Citrix XenDesktop 7.9 on XenServer supports:

  • SUSE Linux Enterprise:
    • Desktop 11 Service Pack 4
    • Desktop 12 Service Pack 1
    • Server 11 Service Pack 4
    • Server 12 Service Pack 1
  • Red Hat Enterprise Linux
    • Workstation 6.7
    • Workstation 7.2
    • Server 6.7
    • Server 7.2
  • CentOS Linux
    • CentOS 6.7
    • CentOS 7.2
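Because the exact version matters and not just the distribution name, the two lists above can be treated as lookup tables. The Python sketch below copies them as a snapshot taken at the time of writing (they will go stale, so always re-check the vendor documentation). The abbreviations SLED/SLES/RHEL, the `SUPPORT_MATRICES` name and the `listed_in` helper are my own shorthand for illustration, and the openSUSE Leap version used in the example is only a sample value.

```python
from typing import List

# Snapshot of the two lists in this post; not an authoritative source.
SUPPORT_MATRICES = {
    "Horizon 7 on ESXi": {
        "Ubuntu 12.04", "Ubuntu 14.04",
        "RHEL 6.6", "RHEL 7.1",
        "CentOS 6.6",
        "NeoKylin 6 Update 1",
        "SLED 11 SP3",
    },
    "XenDesktop 7.9 on XenServer": {
        "SLED 11 SP4", "SLED 12 SP1",
        "SLES 11 SP4", "SLES 12 SP1",
        "RHEL Workstation 6.7", "RHEL Workstation 7.2",
        "RHEL Server 6.7", "RHEL Server 7.2",
        "CentOS 6.7", "CentOS 7.2",
    },
}


def listed_in(guest_os: str) -> List[str]:
    """Return the stacks whose list contains exactly this guest OS and version."""
    wanted = guest_os.strip().lower()
    return [
        stack
        for stack, guests in SUPPORT_MATRICES.items()
        if wanted in {g.lower() for g in guests}
    ]


for guest in ("CentOS 6.6", "CentOS 6.7", "openSUSE Leap 42.1"):
    stacks = listed_in(guest)
    print(guest, "->", ", ".join(stacks) if stacks else "not listed")
```

CentOS 6.6 appears only in the Horizon list and CentOS 6.7 only in the XenDesktop list, which is exactly the kind of point-release difference that catches people out. For a mixed-vendor stack such as XenDesktop on ESXi you would need separate hypervisor and VDI tables, which this post does not reproduce.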

Going forward, if you want to check the OSs available for a Linux VDA, you should follow the advice below.

Citrix

XenServer Support for Linux Guest OSs

This is documented in the “Citrix XenServer® Virtual Machine User’s Guide” for the relevant version of XenServer e.g. for 7.0, here: http://docs.citrix.com/content/dam/docs/en-us/xenserver/xenserver-7-0/downloads/xenserver-7-0-vm-users-guide.pdf

XenDesktop Guest OSs Supported by the Linux VDA

This can be found in the Linux VDA product documentation for the relevant version of XenDesktop under the section “System Requirements”, e.g. for XenDesktop 7.9, please see http://docs.citrix.com/en-us/xenapp-and-xendesktop/7-9/install-configure/suse-linux-vda.html (This is where I had to hunt around, as bizarrely Citrix details the genres and versions of Linux supported under each supported OS rather than in a master list, so the SUSE documentation is where you can find RHEL and other supported versions listed.)

VMware

ESXi/vSphere Support for Linux Guest OSs

Supported Linux OSs are listed in the “VMware Compatibility Guide”: https://www.vmware.com/resources/compatibility/search.php?deviceCategory=software

Horizon Support for Linux Guest OSs

The versions and distributions supported by Horizon are listed in the FAQ for the appropriate release e.g. for Horizon 7, here: http://www.vmware.com/content/dam/digitalmarketing/vmware/en/pdf/products/horizon/vmware-horizon-for-linux-faq.pdf