Citrix Linux VDA now supports Ubuntu 16.04

Last year I wrote a blog on how to find out which Linux distributions are supported by VMware/Citrix. At the time I struggled to find some of the Citrix info as there wasn’t a master list in their documentation. With the recent 7.12 XenDesktop release this has changed, and there’s now a nice clear list in the System Requirements documentation (at the time of writing, for 7.12), which reads:

  • SUSE Linux Enterprise:
    • Desktop 12 Service Pack 1
    • Server 11 Service Pack 4
    • Server 12 Service Pack 1
  • Red Hat Enterprise Linux
    • Workstation 6.8
    • Workstation 7.2
    • Server 6.8
    • Server 7.2
  • CentOS Linux
    • CentOS 6.8
    • CentOS 7.2
  • Ubuntu Linux
    • Ubuntu Desktop 16.04
    • Ubuntu Server 16.04

It’s great to see 7.12 add support for the Ubuntu OS for Citrix users. It is important that you use a supported _version_ of the OS, not just a supported distribution, if you want to remain supported. There’s a really good overview of this addition and other details of the latest Linux VDA from the Citrix Product Manager for the product, Vipin Borkar, on the Citrix blog – worth a read, here.

VMware Linux VDA Support

For VMware there is similar documentation linked to from their Linux VDA home page in the “Horizon 7 for Linux FAQ”.

  • Which flavors of Linux are supported in the first release of Horizon 7 for Linux?
  • Ubuntu 12.04 and 14.04, Red Hat Enterprise Linux (RHEL) 6.6 and 7.1, CentOS 6.6, and NeoKylin 6 Update 1 (Chinese), SUSE Linux Enterprise Desktop 11 SP3 are supported with Horizon 7 for Linux.

 

NVIDIA GRID Support

NVIDIA GRID vGPU technologies also support some Linux OS versions and distributions. These are a subset of those supported by VMware and Citrix, so as well as using an OS supported by the Linux VDA in use, you also need to check that the version is supported by the vGPU technologies. The OS versions and distributions supported for each hypervisor are listed in the release notes for the driver for that hypervisor. These are available in the driver download and have also been added to NVIDIA’s knowledge base for certain releases.

If you are mixing vendors for VDI and hypervisor, e.g. Citrix XenDesktop on VMware ESXi, you will also want to double-check that the hypervisor and Linux VDA support matrices overlap.
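The overlap check can be sketched as a simple set intersection. The Citrix list below is the 7.12 list quoted earlier in this post; the hypervisor and vGPU lists are invented placeholders purely for illustration (they are not real support matrices), so fill them in from the relevant vendor documents before relying on the result.

```python
# Sketch: find guest OSs supported by every layer of the stack.
# CITRIX_VDA_7_12 comes from the list quoted earlier in this post.
# EXAMPLE_HYPERVISOR_GUESTS and EXAMPLE_VGPU_GUESTS are PLACEHOLDERS,
# not real support matrices; replace them with the vendors' own lists.

CITRIX_VDA_7_12 = {
    "SUSE Linux Enterprise Desktop 12 SP1",
    "SUSE Linux Enterprise Server 11 SP4",
    "SUSE Linux Enterprise Server 12 SP1",
    "RHEL Workstation 6.8",
    "RHEL Workstation 7.2",
    "RHEL Server 6.8",
    "RHEL Server 7.2",
    "CentOS 6.8",
    "CentOS 7.2",
    "Ubuntu Desktop 16.04",
    "Ubuntu Server 16.04",
}

EXAMPLE_HYPERVISOR_GUESTS = {  # hypothetical
    "RHEL Server 7.2", "CentOS 7.2", "Ubuntu Server 16.04", "Ubuntu Server 14.04",
}
EXAMPLE_VGPU_GUESTS = {  # hypothetical
    "RHEL Server 7.2", "CentOS 7.2", "Ubuntu Server 14.04",
}


def supported_everywhere(*matrices):
    """Return the sorted list of OS names present in every support matrix."""
    if not matrices:
        return []
    return sorted(set.intersection(*(set(m) for m in matrices)))


if __name__ == "__main__":
    for os_name in supported_everywhere(
        CITRIX_VDA_7_12, EXAMPLE_HYPERVISOR_GUESTS, EXAMPLE_VGPU_GUESTS
    ):
        print(os_name)
```

Exact string matching is deliberately strict here: "Ubuntu Server 16.04" and "Ubuntu 16.04" would not match, which is a reminder to normalise the names when copying from different vendors’ documents.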

 

XenServer Support for Linux Guest OSs

This is documented in the “Citrix XenServer® Virtual Machine User’s Guide” for the relevant version of XenServer e.g. for 7.0, here: http://docs.citrix.com/content/dam/docs/en-us/xenserver/xenserver-7-0/downloads/xenserver-7-0-vm-users-guide.pdf

 

ESXi/vSphere Support for Linux Guest OSs

Supported Linux OSs are listed in the “VMware Compatibility Guide”: https://www.vmware.com/resources/compatibility/search.php?deviceCategory=software

 

VMware ESXi announce High Availability (HA) for NVIDIA GRID vGPU VMs with vSphere 6.5

I was very pleased to see Pat Lee from VMware’s PM team tweet about this yesterday…

It’s something we knew VMware had added to vSphere 2016, which is supported in the GRID 4.1 (Nov 2016) release. As a VMware-implemented feature, this was something we at NVIDIA had to wait for them to announce. I think there have been a few problems with the documentation update staging, which is why this has been a rather quiet feature release. I’ll update this blog with links to the documentation when it becomes available, which should be soon!

But since Pat has let the cat out of the bag… it’s probably best to answer a few basic questions straight away.

What is High Availability (HA)?

Basic HA is a feature to ensure VMs are up and running as soon as possible in the event of host failure. The VM will automatically restart on another host if one is available with sufficient resources. For vGPU-enabled VMs that means a host with an appropriate GPU etc. Although the user will experience some down-time, this is minimized where possible, without the need for manual intervention by a system administrator.

Guaranteed High Availability…

This can be provided by HA features that allow resources such as RAM/CPU to be reserved on hosts, e.g. maybe 15% of a host’s capacity, which guarantees that resources will be available to restart VMs up to a certain number of host failures. I believe that VMware’s configuration does not extend to configuring GPU resource reservation, and so the support announced today will not offer guaranteed HA. It is a feature VMware could add in the future if they saw sufficient demand; it is not a feature engineered by NVIDIA.
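As a rough illustration of how a reservation translates into a number of tolerated host failures, here is a minimal sketch. It assumes identical hosts, each loaded right up to its unreserved capacity, and load that can be spread freely across survivors. This is a simplified model of my own for illustration, not VMware’s admission control algorithm, and the function name is hypothetical.

```python
def tolerable_host_failures(hosts: int, reserved_percent: int) -> int:
    """Host failures a cluster can absorb under a simplified reservation model.

    Assumptions (illustrative only): all hosts are identical, each runs
    (100 - reserved_percent)% load, and displaced VMs can be split freely
    across the surviving hosts.
    """
    if hosts < 1 or not 0 <= reserved_percent <= 100:
        raise ValueError("need hosts >= 1 and 0 <= reserved_percent <= 100")
    # f failed hosts displace f * (100 - p) units of load; the survivors
    # offer (hosts - f) * p units of spare capacity. The inequality
    # f * (100 - p) <= (hosts - f) * p simplifies to 100 * f <= hosts * p.
    failures = (hosts * reserved_percent) // 100
    # At least one host must survive to restart anything.
    return min(hosts - 1, failures)


if __name__ == "__main__":
    for n in (4, 8, 20):
        print(n, "hosts, 15% reserved ->", tolerable_host_failures(n, 15), "failure(s)")
```

Under these assumptions a 15% reservation only covers a full host failure once the cluster has at least seven hosts, which is why reservation percentages are usually sized against cluster size rather than picked in isolation.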

Can HA provide continual up-time?

No, not alone. Many hypervisors, though, offer Fault Tolerance (FT), which can provide such support. This is a very expensive feature to use as it relies on running what is essentially a duplicate VM on mirrored hardware, phase-locked to the original (i.e. milliseconds behind). In the event of failure the user is switched to the duplicate with only a momentary glitch in user experience. It’s a feature used in only a few safety / mission critical use cases as it’s so costly to implement.

So is Fault Tolerance (FT) supported for vGPU?

No, not today. The technology to continually snapshot a live GPU is not available. This is also a pre-requisite for live migration/motion (e.g. vMotion) and for regular snapshots.

The Future

NVIDIA and all the partners such as Citrix and VMware appreciate that live motion and snapshotting are key enterprise datacenter needs so we continue to work towards making such technology happen (it’s very technically hard I’m told!). We all know what you want and what you want our priorities to be!!!

NVIDIA GRID is architected with a software model which gives us the ability to add support for new OSs on customers’ existing hardware, allowing them to pick up new features.

NVIDIA GRID vGPU: 512MB profiles, Win 10, framebuffer – new support article

I’ve been a bit quiet on the blogging front over the last month or so – a long vacation coupled with a slight role change – but also working on some internal support documentation around 512MB profiles which is now published.

As ever, the NVIDIA Enterprise Support KB articles are living documents and you should check the main document here: http://nvidia.custhelp.com/app/answers/detail/a_id/4238/ at the time of reading for up-to-date NVIDIA sanctioned advice.

However I thought it would be nice to highlight the availability and reproduce the article as it stands today. The search facility on the KB system is reasonably good and it’s always worth checking for new articles and searching for answers: http://nvidia.custhelp.com/app/home/.

The latest article gathers together some support and technical advice around 512MB profiles, including some additional advice on the use of small profiles with Win 10, as well as links to VMware and Citrix best practice on working with Win 10.

The KB article as published is below. It was the work of a number of people within NVIDIA but also incorporated feedback and work with our NGCA (NVIDIA GRID Community Advisors), especially Rasmus Raun-Nielsen, whose customer review and perspective were particularly helpful.

NVIDIA GRID vGPU: Memory exhaustion can occur with vGPU profiles that have 512 Mbytes or less of framebuffer

Symptom/Error

Symptoms and errors that may occur:

  • When playing video content (full screen 1080p) in a browser the session hangs and session reconnect fails on 512MB profiles.
  • This issue typically occurs when multiple display heads are used with Citrix XenDesktop or VMware Horizon on a Windows 10 guest VM.
    • When this error occurs, the NVIDIA host driver reports Xid error 31 and Xid error 43 in XenServer’s /var/log/messages file.
    • On VMware, when this error occurs, the NVIDIA host driver reports Xid error 31 and Xid error 43 in the VMware vSphere log file vmware.log in the guest VM’s storage directory.
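A quick way to check a host log for these two Xid codes is sketched below. The regular expression assumes the message shape commonly seen from the NVIDIA Linux driver, something like "NVRM: Xid (PCI:0000:05:00): 31, ..."; that shape is an assumption, so confirm it against a real line from your own /var/log/messages or vmware.log before trusting the result. The sample lines are invented for illustration.

```python
import re

# ASSUMED message shape: "... NVRM: Xid (PCI:0000:05:00): 31, ..."
# Verify against your own logs; adjust the pattern if it differs.
XID_PATTERN = re.compile(r"Xid \([^)]*\): (\d+)")

# The two Xid codes named in the KB article.
WATCHED_XIDS = frozenset({31, 43})


def find_watched_xids(lines, watched=WATCHED_XIDS):
    """Return (xid, line) pairs for log lines reporting a watched Xid code."""
    hits = []
    for line in lines:
        match = XID_PATTERN.search(line)
        if match and int(match.group(1)) in watched:
            hits.append((int(match.group(1)), line.rstrip()))
    return hits


if __name__ == "__main__":
    # Invented sample lines, for illustration only.
    sample = [
        "Nov 20 10:01:02 host kernel: NVRM: Xid (PCI:0000:05:00): 31, example detail",
        "Nov 20 10:01:03 host kernel: NVRM: Xid (PCI:0000:05:00): 13, example detail",
        "Nov 20 10:01:04 host kernel: NVRM: Xid (PCI:0000:05:00): 43, example detail",
        "Nov 20 10:01:05 host kernel: unrelated message",
    ]
    for xid, line in find_watched_xids(sample):
        print(xid, "|", line)
```

To scan a real file, pass an open file object, e.g. `find_watched_xids(open("/var/log/messages"))` on XenServer.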

Root Cause

There is a known issue associated with changes in the way recent Microsoft Operating systems handle and allow access to overprovisioning messages and errors. NVIDIA is working with Microsoft closely to resolve these issues ongoing. Users with correctly provisioned systems should not encounter issues. As such users need to take care to ensure there is sufficient frame buffer to support their uses.

512MB is a very small framebuffer and as such users should be aware that the multiple demands made in a virtualized environment can lead to memory exhaustion. Uses that place demand on the framebuffer:

  • Use of more recent Microsoft OSs that place more demand on the framebuffer e.g. using Windows 10 rather than Windows 7. Windows 10 demands far more resources
  • Use of multiple monitors
  • Use of higher resolution monitors
  • Use of the framebuffer for hardware protocol encode (NVENC) – to reduce the probability of users encountering issues NVENC has been disabled for 512MB in the GRID 4.0 (August 2016) release for protocols such as Blast Extreme (VMware) and Citrix HDX/ICA.
  • Frame buffer intensive applications

Documentation

This issue is documented in the driver release notes for the GRID 4.0 (August 2016) release. Customers are advised to always read the known and resolved issues lists contained within the driver release notes for each release for their hypervisor (links below). For Citrix XenServer the release notes state:

Memory exhaustion can occur with vGPU profiles that have 512 Mbytes or less of framebuffer. This issue typically occurs when multiple display heads are used with Citrix XenDesktop or VMware Horizon on a Windows 10 guest VM. When this error occurs, the NVIDIA host driver reports Xid error 31 and Xid error 43 in XenServer’s /var/log/messages file.

The following vGPU profiles have 512 Mbytes or less of frame buffer:

  • Tesla M6-0B, M6-0Q
  • Tesla M10-0B, M10-0Q
  • Tesla M60-0B, M60-0Q
  • GRID K100, K120Q
  • GRID K200, K220Q

GRID Release notes for Citrix XenServer: http://us.download.nvidia.com/Windows/Quadro_Certified/GRID/369.17/XenServer-6.5/367.43-369.17-nvidia-grid-vgpu-release-notes.pdf

GRID Release notes for VMware ESXi: http://us.download.nvidia.com/Windows/Quadro_Certified/GRID/369.17/ESXi-6.0/367.43-369.17-nvidia-grid-vgpu-release-notes.pdf

Verifying framebuffer usage

Users can follow the advice in this article on monitoring NVIDIA GRID framebuffer usage (http://nvidia.custhelp.com/app/answers/detail/a_id/4108/~/monitoring-the-framebuffer-for-nvidia-grid-vgpu-and-gpu-passthrough) to help them correctly size their environment with respect to framebuffer usage and so avoid memory exhaustion. The advice is also useful for assessing whether issues users are encountering are actually caused by memory exhaustion.
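As one possible way to take such measurements, the sketch below parses the CSV output of nvidia-smi’s `--query-gpu=memory.used,memory.total` query and flags GPUs that are close to full. It assumes nvidia-smi is available wherever you are measuring; the 90% threshold is an arbitrary illustrative value, not NVIDIA guidance, and the helper names are hypothetical. Follow the linked KB article for the supported method.

```python
import csv
import io

# Command whose output the parser below expects (values in MiB, no header).
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def framebuffer_usage(csv_text):
    """Parse 'used, total' rows into a list of (used_mib, total_mib) tuples."""
    rows = []
    for row in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if not row:
            continue  # skip blank lines
        used, total = row
        rows.append((int(used), int(total)))
    return rows


def over_threshold(rows, threshold=0.9):
    """Return one bool per GPU: True if used/total is at or above threshold.

    The 0.9 default is an arbitrary example value.
    """
    return [used / total >= threshold for used, total in rows]


if __name__ == "__main__":
    # Invented sample output for two 512 MiB framebuffers.
    rows = framebuffer_usage("412, 512\n470, 512\n")
    print(list(zip(rows, over_threshold(rows))))
```

For live data you could feed it `subprocess.run(QUERY, capture_output=True, text=True).stdout`, sampling at intervals while users run a representative workload.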

Consequences

To minimize the risk of users encountering memory exhaustion, NVIDIA have disabled NVENC on 512MB profiles with the GRID 4.0 (August 2016) release. Application GPU acceleration remains fully supported and available with all profiles including 512MB. NVENC support from both Citrix and VMware is a recent feature and as such the majority of users on older versions should encounter no change in functionality.

Workarounds and Solutions

Windows 10

Microsoft Windows 10 has significantly increased the demands upon graphical resources such as GPU framebuffer compared with older OS releases, as well as on other non-graphical system resources. As such, both Citrix and VMware have published tools and configuration advice on how users can reduce resource usage. Customers using Windows 10 are encouraged to consider following the advice from their virtualization vendors.

  • VMware: VMware have provided an OS optimization tool for Horizon View which can make and apply optimization recommendations for Windows 10 and other OSs. Users of Citrix/other virtualization stacks may find this tool useful for the recommendations it makes even if they cannot then use the automated configuration tools. The tool can be found here: https://labs.vmware.com/flings/vmware-os-optimization-tool
  • Citrix: Citrix consultant Daniel Feller has published a number of articles on Windows 10 best practice and configuration many of which will also be relevant to VMware / other virtualization stacks. See: https://virtualfeller.com/?s=windows+10+optimization

Some users will find that a 512MB profile is inappropriate for their Windows 10 workload and that a 1GB profile is more appropriate.

Support

NVIDIA customers with support who believe they are encountering issues as a result of frame buffer memory exhaustion should raise a support case with NVIDIA Enterprise Support via https://nvidia-esp.custhelp.com and can reference issue #200130864.

Applicable products

NVIDIA GRID vGPU

GRID GPUs including M60, M6, M10, K1, K2

VMware Horizon and ESXi

Citrix XenDesktop and XenServer

Users are most likely to encounter this issue if using:

  • heavy graphical or video workloads
  • recent more graphically intensive Microsoft OSs e.g. Windows 10 rather than Windows 7
  • small framebuffers e.g. 512MB
  • Remoting protocols leveraging NVIDIA NVENC hardware encode e.g. recent versions of Citrix HDX/ICA or VMware Blast Extreme

Disclaimers

This Web site contains links to Web sites and third-party tools controlled by parties other than NVIDIA. NVIDIA is not responsible for and does not endorse or accept any responsibility for the contents or use of these third party Web sites or tools. NVIDIA is providing these links to you only as a convenience, and the inclusion of any link does not imply endorsement by NVIDIA of the linked Web site. It is your responsibility to take precautions to ensure that whatever tools or information you select for your use is free of viruses or other items of a destructive nature.

NVIDIA GRID: More info on vApps and VPC/vWS Licensing

Check out Luke Wignall’s blog on NVIDIA GRID licensing and other GRID topics!

A few weeks ago I wrote a blog on RDSH (including XenApp) licensing and the options available with NVIDIA GRID vGPU and GPU-passthrough, including support for multi-monitor and resolutions, which you can read here. Since then my colleague Luke has added some more information in a blog where he outlines various case studies including many on vApps, which is worth a read here:

Luke answers how many licenses and what type you will need for various use cases, answering questions such as:

  • Q: I am deploying Citrix XenDesktop for 5000 global users, using two data centers, to meet a follow the sun productivity goal. The data centers are also backup sites to each other. I expect at most 1200 users at each of our three regional areas to be on during their workday, connecting to their closest data center, but there is some overlap (people working late or starting early) so I am architecting with a buffer for a total of 1500 virtual desktops. I need to be able to run all users from either data center if one should go down. My users are all engineers and their apps require Quadro.
  • Q:  I am deploying virtual desktops but using XenApp to do so, and am looking for improved end user experience, for 1000 users.  At any given time I expect no more than 850 users to be connected.  I have no other desktop delivery method.
  • Q: I chose to run XenApp on a bare metal host, so no hypervisor (I would question the decision to forgo the flexibility and manageability of virtualization), delivering three Microsoft Office applications. I have 500 users but expect no more than 350 of them to be connected at any given time. I have no virtual desktops for these users.
  • Q:  I have 250 engineers using CATIA and similar apps, they must have Quadro drivers, but usually only 200 of them are working at any given time.  I also have 1000 knowledge workers that range from sales to support, their apps do not need Quadro but perform much better with GPU (=happy users), of those I typically see 800 actively on their desktops.  I am deploying VMware Horizon.  We have a set of web apps that all 1250 employees use for time keeping, expenses, and safety training, these I am delivering with XenApp.

 

There is a lot of information on GRID licensing in our knowledge base – just search on “GRID licensing” on our KB home page here:

Highlights include:

Licensing Documentation:

Of course, among the best references are the official licensing guides on the GRID resources page (under deployment guides) here: http://www.nvidia.com/object/grid-enterprise-resources.html. In particular these two are useful:

 

Questions

Any questions – ask below or on the support NVIDIA GRID forums at https://gridforums.nvidia.com

NVIDIA GRID – RDSH licensing (including XenApp)

I’ve had a few questions about what licensing is needed under the GRID 2.0 and up software licensing for the M60/M10/M6 GPUs for RDSH solutions such as XenApp. I think the confusion arises because it’s possible to use a number of different GPU/vGPU profiles for a server OS VM. The key point to remember is that the licensing is always per user.

Significant leaps in virtualized NVIDIA vGPU monitoring

Read the documentation – the User Guide provided alongside the management SDK is really comprehensive!

Today NVIDIA announced a new monitoring SDK / API incorporated into its GRID vGPU products as part of their GRID August 2016 (4.0) release. This will be available from Friday 26th August 2016 as a software release for existing hardware, greatly enhancing the functionality for existing as well as new customers. (You can read the announcement here).

NVIDIA has broken ranks with traditional hardware-only GPU models and recognized that enterprises need software to manage and monitor GPUs as a component of the data centre. Software licensing has enabled existing customers to benefit from new features with fully supported software, directly supported by NVIDIA (you wouldn’t run your Microsoft OS or CAD software unsupported!).

Optimising TCP for Citrix HDX/ICA including Netscaler

Marius Sandbu – NGCA (NVIDIA GRID Community Advisor)  aka Clever Viking!

The TCP implementation within the Citrix HDX/ICA protocol used by XenDesktop and XenApp, and also in Citrix Netscaler, is a pretty vanilla implementation of the original TCP/IP standards and definition, and the out-of-the-box configuration usually does a good job on LAN. However, for WAN scenarios, particularly with higher latencies and certain kinds of data (file transfers), Citrix deployments can benefit greatly from some tuning.

 

One of our new NGCAs (NVIDIA GRID Community Advisors), Marius Sandbu, has written a must-read blog on how to optimize TCP with a Citrix Netscaler in the equation: http://msandbu.org/tag/netscaler-tcp-profile/. Marius highlights some of the configuration optimisations hidden away in the Netscaler documentation and you’ll probably want to refer to that documentation too (https://docs.citrix.com/en-us/netscaler/11-1/system/TCP_Congestion_Control_and_Optimization_General.html).

Citrix HDX TCP is not optimized for many WAN scenarios but at the moment it can also be tuned manually following this advice: CTX125027 – How to Optimize HDX Bandwidth Over High Latency Connections. This is one configuration I’d love to see Citrix automate as having to tune and configure the receiver is fiddly and also not possible in organisations/scenarios where the end-points and server/network infrastructure might be provided by different teams or even companies (e.g. IaaS).

 

For Citrix NVIDIA GRID vGPU customers looking at high network latency scenarios, it really is worth investigating the potential benefits of TCP window tuning. I’d be really interested to hear feedback if you have tried this and what your experience / thoughts are too!
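To see why the window size matters on high-latency links, here is a small sketch of the standard bandwidth-delay product arithmetic: a single TCP connection cannot move more than one window of data per round trip. The 65,535-byte figure is the classic TCP maximum without window scaling and is used purely as an example; I am not claiming it is what any particular Citrix component uses by default.

```python
def max_throughput_mbps(window_bytes: float, rtt_ms: float) -> float:
    """Upper bound on single-connection throughput: one window per round trip."""
    # bits per millisecond, divided by 1000, gives Mbit/s
    return window_bytes * 8 / rtt_ms / 1000


def window_needed_bytes(bandwidth_mbps: float, rtt_ms: float) -> float:
    """Bandwidth-delay product: bytes in flight needed to fill the link."""
    # Mbit/s * ms = kilobits, so multiply by 1000 for bits, divide by 8 for bytes
    return bandwidth_mbps * 1000 * rtt_ms / 8


if __name__ == "__main__":
    # Example values only: a 65,535-byte window over a 100 ms round trip.
    print(f"{max_throughput_mbps(65535, 100):.2f} Mbit/s ceiling")
    # Window needed to fill a 10 Mbit/s link at 200 ms round trip.
    print(f"{window_needed_bytes(10, 200):.0f} bytes needed")
```

With these example numbers the ceiling is roughly 5.2 Mbit/s however fast the underlying link is, and filling a 10 Mbit/s link at 200 ms needs about 250,000 bytes in flight, which is the gap that window tuning is trying to close.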

 

Marius Sandbu, who is Norwegian, was recently awarded NGCA status by NVIDIA for his work with our community, drawing on his experience with Netscaler, remoting protocols and technologies such as UDP and TCP/IP. You can follow him on twitter @msandbu and of course do follow his excellent blog on http://msandbu.org/ !!!