NVIDIA GRID vGPU: 512MB profiles, Win 10, framebuffer – new support article

I’ve been a bit quiet on the blogging front over the last month or so – a long vacation coupled with a slight role change – but also working on some internal support documentation around 512MB profiles which is now published.

As ever, NVIDIA Enterprise Support KB articles are living documents, and you should check the main document at http://nvidia.custhelp.com/app/answers/detail/a_id/4238/ at the time of reading for up-to-date, NVIDIA-sanctioned advice.

However I thought it would be nice to highlight the availability and reproduce the article as it stands today. The search facility on the KB system is reasonably good and it’s always worth checking for new articles and searching for answers: http://nvidia.custhelp.com/app/home/.

The latest article gathers together support and technical advice around 512MB profiles, including some additional advice on the use of small profiles with Win 10, as well as links to VMware and Citrix best practice on working with Win 10.

The KB article as published is below. It was the work of a number of people within NVIDIA, and it also incorporated feedback from and work with our NGCA (NVIDIA GRID Community Advisors), especially Rasmus Raun-Nielsen, whose customer review and perspective were particularly helpful.

NVIDIA GRID vGPU: Memory exhaustion can occur with vGPU profiles that have 512 Mbytes or less of framebuffer

Symptom/Error

Symptoms and errors that may occur:

  • When playing video content (full screen 1080p) in a browser the session hangs and session reconnect fails on 512MB profiles.
  • This issue typically occurs when multiple display heads are used with Citrix XenDesktop or VMware Horizon on a Windows 10 guest VM.
    • On Citrix XenServer, when this error occurs, the NVIDIA host driver reports Xid error 31 and Xid error 43 in XenServer’s /var/log/messages file.
    • On VMware vSphere, when this error occurs, the NVIDIA host driver reports Xid error 31 and Xid error 43 in the log file vmware.log in the guest VM’s storage directory.
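
As a rough aid for checking either log for these errors, the following minimal Python sketch counts occurrences of Xid 31 and Xid 43. It is not part of the NVIDIA article. The regular expression assumes the Xid lines look roughly like "NVRM: Xid (PCI:0000:03:00): 31, ...", which may differ between driver releases and hypervisors, and the function names are hypothetical.

```python
import re
from collections import Counter

# Assumption: Xid lines look like "NVRM: Xid (PCI:0000:03:00): 31, ...".
# Adjust the pattern if your driver or hypervisor formats them differently.
XID_PATTERN = re.compile(r"Xid \(([^)]*)\):\s*(\d+)")

# The two Xid codes named in the symptoms above.
WATCHED_XIDS = frozenset({31, 43})


def count_xids(lines, watched=WATCHED_XIDS):
    """Count watched Xid codes in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = XID_PATTERN.search(line)
        if match is None:
            continue
        xid = int(match.group(2))
        if xid in watched:
            counts[xid] += 1
    return counts


def count_xids_in_file(path):
    """Open a log file (for example /var/log/messages or vmware.log) and count Xids."""
    with open(path, errors="replace") as log:
        return count_xids(log)
```

Calling count_xids_in_file("/var/log/messages") on a XenServer host, or pointing it at a copy of the VM's vmware.log, returns a Counter keyed by Xid code. A non-empty result suggests the symptom described above, but it does not by itself prove that memory exhaustion was the cause.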

Root Cause

There is a known issue associated with changes in the way recent Microsoft operating systems handle, and allow access to, overprovisioning messages and errors. NVIDIA is working closely with Microsoft on an ongoing basis to resolve these issues. Users with correctly provisioned systems should not encounter issues, so users need to take care to ensure there is sufficient framebuffer to support their use cases.

512MB is a very small framebuffer and as such users should be aware that the multiple demands made in a virtualized environment can lead to memory exhaustion. Uses that place demand on the framebuffer:

  • Use of more recent Microsoft OSs that place more demand on the framebuffer, e.g. Windows 10 rather than Windows 7; Windows 10 demands far more resources
  • Use of multiple monitors
  • Use of higher resolution monitors
  • Use of the framebuffer for hardware protocol encode (NVENC) – to reduce the probability of users encountering issues NVENC has been disabled for 512MB in the GRID 4.0 (August 2016) release for protocols such as Blast Extreme (VMware) and Citrix HDX/ICA.
  • Frame buffer intensive applications

Documentation

This issue is documented in the driver release notes for the GRID 4.0 (August 2016) release. Customers are advised to always read the known and resolved issues lists in the driver release notes for each release for their hypervisor (links below). For Citrix XenServer, the release notes state:

Memory exhaustion can occur with vGPU profiles that have 512 Mbytes or less of framebuffer. This issue typically occurs when multiple display heads are used with Citrix XenDesktop or VMware Horizon on a Windows 10 guest VM. When this error occurs, the NVIDIA host driver reports Xid error 31 and Xid error 43 in XenServer’s /var/log/messages file.

The following vGPU profiles have 512 Mbytes or less of frame buffer:

  • Tesla M6-0B, M6-0Q
  • Tesla M10-0B, M10-0Q
  • Tesla M60-0B, M60-0Q
  • GRID K100, K120Q
  • GRID K200, K220Q

GRID Release notes for Citrix XenServer: http://us.download.nvidia.com/Windows/Quadro_Certified/GRID/369.17/XenServer-6.5/367.43-369.17-nvidia-grid-vgpu-release-notes.pdf

GRID Release notes for VMware ESXi: http://us.download.nvidia.com/Windows/Quadro_Certified/GRID/369.17/ESXi-6.0/367.43-369.17-nvidia-grid-vgpu-release-notes.pdf

Verifying framebuffer usage

Users can follow the advice in this article on monitoring NVIDIA GRID framebuffer usage (http://nvidia.custhelp.com/app/answers/detail/a_id/4108/~/monitoring-the-framebuffer-for-nvidia-grid-vgpu-and-gpu-passthrough) to help size their environment correctly with respect to framebuffer usage and so avoid memory exhaustion. The advice is also useful for assessing whether the issues users are encountering are actually caused by memory exhaustion.
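
As one illustration of this kind of check, the sketch below reads framebuffer usage through nvidia-smi's CSV query output and flags readings close to the limit. It is not taken from the linked article. It assumes nvidia-smi is on the PATH and reports memory.used and memory.total in the environment where it is run, which may not hold for every vGPU guest or driver release. The 90% threshold and the function names are hypothetical choices for illustration only.

```python
import subprocess

# nvidia-smi query returning "used, total" in MiB, one row per GPU.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_csv(text):
    """Parse rows such as '460, 512' into (used_mib, total_mib) tuples.

    Rows that are not two integers (for example unsupported fields) are skipped.
    """
    rows = []
    for line in text.strip().splitlines():
        fields = [field.strip() for field in line.split(",")]
        try:
            used, total = (int(field) for field in fields)
        except ValueError:
            continue
        rows.append((used, total))
    return rows


def is_near_exhaustion(used_mib, total_mib, threshold=0.9):
    """Return True if usage is at or above the (hypothetical) threshold."""
    return used_mib / total_mib >= threshold


def read_framebuffer_usage():
    """Run nvidia-smi and return parsed (used_mib, total_mib) rows."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_csv(result.stdout)
```

Sampling read_framebuffer_usage() periodically while users run their normal workload gives a rough picture of whether a 512MB profile leaves any headroom. A single reading says little, since usage varies with the number of monitors, resolution and the applications in use.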

Consequences

To minimize the risk of users encountering memory exhaustion, NVIDIA has disabled NVENC on 512MB profiles as of the GRID 4.0 (August 2016) release. Application GPU acceleration remains fully supported and available with all profiles, including 512MB. NVENC support from both Citrix and VMware is a recent feature, so the majority of users on older versions should encounter no change in functionality.

Workarounds and Solutions

Windows 10

Microsoft Windows 10 places significantly greater demands on graphical resources such as GPU framebuffer than older OS releases do, as well as on other non-graphical system resources. Both Citrix and VMware have therefore published tools and configuration advice on how users can reduce resource usage. Customers using Windows 10 are encouraged to consider following the advice from their virtualization vendors.

  • VMware: VMware have provided an OS optimization tool for Horizon View which can make and apply optimization recommendations for Windows 10 and other OSs. Users of Citrix or other virtualization stacks may find this tool useful for the recommendations it makes, even if they cannot then use the automated configuration tools. The tool can be found here: https://labs.vmware.com/flings/vmware-os-optimization-tool
  • Citrix: Citrix consultant Daniel Feller has published a number of articles on Windows 10 best practice and configuration, many of which will also be relevant to VMware and other virtualization stacks. See: https://virtualfeller.com/?s=windows+10+optimization

Some users will find that a 512MB profile is inappropriate for their Windows 10 workload and that a 1GB profile is more appropriate.

Support

NVIDIA customers with support who believe they are encountering issues as a result of frame buffer memory exhaustion should raise a support case with NVIDIA Enterprise Support via https://nvidia-esp.custhelp.com and can reference issue #200130864.

Applicable products

NVIDIA GRID vGPU

GRID GPUs including M60, M6, M10, K1, K2

VMware Horizon and ESXi

Citrix XenDesktop and XenServer

Users are most likely to encounter this issue if using:

  • heavy graphical or video workloads
  • recent more graphically intensive Microsoft OSs e.g. Windows 10 rather than Windows 7
  • small framebuffers e.g. 512MB
  • Remoting protocols leveraging NVIDIA NVENC hardware encode e.g. recent versions of Citrix HDX/ICA or VMware Blast Extreme

Disclaimers

This Web site contains links to Web sites and third-party tools controlled by parties other than NVIDIA. NVIDIA is not responsible for and does not endorse or accept any responsibility for the contents or use of these third party Web sites or tools. NVIDIA is providing these links to you only as a convenience, and the inclusion of any link does not imply endorsement by NVIDIA of the linked Web site. It is your responsibility to take precautions to ensure that whatever tools or information you select for your use is free of viruses or other items of a destructive nature.
