Update (24th Feb 2016): Jason Southern has published an overview and how-to-guide on this tool: https://www.youtube.com/watch?v=VAQhiNNFXxQ&feature=youtu.be
I had an enquiry this week asking what the “modeswitch tool” is and when a user should use it. A GRID 1.0 K1/K2 user had been to a demo of the new GRID 2.0 M60 GPUs. As he hadn’t got M60 boards yet, he didn’t have the tool and hadn’t downloaded it (so hadn’t read the very comprehensive documentation that comes with it, explaining when and why to use it), but he’d remembered some information from the demo and was trying to extrapolate that to his GRID 1.0 K1/K2 cards.
What is the mode switching tool?
gpumodeswitch is a command-line tool used to switch supported NVIDIA GPUs between compute and graphics mode. It’s a simple tool available to download for customers with GRID 2.0 products and the newer Tesla models (e.g. M60), which are GPUs designed for both graphics and HPC use cases. It allows customers to quickly reconfigure their hardware between the optimal configurations for either HPC (computation) or graphics (including virtualized).
The first-generation GRID cards (K1/K2) and the Quadro range at NVIDIA were designed for graphics usage and for interoperability with virtualization stacks involving hypervisors such as XenServer, vSphere etc., whilst the Tesla range of cards was designed for uses such as High Performance Computing (HPC). This wasn’t ideal:
- Some customers have both HPC and graphical needs. One of the virtues of virtualization is the ability to quickly repurpose hardware as your project and business needs change.
- It confused users. Over the years I’ve had several disappointed customers who bought an HPC server (usually from an ill-informed server reseller who thinks any GPU will do) when they were looking for GPUs to support graphical uses.
- The naming was complex: there were GPU families and boards. I remember when I wrote this blog (https://www.citrix.com/blogs/2014/01/14/xenserver-and-xendesktop-nvidia-support-for-gpu-passthrough-on-cards-including-the-k5000/) I included a table from NVIDIA’s site that confused me, as it showed that a Quadro board could be part of the Tesla family, and that some Tesla boards are Kepler family, not Tesla.
|GPU Family||Boards Supported|
- This table has moved on and thankfully the M60 is part of the Maxwell family (M = Maxwell family), rather than having GRID and Tesla boards as separate non-interchangeable use cases. There are no longer GRID boards within GRID 2.0, just Tesla M60s. Users can now buy their cards and use them in the future for HPC or graphics. This was the main technical driver in GRID 2.0 for moving vGPU licensing into software rather than enforcing the use case by dictating a separate physical GPU (e.g. K1/K2) for graphics.
Should I use the mode switcher with my GRID 1.0 (K1/K2) cards?
No, it’s not designed for them. GRID 1.0 cards such as the K1/K2 were shipped specifically for graphics, are configured for this, and are not optimized for HPC (for HPC a Tesla of that generation is more appropriate). If you have the tool, you will have downloaded it from the GRID 2.0 portal alongside a very comprehensive user guide that outlines exactly when you should use it and with which boards. If you use the tool, you must read this guide thoroughly. You should also be mindful that running benchmarks designed for HPC cards on GRID 1.0 K1/K2 cards is a poor indication of how they will perform for graphical use cases. Benchmarking cards on purely physical systems when you intend to virtualize is also unwise, as such benchmarks do not reveal the interoperability limitations imposed by hypervisors and operating systems: you would be looking at results from a physical system that would not be technically possible to reproduce in many virtualized stacks.
Why does the M60 (GRID 2.0) have mode switching available – technical / geek details? Why isn’t there a single optimal configuration that can be used for HPC and Graphics use cases?
A single optimal configuration, and simpler repurposing, is something we will continue to work towards. At the moment, though, the existing software and hardware carry a lot of legacy overhead. This applies not just to NVIDIA but also to HPC and graphical applications, operating systems, the server hardware, and particularly hypervisor layers and their evolving management of memory. We are to some extent limited by the many other parts of a GPU-enabled deployment, which involves a lot of technology supplied by third parties who have their own timescales and constraints for innovation and evolution.
If you have M60 cards and read the documentation, it notes that the default configuration on the M60 is compute mode. NVIDIA Tesla GPUs are shipped in a configuration optimized for high-performance compute (HPC) applications. While compute mode is optimal for HPC usage, it can cause compatibility problems with operating systems and hypervisors when the GPU is used primarily as a graphics device. The documentation gives details of how some hypervisors cannot support passthrough of GPUs with large memory BARs to guest virtual machines. To address these problems, certain NVIDIA Tesla GPUs (e.g. M60) support setting the GPU into graphics mode.
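To make the "large memory BAR" point a little more concrete, here is a minimal Python sketch that scans `lspci -v` style text for memory regions and flags any that exceed a size threshold. This is only an illustration and not part of gpumodeswitch or its documentation: the function names `bar_sizes` and `has_large_bar`, the 1 GiB threshold and both sample listings are my own assumptions, and the BAR sizes in the samples are made-up values rather than figures quoted from the M60 documentation.

```python
import re

# Multipliers for the size suffixes that lspci prints, e.g. [size=256M].
_UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}

# Matches lines such as:
#   Memory at f8000000 (32-bit, non-prefetchable) [size=16M]
_REGION = re.compile(r"Memory at \S+ \([^)]*\) \[size=(\d+)([KMGT]?)\]")


def bar_sizes(lspci_text):
    """Return the size in bytes of each memory region found in the text."""
    return [int(number) * _UNITS[unit]
            for number, unit in _REGION.findall(lspci_text)]


def has_large_bar(lspci_text, threshold=1024**3):
    """True if any memory region is larger than `threshold` bytes.

    The 1 GiB default is an assumed cut-off chosen for illustration only.
    """
    return any(size > threshold for size in bar_sizes(lspci_text))


# Hypothetical listings (illustrative values, not real M60 output).
SAMPLE_LARGE_BAR = """
Memory at f8000000 (32-bit, non-prefetchable) [size=16M]
Memory at 3b000000000 (64-bit, prefetchable) [size=8G]
Memory at 3b200000000 (64-bit, prefetchable) [size=32M]
"""

SAMPLE_SMALL_BAR = """
Memory at f8000000 (32-bit, non-prefetchable) [size=16M]
Memory at e0000000 (64-bit, prefetchable) [size=256M]
Memory at f0000000 (64-bit, prefetchable) [size=32M]
"""

if __name__ == "__main__":
    for name, text in [("large", SAMPLE_LARGE_BAR), ("small", SAMPLE_SMALL_BAR)]:
        print(name, bar_sizes(text), has_large_bar(text))
```

On a real Linux host you could feed the functions the output of `lspci -v` for the GPU in question, but whether a given BAR size is a problem depends on your hypervisor, so treat the result as a prompt to check the documentation rather than as a verdict.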
Read the documentation!
There is a very comprehensive table of advice as to when to use either compute or graphics mode in the documentation, as well as details of how the modes are changed by the tool and the configuration of the physical GPU in each mode that the tool can apply. This detail is useful for those who are keen on performing really stringent benchmarks and understanding application behaviour.
Basically you have to read the documentation for GRID 2.0 products like vGPU and follow the advice on using graphics mode as that feature is designed for that configuration currently. I have included a screenshot of the mode switching tool documentation above in this blog to remind you!
Update (5th Feb 2016): Answering a question this blog raised: Yes! When using “graphics” mode you will get compute functionality, such as CUDA and OpenCL. See my later blog: https://www.citrix.com/blogs/2014/04/23/opencl-and-nvidia-cuda-support-for-citrix-xenserver-xendesktop-and-xenapp/