Significant leaps in virtualized NVIDIA vGPU monitoring

Read the documentation – the User Guide provided alongside the management SDK is really comprehensive!

Today NVIDIA announced a new monitoring SDK / API incorporated into its GRID vGPU products as part of its GRID August 2016 (4.0) release. This will be available from Friday 26th August 2016 as a software release for existing hardware, greatly enhancing the functionality for existing as well as new customers. (You can read the announcement here).

NVIDIA has broken ranks with traditional hardware-only GPU models and recognized that enterprises need software to manage and monitor GPUs as a component of the data centre. Software licensing has enabled existing customers to benefit from new features with fully supported software, directly supported by NVIDIA (you wouldn’t run your Microsoft OS or CAD software unsupported!).

NVIDIA GRID – A Guide on GPU Metric Integration for Citrix XenServer

Just a quick blog aimed at those looking to develop GPU hypervisor monitoring products by integrating the NVIDIA GPU metrics exposed by XenServer via their APIs. Really it’s a bit of a guide as to where to find the information provided by Citrix.


Background

Two NVIDIA GPU technologies are available on Citrix XenServer:

  • GPU (PCIe) pass-through (including GPU-sharing for XenApp and VDI passthrough)
  • vGPU (shared GPU technologies)

With PCIe pass-through the hypervisor is bypassed and the VM itself obtains complete control of, and sole access to, the GPU. As a result, host- and hypervisor-level metrics are available neither to the NVIDIA SDK and APIs on the host nor to the hypervisor.

Developing a supported solution

Many Citrix customers insist on a monitoring solution being certified by the vendor via the Citrix Ready program. ISVs are advised to join the Citrix Ready program (access level is free) to obtain advice on developing a supported product and to eventually certify and market their product. In particular ISVs are recommended to evaluate the conditions of the vendor self-certification “kit” for supported products.

Whilst monitoring can be performed by inserting a kernel module or supplemental pack into XenServer’s dom0, Citrix generally will not support this mechanism, and customers are rarely willing to compromise their support agreements to use such products. ISVs are strongly advised to consider using the XenServer APIs and SDK to access metrics in a supported manner. See: https://www.citrix.com/partner-programs/citrix-ready/test.html (under XenServer-> Citrix XenServer (6.x) Integrated ISV Self-Certification Kit).

XenServer SDK / API

The XenServer API provides bindings for five languages: C, C#, Java, Python and PowerShell.

XenServer maintains a landing page for ISV developers: http://xenserver.org/partners/developing-products-for-xenserver.html

Additionally there is a developer (SDK) support forum where many XenServer staff answer questions: http://discussions.citrix.com/forum/1276-xenserver-sdk/

XenServer Metrics

XenServer captures metrics in RRDs. Details of the RRDs, code examples and information on how the XenServer SDK can be used to access the metrics are given on this landing page: http://xenserver.org/partners/developing-products-for-xenserver/18-sdk-development/96-xs-dev-rrds.html

XenServer have integrated many of the metrics available from NVIDIA’s NVML interface into their RRDs. This means customers can access the metrics via the XenServer APIs in a supported manner rather than inserting unsupported kernel modules to call NVML in the hypervisor’s host operating system (dom0). See https://www.citrix.com/blogs/2014/01/22/xenserverxendesktop-vgpu-new-metrics-available-to-monitor-nvidia-grid-gpus/
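
As a rough illustration of the supported route, here is a minimal Python sketch that builds a request URL for XenServer’s rrd_updates HTTP handler and picks the GPU series out of the XML it returns. The host name xs.example.com, the hand-written SAMPLE payload and the metric name in it are assumptions for illustration only, and the `start` and `host` query parameters should be checked against the RRD documentation linked above for your XenServer version. Authentication and the actual HTTP request are deliberately left out.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def rrd_updates_url(host, start, host_metrics=True):
    """Build a URL for the rrd_updates handler (assumed parameters: start, host)."""
    params = {"start": int(start)}
    if host_metrics:
        # Ask for host-level series as well as per-VM ones.
        params["host"] = "true"
    return "https://{}/rrd_updates?{}".format(host, urlencode(params))


def parse_rrd_updates(xml_text, name_filter="gpu_"):
    """Return {legend_entry: [(timestamp, value), ...]} for matching series."""
    root = ET.fromstring(xml_text.strip())
    # The legend lists one entry per column, in the same order as the <v> cells.
    legend = [entry.text for entry in root.findall("./meta/legend/entry")]
    series = {name: [] for name in legend if name_filter in name}
    for row in root.findall("./data/row"):
        timestamp = int(row.findtext("t"))
        values = [float(v.text) for v in row.findall("v")]
        for name, value in zip(legend, values):
            if name in series:
                series[name].append((timestamp, value))
    return series


# Hand-written sample in the shape of an rrd_updates export (illustrative only).
SAMPLE = """
<xport>
  <meta>
    <start>1472169600</start><step>5</step><end>1472169605</end>
    <rows>2</rows><columns>2</columns>
    <legend>
      <entry>AVERAGE:host:HOST-UUID:gpu_temperature_0000:05:00.0</entry>
      <entry>AVERAGE:host:HOST-UUID:cpu0</entry>
    </legend>
  </meta>
  <data>
    <row><t>1472169605</t><v>54.0</v><v>0.12</v></row>
    <row><t>1472169600</t><v>53.0</v><v>0.10</v></row>
  </data>
</xport>
"""

if __name__ == "__main__":
    print(rrd_updates_url("xs.example.com", 1472169600))
    for name, points in parse_rrd_updates(SAMPLE).items():
        print(name, points)
```

Keeping the parser separate from the fetch means a monitoring product can test it against captured payloads without needing a live host.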

XenServer APIs – querying GPU infrastructure:

For information on which VMs have vGPUs, the type of vGPU profile etc. see http://nvidia.custhelp.com/app/answers/detail/a_id/4117/kw/citrix%20monitoring under “Checking your GPU configuration” for links to appropriate XenServer documentation.
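
To give a flavour of what such a query might look like, the sketch below lists each vGPU with its VM name and vGPU type using the XenAPI classes VGPU, VM and VGPU_type. It is only a sketch: the `list_vgpus` helper and the `fake_session` stand-in (with its made-up VM names and profile values) are hypothetical, included so the code runs without a XenServer host. Against a real host you would pass a logged-in session from the SDK’s XenAPI Python module instead.

```python
from types import SimpleNamespace


def list_vgpus(session):
    """Return one dict per vGPU: VM name, vGPU type (profile) and attach state."""
    rows = []
    for _ref, vgpu in session.xenapi.VGPU.get_all_records().items():
        rows.append({
            "vm": session.xenapi.VM.get_name_label(vgpu["VM"]),
            "vgpu_type": session.xenapi.VGPU_type.get_model_name(vgpu["type"]),
            "attached": vgpu["currently_attached"],
        })
    return sorted(rows, key=lambda row: row["vm"])


def fake_session():
    """Hypothetical stand-in for a XenAPI session, with made-up records."""
    vgpus = {
        "OpaqueRef:vgpu-1": {"VM": "OpaqueRef:vm-1", "type": "OpaqueRef:type-1",
                             "currently_attached": True},
        "OpaqueRef:vgpu-2": {"VM": "OpaqueRef:vm-2", "type": "OpaqueRef:type-2",
                             "currently_attached": False},
    }
    vm_names = {"OpaqueRef:vm-1": "win7-cad-02", "OpaqueRef:vm-2": "win7-cad-01"}
    type_names = {"OpaqueRef:type-1": "GRID K140Q", "OpaqueRef:type-2": "GRID K240Q"}
    xenapi = SimpleNamespace(
        VGPU=SimpleNamespace(get_all_records=lambda: vgpus),
        VM=SimpleNamespace(get_name_label=lambda ref: vm_names[ref]),
        VGPU_type=SimpleNamespace(get_model_name=lambda ref: type_names[ref]),
    )
    return SimpleNamespace(xenapi=xenapi)


if __name__ == "__main__":
    # With a real host (assumed URL and credentials), roughly:
    #   import XenAPI
    #   session = XenAPI.Session("https://xs.example.com")
    #   session.xenapi.login_with_password("root", "password")
    for row in list_vgpus(fake_session()):
        print(row)
```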

 

Useful links:

 

More Lenovo Servers Support NVIDIA GPUs Including the M60

Lenovo have recently qualified and announced support for more NVIDIA GPUs for several servers including the x3650 M5 (E5-2600 v4); details can be found on Lenovo’s site, here:

Also recently listed is the x3500 M5:

This means Lenovo have worked with NVIDIA to test and certify that both parties’ hardware, firmware and software are fully compatible and thermally and electrically stable.

Lenovo and vGPU/GPU-passthrough

Lenovo’s “redbook” site with server specifications and support also carries a wealth of information about Lenovo’s investment and joint development to support GPU technologies and virtualization including NVIDIA GRID vGPU. In particular their reference architecture designs including considerations for GPU usage are excellent and available for both VMware and Citrix infrastructures. You can read them here:

I’ve found the best place to start a search on Lenovo’s site is here: https://lenovopress.com/redpxref-system-x-reference and here:

 

Hypervisor Support

The GRID M60 card is now supported on more bare-metal/physical servers. Customers looking to use the M60 card with GRID vGPU in conjunction with a hypervisor such as Citrix XenServer or VMware ESXi should verify that the server OEM has also certified with the hypervisor by checking the VMware/Citrix HCL (Hardware Compatibility List); details of how to do this can be found in these NVIDIA Support articles:

New Cisco Validated Design featuring UCS B200 M4 with NVIDIA GRID M6 vGPU – available now!

It’s great to see a new validated design released by Cisco in recent weeks, particularly as it features the NVIDIA GRID M6 options for blade servers to enable virtualized GPU acceleration (vGPU). This reference architecture joins others available for UCS and in particular features a reference blueprint for Citrix XenDesktop/XenApp 7.7 and VMware vSphere 6.0 for 5000 seats. Key features include:

  • Citrix XenDesktop/XenApp 7.7.
  • Built on Cisco UCS (including Cisco B200 M4 Blade Server) and Cisco Nexus 9000 Series
  • with NetApp AFF 8080EX
  • VMware vSphere ESXi 6.0 Update 1 Hypervisor Platform

Cisco have done a great job providing a comprehensive guide and reference for a full VDI/XenApp deployment that includes networking, storage and graphics acceleration considerations.

 

Cisco-NVIDIA Relationship

There are plenty of case studies, whitepapers and webinar recordings covering Cisco’s long-term investment in NVIDIA GRID and vGPU too:

GPU Sizer – Community tool seeks Beta Testers

A few lucky folks at E2EVC, a couple of weeks ago in Las Vegas, got a sneak preview of a couple of new community tools for analyzing application usage of NVIDIA GPUs. I have already blogged about Jeremy Main’s GPU Profiler (read about it – here).


The other tool is one from community GPU and virtualisation expert Magnar Johnsen from Norway, who is well-known in the Virtualisation communities for his GPU-enabled deployments and tools. Magnar was in fact one of the community users who we invited to NVIDIA to speak to our engineers and product managers about the future direction of our products and user needs.

Magnar has released this tantalizing screen shot of his new tool and is actively inviting beta testers and GPU users to try it out and input into its development. You can sign up for the beta program here: http://virtualexperience.us13.list-manage.com/subscribe?u=efedd1e2c3378132102c90273&id=3875dd956b


One particularly interesting feature is the tool’s ability to monitor whether applications are using the GPU via APIs such as DirectX (DX9, DX10, DX11), OpenGL, OpenCL, CUDA etc.

Magnar Johnsen is an EUC solution specialist, blogger, speaker, and community tool developer with 15+ years’ experience in End User Computing. Magnar works as a consultant in Bergen, Norway. He has worked with Citrix, Microsoft and VMware products since 1999 and with NVIDIA products since 2012. Magnar has a passion for technology, computer visualization and virtual reality. He has basic experience with 3D modeling, graphic manipulation and video effects, which helps him better design and implement 3D and graphical applications in a virtual environment. He has assessed, designed, implemented and supported many virtual graphics solutions based on NVIDIA technology for small to large companies in the oil and gas industry in Norway. Magnar shares his knowledge, tools and experience on his blog http://www.virtualexperience.no and speaks at several industry conferences such as Citrix Synergy, BriForum and Citrix User Group. You can follow Magnar for updates on his blog and GPU Sizer on twitter @MagnarJohnsen.

Could Deep Learning find the needle in the PDM/PLM haystacks or cloud?

Yowza at D3Live 2016!

Recently I attended the NVIDIA GTC Developers’ conference and spent a little time learning more about the deep learning products. Deep learning is a branch of machine learning that often aims to make better representations and create models to learn these representations from large-scale unlabeled data. Many of the use cases have been associated with text and image search, facial recognition or medical diagnosis (identifying specific cancers, medical problems).

Deep learning is a very generic (in the broad-reaching sense) technology. Companies like NVIDIA have developed the GPU technology, and frameworks have been built on top of it to allow researchers and developers to rapidly construct products without actually developing bespoke image recognition technologies or similar. At GTC we had a bit of an employee hackathon, and a group of our staff with no experience or machine-learning background built a solution to classify whale songs in an afternoon. Supplied with the data set from a Kaggle competition and a framework, it was apparently fairly straightforward (http://danielnouri.org/notes/2014/01/10/using-deep-learning-to-listen-for-whales/).

Coming from a CAD/CAE/PLM background, the perfect application in my mind for this type of technology is part recognition and PDM (Product Data Management). For every designer actually using Siemens NX or Dassault SolidWorks/CATIA there is a vast ecosystem of engineering users using and sharing the CAD data via PDM/PLM applications such as Enovia, Delmia, Teamcenter or cloud collaboration tools.

CAD is undoubtedly getting cloudy – with Onshape, Outscale, fra.me and GrabCAD as well as numerous VDI users using VMware/Citrix etc. CAD, though, always carries a lot of legacy constraints and you have to deal with what you’ve got. The PDM/PLM and indeed BIM (in AEC) communities spend vast amounts of effort on part labelling conventions, search and part translation. Basically trying to find out what they’ve already got! (You only have to read the main PLM/PDM blogs such as Oleg Shilovitsky’s – see: http://beyondplm.com/2015/11/04/intelligent-part-numbers-might-be-a-good-idea-in-connected-world/ – to see that the PDM/PLM folk spend a lot of time and effort indexing and trying to find their parts!)

So how do you find stuff?

There are a few 3D CAD search technologies around, but strangely, whilst everyone is talking about putting stuff in the cloud and sharing, these technologies don’t seem to get much press.

Many products use some of the features and techniques outlined in this seminal paper: https://www.uni-konstanz.de/mmsp/pubsys/publishedFiles/BuKeSa06.pdf; they use one or more of the techniques to make a very lightweight “fingerprint” of the CAD part and then look for sections of that fingerprint: same size, same hole pattern, material attributes (bits of metadata on the part). Sometimes you want the same part but rescaled, or it could be re-oriented (an affine transform), etc.
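
To make the “fingerprint” idea concrete, here is a minimal Python sketch of one well-known descriptor from the shape-retrieval literature, the D2 shape distribution (a histogram of distances between random pairs of surface points). It is chosen purely for illustration and is not claimed to be what any product mentioned here uses. The helper names, the bin count and the random box “parts” are all assumptions. Because distances are divided by their mean, the fingerprint is unchanged by uniform rescaling, and because only distances are used it ignores how the part is oriented.

```python
import math
import random


def d2_fingerprint(points, bins=8, pairs=2000, seed=0):
    """Histogram of distances between random point pairs, scaled by the mean distance."""
    rng = random.Random(seed)
    dists = []
    for _ in range(pairs):
        a, b = rng.sample(points, 2)  # two distinct points from the part
        dists.append(math.dist(a, b))
    mean = sum(dists) / len(dists)
    hist = [0] * bins
    for d in dists:
        # Bins cover 0..2x the mean distance; anything longer lands in the last bin.
        idx = min(int(d / mean / 2.0 * bins), bins - 1)
        hist[idx] += 1
    return [count / pairs for count in hist]


def fingerprint_distance(f1, f2):
    """L1 distance between two fingerprints (0 means identical histograms)."""
    return sum(abs(x - y) for x, y in zip(f1, f2))


def sample_box(sx, sy, sz, n=300, seed=1):
    """Hypothetical stand-in for a CAD part: random points inside a box."""
    rng = random.Random(seed)
    return [(rng.uniform(0, sx), rng.uniform(0, sy), rng.uniform(0, sz))
            for _ in range(n)]


if __name__ == "__main__":
    cube = sample_box(1, 1, 1)
    big_cube = [(3 * x, 3 * y, 3 * z) for x, y, z in cube]  # same part, rescaled
    rod = sample_box(10, 0.5, 0.5)                          # a very different part
    print(fingerprint_distance(d2_fingerprint(cube), d2_fingerprint(big_cube)))
    print(fingerprint_distance(d2_fingerprint(cube), d2_fingerprint(rod)))
```

A real search engine would combine several such descriptors with metadata (material, hole patterns and so on) and index them, rather than comparing every pair of parts.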

 

Companies/products I’ve come across in the 3D search space:

This isn’t a space I’ve followed particularly closely for a few years, but I suspect it might finally be getting some end-user traction and visibility. As more and more folk put stuff in the cloud and datacenter, the need to find it again over ever more dispersed infrastructure grows!

  • Geolus (Siemens PLM); Geolus must be a decade old or more. When I worked at Siemens it fascinated me, and I never understood why this gem was not promoted more within Siemens’ vast portfolio. I guess with so many technologies, the exciting, cooler ones that would be a start-up’s (or a VC’s) sparkle are simply taken for granted? It is mature, established and in use at many of the vast real manufacturers that are Siemens’ bread-and-butter.
  • Cadenas; This has always been a fascinating company, a German CAD parts catalogue rather than a modeler. Quietly, for many years, they have been developing some very interesting technologies. They added some degree of modelling capability, allowing a designer to define a simplified part against which to search. They have partnered with more familiar names, e.g. Delcam (now Autodesk), and won some serious acclaim from the manufacturing sector. Novel technologies are being built on top of a real, established, traditional CAD business, slowly and steadily.
  • ShapeSpace – made a fair amount of impact in 2013 and got some good partners but seems to have gone quiet recently. A small UK software company; I’m not quite sure what they are up to currently.
  • Yowza; Where did they come from? Established in just 2015, this year they made a splash at Develop3D live, with a polished booth, talks and demos. Of all the companies I am familiar with in the 3D/CAD search.

Financial drivers

There are huge financial drivers for decent CAD search and categorization:

So where is Deep Learning?

Now this is where it got a bit disappointing. It seemed such a natural match of technology to problem, and yet when I started googling I was surprised at the very limited material in the area. A few papers, mainly behind pay-for (booo!!!) journal paywalls: http://link.springer.com/article/10.1631%2Fjzus.C1300185

CAD / PLM does tend to be a closed industry. The bucks are big in PLM and many technologies are developed in-house and the expertise kept in house. Although many of the big players license their components and APIs (e.g. Geolus and Parasolid from Siemens), you won’t find a developer community or information on the technologies lying around the internet.

So will we see deep learning in the CAD / BIM headlines soon? My guess is probably not; my feeling is that this type of technology will creep in behind the scenes.

Although with start-ups like Yowza popping up at end-user events like D3D Live, perhaps there will be a little disruption and end-user interest in the overlooked area of CAD indexing/part retrieval.

 

Learn more:

 

NVIDIA GRID: New knowledge resources – just released!

The NVIDIA Knowledge Base is going from strength to strength. Since GRID moved to a software and fully-supported enterprise model there has been an acceleration in the information being published there that should carry on long-term.

There you’ll find known issues, workarounds, how-to guides and links to other places to find information on NVIDIA products including GRID and vGPU.

Yet MORE NEW articles have just been released:

NEW KB Articles – this week!

NEW KB Articles – last week

As always I’d recommend a visit to the GRID user forums, where you can put questions to users including customers, partners and NVIDIA staff, among them the developers who write the product and support staff: Have a look – HERE!