Did I really see a $165 Thin Client doing 55fps+ with Borderlands 2 with HDX 3D Pro, Citrix XenDesktop?

Back in May 2015, at Citrix Synergy 2015, I was involved with an HDX demo. The remit was pretty simple: to show really good HDX 3D Pro frame rates, good enough to satisfy gamers, on low-cost hardware using ONLY existing, shipping-today, fully-supported, production-ready products. We also wanted to show that HDX 3D Pro can run really well on low-cost, entry-level thin clients, and so we chose a single-core Intel NUC DE3815TYKHE with an Atom processor. The upshot was we showed 60fps gaming on a hardware device costing $150 with a $15 OS that supports SoC (System on a Chip) enabled plugins.

Low cost Thin Clients can be super performant if they have good hardware decode support and SoC technology. The Citrix Linux Receiver provides an SDK for Thin Client vendors, or a third-party to write a software plugin that will connect the hardware on the Thin Client to HDX and the Receiver. The presence (or not) of such a plugin and the quality of it is usually key to a performant thin-client.

A Busy Demo – Rather a lot of our staff seem to have found!
A Busy Demo – Had a lot of volunteers to staff this demo! Wonder why 😀

The Citrix Thin Client Model

This is the Citrix thin-client model: we provide a number of default software and hardware plugins to process various types of encoding (JPEG, H.264 etc.), plus stubs into which an OEM or third party can drop additional plugins optimised for the specific device.

[Diagram: the Linux Receiver SoC plugin stack]

The diagram shows the Linux Receiver provides a number of graphical acceleration libraries such as GStreamer, libjpeg-turbo etc. (these may change in time) that will be used if a third-party does not supply alternatives. An OEM such as an OS vendor or a thin-client manufacturer can supply additional hardware accelerated plugins such as CTXJPEG or CTXH264 to take advantage of the thin-client hardware to do hardware on-chip decode of jpeg or H.264 data.

The Intel NUC DE3815TYKHE Atom Processor and ThinLinx OS

As far as I know, Intel doesn’t supply HDX SoC accelerated plugins for the NUC, so we used our partner ThinLinx’s OS, which does include and fully support them for the NUC. ThinLinx OS not only includes such hardware acceleration (plugins for CTXH264 etc.) but also the tools and software to manage thin clients effectively, all for around $15 per client. You can read the details here on the ThinLinx website.

Gaming and Frame Rates

A lot of our HDX engineers quite like their gaming. There are genuine “business” uses such as helicopter pilot training and health and safety simulation training (firefighters), but in general gaming workloads and the expectations of gamers are a good way to harden our technologies.

A lot of people think HDX (High Definition user eXperience) is just ICA and the graphics protocols, but it is far more. It’s about the whole user-experience: audio, USB support for peripherals like gaming controllers and general responsiveness beyond visual.

The human eye is generally considered to find 30fps pleasing (film is typically 24 fps), and the incremental benefits between 30 and 60 fps are often considered minimal; but perception is a funny thing, and many users find 60fps more pleasant even though the visual differences are small. A lot of this for gamers is associated with responsiveness (when they “pull the trigger”). At 30fps there is 0.0333 secs between frames (at 15 fps, 0.0667 secs), so if a mouse click just misses a frame you pick up an extra 0.033s (33ms) of “latency” in the user feel; at 60fps that worst-case wait halves to about 17ms. This is why gamers are obsessed with 60fps: it’s not necessarily the visuals above 30fps, but it can “feel” different.
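The frame-interval arithmetic is easy to check for yourself. The snippet below is only a back-of-the-envelope sketch: the function name is made up for illustration, and it models nothing except the gap between frames (no network, encode or decode time).

```python
def frame_interval_ms(fps: float) -> float:
    """Time between two consecutive frames, in milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return 1000.0 / fps


if __name__ == "__main__":
    # A click that just misses a frame waits roughly one full interval.
    for fps in (15, 30, 60):
        print(f"{fps:>2} fps: {frame_interval_ms(fps):.1f} ms between frames")
```

The interval is the worst-case extra wait for an input that just misses a frame, which is the “feel” difference described above.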

The Linux Receiver has built in end-user instrumentation

A lot of those who tried the demo seemed slightly surprised to see an instrumentation window on the Linux Receiver. This is actually a feature I’d like to see expanded to all our receivers long term.
[Screenshot: the Linux Receiver instrumentation window]

To enable the instrumentation you need to configure your client by adjusting the wfica options using

  • “-rm Dcdf” to enable the instrumentation

As you won’t normally be calling wfica directly, there is an environment variable, WFICA_OPTS, which can be set to pass the command line options. For example:

  • export WFICA_OPTS="-rm Dcdf"
  • /opt/Citrix/ICAClient/selfservice
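If you would rather script this than type it into a shell, something along these lines should do the same job. It is a minimal sketch: the helper names are my own invention, and the only things taken from the steps above are the WFICA_OPTS variable, the “-rm Dcdf” options and the selfservice path.

```python
import os
import subprocess

# Path and options as given above; adjust if your install differs.
SELFSERVICE = "/opt/Citrix/ICAClient/selfservice"
INSTRUMENTATION_OPTS = "-rm Dcdf"


def env_with_wfica_opts(opts: str = INSTRUMENTATION_OPTS, base=None) -> dict:
    """Return a copy of the environment with WFICA_OPTS set (hypothetical helper)."""
    env = dict(os.environ if base is None else base)
    env["WFICA_OPTS"] = opts
    return env


def launch_selfservice(opts: str = INSTRUMENTATION_OPTS) -> subprocess.Popen:
    """Start selfservice with WFICA_OPTS in its environment (hypothetical helper)."""
    return subprocess.Popen([SELFSERVICE], env=env_with_wfica_opts(opts))


if __name__ == "__main__":
    # Only show what would be passed; call launch_selfservice() on a real client.
    print("WFICA_OPTS =", env_with_wfica_opts()["WFICA_OPTS"])
```

Working on a copy of the environment means the setting applies only to the launched Receiver and nothing else on the client.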

Update (3rd Aug 2015): Grrr…. my readers are too sharp! The instrumentation is indeed a screenshot from a different system. Bonus points for the sharp-eyed who spotted from the CPU readings that it was on a quad-core system; yes, this was the case. The NUC in the demo was single core….

Demo Summary

  • Highly Graphical/Interactive Content
  • Borderlands 2 Multiuser Game
  • Up to 60fps 1080p (1920×1080 resolution) Graphics

Purpose of Demo

  • Show Challenging near native UX
  • Monitor the performance
  • Ask questions
  • Have fun

THE SHOPPING LIST TO DO THIS DEMO YOURSELF

Shipping Citrix Software Products

  • XenServer 6.5
  • XenDesktop 7.6 with 3DPro VDA
  • Linux Receiver 13.1

Shipping Partner Software Products

  • Windows 8.1
  • ThinLinX OS (Linux) with HDX SoC Plugins

Shipping Partner Hardware

  • Dell R720 and HP DL380 G8 servers
  • NVIDIA GRID K2 Graphics Cards (K260Q vGPU profiles used to share GPUs)
  • Intel NUC DE3815TYKHE Atom Processor
  • Additional memory for the NUC: 2GB (widely available e.g. here and here for around $10-$15): Crucial 2GB Single DDR3 1600 MT/s (PC3-12800) CL11 SODIMM 204-Pin 1.35V/1.5V Notebook Memory Module CT25664BF160B
  • ViewSonic 27” 1080p Monitors
  • Xbox USB Game Controllers

Key Takeaways

  • Shows the low cost performance capabilities of Citrix and Partner shipping technologies.
  • The sum is greater than the parts. You can configure this yourself.
  • Fun – “A World Where People Can Work and Play from Anywhere”.
  • Things you don’t think of when planning a demo: Borderlands 2 requires you to buy ammunition, which resulted in our most senior engineering manager spending an awful lot of time running around the game buying bullets for the next demo – sorry, Joanna! We didn’t think about that one too well!

Policy Tweaks

In normal production usage we would recommend customers avoid fiddling with registry keys and policies – especially if they don’t understand what they are changing. I normally recommend new HDX 3D Pro users start with Jason Southern from NVIDIA’s policy template, available and explained – here.

For the Synergy demo, we did do a bit of tuning. With hindsight I’m not sure we should have as it just raised questions and I’m not sure the performance gains warranted the questions this fiddling raised. However many of those who saw the demo took away the configuration and have asked questions, so in the interests of transparency I’ve included some uber-nerdy detail after this blog.

The feedback from those who saw this demo at Synergy was superb:

CTPs Neil Spelling and Remko Weijnen grill Ben from HDX Engineering on the demo!

Demo Policies and Settings

Broker Policy

  • Overall session bandwidth limit
    • User setting – ICA\Bandwidth
    • 38000 Kbps (Default: 0)
  • Target frame rate
    • User setting – ICA\Visual Display
    • 60 fps (Default: 30 fps)
  • Client USB device redirection
    • User setting – ICA\USB Device
    • Allowed (Default: Prohibited)
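To get a feel for what a 38000 Kbps limit means at 60 fps, you can work out the average budget per frame. This is a rough sketch only: it assumes 1 Kbps = 1000 bits per second and an even split across frames, which a real encoder will not respect frame by frame. The function name is hypothetical.

```python
def per_frame_budget_kilobytes(limit_kbps: float, fps: float) -> float:
    """Average kilobytes available per frame under a bandwidth cap."""
    if limit_kbps <= 0 or fps <= 0:
        raise ValueError("limit_kbps and fps must be positive")
    bits_per_frame = limit_kbps * 1000 / fps  # assumes 1 Kbps = 1000 bit/s
    return bits_per_frame / 8 / 1000          # bits -> bytes -> kilobytes


if __name__ == "__main__":
    for fps in (30, 60):
        budget = per_frame_budget_kilobytes(38000, fps)
        print(f"{fps} fps: about {budget:.1f} KB per frame on average")
```

Doubling the target frame rate under the same cap halves the average budget per frame.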

Client Configuration

  • ini “[ClientAudio]” Section
    • AudioLatencyControlEnabled=True
  • wfica options
    • “-rm Dcdf” to enable instrumentation

HDX 3D Pro VDA Registry

HKLM\SOFTWARE\Citrix\Graphics

  • GfxProvider = 0x2
  • TwumEnabled = 0x1
  • Vd3dEnabled = 0x0
  • EncodeSpeed = 0x2
  • Encoder = 0x2
  • MinFPS = 0x3c
  • H264EncodedData = 0x0
  • LowVisualQualityCRF = 0x17
  • HighVisualQualityCRF = 0x14
  • SystemFlowControl = 0x0 (a setting pertaining to the Linux Receiver that users can disregard)
  • Remove ExtraSharpenCount (this is a legacy setting that users will be able to disregard in the future)

All these values are in HEX; there is a HEX to DEC (decimal) calculator available, here. Some of these keys are the defaults. We are in the process of tidying up the registry keys, so please do not use the above as any type of general reference. I’m publishing them to answer the questions from those who saw the demo as to why we used them.
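If you don’t have a calculator to hand, a few lines of Python will convert the values. The dictionary simply copies the demo values listed above; it is an illustration of the conversion, not a reference for what to set.

```python
# Demo values copied from the list above (hex strings as published).
DEMO_VALUES = {
    "GfxProvider": "0x2",
    "TwumEnabled": "0x1",
    "Vd3dEnabled": "0x0",
    "EncodeSpeed": "0x2",
    "Encoder": "0x2",
    "MinFPS": "0x3c",
    "H264EncodedData": "0x0",
    "LowVisualQualityCRF": "0x17",
    "HighVisualQualityCRF": "0x14",
    "SystemFlowControl": "0x0",
}


def to_decimal(hex_text: str) -> int:
    """Convert a hex string such as '0x3c' to its decimal integer value."""
    return int(hex_text, 16)


if __name__ == "__main__":
    for name, hex_text in DEMO_VALUES.items():
        print(f"{name} = {hex_text} = {to_decimal(hex_text)}")
```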

These settings are associated with the use of the 3D Pro VDA and aren’t significant; they should be installed as defaults by the HDX 3D Pro VDA:

  • GfxProvider = 0x2
  • TwumEnabled = 0x1
  • Vd3dEnabled = 0x0

Encoder = 0x2 (In decimal = 2), means we are using the default HDX 3D Pro encoder which is pure H.264 without lossless text. Encoder = 1 is used on the Standard VDA and is H.264+lossless text.

EncodeSpeed = 0x2 (In decimal = 2). The default for HDX 3D Pro is currently (XD7.6) EncodeSpeed=1. For the Standard XenDesktop and XenApp VDAs EncodeSpeed is set to 2 by default.

  • What does EncodeSpeed do?
    • Value set to 1: Better image quality, but needs more CPU on client.
    • Value set to 2: Image quality drops, but CPU on client is able to support higher resolution and improves performance on thin client. Favours performance.

So why did we tweak the EncodeSpeed setting? This was a gaming demo, i.e. not much text and lots of movement (transients). The image quality improvements from EncodeSpeed=1 were negligible, whereas the gamers’ desire for that 60fps responsiveness is better met by this tweak.

These next two are quality settings we usually strongly discourage users from tweaking, as it is very easy to configure a system into a state where you turn off beneficial adaptive behaviour. In this case it was a LAN gaming demo, and gamer engineers who knew exactly what they were doing got involved!

  • LowVisualQualityCRF = 0x17 (=23 in decimal)
  • HighVisualQualityCRF = 0x14 (=20 in decimal)

QualityCRF ranges between 18 and 45 with 18= highest quality and 45= lowest quality. It’s a measure of compression so the lower the QualityCRF value, the less compression applied and the higher the visual quality.

In this gaming demo scenario, where there are continuous full-screen updates going on – the system rarely, if ever, gets a chance to stop and rebalance for HDX’s adaptive behaviour to fully kick-in. The narrow Min/Max quality range was set such that:

  • We lowered the Max quality slightly as the screen is always changing, so converging a static screen to “near-perfect” isn’t a requirement, and indeed we don’t want the system to raise quality so far that it has to “self-tune” back down again.
  • We raised the Min quality such that if there’s a momentary “blip” in performance and H.264 tries to change the quality to the minimum – it doesn’t overshoot.

Similarly, raising the MinFPS to be close to the MaxFPS (The target frame rate set in the policies, raised to 60fps from the default 30fps) was done such that:

  • In the event of a momentary blip in performance H.264 doesn’t overshoot. MinFPS (0x3c corresponds to 60fps in decimal).
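One way to picture the narrowed quality band is as a simple clamp. To be clear, this is NOT how the HDX encoder is implemented; it is only an illustration of why a narrow band limits overshoot, and the function and constant names are invented for the purpose.

```python
# Range quoted above: 18 = highest quality, 45 = lowest quality.
CRF_FULL_RANGE = (18, 45)
# Narrowed demo band: HighVisualQualityCRF = 0x14 (20), LowVisualQualityCRF = 0x17 (23).
CRF_DEMO_BAND = (20, 23)


def clamp(value: int, band: tuple[int, int]) -> int:
    """Keep value inside band = (low, high); illustration only."""
    low, high = band
    if low > high:
        raise ValueError("band must be (low, high) with low <= high")
    return max(low, min(high, value))


if __name__ == "__main__":
    # Whatever CRF is "asked for", the demo band limits how far it can swing.
    for requested in (18, 21, 45):
        print(requested, "->", clamp(requested, CRF_DEMO_BAND))
```

The same picture applies to the frame rate: with MinFPS raised to 60 and the target frame rate at 60, there is very little room left to swing.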

In anything other than a gaming environment when using business or graphical applications you really shouldn’t have to consider such adjustments. However our HDX engineers like their gaming and their perfect trigger-happy experience!

We did end up tuning down the demo and ran the game at 50-55fps on a 60fps session. This was due to some unsightly screen tearing on the client when everything was running at 60fps. One we are looking into!

Demo Network
[Image: demo network]

Session Details

[Image: session details]

17 thoughts on “Did I really see a $165 Thin Client doing 55fps+ with Borderlands 2 with HDX 3D Pro, Citrix XenDesktop?”

  1. We have been using thin clients for over a decade, and especially with the advances in SoC technology, the versatility of these units is far greater now than it was then, with not only cost and power savings as factors, but also security as a major concern to take into consideration. The estimated power saving of a thin client vs. a desktop computer is around $70 per year, depending of course on regional cost differences, and that easily pays for the cost of these units over their lifetimes.

  2. What resolution were they playing at?

    I’ve setup something similar and I get great fps at 1280×800 but performance seems to drop rapidly as resolution increases (probably to be expected).

  3. Great article and a good inspiration.

    I agree on the point about having gamers harden the technology, as this user segment is usually very sensitive to even the smallest of changes (especially if they degrade the experience).

  4. Pingback: Virtually Visual
  5. Pingback: Cloud Evangelist
  6. “Thank you for your details interpretation about thin clients, here I am introducing a Versatile thin client Terminal RDP XL-500

    We can use it as a

    1) high end thin client device

    2) Mini PC/Individual PC

    3) Virtualization Ready “

  7. Have you tried to recreate this demo (or similar game demo) using the Raspberry Pi3? I’d be curious how many FPS a $60 thin client can sustain… (ThinLinx has $10-licensed OS for RPi 2 and 3, which have h.264 HW decoders).

  8. Hi there, would like to know more details about the hardware powering the clients, how many k2s are powering how many clients? Would love to set up one myself 🙂
