2. TIP 1 – USE THE CORRECT INSTALL PROCESS
1. Research your hardware and software to make sure your configuration is supported
2. Build your hypervisor
3. Register for an evaluation license and download the GRID software bundle and license server from the licensing portal
4. Install an NVIDIA License Server and install the evaluation license file
5. Install the GPU card and GPU Manager software on the hypervisor
6. Prepare the base VDI image without a GPU and configure RDP access
7. Configure the VDI image with a vGPU profile and boot the VM
8. Install the NVIDIA Windows driver and set the license server IP
9. Reboot and connect to the VM to check that the license was acquired
Overview of the procedure
Deployment Guides: https://www.nvidia.com/en-us/data-center/virtualization/resources/
3. TIP 2 – GET AN EVALUATION LICENSE
Best link yet: https://www.nvidia.com/object/vgpu-evaluation.html
90 Day Trial License – You get 128 vApps, vPC, vDWS, vCS
It’s always better to get the customer to request the evaluation
because it is then much easier to apply their paid
licenses to the installation when they buy them.
If you are a solutions partner, don’t be tempted to think ahead
and register for an evaluation on behalf of the customer because
their license server will then be registered to your partner
account and it’s harder to transfer later.
4. TIP 3 - DRIVER DOWNLOAD LOCATION
GRID Drivers are downloaded from the Licensing Portal, not the driver download pages
1. Ignore this table
2. Click here to access the licensing portal
3. Drivers
5. TIP 4 - TURN OFF ECC
Reboot
If VMs won’t boot then the GPU might need ECC turned off *
https://docs.nvidia.com/grid/latest/grid-software-quick-start-guide/index.html#disabling-enabling-ecc-memory
Disable ECC for all cards
Disable ECC for card id 00000000:02:00.0
If you see this …
* ECC Supported on Quadro and vCS Profiles since vGPU 9.0
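The two ECC options above can be sketched as a small helper that builds the `nvidia-smi` command line. It assumes the `-e 0` (ECC off) and `-i <id>` (select one GPU) options described in the linked quick start guide. The function name `ecc_disable_command` is hypothetical, and the sketch only builds the command, it does not run it.

```python
from typing import List, Optional


def ecc_disable_command(gpu_id: Optional[str] = None) -> List[str]:
    """Build the nvidia-smi argument list that turns ECC off.

    gpu_id is optional: with no id the change applies to all cards,
    otherwise only to the card with that id (for example a PCI bus id).
    """
    cmd = ["nvidia-smi"]
    if gpu_id is not None:
        cmd += ["-i", gpu_id]   # target a single card
    cmd += ["-e", "0"]          # 0 = disable ECC
    return cmd


# Disable ECC for all cards
print(" ".join(ecc_disable_command()))
# Disable ECC for card id 00000000:02:00.0 (the id used on the slide)
print(" ".join(ecc_disable_command("00000000:02:00.0")))
```

Run the resulting command on the hypervisor host and reboot afterwards, as the slide indicates.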
6. TIP 5 - MEMORY ABOVE 1TB
May be an issue with M10; not an issue with “Pascal” cards or later
Hypervisor support of IOMMU causes issues on servers with more than 1TB of RAM
Relevant to ESXi and XenServer, not Nutanix AHV
VM failures or crashes may occur
Follow the documentation for XenServer and vSphere
Maxwell cards can’t see more than 1TB of memory
7. TIP 6 - LICENSE SERVER
Check out my video in this playlist!
Follow the install process religiously!
8. TIP 7 - ESXI GPU SETTINGS
Tips for VMware Customers
• HOST>Configure>Graphics>Host Graphics
• Ensure “Shared Direct” is selected or vGPU profiles will not be listed
• If needed, follow highlights to enable vgpu.hotmigrate.enabled setting
• Ensure you have Enterprise Plus licenses; you NEED vCenter
9. TIP 8 - XENSERVER 7.5/7.6/8.0
• VMs with GPUs attached experience slower performance than on XenServer 7.1
• Can cause laggy graphics and slowdowns in general apps
• Private (hidden) Hotfix is available from Citrix (reference SR78634793) or
https://support.citrix.com/article/CTX250164
• Recommend moving to Citrix Hypervisor 8.2 (or latest version)
• Hotfix XS80E003: https://support.citrix.com/article/CTX258320
Private Performance Hotfix
10. TIP 9 - AVOID DRIVER MISMATCH
Keep the GPU Manager and VM’s Driver within the same major release
Optimal
Supported
NOT Supported
https://docs.nvidia.com/grid/
Note: vGPU 11 now has Cross-Branch support
11. CROSS-BRANCH COMPATIBILITY
[Diagram: host and guest driver version pairings (vGPU 9.0, 9.1, 10.0, 10.1, 11.0), comparing in-branch compatibility (pre vGPU 11) with cross-branch compatibility (new in vGPU 11)]
New host driver with previous version of Guest driver now supported
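The compatibility rule from these two slides can be sketched as a simple check. This is a simplification under stated assumptions: "same major release" is read as the same major version number, and cross-branch support is modelled as a vGPU 11 or later host paired with a guest from the immediately previous major release. The function name `is_supported` and the assumption that releases after 11 keep the same rule are not from the slides, so check the NVIDIA documentation for the real support matrix.

```python
def is_supported(host: str, guest: str) -> bool:
    """Return True if a host GPU Manager / guest driver pairing is supported.

    host and guest are vGPU release strings such as "10.1" or "11.0".
    """
    host_major = int(host.split(".")[0])
    guest_major = int(guest.split(".")[0])

    # In-branch: host and guest are within the same major release.
    if host_major == guest_major:
        return True

    # Cross-branch (assumed from vGPU 11 onwards): a newer host with a
    # guest driver from the previous major release.
    return host_major >= 11 and guest_major == host_major - 1


print(is_supported("9.1", "9.0"))    # same branch
print(is_supported("10.0", "9.1"))   # different branch, pre vGPU 11
print(is_supported("11.0", "10.1"))  # cross-branch with a vGPU 11 host
```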
12. TIP 10 - MIXING PROFILES
Profiles must be homogeneous per GPU
No mixed Frame Buffer sizes or License types on the same GPU
(first profile defines the type allowed)
(Cards with more than one GPU (NVIDIA M10 & M60) offer more flexibility)
Example 1: mixing 4 GB & 2 GB Frame Buffers
[Diagram: NVIDIA T4 with 4B, 4B and 2B profiles]
Example 2: mixing vDWS & vPC licenses
[Diagram: Quadro RTX6000 with 4Q and 4B profiles]
13. TIP 10 - MIXING PROFILES
No mixed Frame Buffer sizes or License types on the same GPU
(Cards with over 1 GPU (NVIDIA M10 & M60) offer more flexibility)
NVIDIA M10 has 4 x GPUs, each with 8GB RAM, so mixing profiles across the GPUs is possible
[Diagram: an NVIDIA M10 where each GPU hosts a single profile type: 2Q, 1B, 8A and 4Q]
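The "no mixing on the same GPU" rule can be sketched as a check over the profile names placed on one physical GPU. It assumes profile names end in a frame buffer size and a type letter, as in the slide examples (4B, 2Q, 8A, T4-2Q). The names `parse_profile` and `can_share_gpu` are hypothetical.

```python
import re
from typing import Iterable, Tuple


def parse_profile(profile: str) -> Tuple[int, str]:
    """Split a profile name such as "4Q" or "T4-2Q" into (size_gb, type)."""
    match = re.fullmatch(r"(?:.*-)?(\d+)([A-Z])", profile)
    if match is None:
        raise ValueError(f"unrecognised profile name: {profile!r}")
    return int(match.group(1)), match.group(2)


def can_share_gpu(profiles: Iterable[str]) -> bool:
    """True if all profiles have the same frame buffer size and type."""
    distinct = {parse_profile(p) for p in profiles}
    return len(distinct) <= 1


print(can_share_gpu(["4B", "4B"]))        # same size and type
print(can_share_gpu(["4B", "4B", "2B"]))  # Example 1: mixed frame buffers
print(can_share_gpu(["4Q", "4B"]))        # Example 2: mixed license types
```

On a multi-GPU card such as the M10, the check would be applied to each GPU separately.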
14. TIP 11 - BLACK VM CONSOLE
Ensure you have enabled RDP access inside the VM before installing the Driver
Exception is XenServer which does show the VM’s console in XenCenter
Console sessions will go blank after Driver Loads
15. TIP 12 - ISSUES WITH DCH DRIVER
• DCH is a way of getting driver updates via Windows Update
• The NVIDIA DCH driver is not currently compatible with vGPU
• If Windows detects a GPU during Windows Update it will install the DCH driver automatically. Hard to revert image to vGPU driver afterwards
• Windows Update will not install DCH if it finds an existing vendor driver installed
• TIP: Do NOT run Windows update in between 1) Adding a GPU and 2) installing the vGPU Windows Driver
• https://nvidia.custhelp.com/app/answers/detail/a_id/4777/~/nvidia-dch%2Fstandard-display-drivers-for-windows-10-faq
Run Windows Update on base image before attaching vGPU Profile
16. TIP 13 - VGPU LICENSE OPERATION
License allocation process: license allocation when VM starts
VM start: license checked out from the License Server (licenses available: 5 to 4)
VM shutdown: license released (licenses available back to 5)
[Diagram: each VM has its own trusted store, which remains while the VM is off]
17. VGPU LICENSE OPERATION
Golden Image
Issues with cloning a VM that has checked out a vGPU license
[Diagram: Golden Image provisioned with PVS/MCS/Instant Clones/Linked Clones; every clone receives a copy of the golden image's trusted store]
Trusted store gets replicated to clones
18. VGPU LICENSE OPERATION
Golden Image
Solution #1 - PVS/MCS/Instant Clones/Fast Clones
Remove the trusted store before cloning
Trusted store
Delete all the files under "<SystemDrive>:\Program Files\NVIDIA Corporation\Grid Licensing\Trusted Storage" on the base vDisk image (if present). Note that these are hidden files with names like ‘amsdudhygcfzzycwceeezwbpuyeugyjs’
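A minimal sketch of the clean-up step, assuming it is run inside the golden image just before sealing it. The function name `clear_trusted_store` is hypothetical, and the folder is passed in by the caller rather than hard-coded, since the system drive differs between images.

```python
from pathlib import Path
from typing import List


def clear_trusted_store(folder: Path) -> List[str]:
    """Delete every file directly under the trusted storage folder.

    Returns the names of the files removed. Hidden files are included,
    because iterdir() lists them as well. A missing folder is not an
    error: the slide says the files are only deleted "if present".
    """
    removed: List[str] = []
    if not folder.is_dir():
        return removed
    for entry in sorted(folder.iterdir()):
        if entry.is_file():
            entry.unlink()
            removed.append(entry.name)
    return removed


# Example call (assumes the system drive is C:):
# clear_trusted_store(
#     Path(r"C:\Program Files\NVIDIA Corporation\Grid Licensing\Trusted Storage")
# )
```

Only run this on the base image, not on running clones, which normally refresh their own trusted store on boot.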
19. VGPU LICENSE OPERATION
Golden Image
Solution #2 - Inject license server details on VM boot
Use image with no vGPU IP details set & trusted store cleared
https://docs.nvidia.com/grid/latest/grid-licensing-user-guide/index.html#windows-registry-grid-license-settings
Sample REG file to run during boot:
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\NVIDIA Corporation\Global\GridLicensing]
"ServerAddress"="192.168.10.63"
"ServerPort"="7070"
"BackupServerAddress"="192.168.10.64"
"BackupServerPort"="7070"
Clones
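If the license server details differ per site or pool, the REG file can be generated rather than hand-edited. The sketch below assumes the four value names shown on the slide under the GridLicensing key. The function name `build_license_reg` and its parameters are hypothetical.

```python
from typing import Optional


def build_license_reg(
    server: str,
    port: int = 7070,
    backup_server: Optional[str] = None,
    backup_port: int = 7070,
) -> str:
    """Return the text of a REG file that sets the vGPU license server."""
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        r"[HKEY_LOCAL_MACHINE\SOFTWARE\NVIDIA Corporation\Global\GridLicensing]",
        f'"ServerAddress"="{server}"',
        f'"ServerPort"="{port}"',
    ]
    # The backup server is optional; only emit it when one is given.
    if backup_server is not None:
        lines.append(f'"BackupServerAddress"="{backup_server}"')
        lines.append(f'"BackupServerPort"="{backup_port}"')
    return "\r\n".join(lines) + "\r\n"


# Addresses taken from the slide's sample.
print(build_license_reg("192.168.10.63", backup_server="192.168.10.64"))
```

The generated text can be saved as a `.reg` file and imported by whatever boot-time mechanism the provisioning stack offers.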
20. VGPU LICENSE OPERATION
If a VM cannot find a license server on boot, or loses connection during operation (after the grace period has expired):
• Desktop limited to 3 fps
• On vGPU profiles that support CUDA, CUDA is disabled
• GPU resource channels are limited, which will prevent some applications from running correctly
Grace period for running VMs is 24 hours since last check-in
Note: vGPU 11 has no restrictions for 20 minutes after a VM has booted, then relaxed restrictions until 24 hours is reached
Connection loss to the license server(s)
Note: vGPU 11.x has more relaxed grace period restrictions. See next slides
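The boot-time behaviour for a VM that cannot check out a license can be sketched as a small function. The 20-minute and 24-hour thresholds come from the note above. Treating the state after 24 hours on vGPU 11 as full restriction is an assumption, and the name `unlicensed_restriction` and the three labels are hypothetical.

```python
def unlicensed_restriction(vgpu_major: int, minutes_since_boot: float) -> str:
    """Restriction level for a VM that booted without obtaining a license.

    Returns "none", "relaxed" or "full".
    """
    if vgpu_major < 11:
        # Before vGPU 11: full restriction (3 fps, CUDA restrictions)
        # applies straight away after an unsuccessful checkout.
        return "full"
    if minutes_since_boot < 20:
        return "none"        # vGPU 11: no restrictions for 20 minutes
    if minutes_since_boot < 24 * 60:
        return "relaxed"     # then relaxed restrictions until 24 hours
    return "full"            # assumed behaviour once 24 hours is reached


print(unlicensed_restriction(10, 5))
print(unlicensed_restriction(11, 5))
print(unlicensed_restriction(11, 120))
```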
21. BEFORE VIRTUAL GPU 11
[Diagram: two timelines]
Boot Virtual Machine, Successful Checkout: 24h, then Full Restriction (3 fps, CUDA restrictions)
Boot Virtual Machine, Unsuccessful Checkout: Full Restriction (3 fps, CUDA restrictions)
23. TIP 14 - GPU DOESN’T WORK UNDER XENAPP
Computer Configuration => Administrative Templates => Windows Components => Remote Desktop Services => Remote Desktop Session Host => Remote Session Environment
Also Reg Settings for WPF/CUDA/OpenCL:
https://docs.citrix.com/en-us/citrix-virtual-apps-desktops/graphics/hdx-3d-pro/gpu-acceleration-server.html
Policy & Registry keys are required
24. TIP 15 - FUZZY FONTS
YUV 4:2:0
Chroma Subsampling using a video codec (H.264/H.265)
YUV 4:4:4
• Try changing to “Visually Lossless” Policy.
• Try Bitmap codec
• “Actively Changing Regions” policy is also good
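Why 4:2:0 blurs coloured text can be shown with a toy example. In 4:2:0 each 2x2 block of pixels shares one chroma sample, so a sharp colour edge inside a block is smeared, whereas 4:4:4 keeps one chroma sample per pixel. The sketch below averages each block, which is one simple way to subsample and not necessarily what a given H.264/H.265 encoder does. The function names are hypothetical.

```python
from typing import List

Plane = List[List[float]]


def subsample_420(chroma: Plane) -> Plane:
    """Average each 2x2 block of a chroma plane (even width and height)."""
    out: Plane = []
    for y in range(0, len(chroma), 2):
        row = []
        for x in range(0, len(chroma[y]), 2):
            total = (chroma[y][x] + chroma[y][x + 1]
                     + chroma[y + 1][x] + chroma[y + 1][x + 1])
            row.append(total / 4)
        out.append(row)
    return out


def upsample(chroma: Plane) -> Plane:
    """Repeat each chroma sample over a 2x2 block for display."""
    out: Plane = []
    for row in chroma:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


# A sharp vertical colour edge between the first and second columns.
edge = [[0, 255, 255, 255],
        [0, 255, 255, 255]]
print(upsample(subsample_420(edge)))
# -> [[127.5, 127.5, 255.0, 255.0], [127.5, 127.5, 255.0, 255.0]]
```

The edge that was one pixel wide now spans a blended two-pixel block, which is the kind of softening seen around fine text.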
25. TROUBLESHOOTING CHECKLIST
Item | Example | Item | Example
Server: Hardware Model | HP DL380 G10 | VDI: GPU Profile (if using vGPU) | T4-2Q
Server: Hypervisor make and version | VMware 6.7U3 | VDI: OS version and build | Win10 1903
Server: GPU model and number of cards | 4 x NVIDIA T4 | VDI: NVIDIA Driver version | 432.08
Server: GPU Manager Version (vib) | 430.67 | VDI: Version of Remoting Agent | 7.15
Server: Hardware (CPU, Speed, RAM, Disk, Network) | HW spec | Network: Bandwidth, Latency & User Location | 20Mbps, 50ms, home
Server: Other loads running | Non-GPU enabled VMs | Endpoint: Make/Model | Dell/Wyse 5070
Remoting: Software (Horizon, XenDesktop etc..) | XenApp 7.15 | Endpoint: OS | ThinOS
Remoting: Protocol Policy (H.264/BMP etc.) | H.264 HW encode, Quality-Medium, ACR | Endpoint: version of Remoting Client | Citrix WSA 1911
Remoting: Onsite or Cloud | Onsite | Endpoint: Res/Number of Displays | 2 x 4K (3840x2160)
VDI: RAM and number of vCPUs | 24GB, 4 x vCPUs | Endpoint: Apps Used | Office, Catia
VDI: vGPU, Pass-Through/DDA or vSGA | vGPU | Endpoint: App Characteristics | Proviz application
VDI: Name of VM | if applicable | Steps to reproduce issue | If applicable
Information to collect for support
26. QUALIFICATION QUESTION FOR VGPU
Discussion points for potential vGPU Customers
What is the reason for this project? | Hardware upgrade, remote working, Perf. issue
What is your workload? | Office-only apps, video, ProViz Apps, Deep Learning/HPC
What hardware do your users currently have? | Physical Workstation, Non-GPU VDI
What endpoint hardware will you have? | Thin Clients, Laptops
How many screens and what resolution? | Is 4K a target? Multiple 4K?
What is your preferred Hypervisor and Remoting Stack? | VMware ESX, Citrix, Horizon, Teradici
What are your density aspirations? | High-density VDI, High Perf. Professional graphics
On-site deployment or Cloud? | Mostly, Cloud uses a complete GPU, not fractions (vGPU)
Editor's Notes
Quadro and Virtual Compute Server profiles support and can use ECC and Page Retirement. vApps and vPC profiles will run on a GPU that has ECC turned on (since vGPU 9.0) but will not leverage these technologies.
If you have database issues on the license server, often this can be fixed by rebuilding the database from backup: https://docs.nvidia.com/grid/7.0/grid-license-server-user-guide/index.html#restoring-trusted-storage-database
Notes:
1) GRID Software Licenses are mandatory for NVIDIA vGPU cards whether customers are using GPU Pass through or vGPU. The only exception is where the card is running in Compute-only mode without graphics.
2) If a license is not applied, the driver will throw up a message saying “cannot obtain a license” or similar. Performance will degrade to 3 frames-per-second in this mode too.
3) If a License runs out or a license server can’t be found, you get a grace period (I believe 3 days) and then it goes to 2).
4) If a license runs out, the Licensing Portal will no longer list new versions of GRID software.
Cross branch compatibility is good for environments that have a lot of VM images that are difficult to upgrade at the same time as the vGPU Manager host software on the hypervisor. It allows you to maintain functionality with older (previous version) guests whilst upgrading the host GPU Manager software
A vGPU license is checked out when the VM starts (not when a user logs in). The license is released when the VM is stopped. The trusted store files remain after shutdown (they will normally get refreshed on boot).
Having the trusted store for the golden image replicated to hundreds of clones can cause license allocation issues.
Default behaviour in vGPU 10 and below if a workstation cannot find a license.
For reference: Useful info to report to your partner or NVIDIA tech support when troubleshooting an issue.
Item
Server: Hardware Model : e.g. HP DL380 G10
Server: Hardware (CPU, Speed, RAM, Disk, Network) : e.g. HW spec
Server: Hypervisor make and version : e.g. VMware 6.7U3
Server: GPU model and number of cards : e.g. 4 x NVIDIA T4
Server: GPU Manager Version (vib) : e.g. 430.67
Server: Other loads running : e.g. Non-GPU enabled VMs
VDI: OS version and build : e.g. Win10 1903
VDI: RAM and number of vCPUs : e.g. 24GB, 4 x vCPUs
VDI: vGPU, Pass-Through/DDA or vSGA : e.g. vGPU
VDI: GPU Profile (if using vGPU) : e.g. T4-2Q
VDI: NVIDIA Driver version : e.g. 432.08
VDI: Version of Remoting Agent : e.g. Citrix VAD 7.15
VDI: Name of VM : if applicable
Remoting: Software (Horizon, XenDesktop etc..) : e.g. XenApp 7.15
Remoting: Protocol Policy (H.264/BMP etc.) : e.g. H.264 HW encode, Quality-Medium, ACR
Remoting: Onsite or Cloud : e.g. Onsite
Network: Bandwidth, Latency & User Location : e.g. 20Mbps, 50ms, home
Endpoint: Make/Model : e.g. Dell/Wyse 5070
Endpoint: OS : e.g. ThinOS
Endpoint: version of Remoting Client : e.g. Citrix WSA 1911
Endpoint: Res/Number of Displays (Note: do not quote screen inches) : e.g. 2 x 4K (3840x2160)
Endpoint: Apps Used : e.g. Office, Catia
Endpoint: App Characteristics : e.g. Proviz application
Steps to reproduce issue : If applicable