Comment
Author: Admin | 2025-04-28
Workaround

Even after applying this workaround to a system on which this issue occurs, vGPU migration with Tesla M10 GPUs fails with the following error:

Unexpected migration data block encountered.

1. On the host that is running vGPU Manager 9.1, set the registry key RMSetVGPUVersionMax to 0x30001.
2. Start the VM.
3. Confirm that the vGPU version in the log files is 0x30001.

   2020-06-12T10:19:05.420Z| vthread-2142280| I125: vmiop_log: vGPU version: 0x30001

The VM can now be migrated.

Status
Not a bug

Ref. #
200533827

5.16. 9.0, 9.1 Only: ECC memory with NVIDIA vGPU is not supported on Tesla M60 and Tesla M6

Description
Error-correcting code (ECC) memory with NVIDIA vGPU is not supported on Tesla M60 and Tesla M6 GPUs. The effect of starting NVIDIA vGPU when it is configured on a Tesla M60 or Tesla M6 GPU on which ECC memory is enabled depends on your NVIDIA vGPU software release.

9.0 only: The hypervisor host fails.
9.1 only: The VM fails to start.

Workaround
Ensure that ECC memory is disabled on Tesla M60 and Tesla M6 GPUs. For more information, see 9.0, 9.1 Only: Virtual GPU fails to start if ECC is enabled.

Status
Resolved in NVIDIA vGPU software 9.2

5.17. 9.0, 9.1 Only: Virtual GPU fails to start if ECC is enabled

Description
Tesla M60, Tesla M6, and GPUs based on the Pascal GPU architecture, for example Tesla P100 or Tesla P4, support error correcting code (ECC) memory for improved data integrity. Tesla M60 and M6 GPUs in graphics mode are supplied with ECC memory disabled by default, but it may subsequently be enabled using nvidia-smi. GPUs based on the Pascal GPU architecture are supplied with ECC memory enabled.

NVIDIA vGPU does not support ECC memory with the following GPUs:

Tesla M60 GPUs
Tesla M6 GPUs

If ECC memory is enabled and your GPU does not support ECC, NVIDIA vGPU fails to start. The following error is logged in the VMware vSphere host’s log file:

vthread10|E105: Initialization: VGX not supported with ECC Enabled.
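Both log checks described above (confirming the vGPU version after the RMSetVGPUVersionMax workaround, and spotting the ECC initialization error) come down to searching log text for a known string. The following is a minimal sketch in Python, assuming the log lines look exactly like the two samples quoted above. The function names are hypothetical, and the sketch does not know where the log files live on a given host; it only works on text that has already been read.

```python
import re

# Patterns are taken from the two log lines quoted above.
# Function names are hypothetical, chosen for this illustration only.
VGPU_VERSION_RE = re.compile(r"vmiop_log: vGPU version: (0x[0-9a-fA-F]+)")
ECC_ERROR_TEXT = "VGX not supported with ECC Enabled"


def find_vgpu_versions(log_text):
    """Return every vGPU version string reported in the log, in order."""
    return VGPU_VERSION_RE.findall(log_text)


def has_ecc_start_failure(log_text):
    """Return True if the log contains the ECC initialization error."""
    return ECC_ERROR_TEXT in log_text


# The two sample lines quoted in the text above.
sample = (
    "2020-06-12T10:19:05.420Z| vthread-2142280| I125: "
    "vmiop_log: vGPU version: 0x30001\n"
    "vthread10|E105: Initialization: VGX not supported with ECC Enabled.\n"
)

print(find_vgpu_versions(sample))
print(has_ecc_start_failure(sample))
```

After the workaround, the last version reported for the VM would be expected to read 0x30001; any other value suggests the registry key did not take effect.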
Workaround
If you are using Tesla M60 or Tesla M6 GPUs, ensure that ECC is disabled on all GPUs.

Before you begin, ensure that NVIDIA Virtual GPU Manager is installed on your hypervisor.

Use nvidia-smi to list the status of all GPUs, and check for ECC noted as enabled on GPUs.

# nvidia-smi -q
==============NVSMI LOG==============
Timestamp : Tue Dec 19 18:36:45 2017
Driver Version : 384.99
Attached GPUs : 1
GPU 0000:02:00.0
[...]
Ecc
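The ECC check in this step can also be done by scanning saved nvidia-smi -q output rather than reading it by eye. The sketch below is a hedged illustration in Python. It assumes a layout that the truncated listing above does not show: that each GPU section begins with a line such as "GPU 0000:02:00.0" and contains an "Ecc Mode" block whose "Current" line carries the state. The function name and the sample text are hypothetical, and the code does not run nvidia-smi itself.

```python
import re


def ecc_modes(nvidia_smi_q_output):
    """Map each GPU bus ID to its current ECC mode string.

    Assumed layout (not confirmed by the truncated listing above):
    a "GPU <bus id>" line opens each section, followed somewhere by
    an "Ecc Mode" block with a "Current : <state>" line.
    """
    modes = {}
    gpu = None
    in_ecc_block = False
    for raw in nvidia_smi_q_output.splitlines():
        line = raw.strip()
        header = re.match(r"GPU ([0-9A-Fa-f:.]+)$", line)
        if header:
            # New GPU section: remember its bus ID.
            gpu = header.group(1)
            in_ecc_block = False
        elif line.lower() == "ecc mode":
            in_ecc_block = True
        elif in_ecc_block and line.startswith("Current"):
            # Keep the text to the right of the first colon.
            modes[gpu] = line.split(":", 1)[1].strip()
            in_ecc_block = False
    return modes


# Illustrative input only; this is not output captured from a real host.
sample = """\
==============NVSMI LOG==============
Attached GPUs : 1
GPU 0000:02:00.0
    Ecc Mode
        Current : Enabled
        Pending : Enabled
"""

for bus_id, mode in ecc_modes(sample).items():
    print(bus_id, mode)
```

Any Tesla M60 or Tesla M6 GPU that this reports as Enabled is one on which ECC would need to be disabled before NVIDIA vGPU can start.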