VMware ESXi 5.5.0 panics when using Intel AMT VNC

For compatibility with a new guest OS, I upgraded my ESXi  to 5.5 today. During reboot, it crashes after a few seconds (it briefly flashes a message about starting up PCI passthrough on the yellow ESXi boot screen). The purple screen of death (PSOD) I get looks like this:

VMware ESXi 5.5.0 [Releasebuild-1474528 x86_64]
#PF Exception 14 in world 32797:helper1-2 IP 0x4180046f7319 addr 0x410818781760
PTEs:0x10011e023;0x1080d7063;0x0;
cr0=0x8001003d cr2=0x410818781760 cr3=0xb6cd0000 cr4=0x216c
frame=0x41238075dd60 ip=0x4180046f7319 err=0 rflags=0x10206
rax=0x410818781760 rbx=0x41238075deb0 rcx=0x0
rdx=0x0 rbp=0x41238075de50 rsi=0x41238075deb0
rdi=0x1878176 r8=0x0 r9=0x2
r10=0x417fc47b9528 r11=0x41238075df10 r12=0x1878176
r13=0x1878176000 r14=0x41089f07a400 r15=0x6
*PCPU2:32797/heler1-2
PCPU  0: SSSHS
Code start: 0x418004600000 VMK uptime: 0:00:00:05.201
0x41238075de50:[0x4180046f7319]BackMap_Lookup@vmkernel#nover+0x35 stack: 0xffffffff00000000
0x41238075df00:[0x418004669483]IOMMUDoReportFault@vmkernel#nover+0x133 stack: 0x60000010
0x41238075df30:[0x418004669667]IOMMUProcessFaults@vmkernel#nover+0x1f stack:0x0
0x41238075dfd0:[0x418004660f8a]helpFunc@vmkernel#nover+0x6b6 stack: 0x0
0x41238075dff0:[0x418004853372]CpuSched_StartWorld@vmkernel#nover+0xf1 stack:0x0
base fs=0x0 gs=0x418040800000 Kgs=0x0

When rebooting the machine now, it reverts to my previous version, ESXi 5.1-914609.

A bit of playing around revealed: This only happens if I am connected to the Intel AMT VNC server. If I connect after ESXi has booted up, it crashes a fraction of a second after I connect to VNC. Go figure! Apparently it’s not such a good idea to have a VNC server inside the GPU, Intel…

Before I figured this out, I booted up the old ESXi 5.1.0-914609 and even upgraded it to ESXi 5.1.0-1483097.  Looking at dmesg revealed loads of weird errors while connected to the VNC server:

2014-02-13T11:23:15.145Z cpu0:3980)WARNING: IOMMUIntel: 2351: IOMMU Unit #0: R/W=R, Device 00:02.0 Faulting addr = 0x3f9bd6a000 Fault Reason = 0x0c -> Reserved fields set in PTE actively set for Read or Write.
2014-02-13T11:23:15.145Z cpu0:3980)WARNING: IOMMUIntel: 2371: IOMMU context entry dump for 00:02.0 Ctx-Hi = 0x101 Ctx-Lo = 0x10d681003

lspci | grep ’00:02.0 ‘ shows that this is the integrated Intel GPU (which I’m obviously not doing PCI Passthrough on).

So

  • ESXi 5.5 panics when using Intel AMT VNC
  • ESXi 5.1 handles Intel AMT VNC semi-gracefully and only spams the kernel log with dozens of messages per second
  • ESXi 5.0 worked fine (if I remember correctly)

I have no idea what VMware is doing there. From all I can tell, out-of-band management like Intel AMT should be completely invisible to the OS.

Note that this is on a Sandy Bridge generation machine with an Intel C206 chipset and a Xeon E3-1225. The Q67 chipset is almost identical to the C206, so I expect it to occur there as well. Newer chipsets hopefully behave better, perhaps even newer firmware versions help.

Update November 2014: I just upgraded to the latest version, ESXi 5.5u2-2143827, and it’s working again. I still get the dmesg spam, but the PSODs are gone. These are the kernel messages I’m seeing now while connected via Intel AMT VNC:

2014-11-29T11:17:25.516Z cpu0:32796)WARNING: IOMMUIntel: 2493: IOMMU context entry dump for 0000:00:02.0 Ctx-Hi = 0x101 Ctx-Lo = 0x10ec22001
2014-11-29T11:17:25.516Z cpu0:32796)WARNING: IOMMU: 1652: IOMMU Fault detected for 0000:00:02.0 (unnamed) IOaddr: 0x5dc5aa000 Mask: 0xc Domain: 0x41089f1eb400
2014-11-29T11:17:25.516Z cpu0:32796)WARNING: IOMMUIntel: 2436:  DMAR Fault IOMMU Unit #0: R/W=R, Device 0000:00:02.0 Faulting addr = 0x5dc5aa000 Fault Reason = 0x0c -> Reserved fields set in PTE actively set for Read or Write.

So basically, Intel AMT VNC is now usable again.

Update August 2015: ESXi 6.0 still spams the logs, no change over ESXi 5.5.

9 thoughts on “VMware ESXi 5.5.0 panics when using Intel AMT VNC

  1. sl0n

    Hi,

    I’ve got Jetway NF9E with ATM, using it on daily basis with no issues…ESXi 5.5 installed from scratch If you need any extra info, I’m happy to share.

    Cheers,

  2. Michael Kuron Post author

    I should add that my board is from the Sandy Bridge generation (C206 chipset, similar to the Q67), while yours is an Ivy Bridge Q77.
    I’d appreciate if anyone with an older chipset could report whether it’s working for them. I’ll see if I can find out my Management Engine firmware version, perhaps this issue is even firmware specific. Or it might be due to customizations by the board vendor etc.

  3. Pieter Goossens

    Glad I found your blog! I can confirm this issue also occurs on the Intel DQ67SW motherboard (latest BIOS version SWQ6710H.86A. 0066.2012.1105.1504) with Intel ME 7.1.52.1176. After hooking up a physical monitor and keyboard I was finally able to upgrade my machine from ESXi 5.1 to 5.5. Thanks!

  4. tsygam

    my small experience – I’m using Jetway NF9E and played a bit with NICs (82579lm and 82574L).
    with ESXi’s default install, 82579lm is not seen, therefore ESXi uses 82574L for management network.
    As a result further installation of 82579lm driver into ESXi provides for 2nd NIC and there is no conflict between AMT and ESXi management.
    Though if to reconfigure ESXi to use 82579lm for management (double check MAC in use), either AMT does not work (if started after ESXi boot) or ESXi crashes a few seconds after boot (if VLC is active).

    BR

  5. max

    I have the same problem with my Intel DQ67OW. I also tried to install it via VNC and it crashed. I will try the installation with a real monitor and keyboard.
    I have BIOS version (SWQ6710H.86A.0067.2014.0313.1347)

Leave a Reply

Your email address will not be published. Required fields are marked *