VMware: What is vMotion fails at 67% with error ‘The VM failed to resume on the destination during early power on’.

Symptoms:

Recently we have been experiencing problems with ESXi 4.1 hosts becoming disconnected from vCenter.

Troubleshooting Steps:
- vMotion fails at 67%.
Error message is ‘The VM failed to resume on the destination during early power on”
hostd will post the below events on the destination ESXi host:


2013-05-30T08:20:57.990Z [21AE1B90 info 'Vcsvc.VMotionDst (1369902018116694)' opID=E9BE2EE8-00000A78-43-61-5b] VMotionEntry: migrateType = 1
2013-05-30T08:20:57.991Z [21AE1B90 verbose 'Vcsvc.VMotionDst (1369902018116694)' opID=E9BE2EE8-00000A78-43-61-5b] JrnlCreate: Created 1369902057.4
2013-05-30T08:20:57.991Z [21AE1B90 verbose 'Vcsvc.VMotionDst (1369902018116694)' opID=E9BE2EE8-00000A78-43-61-5b] Migration changed state from NONE to BEGIN
2013-05-30T08:20:57.991Z [21AE1B90 info 'Vcsvc.VMotion' opID=E9BE2EE8-00000A78-43-61-5b] Completed PrepareDestinationEx [1369902018116694]
2013-05-30T08:20:58.067Z [FFF1AD20 info 'Vcsvc.VMotion' opID=E9BE2EE8-00000A78-43-61-4c] InitiateDestination [1369902018116694], VM = ‘/vmfs/volumes/50b472fe-f2e2006b-a074-d48564c9dd57/045GEITGBLBQ003.dmz01.web/045GEITGBLBQ003.dmz01.web.vmx’
2013-05-30T08:20:58.413Z [FFF1AD20 verbose 'Vcsvc.VMotionDst (1369902018116694)' opID=E9BE2EE8-00000A78-43-61-4c] JrnlBegin: VM objID=965 srcIp=10.45.224.46 migrateType=MIGRATE_TYPE_VMOTION
2013-05-30T08:20:58.451Z [FFF1AD20 info 'Vmsvc.vm:/vmfs/volumes/50b472fe-f2e2006b-a074-d48564c9dd57/045GEITGBLBQ003.dmz01.web/045GEITGBLBQ003.dmz01.web.vmx' opID=E9BE2EE8-00000A78-43-61-4c] VMotionPrepare (1369902018116694): Sending ‘from’ srcIp=10.45.224.46 dstIp=10.45.224.35, type=1, encrypted=false, remoteThumbprint=1A:4D:24:87:05:3D:27:55:26:1C:41:8A:3F:29:FF:AC:F9:F2:19:2C
2013-05-30T08:20:58.452Z [FFF1AD20 info 'Vcsvc.VMotionDst (1369902018116694)' opID=E9BE2EE8-00000A78-43-61-4c] Initiate: Waiting for WID
2013-05-30T08:20:59.516Z [FFF1AD20 verbose 'Vcsvc.VMotionDst (1369902018116694)' opID=E9BE2EE8-00000A78-43-61-4c] JrnlPutWid: Wrote wid=78833
2013-05-30T08:20:59.516Z [FFF1AD20 info 'Vcsvc.VMotionDst (1369902018116694)' opID=E9BE2EE8-00000A78-43-61-4c] Initiate: Got WID 78833
2013-05-30T08:20:59.516Z [FFF1AD20 verbose 'Vcsvc.VMotionDst (1369902018116694)' opID=E9BE2EE8-00000A78-43-61-4c] Migration changed state from BEGIN to MIGRATING
2013-05-30T08:20:59.516Z [2199EB90 verbose 'Vmsvc.vm:/vmfs/volumes/50b472fe-f2e2006b-a074-d48564c9dd57/045GEITGBLBQ003.dmz01.web/045GEITGBLBQ003.dmz01.web.vmx'] VMotionCmdCb [1369902018116694] ‘from’: done
2013-05-30T08:21:04.878Z [21081B90 info 'Vcsvc.VMotionDst (1369902018116694)'] ResolveCb: VMX reports needsUnregister = true for migrateType MIGRATE_TYPE_VMOTION
2013-05-30T08:21:05.232Z [2199EB90 info 'Vcsvc.VMotion' opID=E9BE2EE8-00000A78-43-61-e8] CompleteDestination [1369902018116694]2013-05-30T08:21:04.878Z [21081B90 info 'Vcsvc.VMotionDst (1369902018116694)'] ResolveCb: Failed with fault: (vmodl.fault.SystemError) {
–>    dynamicType = <unset>,
–>    faultCause = (vmodl.MethodFault) null,
–>    faultMessage = (vmodl.LocalizableMessage) [
-->       (vmodl.LocalizableMessage) {
-->          dynamicType = <unset>,
-->          key = "msg.migrate.resume.fail",
-->          message = "The VM failed to resume on the destination during early power on.”}]
–>    reason = “The VM failed to resume on the destination during early power on.”,  “, msg = “”}
2013-05-30T08:21:04.879Z [21081B90 verbose 'Vcsvc.VMotionDst (1369902018116694)'] Migration changed state from MIGRATING to UNREGISTERING
2013-05-30T08:21:04.879Z [21081B90 verbose 'Vcsvc.VMotionDst (1369902018116694)'] Migration changed state from MIGRATING to UNREGISTERING
2013-05-30T08:21:04.879Z [21081B90 verbose 'Vcsvc.VMotionDst (1369902018116694)'] Finish called
2013-05-30T08:21:04.987Z [21081B90 verbose 'Vcsvc.VMotionDst (1369902018116694)'] Migration changed state from UNREGISTERING to DONE
2013-05-30T08:21:04.987Z [21081B90 verbose 'Vcsvc.VMotionDst (1369902018116694)'] JrnlEnd: Wrote active=0
2013-05-30T08:21:04.987Z [21081B90 verbose 'Vcsvc.VMotionDst (1369902018116694)'] JrnlEnd: Finished journal
2013-05-30T08:21:04.987Z [21081B90 verbose 'Vcsvc.VMotionDst (1369902018116694)'] [JrnlFilePersistenceProvider::DeleteJournal] Deleting /var/lib/vmware/hostd/journal/1369902057.4.
2013-05-30T08:21:04.987Z [21081B90 verbose 'Vcsvc.VMotionDst (1369902018116694)'] JrnlDelete: Deleted journal

Resolution:
Impact / Risk:
2013-05-30T08:20:57.990Z [21AE1B90 info 'Vcsvc.VMotion' opID=E9BE2EE8-00000A78-43-61-5b] PrepareDestinationEx [1369902018116694]

- vmware.log of VM will post the below events

2013-05-30T08:21:04.772Z| vmx| I120: guestCPUID level 80000008, 0: 0×00003028 0×00000000 0×00000000 0×00000000
2013-05-30T08:21:04.772Z| vmx| I120: guestCPUID level 8000000a, 0: 0×00000000 0×00000000 0×00000000 0×00000000
2013-05-30T08:21:04.772Z| vmx| I120: Msg_Question:
2013-05-30T08:21:04.772Z| vmx| I120: [msg.cpuid.noVHVQuestion] Virtualized AMD-V/RVI is incompatible with this virtual machine configuration.
2013-05-30T08:21:04.772Z| vmx| I120+ Continue without virtualized AMD-V/RVI?
2013-05-30T08:21:04.772Z| vmx| I120: —————————————-
2013-05-30T08:21:04.772Z| vmx| W110: Migrate: Multi-answer question encountered, auto-answering.
2013-05-30T08:21:04.772Z| vmx| I120: MsgQuestion: msg.cpuid.noVHVQuestion reply=1
2013-05-30T08:21:04.772Z| vmx| I120: Module CPUID power on failed.
2013-05-30T08:21:04.772Z| vmx| I120: VMX_PowerOn: ModuleTable_PowerOn = 0
2013-05-30T08:21:04.772Z| vmx| I120: MigrateSetStateFinished: type=2 new state=11
2013-05-30T08:21:04.772Z| vmx| I120: MigrateSetState: Transitioning from state 10 to 11.
2013-05-30T08:21:04.772Z| vmx| I120: [msg.migrate.resume.fail] The VM failed to resume on the destination during early power on.
2013-05-30T08:21:04.773Z| vmx| I120: Msg_Post: Error
2013-05-30T08:21:04.773Z| vmx| I120: [msg.migrate.resume.fail] The VM failed to resume on the destination during early power on.
2013-05-30T08:21:04.773Z| vmx| I120: —————————————-
2013-05-30T08:21:04.862Z| vmx| I120: VMXVmdb_SetToolsVersionStatus: status value set to ‘ok’, ‘current’, install possible
2013-05-30T08:21:04.862Z| svga| I120: Async SVGA thread is exiting
Cause
This will happen when there is vhv parameter enabled on the destination hypervisor under the below location:
cat etc/vmware/config | grep -i vhv
vhv.allow = “TRUE”
vhv.enable = “TRUE”
vhv is applicable for nested hypervisors and therefore vmx is checked during vMotion for this parameter.


Disable the vhv parameter and reboot the host for vMotion to work fine.

Steps to be followed:
- Enter the ESXi in maintenance mode.
- ssh to ESXi
- vi /etc/vmware/config
- Comment out the below parameters in the file.
vhv.allow = “TRUE”
vhv.enable = “TRUE” 
- Save and quit the file.
- Reboot the ESXi. 


- Please ensure that no virtual machines are running since the activity will require ESXi reboot.
- Please keep a backup of config file before making the change.



Comments

Popular posts from this blog

Part - 1 : Windows Administrator: L1: Interview question & Answer for AD, DNS, DHCP, WINS & DFS

Windows: Interview Q & A: L1 & L2 Interview question

How to create a Bootable ESXi Installer using USB Flash Drive