Yesterday I ran into a problem after shutting down a legacy SAN: when I rescanned my HBAs (the rescan timed out), I lost the connection to the host.
I tried restarting the usual management services from the console,
# service vmware-vpxa restart
# service mgmt-vmware restart
but I got this:
Stopping VMware ESX Server Management services:
VMware ESX Server Host Agent Watchdog [FAILED]
I found the following post, http://communities.vmware.com/message/1157237, and KB article, http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=1005566
Both have you kill the hostd process and then remove its PID files. Here are the contents of the KB.
"
Service mgmt-vmware restart may not restart hostd
Symptoms
Resolution
You must manually stop the stuck service and restart it.
To stop the service and restart it:
- Log in as root to the ESX host command-line via the physical console or via KVM connection.
- Navigate to the /var/run/vmware directory:
# cd /var/run/vmware
- Run the following command to list the files vmware-hostd.PID and watchdog-hostd.PID:
# ls -l vmware-hostd.PID watchdog-hostd.PID
- Determine the Process ID (PID) of the management service. View the contents of the vmware-hostd.PID file:
# cat vmware-hostd.PID
For example:
[root@vmware]# cat vmware-hostd.PID
1191[root@vmware]#
- Use the resulting PID to kill the process.
Caution: Use the kill -9 command with care. It kills the process of the supplied PID without exception or confirmation.
# kill -9 <PID>
In this example you run kill -9 1191.
- Delete the vmware-hostd.PID and watchdog-hostd.PID files:
# rm vmware-hostd.PID watchdog-hostd.PID
- Restart the management service.
# service mgmt-vmware restart
"
This seemed to fix it, and without any downtime.
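If you would rather not type those steps by hand on a hung host, the KB procedure can be sketched as a small shell function. This is only a rough sketch and not part of the KB: the name clear_stuck_hostd is made up for illustration, the default directory is the /var/run/vmware path the KB gives, and the two safety checks (PID must be numeric, process must still exist before kill -9) are my own additions.

```shell
#!/bin/sh
# Sketch of the KB steps: kill the stuck hostd by PID, then remove the PID files.
# clear_stuck_hostd is a hypothetical helper name, not a VMware tool.
clear_stuck_hostd() {
    # Directory holding the PID files; defaults to the path named in the KB.
    run_dir="${1:-/var/run/vmware}"
    pid_file="$run_dir/vmware-hostd.PID"

    if [ ! -f "$pid_file" ]; then
        echo "no PID file at $pid_file" >&2
        return 1
    fi

    # Read the PID and strip any whitespace around it.
    pid="$(tr -d '[:space:]' < "$pid_file")"

    # Added safety check: never hand kill anything that is not a plain number.
    case "$pid" in
        ''|*[!0-9]*)
            echo "unexpected PID file contents: '$pid'" >&2
            return 1
            ;;
    esac

    # kill -0 sends no signal; it only tests whether the process exists.
    if kill -0 "$pid" 2>/dev/null; then
        kill -9 "$pid"
    else
        echo "process $pid is not running; removing stale PID files only" >&2
    fi

    # Remove both PID files so the service script can start cleanly.
    rm -f "$run_dir/vmware-hostd.PID" "$run_dir/watchdog-hostd.PID"
}
```

After sourcing it as root, running clear_stuck_hostd && service mgmt-vmware restart would cover the last step of the KB as well. The same caution about kill -9 applies here.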
Roger L