Now, why do we need to restart the management agents on the host?

We also need to understand the impact, and when and how to do it. Restarting the services is an easy task, but we need to ask whether it is actually necessary or whether we are just doing it to save time and get back to production.

Symptoms:

+++++++++

  • Cannot connect directly to the ESXi host or manage it under vCenter Server.
  • vCenter Server displays the error:

    Virtual machine creation may fail because agent is unable to retrieve VM creation options from the host

Restart Management agents in ESXi Using Direct Console User Interface (DCUI):

  1. Connect to the console of your ESXi host.
  2. Press F2 to customize the system.
  3. Log in as root.
  4. Use the Up/Down arrows to navigate to Troubleshooting Options > Restart Management Agents.
  5. Press Enter.
  6. Press F11 to restart the services.
  7. When the service restarts, press Enter.
  8. Press Esc to log out.

Restart Management agents in ESXi Using ESXi Shell or Secure Shell (SSH):
  1. Log in to ESXi Shell or SSH as root.

    To enable ESXi Shell or SSH, see Using ESXi Shell in ESXi 5.x and 6.x (2004746).

  2. Restart the ESXi host daemon and vCenter Agent services using these commands:

    /etc/init.d/hostd restart

    /etc/init.d/vpxa restart

Note: In ESXi 4.x, run this command to restart the vpxa agent:

service vmware-vpxa restart

Alternatively:

  • To reset the management network on a specific VMkernel interface, by default vmk0, run the command:

    esxcli network ip interface set -e false -i vmk0; esxcli network ip interface set -e true -i vmk0

    Note: Using a semicolon (;) between the two commands ensures the VMkernel interface is disabled and then re-enabled in succession. If the management interface is not running on vmk0, change the above command according to the VMkernel interface used.

  • To restart all management agents on the host, run the command:

    services.sh restart
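As a minimal sketch, the individual hostd and vpxa restarts shown earlier could be wrapped in a small script that restarts one agent at a time and prints its status afterwards. The function names and the INIT_DIR variable are hypothetical helpers introduced here for illustration. The script also assumes the init scripts accept a status argument, which you should verify on your ESXi build.

```shell
#!/bin/sh
# Sketch only: restart hostd and vpxa in order and report their status.
# INIT_DIR is a hypothetical override so the script can be dry-run
# against stub scripts; on a real host it defaults to /etc/init.d.
INIT_DIR="${INIT_DIR:-/etc/init.d}"

restart_agent() {
    svc="$1"
    if [ ! -x "$INIT_DIR/$svc" ]; then
        echo "skip: $INIT_DIR/$svc not found" >&2
        return 1
    fi
    "$INIT_DIR/$svc" restart || return 1
    # Assumption: the init script supports a 'status' argument.
    "$INIT_DIR/$svc" status
}

restart_agents() {
    rc=0
    # hostd first, then vpxa, matching the order of the manual commands.
    for svc in hostd vpxa; do
        restart_agent "$svc" || rc=1
    done
    return "$rc"
}
```

On the host you would source the file and call restart_agents. A non-zero return code means at least one agent did not restart or report status cleanly, which is a hint to look at the logs before retrying.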

Now we know how to do it. Let's see what it might look like when the hostd and vpxa services are hung. There might be a lot of zombie hostd processes causing this.

If there is only one hung hostd process, you can try killing it first. If there are multiple, I would recommend rebooting the host rather than risk creating yet another zombie process. The same goes for the vpxa process.

You see these errors:
  • VPXA log errors:

    Authd error: 514 Error connecting to hostd-vmdb service instance.
    Failed to connect to host :902. Check that authd is running correctly (lib/connect error 11).

  • Error in hostd.log:

    2017-11-27T19:57:41.000Z [282DFB70 info 'Vimsvc.ha-eventmgr'] Event 7856 : Issue detected on test.local in ha-datacenter: hostd detected to be non-responsive
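To check quickly whether a log contains any of the messages quoted above, something like the following sketch might help. The function name count_hang_signatures is hypothetical, and the search patterns are just fragments of the errors shown in this post, so treat it as a starting point rather than a complete health check.

```shell
#!/bin/sh
# Sketch only: count lines in a log file that contain one of the
# hang-related messages quoted in this post.
# Prints the number of matching lines.
# Returns 0 if any were found, 1 if none, 2 if the file is unreadable.
count_hang_signatures() {
    log="$1"
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 2
    fi
    grep -c \
        -e 'Error connecting to hostd-vmdb service instance' \
        -e 'Check that authd is running correctly' \
        -e 'hostd detected to be non-responsive' \
        "$log"
}
```

For example, count_hang_signatures /var/log/hostd.log or count_hang_signatures /var/log/vpxa.log, assuming the logs are in their default location on your host.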

Whether you restart all the agents or only an individual service depends on what the logs show.

You are in the best position to judge whether rebooting the host or restarting the agents is the better idea.

If you see a few of the errors below:

  • The ESXi host appears as disconnected in vCenter Server
  • Attempting to reconnect the host fails
  • You see the error:

    A general system error occurred: Timed out waiting for vpxa to start

To resolve this issue:

  1. Restart the Management agents on the host.
  2. Log in to the host directly using the vSphere Client.
  3. Right-click the virtual machine and click Remove from Inventory.
  4. Restart the Management agents on the host.
  5. Right-click the host in vCenter Server and click Reconnect.

 

I hope this helps 🙂