Have you ever had a network administrator make a change that cut your hosts off from the network, leaving the ESXi shell through an HP ILO port as your only option? I have. Here are the commands that were helpful to me at the time.
esxcli network vswitch standard list — This lists all of the vswitches on the host.
esxcli network vswitch standard uplink add -u vmnic1 -v vSwitch0 — Add an uplink to the vswitch. Just switch “add” to “remove” to get rid of a vmnic from the vswitch.
esxcli network vswitch standard policy failover get -v vSwitch0 — Get the load balancing policy for the vSwitch.
esxcli network vswitch standard policy failover set -a vmnic1 -v vSwitch0 — Be careful: when you change the active vmnics on a vswitch, only the ones you specify become active, and any currently active vmnics that you do not specify are moved to unused.
esxcli network vswitch standard portgroup policy failover get -p "Management Network" — This shows the policy information for the "Management Network" portgroup.
esxcli network vswitch standard portgroup policy failover set -a vmnic1 -p "Management Network" — Change the active adapter on the Management Network portgroup.
esxcli network vswitch standard policy failover set -l iphash -v vSwitch0 — Set the load balancing on the vSwitch0 to iphash. Please be aware that this might change the portgroups to iphash if they are set to get their settings from the vswitch.
esxcli network vswitch standard portgroup policy failover set -l iphash -p "Management Network" — Sets the load balancing policy on the Management Network portgroup to iphash. This will cause the load balancing policy on the vswitch to no longer apply to the Management Network portgroup.
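Because the -a option replaces the whole active list rather than appending to it, one way to avoid dropping an uplink by accident is to always pass every vmnic that should stay active, separated by commas. The sketch below only prints the command so it can be reviewed before it is typed into the shell. The helper names join_uplinks and build_failover_cmd are made up for this example, and it assumes vmnic0 is the uplink that is already active.

```shell
#!/bin/sh
# Hypothetical helper: join uplink names with commas, the form the -a option expects.
join_uplinks() {
    list=""
    for nic in "$@"; do
        if [ -z "$list" ]; then
            list="$nic"
        else
            list="$list,$nic"
        fi
    done
    printf '%s\n' "$list"
}

# Hypothetical helper: print (not run) the failover command for a vswitch.
# First argument is the vswitch name, the rest are the uplinks to keep active.
build_failover_cmd() {
    vswitch="$1"
    shift
    printf 'esxcli network vswitch standard policy failover set -a %s -v %s\n' \
        "$(join_uplinks "$@")" "$vswitch"
}

# Assumption: vmnic0 is already active and vmnic1 is being added alongside it.
build_failover_cmd vSwitch0 vmnic0 vmnic1
```

Running the failover get command for the same vswitch before and after the change is a simple way to confirm which vmnics ended up active and which were moved to unused.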
There are two separate alerts that I have experienced since upgrading the firmware on our HP ILO ports. Please note that after the update the ILO cards were rebooted, but our ESXi hosts running vSphere 5.1 were not rebooted.
Out of 31 hosts that had firmware updated, this error has appeared on 6 so far. Some of them took a couple days before they presented with the alert. The alarm triggered is: Host Memory Status. Under the “Hardware Status” tab the alert shows for the System Board 8 Memory: Uncorrectable ECC.
The second problem that I have encountered is the IPMI SEL log filling up. I am able to go in and clear the log, which gets rid of the alert for a short time, but the log fills up again. The alert shows as Host IPMI System Event Log Status. Under the “Hardware Status” tab the “System Event Log” and “IPMI SEL” show as Unknown. You can click on “Show Event Log” and then “Reset Event Log” and it will clear for a while, but the alert will return. Notice the future date of 12/31/9998, which I am guessing might be when the world ends.
How do I fix these problems? After calling HP and VMware I was told that I needed to put each host into Maintenance Mode and then run a “Detailed Hardware Diagnostic”. According to VMware this was the only way to clear the error (especially the memory one). The solution that ended up working for me was to just reboot the host…:) VMware couldn’t believe that worked, but it did. I know it isn’t a difficult fix, but maybe this will help others that get this alert.
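If the vSphere Client is not handy, the reboot can also be done from the ESXi shell. This is only a rough sketch of how that might look, not the exact steps I took: the run and reboot_host helpers and the DRY_RUN variable are made up for this example, and it assumes the running VMs have already been migrated or powered off so the host can actually enter Maintenance Mode. With DRY_RUN left at 1 it only prints the commands.

```shell
#!/bin/sh
# DRY_RUN=1 (the default in this sketch) prints each command instead of running it.
# Set DRY_RUN=0 only on a real ESXi host.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: enter Maintenance Mode, confirm it, then reboot the host.
reboot_host() {
    run esxcli system maintenanceMode set --enable true
    run esxcli system maintenanceMode get
    run esxcli system shutdown reboot --reason "Clear hardware alerts after ILO firmware update"
}

reboot_host
```

Once the host is back up, running the maintenanceMode set command again with --enable false takes it out of Maintenance Mode.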
Please let me know if you have encountered similar alerts from upgrading the firmware on your ILO ports.