This might be useful to everyone who deals with storage-related issues day in and day out:
- After failed attempts to grow a VMFS datastore, VIM API information and LVM information on the system are inconsistent
This problem occurs when you attempt to grow the datastore while the backing SCSI device enters the APD or PDL state. As a result, you might observe inconsistent information in VIM APIs and LVM commands on the host.
Workaround: Perform these steps:
- Run the vmkfstools --growfs command on one of the hosts connected to the volume.
- Perform the rescan-vmfs operation on all hosts connected to the volume.
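The two workaround steps above can be sketched as a small shell helper. It only prints the commands for review and does not run them. The partition path naa.XXX:1 is a placeholder, and mapping the rescan-vmfs operation to esxcli storage filesystem rescan is an assumption, not something the release notes state.

```shell
#!/bin/sh
# Sketch only: prints the workaround commands so they can be reviewed first.
# $1 is the partition backing the datastore (placeholder value in the example).
grow_vmfs_cmds() {
    part="$1"
    # Step 1: run on ONE host connected to the volume.
    printf 'vmkfstools --growfs "%s" "%s"\n' "$part" "$part"
    # Step 2: run on EVERY host connected to the volume.
    # Assumption: this esxcli call is the CLI equivalent of rescan-vmfs.
    printf 'esxcli storage filesystem rescan\n'
}

# Example: grow_vmfs_cmds /vmfs/devices/disks/naa.XXX:1
```

Printing first is a deliberate choice here: with a device that recently went through APD or PDL, it seems safer to read the exact command before running it.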
- VMFS6 datastore does not support combining 512n and 512e devices in the same datastore
You can expand a VMFS6 datastore only with devices of the same type. If the VMFS6 datastore is backed by a 512n device, expand the datastore with 512n devices. If the datastore is created on a 512e device, expand the datastore with 512e devices.
Workaround: None.
- ESXi does not support automatic space reclamation on arrays with unmap granularity greater than 1 MB
If the unmap granularity of the backing storage is greater than 1 MB, the unmap requests from the ESXi host are not processed. You can see the Unmap not supported message in the vmkernel.log file.
Workaround: None.
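The 1 MB limit above can be expressed as a quick check. This is a minimal sketch: the function name is hypothetical, and the granularity value (in bytes) is assumed to come from the array vendor's documentation, since the release notes do not say how to query it.

```shell
#!/bin/sh
# 1 MB expressed in bytes.
ONE_MB=1048576

# Succeeds (exit 0) when the array unmap granularity, given in bytes,
# is at or below 1 MB, so ESXi unmap requests can be processed.
# Fails (exit 1) when the granularity is greater than 1 MB.
auto_unmap_ok() {
    [ "$1" -le "$ONE_MB" ]
}

# Example: auto_unmap_ok 4194304 || echo "automatic reclamation will not work"
```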
- Using storage rescan in environments with a large number of LUNs might cause unpredictable problems
Storage rescan is an I/O-intensive operation. If you run it while performing other datastore management operations, such as creating or extending a datastore, you might experience delays and other problems. Problems are likely to occur in environments with a large number of LUNs, up to the 1024 that are supported in the vSphere 6.5 release.
Workaround: Typically, the storage rescans that your hosts perform periodically are sufficient. You are not required to rescan storage when you perform general datastore management tasks. Run storage rescans only when absolutely necessary, especially when your deployments include a large set of LUNs.
- The NFS 4.1 client loses synchronization with the NFS server when attempting to create new sessions
This problem occurs after a period of interrupted connectivity with the NFS server or when NFS I/Os do not get a response. When this issue occurs, the vmkwarning.log file contains a throttled series of warning messages similar to the following:
NFS41 CREATE_SESSION request failed with NFS4ERR_SEQ_MISORDERED
Workaround: Perform the following steps:
- Unmount the affected NFS 4.1 datastores. If no files are open when you unmount, this operation succeeds and the NFS 4.1 client module cleans up its internal state. You can then remount the datastores that were unmounted and resume normal operation.
- If unmounting the datastore does not solve the problem, disable the NICs connecting to the IP addresses of the NFS shares. Keep the NICs disabled for as long as it is required for the server lease times to expire, and then bring the NICs back up. Normal operations should resume.
- If the preceding steps fail, reboot the ESXi host.
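The first workaround step (unmount, then remount) can be sketched with the esxcli storage nfs41 namespace. The helper below only prints the commands. The datastore name, server address, and share path in the example are placeholders, not values from the release notes.

```shell
#!/bin/sh
# Sketch only: prints the unmount and remount commands for one NFS 4.1
# datastore so they can be reviewed before being run on the ESXi host.
# $1 = datastore name, $2 = NFS server address(es), $3 = exported share
nfs41_remount_cmds() {
    name="$1"; hosts="$2"; share="$3"
    # Unmounting succeeds only if no files are open on the datastore.
    printf 'esxcli storage nfs41 remove --volume-name="%s"\n' "$name"
    printf 'esxcli storage nfs41 add --hosts="%s" --share="%s" --volume-name="%s"\n' \
        "$hosts" "$share" "$name"
}

# Example: nfs41_remount_cmds ds-nfs41 192.0.2.10 /export/ds1
```

Running esxcli storage nfs41 list beforehand is a convenient way to capture the current name, hosts, and share values to pass in.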
- After an ESXi reboot, NFS 4.1 datastores exported by EMC VNX storage fail to mount
Due to a potential problem with EMC VNX, NFS 4.1 remount requests might fail after an ESXi host reboot. As a result, any existing NFS 4.1 datastores exported by this storage appear as unmounted.
Workaround: Wait for the lease time of 90 seconds to expire and manually remount the volume.
- Mounting the same NFS datastore with different labels might trigger failures when you attempt to mount another datastore later
The problem occurs when you use the esxcli command to mount the same NFS datastore on different ESXi hosts. If you use different labels, for example A and B, vCenter Server renames B to A, so that the datastore has consistent labels across the hosts. If you later attempt to mount a new datastore and use the B label, your ESXi host fails. This problem occurs only when you mount the NFS datastore with the esxcli command. It does not affect mounting through the vSphere Web Client.
Workaround: When mounting the same NFS datastore with the esxcli commands, make sure to use consistent labels across the hosts.
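One way to catch inconsistent labels before mounting is to keep a small plan file with one "host label" pair per line and check it first. This is a hedged sketch: the function name and the plan format are hypothetical, and labels are assumed to contain no spaces.

```shell
#!/bin/sh
# Reads "esxi-host label" pairs on stdin, one per line.
# Succeeds (exit 0) when every host uses the same label for the datastore,
# fails (exit 1) when two or more distinct labels appear.
labels_consistent() {
    awk 'NF >= 2 && !seen[$2]++ { n++ } END { exit (n > 1) }'
}

# Example:
#   printf 'esx01 A\nesx02 B\n' | labels_consistent || echo "labels differ"
```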
- An NFS 4.1 datastore exported from a VNX server might become inaccessible
When the VNX 4.1 server disconnects from the ESXi host, the NFS 4.1 datastore might become inaccessible. This issue occurs if the VNX server unexpectedly changes its major number. However, the NFS 4.1 client does not expect the server major number to change after establishing connectivity with the server.
Workaround: Remove all datastores exported by the server and then remount them.
Virtual Volumes Issues
- After upgrade from vSphere 6.0 to vSphere 6.5, the Virtual Volumes storage policy might disappear from the VM Storage Policies list
After you upgrade your environment to vSphere 6.5, the Virtual Volumes storage policy that you created in vSphere 6.0 might no longer be visible in the list of VM storage policies.
Workaround: Log out of the vSphere Web Client, and then log in again.
- The vSphere Web Client fails to display information about the default profile of a Virtual Volumes datastore
Typically, you can check information about the default profile associated with the Virtual Volumes datastore. In the vSphere Web Client, you do it by browsing to the datastore, and then clicking Configure > Settings > Default Profiles.
However, the vSphere Web Client is unable to report the default profiles when their IDs, configured at the storage side, are not unique across all the datastores reported by the same Virtual Volumes provider.
Workaround: None.
- In vSphere 6.5, the name assigned to the iSCSI software adapter is different from the earlier releases
After you upgrade to the vSphere 6.5 release, the name of the existing software iSCSI adapter, vmhbaXX, changes. This change affects any scripts that use hard-coded values for the name of the adapter. Because VMware does not guarantee that the adapter name remains the same across releases, you should not hard-code the name in scripts. The name change does not affect the behavior of the iSCSI software adapter.
Workaround: None.
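Scripts can look the adapter name up at run time instead of hard-coding it. The sketch below filters the output of esxcli iscsi adapter list. It assumes that the first column is the adapter name, the second column is the driver, and that the software iSCSI adapter reports the iscsi_vmk driver. Verify those assumptions against the output on your own hosts.

```shell
#!/bin/sh
# Reads `esxcli iscsi adapter list` output on stdin and prints the name
# of the first adapter whose driver column is iscsi_vmk.
# Assumption: column 1 = adapter name, column 2 = driver name.
sw_iscsi_adapter() {
    awk '$2 == "iscsi_vmk" { print $1; exit }'
}

# Usage on a host (not run here):
#   HBA="$(esxcli iscsi adapter list | sw_iscsi_adapter)"
```

If the function prints nothing, no adapter matched, and the calling script should stop instead of falling back to a guessed vmhbaXX name.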
Storage Host Profiles Issues
- Attempts to set the action_OnRetryErrors parameter through host profiles fail
This problem occurs when you edit a host profile to add the SATP claim rule that activates the action_OnRetryErrors setting for NMP devices claimed by VMW_SATP_ALUA. The setting controls the ability of an ESXi host to mark a problematic path as dead and trigger a path failover. When added through the host profile, the setting is ignored.
Workaround: You can use two alternative methods to set the parameter on a reference host.
- Use the following esxcli commands to disable or enable the action_OnRetryErrors parameter:
  esxcli storage nmp satp generic deviceconfig set -c disable_action_OnRetryErrors -d naa.XXX
  esxcli storage nmp satp generic deviceconfig set -c enable_action_OnRetryErrors -d naa.XXX
- Perform these steps:
  - Add the VMW_SATP_ALUA claim rule to the SATP rule:
    esxcli storage nmp satp rule add --satp=VMW_SATP_ALUA --option=enable_action_OnRetryErrors --psp=VMW_PSP_XXX --type=device --device=naa.XXX
  - Run the following commands to reclaim the device:
    esxcli storage core claimrule load
    esxcli storage core claiming reclaim -d naa.XXX
VM Storage Policy Issues
- Hot migrating a virtual machine with vMotion across vCenter Servers might change the compliance status of a VM storage policy
After you use vMotion to perform a hot migration of a virtual machine across vCenter Servers, the VM Storage Policy compliance status changes to UNKNOWN.
Workaround: Check compliance on the migrated virtual machine to refresh the compliance status.
- In the vSphere Web Client, browse to the virtual machine.
- From the right-click menu, select VM Policies > Check VM Storage Policy Compliance.
The system verifies the compliance.
Storage Driver Issues
- The bnx2x inbox driver that supports the QLogic NetXtreme II Network/iSCSI/FCoE adapter might cause problems in your ESXi environment
Problems and errors occur when you disable or enable VMkernel ports and change the failover order of NICs for your iSCSI network setup.
Workaround: Replace the bnx2x driver with an asynchronous driver. For information, see the VMware Web site.
- The ESXi host might experience problems when you use Seagate SATA storage drives
If you use an HBA that is claimed by the lsi_msgpt3 driver, the host might experience problems when connecting to the Seagate SATA devices. The vmkernel.log file displays errors similar to the following:
SCSI cmd RESERVE failed on path XXX
reservation state on device XXX is unknown
Workaround: Replace the Seagate SATA drive with another drive.
- When you use the Dell lsi_mr3 driver version 6.903.85.00-1OEM.600.0.0.2768847, you might encounter errors
If you use the Dell lsi_mr3 asynchronous driver version 6.903.85.00-1OEM.600.0.0.2768847, the VMkernel logs might display the following message:
ScsiCore: 1806: Invalid sense buffer
Workaround: Replace the driver with the vSphere 6.5 inbox driver or an asynchronous driver from Broadcom.
Boot from SAN Issues
- Installing ESXi 6.5 on a Fibre Channel or iSCSI LUN with LUN ID greater than 255 is not supported
vSphere 6.5 supports LUN IDs from 0 to 16383. However, due to adapter BIOS limitations, you cannot use LUNs with IDs greater than 255 for the boot from SAN installation.
Workaround: For ESXi installation, use LUNs with IDs 255 or lower.
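The two ranges above can be encoded as a simple pre-installation check. This is a sketch with a hypothetical function name and hypothetical result labels. It only encodes the numbers stated in the release note.

```shell
#!/bin/sh
# Classifies a LUN ID according to the limits quoted above:
#   0..255      usable for a boot from SAN installation
#   256..16383  supported by vSphere 6.5, but not for boot from SAN
#   otherwise   outside the supported range
classify_lun_id() {
    id="$1"
    if [ "$id" -lt 0 ] || [ "$id" -gt 16383 ]; then
        echo "unsupported"
    elif [ "$id" -le 255 ]; then
        echo "boot-ok"
    else
        echo "data-only"
    fi
}

# Example: classify_lun_id 300
```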
Miscellaneous Storage Issues
- If you use SESparse VMDK, formatting of a VM with Windows or Linux file system takes longer
When you format a VM with a Windows or Linux file system, the process might take longer than usual. This occurs if the virtual disk is SESparse.
Workaround: Before formatting, disable the UNMAP operation on the guest operating system. You can re-enable the operation after the formatting process completes.
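For Linux guests, one possible reading of this workaround is to skip the discard pass at format time instead of changing a system-wide setting. The helper below prints a mkfs command with discard disabled for ext4 or XFS. The function name and the device path in the example are hypothetical, and treating "disable UNMAP" as "format without discard" is an interpretation, not wording from the release notes.

```shell
#!/bin/sh
# Sketch only: prints a mkfs command that skips the discard (UNMAP) pass.
# $1 = file system type (ext4 or xfs), $2 = block device inside the guest
mkfs_nodiscard_cmd() {
    case "$1" in
        ext4) printf 'mkfs.ext4 -E nodiscard %s\n' "$2" ;;
        xfs)  printf 'mkfs.xfs -K %s\n' "$2" ;;
        *)    echo "unsupported file system: $1" >&2; return 1 ;;
    esac
}

# Example: mkfs_nodiscard_cmd ext4 /dev/sdb1
```

On Windows guests, fsutil behavior set DisableDeleteNotify 1 turns delete notifications off, and setting the value back to 0 turns them on again after formatting.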
- Attempts to use the VMW_SATP_LOCAL plug-in for shared remote SAS devices might trigger problems and failures
In releases earlier than ESXi 6.5, SAS devices are marked as remote despite being claimed by the VMW_SATP_LOCAL plug-in. In ESXi 6.5, all devices claimed by VMW_SATP_LOCAL are marked as local even when they are external. As a result, when you upgrade to ESXi 6.5 from earlier releases, any of your existing remote SAS devices that were previously marked as remote change their status to local. This change affects shared datastores deployed on these devices and might cause problems and unpredictable behavior.
In addition, problems occur if you incorrectly use the devices that are now marked as local, but are in fact shared and external, for certain features. Examples include allowing creation of the VFAT file system on the devices or using them for Virtual SAN.
Workaround: Do not use the VMW_SATP_LOCAL plug-in for remote external SAS devices. Use another applicable SATP from the supported list or a vendor-unique SATP.
- Logging out of the vSphere Web Client while uploading a file to a datastore cancels the upload and leaves an incomplete file
Uploading large files to a datastore takes some time. If you log out while uploading the file, the upload is cancelled without warning. The partially uploaded file might remain on the datastore.
Workaround: Do not log out during file uploads. If the datastore contains the incomplete file, manually delete the file from the datastore.
Storage I/O Control Issues
- You cannot change VM I/O filter configuration during cloning
Storage I/O Control does not support changing a virtual machine's policies during cloning.
Workaround: Perform the clone operation without any policy change. You can update the policy after completing the clone operation.
- Storage I/O Control settings are not honored per VMDK
Storage I/O Control settings are not honored on a per-VMDK basis. The VMDK settings are honored at the virtual machine level.
Workaround: None.
Storage DRS Issues
- Storage DRS does not honor Pod-level VMDK affinity if the VMDKs on a virtual machine have a storage policy attached to them
If you set a storage policy on the VMDK of a virtual machine that is part of a datastore cluster with Storage DRS enabled, then Storage DRS does not honor the Keep VMDKs together flag for that virtual machine. It might recommend different datastores for newly added or existing VMDKs. This behavior is observed when you set any kind of policy, such as VMCrypt or tag-based policies.
Workaround: None.
- You cannot disable Storage DRS when deploying a VM from an OVF template
When you deploy an OVF template and select an individual datastore from a Storage DRS cluster for the VM placement, you cannot disable Storage DRS for your VM. Storage DRS remains enabled and might later move this VM to a different datastore.
Workaround: To permanently keep the VM on the selected datastore, manually change the automation level of the VM. Add the VM to the VM overrides list from the storage cluster settings.
Backup and Restore Issues
- After a file-based restore of a vCenter Server Appliance to a vCenter Server instance, operations in the vSphere Web Client, such as configuring a high availability cluster or enabling SSH access to the appliance, might fail
In the process of restoring a vCenter Server instance, a new vCenter Server Appliance is deployed and the appliance HTTP server is started with a self-signed certificate. The restore process completes by recovering the backed-up certificates but does not restart the appliance HTTP server. As a result, any operation that requires an internal API call to the appliance HTTP server fails.
Workaround: After restoring the vCenter Server Appliance to a vCenter Server instance, log in to the appliance and restart its HTTP server by running the following command:
service vami-lighttp restart
- Attempts to restore a Platform Services Controller appliance from a file-based backup fail if you have changed the number of vCPUs or the disk size of the appliance
In vSphere 6.5, the Platform Services Controller appliance is deployed with 2 vCPUs and a 60 GB disk. Increasing the number of vCPUs or the disk size is unsupported. If you try to perform a file-based restore of a Platform Services Controller appliance with more than 2 vCPUs or more than 60 GB of disk space, the vCenter Server Appliance installer fails with the error:
No possible size matches your set of requirements.
Workaround: Decrease the number of processors to no more than 2 vCPUs and the disk size to no more than 60 GB.
- Restoring a vCenter Server Appliance with an external Platform Services Controller from an image-based backup does not start all vCenter Server services
After you use vSphere Data Protection to restore a vCenter Server Appliance with an external Platform Services Controller, you must run the vcenter-restore script to complete the restore operation and start the vCenter Server services. The vcenter-restore execution might fail with the error message:
Operation Failed. Please make sure the SSO username and password are correct and rerun the script. If problem persists, contact VMware support.
Workaround: After the vcenter-restore execution has failed, run the service-control --start --all command to start all services.
If the service-control --start --all execution fails, verify that you entered the correct vCenter Single Sign-On user name and password. You can also contact VMware Support.
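The fallback order described above (run the restore script, and start all services only if it fails) can be sketched as a small wrapper for the appliance shell. The function name and the RESTORE_CMD and START_CMD override variables are hypothetical. The wrapper assumes both tools are on the PATH and passes any arguments through to vcenter-restore unchanged.

```shell
#!/bin/sh
# Runs the restore script; if it fails, falls back to starting all services.
# RESTORE_CMD and START_CMD are hypothetical overrides, useful for dry runs.
finish_restore() {
    restore_cmd="${RESTORE_CMD:-vcenter-restore}"
    start_cmd="${START_CMD:-service-control --start --all}"

    if $restore_cmd "$@"; then
        echo "restore script completed"
        return 0
    fi

    echo "restore script failed, starting all services" >&2
    if $start_cmd; then
        echo "services started"
        return 0
    fi

    # Both steps failed: recheck the Single Sign-On user name and password.
    echo "service start failed, verify the SSO credentials" >&2
    return 1
}
```

A non-zero return from finish_restore means both steps failed, which is the point where the release note recommends rechecking the credentials or contacting VMware Support.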
Reference : https://docs.vmware.com/en/VMware-vSphere/6.5/rn/vsphere-vcenter-server-651-release-notes.html