VSAN troubleshooting commands

Here are some important commands for troubleshooting vSAN. Note that this is a quick reference for anyone looking up commands; detailed explanations will follow in separate posts.

Get the total number of hosts that are part of the vSAN cluster:

  • esxcli vsan cluster get
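If only the host count is needed, the output can be filtered. This is a minimal sketch, assuming the output contains a "Sub-Cluster Member Count:" line; member_count is a hypothetical helper name, and the sample text is illustrative rather than captured from a real host.

```shell
# member_count: hypothetical helper. Reads `esxcli vsan cluster get` output on
# stdin and prints the value of the "Sub-Cluster Member Count" field
# (the field name is an assumption about the output layout).
member_count() {
  awk -F': *' '/Sub-Cluster Member Count/ { print $2; exit }'
}

# Illustrative sample only, not taken from a real host.
sample='Cluster Information
   Enabled: true
   Local Node State: MASTER
   Sub-Cluster Member Count: 4'

printf '%s\n' "$sample" | member_count   # prints 4
```

On a host this would be used as: esxcli vsan cluster get | member_count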

Manually join the host to the vSAN cluster:

  • esxcli vsan cluster join -u <cluster uuid>

Remove the host from the vSAN cluster:

  • esxcli vsan cluster leave

Check and change the MaxNumResyncCopyInFlight value on the host; raising it can speed up resync:

  • vsish -e get /vmkModules/vsan/dom/MaxNumResyncCopyInFlight
  • vsish -e set /vmkModules/vsan/dom/MaxNumResyncCopyInFlight 5

Disable resync throttle:

  • esxcfg-advcfg -g /VSAN/DomCompResyncThrottle
  • esxcfg-advcfg -s 0 /VSAN/DomCompResyncThrottle

 

Check congestion on any host in the vSAN cluster:

  • for ssd in $(localcli vsan storage list |grep "Group UUID"|awk '{print $5}'|sort -u);do echo $ssd;vsish -e get /vmkModules/lsom/disks/$ssd/info|grep Congestion;done
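On a host with many disk groups, most congestion values are zero and the interesting ones get lost. The filter below is a rough sketch that keeps only non-zero entries. It assumes vsish prints each counter as name:value (for example logCongestion:45); nonzero_congestion is a hypothetical helper and the sample input is made up.

```shell
# nonzero_congestion: hypothetical helper. Reads lines on stdin and prints only
# the "...Congestion:<value>" entries whose value is greater than zero.
# The name:value layout is an assumption about the vsish output format.
nonzero_congestion() {
  awk -F':' '/Congestion/ {
    gsub(/[ \t]/, "", $0)            # strip indentation; fields are re-split
    if ($2 + 0 > 0) print $1 ": " $2
  }'
}

# Made-up sample input for illustration.
sample='   memCongestion:0
   ssdCongestion:0
   logCongestion:45'

printf '%s\n' "$sample" | nonzero_congestion
```

In the loop above, the trailing "grep Congestion" could be replaced with a pipe into nonzero_congestion.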

 

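Check resync status. The loop below writes the pending resync per object to ./resyncStats.txt and appends a timestamped total to ./totalHistory.txt every 10 seconds: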

while true; do
  # start each pass with an empty stats file
  : > ./resyncStats.txt
  cmmds-tool find -t DOM_OBJECT -f json | grep uuid | awk -F \" '{print $4}' | while read i; do
    # pending resync for this object, in GiB
    pendingResync=$(cmmds-tool find -t DOM_OBJECT -f json -u $i | grep -o "\"bytesToSync\": [0-9]*," | awk -F " |," '{sum+=$2} END{print sum / 1024 / 1024 / 1024;}')
    if [ ${#pendingResync} -ne 1 ]; then echo "$i: $pendingResync GiB"; fi
  done | tee -a ./resyncStats.txt
  total=$(cat ./resyncStats.txt | awk '{sum+=$2} END{print sum}')
  echo "Total: $total GiB" | tee -a ./resyncStats.txt
  total=$(cat ./resyncStats.txt | grep Total)
  totalObj=$(cat ./resyncStats.txt | grep -vE " 0 GiB|Total" | wc -l)
  echo "`date +%Y-%m-%dT%H:%M:%SZ` $total ($totalObj objects)" >> ./totalHistory.txt
  sleep 10
done
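The core of that loop is the step that adds up the bytesToSync values and converts the sum to GiB. Here it is on its own as a small sketch that can be tried off-host; sum_gib is a hypothetical helper name and the JSON fragment is made up for illustration.

```shell
# sum_gib: hypothetical helper mirroring the grep/awk step in the loop above.
# Reads JSON text on stdin, adds up every "bytesToSync" value and prints the
# total in GiB (bytes / 1024 / 1024 / 1024).
sum_gib() {
  grep -o '"bytesToSync": [0-9]*' |
    awk '{ sum += $2 } END { print sum / 1024 / 1024 / 1024 }'
}

# Made-up fragment: 1 GiB + 3 GiB pending.
sample='{"bytesToSync": 1073741824, "x": 1} {"bytesToSync": 3221225472,}'

printf '%s\n' "$sample" | sum_gib   # prints 4
```

On a host the input would come from: cmmds-tool find -t DOM_OBJECT -f json -u <object uuid>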

Delete a specific object:

  • /usr/lib/vmware/osfs/bin/objtool delete -u <object uuid> -f -v 10

 

Check and change the values of goto11 and TcpipHeapMax on the host:

  • esxcli system settings advanced list -o /VSAN/goto11
  • esxcli system settings advanced list -o /Net/TcpipHeapMax
  • esxcli system settings advanced set -o /Net/TcpipHeapMax -i 1536
  • esxcli system settings advanced set -o /VSAN/goto11 -i 1

 

Get the LSOM log congestion values:

  • esxcfg-advcfg -g /LSOM/lsomLogCongestionHighLimitGB
  • esxcfg-advcfg -g /LSOM/lsomLogCongestionLowLimitGB

Use the commands below to change the lsomLogCongestion limits on the hosts:

  • esxcfg-advcfg -s 24 /LSOM/lsomLogCongestionLowLimitGB
  • esxcfg-advcfg -s 32 /LSOM/lsomLogCongestionHighLimitGB
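Before pushing new limits to several hosts, it may be worth a quick sanity check on the numbers. The sketch below assumes the low limit is meant to stay below the high limit, as in the 24/32 pair above; limits_ok is a hypothetical helper and not part of esxcfg-advcfg.

```shell
# limits_ok: hypothetical helper. $1 = low limit (GB), $2 = high limit (GB).
# Succeeds only if both are plain non-negative integers and low < high.
# The low < high rule is an assumption based on the values used above.
limits_ok() {
  for v in "$1" "$2"; do
    case "$v" in
      ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
    esac
  done
  [ "$1" -lt "$2" ]
}

limits_ok 24 32 && echo "ok: 24 < 32"
limits_ok 32 24 || echo "rejected: 32 is not below 24"
```

A wrapper could then run the two esxcfg-advcfg -s commands only when limits_ok succeeds.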

 

2 thoughts on “VSAN troubleshooting commands”

  1. Dude can you remove the following from your blog, this cannot be used in any production environments and will cause hosts to PSOD.
    Stop and start the resync in vsan cluster:

    vsish -e set /vmkModules/vsan/dom/PauseAllResync 1 => To pause the resynchronization.
    vsish -e set /vmkModules/vsan/dom/PauseAllResync 0 => To start the resynchronization.
