Change the vSAN network subnet for hosts in a vSAN-enabled cluster


While working in my lab environment, I had the idea of changing the entire vSAN subnet for the hosts. I started on it right away, planning it on paper first before implementing it, and it ended in success.

 

Here are my lab environment details:

Number of hosts in the cluster: 5

Current subnet used for vSAN traffic: 172.20.20.0/24; the VMkernel adapter is vmk5.

Target subnet for vSAN traffic: 192.168.10.0/24.

 

Here is the procedure I followed:

  • I checked the cluster health thoroughly for any red alerts such as resync activity, inaccessible objects, network partitions, or predictive drive failures. Everything was green, so I went ahead.
  • Since the hosts already use the 172.20.20.0/24 subnet for vSAN traffic (vmk5), I had to create a new VMkernel adapter, vmk6, for vSAN traffic on the 192.168.10.0/24 subnet.
  • After adding the new adapter (vmk6) on the first host, I saw network partition errors under vSAN health.
  • I collected the multicast IPs currently assigned to vmk5. These can be checked using the command:

esxcli vsan network list

  • Note the IPs listed as Agent Group Multicast Address and Master Group Multicast Address in the output of the above command.
  • Use the following command to assign the same multicast IPs to the newly created vmk6 VMkernel adapter:

esxcli vsan network ipv4 set -i vmk6 -d <Agent Group Multicast Address> -u <Master Group Multicast Address>
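
The two steps above (reading the multicast addresses from vmk5 and applying them to vmk6) can be scripted. This is a minimal sketch, not a tested production script: it assumes the output of esxcli vsan network list labels each interface with "VmkNic Name:" and uses the two address labels named above. The helper names vsan_mcast_addrs and build_set_cmd are hypothetical, and the sample output and its addresses are illustrative only. It prints the set command for review instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: reads `esxcli vsan network list` output on stdin and
# prints "<agent address> <master address>" for the interface named in $1.
vsan_mcast_addrs() {
    awk -v nic="$1" -F': ' '
        /VmkNic Name:/                    { current = $2 }
        /Agent Group Multicast Address:/  { if (current == nic) agent = $2 }
        /Master Group Multicast Address:/ { if (current == nic) master = $2 }
        END {
            if (agent == "" || master == "") exit 1   # interface not found
            print agent, master
        }
    '
}

# Hypothetical helper: prints the set command from the post for review.
# $1 = new interface, $2 = agent address, $3 = master address
build_set_cmd() {
    echo "esxcli vsan network ipv4 set -i $1 -d $2 -u $3"
}

# Illustrative sample output (trimmed). On a host, pipe the real command:
#   esxcli vsan network list | vsan_mcast_addrs vmk5
sample_output='Interface
   VmkNic Name: vmk5
   Agent Group Multicast Address: 224.2.3.4
   Master Group Multicast Address: 224.1.2.3
   Traffic Type: vsan'

addrs=$(printf '%s\n' "$sample_output" | vsan_mcast_addrs vmk5)
# Word splitting of $addrs is intentional: it holds two space-separated fields.
build_set_cmd vmk6 $addrs
```

Printing the command first keeps the change reviewable; once it looks right, it can be pasted into the host's shell.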

  • Once all the hosts in the vSAN cluster had two adapters for vSAN traffic with matching multicast IPs, I no longer saw any errors in the vSAN cluster and was able to ping the hosts over both VMkernel adapters, vmk5 and vmk6.
  • I checked with vmkping as shown below, where 172.20.20.10 and 192.168.10.10 are the vSAN IPs assigned to the vSAN-enabled VMkernel adapters vmk5 and vmk6 on host test-vsn-esx2.

[root@test-vsn-esx1:~] vmkping -I vmk5 172.20.20.10

PING 172.20.20.10 (172.20.20.10): 56 data bytes

64 bytes from 172.20.20.10: icmp_seq=0 ttl=64 time=0.185 ms

64 bytes from 172.20.20.10: icmp_seq=1 ttl=64 time=0.204 ms

 

[root@test-vsn-esx1:~] vmkping -I vmk6 192.168.10.10

PING 192.168.10.10 (192.168.10.10): 56 data bytes

64 bytes from 192.168.10.10: icmp_seq=0 ttl=64 time=0.185 ms

64 bytes from 192.168.10.10: icmp_seq=1 ttl=64 time=0.204 ms
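
With five hosts, repeating the ping by hand for every peer gets tedious. Below is a rough sketch of a loop that runs the same vmkping check against a list of peer IPs. The function name check_vsan_peers and the PING_CMD override are hypothetical additions for illustration; the only addresses taken from this post are those of test-vsn-esx2, so the remaining peers would need to be filled in.

```shell
#!/bin/sh
# Hypothetical helper: pings each peer IP through one VMkernel interface.
# $1 = interface, remaining arguments = peer vSAN IPs.
# PING_CMD defaults to vmkping; set PING_CMD=true for a dry run off-host.
check_vsan_peers() {
    nic="$1"
    shift
    failed=0
    for ip in "$@"; do
        if ${PING_CMD:-vmkping} -I "$nic" -c 2 "$ip" >/dev/null 2>&1; then
            echo "OK   $nic -> $ip"
        else
            echo "FAIL $nic -> $ip"
            failed=1
        fi
    done
    # Non-zero return if any peer did not answer.
    return "$failed"
}

# Usage on a host (addresses of test-vsn-esx2 from this post; add the other peers):
#   check_vsan_peers vmk5 172.20.20.10
#   check_vsan_peers vmk6 192.168.10.10
```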

  • The ping now worked between all the hosts and no alerts were seen in vSAN health, so everything was set to proceed further.
  • Remove the vmk5 VMkernel adapter from each host in the cluster. This can be done with esxcli or from vCenter > host > Networking > VMkernel adapters.
  • I would suggest removing vmk5 on one host first and monitoring for any issues. Check the Sub-Cluster Member Count using the command esxcli vsan cluster get
  • If the member count matches the number of hosts, proceed with removing vmk5 from the remaining hosts in the cluster. After every removal, I made sure to check the cluster for any resync, inaccessible objects, or red errors.
  • Now all the hosts in the cluster use vmk6 for vSAN traffic. It can be left as it is, or the adapter can be recreated so that the vSAN adapter is named vmk5 again. I wanted the hosts to use vmk5 for vSAN traffic to maintain consistency.
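
The member count check after each vmk5 removal can be automated along these lines. This is a sketch under the assumption that esxcli vsan cluster get prints a line of the form "Sub-Cluster Member Count: N"; the helper names member_count and expect_members are hypothetical, and the sample text is a trimmed illustration rather than real output.

```shell
#!/bin/sh
# Hypothetical helper: reads `esxcli vsan cluster get` output on stdin and
# prints the Sub-Cluster Member Count value.
member_count() {
    awk -F': ' '
        /Sub-Cluster Member Count:/ { print $2; found = 1 }
        END { exit !found }
    '
}

# Hypothetical helper: compares the count on stdin with the expected value in $1.
expect_members() {
    expected="$1"
    actual=$(member_count) || { echo "member count not found"; return 2; }
    if [ "$actual" -eq "$expected" ]; then
        echo "member count OK: $actual"
    else
        echo "member count is $actual, expected $expected"
        return 1
    fi
}

# Illustrative sample (trimmed). On a host use:
#   esxcli vsan cluster get | expect_members 5
sample_cluster='Cluster Information
   Enabled: true
   Sub-Cluster Member Count: 5'

printf '%s\n' "$sample_cluster" | expect_members 5
```

A non-zero return from expect_members is a signal to stop and investigate before touching the next host.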

 

Recreate the network adapter:

  1. Connect to the vCenter where the cluster and hosts are located.
  2. Select the host and put it into maintenance mode (MM) with the "Ensure accessibility" option.

Note: Before deleting the adapter, collect its IP address, subnet mask, MTU, and multicast IPs, and note the Sub-Cluster Member Count shown by the command esxcli vsan cluster get

  3. Navigate to host > Manage > Networking > VMkernel adapters, select the vSAN-enabled VMkernel adapter, and delete it.
  4. Once the adapter was removed, the esxcli vsan cluster get command showed a member count one lower than before. In my case it was 5 before removing the adapter and 4 afterwards, because the host could no longer communicate with the other hosts in the vSAN cluster.
  5. Create the VMkernel adapter with the same configuration collected in the note above. Once done, the Sub-Cluster Member Count went back to 5, vmk5 was shown as the adapter for vSAN traffic, and there were no errors in vSAN health.
  6. I repeated steps 1-5 on the remaining hosts so that each one uses vmk5 as the VMkernel adapter for vSAN traffic.
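
I recreated the adapter through vCenter, but the same step can presumably be done with esxcli. The sketch below only prints the commands for review. It assumes a standard-switch port group (a distributed switch needs different options), and the port group name vSAN-PG, the MTU of 1500, and the helper name recreate_plan are hypothetical placeholders; substitute the values collected in the note above.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the commands to recreate a vSAN
# VMkernel adapter on a standard-switch port group.
# $1 = interface, $2 = port group, $3 = IPv4 address, $4 = netmask, $5 = MTU
recreate_plan() {
    echo "esxcli network ip interface add -i $1 -p $2"
    echo "esxcli network ip interface set -i $1 -m $5"
    echo "esxcli network ip interface ipv4 set -i $1 -I $3 -N $4 -t static"
    echo "esxcli vsan network ipv4 add -i $1"
}

# Example values: the IP is the test-vsn-esx2 address from this post;
# the port group name and MTU are placeholders.
recreate_plan vmk5 vSAN-PG 192.168.10.10 255.255.255.0 1500
```

If the multicast IPs differ from what the new adapter picks up, they can be reapplied with the esxcli vsan network ipv4 set command shown earlier in this post.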

 

Thanks for reading, have a nice day.