I ran into a problem that I would like to share with everyone in hopes that this will save you some time if you ever run into it.
I have recently been working with virtualized SDNv2 networks. I created a virtual network to test some security settings in an Active Directory forest. In my configuration, I have two Hyper-V servers (HV1 and HV2) configured as standalone systems (no clusters). I used SCVMM 2019 to configure and manage the virtual switches bound to the NICs on the Hyper-V hosts. On HV1, I added two VMs (DC1 and DC2) and created a Windows Server 2019 forest (CONTOSO.LOCAL).
I then created a test VM (W10-01) on HV2. I connected the W10-01 VM to the same virtual network as DC1 and DC2. I then attempted to join W10-01 to the CONTOSO.LOCAL domain. The domain join failed with the following error:
I then tried to ping the domain from the W10-01 VM by using the FQDN (CONTOSO.LOCAL), and it failed. Pinging the IP addresses of the DCs, or any other IP address on the subnet, succeeded. Pings by IP address from other systems on the virtual network on HV2 to the W10-01 VM also succeeded.
At this point I started thinking a firewall was blocking the traffic. I turned off the firewall on the DC and on the W10-01 system, but this did not resolve the problem. The symptoms would lead you to believe it is a DNS-related problem, because all DNS queries from W10-01 on HV2 fail.
The DNS servers were not the problem, though; it turned out to have nothing to do with DNS. After working with some very smart networking engineers, we determined that the packets sent from the W10-01 VM on HV2 were being dropped by the VMs on HV1. When a VM on the virtualized network on HV2 sends a packet to a VM on HV1, the packet traverses the NIC in the Hyper-V host. The receiving VM then inspects the packet, determines that its checksum is invalid, and drops it.
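To make "the checksum is invalid" a little more concrete, here is a minimal Python sketch of the standard UDP checksum over IPv4 (the one's-complement sum over a pseudo-header plus the UDP segment). This is only an illustration of the check a receiver performs, not the code the Windows TCP/IP stack runs. The function names, the ports, and the payload are hypothetical; only the two IP addresses come from my lab.

```python
import socket
import struct

UDP_PROTOCOL = 17  # IANA protocol number for UDP


def ones_complement_sum(data: bytes) -> int:
    """Add 16-bit big-endian words and fold the carries back in."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with one zero byte
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src_ip: str, dst_ip: str, udp_length: int) -> bytes:
    """IPv4 pseudo-header that is included in the UDP checksum."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, UDP_PROTOCOL, udp_length))


def build_udp_segment(src_ip: str, dst_ip: str, src_port: int,
                      dst_port: int, payload: bytes) -> bytes:
    """Build a UDP header plus payload with a correct checksum."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    total = ones_complement_sum(
        pseudo_header(src_ip, dst_ip, length) + header + payload)
    checksum = (~total & 0xFFFF) or 0xFFFF  # 0 means "no checksum" in IPv4
    return struct.pack("!HHHH", src_port, dst_port, length, checksum) + payload


def udp_checksum_is_valid(src_ip: str, dst_ip: str, segment: bytes) -> bool:
    """Receiver-side check: the sum over everything must fold to 0xFFFF."""
    (checksum,) = struct.unpack("!H", segment[6:8])
    if checksum == 0:  # sender chose not to compute a checksum
        return True
    total = ones_complement_sum(
        pseudo_header(src_ip, dst_ip, len(segment)) + segment)
    return total == 0xFFFF


# Hypothetical DNS-style datagram from W10-01 to DC1 (ports and payload made up).
good = build_udp_segment("192.168.4.21", "192.168.4.4", 50000, 53, b"example query")
# Simulate a segment whose checksum no longer matches its contents.
bad = good[:-1] + bytes([good[-1] ^ 0xFF])

print(udp_checksum_is_valid("192.168.4.21", "192.168.4.4", good))
print(udp_checksum_is_valid("192.168.4.21", "192.168.4.4", bad))
```

A segment that fails this kind of check is discarded by the receiver before any application sees it, which is why the DNS service on the DC never got a chance to answer.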
The NIC can handle the checksum if it exposes the proper checksum offloading option (such as VXLAN Encapsulated Task Offload) under the Advanced properties of the adapter. My Broadcom NetXtreme Gigabit NICs did not. Below is a packet captured on the DC1 VM on HV1 while an nslookup was being run on the W10-01 VM on HV2. Notice the reason for the dropped packet below:
[0]0000.0000::2020-10-01 15:12:43.723000200 [Microsoft-Windows-TCPIP]TCPIP: Transport (Protocol UDP , AddressFamily = IPV4 ) dropped 1 packet(s) with Local = 192.168.4.4, Remote = 192.168.4.21. Reason = Checksum is invalid.
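If you end up with a large trace and want to find entries like the one above quickly, a small script can help. The following is a rough Python sketch that pulls the fields out of a drop entry. The regular expression is my own guess based on this single line, so it may need adjusting for other events, and the names DROP_RE and parse_drop are hypothetical.

```python
import re
from typing import Optional

# Pattern modeled on the single trace line shown in this post.
DROP_RE = re.compile(
    r"Protocol\s+(?P<protocol>\w+)\s*,\s*"
    r"AddressFamily\s*=\s*(?P<family>\w+)\s*\)\s*"
    r"dropped\s+(?P<count>\d+)\s+packet\(s\)\s+with\s+"
    r"Local\s*=\s*(?P<local>\d+(?:\.\d+){3}),\s*"
    r"Remote\s*=\s*(?P<remote>\d+(?:\.\d+){3})\.\s*"
    r"Reason\s*=\s*(?P<reason>.+?)\.?\s*$"
)


def parse_drop(line: str) -> Optional[dict]:
    """Return the fields of a TCPIP drop entry, or None if the line is not one."""
    match = DROP_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["count"] = int(fields["count"])
    return fields


# The entry from my capture (leading CPU/thread prefix omitted).
sample = (
    "2020-10-01 15:12:43.723000200 [Microsoft-Windows-TCPIP]TCPIP: "
    "Transport (Protocol UDP , AddressFamily = IPV4 ) dropped 1 packet(s) "
    "with Local = 192.168.4.4, Remote = 192.168.4.21. "
    "Reason = Checksum is invalid."
)

drop = parse_drop(sample)
if drop is not None and "checksum" in drop["reason"].lower():
    print(drop["protocol"], drop["remote"], "->", drop["local"], ":", drop["reason"])
```

Running every line of a formatted trace through parse_drop and keeping only the entries whose reason mentions the checksum shows at a glance which remote addresses are affected.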
You have two options to resolve this issue:
OPT#1 – The first option is to place all the VMs on a particular virtual network on the same Hyper-V host. With this option, the packets never go through the NICs in the Hyper-V host, so you avoid the checksum offloading issue entirely.
OPT#2 – The second option is to replace the NICs in the Hyper-V hosts with ones that properly support checksum offloading (VXLAN Encapsulated Task Offload).
My NICs did not support the checksum offloading option, so I moved the W10-01 VM to the Hyper-V host (HV1) that hosts the DC VMs. I can now join the system to the Active Directory domain, and all is well.
And that is it. The steps above resolved the domain join issue for me. I hope this post saves you time if you ever encounter these errors.