Friday, August 10, 2018

Ran into this one recently on EL7 (CentOS and friends).

IPv6 duplicate address detection (DAD) leaves link-local addresses in the dadfailed state after a VM cloning operation.

$ ip -6 addr
1: lo: mtu 65536 state UNKNOWN qlen 1000
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: ens160: mtu 1500 state UP qlen 1000
    inet6 fe80::4729:d09e:8f22:56c3/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever
    inet6 fe80::1435:486f:84ad:93ba/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever
    inet6 fe80::25b4:c63e:38cb:4bcc/64 scope link tentative dadfailed
       valid_lft forever preferred_lft forever

Which led me to take a look at the interface config file:

$ grep IPV6 /etc/sysconfig/network-scripts/ifcfg-Wired_connection_1

IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes

The default gen_mode is stable-privacy, which led me to take a look at the RFC that defines this address generation scheme (RFC 7217), in particular the section on the algorithm:

1. Compute a random (but stable) identifier with the expression:

   RID = F(Prefix, Net_Iface, Network_ID, DAD_Counter, secret_key)

   Where:

   secret_key:
      A secret key that is not known by the attacker.  The secret
      key SHOULD be of at least 128 bits.  It MUST be initialized to
      a pseudo-random number (see [RFC4086] for randomness
      requirements for security) when the operating system is
      installed or when the IPv6 protocol stack is "bootstrapped"
      for the first time.  An implementation MAY provide the means
      for the system administrator to display and change the secret
      key.
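
To make the role of secret_key concrete, here is a toy sketch of the expression above. It is not NetworkManager's actual derivation: SHA-256 stands in for F(), the rid helper and the key strings are made up for illustration, and Network_ID is left empty.

```shell
# Illustrative only: SHA-256 stands in for F(); the real implementation
# differs in detail. The rid helper and key values are hypothetical.
rid() {
    # $1=Prefix $2=Net_Iface $3=Network_ID $4=DAD_Counter $5=secret_key
    # Keep the low-order 64 bits of the digest (its last 16 hex digits).
    printf '%s|%s|%s|%s|%s' "$1" "$2" "$3" "$4" "$5" | sha256sum | cut -c49-64
}

# Two clones sharing one secret_key compute the same identifier...
rid fe80:: ens160 "" 0 cloned-key
rid fe80:: ens160 "" 0 cloned-key
# ...bumping DAD_Counter after a collision yields a new candidate...
rid fe80:: ens160 "" 1 cloned-key
# ...and a different secret_key changes the result entirely.
rid fe80:: ens160 "" 0 fresh-key
```

If every clone walks the same DAD_Counter sequence with the same key, each retry can collide as well, which would be consistent with the several dadfailed addresses in the ip output above.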

The systems in question are using NetworkManager, which led me here:

$ sudo ls -l /var/lib/NetworkManager/

NetworkManager-intern.conf
NetworkManager.state
no-auto-default.state
secret_key
timestamps

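There can be a lot of cruft (leftover state from cloning) in this directory, but secret_key is what we are looking for. A clone inherits the template's secret_key along with everything else, so clones with the same interface name and connection end up deriving the same "stable" link-local addresses, which is exactly what DAD rejects.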
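
One way to confirm that clones share a key is to compare a checksum of the file on each VM. A minimal sketch; key_fingerprint is a hypothetical helper name, and reading the real file needs root:

```shell
# Hypothetical helper: print a SHA-256 fingerprint of a key file without
# revealing its contents. Defaults to NetworkManager's secret_key path.
key_fingerprint() {
    sha256sum "${1:-/var/lib/NetworkManager/secret_key}" | cut -d' ' -f1
}

# Usage (as root, on the template and on each clone):
#   key_fingerprint
# Identical output on two machines means they share the same secret_key.
```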

The easiest and most correct fix is to simply nuke the Ethernet connection in NetworkManager and recreate it.
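
With nmcli that could look roughly like the sketch below. The connection name and device are inferred from the output earlier in this post, and the run wrapper and DRY_RUN switch are my own additions so that nothing is deleted by accident; by default the script only prints what it would do.

```shell
# Sketch only. CON_NAME and IFACE are assumptions taken from the ifcfg file
# name and the ip output above; adjust both for your system.
CON_NAME="Wired connection 1"
IFACE="ens160"
DRY_RUN="${DRY_RUN:-1}"   # prints the commands by default; DRY_RUN=0 executes

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recreate_connection() {
    # Drop the stale profile inherited from the template...
    run nmcli connection delete "$CON_NAME"
    # ...create a fresh Ethernet profile bound to the device...
    run nmcli connection add type ethernet ifname "$IFACE" con-name "$IFACE"
    # ...and bring it up.
    run nmcli connection up "$IFACE"
}

recreate_connection
```

Do this from the console rather than over SSH on that interface, since deleting the active connection will drop the link.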

In this case, the original sysconfig ifcfg file for the interface didn't have HWADDR set, so NetworkManager didn't automatically generate a new connection when a new vNIC showed up after the clone.

And so we have our root cause...

The lesson: Don't remove HWADDR from your ifcfg files.
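
A quick way to audit a template before cloning it is to list the ifcfg files that have no hardware address line (the key is spelled HWADDR in ifcfg files). A minimal sketch; missing_hwaddr is a hypothetical helper, and it only checks that the key is present, not that the value matches the NIC:

```shell
# Hypothetical helper: print every ifcfg-* file (except loopback) in the
# given directory that has no HWADDR= line.
missing_hwaddr() {
    dir="${1:-/etc/sysconfig/network-scripts}"
    for f in "$dir"/ifcfg-*; do
        [ -f "$f" ] || continue
        case "$f" in */ifcfg-lo) continue ;; esac
        grep -q '^HWADDR=' "$f" || echo "$f"
    done
}

missing_hwaddr
```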