
WiFi is down, what change did you make?

This weekend I got a call from an educational institute: the WiFi was down. They run an SDA fabric with a fabric-enabled Cisco 9800 WLC for their wireless. The incident started right after they pushed new RADIUS servers to the WLC from DNAC. No client station could connect; clients stayed stuck in the Associated state.

After investigating the logs on the WLC, I saw that the new RADIUS servers were not reachable from the controller. That matters because all control-plane traffic, including authentication, is handled by the WLC. On the RADIUS side, I also saw no requests arriving at the new servers.

The network administrator had figured that out as well, so he decided to roll back the change by adding back the old RADIUS server (actually a load balancer VIP in front of a pool of RADIUS servers).
 

Fig: Design/Network settings in the GUI of their WLC:




The network administrator did not want to change the SSID configuration again, so instead they manually selected the old server in the WLAN’s authentication list.



dnac-cts is a reference object in the authentication list. On the CLI it looks like this:
//aaa reference
aaa authorization network dnac-cts-eduroam-022aa653 group dnac-rGrp-eduroam-022aa653
aaa authentication dot1x dnac-cts-eduroam-022aa653 group dnac-rGrp-eduroam-022aa653

//dnac group
aaa group server radius dnac-rGrp-eduroam-022aa653
 server name dnac-radius_10.xx.x.aaa
 ip radius source-interface VlanxXx

However, DNAC had removed the old server definition earlier, because it was no longer bound to anything, and did not add it back (pink block) when it was needed again.




The reference in the server group was still there, but the server definition itself was gone.
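This kind of dangling reference can be spotted offline from a saved copy of the running configuration. The following is a minimal Python sketch, not a Cisco or DNAC tool: the function name `find_dangling_radius_refs` and the `SAMPLE` text are hypothetical, and it assumes the config uses the `aaa group server radius` / `server name` / `radius server` layout shown in this post, with sub-commands indented by one space.

```python
import re


def find_dangling_radius_refs(config: str) -> dict[str, list[str]]:
    """Map each AAA server group to the 'server name' entries that have
    no matching top-level 'radius server <name>' definition."""
    defined: set[str] = set()            # names from 'radius server <name>'
    referenced: dict[str, list[str]] = {}  # group -> referenced server names
    current_group = None

    for raw in config.splitlines():
        line = raw.rstrip()
        if not line.strip():
            continue  # ignore blank lines
        if not line.startswith(" "):
            # A top-level line ends any group block we were inside.
            current_group = None
            m = re.match(r"radius server (\S+)$", line)
            if m:
                defined.add(m.group(1))
                continue
            m = re.match(r"aaa group server radius (\S+)$", line)
            if m:
                current_group = m.group(1)
                referenced.setdefault(current_group, [])
            continue
        if current_group is not None:
            # Indented sub-command inside a server group block.
            m = re.match(r"\s+server name (\S+)$", line)
            if m:
                referenced[current_group].append(m.group(1))

    # Compare only after the whole file is read, so definition order is irrelevant.
    dangling = {}
    for group, servers in referenced.items():
        missing = [s for s in servers if s not in defined]
        if missing:
            dangling[group] = missing
    return dangling


# Hypothetical excerpt modelled on the (redacted) config in this post.
SAMPLE = """\
aaa group server radius dnac-rGrp-eduroam-022aa653
 server name dnac-radius_10.xx.x.aaa
 ip radius source-interface VlanXxX
!
aaa group server radius dnac-rGrp-eduroam-fa8affd3
 server name dnac-radius_10.xx.x.bb
 server name dnac-radius_10.xx.x.cc
!
radius server dnac-radius_10.xx.x.bb
 address ipv4 10.xx.x.bb auth-port 1812 acct-port 1813
!
radius server dnac-radius_10.xx.x.cc
 address ipv4 10.xx.x.cc auth-port 1812 acct-port 1813
"""

print(find_dangling_radius_refs(SAMPLE))
# {'dnac-rGrp-eduroam-022aa653': ['dnac-radius_10.xx.x.aaa']}
```

The sketch only checks that a definition exists by name. It says nothing about whether the server is actually reachable, which still has to be verified on the controller itself.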



So we added back the server definition that the reference object points to:
radius server dnac-radius_10.xx.x.aaa
 address ipv4 10.xx.x.aaa auth-port 1812 acct-port 1813
 timeout 4
 retransmit 3
 pac key 7 052916357543623D4D005D3C0E0A1278


// group not working, because the WLC can’t connect to these RADIUS servers
aaa authorization network dnac-cts-eduroam-fa8affd3 group dnac-rGrp-eduroam-fa8affd3
aaa authentication dot1x dnac-cts-eduroam-fa8affd3 group dnac-rGrp-eduroam-fa8affd3

aaa group server radius dnac-rGrp-eduroam-fa8affd3
 server name dnac-radius_10.xx.x.bb
 server name dnac-radius_10.xx.x.cc
 ip radius source-interface VlanXxX

// separate global radius servers

radius server dnac-radius_10.xx.x.bb
 address ipv4 10.xx.x.bb auth-port 1812 acct-port 1813
 timeout 4
 retransmit 3
 pac key 7 08035C745D162923460E462A2F2D327A
!
radius server dnac-radius_10.xx.x.cc
 address ipv4 10.xx.x.cc auth-port 1812 acct-port 1813
 timeout 4
 retransmit 3
 pac key 7 15301B36502507107C367F0C16010051
!

Conclusion

Automated provisioning is nice, but understanding how things are glued together is key to troubleshooting when things break. Every reference point needs to be checked, because the object a reference points to may not be what it seems, or may no longer exist at all.
