I had just finished reading the CWDP book when I got involved in a project with a hospital in Maryland. The account manager told me upfront that this was an escalated situation and that it had to be managed and resolved with great care.
On my first call with the hospital's Director of Technology (DOT), the issues were described to me as follows:
- Nurses complained of dropped calls as they roamed.
- When nurses came back from lunch, their VoWiFi phones (Spectralink 8440) would not reconnect to the network automatically. They had to pull the battery out and put it back in to reconnect.
- Logging in to a phone's GUI for remote management dropped the phone's connectivity, and the only way to reconnect it was to reseat the battery.
- The hospital was still using WEP because the phones would not work on WPA2-Personal, and WPA2-Enterprise had not been tested.
- Hospital staff had a general loss of confidence in the IT department.
After the call with the DOT, I started working on the statement of work (SOW) and broke the project into two phases. Phase 1 was to study the RF environment, and Phase 2 was to make changes to the wireless network. The total time allotted to improve the network and restore faith was 10 days (2 weeks).
I started by reviewing the previous company's wireless survey and the Cisco controller (5508) configuration. I soon confirmed that the survey had not taken the devices' receive sensitivity into consideration. The previous company followed Cisco best practices and designed for -67 dBm signal strength at the cell edge, but the Spectralink 8440 phones, which made up the majority of the phones used at the hospital, need -58 dBm at a 12 Mbps data rate to decode the signal at 5 GHz, based on page 5 of their spec sheet mentioned below.
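To put the mismatch in numbers, here is a minimal Python sketch that compares the two signal levels quoted above. The function and constant names are my own illustration and are not taken from any Cisco or Spectralink tool.

```python
# Figures quoted in the text: the cell-edge design target and the signal
# level the Spectralink 8440 needs at 12 Mbps on 5 GHz.
DESIGN_CELL_EDGE_DBM = -67
PHONE_REQUIRED_DBM = -58


def coverage_shortfall_db(design_edge_dbm, required_dbm):
    """Return how many dB the cell edge falls short of what the device needs.

    Returns 0 when the design already meets or exceeds the requirement.
    """
    return max(0, required_dbm - design_edge_dbm)


shortfall = coverage_shortfall_db(DESIGN_CELL_EDGE_DBM, PHONE_REQUIRED_DBM)
print(f"Cell edge is {shortfall} dB weaker than the phone needs")
```

With the numbers above, the shortfall works out to 9 dB at the cell edge.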
So, there was clearly a gap between the way the network was designed and what was needed. Further investigation also revealed high channel utilization (above 60%), and 80% of the APs were operating at full power in U-NII-1. I followed up the initial findings with an active survey to discover the gaps in coverage. The hospital had 64 APs in total, model AIR-AP3702i-A-K9. I presented my findings to the DOT, and the CIO also planned to join this meeting (it was only the CIO's second week, talk about pressure).
As I was doing this work, it reminded me of CWDP chapter 2, “Designing for Client Devices and Applications”. What a beautiful heading.
After the survey, I proposed adding 30 more APs to the hospital as part of a revised design. The DOT and team agreed and factored this cost into their next budget. As a temporary fix, with 80% of the APs already operating at maximum power, I lowered the mandatory data rate to 6 Mbps for 5 GHz, which is acceptable for VoWiFi. This improved coverage at 5 GHz and made performance better.
For Phase 2, the first thing I did was upgrade the Spectralink 8440 firmware from 3.7.X to 4.5.5 (the latest at the time), followed by a controller code upgrade to the latest stable 184.108.40.206 firmware (also the latest at the time, taking KRACK, the Key Reinstallation Attack, into consideration).
I created a completely new SSID for VoWiFi on 5 GHz only, with 802.1X/PEAP/MSCHAPv2. The hospital is planning to deploy Zebra phones, which support PEAP as well. PEAP was also easy to deploy because the hospital has its own PKI, and pushing the root certificate to the Spectralink phones was easy since the root cert can simply be added to the phones' site config. The SSID was also built following the Spectralink best practices guide and parts of Cisco’s best practices guide, mentioned in the link below.
Surprisingly, when the phones were connected at 5 GHz, the network drop while accessing the phone’s GUI went away. I tested again at 2.4 GHz and the issue was still there, even with the upgraded firmware, so this appears to be a bug that occurs only at 2.4 GHz (a wireless capture showed the phones were still associated to WiFi but would stop responding to ping or HTTP at 2.4 GHz). Unfortunately, the hospital did not have an active contract with Spectralink, so I could not open a case with them, but the technical staff is aware of this issue on 2.4 GHz.
The reconnect timer on the phones was initially set to 3600 seconds. I reduced it to 10 seconds, which resolved the issue with the phones not reconnecting when the nurses came back from their lunch break. With this change, the phones trigger a connection attempt within 10 seconds and reconnect. I verified this with “debug client” output from the controller, which showed the client association and authentication. The phones were set to ring 4 times, with each ring lasting 3 seconds. With the reconnect timer set to 10 seconds, they would connect to the network without missing a single call.
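The timing argument can be checked with a few lines of Python. This is a simplified sketch with hypothetical names: it assumes the worst case is a phone waiting one full reconnect interval, and it ignores the time association and authentication themselves take.

```python
# Ring settings quoted in the text.
RINGS = 4
SECONDS_PER_RING = 3


def ring_window_s(rings=RINGS, seconds_per_ring=SECONDS_PER_RING):
    """Total time a call keeps ringing before it is missed."""
    return rings * seconds_per_ring


def reconnects_before_ringing_stops(reconnect_timer_s):
    """True if a full reconnect interval fits inside the ring window.

    Simplification: association and authentication time are not modeled.
    """
    return reconnect_timer_s <= ring_window_s()


for timer in (3600, 10):
    print(timer, reconnects_before_ringing_stops(timer))
```

The ring window is 4 x 3 = 12 seconds, so a 10-second timer fits inside it while the original 3600-second timer does not.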
At this point, the phones are working without interruptions. The DOT can manage the phones using the GUI without disconnections. No dropped calls were reported across 3 nursing shift changes, nurses' phones reconnect automatically after their lunch breaks, security has been upgraded from WEP to WPA2-Enterprise, and the phone and Cisco controller firmware are up to date (with the KRACK patch on the phones). I was also able to segment the wireless network better by adding some AP groups, reducing the total number of advertised SSIDs in the hospital from 10 to 7. This is a good start.
Phase 3, in the next budget cycle, will include adding the additional APs as per the design, changing the data rates and other RRM settings, moving from 7 to 3 SSIDs (WiFi data users, VoWiFi users, and IoT), renewing the support contract for the Spectralink phones, and adding Zebra phones to the network and the WiFi equation.
I hope you enjoyed reading it as much as I enjoyed writing it.