Best Practices to Deploy High- Availability in Wireless LAN Architectures

Sujit Ghosh Senior Manager, Technical Marketing BRKEWN-3014 “High availability is a characteristic of a system, which aims to ensure an agreed level of operational performance, usually uptime for a higher than normal period.” - Wikipedia

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 3 Session Objective

What is the acceptable network downtime? Minutes<< 11 secondminute are ok admin

The goal of this session is to show you how to design and deploy a Highly Available wireless network to reduce the network downtime

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 4 Agenda

• High Availability (HA), the theory of operations:

• What to do at the Radio Frequency layer?

• Controller HA for different Deployment Modes:

• Centralised, FlexConnect, Mobility Express, Prime and MSE high availability

• HA Design and Deployment Practices

• Key takeaways Cisco Controller Portfolio Large Enterprise/Branch Mid-size Enterprise/Branch Cisco 8540

Cisco vWLC 6000 APs Small Network, Small Branch 64,000 clients 3000 APs 40 Gbps 32000 Clients 500 Mbps

Cisco 5520 1500 APs 20000 Clients Cisco 3504 20 Gbps Mobility Express 150 APs 50 APs/1000 Clients – AP 18xx 3000 Clients 100 AP/2000 Clients – AP 2800/3800 4 Gbps

Cisco IOS 5760 Cisco 8510 Cisco WISM2 Cisco Flex 7500 Cisco 5508 1000 APs 12,000 clients 6000 APs 60 Gbps Cisco 2500 64,000 clients 6000 APs 10 Gbps 1000 APs Cisco vWLC 500 APs 64,000 clients Autonomous 15,000 clients 7000 clients 1 Gbps APs 20 Gbps 75 APs 8 Gbps 1000 clients 1 Gbps 200APs 3000 clients 500 Mbps

Upto 150 APs Grow as Your150- 1500Business APs Grows © 2017 Cisco and/or its affiliates.1500 All rights-6000 reserved. APs Cisco Public Cisco Aironet 802.11ac Wave 2 Portfolio Industry’s most comprehensive and innovative AP portfolio

Enterprise Class Mission Critical Best in Class

DNA Ready | RF Excellence | CMX | Centralised, FlexConnect or Mobility Express Dual 5 GHz | Flexible Radio | HDX Future Proof

3800 2800 1850 • 4x4:3SS 160 MHz 1815 1830 • 4x4:3SS 160 MHz • 5 Gbps Performance Indoor / High-powered Indoor • 4x4:3SS 80Mhz • 5 Gbps Performance • 2.4 and 5GHz or Wall Plate / Teleworker • 3x3:2SS 80MHz • 1.7 Gbps Performance • 2.4 and 5GHz or Dual 5GHz • 2x2:2SS 80 MHz • 867 Mbps Performance • Internal or External Dual 5GHz • 2 GE Ports Uplink or • 867 Mbps Performance • Tx Beam Forming • 2 GE Ports Uplink 1 GE + 1 mGig (5G) • Tx Beam Forming • 1 GE Port Uplink • Tx Beam Forming • CleanAir and ClientLink • CleanAir and ClientLink 1 • Integrated BLE Gateway • USB 2.0 • 2 GE Ports Uplink • Internal or External • StadiumVision • Max Transmit Power (dBm) • USB 2.0 Antenna • Internal or External Antenna 2 per local regulations • Smart Antenna Connector • Smart Antenna Connector • 3 GE Local Ports, including • USB 2.0 • USB 2.0 1 PoE out3 • Investment Proof Modularity • Local ports 802.1x ready3 • Centralized, FlexConnect and Mobility Express• USB 2.04 1Future availability 2 Available for High-powered only 3 Available for wall-plate and teleworker only 4 Available for teleworker only © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public Radio Frequency (RF) Considerations 802.11ac is Here

80 MHz channel

But it Comes With a Price

High Signal at the client for 256QAM

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 9 Radio Frequency (RF) High Availability

§ RF HA is the ability to build redundancy at the physical layer

§ What does it translates to in practice? • Creating a pervasive, stable, predictable RF environment (Proper Design, Site Survey, Radio Planning) • Dealing with coverage holes if an AP goes down (RF Management) • Identifying, Classifying, Mitigating an interference source (Spectrum Intelligence Solution) • Improving client (all clients!) received signal ()

§ BTW…Cisco has differentiating features/functionalities to address all these things

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 10 Radio Frequency (RF) High Availability

§ Site Survey, site survey….and site survey • Use “Active” survey • Coverage vs. Capacity • Consider Client type (ex. Smartphone vs. Laptop)

MyMy antenna power is gain half is of 4 my timesbrother smaller MacBook

andI try then to connect move to to another 5GHz andBSSID stay ifconnected it is REALLY until the signal betteris REALLY bad

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 11 Radio Frequency (RF) High Availability

§ Site Survey, site survey….and site survey • Use “Active” survey • Coverage vs. Capacity • Consider Client type (ex. Smartphone vs. Laptop)

§ AP positioning and antenna choice is Key • Use common sense • Light source analogy • Internal antennas are designed to be mounted on ceiling • External antennas: use same antennas on all connectors

§ Tools • What you use is less important than how you use it • Use the same tool to compare results

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 12 RF High Availability: Cisco RRM

§ What are Radio Resource Manager (RRM)’s objectives? • Provide a system wide RF view of the network at the Controller (only Cisco!!) • Dynamically balance the network and mitigate changes • Manage Spectrum Efficiency so as to provide the optimal throughput under changing conditions

§ What’s RRM • DCA—Dynamic Channel Assignment • TPC—Transmit Power Control • CHDM—Coverage Hole Detection and Mitigation

§ RRM best practices • RRM settings to auto for most deployments (High Density is a special case) • Design for most radios set at mid power level (lever 3 for example) • Use RF Profiles to customise RRM settings per Areas/Groups of APs

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 13 RF High Availability: Cisco RRM RRM DCA in action

§ RRM will determine the optimal 1 6 1 channel plan based on AP layout

§ A rogue AP is detected on channel 11

§ RRM will assess the RF and take a decision in less than 10min 11 1 11 § Channel change is triggered to improve the RF

11 § Note how the 3 non overlapping channels are still maintained!

§ RRM has a RF system view. AP view would be limited and could result in sub-optimal RF plan

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 14 RF High Availability: Cisco RRM – RF Profiles RF profiles = RF Design flexibility

8.0

Look out for information at the bottom of the key pages

High Density

TPC, DCA, Coverage Hole Data Rates

For more info: http://www.cisco.com/en/US/tech/tk722/tk809/technologies_tech_note09186a008072c759.shtml BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 15 RF High Availability: Cisco CleanAir

§ Assess impact of interferences and proactively change channel when needed § Hardware based Spectrum intelligence solution integrated in Cisco Prime § Only CleanAir ASIC based solution can reliably detect interference sources:

5MHz scanning resolution 156 kHz scanning resolution

• CleanAir • Hardware based Solution

• 32 times WiFi chip’s visibility • Accurate classification • Multiple device recognition

§ Best Practice: always turn it on supported APs (all 802.11ac APs are CleanAir capable)

For more info: http://www.cisco.com/en/US/netsol/ns1070 BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 16 RF High Availability: Cisco ClientLink

§ Cisco ClientLink is Beamforming at the chip level: § Implemented in hardware, no software component, no performance degradation § ClientLink creates a better quality RF for all clients (a/g/n/c) § Do I need a 4x4 AP? Yes, and even more critical with 802.11ac

Downlink Data Rate Comparison Modulation MCS Data Rate (Mbps) Cisco 3700 3x3 Cisco 3700 4x4 64QAM m7 975 46% 0% 256QAM m8 1170 49% 15% 256QAM m9 1300 5% 85%

§ Best practice: on by default

For more info: http://www.cisco.com/en/US/prod/collateral/wireless/ps5678/ps11983/at_a_glance_c45-691984.pdf

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 17 Innovations Only Cisco Delivers Radio Frequency Excellence for High-Density Environments Available on new 2800/3800 APs

Cisco CleanAir® Remediates Device Impacting Interference Zero Impact AVC Hardware Based Application Visibility and Flexible Radio Assignment Control without Impact to Performance. Adjust Radio Bands to Better Serve the Environment. Multi-Gigabit Uplinks Free Up Wireless With Faster Wired Network Offload 802.11ac Wave 2 MU-MIMO Standards based Wave 2 MU-MIMO

Smart Antenna Connector Optimized Roaming 2nd Antenna Connection Intelligently Connects the Proper Dual 5 GHz & Band specific Antennas Access Point as People Move

Turbo Performance Cisco ClientLink Scales to Support More Devices Improves Performance of legacy and 802.11ac Running High Bandwidth Apps. Devices.

Flex Dynamic Frequency Selection Expandability Automatically Adjusts So Not to Interfere Add Functionality Via Module, Smart Antenna Port or With Other Radio Systems USB Port

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 18 Maximise the Spectrum Avoiding Excessive Management Traffic

§ Always aim for 1 SSID – More SSID’s = Worse Performance § Why? – Each SSID requires a separate Beacon – Each SSID will beacon at the minimum mandatory data rate § Each broadcast SSID will respond to null probe requests – Exponential amounts of airtime wasted

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 19 Maximise the Spectrum PHY Rate Tuning: Why PHY Rates Matter § How fast can we talk?

18Mbps – Signal (RSSI) and Noise are Client near AP: 24Mbps key factors Higher PHY Rate 36Mbps § As client moves further More Efficient 48Mbps (high signal-to-noise ratio) from AP or as noise 54Mbps worsens, client rate- shifts downward Client far from AP: Lower PHY Rate § Lower rate, more airtime Less Efficient consumed (lower signal-to-noise ratio) § 802.11ac Wave 2 example ~15’

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 20 Maximise the Spectrum PHY Rate Tuning: How-To Basics

18Mbps 24Mbps

36Mbps § Position AP’s and antennas to allow 48Mbps elimination of low rates (i.e., <18mbps) 54Mbps § Eliminate 802.11b rates § Avoid disabling MCS rates – Disabling MCS rates, especially 0-7, can cause significant client issues

Remember the 3 Key RF Relationships!

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 21 Client-Induced Interference: What is it?

Student Student with Student using Student generating Client covered by AP Smartphones Wi-Fi Induced Interference

Common Assumptions • 75% of Students will have a Smartphone • 30% of Smartphone users will utilise Wi-Fi

• But what is everyone else doing?

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 22 Client-Induced Interference What does it look like and how can we mitigate? § Client-induced interference: especially damaging on 2.4GHz but also impacts 5GHz via ACI (Adjacent Channel Interference) § Probe requests sent on all channels – Many frames on overlapping channels, driving noise floor to be higher/worse § Getting these devices on your network can help – Probe frequency diminishes significantly on an associated device

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 23 Maximising the Spectrum Ease-of-Use & Client Induced Interference

§ Ask yourself - how difficult is it to get on your WiFi network? § Ease-of-use directly impacts airtime efficiency § Low take rate = lots of probe request noise (1mb, max power, all channels) – Results in Client Induced Interference § Design for seamless end-user experience – Captive portals for T&C: necessary? § A device on the network is far less damaging than a device off the network!

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 24 Reforming 5 GHz to Optimise for 802.11ac

• More non-overlapping channels enabling better Future 5GHz Opportunity 802.11ac experience Channel No. of Non- Bandwidth (MHz) overlapping • 6x 80 MHz channels (5 in Canada and Europe) Channels

• 2x 160 MHz channels (1 in Canada) 20 37 40 18 • Additional 5GHz spectrum liberalisation (5.35- 5.47 GHz and 5.85-5.925 GHz) allows: 80 9 160 4

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 25 RRM’s new Flexible Radio Assignment (FRA)

• Manage the Flexible Radio Hardware • Determine Coverage Overlap Factor (COF) at 2.4 GHz • Evaluate Radios as potentially Redundant • Determine best role for Flexible Radio • Assign • Radio role determination and assignment is Automatic If radio’s FRA Auto and FRA is enabled. • FRA calculates COF for Manual assigned radios and Administrator can make Role choices

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 26 FRA – COF, Coverage Overlap Factor

• 2.4 GHz Radios that are members of the “Same” AP Group will be calculated together • Coverage Overlap is the percentage (%) of a given cell that is covered by other AP’s at -65 dBm or greater • All AP models considered in the coverage calculation • Neighbours above -60 dBm will be used for coverage • Only 2800/3800 can be marked as Redundant

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 27 FRA – Assignment Priority

• Pervasive 2.4GHz and 5GHz coverage 1 5GHz 2.4GHz Coverage too dense – Mark Redundant Serving Serving • Default operating Role

• DCAIncrease will determine Network suitability,Capacity andand Performance 2 5GHz 5GHz • IfMaximum Unsuitable over – then the Monitor air data rate up to 5.2Gbps Serving Serving • High Density Client Performance improvements

• Secure Network from Non-Wi-Fi Interference, wIPS 5GHz Wireless 3 Serving Security Wirelessattackers, monitoring and Rogue of 2.4 Clients/Access and 5 GHz Points Monitor • Scan both 2.4GHz and 5GHz for security threats

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 28 Wireless Controller HA Wireless Controller Deployment Modes

Cisco Cloud based FlexConnect Centralised Mobility Express

WAN

Traffic Distributed at AP Traffic Distributed at AP Traffic Centralised at WLC Traffic Distributed at AP

Target Branch and Campus Branch Campus Single Site Positioning Purchase Public Cloud Wireless only Wireless only Wireless only Decision

• Uses VRRP to elect a new • Multiple Geo distributed DCs • Full RF HA High controller based on a set of • Data over at least 3 DCs • Client SSO when Local • Most complete solution Availability priorities. Client SSO when local • Distributed packet processing Switching switching

Key • Ease of Management, • Branch with WAN BW and • Small to medium deployments. Up • Full features Considerations scalability, cloud based latency requirements to 100 Access Points

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 30 Centralised Mode HA Requirements Benefits Minimum release: 7.5 Active Client State is synched 5500, WiSM2, 7500, 8500 series AP state is synched L2 connection between boxes Client SSO No Application downtime Same HW and software HA-SKU available 1:1 box redundancy

Release: 7.3 and 7.4 5500, WiSM2, 7500, 8500 series AP state is synched AP SSO Direct physical connection No SSID downtime (SSID stateful switchover) Same HW and SW HA-SKU available (> 7.4) 1:1 box redundancy Network Uptime N+1 Redundancy Available on all controllers (Deterministic/Stateless HA, Each Controller has to be Crosses L3 boundaries a.k.a.: configured separately Flexible: 1:1, N:1, N:N primary/secondary/tertiary) HA-SKU available (> 7.4)

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 31 WLC Redundancy with Mobility Group Building 2 Why not recommended? WLC-B MAC: AA:AA:AA:AA:AA:02

Mobility Group : MyGroup

Mobility Group Neighbours: § Mobility Group allows controllers to peer with WLC-A, AA:AA:AA:AA:AA:01 each other to support Seamless and Fast WLC-C, AA:AA:AA:AA:AA:03 roaming across controller boundaries WLC-C • Support for up to 24 WLCs in the same Mobility Group WLC-A WLC-C MAC: AA:AA:AA:AA:AA:01 MAC: AA:AA:AA:AA:AA:03 § Best Practice is to keep Mobility Group small Mobility Group: MyGroup Mobility Group: MyGroup and limited to the areas where seamless Mobility Group Neighbours: Mobility Group Neighbours: WLC-B, AA:AA:AA:AA:AA:02 WLC-A, AA:AA:AA:AA:AA:01 mobility can happen WLCC, AA:AA:AA:AA:AA:03 WLC-B, AA:AA:AA:AA:AA:02

§ APs learn about all the WLCs in a Mobility Ethernet in IP Tunnel Group at join time WLC-A WLC-B

§ NOT Recommended for HA...why?

Building 1 BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 32 WLC Redundancy with Mobility Group Why not recommended? “Salt and Pepper” Deployment Mobility Group When relying only on Mobility Group information: • AP only learns available Controllers at JOIN time WLC-A WLC-B • AP joins the least loaded WLC AP learns both WLC • This could lead to “Salt and Pepper” deployment: CAPWAP tunnels Floor 1 • Same floor AP on different WLCs • More inter-controller roaming • Harder to troubleshoot • For High Availability: AP Boots UP AP state machine • No concept of backup controller list Reset WLC failure • Upon loosing the registered controller, the AP has to Discovery detected start from scratch the whole Discovery process to all members of Mobility Group Image Data DTLS Setup Run

Join Config

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 33 WLC Redundancy with Mobility Group Why not recommended “Salt and Pepper” Deployment Mobility Group When relying only on Mobility Group information: • AP only learns available Controllers at JOIN time WLC-A WLC-B • AP joins the least loaded WLC

• This could lead to “Salt and Pepper” deployment: CAPWAP tunnels Floor 1 • Same floor AP on different WLCs • More inter-controller roaming • Harder to troubleshoot

• For High Availability: AP state machine AP Boots UP • No concept of backup controller list Reset WLC failure • Upon loosing the registered controller, the AP has to Discovery detected start from scratch the whole Discovery process to all Not members of Mobility Group Image Data DTLS deterministic • Failover takes more time and it’s not deterministic: Setup you don’t know where the AP will end up Run

Join Config

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 34 N+1 Redundancy § Administrator statically assigns APs a primary,

WLAN-Controller-A WLAN-Controller-B WLAN-Controller-C secondary, and/or tertiary controller • Assigned from controller interface (per AP) or Prime Infrastructure (template-based) • You need to specify Name and IP if WLCs are not in the same Mobility Group IP Network § Pros: • Predictability: easier operational management • Support for L3 network between WLCs Access Point • Flexible redundancy design options:1:1, N:1, N:N:1

Primary: WLAN-Controller-1 Primary: WLAN-Controller-2 • WLCs can be of different HW and SW (*) Primary: WLAN-Controller-3 Secondary: WLAN-Controller-2 Secondary: WLAN-Controller-3 Secondary: WLAN-Controller-2 Tertiary: WLAN-Controller-3 Tertiary: WLAN-Controller-1 • “Fallback” option in the case of failover Tertiary: WLAN-Controller-1 • Can overload APs on controllers (using AP priority) § Cons: • Stateless redundancy 8.0 • More upfront planning and configuration

(*) AP will need to upgrade/downgrade code upon joining

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 35 N+1 Redundancy Global Backup Controllers

§ Backup controllers configured for all APs under Wireless > High Availability § Used if there are no primary/secondary/tertiary WLCs configured on the AP § The backup controllers are added to the primary discovery response message to the AP

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 36 N+1 Redundancy AP Failover Mechanism < 30-45 sec (*)

§ When configured with Primary and backup Controllers: • AP uses heartbeats to validate current WLC connectivity • Upon loosing a heartbeat to the Primary, AP sends 5 AP Boots UP WLC failure consecutives heartbeats every 3 second (default) detected • Configurable to minimum of 3 keepalive every 2 sec Reset Discovery • If no reply, AP starts the join process to the first backup WLC candidate: Image Data • Backup is the first alive WLC in this order: primary, secondary, tertiary, global primary, global secondary. DTLS Setup • With N+1 Failover, AP goes back to discovery state just to Run make sure the backup WLC is UP and then immediately starts the JOIN process • With N+1, AP periodically checks for Primary to come back Join Config online and falls back to it (AP fallback can be disabled)

(*) With Fast Heartbeat and minimum values for keepalive

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 37 N+1 Redundancy AP Fast Heartbeat < 30-45 sec (*)

§ Fast Heartbeats lower the amount of time it takes to detect Primary controller failure

§ How Fast Heartbeat works • AP sends these packets, by default every 1 sec • When the fast heartbeat timer expires, the AP sends a 3 fast echo requests to the WLC for 3 times (configurable)

• If no response primary is considered dead and the AP selects an available controller from its “backup controller” list in the order of primary, secondary, tertiary, primary backup controller, and secondary backup controller.

§ Fast Heartbeat only supported for Local and Flex mode

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 38 N+1 Redundancy AP Primary Discovery Request Timer

• The access point periodically sends primary discovery requests to the Primary WLC to know when it is back online. Default is 120 sec.

• If AP Fallback is enabled (default), the AP automatically joins back the Primary controller

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 39 N+1 Redundancy Critical AP fails over AP Failover Priority

AP Priority: Critical Overloaded Controller § Assign priorities to APs: Critical, High, Medium, Low Medium priority AP dropped

§ Critical priority APs get precedence over AP Priority: Medium all other APs when joining a controller

§ In a failover situation, a higher priority AP will be allowed to join ahead of all other APs

§ If backup controller doesn’t have enough licenses (ex. multiple Primary WLCs fail), existing lower priority APs will be dropped to accommodate higher priority APs

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 40 N+1 Redundancy Controller HA SKU § The HA-SKU was introduced in 7.4 for 5508, WiSM2, Flex7500, 8510 and in 7.5 for 2504 • It provides the support for the maximum number of APs on the specific hardware platform • It needs to be configured as you would with the secondary controller (no auto synch with Primary).

Primary Controller: WiSM-2 You need “config redundancy unit secondary” License Count: 500 To have support for max number of APs APs connected: 400

Primary Controller : 2504 IP network AIR-CT5508-HA-K9 License Count: 50 Secondary Controller APs connected: 25 Max AP support: 475500- APs25400 = =475 75 APs

§ Other important information: • For 5508 (2504) you need a minimum of 50 (5) PERMANENT licenses to convert it into HA-SKU • From 7.6 you can convert HA-SKU to Primary and use it as Active controller (you’d need to add licenses, of course) • In 8.0 no more nagging message on the console after 90 days from first AP joining • New 5520 and 8540 Controllers do not have an HA SKU, use the zero AP SKU instead

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 41 N+1 Redundancy < 30-45 sec (*) Geo separated DC Typical Design WLC-BKP

§ Most common Design is N+1 with Redundant WLC in a geographically separate location Primary Buildings IP network § Can provide 30-45 sec of downtime when use faster heartbeat to detect failure WLAN-Local § Use AP priority in case of over subscription of redundant WLC WLAN-Local

WLAN-Local § Use HA SKU for the backup Controller APs Configured With: Primary: WLAN-Local • available for 5508, 7500, 8510 since 7.4 and for Secondary: WLC-BKP 2500 from release 7.5

For more info: http://www.cisco.com/en/US/docs/wireless/technology/hi_avail/N1_HA_Overview.html or http://www.cisco.com/en/US/prod/collateral/wireless/ps6302/ps8322/ps10315/qa_c67-714540.html

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 42 Wireless Controller HA Centralised Mode – Stateful Switch Over (SSO) Stateful Switchover (SSO) < 1 sec Active Controller RP 1 § True Box to Box High Availability i.e. 1:1 • One WLC in Active state and second WLC in Hot Standby state L2 network Hot Stand-by Controller • Secondary continuously monitors the health of Active WLC via dedicated link

§ Configuration on Active is synched to Standby WLC RP 2 • This happens at startup and incrementally at each configuration change on the Active

§ What else is synched between Active and Standby? • AP CAPWAP state in 7.3 and 7.4: APs will not restart upon failover, SSID stays UP – AP SSO • Client in “RUN”/active state in 7.5: client will not disconnect – Client SSO

§ Downtime during failover reduced is greatly reduced: • 2 - 100 msec for a box failover (Active WLC crashes, system hangs, manual reset or forced switch-over) • 350-500 msec in the case of power failure on the Active WLC (no direct command for switchover is possible) • Few seconds in the case of network failover (gateway not reachable)

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 44 Stateful Switchover (SSO) What’s the impact on client applications?

Ping May loose one ping

VoIP Call Voice call stays up

MS Lync No session drop

Citrix VDI No impact

video: https://www.youtube.com/watch?v=If5F7eZkC3w

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 45 Stateful Switchover (SSO) Failover sequence ACTIVE STANDBYACTIVE

1. Redundancy role negotiation and config sync 2. APs associates with Active controller Si Si GARP 3. Client associates with Active through AP 4. Active failure: notify peer / or missing keep alive 5. Standby WLC sends out GARP

6. Standby becomes Active: Si Si Si AP DB and Client DB (7.5) is already synced with standby controller Si AP CAPWAP tunnel session intact Client session intact, client does not re-associate* Campus Access

Effective downtime for the client is: Detection time + Switchover time + (client association if AP SSO) Capwap tunnel AP SSO – 7.3 / 3.3 Client Session *Client SSO – 7.5 BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 46 Stateful Switchover (SSO) Pairing the boxes

§ HA Pairing is possible only between the same type of Active Controller hardware and software versions

§ 5500/7500/8500 have dedicated Redundancy Ports RP 1 • Direct connection supported in 7.3 and 7.4 L2 network (7.5) Hot Stand-by Controller • L2 connection supported in 7.5 and above

§ WiSM-2 has dedicated Redundancy VLAN RP 2 • Redundancy VLAN should be a non-routable VLAN • WISM-2 can be deployed in single chassis OR multiple chassis • WISM-2 in multiple chassis needs to use VSS (7.3, 7.4) • WISM-2 in multiple chassis can be L2 connected in 7.5 and above

§ Requirements for L2 connection: RTT Latency: < 80 ms; Bandwidth: > 60 Mbps; MTU: 1500

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 47 Stateful Switch Over (SSO) Redundancy Management Interface

§ Redundancy Management Interface (RMI) • To check gateway reachability sending ICMP packets every 1 sec • Peer reachability once the Active does not respond to Keepalive on the Redundant Port • Notification to standby in event of box failure or manual reset • Communication with Syslog, NTP, TFTP server for uploading configurations • Must be in same subnet as Management Interface. From 8.0 the Management VLAN needs to be tagged

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 48 Stateful Switchover (SSO) Redundancy Port

§ Redundancy Port (RP): • Active/Standby role negotiation • Configuration synch from Active to Standby (bulk and incremental configuration) • Peer reachability sending UDP keep alive messages every 100 msec • Notification to standby in event of box failure • Time synch with peer, if NTP not available • Auto generated IP Address where last 2 octets are picked from the last 2 octets of RMI

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 49 For Your Stateful Switchover (SSO) Reference Configuration

§ Management interfaces on both WLCs must be on the same subnet

§ Mandatory Configuration for HA setup: • Redundant Management IP Address • Peer Redundant Management IP Address • Redundancy Mode set to SSO enable (7.3 and 7.4 would show AP SSO) • Primary/Secondary Configuration – Required if peer WLC’s UDI is not HA SKU • The Primary HA must have valid AP licenses Optional Configuration: • Unit can be secondary if it has at least 50 AP (5508) • Service Port Peer IP permanent licenses (no restrictions for other WLCs) • Mobility MAC Address • Keep Alive and Peer Search Timer

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 50 For Your Stateful Switchover (SSO) Reference Connectivity to the boxes

§ Once SSO is enabled: • Connect to Standby WLC using console or SSH to Service Port and RMI • TFTP, NTP and Syslog traffic use the RMI interface on the Standby WLC • Telnet / SSH / SNMP / Web Access is not available on Management and Dynamic interface on Standby WLC • There is no SNMP or GUI access on the service port for both WLCs in the HA setup

§ When SSO is disabled: • Configuration done on Active is pushed to Standby; after rebooting all the ports will come up on Active and will be disabled on Standby • This is to avoid network conflicts because the two WLCs have the same configuration

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 51 For Your Stateful Switchover (SSO) Reference Maintenance Mode

§ Standby transitions to Maintenance Mode if: • Gateway not reachable via RMI Interface • Software mismatch • WLC with HA SKU has never discovered its peer • Redundant Port is down

§ Important info: • When one of the conditions above is met, Standby reboots and goes to Maintenance mode • From 8.0 it will go directly without reboot • In Maintenance mode same rules to connect to standby box apply (console or Service port) • WLC should be rebooted to bring it out of Maintenance Mode • From 7.6 it will reboot automatically when are fixed

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 52 Stateful Switchover (SSO) What is Synced/not Synced Between Active and Standby

All Configuration Access Point data base Client in “RUN” state (7.5) Sleeping Client data base (8.0) OEAP clients (8.0) Internal DHCP (8.0) Static CAP config and stats (8.0)

Client Statistics PMIPv6 info NBAR (AVC) info Active Standby SIP static CAC info WGB and associated clients Passive clients DB Location DB Rogue AP and client info Not synched as of 8.2 Device and root certificates

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 53 Stateful Switchover (SSO) Other Important Things to Keep in Mind..

§ There is no preemption in Controller SSO: • when the failed Active WLC comes back online it will joining as Hot Standby

§ Recommendations:

• In Service Software Upgrade (ISSU) is not supported: plan for down time when upgrading software • Physical connection between Redundant Ports should be done first before HA configuration • Keepalive and Peer Discovery timers should be left at default values for better performance

§ SSO and MESH APs: • only RAP are supported from 7.5, for MAPs the state is not synched • Use N+1 redundancy for a mesh based network

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 54 Stateful Switchover (SSO) Changes Introduced From Release 8.0

§ Gateway (GW) reachability changes:

GW reachability 7.3/7.4 7.5 8.0

protocol ICMP ICMP ICMP # of keepalives 3 12 6 ARP check n.a. n.a Yes

§ Release 8 introduced IPv6 support for the wireless infrastructure: • SSO (AP and Client) is supported with IPv6 • Redundancy Management/Redundancy port interface supports only IPv4 addresses.

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 55 Stateful Switchover (SSO) Changes Introduced From Release 8.0

§ Peer Redundancy Management interface (RMI) reachability check:

RMI reachability Before 8.0 8.0

protocol ICMP UDP

interval 1 sec 1 sec

§ IEEE 802.1Q tag for Management VLAN: starting 8.0, Management and RMI interfaces are highly RECOMMENDED to be tagged • If upgrading from a previous release with untagged interface, the controller will show a warning message “ Untagged configuration is not recommended”

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 56 For Your Stateful Switchover (SSO) Reference Changes Introduced From Release 8.0

§ Peer configuration: • new range for Keep Alive and Peer search timers • new Keep Alive Retries parameter Before 8.0 From 8.0

© 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public Stateful Switchover (SSO) Changes Introduced From Release 8.1: Fast Restart When useful:

ü LAG Configuration change

ü Mobility Mode change

ü Web-auth certificate installation

ü Clear Configuration

ü Post Configuration Wizard • Process restart to reduce network and service downtime ü Transfer Download of configuration • Supported on Cisco WLC 7510, 8510, 5520 8540 and vWLC

• CLI Command “restart”

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 58 8.1 release: Fast Restart 7510/8510 System Downtime with ‘reset system’

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 59 New in 8.1 release - Fast Restart 7510/8510 System Downtime with Fast ‘restart’ 73% Faster

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 60 For Your New in 8.1 release - HA Standby Monitoring Reference

HA-SKU Trap, Events and Logging

1 WLC Turns Hot Standby - Trap

2 Bulk Sync Completion - Trap

3 Standby Reboot - Trap

4 Peer System, CPU, Memory details on Active GUI and CLI

5 Admin Login on Standby RMI - Syslog

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 61 Stateful Switchover (SSO) Licensing

§ You need valid licenses on the Active for HA to work (permanent or evaluation) § As Standby you can use HA-SKU or a “converted” existing WLC § To convert any existing WLC to a Standby WLC: • Use the “config redundancy unit secondary” command in the CLI or GUI equivalent. • Restriction: on the 5508 a minimum of 50 AP Permanent licenses are needed. § What happens to licenses when you a HA pair? Example with HA-SKU: • The device with HA-SKU becomes Standby first time it pairs up • AP-count licenses will be pushed from Active to Standby • On event of Active failure, HA-SKU will let APs join and start a 90-day count-down. • After 90-days, HA-SKU WLC starts nagging messages but won’t disconnect connected APs • New WLC joins as Standby and timer is reset if the new WLC has a number of licenses >= to the failed one.

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 62 Stateful Switchover (SSO) Adding Licenses to a SSO Pair

§ The licenses are added to the ACTIVE controller • If using the HA-SKU make sure that the ACTIVE is the Primary controller

§ No need to break the pair and/or reboot. The HOT-STANDBY inherits the new added licenses • From 8.1, a reboot is recommended for the 5508 and WISM2 (not needed for 8510/8540/5520)

§ Let’s see the actual steps:

Active Standby

© 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public Wireless Controller HA FlexConnect Mode FlexConnect quick recap.... Central Site WLCs § Control plane, two modes of operation: Centralised • Connected (when WLC is reachable) Trafic • Standalone (when WLC is not reachable) Centralised Trafic § Data Plane can be: • Centralised (split MAC architecture) switching • Local (local MAC architecture) switching

§ Traffic Switching mode is configured per AP and per SSID • From 7.3 split tunnelling is supported on a WLAN basis WAN § FlexConnect Group: • Defines the Key caching domain for Fast Roaming, allows backup Radius scenarios Local Trafic § From 8.0 Flex + Mesh mode supported Remote Office § WAN recommendations: • Minimum bandwidth 12.8 kbps per AP • Round trip latency no greater than 300 ms for data deployments and 100 ms for data + voice deployments

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 65 FlexConnect HA Limitations Benefits

L2 roaming Upon WLC failure AP stays up and FlexConnect Local Flex Groups for AAA Local Auth. clients are not disconnected Switching Fault Tolerance: Identical Equivalent to Client SSO configuration on N+1 controllers AAA survivability available

FlexConnect Central Same as Centralised mode Same as Centralised mode Switching

For more info: http://www.cisco.com/en/US/products/ps11635/products_tech_note09186a0080b7f141.shtml

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 66 FlexConnect WAN Failure (or Single Central WLC Failure) Central Site

§ HA considerations: • Disconnection for centrally switched SSIDs clients • No impact for connected clients on locally switched SSIDs

§ Fast roaming allowed within FlexConnect group for already connected clients WAN

§ What about new clients? • Static keys are locally stored in FlexConnect AP: new clients can Remote Site join if authentication is PSK • Can design for AAA survivability (see next slides) Application Server § Lost features • RRM, CleanAir, WIDS, Location, other AP modes • Web authentication, NAC

Centrally switched traffic

Locally switched traffic BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 67 FlexConnect WLC Failure with Deterministic N+1 HA Central Site Secondary Primary

§ HA considerations: • Disconnection for centrally switched SSIDs clients • No impact for connected clients on locally switched SSIDs WAN § FlexConnect AP transitions to Standalone and then to Connected when joins the Secondary Remote § When in Standalone mode, Fast roaming is Office allowed within the FlexConnect Group Application § Fault Tolerant: upon re-syncing with Secondary, Server client sessions for local traffic are not impacted, provided that the configuration on the WLCs are identical

Centrally switched traffic

Locally switched traffic BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 68 FlexConnect WLC Failure Scenario with SSO Central Site

Standby

§ HA considerations: Active • No impact for locally switched SSIDs • Disconnection of centrally switched SSIDs clients with AP SSO • No/minimal impact for centrally switched client with Client SSO (7.5 and above) WAN § FlexConnect AP will NOT transition to Standalone because SSO kicks in

§ AP will continue to be in Connected mode with the Application Standby (now Active) WLC Server

Remote Office

Centrally switched traffic

Locally switched traffic BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 69 FlexConnect AAA Survivability AAA Server Backup Central Site Central RADIUS § By default authentication is done centrally in connected mode

§ Backup AAA servers are configured at FlexConnect Group level

§ When WLC/WAN fails, AP goes in Standalone mode WAN § In Standalone mode, the AP can be configured to Backup authenticate new clients with backup RADIUS defined RADIUS Remote locally at the AP Office § Upon WAN/WLC failure: • Existing connected clients stay connected • New clients are authenticated to the locally defined AAA FlexConnect Group

Central authentication traffic Local authentication traffic BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 70 FlexConnect AAA Survivability For Your AAA Server Backup Configuration Reference

§ Define primary and secondary local backup RADIUS server under FlexConnect Group configuration

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 71 FlexConnect AAA Survivability FlexConnect Local Auth Central Site Central RADIUS § By default FlexConnect AP authenticates clients through central controller when in Connected mode

§ This feature allows AP to act as an Authenticator even in Connected mode WAN Local § AAA servers are defined at the FlexGroup RADIUS level Remote Office § Useful HA scenarios: • Independent branch: AAA is local at the branch, no AAA traffic goes through WAN • WLC goes down but WAN is up. Local users are authenticated from AP to Central site AAA Central authentication traffic Local authentication traffic

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 72 FlexConnect AAA Survivability For Your FlexConnect Local Auth: configuration Reference

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 73 FlexConnect AAA Survivability AAA Server on AP Central Site

§ By default authentication is done centrally in Central RADIUS connected mode

§ When WLC/WAN fails AP goes in Standalone mode WAN § In Standalone, the AP can act as a AAA server

§ EAP-FAST, LEAP, PEAP*, EAP-TLS* and a max of 100 clients supported Remote Site

§ Upon WAN/WLC failure: • Existing connected clients stay connected • New clients are authenticated to the locally defined AAA

* 7.5 Code and above Central authentication traffic Local authentication traffic

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 74 FlexConnect For Your AAA Server on AP - Configuration Reference

§ Check “Enable AP Local Auth” under the FlexConnect Group “General” tab

§ Under the “Local Authentication” tab: § Define EAP parameters (LEAP, EAP-FAST, PEAP, EAP-TLS ) § Define users (max 100) and passwords

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 75 Wireless Controller HA Mobility Express Mobility Express Failure of Access Point Running the Controller Function

§ Controller and APs in the same L2 broadcast domain. Based on FlexConnect architecture. Support central authentication and local switching for clients

§ HA considerations: • No impact for connected clients on locally switched SSIDs

§ Fast roaming allowed within FlexConnect group for already connected clients

§ What about new clients? • Static keys are locally stored in FlexConnect AP: new clients can join if authentication is PSK

§ Lost features • Authentication of new client not previously connected, RRM, CleanAir • Web authentication

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 77 Mobility Express Failure of Access Point Running the Controller Function

§ Election of a new controller using VRRP • Heartbeat exchanged every 10s with Master AP • After 3 missed heartbeats, master election is initiated in and Mobility Express capable APs participate in Master Election • APs fall into standalone mode while Master Election in-progress and within next 30s, a new Master is elected • Standalone Access Points join the new elected master and go to connected mode

§ Election Priorities • Most capable Access Points. 3800 > 2800 > 1800. • Access Client with least client load • In case of tie, election based on lowest MAC Address

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 78 Master Election Process AIR-AP1852I-B-K9

AIR-AP2802I-B-K9 AIR-AP1852I-B-K9 Most capable Access Point P - 1850 vs. 1830 P

AIR-AP1852I-B-K9 AIR-AP3802I-B-K9 AIR-AP1852I-B-K9

MASTER AP Least Client Load P

Lowest MAC address AIR-AP13702I-B-K9 AIR-AP3702E-B-K9

AIR-AP2702I-B-K9

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 79 Management and Mobility Services HA Prime and MSE HA Prime and MSE HA

Requirements Benefits

§ Active / Standby (1:1) mode § No database loss upon failover § Same software & hardware § Failover Automatic or Manual Prime HA § Minimum failover time is 15 s § Failback is always manual § PI 2.2 supports Virtual IP (VIP) § No AP licenses on Secondary § HA SKU from PI 2.0 and later § Supported across WAN

§ Active / Standby (1:1) mode § HA for all Services supported § Same software and hardware § Failover times < 1 min MSE HA § Same subnet only (no WAN) § No HA licenses needed § Same software version § Services licenses on Primary § Release 8.0 recommended § Failover Automatic or Manual

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 81 Prime Infrastructure HA How it Works

§ The Health Monitor (HM) is the primary DB is synchronised and monitor component for HA operation of the system: through a dedicated process • Synchronises configuration related to HA • Synchronises the database Primary Server Secondary Server • Exchanges heartbeat messages • Checks the available disk space on both servers DB DB • Triggers failover Health Health • Connect to https://:8082 of Primary/Secondary Monitor Monitor Configuration Configuration files § All configuration is done on the Primary files (HM) (HM) • Secondary needs only authentication key at setup • Pi 2.2 introduces Virtual IP for same subnet deployments to simply configuration on monitored Prime Processes Prime Processes devices (Active) (Inactive) • Manual Failover is recommended

§ Prime HA is supported in 3 scenarios: HM synchronises config files: • Same LAN: Virtual IP can be used SOAP based heartbeat: § Frequently change: 10 sec • Campus: usually different subnets every 5 sec § Less frequent: 100 sec • Remote: across WAN

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 82 MSE HA How it Works Role configuration is done through the setup MSE script

Primary MSE Secondary MSE

Eth0: 10.1.1.12 Eth0: 10.1.1.13

VIP: 10.1.1.11 Same Health VLAN Monitor Synch HA Configuration, pairing and Monitoring NMSP SOAP/XML/REST is done through Prime over HTTPS

WLC Prime

© 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public HA Design and Deployment Practices HA Design and Deployment Practices Connecting an AP to the Wired Network

Recommendations: § Create redundancy throughout the access layer by homing APs to different switches § If the AP is in Local mode, configure the port as access with SPT PortFast, BPDU guard, etc. § If the AP is in Flex mode and Local Switching, configure the port as trunk and allow only the VLANs you need

© 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public HA Design and Deployment Practices Connecting a Controller to the Wired Network: Options

1) To a single Modular Switch or StackWise Modular Switch/Stack • Use Trunk EtherChannel(EC)/LAG • Trunk only the required VLANs to the Controller WLC • 2/4/8 ports in a bundle to optimise load sharing • Spread ports across Line Cards/Stack members

2) To a VSS pair VSS pair

• Same as Option 1 WLC • Spread ports across VSS members

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 86 Connecting a Controller to the wired network

Single AireOS Controllers (2504/5508/7500/8500/Wism2) Distribution Layer Switch/Stack Option 1: to single Modular Switch or StackWise

§ Identical configuration on WLC and switch side (EC mode, trunk mode, allowed VLANs, native VLAN, etc.) § EC mode: only mode “ON” supported; no LACP, PAgP § EC load-balancing: no restriction for 5508/2500/7500/8500 • Recommended to include L3 and L4 port for better hash results Trunk § EC load-balancing for WISM2: Port-channel • Need to set the EC load balancing method on the switch to “src-dest-IP”. Use CLI “port-channel load-balance src_dest_ip” § Note: no STP supported on AireOS Controllers. Do not disable it on switch side. Use “switchport portfast trunk”

AireOS based WLC

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 87 Connecting a Controller to the wired network

Single AireOS Controllers (2504/5508/7500/8500/Wism2) Distribution Layer Switch/Stack Option 1: to single Modular Switch or StackWise

§ Identical configuration on WLC and switch side (EC mode, trunk mode, port-channel load-balance src-dst-mixed-ip-port allowed VLANs, native! VLAN, etc.) § EC mode: only modeinterface “ON” GigabitEthernet1/0/1supported; no LACP, PAgP § EC load-balancing: nodescription restriction to_WLC for- 5508/2500/7500/85001 • Recommended toswitchport include L3 trunk and L4encapsulation port for better dot1q hash results Trunk • On the switch use:switchport “port-channel trunk loadallowed-balance vlan src10,11,20,30,40-dst-mixed-ip-port” Port-channel § EC load-balancing forswitchport WISM2: mode trunk • Need to set the ECchannel load- balancinggroup 1 mode method on on the switch to “src-dest-IP”. Use CLI “port-channel loadswitchport-balance src_dest_ip”portfast trunk • For Catalyst 6500 with PFC3 use “port-channel load-balance src-dst-ip exclude vlan” (command supported in 12.2(33)SXH6 and 12.2(33)SXI3 and above) § Note: no STP supported on AireOS Controllers. Do not disable it on switch AireOS based WLC side. Use “switchport portfast trunk”

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 88 Connecting a Controller to the wired network

Catalyst VSS Pair Option 2: to a VSS pair

§ Single LAG to the VSS pair § Spread ports across VSS pair § In case of failure of Primary switch traffic continues to flow through Secondary switch in the VSS pair Trunk § Same recommendations given for Option 1 also apply Port-channel

Recommende d Network Design WLC

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 89 Design & Deployment Practice

Connecting a Controller HA pair HA Design and Deployment Practices Connecting AireOS HA Pair to the Wired Network

Single Switch or stack Option 1: to single Modular Switch or StackWise Same configuration on both Po1 and Po2 § The HA pair of AireOS WLCs should be considered as separated WLCs with the same exact configuration Po 1 Po 2 § Ports on both WLCs are UP but only the ones on the Active WLC are forwarding data traffic Trunk Port-channels § On WLC side: use same physical ports are connected to the network, for ex.: port 1-4 on WLC1 and port 1-4 on WLC2 § On switch side the configuration has to be the same. If using LAG, L2 for example, two Port-channel should be used with the same configuration (same mode, same VLANs, same native, etc.) AireOS AireOS § General recommendations for Option 1 AireOS WLC also apply Active WLC Standby WLC

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 91 HA Design and Deployment Practices Recommende d Network Connecting AireOS HA Pair to the Wired Network Design Same configuration on both Po1 and Po2 Catalyst VSS Pair Option 2: to VSS pair

§ Use EC from each WLC to Distribution VSS § Spread the links in each EC among the two physical switches: this will prevent a WLC Po 1 Po 2 switchover upon a failure of one of the VSS switch Trunk § Same considerations for connecting to a single Port-channels Distribution switch apply § General recommendations for Option 1 AireOS L2 WLC also apply AireOS AireOS Active WLC Standby WLC

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 92 HA Design and Deployment Practices

Connecting AireOS HA Pair to the wired network Distribution Layer Switches

Option 3: to Pair of Distribution switches Layer 2

§ Use ECs to connect to Distribution switches Po 2 § Same exact configuration on both Dist. switches Po 1 § Use same physical ports on the WLCs § Layer 2 between the distribution switches for the Wireless VLANs § Use STP on the Distribution switches L2

AireOS AireOS Active WLC Standby WLC

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 93 Recommende HA Design and Deployment Practices d Network Controller Considerations Design

§ 1)VSS To a is single the recommended Modular Switch Design or choice StackWise as it provides:

§ •RedundancyUse Trunk at EtherChannel(EC)/LAGdistribution layer Catalyst VSS Pair § •EfficientTrunk use only of allthe links required with MultiVLANs-Link to EtherChannelthe Controller • 2/4/8 ports in a bundle to optimize load sharing § Fast convergence, no spanning tree • Spread ports across Line Cards/Stack members § How many WLC ports do I need to connect? 2) To a VSS pair § Multiple interfaces for redundancy § •ConsiderSame the as wirelessOption 1 over-subscription (80:1 is considered normal) • Spread ports across VSS members Trunk § Choose the right model of switch to connect to: Port-channel 3)§ ToSome a pair controllers of Distribution have only switches10GE interfaces (8510, 7510) § Consider TCAM scalability for the number of client MACs

§ Sup2T and Nexus 7000 supports 128k MAC addresses

§ 3850 supports up to 32k MAC addresses WLC

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 94 HA Design and Deployment Practices

Campus HA Design and Deployment Practices Campus

§ What is the acceptable downtime for your business applications? • Are 30 sec to few minutes ok? Go with N+1 to have more deployment flexibility • No downtime? Go with AireOS Stateful Switchover

§ SSO: what is the downtime to upgrade a HA pair and how to minimise it?

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 96 HA Design and Deployment Practices Upgrading an SSO Pair - Standard Procedure

8.0Active Standby 1. Download the new code on Active

7.6 7.6 2. Code transferred to Standby:

Do NOT reboot at this time!

3. Pre-download software on APs

Campus/WAN

8.0 8.0 8.0 8.0 7.6 7.6 7.6 7.6

CAPWAP tunnel

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 97 HA Design and Deployment Practices Upgrading an SSO Pair - Standard Procedure

8.0Active 8.0Standby 1. Download the new code on Active

7.6 7.6 2. Code transferred to Standby

3. Pre-download software on APs

4. Swap the images on APs

5. Reboot the HA pair Campus/WAN • APs will reboot and join when Active is UP

Total Network Downtime: Time for HA pair to reboot + the APs to join 8.0 8.0 8.0 8.0 7.6 7.6 7.6 7.6 5min:12sec with fully loaded 5508 (500 APs/7000 clients)

© 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public HA Design and Deployment Practices Upgrading an SSO Pair – Efficient Procedure

8.0Active Standby backup 1. Download the new code on Active

7.6 7.6 7.6 2. Code transferred to Standby Do NOT reboot at this time!

3. Pre-download software on APs

4. Configure APs to join the backup controller Campus/WAN • This can be done per group of APs/Areas • This can be automated using Prime

5. The APs join the backup WLC (no reboot) • This takes less than 30sec 8.0 8.0 8.0 8.0 7.6 7.6 7.6 7.6 • Downtime can be isolated per area

CAPWAP tunnel

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 99 HA Design and Deployment Practices Upgrading an SSO Pair – Efficient Procedure

8.0Active 8.0 Standby backup 1. Download the new code on Active

7.6 7.6 7.6 2. Code transferred to Standby

3. Pre-download software on APs

4. Configure APs to join the backup controller

5. The APs join the backup WLC (no reboot) Campus/WAN 6. Swap the images on Aps

Do this for all the APs in your network 8.0 8.0 8.0 8.0 7.6 7.6 7.6 7.6

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 100 HA Design and Deployment Practices Upgrading an SSO Pair – Efficient Procedure

8.0Active 8.0 Standby backup 7. Reboot the HA pair

7.6 7.6 7.6 8. Configure the APs to join the HA pair • This can be done per group of APs/Areas • This can be automated via Prime

9. APs will join the Active WLC and reboot because of new code: Campus/WAN

Network Downtime: Time for the APs to move to Active, reboot

8.0 8.0 8.0 8.0 and join back: 3min 7.6 7.6 7.6 7.6 Main Advantage: downtime is per Area

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 101 HA Design and Deployment Practices Campus

• What is the acceptable downtime for your business applications? • No downtime? Go with AireOS Stateful Switchover • Are 30 sec to few minutes ok? Go with N+1 to have more deployment flexibility

• SSO: what is the downtime to upgrade a HA pair and how to minimise it?

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 102 HA Design and Deployment Practice

Branch HA Design and Deployment Practices Branch: Some Key Design Questions

§ General considerations:

Local Controller FlexConnect • Branch cookie cutter configuration • Specific per branch configuration VS. • Single pane of Mgmt. & Troubleshooting • Independency from WAN quality • Reduced branch footprint • Reduced configuration on switches • Cheapest AP controller licenses • Full feature support • Built-in resiliency • L3 roaming supported • Perfect fit for centralised IT Team

§ HA considerations: • Is the branch independent from the Central site from an operation prospective? • What is the traffic flow of your application? Are the APP servers centrally located? • Is there a local Internet breakout? • How do you authenticate new users if WAN/Controller is down? Where is the AAA server located? • FlexConnect is inherently designed for HA and offers: • Data plane resiliency upon Central WLC failure or WAN outage • AAA survivability options

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 104 HA Design and Deployment Practices Branch Redundancy: Centralised Controller & Flex (local switching) HA considerations: Data Centre Campus Services § if WAN fails, Flex APs allow a level of redundancy: § Local Data path stays UP

5500 / 8500/ 7510 § Control plane features go down: RRM, CleanAir, WebAuth, etc. ISE § WLC SSO at central site provides Control plane survivability Si WAN Si Design considerations: PI § WAN requirements: § General: 12kbps per AP, 300 ms RTT, 500B MTU § More info here: http://tiny.cc/FlexDG Remote location § APs are in Flex Mode = less features and functionalities compare to Local Mode. Key features missing: • No L3 roaming, No Bonjour Gateway § Flex Groups have AP count limit § 25 APs for 2500/5508, 100 APs for 7500/85x0/5520 FlexConnect APs § Switchport as Trunk if SSID/VLAN separation needed

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 105 HA Design and Deployment Practices Branch Redundancy: Local Controller, Flex local Switching & Central Backup Controller

Data Centre High Availability considerations: Campus Services § Local Controller for managing the APs and for providing 5500 / 8500/ 7510 Control plane survivability in the event of a WAN failure (RRM, CleanAir, WebAuth, etc.) ISE § Why AP in Flex? So that if the local controller fails, the Si

Si APs can failover to the central controller but traffic still PI Internet remains local WAN Design considerations: § AP in Flex mode = less features and functionality compare to Local Mode. Key features missing: Remote WLC Local Services: location • No L3 roaming, No Bonjour GW AAA, DHCP, DNS Si § If using Flex Groups be aware of the AP count limit (25 APs for 2500/5508, 100 APs for 7500 / 8500) § Switchport as Trunk if SSID/VLAN separation needed § For large branch is recommended to have DHCP, DNS FlexConnect and AAA services running locally for better reliability APs

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 106 Key takeaways Key Takeaways

§ High Availability for Wireless is a multi level approach, starting from Level 1 (RF)

§ You have different solutions to chose based on the downtime that is acceptable for your business application

§ Cisco Controller SSO eliminates the network downtime upon a controller failure

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 108 Key Takeaways

Minimum release: 7.5 Active Client State is synched Stateful 5500, WiSM2, 7500, 8500 series AP state is synched Predictable L2 connection Switchover No Application downtime < 1 sec Same HW and software (SSO) HA-SKU available 1:1 box redundancy

Available on all controllers Crosses L3 boundaries Predictable N+1 Redundancy Each Controller has to be configured separately Flexible: 1:1, N:1, N:N <30 sec HA-SKU available (> 7.4) Network Uptime

Available on all controllers Each Controller has to be Crosses L3 boundaries Unpredictable Mobility Group configured separately No specific HA configuration

Requirements Benefits Downtime

© 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public Cisco Wireless LAN Documentation

INSTALLATION GUIDES RADIO CONFIGURATION CLIENT ADDRESSING POLICY ENGINE

§ 5520 WLC • 802.11r BSS Fast Transition • Bi-Directional Rate Limiting • AVC

§ 8540 WLC • Adaptive wIPS • Flex AP-EoGRE Tunnel Gtwy • Bonjour

§ AP1570 • ATF Ph 1 & 2 • IPv6 • Chromecast

§ AP1810 OE • CleanAir • Jabber • Device Classification

§ AP1810W Wall Plate • CMX FastLocate • Jabber and UCM • Domain Filtering

§ AP1850 • High Density • Microsoft Lync • mDNS Gateway w/Chromecast

§ AP2700/3700 • Rogue Management • Passpoint Configuration • Wireless Device Profiling & Policy Classification § AP2800/3800 • RRM RF Grouping Algorithm • Real-Time Traffic Over WLAN BEST PRACTICES § AP702W • RRM White Paper • VideoStream • Apple Devices • Vocera IP Phone in WLAN § APIC-EM Wireless AP PnP ENCRYPTION • Enterprise Mobility Design Guide § Flex7500 WLC • VoWLAN Troubleshooting • BYOD for FlexConnect • High Availability (SSO) § Mesh APs • BYOD with ISE • HyperLocation § Mobility Express • Security Integration • iPhone 6 Roaming § Smart Licensing • N+1 High Availability

§ Univ. AP Regulatory Domain • WLAN Express

§ Virtual WLC • WLC Configuration Best Practices

BRKEWN-2010 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 110 Click - https://www.youtube.com/user/CiscoWLAN/

© 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public VoD Links

Faster Innovation • Fastlane App Demo https://www.youtube.com/watch?v=N1QMUcv3aRQ • Cisco CMX Solution https://www.youtube.com/watch?v=KQRb8vfU0qM

• Cisco APIC-EM Wireless PnP Demo https://www.youtube.com/watch?v=_9P2- • CMX Hyperlocation vs RSSI Demo bU66PU https://www.youtube.com/watch?v=6ls7EHbSK4A

• Cisco Aironet Plug and Play Cloud Redirection Reduce • Cisco Dual 5GHz Wi-Fi https://www.youtube.com/watch?v=mbpjiETvDXc https://www.youtube.com/watch?v=W7fBZ6xfSxw

Cost & • Cisco Aironet AP-3800 RF Excellence • Wireless LAN Controller Dashboard Review Complexity https://www.youtube.com/watch?v=dBpGsTKeyNM&t=64s https://www.youtube.com/watch?v=af09TBaafRI&feature=youtu.be • Digital Network Architecture with Wave2 with 802.11ac • Cisco Wireless Mobile App https://www.youtube.com/watch?v=HyvZ4mbVAWs https://www.youtube.com/watch?v=ySjN13hPhXY&t=2s • WLC Advanced UI Client Troubleshooting • Cisco Aironet Series – Flexible Radio Assignment https://www.youtube.com/watch?v=dZVxI6jOx_Q https://www.youtube.com/watch?v=K_-BykT_YIM • ISE Simplified Wireless Setup https://www.youtube.com/watch?v=A3F2DrFu7Lo&feature=youtu.be • TechWiseTV: Apple and Cisco: Fast-Tracking the Mobile Enterprise https://www.youtube.com/watch?v=bh8rEvrzm7Y&feature=youtu.be Lower • Cisco Wireless TrustSec Demo https://www.youtube.com/watch?v=A3F2DrFu7Lo&feature=youtu.be • Prioritised Business Apps Risk https://www.youtube.com/watch?v=z0EOKNxL964&feature=youtu.be • Cisco Wireless Netflow Lancope Integration Demo https://www.youtube.com/watch?v=TuWYkrt94CQ • Apple and Cisco: Three Solutions Coming Together https://www.youtube.com/watch?v=7MgsDkf55wQ&feature=youtu.be • OpenDNS Integration with WLC https://www.youtube.com/watch?v=cMdX8sBBYG4 • WiFi Optimised Feature https://www.youtube.com/watch?v=xgPfxAolJoQ&feature=youtu.be

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 110 Q & A Complete Your Online Session Evaluation Give us your feedback and receive a Cisco Live 2017 Cap by completing the overall event evaluation and 5 session evaluations.

All evaluations can be completed via the Cisco Live Mobile App.

Caps can be collected Friday 10 March Learn online with Cisco Live! at Registration. Visit us online after the conference for full access to session videos and presentations. www.CiscoLiveAPAC.com

BRKEWN-3014 © 2017 Cisco and/or its affiliates. All rights reserved. Cisco Public 114 Thank you