#CLUS Network State Awareness and Troubleshooting

Aamer Akhter / [email protected] BRKARC-2025

#CLUS Agenda • Troubleshooting Methodology

• Packet Forwarding Review

• Control Plane • Topology • Logging • Routing Protocol Stability

• Data Plane • Active Monitoring • Passive Flow Monitoring • QoS

• Getting Started

#CLUS BRKARC-2025 © 2018 Cisco and/or its affiliates. All rights reserved. Cisco Public

Cisco Webex Teams

Questions? Use Cisco Webex Teams (formerly Cisco Spark) to chat with the speaker after the session.

How
1 Find this session in the Cisco Events App
2 Click “Join the Discussion”
3 Install Webex Teams or go directly to the team space
4 Enter messages/questions in the team space

Webex Teams will be moderated by the speaker until June 18, 2018. cs.co/ciscolivebot#BRKARC-2025

Keeping Focused: What This Session is About

• This session is about basic network troubleshooting, focusing on fault detection & isolation
  • Some non-Cisco specifics

• For context, we will cover some basic methodologies and functional elements of network behavior

• This session is NOT about
  • Architectures of specific platforms
  • Data Center technologies

• This is the 90 min tour. ;-)

Network Quality is a Complex, End-to-End Problem

[Figure: the path between a mobile client/user and the application, spanning the office site (APs, local WLCs), the WAN, and the DC (network services, Cisco Prime™). Each factor is marked as affecting join/roam, quality/throughput, or both*.]

• Factors shown: client firmware, client density, AP coverage, RF noise/interference, WLC capacity, configuration, authentication, addressing, WAN uplink usage, WAN QoS, routing, end-user services
• Services shown: DHCP, CUCM, ISE
• There are 100+ points of failure between user and app
• What is the problem? Where is the problem? How can I fix the problem fast?

* Both = join/roam and quality/throughput

Network state awareness?

• What is it:
  • View of network, what it is doing, and why
  • Monitoring of data network performance, in comparison with previous working states
• Quick detection of hard failures
• Early warning for
  • soft failures
  • performance issues
  • and tomorrow’s problems
• Faster problem resolution
• Greater confidence in network by users and application operators

Think Like a Network Detective

• Find the Suspects
• Question Suspects
• Improve
• Be Prepared

Control Plane & Data Plane

• Control Plane
  • Processes variety of information sources and policies, creates forwarding information base (FIB)
  • Best known intention w/o actual packet in hand
  • Inputs: admin edict, gossip from other routers, APIs, statics, PfR, routing protocol(s)
  • Commands: show ip bgp, show ip ospf, show ip route, show ip policy

• Data Plane
  • The actual forwarding process (might be SW or HW based)
  • Granted some decision flexibility
  • Driven by arriving packet details, traffic conditions etc.
  • Commands: show ip cef, show mpls forwarding…, show -table, show policy-map int…, show interface

• Passive Measurements: show flow monitor, ifmib, CbQoS, *Flow

[Figure: a packet arrives on Int A and the data plane forwards it toward Int B or Int C, using the forwarding information built by the control plane]

Data Plane Decision Flexibility

• Control plane: condenses options driven by policies and (relatively) slower moving (ms to secs), aggregated information, e.g. prefix reachability, interface state

• Data plane responds to packet conditions
  • Destination prefix to egress interface matching
  • Multi-path (ECMP / LAG) member selection
  • Interface congestion
  • QoS class state
  • Access Lists
  • Packet processing fields (TTL expire, etc)
  • IPv4 fragmentation, etc
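Multi-path member selection is commonly pictured as a hash over flow fields, so that packets of one flow stay on one member. The following is a minimal Python sketch of that idea only. The function name, the choice of a 5-tuple as hash input, the use of SHA-256, and the interface names are all assumptions for illustration; real platforms use their own hash inputs and algorithms.

```python
import hashlib

def pick_member(src_ip, dst_ip, proto, src_port, dst_port, members):
    """Pick one ECMP/LAG member for a flow by hashing its 5-tuple.

    Illustrative only: real devices use platform-specific hash functions
    and may hash different fields.
    """
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(members)
    return members[index]

members = ["Int B", "Int C"]  # hypothetical egress interfaces
flow = ("10.0.0.1", "10.0.1.1", 17, 40000, 33434)  # hypothetical UDP flow
first = pick_member(*flow, members)
again = pick_member(*flow, members)  # same flow, same member every time
```

Because the choice depends on packet fields, two probes that differ only in a port number may take different members, which matters when reading traceroute output across ECMP paths.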

Network as a System: Independent Decisions

• Each network device makes an independent forwarding decision
  • Explicit local / domain policies
  • Device perspective might not be symmetric
  • Data plane flexibility

• Asymmetric routing: forward and reverse path are different
  • Caused by traffic engineering policies, popular at WAN-edge and admin boundaries

[Figure: hosts A and B connected through routers R1 to R6. Part of the path is your network, part you don’t control. One link is congested, and R5 is doing ECMP hash.]

Data Plane and Control Plane Changes

• Change is normal, but some changes are more interesting:

• Single change that causes loss of reachability or suboptimal performance

• Instability: high rate of change

• 3Ws: When, where, and what

Control Plane

3Ws: When, where, and what

What do I have?

• Establish inventory baseline
  • Device names, IPs, configuration
  • Modular HW configuration
  • Serial # (for support & replacement)
  • History (where has it been placed)

• Clearly label devices, ownership and contact info too
• Establish standards for location, device/port names
• Check for changes periodically (tooling)

[Figures: example device label, example cable label]

How is it wired together?

• Establish baseline
• Visual inspection
  • Be prepared to be surprised!
• CDP / LLDP for Layer-2 neighborships
  • show cdp neighbor
  • show lldp neighbor
  • Traverse spanning-tree blocked ports, but not L3
• Monitor for non-leaf changes

[Figure: R1, SW1, R2 connected in a line]

R1#show cdp neighbors
Capability Codes: R - Router, T - Trans Bridge, B - Source Route Bridge
                  S - Switch, H - Host, I - IGMP, r - Repeater

Device ID    Local Intrfce    Holdtme    Capability    Platform      Port ID
SW1          Eth 0            157        T S           WS-C3524-X    Fas 0/0/0

Tools for Topology & Inventory Management

• Most NMS tools have some element of inventory and topology awareness

• DNA Center

• APIC-EM

• Cisco Prime Infrastructure

• NetBrain

• (open source) NetDisco http://www.netdisco.org

• (open source) Netdot https://osl.uoregon.edu/redmine/projects/netdot

[Figures: DNA Center: Discovery; DNA Center: Physical Neighbor Topology]

Logging

• Centrally: for ease of analysis and search
  • Cisco Prime Infrastructure & Cisco EPNM – full featured tool for inventory and monitoring
  • Moogsoft – automates early detection of service failures, collaboration & knowledge base
  • syslog-ng – preprocessing, relay and store (file/db)
  • Logstash (ELK), fluentd – multisource collection, storage and analysis

service timestamps log datetime msec show-timezone
!
logging host
logging trap 6
logging source-interface Loopback 0

• Locally: in case logs can’t get home

logging buffered 6
logging persistent url disk0:/syslog size filesize

Alerting & Collaboration

• Routing of alerts / interesting events
  • Is this noise or signal?
  • Which team(s) to alert?
  • Who is on duty?
  • How to contact: SMS, IM, phone call…
  • Pagerduty, Openduty

• Coordinating response
  • IM tools (Spark, hipchat etc.)
  • Email
  • Ticketing tools (OTRS, Jira, ServiceNow, Moogsoft…)

[Figures: PagerDuty and OTRS screenshots]

State of the Routing Table

• Be familiar with normal behavior of important service prefixes
• Establish quickly if problem is control plane or data plane
  • show ip route / ipRouteTable MIB / show ip traffic (Drop stats)
  • Nagios: check_snmp_iproute.pl
  • Track objects and EEM

(config)
track 100 ip route 0.0.0.0 0.0.0.0 reachability
event manager applet TrackRoute_0.0.0.0
 event track 100 state any
 action 1.0 syslog msg "route is $_track_state"

# 01:09:21: %HA_EM-6-LOG: TrackRoute_0.0.0.0: route is down

(source: blog.ipspace.net)

#show ip route 192.168.2.2
Routing entry for 192.168.2.2/32
  Known via "ospf 1", distance 110, metric 11, type intra area
  Last update from 10.0.0.2 on FastEthernet0/0, 00:00:13 ago
  Routing Descriptor Blocks:
  * 10.0.0.2, from 2.2.2.2, 00:00:13 ago, via FastEthernet0/0
      Route metric is 11, traffic share count is 1

(source: blog.ipspace.net)

OSPF Area / AS-Wide

• Remember that OSPF data in area should be consistent
• Understand ‘normal’ rate of changes
  • LSA refresh /30-min unless a change
• Track SPF runs over time
  • show ip ospf stat detail
  • number of LSAs expected
  • OSPF-MIB: OspfSpfRuns, ospfAreaLSACount
• Route missing?
  • Where is the network supposed to be attached? Is it still?
  • show interface (on advertising router)
  • show ip ospf database …
• Of course, all lines have a purpose: BRKRST-3310

# show ip ospf
 Routing Process "ospf 1" with ID 192.168.0.1
 Start time: 00:01:46.195, Time elapsed: 00:48:27.308
 Supports only single TOS(TOS0) routes
 Supports opaque LSA
 Supports Link-local Signaling (LLS)
 Supports area transit capability
 Supports NSSA (compatible with RFC 3101)
 Supports Database Exchange Summary List Optimization (RFC 5243)
 Event-log enabled, Maximum number of events: 1000, Mode: cyclic
 Router is not originating router-LSAs with maximum metric
 Initial SPF schedule delay 5000 msecs
 Minimum hold time between two consecutive SPFs 10000 msecs
 Maximum wait time between two consecutive SPFs 10000 msecs
 Incremental-SPF disabled
 Minimum LSA interval 5 secs
 Minimum LSA arrival 1000 msecs
 LSA group pacing timer 240 secs
 Interface flood pacing timer 33 msecs
 Retransmission pacing timer 66 msecs
 Number of external LSA 0. Checksum Sum 0x000000
 Number of opaque AS LSA 0. Checksum Sum 0x000000
 Number of DCbitless external and opaque AS LSA 0
 Number of DoNotAge external and opaque AS LSA 0
 Number of areas in this router is 1. 1 normal 0 stub 0 nssa
 Number of areas transit capable is 0
 External flood list length 0
 IETF NSF helper support enabled
 Cisco NSF helper support enabled
 Reference bandwidth unit is 100 mbps
    Area BACKBONE(0)
        Number of interfaces in this area is 4 (1 loopback)
        Area has no authentication
        SPF algorithm last executed 00:47:05.379 ago
        SPF algorithm executed 4 times
        Area ranges are
        Number of LSA 16. Checksum Sum 0x078460
        Number of opaque link LSA 0. Checksum Sum 0x000000
        Number of DCbitless LSA 0
        Number of indication LSA 0
        Number of DoNotAge LSA 0
        Flood list length 0

OSPF Neighborships

• neighbor adjacencies
  • log-adjacency-changes [detail] (on by default, detail optional)
  • show ip ospf neighbor detail (OSPF-MIB: ospfNbrState, ospfNbrEvents, ospfNbrLSRetransQLen)

(config) router ospf
(config-router) log-adjacency-changes [detail]

%OSPF-5-ADJCHG: Process 12, Nbr 172.25.25.1 on Serial0/0 from FULL to DOWN, Neighbor Down: Dead timer expired
Oct 14 09:57:43: %OSPF-5-ADJCHG: Process 12, Nbr 172.25.25.1 on ...

# show ip ospf neighbor detail
 Neighbor 192.168.0.7, interface address 10.0.0.3
    In the area 0 via interface GigabitEthernet0/1
    Neighbor priority is 1, State is FULL, 6 state changes
    DR is 10.0.0.3 BDR is 10.0.0.4
    Options is 0x12 in Hello (E-bit, L-bit)
    Options is 0x52 in DBD (E-bit, L-bit, O-bit)
    LLS Options is 0x1 (LR)
    Dead timer due in 00:00:39
    Neighbor is up for 00:33:10
    Index 2/2/2, retransmission queue length 0, number of retransmission 0
    First 0x0(0)/0x0(0)/0x0(0) Next 0x0(0)/0x0(0)/0x0(0)
    Last retransmission scan length is 0, maximum is 0
    Last retransmission scan time is 0 msec, maximum is 0 msec
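Adjacency-change messages like the %OSPF-5-ADJCHG line above are easier to count and correlate once they are split into fields. This is a minimal Python sketch of such a parser. The regular expression is written against the single message format shown on this slide, and the function and field names are hypothetical; other IOS releases or message variants may need adjustments.

```python
import re

# Pattern written against the example message on this slide (an assumption,
# not a complete grammar for every ADJCHG variant).
ADJCHG = re.compile(
    r"%OSPF-5-ADJCHG: Process (?P<process>\d+), Nbr (?P<nbr>\S+) "
    r"on (?P<interface>\S+) from (?P<old>\w+) to (?P<new>\w+)"
    r"(?:, (?P<reason>.+))?"
)

def parse_adjchg(line):
    """Return a dict of fields for an OSPF adjacency change, or None."""
    match = ADJCHG.search(line)
    return match.groupdict() if match else None

sample = ("%OSPF-5-ADJCHG: Process 12, Nbr 172.25.25.1 on Serial0/0 "
          "from FULL to DOWN, Neighbor Down: Dead timer expired")
event = parse_adjchg(sample)
```

Counting parsed events per neighbor and per hour gives a simple view of the "when, where, and what" for adjacency instability.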

Neighbors: Show IP EIGRP Neighbors

RtrA#show ip eigrp neighbors
IP-EIGRP neighbors for process 1
H   Address     Interface   Hold    Uptime   SRTT   RTO   Q     Seq
                            (sec)            (ms)         Cnt   Num
2   10.1.1.1    Et0         12      6d16h    20     200   0     233
1   10.1.4.3    Et1         13      2w2d     87     522   0     452
0   10.1.4.2    Et1         10      2w2d     85     510   0     3

• Hold: Seconds Remaining Before Declaring Neighbor Down
• Uptime: How Long Since the Last Time Neighbor Was Discovered
• SRTT: How Long It Takes for This Neighbor to Respond to Reliable Packets
• RTO: How Long We’ll Wait Before Retransmitting if No Acknowledgement
• Q Cnt: Outstanding Packets
• Seq Num: Last Reliable Packet Sent
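The neighbor table above lends itself to a quick automated check. Below is a minimal Python sketch that parses the data rows and flags neighbors with outstanding packets (Q Cnt above zero) or a high SRTT. The function names, the nine-column assumption, and the 100 ms SRTT threshold are illustrative choices, not values from the session.

```python
def parse_eigrp_neighbors(text):
    """Parse data rows of 'show ip eigrp neighbors' output into dicts.

    Assumes the nine-column layout shown on the slide.
    """
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have 9 columns and start with the numeric handle (H).
        if len(parts) == 9 and parts[0].isdigit():
            rows.append({
                "address": parts[1],
                "interface": parts[2],
                "hold_s": int(parts[3]),
                "uptime": parts[4],
                "srtt_ms": int(parts[5]),
                "rto_ms": int(parts[6]),
                "q_cnt": int(parts[7]),
                "seq": int(parts[8]),
            })
    return rows

def suspects(rows, max_srtt_ms=100):
    """Neighbors with queued reliable packets or slow responses.

    The default threshold is an arbitrary example value.
    """
    return [r["address"] for r in rows
            if r["q_cnt"] > 0 or r["srtt_ms"] > max_srtt_ms]

SAMPLE = """\
H   Address     Interface   Hold    Uptime   SRTT   RTO   Q     Seq
                            (sec)            (ms)         Cnt   Num
2   10.1.1.1    Et0         12      6d16h    20     200   0     233
1   10.1.4.3    Et1         13      2w2d     87     522   0     452
0   10.1.4.2    Et1         10      2w2d     85     510   0     3
"""
rows = parse_eigrp_neighbors(SAMPLE)
flagged = suspects(rows)
```

A sensible threshold depends on the baseline for each link, so comparing against previous working states is more useful than a fixed number.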

Neighbors: Log-Neighbor-Changes Messages

• So this tells us why the neighbor is bouncing, but what do they mean?

• e.g. peer restarted means you have to ask the peer; he’s the one that restarted the session

Neighbor 10.1.1.1 (Ethernet0) is down: peer restarted
Neighbor 10.1.1.1 (Ethernet0) is up: new adjacency
Neighbor 10.1.1.1 (Ethernet0) is down: holding time expired
Neighbor 10.1.1.1 (Ethernet0) is down: retry limit exceeded
Others, but not often

BGP Monitoring Protocol (BMP) Overview: Collecting Pre-Policy BGP Messages

[Figure: a BMP client (router) receives iBGP updates from an internal BGP peer and eBGP updates from external BGP peers into Adj-RIB-in (pre-inbound-filter). After inbound filtering/policing, routes enter Loc-RIB (post-inbound-filter). The client sends BGP Monitor Protocol updates (BMP messages) reflecting Adj-RIB-in to the BMP collector.]

BGP Monitoring Protocol

• IETF draft-ietf-grow-bmp

• BMP client (router) provides pre-policy view of the ADJ-RIB-IN of a peer

• Update messages from peer sent to BMP receiver

• Example uses:
  • Realtime visualizer of BGP state
  • Traffic engineering analytics
  • BGP policy exploration

OpenBMP http://www.openbmp.org

Historical record of prefix withdraws

Current route views and peer status

Data Plane

3Ws: When, where, and what

User / Agent Checks

• Treat network as a black box: are your synthetic tests working?
  • Synthetic service check (HTTP, DNS, etc.)
  • Ping (not all remotes will respond)

• Data plane is exercised and tested
  • Variety = better coverage (multiple IP addresses / L4 ports per location)
  • Validate similar treatment (QoS) as real user traffic

• Uptime and performance (loss, latency) metrics

• Look for patterns, changes from normal. All down vs some down.

• Capture and validate real user (human) incidents. What got missed?

• Use wisely: network and server resources consumed
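The "all down vs some down" distinction above can be made mechanical once synthetic check results are collected per target. A minimal Python sketch follows; the function name, the result format (target name mapped to pass/fail), and the target names are assumptions for illustration.

```python
def summarize(results):
    """Classify a round of synthetic checks.

    results: dict mapping target name -> True (check passed) / False (failed).
    Returns (pattern, failed_targets).
    """
    failed = sorted(t for t, ok in results.items() if not ok)
    if not failed:
        return "all up", failed
    if len(failed) == len(results):
        return "all down", failed
    return "some down", failed

# Hypothetical targets and results for one polling round.
round_results = {
    "dns-site-a": True,
    "http-site-a": True,
    "http-site-b": False,
}
pattern, failed = summarize(round_results)
```

"All down" tends to point at something close to the probe or shared by every path, while "some down" narrows the search to what the failing targets have in common.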

[Figure: hosts A and B connected through routers R1, R2, R3, R5, R6]

IPSLA and Relatives

• IPSLA on router/switch – makes use of deployed network infra
  • May not be true check of data plane (shadow router)
  • Resource contention (CPU) – group scheduling
  • Simplistic service checks

• User end-system based agent software (Nagios agents…)
  • Uses host stack (OS, browser) on PC
  • End to end (could include WiFi)
  • Includes end system resource view
  • BYOD deployment challenges

• Dedicated Agent (Cisco NAM, RIPE Atlas probes…)
  • Mixture of benefits from end-system and network
  • Matching real user end-system stack can be challenging

IP SLA: Synthetic Traffic Measurements

• Uses: Availability, Network Performance Monitoring, VoIP Monitoring, Service Level Agreement (SLA) Monitoring, Network Assessment, Multiprotocol Label Switching (MPLS) Monitoring, Trouble Shooting
• Measurement Metrics: Latency, Packet Loss, Network Jitter, Dist. of Stats, Connectivity
• Operations: Jitter, FTP, DNS, DHCP, DLSW, ICMP, UDP, TCP, HTTP, LDP, H.323, SIP, RTP

[Figure: an IP SLA source (Cisco IOS Software) sends active generated traffic to measure the network toward a destination or an IP SLA responder (Cisco IOS Software); results are available as MIB data]

Reference: IPSLA Multicast Support

• IPSLA Multicast
  • One Way Delay (NTP req)
  • One Way Jitter
  • Packet Loss

• Configuration is on IP SLA Sender
  • Have to specify each responder explicitly in endpoint-list

• Responder becomes mcast receiver, IGMPv3 (G) and (S,G) behavior

• ISRG2, ISR4451X, ASR1k, CSR1000v, cat4k(sup7/6), c7600

[Figure: unicast control and multicast traffic between the sender and the responders]

SLAsender(config)#ip sla endpoint-list type ip mylist
 ip-address 172.16.1.2,172.17.1.2 port 3800
SLAsender(config)#ip sla 1
 udp-jitter 224.1.1.1 4000 endpoint-list mylist source-ip 172.16.1.1 source-port 4500 num-packets 100 interval 25

iperf3

• Active measurement tool to discover available path capacity
  • worst link and worst host configurations

• Test can be in either direction (only static NAT works)

• TCP (retransmissions, rate, cwnd), SCTP and UDP (loss, jitter, out of order) tests

[Figure: sender and receiver; TCP/5201; test traffic: TCP, SCTP, UDP]

bwctl

• bwctl client coordinates active measurement tests
  • Authentication – IP subnets, AES key/username
  • Scheduling/reserving
  • Result gathering – gathered from both server and client systems
  • Does not have to be on bwctl server (3rd party)

• bwctl server hosts the test resources (iperf3, ping, traceroute/path, owamp)

• Allows for multi-admin domain (along path) active tests

• bwtraceroute: wrapper for bwctl and traceroute

• bwctl distributed with Ubuntu, may need to be installed (yum, apt-get, compiled) for other UNIXes

$ bwtraceroute -s 205.186.62.54
bwtraceroute: Using tool: traceroute
bwtraceroute: 17 seconds until test results available
SENDER START
traceroute to 152.22.242.103 (152.22.242.103), 30 hops max, 60 byte packets
 1 205-186-62-53.generic.c-light.net (205.186.62.53) 0.104 ms 0.098 ms 0.102 ms
 2 xe-1-1-1-816-t01-sox.culr.net (205.186.63.2) 2.932 ms 2.934 ms 2.929 ms
…
 9 152.22.242.103 (152.22.242.103) 12.188 ms 12.180 ms 12.144 ms
SENDER END

Callout: Local machine

Iperf3 examples

$ bwctl -T iperf3 -t 30 -O 4 -s "56m-ps-4x10.sox.net:4823"
bwctl: Using tool: iperf3
bwctl: 40 seconds until test results available
SENDER START
Connecting to host 152.22.242.103, port 5160
[ 15] local 143.215.194.123 port 45609 connected to 152.22.242.103 port 5160
[ ID] Interval         Transfer      Bandwidth       Retr  Cwnd
[ 15] 0.00-1.00 sec    107 MBytes    898 Mbits/sec   0     3.06 MBytes (omitted)
[ 15] 1.00-2.00 sec    112 MBytes    944 Mbits/sec   0     3.06 MBytes (omitted)
…
[ 15] 29.00-30.00 sec  112 MBytes    944 Mbits/sec   0     3.06 MBytes
------
[ ID] Interval         Transfer      Bandwidth       Retr
[ 15] 0.00-30.00 sec   3.29 GBytes   942 Mbits/sec   0     sender
[ 15] 0.00-30.00 sec   3.29 GBytes   943 Mbits/sec         receiver
iperf Done.
SENDER END

$ bwctl -T iperf3 -t 30 -O 4 -c "56m-ps-4x10.sox.net:4823"
bwctl: Using tool: iperf3
bwctl: 39 seconds until test results available
SENDER START
Connecting to host 143.215.194.123, port 5327
[ 15] local 152.22.242.103 port 44855 connected to 143.215.194.123 port 5327
[ ID] Interval         Transfer      Bandwidth       Retr  Cwnd
[ 15] 0.00-1.00 sec    5.14 MBytes   43.1 Mbits/sec  411   25.5 KBytes (omitted)
[ 15] 1.00-2.00 sec    2.26 MBytes   19.0 Mbits/sec  15    19.8 KBytes (omitted)
…
[ 15] 28.00-29.00 sec  2.26 MBytes   18.9 Mbits/sec  16    25.5 KBytes
------
[ ID] Interval         Transfer      Bandwidth       Retr
[ 15] 0.00-30.00 sec   59.8 MBytes   16.7 Mbits/sec  539   sender
[ 15] 0.00-30.00 sec   60.7 MBytes   17.0 Mbits/sec        receiver
iperf Done.

Callouts on the slide:
• Client to server (local to remote)
• Throw away stats from first 4 sec
• Run for 30 sec
• Use -P for parallel streams
• ~940 mbps (remote to local)
• ~19mbps (local to remote)
• retransmissions

SENDER END

netperf

> netperf -t TCP_STREAM -H 162.209.79.211 -i 30,10 -I 95,5 -j -l 60
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 162.209.79.211 () port 0 AF_INET : +/-2.500% @ 95% conf. : demo
!!! WARNING
!!! Desired confidence was not achieved within the specified iterations.
!!! This implies that there was variability in the test environment that
!!! must be investigated before going further.
!!! Confidence intervals: Throughput      : 8.965%
!!!                       Local CPU util  : 0.000%
!!!                       Remote CPU util : 0.000%

Recv     Send     Send
Socket   Socket   Message   Elapsed
Size     Size     Size      Time      Throughput
bytes    bytes    bytes     secs.     10^6bits/sec

87380    16384    16384     60.52     13.91

• Similar to iperf3 but:
  • Works bidirectionally in a NAT environment
  • Additional connection/per second and transaction/per second tests
  • Statistical confidence intervals (-I)

owamp

• One way delay/jitter to/from end systems
• Checks for loss, order
• NTP needed (check is done)

> owping -c 10000 -i 0.01 2hd32g-2.cenic.org:861
Approximately 103.5 seconds until results available

--- owping statistics from [152.22.242.103]:9525 to [2hd32g-2.cenic.org]:9105 ---
SID: 89a41e75da6e5be4ad003a66630c3668
first: 2016-02-16T21:39:34.059
last: 2016-02-16T21:41:13.152
10000 sent, 1 lost (0.010%), 0 duplicates
one-way delay min/median/max = 52.7/54.5/58.5 ms, (err=1.6 ms)
one-way jitter = 1.3 ms (P95-P50)
Hops = 10 (consistently)
no reordering

--- owping statistics from [2hd32g-2.cenic.org]:9207 to [152.22.242.103]:9111 ---
SID: 9816f267da6e5be4b0980a5547a7e2f0
first: 2016-02-16T21:39:34.046
last: 2016-02-16T21:41:13.438
10000 sent, 0 lost (0.000%), 0 duplicates
one-way delay min/median/max = 10.2/11.9/16 ms, (err=1.6 ms)
one-way jitter = 1.4 ms (P95-P50)
Hops = 10 (consistently)
no reordering

Callout: 55ms (to) vs 12ms (from)
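The owping output reports one-way jitter as P95-P50, the gap between the 95th and 50th percentile of one-way delay. The Python sketch below shows how such a summary could be derived from per-packet send and receive timestamps. The function names, the input format, and the nearest-rank percentile method are assumptions; owamp's own percentile calculation may differ.

```python
import math

def percentile(sorted_vals, p):
    """Nearest-rank percentile (p in 0..100) of an already sorted list."""
    rank = max(1, math.ceil(p / 100 * len(sorted_vals)))
    return sorted_vals[rank - 1]

def one_way_summary(send_ts, recv_ts):
    """Summarize one direction of a test.

    send_ts / recv_ts: dicts mapping sequence number -> timestamp in ms,
    taken on clocks assumed to be synchronized (hence the NTP requirement).
    Packets missing from recv_ts count as lost.
    """
    delays = sorted(recv_ts[s] - send_ts[s] for s in send_ts if s in recv_ts)
    return {
        "sent": len(send_ts),
        "lost": len(send_ts) - len(delays),
        "min": delays[0],
        "median": percentile(delays, 50),
        "max": delays[-1],
        "jitter_p95_p50": percentile(delays, 95) - percentile(delays, 50),
    }

# Synthetic data: 20 packets sent 10 ms apart, delay grows 50..69 ms,
# and packet 7 never arrives.
send = {i: i * 10 for i in range(20)}
recv = {i: i * 10 + 50 + i for i in range(20) if i != 7}
summary = one_way_summary(send, recv)
```

Running the same summary for each direction separately is what exposes asymmetry such as the 55 ms versus 12 ms result on the slide.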

perfsonar

• Scheduling, execution and visualization for various tests across servers

• Registry of public servers

Diagnostic Tools Hosting Platforms Along the Path

• IOS XR: Support RPM package installation directly to the system.

• Nexus OS: Support for 3rd party LXC containers. Support for Guest Shell LXC. Future support for Docker containers.

• IOS XE: Open to any 3rd party or custom KVM application on routing platforms. Ultimate flexibility with UCS-E module.

bwctl/owping/iperf3

ISR 4400 Series Service Containers

• Data Plane to Container: 200Mbps

• Container to Data Plane: 1Gbps

• Service container is on independent CPU

• Router features (QoS, NetFlow, etc.) are applied to container traffic

traceroute

• Understand the limitations against possibilities
• Sends 3 packets (default) at each TTL
• Implementations
  • Linux/Cisco: UDP (ICMP and TCP-SYN are optional on Linux)
    • UDP DST port # used to keep track of packets, increments per packet. Initial = 33434 (default)
    • SRC port #: randomized (Linux), incrementing per packet (IOS)
    • Callout: Widest dispersion. Difficult to understand though.
  • Linux (GNU inetutils-traceroute)
    • UDP DST port # increments per TTL (not per packet)
    • SRC port is random but fixed per entire run
    • Callout: Narrower dispersion. Story might be misleading.
  • Windows: ICMP Echo request. ICMP blocked frequently :-(
• IOS ICMP responses limited to 1 per 500ms
  • Configurable via: ip icmp rate-limit unreachable

Callout: Internet: aka the TCP/80 network
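The two UDP destination-port behaviors described above can be compared side by side. This Python sketch only enumerates the port each probe would use under each scheme; it sends no packets. The function name and the assumption that both schemes start at 33434 for TTL 1 are illustrative.

```python
BASE_PORT = 33434  # default initial UDP destination port, per the slide

def probe_ports(max_ttl, probes_per_ttl=3, per_packet=True):
    """Yield (ttl, probe_index, dst_port) for each traceroute probe.

    per_packet=True : destination port increments with every probe
    per_packet=False: destination port increments once per TTL
    """
    for ttl in range(1, max_ttl + 1):
        for probe in range(probes_per_ttl):
            if per_packet:
                offset = (ttl - 1) * probes_per_ttl + probe
            else:
                offset = ttl - 1
            yield ttl, probe, BASE_PORT + offset

per_packet_ports = [port for _, _, port in probe_ports(2, per_packet=True)]
per_ttl_ports = [port for _, _, port in probe_ports(2, per_packet=False)]
```

With per-packet ports, the three probes of one TTL differ in a field that ECMP hashing may use, so they can land on different paths; with per-TTL ports they share one hash input per hop, which gives the narrower view.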

Reference: Unix traceroute

• Multiple path options
• Topology ‘shortcuts’ (same router seen at diff hop)
• Ultimately all paths result in similar e2e delay

Router seen by each of the three probes, per hop:
 1 AAA
 2 BBB
 3 CCC
 4 DDD
 5 EEE
 6 FGF
 7 HII
 8 JKK  +10ms (unsustained)
 9 JLJ
10 LLM  +120ms (sustained)
11 NNM
12 NNO
13 PPP
14 QQQ
15 ***
16 RRR  ~268ms (all three)

$ traceroute 62.2.88.172
traceroute to 62.2.88.172 (62.2.88.172), 30 hops max, 60 byte packets
 1 152.22.242.65 (152.22.242.65) 1.044 ms 1.371 ms 1.585 ms
 2 152.22.240.8 (152.22.240.8) 0.219 ms 0.328 ms 0.327 ms
 3 128.109.70.9 (128.109.70.9) 1.066 ms 1.059 ms 1.168 ms
 4 rtp7600-gw-to-dep7600-gw2.ncren.net (128.109.70.137) 1.634 ms 1.628 ms 1.736 ms
 5 rlasr-gw-link1-to-rtp7600-gw.ncren.net (128.109.9.17) 5.354 ms 5.446 ms 5.557 ms
 6 128.109.9.117 (128.109.9.117) 5.671 ms 128.109.9.170 (128.109.9.170) 7.141 ms 128.109.9.117 (128.109.9.117) 5.433 ms
 7 wscrs-gw-to-ws-a1a-ip-asr-gw-sec.ncren.net (128.109.1.105) 9.174 ms 128.109.1.209 (128.109.1.209) 8.256 ms 6.397 ms
 8 dcp-brdr-03.inet.qwest.net (205.171.251.110) 18.414 ms chr-edge-03.inet.qwest.net (65.114.0.205) 27.353 ms 27.438 ms
 9 dcp-brdr-03.inet.qwest.net (205.171.251.110) 21.739 ms 63-235-40-106.dia.static.qwest.net (63.235.40.106) 17.750 ms dcp-brdr-03.inet.qwest.net (205.171.251.110) 22.450 ms
10 63-235-40-106.dia.static.qwest.net (63.235.40.106) 22.531 ms 22.516 ms 84-116-130-173.aorta.net (84.116.130.173) 140.738 ms
11 nl-ams02a-rd1-te0-2-0-2.aorta.net (84.116.130.65) 140.831 ms 140.816 ms 84-116-130-173.aorta.net (84.116.130.173) 144.819 ms
12 nl-ams02a-rd1-te0-2-0-2.aorta.net (84.116.130.65) 144.074 ms 144.761 ms 84-116-130-58.aorta.net (84.116.130.58) 138.455 ms
13 84-116-130-58.aorta.net (84.116.130.58) 141.844 ms 141.924 ms 142.459 ms
14 84.116.204.234 (84.116.204.234) 145.603 ms 145.891 ms 145.987 ms
15 * * *
16 62-2-88-172.static.cablecom.ch (62.2.88.172) 268.281 ms 268.245 ms 268.176 ms

Callouts on the output: Multiple paths; +120ms Atlantic crossing; filter; + > 100 ms delay

Reference: Unix inetutils traceroute

• Narrower view (no alternate paths directly seen)

• Repeating nodes suggests multipath, or (unlikely) routing issue

$ inetutils-traceroute --resolve-hostname 62.2.88.172
traceroute to 62.2.88.172 (62.2.88.172), 64 hops max
 1 152.22.242.65 (152.22.242.65) 0.783ms 0.727ms 0.798ms
 2 152.22.240.8 (152.22.240.8) 0.226ms 0.228ms 0.221ms
 3 128.109.70.9 (128.109.70.9) 0.967ms 0.980ms 0.962ms
 4 128.109.70.137 (rtp7600-gw-to-dep7600-gw2.ncren.net) 1.576ms 1.598ms 1.567ms
 5 128.109.9.17 (rlasr-gw-link1-to-rtp7600-gw.ncren.net) 5.149ms 5.140ms 5.126ms
 6 128.109.9.166 (128.109.9.166) 7.113ms 7.098ms 7.306ms
 7 128.109.1.209 (128.109.1.209) 7.835ms 8.326ms 7.958ms
 8 65.114.0.205 (chr-edge-03.inet.qwest.net) 19.944ms 9.299ms 40.372ms
 9 63.235.40.106 (63-235-40-106.dia.static.qwest.net) 18.442ms 18.412ms 18.432ms
10 63.235.40.106 (63-235-40-106.dia.static.qwest.net) 22.424ms 22.391ms 75.960ms
11 84.116.130.173 (84-116-130-173.aorta.net) 145.434ms 146.301ms 145.445ms
12 84.116.130.58 (84-116-130-58.aorta.net) 137.583ms 137.556ms 137.661ms
13 84.116.130.58 (84-116-130-58.aorta.net) 142.476ms 141.886ms 141.819ms
14 84.116.204.234 (84.116.204.234) 144.841ms 145.034ms 144.964ms
15 * * *
16 62.2.88.172 (62-2-88-172.static.cablecom.ch) 287.318ms 176.670ms 254.237ms

Callout: Packets for hop 9,12 took a ‘shortcut’ and packets for hop 10,13 went long way

Reference: LFT

• lft ‘layer 4 traceroute’ dynamically adjusts to responses

• Firewall detection, whois and AS lookup integrated

• Narrower packet changes, so narrower multi-path

$ sudo lft -ENA 62.2.88.172
Tracing ______.
TTL LFT trace to 62-2-88-172.static.cablecom.ch (62.2.88.172):80/tcp
 1 [AS81] [NCREN-B22] 152.22.242.65 20.1/17.2ms
 2 [AS81] [NCREN-B22] 152.22.240.8 20.1/20.1ms
 3 [AS81] [CONCERT] 128.109.70.9 20.1/20.1ms
 4 [AS81] [CONCERT] rtp7600-gw-to-dep7600-gw2.ncren.net (128.109.70.137) 20.1/20.1ms
 5 [AS81] [CONCERT] rlasr-gw-link1-to-rtp7600-gw.ncren.net (128.109.9.17) 20.1/20.1ms
 6 [AS81] [CONCERT] 128.109.9.117 20.1/20.1ms
 7 [AS209] [unknown] chr-edge-03.inet.qwest.net (65.121.156.209) 20.1/19.5ms
 8 [AS209] [QWEST-INET-35] dcp-brdr-03.inet.qwest.net (205.171.251.110) 20.1/18.4ms
 9 [AS209] [QWEST-INET-17] 63-235-40-106.dia.static.qwest.net (63.235.40.106) 20.1/60.3ms
10 [AS6830] [84-RIPE/LGI-Infrastructure] 84-116-130-173.aorta.net (84.116.130.173) 160.7/160.7ms
11 [AS6830] [84-RIPE/LGI-Infrastructure] nl-ams02a-rd1-te0-2-0-2.aorta.net (84.116.130.65) 160.7/160.7ms
12 [AS6830] [84-RIPE/LGI-Infrastructure] 84-116-130-58.aorta.net (84.116.130.58) 140.6/140.6ms
** [firewall] the next gateway may statefully inspect packets
13 [AS6830] [84-RIPE/LGI-Infrastructure] 84.116.204.234 160.7/160.6ms
** [neglected] no reply packets received from TTL 14
15 * [AS6830] [RIPE-C3/CC-HO841-NET] [target] 62-2-88-172.static.cablecom.ch (62.2.88.172):80 160.7ms

Callout: Used tcp/80 SYN

Reference: MTR

• Interactive combined traceroute and ping

• Gives a sense of health of path (loss, delay Standard Deviation)

• Narrow path view

aakhter-nlr-ubuntu-01 (0.0.0.0)                          Sat May 30 18:57:09 2015
Keys: Help  Display mode  Restart statistics  Order of fields  quit
                                                Packets          Pings
 Host                                           Loss%  Snt   Last    Avg   Best   Wrst  StDev
 1. 152.22.242.65                                0.0%  145    0.8    0.9    0.7   10.0    0.8
 2. 152.22.240.8                                 0.0%  145    0.3    0.2    0.2    0.3    0.0
 3. 128.109.70.9                                 0.0%  145    1.0    3.3    1.0  182.3   17.2
 4. rtp7600-gw-to-dep7600-gw2.ncren.net          1.0%  145    9.2    4.1    1.6  203.4   18.6
 5. rlasr-gw-link1-to-rtp7600-gw.ncren.net       0.0%  145    5.3    5.3    5.1    6.8    0.2
 6. 128.109.9.166                                0.0%  145    7.1    7.3    7.1   16.1    0.8
 7. wscrs-gw-to-ws-a1a-ip-asr-gw-sec.ncren.net   0.0%  145    6.8    8.3    6.2   10.6    1.0
 8. chr-edge-03.inet.qwest.net                   0.0%  145    9.4   12.3    9.3   62.1    9.5
 9. dcp-brdr-03.inet.qwest.net                   0.0%  145   21.8   22.8   21.7   70.7    5.5
10. 63-235-40-106.dia.static.qwest.net           0.0%  145   21.8   24.5   21.7   86.1   10.6
11. 84-116-130-173.aorta.net                     0.0%  145  144.8  145.0  144.7  152.9    1.0
12. nl-ams02a-rd1-te0-2-0-2.aorta.net            0.0%  145  144.1  145.5  144.0  165.4    3.7
13. 84-116-130-58.aorta.net                      5.0%  144  142.9  142.3  142.0  145.6    0.4
14. 84.116.204.234                               5.0%  144  145.1  145.1  144.9  145.3    0.0
15. 217-168-62-150.static.cablecom.ch            5.0%  144  145.9  146.1  145.2  164.3    1.9
16. 62-2-88-172.static.cablecom.ch               5.0%  144  313.0  260.3  152.6  508.0   80.0

Callouts on the output:
• Hop 4: Just local noise, no carry over to later hops
• Hops 13 to 16: Sustained loss. Likely something wrong 12->13, or way back
• Hop 16: Note variability, probably just the end system
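The reading of the MTR table above (isolated loss at one hop is local noise, loss that persists to the final hop is the real signal) can be expressed as a small rule. Below is a minimal Python sketch using the Loss% column from the slide. The function name and the 0.5% noise floor are assumptions for illustration.

```python
def classify_loss(hops, threshold=0.5):
    """Split lossy hops into isolated noise and sustained loss.

    hops: list of (hop_number, loss_percent), ordered by hop number.
    Returns (noise_hops, sustained_from), where sustained_from is the first
    hop of the run of lossy hops that continues to the final hop, or None.
    The threshold is an arbitrary example value.
    """
    start = None
    for i in range(len(hops) - 1, -1, -1):
        if hops[i][1] > threshold:
            start = i
        else:
            break
    noise = [hop for i, (hop, loss) in enumerate(hops)
             if loss > threshold and (start is None or i < start)]
    sustained_from = hops[start][0] if start is not None else None
    return noise, sustained_from

# Loss% per hop, copied from the MTR output on the slide.
loss_by_hop = [(1, 0.0), (2, 0.0), (3, 0.0), (4, 1.0), (5, 0.0), (6, 0.0),
               (7, 0.0), (8, 0.0), (9, 0.0), (10, 0.0), (11, 0.0), (12, 0.0),
               (13, 5.0), (14, 5.0), (15, 5.0), (16, 5.0)]
noise, sustained_from = classify_loss(loss_by_hop)
```

As the slide notes, sustained loss starting at hop 13 points at the 12->13 segment or at the return path, so the result is a place to start looking rather than a verdict.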

Show interface

• Classic command
• Check ‘up’ status
• Stability: log event or ‘show ip route’
• Monitor in/out bit/packet changes

# show interface
GigabitEthernet1 is up, line protocol is up
  Hardware is CSR vNIC, address is 000c.291a.7f97 (bia 000c.291a.7f97)
  Internet address is 192.168.225.130/24
  MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec,
     reliability 255/255, txload 1/255, rxload 1/255
  Encapsulation ARPA, loopback not set
  Keepalive set (10 sec)
  Full Duplex, 1000Mbps, link type is auto, media type is RJ45
  output flow-control is unsupported, input flow-control is unsupported
  ARP type: ARPA, ARP Timeout 04:00:00
  Last input 00:05:35, output 00:09:58, output hang never
  Last clearing of "show interface" counters never
  Input queue: 0/375/0/0 (size/max/drops/flushes); Total output drops: 0
  Queueing strategy: fifo
  Output queue: 0/40 (size/max)
  5 minute input rate 0 bits/sec, 0 packets/sec
  5 minute output rate 0 bits/sec, 0 packets/sec
     25349 packets input, 2381158 bytes, 0 no buffer
     Received 0 broadcasts (0 IP multicasts)
     0 runts, 0 giants, 0 throttles
     0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
     0 watchdog, 0 multicast, 0 pause input
     3958 packets output, 312408 bytes, 0 underruns
     0 output errors, 0 collisions, 0 interface resets
     56 unknown protocol drops
     0 babbles, 0 late collision, 0 deferred
     0 lost carrier, 0 no carrier, 0 pause output
     0 output buffer failures, 0 output buffers swapped out

snmp ifmib ifindex persist
snmp ifmib trap throttle
interface
 [no] logging event link-status
 [no] snmp trap link-status
 load-interval 30

Follow the Flow with NetFlow

• Per-Node: Data plane observations and decisions captured
  • Src/dst mac/IP/port#s, DSCP values, in/out interfaces, etc.

• Network view: flows centrally analyzed- NetFlow collector/analyzer

• Biggest value: strategically placed partial views (eg WAN edge)

[Figure: routers R1 through R6 on the paths between hosts A and B export flow records to a NetFlow Collector (LiveAction screenshot)]

NetFlow: What Is It?

• Developed and patented at Cisco Systems in 1996

• NetFlow is the de facto standard for acquiring IP operational data

• Standardized in IETF via IPFIX

• Provides network and security monitoring, network planning, traffic analysis, and IP accounting

• Packet capture is like a wire tap

• NetFlow is like a phone bill

Network World Article: NetFlow Adoption on the Rise
http://www.networkworld.com/newsletters/nsm/2005/0314nsm1.html

Flexible NetFlow: Multiple Monitors with Unique Key Fields

Flow Monitor 1 (Traffic Analysis)
  Key Fields            Packet 1
  Source IP             3.3.3.3
  Destination IP        2.2.2.2
  Source Port           23
  Destination Port      22078
  Layer 3 Protocol      TCP - 6
  TOS Byte              0
  Input Interface       Ethernet 0
  Non-Key Fields: Packets, Bytes, Timestamps, Next Hop Address

Flow Monitor 2 (Security Analysis)
  Key Fields            Packet 1
  Source IP             3.3.3.3
  Dest IP               2.2.2.2
  Input Interface       Ethernet 0
  SYN Flag              0
  Non-Key Fields: Packets, Timestamps

Traffic Analysis Cache
  Src. IP   Dest. IP  Source Port  Dest. Port  Protocol  TOS  Input I/F  …  Pkts
  3.3.3.3   2.2.2.2   23           22078       6         0    E0         …  1100

Security Analysis Cache
  Source IP  Dest. IP  Input I/F  Flag  …  Pkts
  3.3.3.3    2.2.2.2   E0         0     …  11000
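The effect of choosing different key fields can be sketched with a toy flow cache. This is an illustration only: the `FlowMonitor` class and the packet dictionary layout are hypothetical and do not model any router implementation. Each monitor keeps one cache entry per unique combination of its key-field values and accumulates non-key counters in that entry.

```python
from collections import defaultdict


class FlowMonitor:
    """Toy flow cache: one entry per unique combination of key-field values."""

    def __init__(self, key_fields):
        self.key_fields = tuple(key_fields)
        self.cache = defaultdict(lambda: {"packets": 0, "bytes": 0})

    def observe(self, packet):
        # The key is built only from this monitor's key fields.
        key = tuple(packet[name] for name in self.key_fields)
        entry = self.cache[key]
        entry["packets"] += 1
        entry["bytes"] += packet["length"]


traffic = FlowMonitor(
    ["src_ip", "dst_ip", "src_port", "dst_port", "protocol", "tos", "input_if"]
)
security = FlowMonitor(["src_ip", "dst_ip", "input_if", "syn_flag"])

base = {
    "src_ip": "3.3.3.3", "dst_ip": "2.2.2.2", "src_port": 23,
    "protocol": 6, "tos": 0, "input_if": "Ethernet0", "syn_flag": 0,
    "length": 100,
}
# Two packets that differ only in destination port.
packets = [dict(base, dst_port=22078), dict(base, dst_port=22079)]

for packet in packets:
    traffic.observe(packet)
    security.observe(packet)

print(len(traffic.cache), len(security.cache))
```

The traffic-analysis monitor sees two flows because the destination port is one of its key fields, while the security monitor aggregates both packets into a single entry. This is why the same traffic can produce different packet counts in the two caches.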

NetFlow Forwarding Status & Drop Count Fields

• Flexible NetFlow Forwarding Status field captures forwarding (and drop reason) for flow.

• Drop Count increments on any explicit drop by router

Network Performance Monitor

• Network nodes are able to discover & validate RTP, TCP and IP-CBR traffic on a hop-by-hop basis

• À la carte metric (loss, latency, jitter etc.) selections, applied on operator selected sets of traffic

• Allows for fault isolation and network span validation

• Per-application thresholds and alerting

Performance Monitor Information Elements

Media Monitoring
• RTP SSRC
• RTP Jitter (min/max/mean)
• Transport Counter (expected/loss)
• Media Counter (bytes/packets/rate)
• Media Event
• Collection interval
• TCP MSS
• TCP round-trip time

Application Response Time
• CND – Client Network Delay (min/max/sum)
• SND – Server Network Delay (min/max/sum)
• ND – Network Delay (min/max/sum)
• AD – Application Delay (min/max/sum)
• Total Response Time (min/max/sum)
• Total Transaction Time (min/max/sum)
• Number of New Connections
• Number of Late Responses
• Number of Responses by Response Time (7-bucket histogram)
• Number of Retransmissions
• Number of Transactions
• Client/Server Bytes
• Client/Server Packets

Other Metrics
• L3 counter (bytes/packets)
• Flow event
• Flow direction
• Client and server address
• Source and destination address
• Transport information
• Input and output interfaces
• L3 information (TTL, DSCP, TOS, etc.)
• Application information (from NBAR2)
• Monitoring class hierarchy

NetFlow QoS Analysis

How is my flow being classified? Did this QoS class drop traffic?

[Figure: a flow's 5-tuple passes through DPI/NBAR classification and QoS processing (DSCP); screenshots from Cisco Prime Infra and LiveAction]

NetFlow QoS

• QoS queue performance (drops)
• QoS class structure: class-map and policy-map names

Flow exporter:
 option c3pl-class-table timeout
 option c3pl-policy-table timeout

QoS Queue performance:
flow record type performance-monitor qos-record
 match policy qos queue index
 collect policy qos queue drops
(or)
flow record qos-record
 match policy qos queue index
 collect policy qos queue drops

Flow to QoS Association:
flow record type performance-monitor A
 match connection client ipv4 address
 match connection server ipv4 address
 match connection server transport port
 collect policy qos class hierarchy
 collect policy qos queue id
 …
(or)
flow record qos-class-record
 match ipv4 source address
 match ipv4 destination address
 collect policy qos classification hierarchy
 collect policy qos queue index
 …

Enhanced NetFlow CLI Example

platform qos performance-monitor
!
flow record qos-class-record
 match routing forwarding-status
 match ipv4 dscp
 match ipv4 source address
 match ipv4 destination address
 match interface input
 match interface output
 match flow direction
 collect policy qos classification hierarchy
 collect policy qos queue index
!
flow monitor qos-flow-monitor
 record qos-class-record
!
interface GigabitEthernet1
 ip flow monitor qos-flow-monitor input
!
interface GigabitEthernet2
 ip flow monitor qos-flow-monitor output
 service-policy output WAN-EDGE-4-CLASS

R1#show flow monitor qos-flow-monitor cache

(0x30 = CS6: in ‘control’ class)
IP FORWARDING STATUS:          Forward
IPV4 SOURCE ADDRESS:           192.168.32.128
IPV4 DESTINATION ADDRESS:      224.0.0.5
INTERFACE INPUT:               Null
INTERFACE OUTPUT:              Gi2
FLOW DIRECTION:                Output
IP DSCP:                       0x30
policy qos class hierarchy:    WAN-EDGE-4-CLASS: CONTROL
policy qos queue index:        1073741827

(My VTY session)
IP FORWARDING STATUS:          Consume
IPV4 SOURCE ADDRESS:           192.168.225.128
IPV4 DESTINATION ADDRESS:      192.168.225.130
INTERFACE INPUT:               Gi1
INTERFACE OUTPUT:              Null
FLOW DIRECTION:                Input
IP DSCP:                       0x04
policy qos class hierarchy:    WAN-EDGE-4-CLASS: class-default
policy qos queue index:        0

(Data traffic)
IP FORWARDING STATUS:          Forward
IPV4 SOURCE ADDRESS:           192.168.225.128
IPV4 DESTINATION ADDRESS:      5.5.5.5
INTERFACE INPUT:               Gi1
INTERFACE OUTPUT:              Gi2
FLOW DIRECTION:                Output
IP DSCP:                       0x00
policy qos class hierarchy:    WAN-EDGE-4-CLASS: class-default
policy qos queue index:        1073741829
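The `IP DSCP` values in the cache output are hexadecimal DSCP code points. A small helper, sketched below, turns them into the familiar per-hop-behavior names so that 0x30 reads as CS6. The function name `dscp_name` is hypothetical; the mapping follows the standard class selector, assured forwarding, and expedited forwarding code points.

```python
def dscp_name(dscp):
    """Map a 6-bit DSCP value to its common name (CSn, AFxy, EF) or its decimal value."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit value (0-63)")
    if dscp == 46:
        return "EF"
    if dscp % 8 == 0:
        # Class selectors: the low three bits are zero.
        return f"CS{dscp // 8}"
    traffic_class = dscp >> 3
    drop_precedence = (dscp & 0b110) >> 1
    if 1 <= traffic_class <= 4 and dscp & 1 == 0 and 1 <= drop_precedence <= 3:
        return f"AF{traffic_class}{drop_precedence}"
    return str(dscp)


# The three DSCP values from the sample cache output.
for value in (0x30, 0x04, 0x00):
    print(hex(value), dscp_name(value))
```

Values with no standard name, such as 0x04 on the VTY session flow, are returned as plain decimal numbers rather than guessed at.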

CBQoS MIB

• IOS QoS collects vital information regarding health of QoS classes
  • Pre and post bytes, drops, etc.
• Same class names from different routers can be compared
• For flow-level analysis, use NetFlow QoS reporting
• ‘snmp mib persist CBQoS’ (IOS 12.4(4)T)

Dedicated Protocol Analyzers

• Wireshark, Cisco NAM and other protocol analyzers are great
  • Detailed analysis for a variety of protocols at a deep level

• Dedicated probes are expensive to deploy pervasively
  • Operator has to make difficult judgment calls on where the problem is going to be, before it happens

• Can be challenging after the fact: need on-site trained personnel

Embedded Packet Capture & Analyze

• Capture packets locally to buffer on router
• Store to flash, USB, FTP, TFTP for analysis in protocol analyzer
• IOS XE Cat 4k Sup 7E & Sup 7L-E (XE 3.3.0 SG) include built-in Wireshark decode capability
• Capture does not add traffic to network

LY-2851-8#monitor capture buffer pcap-buffer1 size 10000 max-size 1550
LY-2851-8#monitor capture point ip cef pcap-point1 g0/0 both
LY-2851-8#monitor capture point associate pcap-point1 pcap-buffer1
LY-2851-8#monitor capture point start pcap-point1
LY-2851-8#monitor capture point stop pcap-point1
LY-2851-8#monitor capture buffer pcap-buffer1 export ftp://10.17.0.252/images/test.cap


DNA Assurance and Analytics: Converting Data to Business & IT Insights

• Visibility: Learn from the network and clients attached to it
• Automate: Recognize changes and inform the self-driving network
• Insights: See problems before your end users do
• Predictive Performance: Understand how new services will impact service levels
• Proactive Troubleshooting: Find root cause faster with granular details

Industry’s First Self-Predicting Network Analytics Platform

DNA Assurance is part of DNA Center

Automation: planning, installation and migration
• Design: Global settings; Site profiles; DDI, SWIM, PNP; User access
• Provision: Fabric domains; Device on-boarding; Device inventory; Host on-boarding
• Policy: Virtual networks; ISE, AAA, Radius; Access control; Application control

Assurance: proactive and predictive network, client and application assurance
• Issues and trends
• Performance
• Proactive troubleshooting

End-to-End Visibility and Insights

• End user
• Client on-boarding and connectivity
• Network health and status
• Application visibility and performance

[Figure: mobile clients, APs, local WLCs, office site, WAN, network services (DHCP, CUCM), DC and cloud apps]

DNA Center Data Analytics Architecture

Data collection and ingestion
• Network telemetry: Router, Switch, WLC, Sensor (SNMP, NetFlow, Syslog, Streaming telemetry, ...)
• Contextual data: ISE, AAA, Topology, Location, PxGrid, DNS, DHCP, Inventory, Policy, IPAM

Data correlation and analysis (Analytics Engine)
• Collector and analytics pipeline SDK
• Complex correlation
• Metadata extraction
• Stream processing
• Time series analysis

Data visualization and action
• Network assurance
• Data models and RESTful APIs
• System management portal

Network Data Platform

Insights: Wireless Use Cases

Client Onboarding
• Association failures
• Authentication failures
• IP address failure
• Client exclusion
• Excessive on-boarding time
• Excessive authentication time
• Excessive IP addressing time
• AAA, DHCP reachability

Client Experience
• Throughput analysis
• Roaming pattern analysis
• Sticky client
• Slow roaming
• Excessive roaming
• RF, roaming pattern
• Dual band clients prefer 2.4GHz
• Excessive interference

Network Coverage & Capacity
• Coverage hole
• AP license utilization
• Client capacity
• Radio utilization

Network Device Monitoring
• Availability
• Crash, AP join failure
• High availability
• CPU, memory
• Flapping AP, hung radio
• Power supply failures

Application Performance
• Sensor tests:
  • Web: HTTP & HTTPS
  • Email: POP3, IMAP, Outlook Web Access
  • File transfer: FTP & TFTP
• Application experience (packet loss, latency, jitter)

Client Side Analytics (Apple Insights)

Insights: Wired Use Cases

Client Onboarding
• Client/Device DHCP
• Client/Device DNS
• Client authentication / authorization

Control Plane
• Control plane reachability
• Edge reachability
• Border reachability
• MAP server
• BGP AS mismatch, flaps
• OSPF adjacency failure
• EIGRP adjacency failure

Data Plane
• Border and edge connectivity
• Border node health
• Access node health
• Network services: DHCP, DNS, AAA
• Interface high utilization
• Interface flaps
• Gateway connectivity
• Application performance (packet loss, latency, jitter)

Policy Plane
• ISE/PxGrid connectivity
• Border node policy
• Edge node policy
• SGACL validation

Network Device Monitoring
• High CPU
• High memory
• High temperature
• Line-card
• Modules
• POE power
• TCAM table

End-to-end visibility – Overall Health

Where in the world are the most serious issues happening

Overall health of the Network Infra and the Clients

Top 10 Global Insights

End-to-end visibility – Client Health

Overall Network Client health summary – wired and wireless

Drill down of Client Onboarding, RF and Profile details

Listing of Network Clients with detailed health information

360° Visibility – Network Client

Network client health history; proactively identify any issues

Detailed client health information

Client Onboarding Details

Application Experience

Client-level application usage visibility per business relevance category

Per-application health score along with historical trending

Detailed Application level flow metrics – Throughput, Packet loss, Latency, Delay

Path Trace – Troubleshoot the Network Path

Network Path for any traffic flow from any source to destination

Detailed information for all Devices and Interface along the Network path

Identify ACLs that may be Blocking or Affecting the traffic flow
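Checking whether an ACL affects a flow amounts to a first-match walk over the access list entries. The sketch below illustrates that idea with Python's standard `ipaddress` module. It is an assumption-laden toy: the `Ace` structure, the `first_match` function and the sample addresses are hypothetical, and it does not represent how DNA Center evaluates ACLs.

```python
import ipaddress
from typing import NamedTuple, Optional


class Ace(NamedTuple):
    action: str                     # "permit" or "deny"
    protocol: str                   # "tcp", "udp" or "ip" (any)
    src: str                        # source prefix in CIDR form
    dst: str                        # destination prefix in CIDR form
    dst_port: Optional[int] = None  # None matches any port


IMPLICIT_DENY = Ace("deny", "ip", "0.0.0.0/0", "0.0.0.0/0")


def first_match(acl, protocol, src_ip, dst_ip, dst_port):
    """Return the first entry that matches the flow, or the implicit deny."""
    src = ipaddress.ip_address(src_ip)
    dst = ipaddress.ip_address(dst_ip)
    for ace in acl:
        if ace.protocol not in ("ip", protocol):
            continue
        if src not in ipaddress.ip_network(ace.src):
            continue
        if dst not in ipaddress.ip_network(ace.dst):
            continue
        if ace.dst_port is not None and ace.dst_port != dst_port:
            continue
        return ace
    return IMPLICIT_DENY


ACL = [
    Ace("deny", "tcp", "10.1.0.0/16", "10.2.3.0/24", 23),
    Ace("permit", "ip", "10.1.0.0/16", "0.0.0.0/0"),
]

print(first_match(ACL, "tcp", "10.1.160.19", "10.2.3.5", 23).action)
print(first_match(ACL, "tcp", "10.1.160.19", "10.2.3.5", 443).action)
```

The same source and destination pair is denied on port 23 and permitted on port 443, which shows why Layer 4 information changes the answer.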

Path Trace – Input Flow Details

Required Information SRC and DEST IP address [End host or L3 interface]

Optional Information SRC and DEST L4 port numbers; L4 protocol (TCP or UDP)

Note: Layer 4 port and protocol information is optional but highly recommended for accurate path calculation

Path Trace – Flow Path Statistics

• Device Statistics – CPU and memory utilization for every network device along the flow path
• Interface Details – Ingress and egress interface on the devices for the application flow path

Path Trace can also provide the following optional information; the network operator needs to select these options when initiating the path trace request:

• Interface Statistics – Detailed statistics for every interface along the path. This includes data such as VLAN information, packet drop counters, packet rate counters, operational state, etc.
• QoS Statistics – QoS policy-map queue-level statistics for each interface wherever QoS policies are attached
• ACL Interrogation – The DNA Assurance Path Trace function looks for access lists on all the relevant interfaces along the flow path and determines whether any of the access list entries in these ACLs might have an impact on the application flow (permit or deny). This impact will be

Demo Time

Getting Started

Be Prepared!

• Be prepared and have data collection systems enabled
  • Enable passive monitoring on endpoints and network
  • Enable active tests

• Helpdesk
  • Interview script => establish & maintain checklists
  • Multi-group access to tools, logs, etc.

• Firefighters run drills, so should your teams!
  • Be familiar with the tools and how they respond on your network
  • Red phone: cross-domain teams (applications, UC, security, servers)

Expanding your Toolbox and Knowledge

• Commercial and open source tools to look at
  • Network topology & IP address management: APIC-EM, netdot, GestióIP
  • Performance tests: iperf3
  • Service checks: AppDynamics, Nagios Core, Zenoss Community
  • NetFlow / log analysis: Cisco Prime Infra, Lancope, logstash, fluentd
  • Template-driven config generation: Cisco Prime Infra, ansible

Complete your online session evaluation

Give us your feedback to be entered into a Daily Survey Drawing. Complete your session surveys through the Cisco Live mobile app or on www.CiscoLive.com/us.

Don’t forget: Cisco Live sessions will be available for viewing on demand after the event at www.CiscoLive.com/Online.

Continue your education

• Demos in the Cisco campus
• Walk-in self-paced labs
• Meet the engineer 1:1 meetings
• Related sessions


R&S related Cisco education offerings

Course: CCIE R&S Advanced Workshops (CIERS-1 & CIERS-2) plus Self Assessments, Workbooks & Labs
Description: Expert level trainings including: instructor led workshops, self assessments, practice labs and CCIE Lab Builder to prepare candidates for the CCIE R&S practical exam.
Cisco Certification: CCIE® Routing & Switching

Course: Implementing Cisco IP Routing v2.0; Implementing Cisco IP Switched Networks V2.0; Troubleshooting and Maintaining Cisco IP Networks v2.0
Description: Professional level instructor led trainings to prepare candidates for the CCNP R&S exams (ROUTE, SWITCH and TSHOOT). Also available in self study eLearning formats with Cisco Learning Labs.
Cisco Certification: CCNP® Routing & Switching

Course: Interconnecting Cisco Networking Devices: Part 2 (or combined)
Description: Builds on ICND1 to provide capabilities needed to configure, implement and troubleshoot a small enterprise network. Including: understanding of Quality of Service (QoS), how virtualized and cloud services interact and impact enterprise networks, along with an overview of network programmability and the related controller types and tools that are available to support software-defined network architectures. Also available in self study eLearning format with Cisco Learning Lab.
Cisco Certification: CCNA® Routing & Switching

Course: Interconnecting Cisco Networking Devices: Part 1
Description: Understand layer 2 and layer 3 networking fundamentals needed to install, configure, and provide basic support of small/branch networks. Covers network device security and IPv6 basics. Also available in self study eLearning format with Cisco Learning Lab.
Cisco Certification: CCENT® Routing & Switching

For more details, please visit: http://learningnetwork.cisco.com
Questions? Visit the Learning@Cisco Booth

Thank you

Backup Slides

Performance Monitor Configuration

• Flow Exporter (optional): Where to send data?
• Flow Record: What metrics to collect?
• Flow Monitor: ties the record and exporter together
• Class-map: What traffic to monitor?
• Policy-map: references the class-map and flow monitor
• Interface: Applied inbound or outbound

Example Configuration – Flow Record

• Flow Record defines what metrics to collect and how to collect them (just like in Flexible NetFlow configuration)
• Performance monitor introduces ‘flow record type performance-monitor’
• Match field types perform aggregation towards that field
  • I.e., ‘match ipv4 source address’ and ‘match ipv4 destination address’ will create a unique entry per src-dst combination

flow record type performance-monitor default-rtp-pt-name
 match ipv4 protocol
 match ipv4 source address
 match ipv4 destination address
 match transport source-port
 match transport destination-port
 match transport rtp ssrc
 match policy performance-monitor classification hierarchy
 collect routing forwarding-status
 collect ipv4 dscp
 collect ipv4 ttl
 collect transport packets expected counter
 collect transport packets lost counter
 collect transport packets lost rate
 collect transport event packet-loss counter
 collect transport rtp jitter mean
 collect transport rtp jitter minimum
 collect transport rtp jitter maximum
 collect interface input
 collect interface output
 collect counter bytes
 collect counter packets
 collect counter bytes rate
 collect timestamp interval
 collect application name
 collect application media bytes counter
 collect application media bytes rate
 collect application media packets counter
 collect application media packets rate
 collect application media event
 collect monitor event
 collect transport rtp payload-type
!

Example Configuration – monitor

• Flow monitor pulls together the flow record, exporter, and specific cache management configurations (just like Flexible NetFlow)
  • Special type of flow monitor: ‘flow monitor type performance-monitor’
• (optional) Flow exporter configures how the NetFlow exporting is done
• Policy map specifies which traffic to monitor (via class-map), how to monitor (via monitor), and any per-class threshold crossing actions
  • Typed policy-map (performance monitor)

flow exporter mn-campus-samplicator
 destination 10.1.160.37
 source Loopback0
 transport udp 2055
 template data timeout 60
 option c3pl-class-table
 option c3pl-policy-table
 option interface-table
 option application-table
 option sub-application-table
!
flow monitor type performance-monitor default-rtp-pt-name
 record default-rtp-pt-name
 exporter mn-campus-samplicator
 cache timeout synchronized 10 export-spread 5
 history size 10
!
policy-map type performance-monitor rtp-traffic-name
 class VOIP
  flow monitor default-rtp-pt-name
  react 1 transport-packets-lost-rate
   threshold value ge 1.00
   alarm severity error
   action syslog
 class VIDEO-CONF
  flow monitor default-rtp-pt-name
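The `react` clause in the policy map raises an alarm when the packet loss rate reaches 1.00 percent. The check itself can be sketched as below. This is an illustration under stated assumptions: the function names are hypothetical, and the loss rate is assumed to be lost packets divided by expected packets, expressed as a percentage.

```python
def lost_rate_pct(expected, lost):
    """Packet loss as a percentage of expected packets (0.0 when nothing was expected)."""
    if expected <= 0:
        return 0.0
    return 100.0 * lost / expected


def react(expected, lost, threshold_pct=1.00):
    """Return an alarm message when loss is at or above the threshold, else None."""
    rate = lost_rate_pct(expected, lost)
    if rate >= threshold_pct:
        return (
            f"severity=error transport-packets-lost-rate "
            f"{rate:.2f}% >= {threshold_pct:.2f}%"
        )
    return None


print(react(773, 0))     # no loss in the interval, no alarm
print(react(1000, 10))   # 1% loss crosses the 'ge 1.00' threshold
```

Using greater-or-equal mirrors the `ge` keyword in the threshold, so a rate of exactly 1.00 percent already triggers the action.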

Example Configuration – Interface attachment

• Finally, policy map is applied to interface

• Note typed policy is used

• Direction of monitoring (input|output) selectable for some platforms

interface gigabitEthernet 0/1
 service-policy type performance-monitor input rtp-traffic-name

Audio Quality Metrics (AQM) on CUBE

• AQM provides deeper insight into the media flows that are processed by the CUBE / Voice gateways

• ISRG2, c8xx 15.3(3)M

• Available via MIB, CDR and performance monitor

[Figure: voice gateway between SIP/media and PRI legs]

Example Configuration – AQM performance monitor

• ‘media monitoring’ configuration under ‘voice service voip’ or dial-peer
  • Controls generation of metrics on CUBE/VG
• To export via NetFlow, regular performance monitor configuration – just include the AQM fields
• MIB: CISCO-VOICE-DIAL-CONTROL-MIB

voice service voip
 media monitoring [num] persist
 ! num is number of channels used to monitor media statistics
 ! delay calc, MOS etc
OR
dial-peer voice [tag] voip
 media monitoring
!
flow record type performance-monitor aqm
 match ipv4 source address
 match ipv4 destination address
 match transport source-port
 match transport destination-port
 collect application voice number called
 collect application voice number calling
 … Regular performance monitoring configuration continues

Video Quality Metrics (VQM) on ISR G2

• VQM provides deeper insight into the video flows (H.264) that are crossing routers

• ISRG2, c8xx 15.3(3)M

• Available via performance monitor

Example Configuration – VQM performance monitor

• ‘no shut’ under ‘video monitoring’ global config
• To export via NetFlow, regular performance monitor configuration – just include the VQM fields

video monitoring
 maximum-sessions 10
 no shutdown
flow record type performance-monitor vqm-rec
 match ipv4 protocol
 match ipv4 source address
 match ipv4 destination address
 match transport source-port
 match transport destination-port
 match transport rtp ssrc
 collect application video resolution [ width | height ] last
 collect application video frame rate
 collect application video payload bitrate [ average | fluctuation ]
 collect application video frame [ I | STR | LTR | super-P | NR ] counter frames
 collect application video frame [ I | STR | LTR | super-P | NR ] counter packets [lost]
 collect application video frame [ I | STR | LTR | super-P | NR ] counter bytes
 collect application video frame [ I | STR | LTR | super-P | NR ] slice-quantization-level
 collect application video eMOS compression [ network | bitstream ]
 collect application video eMOS packet-loss [ network | bitstream ]
 collect application video frame percentage damaged
 collect application video scene-complexity
 collect application video level-of-motion
 collect transport rtp sequence-number [ last ]

show commands

• Individual monitor intervals:
  • show performance monitor history
• Aggregation over all stored intervals:
  • show performance monitor status

Sample output, for reference:

1861-AA0213#show performance monitor history
Load for five secs: 20%/16%; one minute: 8%; five minutes: 4%
Time source is NTP, 01:52:12.052 EST Fri Oct 29 2010

Codes: * - field is not configurable under flow record
       NA - field is not applicable for configured parameters

Match: ipv4 src addr = 10.1.160.19, ipv4 dst addr = 10.1.3.5, ipv4 prot = udp, trns src port = 32760, trns dst port = 22802, SSRC = 1717646439
 Policy: all-apps, Class: telepresence-CS4, Interface: FastEthernet0/0, Direction: input

 start time                                         01:51:31
                                                    ========
 *history bucket number                           : 1
 *counter flow                                    : 1
  counter bytes                                   : 162329
  counter bytes rate (Bps)                        : 5410
 *counter bytes rate per flow (Bps)               : 5410
 *counter bytes rate per flow min (Bps)           : 5410
 *counter bytes rate per flow max (Bps)           : 5410
  counter packets                                 : 773
 *counter packets rate per flow                   : 25
  counter packets dropped                         : 0
  routing forwarding-status reason                : Unknown
  interface input                                 : Fa0/0
  interface output                                : Vl1000
  monitor event                                   : false
  ipv4 dscp                                       : 32
  ipv4 ttl                                        : 58
  application media bytes counter                 : 146869
  application media packets counter               : 773
  application media bytes rate (Bps)              : 4895
 *application media bytes rate per flow (Bps)     : 4895
 *application media bytes rate per flow min (Bps) : 4895
 *application media bytes rate per flow max (Bps) : 4895
  application media packets rate (pps)            : 25
  application media event                         : Normal
 *transport rtp flow count                        : 1
  transport rtp jitter mean (usec)                : 476
  transport rtp jitter minimum (usec)             : 1
  transport rtp jitter maximum (usec)             : 1997
 *transport rtp payload type                      : 96
  transport event packet-loss counter             : 0
 *transport event packet-loss counter min         : 0
 *transport event packet-loss counter max         : 0
  transport packets expected counter              : 773
  transport packets lost counter                  : 0
 *transport packets lost counter minimum          : 0
 *transport packets lost counter maximum          : 0
  transport packets lost rate ( % )               : 0.00
 *transport packets lost rate min ( % )           : 0.00
 *transport packets lost rate max ( % )           : 0.00

FNF Configuration - Example

1. Configure the Exporter (Where do I want my data sent?)
Router(config)# flow exporter my-exporter
Router(config-flow-exporter)# destination 1.1.1.1

2. Configure the Flow Record (What data do I want to meter?)
Router(config)# flow record my-record
Router(config-flow-record)# match ipv4 destination address
Router(config-flow-record)# match ipv4 source address
Router(config-flow-record)# collect counter bytes

3. Configure the Flow Monitor (How do I want to cache information?)
Router(config)# flow monitor my-monitor
Router(config-flow-monitor)# exporter my-exporter
Router(config-flow-monitor)# record my-record

4. Apply to an Interface (Which interface do I want to monitor?)
Router(config)# interface s3/0
Router(config-if)# ip flow monitor my-monitor input

NetFlow QoS Reporting

• How is my flow being classified?
• Did this class drop traffic?
• QoS queue performance (drops)
• QoS class structure: class-map and policy-map names

Flow exporter:
 option c3pl-class-table timeout
 option c3pl-policy-table timeout

QoS Queue performance:
flow record type performance-monitor qos-record
 match policy qos queue index
 collect policy qos queue drops
(or)
flow record qos-record
 match policy qos queue index
 collect policy qos queue drops

Flow to QoS Association:
flow record type performance-monitor A
 match connection client ipv4 address
 match connection server ipv4 address
 match connection server transport port
 collect policy qos class hierarchy
 collect policy qos queue id
 …
(or)
flow record qos-class-record
 match ipv4 source address
 match ipv4 destination address
 collect policy qos classification hierarchy
 collect policy qos queue index
 …

show ip traffic

R1#show ip traffic [interface ]
IP statistics:
  Rcvd:  1117 total, 1116 local destination
         0 format errors, 0 checksum errors, 0 bad hop count
         0 unknown protocol, 0 not a gateway
         0 security failures, 0 bad options, 0 with options
  Opts:  0 end, 0 nop, 0 basic security, 0 loose source route
         0 timestamp, 0 extended security, 0 record route
         0 stream ID, 0 strict source route, 0 alert, 0 cipso, 0 ump
         0 other
  Frags: 0 reassembled, 0 timeouts, 0 couldn't reassemble
         0 fragmented, 0 fragments, 0 couldn't fragment
  Bcast: 58 received, 0 sent
  Mcast: 442 received, 221 sent
  Sent:  842 generated, 1195 forwarded
  Drop:  1 encapsulation failed, 0 unresolved, 0 no adjacency
         0 no route, 0 unicast RPF, 0 forced drop
         0 options denied
  Drop:  0 packets with source IP address zero
  Drop:  0 packets with internal loop back IP address
         0 physical broadcast
  Reinj: 0 in input feature path, 0 in output feature path

ICMP statistics:
  Rcvd: 0 format errors, 0 checksum errors, 0 redirects, 0 unreachable
        0 echo, 0 echo reply, 0 mask requests, 0 mask replies, 0 quench
        0 parameter, 0 timestamp, 0 timestamp replies, 0 info request, 0 other
        0 irdp solicitations, 0 irdp advertisements
        0 time exceeded, 0 info replies
  Sent: 0 redirects, 0 unreachable, 0 echo, 0 echo reply
        0 mask requests, 0 mask replies, 0 quench, 0 timestamp, 0 timestamp replies
        0 info reply, 0 time exceeded, 0 parameter problem
        0 irdp solicitations, 0 irdp advertisements

UDP statistics:
  Rcvd: 58 total, 0 checksum errors, 58 no port 0 finput
  Sent: 0 total, 0 forwarded broadcasts

BGP statistics:
  Rcvd: 0 total, 0 opens, 0 notifications, 0 updates
        0 keepalives, 0 route-refresh, 0 unrecognized
  Sent: 0 total, 0 opens, 0 notifications, 0 updates
        0 keepalives, 0 route-refresh

TCP statistics:
  Rcvd: 1471 total, 0 checksum errors, 85 no port
  Sent: 597 total
..

OSPF statistics:
  Last clearing of OSPF traffic counters never
  Rcvd: 460 total, 0 checksum errors
        414 hello, 8 database desc, 3 link state req
        22 link state updates, 13 link state acks
  Sent: 245 total
        199 hello, 12 database desc, 2 link state req
        21 link state updates, 12 link state acks

Path Trace – Flow Path Details

• Device Details – Information regarding all the network devices along the path, along with a pointer to the Device 360 page for further detail. This includes the device name, IP address, etc.
• Link information source – For all the links along the application flow path trace, the link information source is displayed. Some examples for this field:
  • Routing protocols (OSPF, BGP, etc.) – The link is based on the routing protocol table
  • ECMP – The link is based upon a Cisco® Express Forwarding load-balancing decision
  • NetFlow – The link is based upon NetFlow cache records collected on the device
  • Static – The link is based on a static routing table
  • Wired and wireless – Type of end-client (in case the source or destination is a client device)
  • Switched – The link is based on Layer 2 VLAN forwarding information
  • Traceroute – The link is based on information collected by the traceroute app
• Tunnels – Visualization of the overlay tunnels present along the application flow path. Examples include CAPWAP, VXLAN, etc.

Path Trace – How Does It Work: Background Data Collection

• DNA Center periodically collects device, host, and routing table information from the network elements
• The collected information is stored in the DNA Center DB

• Information collected (once every polling interval)
– Device, interface, link state
– CDP, LLDP, IP device tracking DB
– Wireless association
– VLAN, STP
– HSRP
– OSPF, ISIS, EIGRP, BGP, static routes
– More
• Information collected using SNMP traps
– Wired / wireless host discovery through SNMP traps

Path Trace – How Does It Work: Flow Path Calculation

• For any given 5-tuple input, DNA Center first tries to calculate the path using the information stored in the DNA Center DB

• DNA Center queries the network on demand to obtain the most accurate path information in the following scenarios:
– ECMP along the application flow path: the controller queries the network device to get the exact egress interface for the given application flow. Note: providing the full 5-tuple for the application flow is important for accurate path trace results.
– Unknown or unsupported device along the path: the controller uses traceroute to get the best possible path.
– Source and destination IP addresses on different branch or campus networks: to determine which border router actually received the application flow, DNA Center looks at the NetFlow cache records on all the border routers in the destination campus or branch network.
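The per-hop choice of information source described above can be sketched roughly as follows. This is an illustrative Python model only, not DNA Center code: the `Hop` fields, the `InfoSource` names, and the order in which the conditions are checked are assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class InfoSource(Enum):
    DATABASE = "DNA Center DB lookup"
    DEVICE_QUERY = "on-demand device query (ECMP egress interface)"
    TRACEROUTE = "traceroute"
    NETFLOW = "NetFlow cache on border routers"


@dataclass
class Hop:
    # Hypothetical description of one hop along the path.
    name: str
    supported: bool = True       # device is known to and supported by the controller
    ecmp: bool = False           # several equal-cost egress links exist at this hop
    crosses_sites: bool = False  # next hop is in a different branch/campus network


def info_source(hop: Hop) -> InfoSource:
    """Pick where the path information for this hop would come from."""
    if not hop.supported:
        return InfoSource.TRACEROUTE    # best-effort path for unknown devices
    if hop.crosses_sites:
        return InfoSource.NETFLOW       # find the border router that saw the flow
    if hop.ecmp:
        return InfoSource.DEVICE_QUERY  # ask the device for the exact egress interface
    return InfoSource.DATABASE          # default: periodically collected data


path = [Hop("access-sw"), Hop("dist-1", ecmp=True), Hop("wan-edge", crosses_sites=True)]
for hop in path:
    print(hop.name, "->", info_source(hop).value)
```

The default branch reflects the point made on the slide: the stored data is tried first, and the network is queried only when the stored data cannot give an exact answer.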

Path Trace – How Does It Work: Flow Path Calculation Example

[Figure: example flow path between a branch and a campus across the cloud. Each step is annotated with its information source:
• Routing table lookup – Info source: DNA Center DB
• NetFlow cache lookup – Info source: poll network device
• ECMP decision – Info source: poll network device
• L2 lookup – Info source: DNA Center DB
• HSRP lookup – Info source: DNA Center DB
• Gateway lookup – Info source: DNA Center DB]

DNA Center Flow Path Analysis: 5-Tuple Input via User Interface

Required Information: SRC and DEST IP Address [End-Host or L3 Interface]

Optional Information: SRC and DEST L4 Port Numbers; L4 Protocol (TCP or UDP)

Note: L4 Port and Protocol information is optional but highly recommended for accurate path calculation
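A small sketch can show why the ports matter. Wherever load balancing takes L4 information into account, two flows between the same pair of hosts can take different links. The Python below is an illustration only: real platforms use their own hashing algorithms, and the SHA-256-based `pick_link`, the `FlowKey` type, the addresses, and the port numbers are all assumptions made for the example.

```python
import hashlib
from typing import NamedTuple, Optional


class FlowKey(NamedTuple):
    src_ip: str
    dst_ip: str
    protocol: Optional[str] = None  # "TCP" or "UDP"; optional in the UI
    src_port: Optional[int] = None
    dst_port: Optional[int] = None


def pick_link(flow: FlowKey, link_count: int) -> int:
    """Map a flow to one of link_count equal-cost links (illustrative hash only)."""
    key = "|".join("" if field is None else str(field) for field in flow)
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % link_count


def candidate_links(src_ip, dst_ip, protocol, port_pairs, link_count):
    """Set of links used by flows that differ only in their L4 ports."""
    return {
        pick_link(FlowKey(src_ip, dst_ip, protocol, sport, dport), link_count)
        for sport, dport in port_pairs
    }


# Same two hosts, 64 different client source ports, two equal-cost links.
pairs = [(50000 + i, 443) for i in range(64)]
print(candidate_links("10.1.1.10", "10.2.2.20", "TCP", pairs, link_count=2))
```

With only the two IP addresses as input, a path trace cannot tell which of those links a particular application flow used, which is why supplying the ports and protocol is recommended.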

Flow Path Analysis: Enhanced Application Flow Visibility

[Screenshot callouts:
• CAPWAP tunnel visualization
• Link source information
• Accuracy value
• Ingress/Egress interface]

Flow Path Analysis: Enhanced Application Flow Visibility – Key Statistics

Area of Interest

Interface and QoS Queue Stats
