Network State Awareness
and Troubleshooting
Faraz Shamim
Technical Leader
Cisco Systems Inc
Network State Awareness & troubleshooting
Abstract
•  Network state awareness and troubleshooting is a large and skilled part of
operating a network. This session covers basic network data plane
troubleshooting best practices and techniques to plan for failures. It also
includes demos and a review of the troubleshooting tool chain (NetFlow, perf-mon,
CBQoS, ICMP/traceroute, interface stats), touches on routing protocol stability (SPF
runs, unstable neighbors, etc.), and covers SDN methodologies along the same lines.
•  Troubleshooting Methodology
•  Packet Forwarding Review
•  Data Plane
•  Active Monitoring
•  Passive Flow Monitoring
•  QoS
•  Control Plane
•  Logging
•  Routing Protocol Stability
•  Getting Started
Agenda
•  This session is about basic network troubleshooting,
focusing on fault detection & isolation
•  Mostly vendor neutral
•  For context, we will cover some basic methodologies
and functional elements of network behavior
•  This session is NOT about
•  Architectures of specific platforms
•  Data Center technologies
•  This is the 90 min tour. ;-)
Keeping Focused: What This Session is About
The Big Picture
network
Network Operator
Server
Client
Application Operator
Not
happy
It’s not
the
network
It’s the
network
Is it
Monday?
Pings
fine!
Can’t
ping it.
Internet’s
down.
Somebody's
downloading
something.
(?)
Enterprise
DC
•  A lot of stuff going on
•  Multiple networks
•  Multiple applications
•  Multiple layered services
•  Mis-information / inconsistency
Some More (network) Detail
LAN
Server A
Client
Not
happy
ISP A
Enterprise
WAN
Server B
Internet
DNS
DHCP
802.1x
DNS
ISP B
Enterprise
DC
•  Redundant paths / ECMP / LAG
•  Overlays
•  Load balancers
•  Firewalls
•  NATs
… and it keeps on going
LAN
Server A
Client
Not
happy
ISP A
Enterprise
WAN
Server B
Internet
DNS
DHCP
802.1x
DNS
Why network state awareness?
•  Quick detection of hard failures
•  Early warning for
•  soft failures
•  performance issues
•  and tomorrow’s problems
•  Faster problem resolution
•  Greater confidence in network by users and application operators
Find the Suspects, Question Suspects, Improve
Be Prepared
Think Like a Network Detective
•  Control Plane
•  Processes variety of information
sources and policies, creates
routing information base (RIB)
•  Best known intention w/o actual
packet in hand
•  Data Plane
•  The actual forwarding process
(might be SW or HW based)
•  Granted some decision flexibility
•  Driven by arriving packet details,
traffic conditions etc.
Control Plane & Data Plane
Control Plane
Data Plane
Int A
Int B
Int C
packet
Routing
Protocol(s)
APIs Statics
Check routes
check L3 routing
Check policy
check forwarding
Gossip from
other routers
Passive Measurements
ifmib *Flow CbQoS
check policy-map int…
check interface
check flow monitor
PfR
Admin Edict
•  Control plane: condenses options driven by policies and (relatively) slower-moving,
aggregated information, e.g. prefix reachability, interface state
•  Data plane responds to packet conditions
•  Destination prefix to egress interface matching
•  Multi-path (ECMP / LAG) member selection
•  Interface congestion
•  QoS class state
•  Access Lists
•  Packet processing fields (TTL expire, etc)
•  IPv4 fragmentation, etc
Data Plane Decision Flexibility
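The multi-path member selection listed above can be pictured with a small sketch. It assumes a router hashes the flow 5-tuple and takes the result modulo the number of equal-cost members. Real platforms use their own hash functions and input fields, so SHA-256, `pick_member` and the `link-1`/`link-2` names are illustrative assumptions only.

```python
import hashlib

def pick_member(src_ip, dst_ip, proto, src_port, dst_port, members):
    """Pick one equal-cost member for a flow from a hash of its 5-tuple."""
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()  # stand-in for a vendor hash
    index = int.from_bytes(digest[:4], "big") % len(members)
    return members[index]

members = ["link-1", "link-2"]  # hypothetical member names

# The same 5-tuple always hashes to the same member, so a flow stays in order.
first = pick_member("192.0.2.1", "198.51.100.7", 17, 40000, 33434, members)
again = pick_member("192.0.2.1", "198.51.100.7", 17, 40000, 33434, members)

# Varying only the destination port can move packets onto another member.
used = {pick_member("192.0.2.1", "198.51.100.7", 17, 40000, 33434 + i, members)
        for i in range(32)}
```

This is why probes that change a header field per packet may take a different path than the user traffic being investigated.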
•  Each network device makes an independent forwarding decision
•  Explicit Local / domain policies
•  Device perspective might not be symmetric
•  Data plane flexibility
•  Generally happens at WAN-edge and admin boundaries (traffic engineering)
•  Asymmetric routing
Network as a System: Independent Decisions
A B
R1 R2 R5
R6
R4
R3
your network You don’t control
Congested link
R5 is doing
ECMP hash
Data Plane
User / Agent Checks
•  Treat network as a black box: are your beacon services working?
•  Synthetic service check (HTTP, DNS, etc.)
•  Ping (not all remotes will respond)
•  Data plane is exercised and tested
•  Variety = better coverage (multiple IP addresses / L4 ports per location)
•  Validate similar treatment (QoS) as real user traffic
•  Uptime and performance (loss, latency) metrics
•  Look for patterns, changes from normal. All down vs some down.
•  Capture and validate real user (human) incidents. What got missed?
•  Use wisely: network and server resources consumed
A B
R1 R2 R5
R6
R3
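A synthetic service check of the kind described on this slide could look roughly like the sketch below. It only times a TCP connect and summarizes loss and mean latency over several samples; the function names `tcp_check` and `summarize` are made up for illustration, and a real agent would also exercise the application protocol (HTTP, DNS, etc.).

```python
import socket
import time

def tcp_check(host, port, timeout=2.0):
    """Return (ok, seconds) for one TCP connect attempt to host:port."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            ok = True
    except OSError:
        ok = False
    return ok, time.monotonic() - start

def summarize(results):
    """Turn a list of (ok, seconds) samples into (loss %, mean latency)."""
    if not results:
        return None, None
    good = [secs for ok, secs in results if ok]
    loss_pct = 100.0 * (len(results) - len(good)) / len(results)
    mean = sum(good) / len(good) if good else None
    return loss_pct, mean
```

Running `tcp_check` against several addresses and ports per location, then comparing `summarize` results with the normal baseline, gives the "all down vs some down" view.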
IP SLA* (RFC 6812): Synthetic Traffic Measurements
•  Uses: Network Performance Monitoring, Service Level Agreement (SLA) Monitoring, Network Assessment, Multiprotocol Label Switching (MPLS) Monitoring, VoIP Monitoring, Availability, Troubleshooting
•  Measurement Metrics: Latency, Network Jitter, Dist. of Stats, Connectivity, Packet Loss
•  Operations: FTP, DNS, DHCP, TCP, Jitter, ICMP, UDP, DLSW, HTTP, LDP, H.323, SIP, RTP
•  MIB Data; Active Generated Traffic to Measure the Network
•  Source (IP SLA) to Destination (Responder)
*Other vendors offer comparable monitoring tools in place of IP SLA, such as Juniper's RPM.
•  IPSLA on router/switch –
Shadow Router?
•  User end-system based
agent software
•  Dedicated Agent
Check interface
•  Classic command
•  Check interface ‘up’ status
•  Stability: check log event or check
routing table stability
•  Monitor in/out bit/packet changes
# show interface
GigabitEthernet1 is up, line protocol is up
Hardware is CSR vNIC, address is 000c.291a.7f97 (bia 000c.291a.7f97)
Internet address is 192.168.225.130/24
MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec,
reliability 255/255, txload 1/255, rxload 1/255
Encapsulation ARPA, loopback not set
Keepalive set (10 sec)
Full Duplex, 1000Mbps, link type is auto, media type is RJ45
output flow-control is unsupported, input flow-control is unsupported
ARP type: ARPA, ARP Timeout 04:00:00
Last input 00:05:35, output 00:09:58, output hang never
Last clearing of "show interface" counters never
Input queue: 0/375/0/0 (size/max/drops/flushes); Total output drops: 0
Queueing strategy: fifo
Output queue: 0/40 (size/max)
5 minute input rate 0 bits/sec, 0 packets/sec
5 minute output rate 0 bits/sec, 0 packets/sec
25349 packets input, 2381158 bytes, 0 no buffer
Received 0 broadcasts (0 IP multicasts)
0 runts, 0 giants, 0 throttles
0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored
0 watchdog, 0 multicast, 0 pause input
3958 packets output, 312408 bytes, 0 underruns
0 output errors, 0 collisions, 0 interface resets
56 unknown protocol drops
0 babbles, 0 late collision, 0 deferred
0 lost carrier, 0 no carrier, 0 pause output
0 output buffer failures, 0 output buffers swapped out
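Monitoring counter changes between two snapshots is easier to reason about than reading absolute values. The sketch below pulls a few counters out of `show interface` style text and computes the difference between a "before" and an "after" capture. The counter selection and helper names are illustrative assumptions; in practice the same data usually comes from SNMP or streaming telemetry rather than screen scraping.

```python
import re

COUNTERS = ("input errors", "CRC", "output errors", "interface resets")

def parse_counters(show_interface_text):
    """Extract selected counters from 'show interface' style text."""
    found = {}
    for name in COUNTERS:
        match = re.search(r"(\d+) " + re.escape(name), show_interface_text)
        if match:
            found[name] = int(match.group(1))
    return found

def deltas(before, after):
    """Counter increase between two parsed snapshots."""
    return {name: after[name] - before[name] for name in after if name in before}

before_text = ("0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored\n"
               "0 output errors, 0 collisions, 0 interface resets")
after_text = ("12 input errors, 12 CRC, 0 frame, 0 overrun, 0 ignored\n"
              "0 output errors, 0 collisions, 1 interface resets")
change = deltas(parse_counters(before_text), parse_counters(after_text))
```

A non-zero delta for errors or resets between polls is the signal worth alerting on; a large but static value is history.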
traceroute
•  Understand the limitations
•  Sends 3 packets (default) at each TTL
•  Implementations
•  Linux/Cisco: UDP (ICMP and TCP SYN are optional on Linux)
•  UDP DST port # used to keep track of packets, increments per packet. Initial = 33434 (default)
•  SRC port #: randomized (Linux), incrementing per packet (Cisco IOS)
•  Linux (GNU inetutils-traceroute)
•  UDP DST port# increments per TTL (not per packet)
•  SRC port is random but fixed per entire run
•  Windows: ICMP Echo request
Widest dispersion
against possibilities.
Difficult to
understand though.
ICMP blocked
frequently :-(
Narrower
dispersion.
Story might be
misleading.
Internet: aka the
TCP/80 network
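The difference between the two UDP behaviours above (destination port incrementing per packet versus per TTL) can be sketched as a probe plan. No packets are sent here; the function only lists which `(ttl, dst_port)` pairs would be used. The exact starting offset per implementation is an assumption, and `udp_probe_plan` is a made-up name.

```python
def udp_probe_plan(max_ttl, probes_per_ttl=3, base_port=33434, per_packet=True):
    """List (ttl, dst_port) pairs for UDP traceroute probes.

    per_packet=True  : port increments with every probe sent
    per_packet=False : port increments once per TTL (assumed to start at base_port)
    """
    plan = []
    sent = 0
    for ttl in range(1, max_ttl + 1):
        for _ in range(probes_per_ttl):
            offset = sent if per_packet else ttl - 1
            plan.append((ttl, base_port + offset))
            sent += 1
    return plan

wide = udp_probe_plan(2)                      # six different destination ports
narrow = udp_probe_plan(2, per_packet=False)  # one destination port per TTL
```

Since the destination port is usually part of the ECMP hash input, the per-packet plan tends to expose more alternate paths, while the per-TTL plan keeps the three probes of a hop on one path.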
Unix traceroute
•  Multiple path options
•  Topology ‘shortcuts’ (same router seen at diff hop)
•  Ultimately all paths result in similar e2e delay
$ traceroute 62.2.88.172
traceroute to 62.2.88.172 (62.2.88.172), 30 hops max, 60 byte packets
1 152.22.242.65 (152.22.242.65) 1.044 ms 1.371 ms 1.585 ms
2 152.22.240.8 (152.22.240.8) 0.219 ms 0.328 ms 0.327 ms
3 128.109.70.9 (128.109.70.9) 1.066 ms 1.059 ms 1.168 ms
4 rtp7600-gw-to-dep7600-gw2.ncren.net (128.109.70.137) 1.634 ms 1.628 ms 1.736 ms
5 rlasr-gw-link1-to-rtp7600-gw.ncren.net (128.109.9.17) 5.354 ms 5.446 ms 5.557 ms
6 128.109.9.117 (128.109.9.117) 5.671 ms 128.109.9.170 (128.109.9.170) 7.141 ms 128.109.9.117 (128.109.9.117) 5.433 ms
7 wscrs-gw-to-ws-a1a-ip-asr-gw-sec.ncren.net (128.109.1.105) 9.174 ms 128.109.1.209 (128.109.1.209) 8.256 ms 6.397 ms
8 dcp-brdr-03.inet.qwest.net (205.171.251.110) 18.414 ms chr-edge-03.inet.qwest.net (65.114.0.205) 27.353 ms 27.438 ms
9 dcp-brdr-03.inet.qwest.net (205.171.251.110) 21.739 ms 63-235-40-106.dia.static.qwest.net (63.235.40.106) 17.750 ms dcp-brdr-03.inet.qwest.net (205.171.251.110) 22.450 ms
10 63-235-40-106.dia.static.qwest.net (63.235.40.106) 22.531 ms 22.516 ms 84-116-130-173.aorta.net (84.116.130.173) 140.738 ms
11 nl-ams02a-rd1-te0-2-0-2.aorta.net (84.116.130.65) 140.831 ms 140.816 ms 84-116-130-173.aorta.net (84.116.130.173) 144.819 ms
12 nl-ams02a-rd1-te0-2-0-2.aorta.net (84.116.130.65) 144.074 ms 144.761 ms 84-116-130-58.aorta.net (84.116.130.58) 138.455 ms
13 84-116-130-58.aorta.net (84.116.130.58) 141.844 ms 141.924 ms 142.459 ms
14 84.116.204.234 (84.116.204.234) 145.603 ms 145.891 ms 145.987 ms
15 * * *
16 62-2-88-172.static.cablecom.ch (62.2.88.172) 268.281 ms 268.245 ms 268.176 ms
1 AAA
2 BBB
3 CCC
4 DDD
5 EEE
6 FGF
7 HII
8 JKK +10ms (unsustained)
9 JLJ
10 LLM +120ms (sustained)
11 NNM
12 NNO
13 PPP
14 QQQ
15 ***
16 RRR ~268ms (all three)
filter + > 100 ms
delay
+120ms
Atlantic
crossing
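The "filter + > 100 ms" reading of the trace can be approximated in code. The sketch below takes the best (minimum) RTT per hop and reports a hop only if its RTT jumps by more than a threshold over the previous responding hop and every later responding hop stays above that level. The function name and the rule that a final-hop jump cannot be confirmed as sustained are assumptions made for this illustration; the RTT values are copied from the trace above.

```python
def sustained_steps(hop_rtts, threshold_ms=100.0):
    """Hops where the best RTT rises by more than threshold_ms and stays up."""
    best = {hop: min(rtts) for hop, rtts in hop_rtts.items() if rtts}
    hops = sorted(best)
    steps = []
    for prev, hop in zip(hops, hops[1:]):
        floor = best[prev] + threshold_ms
        later = [best[h] for h in hops if h > hop]
        # Require at least one later hop to confirm the delay is sustained.
        if best[hop] > floor and later and all(rtt > floor for rtt in later):
            steps.append(hop)
    return steps

trace = {
    9: [21.739, 17.750, 22.450],
    10: [22.531, 22.516, 140.738],
    11: [140.831, 140.816, 144.819],
    12: [144.074, 144.761, 138.455],
    13: [141.844, 141.924, 142.459],
    14: [145.603, 145.891, 145.987],
    15: [],  # no replies
    16: [268.281, 268.245, 268.176],
}
steps = sustained_steps(trace)
```

With these values the only confirmed step is at hop 11, where all three probes have crossed the Atlantic; hop 10 shows it for just one of three probes because of multipath.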
Reference
Unix inetutils traceroute
•  Narrower view (no alternate paths directly seen)
•  Repeating nodes suggest multipath, or (unlikely) a routing issue
$ inetutils-traceroute --resolve-hostname 62.2.88.172
traceroute to 62.2.88.172 (62.2.88.172), 64 hops max
1 152.22.242.65 (152.22.242.65) 0.783ms 0.727ms 0.798ms
2 152.22.240.8 (152.22.240.8) 0.226ms 0.228ms 0.221ms
3 128.109.70.9 (128.109.70.9) 0.967ms 0.980ms 0.962ms
4 128.109.70.137 (rtp7600-gw-to-dep7600-gw2.ncren.net) 1.576ms 1.598ms 1.567ms
5 128.109.9.17 (rlasr-gw-link1-to-rtp7600-gw.ncren.net) 5.149ms 5.140ms 5.126ms
6 128.109.9.166 (128.109.9.166) 7.113ms 7.098ms 7.306ms
7 128.109.1.209 (128.109.1.209) 7.835ms 8.326ms 7.958ms
8 65.114.0.205 (chr-edge-03.inet.qwest.net) 19.944ms 9.299ms 40.372ms
9 63.235.40.106 (63-235-40-106.dia.static.qwest.net) 18.442ms 18.412ms 18.432ms
10 63.235.40.106 (63-235-40-106.dia.static.qwest.net) 22.424ms 22.391ms 75.960ms
11 84.116.130.173 (84-116-130-173.aorta.net) 145.434ms 146.301ms 145.445ms
12 84.116.130.58 (84-116-130-58.aorta.net) 137.583ms 137.556ms 137.661ms
13 84.116.130.58 (84-116-130-58.aorta.net) 142.476ms 141.886ms 141.819ms
14 84.116.204.234 (84.116.204.234) 144.841ms 145.034ms 144.964ms
15 * * *
16 62.2.88.172 (62-2-88-172.static.cablecom.ch) 287.318ms 176.670ms 254.237ms
Packets for hops 9 and 12 took a
‘shortcut’; packets for
hops 10 and 13 went the long way
Reference
lft
•  lft ‘layer 4 traceroute’ dynamically adjusts to responses
•  Firewall detection, whois and AS lookup integrated
•  Narrower packet changes, so narrower multi-path
$ sudo lft -ENA 62.2.88.172
Tracing ________________________________________________________________.
TTL LFT trace to 62-2-88-172.static.cablecom.ch (62.2.88.172):80/tcp
1 [AS81] [NCREN-B22] 152.22.242.65 20.1/17.2ms
2 [AS81] [NCREN-B22] 152.22.240.8 20.1/20.1ms
3 [AS81] [CONCERT] 128.109.70.9 20.1/20.1ms
4 [AS81] [CONCERT] rtp7600-gw-to-dep7600-gw2.ncren.net (128.109.70.137) 20.1/20.1ms
5 [AS81] [CONCERT] rlasr-gw-link1-to-rtp7600-gw.ncren.net (128.109.9.17) 20.1/20.1ms
6 [AS81] [CONCERT] 128.109.9.117 20.1/20.1ms
7 [AS209] [unknown] chr-edge-03.inet.qwest.net (65.121.156.209) 20.1/19.5ms
8 [AS209] [QWEST-INET-35] dcp-brdr-03.inet.qwest.net (205.171.251.110) 20.1/18.4ms
9 [AS209] [QWEST-INET-17] 63-235-40-106.dia.static.qwest.net (63.235.40.106) 20.1/60.3ms
10 [AS6830] [84-RIPE/LGI-Infrastructure] 84-116-130-173.aorta.net (84.116.130.173) 160.7/160.7ms
11 [AS6830] [84-RIPE/LGI-Infrastructure] nl-ams02a-rd1-te0-2-0-2.aorta.net (84.116.130.65) 160.7/160.7ms
12 [AS6830] [84-RIPE/LGI-Infrastructure] 84-116-130-58.aorta.net (84.116.130.58) 140.6/140.6ms
** [firewall] the next gateway may statefully inspect packets
13 [AS6830] [84-RIPE/LGI-Infrastructure] 84.116.204.234 160.7/160.6ms
** [neglected] no reply packets received from TTL 14
15 * [AS6830] [RIPE-C3/CC-HO841-NET] [target] 62-2-88-172.static.cablecom.ch (62.2.88.172):80 160.7ms
Used tcp/80
SYN
Reference
mtr
•  Interactive combined traceroute and ping
•  Gives a sense of health of path (loss, delay Standard Deviation)
•  Narrow path view
Reference
$ mtr 62.2.88.172
aakhter-nlr-ubuntu-01 (0.0.0.0) Sat May 30 18:57:09 2015
Keys: Help Display mode Restart statistics Order of fields quit
Packets Pings
Host Loss% Snt Last Avg Best Wrst StDev
1. 152.22.242.65 0.0% 145 0.8 0.9 0.7 10.0 0.8
2. 152.22.240.8 0.0% 145 0.3 0.2 0.2 0.3 0.0
3. 128.109.70.9 0.0% 145 1.0 3.3 1.0 182.3 17.2
4. rtp7600-gw-to-dep7600-gw2.ncren.net 1.0% 145 9.2 4.1 1.6 203.4 18.6
5. rlasr-gw-link1-to-rtp7600-gw.ncren.net 0.0% 145 5.3 5.3 5.1 6.8 0.2
6. 128.109.9.166 0.0% 145 7.1 7.3 7.1 16.1 0.8
7. wscrs-gw-to-ws-a1a-ip-asr-gw-sec.ncren.net 0.0% 145 6.8 8.3 6.2 10.6 1.0
8. chr-edge-03.inet.qwest.net 0.0% 145 9.4 12.3 9.3 62.1 9.5
9. dcp-brdr-03.inet.qwest.net 0.0% 145 21.8 22.8 21.7 70.7 5.5
10. 63-235-40-106.dia.static.qwest.net 0.0% 145 21.8 24.5 21.7 86.1 10.6
11. 84-116-130-173.aorta.net 0.0% 145 144.8 145.0 144.7 152.9 1.0
12. nl-ams02a-rd1-te0-2-0-2.aorta.net 0.0% 145 144.1 145.5 144.0 165.4 3.7
13. 84-116-130-58.aorta.net 5.0% 144 142.9 142.3 142.0 145.6 0.4
14. 84.116.204.234 5.0% 144 145.1 145.1 144.9 145.3 0.0
15. 217-168-62-150.static.cablecom.ch 5.0% 144 145.9 146.1 145.2 164.3 1.9
16. 62-2-88-172.static.cablecom.ch 5.0% 144 313.0 260.3 152.6 508.0 80.0
Note variability, probably just the end system
Just local noise, no carry over to later hops
Sustained loss. Likely something wrong 12->13, or way back
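The distinction between local noise and sustained loss in the mtr output can be expressed as a simple rule: loss at a hop matters only if every later hop shows loss too. A minimal sketch under that assumption, with loss percentages copied from the output above and `loss_onset` as a made-up helper name:

```python
def loss_onset(loss_by_hop):
    """Return the first hop from which every later hop also shows loss."""
    onset = None
    for hop, loss_pct in loss_by_hop:
        if loss_pct > 0:
            if onset is None:
                onset = hop
        else:
            onset = None  # loss did not carry over, treat it as local noise
    return onset

mtr_loss = [(1, 0.0), (2, 0.0), (3, 0.0), (4, 1.0), (5, 0.0), (6, 0.0),
            (7, 0.0), (8, 0.0), (9, 0.0), (10, 0.0), (11, 0.0), (12, 0.0),
            (13, 5.0), (14, 5.0), (15, 5.0), (16, 5.0)]
onset = loss_onset(mtr_loss)
```

Here the 1% loss at hop 4 is discarded and hop 13 is reported, which matches the "12->13, or way back" callout. The return path still has to be considered separately.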
Follow the Flow with NetFlow (RFC 3954)
•  Per-Node: Data plane observations and decisions captured
•  Src/dst mac/IP/port#s, DSCP values, in/out interfaces, etc.
•  Network view: flows centrally analyzed by a NetFlow collector/analyzer
•  Biggest value: strategically placed partial views
(e.g. WAN edge)
A B
R1 R2 R5
R6
R4
R3
NetFlow Collector
LiveAction
•  Developed and patented at Cisco
Systems in 1996
•  NetFlow is the de facto standard for
acquiring IP operational data
•  Standardized in IETF via IPFIX
•  Provides network and security
monitoring, network planning, traffic
analysis, and IP accounting
•  Packet capture is like a wire tap
•  NetFlow is like a phone bill
NetFlow (RFC 3954): What Is It?
Network World Article—NetFlow Adoption on the Rise
http://www.networkworld.com/newsletters/nsm/2005/0314nsm1.html
Src. IP   Dest. IP  Source Port  Dest. Port  Protocol  TOS  Input I/F  …  Pkts
3.3.3.3   2.2.2.2   23           22078       6         0    E0         …  1100
Traffic Analysis Cache
Flow
Monitor 1
Traffic
Non-Key Fields
Packets
Bytes
Timestamps
Next Hop Address
Source IP Dest. IP Input I/F Flag … Pkts
3.3.3.3 2.2.2.2 E0 0 … 11000
Security Analysis Cache
Flow
Monitor 2
Key Fields Packet 1
Source IP 3.3.3.3
Dest IP 2.2.2.2
Input Interface Ethernet 0
SYN Flag 0
Non-Key Fields
Packets
Timestamps
Flexible NetFlow
Multiple Monitors with Unique Key Fields
Key Fields Packet 1
Source IP 3.3.3.3
Destination IP 2.2.2.2
Source Port 23
Destination Port 22078
Layer 3 Protocol TCP - 6
TOS Byte 0
Input Interface Ethernet 0
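The idea of multiple monitors with unique key fields can be sketched as two aggregations over the same packets. Packets that agree on all key fields of a monitor land in the same cache entry, and the non-key fields (here only packets and bytes) are accumulated. The dictionary field names and `build_cache` are illustrative assumptions, not Flexible NetFlow syntax.

```python
from collections import defaultdict

TRAFFIC_KEY = ("src", "dst", "src_port", "dst_port", "proto", "tos", "input_if")
SECURITY_KEY = ("src", "dst", "input_if", "syn")

def build_cache(packets, key_fields):
    """Aggregate packets into flow records keyed by the chosen key fields."""
    cache = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for pkt in packets:
        key = tuple(pkt[field] for field in key_fields)
        cache[key]["packets"] += 1      # non-key field: packet count
        cache[key]["bytes"] += pkt["bytes"]  # non-key field: byte count
    return dict(cache)

packets = [
    {"src": "3.3.3.3", "dst": "2.2.2.2", "src_port": 23, "dst_port": 22078,
     "proto": 6, "tos": 0, "input_if": "E0", "syn": 0, "bytes": 100},
    {"src": "3.3.3.3", "dst": "2.2.2.2", "src_port": 23, "dst_port": 22079,
     "proto": 6, "tos": 0, "input_if": "E0", "syn": 0, "bytes": 60},
]
traffic_cache = build_cache(packets, TRAFFIC_KEY)    # two flows (ports differ)
security_cache = build_cache(packets, SECURITY_KEY)  # one flow
```

Fewer key fields mean fewer, coarser cache entries, which is why a security-oriented monitor can afford to track many more hosts than a full 7-tuple traffic monitor.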
•  Flexible NetFlow Forwarding
Status field captures
forwarding (and drop reason)
for flow.
•  Drop Count increments on any
explicit drop by router
NetFlow Forwarding Status & Drop Count Fields
Network nodes are able to discover & validate RTP, TCP and IP-CBR traffic on a hop-by-hop basis
À la carte metric (loss, latency, jitter etc.) selections, applied on operator-selected sets of traffic
Allows for fault isolation and network span validation
Per-application thresholds and alerting.
Network Performance Monitor
Performance Monitor Information Elements
Media Monitoring
•  RTP SSRC
•  RTP Jitter (min/max/mean)
•  Transport Counter (expected/loss)
•  Media Counter (bytes/packets/rate)
•  Media Event
•  Collection interval
•  TCP MSS
•  TCP round-trip time
Application Response Time
•  CND - Client Network Delay (min/max/sum)
•  SND - Server Network Delay (min/max/sum)
•  ND - Network Delay (min/max/sum)
•  AD - Application Delay (min/max/sum)
•  Total Response Time (min/max/sum)
•  Total Transaction Time (min/max/sum)
•  Number of New Connections
•  Number of Late Responses
•  Number of Responses by Response Time (7-bucket histogram)
•  Number of Retransmissions
•  Number of Transactions
•  Client/Server Bytes
•  Client/Server Packets
Other Metrics
•  L3 counter (bytes/packets)
•  Flow event
•  Flow direction
•  Client and server address
•  Source and destination address
•  Transport information
•  Input and output interfaces
•  L3 information (TTL, DSCP, TOS, etc.)
•  Application information (from deep packet inspection tool)
•  Monitoring class hierarchy
NetFlow QoS Analysis
Cisco Prime Infra
LiveAction
flow 5-tuple, DPI/NBAR, DSCP, QoS processing
How is my flow being classified?
Did this QoS class drop traffic?
Dedicated Protocol Analyzers
•  Wireshark and other protocol analyzers are great
•  Detailed analysis for variety of protocols at deep level
•  Dedicated probes are expensive to deploy pervasively
•  Operator has to make difficult judgment calls on where the problem is going to be, before it
happens
•  Can be challenging after the fact: needs trained personnel on site.
Embedded Packet Capture & Analyze
•  Capture packets locally to buffer on router
•  Store to flash, USB, FTP, TFTP for analysis in protocol analyzer
•  Capture does not add traffic to network
LY-2851-8#monitor capture buffer pcap-buffer1 size 10000 max-size 1550
LY-2851-8#monitor capture point ip cef pcap-point1 g0/0 both
LY-2851-8#monitor capture point associate pcap-point1 pcap-buffer1
LY-2851-8#monitor capture point start pcap-point1
LY-2851-8#monitor capture point stop pcap-point1
LY-2851-8#monitor capture buffer pcap-buffer1 export ftp://10.17.0.252/images/test.cap
Gig0/0
iOAM6 (prototype)
•  Instrumented IPv6 extension header on user packets
•  vs. IPv4 record-route option header
•  v6 Ext Headers better designed
•  Domain level control
•  Minimal performance hit (handled in data plane)
•  Packets continue on regular path
•  Instrumentation
•  Packet sequence numbers => detect packet loss
•  Time stamps => one way delay
•  Node and ingress/egress interface names => path recording
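How the instrumentation fields turn into measurements can be sketched in a few lines. The example assumes the receiving side has collected, per packet, the sequence number and the send and receive timestamps, and that both clocks are synchronized (a prerequisite for one-way delay). The record layout and function names are assumptions for illustration and do not reflect the actual iOAM6 header format.

```python
def lost_sequence_numbers(received):
    """Sequence numbers missing between the lowest and highest one seen."""
    seen = set(received)
    return [seq for seq in range(min(seen), max(seen) + 1) if seq not in seen]

def one_way_delays(records):
    """Map seq -> delay from (seq, tx_timestamp_ms, rx_timestamp_ms) records."""
    return {seq: rx - tx for seq, tx, rx in records}

records = [(1, 1000, 1035), (2, 1010, 1046), (4, 1030, 1064), (5, 1040, 1077)]
missing = lost_sequence_numbers([seq for seq, _, _ in records])
delays = one_way_delays(records)
```

A gap in the sequence numbers only shows loss somewhere inside the instrumented domain; narrowing it to a hop needs the per-node path records as well.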
	
	
	
Network Element
Apps/Controller
v6 traffic
matrix
Live flow
tracing
Delay
distribution
Bi-casting
control
Loss matrix/
monitor
App data
monitoring
Enhanced Telemetry
Per-hop and end-to-end data added into
the packets of (selected) data traffic
Node-ID Ingress i/f egress i/f
Sequence# Timestamp App-Data
iOAM6 Path Trace
•  Extended Ping
H1#ping
Protocol [ip]: ipv6
Target IPv6 address: ::A:1:1:0:1D
Repeat count [5]: 1
Datagram size [100]: 300
Timeout in seconds [2]:
Extended commands? [no]: yes
Source address or interface: gig0/1
UDP protocol? [no]:
Verbose? [no]: yes
Precedence [0]:
DSCP [0]:
Include hop by hop Path Record option? [no]: yes
Sweep range of sizes? [no]:
Type escape sequence to abort.
Sending 1, 300-byte ICMP Echos to ::A:1:1:0:1D, timeout is 2 seconds:
(Gi0/1)R1(Gi0/2)----(Gi0/1)R4(Gi0/2)----(Gi0/2)R3(Gi0/3)----H3----(Gi0/3)R3(Gi0/2)----(Gi0/2)R4(Gi0/1)----(Gi0/2)R1(Gi0/1)
Reply to request 0 (35 ms)
Success rate is 100 percent (1/1), round-trip min/avg/max = 35/35/35 ms
H1 R1 R3
H3
::A:1:1:0:1D
R2
R4
V6 extension
header applied/
decapped
V6 extension
header applied/
decapped
End system ICMP
stack iOAM6 enabled
Control Plane
•  3Ws: When, where, and what
•  Change is normal, but some
changes are more interesting:
•  Single change that causes loss
of reachability or suboptimal
performance
•  Instability: high rate of change
Control Plane
Logging
•  Centrally: for ease of analysis and search
•  syslog-ng – preprocessing, relay and store(file/db)
•  Logstash (ELK), fluentd – multisource collection, storage and analysis
•  Locally: in case logs can’t get home
State of the Routing Table
•  Be familiar with normal behavior of important service prefixes
•  Establish quickly if problem is control plane or data plane
•  Check routing table/ ipRouteTable MIB / check ip traffic (Drop stats)
•  Track objects
#show ip route 192.168.2.2
Routing entry for 192.168.2.2/32
Known via "ospf 1", distance 110, metric 11, type intra area
Last update from 10.0.0.2 on FastEthernet0/0, 00:00:13 ago
Routing Descriptor Blocks:
* 10.0.0.2, from 2.2.2.2, 00:00:13 ago, via FastEthernet0/0
Route metric is 11, traffic share count is 1
•  Remember that OSPF data within an area
should be consistent
•  Understand ‘normal’ rate of changes
•  LSA refresh every 30 min unless there is a change
•  Track SPF runs over time
•  number of LSAs expected
•  OSPF-MIB: ospfSpfRuns,
ospfAreaLsaCount
•  Route missing?
•  Where is the network supposed to be
attached? Is it still?
•  check interface (on advertising router)
•  Check ospf database …
OSPF Area / AS-Wide
# show ip ospf
Routing Process "ospf 1" with ID 192.168.0.1
Start time: 00:01:46.195, Time elapsed: 00:48:27.308
Supports only single TOS(TOS0) routes
Supports opaque LSA
Supports Link-local Signaling (LLS)
Supports area transit capability
Supports NSSA (compatible with RFC 3101)
Supports Database Exchange Summary List Optimization (RFC 5243)
Event-log enabled, Maximum number of events: 1000, Mode: cyclic
Router is not originating router-LSAs with maximum metric
Initial SPF schedule delay 5000 msecs
Minimum hold time between two consecutive SPFs 10000 msecs
Maximum wait time between two consecutive SPFs 10000 msecs
Incremental-SPF disabled
Minimum LSA interval 5 secs
Minimum LSA arrival 1000 msecs
LSA group pacing timer 240 secs
Interface flood pacing timer 33 msecs
Retransmission pacing timer 66 msecs
Number of external LSA 0. Checksum Sum 0x000000
Number of opaque AS LSA 0. Checksum Sum 0x000000
Number of DCbitless external and opaque AS LSA 0
Number of DoNotAge external and opaque AS LSA 0
Number of areas in this router is 1. 1 normal 0 stub 0 nssa
Number of areas transit capable is 0
External flood list length 0
IETF NSF helper support enabled
Cisco NSF helper support enabled
Reference bandwidth unit is 100 mbps
Area BACKBONE(0)
Number of interfaces in this area is 4 (1 loopback)
Area has no authentication
SPF algorithm last executed 00:47:05.379 ago
SPF algorithm executed 4 times
Area ranges are
Number of LSA 16. Checksum Sum 0x078460
Number of opaque link LSA 0. Checksum Sum 0x000000
Number of DCbitless LSA 0
Number of indication LSA 0
Number of DoNotAge LSA 0
Flood list length 0
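Tracking SPF runs over time comes down to polling the run counter (shown as "SPF algorithm executed N times" above, or via the OSPF-MIB) and turning consecutive samples into a rate. The sketch below assumes samples are `(unix_seconds, spf_run_count)` pairs in time order and ignores counter resets; the threshold and function names are illustrative.

```python
def spf_run_rates(samples):
    """SPF runs per hour for each interval between consecutive samples."""
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        rates.append((c1 - c0) * 3600.0 / (t1 - t0))
    return rates

def unstable_intervals(samples, max_runs_per_hour):
    """Indexes of intervals whose SPF rate exceeds the expected maximum."""
    return [i for i, rate in enumerate(spf_run_rates(samples))
            if rate > max_runs_per_hour]

samples = [(0, 4), (1800, 4), (3600, 10)]  # polled every 30 minutes
rates = spf_run_rates(samples)
noisy = unstable_intervals(samples, max_runs_per_hour=5)
```

What counts as too many runs depends on the network; the useful part is having a baseline so that a change from normal stands out.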
OSPF Neighborships
•  neighbor adjacencies
•  Check ospf neighbor detail (OSPF-MIB: ospfNbrState, ospfNbrEvents, ospfNbrLsRetransQLen)
•  How many state changes occur?
•  What is the current state?
•  Any retransmission happening?
•  Check the interface queue
# show ip ospf neighbor detail
Neighbor 192.168.0.7, interface address 10.0.0.3
In the area 0 via interface GigabitEthernet0/1
Neighbor priority is 1, State is FULL, 6 state changes
DR is 10.0.0.3 BDR is 10.0.0.4
Options is 0x12 in Hello (E-bit, L-bit)
Options is 0x52 in DBD (E-bit, L-bit, O-bit)
LLS Options is 0x1 (LR)
Dead timer due in 00:00:39
Neighbor is up for 00:33:10
Index 2/2/2, retransmission queue length 0, number of retransmission 0
First 0x0(0)/0x0(0)/0x0(0) Next 0x0(0)/0x0(0)/0x0(0)
Last retransmission scan length is 0, maximum is 0
Last retransmission scan time is 0 msec, maximum is 0 msec
Neighbors
•  Logs will tell us why the neighbor is
bouncing, but what do they mean?
•  e.g. ‘peer restarted’ means you
have to ask the peer; it is the one
that restarted the session
Are the neighbors bouncing constantly?
Neighbor 10.1.1.1 (Ethernet0) is down: peer restarted
Neighbor 10.1.1.1 (Ethernet0) is up: new adjacency
Neighbor 10.1.1.1 (Ethernet0) is down: holding time expired
Neighbor 10.1.1.1 (Ethernet0) is down: retry limit exceeded
Others, but not often
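To answer whether neighbors are bouncing constantly, the log messages above can be counted per neighbor and per reason. This sketch assumes messages follow the "Neighbor <address> (<interface>) is up|down: <reason>" shape shown on this slide; real syslog lines carry extra prefixes that `re.search` skips over, and `flap_summary` is a made-up helper name.

```python
import re
from collections import Counter

LINE = re.compile(r"Neighbor (\S+) \((\S+)\) is (up|down): (.+)")

def flap_summary(log_lines):
    """Count 'down' events per (neighbor, interface) and per reason."""
    downs = Counter()
    reasons = Counter()
    for line in log_lines:
        match = LINE.search(line)
        if match and match.group(3) == "down":
            downs[(match.group(1), match.group(2))] += 1
            reasons[match.group(4).strip()] += 1
    return downs, reasons

log_lines = [
    "Neighbor 10.1.1.1 (Ethernet0) is down: peer restarted",
    "Neighbor 10.1.1.1 (Ethernet0) is up: new adjacency",
    "Neighbor 10.1.1.1 (Ethernet0) is down: holding time expired",
    "Neighbor 10.1.1.1 (Ethernet0) is down: retry limit exceeded",
]
downs, reasons = flap_summary(log_lines)
```

Grouping by reason helps decide where to look next: "peer restarted" points at the neighbor, while "holding time expired" points at the path or the local interface.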
BGP Monitoring Protocol (BMP) Overview
Collecting Pre-Policy BGP Messages
Adj-RIB-in (pre-inbound-filter)
BGP Monitor Protocol update
BMP collector
BMP client
Inbound
filtering
policing
Loc-RIB (post-inbound-filter)
iBGP update
BMP message
Adj-RIB-in (pre-inbound-filter)
eBGP update
BGP peers (external)
BGP peer
(internal)
•  IETF draft-ietf-grow-bmp-14
•  BMP client (router) provides pre-policy view of the ADJ-RIB-IN of a peer
•  Update messages from peer sent to BMP receiver
•  Example uses:
•  Realtime visualizer of BGP state
•  Traffic engineering analytics
•  BGP policy exploration
BGP Monitoring Protocol
OpenBMP
Historical record of prefix withdrawals
Current route views and peer status
http://www.openbmp.org
Getting Started
Be Prepared!
•  Be prepared and have data collection systems enabled
•  Enable passive monitoring on endpoints and network
•  Enable active tests
•  Helpdesk
•  Interview Script => establish & maintain checklists
•  Multi-group access to tools, logs, etc.
•  Firefighters run drills, so should your teams!
•  Be familiar with the tools and how they respond on your network
•  Red phone: Cross-domain teams (applications, UC, security, servers)
Expanding your Toolbox and Knowledge
•  Great open source tools to look at
•  Network topology & IP address management: netdot, GestióIP
•  Performance tests: iperf3
•  Service checks: Nagios Core, Zenoss Community
•  NetFlow / Log analysis: logstash, fluentd
•  Template driven config generation: ansible
Thank You
46

Mais conteĂşdo relacionado

Mais procurados

IPv6 and the DNS, RIPE 73
IPv6 and the DNS, RIPE 73IPv6 and the DNS, RIPE 73
IPv6 and the DNS, RIPE 73APNIC
 
IPv6 at FPT Telecom
IPv6 at FPT TelecomIPv6 at FPT Telecom
IPv6 at FPT TelecomAPNIC
 
The Next Generation Internet Number Registry Services
The Next Generation Internet Number Registry ServicesThe Next Generation Internet Number Registry Services
The Next Generation Internet Number Registry ServicesMyNOG
 
BGP: Whats so special about the number 512?
BGP: Whats so special about the number 512?BGP: Whats so special about the number 512?
BGP: Whats so special about the number 512?GeoffHuston
 
Introduction to RPKI - MyNOG
Introduction to RPKI - MyNOGIntroduction to RPKI - MyNOG
Introduction to RPKI - MyNOGSiena Perry
 
Internet Resource Transfer Policy: what can you learn from them?
Internet Resource Transfer Policy: what can you learn from them?Internet Resource Transfer Policy: what can you learn from them?
Internet Resource Transfer Policy: what can you learn from them?APNIC
 
28th TWNIC OPM and TWNOG 2017: Security best practices for network operators
28th TWNIC OPM and TWNOG 2017: Security best practices for network operators28th TWNIC OPM and TWNOG 2017: Security best practices for network operators
28th TWNIC OPM and TWNOG 2017: Security best practices for network operatorsAPNIC
 
Route Hijaking and the role of RPKI
Route Hijaking and the role of RPKIRoute Hijaking and the role of RPKI
Route Hijaking and the role of RPKIAPNIC
 
Applying IPv6 to LTE Networks
Applying IPv6 to LTE NetworksApplying IPv6 to LTE Networks
Applying IPv6 to LTE NetworksAPNIC
 
IPv6 Transition Strategies Tutorial, by Philip Smith [APNIC 38]
IPv6 Transition Strategies Tutorial, by Philip Smith [APNIC 38]IPv6 Transition Strategies Tutorial, by Philip Smith [APNIC 38]
IPv6 Transition Strategies Tutorial, by Philip Smith [APNIC 38]APNIC
 
More specific announcments in BGP
More specific announcments in BGPMore specific announcments in BGP
More specific announcments in BGPAPNIC
 
Scaling BGP
Scaling BGPScaling BGP
Scaling BGPAPNIC
 
VNIX-NOG 2021: IPv6 Deployment Update
VNIX-NOG 2021: IPv6 Deployment UpdateVNIX-NOG 2021: IPv6 Deployment Update
VNIX-NOG 2021: IPv6 Deployment UpdateAPNIC
 
Apnic IPv6 Deployment
Apnic IPv6 DeploymentApnic IPv6 Deployment
Apnic IPv6 DeploymentAPNIC
 
IPv6 Deployment: Why and Why not?
IPv6 Deployment: Why and Why not?IPv6 Deployment: Why and Why not?
IPv6 Deployment: Why and Why not?apnic_slides
 
BKNIX Peering Forum 2017: Community tools to fight DDoS
BKNIX Peering Forum 2017: Community tools to fight DDoSBKNIX Peering Forum 2017: Community tools to fight DDoS
BKNIX Peering Forum 2017: Community tools to fight DDoSAPNIC
 
IPv6 deployment at APNIC
IPv6 deployment at APNICIPv6 deployment at APNIC
IPv6 deployment at APNICAPNIC
 
Network Troubleshooting Techniques

  • 1. Network State Awareness and Troubleshooting Faraz Shamim Technical Leader Cisco Systems Inc
  • 2. Abstract •  Network state awareness and troubleshooting is a large and skilled part of operating a network. This session will cover basic network data plane troubleshooting best practices and techniques to plan for failures. We will also do demos and a review of the troubleshooting tool chain: NetFlow, perf-mon, CBQoS, ICMP/traceroute, and interface stats, while also touching on routing protocol (RP) stability (SPF runs, unstable neighbors, etc.) and SDN methodologies along the same lines.
  • 3. Agenda •  Troubleshooting Methodology •  Packet Forwarding Review •  Data Plane •  Active Monitoring •  Passive Flow Monitoring •  QoS •  Control Plane •  Logging •  Routing Protocol Stability •  Getting Started
  • 4. Keeping Focused: What This Session is About •  This session is about basic network troubleshooting, focusing on fault detection & isolation •  Mostly vendor neutral •  For context, we will cover some basic methodologies and functional elements of network behavior •  This session is NOT about •  Architectures of specific platforms •  Data Center technologies •  This is the 90 min tour. ;-)
  • 5. The Big Picture [Diagram: a Client and a Server connected through the network, with a Network Operator and an Application Operator. Speech bubbles: "Not happy", "It's not the network", "It's the network", "Is it Monday?", "Pings fine!", "Can't ping it.", "Internet's down.", "Somebody's downloading something. (?)"]
  • 6. Some More (network) Detail •  A lot of stuff going on •  Multiple networks •  Multiple applications •  Multiple layered services •  Misinformation / inconsistency [Diagram labels: Client (not happy), LAN, Enterprise WAN, ISP A, Internet, Enterprise DC, Server A, Server B, DNS, DHCP, 802.1x]
  • 7. … and it keeps on going •  Redundant paths / ECMP / LAG •  Overlays •  Load balancers •  Firewalls •  NATs [Diagram labels: Client (not happy), LAN, Enterprise WAN, ISP A, ISP B, Internet, Enterprise DC, Server A, Server B, DNS, DHCP, 802.1x]
  • 8. Why network state awareness? •  Quick detection of hard failures •  Early warning for •  soft failures •  performance issues •  and tomorrow's problems •  Faster problem resolution •  Greater confidence in the network by users and application operators
  • 9. Think Like a Network Detective [Cycle diagram: Find the Suspects, Question Suspects, Improve, Be Prepared]
  • 10. Control Plane & Data Plane •  Control Plane •  Processes variety of information sources and policies, creates routing information base (RIB) •  Best known intention w/o actual packet in hand •  Data Plane •  The actual forwarding process (might be SW or HW based) •  Granted some decision flexibility •  Driven by arriving packet details, traffic conditions etc. [Diagram labels: Control Plane; Data Plane; packet; Int A, Int B, Int C; control plane inputs: Routing Protocol(s), APIs, Statics, PfR, Gossip from other routers, Admin Edict; checks: Check routes, check L3 routing, Check policy, check forwarding, check policy-map int…, check interface, check flow monitor; Passive Measurements: ifmib, *Flow, CbQoS]
  • 11. Data Plane Decision Flexibility •  Control plane: condenses options driven by policies and (relatively) slower-moving, aggregated information, e.g. prefix reachability, interface state •  Data plane: responds to packet conditions •  Destination prefix to egress interface matching •  Multi-path (ECMP / LAG) member selection •  Interface congestion •  QoS class state •  Access lists •  Packet processing fields (TTL expiry, etc.) •  IPv4 fragmentation, etc.
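The "destination prefix to egress interface matching" step on this slide can be illustrated with a small longest-prefix-match sketch. This is only an assumption-laden illustration in Python: the table contents and the function name `longest_prefix_match` are hypothetical, and a real data plane would use a trie or hardware lookup rather than a linear scan.

```python
import ipaddress


def longest_prefix_match(fib, destination):
    """Return the egress interface of the most specific matching prefix, or None."""
    addr = ipaddress.ip_address(destination)
    best = None  # (network, interface) of the most specific match seen so far
    for prefix, interface in fib.items():
        network = ipaddress.ip_network(prefix)
        if addr in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, interface)
    return best[1] if best else None


# Hypothetical forwarding table: prefix -> egress interface
fib = {
    "0.0.0.0/0": "Int A",
    "10.0.0.0/8": "Int B",
    "10.1.0.0/16": "Int C",
}

for dst in ("10.1.2.3", "10.2.0.1", "192.0.2.1"):
    print(dst, "->", longest_prefix_match(fib, dst))
```

The control plane decides what goes into the table; the lookup itself only happens once a packet with a concrete destination address arrives.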
  • 12. Network as a System: Independent Decisions •  Each network device makes an independent forwarding decision •  Explicit local / domain policies •  Device perspective might not be symmetric •  Data plane flexibility •  Generally happens at WAN-edge and admin boundaries (traffic engineering) •  Asymmetric routing [Diagram: hosts A and B connected through routers R1 to R6, some in your network and some you don't control; one link is congested; R5 is doing ECMP hash]
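To make the "R5 is doing ECMP hash" note concrete, here is a minimal sketch of hash-based member selection. It is an assumption, not any vendor's algorithm: real devices use their own hash functions and field choices, and the link names and the helper `ecmp_member` are hypothetical.

```python
import hashlib


def ecmp_member(src_ip, dst_ip, proto, src_port, dst_port, members):
    """Pick one equal-cost member for a flow by hashing its 5-tuple.

    Illustrative only: SHA-256 stands in for whatever hash a real device uses.
    """
    key = f"{src_ip}|{dst_ip}|{proto}|{src_port}|{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:4], "big") % len(members)
    return members[index]


# Hypothetical equal-cost next hops on one router
members = ["link-1", "link-2"]

# The same flow always hashes to the same member...
forward = ecmp_member("10.0.0.1", "10.0.1.1", "tcp", 40000, 443, members)
# ...but the return direction is a different 5-tuple, hashed independently
# (and in practice by a different device), so it may take another member.
reverse = ecmp_member("10.0.1.1", "10.0.0.1", "tcp", 443, 40000, members)
print("forward:", forward, "reverse:", reverse)
```

This is why a single ping can succeed while some user flows fail: a probe exercises only the member its own 5-tuple hashes to, which is one reason the later slides suggest varying addresses and L4 ports.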
  • 13. Network State Awareness & troubleshooting Data Plane 13
  • 14. Network State Awareness & troubleshooting User / Agent Checks •  Treat network as a black box: are your beacon services working? •  Synthetic service check (HTTP, DNS, etc.) •  Ping (not all remotes will respond) •  Data plane is exercised and tested •  Variety = better coverage (multiple IP addresses / L4 ports per location) •  Validate similar treatment (QoS) as real user traffic •  Uptime and performance (loss, latency) metrics •  Look for patterns, changes from normal. All down vs some down. •  Capture and validate real user (human) incidents. What got missed? •  Use wisely: network and server resources consumed A B R1 R2 R5 R6 R3 14
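A synthetic service check of the kind described above could look like the following minimal sketch. It assumes a plain TCP connect is an acceptable stand-in for a service check; `tcp_probe` and `summarize` are hypothetical names, and the sample latencies are made up for illustration.

```python
import socket
import time

def tcp_probe(host, port, timeout=2.0):
    """Return TCP connect latency in ms, or None if the connection failed."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return (time.monotonic() - start) * 1000.0
    except OSError:
        return None

def summarize(results):
    """results: dict of target -> non-empty list of latencies (None = failed probe).

    Returns per-target loss and average latency, plus an "all down vs some down" scope.
    """
    summary = {}
    for target, samples in results.items():
        ok = [s for s in samples if s is not None]
        loss_pct = 100.0 * (len(samples) - len(ok)) / len(samples)
        avg_ms = sum(ok) / len(ok) if ok else None
        summary[target] = {"loss_pct": loss_pct, "avg_ms": avg_ms}
    down = [t for t, s in summary.items() if s["avg_ms"] is None]
    if not down:
        scope = "none down"
    elif len(down) == len(summary):
        scope = "all down"
    else:
        scope = "some down"
    return summary, scope

# Illustrative data; in practice each list would come from repeated tcp_probe() calls.
results = {"dns": [10.0, 12.0], "http": [None, None]}
summary, scope = summarize(results)
print(scope, summary)
```

Keeping the probe and the summary separate makes it easy to swap the TCP connect for an HTTP or DNS check while keeping the same loss and latency bookkeeping.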
  • 15. Network State Awareness & troubleshooting IP SLA* (RFC 6812): Synthetic Traffic Measurements •  Active generated traffic to measure the network, sent from an IP SLA source to a destination or IP SLA responder; results are exposed as MIB data •  Measurement metrics: latency, network jitter, distribution of stats, connectivity, packet loss •  Operations: FTP, DNS, DHCP, TCP, Jitter, ICMP, UDP, DLSW, HTTP, LDP, H.323, SIP, RTP •  Uses: network performance monitoring, service level agreement (SLA) monitoring, network assessment, Multiprotocol Label Switching (MPLS) monitoring, VoIP monitoring, availability, troubleshooting •  Where to run it •  IP SLA on router/switch (shadow router?) •  User end-system based agent software •  Dedicated agent •  *IP SLA can be replaced with the equivalent monitoring tools of other vendors, such as Juniper RPM
  • 16. Network State Awareness & troubleshooting Check interface •  Classic command •  Check interface ‘up’ status •  Stability: check log event or check routing table stability •  Monitor in/out bit/packet changes # show interface GigabitEthernet1 is up, line protocol is up Hardware is CSR vNIC, address is 000c.291a.7f97 (bia 000c.291a.7f97) Internet address is 192.168.225.130/24 MTU 1500 bytes, BW 1000000 Kbit/sec, DLY 10 usec, reliability 255/255, txload 1/255, rxload 1/255 Encapsulation ARPA, loopback not set Keepalive set (10 sec) Full Duplex, 1000Mbps, link type is auto, media type is RJ45 output flow-control is unsupported, input flow-control is unsupported ARP type: ARPA, ARP Timeout 04:00:00 Last input 00:05:35, output 00:09:58, output hang never Last clearing of "show interface" counters never Input queue: 0/375/0/0 (size/max/drops/flushes); Total output drops: 0 Queueing strategy: fifo Output queue: 0/40 (size/max) 5 minute input rate 0 bits/sec, 0 packets/sec 5 minute output rate 0 bits/sec, 0 packets/sec 25349 packets input, 2381158 bytes, 0 no buffer Received 0 broadcasts (0 IP multicasts) 0 runts, 0 giants, 0 throttles 0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored 0 watchdog, 0 multicast, 0 pause input 3958 packets output, 312408 bytes, 0 underruns 0 output errors, 0 collisions, 0 interface resets 56 unknown protocol drops 0 babbles, 0 late collision, 0 deferred 0 lost carrier, 0 no carrier, 0 pause output 0 output buffer failures, 0 output buffers swapped out
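Monitoring counter changes, as suggested above, amounts to taking two snapshots and comparing them. The sketch below pulls a few counters out of `show interface` style text with regular expressions; the patterns match the sample output on this slide and may need adjusting for other platforms. The names `parse_counters` and `counter_deltas` are hypothetical, and counter wrap is ignored.

```python
import re

# Patterns written against the sample output above; other platforms may differ.
COUNTER_PATTERNS = {
    "packets_in": r"(\d+) packets input",
    "packets_out": r"(\d+) packets output",
    "input_errors": r"(\d+) input errors",
    "crc": r"(\d+) CRC",
    "output_drops": r"Total output drops: (\d+)",
}

def parse_counters(text):
    """Extract the counters listed in COUNTER_PATTERNS from CLI text."""
    counters = {}
    for name, pattern in COUNTER_PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            counters[name] = int(match.group(1))
    return counters

def counter_deltas(before, after):
    """Difference between two snapshots (no handling of counter wrap or clears)."""
    return {name: after[name] - before[name] for name in before if name in after}

SAMPLE = (
    "Input queue: 0/375/0/0 (size/max/drops/flushes); Total output drops: 0 "
    "25349 packets input, 2381158 bytes, 0 no buffer "
    "0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored "
    "3958 packets output, 312408 bytes, 0 underruns"
)
before = parse_counters(SAMPLE)
after = dict(before, packets_in=25400, crc=3)   # illustrative second snapshot
print(counter_deltas(before, after))
```

A non-zero delta in `crc` or `input_errors` between two polls is usually more informative than the absolute value, since counters accumulate from the last clear.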
  • 17. Network State Awareness & troubleshooting traceroute •  Understand the limitations •  Sends 3 packets (default) at each TTL •  Implementations •  Linux/Cisco: UDP (ICMP and TCP-SYN are optional on Linux) •  UDP DST port # is used to keep track of packets and increments per packet. Initial = 33434 (default) •  SRC port #: randomized (Linux), incrementing per packet (Cisco IOS) •  Widest dispersion across path possibilities, but the output is difficult to understand •  Linux (GNU inetutils-traceroute) •  UDP DST port # increments per TTL (not per packet) •  SRC port is random but fixed for the entire run •  Narrower dispersion; the story might be misleading •  Windows: ICMP Echo request •  ICMP is frequently blocked •  Internet: aka the TCP/80 network
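The two UDP port schemes described above can be compared with a small sketch. It follows the slide's description only (base port 33434, three probes per TTL); the exact first port used by a given implementation may differ by one, and the function names are hypothetical.

```python
BASE_PORT = 33434        # default initial UDP destination port, per the slide
PROBES_PER_TTL = 3       # default number of probes per TTL

def dst_port_per_packet(ttl, probe, base=BASE_PORT, probes=PROBES_PER_TTL):
    """Destination port when the port increments with every probe.

    ttl starts at 1, probe index starts at 0.
    """
    return base + (ttl - 1) * probes + probe

def dst_port_per_ttl(ttl, base=BASE_PORT):
    """Destination port when the port increments once per TTL."""
    return base + (ttl - 1)

for ttl in (1, 2, 3):
    per_packet = [dst_port_per_packet(ttl, p) for p in range(PROBES_PER_TTL)]
    per_ttl = [dst_port_per_ttl(ttl)] * PROBES_PER_TTL
    # Distinct ports at one TTL mean distinct 5-tuples, so ECMP may spread the probes.
    print(ttl, per_packet, per_ttl)
```

With per-packet increments every probe at a TTL has a different 5-tuple, which is why that style shows more of the available paths and also why consecutive hops in its output may not belong to the same path.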
  • 18. Network State Awareness & troubleshooting Unix traceroute •  Multiple path options •  Topology ‘shortcuts’ (same router seen at diff hop) •  Ultimately all paths result in similar e2e delay $ traceroute 62.2.88.172 traceroute to 62.2.88.172 (62.2.88.172), 30 hops max, 60 byte packets 1 152.22.242.65 (152.22.242.65) 1.044 ms 1.371 ms 1.585 ms 2 152.22.240.8 (152.22.240.8) 0.219 ms 0.328 ms 0.327 ms 3 128.109.70.9 (128.109.70.9) 1.066 ms 1.059 ms 1.168 ms 4 rtp7600-gw-to-dep7600-gw2.ncren.net (128.109.70.137) 1.634 ms 1.628 ms 1.736 ms 5 rlasr-gw-link1-to-rtp7600-gw.ncren.net (128.109.9.17) 5.354 ms 5.446 ms 5.557 ms 6 128.109.9.117 (128.109.9.117) 5.671 ms 128.109.9.170 (128.109.9.170) 7.141 ms 128.109.9.117 (128.109.9.117) 5.433 ms 7 wscrs-gw-to-ws-a1a-ip-asr-gw-sec.ncren.net (128.109.1.105) 9.174 ms 128.109.1.209 (128.109.1.209) 8.256 ms 6.397 ms 8 dcp-brdr-03.inet.qwest.net (205.171.251.110) 18.414 ms chr-edge-03.inet.qwest.net (65.114.0.205) 27.353 ms 27.438 ms 9 dcp-brdr-03.inet.qwest.net (205.171.251.110) 21.739 ms 63-235-40-106.dia.static.qwest.net (63.235.40.106) 17.750 ms dcp-brdr-03.inet.qwest.net (205.171.251.110) 22.450 ms 10 63-235-40-106.dia.static.qwest.net (63.235.40.106) 22.531 ms 22.516 ms 84-116-130-173.aorta.net (84.116.130.173) 140.738 ms 11 nl-ams02a-rd1-te0-2-0-2.aorta.net (84.116.130.65) 140.831 ms 140.816 ms 84-116-130-173.aorta.net (84.116.130.173) 144.819 ms 12 nl-ams02a-rd1-te0-2-0-2.aorta.net (84.116.130.65) 144.074 ms 144.761 ms 84-116-130-58.aorta.net (84.116.130.58) 138.455 ms 13 84-116-130-58.aorta.net (84.116.130.58) 141.844 ms 141.924 ms 142.459 ms 14 84.116.204.234 (84.116.204.234) 145.603 ms 145.891 ms 145.987 ms 15 * * * 16 62-2-88-172.static.cablecom.ch (62.2.88.172) 268.281 ms 268.245 ms 268.176 ms Responder pattern per hop (one letter per distinct responder, one letter per probe): 1 AAA, 2 BBB, 3 CCC, 4 DDD, 5 EEE, 6 FGF, 7 HII, 8 JKK (+10 ms, unsustained), 9 JLJ, 10 LLM (+120 ms, sustained: Atlantic crossing), 11 NNM, 12 NNO, 13 PPP, 14 QQQ, 15 *** (filter), 16 RRR (~268 ms on all three probes, > 100 ms of added delay) Reference
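Telling a sustained delay increase from a one-off spike in traceroute output can be sketched as follows. The rule used here (median RTT per hop, and a jump only counts if every later answering hop stays above the threshold) is an assumption chosen for illustration, not the presenter's method; `sustained_jumps` is a hypothetical name and the RTT values are taken from hops 9 to 16 of the output above.

```python
from statistics import median

def sustained_jumps(hops, threshold_ms):
    """hops: list of (hop_number, [rtt_ms, ...]); unanswered hops use an empty list.

    Returns (hop_number, increase_ms) for each hop where the median RTT rises by
    more than threshold_ms and stays that high on all later answering hops.
    """
    answered = [(number, median(rtts)) for number, rtts in hops if rtts]
    jumps = []
    for i in range(1, len(answered)):
        previous_rtt = answered[i - 1][1]
        hop, rtt = answered[i]
        later = [r for _, r in answered[i:]]
        if rtt - previous_rtt > threshold_ms and min(later) - previous_rtt > threshold_ms:
            jumps.append((hop, round(rtt - previous_rtt, 3)))
    return jumps

hops = [
    (9, [21.739, 17.750, 22.450]),
    (10, [22.531, 22.516, 140.738]),
    (11, [140.831, 140.816, 144.819]),
    (12, [144.074, 144.761, 138.455]),
    (13, [141.844, 141.924, 142.459]),
    (14, [145.603, 145.891, 145.987]),
    (15, []),                                  # * * *
    (16, [268.281, 268.245, 268.176]),
]
jumps = sustained_jumps(hops, threshold_ms=100)
print(jumps)
```

Because three probes at one TTL can take different paths, the median smooths out a single probe that happened to cross the slow link one hop early.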
  • 19. Network State Awareness & troubleshooting Unix inetutils traceroute •  Narrower view (no alternate paths directly seen) •  Repeating nodes suggests multipath, or (unlikely) routing issue 19 $ inetutils-traceroute --resolve-hostname 62.2.88.172 traceroute to 62.2.88.172 (62.2.88.172), 64 hops max 1 152.22.242.65 (152.22.242.65) 0.783ms 0.727ms 0.798ms 2 152.22.240.8 (152.22.240.8) 0.226ms 0.228ms 0.221ms 3 128.109.70.9 (128.109.70.9) 0.967ms 0.980ms 0.962ms 4 128.109.70.137 (rtp7600-gw-to-dep7600-gw2.ncren.net) 1.576ms 1.598ms 1.567ms 5 128.109.9.17 (rlasr-gw-link1-to-rtp7600-gw.ncren.net) 5.149ms 5.140ms 5.126ms 6 128.109.9.166 (128.109.9.166) 7.113ms 7.098ms 7.306ms 7 128.109.1.209 (128.109.1.209) 7.835ms 8.326ms 7.958ms 8 65.114.0.205 (chr-edge-03.inet.qwest.net) 19.944ms 9.299ms 40.372ms 9 63.235.40.106 (63-235-40-106.dia.static.qwest.net) 18.442ms 18.412ms 18.432ms 10 63.235.40.106 (63-235-40-106.dia.static.qwest.net) 22.424ms 22.391ms 75.960ms 11 84.116.130.173 (84-116-130-173.aorta.net) 145.434ms 146.301ms 145.445ms 12 84.116.130.58 (84-116-130-58.aorta.net) 137.583ms 137.556ms 137.661ms 13 84.116.130.58 (84-116-130-58.aorta.net) 142.476ms 141.886ms 141.819ms 14 84.116.204.234 (84.116.204.234) 144.841ms 145.034ms 144.964ms 15 * * * 16 62.2.88.172 (62-2-88-172.static.cablecom.ch) 287.318ms 176.670ms 254.237ms Packets for hop 9,12 took a ‘shortcut’ and packets for hop 10,13 went long way Reference
  • 20. Network State Awareness & troubleshooting lft •  lft ‘layer 4 traceroute’ dynamically adjusts to responses •  Firewall detection, whois and AS lookup integrated •  Narrower packet changes, so narrower multi-path 20 $ sudo lft -ENA 62.2.88.172 Tracing ________________________________________________________________. TTL LFT trace to 62-2-88-172.static.cablecom.ch (62.2.88.172):80/tcp 1 [AS81] [NCREN-B22] 152.22.242.65 20.1/17.2ms 2 [AS81] [NCREN-B22] 152.22.240.8 20.1/20.1ms 3 [AS81] [CONCERT] 128.109.70.9 20.1/20.1ms 4 [AS81] [CONCERT] rtp7600-gw-to-dep7600-gw2.ncren.net (128.109.70.137) 20.1/20.1ms 5 [AS81] [CONCERT] rlasr-gw-link1-to-rtp7600-gw.ncren.net (128.109.9.17) 20.1/20.1ms 6 [AS81] [CONCERT] 128.109.9.117 20.1/20.1ms 7 [AS209] [unknown] chr-edge-03.inet.qwest.net (65.121.156.209) 20.1/19.5ms 8 [AS209] [QWEST-INET-35] dcp-brdr-03.inet.qwest.net (205.171.251.110) 20.1/18.4ms 9 [AS209] [QWEST-INET-17] 63-235-40-106.dia.static.qwest.net (63.235.40.106) 20.1/60.3ms 10 [AS6830] [84-RIPE/LGI-Infrastructure] 84-116-130-173.aorta.net (84.116.130.173) 160.7/160.7ms 11 [AS6830] [84-RIPE/LGI-Infrastructure] nl-ams02a-rd1-te0-2-0-2.aorta.net (84.116.130.65) 160.7/160.7ms 12 [AS6830] [84-RIPE/LGI-Infrastructure] 84-116-130-58.aorta.net (84.116.130.58) 140.6/140.6ms ** [firewall] the next gateway may statefully inspect packets 13 [AS6830] [84-RIPE/LGI-Infrastructure] 84.116.204.234 160.7/160.6ms ** [neglected] no reply packets received from TTL 14 15 * [AS6830] [RIPE-C3/CC-HO841-NET] [target] 62-2-88-172.static.cablecom.ch (62.2.88.172):80 160.7ms Used tcp/80 SYN Reference
  • 21. Network State Awareness & troubleshooting mtr •  Interactive combined traceroute and ping •  Gives a sense of health of path (loss, delay Standard Deviation) •  Narrow path view 21 Reference $ mtr 62.2.88.172 aakhter-nlr-ubuntu-01 (0.0.0.0) Sat May 30 18:57:09 2015 Keys: Help Display mode Restart statistics Order of fields quit Packets Pings Host Loss% Snt Last Avg Best Wrst StDev 1. 152.22.242.65 0.0% 145 0.8 0.9 0.7 10.0 0.8 2. 152.22.240.8 0.0% 145 0.3 0.2 0.2 0.3 0.0 3. 128.109.70.9 0.0% 145 1.0 3.3 1.0 182.3 17.2 4. rtp7600-gw-to-dep7600-gw2.ncren.net 1.0% 145 9.2 4.1 1.6 203.4 18.6 5. rlasr-gw-link1-to-rtp7600-gw.ncren.net 0.0% 145 5.3 5.3 5.1 6.8 0.2 6. 128.109.9.166 0.0% 145 7.1 7.3 7.1 16.1 0.8 7. wscrs-gw-to-ws-a1a-ip-asr-gw-sec.ncren.net 0.0% 145 6.8 8.3 6.2 10.6 1.0 8. chr-edge-03.inet.qwest.net 0.0% 145 9.4 12.3 9.3 62.1 9.5 9. dcp-brdr-03.inet.qwest.net 0.0% 145 21.8 22.8 21.7 70.7 5.5 10. 63-235-40-106.dia.static.qwest.net 0.0% 145 21.8 24.5 21.7 86.1 10.6 11. 84-116-130-173.aorta.net 0.0% 145 144.8 145.0 144.7 152.9 1.0 12. nl-ams02a-rd1-te0-2-0-2.aorta.net 0.0% 145 144.1 145.5 144.0 165.4 3.7 13. 84-116-130-58.aorta.net 5.0% 144 142.9 142.3 142.0 145.6 0.4 14. 84.116.204.234 5.0% 144 145.1 145.1 144.9 145.3 0.0 15. 217-168-62-150.static.cablecom.ch 5.0% 144 145.9 146.1 145.2 164.3 1.9 16. 62-2-88-172.static.cablecom.ch 5.0% 144 313.0 260.3 152.6 508.0 80.0 Note variability, probably just the end system Just local noise, no carry over to later hops Sustained loss. Likely something wrong 12->13, or way back
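The mtr reading rule noted above (loss at one hop that does not carry over is local noise; loss that persists to the final hop is a real problem) can be sketched like this. The function name and the 1% floor are assumptions; the loss column is copied from the sample output.

```python
def first_sustained_loss(loss_by_hop, min_loss_pct=1.0):
    """loss_by_hop: loss percentages ordered from hop 1 to the final hop.

    Returns the 1-based hop where loss begins and carries through to the
    final hop, or None if no loss persists to the end.
    """
    start = None
    for hop, loss in enumerate(loss_by_hop, start=1):
        if loss >= min_loss_pct:
            if start is None:
                start = hop
        else:
            start = None      # loss did not carry over, treat earlier loss as noise
    return start

# Loss% column from the mtr sample: hop 4 shows 1.0%, hops 13 to 16 show 5.0%.
loss = [0, 0, 0, 1.0, 0, 0, 0, 0, 0, 0, 0, 0, 5.0, 5.0, 5.0, 5.0]
print(first_sustained_loss(loss))
```

Loss that appears only at an intermediate hop often reflects that router rate-limiting its own ICMP replies rather than dropping transit traffic.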
  • 22. Network State Awareness & troubleshooting Follow the Flow with NetFlow (RFC 3954) •  Per node: data plane observations and decisions captured •  Src/dst MAC, IP, port numbers, DSCP values, in/out interfaces, etc. •  Network view: flows analyzed centrally by a NetFlow collector/analyzer •  Biggest value: strategically placed partial views (e.g. WAN edge) [Diagram labels: hosts A and B, routers R1 to R6, NetFlow Collector, LiveAction]
  • 23. Network State Awareness & troubleshooting •  Developed and patented at Cisco Systems in 1996 •  NetFlow is the de facto standard for acquiring IP operational data •  Standardized in IETF via IPFIX •  Provides network and security monitoring, network planning, traffic analysis, and IP accounting •  Packet capture is like a wire tap •  NetFlow is like a phone bill NetFlow(RFC 3954)—What Is It? Network World Article—NetFlow Adoption on the Rise http://www.networkworld.com/newsletters/nsm/2005/0314nsm1.html 23
  • 24. Network State Awareness & troubleshooting Flexible NetFlow: Multiple Monitors with Unique Key Fields •  Flow Monitor 1 (Traffic Analysis Cache) •  Key fields (packet 1): Source IP 3.3.3.3, Destination IP 2.2.2.2, Source Port 23, Destination Port 22078, Layer 3 Protocol TCP (6), TOS Byte 0, Input Interface Ethernet 0 •  Non-key fields: Packets, Bytes, Timestamps, Next Hop Address •  Cache entry: Src. IP 3.3.3.3, Dest. IP 2.2.2.2, Source Port 23, Dest. Port 22078, Protocol 6, TOS 0, Input I/F E0, …, Pkts 1100 •  Flow Monitor 2 (Security Analysis Cache) •  Key fields (packet 1): Source IP 3.3.3.3, Dest IP 2.2.2.2, Input Interface Ethernet 0, SYN Flag 0 •  Non-key fields: Packets, Timestamps •  Cache entry: Source IP 3.3.3.3, Dest. IP 2.2.2.2, Input I/F E0, Flag 0, …, Pkts 11000
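The idea of several monitors with different key fields can be sketched as a dictionary keyed by a tuple of the chosen fields, with non-key fields kept as counters. This is a toy model, not how a router implements its flow cache; the field names and `build_cache` are assumptions made for the example.

```python
from collections import defaultdict

TRAFFIC_KEYS = ("src_ip", "dst_ip", "src_port", "dst_port", "protocol", "tos", "input_if")
SECURITY_KEYS = ("src_ip", "dst_ip", "input_if", "syn_flag")

def build_cache(packets, key_fields):
    """Aggregate packets into flows: key fields define the flow, counters are non-key."""
    cache = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for pkt in packets:
        key = tuple(pkt[field] for field in key_fields)
        cache[key]["packets"] += 1
        cache[key]["bytes"] += pkt["length"]
    return dict(cache)

base = {"src_ip": "3.3.3.3", "dst_ip": "2.2.2.2", "src_port": 23, "dst_port": 22078,
        "protocol": 6, "tos": 0, "input_if": "Ethernet0"}
packets = [
    dict(base, syn_flag=1, length=60),      # illustrative packet sizes
    dict(base, syn_flag=0, length=1500),
]
traffic_cache = build_cache(packets, TRAFFIC_KEYS)
security_cache = build_cache(packets, SECURITY_KEYS)
# Same packets, different keys: one traffic flow, two security flows.
print(len(traffic_cache), len(security_cache))
```

Choosing fewer or different key fields changes what counts as "one flow", which is how one packet stream can feed both a traffic view and a security view.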
  • 25. Network State Awareness & troubleshooting •  Flexible NetFlow Forwarding Status field captures forwarding (and drop reason) for flow. •  Drop Count increments on any explicit drop by router NetFlow Forwarding Status & Drop Count Fields 25
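A collector-side decode of the forwarding status field might look like the sketch below. It assumes the one-byte IPFIX `forwardingStatus` layout (information element 89), where the top two bits carry the status and the low six bits carry a reason code; check the exporter's documentation before relying on it. `decode_forwarding_status` is a hypothetical helper and the reason code is returned as a number rather than mapped to a name.

```python
# Status values assumed from the IPFIX forwardingStatus definition (top two bits).
STATUS_NAMES = {0: "unknown", 1: "forwarded", 2: "dropped", 3: "consumed"}

def decode_forwarding_status(value):
    """Split a one-byte forwarding status into (status_name, reason_code)."""
    if not 0 <= value <= 255:
        raise ValueError("forwarding status is a single byte")
    status = value >> 6          # top two bits
    reason = value & 0x3F        # low six bits
    return STATUS_NAMES[status], reason

print(decode_forwarding_status(64))    # forwarded, reason 0
print(decode_forwarding_status(129))   # dropped, reason 1
```

Filtering exported flows on the "dropped" status is a quick way to find which router on a path made the drop decision.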
  • 26. Network State Awareness & troubleshooting Network Performance Monitor •  Network nodes are able to discover and validate RTP, TCP and IP-CBR traffic on a hop-by-hop basis •  À la carte metric selections (loss, latency, jitter, etc.), applied to operator-selected sets of traffic •  Allows for fault isolation and network span validation •  Per-application thresholds and alerting
  • 27. Network State Awareness & troubleshooting Performance Monitor Information Elements •  Media Monitoring •  RTP SSRC •  RTP Jitter (min/max/mean) •  Transport Counter (expected/loss) •  Media Counter (bytes/packets/rate) •  Media Event •  Collection interval •  TCP MSS •  TCP round-trip time •  Application Response Time •  CND: Client Network Delay (min/max/sum) •  SND: Server Network Delay (min/max/sum) •  ND: Network Delay (min/max/sum) •  AD: Application Delay (min/max/sum) •  Total Response Time (min/max/sum) •  Total Transaction Time (min/max/sum) •  Number of New Connections •  Number of Late Responses •  Number of Responses by Response Time (7-bucket histogram) •  Number of Retransmissions •  Number of Transactions •  Client/Server Bytes •  Client/Server Packets •  Other Metrics •  L3 counter (bytes/packets) •  Flow event •  Flow direction •  Client and server address •  Source and destination address •  Transport information •  Input and output interfaces •  L3 information (TTL, DSCP, TOS, etc.) •  Application information (from deep packet inspection tool) •  Monitoring class hierarchy
  • 28. Network State Awareness & troubleshooting NetFlow QoS Analysis •  How is my flow being classified? •  Did this QoS class drop traffic? [Diagram labels: flow 5-tuple, DPI/NBAR, DSCP, QoS processing; tools shown: Cisco Prime Infra, LiveAction]
  • 29. Network State Awareness & troubleshooting Dedicated Protocol Analyzers •  Wireshark and other protocol analyzers are great •  Detailed analysis for a variety of protocols at a deep level •  Dedicated probes are expensive to deploy pervasively •  Operator has to make difficult judgment calls on where the problem is going to be, before it happens •  Can be challenging after the fact: needs trained personnel on site
  • 30. Network State Awareness & troubleshooting Embedded Packet Capture & Analyze •  Capture packets locally to buffer on router •  Store to flash, USB, FTP, TFTP for analysis in protocol analyzer •  Capture does not add traffic to network LY-2851-8#monitor capture buffer pcap-buffer1 size 10000 max-size 1550 LY-2851-8#monitor capture point ip cef pcap-point1 g0/0 both LY-2851-8#monitor capture point associate pcap-point1 pcap-buffer1 LY-2851-8#monitor capture point start pcap-point1 LY-2851-8#monitor capture point stop pcap-point1 LY-2851-8#monitor capture buffer pcap-buffer1 export ftp://10.17.0.252/images/test.cap Gig0/0
  • 31. Network State Awareness & troubleshooting iOAM6 (prototype) •  Instrumented IPv6 extension header on user packets •  vs. IPv4 record-route option header •  v6 extension headers are better designed •  Domain level control •  Minimal performance hit (handled in data plane) •  Packets continue on regular path •  Instrumentation •  Packet sequence numbers => detect packet loss •  Time stamps => one-way delay •  Node and ingress/egress interface names => path recording •  Enhanced telemetry: per-hop and end-to-end data added into (selected) data traffic packets: Node-ID, ingress i/f, egress i/f, Sequence#, Timestamp, App-Data •  Apps/Controller uses fed by the network element: v6 traffic matrix, live flow tracing, delay distribution, bi-casting control, loss matrix/monitor, app data monitoring
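How sequence numbers and timestamps turn into loss and one-way delay can be shown with a short sketch. It assumes the receiver sees a list of (sequence number, send time, receive time) records and that sender and receiver clocks are synchronized, which one-way delay requires; the record layout and `loss_and_delay` are hypothetical and not the iOAM6 wire format.

```python
def loss_and_delay(records):
    """records: (sequence_number, tx_time_s, rx_time_s) for packets that arrived.

    Returns (lost_packets, one_way_delays_ms). Loss is inferred from gaps in
    the sequence numbers between the first and last packet seen.
    """
    if not records:
        return 0, []
    seqs = sorted({seq for seq, _, _ in records})
    expected = seqs[-1] - seqs[0] + 1
    lost = expected - len(seqs)
    delays_ms = [(rx - tx) * 1000.0 for _, tx, rx in records]
    return lost, delays_ms

records = [(1, 0.000, 0.010), (2, 0.020, 0.031), (4, 0.060, 0.072)]  # seq 3 missing
lost, delays = loss_and_delay(records)
print(lost, [round(d, 1) for d in delays])
```

Losses before the first or after the last received packet are invisible to this approach, so it works best on continuous flows.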
  • 32. Network State Awareness & troubleshooting iOAM6 Path Trace •  Extended Ping H1#ping Protocol [ip]: ipv6 Target IPv6 address: ::A:1:1:0:1D Repeat count [5]: 1 Datagram size [100]: 300 Timeout in seconds [2]: Extended commands? [no]: yes Source address or interface: gig0/1 UDP protocol? [no]: Verbose? [no]: yes Precedence [0]: DSCP [0]: Include hop by hop Path Record option? [no]: yes Sweep range of sizes? [no]: Type escape sequence to abort. Sending 1, 300-byte ICMP Echos to ::A:1:1:0:1D, timeout is 2 seconds: (Gi0/1)R1(Gi0/2)----(Gi0/1)R4(Gi0/2)----(Gi0/2)R3(Gi0/3)----H3----(Gi0/3)R3(Gi0/2)----(Gi0/2)R4(Gi0/1)----(Gi0/2)R1(Gi0/1) Reply to request 0 (35 ms) Success rate is 100 percent (1/1), round-trip min/avg/max = 35/35/35 ms H1 R1 R3 H3 ::A:1:1:0:1D R2 R4 32 V6 extension header applied/ decapped V6 extension header applied/ decapped End system ICMP stack iOAM6 enabled
  • 33. Network State Awareness & troubleshooting Control Plane 33
  • 34. Network State Awareness & troubleshooting •  3Ws: When, where, and what •  Change is normal, but some changes are more interesting: •  Single change that causes loss of reachability or suboptimal performance •  Instability: high rate of change Control Plane 34
  • 35. Network State Awareness & troubleshooting Logging •  Centrally: for ease of analysis and search •  syslog-ng: preprocessing, relay and store (file/db) •  Logstash (ELK), fluentd: multisource collection, storage and analysis •  Locally: in case logs can’t get home
  • 36. Network State Awareness & troubleshooting State of the Routing Table •  Be familiar with normal behavior of important service prefixes •  Establish quickly if problem is control plane or data plane •  Check routing table/ ipRouteTable MIB / check ip traffic (Drop stats) •  Track objects 36 #show ip route 192.168.2.2 Routing entry for 192.168.2.2/32 Known via "ospf 1", distance 110, metric 11, type intra area Last update from 10.0.0.2 on FastEthernet0/0, 00:00:13 ago Routing Descriptor Blocks: * 10.0.0.2, from 2.2.2.2, 00:00:13 ago, via FastEthernet0/0 Route metric is 11, traffic share count is 1
  • 37. Network State Awareness & troubleshooting OSPF Area / AS-Wide •  Remember that OSPF data within an area should be consistent •  Understand the ‘normal’ rate of change •  LSAs refresh every 30 minutes unless there is a change •  Track SPF runs over time •  Know the number of LSAs expected •  OSPF-MIB: ospfSpfRuns, ospfAreaLsaCount •  Route missing? •  Where is the network supposed to be attached? Is it still? •  Check interface (on advertising router) •  Check ospf database … # show ip ospf Routing Process "ospf 1" with ID 192.168.0.1 Start time: 00:01:46.195, Time elapsed: 00:48:27.308 Supports only single TOS(TOS0) routes Supports opaque LSA Supports Link-local Signaling (LLS) Supports area transit capability Supports NSSA (compatible with RFC 3101) Supports Database Exchange Summary List Optimization (RFC 5243) Event-log enabled, Maximum number of events: 1000, Mode: cyclic Router is not originating router-LSAs with maximum metric Initial SPF schedule delay 5000 msecs Minimum hold time between two consecutive SPFs 10000 msecs Maximum wait time between two consecutive SPFs 10000 msecs Incremental-SPF disabled Minimum LSA interval 5 secs Minimum LSA arrival 1000 msecs LSA group pacing timer 240 secs Interface flood pacing timer 33 msecs Retransmission pacing timer 66 msecs Number of external LSA 0. Checksum Sum 0x000000 Number of opaque AS LSA 0. Checksum Sum 0x000000 Number of DCbitless external and opaque AS LSA 0 Number of DoNotAge external and opaque AS LSA 0 Number of areas in this router is 1. 1 normal 0 stub 0 nssa Number of areas transit capable is 0 External flood list length 0 IETF NSF helper support enabled Cisco NSF helper support enabled Reference bandwidth unit is 100 mbps Area BACKBONE(0) Number of interfaces in this area is 4 (1 loopback) Area has no authentication SPF algorithm last executed 00:47:05.379 ago SPF algorithm executed 4 times Area ranges are Number of LSA 16. Checksum Sum 0x078460 Number of opaque link LSA 0. Checksum Sum 0x000000 Number of DCbitless LSA 0 Number of indication LSA 0 Number of DoNotAge LSA 0 Flood list length 0
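Tracking SPF runs over time, as recommended above, can be sketched as a rate calculation over periodic samples of the SPF run counter (for example polled from the OSPF-MIB or scraped from "SPF algorithm executed N times"). How the samples are collected is left out; the function names and the threshold are assumptions for illustration.

```python
def spf_run_rates(samples):
    """samples: list of (unix_time_s, spf_run_counter), oldest first.

    Returns (sample_time, runs_per_hour) for each interval between samples.
    """
    rates = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        hours = (t1 - t0) / 3600.0
        if hours <= 0 or c1 < c0:
            continue          # skip bad timestamps and counter resets (e.g. reload)
        rates.append((t1, (c1 - c0) / hours))
    return rates

def unstable_intervals(samples, max_runs_per_hour):
    """Sample times at which the SPF rate exceeded what is normal for this network."""
    return [t for t, rate in spf_run_rates(samples) if rate > max_runs_per_hour]

# Illustrative samples: quiet for an hour, then 30 SPF runs in the next hour.
samples = [(0, 4), (3600, 4), (7200, 34)]
print(spf_run_rates(samples))
print(unstable_intervals(samples, max_runs_per_hour=10))
```

The useful threshold is network-specific, which is why the slide stresses knowing the normal rate of change first.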
  • 38. Network State Awareness & troubleshooting OSPF Neighborships •  neighbor adjacencies •  Check ospf neighbor detail (OSPF-MIB: ospfNbrState, ospfNbrEvents, ospfNbrLSRetransQLen) •  How many state changes occur? •  What is the current state? •  Any retransmission happening? •  Check the interface queue 38 # show ip ospf neighbor detail Neighbor 192.168.0.7, interface address 10.0.0.3 In the area 0 via interface GigabitEthernet0/1 Neighbor priority is 1, State is FULL, 6 state changes DR is 10.0.0.3 BDR is 10.0.0.4 Options is 0x12 in Hello (E-bit, L-bit) Options is 0x52 in DBD (E-bit, L-bit, O-bit) LLS Options is 0x1 (LR) Dead timer due in 00:00:39 Neighbor is up for 00:33:10 Index 2/2/2, retransmission queue length 0, number of retransmission 0 First 0x0(0)/0x0(0)/0x0(0) Next 0x0(0)/0x0(0)/0x0(0) Last retransmission scan length is 0, maximum is 0 Last retransmission scan time is 0 msec, maximum is 0 msec
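The questions above (current state, number of state changes, retransmissions) can be answered from the CLI text with a small parser. The regular expressions are written against the sample output on this slide and are assumptions for other software versions; `parse_neighbor_detail` is a hypothetical name.

```python
import re

NEIGHBOR_RE = re.compile(r"Neighbor (\d+\.\d+\.\d+\.\d+), interface address (\S+)")

def parse_neighbor_detail(text):
    """Extract state, state-change count and retransmission queue per neighbor."""
    starts = list(NEIGHBOR_RE.finditer(text))
    neighbors = []
    for i, match in enumerate(starts):
        end = starts[i + 1].start() if i + 1 < len(starts) else len(text)
        block = text[match.start():end]
        state = re.search(r"State is (\w+), (\d+) state changes", block)
        retrans = re.search(r"retransmission queue length (\d+)", block)
        neighbors.append({
            "router_id": match.group(1),
            "address": match.group(2),
            "state": state.group(1) if state else None,
            "state_changes": int(state.group(2)) if state else None,
            "retrans_queue": int(retrans.group(1)) if retrans else None,
        })
    return neighbors

SAMPLE = (
    "Neighbor 192.168.0.7, interface address 10.0.0.3 "
    "In the area 0 via interface GigabitEthernet0/1 "
    "Neighbor priority is 1, State is FULL, 6 state changes "
    "Neighbor is up for 00:33:10 "
    "Index 2/2/2, retransmission queue length 0, number of retransmission 0"
)
neighbors = parse_neighbor_detail(SAMPLE)
print(neighbors)
```

Comparing `state_changes` between two polls shows whether an adjacency that is currently FULL has been flapping in the meantime.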
  • 39. Network State Awareness & troubleshooting Neighbors •  Are the neighbors bouncing constantly? •  Logs will tell us why the neighbor is bouncing, but what do they mean? •  e.g. ‘peer restarted’ means you have to ask the peer; it is the one that restarted the session •  Log examples: Neighbor 10.1.1.1 (Ethernet0) is down: peer restarted; Neighbor 10.1.1.1 (Ethernet0) is up: new adjacency; Neighbor 10.1.1.1 (Ethernet0) is down: holding time expired; Neighbor 10.1.1.1 (Ethernet0) is down: retry limit exceeded •  There are others, but they are not seen often
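Answering "are the neighbors bouncing constantly?" from logs can be sketched by counting down events and their reasons per neighbor. The pattern matches the message format shown on this slide; real syslog lines carry timestamps and facility prefixes, which `re.search` simply skips over. `flap_summary` is a hypothetical name.

```python
import re
from collections import Counter

EVENT_RE = re.compile(r"Neighbor (\S+) \((\S+)\) is (up|down): (.+)")

def flap_summary(lines):
    """Count 'down' events per neighbor and tally the stated reasons."""
    downs = Counter()
    reasons = {}
    for line in lines:
        match = EVENT_RE.search(line)
        if not match or match.group(3) != "down":
            continue
        neighbor = match.group(1)
        downs[neighbor] += 1
        reasons.setdefault(neighbor, Counter())[match.group(4).strip()] += 1
    return downs, reasons

log_lines = [
    "Neighbor 10.1.1.1 (Ethernet0) is down: peer restarted",
    "Neighbor 10.1.1.1 (Ethernet0) is up: new adjacency",
    "Neighbor 10.1.1.1 (Ethernet0) is down: holding time expired",
    "Neighbor 10.1.1.1 (Ethernet0) is down: retry limit exceeded",
]
downs, reasons = flap_summary(log_lines)
print(downs["10.1.1.1"], dict(reasons["10.1.1.1"]))
```

The reason tally matters as much as the count: a run of "peer restarted" points at the other router, while "holding time expired" points at the path between the two.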
  • 40. Network State Awareness & troubleshooting BGP Monitoring Protocol (BMP) Overview Collecting Pre-Policy BGP Messages Adj-RIB-in (pre-inbound-filter) BGP Monitor Protocol update BMP collector BMP client Inbound filtering policing Loc-RIB (post-inbound-filter) iBGP update BMP message Adj-RIB-in (pre-inbound-filter) eBGP update BGP peer’s (external) BGP peer (internal) 40
  • 41. Network State Awareness & troubleshooting •  IETF draft-ietf-grow-bmp-14 •  BMP client (router) provides pre-policy view of the ADJ-RIB-IN of a peer •  Update messages from peer sent to BMP receiver •  Example uses: •  Realtime visualizer of BGP state •  Traffic engineering analytics •  BGP policy exploration BGP Monitoring Protocol 41
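A BMP receiver starts by reading the common header of each message. The sketch below assumes the header layout published in RFC 7854 (one-byte version, four-byte total message length, one-byte message type), which is the RFC that the draft cited on this slide became; an implementation of an older draft may differ. `parse_bmp_common_header` is a hypothetical helper, not part of any BMP library.

```python
import struct

# Message types as listed in RFC 7854 (assumed to match the exporter in use).
BMP_MESSAGE_TYPES = {
    0: "Route Monitoring",
    1: "Statistics Report",
    2: "Peer Down Notification",
    3: "Peer Up Notification",
    4: "Initiation",
    5: "Termination",
    6: "Route Mirroring",
}

def parse_bmp_common_header(data):
    """Parse the 6-byte BMP common header from the start of a message."""
    if len(data) < 6:
        raise ValueError("need at least 6 bytes for the BMP common header")
    version, length, msg_type = struct.unpack("!BIB", data[:6])
    return {
        "version": version,
        "length": length,            # total message length, header included
        "type": BMP_MESSAGE_TYPES.get(msg_type, f"unknown ({msg_type})"),
    }

example = struct.pack("!BIB", 3, 6, 4)   # hand-built header for illustration
header = parse_bmp_common_header(example)
print(header)
```

The `length` field tells the receiver how many more bytes to read from the TCP stream before the next message begins.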
  • 42. Network State Awareness & troubleshooting OpenBMP Historical record of prefix withdraws Current route views and peer status 42 http://www.openbmp.org
  • 43. Network State Awareness & troubleshooting Getting Started 43
  • 44. Network State Awareness & troubleshooting Be Prepared! •  Be prepared and have data collection systems enabled •  Enable passive monitoring on endpoints and network •  Enable active tests •  Helpdesk •  Interview Script => establish & maintain checklists •  Multi-group access to tools, logs, etc. •  Firefighters run drills, so should your teams! •  Be familiar with the tools and how they respond on your network •  Red phone: Cross-domain teams (applications, UC, security, servers) 44
  • 45. Network State Awareness & troubleshooting Expanding your Toolbox and Knowledge •  Great open source tools to look at •  Network topology & IP address management: netdot, GestióIP •  Performance tests: iperf3 •  Service checks: Nagios Core, Zenoss Community •  NetFlow / Log analysis: logstash, fluentd •  Template driven config generation: ansible