The document discusses various topics related to network layer in computer networks including network layer services, packet switching, IP addressing, forwarding of IP packets, routing algorithms, and IP protocols. Specifically, it covers logical addressing, services provided at different nodes, forwarding methods based on destination address and labels, IP version 4 addressing including classes and subnetting, and processing of packets at routers and destination computers.
4. NETWORK LAYER SERVICES
In this section, we briefly discuss services provided by the network layer. Our discussion
is mostly based on the connectionless service, the dominant service in today’s Internet.
Logical Addressing
Services Provided at the Source Computer
Services Provided at Each Router
Services Provided at the Destination Computer
Goal: Router is to forward packets through a set of networks.
1. Classification of Routing algorithm - Two Types
Static routing algorithm ( non – adaptive)
Dynamic routing algorithm (adaptive)
2. Routing tables
At the conceptual level, we can think of the global Internet as a black box network that connects
millions (if not billions) of computers in the world together. At this level, we are only concerned that a
message from the application layer in one computer reaches the application layer in another
computer.
9. ROUTING
Routing algorithm: the part of the network layer responsible for
deciding on which output line to transmit an incoming packet.
Remember: For virtual-circuit subnets the routing decision is made
ONLY at setup.
Algorithm properties: correctness, simplicity, robustness,
stability, fairness, optimality, and scalability.
Routing Classification
Adaptive Routing: based on current measurements
of traffic and/or topology.
1. centralized
2. isolated
3. distributed
Non-Adaptive Routing
1. Flooding
2. Static routing using shortest
path algorithms
10. SWITCHING
From the previous discussion, it is clear that the passage of a message from a source to a destination
involves many decisions. When a message reaches a connecting device, a decision needs to be made
to select one of the output ports through which the packet needs to be sent out. In other words, the
connecting device acts as a switch that connects one port to another port.
Circuit Switching
Packet Switching
In circuit switching, the whole message is sent from the source to the
destination without being divided into packets.
Note
A good example of a circuit-switched network is the early telephone systems in which the path was established between a
caller and a callee when the telephone number of the callee was dialed by the caller. When the callee responded to the call,
the circuit was established. The voice message could now flow between the two parties, in both directions, while all of the
connecting devices maintained the circuit. When the caller or callee hung up, the circuit was disconnected.
In packet switching, the message is first divided into manageable packets at the
source before being transmitted. The packets are assembled at the destination.
The network layer is designed as a packet-switched network. This means that the message at the source is divided into manageable
packets called datagrams. Individual datagrams are then transferred from the source to the destination. The received datagrams are
reassembled at the destination to recreate the original message. The packet-switched network layer of the Internet was originally
designed as a connectionless service, but recently there is a tendency to change this to a connection-oriented service.
11. Types of Services
Connectionless Service
Connection-Oriented Service
In a connectionless packet-switched network, the forwarding decision is
based on the destination address of the packet.
Note
Figure: A connectionless packet-switched network (packets 1 to 4 leave the sender in order, travel through routers R1 to R5, and may reach the receiver out of order)
Figure: Delay in a connectionless network (total delay from source to destination)
12. Connection-oriented packet switched network
In a connection-oriented packet switched network, the forwarding
decision is based on the label of the packet.
Note
Figure: Delay in a connection-oriented network (total delay from source to destination = setup + transmission time + teardown)
13. Forwarding of IP Packets
The network layer supervises the handling of the packets by the underlying physical networks. We
define this handling as the delivery of a packet. The delivery of a packet to its final destination is
accomplished using two different methods of delivery: direct and indirect.
Direct Delivery
Indirect Delivery
Figure: Direct and indirect delivery between hosts A and B across links
14. FORWARDING
Forwarding means to place the packet in its route to its destination. Since the Internet
today is made of a combination of links (networks), forwarding means to deliver the
packet to the next hop (which can be the final destination or the intermediate connecting
device). Although the IP protocol was originally designed as a connectionless protocol,
today the tendency is to use IP as a connection-oriented protocol.
Forwarding Based on Destination Address
Forwarding Based on Label
Forwarding Based on Destination Address
4 types
Next-hop method
Network-specific method
Host-specific method
Default method
16. Figure 6.4 Network-specific method
Network-specific routing table for host S
Destination   Next Hop
N2            R1
Host-specific routing table for host S
Destination   Next Hop
A             R1
B             R1
C             R1
D             R1
17. Figure 6.5 Host-specific routing
Figure: Hosts A and B, networks N1, N2, N3, and routers R1, R2, R3
Routing table for host A
Destination   Next Hop
Host B        R3
N2            R1
N3            R3
......        ......
18. Figure 6.6 Default routing
Figure: Host A on network N1 reaches network N2 through router R1 and the rest of the Internet through the default router R2
Routing table for host A
Destination   Next Hop
......        ......
N2            R1
Default       R2
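The host-specific, network-specific, and default entries of Figures 6.4 to 6.6 can be combined in one table lookup. The sketch below is only an illustration: the prefixes standing in for N2 and Host B are hypothetical, and picking the most specific matching entry (longest prefix) is an assumption about the search order, not something the slides state.

```python
import ipaddress

def next_hop(table, destination):
    """Return the next hop of the most specific entry that matches destination."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, hop) for net, hop in table if dest in net]
    if not matches:
        return None  # no entry and no default route
    # A /32 host entry beats a network entry, which beats the /0 default.
    return max(matches, key=lambda entry: entry[0].prefixlen)[1]

# Hypothetical addresses; only the next-hop names come from the figures.
table = [
    (ipaddress.ip_network("203.0.113.9/32"), "R3"),   # host-specific (Host B)
    (ipaddress.ip_network("198.51.100.0/24"), "R1"),  # network-specific (N2)
    (ipaddress.ip_network("0.0.0.0/0"), "R2"),        # default router
]

print(next_hop(table, "198.51.100.20"))
print(next_hop(table, "203.0.113.9"))
print(next_hop(table, "8.8.8.8"))
```

Any destination that matches neither the host entry nor the network entry falls through to the default router.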
19. ADDRESSING
The address in the network layer of the TCP/IP model is called the Internet address or
IP address. An IPv4 address is a 32-bit address.
IP addresses are unique (each connection has a different address) and
universal (they must be accepted by any host that wants to connect to the Internet).
Consists of 4 octets (bytes)
IP addresses are managed by a nonprofit organization called ICANN
(Internet Corporation for Assigned Names and Numbers) to avoid conflicts.
Assigns addresses to regional authorities, which assign numbers to ISPs
Assigns and manages DNS (Domain Name System)
The address space of IPv4 is 2^32 or 4,294,967,296.
Network + Host: Complete IP address
Network Address: Host part set to 0
Network ID: identifies the network to
which the host is connected
Host ID: identifies the interface of the
network connection to the host not the
host itself
20. Figure Dotted-decimal notation
Example 1
Change the IP addresses from binary notation to dotted-decimal notation.
a. 10000001 00001011 00001011 11101111
b. 11111001 10011011 11111011 00001111
Solution
We replace each group of 8 bits with its equivalent decimal number and add dots for separation:
a. 129.11.11.239
b. 249.155.251.15
21. Example 2 Change the IP addresses from dotted-decimal notation to binary notation.
a. 111.56.45.78
b. 75.45.34.78
Solution We replace each decimal number with its binary equivalent:
a. 01101111 00111000 00101101 01001110
b. 01001011 00101101 00100010 01001110
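The conversions in Examples 1 and 2 can be checked mechanically. The following is a minimal sketch; the function names are illustrative and not part of the slides.

```python
def binary_to_dotted(bits: str) -> str:
    """Convert four space-separated 8-bit groups to dotted-decimal notation."""
    octets = bits.split()
    if len(octets) != 4 or any(len(o) != 8 for o in octets):
        raise ValueError("expected four groups of 8 bits")
    return ".".join(str(int(o, 2)) for o in octets)


def dotted_to_binary(address: str) -> str:
    """Convert dotted-decimal notation to four space-separated 8-bit groups."""
    parts = [int(p) for p in address.split(".")]
    if len(parts) != 4 or any(not 0 <= p <= 255 for p in parts):
        raise ValueError("each of the four numbers must be in 0..255")
    return " ".join(format(p, "08b") for p in parts)


print(binary_to_dotted("10000001 00001011 00001011 11101111"))
print(dotted_to_binary("111.56.45.78"))
```

An address with a number above 255, such as 75.45.301.14, is rejected with a ValueError by the range check.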
ADDRESSING
PROBLEM
Example 3 Find the error, if any, in the following IP address: 75.45.301.14
Solution In dotted-decimal notation, each number is less than or equal to 255;
301 is outside this range.
In classful addressing, the address space is divided into five classes:
A, B, C, D, and E.
Note
22. Finding the classes in binary and dotted-decimal notation
Figure 19.11 Finding the address class
23. Example 3 Find the class of each address:
a. 00000001 00001011 00001011 11101111
b. 11110011 10011011 11111011 00001111
Solution
See the procedure in Figure 19.11.
a. The first bit is 0; this is a class A address.
b. The first 4 bits are 1s; this is a class E address.
Example 4 Find the class of each address:
a.227.12.14.87 b.252.5.15.111 c.134.11.78.56
Solution
a. The first byte is 227 (between 224 and 239); the class is D.
b. The first byte is 252 (between 240 and 255); the class is E.
c. The first byte is 134 (between 128 and 191); the class is B.
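The first-byte ranges used in these solutions translate directly into a small function. This is a sketch under the classful rules stated here; the name address_class is an assumption for illustration.

```python
def address_class(address: str) -> str:
    """Return the classful address class (A to E) from the first byte."""
    first = int(address.split(".")[0])
    if not 0 <= first <= 255:
        raise ValueError("first byte must be in 0..255")
    if first < 128:   # leading bit 0
        return "A"
    if first < 192:   # leading bits 10
        return "B"
    if first < 224:   # leading bits 110
        return "C"
    if first < 240:   # leading bits 1110
        return "D"
    return "E"        # leading bits 1111


for addr in ("227.12.14.87", "252.5.15.111", "134.11.78.56"):
    print(addr, address_class(addr))
```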
Figure Netid and hostid
24. Classful Addressing
Class A
Start with binary 0
All 0s reserved (default route or any network)
01111111 (127) reserved for loopback
2^31 or 2,147,483,648 class A complete IP addresses
2^7 = 128 blocks (network addresses)
Number of complete IP addresses in each block is 2^24 = 16,777,216, of which two are reserved
(all-zeros host: network address; all-ones host: broadcast address)
Valid Range 1.x.x.x to 126.x.x.x (126 valid blocks)
All allocated
Class B
Start with binary 10
Range 128.x.x.x to 191.x.x.x
2^30 class B complete IP addresses
2^14 = 16,384 blocks (network addresses)
Number of addresses in each block is 2^16 = 65,536, of which two are reserved (all-zeros host and all-ones host)
All allocated
25. Classful Addressing
Class C
Start with binary 110
Range 192.x.x.x to 223.x.x.x
2^29 class C complete IP addresses
2^21 = 2,097,152 blocks (network addresses)
Number of addresses in each block is 2^8 = 256, of which two are reserved (all-zeros host and all-ones host)
Nearly all allocated
Class D
Multicast addresses
No network/host hierarchy
Private addresses
Range                              Total
10.0.0.0 to 10.255.255.255         2^24
172.16.0.0 to 172.31.255.255       2^20
192.168.0.0 to 192.168.255.255     2^16
26. Figure 19.14 Blocks in class A
Millions of class A addresses are wasted.
Note
28. Figure 19.16 Blocks in class C
The number of addresses in class C block is
smaller than the needs of most organizations.
Note
29. Figure 19.17 Network address
In classful addressing, the network address is
the one that is assigned to the organization.
Note
Example 5 Given the address 23.56.7.91, find the network address.
Solution
The class is A. Only the first byte defines the netid. We can find the network address by
replacing the hostid bytes (56.7.91) with 0s. Therefore, the network address is 23.0.0.0.
30. Example 6 Given the address 132.6.17.85, find the network address.
Solution The class is B. The first 2 bytes define the netid. We can find the network address by
replacing the hostid bytes (17.85) with 0s. Therefore, the network address is 132.6.0.0.
Example 7 Given the network address 17.0.0.0, find the class.
Solution The class is A because the netid is only 1 byte.
Figure: Sample internet with class A, class B, and class C networks
Note
A network address is different from a
netid. A network address has both netid
and hostid, with 0s for the hostid.
31. Table 19.1 Default masks
Class   In Binary                             In Dotted-Decimal   Using Slash
A       11111111 00000000 00000000 00000000   255.0.0.0           /8
B       11111111 11111111 00000000 00000000   255.255.0.0         /16
C       11111111 11111111 11111111 00000000   255.255.255.0       /24
IP addresses are designed with two levels of hierarchy.
Note
32. Figure A network with two levels of hierarchy
Addressing without Subnets
A class B “Flat Network” can hold up to 2^16 = 65,536 host addresses
How to manage?
Performance? Too many hosts on the
same LAN (single broadcast domain) will
slow down the LAN performance
Solution: Subnetting
The network address can be found by
applying the default mask to any address
in the block (including itself). It retains the
netid of the block and sets the hostid to 0s.
Note
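Applying the default mask is a bitwise AND between the 32-bit address and the 32-bit mask. The sketch below reproduces Examples 5 and 6 under classful rules; the helper names are illustrative assumptions.

```python
DEFAULT_MASKS = {"A": "255.0.0.0", "B": "255.255.0.0", "C": "255.255.255.0"}


def to_int(address: str) -> int:
    """Pack a dotted-decimal address into one 32-bit integer."""
    value = 0
    for part in address.split("."):
        value = (value << 8) | int(part)
    return value


def to_dotted(value: int) -> str:
    """Unpack a 32-bit integer into dotted-decimal notation."""
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def default_mask(address: str) -> str:
    """Pick the default mask from the class implied by the first byte."""
    first = int(address.split(".")[0])
    if first < 128:
        return DEFAULT_MASKS["A"]
    if first < 192:
        return DEFAULT_MASKS["B"]
    if first < 224:
        return DEFAULT_MASKS["C"]
    raise ValueError("classes D and E have no default mask")


def network_address(address: str) -> str:
    """AND the address with its default mask: netid kept, hostid set to 0s."""
    return to_dotted(to_int(address) & to_int(default_mask(address)))


print(network_address("23.56.7.91"))
print(network_address("132.6.17.85"))
```

Applying the function to a network address returns the same address, which matches the remark that the mask can be applied to any address in the block, including the network address itself.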
33. Figure 19.23 Subnet mask
Class B
Subnetting reduces the number of routing table entries and the table size
34. Subnetting
Dividing the network into several smaller groups (subnets) with each
group having its own subnet IP address
Site looks to rest of internet like single network and routers outside
the organization route the packet based on the main Network address
Local routers route within subnetted network using subnet address
Host portion of address partitioned into subnet number (most
significant part) and host number (least significant part)
In this case, IP address will have 3 levels (Main network, subnet, host)
A subnet mask is a 32-bit number of ones and zeros: the ones mark the
network and subnet bits of the IP address, and the zeros mark the host
bits
The subnet mask, when ANDed with the IP address, gives the
subnetwork address
35. Figure 19.20 A network with three levels of hierarchy
(subnetted)
Routers will use subnet mask 255.255.192.0 or /18
36. Example 8 A router outside the organization receives a packet with destination address
190.240.7.91 /16. Show how it finds the network address to route the packet.
Solution
The router follows three steps:
• The router looks at the first byte of the address to find the class. It is class B.
• The default mask for class B is 255.255.0.0 or /16. The router ANDs this mask with the address
to get 190.240.0.0.
• The router looks in its routing table to find out how to route the packet to this destination.
Later, we will see what happens if this destination does not exist.
Example 9 A router inside the organization receives a packet with destination address
190.240.33.91 /19. Show how it finds the subnetwork address to route the packet.
Solution
The router follows three steps:
• The router must know the mask. It is 255.255.224.0 or /19.
• The router applies the mask to the address, 190.240.33.91. The subnet address is 190.240.32.0.
• The router looks in its routing table to find how to route the packet to this destination. Later, we will
see what happens if this destination does not exist.
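The masking step in Examples 8 and 9 can be reproduced with Python's standard ipaddress module. This is a sketch for checking the arithmetic, not a description of how a router is implemented; the function names are illustrative.

```python
import ipaddress


def subnet_address(address: str, prefix_len: int) -> str:
    """AND the address with a /prefix_len mask and return the result."""
    # strict=False lets ip_network accept an address with host bits set
    # and clears those bits for us.
    network = ipaddress.ip_network(f"{address}/{prefix_len}", strict=False)
    return str(network.network_address)


def mask_of(prefix_len: int) -> str:
    """Return the dotted-decimal mask for a slash prefix length."""
    return str(ipaddress.ip_network(f"0.0.0.0/{prefix_len}").netmask)


print(mask_of(19), subnet_address("190.240.33.91", 19))
print(mask_of(16), subnet_address("190.240.7.91", 16))
```

In the third byte, 33 is 00100001 and 224 is 11100000, so the AND leaves 00100000 = 32, which gives 190.240.32.0.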
37. Obtaining Host IP Address
Once a network administrator in an organization has obtained a block of
addresses from its ISP, the administrator can assign individual IP addresses to
the host and router interfaces
It can be done in two ways:
Manual configuration: IP address is stored manually by the
administrator in a configuration file
What about a diskless computer? Or a computer with a disk that is
booted for the first time?
What if the computer has moved from one subnet to
another?
Automatic configuration: the solution is a protocol called Dynamic Host Configuration
Protocol (DHCP)
DHCP is a client-server program
38. Dynamic Host Configuration Protocol
Dynamic Host Configuration Protocol (DHCP)
A protocol that provides an IP address, subnet mask, IP address of a
gateway router, and IP address of a DNS server dynamically to a
host or to a diskless computer
The DHCP server keeps two databases (static IP addresses and unused
temporary addresses).
The static database maps physical (MAC) addresses to
permanent IP addresses (used for diskless workstations)
When a host requests an address, DHCP looks in the static
database first.
If no match is found, DHCP turns to the dynamic
database and assigns a temporary address selected
from a pool of free addresses
Leasing: the DHCP server assigns an IP address to a host for a
specific period of time in order not to waste IP addresses
After the period expires, the host must return the IP address or
renew the lease.
39. Address Resolution Protocol (ARP)
At the network level hosts and routers are recognized by
their IP address
Packets must pass through physical networks to reach hosts
and routers.
At the physical network level, hosts and routers are recognized by their
MAC addresses, which are local addresses.
ARP is a network layer protocol that translates between
Internet IP address and MAC sublayer (layer-2) address
Figure Encapsulation of ARP packet
43. IPv4 datagram fields
Minimum header length is 20 bytes without options.
With options the maximum can go to 60 bytes
Largest data that can be carried in the datagram is 65535 - 20 = 65515 bytes
Version field: carries the version number, which is 4 = (0100)_2
Header length: the length of the header in 4-byte words, that is, the length in bytes divided by 4. Min is 20/4 = 5 =
(0101)_2 and the max is 60/4 = 15 = (1111)_2
Total length: total length of the packet: header + data. Max = 65535 bytes
Identification, flags, and offset: used for fragmentation and for reassembly at the destination
A packet can be fragmented at any node between the source and the destination, but
reassembly is done ONLY at the destination node.
Time to Live: used to prevent lost packets from circulating between routers forever. This
field is set to a certain value depending on the device's operating system. Each router
decrements this field by one and checks the value. If the value is zero, the packet is
dropped.
Protocol: contains a code for what is being carried in the data field.
Header checksum: used for checking for errors in the header only. The checksum is
recomputed at each router between the source and the destination.
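The header checksum is the 16-bit one's complement of the one's complement sum of the header's 16-bit words, computed with the checksum field set to zero. Because each router changes the Time to Live field, the stored value no longer matches and must be recomputed. The sketch below uses a hypothetical 20-byte header chosen only for illustration.

```python
def ipv4_header_checksum(header: bytes) -> int:
    """One's complement of the one's complement sum of 16-bit header words."""
    if len(header) % 2:
        raise ValueError("header length must be a multiple of 2 bytes")
    total = 0
    for i in range(0, len(header), 2):
        total += (header[i] << 8) | header[i + 1]
    # Fold any carry out of the low 16 bits back into the sum.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Hypothetical header: version 4, header length 5 words, TTL 0x40,
# protocol 0x11 (UDP), checksum field (bytes 10-11) zeroed.
header = bytes.fromhex("45000073000040004011" "0000" "c0a80001c0a800c7")
checksum = ipv4_header_checksum(header)
print(hex(checksum))

# A receiver sums the header including the stored checksum; 0 means no error.
filled = header[:10] + checksum.to_bytes(2, "big") + header[12:]
print(ipv4_header_checksum(filled))
```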
Figure Maximum transfer unit (MTU)
44. Protocol field and encapsulated data
Table: Protocol values in hex
Table: MTUs for some networks
45. Internet Control Message Protocol
used by hosts & routers to communicate network-level information
error reporting: unreachable host, network, port, protocol
echo request/reply (used by ping)
network-layer “above” IP:
ICMP msgs carried in IP datagrams
ICMP message: type, code plus first 8 bytes of IP datagram causing error
Type   Code   Description
0      0      echo reply (ping)
3      0      dest. network unreachable
3      1      dest host unreachable
3      2      dest protocol unreachable
3      3      dest port unreachable
3      6      dest network unknown
3      7      dest host unknown
4      0      source quench (congestion control - not used)
8      0      echo request (ping)
9      0      route advertisement
10     0      router discovery
11     0      TTL expired
12     0      bad IP header
46. Traceroute and ICMP
source sends series of UDP segments to dest
first set has TTL = 1
second set has TTL = 2, etc.
unlikely port number
when nth set of datagrams arrives at nth router:
router discards datagrams
and sends source ICMP messages (type 11, code 0)
ICMP messages include name of router & IP address
when ICMP messages arrive, source records RTTs
stopping criteria:
UDP segment eventually arrives at destination host
destination returns ICMP “port unreachable” message (type 3, code 3)
source stops
Figure: 3 probes sent for each TTL value
47. IPv6: Motivation
initial motivation: 32-bit address space soon to be completely
allocated.
additional motivation:
header format helps speed processing/forwarding
header changes to facilitate QoS
IPv6 datagram format:
fixed-length 40 byte header
no fragmentation allowed at intermediate routers
priority: identify priority among
datagrams in flow
flow Label: identify datagrams in
same “flow.”
next header: identify upper layer
protocol for data
Figure: IPv6 datagram format, 32 bits wide: ver, pri, flow label; payload len, next hdr, hop limit; source address (128 bits); destination address (128 bits); data
48. Other changes from IPv4
checksum: removed entirely to reduce processing time at each hop
options: allowed, but outside of header, indicated by “Next Header” field
ICMPv6: new version of ICMP
additional message types, e.g. “Packet Too Big”
multicast group management functions
Transition from IPv4 to IPv6
not all routers can be upgraded simultaneously
no “flag days”
how will network operate with mixed IPv4 and IPv6 routers?
tunneling: IPv6 datagram carried as payload in IPv4 datagram among IPv4 routers
Figure: an IPv6 datagram (IPv6 header fields, IPv6 source and dest addr, UDP/TCP payload) carried as the payload of an IPv4 datagram (IPv4 header fields, IPv4 source and dest addr)
49. Tunneling
physical view: routers A (IPv6), B (IPv6), C (IPv4), D (IPv4), E (IPv6), F (IPv6) connected in sequence
logical view: IPv4 tunnel connecting IPv6 routers B and E
A-to-B: IPv6 datagram (flow: X, src: A, dest: F, data)
B-to-C and through the tunnel: IPv6 inside IPv4 (IPv4 src: B, dest: E, carrying the IPv6 datagram with flow: X, src: A, dest: F, data)
E-to-F: IPv6 datagram (flow: X, src: A, dest: F, data)
50. Interplay between routing, forwarding
routing algorithm determines
end-end-path through network
forwarding table determines
local forwarding at this router
local forwarding table
dest address       output link
address-range 1    3
address-range 2    2
address-range 3    2
address-range 4    1
Figure: the routing algorithm fills the local forwarding table; the IP destination address in the arriving packet’s header selects one of the output links 1, 2, 3
51. Unicast Routing basics
Figure: graph of routers u, v, w, x, y, z with a cost on each link
graph: G = (N,E)
N = set of routers = { u, v, w, x, y, z }
E = set of links = { (u,v), (u,x), (u,w), (v,x), (v,w), (x,w),
(x,y), (w,y), (w,z), (y,z) }
aside: graph abstraction is useful in other network contexts, e.g., P2P, where
N is set of peers and E is set of TCP connections
Graph abstraction: costs
c(x,x’) = cost of link (x,x’), e.g., c(w,z) = 5
cost could always be 1, or inversely related to bandwidth, or related
to congestion
cost of path (x_1, x_2, x_3, …, x_p) = c(x_1,x_2) + c(x_2,x_3) + … + c(x_{p-1},x_p)
key question: what is the least-cost path between u and z ?
routing algorithm: algorithm that finds that least cost path
52. Routing algorithm classification
Q: global or decentralized information?
global: all routers have complete topology, link cost info “link state”
algorithms
decentralized:
router knows physically-connected neighbors, link costs to neighbors
iterative process of computation, exchange of info with neighbors
“distance vector” algorithms
Q: static or dynamic?
static: routes change slowly over time
dynamic: routes change more quickly
periodic update
in response to link cost changes
routing algorithms are: link state, distance vector, hierarchical routing
routing in the Internet : RIP, OSPF, BGP
53. A Link-State Routing Algorithm
Dijkstra’s algorithm
net topology, link costs known
to all nodes
accomplished via “link state
broadcast”
all nodes have same info
computes least cost paths from
one node (‘source”) to all other
nodes
gives forwarding table for that
node
iterative: after k iterations,
know least cost path to k dest.’s
notation:
c(x,y): link cost from node
x to y; = ∞ if not direct
neighbors
D(v): current value of cost
of path from source to dest.
v
p(v): predecessor node
along path from source to v
N': set of nodes whose least
cost path definitively known
54. Dijkstra’s Algorithm
Initialization:
  N' = {u}
  for all nodes v
    if v adjacent to u
      then D(v) = c(u,v)
    else D(v) = ∞

Loop
  find w not in N' such that D(w) is a minimum
  add w to N'
  update D(v) for all v adjacent to w and not in N':
    D(v) = min( D(v), D(w) + c(w,v) )
  /* new cost to v is either old cost to v or known
     shortest path cost to w plus cost from w to v */
until all nodes in N'
Step  N'       D(v),p(v)  D(w),p(w)  D(x),p(x)  D(y),p(y)  D(z),p(z)
0     u        7,u        3,u        5,u        ∞          ∞
1     uw       6,w        -          5,u        11,w       ∞
2     uwx      6,w        -          -          11,w       14,x
3     uwxv     -          -          -          10,v       14,x
4     uwxvy    -          -          -          -          12,y
5     uwxvyz   -          -          -          -          -
notes:
construct shortest path tree by tracing
predecessor nodes
ties can exist (can be broken arbitrarily)
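The pseudocode and trace can be turned into a short runnable sketch. Two things here are assumptions rather than part of the slides: the link costs in GRAPH were chosen so that the run is consistent with the step table for source u, and a binary heap replaces the linear "find w with minimum D(w)" scan.

```python
import heapq
import math


def dijkstra(graph, source):
    """Return (D, p): least-cost distances and predecessors from source."""
    dist = {node: math.inf for node in graph}
    pred = {node: None for node in graph}
    dist[source] = 0
    done = set()                      # plays the role of N'
    heap = [(0, source)]
    while heap:
        d, w = heapq.heappop(heap)
        if w in done:
            continue                  # stale heap entry
        done.add(w)
        for v, cost in graph[w].items():
            if v not in done and d + cost < dist[v]:
                dist[v] = d + cost    # D(v) = min(D(v), D(w) + c(w,v))
                pred[v] = w
                heapq.heappush(heap, (dist[v], v))
    return dist, pred


def path_to(pred, target):
    """Trace predecessor nodes back to the source."""
    path = []
    while target is not None:
        path.append(target)
        target = pred[target]
    return path[::-1]


# Assumed link costs (symmetric), consistent with the trace for source u.
GRAPH = {
    "u": {"v": 7, "w": 3, "x": 5},
    "v": {"u": 7, "w": 3, "y": 4},
    "w": {"u": 3, "v": 3, "x": 4, "y": 8},
    "x": {"u": 5, "w": 4, "y": 7, "z": 9},
    "y": {"v": 4, "w": 8, "x": 7, "z": 2},
    "z": {"x": 9, "y": 2},
}

dist, pred = dijkstra(GRAPH, "u")
print(dist)
print(path_to(pred, "z"))
```

The final distances match the last value recorded in each column of the table (v: 6, w: 3, x: 5, y: 10, z: 12), and tracing predecessors from z gives the shortest-path-tree branch u, w, v, y, z.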
Figure: example graph with nodes u, v, w, x, y, z and link costs, used for the trace table
55. Dijkstra’s algorithm
algorithm complexity: n nodes
each iteration: need to check all nodes, w, not in N'
n(n+1)/2 comparisons: O(n^2)
more efficient implementations possible: O(n log n)
oscillations possible:
e.g., suppose link cost equals amount of carried traffic:
Figure: four nodes A, B, C, D with link costs shown in four panels: initially; given these costs, find new routing, resulting in new costs (repeated for each following panel)
56. Distance vector algorithm
Bellman-Ford equation (dynamic programming)
Let d_x(y) := cost of least-cost path from x to y. Then
d_x(y) = min_v { c(x,v) + d_v(y) }
where c(x,v) is the cost to neighbor v, d_v(y) is the cost from neighbor v to destination y,
and the min is taken over all neighbors v of x
Figure: graph of routers u, v, w, x, y, z with link costs
clearly, d_v(z) = 5, d_x(z) = 3, d_w(z) = 3
B-F equation says:
d_u(z) = min { c(u,v) + d_v(z), c(u,x) + d_x(z), c(u,w) + d_w(z) }
       = min { 2 + 5, 1 + 3, 5 + 3 } = 4
node achieving minimum is next
hop in shortest path, used in forwarding table
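One application of the Bellman-Ford equation needs only the link costs to the neighbors and the neighbors' own least costs to the destination. The sketch below evaluates the example for node u and destination z; the function name is an illustrative assumption.

```python
def bellman_ford_step(link_costs, neighbor_costs):
    """Return (least cost, next hop) by minimizing c(x,v) + d_v(y) over neighbors v."""
    best_hop = min(link_costs, key=lambda v: link_costs[v] + neighbor_costs[v])
    return link_costs[best_hop] + neighbor_costs[best_hop], best_hop


# c(u,v), c(u,x), c(u,w) and d_v(z), d_x(z), d_w(z) from the example.
cost, hop = bellman_ford_step({"v": 2, "x": 1, "w": 5}, {"v": 5, "x": 3, "w": 3})
print(cost, hop)
```

The neighbor that achieves the minimum, x in this case, is the next hop that u would install in its forwarding table for destination z.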
57. Distance vector algorithm
D_x(y) = estimate of least cost from x to y
x maintains distance vector D_x = [D_x(y): y ∈ N]
node x:
knows cost to each neighbor v: c(x,v)
maintains its neighbors’ distance vectors. For each neighbor v, x
maintains D_v = [D_v(y): y ∈ N]
key idea:
from time to time, each node sends its own distance vector estimate
to neighbors
when x receives new DV estimate from neighbor, it updates its own
DV using B-F equation: D_x(y) ← min_v { c(x,v) + D_v(y) } for each node y ∈ N
under minor, natural conditions, the estimate D_x(y) converges to the actual least cost d_x(y)
58. Distance vector algorithm
iterative, asynchronous: each
local iteration caused by:
local link cost change
DV update message from
neighbor
distributed:
each node notifies neighbors only
when its DV changes
neighbors then notify their
neighbors if necessary
each node:
wait for (change in local link cost or
msg from neighbor)
recompute estimates
if DV to any dest has changed, notify
neighbors
59. Distance vector example: nodes x, y, z with link costs c(x,y) = 2, c(y,z) = 1, c(x,z) = 7
Initial tables (each node knows only its own row; cost to x, y, z):
node x table:   from x: 0 2 7     from y: ∞ ∞ ∞     from z: ∞ ∞ ∞
node y table:   from x: ∞ ∞ ∞     from y: 2 0 1     from z: ∞ ∞ ∞
node z table:   from x: ∞ ∞ ∞     from y: ∞ ∞ ∞     from z: 7 1 0
After receiving the vectors of y and z, node x recomputes:
D_x(y) = min{c(x,y) + D_y(y), c(x,z) + D_z(y)}
= min{2+0 , 7+1} = 2
D_x(z) = min{c(x,y) + D_y(z), c(x,z) + D_z(z)}
= min{2+1 , 7+0} = 3
node x table:   from x: 0 2 3     from y: 2 0 1     from z: 7 1 0
Final tables at all three nodes (over time):
from x: 0 2 3     from y: 2 0 1     from z: 3 1 0
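The three-node example can be simulated in a few lines. The sketch below is a simplification: it runs the exchange in synchronous rounds, whereas the algorithm described in the slides is asynchronous, and the function name and data layout are assumptions for illustration.

```python
import math


def distance_vector(links):
    """Iterate the B-F update in synchronous rounds until no vector changes."""
    nodes = sorted({n for edge in links for n in edge})
    cost = {x: {} for x in nodes}
    for (a, b), c in links.items():
        cost[a][b] = c
        cost[b][a] = c
    # Initially each node knows only the cost of its own links.
    dv = {x: {y: 0 if x == y else cost[x].get(y, math.inf) for y in nodes}
          for x in nodes}
    changed = True
    while changed:
        changed = False
        advertised = {x: dict(dv[x]) for x in nodes}  # vectors sent this round
        for x in nodes:
            for y in nodes:
                if x == y:
                    continue
                # D_x(y) = min over neighbors v of c(x,v) + D_v(y)
                best = min(cost[x][v] + advertised[v][y] for v in cost[x])
                if best != dv[x][y]:
                    dv[x][y] = best
                    changed = True
    return dv


dv = distance_vector({("x", "y"): 2, ("y", "z"): 1, ("x", "z"): 7})
for node, vector in dv.items():
    print(node, vector)
```

With these link costs one round of updates is enough: x and z both learn that the path through y costs 3 instead of 7, and the second round detects no further change.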
60. Comparison of LS and DV algorithms
message complexity
LS: with n nodes, E links, O(nE)
msgs sent
DV: exchange between neighbors
only
convergence time varies
speed of convergence
LS: O(n^2) algorithm requires
O(nE) msgs
may have oscillations
DV: convergence time varies
may be routing loops
count-to-infinity problem
robustness: what happens if router
malfunctions?
LS:
node can advertise incorrect
link cost
each node computes only its
own table
DV:
DV node can advertise
incorrect path cost
each node’s table used by
others
errors propagate through
network
61. Hierarchical routing
Figure: autonomous systems AS1 (routers 1a, 1b, 1c, 1d), AS2 (routers 2a, 2b, 2c), and AS3 (routers 3a, 3b, 3c); a router's forwarding table is configured by both the intra-AS and the inter-AS routing algorithm
scale: with 600 million destinations:
can’t store all dest’s in routing tables!
routing table exchange would swamp links!
gateway router:
at “edge” of its own AS
has link to router in another AS
administrative autonomy
internet = network of networks
each network admin may want to control routing in its own network
aggregate routers into regions, “autonomous systems” (AS)
routers in same AS run same routing protocol
“intra-AS” routing protocol
routers in different AS can run different intra-AS routing protocol
62. Inter-AS tasks
suppose router in AS1 receives datagram destined outside of AS1:
router should forward packet to gateway router, but which one?
AS1 must:
1. learn which dests are reachable through AS2, which through AS3
2. propagate this reachability info to all routers in AS1
job of inter-AS routing!
suppose AS1 learns (via inter-AS protocol) that subnet x reachable via AS3
(gateway 1c), but not via AS2
inter-AS protocol propagates reachability info to all internal routers
router 1d determines from intra-AS routing info that its interface I is on the least
cost path to 1c
installs forwarding table entry (x,I)
Figure: AS1 (routers 1a, 1b, 1c, 1d), AS2 (routers 2a, 2b, 2c), and AS3 (routers 3a, 3b, 3c), with links to other networks and subnet x
63. Intra-AS Routing
also known as interior gateway protocols (IGP)
most common intra-AS routing protocols:
RIP: Routing Information Protocol
OSPF: Open Shortest Path First
IGRP: Interior Gateway Routing Protocol (Cisco proprietary)
RIP: example
routing table in router D
destination subnet   next router   # hops to dest
w                    A             2
y                    B             2
z                    B             7
x                    --            1
….                   ….            ....
Figure: routers A, B, C, D and subnets w, x, y, z
64. RIP ( Routing Information Protocol)
included in BSD-UNIX distribution in 1982
distance vector algorithm
distance metric: # hops (max = 15 hops), each link has cost 1
DVs exchanged with neighbors every 30 sec in response message
(aka advertisement)
each advertisement: list of up to 25 destination subnets (in IP
addressing sense)
from router A to destination subnets:
subnet   hops
u        1
v        2
w        2
x        3
y        3
z        2
Figure: routers A, B, C, D and subnets u, v, w, x, y, z
65. RIP: link failure, recovery
if no advertisement heard after 180 sec -> neighbor/link declared dead
routes via neighbor invalidated
new advertisements sent to neighbors
neighbors in turn send out new advertisements (if tables changed)
link failure info quickly (?) propagates to entire net
poison reverse used to prevent ping-pong loops (infinite distance = 16 hops)
RIP routing tables managed by application-level process called route-d
(daemon)
advertisements sent in UDP packets, periodically repeated
Figure: in each router, the routed process runs above the transport (UDP), network (IP), link, and physical layers and maintains the forwarding table
66. OSPF (Open Shortest Path First)
“open”: publicly available
uses link state algorithm
LS packet dissemination
topology map at each node
route computation using
Dijkstra’s algorithm
OSPF advertisement carries one entry per neighbor
advertisements flooded to entire AS
carried in OSPF messages directly over IP (rather than TCP / UDP)
IS-IS routing protocol: nearly identical to OSPF
Figure: Hierarchical OSPF (boundary router, backbone routers, area border routers, and internal routers; backbone and areas 1, 2, 3)
67. OSPF “advanced” features (not in RIP)
security: all OSPF messages authenticated (to prevent malicious intrusion)
multiple same-cost paths allowed (only one path in RIP)
for each link, multiple cost metrics for different TOS (e.g., satellite link cost
set “low” for best effort ToS; high for real time ToS)
integrated uni- and multicast support:
Multicast OSPF (MOSPF) uses same topology data base as OSPF
hierarchical OSPF in large domains.
Hierarchical OSPF
two-level hierarchy: local area, backbone.
link-state advertisements only in area
each node has detailed area topology; knows only the direction (shortest
path) to nets in other areas.
area border routers: “summarize” distances to nets in own area, advertise
to other Area Border routers.
backbone routers: run OSPF routing limited to backbone.
boundary routers: connect to other AS’s.
68. Internet inter-AS routing: BGP
BGP (Border Gateway Protocol): the de facto inter-domain routing
protocol - “glue that holds the Internet together”
BGP provides each AS a means to:
eBGP: obtain subnet reachability inform. from neighboring ASs.
iBGP: propagate reachability inform. to all AS-internal routers.
determine “good” routes to other networks based on reachability
information and policy.
allows subnet to advertise its existence to rest of Internet: “I am here”
Figure: AS1, AS2, and AS3 with gateway routers exchanging BGP messages
BGP session: two BGP routers
(“peers”) exchange BGP mesg:
advertising paths to different
destination network prefixes
(“path vector” protocol)
exchanged over semi-
permanent TCP connections
69. Figure: eBGP sessions between gateway routers of neighboring ASs (3a to 1c, 1b to 2a) and iBGP sessions among the routers inside an AS
using eBGP session between 3a and 1c, AS3 sends prefix reachability info to
AS1.
1c can then use iBGP to distribute new prefix info to all routers in AS1
1b can then re-advertise new reachability info to AS2 over 1b-to-2a
eBGP session
when router learns of new prefix, it creates entry for prefix in its forwarding
table.
advertised prefix includes BGP attributes
prefix + attributes = “route”
two important attributes:
AS-PATH: contains ASs through which prefix advertisement has passed:
e.g., AS 67, AS 17
NEXT-HOP: indicates specific internal-AS router to next-hop AS. (may
be multiple links from current AS to next-hop-AS)
gateway router receiving route advertisement uses import policy to
accept/decline
e.g., never route through AS x
policy-based routing
70. BGP route selection
router may learn about more than 1 route to destination AS, selects
route based on:
local preference value attribute: policy decision
shortest AS-PATH
closest NEXT-HOP router: hot potato routing
additional criteria
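The first three selection criteria can be read as a lexicographic comparison. The sketch below is a simplified illustration, not the full BGP decision process: the Route fields and the sample values are hypothetical, and it relies on the BGP convention that a higher local preference wins.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Hypothetical route record; field names are illustrative only."""
    prefix: str
    local_pref: int          # policy decision: higher is preferred
    as_path: tuple           # ASs the advertisement has passed through
    cost_to_next_hop: int    # intra-AS cost to the NEXT-HOP router


def select_route(routes):
    """Prefer highest local preference, then shortest AS-PATH, then closest NEXT-HOP."""
    return min(
        routes,
        key=lambda r: (-r.local_pref, len(r.as_path), r.cost_to_next_hop),
    )


candidates = [
    Route("x", local_pref=100, as_path=(67, 17), cost_to_next_hop=5),
    Route("x", local_pref=100, as_path=(42,), cost_to_next_hop=9),
    Route("x", local_pref=100, as_path=(99,), cost_to_next_hop=3),
]
print(select_route(candidates))
```

All three candidates share the same local preference, so the two one-hop AS paths beat the two-hop path, and the tie between them is broken by the lower cost to the next-hop router (hot potato routing).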
BGP messages are exchanged between peers over a TCP connection
BGP messages:
OPEN: opens TCP connection to peer and authenticates sender
UPDATE: advertises new path (or withdraws old)
KEEPALIVE: keeps connection alive in absence of UPDATES;
also ACKs OPEN request
NOTIFICATION: reports errors in previous msg; also used to
close connection
71. Multicast Basics
goal: find a tree connecting routers having local mcast group members
tree: not all paths between routers used, 2 types
shared-tree: same tree used by all group members
source-based: different tree from each sender to rcvrs
[Figure: a shared tree and source-based trees; legend: group member, not group member, router with a group member, router without group member]
72. Approaches for building mcast trees
Approaches
source-based tree: one tree per source
shortest path trees
reverse path forwarding
group-shared tree: group uses one tree
minimal spanning (Steiner)
center-based trees
Shortest path tree
mcast forwarding tree: tree of shortest
path routes from source to all receivers
Dijkstra’s algorithm
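A shortest path tree can be computed with Dijkstra's algorithm, as in this minimal sketch. The topology and link costs in `graph` are made up for illustration and are not taken from the slide's figure; `shortest_path_tree` is a hypothetical helper name.

```python
import heapq
from typing import Dict, Optional


def shortest_path_tree(
    graph: Dict[str, Dict[str, int]], source: str
) -> Dict[str, Optional[str]]:
    """Return parent pointers of the shortest-path tree rooted at source."""
    dist = {source: 0}
    parent: Dict[str, Optional[str]] = {source: None}
    heap = [(0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        for v, w in graph[u].items():
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u  # link (u, v) is on the current best path to v
                heapq.heappush(heap, (nd, v))
    return parent


# Hypothetical topology: neighbor -> link cost
graph = {
    "R1": {"R2": 1, "R4": 1},
    "R2": {"R1": 1, "R5": 1},
    "R4": {"R1": 1, "R5": 3, "R6": 1},
    "R5": {"R2": 1, "R4": 3},
    "R6": {"R4": 1},
}
tree = shortest_path_tree(graph, "R1")
print(tree["R5"], tree["R6"])  # R2 R4
```

The multicast forwarding tree is the set of `(parent, child)` links in the result, restricted to the branches that lead to receivers.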
[Figure: shortest path tree from source s over routers R1 to R7; links used for forwarding are numbered 1 to 6 in the order they are added by the algorithm; legend: router with attached group member, router with no attached group member]
73. Reverse path forwarding
rely on router’s knowledge of unicast shortest path from it to sender
each router has simple forwarding behavior:
if (mcast datagram received on incoming link on shortest path back to sender)
  then flood datagram onto all outgoing links
  else ignore datagram
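The per-router RPF check can be sketched as a single function. The names `rpf_forward` and `unicast_next_hop` are hypothetical; the sketch assumes each link is identified by the neighbor at its other end and that the router's unicast table already gives its next hop toward the sender.

```python
from typing import Dict, List


def rpf_forward(
    arrival_link: str,
    sender: str,
    unicast_next_hop: Dict[str, str],
    links: List[str],
) -> List[str]:
    """Return the links on which to flood the datagram, or [] to ignore it."""
    # The datagram passes the check only if it arrived on the link this
    # router itself would use to reach the sender.
    if unicast_next_hop.get(sender) != arrival_link:
        return []
    return [link for link in links if link != arrival_link]


# Hypothetical router with three links whose shortest path to sender "s" is via R1
table = {"s": "R1"}
links = ["R1", "R5", "R6"]
print(rpf_forward("R1", "s", table, links))  # ['R5', 'R6']
print(rpf_forward("R5", "s", table, links))  # []
```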
[Figure: reverse path forwarding from source s over routers R1 to R7; legend: router with attached group member, router with no attached group member, datagram will be forwarded, datagram will not be forwarded]
74. Reverse path forwarding: pruning
forwarding tree contains subtrees with no mcast group members
no need to forward datagrams down subtree
“prune” msgs sent upstream by router with no downstream group members
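Which routers end up sending prune messages can be worked out from the forwarding tree, as in this sketch. The tree, the member set and the function name `routers_to_prune` are hypothetical and do not reproduce the slide's figure; the code computes the final result in one pass instead of exchanging messages hop by hop.

```python
from typing import Dict, List, Set


def routers_to_prune(
    children: Dict[str, List[str]], members: Set[str], root: str
) -> Set[str]:
    """Return routers whose whole subtree has no group member."""
    pruned: Set[str] = set()

    def has_member(node: str) -> bool:
        # Build the full list first so every child subtree is visited.
        below = [has_member(child) for child in children.get(node, [])]
        needed = node in members or any(below)
        if not needed and node != root:
            pruned.add(node)  # this router sends a prune msg upstream
        return needed

    has_member(root)
    return pruned


# Hypothetical RPF tree rooted at source s; R2 and R5 have attached group members
rpf_tree = {
    "s": ["R1"],
    "R1": ["R2", "R4"],
    "R2": ["R3"],
    "R4": ["R5", "R6"],
    "R6": ["R7"],
}
print(sorted(routers_to_prune(rpf_tree, {"R2", "R5"}, "s")))  # ['R3', 'R6', 'R7']
```

R7 prunes first, which leaves R6 with no downstream group members, so R6 prunes as well.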
[Figure: source s and routers R1 to R7, with three prune messages (P) sent upstream; legend: prune message, links with multicast forwarding]
75. Shared-tree: Steiner tree
Steiner tree: minimum cost tree connecting all routers with attached group members
problem is NP-complete
excellent heuristics exist
not used in practice:
computational complexity
information about entire network needed
monolithic: rerun whenever a router needs to join/leave
76. Center-based trees
single delivery tree shared by all
one router identified as “center” of tree
to join:
edge router sends unicast join-msg addressed to center router
join-msg “processed” by intermediate routers & fwd towards center
join-msg either hits existing tree branch for this center, or arrives at
center
path taken by join-msg becomes new branch of tree for this router
suppose R6 chosen as center:
[Figure: routers R1 to R7 with R6 as center; join paths are numbered in the order in which the join messages are generated]
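The join procedure for a center-based tree can be sketched as follows. The unicast next hops in `next_hop_to_center` are a made-up topology with R6 as center, not the one in the slide's figure, and `join` is a hypothetical helper name.

```python
from typing import Dict, List, Set


def join(
    router: str,
    next_hop_to_center: Dict[str, str],
    center: str,
    on_tree: Set[str],
) -> List[str]:
    """Forward a join-msg toward the center; return the new branch it creates."""
    branch = [router]
    node = router
    # Stop when the join-msg hits an existing branch or arrives at the center.
    while node != center and node not in on_tree:
        node = next_hop_to_center[node]
        branch.append(node)
    on_tree.update(branch)
    return branch


# Hypothetical unicast next hops toward center R6
next_hop = {"R1": "R4", "R2": "R5", "R3": "R2", "R4": "R5", "R5": "R6", "R7": "R6"}
tree_nodes = {"R6"}
first = join("R3", next_hop, "R6", tree_nodes)
second = join("R4", next_hop, "R6", tree_nodes)
print(first)   # ['R3', 'R2', 'R5', 'R6']
print(second)  # ['R4', 'R5']
```

The second join stops at R5 because R5 is already on the tree, so only the R4 to R5 link is added as a new branch.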
77. Internet Multicasting Routing: DVMRP
DVMRP: distance vector multicast routing protocol, RFC1075
flood and prune: reverse path forwarding, source-based tree
RPF tree based on DVMRP’s own routing tables constructed by
communicating DVMRP routers
no assumptions about underlying unicast
initial datagram to mcast group flooded everywhere via RPF
routers not wanting group: send upstream prune msgs
soft state: DVMRP router periodically (1 min.) “forgets” that branches are pruned:
mcast data again flows down unpruned branch
downstream router: reprune or else continue to receive data
routers can quickly regraft to tree, following IGMP join at leaf
odds and ends: commonly implemented in commercial routers
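The soft-state idea can be sketched as a table of prune entries that expire. This is only an illustration of the timeout behavior, not DVMRP itself: the `PruneTable` class, its method names, the group address and the link name are hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
from typing import Dict, Tuple

PRUNE_LIFETIME = 60.0  # seconds; the "1 min." soft-state period


class PruneTable:
    """Soft state: a prune is forgotten unless the downstream router repeats it."""

    def __init__(self) -> None:
        # (group, link) -> time at which the prune expires
        self._expires: Dict[Tuple[str, str], float] = {}

    def prune(self, group: str, link: str, now: float) -> None:
        self._expires[(group, link)] = now + PRUNE_LIFETIME

    def graft(self, group: str, link: str) -> None:
        self._expires.pop((group, link), None)

    def should_forward(self, group: str, link: str, now: float) -> bool:
        expiry = self._expires.get((group, link))
        return expiry is None or now >= expiry


table = PruneTable()
table.prune("224.1.1.1", "link-1", now=0.0)
print(table.should_forward("224.1.1.1", "link-1", now=30.0))  # False
print(table.should_forward("224.1.1.1", "link-1", now=61.0))  # True
```

After the entry expires, data flows down the branch again until the downstream router re-prunes; `graft` models a router rejoining before the timer runs out.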
78. Tunneling
Q: how to connect “islands” of multicast routers in a “sea” of
unicast routers?
mcast datagram encapsulated inside “normal” (non-multicast-addressed) datagram
normal IP datagram sent thru “tunnel” via regular IP unicast to receiving mcast
router (recall IPv6 inside IPv4 tunneling)
receiving mcast router unencapsulates to get mcast datagram
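Encapsulation and unencapsulation at the two tunnel endpoints can be sketched with a nested data structure. The `Datagram` class and the addresses are hypothetical stand-ins; real IP-in-IP encapsulation works on packet headers, which this sketch does not model.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Datagram:
    src: str
    dst: str
    payload: Union[bytes, "Datagram"]


def encapsulate(mcast: Datagram, tunnel_src: str, tunnel_dst: str) -> Datagram:
    """Wrap the multicast datagram in a normal unicast-addressed datagram."""
    return Datagram(src=tunnel_src, dst=tunnel_dst, payload=mcast)


def unencapsulate(outer: Datagram) -> Datagram:
    """Recover the multicast datagram at the receiving mcast router."""
    if not isinstance(outer.payload, Datagram):
        raise ValueError("no encapsulated datagram")
    return outer.payload


# Example addresses only: a multicast destination inside a unicast tunnel
inner = Datagram("10.0.0.5", "226.17.30.197", b"data")
outer = encapsulate(inner, "192.0.2.1", "198.51.100.1")
print(outer.dst)                      # 198.51.100.1
print(unencapsulate(outer) == inner)  # True
```

Unicast routers along the tunnel see only the outer unicast destination, so they forward the datagram without any multicast support.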
[Figure: physical topology and logical topology]
79. PIM: Protocol Independent Multicast
not dependent on any specific underlying unicast routing algorithm (works with all)
two different multicast distribution scenarios:
dense:
group members densely packed, in “close” proximity
bandwidth more plentiful
sparse:
number of networks with group members small with respect to number of interconnected networks
group members “widely dispersed”
bandwidth not plentiful
Consequences of the sparse-dense distinction
dense:
group membership by routers assumed until routers explicitly prune
data-driven construction of mcast tree (e.g., RPF)
bandwidth and non-group-router processing profligate
sparse:
no membership until routers explicitly join
receiver-driven construction of mcast tree (e.g., center-based)
bandwidth and non-group-router processing conservative
80. PIM - dense mode
flood-and-prune RPF: similar to DVMRP but…
underlying unicast protocol provides RPF info for incoming datagram
less complicated (less efficient) downstream flood than DVMRP
reduces reliance on underlying routing algorithm
has protocol mechanism for router to detect it is a leaf-node router
81. PIM - sparse mode
center-based approach
router sends join msg to rendezvous point (RP)
intermediate routers update state and forward join
after joining via RP, router can switch to source-specific tree
increased performance: less concentration, shorter paths
sender(s):
unicast data to RP, which distributes down RP-rooted tree
RP can extend mcast tree upstream to source
RP can send stop msg if no attached receivers
“no one is listening!”