The Complete IS-IS Routing Protocol - P29

metric). Based on that, the partial run is basically a search operation that tries to find
the lowest metric for a given prefix. Figure 10.14 illustrates the simplicity of a partial
SPF calculation. The IPv4 prefixes (the leaf information) of all routers on the PATH list,
plus those of the Pennsauken root router, are extracted and moved to a table. Next, the
list is sorted and duplicate entries with a worse cost are eliminated. Finally, the prefixes
are sorted by their cost in ascending order. This simple search operation is computationally
much less complex than the topological section of the full SPF run.
Both JUNOS and IOS support partial runs for IPv4 and IPv6. In IOS, you can also
control the SPF delay for partial route calculations (PRCs). PRC is an IOS term, and the
delay is controlled using the prc-interval router isis configuration command. These
timers can be more aggressive (shorter) than the spf-interval <a> <b> <c>
timers. This is because the burden that a partial SPF run places on the control plane is not
as high as that of a full run, so the router does not need to protect itself as much. The
following configuration example sets the initial wait before a partial SPF calculation to
100 ms. For the second run, the router holds down for 250 ms. The PRC also employs an
exponential back-off timer, so after the second run the hold-down value is 500 ms. The
first argument of the command sets the maximum hold-down value, here one second.
IOS configuration
In IOS there are three timers to control partial SPF hold down. The three timers work
similarly to the timers for the spf-interval configuration command.
London# show running-config
[… ]
router isis
 prc-interval 1 100 250
[… ]
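The back-off behaviour described for this example can be sketched in a few lines of Python. This is only an illustration of the timer arithmetic given in the text (initial wait, second wait, doubling, maximum); the function name `prc_waits` and its parameters are hypothetical, and a real IOS implementation may differ in detail.

```python
def prc_waits(max_wait_s, initial_wait_ms, second_wait_ms, runs):
    """Return the wait (in ms) applied before each of `runs` back-to-back partial runs.

    Argument order mirrors "prc-interval 1 100 250" from the example above.
    """
    max_wait_ms = max_wait_s * 1000
    waits = []
    hold = second_wait_ms
    for run in range(runs):
        if run == 0:
            # First run: only the initial wait applies.
            waits.append(min(initial_wait_ms, max_wait_ms))
        else:
            # Later runs: exponential back-off, capped at the maximum.
            waits.append(min(hold, max_wait_ms))
            hold *= 2
    return waits


print(prc_waits(1, 100, 250, 5))  # [100, 250, 500, 1000, 1000]
```

Once the doubled value exceeds the first argument, the hold-down simply stays at the one-second maximum.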
JUNOS does not have a dedicated knob to control the PRC behaviour. In JUNOS, there is
just one hold-down logic path, so the same hold-down logic applies to partial SPF runs
as to full SPF runs. For compatibility with the JUNOS default behaviour, it is recommended
to set the three IOS parameters to 5, 200 and 200.
10.3.2.1 Performance and CPU Usage


Partial SPF runs are cheap from a calculation point of view. A router has to scan
through all the routers in its link-state database, extract the prefix information, add the
prefix cost to the distance to the originating router, and sort the prefixes to find out which
is closest. This exhibits linear behaviour, meaning the CPU processing time is
directly proportional to the number of routes in the network. Mathematically speaking,
this is O(R), with R being the number of prefixes of an address family. In practical
implementations, the cost of the partial SPF run is close to zero. Typically, a partial run
takes less than 10 ms to execute, even if R is unreasonably high (like 10,000 routes). So
partial runs are even less of an issue than full SPF runs.
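The search operation described above can be sketched as follows. This is a minimal illustration, not the code of any router: the function name `partial_route_calculation` is hypothetical, the SPF costs are the ones listed in Figure 10.14, and the prefix-to-router assignment and the prefix metric of 0 are assumptions made only to keep the example small.

```python
def partial_route_calculation(spf_cost, advertised):
    """Pick the lowest total cost per prefix.

    spf_cost:   router name -> cost from the root (result of the last full SPF run)
    advertised: router name -> {prefix: metric advertised by that router}
    Returns a list of (prefix, total_cost, origin) sorted by ascending cost.
    """
    best = {}
    for router, prefixes in advertised.items():
        if router not in spf_cost:
            continue  # router is not on the PATH list, so its prefixes are unreachable
        for prefix, metric in prefixes.items():
            total = spf_cost[router] + metric
            # Duplicate entries with a worse cost are dropped here.
            if prefix not in best or total < best[prefix][0]:
                best[prefix] = (total, router)
    return sorted(((p, c, r) for p, (c, r) in best.items()),
                  key=lambda entry: (entry[1], entry[0]))


# SPF result list as given in Figure 10.14 (root: Pennsauken).
spf_cost = {"Pennsauken": 0, "New York": 26000, "Washington": 48000,
            "Frankfurt": 298000, "London": 315000, "Paris": 385000}

# Assumed, simplified prefix advertisements (metric 0 everywhere).
advertised = {
    "Pennsauken": {"192.168.0.17/32": 0, "172.16.33.0/30": 0},
    "New York":   {"192.168.0.19/32": 0, "172.16.33.0/30": 0},
    "Frankfurt":  {"192.168.0.8/32": 0, "172.16.33.24/30": 0},
    "Paris":      {"192.168.0.22/32": 0, "172.16.33.24/30": 0},
}

for prefix, cost, origin in partial_route_calculation(spf_cost, advertised):
    print(prefix, cost, origin)
```

No topology is touched in this sketch: only the cost table from the previous full run and the prefix lists are read, which is why the partial run is so much cheaper than the full one.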
268 10. SPF and Route Calculation
[Figure 10.14: the partial SPF calculation combines the "SPF Result list" (Pennsauken 0,
New York 26000, Washington 48000, Frankfurt 298000, London 315000, Paris 385000) with the
"Extracted IPv4 Prefix list" (destination, origin, cost) to produce the "Sorted IPv4 prefix
list" (destination, cost). Router loopbacks: Pennsauken 192.168.0.17, New York 192.168.0.19,
London 192.168.0.12, Washington 192.168.0.21, Frankfurt 192.168.0.8, Paris 192.168.0.22.]
FIGURE 10.14. A partial route calculation (PRC) is basically a simple, computationally cheap sort operation
10.3.3 Incremental SPF Run

The incremental SPF (iSPF) run is an optimized version of the full SPF run. It maintains
additional data structures, the so-called Neighbor and Parent lists, that were built during
previous full SPF calculations. The paths that have not been used so far are of special interest.
Consider Figure 10.15, which shows the SPF tree from the SPF calculation example.
Note that the link between London and Frankfurt is not on the shortest path tree from
Pennsauken's perspective. If the Pennsauken router receives a new LSP reporting that
this particular link is down, then Pennsauken does not need to schedule a full SPF run.
Because the router doing the SPF calculation did not use the link while it was up, it does
not have to consider it when it goes down.
[Figure 10.15: SPF tree of the sample topology (Area 49.0001, Level 2-only) with the routers
Pennsauken, New York, Washington, Frankfurt, London and Paris. Link labels: oc192/STM-64
(26000, 87000, 250000), oc768/STM-256 (22000, 22000), oc48/STM-16 (315000, 315000),
oc12/STM-4 (600000).]
FIGURE 10.15. Incremental SPF does not need to re-compute a SPF calculation if a link is not on
the shortest path tree
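The decision described above can be sketched as a lookup in a parent list. This is a simplified illustration for the link-down case only: the function name `needs_full_spf` is hypothetical, and the parent pointers are an assumption inferred from the SPF costs in Figure 10.14. The text itself only states that the London to Frankfurt link is off Pennsauken's tree.

```python
def needs_full_spf(parent, failed_link):
    """Return True if a link that went down is part of the local shortest path tree.

    parent:      router -> its parent on the local SPF tree (root maps to None)
    failed_link: (router_a, router_b) reported down by a new LSP
    """
    a, b = failed_link
    return parent.get(a) == b or parent.get(b) == a


# Assumed parent list on Pennsauken, kept from the previous full SPF run.
parent = {
    "Pennsauken": None,
    "New York": "Pennsauken",
    "Washington": "New York",
    "Frankfurt": "Washington",
    "Paris": "Frankfurt",
    "London": "Pennsauken",
}

print(needs_full_spf(parent, ("London", "Frankfurt")))      # False
print(needs_full_spf(parent, ("Washington", "Frankfurt")))  # True
```

A link coming up, or a metric getting better, cannot be dismissed this easily, because an unused link may then become part of the tree.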
Keep in mind that the choice between a full SPF and an incremental SPF run is a purely
local decision that applies only to the local router. For other routers in
the network, for example Frankfurt, the link between London and Frankfurt may be
meaningful, and therefore on Frankfurt's shortest path tree. The iSPF advantage on the
Pennsauken router is meaningless to the Frankfurt router. The incremental SPF run only
spares the full SPF run on some of the routers in a given area, but not on all of them.
Which routers benefit from incremental SPF is heavily dependent on topology.
Another optimization of the incremental SPF run is to track network dependencies.
Consider Figure 10.16, which shows a new router (Munich) attached as a leaf to the sample
SPF Calculation Diversity 271
[Figure 10.16: the sample topology of Figure 10.15 (Area 49.0001, Level 2-only) with a new
router, Munich, attached as a leaf; the additional link is labelled GE, 43000.]
FIGURE 10.16. Leaf routers also do not need to re-run SPF on every event that would trigger a full
SPF run
topology. The incremental SPF algorithm figures out that Munich is a leaf node and
dependent on the Frankfurt router. That knowledge is used in the SPF calculation. Recall
that once the immediate successors on the PATH list are explored, the algorithm knows
that Munich is (because of its edge position) an uninteresting node for path searches and
hence does not need to be explored.
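Spotting such leaf nodes is straightforward once the adjacencies are known. The sketch below is an assumption-laden illustration: the function name `leaf_routers` is hypothetical and the adjacency map is made up for the example (only the Munich to Frankfurt attachment is taken from the text).

```python
def leaf_routers(adjacency):
    """Map every router with exactly one neighbour to the router it depends on."""
    return {router: next(iter(neighbours))
            for router, neighbours in adjacency.items()
            if len(neighbours) == 1}


# Assumed adjacencies; the real links of the sample topology may differ.
adjacency = {
    "Pennsauken": {"New York", "London"},
    "New York": {"Pennsauken", "Washington"},
    "Washington": {"New York", "Frankfurt"},
    "Frankfurt": {"Washington", "London", "Paris", "Munich"},
    "London": {"Pennsauken", "Frankfurt", "Paris"},
    "Paris": {"Frankfurt", "London"},
    "Munich": {"Frankfurt"},
}

print(leaf_routers(adjacency))  # {'Munich': 'Frankfurt'}
```

A leaf found this way can inherit the path of the router it hangs off, so it never has to be explored as a transit candidate.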
Two scenarios where the iSPF algorithm may be applicable have been highlighted. It
is the authors' opinion that in the first scenario (Figure 10.15) the performance improvement
is next to nothing. This is because, in a distributed environment, convergence is
bound to the worst-performing router. It has been shown that not all routers take
equal advantage of the optimization, and some routers in the topology need a full SPF
run anyway. The second example (Figure 10.16) is far more interesting as it dramatically
reduces the number of nodes that need to be explored. Also, the majority of the routers
in the network take advantage of this, so there is a real SPF performance improvement.
10.3.3.1 Performance and CPU Usage
Only a few, but profound, things are known about theoretical models of the incremental
SPF calculation. This is because the underlying algorithm comes with lots of caveats and
"it depends" conditions. Incremental SPF only makes sense if the underlying topology is
sparsely meshed and has many edge nodes. Identification and path tracking turned out to
have one of the highest overheads in the full SPF run.
Stefano Previdi, a Development Engineer at Cisco Systems who maintains their IS-IS
routing protocol, claims, based on early field trials, that the average saving is 80 per cent.
The first practical examination was conducted by Cengiz Alaettinoglu and Stephen Casner of
Packet Design, who monitored the QWEST backbone in the US and analyzed full and
incremental SPF runtimes. The results are illustrated in Figure 10.17.
It will be shown shortly that SPF run time is a misguided reason to be afraid of frequent
SPF runs. It is the post-processing of route resolving and prefix insertion, and not the SPF
calculation itself, which makes the control planes of the core routers in the Internet busy.
[Figure 10.17: plot with a logarithmic vertical axis from 1 to 10000 and a horizontal axis
"Percentage of SPF runs" from 0 to 100; curves "Dijkstra SPF" (avg = 1069 usec) and
"Incremental SPF" (avg = 13 usec).]
FIGURE 10.17. Incremental SPF performs by a factor of 80 better than the full (Dijkstra) SPF based
on the QWEST topology
The result of the SPF calculation is fed into the route resolution process. The route
resolver checks to see if routes from other routing protocols have been affected by the
result of the SPF calculation.

10.4 Route Resolution
Pure reachability protocols like BGP rely on a working IGP like IS-IS to map the
reachability information, such as customer and Internet routes, to a topology in order to
properly calculate the path cost. After every SPF recalculation, the route resolver needs to
track dependent routes and update their forwarding next-hops accordingly. Finally, the
changed prefixes are downloaded to the line cards and ASICs. In the past, little attention
has been paid to the nature and performance implications of tracking the dependent routes.
However, in an Internet environment with full routing tables, it turns out that finding out
which routes are dependent and which are not is one of the dominant factors in the
total route-recalculation period.
10.4.1 BGP Recursion and Route Dependency
Routing protocols like BGP are somewhat agnostic to the underlying topology and need
an IGP that provides two services:
1. Connectivity between the internal loopback IP addresses of all the routers in an AS so
that the BGP speakers can bootstrap their iBGP mesh
2. Topology awareness to calculate the IGP distance to a BGP speaker
Internal BGP neighbours are typically not directly connected, so a router cannot simply
inherit the neighbour address from the routing update sender as other distance vector
protocols (RIP and EIGRP) would do. Even if the neighbour is directly connected, the router still
cannot inherit that information because it does not know whether the neighbour is a BGP Route
Reflector or not. The good news is that the BGP message contains information
that points to the IP address where the route originated. This field is called the next-hop and
is a mandatory BGP attribute that points to the correct forwarding router. The tcpdump
output below shows a BGP Update message containing a next-hop attribute.
Tcpdump Output
The BGP Next-hop attribute carries an IP address that the IGP needs to resolve.
08:28:27.945234 IP 192.168.0.19.179 > 192.168.0.21.28161: BGP, length: 77
    Update Message (2), length: 77
      Origin (1), length: 1, Flags [T]: IGP
      AS Path (2), length: 14, Flags [T]: 3320 4711 12788 24896
      Next-hop (3), length: 4, Flags [T]: 192.168.0.8
      Local Preference (5), length: 4, Flags [T]: 100
      Community (8), length: 12, Flags [OT]: 5511:500, 5511:516, 5511:999
      Updated routes:
        81.21.0.0/20
Route Resolution 273
After receiving the BGP update the router needs to look up 192.168.0.8 in the SPF
result database and find the local forwarding next-hop. The BGP route 81.21/20 is now
dependent on the IS-IS route pointing to 192.168.0.8. Whenever the IS-IS topology is
recalculated, the router needs to check all dependent routes and find out if there is a better
way to reach the BGP speaker.
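The lookup step can be sketched as a longest-prefix match of the BGP next-hop against the IGP routes. This is a minimal illustration using Python's standard `ipaddress` module; the function name `resolve`, the interface names and the /24 entry are hypothetical, while the cost of 298000 towards 192.168.0.8 is taken from Figure 10.14.

```python
import ipaddress


def resolve(bgp_next_hop, igp_routes):
    """Return the IGP entry (interface, cost) covering the BGP next-hop, or None."""
    address = ipaddress.ip_address(bgp_next_hop)
    matches = [(network, entry) for network, entry in igp_routes.items()
               if address in network]
    if not matches:
        return None  # next-hop unreachable: the BGP route cannot be used
    # Longest prefix wins.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


# Assumed IS-IS routes on the local router: prefix -> (outgoing interface, cost).
igp_routes = {
    ipaddress.ip_network("192.168.0.8/32"): ("so-7/3/0.0", 298000),
    ipaddress.ip_network("192.168.0.0/24"): ("so-1/0/0.0", 500000),
}

print(resolve("192.168.0.8", igp_routes))  # ('so-7/3/0.0', 298000)
print(resolve("10.0.0.1", igp_routes))     # None
```

Whatever entry this lookup returns is the one the BGP route 81.21/20 depends on, so every change to it has to trigger a re-check of that BGP route.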
A given route may arrive at a BGP router via many diverse paths. Certain rules in the
BGP route selection process depend on the IGP calculated route.
10.4.2 BGP Route Selection
BGP performs tie-breaking to find the best path according to the following list:
1. Is the BGP next-hop reachable?
2. Prefer the highest Local Preference value
3. Prefer the shortest AS Path length
4. Prefer the lowest Origin value
5. Prefer the lowest MED value
6. Prefer routes learned via eBGP over routes learned via iBGP
7. Prefer routes with the lowest IGP metric
8. Prefer routes from the peer with the lowest RID
9. Prefer routes from the peer with the lowest peer ID
At the very top of the tie-breaking list, BGP is heavily dependent on IS-IS. BGP needs
to validate its BGP next-hop and check that it is reachable before comparing the
route any further. The BGP next-hop is a mandatory BGP attribute that points to the correct
forwarding router. In Rule #7, the BGP route again depends on IS-IS. This time the lower IGP
metric gives BGP some insight into how close a BGP speaker is. Consider Figure 10.18
for an example. Router Pennsauken has learned the prefix 81.21/20 from London, New
York and Paris. After applying the BGP tie-breaking process, it turns out that the route
from New York is best, due to a lower (better) IGP metric.
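The tie-breaking list above maps naturally onto a sort key. The following sketch is a simplification under stated assumptions: the names `Path`, `ORIGIN_RANK` and `best_path` are hypothetical, MED is compared across all paths (real implementations usually compare it only between paths from the same neighbouring AS), the numeric value of the next-hop address stands in for the router ID of Rule #8, and Rule #9 is left out. The AS paths and next-hops are those shown in Figure 10.18, and the IGP metrics are the SPF costs from Figure 10.14.

```python
from dataclasses import dataclass
from ipaddress import ip_address
from typing import Optional, Tuple

ORIGIN_RANK = {"IGP": 0, "EGP": 1, "INCOMPLETE": 2}


@dataclass
class Path:
    next_hop: str
    as_path: Tuple[int, ...]
    igp_metric: Optional[int]  # None means the next-hop is unreachable (Rule 1)
    local_pref: int = 100
    origin: str = "IGP"
    med: int = 0
    ebgp: bool = False


def best_path(paths):
    """Apply Rules 1-8 of the list above and return the winning path (or None)."""
    usable = [p for p in paths if p.igp_metric is not None]
    if not usable:
        return None
    return min(usable, key=lambda p: (
        -p.local_pref,                 # Rule 2: highest Local Preference
        len(p.as_path),                # Rule 3: shortest AS Path
        ORIGIN_RANK[p.origin],         # Rule 4: lowest Origin
        p.med,                         # Rule 5: lowest MED (simplified)
        0 if p.ebgp else 1,            # Rule 6: eBGP over iBGP
        p.igp_metric,                  # Rule 7: lowest IGP metric
        int(ip_address(p.next_hop)),   # Rule 8: stand-in for the lowest RID
    ))


paths = [
    Path("192.168.0.12", (5511, 2874, 12788, 24896), igp_metric=315000),
    Path("192.168.0.19", (701, 702, 12788, 24896), igp_metric=26000),
    Path("192.168.0.8", (3320, 8847, 12788, 24896), igp_metric=298000),
]

print(best_path(paths).next_hop)  # 192.168.0.19
```

All three paths tie on Rules 2 to 6, so the decision falls through to the IGP metric, which is exactly the point where the BGP result depends on the last SPF run.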
There are different ways of implementing route recursion inside the router; the most
common one is to store backtracking pointers. Whenever a BGP route is resolved
through an IS-IS route, the router stores a pointer from the IS-IS route to the dependent
BGP routes. If an IS-IS route changes, the router simply revisits the stored prefixes and
checks whether the old IS-IS route is still the best route. It does that by checking
if the BGP next-hop is still on the shortest path. If it is, the router stops there and does
not attempt to change forwarding state. If it is not, and there has been a path change
(which could be a path becoming better or a path getting worse), the router re-runs the
recursion for the prefixes stored in the backtrack list. The router has to re-check whether
there are better paths pointing to the BGP next-hop. In the worst case, this means that
100 K prefixes need to run through the entire BGP tie-breaking process again, which can
be quite expensive in terms of computational cost (CPU load).
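A backtrack list of this kind can be sketched with a dictionary of sets. This is an illustration only, not the data structure of JUNOS or IOS: the class name `Resolver`, its methods and the interface names are hypothetical, and "still on the shortest path" is approximated by comparing the old and the new IGP entry.

```python
from collections import defaultdict


class Resolver:
    """Keeps backtracking pointers from IGP next-hop routes to dependent BGP prefixes."""

    def __init__(self):
        self.igp = {}                        # next-hop address -> (interface, cost)
        self.dependents = defaultdict(set)   # next-hop address -> dependent BGP prefixes

    def resolve(self, bgp_prefix, next_hop):
        """Resolve a BGP route and remember that it depends on this IGP route."""
        self.dependents[next_hop].add(bgp_prefix)
        return self.igp.get(next_hop)

    def igp_update(self, next_hop, new_entry):
        """Install the result of an SPF run; return the prefixes needing re-recursion."""
        old_entry = self.igp.get(next_hop)
        self.igp[next_hop] = new_entry
        if old_entry == new_entry:
            return set()  # path unchanged: do not touch forwarding state
        return set(self.dependents[next_hop])


resolver = Resolver()
resolver.igp_update("192.168.0.8", ("so-7/3/0.0", 298000))
resolver.resolve("81.21.0.0/20", "192.168.0.8")

print(resolver.igp_update("192.168.0.8", ("so-7/3/0.0", 298000)))  # set()
print(resolver.igp_update("192.168.0.8", ("so-1/0/0.0", 315000)))  # {'81.21.0.0/20'}
```

The size of the returned set is what drives the cost: with a full Internet table, a single changed IGP route can hand back tens of thousands of prefixes.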
10.4.2.1 Performance and CPU Usage
Both JUNOS and IOS do a proper BGP recursion check, but they implement it differently.
The difference is in the way the BGP code is written and its performance implications.
In IOS the BGP code is job-based. That means whenever there is a change to a BGP-learned
prefix, only a flag in the data structure of the prefix is set or cleared. A separate job
(called the BGP walker) then scans the BGP table for changed entries. Why is this
information relevant for a book about IS-IS? It means that even if IS-IS has detected that
a link has been broken and has performed all the relevant actions (flooding, scheduling of
a full SPF run, etc.), it takes in the worst case the BGP walker duration in IOS (50 seconds)
until the Cisco router starts to change prefixes, update forwarding state, and so on.
So the style of the BGP implementation dictates the convergence behaviour
of the BGP routes. Perhaps this is not the best design choice. In all fairness, the first
implementation of BGP in IOS was coded at a time when the Internet consisted of fewer
than 1000 routes. So it is probably not bad design, but a legacy effect.
In contrast, JUNOS routing software is event-driven. That means that whenever a
subsystem in the router notices that something has gone wrong, or is up again, that change
is propagated throughout the system immediately and without any delay. Immediately
after the SPF run, JUNOS does BGP recursion.
Both implementations result in a list of prefixes that need to change in the main routing
table. After that, the router updates the forwarding state in the forwarding plane. Updating
the forwarding plane is the most daunting task of all because it keeps both the forwarding
and control plane CPUs really busy. The reason is the sheer amount of data and the table
sizes that have to be pumped through a router's chassis. Currently
a full routing table of all Internet routes consumes about 120–200 MB of memory. A full
forwarding table consumes about 2 MB of memory on each line-card in the router. So
crunching at least 100 MB of BGP tables and generating N*2 MB sized forwarding tables
is the main reason the router is busy.
[Figure 10.18: the sample topology (Area 49.0001, Level 2-only) with three BGP paths for
81.21.0.0/20, all with Origin IGP and Local preference 100: AS Path 5511 2874 12788 24896
via next-hop 192.168.0.12; AS Path 701 702 12788 24896 via next-hop 192.168.0.19; AS Path
3320 8847 12788 24896 via next-hop 192.168.0.8 with Community 3320:4711. The figure also
carries the label Community: 5511:500, 5511:516, 5511:999.]
FIGURE 10.18. The transit route 81.21/20 via Pennsauken wins the BGP tie-breaking process
The next section covers legacy and state-of-the-art methods of forwarding state change
operations that can make the prefix insertion process scale better.
10.5 Prefix Insertion
In the age when the Internet was a network of only 1000 prefixes, no one had to worry
about efficiency in changing forwarding state. Figure 10.19 shows an old-style
implementation of a forwarding table structure.
10.5.1 Flat Forwarding Table
There are two tables in the figure. The first table holds all the prefixes of the main routing
table. The second table holds all the forwarding next-hops of the router. A forwarding
next-hop is a local interface plus Layer-2 data like encapsulation method, MAC addresses
etc. As a result of the route calculation, the entries in the prefixes list are all pointing to
the forwarding next-hops. To put the two tables into perspective: based on today’s Internet
routing tables, 100,000s of prefixes point to only 10s of forwarding next-hops.
It is exactly that many-to-few mapping that causes problems. Consider the sample
topology shown in Figure 10.20 where each router is a public BGP speaker and injects
BGP routes into the network. Each of the six routers carries a full BGP load, and after the
BGP tie-breaking process the routers figure out which are the best routes. The figures in
the box indicate how many active routes each router carries.
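The problem with the many-to-few mapping in a flat table can be sketched as follows. This is an illustration under stated assumptions: the function name `repoint`, the synthetic prefix names and the interface names are hypothetical, and the counts (120,000 prefixes in total, 40,000 of them behind one next-hop) are chosen only to show the scale of the effect.

```python
def repoint(flat_fib, old_next_hop, new_next_hop):
    """Rewrite every prefix that points at old_next_hop; return how many entries changed."""
    changed = 0
    for prefix, next_hop in flat_fib.items():
        if next_hop == old_next_hop:
            flat_fib[prefix] = new_next_hop  # one write per affected prefix
            changed += 1
    return changed


# Flat table: each prefix points directly at one of three forwarding next-hops.
flat_fib = {}
for i in range(120_000):
    if i < 40_000:
        flat_fib[f"prefix-{i}"] = "so-7/3/0.0"
    elif i < 80_000:
        flat_fib[f"prefix-{i}"] = "so-1/0/0.0"
    else:
        flat_fib[f"prefix-{i}"] = "so-2/0/0.0"

# A single link failure behind so-7/3/0.0 forces one update per dependent prefix.
print(repoint(flat_fib, "so-7/3/0.0", "so-1/0/0.0"))  # 40000
```

Only one forwarding next-hop changed, yet the whole table had to be walked and a third of its entries rewritten.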
[Figure 10.19: forwarding engine in which 100000s of prefixes (for example 81.21.0.0/20)
point directly to 10s of forwarding next-hops (for example so-7/3/0.0).]
FIGURE 10.19. In a flat forwarding table a prefix points directly to a forwarding next-hop
For simplicity, look at the Frankfurt routing and forwarding table only. The forwarding
table looks very simple: all 120,000 prefixes map to one of three possible next-hops,
which are the SONET/SDH links to London, Paris or Pennsauken. Now, assume the link
between Washington and Frankfurt breaks. Both Washington and Frankfurt will quickly
detect that one of their SONET/SDH interfaces is down. Next, both routers will originate
a new LSP declaring the adjacency down. Because of the default values of the SPF
hold-down timers in the network, the SPF run will be scheduled after 100 ms. As the number
of nodes and links is low, the results will be available in less than one millisecond. Now
the scary part begins: the recursion and the change of forwarding state in the forwarding
plane. The routing tables are traversed in 1–2 seconds and the control plane realizes that
it has to change 40,000 prefixes. The route processor computes new forwarding tables
and downloads them to the line-cards. Because the router has to update
Prefix Insertion 277
[Figure 10.20: the sample topology (Area 49.0001, Level 2-only) with a BGP box next to each
of the six routers (Pennsauken, New York, Washington, Frankfurt, London, Paris); the boxes
read 13K, 40K, 18K, 17K, 22K and 10K active routes.]
FIGURE 10.20. Each router in the sample topology is a BGP router and carries several thousand
active paths
