Ospf Opt
Abstract-Open Shortest Path First (OSPF) is the most commonly used intra-domain internet routing protocol. Traffic flow is routed along shortest paths, splitting flow at nodes where several outgoing links are on shortest paths to the destination. The weights of the links, and thereby the shortest path routes, can be changed by the network operator. The weights could be set proportional to their physical distances, but often the main goal is to avoid congestion, i.e. overloading of links, and the standard heuristic recommended by Cisco is to make the weight of a link inversely proportional to its capacity.

Our starting point was a proposed AT&T WorldNet backbone with demands projected from previous measurements. The desire was to optimize the weight setting based on the projected demands. We showed that optimizing the weight settings for a given set of demands is NP-hard, so we resorted to a local search heuristic. Surprisingly, it turned out that for the proposed AT&T WorldNet backbone, we found weight settings that performed within a few percent of the optimal general routing, where the flow for each demand is optimally distributed over all paths between source and destination. This contrasts with the common belief that OSPF routing leads to congestion, and it shows that for the network and demand matrix studied we cannot get a substantially better load balancing by switching to the proposed more flexible Multi-protocol Label Switching (MPLS) technologies.

Our techniques were also tested on synthetic internetworks, based on a model of Zegura et al. (INFOCOM'96), for which we did not always get quite as close to the optimal general routing. However, we compared with standard heuristics, such as weights inversely proportional to the capacity or proportional to the physical distances, and found that, for the same network and capacities, we could support a 50%-110% increase in the demands.

Our assumed demand matrix can also be seen as modeling service level agreements (SLAs) with customers, with demands representing guarantees of throughput for virtual leased lines.

Keywords-OSPF, MPLS, traffic engineering, local search, hashing tables, dynamic shortest paths, multi-commodity network flows.

I. INTRODUCTION

Provisioning an Internet Service Provider (ISP) backbone network for intra-domain IP traffic is a big challenge, particularly due to rapid growth of the network and user demands. At times, the network topology and capacity may seem insufficient to meet the current demands. At the same time, there is mounting pressure for ISPs to provide Quality of Service (QoS) in terms of Service Level Agreements (SLAs) with customers, with loose guarantees on delay, loss, and throughput. All of these issues point to the importance of traffic engineering, making more efficient use of existing network resources by tailoring routes to the prevailing traffic.

A. The general routing problem

The general routing problem is defined as follows. Our network is a directed graph, or multi-graph, $G = (N, A)$ whose nodes and arcs represent routers and the links between them. Each arc $a$ has a capacity $c(a)$ which is a measure for the amount of traffic flow it can take. In addition to the capacitated network, we are given a demand matrix $D$ that for each pair $(s, t)$ of nodes tells us how much traffic flow we will need to send from $s$ to $t$. We shall refer to $s$ and $t$ as the source and the destination of the demand. Many of the entries of $D$ may be zero, and in particular, $D(s, t)$ should be zero if there is no path from $s$ to $t$ in $G$. The routing problem is now, for each non-zero demand $D(s, t)$, to distribute the demanded flow over paths from $s$ to $t$. Here, in the general routing problem, we assume there are no limitations to how we can distribute the flow between the paths from $s$ to $t$.

The above definition of the general routing problem is equivalent to the one used e.g. in Awduche et al. [1]. Its most controversial feature is the assumption that we have an estimate of a demand matrix. This demand matrix could, as in our case for the proposed AT&T WorldNet backbone, be based on concrete measures of the flow between source-destination pairs. The demand matrix could also be based on a concrete set of customer subscriptions to virtual leased line services. Our demand matrix assumption does not accommodate unpredicted bursts in traffic. However, we can deal with more predictable periodic changes, say between morning and evening, simply by thinking of it as two independent routing problems: one for the morning, and one for the evening.

Having decided on a routing, the load $\ell(a)$ on an arc $a$ is the total flow over $a$, that is, $\ell(a)$ is the sum over all demands of the amount of flow for that demand which is sent over $a$. The utilization of a link $a$ is $\ell(a)/c(a)$.

Loosely speaking, our objective is to keep the loads within the capacities. More precisely, our cost function $\Phi$ sums the cost of the arcs, and the cost of an arc $a$ has to do with the relation between $\ell(a)$ and $c(a)$. In our experimental study, we had

$$\Phi = \sum_{a \in A} \Phi_a(\ell(a))$$

where for all $a \in A$, $\Phi_a(0) = 0$ and

$$\Phi_a'(x) = \begin{cases}
1 & \text{for } 0 \le x/c(a) < 1/3 \\
3 & \text{for } 1/3 \le x/c(a) < 2/3 \\
10 & \text{for } 2/3 \le x/c(a) < 9/10 \\
70 & \text{for } 9/10 \le x/c(a) < 1 \\
500 & \text{for } 1 \le x/c(a) < 11/10 \\
5000 & \text{for } 11/10 \le x/c(a) < \infty
\end{cases}$$
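As an illustration of the cost function just defined, the following is a minimal sketch that evaluates $\Phi_a$ by integrating the piecewise-constant derivative from 0 up to the load. The names `SEGMENTS`, `arc_cost` and `total_cost` are hypothetical and chosen only for this example; exact rational arithmetic is used so that the breakpoints are represented without rounding.

```python
from fractions import Fraction

# (utilization breakpoint, slope that applies from that breakpoint onwards)
SEGMENTS = [
    (Fraction(0), 1),
    (Fraction(1, 3), 3),
    (Fraction(2, 3), 10),
    (Fraction(9, 10), 70),
    (Fraction(1), 500),
    (Fraction(11, 10), 5000),
]


def arc_cost(load, capacity):
    """Phi_a(load): integrate the piecewise-constant slope from 0 to load."""
    load, capacity = Fraction(load), Fraction(capacity)
    cost = Fraction(0)
    for i, (start, slope) in enumerate(SEGMENTS):
        lo = start * capacity
        if load <= lo:
            break  # the load does not reach this segment
        # Upper end of the segment; the last segment is unbounded.
        hi = SEGMENTS[i + 1][0] * capacity if i + 1 < len(SEGMENTS) else load
        cost += slope * (min(load, hi) - lo)
    return cost


def total_cost(loads, capacities):
    """Phi: sum of the arc costs; both arguments are dicts keyed by arc."""
    return sum(arc_cost(loads[a], capacities[a]) for a in loads)
```

For example, `arc_cost(3, 3)` evaluates the cost of an arc that is exactly full, and `arc_cost(Fraction(33, 10), 3)` the cost of an arc loaded to 110% of its capacity; the steep growth between the two reflects the high penalty for overloaded arcs.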
Knowing the optimal solution for the general routing problem is an important benchmark for judging the quality of solutions based on, say, OSPF routing.

The above objective function provides a general best effort measure. We have previously claimed that our approach can also be used if the demand matrix is modeling service level agreements (SLAs) with customers, with demands representing guarantees of throughput for virtual leased lines. In this case, we are more interested in ensuring that no packet gets sent across overloaded arcs, so our objective is to minimize a maximum rather than a sum over the arcs. However, due to the very high penalty for overloaded arcs, our objective function favours solutions without overloaded arcs. A general advantage to working with a sum rather than a maximum is that even if there is a bottleneck link that is forced to be heavily loaded, our objective function still cares about minimizing the loads in the rest of the network.

B. OSPF versus MPLS routing protocols

Unfortunately, most intra-domain internet routing protocols today do not support a free distribution of flow between source and destination as defined above in the general routing problem. The most common protocol today is Open Shortest Path First (OSPF) [2]. In this protocol, the network operator assigns a weight to each link, and shortest paths from each router to each destination are computed using these weights as lengths of the links. In each router, the next link on all shortest paths to all possible destinations is stored in a table, and a demand entering the router is sent to its destination by splitting the flow between the links that are on the shortest paths to the destination. The exact mechanics of the splitting can be somewhat complicated, depending on the implementation. Here, as a simplifying approximation, we assume that it is an even split.

Our first answer is negative: for arbitrary $n$, we construct an instance of the routing problem on $\approx n^3$ nodes where any OSPF routing has its average flow on arcs with utilization $\Omega(n)$ times higher than the max-utilization in an optimal general solution. With our concrete objective function, this demonstrates a gap of a factor approaching 5000 between the cost of the optimal general routing and the cost of the optimal OSPF routing.

The next natural question is: how well does OSPF routing perform on real networks? In particular, we wanted to answer this question for a proposed AT&T WorldNet backbone. In addition, we studied synthetic internetworks, generated as suggested by Calvert, Bhattacharjee, Daor, and Zegura [5], [6]. Finding a perfect answer is hard in the sense that it is NP-hard to find an optimal setting of the OSPF weights for an arbitrary network. Instead we resorted to a local search heuristic, not guaranteed to find optimal solutions. Very surprisingly, it turned out that for the proposed AT&T WorldNet backbone, the heuristic found weight settings making OSPF routing perform within a few percent of the optimal general routing. Thus for the proposed AT&T WorldNet backbone with our projected demands, and with our concrete objective function, there would be no substantial traffic engineering gain in switching from the existing well-tested and understood robust OSPF technology to the new MPLS alternative.

For the synthetic networks, we did not always get quite as close to the optimal general routing. However, we compared our local search heuristic with standard heuristics, such as weights inversely proportional to the capacities or proportional to the physical distances, and found that, for the same network and capacities, we could support a 50%-110% increase in the demands, both with respect to our concrete cost function and, simultaneously, with respect to keeping the max-utilization below 100%.
Our local search heuristic is original in its use of hash tables to avoid cycling and for search diversification. A first attempt at using hashing tables to avoid cycling in local search was made by Woodruff and Zemel [7], in conjunction with tabu search. Our approach goes further and completely avoids the problem-specific definitions of solution attributes and tabu mechanisms, leading to an algorithm that is conceptually simpler, easier to implement, and easier to adapt to other problems.

Our local search heuristic is also original in its use of more advanced dynamic graph algorithms. Computing the OSPF routing resulting from a given setting of the weights turned out to be the computational bottleneck of our local search algorithm, as many different solutions are evaluated during a neighborhood exploration. However, our neighborhood structure allows only a few local changes in the weights. We therefore developed efficient algorithms to update the routing and recompute the cost of a solution when a few weights are changed. These speed-ups are critical for the local search to reach a good solution within reasonable time bounds.

E. Contents

In Section II we formalize our general routing model as a linear program, thereby proving Proposition 1. We present in Section III a scaled cost function that will allow us to compare costs across different sizes and topologies of networks. In Section IV, a family of networks is constructed, demonstrating a large gap between OSPF and multi-commodity flow routing. In Section V we present our local search algorithm, guiding the search with hash tables. In Section VI we show how to speed up the calculations using dynamic graph algorithms. In Section VII, we report the numerical experiments. Finally, in Section VIII we discuss our findings.

Because of space limitations, we defer to the journal version the proof that it is NP-hard to find an optimal weight setting for OSPF routing. In the journal version, based on collaborative work with Johnson and Papadimitriou, we will even prove it NP-hard to find a weight setting getting within a factor 1.77 of optimality for arbitrary graphs.

II. MODEL

A. Optimal routing

Recall that we are given a directed network $G = (N, A)$ with a capacity $c(a)$ for each $a \in A$. Furthermore, we have a demand matrix $D$ that for each pair $(s, t) \in N \times N$ tells the demand $D(s, t)$ in traffic flow between $s$ and $t$. We will sometimes refer to the non-zero entries of $D$ as the demands. With each pair $(s, t)$ and each arc $a$, we associate a variable $f_a^{(s,t)}$ telling how much of the traffic flow from $s$ to $t$ goes over $a$. Variable $\ell(a)$ represents the total load on arc $a$, i.e. the sum of the flows going over $a$, and $\Phi_a$ is used to model the piece-wise linear cost function of arc $a$.

With this notation, the general routing problem can be formulated as the following linear program:

$$\min \Phi = \sum_{a \in A} \Phi_a$$

subject to

$$\sum_{x:(x,y) \in A} f_{(x,y)}^{(s,t)} - \sum_{z:(y,z) \in A} f_{(y,z)}^{(s,t)} = \begin{cases} -D(s,t) & \text{if } y = s, \\ D(s,t) & \text{if } y = t, \\ 0 & \text{otherwise,} \end{cases} \qquad y, s, t \in N, \tag{1}$$

$$\ell(a) = \sum_{(s,t) \in N \times N} f_a^{(s,t)} \qquad a \in A, \tag{2}$$

$$\Phi_a \ge \ell(a) \qquad a \in A, \tag{3}$$

$$\Phi_a \ge 3\ell(a) - \tfrac{2}{3} c(a) \qquad a \in A, \tag{4}$$

$$\Phi_a \ge 10\ell(a) - \tfrac{16}{3} c(a) \qquad a \in A, \tag{5}$$

$$\Phi_a \ge 70\ell(a) - \tfrac{178}{3} c(a) \qquad a \in A, \tag{6}$$

$$\Phi_a \ge 500\ell(a) - \tfrac{1468}{3} c(a) \qquad a \in A, \tag{7}$$

$$\Phi_a \ge 5000\ell(a) - \tfrac{16318}{3} c(a) \qquad a \in A, \tag{8}$$

$$f_a^{(s,t)} \ge 0 \qquad a \in A;\ s, t \in N. \tag{9}$$

Constraints (1) are flow conservation constraints that ensure the desired traffic flow is routed from $s$ to $t$, constraints (2) define the load on each arc, and constraints (3) to (8) define the cost on each arc.

The above program is a complete linear programming formulation of the general routing problem, and hence it can be solved optimally in polynomial time (Khachiyan [8]), thus settling Proposition 1. In our experiments, we solved the above problems by calling CPLEX via AMPL. We shall use $\Phi_{OPT}$ to denote the optimal general routing cost.

B. OSPF routing

In OSPF routing, we choose a weight $w(a)$ for each arc. The length of a path is then the sum of its arc weights, and we have the extra condition that all flow leaving a node aimed at a given destination is evenly spread over arcs on shortest paths to that destination. More precisely, for each source-destination pair $(s, t) \in N \times N$ and for each node $x$, we have that $f_{(x,y)}^{(s,t)} = 0$ if $(x, y)$ is not on a shortest path from $s$ to $t$, and that $f_{(x,y)}^{(s,t)} = f_{(x,y')}^{(s,t)}$ if both $(x, y)$ and $(x, y')$ are on shortest paths from $s$ to $t$. Note that the routing of the demands is completely determined by the shortest paths, which in turn are determined by the weights we assign to the arcs. Unfortunately, the above condition of splitting between shortest paths based on variable weights cannot be formulated as a linear program, and this extra condition makes the problem NP-hard.

We shall use $\Phi_{OptOSPF}$ to denote the optimal cost with OSPF routing.

III. NORMALIZING COST

We now introduce a normalizing scaling factor for the cost function that allows us to compare costs across different sizes and topologies of networks. To define the measure, we introduce

$$\Phi_{Uncap} = \sum_{(s,t) \in N \times N} \bigl( D(s,t) \cdot dist_1(s,t) \bigr). \tag{10}$$
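Returning to the linear program of Section II-A, the role of the cost constraints (3) to (8) can be checked numerically: for a fixed load, the smallest $\Phi_a$ satisfying all six inequalities is the maximum of the six linear functions, and this should coincide with the piecewise-linear cost whose slopes are 1, 3, 10, 70, 500 and 5000. The sketch below is only an illustration under that reading; the names `LINES`, `BREAKS`, `cost_from_constraints` and `cost_from_slopes` are hypothetical.

```python
from fractions import Fraction as F

# (slope, coefficient of c(a)) for constraints (3)-(8):
#   Phi_a >= slope * load - coeff * capacity
LINES = [(1, F(0)), (3, F(2, 3)), (10, F(16, 3)), (70, F(178, 3)),
         (500, F(1468, 3)), (5000, F(16318, 3))]

# Utilization at which each slope starts to apply.
BREAKS = [F(0), F(1, 3), F(2, 3), F(9, 10), F(1), F(11, 10)]


def cost_from_constraints(load, cap):
    """Smallest Phi_a that satisfies constraints (3)-(8) for this load."""
    return max(s * load - b * cap for s, b in LINES)


def cost_from_slopes(load, cap):
    """Phi_a obtained by integrating the slopes segment by segment."""
    cost = F(0)
    for i, (s, _) in enumerate(LINES):
        lo = BREAKS[i] * cap
        hi = BREAKS[i + 1] * cap if i + 1 < len(BREAKS) else load
        if load > lo:
            cost += s * (min(load, hi) - lo)
    return cost
```

Comparing the two functions on a grid of loads, for instance utilizations 0, 0.05, ..., 1.2 with an arbitrary capacity, gives identical values, which is what one expects for a convex piecewise-linear function written as the upper envelope of its linear pieces.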
Fig. 4. The second type of move tries to make all paths from a to t of equal length.

solutions to our problem are $|A|$-dimensional integer vectors. Our approach maps these vectors to integers, by means of a hashing function $h$, chosen as described in [11]. Let $l$ be the number of bits used to represent these integers. We use a boolean table $T$ to record if a value produced by the hashing function has been encountered. As we need an entry in $T$ for each possible value returned by $h$, the size of $T$ is $2^l$. In our implementation, $l = 16$. At the beginning of the algorithm, all entries in $T$ are set to false. If $w$ is the solution produced at a given iteration, we set $T(h(w))$ to true, and, while searching the neighborhood, we reject any solution $w'$ such that $T(h(w'))$ is true.

This approach completely eliminates cycling, but may also reject an excellent solution having the same hashing value as a solution met before. However, if $h$ is chosen carefully, the probability of collision becomes negligible. A first attempt at using hashing tables to avoid cycling in local search was made by Woodruff and Zemel [7], in conjunction with tabu search. It differs from our approach since we completely eliminate the tabu lists and the definition of solution attributes, and we store the values for all the solutions encountered, while Woodruff and Zemel only record recent iterations (as in tabu search again). Moreover, they store the hash values encountered in a list, so checking whether a solution must be rejected takes time linear in the number of stored values, while with our boolean table this is done in constant time.

D. Speeding up neighborhood evaluation

Due to our complex neighborhood structure, it turned out that several moves often lead to the same weight settings. For efficiency, we would like to avoid evaluation of these equivalent moves. Again, hashing tables are a useful tool to achieve this goal: inside a neighborhood exploration, we define a secondary hashing table used to store the encountered weight settings as above, and we do not evaluate moves leading to a hashing value already met.

The neighborhood structure we use also has the drawback that the number of neighbors of a given solution is very large, and exploring the neighborhood completely may be too time consuming. To avoid this drawback, we only evaluate a randomly selected set of neighbors.

We start by evaluating 20 % of the neighborhood. Each time

E. Diversification

Another important ingredient for local search efficiency is diversification. The aim of diversification is to escape from regions that have been explored for a while without any improvement, and to search regions as yet unexplored.

In our particular case, many weight settings can lead to the same routing. Therefore, we observed that when a local minimum is reached, it has many neighbors having the same cost, leading to long series of iterations with the same cost value. To escape from these "long valleys" of the search space, the secondary hashing table is again used. This table is generally reset at the end of each iteration, since we want to avoid repetitions inside a single iteration only. However, if the neighborhood exploration does not lead to a solution better than the current one, we do not reset the table. If this happens for several iterations, more and more collisions will occur and more potentially good solutions will be excluded, forcing the algorithm to escape from the region currently explored. For these collisions to appear at a reasonable rate, the size of the secondary hashing table must be small compared to the primary one. In our experiments, its size is 20 times the number of arcs in the network.

This approach for diversification is useful to avoid regions with a lot of local minima with the same cost, but is not sufficient to completely escape from one region and go to a possibly more attractive one. Therefore, each time the best solution found is not improved for 300 iterations, we randomly perturb the current solution in order to explore a new region of the search space. The perturbation consists of adding a random value, uniformly chosen between -2 and +2, to 10 % of the weights.

VI. COST EVALUATION

We will now first show how to evaluate our cost function for the static case of a network with a specified weight setting. Computing this cost function from scratch is unfortunately too time consuming for our local search, so afterwards, we will show how to reuse computations, exploiting that there are only few weight changes between a current solution and any solution in its neighborhood.

A. The static case

We are given a directed multigraph $G = (N, A)$ with arc capacities $\{c_a\}_{a \in A}$, demand matrix $D$, and weights $\{w_a\}_{a \in A}$. For the instances considered, the graph is sparse with $|A| = O(|N|)$. Moreover, in the weighted graph the maximal distance between any two nodes is $O(|N|)$.

We want to compute our cost function $\Phi$. The basic problem is to compute the loads resulting from the weight setting. We will consider one destination $t$ at a time, and compute the total flow from all sources $s \in N$ to $t$. This gives rise to a certain partial load $\ell_a^t = \sum_{s \in N} f_a^{(s,t)}$ for each arc. Having done the above computation for each destination $t$, we can compute the load $\ell_a$ on arc $a$ as $\sum_{t \in N} \ell_a^t$.
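The per-destination load computation of the static case can be sketched as follows. This is a minimal illustration, not the implementation used in the experiments: it assumes positive integer weights, represents the graph as a dictionary from arcs `(x, y)` to weights (so parallel arcs of a multigraph are not modelled), uses a binary heap rather than bucketing, and splits flow evenly over the outgoing arcs that lie on shortest paths to the destination. The function names `distances_to` and `ospf_loads` are hypothetical.

```python
import heapq
from collections import defaultdict

INF = float("inf")


def distances_to(t, nodes, weights):
    """Dijkstra on the reversed graph: dist[x] is the distance from x to t."""
    incoming = defaultdict(list)  # y -> [(x, w)] for every arc (x, y)
    for (x, y), w in weights.items():
        incoming[y].append((x, w))
    dist = {x: INF for x in nodes}
    dist[t] = 0
    heap = [(0, t)]
    while heap:
        d, y = heapq.heappop(heap)
        if d > dist[y]:
            continue  # stale heap entry
        for x, w in incoming[y]:
            if d + w < dist[x]:
                dist[x] = d + w
                heapq.heappush(heap, (d + w, x))
    return dist


def ospf_loads(nodes, weights, demand):
    """Load on every arc when each node splits its flow evenly."""
    load = {a: 0.0 for a in weights}
    for t in nodes:
        dist = distances_to(t, nodes, weights)
        # Outgoing arcs of each node that lie on a shortest path to t.
        out = defaultdict(list)
        for (x, y), w in weights.items():
            if dist[x] < INF and dist[x] == dist[y] + w:
                out[x].append(y)
        inflow = defaultdict(float)  # flow towards t arriving at a node
        # Decreasing distance: all inflow of a node is known when it is visited.
        for y in sorted(nodes, key=lambda v: dist[v], reverse=True):
            if y == t or not out[y]:
                continue
            share = (demand.get((y, t), 0.0) + inflow[y]) / len(out[y])
            for z in out[y]:
                load[(y, z)] += share  # partial load for t, summed over all t
                inflow[z] += share
    return load
```

On a four-node diamond with unit weights and a demand of 10 from `s` to `t`, each of the four arcs carries a load of 5; raising the weight of one of the two arcs leaving `s` to 2 moves the whole demand onto the other branch.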
(c) 2000 IEEE
0-7803-5880-5/00/$10.00 524 IEEE INFOCOM 2000
To compute the flow to $t$, our first step is to use Dijkstra's algorithm to compute all distances to $t$ (normally Dijkstra's algorithm computes the distances away from some source, but we can just apply such an implementation of Dijkstra's algorithm to the graph obtained by reversing the orientation of all arcs in $G$). Having computed the distance $d_x^t$ to $t$ for each node, we compute the set $A^t$ of arcs on shortest paths to $t$, that is,

$$A^t = \{ (x, y) \in A : d_x^t - d_y^t = w_{(x,y)} \}.$$

For each node $x$, let $\delta_x^t$ denote its outdegree in $A^t$, i.e. $\delta_x^t = |\{ y \in N : (x, y) \in A^t \}|$.

Observation 4: For all $(y, z) \in A^t$,

$$\ell_{(y,z)}^t = \frac{1}{\delta_y^t} \Bigl( D(y, t) + \sum_{(x,y) \in A^t} \ell_{(x,y)}^t \Bigr).$$

Using Observation 4, we can now compute all the loads $\ell_{(y,z)}^t$ as follows. The nodes $y \in N$ are visited in order of decreasing distance $d_y^t$ to $t$. When visiting a node $y$, we first set $\ell = \frac{1}{\delta_y^t} \bigl( D(y, t) + \sum_{(x,y) \in A^t} \ell_{(x,y)}^t \bigr)$. Second, we set $\ell_{(y,z)}^t = \ell$ for each $(y, z) \in A^t$.

To see that the above algorithm works correctly, we note the invariant that when we start visiting a node $y$, we have correct loads $\ell_a^t$ on all arcs $a$ leaving nodes coming before $y$. In particular this implies that all the arcs $(x, y)$ entering $y$ have correctly computed loads $\ell_{(x,y)}^t$, and hence when visiting $y$, we compute the correct load $\ell_{(y,z)}^t$ for arcs $(y, z)$ leaving $y$.

Using bucketing for the priority queue in Dijkstra's algorithm, the computation for each destination takes $O(|A|) = O(|N|)$ time, and hence our total time bound is $O(|N|^2)$.

B. The dynamic case

In our local search we want to evaluate the cost of many different weight settings, and these evaluations are a bottleneck for our computation. To save time, we will try to exploit the fact that when we evaluate consecutive weight settings, typically only a few arc weights change. Thus it makes sense to try to be lazy and not recompute everything from scratch, but to reuse as much as possible. With respect to shortest paths, this idea is already well studied (Ramalingam and Reps [12]), and we can apply their algorithm directly. Their basic result is that, for the recomputation, we only spend time proportional to the number of arcs incident to nodes $x$ whose distance $d_x^t$ to $t$ changes. In our experiments there were typically only very few changes, so the gain was substantial, on the order of a factor of 20 for a 100 node graph. Similar positive experiences with this laziness have been reported in Frigioni et al. [13].

The set of changed distances immediately gives us a set of "update" arcs to be added to or deleted from $A^t$. We will now present a lazy method for finding the changes of loads. We will operate with a set $M$ of "critical" nodes. Initially, $M$ consists of all nodes with an incoming or outgoing update arc. We repeat the following until $M$ is empty: First, we take the node $y \in M$ which maximizes the updated distance $d_y^t$ and remove $y$ from $M$. Second, set $\ell = \frac{1}{\delta_y^t} \bigl( D(y, t) + \sum_{(x,y) \in A^t} \ell_{(x,y)}^t \bigr)$. Finally, for each $(y, z) \in A^t$, where $A^t$ is also updated, if $\ell \ne \ell_{(y,z)}^t$, set $\ell_{(y,z)}^t = \ell$ and add $z$ to $M$.

To see that the above suffices, first note that the nodes visited are considered in order of decreasing distances. This follows because we always take the node at the maximal distance and because when we add a new node $z$ to $M$, it is closer to $t$ than the currently visited node $y$. Consequently, our dynamic algorithm behaves exactly as our static algorithm except that it does not treat nodes not in $M$. However, all nodes whose incoming or outgoing arc set changes, or whose incoming arc loads change, are put in $M$, so if a node is skipped, we know that the loads around it would be exactly the same as in the previous evaluation.

VII. NUMERICAL EXPERIMENTS

We present here our results obtained with a proposed AT&T WorldNet backbone as well as synthetic internetworks.

Besides comparing our local search heuristic (HeurOSPF) with the general optimum (OPT), we compared it with OSPF routing with "oblivious" weight settings based on properties of the arc alone but ignoring the rest of the network. The oblivious heuristics are InvCapOSPF, setting the weight of an arc inversely proportional to its capacity as recommended by Cisco [3]; UnitOSPF, just setting all arc weights to 1; L2OSPF, setting the weight proportional to its physical Euclidean distance ($L_2$ norm); and RandomOSPF, just choosing the weights randomly.

Our local search heuristic starts with randomly generated weights and performs 5000 iterations, which for the largest graphs took approximately one hour. The random starting point was the weights chosen for RandomOSPF, so the initial cost of our local search is that of RandomOSPF.

The results for the AT&T WorldNet backbone with different scalings of the projected demand matrix are presented in Table I. In each entry we have the normalized cost $\Phi^*$ introduced in Section III. The normalized cost is followed by the max-utilization in parentheses. For all the OSPF schemes, the normalized cost and max-utilization are calculated for the same weight setting and routing. However, for OPT, the optimal normalized cost and the optimal max-utilization are computed independently with different routing. We do not expect any general routing to be able to get the optimal normalized cost and max-utilization simultaneously. The results are also depicted graphically in Figure 5. The first graph shows the normalized cost, and the horizontal line shows our threshold of $10\frac{2}{3}$ for regarding the network as congested. The second graph shows the max-utilization.

The synthetic internetworks were produced using the generator GT-ITM [14], based on a model of Calvert, Bhattacharjee, Daor, and Zegura [5], [6]. This model places nodes in a unit square, thus getting a distance $\delta(x, y)$ between each pair of nodes. These distances lead to a random distribution of 2-level graphs, with arcs divided in two classes: local access arcs and long distance arcs. Arc capacities were set equal to 200 for local access arcs and to 1000 for long distance arcs. The above model does not include a model for the demands. We decided to model the demands as follows. For each node $x$, we pick two random numbers $O_x, D_x \in [0, 1]$. Further, for each pair $(x, y)$ of nodes we pick a random number $C_{(x,y)} \in [0, 1]$. Now, if the Euclidean distance ($L_2$) between $x$ and $y$ is $\delta(x, y)$, the demand between $x$ and $y$ is

$$\alpha \, O_x \, D_y \, C_{(x,y)} \, e^{-\delta(x,y) / 2\Delta}.$$
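The demand model for the synthetic networks can be sketched in a few lines. The meanings of $\alpha$ and $\Delta$ are not given in the text above, so this sketch treats $\alpha$ as a free scaling parameter and takes $\Delta$ to be the largest Euclidean distance between any pair of nodes; both readings are assumptions made for illustration, and the function name `synthetic_demands` is hypothetical. Note also that `random.random()` draws from $[0, 1)$ rather than the closed interval.

```python
import math
import random


def synthetic_demands(coords, alpha, seed=0):
    """Demand alpha * O_x * D_y * C_(x,y) * exp(-delta(x,y) / (2 * Delta)).

    coords maps each node to its (x, y) position in the unit square.
    """
    rng = random.Random(seed)
    nodes = list(coords)
    O = {x: rng.random() for x in nodes}  # per-node factor as a source
    D = {x: rng.random() for x in nodes}  # per-node factor as a destination
    delta = {(x, y): math.dist(coords[x], coords[y])
             for x in nodes for y in nodes if x != y}
    big_delta = max(delta.values())  # assumption: largest pairwise distance
    return {
        (x, y): alpha * O[x] * D[y] * rng.random()  # rng.random() is C_(x,y)
        * math.exp(-d / (2 * big_delta))
        for (x, y), d in delta.items()
    }
```

Under these assumptions every demand lies between 0 and `alpha`, and pairs of nodes that are far apart tend to receive smaller demands because of the exponential factor.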
TABLE I
AT&T'S PROPOSED BACKBONE WITH 90 NODES AND 274 ARCS AND SCALED PROJECTED DEMANDS, WITH (*) MARKING THE ORIGINAL UNSCALED
DEMAND.
Fig. 5. AT&T's proposed backbone with 90 nodes and 274 arcs and scaled projected demands. [Two plots against demand: normalized cost and max-utilization, with curves for InvCapOSPF, UnitOSPF, L2OSPF, RandomOSPF, HeurOSPF, and OPT.]