North American Network Operators Group

Re: Outbound Route Optimization

  • From: Richard A Steenbergen
  • Date: Fri Jan 23 17:56:55 2004

On Fri, Jan 23, 2004 at 11:01:14AM -0800, Richard J. Sears wrote:
> 
> In reality, I learned that BGP is simply not up to the task of handling
> anything beyond its limited scope - best path routing. In today's world,
> we need to look beyond best path as it simply has nothing to do with
> best performance, at least not in 40 to 50% of my traffic routing
> decisions. You can do that with bodies (if you're a purist) or you can
> utilize route optimization equipment. In either case, you have to do it.
> 
> I think for the time being, route optimization equipment, and the
> companies that utilize them will have an edge over those doing things
> the manual way. Regardless of which box I could have chosen, the end
> result is that myself and my backbone engineers have far more time on
> their hands for other tasks and my customers are much happier than they
> were before.

BGP is relatively good at determining the best path when you are a major
carrier with connectivity to "everyone" (i.e. when traffic flows
"naturally"), in many locations, and you engineer your network so that you
have sufficient capacity to support the traffic flows.

However, BGP is relatively BAD at determining the best path when you are
the customer of many carriers, some of whom have serious problems on their
network that they spend a lot of time and effort trying to hide from you,
and when you have a diverse assortment of link speeds. In this setup,
traffic does not flow "naturally".

I often find myself spending a fair amount of time talking people down
from trying to make their network "better" by buying transit from every
carrier they can get their hands on. A single flapping session on a single
transit can get you dampened for quite a while, making you only as strong
as your weakest link. Also, the convergence becomes painfully slow, not to
mention flaptacular, as best paths are computed, announced, re-computed,
re-announced, re-re-computed, etc. (and if you don't believe me, watch
Internap converge some time). Plus if you are an inbound heavy network, 
the localpref increase via certain paths (everyone localprefs their own 
customers above routes they hear from peers/transits) will cause a skew in 
traffic that prepending may have little to no influence over.
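
To illustrate why prepending loses to localpref, here is a minimal sketch
of the first two relevant comparisons in BGP best-path selection, as seen
from a carrier you buy transit from. It models only local preference and
AS-path length; the real decision process has more steps. The Route class,
the localpref values 200 and 100, and the documentation ASNs (64496 as
"you") are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    via: str          # hypothetical label for how the carrier learned the route
    local_pref: int   # assigned by the carrier's own policy, not by you
    as_path: tuple    # AS path as received by the carrier


def best_path(routes):
    # Higher local-pref always wins; AS-path length only breaks ties.
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path)))


# You (AS 64496) prepend three extra times toward this carrier, hoping to
# push inbound traffic over to your other transit (AS 64497).
via_customer = Route("customer", 200, (64496, 64496, 64496, 64496))
via_peer = Route("peer", 100, (64497, 64496))

chosen = best_path([via_customer, via_peer])
print(chosen.via)  # customer
```

The carrier still sends traffic over the direct customer link despite the
longer AS path, because the path-length comparison is never reached.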

Bottom line, BGP is most useful when you select paths as naturally as
possible, with as few transits as are needed for redundancy, and use
equal-sized pipes with sufficient capacity to support the traffic flow (or
where you make capacity decisions based on the traffic levels, not the
other way around). When you try to force BGP to work with the model you 
described, it will go kicking and screaming.

Now this isn't to say that even the best-run carrier doesn't have its
off days, or that there is no potential benefit from having many different
carriers to choose from, but it does almost REQUIRE a different system of
path selection to be effective. Unfortunately there are some serious 
problems to overcome in order for any such system to scale, not the least 
of which are:

* The inability to receive FULL BGP routes from every BGP peer to your
optimization box without requiring your transit providers to set up a host
of eBGP multihop sessions (which most refuse to do). This means you will
always be stuck assuming that every egress path is a transit and can reach 
any destination on the Internet until your active or passive probing says 
otherwise.

* The requirement of deaggregation in order to make best path decisions 
effective. For example, someone's T3 to genuithree gets congested and the 
best path to their little /24 of the Internet is through another provider. 
Do you move 4.0.0.0/8?

* The constant noise of stupid scripts pinging everything on the Internet. 
Once upon a time I heard some pretty interesting numbers about the amount 
of traffic a newly routed /8 with no usage received just in Internet noise 
from all the scanners, hackers, and worms out there. I don't know if it 
was true or not (though I'm sure someone on this list has done such and 
can tell us exactly how much traffic it is), but just looking at the 
amount of noise much smaller blocks receive leads one to the conclusion 
that active analysis will not scale to support everyone.
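
The deaggregation point above can be sketched with longest-prefix
matching. This is only an illustration under assumptions: the table
layout, the lookup helper, the "transit-A"/"transit-B" labels, and the
choice of 4.2.3.0/24 as the congested block are all hypothetical. It shows
that steering one /24 means injecting a more-specific route, and how many
such routes a single /8 could turn into.

```python
import ipaddress


def lookup(table, dst):
    """Longest-prefix match: the most specific covering prefix wins."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in table if addr in net]
    return table[max(matches, key=lambda net: net.prefixlen)]


# The aggregate as learned from BGP, with its current egress.
aggregate = ipaddress.ip_network("4.0.0.0/8")
table = {aggregate: "transit-A"}

# Moving only the congested /24 requires a more-specific entry.
table[ipaddress.ip_network("4.2.3.0/24")] = "transit-B"

print(lookup(table, "4.2.3.10"))  # transit-B
print(lookup(table, "4.9.9.9"))   # transit-A

# Worst case: the number of /24s hiding inside the one /8.
slash24_count = 2 ** (24 - aggregate.prefixlen)
print(slash24_count)  # 65536
```

Moving the whole /8 instead would drag every other destination inside it
along with the one congested /24.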

etc etc etc. There is certainly room for improvement of traffic 
engineering in the protocols, but the perl scripts and zebra hacks most 
people are throwing at the problem currently are far from capable of 
handling it.

-- 
Richard A Steenbergen <[email protected]>       http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)