North American Network Operators Group

Re: Global BGP - 2001-06-23 - Vendor X's statement...

  • From: Brett Frankenberger
  • Date: Tue Jun 26 18:45:17 2001

> On Tue, 26 June 2001, "Chance Whaley" wrote:
> > How would you like Vendor X to liberally handle the situation? There is
> > a point when being too liberal causes issues - like this one. The idea is
> > that if the original peer followed the spec it would have been contained
> > at the source and this would never have happened. Where is the line?
> > Something about GIGO comes to mind.
> 
> I would prefer implementations (not vendors) reject the one route which
> they don't like, and accept the other 100,000+ routes in the global Internet
> without flapping BGP sessions.
> 
> Killing 100,000 routes because you don't like one seems a bit excessive.

But an invalid route should never be received.  If it is, something is
fundamentally wrong.  It's not like, say, a CRC or checksum error,
which indicates that a packet got corrupted.  That's a normal occurrence
and dropping the packet (and then moving on) is the right thing to do.

But when we're talking about malformed BGP advertisements, we
know with a high degree of certainty that the sender is broken.  We
don't know how broken, nor do we know the details of the brokenness,
but we know that it's broken.  So our options are:

(1) Reject the one bogus route, accept everything else, thereby
assuming that the brokenness extends only to that one route.

(2) Send a NOTIFICATION, drop everything, because we don't want to be
accepting routes from a known-to-be-broken router.  (It is important to
note that we don't know that the other, non-malformed, routes are good. 
We only know that they aren't malformed.  They may or may not actually
be valid.)
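
To make the two options concrete, here is a minimal Python sketch.  It
is an illustration only: the names (Route, Session,
option_1_discard_one, option_2_drop_session) are hypothetical and do
not come from any real BGP implementation, and "malformed" is reduced
to a boolean flag standing in for a real syntax check.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Route:
    prefix: str
    malformed: bool = False  # assumption: stand-in for a real syntax check


@dataclass
class Session:
    up: bool = True
    rib: list = field(default_factory=list)  # routes accepted from this peer


def option_1_discard_one(session, updates):
    """Option (1): skip malformed routes, keep the session and the rest."""
    for route in updates:
        if route.malformed:
            continue  # assumes the brokenness is limited to this one route
        session.rib.append(route)
    return session


def option_2_drop_session(session, updates):
    """Option (2): on the first malformed route, drop the whole session."""
    for route in updates:
        if route.malformed:
            session.up = False   # models sending NOTIFICATION and closing
            session.rib.clear()  # discard everything learned from the peer
            break
        session.rib.append(route)
    return session


updates = [
    Route("192.0.2.0/24"),
    Route("198.51.100.0/24", malformed=True),
    Route("203.0.113.0/24"),
]
s1 = option_1_discard_one(Session(), updates)
s2 = option_2_drop_session(Session(), updates)
print(len(s1.rib), s1.up)  # 2 True
print(len(s2.rib), s2.up)  # 0 False
```

Note that neither function can tell whether the well-formed routes are
actually valid; option (1) simply trusts them, and option (2) simply
does not.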

Given that we're talking about the routing information for the core of
the Internet, the most reasonable thing to do seems to be (2): Discard
Everything.  After all, we *know* that the stuff being discarded is
coming from a broken router ... we just don't know how broken it is.
Why gamble with the backbone by assuming that "hey, it's broken, but
the brokenness doesn't extend to sending wrong but correctly formed
advertisements"?  (Whether discarding everything, then bringing the
session right back up, downloading routes, eventually getting to the
malformed one, and repeating the process is a good idea is a different
question.)
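
The reset-and-retry loop in that parenthetical can be sketched as
follows.  This is a toy model under stated assumptions, not a
description of any real implementation: simulate_flaps is a
hypothetical helper, the peer is assumed to replay the same updates in
the same order on every attempt, and retries are capped at
max_attempts.

```python
def simulate_flaps(updates, max_attempts):
    """Count route installs and withdrawals across repeated session resets.

    updates is a list of (prefix, malformed) pairs that the peer is
    assumed to replay identically each time the session comes back up.
    """
    installs = withdrawals = 0
    for attempt in range(max_attempts):
        rib = []
        for prefix, malformed in updates:
            if malformed:
                withdrawals += len(rib)  # reset discards all learned routes
                break
            rib.append(prefix)
            installs += 1
        else:
            # No malformed update seen: the session stays up.
            return attempt + 1, installs, withdrawals
    return max_attempts, installs, withdrawals


updates = [
    ("192.0.2.0/24", False),
    ("198.51.100.0/24", False),
    ("203.0.113.0/24", True),  # the malformed advertisement
]
attempts, installs, withdrawals = simulate_flaps(updates, max_attempts=5)
print(attempts, installs, withdrawals)  # 5 10 10
```

Every attempt installs the same routes and then throws them away again,
which is the flapping at issue.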

Do I wish my dual-homed routers would accept everything else and just
ignore the bad route?  Sure.  But even if the "everything else" is
crap, it's not going to get beyond the edge of my network, because I'm
not a big provider and I have no BGP anywhere except the edge, and I
don't pass what I receive via BGP beyond my network (i.e. I only
advertise my routes).

Do I think UUnet should propagate decent-looking routes that it got
from a known-to-be-broken neighbor and pass them through its core and
on to its peers and BGP-connected transit customers?  Probably not. 
The penalty for passing on bogus information is too high.  "Be
liberal in what you accept" might suggest that the session should
be kept up and the one bad advertisement be discarded, but "Be
conservative in what you send" would tend to argue for never, under any
circumstances, passing on routes that you received from a
known-to-be-broken router.

And, of course, there's the lack of data issue.  There have been at
least three significant outages that were the result of BGP flappage
caused by malformed AS paths.  In those three cases, it is generally
believed that the Internet would have been better off had the BGP
sessions stayed up and just the one malformed advertisement been
discarded.

Should we therefore change the protocol?  We don't know, because we
don't know how many times the "if it sends something bogus, assume it's
seriously broken and discard everything" rule has saved the Internet
from a significant outage.

At a minimum, if we're going to change the RFC based on the notion that
"single malformed advertisements from otherwise functional routers" are
significantly more prevalent than "routers that advertise a bunch of
crap, some of which is correctly formed", we should at least have some
data suggesting that it is in fact more prevalent.  (We should also
consider the consequences of getting this one wrong.)

     -- Brett