North American Network Operators Group

Re: Yahoo outage summary

  • From: Jared Mauch
  • Date: Mon Jul 09 16:13:15 2007

On Mon, Jul 09, 2007 at 01:23:46PM -0500, Borchers, Mark M. wrote:
> Jared Mauch wrote:
> > The simple truth is that prefix lists ARE hard to manage. 
> Medium-hard IMHO.  Adding prefixes is relatively easy to implement.
> Tracking and removing outdated information significantly more challenging.
> > Some people lack tools and automation to make it work or to manage
> > their networks.
> Best I can tell, even the largest transit providers handle prefix list
> updates manually.

	Some have automated systems, but they're dependent on IRR data
being correct.  There are even tools to automate population of IRR data.
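
	As a rough illustration of that kind of automation, here is a minimal
Python sketch that turns RPSL-style route objects into an inbound filter. The
sample objects, the AS numbers, the /24 length cutoff, and the IOS-style
output line are all assumptions made for the example, not any provider's
actual tooling.

```python
import ipaddress

# Hypothetical route objects, shaped like the output of an IRR whois query.
SAMPLE_IRR = """\
route:   192.0.2.0/24
origin:  AS64500

route:   198.51.100.0/24
origin:  AS64500

route:   203.0.113.7/32
origin:  AS64500

route:   198.51.100.0/25
origin:  AS64501
"""

def parse_route_objects(text):
    """Yield (network, origin) pairs from RPSL-style route objects."""
    prefix = None
    for line in text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip().lower(), value.strip()
        if key == "route":
            prefix = ipaddress.ip_network(value, strict=False)
        elif key == "origin" and prefix is not None:
            yield prefix, value.upper()
            prefix = None

def build_prefix_list(text, origin, max_len=24):
    """Return filter lines for one origin, dropping overly long prefixes."""
    nets = sorted({
        net for net, asn in parse_route_objects(text)
        if asn == origin and net.prefixlen <= max_len
    })
    # Output format is an assumption (IOS-style); adapt to the target router.
    return [f"ip prefix-list {origin}-IN permit {net}" for net in nets]

if __name__ == "__main__":
    for line in build_prefix_list(SAMPLE_IRR, "AS64500"):
        print(line)
```

The result is only as good as the registry: a stale or bogus route object
passes straight through, which is the dependency noted above.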

> At this stage of history, a human interface is probably necessary in
> making a reasonable assessment about the legitimacy of an update request.

	I think this is one of the cruxes of the problem.  If it
requires a human, a few things will happen:

	1) Prefix-list volume will be too much to deal with.
	   I see some per-ASN prefix lists that would be 255k routes and
	   include all sorts of unreasonable junk like /32s.

	2) Even taking a reasonable network (in this case, I picked AS286),
	   I see 4425 routes.  Either you check these all manually (at least
	   once), or come up with some way to model it.  I currently see 250
	   routes in the table with as-path _286_ from my view.  Either
	   there's a lot of cruft there, or there's a lot of multihomed folks
	   where I see a better path.  Which is it?  Do I have the time to
	   crunch this myself?

	3) What about those unique customer relationships?  (This is made up.)
	   Like where ATT buys transit from Cogent for those few prefixes
	   in New Zealand they care about?  There's always some compelling
	   business case to do something wonky.  Does this mean that ATT needs
	   to register their prefixes in the Cogent IRR?  How do you keep it
	   'quiet' that this is happening, instead of an object saying
	   'att priority customer route'?  How do you validate these?  Even
	   the 'big guys' will make policy mistakes once in a while.
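
	The registered-versus-seen comparison in point 2 can be sketched
roughly as follows. The input shapes (a list of registered prefixes and a
list of (prefix, as-path) pairs from one view of the table) and the function
names are assumptions for illustration. AS-sets in the path are not handled.

```python
def asn_in_path(as_path, asn):
    """True if asn appears as a whole hop, like the regex _286_."""
    return str(asn) in as_path.split()

def audit(registered, observed, asn):
    """Split prefixes into unseen, unregistered, and confirmed buckets."""
    seen = {prefix for prefix, path in observed if asn_in_path(path, asn)}
    reg = set(registered)
    return {
        # Registered but not visible via this ASN: cruft, or a better
        # path exists elsewhere (multihoming). This view cannot tell which.
        "registered_unseen": sorted(reg - seen),
        # Visible via this ASN but never registered: needs a closer look.
        "seen_unregistered": sorted(seen - reg),
        "confirmed": sorted(reg & seen),
    }

if __name__ == "__main__":
    registered = ["192.0.2.0/24", "198.51.100.0/24"]
    observed = [
        ("192.0.2.0/24", "64496 64500"),
        ("203.0.113.0/24", "64496 64500 64511"),
    ]
    print(audit(registered, observed, 64500))
```

Telling cruft apart from multihoming in the first bucket would take more
than one vantage point, so this only narrows down what a human has to check.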

	There needs to be some 'better-way' IMHO, but my ideas on this
topic have not gotten far enough along for me to put code behind them.
Perhaps I'll need to reprioritize those efforts.  It seems to me like
someone could do a cool system that churns through the route-views data, or
if necessary just duplicate part of it by getting lots of bgp feeds and
trying to parse the data.
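
	A very small sketch of what such a system might keep track of,
assuming the feed has already been parsed into (timestamp, prefix, as-path)
tuples. That record shape and the function names are hypothetical; parsing
real route-views dumps is left out entirely.

```python
def build_history(updates):
    """Map (prefix, origin ASN) to the earliest timestamp it was seen."""
    first_seen = {}
    for ts, prefix, as_path in updates:
        hops = as_path.split()
        if not hops:
            continue  # no usable origin in an empty path
        key = (prefix, hops[-1])  # origin is the last hop in the path
        if key not in first_seen or ts < first_seen[key]:
            first_seen[key] = ts
    return first_seen

def new_origins(history, since):
    """Return (prefix, origin) pairs first seen at or after 'since'."""
    return sorted(key for key, ts in history.items() if ts >= since)

if __name__ == "__main__":
    updates = [
        (1000, "192.0.2.0/24", "64496 64500"),
        (5000, "192.0.2.0/24", "64497 64500"),
        (9000, "192.0.2.0/24", "64496 64511"),
    ]
    print(new_origins(build_history(updates), since=8000))
```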

	Too bad there's not a good way to do something like dampening on routes
where, depending on the age of the announcement and some 'trust' factor, you
can assign a series of local-preferences.  I'd really like to see something
like this exist, i.e. "dampen" the "new" path (even if the prefix is a longer
one) until some timer has ticked (unless some policy criteria are satisfied,
such as same as-path, etc.).
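
	To make the idea concrete, here is one possible shape for it as a
Python sketch. Nothing like this exists in routers as far as this thread
goes; the linear ramp, the 24-hour hold timer, the local-preference values,
and every name below are assumptions picked for illustration.

```python
from dataclasses import dataclass

@dataclass
class Announcement:
    prefix: str
    as_path: tuple      # e.g. (64496, 64500)
    first_seen: float   # epoch seconds when this path first appeared

def local_pref(ann, now, known_paths, trust=1.0,
               hold=86400.0, base=100, floor=50):
    """Return a local-preference that ramps up as a new path ages."""
    # Policy shortcut: a path matching the one already known is not new.
    if known_paths.get(ann.prefix) == ann.as_path:
        return base
    age = max(0.0, now - ann.first_seen)
    # Higher trust shortens the effective hold time; cap at full preference.
    progress = min(1.0, trust * age / hold)
    return floor + int((base - floor) * progress)

if __name__ == "__main__":
    known = {"192.0.2.0/24": (64496, 64500)}
    fresh = Announcement("192.0.2.0/24", (64496, 64511), first_seen=0.0)
    for hours in (0, 12, 24):
        print(hours, local_pref(fresh, now=hours * 3600.0, known_paths=known))
```

One caveat: local-preference only ranks paths for the same prefix, so a new
longer prefix would have to be held out of the table (or filtered) until the
timer expires, rather than merely given a lower preference.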

	There's also the issue of how to implement this in the existing
routers, some of them with slower CPUs.  There's a lot of folks using
older hardware to do BGP that just might melt if they had to evaluate some
huge routing policy.

	- Jared

Jared Mauch  | pgp key available via finger from [email protected]
clue++;      |  My statements are only mine.