North American Network Operators Group

Re: Verio Peering Question

  • From: Paul Vixie
  • Date: Sat Sep 29 15:43:09 2001

> > there has to be a limit.
> 
> A limit is needed, but the filtering method in question to me essentially
> says this:
> 
> if you have 64.x.x.x/15, slice it into as many /20's as you can and bloat
> as much as you want.. we feel this is an acceptable practice.

i strongly doubt that the policy was formulated on that basis.  it may or
may not be equivalent to what you said, but the above formulation does not
describe the actual motivations of anyone i'm aware of.

> Yet, if a legitimately multihomed customer wants to push out a
> single /24 (1 AS, 1 prefix) that is not considered acceptable.

actually there's a loophole.  nobody filters swamp /24's that i know of, since
so much of the oldest salty crusty layers of the internet are built on those.

> The only kind of prefix filtering I would want to implement is something
> that can accomplish:
> 
> 1. Define threshold, say /20 or /18 or hell even /8.
> 2. all prefixes longer than threshold get held until entire tables are
> loaded
> 3. start looking at the longer prefixes across the entire ipv4 space
> starting with the longest and finishing at threshold+1
> 4. if prefixes longer than threshold appear as part of a larger aggregate
> block that *originate* from the same AS, drop.
> 5. if prefixes longer than threshold originate from a different AS than
> the aggregate, accept.

i wish you luck in implementing this proposal.  i think that folks with
multivendor global networks will find it completely impractical, but you
can probably pull it off in a regional zebra-based network with no problem.
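
a minimal python sketch of the five-step proposal quoted above, for readers
who want to see the logic spelled out.  it is not anyone's production filter.
the function name, the (prefix, origin AS) tuple format, and the choice to
accept a long prefix that has no covering aggregate at all are assumptions;
the proposal does not say what to do in that last case.

```python
import ipaddress

def filter_routes(routes, threshold=20):
    """routes: iterable of (prefix_string, origin_asn) for the full table.

    returns the list of (network, origin_asn) pairs that survive the filter.
    """
    parsed = [(ipaddress.ip_network(p), asn) for p, asn in routes]

    # steps 1-2: accept everything at or shorter than the threshold, and
    # hold the longer prefixes until the whole table has been loaded.
    kept = [(n, a) for n, a in parsed if n.prefixlen <= threshold]
    held = [(n, a) for n, a in parsed if n.prefixlen > threshold]

    # step 3: walk the held prefixes, longest first.
    held.sort(key=lambda r: r[0].prefixlen, reverse=True)

    for net, asn in held:
        # step 4: drop if a shorter covering prefix has the same origin AS.
        redundant = any(
            other_asn == asn
            and other.prefixlen < net.prefixlen
            and net.subnet_of(other)
            for other, other_asn in parsed
        )
        # step 5: otherwise (different origin AS, or no aggregate: assumed) accept.
        if not redundant:
            kept.append((net, asn))
    return kept
```

the comparison against every other route is quadratic in table size, which
is fine for a toy table and says nothing about what a real router could do.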

> This way I could get rid of redundant information yet at the same time not
> cause any trouble to smaller multihomed customers.  I'm not saying that we
> should allow /32's to be pushed everywhere either.  As you said there has
> to be a limit, and /24 seems to be a pretty good one if something along
> the lines of the above mentioned filtering algorithm could be used.

let's do some math on this.  swamp space is more or less 192/8 and 193/8
(though parts of other /8's were also cut up with a pretty fine bladed knife).
if every 192.*.*/24 and 193.*.*/24 were advertised, that would be more prefixes
than the entire current table shown in tony bates' reports (128K vs ~100K).

that is of course just the existing swamp.  and it would be hard to handle
but even harder to prevent since there's no real way using today's routers to
say "accept the current /24's in 192/8 and 193/8 but don't allow new ones".
this is the bogey man that gives people like smd nightmares.

then there's everything else.  if 20 /8's were cut up into /24's then tony
bates' report would have 1.3M more things in it than are there today.  if
the whole IPv4 space were cut up that way then we'd see 16M routes globally.
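
the prefix counts above can be checked with a few lines of python.  the
helper name count_24s is made up for this illustration; the only fact used
is that one /8 contains 2^(24-8) = 65536 distinct /24's.

```python
def count_24s(num_slash8s):
    # each /8 splits into 2**(24 - 8) = 65536 /24's
    return num_slash8s * 2 ** (24 - 8)

swamp = count_24s(2)      # 192/8 and 193/8
twenty = count_24s(20)    # twenty /8's
whole = count_24s(256)    # the entire IPv4 space

print(swamp)   # 131072   (128K)
print(twenty)  # 1310720  (about 1.3M)
print(whole)   # 16777216 (16M)
```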

those numbers may seem unreasonable, either because current routers can't
hold them, or because current routing protocols would never be able to
converge, or because you just can't imagine humanity generating even 1.3M
/24's let alone 16M of them.

multihoming is a necessary property of a scalable IP economy.  actually,
provider independence is necessary, multihoming is just a means to that end.
if you don't think there are more than 1.3M entities worldwide who would pay
a little extra for provider independence, then you don't understand what's
happened to *.COM over the last 10 years.  in that case i'll simply ask you
to take my word for it -- you make 1.3M slots available, they'll fill up.

i do not know the actual limit -- that is, where it ends.  i know it's going
to be higher than 1.3M though.  i also know that the limit of humanity's
desire for "provider independence without renumbering" (or "multihoming") is
currently higher than what the internet's capital plant, including budgeted
expansions, can support.  and i strongly suspect that this will remain true
for the next 5..10 years.

> I'm sure in reality there's many reasons this would not be able to be
> implemented (CPU load perhaps) but it would at least do something more than
> a "gross hack" that nails some offenders, not all by any means, and
> impacts multihomed customers who are only a portion of the problem that
> the current prefix filtering solution does not solve.

people are out there building networks using available technology.  forget
about CPU load and look at delta volume and convergence.  the "internet
backbone" is not fat enough to carry the amount of BGP traffic that it would
take to represent the comings and goings of 16M prefixes.  1.3M is probably
achievable by the time it comes due for natural causes.  do any of our local
theorists have an estimate of how much BGP traffic two adjacent core nodes
will be exchanging with 1.3M prefixes?  is it a full DS3 worth?  more?  less?
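
one way to frame the question above is a back-of-envelope calculator like
the python sketch below.  it does not answer the question.  the per-prefix
update rate and the bytes per update are pure assumptions supplied by the
caller, and real BGP packs multiple prefixes into one UPDATE message, so
treat the output as an order-of-magnitude guess at best.  the only hard
number is the DS3 line rate of 44.736 Mb/s.

```python
DS3_BITS_PER_SEC = 44_736_000  # DS3 line rate

def bgp_update_bps(prefixes, updates_per_prefix_per_hour, bytes_per_update):
    """steady-state update traffic in bits/sec between two peers.

    all three inputs are assumptions chosen by the caller, not measurements.
    """
    updates_per_sec = prefixes * updates_per_prefix_per_hour / 3600
    return updates_per_sec * bytes_per_update * 8

def fraction_of_ds3(bits_per_sec):
    # how much of a DS3 the given load would occupy (1.0 == a full DS3)
    return bits_per_sec / DS3_BITS_PER_SEC

if __name__ == "__main__":
    # hypothetical inputs: 1.3M prefixes, one update per prefix per hour,
    # 100 bytes on the wire per update.
    load = bgp_update_bps(1_300_000, 1, 100)
    print(f"{load:.0f} b/s, {fraction_of_ds3(load):.2%} of a DS3")
```

changing the assumed flap rate by a factor of ten moves the answer by the
same factor, which is why measured churn data matters more than the formula.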

every time you change out the capital plant on a single global AS core in
order to support some sea change like 10Gb/s sonet or 200K routes, it costs
that AS's owner between US$200M and US$1B depending on the density and
capacity.  bean counters for old line telcos used to want a 20 year payback
(depreciation schedule) on investments of that order of magnitude.  today
a provider is lucky to get five years between core transplants.  bringing
the period down to two to three years would cause "the internet" to cost more
to produce than its "customers" are willing to pay.
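
the effect of a shorter replacement cycle can be seen with simple
straight-line arithmetic, sketched in python below.  straight-line
depreciation with no interest or salvage value is an assumption made for
illustration; the dollar figures and periods are the ones given above.

```python
def annual_cost(capex_usd, years):
    # straight-line: spread the capital cost evenly over its useful life
    return capex_usd / years

if __name__ == "__main__":
    for years in (20, 5, 3, 2):
        low = annual_cost(200e6, years)
        high = annual_cost(1e9, years)
        print(f"{years:2d}-year cycle: US${low / 1e6:.0f}M to US${high / 1e6:.0f}M per year")
```

going from a five-year cycle to a two-year cycle multiplies the annual
capital charge by 2.5 under this simple model.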

so in the meanwhile, verio (and others who aren't mentioned in this thread)
are using the technology they have in order to maximize the period before
their capital plant becomes obsolete.  as i said in a previous note, they are
certainly balancing their filters so that filtering more would result in too
many customer complaints due to unreachability, but filtering less would
result in too many customer complaints due to instability.

anyone who wants the point of equilibrium to move in the direction of "more
routes" should be attacking the economics which give rise to the problem
rather than attacking the engineering solutions which are the best current
known answer to the problem.  in other words go tell cisco/juniper/whomever
your cool idea for a new routing protocol / route processing engine / cheap
OC768-capable backplane and maybe they'll hire you to build it for them.