North American Network Operators Group

Re: Scalability issues in the Internet routing system

  • From: Andre Oppermann
  • Date: Wed Oct 19 07:13:22 2005

Tony Li wrote:

 capacity = prefix * path * churnfactor / second

 capacity = prefixes * packets / second

I think it is safe, even with projected AS and IP uptake, to assume
Moore's law can cope with this.

This one is much harder to cope with as the number of prefixes and
the link speeds are rising.  Thus the problem is multiplicative to

You'll note that the number of prefixes is key to both of your equations.
If the number of prefixes exceeds Moore's law, then it will be very
difficult to get either of your equations to remain under Moore's law on
the left hand side.

That's the whole point of the discussion.

Let me rephrase my statement so we aren't talking past each other.

The control plane (BGP) scales pretty much linearly (as Richard has observed
too) with the number of prefixes.  It is unlikely that growth in prefixes
and prefix churn will exceed the increase in readily available control
plane CPU power.  For example, a little VIA C3-800MHz can easily handle 10
current full feeds running OpenBGPD (for which I designed the internal data
structures).  Guess what a $500 AMD Opteron or Intel P4 can handle.
In addition, BGP lends itself relatively well to scaling on SMP, so upcoming
dual- or multicore CPUs help to at least keep pace with prefix growth.
Conclusion: There is not much risk on the control plane running BGP, even with
high prefix growth.

On the other hand, the forwarding plane doesn't have the same scaling properties.
It faces not one but two rising factors: the number of prefixes (after cooking)
and the number of lookups per second (equal to pps) as defined by link speed.
Here a 10-fold increase in prefixes and a 10-fold increase in lookups/second
may well exceed the advances in chip design and manufacturing capabilities.
A 10-fold increase in prefixes means you have to search 10 times as many prefixes
(however optimized that is), and a 10-fold increase in link speed means you have
only 1/10 of the time for each search that you had before.  There are many
optimizations conceivable to address each of these.  Some scale better in terms
of price/performance, others don't.
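
To put rough numbers on the shrinking time budget, here is a minimal sketch of the arithmetic. It assumes Ethernet links and the worst case of back-to-back minimum-size frames (64 bytes plus 20 bytes of preamble and inter-frame gap on the wire). The link speeds and function names are illustrative choices, not figures from this thread.

```python
# Back-of-the-envelope lookup budget per packet on an Ethernet link.
# Assumption: worst case of minimum-size 64-byte frames, plus 20 bytes of
# preamble and inter-frame gap on the wire (84 bytes per packet).

WIRE_BYTES_PER_MIN_FRAME = 64 + 20


def packets_per_second(link_bps: float) -> float:
    """Worst-case packet rate for a link of the given speed in bit/s."""
    return link_bps / (WIRE_BYTES_PER_MIN_FRAME * 8)


def lookup_budget_ns(link_bps: float) -> float:
    """Time available for one forwarding lookup, in nanoseconds."""
    return 1e9 / packets_per_second(link_bps)


if __name__ == "__main__":
    # Illustrative link speeds only.
    for gbps in (1, 10, 100):
        bps = gbps * 1e9
        print(f"{gbps:>3} Gbit/s: {packets_per_second(bps) / 1e6:7.2f} Mpps, "
              f"{lookup_budget_ns(bps):6.2f} ns per lookup")
```

Under these assumptions, every 10-fold step in link speed cuts the per-lookup budget by the same factor, independent of how many prefixes have to be searched within that budget.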

My last remark in the original mail meant that longest-match lookup scales
less well in hardware than perfect (exact) matching.  My
personal conclusion is that we may have to move DFZ routing to some
sort of fixed-size (32-bit, for example) identifier on which the forwarding
plane can do perfect matching.  This is not unlike the rationale behind
MPLS.  However, here we need something that works administratively and
politically inter-AS, the way prefix+BGP does today.  Maybe the new 32-bit AS
number could serve as such a perfect-match routing identifier.  That'd make up
to 4 billion possible entries in the DFZ routing system, or about 16k at today's
size of the DFZ.  One AS == one routing policy.
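
The difference between the two lookup styles can be sketched in software. This is only an illustration under stated assumptions: real forwarding hardware typically uses TCAMs or trie structures, not the per-length hash probing shown here, and the tables, AS numbers, and next-hop names below are made up.

```python
import ipaddress

# Hypothetical toy tables; prefixes, AS numbers and next hops are invented.
PREFIX_FIB = {
    ipaddress.ip_network("10.0.0.0/8"): "hop-A",
    ipaddress.ip_network("10.1.0.0/16"): "hop-B",
    ipaddress.ip_network("10.1.2.0/24"): "hop-C",
}
AS_FIB = {64512: "hop-A", 64513: "hop-B"}


def longest_match(fib, address):
    """Longest-prefix match: probe one candidate per prefix length,
    longest first, so a single lookup can cost up to 33 probes (IPv4)."""
    addr = int(ipaddress.IPv4Address(address))
    for length in range(32, -1, -1):
        mask = (0xFFFFFFFF << (32 - length)) & 0xFFFFFFFF
        candidate = ipaddress.IPv4Network((addr & mask, length))
        if candidate in fib:
            return fib[candidate]
    return None


def exact_match(fib, identifier):
    """Perfect match on a fixed-size identifier: a single probe."""
    return fib.get(identifier)


if __name__ == "__main__":
    print(longest_match(PREFIX_FIB, "10.1.2.3"))    # most specific is the /24
    print(longest_match(PREFIX_FIB, "10.200.0.1"))  # only the /8 covers this
    print(exact_match(AS_FIB, 64513))
```

The exact-match table needs one probe per packet no matter how many entries it holds, while the longest-match lookup has to consider several candidate lengths per packet. That is the scaling gap described above, shown here in software terms only.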