North American Network Operators Group


Re: Scalability issues in the Internet routing system

  • From: Rubens Kuhl Jr.
  • Date: Wed Oct 26 00:22:56 2005

Assume you have determined that some percentage (20%, 80%, whatever) of
the routing table is really used during a fixed time period. If you
design a forwarding system that sustains its rated packets per second
only for those most-used routes, all it takes to DDoS it is a zombie
network sending packets to all the other destinations. Rate-limiting
and dampening would probably come into play, and a new arms race would
start, killing operators' ability to quickly renumber sites or entire
networks and creating new troubleshooting issues for network operators.
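
To make the failure mode above concrete, here is a minimal simulation
sketch. Everything in it is an assumption for illustration: the LRU
eviction policy, the table and cache sizes, the uniform traffic model,
and the function name miss_rate. It does not model any real router.

```python
import random
from collections import OrderedDict

def miss_rate(total_prefixes, cache_size, attack_fraction,
              packets=200_000, seed=1):
    """Fraction of packets that miss a hypothetical LRU route cache.

    Assumed traffic model: legitimate packets go to the `cache_size`
    most-used prefixes; attack packets go to a prefix picked uniformly
    from the whole table.
    """
    rng = random.Random(seed)
    cache = OrderedDict()              # prefix id -> None, ordered by recency
    misses = 0
    for _ in range(packets):
        if rng.random() < attack_fraction:
            dest = rng.randrange(total_prefixes)   # zombie traffic
        else:
            dest = rng.randrange(cache_size)       # normal "hot" traffic
        if dest in cache:
            cache.move_to_end(dest)                # refresh recency
        else:
            misses += 1                            # slow path: full lookup
            cache[dest] = None
            if len(cache) > cache_size:
                cache.popitem(last=False)          # evict least recently used
    return misses / packets

if __name__ == "__main__":
    # Cache sized for 20% of an assumed 10,000-prefix table.
    print("no attack :", miss_rate(10_000, 2_000, 0.0))
    print("50% attack:", miss_rate(10_000, 2_000, 0.5))
```

With no attack traffic the only misses are the initial cache fills.
Once a large share of packets targets the rest of the table, most of
those packets take the slow path and also evict the entries that
legitimate traffic depends on.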

Isn't it simpler just to forward at line rate? IP lookups are fast
nowadays, thanks to algorithmic and architectural improvements; even
packet classification (the n-tuple version of the IP lookup
problem) is not that hard anymore. Algorithms can be updated on
software-based routers, and performance gains far exceed Moore's Law
and projected prefix growth rates. Routers that cannot cope with
that can always be changed to carry IGP-only routes plus a default
route pointing at a router that can keep up with full routing.
(Actually, hardware-based routers built on limited-size CAMs are more
vulnerable to obsolescence through routing table growth than software ones.)
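
For readers who want to see what a software IP lookup involves, below
is a minimal sketch of longest-prefix match on a one-bit-per-level
binary trie, IPv4 only. It is the textbook baseline, not one of the
faster algorithms alluded to above; the class name, the prefixes, and
the next-hop labels are made up for illustration.

```python
import ipaddress

class BinaryTrie:
    """Unibit trie for IPv4 longest-prefix match (illustrative only)."""

    def __init__(self):
        # Each node is [child_for_bit_0, child_for_bit_1, next_hop].
        self.root = [None, None, None]

    def insert(self, prefix, next_hop):
        net = ipaddress.ip_network(prefix)
        bits = int(net.network_address)
        node = self.root
        for i in range(net.prefixlen):
            bit = (bits >> (31 - i)) & 1       # walk from the top bit down
            if node[bit] is None:
                node[bit] = [None, None, None]
            node = node[bit]
        node[2] = next_hop

    def lookup(self, address):
        bits = int(ipaddress.ip_address(address))
        node = self.root
        best = node[2]                         # default route, if any
        for i in range(32):
            node = node[(bits >> (31 - i)) & 1]
            if node is None:
                break
            if node[2] is not None:
                best = node[2]                 # remember the longest match so far
        return best

fib = BinaryTrie()
fib.insert("0.0.0.0/0", "default")
fib.insert("10.0.0.0/8", "hop-A")
fib.insert("10.1.0.0/16", "hop-B")
print(fib.lookup("10.1.2.3"))   # hop-B (the /16 is the longest match)
print(fib.lookup("10.2.3.4"))   # hop-A
print(fib.lookup("192.0.2.1"))  # default
```

Each lookup costs at most 32 node visits regardless of table size.
Production software routers typically use multibit or compressed
variants to cut the number of memory accesses.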

Let's celebrate the death of "ip route-cache", not resurrect its fragility.


On 10/24/05, Alexei Roudnev <[email protected]> wrote:
> One question: what percentage of the routing table of any particular router is
> REALLY used during, say, 1 week?
> I have a strong impression that the answer will not be more than 20% even in
> the biggest backbones, and
> will (more likely) be below 1% in the rest of the world. That leaves huge
> room for optimization.
> ----- Original Message -----
> From: "Daniel Senie" <[email protected]>
> To: <[email protected]>
> Sent: Tuesday, October 18, 2005 9:50 AM
> Subject: Re: Scalability issues in the Internet routing system
> >
> > At 11:30 AM 10/18/2005, Andre Oppermann wrote:
> >
> > >I guess it's time to have a look at the actual scalability issues we
> > >face in the Internet routing system.  Maybe the area of action becomes
> > >a bit more clear with such an assessment.
> > >
> > >In the current Internet routing system we face two distinct
> > >scalability issues:
> > >
> > >1. The number of prefixes*paths in the routing table and interdomain
> > >    routing system (BGP)
> > >
> > >This problem scales with the number of prefixes and available paths
> > >to a particular router/network in addition to constant churn in the
> > >reachability state.  The required capacity for a router's control
> > >plane is:
> > >
> > >  capacity = prefix * path * churnfactor / second
> > >
> > >I think it is safe, even with projected AS and IP uptake, to assume
> > >Moore's law can cope with this.
> >
> > Moore will keep up reasonably with both the CPU needed to keep BGP
> > perking, and with memory requirements for the RIB, as well as other
> > non-data-path functions of routers.
> >
> >
> >
> > >2. The number of longest match prefixes in the forwarding table
> > >
> > >This problem scales with the number of prefixes and the number of
> > >packets per second the router has to process under full or expected
> > >load.  The required capacity for a router's forwarding plane is:
> > >
> > >  capacity = prefixes * packets / second
> > >
> > >This one is much harder to cope with as the number of prefixes and
> > >the link speeds are rising.  Thus the problem is multiplicative to
> > >quadratic.
> > >
> > >Here I think Moore's law doesn't cope with the increase in projected
> > >growth in longest prefix match prefixes and link speed.  Doing longest
> > >prefix matches in hardware is relatively complex.  Even more so for
> > >the additional bits in IPv6.  Doing perfect matches in hardware is
> > >much easier though...
> >
> > Several items regarding FIB lookup:
> >
> > 1) The design of the FIB need not be the same as the RIB. There is
> > plenty of room for creativity in router design in this space.
> > Specifically, the FIB could be dramatically reduced in size via
> > aggregation. The number of egress points (real or virtual) and/or
> > policies within a router is likely FAR smaller than the total number
> > of routes. It's unclear if any significant effort has been put into this.
> >
> > 2) Nothing says the design of the FIB lookup hardware has to be
> > longest match. Other designs are quite possible. Again, some
> > creativity in design could go a long way. The end result must match
> > that which would be provided by longest-match lookup, but that
> > doesn't mean the ASIC/FPGA or general purpose CPUs on the line card
> > actually have to implement the mechanism in that fashion.
> >
> > 3) Don't discount novel uses of commodity components. There are fast
> > CPU chips available today that may be appropriate to embed on line
> > cards with a bit of firmware, and may be a lot more cost effective
> > and sufficiently fast compared to custom ASICs of a few years ago.
> > The definition of what's hardware and what's software on line cards
> > need not be entirely defined by whether the design is executed
> > entirely by a hardware engineer or a software engineer.
> >
> > Finally, don't discount the value and performance of software-based
> > routers. MPLS was first "sold" as a way to deal with core routers not
> > handling Gigabit links. The idea was to get the edge routers to take
> > over. Present CPU technology, especially with good embedded systems
> > software design, is quite capable of performing the functions needed
> > for edge routers in many circumstances. It may well make sense to
> > consider a mix of router types based on port count and speed at edges
> > and/or chassis routers with line cards that are using general purpose
> > CPUs for forwarding engines instead of ASICs for lower-volume sites.
> > If we actually wind up with the core of most backbones running MPLS
> > after all, well, we've got the technology so use it. Inter-AS routers
> > > for backbones will likely need to continue to be large, power-hungry
> > boxes so that policy can be separately applied on the borders.
> >
> > I should point out that none of this really is about scalability of
> > the routing system of the Internet, it's all about hardware and
> > software design to allow the present system to scale. Looking at
> > completely different and more scalable routing would require finding
> > a better way to do things than the present BGP approach.
> >
> >