Re: size of the routing table is a big deal, especially in IPv6

North American Network Operators Group

From: Daniel Senie
Date: Mon Nov 29 19:05:21 2004

At 06:33 PM 11/29/2004, Iljitsch van Beijnum wrote:

On 28-nov-04, at 5:20, Daniel Roesen wrote:
I find it interesting that no operators are screaming that there will be
too many routes, but that all the IPv6 researchers are bringing forth
this view.
ACK. All the "oh our IPv4 DFZ table explodes today" is similarly
unfounded as far as I'm aware. I have not heard of anybody being
able to crystal-ball the scaling limits of BGP4 yet, and currently
used BGP implementations seem to cope quite well with 150k routes
(set aside the notorious vendor C artificial RAM limits in older gear
to make you buy new gear when table gets bigger).
Ok, I'll do this one more time.

There are basically two issues: the forwarding table and BGP processing. Information in the forwarding table needs to be found *really* fast. Fortunately, it's possible to create data structures where this is possible, for all intents and purposes, regardless of the size of the table. However, memory is a concern here, as you only have a few hundred nanoseconds to look up something in the routing table at 10 Gbps speeds.
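For a rough sense of the lookup budget mentioned above, here is a minimal sketch of the arithmetic. The packet sizes and the 20 bytes of per-packet overhead (Ethernet preamble plus inter-frame gap) are assumptions chosen for illustration, not figures from this thread.

```python
# Per-packet time budget on a saturated link (illustrative arithmetic only).

def lookup_budget_ns(link_bps: float, packet_bytes: int, overhead_bytes: int = 0) -> float:
    """Nanoseconds between back-to-back packets of the given size."""
    bits_on_wire = (packet_bytes + overhead_bytes) * 8
    return bits_on_wire / link_bps * 1e9


LINK_BPS = 10e9  # 10 Gbps, as in the text

if __name__ == "__main__":
    # Assumed packet sizes; 20 bytes of overhead models preamble + inter-frame gap.
    for size in (64, 300, 500, 1500):
        ns = lookup_budget_ns(LINK_BPS, size, overhead_bytes=20)
        print(f"{size:>5}-byte packets: {ns:7.1f} ns per lookup")
```

Under these assumptions, mid-sized packets of 300 to 500 bytes leave a few hundred nanoseconds per lookup, while a stream of minimum-size 64-byte frames leaves only about 67 ns.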

This is a solvable problem. Hardware lookups are quite sufficient. Forwarding bases stored in line cards can be aggregated to the extent the data permits. Any router with 10GigE interfaces that's going to care about actually filling such pipes will have advanced hardware forwarding technology and a price tag to support the development of same.

When the forwarding table gets too large and the packet rates too high, you may run into memory bandwidth problems and/or have to use much more expensive memory. On any line card, but especially on a fast one, a bigger fdb simply costs more money.

Right. And anyone on the edge just needs enough memory to hold the table in their software-based routers that have little or no lookup assistance.

For the BGP routing information base this isn't much of a problem, as you can use much cheaper and slower memory. Unfortunately, there is also the processing. Because of stuff like the longest match first rule and the presence of multiple BGP routes towards the same destination, it's much harder to use very efficient data structures for this. And to add insult to injury, the contents of the BGP table change all the time. Now this appears to be a linear problem, but it isn't: when the routing table gets twice as big, generally this means twice as many updates (probably more, as deaggregated routes tend to flap more) but you also need to search through twice as many routes in the routing table to process each update. So the work doesn't increase as O(n) but as either O(n*n) or O(n*log(n)).
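The scaling argument above can be sketched numerically. This is a toy model, not a measurement of any BGP implementation: it assumes the update count grows linearly with table size and that each update costs either a linear scan or a logarithmic search. The function name and the table sizes beyond 150k are illustrative.

```python
import math


def relative_work(n_routes: int, base_routes: int, per_update_cost) -> float:
    """Total update work relative to the base table size.

    Assumes the number of updates grows linearly with the table, and that
    per_update_cost(n) gives the cost of processing one update against n routes.
    """
    updates_ratio = n_routes / base_routes
    cost_ratio = per_update_cost(n_routes) / per_update_cost(base_routes)
    return updates_ratio * cost_ratio


BASE = 150_000  # table size mentioned in the thread

if __name__ == "__main__":
    for n in (150_000, 300_000, 600_000):
        scan = relative_work(n, BASE, lambda k: k)             # O(n) per update -> O(n*n)
        tree = relative_work(n, BASE, lambda k: math.log2(k))  # O(log n) per update -> O(n*log(n))
        print(f"{n:>7} routes: linear scan x{scan:.2f}, tree search x{tree:.2f}")
```

In this model, doubling the table quadruples the work with a linear scan but raises it only by a factor of about 2.1 with a logarithmic search, so the choice of data structure largely decides which of the two growth rates applies.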

Even 10 years ago it was evident the routing table structures chosen by different manufacturers had significantly different performance characteristics. As there is no single data structure to define the storage of this information, it may follow that there is no singular formula for the impact of scaling.

Now all of this doesn't mean we can't have any growth in the global routing table, but it does mean that such growth must be considerably below the Moore's Law rate (a factor 2 in 18 months or about a factor 10 in 5 years). Over the past few years the routing table growth has been very modest, but it looks like it's picking up speed again. This isn't good, although we're certainly not at dangerous levels yet.
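The two Moore's Law figures quoted above (a factor of 2 in 18 months, about a factor of 10 in 5 years) can be checked against each other with a short calculation. The 18-month doubling period comes from the text; the function name is illustrative.

```python
def growth_factor(months: float, doubling_months: float = 18.0) -> float:
    """Compound growth over `months` when capacity doubles every `doubling_months`."""
    return 2.0 ** (months / doubling_months)


if __name__ == "__main__":
    five_years = growth_factor(60)
    per_year = growth_factor(12)
    print(f"5-year factor: {five_years:.2f}")
    print(f"annual growth: {(per_year - 1) * 100:.0f}%")
```

The five-year factor comes out just above 10, which matches the figure in the text and corresponds to annual growth of a little under 60%.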

Over the past several years, the CPUs in routers have been considerably below the speediest on the market. I suspect there's a fair bit of headroom at present between the route processing engines in core routers and the fastest CPUs presently offered for sale. As such, I have to wonder just how much growth we could handle instantaneously, and still stay within the CPU capabilities of today's available processors. Also consider that CPU power is far from the only issue. Higher speed memory continues to be developed along with higher speed bus architectures. System performance is made up of many factors.

8 years too late guys.  We've figured out table management.
ACK, looks like that.
Yes, it's surprising how effective hoping for the best can be sometimes.
And even if all active ASes would immediately adopt IPv6, we would
land at about 18k IPv6 routes. "big deal".
I have a slightly bigger deal for you. Unfortunately, I can't find the current number right now, but the number of individual /24s in the BGP table was always something like half the table when I looked. Now for an ISP, a /24 is small change, so it's likely that most of those /24s are real or de facto PI blocks that are often announced under the AS of the ISP of the week rather than under the AS of the holder of the block. If you take this number you're at around 50k. I'm not sure about how this works out in actual implementations, but it's likely a 50k to 75k IPv6 table takes the same amount of memory as a 150k IPv4 table.

Deaggregating the entire IPv4 space into /24's is today the worst case design for the RIB of a router. Designing a router to handle that case is not beyond today's technology.
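To put a number on that worst case, here is a small sketch that counts the /24s in the whole 32-bit IPv4 space. It is an upper bound only: it makes no allowance for reserved, private, or multicast ranges, which would never all appear in a real table. The 150k comparison figure is the table size mentioned earlier in the thread.

```python
import ipaddress

# The whole IPv4 address space as a single network object.
full_space = ipaddress.ip_network("0.0.0.0/0")

# Number of /24 subnets, computed from prefix lengths instead of enumerating them.
worst_case_routes = 2 ** (24 - full_space.prefixlen)

CURRENT_TABLE = 150_000  # approximate table size cited in the thread

if __name__ == "__main__":
    print(f"/24s in IPv4 space: {worst_case_routes:,}")
    print(f"ratio to a 150k table: about {worst_case_routes / CURRENT_TABLE:.0f}x")
```

That is roughly 16.8 million routes, a little over 100 times the 150k table discussed above.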

Next step. In IPv4, there is downward pressure on multihoming because you can't get a route advertised that's longer than a /24. And yes, even a /24 is somewhat hard to get for most people. In IPv6, _everyone_ can get a /48. So if we allow /48 PI blocks in the routing table, how do we make sure we only allow "legitimate" PI users and not ISPs deaggregating a /32 into 64k /48s or people announcing PA /48s?
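The "64k /48s" figure follows directly from the prefix lengths, as this sketch shows. The prefix 2001:db8::/32 is the IPv6 documentation range, used here only as a stand-in for an ISP allocation; the helper name is illustrative.

```python
import ipaddress
from itertools import islice


def subnet_count(block: str, new_prefix: int) -> int:
    """How many subnets of length new_prefix fit inside block."""
    net = ipaddress.ip_network(block)
    if new_prefix < net.prefixlen:
        raise ValueError("new_prefix must not be shorter than the block's prefix")
    return 2 ** (new_prefix - net.prefixlen)


ISP_BLOCK = "2001:db8::/32"  # documentation prefix, standing in for a real /32

if __name__ == "__main__":
    print(subnet_count(ISP_BLOCK, 48))
    # Show the first few /48s without materialising all of them.
    net = ipaddress.ip_network(ISP_BLOCK)
    for sub in islice(net.subnets(new_prefix=48), 3):
        print(sub)
```

A single fully deaggregated /32 would therefore add 65,536 routes, more than three times the roughly 18k-route IPv6 table estimated earlier in the thread.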

This deal is getting bigger by the minute.

Look out above! The sky is falling.

In IPv4 it took a while before we managed to get it right, resulting in the 192.x.x.x swamp and lots of address space and AS numbers that are as good as unreclaimable. And this was all before 1993, before pretty much anyone had even heard of the internet. If we get it wrong to the same degree in IPv6 it will be much worse because the potential influx of new IPv6 users in a week is larger than the influx of new IPv4 users in any year before 1993. (For instance, if there is a land rush on AS numbers because they are a free ticket towards an IPv6 PI prefix.)

Now I'm not saying that all kinds of bad things are going to happen.

Really? You've set the stage to say exactly that. At least that's how it read to me.

I'm just saying we should be very conservative in allowing irreversible changes in unscalable aspects of IPv6.

I'd sure like to see a lot more thorough analysis than what you provided above before reaching that conclusion. History has certainly not sided with you. Back in the mid-1990s, we were told routers wouldn't scale, so we needed MPLS. While MPLS has found useful roles in the network, it wasn't needed as a replacement for IPv4 routing in the core. Several companies, including some startups, figured out ways to route packets quite quickly.

In the long run, I'd rather provide the ability to offer the services needed. This permits the companies looking for those services to flourish and help the economies of the world. While there are challenges to be addressed, I believe those challenges will be well met by the equipment marketplace, and that innovation also will help the economies of the world. Artificial restraint does not result in expanded services or product innovations. If I had a way to vote on this, I'd vote to let the markets work.

