North American Network Operators Group

Re: multi homing pressure

  • From: Elmar K. Bins
  • Date: Fri Oct 21 03:55:58 2005

Re Owen,

Just a short (ok, now I read it again, it's grown...) answer to
the list, but you're right, we might continue this in private.
(Reply-To set)

Thanks for being so patient explaining everything, and for
discussing with a (still somewhat) hairy-head like myself :-)


Owen DeLong wrote:

> >You're only talking v6? Why? Anyway, let's follow this through...
> >
> Because we don't really need to solve this in V4.  V4 multihoming is
> well understood and is unlikely to hit a scaling limit on router
> capabilities before we hit an end of life on address space.
...
> Again... Multihoming already works in V4 and there is no real need
> to solve this in the V4 world.

I expect strongly rising demand from end customers to multihome
right now, and we still have a bunch of /24s left. But then,
it may only add another 300K prefixes to the BGP table, which is not
really an order of magnitude.

As to the "it works" - surely it does, but up to now I believed
it wouldn't scale far enough. Maybe I'm wrong (see Moore).


> You only need one RLI in the packet header.  More would actually be
> bad.  Let me 'splain.  If you are routing on RLI, then, you need
> to choose the best path and stick to it.  If the packet doesn't
> make it through that way, that's OK... That's what retransmits are
> for.  If you start rerouting it on the fly, it's likely to loop
> a lot before dying, but, little else is achieved.  Worse, it's
> likely to loop even if it might have gotten there given one path
> and only one path chosen as best by the RLI inserting router.

Actually, I don't understand the last part; why should it loop in
this case? It's a matter of destination(s) look-up on the "core"
routers, just like in your model. Only there is potentially more
than one destination.

It would of course loop anyway if it entered (the same part of) the
same transit AS again, but that is independent of whether you see
the ESI or not (aka RLI insertion vs. encapsulation).

I'm still not comfortable with the box in São Paulo determining
whether the packet should go to ISP A in Hamburg or ISP B in Munich
or ISP C in Frankfurt (from where the respective ISP would forward
it to the customer in Cologne). This decision can easily be made
later on and result in a "better" path.


> No, it is not.  Since the RLI inserting router has up to date dynamic
> information about which RLIs are reachable and at what cost (BGP

The inserting router is less likely to have up-to-date RLI topology
information than routers closer to the packet's destination, due to
the way the topology information gets distributed.


> No.  You have nearly the same advantage you have today.  If
> the path goes away, then, hopefully by the time of retransmit,
> the RLI inserting router will have learned that that RLI destination
> is no longer reachable, and, he will insert a different one in
> the retransmitted packet.  Same as what happens today with the
> retransmitted packet being sent a different way.

I don't like "hopefully" here, but maybe that's our trade-off
anyway. You are, nonetheless, giving the "RLI inserting router"
somewhat "hotter" information, if it has to make the topological
choice (choose destination RLI and, implicitly, select a group
of possible paths over all others). If it were only to know the
translation information which does not change as often, I'd be
much happier.

What I also do not like is the wrong analogy to today's routing
mechanism. You implicitly claim that the RLI inserting router's
new decision is the same as what happens in the Internet
routing system today: rerouting packets. In other words, you're
making a global choice locally. But the current system does not
reroute (only) at the packet source; it can do so at any hop
between source and destination and thus makes only local choices
locally.

This is a significant difference, because it makes adaptation
to changes easier, faster, and it works with only partial
convergence along the path.
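
To make the difference concrete, here is a toy sketch in Python. It is
not a real routing protocol; the topology, the router names and the
walk() helper are all made up for illustration. It assumes two exits
(E and G) toward a target T, a failed link E-T, and a state of partial
convergence in which only D has learned about the failure.

```python
# Toy model, not a real routing protocol: all names and the topology are
# made up. Routers B-C-D lead to two exits, E and G, which both reach T.
LINKS_UP = {("B", "C"), ("C", "D"), ("D", "E"), ("D", "G"), ("G", "T")}
# The link E-T has just failed, so ("E", "T") is missing above.

def walk(next_hop, src, dst, max_hops=16):
    """Follow next-hop entries from src; return (path, delivered)."""
    node, path = src, [src]
    while node != dst and len(path) <= max_hops:
        nxt = next_hop.get(node)
        if nxt is None or (node, nxt) not in LINKS_UP:
            return path, False  # no route or dead link: packet is dropped
        node = nxt
        path.append(node)
    return path, node == dst

# Model 1: B picks the exit once, using stale information (it still
# believes E is best), and every later router just forwards toward E.
pinned_to_e = {"B": "C", "C": "D", "D": "E", "E": "T"}

# Model 2: every router decides for itself. D is close to the failure
# and has already switched to G; B and C did not need to change at all.
hop_by_hop = {"B": "C", "C": "D", "D": "G", "G": "T"}

print(walk(pinned_to_e, "B", "T"))  # (['B', 'C', 'D', 'E'], False)
print(walk(hop_by_hop, "B", "T"))   # (['B', 'C', 'D', 'G', 'T'], True)
```

In the pinned model the packet is lost until B itself learns of the
failure and picks G for the retransmit; in the hop-by-hop model it is
enough that D has converged.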


> >Who exactly chooses? IMHO it's AS B that does the selection.
> >And: B is closer to the target, aka the source of the routing
> >information. Its BGP table is more likely to be up-to-date.
> >
> Right... B is the first DFZ router.  A is not likely DFZ since A
> is not multihomed in your scenario.  No need for A to be DFZ if
> A only talks to B.

Yesyesyes, consider

A B C D E F T
A B C D G H T

What now? Is "D" necessarily the first DFZ router? I think not.
So you are still using B for the RLI insertion; B has to make
the choice, and that choice may be wrong or sub-optimal.
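
A small Python sketch of the two paths above (the function name is
made up for illustration): it finds the last hop both paths share,
i.e. the latest point at which the choice between the E-F branch and
the G-H branch could still be made.

```python
def last_common_hop(path_a, path_b):
    """Return the last hop shared by both paths before they diverge."""
    common = None
    for hop_a, hop_b in zip(path_a, path_b):
        if hop_a != hop_b:
            break
        common = hop_a
    return common

path_1 = "A B C D E F T".split()
path_2 = "A B C D G H T".split()

print(last_common_hop(path_1, path_2))  # D
```

The paths only diverge after D, so a choice made at B is made three
hops earlier than the topology requires.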


> Z's ESI is visible in the core, but, not carried in the routing
> table.  Z does not have an RLI, but, instead uses the RLIs of
> their provider(s).

Yup, in your "add something to the header" scenario, the ESI is
still visible. In mine it is not (it is, but encapsulated).
Actually, it does not matter, as long as the destination can
recover this information ("destination" as in "the re-translating
router").


> In the long run (once this is ubiquitous on core routers), the
> global prefix-based table can be abandoned freeing router memory.
> Hopefully that would occur before the global table and this table
> grew to require significant hardware upgrades, and, would make
> significant room for caching ESI->RLI lookups.

Moving the intelligence out of the core. Well, yes, that's an
advantage for the migration phase (which could take decades).


> No, you don't have to distribute it.  You _CAN_ provide it for
> lookup instead.

How do I get there? Bootstrapping? 2.1.20? That's not moot at all.


> >Then I do not understand why you want the DFZ routers to be able
> >to translate.
> >
> I don't know what you mean by translate.

Translate ESI to RLI and insert that into the packet header.
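
As a rough sketch of what I mean by "translate" (Python; the Packet
type, the table layout, the cost values and the insert_rli() name are
all assumptions for illustration, and the RLIs are AS numbers from the
documentation range): look up the candidate RLIs for the packet's
destination ESI, keep the ones currently believed reachable, and write
the cheapest one into the header.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Packet:
    dst_esi: str               # end-site identifier of the destination
    rli: Optional[int] = None  # routing locator, filled in by the router

# Hypothetical mapping: ESI -> list of (RLI, cost) pairs.
ESI_TO_RLI = {"esi-z": [(64501, 20), (64502, 10)]}

def insert_rli(packet, table, reachable):
    """Pick the cheapest reachable RLI for the packet's ESI.

    Returns True if an RLI was inserted, False if none is usable.
    """
    candidates = [
        (cost, rli)
        for rli, cost in table.get(packet.dst_esi, [])
        if rli in reachable
    ]
    if not candidates:
        return False
    packet.rli = min(candidates)[1]
    return True

first = Packet("esi-z")
insert_rli(first, ESI_TO_RLI, reachable={64501, 64502})
print(first.rli)  # 64502

# Retransmit after the router has learned that 64502 is unreachable.
retry = Packet("esi-z")
insert_rli(retry, ESI_TO_RLI, reachable={64501})
print(retry.rli)  # 64501
```

The second call corresponds to the retransmit case: the same ESI gets
a different RLI once the inserting router's view has changed.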


> Well... Since we already have RIRs, I don't see a reason that the
> top level of the hierarchy for this information couldn't be managed
> as ANYCAST servers at well known addresses run by the RIRs and/or
> IANA.  All space originates from there anyway, so, it is a natural
> point of hierarchy.  In essence, the router will learn the path
> to the Root and Top Level RLIs which will be fixed ASNs assigned
> as part of this protocol deployment.  Only the root is truly
> necessary.

Special ASNs/RLIs, reserved for this? What about extensibility there?
And actually, that's not bootstrapping the system, because if you're
in the DFZ, you need a specific path to go there, and you have to
get it from somewhere. So either there's a hole in the idea or I'm
too dumb to understand. Or do you mean, every RLI hosts such an
anycasted server, in order for their routers to be able to reach it?

Let's not forget that the anycasted RIR routing topology servers
also need to get updated somehow...

Btw, yes, I like the idea of RIRs and RAs taking control over
Internet routing (it's only logical, and it's necessary albeit
currently impossible); I would propose the same. Not everybody
may like that, though.


> The source of the packet does not determine it.  The first DFZ router
> (often many routers removed from the source) determines its best
> path.  Just like today when the first DFZ router makes a choice,
> e.g. between forwarding to 701 or 3561 to get to 10565.
> Once the packet is handed off to 701, it's not going to come back
> and go via 3561 in most cases.  If 701 loses its connection
> downstream towards 10565, it will likely drop the packet.

BGP has always been based on the idea of contiguous ASes. That's
why 3561 will refuse to accept the packet. That avoids loops,
but it also makes life harder in other respects.

I think your "first DFZ router" is quite close to the packet source
in most cases. It usually is the customer's own router (if they
participate in DFZ routing, like we do), or it's one of the upstreams'
edge routers. At least it will be, once prefix-based routing has
disappeared from large parts of the DFZ.

Cheers,
	Elmar.

--

"Just don't make the mistake of substituting expertise for opinion."
                          (PLemken)

--------------------------------------------------------------[ ELMI-RIPE ]---