North American Network Operators Group

Date Prev | Date Next | Date Index | Thread Index | Author Index | Historical

Re: shim6 @ NANOG (forwarded note from John Payne)

  • From: Joe Abley
  • Date: Wed Mar 01 10:09:17 2006

On 1-Mar-2006, at 02:56, Kevin Day wrote:

On Mar 1, 2006, at 12:47 AM, Joe Abley wrote:

  o a small to medium multi-homed tier-n isp
A small-to-medium, multi-homed, tier-n ISP can get PI space from their RIR, and don't need to worry about shim6 at all. Ditto larger ISPs, up to and including the largest.
If you include "Web hosting company" in your definition of ISP, that's not true.
Right. I wasn't; I listed them separately.

It's important to note that even if you are a hosting company who *does* qualify for PI v6 space, you still need shim6-capable servers, if you want to make them optimally available to multi-homed, shim6- capable hosts. The difference PI makes is in the distribution of addresses to servers (the servers only need a single set).

You don't get PI space, and Shim6 is looking like your only alternative for multihoming.
Right. For a hosting company with multiple PA netblocks, shim6 is the option on the table.

Many content providers set up multiple non-interconnected POPs in different geographical locations. The only way this can be accomplished is by making separate announcements in each POP for each space. This means either being able to deaggregate, or to get a block for each POP. I don't know of *ANY* that are deploying 5000 + servers per POP.
Right. With shim6, getting a block per POP is trivial, since they are all PA assignments from transit providers.

I'm just one guy, one ASN, and one content/hosting network. But I can tell you that to switch to using shim6 instead of BGP speaking would be a complete overhaul of how we do things.
You are not alone in fearing change.

Putting routing decisions in the control of servers we don't operate scares me. I wouldn't rely on 90% of our customers to get this right unless it was completely idiot proof. Even if it was, I don't see how we can trust that users aren't messing with things to "game the system" somehow.
This is the kind of feedback that the shim6 architects need. There is talk at present of whether the protocol needs to be able to accommodate a site-policy middlebox function to enforce site policy in the event that host behaviour needs to be controlled. The scope of that policy mediation function depends strongly on people like you saying "at a high level, this is the kind of decision I am not happy with the hosts making".

We deal with long lived TCP sessions (hours/days). I don't see how routing updates can happen that won't result in a disconnect/ reconnect, which isn't acceptable.
One of the primary objectives of shim6 is to provide session survivability over re-homing events. Since routing protocols are not used to manage re-homing, the speed at which a session can recover from a topological event depends on the operation of the shim6 protocol between client and server.

It seems reasonable to say that in some cases shim6 re-homing transitions will be faster than the equivalent routing transition in v4; in other cases it will be shorter. Depends on the network, and how enthusiastically you flap, perhaps.

The experience of people who provide services involving long-held TCP sessions is exactly the kind of thing that the shim6 architects need to hear about.

We have peering arrangements with about 120 ASNs. How do we mix BGP IPv6 peering and Shim6 for transit?
You advertise all your PA netblocks to all your peers.

So far it looks like Shim6 is going to rely on DNS. The DNS caching issue is a real problem. We need changes to happen faster than DNS caching will allow.
Well, not quite.

If you change a transit provider, then you need to remove a set of AAAA records from the servers you operate, and substitute a new set. The time taken for this change to propagate in the DNS is non-zero, assuming you use reasonable TTLs. This is your point above, I think.

With shim6-capable clients and servers, the dark period during which the changes propagate is handled by an address selection/retry algorithm in the client (for new sessions) and by the shim6 protocol doing failure detection and selecting a new locator (for established sessions).

Once the DNS change has propagated, the address selection and shim6 band-aids are no longer required, and clients have an accurate set of information.

Renumbering for hosting providers can be a monstrous pain in the neck, especially for hosting providers who rely on third parties (or, horrors, their customers) to maintain the zone files within which services are named.

Some hosting providers of my acquaintance insist on customer zones being redelegated to the hosting providers' nameservers, so that any renumbering that needs to happen can be coordinated by the hosting provider directly. Hosting providers who don't do this, and who use PA addresses with shim6 to multi-home, are definitely going to face some challenges.

Our network is complicated. We have a /21 that's split into 4 /23s. One for each non-interconnected POP. We only advertise the /23 for each POP out to transit, but we give peers access to our entire network wherever they peer with us and we pay to haul/tunnel it around. How do we even do this without PI space, let alone through shim6?
You avoid it completely, and use PA space in every POP. You can still announce PA space from other POPs to peers, if you want to retain your tunnels.

For quite the foreseeable future, we'd be running IPv4 and IPv6 at the same time, over the same transit connections. We'd have to TE our IPv6 bits completely differently than our IPv4 bits, even though we'd be billed for the aggregate usage of both. Automated tools for tweaking total usage per transit port is hard enough in BGP. Having to tweak both BGP and some external shim6 method of TE when the goal is a common aggregate number is going to be a very difficult issue.
Yep. Difficult and expensive.

Some of our applications are extremely sensitive to jitter/latency. We've spent ages tweaking route-maps manually (and through automated continual tweaking) to make sure we avoid any congested links. [...]
The site-policy middleware that I alluded to earlier seems like the analogous place to specify this policy. Such a facility might actually give you more control than you have now -- tweaking BGP attributes to accomplish this kind of thing is often like a game of whack-a-mole; if you were able to control the route taken by traffic in both directions by influencing the locator selection for each and every session, you'd have far greater, and more fine-grained, control over your external traffic than BGP/swamp-abuse gives you currently.

Your specific requirements in this regard (the high-level objectives that you currently meet using BGP) would no doubt be gratefully received on the shim6 list.

We'd still be relying on PA space. No matter how great dhcp6 is, there will be significant renumbering pain when providers are changed. Static ACLs, firewall rules, etc. If you're including customer machines in the renumbering, many simply won't do it.
Agreed, renumbering is a pain. Dhcp6 sounds like a scary thing to use with servers. Customers suck. Change in operational practices will be required.

Lest I sound too much like a foam-at-the-mouth shim6 advocate, I think it would be perfectly fine if, in the final analysis, the conclusion was that shim6 and PA/renumbering was not an option for hosting providers. A reasoned technical argument which came to that conclusion would provide a solid basis for the RIRs to modify their allocation policies such that hosting providers could use PI space instead. As perhaps the recent attempt to change the v6 PI policy indicates, the chances of making changes without such a reasoned argument are slim.

However, I think it's possible that shim6, incorporating some facility for a site to manage the locator selection of the hosts, could actually make some things easier for hosting providers. There might even be reasons to like it :-)


Joe