North American Network Operators Group


Re: shim6 @ NANOG (forwarded note from John Payne)

  • From: Kevin Day
  • Date: Thu Mar 02 08:33:54 2006

On Mar 2, 2006, at 4:07 AM, [email protected] wrote:
> When I see comments like this I wonder whether people
> understand what shim6 is all about. First of all, these
> aren't YOUR hosts. They belong to somebody else. If you
> are an access provider then these hosts belong to a customer
> that is paying you to carry packets. This customer also
> pays another ISP for the same service and the hosts
> are making decisions about whether to use your service
> or your competitor's.
>
> If you are a hosting provider, then these hosts, owned
> by a third party, are making decisions about whether to
> send you packets through one or another AS.
>
> Is there something inherently wrong with independent
> organizations deciding where to send their packets?
The problem arises when the *hosting company* or *ISP* itself is multihomed and using shim6. The customers aren't straddling two hosting companies; they're using one hosting company that is using shim6.

Take us as a slightly exaggerated example (using totally made-up bandwidth and prices, to protect NDAs). We have several boxes on our network that we do not control; we don't even have a login on those servers.

In one POP we have three transit providers. NSP A gives us 10Gbps of bandwidth and charges us $50/Mbps. NSP B is on a GigE, but we only have a 500Mbps commit. B charges us $75/Mbps, but $150/Mbps if we go over our commit. NSP C is also on a GigE, but we only have a 100Mbps commit; C charges us $200/Mbps, and $500/Mbps if we go over our commit.

I don't want a customer to touch NSP C, except for a very tiny number of routes where A and B aren't so great. I want to use NSP B as close to our commit as possible without going over it. I want everything else to go over NSP A. If any of the three transit connections goes down, all the rules change temporarily (but hopefully not for long enough that we get dinged on 95th-percentile billing).
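To put rough numbers on that, here is a minimal sketch of the billing arithmetic using the made-up prices above. It rests on assumptions the post does not state: the commit is billed in full whether or not it is used, the overage rate applies only to the Mbps above the commit, NSP A is billed purely on usage, and the 95th percentile is computed by the nearest-rank method. The names `PLANS`, `percentile_95` and `monthly_cost` are hypothetical.

```python
# Made-up prices from the post, in $ per Mbps per month.
# Assumption: NSP A has no commit, so it is modelled as flat usage billing.
PLANS = {
    "A": {"commit": 0, "rate": 50, "over_rate": 50},
    "B": {"commit": 500, "rate": 75, "over_rate": 150},
    "C": {"commit": 100, "rate": 200, "over_rate": 500},
}


def percentile_95(samples_mbps):
    """Nearest-rank 95th percentile: ignore the top 5% of samples
    and return the highest of what remains."""
    ordered = sorted(samples_mbps)
    # Integer form of ceil(0.95 * n), avoiding float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def monthly_cost(plan, billed_mbps):
    """Cost for one provider given the billed (95th-percentile) rate.

    Assumption: the commit is always paid in full, and only the
    Mbps above the commit are charged at the overage rate.
    """
    commit_part = plan["commit"] * plan["rate"]
    excess = max(0, billed_mbps - plan["commit"])
    return commit_part + excess * plan["over_rate"]
```

Under those assumptions, letting NSP B's billed rate drift from 500 to 600 Mbps raises its bill from $37,500 to $52,500, and the same 100 Mbps of drift on NSP C adds $50,000. That is the kind of cost a host making its own exit choice cannot see.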

Putting the routing decisions in the hands of the servers (which we do not control) requires that we somehow impart this routing policy to our customers, make them keep it up to date when we change things, and somehow ensure that they don't break the policy. If a customer sees that forcing traffic through NSP C results in a faster connection for him, he may tweak or break shim6's selection process (or just ignore our policy instructions) and cost us lots of money. We may learn from one of our providers that they lost an OC48 in our city and can't handle our full traffic, so we need to back off immediately. Or we may know in advance that a connection is about to go down, and want to preemptively route around it so that nothing gets blackholed before the routers notice.

On very high traffic days, we may make 10+ manual changes to our BGP policies to balance outbound and inbound traffic, keeping levels under our commits while still using as much of each commit as possible. We have automated tools that make slight tweaks every 5 minutes. How can information that changes this frequently, and involves a very large dataset (several full tables of routes), get propagated to hundreds or thousands of hosts in a reasonable timeframe? Are we reinventing BGP as an IGP to send route data to shim6? :) And do we want to blow that much RAM keeping a full routing table on each server? Even compressed to list only the exceptions to a default route, my list of exceptions is still huge.
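For illustration, here is a minimal sketch of what "a default route plus a list of exceptions" would look like on a host, using Python's standard `ipaddress` module. This is not anything shim6 defines; `build_table` and `select_exit` are hypothetical names, the prefixes come from the IPv6 documentation range, and a real table would hold many thousands of exceptions and would need refreshing every time the policy changes.

```python
import ipaddress


def build_table(default_exit, exceptions):
    """Build a lookup table: one default entry plus per-prefix exceptions.

    `exceptions` maps prefix strings to the name of the exit to use.
    """
    table = [(ipaddress.ip_network("::/0"), default_exit)]
    table += [(ipaddress.ip_network(p), name) for p, name in exceptions.items()]
    # Longest prefix first, so the first match is the most specific one.
    table.sort(key=lambda entry: entry[0].prefixlen, reverse=True)
    return table


def select_exit(table, destination):
    """Return the exit for a destination by longest-prefix match."""
    addr = ipaddress.ip_address(destination)
    for network, exit_name in table:
        if addr in network:
            return exit_name
    raise LookupError(f"no route for {destination}")


# Hypothetical policy: everything via NSP A, with two exceptions.
table = build_table("A", {
    "2001:db8:1::/48": "B",
    "2001:db8:1:2::/64": "C",
})
```

The linear scan is only for readability; the point is that every host would have to hold, and keep current, a copy of a table that today lives only on the border routers.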

The same problems exist, on a smaller scale, on enterprise networks. Routing policies can be complex, can require information that isn't currently visible to end hosts, can change frequently, and can be very costly if anyone ignores them. Under current BGP-style decisions-at-the-edges networking, it's impossible for an end user or server to ignore routing policy. With shim6, the end nodes ARE the routing policy. There's a lot more to many networks' decision-making process for selecting the best route than can be measured with RTT, received TTLs, or anything else the end nodes can see.

Even outside the case of enterprise/hosting environments, transit providers already send route preference data to their customers. As a transit provider I'm able to depref/prepend/tag/etc. routes to customers that we'd rather they not use (though they are free to ignore this). Under shim6, it's not really possible for your upstreams to tell you "My connection to this network is degraded at the moment, use it only as a last resort", whereas with BGP they can prepend those routes a dozen times or flag them with a community and you won't use them unless you have to. Under host-based routing, all end nodes have to be made aware of this information.
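As a rough illustration of why prepending works as a "last resort" signal, here is a heavily simplified sketch of route selection that looks only at local preference and AS-path length. Real BGP best-path selection has many more steps; the `Route` class, `best_route` and `prepend` are hypothetical names, and the AS numbers come from the range reserved for documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    upstream: str          # which transit provider announced the route
    as_path: tuple         # AS numbers, nearest first
    local_pref: int = 100  # receiver-side preference, higher wins


def best_route(routes):
    """Simplified selection: highest local-pref, then shortest AS path."""
    return min(routes, key=lambda r: (-r.local_pref, len(r.as_path)))


def prepend(route, asn, times):
    """Return a copy of the route with `asn` prepended `times` times,
    as an upstream would do to make the path look less attractive."""
    return Route(route.upstream, (asn,) * times + route.as_path, route.local_pref)


# Two paths to the same destination AS 64500.
via_x = Route("X", (64496, 64500))
via_y = Route("Y", (64497, 64499, 64500))

normal_choice = best_route([via_x, via_y])
# Upstream X reports a degraded link and prepends its own AS a dozen times.
degraded_choice = best_route([prepend(via_x, 64496, 12), via_y])
```

Here the router normally picks the shorter path via X, and switches to Y once X prepends, without any end host being told anything. The path via X still exists as a fallback if Y disappears.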

Something like shim6 works great for small or medium businesses that don't care about this sort of thing, whose routing policies only change when they add or drop a provider, and that don't have thousands of customers with root access on their boxes trying to game the system. I just don't think it's a solution for everyone.