North American Network Operators Group

Re: Slashdot: Providers Ignoring DNS TTL?

  • From: Steve Gibbard
  • Date: Sat Apr 23 01:54:53 2005

On Sat, 23 Apr 2005, Christopher L. Morrow wrote:

oh well, I tried to stay quiet :) Probably the PPLB problem isn't quite as
simple as: "you have pplb you can't do anycast". I'd imagine that you have
to have some substantial difference in the paths that the PPLB follows,
yes? like links to differing ISP's or perhaps extremely diverse links
inside the same ISP. Correct?

For anybody who's confused by this thread, this is a quick explanation, after which I'm really hoping the thread will die:

The "PPLB" Dean mentions is "per packet load balancing," in which you have two or more circuits, and successive packets to the same destination alternate between them. In every case in which I've seen this used, it's been to combine multiple circuits taking the same path between the same pair of routers, in effect creating a bigger circuit. In theory, PPLB could also be used to split traffic between circuits going to different routers, perhaps even in different places. I've never seen anybody actually use the latter setup, and it seems to be universally regarded as something that would break things. I suppose it's possible that somebody's using it somewhere, probably with "interesting" results. It's the latter, theoretically possible, setup that Dean is talking about.
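For anybody who'd rather see that as code, here is a minimal sketch of the alternation, assuming plain round-robin between two circuits. The circuit names and the function name are made up for illustration, and real routers may pick the next circuit differently.

```python
from itertools import cycle


def per_packet_round_robin(packets, circuits):
    """Assign each packet to the next circuit in turn, ignoring which flow it belongs to."""
    chooser = cycle(circuits)
    return [(pkt, next(chooser)) for pkt in packets]


# Hypothetical example: six packets to one destination over two parallel T1s.
packets = [f"pkt{i}" for i in range(6)]
assignment = per_packet_round_robin(packets, ["T1-a", "T1-b"])
for pkt, circuit in assignment:
    print(pkt, "->", circuit)
```

When both circuits run between the same pair of routers, the two halves of the stream see roughly the same delay, which is why this normally behaves like one bigger circuit.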

Anycast is a technique in which two or more servers, generally in different locations, announce the same address space. Those sending traffic into a network via one POP or exchange point will have their traffic go to the server close to that entry point, while those sending traffic into a network via another POP or exchange point will have their traffic go to the server close to that point. To an outside network, it looks the same as regular peering -- you see the same route at each peering point and can hand off traffic. The only difference is that the packets may not have to travel as far once they enter the other network.

So, just as a fun theoretical exercise, let's examine what happens in the PPLB to multiple locations scenario that Dean imagines:

Let's say somebody is in the Midwest, and has T1s to Network A and Network B. And let's say that their network administrator read on the NANOG list that per packet load balancing was the trendy thing to do, so they turn on per packet load balancing between the two T1s. Now they want to send some packets to a unicast host on Network C, somewhere in California.

They start with UDP DNS queries, each consisting of a single packet. Half go via network A, which peers with Network C in California. Responses come back with a 40 ms RTT. The other half go through network B, which has its closest peering point with Network C in Virginia. The packets go to Virginia and then to California, and the replies come back 80 ms later. Everything works fine.

Then they try to set up a more persistent connection, and again half their packets are taking the 40 ms path while the others are taking the 80 ms path. Now things get interesting, because the packets are arriving out of order. Some applications may do OK with this, since they'll take the sequence numbers and reorder the packets, with some buffering and processing delay. But remember, the latency amounts here are numbers I just made up, and there's no reason why it couldn't be 40 ms vs. 1 second in some parts of the world. In either case, I suppose it's possible that you'd get an HTTP connection to sort of work, and an ssh session might just seem mildly painful. But good luck getting a VoIP call or anything of the sort to function over such a connection.
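The reordering is easy to see in a toy model. This is only a sketch under assumptions of my own: one-way delay is taken as half of the made-up RTTs (20 ms and 40 ms), packets are sent 1 ms apart, and they strictly alternate between the two paths. None of the function names come from any real tool.

```python
def arrival_order(n_packets, send_interval_ms, one_way_delays_ms):
    """Return sequence numbers in the order the receiver sees them.

    Packet i is sent at i * send_interval_ms and takes the delay of
    path i % len(one_way_delays_ms), i.e. strict per-packet alternation.
    """
    arrivals = []
    for seq in range(n_packets):
        delay = one_way_delays_ms[seq % len(one_way_delays_ms)]
        arrivals.append((seq * send_interval_ms + delay, seq))
    arrivals.sort()  # order by arrival time
    return [seq for _, seq in arrivals]


def count_out_of_order(order):
    """Count packets that arrive after a higher-numbered packet already has."""
    highest = -1
    late = 0
    for seq in order:
        if seq < highest:
            late += 1
        else:
            highest = seq
    return late


# Assumed numbers: 8 packets, 1 ms apart, 20 ms vs. 40 ms one-way delay.
order = arrival_order(8, 1, (20, 40))
print(order)
print(count_out_of_order(order), "packets arrived late")
```

With these assumed numbers, every packet on the faster path arrives before any packet on the slower one, so the receiver has to buffer half the stream before it can hand anything past the first packet to the application.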

Dean is correct that this setup would fall apart even further when anycast is thrown into the mix. In the anycast example, Network A hands off the packets to Network C in California, where they get sunk into a local server. Network B hands off the packets to Network C in Virginia, where they get sunk into a local server. Each server only sees half the packets, and half the retransmits, and is probably never going to get enough of the connection to put it all back together in a way that works.
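The same toy model shows why anycast makes it worse. Again a sketch with assumptions: packets strictly alternate between the two upstreams, each upstream lands at a different anycast site, and "usable" means the contiguous run of sequence numbers starting from 0. The site names and helper functions are hypothetical.

```python
def deliver(seqs, sites):
    """Alternate packets across upstreams; each upstream reaches a different anycast site."""
    received = {site: [] for site in sites}
    for i, seq in enumerate(seqs):
        received[sites[i % len(sites)]].append(seq)
    return received


def contiguous_prefix(seqs):
    """How many packets, starting at sequence 0, a server holds without a gap."""
    expected = 0
    for seq in sorted(seqs):
        if seq != expected:
            break
        expected += 1
    return expected


received = deliver(range(8), ["california", "virginia"])
for site, seqs in received.items():
    print(site, seqs, "usable prefix:", contiguous_prefix(seqs))
```

In the unicast case the one server eventually holds all eight packets and only has to reorder them. Here neither server ever gets past the first gap, no matter how long it waits.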

So, there are a couple of different conclusions that could be drawn from this. The conclusion I come to is that there are enough problems doing per packet load balancing on non-identical paths that nobody would actually do it. I'm made more comfortable in this conclusion by having been through this discussion several times without finding anybody who claims to actually do that sort of per packet load balancing. I, therefore, declare the PPLB thing to be a non-issue.

It may also be valid to declare that PPLB over non-identical paths is important to allow people to use every last bit of bandwidth they're paying for, and that we shouldn't make their already painful predicament worse. But that's an argument I continue to be skeptical of.