North American Network Operators Group

Re: PMTU-D: remember, your load balancer is broken

  • From: Greg A. Woods
  • Date: Fri Jun 16 02:07:23 2000

[ On Thursday, June 15, 2000 at 21:54:58 (-0700), Marc Slemko wrote: ]
> Subject: Re: PMTU-D: remember, your load balancer is broken 
> On Thu, 15 Jun 2000, Greg A. Woods wrote:
> > So, how about it everyone?  Can we please all disable PMTU everywhere
> I assume you mean PMTU-D, not PMTU.

Yes, of course...   :-)

> If PMTU-D is causing problems, then get whoever has a broken network to
> fix it.  Is it always practical?  Of course not.  But education is the
> key.  PMTU-D is not the problem here, and it is very shortsighted to
> say "oh, we just know better and can manually tune things to work
> well".  That is not a wise "solution".  If even 5% of people are in a
> situation where broken networks cause PMTU-D to not work, then such
> broken networks will be fixed, period.

I don't yet agree.  I've never yet seen Path-MTU-Discovery used on the
public Internet for any purpose that cannot better be achieved by simply
tuning your default MSS to a more "modern" value.  IIRC you yourself
advocated this very same solution.  People say they need PMTU-D to get
good throughput on bulk data transfers and yet they can achieve the same
efficiencies by simply tuning their TCP stacks to meet the demands and
capabilities of the modern Internet.  PMTU-D is really only a hack
that's not currently necessary.

>  If you want to work around it on
> your systems, then lower your MTUs.

In my particular case that's what causes the problem in the first place!  ;-)

>  But the solution is not for everyone
> to go disable PMTU-D because there are some broken networks; after all,
> the people that would listen to disable it are the same people who would 
> just fix their broken networks.

Actually I think the most practical solution is for server OS vendors to
choose better defaults (i.e. PMTU-D should be off by default and the
default MSS should be set to something very close to 1460), and for them
to better document both the effects and the dangers of changing these
values.  In the meantime those who are using PMTU-D really must
re-evaluate the reasons they are using it and check to see if they can't
achieve the same results through adjusting their default MSS instead.
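
For reference, the 1460 figure is just header arithmetic.  A minimal
sketch, assuming IPv4 and TCP headers with no options (20 bytes each);
the function name is only illustrative:

```python
# Sketch: the TCP MSS implied by a link MTU.
# Assumes IPv4 and TCP headers carry no options (20 bytes each).
IPV4_HEADER = 20
TCP_HEADER = 20


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - TCP_HEADER


# 1500 is the usual Ethernet MTU; 576 is the datagram size every IPv4
# host must accept, which is where the traditional default MSS comes from.
for name, mtu in [("Ethernet", 1500), ("IPv4 minimum datagram", 576)]:
    print(f"{name}: MTU {mtu} -> MSS {mss_for_mtu(mtu)}")
```

A stack left at the old 576-based default sends far smaller segments
than a 1500-byte path allows, which is the gap a tuned default MSS
closes without needing PMTU-D.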

Defaulting to PMTU-D is guaranteed to cause problems that, as has been
said already, result in 100% failure for those affected.  Not tuning
your default MSS only results in degraded service, never complete
failure so far as I can tell.
Furthermore as I've tried to demonstrate, and as you more or less
confirm in your next sentence, any degradation introduced will only
affect those few people who are in the first place susceptible to
complete failures when PMTU-D is used.  The overall effect on the
Internet will be minor (and perhaps minutely positive since there'll no
longer be any excess "needs frag" packets and retransmissions).

Even people running servers on local networks with >1500-byte MTUs would
not suffer (and might actually benefit as above too) if their primary
purpose is to serve to the Internet since most of the Internet is
running with just 1500-byte MTUs and so they can't usually send bigger
packets anyway.....
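
To illustrate why a big local MTU buys little here: TCP never sends
segments larger than the MSS the peer announced.  A rough sketch,
assuming 40 bytes of IPv4+TCP headers and using FDDI's 4352-byte MTU as
an example large-MTU network (the function name is hypothetical):

```python
# Sketch: the segment size a server actually uses toward a given client.
# Assumes 40 bytes of IPv4 + TCP headers with no options.
HEADERS = 40


def effective_mss(local_mtu: int, peer_announced_mss: int) -> int:
    """TCP sends no more than the smaller of its own and the peer's MSS."""
    return min(local_mtu - HEADERS, peer_announced_mss)


# Server on FDDI (MTU 4352), client on Ethernet announcing MSS 1460.
print(effective_mss(4352, 1460))
# Same server talking to another FDDI host announcing MSS 4312.
print(effective_mss(4352, 4312))
```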

>  And in 99% of the cases, the broken 
> network will be at their end or at the user's end, it will very seldom
> be in some network in the middle providing transit.

Indeed it's almost never the network in the middle that's at fault,
though strictly speaking in my case I've always encountered problems
when the link between my networks and the next hop out has the lower
MTU (e.g. PPP, PPPoE, GRE, etc.).
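
A toy model of what happens at such a hop, assuming the usual
encapsulation overheads (8 bytes for PPPoE, 24 bytes for basic GRE over
IPv4) on top of a 1500-byte Ethernet.  The function and its outcome
strings are made up for illustration; this is not any router's actual
code:

```python
# Sketch: how an IPv4 hop treats a packet larger than the next-hop MTU.
ETHERNET_MTU = 1500
LINK_MTU = {
    "PPPoE": ETHERNET_MTU - 8,   # 1492
    "GRE": ETHERNET_MTU - 24,    # 1476 (20-byte outer IP + 4-byte GRE)
}


def forward(packet_len: int, df_set: bool, link_mtu: int,
            icmp_delivered: bool) -> str:
    """Return the outcome for one packet arriving at a hop."""
    if packet_len <= link_mtu:
        return "forwarded"
    if not df_set:
        return "fragmented"        # degraded, but the data still arrives
    if icmp_delivered:
        return "needs-frag sent"   # sender can shrink its packets and retry
    return "black hole"            # dropped, and the sender never learns why


for link, mtu in LINK_MTU.items():
    with_pmtud = forward(1500, df_set=True, link_mtu=mtu, icmp_delivered=False)
    without = forward(1500, df_set=False, link_mtu=mtu, icmp_delivered=False)
    print(f"{link} (MTU {mtu}): DF set -> {with_pmtud}; DF clear -> {without}")
```

The "black hole" branch is the 100% failure case: PMTU-D sets DF, and if
the needs-frag reply is filtered somewhere the full-size packets simply
vanish.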

BTW, what happens to a server using PMTU-D if some attacker starts
successfully spoofing "needs frag" replies to it with a ridiculously low
next-hop-MTU?  :-)  I.e. how many existing server implementations are
robust enough to even verify the sanity of the MTU they're being asked
to use, never mind validating that the IP header and data returned in
the needs-frag payload match the original bit-for-bit?
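
The sort of checking I have in mind might look roughly like this.  It
is only a sketch under stated assumptions: the Flow type, the in_flight
table and the function name are all hypothetical, sequence-number
wraparound is ignored, and it relies on the ICMP error quoting at least
the original IP header plus the first 8 bytes of the datagram (enough
for the TCP ports and sequence number, though not for a full
bit-for-bit comparison):

```python
from dataclasses import dataclass

MIN_IPV4_MTU = 68  # a host should never lower its path MTU below this


@dataclass(frozen=True)
class Flow:
    """Addresses and ports recovered from the quoted IP/TCP headers."""
    src: str
    dst: str
    sport: int
    dport: int


def accept_needs_frag(next_hop_mtu, quoted_flow, quoted_seq,
                      current_pmtu, in_flight):
    """Decide whether a needs-frag message is plausible enough to act on.

    in_flight maps Flow -> (lowest unacked seq, next seq to send).
    """
    if next_hop_mtu < MIN_IPV4_MTU:
        return False               # absurdly small: ignore it
    if next_hop_mtu >= current_pmtu:
        return False               # would not shrink anything
    window = in_flight.get(quoted_flow)
    if window is None:
        return False               # we have no such connection
    lowest_unacked, next_to_send = window
    # The quoted segment must be one we sent and have not had acked.
    return lowest_unacked <= quoted_seq < next_to_send


flow = Flow("192.0.2.1", "198.51.100.7", 80, 40000)
in_flight = {flow: (1000, 5000)}
print(accept_needs_frag(1492, flow, 1000, 1500, in_flight))  # plausible
print(accept_needs_frag(40, flow, 1000, 1500, in_flight))    # too small
```

An off-path attacker would have to guess the ports and an in-window
sequence number to get past this, which is a good deal harder than
just forging the ICMP header.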

							Greg A. Woods

+1 416 218-0098      VE3TCP      <[email protected]>      <robohack!woods>
Planix, Inc. <[email protected]>; Secrets of the Weird <[email protected]>