North American Network Operators Group

Re: PMTU-D: remember, your load balancer is broken

  • From: Richard A. Steenbergen
  • Date: Fri Jun 16 05:41:25 2000

On Wed, 14 Jun 2000 [email protected] wrote:
> On Tue, 13 Jun 2000 22:36:08 MDT, Marc Slemko said:
> > Except that, technically, you are not permitted to just blindly send 
> > segments of such size.  Well, you can but systems in the middle don't
> > have to handle them.  No?
> Hmm.. either I did a bad job of explaining or I haven't had enough caffeine
> to parse what you said.  Given that you also suggest going to a 1460 MSS,
> I suspect that we're actually violently in agreement here.
> Now if I can remember why I chose 1396 for a default MSS.... ;)
> > It is also a concern that, in my experience, many of the links with
> > MTUs <1500 are also the links with greater packet loss, etc. so
> > you really don't want fragmentation on them.
> The worst part here is that I suspect that most of these links (just on
> sheer numbers of shipped product) are the aforementioned Win98 576-MTU.
> However, in this case, the fragmentation happens in a terminal server on
> the last hop, and hopefully the case of a terminal server running out of
> queueing buffers and having to drop one of the 2 remaining fragments of
> a 1500->576 split after sending the first one is pretty rare....
> I seem to remember that the *original* motivation for slow-start and    
> all that was Van Jacobson's observation that the most common cause of
> a TCP retransmit was that an *entire* packet had been silently dropped
> due to queueing congestion, and could thus be treated identical to
> an ICMP Source Quench.
> Has this changed?  Has "fragmentation" become a Great Evil, rather than
> an annoyance that some links have to deal with?

Anything not in the fast path (fragmentation, IP options, etc.) is a
scourge to all that is good and righteous about networking. In other words,
if it isn't an everyday occurrence, people seem to forget that it needs to
be cared about, checked for "issues", DoS potential, etc. (just this
evening, in fact, I heard about a DoS potential against a popular Unix
stack caused by a lack of bounds checking in the IP options). I'm hoping
IPv6 will fix some of that for IP options, since they're a bit more usable
and a bit more important there, but I doubt anything will change the
mentality.

Fortunately it seems that the backbone links are all running larger MTUs
than the hosts (PoS, FDDI, jumbo frame support for GigE, even if it isn't
standard). As long as it's the hosts shrinking the MTU and not the network
in between, things are better; it's just less than "optimal" throughput.
Simple WFQ would be of more use to those poor bastard souls on dialup,
though.

If you are a provider running the tunnel (not on an end host where MSS can
be set), you would do well to keep the available MTU post-tunnel >= 1500
to keep everyone happy, if at all possible. Just a quick 5:30AM thought,
but it seems like a better solution would be to have the hosts signal on
the ACK that fragmentation was encountered, so that if ICMP discovery is
not possible, the packets are not lost with no idea why (it can go right
next to the ECN bit :P).
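
For the tunnel case, the arithmetic behind "keep the post-tunnel MTU >=
1500" can be sketched in Python as below. This is only an illustration:
the helper names are invented, the 24-byte figure assumes basic GRE over
IPv4 (20-byte outer IPv4 header plus a 4-byte GRE header with no key,
checksum, or sequence fields), and the MSS calculation assumes 20-byte IPv4
and TCP headers with no options.

```python
IPV4_HEADER = 20       # assumption: no IP options
TCP_HEADER = 20        # assumption: no TCP options
GRE_OVER_IPV4 = 24     # assumption: outer IPv4 (20) + basic GRE header (4)


def inner_mtu(link_mtu, tunnel_overhead):
    """MTU left for the encapsulated packet once the tunnel adds headers."""
    return link_mtu - tunnel_overhead


def mss_for_mtu(mtu):
    """Largest TCP segment that fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - TCP_HEADER


def required_link_mtu(tunnel_overhead, wanted_inner_mtu=1500):
    """Link MTU the provider needs so the tunnel still passes 1500 bytes."""
    return wanted_inner_mtu + tunnel_overhead


if __name__ == "__main__":
    print(mss_for_mtu(1500))                 # the usual Ethernet MSS
    squeezed = inner_mtu(1500, GRE_OVER_IPV4)
    print(squeezed, mss_for_mtu(squeezed))   # what hosts see behind the tunnel
    print(required_link_mtu(GRE_OVER_IPV4))  # link MTU that hides the tunnel
```

Under these assumptions, a tunnel over a 1500-byte link leaves 1476 bytes
for the inner packet, so a host sending 1460-byte segments depends on
either fragmentation or working PMTU discovery; a link MTU of 1524 or more
on the tunnel path avoids the problem entirely.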

Richard A Steenbergen <[email protected]>
PGP Key ID: 0x138EA177  (67 29 D7 BC E8 18 3E DA  B2 46 B3 D8 14 36 FE B6)