North American Network Operators Group

Re: SOLVED! The cause of puzzling TCP (eg. WHOIS) connection failures with some InterNIC.net hosts

  • From: Marc Slemko
  • Date: Fri Nov 20 17:31:55 1998

On Fri, 20 Nov 1998, Greg A. Woods wrote:

[...]

> The problem has to do with the failure of a host to fragment larger
> packets on demand (i.e. when the other host sends an ICMP "needs frag"
> notification).  This may be because the ICMP packet never gets through
> (perhaps someone who didn't understand TCP/IP and ICMP and everything
> else related implemented a filter on all "abnormal" ICMP packets); or it
> may be because the receiving host doesn't understand the ICMP "needs
> frag" request (and also doesn't implement path MTU discovery, or have I
> got that backwards?).

No, if they don't implement PMTU-D then they normally wouldn't be 
sending packets with DF set.  If DF isn't set, then normally the 
packets will be fragmented by the router so there is no problem.
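
As a rough illustration of the two cases above, here is a minimal sketch
of the decision a router makes when a packet is too big for the next
link.  It is not any particular router's code: the Packet class, the
forward() function and the 20-byte header (no IP options) are
assumptions made for the example.

```python
from dataclasses import dataclass

ICMP_DEST_UNREACH = 3   # ICMP type: destination unreachable
ICMP_FRAG_NEEDED = 4    # ICMP code: fragmentation needed and DF set
IP_HEADER_LEN = 20      # assumption: IPv4 header without options


@dataclass
class Packet:
    total_len: int  # IP header plus payload, in bytes
    df: bool        # state of the Don't Fragment bit


def forward(pkt, next_hop_mtu):
    """Return (action, detail) for a packet leaving on a link with next_hop_mtu."""
    if pkt.total_len <= next_hop_mtu:
        return ("forward", [pkt.total_len])
    if pkt.df:
        # Too big and DF set: drop it and tell the sender, who is
        # expected to retry with smaller packets.
        return ("icmp", (ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED, next_hop_mtu))
    # Too big and DF clear: fragment.  Every fragment except the last
    # must carry a payload that is a multiple of 8 bytes.
    payload = pkt.total_len - IP_HEADER_LEN
    per_fragment = (next_hop_mtu - IP_HEADER_LEN) // 8 * 8
    sizes = []
    while payload > 0:
        chunk = min(per_fragment, payload)
        sizes.append(chunk + IP_HEADER_LEN)
        payload -= chunk
    return ("fragment", sizes)
```

A host that doesn't do PMTU-D lands in the "fragment" branch and never
notices; a host that does PMTU-D lands in the "icmp" branch, which only
works if that ICMP message actually gets back to it.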

I really don't think it would be very wise for a router to try to keep
track of packets it is routing that have DF set and then ignore the bit
if it thinks it should.  You can implement PMTU-D blackhole detection
on the sender, i.e. if it keeps trying to send and neither succeeds nor
gets any ICMP "can't fragment" message back, then it tries backing down
to a smaller packet size.
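
The sender-side backoff described above could look something like the
following.  This is only a sketch under stated assumptions: the send()
callback, the retry limit of 3, the floor of 296 and the short list of
fallback sizes are all illustrative choices, not taken from any real
TCP stack.

```python
PLATEAUS = [1500, 1492, 1006, 576, 296]  # illustrative fallback sizes
MAX_SILENT_RETRIES = 3                   # assumption: silence tolerated per size


def next_lower(mtu, floor):
    """Largest fallback size strictly below mtu, or the floor if none."""
    lower = [p for p in PLATEAUS if floor <= p < mtu]
    return max(lower) if lower else floor


def discover_mtu(send, start_mtu=1500, floor=296):
    """Probe for a usable packet size.

    send(size) is a hypothetical callback returning "ack" on success,
    ("frag_needed", mtu) when an ICMP "can't fragment" arrives, or None
    when nothing at all comes back.
    """
    mtu, silent = start_mtu, 0
    while True:
        result = send(mtu)
        if result == "ack":
            return mtu
        if result is None:
            silent += 1
            if silent < MAX_SILENT_RETRIES:
                continue  # ordinary retransmission at the same size
            reported = 0  # blackhole suspected: nothing told us the MTU
        else:
            reported = result[1]
        if mtu <= floor:
            return None   # nothing gets through even at the floor
        # Use the MTU from the ICMP message if it is sensible, otherwise
        # step down to the next smaller fallback size.
        mtu = reported if floor <= reported < mtu else next_lower(mtu, floor)
        silent = 0
```

With working ICMP the sender drops to the right size after one failed
packet; with the ICMP filtered it only gets there after several
retransmission timeouts, which is why this is a fallback and not a
substitute for letting the ICMP through.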

The problem in this case is probably the load balancing systems that
NSI is now using, which have known bogus behaviour when interacting
with PMTU-D.  NSI should disable PMTU-D on their servers until they
can fix this problem; fixing this problem probably involves having
the vendor for their load balancing boxes fix their broken
software.

> Here's a sample trace collected from the PPP router upstream which shows
> the outgoing ICMP packets and the incoming TCP retransmissions,
> un-fragmented, even after the first request to fragment:

I thought you had said that there were no differences in the traffic
dumps between working and non-working connections...

As always, see http://www.worldgate.com/~marcs/mtu/ for discussion
of PMTU-D and how things break it.