North American Network Operators Group

and other impatient Postfix mailers everywhere

  • From: kai
  • Date: Thu Aug 02 13:28:17 2001

If you are receiving larger numbers of reports about failing mail
transport from your customers these days, here is why.

Apparently, the nameservers for the and zones are overloaded (I get
<10% of my ICMP echoes back, and virtually never any DNS answers).
On typical Unix-based Sendmail (and other SMTP) servers with default
DNS resolver timeouts, this delays the 220 greeting banner by more
than 60 seconds after the connection is established.

This by itself should not pose a problem given that RFC 1123 5.3.2
stipulates a 5-minute timeout for the banner, but it appears that
a SIGNIFICANT number of mailers out there are losing their patience
after only about 60s: the Postfix mailer running the outgoing NANOG-L
mail is one of them (guess how I found out, heh).
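
For illustration, here is a minimal Python sketch of a client that
waits for the greeting with the 5-minute limit from RFC 1123 5.3.2
instead of giving up after 60s. The names read_banner and
INITIAL_220_TIMEOUT are made up for this example and do not come
from any MTA; the sketch also only reads the first line of the
greeting and ignores multi-line "220-" banners.

```python
import socket

# RFC 1123 5.3.2 gives 5 minutes for the initial 220 message.
# The constant name is an assumption for this sketch.
INITIAL_220_TIMEOUT = 5 * 60  # seconds


def read_banner(host, port=25, timeout=INITIAL_220_TIMEOUT):
    """Connect to host:port and return the first greeting line.

    `timeout` applies to the connect and to each individual read,
    so a slow server gets at least that long to produce its banner.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        buf = b""
        # Keep reading until a full line has arrived.
        while b"\n" not in buf:
            chunk = sock.recv(512)
            if not chunk:
                raise ConnectionError("peer closed before sending a banner")
            buf += chunk
        line = buf.split(b"\n", 1)[0].rstrip(b"\r")
        return line.decode("ascii", "replace")
```

A sender that calls read_banner() with timeout=60 against a server
stuck in slow DNS lookups would hit socket.timeout and log a failed
delivery; with the default above it would simply wait.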

Upon closer examination, I find thousands of these mailers, all
suddenly appearing with lots of Null-connections in my sendmail logs:

Aug  2 12:08:49 sonet sendmail[26668]: NOQUEUE: Null connection \
 from [email protected] []

While I am not sure that the non-responsiveness of the
DNS servers for their "subscription-only" query zones is intentional
(how do you shed traffic coming from 1000's of sites that you no longer
wish to serve?), I am just amazed at the wide proliferation of
blatantly RFC 1123-violating implementations/configurations of mail
servers out there. Why do none of these servers, for once, get MORE
patient with a host that is not answering 'fast enough' for them
(after deciding the first time around that 2MB of their precious
server RAM for 60s oughta be enough resources wasted on one delivery
attempt for a particular mail to a particular host)?
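
What "getting more patient" could look like, as a rough Python
sketch: start with the short 60s wait if you must, but lengthen the
banner timeout on every retry to the same host until it reaches the
5 minutes from RFC 1123 5.3.2. The function name, the doubling rule
and the 60s starting value are assumptions made up for this example,
not the behavior of Postfix, Sendmail or any other MTA.

```python
# 5-minute limit for the initial 220 message, per RFC 1123 5.3.2.
RFC1123_BANNER_TIMEOUT = 300  # seconds


def banner_timeout(attempt, first=60, ceiling=RFC1123_BANNER_TIMEOUT):
    """Return the banner wait in seconds for a given retry number.

    attempt 0 is the first delivery attempt. The wait doubles on each
    retry (a hypothetical policy) and is capped at `ceiling`.
    """
    if attempt < 0:
        raise ValueError("attempt must be >= 0")
    return min(first * 2 ** attempt, ceiling)


if __name__ == "__main__":
    for n in range(5):
        print(n, banner_timeout(n))
```

With these defaults the fourth attempt already waits the full 5
minutes, so a slow but working server gets its mail on a later retry
instead of never.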

Have people forgotten the robustness principle and no longer
feel responsible in any shape or form?

(see RFC 1123, 1.2.1):

         A vendor who develops computer
         communication software for the Internet protocol suite (or any
         other protocol suite!) and then fails to maintain and update
         that software for changing specifications is going to leave a
         trail of unhappy customers.  The Internet is a large
         communication network, and the users are in constant contact
         through it.  Experience has shown that knowledge of
         deficiencies in vendor software propagates quickly through the
         Internet technical community.

I.e.: this is not MAPS' fault. This is large sites' (Hello Yahoo!)
SMTP MTAs (and excuse me for not having researched the shipped default
timeouts for Postfix here; I am not blaming Postfix or any particular
SMTP MTA, as administrators tend to have their hands too deeply
in the config files) screwing it up for all of us:

They mistakenly think that (while violating RFC 1123, which was
designed with interoperability and stability in mind, not max.
profit margins) allocating an arbitrarily small amount of resources
for their flawed business model has no consequences beyond
their own service.

MAPS's changing server arrangements just happen to be the
coincidental "contributing failure" here, but the true cause is
apparently marketing, financial and other blithering idiots at
the wheel at Yahoo, Verizon, Flonetwork (and 1000's of other
dot-coms) making resource decisions over the heads, and beyond
any sane reason, of responsible technical personnel who know
better than they do what will fly and what won't, and why.

And remember: when mail breaks, your phones don't stop ringing.


"Just say No" to Spam                                     Kai Schlichting
New York, Palo Alto, You name it             Sophisticated Technical Peon
Kai's SpamShield <tm> is FREE!        