North American Network Operators Group


Re: is reverse dns required? (policy question)

  • From: Andre Oppermann
  • Date: Thu Dec 02 10:06:09 2004

Steven Champeon wrote:
> on Wed, Dec 01, 2004 at 03:34:43PM -0500, [email protected] wrote:
>
>> On Wed, 01 Dec 2004 15:02:19 EST, Steven Champeon said:
>>
>>> Connect:dhcp.vt.edu     ERROR:5.7.1:"550 go away, dynamic user"
>>
>> Given the number of options available at our end, I can hardly blame
>> other sites for considering this a reasonable rule - I can't think of a
>> scenario we can't fix at our end, as long as the user bothers calling our
>> help desk and asks for help fixing it...
>
> Exactly. That's why rDNS has been so useful for us. We can either
> whitelist exceptions (such as customers of ISPs who have sucky customer
> service and technical support) or try to educate them. It's (generally)
> easy to change, it requires static assignment in order to work properly,
> and it serves as an indication of the purpose(s) to which a given IP is
> put, etc.
Instead of having 6936 regexp patterns to match and parse one gazillion
different reverse DNS encodings, you could simply mark the reverse DNS
entries of IP addresses that are actually *supposed* to be mail servers.

Reverse zone file for 10.0.0.0/24:

 1.0.0.10.in-addr.arpa.   IN PTR   mail.example.com.

 _send._smtp._srv.1.0.0.10.in-addr.arpa.   IN TXT   "1"

About as simple as it gets.  And much easier than figuring out for 99% of
all IP addresses that they are not supposed to send mail directly.  Just
turn the tables and tag those that are mail servers.  And it allows for a
nice and graceful transition too.

Nicely described here:

 ftp://ftp.rfc-editor.org/in-notes/internet-drafts/draft-stumpf-dns-mtamark-03.txt
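
For illustration, a rough sketch of what the receiving side's lookup could
look like, assuming the third-party dnspython library; the function name
and the fallback behavior are made up, and the draft is authoritative on
the actual semantics:

  # Rough sketch of an MTAmark-style lookup, assuming the dnspython
  # (2.x) library and the _send._smtp._srv label scheme from the draft;
  # the function name and fallback behavior are made up.
  import ipaddress
  import dns.resolver

  def mtamark_allows_smtp(ip):
      # 10.0.0.1 -> 1.0.0.10.in-addr.arpa
      rev = ipaddress.ip_address(ip).reverse_pointer
      try:
          answer = dns.resolver.resolve("_send._smtp._srv." + rev, "TXT")
      except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
          return None   # untagged: fall back to whatever local policy says
      for rdata in answer:
          txt = b"".join(rdata.strings).decode()
          if txt == "1":
              return True    # explicitly marked as a mail server
          if txt == "0":
              return False   # explicitly marked as not a mail server
      return None

Untagged space can then be treated however local policy likes during the
transition, which is what makes the scheme graceful.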

--
Andre


>> (On the other hand, anybody who's filtering certain address blocks
>> because they're our DHCP blocks deserves to be shot, for all the usual
>> reasons and then some..)
>
> Sure, but I can certainly understand why, for example, someone might
> block all of AOL's dynamic blocks on port 25, at least. Or Charter's. Or
> Cox's, or any of the other sources of massive and constant abuse.
>
>>> Wouldn't catch 1.2.3.4.dhcp.vt.edu.example.com anyway.
>>
>> Yeah, but that has 'dhcp' at something other than the 3rd level.. ;)
>
> Fair enough :)

>> I was more interested in whether a rule like
>> '*.dhcp.*.{com|net|org|edu}' (blindly looking at the 3rd-level domain
>> and/or the 4th level for the two-letter TLDs) did any better/worse
>> than having to maintain a list of 7K or so - are there enough variant
>> forms that it's worth enumerating, or is it just that enumerating is
>> easier than doing a wildcard?
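
Read literally, that sort of blind positional rule is easy to sketch;
here's a rough stand-in, where the label set and the two-letter-TLD
heuristic are assumptions, not anyone's actual rule:

  # Rough stand-in for the blind positional rule above; the label set
  # and the two-letter-TLD heuristic are assumptions.
  GENERIC_LABELS = {"dhcp", "dsl", "dialup", "cable", "ppp"}

  def looks_generic(hostname):
      labels = hostname.rstrip(".").lower().split(".")
      candidates = []
      if len(labels) >= 3:
          candidates.append(labels[-3])   # 3rd level: dhcp.vt.edu
      if len(labels) >= 4 and len(labels[-1]) == 2:
          candidates.append(labels[-4])   # 4th level under two-letter TLDs
      return any(label in GENERIC_LABELS for label in candidates)

  # looks_generic("pool-1.dhcp.vt.edu")              -> True
  # looks_generic("1.2.3.4.dhcp.vt.edu.example.com") -> False, as noted above
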
> Ah, I see what you're getting at. Well, I started maintaining my long
> list of patterns because of the insane complexity of trying to construct
> simple rules like the above. At one point, I had five or six of them,
> but it got easier to just run the vetted "generic" hostnames through a
> quick perl script to generate a regex for each, and then check them all.
> Surprisingly, on a reasonably fast system with a moderate mail load it
> runs through the entire set pretty quickly, and it doesn't take up as
> much RAM as I'd expected it would. I could probably get better stats
> if you're interested.
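
The per-hostname generation step could look something like this rough
Python stand-in for that perl script; the digit-run generalization is a
guess at the approach, not the actual tokenization:

  # Rough Python stand-in for the "quick perl script": generalize the
  # digit runs in one vetted generic hostname to get a reusable pattern.
  import re

  def hostname_to_pattern(hostname):
      # 'adsl-1-2-3-4.example.net' ->
      # '^adsl\-[0-9]+\-[0-9]+\-[0-9]+\-[0-9]+\.example\.net$'
      return "^" + re.sub(r"[0-9]+", "[0-9]+", re.escape(hostname)) + "$"

  # The result matches any host numbered the same way, e.g.
  # re.match(hostname_to_pattern("adsl-1-2-3-4.example.net"),
  #          "adsl-5-67-8-90.example.net") succeeds.

Cutting such a pattern at its first backslash gives a left-hand part like
'^adsl', which is what the tally below counts.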

> Quick example, though: of 6936 patterns currently in my list, if you
> just run a cut on \\ (which catches either '.' or '-' as the next char,
> for the most part) you get (matches of 20 or more):

> count first left-hand pattern part
> ----- ----------------------------
>  1572 ^[0-9]+
>   206 ^.+
>   200 ^host[0-9]+
>   179 ^host
>   145 ^adsl
>   140 ^ip
>   121 ^ip[0-9]+
>   121 ^.*[0-9]+
>    89 ^dsl
>    83 ^ppp[0-9]+
>    74 ^pc[0-9]+
>    64 ^ppp
>    54 ^h[0-9]+
>    52 ^dialup
>    48 ^dhcp
>    46 ^d[0-9]+
>    45 ^dial
>    43 ^dhcp[0-9]+
>    42 ^dsl[0-9]+
>    40 ^user[0-9]+
>    40 ^[a-z]+[0-9]+
>    40 ^[0-f]+
>    37 ^.+[0-9]+
>    36 ^p[0-9]+
>    36 ^[a-z]+
>    36 ^.*
>    32 ^c[0-9]+
>    32 ^adsl[0-9]+
>    28 ^m[0-9]+
>    28 ^cable
>    25 ^dyn
>    23 ^dial[0-9]+
>    23 ^cable[0-9]+
>    23 ^a[0-9]+
>    22 ^user
>    22 ^s[0-9]+
>    22 ^[a-z][0-9]+
>    21 ^mail[0-9]+
>    20 ^u[0-9]+
>    20 ^pc
>    20 ^client
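
That tally is easy to reproduce; a rough sketch, assuming one regex per
line in a file called 'patterns' (the filename is made up):

  # Reproduce the tally above: cut each pattern at its first backslash
  # and count the left-hand parts; roughly what
  #   cut -d'\' -f1 patterns | sort | uniq -c | sort -rn
  # would do.
  from collections import Counter

  with open("patterns") as f:
      prefixes = Counter(line.split("\\", 1)[0].strip()
                         for line in f if line.strip())

  for prefix, count in prefixes.most_common():
      if count >= 20:   # matches of 20 or more, as in the table above
          print(count, prefix)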

> It's really not as simple as just blocking .*(dsl|cable|dialup).*; the
> zombie botnets are sophisticated and they're /everywhere/. So you can't
> just block the top 25% most likely sources, as the spammers just rotate
> through until they find another one you aren't testing for.
>
> Throw in minor variations within a given ISP, language differences
> worldwide in naming conventions, and peculiarities in how sendmail's
> regex support works ('.' isn't picked up by '.+'), and you've got a need
> for at least a few thousand patterns even if you strip off the domain
> part and try to match on the host part alone.
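
For completeness, a minimal sketch of the runtime side of such a check;
the file layout and names are assumptions (the thread suggests the real
thing runs via sendmail's regex map support):

  # Minimal sketch of the runtime check: run a connecting client's rDNS
  # name through the whole vetted pattern set. File layout and names are
  # assumptions; the real check apparently runs inside sendmail.
  import re

  with open("patterns") as f:
      PATTERNS = [re.compile(line.strip()) for line in f if line.strip()]

  def is_generic_rdns(hostname):
      return any(p.match(hostname) for p in PATTERNS)

  # A match might then earn the reply from the top of the thread:
  #   550 go away, dynamic user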