North American Network Operators Group


Re: DNS problems to RoadRunner - tcp vs udp

  • From: Scott C. McGrath
  • Date: Mon Jun 16 12:52:19 2008

All,

Thanks for the helpful suggestions.

For what it's worth, we use Cisco's CNR because we operate a MAC registration system which controls access to our network. We allow customers to select hostnames, which are pushed into DDNS when the system acquires a lease. CNR has internal, user-configurable limits which control the TCP state machine, and these are easy to overwhelm: once you hit the high limit, the server process stops accepting new connection requests for any reason until the number of connections drops below the max limit again. We have been in constant contact with the development group about defending these machines from DDoS activity.
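
To illustrate why a hard connection cap is easy to overwhelm, here is a minimal sketch of the behavior described above. It is a toy model, not CNR's actual implementation: the class name `ConnectionGate`, the `max_conns` parameter, and the connection IDs are all hypothetical.

```python
class ConnectionGate:
    """Toy model of a server with a hard cap on concurrent TCP connections.

    Hypothetical illustration only; it does not reflect CNR internals.
    """

    def __init__(self, max_conns: int) -> None:
        self.max_conns = max_conns
        self.active = set()
        self.refused = 0

    def try_accept(self, conn_id: str) -> bool:
        # At the cap, every new request is refused, whoever sent it.
        if len(self.active) >= self.max_conns:
            self.refused += 1
            return False
        self.active.add(conn_id)
        return True

    def close(self, conn_id: str) -> None:
        self.active.discard(conn_id)


def simulate() -> tuple:
    gate = ConnectionGate(max_conns=3)
    # Bots open connections and simply hold them.
    for i in range(3):
        gate.try_accept(f"bot-{i}")
    # A legitimate client is locked out while the table is full.
    legit_during = gate.try_accept("legit-client")
    # Only once a slot frees up does service resume.
    gate.close("bot-0")
    legit_after = gate.try_accept("legit-client")
    return legit_during, legit_after, gate.refused
```

In this model an attacker only needs to hold `max_conns` connections open to deny service to everyone else, which is why the cap alone is not a defense.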


Because of our network structure, UDP is somewhat easier to rate limit than TCP, and we do operate microflow policers to limit UDP activity from any given host.
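
For readers unfamiliar with the idea, per-host policing can be sketched as one token bucket per source address. This is an illustrative model only, not the router configuration we run: the class names, `rate_pps`, `burst`, and the addresses used are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    tokens: float
    last_seen: float


class MicroflowPolicer:
    """Per-source token bucket (hypothetical sketch of per-host policing)."""

    def __init__(self, rate_pps: float, burst: float) -> None:
        self.rate_pps = rate_pps  # sustained packets per second per host
        self.burst = burst        # maximum bucket depth per host
        self.buckets = {}

    def allow(self, src_ip: str, now: float) -> bool:
        bucket = self.buckets.get(src_ip)
        if bucket is None:
            # A new host starts with a full bucket.
            bucket = Bucket(tokens=self.burst, last_seen=now)
            self.buckets[src_ip] = bucket
        # Refill in proportion to elapsed time, capped at the burst size.
        elapsed = now - bucket.last_seen
        bucket.tokens = min(self.burst, bucket.tokens + elapsed * self.rate_pps)
        bucket.last_seen = now
        if bucket.tokens >= 1.0:
            bucket.tokens -= 1.0
            return True
        return False  # out of tokens: drop the packet
```

Each host is policed independently, so one noisy source exhausts only its own bucket and does not affect queries from anyone else.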

We once used BIND, but BIND could not handle the DDNS updates in a reasonable fashion, as we have many short-lived connections when students access the wireless network between classes. Hence the move to CNR, which handles DDNS effectively but does not like TCP-based attacks.

Unlike MIT over the river, Harvard only has 2 Class B's available, and we have many more registered clients than we have IP space for, plus a community which requires fixed hostnames for academic reasons. Since we cannot make static IP assignments except to well-known and fixed services, this becomes problematic. Hence DDNS, which as many have pointed out here is painful from an operational standpoint, but in our environment it is a lifesaver.


Unfortunately, we have needed to insert some controlled breakage into the network to keep the services our customers require alive, as TCP SYN attacks are still effective in this day and age. We have tried many things. Our latest foray into TCP control is creating a Snort infrastructure sufficient to monitor all flows ingressing and egressing our network and, based on analysis of that data, applying rules to limit traffic in real time from ill-behaved TCP hosts. Our long-term goal is not to operate a corporate network locked into stupid mode with no understanding of protocol needs.
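
As a rough illustration of the kind of flow analysis meant here, the sketch below flags sources that send many SYNs but rarely complete a handshake. It is not Snort rule syntax and not our production logic: the event tuples, the function name, and both thresholds are hypothetical.

```python
from collections import Counter


def flag_syn_offenders(events, min_syns=100, max_completion_ratio=0.1):
    """Return source IPs with many SYNs and few completed handshakes.

    `events` is an iterable of (src_ip, kind) tuples, where kind is
    "syn" or "established". Format and thresholds are assumptions
    chosen for this example.
    """
    syns = Counter()
    completed = Counter()
    for src_ip, kind in events:
        if kind == "syn":
            syns[src_ip] += 1
        elif kind == "established":
            completed[src_ip] += 1

    offenders = []
    for src_ip, syn_count in syns.items():
        if syn_count < min_syns:
            continue  # too little traffic to judge
        if completed[src_ip] / syn_count <= max_completion_ratio:
            offenders.append(src_ip)
    return sorted(offenders)
```

A source that completes most of its handshakes passes regardless of volume, so busy but well-behaved resolvers are not penalized; only the flagged addresses would then be fed to a rate limit or block rule.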


- Scott

Nathan Ward wrote:
On 15/06/2008, at 9:18 AM, Scott McGrath wrote:

Yes - we are blocking TCP. There were too many problems with drone armies. We started about a year ago, when our DNS servers became unresponsive for no apparent reason. Investigation showed TCP flows of hundreds of megabits/sec and connection table overflows from tens of thousands of bots all trying to do zone transfers simultaneously and failing. We tried active denial systems and shunning, with limited effectiveness.

We are well aware of the host-based mechanisms to control zone information. The trouble with TCP is that if you can open the connection you can DoS, so we don't allow the connection to be opened, and this is enforced at the network level where we can drop at wire speed. We are open to better ideas, but if you look at the domain in my email address you will see we are a target for hostile activity just so someone can 'make their bones'.


There are really two problems here. One is packet/bit rate causing problems for your network, which is not necessarily an end system thing. That is not really DNS specific, and blocking 53/TCP doesn't really help, as people could just send 53/UDP your way and get the same effect.

Connection table overflow is a bit of a different issue. The obvious way to overcome it is to whack a load balancer in there to share the load around. It's not immediately obvious to me why your connection table would be filling up - what state were connections stuck in?

Anyway, one thought that comes to me would be to split off UDP and TCP services to different servers - if some TCP attack kills your TCP DNS server you:
a) don't have to worry about UDP services failing.
b) can turn it off for the duration of the attack, and are no worse off than you are right now, then turn it back on when you see the high volume of SYN messages disappear.
c) as TCP DNS service recovery isn't super time critical (I'm assuming this, because you're not running it at all right now) you have time to look at the anatomy of the attack and figure out how to filter it more precisely if possible, instead of simply dropping all TCP.


Obviously, you'd want to make sure TCP from your other name servers always goes to the UDP one, etc. etc.

--
Nathan Ward