North American Network Operators Group

Re: anycast (Re: .ORG problems this evening)

  • From: David G. Andersen
  • Date: Mon Sep 22 18:35:57 2003

On Thu, Sep 18, 2003 at 02:38:18PM -0400, Todd Vierling quacked:
> 
> On Thu, 18 Sep 2003, E.B. Dreger wrote:
> 
> : EBD> That's why one uses a daemon with main loop including
> : EBD> something like:
> : EBD>
> : EBD> 	success = 1 ;
> : EBD> 	for ( i = checklist ; i->callback != NULL ; i++ )
> : EBD> 		success &= i->callback(foo) ;
> : EBD> 	if ( success )
> : EBD> 		send_keepalive(via_some_ipc_mechanism) ;
> 
> Yes, I hope that UltraDNS implements something like this, if they have not
> already.  It's still not a guarantee that things will get withdrawn -- or be
> reachable, even if working but not withdrawn -- in case of a problem.  That
> still leaves the DNS for a gTLD at risk for a single point of failure.
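
For anyone who wants to play with it, here is a minimal C sketch of
the loop quoted above.  The struct layout, the check signature, the
stand-in checks and the counter standing in for send_keepalive() are
all my assumptions for illustration, not anything UltraDNS or EBD
actually runs.  One detail worth noting: with a bitwise &=, a check
that returns 2 for "ok" would clear the flag, so the sketch
normalises each result to 0 or 1 first.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical check signature: returns nonzero when the check passes. */
typedef int (*check_fn)(void *ctx);

struct check {
    const char *name;
    check_fn callback;      /* NULL callback terminates the list */
};

/* Run every check in the list; return 1 only if all of them passed. */
static int run_checks(const struct check *checklist, void *ctx)
{
    int success = 1;

    assert(checklist != NULL);
    for (const struct check *i = checklist; i->callback != NULL; i++) {
        /* !! maps any nonzero result to 1 so &= acts as a logical AND. */
        success &= !!i->callback(ctx);
    }
    return success;
}

/* Stand-in checks (assumptions): one always passes, one reads a flag. */
static int check_always_ok(void *ctx) { (void)ctx; return 1; }
static int check_flag(void *ctx)      { return *(int *)ctx; }

/* Stand-in for send_keepalive(): just counts how often it was called. */
static int keepalives_sent = 0;
static void send_keepalive(void) { keepalives_sent++; }

/* One pass of the daemon's main loop: keepalive only if all checks pass. */
static void main_loop_iteration(const struct check *checklist, void *ctx)
{
    if (run_checks(checklist, ctx))
        send_keepalive();
}
```

Whatever watches for the keepalive would then withdraw the route
when the keepalives stop, which is exactly the extra moving part
discussed below.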

The whole problem with only listing two anycast servers is that 
you leave yourself vulnerable to other kinds of faults.  Your
upstream ISP fat-fingers "ip route 64.94.110.11 null0" and
accidentally blitzes the netblock from which the anycast servers
are announced.  A router somewhere between customers and the
anycast servers stops forwarding traffic, or starts corrupting
transit data, without interrupting its route processing.
Packet filters get misconfigured.

(Observe how divorced route processing and packet processing
are in modern routing architectures and it's pretty easy to
see how this can happen.  With load balancing, traffic
can get routed down a non-functional path while routing
takes place over the other one.  BBN did that to us once; it
was very entertaining.)

Route updates in BGP take a while to propagate.  Much longer
than the 15ms RTT from me to, say, a.root-servers.net.  The application
retry in this context can be massively faster than waiting 30+ seconds
for a BGP update interval.
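
To put rough numbers on that, here is a small C sketch of the
arithmetic.  It assumes a client that waits a fixed timeout on each
dead server before moving to the next one; the 1000 ms timeout in the
comment and the function name are assumptions for illustration, while
the 15 ms RTT and the 30 second figure are the ones above.

```c
#include <assert.h>

/*
 * Worst-case delay in milliseconds before a client gets an answer,
 * assuming it times out on `dead_servers` unresponsive servers in a
 * row and then reaches a working one in a single round trip.
 */
static long retry_failover_ms(int dead_servers, long timeout_ms, long rtt_ms)
{
    assert(dead_servers >= 0 && timeout_ms >= 0 && rtt_ms >= 0);
    return (long)dead_servers * timeout_ms + rtt_ms;
}

/*
 * Example under the stated assumptions:
 *   retry_failover_ms(1, 1000, 15) is 1015 ms,
 * against 30000 ms or more spent waiting on a BGP update.
 */
```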

The availability of the DNS is now co-mingled with the success
of the magic route tweak code;  the resulting system is a fair
bit more complex than simply running a bunch of different
DNS servers.   God forbid that zebra ever has bugs...

  http://www.geocrawler.com/lists/3/GNU/372/0/

In contrast, talking to a few DNS servers gives you an end-to-end
test of how well the service is working.  You still depend on the
answers being correct, but you can intuit a lot from whether
or not you actually get answers, instead of sitting around twiddling
your thumbs thinking, "gee, I sure wish that routing update would
get sent out so I could use the 'net."
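
A sketch of what I mean by the end-to-end test, in C.  The probe
callback is hypothetical: assume it sends one query to a server and
reports whether an answer came back within the timeout.  The fake
probe and the 192.0.2.x documentation addresses are stand-ins so the
example runs without touching the network; a real resolver does
considerably more than this.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical probe: send one query to `server`, wait up to
 * timeout_ms, return 1 if an answer arrived and 0 otherwise.
 */
typedef int (*probe_fn)(const char *server, int timeout_ms);

/*
 * Try each listed server in order.  Return the index of the first
 * one that answers, or -1 if none of them did.
 */
static int first_responding(const char *const servers[], size_t n,
                            probe_fn probe, int timeout_ms)
{
    assert(probe != NULL);
    for (size_t i = 0; i < n; i++) {
        if (probe(servers[i], timeout_ms))
            return (int)i;
    }
    return -1;
}

/* Stand-in probe (assumption): only 192.0.2.2 ever "answers". */
static int fake_probe(const char *server, int timeout_ms)
{
    (void)timeout_ms;
    return strcmp(server, "192.0.2.2") == 0;
}
```

With three servers listed and only the second one alive, the client
is back in business after one timeout, and it knows which servers
failed because it asked them directly.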

  -Dave

-- 
work: [email protected]                          me:  [email protected]
      MIT Laboratory for Computer Science           http://www.angio.net/
      I do not accept unsolicited commercial email.  Do not spam me.