Re: ultradns reachability

North American Network Operators Group

From: Joe Abley
Date: Fri Jul 02 10:26:21 2004

On 2 Jul 2004, at 00:18, Christopher L. Morrow wrote:

So, I thought of it like this:
1) Rodney/Centergate/UltraDNS knows where all their 35000billion copies of
the 2 .org TLD boxes are, what network pieces they are connected to at
which bandwidths and the current utilization
2) Rodney/Centergate/UltraDNS knows which boxes in each location (there
could be multiple inside each pod, right?) are running their dns process
and answering at which rates
3) Rodney/Centergate/UltraDNS knows when processes die and locally stop
pushing requests to said system inside the pod
4) Rodney/Centergate/UltraDNS knows when a pod is completely down (no
systems responding inside the local pod) so they can stop routing the /24
from that pod's location
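
Points 3 and 4 above can be sketched roughly as follows. This is only an illustration of the decision logic described in the list, not UltraDNS's actual tooling; the Server type, the "responding" flag and the function names are all hypothetical, and how a real pod probes its servers or withdraws its /24 is not stated in this thread.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Server:
    name: str          # hypothetical label for a box inside one pod
    responding: bool   # assumed result of a local DNS health probe


def healthy_servers(pod: List[Server]) -> List[Server]:
    """Point 3: only servers whose DNS process answers stay in rotation."""
    return [s for s in pod if s.responding]


def should_announce_prefix(pod: List[Server]) -> bool:
    """Point 4: keep announcing the anycast /24 only while at least one
    server in the pod is answering; otherwise the route should go away."""
    return len(healthy_servers(pod)) > 0


# Example pod with one dead DNS process: traffic shifts, route stays up.
pod = [Server("ns-a", True), Server("ns-b", False)]
in_rotation = [s.name for s in healthy_servers(pod)]
announce = should_announce_prefix(pod)
```

Note that this only covers what the pod can see about itself; it says nothing about failures outside the pod, such as stale routing elsewhere on the path.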

So, Rodney/Centergate/UltraDNS should know almost exactly when they have a
problem they can term 'critical'... I most probably left out some steps
above, like wedged processes or loss of outbound routing to prefixes
sending requests. I'm sure Paul/ISC has a fairly complete list of failure
modes for anycast DNS services.

All the failure modes that ISC has seen with anycast nameserver instances can be avoided (for the authoritative DNS service as a whole) by including one or more non-anycast nameservers in the NS set.

This leaves the anycast servers providing all the optimisation that they are good for (local nameserver in topologically distant networks; distributed DDoS traffic sink; reduced transaction RTT) and provides a fall-back in case of effective reachability problems for the anycast nameservers.
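
To make the fall-back concrete, here is a minimal sketch of a resolver walking an NS set from one vantage point. The nameserver names are made up for illustration, and the in-order walk is a simplification: real resolvers typically pick servers by measured RTT and retry on timeout, but the outcome for an unreachable anycast instance is the same.

```python
from typing import Dict, List, Optional


def first_reachable(ns_set: List[str], reachable: Dict[str, bool]) -> Optional[str]:
    """Return the first nameserver in the NS set that answers from this
    vantage point, or None if the zone is effectively dead from here."""
    for ns in ns_set:
        if reachable.get(ns, False):
            return ns
    return None


# Hypothetical NS sets: two anycast names, plus one unicast name in the mixed set.
anycast_only = ["anycast1.example", "anycast2.example"]
mixed = anycast_only + ["unicast1.example"]

# Assumed view from a network whose path to the nearest anycast pod is broken.
view = {
    "anycast1.example": False,
    "anycast2.example": False,
    "unicast1.example": True,
}

result_anycast_only = first_reachable(anycast_only, view)  # zone looks dead
result_mixed = first_reachable(mixed, view)                # zone still resolves
```

With the anycast-only set, this vantage point gets no answer at all; with the mixed set, it falls through to the unicast server and the zone keeps working.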

This is so trivial, I continue to be amazed that PIR hasn't done it.

The problem then becomes the "Hey, .org is dead!" From where is it dead?
What pod are you seeing it dead from? Is it routing TO the pod from you?
FROM the pod to you? The pod itself? Stuck/stale routing information
somewhere on the path(s)? This is very complex, or seems to be to me :(

With the fix above, the problem becomes "hey, *some* of the nameservers for ORG are dead! We should fix that, but since not *all* of them are dead, at least ORG still works."

I think more failure modes will be investigated before that comes :)
fortunately lots of people are already investigating these, eh?

I don't know about lots, but I know of a few. None of the people I know of are using an entire production TLD as their test-bed, however.

Joe

Follow-Ups:
- Re: ultradns reachability Stephen J. Wilcox
- Re: ultradns reachability Dr. Jeffrey Race
- Re: ultradns reachability Leo Bicknell

References:
- ultradns reachability Matt Ghali
- Re: ultradns reachability Eric Frazier
- Re: ultradns reachability James Edwards
- Re: ultradns reachability Christopher L. Morrow
- Re: ultradns reachability k claffy
- Re: ultradns reachability Christopher L. Morrow
