North American Network Operators Group


2006.06.07 NANOG-NOTES Anycast benefits for k root server

  • From: Matthew Petach
  • Date: Fri Jun 09 19:59:02 2006

Break ends at 11:40, PGP signing will take place,
and don't forget to fill out surveys.

ANYCAST fun for the final sessions.

Lorenzo Colitti, RIPE NCC
[slides are at:

Benefit of individual nodes
Routing issues

Why anycast?
root server anycast widely deployed
c, f, i, j, k, m at least
reasons for anycasting
provide resiliency: eg contain DOS attacks
spread server and network load
increase performance

but is it effective?

measure latency
ideally, for every given client, BGP should choose the node
with lowest RTT.  does it?
from every client, measure RTTs to
anycast IP address
service interfaces of global nodes (not anycasted)
for every client, compare K RTT to RTT of closest global
a = RTTk/min(RTTi)
if 1, BGP is picking right node
if > 1, BGP picks the wrong node
if <1, seeing local node.
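
A minimal Python sketch of the ratio above, assuming the RTTs have already
been measured in milliseconds. The function names, the 10% tolerance around 1,
and the LINX RTT are illustrative assumptions, not from the talk; the
416 ms vs 2 ms pair is the TT103 case reported later in these notes.

```python
def selection_ratio(rtt_anycast_ms, rtt_global_ms):
    """Return a = RTT_K / min(RTT_i) for one client."""
    return rtt_anycast_ms / min(rtt_global_ms.values())


def classify(a, tolerance=0.1):
    """Bucket a client by its ratio; the tolerance is an assumed noise margin."""
    if a > 1 + tolerance:
        return "BGP picks the wrong node"
    if a < 1 - tolerance:
        return "seeing local node"
    return "BGP is picking right node"


# TT103: 416 ms via the anycast address vs 2 ms to Tokyo directly.
# The LINX figure is hypothetical.
a = selection_ratio(416.0, {"tokyo": 2.0, "linx": 250.0})
print(a, classify(a))
```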

Latency with TTM: methodology
DNS queries from ~100 TTM test boxes
dig hostname.bind
see which host answers
extract RTT
take min of 5 queries
check paths to service interfaces;
is it same as prod IP
according to RIS, mostly 'yes'

TTM probe locations, mostly in europe

Latency with TTM: results (5 nodes)
most values are close to one; generally BGP doing pretty
good job.

from 2 nodes to 5 nodes
(2 nodes, April 2005)  (5 nodes, April 2006)
mostly same results, clustered around one, whether
2 or 5 nodes.

consistency of 'a' over time
average of that over time.

TT103 is outlier
calculated over time, threw out that one outlier.

results are pretty consistent.
average is little higher than one, mostly consistent
over time

measuring from servers
TTM latency measurements not optimal
locations biased towards europe
limited number of probes (~100)
don't reflect k client distribution

how to fix?

ping clients from servers
much larger dataset

process packet traces on k global nodes
extract list of client IP addresses
ping all addresses from all global nodes
plot distribution of 'a'
6 hours of data
246,769,005 queries
845,328 unique IP addresses

CDF of 'a' seen from servers
results not as good as seen by TTM
only 50% of clients have a = 1
about 10% are 4x slower/farther.
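
A small Python sketch of how points on such a CDF could be read off a list
of per-client 'a' values. The sample ratios are made up to mirror the
50% / 10% figures above; they are not the measured data, and the function
names are assumptions.

```python
def fraction_at_most(values, threshold):
    """Empirical CDF: share of clients whose ratio is <= threshold."""
    if not values:
        raise ValueError("no samples")
    return sum(1 for v in values if v <= threshold) / len(values)


def fraction_at_least(values, threshold):
    """Share of clients whose ratio is >= threshold (the slow tail)."""
    if not values:
        raise ValueError("no samples")
    return sum(1 for v in values if v >= threshold) / len(values)


# Hypothetical per-client ratios, one entry per client.
ratios = [0.6, 1.0, 1.0, 1.0, 1.0, 1.5, 2.0, 2.5, 3.0, 4.2]
print(fraction_at_most(ratios, 1.0))   # share at or below a = 1
print(fraction_at_least(ratios, 4.0))  # share that is 4x slower or worse
```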

probably due to TTM clustering in europe

latency conclusions
5 node result vs 2 node, comparable, at least
in TTM

non-TTM results not so rosy.

How many nodes are needed--is 5 enough?
evaluate existing instances
how to measure benefit of an instance?

Assume optimal instance selection
that is, every client sees closest instance
this is an upper bound on the benefit
useful to see if we've reached diminishing returns

for every client, see how much its performance would suffer if the
chosen node didn't exist.

B is loss factor, how much a client would suffer if an
instance were knocked out
B = RTTknockout/RTT...
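
A rough Python sketch of the loss factor under the optimal-selection
assumption. The denominator was not captured in these notes; the code
assumes it is the client's best RTT with all nodes present, and that
assumption (like the function name and the sample RTTs) is not from the talk.

```python
def loss_factor(rtts_ms, knocked_out):
    """B for one client: best RTT without the knocked-out nodes,
    divided by best RTT with every node present (assumed denominator)."""
    remaining = {n: r for n, r in rtts_ms.items() if n not in knocked_out}
    if not remaining:
        raise ValueError("cannot knock out every node")
    return min(remaining.values()) / min(rtts_ms.values())


# Hypothetical RTTs from one European client to three global nodes.
client = {"linx": 10.0, "amsix": 20.0, "tokyo": 250.0}
print(loss_factor(client, {"linx"}))           # next best is AMS-IX
print(loss_factor(client, {"tokyo"}))          # Tokyo was never its best node
print(loss_factor(client, {"linx", "amsix"}))  # both European nodes gone
```

With these made-up numbers, losing either European node alone costs little,
but losing both pushes the client to Tokyo, which is the same shape as the
LINX plus AMS-IX observation in the talk.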

Graph for LINX; 90% of clients wouldn't see an impact
if it went away; 10% would see a worsening.
geographic distribution pretty wide

about 20% would suffer performance degradation; busiest
two nodes, see a lot of clients, important to k

If they plot it for both LINX and AMSIX together,
about 65% wouldn't be affected, most of others would
see 4x, 10% would be 7x worse.
So taken together, the *two* nodes are important.

Tokyo; best node for few clients; but those served,
BADLY served by others;
about 10% who would go more than 7x if it went away,
those clients mostly Asia.
Miami node at NOTA,
moderate benefit for some clients, US and southAm
would be badly served by europe or Tokyo.

Delhi node is mostly ineffective, most would be
served better by other nodes.

Condense the graph into one number to get a
value for effectiveness of each node.
weighted average of B for each client.
if benefit value is 1, node doesn't provide any
benefit at all.
larger numbers show higher benefits.
Europe, when taken together, high benefit, as is
Tokyo; Miami node not so effective, and Delhi is
nearly ineffective.
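
A sketch of condensing per-client loss factors into one benefit value, in
Python. The notes do not say what the weights are; this assumes each client
is weighted by its query count, and that B is the best RTT after the
knockout divided by the best RTT with all nodes present. All names and
numbers are hypothetical.

```python
def loss_factor(rtts_ms, knocked_out):
    """Assumed B: best RTT without the knocked-out nodes / best RTT overall."""
    remaining = [r for n, r in rtts_ms.items() if n not in knocked_out]
    if not remaining:
        raise ValueError("cannot knock out every node")
    return min(remaining) / min(rtts_ms.values())


def benefit_value(clients, knocked_out):
    """Weighted average of B over clients; clients is a list of
    (rtts_ms, weight) pairs, with weight assumed to be a query count."""
    total_weight = sum(weight for _, weight in clients)
    weighted = sum(loss_factor(rtts, knocked_out) * weight
                   for rtts, weight in clients)
    return weighted / total_weight


# Two hypothetical clients: a busy one near LINX, a quiet one near Tokyo.
clients = [
    ({"linx": 10.0, "tokyo": 250.0}, 3),
    ({"linx": 200.0, "tokyo": 20.0}, 1),
]
print(benefit_value(clients, {"tokyo"}))
print(benefit_value(clients, {"linx"}))
```

A value of 1 means no client got slower when the node was removed, matching
the "doesn't provide any benefit" reading above.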

Does anycast provide any value then?
knock out all except LINX; dark red curve (pre 1997)
10% wouldn't notice, 85% would get worse,
benefit value is 18.8,
so anycast does bring value.

more nodes means more routes competing in BGP
doesn't matter for single packet UDP exchanges
does matter for TCP

Look at node switches that occur.
collect packet dumps on each node.
extract all 53/UDP traffic
k nodes only NTP synchronized
if IP shows up on two nodes, log a switch.

5 nodes, april 2006, 0.06% saw switches
2830 switchers out of 845,328, 0.33% switchers
no big issue with instance switchers.
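
A minimal Python sketch of the switch count, using the definition given in
the Q&A at the end of these notes: each change of node for a given source IP
is one switch, and going back to the first node is a second switch. It
assumes the per-node packet dumps have already been merged into
(timestamp, client IP, node) records, which relies on the nodes' clocks
being NTP synchronized. Record layout and function names are assumptions.

```python
from collections import defaultdict


def count_switches(records):
    """Return {client_ip: number of node switches} for clients that switched.
    records is an iterable of (timestamp, client_ip, node) tuples."""
    last_node = {}
    switches = defaultdict(int)
    for _, ip, node in sorted(records, key=lambda rec: rec[0]):
        previous = last_node.get(ip)
        if previous is not None and previous != node:
            switches[ip] += 1
        last_node[ip] = node
    return dict(switches)


def switcher_percentage(switchers, total_clients):
    """Percentage of clients that switched at least once."""
    return 100.0 * switchers / total_clients


# Hypothetical merged trace: client "a" goes LINX -> AMS-IX -> LINX.
records = [
    (1, "a", "linx"), (2, "a", "amsix"), (3, "a", "linx"),
    (1, "b", "tokyo"), (5, "b", "tokyo"),
]
print(count_switches(records))
print(round(switcher_percentage(2830, 845328), 2))  # figures from the talk
```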

Routing issues
k-root structure
5 global nodes (prepended)
 linx, amsix, tokyo, mia, del

different prepending values
no-export causing reachability problems

TT103 has value of 200, the graph axis is cut.
tt103 is in Yokohama; Tokyo is 2ms away; but
the query goes to Delhi through Tokyo to LA.
416ms vs 2, so value is 208.

Thanks to Matsuzaki and Randy Bush,
got BGP paths from AS2497
bad interaction of different prepending lengths
need to fix prepending on Tokyo node.
Delhi had shorter prepending.

no-export and leaks
local nodes can be worse than global nodes
tt89, seeing DENIC local node, 30ms instead
of going to London.  If no-export is ignored,
the local node's routes get announced onward; they
are more specific, and leak to customers.

no-export can lead to loss of reachability

problematic interaction of no-export with anycast
use no-export to prevent local nodes from leaking
if have an AS
 whose providers all peer with a local node
  and honor no-export
customer never sees route for k IP address.
solution: send out a less specific, covering prefix
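
A toy Python model of the reachability problem, not a BGP implementation.
It assumes each provider holds a single best route per prefix and drops
anything tagged no-export before advertising to its customer. The prefixes
are placeholders (192.0.2.0/24 is a documentation prefix standing in for the
real one), and all names are hypothetical.

```python
import ipaddress

NO_EXPORT = "no-export"


def exported_to_customer(provider_routes):
    """Prefixes a provider that honors no-export passes on to a customer.
    provider_routes is a list of (prefix, set_of_communities)."""
    return [ipaddress.ip_network(prefix)
            for prefix, communities in provider_routes
            if NO_EXPORT not in communities]


def customer_can_reach(address, providers):
    """True if any provider advertises a prefix covering the address."""
    addr = ipaddress.ip_address(address)
    return any(addr in net
               for routes in providers
               for net in exported_to_customer(routes))


# Both providers peer with a local node, so their only route is no-export.
local_only = [("192.0.2.0/24", {NO_EXPORT})]
print(customer_can_reach("192.0.2.129", [local_only, local_only]))

# Same providers, plus a less specific covering prefix without no-export.
with_covering = local_only + [("192.0.2.0/23", set())]
print(customer_can_reach("192.0.2.129", [with_covering, with_covering]))
```

In this model the customer follows the covering prefix to its provider,
which still holds the more specific route toward the local node.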

Q: Mark Kosters, Verisign--saw much higher switching rates;
can he define switching better?
A: if an IP is seen at one location, then shifts to a
different site, that's one switch; going back to the
first node would be a second switch.