North American Network Operators Group


[NANOG] Strange network behaviour

  • From: Douglas K. Rand
  • Date: Mon May 05 14:08:20 2008

We had a very strange problem today. Two of our hosts could not reach
a server, but only those two hosts. All of our other hosts could reach
that server fine. (OK, I didn't try ALL of our IPs, but the half
dozen I did try worked fine.)

I checked all of our firewalls and routers, and everywhere I looked
all of the traffic was exiting our network just fine. I saw on our
edge routers the traffic going out, just no traffic back to the two
hosts in question. (We had good bidirectional traffic to all of our
other hosts.) And the two hosts in question were only having problems
connecting to ftp.agnewsonline.com.

Let's start with a traceroute from a working host. The originating host
is 12.192.92.14:

[~]% traceroute -I ftp.agnewsonline.com
traceroute to agnewsonline.com (64.46.45.226), 64 hops max, 60 byte packets
 1  12.192.92.3 (12.192.92.3)  0.257 ms  0.171 ms  0.163 ms
 2  pluto-0 (12.192.93.13)  0.401 ms  0.296 ms  0.294 ms
 3  ixion-att (12.192.93.244)  1.260 ms  0.463 ms  1.116 ms
 4  12.87.125.249 (12.87.125.249)  14.838 ms  9.314 ms  9.755 ms
 5  tbr2.cgcil.ip.att.net (12.122.99.122)  24.528 ms  24.788 ms  23.009 ms
 6  ggr2.cgcil.ip.att.net (12.123.6.69)  22.362 ms  23.410 ms  22.335 ms
 7  192.205.33.186 (192.205.33.186)  23.448 ms  24.074 ms  29.405 ms
 8  ae-31-53.ebr1.Chicago1.Level3.net (4.68.101.94)  22.800 ms  32.598 ms  36.093 ms
 9  ae-68.ebr3.Chicago1.Level3.net (4.69.134.58)  23.446 ms  21.599 ms  34.060 ms
10  ae-3.ebr2.Denver1.Level3.net (4.69.132.61)  61.517 ms  57.482 ms  56.606 ms
11  ae-2.ebr2.Seattle1.Level3.net (4.69.132.53)  96.484 ms  114.264 ms  96.984 ms
12  ae-23-52.car3.Seattle1.Level3.net (4.68.105.36)  91.295 ms  88.700 ms  89.705 ms
13  BIG-PIPE-IN.car3.Seattle1.Level3.net (4.71.152.26)  90.053 ms  90.511 ms  92.072 ms
14  rc1wh-pos14-0.vc.shawcable.net (66.163.76.1)  90.062 ms  93.489 ms  90.757 ms
15  rc2wh-pos0-15-2-0.vc.shawcable.net (66.163.69.181)  96.527 ms  91.743 ms  97.254 ms
16  rd1ht-tge1-1-1.ok.shawcable.net (66.163.77.18)  101.412 ms  114.160 ms  100.530 ms
17  ra1ht-ge3-1.ok.shawcable.net (66.163.72.134)  105.651 ms  101.336 ms  101.628 ms
18  rx0ht-rack-force-2.ok.bigpipeinc.com (64.251.64.50)  111.960 ms  101.535 ms  116.136 ms
19  rf1.01.rackforce.net (69.10.128.198)  583.192 ms  491.170 ms  598.406 ms
20  64.46.45.226 (64.46.45.226)  110.207 ms  108.718 ms  107.279 ms

A traceroute from one of the hosts that doesn't work would reach
ae-3.ebr2.Denver1.Level3.net but go no further. I then tried pinging
the routers I couldn't reach. I could not ping:

  ae-3.ebr2.Denver1.Level3.net (4.69.132.61)
  ae-2.ebr2.Seattle1.Level3.net (4.69.132.53)
  ae-23-52.car3.Seattle1.Level3.net (4.68.105.36)
  BIG-PIPE-IN.car3.Seattle1.Level3.net (4.71.152.26)

but when I started pinging rc1wh-pos14-0.vc.shawcable.net (66.163.76.1)
not only did I start getting responses, but everything started working
to ftp.agnewsonline.com too, but just from that host. It really seemed
that pinging that router somehow fixed my problem.
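
The manual check above (ping each hop past the point where the failing
traceroute stopped, in path order, and note the first one that answers)
could be scripted roughly as follows. This is a minimal sketch, not what
was actually run: the names ping_once and first_answering_hop are
hypothetical, it assumes a system ping that accepts "-c 1", and the hop
list is copied from the working traceroute. A router that ignores ping is
not necessarily unreachable, so a "no answer" here is only a hint.

```python
import subprocess

# Hops beyond the point where the failing traceroute stopped, in path
# order, copied from the working traceroute in this post.
HOPS = [
    ("ae-3.ebr2.Denver1.Level3.net", "4.69.132.61"),
    ("ae-2.ebr2.Seattle1.Level3.net", "4.69.132.53"),
    ("ae-23-52.car3.Seattle1.Level3.net", "4.68.105.36"),
    ("BIG-PIPE-IN.car3.Seattle1.Level3.net", "4.71.152.26"),
    ("rc1wh-pos14-0.vc.shawcable.net", "66.163.76.1"),
]


def ping_once(address, timeout_s=3.0):
    """Send one echo request with the system ping; True if it was answered.

    Assumption: "ping -c 1 <address>" exits with status 0 on a reply.
    """
    try:
        result = subprocess.run(
            ["ping", "-c", "1", address],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def first_answering_hop(hops, probe=ping_once):
    """Return the first (name, address) pair that answers, or None."""
    for name, address in hops:
        if probe(address):
            return name, address
    return None


# Usage (performs real pings, so it is left commented out):
# print(first_answering_hop(HOPS))
```

The probe is passed in as a parameter so the walk over the hop list can
be exercised without touching the network.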

Well, I'm not sure I really believed that, but I still had another
host that couldn't reach ftp.agnewsonline.com, so on that host I
started a ping. My comments, in /* */, describe what I was doing in
another window:

[~]% ping ftp.agnewsonline.com
PING agnewsonline.com (64.46.45.226): 56 data bytes
/* At this point in another window I started another ping: */
/* ping 66.163.76.1 and immediately this ping started working ... */
64 bytes from 64.46.45.226: icmp_seq=18 ttl=108 time=104.617 ms
64 bytes from 64.46.45.226: icmp_seq=19 ttl=108 time=105.775 ms
64 bytes from 64.46.45.226: icmp_seq=20 ttl=108 time=101.569 ms
--- agnewsonline.com ping statistics ---
22 packets transmitted, 3 packets received, 86% packet loss
round-trip min/avg/max/stddev = 101.569/103.987/105.775/1.774 ms

It was like I threw a switch. The single outbound ICMP packet to
rc1wh-pos14-0.vc.shawcable.net (66.163.76.1) fixed everything for that
host. 
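
The numbers in that ping output can be checked mechanically. The sketch
below (helper names are hypothetical) pulls the answered sequence numbers
and the packet counts out of the text pasted above and recomputes the
loss figure. It assumes the summary-line format shown in this post, and
that sequence numbers start at 0, which the count of 22 transmitted with
a highest answered sequence of 20 is consistent with.

```python
import re

# The ping output from this post, with the /* */ comments removed.
PING_OUTPUT = """\
PING agnewsonline.com (64.46.45.226): 56 data bytes
64 bytes from 64.46.45.226: icmp_seq=18 ttl=108 time=104.617 ms
64 bytes from 64.46.45.226: icmp_seq=19 ttl=108 time=105.775 ms
64 bytes from 64.46.45.226: icmp_seq=20 ttl=108 time=101.569 ms
--- agnewsonline.com ping statistics ---
22 packets transmitted, 3 packets received, 86% packet loss
"""


def answered_sequences(text):
    """Return the icmp_seq values of every reply line, in order."""
    return [int(seq) for seq in re.findall(r"icmp_seq=(\d+)", text)]


def packet_counts(text):
    """Return (transmitted, received) from the summary line."""
    match = re.search(
        r"(\d+) packets transmitted, (\d+) (?:packets )?received", text
    )
    if match is None:
        raise ValueError("no ping summary line found")
    return int(match.group(1)), int(match.group(2))


def loss_percent(sent, received):
    """Packet loss as a percentage of packets sent."""
    return 100.0 * (sent - received) / sent


seqs = answered_sequences(PING_OUTPUT)      # [18, 19, 20]
sent, received = packet_counts(PING_OUTPUT)  # (22, 3)
loss = loss_percent(sent, received)          # about 86.4, shown as 86%
```

So sequence numbers 0 through 17 went unanswered and replies began at
icmp_seq=18. If ping was sending at the usual one packet per second (the
interval is not stated here), that is roughly 18 seconds of silence
before the second ping was started.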

I was wondering if anybody has any clue what might be going on. I've
never experienced a problem like this before.
