Re: Traceroute and random UDP ports

* Joe Abley · * Wed Aug 13 12:34:22 2008

On 13 Aug 2008, at 08:56, John Kristoff wrote:
For further information I
sugguest consulting Stevens TCP/IP Illustrated chapter 8, dated, but
still an indispensable resource.

... or the comments in Van's traceroute.c, which are pleasantly
educational.

Joe
/*

* traceroute host - trace the route ip packets follow going to
"host".

*

* Attempt to trace the route an ip packet would follow to some

* internet host. We find out intermediate hops by launching probe

* packets with a small ttl (time to live) then listening for an

* icmp "time exceeded" reply from a gateway. We start our probes

* with a ttl of one and increase by one until we get an icmp "port

* unreachable" (which means we got to "host") or hit a max (which

* defaults to net.inet.ip.ttl hops & can be changed with the -m flag).

* Three probes (change with -q flag) are sent at each ttl setting and

* a line is printed showing the ttl, address of the gateway and

* round trip time of each probe. If the probe answers come from

* different gateways, the address of each responding system will

* be printed. If there is no response within a 5 sec. timeout

* interval (changed with the -w flag), a "*" is printed for that

* probe.

*

* Probe packets are UDP format. We don't want the destination

* host to process them so the destination port is set to an

* unlikely value (if some clod on the destination is using that

* value, it can be changed with the -p flag).

*

* A sample use might be:

*

* [yak 71]% traceroute nis.nsf.net.

* traceroute to nis.nsf.net (35.1.1.48), 64 hops max, 56 byte
packet

* 1 helios.ee.lbl.gov (128.3.112.1) 19 ms 19 ms 0 ms

* 2 lilac-dmc.Berkeley.EDU (128.32.216.1) 39 ms 39 ms 19 ms

* 3 lilac-dmc.Berkeley.EDU (128.32.216.1) 39 ms 39 ms 19 ms

* 4 ccngw-ner-cc.Berkeley.EDU (128.32.136.23) 39 ms 40 ms
39 ms

* 5 ccn-nerif22.Berkeley.EDU (128.32.168.22) 39 ms 39 ms 39
ms

* 6 128.32.197.4 (128.32.197.4) 40 ms 59 ms 59 ms

* 7 131.119.2.5 (131.119.2.5) 59 ms 59 ms 59 ms

* 8 129.140.70.13 (129.140.70.13) 99 ms 99 ms 80 ms

* 9 129.140.71.6 (129.140.71.6) 139 ms 239 ms 319 ms

* 10 129.140.81.7 (129.140.81.7) 220 ms 199 ms 199 ms

* 11 nic.merit.edu (35.1.1.48) 239 ms 239 ms 239 ms

*

* Note that lines 2 & 3 are the same. This is due to a buggy

* kernel on the 2nd hop system -- lbl-csam.arpa -- that forwards

* packets with a zero ttl.

*

* A more interesting example is:

*

* [yak 72]% traceroute allspice.lcs.mit.edu.

* traceroute to allspice.lcs.mit.edu (18.26.0.115), 64 hops max

* 1 helios.ee.lbl.gov (128.3.112.1) 0 ms 0 ms 0 ms

* 2 lilac-dmc.Berkeley.EDU (128.32.216.1) 19 ms 19 ms 19 ms

* 3 lilac-dmc.Berkeley.EDU (128.32.216.1) 39 ms 19 ms 19 ms

* 4 ccngw-ner-cc.Berkeley.EDU (128.32.136.23) 19 ms 39 ms
39 ms

* 5 ccn-nerif22.Berkeley.EDU (128.32.168.22) 20 ms 39 ms 39
ms

* 6 128.32.197.4 (128.32.197.4) 59 ms 119 ms 39 ms

* 7 131.119.2.5 (131.119.2.5) 59 ms 59 ms 39 ms

* 8 129.140.70.13 (129.140.70.13) 80 ms 79 ms 99 ms

* 9 129.140.71.6 (129.140.71.6) 139 ms 139 ms 159 ms

* 10 129.140.81.7 (129.140.81.7) 199 ms 180 ms 300 ms

* 11 129.140.72.17 (129.140.72.17) 300 ms 239 ms 239 ms

* 12 * * *

* 13 128.121.54.72 (128.121.54.72) 259 ms 499 ms 279 ms

* 14 * * *

* 15 * * *

* 16 * * *

* 17 * * *

* 18 ALLSPICE.LCS.MIT.EDU (18.26.0.115) 339 ms 279 ms 279 ms

*

* (I start to see why I'm having so much trouble with mail to

* MIT.) Note that the gateways 12, 14, 15, 16 & 17 hops away

* either don't send ICMP "time exceeded" messages or send them

* with a ttl too small to reach us. 14 - 17 are running the

* MIT C Gateway code that doesn't send "time exceeded"s. God

* only knows what's going on with 12.

*

* The silent gateway 12 in the above may be the result of a bug in

* the 4.[23]BSD network code (and its derivatives): 4.x (x <= 3)

* sends an unreachable message using whatever ttl remains in the

* original datagram. Since, for gateways, the remaining ttl is

* zero, the icmp "time exceeded" is guaranteed to not make it back

* to us. The behavior of this bug is slightly more interesting

* when it appears on the destination system:

*

* 1 helios.ee.lbl.gov (128.3.112.1) 0 ms 0 ms 0 ms

* 2 lilac-dmc.Berkeley.EDU (128.32.216.1) 39 ms 19 ms 39 ms

* 3 lilac-dmc.Berkeley.EDU (128.32.216.1) 19 ms 39 ms 19 ms

* 4 ccngw-ner-cc.Berkeley.EDU (128.32.136.23) 39 ms 40 ms
19 ms

* 5 ccn-nerif35.Berkeley.EDU (128.32.168.35) 39 ms 39 ms 39
ms

* 6 csgw.Berkeley.EDU (128.32.133.254) 39 ms 59 ms 39 ms

* 7 * * *

* 8 * * *

* 9 * * *

* 10 * * *

* 11 * * *

* 12 * * *

* 13 rip.Berkeley.EDU (128.32.131.22) 59 ms ! 39 ms ! 39 ms !

*

* Notice that there are 12 "gateways" (13 is the final

* destination) and exactly the last half of them are "missing".

* What's really happening is that rip (a Sun-3 running Sun OS3.5)

* is using the ttl from our arriving datagram as the ttl in its

* icmp reply. So, the reply will time out on the return path

* (with no notice sent to anyone since icmp's aren't sent for

* icmp's) until we probe with a ttl that's at least twice the path

* length. I.e., rip is really only 7 hops away. A reply that

* returns with a ttl of 1 is a clue this problem exists.

* Traceroute prints a "!" after the time if the ttl is <= 1.

* Since vendors ship a lot of obsolete (DEC's Ultrix, Sun 3.x) or

* non-standard (HPUX) software, expect to see this problem

* frequently and/or take care picking the target host of your

* probes.

*

* Other possible annotations after the time are !H, !N, !P (got a
host,

* network or protocol unreachable, respectively), !S or !F (source

* route failed or fragmentation needed -- neither of these should

* ever occur and the associated gateway is busted if you see one). If

* almost all the probes result in some kind of unreachable, traceroute

* will give up and exit.

*

* Notes

* -----

* This program must be run by root or be setuid. (I suggest that

* you *don't* make it setuid -- casual use could result in a lot

* of unnecessary traffic on our poor, congested nets.)

*

* This program requires a kernel mod that does not appear in any

* system available from Berkeley: A raw ip socket using proto

* IPPROTO_RAW must interpret the data sent as an ip datagram (as

* opposed to data to be wrapped in a ip datagram). See the README

* file that came with the source to this program for a description

* of the mods I made to /sys/netinet/raw_ip.c. Your mileage may

* vary. But, again, ANY 4.x (x < 4) BSD KERNEL WILL HAVE TO BE

* MODIFIED TO RUN THIS PROGRAM.

*

* The udp port usage may appear bizarre (well, ok, it is bizarre).

* The problem is that an icmp message only contains 8 bytes of

* data from the original datagram. 8 bytes is the size of a udp

* header so, if we want to associate replies with the original

* datagram, the necessary information must be encoded into the

* udp header (the ip id could be used but there's no way to

* interlock with the kernel's assignment of ip id's and, anyway,

* it would have taken a lot more kernel hacking to allow this

* code to set the ip id). So, to allow two or more users to

* use traceroute simultaneously, we use this task's pid as the

* source port (the high bit is set to move the port number out

* of the "likely" range). To keep track of which probe is being

* replied to (so times and/or hop counts don't get confused by a

* reply that was delayed in transit), we increment the destination

* port number before each probe.

*

* Don't use this as a coding example. I was trying to find a

* routing problem and this code sort-of popped out after 48 hours

* without sleep. I was amazed it ever compiled, much less ran.

*

* I stole the idea for this program from Steve Deering. Since

* the first release, I've learned that had I attended the right

* IETF working group meetings, I also could have stolen it from Guy

* Almes or Matt Mathis. I don't know (or care) who came up with

* the idea first. I envy the originators' perspicacity and I'm

* glad they didn't keep the idea a secret.

*

* Tim Seaver, Ken Adelman and C. Philip Wood provided bug fixes and/or

* enhancements to the original distribution.

*

* I've hacked up a round-trip-route version of this that works by

* sending a loose-source-routed udp datagram through the destination

* back to yourself. Unfortunately, SO many gateways botch source

* routing, the thing is almost worthless. Maybe one day...

*

* -- Van Jacobson ([email protected])

* Tue Dec 20 03:50:13 PST 1988

*/