North American Network Operators Group Date Prev | Date Next | Date Index | Thread Index | Author Index | Historical Re: Thoughts on increasing MTUs on the internet
On Apr 14, 2007, at 1:10 PM, Iljitsch van Beijnum wrote: On 14-apr-2007, at 19:22, Douglas Otis wrote:
For 1500 byte packets the fraction of packets with three bits flipped would be around 1 : 10^15, correcting for the larger number of packets per given amount of data, that's a difference of about 1 : 100. Quoting from "When The CRC and TCP Checksum Disagree" by Jonathan Stone and Craig Partridge: http://citeseer.ist.psu.edu/cache/papers/cs/21401/ http:zSzzSzsigcomm.it.uu.sezSzconfzSzpaperzSzsigcomm2000-9-1.pdf/ stone00when.pdf "Traces of Internet packets from the past two years show that between 1 packet in 1,100 and 1 packet in 32,000 fails the TCP checksum, even on links where link-level CRCs should catch all but 1 in 4 billion errors. For certain situations, the rate of checksum failures can be even higher: in one hour-long test we observed a checksum failure of 1 packet in 400. We investigate why so many errors are observed, when link-level CRCs should catch nearly all of them. We have collected nearly 500,000 packets which failed the TCP or UDP or IP checksum. This dataset shows the Internet has a wide variety of error sources which can not be detected by link-level checks. We describe analysis tools that have identified nearly 100 different error patterns. Categorizing packet errors, we can infer likely causes which explain roughly half the observed errors. The causes span the entire spectrum of a network stack, from memory errors to bugs in TCP. After an analysis we conclude that the checksum will fail to detect errors for roughly 1 in 16 million to 10 billion packets. From our analysis of the cause of errors, we propose simple changes to several protocols which will decrease the rate of undetected error. Even so, the highly non-random distribution of errors strongly suggests some applications should employ application-level checksums or equivalents." Hardware weaknesses within DSLAMs or various memory arrays, such as a weak driver on some internal interface, can generate high levels of multi-bit errors not detected by TCP checksums. When affecting the same bit within an interface, more than 1 out of 100 may go undetected. That seems like a lot, but getting better quality fiber easily compensates for this. Expressed differently, the average amount of data transmitted where you see one packet with three flipped bits is around 10 petabytes for 11454 byte packets and some 1.3 exabytes for 1500 byte packets. For the large packets that would be one packet in three years at 1 Gbps, for the small ones one packet in 380 years. Consider that the CRC is not always carried with the packet between interfaces. -Doug
|