North American Network Operators Group


Re: why upload with adsl is faster than 100M ethernet ?

  • From: Alex Bligh
  • Date: Fri Oct 15 12:54:05 2004


--On 15 October 2004 12:31 -0400 Andy Dills <[email protected]> wrote:

> If the desire is to provide a simulated circuit with "x" bandwidth, CAR
> does a great job, IFF you correctly size the burst: 1.5x/8 for the normal
> burst, 3x/8 for the max burst.
>
> The aggregate rate of the transfer is "x" in all the testing I've done.
> How can you ask for more than the configured line rate? In my testing, I
> noticed a pronounced saw-tooth effect with incorrectly configured bursts,
> but with correctly configured bursts, the saw-toothing effect did not
> prevent delivery of the configured throughput.
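
(For concreteness, the burst sizing quoted above works out as follows for
an illustrative 2 Mb/s CAR policy. The quick Python sketch below - the
2 Mb/s figure and the approximate IOS-style command it prints are an
example only, not taken from either message - just applies the 1.5x/8
and 3x/8 rule, with bursts expressed in bytes.)

# Illustrative only: apply the burst-sizing rule of thumb quoted above
# (normal burst = 1.5 * rate / 8, max burst = 3 * rate / 8, both in bytes).

def car_bursts(rate_bps: int) -> tuple[int, int]:
    """Return (normal_burst, max_burst) in bytes for a CAR policy at rate_bps."""
    normal = int(1.5 * rate_bps / 8)   # roughly 1.5 seconds of traffic, in bytes
    maximum = int(3 * rate_bps / 8)    # twice the normal burst
    return normal, maximum

rate = 2_000_000                       # 2 Mb/s, an example rate only
normal, maximum = car_bursts(rate)
print(f"rate-limit output {rate} {normal} {maximum} "
      f"conform-action transmit exceed-action drop")
# prints: rate-limit output 2000000 375000 750000 conform-action transmit exceed-action drop
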
It's a fair while ago now, but we did a pretty full range of tweaking
(of max burst, of burst size, and indeed of committed rate). We observed
the following problems:
a) The fudge factor that you needed to apply to get the right bandwidth
  depended heavily on (from memory; see the rough throughput sketch after
  this list):
  (i)   TCP stacks at either end, whether slow start was configured, etc.
  (ii)  path MTU
  (iii) Number of simultaneous connections
  (iv)  Protocol type (e.g. TCP vs. UDP) and content (HTTP was, for
        reasons to do with persistent connections, typically different
        from FTP)
  We did indeed (until we found a better solution) manage to come up
  with a fudge factor that minimized customer complaints under this
  head (which was most of them), but it was essentially "let's wind
  everything up high enough that in the worst case of the above they
  get throughput not less than they have bought"; however, this meant
  we were giving away rather more bandwidth than we meant to, which
  made upgrades a hard sell.
b) It *STILL* didn't work like normal TCP. We had customers with web
  servers behind these things who expected (say) a 2Mb/s service running
  constantly flatlined to operate like a 2Mb/s pipe running full (but
  not overfull) - i.e. they'd expect to buy a level of service roughly
  equal to their 95th percentile / busy-hour rate. When they were even
  slightly congested, their packet loss substantially exceeded what
  you'd see on the end of a properly buffered (say) 2Mb/s serial link
  (see the policer-vs-shaper sketch after this list). If their traffic
  was bursty, the problem was worse. Even if you could then say "well,
  our tests show you are getting 2Mb/s (or rather more than that)", the
  fact that a disproportionate number of packets were being lost caused
  lots of arguments about the SLA.
c) The problem is worst when the line speed and the rate-limit speed
  are most mismatched. Thus if you are rate-limiting at 30Mb/s on a
  100Mb/s port, you won't see too much of a problem. If you are
  rate-limiting at (say) 128kbps on a 1Gb/s port, you see rather more
  problems. In theory, this should have been fixed by sufficient
  buffering and burst, but at least on the Cisco 75xx (which is what
  this was on several years ago), it wasn't - whilst we found a
  mathematical explanation, it wasn't sufficient to account for all the
  problems we saw (I have a feeling it was due to something in the
  innards of CEF switching).
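
On (a), a back-of-envelope way to see why those variables matter is the
well-known Mathis approximation for steady-state TCP throughput,
rate ~= (MSS/RTT) * (1.22/sqrt(loss)). The short Python sketch below is
only that approximation; the MTU, RTT and loss figures are invented for
illustration.

# Back-of-envelope only: the classic Mathis et al. approximation shows why
# the "fudge factor" moves with path MTU (MSS), loss rate, RTT and the
# number of parallel connections.
from math import sqrt

def tcp_goodput_bps(mss_bytes: int, rtt_s: float, loss: float, flows: int = 1) -> float:
    """Aggregate goodput estimate (bits/s) for `flows` parallel TCP connections."""
    return flows * (mss_bytes * 8 / rtt_s) * (1.22 / sqrt(loss))

# Example: 1% loss from an aggressive policer, 40 ms RTT
print(tcp_goodput_bps(1460, 0.040, 0.01) / 1e6, "Mb/s with a 1500-byte MTU")  # ~3.6 Mb/s
print(tcp_goodput_bps(536,  0.040, 0.01) / 1e6, "Mb/s with a 576-byte MTU")   # ~1.3 Mb/s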
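
On (b) and (c), a toy token-bucket simulation (below, in Python) shows the
qualitative difference between a policer, which drops whatever the bucket
cannot cover, and a shaper, which queues it. Everything here - rates,
burst size, queue depth, traffic pattern - is invented for illustration,
and it is a simplified model rather than Cisco's CAR/GTS implementation;
the point is that a bursty flow averaging under the configured rate can
lose most of its packets to a policer with an undersized burst, while a
shaper with a modest queue delivers it all, merely delayed.

# A toy token-bucket model (not Cisco's actual CAR/GTS code) of the
# policing-vs-shaping difference.  All figures are made up.

from collections import deque

RATE_BPS    = 2_000_000   # enforced rate: 2 Mb/s
BURST_BYTES = 3_750       # deliberately undersized bucket (1.5*rate/8 would be 375,000)
PKT_BYTES   = 1_500       # full-size packets
QUEUE_PKTS  = 64          # shaper queue depth, in packets
TICK_S      = 0.001       # 1 ms simulation step

def simulate(shape: bool, arrivals_per_tick: list[int]) -> tuple[int, int]:
    """Return (sent, dropped) packets.  shape=False models a policer (excess is
    dropped on the spot); shape=True models a shaper (excess waits in a queue)."""
    tokens = float(BURST_BYTES)
    queue: deque[int] = deque()
    sent = dropped = 0
    for arrivals in arrivals_per_tick:
        tokens = min(BURST_BYTES, tokens + RATE_BPS / 8 * TICK_S)  # refill, in bytes
        backlog = list(queue) + [PKT_BYTES] * arrivals             # queued packets first
        queue.clear()
        for size in backlog:
            if tokens >= size:
                tokens -= size
                sent += 1
            elif shape and len(queue) < QUEUE_PKTS:
                queue.append(size)    # shaper: hold the packet and send it later
            else:
                dropped += 1          # policer (or overflowing shaper): drop it
    return sent, dropped

# Bursty but modest load: 8 packets every 50 ms ~= 1.92 Mb/s, i.e. under the limit.
pattern = ([8] + [0] * 49) * 20
for shape in (False, True):
    sent, dropped = simulate(shape, pattern)
    print(f"{'shaper ' if shape else 'policer'}: sent {sent:3d}, dropped {dropped:3d}")

As written this prints roughly "policer: sent 40, dropped 120" against
"shaper : sent 160, dropped 0" - the same 2 Mb/s ceiling enforced two
ways, with very different loss for the same bursty, in-contract load.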

I know several others who had similar problems both before and after
this (one solved it by putting in a Catalyst with an ATM blade running
LANE and a Fore ATM switch - yuck - there are better ways to do it now). I
am told that the PXF stuff which does WFQ etc. in hardware is now up to this
(unverified). But that's shaping, not rate-limiting.

Alex