North American Network Operators Group


Re: fixing TCP buffers (Re: packet reordering at exchange points)

  • From: Richard A Steenbergen
  • Date: Tue Apr 09 20:41:55 2002

On Wed, Apr 10, 2002 at 12:22:57AM +0000, E.B. Dreger wrote:
> 
> My static buffer presumed that one would regularly see line rate;
> that's probably an invalid assumption.

Indeed. But that's why it's not an actual allocation.

> Why bother advertising space remaining?  Simply take the total
> space -- which is tuned to line rate -- and divide equitably.
> Equal division is the primitive way.  Monitoring actual buffer
> use, a la PSC window-tuning code, is more efficient.

Because then you haven't accomplished your goal. If you have 32MB of buffer
memory available, and you open 32 connections and share it equally at
1MB each, you could have one connection that is doing no bandwidth and one
connection that wants to scale to more than 1MB of packets in flight. Then
you have to start scanning all your connections on a periodic basis,
adjusting the socket buffers to reflect the actual congestion window, a
la PSC.
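The difference between the two policies can be sketched roughly like this (a toy model in my own words, not the actual PSC code; the function names and numbers are made up for illustration):

```python
TOTAL = 32 * 1024 * 1024  # 32 MB of buffer memory for the whole system

def equal_split(n_conns):
    """Primitive approach: every connection gets the same slice."""
    return TOTAL // n_conns

def demand_split(windows):
    """PSC-style idea: size each socket buffer in proportion to its
    measured congestion window, rescanned periodically."""
    demand = sum(windows)
    if demand <= TOTAL:
        return list(windows)  # everyone gets what it currently needs
    # oversubscribed: scale every share down proportionally
    return [w * TOTAL // demand for w in windows]

# One idle connection, one that wants 4 MB in flight, 30 modest ones:
windows = [0, 4 * 1024 * 1024] + [64 * 1024] * 30
shares = demand_split(windows)
```

With the equal split, the busy connection is capped at 1MB no matter what; with the demand-based split it gets its full 4MB because total demand is still well under the pool.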

My suggestion was to cut out all that nonsense by simply removing the
receive window limits altogether. Actually you could accomplish this
goal by just advertising the maximum possible window size and relying on
packet drops to shrink the congestion window on the sending side as
necessary, but this would be slightly less efficient in the case of a
sender overrunning the receiver.
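For reference, "the maximum possible window size" under RFC 1323 window scaling works out to just under 1GB: the 16-bit window field shifted by a scale factor of at most 14:

```python
# RFC 1323: the advertised window is a 16-bit field, optionally
# left-shifted by a negotiated scale factor of at most 14 bits.
WINDOW_FIELD_MAX = 0xFFFF   # 65535, the unscaled 16-bit maximum
MAX_SHIFT = 14
max_window = WINDOW_FIELD_MAX << MAX_SHIFT
print(max_window)  # 1073725440 bytes, just under 1 GB
```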

But alas we're both forgetting the sender side, which controls how quickly 
data moves from userland into the kernel. This part must be set by looking 
at the sending congestion window. And I thought of another problem as 
well. If you had a receiver which made a connection, requested as much 
data as possible, and then never did a read() on the socket buffer, all 
the data would pile up in the kernel and consume the total buffer space 
for the entire system.
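That pile-up is easy to demonstrate. Here's a sketch (using a Unix socketpair rather than a real TCP connection, just so it runs locally) in which the sender keeps writing and the receiver never reads; the kernel buffers everything until its buffers fill and the sender can make no further progress:

```python
import socket

# One end sends, the other never calls recv(): everything sent sits in
# the kernel's socket buffers until they are full.
sender, receiver = socket.socketpair()
sender.setblocking(False)

buffered = 0
chunk = b"x" * 4096
try:
    while True:
        buffered += sender.send(chunk)  # kernel copies chunk into its buffers
except BlockingIOError:
    pass  # buffers full: the kernel refuses to hold any more for us

print("bytes stuck in kernel buffers:", buffered)
sender.close()
receiver.close()
```

Multiply that by a few thousand malicious or broken clients and you've consumed the system's buffer memory without the receiver ever doing any work.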

> To respect memory, sure, you could impose a global limit and
> alloc as needed.  But on a "busy enough" server/client, how much
> would that save?  Perhaps one could allocate 8MB chunks at a
> time... but fragmentation could prevent the ability to have a
> contiguous 32MB in the future.  (Yes, I'm assuming high memory
> usage and simplistic paging.  But I think that's plausible.)

You're missing the point: you don't allocate ANYTHING until you have a
packet to fill that buffer, and when you're done buffering it, it is
freed. The limits are just there to prevent you from running away with a
socket buffer.
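In other words the limit is a cap, not a reservation. A toy sketch of that behavior (my own illustration; the names are loosely modeled on BSD's sb_hiwat, not taken from any real stack):

```python
class SocketBuffer:
    def __init__(self, limit):
        self.limit = limit   # a cap only -- nothing is reserved up front
        self.used = 0        # actual bytes held; starts at zero

    def enqueue(self, nbytes):
        """Memory is 'allocated' only when a packet actually arrives."""
        if self.used + nbytes > self.limit:
            return False     # would run away with the buffer: refuse/drop
        self.used += nbytes
        return True

    def read(self, nbytes):
        """Freed again as soon as the application consumes it."""
        taken = min(nbytes, self.used)
        self.used -= taken
        return taken
```

An idle socket with a 32MB limit costs nothing; only packets that are actually queued, and not yet read, occupy memory.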

-- 
Richard A Steenbergen <ras[email protected]>       http://www.e-gerbil.net/ras
PGP Key ID: 0x138EA177  (67 29 D7 BC E8 18 3E DA  B2 46 B3 D8 14 36 FE B6)