North American Network Operators Group


Re: links on the blink (fwd)

  • From: Hans-Werner Braun
  • Date: Wed Nov 08 15:15:25 1995

>I bet myself you would respond the way you did before I finished processing
>my mail bag. I won. Sure, 100% packet loss is eminently acceptable if that
>loss rate occurs not more than 1% of the time. Maybe 10% packet loss

I suppose that is correct. 100% packet loss for 1% of my *connections*,
I can live with that. We also lived with high delays (like the 0.5 sec
satellite round trip on USAN; I don't mean those occasional 75 seconds
in some, ahem, other environments at that time), as long as it was
predictable. What drives me nuts is when service is unpredictable:
immediate packet service for 40% of the packets, a few seconds of delay
and occasional losses for another 40% of the packets, and the
opportunity to keep up with Dave's technical field between keystrokes
for the remaining 20%.

What does a consistent 10% packet loss mean? I think it has little to
do with telco line quality these days, and more with resource
contention. What is contended? A T1 line (or whatever) is never, ever
congested, as it is just a simple bit pipe. The contention is at the
equipment, called routers, which aggregate more traffic on their
inbound interfaces than they can dump onto the outbound interfaces
(e.g., for outbound line capacity reasons, with buffers then filling up
in the router). Historically that was often a router problem, as
routers were too slow to deal with the onslaught of packets at a plain
packets-per-second rate (remember, in 1987 the NSF solicitation asked
for a then-whopping 1000 packets per second per router, which was just
barely achievable then). Today you can buy technology off the shelf
that does not have a pps problem in typical situations. So what is the
problem, if it is not the router interconnection or the router
technology? The answer is bad network engineering, little consideration
for architectural requirements, and a lack of understanding of the
Internet workload profile. Intra-NSP, perhaps even more among NSPs. Or,
in other words, it is people that kill the network, not the routers or
phone lines; particularly people who are trying to make money off it,
probably using their unique optimization function focused on profit
and on limiting expenses as much as they can, not yet understanding the
fate sharing.
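The contention mechanism above (aggregated inbound traffic exceeding
the outbound line rate, buffers filling, then drops) can be sketched as
a toy single-queue simulation. The rates and buffer size below are
made-up illustration values, not measurements from any real router:

```python
import random

def simulate(arrival_rate, service_rate, buffer_slots, ticks, seed=1):
    """Toy model of a router output interface: packets arrive from
    aggregated inbound links, drain at the outbound line rate, and
    whatever finds the buffer full is dropped (tail drop)."""
    rng = random.Random(seed)
    queue = 0
    arrived = dropped = 0
    for _ in range(ticks):
        # Bursty arrivals: binomial around the aggregate inbound rate.
        burst = sum(1 for _ in range(arrival_rate * 2)
                    if rng.random() < 0.5)
        arrived += burst
        for _ in range(burst):
            if queue < buffer_slots:
                queue += 1
            else:
                dropped += 1                  # buffer full -> packet lost
        queue = max(0, queue - service_rate)  # outbound line drains
    return dropped / arrived if arrived else 0.0

# Outbound capacity below the average inbound aggregate:
# consistent loss, i.e. an underprovisioned link.
print("underprovisioned loss:", simulate(10, 9, 50, 10000))
# Adequate capacity: only occasional burst losses.
print("provisioned loss:     ", simulate(10, 12, 50, 10000))
```

The point the sketch makes is the same as the paragraph: with the mean
load above the drain rate the loss is constant (roughly the capacity
shortfall), while a provisioned link loses packets only on rare bursts.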

A constant 10% packet loss (or any constant packet loss) means that the
network is underprovisioned. The *deployed* Internet depends heavily on
massive aggregation of microscopic transactions (e.g., 10 packets or so
at the 50th percentile of transactions, with tens of thousands of them
perhaps in parallel). These aggregations result in some degree of
steady state, but also in burst behavior, which even in a well-designed
network can result in occasional losses due to overutilization of
resources. But it should not happen consistently, if someone were to
claim it to be a well designed and implemented network. The
conventional wisdom says to upgrade the capacity (e.g., more bandwidth
to improve the outflow from the routers) to handle the additional load
in cases of resource contention. That can be an expensive undertaking,
and an administrative nightmare, especially in the international realm.
Maybe a band-aid could be a prioritization of traffic, so that more
deserving traffic gets better service. For example, 10% packet loss on
my email I would typically not even know about, but most interactive
connections (call it laziness, stupidity, whatever) create several
packets per keystroke, with their demands for end-to-end echoes (hey,
Dave, you did a bad job of technology transfer out of the Fuzzballs, as
you got it right: line mode by default, going into character mode only
if really necessary; i.e., proof that it is possible to do it right was
available 10 years ago). Prioritization can be left either to the
service provider (who may have to hide it; and it is also very hard to
serve everyone right that way), or to the end-user. If the end-user
specifies a service profile (be it IP Precedence or whatever), it will
only work if there is a penalty for higher service qualities (e.g.,
quotas, or precedence-0 is free and higher ones cost the end-user (or
someone else in the path whose pain would be desirable here)
something). You would still need to understand the workload profile
eventually; simple utilization graphs won't suffice if you compare the
common microscopic transactions with those exhibiting a high bandwidth
* duration product (e.g., them new multimedia thingies).
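For what it's worth, the end-user side of such a service profile is
mechanically trivial: IP Precedence is the top three bits of the TOS
byte (RFC 791), and it can be requested on an ordinary socket. A
minimal sketch, assuming a Unix-style socket API; the precedence value
chosen here is just an illustration, and of course nothing happens
unless routers along the path honor it (and, as argued above, unless it
costs the sender something):

```python
import socket

# IP Precedence occupies the top 3 bits of the TOS byte (RFC 791).
# Precedence 0 ("routine") is the free default; 3 ("flash") is an
# arbitrary example of a higher class.
PRECEDENCE_FLASH = 3 << 5   # precedence level 3, shifted into place

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, PRECEDENCE_FLASH)

# Read it back to confirm the kernel accepted the setting.
tos = sock.getsockopt(socket.IPPROTO_IP, socket.IP_TOS)
print("TOS byte:", tos, "precedence:", tos >> 5)
sock.close()
```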

Anyway, no magic here either. These issues have been on the table for
many years already; nothing new. Though if the Internet is to survive,
some service providers probably need to adjust their optimization
models.