North American Network Operators Group

RE: What does 95th %tile mean?

  • From: David Schwartz
  • Date: Thu Apr 19 20:26:18 2001

> Same algorithm, same raw data, different 95% answers, both valid, yet one
> is twice as large as the other. Great outcome for a billing
> system isn't it?

	Any billing scheme based upon statistical sampling will, with some
probability, randomly err in favor of one party or the other. But it is
important that the customer understands that he is being billed based upon
statistical sampling and thus there are no "exact" measurements.
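
	To make the sampling effect concrete, here is a rough sketch, not any provider's actual billing code. It assumes a nearest-rank 95th percentile over 5-minute averages, and the traffic trace is synthetic (ten 5-minute bursts at 10 Mbit/s, idle otherwise). The function names percentile_95 and average_rates are made up for illustration. The only thing that changes between the two runs is where the sampling windows start.

```python
def percentile_95(samples):
    """Nearest-rank 95th percentile: sort, then take the value at rank ceil(0.95 * n)."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n), 1-based
    return ordered[rank - 1]


def average_rates(per_second, window, offset=0):
    """Average a per-second trace into fixed windows that start at `offset` seconds."""
    rates = []
    for start in range(offset, len(per_second) - window + 1, window):
        chunk = per_second[start:start + window]
        rates.append(sum(chunk) / window)
    return rates


WINDOW = 300  # 5-minute samples, a common choice (assumption)

# Synthetic trace in Mbit/s: 30000 seconds, idle except for ten 300-second bursts.
trace = [0] * 30000
for k in range(10):
    burst_start = 1500 + 3000 * k
    trace[burst_start:burst_start + WINDOW] = [10] * WINDOW

# Same algorithm, same raw data, different window alignment.
aligned = percentile_95(average_rates(trace, WINDOW, offset=0))
shifted = percentile_95(average_rates(trace, WINDOW, offset=150))

print("aligned windows:", aligned, "Mbit/s")
print("shifted windows:", shifted, "Mbit/s")
```

	With aligned windows each burst fills one whole sample; shifted by half a window, each burst is split across two samples at half the rate, so the billed figure halves even though the traffic is identical.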

	I've looked at other methods and can't find anything better. Billing based
upon NetFlow, for example, is still statistical sampling since NetFlow loses
a percentage of flows. For example, one of my VIP2-50s says:

  368351628 flows exported in 12278484 udp datagrams
  33838 flows failed due to lack of export packet
  269989 export packets were dropped enqueuing for the RP
  108825 export packets were dropped due to IPC rate limiting
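
	As a back-of-the-envelope check on those counters, the sketch below estimates what share of flows never reached the collector. It rests on two assumptions that the output itself does not confirm: that a dropped export packet carried about as many flows as a delivered one, and that the "exported" totals do not already include the dropped packets. Treat the result as an order-of-magnitude estimate only.

```python
# Counters copied from the export statistics quoted above.
flows_exported = 368_351_628
datagrams_sent = 12_278_484
flows_failed = 33_838
dropped_rp_queue = 269_989
dropped_ipc_limit = 108_825

packets_dropped = dropped_rp_queue + dropped_ipc_limit
flows_per_datagram = flows_exported / datagrams_sent

# Assumption: a dropped packet carried about as many flows as a delivered one.
flows_lost_estimate = flows_failed + packets_dropped * flows_per_datagram

# Assumption: the exported totals do not already include the dropped packets.
loss_fraction = flows_lost_estimate / (flows_exported + flows_lost_estimate)

print(f"export packets dropped: {packets_dropped}")
print(f"flows per datagram: {flows_per_datagram:.1f}")
print(f"estimated share of flows lost: {loss_fraction:.1%}")
```

	Under those assumptions the loss works out to roughly 3 percent of flows, which is small but still means the bill is built on a sample rather than a complete record.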

	Billing based upon total bytes transferred tends to create similar
problems. Do you bill based upon bytes transferred per day? Per month? If
so, it's still statistical sampling if you have some amount of 'paid
bandwidth'.

	And you can't collect this data from interfaces because interface rates
include local traffic, which (for example) grossly overbills customers with
newsfeeds.

	I think there would be a market for a device with two GE interfaces that
accounted for everything that passed through the two interfaces in a
reliable and configurable way. It would have to be capable of fault-tolerant
operation with multiple units. It would have to be free too. ;)

	DS