representativeness of flow data based on samples

North American Network Operators Group

From: Joe Abley
Date: Wed Jan 30 14:04:40 2002

Traffic measurement techniques such as NetFlow work by associating
some characteristics of inbound packets on an interface with a flow,
e.g. some tuple like (source addr, source port, dest addr, dest port,
protocol). Counters per flow are incremented, and the numbers are
exported periodically or when flows become inactive.
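To make the per-flow accounting above concrete, here is a minimal sketch in Python. The names FlowKey and FlowTable, the record layout, and the 15-second inactivity timeout are all assumptions made for illustration; they are not taken from NetFlow or any vendor's implementation.

```python
from typing import NamedTuple


class FlowKey(NamedTuple):
    """Hypothetical 5-tuple used to identify a flow."""
    src_addr: str
    src_port: int
    dst_addr: str
    dst_port: int
    protocol: int


class FlowTable:
    """Toy flow cache: counts packets and bytes per flow key."""

    def __init__(self, inactive_timeout=15.0):
        # Assumed timeout in seconds; real exporters make this configurable.
        self.inactive_timeout = inactive_timeout
        self.flows = {}  # FlowKey -> [packets, bytes, last_seen]

    def observe(self, key, length, now):
        """Account for one inbound packet of `length` bytes seen at time `now`."""
        record = self.flows.setdefault(key, [0, 0, now])
        record[0] += 1
        record[1] += length
        record[2] = now

    def export_inactive(self, now):
        """Remove and return (key, packets, bytes) for flows idle past the timeout."""
        expired = [k for k, r in self.flows.items()
                   if now - r[2] >= self.inactive_timeout]
        return [(k, *self.flows.pop(k)[:2]) for k in expired]
```

A periodic export would walk the whole table in the same way, without the idle-time test.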

There are a few vendors who now provide traffic export from high-speed
interfaces by sampling those interfaces at a particular rate, and
using the sampled packets to populate the per-flow counters, rather
than looking at every packet.
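A rough sketch of the sampled variant, assuming independent random 1-in-N selection and assuming that byte counts are scaled back up by N to estimate the true volume. Both are my assumptions for illustration; vendors may sample deterministically (every Nth packet) and may or may not scale before export. The function name estimate_bytes is hypothetical.

```python
import random
from collections import Counter


def estimate_bytes(packets, rate, seed=0):
    """Estimate per-flow byte counts from a 1-in-`rate` random sample.

    `packets` is an iterable of (flow_key, length) pairs. Each packet is
    selected with probability 1/rate, and each selected packet's length is
    multiplied by `rate` so the totals estimate the unsampled volume.
    """
    rng = random.Random(seed)
    estimate = Counter()
    for key, length in packets:
        if rng.randrange(rate) == 0:  # true with probability 1/rate
            estimate[key] += length * rate
    return estimate
```

With rate=1 every packet is selected, so the result is the exact count, which is a convenient sanity check.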

Does anybody here know of recent research using real Internet traffic
that compares different sampling rates with respect to the
representativeness of the resulting flow data?

For example, if I am trying to rank the top traffic sinks for my
network beyond an attached peer (i.e. an ordinal rather than cardinal
measurement), will I get different answers if I use a sampling rate
of 1:1000 compared to 1:50, given a statistically "long enough"
measurement period?
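One way to poke at this question without real data is a small simulation. The sketch below is only that: it uses synthetic traffic with an assumed Zipf-like popularity across sinks and three assumed packet sizes, so it says nothing about real Internet traffic. All function names are hypothetical. Since only the ordering matters here, the sampled byte counts are not scaled by the sampling rate.

```python
import random
from collections import Counter


def synthetic_packets(n_packets, n_sinks, seed=1):
    """Generate (sink, length) pairs with Zipf-like sink popularity (assumed)."""
    rng = random.Random(seed)
    sinks = [f"sink{i}" for i in range(n_sinks)]
    weights = [1.0 / (i + 1) for i in range(n_sinks)]
    chosen = rng.choices(sinks, weights=weights, k=n_packets)
    return [(s, rng.choice((64, 576, 1500))) for s in chosen]


def top_sinks(packets, rate, k, seed=0):
    """Rank sinks by bytes seen in a 1-in-`rate` random sample; return the top k."""
    rng = random.Random(seed)
    seen = Counter()
    for sink, length in packets:
        if rng.randrange(rate) == 0:
            seen[sink] += length
    return [s for s, _ in seen.most_common(k)]


def overlap(reference, candidate):
    """Fraction of `reference` entries that also appear in `candidate`."""
    return len(set(reference) & set(candidate)) / max(len(reference), 1)


if __name__ == "__main__":
    pkts = synthetic_packets(200_000, 500)
    truth = top_sinks(pkts, 1, 20)  # rate 1 looks at every packet
    for rate in (50, 1000):
        print(rate, overlap(truth, top_sinks(pkts, rate, 20)))
```

Varying n_packets stands in for the length of the measurement period, and comparing list order rather than set overlap would be a stricter test of the ordinal question.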

Intuitively, it seems to me that the answers should be the same.
However, it also seems to me that statistics are frequently
non-intuitive.


Joe

