North American Network Operators Group

Re: Netflow/flowscan

  • From: Dave Plonka
  • Date: Tue Jun 22 11:37:20 2004

Hi Andrew,

On Mon, Jun 21, 2004 at 11:10:55PM -0700, andrew matthews wrote:
> 
> Anyone ever done some major flowscan stuff?

Well "major" is in the eye of the beholder, but here's what we do:

   http://wwwstats.net.wisc.edu/ (aggregated to ~1 OC-12 and Gig-E WAN link)
   http://wwwstats.wiscnet.net/  (multiple OC-12,Gig-E links)

Both of these are done using packet sampling though, with FlowScan
modified to simply multiply by the sampling interval (the N in 1-in-N
sampling) before recording values to RRD files.  BTW, it's good to keep
SNMP measurements in
parallel so that you can continuously estimate the error due to packet
sampling, as I do in the grey band in the "Estimated UW-Madison Campus
I/O AS to AS" graph:

   http://wwwstats.net.wisc.edu/CampusIO/as2as_Mbps/as2as_Mbps_2d.png
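
As a rough illustration of the scaling and the SNMP cross-check (a
Python sketch, not the actual FlowScan modification; the function names
and the 300-second reporting interval are assumptions made for the
example):

```python
def scale_sampled(sampled_bytes, sampling_interval):
    """Estimate the true byte count from a 1-in-N packet-sampled count."""
    return sampled_bytes * sampling_interval


def to_mbps(byte_count, seconds=300):
    """Convert a byte total over an interval to megabits per second."""
    return byte_count * 8 / seconds / 1e6


def sampling_error(estimated_mbps, snmp_mbps):
    """Signed relative error of the flow-based estimate against SNMP.

    Returns None when the SNMP rate is zero, since the ratio is undefined.
    """
    if snmp_mbps == 0:
        return None
    return (estimated_mbps - snmp_mbps) / snmp_mbps


if __name__ == "__main__":
    # Hypothetical numbers: 1-in-100 sampling, one five-minute interval.
    est = to_mbps(scale_sampled(33_750_000, 100))
    print(est, sampling_error(est, snmp_mbps=95.0))
```

Plotting that error over time alongside the estimate is one way to get
something like the grey band in the graph.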

> We tried it once for a while and we had so much traffic our dual zeon
> 3.06ghz system couldn't keep up. The flows just started getting more
> and more behind... anyone ever succesfully graphed large amounts of
> data? If so what kind of systems did you use and what type of
> capture/processor layout did you have?

If you're interested in a recommendation, get flow-tools and run
flow-capture:

   http://www.splintered.net/sw/flow-tools/

We export non-sampled flow data from our core routers (hybrid Cat6k),
collected with nine flow-capture daemons, on an 8-way Intel box (PIII
700MHz) running Linux.  (It is not dedicated solely to flow-capture...
it also runs MRTG against tens of thousands of targets.)

You can export from multiple routers to the same flow-capture daemon,
but I'd only do that to the extent that makes sense for your
topology.  For instance, colocated redundant routers can probably
export to the same flow-capture instance, since they are unlikely
to handle exactly the same traffic.

For visualization, we export sampled data from just the border routers
(Junipers), and post-process just those using FlowScan.

When something interesting is found in the sampled data, we drill down
into the non-sampled five-minute flow files with flow-stat or flow-nstat,
flowdumper, etc.

If I were to try to develop the highest performance FlowScan-like
analysis/visualization system, I would use flow-tools underneath, and
use perl glue to grab the data and punch it into RRD files.
Personally, I haven't found the need (nor the time) to write something
that out-does our institution's performance requirements.
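
To make the "glue" idea a bit more concrete, here is a minimal sketch
of the binning step.  It is in Python rather than perl, and it assumes
the flow data has already been reduced to (unix_time, bytes) pairs by
whatever flow-tools report you prefer; that input shape, the function
names, and the RRD path are all hypothetical.  It only builds the
argument list for "rrdtool update" (one timestamp:value pair per bin)
and does not run it:

```python
from collections import defaultdict

STEP = 300  # five-minute bins, matching the five-minute flow files


def bin_flows(flows, step=STEP):
    """Sum bytes into step-aligned bins.

    flows: iterable of (unix_time, byte_count) pairs (assumed input shape).
    Returns a dict mapping bin start time to total bytes.
    """
    bins = defaultdict(int)
    for ts, nbytes in flows:
        bins[ts - ts % step] += nbytes
    return dict(bins)


def rrd_update_args(rrd_path, bins, step=STEP):
    """Build an argv list for `rrdtool update`, oldest bin first.

    Each bin is recorded at its end time as timestamp:value.
    """
    args = ["rrdtool", "update", rrd_path]
    for start in sorted(bins):
        args.append(f"{start + step}:{bins[start]}")
    return args


if __name__ == "__main__":
    sample = [(1000, 10), (1100, 5), (1300, 7)]  # made-up records
    print(rrd_update_args("example.rrd", bin_flows(sample)))
```

How the RRD interprets those byte totals depends on the data-source
type it was created with, which is outside this sketch.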

Dave

P.S. Regarding the other follow-up re: performance based on programming
language, while FlowScan is indeed in perl, its flow-file reading is
written in C, linked via perl's XS language.

That said, it's nowhere near as fast as Mark Fullmer's excellent
flow-tools package.  That's the price paid since FlowScan, Cflow, and
flowdumper allow you to do arbitrary flow queries and analysis using
any perl expression (or module).  In my experience there are many more
network operators willing to dabble in perl than customize software
written in C, etc.

P.P.S. Hints on using FlowScan with flow-tools are linked on the
flow-tools site as:

   "Tips on configuring FlowScan with flow-tools."
      http://net.doit.wisc.edu/~plonka/list/flowscan/archive/1117.html

Also, I've released a FlowScan report module that allows you to
configure the PacketSamplingInterval:

   http://net.doit.wisc.edu/~plonka/FlowScan/new/total_top/

-- 
[email protected]  http://net.doit.wisc.edu/~plonka  ARS:N9HZF  Madison, WI