NANOG Meeting Presentation Abstract
Network Telemetry at Yahoo!
Meeting: NANOG70
Date / Time: 2017-06-07 11:30am - 12:00pm
Room: Grand Ballroom
Speakers:
Matt Hudgins, Yahoo!
Matt Hudgins is currently a Senior Software Engineer at Yahoo!, building scalable network analysis and optimization applications using open source software. He was previously a Software Engineer at Cisco Systems, where he developed network operating systems for service provider networks. His open source contributions range from frontend tutorials to network monitoring tools.

Varun Varma, Yahoo
Varun Varma is a Principal Engineer currently leading the design and development of a global Network Telemetry Platform at Yahoo. Over the course of his 19-year career, Varun has worked in a variety of management and technical roles at companies ranging from startups to web-scale firms, helping build and operate everything from embedded network devices to ad technology at Internet scale.
Abstract: Providing 1 billion monthly active users with responsive, rich applications requires a large-scale network. Locked within processes running on network devices are valuable control and data plane metrics such as prefix usage, peer interface utilization, and routing session flaps. By making this data available to any number of subscribers, we enable Yahoo! engineers to create cost-saving data visualizations and anomaly detection software. This paper explains the challenges encountered and the architecture decisions made in building our real-time network telemetry stack, which currently polls millions of metrics from dozens of sites on five continents.

A key goal of our system is to minimize the effort required to poll a new device type or write a new consumer application. To accomplish this, we abstracted scale away from engineers looking to poll devices, and consumption away from engineers looking to build consumer applications.

Our Python polling layer is built to be future-proof, modular, and horizontally scalable. We chose Python for its readability and community support. Python's open source community provides a ready-made plugin system called Yapsy. Polling plugins in our system are Yapsy plugins that specify how to get and clean data from a device before placing the results onto a Kafka bus. The platform then scales horizontally (unlike MRTG or Cacti) by scheduling the plugin through Celery, a Python distributed task queue. This yields many benefits, including the freedom to use the best polling method for a given device and the luxury of not needing to worry about scaling a plugin. For instance, where a vendor supports a robust API, we use it; for API-deficient vendors, we poll by SNMP instead.

We also developed configuration-driven SNMP polling that allows us to define SNMP table relations in configuration rather than code. This approach eases the mental burden of cross-SNMP table correlations and allows us to poll new metrics without having to touch source code.
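The abstract does not show the configuration format or the plugin interface, so the following is only a minimal sketch of what configuration-driven SNMP table correlation could look like. It assumes the two tables share a single integer row index (as the IF-MIB tables ifTable and ifXTable do). `POLL_CONFIG`, `correlate`, `poll_and_publish`, and the canned `SAMPLE` data are hypothetical names invented for illustration; the `walk` and `publish` callables stand in for a real SNMP walk and a real Kafka producer.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Table = Dict[str, Row]  # rows keyed by SNMP row index (here, ifIndex as a string)

# Hypothetical configuration. Polling a new metric means editing this data,
# not the polling code. A real system would load it from a file.
POLL_CONFIG = {
    "metric": "interface_counters",
    "tables": {
        "ifTable": ["ifDescr"],
        "ifXTable": ["ifHCInOctets", "ifHCOutOctets"],
    },
}


def correlate(config: dict, walked: Dict[str, Table]) -> List[Row]:
    """Join the configured columns of every table on the shared row index."""
    # Only indexes present in all configured tables yield a record.
    shared = set.intersection(*(set(walked[name]) for name in config["tables"]))
    records: List[Row] = []
    # Assumes a single integer index; compound indexes would need another sort key.
    for index in sorted(shared, key=int):
        record: Row = {"metric": config["metric"], "index": index}
        for name, columns in config["tables"].items():
            for column in columns:
                record[column] = walked[name][index][column]
        records.append(record)
    return records


def poll_and_publish(
    config: dict,
    walk: Callable[[str], Table],
    publish: Callable[[Row], None],
) -> int:
    """Walk each configured table, correlate the rows, and publish each record."""
    walked = {name: walk(name) for name in config["tables"]}
    records = correlate(config, walked)
    for record in records:
        publish(record)
    return len(records)


# Canned data standing in for an SNMP walk of a real device.
SAMPLE: Dict[str, Table] = {
    "ifTable": {"1": {"ifDescr": "eth0"}, "2": {"ifDescr": "eth1"}},
    "ifXTable": {
        "1": {"ifHCInOctets": 1000, "ifHCOutOctets": 2000},
        "2": {"ifHCInOctets": 30, "ifHCOutOctets": 40},
    },
}

if __name__ == "__main__":
    published: List[Row] = []
    count = poll_and_publish(POLL_CONFIG, SAMPLE.__getitem__, published.append)
    print(count, published)
```

Because the table relation lives in `POLL_CONFIG`, adding a column such as another ifXTable counter would be a one-line configuration change in this sketch. In the architecture the abstract describes, a function like `poll_and_publish` would presumably sit inside a polling plugin and be scheduled by the task queue, but that wiring is not shown here.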
Files: Network Telemetry at Yahoo! (PDF)
Sponsors: None.