North American Network Operators Group
Re: Global BGP - 2001-06-23
Brett Frankenberger wrote:
>
> > Out of curiosity - did anyone see a duration of significant instability
> > in the global routing tables on Saturday afternoon? Without violating NDA,
> > all I can say is that it resembled a historic event involving a bad route,
> > Ciscos, and Bay routers (only this time, it was a bad route, Ciscos, and
> > <X> vendor whom I cannot name but is being soundly beaten with wet noodles
> > to resolve the issue). The bad route, and instability, were seen across
> > all of our transit vendors (all "household" names of transit service).
>
> Hmm ... why is <X> being beaten? Was the problem reversed this time?
>
> The only historic event I can recall involving a bad route, Cisco, and
> Bay (actually, events would be better, since it happened at least
> twice) was a case of (a) someone injecting a bad route, (b) the cisco
> at the other end accepting it in violation of the RFC, (c) ciscos
> passing that bad route all around the internet, all in violation of the
> RFC, (d) that route eventually hitting a cisco<->bay peering
> connection, and (e) the Bay (although the problem wasn't limited to
> Bay, as gated, and possibly other implementations as well, behaved the
> same way) properly sending a NOTIFY and taking down the BGP session, as
> required by the RFC.

A) Ciscos flap sessions, according to the only reports I've heard.

B) <X> routers were crashing, either due to the bug, or the session resets.
Thus, <X> is being flogged. I have reports of at least one <Y> having
problems, as well.

C) I would post the BugID, but the only source I have is under NDA. However,
having now heard this much in a public forum (i.e., not covered), I can say
"Invalid AS path data bug".

> It only took two major outages before Cisco fixed the problem. (The
> BGP advertisement was posted to NANOG both times, as was the BugID the
> second time.)

I have the guilty announcement, but again, it's under NDA. However, I can
say that we are now seeing this announcement from all of our upstreams,
non-blocked, so it appears that they fixed the originating point.

> So if this is the same issue, Cisco would be the vendor to flog,
> although assuming they didn't re-introduce it, the flogging might more
> correctly be directed at providers still running code old enough to
> have this particular problem.

I would flog Cisco as well, but A) they have a bug on it already, and B)
we're not using Ciscos for our core (note: this is my personal email, and I
am not speaking for my employer; however, this is publicly documented on my
employer's website, so it's not NDAed).

> Both my transits (Bay on my end, Cisco on the other end) made it
> through just fine, though. (This time. The last two times it
> happened, the ciscos on the other end happily passed the invalid route
> to me and the Bay on my end happily dropped the BGP session, and this
> was repeated ad infinitum until the bogus route was removed from the
> other end.)

I have no data on Bay; my apologies if this wasn't clear. Bay was *only*
being referenced as a historical point of note. No attempt at FUD, and my
apologies if anyone read it that way.

-- 
***************************************************************************
Joel Baker                           System Administrator - lightbearer.com
[email protected]              http://www.lightbearer.com/~lucifer
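
For readers unfamiliar with the historic failure mode described in the quoted text (one implementation accepts and forwards an invalid route, a stricter peer answers with a NOTIFICATION and drops the session), here is a rough Python sketch. It is not any vendor's code and not the bug discussed in this thread. The function names and the `strict` flag are hypothetical, and it assumes the classic AS_PATH encoding of 2-octet AS numbers in typed segments. The NOTIFICATION code and subcode (3 and 11, UPDATE Message Error / Malformed AS_PATH) are the ones defined in the BGP-4 specification.

```python
import struct

AS_SET, AS_SEQUENCE = 1, 2      # AS_PATH segment types
UPDATE_MESSAGE_ERROR = 3        # NOTIFICATION error code
MALFORMED_AS_PATH = 11          # NOTIFICATION error subcode


class MalformedASPath(Exception):
    """Raised when an AS_PATH attribute value cannot be parsed."""


def parse_as_path(data: bytes):
    """Parse an AS_PATH value into a list of (segment_type, [asn, ...])."""
    segments = []
    offset = 0
    while offset < len(data):
        if len(data) - offset < 2:
            raise MalformedASPath("truncated segment header")
        seg_type, count = data[offset], data[offset + 1]
        offset += 2
        if seg_type not in (AS_SET, AS_SEQUENCE):
            raise MalformedASPath(f"unknown segment type {seg_type}")
        end = offset + 2 * count            # assumes 2-octet AS numbers
        if end > len(data):
            raise MalformedASPath("segment overruns attribute")
        asns = struct.unpack(f"!{count}H", data[offset:end])
        segments.append((seg_type, list(asns)))
        offset = end
    return segments


def handle_update(as_path_value: bytes, strict: bool):
    """Hypothetical receive path: strict peers reset, lenient peers forward."""
    try:
        return ("accept", parse_as_path(as_path_value))
    except MalformedASPath:
        if strict:
            # Send NOTIFICATION and tear the session down.
            return ("notification", (UPDATE_MESSAGE_ERROR, MALFORMED_AS_PATH))
        # Lenient: the bad attribute is passed along to other peers unchanged.
        return ("forward", as_path_value)
```

With a lenient router upstream and a strict one downstream, the same bad attribute arrives again after every session re-establishment, which matches the "repeated ad infinitum" flapping described in the quoted text.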