North American Network Operators Group
Re: AGIS Route Flaps Interrupting its Peering?
Peter et al:

We too have had nothing but trouble with the netedge boxes (to mae-east
and mae-west). They are particularly insidious when they are "kind of
working". A couple years ago, when traffic loads were lower, they seemed
to perform well. Does anyone know if MFS has plans to address this
problem?

-- Becca

-----------------------------------------------------------------------
Rebecca L. Nitzan                      Lawrence Berkeley National Lab
Network Engineering Services Group     1 Cyclotron Rd, 50A/3101 MS 50C
ESnet - Energy Sciences Network        Berkeley, CA. 94720
phone: 510-486-6468  fax: 510-486-4300
[email protected]
-----------------------------------------------------------------------

>Here's some background:
>
>AGIS's router is not colocated at the MAE parking garage, but is in fact
>colocated at WorldCom in downtown Washington DC. Our bits get from there to
>the MAE via a DS3, and that DS3 is terminated at each end with a device
>called a NetEdge, which does the FDDI to DS3 ATM conversion.
>
>These NetEdges seem to have three different possible operating states:
>completely working (which doesn't happen often enough); broken (often, right
>out of the box); and kind of working (which happens all too often). This
>third operating state results in some very interesting, possibly misleading,
>and sometimes damaging behavior. It looks quite similar to the kind of
>behavior you get when you change the MAC layer device but keep the same IP
>address at either of the MAE's: ARP caches get inconsistent, and BGP
>sessions with other routers flop around, leading to routes getting flap
>dampened by those running the appropriate code.
>
>
>Here's what happened:
>
>AGIS's connection to MAE-East experienced one of these kind-of-working
>problems which resulted in the erratic behavior above. Digex customers
>wishing to reach AGIS customers called the Digex NOC, and the posting which
>started this all was made to the Digex internal news group. Similarly, AGIS
>customers had problems, and we worked with MFS to get the problem resolved
>(they must have a warehouse full of swapped-out NetEdges at this point).
>
>In the interval, a short-on-facts bozo spit into the wind and got us and
>Digex wet. I'm in private correspondence with Ed Kern to postmortem the
>situation.
>
> Peter
>
>
>At 10:25 AM 7/5/96 -0400, Ed Kern wrote:
>>>
>>> One key point is that we have not received any complaints or reports
>>> of any sort concerning any perceived issues at mae-east from any
>>> mae-east peers. Digex made no attempt to contact us. We were already
>>> working with Advantis on the unreachable issue above, but the first we
>>> heard of the "AGIS attacks mae-east" report was when a Digex customer
>>> sent us a report similar to that forwarded to all of you by Cook.
>>
>>Went into this in the last message...Digex will try and be more
>>proactive with pointing out Agis flapping prefixes in the future.
>>
>>
>>> An appropriate audience would have been the AGIS noc and the Digex
>>> noc. I think the Cook approach was inappropriate because the issue
>>> was purely between Digex and AGIS until Cook distributed it to the
>>> three widespread mailing lists.
>>
>>I agree..
>>
>>
>>> > How is the report flawed?
>>>
>>> I see that Ed Kern has already replied indicating that the report was
>>> indeed flawed. I don't think that there is anything to be gained by
>>> going into further detail.
>>
>>What I was referring to was the internal circulation here...which I was
>>under the impression got to external customers....now I'm not so
>>sure...
>>
>>The internal report was flawed because it relied too much on source
>>routes and came to some bad conclusions on the internal state of agis.
>>
>>
>>> My key point is that nothing of interest happened. This was a
>>> non-issue until the misinformation was blasted around the Internet
>>> technical universe.
>>>
>>
>>I would argue that the external message that got sent around was
>>misinformation...It was correct information from what the people
>>could see at the time it was released...(lots of dampened prefixes and
>>a down peer)..
>>
>>
>>Ed
>>
>>
>>
>
>_____________________________________________________________________
>Peter Kline Senior Network Engineer| 313-730-5151
>AGIS - Internet Backbone Services  | _Lucem Diffundo_
>Post-Traumatic Success Disorder+   |
>/////////////////////////////////////////////////////////////////////
>You can pretend to care, but you can't pretend to be there.
>