North American Network Operators Group


RE: UUNET connectivity in Minneapolis, MN

  • From: Charles Cala
  • Date: Fri Aug 12 10:13:03 2005

-----Original Message-----
From: [email protected]
On Behalf Of James D. Butt

> Unless there is some sort of crazy story related to why a service
> provider could not keep the lights on, this should have not been an
> issue with proper operations and engineering.

6 stories from the trenches


Once a backhoe decided to punch through a high-pressure
natural gas main right outside our offices.
The fire department had us shut down ANYTHING
that MIGHT make a spark, so nothing was allowed to run.
It did not matter that we had UPSes and such;
everything went dark for hours.


During the Loma Prieta earthquake (the one during the
World Series, in sf.ba.ca.us) there was a BUNCH of
disruption to the infrastructure: drives were shaken
until they crashed, power went down all over the area,
telco lines got knocked down, underground vaults got
flooded, and data centers went offline.


When ISDN was king (or you got a T-1),
I worked for an ISP in the Bay Area that
was one of the few to have SOME
connectivity when MAE-West went down. We had a T-1 that
went "north" to another exchange point, and even though
that little guy had 50%+ packet loss, it kept chugging.
We were one of the few ISPs that
had ANY net connection; most people
went in through their local MAE.
(That was in the days before connecting
to a MAE required that you be connected to
several other MAEs.)


Once while working for a startup in SF,
I pushed for UPSes and backup generator
sets for our rack of boxes, and I was told
that we were "in the middle of the financial district
of SF," that BART and the cable cars ran nearby,
that a big huge substation was within
rock-throwing distance of our building,
not to mention a power plant a couple of
miles away. There was no reason for us to
invest in backup gen sets or hours of
UPS time... I asked what the procedure
was if we lost power for an extended
period of time, and I was told, "we go home."

Wellllllll... the power went off to the
entire SF region, and I was able to shut
down the equipment without too much trouble,
because my laptop was plugged into a UPS
(at my desk) and the critical servers were on a UPS,
as well as the hub I was on. After I verified that we
were still up at our co-lo (via my CDPD modem),
I stated the facts to my boss and told him
that I was following his established
procedure for extended power loss:
I was on my way home. (boss = not happy)

A backup generator failed at a co-lo because 
of algae in the diesel fuel. 

Another time a valve broke in the building's HVAC system,
sending pink gooey water under the door
and into the machine room.

There are reasons why it's hard to pile a bunch of 9s
together: weird stuff does happen. This is NANOG; each
'old timer' has a few dozen of these events
they can relate.

The first two you really can't prepare for, other
than having all your stuff mirrored
someplace else; the rest are preventable,
but they were still rare.

(Back to an operational slant.)
Get a microwave T-2 and shoot it over to some
other building, get a freaking cable modem as
a backup, or find another way to get your lines out.

 If having things work is important to you, 
YOU should make sure it happens!

If people are preventing you from doing your job
(keeping servers up and reachable), CYA and
point it out in the post-mortem.


-charles

Curse the dark, or light a match. You decide, it's your dark.
Valdis.Kletnieks in NANOG