North American Network Operators Group

Re: San Francisco Power Outage

  • From: George William Herbert
  • Date: Tue Jul 24 22:04:12 2007

Seth wrote:
>Jonathan Lassoff wrote:
>> Just a heads up to anyone on list that PG&E has just sustained a large
>> outage in San Francisco that has caused a few hiccups (both network,
>> electrical, infrastructural, etc.) around the city.
>> I've confirmed that customers in 365 Main and parts of telecom 1
>> have both sustained brief blackouts. No word yet from 200 Paul.
>> Anyone in the area that could use a hand with anything, I'll probably
>> be wrapping up fixes for my stuff soon, and would be glad to help
>> however I can.
>I have a question: does anyone seriously accept "oh, power trouble" as a 
>reason your servers went offline? Where's the generators? UPS? Testing 
>said combination of UPS and generators? What if it was important? I 
>honestly find it hard to believe anyone runs a facility like that and 
>people actually *pay* for it.
>If you do accept this is a good reason for failure, why?

Unfortunate real-world lesson: there is a functional difference between
pushing the UPS test cutover button, and some of the stuff that can happen
out on the power lines (including rapid voltage swings, harmonics, etc).

I know 365 Main has the equipment and tests it; I've been standing outside
when the generators spool up.

I've had generator firmware upgrades generate reporting info on the
serial uplink that flipped the UPSes into a permanent error state
until the Liebert guys got off the plane with the replacement
mainboard.  I've had grid voltage fluctuations that toasted VSDs
in chillers.  I watched a building's electrical service go "pop"
when a transformer blew and ran 10 kV into the 220 V mains for a
fraction of a second as it arced.  I was at home but got called in
after a 5 MW generator popped under a UPS and PDU load of only
about 2.4 MW that had sufficiently bad harmonics.  I had a client who forgot
to wire the A/C into the UPS, and nearly melted a whole
server room.

And the stories that the power guy I'm working with tells about
foreign facilities, particularly in middle east war zones,
are really scary...

We fundamentally do not have the facilities problem completely
nailed down to the point that things will never drop.  Level 4
datacenters can, and will, fail.  Nothing you can do, including
just running 48V DC for everything, is a truly foolproof solution.

-george william herbert