North American Network Operators Group

Re: San Francisco Power Outage

  • From: Brandon Galbraith
  • Date: Tue Jul 24 21:19:23 2007


On 7/24/07, Seth Mattinen wrote:

I have a question: does anyone seriously accept "oh, power trouble" as a
reason your servers went offline? Where's the generators? UPS? Testing
said combination of UPS and generators? What if it was important? I
honestly find it hard to believe anyone runs a facility like that and
people actually *pay* for it.

If you do accept this is a good reason for failure, why?

~Seth

I'm unable to find a link at the moment, but many moons ago power was lost at the Equinix facility at 350 E Cermak in Chicago. At the time, we didn't have production equipment there (only a firewall in a shared colo cage/cabinet). The outage occurred on a Friday evening and lasted for quite some time into Saturday morning because their generators would start up but would not keep running. I believe the root cause was a problem with insulation on the power cables somewhere. I understand testing is done frequently, but I'm also aware that if I want full redundancy, I need two physically separate locations. There are some events you can't plan for, as well as failure modes that aren't easily or quickly resolved.

-brandon