North American Network Operators Group

RE: "They all suck!" Re: UPS failure modes (was: fire at NAC)

  • From: Shaun Bryant
  • Date: Thu May 29 18:04:41 2003

One thing people seem to have forgotten is that added redundancy brings
added complexity, which in many cases outweighs the gain.

Shaun  

> -----Original Message-----
> From: Alex Rubenstein [mailto:[email protected]]
> Sent: Thursday, May 29, 2003 1:40 PM
> To: Sean Donelan
> Cc: [email protected]
> Subject: Re: "They all suck!" Re: UPS failure modes (was: fire at NAC)
> 
> 
> 
> 
> > UPSes (and UPS batteries) do fail, sometimes in catastrophic ways.  I
> > would not design any critical system on the assumption that any particular
> > component won't fail.  High availability is about designing for failure.
> > Sometimes there is a long time between failures, other times they occur
> > early and often.  The most annoying thing about UPSes is they fail at
> > exactly the time they are needed most.
> 
> Except, that:
> 
> Even in instances where 'High availability' is designed, in the case where
> one of the units has a failure that causes a fire and FM200 dump, either
> the FM200 will still trigger an EPO, or the fire department will.
> 
> So, the second 'high available' unit will generally not prevent you from
> dropping the critical load, but instead, will help you get back on line
> quicker.
> 
> A much cheaper and easier to implement external maintenance
> make-before-break bypass will accomplish the same thing.
> 
> I've heard many a story of the paralleling gear causing the problem in the
> first place, as well...
> 
> 
> 
> -- Alex Rubenstein, AR97, K2AHR, [email protected], latency, Al Reuben --
> --    Net Access Corporation, 800-NET-ME-36, http://www.nac.net   --
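
The trade-off raised in this thread (redundant units versus the extra gear needed to parallel them) can be put into rough numbers. The sketch below is only an illustration, not anything posted by the authors: it assumes the redundant units fail independently, that the shared paralleling gear sits in series with them, and every availability figure in it is made up for the example. The function names are hypothetical.

```python
def parallel_availability(unit: float, n: int = 2) -> float:
    """Availability of n independent units when any one can carry the load."""
    return 1.0 - (1.0 - unit) ** n


def redundant_system_availability(unit: float, shared: float, n: int = 2) -> float:
    """Availability of n parallel units behind shared gear that is in series.

    'shared' is the availability of the paralleling/switching gear. If that
    gear fails, the load drops no matter how many units are healthy.
    """
    return shared * parallel_availability(unit, n)


def break_even_shared_availability(unit: float, n: int = 2) -> float:
    """Minimum shared-gear availability for redundancy to beat one bare unit."""
    return unit / parallel_availability(unit, n)


if __name__ == "__main__":
    unit = 0.999  # assumed availability of a single UPS (illustrative only)
    for shared in (0.9985, 0.9999):  # assumed paralleling-gear availabilities
        total = redundant_system_availability(unit, shared)
        verdict = "better" if total > unit else "worse"
        print(f"shared gear {shared}: system {total:.6f} ({verdict} than one unit)")
    print(f"break-even shared gear: {break_even_shared_availability(unit):.6f}")
```

Under these assumptions, the redundant pair only pays off if the shared gear is more reliable than roughly a single unit. A common-mode event such as a fire followed by an EPO is not modeled here and would reduce the benefit further.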