North American Network Operators Group

RE: Quick question.

  • From: Michel Py
  • Date: Sat Jul 31 23:54:10 2004

> Alexei Roudnev wrote:
> We had a 6509 which failed because the backplane failed (it
> cannot happen -:) but it happened) - of course, no 'dual
> CPU dual power' could prevent it... Imagine a broken line card
> - it can crash the whole box no matter how many 'dual' things
> you have. The same with a software error (I crashed one of
> the 6509s just by running 'snmpwalk' on it).

I lost a 7507 dual power dual RSP earlier this year: one of the cards
died, something in the power circuitry. It put the entire router in
short-circuit; both power supplies decided to go south and would not
power back up again until the faulty card was physically removed. After
the card was removed it worked fine again. It does not happen often, but
it does happen.

Redundancy is not a slam dunk with IOS though; same as dCEF, don't
expect RPR-compatible images to run every config you'll bump into, YMMV.
An annoying number of things either do not work on RPR images or fall
back to route cache instead of distributed cache.


> So, I always prefer to have 2 boxes and application level
> reliability instead of playing with 'dual everything'
> solutions (last example - 2 days ago one of our dual-power
> Intel servers failed because of 1 power supply failure -
> it did not break, but it did something wrong and the system
> crashed...).

Actually, what I try to do for routers is to have a "dual everything" for
production and an "el-cheapo eBay special" sitting in the same rack for
backup. The reason I still do dual power and dual CPU is that over the
last 20 years I have seen very few failures of redundant systems
(although I have seen some), while a dual-something has saved my bottom
several times. That part of my body is priceless :-D

For PCs, for example, I install dual Xeons on every production machine,
even though a 486 would provide enough CPU power for some of them. Intel
processors do die like anything else; a processor dying will typically
lead to a system crash, but the machine does reboot in single-processor
mode when the graveyard dude pushes the reset button. I also try to have
RAID-10 arrays span two RAID cards; same as CPUs, a RAID card that dies
will likely crash the system, but it will reboot in degraded mode.

Michel.