North American Network Operators Group


Re: Operate until failure

  • From: Josh Richards
  • Date: Mon Jan 08 19:16:19 2001

* Sean Donelan <[email protected]> [20010108 15:05]:
> 
> And what if you are not using APCs?

But still standalone UPSes?  Don't most data centers have larger UPS(es) or
battery plants (say, two) feeding the entire facility?  The ones I've worked
in have (well, not *all* of them, but those exceptions had much bigger
issues than worrying about how they were going to shut down all of the boxes
at once...)  And if you aren't using standalone UPSes, what do you care what
the interface to the BigUPS(tm) is, as long as you can get one of your network
monitoring servers to talk to it (and reliably)?  None of the servers in the
server farm are going to be talking to your BigUPS(tm) directly anyway.

> One issue with highly redundant data centers is the failure modes are
> "interesting."  You don't want to shutdown due to a single UPS failure, so
> you don't use something simple like PowerChute Plus.  You most likely don't
> want to shutdown based on any automatic signal.  However, you do want a way
> for an operator to gracefully shutdown a lot of equipment quickly when
> the decision is made.

Agreed.  And in this case, the UPS has no involvement.  If the operator
wants the servers shut down, the operator shuts the servers down.  No UPS
involved (OK, well not literally).  I realize this doesn't address your
entire point... one sec, I'll get to that.

> For a server farm, with potentially thousands of individual systems, is
> there any standard piece of software you can install on all of the systems
> to act as a receiver of a signal to begin a graceful shutdown that does
> not depend on a vendor's proprietary interface?  Preferably one which
> does not involve running a lot of additional wires.

Sure, ssh/rsh[1]. :-) Which vendor's proprietary interface -- the OS vendor of
the servers?  The UPSes don't have anything to do with the shutdown process
if the operator is the one making the call.  To accomplish that, it's a simple
matter of scripting a bunch of:

    ssh webserver01 'shutdown -h now Power-Go-Bye-Bye'
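
As a rough sketch of what "scripting a bunch of" those could look like: the
function below reads hostnames from a file and fires the same shutdown command
at each one in parallel.  The hosts file (one hostname per line), the
SSH_CMD override, the ssh options, and the name shutdown_all are all
assumptions made for illustration, and it presumes key-based root access to
every box.

```shell
#!/bin/sh
# Sketch only: fan a graceful shutdown out to every host listed in a file.
# Assumed, not from the original post: the hosts file format, the SSH_CMD
# override (set SSH_CMD=echo for a dry run), and key-based root ssh access.

SSH_CMD="${SSH_CMD:-ssh -o BatchMode=yes -o ConnectTimeout=10}"

shutdown_all() {
    hosts_file="$1"
    # The "|| [ -n ... ]" part picks up a last line with no trailing newline.
    while IFS= read -r host || [ -n "$host" ]; do
        # Skip blank lines and comment lines.
        case "$host" in
            ''|'#'*) continue ;;
        esac
        # Run each shutdown in the background; stdin comes from /dev/null so
        # ssh cannot swallow the rest of the hosts file.
        $SSH_CMD "$host" 'shutdown -h now Power-Go-Bye-Bye' </dev/null &
    done < "$hosts_file"
    # Block until every background ssh has returned.
    wait
}
```

Calling shutdown_all with the path to the hosts file does the real thing;
setting SSH_CMD=echo first just prints what would be run.  With thousands of
hosts you would probably want to cap how many ssh sessions run at once, which
this sketch does not do.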

Of course, if you have unmanaged boxes (e.g. customer boxes you do not have
root access to) within the same data center, and you want to do the same for
those, that's a whole other story...

Oh, hmm, and Windows.  Well, remote command execution is possible there too 
from my understanding. 

At that point, once all servers are gracefully shut down, you can just shut the
UPS(es) off if your intent is to eventually cut any and all power to the
facility.

Or did I completely miss your point?

> Again this is only needed if people want a graceful shutdown.  If
> you can live with a hard shutdown, you wouldn't require this.  If you
> use ctrl-alt-del as a normal management practice, I suspect you don't
> really require a graceful shutdown.

I'm being anal, but even ctrl-alt-del is graceful on most modern OSes. The
power or reset button, on the other hand...  :-)

[1] rsh is only mentioned for historical reasons; please don't use it to manage
the remote power capability of your mission-critical server farm located
in your highly redundant data center unless you understand why you might
consider not doing so. :)

-jr

----
Josh Richards [JTR38/JR539-ARIN]
<[email protected]/cubicle.net/fix.net/freedom.gen.ca.us>
Geek Research LLC - <URL:http://www.geekresearch.com/>
IP Network Engineering and Consulting
