North American Network Operators Group

Re: NOC communications (was Re: Process management)

  • From: John Todd
  • Date: Sun Jul 12 23:36:17 1998

At 04:30 PM 7/11/98 -0500, Sean Donelan wrote:
>>> >establish an out-of-band communication system [...] connected via
>>> >nationwide system which doesn't rely on IP, ATM, or SS7
>>> I'm not sure where you'd find such a beast anymore.  Maybe an SS7 and
>>> an Iridium system, side-by-side, would do the trick.
>>HF radio, more likely. Those of you without roof-rights, you better
>>ask building management about getting same for the triband beam....
>Actually I was thinking of a combination of 'order wires' with a combination
>of VSAT and in-band network conference bridges for backup.  However several
>people have indicated that level of backup was too expensive.
>This problem isn't really unique to the Internet.  I was looking at the
>backup communications network for SS7 providers and noticed a couple
>of major SS7 network providers don't give out their direct contact
>information even to other SS7 network operators anymore.  They changed
>their listings in the emergency directory to general customer service

You bring up an interesting (if unspoken) point in your prior statement,
and that is: "Communications between entities is worthwhile if the
communications facility has a high signal-to-noise ratio."   While I am not
certain, I strongly suspect that the SS7 NOCs pointed their emergency
directory listings at the general number because of useless, misdirected
calls.  A NOC cannot afford to play secretary.  The same is the case with
the administrator of an AS number - this is the real point of contact that
providers use to work out issues with each other.

Therefore, to avoid this problem, one must limit the ubiquity of the
contact mechanism and increase the value of each message.  A phone number
is of course the standard contact method in any emergency (email is great,
but it lacks a rapid question-answer-experiment ability), but the ease of
use of a phone works against it as well as for it.  A phone number gets
handed out on web sites, "emergency call" sheets, etc., and soon people
with nothing directly related to operations are calling the operations hot
line.  Either more staff is required to answer these calls, or (more
likely) the "hot line" becomes not-so-hot: it goes unanswered, or is not
taken seriously, or simply nobody cares about it and it gets forwarded to
the Void.

Email has the same problems, but with the benefit of being a one-to-many
broadcast service, where a phone is point-to-point.  (email = multicast
messaging?  Better not go there. ;)   However, email is not what you'd want
to use in the event of an outage (for obvious reasons), either for incoming
notifications or for outgoing queries.   Additionally, email is more
easily ignored by a group of people than it is by an individual (how many
times have you said "I thought you took care of that problem!" to someone
on your "staff" list?)

The Fantasy:
   Using a REAL out-of-band system for notifications of regional or
"national" outages would be the best method for inter-provider
communication, IMHO.  However, Iridium phones, HF radios, or carrier
pigeons are all too complex and too expensive to deploy anywhere outside
the largest ISP/NSP circles.  While I'd LOVE to see
a line printer banging away in a corner on a 2400 baud Ku band
"gas-station" style uplink, I don't see that happening.  Besides, someone
would have to centrally coordinate that - you think ARIN would mind taking
on a multi-national satellite communications network?  Sure - just hike the
price for an AS number to, say, $4000 a year and make it mandatory that
each AS has one of these systems in house.   Right.

The Reality:

  Reality is ugly.  The phones and email are all we've got to work with.
The trick is to make them more useful.

Part 1:  ARIN should be a fascist about maintaining correct information.  The
best method for allowing interprovider communications to work smoothly is
to keep the NOC phone numbers and NOC email addresses valid and responsive.
Maintainers of ASes should be correctly identified in the ARIN database.
Phone numbers should be audited.  I'm a big fan of "turn it off and see who
comes running" update procedures.  If someone doesn't respond within a
reasonable number of attempts, or information is out of date for X period
of time, enforce policy by denial of service.   (I will leave "turn off" as
an exercise for the reader - this could merely be administrative, or it
could actually be a DoS on the AS.  You could even do it during
more-or-less standard, pre-announced early-morning maintenance windows -
enough pain for a few hours to encourage repair, but not total and
permanent disability.)
   I guarantee that this will cause apoplexy, foaming at the mouth, and
certain members of this list threatening lawyer/orifice interaction with
ARIN or perhaps wondering if that's my black Sikorsky parked on the pad out
back... but the database will quickly come up to date, and I'll bet that
nobody is actually turned off.  The time saved during real DoS attacks,
security breaches, peering problems, and route-mangling will far outweigh
the dollars actually lost to policy-enforcement outages.
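
For the sake of argument, the audit rule above could be sketched roughly
like this.  It is only an illustration: the thresholds, the record layout,
and every name in it are hypothetical (the "reasonable number of attempts"
and the "X period of time" are deliberately left open above), and nothing
here reflects an actual ARIN procedure.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical thresholds: the post leaves both "a reasonable number of
# attempts" and "X period of time" unspecified.
MAX_FAILED_ATTEMPTS = 3
MAX_STALENESS = timedelta(days=365)


@dataclass
class AsContact:
    asn: int
    noc_phone: str
    last_verified: date       # last time the contact info was confirmed
    failed_attempts: int = 0  # consecutive audit calls with no response


def audit_action(contact: AsContact, today: date) -> str:
    """Return 'enforce', 'retry', or 'ok' for one AS contact record."""
    stale = (today - contact.last_verified) > MAX_STALENESS
    unresponsive = contact.failed_attempts >= MAX_FAILED_ATTEMPTS
    if stale or unresponsive:
        # What "turn off" means (administrative or otherwise) is policy,
        # not something this sketch decides.
        return "enforce"
    if contact.failed_attempts > 0:
        return "retry"
    return "ok"
```

A record verified last month with no missed calls comes back "ok"; one
missed call gives "retry"; three missed calls, or a record nobody has
confirmed in over a year, gives "enforce".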

Part 2:  Separate your handles for your domain names and your AS number.
A smart provider will have a friendly front-end (perhaps a general number)
on their domain name, since this is what the majority of the world will
look at when trying to get a point of contact.  Joe-user with standard spam
complaints or sales questions will get a (hopefully) responsive
interaction, but he won't be talking to the NOC at 3:00AM when he can't get
to the Doom server off your network.   The contact information for the AS
number should be direct, always staffed, and clued appropriately.   Even
forwarding it to a pager is better than RNA (ring, no answer).  There are
spaces for "alternate contacts" on the ARIN forms - _use them_.  You might
have a head routing person listed as the #1 option, but put your NOC as #2 -
it's free!
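
As a rough picture of the split between the two handles, something like
the following would do.  The class names, fields, and sample contacts are
all made up for illustration; this is not the layout of any real registry
record.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Handle:
    """A registry handle with an ordered contact list (primary first)."""
    name: str
    contacts: List[str] = field(default_factory=list)


@dataclass
class Provider:
    domain_handle: Handle  # friendly front end: spam complaints, sales
    as_handle: Handle      # direct operational contacts, always staffed


def contacts_for(provider: Provider, operational: bool) -> List[str]:
    """Return the contacts to try, in order, for a given kind of query."""
    handle = provider.as_handle if operational else provider.domain_handle
    return list(handle.contacts)


# Hypothetical provider: general number on the domain, routing lead then
# NOC (the "alternate contact") on the AS.
example = Provider(
    domain_handle=Handle("EXAMPLE-DOM", ["general number"]),
    as_handle=Handle("EXAMPLE-AS", ["head routing person", "NOC"]),
)
```

Here contacts_for(example, operational=True) yields the routing lead
first and the NOC second, while a non-operational query only ever sees the
general number.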

"Operational" enough?  Long-winded enough?