North American Network Operators Group

RE: Help with bad announcement from UUnet

  • From: David Luyer
  • Date: Sun Mar 31 14:05:06 2002

> What would work better/faster?
> 
> my-noc -> b0rken-noc
> 
> or
> 
> my-noc -> my-upstream-noc -> b0rken-noc-upstream-noc -> b0rken-noc
> 
> ?

OK, rant time (blame the Easter long weekend... a 4 day weekend down
here... and the associated excessive alcohol)...

General comment: the below isn't meant to reflect badly on any of our
past or present providers or peers... and for the most part the problems
mentioned relate to previous suppliers, so please don't try to guess
who they could be about :-)

This becomes much more relevant when you're not in America.  Often a company
in, say, Singapore or New Zealand may manage an Australian company's
connection to the US internet.  And then said Australian company may
have a problem connecting through the US internet to, say, China via
Japan (which the company I work at doesn't do anymore - one of our
providers now has connectivity via Singapore to China which is much
better, but that still isn't the case for many in Australia).

You want to think how many NOCs and language barriers there can be in
that path?  And peering relationships, timezone changes (harder to get
good engineers sometimes, and 24 hour NOCs aren't common in many
countries), etc?

Or, we can directly contact a NOC in southern China and get resolution,
as well as a very satisfied customer, because all his other upstreams
tried the NOC-to-upstream-NOC route through a massive number of NOCs
and none of them could resolve the issue.  The problem is that when you
take this approach you have to be very sure of which AS is causing the
real problem, and/or what the real problem is.  Calling your upstream's
upstream and telling them to tune their tx-ring-limits is another
example: your direct upstream at the time may not have heard of such a
thing, so couldn't relay the fault in a way that let the remote NOC
work out what the problem was and how to fix it.  (Of course, the
better thing to tell the provider in question would have been "don't
try and put that many OC3 cards in a 7206!")

Admittedly the escalation in the southern China case (which wasn't
our standard problem with providers in China turning on routers which
make classful assumptions, and us having some 61.* IP space) was:

customer's customers -> customer's NOC -> our NOC ->
   problem site's upstream's NOC (who liaised with problem site,
   and fortunately spoke English - the problem site didn't, but
   if it had been an issue, our customer's NOC had offered to
   translate)

but that cut out a _lot_ of NOCs.  To me there's some maximum number
of NOCs to be involved in a problem to coordinate well, and it is
around 4 (end ISP NOC, their upstream NOC to confirm the problem,
problem site's upstream NOC to enforce fixing of problem, problem
site's actual NOC), which then becomes 3 in the case where the
problem network is someone like Sprint, AT&T or UUnet who we
wouldn't consider to actually have an "upstream" (and for the
record, in the cases where I've had to, I haven't had a problem dealing
with those three directly even though we're not a customer; maybe I've
just been lucky).
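
On the classful-assumption problem mentioned in passing above, here is
a minimal Python sketch of one way it can bite.  It assumes the failure
mode is a router deriving the netmask from the first octet instead of
using the announced prefix length; the function name and the
61.8.0.0/17 prefix are made up for illustration and are not our real
allocation.

```python
import ipaddress


def classful_network(addr: str) -> ipaddress.IPv4Network:
    """Return the network a classful-only router would assume for addr."""
    first_octet = int(ipaddress.IPv4Address(addr)) >> 24
    if first_octet < 128:
        prefix = 8    # class A
    elif first_octet < 192:
        prefix = 16   # class B
    elif first_octet < 224:
        prefix = 24   # class C
    else:
        raise ValueError("class D/E addresses have no classful network")
    # strict=False masks off the host bits for us
    return ipaddress.ip_network(f"{addr}/{prefix}", strict=False)


# Hypothetical CIDR block carved out of 61/8 (illustration only)
allocated = ipaddress.ip_network("61.8.0.0/17")
assumed = classful_network("61.8.0.1")

print(allocated, "is treated as", assumed)
```

Anything in 61.* falls in the old class A range, so a box making
classful assumptions sees all of 61/8 as a single network and the /17
boundary is lost.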

An Australian company who is being directly affected by a problem may
keep good staff on until the right time to contact a US or other
international NOC directly during their working hours and get decent
staff, rather than waiting on all the various NOCs to miscommunicate
the problem across various hops.

Another problem is "follow the clock" NOCs, and trying to call at the
right time to get a US operator, since operators in the UK or Singapore
at a certain ISP had pretty much no access to their routers and could
do nothing more than email the US staff and hope for some resolution
12 hours later.  The country that took the call owned the problem, but
had to pass it off internally, then wait until that country was active
again to call the customer back, with that cycle repeated a few times
to convince them of the actual fault.  Glad I don't deal with that
particular company anymore :-)

I haven't had a problem getting a trouble ticket from large US
providers even though we're far removed from being a customer.  And
we've found the "trouble" has been things as lame as a certain large US
provider putting a /32 static blackhole in one of their routers.  In
one case, following the "correct" escalation path NOC to NOC (since it
was minor and worked around) did nothing for a week, while a direct
email to their US NOC (email in that case, calls are for more urgent
issues :-)) got the problem fixed within an hour.

The only group in the US I've found hard to deal with on anything
related to internet operations was NetSol/VeriSign: a bad experience
and a waste of international calls.  They had no intent to deal with a
_customer_ in a timely manner over an urgent change (a domain change
for a company who had just gone into liquidation and were about to lose
the routability of their IP space in 48 hours, while NetSol's systems
weren't accepting IP changes for the nameservers due to what turned out
to be design problems in their database application - a permission
update for changing info needs, in some cases, 24+ hours to propagate
internally before you can make changes under the new permissions...
ugly).

David.