North American Network Operators Group

Re: how to write an incident report

  • From: Simon Lyall
  • Date: Sat Oct 20 18:56:30 2007

On Sat, 20 Oct 2007, Joe Abley wrote:
> I've had a few responses like this, but I don't buy it. I've worked
> in many places, some in New Zealand and more elsewhere, where there
> was a general culture of fear about making public statements about
> operational incidents. I don't ever remember people sending proposed
> text to legal and having it pushed back with changes; what happened
> instead was that text wasn't written in the first place.

Of course it wasn't. The only time public statements beyond the simple
"Network Status" update are made is when the outage is so huge that the
news media report it. Then the idea is to spin the problem as a freak
occurrence that no amount of money and planning (which the company of
course spent years and millions doing) would have prevented.

Legal and PR are going to take one look at the report and then ask what
the upside for the company is in releasing it. In most cases there will be
none, so it won't happen. Techs know this, so they don't even bother.

In reality a large percentage of outages happen for "dumb" reasons and
publicising them just makes the company look bad (look at the previous
fault on the page).

Look at this Citylink outage: I'm sure the sales guys for rival companies
are right now working on their pitches for their customers' business
based on what has been posted.

"Look at these guys, they took down half the city and still don't know it
wasn't caused by hackers. Half the government was offline [1] all day
because they couldn't even get into their building after hours. Their
phones were off, their mail servers stopped working, they couldn't log in
to their network themselves, and their websites were offline. They've
been having these sorts of outages on a smaller scale for years and just
ignored them because they only affect one or two customers at a time."

[1] Roughly: Beehive = Whitehouse, RBNZ = Federal reserve, Bowen St = Parliament.

> Maybe Simon's level of detail is such that no legal department would
> ever condone it. But there's such a tremendous distance between
> Simon's text and the usual "there are no known issues at this time"
> that I suspect people just aren't trying.

Well I was pleasantly surprised at 365 Main's explanation of the problem
a while back.

But once again, that was a major event that couldn't be hidden.

Citylink is a slightly unusual company in its level of openness (although
getting less so) but I would guess that most people on this list would be
fired if they posted something like Simon's text without running it by

Simon Lyall  |  Very Busy  |  Web:
"To stay awake all night adds a day to your life" - Stilgar | eMT.