North American Network Operators Group

Re: Converged Networks Threat (Was: Level3 Outage)

  • From: Matthew Crocker
  • Date: Wed Feb 25 13:45:32 2004

I'm saying that if a network had an FR/ATM/TDM failure in the past
it would be limited to just the FR/ATM/TDM network (well, aside from
any IP circuits that are riding that FR/ATM/TDM network). We're now seeing
the change from the TDM based network being the underlying network to the
"IP/MPLS Core" being this underlying network.

What it means is that a failure of the IP portion of the network
that disrupts the underlying MPLS/GMPLS/whatnot core, which is now
transporting these FR/ATM/TDM services, does pose a risk. Is the risk
greater than in the past, relying on the TDM/WDM network? I think that
there could be some more spectacular network failures to come. Overall
I think people will learn from these to make the resulting networks
more reliable (e.g., a lot has been learned as a result of the
NE power outage last year).

Internet traffic should run over an IP/MPLS core in a separate session (VRF, virtual context, whatever) so the MPLS core never sees the full BGP routing information of the Internet. So long as router vendors can provide proper protection between routing instances, so that one virtual router can't consume all memory/CPU, the MPLS core should be pretty stable. The core MPLS network and control plane should be completely separate from regular traffic and much less complex for any given carrier. VoIP, Internet, EoM, AToM, FRoM, TDMoM should all run in separate sessions, all isolated from each other.

A router should act like a Unix machine, treating each MPLS/VRF session as a separate user: isolating and protecting users from each other, and providing resource allocation and limits. I'm not sure of the effectiveness of current-generation routers, but it should be coming down the line.

That said, the IP/MPLS core should be more stable than traditional TDM networks. The Internet itself may not stabilize, but that shouldn't affect the core. What happened at L3 was an Internet outage; in theory that shouldn't affect the MPLS core.

Think back 10 years, when it was common for a Unix binary to wipe out a machine by consuming all resources (fork bombs, anyone?). Unix machines have come a long way since then. Routers need to follow the same progression. What is the routing equivalent of 'while (1) { fork(); };'? Currently it is massive BGP flapping that chews up resources. A good router should be immune to that, and can be with proper resource management.
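
To make the "each VRF is a separate user with its own limits" idea concrete, here is a minimal toy sketch in Python. It is not any vendor's implementation or API: the RoutingInstance class, the QuotaExceeded exception, the instance names, and the budget numbers are all hypothetical, and memory/CPU are stood in for by simple route and update counters. The only point is that a flap storm in one instance exhausts that instance's own budget and nothing else.

```python
from dataclasses import dataclass, field


class QuotaExceeded(Exception):
    """Raised when a routing instance tries to exceed its own budget."""


@dataclass
class RoutingInstance:
    # Hypothetical model of one VRF / virtual router (not a vendor API).
    name: str
    max_routes: int            # stand-in for a per-instance memory budget
    max_updates_per_tick: int  # stand-in for a per-instance CPU budget
    routes: dict = field(default_factory=dict)
    updates_this_tick: int = 0
    dropped_updates: int = 0

    def apply_update(self, prefix, next_hop):
        """Install a route, or withdraw it when next_hop is None."""
        if self.updates_this_tick >= self.max_updates_per_tick:
            # Over budget: only this instance is penalised.
            self.dropped_updates += 1
            raise QuotaExceeded(f"{self.name}: update budget exhausted")
        self.updates_this_tick += 1
        if next_hop is None:
            self.routes.pop(prefix, None)
        elif prefix in self.routes or len(self.routes) < self.max_routes:
            self.routes[prefix] = next_hop
        else:
            self.dropped_updates += 1
            raise QuotaExceeded(f"{self.name}: route table full")

    def end_tick(self):
        """Start a new accounting interval."""
        self.updates_this_tick = 0


def simulate_flap_storm():
    """Flap one prefix 10,000 times in 'internet'; 'mpls-core' is untouched."""
    core = RoutingInstance("mpls-core", max_routes=100, max_updates_per_tick=50)
    internet = RoutingInstance("internet", max_routes=1000,
                               max_updates_per_tick=200)

    # The core only carries a handful of internal routes.
    for i in range(10):
        core.apply_update(f"10.0.0.{i}/32", "core-peer")

    # Routing equivalent of a fork bomb: announce/withdraw in a tight loop.
    for i in range(10_000):
        next_hop = None if i % 2 else "peer-a"
        try:
            internet.apply_update("192.0.2.0/24", next_hop)
        except QuotaExceeded:
            pass  # the storm is absorbed by the instance's own limits

    return core, internet


if __name__ == "__main__":
    core, internet = simulate_flap_storm()
    print(core.name, len(core.routes), core.dropped_updates)
    print(internet.name, internet.updates_this_tick, internet.dropped_updates)
```

In this toy model the accounting is per instance by construction, which is the same property Unix gets from per-user limits. A real router would also have to enforce the separation in the scheduler and the memory allocator, which is the hard part this sketch leaves out.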