Re: Level 3's side of the story

North American Network Operators Group

From: Richard A Steenbergen
Date: Sat Oct 08 15:08:37 2005
On Sat, Oct 08, 2005 at 07:24:06AM -0700, JC Dill wrote:
> 
> Cogent was a "tier 1" until prior de-peering incidents left them unable 
> to reach other networks.  They solved this by buying filtered transit 
> thru Verio to reach the networks they couldn't reach via peering.

For the record, Cogent was never a Tier 1. They have never had Sprint 
peering (unless you count the 30 seconds between acquisition of a company 
that did have it, and the depeering notice, years ago). Cogent's history 
of depeering debacles, at least as best as I can remember them, is:

ATDN (AS1668) depeers Cogent, December 18 2002.
http://www.cctec.com/maillists/nanog/historical/0212/msg00366.html
http://www.cctec.com/maillists/nanog/historical/0212/msg00412.html

ATDN is in the process of shutting off legacy transit and peering on its 
path to tier-1-dom, and disconnects Cogent due to ratio (also technically 
Cogent is still on a trial peering session). At this time, Cogent is a 
full transit customer of AboveNet (AS6461), ATDN is still a full transit 
customer of Level 3, and Cogent is a peer of Level 3. Following the 
depeering, Cogent shifts 100% of the traffic to their (3) peers, which 
become severely congested nearly 24/7 for several weeks. Despite being 
able to send some traffic to AboveNet transit, they decide to leave 
traffic congested to (3) to see if ATDN will repeer (not knowing that AOL 
customers don't know what peering is and thus won't be nearly as vocal as 
Cogent's customers). Traffic stays congested until Cogent's peering 
capacity with (3) is upgraded. ATDN later switches their routing with (3) 
from transit to customer-only routes (removing the last of their transit 
paths), at which point Cogent shifts traffic to newly acquired Verio 
transit to reach them.



Teleglobe (AS6453) depeers Cogent, some time in Feb 2005?
Don't ask me why but I can't find a NANOG thread discussing this.

Teleglobe depeers Cogent due to various ratio and market pressure issues. 
Of note is that Cogent has recently entered the Canadian market where 
Teleglobe has a strong presence, and has started giving away free or 
nearly free transit to large inbound networks. Teleglobe is a Sprint 
customer, and Cogent reaches Sprint through Verio. Teleglobe is caught 
completely off-guard when Cogent refuses to accept the route via Sprint 
transit, and blocks traffic between the networks. This continues for 
several days, until eventually routes are leaked/added from Teleglobe to 
SAVVIS (AS3561), who Cogent peers with. This continues for a few days more 
until Teleglobe finally agrees to repeer Cogent.



France Telecom (OpenTransit/AS5511) depeers Cogent, April 14 2005
http://www.merit.edu/mailinglist/mailarchives/old_archive/2005-04/msg00484.html

FT depeers Cogent due to, well, a variety of issues and general 
unhappiness surrounding Cogent's entrance into their markets through the 
purchase of Lambdanet. FT is a Sprint customer, Cogent is already 
receiving Sprint routes via Verio but intentionally blocks these routes so 
that they have no path to FT. The rumored resolution to the dispute is 
that a FT customer sues Cogent in France, and a French judge either does 
or is about to fine the hell out of Cogent unless connectivity is 
restored. At this point Cogent caves, and begins accepting the routes via 
Sprint (via Verio).



Of course I am certain there are a lot more depeerings (both from and to 
Cogent) that did not make the news, but these are the big notable events 
that dramatically impacted connectivity. For anyone keeping score, the 
last two times Cogent was depeered, it responded by intentionally blocking 
connectivity to the network in question, despite the fact that both of 
those networks were Sprint customers and thus perfectly reachable under 
the Sprint transit Cogent gets from Verio. While no one has come forward 
to say if the Cogent/Verio agreement is structured for full transit or 
only Sprint/ATDN routes, Cogent has certainly set a precedent for 
intentionally disrupting connectivity in response to depeering, as a scare 
tactic to keep other networks from depeering them.


> L3 was hoping to force Cogent to increase that transit to include the 
> traffic destined for L3's customers, thus raising Cogent's transport 
> costs at no additional (transport) cost to L3.

As I've already pointed out, L3 depeering Cogent is in fact a major 
revenue loss for L3. Not only will they not make any money off of Cogent 
(since we both know Cogent will NEVER give them money for direct transit), 
but Cogent will heavily depref them and shift many many gigabits of 
traffic away from L3 and onto their competitors, traffic that L3 was 
previously billing their customers for. They'll also lose customers during 
the unreachability, and even if Cogent buckles and buys transit they'll 
lose some outbound traffic from their multihomed customers due to a longer 
AS-path length to reach Cogent and many of Cogent's routes (11k of them, 
remember).
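
To illustrate the AS-path point, here is a minimal sketch in Python. It 
assumes a hypothetical customer multihomed to L3 and to some other provider 
("OtherProvider", a made-up name) that still peers with Cogent, and it 
assumes the customer uses equal local-pref on both links so that AS-path 
length decides. "Transit" stands in for whichever network Cogent might buy 
from. Real BGP best-path selection has more steps than this.

```python
def best_path(paths):
    """Pick the route with the shortest AS path.

    Ties go to the first entry in the list, standing in for the
    later BGP tie-breakers that this sketch does not model.
    """
    return min(paths, key=lambda p: len(p["as_path"]))


# While L3 and Cogent peer directly, both of the customer's paths are 2 hops.
before = [
    {"via": "L3", "as_path": ["L3", "Cogent"]},
    {"via": "OtherProvider", "as_path": ["OtherProvider", "Cogent"]},
]

# If Cogent is only reachable from L3 through a transit network, the L3
# path grows by one hop while the other provider's path is unchanged.
after = [
    {"via": "L3", "as_path": ["L3", "Transit", "Cogent"]},
    {"via": "OtherProvider", "as_path": ["OtherProvider", "Cogent"]},
]

print(best_path(before)["via"])  # L3
print(best_path(after)["via"])   # OtherProvider
```

Under those assumptions the customer's outbound traffic toward Cogent 
moves off L3 as soon as the L3 path gets longer.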

Let me be perfectly clear here, under absolutely no line of logic will L3 
see an increase in revenue from this, period. If you think they will, you 
don't understand how the Internet works. What L3 will see from this is a 
REDUCTION IN BILLABLE TRAFFIC AND BACKBONE UTILIZATION.

> >3)  Possible traffic issues.  Was Cogent guilty of not transporting the 
> >Level3-bound packets within the Cogent network to the closest 
> >point-of-entry peer to the host in the Level3 network, therefore 
> >"costing" Level3 transit of their own packets?  
> 
> Possible, in fact probable.  Most ISPs hand off traffic to peers under a 
> "hot potato" policy, they hand it off at the closest point where they 
> connect.  If the traffic is equal in both directions then this works. 
> If the traffic is not equal, then this lowers the cost of the network 
> that has high outbound traffic, as the other network bears the brunt of 
> the total cost for transporting the combined traffic between their 
> respective customers.

Do you know why people hot potato traffic? Because MEDs suck. In addition 
to the obvious aggregation issues (for example, how do you assign a MED 
value to 4.0.0.0/8 when it is used around the world?), they usually end up 
producing sub-optimal routing. IGP cost is a view of what it costs YOU to 
get the packet off your network. A MED value set to the opposing network's 
IGP cost is a view of what it costs THEM to get the packet off their 
network. Neither is a complete view of reality, and the MED view just 
happens to be worse.

Consider a simple scenario. You operate a major network, you peer with 
someone who operates a major network, you both intelligently aggregate 
your prefixes and work with your customers to make certain that everything 
in BGP maps to a specific geographic region, and you both interconnect 
with each other in the usual "maximum reasonable extent possible" locations 
(New York, Ashburn, Chicago, Dallas, San Jose, Los Angeles, Seattle, 
Atlanta, Miami) across the US. Now let's say you have a customer who is in 
Chicago, and they're sending data to a customer in, oh, let's say Denver. In 
hot-potato routing, you get the packet off your network in Chicago, and 
then the other network uses its more complete and detailed understanding 
of where this packet is going within its own network to know that 
Chicago->Denver is a straight shot.

In a cold potato situation however, you are only looking at the other 
network's IGP cost, not your own. Denver is pretty much dead center in the 
middle of San Jose, Chicago, and Dallas, and which one is "closer" is 
really up for grabs. On the vast majority of networks, Dallas is actually 
closest by IGP cost, with San Jose a close second, and Chicago a close 
third. If you're cold potato'ing to try and improve routing, even under 
the most ideal conditions possible (which given the current financial 
state of the carriers involved RARELY happens these days), you're going to 
end up hauling packets to some out of the way place like SJC or DFW, and 
then the other network is going to end up hauling packets back to Denver. 
You both lose the "saving money by hauling traffic less" game, and your 
customers lose in suboptimal routing.
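
A rough sketch of the Chicago-to-Denver case in Python, to make the 
comparison concrete. Every number below is a hypothetical latency in 
milliseconds chosen only to match the ordering described above (Dallas 
closest to Denver, then San Jose, then Chicago); none of it is measured 
from a real network, and it assumes both networks happen to count cost in 
the same unit.

```python
# Hypothetical costs in ms. own_cost is what it costs YOU to carry the
# packet from the Chicago customer to each exit; peer_med is what the
# other network says it costs THEM to carry it from that exit to Denver.
own_cost = {"Chicago": 0, "Dallas": 12, "San Jose": 25}
peer_med = {"Chicago": 22, "Dallas": 18, "San Jose": 20}


def hot_potato_exit():
    """Hot potato: leave at the exit that is cheapest for your own network."""
    return min(own_cost, key=own_cost.get)


def cold_potato_exit():
    """Cold potato: honor MEDs, leave at the exit cheapest for the peer."""
    return min(peer_med, key=peer_med.get)


def end_to_end(exit_city):
    """Total haul across both networks for a given exit."""
    return own_cost[exit_city] + peer_med[exit_city]


hot = hot_potato_exit()
cold = cold_potato_exit()
print(hot, end_to_end(hot))    # Chicago 22
print(cold, end_to_end(cold))  # Dallas 30
```

With these made-up numbers, honoring the MED saves the other network 4 ms 
of haul and adds 12 ms to yours, so the end-to-end path gets longer.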

The heart of the problem is that you need to consider your cost + their 
cost to have MEDs be effective, even if you solved all the implementation 
issues that you will never practically solve. Unfortunately since two 
networks have no way to coordinate metrics on the same "scale" (my 43ms 
may be 4300 igp cost, your 43ms may be 43, and joe bob's 43ms cost may be 
9182), you have no reliable way to "add" the two costs.
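
The scale problem can be sketched in a few lines of Python. The exits and 
latencies are hypothetical; the two scale factors are taken from the 
example above (4300 for 43ms on one network, 43 for 43ms on the other). 
The point is only that adding raw metrics from two networks can pick a 
different exit than adding the real latencies would.

```python
# Hypothetical true latencies in ms for two candidate exits.
my_ms = {"ExitA": 5, "ExitB": 10}
peer_ms = {"ExitA": 30, "ExitB": 10}

# Each network expresses the same latency on its own metric scale.
MY_UNITS_PER_MS = 100   # 43 ms shows up as IGP cost 4300
PEER_UNITS_PER_MS = 1   # 43 ms shows up as MED 43


def best_exit(my_cost, peer_cost):
    """Pick the exit with the lowest sum of the two cost tables."""
    return min(my_cost, key=lambda e: my_cost[e] + peer_cost[e])


# What you would choose if you could see real latency on both sides.
true_best = best_exit(my_ms, peer_ms)

# What you actually have: your IGP cost and their MED, on different scales.
my_igp = {e: ms * MY_UNITS_PER_MS for e, ms in my_ms.items()}
peer_med = {e: ms * PEER_UNITS_PER_MS for e, ms in peer_ms.items()}
naive_best = best_exit(my_igp, peer_med)

print(true_best)   # ExitB (5+30=35 ms vs 10+10=20 ms)
print(naive_best)  # ExitA (500+30=530 vs 1000+10=1010)
```

Here the network with the larger numbers dominates the sum, so the naive 
addition degenerates into plain hot potato and misses the better exit.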

Now, networks who are looking for equity in the ratios have a choice. They 
can either:

* Spend thousands of man hours deaggregating (and then listen to 
  you complain about polluting the routing table with prefixes)
* Spend millions of dollars deploying more gear into more interconnection 
  locations in areas of network presence but not peering presence (Denver, 
  St Louis, Kansas City, New Orleans, Tampa, Phoenix, etc etc etc), all 
  in areas without well defined peering locations where they are likely 
  to end up in buildings across the block but which cost thousands of 
  dollars to connect, or
* Establish these as smaller interconnections across telco circuits, 
  again spending thousands of dollars a month more in circuits, hundreds 
  of thousands of dollars in ports, tens of thousands of man hours 
  managing capacity at five dozen new interconnections around the world, 
  all while reducing to almost zero the ability for a major 
  interconnection to fail over to another major interconnection during a 
  maintenance, fiber cut, network event, etc.
* Break their customers' routing in the process of doing all this.

-OR-

* Depeer said network, expect that they will buy transit from Verio or 
  any of the other dozens of networks who provide this service, and that 
  whoever ends up interconnecting with them to deliver the traffic will 
  have equitable traffic.

Now, which one do you think they're going to pick?

> There are ways to deal with it though, like cold potato routing.

Spoken like someone who has never dealt with the reality of running a 
large network, or dealt with customers wondering why you are routing their 
traffic across the country and back again. Anyone who values the quality 
of their connectivity will keep this to armchair engineering and not 
actually build a network this way.

-- 
Richard A Steenbergen <[email protected]>       http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)