North American Network Operators Group


Re: Converged Networks Threat (Was: Level3 Outage)

  • From: David Meyer
  • Date: Wed Feb 25 14:30:29 2004


>> I think it has been proven a few times that physical fate sharing is 
>> only a minor contributor to the total connectivity availability, while 
>> system complexity, mostly controlled by software written and operated by 
>> imperfect humans, contributes a major share to end-to-end availability.

	Yes, and at the very least that would seem to match our
	intuition and experience. 

>> From this, it can be deduced that reducing unnecessary system 
>> complexity and shortening the strings of pearls that make up the system 
>> contribute to better availability and resiliency of the system. Diversity 
>> works both ways in this equation. It lessens the probability of the same 
>> failure hitting the majority of your boxes but at the same time increases 
>> the knowledge needed to understand and maintain the whole system.
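
A rough illustration of the "strings of pearls" point above, as a
sketch only: it assumes a purely serial chain whose elements fail
independently, which the poster does not state. Under those
assumptions, end-to-end availability is the product of the per-element
availabilities, so every added element can only lower it. The function
name and the 0.999 per-element figure are made up for illustration.

```python
from math import prod


def series_availability(availabilities):
    """End-to-end availability of elements in series.

    Assumes independent failures: the chain is up only when every
    element is up, so the individual availabilities multiply.
    """
    return prod(availabilities)


# Hypothetical per-element availability of 99.9% ("three nines").
short_chain = series_availability([0.999] * 3)
long_chain = series_availability([0.999] * 10)

print(f"3 elements in series:  {short_chain:.6f}")
print(f"10 elements in series: {long_chain:.6f}")
```

This says nothing about the cascading-failure behaviour discussed in
the reply; correlated failures break the independence assumption.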

	No doubt. However, the problem is: What constitutes
	"unnecessary system complexity"? A designed system's
	robustness comes in part from its complexity. So it's not
	that complexity is inherently bad; rather, it is just
	that you wind up with extreme sensitivity to outlying
	events which is exhibited by catastrophic cascading
	failures if you push a system's complexity past some
	point; these are the so-called "robust yet fragile"
	systems (think NE power outage).  

	BTW, the extreme sensitivity to outlying events/catastrophic
	cascading failures property is a signature of a class of
	dynamic systems of which we believe the Internet is an
	example; unfortunately, the machinery we currently have
	(in dynamical systems theory) isn't yet mature enough to
	provide us with engineering rules.    

>> I would vote for the KISS principle if in doubt.

	Truly. See RFC 3439. I also said a few words about this
	topic at NANOG26, where we had a panel on this topic
	(my slides on