North American Network Operators Group
Re: Converged Networks Threat (Was: Level3 Outage)
Petri,

>> I think it has been proven a few times that physical fate sharing is
>> only a minor contributor to the total connectivity availability while
>> system complexity mostly controlled by software written and operated by
>> imperfect humans contribute a major share to end-to-end availability.

Yes, and at the very least that would seem to match our intuition and experience.

>> From this, it can be deduced that reducing unnecessary system
>> complexity and shortening the strings of pearls that make up the system
>> contribute to better availability and resiliency of the system. Diversity
>> works both ways in this equation. It lessens the probability of same
>> failure hitting majority of your boxes but at the same time increases
>> the knowledge needed to understand and maintain the whole system.

No doubt. However, the problem is: What constitutes "unnecessary system complexity"? A designed system's robustness comes in part from its complexity. So it's not that complexity is inherently bad; rather, if you push a system's complexity past some point, you wind up with extreme sensitivity to outlying events, which shows up as catastrophic cascading failures. These are the so-called "robust yet fragile" systems (think NE power outage).

BTW, this extreme sensitivity to outlying events, with its catastrophic cascading failures, is a signature of a class of dynamic systems of which we believe the Internet is an example. Unfortunately, the machinery we currently have (in dynamical systems theory) isn't yet mature enough to provide us with engineering rules.

>> I would vote for the KISS principle if in doubt.

Truly. See RFC 3439 and/or http://www.1-4-5.net/~dmm/complexity_and_the_internet. I also said a few words about this topic at NANOG26, where we had a panel on it (my slides are at http://www.maoz.com/~dmm/NANOG26/complexity_panel).

Dave