This is the first part of a series of posts on the history of WANs.
History of WAN – Part 1
(This is the second installment of a series of posts on the history of WANs. You can read the first post here.)
Expensive, deterministic technology evolves to redundant, inexpensive, highly probabilistic technology
A pendulum consistently swings in the world of technology evolution. A technology typically starts its first generation as a highly engineered, expensive, deterministic solution. As the marketplace evolves and competition grows, the need for cost reductions pushes the pendulum toward a much less expensive solution that is less deterministic and more probabilistic in nature. The initial low-cost solution is frequently too unreliable for many mission-critical uses, so the market is driven to improve the odds. As the market continues to mature, the technology is enhanced and the pendulum swings partway back to a higher-probability solution, where the likelihood of failure is lower but the economic cost is still reasonable.
When we use the term deterministic, we mean that the outcome of using the technology is settled before it is used. There is little to no chance that it will behave differently from its predetermined outcome. When we use the term probabilistic, we mean that the outcome is not entirely determined beforehand. There is some chance that the technology will not perform well, and the outcome is only really known by observing its actual use empirically (heuristically). "Low probabilistic" is a relative measure: there is a substantial chance that the technology will not perform well enough for its typical intended use. "Higher probabilistic" is also a relative measure: the technology has a very high chance of being satisfactory for most typical use cases.
Why, you may be wondering, would anybody not want to use deterministic technology at all times? The simple answer is that probabilistic technology is almost always far less expensive, and therefore more plentiful, than deterministic technology. When a cheap probabilistic alternative is developed, more consumers can use it, and the market expands rapidly. The probabilistic technology is cheap, but its chance of performing reliably is too low for many of the use cases that evolved during the earlier, more deterministic era. Techniques such as redundancy and closer optimization for the typical intended use cases can raise the chance of a successful outcome to a level that the broader market finds acceptable. This improvement does come at some incremental cost, but that cost is significantly less than the cost of the earlier deterministic alternative.
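To make the redundancy argument concrete, here is a minimal sketch in Python. It assumes that each unit fails independently of the others, which real systems only approximate, and the 90% figure is purely illustrative rather than taken from any product. The function name is hypothetical.

```python
def chance_at_least_one_works(p_single: float, copies: int) -> float:
    """Probability that at least one of `copies` identical units works.

    Assumes failures are independent of each other (an idealization).
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # The arrangement fails only if every single unit fails.
    return 1.0 - (1.0 - p_single) ** copies


if __name__ == "__main__":
    # Illustrative assumption: each cheap unit works 90% of the time.
    for copies in (1, 2, 3):
        print(copies, round(chance_at_least_one_works(0.9, copies), 6))
```

Under these assumptions, two units that each work 90% of the time give roughly a 99% chance that at least one works, and a third raises that to about 99.9%. Each added unit costs more, but the odds improve quickly with every copy.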
Let’s look at a few non-networking historical examples:
Mainframes were, and are, expensive. Today they can cost from a hundred thousand dollars to millions of dollars per unit. Because of that expense and the criticality of their function to their users, they are highly engineered for uptime and deterministic outcomes. Downtime is measured in seconds per year.
With the arrival of the microprocessor-based personal computer, the client/server world was born. PCs, even in the first generation, were relatively cheap at a few thousand dollars each. Many companies moved to PC platforms as cheap alternatives to expensive mainframes. It soon became evident that an off-the-shelf PC acting as a server was not up to the task for most enterprise users. Typical PCs running server software were not reliable enough for mission-critical use. They were cheap, but they failed far too often. So enhancements were made to increase redundancy and robustness within the server, at some incremental cost, to raise the probability that the solution would be reliable. To cite a few: redundant arrays of inexpensive disks (RAID) were introduced, redundant power supplies and battery backup systems were added, and central processors, buses, and memories (ECC) were improved for reliability. These enhancements added cost, but even so, the higher probabilistic servers cost substantially less than the deterministic mainframe equivalent. Mainframes still exist and are used for many of our everyday mission-critical needs, but microprocessor servers are a permanent player in the marketplace.
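For a sense of what "seconds per year" implies, the short Python sketch below converts an availability fraction into yearly downtime. The availability values are illustrative assumptions, not published mainframe figures, and the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000; leap years ignored


def downtime_seconds_per_year(availability: float) -> float:
    """Expected downtime per year for a given availability fraction."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * SECONDS_PER_YEAR


if __name__ == "__main__":
    # Illustrative availability levels, from modest to very high.
    for availability in (0.999, 0.9999, 0.99999):
        seconds = downtime_seconds_per_year(availability)
        print(f"{availability}: about {seconds:.0f} seconds per year")
```

Under this simple model, 99.999% availability works out to roughly 315 seconds of downtime per year, while 99.9% allows about 31,536 seconds, or close to 8.8 hours.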
Another example is computer memory. In the 1980s, the race was on to increase the speed of all computer memories. CPU speeds had advanced beyond the rate at which memories could feed them data, so CPUs were stalling while they waited for data to process. Memories were pushing the limits of physics and budgets. The result was very expensive Static RAM (SRAM) technology. Using SRAMs with 10 nanosecond access times was a very deterministic approach to computer processing, but it came at great expense. Per unit of storage, SRAM costs a thousand or more times as much as Dynamic RAM (DRAM). The dedicated SRAMs worked very well and were deterministic. It was possible to model algorithms mathematically and predetermine their performance, but the costs were exorbitant and prohibitive for anything but very special use cases.
The cheaper probabilistic approach was to implement a CPU cache. A cache uses a limited amount of SRAM as fast-access storage on top of much cheaper and slower DRAM. That said, processor caches are not deterministic. Some algorithms, memory scans for example, run worse on a CPU with memory caches than they would if the CPU had no caches at all. For typical use cases, though, caches provide a decently high probability that the data the CPU needs will be there when required, without stalling the CPU for long durations. Larger caches further increase the probability that data is ready in the cache, with fewer CPU stalls as a result. The first cache implementations on Intel computers used SRAMs external to the CPU. Later the cache was integrated into the CPU to minimize I/O bus width and latency. When cache misses were still considered too costly, a second tier of cache was added. This L2 cache was typically bigger than the L1 cache and provided a second level of probabilistic technology, reducing the chance that a request would miss both the L1 and the L2. The L2 cache also reduced the consequence of missing the L1 cache, since memory requests could frequently be served from the lower-latency L2 cache rather than from the much slower DRAM. For many processors, the L2 cache was eventually moved onto the CPU die as well.
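The probabilistic nature of a cache can be illustrated with a small Python sketch. It is a deliberately simplified model, not how any particular processor works: it assumes a fully associative cache with least-recently-used replacement, one address per cache line, and a miss that costs the cache lookup plus a DRAM access. The 10 ns cache time echoes the SRAM figure above, while the 100 ns DRAM time and the numbers in the two-level example are assumptions chosen for illustration. All function names are hypothetical.

```python
from collections import OrderedDict


def hit_rate(addresses, cache_lines):
    """Fraction of accesses served by a fully associative LRU cache."""
    cache = OrderedDict()  # keys are cached addresses, oldest first
    hits = 0
    total = 0
    for addr in addresses:
        total += 1
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)        # mark as most recently used
        else:
            cache[addr] = True
            if len(cache) > cache_lines:
                cache.popitem(last=False)  # evict least recently used
    return hits / total if total else 0.0


def average_access_ns(rate, cache_ns=10.0, dram_ns=100.0):
    """Average access time: every access pays the cache lookup,
    and each miss additionally pays the DRAM access."""
    return cache_ns + (1.0 - rate) * dram_ns


def two_level_access_ns(l1_rate, l2_rate, l1_ns, l2_ns, dram_ns):
    """Same idea with an L2 cache between the L1 cache and DRAM."""
    return l1_ns + (1.0 - l1_rate) * (l2_ns + (1.0 - l2_rate) * dram_ns)


if __name__ == "__main__":
    loop = [i % 8 for i in range(800)]  # small working set, reused often
    scan = list(range(800))             # every address touched only once
    for name, trace in (("loop", loop), ("scan", scan)):
        rate = hit_rate(trace, cache_lines=16)
        print(name, rate, average_access_ns(rate))
    print("two-level:", two_level_access_ns(0.9, 0.8, 1.0, 10.0, 100.0))
```

In this model the looping workload hits the cache about 99% of the time and averages roughly 11 ns per access. The scan never hits, so it averages 110 ns, which is worse than the 100 ns it would cost to go straight to DRAM with no cache at all. The two-level function shows how an L2 cache softens the cost of an L1 miss.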
So the pattern applies again. Costly deterministic technology is replaced with cheaper, intelligent probabilistic technology that relies on redundancy and optimization techniques.
The same pattern can be seen with NAS appliances over DASDs, TCP/IP networks over SNA, web browsing over 3270/VT100 terminals, virtual machines over dedicated servers, cloud over data centers, digital mobile phones over analog mobile phones, WAN optimization over local site file servers, remote desktop (RDI) over local PCs, and many more instances.
Interesting, maybe, but what does this have to do with my WANs?
More to come…