Hardware fault tolerance in a virtual environment
Stratus wants organizations that are moving more and more workloads into virtual environments to understand that they take a risk when they put a large number of workloads onto a single industry-standard system. Although these systems have improved over the years, they are still not reliable enough for critical workloads.
Grouping these machines into a cluster that depends on software to create a high-availability or fault-tolerant environment is somewhat better, but it still leaves an organization exposed to the risk of failure. Stratus believes that deploying fault-tolerant systems to support virtual environments is a much better approach. So, Stratus is now providing VMware® Infrastructure 3 Foundation free of charge with Stratus® ftServer® systems.
Understanding the nines
Let's look at uptime to gain an understanding of what adding a "nine" will do to an organization's exposure to downtime.

Monthly Uptime | Monthly Downtime | Seconds Down per Month | Minutes Down per Month | Hours Down per Month
99.0% | 1% | 25,920 | 432.00 | 7.200
99.9% | 0.1% | 2,592 | 43.20 | 0.720
99.99% | 0.01% | 259 | 4.32 | 0.072
99.999% | 0.001% | 26 | 0.43 | 0.007
99.9999% | 0.0001% | 3 | 0.04 | 0.001
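The figures in the table follow from simple arithmetic: multiply the downtime fraction by the number of seconds in a month. The sketch below reproduces that calculation, assuming a 30-day month (2,592,000 seconds), which is what the 99.0% row implies. The function name `monthly_downtime` is made up for illustration.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # 2,592,000; assumes a 30-day month


def monthly_downtime(uptime_percent):
    """Return (seconds, minutes, hours) of downtime per 30-day month."""
    down_fraction = (100.0 - uptime_percent) / 100.0
    seconds = down_fraction * SECONDS_PER_MONTH
    return seconds, seconds / 60, seconds / 3600


# Print one line per uptime level used in the table.
for uptime in (99.0, 99.9, 99.99, 99.999, 99.9999):
    s, m, h = monthly_downtime(uptime)
    print(f"{uptime}% uptime: {s:,.1f} s, {m:.2f} min, {h:.3f} h")
```

Each added nine divides the expected downtime by ten, so the same function can be used to check any uptime figure a supplier quotes.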
Although workload uptime often depends on many factors, including the uptime of systems, storage devices, and the network, as well as the probability of staff error causing a slowdown or failure, the table above shows that a system supplier offering 99.0% uptime is really telling its customers that, on average, they'll experience roughly 7 hours of downtime in any given month. While that might be good enough for some workloads, it is woefully lacking for others.
If we add a "nine" to bring that figure to 99.9% uptime, those same organizations would experience nearly three quarters of an hour of downtime in any given month. This, by the way, is a common uptime figure if all of the factors that can create planned or unplanned downtime are considered.
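One rough way to see why an end-to-end figure lands lower than any single component's rating is to multiply the individual availabilities. The sketch below does that under two assumptions that are not from the article: every component must be up for the workload to run, and failures are independent. The component names and percentages are hypothetical and chosen only for illustration.

```python
from math import prod


def combined_uptime(component_uptimes_percent):
    """Uptime (%) of a workload that needs every component up.

    Assumes independent failures, which is a simplification.
    """
    return 100.0 * prod(u / 100.0 for u in component_uptimes_percent)


# Hypothetical figures for illustration only; not from Stratus or any vendor.
components = {
    "server": 99.99,
    "storage": 99.99,
    "network": 99.95,
    "operations": 99.97,  # stands in for staff error
}
print(f"{combined_uptime(components.values()):.2f}%")
```

With these made-up inputs, four components that each look strong on their own combine to about 99.9%, which is the range the paragraph above describes.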
Suppliers of clustering-based solutions, which include most virtual machine migration-based clusters, would point out that they offer between 99.99% and 99.999% uptime. At 99.999% uptime, the organization would experience only 26 seconds of downtime in any given month. Even so, a financial institution could lose millions of dollars if its EFT or trading systems were down that long.
Stratus, a longtime supplier of fault-tolerant systems, would point out that this just isn't good enough for critical applications that cannot tolerate any downtime. Stratus wants organizations to have six nines, that is, 99.9999% uptime. That means experiencing fewer than 3 seconds of downtime in any given month (0.0001% of a 30-day month is about 2.6 seconds).