A few days ago I had the opportunity to speak with Marathon Technologies' Michael Bilancieri, Director of Products, and Steve Keilen, Vice President of Marketing, about creating both highly available (HA) and fully fault tolerant (FT) environments using industry standard systems. It was a very interesting conversation. While at DEC (may they rest in pieces), I had many conversations with people involved in DEC's VAXft family of fault tolerant VAX systems. I was surprised to find out that some of those good folks are still working on FT and are over at Marathon.
Marathon seems focused on reducing the cost and the complexity involved with deploying FT systems. To that end, Marathon Technologies announced the v-Available™ initiative to help its partners better understand and deploy HA and FT solutions. I'm expecting to hear more interesting things from them over time.
As I thought about FT solutions of the past, I recalled that they were often based upon single-vendor processors and involved a high level of expertise in industrial sorcery. Suppliers who offered this type of system (Tandem and DEC, both now part of HP, as well as IBM, Stratus and others) built special-purpose systems configured so that the processors ran in lock step. This approach, by the way, often required the development of custom processors having the hardware features needed to support this type of processing. The primary feature of these systems was how they treated the failure of a component. If one system failed, the others would simply pick up the work and continue. Applications would not be aware of the failure. This "fail through" process took a tiny fraction of a second, making it highly desirable for applications that simply could not have downtime, planned or unplanned.
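To make the "fail through" idea a bit more concrete, here is a toy Python sketch. It is only an illustration of the concept, not how Tandem, DEC, Stratus or Marathon actually build their products; the names `Replica` and `LockstepGroup` are hypothetical, and real lock-step systems do this in hardware or at a very low level of the software stack rather than in application code.

```python
class Replica:
    """One redundant unit that executes the same operation as its peers."""

    def __init__(self, name):
        self.name = name
        self.healthy = True  # flipped to False to simulate a component failure

    def execute(self, op, *args):
        if not self.healthy:
            raise RuntimeError(f"{self.name} has failed")
        return op(*args)


class LockstepGroup:
    """Runs every operation on all replicas so one failure stays invisible."""

    def __init__(self, replicas):
        self.replicas = list(replicas)

    def run(self, op, *args):
        results = []
        for replica in self.replicas:
            try:
                results.append(replica.execute(op, *args))
            except RuntimeError:
                # A failed replica is skipped; the survivors carry on.
                continue
        if not results:
            raise RuntimeError("all replicas have failed")
        # Every healthy replica computed the same answer; return one of them.
        return results[0]
```

The caller of `run` gets the same answer whether one replica or all of them are healthy, which is the essence of "applications would not be aware of the failure". Only when every replica is down does the error surface.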
Fault tolerant hardware was typically more costly than general purpose systems having similar processor, memory and storage configurations because every component was duplicated at least once. If an organization stood to lose enormous amounts of revenue due to a failure, it would purchase these systems regardless of the cost of the hardware. Since the cost of these configurations was high and these systems had to be treated as a single computer, many organizations turned to other types of virtualization, such as clustered systems, when their need for constant availability was not quite as high. While clusters took longer to deal with a "state change", the systems involved could all be productive rather than being treated as merely a "hot" backup.
Marathon's everRun™ FT creates a true fault tolerant environment using general purpose industry standard systems connected by Gigabit Ethernet. This means applications hosted in an everRun environment do not see failures. Processing "fails through" to the remaining resources when something fails. Marathon is supporting Windows-based applications today and will support Linux-based applications in the future.
FT solutions fit monolithic applications best, that is, applications in which the user interface, the application rules processing, the data(base) management and storage management are part of the same image. Distributed applications may be able to offer similar levels of availability by using other types of virtualization software and redundant systems. That being said, developers and IT managers who are not familiar with Marathon or its products might find that they offer interesting solutions to difficult availability problems.