Fault Tolerant and Fail Over: Is There a Difference?

Summary: Fault tolerant (FT) solutions go beyond high availability (HA) fail over solutions to present an environment that is never seen to fail, not merely an environment that survives a failure. Some suppliers of FT technology call this "fail through" rather than fail over.

TOPICS: Virtualization

I thought the distinction between fault tolerance and fail over was a well-known concept, and I was surprised to find that it is still not clear to some.

While speaking with a potential client about how different forms of virtualization could address his organization's requirements, I noticed that some of my comments created confusion rather than clarity. As an aside, it appears that I have an innate ability to make some technology appear more complex than it really needs to be.

I'd like to offer a summary of the discussion while it is still fresh in my mind.

Virtualization technology, taken broadly, offers a number of approaches to availability. Here are a few of them.


  • Access to application solutions can be virtualized.  If the back end system fails, the individual using the application is connected to another system that offers the same application.  More sophisticated access virtualization software may make this process automatic. Even more sophisticated products in this area will remember the state of the application and give the impression that nothing ever failed. Doing this last bit, however, usually involves other forms of virtualization. This process, by the way, is unlikely to be instantaneous.
  • Application frameworks may offer load balancing and fail over capabilities. The application framework monitor, upon detecting either a failure to meet service level objectives or some other type of failure, would start the application on another machine. Once again, the process could be automatic or require manual intervention. If other types of virtualization are in use, the actual state of the application could be saved during the process. While this process may happen quickly, it is likely that individuals using the application would notice a pause or a slow-down.
  • Processing virtualization, which includes clustering, parallel processing and virtual machine software, may offer load balancing and fail over capabilities similar to those offered by application framework virtualization, for selected or all applications on a given system. The key difference between the two levels is that application framework virtualization only virtualizes applications running in that framework. Processing virtualization makes it possible for applications, data management products or even basic system services to fail over to another system. As with the other forms of virtualization, the fail over process can take some time.
  • Virtualizing storage is often a necessity for all of the other forms of virtualization. After all, what good is moving an application over to another system if the data it was processing is no longer available? Storage virtualization could be implemented using special-purpose software on general-purpose systems or by moving the entire storage function to a special-purpose storage server.
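
To make the fail over pattern in the list above concrete, here is a minimal Python sketch. It is an illustration only, not any supplier's implementation: the `Node` and `FailoverMonitor` classes, the node names and the request strings are all hypothetical. A monitor notices that the active system is unhealthy and switches to a standby, and the comment marks where the pause users notice would occur.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A hypothetical server that can host the application."""
    name: str
    healthy: bool = True


@dataclass
class FailoverMonitor:
    """Routes requests to the active node and fails over when it is unhealthy."""
    nodes: list
    active: int = 0
    events: list = field(default_factory=list)

    def handle(self, request: str) -> str:
        # Check the active node before each request; switch if it has failed.
        if not self.nodes[self.active].healthy:
            self._fail_over()
        return f"{self.nodes[self.active].name} handled {request}"

    def _fail_over(self) -> None:
        failed = self.nodes[self.active].name
        for index, node in enumerate(self.nodes):
            if node.healthy:
                self.active = index
                # In a real system this step restarts the application and
                # reloads its state, which is the pause users may notice.
                self.events.append(f"failed over from {failed} to {node.name}")
                return
        raise RuntimeError("no healthy node available")


primary, standby = Node("primary"), Node("standby")
monitor = FailoverMonitor([primary, standby])
first = monitor.handle("order-1")
primary.healthy = False  # simulate a back end failure
second = monitor.handle("order-2")
```

The failure is survived, but it is also recorded as an event: something did fail, and the service moved. That is the point of contrast with fault tolerant systems.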

All of these are well and good. What happens, however, when the requirement is that failures are never seen? This is the realm of FT systems. In this case, special-purpose, redundant hardware configurations are deployed and run in lock-step. If one component of the system fails, the others continue working and the application does not fail.
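
A loose software analogy for the lock-step idea, again only a sketch under stated assumptions: real FT systems do this in redundant hardware, and the `Replica` class and `run_in_lockstep` function below are hypothetical names invented for illustration. Every replica performs the same operation, so the caller still gets an answer when one replica fails and never sees a switch-over.

```python
from typing import Callable, List


class Replica:
    """A hypothetical redundant component that runs the same operation."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.failed = False

    def run(self, operation: Callable[[int], int], value: int) -> int:
        if self.failed:
            raise RuntimeError(f"{self.name} has failed")
        return operation(value)


def run_in_lockstep(replicas: List[Replica],
                    operation: Callable[[int], int],
                    value: int) -> int:
    """Run the operation on every replica and return a surviving result."""
    results = []
    for replica in replicas:
        try:
            results.append(replica.run(operation, value))
        except RuntimeError:
            continue  # a failed replica is skipped; the others carry on
    if not results:
        raise RuntimeError("all replicas failed")
    return results[0]


replicas = [Replica("a"), Replica("b")]
before = run_in_lockstep(replicas, lambda x: x * 2, 21)
replicas[0].failed = True  # one component fails
after = run_in_lockstep(replicas, lambda x: x * 2, 21)
```

Here `before` and `after` are the same value: the caller cannot tell that a component was lost, which is what "fail through" is meant to convey.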

Historically, FT solutions were quite expensive. After all, every component of the system had to be replicated enough times to handle all expected failure scenarios. More recent solutions, offered by suppliers such as Stratus and Marathon, are based upon industry standard systems and components. The use of off-the-shelf hardware significantly reduces the price of these solutions.

Does your organization deploy truly fault tolerant solutions, or does one of the other forms of virtualization offer sufficient levels of reliability and availability?


Daniel Kusnetzky, a reformed software engineer and product manager, founded Kusnetzky Group LLC in 2006. He's literally written the book on virtualization and often comments on cloud computing, mobility and systems software. In his spare time, he's also the managing partner of Lux Sonus LLC, an investment firm.

  • These are neither new nor substantially different to existing solutions

    I agree that it is definitely worthwhile explaining the differences between the fault handling modes of systems, but neither of the solutions you link is substantially new or different to old school clusters.
    Both of these solutions are trying to make dumb applications behave intelligently from underneath. There are well understood fundamental limitations to this approach: the 'solution' looking up the stack cannot possibly predict the behaviour or failure patterns of any application, user load and data set you choose to install on it. This leaves the solution able to look after hardware failures and make educated guesses about software behaviour. This adds substantial installation and operational difficulty for a small gain in theoretical reliability. One of the solutions has an additional layer of complexity strapped around the application to try to mitigate this fundamental failing. Instead of spending company money to stick plasters over badly written applications, we should be concentrating on selecting or writing applications that are written properly, are inherently fault tolerant and do not require overpriced solutions to deal with the symptoms of their architectural failings.
    Liam Newcombe
    • Corporate networks evolve

      Thank you for your comments. They are well taken when one considers new applications.

      In my experience, however, most corporate networks evolve slowly over time rather than changing rapidly. This means there is quite a bit of established technology at the heart of most corporate networks. If one looked at the systems running most major corporations around the world, one would find something like one of those lovely Russian dolls (a small doll contained within a bigger doll and that bigger doll contained within a still larger doll and so on). So, the environment would look something like this:

        • Mainframe applications that were developed 30 years ago and have been constantly updated ever since, which are fed by
        • Unix applications that were written 20 years ago and have been updated ever since, which are fed by
        • Windows and/or Linux applications that were written sometime within the last 10 years and have been updated ever since, which are fed by
        • User access devices such as Windows PCs, Macs or even handheld devices

      While it would be nice to simply re-implement the whole stack of software using the newest technology, this is not likely to be the process IT departments use. They're simply unlikely to walk away from 30 years of investment.

      So, the technologies mentioned in the post still have an important place in the IT department's tool kit. You certainly are correct that the implementors of new applications would be wise to consider an architecture such as the one you are suggesting.

      Dan K