Many IT managers are afraid to virtualize business-critical applications. It’s just too scary! If that’s the way you think, you may be missing a huge opportunity to get more out of your datacenter. In fact, by neglecting the potential for more agility, you may actually be exposed to more risk. But before we jump to any conclusions, let’s take a look at what is at stake.
It’s an easy decision to extract value from non-critical processes by pooling resources to achieve higher efficiency and economies of scale. The risk is low. If the system crashes, there is little damage done.
When it comes to business-critical services, everyone is more cautious. These applications cannot afford downtime, whether it is for planned maintenance or the result of an unexpected outage. Introducing new technologies is therefore an additional risk that could prove to be a career-ending move.
Yet, if high availability can be guaranteed, the same advantages of virtualization and automation that work for non-critical applications will work for mission-critical ones. Fortunately, the right choice of architecture can allay many of your concerns and, in fact, cloud technologies offer some options that can actually boost availability.
The best way to maximize uptime is through redundancy in all elements of the services. This implies synchronizing applications and data to other components not likely to be affected by the same incidents.
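To put a rough number on why redundancy helps, here is a small Python sketch. It assumes that the redundant copies fail independently of one another (which is exactly why they should not be exposed to the same incidents) and that the service stays up as long as at least one copy is running. The function names and the 99% figure are illustrative assumptions, not measurements.

```python
HOURS_PER_YEAR = 24 * 365


def combined_availability(component_availability: float, replicas: int) -> float:
    """Availability of a service that is up while at least one replica is up.

    Assumes every replica has the same availability and fails independently.
    """
    if not 0.0 <= component_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("at least one replica is required")
    # The service is down only when every replica is down at the same time.
    return 1.0 - (1.0 - component_availability) ** replicas


def expected_downtime_hours(availability: float) -> float:
    """Expected hours of downtime per year for a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    for replicas in (1, 2, 3):
        availability = combined_availability(0.99, replicas)
        downtime = expected_downtime_hours(availability)
        print(f"{replicas} replica(s): {availability:.6f} available, "
              f"about {downtime:.2f} hours of downtime per year")
```

In practice failures are rarely fully independent (shared power, shared network, shared software bugs), so real gains are smaller than this idealized calculation suggests.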
The most important High Availability technology is clustering. To be able to take advantage of clustering within a virtual environment, the hypervisor must be fully aware of the technology and able to exploit the physical resources. For example, if a hardware or software failure occurs on a cluster node, all clients should be able to transparently reconnect to an alternate cluster node without experiencing any downtime.
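The reconnect behavior described above can be sketched in a few lines of Python. This is only a toy model: the cluster nodes are simulated as plain functions, and `NodeUnavailable`, `call_with_failover`, and the node names are hypothetical. A real cluster handles failover in the platform rather than in client code like this.

```python
from typing import Callable, Sequence


class NodeUnavailable(Exception):
    """Raised when a (simulated) cluster node cannot serve a request."""


def call_with_failover(nodes: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each node in order and return the first successful response."""
    last_error = None
    for node in nodes:
        try:
            return node(request)
        except NodeUnavailable as exc:
            # Remember the failure and move on to the next node.
            last_error = exc
    raise NodeUnavailable("all cluster nodes failed") from last_error


def failed_node(request: str) -> str:
    # Simulates a node that has suffered a hardware or software failure.
    raise NodeUnavailable("node-a is down")


def healthy_node(request: str) -> str:
    # Simulates the alternate node that takes over.
    return f"node-b handled {request}"


if __name__ == "__main__":
    print(call_with_failover([failed_node, healthy_node], "order-42"))
```

From the caller's point of view the request simply succeeds; the failure of the first node is only visible if every node is down.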
There should also be redundancy at the network and I/O layers. So if a virtual machine connects its network adapters to multiple virtual switches, it can withstand the failure of a single adapter as long as the platform supports NIC Teaming. Similarly, Multipath I/O (MPIO) provides fault tolerance by accommodating multiple physical paths between the server and its mass storage devices, for example using multiple controllers.
Regardless of all the resilience built into the components and subsystems, a business continuity plan also needs to consider a solution for the worst-case scenario: a disaster. A starting point is backup, where the challenge is often to run the jobs while the machine is running, without degrading the performance of the service. Solutions that require the system to shut down or that can only perform a full backup tend to be too disruptive and may not be appropriate for critical applications.
A more sophisticated solution to protect against datacenter outages is the replication of virtual machines to a separate server, where it would be possible to restart the services if required. This implies replication of the databases, but it also involves synchronizing any changes to the guests and needs to be coordinated by the virtual machine manager. Hyper-V Replica, for example, is a function that records the write operations made on the primary virtual machine and then replicates these changes to a Replica server over the network.
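The general idea of recording writes and replaying them elsewhere can be illustrated with a toy Python model. This is not how Hyper-V Replica is implemented; `PrimaryDisk`, `ReplicaDisk`, and the block numbers are hypothetical, and the "network" is just a function call.

```python
class PrimaryDisk:
    """Toy primary disk that logs every write for later replication."""

    def __init__(self) -> None:
        self.blocks: dict[int, bytes] = {}
        self.pending_log: list[tuple[int, bytes]] = []

    def write(self, block_no: int, data: bytes) -> None:
        # Apply the write locally and record it in the replication log.
        self.blocks[block_no] = data
        self.pending_log.append((block_no, data))

    def drain_log(self) -> list[tuple[int, bytes]]:
        # Hand over the recorded writes and start a fresh log.
        log, self.pending_log = self.pending_log, []
        return log


class ReplicaDisk:
    """Toy replica that replays logged writes in their original order."""

    def __init__(self) -> None:
        self.blocks: dict[int, bytes] = {}

    def apply(self, log: list[tuple[int, bytes]]) -> None:
        for block_no, data in log:
            self.blocks[block_no] = data


primary = PrimaryDisk()
replica = ReplicaDisk()

primary.write(0, b"boot")
primary.write(7, b"v1")
primary.write(7, b"v2")  # a later write to the same block wins

# In a real system this transfer would happen over the network.
replica.apply(primary.drain_log())
```

Because only the changes since the last transfer are sent, the replica can stay close to the primary without copying the whole virtual disk each time.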
Cloud computing is often seen as a drive towards higher efficiency, which is certainly one of its advantages. However, it also entails a new approach to availability, based on pervasive redundancy, which can offer equally compelling benefits. There is another advantage of high availability: greater agility. No matter how critical the services are to the business, you need to be able to configure and reconfigure infrastructure and applications quickly and reliably.