Although virtualisation technology has been around in the mainframe and Unix worlds for years, its more recent move into the x86 commodity server space has caused a bit of a stir.
It is seen as an important means of improving computing efficiency because it raises server utilisation. While the average PC server uses only about 10 percent of its CPU capacity at any one time, virtualisation software can increase this figure to as much as 70 or 80 percent.
Virtualisation technology is also seen as a possible means of cutting costs by reducing server sprawl. This is because it enables organisations to run different operating systems and applications in partitions on the same physical server, which can lead to savings in terms of hardware procurement, power and cooling bills and data centre space.
The software is now in the early mainstream phase of adoption in the large enterprise space. It is also expected to take off among small to medium-sized enterprises during 2008 and 2009 after Microsoft ships the Windows Server Virtualisation free-of-charge add-on to its Longhorn server operating system at the end of 2007.
As a result, the time seemed right to explore the pros and cons of the technology through the eyes of real-world users — in this case, financial services giant Standard Life, and well-known charity Comic Relief.
Standard Life: Extraordinary technology
Standard Life implemented virtualisation technology in 2004 as part of a wider Intel server-consolidation project to cut running costs and simplify management and administration.
Comic Relief: Laughing all the way to the Grid
"The most important time for us is event days, as they're our primary means of fundraising. So it's 100 percent imperative that we have an infrastructure in place that can handle peaks in activity and provide an efficient and secure service to everyone who wants to give us money," says Martin Gill, head of new media at Comic Relief. More...
Standard Life: Extraordinary technology
The organisation, which is based in Edinburgh, is one of the UK's largest financial services companies, providing banking, pensions, life and private medical insurance. After starting the process in March 2004, Standard Life de-mutualised and floated on the London Stock Exchange in July 2006 and currently manages about £119bn in assets on behalf of seven million customers worldwide.
The company started its consolidation initiative in 2000 in a bid to reduce server sprawl. Ewan Ferguson, technical project manager for the firm's 500-strong Information Systems Operational Services group, explains the rationale: "Our servers had grown in number for 20 years or so on an ad hoc basis and we were starting to find them difficult to manage. We had pretty much one application per server and, while our headquarters are in Edinburgh, we had about 20 offices across the UK, so there were storage and remote-management issues."
To help combat this server sprawl, Standard Life decided on a strategy based around a number of consolidation "streams". The first involved opening a second data centre as a disaster-recovery site and moving the majority of the organisation's Intel servers into it — although others did remain in branch offices. At the same time, the decision was taken to standardise on common hardware, operating systems, security and patch-management software.
The next step in 2002 was to introduce a storage area network, initially to handle file and print services, although application support was added at a later date. But by 2004, says Ferguson: "We entered a strategic review to prepare for de-mutualisation, with the aim of improving both efficiency and our service-delivery capabilities. We were looking at streamlining workflows and moving from running one application per server to running multiple ones, and so virtualisation came into focus."
On analysing its server estate, which was in the "low hundreds", the organisation found that 70 percent of its machines were exploiting less than 10 percent of CPU capacity and less than 30 percent of memory. On further evaluation, it became clear that 70 percent of the estate consisted of "good candidates for virtualisation".
Good candidates were machines that were not making efficient use of their resources, provided they were not running network-hungry applications or workloads that place high demands on disk input/output, such as large transaction-processing packages or databases.
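The screening described above can be sketched in a few lines of Python. This is only an illustration: the 10 percent CPU and 30 percent memory figures come from the article, but treating them as hard cut-offs is an assumption, and the `Server` record, the `is_good_candidate` function and the sample estate are all hypothetical rather than anything Standard Life is reported to have used.

```python
from dataclasses import dataclass

# Thresholds echo the utilisation figures quoted in the article; using them
# as hard cut-offs is an assumption made for illustration.
CPU_THRESHOLD = 0.10
MEMORY_THRESHOLD = 0.30


@dataclass
class Server:
    name: str
    cpu_utilisation: float     # average fraction of CPU in use, 0.0 to 1.0
    memory_utilisation: float  # average fraction of memory in use, 0.0 to 1.0
    network_heavy: bool        # runs network-hungry applications
    disk_io_heavy: bool        # e.g. large databases or transaction processing


def is_good_candidate(server: Server) -> bool:
    """Return True if the server is under-used and not network or disk bound."""
    under_used = (
        server.cpu_utilisation < CPU_THRESHOLD
        and server.memory_utilisation < MEMORY_THRESHOLD
    )
    return under_used and not server.network_heavy and not server.disk_io_heavy


# Hypothetical estate, invented purely to exercise the function.
estate = [
    Server("file-print-01", 0.04, 0.20, False, False),
    Server("dns-01", 0.02, 0.10, False, False),
    Server("oltp-db-01", 0.08, 0.25, False, True),
    Server("web-proxy-01", 0.06, 0.15, True, False),
    Server("batch-01", 0.45, 0.60, False, False),
]

candidates = [s.name for s in estate if is_good_candidate(s)]
print(candidates)
```

In this made-up estate, the database and proxy machines are excluded despite their low CPU use, which mirrors the point that low utilisation alone does not make a server a good candidate.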
Another consideration was whether the application vendors were prepared to provide support for packages running on virtualised software. "For one or two of the applications, we had to make a judgement call from a performance and cost-benefit analysis standpoint as to whether it was a good move," Ferguson says. "We had a few where, in a worst-case scenario, the vendor wouldn't have supported us and we could have done a reverse migration, but that was part of the risk analysis from the start and, as it turns out, we haven't had to do a single migration back."
By late 2004, having looked at all the different flavours of virtualisation software, Standard Life opted for VMware's ESX Server. "We'd done a lot of preparation before the roll-out. We had a pretty good handle on the hardware in our estate and we'd looked at reference sites and talked to other companies about lessons learned. We'd also put in a pilot to test the environment and see what virtualisation ratio we could get and how stable the environment was," Ferguson explains.
That virtualisation or server-consolidation ratio worked out at 13 machines to one, which has "significantly" cut operational costs, including power consumption and maintenance. For example, because the company can now deploy applications to run in virtual machines rather than having to provide dedicated hardware for each one, it has been able to increase the number of instances of application server software it uses while reducing the number of physical boxes that they run on.
This radical change in server deployment means that while in January 2005 there were 370 application servers running on 370 physical machines, a year later some 535 were able to operate on 350 physical servers.
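The arithmetic behind those estate-wide figures can be checked quickly. The sketch below uses only the numbers quoted above; the function and variable names are illustrative.

```python
def instances_per_machine(instances: int, machines: int) -> float:
    """Average number of application server instances per physical machine."""
    return instances / machines


# Figures quoted in the article for January 2005 and a year later.
before = instances_per_machine(370, 370)
after = instances_per_machine(535, 350)

# Relative change in instances and in physical machines over the year.
instance_growth = (535 - 370) / 370
machine_change = (350 - 370) / 370

print(f"Instances per machine: {before:.2f} -> {after:.2f}")
print(f"Application server instances: {instance_growth:+.1%}")
print(f"Physical machines: {machine_change:+.1%}")
```

These are averages across the whole estate, so they are a different measure from the 13-to-one ratio quoted for consolidation.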
"We're making a better use of our investment because as the estate continues to grow, we don't have to increase our investment in hardware. We're now exploiting about 70 to 80 percent of the CPU capacity of our machines rather than 10 percent, and we can deliver services more quickly," Ferguson says.
As a result, whereas IT's service-level agreements used to allow 15 days to deploy a new service, this is now generally possible within hours, because it is no longer necessary to involve four separate delivery teams in the process.
Disaster-recovery provision has likewise become more effective. "As virtualisation software is hardware-independent, you have more flexibility to move services from host to host and from one data centre to another. For example, one of our SAN-attached servers, which hosted 20 guests, went down a while ago and, although there was an impact, it wasn't a major headache," Ferguson says.
Another host at the disaster-recovery data centre was up and running in less than an hour. "Although we may have had an outage, it took much less time to restore services than if we'd had to replace an entire server," he adds.
Standard Life uses its virtualisation software mainly to run testing and development servers and for infrastructure functions such as DNS, domain controllers, security and patch distribution. Large databases and other mission-critical applications, however, still run on their own dedicated hardware.
"Virtualisation has become a strategic technology for us. It's about ensuring a better use of the investment that we've already made and gaining efficiencies in terms of scalability and improved functionality, so it's become the default in the Intel space for us," concludes Ferguson.
Comic Relief: Laughing all the way to the Grid
Comic Relief was set up in 1985 and is based in Vauxhall, London. The charity employs about 100 staff and runs two key campaigns to help alleviate poverty both in the UK and internationally. Its flagship Red Nose Day takes place each March during odd years, with the next one being held in 2007, while Sport Relief, which began in 2002, occurs during even years.
For the last 20 years, the organisation has in the main collected money with the help of more than 14,000 call-centre volunteers using paper-based systems. This has traditionally meant that partners have taken between two and three weeks to process donations, or longer if something goes wrong.
In 1997, however, Comic Relief introduced a Web site for the first time and raised £40,000 as a result of online donations. Two years later, the figure had risen to £465,000 out of a total of £35 million, which "was the tipping point for us. The Web site moved from being nice-to-have to becoming important", according to Martin Gill, head of new media at Comic Relief.
By 2005, the charity had also decided to introduce a pilot project, enabling 5 percent of its call-centre operators to use the Web-based donation system rather than rely on pen and paper. It now aims to increase uptake to between 7,000 and 9,000 operators in 2007.
"It's quite a technical feat to build a system that can log in that many at the same time to take donations, but it would be a massive improvement for us because it would be possible to process money in real-time so that it would be in the bank the next morning. We'd also be able to notify people of any payment problems immediately and make Gift Aid claims a week after the event, not months later," says Gill.
To make this possible, the organisation has implemented server virtualisation technology in the shape of a Grid-based system supplied by its technology sponsors, Sun and Oracle.
Grid-based systems comprise disparate IT resources that have been networked together using middleware to create a single virtual computing infrastructure.
This enables all of the processing power of the Grid to be harnessed simultaneously to handle huge workloads no matter where individual machines happen to be located. But it also means that elements of that workload can be split off and allocated to idle CPUs in the network should others be working at full capacity.
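The idea of handing pieces of a workload to whichever processors have spare capacity can be illustrated with a toy scheduler. The sketch below is a generic greedy "least-loaded first" allocator, not a description of how the Sun and Oracle software actually distributes work; the `allocate` function, the node names and the task costs are all hypothetical.

```python
import heapq


def allocate(tasks: list[float], node_names: list[str]) -> dict[str, list[float]]:
    """Give each task to the node that currently has the least work queued."""
    # Min-heap of (current load, node name); the least-loaded node is on top.
    heap = [(0.0, name) for name in node_names]
    heapq.heapify(heap)
    assignment: dict[str, list[float]] = {name: [] for name in node_names}

    # Placing the largest tasks first tends to give a more even spread.
    for cost in sorted(tasks, reverse=True):
        load, name = heapq.heappop(heap)
        assignment[name].append(cost)
        heapq.heappush(heap, (load + cost, name))
    return assignment


# Hypothetical task costs in arbitrary units of work.
plan = allocate([5.0, 3.0, 8.0, 2.0, 7.0, 4.0], ["node-a", "node-b", "node-c"])
for node, work in plan.items():
    print(node, work, sum(work))
```

A real Grid also has to account for data location, node failures and differing hardware, which this sketch deliberately ignores.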
To introduce Grid functionality, Comic Relief replaced its existing Oracle 9i software with the vendor's 10g database, Real Application Clusters software, Enterprise Manager management console and Fusion middleware, all running on Solaris.
Gill explains the move: "Resource utilisation and high availability are really important issues to us because we need to operate as efficiently as possible. Out of campaign times, we only use about six to eight percent of the capacity of our servers, but during our fundraising events, the infrastructure simply has to be flexible enough to change focus very quickly, and Grid enables that."
The technology was load tested during the charity's Sport Relief event on 15 July, 2006 and enabled it to take 15 percent more donations than had previously been the case. The organisation also saw the number of partner staff required to help with system monitoring reduced from 17 to 10 as resource allocation became less of a manual process.
"We previously had to have a substantial collection of partner people there to help us with decisions on resource utilisation, but we were able to slim that down this time. Most of them work on a voluntary basis so it means asking favours of less people and making better use of their skill sets so they don't have to tinker with the technology to find the answer to a question," Gill explains.
The next step, according to Gill, is to push resource utilisation higher still. To date, Comic Relief has been looking at controlled utilisation rates of 50 to 55 percent, and it wants to raise the figure to 75 to 80 percent, but no higher.
"Otherwise, if the public responds in a way that we don't expect, there's not a huge amount of room for the Grid to move things around to accommodate it, whereas 80 percent is enough headspace to cope with specific changes,” explains Gill.
Despite the benefits, however, he does not advise organisations to rush into adopting virtualisation or Grid technology.
"We took a gently, gently approach over 18 months and that worked well for us. So I'd advise people that, when they come to their normal refresh cycle, to do it with a view to creating an infrastructure that could support Grid. Most people are in the process of trying to maintain their infrastructure and do other things at the same time and it does require fairly substantial change," Gill says.
As a result, he recommends upgrading hardware, introducing clustering technology or implementing virtualisation middleware in stages rather than adopting a big-bang approach.
"Part of our challenge as a charity is to be as efficient as we can, especially in relation to technology utilisation. We need to know that our infrastructure is working as well as it can and that we can squeeze the pips, but we also need to know that it can respond to change in as flexible a way as possible," adds Gill.