Case study: Managing data can be difficult, especially if you have almost 500 terabytes of storage and spend $10,000 a month on backup tapes. This case study looks at how Melbourne IT, one of Australia's biggest web hosting companies, handles storage.
The Web 2.0 problem
Going back a year or so, Melbourne IT chief architect Glenn Gore had a problem. His company's datacentre demands were growing exponentially, at 120 per cent year-on-year. Melbourne IT's "virtualisation farms", groups of virtual servers used by customers, were growing even faster, at 400 per cent year-on-year.
The cause? Web 2.0 applications. "If you look at the use of IT, particularly online applications, there has been a maturing in the way people are using [the web], particularly for promoting your business online and having your customers interact through businesses through the internet," Gore tells ZDNet.com.au in a recent interview.
This meant serious changes behind the scenes. "What that meant at the back end of suppliers supporting those sites was that the amount of processing power and storage required for that went up exponentially," says Gore.
With 660 staff primarily based in Australia, Melbourne IT has a history of dealing with web hosting. Beginning with registering a ".com.au" domain in 1996 and being accredited by ICANN to provide registrar services in the .com, .net and .org domains in 1999, Melbourne IT grew to host 6 million domains in 2008.
However, the company still found itself surprised by the onslaught of storage and processing requirements that came with Web 2.0 applications.
"Every year we are quadrupling the amount of processing power we have on tap, and that's because of processing the number of advanced internet applications we have," Gore says.
Big storage means big problems.
Gore's first approach to these problems was to virtualise across both storage infrastructure and processor farms, and learn to dynamically move load depending on customer demands.
"Those farms are constantly evolving. Every day we are adding servers and removing servers. The system itself is moving load around, depending on what's going to get the best performance for customers," Gore says.
This meant a system that could not only handle redundancy and failures, but also make autonomous decisions.
Gore says the systems are "actually making automated decisions based on business rules, how it deals with things like what happens when a server runs out of memory or when the CPU becomes a bottleneck". In short, the systems will work out how to move customers to different infrastructure, based on business rules and in real time with no outages.
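A rule of this kind could be sketched in Python as follows. This is only an illustration: the thresholds, the `Host` fields and the function names are hypothetical and are not taken from Melbourne IT's systems.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical business-rule thresholds (assumptions, not from the article).
CPU_LIMIT = 0.85   # fraction of CPU in use above which a host is a bottleneck
MEM_LIMIT = 0.90   # fraction of memory in use above which a host is at risk


@dataclass
class Host:
    name: str
    cpu_used: float  # 0.0 to 1.0
    mem_used: float  # 0.0 to 1.0


def needs_relief(host: Host) -> bool:
    """Return True if the host breaks either the CPU or the memory rule."""
    return host.cpu_used > CPU_LIMIT or host.mem_used > MEM_LIMIT


def pick_target(hosts: List[Host], source: Host) -> Optional[Host]:
    """Pick the least-loaded healthy host to receive load from `source`.

    Returns None when no other host is within the limits.
    """
    candidates = [h for h in hosts if h is not source and not needs_relief(h)]
    if not candidates:
        return None
    # "Least loaded" here means the smallest of each host's worst resource.
    return min(candidates, key=lambda h: max(h.cpu_used, h.mem_used))


farm = [Host("a", 0.95, 0.50), Host("b", 0.40, 0.30), Host("c", 0.20, 0.60)]
overloaded = [h for h in farm if needs_relief(h)]
for host in overloaded:
    target = pick_target(farm, host)
    print(host.name, "->", target.name if target else "no capacity")
```

A production system would also have to move the workload without an outage, which this sketch does not attempt.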
To create such a system, Melbourne IT needed to define how to best utilise its hardware to meet its customer needs. "We use off-the-shelf software for the management component," says Gore. "What we did create ourselves is the IT and the business rules. What we have really done is merge the software management infrastructure with the physical hardware management infrastructure."
Building this kind of intelligence into server virtualisation wasn't a real challenge. "Everyone knows how to do server virtualisation. VMware has been banging that drum for a while and Microsoft has entered that space," Gore says.
However, storage virtualisation proved to be more of a challenge. "If you're using a big chunk of storage, and you need to move to something more powerful and newer, it's normally been a big process requiring down time," the IT architect says. "It's also quite risky. Basically it ends up becoming so hard to do you don't end up doing it."
Melbourne IT has a lot of storage. Gore estimates its total current storage at "just under half a petabyte," or roughly 500 terabytes. Most of this storage was acquired recently.
"We made a very large storage purchase last year, of 330 terabytes in one hit," says Gore. "We are getting to the point where we now are actively looking at purchasing more storage because we have exhausted that purchase."
Gore also had the issue that customers' storage demand varied considerably throughout the year. "We could have customers that have sports activities, so they are only used for certain parts of the year, or only when games are on. We have financial institutions that get busy at month's end," he says.
So big storage meant big problems, but storage tiers allowed Melbourne IT to migrate dynamically when customers hit peak demand.
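One way to picture moving storage between tiers is the short Python sketch below. The tier names, the load metric and both thresholds are assumptions made for illustration. The article does not say how Melbourne IT's tiers are defined or what triggers a move.

```python
# Hypothetical tiers, ordered from slowest to fastest.
TIERS = ["archive", "standard", "performance"]

# Hypothetical thresholds on observed load (0.0 to 1.0) for a customer's volume.
PROMOTE_ABOVE = 0.80
DEMOTE_BELOW = 0.30


def next_tier(current: str, load: float) -> str:
    """Return the tier a volume should sit on, given its current tier and load.

    Moves at most one tier per evaluation, and never past either end.
    """
    i = TIERS.index(current)
    if load > PROMOTE_ABOVE and i < len(TIERS) - 1:
        return TIERS[i + 1]
    if load < DEMOTE_BELOW and i > 0:
        return TIERS[i - 1]
    return current


# A customer that is busy at month's end, then quiet again.
tier = "standard"
for load in (0.50, 0.90, 0.85, 0.20):
    tier = next_tier(tier, load)
    print(load, tier)
```

Keeping a gap between the promote and demote thresholds avoids a volume bouncing between tiers when its load hovers around a single cut-off.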
"We invested in storage virtualisation, so now we are able to monitor the amount of storage being offered to a customer, and based on that work out whether we should move that storage dynamically to different performance tiers," the IT architect says.
"We can move storage up the tier, and we can also move it back down the tier. That's really important to us."
Battle of the titans
With a view to a long-term solution, Melbourne IT recently went out to tender for a storage vendor that could meet the needs of a massive IT environment. "We narrowed it down to three vendors: IBM, EMC and Network Appliance," Gore says.
Network Appliance was ruled out first because it was too expensive, but Gore says not everyone might have that problem. "It was just true for the characteristics of what we were buying," he says. This left two serious contenders. "To be honest it was a really close decision," he adds.
"I think ... if you talk just about the raw storage capabilities, EMC storage is probably better than the IBM storage. But you have to take that holistic view of, 'how do all the components of my storage fit together?'"
This led Gore and Melbourne IT to their final decision.
"We were already an IBM storage shop and that really put them across the line," the IT architect says. "We also thought that the IBM storage technology had been out to market a lot longer, they had a lot stronger customer base, so we thought it was the right solution for us."
"The IBM storage product was a lot more mature, it has been out to market for years compared to months ... you're only really as good as that weakest link."
How do you back up 500 terabytes?
Gore says the problem with being a web host is rapidly changing data, with 40 per cent of the content Melbourne IT manages changing every seven days. This problem is magnified by the long lifetime of backups.
"We need to keep a backup of not only that content, but also those delta changes for as long as we keep backups for a standard customer. Typically that's up to six months," says Gore.
Melbourne IT's solution involved several forms of storage media. "What we have decided that we need to invest in — and we have been doing this for a number of years — is disk to disk to tape backup," says Gore.
This allowed the company to manage risk across a series of different storage media.
"The idea is that if we need to recover data, we can do it straight from fast disk, that's in the same datacentre," the IT architect says. "It's very rare for us to use backups for more than a month out. We still use tape to manage the risk of having a datacentre-wide outage."
Gore says Melbourne IT's tapes are encrypted and stored remotely by a third party. "To put it into perspective, we are spending in the order of more than $10,000 a month on tapes. It just comes down to the scale of managing all that data," he says.
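The scale Gore describes follows from the figures quoted above: 40 per cent of content changing every seven days, with changes kept for up to six months. A rough Python estimate is sketched below. It assumes about 26 weeks of retention, one full copy plus weekly deltas, and no compression or deduplication. None of these assumptions come from Melbourne IT, and the 100-terabyte input is an arbitrary example.

```python
def backup_footprint_tb(primary_tb: float,
                        weekly_change: float = 0.40,
                        retention_weeks: int = 26) -> float:
    """Estimate backup volume for one full copy plus retained weekly deltas.

    Assumes each week's changed data is stored once and nothing is compressed
    or deduplicated, so the result is an upper-end illustration only.
    """
    deltas_tb = primary_tb * weekly_change * retention_weeks
    return primary_tb + deltas_tb


# Example: an arbitrary 100 TB of hosted content.
print(backup_footprint_tb(100))
```

Under these assumptions the retained deltas, not the full copy, account for most of the volume, which is why a high change rate and a long retention period compound each other.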
Projecting future demand is difficult for any company, but Gore says the issue is magnified for hosting companies.
"As a hosting company, we are driven by what our customers are doing. A traditional enterprise would know where their business was going, and they would have three- and five-year plans. The IT [department] would know what they need to do to support that business," he says.
This challenge means creating highly dynamic IT environments, which can help with meeting future demands.
"As routine capacity planning we look at what we do every month in terms of business, and we forecast that out as a trend going forward and look at the growth on a month-by-month basis," Gore says.
"We then buy several months' worth of infrastructure in advance, and also keep several months' worth of infrastructure in the supply chain. We also manage a buffer, what we call the 'run rate head room', which is normally about 20 per cent reserved at any point in time."
However, Gore says continuing to scale presents unique challenges.
"The biggest challenge for us, especially in Australia, is getting our hands on good quality datacentre space... As soon as they build new datacentres they become full, and people are all trying to get out of datacentres that don't have the power density to meet today's IT requirements," he says.
This problem is compounded by the IT skills shortage.
"[A problem] I think in 2009 will become even harder than datacentres is actually getting our hands on good quality IT staff which are experienced, have the right attitude, and can work in large, dynamic IT environments," the IT architect says.