What the hell are Grids anyway?

Platform Computing chief executive Songnian Zhou has the answers to everything you wanted to know about Grid technology but were afraid to ask
Written by Andrew Donoghue, Contributor

The term Grid computing has joined the august pantheon of IT terms that are bandied about without much precision. (See also: utility computing, on-demand, autonomic computing.) No-one wants to be seen as out of touch with the latest terminology -- so jargon goes untested and unexamined.

One of the most often cited examples of Grid technology is SETI@home -- which uses a downloadable screen-saver to harvest the computing power of thousands of desktop machines. But the organisation itself prefers to describe what it does as "Public distributed computing".

This lack of clarity has undoubtedly contributed to the gulf that exists between the grandiose claims made for Grid computing and the reality of its actual deployment. IBM recently announced plans to create The World Community Grid for various medical and environmental research programmes, but projects of this scale are the exception rather than the rule.

Some experts describe Grid technology as arriving in three waves: first in academic research communities, then in corporations, which is beginning to happen now. The ultimate goal, however, is the third wave, which will see the technology coalesce to create a processing network analogous to the Web, called simply "the Grid".

To address some of the thornier definition issues around Grid technology, ZDNet UK sought out the chief executive of Platform Computing, Songnian Zhou. His PhD thesis has been credited with helping to establish the field of distributed resource management -- one of the foundations for the Grid concept.

There seems to be a lot of confusion about what Grid computing is. How would you define it and is this lack of clarity affecting its uptake?
It is important to have a clear understanding of Grid, but sometimes we are too myopic or academic when it comes to defining it. I was actually working with Ian Foster in 2002 when he tried to come up with a definition, and he came up with three criteria, which I think are reasonable.

If you ask me to give you a very clear and simplistic definition of grid I would say grid computing is distributed computing involving multiple sites to integrate and support applications and support collaboration.

Do you think the grid concept is being communicated succinctly enough to engage anyone apart from academics?
Not at all. It started on the wrong foot and has continued to be very confusing.

For example, if you look at SETI@home, that is an extreme example of what we call grid computing. Grid computing mostly focuses on server computers and not desktops. It is also mostly focused on existing applications, not brand new applications developed from scratch. People ask, "What is the killer app for grid?" and the answer is that the killer app is in front of you, behind you, all over the place -- but people just don't see that yet.

How far along are Grid standards? Are they holding back widespread uptake?
Standards are a major issue but not a show-stopper yet, because you first have to get a general understanding of what the requisite technologies are. Only then, when you have some good ideas and contributions from expert parties, can you even start to think about what the standards will look like. I think it's still premature for the industry to know what the standards should be, but the building blocks for standards are being worked on.

Is grid technology really one step removed from the main responsibilities of IT professionals working for an enterprise-sized company -- isn't it something for service providers to be concerned with?
That's kind of confusing grid technology with on-demand or utility computing. They are all related, but grid computing is really focused on the infrastructure. From my experience of the market over twelve years, the centre of gravity for adopting grid, and for realising its value in the mainstream, resides in the enterprise. The idea is to help companies integrate their IT resources better -- the hardware, the software, the data, the networking -- to make IT a more effective tool for businesses.

The vast majority of enterprise customers doing true enterprise grid aren't just running a cluster of 100 Linux machines but rather multiple applications, often in multiple departments, multiple groups and multiple locations.

Look at a company like JP Morgan [a Platform Computing customer]. As a leading early adopter of computing, they wanted to create a shared IT infrastructure based on industry-standard technology -- they are effectively trying to create a virtual mainframe. The benefits they have been getting are that usage of resources is far more effective; instead of 20 percent it's much higher, around 80 percent. Also, they are able to take advantage of industry-standard servers, but to do grid on this kind of scale -- thousands of machines -- you need a layer of software to manage all these resources and talk to the OSs.

But some purists claim that if a grid is limited to the confines of a single company network then it's not a true grid but an intra-grid?
It is not a grid -- it's intra-grid? OK, is intra-grid a grid? This can become really irrelevant and academic. Grid is a large-scale distributed computing infrastructure, integrating multiple standard resources, multiple domains -- could be the same location but typically multiple locations -- supporting a variety of applications with the purpose of sharing the resources and promoting collaboration.

This kind of definition resonates with our customers who are using it, from Pfizer to JP Morgan to Texas Instruments. They have been practising it first and foremost within their enterprises, but gradually, as that proves to be effective, they are in a position to start engaging outsourcing partners and external data centres.

It's an evolution. If you say, "Grid is between enterprises," then it's always going to be in the future for quite a while to come. If you say "Grid is clustering," then it's too restrictive but it's the first step on the road towards true distributed computing.

Some experts are claiming that the natural end-point of grid technology is the creation of "the Grid" -- a single global computing network that does for data processing what the Web has done for online content?
There is a very strong connection and natural evolution -- I made that connection in about 1999. If you look at the Web, it provides a natural interface for humans, first and foremost to access information and data. But over time you start to say, "I want to add value to data, I want to provide intelligence and capability". To do that you need applications, and this is where the Grid infrastructure comes into play: Grid becomes the support infrastructure for all the applications that become the active components in the Internet-wide environment.

So how far off is the concept of "the Grid"?
If you look at the maturity of the Grid compared to the Web, then that kind of model is at least ten years away. You need to recognise that the Web compared to the grid is much simpler. Grid is all about the applications and the business logic, and businesses have to transform to take advantage of this paradigm. Computing is going through an evolution from the middle ages to the renaissance period. With Grid you no longer talk about CPUs, you no longer talk about hardware, you no longer even talk about applications -- you talk about services.

So you advocate the idea of utility computing, such as that put forward by companies such as Sun, where computing resources are analogous to the telephone network?
Yes, IT is becoming increasingly complex and companies don't have the necessary experts to create all the solutions for themselves -- it doesn't make sense anymore.

Utility computing makes sense on paper, so why don't enterprises seem to be taking to it?
It's because customers are just not ready at all, mostly because the service providers have not been ready for this. The reality of today and the next several years is that if the customers are not moving in this direction for their own in-sourced IT, the chances are that they won't be in a position to take up the offer of these kinds of services from the vendors. And what's more, the vendors won't be able to deliver these kinds of services because they don't even know what the customers want, and they don't have the requisite technologies at all, because we are too busy debating what the grid is.