Distributed computing is nothing new. The idea of taking a task too big for one computer and spreading it across many machines has been around since the 1960s, especially in the research, scientific and engineering communities. Yet it has remained a deeply esoteric and specialist subject: fine for modelling a nuclear warhead, but not much cop for the small business.
All that's changing, and fast. Where once only large establishments could afford fast machines, fast networks and software engineers capable of understanding connectivity, now we all have Internet connectivity, cheap networks, processors running at gigahertz speeds and, most importantly, open standards that are ubiquitous, capable and well understood. Every company is a huge well of untapped computing power: when each 2GHz Pentium spends all but a tiny fraction of its time waiting for key presses, the potential is clear.
Cue the Globus Project. A hefty consortium of American academics and government agencies together with IBM, Microsoft and Cisco, based at the US Argonne National Laboratory, Globus has spent nearly six years developing the protocols, software and concepts necessary to produce open distributed computing. Although still focussed on scientific and technical projects, it has recently announced the Open Grid Services Architecture (OGSA), which makes the whole concept available to anyone. In addition, IBM has said that the Grid (the concept of a sea of available processing linked by accessible networks) and OGSA are among its key corporate strategies. The company predicts that the Grid is the "key to advancing e-business into the future and the next step in the evolution of the Internet towards a true computing platform."
All this intense research has produced a solid consensus on the architecture of Grid technology and what is needed to make it happen. Standard protocols to request remote operations, swap data, manage tasks and perform essential security are key, as is a standard application programming interface (API) collection, code libraries, reusable components and debugging methods.
Hence OGSA, which has developed out of experience with the already-available Globus Toolkit. It is based on a four-layer model: at the top are user applications; then comes what is known as collective services, which includes directory handling, diagnostics and monitoring; below that, resource and connectivity protocols handle access to servers and networks; and finally comes the Fabric, which is everything on the network -- storage, computers, connections, sensors and so on. The fabric and the user applications are familiar ground: it's the middle two layers that define the Grid.
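The four-layer split described above can be illustrated with a small sketch. The layer names follow the article; the example components listed against each layer are illustrative assumptions, not an official Globus inventory:

```python
# Illustrative sketch of the OGSA four-layer model, top to bottom.
# Layer names come from the article; the example entries are assumptions.
GRID_LAYERS = [
    ("User applications", ["data-analysis job", "simulation front-end"]),
    ("Collective services", ["directory handling", "diagnostics", "monitoring"]),
    ("Resource and connectivity protocols", ["server access", "network access"]),
    ("Fabric", ["storage", "computers", "connections", "sensors"]),
]

def describe(layers):
    """Return a top-to-bottom, one-line-per-layer description of the stack."""
    return [f"{name}: {', '.join(parts)}" for name, parts in layers]

for line in describe(GRID_LAYERS):
    print(line)
```

The familiar ground sits at the top and bottom of the list; the two middle entries are the ones OGSA itself defines.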
The Grid looks very different from standard networked computing. Most services these days are client-server, with security concerned with identifying clients, authorising them and defining what they can do on a server. With the Grid, the difference between client and server is much less clear: one computer can ask a second computer to do a task while simultaneously carrying out another task for a third party. You don't want to authenticate both ways for each type of transaction, nor do you want to log onto each of the potentially hundreds of computers that may be running your tasks, or to authenticate everyone else who might want to do the same with yours.
Also, you may well create a task on someone else's computer that itself creates tasks for further computers -- and, given the disparate nature of what runs on open standards, the security requirements for each may be different. It is quite impossible to relate all this to one individual: thus, the Grid depends on community authorisation. You become part of a community and what you do in that community's name depends on what the community is allowed to do.
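The community-authorisation idea can be sketched in a few lines. This is a hypothetical illustration of the principle only: the community names, the rights tables and the function are invented for the example, and are not the Globus interface:

```python
# Hypothetical sketch of community-based authorisation: rights attach to
# the community, and a member acting in the community's name inherits
# whatever that community is allowed to do.
COMMUNITY_RIGHTS = {
    "climate-modelling-vo": {"submit_job", "read_dataset"},
    "particle-physics-vo": {"submit_job", "read_dataset", "write_dataset"},
}

MEMBERSHIP = {
    "alice": {"climate-modelling-vo"},
    "bob": {"particle-physics-vo"},
}

def is_allowed(user, community, action):
    """Permit an action only if the user belongs to the community and
    the community itself holds the right -- no per-user, per-machine ACLs."""
    return (community in MEMBERSHIP.get(user, set())
            and action in COMMUNITY_RIGHTS.get(community, set()))

print(is_allowed("alice", "climate-modelling-vo", "submit_job"))    # True
print(is_allowed("alice", "climate-modelling-vo", "write_dataset")) # False
```

The point of the design is that a task spawned on a remote machine need only prove it acts for the community, not re-establish the individual's identity at every hop.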
Another important aspect of Grid computing is that it expects to handle data sets of terabyte size and larger. Its protocols are designed to cope with this efficiently: a legacy of the big-iron scientific work, but one that's becoming ever more appropriate to commercial needs.
OGSA realises all of the above through a mixture of new protocols and existing, established concepts. Based around the idea of Virtual Organisations (VOs), it supports, via standard interfaces and conventions, the creation, termination, management and invocation of transient services as named, managed entities with dynamic, managed lifetimes. The conventions that define these services are expressed using the existing Web Services Description Language (WSDL), an XML-based way of saying how things can be built out of messages with data or executable contents. Various ways of linking this to standards such as SOAP, MIME and HTTP exist, but the emphasis is on adaptable ways of building in ideas as they prove appropriate.
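To give a flavour of what a WSDL description looks like, here is a minimal, hypothetical fragment for a job-submission service. The service, namespace and message names are invented for illustration; real OGSA service definitions layer their lifetime-management conventions on top of descriptions like this:

```xml
<!-- Hypothetical minimal WSDL 1.1 fragment: all names are illustrative -->
<definitions name="JobService"
             targetNamespace="http://example.org/jobservice"
             xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:tns="http://example.org/jobservice"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <message name="SubmitJobRequest">
    <part name="jobSpec" type="xsd:string"/>
  </message>
  <message name="SubmitJobResponse">
    <part name="jobHandle" type="xsd:string"/>
  </message>
  <portType name="JobPortType">
    <operation name="SubmitJob">
      <input message="tns:SubmitJobRequest"/>
      <output message="tns:SubmitJobResponse"/>
    </operation>
  </portType>
</definitions>
```

The description says only what messages exist and how operations combine them; a separate binding then maps those operations onto a transport such as SOAP over HTTP.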
As you may surmise, the development of a global system of distributed computing is expected to take a fair amount of time, and involves many novel concepts. It's still very much a work in progress -- but one that shows every sign of going mainstream as these ideas take hold. For further reading, the Globus Project's Web site is an excellent starting point, and the Global Grid Forum covers much of the bigger picture.