IBM: Grids, the Internet and Everything

Irving Wladawsky-Berger's latest focus is on two next-generation technologies: grid computing and autonomic systems

Vice president of technology strategy for IBM, Irving Wladawsky-Berger, is a 32-year IBM veteran whose career has included stints in research, product development, business management and strategic planning. He now believes the Internet is on the verge of becoming a global virtual computer, like a utility power grid, with computing resources available on demand.

Q: You have been leading IBM's efforts to understand the Internet as a business. How would you characterise the Internet today?

A: The Internet without a doubt is one of those once-in-a-generation technologies that will transform everything. When I say everything, I mean everything. The Internet is transforming all aspects of business, government, healthcare and learning, and is making a huge difference in our personal lives. But all that takes time. Electricity was once such a change, and of course it took a while for electricity to have its effect; the same goes for the automobile.

We are in the early stages of the Internet. If you look at the Internet today, we have a lot more bandwidth than we had five years ago, but this is still poor compared to what we hope to have 10 years from now. If you look at the number of houses in the US that have access to broadband, it is still not a very large number.

The kinds of applications you can do over the Internet once everybody has access not just to base broadband, but to the next generation of broadband with 5 to 10 megabits per second, will be significantly better. So we are in the early stages of the Internet; there is no question in my mind.

When will we get to a stage when those rich media applications can be run over the Internet? Is it five years, ten years -- what is your prediction?

Here is my feeling. I say feeling because I really don't know. These things are hard to predict. I think when we get to the point where over 50 percent of houses have broadband access, then you can start planning applications much more with broadband in mind. You can truly start not just taking into account 56K connection speeds, which you have to today, but start planning the next generation of applications that will only work well with broadband. We'll probably reach 50 to 60 percent critical mass in another two to three years. I think by 2005 it will start taking off more.

The technologies to build out the broadband infrastructure exist today. Is it more about waiting for the economy to get kick started again?

Certain things take time. It's not just a matter of a slow or fast economy. We are talking about technologies that are truly transforming business, creating a digital economy, and totally changing the way people get entertainment in their homes. The fact that those things might take 5, 10 or 15 years to roll out is not surprising. Because we were able to do so much so quickly in the late 1990s we all probably said "boy, this is a breeze" and we can just keep going. There are quite a few things we can do quickly, but other things, especially those that require deployment of telecommunications infrastructure, just take longer.

Grids: Virtualised services based on open standards

Q: There are some new developments such as IP version 6 and grid computing, which purports to be the utility for computing power. How do you see it playing out in terms of timeframes and capabilities that will be available?

A: Grid computing is extending the Internet to be able to become a computing platform. Let me explain. The Internet is a great network with TCP/IP supporting all kinds of network accesses. It's a great communications mechanism with email and instant messaging, and of course with the Worldwide Web, it's a fantastic repository of content.

We now want to take it to the next level, in which applications can be distributed all over the Internet and can reach all the resources that they need and are allowed to access, with the proper security, even though they are distributed over the Internet. To have such distributed applications you need a set of protocols that everybody can use. That's what the grid community has been building, and that is what grid computing is about.

Is big business involved in helping to set standards and create vertical slices that will allow them to take some of their corporate or campus resources and turn them into a grid?

The grid, just like TCP/IP and then the Worldwide Web, started in the research community. In fact, some great applications are already exploiting the capabilities of grid computing in the research world.

Globus is probably the grid organisation (and there are several others) that has been most successful in sophisticated research grid applications. What's been happening now with grid, and with Globus in particular, is just like what happened with TCP/IP and the Web: the business world has begun to notice. The business world has begun to say: How can I apply these protocols and capabilities that the research community has been using successfully to now solve business problems like computing on demand, software as a service, applications on demand, and significantly increasing the efficiency of my total infrastructure and quality of service?

Now businesses like IBM are working very closely with the grid communities to be able to move grid protocols to support more commercial requirements and to start building products and services that exploit them.

So, a company could wire up its computers and during down time use them to crunch big number applications?

The thought is that by using grid protocols, over time you can link all your computing resources and run them as one virtual computer. You can have sophisticated schedulers and workload managers, and then whenever there is computing power that is not being used and you need it for other applications, you can go get it. The fact that it is in a different computer is immaterial, because you are running the whole thing as a virtual computer.
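The idea of running many machines as one virtual computer can be illustrated with a toy scheduler. This is only a sketch under simple assumptions: the `Node` and `VirtualComputer` classes, the CPU counts and the "most idle first" placement rule are all hypothetical and are not taken from Globus, IBM or any real grid protocol.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One physical machine in the pool (hypothetical model)."""
    name: str
    total_cpus: int
    used_cpus: int = 0

    @property
    def idle_cpus(self) -> int:
        return self.total_cpus - self.used_cpus


@dataclass
class VirtualComputer:
    """Presents many nodes to the caller as a single pool of CPUs."""
    nodes: list[Node] = field(default_factory=list)

    def idle_cpus(self) -> int:
        return sum(n.idle_cpus for n in self.nodes)

    def submit(self, job: str, cpus: int) -> dict[str, int]:
        """Place a job on idle capacity, spreading across nodes if needed.

        Returns a mapping of node name -> CPUs granted. The caller never
        picks a node. Raises RuntimeError if the pool is too busy.
        """
        if cpus > self.idle_cpus():
            raise RuntimeError(f"not enough idle capacity for {job}")
        placement: dict[str, int] = {}
        remaining = cpus
        # Assumed policy: use the nodes with the most idle capacity first.
        for node in sorted(self.nodes, key=lambda n: n.idle_cpus, reverse=True):
            if remaining == 0:
                break
            grant = min(node.idle_cpus, remaining)
            if grant > 0:
                node.used_cpus += grant
                placement[node.name] = grant
                remaining -= grant
        return placement


pool = VirtualComputer([Node("desk-a", 4, used_cpus=3), Node("server-b", 8, used_cpus=2)])
placement = pool.submit("overnight-batch", 7)
```

The caller asks the pool for seven CPUs and gets them, even though no single machine has seven free. Which machine does the work is, as the answer above puts it, immaterial.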

As you said, there need to be standards for big business to be able to use grid computing. Is grid computing becoming a form of Web services? What is the progress at this point in developing those protocols and Web services that will run on the network?

One of the major advances that we have made in bringing grid protocols to the commercial world is to develop grid services, that is, a grid architecture that is services oriented. To do that we all work together -- the grid community, Globus, IBM, and other companies -- to develop the open grid services architecture based on Web services. The open grid services architecture, which is the next generation of Web protocols, will use Web services standards for exchanging information, such as WSDL and SOAP, and UDDI for the directories, and then will develop essentially grid services to do security, authentication, allocation of computing power, scheduling, exchange of files and data and all the various services that grids will offer. Now they are being rendered using Web services as the underlying architecture. That's a major step in getting the whole community to embrace Web services moving to grid services.

Once you make it easy for people to get services on demand -- because you have an open set of protocols and because you have what we call in computer science virtual access to those services, meaning that you don't know where they get performed -- you simply request them and it's somebody else's job to do them. Once you have virtualised the services and they are based on open standards, they may be performed in your local department or in the centralised data centre of your business. But more and more businesses will make decisions about which services they should run themselves and which they are better off outsourcing to service providers who can do it less expensively and without them having to spend their capital and skills. I am expecting that over time, as a result of the emergence of Web and grid services, there will be a very rich service provider community offering all kinds of services to the consumers of IT, that is, businesses large and small. Some will be computing power, storage capacity, software and all kinds of applications and business processes. A business will have lots of choices as to what to do and who to buy from.

Q: One of the issues that comes up with having a ubiquitous virtual computer running as part of the Internet is that you are also creating an inherently complex system. Another area of investigation is autonomic systems, which it is purported will create a more self-healing environment for computing. Can you give us a status report on autonomic computing?

A: Let me first comment on the complexity question because this is very important. Like other major technology infrastructures, such as electricity or the telephone network, the aim is that even though the infrastructure itself is complex -- say to generate electricity you have Hoover Dam, transmission lines from Hudson Bay, nuclear power plants in Canada -- if you want to toast a bagel in the morning, you don't have to know any of that.

It is important to encapsulate the complexity so that users of IT and developers of applications don't have to deal with it. They simply have a set of interfaces that they use and those will work very nicely.

Now comes the question of how we manage this complexity. One of the major ways that we absolutely have to develop to manage this complexity is to automate it. We have to make the systems themselves much more self-managing; to build into our systems and the whole infrastructure the ability to detect failures and heal themselves; to detect attacks and protect themselves from hackers or other attacking systems; to optimise themselves so if you have an application that is slowing down because of too much demand, let it on the fly add computing capacity so it continues to provide a reasonable response time level; to configure themselves so when you add a new piece of equipment to the system it automatically detects it, brings it online and everything works.
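The self-optimising case, adding capacity on the fly when an application slows down, can be sketched as a single decision rule. This is a minimal illustration, not a description of any IBM product: the function name `plan_capacity`, its parameters, and the assumption that response time scales roughly in proportion to load per server are all hypothetical.

```python
import math


def plan_capacity(current_servers: int, response_ms: float,
                  target_ms: float, max_servers: int) -> int:
    """Return how many servers to run next (hypothetical policy).

    If the measured response time is within the target, keep the current
    size. Otherwise grow the pool, assuming response time is roughly
    proportional to load per server, and never exceed max_servers.
    """
    if response_ms <= target_ms:
        return current_servers
    # Scale the pool by the ratio of measured to target response time.
    needed = math.ceil(current_servers * response_ms / target_ms)
    # Always add at least one server when over target, up to the cap.
    return min(max(needed, current_servers + 1), max_servers)


next_size = plan_capacity(current_servers=4, response_ms=300.0,
                          target_ms=200.0, max_servers=10)
```

A real autonomic system would run a rule like this continuously against live measurements and would also need to shrink the pool when demand falls. The sketch covers only the growth decision described above.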

To do that we launched an initiative in IBM last year called autonomic computing. We launched it as an industry-wide initiative because we felt these are such complicated problems that all of us in the industry need to cooperate to make everything more self-managing.

What progress has been made so far in getting cooperation from other companies?

A number of companies have developed their own autonomic computing projects, such as HP and Unisys. I think that, as often happens in the industry, when someone sees a good idea, the markets follow. We are all exchanging research ideas. For example, our Almaden research labs in San Jose, California, ran an autonomic computing meeting in early April. We had participants from Stanford, the University of California, Berkeley, Sun, HP and other companies in the industry all exchanging ideas at the research level.

An area where we will collaborate in building autonomic infrastructure will be around grid computing precisely because in order to build a heterogeneous, self-managing infrastructure, you need to have a set of common protocols that run on every system so that the various systems can collaborate with each other.

You will see us, as a community, collaborate more on security that works across various systems and on self-healing algorithms. For example, workload managers that run across the infrastructure can detect which nodes are having problems and should therefore be taken out of line, and route that work to other nodes that are operating well. Because the infrastructure is heterogeneous, we need common protocols. Building autonomic capabilities is one of the major application areas on top of grid protocols.
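A workload manager that takes troubled nodes out of line and routes work to the others can be sketched as follows. The `HeartbeatRouter` class, the five-second timeout and the round-robin choice among healthy nodes are assumptions made for illustration. They do not describe a specific product or grid protocol. Time is passed in explicitly so the behaviour is easy to follow.

```python
class HeartbeatRouter:
    """Routes work only to nodes that have reported in recently."""

    def __init__(self, nodes, timeout_s=5.0):
        self.timeout_s = timeout_s
        # Time of the last heartbeat per node; None means never heard from.
        self.last_seen = {name: None for name in nodes}
        self._next = 0  # round-robin counter

    def heartbeat(self, node, now):
        """Record that a node reported in at time `now` (seconds)."""
        self.last_seen[node] = now

    def healthy(self, now):
        """Nodes whose last heartbeat is within the timeout window."""
        return [name for name, seen in self.last_seen.items()
                if seen is not None and now - seen <= self.timeout_s]

    def route(self, now):
        """Pick the next healthy node, skipping any that went silent."""
        candidates = self.healthy(now)
        if not candidates:
            raise RuntimeError("no healthy nodes available")
        node = candidates[self._next % len(candidates)]
        self._next += 1
        return node


router = HeartbeatRouter(["a", "b", "c"], timeout_s=5.0)
for name in ("a", "b", "c"):
    router.heartbeat(name, now=0.0)
# Node "b" goes silent; "a" and "c" keep reporting.
router.heartbeat("a", now=8.0)
router.heartbeat("c", now=8.0)
chosen = [router.route(now=10.0) for _ in range(3)]
```

Node "b" is never explicitly marked as failed. It simply stops being chosen once its last heartbeat is older than the timeout, and it would rejoin the rotation as soon as it reports in again.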

'Grand challenge' problems demand more sophisticated systems

Many companies, like IBM, are developing point solutions such as workload managers. But the goal was to create something more like the human nervous system in terms of automatic response systems in which things happen quite naturally. What is the gap between building these kinds of point solutions, which solve discrete problems like workload management and fail-overs, and something more sophisticated that draws more on artificial intelligence and control theory?

Especially when you want those autonomic capabilities to work across the infrastructure, those are very difficult problems. Those are right now in the research stage. Again, we have to agree on a common set of protocols, which is part of what has driven us together around this grid community. We have to work with universities and research labs around the world to develop it.

These are very challenging, sophisticated problems. We will be able to do simpler things first. There is quite a bit you can do in workload management, such as detecting failures with heartbeat controls on all the nodes of the infrastructure. Other things, like being able to do a very good job of detecting denial-of-service attacks and taking action, will take longer because there has to be sophisticated pattern recognition work to tell the difference between a surge in usage and a true denial-of-service attack. You will see progress at different levels. I would call these Grand Challenge problems, meaning these are definitely complex problems that will take quite a while to get right.

Nanotechnology, which has the ability to create machines atom by atom, has been getting more press lately. When do you think nanotechnology will be practical? Within the next 10 years?

A lot of this depends on the definition of what we mean by nanocomputers. We are seeing computers get smaller and more sophisticated. We are entering the next stage of computers called blades, which are nowhere near nanotechnology, but they are significantly smaller than even today's computers and will let us put together much larger numbers of computers in clusters.

We will see that every year we learn how to build smaller and smaller components and how to integrate them closer and closer together until eventually -- probably 8-10 years -- we will be approaching a level that feels more like nanotechnology, but a lot of it will depend on the definition.

We have projects at IBM research. For example, the Blue Gene project is aiming to put together a million different computers interconnected in a special network to achieve petaflops of computing power, in particular to tackle very sophisticated life science problems that are ahead of us, such as genomics and proteomics research. If you look at what is going on in storage technologies and the incredible densities that one is able to achieve, we are almost reaching the levels where quantum effects begin to matter. We are beginning to reach those points and people have to get much more sophisticated in building computers, but I view that as an evolutionary step.
