Boldly Googling into the future

Google's chief technology officer Craig Silverstein claims the future of search technology will see science fiction become science fact, but in the meantime, the best option is to fake it
Written by Andrew Donoghue, Contributor

With former customers such as Microsoft and Yahoo looking to knock Google off the top of the lucrative search market, the company has opted to go on the offensive rather than retreat into lock-down and cost-cutting.

Keen to maintain its innovative reputation, Google is expanding its product range and is about to move into even larger offices in Silicon Valley formerly occupied by an increasingly troubled Silicon Graphics. It also recently expanded its number of world-wide offices to 21, with the opening of a Spanish sales branch in Madrid.

And when other tech companies are announcing job cuts, Google continues to swell staff numbers with computer science PhDs, currently totalling 60, through schemes such as its Code Jam competition. ZDNet UK spoke to Google chief technology officer Craig Silverstein about plans for future innovation and impending competition.

Google product manager Marissa Mayer recently said that search is still in its infancy -- how do you think it will look when it's 'grown up'?

When search grows up, it will look like Star Trek: you talk into the air ("Computer! What's the situation down on the planet?") and the computer processes your question, figures out its context, figures out what response you're looking for, searches a giant database in who-knows-how-many languages, translates/analyses/summarises all the results, and presents them back to you in a pleasant voice. I think this technology is about, oh, 300 years off. Just getting the computer to understand your question, much less the context it's being asked in, is way beyond the state of the art in computer science right now.

The best we can do in the meantime is "fake it" -- either by pretending to understand text even though we don't, or by leveraging human intelligence to do our computations. That's what PageRank does: it makes use of the links that people make between Web pages even though it doesn't understand why exactly someone decided to make a link between page A and page B.
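To make the idea concrete, here is a minimal sketch of the link-counting approach Silverstein describes, based on the simplified PageRank iteration from the published literature. It is not Google's production code: the function name `pagerank`, the toy link graph and the iteration count are assumptions made for illustration, and the damping factor of 0.85 is the value commonly quoted from the original paper.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages from link structure alone (simplified, illustrative).

    links maps each page to the list of pages it links to.
    The content of the pages is never inspected.
    """
    # Collect every page that appears as a source or a target.
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its rank on, split evenly across its links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical three-page web: A and B both link to C, C links back to A.
    toy_web = {"A": ["C"], "B": ["C"], "C": ["A"]}
    for page, score in sorted(pagerank(toy_web).items()):
        print(page, round(score, 3))
```

In this toy graph, page C comes out on top simply because two pages link to it; the code has no idea why those links were made, which is the "faking it" Silverstein refers to.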

Besides the intelligence part, there are other issues involved in growing up.  One is dealing with more diverse types of data formats, including non-textual ones. Another is dealing better with translation -- why should a result be bad for you just because it's written in a language you don't know?

Some of this is still in the research stage and not quite ready for prime time. Others are easy technologically, but have issues on the business side: for instance, searching music files.

Voice-enabled searching isn't completely science fiction at the moment, is it? You are working on a project with BMW in this area.

Voice recognition is still very much a research problem. The BMW project is promising, but I think there's more work to be done, both in academia and in industry. This particular sub-problem I think we can lick in sooner than 300 years.

Google News and the recently launched specialist ecommerce search site Froogle are still in beta. How long before they are officially finished products?

We're continuing to develop new features -- Froogle just added the ability to sort by price, for instance, which a lot of people had asked for -- and we're continuing to evaluate feedback on what people think we're doing well and not so well. Once we're happy with the features we offer and the feedback we get on them, we'll take these projects out of beta. We're in no rush though; the main Google site stayed in beta for years.

Microsoft has publicly voiced its intentions to move into search in a bigger way with the eventual release of Longhorn, which will attempt to unify local and Internet searches. How is Google going to meet this challenge if and when it happens? Do you see Microsoft as a bigger threat than Yahoo long-term?

We're glad to see competition in search, because it means that companies are focusing on what I believe to be a very important problem on the Web: finding what you want out of all the information that's out there -- the more minds that attack that problem, the better. Obviously, I'd rather those minds work for Google.  But regardless, I as a Web user benefit as long as the competition is there and stays focused on the technology.

You reportedly have one of the biggest Linux clusters in the world (more than 10,000 servers) -- what's your opinion of the recent SCO lawsuit and what it could mean for Linux users if it's upheld? Has it made Google nervous of basing its systems around open-source?

The actual lawsuit is very narrow in its claims; we're not nervous about it at all. It's prompted lots of discussion, which has been very interesting to watch.

You have a very cost-effective approach to your internal architecture. Could you expand on Google's general approach to its internal systems?

We're cheap.  We use commodity computers -- thousands of them, all hooked together, to get the processing power we need -- and because it's off-the-shelf stuff, each computer is very cheap.  We've had to design our software to work well in such an environment: it has to be scalable and tolerant of errors, since when you have thousands of computers at least one is always on the blink, but it's been a very worthwhile investment for us.
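A rough sketch of why software on such a cluster has to tolerate errors, and one simple way it might do so. Nothing here describes Google's actual systems: the per-machine failure probability, the function names `probability_any_down` and `query_with_failover`, and the idea of modelling replicas as plain callables are all assumptions made for illustration.

```python
def probability_any_down(machines, p_down):
    """Chance that at least one of `machines` independent computers is down,
    given each is down with probability `p_down` (an assumed figure)."""
    return 1.0 - (1.0 - p_down) ** machines


def query_with_failover(replicas, query):
    """Try each replica in turn and return the first successful answer.

    `replicas` is a list of callables standing in for servers that hold
    the same data; a failed server is modelled as a callable that raises.
    """
    errors = []
    for replica in replicas:
        try:
            return replica(query)
        except Exception as exc:  # any failure just means "try the next one"
            errors.append(exc)
    raise RuntimeError(f"all {len(replicas)} replicas failed: {errors}")


if __name__ == "__main__":
    # With 10,000 machines and an assumed 0.1% chance that any one is down,
    # some machine is almost certainly broken at any given moment.
    print(probability_any_down(10_000, 0.001))

    def broken_server(query):
        raise ConnectionError("machine is on the blink")

    def healthy_server(query):
        return f"results for {query!r}"

    print(query_with_failover([broken_server, healthy_server], "star trek"))
```

The design point is that a single cheap machine failing is treated as routine rather than exceptional: the caller simply moves on to another copy of the data.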

Your caching process has been criticised for bypassing some companies' paid-for content and other patent issues -- do you see the company having to tweak how it works, and would this affect the search time and performance of the site?

We believe the cache is a very helpful feature for sites whose content changes frequently: a user can see why we thought the page was a good match for their query, even if the Web page has totally changed since we last indexed it. For individual Webmasters who would rather we don't cache the page, for whatever reason, we make it very easy for them to opt out of the program, either by putting appropriate tags in their Web page or using an automated system we have.  I think this strikes a nice balance between the various concerns.
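As an illustration of the tag-based opt-out, the sketch below checks a page for a robots meta tag carrying the `noarchive` directive, which is the conventional way for a Webmaster to ask search engines not to show a cached copy. The interview does not name the tag, so treating `noarchive` as the tag in question is an assumption, as are the class and function names; this is not a description of how Google's crawler is actually written.

```python
from html.parser import HTMLParser


class NoArchiveDetector(HTMLParser):
    """Scans HTML for <meta name="robots" content="... noarchive ...">."""

    def __init__(self):
        super().__init__()
        self.noarchive = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # HTMLParser lowercases tag and attribute names for us.
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() in ("robots", "googlebot"):
            directives = [
                d.strip().lower() for d in attributes.get("content", "").split(",")
            ]
            if "noarchive" in directives:
                self.noarchive = True


def may_cache(html):
    """Return False if the page asks not to be cached, True otherwise."""
    detector = NoArchiveDetector()
    detector.feed(html)
    return not detector.noarchive


if __name__ == "__main__":
    opted_out = '<html><head><meta name="robots" content="noarchive"></head></html>'
    ordinary = "<html><head><title>News</title></head></html>"
    print(may_cache(opted_out), may_cache(ordinary))
```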

Google is hiring and running programs like the Code Jam while other firms are still weathering the slump -- do you think the industry is going to see an up-tick any time soon?

I couldn't say about the industry as a whole, but Google continues to hire as fast as we can find good people. There's a lot for us to do, and only 300 years to do it in!
