When I was in Brazil recently, I met with Berthier Ribeiro-Neto, head of engineering at Google Brazil. During our conversation I mentioned an idea I had about making the Google index into an open database that anyone could access. I said that this could dramatically speed up the Internet.
He said it was a good idea and that I "should write a position paper" on this subject.
(As a further thought, maybe it could also serve to take away some of the heat Google is feeling lately, in terms of its index rankings potentially favoring its own business interests.)
Here is my logic:
My server logs show that 20 different robots visit my site; one of the more frequent ones is the Googlebot. Each of these robots is trying to create an index of my site.
Each of these robots takes up a considerable amount of my resources. For June, the Googlebot ate up 4.9 gigabytes of bandwidth, Yahoo used 4.8 gigabytes, while an unknown robot used 11.27 gigabytes of bandwidth. Together, they used up 45% of my bandwidth just to create an index of my site.
These robots are all seeking the same information, and they use nearly one-half of my bandwidth, slowing the site for all my readers. The same is true for tens of millions of other web sites.
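For anyone who wants to check the same thing on their own site, here is a rough sketch of how the robot share of bandwidth could be tallied from an access log. It assumes the server writes the common "combined" log format; the list of robot markers and the two sample log lines are made up for illustration and are not taken from my logs.

```python
import re
from collections import defaultdict

# Tail of a combined-format log line: "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'"[^"]*" (?P<status>\d{3}) (?P<bytes>\d+|-) "[^"]*" "(?P<agent>[^"]*)"'
)

# Substrings used to recognize crawlers; an illustrative, incomplete list.
ROBOT_MARKERS = ("googlebot", "slurp", "bot", "crawler", "spider")


def is_robot(agent):
    """Guess whether a user-agent string belongs to a robot."""
    agent = agent.lower()
    return any(marker in agent for marker in ROBOT_MARKERS)


def robot_bandwidth_share(lines):
    """Return (bytes sent per robot user-agent, robot fraction of all bytes)."""
    by_agent = defaultdict(int)
    total = 0
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None:
            continue  # skip lines that are not in the assumed format
        sent = 0 if match.group("bytes") == "-" else int(match.group("bytes"))
        total += sent
        if is_robot(match.group("agent")):
            by_agent[match.group("agent")] += sent
    robot_total = sum(by_agent.values())
    share = robot_total / total if total else 0.0
    return dict(by_agent), share


# Two invented log lines: one robot request and one reader request.
SAMPLE_LINES = [
    '192.0.2.1 - - [01/Jun/2010:00:00:01 +0000] "GET / HTTP/1.1" 200 4500 '
    '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.2 - - [01/Jun/2010:00:00:02 +0000] "GET /post HTTP/1.1" 200 5500 '
    '"-" "Mozilla/5.0 (Windows NT 6.1) Firefox/3.6"',
]
```

Calling `robot_bandwidth_share(SAMPLE_LINES)` returns the per-robot byte counts along with the robot fraction of total traffic. Matching on user-agent substrings is crude, since robots that do not identify themselves are counted as readers.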
What if there was a single index that anyone could access?
You would get an immediate speed increase in the Internet for no additional investment in infrastructure.
Google and others could perform their own analysis of the index using their secret algorithms. After all, the value is not in the index; it is in the analysis of that index.
Mr. Ribeiro-Neto said, "That's a good idea. You probably wouldn't even need to spider the web sites."
Each web site could update the central index automatically each time something changed. This would result in massive savings in the bandwidth now used by dozens of robots scouring the Internet for new information.
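To make the push idea a little more concrete, here is a minimal sketch of what a site might do on its side. Everything in it is an assumption for illustration: the endpoint URL, the field names, and the idea of comparing content hashes are mine, and no such shared index service exists to receive these updates.

```python
import hashlib
from datetime import datetime, timezone

# Hypothetical address of a shared index; no such service actually exists.
INDEX_ENDPOINT = "https://index.example.org/update"


def content_hash(content):
    """Fingerprint a page so unchanged pages can be skipped."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def build_update(url, content, changed_at=None):
    """Describe one changed page for the (hypothetical) central index."""
    changed_at = changed_at or datetime.now(timezone.utc)
    return {
        "url": url,
        "sha256": content_hash(content),
        "changed_at": changed_at.isoformat(),
        "content": content,
    }


def pending_updates(last_pushed_hashes, pages):
    """Return updates only for pages that changed since the last push.

    last_pushed_hashes maps URL -> hash sent previously;
    pages maps URL -> current page content.
    """
    updates = []
    for url, content in pages.items():
        if last_pushed_hashes.get(url) != content_hash(content):
            updates.append(build_update(url, content))
    return updates
```

Each update returned by `pending_updates` would then be sent to `INDEX_ENDPOINT`, for example as an HTTP POST. The point of the design is that the site sends each change once, rather than serving the same pages to every robot that comes asking.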
What if Google opened up its index to the world as a goodwill gesture because it has the best index? It could still maintain the privacy of its algorithm but everyone would have the same information on which to perform their analysis.
It would show that there was nothing unusual or unethical in how Google collects information for its index. This might relieve some of the pressure it has come under this week to reveal more about how its search service is presented.
Also, Google's founders were once strong advocates of running the search index as a non-profit.
From page 39 of "Inside Larry and Sergey's Brain" by Richard Brandt (referral link):
Andrei Broder, who led the team that created the AltaVista search engine, the best of its time, talks about meeting Larry and Sergey:
"When the discussion turned to the topic of making money from the technology, Broder found that Page had a profound difference of philosophy on the subject. 'It was a very funny thing about Larry,' Broder recalls. 'He was very adamant about search engines not being owned by commercial entities. He said it should all be done by a nonprofit. I guess Larry has changed his mind about that.'"
Brian Lent, now CEO at Medio Systems:
"The problem with the Google search engine at the time, Lent recalls, is that Larry and Sergey didn't want to commercialize it, and Lent was anxious to become an entrepreneur. Their mantra at the time was more socialistic than entrepreneurial. 'Originally, "Don't be evil" was "Don't go commercial,"' says Lent."
- - -
- The NYTimes: The Google Algorithm
- FT.com / Comment / Opinion - Do not neutralize the web’s endless search (Subscription required.)