Search is the most important Internet application and it is vital to the health of the Internet and to online commerce.
But looking more closely at search, and how the index is compiled, I'm beginning to realize it is highly inefficient and places a huge burden on the Internet.
I'd love to know how much processing power and how much bandwidth are used by the dozens of spiderbots crawling the web, each compiling its own index.
If I look at my server stats, the amount of resources spiderbots take is huge. SVW is visited by 16 spiderbots daily and they consume 45% of my bandwidth, yet they return only 6% of total user traffic.
Not every site receives as many spiderbot visits, but the combined load would still represent a very large burden. It's a huge inefficiency at the heart of the Internet.
And with that comes the carbon cost of the energy required to run the servers and communications networks -- all to service an inefficient search industry.
What if there were a single search index held in common and administered by a non-profit that everyone could access? Surely that would negate the need for anyone to crawl the web and access the same data to construct their own search index?
The energy savings globally would be massive. And it would free up resources, resulting in a faster Internet and one that has room to expand, without needing to invest in any new servers or communication lines.
Google's founders originally believed that search should be in the public domain. Google could still be Google because its value lies in its unique ability to analyze the search index, rank the results, and serve up ads.
Surely it would be more efficient to have a single search index held in common than to have the current system, where as much as half of a site's bandwidth is consumed to create multiple indexes of the same data?
And it wouldn't hurt the business of the search engines. In fact, it could result in a platform that spurs innovation, allowing smaller companies to create new algorithms and improve search -- the single most important application on the Internet.
- - -