
Meta-search: More heads better than one?

Commentary--Meta-search must combine results by weighted voting, must not overly bias them with too many paid listings, and must use as many search engines as possible.
Written by Raul Valdes-Perez, Contributor
Meta-search engines send a user's query to multiple search engines and blend the top results from each into one overall list. A final step can involve clustering the combined results, but the results of both meta-search and regular search engines can be clustered, so the clustering issue is separate. It’s been claimed that meta-search is inherently worse than regular search engines. We assert the contrary: meta-search has intrinsic advantages that are based on voting.

According to the UC Berkeley Teaching Library Internet Workshops, "‘Smarter’ meta-searcher technology includes clustering and linguistic analysis that attempts to show you themes within results, and some fancy textual analysis and display that can help you dig deeply into a set of results. However, neither of these technologies is any better than the quality of the search engine databases they obtain results from. ... We recommend directly searching each search engine and recommend AGAINST using meta-searchers." [bolding and emphasis are from the source.]

If this "two heads cannot be better than one" claim were true, then there would be no value in having nine Supreme Court Justices; just appoint one Solomon. Folk wisdom asserts the contrary: two heads are better than one.

General Web crawlers today are harmed by the noise of blog cross-linking, link-bombing or Google-bombing, and commercial efforts to skew PageRank scores. News search engines are "noisy" because ranking news is even more arbitrary than ranking cross-linked web pages. Greg Notess has often reported on how little overlap occurs among the top results of regular engines like AltaVista, Google, Yahoo, etc. The overlapping search results are presumably better than the unique hits found by a single engine, but these unique hits compete for space and user attention with the consensus-best results, which the user is unable to distinguish from the unique hits.

To see how meta-search can lead to improved results, consider how electrical engineers perform averaging of noisy signals, which cancels out random noise and reveals the original noise-free signal. Since Web noise affects regular search engines in different ways, meta-search filters noise by averaging the votes of the underlying engines, revealing the consensus best results.
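The averaging analogy can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the sawtooth "signal", the Gaussian noise model, the 25 noisy copies, and the helper names are hypothetical and come from no particular engineering tool.

```python
import random
import statistics


def noisy_copies(signal, n_copies, noise_std, rng):
    """Return n_copies of the signal, each with independent Gaussian noise added."""
    return [[s + rng.gauss(0.0, noise_std) for s in signal] for _ in range(n_copies)]


def average_copies(copies):
    """Average the copies point by point (one mean per sample position)."""
    return [statistics.fmean(column) for column in zip(*copies)]


def rms_error(estimate, signal):
    """Root-mean-square difference between an estimate and the true signal."""
    squared = [(e - s) ** 2 for e, s in zip(estimate, signal)]
    return (sum(squared) / len(signal)) ** 0.5


rng = random.Random(0)                           # fixed seed so runs are repeatable
signal = [float(i % 10) for i in range(200)]     # hypothetical noise-free signal
copies = noisy_copies(signal, n_copies=25, noise_std=1.0, rng=rng)

single_error = rms_error(copies[0], signal)                  # one noisy measurement
averaged_error = rms_error(average_copies(copies), signal)   # mean of all 25

print(f"single copy RMS error: {single_error:.3f}")
print(f"averaged RMS error:    {averaged_error:.3f}")
```

With independent noise, the error of the average shrinks roughly with the square root of the number of copies, which is the effect the analogy relies on. The analogy carries over to search only to the extent that different engines are misled in different ways.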

For meta-search engines to live up to this promise, they must combine search results by weighted voting rather than by round-robin, they must not overly bias their results by inserting too many paid listings, and they should use as many underlying search engines as is practical.
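The difference between weighted voting and round-robin can be sketched in a few lines of Python. This is one possible reading, not a description of any real meta-search engine: the rank-based scoring (a Borda-style count), the per-engine weights, the engine names, and the example URLs are all hypothetical.

```python
from collections import defaultdict
from itertools import zip_longest


def weighted_vote(rankings, weights=None, depth=10):
    """Merge ranked lists by summing rank-based votes (assumed Borda-style scheme).

    Each engine gives `depth` points to its first result, `depth - 1` to its
    second, and so on, scaled by that engine's weight (1.0 if unspecified).
    """
    scores = defaultdict(float)
    for engine, results in rankings.items():
        weight = 1.0 if weights is None else weights.get(engine, 1.0)
        for position, url in enumerate(results[:depth]):
            scores[url] += weight * (depth - position)
    # Highest total first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda url: (-scores[url], url))


def round_robin(rankings):
    """Merge ranked lists by taking one result from each engine in turn."""
    merged, seen = [], set()
    for tier in zip_longest(*rankings.values()):
        for url in tier:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged


# Hypothetical results: each engine has its own noisy hit, but all agree
# that consensus.example and good.example are relevant.
rankings = {
    "engine_a": ["spam-a.example", "consensus.example", "good.example"],
    "engine_b": ["spam-b.example", "consensus.example", "good.example"],
    "engine_c": ["consensus.example", "spam-c.example", "good.example"],
}

voted = weighted_vote(rankings)
interleaved = round_robin(rankings)
print(voted)
print(interleaved)
```

In this toy example, voting puts the two results that every engine returned at the top, while round-robin leads with two hits that only a single engine found and pushes good.example to the bottom. Passing a `weights` dictionary lets a more trusted engine count for more.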

Meta-search improves on search engines by canceling noise. Nine good heads are better than one Solomon, both in Justice and on the Web.

biography
Raul Valdes-Perez is CEO and co-founder of Vivisimo, a provider of search, meta-search and clustering software, and parent company of Clusty.com.
