IBM wants to speed up Google's PageRank

Written by Roland Piquepaille

You all know that PageRank is the name of an algorithm -- trademarked and patented -- used by Google to sort the billions of records in its databases. And while the PageRank patent has already been updated twice since 2001, other companies have also patented various methods for other sorting algorithms. The latest is IBM, which recently received a patent to speed up PageRank, as 'SEO by the SEA' reported earlier this month. I have been using Google since September 1999 and I have never found its search engine slow. So what are IBM's motives?

Before looking at the new IBM patent, here is a little bit of history about Google's PageRank, as I told you in a previous post. And of course, check what Wikipedia says: it reminds us that the name "PageRank" is a trademark of Google -- since March 2, 2004, if you search through the records of the United States Patent and Trademark Office (USPTO) -- but that the multiple patents on the algorithm have been granted to Stanford University -- to The Board of Trustees of the Leland Stanford Junior University, to be precise.

The USPTO granted three patents covering this sorting method to Lawrence -- not Larry -- Page and Stanford University.

As you can see, the first and the third patents carry the same name. But the three patents also contain almost the same figures. Below is a figure describing how PageRank works (Credit: Stanford University/USPTO).

How Google's PageRank works

Now, here is an interesting paragraph written by William -- Bill -- Slawski about the new IBM patent.

Rather than describing what the patent contains, I’m going to recommend looking at the paper I refer to in the first paragraph of this post, which is an excellent summary of many of the ideas in the patent. Both patent and paper discuss how difficult it is to compute pagerank for all of the pages on the web, and offer a few solutions which increase speed while also reducing possible errors.

For more information, Slawski is talking about a technical paper published by the IBM Almaden Research Center in November 2001, "PageRank Computation and the Structure of the Web: Experiments and Algorithms" (PDF format, 5 pages, 64 KB).

For more details, here is a link to the new IBM patent, "System and method for rapid computation of PageRank" (Patent 7,089,252 dated August 8, 2006).

Here is an excerpt from the abstract.

The method comprises obtaining a plurality of documents, and determining a rank of each document. The rank of each document is generally a function of a rank of all other documents in the plurality of documents which point to the document and is determined by solving, by equation-solving methods (including Gauss-Seidel iteration and partitioning) of a set of equations [...]
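To give an idea of what "Gauss-Seidel iteration" means here, below is a minimal Python sketch. It is not taken from the patent. I am assuming the commonly cited linear form of PageRank, in which the rank of page i equals (1 - alpha) plus alpha times the sum, over every page j pointing to i, of rank(j) divided by the number of links leaving j. The value alpha = 0.85, the function name and the toy graph are my own illustrative choices. The sketch also assumes that every page has at least one outgoing link; pages without outgoing links need extra handling that is not shown.

```python
def pagerank_gauss_seidel(links, alpha=0.85, tol=1e-10, max_sweeps=1000):
    """Solve the assumed PageRank equations by Gauss-Seidel sweeps.

    links maps each page to the list of pages it points to.
    Returns (rank, sweeps_used). Illustrative sketch only.
    """
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    out_degree = {p: len(links.get(p, [])) for p in pages}

    # inbound[p] lists the pages that point to p.
    inbound = {p: [] for p in pages}
    for src, targets in links.items():
        for dst in targets:
            inbound[dst].append(src)

    rank = {p: 1.0 for p in pages}
    for sweep in range(1, max_sweeps + 1):
        biggest_change = 0.0
        for p in pages:
            # Gauss-Seidel: ranks already updated earlier in this sweep
            # are used immediately, instead of waiting for the next sweep.
            new_rank = (1 - alpha) + alpha * sum(
                rank[q] / out_degree[q] for q in inbound[p]
            )
            biggest_change = max(biggest_change, abs(new_rank - rank[p]))
            rank[p] = new_rank
        if biggest_change < tol:
            return rank, sweep
    return rank, max_sweeps


if __name__ == "__main__":
    # Hypothetical four-link web: a -> b, a -> c, b -> c, c -> a.
    toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    ranks, sweeps = pagerank_gauss_seidel(toy_web)
    for page, value in ranks.items():
        print(page, round(value, 4))
    print("sweeps:", sweeps)
```

The only difference from the more familiar power-style iteration is inside the inner loop: each new rank is written back at once, so pages processed later in the same sweep already see it.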

If this language is too esoteric for you, below is one of the figures associated with this patent (Credit: IBM/USPTO).

And here is an explanation of this figure given in the claims of the patent.

Considering the large-scale structure of the web, the arguments for using an equation-solving approach become even stronger. As is now well known, the graph structure of the web may be described by the "Bow Tie." The Bow Tie web structure generally comprises input segments and output segments. The input and output segments are connected to the strongly connected component. Input nodes are coupled to the input segment and output nodes are connected to the output segment. An interconnecting node directly couples the input segment and the output segment.

This "Bow Tie" theory is better explained in this other IBM document which also contains a better illustration.
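To make the "Bow Tie" vocabulary a bit more concrete, here is a rough Python sketch that labels the pages of a small link graph. It is only an illustration under my own assumptions, not the procedure described in the patent or in the IBM documents. The core is taken to be the largest strongly connected component, "in" pages are those that can reach the core, "out" pages are those reachable from it, and everything else is lumped into "other". The function names and the toy graph are hypothetical, and the component search is quadratic, so it only suits small examples.

```python
from collections import deque


def reachable(start, adjacency):
    """Return every node reachable from start (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def bow_tie(links):
    """Split a link graph into core / in / out / other sets.

    links maps each page to the list of pages it points to.
    """
    nodes = set(links) | {t for ts in links.values() for t in ts}
    if not nodes:
        return {"core": set(), "in": set(), "out": set(), "other": set()}

    # Reverse graph: reverse[p] lists the pages that point to p.
    reverse = {}
    for src, targets in links.items():
        for dst in targets:
            reverse.setdefault(dst, []).append(src)

    # Largest strongly connected component, found the simple way:
    # a node's component is what it can reach AND what can reach it.
    core, assigned = set(), set()
    for node in sorted(nodes):
        if node in assigned:
            continue
        component = reachable(node, links) & reachable(node, reverse)
        assigned |= component
        if len(component) > len(core):
            core = component

    seed = next(iter(core))
    downstream = reachable(seed, links)    # core plus the output side
    upstream = reachable(seed, reverse)    # core plus the input side
    return {
        "core": core,
        "in": upstream - core,
        "out": downstream - core,
        "other": nodes - upstream - downstream,
    }


if __name__ == "__main__":
    # Hypothetical toy web: i feeds the cycle a -> b -> c -> a, which feeds o.
    toy_web = {"i": ["a"], "a": ["b"], "b": ["c"], "c": ["a", "o"], "t": ["z"]}
    for part, members in bow_tie(toy_web).items():
        print(part, sorted(members))
```

A split like this is one reason partitioning is attractive: pages on the input side do not depend on the core, and the core does not depend on the output side, so the pieces can in principle be handled one after the other.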

After reading all these exciting patents, what do you think of the new one from IBM? Does it make sense to you? Does IBM want to license it to Google? Please send me your thoughts.

Sources: William Slawski, SEO by the SEA, August 13, 2006; and various web sites
