Cartoon illustrating the basic principle of PageRank. The size of each face is proportional to the total size of the other faces which are pointing to it.

PageRank is a link analysis algorithm and it assigns a numerical weighting to each element of a hyperlinked set of documents, such as the World Wide Web, with the purpose of “measuring” its relative importance within the set. The algorithm may be applied to any collection of entities with reciprocal quotations and references.

The numerical weight that it assigns to any given element E is referred to as the PageRank of E.

A PageRank results from a mathematical algorithm based on the webgraph, created by all World Wide Web pages as nodes and hyperlinks as edges, taking into consideration authority hubs such as cnn.com or usa.gov. The rank value indicates an importance of a particular page.

A hyperlink to a page counts as a vote of support. The PageRank of a page is defined recursively and depends on the number and PageRank metric of all pages that link to it (“incoming links”). A page that is linked to by many pages with high PageRank receives a high rank itself.

Numerous academic papers concerning PageRank have been published since Page and Brin’s original paper. In practice, the PageRank concept may be vulnerable to manipulation.

Research has been conducted into identifying falsely influenced PageRank rankings. The goal is to find an effective means of ignoring links from documents with falsely influenced PageRank.

Other link-based ranking algorithms for Web pages include the HITS algorithm invented by Jon Kleinberg (used by Teoma and now Ask.com),the IBM CLEVER project, the TrustRank algorithm and the Hummingbird algorithm.

History

The eigenvalue problem was suggested in 1976 by Gabriel Pinski and Francis Narin, who worked on scientometrics ranking scientific journals, in 1977 by Thomas Saaty in his concept of Analytic Hierarchy Process which weighted alternative choices, and in 1995 by Bradley Love and Steven Sloman as a cognitive model for concepts, the centrality algorithm.

Larry Page and Sergey Brin developed PageRank at Stanford University in 1996 as part of a research project about a new kind of search engine.

Sergey Brin had the idea that information on the web could be ordered in a hierarchy by “link popularity”: a page ranks higher as there are more links to it.

Rajeev Motwani and Terry Winograd co-authored with Page and Brin the first paper about the project, describing PageRank and the initial prototype of the Google search engine, published in 1998.

Shortly after, Page and Brin founded Google Inc., the company behind the Google search engine.

While just one of many factors that determine the ranking of Google search results, PageRank continues to provide the basis for all of Google’s web-search tools.

The name “PageRank” plays off of the name of developer Larry Page, as well as of the concept of a web page.

The word is a trademark of Google, and the PageRank process has been patented (U.S. Patent 6,285,999). However, the patent is assigned to Stanford University and not to Google. Google has exclusive license rights on the patent from Stanford University. The university received 1.8 million shares of Google in exchange for use of the patent; it sold the shares in 2005 for $336 million.

PageRank was influenced by citation analysis, early developed by Eugene Garfield in the 1950s at the University of Pennsylvania, and by Hyper Search, developed by Massimo Marchiori at the University of Padua.

In the same year PageRank was introduced (1998), Jon Kleinberg published his work on HITS. Google’s founders cite Garfield, Marchiori, and Kleinberg in their original papers.

A small search-engine called “RankDex” from IDD Information Services, designed by Robin Li, was, from 1996, already exploring a similar strategy for site-scoring and page-ranking.

Li patented the technology in RankDex in 1999 and used it later when he founded Baidu in China in 2000.