Dec 21st, 2008 by GCR Editor
Does Google really rank colleges and universities? Of course it does! Well, not exactly. But in another sense, yes, certainly.
What Google does is index and rank webpages. And here at the Google College Rankings we use Google’s webpage rankings as a proxy for the overall quality of the institutions behind the webpages. This is not nearly as far-fetched as it sounds, once you understand how Google’s ranking algorithm works.
When you enter a search term such as “college” into Google and perform a search, you get pages and pages of links in return. Geeks refer to these pages as SERPs—search-engine results pages. How are these many pages of links assembled?
Google’s search spider, the Googlebot, is forever scanning the web, following links and looking for new webpages, and adding the contents of each page to Google’s enormous index. Whenever you perform a Google search you are searching through that index as it exists at the moment of your search. The index is in a perpetual state of revision, and not only that, it is distributed over many different physical locations—different data centers—so the results you get from one moment to the next aren’t always the same.
But even if the Google database were static, searching on a common word like “college” or “university” would return millions of pages. (As I’m typing this, Google is returning about 711,000,000 results for the word college, and 789,000,000 for the word university.) For these results to be useful, they have to be sorted in some way, with the more helpful links at the top, and the less helpful ones further down.
There’s no perfect way of doing this, of course, but Google and other search engines try their best. The way Google sorts the results is by assigning to every page on the web, in the course of its indexing process, a value Google calls the PageRank. When you search on a word like college, Google first finds all the pages it knows about that contain the word college, and then (in rough terms) it sorts them according to their PageRank: high-PageRank sites are near the top of the results, low-PageRank sites are near the bottom.
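As a rough illustration of the two steps just described (find the pages that contain the word, then order them by score), here is a minimal Python sketch. Every URL, text, and score in it is invented for the example, and `search` is a hypothetical helper of my own, not anything Google publishes.

```python
# Toy index: every URL, text, and score below is made up for illustration.
PAGES = [
    {"url": "example.edu/admissions", "text": "Apply to our college today", "score": 0.62},
    {"url": "example.org/history", "text": "A short history of the printing press", "score": 0.90},
    {"url": "example.com/guide", "text": "How to choose a college", "score": 0.35},
    {"url": "example.net/tour", "text": "Take a college campus tour", "score": 0.81},
]


def search(term, pages):
    """Return the URLs of pages whose text contains `term`, highest score first."""
    term = term.lower()
    # Step 1: keep only the pages that contain the search word.
    matches = [p for p in pages if term in p["text"].lower().split()]
    # Step 2: sort the matches so the highest-scoring pages come first.
    ranked = sorted(matches, key=lambda p: p["score"], reverse=True)
    return [p["url"] for p in ranked]


print(search("college", PAGES))
```

Note that the printing-press page has the highest score of all but never appears in the results, because it does not contain the search word. The score only decides the order among the pages that match.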
Of course, no one but a handful of Google engineers knows the exact details of how this works, because those details are closely held corporate secrets. There are dozens and perhaps hundreds of variables that go into the calculation of PageRank, and that list of variables is subject to frequent revision.
One of the most important variables that goes into the calculation of PageRank is known, however. It is the number of other webpages that link to the page in question. An important webpage, the theory goes, has many links from other pages (backlinks) pointing to it. An unimportant webpage, by contrast, is ignored by web authors at large, and few other pages link to it. The number of backlinks thus functions, in a sense, as a measure of reputation. Webpages that other webpage authors deem important within any given field receive lots of “votes” in the form of incoming links. They are therefore assigned a higher PageRank by Google, and appear higher in the SERPs. Google is effectively saying, “Lots of other people seem to think these pages are important, so our best guess is that you, the user, will also find them helpful.”
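To make the “votes” idea concrete, here is a small Python sketch of the simplified PageRank calculation as it was described in the published academic literature. It is not Google’s production system, whose details are secret. In this version a link from a highly ranked page counts for more than a link from an obscure one, so it goes one step beyond a plain count of backlinks. The page names and the link graph are hypothetical, and the damping value of 0.85 is simply the figure conventionally used in textbook examples.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate a score for each page from who links to whom.

    `links` maps each page name to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}  # start with every page equal

    for _ in range(iterations):
        # Every page receives a small baseline share regardless of links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its "vote" evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its vote over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# Hypothetical link graph: three blogs and a small college link to a big university.
LINKS = {
    "big-university": ["small-college"],
    "small-college": ["big-university"],
    "blog-a": ["big-university"],
    "blog-b": ["big-university"],
    "blog-c": ["big-university", "small-college"],
}

ranks = pagerank(LINKS)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{page}: {ranks[page]:.3f}")
```

In this toy graph the big university collects the most incoming links and ends up with the highest score, while the three blogs, which nobody links to, are left with only the baseline share.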
The use of PageRank as a tool for quality evaluation is commonly associated with Google, and the term “PageRank” itself is one of their trademarks, but the basic idea behind it didn’t originate with Google. In the days before the web, similar statistics were compiled by citation indexes such as the Science Citation Index and used to evaluate the importance of academic research publications.
But what does this have to do with ranking colleges and universities? Well, in the popular America’s Best Colleges edition of U.S. News & World Report, probably the most widely known college rankings guide in the United States, the weight given to “peer assessment” is fully 25%. The peer assessment surveys prepared by U.S. News are sent to about 4,000 people, who simply vote their opinions of the quality of various colleges and universities, and this accounts for a quarter of the total rankings value calculated.
There’s nothing wrong with sending out a survey, but “peer assessment” is also what Google’s PageRank algorithm is all about, and it isn’t based on the opinions of a few thousand people, but rather on the actual effort put in by millions of web authors around the world, many of them experts in their respective fields. The results of a Google search are based strongly on peer assessment, just as the results of the U.S. News ranking process are. When the website of a college or university appears high in the Google SERPs, it’s because, in the collective judgment of web authors around the world, it really is important, for a great variety of reasons, and many people have linked to it. So does Google really rank colleges and universities? Well, in a sense, yes, it really does, because the web at large does. And an important element of its ranking process is “peer assessment,” just as it is for U.S. News.
There are additional factors that can influence both the Google rankings of a college and the in-print rankings from U.S. News and other such publications. I’ll explore some of these in a future post.