There has been a lot of talk recently about the amount of spam that appears in Google’s search engine rankings. Indeed, the once-vaunted gold standard of Internet search has come under an increasingly critical eye as technology writers re-examine what was once considered a non-issue. Many of these commenters and Google critics blame the so-called content mills and the SEO consultants and webmasters who are “gaming” the Google ranking algorithm. Unofficial Google spokesman Matt Cutts recently commented that members of the web spam team had been moved on to other projects inside Google.
However, none of these issues addresses the real problem with Google’s search engine. The real problem with Google’s search rankings is that many of the core principles upon which Google’s search algorithm was built are no longer true, or were never true in the first place.
Google Ranking Algorithm
The full Google ranking system has never been disclosed by the company. However, early researchers of Google’s search engine used papers published by Google’s founders, as well as patents and other public documents, to piece together the core ranking system the company uses. Subsequent comments and pronouncements by the company, as well as real-world business research and methodology, have painted a fairly strong picture of what goes on at the heart of Google’s ranking algorithm.
According to official and semi-official pronouncements from Google, there are hundreds of “signals” that go into ranking the webpages returned by Google’s search engine in response to a user search query. However, the vast majority of these signals are insignificant compared to a handful of “core” ranking signals.
The crux of Google’s search rankings is the paradigm that each link pointing to a webpage is an endorsement of the quality of that webpage’s content. The idea behind this ranking signal is that the more people who recommend a website by pointing their own readers to it, the better that website must be. However, this paradigm falls apart completely under the most minor scrutiny.
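To make that paradigm concrete, here is a minimal sketch of the link-counting model, loosely following the PageRank formulation described in the founders’ published papers. The damping factor, iteration count, and toy link graph below are illustrative assumptions, not Google’s actual implementation.

```python
# Illustrative only: a toy version of the link-as-endorsement idea,
# loosely following the published PageRank formulation. Google's real
# algorithm layers hundreds of additional signals not modeled here.

def pagerank(links, damping=0.85, iterations=20):
    """links maps each page to the pages it links out to."""
    pages = set(links) | {p for targets in links.values() for p in targets}
    rank = {page: 1.0 / len(pages) for page in pages}

    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / len(pages) for page in pages}
        for page, targets in links.items():
            if not targets:
                continue
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share  # every link counts as a "vote"
        rank = new_rank
    return rank

# A link is a vote regardless of intent: in this model, a scathing review
# that links to a company's site still boosts that site.
toy_web = {
    "angry-review.example": ["shady-store.example"],
    "forum-thread.example": ["shady-store.example"],
    "shady-store.example": [],
}
print(pagerank(toy_web))
```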
For this to be a valid ranking signal, content publishers must link elsewhere only because of perceived value.
This assumption is so naïve as to be laughable. Webmasters and web page creators link to all manner of content around the web for all manner of reasons. Sometimes links point to something funny; other times they point to something negative.
A recent New York Times article pointed out that one website owner ranks highly in Google search results almost solely on the strength of links from customers so enraged by the company’s business practices that they have filled forums, review sites, and more with negative reviews and scathing commentary. Despite the context, each and every one of those links counted as a vote up in Google’s search algorithm.
After this particular high-profile embarrassment, Google tweaked its algorithm, supposedly so such things wouldn’t happen in the future, but more likely with no more effect than lowering the rankings of this particular website.
The “nofollow” attribute was supposed to prevent problems like this from occurring. By adding rel="nofollow" to a standard HTML link, a content creator can curate their links and prevent their approval from passing through a link. However, for nofollow to work, the web content writer must both understand webpage programming (HTML) and understand, and care about, how Google ranking power flows through links. This is obviously not the kind of thing that your average review writer has a grasp of.
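For illustration, here is a hedged sketch of how a link-graph builder might honor the nofollow convention, using the standard rel="nofollow" attribute. The crawler logic and example URLs are assumptions for demonstration, not a description of Google’s actual system.

```python
# Illustrative sketch: how a link-graph builder might honor rel="nofollow".
# The parsing uses Python's standard html.parser; the "endorsement" split
# is a simplification, not Google's actual crawler behavior.
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    def __init__(self):
        super().__init__()
        self.endorsed = []   # links that would pass ranking credit
        self.ignored = []    # links marked nofollow

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if not href:
            return
        if "nofollow" in (attrs.get("rel") or "").lower():
            self.ignored.append(href)
        else:
            self.endorsed.append(href)

# A reviewer writing plain HTML (or using a tool that adds none) rarely
# includes rel="nofollow", so the scathing review below still counts.
page = '''
  <a href="https://shady-store.example">Worst company I have ever dealt with</a>
  <a rel="nofollow" href="https://shady-store.example">curated link</a>
'''
parser = LinkExtractor()
parser.feed(page)
print(parser.endorsed)  # ['https://shady-store.example']
print(parser.ignored)   # ['https://shady-store.example']
```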
Which brings us to the first reason Google’s search algorithm is failing.
Web publishing is no longer the province of “experts.”
When Google was first created, there was a small, if significant, barrier to entry to publishing web content. While tools such as Geocities and even AOL allowed users to create webpages, one had to first know such a thing existed, and then decide that it was worth whatever effort was necessary to publish online.
Doing so required, at a minimum, sitting down at a computer with an Internet connection (not every computer had one then), typing out the webpage, and then publishing it somewhere. Under these circumstances, it was a logical assumption that anything someone bothered to write about and then link to was something that person felt was somehow noteworthy. Any “bad links” would be buried under the better links that more people wrote about.
These days, however, publishing online is not only easy, but widely known about and used by virtually everyone. Numerous services, from Twitter to Facebook to Tumblr and more, allow almost anyone to post almost anything with virtually no learning curve. A computer isn’t even required anymore. Most smartphones provide a half-dozen ways to publish content online right out of the box.
Today, a link might be an endorsement or recommendation, but it could just as easily be a gag, a way to help a friend find a homework assignment, or a legal disclaimer automatically inserted without the user’s knowledge into every post or email sent.
Unfortunately, it seems as though Google’s search engineers don’t get out of the Googleplex enough to realize that the world around them has changed fundamentally. The search ranking continues to rely on the fantasy that the sheer numbers of the Internet will swamp all of those non-worthy links.
It doesn’t take much analysis to conclude that non-endorsement links are just as common as those fully intended to pass on their so-called link juice, and that is BEFORE one takes into account the number of people deliberately creating links for no purpose OTHER than to add a little more ranking power to a website.