The incredible shrinking GOOG! 25% smaller...

Summary: Something strange is going on: Google seems to be shedding huge amounts of its index. By my simple measure, the size of its index has shrunk by nearly 25 per cent over a period of just over 5 months.

TOPICS: Google, Browser
Something strange is going on: Google seems to be shedding huge amounts of its index. By my simple measure, the size of its index has shrunk by nearly 25 per cent over a period of just over 5 months. At this rate it will be half the size by the end of this year!

Occasionally, I will Google myself just to see what content others are linking to and what they are saying. But lately I've noticed that my citations in Google are shrinking. I'm producing ever more content, but Google is finding fewer references.

Let me give you a point of comparison. On March 3 2009 I wrote: Internet Myth: Watch What You Post Because Search Will Reveal Everything Forever.

At that time I Googled "Tom Foremski" and it came up with a search result of 135,000 pages in Google's index.

Today, I did the same search, and came up with 102,000 pages in Google's index.
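
For readers who want to check the arithmetic, here is a minimal sketch of the comparison. The two result counts and the March 3 2009 date come from the searches above; the date of the second search (August 14 2009) is an assumption, and the function names are hypothetical. Reported result counts are estimates, so this only illustrates the calculation; it does not measure Google's index.

```python
from datetime import date


def fractional_drop(before: int, after: int) -> float:
    """Fraction by which the reported count fell between two searches."""
    return (before - after) / before


def linear_projection(before: int, after: int, elapsed_days: int, horizon_days: int) -> float:
    """Extrapolate assuming the same absolute number of results is lost per day."""
    lost_per_day = (before - after) / elapsed_days
    return before - lost_per_day * horizon_days


def compound_projection(before: int, after: int, elapsed_days: int, horizon_days: int) -> float:
    """Extrapolate assuming the same percentage is lost per unit of time."""
    return before * (after / before) ** (horizon_days / elapsed_days)


first_search = date(2009, 3, 3)    # from the article
second_search = date(2009, 8, 14)  # ASSUMPTION: the article does not give this date
year_end = date(2009, 12, 31)

before, after = 135_000, 102_000   # reported result counts from the article

elapsed = (second_search - first_search).days
horizon = (year_end - first_search).days

drop = fractional_drop(before, after)
linear = linear_projection(before, after, elapsed, horizon)
compound = compound_projection(before, after, elapsed, horizon)

print(f"Drop between searches: {drop:.1%}")
print(f"Year-end, linear:   {linear:,.0f} ({linear / before:.0%} of the March count)")
print(f"Year-end, compound: {compound:,.0f} ({compound / before:.0%} of the March count)")
```

Under these assumptions the drop works out to a little over 24 per cent, and both extrapolations land at roughly 55 to 60 per cent of the March count by year end, so "half the size" is best read as a round figure.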

I tried Bing and got 157,000 results.

What's going on? Has Google run out of room? Is it deleting older content from its index? Or is it just me? I've no other point of comparison. Let me know if you've noticed something similar.

Talkback

42 comments
  • odd

    maybe that's how they're making google faster -
    just by indexing less?
    coffeeshark
  • Analysis is way too simple

    Simply looking at the number of returned results is a useless criterion for determining search quality. I Googled my name and got 489,000 results - none of the first 30 or so referred to me. I used Bing and got 269,000 results - again, none of the first 30 or so were me. I'd prefer to get just 1 hit if it was a link to something about me.

    The difference in the number of results may well be that Google have refined their search algorithm so that the results are more accurate (or not). Before commenting on the quality of a search engine, you have to establish some criteria for determining search quality, then measure various results based on those criteria.

    At that point, you might have a meaningful analysis. A simple count of the results from a single query is meaningless.
    Fred Fredrickson
    • Or it could be

      and no offense to the author, that maybe the subject matter is not being reported on to the extent it was earlier.

      If people pull pages from their sites due to lack of interest, then there will be less to index the next time a search is done on that subject.
      GuidingLight
    • Too simple indeed

      In addition, I've noticed on more arcane searches where it's possible to view all search results, the number of reported returns is far more than the number actually returned. Maybe Google has gotten better at counting the results they return. (To be fair, Google reports that they excluded some items from the results because they were duplicates. It still makes it hard to rely on the number of returned results as any kind of meaningful measure of the index size.)
      cburkitt2
    • How does the algorithm know it's you and not another Fred Fredrickson?

      Unless you have a VERY unusual name or unless you're extremely famous, just typing in your name is unlikely to return anything about you.

      How do you expect Google to know that when you just search on Fred Fredrickson that you want to know about you and not another Fred Fredrickson?
      de-void-21165590650301806002836337787023
      • Right. Pagerank vs expectations vs quantity

        I agree with what seems to be the consensus here, quantity isn't
        the same as quality and the search engine can't read your mind.

        It can study your behavior though and display results better in
        accordance to your expectations depending on what you click on.
        Mikael_z
  • RE: The incredible shrinking GOOG! 25% smaller...

    I just did the same search on Google and got the same
    number. However, I also did the search with Google's new
    "Caffeine" search (http://www2.sandbox.google.com/) and
    only got 99,000 results. I had been noticing more
    results in Caffeine searches, but not when searching for
    "Tom Foremski".
    Matthew Sommer
  • Quite possibly...

    it is shrinking...and being 'moved sideways' into Google Sandbox.

    C'est possible? Oui?
    Dietrich®
  • there seem to be lots of duplicates in most searches

    maybe they're just removing the duplicates?
    (or if destination pages are duplicates, only show the more popular).
    stevey_d
    • Duplicates

      Then they're not doing too good of a job yet! Yesterday (8/13), I was searching for an artist's discography and found the exact same page 6 times in the first 10 pages. Different wording, same site.
      sirpaul1
  • RE: The incredible shrinking GOOG! 25% smaller...

    Yes, Bing is finally becoming much more relevant than Google. A real paradigm shift is happening. Happens all the time. Does anyone remember Lotus 1-2-3 or Netscape Browser?
    georgeh36
    • Dreaming...

      Okay, time to wake up. You're drooling on your desk.
      Metronome49
    • BWHAW HAW HAW HAW HAW... :D

      LOL... :D

      Yeah...right...everybody's 'running' to Bing Blow...

      more LOL... :D
      Wintel BSOD
  • Not smaller index... more accurate results.

    They didn't shrink their index... they just refined their algorithm to bring more relevant data.

    Did you look at the last 33,000 results from the query that brought 135,000? I'm sure they were not relevant and could do to be scrapped. More results does not = bigger or better... all that really matters is the first 5 or 6 pages anyway. No one is going to go to page 8,000 to find the information they are looking for... and to give results that have that many pages is a waste of time.

    That's likely why caffeine has even fewer, it's a refinement, which they CONSTANTLY do, not a cut back.
    Metronome49
    • Can't get past 552 results...

      I tried to view the last results back in March. Google had 135,000 pages
      indexed, but I couldn't see any past the first 55 pages (552 results.) So
      there is no way I could examine the last few thousand.
      foremski
      • I've noticed the same thing

        On several occasions I have noticed that the number of pages indexed far exceeds the number of pages you can actually click on.

        lehnerus2000
  • RE: The incredible shrinking GOOG! 25% smaller...

    I run an online forum for a non-profit historical society. When Google first crawls our site we'll have 10,000 items listed. Two weeks later it's down to 400. Where do the other 9,000 pages go? None of our links rank in the first 100 pages, and most of the links provided on the first few pages are nothing but ads. With Bing we are the number one listing. We get three times more visitors from Bing.
    With the changes going on at Google we've mysteriously started seeing our pages rank high enough to be placed on the first page.
    No complaints here!
    ICUR12
  • RE: The incredible shrinking GOOG! 25% smaller...

    Google is getting more accurate results. It looks like you are getting something out of writing this article. Maybe you have some hidden interests? M$?
    raul.grp@...
    • I have no hidden interests...

      My content is already highly ranked. If I write about someone my posts
      will usually appear in the first five results for that person. I'm just
      interested in this as a possible indicator of something broader that might
      be affecting many other people and companies and to try to figure out
      what this all means.
      foremski
  • RE: The incredible shrinking GOOG! 25% smaller...

    I concur because I am finding that the main host URL appears against searches but the sub-pages do not. This is frustrating because many web hosts now charge for 'anything.domain' as separate accounts, which means that the only tenable option is to shortcut your different pages to 'domain/anything'. My personal finding is that you stand more chance of listings as anything.domain, and if you use domain/anything you don't get the listings - simple as that. The allowable free listing scope has also been reduced, it would seem.
    GeoffDuke