Researchers find web tracking up, privacy down

Summary: It's not your imagination. A new report from researchers at UC Berkeley says web trackers have stepped up their efforts to follow you, and they show no signs of slowing down.

TOPICS: Privacy

No, it's not just your imagination. You really are being tracked more online, and there is evidence that this trend will increase.

That's the conclusion of a new study published last month by the Berkeley Center for Law and Technology at the University of California, Berkeley.

The researchers behind the Web Privacy Census built an automated web browser with some extra diagnostic smarts. Then they used it to crawl the top 25,000 sites on the web, as measured by Quantcast. They also ran deeper crawls of the top 1000 sites and of the top 100 sites.

[The] deep crawl ... consisted of visiting the home page of the domain obtained from Quantcast and then traversing up to 6 random links from that page, intended to simulate some level of activity at the website.

The shallow and deep crawls collect the same type of information at each webpage: http cookies, flash cookies, calls to HTML5 local storage, calls to flash that may be used for browser fingerprinting, as well as metadata about the webpage and crawl.
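To make the deep-crawl step concrete, here is a minimal Python sketch of how a crawler might pick up to six random links from a home page. It is an illustration, not the researchers' code: the function name pick_deep_crawl_links, the restriction to links on the same host, and the use of Python's standard-library HTML parser are all assumptions.

```python
import random
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def pick_deep_crawl_links(home_url, html, max_links=6, rng=None):
    """Return up to max_links random links found on the home page.

    Assumption: only links on the same host as the home page are followed.
    """
    rng = rng or random.Random()
    parser = LinkCollector()
    parser.feed(html)
    home_host = urlparse(home_url).hostname
    candidates = set()
    for href in parser.hrefs:
        # Resolve relative links and drop "#fragment" parts.
        absolute, _fragment = urldefrag(urljoin(home_url, href))
        parsed = urlparse(absolute)
        if parsed.scheme in ("http", "https") and parsed.hostname == home_host:
            candidates.add(absolute)
    candidates.discard(home_url)  # do not "follow" a link back to the home page
    ordered = sorted(candidates)  # stable order so a seeded rng is reproducible
    return rng.sample(ordered, min(max_links, len(ordered)))
```

A real crawler would then load each chosen page in an instrumented browser and record the cookies and storage calls it observes; that part is outside this sketch.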

The goal, according to the researchers, is to "formalize the benchmarking process and measure internet tracking consistently over time."

For this pass, the results were alarming.

  • Every one of the top 100 sites uses third-party cookies capable of performing online tracking, with a minimum of 1 and a maximum of 234 third-party cookies per page.
  • Among the top 1000 sites, at least one set 359 third-party cookies per visit. (One shudders to think about the performance of that page.)
  • On average, the top 1000 websites (which represent the vast majority of traffic on the web) set more than 50 third-party cookies each.
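What makes a cookie "third-party" is that the domain setting it differs from the site you are visiting. The Python sketch below shows one rough way to make that call. It is a simplification with hypothetical function names: it compares host names by suffix only, so a cookie set by a sibling subdomain of the same site would be counted as third-party, and a careful implementation would compare registrable domains using a public suffix list.

```python
from urllib.parse import urlparse


def is_third_party(page_url, cookie_domain):
    """True if cookie_domain is neither the page host nor a parent domain of it."""
    host = (urlparse(page_url).hostname or "").lower()
    domain = cookie_domain.lstrip(".").lower()
    # First-party: exact host match, or the page host is a subdomain of the cookie domain.
    return not (host == domain or host.endswith("." + domain))


def count_third_party(page_url, cookie_domains):
    """Count how many of the cookie domains seen on a page are third-party."""
    return sum(is_third_party(page_url, d) for d in cookie_domains)


# Invented sample data for illustration only.
seen = [".example.com", "doubleclick.net", ".ad.doubleclick.net"]
print(count_third_party("http://www.example.com/", seen))
```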

And those numbers might be conservative, in terms of their impact on your privacy. The links followed on each of the top 1000 sites (up to six per site) were chosen at random and might not represent the most popular links on a site. In addition, the researchers say:

[T]he crawler did not access content behind sites that require logins, consequently any content and trackers that existed behind a log in were not recorded. Related to this, the crawler did not login and maintain an identity while traversing sites. For example, the crawler did not log into a Facebook account and then attempt to visit websites in this iteration.

Make no mistake about it. These third-party cookies are used for tracking:

Most cookies— 84% of them—were placed by a third party host.  We detected over 446 third party hosts among the third party cookies.   Google had cookies on 16 of the top sites; the company’s ad tracking network, doubleclick.net, had cookies on 73. Combined, Google has a presence on 78 of the top websites.  Only 22 lacked some type of Google cookie.

[...]

The most frequently appearing cookie keys were: utmb, utma, utmc, utmz, uid. Many of these keys are commonly associated with unique user tracking and Google Analytics. For instance, __utma is used by Google for identifying unique visitors.
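A ranking like this comes from counting cookie names across every crawled page. Here is a hedged Python sketch of that tally. The function name and the sample data are invented, and stripping leading underscores (so that __utma and utma count as one key) is an assumption made here, not something the report states.

```python
from collections import Counter


def most_common_keys(cookie_names, top_n=5):
    """Return the top_n most frequent cookie keys as (key, count) pairs."""
    # Assumption: treat "__utma" and "utma" as the same key.
    normalized = (name.lstrip("_").lower() for name in cookie_names)
    return Counter(normalized).most_common(top_n)


# Invented sample of cookie names collected across several pages.
names = ["__utma", "__utmb", "__utma", "uid", "__utmz", "__utma", "uid"]
print(most_common_keys(names, top_n=2))
```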

Based on previous tallies, those numbers are up dramatically just in the past two years, as this graph shows:

[Graph: Total Cookies]

There's also some evidence to suggest that the advertisers and analytics companies behind this major increase in web tracking are shifting their focus away from cookies, which can be easily detected and blocked, and are using HTML5 local storage instead.

The leading websites have moved aggressively to use the new technology: According to the researchers, 311 of the top 1000 sites were using HTML5 local storage. That's roughly three times the rate among the top 25,000 sites, where fewer than 10% are using HTML5 local storage. This isn't necessarily a privacy risk, but it has tremendous potential for the data-collection industry. At least one tracker is using HTML5 local storage to hold unique identifiers from third party cookies, the researchers reported.
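One way to spot an identifier that has been copied from a cookie into local storage is to look for the same value in both places. The Python sketch below illustrates the idea on data that has already been collected. The function name, the minimum-length threshold, and the sample values are all hypothetical, and the report does not say how the researchers detected this.

```python
def shared_identifiers(cookies, local_storage, min_length=8):
    """Map each cookie name to the local storage keys whose value contains its value.

    cookies and local_storage are plain dicts of name -> string value.
    """
    matches = {}
    for cookie_name, cookie_value in cookies.items():
        if len(cookie_value) < min_length:
            continue  # short values such as "1" would match by coincidence
        keys = [k for k, v in local_storage.items() if cookie_value in v]
        if keys:
            matches[cookie_name] = sorted(keys)
    return matches


# Invented sample data for illustration only.
cookies = {"uid": "a1b2c3d4e5f6", "seen": "1"}
storage = {"tracker_backup": "id=a1b2c3d4e5f6", "theme": "dark1"}
print(shared_identifiers(cookies, storage))
```

A match is only a hint. If the stored copy is used to restore a cookie after the user deletes it, the result is the "respawning" behavior that makes such identifiers hard to clear.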

This report is the first in a promised quarterly census of the web and privacy. Sadly, the smart money is betting that the number of trackers will rise significantly over time.

Talkback

5 comments
  • I have my browser set to not save cookies when I close it.

    What other steps can I take to stop my browsing habits from being tracked? What other files might I have to delete?
    marbo100
    • Get off the Internet for starters

      @marbo100
      "What other steps can I take to stop my browsing habits from being tracked? "

      Delete your hard drive. Toss your box, crush your stinkin' lappie. Get in the car and drive man drive. Make for the highest mountain top you can find. Build a cabin. Pump in some water. Get a shotgun -- and the biggest mofo generator you can find.

      Now you can *R E L A X*

      But late at night, in the pitch of unholy darkness, when you hear those gnawing sounds reappearing in yer brain, when beads of sweat begin to fall from your overworked brow, repeat over and over, s l o w l y: THOSE ARE NOT RESPAWNING COOKIES I'M IMAGINING, THOSE THOSE ARE NOT RESPAWNING COOKIES, THOSE A R e nOt reesPAWn n n n ing cccc c cc ccookies ... na nuh n not no tt t ... *C r A c K*
      klumper
  • Deleting cookies isn't enough

    You have to use a tool to scrub evercookies and HTML5 local storage, like BleachBit (http://bleachbit.sourceforge.net).

    Ed, maybe this would be a good article: free tools and browser add-ons for privacy?
    JohnMorgan3
    • Evercookies

      or Super-Cookie?
      daikon
      • Take your pick

        Here's a recent example of a 'supercookie', courtesy of Microsoft:

        http://www.computerworld.com/s/article/9219312/Microsoft_disables_supercookies_used_on_MSN.com_visitors
        Rabid Howler Monkey