The peril of panels: Can Web traffic stats be trusted?

Summary: ComScore repeatedly warns in its Securities and Exchange Commission filings that if its statistics can't be trusted, its business model is in jeopardy. I'm calling jeopardy.


ComScore repeatedly warns in its Securities and Exchange Commission filings that if its statistics can't be trusted, its business model is in jeopardy.

I'm calling jeopardy. I'm calling a bit of BS too.

Why? There's no way that ComScore traffic metrics and what I see internally can be off by--oh let's round it off--10 million or so page views for ZDNet blogs.

Even after accounting for differences--ComScore only counts U.S. traffic and uses a panel, not cookies--the gap is too substantial. Now ComScore's traffic tallies could be off for every site in equal measure, but my hunch is that a site that relies on B2B traffic has a harder time. Why? Work panels traditionally are difficult to put together and expensive. We're told by traffic wonks that the differential between server-based data and panels is smaller for consumer sites. The takeaway: If most of your traffic comes from workplaces, you may have a beef with ComScore.

And if you're ZDNet, you may really have issues. ZDNet users are the types that build computers over the weekend and skip antivirus software because they believe you can defend yourself by tweaking Windows settings. They also have a libertarian streak.

Some readers want to build their own computers, but we would hope the majority are IT execs and managers who need to keep up on what is going on and don't have time to join a panel.

Full disclosure: I'm not going to pretend that this little rant didn't start with a TechCrunch post noting CNET is on the skids while TechCrunch/CrunchGear gains ground. That post--along with looking at ComScore's IPO filing a while back--got me poking around on the issue, which is murky and circular to say the least. In fact, the more questions you ask--about assumptions, methodology, what's included and excluded--the murkier it gets. Also note that I'm not speaking for CNET Networks or News.com--this is just how things appear to me.

TechCrunch notes:

In the month of October alone, the New York Times added 4.9 million readers on the Web. That is more than double the total readership of CNet’s News.com of 2 million, which sadly seems to be one of the few media sites declining in visitors (from 2.5 million in August). News.com’s pageviews have also been flat, at 6 million per month since August. For comparison’s sake, comScore shows TechCrunch (including our sister site CrunchGear) at 8 million monthly pageviews worldwide in October (we surpassed News.com in September), and it shows us catching up in online readers with 1.7 million worldwide in October.

 

Then there's a chart citing unique users and a graph of total pages viewed. But there are a few questions: If TechCrunch lumps in CrunchGear, are News.com's blogs lumped in with the News.com news property? It's not a small point, since traffic and uniques have shifted from news to blogs. Are Webware and Crave included in News.com? What's the real comparison? What is the differential between TechCrunch's internal figures and ComScore's?

TechCrunch seems pleased with ComScore's figures, so there doesn't seem to be any discrepancy--or at least not one that's way negative. For News.com and ZDNet traffic, ComScore is off anywhere from 70 percent to 90 percent compared to our internal figures. Uniques are also off by a wide margin.
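To put that in concrete terms, here's the back-of-the-envelope arithmetic--using round, hypothetical numbers for illustration, not actual CNET/ZDNet or ComScore figures:

```python
# Back-of-the-envelope gap calculation. These numbers are hypothetical,
# round figures for illustration -- not actual CNET/ZDNet or ComScore data.
internal_pageviews = 10_000_000   # what a site's own server logs report
panel_pageviews = 2_000_000       # what a panel-based service reports

pct_off = (internal_pageviews - panel_pageviews) / internal_pageviews * 100
print(f"Panel figure comes in {pct_off:.0f}% below internal logs")  # 80% below
```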

I'm not going to get into a pissing match over traffic, but jeesh, there are some serious discrepancies going on here. Internally at CNET, folks ask, "How are we going to respond?" Meanwhile, TechCrunch plays the old tabloid circulation game of "my circ is growing faster than yours." Good for TechCrunch (I'd do the same thing). Bad for CNET. But the real issue here could be the peril of Web ranking panels.

A ComScore spokesman was swamped when I emailed a few questions on Wednesday. He noted that the TechCrunch stats are mostly U.S. only. He also addressed the discrepancies:

Internal figures are generally a very poor gauge of actual unique visitor counts, because they are typically inflated due to several factors. Key factors include cookie deletion, which can overstate internals by as much as 2.5x, and double-counting of visitors who log on from both home & work. ComScore data is not cookie-based, so is not inflated due to this factor, and we account for home/work overlap in our unique visitor counts.

CNET filters for robots and corrects for cookies--although if you visited from two different computers, you would be counted as two different people.
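For what it's worth, the inflation argument is easy to see in a toy example. Here's a minimal sketch (made-up visits, not real log data) of how cookie churn and home/work double counting can make a cookie-based "unique visitor" count overstate actual people--while leaving raw page views untouched:

```python
# Hypothetical visit log: (person, place, cookie_id). One real person who clears
# cookies mid-month, or who browses from both home and work, shows up under
# several cookie IDs -- so a cookie-based "unique visitor" count overstates people.
visits = [
    ("alice", "home", "cookie-1"),
    ("alice", "home", "cookie-2"),   # cleared cookies, picked up a fresh ID
    ("alice", "work", "cookie-3"),   # same person, second machine
    ("bob",   "home", "cookie-4"),
]

cookie_uniques = len({cookie for _, _, cookie in visits})   # what server logs count
actual_people  = len({person for person, _, _ in visits})   # what a panel tries to measure

print(cookie_uniques, actual_people)   # 4 vs. 2 -- a 2x inflation in this toy log
print(len(visits))                     # page views are unaffected: still 4
```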

Other questions for ComScore--notably about how panels are constructed, the methodology and whether all sites have the same margin of error--went unanswered for now.

Apparently, the CNET muckety mucks are having this behind-the-scenes conversation with ComScore about metrics, but here's the problem: ComScore's stats are bunk. And Nielsen Netratings statistics are a little less bunk than ComScore's. But both rely on panels--two seemingly different ones--to come up with rankings. Your traffic statistics depend on how the panel is constructed and its composition. Is the work sample dominated by financial firms? How about technology companies (that would suit ZDNet for sure)? Retailers? Companies that allow monitoring on their networks? It's a big black box to me that results in a margin of error the size of the Grand Canyon. Who are these panel people?
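How much that composition matters becomes obvious once you sketch the projection math. Here's a minimal, hypothetical sketch of how a panel-based service might weight panelists up to the online population--my assumption about the general approach, not ComScore's actual methodology, and all numbers are made up:

```python
# Minimal sketch of panel-to-population projection -- an assumption about the
# general approach, not ComScore's actual methodology. All numbers are made up.
population = {"home": 150_000_000, "work": 60_000_000}   # people online per segment
panel      = {"home": 200_000,     "work": 20_000}       # panelists per segment

# Panelists in each segment who visited the site this month (hypothetical).
site_visitors_in_panel = {"home": 400, "work": 300}

projected_uniques = sum(
    site_visitors_in_panel[seg] * (population[seg] / panel[seg])
    for seg in population
)
print(f"Projected unique visitors: {projected_uniques:,.0f}")  # 1,200,000

# Each at-work panelist stands in for 3,000 people (60M / 20K), so a small or
# unrepresentative work sample swings a B2B-heavy site's numbers dramatically.
```

In a sketch like this, each at-work panelist carries enormous weight, which is exactly why a work-heavy B2B site is at the mercy of how that slice of the panel is recruited.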

Bridging the server vs. panel gap

There are efforts to bridge the gap between server-based and panel-based measurements, but it is early. The Interactive Advertising Bureau plans to address the server vs. panel gap and all of the discrepancies involved. Meanwhile, ComScore and Nielsen Netratings agreed to have their methodologies audited by the Media Ratings Council. Apparently, I'm not the only one wondering about this stuff.

So we have a conundrum where panel-based Web ratings are flawed, but server data also has issues. Each company has its own server logs and filtering metrics that tell a different story. The solution to me: Let's bridge the server and panel gaps--or at least reconcile them--pronto.

The big picture is this: For an industry that rambles on about measuring ROI and traffic, no one on the Web seems to trust the numbers. So why pay for ComScore and Nielsen? Why not just use the ever-flaky Alexa? All are flawed, but I'd rather have free and flawed than a pricey subscription and flawed.

Makes you go hmm. A perusal of the risk factors in ComScore's SEC filings reveals the following:

The market for digital marketing intelligence products is at a relatively early stage of development, and it is uncertain whether these products will achieve high levels of demand and increased market acceptance. Our success will depend to a substantial extent on the willingness of companies to increase their use of such products.

The factors that go into market acceptance include reliability, concerns about security and privacy, and customers that may develop internal digital marketing capabilities. And then there's the question of whether a panel can yield reliable metrics.

We believe that the quality, size and scope of our Internet user panel are critical to our business. There can be no assurance, however, that we will be able to maintain a panel of sufficient size and scope to provide the quality of marketing intelligence that our customers demand from our products. If we fail to maintain a panel of sufficient size and scope, customers might decline to purchase our products or renew their subscriptions, our reputation could be damaged and our business could be materially and adversely affected. We expect that our panel costs may increase and may comprise a greater portion of our cost of revenues in the future. The costs associated with maintaining and improving the quality, size and scope of our panel are dependent on many factors, many of which are beyond our control, including the participation rate of potential panel members, the turnover among existing panel members and requirements for active participation of panel members, such as completing survey questionnaires. Concerns over the potential unauthorized disclosure of personal information or the classification of our software as “spyware” or “adware” may cause existing panel members to uninstall our software or may discourage potential panel members from installing our software. To the extent we experience greater turnover, or churn, in our panel than we have historically experienced, these costs would increase more rapidly.

Perhaps ComScore's SEC statements are just boilerplate fodder, but the discrepancy between what I see internally and published stats is night and day. And that gap is too large not to call BS on some level.

So what do we do about this little traffic discrepancy until audits are conducted? I really don't know. But I do know this: For ComScore to get everything in sync with reality (or at least their customers' versions of it) it would have to do the equivalent of a traffic stat restatement. And that's not going to happen because its business model depends on whether people believe its metrics. For the stock market oriented folks, it's the equivalent of Moody's saying "ok fellas, we realize our ratings were BS. But now we're starting anew. We're for real this time." We all know Moody's was smoking crack--just like Wall Street analysts in 1999 were--and is now just getting around to downgrading a bunch of subprime mortgage slime, but we play along anyway.

In other words, it'll be a cold day in hell before we see traffic ranking restatements. But we all play along--for now.



Talkback

3 comments
  • Surely internal are fairer

    I would not disagree that they are inflated by the factors listed. But surely they are inflated to a similar extent irrespective of the website being measured.

    Surely that makes them a less subjective comparative measure.
    nmh
  • RE: The peril of panels: Can Web traffic stats be trusted?

    I've personally seen the same thing. We use comScore for tracking our web properties as well. Our comScore numbers are 50-70% lower than what we see from our internal server logs. Both unique visitors and page views are lower by an insane margin.

    comScore and other panel based measurement systems (Nielsen netratings) point their finger at cookie deletion and double counting users at work and at home. Fair enough. But neither of those can account for a 70% difference in raw page views. Cookies aren't needed to track page views and it doesn't matter if you double count users. A page view is a page view. According to comScore, we get 1/3rd the page views that we record internally. If page views are under-reported by 3x, what does that say about their unique user report? To me, it says their panel is made up of people who don't browse as much as the rest of the web, which means their panel users are not representative.

    Panel based systems are equivalent to opinion polls. Theoretically, if you have a large enough sample that is randomly sampled, you can get within a fraction of a percent margin of error. However, comScore's panel participants are self-selected, motivated by free anti-virus software. This leads to a hugely biased sampling of web traffic.

    To top things off, it's insanely easy to game comScore. comScore counts the number of times its panel users hit a set of URLs. The URLs can be anything you tell them. Although they try to validate them by asking questions, there's really not much they can do. In our case, there was some confusion over the URLs to track and comScore ended up using the wrong ones. They gave us a call in a panic when our numbers (both page views and uniques) came in higher than yahoo.com. It turned out they were tracking URLs that they shouldn't have been tracking. It made me realize that any site could easily inflate their numbers by telling comScore to track a set of URLs that really don't have anything to do with their content pages. I suspect that a number of companies do just that to get higher rankings and there's no way to know.

    I commend comScore for trying to solve this critical problem of accurate web accounting, but they fall far short.
    nowayjose555
  • Solutions Exist Today - Quantcast

    Great post, and clearly, there are lots of factors that impact individual web site traffic (i.e., unique cookies/machines) and audience (i.e., people and demographics) metrics. Full disclosure, I am CMO of Quantcast (www.quantcast.com), a new audience measurement service. We believe that web publishers should be active participants in how their sites are measured, and that the marketplace should have transparency about how data is being surfaced. Today, we have over 20,000 web publishers (representing millions of web destinations) who have quantified their sites - they've dropped Quantcast tags on their properties. The result - census-level traffic results being reported to the market. Many of the problems impacting panel-based services are overcome (work, university, panel bias, etc.) because every page impression is tracked. More importantly, we are able to measure the entire web ecosystem - all sites, of any size, distributed media, etc.

    We are working to address buy-side (i.e., advertiser) concerns (like non-human traffic, cookie correction, multiple machine use, multiple people per computer) so that audience figures (different from traffic counts!) accurately reflect the number of people consuming content. We use a sophisticated inference engine to project audience demographics and affinities so that publishers and advertisers understand the more important metric of *who* is consuming content - not just the raw traffic numbers.

    Our service is free, and publishers of any size (we have properties with as few as 1000 monthly unique visitors) can obtain full traffic and audience reporting.
    apgerber