Digg 3.0: Who needs The New York Times?

Will Digg become the Web 2.0 "newspaper of record"?

Michael Arrington, the Web 2.0 champion behind TechCrunch, has lately been championing Digg, itself a Web 2.0 site.

Last week, TechCrunch posted “exclusive screenshots and stats” about Digg, prior to the site's official launch of its enhanced version, and Arrington said Digg is “looking more and more like the newspaper of the web, and is challenging even the New York Times on page views.”

Today, at Gnomedex, while suggesting that Web 2.0 is not just an echo chamber, he gave the example of Digg, which he said is “getting to the size of the New York Times.”

What is the basis for Arrington’s assertion that a 15-person start-up that asks for “thumbs up” or “thumbs down” on headlines from stories created by other sites, sites such as nytimes.com, is on track to displace the 1,200-person-strong newsroom of the 155-year-old New York Times “newspaper of record,” with its $200 million worldwide news gathering budget?

The TechCrunch June 22 post, “Digg 3.0 to Launch Monday: Exclusive Screenshots and Stats,” links to a site called “alexaholic” (sponsored by a firm that aims to “Blow the top off your graph!”), showing “website traffic graphs comparing digg.com, nyt.com and slashdot.com”. While the Web site's name implies a connection to the Amazon property Alexa.com, a barely legible disclaimer in small grey-on-white type at the very bottom of the page states “alexaholic is neither affiliated with nor possible without alexa.com.”

Alexa itself states the following “Important Disclaimers” about the “Alexa Traffic Rankings”:

The traffic data are based on the set of toolbars that use Alexa data, which may not be a representative sample of the global Internet population. Known biases include (but are likely not limited to) the following:

• Our users are disproportionately likely to visit sites that are featured on alexa.com such as amazon.com and archive.org, and traffic to these sites may be overcounted.

• The extent to which our sample may overcount or undercount users of the various browsers is unknown. Alexa's sample includes users of Internet Explorer, Firefox and Mozilla browsers. The AOL/Netscape and Opera browsers are not supported, which means that sites operated by these companies may be undercounted.

• The extent to which our sample may overcount or undercount users of various operating systems is unknown. Alexa's sample includes toolbars built for Windows, Macintosh and Linux.

• The rate of adoption of Alexa software in different parts of the world may vary widely due to advertising locality, language, and other geographic and cultural factors. For example, to some extent the prominence of Chinese sites among our top-ranked sites reflects known high rates of general Internet usage in China, but there may also be a disproportionate number of Chinese Alexa users.

• In some cases traffic data may also be adversely affected by our "site" definitions. With tens of millions of hosts on the Internet, our automated procedures for determining which hosts are serving the "same" content may be incorrect and/or out-of-date. Similarly, the determinations of domains and home pages may not always be accurate. When these determinations change (as they do periodically), there may be sudden artificial changes in the Alexa traffic rankings for some sites as a consequence.

• The Alexa Toolbar turns itself off on secure pages (https:). Sites with secure page views will be under-represented in the Alexa traffic data.

In addition to the biases above, the Alexa user base is only a sample of the Internet population, and sites with relatively low traffic will not be accurately ranked by Alexa due to the statistical limitations of the sample. Alexa's data come from a large sample of several million Alexa Toolbar users; however, this is not large enough to accurately determine the rankings of sites with fewer than roughly 1,000 total monthly visitors. Generally, Traffic Rankings of 100,000+ should be regarded as not reliable because the amount of data we receive is not statistically significant. Conversely, the more traffic a site receives (the closer it gets to the number 1 position), the more reliable its Traffic Ranking becomes.
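
To put that last disclaimer in rough numbers, here is a minimal back-of-the-envelope sketch in Python. It is not Alexa's method. It assumes a toolbar panel of 3 million users (one reading of “several million”), a hypothetical global Internet population of 1 billion, and a perfectly random panel, which Alexa itself says it may not be. The function name and all figures are illustrative only.

```python
import math

# Illustrative assumptions, not figures published by Alexa.
PANEL_SIZE = 3_000_000        # one reading of "several million" toolbar users
POPULATION = 1_000_000_000    # hypothetical round number for global Internet users


def relative_standard_error(site_visitors, population=POPULATION, panel_size=PANEL_SIZE):
    """Relative sampling error of a visitor estimate from a random panel.

    Treats the number of panel members who visit the site as binomial:
    each panel member visits with probability p = site_visitors / population.
    """
    p = site_visitors / population
    expected_panel_visitors = panel_size * p
    return math.sqrt((1 - p) / expected_panel_visitors)


if __name__ == "__main__":
    for visitors in (1_000, 100_000, 10_000_000):
        expected = PANEL_SIZE * visitors / POPULATION
        error = relative_standard_error(visitors)
        print(f"{visitors:>10,} monthly visitors: "
              f"about {expected:,.0f} in the panel, "
              f"relative error {error:.1%}")
```

Under these assumptions, a site with 1,000 monthly visitors would be seen by only about three panel members, so its estimate could easily be off by more than half. A site with 10 million visitors would be seen by tens of thousands, and sampling noise shrinks to a fraction of a percent. Sampling noise, however, is separate from the biases Alexa lists above, which do not shrink as traffic grows.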