Big data: An overview

Summary: Data is being generated about the activities of people and inanimate objects on a massive and increasing scale. We examine how much data is involved, how much might be useful, what tools and techniques are available to analyse it, and whether businesses are actually getting to grips with big data.


Computing devices and networks have been storing and processing data in increasingly large amounts for decades, but the rate of expansion of the 'digital universe' has accelerated massively in recent years, and now exhibits exponential growth.

Big Bang

Colossus Mk 2

Computing's 'Big Bang' moment came during World War 2, in the shape of the world's first programmable electronic digital computer, Colossus. Built at the UK's Bletchley Park codebreaking centre to help break the German High Command's Lorenz cipher, Colossus could store 20,000 5-bit characters (~12.5KB) and input data at 5,000 characters per second via paper tape (~25Kbps). Small data in today's terms perhaps, but Colossus decrypts made a vital contribution to Allied planning, in particular for D-Day.

The Digital Universe

In December 2012, IDC and EMC estimated the size of the digital universe (that is, all the digital data created, replicated and consumed in that year) to be 2,837 exabytes (EB) and forecast this to grow to 40,000EB by 2020 — a doubling time of roughly two years. One exabyte equals a thousand petabytes (PB), or a million terabytes (TB), or a billion gigabytes (GB). So by 2020, according to IDC and EMC, the digital universe will amount to over 5,200GB per person on the planet.
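The doubling time and per-person figure can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are made up for this example, and the 2020 world population of roughly 7.6 billion is an assumption that does not come from the IDC and EMC report.

```python
import math

EB_2012 = 2_837      # IDC/EMC estimate for 2012, in exabytes
EB_2020 = 40_000     # IDC/EMC forecast for 2020, in exabytes
GB_PER_EB = 10**9    # one exabyte is a billion gigabytes (decimal units)

# Assumption, not from the article: about 7.6 billion people in 2020.
POPULATION_2020 = 7.6e9


def doubling_time_years(start, end, years):
    """Years per doubling, assuming steady exponential growth."""
    return years * math.log(2) / math.log(end / start)


def gb_per_person(exabytes, population):
    """Convert a total in exabytes to gigabytes per head."""
    return exabytes * GB_PER_EB / population


if __name__ == "__main__":
    dt = doubling_time_years(EB_2012, EB_2020, 2020 - 2012)
    print(f"Doubling time: {dt:.2f} years")
    print(f"Per person in 2020: {gb_per_person(EB_2020, POPULATION_2020):,.0f} GB")
```

Under these assumptions the doubling time comes out slightly above two years, and the per-person figure lands a little over 5,200GB, in line with the numbers quoted above.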

Source: The Digital Universe in 2020: Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East (IDC & EMC, December 2012)

In 2012 the US and Western Europe still accounted for over half (51 percent) of the digital universe (see diagram above right), but by 2020 IDC and EMC estimate that 62 percent will be attributable to emerging markets, with China alone accounting for 21 percent.

Big data

Not all of the myriad streams of data generated by and about people (and, increasingly, things) in this digital universe will be actually or even potentially useful. According to IDC and EMC, some 33 percent of 2020's 40,000EB (13,200EB) total might be valuable if analysed. In 2012, the figure is 23 percent of the 2,837EB total (652EB) — with only 3 percent (85EB) suitably tagged and just half a percent actually analysed. That still amounts to 14.185EB (14,185 petabytes, or 14.185 million terabytes) — 'big data' in anyone's book, but a mere footprint on a vast and largely unexplored cosmos of information.
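The shares quoted above follow directly from the two totals. As a rough check, the sketch below recomputes them; the helper name `share_eb` is hypothetical, and each percentage is taken against the year's total, as the article does.

```python
TOTAL_2012_EB = 2_837    # size of the digital universe in 2012 (IDC/EMC)
TOTAL_2020_EB = 40_000   # forecast size in 2020 (IDC/EMC)
PB_PER_EB = 1_000        # one exabyte is a thousand petabytes


def share_eb(total_eb, percent):
    """Return the given percentage of a total, in exabytes."""
    return total_eb * percent / 100


useful_2020 = share_eb(TOTAL_2020_EB, 33)     # potentially valuable in 2020
useful_2012 = share_eb(TOTAL_2012_EB, 23)     # potentially valuable in 2012
tagged_2012 = share_eb(TOTAL_2012_EB, 3)      # suitably tagged in 2012
analysed_2012 = share_eb(TOTAL_2012_EB, 0.5)  # actually analysed in 2012

if __name__ == "__main__":
    print(f"Useful in 2020:   {useful_2020:,.0f} EB")
    print(f"Useful in 2012:   {useful_2012:,.1f} EB")
    print(f"Tagged in 2012:   {tagged_2012:,.1f} EB")
    print(f"Analysed in 2012: {analysed_2012:.3f} EB "
          f"({analysed_2012 * PB_PER_EB:,.0f} PB)")
```

The quoted 652EB and 85EB are these results rounded to whole exabytes.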

Big picture

While we're still examining the big picture, it's worth looking at how Big Data has progressed along Gartner's Hype Cycle in recent years:

Gartner's Hype Cycles for Emerging Technologies, 2011-2013.

In 2011, the analyst firm placed Big Data (along with 'Extreme Information Processing and Management') in the Technology Trigger phase (since renamed Innovation Trigger), with mainstream adoption envisaged in 2-5 years. In 2012 it was approaching the Peak of Inflated Expectations, which it has all but scaled in 2013. Gartner also revised its outlook for Big Data in 2013, placing mainstream adoption 5-10 years in the future, with the Trough of Disillusionment opening up before it.




Charles has been in tech publishing since the late 1980s, starting with Reed's Practical Computing, then moving to Ziff-Davis to help launch the UK version of PC Magazine in 1992. ZDNet came looking for a Reviews Editor in 2000, and he's been here ever since.



  • Riding high in the Hype

    I certainly have no problem agreeing that Big Data is coming close to the Peak of Inflated Expectations. That is why businesses need to accurately plan and research before they leap into a half-baked project.

    I wanted to share a video that I think can be helpful for your readers; it deals with planning and executing a Big Data program. The video is based on TEKsystems research and delivers its message in a cute way through multiple sci-fi references. It gives more realistic expectations of how to begin to approach a Big Data initiative, backed up by research from leaders in the industry.
  • Was the America's Cup the first use case of big data in sport?

    This is an interesting, well-researched article with good data points. On the subject of how big data will be used and what value it will add, there is an interesting theory that Oracle just won the America's Cup by using big data. Below is a blog post on this topic (which I contributed to).
  • What the Gartner hype cycle misses out

    Of course certain hyped things prove to be nothing but hype and after the trough of disillusionment disappear without trace.

    Big data is probably one of those.

    Of course no one likes to talk about the computer industry getting something absolutely and fundamentally wrong.
  • Big Data - Businesses are not ready yet

    Considering that most businesses are not even ready for traditional-style BI, I think Big Data and the open source technologies orbiting it have a long way to go. However, the path is set, and sooner or later Big Data will become the industry norm.

    Our company helps small and medium-sized organizations to take their first steps into BI with confidence. We provide consultancy in proprietary and open source technologies.