Computing devices and networks have been storing and processing data in increasingly large amounts for decades, but the rate of expansion of the 'digital universe' has accelerated massively in recent years, and now exhibits exponential growth.
Computing's 'Big Bang' moment came during World War 2, in the shape of the world's first programmable digital computer, Colossus. Built at the UK's Bletchley Park codebreaking centre to help break the German High Command's Lorenz cipher, Colossus could store 20,000 5-bit characters (~12.5KB) and input data at 5,000 characters per second via paper tape (~25Kbps). Small data in today's terms perhaps, but Colossus decrypts made a vital contribution to Allied planning, for D-Day in particular.
The Digital Universe
In December 2012, IDC and EMC estimated the size of the digital universe (that is, all the digital data created, replicated and consumed in that year) to be 2,837 exabytes (EB) and forecast this to grow to 40,000EB by 2020 — a doubling time of roughly two years. One exabyte equals a thousand petabytes (PB), or a million terabytes (TB), or a billion gigabytes (GB). So by 2020, according to IDC and EMC, the digital universe will amount to over 5,200GB per person on the planet.
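The doubling time and per-capita figure quoted above can be checked with a few lines of arithmetic. This is a minimal sketch of that calculation; the 2020 world population of roughly 7.6 billion is our assumption for the per-capita check, not a figure stated by IDC and EMC:

```python
from math import log2

# Figures from the IDC/EMC estimates cited above
size_2012_eb = 2_837      # exabytes in 2012
size_2020_eb = 40_000     # forecast exabytes in 2020
years = 2020 - 2012

# Doubling time T: solve 2^(years / T) = size_2020 / size_2012 for T
doubling_time = years / log2(size_2020_eb / size_2012_eb)
print(f"{doubling_time:.1f} years")  # → 2.1 years

# Per-capita figure; 1 EB = a billion GB. The ~7.6 billion population
# estimate for 2020 is an assumption for this sketch.
gb_per_person = size_2020_eb * 1e9 / 7.6e9
print(f"{gb_per_person:,.0f} GB per person")  # → 5,263 GB per person
```

The result is consistent with the article's "over 5,200GB per person" and a doubling time of roughly two years.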
In 2012 the US and Western Europe still accounted for over half (51 percent) of the digital universe (see diagram above right), but IDC and EMC estimated that by 2020 some 62 percent would be attributable to emerging markets, with China alone accounting for 21 percent.
Not all of the myriad streams of data generated by and about people (and, increasingly, things) in this digital universe will be even potentially useful. According to IDC and EMC, some 33 percent of 2020's 40,000EB total (13,200EB) might be valuable if analysed. In 2012 the figure was 23 percent of the 2,837EB total (652EB), with only 3 percent (85EB) suitably tagged and just half a percent actually analysed. That half a percent still amounts to 14.185EB (14,185 petabytes, or 14.185 million terabytes): 'big data' in anyone's book, but a mere footprint on a vast and largely unexplored cosmos of information.
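The 2012 slices quoted above follow directly from the percentages IDC and EMC give; a quick sketch of the arithmetic:

```python
total_2012_eb = 2_837  # IDC/EMC estimate of the 2012 digital universe, in EB

potentially_valuable = 0.23  * total_2012_eb  # ≈ 652 EB worth analysing
suitably_tagged      = 0.03  * total_2012_eb  # ≈ 85 EB tagged for analysis
actually_analysed    = 0.005 * total_2012_eb  # ≈ 14.185 EB actually analysed
```

Only around one EB in two hundred, in other words, was actually being analysed in 2012.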
While we're still examining the big picture, it's worth looking at how Big Data has progressed along Gartner's Hype Cycle in recent years:
In 2011, the analyst firm placed Big Data (along with 'Extreme Information Processing and Management') in the Technology Trigger phase (since renamed Innovation Trigger), with mainstream adoption envisaged in 2-5 years. Last year saw it approaching the Peak of Inflated Expectations, which it has all but scaled in 2013. Gartner also revised its outlook for Big Data in 2013, placing mainstream adoption 5-10 years in the future, with the Trough of Disillusionment opening up before it.