Big data: all you need to know
Summary: Big data is the big buzzword of 2012. So what's behind the hype?
In a hypercompetitive world where companies struggle with slimmer and slimmer margins, businesses are looking to big data to provide them with an edge to survive. Professional services firm Deloitte has predicted that by the end of this year, over 90 per cent of the Fortune 500 companies will have at least some big-data initiatives on the boil. So what is big data, and why should you care?
What is big data?
As with cloud, what one person means when they talk about big data might not necessarily match up with the next person's understanding.
The easy definition
Just by looking at the term, one might presume that big data simply refers to the handling and analysis of large volumes of data.
According to the McKinsey Global Institute's report "Big data: The next frontier for innovation, competition and productivity", big data refers to datasets whose size is beyond the ability of typical database software tools to capture, store, manage and analyse. And the world's data repositories have certainly been growing.
In IDC's mid-year 2011 Digital Universe Study (sponsored by EMC), it was predicted that 1.8 zettabytes (1.8 trillion gigabytes) of data would be created and replicated in 2011 — a ninefold increase over what was produced in 2006.
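The unit conversion behind that figure can be checked in a few lines of Python. This is only a sketch: it assumes decimal (SI) prefixes, and the constant and function names are made up for illustration. The 2006 volume shown is derived purely from the "ninefold" ratio quoted above, not from IDC's own 2006 estimate.

```python
from fractions import Fraction

# Decimal (SI) prefixes are assumed; binary prefixes would give different numbers.
BYTES_PER_GIGABYTE = 10**9
BYTES_PER_ZETTABYTE = 10**21


def zettabytes_to_gigabytes(zettabytes: Fraction) -> Fraction:
    """Convert zettabytes to gigabytes using exact rational arithmetic."""
    return zettabytes * BYTES_PER_ZETTABYTE / BYTES_PER_GIGABYTE


created_2011_zb = Fraction(18, 10)  # the 1.8 zettabytes quoted from the study
created_2011_gb = zettabytes_to_gigabytes(created_2011_zb)

# A ninefold increase implies about one ninth of the 2011 volume in 2006.
implied_2006_zb = created_2011_zb / 9

print(f"2011: {int(created_2011_gb):,} GB")
print(f"2006 (implied): {float(implied_2006_zb):.1f} ZB")
```

Exact fractions are used instead of floats so that the "1.8 trillion" comparison is not blurred by rounding.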
The more complicated definition
Yet, big data is more than just analysing large amounts of data. Not only are organisations creating a lot of data, but much of this data isn't in a format that sits well in traditional, structured databases — weblogs, videos, text documents, machine-to-machine data or geospatial data, for example.
This data also resides in a number of different silos (sometimes even outside of the organisation), which means that although businesses might have access to an enormous amount of information, they probably don't have the tools to link the data together and draw conclusions from it.
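To make the linking problem concrete, here is a minimal Python sketch that joins records from two hypothetical silos, a web-server log and a customer database, on a shared customer ID. The record layouts, field names and sample values are all invented for illustration. Real silos rarely share such a clean key, which is a large part of the difficulty.

```python
from collections import defaultdict

# Silo 1 (hypothetical): semi-structured weblog lines of the form "customer_id page".
weblog_lines = [
    "c1 /pricing",
    "c2 /support",
    "c1 /checkout",
    "c9 /home",  # no matching customer record in the other silo
]

# Silo 2 (hypothetical): structured customer records keyed by ID.
customers = {
    "c1": {"name": "Acme Pty Ltd", "segment": "enterprise"},
    "c2": {"name": "Bob's Bikes", "segment": "small business"},
}


def pages_by_segment(lines, customer_table):
    """Group visited pages by customer segment, skipping unknown IDs."""
    result = defaultdict(list)
    for line in lines:
        customer_id, page = line.split(maxsplit=1)
        record = customer_table.get(customer_id)
        if record is None:
            continue  # the log mentions a customer the other silo does not know
        result[record["segment"]].append(page)
    return dict(result)


linked = pages_by_segment(weblog_lines, customers)
print(linked)
```

Even in this toy case, one log entry cannot be linked to a customer at all, which hints at why drawing conclusions across silos takes more than simply having access to the data.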
Add to that the fact that data is being updated at shorter and shorter intervals (giving it high velocity), and you've got a situation where traditional data-analysis methods cannot keep up with the large volumes of constantly updated data, paving the way for big-data technologies.
The best definition
In essence, big data is about liberating data that is large in volume, broad in variety and high in velocity from multiple sources in order to create efficiencies, develop new products and be more competitive. Forrester puts it succinctly in saying that big data encompasses "techniques and technologies that make capturing value from data at an extreme scale economical".
Talkback
Logic and statistics
So-called big data tools like Hadoop lack the support for logic provided by modern data management methods like the relational model and are therefore wholly unsuitable for such work.
The big data tools are a re-run of antiquated methods that have already been shown to be flawed in theory and unmanageable in practice.
In short, big data is nothing but new marketing selling obsolete methods.
You don't understand what a relational DBMS is
This is why it doesn't make sense to talk about RDBMSs not being scalable. It is a little bit like saying that long division isn't scalable because your only implementation is paper and pencil.
Sorry for the multiple postings
This worked in the old comment system.
"And that was the start of one hell of a mess, big data, big bad data".
Extensive....
Excellent article
I am sorry for leaving a late comment. This article is excellent. Thank you very much for the research you have done. I was searching for documentation to explain big data to my manager, and your article is the best summary I have found so far on this topic. Thanks again.
BIG Data
Excellent article
Whatever the yes/no camps say, and whatever my personal opinion on the subject, it's really well documented and well explained from beginning to end. A long one, but really useful.
It goes straight to the top of my 'notebook' of articles on the topic.
Thanks.