Big Data: More than just analytics

Some say that Big Data is just Analytics repackaged, but there is a difference. The distinction matters because data-driven companies have been found to outperform those that are not.

Analytics provides an approach to decision making through the application of statistics, programming and research to discern patterns and quantify performance. The goal is to make decisions based on data rather than intuition. Simply put, evidence-based or data-driven decisions are better decisions.

Analytics replaced the HiPPO effect ("highest paid person's opinion") as a basis for making critical decisions.

So, what is the difference between Big Data and what we have traditionally called "analytics"? According to Andrew McAfee and Erik Brynjolfsson, the difference lies in the massive volumes of data that we now have access to, the speed at which data are accumulating, and the variety of different data points.

According to the authors, “Each of us is now a walking data generator. The data available is often unstructured – not organized in a database – and unwieldy, but there is a huge amount of signal in the noise simply waiting to be released”.

Volume of data: As of this year, about 2.5 exabytes of data are being created each day, and that figure doubles roughly every 40 months.

An exabyte is 1,000 petabytes, or a billion gigabytes. According to the authors, more data crosses the Internet today than was stored in the entire Internet twenty years ago. So the amount of data available is staggering.
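To make the arithmetic concrete, here is a minimal sketch of the unit conversion and the doubling claim. It assumes the 40-month doubling rate stays constant, which is a simplification rather than something the authors promise, and the function name projected_volume is made up for this illustration.

```python
# Decimal (SI) units, as used above: 1 exabyte = 1,000 petabytes = 1e9 gigabytes.
PB_PER_EB = 1_000
GB_PER_EB = 1_000_000_000


def projected_volume(start_eb: float, months: float, doubling_months: float = 40.0) -> float:
    """Return the volume in exabytes after `months`.

    Assumption: the volume keeps doubling every `doubling_months`,
    i.e. volume = start * 2 ** (months / doubling_months).
    """
    return start_eb * 2 ** (months / doubling_months)


if __name__ == "__main__":
    start = 2.5  # the 2.5-exabyte figure quoted above
    for months in (0, 40, 80, 120):
        eb = projected_volume(start, months)
        print(f"{months:>3} months: {eb:5.1f} EB = {eb * GB_PER_EB:,.0f} GB")
```

Under that assumption, 2.5 exabytes becomes 5 after 40 months, 10 after 80 months and 20 after ten years, which is an eightfold increase in a decade.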

Managing that amount of data has spawned new tools such as Hadoop and NoSQL databases such as MongoDB.

Velocity of data: The speed at which data is created is sometimes more significant than the amount of data. The ability to react to large amounts of data in real time, or near real time, equates to agility today. The example the authors cite is the MIT Media Lab using location data from mobile phones to gauge the volume of shoppers at a Macy's parking lot on Black Friday. The goal was to estimate the retailer's sales before Macy's itself had recorded them. Analysts would kill for this sort of predictive edge.

Variety of data: The advent of social media changed the data landscape significantly. Today we have many relatively new sources of data. When we think of traditional data points, the kind found in relational databases, we don't tend to think of photos, tweets, status updates or GPS coordinates.

So, the challenge for the new tools is to help us unlock this 'signal'. While analytics gives us techniques for making better decisions, big data offers something more powerful because it expands the traditional data set those techniques can draw on.