Andrew Brust
Revolution
Evolution
Dan Kusnetzky
Best Argument: Revolution
The moderator has delivered his final verdict.
Opening Statements
Don’t be afraid
Andrew Brust: Big Data is unmistakably revolutionary. For the first time in the technology world, we’re thinking about how to collect more data and analyze it, instead of how to reduce data and archive what’s left. We’re no longer intimidated by data volumes; now we seek out extra data to help us gain even further insight into our businesses, our governments, and our society.
The advent of distributed processing over clusters of commodity servers and disks is a big part of what’s driving this, but so too is the low and falling price of storage. While the technology, and indeed the need, to collect, process and analyze Big Data, has been with us for quite some time, doing so hasn’t been efficient or economical until recently. And therein lies the revolution: everything we always wanted to know about our data but were afraid to ask. Now we don’t have to be afraid.
Not really new
Dan Kusnetzky: Big data isn't really new. What we now know as Big Data comes out of ancient and honorable analysis of log data, from a long line of analytical tools that deal with rapidly moving, large amounts of data. Analyzing log data coming out of operating systems, application frameworks, database engines, networking giblets and storage systems has been around for decades as a “big data” task. Just ask vendors such as Splunk, Loggly, or RainStor.
Talkback
Search is not Big Data analytics
Big Data is a combination of technologies, like new computing paradigms (e.g. Red Lambda's collaborative grids, or Hadoop), simpler throughput-oriented storage (e.g. lock-free NoSQL, graph databases, etc.) and, most importantly, incremental data mining. Big Data is really best defined as the situation in which data is either too large, or too 'continuous', to ever make an analytical pass over the entire dataset. You can't just fire up k-means and cluster your data; by the time you are done, the results are at worst irrelevant, or at best forensic.
The crown jewel of Big Data is incremental knowledge discovery. This is the act of applying data mining techniques to *any* events as they arrive, without referencing any other data that has been received. The trick is how to cluster, classify and perform anomaly detection on such data for the life of the system's operation. No batch processing method (like Hadoop) can solve this problem. *Any* events has to mean *any*. Binary files, imagery, audio, and UTF-8 logs are all forms of data. Being able to perform basic searches on one of them hardly qualifies for this category.
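To make the contrast with a batch pass concrete, here is a minimal sketch of incremental clustering using sequential (MacQueen-style) k-means. It is an illustration only, not the commenter's or any vendor's method: the class name `OnlineKMeans`, the seeding rule (the first k events become the initial centroids), and the numeric feature vectors are all assumptions made for the example. Each event updates the model once and is then discarded, so no earlier data is ever referenced.

```python
class OnlineKMeans:
    """Sequential k-means: one update per event, no stored history.

    Hypothetical illustration; assumes each event has already been
    turned into a fixed-length numeric feature vector.
    """

    def __init__(self, k):
        self.k = k
        self.centroids = []  # current cluster centers
        self.counts = []     # number of events absorbed by each center

    @staticmethod
    def _sq_dist(a, b):
        # Squared Euclidean distance between two vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b))

    def update(self, point):
        """Absorb one event and return the index of its cluster."""
        point = [float(x) for x in point]

        # Assumed seeding rule: the first k events become the centroids.
        if len(self.centroids) < self.k:
            self.centroids.append(point)
            self.counts.append(1)
            return len(self.centroids) - 1

        # Find the nearest centroid.
        idx = min(range(self.k),
                  key=lambda i: self._sq_dist(self.centroids[i], point))

        # Move that centroid toward the point by 1/n, which keeps it
        # equal to the running mean of every event assigned to it.
        self.counts[idx] += 1
        rate = 1.0 / self.counts[idx]
        self.centroids[idx] = [c + rate * (x - c)
                               for c, x in zip(self.centroids[idx], point)]
        return idx


if __name__ == "__main__":
    model = OnlineKMeans(k=2)
    stream = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
    for event in stream:
        print(event, "-> cluster", model.update(event))
    print(model.centroids)
```

The model holds only k centroids and k counters, so memory stays constant however long the stream runs. The trade-off is that results depend on arrival order and on the seeding rule, which is part of why incremental mining is harder than a batch pass.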
concept of big data and analytics isn't new
I've always thought that the term "big data" is not indicative of exactly
Big Data hardly matters
- Use it to obtain information to improve decision-making
- Protect what you have in custody
If you use the data wisely, it is a revolution. But the existence of the data explosion means little if it is not exploited and protected.
BD Matters, Evolution vs. Revolution Does Not.
The debate as to whether it's evolution or a revolution is, to me, primarily a question of etymology. So the question itself is intrinsically arbitrary. However, it seems that the greatest benefit of settling this argument will come from the questions that logically follow. For example, as Kusnetzky points out, "The key is that non-IT analysts can take part." This would be huge. The ability to regularly use experts in the fields directly related to the data at hand, rather than IT analysts, would be nothing short of game changing.
New Technologies of the Age Spell Big Data Revolution
Big data, or voluminous data, is happening now, and quantum computing
So, we need practical solutions and practical hardware, and practical applications, NOW!
We have practical hardware
CobraA1: Agree with you; the problem was with xamountofwords's quantum
Big Data has no relevance for operational systems
As for analysis, there is a whole discipline devoted to interpreting large data sets. It's called statistics. You can't make "big data" intuitive because statistical analysis often reveals counter-intuitive results; that's why you need statistics. Our intuitive sense of probability is extremely poor.
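A standard illustration of how intuition misleads is the base-rate effect. The sketch below applies Bayes' rule to a hypothetical alerting scenario; the function name `posterior` and all three numbers are invented for the example and do not come from any real system.

```python
def posterior(prior, sensitivity, false_positive_rate):
    """P(event is truly anomalous | it was flagged), by Bayes' rule."""
    true_pos = prior * sensitivity                   # anomalous and flagged
    false_pos = (1.0 - prior) * false_positive_rate  # normal but flagged
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical figures: 1 event in 1,000 is anomalous, the detector
    # catches 99% of anomalies and wrongly flags 1% of normal events.
    p = posterior(prior=0.001, sensitivity=0.99, false_positive_rate=0.01)
    print(f"Chance a flagged event is a real anomaly: {p:.1%}")
```

Under these assumed figures a detector that sounds "99% accurate" is right only about 9% of the time when it raises a flag, because normal events vastly outnumber anomalous ones.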
If there had been some huge breakthrough in statistical techniques then we might talk about revolution, but all I see is some optimisation techniques that aren't new and are liable to lead to incorrect results.
Likewise, if someone had devised a technique to query operational data in real-time with little or no impact on operational systems then I might be impressed. Very useful, but not really revolutionary.