Big Data: Revolution or evolution?

Moderated by Jason Hiner | April 2, 2012 -- 07:00 GMT (00:00 PDT)

Summary: The technology to collect, process and analyze Big Data has been around for a while. So what's changed?

Andrew Brust


Revolution

or

Evolution

Dan Kusnetzky


Best Argument: Revolution

The moderator has delivered a final verdict.

Opening Statements

Don’t be afraid

Andrew Brust:  Big Data is unmistakably revolutionary.  For the first time in the technology world, we’re thinking about how to collect more data and analyze it, instead of how to reduce data and archive what’s left.  We’re no longer intimidated by data volumes; now we seek out extra data to help us gain even further insight into our businesses, our governments, and our society.

The advent of distributed processing over clusters of commodity servers and disks is a big part of what’s driving this, but so too is the low and falling price of storage.  While the technology, and indeed the need, to collect, process and analyze Big Data has been with us for quite some time, doing so hasn’t been efficient or economical until recently.  And therein lies the revolution: everything we always wanted to know about our data but were afraid to ask.  Now we don’t have to be afraid.

Not really new

Dan Kusnetzky: Big Data isn't really new. What we now know as Big Data comes out of the ancient and honorable practice of analyzing log data, and from a long line of analytical tools that deal with large amounts of rapidly moving data. Analyzing log data coming out of operating systems, application frameworks, database engines, networking giblets and storage systems has been around for decades as a “big data” task. Just ask vendors such as Splunk, Loggly, or RainStor.

Talkback

24 comments
  • Search is not Big Data analytics

    "Big Data", which has to be the worst term yet coined, goes far beyond log management and search. At the moment, it's being abused by those vendors to give relevance to their products.

    Big Data is a combination of technologies: new computing paradigms (e.g. Red Lambda's collaborative grids, or Hadoop), simpler throughput-oriented storage (e.g. lock-free NoSQL, graph databases, etc.) and, most importantly, incremental data mining. Big Data is really best defined as the situation in which data is either too large or too 'continuous' to ever make an analytical pass over the entire dataset. You can't just fire up k-means and cluster your data; by the time you are done, the results are at worst irrelevant, or at best forensic.

    The crown jewel of Big Data is incremental knowledge discovery. This is the act of applying data mining techniques to *any* events as they arrive, without referencing any other data that has been received. The trick is how to cluster, classify and perform anomaly detection on such data for the life of the system's operation. No batch processing method (like Hadoop) can solve this problem. *Any* events has to mean *any*. Binary files, imagery, audio, and UTF-8 logs are all forms of data. Being able to perform basic searches on one of them hardly qualifies for this category.
    conduit242
    Vote: I'm for Revolution
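
As an illustration of the contrast conduit242 draws between a full batch pass and incremental mining, here is a minimal sketch of sequential (online) k-means. It is one classic technique, not the method used by any vendor or product named in this thread. The class name `OnlineKMeans` and the sample stream are hypothetical, and the sketch assumes events have already been turned into numeric vectors, which is far narrower than the "*any* events" requirement above. Each arriving point updates one centroid and is then discarded, so no stored data is revisited.

```python
from typing import List, Sequence


class OnlineKMeans:
    """Sequential k-means: one centroid update per arriving point."""

    def __init__(self, k: int) -> None:
        self.k = k
        self.centroids: List[List[float]] = []
        self.counts: List[int] = []

    def update(self, point: Sequence[float]) -> int:
        """Assign `point` to a cluster, nudge that centroid, return its index."""
        # Seed the centroids with the first k points seen on the stream.
        if len(self.centroids) < self.k:
            self.centroids.append(list(point))
            self.counts.append(1)
            return len(self.centroids) - 1

        # Pick the nearest centroid by squared Euclidean distance.
        nearest = min(
            range(self.k),
            key=lambda i: sum(
                (c - p) ** 2 for c, p in zip(self.centroids[i], point)
            ),
        )

        # Running-mean update: only the centroid and its count are kept,
        # never the earlier points themselves.
        self.counts[nearest] += 1
        rate = 1.0 / self.counts[nearest]
        self.centroids[nearest] = [
            c + rate * (p - c) for c, p in zip(self.centroids[nearest], point)
        ]
        return nearest


# Hypothetical stream of 2-D events, processed one at a time.
model = OnlineKMeans(k=2)
stream = [(0.0, 0.0), (10.0, 10.0), (1.0, 1.0), (9.0, 11.0), (0.0, 2.0)]
labels = [model.update(p) for p in stream]
print(labels)  # [0, 1, 0, 1, 0]
```

The trade-off is that results depend on arrival order and on how the centroids are seeded, which is part of why incremental clustering over the life of a system is harder than a one-off batch run.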
  • concept of big data and analytics isn't new

    it's the ease with which the data is captured (mobile devices and apps) and the context. Big data isn't necessarily transactional (e.g. I bought a product)...it's behavioral.
    getrichieb
    Vote: I'm for Revolution
  • I've always thought that the term "big data" is not indicative of exactly

    what the tech people really want to convey. The data collected is "voluminous", and hard to tackle or tame for quick analysis. Huge volumes of data should be called exactly what they are, and that's "voluminous collections of data" or perhaps "voluminous data" for short.
    adornoe
    Vote: I'm for Evolution
  • Big Data hardly matters

    What matters is what you do with it:
    - Use it to obtain information to improve decision-making
    - Protect what you have in custody

    If you use the data wisely, it is a revolution. But the existence of the data explosion means little if it is not exploited and protected.
    nmarks2
    Vote: I'm for Revolution
    • BD Matters, Evolution vs. Revolution Does Not.

      I agree to the extent that what's truly important is "what you do with it."

      The debate as to whether it's an evolution or a revolution is, to me, primarily a question of etymology. So the question itself is intrinsically arbitrary. However, it seems that the greatest benefit of this argument's conclusion(s) will come from the questions that logically follow. For example, as Kusnetzky points out, "The key is that non-IT analysts can take part." This would be huge. The ability to regularly use experts in the fields directly related to the data at hand, rather than IT analysts, would be nothing short of game changing.
      felipeowen
      Vote: I'm Undecided
  • New Technologies of the Age Spell Big Data Revolution

    The technologies needed to make big data meaningful are reasonably new in terms of availability: supercomputers. Specifically, we're seeing impressive developments in quantum computing which will truly give rise to the big data revolution.
    xamountofwords
    Vote: I'm for Revolution
    • Big data, or voluminous data, is happening now, and quantum computing

      is not yet there.

      So, we need practical solutions and practical hardware, and practical applications, NOW!
      adornoe
      Vote: I'm Undecided
      • We have practical hardware

        We have practical hardware. That's not really the problem. We can process huge volumes of data easily. The question is, why are you processing the data, and how will the results affect your business?
        CobraA1
        Vote: I'm for Evolution
      • CobraA1: Agree with you; the problem was with xamountofwords's quantum

        computer comments. I was trying to stress that we need practical solutions now, and quantum computers are not there yet. And, yes, you're absolutely right about there already being the hardware and the software to process the huge amounts of data, systematically and organizationally, and with results that matter. It's only a matter of making sure that we continue to keep up with the amounts of data being generated.
        adornoe
        Vote: I'm Undecided
  • Big Data has no relevance for operational systems

    Which are the interesting things as far as I am concerned.

    As for analysis, there is a whole discipline devoted to interpreting large data sets. It's called statistics. You can't make "big data" intuitive because statistical analysis often reveals counter-intuitive results; that's why you need statistics. Our intuitive sense of probability is extremely poor.

    If there had been some huge breakthrough in statistical techniques then we might talk about revolution, but all I see is some optimisation techniques that aren't new and are liable to lead to incorrect results.

    Likewise, if someone had devised a technique to query operational data in real-time with little or no impact on operational systems then I might be impressed. Very useful, but not really revolutionary.
    jorwell
    Vote: I'm for Evolution