Another take on "Big Data"

Summary: The topic "Big Data" always draws many reader comments. Here's a segment of one response that gets to the heart of Big Data use cases.

Every time I post something on "Big Data," I get quite a bit of email with readers' thoughts on a good definition. A reader calling himself/herself "Mikey" sent a very short response that went to the heart of the topic. Here's a segment of what "Mikey" had to say:

Think three Vs.

  • Volume - The sheer amount of data, whether from a webscale user base (Twitter, Facebook) or a huge amount of machine/sensor data (clickstreams, power grid monitors etc.)
  • Variety - Data is more than validated strings in fields - it's text, images, video, and all sorts of machine data formats
  • Velocity - Wherever and whoever it's coming from, you have to capture tens or hundreds of thousands of writes per second, maybe even millions. You need distributed systems, usually, because if you just try to throw performance and hardware at it you'll eventually always lose.

I would also add extreme amounts of retail point of sale data to the reader's "Volume" list. Other than that, "Mikey" has the use case nailed.

The technology that supports Big Data, on the other hand, is much too complex to describe in a few short bullets.

About

Daniel Kusnetzky, a reformed software engineer and product manager, founded Kusnetzky Group LLC in 2006. He's literally written the book on virtualization and often comments on cloud computing, mobility and systems software. In his spare time, he's also the managing partner of Lux Sonus LLC, an investment firm.

Talkback

4 comments
  • RE: Another take on

    Excellent: simple, precise, to the point.
    ssmusoke@...
  • Does my data look big in this?

    Of course it may be that your data really isn't that big.

    But you got distracted by the misguided notion that denormalization improves performance.

    Therefore your correct and correctly sized data has expanded into big data, regularly delivering contradictory results from your duplicated data.

    Existing DBMSs are already far too redundant. The future is smaller data not bigger.
    jorwell
  • RE: Another take on

    "I would also add extreme amount of retail point of sale data to the reader's 'Volume' list."

    You're talking about small text transactions vs . . . oh, I dunno, YouTube videos? A single video is probably as much data as many thousands of POS transactions.

    And do you really need every piece of data from every transaction outside of the individual store? For higher level stuff, shouldn't you just be concerned with processed, aggregated data?
    CobraA1
    • RE: Another take on

      @CobraA1 nice points.
      daviddaly