Measuring big data, by the byte and more

Summary: Never mind trying to tame "big data." Can it even be measured? And how big is the big data market, anyway? Deloitte is attempting to size it.

TOPICS: Big Data

What, exactly, is “big data”? In a recent video interview, Duncan Stuart, director of research for technology, media and telecommunications (TMT) at Deloitte Canada, defined it as 5 petabytes or more.

Many in the industry define big data not just by volume, but also by velocity and variety. Still, 5 PB is a good, simple threshold for current measurements. It’s also a moving target: bear in mind that 5 PB may be what you find in a tablet computer three years from now.

Such large, multi-petabyte sites are likely proliferating. In a survey I helped conduct last fall as part of my work with Unisphere Research/Information Today Inc., nine percent of the participating companies reported data stores exceeding 1 petabyte. For comparison, a petabyte is 1,000 times the size of the 1-terabyte databases that made news just a decade ago.
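To make the unit arithmetic concrete, here is a minimal Python sketch. It assumes decimal (SI) units, which match the 1,000-to-1 comparison above, and it treats Stuart's 5 PB figure as a volume-only threshold. The names `is_big_data` and `to_petabytes` are made up for illustration and do not come from Deloitte's report.

```python
# Decimal (SI) storage units, matching the 1,000-to-1 comparison in the text.
TB = 10**12  # bytes in one terabyte
PB = 10**15  # bytes in one petabyte

# Working threshold from the interview; an assumption that will drift over time.
BIG_DATA_THRESHOLD_BYTES = 5 * PB


def is_big_data(size_bytes: int, threshold_bytes: int = BIG_DATA_THRESHOLD_BYTES) -> bool:
    """Return True when a data store meets or exceeds the volume threshold."""
    return size_bytes >= threshold_bytes


def to_petabytes(size_bytes: int) -> float:
    """Express a byte count in petabytes."""
    return size_bytes / PB


if __name__ == "__main__":
    examples = [
        ("1 TB database", 1 * TB),
        ("1 PB data store", 1 * PB),
        ("10 PB project", 10 * PB),
    ]
    for label, size in examples:
        print(f"{label}: {to_petabytes(size):g} PB, big data by volume: {is_big_data(size)}")
```

By this volume-only test, the 1 PB stores reported in the survey fall short of the threshold, which is one reason many prefer definitions that also weigh velocity and variety.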

In its report, Deloitte spells out the challenges of sizing the big data market: definitions of big data vary, it is still early in the adoption cycle of big data technologies, and most of the companies doing big data do not disclose their spending.

Nevertheless, Deloitte pegs the size of the big data market at about $1.3-$1.5 billion in 2012. The consultancy also predicts that this year, we’ll see big data experience accelerating growth and market penetration:

“As recently as 2009 there were only a handful of big data projects and total industry revenues were under $100 million. By the end of 2012 more than 90 percent of the Fortune 500 will likely have at least some big data initiatives under way.”
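For a rough sense of the growth those figures imply, the following Python sketch computes a compound annual growth rate. It is back-of-the-envelope arithmetic rather than anything taken from Deloitte's report. It assumes 2009 revenue sat at its $100 million ceiling, so the results are lower bounds, and the helper name `implied_cagr` is hypothetical.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate that takes start_value to end_value in `years` years."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Assumption: 2009 revenue is set to its stated ceiling ("under $100 million"),
# so the computed rates understate the actual growth.
START_2009 = 100e6
LOW_2012, HIGH_2012 = 1.3e9, 1.5e9  # Deloitte's 2012 market-size range
YEARS = 2012 - 2009

low = implied_cagr(START_2009, LOW_2012, YEARS)
high = implied_cagr(START_2009, HIGH_2012, YEARS)
print(f"Implied annual growth: at least {low:.0%} to {high:.0%}")
```

Under those assumptions, revenues would have had to more than double every year between 2009 and 2012.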

But the industry is still in its infancy, Deloitte cautions. “Big data in 2012 will likely be dominated by pilot projects; there will probably be fewer than 50 full-scale big data projects (10 PBs and above) worldwide.”

There are compelling reasons for companies to pursue big data. “Big data can see through time, big data basically allows you to see everything all at once, and in much finer detail,” says Stuart. “Instead of looking at my customer’s behavior once a month, I can look at it every minute of every day. That kind of insight is very, very powerful. It allows me to serve my customer better — either very large or very fast or both, requires the big data toolset.”

(Illustration by Joe McKendrick.)

(Cross-posted at SmartPlanet Business Brains.)

  • RE: Measuring big data, by the byte and more

    It's interesting because I think we may be doing a disservice to the concept of big data by focusing on a specific size. As you noted, size constraints with hardware are constantly changing, but more importantly, big data is really a relative concept. Our definition is as follows...

    Big data: When volume, velocity and variety of data exceed an organization's storage or compute capacity for accurate and timely decision-making

    I also think it is important to note that big data is not one technology, and it's certainly not just Hadoop. It's also not just one architectural pattern. We are seeing big data implementations where total data is factored into the analysis, but we are also seeing a pattern that we refer to as "stream it, score it, then store it," an approach that applies rich analytics up front to determine which data is relevant for further analysis or processing.

    Market sizing can be interesting, but I think the concept of a big data market, unless it is specifically defined, will just lead to confusion. At one level, big data is really not a different market; it's just managing analytics at scale. You could certainly size various technology markets that are associated with big data, but even those technologies are not always used just for big data. Regardless, it's an interesting conversation as long as it doesn't constrain how people think about big data in a way that forces them into unnatural choices.

    For our definition of big data, see this blog post, along with posts about Hadoop, information management, etc.

    Mark Troester
    IT/CIO Thought Leader & Strategist
    Twitter #mtroester
    • Agree wholeheartedly

      @mtroester +1

      A lot of organizations are struggling at the terabyte scale of data, and could benefit from the advances that are happening with big data solutions. Deloitte is selling the scope of the issue short by looking at the top 50 in the world. There's a whole other class of demand for big data solutions if you bother to bring the TB camp into the discussion, and those projects probably number in the thousands.

      Also, don't think just about storage. It's about network speeds, search, retrieval, safekeeping, and derivative processing. The data is no good if it cannot be found, extracted, and used.
  • Easy

    Big data is measured in GIGO*bytes.

    It's what happens when you use a DBMS with massive data duplication and no constraint mechanisms.

    *GIGO - Garbage in, garbage out.