
Measuring big data, by the byte and more

By Joe McKendrick | February 22, 2012, 6:43pm PST

Summary: Never mind trying to tame “big data”. Is this something that can even be measured? How big is the big data market anyway? Deloitte is attempting to do just that.

What, exactly, is “big data”? In a recent interview (video posted below), Duncan Stuart, director of research for TMT at Deloitte Canada, defined it as 5 petabytes or more.

Many in the industry define big data not just by volume, but also by velocity and variety. But 5 PB is a good, simple threshold for current measurements. It’s also a moving target: bear in mind that 5 PB may be what you find in a tablet computer three years from now.

Such large, multi-petabyte sites are likely to be proliferating. In a survey I helped conduct last fall as part of my work with Unisphere Research/Information Today Inc., nine percent of the companies participating reported data stores exceeding 1 petabyte. For comparative purposes, a petabyte is 1,000 times bigger than those 1-terabyte databases that made news just a decade ago.

In its report, Deloitte spells out the challenges with sizing the big data market — there are varied definitions of what big data is, it is still early in the adoption cycle of big data technologies, and most of the companies who are doing big data do not disclose their spending.

Nevertheless, Deloitte pegs the size of the big data market at about $1.3-$1.5 billion in 2012. The consultancy also predicts that this year, we’ll see big data experience accelerating growth and market penetration:

“As recently as 2009 there were only a handful of big data projects and total industry revenues were under $100 million. By the end of 2012 more than 90 percent of the Fortune 500 will likely have at least some big data initiatives under way.”
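To put those two figures side by side, here is a rough back-of-the-envelope sketch of the annual growth rate they imply. It assumes 2009 revenue of exactly $100 million (Deloitte says only "under $100 million", so this is an upper bound on the starting point and a lower bound on growth) and three years of compounding; the function name is purely illustrative.

```python
def implied_annual_growth(start_revenue, end_revenue, years):
    """Compound annual growth rate implied by a start and end revenue figure."""
    return (end_revenue / start_revenue) ** (1 / years) - 1


# Figures from the Deloitte estimates quoted above, in millions of US dollars.
revenue_2009 = 100  # assumption: "under $100 million" treated as exactly 100
revenue_2012_low, revenue_2012_high = 1300, 1500

low = implied_annual_growth(revenue_2009, revenue_2012_low, years=3)
high = implied_annual_growth(revenue_2009, revenue_2012_high, years=3)

print(f"Implied annual growth: {low:.0%} to {high:.0%} or more")
```

On those assumptions, the estimates imply revenues more than doubling every year between 2009 and 2012.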

But the industry is still in its infancy, Deloitte cautions. “Big data in 2012 will likely be dominated by pilot projects; there will probably be fewer than 50 full-scale big data projects (10 PBs and above) worldwide.”

There are compelling reasons for companies to pursue big data. “Big data can see through time, big data basically allows you to see everything all at once, and in much finer detail,” says Stuart. “Instead of looking at my customer’s behavior once a month, I can look at it every minute of every day. That kind of insight is very, very powerful. It allows me to serve my customer better — either very large or very fast or both, requires the big data toolset.”

(Illustration by Joe McKendrick.)

(Cross-posted at SmartPlanet Business Brains.)


Joe McKendrick is an author, consultant and speaker specializing in trends and developments shaping the technology industry.

Disclosure

Joe McKendrick

Joe McKendrick is an independent consultant, editor and speaker.

Joe has performed project work (white papers, articles, blogs, research and presentations) for the following companies in the IT marketspace:

  • CBS Interactive/CNET/ZDNet (this blog)
  • ebizQ
  • Evans Data
  • Gartner
  • IBM
  • Informatica
  • IDC
  • Microsoft
  • Systinet/HP
  • Teradata
  • Unisphere Research, a division of Information Today, Inc.
  • WebLayers

Joe has also performed research work for the following sponsoring organizations in partnership with Unisphere Research, a division of Information Today, Inc.:

  • IBM
  • Luminex
  • Noetix
  • Oracle Corp.
  • Teradata
  • Informatica
  • International Oracle Users Group
  • Oracle Applications Users Group
  • Professional Association for SQL Server
  • International DB2 Users Group
  • International Sybase Users Group
  • SHARE (IBM large systems users group)

Biography

Joe McKendrick

Joe McKendrick is an author and independent analyst who tracks the impact of information technology on management and markets. Joe is co-author, along with 16 industry leaders and thinkers, of the SOA Manifesto, which outlines the values and guiding principles of service orientation. He also speaks frequently on Enterprise 2.0 and SOA topics at industry events and Webcasts, and serves on the program committee for this year's SOA & Cloud Symposium in London. As an independent analyst, he has also authored numerous research reports in partnership with Unisphere Research, a division of Information Today, Inc., for user groups such as SHARE, Oracle Applications Users Group, and International DB2 Users Group. In a previous life, Joe served as director of the Administrative Management Society (AMS), an international professional association dedicated to advancing knowledge within the IT and business management fields. He is a graduate of Temple University.

4
Comments

Easy
jorwell Updated - 24th Feb
Big data is measured in GIGO*bytes.

It's what happens when you use a DBMS with massive data duplication and no constraint mechanisms.

*GIGO - Garbage in, garbage out.
It's interesting because I think we may be doing a disservice to the concept of big data by focusing on a specific size. As you noted, size constraints with hardware are constantly changing, but more importantly, big data is really a relative concept. Our definition is as follows...

Big data: When volume, velocity and variety of data exceeds an organization's storage or compute capacity for accurate and timely decision-making

I also think it is important to note that big data is not one technology, it's certainly not just Hadoop. It's also not just one architectural pattern - we are seeing big data implementations where total data is factored into the analysis, but we are also seeing a pattern that we refer to as "stream it, score it, then store it", to describe an approach that leverages rich analytics up-front to determine relevant data for further analysis or processing.
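One way to read the "stream it, score it, then store it" pattern is as a filter that sits in front of storage. The sketch below is only an illustration of that idea, not any vendor's implementation: the function names, the `amount` field, the scoring rule and the threshold are all hypothetical.

```python
from typing import Callable, Dict, Iterable, Iterator


def score_then_store(
    stream: Iterable[Dict],
    score: Callable[[Dict], float],
    threshold: float,
) -> Iterator[Dict]:
    """Score each incoming record and yield only those worth keeping."""
    for record in stream:
        value = score(record)
        if value >= threshold:
            # Attach the score so later analysis can reuse it.
            yield {**record, "score": value}


def spend_score(event: Dict) -> float:
    """Hypothetical scoring rule: larger transactions are more relevant."""
    return event["amount"] / 1000


# Hypothetical event stream; in practice this would be a live feed.
events = [
    {"id": 1, "amount": 20},
    {"id": 2, "amount": 5000},
    {"id": 3, "amount": 750},
]

# Only records scoring at or above the threshold reach "storage" (a list here).
stored = list(score_then_store(events, spend_score, threshold=0.5))
```

Because the filter is a generator, records are scored one at a time as they arrive, and only the relevant ones are ever held for further processing.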

Market sizing can be interesting, but I think the concept of a big data market, unless defined specifically with a solid definition will just lead to confusion. At one level, big data is really not a different market, it's just managing analytics at scale. You could certainly size various technology markets that are associated with big data, but even those technologies are not always used just for big data. Regardless, it's an interesting conversation as long as it doesn't constrain how people think about big data in a way that forces them into unnatural choices.

For our definition of big data, see this blog post - http://blogs.sas.com/content/datamanagement/2011/11/05/big-data-defined-its-more-than-hadoop/ along with posts about Hadoop, information management, etc.

Mark Troester
IT/CIO Thought Leader & Strategist
SAS
Twitter @mtroester
Agree wholeheartedly
rcasey101 23rd Feb
@mtroester +1

A lot of organizations are struggling at the terabyte scale of data, and could benefit from the advances that are happening with big data solutions. Deloitte is selling the scope of the issue short by looking at the top 50 in the world. There's a whole other class of demand for big data solutions if you bother to bring the TB camp into the discussion, and those projects probably number in the thousands.

Also, don't think just about storage. It's about network speeds, search, retrieval, safekeeping, and derivative processing. The data is no good if it cannot be found, extracted, and used.
