IBM’s Big Data Analytics Empire

When you think of Big Data Analytics, you might think of startups. Think again. IBM has the space covered, perhaps better than any other company.
Written by Andrew Brust, Contributor
IBM's Deepak Advani

IBM is going gangbusters in the Big Data world.  Its IBM Big Data Web site is a major resource, its TV commercials project authority and its products are entrenched in the space.  But, for me, it took a discussion yesterday with Deepak Advani, IBM’s Vice President for Business Analytics Products and Solutions, to appreciate IBM’s Big Data play in all its depth.

Maybe old dogs do the best new tricks
IBM is a 101-year-old company, based on the East Coast. It once made typewriters.  It still makes mainframes.  Its products interoperate with open source technology, but most of its products are anything but free. And many of those products have come through an array of acquisitions the company has made over the course of its history.  On top of all that, IBM is a services company, with teams of consultants working around the globe.

Most Big Data startups have a lean team, are just a few years old, based on the West Coast, offer their core technology as open source software running on commodity hardware, and have built their IP organically. Yet IBM is still a major player in the Big Data and analytics world.  How can this be when so many of its vital statistics appear counter to playing in the space?  Talking to Advani reminded me of several reasons why, and he pointed out some others that were new to me.  He helped me connect the dots and then added a few more.

For instance, I knew that IBM had made a number of acquisitions over the last decade in the business analytics space.  One of these is especially easy to remember: IBM acquired Business Intelligence (BI) powerhouse Cognos in 2007 -- right after the latter had acquired Applix that same year. This deal gave IBM an end-to-end BI suite that includes conventional and in-memory Online Analytical Processing (OLAP), reporting, dashboards and data visualization.  Again, I knew this.

But I had lost track of some of IBM’s other acquisitions, and it’s really the combination of all the deals that yields the compounded analytics value. Before the Cognos deal, IBM acquired Ascential Software in 2005, which among other assets gave it the Extract, Transform and Load (ETL) product DataStage. After the Cognos purchase, IBM swallowed up statistics and analytics heavyweight SPSS in 2009 and MPP (Massively Parallel Processing) Data Warehouse fixture Netezza in 2010.

So in addition to the full BI suite, IBM also has the high-performance data warehouse technology necessary to feed those BI systems and the tools necessary to run predictive analytics on the output.  Speaking of analytics, it’s a field closely tied with Risk Management. With that in mind, IBM’s purchase of OpenPages in 2010 and Algorithmics in 2011 just primed the Big Data pump that much more.

These are just some of the products IBM has in the business analytics space.  There are so many in fact that, as Advani explained to me, IBM has a whole “Signature Solutions” program meant to highlight the more interesting combinations of IBM’s products in the space, combined with the intellectual property developed around them by its services organization.

Homegrown products, and external ones too
Can IBM work with open source technologies from outside its own stack, even when they compete with some of its own products?  Sure.  IBM works with R (effectively a competitor to SPSS); it has a partnership with Cloudera (whose Hadoop distribution competes with IBM’s own, which is also open source) and can use Mahout, the machine learning component that runs on top of Hadoop.  And then, of course, there’s Linux, the open source operating system that IBM has used strategically for years.

IBM also has a number of organically developed products in the analytics space. InfoSphere Streams, its CEP (Complex Event Processing) offering, and InfoSphere BigInsights, its own Hadoop distribution, are two examples.  One of the interesting points about the BigInsights Hadoop distribution is its integration with IBM’s DB2 relational database management system, one of the most important in the industry.  And while relational data might not be Big Data, accomplishments in the former build a pedigree for success in the latter.  Knowing how to handle data operationally builds a platform and competency for analyzing it later on.

This is about hardware too.  IBM is almost synonymous with mainframe computers and the vast back office systems that have run on them, collecting data for decades.  Handling systems with those workloads, for that long, makes Big Data a concrete concept for IBM. It doesn’t just build products and ask its customers to imagine the kinds of data they run through them.  Imagination’s fine, but decades of experience make it less necessary.

People, not just products 
With such a big product portfolio in the data analytics space, it’s easy to forget about human capital.  But the role of people in IBM’s Big Data initiatives arguably eclipses the products in importance.  To start with, IBM has a huge research faculty, including what Advani told me is the largest mathematics department in private industry.  This clearly provides significant firepower in predictive analytics innovation.

And then there’s IBM’s acquisition of PriceWaterhouseCoopers’ consulting division back in 2002.  Even prior to that deal, IBM had a substantial global services division, but the acquisition of PwC Consulting transformed Armonk from a products company with a services organization to, arguably, a services organization that leverages an impressive array of its parent company’s own products.

Advani introduced me to another of IBM’s analytics initiatives, called Analytical Decision Management, which focuses on embedding analytics functionality within business applications, rather than forcing frontline workers to go into dedicated, siloed analytics apps to get those insights.  This initiative allows, for example, call center workers to understand what offers are appropriate to certain callers and what outcomes are likely when the offers are made.  Users of these applications don’t even sense that they are using analytics technology, because it’s embedded into operational workflows.  Clearly, IBM’s combination of research and services delivery experience enhances its ability to deliver in such frontline worker scenarios.

My conversation with Advani was indeed eye-opening. I’ve been watching IBM build products and buy companies for years, and I’ve understood its interest in Big Data and analytics. I just hadn’t put it all together in my head. IBM is in a unique position, and doing things in the Big Data world that its competitors cannot.

It’s not easy being (Big) Blue 
But this observation is a bit humbling as well.  How will other tech companies, especially startups, hope to build out a similar data analytics empire?  And how will IBM manage so many different products, technologies, consulting teams and acquired companies?  After all, most big empires eventually fall into decline.

It seems to me that IBM will need to integrate its product portfolio as the new versions of the products are released.  On the BI side, I’ve seen that start to happen, and it will need to continue.  Meanwhile, small startups, unencumbered by the management of so many moving parts, are critical for launching and propelling innovative technologies and the markets around them.  Big Data is proof of that.

Ultimately though, things will need to converge.  The Big Data space will mature, more Enterprise software companies will enter it, they’ll acquire some of the startups, and consolidation will occur.  The startups show us the importance of idealism and breaking new ground. IBM’s position shows us the importance of connecting Big Data with the Enterprise and a mainstream services organization.  It also demonstrates the power of embedding analytics functionality into line-of-business software that may deceptively register as mundane.

Cutting-edge innovation is critical, but its full value is realized with integration into mainstream tools, products and companies.
