IBM spills more about Hadoop strategy with new PureData System

Summary: The Hadoop party continues as IBM introduces its latest big data offering based on the open source software.


IBM has been investing in Hadoop-based projects for some time, and the tech giant is adding a new open source-based platform to its big data portfolio today.

Even after a few short months, Hadoop has already proven to be one of the biggest hot topics in enterprise technology this year.


Much of it got started around late February when Intel, Hewlett-Packard, Hortonworks and EMC all introduced new solutions based on the open source software. The common thread for most of these new products is to use Hadoop technology to streamline and process big data at a lower cost.

IBM's latest take is PureData System for Hadoop. Essentially an extension of IBM's other Hadoop-based platform, InfoSphere BigInsights (along with some integration with analytics functions from IBM Research), the PureData platform is designed to let companies of all sizes manage and analyze data while tacking on administrative, workflow, provisioning and security features.

Given how many big data solutions are available (from IBM alone), it can be hard to tell them apart -- especially based on those generic terms.

The Armonk, N.Y.-headquartered corporation explained that PureData works with what it referred to as "cold data" and "hot data."

Essentially, PureData is supposed to provide a path to move older "cold data" into an active archive, which allows for historical data analysis. But the active "hot data" is analyzed in real time.

A real use case would be a bank trying to analyze historical "cold data" such as bank statements while simultaneously trying to process newer data in real time to catch threats and fraud.
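To make the hot/cold distinction concrete, here is a minimal Python sketch of how records might be split by age. It is an illustration only, not IBM's implementation: the 30-day cutoff, the `HOT_WINDOW` name and the `split_hot_cold` function are all assumptions made up for this example.

```python
from datetime import datetime, timedelta

# Hypothetical cutoff: records older than this are treated as "cold data".
HOT_WINDOW = timedelta(days=30)


def split_hot_cold(records, now, hot_window=HOT_WINDOW):
    """Partition (timestamp, payload) records into hot and cold lists by age."""
    hot, cold = [], []
    for timestamp, payload in records:
        if now - timestamp <= hot_window:
            # Recent record: candidate for real-time analysis.
            hot.append((timestamp, payload))
        else:
            # Older record: candidate for the active archive.
            cold.append((timestamp, payload))
    return hot, cold
```

In the bank example, recent transactions would land in the hot list for fraud checks, while older statements would go to the cold list for historical analysis.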

The PureData System for Hadoop will start shipping during the third quarter of this year.

IBM has a few other releases this week, including some upgrades for the InfoSphere BigInsights platform to make it easier to develop applications using SQL.

IBM Research also introduced BLU Acceleration, a new platform designed to extend the capabilities of in-memory systems and maximize analytics performance, including skipping over data that doesn't need to be analyzed while also analyzing data in parallel across different processors.
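The general idea of skipping data and scanning in parallel can be sketched in a few lines of Python. This is a generic illustration of the technique, not a description of how BLU Acceleration works internally: the per-block min/max metadata, the function names and the block layout are all assumptions for this example, and the thread pool merely stands in for separate processors.

```python
from concurrent.futures import ThreadPoolExecutor


def make_blocks(values, block_size):
    """Split values into blocks, recording each block's min and max."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        blocks.append({"min": min(chunk), "max": max(chunk), "rows": chunk})
    return blocks


def count_greater_than(blocks, threshold, workers=4):
    """Count values above threshold, skipping blocks that cannot match."""
    # A block whose max is not above the threshold holds no matching rows,
    # so it is never scanned.
    candidates = [b for b in blocks if b["max"] > threshold]
    skipped = len(blocks) - len(candidates)

    def scan(block):
        return sum(1 for v in block["rows"] if v > threshold)

    # Scan the remaining blocks concurrently and add up the partial counts.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        total = sum(pool.map(scan, candidates))
    return total, skipped
```

The function returns both the match count and the number of blocks that were never read, which shows how much work the block metadata saved.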

Topics: Big Data, Data Centers, Data Management, IBM, Open Source


Rachel King is a staff writer for CBS Interactive based in San Francisco, covering business and enterprise technology for ZDNet, CNET and SmartPlanet. She has previously worked for The Business Insider, CNN's San Francisco bureau and the U.S. Department of State. Rachel has also written for, Irish Americ...
