IBM spills more about Hadoop strategy with new PureData System

The Hadoop party continues as IBM introduces its latest big data offering based on the open source software.


IBM has been investing in Hadoop-based projects for some time, and today the tech giant added a new open source-based platform to its big data portfolio.

Only a few months in, Hadoop has already proven to be one of the hottest topics in enterprise technology this year.

Much of it got started around late February when Intel, Hewlett-Packard, Hortonworks and EMC all introduced new solutions based on the open source software. The common thread for most of these new products is to use Hadoop technology to streamline and process big data at a lower cost.

IBM's latest take is PureData System for Hadoop. Essentially an extension of InfoSphere BigInsights, IBM's other Hadoop-based platform (along with some integration with analytics functions from IBM Research), the PureData platform is designed to let companies of all sizes manage and analyze data while adding administrative, workflow, provisioning and security features.

Given how many big data solutions are available (from IBM alone), it can be hard to tell many of them apart, especially when they are described in such generic terms.

The Armonk, N.Y.-headquartered corporation explained that PureData works with what it referred to as "cold data" and "hot data."

Essentially, PureData is supposed to provide a path to move older "cold data" into an active archive, which allows for historical data analysis, while the active "hot data" is analyzed in real time.

A real use case would be a bank trying to analyze historical "cold data" such as bank statements while simultaneously trying to process newer data in real time to catch threats and fraud.
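To make the hot/cold distinction concrete, here is a minimal sketch of how records might be split by age. It is an illustration only, not IBM's implementation: the 90-day cutoff, the function name `split_hot_cold` and the record layout are all assumptions made for this example.

```python
from datetime import datetime, timedelta

# Hypothetical cutoff: anything older than this is treated as "cold".
COLD_AFTER = timedelta(days=90)

def split_hot_cold(records, now):
    """Partition (timestamp, payload) records into hot and cold lists by age."""
    hot, cold = [], []
    for ts, payload in records:
        # Older records go to the archive side, recent ones stay active.
        (cold if now - ts > COLD_AFTER else hot).append((ts, payload))
    return hot, cold

now = datetime(2013, 4, 1)
records = [
    (datetime(2013, 3, 30), "card swipe"),      # recent activity
    (datetime(2012, 1, 15), "bank statement"),  # historical record
]
hot, cold = split_hot_cold(records, now)
```

In the bank scenario, the `hot` list would feed fraud checks as transactions arrive, while the `cold` list would sit in the archive for historical analysis.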

The PureData System for Hadoop will start shipping during the third quarter of this year.

IBM has a few other releases this week, including some upgrades for the InfoSphere BigInsights platform to make it easier to develop applications using SQL.

IBM Research also introduced BLU Acceleration, a new platform designed to extend the capabilities of in-memory systems and maximize analytics performance, including skipping over data that doesn't need to be analyzed while also analyzing data in parallel across different processors.
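The idea of skipping data that doesn't need to be analyzed can be sketched in a few lines. This is a generic illustration of one common approach (keeping a minimum and maximum per block of values), not a description of how BLU Acceleration works internally. The names `build_blocks` and `count_greater_than` and the block size are assumptions for this example.

```python
def build_blocks(values, block_size):
    """Split values into fixed-size blocks, recording each block's min and max."""
    blocks = []
    for i in range(0, len(values), block_size):
        chunk = values[i:i + block_size]
        blocks.append({"min": min(chunk), "max": max(chunk), "rows": chunk})
    return blocks

def count_greater_than(blocks, threshold):
    """Count values above threshold, skipping blocks whose max rules them out."""
    matched = 0
    scanned = 0
    for block in blocks:
        if block["max"] <= threshold:
            continue  # no row in this block can match, so it is never read
        scanned += 1
        matched += sum(1 for v in block["rows"] if v > threshold)
    return matched, scanned

values = [1, 2, 3, 4, 10, 11, 12, 13, 5, 6, 20, 7]
blocks = build_blocks(values, block_size=4)
matched, scanned = count_greater_than(blocks, threshold=9)
```

Here only two of the three blocks are scanned. Because each block is checked independently, the surviving blocks could also be handed to separate workers, which is the parallel half of the description above; that part is not shown.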
