IBM has been investing in Hadoop-based projects for some time, and today the tech giant is adding a new open source-based platform to its big data portfolio.
Even after just a few short months, Hadoop is shaping up to be a major theme in enterprise technology this year.
Much of it got started around late February, when several vendors introduced new solutions based on the open source software. The common thread for most of these new products is using Hadoop technology to streamline and process big data at a lower cost.
IBM's latest take is PureData System for Hadoop. Essentially an extension of IBM's other Hadoop-based platform, InfoSphere BigInsights (along with some integration with analytics functions from IBM Research), the PureData platform is designed to let companies of all sizes manage and analyze data while tacking on administrative, workflow, provisioning and security features.
Given how many big data solutions are available (from IBM alone), it can be hard to tell them apart -- especially based on those generic terms.
The Armonk, N.Y.-headquartered corporation explained that PureData works with what it referred to as "cold data" and "hot data."
Essentially, PureData is supposed to provide a path to move older "cold data" into an active archive, which allows for historical data analysis, while the active "hot data" is analyzed in real time.
A real use case would be a bank trying to analyze historical "cold data" such as bank statements while simultaneously trying to process newer data in real time to catch threats and fraud.
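To make the hot/cold split a little more concrete, here is a minimal Python sketch of the bank scenario. It is not IBM's implementation, and nothing in it comes from PureData itself: the 30-day window, the record fields (`timestamp`, `amount`), the fraud limit and all function names are assumptions chosen purely for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical cutoff: records older than this are treated as "cold".
HOT_WINDOW = timedelta(days=30)


def split_hot_cold(records, now, hot_window=HOT_WINDOW):
    """Partition records into (hot, cold) lists by age."""
    hot, cold = [], []
    for record in records:
        age = now - record["timestamp"]
        (hot if age <= hot_window else cold).append(record)
    return hot, cold


def flag_suspicious(hot_records, limit=10_000):
    """Toy 'real-time' check: flag any recent transaction above a limit."""
    return [r for r in hot_records if r["amount"] > limit]


def monthly_totals(cold_records):
    """Toy historical analysis: total amount per (year, month)."""
    totals = {}
    for r in cold_records:
        key = (r["timestamp"].year, r["timestamp"].month)
        totals[key] = totals.get(key, 0) + r["amount"]
    return totals
```

In this sketch, recent transactions would go through `flag_suspicious` as they arrive, while archived ones are only touched by batch-style summaries such as `monthly_totals`.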
The PureData System for Hadoop will start shipping during the third quarter of this year.
IBM has a few other releases this week, including upgrades to the InfoSphere BigInsights platform that make it easier to develop applications using SQL.
IBM Research also introduced BLU Acceleration, a new platform designed to extend the capabilities of in-memory systems and maximize analytics performance, including skipping over data that doesn't need to be analyzed while also analyzing data in parallel across different processors.
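The two ideas mentioned here, skipping data that cannot matter and scanning the rest in parallel, can be illustrated with a rough Python sketch. This is a generic illustration of the technique, not how BLU Acceleration is actually built: the per-block min/max summaries, the block layout and the function names are all assumptions, and the thread pool merely stands in for multiple processors (in CPython it shows the structure rather than a real speedup).

```python
from concurrent.futures import ThreadPoolExecutor


def build_blocks(values, block_size):
    """Split values into blocks, recording each block's min and max."""
    blocks = []
    for i in range(0, len(values), block_size):
        chunk = values[i:i + block_size]
        blocks.append({"min": min(chunk), "max": max(chunk), "rows": chunk})
    return blocks


def count_in_range(blocks, low, high, workers=4):
    """Count values in [low, high]; return (matches, blocks_skipped)."""
    # Data skipping: a block whose min/max range misses the query
    # range cannot contain a match, so it is never scanned.
    candidates = [b for b in blocks if b["max"] >= low and b["min"] <= high]

    def scan(block):
        return sum(1 for v in block["rows"] if low <= v <= high)

    # Scan the remaining blocks concurrently and combine the counts.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        matched = sum(pool.map(scan, candidates))
    return matched, len(blocks) - len(candidates)
```

The point of the sketch is that the cheap min/max check happens before any rows are read, so most blocks are never scanned for a narrow query.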