IBM enhances Watson Data Platform, with an eye towards AI

IBM adds to its data preparation, catalog and governance services, and GAs its Hadoop/Spark service. And while that may sound like data-lake-de-rigueur, IBM says it's all about AI.
Written by Andrew Brust, Contributor

IBM today is announcing new capabilities in the Watson Data Platform (WDP), part of the IBM Cloud. As ZDNet's Larry Dignan wrote in September, IBM's spin and strategy is to make WDP into a "data science operating system." Put another way, IBM wants it to be a premier platform for AI.

Also read: IBM's Watson Data Platform aims to become data science operating system

The platform comprises numerous services, including managed NoSQL databases, Watson Machine Learning and cognitive APIs for natural language understanding, visual recognition, chatbots and more. IBM is now enhancing it with new features.

Catalog, refinery and governance, oh my!
First off, IBM is adding to its Data Catalog and Data Refinery offerings with features intended to amplify data enrichment and data cleansing capabilities. By doing so, the company believes, customers will be better able to create data sets that are more comprehensive and of higher quality. When combined with feature engineering work (determining which columns in a data set are germane to predicting the value of others), such data sets can be used to create superior machine learning models.
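
To make the feature engineering idea concrete, here is a minimal Python sketch of cleansing a small data set and then ranking its columns by how strongly they track a target column. It is an illustration only, not IBM's method or any Data Refinery API: the function names (`pearson`, `rank_features`), the sample columns and the use of Pearson correlation as a stand-in for "germane to predicting" are all assumptions made for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def rank_features(rows, target):
    """Rank every non-target column by |correlation| with the target column."""
    # Cleansing step (hypothetical rule): drop rows with any missing value.
    clean = [r for r in rows if all(v is not None for v in r.values())]
    ys = [r[target] for r in clean]
    scores = {
        col: abs(pearson([r[col] for r in clean], ys))
        for col in clean[0]
        if col != target
    }
    # Strongest candidate features first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up sample data: one row has a missing value and is dropped.
rows = [
    {"ad_spend": 10, "store_id": 3, "sales": 105},
    {"ad_spend": 20, "store_id": 1, "sales": 198},
    {"ad_spend": 30, "store_id": 4, "sales": 310},
    {"ad_spend": None, "store_id": 2, "sales": 150},
    {"ad_spend": 40, "store_id": 2, "sales": 395},
]

ranking = rank_features(rows, "sales")
for column, score in ranking:
    print(f"{column}: {score:.3f}")
```

In this toy data, `ad_spend` ranks far above `store_id`, which is the kind of signal a data scientist would use when deciding which columns to feed a model. Real feature engineering typically goes further than linear correlation, but the workflow of cleansing first and assessing relevance second is the same.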

Metadata gleaned from Data Catalog and Data Refinery can help drive data governance policies. Such governance becomes more relevant each day as new data breaches surface and regulations meant to mitigate them multiply. Accordingly, IBM is also adding new features to its Unified Governance Platform, including capabilities aimed squarely at compliance with regulations like the European Union's General Data Protection Regulation (GDPR).

Elephant in the Room
Finally, Big Blue is announcing the general availability of IBM Analytics Engine, its ODPi-compliant Hadoop and Spark service, aimed at fast processing of large data sets.

As I alluded to earlier, while all of these components (especially Analytics Engine) are data lake technologies, IBM really sees them as the basis for doing better machine learning and artificial intelligence.

By combining the power of the Watson brand with AI-oriented marketing, IBM is betting that it can get real traction for its data platform and compete with the likes of Amazon, Microsoft and Google for cloud data platform dominance. That's a tall order, but it's not impossible, especially given IBM's enterprise cred and the breadth and legacy of its on-premises data stack.