EMC and its federated companies, Pivotal and VMware, have integrated a few third-party vendors into what EMC bills as a preintegrated data lake for big data deployments.
With the offering, called the Federation Business Data Lake, EMC is targeting enterprises that want to roll out big data infrastructure. Data lakes have become a popular way to combine structured and unstructured information for analysis. The general idea is to take information from various systems and silos, pool it and then analyze it via Hadoop.
This data lake stack includes technology from VMware, Pivotal and EMC, third-party Hadoop vendors such as Cloudera and Hortonworks, and analytics platforms including SAS and Tableau. The EMC data lake uses a wrapper model so it can absorb third-party tools.
Here's the stack.
The gist of the integrated data lake is that it combines storage, analysis, surfacing of patterns and action in one pool. EMC also provides on-call support and tools to manage the platform, catalog data and govern information.
Aidan O'Brien, senior director of EMC's big data initiative, said the aim of the data lake stack is to create a self-service system for analytics. "We're trying to create a self-service big data environment where an engineer, data scientist, developer or line of business person can get what they need and gather data," said O'Brien.
As for adding more third-party partners, O'Brien said the data lake architecture is designed to work with other technologies. "We need to embrace third parties," he said.