I had the opportunity to speak with the good folks at Zettaset about harnessing the power of Apache Hadoop so that organizations can gather, analyze, and then visualize massive data sets that represent the results of research, customer purchases, and the like.
Here is what Zettaset had to say:
Big data refers to large datasets, on the order of terabytes, exabytes, and zettabytes. Existing databases and management tools struggle with the analysis, volume, speed, and diversity of data at this scale. Bridging this gap allows analysts to make informed decisions, and as we enter the zettabyte age this becomes a necessity.
The Apache Hadoop software library is a framework that allows for distributed processing of large datasets across clusters of commodity hardware. It is most beneficial when combined with its subprojects and other Hadoop-related projects. The framework is designed for flexibility and scalability, with an architecture that scales to thousands of servers. The library detects and handles failures at the application layer, delivering a highly available service on commodity hardware.
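The processing model Hadoop distributes across those thousands of servers is map, shuffle, and reduce. A minimal single-machine sketch of that pattern, using plain Python and a made-up word-count input rather than Hadoop's actual Java API, looks like this:

```python
from collections import defaultdict

# Hypothetical input lines, standing in for the file blocks that
# Hadoop would spread across the cluster's data nodes.
lines = ["big data big clusters", "data on commodity hardware"]

# Map phase: each mapper independently emits (key, value) pairs.
mapped = [(word, 1) for line in lines for word in line.split()]

# Shuffle phase: group values by key (the Hadoop framework does
# this between the map and reduce stages).
groups = defaultdict(list)
for key, value in mapped:
    groups[key].append(value)

# Reduce phase: each reducer aggregates the values for one key.
counts = {key: sum(values) for key, values in groups.items()}

print(counts)
```

In a real Hadoop job the mappers and reducers run on different machines, and the framework reschedules any task whose node fails, which is how the application-layer failure handling described above works in practice.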
Zettaset maximizes and enhances this offering by filling the gaps in the open-source projects, integrating as many as 30 services and dependencies into a single deployable system.
Snapshot analysis

I've spoken with a number of companies that have embarked on this same journey, including DataStax and SnapLogic. Without testing all of the products that have been presented to me, I would be hard pressed to recommend one.
Hadoop is clearly a powerful yet challenging tool, one that requires a great deal of expertise if an organization hopes to get the maximum benefit from the analytics it can provide.
If your organization is interested in taking on this type of project, it would be wise to put Zettaset on your list of suppliers to query. The Zettaset website offers a product tour video that may answer initial questions.