MarketShare's Big Data Do-Over: Hadoop Deployment Overhaul

MarketShare takes another shot at its Hadoop deployment and illuminates important lessons for companies exploring the possibilities of big data in their enterprises.
Written by Chris Kanaracus, Contributor

While Hadoop remains one of the most popular platforms for big data, there's no single correct way to implement it in a manner best suited to your company's business requirements.

That's what MarketShare found out the hard way. The company, which is owned by Neustar, offers marketers analytics and reports that help them boost sales and make better decisions. Its clients include MasterCard, Turner Broadcasting and Twitter.

MarketShare has been in business since 2005, and by 2012 found itself in a much different place, as Constellation Research VP and principal analyst Doug Henschen writes in a newly published case study:

What started as a Small Data analysis challenge in the company's early days evolved into a Big Data challenge by 2012. That's when the firm shifted from analyzing gigabyte-scale data to using terabyte-scale data stored on Hadoop. The goal of using more data and a greater variety of data was to improve accuracy and to better measure digital activity across emerging mobile and social channels.

Initially, MarketShare used Amazon's Elastic MapReduce service, which is powered by Hadoop. It then took data extracts from Hadoop jobs, moved them into an Oracle database, and finally generated reports using Tableau. This solution proved too slow and complicated, and it had a key weakness: Users couldn't drill down from the broad trend data into richer detail.
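To see why an extract-based pipeline limits drill-down, consider a minimal Python sketch. It is not MarketShare's actual code: the record fields, values, and function names are all hypothetical, chosen only to illustrate the general pattern of boiling raw records down to an aggregate before reporting.

```python
from collections import defaultdict

# Hypothetical impression-level records; every field name and value is invented.
impressions = [
    {"week": "W01", "channel": "mobile", "ad_id": "a1", "spend": 0.50},
    {"week": "W01", "channel": "mobile", "ad_id": "a2", "spend": 1.25},
    {"week": "W01", "channel": "social", "ad_id": "a3", "spend": 0.75},
    {"week": "W02", "channel": "mobile", "ad_id": "a1", "spend": 2.00},
]


def build_extract(rows):
    """Boil raw rows down to total spend per (week, channel)."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["week"], row["channel"])] += row["spend"]
    return dict(totals)


def drill_down(rows, week, channel):
    """Return the individual rows behind one aggregated total."""
    return [r for r in rows if r["week"] == week and r["channel"] == channel]


# The extract is what a conventional reporting database would receive.
extract = build_extract(impressions)

# Drilling down needs the raw rows; the extract no longer carries ad_id.
detail = drill_down(impressions, "W01", "mobile")
```

The extract is small enough for a conventional reporting tool, but it has discarded the per-impression fields. Answering a question about an individual ad means going back to the raw data, which in a pipeline like the one described above would mean rerunning the upstream job.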

In early 2014, MarketShare began using Altiscale, a cloud-based Hadoop service that dramatically reduced processing times. The company still faced challenges, however, as Henschen writes:

With the embrace of digital campaign information, MarketShare needed to analyze data on a whole new scale. "Instead of looking at aggregated data on a weekly basis, we decided to look at spend related to every single individual ad impression," says Satya Ramachandran, MarketShare's head of engineering. "That entails looking at tens of terabytes instead of hundreds of gigabytes."

The answer it found was a move to Arcadia Data, which provides a Hadoop-native visual analytics and BI toolset:

Arcadia Data enabled the company to deliver relevant data and customized reports more quickly and easily, while also better supporting client-specific querying down to the level of individual customer interactions.

The Bottom Line

MarketShare initially tackled the first challenge companies face with Big Data, namely managing data at scale, but it had to handle a second one as well, as Henschen writes:

The next (and perhaps bigger) challenge is analyzing data at scale, and on this front organizations are employing a mix of old and new technologies. MarketShare started by boiling down Big Data sets into extracts that could be handled in a conventional reporting environment, but the approach proved both time- and labor-intensive and limiting in terms of depth of insight.

MarketShare's story holds important lessons for every company exploring the possibilities of Big Data in their enterprises, and Henschen's full report goes into much greater detail. An excerpt is available at this link.
