Our last post presented an analogy for MapReduce. In this post, we layer real MapReduce vocabulary over the example to help decode the jargon that sometimes blocks understanding of Big Data.
Big on Data
Veteran data geek Andrew Brust covers Big Data technologies including Hadoop, NoSQL, Data Warehousing, BI and Predictive Analytics.
Andrew Brust has worked in the software industry for 25 years as a developer, consultant, entrepreneur and CTO, specializing in application development, databases and business intelligence technology. He has been a developer magazine columnist and conference speaker since the mid-90s, and a technology book writer and blogger since 2005. Andrew is also Founder and CEO of Blue Badge Insights, an analysis, strategy and advisory firm serving Microsoft customers and partners.
Can a skyscraper completed in 1931 be used to explain a parallel processing algorithm introduced in 2004? In this post, I use the analogy of counting smartphones in the Empire State Building to explain MapReduce... without using code.
Big Data infrastructure and competency can seem distant from the workaday world of retail planning, strategy and analysis. Bringing the two worlds together would be quite useful, though. At least one vendor is trying, through acquisition, integration and leadership experienced in both.
Complex Event Processing (CEP) is the category of technology focused on handling large, continuous streams of data that must be processed in real time. CEP is distinct from Big Data in the eyes of some, and yet inextricably tied to it as well.
Microsoft's SQL Server 2012 has been released to manufacturing. This release of the 20+ year-old database has tie-ins to Hadoop and Big Data analytics in general.
To many, Big Data goes hand-in-hand with Hadoop + MapReduce. But MPP (Massively Parallel Processing) and data warehouse appliances are Big Data technologies too. The MapReduce and MPP worlds have been pretty separate, but are now starting to collide. And that’s a good thing.
Big Data is all the rage these days, as are its constituent technologies like Hadoop, NoSQL, and the mystical discipline of data science. But it turns out that understanding of, and a consensus definition for, Big Data are rather elusive. This blog is here to address that.