Trying to find differentiation in Hadoop soup

Every company has access to the same list of Apache Hadoop-related projects, but how does one differentiate itself from all the others? Innovation, additional functionality, or marketing?
Written by Ken Hess, Contributor

Hand a small red ball to each member of a group of ten marketing professionals and ask each person to take 48 hours to analyze the product and come back with a pitch to sell this product to the widest possible audience. And do you know what you get at the end of the 48 hours? "The most <insert outrageous superlatives here> product that will revolutionize the way you look at little red balls"--that's what. I think that's what we have with Hadoop or what I call Hadoop soup. I like Hadoop. I like Apache. I like open source. What I'm not sure I like is hype. Hadoop soup: The most innovative, leading-edge, enterprise-capable, highly scalable, end-to-end, easy-to-implement, easy-to-deploy, fastest, unbreakable platform that will revolutionize the way you look at and interact with data.

I think I'm gonna need a bigger spoon to find some tasty bits in the Hadoop soup.

Years ago, I watched an IROC race on TV where the racers had "identically prepared Camaros". Same tires, same engine, same transmission--same everything. The challenge was to see which driver had the skill to win when the playing field was absolutely even. Except for positioning at the start of the race, there were no advantages apart from the drivers themselves.

Most boring race I've ever watched. Painful. I don't remember who won because I couldn't watch for more than about 30 minutes. Left turns. Everyone going the same speed. Go-kart races are far more exciting than that snoozer.

The point I'm trying to make is that when all your Hadoop players have the same starting place, how do they differentiate themselves enough for a CIO to make a decision about a solution?

It isn't easy and I don't think anyone has managed to do it successfully.

How do you determine which little letter in your soup is the best? Do you have a favorite letter? They all taste the same, but if you can find differentiation, I'm interested.

Don't get me wrong. Again, I like Hadoop and all its associated projects. I like Hadoop companies. I have nothing against the ecosystem nor do I have anything against companies who want to create and to sell Hadoop solutions. I also don't have anything against innovation or differentiation.

After posting my recent "Virtualized Hadoop: a brief look at the possibility", I began to research Hadoop companies and to check out exactly what they have to offer. I found a dozen or so companies, but not a lot of differentiation. They all began to run together. Their bits all tasted the same. It seems that the only bit of coaching that anyone got before that IROC race is the same advice that Hadoop companies got: Turn on your left blinker and floor it.

Yes, I'm being facetious, but only slightly so.

I've compared this type of homogeneity to automobile manufacturers. If you take a close look at pickup trucks from one model year to the next, how are they differentiated? A different grille? Different taillights? A new submodel based on some "packages" that make you think you're getting a more deluxe model? You know, LX, X, S, SS, GT, etc. I think we need marketing people to catch onto that trend in IT.

For example, Apache Hadoop XVI: It contains all 16 Apache Hadoop-related packages. You know, all 16 that you can download, install, and use yourself right from the Apache website.

I'm not saying that a solution should or should not include all the packages, but there should be something behind them other than a support contract. Show me something that you've done extra to make your solution stand out among the Hadoop soup din.

I need for someone to stand up and say, "Hey, we have an Apache Hadoop solution that's orange orange, raspberry red, and lemon yellow. And it stays crunchy even in milk." While my very short attention span is still hovering over Hadoop, I need for that hand to go up. I'm looking for some real innovation with a Hadoop solution, not just another set of packages on awesome hardware that's going to impress me with the same list of features that everyone else has.

What do you think? Do you know of a truly differentiated Hadoop solution or are you also swimming in Hadoop soup? Talk back and let me know.
