Why hardware giants are Hadoop distro happy

EMC and Intel launch Hadoop distributions as they realize that big data is going to create a lot of hardware opportunities.
Written by Larry Dignan, Contributor

EMC has its own Hadoop distribution and now Intel has joined the cause. What's notable is that both EMC and Intel are primarily hardware vendors increasingly playing a software game. The only way to ride the big data wave is to play Hadoop.

Let's recap recent events:

  • EMC launched its Pivotal HD Hadoop distribution. The move melds EMC's Greenplum and storage intellectual property with Apache Hadoop.
  • Intel launched the Intel Distribution for Apache Hadoop. The name isn't exactly trendy, but the cause is notable. Intel is looking to build Hadoop from the silicon level up to bolster computing speeds as well as security. With Xeon processors now supporting the Hadoop Distributed File System, Intel is claiming that processing time can be cut from four hours to seven minutes.

Aside from contributing to what is becoming a Hadoop distribution glut, EMC and Intel are making moves that will matter to their futures. Here are the moving parts:

  • Hardware vendors see Hadoop as the next big thing that will drive compute.
  • The computing needs of big data will result in more hardware: servers, storage and networking.
  • Optimized Hadoop for EMC storage and Intel processors makes a lot of sense.
  • Hardware is ultimately a bad business as software begins to dominate the data center.

With that backdrop, the Hadoop-happy moves by EMC and Intel make a ton of sense.



What's the fallout for the big data landscape?

  1. Oracle may need its own Hadoop distribution to bridge big data and its databases. Today, Oracle is a Cloudera partner. Perhaps Oracle buys Cloudera.
  2. Teradata and other data warehousing players have connectors to Hadoop. These vendors may need their own distributions, but would have trouble rising above the din.
  3. Informatica makes a living on data integration. If big data is integrated at the hardware level, there's less work to do.

The problem for IT buyers is obvious: Which Hadoop distribution do you pick? Intel has a bevy of partners. EMC is going after Cloudera. IBM has its own distribution. Toss in Cloudera and Hortonworks and the selection borders on too much. For now, it's probably best to sit back and evaluate all of the players. Your infrastructure, as well as your intent to stick with it, may be a gating factor.
