Microsoft's Windows Server implementation of Hadoop is in private preview

Summary: Microsoft and Hortonworks' implementation of Hadoop for Windows Server hasn't disappeared or been cut. It's actually in private preview, according to a new Microsoft roadmap.


At last, big-data fans, we've got some word of the seemingly-missing-but-not-forgotten Windows Server implementation of Hadoop promised by Microsoft and Hortonworks.

I'd started wondering whether Microsoft's repeated "no comments" about the project's whereabouts -- the most recent of which I received just a couple of weeks ago, at the end of September 2012 -- meant Microsoft had decided to go cloud-only with Hadoop. But it turns out the Windows Server version of the Microsoft-Hortonworks Hadoop implementation is still around, and is just in private preview.

A quick refresher as to what's going on with Microsoft and Hadoop.

In the fall of 2011, Microsoft announced it was partnering with Hortonworks to create both Windows Azure and Windows Server implementations of the Hadoop big data framework. At that time, Microsoft officials committed to providing a Community Technology Preview (CTP) test build of the Hadoop-based service for Windows Azure before the end of calendar 2011 and a CTP of the Hadoop-based distribution for Windows Server some time in 2012. A month after announcing the Hortonworks partnership, Microsoft dropped plans to make its own big data alternative, codenamed Dryad.

In late December 2011, Microsoft posted a video on its Channel 9 site that provided updated information about the company's Hadoop plans. According to that video, which Microsoft subsequently pulled from Channel 9, the company planned to make Hadoop on Windows Azure generally available in March 2012, and Hadoop for Windows Server generally available in June 2012.

Ever since, Microsoft officials have gone silent on the new timetables for the Hadoop for Azure and Hadoop for Windows Server offerings. Until late September 2012, that is.

A slide deck from the "24 Hours of PASS" event from Denny Lee, Technical Principal Program Manager for SQL Business Intelligence Group, made its way to the Web recently. Lee, according to his bio, is "one of the original core members of Microsoft Hadoop on Windows and Azure (code name: Isotope) and had helped bring Hadoop into Microsoft."

A few of the interesting slides from Lee's deck from his September 21, 2012 presentation:


Hadoop on Azure is still in preview, as Lee's slide says. (The latest publicly acknowledged build was the second Community Technology Preview release.) But now we know that the Windows Server version is in private preview, according to Lee's deck. I'm not sure how long it's been in private preview, and have never found any testers who've claimed to have been part of the preview for it.

Also: there's seemingly a new deliverable on the roadmap: An "on-demand" dedicated Hadoop cluster in the cloud, which seems to be some kind of hybrid between the two (best I can tell). Anyone know any more about this?


Microsoft officials have been saying for a while that it wasn't just the Hadoop framework which Microsoft planned to support. There are lots of other related components in the works, like the Excel Hive Add-in, Sqoop, Apache Pig, Hive ODBC and more, as this slide notes. I'm assuming the features listed below the beige bar are the features that will be in the Windows Server version of the Hadoop implementation, and those above the bar are what are in the Azure Hadoop one.



Hadoop for Windows Server includes an interactive console, remote-desktop support, and other related elements, as this slide seems to indicate.

The O'Reilly Strata Conference plus Hadoop World are on tap for late October in New York City. Maybe Microsoft and Hortonworks will share more about their Windows Azure and Windows Server Hadoop plans and progress then (even though there aren't many Softies listed as speakers)?



Mary Jo has covered the tech industry for 30 years for a variety of publications and Web sites, and is a frequent guest on radio, TV and podcasts, speaking about all things Microsoft-related. She is the author of Microsoft 2.0: How Microsoft plans to stay relevant in the post-Gates era (John Wiley & Sons, 2008).

  • subscribing to rss feed

    umm, i am tired of the newsletter so i am switching to google reader as my RSS reader. but it asks me to import an opml file, while here u give me an xml file, so how do i import those rss subscriptions??
    • You can subscribe to RSS feed for my posts here

      Thanks! MJ
      Mary Jo Foley
  • A couple of notes

    I have no inside knowledge of what Microsoft is doing, but...

    1. A cloud-based Hadoop cluster would indicate, to me, that Microsoft provides a customer with a cluster of compute servers and storage where all the components are physically "close" to each other. Hadoop and HDFS understand locality and use it in trying to optimize a server's access to data. A plain-Jane, Azure-based implementation of Hadoop can't do that.

    2. My take on the beige bar in the slide is that the stuff above the line are the normal (open-source) Hadoop components (Hadoop, HDFS, Hive, Pig, etc). The stuff below the line are Microsoft's Windows-World add-ons (HDFS/Azure stuff, Excel, .NET, etc). I would expect that they'd be available on both an Azure implementation and on a Windows Server implementation.

    But, I may be wrong
  • Ah, the good ol' Triple E!

    Embrace - Extend - Extinguish. Wonder how many of their extensions are going to be open source, released under the same license as Hadoop?
    • MS is going to make their extensions available to the community

      Hi. MS said a year ago:

      Microsoft will make its contributions available to the Hadoop open-source community for possible inclusion in the core Hadoop platform to insure compatibility in the Apache codebase and trunk.

      Surprising, but true... MJ
      Mary Jo Foley
      • So the questions will be ...

        1) Do they follow through?

        2) Does this only apply to the code changes they make to the core Hadoop components, and NOT apply to their Excel Hive Add-in, Interactive Console, etc.

        I suspect their business plan is exactly what it is for Linux - they'll develop code for the core products that enable use of their proprietary products, contribute that code to core (so that everyone can pay them for their ancillary products), and NOT contribute their ancillary products as open source.

        Not that that's wrong ... but it's a different level of "participating in open source" than actual open source proponents mean (as opposed to those that view open source as a necessary evil). It's different than e.g. a Red Hat, which uses the same business plan (proprietary value-add tools and services) ... but also contributes heartily to the core products, independent of whether the contribution ties directly to their own revenue stream.
    • Re: Extinguish

      They only ever succeeded with that third stage against other proprietary companies. Microsoft has never been able to extinguish, or even weaken, a single Free Software project. Not to say it hasn't tried...
  • So now OpenSource is ok?

    I thought Microsoft said OpenSource was the plague which will destroy the world.

    Despicable, shameless behaviour.
    • It depends on your definition of "open source"

      You won't find too many good opinions of the GPL at Microsoft. They are ok (more or less) with other licenses (like Apache or the various MS-* licenses). The problem with the GPL is that it isn't really "open" open source; Stallman has made sure that it comes with some rather onerous conditions (if your business is writing commercial software).
      • Re: The problem with the GPL is that it isn't really "open" open source

        Of course it's open--no strings attached, guaranteed. And there are lots of successful commercial businesses built on top of GPL'd as well as other Free Software. Just look at the Who's Who list of major contributors to the Linux kernel, for a start.
  • Microsoft Hadoop

    Has Microsoft agreed "publicly" to abide by the Apache Hadoop copyright license (as different from the usual copyright subversion), and will they "fork" Hadoop to the degree that Microsoft Hadoop services for Azure and Windows Server are incompatible with Big Data services from the more "standards based" UNIX/Linux implementation of Hadoop?
    • Re: and will they "fork"

      If they do, they will find themselves immediately losing the benefit of interoperability with the rest of the community.

      There's a huge difference in cost between developing proprietary versus Free software: the latter benefits greatly from the economies of scale you can leverage off the work of others (who in turn benefit from your work, of course). Going it alone is simply unaffordable these days, otherwise Microsoft would already be pushing its own proprietary alternative to Hadoop.
  • no hbase?

    Maybe that's a given - but it would be an interesting omission if it was left out
  • just got it

    90 day free Trial . Yes to most of what you are asking. I will get back to you . ASP.NET, PHP or Node.js with FTP, Git or TFS . Hadoop yes
  • Integration with Windows Azure

    This is really great news for Hadoop users even if the implementation is in private preview. In my opinion, the cloud-based platform of Windows Azure provides best hosted services and really flexible storage option to users around the globe.

    On a side note, I went through the new integration options with Windows Azure and got to know about integration with GroupDocs Document Management Solution. It allows you to set up a new storage provision to store your files. You can also refer to the URL below to know more about this integration:

    Looking forward to your review of this new Windows Integration, MJ!