Microsoft to develop Hadoop distributions for Windows Server and Azure

Microsoft is stepping up its support of Hadoop with new Windows Azure and Windows Server distributions in order to better support users' big data and unstructured data needs.
Written by Mary Jo Foley, Senior Contributing Editor

Microsoft is stepping up its support for Hadoop with its Windows Server and Windows Azure deliverables and will be offering its contributions back to the Apache Software Foundation and the Hadoop project.

By developing its own implementations of the Hadoop stack, Microsoft is looking to provide customers with another option for big data/unstructured data storage and access, officials said.

Microsoft officials made the announcement at the SQL PASS Summit on October 12. Company execs also confirmed at the event what I've been expecting for the past month or so: SQL Server "Denali" will be officially named SQL Server 2012 when it ships in the first part of 2012. (Server and Tools boss Satya Nadella said earlier this year to expect Denali to ship in the "early part" of next year.) Denali is currently at the Community Technology Preview (CTP) 3 stage; next stop is RTM and general availability (no beta is on the roadmap).

Microsoft is going to be working with Hadoop core contributors from Yahoo Hadoop spinoff Hortonworks. Microsoft and Hortonworks are readying a CTP test build of their Hadoop-based service for Windows Azure for delivery before the end of calendar 2011 and a CTP of the Hadoop-based distribution for Windows Server sometime in 2012. The new stacks will work with Microsoft's business-intelligence tools, including Excel, PowerPivot and PowerView (the new data-analysis technology that is part of SQL Server 2012 and was formerly known by its codename "Crescent").

Microsoft will make its contributions available to the Hadoop open-source community for possible inclusion in the core Hadoop platform to ensure compatibility in the Apache codebase and trunk. Microsoft officials are not making any commitments as to when a final version of the Hadoop distributions for Windows Server or Windows Azure will be available.

Microsoft earlier this year announced it was working on Hadoop Connectors for Microsoft's SQL Server and Parallel Data Warehouse offerings that would allow Hadoop integration with Microsoft's database platforms. Microsoft made the final version of these connectors available for download last week.

Microsoft is continuing to work on various alternatives to Java-based Hadoop and MapReduce and is "still committed" to these efforts, said Doug Leland, General Manager of Product Management in Microsoft's Business Platform Marketing Group. Microsoft's home-grown Hadoop competitor, aimed at .Net developers, is "Dryad" -- or as it is currently known, LINQ to HPC. Earlier this year, Microsoft Research also fielded a technology preview of "an iterative MapReduce runtime for Windows Azure," codenamed "Daytona," that is meant to support data analytics and machine-learning algorithms that can scale to hundreds of server cores for analyzing distributed data.

Microsoft officials also showed off at SQL PASS today a new technology codenamed "Data Explorer." Data Explorer will plug into Microsoft's Windows Azure Marketplace and allow developers to create richer data sets that can be published and made available for free or for a fee. A Data Explorer CTP will be available via SQL Azure Labs later this year.

Update: Here are a few more details about Microsoft's Hadoop announcement:

* Microsoft is working on "a simplified download, installation and configuration experience of several Hadoop related technologies, including HDFS, Hive, and Pig," which officials say "will help broaden the adoption of Hadoop in the enterprise."

* Microsoft is "investing in making Javascript a first class language for Big Data. We will do this by making it possible to write high performance Map/Reduce jobs using Javascript."
