This past fall, Microsoft announced plans to create Hadoop distributions for Windows Azure and Windows Server. And just last week, Microsoft opened up the Community Technology Preview for Hadoop on Windows Azure.
But thanks to a new Channel 9 webcast (part of the December 13 "Learning Windows Azure" day-long show), we now know lots more about what the Softies are thinking regarding their Hadoop futures. Here's what I learned by watching this 11-minute video:
- Project Isotope is the codename Microsoft is using for the Apache Hadoop on Windows Azure and Windows Server work. (Thanks to a loyal reader for the tip-off to the codename.) But Isotope is more than the distributions that the Softies are building with Hortonworks. Isotope also refers to the whole "tool chain" of supporting big-data analytics offerings that Microsoft is packaging up around the distributions.
- Microsoft formed a team a year ago to begin work on Project Isotope. The General Manager, Project Founder and Technical Architect behind Isotope is Alexander Stojanovic. Dave Vronay, a leader in the Microsoft Advanced Technology Group in Beijing, also is part of the Isotope team. Isotope was born from Microsoft's work on cloud-scale analytics.
- Microsoft is planning to make Hadoop on Windows Azure generally available in March 2012.
- Microsoft is planning to make Hadoop on Windows Server (referred to by Stojanovic as the "enterprise" version of the project) generally available in June 2012. This version will include integration of the Hadoop Distributed File System with Active Directory, giving users global single sign-on for not just their e-mail, but also for analytics. (Now we know a bit more of the behind-the-scenes story on those hints by various Microsoft execs about Active Directory and the cloud.)
- After making the two Hadoop versions generally available, Microsoft is planning to release updates to them every three months (as service updates on Azure and as service packs for the on-premises version).
- The Isotope team also is working with the System Center team to deeply integrate System Center Operations Manager with Hadoop on Windows Server (and maybe also on Windows Azure -- I'm not clear on that part), giving users unified command and control capabilities across the two platforms.
The coming Hadoop distributions for Azure and Windows Server are not all that interesting in and of themselves. It's the tools and the data that make them potentially useful and lucrative. The Isotope team is working on enabling bidirectional integration between the core Hadoop Distributed File System (HDFS) and tools like Sqoop and Flume. (Sqoop provides integration between HDFS and relational data services; Flume provides access to lots of log data from sensor networks and such.)
Microsoft's big-picture concept is that Isotope will give all kinds of users, from technical to "ordinary" productivity workers, access from inside data-analysis tools they know -- like Microsoft's own SQL Server Analysis Services, PowerPivot and Excel on their PCs -- to data stored on Windows Server and/or Windows Azure. (The Windows Azure Marketplace fits in here, as this is the place where third-party providers can publish free or paid collections of data which users will be able to download/buy.)
With Isotope, "we've turned the OSS (open source software) model of open collaborative development and made it part of our internal ecosystem at Microsoft," Stojanovic said, while "ensuring we've safeguarded it from an internal IP perspective."
If you have 11 minutes and care about Microsoft's Hadoop/big data strategy, check out the Channel 9 video for yourself.