It's been a big few weeks for database and analytics news on the Microsoft Azure cloud. On June 27th, Microsoft had various announcements regarding Azure data storage and data integration. And, just this morning, Snowflake announced that its data warehouse service is now generally available on Azure.
Also read: Snowflake's cloud data warehouse comes to Microsoft Azure
Do the data shuffle
The news keeps coming. Today, at its Inspire partner conference in Las Vegas, Microsoft had more data-related announcements, for its own data warehouse service on Azure and for its Power BI service as well.
On the data warehouse side, Microsoft announced a new feature called Instant Data Movement for its Azure SQL Data Warehouse (SQL DW) service. Data warehouse platforms gain their speed by federating multiple servers ("nodes") together in one big logical server. As queries are run against this logical server, the data often needs to be moved around between nodes, and the speed with which a data warehouse does this is critical to its performance. SQL DW just made such operations much faster.
The product has used a feature called Data Movement Service (DMS) to handle this task, and it's always been pretty good, as it's based on a long-standing, efficient SQL Server technology called Bulk Copy Program (BCP). The problem with BCP, though, is that it executes on a single thread, on a single processor core, using SQL Server's single-row operations mode.
But users of SQL DW Gen2 can now take advantage of Instant Data Movement (IDM), which executes over multiple CPU cores and uses SQL Server's newer batch mode (based on vector processing). The result is much faster data movement, and therefore superior query performance when a query joins tables on columns other than the ones by which they are physically arranged. In fact, Microsoft says that when combined with its new Azure Accelerated Networking, SQL DW can move data at rates of up to 1 GB per second, per node.
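To make the data movement problem concrete, here is a minimal Python sketch of why rows have to travel between nodes when a join uses a column other than the one the table is arranged by. It is a toy model under stated assumptions, not a depiction of SQL DW internals: the four-node cluster, the byte-sum hash and the names `node_for`, `distribute` and `shuffle` are all hypothetical and chosen only for illustration.

```python
# Illustrative only: a toy model of data movement ("shuffle") in a
# distributed data warehouse. The node count, hash function and helper
# names are assumptions, not SQL DW internals.

NUM_NODES = 4


def node_for(value, num_nodes=NUM_NODES):
    """Map a column value to a node with a simple deterministic hash."""
    return sum(str(value).encode()) % num_nodes


def distribute(rows, column, num_nodes=NUM_NODES):
    """Place each row on the node chosen by hashing the given column."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[node_for(row[column], num_nodes)].append(row)
    return nodes


def shuffle(nodes, column):
    """Redistribute rows by a different column; count rows that change nodes."""
    num_nodes = len(nodes)
    new_nodes = [[] for _ in range(num_nodes)]
    moved = 0
    for current, rows in enumerate(nodes):
        for row in rows:
            target = node_for(row[column], num_nodes)
            if target != current:
                moved += 1
            new_nodes[target].append(row)
    return new_nodes, moved


orders = [
    {"order_id": i, "customer_id": i % 7, "product_id": i % 5}
    for i in range(100)
]

# The table is physically arranged by customer_id ...
by_customer = distribute(orders, "customer_id")
# ... but a query joins on product_id, so rows must be moved first.
by_product, moved = shuffle(by_customer, "product_id")
print(f"{moved} of {len(orders)} rows moved between nodes")
```

Every row that lands on a different node in the second layout represents network traffic, which is why the speed of this step weighs so heavily on join performance.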
Also read: Azure SQL Data Warehouse "Gen 2": Microsoft's shot across Amazon's bow
Between IDM and the performance improvements from SQL DW Gen2 storage and caching, Microsoft is getting very confident about this product's performance. Confident enough, in fact, to commission analyst firm GigaOm Research to run TPC-H benchmarks on SQL DW against Amazon Redshift, with seemingly very positive results.
Disclosure: I myself do analyst work for GigaOm Research. I was not involved in the SQL DW TPC-H benchmark project, though I was aware it was being carried out.
Microsoft's discussion of the TPC-H benchmark work is covered in a blog post, and the GigaOm report is available online as well.
Power (up) BI
On the Power BI side, Microsoft has enhanced the popular Business Intelligence service on both the cloud Big Data and Enterprise axes.
For the former, Microsoft has enhanced the Power Query self-service data preparation tool (which is also embedded in the Windows version of Excel) to process data stored in the Power BI cloud service rather than limiting its functionality to Power BI models stored on the desktop.
Depending on how Power Query's cloud capabilities are implemented, it could make for a very interesting accompaniment to Microsoft's Azure Data Factory service, whose major enhancements were part of the June 27th announcements. In addition, Power BI is being integrated with Azure Data Lake Storage Gen2 (also announced on June 27th and currently in preview). That service is an enhancement to Azure Blob Storage that eliminates file size restrictions and adds an access interface compatible with the Hadoop Distributed File System (HDFS), the canonical Big Data storage technology.
Power BI is based on Microsoft's long-standing SQL Server Analysis Services (SSAS) technology and, starting today, Power BI integrates a number of SSAS features. These include compatibility with XML for Analysis (XMLA), which is SSAS' native protocol. XMLA support brings with it compatibility with an array of tools built to work with SSAS and makes Power BI much more Enterprise-ready.
Power BI is also gaining integration with SQL Server Reporting Services (SSRS), Microsoft's Enterprise reporting technology. Now, in addition to Power BI reports and dashboards, the Power BI cloud service will be able to host and render SSRS reports.
This provides nice symmetry with Power BI Report Server, which is itself a superset of SSRS' on-premises report server and which allows on-prem delivery of Power BI reports alongside SSRS assets. Now, Microsoft customers will be able to commingle SSRS and Power BI reports in both on-prem and in-cloud environments.
In furtherance of its Enterprise prowess, Power BI will now provide support for the Microsoft Common Data Model (CDM) and is adding multi-geo compliance, which allows customers to deploy Power BI Premium (not Professional) to specific global regions. This facilitates compliance with data residency requirements and enhances data locality, which can cut data loading times.
Microsoft knows that technology is almost always enabled by data and analytics, and it is doubling down on its various offerings in that arena, especially in the cloud. Microsoft's challenge now is to convince the market that it can best Amazon Web Services in the data space. While it's on its way, Redmond has more work to do to win the market's hearts and minds in the data segment.