Machine learning (ML) is exciting technology, but it can be hard for non-specialists to take advantage of it. Microsoft has a lot of irons in the ML fire: the pre-trained, all-purpose ML models that are part of Azure Cognitive Services; the developer- and data scientist-friendly Azure Databricks; and the all-purpose, operations-oriented Azure Machine Learning (Azure ML). But Microsoft has needed something that brings these disparate components together and makes them more broadly accessible.
Today, Microsoft is announcing new features in Power BI that do just that, enabling the same business analysts who can use Power BI for self-service analytics to integrate machine learning models built by their own data scientists or by Microsoft, and even create their own.
These Power BI features are launching today as a private preview. But Arun Ulagaratchagan, Microsoft's general manager for Power BI engineering, and his team were kind enough to provide me with a very detailed demo, so I can attest to the product being real and not "ether."
At a high level, the story is pretty simple. Microsoft is introducing four new AI-related features in Power BI:
- Integration of Azure Cognitive Services
- Integration of ML models hosted in Azure Machine Learning, including those built in Azure Databricks
- The ability to create, and then use, ML models using Azure Automated ML (AutoML)
- A new Key Driver Analysis visualization that reveals which columns and values drive specific outcomes (values) for data columns serving as measures or Key Performance Indicators (KPIs)
That's the TL;DR. Read on for coverage of each of these four features. At the end of this post, I'll sum up with a few observations.
Access to Cognitive Services
The integrations of Azure Cognitive Services and of Azure ML-hosted models are both launched from Power BI's recently announced Data Flows feature, which is essentially a cloud-hosted implementation of the Power Query self-service data prep facility that's been available in Power BI Desktop (not to mention Excel) for some time. The key to gaining access to the AI features is to click a new "AI Insights" toolbar button in the Data Flows user interface.
From there, users can select whether they want to use an Azure Cognitive Services model or an Azure ML-hosted model created and shared with the Power BI user by a data scientist. In neither case does the Power BI user need any provisioned Azure services, tenants, or even an Azure subscription.
If the user picks the Azure Cognitive Services option, she can then further select whether to perform language detection, image detection, key phrase extraction or sentiment scoring. The team assures me that more Azure Cognitive Services options will be on-boarded and these four services are just the initial ones on offer.
After picking a service, the user then needs to wire up which columns in the data set map to the input parameters for the Cognitive Services model and then click an "Invoke" button. From there, the predicted model output for each row in the data set will appear in a new calculated column, added at the end.
Advanced users will be interested to know that, as with any calculated column, the contents of these special columns are just formulas built in the M programming language used by Power Query. This suggests the invocation of Cognitive Services in Power BI can be scripted, rather than being triggered exclusively through the UI.
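To make the "wire up columns, invoke, get a new column" pattern concrete, here is a minimal Python sketch of the idea. It is not the M code Power BI generates and it does not call Cognitive Services: `invoke_model`, `toy_sentiment` and the sample rows are all hypothetical names invented for illustration, and the scorer is a crude word-count stand-in for a real sentiment model.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def invoke_model(
    rows: List[Row],
    model: Callable[..., object],
    column_map: Dict[str, str],
    output_column: str,
) -> List[Row]:
    """Call `model` once per row and append its result as a new last column.

    `column_map` maps each model input parameter name to a table column name.
    """
    result = []
    for row in rows:
        kwargs = {param: row[column] for param, column in column_map.items()}
        new_row = dict(row)  # copy, so the source table is left untouched
        new_row[output_column] = model(**kwargs)
        result.append(new_row)
    return result


def toy_sentiment(text: str) -> float:
    """Stand-in scorer (assumption): share of positive words among scored words."""
    positive = {"great", "clean", "friendly"}
    negative = {"dirty", "rude", "noisy"}
    words = text.lower().split()
    pos = sum(word in positive for word in words)
    neg = sum(word in negative for word in words)
    if pos + neg == 0:
        return 0.5  # neutral when no known words appear
    return pos / (pos + neg)


reviews = [
    {"hotel": "A", "review": "great location and friendly staff"},
    {"hotel": "B", "review": "dirty room and rude service"},
]

# Map the model's "text" parameter to the table's "review" column.
scored = invoke_model(reviews, toy_sentiment, {"text": "review"}, "sentiment")
for row in scored:
    print(row)
```

The point of the sketch is the shape of the operation: the model sees only the mapped columns, and its output lands in one extra column at the end of each row.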
The demo I was given involved a data set of hotel customer reviews. Cognitive Services models were used to provide a sentiment score on the review text, extract key phrases from it (which were then visualized in a word cloud custom visualization), and extract and tag (caption) images from the reviews. All of this output was then easily visualized in a single-page Power BI report.
For Azure ML-hosted models, the experience is similar to that for Cognitive Services: select a model, wire up data set columns with ML model input parameters, click "Invoke" and get back a result. The main difference is that the resulting prediction comes back as a multi-column record that then needs to be expanded; luckily Power Query and Data Flows have just such an expansion function built right in.
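For readers who haven't met record expansion before, the operation can be sketched in a few lines of Python. This is an illustration of the concept only, not Power Query's implementation: the `expand_record_column` helper, the dotted naming of the new columns and the sample prediction fields are all assumptions made for the example.

```python
from typing import Dict, List, Optional

Row = Dict[str, object]


def expand_record_column(
    rows: List[Row], column: str, prefix: Optional[str] = None
) -> List[Row]:
    """Replace a column holding a record (dict) with one column per record field."""
    prefix = column if prefix is None else prefix
    expanded = []
    for row in rows:
        # Keep every column except the record column itself.
        new_row = {key: value for key, value in row.items() if key != column}
        # Add one flat column for each field inside the record.
        for field, value in row[column].items():
            new_row[f"{prefix}.{field}"] = value
        expanded.append(new_row)
    return expanded


rows = [
    {"customer": "c1", "prediction": {"label": "churn", "probability": 0.82}},
    {"customer": "c2", "prediction": {"label": "stay", "probability": 0.64}},
]

flat = expand_record_column(rows, "prediction")
for row in flat:
    print(row)
```

After expansion, each field of the prediction is an ordinary column that can be filtered, sliced and charted like any other.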
One other difference is the Power BI subscription level required for each of these features. At least for the private preview, a Power BI Premium subscription is required for the Cognitive Services integration. Access to Azure ML-hosted models (including those created in Azure Databricks) should just require a Power BI Professional subscription.
Build your own
The crown jewel in this set of new AI features is probably the ability to build a model of one's own, using Azure AutoML. Here's the recipe for getting it to work:
- In the Data Flow view in the Power BI cloud service, click on the "brain" icon for a specific flow, then click "Add a machine learning model" from the context menu
- Select the type of model desired (Binary Classification, General Classification, Regression or Forecasting, each of which is explained)
- Specify which column from the data set to use as the predicted column (the "label," in data science parlance)
- Review the columns already selected for you by AutoML to use as the input columns for the model (the "features," in data science parlance), overriding these selections if desired
- Name the model and select the values you wish to appear for each predicted classification
After these wizard-like steps are complete, Power BI (and AutoML) will then select the appropriate algorithm and accompanying parameter values for you -- all of which happens behind the scenes -- create and train the model, and add a calculated output column to your data set. As new data is added to the underlying table (which Data Flows can automate, through scheduled incremental refresh), new predicted values will be added to that column.
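What "selecting the algorithm and parameter values for you" means can be shown with a deliberately tiny Python sketch. It is not how Azure AutoML works internally, which I wasn't shown; it only illustrates the general pattern of fitting several candidates on training rows and keeping the one that scores best on held-out rows. The single-feature threshold "models", the function names and the hotel data are all invented for the example.

```python
from typing import Dict, List

Row = Dict[str, object]


def accuracy(feature: str, threshold: float, rows: List[Row], label: str) -> float:
    """Share of rows where the rule `feature >= threshold` matches the label."""
    hits = sum((row[feature] >= threshold) == row[label] for row in rows)
    return hits / len(rows)


def fit_threshold(feature: str, rows: List[Row], label: str) -> float:
    """'Train' one candidate: pick the cut-off with the best training accuracy."""
    candidates = sorted({row[feature] for row in rows})
    return max(candidates, key=lambda t: accuracy(feature, t, rows, label))


def auto_select(
    train: List[Row], validation: List[Row], label: str, features: List[str]
) -> Dict[str, object]:
    """Fit one candidate per feature, then keep the best on held-out rows."""
    best = None
    for feature in features:
        threshold = fit_threshold(feature, train, label)
        score = accuracy(feature, threshold, validation, label)
        if best is None or score > best["score"]:
            best = {"feature": feature, "threshold": threshold, "score": score}
    return best


train = [
    {"nights": 1, "spend": 120, "returns": False},
    {"nights": 2, "spend": 90, "returns": False},
    {"nights": 5, "spend": 100, "returns": True},
    {"nights": 7, "spend": 110, "returns": True},
]
validation = [
    {"nights": 3, "spend": 130, "returns": False},
    {"nights": 6, "spend": 95, "returns": True},
]

best = auto_select(train, validation, "returns", ["nights", "spend"])
print(best)
```

A real automated ML service searches over far richer algorithms and settings, but the user-facing contract is the same: supply a label and candidate input columns, and get back the winning model.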
Power BI will also provide a report that evaluates the model's accuracy. While this report is automatically generated, it's actually just a standard Power BI report consisting of a collection of visualizations and a slicer for confidence threshold. This nicely demonstrates the suitability of BI tools for ML model management, and my guess is that editing the report will help BI specialists learn a lot about determining ML model accuracy.
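To see why a confidence threshold slicer is useful in an accuracy report, consider this small Python sketch. It is my own illustration, not the calculation behind the generated report: the `evaluate_at_threshold` function and the scored sample are hypothetical. It computes standard precision and recall for a binary model at two different cut-offs.

```python
from typing import Dict, List, Tuple


def evaluate_at_threshold(
    scored: List[Tuple[float, bool]], threshold: float
) -> Dict[str, float]:
    """Count outcomes when rows scoring >= threshold are predicted positive."""
    tp = fp = fn = tn = 0
    for probability, actual in scored:
        predicted = probability >= threshold
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    # Guard against division by zero when nothing is predicted or nothing is positive.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall}


# (model confidence, actual outcome) pairs; values invented for the example.
scored = [
    (0.95, True), (0.80, True), (0.65, False),
    (0.55, True), (0.30, False), (0.10, False),
]

loose = evaluate_at_threshold(scored, 0.5)
strict = evaluate_at_threshold(scored, 0.9)
print(loose)
print(strict)
```

Raising the threshold trades recall for precision: the stricter cut-off makes fewer positive predictions, and fewer of them are wrong. Sliding a threshold control in a report lets an analyst explore exactly that trade-off.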
Key driver analysis
The last feature to discuss is Key Driver Analysis, which uses AI, but doesn't "feel" like AI. Instead, users simply drag a special visualization into the report, and configure its "Target" column and collection of "Explain by" columns in the Fields well in Power BI Desktop. Simply by doing this, a visualization appears which, in a "Key influencers" view, shows what values for particular "Explain by" columns impact the value of the "Target" column most significantly. An alternate "Top profiles" view does likewise for specific, statistically interesting combinations of "Explain by" column values.
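The flavor of a "Key influencers" ranking can be approximated with a simple ratio, sketched below in Python. To be clear, this is not the algorithm behind Microsoft's visualization, which wasn't described to me in detail; it is a naive stand-in that compares how often the target outcome occurs with and without each "Explain by" value. The `key_influencers` function and the sample rows are hypothetical.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]


def key_influencers(
    rows: List[Row], target: str, target_value: str, explain_by: List[str]
) -> List[Tuple[str, str, float]]:
    """Rank (column, value) pairs by how much they raise the target rate."""

    def rate(subset: List[Row]) -> float:
        return sum(row[target] == target_value for row in subset) / len(subset)

    results = []
    for column in explain_by:
        for value in sorted({row[column] for row in rows}):
            with_value = [row for row in rows if row[column] == value]
            without = [row for row in rows if row[column] != value]
            if not without or rate(without) == 0:
                continue  # the ratio is undefined without a comparison group
            results.append((column, value, rate(with_value) / rate(without)))
    # Highest ratio first: these values coincide most with the target outcome.
    return sorted(results, key=lambda item: item[2], reverse=True)


rows = [
    {"room": "suite", "breakfast": "yes", "rating": "high"},
    {"room": "suite", "breakfast": "no", "rating": "high"},
    {"room": "suite", "breakfast": "yes", "rating": "high"},
    {"room": "standard", "breakfast": "yes", "rating": "low"},
    {"room": "standard", "breakfast": "no", "rating": "low"},
    {"room": "standard", "breakfast": "no", "rating": "high"},
]

influencers = key_influencers(rows, "rating", "high", ["room", "breakfast"])
for column, value, ratio in influencers:
    print(f"{column}={value}: {ratio:.2f}x")
```

In this toy data, staying in a suite goes with a high rating about three times as often as not staying in one, while breakfast makes no difference. A ratio like this shows association only; it says nothing about cause.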
Microsoft has done some very valuable work here. To begin with, the Power BI team has integrated a bunch of disparate Azure services and made them turn-key, without the need for code or an Azure subscription. The team has also leveraged the power of AutoML and taken it the last mile to become a truly self-service offering. All of that is huge.
But what the team has also done is to fit all of this AI technology into the context of BI. The features are invoked from a data prep tool (or, for Key Driver Analysis, a special visualization). Everything on the input side is really just columns from a table; everything on the output side is just a calculated column in that same table, using the standard expression language for such columns. Model management is implemented in a standard report, and predictions are visualized in the same way other insights are.
This means everything that's already in Power BI can be brought to bear. For example, a bar chart showing sentiment score by brand could be created using Power BI's Q&A natural language interface (which, in the Power BI mobile application, can be voice driven). Fancy joins and filtering of data in a data flow can be used to build a model on the most relevant rows and columns. Standard slicers can be applied to the Key Driver Analysis output and any model output, as well.
In other words, Power BI has conformed many Azure AI-related services to the BI paradigm and made them accessible to people with BI skill sets. The failure of the industry at large to do much similar work is a big part of what has, thus far, held AI back from broader adoption, deployment and reasonable monetization. These new Power BI features set a new, and welcome, precedent.