Microsoft has been serious about helping data scientists track and manage their machine learning experiments for some time now. For example, the company's Azure Machine Learning (Azure ML) cloud service has supported the logging of experiments, including iterative runs with varying algorithms, hyperparameter values, or both.
While Azure ML has had its own framework for such experiment monitoring and tracking, at last year's Spark+AI Summit, its partner Databricks launched the open source MLflow project for handling similar tasks. MLflow is designed to work from almost any environment, including the command line, notebooks and more, and its popularity has grown impressively over the last year, presumably as a result of that open orientation.
Connecting the dots
Microsoft and Databricks are close partners, and MLflow is natively supported in Azure Databricks. But today, at this year's Spark+AI Summit, the two companies are announcing that Microsoft will now be an active contributor to the MLflow project and will support it natively from Azure ML.
As chance would have it, I'm at the Visual Studio Live! conference in New Orleans this week, and I happen to be presenting on Azure Databricks today. As part of that presentation, I've been working on a demo of MLflow just this week, so this news is quite timely.
A little code will do ya
While many facets of machine learning can be quite complex and even a bit Rube Goldberg in nature, MLflow is refreshingly simple. Just by adding a few lines of code to the function or script that trains their model, data scientists can log parameters, metrics, artifacts (plots, miscellaneous files, etc.) and a deployable package of the ML model. Every time that function or script runs, the results are logged automatically as a byproduct of those added lines, even if the person doing the training run makes no special effort to record them.
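To give a sense of how few lines that is, here's a minimal sketch in Python. It assumes the mlflow package is installed and uses three calls from its tracking API: `mlflow.start_run`, `mlflow.log_param` and `mlflow.log_metric`. Everything else (the toy data set, the `train_and_log` function name and the one-weight ridge fit) is an illustrative assumption, not code from Databricks or Microsoft.

```python
import math


def train_and_log(alpha):
    """Fit a toy one-weight ridge model and record the run with MLflow."""
    # Assumption: mlflow is installed (pip install mlflow). It is imported
    # here so the module is only loaded when a training run actually happens.
    import mlflow

    # Hypothetical toy data, roughly y = 2x, standing in for a real data set.
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [2.1, 3.9, 6.2, 7.8, 10.1]

    with mlflow.start_run():
        # Log the hyperparameter for this run.
        mlflow.log_param("alpha", alpha)

        # "Training": closed-form ridge solution for a single weight w in y = w * x.
        weight = sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + alpha)

        # Evaluate on the training data and log the resulting metric.
        squared_errors = [(weight * x - y) ** 2 for x, y in zip(xs, ys)]
        rmse = math.sqrt(sum(squared_errors) / len(xs))
        mlflow.log_metric("rmse", rmse)

    return rmse
```

Calling `train_and_log(alpha=0.5)`, then again with a different `alpha`, should leave one logged run per call, which can be compared side by side by running `mlflow ui` from the same directory. Artifacts and models follow the same pattern, through `mlflow.log_artifact` and flavor-specific calls such as `mlflow.sklearn.log_model`.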
MLflow application programming interfaces (APIs) are available for the Python, R and Java programming languages, and MLflow sports a language-agnostic REST API as well. Databricks says the project has almost 500,000 monthly downloads, over 80 code contributors and 40 contributing organizations.
Now Microsoft will be an active contributor to the project, too. That should help standardize the DevOps of AI, across languages, clouds and machine learning frameworks. And, if you ask me, that standardization can't come soon enough.