Brainome, a new player in the machine learning space, is today launching Daimensions, a product that the company says helps customers take a "measure before build" approach to machine learning (ML) model development. The product, aimed at data scientists, helps them analyze training data and manage data volume, and understand the downstream effects of both on training time, model size and performance.
ZDNet spoke with Brainome's co-founders, Bertrand Irissou (CEO) and Gerald Friedland (CTO). The two provided a careful, thoughtful and thorough explanation of how the company's approach to ML differs from others.
From trial-and-error to measure-before-build
Brainome's take on ML is that much of the common model experimentation process can be optimized. Trial-and-error can be largely avoided by specifying the model's qualities and then building it, rather than following the standard experimentation approach of building several candidate models, then seeing which performs best. The company calls this a measure-before-build approach and analogizes it to the way bridges are built. Specifically, Irissou and Friedland say, civil engineers would never take an approach of building 100 bridges and then picking the best one. Instead, they measure and develop a spec for the bridge, and only then design and build it.
Brainome defines what are essentially key performance indicators (KPIs) for a model as a way to arrive at such a spec in the ML world. The KPIs are generalization, capacity progression, risk of overfit and memory equivalent capacity. Together these KPIs can profile the complexity of the training data, and the number of parameters needed to model its patterns. Fewer parameters allow the model to be based more on rules than on memorized facts, which avoids overfitting of data; it also means less data is needed to generate an accurate model, which can lead to faster training times.
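The contrast between rules and memorized facts can be made concrete with a toy example. The sketch below is only an illustration under stated assumptions: the data, the function names and the threshold rule are all hypothetical, and it does not compute any of Brainome's KPIs. It compares a lookup table that stores one entry per training row with a rule that has a single parameter.

```python
# Hypothetical toy data: the label is 1 when the value is at least 50.
train = [(12, 0), (35, 0), (48, 0), (51, 1), (67, 1), (90, 1)]
unseen = [(5, 0), (49, 0), (50, 1), (75, 1)]

# "Memorized facts": one stored entry per training row.
lookup = {x: y for x, y in train}


def predict_lookup(x):
    # Falls back to class 0 for any value it has not seen before.
    return lookup.get(x, 0)


# "Rule": a single parameter, a threshold placed midway between the classes.
largest_zero = max(x for x, y in train if y == 0)
smallest_one = min(x for x, y in train if y == 1)
threshold = (largest_zero + smallest_one) / 2


def predict_rule(x):
    return 1 if x >= threshold else 0


def accuracy(predict, rows):
    # Fraction of rows whose predicted label matches the true label.
    return sum(predict(x) == y for x, y in rows) / len(rows)


if __name__ == "__main__":
    for name, fn in [("lookup", predict_lookup), ("rule", predict_rule)]:
        print(name, accuracy(fn, train), accuracy(fn, unseen))
```

Both predictors score perfectly on the training rows, but only the one-parameter rule carries over to the unseen values; the lookup table has effectively memorized its inputs.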
Another part of the Daimensions credo is that maximizing the mathematical accuracy of a model, and incurring the cost of all the compute resources that requires, can easily cross a point of diminishing returns. Instead, model features should be ranked in importance prior to training. That prioritization helps simplify training and reduce model size. Meanwhile, understanding the data's complexity and optimizing the size of the training data set help shorten training. As a result, Brainome says it has seen orders-of-magnitude reductions in both training time and model size when building models on standard OpenML data sets.
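As a rough illustration of ranking features before any model is trained, the sketch below scores each column by its mutual information with the label. This is a common, generic technique swapped in for the purpose of the example; the article does not say how Daimensions ranks features, and the data set, column names and function names here are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information, in bits, between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def rank_features(rows, labels, names):
    """Return (name, score) pairs, most informative feature first."""
    scores = {
        name: mutual_information([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical data: "signal" copies the label, "partial" agrees with it
# on six of eight rows, and "noise" is independent of it.
names = ["signal", "partial", "noise"]
rows = [
    (0, 0, 0), (0, 0, 1), (0, 0, 0), (0, 1, 1),
    (1, 1, 0), (1, 1, 1), (1, 1, 0), (1, 0, 1),
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

ranking = rank_features(rows, labels, names)

if __name__ == "__main__":
    for name, score in ranking:
        print(f"{name}: {score:.3f} bits")
```

A ranking like this lets low-scoring columns be dropped before training starts, which is one way the smaller inputs and smaller models described above could come about.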
Compact models, compact company
Daimensions models are produced using Brainome's Table Compiler (BTC) technology. The output is a single executable Python script that includes the data prep code and the model function itself. Because the models are shipped as code, they can be integrated into standard CI/CD (continuous integration/continuous delivery) pipelines. The company says this allows customers' DevOps practices and infrastructure to serve as the deployment component of MLOps as well. It also says the small models can avoid the need for, and expense of, GPUs (graphics processing units) and can even be deployed to small edge devices or as cloud microservices.
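To picture what a model shipped as one Python script might look like, here is a minimal sketch. It is not actual BTC output: the constants, the linear scoring function and every name in it (prepare, predict, predict_csv) are assumptions made for illustration. The point is only that data prep and the model function can live in one dependency-free file.

```python
import csv
import io
import sys

# Hypothetical constants that a generator might bake into the script.
FEATURE_MEANS = [40.0, 3.0]
WEIGHTS = [0.05, -0.4]
BIAS = 0.1


def prepare(raw_row):
    """Data prep: parse strings to floats, filling blanks with the stored mean."""
    return [float(v) if v.strip() else m for v, m in zip(raw_row, FEATURE_MEANS)]


def predict(raw_row):
    """Model function: a tiny linear classifier over mean-centered features."""
    features = prepare(raw_row)
    score = BIAS + sum(
        w * (x - m) for w, x, m in zip(WEIGHTS, features, FEATURE_MEANS)
    )
    return 1 if score > 0 else 0


def predict_csv(text):
    """Classify every data row of a CSV document that has a header line."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    return [predict(row) for row in reader]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Run as a script: the first argument is the path to a CSV file.
    with open(sys.argv[1], newline="") as handle:
        for label in predict_csv(handle.read()):
            print(label)
```

Because a file like this uses only the standard library, a CI job can import it and assert on a few known rows, just as it would test any other module, before the same file is deployed to an edge device or wrapped in a microservice.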
Brainome is a one-year-old company with 11 employees that has so far received about $1.5M in angel funding. Though small, its industry-horizontal approach means it has been able to run pilot projects with organizations in HealthTech, FinTech, AdTech and genomics research. Hopefully, the company will have a real impact on moving data science work away from the guesswork-laden processes that have defined it to date. Given the efficiencies and, indeed, intelligence that machine learning models are designed to engender, it's good to see both attributes applied directly to creating those models as well.