IBM Research on Wednesday unveiled CodeFlare, a new framework for integrating and scaling big data and AI workflows in a hybrid cloud environment. The open-source framework aims to help developers cut the time they spend creating pipelines to train and optimize machine learning models.
CodeFlare is built on Ray, an open-source distributed computing technology from UC Berkeley, and adds specific elements that make it easier to scale workflows. Using a Python-based interface for pipelines, CodeFlare makes it easier to integrate, parallelize and share data. This helps unify pipeline workflows across multiple platforms without requiring data scientists to learn a new workflow language.
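To illustrate what parallelizing a pipeline from Python can look like, here is a minimal sketch that uses only the standard library. It is not CodeFlare's or Ray's API: the thread pool stands in for a distributed scheduler, and the function names (`scale`, `score`, `run_pipeline`, `run_all`) are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def scale(rows, factor):
    # Hypothetical preprocessing step: multiply every value by a factor.
    return [x * factor for x in rows]


def score(rows):
    # Hypothetical evaluation step: the mean of the transformed values.
    return sum(rows) / len(rows)


def run_pipeline(rows, factor):
    # One pipeline variant: preprocess, then evaluate.
    return factor, score(scale(rows, factor))


def run_all(rows, factors):
    # Run every variant concurrently. A thread pool stands in here for
    # the distributed scheduling a framework such as Ray would provide.
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(run_pipeline, rows, f) for f in factors]
        return dict(f.result() for f in futures)


results = run_all([1.0, 2.0, 3.0], [1, 10, 100])
print(results)
```

Each variant is an ordinary Python function, which is the appeal of a Python-based interface: the pipeline is described in the language data scientists already use rather than in a separate workflow language.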
While IBM says CodeFlare pipelines run easily on its new serverless platform IBM Cloud Code Engine and on Red Hat OpenShift, developers can deploy the framework just about anywhere. CodeFlare also helps developers integrate and bridge pipelines with other cloud-native ecosystems by providing adapters to event triggers (such as the arrival of a new file). It also provides support for loading and partitioning data from a variety of sources, including cloud object storage, data lakes and distributed filesystems.
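The trigger-then-partition pattern described above can be sketched in plain Python. This is an illustration under stated assumptions, not CodeFlare's adapter API: `partition` and `on_new_file` are hypothetical names, and the "event" is simply a function call that receives the contents of a newly arrived file.

```python
def partition(records, n_parts):
    """Split records into n_parts near-equal contiguous chunks."""
    if n_parts <= 0:
        raise ValueError("n_parts must be positive")
    size, extra = divmod(len(records), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        # The first `extra` chunks take one additional record each.
        end = start + size + (1 if i < extra else 0)
        parts.append(records[start:end])
        start = end
    return parts


def on_new_file(lines, n_parts=3):
    # Hypothetical event handler: called when a new file arrives.
    # It partitions the file's lines and reports each chunk's size,
    # where a real pipeline would hand each chunk to a worker.
    return [len(chunk) for chunk in partition(lines, n_parts)]


print(partition(list(range(10)), 3))
print(on_new_file(["row"] * 7))
```

In a real deployment the chunks would come from cloud object storage or a distributed filesystem rather than an in-memory list, and each one would be dispatched to a separate worker.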
CodeFlare is available on GitHub, and IBM is sharing examples that run on IBM Cloud and Red Hat Operate First. Developers already using CodeFlare, IBM said, have shaved months off their work.