Alibaba Blinks: Building an open source, data-driven cloud empire in real-time
Acquiring data Artisans, the vendor leading development of open source Apache Flink framework for real-time data processing, is the latest move from Alibaba. Where does this fit in Alibaba's strategy to grow its cloud?
Open source businesses come and go, but clouds are here to stay. That's one lesson 2018 has offered, and part of the reason why vendors trading in open source/open core software are adjusting their strategy. Last year saw a number of these vendors change their licensing, adding clauses meant to restrict cloud vendors from "strip mining": taking open source platforms and offering them as managed services.
In the case of Alibaba and data Artisans, the conflict ended before it began: Alibaba just acquired data Artisans for a total of €90 million. Data Artisans is the vendor leading development of the open source Apache Flink framework for real-time data processing, and it employs a large share of Flink's core committers.
Alibaba is well on its way to becoming a major cloud vendor, too. On a global scale, that is, because it already is one back home in China. Alibaba is often thought of as the Chinese Amazon, but this is only partially true. Alibaba, like Amazon, started out from retail, in which it is the dominant player in the Chinese market. Alibaba functions as a platform on which retailers can sell and manage aspects such as logistics.
When it comes to cloud services, however, Alibaba wants to differentiate itself from AWS by offering a value-add proposition instead of trying to play catch-up. The computational infrastructure needed to deliver platform services to clients is also used to offer them domain-specific solutions tailored to their needs. This is in stark contrast to AWS, which offers infrastructure and tools and lets clients build their own applications.
Alibaba's strategy is built around creating an ecosystem, and Min highlighted this when discussing Alibaba's offering compared to specialized domain solutions, focusing on data science: "We can support clients going into uncharted territory. Our Brains can support you, and you will not be fighting by yourself -- you'll have an army of data scientists on your side."
Brains is the name Alibaba uses for its AI-powered domain-specific solutions, and "an army" is literal in this case: Alibaba has ~50,000 employees, 20,000 of whom are technical. Min is the leader of a cross-functional team of 300 people: 50 data scientists, 200 data engineers, and 50 business experts. Min said they have managed to recruit people from places like Japan, Europe, and the US.
Supporting open source actually makes lots of sense as a piece in Alibaba's strategy. Open source represents infrastructure for data and AI-driven solutions. The key to making such solutions work is data and expertise, and Alibaba does not seem to be in short supply of those. Alibaba is not in the business of selling managed services either, so why would they not want to invest in open source when they have no reason to compete with it?
This, and the need for scale, can explain Alibaba's special relationship with Apache Flink and data Artisans, leading to the acquisition. Min explained that Alibaba's infrastructure was based on a Lambda architecture, i.e., one that has two lines of data processing, one for batch and one for real-time. Flink enables these to be collapsed into a single line (Kappa architecture), saving resources and delivering faster insights in the process.
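To make the difference concrete, here is a minimal sketch in plain Python. It is not Flink code and not Alibaba's implementation; the event shape, the function names, and the running-total workload are all assumptions chosen for illustration. The Lambda version maintains two code paths (batch and real-time) whose results are merged at query time, while the Kappa version pushes every event, historical or fresh, through one streaming path.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical event shape: (product key, quantity sold).
Event = Tuple[str, int]


def batch_view(history: Iterable[Event]) -> Counter:
    """Lambda batch line: recompute totals over the stored history."""
    totals: Counter = Counter()
    for key, amount in history:
        totals[key] += amount
    return totals


def speed_view(recent: Iterable[Event]) -> Counter:
    """Lambda real-time line: a second code path for events not yet batched."""
    totals: Counter = Counter()
    for key, amount in recent:
        totals[key] += amount
    return totals


def lambda_query(history: Iterable[Event], recent: Iterable[Event]) -> Counter:
    """Serving step: merge the batch and real-time views at query time."""
    merged = Counter(batch_view(history))
    merged.update(speed_view(recent))
    return merged


class StreamJob:
    """Kappa: one streaming code path that keeps running state."""

    def __init__(self) -> None:
        self.totals: Counter = Counter()

    def process(self, event: Event) -> None:
        key, amount = event
        self.totals[key] += amount


def kappa_query(log: Iterable[Event]) -> Counter:
    """Replay the whole event log through the single streaming job."""
    job = StreamJob()
    for event in log:
        job.process(event)
    return job.totals


history: List[Event] = [("shoes", 2), ("books", 1)]
recent: List[Event] = [("shoes", 3)]

lambda_result = lambda_query(history, recent)
kappa_result = kappa_query(history + recent)
```

Both approaches arrive at the same totals here, but the Lambda version has two implementations of the same logic to build, run, and keep in agreement, which is where the resource savings of a single line come from.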
Alibaba has long been involved with Flink, having developed its own extensions, called Blink, to meet its requirements. Alibaba has also been a data Artisans client, as it needed the expertise and support that data Artisans has to offer, as well as its hardened enterprise version, which includes features such as patent-pending technology for serializable transactions.
At Alibaba's scale, leveraging Flink can translate to substantial savings and competitive advantage. Instead of relying on an external entity for what is strategic software infrastructure, why not bring this in-house?
This deal may mean that data Artisans can have its cake and eat it, too, gaining a healthy dose of cash while maintaining control. And Alibaba has committed to contribute Blink to core Flink. We would not be surprised, however, to see data Artisans push a Commons Clause for Flink in the near future as well. Other cloud providers are now direct competition, after all.
Another thing to keep an eye on is how this will affect the evolution of Apache Beam. Beam is the closest thing to a standard in the data streaming world, enabling streaming workloads to be ported across different processing frameworks. Beam was initiated by Google for its cloud, and gained support from Flink and Samza, but not Spark. With Alibaba now behind Flink, this means de facto support from another major cloud vendor.