Like many platform operators, Amazon has a love-hate relationship with those hosted on its platform. This is particularly true for open-source software creators, who see their products on offer on Amazon's cloud on terms they are not happy with.
It's a complicated relationship, which touches upon many aspects of technology, law, and social norms. The issue started becoming more pronounced and entering our turf on Big on Data, as Amazon Web Services (AWS) started offering top open-source data management products on its platform.
Vendors developing those open-source products started accusing AWS of strip mining, i.e., reaping the benefits of the products without contributing back to their development. Google stepped in to show there is another way of doing business with open source in the cloud. The New York Times stepped in and made this the talk of the town.
This is huge, as much as it is complicated. Let's try to unpack it.
AWS is the technology branch of Amazon. AWS pioneered the cloud in the mid-2000s, based on the premise of renting out spare compute and data capacity from Amazon's servers. By now, the cloud is the dominant way of storing data and doing compute.
AWS is not just the main source of Amazon's operating income at $40 billion in 2019. Amazon CEO Jeff Bezos describes Amazon as a technology company. The technological prowess AWS brings to Amazon is at the heart of Amazon's expansion to the aftermarket. AWS powers countless products and services every day. When AWS hiccups, the world notices.
Like AWS, open-source software today is ubiquitous. Open source predates Amazon, as much as it enables it. Open source such as the Linux operating system is what powers most data centers today, including AWS. Cloud, data, and open-source converge.
The future of databases is the cloud. But the future of databases is also something else: It is open source. According to Gartner, by 2022, more than 70% of new in-house applications will be developed on an open-source database, and 50% of existing proprietary relational database instances will have been converted or be in the process of converting.
Historically, as open-source software (OSS) started becoming more complex and successful, different ways to monetize it started emerging, too. This makes sense: Software takes time and expertise to develop. While some people may develop OSS as a side gig, complex software requires dedication. Dedication requires financial support, hence, monetization.
In some cases, OSS projects are collaborations cross-cutting many organizations. Typically, this happens when many organizations are interested in developing something. Open source fosters collaboration and ensures the result can be used by everyone. Contributors are paid by organizations that employ them, and monetization is not an issue.
But what happens when a project gets a life of its own, and/or some contributors wish to go their own way and monetize the project? Offering services around the project is one way. Many SMBs and freelancers do this -- not necessarily the ones who built the software. The fact that OSS is free to use makes skills built around it transferable, which is a big advantage.
OSS builders enable technologist experts who use the software to monetize their services. This can be a symbiotic relationship, as technologists may help expand the software's user base and contribute knowledge in the form of documentation or question answering or even some code. Often, OSS builders and expert technologists using the software largely overlap.
But the more complex the software, the bigger the organization needed to build it. Service-based business model margins are relatively poor and do not scale well. Notoriously, only Red Hat has succeeded in building a sustainable business around complex OSS on a service-based business model. Even though Red Hat's scale is tremendous, eventually even Red Hat chose to accept IBM's acquisition offer.
But what if you want to build complex OSS, such as a database, and you don't have Red Hat's scale? First off, let's answer two very important questions: Why would anyone want to do this, and why would the rest of the world care?
OSS lowers the barrier to adoption. Oftentimes, OSS almost sells itself. Engineers adopt it, as they have access to it and start building on it. If they are happy using the software and/or have gone far enough using it on a project, a commercial license may be bought.
Many open-source products come in different versions: an open-source, entry-level version, and one or more commercial versions with a hardened codebase, advanced features, and potentially support, too. This strategy is called open core, and it is something many OSS vendors apply.
Adoption is one reason why open source is an attractive choice for building software. Other reasons are quality and ethos. Open-source ethos motivates people to develop and contribute, thus also lowering the barrier on those fronts, helping build communities around the software. By having more contributors and less friction, OSS can result in higher quality.
The OSS ethos is about collaboration and sharing, which brings us to the "why should the rest of the world care" part. As Marko Rodriguez put it in a recent talk: "By making our software freely available, many institutions that would otherwise not pay for cutting-edge software have been able to advance their domain. We did something very good for this world."
Rodriguez, the founder of mm-ADT, has seen this from the inside. Around 2010, Rodriguez started Aurelius, the company behind two successful OSS graph database technologies: Apache TinkerPop and Titan. Aurelius, however, was a small company punching above its weight. So, when an acquisition offer from DataStax came along, Rodriguez took it and joined DataStax.
Doing so solved some problems and created some others. DataStax is itself an OSS vendor, built to offer an enterprise version of the Cassandra OSS database. DataStax had its issues with the Cassandra community. Interestingly, Cassandra became the last link in the chain of top OSS databases to be offered as a service on AWS.
This brought up, once again, the question of measuring contribution and sharing value around OSS. To be clear: This never was a part of OSS, and it still isn't. AWS is perfectly within its rights to take software it has not contributed to, run it as a service (SaaS), and profit from it. Of course, the rest of the world does not have to like this, and this is where it gets messy.
AWS is not the only one to do this. Many smaller players do something similar -- take OSS they did not build and offer it as a service. Databases are as complex as they are critical, and keeping them up and running is hard. This is why outsourcing management is something many organizations are willing to do.
AWS, however, is different. AWS also runs its database product line and has a size no other player has, as its own executives boast. AWS controls the market and has products in the market. Coincidentally, critics argue that Amazon also pays zero tax, but the reality as documented by a Wall Street Journal report highlights the nuances. Nevertheless, the Amazon tax narrative leaves AWS open to allegations of unfair competition, predatory pricing, and market foreclosure.
Commercial OSS is a business reality today. There is an event for it, too, called the Open Core Summit (OCS). OCS brings together cloud vendors, OSS vendors, venture capitalists, and developers. OCS is organized by Joseph Jacks, who also runs OSS Capital, a VC for OSS companies.
Recently, Jacks elaborated on the dynamics of the OSS ecosystem, focusing on the value it creates synergistically. When discussing OSS and cloud trends, Jacks predicted, among others, that cloud providers will uniformly embrace OSS as a core strategy. AWS was specifically called out for not acknowledging this.
Amazon is the elephant in the room, which is why its partnerships play on a love-hate tone, as Jacks also pointed out. For any software product, not being on AWS severely hurts its marketability: Most customers are on AWS. And having AWS move into a market is good for business, as it brings exposure; except when AWS does so using your own product.
AWS' response to the strip mining allegations, via its VP of analytics, Andi Gutmans, is that they are "silly and off-base." AWS claims that it does contribute code to the OSS products it offers as a service. AWS also claims that it contributes in other ways, by helping the community.
As far as code goes, the counter-argument is that AWS' contribution is insignificant in comparison to the value it extracts, as it is centered around self-serving, peripheral issues. The closest thing we have to analysis on this is included in Rodriguez's presentation.
The analysis was done on publicly available GitHub OSS data by Russell Jurney, founder of Data Syndrome. The analysis includes top OSS products offered by AWS, and there are two main takeaways: In all of these projects a few core contributors are key to the codebase, and AWS is not among them.
The closest thing we have to a response to this comes from Matthew Wilson, AWS VP/distinguished engineer and self-proclaimed "OSS romantic." When asked to comment on that data in a recent Twitter thread, Wilson initially cited the fact that the data has been published by Rodriguez.
Wilson wrote he "avoids responding... due to [Rodriguez's] habit of engaging in personal character attacks." Confronted with the fact that the data is not Rodriguez's, Wilson claimed that AWS contributes in other ways beyond code. Asked to show data on that or, lacking data, to propose how this contribution could be evaluated, Wilson did not provide a concrete answer.
The Apache Software Foundation famously favors community over code contributions. But how do you measure that? At this time, we do not have an answer. Until we do, based on what we can measure, it looks like AWS is more of a taker than a maker when it comes to OSS.
AWS is perfectly within its rights to do this. Looking at this bluntly, we are talking about commercial entities arguing with each other over splits, with OSS in the middle. It's true that OSS vendors are funded by venture capital and worried about their market share. It's true that no OSS vendor has been "fully displaced" because of this -- yet. And it's true that OSS is not a business model -- yet.
In other words, maybe OSS vendors should simply have known better than to expect this would not happen. To a large extent, yes, and they seem to be realizing it. Or, to quote Rodriguez: "The OSS summer of love is over." It's not fair, but who said the world, or OSS, is fair? The question is: Is there anything to be done about this, and why should the world care?
OSS is not about being fair. It does not in any way include notions of contribution, fair reward, or fair use. OSS is about removing barriers to collaboration and adoption. There are different OSS licenses, ranging from "do as you please, no questions asked" to "if you modify this, you must share modifications, and/or your codebase must be OSS, too."
But none of these licenses imposes any restrictions on where/in what way OSS is used. Some OSS vendors have tried changing their licenses to do that and were met with resistance by the OSI. The OSI has standardized a limited set of OSS licenses. Over time, this has helped ensure that legal departments know what they are dealing with, and users know their options.
So, if an OSS license with restrictions on use is not an option, what is?
Open core is one option. It seems to be working reasonably well, with the notable exception of Elastic. AWS has re-implemented non-OSS parts of Elastic, made them available as OSS, and added them to its SaaS offering. AWS is within its rights to do this, but Elastic has filed suit against AWS for trademark infringement. Elastic has also sued others for infringing on its code. AWS continues to develop its version of Elastic. [Correction: A previous version of this article noted that Elastic was suing for copyright infringement.]
Reverting to closed source is another option. Jonathan Ellis, DataStax CTO and co-founder, commented on lessons learned from the Cassandra / DataStax journey in a recent keynote. Ellis said that it looks like a free tier, which is not OSS, is probably a better option for organizations that want to innovate today. There are examples of organizations successfully doing this.
But there may be another option. And this brings us to the "why should we care" part. By effectively imposing an AWS tax on OSS, AWS is pushing back OSS adoption. This is hurting the software ecosystem and society at large. This is not unprecedented. OSS is part of the Commons: A common pool of resources, from which many actors may draw. AWS disputes the notion of a tax on OSS and argues that players such as MongoDB have benefited and performed well. However, the economics of open-source software are frequently debated.
The Commons have a long history and have been researched by Nobel laureate Elinor Ostrom, among others. What AWS is doing seems to be taken from a page of the Commons playbook: Engulfing. Research shows that to have sustainable commons, safeguards against engulfing need to be in place. The way OSS works, this is not the case.
The multi-billion-dollar question is how this can be done. More on that in part two of this article, where people from the OSS community weigh in with proposals.
Updated February 18 2020 at 3:42 pm EST to reflect Amazon does pay taxes.