Today, a new company called Ahana, focused on analytics using the open source Presto query engine, is emerging from stealth and announcing it has raised $2.25M in seed funding. GV (formerly Google Ventures) leads the seed round, along with participation from Leslie Ventures and other angel investors. Steven Mih and Dipti Borkar, who collaborated previously at both Alluxio and Couchbase, are the company's co-founders.
In a briefing last week, Mih and Borkar explained that Ahana will focus on Presto-based ad hoc analytics and will have a product in-market by the end of the year. The two also said they are prioritizing eliminating the complexities of implementing Presto, explaining that the catalog, buffer pool and other components it uses aren't part of the core project, and that this can lead to customer pain points.
Who's on first?
Interestingly, Ahana bills itself as "the first commercial company focused on PrestoDB." Before taking the briefing with Mih and Borkar, the claim struck me as odd. After all, Presto is a fairly pervasive technology (Amazon Web Services uses it for its Athena service, for example) and Starburst Data, established in 2017, is one commercial entity that's clearly focused on it.
The briefing allowed Ahana's founders to clarify the claim. Here's the rub: while I have thus far been talking about "Presto" as if it were a single open source project, that's not quite true. Unbeknownst to me until last week, there are not one but two Presto projects.
One Presto project, with which Ahana is aligned, has its Web site at prestodb.io; the other, towards which Starburst is oriented, has its site at prestosql.io. If you visit these sites, you'll see they look similar and use an identical Presto logo. In fact, you'll see the placement of that logo seems to be pixel-for-pixel identical between the two sites. You'll also see they link to separate GitHub repos, and that they operate under separate foundations.
Fork in the road
Presto, which was created at Facebook, got to this state because the Facebook engineers who created it in 2012 -- Martin Traverso, Dain Sundstrom and David Phillips -- left the company in 2018 and continued to focus on Presto's development. In December of that same year, the three established something called the Presto Software Foundation (PSF) around the technology, establishing their own repository for the code, and pushing the body of their work forward. Much of the Presto community hitched its wagons to the new foundation and the code in its GitHub repository.
Fair enough. But meanwhile, even if Traverso, Sundstrom and Phillips were no longer there, Presto was established at Facebook and is used by the company in its own work. As such, Facebook saw itself as custodian of the project and had a commercial vested interest in keeping it viable. Facebook maintained the original repo and eventually created a foundation of its own, the Presto Foundation (PF), now under the auspices of the Linux Foundation. In addition to Facebook, Uber was closely involved in the establishment of the PF and Uber's head of open source, Brian Hsieh, serves as head of the PF's governing board. Perhaps not surprisingly, Ahana is now a premier member of the PF, with Mih serving on the governing board and Borkar serving on the outreach committee.
Which version of Presto is the legitimate one? Are Traverso, Sundstrom and Phillips more qualified to shepherd the project, or is Facebook? Does the Linux Foundation give more credibility to the PF than the engineers who created Presto do to the PSF? And, in the commercial sphere, does Ahana have standing as a formidable competitor to Starburst?
All parties involved
To get more background information, I first spoke with Uber's Hsieh. He explained that Uber makes extensive use of Presto, using it to power the transportation company's own data engine and query analysis. As such, Hsieh explained, the company wanted to make sure the Presto project had an able custodian, and one that exhibited transparency and rigorous governance that the company felt was difficult to discern at the PSF. Based on these concerns and impressions, the Presto Foundation was born, with Twitter and Alibaba joining Facebook and Uber as founding members.
I also corresponded with Michael Cheng, Associate General Counsel at Facebook. Cheng was heavily involved in establishing the PF and is involved closely with Facebook's open source initiatives. While we originally intended to speak by phone, we ended up corresponding by email, apparently because of scheduling difficulties.
Cheng's commentary was decidedly diplomatic. He explained that Facebook is "very supportive of any company building a business and innovating on top of [its] open source projects, with Ahana being the most recent example." Cheng also provided a quote from Amit Chopra, Facebook's representative on the Presto Foundation board, who said Facebook "sometimes spin[s]-out projects to the community so that they can grow under a neutral governance structure" and that Facebook is "thrilled to see that Ahana and other companies have the opportunity to use PrestoDB to build great products and businesses."
Speaking of other companies, I also spoke with Justin Borgman, CEO at Starburst. Borgman explained that Starburst hired Traverso, Sundstrom and Phillips in September of 2019, roughly two years after the company launched. This made Starburst the powerhouse behind PrestoSQL and the dominant force at the PSF. Along those lines, Borgman shared with me some metrics around both Presto projects, to demonstrate PrestoSQL's momentum. And it would appear that, whether judging by code check-ins on the GitHub repos, activity on the two projects' Slack channels, or breadth of contributors, PrestoSQL has had greater momentum than PrestoDB since the split began.
Will the real Presto please stand up?
Both sides clearly have legitimacy, significant support and strong claims on Presto's lineage, making it hard to pronounce one or the other as the definitively legitimate heir to the Presto throne. More important than anointing a winner, though, is pointing out that schisms such as this one are bad for the specific project they pertain to and for the analytics industry overall.
The Presto rivalry in many ways mirrors the erstwhile competition between the pre-merger Hortonworks and Cloudera, each of which could trace its lineage to the original Hadoop project at Yahoo. The two companies engaged in a bitter rivalry, which resulted in competing, duplicative efforts and a bifurcation of the Hadoop ecosystem. That schism was damaging to the ecosystem and to big data analytics overall.
In general, fragmentation of ecosystems creates confusion, subdues investment and brings about malaise. And when you consider the downward pressure of the COVID-19 pandemic on tech and the overall economy, this is no time to dilute the effectiveness of the Presto ecosystem and community.
The good news is there's been progress toward unification. While my discussions with the parties involved may have catalyzed things a tiny bit, talks between Starburst and the Presto Foundation were in fact already ongoing before Ahana briefed me and before I started researching the rift. Late last week, these talks resulted in Starburst joining the Presto Foundation, something Borgman announced in a blog post just yesterday.
This development should ultimately serve the entire Presto community, including the PrestoDB and PrestoSQL subcultures within it. In so doing, it should help the whole open source analytics space, or at least avoid an awkward separation that could hurt it. And with Ahana now engaged on the Presto scene, Starburst will have some healthy competition, which is usually beneficial for customers and innovation.