The timing of Facebook's Graph Search announcement yesterday was fortuitous for me, since I have been exploring open source graph databases such as Neo4j of late.
Neo4j is not the technology behind Facebook's in-house graph database, but it is representative of an emerging type of NoSQL database required for a more complex, connected data world. It is also open source, as are other graph databases such as InfiniteGraph, InfoGrid, OrientDB, BigData, DEX, HyperGraphDB, OQGraph and ArangoDB.
Neo Technology seems to be leading the open source pack: the San Mateo, Calif. company lists Cisco, Adobe, Intuit, Squidoo and Deutsche Telekom among its customers and it has raised more than $23 million since its founding in 2010, with $11 million of the funding announced just last fall from Sunstone Capital.
The company's Neo4j software is available under GPL v3 and the commercial edition is available under an AGPL v3 license.
There will be plenty of competition from social networks and other proprietary vendors. Google's Knowledge Graph and Twitter's Interest Graph are similar to Facebook's Social Graph, and traditional database vendors such as Oracle and IBM are getting into the graph act, too. Microsoft Research's "Trinity" project is another example of a graph database.
In a recent presentation, Matt Aslett, research manager at the 451 Group, predicted that revenues for NoSQL databases such as Apache Cassandra will increase to $215 million in 2015, up from a mere $20 million in 2011. Revenues from graph databases represent only about four percent of that total in 2011 and eight percent in 2015, but it's a steady and growing business for companies like Neo Technology as social networks, geo-location services and network/cloud management requirements increase.
In an email, the chief executive of Neo Technology said graph databases are far better for complex, connected data than traditional relational databases and other NoSQL databases.
"A graph database is a type of NOSQL database that is optimized for use cases where you have connected data. Connected data is prevalent in social networking (as you mention), logistics networks (for package routing), financial transaction graphs (for detecting fraud), telecommunications networks, ad optimization, recommendation engines, bioinformatics, and in many other places," said Emil Eifrem, CEO of Neo Technology. "Customers like Adobe, Cisco and Deutsche Telekom have found that Neo4j (which is the most popular graph database in the world) frequently outperforms a traditional database by a factor of one thousand when it comes to queries on connected data."
In 2009, as Neo was getting going, execs rightly anticipated that a new type of database would be needed for a more connected world in which unstructured and semi-structured data from social networks and clouds would have to be handled differently than tables.
"Graph databases instead assume that the relationships are as important as the records," a Neo Technology blog post said three years ago. "This difference has numerous consequences for ease of use as well as for performance. A table-based system makes a good fit for static and simple data structures, while a graph-based fits complex and dynamic data better."
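To make the table-versus-graph distinction concrete, here is a minimal Python sketch of a "friends of friends" lookup done two ways: once by repeatedly scanning a flat table of relationship rows (roughly what a self-join does), and once by following stored relationships directly from each record. This is an illustration only, not how Neo4j or any other product is implemented; the names, the sample data and both helper functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical "knows" relationships; names and data are illustrative only.
EDGES = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("bob", "dave"),
    ("carol", "dave"),
    ("carol", "erin"),
    ("dave", "frank"),
]


def friends_of_friends_table(edges, person):
    """Table-style lookup: emulate a self-join by rescanning every row."""
    direct = {dst for src, dst in edges if src == person}
    result = set()
    for src1, dst1 in edges:
        if src1 != person:
            continue
        # Second pass over the whole table for each first-hop match.
        for src2, dst2 in edges:
            if src2 == dst1:
                result.add(dst2)
    return result - direct - {person}


def build_adjacency(edges):
    """Store each record's outgoing relationships alongside the record."""
    adjacency = defaultdict(set)
    for src, dst in edges:
        adjacency[src].add(dst)
    return adjacency


def friends_of_friends_graph(adjacency, person):
    """Graph-style lookup: hop along stored relationships, no table scan."""
    direct = adjacency.get(person, set())
    result = set()
    for friend in direct:
        result |= adjacency.get(friend, set())
    return result - direct - {person}


if __name__ == "__main__":
    adjacency = build_adjacency(EDGES)
    print(sorted(friends_of_friends_table(EDGES, "alice")))
    print(sorted(friends_of_friends_graph(adjacency, "alice")))
```

Both functions return the same answer. The difference is in the work done: the table version touches every row again for each hop, while the graph version only visits the relationships attached to the records it reaches, which is the intuition behind treating relationships as first-class data.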
Today, Philip Rathle, senior director of products, claimed Neo Technology now has 45 full-time employees, ample funding, a sizable community and maturity on its side.
"None of the other graph players have a commercial customer base, or community, the size of ours. Neo4j is proven technology with deployments dating back 10 years that has been built from the ground up to be a graph database," Rathle said. "InfiniteGraph is a graph database that has been built as a graph application layer on top of an object oriented database. It's an exciting time, because new graph databases seem to have popped up every few weeks last year... ok I'm exaggerating slightly.
"These technologies are great, and reflect the demand for, and usefulness of, graph databases," he continued. "However they will all take years to become mature, as it takes a lot of real-world shaking out for a database to be reliable in the range of real-world scenarios that databases need to be impervious to. Please fact check this, but you won't see any other graph database today that has anywhere close to the level of maturity and adoption that we do."