The great [structured] database in the sky

Summary: MySQL CEO Marten Mickos opened the morning session of the Web 2.0 Summit with his dream of the great database in the sky.

MySQL CEO Marten Mickos opened the morning session of the Web 2.0 Summit with his dream of the great database in the sky. He envisioned a database in which all structured data could be shared and open sourced, so that people could instantly know what is happening in the world. For example, if all the weather data were made available, users could tap into a pool of information and apply it to real estate, travel, sailing, surfing, events and many other activities.

Mickos described the database in the sky as the other side of the coin to what Google does for unstructured data, although it must be part of Google’s vision to tackle the dark Web of structured data. Mickos wouldn't mind if MySQL became the underlying database storing the structured data or the index for the great database in the sky.

MySQL CEO Marten Mickos looks toward the great database in the sky

“Google gives access to all data in entire world. We could do the same for structured data,” Mickos said. “We could open up all the world’s structured data to all the world’s developers, entrepreneurs and mashup artists. It’s the great database in the sky.”

The biggest database is out there; it’s just not connected, he added. “We need to build a Skype for database connections, which allow sharing, connecting and aggregation of data in real time, so you instantly know what is happening in the world. Today it’s too complex, with different formats, ways to present data and different ways to get there.”

The technology pieces to make it happen exist, but haven’t been fully applied to the problem. Technologies include RSS, Atom, Jabber, HTML, HTTP, XML, SQL and SMS, Mickos said, and they have to be scalable and work in a peer-to-peer model without lots of complicated constructs.

He said that a DNS server that knows all the SQL databases in the world would be required. Some of the problems to overcome include routing, making data understandable and accessible to others, and payment systems.

The fact that latency is going down and bandwidth is increasing gets us closer to a synchronous flow of data, Mickos said. “It’s less of an issue if one of [the] databases is a bit latent; the value of combining the data is much bigger,” he said.

To make the vision work, data owners need to share, Mickos said. “The simple value of sharing is enough of an incentive.” In addition, a brokerage keeping track of who has what data to offer the world in what format is also necessary.
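As a rough illustration of the brokerage idea, here is a minimal Python sketch of a directory that records who offers what data in which format. Everything in it is an assumption made for illustration: the `Offer` and `Brokerage` names, the owners, the topics and the example.org addresses are hypothetical and do not come from Mickos's talk. A real brokerage would also have to deal with the routing, comprehension and payment problems mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """One data owner's offer (all fields are hypothetical examples)."""
    owner: str     # who publishes the data
    topic: str     # what the data is about, e.g. "weather"
    fmt: str       # format it is offered in, e.g. "rss" or "sql"
    endpoint: str  # placeholder address where the data could be fetched


class Brokerage:
    """In-memory directory of data offers, grouped by topic."""

    def __init__(self):
        self._offers = {}

    def register(self, offer):
        # Append the offer to the list kept for its topic.
        self._offers.setdefault(offer.topic, []).append(offer)

    def find(self, topic, fmt=None):
        # Return all offers for a topic, optionally filtered by format.
        offers = self._offers.get(topic, [])
        if fmt is None:
            return list(offers)
        return [o for o in offers if o.fmt == fmt]


broker = Brokerage()
broker.register(Offer("harbor-office", "weather", "rss", "http://example.org/harbor/feed"))
broker.register(Offer("surf-club", "weather", "sql", "mysql://example.org/surf"))
broker.register(Offer("city-hall", "events", "xml", "http://example.org/events.xml"))

weather_sources = broker.find("weather")
sql_weather = broker.find("weather", fmt="sql")
```

A mashup developer would ask the broker for a topic, then fetch each source in whatever format it advertises. The sketch only tracks the "who has what, in what format" part and leaves fetching and combining the data out.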

If the great structured database in the sky materializes, then the data would be the platform, Mickos concluded. It's a great dream, and it will come to be in fits and starts as data sources are opened up and technological frictions are removed.

  • We call it the Semantic Web

    The idea of a "great database in the sky" is exactly that of the Semantic Web. This vision is well into the implementation phase, with RDF as the (structured) data model and SPARQL as the query language. See:

    Here's a tutorial which should make sense to database folks:

    Here are some slides describing the current status:
    • call it whatever

      I don't care what people call it, I care about seeing it done...

      I'm not yet needing to use it, but heck, I'd like to see it usable, should I need to use it.
    • RDF: bazooka for a fly

      RDF is overkill

      But I know of something coming out next year that is totally kick ass, totally simple... and does everything this article describes... what is it? To name it would be to give it away...
      • Why wait until next year?

        Overkill? The core RDF tech is pretty much the minimum necessary to systematically integrate data in the global environment. It takes a while to get used to coding for it (as does HTTP), but with the libraries around now, in practice it's no harder than comparable technologies (like using SQL DBs or XML). Give me that over vaporware any day.