Shared nothing coming to open source

Summary: It's at share nothing that the proprietary world is really moving in on open source. But now projects that need to build big warehouses can say share nothing right back at 'em.


Shared nothing has been a big deal in the database world for some time. It's what companies like Google are based on -- a distributed database without a single point of contention.

Now this concept is coming to open source. It's not entirely there yet. But with Greenplum's new Bizgres MPP, based on the PostgreSQL open source base, you can see it from here.

Greenplum is an example of what CEO Scott Yara calls the next generation of innovation. "Commercial entities innovate on open source code and sell commercial features on open source," he said.

In the case of Greenplum, which has major venture capital backing, "We rebuilt the query optimizer and executor" of PostgreSQL, "as well as an interconnect. So you can run it across clusters and build terabyte sized warehousing systems using off the shelf hardware."
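For readers who want to see what "shared nothing" means in practice, here is a minimal sketch in Python. It is not Greenplum's code or API; the `Segment` and `Cluster` classes, the CRC32 hashing choice and the sample rows are all assumptions made for illustration. Each node owns its own slice of the data, answers a query over that slice alone, and a coordinator merges the partial answers.

```python
import zlib
from collections import defaultdict


class Segment:
    """One hypothetical node: it owns its rows outright and shares no state."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)

    def partial_sum(self, group_key, value_key):
        # Aggregate over this segment's private rows only.
        totals = defaultdict(int)
        for row in self.rows:
            totals[row[group_key]] += row[value_key]
        return dict(totals)


class Cluster:
    """Hypothetical coordinator that routes rows and merges partial results."""

    def __init__(self, n_segments):
        self.segments = [Segment() for _ in range(n_segments)]

    def insert(self, row, dist_key):
        # Hash the distribution key so each row lands on exactly one segment.
        h = zlib.crc32(str(row[dist_key]).encode("utf-8"))
        self.segments[h % len(self.segments)].insert(row)

    def sum_by(self, group_key, value_key):
        # Scatter: every segment aggregates independently.
        # Gather: the coordinator adds up the partial totals.
        merged = defaultdict(int)
        for seg in self.segments:
            for key, value in seg.partial_sum(group_key, value_key).items():
                merged[key] += value
        return dict(merged)


def demo():
    cluster = Cluster(n_segments=4)
    rows = [
        {"site": "a", "hits": 3},
        {"site": "b", "hits": 5},
        {"site": "a", "hits": 5},
        {"site": "c", "hits": 1},
        {"site": "b", "hits": 2},
    ]
    for row in rows:
        cluster.insert(row, dist_key="site")
    return cluster, cluster.sum_by("site", "hits")
```

The sketch loops over the segments one at a time to stay short. In a real cluster each segment would sit on its own machine and do its share of the work at the same time, which is where the speed comes from.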

This is a problem a lot of Web 2.0 start-ups like Technorati, Bloglines and Flickr are facing, and projects like Drupal will face soon. They were built with open source tools, but now find they need to "graduate" to something like a data warehouse. And there's old Oracle, telling them there's nothing from an open source supplier that can deliver what they need. Share with us, they say, you don't have any choice.

Well, now there is a choice. Greenplum CTO Luke Lonergan said that O'Reilly Media, one of Greenplum's early customers, "graduated" from MySQL to PostgreSQL with Greenplum and got a 100 times improvement in database access speed across a 500 gigabyte database. Other Web 2.0 start-ups, and projects, can do the same thing.

"The price of conversion is where the pain is," said Yara, but "look at how fast some of these projects grow." While MySQL was smart in building on a lightweight Web base, more and more users and projects will find the need to graduate, and face proprietary FUD from major vendors saying they have to pay the "monopoly tax" in order to grow.

Well, they don't. And they can even convert now, if they fear the FUD about MySQL that companies like Oracle are putting out, Yara added.

It's at share nothing that the proprietary world is really moving in on open source. But now projects that need to build big warehouses can say share nothing right back at 'em.

  • Good to see open source trying to catch up.

    Hope they make it...
  • Not Exactly

    The point of the story is commercial products being built on top of open source.

    In time I expect those commercial products will themselves be open sourced, when additional, newer, better products are built on them (or they're replicated by a bunch of programmers dedicated to open source).

    That's why I wrote you can see open source from here.
  • Important Note

    This is a data warehousing product, not an online transaction processing product. Data warehousing systems are tuned for read access. OLTP needs to be read/write. That's not saying it's bad, not in the least. Greenplum's website makes what it is very clear. But making essentially read-only data hyper-fast and shared nothing isn't that hard. I just wanted to say this before some zealot posts about how OSS is creating monster innovations that will destroy the commercial software business. This is an example of a business that took a generalized BSD-licensed product and turned it into a specialized proprietary product. Kudos to them for that.
  • Shared nothing solutions gaining momentum

    Shared nothing solutions are gaining momentum. They are easy
    to install and more cost effective to acquire and manage, yet
    deliver superior high availability and capacity scalability. A
    commercial open source company, Continuent, has pioneered
    this approach since '02 with its m/cluster product. m/cluster
    supports MySQL database virtualization, delivering high
    availability and scalability for business-critical data in a
    shared-nothing environment. They also offer p/cluster for
    PostgreSQL and will add support for commercial databases later
    this year. The company offers both an enterprise-grade database
    high availability solution and core open source software for
    developers.
    • RE: Shared nothing, Continuent, Greenplum

      Continuent can definitely help in environments where you have a lot of read activity and a limited amount of data. For more complex queries over large amounts of data, however, you need parallelism to overcome the limits of any single system in a cluster. Greenplum's software will help in such cases.

      Another vendor that is much cheaper than Greenplum is ExtenDB. They also use PostgreSQL and target BI.