Open source and the need for speed

Enterprise technology development and improvement rarely take place as quickly as most IT managers would like, but blaming that lack of speed on the inherent complexity of the problems involved can sometimes be a lazy knee-jerk reaction.

I was reminded of this last week at linux.conf.au, during a presentation by Andrew Tridgell on the development of clustering capabilities in Samba, the wildly popular open source package which (in ridiculously simple terms) makes file system communication between Windows and Linux boxes possible.

Clustering is a very useful business continuity technique, but getting it to work effectively is generally regarded as difficult. Even so, the Samba team managed to get its approach up and running pretty quickly.

At the 2006 conference the bare bones of the underlying CTDB technology (which provides a lightweight database to store essential clustering information) were demonstrated.

One year later, building full production environments is possible.

"Last year it was a hack," Tridgell noted. "I think we got the first few packets going across a few hours before the talk. I was warning everyone 'do not use this'."

Twelve months on, it's a rather different story. "You can create the fastest NAS box around. You can get multiple gigabytes a second to a single machine on a single IP address."

It's also worth noting that most of this development took place without the benefit of full access to the relevant Windows documentation, as the settlement giving the Samba team access to that information wasn't reached until late in the year.

It wouldn't do to overstate the role that having an open code base played in the development of CTDB. As Tridgell himself noted: "People have been clustering Samba for years really, really badly. As they added more nodes to the cluster, it got slower."

Nonetheless, I'd wager that conventional corporate development methods would not have solved the problem as quickly.