In August, Google launched a free project hosting service on Google Code. One of the unique features of this service is that it only supports 7 licenses (compared to over 50 listed at opensource.org). The stated intent of this restriction was to cut down on the proliferation of open source licenses. Recently, I sat down with Google Lead Open Source Engineer and Apache Foundation Chairman Greg Stein to explore how this was working out. The result of that discussion is a 3-part series on the problems of license proliferation, choosing an open source license, and leadership in the open source community.
ZDNet: One of your most-requested features for project hosting is to add licenses. The OSI License Proliferation Committee has recognized 9 licenses that are "popular and widely used or with strong communities". Will you be adding support for the 3 that aren't supported by Google Code (CDDL, CPL, and EPL)? And why do you support the Artistic license, which isn't on their list?
Stein: At this time, we aren't going to be changing our set of licenses.
While I very strongly agree with the goals of the OSI's anti-proliferation stance (this is exactly why we have a limited set of licenses), I cannot completely agree with their selection. If and when the other licenses see a broader appeal, then we'll go ahead and add them.
And yes, I realize there is a potential chicken and egg there... if we don't support them, then how can they ever reach a broader appeal? Simple: we aren't the only hosting service. I would very happily recommend that those projects use SourceForge, tigris, or join Eclipse or java.net. But if they are coming here, I would advocate a change to the MPL because it has the same philosophy.
ZDNet: I suspect a lot of people choose a license just because they're familiar with it, without really understanding what it means.
Stein: Heh. Or they see the stats and go "Wow. GPL must be good because it is popular."
ZDNet: So should "broad appeal" or current use be the main criterion for which licenses to promote? It should be a factor, but we should also consider which licenses are actually better and more modern, right?
Stein: I fully agree. That is one of the reasons that Google chooses the Apache License (2.0) as the default for the software it open-sources. It is permissive like BSD, but (unlike BSD) actually happens to mention the rights under copyright law and gives you a license under those rights. In other words, it actually knows what it is doing unlike some of the other permissive licenses.
ZDNet: It's true that MPL is the grandfather of CPL, EPL, and CDDL. However, the later licenses have subtle, dare I say, improvements. For example, EPL differs from CPL in only two areas: the license steward is not IBM, and the patent retaliation clause is a smidge looser.
Stein: Yup. And it is precisely that, which is why I'd add EPL instead of the CPL (read: if any get added, it would be EPL; I can't picture adding CPL). I'm not sure of the specific differences between CDDL and EPL, so I'm not sure how those would shake out if pitted against each other.
ZDNet: I understand your point about Google Code hosting not being the only game in town, and people can go to alternatives if Google Code is not a good fit. But there are many people who would like to use your hosting service but can't because of this license thing.
Stein: I understand. It is a hard problem. We want to help projects (which means license flexibility), but we also want to reduce proliferation (which means being restrictive).
Our count of projects is not important to us. We'd throw the licenses to the wind and really start "marketing" Google Code if that was the case. So not having "one more project" is no big deal. But it is a big deal because it means we couldn't help them. On the other hand, maybe we help them if they change their license :-) ... or if they think about licensing issues.
If I knew that the projects didn't have a good home to go to (like tigris.org or SF), then I might be more concerned. But right now, I can be a supplement rather than a replacement :-)
ZDNet: Some projects get around the Google Code restrictions by lying about the license on the form and then clarifying in the description and source. What do you think of that?
Stein: That would not be advisable. We have already run an analysis over all the repositories looking for exactly this. We need some more refinement to eliminate false positives, but we will be able to detect this kind of misdirection. And we do have the ability to ban users who abuse the site (relegating them to not-signed-in status to avoid our ban marker, meaning anonymous access, meaning no participation).
We've been discussing whether we can use our license analysis to help projects with properly licensing their code: for example, making sure files are labeled, that a COPYING or LICENSE file is present (as required by the license in question), and so on. We might mail the project owners to let them know what problems we found and how they can correct them.
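To make the kind of check Stein describes concrete, here is a minimal sketch in Python. It is not Google's analysis, which the interview does not detail. It only illustrates the two checks he names: whether a COPYING or LICENSE file is present, and whether source files are labeled. The file names, the source suffixes, the marker words, and the function names (`has_license_file`, `unlabeled_files`, `audit`) are all assumptions made for illustration.

```python
from pathlib import Path

# Assumed conventions, chosen for illustration only.
LICENSE_FILENAMES = ("COPYING", "LICENSE", "COPYING.txt", "LICENSE.txt")
SOURCE_SUFFIXES = (".py", ".c", ".h", ".java", ".js")
HEADER_MARKERS = ("copyright", "license")


def has_license_file(root):
    """Return True if a top-level COPYING or LICENSE file exists."""
    return any((root / name).is_file() for name in LICENSE_FILENAMES)


def unlabeled_files(root, head_lines=20):
    """Return source files whose first lines contain none of the markers."""
    missing = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in SOURCE_SUFFIXES:
            continue
        with path.open(encoding="utf-8", errors="replace") as handle:
            # Read at most head_lines lines from the top of the file.
            head = "".join(line for _, line in zip(range(head_lines), handle))
        if not any(marker in head.lower() for marker in HEADER_MARKERS):
            missing.append(path)
    return missing


def audit(root):
    """Summarize both checks for one project directory."""
    root = Path(root)
    return {
        "license_file_present": has_license_file(root),
        "unlabeled": [p.relative_to(root).as_posix() for p in unlabeled_files(root)],
    }
```

Calling `audit("path/to/checkout")` returns a small dictionary that could feed the sort of note to project owners mentioned above. A crude keyword match like this would produce the false positives Stein mentions, so a real tool would need to compare the text against the actual license wording.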
ZDNet: If you don't care about your count of projects, then why not just support a bare minimum (say, GPL, EPL, and Apache)? I think you do care somewhat.
Stein: Heh. Remember: it is a balance. We want to help projects who have a hosting need or want to use the unique features we provide. If we get too hard-core with our set of licenses, then we dramatically cut down on the set of projects that we can help.
My point is that our goal is not a project count, but to support the community. We won't publish or "compete" on project counts. Our internal goals are not measured by a count. The count has leveled somewhat, though our overall activity has been increasing (good!).