The breadth but not depth issue on Internet information I've been on about this week has an interesting parallel in open source development.
Look at either Freshmeat or SourceForge and you see an incredible variety of projects, including several thousand offering real, downloadable, and immediately useful applications.
In addition to important but niche stuff like PHPSurveyor, the open source world offers enormous application groupings like those around Apache, gcc, Linux, OpenOffice.org, and X11.
What there's very little of, however, is the stuff in between. In fact, I think there's one hole in open source coverage that's particularly interesting for what it tells us about how people balance technical interest with financial reality.
First, let's look at the overall pattern of open source development. In general you start with an idea about what you want to do, find somebody else's code that does some or a lot of it, and then modify that code to do precisely what you want. Stick to it long enough and you learn much more about what it was you started out wanting to do. As you do, the code you started with looks less and less applicable - so you gradually replace all of it and end up with your own brand new product that's built, as is any academic result worth mentioning, on earlier results obtained by others.
To cite the best known example, that's how Torvalds created Linux: he started with a demonstration of the superior performance available from using x86 interrupts in the Minix kernel, and ended up replacing all of it to create his own kernel and went on from there to make GNU/Linux the power it is today.
Great, but where are the infinitely flexible business applications? Specifically, why am I unable to find a group dedicated to maintaining and expanding a complete ERP/SCM set of data definitions on which anyone can build applications?
There are open source ERP efforts, and other people do use their data definitions (by which I mean a combination of DDL scripts defining the database structure and separate documentation on the content of each table and column) for additional applications. CentraView, for example, offers a "combination of Contact Management, Salesforce Automation (SFA), and Customer Relationship Management (CRM) functionality" built as add-ons within the Compiere ERP data definitions. As far as I know, however, these two are exceptional: this kind of behavior is not a general thing, and there aren't several hundred other groups busily working with a single, standardized set of shared data definitions.
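To make "data definitions" concrete, here is a minimal sketch of the idea: a shared DDL script, separate per-column documentation, and an add-on that builds on the shared tables. Every table, column, and function name below is hypothetical and invented for illustration; none of it is taken from Compiere or CentraView, and SQLite stands in for whatever database a real project would use.

```python
import sqlite3

# Hypothetical shared definition: the DDL script for one core table.
SHARED_DDL = """
CREATE TABLE business_partner (
    partner_id  INTEGER PRIMARY KEY,
    name        TEXT NOT NULL,
    is_customer INTEGER NOT NULL DEFAULT 0
);
"""

# The separate documentation: what each table and column is meant to hold.
SHARED_DOCS = {
    ("business_partner", "partner_id"): "Surrogate key for any outside party.",
    ("business_partner", "name"): "Legal or trading name.",
    ("business_partner", "is_customer"): "1 if we sell to this party, else 0.",
}

# A hypothetical add-on that reuses the shared table instead of redefining it.
ADDON_DDL = """
CREATE TABLE contact_note (
    note_id    INTEGER PRIMARY KEY,
    partner_id INTEGER NOT NULL REFERENCES business_partner(partner_id),
    note_text  TEXT NOT NULL
);
"""

ADDON_DOCS = {
    ("contact_note", "note_id"): "Surrogate key for the note.",
    ("contact_note", "partner_id"): "Party the note is about.",
    ("contact_note", "note_text"): "Free-text content of the note.",
}


def build_schema():
    """Create an in-memory database holding the shared and add-on tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SHARED_DDL)
    conn.executescript(ADDON_DDL)
    return conn


def undocumented_columns(conn, docs):
    """Return (table, column) pairs that exist in the schema but not in docs."""
    missing = []
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
        for col in conn.execute(f"PRAGMA table_info({table})"):
            if (table, col[1]) not in docs:
                missing.append((table, col[1]))
    return missing


if __name__ == "__main__":
    conn = build_schema()
    print(undocumented_columns(conn, {**SHARED_DOCS, **ADDON_DOCS}))
```

The check for undocumented columns is the part that matters here: a schema alone is only half of a data definition, and a shared resource of this kind is only credible if the documentation keeps pace with the DDL.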
So why not?
My guess is that the people who focus on data aren't generally open source oriented, and those who are open source focused are probably more interested in coding and applications than in making and defending data definitions. And yet such a resource, if done credibly and made widely available, could spark an open source land rush into data-based business applications - boring ones like GLs, weird ones like job queue management, fun ones like solver interfaces, and leading edge stuff like decision support (aka business intelligence by those who don't mind the oxymoron).