The joys and perils of open-source life

Karim Yaghmour shares his views on the evolution of the open-source movement and his in-depth experiences with the development of a well-received Linux project.
Written by Karim Yaghmour, Contributor

In light of recent events on the tracing tools side of Linux, some reflections came to mind which I would like to share with the Linux community. It is my hope that my perspective and experience as a maintainer of an open-source project called the Linux Trace Toolkit (LTT) and as a consultant in the open-source field may be helpful to others in understanding how the open-source movement is evolving. After all, this is what I specialize in: helping people understand complex systems.
The origins of LTT
I started developing the Linux Trace Toolkit at the beginning of 1999 as part of my research at the Ecole Polytechnique de Montreal ("Poly"). I had one purpose: to provide a way for users to understand the dynamic behavior of a complex operating system -- in this case, Linux. From the start, I was convinced that numbers and lists of items were insufficient to describe such systems. Instead, what was needed was a way to visually illustrate the system's behavior.
As the old adage says, a picture is worth a thousand words. Many ideas came to mind, but right from the start the one that seemed to make the most sense was a control-graph similar to the one currently displayed by LTT. Hence, I started out with a very clear idea of what I expected of this tool once it was running.

Thus began a six-month period of experiment after experiment, trying to probe Linux's behavior. With some guidance from Professor Michel Dagenais at Poly, I began to make rapid progress. By March, I had a working prototype that enabled me to trace timer interrupts, buffer these events in a data log, retrieve them using a daemon, and print them off-line.
By May, I had perfected the technique while adding more events to it, all the while becoming more and more familiar with the sources ("use the source, Luke!") and continually gaining a better understanding of the flow of control within the kernel. By then, the entries were marked "Linux trace utility" in my log book. Meanwhile, the event format had changed many times over, as trace logs were getting very large and I was looking for ways to minimize their size.
By the end of May, I was quite satisfied with the quantity of events I could retrieve; so I started working on a graphical front end to it all. By mid-June, the whole interface was built and I was quite satisfied with how the graphs displayed and how the information was laid out.
From there on, I spent a month debugging things and testing the tool as much as I could in various situations. I remember spending a lot of time on the routines that manage the display of the graphs. While it may be obvious from reading a list of events to find out where the system is, whether in a process or in the kernel, it is quite another story to try to write down the rules that govern this behavior in a programming language.
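To give a rough idea of what such rules can look like, here is a minimal sketch that replays a list of events and decides whether the system ends up in a process or in the kernel by counting nested kernel entries and exits. The event names and the `demo_` identifiers are hypothetical, and this is not LTT's actual event set or algorithm.

```c
#include <stddef.h>

/* Hypothetical event kinds; a real tracer records many more. */
enum demo_event {
    DEMO_SYSCALL_ENTRY,
    DEMO_SYSCALL_EXIT,
    DEMO_IRQ_ENTRY,
    DEMO_IRQ_EXIT,
    DEMO_OTHER
};

enum demo_context { DEMO_IN_PROCESS, DEMO_IN_KERNEL };

/* Replay the events and report where the system is afterwards.
 * Assumption: the trace starts while running in process context. */
enum demo_context demo_context_after(const enum demo_event *events,
                                     size_t count)
{
    unsigned depth = 0; /* number of kernel entries not yet exited */

    for (size_t i = 0; i < count; i++) {
        switch (events[i]) {
        case DEMO_SYSCALL_ENTRY:
        case DEMO_IRQ_ENTRY:
            depth++;
            break;
        case DEMO_SYSCALL_EXIT:
        case DEMO_IRQ_EXIT:
            if (depth > 0) /* ignore an exit with no matching entry */
                depth--;
            break;
        default:
            break; /* other events do not change the context */
        }
    }
    return depth > 0 ? DEMO_IN_KERNEL : DEMO_IN_PROCESS;
}
```

Even this toy version has to decide what to do with an exit that has no matching entry, which hints at why the real rules took so long to get right.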
Finally, the first LTT release came out on July 22, 1999 (I had intended to release it on the 21st). After that, I was off for some fresh air.
The first reaction to come in was from Peter Wurmsdobler: "It seems to be . . . a well appropriate tool to . . . give an understanding for the operating system . . ." This was closely followed by "It was with great interest that we recently read about your Linux Trace Toolkit, as it provides some good capability that was not previously available for Linux," from David Beal, of Zentropix (later acquired by Lineo). Later came "looks very interesting, particularly for debugging interactions among groups of daemons and/or the kernel," from Werner Almesberger (LILO maintainer and kernel contributor).
From the feedback I was receiving, it was clear that my initial purpose was fulfilled. LTT was providing a unique capability enabling users to understand the behavior of a complex operating system. It was now time to sharpen the knife.
In the following months, advice and support came in from various sources.
Jacques Gelinas, author of Linuxconf, suggested I use macros for the kernel instrumentation, as it would enable conditional compilation of trace statements in the kernel. Jean-Hughes Deschenes helped in refining the graphical interface by providing a toolbar, event icons, selectable events, and other very useful visual enhancements. Andi Kleen suggested some better ways to go up the call stack when retrieving system call address origins.
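The macro suggestion can be illustrated with a small sketch. It assumes a hypothetical `DEMO_TRACE` switch and a stand-in logging function; none of these names come from LTT or the kernel. When the switch is set to 0, every trace statement expands to nothing and costs nothing at run time.

```c
/* Set DEMO_TRACE to 0 (e.g. -DDEMO_TRACE=0) to compile every trace
 * statement out. The name is hypothetical, not an LTT or kernel option. */
#ifndef DEMO_TRACE
#define DEMO_TRACE 1
#endif

/* Stand-in for a real event log: it only counts events. */
static unsigned long demo_event_count;

static void demo_log_event(int event_id)
{
    (void)event_id; /* a real tracer would record the id and a timestamp */
    demo_event_count++;
}

#if DEMO_TRACE
#define DEMO_TRACE_EVENT(id) demo_log_event(id)
#else
#define DEMO_TRACE_EVENT(id) do { } while (0)
#endif

#define DEMO_EV_TIMER 1

/* An instrumented function: the trace statement disappears entirely
 * when DEMO_TRACE is 0. */
static void demo_timer_tick(void)
{
    DEMO_TRACE_EVENT(DEMO_EV_TIMER);
    /* ... the function's real work would go here ... */
}
```

The `do { } while (0)` form keeps the disabled macro usable as a single statement, for example inside an unbraced `if`.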
In parallel, I pushed other things. First, I completed instrumentation of the kernel. Then, I did a complete rewrite of the event database browsing engine. (It used to read the whole trace into custom structures and then browse through those traces to display trace information. Now it maps the trace into memory and reads it as it is.)
In effect, this was almost a complete rewrite of the visualization tool, because major portions had to be modified. The event structures changed all the while. Later, I added keyboard support to the graphical tool, since browsing through the large traces armed only with a mouse proved to be a challenging affair.
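The memory-mapped approach mentioned above can be sketched as follows, assuming a POSIX system and a hypothetical trace file made of fixed-size records. The record layout and the `demo_` names are illustrative only; LTT's real trace format is different.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical fixed-size record (8 bytes, no padding). */
struct demo_record {
    uint32_t event_id;
    uint32_t time_delta;
};

/* Map the trace file read-only and count the records with the given id,
 * reading them in place instead of copying them into other structures.
 * Returns -1 on error or if the file is empty. */
long demo_count_events(const char *path, uint32_t wanted_id)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }

    size_t length = (size_t)st.st_size;
    void *base = mmap(NULL, length, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (base == MAP_FAILED)
        return -1;

    /* mmap returns page-aligned memory, so the records are aligned. */
    const struct demo_record *records = base;
    size_t record_count = length / sizeof *records;
    long hits = 0;

    for (size_t i = 0; i < record_count; i++)
        if (records[i].event_id == wanted_id)
            hits++;

    munmap(base, length);
    return hits;
}
```

With this design the operating system pages the trace in on demand, so browsing a very large trace does not require loading all of it first.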
By April 2000, things were starting to take a whole new turn. I did a first presentation about LTT at the Linux Expo in Montreal in front of a very interested crowd, and had a presentation scheduled for the refereed track of the Usenix Annual Technical Conference for June in San Diego. Meanwhile, I had started receiving requests from various Linux vendors wanting to add features to LTT. At that point, I had to quickly decide on the strategy I would adopt in receiving help, either technical or financial, from commercial entities.
Just at that time, I was fortunate to run into Richard Stallman at Linux Expo, and Richard was kind enough to spend some time with me discussing the issues of open source, free software, copyrights, the GPL, etc. These discussions were extremely valuable in giving me access to Richard's 16 years of GPL experience -- a source of wisdom that it would have been foolish to overlook. Richard gave me a breakdown of the do's and don'ts of managing GPL source with outside contributors, the reasons these rules exist, and the possible consequences of not taking the right precautions. It was serious food for thought!
Following these discussions and many reflections, I arrived at what I consider a "sane protocol" for managing outside contributions: any source code contributed to LTT is owned (copyrighted) by its author, regardless of whether the author is an employee of a corporation or is sponsored by a corporation.
What does this mean? Well, it simply means that if John Doe works for company X and he writes code which is meant to be included in LTT, then the copyright is attributed to John Doe and NOT company X. This does not mean that company X cannot receive credit for having helped in developing LTT.
The reason for this is very simple. Whereas company X might be well intentioned at the time of the contribution, I, as the project maintainer, have no guarantee as to future intent. Therefore, I would do a great disservice to LTT and to the open-source community at large by allowing the company to own the copyright of the contribution.
Moreover, take the situation where John Doe leaves company X and starts working for company Y. Does he lose control of the source he actually wrote and understands better than anyone else? Again, as a project maintainer, this is just a risk I cannot afford to take.
I put my money where my mouth is. Specifically, you will find that all of the code I wrote in LTT is copyrighted in my personal name and not in the name of my company (Opersys Inc.), although I could have easily done otherwise.
Needless to say, this protocol has raised eyebrows and has met opposition. Nonetheless, I believe that in the long run it makes the most sense. And recent events have reinforced this belief. Let me explain.
With the rising popularity of Linux in both mainstream and embedded computing, many companies have moved to capitalize (financially) on Linux. First, let me be clear: there is nothing inherently wrong with doing this.
In the beginning, the corporate messages tended to be: "It's free, and we offer packaging and support." As time went by, this evolved into variants like: "It's free, but we've added some very cool features to it." Again, such an attitude is not inherently wrong; there can be good reasons for taking that approach.
The problem, however, comes when a partisan philosophy, burdened by corporate agendas, starts to wind its way into the very projects that made the free software market emerge. That's when things start to become very slippery. Project management then becomes a difficult compromise between accepting "cool features" and ensuring that the project remains on course and free from corporate agendas. In this regard, my motto is "don't shoot yourself in the foot." Having said that, I tend to agree with Messadie when he says: "It is my belief that it is profoundly satanic to believe in the Devil."
What do I mean by that? Simply that there are no bad guys and no good guys. That, by the way, is why I didn't call this editorial "Misleading announcements about Linux tracing tools." Reality often tends to be complex -- fortunately or unfortunately.
After all, every company in this market wants to survive. Need I remind anyone of the SEC filings of some Linux companies which explicitly state that there are no known open-source companies that make a profit? To some, this means acquiring legitimacy by promoting open-source projects; to others, it means differentiating their company's offerings by "adding value."
