Raiders of the lost code

At first glance, software developers have little in common with Indiana Jones. But the emerging field of software archaeology applies some of the same skills, if not the dashing adventure.
Written by Simon Sharwood, Contributor

In 1900, a Greek sponge diver found the wreck of a 2000-year-old ship near the island of Antikythera. The sunken ship yielded up the usual ancient booty, but among the loot was something unusual: a corroded lump of metal with a large wheel on its front. Decades later, gamma-ray examinations showed that the artifact contained bronze gears and wheels.

Science historians now call the device the Antikythera Mechanism and agree that it is the earliest known computing machine. To this day, however, despite many tests, simulations and reconstructions, no one knows exactly what the Antikythera Mechanism actually computed. The favorite theory holds that it calculated the position of the stars to aid navigation at sea. But its designers and builders are long dead, ancient literature lacks a single mention of such devices, and the only documentation it bears is an inscription suggesting the island of Rhodes as the place it was built.

This ancient riddle will seem familiar to many software developers confronted with the task of rebuilding applications. Like the Antikythera Mechanism, many applications were created years ago by unknown coders who left no documentation and can’t be reached any more. Yet the mystery of their work can be as important to a business as the Antikythera Mechanism is to an archaeologist, as uncovering the business value encoded into an old application can tell a business a lot about its past and help shape its future.

Raiders of the lost code
Because old code has such potential, developers around the world are starting to establish formal practices for “software archaeology,” the discipline of investigating and re-invigorating old applications so their structure can be fully appreciated and their code put back to work.

The need for software archaeology is not exclusively the domain of mainframes. “I did a ‘dig’ for a website moving from Windows DNA to .Net,” recalls Davyd Norris, a Rational Technical Consultant at IBM. “They asked for a website critique. What we found was four different styles of web architecture. We could almost see work by different consultants based on the way the HTML was put together.” The dig could also detect which version of Microsoft’s fast-evolving web tools had been deployed as the site evolved.

Early editions of Visual Basic and the languages used in the client-server boom of the early 1990s are another common source of “digs.” Few who coded in those languages suspected their work would last long, so they left little documentation. Software created with newer methodologies is not exempt: the legendary payroll application for automaker Chrysler that kick-started the Extreme Programming movement was recently the subject of a dig. Short-lived platforms and architectures are another reason for archaeology: the skills required to code under Data General’s DG-UX operating system and NUMA APIs are obsolete. Human nature chips in too: source code goes missing, documentation disappears, or programmers just code unintelligibly for no good reason.

Whatever the source of the problem program, the reasons for conducting software archaeology are prosaic: repairing an application for re-use or porting to a newer environment is generally faster, cheaper and less risky than starting an application from scratch.

“Often software archaeology gets done because a company sees that a program is working but needs to understand it better before they can web-enable it,” says Michael Hawkins, Asia-Pacific General Manager of professional services for Software AG. “I also had one project where the customer was redeveloping an old application into Java. The CIO came along halfway through, asked where the value was in doing that and they ended up doing archaeology instead.”

This potential to save money and speed development makes software archaeology a good business opportunity for practitioners, as Peter Taylor can attest. Taylor, Director of Strategic Alliances for Australian company Quipoz, proudly points out that the company has grown to 23 people just 30 months after opening to provide what it calls “automated legacy system transformation,” and already plans more hires.

“Customers need our services to reduce costs,” Taylor says. “License and maintenance fees for mainframe applications and hardware are substantial. If we can get them off the mainframe we save our clients a lot of money,” a message that has quickly won the company clients in the banking and finance, industrial and insurance sectors in Australia, New Zealand and the UK.

Temple of dull
So what does it take to perform software archaeology? As with real archaeology, which involves a lot of time bent over in the mud, there is little glamour.

“It’s an iterative and interactive process,” Taylor says, explaining his company’s four-step process. It starts with an “input” team that lexes and parses old code, using automated tools whenever possible. A second team performs “enrichment” duties by building a working data model and describing interfaces to other applications so old code can link into a customer’s new environment. A deployment team then ports old code to a new platform and database, before a testing team puts the rebuilt application through its paces.
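To make the “input” step concrete, here is a minimal sketch in Python of what lexing a line of old code can look like. It is not Quipoz’s tooling: the token classes, the keyword list and the COBOL-like sample statement are all assumptions chosen for illustration, and a real legacy dialect would need a far richer grammar.

```python
import re

# Hypothetical token classes for a COBOL-like statement (illustrative only).
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("STRING", r"'[^']*'"),
    ("WORD", r"[A-Za-z][A-Za-z0-9-]*"),
    ("PERIOD", r"\."),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

# Assumed keyword subset; a real dialect has hundreds of reserved words.
KEYWORDS = {"MOVE", "TO", "ADD", "GIVING", "DISPLAY"}


def lex(line):
    """Split one line of legacy source into (kind, text) tokens."""
    tokens = []
    pos = 0
    while pos < len(line):
        match = TOKEN_RE.match(line, pos)
        if match is None:
            raise ValueError(f"unexpected character {line[pos]!r} at column {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "SKIP":
            continue  # whitespace carries no meaning in this sketch
        if kind == "WORD" and text.upper() in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, text))
    return tokens


if __name__ == "__main__":
    for token in lex("ADD 1.5 TO TOTAL-PAY."):
        print(token)
```

A parser would then group these tokens into statements, which is roughly where the “enrichment” work of building a data model could begin.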

Other practitioners advocate similarly structured approaches. “There are four key aspects to software archaeology,” says IBM’s Norris. “You need to look at structure, behaviour, context, and rationale.”

Taylor says the skills needed to do this work have given the company an average age much higher than the industry average. “The input and enrichment [teams] need to have a fair bit of grey hair,” he says. “They need a bit of knowledge about the original languages.”

Sean Salisbury, Compuware Asia-Pacific’s senior regional technical specialist, thinks mastery of even finer details may be needed to make a good software archaeologist.

“When I first started programming you needed to know that disk sectors were 512 bytes long. You designed your records around that constraint,” he says. “Later on you needed to know the difference between DOS 6.2 and DOS 6.2.2,” he adds, as the programming tools of the day did not abstract the operating system at all and subtle changes to code were required if an application was to survive the subtle variations in upgraded operating systems. Without this kind of knowledge, he feels, a would-be archaeologist may lack the insights to perform effectively.
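The arithmetic behind Salisbury’s 512-byte example is simple enough to sketch in Python. The function name and the sample record sizes below are assumptions made for illustration; the only figure taken from his account is the sector length.

```python
SECTOR_BYTES = 512  # sector length quoted above


def sector_fit(record_bytes):
    """Return (records per sector, wasted bytes per sector) for a fixed-length record.

    Assumes records never straddle a sector boundary, which is the
    constraint designers of the era worked around.
    """
    if not 0 < record_bytes <= SECTOR_BYTES:
        raise ValueError("record must be between 1 and 512 bytes in this sketch")
    return SECTOR_BYTES // record_bytes, SECTOR_BYTES % record_bytes


if __name__ == "__main__":
    for size in (128, 100, 300):
        per_sector, wasted = sector_fit(size)
        print(f"{size}-byte record: {per_sector} per sector, {wasted} bytes wasted")
```

A 128-byte record packs four to a sector with nothing left over, while a 300-byte record wastes 212 bytes in every sector. An archaeologist who knows this can explain why an old record layout carries odd padding fields.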

The holy grail: a job
Is it worth acquiring this kind of knowledge to add a software archaeology string to your career bow? The increasing prevalence of automated tools makes it unlikely that developing a deep understanding of a dying technology will prove a career masterstroke.

“Anyone who can program a language in anger can pick up a new one easily,” says Software AG’s Hawkins. “When you get into the esoteric technologies, the question is: How much investment to make in the skills?”

Hawkins’ customers tend to vote for automation instead of teaching their people dwindling skills, and this seems to be where the new practice is heading. Like many other companies, Software AG offers products capable of analyzing and documenting old code, producing flowcharts and diagrams that explain its function and then suggesting ways it can be rebuilt. “Once you have your analysis done, you can change it automatically,” he says, warning that while automation is powerful it is still sensible to “test like buggery afterwards.”

Those who hope their more esoteric skills will turn into a few interesting contracts need not give up the idea of becoming archaeologists, however. Many businesses are partial to hiring skilled archaeologists for their first “dig” to help impart skills that may be needed as they delve further into their old code.

And if that means your PASCAL or COBOL skills only occasionally let you explore the software equivalent of an Antikythera Mechanism, the occasional chance to unlock the secrets of an ancient mystery is certainly better than missing out altogether.

It's the story of a man named Grady
One of the prime movers behind software archaeology is Grady Booch, Chief Scientist at IBM's Rational Software subsidiary. Booch’s brand of archaeology is not, however, a commercial project. He believes we must document old software while its authors are still alive to say how and why they created it, because the current generation of software represents the foundation of all future development.

“It is a sign of maturity for any given engineering discipline when we can name, study, and apply the patterns relevant to that domain,” he writes in his online Handbook for Software Architecture. “In civil engineering, one can study the fundamental elements of architecture in works that expose and compare common architectural styles. Similarly, in chemical engineering, mechanical engineering, electrical engineering, and now even genomic engineering, there exist libraries of common patterns that have proven themselves useful in practice.

“Unfortunately, no such architectural reference yet exists for software-intensive systems.”

Booch hopes his site will provide a collaborative environment in which netizens can start to gather and document old code, so that the Computer History Museum or a similar institution can eventually preserve it forever.

Booch’s aims are extremely laudable, although he is willing to admit they are also fuelled by a tiny amount of avarice.

“The third goal of this work is to feed my insatiable curiosity,” he writes. “Whenever I encounter an interesting or useful software-intensive system, I often ask myself, ‘how did they do that?’ By exposing the inner beauty of these systems through a study of their architectural patterns, I hope to offer inspiration to developers who want to build upon the experience of other well-engineered systems.”
