Undoing coding mistakes with refactoring

Wish you could turn back the clock on a development project? Methodology expert Steve Hayes shows how to correct the mistakes of the past with refactoring.

Every day, software developers make decisions that they later regret. They may be tiny decisions, such as the name to use for a variable, or they may be major decisions such as the metaphor to use for a new application, but no one gets it right every time. Even the most experienced, skilled software developers would like to be able to revisit earlier decisions, because it's the nature of their work to make decisions based on limited information, then to learn more over time.

Historically, software developers lived with the limitations of their earlier misconceptions until the burden became so great that they were compelled to rewrite the offending code. The associated costs were borne both by the developers and by the business people sponsoring the project. People lived with this constraint for two reasons: fully automated testing wasn't a commonplace practice, and there were no well-understood practices for making incremental improvements. In the agile development world, automated testing is now well established, and the approaches to making incremental improvements have been codified as refactoring.

Defining refactoring
The word “refactoring” is used in two distinct ways, to refer to activities at two quite different levels of abstraction. At the large scale, refactoring is the process of making design improvements that leave the external behaviour of a software application unchanged. At a smaller scale, refactoring is applying a specific pattern to improve one particular aspect of the design. For example, a very simple refactoring would be to replace a piece of code that previously required a comment with an invocation of a method whose name conveys the information that used to be in the comment. While this seems like a trivial change, refactoring also encompasses more drastic change, and the result of relentlessly applying even trivial refactorings, hundreds and thousands of times, can be a radical improvement in an application's design.
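The comment-replacing refactoring described above can be sketched in Java roughly as follows. This is only an illustration: the class name, method names, and the bulk discount rule are all hypothetical, invented for the example rather than taken from any real application.

```java
public class InvoiceExample {

    // Before: a comment is needed to explain what the condition means.
    static double priceBefore(int quantity, double unitPrice) {
        double total = quantity * unitPrice;
        // orders of 100 units or more get a 10% bulk discount
        if (quantity >= 100) {
            total = total * 0.9;
        }
        return total;
    }

    // After: the method names carry the information that used to be
    // in the comment, so the comment is no longer required.
    static double priceAfter(int quantity, double unitPrice) {
        double total = quantity * unitPrice;
        if (qualifiesForBulkDiscount(quantity)) {
            total = applyBulkDiscount(total);
        }
        return total;
    }

    // Hypothetical rule: 100 units or more counts as a bulk order.
    static boolean qualifiesForBulkDiscount(int quantity) {
        return quantity >= 100;
    }

    // Hypothetical rule: bulk orders are charged 90% of the total.
    static double applyBulkDiscount(double total) {
        return total * 0.9;
    }

    public static void main(String[] args) {
        // Both versions perform the same arithmetic, so they print the same values.
        System.out.println(priceBefore(150, 2.0));
        System.out.println(priceAfter(150, 2.0));
    }
}
```

The external behaviour is identical in both versions; only the way the code explains itself has changed.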

Of course, the idea of improving an existing program isn't a new one; every programmer has no doubt seen things that they would have liked to change. However, most programmers (quite sensibly) shy away from changing code that they don't absolutely need to, since they worry about inadvertently breaking existing correct behaviour. Taken alone, refactoring doesn't do anything to address this possibility, but almost every team that adopts refactoring as a standard part of their development process also adopts test-driven development, or some equivalent practice that ensures they can automatically retest the application after a change, to verify that the expected behaviour has been preserved.
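As a minimal sketch of what "automatically retest" can mean, the plain Java below records the expected results of a small piece of code and re-checks them on demand. In practice a team would more likely use a testing framework; the shipping cost rule, the class name, and the method names here are hypothetical and exist only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ShippingRegressionCheck {

    // Hypothetical code under test: the method a programmer wants to refactor.
    static int shippingCostInCents(int weightInGrams) {
        if (weightInGrams <= 500) {
            return 300;
        }
        if (weightInGrams <= 2000) {
            return 650;
        }
        return 1200;
    }

    // Expected results, recorded while the behaviour was known to be correct.
    // Keys are weights in grams, values are costs in cents.
    static Map<Integer, Integer> expectedCosts() {
        Map<Integer, Integer> expected = new LinkedHashMap<>();
        expected.put(1, 300);
        expected.put(500, 300);
        expected.put(501, 650);
        expected.put(2000, 650);
        expected.put(2001, 1200);
        return expected;
    }

    // Re-runs every recorded case and counts the ones that no longer match.
    static int countFailures() {
        int failures = 0;
        for (Map.Entry<Integer, Integer> entry : expectedCosts().entrySet()) {
            int actual = shippingCostInCents(entry.getKey());
            if (actual != entry.getValue()) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        int failures = countFailures();
        System.out.println(failures == 0
                ? "behaviour preserved"
                : failures + " check(s) failed");
    }
}
```

Running the check after each small change to `shippingCostInCents` gives quick feedback on whether the change altered the behaviour for the recorded cases.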

Refactoring in the IDE
Although all refactoring can be done manually, one step at a time, the trend for many languages is to include refactoring support in an integrated development environment (IDE). Automated refactoring is much easier to implement in languages that have automated memory management, so refactoring tools are more common for languages such as Java and C# than they are for C++. In the Java world, which I'm most familiar with, automated refactoring support is now expected in any serious IDE, and the number of supported refactorings is a point of competition between vendors.

Many programmers who use IDEs with automated refactoring support find that they allocate their design time differently. For example, rather than agonize over a class or method name when it's first required, they may use a name that's approximately right, and then rename it later when the concept has been clarified through use, confident that this will only take a few seconds. This underscores that refactoring isn't “rework”, in the sense of doing over something that was done badly the first time. Programmers who use refactoring still code to the best of their ability at each stage of development, but don't have to worry that they will have to live forever with any particular decision. This frees programmers to focus on getting something to work with a “reasonable” design and then improve it, rather than spend a lot of time implementing an “ideal” design that doesn't actually work. For experienced programmers, this probably sounds a lot like the way they already work, but with some extra rigour around activities that they're currently doing intuitively.

Refactoring entered the mainstream with the release of “Refactoring: Improving the Design of Existing Code” by Martin Fowler in 1999, and until recently refactoring has focussed primarily on the repeated application of the small-scale refactorings that Fowler described. In August this year, Joshua Kerievsky published “Refactoring to Patterns”, which combines the concepts of refactoring with the Gang of Four design patterns that many programmers are already familiar with, and provides techniques for performing larger-scale refactorings in a rigorous way.

This helps illustrate that there's a lot to learn about refactoring, and that a thorough knowledge of refactoring can help programmers make vast improvements in the development of new applications and the maintenance of existing ones.

Steve Hayes is the Software Development Manager at Internet Business Systems (IBS). IBS provides agile methods consulting and development services, and browser hosted solutions to the financial services industry.