Playing silly buffers: How bad programming lets viruses in

The buffer overflow is the mechanism of choice for the discerning malware merchant. So what is it and why does it overflow?

One of the most prevalent types of attack on networked computers is the buffer overflow. Searching for such vulnerabilities in Microsoft's Knowledge Base returns many hundreds of examples -- it's the mechanism of choice for the discerning malware merchant. New hardware and software techniques are reducing the incidence of this perennial problem, but it's unlikely ever to go away completely.

So, what is a buffer and why does it overflow? In the simplest terms, a buffer is an area of memory used to pass data between different devices or parts of software. Sometimes a buffer can be a physical memory chip -- printers and video cards have dedicated buffers, where the processor sends the data to be output -- but most often it's a temporary area in system memory set aside by one piece of software exchanging data with another. It's this kind of buffer that comes under attack, and there are hundreds if not thousands of them created and destroyed constantly as you use any piece of software.
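The pattern is easy to sketch in C (with made-up routine names, purely for illustration): one routine sets aside a buffer and hands it to another to fill.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical sketch: one routine fills a buffer the caller set aside.
   The routine names are illustrative, not from any real program. */
void produce_greeting(char *buf, size_t bufsize) {
    /* snprintf respects the buffer's size, so it can't overflow it */
    snprintf(buf, bufsize, "Hello, %s!", "world");
}

void exchange_data(void) {
    char buffer[32];   /* a temporary area of memory -- the buffer */
    produce_greeting(buffer, sizeof buffer);
    printf("received: %s\n", buffer);
}
```

When exchange_data returns, the buffer vanishes -- exactly the constant create-and-destroy churn described above.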

What looks to you like a single program, such as Word, is in fact made up of many components, known as routines, each a tiny program in its own right doing a single job. In a word processor, for example, there may be something that colours a block of text, another that underlines it, and yet another that recognises a block of text as a valid URL. These routines can be put to many different uses, like automatically highlighting a URL as you type one into a document.

As you type, whenever a space is entered the previous word is passed to the URL detector. If that finds the word is in fact a valid address, then the word processor will invoke the block colour and block underline code, so the URL appears as such on screen. Of course, the same is-this-a-URL? piece of code can be used by lots of other programs, such as email, database, and even Web browsing software -- and each one that uses it must create and use a buffer in the same way.
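A toy version of that shared is-this-a-URL? routine might look like the C below; a real detector accepts far more address forms, so this is only a sketch of the idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical is-this-a-URL? routine, shared by many programs.
   Real detectors are much more thorough. */
bool looks_like_url(const char *word) {
    return strncmp(word, "http://", 7) == 0 ||
           strncmp(word, "https://", 8) == 0 ||
           strncmp(word, "www.", 4) == 0;
}
```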

All code running on a computer has access to an area of memory called the stack. The processor has various special instructions designed to handle the stack and its contents quickly and efficiently, including handling addresses within the stack much faster than arbitrary locations elsewhere in memory. The stack is also automatically passed along when one routine uses -- calls -- another one, so it's where buffers are built and filled.
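A minimal C illustration of that, again with an invented routine name: the local array below is a buffer built on the stack when the routine is called, and destroyed again the moment it returns.

```c
#include <assert.h>
#include <string.h>

/* The local array is a stack buffer: created on entry, gone on return. */
size_t measure_name(const char *src) {
    char name[16];                      /* buffer lives on the stack */
    strncpy(name, src, sizeof name - 1);
    name[sizeof name - 1] = '\0';       /* make sure it's terminated */
    return strlen(name);                /* frame disappears after this */
}
```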

The stack also automatically contains a return address: when one routine calls another, the second routine needs to know where it was called from so it can hand control back. It's this use of the stack for both data and address information that makes buffer overflows such a tempting target. Often, the return address is very close to the buffer -- exactly how close depends on many things, but it's usually the same for a particular routine.

If the routine is tricked into writing too much to the buffer, the data it's storing will go off the end of one area and into the next -- potentially into the part where the routine will find its return address. At that point, all bets are off -- instead of returning safely to the code that called it, the routine will pass control to an address that was written in error. A crafty virus writer can force that address to correspond to their own, malicious code -- and they've won.
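Deliberately smashing a real stack is undefined behaviour, so here's a safe C simulation of the layout: a small buffer sitting a fixed distance before a field standing in for the saved return address.

```c
#include <assert.h>
#include <string.h>

/* Simulated stack frame: in a real frame the saved return address sits
   a short, fixed distance past the buffer, just as ret_addr does here. */
struct fake_frame {
    char buffer[8];
    unsigned long ret_addr;   /* stands in for the return address */
};

/* The classic mistake: copying without checking the length first */
void careless_copy(struct fake_frame *f, const char *src, size_t len) {
    memcpy(f->buffer, src, len);    /* len > 8 spills into ret_addr */
}
```

Feed careless_copy more bytes than the buffer holds and the excess lands in ret_addr -- exactly the overwrite described above.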

A classic buffer overflow trick is to fool something that's trying to interpret directory and file names. Say there's a rule in an operating system that no directory name on disk can be more than 256 bytes long. The person who writes the routine may think that this means the buffer for the directory name also only needs to be 256 bytes long -- a reasonable assumption. But elsewhere in the operating system, there's a specification that says a character can be represented by an escape sequence, so ^84 is the same as the letter T -- and strings written in that form can be three times as long. If the routine doesn't know to check for that, it can easily end up copying far more than 256 bytes into the buffer even though the name sticks to what the writer thought were the rules. The programmer could have chosen to check for an overflow by counting bytes -- but that would involve more programming, slow the routine down and introduce more chances of error. At least, that would probably be the excuse: assumptions, ignorance and laziness are behind many buffer vulnerabilities.
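A C sketch of that mismatch, using an invented ^NN escape syntax modelled on the ^84 example: a perfectly legal 256-byte name balloons to three times the size the rule suggests once it's encoded.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define NAME_LIMIT 256   /* the "no name longer than 256 bytes" rule */

/* Encode each character as ^NN (its character code), so ^84 == 'T'.
   The syntax is invented here purely for illustration. */
size_t escape_encode(const char *in, char *out, size_t outsize) {
    size_t used = 0;
    for (; *in != '\0'; in++) {
        if (used + 5 > outsize)   /* no room left for "^NNN" plus NUL */
            break;
        used += (size_t)sprintf(out + used, "^%d", (unsigned char)*in);
    }
    return used;
}
```

A 256-byte name made of the letter T encodes to 768 bytes -- a routine that copies the encoded form into a 256-byte buffer overflows it badly, despite the name itself obeying the rule.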

A virus writer uses all of the above. They know where on the stack the return address is, they know how big the buffer is, and they know how far apart the two are. If they can fool the routine that writes to the buffer into writing just that little bit more -- and arrange for their own address to be copied at just the right place to overwrite the original return address -- they can take control of the computer. They can put their own malicious code in the buffer itself, and thus install and transfer control to a bad routine just by presenting the right data in the right way. They don't need user names, passwords or security privileges -- the operating system will think the malicious code is running under whoever's privileges were in use at the time.

Various ways exist to catch this behaviour. One of the most common -- now included by Microsoft in Windows Server 2003 -- is to generate a very hard-to-guess number and put it in a place in memory with no connection to any vulnerable buffers. Whenever a routine is called, a copy of this number -- called a cookie by Microsoft and a canary by everyone else -- is put on the stack just before the return address. The called routine does its job as usual, but immediately before taking the return address from the stack it checks the canary against the reference copy. If something has overwritten the stack on the way to the return address, the canary will have been destroyed, and the routine knows not to try to return control but to stop the software with an error.
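Here's a deterministic C simulation of that check. In real systems the canary is a hard-to-guess random value picked at start-up; this sketch takes the secret as a parameter so the example is reproducible.

```c
#include <assert.h>
#include <string.h>

/* Reference copy, kept well away from any vulnerable buffer */
static unsigned long canary_reference;

void canary_init(unsigned long secret) {
    /* real implementations generate a hard-to-guess random value here */
    canary_reference = secret;
}

/* Simulated frame: the canary sits between buffer and return address */
struct guarded_frame {
    char buffer[8];
    unsigned long canary;
    unsigned long ret_addr;
};

void frame_enter(struct guarded_frame *f) {
    f->canary = canary_reference;   /* planted on the way in */
}

/* Returns 1 if ret_addr is safe to use, 0 if the stack was smashed */
int frame_check(const struct guarded_frame *f) {
    return f->canary == canary_reference;
}
```

Any overflow that reaches the return address must trample the canary first, and the check on the way out catches it.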

This works well, catching both malicious code and innocently written stack-trashing bugs. Because the canary is effectively random, a virus can't guess what number it's overwriting, and the check doesn't affect the normal running of the code. However, there are still potential vulnerabilities -- if the reference copy of the canary in shared memory can be changed by an exploit, then the mechanism can be bypassed, and the error reporting mechanism itself can also be attacked.

There are other ways. Both AMD and Intel have said that they are adding hardware support to their processors to stop the exploitation of buffer overflows: in effect, adding the ability to make critical areas of memory incapable of holding code that will execute. The processor can read and write it as usual so a buffer overflow can happen, but if the compromised address tries to transfer control to within the buffer -- where the virus lives -- the processor will refuse and an error will be generated.

However, there are good reasons why executable code may want to live on the stack, so such a technique will not be universally applicable in the future. Likewise, while the canary technique catches a good many classes of vulnerabilities there are other places where buffers full of data and addresses for executable code live side by side, in existing software as well as in stuff that's yet to be written.

In the end, we can only say that more tools will exist to catch or stop buffer overflow vulnerabilities from happening. Some will be in the operating system, some will be available for programmers to use if they wish. But good programmers have always been able to write code that is highly resistant to buffer overflows, while bad programmers will always be able to leave room for the unexpected case to cause unwelcome consequences. Poor programming, like poor people, will be with us always: education and higher standards will do as much to keep our buffers safe as anything else.