The Google search engine reports that more than half a million MS Word .doc files are currently available for download from dot-com Web sites. Of these, a small but significant percentage were created using versions of the software known to produce "leaky" documents.
First discovered in 1998, the bug causes random fragments of data from previously deleted files to be included in areas of a document that are otherwise unused. This random data can contain virtually anything that might once have been stored on the creator's computer, including passwords, sections of other documents and correspondence.
Anyone downloading affected documents and browsing them with a binary editor can easily view this extra information, although it remains otherwise invisible.
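As a rough illustration of what browsing a file with a binary editor reveals, the sketch below pulls runs of printable ASCII characters out of raw bytes, much as the Unix strings utility does. The function name printable_runs and the six-character minimum are assumptions chosen for this example, and it makes no attempt to parse the .doc format or to tell leaked fragments apart from a document's legitimate text.

```python
import re
import sys
from typing import List


def printable_runs(data: bytes, min_len: int = 6) -> List[str]:
    """Return every run of at least min_len printable ASCII bytes in data."""
    # Bytes 0x20 through 0x7e are the printable ASCII range (space to tilde).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


if __name__ == "__main__":
    # Usage: python printable_runs.py some_document.doc
    with open(sys.argv[1], "rb") as handle:
        for run in printable_runs(handle.read()):
            print(run)
```

Text stored in other encodings, such as 16-bit Unicode, would not be caught by this pattern, so the output should be treated as a partial view of what a binary editor shows.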
The applications responsible for producing these leaky documents were Microsoft Word versions 6.0 and 7.0 plus version 7.0 of PowerPoint and Excel. Although a patch was quickly released to plug the hole, documents created before the patch was applied, and not subsequently edited, may still contain the unexpected snippets of sensitive data.
U.S. Government Web sites also appear vulnerable to these potential legacy leaks, with some 240,000 MS Word documents and 32,000 PowerPoint files listed by Google under the dot-gov top-level domain. A small sampling indicates that up to 5 percent of these documents may have been created with the buggy versions of the software.
The problem appears to be a global one, although more pronounced in areas where the Net was in common use before the flaw was uncovered. Potentially leaky documents have been discovered on the government Web sites of a number of other countries including Canada, France, Australia and New Zealand.