Since my initial report on August 6 of the Xerox scanning bug, the company and outsiders have been working to clarify the problem and provide guidance to customers.
The bug caused scans of documents with small numerals in them to have the numbers changed in the scanned image. For example, the paper version might have a '6' but the scanned document would show an '8'.
The problem was initially found by German computer scientist D. Kriesel, who reports that Xerox has worked hard on the problem and kept him in the loop. According to Kriesel, '…the issue indeed affects all compression modes across a whole lot of devices, all are able to mangle numbers'. He adds that Xerox has identified the bug, that a patch is being developed, and that the company wants him to test it with the originals that exposed the bug in the first place. Smart move.
Xerox has produced documents for customers on the matter. In an entry on the company's Real Business at Xerox blog, Rick Dastin, corporate vice president and president of the Office and Solutions Business Group at Xerox, reveals that the problem may manifest in all settings of the products. The company is working on a patch to fix the bug.
The company's initial responses indicated that the problem only happened with quality settings lower than the defaults for the product. The problem is indeed most likely to occur with 'stress documents' ('documents with small fonts, those scanned multiple times and hard to read').
The company says that this is a complete list of affected products:
- ColorQube: 87XX, 89XX, 92XX, 93XX
- WorkCentre: 5030, 5050, 51XX, 56XX, 57XX, 58XX, 6400, 7220, 7225, 75XX, 76XX, 77XX, 78XX
- WorkCentrePro: 2XX
- BookMark: 40, 55
Xerox also produced a FAQ document entitled 'Xerox Scanning Update: What You Need To Know', which adds several other useful data points:
- Only scanning is affected, not copying or faxing
- The 'character substitution' bug is unlikely to occur when the products are configured with default settings
- The affected component in the scanning software is the JBIG2 compressor. JBIG2 is an industry standard for image compression. Character recognition is part of the spec and one of the reasons for the high compression rates.
- The patch will be available 'within a few weeks'.
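
To see how a compressor that reuses look-alike symbols could turn a '6' into an '8', here is a toy Python sketch. It is not the JBIG2 codec and not Xerox's code: the 3x5 glyph bitmaps, the max_diff threshold, and every function name are invented for illustration. The only idea it borrows is storing each distinct shape once and reusing it for anything judged close enough.

```python
# Toy illustration of lossy symbol matching. This is NOT the real JBIG2
# codec or Xerox's implementation; bitmaps and thresholds are made up.

GLYPHS = {
    "8": ("###", "#.#", "###", "#.#", "###"),
    "6": ("###", "#..", "###", "#.#", "###"),
    "1": (".#.", "##.", ".#.", ".#.", "###"),
}


def hamming(a, b):
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def encode(page, max_diff):
    """Turn a list of glyph bitmaps into (stored symbols, indices).

    A glyph within max_diff pixels of an already stored symbol reuses
    that symbol instead of being stored itself (hypothetical rule).
    """
    symbols, indices = [], []
    for glyph in page:
        for i, stored in enumerate(symbols):
            if hamming(glyph, stored) <= max_diff:
                indices.append(i)  # reuse the look-alike symbol
                break
        else:
            symbols.append(glyph)  # no close match: store a new symbol
            indices.append(len(symbols) - 1)
    return symbols, indices


def decode(symbols, indices):
    """Rebuild the page from the stored symbols."""
    return [symbols[i] for i in indices]


def read_back(bitmaps):
    """Map decoded bitmaps back to characters so the result is readable."""
    lookup = {bitmap: char for char, bitmap in GLYPHS.items()}
    return "".join(lookup[b] for b in bitmaps)


def scan(text, max_diff):
    """Simulate scanning a string of digits at a given match tolerance."""
    page = [GLYPHS[c] for c in text]
    return read_back(decode(*encode(page, max_diff)))


print(scan("8616", max_diff=0))  # exact matching only
print(scan("8616", max_diff=1))  # one-pixel tolerance
```

With exact matching the digits survive, but with a one-pixel tolerance the '6' glyphs are judged the same as the stored '8' and the decoded page silently shows the wrong number. The output still looks crisp, which is why this kind of error is harder to spot than ordinary blur.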