- High text recognition accuracy
- Good formatting preservation
- Usable interface
- PDF facilities could be better
ScanSoft's OmniPage Pro 11 is a new version of the market-leading optical character recognition (OCR) package. Like most OCR programs, it not only recognises scanned text, but can also preserve page formatting details, including multiple columns and tables.
New features in version 11 include PDF import and export, a built-in document editor and, it's claimed, general improvements to accuracy. The PDF import allows you to take a PDF document and convert it to any of the output formats supported by OmniPage Pro. The package supports over 100 languages, including those written in the Greek and Cyrillic alphabets.
Any TWAIN-compliant scanner can be used with the software, and a scanner test utility is included. This is run automatically the first time you use OmniPage Pro 11, but can also be run manually at any time later. We tested OmniPage Pro 11 under Windows ME using a Umax Astra 4000U USB scanner.
There are three operational modes: AutoOCR, which makes most of the choices for you; manual OCR, where you're in complete control; and the OCR Wizard. The Wizard presents you with each of the available options and allows you to choose the right ones for the type of document you're scanning. This is less laborious than setting each option individually, but for less experienced OCR users AutoOCR is probably the better option.
The screen is split into three panes, which show thumbnails of each page, the zoning for the current page and the recognised text for the current page. This makes it straightforward to navigate around the document and make any changes. You won't find yourself needing to go back and forward very often though, since OmniPage automatically starts proofreading the document once it has finished recognising the first page. Scanning (or reading from file), recognising and proofing all take place simultaneously, so you're not left waiting for a large document to finish scanning before you can start working on it.
The proofing process only covers words that OmniPage is unsure of, or which don't appear in its dictionary (and hence are likely to be errors). You don't have to check every inch of the document -- something of a boon on longer documents. For problem words you're shown the original scanned image.
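To illustrate the kind of filtering described here, the following is a minimal sketch of how a proofing pass might pick out only the suspect words. It is not ScanSoft's implementation: the `RecognisedWord` type, the `confidence` score, the `min_confidence` threshold and the `words_to_proof` function are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass
class RecognisedWord:
    """One word from the OCR engine (hypothetical structure)."""
    text: str
    confidence: float  # assumed score between 0.0 and 1.0


def words_to_proof(
    words: Iterable[RecognisedWord],
    dictionary: Set[str],
    min_confidence: float = 0.9,  # assumed threshold, not a real OmniPage setting
) -> List[RecognisedWord]:
    """Return only the words a human should check."""
    flagged = []
    for word in words:
        # Ignore surrounding punctuation and case when consulting the dictionary.
        normalised = word.text.strip(".,;:!?\"'()").lower()
        low_confidence = word.confidence < min_confidence
        unknown = bool(normalised) and normalised not in dictionary
        if low_confidence or unknown:
            flagged.append(word)
    return flagged


if __name__ == "__main__":
    page = [
        RecognisedWord("The", 0.99),
        RecognisedWord("qu1ck", 0.97),  # confident, but not a dictionary word
        RecognisedWord("fox", 0.55),    # a real word, but the engine was unsure
    ]
    for word in words_to_proof(page, {"the", "quick", "fox"}):
        print(word.text)
```

Under these assumptions only "qu1ck" and "fox" would be put in front of the user, which is why proofing a long document takes far less time than reading it in full.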
The PDF Import facility is perhaps an admission that scanning paper documents is less relevant in the Internet age. Nevertheless, it could prove a useful feature. Even with Adobe's own Acrobat tools, it's difficult to turn a PDF document back into an application-specific document without losing information. Unfortunately, this feature is rather disappointing in OmniPage Pro 11, since it doesn't appear to use all the information in the PDF document. For instance, the text is created using OCR, rather than being read from the text embedded in the PDF document. This means that you still have to proofread the OCR'd document despite it having come from a file. The formatting of the document is as well preserved as that of any scanned document, so it's a shame that, given the possibility of 100 percent accurate text, ScanSoft has chosen this method of processing PDF documents.
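The alternative we would have liked to see can be sketched in a few lines: use the text embedded in the PDF page where there is any, and fall back to OCR only for pages that are pure images. This is an illustration of the idea, not a description of how OmniPage Pro 11 or any PDF library works. The `page_text` function, its parameters and the `run_ocr` callback are hypothetical names.

```python
from typing import Callable, Optional, Tuple


def page_text(
    embedded_text: Optional[str],
    run_ocr: Callable[[], str],
    min_chars: int = 1,  # assumed cut-off for "this page has usable text"
) -> Tuple[str, bool]:
    """Return (text, needs_proofreading) for one PDF page.

    embedded_text is whatever text layer was extracted from the page,
    or None if there is none; run_ocr is called only when it is needed.
    """
    if embedded_text is not None and len(embedded_text.strip()) >= min_chars:
        # Text taken straight from the file should need no proofreading.
        return embedded_text, False
    # Image-only page: recognise it and flag the result for checking.
    return run_ocr(), True


if __name__ == "__main__":
    print(page_text("Hello world", lambda: "He11o wor1d"))
    print(page_text(None, lambda: "He11o wor1d"))
```

With this approach only scanned, image-only pages would carry the usual proofreading burden.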
We can't really comment on the accuracy of OmniPage Pro 11 relative to previous versions, but the level of errors we saw during testing was very low. Even so, if you're happy with the level of accuracy in your existing package you probably shouldn't upgrade, as some of the new features don't impress as much as they perhaps should. That's not to say this isn't a good product: for accurate OCR with good formatting preservation, OmniPage Pro 11 is just fine.