I had looked at Readiris OCR (version 6) previously, but did not review it then because it had already been reviewed. I did, however, review the IRIS Pen II at that time, and I continue to use it, especially with my laptop. The IRIS Pen's OCR quality is exceptional, and I use it because of the variety of applications I can output to. Given my satisfaction with the IRIS Pen, and since this is two releases later, I felt it was time to review Readiris Pro version 8. In the interval, ScanSoft Inc. acquired most of the leading OCR packages from Caere and Xerox, and then developed and released OmniPage Pro 12, which offers a feature set equivalent to that of Readiris. The OCR market thus appears to have only these two leading products competing with new releases and features.
Readiris Pro 8's installation is typical: put in the CD, respond to a few questions, wait a few moments, and you are ready to OCR documents. Some extras come with the product: Cardiris LE, a limited edition of IRIS's product for scanning and filing business cards into a database; PhotoScore Lite from Neuratron, which does limited scanning and conversion of musical scores to MIDI file format; and Acrobat Reader from Adobe for reading the manuals and other information.
I will focus on two advertised features of the product: speed (up to 1300 characters per second) and tools (powerful and easy-to-use tools to convert documents, including .PDF files, into editable files). Other features that make Readiris Pro 8 a better product: it recognizes up to 104 different languages, including four Asian languages; it offers seven user interface languages (English, Dutch, French, German, Italian, Spanish and Brazilian Portuguese); and its input and output options are broader and more user friendly.
Speed is relative, but it can be tested. Readiris claims a speed of 1300 cps (characters per second), which would make it the fastest product on claimed speed. Some cautions about speed tests and results are needed. First, character recognition speed is measured after the scan process has acquired the text. Second, every OCR program must learn the fonts and sizes before any speed can be realized. In testing, I calculated speeds exceeding 1200 cps, fairly close to the claim; however, accuracy was sometimes affected by mistakes I made during the `learning' process. One run which used an Acrobat file came fairly close to 1300 cps. I was also impressed with the speed of scanning Acrobat documents. I timed a 110-page .pdf file, which scanned in 6 minutes (in total for the 3 operations required by a 50-page maximum limit) and then took about 12 minutes (including the font and line graphics `learning' time) to create the output, a full-page, formatted .rtf file. I would conclude that Readiris can capture the speed title.
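The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration, not part of Readiris: the function names are made up for this example, and the 39,000-character sample is a hypothetical figure chosen to match the advertised rate. The page counts and times are the ones quoted in the test above.

```python
import math


def chars_per_second(char_count: int, seconds: float) -> float:
    """Recognition speed: characters recognized divided by elapsed seconds."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return char_count / seconds


def batches_needed(pages: int, page_limit: int) -> int:
    """Number of scan operations needed when each one accepts at most page_limit pages."""
    return math.ceil(pages / page_limit)


def pages_per_minute(pages: int, minutes: float) -> float:
    """Overall throughput across scanning and output creation."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return pages / minutes


# Hypothetical sample: 39,000 characters recognized in 30 seconds.
print(chars_per_second(39_000, 30))             # 1300.0

# Figures from the review: 110 pages, 50-page limit per operation.
print(batches_needed(110, 50))                  # 3

# Figures from the review: 6 minutes scanning plus about 12 minutes of output.
print(round(pages_per_minute(110, 6 + 12), 1))  # 6.1
```

Note that the characters-per-second figure only covers the recognition step, so it should be timed separately from scanning and from the font `learning' prompts.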
The tools and OCR engine were great. Personally, I consider the OCR Wizard a definite keeper. I launched it for every OCR read I did, and got great results. Since I was doing different types of OCR scans for the review, this was very helpful. If, however, you scan the same type of document every time, you can use the `Scan/Open', `Sort', or `Recognize' buttons directly instead of running the wizard.
The basic interface is very easy to use. Almost every tool or button in the interface has options that provide more complex tools for input, OCR, and output.
Input sources are the good news here: input can come from a scanner, camera, fax, or any type of hardware image processor, and from a large number of file types, including the Acrobat .pdf format. I tested several types of Acrobat files (text searchable, read only, and data-entry), including files with tables. I scanned IRS Form 945-A, both as HTML from the Internet and as .pdf from a saved file. The form has lots of table grid to spare, as shown in the screenshot. This is certainly the best of the new features. People who deal with the web or Acrobat documents can finally convert .pdf files and web pages into editable, usable documents.
Output options have been greatly improved, giving the user greater flexibility in how to work. Previous versions of OCR products relied either upon a native interface to MS Word or upon some macro to convert the output of the OCR package for a few other applications. Readiris outputs to a format that works with a specific application, then calls that application and hands off the converted file. You can specify the application you want the program to send to, or the kind of file to save the converted document to. The 17 applications that Readiris will hand off to include MS Word, MS Excel, OpenOffice, Acrobat (with or without graphics), Corel WordPerfect, an HTML editor, Internet Explorer, Netscape, and StarOffice 6. There are 30 file output formats, which extend beyond the 17 applications to cover other applications and older versions. For example, Readiris saves WordPerfect output in .rtf format, or in the older WordPerfect 4.2 .txt format. I believe this is a much better approach than making one application the default output and forcing users into work-around solutions. The disadvantage is that the converted document in some applications did not match the original formatting exactly. A single-application output, however, leaves the user in the same situation, with more work needed to convert to another file format.
The processing of the scanned image into a converted document can be done in two ways. The easy approach for the user is handled by the Wizard's Autoformat™ technology. However, not every document behaves, so there are tools for the user to direct which areas are to be read as text and which are to be treated as graphic images. The `Sort' button lets the user arrange the areas in recognition flow order: yellow for text areas, blue for graphics areas, and purple for table areas. Skewed text is adjusted automatically, adjusted manually, or ignored, depending upon user selection.
The tool columns on each side allow many adjustments to each page being scanned. The interface is convenient, and the tool tips help identify the service each button or tool provides. The only `reading' problem I had involved a document with two (or more) font types, one of which was a small caps font. If the shapes of the small caps font are similar enough to those of the other font(s), you may get a number of words containing capitalized letters inside. The problem comes from the font learning process. In that situation, click `Do Not Learn' on the small caps font. Obviously the learning process will affect accuracy and speed. It is wise to go through a document visually before running the recognize process. Special attention should be paid to logos, non-traditional accents, symbols, and other details that need to be treated as graphics. Additionally, formatting from a landscape-oriented document (like the IRS 945-A form) tends to be the hardest to render proportionally in the converted document, and will require tweaking to match the original page.
Overall, I am very impressed with this OCR product. Speed is a definite feature of the product, and in the absence of speed claims in the competition's descriptions, I would have to agree with IRIS's claim of fastest text recognition. The speed and capability I observed when scanning Acrobat documents are impressive and among the best selling points for the product. The recognition accuracy and the faithfulness of format rendering were very good.
Faster speed, lower price, great accuracy, support for multiple and mixed language fonts, and a very user-friendly interface all result in a strong recommendation for Readiris Pro 8 to anyone needing OCR capability. Though not reviewed here, there is also a corporate version of the product, which may meet heavier demands and serve multiple users within the same organization.
Readiris Pro 8
I.R.I.S. - North American Offices.
1600 N.W. Boca Raton Blvd, Ste 20
Boca Raton, FL 33432
Requirements: at least a 486 processor, 64MB of memory, 95MB for installation,
Windows 95 and up.
Additional requirements: TWAIN-compliant scanner, Internet connection for website access.