
Advanced OCR
High Tech Center Training Unit 21050 McClellan Rd. Cupertino, CA 95014 www.htctu.net

Foothill – De Anza Community College District California Community Colleges

Advanced OCR with OmniPage and FineReader

10:00 A.M. Introductions and Expectations
    FineReader in Kurzweil
    Basic differences: cost (Abbyy $300, OmniPage Pro $150/Pro Office $600); automating; crashing; graphic vs. text

10:30 A.M. OCR program: Abbyy FineReader www.abbyy.com
    Looking at options
    Working with TIFF files
    Opening the file
    Zoom window
    Running OCR
        layout preview
        modifying spell check
        looks for barcodes
    Blocks
    Block types
    Adding to blocks
    Subtracting from blocks
    Reordering blocks
    Customize toolbars
    Adding reordering shortcut to the tool bar
    Save and load blocks
    Eraser
    Saving
    Types of documents
    Save to file
    Formats settings
    Optional in Word: remove optional hyphen (Tools > Format Settings)
    Tables
        manipulating
    Languages
    Training

11:45 A.M. Lunch

1:00 P.M. OCR program: ScanSoft OmniPage www.scansoft.com
    Looking at options
    Languages
    Working with TIFF files
    SET Tools (see handout)
    Opening the file
    View toolbar with shortcut keys (View > Toolbar)
    Running OCR
    On-the-fly zoning
        modifying spell check
    Zone type
    Resizing zones
    Reordering zones
    Enlargement tool
    Ungroup
    Templates
    Saving
    Save individual pages
    Save all files in one document
    One image, one document
    Training
    Format types
    Use True Page for PDF, not Word
    Use Flowing Page or Retain Fonts and Paragraphs for Word
    Optional hyphen in Word
    Tables
        manipulating
    Scheduler/Batch manager: Workflow
    Speech
    Saving speech files (WAV)
    Creating a Workflow

www.htctu.net rev. 9/27/2011

2:30 P.M. Break

2:45 P.M. OmniPage and FineReader head to head more complex documents technical documents

4:30 P.M. Wrap-up

4:45 P.M. End

Objectives

Participants will be able to do the following:
1. understand the OCR process
2. use the basic functions of OmniPage and FineReader
3. use zones/blocks to facilitate the OCR process
4. compare and contrast OmniPage and FineReader

Advanced OCR

High Tech Center Training Unit

of the California Community Colleges at the Foothill-De Anza Community College District

21050 McClellan Road Cupertino, CA 95014 (408) 996-4636 (800) 411-8954

www.htctu.net

URL to our CC license: http://creativecommons.org/licenses/by-nd-nc/1.0/

Creative Commons website: http://creativecommons.org

Table of Contents

Basic Workflow
Creating the Image File
Abbyy FineReader
    Interface
    Toolbar Set-up
    Options Set-up
        Document Tab
        1. Scan/Open Tab
        2. Read Tab
            Important!
        3. Save Tab
        View Tab
        Advanced Tab
        Spell Checker Settings
    Processing an Image (TIFF or PDF) File
        Step One: Open an Image File or a PDF File
        Step Two: Analyze Layout
        Step Three: Adjust Areas
        Step Four: Read Document
        Step Five: Check Spelling
        Step Six: Save the Document
    FineReader Tips
    Automating Tasks
        Creating an Automated Task
OmniPage Pro
    Interface
        Document Manager
        Configuration for Blind User
    Toolbars
    Options Set-up
        OCR Tab
        Process Tab
        Proofing Tab
        General Tab
        Text Editor Tab
        Scanner Tab
    Processing an Image (TIFF or PDF) File
        Step One: Load a File
        Step Two: Run the OCR
        Step Three: Adjust Zones
        Step Four: Save the Document
    OmniPage Tips

www.htctu.net ii Rev. April 27, 2010

Basic Workflow

1. Remove spine from book.
2. Separate pages in book page-by-page (have pages at least six inches apart; glue can be transparent and stretchy!).
3. As you separate the pages, get a sense of the book, and choose a few representative pages. Note if there are pages that may require different scanner settings; sticky notes help make it easy to return the pages later. (For easy books, 1 page may be enough, and usually 6 or so is plenty.)
4. Scan those pages.
5. Run OCR on pages.
6. If you’re getting more than one recognition error per page, go back and adjust the scanner settings.
7. Rerun steps 4–6 until the recognition errors drop. (As an aside, I find that most people go too quickly through the scanning step and do not get a good scan. The result is hours and hours of editing later!)
8. During the test-OCR phase, use your test pages to create a template for the book in your OCR program (OmniPage or FineReader).
9. Scan the book, usually in chapters, but you may scan the entire book, depending on your policies/procedures.
10. Open the TIFF files in a review program (Microsoft Office Document Imaging works well and is free), and rescan any pages that did not scan well.
11. OCR the book using the template you created.
12. Edit the book in your OCR program.
13. Save your OCR files, as well as any formats you create.

BASIC WORKFLOW CHECKLIST
• Remove book spine
• Separate pages
• Choose a few representative pages
• Scan test pages
• Run OCR on test pages
• Adjust scanner settings if needed
• Create a template
• Scan the book
• Review the scanned files
• OCR using the template
• Edit
• Save
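The test loop in steps 4 through 7 boils down to one question: are the test pages still averaging more than one recognition error per page? Here is a minimal sketch of that check in Python. It assumes you have counted the errors by hand on each test page and that "per page" means an average across the test pages; the function name and the sample counts are made up for illustration and are not part of either OCR program.

```python
def needs_scanner_adjustment(errors_per_page, max_avg_errors=1.0):
    """Return True if the test pages average more errors than allowed."""
    if not errors_per_page:
        raise ValueError("count errors on at least one test page")
    average = sum(errors_per_page) / len(errors_per_page)
    return average > max_avg_errors


# Hand-counted recognition errors on six representative pages (made-up numbers).
first_pass = [3, 0, 2, 4, 1, 2]    # averages 2.0 errors per page
second_pass = [1, 0, 0, 1, 0, 1]   # averages 0.5 errors per page

print(needs_scanner_adjustment(first_pass))   # adjust the scanner and rescan
print(needs_scanner_adjustment(second_pass))  # settings are good enough
```

If you would rather be strict and rescan whenever any single page has more than one error, replace the average with `max(errors_per_page)`.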


Creating the Image File

Although you can scan with either OmniPage or FineReader, we recommend that you scan your files to TIFF, using the scanning utility that comes with your scanner, and then work with the resulting multipage image. There are a number of reasons: it preserves the TIFF files for later use with other applications, it prevents problems with crashing in the middle of scans, and it allows you to take full advantage of the options that are built into your scanner. Please note that you can combine multiple scanned files (TIFF, JPEG, etc.) within the OCR program.
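When you combine several scanned files, the order in which they are loaded matters, and a plain alphabetical sort puts page10 ahead of page2. The sketch below shows one way to list the image files in a folder in page order before loading them. It is only an illustration under my own assumptions: the file extensions, the function names, and the idea that your scanner numbers its files are not taken from either OCR program.

```python
import re
from pathlib import Path

# Image types we expect from the scanner (an assumption; adjust as needed).
IMAGE_SUFFIXES = {".tif", ".tiff", ".jpg", ".jpeg"}


def natural_key(name):
    """Split a file name into text and number parts so page2 sorts before page10."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def scans_in_load_order(folder):
    """List the image files in a folder in the order they should be loaded."""
    files = [p for p in Path(folder).iterdir()
             if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES]
    return sorted(files, key=lambda p: natural_key(p.name))


# Sorting names alone shows the difference from a plain alphabetical sort.
names = ["page10.tif", "page2.tif", "page1.tif"]
print(sorted(names))                   # alphabetical: page1, page10, page2
print(sorted(names, key=natural_key))  # page order: page1, page2, page10
```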

Abbyy FineReader

ABBYY USA 47221 Fremont Boulevard Fremont, CA 94538 510-226-6717 www.abbyyusa.com

Interface


Toolbar Set-up

Go to View > Toolbars. Turn on the Quick Access Toolbar.

Go to View > Toolbars > Customize Toolbars…. Set the Categories to “Image,” and find the Eraser tool under Commands. Click the double-headed arrow to move the eraser tool onto your Quick Access Toolbar.

Choose View > Customize Toolbars. Move the Order Areas icon onto the Quick Access Toolbar.


Options Set-up

In the menus, go to Tools > Options or use the keyboard shortcut CTRL + Shift + O.

Document Tab

Under the Document tab, you can choose the languages that are in your document. Click the Edit Languages button to access mathematical and scientific options.


The more precise you can be about the languages in a document, the better your recognition will be. Remember to include “Greek” whenever you process math so that you recognize the Greek letters (like pi) that are used as mathematical symbols.
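The rule of thumb here (list only the languages that are really in the document, and add Greek whenever there is math) can be written down as a small helper. This is a sketch of the decision only; the function and its parameters are hypothetical and do not talk to FineReader, where you still make the selection by hand in the Document tab.

```python
def recognition_languages(primary="English", other_languages=(), has_math=False):
    """Build the shortest language list that still covers the document."""
    languages = [primary]
    for language in other_languages:
        if language not in languages:
            languages.append(language)
    if has_math and "Greek" not in languages:
        # Greek letters such as pi show up as mathematical symbols.
        languages.append("Greek")
    return languages


print(recognition_languages(has_math=True))                 # a math textbook
print(recognition_languages(other_languages=["Spanish"]))   # a Spanish reader
```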

1. Scan/Open Tab

Make sure that “Do not read and analyze” is selected. You want to be able to control what FineReader does, not have it start automatically. Do enable the image preprocessing. That feature allows FineReader to deskew and enhance the scanned image (or PDF). If you have two pages per image page, you can choose “Split dual pages.” It is better to rotate the pages yourself, rather than asking FineReader to do so.


2. Read Tab

Under the Read tab, you can choose the reading mode (Thorough) and opt to train the patterns of unusual fonts.

FineReader now behaves much more like OmniPage with its ability to train recognition of unusual fonts. In order to use this feature, make sure that “Train user pattern” is selected. To load an already established pattern, select “Use user pattern.” Below are the instructions from the FineReader help menu.

1. Select Tools > Options… and click the 2. Read tab.
2. Select Train user pattern under Training.
3. Click the Pattern Editor… button.
4. In the Pattern Editor dialog box, click New….
5. In the Create Pattern dialog box, type in a name for your new pattern and click OK.
6. Close the Pattern Editor and the Options dialog boxes by clicking the Close button in each.
7. On the toolbar at the top of the Image window, click Read.
8. Now if ABBYY FineReader encounters an unknown character, a Pattern Training dialog box will display the unknown character.
9. Teach new characters and ligatures.
   • A ligature is a combination of two or three "glued" characters (for example, fi, fl, ffi, etc.). These characters are difficult to separate because they are "glued" during printing. In fact, better results can be obtained by treating them as single compound characters.

Note: Words printed in bold or italic type in your text or words in superscript/subscript may be retained in the recognized text by selecting the corresponding options under Effects.

To go back to a previously trained character, click the Back button. The frame will jump to its previous location and the latest trained "character image-keyboard character" correspondence will be removed from the pattern. The Back button will function only within one word.

Important!
• You can only train the system to read the characters included in the alphabet of the recognition language. To train ABBYY FineReader to read characters that cannot be entered from the keyboard, use a combination of two characters to denote these non-existent characters or copy the desired character from the Insert Character dialog box (click to open the dialog box).
• Each pattern may contain up to 1,000 new characters. However, avoid creating too many ligatures, as this may adversely affect OCR quality.

3. Save Tab

Under the Save Tab, you can access settings for particular formats.


You can set the options for how FineReader will save documents into different formats. For Word documents, you can retain the page breaks, if you wish. Please note that it is generally best to use Formatted text for the layout, rather than Exact Copy or Editable Copy, which may result in text boxes.

View Tab

Under the View tab, you can now choose whether to see thumbnails or details in the document window. For those using screen readers, I would recommend using the Details panel.

Both views can be adjusted to include exactly the information that you want to show. Click the Customize button.


Note that in the combo box at the bottom of the page, you can choose which colors to use for highlighting uncertain characters, non-dictionary words, etc.

Advanced Tab

On the Advanced tab, you usually do not want to open the last FineReader document at startup.


Spell Checker Settings

Under the Spell Checker Settings button, you can uncheck the "Stop at words not found in dictionary" option if you will be spell checking anyway later in Word. If you will not be taking the document into Word, then go ahead and turn on the dictionary. Note that if FineReader is not quite getting the spacing around punctuation correct, you can adjust the setting in this window.

You can also choose specialized dictionaries in this tab by selecting the View Dictionaries button. Note that if you are running OCR on a science text related to medicine, biology, or chemistry, it is good to turn on the English Medical Terminology. Similarly, if you are working on a business or legal text, you may receive better results by loading the English Legal Terminology.
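The pairing of subject to dictionary described above is easy to keep as a small lookup table if you process many books. The sketch below is just that table in Python; the subject keywords and the function name are my own, and only the two dictionary names come from this handout.

```python
# Subject keyword -> specialized dictionary named in this handout.
SUBJECT_DICTIONARIES = {
    "medicine": "English Medical Terminology",
    "biology": "English Medical Terminology",
    "chemistry": "English Medical Terminology",
    "business": "English Legal Terminology",
    "legal": "English Legal Terminology",
}


def dictionaries_for(subjects):
    """Return the specialized dictionaries worth loading, without duplicates."""
    chosen = []
    for subject in subjects:
        name = SUBJECT_DICTIONARIES.get(subject.lower())
        if name and name not in chosen:
            chosen.append(name)
    return chosen


print(dictionaries_for(["Biology", "Chemistry"]))  # one medical dictionary
print(dictionaries_for(["History"]))               # nothing special to load
```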


Processing an Image (TIFF or PDF) File

Step One: Open an Image File or a PDF File

Click on the Open icon or use File > Open or use the keyboard shortcut Ctrl + O.


Step Two: Analyze Layout

On complex layouts, use the Analyze Layout option (Document > Analyze Layout or Ctrl + Shift + E) first, and then read the pages. This workflow reduces the chance of the program crashing and allows layout adjustment without running OCR more than once.

Step Three: Adjust Areas

Use the tools to add to or delete areas (formerly called blocks). To reorder areas, right click and turn on the Area Properties. It will display at the bottom of the image window. To change the reading order, simply change the number after the # sign.

Tip: To see more of the page, turn off the Area Properties by right-clicking on the image and deselecting “Area Properties.”


Step Four: Read Document

Go to Document > Read Document or use the keyboard shortcut Ctrl + Shift + R.


Step Five: Check Spelling

Go to Tools > Check Spelling. If you wish to change the settings for the spell checker, you can click on “Options.”


Step Six: Save the Document

Go to File > Save As > Microsoft Word Document.

Note that you can adjust your settings for a specific format by clicking on “Options.” To avoid text boxes, save the layout as “Formatted Text.”


FineReader Tips

Zoom window: FineReader has a zoom window (View > Zoom Window) that allows you to enlarge selected areas of the image or text.

Stop spell check: To tell the program just to find OCR errors and not unknown words, go to Tools > Options > Check Spelling. Uncheck "Stop at words not found in dictionary."

Document, Page, or Area: Controls are now broken out under menu choices based on the size of the overall section that you want to work on. To apply a command to the entire document, look under Document on the menu bar. When you wish to work on an individual page, look under Page, etc.

Eraser: The eraser tool allows you to edit the underlying TIFF file by deleting pixels (i.e., changing black to white). If you wish to save the changes, go under File to Save Images.

The eraser tool is now located under Page > Edit Page Image or use the shortcut keys Ctrl + Shift + C. The image for the page you are on will open in a new window titled Edit Image.
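If it helps to picture what the eraser is doing, think of the page image as a grid of pixels in which 1 is black and 0 is white. Erasing a speck or a stray line simply sets the pixels in the selected rectangle to white. The toy Python sketch below shows that idea on a tiny grid; it is an illustration of the concept only and has nothing to do with how FineReader stores or edits the real TIFF.

```python
def erase(bitmap, top, left, bottom, right):
    """Turn every pixel in the rectangle white (0); the bounds are inclusive."""
    for row in range(top, bottom + 1):
        for col in range(left, right + 1):
            bitmap[row][col] = 0
    return bitmap


# A 3 x 4 "page": 1 is a black pixel, 0 is a white pixel.
page = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],
]

erase(page, top=0, left=1, bottom=1, right=2)  # wipe the 2 x 2 smudge
for row in page:
    print(row)
```

Just as in FineReader, the change only lasts if you save the edited image afterward.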


Automating Tasks

You can now set up workflows in Abbyy FineReader.

Creating an Automated Task

As an example, we will create a workflow to OCR TIFF files that are in a folder and then save the files as a FineReader file. All files will be saved in one FineReader batch.

1. Start the Automation Manager (select Tools > Automation Manager or Ctrl + T).


2. In the Automation Manager dialog box, click the New button on the toolbar.

3. In the Automation Wizard dialog box, click the desired step from the list of steps in the left-hand pane. The selected step will be displayed in the right-hand pane.


Click Change… to change the properties of the step. In this case, click Change and then Browse to find the folder containing the files you wish to process.

If you want to be able to make adjustments to the layout, then save the document right after analyzing the layout.

If you simply want to read the document and save it, choose Read.


4. Once you have added all the necessary steps to your automated task and selected their properties, click Finish.
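An automated task is really just an ordered list of steps, and the only decision in this example is whether you stop after layout analysis (so you can adjust areas by hand) or go straight through reading. The Python sketch below models that choice. The step labels, class, and function are hypothetical names of my own, not the wording used in the Automation Wizard, and the folder path is only a placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class AutomatedTask:
    """An ordered list of steps, mirroring what the wizard collects."""
    name: str
    steps: list = field(default_factory=list)


def build_task(name, source_folder, adjust_layout_by_hand):
    """Assemble the steps for processing a folder of TIFF files."""
    task = AutomatedTask(name)
    task.steps.append(("open images from folder", source_folder))
    if adjust_layout_by_hand:
        # Stop after analysis so the areas can be corrected before reading.
        task.steps.append(("analyze layout", None))
    else:
        task.steps.append(("read", None))
    task.steps.append(("save as FineReader document", None))
    return task


task = build_task("Chapter scans", "C:/scans/chapter1", adjust_layout_by_hand=True)
for number, (action, detail) in enumerate(task.steps, start=1):
    print(number, action, detail or "")
```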


OmniPage Pro

Nuance 1 Wayside Road Burlington, MA 01803 781-565-5000 www.nuance.com

Interface

Document Manager

Of the two OCR programs, OmniPage is the more accessible for someone who is blind, and the main reason for that is the document manager. As a sighted user, I generally prefer to work with it off, since I do not want to allot the screen to it. For a quick check to see exactly where the program is in the OCR process, however, it can be very useful.

Alt + V, then N will toggle the document manager on and off. It appears at the bottom of the window, and the area can be modified by dragging the horizontal bar at the top of the pane.

The document manager is completely accessible with screen readers (JAWS, Window-Eyes, SuperNova, etc.) and will give you a lot of information about your OmniPage document. You will need to configure it, however, or else it will provide a lot of confusing, extraneous information.


Configuration for Blind User

The directions below are written to be usable by someone who is blind or sighted.

OmniPage includes four possible views:
1. The thumbnail images: This view shows thumbnails for each page and allows easy mouse navigation from one page to another. Once in this pane, you can also navigate with the arrow keys.
2. The page image: This is the original image of the document. Both the page image and thumbnail image are part of the “Image Panel.” This panel is optional for a blind user.
3. The text editor: This view shows the text after OCR has been run.
4. The document manager: This view provides a great deal of information about the document, and it is fully accessible to screen readers and refreshable Braille displays.

All of these views can be turned on and off under the View (Alt + V) menu on the menu bar. Please note that the F6 key will move you from one pane to the next.

To configure the document manager, you need to go under the View menu to Customize Columns (Alt + V, then C).

The default at the beginning is to have all the columns on. This is not helpful; in fact, some of them show icons, which is totally useless for someone who is blind.

I recommend just using the first six columns on the list:
• Status
• Characters
• Suspect words
• Reject characters
• Words
• % Accuracy

You may find that you want to add other columns later as you get used to working with the program.


If the screen reader does not immediately start reading the choices in the choice box, Alt + C should move the focus to that window. These are check boxes, so you can use your space bar to select and deselect.

Mostly, you will want to deselect. Leave the first five selected then go through the list and deselect the rest. The down arrow key will move you properly from one to the other.

A minor point, but once you have deselected the rest of the long list, you might want to use your up arrow (just hold it down until you hear “Status”) to go back to the top of the list. The columns are set too narrow for a sighted person to read the labels, and I do not know whether every screen reader will read them configured like that or not.

You can reset the sizes as follows: With the cursor on the column name (for instance, Status is the first column name), go to the Column Width edit box (Alt + W). Type in 100 (the unit is pixels). Alt + C will move you back to the column list and you can arrow down to the next column name and repeat the process.

If you are using a Braille display, you may want to play with the column sizes a bit more to get a precise fit. The 100 pixel width is just a rule of thumb. For a more precise fit, try starting with these settings:
• Status: 67 pixels
• Characters: 68 pixels
• Reject Characters: 103 pixels
• Words: 48 pixels
• Suspect Words: 87 pixels

One more refinement: I would recommend going to the “Reject Characters” choice and moving it up (Alt + U). Now arrow down to the “Words” choice and again move it up (Alt + U). This order change allows you to hear total characters followed by rejected ones and then total words followed by suspect words, which gives you a sense of how well-recognized the page was. Too many errors, and you may need a better scan.
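If you want to double-check what the two "move up" commands do to the column order, the short Python sketch below replays them on a list. It assumes the columns start in the order given earlier (Status, Characters, Suspect words, Reject characters, Words); the function is my own stand-in for pressing Alt + U and is not an OmniPage feature.

```python
def move_up(columns, name):
    """Swap a column with the one above it, like pressing Alt + U."""
    index = columns.index(name)
    if index > 0:
        columns[index - 1], columns[index] = columns[index], columns[index - 1]
    return columns


columns = ["Status", "Characters", "Suspect words", "Reject characters", "Words"]
move_up(columns, "Reject characters")
move_up(columns, "Words")
print(columns)
# ['Status', 'Characters', 'Reject characters', 'Words', 'Suspect words']
```

The result pairs each total with its error count, which is the order the screen reader will speak as it reads across a row.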

Make sure to say OK when all of this is done. OmniPage will remember the settings, so you should only need to do all this once.

If your focus is not in the Document Manager pane, use the F6 key to navigate to the proper pane.

As the screen reader reads across the row in the document manager, you will hear the page number where your focus is, then the status, then the characters on the page, the number of rejected characters, the number of words on the page, and the number of suspect words.

You can use the arrow keys to move from page to page.

You may find that you need even fewer columns than I’ve suggested, but these five are a nice starting point for you to experiment with.

As an example of one way to use the document manager, you may have recognized (i.e., run OCR) on a portion of the document and then stopped. At a later time, you want to resume processing the document.

To find out where you left off, you would move down the list until you find a page that has not been recognized. It will say acquired (meaning that the image file is present but OCR has not yet been run).

Starting at the first one that says “Acquired,” hold down your shift key and use your down arrow until you have selected all the pages in the document.

Once the pages have been selected, use Control + 2 to run the OCR.
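The "where did I leave off?" check described above is a simple search down the Status column. Here is a minimal Python sketch of the same logic. The status "Acquired" comes from the document manager; the label "Recognized" for finished pages and the function name are assumptions of mine for the sake of the example.

```python
def pages_to_resume(statuses):
    """Return the 1-based page numbers from the first 'Acquired' page to the end."""
    for index, status in enumerate(statuses):
        if status.lower() == "acquired":
            return list(range(index + 1, len(statuses) + 1))
    return []  # every page has already been through OCR


# Status column as it might be read out, one entry per page (made-up example).
statuses = ["Recognized", "Recognized", "Acquired", "Acquired", "Acquired"]
print(pages_to_resume(statuses))  # pages 3 through 5 still need OCR
```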

I like to run the OCR separate from the proofreading, as I believe that combining the two leads to more crashing.

If the two are still linked, that is, if the proofreading window opens while the OCR is still running, then change your options. Go to Tools and Options: Alt T + O. In the Options window, go to the Proofing tab. Uncheck the second checkbox on the list, “Automatically proofread results after OCR.” Then say OK.


Toolbars

To help you become accustomed to using the shortcut keys, turn on the shortcut keys on the toolbars: View > Toolbars. Check the Show ToolTips and Show Shortcut Keys in ToolTips checkboxes.


Options Set-up

In the menus, go to Tools > Options or use the keyboard shortcut Alt + T + O.

OCR Tab

The OCR tab allows you to select the languages that are included in a document. If you are working on a foreign language document, you can choose English, as well as the foreign language. Choosing a foreign language allows the program to recognize the letters in that language that use diacritical marks (umlauts, cedillas, accents, etc.). If you are processing a math book, turn on Greek so that the Greek letters (alpha, pi, theta, etc.) used in mathematics will be recognized. Be sure that you only select the languages that you need, however, as the program will look for (and find!) the letters used in those languages.

If you are processing a document that has a great deal of financial, legal, or medical terminology, be sure to turn on the professional dictionary for that subject. Please note that you can also load custom dictionaries if there is a great deal of unusual vocabulary.


Process Tab

The Process tab allows you to specify where to add pages to the OmniPage document if you have somehow skipped a page during scanning or need to add a page that needed to be rescanned. Make a TIFF file of the page to add and then set the option in this window prior to inserting it. Note that you can set the option to “before current page” if you need to insert a page in the middle of the document.

Make sure to set the “Page rotation” option to “None.” You can manually rotate a page if you need to, and the program does not do a good job of analyzing orientation.


Proofing Tab

Depending on whether or not you will run a spell check later during your workflow, you may check or uncheck the “Mark all non-dictionary words” option. If you will be running a spell check in MS Word, you can leave this unchecked.

Make sure to uncheck “Automatically proofread results after OCR.” You can launch the proofreading window manually, and there are other methods for proofreading your document that you may prefer using.

If you wish to use the IntelliTrain feature, you will need to use the proofreading window. IntelliTrain remembers the corrections that you have made and looks for patterns that the program can apply to better recognize unusual fonts. It makes most sense to use IntelliTrain if you are processing a book in sections. You can load the IntelliTrain data from a previous section before processing the next one, ensuring better recognition. If you use this feature, check the box to prompt to save. If you are not using the feature, then uncheck that box.

You can load IntelliTrain data from a previous file by selecting the three-dot browse button.


General Tab

There are a number of useful features on this tab, depending on how you use your program. You can add OmniPage to your system tray for quick access. You can enable update detection to make sure that you always have the latest versions. We will not be using any of the features in this tab for this class.


Text Editor Tab

Under Show, make sure that “Headers and Footers” is unchecked. Otherwise, OmniPage will attempt to automatically detect these features, and you cannot easily access them, either to include the page number or to reorder the zone. It also will not include the header or footer when you save out to Word, which means that page numbers will be lost.

Make sure to check “Reading Order.” If this feature is not checked, you will not be able to see the reading order of the zones (i.e., which zone comes before which other zone) or reorder them.

If your text in the text editor pane ever looks funny (running off the screen), check to make sure that “Wrap to Ruler” is turned on. Having it on is the default, so this should not be an issue.


Scanner Tab

We do not generally recommend using OmniPage for scanning; however, if you need to do a great deal of color scanning, you might wish to use it. If you do use OmniPage for scanning, make sure to install the actual scanning program for your scanner in OmniPage. You can embed the drivers that came with your scanner in the OCR program. Do not use the generic drivers that came with OmniPage. Even using this technique, you may find a loss of speed and features scanning through OmniPage. Unless you have a compelling reason to scan through this program, stick with using the scanning utility that comes with your high speed scanner.


Processing an Image (TIFF or PDF) File

Step One: Load a File

Step Two: Run the OCR

Be sure to select the pages before running the OCR. Click on the first thumbnail and use CTRL + A to select all.


Step Three: Adjust Zones

Use the "on-the-fly" tool to redraw zones.

To reorder zones, right click in Text Editor view and change reading order. You may need to ungroup the zones first. Right click and choose ungroup.


Next right click and choose, “Change Reading Order.” You will now see numbers showing the order of the zones.


Right click again and choose “Define Reading Order.” Your numbers will now disappear, and you can click on the zones in the order that you wish to have them read.


When you are satisfied with the order, right click one last time and select “Apply Order.”


Step Four: Save the Document

Make sure to set the view for your text editor to “Formatted Text View” before you save. The setting is under View > Text Editor Views or the buttons at the bottom left of the text editor pane.


When you save the document, you have a number of options. You can save the text in MS Word or other text formats. You can save the document as a graphical PDF or a TIFF. You can also choose “save as multiple” and save to Word and PDF or Word and TIFF at the same time.


If you have a PDF document that you want to convert to TIFF, you can load the file into OmniPage and then jump directly to “save as” (step 3), without running the OCR. Since you are essentially going from one graphical format to another, there is no point to introducing possible recognition errors by running OCR. The only reason to run the OCR would be if you wanted to delete portions of the document before saving it to TIFF.

If you are saving to Word and you do not want any textboxes, you will need to remove the “retain drop cap” option. After selecting Microsoft Word as the file type to save, click on the “Options” button. Scroll down until you see the “Retain drop caps” checkbox and uncheck it. Note that you can check the “Make changes permanent” button if you want to change the default.


Also note that you have a choice of saving the entire OmniPage file to one document, saving individual pages as separate documents, and other variations.
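The practical difference between these save choices is how many files you end up with. The Python sketch below makes that visible by listing the file names each choice would produce for a given page count. The naming pattern, extension, and function are purely illustrative assumptions of mine; OmniPage chooses its own names when it saves pages separately.

```python
def output_files(base_name, page_count, one_file_per_page, extension=".doc"):
    """List the files a save would create under each choice (hypothetical names)."""
    if one_file_per_page:
        width = len(str(page_count))  # pad numbers so the files sort in page order
        return [f"{base_name}_{page:0{width}d}{extension}"
                for page in range(1, page_count + 1)]
    return [base_name + extension]


print(output_files("chapter1", 3, one_file_per_page=False))  # one combined file
print(output_files("chapter1", 3, one_file_per_page=True))   # one file per page
```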


OmniPage Tips

Shortcut: To see the shortcut keys associated with menu items, go to View > Toolbars and check "with shortcut keys."

On-the-fly zoning: You can now modify zones without having to reimage the entire page. For small changes, click on the "on the fly" button.

Stop spell check: To tell the program just to find OCR errors and not unknown words, go to Tools > Options > Proofing and uncheck "Mark non-dictionary words."

Reordering zones: You can reorder the zones in the text editor window; however, it must be set to True Page view (View > Text Editor Views > True Page). If the zones are locked, right click and choose the "ungroup" option. Then choose "change reading order" and "define reading order."

Verifier: A zoom window, called the verifier, is included in the text editor view. Click on show/hide verifier or use F9 while in the text editor window.

Saving: Note that you can save one document as individual pages, save multiple documents into one, or save each image as one document. Be aware of which you choose.

Formats: For PDF, use True Page. For Word, use Flowing Page or Retain Fonts and Paragraphs.

Batch manager: The batch manager is the new scheduler for running OCR, converting documents, etc. Note that you can also cancel a scheduled job through the batch manager.

Templates: For books with a standard layout, you can create a template specifically for that book.

Speech: OmniPage will read text to speech as well as responding to verbal commands in some of the windows.

Scanning Flowchart for End Formats

1. Scan documents. (Use the utility that comes with the scanner, not Kurzweil, FineReader, OmniPage, etc.) This step creates a TIFF (image).
   Note: You can go to PDF at this point, but it is an image! Some scan/read products do not access a graphical PDF well.
2. Load the image (TIFF) into one of the following:
   • Abbyy FineReader or Nuance OmniPage**: runs OCR with full structural control. Output formats: PDF, MS Word doc, HTML / XML, RTF, Text (ASCII).
   • Kurzweil 3000 Pro* (or WYNN Wizard): runs OCR with limited structural control. Output format: KESI file (Kurzweil format). Can create MP3.
3. End formats: Braille translation, DAISY, MP3, WAV.

* Other similar scan and read software for students with learning disabilities has the same workflow, e.g., WYNN Wizard, Read&Write Gold, Universal Reader Plus, Easy Scan, etc.
** We most often recommend FineReader and OmniPage for optical character recognition (OCR), but there are other OCR programs available.

Creating a Workflow in OmniPage

We are creating a workflow to open documents in a folder and run OCR on those documents.

Handouts
High Tech Center Training Unit 21050 McClellan Rd. Cupertino, CA 95014 www.htctu.net

OmniPage Pro Keyboard Guide

Function key    Action
F1              Show the online Help topic for the current screen area or dialog box.
F2              Open the Edit note dialog box in the Document Manager for the current page.
F4              Find next suspect in the Text Editor.
F5              Enable/disable Speech mode.
F6              Move the focus to the next non-hidden splitter pane.
F7              Start proofreading.
F8              Temporarily hide/show verifier display while pressed.
F9              Enable/disable verifier.
Shift+F10       Open a shortcut menu for the current position.

Functions                 Key combinations
Bold                      Ctrl+B
Clear                     Delete
Copy                      Ctrl+C
Cut                       Ctrl+X
Delete current page       Ctrl+D
End of table column       Alt+Page Down
End of table row          Alt+End
Export Results button     Ctrl+3 (main keyboard)
Find                      Ctrl+F
Get Page button           Ctrl+1 (main keyboard)
Go to page                Ctrl+G
Insert soft hyphen        Ctrl+-
Insert soft new line      Shift+Return
Italic                    Ctrl+I
New                       Ctrl+N
Next page                 Ctrl+Page Down
Open OmniPage Document    Ctrl+O
Paste                     Ctrl+V
Perform OCR button        Ctrl+2 (main keyboard)
Previous page             Ctrl+Page Up
Redo                      Ctrl+Y
Replace                   Ctrl+H
Save                      Ctrl+S
Select all                Ctrl+A
Select table              Alt+Clear (Num 5 with Num Lock off)
Start of table column     Alt+Page Up
Start of table row        Alt+Home
Underline                 Ctrl+U
Undo                      Ctrl+Z

Verifying text

After performing OCR, you can compare any part of the recognized text against the corresponding part of the original image, to verify that the text was recognized correctly.

To do this:                                 Use this:
Turn verifier on                            F9 or the verifier tool
Turn verifier off                           Esc or F9 or the verifier tool
Turn verifier on/off temporarily            F8: press and hold down
Show verifier until next keystroke          Double-click on word
Zoom display in                             Alt + Num + or click in verifier
Zoom display out                            Alt + Num - or click in verifier
Make verifier dynamic or docked/floating    Alt + Num /
Dynamic context (scroll through 3 values)   Alt + Num *

The verifier tool is in the Formatting toolbar. The verifier can also be controlled from the View menu.

To modify the size of the text context:

• Press Alt + Num * as many times as required.

• Select the appropriate Verifier menu item from the View menu.

• You should proofread and verify texts before doing large-scale editing. If you cut and paste large blocks of text, the links between text and image may be disturbed.

• You can use OmniPage Pro’s text-to-speech facility to have the recognized text read aloud as another way of verifying text. You can hear the text letter-by-letter, word-by-word, line-by-line, sentence-by-sentence or in whole pages.

• Suspect and non-dictionary words may be marked in the Text Editor. Proofing stops only on marked words, and removes the marking as you move on. You can verify any words in the text, whether marked or not.

• Turn the verifier on and use the Find Next Suspect tool or F4 to quickly verify suspect words. Use the shortcut menu to get dictionary suggestions and proofing commands.

Converting Text to Speech

OmniPage Pro can convert your recognized text into speech and read it out loud, allowing you to listen to your document. This may help you as you verify the text.

The ScanSoft RealSpeak speech facility is offered for a wide range of languages, currently English (US and International), Brazilian Portuguese, Dutch, French, German, Italian, Spanish and Swedish. All speech systems will be installed with OmniPage Pro if you choose a complete installation. If you perform a custom installation, you can choose the languages you need. If you later want to read text aloud in a language you did not choose, you can install it from your program CD.

To have text read aloud:

1. Enable the Text-to-Speech facility with the menu item Speech Mode in the Tools menu, or with the F5 key.
2. Use the menu item Speech Settings to select a voice (for example, male or female for a given language), a reading speed and the volume.
3. From now on, movements of the insertion point in the Text Editor will cause text to be read aloud. Make selections with the mouse or use keyboard keys as shown below.
4. Turn the Text-to-Speech facility off by clicking the Speech Mode menu item again.

To hear text:                                 Use these keys:
One character at a time, forward or back      Right or left arrow. Letter, number or punctuation names are spoken.
Current word                                  Ctrl + Numpad one ( 1 )
One word to the right                         Ctrl + right arrow *
One word to the left                          Ctrl + left arrow *
A single line                                 Place the insertion point in the line
Next line                                     Down arrow
Previous line                                 Up arrow
Current sentence                              Ctrl + Numpad two ( 2 )
From insertion point to end of sentence       Ctrl + Numpad six ( 6 )
From start of sentence to insertion point     Ctrl + Numpad four ( 4 )
Current page                                  Ctrl + Numpad three ( 3 )
From top of current page to insertion point   Ctrl + Home
From insertion point to end of current page   Ctrl + End
Previous, next or any page                    Ctrl + PgUp, PgDn or navigation buttons
Typed characters                              Each typed character is pronounced, one by one, including punctuation.

* If the cursor is in the middle of a word, you will first hear a word fragment, but from the second keystroke you will hear whole words.

• The three basic speech keys are neighbors on the keyboard. Use Ctrl + Numpad one, Ctrl + Numpad two and Ctrl + Numpad three respectively for Read current word, Read current sentence and Read current page.

You also have the following keyboard controls:

To do this:         Use this:
Pause/Resume        Ctrl + Numpad five ( 5 )
Set speed higher    Ctrl + Numpad plus ( + )
Set speed lower     Ctrl + Numpad minus ( - )
Restore speed       Ctrl + Numpad asterisk ( * )

Be sure the selected RealSpeak voice is for the same language as the text you intend to read.

ABBYY FineReader Hot Keys

File menu

To:                     Press:
Open image from file    CTRL+O
Scan image              CTRL+K
Scan multiple images    CTRL+SHIFT+K
Stop scanning           CTRL+T
Create new batch        CTRL+N
Open a batch            CTRL+P
Save text to file       CTRL+F2
Save image to file      F12

Edit menu

To:                                                          Press:
Undo the last action                                         CTRL+Z
Redo the last undone action                                  CTRL+Y
Cut the selection and put it to the clipboard                CTRL+X
Copy the selection to the clipboard                          CTRL+INS or CTRL+C
Paste the clipboard contents                                 CTRL+V or SHIFT+INS
Delete the active block, the selection, the selected pages   DEL
Select all text in the Text window, select all batch pages, select all blocks on the open image  CTRL+A
Find the specified text                                      CTRL+F
Find the next occurrence of the search text                  F3
Search for and replace the specified text                    CTRL+H

View menu

To:                                        Press:
Magnify the image in the Image window      CTRL+SHIFT+NUM+
Zoom Out the image in the Image window     CTRL+SHIFT+NUM-
Zoom In to selected blocks                 CTRL+SHIFT+NUM*
Properties                                 ALT+ENTER

Batch menu

To:                                               Press:
Open next batch page                              ALT+Down
Open previous batch page                          ALT+Up
Open page with specified number                   CTRL+G
Close the current page                            CTRL+F4
Delete the recognized text in the Text window     CTRL+SHIFT+Del
Delete all blocks in the Image window and all recognized text in the Text window  CTRL+Del
Update page list                                  F5

Process menu

To:                                   Press:
Scan and read an image                CTRL+D
Open and read an image                CTRL+SHIFT+D
Start Scan&Read Wizard                CTRL+W
Analyze layout                        CTRL+E
Analyze layout on all batch pages     CTRL+SHIFT+E
Read active or selected pages         CTRL+R
Read all batch pages                  CTRL+SHIFT+R
Read active or selected blocks        CTRL+SHIFT+B

Tools menu

To:                                              Press:
Spell the recognized text                        F7
Move to the next error or uncertain word.        F4
Move to the previous error or uncertain word.    SHIFT+F4
View Dictionaries                                CTRL+SHIFT+V
Translate word with Lingvo                       CTRL+SHIFT+T
Open the Language Editor dialog where you can create and edit languages and language groups  CTRL+SHIFT+L
Open the Pattern Editor dialog where you can create and edit the user's patterns  CTRL+SHIFT+A
Set the scanner parameters                       CTRL+SHIFT+S
Open the Formats settings dialog where you can set save options for supported output formats  CTRL+SHIFT+X
Open the Options dialog                          CTRL+SHIFT+O

Window menu

To:                                      Press:
Open the next window                     CTRL+F6
Open the previous window                 CTRL+SHIFT+F6
Open the Batch window                    ALT+1
Open the Image window                    ALT+2
Open the Text window                     ALT+3
Open the Zoom window                     ALT+4
Switch to the Advanced search window     ALT+5
Open the Advanced search window          ALT+F3

Help menu

To:          Press:
Open help    F1

General

To:                              Press:
Make the selection bold          CTRL+B
Make the selection italic        CTRL+I
Make the selection underlined    CTRL+U
Go to the next table cell        left arrow, right arrow, up arrow, down arrow

Command Name                    Modifiers+Key             Menu
All Caps                        Ctrl+Shift+A
Annotation                      Alt+Ctrl+M                Insert
App Maximize                    Alt+F10
App Restore                     Alt+F5
Apply Heading1                  Alt+Ctrl+1
Apply Heading2                  Alt+Ctrl+2
Apply Heading3                  Alt+Ctrl+3
Apply List Bullet               Ctrl+Shift+L
Auto Format                     Alt+Ctrl+K
Auto Text                       F3
Auto Text                       Alt+Ctrl+V
Bold                            Ctrl+B
Bold                            Ctrl+Shift+B
Bookmark                        Ctrl+Shift+F5             Insert
Browse Next                     Ctrl+Page Down
Browse Prev                     Ctrl+Page Up
Browse Sel                      Alt+Ctrl+Home
Cancel                          Esc
Center Para                     Ctrl+E
Change Case                     Shift+F3
Char Left                       Left
Char Left Extend                Shift+Left
Char Right                      Right
Char Right Extend               Shift+Right
Clear                           Del                       Clear
Close Document                  Ctrl+W
Close or Exit                   Alt+F4
Close Pane                      Alt+Shift+C
Column Break                    Ctrl+Shift+Return
Column Select                   Ctrl+Shift+F8
Copy                            Ctrl+C
Copy                            Ctrl+Insert
Copy Format                     Ctrl+Shift+C
Copy Text                       Shift+F2
Create Auto Text                Alt+F3
Customize Add Menu Shortcut     Alt+Ctrl+=                Tools
Customize Keyboard Shortcut     Alt+Ctrl+Num +
Customize Remove Menu Shortcut  Alt+Ctrl+-
Cut                             Ctrl+X
Cut                             Shift+Del
Date Field                      Alt+Shift+D
Delete Back Word                Ctrl+Backspace
Delete Word                     Ctrl+Del
Distribute Para                 Ctrl+Shift+J
Do Field Click                  Alt+Shift+F9
Doc Close                       Ctrl+W
Doc Close                       Ctrl+F4
Doc Maximize                    Ctrl+F10
Doc Move                        Ctrl+F7
Doc Restore                     Ctrl+F5
Doc Size                        Ctrl+F8
Doc Split                       Alt+Ctrl+S                Window
Double Underline                Ctrl+Shift+D
End of Column                   Alt+Page Down
End of Column                   Alt+Shift+Page Down
End of Doc Extend               Ctrl+Shift+End
End of Document                 Ctrl+End
End of Line                     End
End of Line Extend              Shift+End
End of Row                      Alt+End
End of Row                      Alt+Shift+End
End of Window                   Alt+Ctrl+Page Down
End of Window Extend            Alt+Ctrl+Shift+Page Down
Endnote Now                     Alt+Ctrl+D
Extend Selection                F8
Field Chars                     Ctrl+F9
Field Codes                     Alt+F9
Find                            Ctrl+F
Font                            Ctrl+D
Font                            Ctrl+Shift+F
Font Size Select                Ctrl+Shift+P
Footnote Now                    Alt+Ctrl+F
Go Back                         Shift+F5
Go Back                         Alt+Ctrl+Z
Go To                           Ctrl+G                    Edit
Go To                           F5                        Edit
Grow Font                       Ctrl+Shift+.
Grow Font One Point             Ctrl+]
Hanging Indent                  Ctrl+T
Header Footer Link              Alt+Shift+R
Help                            F1
Hidden                          Ctrl+Shift+H
Hyperlink                       Ctrl+K
Indent                          Ctrl+M
Italic                          Ctrl+I
Italic                          Ctrl+Shift+I
Justify Para                    Ctrl+J
Left Para                       Ctrl+L
Line Down                       Down
Line Down Extend                Shift+Down
Line Up                         Up
Line Up Extend                  Shift+Up
List Num Field                  Alt+Ctrl+L
Lock Fields                     Ctrl+3
Lock Fields                     Ctrl+F11
Macro                           Alt+F8
Mail Merge Check                Alt+Shift+K
Mail Merge Edit Data Source     Alt+Shift+E
Mail Merge to Doc               Alt+Shift+N
Mail Merge to Printer           Alt+Shift+M
Mark Citation                   Alt+Shift+I
Mark Index Entry                Alt+Shift+X
Mark Table of Contents Entry    Alt+Shift+O
Menu Mode                       F10
Merge Field                     Alt+Shift+F
Microsoft Script Editor         Alt+Shift+F11
Microsoft System Info           Alt+Ctrl+F1
Move Text                       F2
New                             Ctrl+N                    File
Next Field                      F11
Next Field                      Alt+F1
Next Misspelling                Alt+F7
Next Object                     Alt+Down
Next Window                     Ctrl+F6
Next Window                     Alt+F6
Normal                          Alt+Ctrl+N                View
Normal Style                    Ctrl+Shift+N
Normal Style                    Alt+Shift+Clear (Num 5)
Open                            Ctrl+O
Open                            Ctrl+F12
Open                            Alt+Ctrl+F2
Open or Close Up Para           Ctrl+0
Other Pane                      F6
Other Pane                      Shift+F6
Outline                         Alt+Ctrl+O
Outline Collapse                Alt+Shift+-
Outline Collapse                Alt+Shift+Num -
Outline Demote                  Alt+Shift+Right
Outline Expand                  Alt+Shift+=
Outline Expand                  Alt+Shift+Num +
Outline Move Down               Alt+Shift+Down
Outline Move Up                 Alt+Shift+Up
Outline Promote                 Alt+Shift+Left
Outline Show First Line         Alt+Shift+L
Overtype                        Insert
Page                            Alt+Ctrl+P                View
Page Break                      Ctrl+Return
Page Down                       Page Down
Page Down Extend                Shift+Page Down
Page Field                      Alt+Shift+P
Page Up                         Page Up
Page Up Extend                  Shift+Page Up
Para Down                       Ctrl+Down
Para Down Extend                Ctrl+Shift+Down
Para Up                         Ctrl+Up
Para Up Extend                  Ctrl+Shift+Up
Paste                           Ctrl+V
Paste                           Shift+Insert
Paste As Hyperlink                                        Edit
Paste Format                    Ctrl+Shift+V
Prev Field                      Shift+F11
Prev Field                      Alt+Shift+F1
Prev Object                     Alt+Up
Prev Window                     Ctrl+Shift+F6
Prev Window                     Alt+Shift+F6
Print                           Ctrl+P
Print                           Ctrl+Shift+F12
Print Preview                   Ctrl+F2
Print Preview                   Alt+Ctrl+I
Proofing                        F7
Redo                            Alt+Shift+Backspace
Redo or Repeat                  Ctrl+Y                    Edit
Redo or Repeat                  F4                        Edit
Redo or Repeat                  Alt+Return                Edit
Reject All Changes in Doc                                 Reject
Reject All Changes Shown                                  Reject
Repeat Find                     Shift+F4
Repeat Find                     Alt+Ctrl+Y
Replace                         Ctrl+H                    Edit
Reset Char                      Ctrl+Space
Reset Char                      Ctrl+Shift+Z
Reset Para                      Ctrl+Q
Revision Marks Toggle           Ctrl+Shift+E
Right Para                      Ctrl+R
Save                            Ctrl+S
Save                            Shift+F12
Save                            Alt+Shift+F2
Save As                         F12                       File
Select All                      Ctrl+A                    Edit
Select All                      Ctrl+Clear (Num 5)        Edit
Select All                      Ctrl+Num 5                Edit
Select Table                    Alt+Clear (Num 5)         Toolbar 32778
Show All                        Ctrl+Shift+8
Show All Headings               Alt+Shift+A
Show Heading1                   Alt+Shift+1
Show Heading2                   Alt+Shift+2
Show Heading3                   Alt+Shift+3
Show Heading4                   Alt+Shift+4
Show Heading5                   Alt+Shift+5
Show Heading6                   Alt+Shift+6
Show Heading7                   Alt+Shift+7
Show Heading8                   Alt+Shift+8
Show Heading9                   Alt+Shift+9
Shrink Font                     Ctrl+Shift+,
Shrink Font One Point           Ctrl+[
Shrink Selection                Shift+F8
Small Caps                      Ctrl+Shift+K
Space Para1                     Ctrl+1
Space Para15                    Ctrl+5
Space Para2                     Ctrl+2
Spike                           Ctrl+Shift+F3
Spike                           Ctrl+F3
Start of Column                 Alt+Page Up
Start of Column                 Alt+Shift+Page Up
Start of Doc Extend             Ctrl+Shift+Home
Start of Document               Ctrl+Home
Start of Line                   Home
Start of Line Extend            Shift+Home
Start of Row                    Alt+Home
Start of Row                    Alt+Shift+Home
Start of Window                 Alt+Ctrl+Page Up
Start of Window Extend          Alt+Ctrl+Shift+Page Up
Style                           Ctrl+Shift+S
Style Separator                 Alt+Ctrl+Return
Subscript                       Ctrl+=
Superscript                     Ctrl+Shift+=
Symbol Font                     Ctrl+Shift+Q
Thesaurus                       Shift+F7                  Language
Time Field                      Alt+Shift+T
Toggle Character Code           Alt+X
Toggle Field Display            Shift+F9
Toggle Master Subdocs           Ctrl+\
Tool                            Shift+F1
Translate Pane                  Alt+Shift+F7              Language
Un Hang                         Ctrl+Shift+T
Un Indent                       Ctrl+Shift+M
Underline                       Ctrl+U
Underline                       Ctrl+Shift+U
Undo                            Ctrl+Z
Undo                            Alt+Backspace
Unlink Fields                   Ctrl+6
Unlink Fields                   Ctrl+Shift+F9
Unlock Fields                   Ctrl+4
Unlock Fields                   Ctrl+Shift+F11
Update Auto Format              Alt+Ctrl+U
Update Fields                   F9
Update Fields                   Alt+Shift+U
Update Source                   Ctrl+Shift+F7
VBCode                          Alt+F11
Web Go Back                     Alt+Left
Web Go Forward                  Alt+Right
Word Count List                 Ctrl+Shift+G
Word Count Recount              Ctrl+Shift+R
Word Left                       Ctrl+Left
Word Left Extend                Ctrl+Shift+Left
Word Right                      Ctrl+Right
Word Right Extend               Ctrl+Shift+Right
Word Underline                  Ctrl+Shift+W

Shortcuts with WINDOWS Key

WINDOWS Key Start Menu (toggles)

WINDOWS Key + M minimizes all your running applications, bringing up the Desktop

WINDOWS Key + SHIFT + M restores your minimized applications

WINDOWS Key + D brings up your Desktop

WINDOWS Key + PAUSE / BREAK* opens the System Properties Dialog Box

WINDOWS Key + TAB brings your focus to the Task Bar

* PAUSE / BREAK Key is usually located on the top row of your keyboard on the right hand side, three keys to the right of the F12 Key.

Short Cut Keys for Editing

SHIFT+ENTER            A line break
CTRL+ENTER             A page break
CTRL+SHIFT+ENTER       A column break
CTRL+HYPHEN            An optional hyphen
CTRL+SHIFT+HYPHEN      A nonbreaking hyphen
CTRL+SHIFT+SPACEBAR    A nonbreaking space
ALT+CTRL+C             The copyright symbol
ALT+CTRL+R             The registered trademark symbol
ALT+CTRL+T             The trademark symbol
ALT+CTRL+period        An ellipsis
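Several of the editing shortcuts above insert characters that are easy to miss in exported OCR text, the optional hyphen in particular. The sketch below lists Unicode code points commonly used for these characters and removes optional hyphens from a string. It assumes the exported text represents the optional hyphen as U+00AD (SOFT HYPHEN); Word may store it differently internally, and the names `EDITING_CHARACTERS` and `strip_optional_hyphens` are made up for this illustration.

```python
# Unicode code points commonly used for the characters in the table above.
# Assumption: the saved text uses these code points (check your export format).
EDITING_CHARACTERS = {
    "optional hyphen": "\u00ad",              # SOFT HYPHEN
    "nonbreaking hyphen": "\u2011",           # NON-BREAKING HYPHEN
    "nonbreaking space": "\u00a0",            # NO-BREAK SPACE
    "copyright symbol": "\u00a9",             # ©
    "registered trademark symbol": "\u00ae",  # ®
    "trademark symbol": "\u2122",             # ™
    "ellipsis": "\u2026",                     # …
}


def strip_optional_hyphens(text: str) -> str:
    """Remove optional (soft) hyphens, leaving ordinary hyphens untouched."""
    return text.replace(EDITING_CHARACTERS["optional hyphen"], "")


if __name__ == "__main__":
    sample = "recog\u00adnition of well-known fonts"
    print(strip_optional_hyphens(sample))
```

Only the soft hyphen is removed; a regular hyphen such as the one in "well-known" is a different character and is kept.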

Character and formatting

Copy formatting

CTRL+SHIFT+C    Copy formatting from text
CTRL+SHIFT+V    Apply copied formatting to text

Change or resize the font

CTRL+SHIFT+F    Change the font
CTRL+SHIFT+P    Change the font size
CTRL+SHIFT+>    Increase the font size
CTRL+SHIFT+<    Decrease the font size
CTRL+]          Increase the font size by 1 point
CTRL+[          Decrease the font size by 1 point

Apply character formats

CTRL+D                  Change the formatting of characters (Font command, Format menu)
SHIFT+F3                Change the case of letters
CTRL+SHIFT+A            Format letters as all capitals
CTRL+B                  Apply bold formatting
CTRL+U                  Apply an underline
CTRL+SHIFT+W            Underline words but not spaces
CTRL+SHIFT+D            Double-underline text
CTRL+SHIFT+H            Apply hidden text formatting
CTRL+I                  Apply italic formatting
CTRL+SHIFT+K            Format letters as small capitals
CTRL+EQUAL SIGN         Apply subscript formatting (automatic spacing)
CTRL+SHIFT+PLUS SIGN    Apply superscript formatting (automatic spacing)
CTRL+SPACEBAR           Remove manual character formatting
CTRL+SHIFT+Q            Change the selection to the Symbol font

Short Cut Keys for MS Office 9/27/2011

View and copy text formats

CTRL+SHIFT+* (asterisk)    Display nonprinting characters
SHIFT+F1                   Review text formatting (then click the text whose formatting you want to review)
CTRL+SHIFT+C               Copy formats
CTRL+SHIFT+V               Paste formats

Set line spacing

CTRL+1           Single-space lines
CTRL+2           Double-space lines
CTRL+5           Set 1.5-line spacing
CTRL+0 (zero)    Add or remove one line space preceding a paragraph

Align paragraphs

CTRL+E          Center a paragraph
CTRL+J          Justify a paragraph
CTRL+L          Left align a paragraph
CTRL+R          Right align a paragraph
CTRL+M          Indent a paragraph from the left
CTRL+SHIFT+M    Remove a paragraph indent from the left
CTRL+T          Create a hanging indent
CTRL+SHIFT+T    Reduce a hanging indent
CTRL+Q          Remove paragraph formatting

Apply paragraph styles

CTRL+SHIFT+S    Apply a style
ALT+CTRL+K      Start AutoFormat
CTRL+SHIFT+N    Apply the Normal style
ALT+CTRL+1      Apply the Heading 1 style
ALT+CTRL+2      Apply the Heading 2 style
ALT+CTRL+3      Apply the Heading 3 style
CTRL+SHIFT+L    Apply the List style

Insert an international character by using a shortcut key

To insert                             Press
à, è, ì, ò, ù, À, È, Ì, Ò, Ù          CTRL+` (ACCENT GRAVE), the letter
á, é, í, ó, ú, ý, Á, É, Í, Ó, Ú, Ý    CTRL+' (APOSTROPHE), the letter
â, ê, î, ô, û, Â, Ê, Î, Ô, Û          CTRL+SHIFT+^ (CARET), the letter
ã, ñ, õ, Ã, Ñ, Õ                      CTRL+SHIFT+~ (TILDE), the letter
ä, ë, ï, ö, ü, ÿ, Ä, Ë, Ï, Ö, Ü, Ÿ    CTRL+SHIFT+: (COLON), the letter
å, Å                                  CTRL+SHIFT+@, a or A
æ, Æ                                  CTRL+SHIFT+&, a or A
œ, Œ                                  CTRL+SHIFT+&, o or O
ç, Ç                                  CTRL+, (COMMA), c or C
ð, Ð                                  CTRL+' (APOSTROPHE), d or D
ø, Ø                                  CTRL+/, o or O
¿                                     ALT+CTRL+SHIFT+?
¡                                     ALT+CTRL+SHIFT+!
ß                                     CTRL+SHIFT+&, s
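The accent shortcuts above follow one pattern: press an accent key, then the letter. Unicode offers a similar pairing, where a base letter followed by a combining mark can be normalized into a single accented character. The sketch below illustrates that idea with Python's standard `unicodedata` module. It is an analogy only, not how Word implements these shortcuts, and the names `COMBINING_MARKS` and `compose` are hypothetical.

```python
import unicodedata

# Accent key (as typed in the shortcut) -> Unicode combining mark.
# Only the accent-plus-letter rows of the table are covered in this sketch.
COMBINING_MARKS = {
    "`": "\u0300",  # COMBINING GRAVE ACCENT
    "'": "\u0301",  # COMBINING ACUTE ACCENT
    "^": "\u0302",  # COMBINING CIRCUMFLEX ACCENT
    "~": "\u0303",  # COMBINING TILDE
    ":": "\u0308",  # COMBINING DIAERESIS
    ",": "\u0327",  # COMBINING CEDILLA
}


def compose(accent_key: str, letter: str) -> str:
    """Combine an accent key and a letter into one precomposed character."""
    mark = COMBINING_MARKS[accent_key]
    # NFC normalization merges letter + combining mark where a single
    # precomposed character exists.
    return unicodedata.normalize("NFC", letter + mark)


if __name__ == "__main__":
    for key, letter in [("`", "a"), ("'", "E"), ("~", "n"), (":", "u"), (",", "c")]:
        print(key, letter, compose(key, letter))
```

Characters such as æ, ø, ð and ß are separate letters rather than accent combinations, so they are left out of this mapping.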