Algorithm for Character Recognition Based on the Trie Structure

University of Montana
ScholarWorks at University of Montana
Graduate Student Theses, Dissertations, & Professional Papers, Graduate School, 1987

Algorithm for character recognition based on the trie structure
Mohammad N. Paryavi, The University of Montana

Recommended Citation
Paryavi, Mohammad N., "Algorithm for character recognition based on the trie structure" (1987). Graduate Student Theses, Dissertations, & Professional Papers. 5091. https://scholarworks.umt.edu/etd/5091

This thesis is brought to you for free and open access by the Graduate School at ScholarWorks at University of Montana. It has been accepted for inclusion in Graduate Student Theses, Dissertations, & Professional Papers by an authorized administrator of ScholarWorks at University of Montana.

COPYRIGHT ACT OF 1976
This is an unpublished manuscript in which copyright subsists. Any further reprinting of its contents must be approved by the author.
Mansfield Library, University of Montana
Date: 1987

AN ALGORITHM FOR CHARACTER RECOGNITION BASED ON THE TRIE STRUCTURE
By Mohammad N. Paryavi
B.A., University of Washington, 1983
Presented in partial fulfillment of the requirements for the degree of Master of Science
University of Montana, 1987
Approved by: Chairman, Board of Examiners; Dean, Graduate School

UMI Number: EP40555
Published by ProQuest LLC (2014). Copyright in the Dissertation held by the Author. Microform Edition © ProQuest LLC. All rights reserved. This work is protected against unauthorized copying under Title 17, United States Code.
ProQuest LLC, 789 East Eisenhower Parkway, P.O. Box 1346, Ann Arbor, MI 48106-1346

Abstract

Paryavi, Mohammad N., M.S., Mar. 1987, Computer Science
An Algorithm for Character Recognition Based on The Trie Structure
Director: Dr. Alden H. Wright

Character recognition is the process of attempting to complete a partially typed string of characters by comparing it with the list of strings stored within the system. Character recognition can be used in a user-friendly interface for a computer system. In this thesis an algorithm for character recognition is developed based on the trie structure.

A trie is an m-ary tree composed of two types of nodes: branch nodes, which contain link fields associated with the elements of an alphabet, and information nodes, which are the leaf nodes. For reasons of manipulation and implementation simplicity, a binary tree representation of the trie structure was chosen. In this representation a trie branch node is represented by a list of binary tree nodes. This allows us to simulate a branch node with a variable number of link fields.

An algorithm for the character recognition operation on the trie is presented first in a general form and later in detail. An algorithm is also developed for the insertion operation into a trie structure. These algorithms are analyzed for run time and storage space efficiency. Finally, an alternative representation and an alternative algorithm are discussed.

Table of Contents

Abstract
Table of Contents
List of Figures
Acknowledgements
1. Introduction
   1.1. Background
   1.2. Statement of The Problem
   1.3. The Proposed Research
2. Related Work and Literature
3. Algorithm Overview
   3.1. The Data Structure
      3.1.1. The General Tree
      3.1.2. The Trie Structure
   3.2. The General Algorithm
      3.2.1. The Recognition Algorithm
      3.2.2. The Insertion Algorithm
4. The Algorithm in Detail
   4.1. The Representation
   4.2. Notations, Symbols, and Conventions
   4.3. Description of Variables
   4.4. The Algorithm
      4.4.1. The Recognition Algorithm
         4.4.1.1. The Process_Recog Algorithm
         4.4.1.2. The Process_Other Algorithm
         4.4.1.3. The Process_Rubout Algorithm
         4.4.1.4. The Process_Newline Algorithm
         4.4.1.5. The Process_Listop Algorithm
         4.4.1.6. The Traverse_Trie Algorithm
      4.4.2. The Insertion Algorithm
         4.4.2.1. The Add-To-Trie Algorithm
         4.4.2.2. The Process-Info-Node Algorithm
         4.4.2.3. The Insert-To-Sibling-List Algorithm
         4.4.2.4. The Insert-Short-Strings Algorithm
         4.4.2.5. The Insert-Long-Strings Algorithm
5. Analysis of the Algorithm
   5.1. General Discussion
   5.2. Time Efficiency of the Algorithm
      5.2.1. Analysis of the Recognition Algorithm
      5.2.2. Analysis of the Insertion Algorithm
   5.3. Space Efficiency of the Algorithm
   5.4. Extensions to the Trie Structure
   5.5. An Alternative Tree Representation
   5.6. An Alternative Algorithm
6. Summary and Conclusions
   6.1. Summary
   6.2. Conclusions
Bibliography

List of Illustrations

Figure 3.1, Figure 3.2, Figure 3.3, Figure 3.4, Figure 4.1, Figure 5.1, Figure 5.2

Acknowledgments

This research paper is solely dedicated to Dr. Alden Wright, without whose inspirational and technical support it would not have been possible. His expertise was the technical backbone of this paper and his personality the motivating force behind it. I would also like to thank Professors William Ballard and Spencer Manlove for their invaluable guidance. To my colleagues and fellow FIRESYS team members at the University of Montana, Greg Hume, Bruce McTavish, and Jim Mitchell: thanks to all of you. Special thanks are also due to my good friend Babak Shahpar for being so helpful in criticizing my work to make it better. Last but not least, my deepest gratitude and love to my wife
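
As an illustration of the representation summarized in the abstract, where each trie branch node is stored as a list of binary tree nodes, the following Python sketch may help. It is not the thesis's algorithm: the names Node, Trie, insert, and complete are hypothetical, the is_end flag is an assumed stand-in for the thesis's information (leaf) nodes, and the completion routine simply lists every stored string that begins with the typed prefix.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """One binary-tree node: a single link field of a trie branch node."""
    char: str
    child: Optional["Node"] = None    # first node of the next level's sibling list
    sibling: Optional["Node"] = None  # next link field in the same branch node
    is_end: bool = False              # assumption: stands in for an information node


class Trie:
    def __init__(self) -> None:
        # Sentinel node; its child list plays the role of the top-level branch node.
        self.root = Node("")

    def _find(self, parent: Node, ch: str) -> Optional[Node]:
        """Scan the sibling list under `parent` for the node labelled `ch`."""
        node = parent.child
        while node is not None and node.char != ch:
            node = node.sibling
        return node

    def insert(self, word: str) -> None:
        parent = self.root
        for ch in word:
            node = self._find(parent, ch)
            if node is None:
                # Prepend a new link field to this branch node's sibling list.
                node = Node(ch, sibling=parent.child)
                parent.child = node
            parent = node
        parent.is_end = True

    def complete(self, prefix: str) -> List[str]:
        """Return every stored string that starts with `prefix`, sorted."""
        parent = self.root
        for ch in prefix:
            node = self._find(parent, ch)
            if node is None:
                return []
            parent = node
        found: List[str] = []
        stack = [(parent, prefix)]
        while stack:
            node, text = stack.pop()
            if node.is_end:
                found.append(text)
            kid = node.child
            while kid is not None:
                stack.append((kid, text + kid.char))
                kid = kid.sibling
        return sorted(found)


trie = Trie()
for word in ["list", "listen", "load", "logout"]:
    trie.insert(word)
print(trie.complete("li"))  # ['list', 'listen']
print(trie.complete("x"))   # []
```

Because each branch node is a linked sibling list, a node costs the same storage whether it has one link or many, at the price of a linear scan over siblings at each level. This is the usual trade-off of a child/sibling encoding in general, not a result taken from the thesis's own analysis.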
