GRAPHICAL CONTEXT AS AN AID TO CHARACTER RECOGNITION

by

THEODORE THOMAS KUKLINSKI

B.S., Drexel University (1972)
S.M., Massachusetts Institute of Technology (1975)
E.E., Massachusetts Institute of Technology (1975)

SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

at the

MASSACHUSETTS INSTITUTE OF TECHNOLOGY

FEBRUARY, 1979

Signature of Author: (signature redacted)
Department of Electrical Engineering and Computer Science, February 20, 1979

Certified by: (signature redacted)
Thesis Supervisor

Accepted by:
Chairperson, Departmental Committee on Graduate Students

Submitted to the Department of Electrical Engineering and Computer Science on February 14, 1979 in partial fulfillment of the requirements for the Degree of Doctor of Philosophy.

ABSTRACT

Contextual information is of great use to humans in the recognition of both machine-printed and handprinted characters. The incorporation of contextual information, particularly a form known as graphical context, which has to do with stylistic consistency within and between characters, may lead to better recognition than that currently achievable. The role and importance of character recognition technology in our society is examined, as are the problems involved and the state of the art in both machine-printed and handprinted character recognition. An examination of the importance of the use of various forms of contextual information in character recognition is presented. A functional attribute theory of character recognition is reviewed.
This theory incorporates graphical context as a modulator of rules which transform between the physical and psychological domains. The theory is based on ambiguous characters which form the boundaries between letters. These boundaries can be found using a variety of methodologies, which are described. One of these, goodness rating, is used throughout the remainder of the thesis for testing graphical contextual effects.

The range frequency model is described as a potentially useful candidate model for describing contextual effects on goodness curves and is traced historically and derived mathematically. It is based on the premise that judgment is a weighted compromise between a range principle and a frequency principle. The range principle has to do with the stimulus range, while the frequency principle deals with the effects of stimulus distribution. The model provides a method by which predictions can be made from the results of one category rating experiment to the results of another. A formulation to deal with prediction onto partial range experiments is developed.

Eight category rating experiments, spanning a variety of stimulus ranges and stimulus distributions, are described and provide a test of the graphical context model. The test consists of predicting the results of one experiment from the results of another. The procedures for making such predictions via the model are described in detail. The idea of a weighting factor varying along the dimension is developed. Three variants of the model are tried and three sets of predictions made. The first, a naive traditional RF model, is found to be an inadequate predictor. Next, excellent predictions are obtained with some of the parameters having been optimized on one group of experiments (involving V-Y) and then tested on another group (involving C-F). It is discovered that the range principle is dominant, particularly in the stimulus region around V and C.
Finally, very good results are obtained using only the range principle. The results of each set of predictions are compared using a sum squared error metric. Letter archetypes are shown to be important determinants of stimulus range. It is concluded that plasticity effects in the perception of characters can be modeled successfully using a range type model. Applications of the model to graphical contextual analysis are discussed and possible future work is described.

THESIS SUPERVISOR: Barry A. Blesser
TITLE: Associate Professor of Electrical Engineering

ACKNOWLEDGEMENTS

I am very grateful to my thesis supervisor, Barry Blesser, for the help, encouragement, support and patience he has shown to me over the course of my stay at MIT. His comments have guided me along in the course of this thesis work. Likewise, the constructive comments and insights of my thesis readers, Murray Eden and Mary Naus, as well as Ching Suen, are also gratefully acknowledged and appreciated.

Thanks are due to Tim Waldron, Robin Kuklinski, as well as Robert Shillman for assistance in running some of the experiments reported here. Many people in the CIPG laboratory and RLE were very helpful in many ways over the past few years. I especially thank Al Rudnick, our night custodian, John Hewitt, RLE Librarian, and John McKenzie, CIPG PDP-9 Doctor. The friendship and kind assistance of various other people in the CIPG lab, among them Charlie Cox, Don Levinstone and Bob Babcock, is appreciated.

I wish to thank the Cognitive Information Processing Group for their support of my work indirectly through the use of their computer facilities. I am grateful to the National Science Foundation for their support of me as a research assistant through May 1977 under Grants #NSF-ENG-7417459 and ENG74-24344, as well as MIT's Research Laboratory of Electronics for support in Fall 1977 under an industrial fellowship.

To my parents I will ever be in debt for their encouragement, prayers and support.
My true gratitude and love go to my wife Hsueh-Rong for her help and understanding. This thesis is dedicated to her.

Deo Gratias.

TABLE OF CONTENTS

TITLE PAGE
ABSTRACT
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES

CHAPTER 1 INTRODUCTION
    1.1 INTRODUCTION
    1.2 OUTLINE OF INTRODUCTORY AND BACKGROUND MATERIAL
    1.3 OUTLINE OF RANGE FREQUENCY WORK

CHAPTER 2 CHARACTER RECOGNITION
    2.1 INTRODUCTION
    2.2 THE THREE "R's" - A STATE OF THE ART REPORT
    2.3 MACHINE PRINTED CHARACTER RECOGNITION - THE PROBLEM
    2.4 HANDPRINTED CHARACTER RECOGNITION - THE PROBLEM
    2.5 ENGINEERING LITERATURE IN CHARACTER RECOGNITION
    2.6 PSYCHOLOGICAL LITERATURE IN CHARACTER RECOGNITION
    2.7 SOME USES OF CHARACTER RECOGNITION
    2.8 SUMMARY

CHAPTER 3 CONTEXT IN CHARACTER RECOGNITION
    3.1 OVERVIEW
    3.2 THE POWER OF CONTEXT: AN ILLUSTRATION
    3.3 HIGHER LEVELS OF CONTEXT
    3.4 CULTURAL, HISTORICAL AND GENERATIVE CONTEXT
    3.5 GRAPHICAL CONTEXT

CHAPTER 4 A FUNCTIONAL ATTRIBUTE BASED THEORY OF CHARACTER RECOGNITION
    4.1 INTRODUCTION
    4.2 THE CONCEPT OF ATTRIBUTE
    4.3 A SIMPLIFIED MODEL
    4.4 SUMMARY

CHAPTER 5 METHODOLOGIES FOR A FUNCTIONAL ATTRIBUTE BASED THEORY OF CHARACTER RECOGNITION
    5.1 INTRODUCTION
    5.2 A REVIEW OF EXPERIMENTAL WORK
    5.3 THE GOODNESS METHODOLOGY
    5.4 SOME OTHER METHODOLOGIES
    5.5 PLASTICITY OF INTERLETTER BOUNDARIES
    5.6 SUMMARY

CHAPTER 6 RANGE FREQUENCY: A THEORY OF RELATIVITY FOR PSYCHOPHYSICS
    6.1 INTRODUCTION
    6.2 SOME TERMINOLOGY
    6.3 ROOTS IN ADAPTATION LEVEL THEORY
    6.4 EARLY RANGE FREQUENCY FORMULATION
    6.5 THE LIMEN MODEL
    6.6 THE "SIMPLIFIED" RANGE FREQUENCY MODEL
    6.7 SUMMARY

CHAPTER 7 RANGE FREQUENCY: FURTHER DEVELOPMENT
    7.1 THEORETICAL DEVELOPMENT
    7.2 FUNCTIONAL MEASUREMENT
    7.3 RANGE FREQUENCY - THEORETICAL DEVELOPMENT
    7.4 PSYCHOPHYSICAL LAW
    7.5 PSYCHOLOGICAL LAW: RANGE FREQUENCY MODEL
        7.5.1 THE RANGE PRINCIPLE
        7.5.2 THE FREQUENCY PRINCIPLE
    7.6 THE RANGE FREQUENCY COMPROMISE
    7.7 RESTRICTED RANGE CASE
    7.8 SUMMARY

CHAPTER 8 EXPERIMENTS
    8.1 INTRODUCTION
    8.2 GOODNESS EXPERIMENTS AS A TEST OF RF THEORY
    8.3 THE ATTRIBUTE "LEG" AS A TEST OF RF THEORY
    8.4 CONTEXT DETERMINING VARIABLES
    8.5 OVERVIEW OF EXPERIMENTS
    8.6 RANGE VARIATION EXPERIMENTS
        8.6.1 BACKGROUND
        8.6.2 METHOD
            8.6.2.1 STIMULI
            8.6.2.2 SUBJECTS
            8.6.2.3 PROCEDURE
        8.6.3 RESULTS
    8.7 EXPERIMENT 7: FULL RANGE
        8.7.1 BACKGROUND
        8.7.2 METHOD
            8.7.2.1 STIMULI
            8.7.2.2 SUBJECTS
            8.7.2.3 PROCEDURE
        8.7.3 RESULTS
    8.8 EXPERIMENT 8: FULL RANGE
        8.8.1 BACKGROUND
        8.8.2 METHOD
            8.8.2.1 STIMULI
            8.8.2.2 SUBJECTS
            8.8.2.3 PROCEDURE
        8.8.3 RESULTS