
Index

A Anchor points, 717 , 430, 891 Angular radial transform (ART) descriptors, , 430, 891 526, 530, 532 Academic systems, 429, 448 Animated, 625, 630, 633, 636, 637 Accidental forgery, 932 Anisotropic Gaussian, 269–271 Accuracy, 44, 50, 51, 54, 1023, 1024, ANN. See Artificial neural networks (ANN) 1027, 1033 Annotation, 630, 639, 966–968, 976, 983–1002 Acid-free paper, 15 Application domains, 182, 198 Acquisition, 11–60, 984–986, 989–991, Approximate NN search, 170 993, 996 APTI, 450 Action plan, 919, 921, 926, 929, 930 AdaBoost, 862, 865, 866 , 430, 434–436, 449 Adaptive local connectivity map, 476 extension, 435 Address block(s), 715, 717, 720, 721, 723, 724 letters, 432, 435, 436 Address block location, 720, 723–724, 727 OCR, 450 Address database, 715, 720, 729, 730, 744 styles, 437–438 Address interpretation, 723 writing , 432 Address recognition systems, 709, 710, Arabic and Persian signatures, 930 720–723, 732, 733, 744 Arabic and Syriac are cursive, 428 Adjacency Arabic recognition approaches, 446 grammars, 528, 537 letters, 433 matrices, 541, 544 Arc detection, 505, 511, 512 Affine covariant, 629 Architectural drawings, 493, 495, 511–513 , 440 Area under the curve (AUC), 931 Algebraic invariants, 617 Area Voronoi diagram, 146, 147, 156, 157, Allograph, 303, 304 161, 163 Alphabet(s), 7–9, 303, 304, 306, 307, 312, , 323 891, 892, 896, 897, 902, 904, 906, Arrowheads, 493, 512, 513 908, 913 Artificial intelligence, 332 Alphabetic class, 429 Artificial neural networks (ANN), 617, 632, Alphabetic fields, 719 636, 821, 835 Ambiguity, 680 Aruspix, 753, 768, 771 Analysis of Invoices, 184 Aryan , 304 Analysis System to Interpret Areas in Ascenders and descenders, 325, 442–444, 810, Single-sided Letters (ANASTASIL), 812, 814 181, 182, 207, 208 ASCII characters, 817 Analytical word recognition, 725 Asian scripts, 460–462, 471, 475, 483

D. Doermann, K. Tombre (eds.), Handbook of Document Image Processing and Recognition, DOI 10.1007/978-0-85729-859-1, © Springer-Verlag London 2014

Aspect, 279, 284 Bidirectional long short-term memory Assamese, 301, 304 (BLSTM), 822, 823 Assessment Bi-gram probability, 410, 411, 415 function, 1029–1032 Bi-lingual , 314, 316 methods, 1031 Billboards, 629, 640 Associative graphs, 527, 536–538 Binarization, 44, 49, 54, 189, 262, 275, 335, Assyrian script, 439 336, 338–339, 355, 466, 475–477, Attributed grammars, 695 716, 814, 984, 986, 989, 990, Attributed relational graph (ARG), 904 992–994, 1018–1022, 1024, 1033 Attributive symbols, 750 algorithms, 716 Automatic document processing, 614 of handprinted text, 367 Automatic sorting machines, 709 skew correction, and noise, 755 Automatic number plate recognition Binary (ANPR), 846 classifier, 738 images, 716 mask, 633 Bipartite graph, 540 Background analysis method, 144–146, 148, Black-and-white document, 144, 147 149, 158, 164 Bleed-through, 49, 95, 97, 99 Backtracking, 148 Blind attacker, 932 Bag-of-features, 871, 873, 874 Blob noise, 621 Bag-of-words, 618, 620, 627, 629 Block(s), 752 Bangla, 301, 305, 306, 308, 309, 315, 316, Block adjacency graphs, 479 321, 326, 461, 465 Block overlay phenomena, 778 Bank check(s), 363 Blur, 46, 48, 53, 54, 57, 59, 573, 844, 853, Bank check recognition, 67 856, 857 Banners, 640 Blurring, 623, 850, 854, 856–857 Bar codes, 708, 709, 723 Boldface, 322, 325 Bar lines, 751, 761, 767, 769 BongNet, 902, 903, 910 Bar units, 752 Book Baselines, 260, 263, 265, 266, 275–276, binding, 52, 57–59 287, 684, 686–688, 691, 692, 696, scanners, 846 810–815 Boosting methods, 540, 542, 543 Baselines extraction, 442, 686, 687, 691, 692 Borda count, 607 Base-region points, 311 Border removal, 102, 103 Baum-Welch algorithm, 350 Born digital documents, 687, 690 Bayesian combination rules, 739 Bottom-up strategies, 137, 144, 148, 149, 155, Bayesian network, 902, 903 157, 565, 566, 571, 722, 723, 928, BBN, 335, 354, 355 940, 941 Behavioural characteristics, 918 Bounding box, 367, 377, 382 Belga Logos, 632, 640 projections, 761, 764 Bengali Boxing 
systems, 723 alphabet, 304 Brain stroke risk factors, 941 script, 303, 304 Branch-and-bound, 628, 632 Bezier curve, 893 Brightness, 35, 37, 40, 43, 44 Bibliographic Broadcasting industry, 623 citation, 209 Brodatz textures, 314 metadata, 209 Brute-force attacker, 932 Biblio system, 196, 213 B-splines, 125, 126 Bidimensional Burmese, 314, 318 grammars, 513 Business Letter, 181, 182, 185, 186, 192, 197, patterns, 514 199, 204–210, 216

C Chemical drawing recognition, 973–974 C4.5 algorithm, 616 Chinese, 296, 305, 307–310, 312–316, Calligraphic 318–321, 325 characters, 852 Chinese Academy of Sciences (CASIA), 465 interfaces, 951 Chinese and Japanese signatures, 930 Calliope, 20 Chinese handwriting recognition competition, Camera(s), 44–47, 52, 59 463, 465 Camera-based OCR, 844–850, 860, 861, Chinese, Japanese, and Korean (CJK) 872, 874 ideographic , 462 Camera-captured, 546 scripts, 460–463, 465–474 Cancellable biometrics, 931 Chrominance, 628 Candidate(s), 274, 280, 281, 284 CIDK. See Class independent domain Candidate staff-line segments, 757 knowledge (CIDK) Canonical structure, 462 City name recognition, 720, 722, Caption texts, 489, 848, 854, 855, 862, 725–727 866, 872 Class dependent domain knowledge Carbon paper, 28, 31 (CDDK), 197, 204 Car license plates, 846 Classes of forgeries, 932 Case-based reasoning (CBR), 197, 211 Classification, 463, 470–473, 478, 479, CBIR. See Content-based image retrieval 482–484, 535, 538–543, 545, (CBIR) 760–764, 766–769 CCH. See Co-occurrence histograms (CCH) Classification units, 478, 479 CDDK. See Class dependent domain Classifiers, 279, 281, 282, 284–286, knowledge (CDDK) 446–449, 451 Census forms, 363 Classifiers combination, 383, 689, 692 Chain Codes, 120–123, 369, 377, 384, 525, Class independent domain knowledge (CIDK), 527, 534 197, 207 Character Client-entropy measure, 930 forms, 363 Cluttered background, 623 recognition, 331–356, 359–385, 427–453, CNN. 
See Convolutional neural network 459–484, 524, 525, 534, 707, 718, (CNN) 725, 729, 737, 739, 1013, 1025, Coding scheme, 809, 817, 826, 827 1026, 1033 Cognitive psychology, 195 segmentation, 466, 467, 469, 807, 811, Color, 18–20, 28–30, 33, 34, 36–38, 40, 44, 824, 830, 834 46, 47, 50, 598–600, 603, 605–610, Character error rate (CER), 1026 612–615, 617, 623, 629, 631, 635, Character recognition rate (CRR), 1026 848, 849, 851, 853, 862–864, 868 Character Shape Coding, 807, 809–811, clustering, 147, 164, 165, 626, 864, 817–818, 824, 826–828 867, 868 Charged couple devices (CCDs), 39 descriptors, 628 Chart analysis, 998 naming, 604, 611, 638 CHAYIM, 440 printing, 30, 34–36 Check spaces, 625–628, 631 forms, 731, 734–736 Color clustering, dither, 164 models, 735 Combination rules, 934 processing, 705–745 Command parameter, 921, 930 readers, 710 Commercial Check 21, 744 language, 301 Check recognition applications, 708–711, OCRs, 452 714–716, 718, 731, 735, 741, 745 systems, 429, 448, 450 Check recognition system, 707, 708, 712, 731, Committee of experts, 716, 719 733, 743, 744 Comparative study, 759, 762 1040 Index

Comparison, 1012–1016, 1018, 1022, 1024, Correlation filter, 545 1029, 1031, 1032 Cost value, 1028 Competitions, 759, 1013, 1022, 1028, Courtesy amount recognition, 710, 711, 719, 1032–1034 731, 733–741 datasets, 448 Creole , 295 Complementary metal-oxide silicon (CMOS), Critter, 323 39, 40 Cross-correlation, 100, 102, 114, 118 Complete graphic object recognition, Cross-related attributes, 194, 195 963–964, 976 Cryptographic key generation, 931 Complex background, 844, 851, 861 Cryptography, 303 Complexity of a signature, 929 Cube search, 906 Compression, 46, 48–50 Cultural habits, 930 Compression artifacts, 623 Currency sign, 731, 736, 738 Computational complexity, 714, 727 Cursive script, 890, 892, 896, 902, 906–908 Computer language, 295 Curvature, 317, 534, 535 Concavity features, 343 Curvature Scale (CSS), 601 Conditional random field (CRF), 870 Curvilinear, 260, 265, 266, 268 Condorcet method, 607, 609 Cyrillic, 305, 314, 316, 318, 319, 321 Confidence scores, 715, 716, 718 Confidence values, 719, 731, 734, 739–741 Conjunct , 477 D Connected components Damaged characters, 74, 95, 97, 99, 101, 127 analysis, 85, 99, 101, 102, 496, 498 Data acquisition devices, 922 labeling, 74–76, 100, 102 Data-driven approaches, 214 Connected components based method, Data reduction, 893, 895 148, 149 techniques, 928 Connectionist Temporal Classification (CTC), Dataset(s), 448–451, 453, 596, 602, 610, 612, 821, 822 613, 620, 631, 632, 636, 639–641, Consistency checking, 579 983–1002 Consistency model, 924 Data variability, 583 , 462–464, 477, 478 DAVOS, 182, 209 Constraints, 762, 764, 765 DCT, 863, 868 Content 3-D document shape, 123, 126 ownership, 623 Decision stream, 779, 781 making, 719–720, 733, 736, 741 Content-based image retrieval (CBIR), strategy, 739 544, 545 Decorative characters, 872–874 Content-based video retrieval, 848, 849 Defects, 46, 48, 51, 53, 57 Contest, 1013, 1018, 1032–1033 Deformations, 988 Context/contextual, 680, 681, 683, 685, 691, Degradation, 68, 
260, 282, 984, 986, 988, 991, 693–695, 862, 864, 873, 919, 994, 998 924, 931 Degraded document, 812, 817, 824 information, 627, 638, 690, 696, 752, Delaunay triangulation, 146, 149, 761–765, 770 155–158, 213 knowledge, 568, 571, 582 Delayed strokes, 890, 891, 894, 896, Contours 901, 905 extraction, 816 Delta features, 895 tracking, 756–760, 767 Denoising, 953, 954, 975 Contrast, 40, 43, 44, 47–50 Descenders, 325, 442–444 Control variable, 925 Descriptors, 526, 528–539, 541–547 Convolutional neural network (CNN), 471–473 Deskew(ing), 112, 894 Co-occurrence histograms (CCH), 626, 631 Deslant, 112 Copy machines, 20 Detecting expressions in pen-based input, Corner, 851, 863, 864, 866, 870 685, 688

Detecting mathematical expressions, 686 layout analysis, 778, 791, 799, 1021–1024 Detection, 592, 595, 596, 604, 614–616, models, 712, 717 618–641 recognition, 1024, 1025 Detection rate, 1023, 1024 segmentation, 258, 259 script, 301, 303–306, 308, 309, segments, 778, 780, 798 314–321, 326, 461, 465, 475, similarity, 807 477–483 understanding, 178–186 Device interoperability, 940 vector, 828, 829, 834 Dewarping, 123, 126, 992, 994, 1000 Document image 3D gradients, 634 analysis, 301 (al), 303, 304, 891, 901, 902 binarization, 74, 77–79, 84, 91, 93, 127 marks, 428, 434, 435, 438, 441, 442 enhancement, 74, 95, 99, 127, 476 marks removal, 441 normalization, 74, 104, 127 Diagram(s), 492, 493 processing, 806, 807 Diagramming, 968, 970, 971, 976 retrieval, 138, 806–812, 823–838 , 681, 695, 697 Document layouts analysis, 1021–1024 Diaspora, 300 Domain Dictionary, 66, 67, 303 knowledge, 563, 571, 577 Differential invariants, 617, 618 semantics, 695 Digital-born documents, 649, 651, 652, 655, Dot gain, 16, 36 662, 667–672, 674 Dot matrix printers, 27, 30 Digital camera images, 845 DTW. 
See Dynamic time warping (DTW) Digital library(ies), 69, 180, 184, 198, 224, Dynamic 226, 244, 245, 337, 353, 809, information, 889, 890, 904, 912 811, 836 programming, 5, 366, 380–382, 627, 814, Digital paper, 777 816, 828 Digitizing tablets, 922, 932–934 Dynamic and tempo markings, 750, 751, 753 Digit recognizers, 738 Dynamic time warping (DTW), 531, 820, 835, Digit strings, 722, 897–899, 907 923, 926–929, 934–937 Dilation, 153 Dimension sets, 512, 513 Directed graphs, 526, 527 Directional element features, 344 Early fusion, 602, 606–607 Direct matching points (DMPs), 929 Eastern scripts, 428, 450 Disambiguation, 200 Edge Displaying recognition results, 698 detection, 165 Distance, 530, 536, 538–542 pixel, 626 calculation, 74, 75 Edge-backpropagation, 617, 620 reciprocal distortion metric, 1020–1021 Edit distance, 527, 536, 539–541 Distance-based approaches, 506 eInk, 38 Distance-based classification techniques, Elastic matching, 896 934, 935 Electronic Distance-based verification techniques, 926 books, 37–38 Distortion, 616, 621, 623, 626, 628, 986, 990, check remittance, 708 991, 1002 displays, 37 Dithering, 30, 34–37 documents, 69 Ditto machines, 24, 29, 31 pens, 922 Docstrum, 154, 339 whiteboards, 838 Document Electrophotographic printing, 31, 32, 52 analysis systems, 68 Embedded expressions, 686–688, 697 categorization, 192–193, 716–717 Embedding, 524, 527, 536, 538–540, 542, classification, 595, 614, 623, 640 543, 546 degradation, 189 Emission, 405–410 genre, 226, 234, 244 Ending, 752 1042 Index

Engineering drawings, 66, 67, 491, 493–496, Form(s), 66, 67, 69, 647–674, 709, 714, 717, 512, 516 720–722, 731, 734, 743, 744 Enrolment stage, 926 Formal language, 295 Entity Detection Metric (EDM), 1023 Formal problem statements, 698 Entropy, 83 Formats, 985–988, 992, 995 ePaper, 38 Form(s) processing, 224 ePens, 23, 37 Fourier Equal error rate (EER), 931, 935, 938, 939 descriptors, 526, 530, 533 eReaders, 37, 38 transform, 501 Erosion, 153 Fourier–Mellin transform, 530 Error rates, 216, 219, 709–711, 713, 714, 730, Fractal features, 313, 315, 316 731, 734, 741, 1024–1027 Freehand/skilled forgeries, 932 Estrangelo, 430, 431 French national archives, 214 Ethiopic, 314, 318 Frequency (IDF), 835 Euclidean, 539 Frequency (TF) and inverse document, 835 Euramerican scripts, 315 FRESCO, 180, 182 Evaluation, 1011–1034 Full interpretation systems, 556 Evaluation framework, 1030–1031 Function features, 924–926, 928, 934, 937 Expectation-maximization, 350 Fusion, 602, 604, 606–607, 617, 639 Expert knowledge, 580 Fuzzy Explicit character segmentation, 725, classification, 317 736, 737 grammars, 691, 693 Extraction of fields, 735 logic, 447, 449 rule matching, 195

F Facsimile, 32, 44, 55 G False Gabor filters, 314–317, 319, 320, 342, acceptance, 931 345, 346 rejection, 931 Gamera, 753, 771 False-positives, 285 Gamut, 36, 771 Fax machine, 30, 32, 33, 44 Gaussian, 269–271 Featural alphabet, 429, 902 Gaussian kernel, 535 Feature-based word spotting model, 832 Gaussian mixture models (GMM), 347, 350, Features 366, 821 character, 361, 364, 373, 374, 376 Gaussian smoothing filter, 94 extraction, 524, 528, 529, 531, 538, 541, Generalization, 596, 621, 635 544, 719, 738, 739, 892, 895–896 Generalized median graph, 540 sets, 725, 737 Generation of synthetic signatures, 933 vector, 814, 817–819, 821, 822, 824, 831, Generic Fourier descriptor, 530, 533 834, 835 Generic method, 661–663, 667, 672 File formats, 987 Genetic algorithms, 536, 542 Finite-state automata, 738 Geometric First-order language, 212 features, 496 Fisher classifiers, 616, 618 information, 181, 783 Fisher linear discriminant (FLD), 316, 321 knowledge, 207 Flatbed scanner, 39–44, 46, 57, 59 relationships, 783, 798 Flood-fill algorithms, 102–103 tree, 206, 208, 209 F-measure (FM), 202, 1017, 1019, 1024 Gestural Shortcuts, 957 (s) Gesture formation, 322 mode, 950, 955 generation, 294, 322, 326 recognition, 952, 953, 955, 956, 958–960, recognition, 291–327 962, 975–977 Foreground analysis, 144, 148 vocabulary, 958 Index 1043

Ghega datasheets, 218 Graphics recognition workshop (GREC), 491, Ghost characters, 438, 439 492, 511 Ghost character theory, 438 Graphism multiplication, 436, 437 Gill, 440 Grass script, 908 GLCM, 313, 319, 320 Gravure, 19, 24, 30, 31 Global GREC. See Graphics recognition workshop approach, 313 (GREC) features, 920, 926, 937, 940 Ground truth, 985–990, 992–996, 1012–1016, information, 901 1018–1024, 1026–1031, 1034 level, 926, 941 Ground-truth datasets, 217, 219 shape, 901, 904 Ground-truthing tools, 1028, 1030 Global-based vision classifiers, 446 Growth, 266, 268–269, 271, 273, 287 Global thresholding, 78–86, 88, 91, 93 GSC. See Gradient, structural, and concavity Global thresholding techniques, 78–86 (GSC) , 8, 10 Guide, 567, 582, 583 Gobbledoc, 180, 182 Guidelines, 736 Google book search, 198 Gujarati, 461 Gradient , 461, 475 features, 342–344, 468, 470, 471, 480 orientation, 819, 824 Gradient, structural, and concavity (GSC), H 480–481 Haar wavelet transform, 445 Graffiti, 896, 899, 900, 911 Halftoning, 35, 36 Grammars Hallucination, 855 inference, 536 Han, 460, 462 scale factor, 411 Han-based script, 313 Grammatical, 574, 577 Handheld devices, 923, 937 Graph Handprinted, 359–385 descriptors, 532, 536, 538 Hand-sketched, 527, 543, 546 grammars, 691, 693, 695, 765 Handwriting, 23, 51, 52, 986, 990, 991, 1002 hierarchy, 767 Handwriting recognition, 317 prototypes, 536 Handwritten Graph-based addresses, 710, 722, 723 models, 187 imaged documents, 813, 814 representations, 479 mathematical symbols, 688 techniques, 759 page, 136, 172 Graph cut textures, 635 scores, 758, 764, 766, 768 Graph-edit distance, 536 text recognition, 990 , 302–304, 731, 739, 740 word representation, 809 Graphic(s) Handwritten document documents, 553–587 images, 809, 812, 820, 829–832, 835 document understanding, 964–966, 976 retrieval, 808, 814, 835 object, 951–953, 958, 962–964, 975 Handwritten music primitives, 781, 783 scores, 751, 753, 754, 759, 762, 766–768, recognition, 68, 750, 759, 
989, 992, 996–999 symbols, 754 symbols, 302, 303 Handwritten word recognition (HWR), Graphical 812, 820 documents, 490, 491, 494–504, 512, , 460 513, 516 , 460, 462 entities, 638 Harmonic mean, 1017, 1019, 1023 primitives, 752–754, 760–762, 764, Hashing, 544 767, 768 Hausdorff–Besicovitch dimension, 316 Graphic-rich documents, 504 Heading, bar units, ending, 752

Health conditions, 941 HWDB/OLHWDB, 465 , 314, 318, 320, 428, 430, 434, HWR. See Handwrittenword recognition 439–441, 445, 450, 452 (HWR) , 439, 440 Hybrid Hectograph, 24, 29 approaches, 209, 447, 448, 781 , 323 methods, 78 Heuristics, 561–563, 567, 579, 607, 618–621, thresholding techniques, 90–91 761, 768, 769 Hybrid-level classifiers, 446, 447 Heuristics rules, 725, 737, 738 Hidden Markov models (HMMs), 85, 93, 335, 336, 339, 348–353, 362, 371, 378, 380–382, 384, 393, 405–410, IFN/INIT, 448 412, 415–419, 421, 447–451, 482, Illumination, 618, 623, 626, 628, 635, 638 688, 689, 718, 725, 727, 738, 739, Image 762, 763, 768, 770, 771, 820, 821, acquisition, 707, 714, 716, 732, 824, 835, 837, 897–900, 902–905, 733, 736 907–912 databases, 707, 708, 730, 731, 742 Hidden structures, 781, 783 enhancement, 716 Hierarchical composition, 904 features, 806–809, 811, 820, 832 Hierarchical X-Y tree, 616 preprocessing, 1014–1017, 1029 High speed scanners, 716 quality, 68 High-stability regions, 929 search, 846 , 296, 301, 304, 314, 316, 321 segmentation, 779, 787 , 460, 466 Image-based documents, 651–667, 674 Histogram Imaged document retrieval, 807, 823, 827, descriptors, 534 836, 838 features, 342–344 Image features, 811 Historical, 259, 260, 262, 264, 276, 282, Image processing algorithms, 74–75, 127 283, 287 printing, 23, 24, 30, 52 documents, 461, 476 Implementation of an analysis system, 559 documents images, 809 Implicit segmentation, 725, 736, 738, 739 manuscripts, 814, 829, 831 Indexing, 600, 606, 614, 638, 639 History of automated document analysis, 555 Indic scripts, 460–465, 475–483 HMMs. 
See Hidden Markov models (HMMs) Individual dissimilarity values, 937 Holes, 316, 321, 325 Industrial Holistic applications, 707–714 approach, 446 systems, 707, 725, 727 features, 740 INEX book corpus, 202 Holistic word Information recognition methods, 727 extraction, 1017 representation, 809, 811, 812, 818, 828, 835 fusion, 606 word recognition, 725 spotting, 556, 584, 586 Hook, 894 Information retrieval (IR), 198, 216, 806, Horizontal belt, 150 809, 812, 823, 826, 829, 831, 832, Horizontal projection histogram, 814 836, 1017 Horizontal text layout, 466 Information retrieval (IR) systems, 606 Hough, 530 INFORMys, 182, 205, 217–218 Hough transform, 74, 75, 77, 113, 117, INFORMys invoice dataset, 217–218 149–151, 445, 498, 508, 511, Inkjet, 18, 20, 32–34, 51 512, 514 Inkjet printer, 718 Human document analysis, 555 Inks, 15–20, 22, 36, 37, 47, 52 Human interaction, 582 Inpainting, 625, 633–636 Human perception, 195, 196, 211, 215 Integral images, 627 Hu moment invariants, 601, 602 Integration strategy, 713, 716 Index 1045

Interactive whiteboard, 838 Korean, 313, 314, 318, 319, 321 Interest points, 545, 618 KOREN, 440 Interpretation Kullback–Leibler (KL) divergence, 539 context, 556, 559, 564, 567, 581–585, 587 Kurdish , 436 strategies, 560–564, 572, 581 Intersession variability, 925, 930 Intra-class variations, 619 L Invariant features, 871, 873 Labeled graphs, attributed graphs, 536 Inverse document frequency (IDF), 825, Labeling algorithms, 188, 216 832, 835 Language Invoice differences, 921, 930 processing, 203–205 identification, 293–296, 300–302, 307, system, 182, 203 325, 326 IPSAR, 450 independent, 811, 817, 824, 828 Isolated character recognition, 900 Language acquisition device (LAD), 295 Iterative thresholding methods, 83 Language models (LMs), 336, 348, 351–352, 463, 465, 473, 474, 479, 482–484, 909 J Large vocabularies, 722 Japanese, 305, 306, 313, 314, 316, 318, 320, Large-volumes lexicons, 720, 730 321, 325 Laser printer, 20, 32, 34 Late fusion, 602, 604, 607 Latin, 305, 313, 314, 316–321 K Latin based scripts, 313 Kalman filtering, 630 Lattices, 738, 741 , 460, 462, 466 Layout , 304, 306, 316, 320, 321, 461, analysis, 136–139, 144, 150, 615, 616, 683, 477, 478 685, 687, 690–696, 710, 711, 715, Karhunen-Loeve` transform, 366, 375 717–718, 720, 731 Kashti, 438 trees, 683–687, 691–698 , 460 Learning-based approaches, 196 Kernel Left (right) reservoir, 310 function, 540, 542, 543 Left-to-right, 343, 344, 350, 351 trick, 542 Legal amount recognition, 731, 733–735, Kerning, 274, 287, 288 739–742 Keypoint, 860, 863, 868 Legal amounts, 711, 717, 719, 731, 733–737, Keypoint correspondence, 627–628 739, 741 Keyword Legendre moments, 533 recognition, 719 Legibility, 278 spotting, 805–838 Letterpress, 19, 24–26, 29 Keyword-based strategies, 727, 728 Letters have position-dependent shapes, 428 k-Nearest neighbors (k-NN), 146, 149, 154, Level-building algorithm, 898 155, 161, 167, 168, 762, 768, Level set, 266, 271–272, 282, 283, 287 769, 771 Lexicon, 66, 67 Knowledge, 178, 179, 
182–185, 196, 197, Ligatures, 360, 367, 368, 372, 428, 432, 199, 201–205, 207, 208, 210, 212, 436–438, 442, 443, 449, 452, 894, 214–216, 219 898, 899, 902, 903, 907, 908, 910 Knowledge based Ligatures model, 899, 907 approaches, 197 Lightface, 322 systems, 563 Line Knowledge modeling, 554, 560–563, 571–573, drawings, 554 582, 587 extraction, 814 Konkani, 304 tracking, 858 Koranic addition, 432 Line adjacency graph (LAG), 758, 767

Line-fitting, 506, 508 Luminance histogram, 94 Line segmentation, 259–260, 263–271, 287, Lyrics, 750, 751 475, 477, 478, 653, 657, 990, 996, 998 Linguistics, 66–68, 294, 295, 297–304 M Lithography, 24, 28, 29 Maatraas, 462 LMs. See Language models (LMs) Machine Local learning, 196, 197, 210, 211, 213, 219, approach, 315, 324 293, 303, 696, 861, 862, 864–866, decisions, 896 874, 875 level, 926, 941 printed, 360, 367, 368 minima, 442–444 Madrigals, 766, 767 neighborhood structure, 837 Magazines, 181–183, 199, 200, 206, 209–214 ones, 926 Mailroom solutions, 744 texture, 863 Mail sorting, 183, 184, 216, 333, 335, 337 thresholding techniques, 86–90 Majority voting, 196 Local-based approaches, 447 , 314, 316, 319, 320, 461 Local-based vision classifiers, 446 Manhattan layouts, 137, 140, 141, 188 Local gradient histogram (LGH), 821 Manipuri, 301, 304 Localization, 614, 621, 623, 626, 627, MAP-MRF framework, 468 636, 638 MAP rule, 541, 542 Local (adaptive) thresholding, 78 Maps, 491–496, 502, 511, 513, 514, 516, 517 Logical Marathi, 301, 304, 316 component, 778, 779, 781, 782, 784, Markov chain, 350, 351, 405 792, 799 Markov random fields (MRFs), 167, 170, labeling, 180, 182, 184, 187, 188, 193–197, 172, 369, 468, 537, 633, 636, 862, 204, 208, 210, 212, 216–220 869, 870 language, 295 Markov source model, 909 objects, 178, 180, 181, 184, 186, 197, 204, MAT. 
See Medial axis transform (MAT) 206–209, 213, 214, 217 Matched wavelet, 167, 170 Lognormal Matching, 527, 536, 537, 541, 545 impulse response, 921 Matching algorithms, 527, 528, 535, 536, 541, primitives, 921 819, 820, 829, 836, 837 velocity profile, 921 MatchScore, 1022–1024 Logo Math , 695 database, 620, 640, 641 Mathematical detection, 596, 604, 614, 615, 621–641 expressions, 974, 990, 992, 993, 999, 1000 models, 615, 618, 619, 623, 629, 635, 638 information retrieval, 682, 687 recognition, 595, 596, 614–622, 625, morphology, 75, 76, 92 629, 639 notation, 679–698 removal, 596, 622–641 Math recognition system, 681–685, 695–698 spotting, 614, 615, 618, 639 Matrix recognition, 691, 694–695 , 307 Maximal empty rectangles, 146, 147, 149, 160 Logographic, 467 Maximally stable extremal region (MSER), Logographic script, 429 867, 868 Logosyllabic, 892 M -band wavelet feature, 170 Log response time, 921 Mean square error (MSE), 1016, 1019 Log time delay, 921 Mechanical engineering, 974 Long range correlation, 901, 904 Medial axis, 370 Long-term modifications, 930 Medial axis transform (MAT), 525, 529, 537 Loosely-structured documents, 186 Media team Documents Dataset, 217 Low-force forgery, 932 Medical Article Records Ground-truth dataset, Low resolution, 623, 844, 850, 853–856, 21, 211, 212 870, 873 Medical forms, 363 Index 1047

Medical invoices, 193, 203, 205 MS New, 323 MEDLINE, 209 MS , 323 Merged regions, 1024 MST. See Minimum spanning tree (MST) Metadata, 781, 783–785, 793, 930, MS trebuchet, 323 931, 941 Multichannel Gabor filter, 166, 167 Metric, 528, 539, 546 Multi-expert Microfiche, 32, 34, 47 approach, 928, 941 Microfilm, 32, 34, 47 decisions, 719 Middle eastern Multifont, 65 character recognition, 427–453 Multi-grey level images, 707, 715, 716 scripts, 428, 450 Multi-language, 714 writing systems, 429, 430 Multilayer perceptron, 769 Million Book project, 198 Multilevel verification systems, 929 Mimeograph, 24, 31 Multilingual Minimum spanning tree (MST), 146, 149, 154, content, 836 156, 164, 689, 692, 694 documents, 838 Misclassification Penalty Metric (MPM), 1021 Multimodal, 606, 607, 617–618, 639 Missed regions, 1024, 1025 Multimodal biometrics, 933, 941 Mixed languages, 910 Multipage documents, 192 Mobile devices, 744 Multiple-frame processing, 853–855 Model(s), 12, 27, 36, 40, 41, 46, 48, 51–57, 60 Multi-recognition engine strategies, 740 Model-based classification techniques, 937 Multi-resolution Model-based techniques, 928, 938, 941 morphology, 153 Model decoding, 898 representations, 494, 500–501 Model-driven, 214 Multi-resolution analysis (MRA), 535 Model-driven (top-down), data-driven Multi-scale, 717 (bottom-up), 781 decomposition, 534–535 Modeling radicals, 467, 468 layout analysis, 717 Modified fractal signature (MFS), 315 Multi-script Modified quadratic discriminant function document, 294, 305–307, 326 (MQDF), 335, 345–347, 471, OCR, 306–308 472, 482 Multi-scriptor, 711 Modified quadratic discriminant function Multi-touch gesture, 958 (MQDF) classifier, 472 Music interpretation, 998 Moire,´ 36 Music symbols, 750, 752–754, 756, 757, Moments, 313, 318, 368, 370, 372–376, 759–765, 767, 769, 770 378, 384 Monogenetic theory, 298 Mono-scriptor applications, 711 Morphological Named entities, 337 analysis, 200, 717, 723 Naskh, 437–439, 447, 449, 452 operations, 74, 75, 96, 
97, 99, 101, 761, , 437–439, 442, 444, 447, 449 764, 767 Natural images, 622, 627, 640 operators, 75 Natural language processing (NLP), 206, 448 processing, 633, 717, 736 Navigation, 847 structure, 448 n-best list, 411 Morphology Nearest neighbors, 379, 471–472, 542 analysis, 496–498, 502, 503 Nearest neighbors classifiers, 471–472 based method, 153 Near-vertical strokes, 119–120 Motion blurring, 850 Negative Rate Metric (NRM), 1020 MPEG-7 standard, 526, 534 Nepali, 301 MQDF. See Modified quadratic discriminant Neural nets, 718, 725, 737, 739 function (MQDF) Neural networks, 196, 211, 213–215, 762, MRFs. See Markov random fields (MRFs) 768–770 1048 Index

Neuromuscular system, 918, 919, 921 Optical character recognition (OCR), 7, 9, 10, Newspapers, 182, 183, 206, 209–214, 217 64–68, 259, 260, 262, 263, 266, N-gram language model, 482 272–273, 276, 278, 293, 301, 303, N-gram models, 474, 483 306–308, 315, 323, 325, 332, 333, N-grams, 686–688, 690, 697, 828, 909 335–338, 341, 342, 345, 348–350, Niblack, 335, 338 352–355, 524, 657, 663–667, NLP. See Natural Language Processing 673, 750, 752, 780, 795, 798, Noise, 596, 602, 616–618, 621, 622, 630, 633, 806–814, 817, 820, 828, 829, 832, 634, 653, 656, 660, 662, 986–988, 834, 835 990–992, 995, 1002 Optical character recognition (OCR)-based Noise removal, 189–190 techniques, 836 Noisy background, 95, 97, 99, 101 Optical character recognition (OCR) engines, Non-flat characters, 852 718 Non-impact printing, 23, 31–33 Optical music recognition (OMR), 750, Non-invasive, 940 752–759, 767–771 Non-Manhattan layout, 137, 138, 141, 142 Optimization, 865, 870, 872 Nonterminals, 528 Optophone, 333 Non-textual components, 495 Origin of language, 293, 296–299, 326 Non-threatening process, 940 Oriya, 304, 316, 320, 461 Non-uniform lighting, 860–861, 868 Orthogonal, 374 Normalization, 277, 283 , 302 Normalize, 276, 278, 283 Otsu, 338 Notes and rests, 751, 761 Outlier rejection, 720 Numeric fields, 719 Overlapping layout, 139, 141–142, 145, 147–149, 161, 165–172 text lines, 268 Objective assessment, 1032 Over-segmentation, 335, 349 Occlusion, 602, 621, 623, 627, 628, 635 OCR. See Optical character recognition (OCR) OCR’ed, 808, 829, 832, 836 P Off-line, 615, 623, 629 Page Offset mathematical expressions, 683, 686–688 analysis, 992, 994–996 Offset printing, 19, 24, 29, 30 component, 139, 144–148, 150, 151, 154, Old scores, 750 155, 157, 158, 160, 161, 165, 166, Old scores, handwritten music scores, 766, 770 169, 172 Omnifont, 65 decomposition, 187 Omni-scriptor applications, 711 hierarchy, 187 OMR. 
See Optical music recognition (OMR) orientation, 104–112 On-line, 600 Page segmentation, 135–173, 189–191, 212, Online approaches, 770 220, 985, 988, 995 Online graphics recognition, 952, 962, 963, Page segmentation competition, 1022 975, 976 Pages tree model, 779 Online music recognition, 769 Pairwise contour matching methods, 506–508 Ontology, 562, 580 Paleography, 302 …ODA, 208 Paleo-Hebrew, 439 Opaque, 625, 630, 633, 636, 637 Palimpsests, 14, 47 Open content alliance, 198 Palm leaf manuscripts, 476, 477 Opening, closing, 153 Paper Operational strokes, 923 degradation, 766, 768, 771 Operator dominance, 686, 687, 691–693, 695 fingerprint, 847 Operator-driven decomposition, 691–693, 695 Papyrus, 13–14 Operator trees, 683–685, 691, 693, 695, Parallel calculations, 616 697, 698 Parameter features, 924–926

Parchment, 13–14, 16, 47 Point Voronoi diagram, 156, 157 Parsing techniques, 695 Polar, 529–531 Partial differential equations (PDE), 634, 636 Polar coordinates, 530, 533 Partial graphic object recognition, Polygenetic theory, 298 962–963, 976 Polygonal approximation, 494, 504, 508, Partial missed region, 1024, 1025 510, 511 Part-of-speech (POS), 195, 200 Polyline primitives, 545 Patents dataset, 218 Poor-quality printing, 718 Paths, 757–760 Portable documents, 777, 779 Pattern, 599, 604, 617, 632 Postal applications, 705–745 Pattern recognition, 524, 527, 532, 537, 538, Postal automation, 67 540, 543, 545 Postal items, 707, 714, 716, 720 Pattern recognition applications, 743, 744 Postal addresses, 362, 363, 366 PAW, 446, 447, 453 Postal addresses recognition, 709, 710, 714, PDE. See Partial differential equations (PDE) 720, 721, 732 PDFBOX, 781–783, 785, 800 Post processing, 463, 465, 473–474, 479, PDF documents, 776–785, 799, 800 481–484 PDF generation methods, 779 Practical, 581–583 PDL. 
See Picture Description Language (PDL) Precision, 201, 202, 206, 210, 211, 215–217, Peak Signal-to-Noise Ratio (PSNR), 1016, 826, 838, 1017, 1019, 1024, 1027 1017, 1019–1020 Preclassification, 909 Penalty graph minimization, 694 Preprinted texts, 731 Pencil, 19, 20, 22, 23, 28 Preprocessing, 711, 715, 716, 755, 756, 767, Pen-computing technology, 326 769, 950–953, 959, 960, 975 Pens, 13, 16, 18–23, 25, 28, 37, 52 Pre-processing, 461, 466, 473, 475–477, 598, Pen trajectory, 890, 893 615, 619, 629 Percentile features, 344–345 Preregistered template, 717 Perceptually important points, 923, 927 Prevention of fraud, 744 Perceptual organization (PO), 513, 598–601, Price label scanners, 335 610, 614, 639 Primitive(s), 137, 144–149, 154, 158, 164, 527, Performance 528, 535, 537, 540, 541, 543–546, evaluation, 546, 673, 674, 984–986, 990, 752–754, 760–762, 764, 765, 767, 993, 997, 1022, 1028, 1032 768, 770 measures, 216–217 Primitive(s) extraction, 490 metrics, 696 Primitive shape fitting, 954–955 Peripheral features, 468 Printed Persian, 314, 319, 320 addresses, 723 Personal entropy, 924 document, 807, 814, 817, 832, 834 Personalized threshold, 937 Printing, 12, 15–20, 23–37, 44, 48, 49, Perspective distortion, 850, 853, 857–860, 51, 52, 55, 56 873, 875 Prior knowledge, 619 Perturbation(s), 716, 717, 719, 739, 740 Privacy issues, 931 Perturbation(s) techniques, 717 Probabilistic annotation model, 831, 835 , 428, 432 Probabilistic language models, 482 , 299 Probabilistic modeling, 205, 215, 216 Photocopier, 32, 49 Probability Photocopying, 16, 33, 49, 55 density, 266, 269–271, 287 Phrase recognition, 382 model, 823, 825 Picture description language (PDL), 526, 537 Profiles, 620, 629, 631 Pixel based detection, 615–616 Progressive refinement strategy, 722–725 Pixels, 36, 38–40, 44, 50, 51, 53–56 Projection(s) Pixel-to-contour distances, 1021 distortions, 623 Planographic printing, 28–30 histograms, 372–373 Platform-and application-independent, 777 Projection based method, 
148–151

Projection profile(s), 74, 75, 77, 99, 100, 102, errors, 722, 723 103, 105, 106, 112–116, 120–121, lattice, 411 858, 859 rates, 708, 712, 730, 734, 735, 741 Projection profile(s) cutting, 687, 691, 692 strategy, 718, 721, 722, 724, 727, 733, 741 Projection, region growth, 287 systems, 1013, 1025, 1032, 1033 Properties of handprinted, 367–369, 377, 382, Recognition and retrieval, 698 384, 385 Recognition-based localization, 865–867, 873 Prototype-based descriptors, 535, 536 Recognition driven segmentation, 479 Pruning, 820 Recognition of amounts, 710, 711, 714 Pseudo-polar transform, 530 Recognition of legal amounts, 711 Psycho-visual approach, 213 Recognition of logos, 710, 711, 744 Public datasets, 596, 639 Rectangular layout, 138, 140, 144 Public signature databases, 932 Recurrent Neural Network (RNN), 821–824 , 5, 9 Recursive decomposition, 691–693, 695 Pyramidal decomposition, 535, 545 Recursive layout analysis, 692 Recursive XY cut, X-cut, Y-cut, model-based method, 150 Q Redundancy, 680 Quality, 11–60 Reference signature model, 930 Query Reflow, 138 expansion, 826 Region adjacency graphs (RAG), 527, 528, 537 by expression, 682, 698 Region based detection, 616 image, 819, 821 Region growth, 266, 268–269, 273, 287 QuickDiagram, 951, 963, 970–971, 976 Regions, 528, 529, 533, 535, 537, 544, 545 Regression, 529, 533 Regular expressions, 738 R Regularities, 443, 444 Radicals, 467, 468, 470, 892, 903, 904, 906 Rejection scheme, 741 Radon transforms, 529, 530, 533 Reject rate, 710, 741 RAG. 
See Region adjacency graphs (RAG) Rejects, 713, 715, 716, 719, 720, 741 Rajasthani, 301 Relevance feedback, 596, 607–610, 613, 628, Random forgeries, 926, 932, 935, 938 632, 826 Raster-to-vector Relief printing, 23–28 algorithms, 545 Representativeness, 596, 612, 627, 635 conversion, 494, 498, 503–505, 510, 516 Resampling, 894, 895 Ratio, 279, 284 Reservoir base-line, 311 Reading machine, 332–334 Resolution, 32, 34, 36, 38, 39, 44, 46, 48–50, Reading order, 778, 783, 799 53, 54, 343 Real-life Retrieval, 594–615, 617, 622, 623, 625–626, applications, 707 629, 631, 632, 638–640 problems, 744 Retrieval models, 823–826 Real-scenes, 596, 604, 623, 627, 638, 639 Right pages, 180, 182, 200 Real-scenes images, 623, 638 RLSA. See Run-length smearing algorithm Real-time (RLSA); Run-length smoothing computing, 709, 710, 717 algorithm (RLSA) recognition, 710, 714 Rocchio algorithm, 609 Real-world ROC curve, 931 applications, 711, 744 Roman script, 301, 303, 304, 314 data, 1032 Rotation invariant, 146, 161 Recall, 201, 202, 206, 210, 211, 215–217, 826, Routing, indexing, or translation, 307 838, 1017, 1019, 1024 Rule base, 194–195, 197, 209, 219 Receptive fields, 629, 631 Rule-based Reciprocal rank, 607 approaches, 194–195 Recognition system, 762

Rules, 750, 753, 754, 758, 760, 763–765, categories, 598, 604–607, 610 768, 769 gap, 547 Rulings, 650, 652–660, 662–665, 667, Semiformal visual language, 681 669, 670 Semi-global based vision classifiers, 446 Run, 308, 309, 311–315, 318 Semiotics, 571 Run-based encoding, 509 Semi-structured documents, 186 Run-length, 149, 151–152, 309, 311, 314, 315, Semi-transparent, 625, 630, 636, 637 318, 756–759 Sequence of strokes, 919, 920 Run-length smearing algorithm (RLSA), 149, , 272–274, 276, 277 151–153, 213 Server-based OCR, 336 Run-length smoothing, 74–76, 106 SFSA. See Stochastic finite state automaton Run-length smoothing algorithm (RLSA), 67, (SFSA) 76, 106, 151, 498, 502 Shadow-through effects, 95, 97, 99 Russian, 314, 319 Shape context, 601, 603, 604, 611, 618 descriptors, 600–604, 610 S recognition, 603 Sakhr, 450, 452 Shape code, 807, 810–812, 817, 826 , 301, 304 Shape context representation, 837 Santhali, 304 Shape directed covers, 149, 158–160 Sauvola, 335, 339 Shapeme, 603, 611 Sayre dilemma, 446 Shining-through, 95, 97, 99 Scalability, 596, 612, 615, 621–622, 638, 639 Shirorekha, 316, 475, 478 Scanned bitmap image, 780 Short-term modifications, 30 Scanners, 716 Show through, 48–50, 52, 56–57 Scanning, 32, 36, 37, 39, 40, 48, 50, 53–56 SIFT Scene descriptors, 535, 627, 629, 631, 632 images, 846, 849–852, 861, 869, 874, 875 features, 618 texts, 847–849, 851, 852, 854, 862 Sigma-lognormal SCHOCKEN, 440 model, 920–922 Scoring principles, 739 parameters, 933 Scripts, 291–327, 888–897, 900, 901, 903–906 Signal processing, 494, 496, 501, 502 Scripts identification, 293, 294, 301–305, Signatures 307–318, 320, 323, 325, 326 comparison, 926, 928 Seedpoint, 268, 269, 287 complexity of, 929 , 257–288 components, 919, 923 Segmentation segmentation, 923, 936, 940 based, 380–382 repeatability, 930 error, 1025 stability, 929 Segmentation-by-recognition, 725, 726, 729, verification, 984, 993, 999–1001 738, 739 Signature verification competition, 917–942 Segmentation-driven 
OCR, 478 Silkscreen, 24, 30, 31 Segmentation free, 366 Similarity, 807, 811, 824, 825, 828, 829, 832, Segmentation free methods, 807, 828 834–837 Segmentation Metric (SM), 1023 Similarity measure, 538, 539, 545 Segmentation of characters, 360, 361, 372, 382 Similarity measure method, 837 Segmentation-recognition, 719 Simple forgeries, 932, 935, 938 Segmented, 524–526, 531, 543 Simple operations, 761, 764 Selective averaging, 165 Single word recognition, 393, 398, 405, 411 Selforganizing map (SOM), 817, 818, 824 Singularities, 443, 444 Self-related attributes, 194, 195 Skeleton, 362, 364, 369–371, 384, 525, 529 Semantics Skeleton-based methods, 506, 509 ambiguities, 555, 559 Skeletonization, 74–76 analysis, 559 Sketching interfaces, 949–977

Skew, 40, 48, 53, 54, 816 Stochastic language models, 697 Skew detection, 190 Street name recognition, 722, 726–729 Skilled forgeries, 926, 931, 932, 935, 938 Street number recognition, 720, 722, 728–729 Slant Stripe noise, 621 correction, 814, 816, 894 Stroke(s), 274–276, 282, 283 estimation, 119–123 density features, 468–470 Slanted, 262, 281, 287 filter, 868 Slanted line, 316 map, 282 Sliding windows, 525, 545, 625–627 order, 891, 892, 901, 902, 904–906, 908 Slope, 317, 323, 325 relation, 902, 903, 906 Slurs, 751, 753 segmentation, 950, 953, 954, 959, 975 Smart desktop systems, 846 shape, 906 Smart FIX, 204 width, 311, 322 Smartphones, 708, 744 Stroke based approach, 468 Smearing based method, 149, 151, 164 Stroke cavity map (SCM), 282, 283 Smoothing, 267, 893 Stroke width transform (SWT), 867, 868 Sobel, 342 Structural Soft biometrics, 930 descriptors, 525, 528, 530, 534–542, Software, 761, 770–771 544, 546 SOM. See Selforganizing map (SOM) element, 153 Sparse-pixel, 494, 506, 508–509 elements of characters, 377 Spatial features, 343, 362, 365, 377–378, 384 information, 904 or geometric features, 480 relations, 891, 892, 896, 902, 903, 906 information, 904 relationships, 681, 685, 690, 691, 693, 695 method, 655, 668 structure, 901–904, 906 representation, 544 Spatio-temporal persistency, 630 Structure components, 784 Specialized hardware, 709, 716 Structured documents, 180, 186, 212 Spectral Style recognition, 293, 294, 317–322, 325, 326 estimation, 634 Stylus containing a small CCD camera, 922 saliency, 627 Subgraph Spline isomorphism, 536, 545 curves, 322 matching, 541, 545 function, 322 Subregion, 268, 269, 287 knots, 322 Superimposed logos, 623, 625, 630 Spot noise, 621 Superpixel, 145, 147, 868 Spotting Super-resolution, 853–856, 870 methods, 524, 544–547 Supervised learning, 538, 582, 583 system, 543, 544 Support vector machines (SVM), 196, 213, Staff 342, 366, 379, 384, 540, 542, lines, 752, 753, 755–761, 767, 768 543, 687, 688, 762, 768, 769, 862, 
removal, 753–759, 767, 769, 770 865, 866 Stamp recognition, 708 SURF keypoints, 627 Statistical , 891 classifiers, 531, 538, 541–543 Syllabic, 304 descriptors, 528, 530, 546 Syllabic system, 429 features, 468–470, 480 Symbols method, 668–669 identity, 687, 688, 690, 694 Statistical language models, 473, 909 location, 684, 690 Stencil duplicating, 31 normalization, 531 Stochastic, 693, 697 recognition, 680, 683, 685, 688–694, 696, Stochastic context-free grammar, 691, 698, 754, 755, 760, 763, 765, 769, 693, 694 770, 961–964, 969, 973, 976, 984, Stochastic finite state automaton (SFSA), 482 989, 993, 996, 997, 999

relationships, 690 Text line orientation detection, 466 spotting, 556, 568, 583, 997, 999 Text localization, 843–875, 999, 1000 Synergy, 921 Text/non-text discrimination, 861, 862, 865, Syntactic 867, 869, 870, 872 constraints, 688, 691, 694 Text/Non-text classification, 190 descriptors, 535 Text recognition, 848, 849, 851–852, 862, grouping, 200 865, 869–875, 986, 988–990, 992, methods, 691 995–998, 1000, 1025–1026, 1033 models, 514 Text-to-speech conversion, 303, 333 Syntactical, 361, 362, 377, 378 Text touching graphics, 491, 493, 498, Syntactic logical labeling, 195 501–504 Syntactic pattern recognition, 680, 693–694 Texture analysis, 166 Syntax, 571 Texture-Based Graphical Primitives, 494, Synthetic 513–514 data, 984–988, 990–992, 1015, 1032 Texture features, 313–315, 317, 319, imagery, 1029 320, 325 query, 635 , 428, 430, 434 Synthetic individual generation, 933 Thai, 312, 314, 318, 321 Syriac, 428, 430–434, 447, 452 Thermal printers, 30 language, 430, 433 Thinning, 75, 370 words, 432, 433 Threshold(ing) System interoperability, 934 method, 368 Systems, 593, 595–614, 616–618, 621–623, reduction, 153 627, 629, 632, 638–640 Time delay neural network, 910 Time series, 889, 896 , 322, 323 Tobacco-800, 620, 640, 641 Table(s), 67, 647–674 TOCDAS, 200 processing, 1000 TOC page, 198–202 recognition, 984, 989, 993 Tokenization, 200 spotting, 203 Token passing model, 407 Tamil, 316, 320, 461, 477 Top-down Technical architecture, 765 documents, 491, 512, 524, 527, recognition strategies, 718 544, 546 verification strategies, 928 journals, 180, 184, 197, 206, Topology, 601–603, 608, 614 209–214, 217 Top (Bottom) reservoir, 310 Technology Development for Indian Total variation regularization, 96, 99 Languages (TDIL), 465 Touching characters, 260, 263, 265–268, Telugu, 315–317, 319, 461 273–281, 284–288, 371, 372, Template matching, 285, 286, 479, 725 759–761, 764 Touch screen, 923 Temporal Trademark clue, 893 descriptors, 596, 598, 604, 613 persistency, 633 registrations,
593–595, 604, 605, 639 Term frequency, 825, 835 watch, 594, 595, 597, 614 , 528, 537 Trademark logos, 530 Terminal punctuation, 957, 958 Trademark retrieval, 617, 623, 626 Text detection, 814 Trademark retrieval systems, 595–614, 622, Text extraction tools, 782 638–640 Text-graphics separation, 491, 493–497, Trainable sequential maximum 500–503, 512, 515–517 a posteriori, 170 Text image acquisition, 849, 850 Training-free, 615, 618 Text line, 810, 814, 820, 822, 824, 837 Transducer lexicon, 200

Translator, 846 Video, 44, 46 , 303 Video-based text documents, 326 Transparent, 625, 630, 632, 636, 637 Videos Tree-based classifiers, 480 coding, 707, 708, 732, 733 Tree-based document models, 187 inpainting, 633–636 Tree classifier, 316 retrieval, 848, 849 Tree organizations of lexicons, 726 sequences, 596, 622, 623, 628, 629, Tree-organized lexicons, 727 635, 637 Tri-lingual (triplet) documents, 315 Vienna classification, 597, 603, 605–607, 613 TV broadcasts, 622, 639 Visible watermark, 623 TV Logo detection, 622–625, 630, 633, 635, Visual 636, 640 saliency, 864 , 6, 7, 65, 274, 276–278, 281, similarity, 595–598, 601–605, 608, 609, 283, 287, 288, 303, 317, 322, 613, 617 323, 325 vocabularies, 717 Type I error rate, 931 Vocalization, 435 Type II error rate, 931 Voronoi Typesetting, 778 diagram, 146, 147, 149, 156, 157, 161, Typesetting conventions, 194, 197 163, 339 , 19, 24, 26–28, 31, 52 edge, 149, 156, 157, 161 Typewritten addresses, 709 region, 156, 157 Typographic signs, 317 Voting scheme, 626, 627 strategies, 545, 547 technologies, 711 UML diagramming, 957, 958 modifiers, 462, 463, 475, 477, Under-segmentation, 691, 694 478, 481 UNLV database, 207, 208 Vowels, 428, 430–432, 435, 436, 440, UPV-BHMM, 450 462, 463, 478 UPV-PRHLT-REC1, 451 UPV-PRHLT-REC2, 450 UPV-BHMM, 450 W styles, 439 Wabot-2 robot, 759, 760 User interface, 613, 630, 698 Water flow level, 311, 312 User interface design, 966, 971 Watermarks, 15, 47 US National Library of Medicine, 209 Water reservoir, 310–312, 315, 316 UW datasets, 217 Water reservoir area, 311 Wavelet co-occurrence signatures, 314 V decomposition, 113–115 Vectorial representations, 498–500, 505 energy features, 314 Vectorization, 490–494, 503–506, 511, log mean deviation features, 314 515, 516 packet, 167, 169, 170 Vector space (VS), 823, 825, 835 scale co-occurrence signatures, 314 Vellum, 14, 17, 47 transform, Radon transform, 529 Vendors, 707, 708, 710, 731, 742–744 Weak classifier, 543 Vendors of postal recognition
software and Web-based OCR, 336 systems, 732–733 Weighting scheme, 825 Verification, 284, 285 Western-style signatures, 930 Verification process, 919, 926, 930, 931, Whiteboard writing recognition, 837 937, 941 White tile(s), 146, 147, 149, 160–162 Vertical line, 316 Width normalization, 398 Vertical projection histograms, 814 Wiener filter, 90, 92, 95 VHTender, 214 Wigner-Ville distribution, 120, 121

Wild dots, 893 X Windowing approaches, 616 Xerography, 20, 32 Wired-glove device, 922 X-line, 810–813 WISDOM++, 212 XML, 987, 988, 992, 995 Word error rate (WER), 1026–1027 XPDF, 781–783 Word features, 809, 829, 838 X-Y cut, 180, 209 Word image X-Y cut algorithm, 180 coding, 811 Xylography, 23, 24 descriptors, 819 X-Y tree, 67, 187, 195 Word insertion penalty, 411 Word recognition, 359–385, 445–448, 453, 465, 474, 480, 482, 707, 718–719, Y 725, 727, 731, 740, 741 , Ladino, 439 Word recognition rate (WRR), 1026–1027 Word segmentation, 442–444, 446, 447, 466, 478 Z Word shape coding, 810 Zernike, 374, 375 Word spotting, 805–838 Zernike moments, 526, 530, 533, 601, 610, 611 Writer-dependent Zernike polynomial, 533 approaches, 929 ZIP codes, 707, 709, 714, 719, 720, 722–726, threshold, 934 728, 730 Writer identification, 996, 998, 999 Zone classification, 995 Writer-independent approach, 929 Zone segmentation, 984, 986, 988, 992, 994, Writing 995, 997, 1000 media, 5, 8–9 Zones of interest (ZOIs), 819, 824 styles, 8, 9, 905, 906, 911 Zoning systems, 3–10, 429–441, 453 descriptors, 533 zones, 397–399 vector, 817, 824