Combined Script and Page Orientation Estimation Using the Tesseract OCR Engine

Total Page:16

File Type:pdf, Size:1020Kb

Combined Script and Page Orientation Estimation Using the Tesseract OCR Engine Combined Script and Page Orientation Estimation using the Tesseract OCR engine Ranjith Unnikrishnan and Ray Smith Google Inc., 1600 Amphitheatre Pkwy, Mountain View, CA 94043 [email protected], [email protected] ABSTRACT This paper proposes a simple but effective algorithm to es- timate the script and dominant page orientation of the text contained in an image. A candidate set of shape classes for Han Cyrillic Arabic Greek Tamil each script is generated using synthetically rendered text and used to train a fast shape classifier. At run time, the classifier is applied independently to connected components in the image for each possible orientation of the compo- nent, and the accumulated confidence scores are used to determine the best estimate of page orientation and script. Korean Japanese Hebrew Devanagari Kannada Results demonstrate the effectiveness of the approach on a dataset of 1846 documents containing a diverse set of images in 14 scripts and any of four possible page orientations. A C++ implementation of this work will be made avail- able in a future release of the open-source Tesseract OCR Bengali Telugu Thai Fraktur engine [1]. Figure 1: Image samples of the 15 scripts detected Categories and Subject Descriptors using the proposed algorithm. (Latin is not shown. Fraktur is treated as an independent script. See text I.7.5 [Document and Text Processing]: Document Cap- for details.) ture—Optical Character Recognition (OCR) General Terms and does not generalize across scripts. This makes the lan- Algorithms, Languages guage of the text contained in an image a crucial input pa- rameter to specify when using an OCR algorithm. Scripts form a natural appearance-based grouping of lan- Keywords guages, and many languages share the same script. For Script detection, Page orientation detection, Tesseract example, the Russian, Bulgarian and Ukranian languages share the “Cyrillic” script. Most Indic scripts, such as Tel- ugu, Kannada and Tamil, have only one language associated 1. INTRODUCTION with them, whereas the Latin script is shared by at least 26 This paper focuses on the problem of estimating the script common languages. Thus a solution to the script detection and dominant page orientation of printed text in an image. problem either solves the language identification problem or To accurately recognize the text in an image, optical charac- reduces it to a smaller problem. ter recognition (OCR) algorithms often utilize a great deal Independent of knowledge of the script, OCR algorithms of prior knowledge, such as of the shapes of characters, list often make the natural assumption that the text in the image of words and the frequencies and patterns with which they being processed is upright. This need not always be the case. occur. Much if not all of this knowledge is language-specific For instance, large tables and figures in books are frequently printed in landscape mode while the content of the book may be oriented in portrait mode. Or the book itself could be fed as input in an incorrect orientation. Such examples of Permission to make digital or hard copies of all or part of this work for mismatch between the input and the expectations of typical personal or classroom use is granted without fee provided that copies are OCR algorithms are common, and lead inevitably to the not made or distributed for profit or commercial advantage and that copies OCR algorithm producing garbage output. bear this notice and the full citation on the first page. To copy otherwise, to One brute-force solution would be to use a traditional republish, to post on servers or to redistribute to lists, requires prior specific OCR algorithm to process each input page once for each pos- permission and/or a fee. MOCR '09, July 25, 2009 Barcelona, Spain sible combination of orientation and language mode. How- Copyright 2009 ACM 978-1-60558-698-4/09/07 ...$10.00. ever, this would be impractically slow, and also require a separate procedure to determine validity of the text pro- 2. APPROACH duced as output in each orientation-language configuration. Our proposed approach falls into the category of “local” This paper proposes a simple but effective approach to es- approaches and operates by classifying individual connected timate script and dominant orientation not only with high components independently. This strategy gives it the com- accuracy, but also in far less time than it would take to pelling advantages of not requiring word segmentation or process the image even once using a traditional OCR al- text-line finding as a pre-processing step and of being able gorithm. The solution scales well enough to distinguish to work on small input images. between 14 scripts - Latin, Cyrillic, Greek, Hebrew, Ara- The basic idea behind the proposed approach is simple. bic, Chinese (or Han), Japanese, Korean, Thai, Devanagari, A shape classifier is trained on characters (classes) from all Kannada, Tamil, Telugu and Bengali - spanning more than the scripts of interest. At run-time, the classifier is run in- 40 languages, and also distinguish between Fraktur and non- dependently on each connected component (CC) in the im- Fraktur fonts. age and the process is repeated after rotating each CC into ◦ ◦ ◦ three other candidate orientations (90 , 180 and 270 from the input orientation). The algorithm keeps track of the 1.1 Related work estimated number of characters in each script for a given orientation, and the accumulated classifier confidence score Past approaches to solve the script detection problem may across all candidate orientations. The estimate of page ori- be grouped to three broad categories. The first may be entation is chosen as the one with the highest cumulative referred to as “global” or texture-based approaches. Algo- confidence score, and the estimate of script is chosen as the rithms in this category compute discriminative features on one with the highest number of characters in that script for blocks of text using image filters to determine patterns that the best orientation estimate. are unique to the language or the script. Chaudhury et. al [2] The main difficulty with this strategy is the large total proposed using a frequency domain representation of projec- number of classes associated with all 14 scripts. The Han tion profiles of horizontal text lines. Busch et. al [3] present script alone has several thousands of characters in frequent an extensive evaluation of a broad number of texture fea- use. In scripts such as Devanagari and Arabic, the shapes of tures, including projection profiles, Gabor and wavelet fea- letters can assume different forms depending on context, and tures and gray-level co-occurrence matrices for detecting the unlike Latin, the letters can join to form single or multiple script. This category of approaches has the drawbacks of connected components. These properties of scripts combi- requiring large and aligned homogeneous regions of text in natorially increase the total number of possible shapes, and one script, and of the features in question often being neither since our approach requires associating a class (and script) to very discriminative nor reliable to compute in the presence each connected component, the effective number of classes of noisy or skewed text. to train a shape classifier with can be impractically large. The second category of approaches may be referred to as Once the set of possible shape classes has been identified “local” or connected-component based. These utilize shape for all the scripts, the problem then becomes that of how and stroke characteristics of individual connected compo- to choose a subset of these classes with which to train the nents. Hochberg et. al [4] proposed using script-specific shape classifier. templates by clustering frequently occurring character or One possible approach to the problem of selecting rep- word shapes. Spitz et. al [5] construct shape codes that resentative classes for the scripts is through discriminative capture the concavities of characters, and use them to first selection. This involves selecting shape classes in a script classify them as Latin-based or Han-based, and then within that have a high “distance” to classes from other scripts. those categories using other shape-features. Ma et al [6] This distance between any two shapes may be computed as use Gabor-filters with a nearest-neighbor classifier to deter- a function of the classifier confidence when trained on one mine script and font-type at the word-level. Several hybrid shape and evaluated on the other, or by directly embedding variants of local and global approaches have also been sug- the shapes in an appropriate feature space. gested [7]. We see two main problems with a discriminative approach The third category of approaches may be referred to as to class selection. The first is that discriminative classes are “text-based”. Algorithms in this category work by process- not necessarily frequent. For example, suppose we choose ing the entire page with a traditional OCR engine using the lower-case ‘q’ as a class for the Latin script as it does not one or more “pilot” language modes and then use a separate share a shape similarity with any character from a different procedure to analyze the statistics of the (potentially inaccu- script. Given an image of text, the frequency with which the rate) output to guess what language the original image text letter ‘q’ occurs will likely be small. Thus, given a strategy may have been in. This category of approaches cleverly uti- of classifying a randomly sampled set of connected compo- lizes the fact that although shape classifiers are predictably nents, the expected time to identify the script as Latin will wrong when evaluated on classes they are not trained on, the be high. This limits the accuracy of the algorithm when the errors they make tend to be repeatable.
Recommended publications
  • Old German Translation Tips
    OLD GERMAN TRANSLATION TIPS BASIC STEPS 1. Identify the alphabet used (Fraktur, Sütterlin/Suetterlin, or Kurrent) and convert the letters to modern equivalents, and then; 2. Translate from German to English. CONVERTING OLD ALPHABETS Helps for Translating That Old German Handwriting http://nancysfamilyhistoryblog.blogspot.com/2011/06/helps-for-translating-that-old- german.html Helps for Translating the Old German Typeface http://nancysfamilyhistoryblog.blogspot.com/2011/10/helps-for-translating-old-german.html Omniglot https://www.omniglot.com/writing/german.htm –Scroll down for script styles. ONLINE TRANSLATORS AND DICTIONARIES Abkuerzungen http://abkuerzungen.de/main.php?language=de – for finding the meaning of abbreviations. Leo https://www.leo.org/german-english Members can also connect with each other via the LEO forums to ask for help. Langenscheidt online dictionary: Lingojam https://lingojam.com/OldGerman Linguee https://www.linguee.com/ Choose German → English from the dropdown menu. This site uses human translators, not machines! It gives you examples of the words used in a real-life context and provides all possible German translations of the word. You will find that anti-virus software treats some translation sites/apps (e.g. Babylon translator) as malware. ‘GENEALOGICAL GERMAN’ Family words in German https://www.omniglot.com/language/kinship/german.htm German Genealogical Word List: https://www.familysearch.org/wiki/en/German_Genealogical_Word_List Understanding a German Baptism Records http://www.stemwedegenealogy.com/baptism_sample.htm Understanding German Marriage Records http://www.stemwedegenealogy.com/marriage_sample.htm FACEBOOK HELP Genealogy Translations https://www.facebook.com/groups/genealogytranslation/ – for a wide range of languages, not only German. • Work out as much as you can for yourself first.
    [Show full text]
  • ISO/IEC International Standard 10646-1
    Final Proposed Draft Amendment (FPDAM) 5 ISO/IEC 10646:2003/Amd.5:2007 (E) Information technology — Universal Multiple-Octet Coded Character Set (UCS) — AMENDMENT 5: Tai Tham, Tai Viet, Avestan, Egyptian Hieroglyphs, CJK Unified Ideographs Extension C, and other characters Page 2, Clause 3 Normative references Page 20, Sub-clause 26.1 Hangul syllable Update the reference to the Unicode Bidirectional Algo- composition method rithm and the Unicode Normalization Forms as follows: Replace the three parenthetical notations in the second Unicode Standard Annex, UAX#9, The Unicode Bidi- sentence of the first paragraph with: rectional Algorithm, Version 5.2.0, [Date TBD]. (syllable-initial or initial consonant) (syllable-peak or medial vowel) Unicode Standard Annex, UAX#15, Unicode Normali- (syllable-final or final consonant) zation Forms, Version 5.2.0, [Date TBD]. Page 2, Clause 4 Terms and definitions Page 20, Clause 27 Source references for CJK Ideographs Replace the Code table entry with: In the list after the second paragraph, insert the follow- Code chart ing entry: Code table Hanzi M sources, A rectangular array showing the representation of coded characters allocated to the octets in a code. In the third paragraph, replace “(G, H, T, J, K, KP, V, and U)” with “(G, H, M, T, J, K, KP, V, and U)”. Remove the „Detailed code table‟ entry. Replace all further occurrences of „code table‟ in the Page 21, Sub-clause 27.1 Source references text of the standard with „code chart‟. for CJK Unified Ideographs Replace the last sentence of the second paragraph Page 13, Clause 17 Structure of the code with the following: tables and lists (now renamed „Structure of the code charts and lists‟) The current full set of CJK Unified Ideographs is represented by the collection 385 CJK UNIFIED Insert the following note after the first paragraph.
    [Show full text]
  • 5892 Cisco Category: Standards Track August 2010 ISSN: 2070-1721
    Internet Engineering Task Force (IETF) P. Faltstrom, Ed. Request for Comments: 5892 Cisco Category: Standards Track August 2010 ISSN: 2070-1721 The Unicode Code Points and Internationalized Domain Names for Applications (IDNA) Abstract This document specifies rules for deciding whether a code point, considered in isolation or in context, is a candidate for inclusion in an Internationalized Domain Name (IDN). It is part of the specification of Internationalizing Domain Names in Applications 2008 (IDNA2008). Status of This Memo This is an Internet Standards Track document. This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 5741. Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc5892. Copyright Notice Copyright (c) 2010 IETF Trust and the persons identified as the document authors. All rights reserved. This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.
    [Show full text]
  • Proposal for a Kannada Script Root Zone Label Generation Ruleset (LGR)
    Proposal for a Kannada Script Root Zone Label Generation Ruleset (LGR) Proposal for a Kannada Script Root Zone Label Generation Ruleset (LGR) LGR Version: 3.0 Date: 2019-03-06 Document version: 2.6 Authors: Neo-Brahmi Generation Panel [NBGP] 1. General Information/ Overview/ Abstract The purpose of this document is to give an overview of the proposed Kannada LGR in the XML format and the rationale behind the design decisions taken. It includes a discussion of relevant features of the script, the communities or languages using it, the process and methodology used and information on the contributors. The formal specification of the LGR can be found in the accompanying XML document: proposal-kannada-lgr-06mar19-en.xml Labels for testing can be found in the accompanying text document: kannada-test-labels-06mar19-en.txt 2. Script for which the LGR is Proposed ISO 15924 Code: Knda ISO 15924 N°: 345 ISO 15924 English Name: Kannada Latin transliteration of the native script name: Native name of the script: ಕನ#ಡ Maximal Starting Repertoire (MSR) version: MSR-4 Some languages using the script and their ISO 639-3 codes: Kannada (kan), Tulu (tcy), Beary, Konkani (kok), Havyaka, Kodava (kfa) 1 Proposal for a Kannada Script Root Zone Label Generation Ruleset (LGR) 3. Background on Script and Principal Languages Using It 3.1 Kannada language Kannada is one of the scheduled languages of India. It is spoken predominantly by the people of Karnataka State of India. It is one of the major languages among the Dravidian languages. Kannada is also spoken by significant linguistic minorities in the states of Andhra Pradesh, Telangana, Tamil Nadu, Maharashtra, Kerala, Goa and abroad.
    [Show full text]
  • A Script Independent Approach for Handwritten Bilingual Kannada and Telugu Digits Recognition
    IJMI International Journal of Machine Intelligence ISSN: 0975–2927 & E-ISSN: 0975–9166, Volume 3, Issue 3, 2011, pp-155-159 Available online at http://www.bioinfo.in/contents.php?id=31 A SCRIPT INDEPENDENT APPROACH FOR HANDWRITTEN BILINGUAL KANNADA AND TELUGU DIGITS RECOGNITION DHANDRA B.V.1, GURURAJ MUKARAMBI1, MALLIKARJUN HANGARGE2 1Department of P.G. Studies and Research in Computer Science Gulbarga University, Gulbarga, Karnataka. 2Department of Computer Science, Karnatak Arts, Science and Commerce College Bidar, Karnataka. *Corresponding author. E-mail: [email protected],[email protected],[email protected] Received: September 29, 2011; Accepted: November 03, 2011 Abstract- In this paper, handwritten Kannada and Telugu digits recognition system is proposed based on zone features. The digit image is divided into 64 zones. For each zone, pixel density is computed. The KNN and SVM classifiers are employed to classify the Kannada and Telugu handwritten digits independently and achieved average recognition accuracy of 95.50%, 96.22% and 99.83%, 99.80% respectively. For bilingual digit recognition the KNN and SVM classifiers are used and achieved average recognition accuracy of 96.18%, 97.81% respectively. Keywords- OCR, Zone Features, KNN, SVM 1. Introduction recognition of isolated handwritten Kannada numerals Recent advances in Computer technology, made every and have reported the recognition accuracy of 90.50%. organization to implement the automatic processing Drawback of this procedure is that, it is not free from systems for its activities. For example, automatic thinning. U. Pal et al. [11] have proposed zoning and recognition of vehicle numbers, postal zip codes for directional chain code features and considered a feature sorting the mails, ID numbers, processing of bank vector of length 100 for handwritten Kannada numeral cheques etc.
    [Show full text]
  • Deciphering Old German Documents Using the Online German Script Tutorial Bradley J
    Deciphering Old German Documents Using the Online German Script Tutorial Bradley J. York Center for Family History and Genealogy at Brigham Young University Abstract The German Script Tutorial (http://script.byu.edu/german) is a new online resource for learning how to decipher documents written in old German script. The German Script Tutorial provides descriptions and actual examples of each German letter. Animations demonstrate how each letter is written. Tests assess users’ ability to read and write letters, words, and passages. User accounts allow users to save their test scores and keep track of which letters they need to practice. The tutorial also provides basic guidelines on how to find vital (i.e., genealogical) in- formation in old German documents. Introduction One of the greatest difficulties in German genealogical research is deciphering old docu- ments and manuscripts written in old German script. The term old German script refers to the handwriting and typefaces that were widely used in German-speaking countries from me- dieval times to the end of World War II. Al- though old German script was based on the Latin alphabet, as were old English, French, Italian, and Spanish scripts, individual letters of old German script may appear to the untrained eye as unique as letters of the Cyrillic or even Arabic alphabets. Since old German script has not been taught in German schools since the 1950’s, even most native Germans are unable to read and write it nowadays. Thus, decipher- This is an old German emigration document consisting of printed and handwritten information. Both the hand- ing old German documents is problematic for writing and the printed Fraktur typeface are styles of old German script.
    [Show full text]
  • Report of the Mikrusian Alphabet 1 Introduction
    Report of the Mikrusian alphabet 1 Introduction People of Earth speak in different languages. There are several hundreds of various languages. Each language is characterized by its own specific set of sounds and by a peculiarity of their articulation. However, there are people that regard their native languages as same ones, live in distant regions, and articulate equal-sense words with some difference. And the difference may be only in a point that some sounds are articulated with a moderate difference. The typical example is a situation with German language natives in different regions of Europe. In spite of this, these people communicate with each other without problems in understanding and they kick the articulation distinction to «accent», «dialect», «patois», or another term. Furthermore, there exist people that articulate individual sounds slightly in a different way in comparison with other “same-language” inhabitants of the same region. Usually this is classified by speech therapists as a speech defect. But this does not lead to problems in understanding such humans “with defect”. This is connected with the following: each listener identifies all “similar” sounds for himself. Our brain acts analogously when it analyzes all sounds in speech. The brain identifies sounds that belong to a certain range; it consider them as equal, as “equivalent”. 1.1 Classification subproblems In spite of such variety of languages, the variety of speech sounds is stipulated and bounded by possibilities of a human vocal apparatus. It has approximately the same possibilities in speech for all people (with rare exceptions of pathologies). The question arises: is it possible to classify sounds that a human vocal apparatus can generate in speech? As for any classification problem, one should solve the following subproblems.
    [Show full text]
  • The Unicode Standard, Version 4.0--Online Edition
    This PDF file is an excerpt from The Unicode Standard, Version 4.0, issued by the Unicode Consor- tium and published by Addison-Wesley. The material has been modified slightly for this online edi- tion, however the PDF files have not been modified to reflect the corrections found on the Updates and Errata page (http://www.unicode.org/errata/). For information on more recent versions of the standard, see http://www.unicode.org/standard/versions/enumeratedversions.html. Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and Addison-Wesley was aware of a trademark claim, the designations have been printed in initial capital letters. However, not all words in initial capital letters are trademark designations. The Unicode® Consortium is a registered trademark, and Unicode™ is a trademark of Unicode, Inc. The Unicode logo is a trademark of Unicode, Inc., and may be registered in some jurisdictions. The authors and publisher have taken care in preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein. The Unicode Character Database and other files are provided as-is by Unicode®, Inc. No claims are made as to fitness for any particular purpose. No warranties of any kind are expressed or implied. The recipient agrees to determine applicability of information provided. Dai Kan-Wa Jiten used as the source of reference Kanji codes was written by Tetsuji Morohashi and published by Taishukan Shoten.
    [Show full text]
  • Calligraphy Specimens Following Writing
    Calligraphy Specimens following Writing Manual by ADOLPH ZUNNER[?] printed by JOHANN CHRISTOPH WEIGEL or CHRISTOPH WEIGEL THE ELDER In German and Latin, manuscript on paper Germany (Nuremberg), c. 1713 20 folios on paper, complete [collation i20], unidentified watermark (bisected with center lacking, crest holding 3? bezants with ornate frame, initials M and F at bottom), foliation in modern pencil in upper recto corners, text written in various calligraphic scripts in black ink on recto only, no visible ruling and varied justification, sketched decorative evergreen boughs on f. 1 and calligraphic scrollwork throughout, minor flecking and staining, some original ink blots. CONTEMPORARY BINDING, brown (once red?) brocade paper with elegant mixed floral design and traces of gold embossing, pasted spine, abrasion and discoloration but wholly intact. Dimensions 150 x 190 mm. Calligraphic sample books from the Renaissance, such as this manuscript, are far less common than their printed exemplars; this charming booklet, designed for teaching writing to the young, appears to be one of a kind. This volume in its fine contemporary binding includes texts that display a scribe’s skill in writing different types of scripts. It is partially copied from writing master Adolph Zunner’s 1709 Kunstrichtige Schreib-Art printed in Nuremberg by famous publisher and engraver [Johann] Christoph Weigel. PROVENANCE 1. Written in Germany, in Nuremberg, in 1713 or shortly thereafter, with a title page reading Gründliche Unterweisung zu Fraktur – Canzley – und Current Schrifften der lieben Jugend zum Anfang des Schreibens und sondern Nuzen gestellet durch A. <A. or Z.?> in Nürnberg Zufinden bey Johann Christoph Weigel (A Thorough Instruction in Fraktur, Chancery, and Cursive Scripts, prepared for the especial utility of dear Youth in beginning to write by A.
    [Show full text]
  • Download Eskapade
    TYPETOGETHER Eskapade Creating new common ground between a nimble oldstyle serif and an experimental Fraktur DESIGNED BY YEAR Alisa Nowak 2012 ESKAPADE & ESKAPADE FRAKTUR ABOUT The Eskapade font family is the result of Alisa Nowak’s unique script practiced in Germany in the vanishingly research into Roman and German blackletter forms, short period between 1915 and 1941. The new mainly Fraktur letters. The idea was to adapt these ornaments are also hybrid Sütterlin forms to fit with broken forms into a contemporary family instead the smooth roman styles. of creating a faithful revival of a historical typeface. Although there are many Fraktur-style typefaces On one hand, the ten normal Eskapade styles are available today, they usually lack italics, and their conceived for continuous text in books and magazines italics are usually slanted uprights rather than proper with good legibility in smaller sizes. On the other italics. This motivated extensive experimentation with hand, the six angled Eskapade Fraktur styles capture the italic Fraktur shapes and resulted in Eskapade the reader’s attention in headlines with its mixture Fraktur’s unusual and interesting solutions. In of round and straight forms as seen in ‘e’, ‘g’, and addition to standard capitals, it offers a second set ‘o’. Eskapade works exceptionally well for branding, of more decorative capitals with double-stroke lines logotypes, and visual identities, for editorials like to intensify creative application and encourage magazines, fanzines, and posters, and for packaging. experimental use. Eskapade roman adopts a humanist structure, but The Thin and Black Fraktur styles are meant for is more condensed than other oldstyle serifs.
    [Show full text]
  • Five Centurie< of German Fraktur by Walden Font
    Five Centurie< of German Fraktur by Walden Font Johanne< Gutenberg 1455 German Fraktur represents one of the most interesting families of typefaces in the history of printing. Few types have had such a turbulent history, and even fewer have been alter- nately praised and despised throughout their history. Only recently has Fraktur been rediscovered for what it is: a beau- tiful way of putting words into written form. Walden Font is proud to pres- ent, for the first time, an edition of 18 classic Fraktur and German Script fonts from five centuries for use on your home computer. This booklet describes the history of each font and provides you with samples for its use. Also included are the standard typeset- ting instructions for Fraktur ligatures and the special characters of the Gutenberg Bibelschrift. We hope you find the Gutenberg Press to be an entertaining and educational publishing tool. We certainly welcome your comments and sug- gestions. You will find information on how to contact us at the end of this booklet. Verehrter Frakturfreund! Wir hoffen mit unserer "“Gutenberg Pre%e”" zur Wiederbelebung der Fraktur= schriften - ohne jedweden politis#en Nebengedanken - beizutragen. Leider verbieten un< die hohen Produktion<kosten eine Deutsche Version diese< Be= nu@erhandbüchlein< herau<zugeben, Sie werden aber den Deutschen Text auf den Programmdisketten finden. Bitte lesen Sie die “liesmich”" Datei für weitere Informationen. Wir freuen un< auch über Ihre Kommentare und Anregungen. Kontaktinformationen sind am Ende diese< Büchlein< angegeben. A brief history of Fraktur At the end of the 15th century, most Latin books in Germany were printed in a dark, barely legible gothic type style known asTextura .
    [Show full text]
  • A Brief History of the Blackletter Font Style
    A Brief History of the Blackletter Font Style By: Linda Tseng ● Blackletter font also called Fraktur, Gothic, or Old English. ● From Western Europe such as Germany, Germany use Gothic until World War II. ● During twelfth century to twenty century. ● Blackletter hands were called Gothic by the modernist Lorenzo Valla and others in middle fifteenth century in Italy. ● Used to describe the scripts of the Middle Ages in which the darkness of the characters overpowers the whiteness of the page. ● In the 1500’s, blackletter became less popular for printing in a lots of countries except Germany and the Countries that speaking German. ● The best Textura specimen in the Gutenberg Museum Library’s collection is comes from Great Britain. Overview ● Rotunda types, the second oldest blackletter style never really caught on as a book type in German speaking lands, although twentieth century calligraphers, as well as arts and crafts designers, have used it quite well for display purposes. ● Cursiva- developed in the 14th century as a simple form of textualis. ● Hybrida( bastarda)- a mixture of textualis and cursiva, and it’s developed in the early 15th century. ● Donatus Kalender- the name for the metal type design that Gutenberg used in his earliest surviving printed works, and it’s from the early 1450s. Blackletters are difficult to read as body text so they are better used for headings, logos, posters and signs. ● Newspaper headlines, magazine, Fette Fraktur are used for old fashioned headlines and beer advertising. ● Wilhelm Klingspor Gotisch adorns many wine label. ● Linotext and Old English are popular choices for certificates.
    [Show full text]