Narges Roshanbin

Total Page:16

File Type:pdf, Size:1020Kb

Narges Roshanbin Interweaving Unicode, Color, and Human Interactions to Enhance CAPTCHA Security by Narges Roshanbin A thesis submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Software Engineering and Intelligent Systems Department of Electrical and Computer Engineering University of Alberta © Narges Roshanbin, 2014 Abstract Web security has become a critical issue due to the rising reliance of people on diverse types of online transactions. A CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a security mechanism that is widely used to maintain the security of web services by preventing malicious programs from accessing these resources automatically. Despite the existence of several types of CAPTCHAs, many of them have been compromised due to their inherent vulnerabilities and the development of strong artificial intelligence and image recognition algorithms. The vulnerabilities of existing CAPTCHAs coupled with the trend of heavier dependence on the Internet calls for the development of a new generation of CAPTCHAs that are substantially more complex for machines, and yet easy to understand and solve by human users. In this thesis, we propose, implement and test a new CAPTCHA, which shows resistance against various forms of segmentation and recognition attacks. The multi-layered security approach employed in this CAPTCHA mainly comes from its use of Unicode as an input space, a virtual keyboard as the input device, the use of homoglyphs, the correlated usage of color in the foreground and background, and solution submission through intelligent human interaction. Furthermore, several forms of randomization are employed within different elements of the CAPTCHA which minimize the formation of detectable patterns that can be exploited by machines and make the attacks computationally complex for attackers. Our analyses provide evidence for substantial resistance of the proposed CAPTCHA against major attack types. Our CAPTCHA’s game-like solution procedure is enjoyable and intuitive for human users despite its relatively longer solution time compared to existing CAPTCHAs which comes as the ii price for the higher level of security it affords. Our user studies indicate that the CAPTCHA’s solving accuracy is comparable to major CAPTCHAs in use today. The complexity of this CAPTCHA can be further modified based on the security requirements of the resource being protected. Additional security- and usability-enhancing modifications are proposed and tested, which can further improve the CAPTCHA’s security or usability as needed. iii Preface This research project received research ethics approval from the University of Alberta Research Ethics Board, Project Name “An investigation to improve CAPTCHAs used for distinguishing humans from machines on the Internet”, ID. Pro00028168, Feb. 2012. Chapter 1 of this dissertation has been published as N. Roshanbin and J. Miller, “A survey and analysis of current CAPTCHA approaches,” J. Web Engineering, vol. 12, no. 1, pp. 1-40, 2013. Sections 2.5 and 2.6 have been published as N. Roshanbin and J. Miller, “Finding Homoglyphs - A Step towards Detecting Unicode-Based Visual Spoofing Attacks,” in Web Information System Engineering - WISE, 2011, pp. 1-14. The majority of Chapter 2 is derived from an article submitted for publication as N. Roshanbin and J. Miller, “ADAMAS: Interweaving Unicode and Color to Enhance CAPTCHA Security,” J. Future Generation Computer Systems. Chapter 3 and the appendix of this thesis have been submitted for publication as N. Roshanbin and J. Miller, “Enhancing CAPTCHA security using interactivity, dynamism, and mouse movement patterns,” International Journal of Information Security. An article has been submitted for publication which uses material from Chapter 5 (N. Roshanbin and J. Miller, “A comparative study of the performance of local feature-based pattern recognition algorithms,” J. Pattern Recognition). iv Dedication This dissertation is dedicated to my family without whom I would not be here: To my husband, Moein, who has been a source of love, encouragement and inspiration to me. Thank you for all the joy you have brought to my life and for patiently bearing with me during the completion of this work and encouraging me in challenging situations. To my parents, Fatemeh and Mehdi, for their unconditional love and support throughout my life, for teaching me the value of hard work, and for instilling in me the will to pursue my dreams. Thank you for your gifts of life and love. To my sisters, Asieh and Razieh, whose love, sincerity, intimacy and empathy have always been a source of peace and happiness for me. v Acknowledgements I would like to express my sincere gratitude to Professor James Miller, for his continuous support and guidance during my years in the PhD program and throughout the completion of this dissertation. Having professor Miller as my supervisor has been a true blessing. I would also like to thank my committee members Dr. Bruce Cockburn, Dr. John Aycock, Dr. Eleni Stroulia and Dr. John Salmon for their thoughtful and constructive comments and feedback. In addition, I am thankful to the institutions that granted me scholarships and financial assistance during this research: Alberta Innovates – Technology Futures for iCORE scholarship and University of Alberta for Queen Elizabeth II scholarship. vi Contents INTRODUCTION ....................................................................................................................................... 1 CHAPTER 1 AN OVERVIEW OF EXISTING CAPTCHA SYSTEMS .......................................... 6 1.1 INTRODUCTION ................................................................................................................................................. 6 1.2 CURRENT CAPTCHAS ....................................................................................................................................... 7 1.2.1 Text-based CAPTCHAs .............................................................................................................................. 9 1.2.2 Image-based CAPTCHAs ......................................................................................................................... 17 1.2.3 Audio-based CAPTCHAs ......................................................................................................................... 22 1.2.4 Motion-based CAPTCHAs ....................................................................................................................... 23 1.2.5 Hybrid CAPTCHAs ................................................................................................................................... 24 1.3 BREAKING CAPTCHAS .................................................................................................................................... 27 1.3.1 Algorithmic approaches to break CAPTCHAs ......................................................................................... 27 1.3.2 Random guessing attacks ...................................................................................................................... 41 1.3.3 Using social engineering to break CAPTCHAs ........................................................................................ 41 1.4 ROBUSTNESS OF CAPTCHAS ............................................................................................................................ 42 1.4.1 Segmentation resistance........................................................................................................................ 43 1.4.2 Recognition resistance ........................................................................................................................... 46 1.4.3 Random guessing resistance .................................................................................................................. 47 1.4.4 Security against third party human solver attacks ................................................................................ 48 1.4.5 Other security measures ........................................................................................................................ 48 1.5 USABILITY OF CAPTCHAS ................................................................................................................................ 49 1.5.1 Distortion ............................................................................................................................................... 49 1.5.2 Content .................................................................................................................................................. 50 1.5.3 Presentation ........................................................................................................................................... 51 1.6 CONCLUSION .................................................................................................................................................. 52 CHAPTER 2 THE PROPOSED CAPTCHA ..................................................................................... 58 2.1 INTRODUCTION ............................................................................................................................................... 58 2.2 INTERACTIVE CAPTCHAS ................................................................................................................................. 60 2.3 OVERVIEW OF OUR APPROACH ..........................................................................................................................
Recommended publications
  • Cloud Fonts in Microsoft Office
    APRIL 2019 Guide to Cloud Fonts in Microsoft® Office 365® Cloud fonts are available to Office 365 subscribers on all platforms and devices. Documents that use cloud fonts will render correctly in Office 2019. Embed cloud fonts for use with older versions of Office. Reference article from Microsoft: Cloud fonts in Office DESIGN TO PRESENT Terberg Design, LLC Index MICROSOFT OFFICE CLOUD FONTS A B C D E Legend: Good choice for theme body fonts F G H I J Okay choice for theme body fonts Includes serif typefaces, K L M N O non-lining figures, and those missing italic and/or bold styles P R S T U Present with most older versions of Office, embedding not required V W Symbol fonts Language-specific fonts MICROSOFT OFFICE CLOUD FONTS Abadi NEW ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 01234567890 Abadi Extra Light ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 01234567890 Note: No italic or bold styles provided. Agency FB MICROSOFT OFFICE CLOUD FONTS ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 01234567890 Agency FB Bold ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 01234567890 Note: No italic style provided Algerian MICROSOFT OFFICE CLOUD FONTS ABCDEFGHIJKLMNOPQRSTUVWXYZ 01234567890 Note: Uppercase only. No other styles provided. Arial MICROSOFT OFFICE CLOUD FONTS ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 01234567890 Arial Italic ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 01234567890 Arial Bold ABCDEFGHIJKLMNOPQRSTUVWXYZ abcdefghijklmnopqrstuvwxyz 01234567890 Arial Bold Italic ABCDEFGHIJKLMNOPQRSTUVWXYZ
    [Show full text]
  • Typing Vietnamese on Window
    TYPING VIETNAMESE WITH UNICODE FONT It is recommended that you use a Vietnamese keyboard or keyboard driver for the task. Despite its name, it is not hardware: it is a small program that sits in your OS and converts your keystrokes into Vietnamese characters. Unikey. Its advantages are: It's free. It's just a download away: for NT/2000/XP, for 95/98/MEor for Linux. Installation is simple: just unzip it and it is ready to go. It sits on the taskbar. This makes it easy to switch between "English" mode and "Vietnamese" mode: just click on the icon on the taskbar. The user interface actually provides for English speakers, which makes it easier to understand. Setup 1. When you start up Unikey, you see the following dialogue: What does it all mean? Fortunately, you can find out what is happening by clicking on the "Mở rộng" button. "Mở rộng" means expand, and that's what you need to do. Uncheck the checkbox with "Vietnamese interface”. The whole interface will turn into English: I recommend you always set the "Character Set" to Unicode - always. A character set is basically how characters like "ư" and "a" are represented as numbers that computers can handle. The Microsoft Office programs are set to handle Unicode by default. Unicode is an international standard, so you can't go much wrong with it. The "Input method" is what keystrokes will form a character like "ư". I prefer TELEX, but I will give instructions for using Unikey with VNI as well. See the next section for instructions.
    [Show full text]
  • Download Free Chinese Fonts for Mac
    1 / 4 Download Free Chinese Fonts For Mac ttc and Songti ttc and include TC fonts Hiragino Sans GB ~ Beginning with OS X 10.. 01[?]KaiTi楷体GB18030simkai ttfv5 01[?]FangSong_GB2312仿宋_GB2312GB2312SIMFANG.. [NEED MORE DETAILS HERE] [DISCUSSION OF WEB FONTS AND CSS3]Arphic [文鼎]Taiwan.. If you want to use this font for both simplified and traditional Chinese, then use Font Book to deactivate BiauKai and activate DFKai-SB instead.. ttf file and select install MacOS X (10 3 or later)Double-click on the ttf file and select install.. West is an IRG participant as a member of the UK delegation, so he is well-informed and up-to-date on the progress of their work, and his fonts reflect that knowledge. In addition, the Microsoft Office XP Proofing Tools (and Chinese editions) include the font Simsun (Founder Extended) [SURSONG.. A long time vendor of Chinese OEM fonts, in 2006 Monotype's new owners [Monotype Imaging] also acquired China Type Design [中國字體設計] in Hong Kong.. For the character sets and weights for each, see the Fonts section for your OS: 10.. If you have downloaded a font that is saved in Free Chinese Fonts Free Chinese Font is all about Chinese fonts that are free to download! This site aims to help you download high quality Chinese fonts in.. FamilyFile nameCharsetOS 910 310 410 510 610 710 810 1010 11PingFang SC PingFang HK PingFang TCPingFang.. Font files had to be converted between Windows and Macintosh Regardless, all TrueType fonts contain 'cmap' tables that map its glyphs to various encodings. chinese fonts chinese fonts, chinese fonts generator, chinese fonts download, chinese fonts copy and paste, chinese fonts google docs, chinese fonts dafont, chinese fonts adobe, chinese fonts in microsoft word, chinese fonts word, chinese fonts calligraphy Arial Unicode MS ~ Beginning with OS X 10 5, Apple includes this basic Monotype Unicode font from Microsoft Office [Arial Unicode.
    [Show full text]
  • The Unicode Cookbook for Linguists: Managing Writing Systems Using Orthography Profiles
    Zurich Open Repository and Archive University of Zurich Main Library Strickhofstrasse 39 CH-8057 Zurich www.zora.uzh.ch Year: 2017 The Unicode Cookbook for Linguists: Managing writing systems using orthography profiles Moran, Steven ; Cysouw, Michael DOI: https://doi.org/10.5281/zenodo.290662 Posted at the Zurich Open Repository and Archive, University of Zurich ZORA URL: https://doi.org/10.5167/uzh-135400 Monograph The following work is licensed under a Creative Commons: Attribution 4.0 International (CC BY 4.0) License. Originally published at: Moran, Steven; Cysouw, Michael (2017). The Unicode Cookbook for Linguists: Managing writing systems using orthography profiles. CERN Data Centre: Zenodo. DOI: https://doi.org/10.5281/zenodo.290662 The Unicode Cookbook for Linguists Managing writing systems using orthography profiles Steven Moran & Michael Cysouw Change dedication in localmetadata.tex Preface This text is meant as a practical guide for linguists, and programmers, whowork with data in multilingual computational environments. We introduce the basic concepts needed to understand how writing systems and character encodings function, and how they work together. The intersection of the Unicode Standard and the International Phonetic Al- phabet is often not met without frustration by users. Nevertheless, thetwo standards have provided language researchers with a consistent computational architecture needed to process, publish and analyze data from many different languages. We bring to light common, but not always transparent, pitfalls that researchers face when working with Unicode and IPA. Our research uses quantitative methods to compare languages and uncover and clarify their phylogenetic relations. However, the majority of lexical data available from the world’s languages is in author- or document-specific orthogra- phies.
    [Show full text]
  • Assessment of Options for Handling Full Unicode Character Encodings in MARC21 a Study for the Library of Congress
    1 Assessment of Options for Handling Full Unicode Character Encodings in MARC21 A Study for the Library of Congress Part 1: New Scripts Jack Cain Senior Consultant Trylus Computing, Toronto 1 Purpose This assessment intends to study the issues and make recommendations on the possible expansion of the character set repertoire for bibliographic records in MARC21 format. 1.1 “Encoding Scheme” vs. “Repertoire” An encoding scheme contains codes by which characters are represented in computer memory. These codes are organized according to a certain methodology called an encoding scheme. The list of all characters so encoded is referred to as the “repertoire” of characters in the given encoding schemes. For example, ASCII is one encoding scheme, perhaps the one best known to the average non-technical person in North America. “A”, “B”, & “C” are three characters in the repertoire of this encoding scheme. These three characters are assigned encodings 41, 42 & 43 in ASCII (expressed here in hexadecimal). 1.2 MARC8 "MARC8" is the term commonly used to refer both to the encoding scheme and its repertoire as used in MARC records up to 1998. The ‘8’ refers to the fact that, unlike Unicode which is a multi-byte per character code set, the MARC8 encoding scheme is principally made up of multiple one byte tables in which each character is encoded using a single 8 bit byte. (It also includes the EACC set which actually uses fixed length 3 bytes per character.) (For details on MARC8 and its specifications see: http://www.loc.gov/marc/.) MARC8 was introduced around 1968 and was initially limited to essentially Latin script only.
    [Show full text]
  • Suitcase Fusion 8 Getting Started
    Copyright © 2014–2018 Celartem, Inc., doing business as Extensis. This document and the software described in it are copyrighted with all rights reserved. This document or the software described may not be copied, in whole or part, without the written consent of Extensis, except in the normal use of the software, or to make a backup copy of the software. This exception does not allow copies to be made for others. Licensed under U.S. patents issued and pending. Celartem, Extensis, LizardTech, MrSID, NetPublish, Portfolio, Portfolio Flow, Portfolio NetPublish, Portfolio Server, Suitcase Fusion, Type Server, TurboSync, TeamSync, and Universal Type Server are registered trademarks of Celartem, Inc. The Celartem logo, Extensis logos, LizardTech logos, Extensis Portfolio, Font Sense, Font Vault, FontLink, QuickComp, QuickFind, QuickMatch, QuickType, Suitcase, Suitcase Attaché, Universal Type, Universal Type Client, and Universal Type Core are trademarks of Celartem, Inc. Adobe, Acrobat, After Effects, Creative Cloud, Creative Suite, Illustrator, InCopy, InDesign, Photoshop, PostScript, Typekit and XMP are either registered trademarks or trademarks of Adobe Systems Incorporated in the United States and/or other countries. Apache Tika, Apache Tomcat and Tomcat are trademarks of the Apache Software Foundation. Apple, Bonjour, the Bonjour logo, Finder, iBooks, iPhone, Mac, the Mac logo, Mac OS, OS X, Safari, and TrueType are trademarks of Apple Inc., registered in the U.S. and other countries. macOS is a trademark of Apple Inc. App Store is a service mark of Apple Inc. IOS is a trademark or registered trademark of Cisco in the U.S. and other countries and is used under license. Elasticsearch is a trademark of Elasticsearch BV, registered in the U.S.
    [Show full text]
  • Old Cyrillic in Unicode*
    Old Cyrillic in Unicode* Ivan A Derzhanski Institute for Mathematics and Computer Science, Bulgarian Academy of Sciences [email protected] The current version of the Unicode Standard acknowledges the existence of a pre- modern version of the Cyrillic script, but its support thereof is limited to assigning code points to several obsolete letters. Meanwhile mediæval Cyrillic manuscripts and some early printed books feature a plethora of letter shapes, ligatures, diacritic and punctuation marks that want proper representation. (In addition, contemporary editions of mediæval texts employ a variety of annotation signs.) As generally with scripts that predate printing, an obvious problem is the abundance of functional, chronological, regional and decorative variant shapes, the precise details of whose distribution are often unknown. The present contents of the block will need to be interpreted with Old Cyrillic in mind, and decisions to be made as to which remaining characters should be implemented via Unicode’s mechanism of variation selection, as ligatures in the typeface, or as code points in the Private space or the standard Cyrillic block. I discuss the initial stage of this work. The Unicode Standard (Unicode 4.0.1) makes a controversial statement: The historical form of the Cyrillic alphabet is treated as a font style variation of modern Cyrillic because the historical forms are relatively close to the modern appearance, and because some of them are still in modern use in languages other than Russian (for example, U+0406 “I” CYRILLIC CAPITAL LETTER I is used in modern Ukrainian and Byelorussian). Some of the letters in this range were used in modern typefaces in Russian and Bulgarian.
    [Show full text]
  • CSS 2.1) Specification
    Cascading Style Sheets Level 2 Revision 1 (CSS 2.1) Specification W3C Editors Draft 26 February 2014 This version: http://www.w3.org/TR/2014/ED-CSS2-20140226 Latest version: http://www.w3.org/TR/CSS2 Previous versions: http://www.w3.org/TR/2011/REC-CSS2-20110607 http://www.w3.org/TR/2008/REC-CSS2-20080411/ Editors: Bert Bos <bert @w3.org> Tantek Çelik <tantek @cs.stanford.edu> Ian Hickson <ian @hixie.ch> Håkon Wium Lie <howcome @opera.com> Please refer to the errata for this document. This document is also available in these non-normative formats: plain text, gzip'ed tar file, zip file, gzip'ed PostScript, PDF. See also translations. Copyright © 2011 W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark and document use rules apply. Abstract This specification defines Cascading Style Sheets, level 2 revision 1 (CSS 2.1). CSS 2.1 is a style sheet language that allows authors and users to attach style (e.g., fonts and spacing) to structured documents (e.g., HTML documents and XML applications). By separating the presentation style of documents from the content of documents, CSS 2.1 simplifies Web authoring and site maintenance. CSS 2.1 builds on CSS2 [CSS2] which builds on CSS1 [CSS1]. It supports media- specific style sheets so that authors may tailor the presentation of their documents to visual browsers, aural devices, printers, braille devices, handheld devices, etc. It also supports content positioning, table layout, features for internationalization and some properties related to user interface. CSS 2.1 corrects a few errors in CSS2 (the most important being a new definition of the height/width of absolutely positioned elements, more influence for HTML's "style" attribute and a new calculation of the 'clip' property), and adds a few highly requested features which have already been widely implemented.
    [Show full text]
  • Complete Issue 40:3 As One
    TUGBOAT Volume 40, Number 3 / 2019 General Delivery 211 From the president / Boris Veytsman 212 Editorial comments / Barbara Beeton TEX Users Group 2019 sponsors; Kerning between lowercase+uppercase; Differential “d”; Bibliographic archives in BibTEX form 213 Ukraine at BachoTEX 2019: Thoughts and impressions / Yevhen Strakhov Publishing 215 An experience of trying to submit a paper in LATEX in an XML-first world / David Walden 217 Studying the histories of computerizing publishing and desktop publishing, 2017–19 / David Walden Resources 229 TEX services at texlive.info / Norbert Preining 231 Providing Docker images for TEX Live and ConTEXt / Island of TEX 232 TEX on the Raspberry Pi / Hans Hagen Software & Tools 234 MuPDF tools / Taco Hoekwater 236 LATEX on the road / Piet van Oostrum Graphics 247 A Brazilian Portuguese work on MetaPost, and how mathematics is embedded in it / Estev˜aoVin´ıcius Candia LATEX 251 LATEX news, issue 30, October 2019 / LATEX Project Team Methods 255 Understanding scientific documents with synthetic analysis on mathematical expressions and natural language / Takuto Asakura Fonts 257 Modern Type 3 fonts / Hans Hagen Multilingual 263 Typesetting the Bangla script in Unicode TEX engines—experiences and insights Document Processing / Md Qutub Uddin Sajib Typography 270 Typographers’ Inn / Peter Flynn Book Reviews 272 Book review: Hermann Zapf and the World He Designed: A Biography by Jerry Kelly / Barbara Beeton 274 Book review: Carol Twombly: Her brief but brilliant career in type design by Nancy Stock-Allen / Karl
    [Show full text]
  • The Fontspec Package Font Selection for XƎLATEX and Lualatex
    The fontspec package Font selection for XƎLATEX and LuaLATEX Will Robertson and Khaled Hosny [email protected] 2013/05/12 v2.3b Contents 7.5 Different features for dif- ferent font sizes . 14 1 History 3 8 Font independent options 15 2 Introduction 3 8.1 Colour . 15 2.1 About this manual . 3 8.2 Scale . 16 2.2 Acknowledgements . 3 8.3 Interword space . 17 8.4 Post-punctuation space . 17 3 Package loading and options 4 8.5 The hyphenation character 18 3.1 Maths fonts adjustments . 4 8.6 Optical font sizes . 18 3.2 Configuration . 5 3.3 Warnings .......... 5 II OpenType 19 I General font selection 5 9 Introduction 19 9.1 How to select font features 19 4 Font selection 5 4.1 By font name . 5 10 Complete listing of OpenType 4.2 By file name . 6 font features 20 10.1 Ligatures . 20 5 Default font families 7 10.2 Letters . 20 6 New commands to select font 10.3 Numbers . 21 families 7 10.4 Contextuals . 22 6.1 More control over font 10.5 Vertical Position . 22 shape selection . 8 10.6 Fractions . 24 6.2 Math(s) fonts . 10 10.7 Stylistic Set variations . 25 6.3 Miscellaneous font select- 10.8 Character Variants . 25 ing details . 11 10.9 Alternates . 25 10.10 Style . 27 7 Selecting font features 11 10.11 Diacritics . 29 7.1 Default settings . 11 10.12 Kerning . 29 7.2 Changing the currently se- 10.13 Font transformations . 30 lected features .
    [Show full text]
  • Iso/Iec Jtc1 Sc2 Wg2 N3984
    ISO/IEC JTC1 SC2 WG2 N3984 Notes on the naming of some characters proposed in the FCD of ISO/IEC 10646:2012 (3rd ed.) (JTC1/SC2/WG2 N3945, JTC1/SC2 N4168) Karl Pentzlin — 2011-02-02 This paper addresses the naming of the following characters proposed for inclusion: U+2E33 RAISED DOT U+2E34 RAISED COMMA (both proposed in JTC1/SC2/WG2 N3912) U+A78F LATIN LETTER GLOTTAL DOT (proposed in JTC1/SC2/WG2 N3567 as LATIN LETTER MIDDLE DOT) shown by the following excerpts from N3945: 1. Dots, Points, and "middle" and "raised" characters encoded in Unicode 6.0 (without script-specific ones and fullwidth/small forms; of similar characters within a block, a single character is selected as representative): Dots and Points: U+002E FULL STOP U+00B7 MIDDLE DOT U+02D9 DOT ABOVE U+0387 GREEK ANO TELEIA U+2024 ONE DOT LEADER U+2027 HYPHENATION POINT U+22C5 DOT OPERATOR U+2E31 WORD SEPARATOR MIDDLE DOT U+02D9 has the property Sk (Symbol, modifier), U+22C5 has Sm (Symbol, math), while all others have Po (Punctuation, other). U+A78F is proposed to have Lo (Letter, other). "Middle" and "raised" characters: U+02F4 MODIFIER LETTER MIDDLE GRAVE ACCENT U+2E0C LEFT RAISED OMISSION BRACKET (representing several similar characters in the same block) U+02F8 MODIFIER LETTER RAISED COLON U+A71B MODIFIER LETTER RAISED UP ARROW The following table will show these characters using some different widespread fonts. Font 002E 00B7 02D9 0387 2024 2027 22C5 2E31 02F4 2E0C 02F8 A71B Andron Mega Corpus x.E x·E x˙E α·Ε x․E x‧E x⋅E --- x˴E x⸌E x˸E --- Arial x.E x·E x˙E α·Ε x․E x‧E
    [Show full text]
  • Translate's Localization Guide
    Translate’s Localization Guide Release 0.9.0 Translate Jun 26, 2020 Contents 1 Localisation Guide 1 2 Glossary 191 3 Language Information 195 i ii CHAPTER 1 Localisation Guide The general aim of this document is not to replace other well written works but to draw them together. So for instance the section on projects contains information that should help you get started and point you to the documents that are often hard to find. The section of translation should provide a general enough overview of common mistakes and pitfalls. We have found the localisation community very fragmented and hope that through this document we can bring people together and unify information that is out there but in many many different places. The one section that we feel is unique is the guide to developers – they make assumptions about localisation without fully understanding the implications, we complain but honestly there is not one place that can help give a developer and overview of what is needed from them, we hope that the developer section goes a long way to solving that issue. 1.1 Purpose The purpose of this document is to provide one reference for localisers. You will find lots of information on localising and packaging on the web but not a single resource that can guide you. Most of the information is also domain specific ie it addresses KDE, Mozilla, etc. We hope that this is more general. This document also goes beyond the technical aspects of localisation which seems to be the domain of other lo- calisation documents.
    [Show full text]