Electronic Document Preparation Pocket Primer
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Doctor's Thesis Studies on Multilingual Information Processing
NAIST-IS-DT9761021 Doctor’s Thesis Studies on Multilingual Information Processing on the Internet Akira Maeda September 18, 2000 Department of Information Systems Graduate School of Information Science Nara Institute of Science and Technology Doctor’s Thesis submitted to Graduate School of Information Science, Nara Institute of Science and Technology in partial fulfillment of the requirements for the degree of DOCTOR of ENGINEERING Akira Maeda Thesis committee: Shunsuke Uemura, Professor Yuji Matsumoto, Professor Minoru Ito, Professor Masatoshi Yoshikawa, Associate Professor Studies on Multilingual Information Processing on the Internet ∗ Akira Maeda Abstract With the increasing popularity of the Internet in various part of the world, the languages used for Web documents are expanded from English to various languages. However, there are many unsolved problems in order to realize an information system which can handle such multilingual documents in a unified manner. From the user’s point of view, three most fundamental text processing functions for the general use of the World Wide Web are display, input, and retrieval of the text. However, for languages such as Japanese, Chinese, and Korean, character fonts and input methods that are necessary for displaying and inputting texts, are not always installed on the client side. From the system’s point of view, one of the most troublesome problems is that, many Web documents do not have meta information of the character coding system and the language used for the document itself, although character coding systems used for Web documents vary according to the language. It may result in troubles such as incorrect display on Web browsers, and inaccurate indexing on Web search engines. -
Unicode Ate My Brain
UNICODE ATE MY BRAIN John Cowan Reuters Health Information Copyright 2001-04 John Cowan under GNU GPL 1 Copyright • Copyright © 2001 John Cowan • Licensed under the GNU General Public License • ABSOLUTELY NO WARRANTIES; USE AT YOUR OWN RISK • Portions written by Tim Bray; used by permission • Title devised by Smarasderagd; used by permission • Black and white for readability Copyright 2001-04 John Cowan under GNU GPL 2 Abstract Unicode, the universal character set, is one of the foundation technologies of XML. However, it is not as widely understood as it should be, because of the unavoidable complexity of handling all of the world's writing systems, even in a fairly uniform way. This tutorial will provide the basics about using Unicode and XML to save lots of money and achieve world domination at the same time. Copyright 2001-04 John Cowan under GNU GPL 3 Roadmap • Brief introduction (4 slides) • Before Unicode (16 slides) • The Unicode Standard (25 slides) • Encodings (11 slides) • XML (10 slides) • The Programmer's View (27 slides) • Points to Remember (1 slide) Copyright 2001-04 John Cowan under GNU GPL 4 How Many Different Characters? a A à á â ã ä å ā ă ą a a a a a a a a a a a Copyright 2001-04 John Cowan under GNU GPL 5 How Computers Do Text • Characters in computer storage are represented by “small” numbers • The numbers use a small number of bits: from 6 (BCD) to 21 (Unicode) to 32 (wchar_t on some Unix boxes) • Design choices: – Which numbers encode which characters – How to pack the numbers into bytes Copyright 2001-04 John Cowan under GNU GPL 6 Where Does XML Come In? • XML is a textual data format • XML software is required to handle all commercially important characters in the world; a promise to “handle XML” implies a promise to be international • Applications can do what they want; monolingual applications can mostly ignore internationalization Copyright 2001-04 John Cowan under GNU GPL 7 $$$ £££ ¥¥¥ • Extra cost of building-in internationalization to a new computer application: about 20% (assuming XML and Unicode). -
Markdown Markup Languages What Is Markdown? Symbol
Markdown What is Markdown? ● Markdown is a lightweight markup language with plain text formatting syntax. Péter Jeszenszky – See: https://en.wikipedia.org/wiki/Markdown Faculty of Informatics, University of Debrecen [email protected] Last modified: October 4, 2019 3 Markup Languages Symbol ● Markup languages are computer languages for annotating ● Dustin Curtis. The Markdown Mark. text. https://dcurt.is/the-markdown-mark – They allow the association of metadata with parts of text in a https://github.com/dcurtis/markdown-mark clearly distinguishable way. ● Examples: – TeX, LaTeX https://www.latex-project.org/ – Markdown https://daringfireball.net/projects/markdown/ – troff (man pages) https://www.gnu.org/software/groff/ – XML https://www.w3.org/XML/ – Wikitext https://en.wikipedia.org/wiki/Help:Wikitext 2 4 Characteristics Usage (2) ● An easy-to-read and easy-to-write plain text ● Collaboration platforms and tools: format that. – GitHub https://github.com/ ● Can be converted to various output formats ● See: Writing on GitHub (e.g., HTML). https://help.github.com/en/categories/writing-on-github – Trello https://trello.com/ ● Specifically targeted at non-technical users. ● See: How To Format Your Text in Trello ● The syntax is mostly inspired by the format of https://help.trello.com/article/821-using-markdown-in-trell o plain text email. 5 7 Usage (1) Usage (3) ● Markdown is widely used on the web for ● Blogging platforms and content management entering text. systems: – ● The main application areas include: Ghost https://ghost.org/ -
Character Set Migration Best Practices For
Character Set Migration Best Practices $Q2UDFOH:KLWH3DSHU October 2002 Server Globalization Technology Oracle Corporation Introduction - Database Character Set Migration Migrating from one database character set to another requires proper strategy and tools. This paper outlines the best practices for database character set migration that has been utilized on behalf of hundreds of customers successfully. Following these methods will help determine what strategies are best suited for your environment and will help minimize risk and downtime. This paper also highlights migration to Unicode. Many customers today are finding Unicode to be essential to supporting their global businesses. Oracle provides consulting services for very large or complex environments to help minimize the downtime while maximizing the safe migration of business critical data. Why migrate? Database character set migration often occurs from a requirement to support new languages. As companies internationalize their operations and expand services to customers all around the world, they find the need to support data storage of more World languages than are available within their existing database character set. Historically, many legacy systems required support for only one or possibly a few languages; therefore, the original character set chosen had a limited repertoire of characters that could be supported. For example, in America a 7-bit character set called ASCII is satisfactory for supporting English data exclusively. While in Europe a variety of 8 bit European character sets can support specific subsets of European languages together with English. In Asia, multi byte character sets that could support a given Asian language and English were chosen. These were reasonable choices that fulfilled the initial requirements and provided the best combination of economy and performance. -
Using NROFF and TROFF
Using NROFF and TROFF Part Number: 800-1755-10 Revision A, of 9 May 1988 UNIX is a registered trademark of AT&T. SunOS is a trademark of Sun Microsystems, Inc. Sun Workstation is a registered trademark of Sun Microsystems, Inc. Material in this manual comes from a number of sources: NrofflTroff User's Manual, Joseph F. Ossanna, Bell Laboratories, Murray Hill, New Jersey; A Troff Tutorial, Brian W. Kernighan, Bell Laboratories, Murray Hill, New Jersey; Typ ing Documents on the UNIXSystem: Using the -ms Macros with Troff and Nroff, M. E. Lesk, Bell Laboratories, Murray Hill, New Jersey; A Guide to Preparing Documents with -ms, M. E. Lesk, Bell Laboratories, Murray Hill, New Jersey; Document Formatting on UNIXUsing the -ms Macros, Joel Kies, University of California, Berkeley, California; Writing Papers with Nroff Using -me, Eric P. Allman, University of California, Berkeley; and Introducing the UNIXSystem, Henry McGilton, Rachel Morgan, McGraw-Hill Book Company, 1983. These materials are gratefully acknowledged. Copyright © 1987, 1988 by Sun Microsystems, Inc. This publication is protected by Federal Copyright Law, with all rights reserved. No part of this publication may be reproduced, stored in a retrieval system, translated, transcribed, or transmitted, in any form, or by any means manual, electric, electronic, electro-magnetic, mechanical, chemical, optical, or other wise, without prior explicit written permission from Sun Microsystems. Contents Chapter 1 Introduction . 1.1. nrof f andtrof f . Text Formatting Versus Word Processing TheEvolutionof nr of f andt ro f f Preprocessors and Postprocessors 1.2. tr of f, Typesetters, and Special-Purpose Formatters ............ 1.3. -
Cyinpenc.Pdf
1 The Cyrillic codepages There are several widely used Cyrillic codepages. Currently, we define here the following codepages: • cp 866 is the standard MS-DOS Russian codepage. There are also several codepages in use, which are very similar to cp 866. These are: so-called \Cyrillic Alternative codepage" (or Alternative Variant of cp 866), Modified Alternative Variant, New Alternative Variant, and experimental Tatarian codepage. The differences take place in the range 0xf2{0xfe. All these `Alternative' codepages are also supported. • cp 855 is the standard MS-DOS Cyrillic codepage. • cp 1251 is the standard MS Windows Cyrillic codepage. • pt 154 is a Windows Cyrillic Asian codepage developed in ParaType. It is a variant of Windows Cyrillic codepage. • koi8-r is a standard codepage widely used in UNIX-like systems for Russian language support. It is specified in RFC 1489. The situation with koi8-r is somewhat similar to the one with cp 866: there are also several similar codepages in use, which coincide with koi8-r for all Russian letters, but add some other Cyrillic letters. These codepages include: koi8-u (it is a variant of the koi8-r codepage with some Ukrainian letters added), koi8-ru (it is described in a draft RFC document specifying the widely used character set for mail and news exchange in the Ukrainian internet community as well as for presenting WWW information resources in the Ukrainian language), and ISO-IR-111 ECMA Cyrillic Code Page. All these codepages are supported also. • ISO 8859-5 Cyrillic codepage (also called ISO-IR-144). • Apple Macintosh Cyrillic (Microsoft cp 10007) codepage. -
Alphabetization† †† Wendy Korwin*, Haakon Lund** *119 W
Knowl. Org. 46(2019)No.3 209 W. Korwin and H. Lund. Alphabetization Alphabetization† †† Wendy Korwin*, Haakon Lund** *119 W. Dunedin Rd., Columbus, OH 43214, USA, <[email protected]> **University of Copenhagen, Department of Information Studies, DK-2300 Copenhagen S Denmark, <[email protected]> Wendy Korwin received her PhD in American studies from the College of William and Mary in 2017 with a dissertation entitled Material Literacy: Alphabets, Bodies, and Consumer Culture. She has worked as both a librarian and an archivist, and is currently based in Columbus, Ohio, United States. Haakon Lund is Associate Professor at the University of Copenhagen, Department of Information Studies in Denmark. He is educated as a librarian (MLSc) from the Royal School of Library and Information Science, and his research includes research data management, system usability and users, and gaze interaction. He has pre- sented his research at international conferences and published several journal articles. Korwin, Wendy and Haakon Lund. 2019. “Alphabetization.” Knowledge Organization 46(3): 209-222. 62 references. DOI:10.5771/0943-7444-2019-3-209. Abstract: The article provides definitions of alphabetization and related concepts and traces its historical devel- opment and challenges, covering analog as well as digital media. It introduces basic principles as well as standards, norms, and guidelines. The function of alphabetization is considered and related to alternatives such as system- atic arrangement or classification. Received: 18 February 2019; Revised: 15 March 2019; Accepted: 21 March 2019 Keywords: order, orders, lettering, alphabetization, arrangement † Derived from the article of similar title in the ISKO Encyclopedia of Knowledge Organization Version 1.0; published 2019-01-10. -
Deployment of XML for Office Documents in Organizations
JYVÄSKYLÄ LICENTIATE THESES IN COMPUTING 16 Eliisa Jauhiainen DeployPent of XML for OfÀFe DoFXPents in Organizations JYVÄSKYLÄ LICENTIATE THESES IN COMPUTING 16 Eliisa Jauhiainen Deployment of XML for Office Documents in Organizations UNIVERSITY OF JYVÄSKYLÄ JYVÄSKYLÄ 2014 Deployment of XML for Office Documents in Organizations JYVÄSKYLÄ LICENTIATE THESES IN COMPUTING 16 Eliisa Jauhiainen Deployment of XML for Office Documents in Organizations UNIVERSITY OF JYVÄSKYLÄ JYVÄSKYLÄ 2014 Editor Mauri Leppänen Department of Computer Science and Information Systems, University of Jyväskylä URN:ISBN:978-951-39-5600-4 ISBN 978-951-39-5600-4 (PDF) ISBN 978-951-39-5599-1 (nid.) ISSN 1795-9713 Copyright © 2014, by University of Jyväskylä Jyväskylä University Printing House, Jyväskylä 2014 ABSTRACT Jauhiainen, Eliisa Deployment of XML for office documents in organizations Jyväskylä: University of Jyväskylä, 201, 63 p. (+ four included articles) (-\YlVN\Ol/LFHQWLDWH7KHVHVLQ&RPSXWLQJ ISSN) ,6%1 (nid.), 978-951-39-5600-4 (PDF) Licentiate Thesis Majority of the content in organizations is stored as documents. Structured documents, like XML documents, allow the structure definitions, document instances, and layout specifications to be handled as separate entities. This is an important feature to realize from a document management point of view. A class of similar documents with the same structure constitutes a document type. The documents are built from components that are logical units of information within the context of the document type. Office documents are typically authored using word-processing software, they are relatively short in length, and intended for human consumption. The development of open office standards brought XML to organizations’ office en- vironments and changed the capabilities of using document content in ways that were previously impossible or difficult. -
Looking to the Future by JOHN BALDWIN
1 of 3 Looking to the Future BY JOHN BALDWIN FreeBSD’s 13.0 release delivers new features to users and refines the workflow for new contri- butions. FreeBSD contributors have been busy fixing bugs and adding new features since 12.0’s release in December of 2018. In addition, FreeBSD developers have refined their vision to focus on FreeBSD’s future users. An abbreviated list of some of the changes in 13.0 is given below. A more detailed list can be found in the release notes. Shifting Tools Not all of the changes in the FreeBSD Project over the last two years have taken the form of patches. Some of the largest changes have been made in the tools used to contribute to FreeBSD. The first major change is that FreeBSD has switched from Subversion to Git for storing source code, documentation, and ports. Git is widely used in the software industry and is more familiar to new contribu- tors than Subversion. Git’s distributed nature also more easily facilitates contributions from individuals who are Not all of the changes in the not committers. FreeBSD had been providing Git mir- rors of the Subversion repositories for several years, and FreeBSD Project over the last many developers had used Git to manage in-progress patches. The Git mirrors have now become the offi- two years have taken the form cial repositories and changes are now pushed directly of patches. to Git instead of Subversion. FreeBSD 13.0 is the first release whose sources are only available via Git rather than Subversion. -
Electronic Document Distribution Why I Can't Read Microsoft Word
Electronic document distribution or Why I can’t read Microsoft Word attachments By Tim Bell (HOD, Computer Science) 23 July 1999 We are entering an exciting era when documents can conveniently be stored and exchanged electronically, with the promise of increased efficiency and reduced costs. However, if done incorrectly, electronic documents can cause frustration and inconvenience. In the last few months a number of people have taken the bold step into the electronic world only to receive a negative response from people including myself. This document presents arguments for why some methods of document exchange are appropriate and efficient, and others are not. I am not arguing that Word should not be used; in fact this document was prepared in Microsoft Word by choice. I am discussing the method of distribution. This document can be read on any system that has normal access to the World Wide Web. It was prompted when today I received two emails that contained nothing but a Word attachment. The attachments took me about 5 minutes to decode, and turned out to be identical. Furthermore, the information in the documents could easily have been sent as a plain text email. What’s wrong with Microsoft Word attachments? Microsoft Word is fine; it is its use in disseminating documents that is the problem. A number of university documents have been distributed using Microsoft Word, usually as attachments to email. Word's "Send To" command is beguiling (as one user described it), as it instantly emails the document to many people with very little effort. For many of the recipients, only one click is required to view the document instantly. -
Javapos Driver Outline
PT330/PT331 POSPrinter, CashDrawer Application Programmer's Guide of Java for Retail POS Driver for Serial/ USB Interface Table of Contents Preface........................................................................................................................................... 1 1. Outline ................................................................................................................................4 1.1. Subject Scope of this document........................................................................................4 1.2. JavaPOS Driver Outline....................................................................................................5 1.3. Restrictions .......................................................................................................................7 1.4. Connection Way to POS Printer........................................................................................9 1.5. About install....................................................................................................................11 1.6. Setting Program Usage ...................................................................................................12 2. Using JavaPOS Driver ...................................................................................................... 16 2.1. Common .........................................................................................................................16 2.2. POS Printer .....................................................................................................................16 -
Windows NLS Considerations Version 2.1
Windows NLS Considerations version 2.1 Radoslav Rusinov [email protected] Windows NLS Considerations Contents 1. Introduction ............................................................................................................................................... 3 1.1. Windows and Code Pages .................................................................................................................... 3 1.2. CharacterSet ........................................................................................................................................ 3 1.3. Encoding Scheme ................................................................................................................................ 3 1.4. Fonts ................................................................................................................................................... 4 1.5. So Why Are There Different Charactersets? ........................................................................................ 4 1.6. What are the Difference Between 7 bit, 8 bit and Unicode Charactersets? ........................................... 4 2. NLS_LANG .............................................................................................................................................. 4 2.1. Setting the Character Set in NLS_LANG ............................................................................................ 4 2.2. Where is the Character Conversion Done? .........................................................................................