Automated Malware Analysis Report for Set-Up.Exe

ID: 355727
Sample Name: Set-up.exe
Cookbook: default.jbs
Time: 13:38:24
Date: 21/02/2021
Version: 31.0.0 Emerald

Table of Contents

Analysis Report Set-up.exe
Overview
General Information
Detection
Signatures
Classification
Startup
Malware Configuration
Yara Overview
Sigma Overview
Signature Overview
Compliance:
Mitre Att&ck Matrix
Behavior Graph
Screenshots
Thumbnails
Antivirus, Machine Learning and Genetic Malware Detection
Initial Sample
Dropped Files
Unpacked PE Files
Domains
URLs
Domains and IPs
Contacted Domains
URLs from Memory and Binaries
Contacted IPs
General Information
Simulations
Behavior and APIs
Joe Sandbox View / Context
IPs
Domains
ASN
JA3 Fingerprints
Dropped Files
Created / dropped Files
Static File Info
General
File Icon
Static PE Info
General
Authenticode Signature
Entrypoint Preview
Rich Headers
Data Directories
Sections
Resources
Imports
Version Infos
Possible Origin
Network Behavior
Code Manipulations
Statistics
System Behavior
Analysis Process: Set-up.exe PID: 3976 Parent PID: 5896

Copyright null 2021 Page 2 of 21

General
File Activities
File Created
File Written
File Read
Registry Activities
Key Value Created
Disassembly
Code Analysis

Analysis Report Set-up.exe

Overview

General Information

Sample Name: Set-up.exe
Analysis ID: 355727
MD5: de70f0deed893bb…
SHA1: f351b0c2996a357…
SHA256: b9a187b59c758e…

Detection

Score: 7
Range: 0 - 100
Whitelisted: false
Confidence: 80%
Scale: clean, suspicious, malicious

Signatures

• Contains functionality to check if a d…
• Contains functionality to detect virtu…
• Contains functionality to dynamically…
• Contains functionality to query locale…
• Contains functionality to read the PEB
• Contains functionality which may be…
• Detected potential crypto function
• Found large amount of non-executed…
• Found potential string decryption / a…
• PE / OLE file has an invalid certificate
• PE file contains strange resources
• Sample file is different than original …
• Uses 32bit PE files
• Uses Microsoft's Enhanced Cryptog…
• Uses code obfuscation techniques (…

Classification

Categories shown in the classification chart: Ransomware, Miner, Spreading, Evader, Phishing, Exploiter, Banker, Spyware, Trojan / Bot, Adware

Startup

System is w10x64
Set-up.exe (PID: 3976 cmdline: 'C:\Users\user\Desktop\Set-up.exe' MD5: DE70F0DEED893BBA56CCB78EAFD59606)
cleanup

Malware Configuration

No configs have been found

Yara Overview

No yara matches

Sigma Overview

No Sigma rule has matched

Signature Overview

• Cryptography
• Bitcoin Miner
• Compliance
• Spreading
• Networking
• System Summary
• Data Obfuscation
• Persistence and Installation Behavior
• Malware Analysis System Evasion
• Anti Debugging
• Language, Device and Operating System Detection
• Lowering of HIPS / PFW / Operating System Security Settings

There are no malicious signatures.

Compliance:

• Uses 32bit PE files
• Creates install or setup log file
• Contains modern PE file flags such as dynamic base (ASLR) or NX
• Binary contains paths to debug symbols

Mitre Att&ck Matrix

Initial Access: Valid Accounts; Default Accounts; Domain Accounts; Local Accounts; Cloud Accounts
Execution: Command and Scripting Interpreter 2; Scripting 1; Native API 1; At (Windows); Cron
Persistence: Path Interception; Boot or Logon Initialization Scripts; Logon Script (Windows); Logon Script (Mac); Network Logon Script
Privilege Escalation: Path Interception; Boot or Logon Initialization Scripts; Logon Script (Windows); Logon Script (Mac); Network Logon Script
Defense Evasion: Virtualization/Sandbox Evasion 1; Modify Registry 1; Deobfuscate/Decode Files or Information 1; Scripting 1; Obfuscated Files or Information 2
Credential Access: OS Credential Dumping; LSASS Memory; Security Account Manager; NTDS; LSA Secrets
Discovery: System Time Discovery 1; Security Software Discovery 21; Virtualization/Sandbox Evasion 1; File and Directory Discovery 1; System Information Discovery 15
Lateral Movement: Remote Services; Remote Desktop Protocol; SMB/Windows Admin Shares; Distributed Component Object Model; SSH
Collection: Archive Collected Data 1; Data from Removable Media; Data from Network Shared Drive; Input Capture; Keylogging
Exfiltration: Exfiltration Over Other Network Medium; Exfiltration Over Bluetooth; Automated Exfiltration; Scheduled Transfer; Data Transfer Size Limits
Command and Control: Encrypted Channel 2; Junk Data; Steganography; Protocol Impersonation; Fallback Channels
Network Effects: Eavesdrop on Insecure Network Communication; Exploit SS7 to Redirect Phone Calls/SMS; Exploit SS7 to Track Device Location; SIM Card Swap; Manipulate Device Communication
Remote Service Effects: Remotely Track Device Without Authorization; Remotely Wipe Data Without Authorization; Obtain Device Cloud Backups

Behavior Graph

[Figure: behavior graph. ID: 355727; Sample: Set-up.exe; Startdate: 21/02/2021; Architecture: WINDOWS; Score: 7. Process started: Set-up.exe.]

Screenshots

Thumbnails

This section contains all screenshots as thumbnails, including those not shown in the slideshow.

Antivirus, Machine Learning and Genetic Malware Detection

Initial Sample

Source | Detection | Scanner | Label | Link
Set-up.exe | 1% | Virustotal | | Browse
Set-up.exe | 0% | Metadefender | | Browse
Set-up.exe | 0% | ReversingLabs | |

Dropped Files

No Antivirus matches

Unpacked PE Files

No Antivirus matches

Domains

No Antivirus matches

URLs

Source | Detection | Scanner | Label | Link
cacerts.dig | 0% | Avira URL Cloud | safe |
ocsp.dig | 0% | Avira URL Cloud | safe |
https://127.0.0.1 | 0% | Virustotal | | Browse
https://127.0.0.1 | 0% | Avira URL Cloud | safe |
ocsp.digicert.c | 0% | Avira URL Cloud | safe |
https://127.0.0.1https://127.0.0.1https://127.0.0.1https://127.0.0.1https://127.0.0.1https://127.0.0 | 0% | Avira URL Cloud | safe |
https://tron-qe-user-packages.s3.amazonaws.comhttps://tron-qe-user-packages.s3-accelerate.amazonaws. | 0% | Avira URL Cloud | safe |

Domains and IPs

Contacted Domains

No contacted domains info

URLs from Memory and Binaries

Name | Source | Malicious | Antivirus Detection | Reputation
typekit.com/eulas/000000000000000000014f4e | Set-up.exe | false | | high
typekit.com/eulas/000000000000000000014f4d | Set-up.exe | false | | high
cacerts.dig | Set-up.exe, 00000000.00000002.649160410.0000000001076000.00000004.00000040.sdmp | false | Avira URL Cloud: safe | unknown
ocsp.dig | Set-up.exe, 00000000.00000002.649160410.0000000001076000.00000004.00000040.sdmp | false | Avira URL Cloud: safe | unknown
https://127.0.0.1 | Set-up.exe | false | 0%, Virustotal, Browse; Avira URL Cloud: safe | unknown
typekit.com/eulas/000000000000000000014825 | Set-up.exe | false | | high
typekit.com/eulas/000000000000000000014824 | Set-up.exe | false | | high
typekit.com/eulas/000000000000000000014823 | Set-up.exe | false | | high
typekit.com/eulas/000000000000000000014822 | Set-up.exe | false | | high
typekit.com/eulas/000000000000000000014f4f | Set-up.exe | false | | high
ocsp.digicert.c | Set-up.exe, 00000000.00000002.649401359.0000000001114000.00000004.00000001.sdmp | false | Avira URL Cloud: safe | unknown
www.winimage.com/zLibDll | Set-up.exe | false | | high
typekit.com/eulas/000000000000000000014f52 | Set-up.exe | false | | high
typekit.com/eulas/000000000000000000014f51 | Set-up.exe | false | | high
typekit.com/eulas/000000000000000000014f50 | Set-up.exe | false | | high
https://127.0.0.1https://127.0.0.1https://127.0.0.1https://127.0.0.1https://127.0.0.1https://127.0.0 | Set-up.exe | false | Avira URL Cloud: safe | low
https://tron-qe-user-packages.s3.amazonaws.comhttps://tron-qe-user-packages.s3-accelerate.amazonaws. | Set-up.exe | false | Avira URL Cloud: safe | unknown

Contacted IPs

No contacted IP infos

General Information

Joe Sandbox Version: 31.0.0 Emerald
Analysis ID: 355727
Start date: 21.02.2021
Start time: 13:38:24
Joe Sandbox Product: CloudBasic
Overall analysis duration: 0h 3m 28s
Hypervisor based Inspection enabled: false
Report type: light
Sample file name: Set-up.exe
Cookbook file name: default.jbs
Analysis system description: Windows 10 64 bit v1803 with Office Professional Plus 2016, Chrome 85, IE 11, Adobe Reader