Review for Quiz 1


AP Computer Science Principles Scoring Guide

1. ASCII is a character-encoding scheme that uses 7 bits to represent each character. The decimal (base 10) values 65 through 90 represent the capital letters A through Z, as shown in the table below. What ASCII character is represented by the binary (base 2) number 1001010?

   (A) H
   (B) I
   (C) J
   (D) K

2. What is the best explanation for why digital data is represented in computers in binary?

   (A) The binary number system is the only system flexible enough to allow for representing data other than numbers.
   (B) It typically takes fewer digits to represent a number in binary when compared to other number systems (for example, the decimal number system).
   (C) It's impossible to build a computing machine that uses anything but binary to represent numbers.
   (D) It's easier, cheaper, and more reliable to build machines and devices that only have to distinguish between binary states.

3. Which best describes a bit?

   (A) A binary number, such as 110
   (B) A rule governing the exchange or transmission of data between devices
   (C) A single 0 or a 1
   (D) Time it takes for a number to travel from its sender to its receiver

4. A student is recording a song on her computer. When the recording is finished, she saves a copy on her computer. The student notices that the saved copy is of lower sound quality than the original recording. Which of the following could be a possible explanation for the difference in sound quality?

   (A) The song was saved using fewer bits per second than the original song.
   (B) The song was saved using more bits per second than the original song.
   (C) The song was saved using a lossless compression technique.
   (D) Some information is lost every time a file is saved from one location on a computer to another location.

5. What is the minimum number of bits you would need to encode 51 different colors?

   (A) 5 bits
   (B) 6 bits
   (C) 7 bits
   (D) 8 bits

6. A video-streaming Web site uses 32-bit integers to count the number of times each video has been played. In anticipation of some videos being played more times than can be represented with 32 bits, the Web site is planning to change to 64-bit integers for the counter. Which of the following best describes the result of using 64-bit integers instead of 32-bit integers?

   (A) 2 times as many values can be represented.
   (B) 32 times as many values can be represented.
   (C) 2^32 times as many values can be represented.
   (D) 32^2 times as many values can be represented.

7. Which is the best definition for bandwidth?

   (A) A set of rules governing the exchange or transmission of data between devices.
   (B) Time it takes for a bit to travel from its sender to its receiver.
   (C) The number of bits that are conveyed or processed per unit of time, e.g., 8 bits/sec.
   (D) Transmission capacity measured by bit rate.

8. An online store uses 6-bit binary sequences to identify each unique item for sale. The store plans to increase the number of items it sells and is considering using 7-bit binary sequences. Which of the following best describes the result of using 7-bit sequences instead of 6-bit sequences?

   (A) 2 more items can be uniquely identified.
   (B) 10 more items can be uniquely identified.
   (C) 2 times as many items can be uniquely identified.
   (D) 10 times as many items can be uniquely identified.

9. Which is the best definition for latency?

   (A) A set of rules governing the exchange or transmission of data between devices.
   (B) Time it takes for a bit to travel from its sender to its receiver.
   (C) The number of bits that are conveyed or processed per unit of time, e.g., 8 bits/sec.
   (D) Transmission capacity measured by bit rate.

10. Which of the following is NOT an example of a binary question?

   (A) When is your birthday?
   (B) Which do you prefer: Coke or Pepsi?
   (C) Do you want to go to the store today?
   (D) Do you like math or history better?

11. What is the 5-bit binary number for the decimal number twenty-six (26)?

   (A) 10010
   (B) 11010
   (C) 10110
   (D) 10101

12. Which word best fits the following definition: a set of rules governing the exchange or transmission of data between devices?

   (A) Bandwidth
   (B) Bit rate
   (C) Latency
   (D) Protocol

13. Number systems with different bases such as binary (base-2) and decimal (base-10) are all used to view and represent digital data. Which of the following is NOT true about representing digital data?

   (A) Some large numbers cannot be represented in binary and can only be represented in decimal.
   (B) Groups of bits can be used to represent abstractions, including but not limited to numbers and characters.
   (C) The same value (number) can have a different representation depending on the number system used to represent it.
   (D) At one of the lowest levels of abstraction, all digital data can be represented in binary using only combinations of the digits zero and one.

14. 5 bits is enough to represent 32 different numbers. How many bits do you need to represent twice as many numbers (64)?

   (A) 6
   (B) 7
   (C) 10
   (D) 11

15. Which is the best definition for bit rate?

   (A) A set of rules governing the exchange or transmission of data between devices.
   (B) Time it takes for a bit to travel from its sender to its receiver.
   (C) The number of bits that are conveyed or processed per unit of time, e.g., 8 bits/sec.
   (D) Transmission capacity measured by bit rate.

Copyright © 2017. The College Board. These materials are part of a College Board program. Use or distribution of these materials online or in print beyond your school’s participation in the program is prohibited.
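The arithmetic behind questions 1, 5, 6, 8, 11, and 14 can be checked with a few lines of Python. The following is a minimal sketch, not part of the original quiz: the function names (`ascii_char_from_binary`, `min_bits_for`, `to_fixed_width_binary`) are hypothetical helpers introduced here for illustration, and the only assumption is that n bits can represent 2^n distinct values.

```python
def ascii_char_from_binary(bits: str) -> str:
    """Read a string of 0s and 1s as a base-2 number and return the
    character with that code (for 7-bit input this is the ASCII character)."""
    return chr(int(bits, 2))


def min_bits_for(count: int) -> int:
    """Return the smallest n with 2**n >= count, i.e. the minimum number
    of bits needed to give `count` distinct items unique codes."""
    if count < 1:
        raise ValueError("count must be at least 1")
    # (count - 1).bit_length() is the width of the largest code, count - 1.
    return max(1, (count - 1).bit_length())


def to_fixed_width_binary(value: int, width: int) -> str:
    """Format a non-negative integer as binary, zero-padded to `width` digits."""
    return format(value, f"0{width}b")


if __name__ == "__main__":
    print(ascii_char_from_binary("1001010"))  # J  (64 + 8 + 2 = 74)
    print(min_bits_for(51))                   # 6  (2**5 = 32 < 51 <= 64 = 2**6)
    print(min_bits_for(64))                   # 6  (one bit more than the 5 needed for 32)
    print(to_fixed_width_binary(26, 5))       # 11010  (16 + 8 + 2)
    print(2**64 // 2**32 == 2**32)            # True: 64-bit gives 2**32 times as many values
    print(2**7 // 2**6)                       # 2: each added bit doubles the count
```

The last two lines illustrate the general rule the quiz relies on: adding k bits multiplies the number of representable values by 2^k, so going from 6 to 7 bits doubles it, and going from 32 to 64 bits multiplies it by 2^32.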