CS110: Introduction to Computer Science: The String Data Type


Dianna Xu
Python Programming, 1/e

The String Data Type

Text is represented in programs by the string data type. A string is a sequence of characters enclosed within quotation marks (") or apostrophes (').

When you enter a name in response to input, Python does the same thing as:

    firstName = Dianna

Python evaluates the expression by looking up the value of the variable Dianna and storing it in firstName. If no variable named Dianna has been defined, this produces an error.

One way to fix this is to enter your string input with quotes around it. Even though this works, it is cumbersome!

There is a better way to handle text: the raw_input function. raw_input is like input, but it doesn't evaluate the expression that the user enters.

We can access the individual characters in a string through indexing. The positions in a string are numbered from the left, starting with 0. The general form is <string>[<expr>], where the value of expr determines which character is selected from the string.

    H  e  l  l  o     B  o  b
    0  1  2  3  4  5  6  7  8

In a string of n characters, the last character is at position n-1, since we start counting with 0. We can index from the right side using negative indexes.

Indexing returns a string containing a single character from a larger string. We can also access a contiguous sequence of characters, called a substring, through a process called slicing.

Slicing: <string>[<start>:<end>]

start and end should both be ints. The slice contains the substring beginning at position start and running up to, but not including, position end. If either expression is missing, the start or the end of the string is used.
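The indexing and slicing rules above can be tried on the "Hello Bob" string from the diagram. This is a minimal sketch; the variable names (greet, hello, bob, and so on) are chosen here for illustration and do not come from the slides.

```python
greet = "Hello Bob"                  # 9 characters, positions 0 through 8

first_char = greet[0]                # leftmost character
space_char = greet[5]                # the blank between the two words
last_char = greet[len(greet) - 1]    # position n-1
also_last = greet[-1]                # negative indexes count from the right

hello = greet[0:5]                   # positions 0 to 4; position 5 is excluded
bob = greet[6:]                      # missing end: run to the end of the string
first_three = greet[:3]              # missing start: begin at position 0

print(first_char, last_char, also_last)
print(hello, bob, first_three)
```

Note that a slice such as greet[0:5] stops one position before its end value, which is why hello does not include the blank at position 5.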
Can we put two strings together into a longer string? Concatenation "glues" two strings together (+). Repetition builds up a string by multiple concatenations of a string with itself (*). The function len will return the length of a string.

    Operator               Meaning
    +                      Concatenation
    *                      Repetition
    <string>[]             Indexing
    <string>[:]            Slicing
    len(<string>)          Length
    for <var> in <string>  Iteration through characters

Simple String Processing

Usernames on a computer system: first initial, first seven characters of last name.

    # get user's first and last names
    first = raw_input("Please enter your first name (all lowercase): ")
    last = raw_input("Please enter your last name (all lowercase): ")

    # concatenate first initial with 7 chars of last name
    uname = first[0] + last[:7]

Sample runs:

    >>>
    Please enter your first name (all lowercase): john
    Please enter your last name (all lowercase): doe
    uname = jdoe

    >>>
    Please enter your first name (all lowercase): donna
    Please enter your last name (all lowercase): rostenkowski
    uname = drostenk

Another use: converting an int that stands for the month into the three-letter abbreviation for that month. Store all the names in one big string:

    "JanFebMarAprMayJunJulAugSepOctNovDec"

Use the month number as an index for slicing this string:

    monthAbbrev = months[pos:pos+3]

To get the correct position, subtract one from the month number and multiply by three.

    Month  Number  Position
    Jan    1       0
    Feb    2       3
    Mar    3       6
    Apr    4       9

Strings, Lists, and Sequences

Strings are really a special kind of sequence, so these operations also apply to sequences! Strings are always sequences of characters, but lists can be sequences of arbitrary values. Lists can have numbers, strings, or both!

We can use the idea of a list to make our previous month program even simpler! We change the lookup table for months to a list:

    months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

To get the months out of the sequence, do this:

    monthAbbrev = months[n-1]

Rather than this:

    monthAbbrev = months[pos:pos+3]

This version of the program is easy to extend to print out the whole month name rather than an abbreviation!

    months = ["January", "February", "March", "April", "May", "June",
              "July", "August", "September", "October", "November", "December"]

Lists are mutable, meaning they can be changed. Strings cannot be changed.

Strings and Secret Codes

Inside the computer, strings are represented as sequences of 1's and 0's, just like numbers. A string is stored as a sequence of numbers, one number per character. It doesn't matter what value is assigned as long as it's done consistently.

In the early days of computers, each manufacturer used their own encoding of numbers for characters. Today, American computers use the ASCII system (American Standard Code for Information Interchange).

0 to 127 are used to represent the characters typically found on American keyboards.

    65 to 90 are "A" to "Z"
    97 to 122 are "a" to "z"
    48 to 57 are "0" to "9"

The others are punctuation and control codes used to coordinate the sending and receiving of information.

One major problem with ASCII is that it's American-centric: it doesn't have many of the symbols necessary for other languages. Newer systems use Unicode, an alternate standard that includes support for nearly all written languages.

The ord function returns the numeric (ordinal) code of a single character. The chr function converts a numeric code to the corresponding character.
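The username rule, the two month lookups, list mutability, and ord/chr can be collected into one small sketch. The slides read names with Python 2's raw_input; to keep the example runnable under Python 3 as well (where raw_input was renamed input), the names are passed as function arguments instead. The function and constant names are hypothetical, introduced only for this illustration, and make_username assumes a non-empty first name.

```python
# Hypothetical helper names; the slides only show the underlying expressions.
MONTHS_STRING = "JanFebMarAprMayJunJulAugSepOctNovDec"
MONTHS_LIST = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def make_username(first, last):
    """First initial plus at most the first seven characters of the last name."""
    return first[0] + last[:7]


def month_abbrev_by_slicing(n):
    """Month number 1..12 -> abbreviation, by slicing one big string."""
    pos = (n - 1) * 3              # subtract one, multiply by three
    return MONTHS_STRING[pos:pos + 3]


def month_abbrev_by_list(n):
    """Month number 1..12 -> abbreviation, by indexing a list."""
    return MONTHS_LIST[n - 1]


print(make_username("donna", "rostenkowski"))
print(month_abbrev_by_slicing(4), month_abbrev_by_list(4))

# Lists can be changed in place; strings cannot.
values = [1, "two", 3]
values[1] = 2                      # allowed: lists are mutable
word = "Hello"
try:
    word[0] = "J"                  # not allowed: strings are immutable
    string_changed = True
except TypeError:
    string_changed = False

# ord and chr convert between characters and their numeric codes.
print(ord("A"), ord("a"), ord("0"))
print(chr(90), chr(122), chr(57))
```

The list version needs no position arithmetic, which is what makes it easy to switch to full month names: only the contents of the list change.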
Other String Operations

There are a number of other string processing functions available in the string library. Try them all!

    capitalize(s) - Copy of s with only the first character capitalized
    capwords(s) - Copy of s; first character of each word capitalized
    center(s, width) - Center s in a field of given width
    count(s, sub) - Count the number of occurrences of sub in s
    find(s, sub) - Find the first position where sub occurs in s
    join(list) - Concatenate list of strings into one large string
    ljust(s, width) - Like center, but s is left-justified
    lower(s) - Copy of s in all lowercase letters
    lstrip(s) - Copy of s with leading whitespace removed
    replace(s, oldsub, newsub) - Replace occurrences of oldsub in s with newsub
    rfind(s, sub) - Like find, but returns the right-most position
    rjust(s, width) - Like center, but s is right-justified
    rstrip(s) - Copy of s with trailing whitespace removed
    split(s) - Split s into a list of substrings
    upper(s) - Copy of s; all characters converted to uppercase

From Encoding to Encryption

The process of encoding information for the purpose of keeping it secret or transmitting it privately is called encryption. Cryptography is the study of encryption methods.
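The operations listed above are written in the Python 2 string-library style, for example capitalize(s). As a hedged sketch of how they look in Python 3: most of them are methods on the string itself, while capwords is still provided by the string module. The sample strings and variable names are assumptions made for this illustration. The last few lines turn a message into its numeric codes and back with ord, chr, split, and join; that is plain encoding, not a secure encryption method.

```python
import string  # only capwords is taken from the string module here

text = "hello bob"

capitalized = text.capitalize()        # only the first character capitalized
title_cased = string.capwords(text)    # first character of each word capitalized
centered = "hi".center(6)              # centered in a field of width 6
left = "hi".ljust(4)                   # left-justified in a field of width 4
right = "hi".rjust(4)                  # right-justified in a field of width 4
o_count = text.count("o")              # number of occurrences of "o"
first_o = text.find("o")               # first position of "o"
last_o = text.rfind("o")               # right-most position of "o"
words = text.split()                   # list of substrings
rejoined = " ".join(words)             # list of strings back into one string
replaced = text.replace("bob", "sue")  # occurrences of "bob" replaced
shouted = text.upper()                 # all uppercase
quiet = "Hello Bob".lower()            # all lowercase
no_lead = "  hi  ".lstrip()            # leading whitespace removed
no_trail = "  hi  ".rstrip()           # trailing whitespace removed

# Encode a message as numbers, one per character, then decode it again.
message = "Hi Bob"
codes = [ord(ch) for ch in message]
encoded = " ".join(str(code) for code in codes)
decoded = "".join(chr(int(tok)) for tok in encoded.split())

print(capitalized, "|", title_cased, "|", replaced)
print(encoded, "->", decoded)
```

Anyone who knows the character codes can reverse this encoding, which is why real encryption needs more than a fixed, public mapping.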