Programming Fundamentals and Python

Total Page:16

File Type:pdf, Size:1020Kb

Programming Fundamentals and Python Programming Fundamentals and Python Steven Bird Ewan Klein Edward Loper 2006-01-15 Version: 0.6 Revision: 1.21 Copyright: c 2001-2006 University of Pennsylvania License: Creative Commons Attribution-NonCommercial-ShareAlike License Note This is a draft. Please send any feedback to the authors. 1 Introduction This tutorial provides a non-technical overview of Python. It contains many working program fragments which you should try yourself. For a more detailed overview, we recommend that you consult one of the introductions listed in the further reading section below. More advanced programming material is contained in a later tutorial. 2 Processing lists and strings In writing Python programs for natural language processing we make heavy use of lists. A list is simply an ordered sequence of items. Each item could be a string, a number, or some more complex object such as a list. A Python list is represented as comma-separated items, enclosed inside brackets, e.g. ['John', 14, 'Sep', 1984] is a list consisting of four elements. We initialize a list by giving a variable name, followed by the equals sign, followed by this square bracket notation, e.g.: >>> a = ['colourless', 'green', 'ideas'] This command sets the value of variable a. To see the value of this variable, we can give the command print a. In interactive mode, we can just type the variable name: >>> a ['colourless', 'green', 'ideas'] To nd out the length of a list, we can use the len() function, which takes the list variable as its argument: >>> len(a) 3 We can access the items of a list individually by using indexes. Note that list items are counted starting from zero. 1 >>> a[0] 'colourless' >>> a[1] 'green' We can also index from the end of the list, using negative numbers. Thus, index -1 returns the last item: >>> a[-1] 'ideas' Sometimes we mistakenly try to access an index position which does not exist. If so, Python reports an error: >>> a[5] IndexError: list index out of range It is often useful to access multiple items in a list. In Python this is done using a special construct called the slice. In general, if a is a list, then a[m:n] is the slice starting at item m, and going up to (but not including) item n. >>> a[1:3] ['green', 'ideas'] If we leave out the rst number, it is assumed to be zero, the start of the list. So a[:n] nds everything up to (but not including) item n. Similarly, a[m:] nds everything from item m to the end: >>> a[1:] ['green', 'ideas'] >>> a[:2] ['colourless', 'green'] Lists can be concatenated, sorted and reversed, as we se below: >>> b = a + ['sleep', 'furiously'] >>> print b ['colourless', 'green', 'ideas', 'sleep', 'furiously'] >>> b.sort() >>> print b ['colourless', 'furiously', 'green', 'ideas', 'sleep'] >>> b.reverse() >>> print b ['sleep', 'ideas', 'green', 'furiously', 'colourless'] Note The individual elements of the above list are strings, and when we try to concatenate them using the + operator, Python performs string concatenation. E.g. b[2] + b[1] concatenates two strings to give 'greenideas'. We can use the for item in list syntax to iterate over the items of a list. In the following fragment we iterate over each word in the above b list, and print the rst character of each word: >>> for w in b: ... print w[0] ... s Natural Language Toolkit 2 nltk.sourceforge.net i g f c Observe that we used the bracket notation to access characters in a string, just as we earlier used the notation to access items in a list. We can combine these two kinds of access, e.g. to nd the third word (at index 2), and print the second character (at index 1): >>> b[2] 'green' >>> b[2][1] 'r' So far we have accessed items in a list by giving their index. We can also access them in terms of their content. The index() function returns the rst index where the specied item was found: >>> b.index('green') 2 We have already seen the string concatenation operator, +. We can multiply strings: >>> 'sleep' * 3 'sleepsleepsleep' Here is a useful idiom for reversing strings: >>> b[::-1] 'sselruoloc ylsuoifur neerg saedi peels' We can also join and split strings: >>> c = ' '.join(b) >>> c 'sleep ideas green furiously colourless' >>> c.split('r') ['sleep ideas g', 'een fu', 'iously colou', 'less'] More information on lists and strings can be found by typing help(list) and help(str) at the command prompt. To nd out about individual functions, you can type, e.g. help(list.append). 3 Dictionaries in Python Lists and strings are accessed by indexes. This turns out to be too limiting for many applications. Instead, we would like to access items by their names. For instance, we access a printed dictionary by looking up an entry by its headword, not by its number. Python provides a dictionary data type. We access it using the familiar bracket notation. Here we declare that d is a dictionary, then add three entries to it: >>> d = {} >>> d['colourless'] = 'adj' >>> d['furiously'] = 'adv' >>> d['ideas'] = 'n' The keys of the dictionary are just the headwords. However, they are not in any predened order. Natural Language Toolkit 3 nltk.sourceforge.net >>> d.keys() ['furiously', 'colourless', 'ideas'] >>> d['ideas'] 'n' We can iterate over the items in a dictionary using the for entry in dict syntax, as illustrated below: >>> for w in d: ... print "%s [%s]," % (w, d[w]), furiously [adv], colourless [adj], ideas [n], We can print an entire dictionary just by typing its name at the interactive command prompt: >>> d {'furiously': 'adv', 'colourless': 'adj', 'ideas': 'n'} If we try to access a non-existent entry, e.g. by typing d['sleep'], the Python interpreter will report an error. Two other functions are useful when we don't know if an entry exists, has_key(), and get(), as illustrated below: >>> print d.has_key('ideas') True >>> print d.get('sleep') None As a simple rule of thumb, dictionary entries are like variable names. We can create them simply by assigning to them, e.g. x = 2 (variable), d['x'] = 2 (dictionary entry). We can access them by reference, e.g. print x (variable), print d['x'] (dictionary entry). We can use dictionaries to count word occurrences. For example, the following code reads Macbeth and counts the frequency of each word: >>> from nltk_lite.corpora import gutenberg >>> count = {} # initialize a dictionary >>> for word in gutenberg.raw('shakespeare-macbeth'): # tokenize Macbeth ... word = word.lower() # normalize to lowercase ... if word not in count: # seen this word before? ... count[word] = 0 # if not, set count to zero ... count[word] += 1 Now we can inspect the dictionary: >>> print count['scotland'] 12 >>> frequencies = [(freq, word) for (word, freq) in count.items()] >>> frequencies.sort() >>> frequencies.reverse() >>> print frequencies[:20] [(1986, ','), (1245, '.'), (692, 'the'), (654, "'"), (567, 'and'), (482, ':'), (399, 'to'), (365, 'of'), (360, 'i'), (255, 'a'), (246, 'that'), (242, '?'), (224, 'd'), (218, 'you'), (213, 'in'), (207, 'my'), (198, 'is'), (170, 'not'), (165, 'it'), (156, 'with')] 4 Regular Expressions Finally, we look at Python's regular expression module re, for substituting and searching within strings. >>> import re >>> from nltk_lite.utilities import re_show >>> sent = "colourless green ideas sleep furiously" Natural Language Toolkit 4 nltk.sourceforge.net We use a utility function re_show to show how regular expressions match against substrings. First we search for all instances of a particular character or character sequence: >>> re_show('l', sent) co{l}our{l}ess green ideas s{l}eep furious{l}y >>> re_show('green', sent) colourless {green} ideas sleep furiously Now we can perform substitutions. In the rst instance we replace all instances of l with s. Note that this generates a string as output, and doesn't modify the original string. Then we replace any instances of green with red. >>> re.sub('l', 's', sent) 'cosoursess green ideas sseep furioussy' >>> re.sub('green', 'red', sent) 'colourless red ideas sleep furiously' So far we have only seen simple patterns, consisting of individual characters or sequences of characters. However, regular expressions can also contain special syntax, such as | for disjunction, e.g.: >>> re_show('(green|sleep)', sent) colourless {green} ideas {sleep} furiously >>> re.findall('(green|sleep)', sent) ['green', 'sleep'] We can also disjoin individual characters. For example, [aeiou] matches any of a, e, i, o, or u, that is, any vowel. The expression [^aeiou] matches anything that is not a vowel. In the following example, we match sequences consisting of non-vowels followed by vowels. >>> re_show('[^aeiou][aeiou]', sent) {co}{lo}ur{le}ss g{re}en{ i}{de}as s{le}ep {fu}{ri}ously >>> re.findall('[^aeiou][aeiou]', sent) ['co', 'lo', 'le', 're', ' i', 'de', 'le', 'fu', 'ri'] We can put parentheses around parts of an expression in order to generate structured results. For example, here we see all those non-vowel characters which appear before a vowel: >>> re.findall('([^aeiou])[aeiou]', sent) ['c', 'l', 'l', 'r', ' ', 'd', 'l', 'f', 'r'] We can even generate pairs (or tuples), which we may then go on and tabulate. >>> re.findall('([^aeiou])([aeiou])', sent) [('c', 'o'), ('l', 'o'), ('l', 'e'), ('r', 'e'), (' ', 'i'), ('d', 'e'), ('l', 'e'), ('f', 'u'), ('r', 'i')] For an extended discussion of regular expressions, please see the regular expression tutorial. 5 Accessing Files and the Web It is easy to access local les, and web-pages in Python. Here are some examples. (You will need to create a le called corpus.txt before you can open it for reading.) >>> print open('corpus.txt').read() Hello world. This is a test file. Natural Language Toolkit 5 nltk.sourceforge.net >>> from urllib import urlopen >>> page = urlopen("http://news.bbc.co.uk/").read() >>> page = re.sub('<[^>]*>', '', page) # strip HTML markup >>> page = re.sub('\s+', ' ', page) # strip whitespace >>> print page[:60] BBC NEWS | News Front Page News Sport Weather World Service 6 Accessing NLTK NLTK consists of a set of Python modules, each of which denes classes and functions related to a single data structure or task.
Recommended publications
  • Regular Expressions
    Regular Expressions A regular expression describes a language using three operations. Regular Expressions A regular expression (RE) describes a language. It uses the three regular operations. These are called union/or, concatenation and star. Brackets ( and ) are used for grouping, just as in normal math. Goddard 2: 2 Union The symbol + means union or or. Example: 0 + 1 means either a zero or a one. Goddard 2: 3 Concatenation The concatenation of two REs is obtained by writing the one after the other. Example: (0 + 1) 0 corresponds to f00; 10g. (0 + 1)(0 + ") corresponds to f00; 0; 10; 1g. Goddard 2: 4 Star The symbol ∗ is pronounced star and means zero or more copies. Example: a∗ corresponds to any string of a’s: f"; a; aa; aaa;:::g. ∗ (0 + 1) corresponds to all binary strings. Goddard 2: 5 Example An RE for the language of all binary strings of length at least 2 that begin and end in the same symbol. Goddard 2: 6 Example An RE for the language of all binary strings of length at least 2 that begin and end in the same symbol. ∗ ∗ 0(0 + 1) 0 + 1(0 + 1) 1 Note precedence of regular operators: star al- ways refers to smallest piece it can, or to largest piece it can. Goddard 2: 7 Example Consider the regular expression ∗ ∗ ((0 + 1) 1 + ")(00) 00 Goddard 2: 8 Example Consider the regular expression ∗ ∗ ((0 + 1) 1 + ")(00) 00 This RE is for the set of all binary strings that end with an even nonzero number of 0’s.
    [Show full text]
  • Technical Standard COBOL Language
    Technical Standard COBOL Language NICAL H S C T A E N T D A R D [This page intentionally left blank] X/Open CAE Specification COBOL Language X/Open Company, Ltd. December 1991, X/Open Company Limited All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior permission of the copyright owners. X/Open CAE Specification COBOL Language ISBN: 1 872630 09 X X/Open Document Number: XO/CAE/91/200 Set in Palatino by X/Open Company Ltd., U.K. Printed by Maple Press, U.K. Published by X/Open Company Ltd., U.K. Any comments relating to the material contained in this document may be submitted to X/Open at: X/Open Company Limited Apex Plaza Forbury Road Reading Berkshire, RG1 1AX United Kingdom or by Electronic Mail to: [email protected] X/Open CAE Specification (1991) Page : ii COBOL Language Contents COBOL LANGUAGE Chapter 1 INTRODUCTION 1.1 THE X/OPEN COBOL DEFINITION 1.2 HISTORY OF THIS DOCUMENT 1.3 FORMAT OF ENTRIES 1.3.1 Typographical Conventions 1.3.2 Terminology Chapter 2 COBOL DEFINITION 2.1 GENERAL FORMAT FOR A SEQUENCE OF SOURCE PROGRAMS 2.2 GENERAL FORMAT FOR NESTED SOURCE PROGRAMS 2.2.1 Nested-Source-Program 2.3 GENERAL FORMAT FOR IDENTIFICATION DIVISION 2.4 GENERAL FORMAT FOR ENVIRONMENT DIVISION 2.5 GENERAL FORMAT FOR SOURCE-COMPUTER-ENTRY 2.6 GENERAL FORMAT FOR OBJECT-COMPUTER-ENTRY 2.7 GENERAL FORMAT FOR SPECIAL-NAMES-CONTENT 2.8 GENERAL FORMAT FOR FILE-CONTROL-ENTRY 2.8.1
    [Show full text]
  • CMPSCI 250 Lecture
    CMPSCI 250: Introduction to Computation Lecture #29: Proving Regular Language Identities David Mix Barrington 6 April 2012 Proving Regular Language Identities • Regular Language Identities • The Semiring Axioms Again • Identities Involving Union and Concatenation • Proving the Distributive Law • The Inductive Definition of Kleene Star • Identities Involving Kleene Star • (ST)*, S*T*, and (S + T)* Regular Language Identities • In this lecture and the next we’ll use our new formal definition of the regular languages to prove things about them. In particular, in this lecture we’ll prove a number of regular language identities, which are statements about languages where the types of the free variables are “regular expression” and which are true for all possible values of those free variables. • For example, if we view the union operator + as “addition” and the concatenation operator ⋅ as “multiplication”, then the rule S(T + U) = ST + SU is a statement about languages and (as we’ll prove today) is a regular language identity. In fact it’s a language identity as regularity doesn’t matter. • We can use the inductive definition of regular expressions to prove statements about the whole family of them -- this will be the subject of the next lecture. The Semiring Axioms Again • The set of natural numbers, with the ordinary operations + and ×, forms an algebraic structure called a semiring. Earlier we proved the semiring axioms for the naturals from the Peano axioms and our inductive definitions of + and ×. It turns out that the languages form a semiring under union and concatenation, and the regular languages are a subsemiring because they are closed under + and ⋅: if R and S are regular, so are R + S and R⋅S.
    [Show full text]
  • Some Aspects of Semirings
    Appendix A Some Aspects of Semirings Semirings considered as a common generalization of associative rings and dis- tributive lattices provide important tools in different branches of computer science. Hence structural results on semirings are interesting and are a basic concept. Semi- rings appear in different mathematical areas, such as ideals of a ring, as positive cones of partially ordered rings and fields, vector bundles, in the context of topolog- ical considerations and in the foundation of arithmetic etc. In this appendix some algebraic concepts are introduced in order to generalize the corresponding concepts of semirings N of non-negative integers and their algebraic theory is discussed. A.1 Introductory Concepts H.S. Vandiver gave the first formal definition of a semiring and developed the the- ory of a special class of semirings in 1934. A semiring S is defined as an algebra (S, +, ·) such that (S, +) and (S, ·) are semigroups connected by a(b+c) = ab+ac and (b+c)a = ba+ca for all a,b,c ∈ S.ThesetN of all non-negative integers with usual addition and multiplication of integers is an example of a semiring, called the semiring of non-negative integers. A semiring S may have an additive zero ◦ defined by ◦+a = a +◦=a for all a ∈ S or a multiplicative zero 0 defined by 0a = a0 = 0 for all a ∈ S. S may contain both ◦ and 0 but they may not coincide. Consider the semiring (N, +, ·), where N is the set of all non-negative integers; a + b ={lcm of a and b, when a = 0,b= 0}; = 0, otherwise; and a · b = usual product of a and b.
    [Show full text]
  • Theme and Variations on the Concatenation Product
    Theme and variations on the concatenation product Jean-Eric´ Pin1⋆ LIAFA, University Paris-Diderot and CNRS, France. Abstract. The concatenation product is one of the most important op- erations on regular languages. Its study requires sophisticated tools from algebra, finite model theory and profinite topology. This paper surveys research advances on this topic over the last fifty years. The concatenation product plays a key role in two of the most important results of automata theory: Kleene’s theorem on regular languages [23] and Sch¨utzen- berger’s theorem on star-free languages [60]. This survey article surveys the most important results and tools related to the concatenation product, including connections with algebra, profinite topology and finite model theory. The paper is organised as follows: Section 1 presents some useful algebraic tools for the study of the concatenation product. Section 2 introduces the main definitions on the product and its variants. The classical results are summarized in Section 3. Sections 4 and 5 are devoted to the study of two algebraic tools: Sch¨utzenberger products and relational morphisms. Closure properties form the topic of Section 6. Hierarchies and their connection with finite model theory are presented in Sections 7 and 8. Finally, new directions are suggested in Section 9. 1 The instruments This section is a brief reminder on the algebraic notions needed to study the con- catenation product: semigroups and semirings, syntactic ordered monoids, free profinite monoids, equations and identities, varieties and relational morphisms. More information can be found in [1–3,18,35,42,45]. 1.1 Semigroups and semirings If S is a semigroup, the set P(S) of subsets of S is also a semiring, with union as addition and multiplication defined, for every X, Y ∈P(S), by XY = {xy | x ∈ X,y ∈ Y } In this semiring, the empty set is the zero and for this reason, is denoted by 0.
    [Show full text]
  • Mal E Init N Synt X Cobol
    BeMA EUROPEAN CO~IPUTER ~lANUFACTURERS ASSOCIATION MAL E INIT N of the SYNT X COBOL eptelnber ] 970 Free copies of this document are available from ECMA, European Computer Manufacturers Association, 114 Rue du Rhone - 1204 Geneva (Switzerland) EeMA EUROPEAN COMPUTER MANUFACTURERS ASSOCIATION FORMAL DEFINITION of the SYNTAX OF COBOL Septenlber 1970 FORMAL DEFINITION of the SYNTAX OF COBOL - i - CONTENTS Page PREFACE ix INTRODUCTION TO THE NOTATION xiii FORMAL DEFINITION OF COBOL SYNTAX G. Syntactic Definitions of General Nature EMPTY 1 COBOL GRAPHICS 2 COBOL CONTROLS 4 SOME FREQUENTLY USED SEPARATORS 5 WORD 6 PROPER NONNUMERIC LITERAL 7 PROPER NUMERIC LITERAL 8 FIGURATIVE CONSTANT 9 LITERAL 10 ARITHMETIC OPERATOR 11 PROPER RELATIONAL OPERATOR 12 PICTURE CHARACTER STRING 13 COMMENT STRING 14 OTHER LANGUAGE STRING 15 T. COBOL Text SEPARATORS 16 GENERALIZED CHARACTER STRING 17 STRUCTURE OF COBOL TEXT 18 - ii - Page N. Names defined by the Imp1ementors HARDWARE NAMES 19 OTHER NAMES 20 C. COBOL Program COBOL PROGRAM STRUCTURE 21 I. Identification Division IDENTIFICATION DIVISION STRUCTURE 22 PROGRAM-ID PARAGRAPH 23 DATE COMPILED PARAGRAPH 24 OTHER PARAGRAPHS 2S COt~ENT PAP~GRAPH BODY 26 E. Environment Division ENVIRONMENT DIVISION STRUCTURE 27 CONFIGURATION SECTION STRUCTURE 28 INPUT-OUTPUT SECTION STRUCTURE 29 SOURCE COMPUTER PARAGRAPH 30 OBJECT COMPUTER PARAGRAPH 31 SEGMENT LIMIT CLAUSE 32 SPECIAL-NAMES PARAGRAPH 33 SPECIAL-NAMES CLAUSE 34 CURRENCY-SIGN CLAUSE 37 DECIMAL-POINT CLAUSE 38 FILE-CONTROL PARAGRAPH 39 SELECT CLAUSE 41 ASSIGN CLAUSE 42 MULTIPLE REEL/UNIT CLAUSE 43 ALTERNATE AREA CLAUSE 44 FILE-LIMIT CLAUSE 4S - iii - Page ACCESS MODE CLAUSE 46 PROCESSING MODE CLAUSE 47 KEY CLAUSE 48 I-O-CONTROL PARAGRAPH 49 RERUN CLAUSE 50 SAME CLAUSE 51 MULTIPLE FILE CLAUSE 52 D.
    [Show full text]
  • Introduction to Object-Oriented Programming
    QUIZ How could we disable the automatic creation of copy- constructors pre-C++11? What syntax feature did C++11 introduce to make the disabling clearer and more permanent? • Give a code example. QUIZ How is the overloaded assignment operator different from all other overloaded operators in C++? QUIZ Show in a code example how to disable the automatic creation of the overloaded assignment operator post- C++11. The class is named Foo. Ch. 14: Inheritance & Composition Remember composition Composition If embedded object is private, it will be accessed through member (or friend) functions Composition – multi-level example Draw the UML diagram, as learned in ch.1! Inheritance Base class, from which Y inherits UML diagram Inheritance The default is private inheritance During inheritance, everything defaults to private. If the base class were not preceded by public, it would mean that all of the members of the base class, private and public would be private in the derived class. Examples on next slide Explaining public and private in inheritance (see protected later!) Inheritance – adding new functions change( ) ↔ we can add completely new functions. Inheritance – redefining functions set( ) ↔ we can redefine functions from the base class. If function is redefined, we can still call the base-class version using the scope resolution operator! Inheritance – redefining functions A note on language: redefining functions in a derived class is mostly referred to today as overriding; this includes the C++ standards. (Redefining is used when we talk about typedef or namespaces.) Unfortunately, the matter is further muddled by the existence of the override identifier (since C++11 – it applies only to virtual functions).
    [Show full text]
  • Finite Automata Part Three
    Finite Automata Part Three Recap from Last Time A language L is called a regular language if there exists a DFA D such that ℒ(D) = L. NFAs ● An NFA is a ● Nondeterministic ● Finite ● Automaton ● Can have missing transitions or multiple transitions defined on the same input symbol. ● Accepts if any possible series of choices leads to an accepting state. ε-Transitions ● NFAs have a special type of transition called the ε-transition. ● An NFA may follow any number of ε-transitions at any time without consuming any input. a start a a q0 q1 q2 ε ε b, ε b q03 q4 q5 b a a b b b New Stuff! DFAs and NFAs NFAs and DFAs ● Any language that can be accepted by a DFA can be accepted by an NFA. ● Why? ● Just use the same set of transitions as before. ● Question: Can any language accepted by an NFA also be accepted by a DFA? ● Surprisingly, the answer is yes! Finite Automata ● NFAs and DFAs are finite automata; there can only be finitely many states in an NFA or DFA. ● An NFA can be in any combination of its states, but there are only finitely many possible combations. ● Idea: Build a DFA where each state of the DFA corresponds to a set of states in the NFA. Simulating an NFA with a DFA 0 0 start q14 q03 start 0 1 0 1 q0 q1 q2 1 0, 1 ε 1 0 0 0 q2 q q5 q3 q4 q5 1 0 1 0 1 1 0 1 q q q 1 4 0 3 The Subset Construction ● This construction for transforming an NFA into a DFA is called the subset construction (or sometimes the powerset construction).
    [Show full text]
  • Regular Expressions
    Automata Theory CS411 2015F-02 Formal Languages David Galles Department of Computer Science University of San Francisco 02-0: Alphabets & Strings An alphabet Σ is a finite set of symbols Σ1 = {a, b, . ., z} Σ2 = {0, 1} A string is a finite sequence of symbols from an alphabet fire, truck are both strings over {a, . ., z} length of a string is the number of symbols in the string |fire| = 4, |truck| = 5 02-1: Alphabets & Strings The empty string ǫ is a string of 0 characters |ǫ| = 0 ◦ is the concatenation operator w1 = fire, w2 = truck w1 ◦ w2 = firetruck w2 ◦ w1 = truckfire w2 ◦ w2 = trucktruck Often drop the ◦: w1w2 = firetruck For any string w, wǫ = w 02-2: Concatenation & Reversal We can concatenate a string with itself: w1 = w w2 = ww w3 = www By definition, w0 = ǫ Can reverse a string: wR truckR = kcurt 02-3: Formal Language A formal language (or just language) is a set of strings L1 ={a, aa, abba, bbba} L2 ={car, truck, goose} L3 ={1, 11, 111, 1111, 11111, . .} A language can be either finite or infinite 02-4: Language Concatenation We can concatenate languages as well as strings L1L2 = {wv : w ∈ L1 ∧ v ∈ L2} {a, ab}{bb, b} = 02-5: Language Concatenation We can concatenate languages as well as strings L1L2 = {wv : w ∈ L1 ∧ v ∈ L2} {a, ab}{bb, b} = {abb, ab, abbb} {a, ab}{a, ab} = 02-6: Language Concatenation We can concatenate languages as well as strings L1L2 = {wv : w ∈ L1 ∧ v ∈ L2} {a, ab}{bb, b} = {abb, ab, abbb} {a, ab}{a, ab} = {aa, aab, aba, abab} {a, aa}{a, aa} = 02-7: Language Concatenation We can concatenate languages as well as strings L1L2 = {wv : w ∈ L1 ∧ v ∈ L2} {a, ab}{bb, b} = {abb, ab, abbb} {a, ab}{a, ab} = {aa, aab, aba, abab} {a, aa}{a, aa} = {aa, aaa, aaaa} What can we say about |L1L2|, if we know |L1| = m and |L2| = n? 02-8: Language Concatenation We can concatenate a language with itself, just like strings L1 = L, L2 = LL, L3 = LLL, etc.
    [Show full text]
  • (Python) Chapter 1: Introduction to Programming in Python 1.1
    (Python) Chapter 1: Introduction to Programming in Python 1.1 Compiled vs. Interpreted Languages Computers only understand 0s and 1s, their native machine language. All of the executable programs on your computer are a collection of these 0s and 1s that tell your computer exactly what to execute. However, humans do a rather poor job of communicating and 0s and 1s. If we had to always write our instructions to computers in this manner, things would go very, very slowly and we'd have quite a few unhappy computer programmers, to say the least. Luckily, there are two common solutions employed so that programmers don't have to write their instructions to a computer in 0s and 1s: 1) Compiled languages 2) Interpreted languages In a compiled language, the programmer writes a program in a programming language that is human readable. Then, a program called a compiler translates this program into a set of 0s and 1s known as an executable file that the computer will understand. It's this executable file that the computer runs. If one wants to make changes to how their program runs, they must first make a change to their program, then recompile it (retranslate it) to create an updated executable file that the computer understands. In an interpreted language, rather than doing all the translation at once, the compiler translates some of the code written (maybe a line or two) in a human readable language to an intermediate form, and then this form gets "interpreted" to 0s and 1s that the machine understands and immediately executes.
    [Show full text]
  • Concatenation
    Back to concatenation • Theorem The class of regular languages is closed under concatenation. If L1 and L2 are regular languages then so is L1 L2. 1/1 Back to concatenation • Theorem The class of regular languages is closed under concatenation. If L1 and L2 are regular languages then so is L1 L2. • Question: Can we apply the same trick as with the Unioner M which simultaneously runs M1 and M2 & accepts if either M1 accepts or M2? M would have to accept if the input can be split into two pieces: one which is accepted by M1 and the following one by M2 1/2 Let’s play a bit • First, find the DFA for – L1 = { x {0,1}* | x is any string} – L2 = { x {0,1}* | x is composed of one or more 0s} • Then, find the DFA for – L1 L2 = { x {0,1}* | x is any string that ends with at least one 0} Introduction to Nondeterministic Finite Automata • We have mostly played with Deterministic Finite Automata (DFA) – Every step of a computation follows from the preceding one in a unique way • Nondeterministic Finite Automata (NFA) generalizes DFA – Several choices may exist for the next state at any point – We have already seen one example of a NFA: One in which not all transitions were defined • Now, let’s dive deeper 1/4 Introduction to Nondeterministic Finite Automata • The static picture of an NFA is as a graph whose edges are labeled by S and by e (called Se) and with start vertex q0 and accept states F.
    [Show full text]
  • Object-Oriented Programming in Java Buzzwords
    Object-Oriented Programming in Java Buzzwords responsibility-driven design inheritance encapsulation iterators overriding coupling cohesion javadoc interface collection classes mutator methods polymorphic method calls CS2336: Object-Oriented Programming in Java 2 What to Expect • Expect: fundamental concepts, principles, and techniques of object- oriented programming • Not to expect: language details, all Java APIs, JSP/JavaServlet, use of specific tools CS2336: Object-Oriented Programming in Java 3 Fundamental concepts • object • class • method • parameter • data type CS2336: Object-Oriented Programming in Java 4 What is OOP? • OOP: – identifying objects and having them collaborate with each other • procedural programming: – identifying steps that need to be taken to perform a task CS2336: Object-Oriented Programming in Java 5 Objects and classes • objects – represent ‘things’ from the real world, or from some problem domain (example: “the red car down there in the car park”) – An object has a unique identity, state, and behaviors • classes – represent all objects of a kind (example: “car”) CS2336: Object-Oriented Programming in Java 6 Methods and parameters • Objects have operations which can be invoked (Java calls them methods). • The behavior of an object is defined by a set of methods. • Methods may have parameters to pass additional information needed to execute. • Methods may return a result via a return value. • Method signature specifies the types of the parameters and return values CS2336: Object-Oriented Programming in Java 7 Data Types • Parameters have data types, which specify what kind of data should be passed to a parameter. • There are two general kinds of types: primitive types, where values are stored in variables directly, and object types, where references to objects are stored in variables.
    [Show full text]