TRAINING MATERIALS FOR IT PROFESSIONALS

Advanced Python (PYT212 version 1.1.0)

Copyright Information

© Copyright 2019 Webucator. All rights reserved.

The Authors

Nat Dunn

Nat Dunn founded Webucator in 2003 to combine his passion for technical training with his business expertise and to help companies benefit from both. His previous experience was in sales, business and technical training, and management. Nat has an MBA from Harvard Business School and a BA in International Relations from Pomona College.

Roger Sakowski (Editor)

Roger has over 35 years of experience in technical training, programming, data management, network administration, and technical writing for companies such as NASA, Sun Microsystems, Bell Labs, GTE, GE, and Lucent among other Fortune 100 companies.

Stephen Withrow (Editor)

Stephen has over 30 years' experience in training, development, and consulting in a variety of technology areas including Java, C, C++, XML, JavaScript, AJAX, Tomcat, JBoss, Oracle, and DB2. His background includes design and implementation of business solutions on client/server, Web, and enterprise platforms. Stephen is a published writer in both technical and non-technical endeavors. Stephen received an undergraduate degree in Computer Science and Physics from Florida State University.

Accompanying Class Files

This manual comes with accompanying class files, which your instructor or sales representative will point you to. Most code samples and exercise and solution files found in the manual can also be found in the class files at the locations indicated at the top of the code listings.

Due to space limitations, the code listings sometimes have line wrapping where no line wrapping occurs in the actual code sample. This is indicated in the manual using three greater-than signs (>>>) at the beginning of each wrapped line. In other cases, the space limitations are such that we have inserted a forced line break in the middle of a word. When this occurs, we append the following symbol at the end of the line before the actual break: »»

Table of Contents

1. Advanced Python Concepts
    Lambda Functions
    Advanced List Comprehensions
    Quick Review of Basic List Comprehensions
    Multiple for Loops
    Exercise 1: Rolling Five Dice
    Collections Module
    Named Tuples
    Default Dictionaries (defaultdict)
    Exercise 2: Creating a defaultdict
    Ordered Dictionaries (OrderedDict)
    Exercise 3: Creating an OrderedDict
    Counters
    Exercise 4: Creating a Counter
    Deques (deque)
    Exercise 5: Working with a deque
    Mapping and Filtering
    map(function, iterable, ...)
    filter(function, iterable)
    Using Lambda Functions with map() and filter()
    Mutable and Immutable Built-in Objects
    Strings are Immutable
    Lists are Mutable
    Sorting
    Sorting Lists in Place
    The sorted() Function
    Exercise 6: Converting list.sort() to sorted(iterable)
    Sorting Sequences of Sequences
    Sorting Sequences of Dictionaries
    Unpacking Sequences in Function Calls
    Exercise 7: Converting a String to a datetime.date Object
    Modules and Packages
    Modules
    Packages
    Search Path for Modules and Packages

Version: 1.1.0. Printed: 2019-04-02.

2. Working with Data
    Relational Databases
    PEP 0249 -- Python Database API Specification v2.0
    PyMySQL
    Returning Dictionaries instead of Tuples
    sqlite3
    Exercise 8: Querying a SQLite Database
    Passing Parameters
    SQLite Database in Memory
    Executing Multiple Queries at Once
    Exercise 9: Inserting File Data into a Database
    CSV
    Reading from a CSV File
    Finding Data in a CSV File
    Exercise 10: Comparing Data in a CSV File
    Creating a New CSV File
    CSV Dialects
    Getting Data from the Web
    The Requests Package
    Beautiful Soup
    XML
    Exercise 11: Requests and Beautiful Soup
    JSON
    Exercise 12: Using JSON to print Course data

3. Testing and Debugging
    Testing for Performance
    time.perf_counter()
    The timeit Module
    The unittest Module
    Unittest Test Files
    Exercise 13: Fixing Functions
    Special unittest.TestCase Methods
    Assert Methods

4. Classes and Objects
    Attributes
    Behaviors
    Classes vs. Objects
    Everything Is an Object
    Creating Custom Classes
    Attributes and Methods
    Exercise 14: Adding a roll() Method to Die
    Private Attributes
    Properties
    Creating Properties with the property() Function
    Creating Properties using the @property Decorator
    Exercise 15: Properties
    Objects that Track their Own History
    Documenting Classes
    Using docstrings
    Exercise 16: Documenting the Die Class
    Inheritance
    Overriding a Class Method
    Extending a Class
    Exercise 17: Extending the Die Class
    Extending a Class Method
    Exercise 18: Extending the roll() Method
    Static Methods
    Class Attributes and Methods
    Class Attributes
    Class Methods
    You Must Consider Subclasses
    Abstract Classes and Methods
    Understanding Decorators

1. Advanced Python Concepts

In this lesson, you will learn...

1. To work with lambda functions.
2. To write more advanced list comprehensions.
3. To work with the collections module to create named tuples, defaultdicts, ordereddicts, counters, and deques.
4. To use mapping and filtering.
5. To sort sequences.
6. To unpack sequences in function calls.
7. To create modules and packages.

In this lesson, you will learn about some Python functionality and techniques that are commonly used but require a solid foundation in Python to understand.

1.1 Lambda Functions

Lambda functions are anonymous functions that are generally used to complete a small task, after which they are no longer needed. The syntax for creating a lambda function is:

Syntax

lambda arguments: expression

Lambda functions are almost always used within other functions, but for demonstration purposes, we could assign a lambda function to a variable, like this:

f = lambda n: n**2

We could then call f like this:

f(5) # Returns 25
f(2) # Returns 4
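Because lambda functions are usually passed straight to another function rather than assigned to a variable, a short sketch of that more typical use may help. The names list below is made up for illustration; sorted() and its key parameter are standard Python.

```python
# Hypothetical sample data, used only for illustration.
names = ['Spike', 'Ed', 'Woodstock', 'Rex']

# The lambda is created inline and passed as the key function;
# it is never assigned to a name of its own.
by_length = sorted(names, key=lambda name: len(name))

print(by_length)  # ['Ed', 'Rex', 'Spike', 'Woodstock']
```

Once sorted() returns, the lambda is no longer needed, which is the situation lambda functions are meant for.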

Python 2 Difference

In Python 2, you can pass a tuple to a lambda function and reference each member of the tuple by its variable name, like this:

f = lambda (a,b): a + b
x = f( (1,2) )
print x

In Python 3, you can pass a tuple to a lambda function as well, but you must reference tuple members by their position, like this:

f = lambda nums: nums[0] + nums[1]
x = f( (1,2) )
print(x)

1.2 Advanced List Comprehensions

Class Files Examples

Examples from this section are in:
• advanced-python-concepts/Demos/list_comprehension.py

Quick Review of Basic List Comprehensions

Before we get into advanced list comprehensions, let's do a quick review. The basic syntax for a list comprehension is:

Syntax

my_list = [f(x) for x in iterable if condition]

In the above code, f(x) could be any of the following:
• Just a variable name (e.g., x).
• An operation (e.g., x**2).
• A function call (e.g., len(x) or square(x)).

Here are a couple of simple examples:

Example 1: List Comprehension with Condition

words = ['Woodstock', 'Gary', 'Tucker', 'Gopher', 'Spike', 'Ed',
         'Faline', 'Willy', 'Rex', 'Rhino', 'Roo', 'Pongo', 'Kaa']
three_letter_words = [w for w in words if len(w) == 3]

three_letter_words will contain:


['Rex', 'Roo', 'Kaa']

Example 2: List Comprehension with Function Call

people = ['George Washington', 'John Adams',
          'Thomas Jefferson', 'John Quincy Adams']

def get_inits(name):
    inits = []
    for name_part in name.split():
        inits.append(name_part[0])
    return '.'.join(inits) + '.'

inits = [get_inits(person) for person in people]

inits will contain:

['G.W.', 'J.A.', 'T.J.', 'J.Q.A.']

Now, on to the more advanced uses of list comprehension.

Multiple for Loops

List comprehensions can include multiple for loops. This provides an easy way to create something similar to a two-dimensional array or a matrix:

dice_rolls = [(a,b) for a in range(1,7)
                    for b in range(1,7)]

The above code will create a list of tuples containing all the possible rolls of two dice:

[ (1, 1),
  (1, 2),
  (1, 3),
  (1, 4),
  (1, 5),
  (1, 6),
  (2, 1),
  (2, 2),
  ... ]

Notice that the above output contains what game players would consider duplicates. For example, the rolls (1, 2) and (2, 1) are considered the same in dice. We can remove these pseudo-duplicates by starting the second for loop with the current value of a in the first for loop. We can do this because each for loop has access to all variables created in previous for loops:

dice_rolls = [(a,b) for a in range(1,7)
                    for b in range(a,7)]

The dice_rolls list will now contain the different possible rolls (from a dice rolling point of view):

[ (1, 1),
  (1, 2),
  (1, 3),
  (1, 4),
  (1, 5),
  (1, 6),
  (2, 2),
  (2, 3),
  (2, 4),
  (2, 5),
  (2, 6),
  (3, 3),
  (3, 4),
  ... ]
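To make the effect of the second range concrete, the sketch below builds both lists, compares their sizes, and writes the second comprehension as explicit nested loops. The variable names all_rolls, unique_rolls, and nested are chosen here for illustration and do not come from the class files.

```python
# All ordered rolls of two dice: 6 * 6 = 36 tuples.
all_rolls = [(a, b) for a in range(1, 7)
                    for b in range(1, 7)]

# Starting b at a skips mirrored pairs such as (2, 1).
unique_rolls = [(a, b) for a in range(1, 7)
                       for b in range(a, 7)]

# The same result written as explicit nested loops.
nested = []
for a in range(1, 7):
    for b in range(a, 7):
        nested.append((a, b))

print(len(all_rolls))          # 36
print(len(unique_rolls))       # 21
print(nested == unique_rolls)  # True
```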


Exercise 1: Rolling Five Dice

10 to 15 minutes

There is no limit to the number of for loops in a list comprehension, so we can use this same technique to get the possibilities for more than two dice.

1. Create a new file in advanced-python-concepts/Exercises named list_comprehensions.py.
2. Write two separate list comprehensions:
   A. The first should output five-item tuples for all unique permutations from rolling five identical six-sided dice. When looking for permutations, order matters.
   B. The second should output five-item tuples for all unique combinations from rolling five identical six-sided dice. When looking for combinations, order doesn't matter.


Exercise Solution advanced-python-concepts/Solutions/list_comprehensions.py

# Get unique permutations:
dice_rolls = [(a,b,c,d,e)
              for a in range(1,7)
              for b in range(1,7)
              for c in range(1,7)
              for d in range(1,7)
              for e in range(1,7)]

print('Solution 1', dice_rolls, '-'*70, sep='\n')

# Get unique combinations:
dice_rolls = [(a,b,c,d,e)
              for a in range(1,7)
              for b in range(a,7)
              for c in range(b,7)
              for d in range(c,7)
              for e in range(d,7)]

print('Solution 2', dice_rolls, '-'*70, sep='\n')
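One way to sanity-check the two solutions is to compare their sizes against the standard library's itertools module. This is only a cross-check sketch, not part of the exercise files: product() and combinations_with_replacement() are real itertools functions, while the names faces, permutations, and combinations are chosen here for illustration.

```python
from itertools import combinations_with_replacement, product

faces = range(1, 7)

# Order matters: every ordered roll of five dice.
permutations = list(product(faces, repeat=5))

# Order does not matter: non-decreasing five-tuples only.
combinations = list(combinations_with_replacement(faces, 5))

print(len(permutations))  # 7776
print(len(combinations))  # 252
```

The first count is 6**5, and the second matches the number of tuples produced when each loop starts at the previous loop's current value.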


1.3 Collections Module

The collections module includes specialized containers (objects that hold data) that provide more specific functionality than Python's built-in containers (list, tuple, dict, and set). Some of the more useful containers are named tuples (created with the namedtuple() function), defaultdict, OrderedDict, Counter, and deque.

Named Tuples

Imagine you are creating a game in which you need to set and get the position of a target. You could do this with a regular tuple like this:

# set target position:
target_pos = (100, 200)

# get x value of target position
target_pos[0] # 100

But someone reading your code might not understand what target_pos[0] refers to. A named tuple allows you to reference target_pos.x, which is more meaningful and helpful.

Code Sample advanced-python-concepts/Demos/namedtuple.py

from collections import namedtuple

Point = namedtuple('Point','x, y')

# set target position:
target_pos = Point(100,200)

# get x value of target position
print(target_pos.x)

Code Explanation

As the above code shows, the namedtuple() function allows you to give a name to the elements at different positions in a tuple and then refer to them by that name.
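A named tuple is still a tuple, so it can be indexed, unpacked, and is immutable. The sketch below illustrates this using the same Point definition; the moved variable is introduced here for illustration, and _replace() and _asdict() are standard named tuple methods.

```python
from collections import namedtuple

Point = namedtuple('Point', 'x, y')
target_pos = Point(100, 200)

# Fields can still be read by position or unpacked like a regular tuple.
x, y = target_pos
print(target_pos[0] == target_pos.x)  # True

# Named tuples are immutable; _replace() returns a new instance.
moved = target_pos._replace(x=150)
print(moved)       # Point(x=150, y=200)
print(target_pos)  # Point(x=100, y=200)

# _asdict() converts the named tuple to a dictionary.
print(moved._asdict()['y'])  # 200
```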


Default Dictionaries (defaultdict)

With regular dictionaries, trying to modify the value of a key that doesn't exist will cause an exception. For example, the following code will result in a KeyError:

foo = {}
foo['bar'] += 1

A defaultdict is like a regular dictionary except that, when you try to look up a key that doesn't exist, it creates the key and assigns it the value returned by a function you specified when creating it. To illustrate how a defaultdict can be useful, let's see how we would create a regular dictionary that shows the number of different ways each number (2 through 12) can be rolled when rolling two dice, like this:

{
    2: 1,
    3: 2,
    4: 3,
    5: 4,
    6: 5,
    7: 6,
    8: 5,
    9: 4,
    10: 3,
    11: 2,
    12: 1
}

1. First, create the list of possibilities as we did earlier:

dice_rolls = [(a,b) for a in range(1,7)
                    for b in range(1,7)]

2. Next, create an empty dictionary, roll_counts, and then loop through the dice_rolls list checking for the existence of a key that is the sum of the dice roll. For example, on the first iteration, we find (1,1), which, when added together, gives us 2. Since roll_counts does not have a key 2, we need to add that key and set its value to 1. The same is true for when we find (1,2), which adds up to 3. But later when we find (2,1), which also adds up to 3, we don't need to recreate the key. Instead, we increment its value by 1. The code looks like this:

roll_counts = {}
for roll in dice_rolls:
    if sum(roll) in roll_counts:
        roll_counts[sum(roll)] += 1
    else:
        roll_counts[sum(roll)] = 1

This method works fine and gives us the following roll_counts dictionary:

{
    2: 1,
    3: 2,
    4: 3,
    5: 4,
    6: 5,
    7: 6,
    8: 5,
    9: 4,
    10: 3,
    11: 2,
    12: 1
}

Another option is to just go ahead and try to increment the value of each potential key we find and then, if we get a KeyError, we assign 1 for that key, like this:

roll_counts = {}
for roll in dice_rolls:
    try:
        roll_counts[sum(roll)] += 1
    except KeyError:
        roll_counts[sum(roll)] = 1

This also works and produces the same dictionary.

But by using a defaultdict, we can avoid the need for the if-else or try-except block. The code looks like this:

from collections import defaultdict

roll_counts = defaultdict(int)
for roll in dice_rolls:
    roll_counts[sum(roll)] += 1

The result is a defaultdict object that can be treated just like a normal dictionary:


defaultdict(<class 'int'>, {
    2: 1,
    3: 2,
    4: 3,
    5: 4,
    6: 5,
    7: 6,
    8: 5,
    9: 4,
    10: 3,
    11: 2,
    12: 1
})

Notice that we pass int to defaultdict(). Remember, when you try to look up a key that doesn't exist in a defaultdict, it creates the key and assigns it the value returned by a function you specified when creating it. In this case, that function is int().

When passing the function to defaultdict(), you do not include parentheses, because you are not calling the function at the time you pass it to defaultdict(). Rather, you are specifying that you want to use this function to give you default values for new keys. By passing int, we are stating that we want new keys to have a default value of whatever int() returns when no argument is passed to it. That value is 0.

You could create default dictionaries with any number of functions, both built-in and user-defined:

a = defaultdict(list) # default key value will be []
b = defaultdict(str) # default key value will be ''
c = defaultdict(lambda: 5) # default key value will be 5

def foo():
    return 'bar'

d = defaultdict(foo) # default key value will be 'bar'
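It may help to see exactly when a defaultdict creates a key. The sketch below uses a hypothetical counts dictionary (not from the class files) to show that a lookup with square brackets creates the key, while a membership test or a call to get() does not.

```python
from collections import defaultdict

counts = defaultdict(int)

# A membership test does not create the key...
print('apple' in counts)  # False

# ...but a lookup does, using the value returned by int().
print(counts['apple'])    # 0
print('apple' in counts)  # True

# get() also leaves the dictionary unchanged for missing keys.
print(counts.get('pear'))  # None
print(dict(counts))        # {'apple': 0}
```

One consequence is that simply reading a missing key in a defaultdict changes its contents, which is worth keeping in mind when you later iterate over it.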


Code Sample advanced-python-concepts/Demos/defaultdict.py

# Create a dictionary that shows the number of different ways each
# number (2 through 12) can be rolled.
dice_rolls = [(a,b)
              for a in range(1,7)
              for b in range(1,7)]

# Solution 1 if else
# Loop through list checking for existence of each key.
# If it exists, increment it by 1. If it doesn't, add it
# and set it to 1.
roll_counts = {}
for roll in dice_rolls:
    if sum(roll) in roll_counts:
        roll_counts[sum(roll)] += 1
    else:
        roll_counts[sum(roll)] = 1

print('Solution 1:', roll_counts,'-'*70,sep='\n')

# Solution 2 try except
# Loop through trying to increment each key. If it fails, this means
# the key doesn't exist, so add it and set it to 1.
roll_counts = {}
for roll in dice_rolls:
    try:
        roll_counts[sum(roll)] += 1
    except KeyError:
        roll_counts[sum(roll)] = 1

print('Solution 2:', roll_counts,'-'*70,sep='\n')

# Solution 3 defaultdict
from collections import defaultdict

roll_counts = defaultdict(int)
for roll in dice_rolls:
    roll_counts[sum(roll)] += 1

print('Solution 3:', roll_counts,'-'*70,sep='\n')


Code Explanation

As the above code shows, using a defaultdict avoids the need for using an if-else or try-except block.


Exercise 2: Creating a defaultdict

15 to 20 minutes

In this exercise, you will organize the 1927 Yankees by position by creating a default dictionary that looks like this:

defaultdict(<class 'list'>, {
    'OF': ['Earle Combs', 'Cedric Durst', 'Bob Meusel',
           'Ben Paschal', 'Babe Ruth'],
    'C': ['Benny Bengough', 'Pat Collins', 'Johnny Grabowski'],
    '2B': ['Tony Lazzeri', 'Ray Morehart'],
    'SS': ['Mark Koenig'],
    '3B': ['Joe Dugan', 'Mike Gazella', 'Julie Wera'],
    'P': ['Walter Beall', 'Joe Giard', 'Waite Hoyt', 'Wilcy Moore',
          'Herb Pennock', 'George Pipgras', 'Dutch Ruether',
          'Bob Shawkey', 'Urban Shocker', 'Myles Thomas'],
    '1B': ['Lou Gehrig']
})

You will start with this data:


yankees_1927 = [
    {'position': 'P', 'name': 'Walter Beall'},
    {'position': 'C', 'name': 'Benny Bengough'},
    {'position': 'C', 'name': 'Pat Collins'},
    {'position': 'OF', 'name': 'Earle Combs'},
    {'position': '3B', 'name': 'Joe Dugan'},
    {'position': 'OF', 'name': 'Cedric Durst'},
    {'position': '3B', 'name': 'Mike Gazella'},
    {'position': '1B', 'name': 'Lou Gehrig'},
    {'position': 'P', 'name': 'Joe Giard'},
    {'position': 'C', 'name': 'Johnny Grabowski'},
    {'position': 'P', 'name': 'Waite Hoyt'},
    {'position': 'SS', 'name': 'Mark Koenig'},
    {'position': '2B', 'name': 'Tony Lazzeri'},
    {'position': 'OF', 'name': 'Bob Meusel'},
    {'position': 'P', 'name': 'Wilcy Moore'},
    {'position': '2B', 'name': 'Ray Morehart'},
    {'position': 'OF', 'name': 'Ben Paschal'},
    {'position': 'P', 'name': 'Herb Pennock'},
    {'position': 'P', 'name': 'George Pipgras'},
    {'position': 'P', 'name': 'Dutch Ruether'},
    {'position': 'OF', 'name': 'Babe Ruth'},
    {'position': 'P', 'name': 'Bob Shawkey'},
    {'position': 'P', 'name': 'Urban Shocker'},
    {'position': 'P', 'name': 'Myles Thomas'},
    {'position': '3B', 'name': 'Julie Wera'}
]

1. Open advanced-python-concepts/Exercises/defaultdict.py in your editor.
2. Write code so that the script creates the defaultdict above from the given list.
3. Output the pitchers (position 'P') stored in your new defaultdict.


Exercise Solution advanced-python-concepts/Solutions/defaultdict.py

from collections import defaultdict

yankees_1927 = [
    {'position': 'P', 'name': 'Walter Beall'},
    {'position': 'C', 'name': 'Benny Bengough'},
    {'position': 'C', 'name': 'Pat Collins'},
    {'position': 'OF', 'name': 'Earle Combs'},
    {'position': '3B', 'name': 'Joe Dugan'},
    {'position': 'OF', 'name': 'Cedric Durst'},
    {'position': '3B', 'name': 'Mike Gazella'},
    {'position': '1B', 'name': 'Lou Gehrig'},
    {'position': 'P', 'name': 'Joe Giard'},
    {'position': 'C', 'name': 'Johnny Grabowski'},
    {'position': 'P', 'name': 'Waite Hoyt'},
    {'position': 'SS', 'name': 'Mark Koenig'},
    {'position': '2B', 'name': 'Tony Lazzeri'},
    {'position': 'OF', 'name': 'Bob Meusel'},
    {'position': 'P', 'name': 'Wilcy Moore'},
    {'position': '2B', 'name': 'Ray Morehart'},
    {'position': 'OF', 'name': 'Ben Paschal'},
    {'position': 'P', 'name': 'Herb Pennock'},
    {'position': 'P', 'name': 'George Pipgras'},
    {'position': 'P', 'name': 'Dutch Ruether'},
    {'position': 'OF', 'name': 'Babe Ruth'},
    {'position': 'P', 'name': 'Bob Shawkey'},
    {'position': 'P', 'name': 'Urban Shocker'},
    {'position': 'P', 'name': 'Myles Thomas'},
    {'position': '3B', 'name': 'Julie Wera'}
]

# Group player names by position; missing positions start as [].
positions = defaultdict(list)
for player in yankees_1927:
    positions[player['position']].append(player['name'])

print(positions['P'])


Ordered Dictionaries (OrderedDict)

OrderedDict Examples

Examples from this section are in:
• advanced-python-concepts/Demos/ordereddict.py

An ordered dictionary stores items in the sequence they are inserted. Ordered dictionaries are created using the OrderedDict class (a subclass of dict). As an example, we will consider a dictionary that stores keys and values for used car condition. These might represent the condition of cars offered for sale at a car dealership. Here is the code to build the ordered dictionary:

from collections import OrderedDict

car_conditions=OrderedDict()

car_conditions['E']='Excellent'
car_conditions['G']='Good'
car_conditions['F']='Fair'
car_conditions['P']='Poor'

The items will appear in the order of insertion if we iterate the dictionary and display each key/value pair:

for (key, value) in car_conditions.items():
    print('Condition code: ' + key + ' Value: ' + value)
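As general Python background rather than something stated in this manual: since Python 3.7, regular dict objects also preserve insertion order. One difference that remains is that comparisons between two OrderedDict objects are order-sensitive. A minimal sketch, using two small dictionaries made up for illustration:

```python
from collections import OrderedDict

a = OrderedDict([('E', 'Excellent'), ('G', 'Good')])
b = OrderedDict([('G', 'Good'), ('E', 'Excellent')])

# Two OrderedDicts with the same items in a different order are not equal.
print(a == b)              # False

# Regular dictionaries ignore order when compared.
print(dict(a) == dict(b))  # True
```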

The sorted() function can be used to specify either a key order or a value order when you create an ordered dictionary from an existing dictionary. Suppose that we would like to create a new ordered dictionary, ordered by car condition value, from the dictionary we created earlier:

car_condition_value_sorted=OrderedDict(sorted(car_conditions.items(),
                                              key=value_sort))

Here, the sorted() function is passed two arguments: the dictionary items and, as the key argument, a named function (or a lambda, as we'll see shortly) that determines whether the order is based on key or value. Let's take a look at the function:


def value_sort(ccd):
    return ccd[1]

The argument to the function is a car condition dictionary item, that is, a (key, value) tuple. The function returns the value (index 1). If it returned the key (index 0) instead, the dictionary would be ordered by key (that is, by car condition code).

We can replace the named function with a lambda function to achieve the same ordering by value:

car_condition_value_sorted=OrderedDict(sorted(car_conditions.items(),
                                              key=lambda ccd: ccd[1]))

The OrderedDict class offers two useful methods:
• popitem(last=True) Removes an item from the dictionary and returns it as a (key, value) pair. If last is True (the default), the last item is removed. If last is False, the first item is removed.
• move_to_end(key, last=True) Moves the item with the given key to one end of the dictionary. If last is True (the default), the item is moved to the end of the dictionary. If last is False, the item is moved to the beginning of the dictionary. The method raises a KeyError if the key does not exist.
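The two methods can be tried on the car condition dictionary from this section. The following is a small sketch rather than part of the class files; the key 'X' is deliberately one that does not exist.

```python
from collections import OrderedDict

car_conditions = OrderedDict()
car_conditions['E'] = 'Excellent'
car_conditions['G'] = 'Good'
car_conditions['F'] = 'Fair'
car_conditions['P'] = 'Poor'

# Move 'E' to the end, then move 'P' to the beginning.
car_conditions.move_to_end('E')
car_conditions.move_to_end('P', last=False)
print(list(car_conditions))  # ['P', 'G', 'F', 'E']

# popitem() removes and returns the last (key, value) pair by default.
print(car_conditions.popitem())            # ('E', 'Excellent')
# With last=False it removes and returns the first pair.
print(car_conditions.popitem(last=False))  # ('P', 'Poor')
print(list(car_conditions))  # ['G', 'F']

# move_to_end() raises KeyError for a missing key.
try:
    car_conditions.move_to_end('X')
except KeyError:
    print('No such key')
```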


Code Sample advanced-python-concepts/Demos/ordereddict.py

from collections import OrderedDict

car_conditions=OrderedDict()

car_conditions['E']='Excellent'
car_conditions['G']='Good'
car_conditions['F']='Fair'
car_conditions['P']='Poor'

print('The ordered dictionary condition keys and values:')
for (key, value) in car_conditions.items():
    print('Condition key: ' + key + ' condition value: ' + value)
print('-'*70)

# define a function to sort by value:
def sort_by_value(ccd):
    return ccd[1]

# create dictionary in car condition value order
# using named function:
car_conditions_value=OrderedDict(sorted(car_conditions.items(),
                                        key=sort_by_value))
print('''The value ordered dictionary condition keys
and values using named function:''')
for (key, value) in car_conditions_value.items():
    print('Condition key: ' + key + ' condition value: ' + value)
print('-'*70)

# create dictionary in car condition value order
# using lambda function:
car_conditions_value=OrderedDict(sorted(car_conditions.items(),
                                        key=lambda ccd: ccd[1]))
print('''The value ordered dictionary condition keys
and values using lambda function:''')
for (key, value) in car_conditions_value.items():
    print('Condition key: ' + key + ' condition value: ' + value)
print('-'*70)


Exercise 3: Creating an OrderedDict

15 to 20 minutes

In this exercise, you will create an ordered dictionary from a regular dictionary that stores order status codes and values that are used to track customer order status for a hypothetical e-commerce company. The regular dictionary looks like this:

order_status = {'O': 'open', 'S': 'shipped', 'B': 'backordered',
                'X': 'cancelled', 'R': 'returned'}

1. Open advanced-python-concepts/Exercises/ordereddict.py in your editor.
2. Write code to create an OrderedDict from the given order status dictionary. Provide a lambda function to order the dictionary by order status value, i.e., 'backordered', 'cancelled', 'open', 'returned', 'shipped'.
3. Use a named function to specify ordering by key. Print the key/value pairs to ensure that the dictionary is ordered correctly.
4. Customer service has added another order status: 'D' for 'damaged'. Add the key/value pair for the new order status. Print your dictionary to ensure that the item has been added. Note that the new item has been stored at the end of the dictionary.
5. Customer service has decided to roll back the new order status of 'D' for 'damaged'. Remove the key/value pair for the order status of 'D'. Print your dictionary to ensure that the item has been removed.


Exercise Solution advanced-python-concepts/Solutions/ordereddict.py

from collections import OrderedDict

order_status = {'O': 'open', 'S': 'shipped', 'B': 'backordered',
                'X': 'cancelled', 'R': 'returned'}

def ordering(od):
    return od[0]

od_order_status = OrderedDict(sorted(order_status.items(),
                                     key=lambda od: od[1]))
print('''Ordered dictionary with lambda function
for ordering by value:''')
for (key, value) in od_order_status.items():
    print("key: " + key + " value: " + value)
print('-'*70)

od_order_status = OrderedDict(sorted(order_status.items(),
                                     key=ordering))
print('Ordered dictionary with named function for ordering by key:')
for (key, value) in od_order_status.items():
    print("key: " + key + " value: " + value)
print('-'*70)

# add an order status:
od_order_status['D']='damaged'
print('Ordered dictionary after adding an order status:')
for (key, value) in od_order_status.items():
    print("key: " + key + " value: " + value)
print('-'*70)

# delete the last item ('D'):
od_order_status.popitem()
print('Ordered dictionary after popitem():')
for (key, value) in od_order_status.items():
    print("key: " + key + " value: " + value)
print('-'*70)


Counters

Consider again the defaultdict object we created to get the number of different ways each number could be rolled when rolling two dice. This type of task is very common. You might have a collection of plants and want to get a count of the number of each species or the number of plants by color. The objects that hold these counts are called counters, and the collections module includes a special Counter class for creating them.

Although there are different ways of creating counters, they are most often created with an iterable, like this:

from collections import Counter
c = Counter(['green','blue','blue','red','yellow','green','blue'])

This will create the following counter:

Counter({
    'blue': 3,
    'green': 2,
    'red': 1,
    'yellow': 1
})

To create a counter from the dice_rolls list we used earlier, we need to first create a list of sums from it, like this:

roll_sums = [sum(roll) for roll in dice_rolls]

roll_sums will contain the following list:

[2,3,4,5,6,7,3,4,5,6,7,8,4,5,6,7,8,9,5,6,7,8,9,10,6,7,8,9,10,11,7,8,9,10,11,12]

We then create the counter like this:

c = Counter(roll_sums)

That creates a counter that is very similar to the defaultdict we saw earlier:


Counter({
    7: 6,
    6: 5,
    8: 5,
    5: 4,
    9: 4,
    4: 3,
    10: 3,
    3: 2,
    11: 2,
    2: 1,
    12: 1
})
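To see how close the counter is to the earlier defaultdict, and what it adds, the following sketch builds both from the same dice_rolls list. The most_common() method and the zero default for missing elements are standard Counter behavior; everything else reuses names from this section.

```python
from collections import Counter, defaultdict

dice_rolls = [(a, b) for a in range(1, 7)
                     for b in range(1, 7)]

c = Counter(sum(roll) for roll in dice_rolls)

# Build the equivalent defaultdict for comparison.
roll_counts = defaultdict(int)
for roll in dice_rolls:
    roll_counts[sum(roll)] += 1

# Both hold the same keys and values.
print(dict(c) == dict(roll_counts))  # True

# most_common(n) returns the n highest counts as (element, count) pairs.
print(c.most_common(1))  # [(7, 6)]

# Looking up a missing element returns 0 and does not add a key.
print(c[13])    # 0
print(13 in c)  # False
```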

Code Sample advanced-python-concepts/Demos/counter.py

from collections import Counter
c = Counter(['green','blue','blue','red','yellow','green','blue'])
print('Example 1:', c, '-'*70, sep='\n')

dice_rolls = [(a,b)
              for a in range(1,7)
              for b in range(1,7)]

roll_sums = [sum(roll) for roll in dice_rolls]
c = Counter(roll_sums)
print('Example 2:', c, '-'*70, sep='\n')

Counter is a subclass of dict. We will learn more about subclasses later, but for now all you need to understand is that a subclass generally has access to all of its superclass's methods and data. So, Counter supports all the standard dict instance methods. The update() method behaves differently though. In standard dict objects, update() replaces its key values with those of the passed-in dictionary. In Counter objects, update() adds the counts from the passed-in iterable or Counter object to its own counts. The examples below illustrate this:

Updating with a Dictionary

grades = {'English':97, 'Math':93, 'Art':74, 'Music':86}
grades.update({'Math':97, 'Gym':93})

The grades dictionary will now contain:


{
    'Art': 74,
    'English': 97,
    'Gym': 93, #key is added with value of 93
    'Math': 97, #97 replaces 93
    'Music': 86
}

Updating with a List

c = Counter(['green','blue','blue','red','yellow','green','blue'])
c.update(['red','yellow','yellow','purple'])

The c counter will now contain:

Counter({
    'blue': 3,
    'yellow': 3, #2 added to 1
    'red': 2, #1 added to 1
    'green': 2,
    'purple': 1 #key is created with value of 1
})

Updating with a Counter

c = Counter(['green','blue','blue','red','yellow','green','blue'])
d = Counter(['green','violet'])
c.update(d)

The c counter will now contain:

Counter({
    'blue': 3,
    'green': 3, # 1 added to 2
    'red': 1,
    'violet': 1, # key is created with a value of 1
    'yellow': 1
})


Code Sample advanced-python-concepts/Demos/counter_update.py

grades = {'English':97, 'Math':93, 'Art':74, 'Music':86}
grades.update({'Math':97, 'Gym':93})
print('Updating with a dictionary', grades, '-'*70, sep='\n')

from collections import Counter
c = Counter(['green','blue','blue','red','yellow','green','blue'])
c.update(['red','yellow','yellow','purple'])
print('Updating with a list', c, '-'*70, sep='\n')

c = Counter(['green','blue','blue','red','yellow','green','blue'])
d = Counter(['green','violet'])
c.update(d)
print('Updating with a counter', c, '-'*70, sep='\n')

Counters also have a corresponding subtract() method. It works just like update() but subtracts rather than adds the passed-in iterable counts:

Subtracting with a Counter

c = Counter(['green','blue','blue','red','yellow','green','blue'])
c.subtract(['red','yellow','yellow','purple'])

The c counter will now contain:

Counter({
    'blue': 3,
    'green': 2,
    'red': 0, #1 subtracted from 1
    'purple': -1, #key is created with value of -1
    'yellow': -1 #2 subtracted from 1
})

Code Sample advanced-python-concepts/Demos/counter_subtract.py

from collections import Counter
c = Counter(['green','blue','blue','red','yellow','green','blue'])
c.subtract(['red','yellow','yellow','purple'])
print(c)


Other Counter Methods

Other counter methods include:
• elements() - Returns an iterator over elements, repeating each as many times as its count, but ignoring elements with a count of zero or less.
• most_common([n]) - Returns the n most common elements and their counts. If n is not passed in, it returns all elements.
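The two methods above can be sketched as follows. This is a minimal illustration rather than one of the class files; the color data is reused from the earlier examples, and the variable names remaining and top_two are made up for this sketch.

```python
from collections import Counter

c = Counter(['green', 'blue', 'blue', 'red', 'yellow', 'green', 'blue'])
c.subtract(['red', 'yellow', 'yellow'])  # red drops to 0, yellow drops to -1

# elements() repeats each key by its count and skips counts of zero or less.
remaining = sorted(c.elements())
print(remaining)  # ['blue', 'blue', 'blue', 'green', 'green']

# most_common(n) returns (element, count) pairs, highest count first.
top_two = c.most_common(2)
print(top_two)  # [('blue', 3), ('green', 2)]
```

Note that 'red' and 'yellow' are still keys in the counter; elements() simply leaves them out because their counts are not positive.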


Exercise 4 Creating a Counter
10 to 15 minutes

In this exercise, you will create a counter that holds the most common words used and the number of times they show up in the Declaration of Independence.
1. Create a new Python script in advanced-python-concepts/Exercises named counter.py.
2. Write code that:
   A. Reads the Declaration_of_Independence.txt file in the same folder.
   B. Creates a list of all the words that have at least six characters. Use upper() to convert the words to uppercase.
   C. Creates a Counter from the word list.
   D. Outputs the most common ten words and their counts. The result should look like this:

[
    ('PEOPLE', 13),
    ('STATES', 7),
    ('INDEPENDENT', 5),
    ('AGAINST', 5),
    ('SHOULD', 5),
    ('OTHERS', 4),
    ('ASSENT', 4),
    ('GOVERNMENT,', 4),
    ('CONNECTION', 3),
    ('DECLARE', 3)
]


Exercise Solution advanced-python-concepts/Solutions/counter.py

from collections import Counter

with open('Declaration_of_Independence.txt') as f:
    doi = f.read()

word_list = [word for word in doi.upper().split() if len(word) > 5]

c = Counter(word_list)
print(c.most_common(10))


Deques (deque)

Deque Examples
Examples from this section are in:
• advanced-python-concepts/Demos/deque.py

A deque, pronounced "deck", is a double-ended queue. Therefore, a deque can be operated on as a stack (Last In First Out, or LIFO) or as a queue (First In First Out, or FIFO). Let's consider a simple example of deque:

from collections import deque

names=deque(['Stephen', 'Nat'])
print('The deque items:')
for name in names:
    print(name)

We pass an iterable (in this case, a list) to the deque class. In the example above, the deque has two items. We can now process the deque as a stack or a queue, interchangeably. The left side of the deque might be considered the beginning of a queue or the top of a stack. The right side can be considered the end of a queue or the bottom of a stack.

Let's add a name to the right side of the deque:

names.append('Roger')

The deque would now look like this:

['Stephen','Nat','Roger']

We can pop the name on the left side:

names.popleft()

The deque would now look like this:

['Nat','Roger']

We can just as easily add a name to the left side of our collection:


names.appendleft('Donna')

The deque would look like this:

['Donna','Nat','Roger']

But items can be inserted as well! Let's insert 'Connie' at position 1 (relative to the left side, with position 0 being the leftmost item):

names.insert(1,'Connie')

Here is the result:

['Donna','Connie','Nat','Roger']

Earlier we removed a name from the left side. Now let's remove a name from the right side:

names.pop()

This is how the deque looks now:

['Donna','Connie','Nat']

To clean up, we can remove all items in the deque and then display the count of items (which will be 0) to ensure all items have been removed:

names.clear()
print('Deque item count is now ' + str(len(names)))

The deque class offers additional useful methods:
• extend(iterable) - Appends the items of an iterable to the right side of the deque.
• extendleft(iterable) - Appends the items of an iterable to the left side of the deque, reversing their order (i.e., the first item in the iterable is appended to the left side of the deque, the second item is then appended to its left, and so on).

For a complete list of methods, go to https://docs.python.org/3/library/collections.html#collections.deque.
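The reversing behavior of extendleft() is easy to miss, so here is a small sketch of both methods. The names are reused from the earlier examples; the variables after_extendleft and in_order are hypothetical and not part of the class files.

```python
from collections import deque

names = deque(['Nat'])
names.extend(['Roger', 'Stephen'])     # added to the right, order preserved
names.extendleft(['Donna', 'Connie'])  # added to the left one item at a time

after_extendleft = list(names)
print(after_extendleft)  # ['Connie', 'Donna', 'Nat', 'Roger', 'Stephen']

# To keep the new items in their original order on the left,
# reverse the iterable before passing it in.
in_order = deque(['Nat'])
in_order.extendleft(reversed(['Donna', 'Connie']))
print(list(in_order))  # ['Donna', 'Connie', 'Nat']
```

Because 'Donna' is pushed first and 'Connie' is then pushed to its left, the new items end up in the opposite order from the iterable.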


Code Sample advanced-python-concepts/Demos/deque.py

from collections import deque

names=deque(['Stephen','Nat'])
print('The deque items:')
for name in names:
    print(name)

names.append('Roger')
print('The deque items with new name appended to the right end:')
for name in names:
    print(name)
print('-'*70)

names.popleft()
print('The deque items with name on left end removed:')
for name in names:
    print(name)
print('-'*70)

names.appendleft('Donna')
print('The deque items with new name appended to the left end:')
for name in names:
    print(name)
print('-'*70)

names.insert(1, 'Connie')
print('The deque items with name inserted at position 1:')
for name in names:
    print(name)

names.pop()
print('The deque items with name removed from the right end:')
for name in names:
    print(name)

names.clear()
print('The deque cleared, item count is: ' + str(len(names)))
print('-'*70)


Exercise 5 Working with a deque
10 to 15 minutes

In this exercise, you will create a deque that stores today's agenda for a hypothetical IT department.
1. Open the Python script in advanced-python-concepts/Exercises named deque.py.
2. Write code that:
   A. Creates a deque from the list named todays_agenda.
   B. Adds a list of agenda items named items_added_start_of_day to the left side of the deque.
   C. Removes the item "Database upgrade late AM".
   D. Inserts the item item_inserted immediately after "Accounts payable software test at 2PM".
   E. Adds the item item_added_late_night at the end of the day (late evening). This is the last agenda item for the day.


Exercise Solution advanced-python-concepts/Solutions/deque.py

from collections import deque

todays_agenda=deque(['Staff meeting at 10AM',
                     'Database upgrade late AM',
                     'Accounts payable software test at 2PM',
                     'System maintenance in the evening'])

items_added_start_of_day=['Conference call with London customer',
                          'Brew a good pot of coffee!']
item_inserted='Accounts receivables unit test, late afternoon'
item_added_late_night='Restart servers'

print('Today\'s agenda:')
for agenda_item in todays_agenda:
    print(agenda_item)
print('-'*70)

# We have to add some items that are to occur prior to staff meeting:
todays_agenda.extendleft(items_added_start_of_day)

print('Today\'s agenda after appending early morning items:')
for agenda_item in todays_agenda:
    print(agenda_item)
print('-'*70)

# Remove Database upgrade... agenda item:
todays_agenda.remove('Database upgrade late AM')
print('Today\'s agenda after removing Database upgrade:')
for agenda_item in todays_agenda:
    print(agenda_item)
print('-'*70)

# Insert new agenda item at position 4
# (immediately after Accounts payable software test):
todays_agenda.insert(4, item_inserted)
print('''Today\'s agenda after inserting new agenda item
Accounts receivable unit test... at position 4''')
print('''(Positions displayed below to affirm new item
is located at position 4):''')

for item_seq, agenda_item in enumerate(todays_agenda):
    print(str(item_seq) + ') ' + agenda_item)
print('-'*70)

# Add a final item to the end of the agenda (latest time):
todays_agenda.append(item_added_late_night)
print('''Today\'s agenda after inserting
new agenda item Restart servers''')
for agenda_item in todays_agenda:
    print(agenda_item)
print('-'*70)


1.4 Mapping and Filtering

Mapping and Filtering Examples
Examples from this section are in
• advanced-python-concepts/Demos/map_and_filter_functions.py

map(function, iterable, ...)

The built-in map() function is used to sequentially pass all the values of an iterable (or multiple iterables) to a function and return an iterator containing the returned values. The code below illustrates how map() works:

def multiply(x,y):
    return x*y

nums1 = range(0,10)
nums2 = range(10,0,-1)

for i in map(multiply, nums1, nums2):
    print(i)

This code multiplies 0 by 10, 1 by 9, 2 by 8,... 9 by 1 and returns the results as an iterator. It then loops through the iterator, printing each result.

Python 2 Difference

In Python 2, the map() function continues iterating through the longest passed-in iterable and passes in None values for any iterables that reach their end. In Python 3, the map() function stops iterating when it has reached the end of the shortest passed-in iterable.
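The Python 3 behavior can be seen with two iterables of different lengths. The sketch below reuses the multiply() function from above; the lists short and longer are made up for illustration. The second half shows one possible way to process every item of the longer iterable with itertools.zip_longest(), where the fill value of 1 is an arbitrary assumption chosen so the multiplication leaves the unmatched values unchanged.

```python
from itertools import zip_longest

def multiply(x, y):
    return x * y

short = [1, 2, 3]
longer = [10, 20, 30, 40, 50]

# Python 3: map() stops as soon as the shortest iterable is exhausted.
products = list(map(multiply, short, longer))
print(products)  # [10, 40, 90]

# One way to cover the longer iterable: pad the shorter one.
# fillvalue=1 is an assumption made for this example only.
padded = [multiply(x, y) for x, y in zip_longest(short, longer, fillvalue=1)]
print(padded)  # [10, 40, 90, 40, 50]
```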

filter(function, iterable)

The built-in filter() function is used to sequentially pass all the values of an iterable to a function and return an iterator containing the values for which the function returns a true value (True or anything that evaluates to True). The code below illustrates how filter() works:


def is_odd(num):
    return num % 2

nums = range(0,10)

for i in filter(is_odd, nums):
    print(i)

This code creates an iterator containing all the odd numbers between 0 and 9. It then loops through the iterator, printing each result.

Python 2 Difference

In Python 2, map() and filter() return lists. In Python 3, map() and filter() return iterators. If you need a list, pass the result to the list() function.
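One practical consequence of getting an iterator back is that it can only be consumed once. The following sketch reuses is_odd() from above; the names odds, first_pass, and second_pass are hypothetical.

```python
def is_odd(num):
    return num % 2

odds = filter(is_odd, range(10))  # a lazy iterator, not a list

first_pass = list(odds)   # consumes the iterator
second_pass = list(odds)  # nothing is left to yield

print(first_pass)   # [1, 3, 5, 7, 9]
print(second_pass)  # []
```

If the results are needed more than once, converting to a list up front (as the demos in this section do) avoids this surprise.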

Using Lambda Functions with map() and filter() Review the following two scripts to see how lambda functions are used with map() and filter():


Code Sample advanced-python-concepts/Demos/mapping.py

# Here's the lambda function we will demonstrate:
f = lambda n: n**2

# Before we use that lambda function, let's see how map() works
# with a named function:
def square(n):
    return n**2

squares = list(map(square, range(10)))
print('Squares with a named function:', squares,'-'*70,sep='\n')

# Now we use map() with a lambda function to do the same thing:
squares = list(map(lambda n: n**2, range(10)))
print('Squares with a lambda function:', squares,'-'*70,sep='\n')

# And here's how we can accomplish the same thing
# using list comprehension:
squares = [n**2 for n in range(10)]
print('Squares using list comprehension:', squares,'-'*70,sep='\n')


Code Sample advanced-python-concepts/Demos/filtering.py

# Here's the lambda function we will demonstrate:
f = lambda n: n**.5 == int(n**.5)

# Before we use that lambda function, let's see how filter()
# works with a named function:
def is_perfect_square(n):
    return n**.5 == int(n**.5)

perfect_squares = list(filter(is_perfect_square, range(100)))
print('Using a named function:', perfect_squares,'-'*70,sep='\n')

# Now we use filter() with a lambda function to do the same thing:
perfect_squares = list(filter(lambda n: n**.5 == int(n**.5),
                              range(100)))
print('Using a lambda function:', perfect_squares,'-'*70,sep='\n')

# And here's how we can accomplish the same thing
# using list comprehension:
perfect_squares = [n for n in range(100) if n**.5 == int(n**.5)]
print('Using list comprehension:', perfect_squares,'-'*70,sep='\n')

Let's just *keep* lambda

Some programmers, including Guido van Rossum, the creator of Python, dislike lambda, filter() and map(). These programmers feel that list comprehension can generally be used instead. However, other programmers love these functions and Guido eventually gave up the fight to remove them from Python. In February 2006, he wrote[1]:

    After so many attempts to come up with an alternative for lambda, perhaps we should admit defeat. I've not had the time to follow the most recent rounds, but I propose that we keep lambda, so as to stop wasting everybody's talent and time on an impossible quest.

You'll have to decide for yourself whether or not to use them.

1. See https://mail.python.org/pipermail/python-dev/2006-February/060415.html.


1.5 Mutable and Immutable Built-in Objects

Class Files Examples
Examples from this section are in
• advanced-python-concepts/Demos/mutable_vs_immutable_objects.py

The difference between mutable objects, such as lists and dictionaries, and immutable objects, such as strings, integers, and tuples, may seem pretty straightforward. Mutable objects can be changed; immutable objects cannot. But it helps to have a deeper understanding of how this can affect your code.

Strings are Immutable

When an object is assigned to a variable, the variable is just a pointer to the object. So, if we assign "A" to v1, we mean that v1 is pointing to the string object "A". Then if we assign v1 to v2, we are not pointing v2 at the v1 variable itself. Rather, we are pointing v2 to the same object v1 is pointing to. Consider the following:

Notice that v2 does not change when we change the value of v1. This illustrates that v2 points to "A", not to v1. Here is another way to look at it:


Notice that vs[0] remains 'A' even after v1 has 'C' appended to it. This illustrates that v1 += 'C' doesn't change the string object. Remember, strings are immutable. Rather, that line of code assigns a new string object to v1. It is the equivalent of v1 = v1 + 'C'.
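The behavior described in this section can be checked with the is operator, which tests whether two names refer to the same object. This is a minimal sketch using the v1 and v2 names from the text, not the contents of the demo file; same_before and same_after are hypothetical names.

```python
v1 = 'A'
v2 = v1                 # v2 points at the same string object as v1
same_before = v1 is v2  # True: one object, two names

v1 += 'C'               # builds a new string and rebinds v1 to it
same_after = v1 is v2   # False: v1 now names a different object

print(v1, v2)  # AC A
```

The original "A" object is never modified; only the name v1 is pointed somewhere new.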


Lists are Mutable

Lists, on the other hand, are mutable and can be modified in place. For example:


Notice that with lists, v2 does change when we change v1. Both are pointing at the same list object, which is mutable. So, when we modify the v1 list, v1 still points to the same object. And here is another way to look at it:

Again, notice that modifying the list does not change the fact that both variables still point to the same object.

Be careful though. If you use the assignment operator, you rebind the variable to a new list object instead of modifying the old one:
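The difference between modifying a list in place and assigning a new list can be sketched like this. The v1 and v2 names follow the text; the values and the still_same and now_same names are assumptions made for illustration.

```python
v1 = [1, 2, 3]
v2 = v1                # both names refer to one list object

v1.append(4)           # modifies that object in place
still_same = v1 is v2  # True: v2 sees the change too

v1 = v1 + [5]          # assignment: builds a new list and rebinds v1
now_same = v1 is v2    # False: v2 still refers to the old list

print(v1)  # [1, 2, 3, 4, 5]
print(v2)  # [1, 2, 3, 4]
```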


1.6 Sorting

Sorting Lists in Place


Class Files Examples
Examples from this section are in:
• advanced-python-concepts/Demos/sorting_sort_method.py

Python lists have a sort() method that sorts the list in place:

colors = ['red', 'blue', 'green', 'orange']
colors.sort()

The colors list will now contain:

['blue', 'green', 'orange', 'red']

The sort() method can take two keyword arguments: key and reverse.

reverse

The reverse argument is a boolean:

colors = ['red', 'blue', 'green', 'orange']
colors.sort(reverse=True)

The colors list will now contain:

['red', 'orange', 'green', 'blue']

key

The key argument takes a function to be called on each list item and performs the sort based on the result. For example, the following code will sort by word length:

colors = ['red', 'blue', 'green', 'orange']
colors.sort(key=len)

The colors list will now contain:

['red', 'blue', 'green', 'orange']

And the following code will sort by last name:


def get_lastname(name):
    return name.split()[-1]

people = ['George Washington', 'John Adams', 'Thomas Jefferson', 'John Quincy Adams']
people.sort(key=get_lastname)

The people list will now contain:

[
    'John Adams',
    'John Quincy Adams',
    'Thomas Jefferson',
    'George Washington'
]

*It's worth pointing out that John Quincy Adams shows up after John Adams in the result only because he shows up after him in the initial list. Python's sort is stable, so items with equal keys keep their original relative order. Our code as it stands does not take into account middle or first names.
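If the order of people who share a last name should not depend on their position in the original list, the key function can return a tuple. The sketch below is one possible approach, not part of the class files; lastname_then_rest() is a hypothetical helper that sorts by last name and then by the remaining name parts.

```python
def lastname_then_rest(name):
    parts = name.split()
    # Sort first by last name, then by the list of first and middle names.
    return (parts[-1], parts[:-1])

# John Quincy Adams deliberately comes first in the input this time.
people = ['John Quincy Adams', 'George Washington',
          'John Adams', 'Thomas Jefferson']
people.sort(key=lastname_then_rest)
print(people)
```

Here 'John Adams' sorts before 'John Quincy Adams' because ['John'] compares as less than ['John', 'Quincy'], regardless of where each name started in the list.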

Using Lambda Functions with key

If you don't want to create a new named function just to perform the sort, you can use a lambda function. For example, the code below would do the same thing as the code above without the need for the get_lastname() function:

people = ['George Washington', 'John Adams', 'Thomas Jefferson', 'John Quincy Adams']
people.sort(key=lambda name: name.split()[-1])

Combining key and reverse

The key and reverse arguments can be combined. For example, the code below will sort by word length in descending order:

colors = ['red', 'blue', 'green', 'orange']
colors.sort(key=len, reverse=True)

The colors list will now contain:

['orange', 'green', 'blue', 'red']


The sorted() Function

The built-in sorted() function requires an iterable as its first argument and can take key and reverse as keyword arguments. It works just like the list's sort() method except that:
1. It does not modify the iterable in place. Rather, it returns a new sorted list.
2. It can take any iterable, not just a list (but it always returns a list).
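Both differences can be seen in a short sketch. The colors data is reused from the earlier examples but stored in a tuple here; the names by_length_desc and letters are made up for this illustration.

```python
colors = ('red', 'blue', 'green', 'orange')  # a tuple has no sort() method

# sorted() accepts the tuple and returns a new list.
by_length_desc = sorted(colors, key=len, reverse=True)
print(by_length_desc)  # ['orange', 'green', 'blue', 'red']

# The original iterable is left untouched.
print(colors)  # ('red', 'blue', 'green', 'orange')

# Any iterable works, including a string; the result is still a list.
letters = sorted('python')
print(letters)  # ['h', 'n', 'o', 'p', 't', 'y']
```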


Exercise 6 Converting list.sort() to sorted(iterable)
15 to 25 minutes

In this exercise, you will convert all the examples of sort() we saw earlier to use sorted() instead.
1. Open advanced-python-concepts/Exercises/sorting_built_in_sorted_function.py in your editor.
2. The code in the first example has already been converted to use sorted():

colors = ['red', 'blue', 'green', 'orange']
#colors.sort()
#print(colors)
new_colors = sorted(colors)
print('Simple sorted()',new_colors,'-'*70, sep='\n')

3. Convert all other code examples in the script.


Exercise Solution advanced-python-concepts/Solutions/sorting_built_in_sorted_function.py

# Simple sort() method
# This one has been done for you
colors = ['red', 'blue', 'green', 'orange']
# colors.sort()
# print(colors)
new_colors = sorted(colors)
print('Simple sorted()',new_colors,'-'*70, sep='\n')

# The reverse argument:
#colors.sort(reverse=True)
#print(colors)
new_colors = sorted(colors, reverse=True)
print('Reverse order',new_colors,'-'*70, sep='\n')

# The key argument:
#colors.sort(key=len)
#print(colors)
new_colors = sorted(colors, key=len)
print('Ordered by length of key',new_colors,'-'*70, sep='\n')

# The key argument with named function:
def get_lastname(name):
    return name.split()[-1]

people = ['George Washington', 'John Adams',
          'Thomas Jefferson', 'John Quincy Adams']
#people.sort(key=get_lastname)
#print(people)
new_people = sorted(people, key=get_lastname)
print('Ordered by lastname using named function',new_people,
      '-'*70, sep='\n')

# The key argument with lambda function:
people = ['George Washington', 'John Adams',
          'Thomas Jefferson', 'John Quincy Adams']
#people.sort(key=lambda name: name.split()[-1])
#print(people)
new_people = sorted(people, key=lambda name: name.split()[-1])
print('Ordered by lastname using lambda function',new_people,
      '-'*70, sep='\n')

# Combining key and reverse
#colors.sort(key=len, reverse=True)
#print(colors)
new_colors = sorted(colors, key=len, reverse=True)
print('Ordered by reverse length of key',new_colors,
      '-'*70, sep='\n')


Sorting Sequences of Sequences

Class Files Examples
Examples from this section are in:
• advanced-python-concepts/Demos/sorting_sequences_of_sequences.py

When you sort a sequence of sequences, Python first sorts by the first element of each sequence, then by the second element, and so on. For example:

ww2_leaders = [
    ('Charles', 'de Gaulle'),
    ('Winston', 'Churchill'),
    ('Teddy', 'Roosevelt'), #not a WW2 leader, but helps make point
    ('Franklin', 'Roosevelt'),
    ('Joseph', 'Stalin'),
    ('Adolph', 'Hitler'),
    ('Benito', 'Mussolini'),
    ('Hideki', 'Tojo')
]

ww2_leaders.sort()

The ww2_leaders list will be sorted by first name and then by last name. It will now contain:

[
    ('Adolph', 'Hitler'),
    ('Benito', 'Mussolini'),
    ('Charles', 'de Gaulle'),
    ('Franklin', 'Roosevelt'),
    ('Hideki', 'Tojo'),
    ('Joseph', 'Stalin'),
    ('Teddy', 'Roosevelt'),
    ('Winston', 'Churchill')
]

To change the order of the sort, use a lambda function:

ww2_leaders.sort( key=lambda leader: (leader[1], leader[0]) )

The ww2_leaders list will now be sorted by last name and then by first name. It will now contain:


[
    ('Winston', 'Churchill'),
    ('Adolph', 'Hitler'),
    ('Benito', 'Mussolini'),
    ('Franklin', 'Roosevelt'),
    ('Teddy', 'Roosevelt'),
    ('Joseph', 'Stalin'),
    ('Hideki', 'Tojo'),
    ('Charles', 'de Gaulle')
]

It may seem strange that "de Gaulle" comes after "Tojo," but that is correct. Lowercase letters come after uppercase letters in sorting. However, if you want to change the result, you can use the string lower() method:

ww2_leaders.sort(key=lambda leader: (leader[1].lower(), leader[0]))

ww2_leaders will now contain:

[
    ('Winston', 'Churchill'),
    ('Charles', 'de Gaulle'),
    ('Adolph', 'Hitler'),
    ('Benito', 'Mussolini'),
    ('Franklin', 'Roosevelt'),
    ('Teddy', 'Roosevelt'),
    ('Joseph', 'Stalin'),
    ('Hideki', 'Tojo')
]

Sorting Sequences of Dictionaries

Class Files Examples
Examples from this section are in:
• advanced-python-concepts/Demos/sorting_sequences_of_dictionaries.py

You may often find data stored as lists of dictionaries similar to the one created below:


from datetime import date
ww2_leaders = []
ww2_leaders.append({'fname':'Winston', 'lname':'Churchill',
                    'dob':date(1874,11,30)})
ww2_leaders.append({'fname':'Charles', 'lname':'de Gaulle',
                    'dob':date(1890,11,22)})
ww2_leaders.append({'fname':'Adolph', 'lname':'Hitler',
                    'dob':date(1889,4,20)})
ww2_leaders.append({'fname':'Benito', 'lname':'Mussolini',
                    'dob':date(1883,7,29)})
ww2_leaders.append({'fname':'Franklin', 'lname':'Roosevelt',
                    'dob':date(1882,1,30)})
ww2_leaders.append({'fname':'Joseph', 'lname':'Stalin',
                    'dob':date(1878,12,18)})
ww2_leaders.append({'fname':'Hideki', 'lname':'Tojo',
                    'dob':date(1884,12,30)})

This data can be sorted using a lambda function similar to how we sorted lists of tuples:

ww2_leaders.sort(key=lambda leader: leader['dob'])

You can use this same technique to sort by a tuple:

ww2_leaders.sort(key=lambda leader: (leader['lname'], leader['fname']) )

While the above method works fine, the operator module provides an itemgetter() function that performs this same task a bit faster. It works like this:

from operator import itemgetter
ww2_leaders.sort(key=itemgetter('lname','fname'))
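To see what itemgetter() actually hands to sort(), it can help to call the resulting function on a single dictionary. The sketch below uses a small, made-up leaders list (with only the name keys) rather than the full ww2_leaders data; get_names, key_for_first, and ordered are hypothetical names.

```python
from operator import itemgetter

leaders = [
    {'fname': 'Teddy', 'lname': 'Roosevelt'},
    {'fname': 'Winston', 'lname': 'Churchill'},
    {'fname': 'Franklin', 'lname': 'Roosevelt'},
]

# With several keys, itemgetter() returns a callable that produces a tuple,
# just like the lambda that returned (leader['lname'], leader['fname']).
get_names = itemgetter('lname', 'fname')
key_for_first = get_names(leaders[0])
print(key_for_first)  # ('Roosevelt', 'Teddy')

leaders.sort(key=get_names)
ordered = [leader['fname'] for leader in leaders]
print(ordered)  # ['Winston', 'Franklin', 'Teddy']
```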


1.7 Unpacking Sequences in Function Calls

Class File Examples
Examples from this section are in:
• advanced-python-concepts/Demos/unpacking_function_arguments.py

Sometimes you'll have a sequence that contains the exact arguments a function needs, but the function expects separate arguments rather than a sequence. To illustrate, consider the following function:

import math
def distance_from_origin(a, b):
    return math.sqrt(a**2 + b**2)

The function expects two arguments, a and b, which are the x, y coordinates of a point. It uses the Pythagorean theorem to determine the distance the point is from the origin. We can call the function like this:

c = distance_from_origin(3, 4)
print(c)

But it would be nice to be able to call the function like this too:

point = (3,4)
c = distance_from_origin(point)
print(c)

However, that will cause an error because the function expects two arguments and we're only passing in one.

One solution would be to pass the individual elements of our point:

point = (3,4)
c = distance_from_origin(point[0], point[1])
print(c)

But Python provides an even easier solution. We can use an asterisk in the function call to unpack the sequence into separate elements:


point = (3, 4)
c = distance_from_origin(*point)
print(c)

When you pass a sequence preceded by an asterisk into a function, the sequence gets unpacked, meaning that the function receives the individual elements rather than the sequence itself.
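Unpacking is especially handy when the arguments already live in a collection of sequences. The following sketch reuses distance_from_origin() from above; the points list and the distances and error_message names are made up for illustration. It also shows that the sequence must contain exactly as many elements as the function has parameters.

```python
import math

def distance_from_origin(a, b):
    return math.sqrt(a**2 + b**2)

points = [(3, 4), (5, 12), (8, 15)]

# Each tuple is unpacked into the a and b parameters.
distances = [distance_from_origin(*point) for point in points]
print(distances)  # [5.0, 13.0, 17.0]

# A sequence of the wrong length fails just like passing too many arguments.
try:
    distance_from_origin(*(1, 2, 3))
except TypeError as err:
    error_message = str(err)
    print(error_message)
```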


Exercise 7 Converting a String to a datetime.date Object
10 to 20 minutes

In this exercise, you will convert a string representing a date to a datetime.date object.
1. Open advanced-python-concepts/Exercises/converting_date_string_to_datetime.py in your editor.
2. The imported datetime module includes a date class that can create a date object from three passed-in arguments: year, month, and day. For example:

datetime.date(1776, 7, 4)

3. Write the code for the str_to_date() function so that it...
   A. Splits the passed-in string into a list of date parts.
   B. Returns a date object created by passing the unpacked list of date parts to datetime.date().


Exercise Solution advanced-python-concepts/Solutions/converting_date_string_to_datetime.py

import datetime

def str_to_date(str_date):
    date_parts = [int(i) for i in str_date.split("-")]
    return datetime.date(*date_parts)

str_date = input('Input date as YYYY-MM-DD: ')
date = str_to_date(str_date)
print(date)


1.8 Modules and Packages

You have worked with different Python modules (e.g., random and math) and packages (e.g., collections). In general, it's not all that important to know whether a library you want to use is a module or a package, but there is a difference, and when you're creating your own, it's important to understand that difference.

Modules

A module is a single file. It can be made up of any number of functions and classes. You can import the whole module using:

import module_name

Or you can import specific functions or classes from the module using:

from module_name import class_or_function_name

For example, if you want to use the random() function from the random module, you can do so by importing the whole module or by importing just the random() function:
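The two import styles can be sketched as follows. Both are shown in one script only for comparison; the variable names whole_module_value and function_only_value are made up for this illustration.

```python
# Option 1: import the whole module, then qualify the function
# with the module name.
import random
whole_module_value = random.random()

# Option 2: import just the function and call it directly.
# Note: in this script, this rebinds the name "random" from the
# module to the function, so random.random() would no longer work.
from random import random
function_only_value = random()

print(whole_module_value, function_only_value)
```

In practice you would pick one style per script; mixing them, as above, makes the name random ambiguous to readers.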


Every .py file is a module. When you build a module with the intention of making it available to other modules for importing, it is common to include a _test() function that runs tests when the module is run directly. For example, if you run random.py, which is in the Lib directory of your Python home, the output will look something like this:


    2000 times random
    0.003 sec, avg 0.500716, stddev 0.285239, min 0.000495333, max 0.99917
    2000 times normalvariate
    0.004 sec, avg 0.0061499, stddev 0.971102, min -2.86188, max 3.02266
    2000 times lognormvariate
    0.004 sec, avg 1.64752, stddev 2.12612, min 0.0310675, max 28.5174
    ...

Open random.py in an editor and you will see it ends with this code:

    if __name__ == '__main__':
        _test()

The __name__ variable of any module that is imported holds that module's name. For example, if you import random and then print random.__name__, it will output "random". However, if you open random.py, add a line that reads print(__name__), and run it, it will print "__main__". So, the if condition in the code above just checks to see if the file has been imported. If it hasn't (i.e., if it's running directly), then it will call the _test() function.

If you do not want to write tests, you could include code like this:

    if __name__ == '__main__':
        print('''This module is for importing and is not meant to be
    run directly.''')

Packages

A package is a group of files (and possibly subfolders) stored in a directory that includes a file named __init__.py. The __init__.py file does not need to contain any code. For example, the __init__.py file within Lib/idlelib (the package for the IDLE editor) looks like this:

    # Dummy file to make this a package.

However, you can include code in the __init__.py file that will initialize the package. You can also (but do not have to) set a global __all__ variable, which should contain a list of the module names to be imported when a file imports your package using from package_name import *. If you do not set the __all__ variable, then that form of import only brings in the names defined in __init__.py itself; it does not automatically import the package's modules, which may be just fine.
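To see __all__ in action, the following sketch builds a throwaway package in a temporary directory and then runs from demo_pkg import * against it. The package and module names (demo_pkg, greetings, hidden) are hypothetical and exist only for this illustration.

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package on disk (all names here are hypothetical).
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, 'demo_pkg')
os.mkdir(pkg_dir)

with open(os.path.join(pkg_dir, '__init__.py'), 'w') as f:
    f.write("__all__ = ['greetings']\n")   # only this module is exported
with open(os.path.join(pkg_dir, 'greetings.py'), 'w') as f:
    f.write("def hello():\n    return 'hello'\n")
with open(os.path.join(pkg_dir, 'hidden.py'), 'w') as f:
    f.write("def secret():\n    return 'secret'\n")

sys.path.insert(0, root)        # make the package importable
importlib.invalidate_caches()

# Run the star import in its own namespace so we can inspect the result.
namespace = {}
exec('from demo_pkg import *', namespace)
sys.path.remove(root)

imported = sorted(name for name in namespace if not name.startswith('__'))
print(imported)
```

Only greetings lands in the namespace, because it is the only module listed in __all__. The hidden module is still available through an explicit import demo_pkg.hidden.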


Search Path for Modules and Packages

The Python interpreter must locate the imported modules. When import is used within a script, the interpreter searches for the imported module in the following places sequentially:

1. The current directory (same directory as script doing the importing).
2. The library of standard modules.
3. The paths defined in sys.path.[2]

As you see, the steps involved in creating modules and packages for import are relatively straightforward. However, designing useful and easy-to-use modules and packages takes a lot of planning and thought.

1.9 Conclusion

In this lesson, you have learned several advanced techniques with sequences. You have also learned to do mapping and filtering, and to use lambda functions. Finally, you have learned how to create modules and packages.


2. sys.path contains a list of strings specifying the search path for modules. The list is OS-dependent. To see your list, run the following code at your Python prompt:

    import sys
    sys.path


2. Working with Data

In this lesson, you will learn...

1. To access and work with data stored in a relational database.
2. To access and work with data stored in a CSV file.
3. To get data from a web page.
4. To access and work with data stored as HTML and XML.
5. To access an API.
6. To access and work with data stored as JSON.

Data is stored in many different places and in many different ways. There are Python modules for all of the most common ways.

2.1 Relational Databases

Class Files Examples

The database we will use for our examples is Lahman's Baseball Database[3], which includes a huge amount of baseball data from 1871 to the present.

Python is able to connect to all the commonly used databases, including PostgreSQL[4], MySQL[5], Microsoft SQL Server[6], and Oracle[7]. In this course, we will show a couple of different options for working with MySQL, but generally implementations for working with different relational databases follow PEP 0249 -- Python Database API Specification v2.0[8], which is described below.

PEP[9] 0249 -- Python Database API Specification v2.0

PEP 0249 defines an API for Python interfaces that work with databases. Generally, you follow these steps to pull data from a database (code samples will use the MySQL for Python[10] interface):

3. See http://www.seanlahman.com/baseball-archive/statistics/.
4. See https://wiki.python.org/moin/PostgreSQL.
5. See https://wiki.python.org/moin/MySQL.
6. See https://wiki.python.org/moin/SQL%20Server.
7. See https://wiki.python.org/moin/Oracle.
8. See https://www.python.org/dev/peps/pep-0249/.
9. PEP stands for Python Enhancement Proposal.
10. See http://sourceforge.net/projects/mysql-python/.


If you do not have MySQL for Python installed, you can install it at the command line by running:

    pip install mysql-connector-python

1. Import a Python Database API-2.0-compliant interface.

    import mysql.connector

2. Open a connection to the database.

    connection = mysql.connector.connect(host='host_name',
                                         user='user_name',
                                         password='password',
                                         database='database_name')

3. Write your query. For example:

    query = '''SELECT nameFirst, nameLast, weight, year(debut)
        FROM Master
        ORDER BY weight DESC
        LIMIT 5'''

4. Create a cursor for the connection.

    cursor = connection.cursor()

5. Use the cursor to execute one or more queries.

    cursor.execute(query)

6. Get the results of the query/queries from the cursor.

    results = cursor.fetchall()


7. Close the cursor.

    cursor.close()

8. Close the connection to the database.

    connection.close()

The results are returned as a Python list. Each record in the results is generally returned as a sequence. The type of sequence depends on the cursor type.

Here is the complete code using the Master table of Lahman's Baseball Database, which we have hosted at MySQLC11.newtekwebhosting.com. The username is "student" and the password is "webuc8".

Code Sample working-with-data/Demos/mysql_for_python.py

    # You may need to install via pip install mysql-connector-python
    import mysql.connector

    connection = mysql.connector.connect(
        host='MySQLC11.newtekwebhosting.com',
        user='student',
        password='webuc8',
        database='baseball')

    query = '''SELECT nameFirst, nameLast, weight, year(debut)
        FROM Master
        ORDER BY weight DESC
        LIMIT 5'''

    cursor = connection.cursor() # pass dictionary=True to return
                                 # records as dicts
    cursor.execute(query)
    results = cursor.fetchall()
    cursor.close()
    connection.close()

    print(results)

This will output:


    [('Walter', 'Young', 320, 2005),
     ('Jumbo', 'Diaz', 315, 2014),
     ('Dmitri', 'Young', 295, 1996),
     ('Jonathan', 'Broxton', 295, 2005),
     ('Jumbo', 'Brown', 295, 1925)]

The table below shows the most common cursor methods:

Cursor Methods

Method: cursor.execute(operation [, parameters])
Description: Prepares and executes a database query or command.
Returns: Depends on implementation.

Method: cursor.executemany(operation, seq_of_parameters)
Description: Prepares a database query or command and executes it once for each sequence or mapping in seq_of_parameters. This is usually used for INSERT and UPDATE statements and not for queries that return result sets.
Returns: Depends on implementation.

Method: cursor.fetchone()
Description: Fetches the next row of a result set.
Returns: A single row of data.

Method: cursor.fetchmany(n=cursor.arraysize)
Description: Fetches the next n rows of a result set. The cursor.arraysize can be set; it defaults to 1.
Returns: A list of data rows.

Method: cursor.fetchall()
Description: Fetches all data rows of a result set.
Returns: A list of data rows.

PyMySQL

In the example above, we used MySQL for Python. Another choice is PyMySQL, which also conforms to PEP 0249. All we have to do is replace "mysql.connector" with "pymysql" and the code will work in exactly the same way:


    import pymysql
    connection = pymysql.connect(host='MySQLC11.newtekwebhosting.com',
                                 user='student',
                                 password='webuc8',
                                 database='baseball')

    query = '''SELECT nameFirst, nameLast, weight, year(debut)
        FROM Master
        ORDER BY weight DESC
        LIMIT 5'''

    cursor = connection.cursor()
    cursor.execute(query)
    results = cursor.fetchall()
    cursor.close()
    connection.close()

    print(results)

However, cursors in PyMySQL are context managers, meaning that they support the with statement. So instead of explicitly closing the cursor, we can write the code like this:

Code Sample working-with-data/Demos/pymysql_connect.py

     1. import pymysql
     2. connection = pymysql.connect(host='MySQLC11.newtekwebhosting.com',
     3.                              user='student',
     4.                              password='webuc8',
     5.                              database='baseball'
     6.                              )
     7.
     8. query = '''SELECT nameFirst, nameLast, weight, year(debut)
     9.     FROM Master
    10.     ORDER BY weight DESC
    11.     LIMIT 5'''
    12.
    13. with connection.cursor() as cursor:
    14.     cursor.execute(query)
    15.     results = cursor.fetchall()
    16. connection.close()
    17.
    18. print(results)


Code Explanation

Note the use of with in relation to the cursor on lines 13 through 15. The cursor is closed automatically when the with block ends, so only the connection has to be closed explicitly (line 16).

If you do not have PyMySQL installed, you can install it at the command line by running:

    pip install pymysql

Returning Dictionaries instead of Tuples

According to PEP 0249, the cursor fetch methods must return sequences or sequences of sequences. Both MySQL for Python and PyMySQL return a tuple (for fetchone()) and a list of tuples (for fetchmany() and fetchall()). But they both allow you to change the cursor type or class, so that the results are returned as dictionaries instead of tuples. They do this in different ways. In MySQL for Python, you pass dictionary=True to the cursor() method:

cursor = connection.cursor(dictionary=True)

In PyMySQL, you pass cursorclass=pymysql.cursors.DictCursor to pymysql.connect():

    connection = pymysql.connect(host='MySQLC11.newtekwebhosting.com',
                                 user='student',
                                 password='webuc8',
                                 database='baseball',
                                 cursorclass=pymysql.cursors.DictCursor
                                 )

For the query getting the five heaviest baseball players of all time, this change would make the result look like this:

    [{'nameFirst': 'Walter',
      'nameLast': 'Young',
      'weight': 320,
      'year(debut)': 2005},
     {'nameFirst': 'Jumbo',
      'nameLast': 'Diaz',
      'weight': 315,
      'year(debut)': 2014},
     {'nameFirst': 'Dmitri',
      'nameLast': 'Young',
      'weight': 295,
      'year(debut)': 1996},
     {'nameFirst': 'Jonathan',
      'nameLast': 'Broxton',
      'weight': 295,
      'year(debut)': 2005},
     {'nameFirst': 'Jumbo',
      'nameLast': 'Brown',
      'weight': 295,
      'year(debut)': 1925}]

Whether you use MySQL for Python or PyMySQL or some other interface for MySQL or a different database is largely a matter of personal preference. All should work pretty much the same.
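Python's built-in sqlite3 module, covered next, takes yet another approach: you set the connection's row_factory to sqlite3.Row. The sketch below is a minimal illustration that assumes an in-memory database and a made-up one-row players table, so it runs without a server.

```python
import sqlite3

connection = sqlite3.connect(':memory:')
connection.row_factory = sqlite3.Row  # rows now support access by column name

cursor = connection.cursor()
# Hypothetical table holding a single record, for illustration only
cursor.execute('CREATE TABLE players (nameFirst TEXT, weight INTEGER)')
cursor.execute("INSERT INTO players VALUES ('Walter', 320)")

cursor.execute('SELECT nameFirst, weight FROM players')
row = cursor.fetchone()
by_name = row['weight']   # access a column by name
record = dict(row)        # convert to a plain dictionary if needed

cursor.close()
connection.close()

print(record)
```

A sqlite3.Row still behaves like a tuple (row[0] works too), so switching the row factory does not break code that uses positional access.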

sqlite3[11]

SQLite[12] is a server-less SQL database engine. Each SQLite database is stored in a single file that can easily be transported between computers. While SQLite is not as robust as enterprise relational database management systems, it works great for local databases or databases that don't have large loads. Python's sqlite3 module conforms to the Python Database API Specification v2.0.

Jeff Knecht maintains a SQLite version of Lahman's Baseball Database[13], which is included in the class files: working-with-data/lahman2016.sqlite.
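Because sqlite3 follows the same specification, the connect, cursor, execute, fetch, and close steps shown earlier carry over unchanged. The sketch below is a minimal illustration, assuming an in-memory database and a hypothetical players table with made-up rows; it also shows how fetchone(), fetchmany(), and fetchall() each pick up where the previous fetch left off.

```python
import sqlite3

# In-memory database with a small hypothetical table, so no file is needed
connection = sqlite3.connect(':memory:')
cursor = connection.cursor()
cursor.execute('CREATE TABLE players (name TEXT, weight INTEGER)')
cursor.executemany('INSERT INTO players VALUES (?, ?)',
                   [('A', 200), ('B', 210), ('C', 220), ('D', 230)])

cursor.execute('SELECT name, weight FROM players ORDER BY weight DESC')
first = cursor.fetchone()        # a single row (a tuple)
next_two = cursor.fetchmany(2)   # a list holding the next two rows
rest = cursor.fetchall()         # a list holding whatever is left

cursor.close()
connection.close()

print(first, next_two, rest)
```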

11. See https://docs.python.org/3/library/sqlite3.html for documentation on sqlite3.
12. See https://www.sqlite.org.
13. See https://github.com/jknecht/baseball-archive-sqlite/blob/master/lahman2016.sqlite.


Exercise 8 Querying a SQLite Database
10 to 15 minutes

In this exercise, you will use your knowledge of the Python Database API Specification v2.0 to connect to and query the lahman2016.sqlite database.

1. Open working-with-data/Exercises/querying_a_sqlite_database.py in your editor.
2. The connection to the SQLite database has already been made and the query has been written. Note that the debut field in the SQLite database is stored in milliseconds from the epoch. We use SQLite's strftime() function to convert that to a year.
3. Finish the code so that it runs the query and assigns the results to the results variable. Don't forget to close your cursor and connection.

*Challenge

For each record returned, print a sentence, such as "Walter Young weighed 320 pounds when he started his MLB career in 2005."


Exercise Solution working-with-data/Solutions/querying_a_sqlite_database.py

    import sqlite3
    connection = sqlite3.connect('../lahman2016.sqlite')

    query = '''SELECT nameFirst, nameLast, weight,
                strftime('%Y', debut / 1000, 'unixepoch')
            FROM Master
            ORDER BY weight DESC
            LIMIT 5'''

    cursor = connection.cursor()
    cursor.execute(query)
    results = cursor.fetchall()
    cursor.close()
    connection.close()

    print(results)

    # Challenge: Make output pretty

    for record in results:
        print('''{} {} weighed {} pounds when he started
    his MLB career in {}.'''
              .format(record[0],
                      record[1],
                      record[2],
                      record[3]))


Passing Parameters

A placeholder for a parameter in a query is marked with a question mark (?) like this:

    query = '''SELECT yearID, HR
        FROM Batting
        WHERE playerID IN (SELECT playerID
                           FROM Master
                           WHERE nameFirst = ? AND nameLast = ?)
        ORDER BY yearID'''

You pass in parameters as the second argument of the cursor's execute() method as a sequence (usually a tuple), like this:

    player = ('Babe', 'Ruth')
    cursor.execute(query, player)

Code Sample working-with-data/Demos/sqlite3_passing_parameters.py

    import sqlite3
    connection = sqlite3.connect('../lahman2016.sqlite')

    query = '''SELECT yearID, HR
        FROM batting
        WHERE playerID IN (SELECT playerID
                           FROM master
                           WHERE nameFirst = ? AND nameLast = ?)
        ORDER BY yearID'''

    cursor = connection.cursor()
    player = ('Babe', 'Ruth')
    cursor.execute(query, player)
    results = cursor.fetchall()
    cursor.close()
    connection.close()

    print(results)
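The sqlite3 module also accepts named placeholders, written as :name in the query and supplied as a dictionary. The sketch below is a minimal illustration that assumes an in-memory database and a hypothetical, cut-down master table with a made-up playerID, so that it runs without the class files.

```python
import sqlite3

connection = sqlite3.connect(':memory:')
cursor = connection.cursor()

# Hypothetical miniature of the Master table, for illustration only
cursor.execute('''CREATE TABLE master (
                      playerID TEXT, nameFirst TEXT, nameLast TEXT)''')
cursor.execute("INSERT INTO master VALUES ('player01', 'Babe', 'Ruth')")

# Named placeholders are matched to dictionary keys, not to positions
query = '''SELECT playerID
           FROM master
           WHERE nameFirst = :first AND nameLast = :last'''
cursor.execute(query, {'first': 'Babe', 'last': 'Ruth'})
results = cursor.fetchall()

cursor.close()
connection.close()

print(results)
```

Named placeholders are easier to read when a query has many parameters, or when the same value is used more than once. Other database interfaces may use a different placeholder style, so check the interface's documentation.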

SQLite Database in Memory

Python allows you to create in-memory databases with SQLite. This can be useful when you have a lot of data in a tab-delimited file that you want to query using SQL, but don't want to maintain as a database file as well. To create a connection to an in-memory database, use the following code:

    connection = sqlite3.connect(':memory:')

For the rest of the database section, we will continue to work with an in-memory SQLite database, but the concepts apply to all databases.

You then create your tables with CREATE TABLE statements and populate them with INSERT statements. For example:


Code Sample working-with-data/Demos/sqlite3_in_memory_tables.py

    import sqlite3
    connection = sqlite3.connect(':memory:')
    cursor = connection.cursor()

    create = '''CREATE TABLE beatles (
        'fname' text,
        'lname' text,
        'nickname' text
    )'''

    cursor.execute(create)

    members = [
        ('John', 'Lennon', 'The Smart One'),
        ('Paul', 'McCartney', 'The Cute One'),
        ('George', 'Harrison', 'The Funny One'),
        ('Ringo', 'Starr', 'The Quiet One')
    ]

    for member in members:
        cursor.execute("INSERT INTO beatles VALUES" + str(member))

    select = 'SELECT * FROM beatles'
    cursor.execute(select)

    results = cursor.fetchall()
    cursor.close()
    connection.close()

    print(results)

Executing Multiple Queries at Once

You might have noticed that the way we handled the inserts above is a bit of a hack. It takes advantage of the fact that the string representation of a Python tuple happens to be the same string format we need for our insert. If the data had been stored in lists, that would not have worked. Instead, we would have had to write something like this:


    for member in members:
        insert = """INSERT INTO beatles
                    VALUES ('{}','{}','{}')""".format(member[0],
                                                      member[1],
                                                      member[2])
        cursor.execute(insert)

And that is just plain ugly. Luckily, Python gives us a more Pythonic way of doing this with the executemany() method, which takes two arguments:

1. The query to run, which usually includes some question marks to replace with passed-in values.
2. A sequence of sequences, each of which contains the values with which to replace the question marks.

The query will run once for each sequence in the sequence of sequences. For example:

    members = [
        ('John', 'Lennon', 'The Smart One'),
        ('Paul', 'McCartney', 'The Cute One'),
        ('George', 'Harrison', 'The Funny One'),
        ('Ringo', 'Starr', 'The Quiet One')
    ]

    insert = 'INSERT INTO beatles VALUES (?,?,?)'

    cursor.executemany(insert, members)

Different databases may implement this in different ways; however, the Python code should work with any database as long as you're using a Python Database API-2.0-compliant interface.


Code Sample working-with-data/Demos/sqlite3_multiple_queries_at_once.py

    import sqlite3
    connection = sqlite3.connect(':memory:')
    cursor = connection.cursor()

    create = '''CREATE TABLE beatles (
        'fname' text,
        'lname' text,
        'nickname' text
    )'''
    cursor.execute(create)

    members = [
        ('John', 'Lennon', 'The Smart One'),
        ('Paul', 'McCartney', 'The Cute One'),
        ('George', 'Harrison', 'The Funny One'),
        ('Ringo', 'Starr', 'The Quiet One')
    ]

    insert = 'INSERT INTO beatles VALUES (?,?,?)'

    cursor.executemany(insert, members)

    select = 'SELECT * FROM beatles'
    cursor.execute(select)

    results = cursor.fetchall()
    cursor.close()
    connection.close()

    print(results)
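Placeholders are also more robust than building SQL from a tuple's string representation. The sketch below is one illustration, using a hypothetical row whose nickname is missing: Python's None has no meaning in SQL text, so the string-building approach fails, while executemany() stores the value as NULL.

```python
import sqlite3

connection = sqlite3.connect(':memory:')
cursor = connection.cursor()
cursor.execute('CREATE TABLE beatles (fname text, lname text, nickname text)')

member = ('Pete', 'Best', None)  # hypothetical row with a missing nickname

try:
    # str(member) is "('Pete', 'Best', None)", and None is not valid SQL
    cursor.execute('INSERT INTO beatles VALUES' + str(member))
    hack_worked = True
except sqlite3.OperationalError:
    hack_worked = False

# With placeholders, None is passed through as a SQL NULL
cursor.executemany('INSERT INTO beatles VALUES (?,?,?)', [member])
cursor.execute('SELECT * FROM beatles')
results = cursor.fetchall()

cursor.close()
connection.close()

print(hack_worked, results)
```

Placeholders also take care of quoting, which matters as soon as any of the values come from user input.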


Exercise 9 Inserting File Data into a Database
20 to 30 minutes

In this exercise, you will use the data from a text file to populate a database table. Open working-with-data/Exercises/states.txt, which has 52 lines of data[14]. Each line contains three pieces of data[15] separated by tabs:

1. State Name
2. Population in 2014
3. Population in 2000

For example, the line for California reads:

    California    38,802,500    33,871,648

1. Open working-with-data/Exercises/inserting_file_data_into_a_database.py in your editor.
2. A connection to an in-memory database has already been established and the SQL statements for creating the table, inserting the records, and selecting the data have been written and stored in variables.
3. Your job is to:
   A. Create a cursor.
   B. Run the CREATE statement.
   C. Get the data from the states.txt file into a list of 52 three-element tuples. Note that you will have to remove the commas from the population numbers so that they are valid integers.
   D. Insert the rows using the list of tuples (i.e., the sequence of sequences) you just created.
   E. Run the SELECT statement. This returns two columns: the state and the projected population in 2028.
   F. Fetch the results and output a sentence for each row (e.g., "The projected population of California in 2028 is 44,451,158.").
   G. Don't forget to close your cursor and connection.

14. It includes Washington, D.C. and Puerto Rico.
15. The state population data is from https://en.wikipedia.org/wiki/List_of_U.S._states_and_territories_by_population.


Exercise Solution working-with-data/Solutions/inserting_file_data_into_a_database.py

    import sqlite3
    connection = sqlite3.connect(':memory:')
    cursor = connection.cursor()

    create = '''CREATE TABLE states (
        'state' text,
        'pop2014' integer,
        'pop2000' integer
    )'''

    cursor.execute(create)

    insert = 'INSERT INTO states VALUES (?,?,?)'

    data = []
    with open('states.txt') as f:
        for line in f.readlines():
            state_data = line.split('\t')
            tpl_state_data = (state_data[0],
                              int(state_data[1].replace(',','')),
                              int(state_data[2].replace(',','')))

            data.append(tpl_state_data)

    cursor.executemany(insert, data)

    select = '''SELECT state,
        CAST( (pop2014*1.0/pop2000) * pop2014 AS INTEGER) pop2028
        FROM states ORDER BY pop2028 DESC'''
    cursor.execute(select)

    results = cursor.fetchall()
    cursor.close()
    connection.close()

    for record in results:
        print('''The projected population of {} in 2028 is {:,}.'''
              .format(record[0], record[1]))

2.2 CSV

Class Files Examples

Examples from this section are in working-with-data/Demos/csv_examples.py.

CSV (for "Comma Separated Values") is a format commonly used for sharing data between applications, in particular, database and spreadsheet applications. Because the format had been around for a while before any attempt was made at standardization, not all CSV files use exactly the same format. Luckily, Python's csv module does a good job of handling and hiding these differences so the programmer generally doesn't have to worry about them.

Microsoft Excel is perhaps the most common application used for making CSV files. Here is a sample CSV file (working-with-data/csvs/us-population-2010-2014.csv[16]) in Microsoft Excel showing the United States population breakdown over several years:

[Screenshot: the CSV file open in Microsoft Excel]

And here is the same file opened in a text editor:

[Screenshot: the CSV file open in a text editor]

Python's csv module is used for:

1. Reading from a CSV file.
2. Creating a new CSV file.
3. Writing to an existing CSV file.

16. Source: Vintage 2014 National Population Datasets. https://www.census.gov/popest/data/national/asrh/2014/files/NC-EST2014-AGESEX-RES.csv.


Reading from a CSV File

To read data from a CSV file:

1. Open the file using the built-in open() function with newline set to an empty string.
2. Pass the file object to the csv.reader() function.
3. Read the file row by row. Each row is a list of strings.

    import csv

    with open('../csvs/us-population-2010-2014.csv', newline='') as csvfile:
        pops = csv.reader(csvfile)
        for i, row in enumerate(pops, 1):
            print(', '.join(row))
            if i >= 5:
                break

This will output the first five rows of the file, each printed as a comma-separated line.
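To experiment with csv.reader() without the class files, you can feed it an in-memory file object instead. The sketch below is a minimal illustration; the two-line sample string and its population figure are made up, and io.StringIO stands in for the opened file.

```python
import csv
import io

# Hypothetical sample data standing in for a CSV file on disk.
# The quoted field contains commas, which the reader handles for us.
sample = 'SEX,AGE,POPESTIMATE2010\r\nF,30,"2,100,000"\r\n'

with io.StringIO(sample, newline='') as csvfile:
    rows = list(csv.reader(csvfile))

for row in rows:
    print(row)
```

Each row comes back as a list of strings, and the quoted population value is kept together as the single string '2,100,000' rather than being split at its commas.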

DictReader

When retrieving rows using the reader() method, it's possible to manipulate them item by item, but you have to know the positions of the different fields. For a CSV with a lot of columns, that can be pretty difficult. You will likely find it easier to use a DictReader, which gives you access to the fields by key. For example, in the population CSV, the first column holds the sex of the population for that row: A for Both, M for Male, and F for Female. Below we create a sexes dict to map those keys and values and use that dictionary to output something more meaningful in the report:


    sexes = {'A':'Both', 'M':'Male', 'F':'Female'}
    print('Using DictReader(): ')
    with open('../csvs/us-population-2010-2014.csv', newline='') as csvfile:
        pops = csv.DictReader(csvfile)

        header = ','.join(pops.fieldnames)
        print(header)

        print('-' * len(header))
        for row in pops:
            sex = sexes[row['SEX']]
            print(sex,
                  row['AGE'],
                  row['POPESTIMATE2010'],
                  row['POPESTIMATE2011'],
                  row['POPESTIMATE2012'],
                  row['POPESTIMATE2013'],
                  row['POPESTIMATE2014'])

Notice that the DictReader object has a fieldnames attribute that by default contains a list holding the keys taken from the first row of data. If the CSV doesn't have a header row, you can pass fieldnames in when creating the DictReader:

    fieldnames = ['SEX', 'AGE',
                  'POPESTIMATE2010',
                  'POPESTIMATE2011',
                  'POPESTIMATE2012',
                  'POPESTIMATE2013',
                  'POPESTIMATE2014']
    pops = csv.DictReader(csvfile, fieldnames)
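The effect of passing fieldnames is easiest to see on a file with no header row. The sketch below is a minimal illustration, assuming a made-up two-line sample held in an io.StringIO object rather than a real file.

```python
import csv
import io

# Hypothetical headerless data: every line is a data row
sample = 'F,30,2100000\r\nM,30,2110000\r\n'
fieldnames = ['SEX', 'AGE', 'POPESTIMATE2010']

reader = csv.DictReader(io.StringIO(sample, newline=''), fieldnames=fieldnames)
records = [dict(row) for row in reader]

for record in records:
    print(record)
```

Because the field names are supplied, the first line is treated as data rather than as a header. If you pass fieldnames for a file that does have a header row, that header comes back as the first record.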

Finding Data in a CSV File

You may want to do more with CSV data than just read it in and output it. For example, using the population CSV, you might want to find the population for a specific age group and sex and in a specific year (e.g., 30-year-old females in 2011). As a DictReader object is iterable, you can loop through it until you find the row you're looking for:


    with open('../csvs/us-population-2010-2014.csv', newline='') as csvfile:
        pops = csv.DictReader(csvfile)

        for row in pops:
            if (row['AGE'] == '30' and row['SEX'] == 'F'):
                population = row['POPESTIMATE2011']
                break
        else:
            population = None
        print(population)

Here's a reusable function that does the same thing:

    def find_pop(pops, age, sex, year):
        for row in pops:
            if (row['AGE'] == str(age) and row['SEX'] == sex):
                return row['POPESTIMATE' + str(year)]
        return None

pop = find_pop(pops, 30, 'F', 2011)

If you call the find_pop() function twice in a row on the same DictReader, the second call may return None. That's because the file cursor has already passed the line you are looking for. There are a couple of options for dealing with this:

1. We could use csvfile.seek(0) to return the cursor to the beginning of the file before each call to the function.
2. We could create a list from pops using list(pops) and pass that list into our function.

Both solutions are shown in working-with-data/Demos/csv_examples.py.
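The behavior and both workarounds can be reproduced with a small in-memory file. The sketch below is an illustration only, not the contents of csv_examples.py; the three-line sample and its population figures are made up.

```python
import csv
import io

# Hypothetical sample standing in for the population CSV
sample = 'SEX,AGE,POPESTIMATE2011\r\nF,30,2100000\r\nM,30,2110000\r\n'

def find_pop(pops, age, sex, year):
    for row in pops:
        if row['AGE'] == str(age) and row['SEX'] == sex:
            return row['POPESTIMATE' + str(year)]
    return None

csvfile = io.StringIO(sample, newline='')
pops = csv.DictReader(csvfile)

first_call = find_pop(pops, 30, 'M', 2011)   # match is on the last row
second_call = find_pop(pops, 30, 'F', 2011)  # reader is used up: None

# Option 1: rewind the underlying file before searching again
csvfile.seek(0)
third_call = find_pop(pops, 30, 'F', 2011)

# Option 2: read everything into a list once, then search the list freely
csvfile.seek(0)
pop_list = list(csv.DictReader(csvfile))
from_list = (find_pop(pop_list, 30, 'F', 2011),
             find_pop(pop_list, 30, 'M', 2011))

print(first_call, second_call, third_call, from_list)
```

Rewinding keeps memory use low but rereads the file on every search (and passes over the header line again as if it were data). The list approach reads the file once and suits files that fit comfortably in memory.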


Exercise 10 Comparing Data in a CSV File
15 to 20 minutes

In this exercise, you will make use of a helper function for comparing data in a CSV file.

1. Open ../csvs/us-population-2010-2014.csv and study it. You will use this file to find the difference in population between males and females in 2011.
2. Open working-with-data/Exercises/compare_populations_csv.py in your editor.
3. Study the compare_pops() function. It calculates the difference between two populations and the ratio of the second population to the first.
4. Create two age_sex_year tuples, one for 30-year-old females in 2011 and the other for 30-year-old males in 2011.
5. Call the compare_pops() function, passing in the appropriate data.
6. Print the result. It should output:

    (10353, 1.004910053492218)


Exercise Solution working-with-data/Solutions/compare_populations_csv.py

    import csv

    def compare_pops(pops, age_sex_year1, age_sex_year2):
        '''Finds the populations (pop1 and pop2) for the two
        passed-in age, sex, and year tuples.

        Returns a two-item tuple containing:
        - the numeric difference in population (pop2 - pop1)
        - the ratio (pop2 / pop1)

        Keyword arguments:
        pops -- a sequence holding dictionaries
        age_sex_year1 -- a tuple holding age, sex, and year values
        age_sex_year2 -- a tuple holding age, sex, and year values'''
        pop1, pop2 = -1, -1
        for row in pops:
            if (row['AGE'] == str(age_sex_year1[0])
                    and row['SEX'] == age_sex_year1[1]):
                pop1 = row['POPESTIMATE' + str(age_sex_year1[2])]
                pop1 = int(pop1.replace(',',''))
            if (row['AGE'] == str(age_sex_year2[0])
                    and row['SEX'] == age_sex_year2[1]):
                pop2 = row['POPESTIMATE' + str(age_sex_year2[2])]
                pop2 = int(pop2.replace(',',''))
            if pop1 > 0 and pop2 > 0:
                return (pop2 - pop1, pop2/pop1)

    with open('../csvs/us-population-2010-2014.csv', newline='') as csvfile:
        pops = list(csv.DictReader(csvfile))

    pop1 = (30, 'F', 2011)
    pop2 = (30, 'M', 2011)

    diff = compare_pops(pops, pop1, pop2)
    print(diff)


Creating a New CSV File

To write data to a CSV file:

1. Open the file using the built-in open() function in writing mode and with newline set to an empty string.
2. Pass the file object to the csv.writer() method or to csv.DictWriter().
3. Write the file row by row with the writerow(sequence) method or all at once with the writerows(sequence_of_sequences) method.

The following example shows how you could write data retrieved from a database into a CSV file:

    import pymysql, csv
    connection = pymysql.connect(host='MySQLC11.newtekwebhosting.com',
                                 user='student',
                                 password='webuc8',
                                 database='baseball'
                                 )
    query = '''SELECT year(debut) year, avg(weight) weight
        FROM Master
        WHERE debut is NOT NULL
        GROUP BY year(debut)
        ORDER BY year(debut)'''

    with connection.cursor() as cursor:
        cursor.execute(query)
        results = cursor.fetchall()
    connection.close()

    print('Writing with writer(): ')

    with open('../csvs/mlb-weight-over-time.csv', 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(['Year', 'Weight'])
        writer.writerows(results)

DictWriter

And here we do the same thing using a DictWriter:


    import pymysql, csv
    connection = pymysql.connect(host='MySQLC11.newtekwebhosting.com',
                                 user='student',
                                 password='webuc8',
                                 database='baseball',
                                 cursorclass=pymysql.cursors.DictCursor
                                 )
    query = '''SELECT year(debut) year, avg(weight) weight
        FROM Master
        WHERE debut is NOT NULL
        GROUP BY year(debut)
        ORDER BY year(debut)'''

    with connection.cursor() as cursor:
        cursor.execute(query)
        results = cursor.fetchall()
    connection.close()

    print('Writing with DictWriter(): ')

    with open('../csvs/mlb-weight-over-time2.csv', 'w', newline='') as csvfile:
        fieldnames = results[0].keys()
        writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(results)

To add lines to an existing CSV file, just open it in append mode:

    with open('../csvs/mlb-weight-over-time.csv', 'a', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow([2015,200])
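If you want to try the writers without a database connection, you can write to an in-memory buffer and inspect the text that would have gone to the file. The sketch below is a minimal illustration; the year and weight values are made up, and io.StringIO stands in for the opened file.

```python
import csv
import io

# Hypothetical (year, weight) rows in place of the database results
rows = [(1871, 160.0), (1872, 161.5)]

buffer = io.StringIO(newline='')
writer = csv.writer(buffer)
writer.writerow(['Year', 'Weight'])   # header row
writer.writerows(rows)                # all data rows at once
tuple_output = buffer.getvalue()

# The same data as dictionaries, as a DictCursor would return it
dict_rows = [{'year': 1871, 'weight': 160.0},
             {'year': 1872, 'weight': 161.5}]

dict_buffer = io.StringIO(newline='')
dict_writer = csv.DictWriter(dict_buffer, fieldnames=['year', 'weight'])
dict_writer.writeheader()
dict_writer.writerows(dict_rows)
dict_output = dict_buffer.getvalue()

print(tuple_output)
print(dict_output)
```

With DictWriter, the fieldnames list controls both the header row and the column order, so the dictionaries' own key order does not matter.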

CSV Dialects

The default dialect used in the csv module is 'excel', which is the dialect (i.e., the format) used by Microsoft Excel. If you want to change this, you can pass in a different value for dialect to reader(), DictReader(), writer(), or DictWriter(). Other recognized options are 'excel-tab' and 'unix'.
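To see how the dialects differ, you can write the same row with each of them. The sketch below uses a made-up row and a small helper, write_with(), that exists only for this illustration:

```python
import csv
import io

# A made-up row; the first field contains a comma on purpose.
row = ['Chicago, IL', 2015, 200.5]

def write_with(dialect):
    """Return the text produced by writing `row` with the named dialect."""
    buffer = io.StringIO(newline='')
    csv.writer(buffer, dialect=dialect).writerow(row)
    return buffer.getvalue()

excel_text = write_with('excel')          # comma-delimited, quotes only when needed
excel_tab_text = write_with('excel-tab')  # tab-delimited
unix_text = write_with('unix')            # quotes every field, ends rows with a line feed

print(repr(excel_text))
print(repr(excel_tab_text))
print(repr(unix_text))
print(csv.list_dialects())  # names of the registered dialects
```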

Page 86 of 198 © Copyright 2019 Webucator. All rights reserved. Working with Data

Python 2 Difference

In Python 3, opening CSV files with newline = '' allows the csv module to determine line breaks for itself. If you do not specify newline = '' then newlines within quoted fields will be interpreted incorrectly.

In Python 2, the same issue is handled differently. Instead of using newline = '', you should open CSV files in binary mode (e.g., open('foo.csv', 'rb')).
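As a quick Python 3 check of the newline = '' advice, the following sketch writes a made-up row whose second field contains a line break and reads it back from a temporary file:

```python
import csv
import os
import tempfile

# A made-up row; the second field contains a line break.
original = ['PYT212', 'Advanced Python\nsecond line of the description']

path = os.path.join(tempfile.mkdtemp(), 'multiline.csv')

# Write with newline='' so the csv module controls the line endings.
with open(path, 'w', newline='') as csvfile:
    csv.writer(csvfile).writerow(original)

# Read with newline='' so the line break inside the quoted field is preserved.
with open(path, newline='') as csvfile:
    rows = list(csv.reader(csvfile))

print(len(rows))            # number of logical rows read back
print(rows[0] == original)  # whether the field survived the round trip
```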


Code Sample working-with-data/Demos/csv_examples.py

    # Reading with reader()
    import csv

    print('Using reader(): ')
    with open('../csvs/us-population-2010-2014.csv', newline='') as csvfile:
        pops = csv.reader(csvfile)
        for i, row in enumerate(pops, 1):
            print(', '.join(row))
            if i >= 5:
                break
    print('-'*70)

    # Reading with DictReader()

    sexes = {'A':'Both', 'M':'Male', 'F':'Female'}
    print('Using DictReader(): ')
    with open('../csvs/us-population-2010-2014.csv', newline='') as csvfile:
        pops = csv.DictReader(csvfile)

        header = ','.join(pops.fieldnames)
        print(header)

        print('-' * len(header))

        for row in pops:
            sex = sexes[row['SEX']]
            print(sex,
                  row['AGE'],
                  row['POPESTIMATE2010'],
                  row['POPESTIMATE2011'],
                  row['POPESTIMATE2012'],
                  row['POPESTIMATE2013'],
                  row['POPESTIMATE2014'])
    print('-'*70)

    # Finding data in a CSV file



    print('Finding data in a CSV file: ')
    with open('../csvs/us-population-2010-2014.csv', newline='') as csvfile:
        pops = csv.DictReader(csvfile)

        for row in pops:
            if (row['AGE'] == '30' and row['SEX'] == 'F'):
                population = row['POPESTIMATE2011']
                break
        else:
            population = None

    print(population)
    print('-'*70)

    # Call find_pop() multiple times

    def find_pop(pops, age, sex, year):
        for row in pops:
            if (row['AGE'] == str(age) and row['SEX'] == sex):
                return row['POPESTIMATE' + str(year)]
        return None
    print('Call find_pop() multiple times: ')
    with open('../csvs/us-population-2010-2014.csv', newline='') as csvfile:
        pops = csv.DictReader(csvfile)
        pop1 = find_pop(pops, 30, 'F', 2011)
        pop2 = find_pop(pops, 30, 'F', 2011)

    print(pop1, pop2)
    print('-'*70)

    # Call find_pop() multiple times using csvfile.seek(0)

    print('''Call find_pop() multiple times with repositioning
    of cursor using seek(0): ''')
    with open('../csvs/us-population-2010-2014.csv', newline='') as csvfile:
        pops = csv.DictReader(csvfile)
        pop1 = find_pop(pops, 30, 'F', 2011)
        csvfile.seek(0)
        pop2 = find_pop(pops, 30, 'F', 2011)



    print(pop1, pop2)
    print('-'*70)

    # Create a list from csv.DictReader()

    print('''Call find_pop() multiple times using a list
    from csv.DictReader: ''')

    with open('../csvs/us-population-2010-2014.csv', newline='') as csvfile:
        pops = list(csv.DictReader(csvfile))

    pop1 = find_pop(pops, 30, 'F', 2011)
    pop2 = find_pop(pops, 30, 'F', 2011)

    print(pop1, pop2)
    print('-'*70)

    # Writing with writer()

    import pymysql, csv
    connection = pymysql.connect(host='MySQLC11.newtekwebhosting.com',
                                 user='student',
                                 password='webuc8',
                                 database='baseball'
                                 )
    query = '''SELECT year(debut) year, avg(weight) weight
               FROM Master
               WHERE debut is NOT NULL
               GROUP BY year(debut)
               ORDER BY year(debut)'''

    with connection.cursor() as cursor:
        cursor.execute(query)
        results = cursor.fetchall()

    connection.close()

    print('Writing with writer(): ')

    with open('../csvs/mlb-weight-over-time.csv', 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)


        writer.writerow(['Year', 'Weight'])
        writer.writerows(results)

    print('-'*70)

2.3 Getting Data from the Web

Although Python has a built-in urllib module for making HTTP requests, it is thoroughly un-pythonic. Luckily, there is a Requests [17] package that makes it easy to access web pages with Python. In this section, you'll learn how the Requests package works and how to combine it with the Beautiful Soup [18] library to parse the HTML it retrieves. Both Requests and Beautiful Soup are included with Anaconda.

The Requests Package

Class Files Examples
Examples from this section are in working-with-data/Demos/http_requests.py.

Import the Requests package with import requests. Although all HTTP request methods (e.g., post, put, head, ...) can be used, in most cases you will use the get method via requests.get(), to which you pass the URL as a string, like this:

    import requests

    url = 'https://www.webucator.com/course-demos/python/courselist.cfm'
    r = requests.get(url)
    content = r.text
    print(content[:200])  # print first 200 characters

Custom Headers

A web server might choose to block any requests that don't identify the user agent. You can handle this by passing in headers that include a value for "user-agent", like this:

17. See http://www.python-requests.org.
18. See http://www.crummy.com/software/BeautifulSoup/bs4/doc/.


r = requests.get(url, headers={'user-agent': 'my-app/0.0.1'})

The text attribute of the response object returned by requests.get() will hold the full code of the HTML page delivered from the specified URL. In this case the URL is https://www.webucator.com/course-demos/python/courselist.cfm. When viewed in a browser, the page shows a table of course listings.


This page displays Webucator courses. Each course listing is contained within a <tr> tag. Each row contains the category group, the category name, the course name and the date the course was added to the course catalog. Each of these items is contained within a <td> tag.

Writing code to extract course data using string manipulation and regular expressions would be quite challenging. Beautiful Soup to the rescue!

Beautiful Soup

Beautiful Soup is a Python library for extracting data from HTML and XML. It is often used to find specific content on a web page, a process known as "scraping." The steps involved are:

1. Import Beautiful Soup:

    from bs4 import BeautifulSoup

2. From the content of a web page (often retrieved with requests.get()), create a BeautifulSoup object using the HTML parser:

    soup = BeautifulSoup(content, 'html.parser')

3. Use the find(), find_all(), or select() methods to find the tags you are looking for.
   A. find() returns a single bs4.element.Tag object, which we will just call a Tag object, or None if nothing matches.
   B. find_all() returns a bs4.element.ResultSet, which is essentially a list of Tag objects.
   C. select() returns a bs4.element.ResultSet, which is essentially a list of Tag objects.
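The three steps above can be tried without a network connection. The following sketch assumes a small, made-up HTML table that is only loosely modelled on the course list page; the rows and class names in it are illustrative, not a copy of the real page:

```python
from bs4 import BeautifulSoup

# Hypothetical markup, loosely modelled on the course list described above.
content = '''
<table>
  <tr><td class="CategoryGroup">Programming</td><td class="CourseName">Advanced Python Training</td></tr>
  <tr><td class="CategoryGroup">Programming</td><td class="CourseName">Introduction to Python Training</td></tr>
  <tr><td class="CategoryGroup">Adobe</td><td class="CourseName">Photoshop Training</td></tr>
</table>
'''

soup = BeautifulSoup(content, 'html.parser')

first_cell = soup.find('td')                                 # a single Tag: the first <td>
course_cells = soup.find_all('td', {'class': 'CourseName'})  # all <td> tags with that class
selected = soup.select('tr td.CourseName')                   # the same cells via a CSS selector

first_text = first_cell.text
course_names = [cell.text for cell in course_cells]
selected_names = [cell.text for cell in selected]

print(first_text)
print(course_names)
print(selected_names)
```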

find() and find_all()

The find() and find_all() methods provide a lot of options for finding tags. We'll use find_all() to illustrate, but find() works in the same way. The signature for find_all() is [19]:

    find_all(name, attrs, recursive, text, limit, **kwargs)

• name - The name of the tag to match. This can be a string, a list of options, or a compiled regular expression.
    • soup.find_all('a') - finds all <a> tags.

19. The current Anaconda version at the time of this writing includes Beautiful Soup version 4.3. In version 4.4, the text parameter's name was changed to string.


    • soup.find_all(['b','strong']) - finds all <b> and <strong> tags.
    • soup.find_all(re.compile('^t')) - finds all tags that begin with "t" (e.g., <table>, <tr>, <td>, etc.).
• attrs - The collection of attributes the tag should have, passed in as a dict.
    • soup.find_all('link', {'rel': True}) - finds all <link> tags with a rel attribute.
    • soup.find_all('td', {'class': 'CourseName'}) - finds all <td> tags with a class attribute value of CourseName.
• recursive - Boolean indicating whether a recursive search should be done. Defaults to True.
• text - Text to search for. This can be a string, a list of options, or a compiled regular expression.
    • The following finds all the phone numbers formatted as "(###) ###-####":

        pattern = re.compile(r'\(\d{3}\) \d{3}-\d{4}')
        soup.find_all(text = pattern)

• limit - The maximum number of matches to find.
• **kwargs - Keyword arguments. Any unrecognized argument will be treated as an attribute of the tag.
    • soup.find_all('a', href=True) - finds all <a> tags with an href attribute.
    • soup.find_all('a', href = re.compile('testimonials.cfm')) - finds all <a> tags with an href attribute that contains the string "testimonials.cfm".
    • soup.find_all('td', text = re.compile('Java ')) - finds all <td> tags that contain the text "Java ".
    • soup.find_all('td', text = re.compile('JavaS')) - finds all <td> tags that contain the text "JavaS" (e.g., JavaScript).
    • soup.find_all('td', text = re.compile('Java[ S]')) - finds all <td> tags that contain the text "Java " or "JavaS".

Note that these arguments can be used in any combination and none is required. For example, the following will find the first three <td> tags that contain a phone number.

    pattern = re.compile(r'\(\d{3}\) \d{3}-\d{4}')
    soup.find_all('td', text = pattern, limit=3)

select()

The select() method makes it possible to get elements based on CSS selectors. For example, the following code will get all the <a> tags within <li> tags within <ul> tags with the class name "nav":

    top_nav_links = soup.select('ul.nav li a')
    for link in top_nav_links:
        print(link.text.strip())

XML

Beautiful Soup makes it easy to parse XML documents as well. To do so, just pass in "xml" when you create your BeautifulSoup object:

    soup = BeautifulSoup(content, 'xml')

The Webucator courses page we have been working with has an XML version at https://www.webucator.com/course-demos/python/courselistxml.cfm. The XML for the page contains a list of course elements, a simplified version of which is shown below:

    <course id="PYT111">
      <category-group>Programming</category-group>
      <category>Python</category>
      <title>Introduction to Python Training</title>
      19-Feb-15
    </course>

Note that the course name is stored in the title tag. If we wanted to list all course titles in the XML file, here is the code we could run:

    soup = BeautifulSoup(content, 'xml')
    course_names = soup.find_all('title')
    print('All Courses using XML')
    for i, course_name in enumerate(course_names, 1):
        print(i, course_name.text)
    print('-'*70)
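The regular expressions passed to find_all() earlier in this section can be checked on their own with the re module before they are used for scraping. This is a small sketch with made-up sample strings (the phone number is fictitious):

```python
import re

phone = re.compile(r'\(\d{3}\) \d{3}-\d{4}')  # "(###) ###-####"
java = re.compile('Java[ S]')                 # "Java " or "JavaS"

# Made-up strings standing in for the text of <td> tags.
samples = ['Call (555) 555-0100 today', 'Java Programming',
           'JavaScript Basics', 'Java|Other', 'Python']

phone_hits = [s for s in samples if phone.search(s)]
java_hits = [s for s in samples if java.search(s)]

# Inside square brackets, | is an ordinary character rather than "or",
# so a pattern written as 'Java[ |S]' also matches "Java|".
pipe_hits = [s for s in samples if re.search('Java[ |S]', s)]

print(phone_hits)
print(java_hits)
print(pipe_hits)
```

Writing the phone pattern as a raw string (with the r prefix) keeps Python from treating the backslashes as string escapes.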


Exercise 11 Requests and Beautiful Soup
20 to 30 minutes

In this exercise, you will create a script that prompts the user for a search string that will be used to locate course names on the Webucator course list web page.

1. Open working-with-data/Exercises/requests_and_beautiful_soup.py.
2. Write code that prompts the user for a search string.
3. Use this search string with the find_all() method. Specify td for the tag name and a value of CourseName for the class attribute. In addition, use the text keyword argument to specify a regular expression that contains the search text the user provided.
4. Use the results to print out the courses that were found. If no results were returned (length of the collection is 0), print a not found message and then exit the script.

For example, if "Python" is provided by the user, then here are the results:

    1 Advanced Python Training
    2 Introduction to Python Training
    3 Python Data Analysis with NumPy and pandas

*Challenge
Using a Counter from the collections module, find the three most common category groups, i.e., td tags with a value of CategoryGroup for the class attribute.

*Challenge
If you have extra time, try scraping your own website.


Exercise Solution working-with-data/Solutions/requests_and_beautiful_soup.py

    import requests
    from bs4 import BeautifulSoup
    import re
    import sys

    url = 'https://www.webucator.com/course-demos/python/courselist.cfm'
    headers = {'user-agent': 'my-app/0.0.1'}

    course_search = input('Enter your search text (e.g., "Python") : ')
    r = requests.get(url, headers=headers)
    content = r.text

    soup = BeautifulSoup(content, 'html.parser')
    courses = soup.find_all('td', {'class': 'CourseName'},
                            text=re.compile(course_search))

    if len(courses) == 0:
        print('No courses were found')
        sys.exit()  # a bare "exit" is only a name and would not stop the script

    for i, course in enumerate(courses, 1):
        print(i, course.text.strip())
    print('-'*70)

    # Challenge:
    # List the 3 most common Category Group names

    from collections import Counter

    cat_groups = soup.find_all('td', {'class': 'CategoryGroup'})
    # Count the text of each cell rather than the Tag objects themselves
    c = Counter(group.text.strip() for group in cat_groups)
    print('The 3 most common Category Groups')
    print(c.most_common(3))
    print('-'*70)
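The Counter part of the challenge can be explored separately from the scraping. The sketch below assumes a hypothetical list of category group names, as they might be read from the text of the td tags:

```python
from collections import Counter

# Hypothetical category group names; the real page will have different data.
group_names = ['Programming', 'Adobe', 'Programming', 'Microsoft', 'Adobe',
               'Programming', 'Databases', 'Microsoft', 'Programming', 'Adobe']

counts = Counter(group_names)

# most_common(n) returns (value, count) pairs, highest count first.
top_three = counts.most_common(3)
print(top_three)
```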


2.4 JSON

Class Files Examples
Examples from this section are in working-with-data/Demos/json_examples.py.

JSON stands for JavaScript Object Notation. According to the official JSON website [20], JSON is:

1. A lightweight data-interchange format.
2. Easy for humans to read and write.
3. Easy for machines to parse and generate.

Numbers 1 and 3 are certainly true. Number 2 depends on the type of human. Experienced Python programmers will find the JSON syntax extremely familiar, as it closely resembles the syntax of Python's dict and list literals. Here is an example of JSON holding weather data:


20. See http://www.json.org.


    {
      "city": {
        "coord": { "lat": 41.85, "lon": -87.6501 },
        "country": "US",
        "id": 4887398,
        "name": "Chicago"
      },
      "cnt": 2,
      "cod": "200",
      "list": [
        {
          "clouds": { "all": 0 },
          "main": {
            "grnd_level": 1024.6,
            "humidity": 82,
            "pressure": 1024.6,
            "sea_level": 1047.8,
            "temp": 47.39,
            "temp_kf": 3.52,
            "temp_max": 47.39,
            "temp_min": 41.04
          },
          "wind": { "deg": 338.001, "speed": 2.55 }
        },
        {
          "clouds": { "all": 8 },
          "main": {
            "grnd_level": 1024.41,
            "humidity": 75,
            "pressure": 1024.41,
            "sea_level": 1047.37,
            "temp": 48.83,
            "temp_kf": 2.64,
            "temp_max": 48.83,
            "temp_min": 44.08
          },
          "wind": { "deg": 59.5004, "speed": 2.3 }
        }
      ]
    }

The JSON above is a simplified version of the JSON returned from the daily forecast OpenWeatherMap API [21]. As you can see, it is formatted much like a Python dict. The value for the "city" key is a dict, and the value for the "list" key is a list of dicts, one per forecast entry (two in this example).

If you assign the dict to a weather variable, you can get the name of the city and the maximum temperature for the next day as follows:

    city_name = weather['city']['name']
    max_temp_tomorrow = weather['list'][0]['main']['temp_max']

    print('The high tomorrow in {} will be {}.'.format(city_name,
                                                       max_temp_tomorrow))

This will output:

    The high tomorrow in Chicago will be 47.39.
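JSON that arrives as text must be parsed before it can be indexed like a dict. The sketch below uses the standard library's json module on a shortened version of the weather data; the "rain" and "daytime" keys are hypothetical additions, included only to show where JSON and Python literals differ:

```python
import json

# Shortened weather data; "rain" and "daytime" are made-up keys for illustration.
json_text = '''
{
  "city": {"name": "Chicago", "country": "US"},
  "cnt": 2,
  "list": [
    {"main": {"temp_max": 47.39}, "rain": null, "daytime": true},
    {"main": {"temp_max": 48.83}, "rain": null, "daytime": false}
  ]
}
'''

weather = json.loads(json_text)  # parse the JSON text into Python objects

city_name = weather['city']['name']
max_temp = weather['list'][0]['main']['temp_max']
first_entry = weather['list'][0]

print(city_name, max_temp)
print(first_entry['rain'], first_entry['daytime'])  # JSON null and true become None and True

round_trip = json.dumps(first_entry)  # back to JSON text
print(round_trip)
```

JSON also requires double quotes around strings, so a Python dict printed with single quotes is not valid JSON.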

Many organizations, including Google, Twitter, Facebook, Reddit, and Microsoft, provide APIs that return data in JSON. [22] Each API has its own rules and parameters and some, including the OpenWeatherMap API, require an API key.

Getting an OpenWeatherMap API Key
The OpenWeatherMap API is free, but it requires an API key. To get an API key, sign up at http://home.openweathermap.org/users/sign_up.

To get the daily forecast for Chicago for the next two days in JSON format, go to this URL (replacing yourapikey with a valid API key):

    http://api.openweathermap.org/data/2.5/forecast?APPID=yourapikey&mode=json&cnt=2&id=4887398&units=imperial

Notice the parameters passed in the URL:

1. APPID - a valid API key.
2. mode: json - the format of the return data.
3. cnt: 2 - the number of days.

21. See http://api.openweathermap.org/data/2.5/forecast/daily.
22. For a list of many JSON APIs, see http://www.programmableweb.com/category/all/apis?data_format=21173.


4. id: 4887398 - the id of the city (see callout below).
5. units: imperial - the system of measurement (imperial, metric, etc.).
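The query string in the URL is simply these parameters encoded as name=value pairs joined with ampersands. As a sketch of that encoding, using the standard library's urllib.parse module and the placeholder key yourapikey:

```python
from urllib.parse import parse_qs, urlencode, urlsplit

base = 'http://api.openweathermap.org/data/2.5/forecast'
params = {'APPID': 'yourapikey',  # placeholder, not a working key
          'mode': 'json',
          'cnt': 2,
          'id': 4887398,
          'units': 'imperial'}

# Build the URL: the parameters are encoded in the order they appear in the dict.
url = base + '?' + urlencode(params)
print(url)

# Going the other way: split a URL back into its parameters.
parsed = parse_qs(urlsplit(url).query)
print(parsed['id'])
```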

Finding the OpenWeatherMap ID for your City
To find the OpenWeatherMap id of a city:

1. Go to http://openweathermap.org/find and search for the city.
2. Click on the result you want.
3. The id is at the end of the URL of the resulting page.

And here is how we access the OpenWeatherMap API with Python:


    import requests
    from pprint import pprint  # "pretty prints" the data in a human-readable way

    api_key = 'yourapikey'
    feed = "http://api.openweathermap.org/data/2.5/forecast/?APPID=" + api_key
    params = {'id': 4887398, 'mode': 'json',
              'units': 'imperial', 'cnt': 2}
    r = requests.get(feed, params)
    weather = r.json()

    city_name = weather['city']['name']
    max_temp_tomorrow = weather['list'][0]['main']['temp_max']

    print(r.url)  # prints the URL created using the params
    print('-' * 50)
    print('The high tomorrow in {} will be {}.'.format(city_name,
                                                       max_temp_tomorrow))
    print('-' * 50)
    pprint(weather)

The json() method of the response from requests.get() converts the JSON code to a Python dict. From there, you can work with the Python dict object as you normally would.

pprint Module
The pprint module is used for pretty printing objects. It can be useful for making large Python objects, such as the dict parsed from a JSON response, human readable.
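As a small illustration, the sketch below pretty prints part of the sample weather data. The width value of 60 is an arbitrary choice for this example:

```python
import ast
from pprint import pformat, pprint

forecast = {'city': {'coord': {'lat': 41.85, 'lon': -87.6501},
                     'country': 'US',
                     'id': 4887398,
                     'name': 'Chicago'},
            'cnt': 2,
            'cod': '200'}

print(forecast)             # everything on one long line
pprint(forecast, width=60)  # wrapped and indented so the nesting is visible

# pformat() returns the same formatted text as a string instead of printing it.
text = pformat(forecast, width=60)
restored = ast.literal_eval(text)  # the formatted text is still a valid Python literal
```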


Code Sample working-with-data/Demos/json_examples.py

    import requests
    weather = {'city': {'coord': {'lat': 41.85, 'lon': -87.6501},
                        'country': 'US',
                        'id': 4887398,
                        'name': 'Chicago'},
               'cnt': 2,
               'cod': '200',
               'list': [{'clouds': {'all': 0},
                         'dt': 1524236400,
                         'dt_txt': '2018-04-20 15:00:00',
                         'main': {'grnd_level': 1024.6,
                                  'humidity': 82,
                                  'pressure': 1024.6,
                                  'sea_level': 1047.8,
                                  'temp': 47.39,
                                  'temp_kf': 3.52,
                                  'temp_max': 47.39,
                                  'temp_min': 41.04},
                         'sys': {'pod': 'd'},
                         'weather': [{'description': 'clear sky',
                                      'icon': '01d',
                                      'id': 800,
                                      'main': 'Clear'}],
                         'wind': {'deg': 338.001, 'speed': 2.55}},
                        {'clouds': {'all': 8},
                         'dt': 1524247200,
                         'dt_txt': '2018-04-20 18:00:00',
                         'main': {'grnd_level': 1024.41,
                                  'humidity': 75,
                                  'pressure': 1024.41,
                                  'sea_level': 1047.37,
                                  'temp': 48.83,
                                  'temp_kf': 2.64,
                                  'temp_max': 48.83,
                                  'temp_min': 44.08},
                         'sys': {'pod': 'd'},
                         'weather': [{'description': 'clear sky',
                                      'icon': '02d',
                                      'id': 800,
                                      'main': 'Clear'}],
                         'wind': {'deg': 59.5004, 'speed': 2.3}}],
               'message': 0.006}

    city_name = weather['city']['name']
    max_temp_tomorrow = weather['list'][0]['main']['temp_max']

    print('The high tomorrow in {} will be {}.'.format(city_name,
                                                       max_temp_tomorrow))
    # Getting JSON from a feed
    # You will need an API key from http://home.openweathermap.org/

    #import requests
    from pprint import pprint  # "pretty prints" the data in a human-readable way

    api_key = 'abca198b092b0295697beb48914a442c'
    feed = "http://api.openweathermap.org/data/2.5/forecast/?APPID=" + api_key
    params = {'id': 4887398, 'mode': 'json',
              'units': 'imperial', 'cnt': 2}
    r = requests.get(feed, params)
    weather = r.json()

    city_name = weather['city']['name']
    max_temp_tomorrow = weather['list'][0]['main']['temp_max']

    print(r.url)  # prints the URL created using the params
    print('-' * 50)
    print('The high tomorrow in {} will be {}.'.format(city_name,
                                                       max_temp_tomorrow))
    print('-' * 50)
    pprint(weather)


Exercise 12 Using JSON to print Course data
20 to 30 minutes

In this exercise, you will print course data that is stored in a JSON array.

1. Open working-with-data/Exercises/json_courses.py.
2. The script fetches an array of JSON objects from a JSON version of Webucator course data located at https://www.webucator.com/course-demos/python/courselistjson.cfm.
3. Iterate through the array and print the CourseName, Category, and CategoryGroup for each course. For example, to access the course name for a JSON object named course, code course['CourseName']. For readability, separate the data with forward slashes so that a report line looks something like this:

    1 Adobe Animate Creative Cloud (CC) Training / Animate / Adobe


Exercise Solution working-with-data/Solutions/json_courses.py

    import requests

    data="https://www.webucator.com/course-demos/python/courselistjson.cfm"

    r = requests.get(data)
    courses = r.json()

    for i, course in enumerate(courses, 1):
        print(i, '{} / {} / {}'.format(course['CourseName'],
                                       course['Category'],
                                       course['CategoryGroup']))
    print('-' * 70)


2.5 Conclusion

In this lesson, you have learned to work with data stored in databases, CSV files, HTML, XML, and JSON.


7400 E. Orchard Road, Suite 1450 N
Greenwood Village, Colorado 80111
Ph: 303-302-5280
www.ITCourseware.com


9-38-00259-000-05-14-19