2019-05-02 Python Lesson Tutor Notes
tutor_notes.md 4/29/2019

2019-05-02 Python lesson tutor notes

These notes are intended for the tutor as they work through the material, but may also be useful for independent learning.

Start the slides

TITLE: Building Programs With Python (Part 1)

- SECTION 01: Setup
- SECTION 02: Getting Started
- SECTION 03: Data Analysis
- SECTION 04: Visualisation
- SECTION 05: for loops
- SECTION 06: lists
- SECTION 07: Making choices
- SECTION 08: Analysing multiple files
- SECTION 09: Conclusions (Part 1)

TITLE: Building Programs With Python (Part 2)

- SECTION 10: Jupyter notebooks
- SECTION 11: Functions
- SECTION 12: Refactoring
- SECTION 13: Command-line programs
- SECTION 14: Testing and documentation
- SECTION 15: Errors and Exceptions
- SECTION 16: Defensive programming

TITLE: Building Programs With Python (Part 1)

SLIDE: Etherpad

- Please use the Etherpad for the course
- DEMONSTRATE LINK
- Note that slides and tutor notes are available online

SLIDE: Why Are We Here?

- We're here to learn how to program
  - This lesson just happens to be in Python, but the principles are relevant to all languages
- Programming is a way to solve problems in your research by making a computer do work quickly and accurately
  - You'll build functions that do specific, defined tasks
  - You'll automate those functions to perform tasks over and over again (in various combinations)
  - You'll manipulate data, which is at the heart of all research
  - You'll learn some file input/output to make the computer read and write useful information
  - You'll learn some data structures, which are ways to organise data so that the computer can deal with it efficiently

SLIDE: XKCD: The best use of your time

- The XKCD comic is tongue-in-cheek, but there's a lot of truth in it
- The less time you spend messing with Excel or manually processing data files, the more time you have for getting your research done

SLIDE: How are we doing this?

- We'll be learning how to program using Python
- Why Python?
  - We need to use some language
  - Python is free, with good documentation and lots of books and online courses
  - Python is widely used in academia, and there's lots of support online
  - It can be easier for novices to pick up than other languages
- We won't be covering the entire language in detail
- For those with a bit more experience, note: we will be using some long-handed ways of doing things to keep them clear for novices

SLIDE: XKCD: Python

- This XKCD comic highlights two of those key points:
  - Programming in Python is FUN
  - There are libraries for pretty much anything you might want to do

SLIDE: No, I mean "How are we doing this?"

- We'll use two tools to write Python
- The bulk of the course will be in a text editor
  - Text editors are part of the edit-save-execute cycle, which is how a lot of code is written
- We'll also spend a little bit of time writing code in the Jupyter notebook
  - Jupyter is good for exploring data, prototyping code, data-wrangling, and teaching
  - However, it's not so good for writing "production code" in a general sense
- There are also specialist integrated development environments (IDEs) for Python that are extremely useful for developers, but we won't be using them

SLIDE: Do I need to use Python afterwards?

- No. The lesson and principles are general; we're just teaching in Python
- What you learn here will be relevant in other languages
- If your field or colleagues use another language in preference, there may be very good reasons for that, and they may be able to offer detailed, relevant support and help to you in that language. This is valuable.
- My advice is usually: when learning, choose the language that you can get local help in. Language Wars waste everyone's time.

SLIDE: What are we doing?

- We're using a motivating example of data analysis
- We've got some data relating to a new treatment for arthritis, and we're going to explore it.
  - Data represents patients and their daily measurements of inflammation
  - We need to analyse the data
  - We need to visualise the data
- We're going to get the computer to do this for us
  - It's a small dataset, so we could do this by hand (it would take us a day, maybe) (Excel anecdote from lab?)
- Automation is key:
  - fewer human mistakes
  - easier to apply to other future datasets
  - easier to share with others (transparency)
- We can also share our code and results via sites such as GitHub and BitBucket
  - publish as supplementary information
  - greater impact
  - encourage collaboration

SECTION 01: Setup

SLIDE: Setting Up - 1 - DEMO

- We want a neat (clean) working environment: always a good idea when starting a new project. It helps if you later want to use git to put the project under version control.
- Change directory to desktop (in terminal or Explorer)
- Create directory pni (short for python-novice-inflammation)
- Change your working directory to that directory

    cd ~/Desktop
    mkdir pni
    cd pni

SLIDE: Setting Up - 2 - DEMO

- We need to download our data (and also a little code that can help us)
  - This is just like grabbing data from an analytical machine's output, or being given data by a collaborator
- Go to Etherpad in browser: http://pad.software-carpentry.org/2019-05-02-standrews
- Point out file links: http://swcarpentry.github.io/python-novice-inflammation/data/python-novice-inflammation-data.zip
- Click on file links to download
- Move files to pni directory
- Extract files - this will create a subdirectory called data in that folder
- CHECK WHETHER EVERYONE HAS EXTRACTED THE DATA
- SHOW FILE STRUCTURE IN TERMINAL AND FILE EXPLORER

SECTION 02: Getting Started

SLIDE: Python in the terminal

- We start the Python console with the command python
- This should bring up the interactive console
  - Explain header information
  - Explain the prompt

    $ python
    Python 3.6.3 |Anaconda custom (64-bit)| (default, Oct 6 2017, 12:04:38)
    [GCC 4.2.1 Compatible Clang 4.0.1 (tags/RELEASE_401/final)] on darwin
    Type "help", "copyright", "credits" or "license" for more information.
    >>>

- CHECK WHETHER EVERYONE HAS STARTED THE CONSOLE

SLIDE: Python REPL

- You learned about the REPL (read-evaluate-print loop) in the shell lesson
- Python's console implements the REPL
- We can use Python like a complex calculator
- Note the spaces around operators - good Python style

    >>> 3 + 5
    8
    >>> 12 / 7
    1.7142857142857142
    >>> 2 ** 16
    65536
    >>> 15 % 4
    3
    >>> (2 + 4) * (3 - 7)
    -24

SLIDE: My first variable

- We've seen how to use the REPL
- To build interesting things, we'll need to store values
- We need to work with variables
- Variables are like named boxes
  - An item of data goes into the box
  - When we refer to the box/variable name, we get the contents of the box
- We need a variable name
- We need variable contents
- Use a real-life example to hand if possible
- You can think of a variable as a labelled box, containing a data item
  - Here, we have a box labelled Name - this is the variable name
  - We've put the value Samia into the box

SLIDE: Creating a variable

- We assign a value to a variable with the equals sign: =
  - Variable name goes on the left, value on the right
- Character strings (words etc.) are enclosed in quotes
- EXPLAIN DOUBLE- AND SINGLE-QUOTES
  - Python accepts either double or single quotes, but the opening and closing quotes of a string must match.
- After assignment, if we refer to the variable name, we get the value that's in the box, which is: Samia
- The print() function shows the value of a variable
- KEY POINT: We refer to the name of the variable, but get its contents

    >>> name = "Samia"
    >>> name
    'Samia'
    >>> print(name)
    Samia

- CHECK THAT EVERYONE GETS THE CONCEPT/SEES THE NAME
- Variable names
  - can include letters, digits, and underscores
  - must not start with a digit
  - are case sensitive

    >>> myvar_1 = "Michael"
    >>> myvar_1
    'Michael'
    >>> myvar-1 = "Michael"
      File "<stdin>", line 1
    SyntaxError: can't assign to operator
    >>> 1myvar = "Michael"
      File "<stdin>", line 1
        1myvar = "Michael"
             ^
    SyntaxError: invalid syntax
    >>> Myvar_1 = "Alex"
    >>> Myvar_1
    'Alex'
    >>> myvar_1
    'Michael'

SLIDE: Working with variables

- Lead the students through the code:

    >>> weight_kg_text = "weight in kilograms:"
    >>> weight_kg = 55

- Note, we're assigning an integer now (no quotes), but assignment is the same for all data items
- Python knows about several types of data, including
  - integer numbers
  - floating point numbers
  - strings
- We can print weight_kg to see its value
- The print() function will also take more than one argument, separated by commas, and print them:

    >>> print(weight_kg)
    55
    >>> print(weight_kg_text, weight_kg)
    weight in kilograms: 55

- Variables can be substituted by name wherever a value would go, in calculations for example

    >>> 2.2 * weight_kg
    121.00000000000001

- We can mix strings and variables and even do calculations with variables inside the print() function:

    >>> print("Weight in pounds:", 2.2 * weight_kg)
    Weight in pounds: 121.00000000000001

- People may ask about floating point representations here - an introduction is at https://docs.python.org/3/tutorial/floatingpoint.html - this is on the Etherpad.
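If the floating point question comes up, a short script along the following lines may help. It is only a sketch, not part of the original lesson: the variable names `is_exact`, `rounded`, `formatted` and `is_close` are illustrative assumptions, and it reuses the lesson's `weight_kg` value of 55.

```python
import math

weight_kg = 55
weight_lb = 2.2 * weight_kg

# The stored result is the nearest binary fraction, not exactly 121.
print(weight_lb)                      # 121.00000000000001

# Exact comparison with the "expected" decimal value therefore fails.
is_exact = weight_lb == 121
print(is_exact)                       # False

# round() or a format specification tidies the value for display.
rounded = round(weight_lb, 1)
formatted = f"{weight_lb:.1f}"
print(rounded, formatted)             # 121.0 121.0

# math.isclose() compares two floats within a tolerance.
is_close = math.isclose(weight_lb, 121)
print(is_close)                       # True
```

Rounding only changes what is shown or stored afterwards. The point for learners is that the calculation is not wrong, the value is just held in binary.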
- Most decimal fractions can't be represented exactly as binary fractions
- Reassigning to the same variable overwrites the old value

    >>> weight_kg = 65.0
    >>> print("Weight in kilograms is now:", weight_kg)
    Weight in kilograms is now: 65.0

- Changing the value of one variable does not automatically change the values of other variables calculated using the original value

    >>> weight_lb = 2.2 * weight_kg
    >>> print('weight in kilograms:', weight_kg, 'and in pounds:', weight_lb)
    weight in kilograms: 65.0 and in pounds: 143.0
    >>> weight_kg = 100
    >>> print('weight in kilograms:', weight_kg, 'and in pounds:',