
2020. 3. 28. Assignment 3 Assignment 3

Due Apr 10 by 11pm Points 8 Available after Mar 24 at 12am

Recent updates

24 Mar: The assignment submission system, MarkUs, is not yet available. I will send an announcement when it is ready.

CSCA08H Assignment 3

Deadline: 10 April 2020 by 11:00pm

Late policy: There are penalties for submitting the assignment after the due date. These penalties depend on how many hours late your submission is. Please see the syllabus on Quercus for more information.

Please do not violate the U of T Code of Student Conduct

As before, please do not look for extra help outside the course resources. The explanations we have heard from students who ended up violating the U of T Code of Student Conduct are typical of what we hear every semester: an upper-year CS friend was helping them and was also helping someone else too much by sharing that same code, or a roommate was stuck and so they worked together for a bit, or they posted their code on WeChat or a similar service for feedback and other people copied it.

Please don't do any of that, or read your code over the phone to your friend, or steal someone's code, or use a tutoring service to write it with you, or have a friend back home do most of it for you. The U of T has a Code of Student Conduct (https://www.md.utoronto.ca/sites/default/files/Student%20Conduct%2C%20Code%20of.pdf). It's only 12 pages long, and describes your rights and responsibilities related to our university. If you're looking for something to do, it's probably worth a read!

We use a program that detects similarities. It works even if you rename everything and move blocks of code around: it looks for sequences of statements with similar structure, wherever they may be.

We provide regular TA and instructor office hours and are happy to help! More information will soon be announced.

If you normally rely on too much help, try to work on your own more for this assignment, and come visit office hours as many times as you like!

https://q.utoronto.ca/courses/137991/assignments/314587 1/13 2020. 3. 28. Assignment 3

Introduction

In this assignment, you will write a program to analyze poetry, counting syllables and looking for rhymes.

This handout explains the problem you are to solve and the tasks you need to complete for the assignment. Please read it carefully.

Goals of this Assignment

Write function bodies using dictionaries and file reading.
Write code to mutate lists and dictionaries.
Break a problem down into subtasks and implement helper functions to complete those tasks.
Write tests to check whether a function is correct.

Files in the download

Please download the Assignment 3 files and extract the zip archive.

Starter code: poetry_reader.py and poetry_functions.py

These are the only files you need to modify and submit. These two files contain the headers for the functions you will need to write for this assignment, and a few completed function docstrings. Many of these functions will be called by the main program (poetry.py). You can, and should, write some helper functions in these files. Your lives will be easier if you do.

Helper module: poetry_constants.py

Read this! This file contains several definitions of new types that we use in the function type annotations (i.e., the part of the header in which the types of input parameters are described).

Main Program: poetry.py

Run this first. The file contains a program that calls the functions in the starter code files. You can run it now, although it won't work properly until you complete the functions in the starter files. Still, you'll be able to use this to check your progress.

Data: poetry/*.txt

In the poetry directory are several files containing poems that you can use to test your code.

Data: dictionary.txt

This file contains a huge list of English words and their pronunciations.

Data: poetry_forms.txt

This file contains information describing various poetic forms.

Checker: a3_checker.py

We have provided a checker program that you should use to check your code. See below for more information about a3_checker.py.

Poetry Forms

Poetry differs from prose because it has a fixed structure. Different forms of poetry, such as sonnets and haiku, have rules about which words must rhyme and the number of syllables in each line.

In this assignment, you will write a program to read a poem from a file, figure out the pronunciation, count the number of syllables in each line, and determine which lines rhyme.

Some poetry forms specify the number and order of stressed and unstressed syllables within a line. We will not consider syllabic stress in this assignment.

Some poetry forms specify that particular words must alliterate, which means they start with the same sound. We will not consider alliteration in this assignment.

Definitions

Some of the concepts used in this assignment are explained and defined below. All links go to https://dictionary.com (https://dictionary.com/).

poem (https://www.dictionary.com/browse/poem): a composition in verse, especially one that is characterized by a highly developed artistic form and by the use of heightened language and rhythm to express an intensely imaginative interpretation of the subject

rhyme (https://www.dictionary.com/browse/rhyme): a word agreeing with another in terminal sound. For example, find is a rhyme for mind and womankind

consonant (https://www.dictionary.com/browse/consonant): (in English articulation) a speech sound produced by occluding with or without releasing (p, b; t, d; k, g), diverting (m, n, ng), or obstructing (f, v; s, z, etc.) the flow of air from the lungs (opposed to vowel)

vowel (https://www.dictionary.com/browse/vowel): (in English articulation) a speech sound produced without occluding, diverting, or obstructing the flow of air from the lungs (opposed to consonant)

syllable (https://www.dictionary.com/browse/syllable): an uninterrupted segment of speech consisting of a vowel sound, a diphthong, or a syllabic consonant, with or without preceding or following consonant sounds

There are many vowel sounds. For example, freight, fraught, fruit, and fright all have different vowel sounds; there are far more vowel sounds than there are letters used to describe them: a, e, i, o, u, and sometimes y.

Poetry Form Example: Limerick

Here is an impressive work of "limerick" art. The lines have been numbered and we have highlighted the last syllable of each line, because those words must rhyme according to a particular scheme. We have indicated, using bold and underlined italics, the two sets of rhyming words.

1. I wish I had thought of a rhyme
2. Before I ran all out of time!
3. I'll sit here instead,
4. A cloud on my head
5. That rains 'til I'm covered with slime.

Limericks are five lines long. Lines 1, 2, and 5 have eight syllables and the last syllables on these lines rhyme with each other. Lines 3 and 4 have five syllables and the last syllables rhyme with each other. (There are additional rules about the location and number of stressed vs. unstressed syllables, but we'll ignore those rules for this assignment; we will be counting syllables, but not paying attention to whether they are stressed or unstressed.)

The CMU Pronouncing Dictionary

We'll need a way to examine words and break them into syllables and consonants. We're going to use the Carnegie Mellon University Pronouncing Dictionary (http://www.speech.cs.cmu.edu/cgi-bin/cmudict) , which contains a dictionary where instead of definitions they store pronunciations. They use a plain-text notation for various sounds; the quickest way to get used to them is to go look at some. You don't need to memorize the notation, but it helps to see it. Head to the CMU Pronouncing Dictionary (http://www.speech.cs.cmu.edu/cgi-bin/cmudict) now and look up a couple of words; try searching for words like Joseph, talks, and fast, and see if you can interpret the results. Do contractions like I'll (short for I will) and we'll (short for we will) work? What about possessives like "Rita's"?

Now click the "Show Lexical Stress" checkbox and see how that changes the results.

Here is the output for David (with "Show Lexical Stress" turned on): D EY1 V IH0 D. There are five phonemes in the word David and each phoneme describes a sound. The sounds are either vowel sounds or consonant sounds. We will refer to phonemes that describe vowel sounds as vowel phonemes, and similarly for consonants.

The phoneme notation was defined in a project called Arpabet (http://en.wikipedia.org/wiki/Arpabet) that was created by the Advanced Research Projects Agency (ARPA) (http://en.wikipedia.org/wiki/Advanced_Research_Projects_Agency) back in the 1970s.

We have downloaded a text file containing the CMU Pronouncing Dictionary: all the words and their pronunciations. All vowel phonemes end in a 0 , 1 , or 2 , with the digit indicating a level of syllabic stress. Consonant phonemes do not end in a digit. The number of syllables in a word is the same as the number of vowel sounds in the word, so you can determine the number of syllables in a word by counting the number of phonemes that end in a digit.

As an example, in the word "secondary" ( S EH1 K AH0 N D EH2 R IY0 ), there are 4 vowel phonemes, and therefore 4 syllables. The vowel phonemes are EH1 , AH0 , EH2 , and IY0 .
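The counting rule above can be sketched in a few lines of Python. This is only an illustration: the function name count_vowel_phonemes is hypothetical and is not one of the required functions.

```python
from typing import List


def count_vowel_phonemes(phonemes: List[str]) -> int:
    """Return the number of phonemes in phonemes that end in 0, 1, or 2.

    Hypothetical helper for illustration; assumes no phoneme is empty.
    """
    count = 0
    for phoneme in phonemes:
        # Vowel phonemes end in a stress digit; consonant phonemes do not.
        if phoneme[-1] in '012':
            count += 1
    return count


# The pronunciation of "secondary" given in the handout.
secondary = ['S', 'EH1', 'K', 'AH0', 'N', 'D', 'EH2', 'R', 'IY0']
print(count_vowel_phonemes(secondary))  # 4
```

The stress level itself is ignored here; only the presence of a trailing digit matters.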

In case you're curious, 0 means unstressed, 1 means primary stress, and 2 means secondary stress — try saying "secondary" out loud to hear for yourself which syllables have stress and which do not. (In this assignment, your program will not need to distinguish between the levels of syllabic stress.)

The assignment zipfile includes dictionary.txt, which contains our version of the Pronouncing Dictionary. You must use this file, not any files from the CMU website, because our version differs slightly from the CMU version. We have removed alternate pronunciations for words, and we have removed words that do not start and end with alphanumeric characters (like #HASH-MARK, #POUND-SIGN and #SHARP-SIGN). Open dictionary.txt to see the format; notice that any line beginning with ;;; is a comment.
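As a rough sketch of how such a file might be read, the example below skips comment lines and stores everything else. It assumes that each remaining line holds a word followed by its phonemes, all separated by whitespace; check dictionary.txt to confirm the format. The name read_pronunciations and the sample data are made up for illustration.

```python
import io
from typing import Dict, List, TextIO


def read_pronunciations(dict_file: TextIO) -> Dict[str, List[str]]:
    """Return a dictionary mapping each word in dict_file to its phonemes.

    Assumption: each non-comment line is a word followed by its phonemes,
    separated by whitespace.
    """
    word_to_phonemes = {}
    for line in dict_file:
        line = line.strip()
        # Ignore blank lines and lines that begin with ;;; (comments).
        if line and not line.startswith(';;;'):
            parts = line.split()
            word_to_phonemes[parts[0]] = parts[1:]
    return word_to_phonemes


# A tiny made-up sample in the assumed format.
sample = io.StringIO(';;; a comment line\nDAVID  D EY1 V IH0 D\nGOSH  G AA1 SH\n')
pronunciations = read_pronunciations(sample)
print(pronunciations['DAVID'])  # ['D', 'EY1', 'V', 'IH0', 'D']
```

Using io.StringIO in place of a real file makes it easy to try small inputs without calling open.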

The words in dictionary.txt are all uppercase and do not contain surrounding punctuation. When your program looks up a word, use the uppercase form, with no leading or trailing punctuation. Function clean_up in the starter code file poetry_functions.py will be helpful here.

Describing Poetry Forms

Here is our limerick poetry form:

Limerick
8 A
8 A
5 B
5 B
8 A

On each line, the first piece of information is a number that indicates the number of syllables required on that line of the poem. The second piece of information on each line is a letter that indicates the rhyme scheme. Here, lines 1, 2, and 5 must rhyme with each other because they're all marked with the same letter (A), and lines 3 and 4 must rhyme with each other because they're both marked with the same letter (B). (Note that the choice to use the letters A and B was arbitrary. Other letters could have been used to describe this rhyme scheme.)

Two lines of a poem rhyme with each other when the last syllables of the last words on the two lines rhyme. Two syllables rhyme when their vowels are the same and they end in the same sequence of consonant phonemes, like gosh and wash.
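One possible reading of this rhyme rule is sketched below: take the last vowel phoneme of each word together with every phoneme after it, and compare the two results. The helper names are hypothetical, the pronunciations of gosh and wash are written out by hand for illustration, and the sketch assumes the stress digit is compared as part of the vowel phoneme, which the handout does not state.

```python
from typing import List


def last_syllable_sound(word_phonemes: List[str]) -> List[str]:
    """Return the last vowel phoneme in word_phonemes and all phonemes after it.

    Hypothetical helper; if there is no vowel phoneme, the whole list is returned.
    """
    i = len(word_phonemes) - 1
    # Walk backwards until a phoneme ending in a stress digit is found.
    while i > 0 and word_phonemes[i][-1] not in '012':
        i -= 1
    return word_phonemes[i:]


def words_rhyme(first: List[str], second: List[str]) -> bool:
    """Return True if and only if the last syllables of first and second rhyme.

    Assumption: the stress digit is compared along with the vowel sound.
    """
    return last_syllable_sound(first) == last_syllable_sound(second)


gosh = ['G', 'AA1', 'SH']
wash = ['W', 'AA1', 'SH']
david = ['D', 'EY1', 'V', 'IH0', 'D']
print(words_rhyme(gosh, wash))   # True
print(words_rhyme(gosh, david))  # False
```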

Some poetry forms don't require lines that rhyme. For example, the haiku form has 5 syllables in the first line, 7 in the second line, and 5 in the third line, but there are no rhyme requirements. Here is an example:

Dan's hands are quiet.
Soft peace surrounds him gently:
No thought moves the air.

And another one:

Jen sits quietly,
Thinking of assignment three.
All ideas bad.

We'll indicate the lack of a rhyme requirement by using the symbol * . Here is our poetry form description for the haiku poetry form:

Haiku
5 *
7 *
5 *

Some poetry forms have rhyme requirements but don't have a specified number of syllables per line. Quintain (English) is one such example; these are 5-line poems with an ABABB rhyme scheme, but with no syllable requirements. Here is our poetry form description for the Quintain (English) poetry form (notice that 0 is used to indicate that there is no requirement on the number of syllables in the line):

Quintain (English)
0 A
0 B
0 A
0 B
0 B

Here's an example of a Quintain (English) from Percy Bysshe Shelley's Ode To A Skylark:

Teach us, Sprite or Bird,
What sweet thoughts are thine:
I have never heard
Praise of love or wine
That panted forth a flood of rapture so divine.

Your program will read a poetry form description file containing a list of poetry form names and their poetry form descriptions. For each poetry form in the file:

the first line gives the name of the poetry form
subsequent lines contain the number of syllables and rhyme scheme for each line of poetry
each poetry form is separated from the next by a blank line

The poetry form names given in a poetry form description file are all different.
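The file layout described above could be parsed along the following lines. This is a sketch under stated assumptions, not the required implementation: the name read_forms is made up, and each description line is assumed to hold exactly one number and one rhyme symbol separated by whitespace.

```python
import io
from typing import Dict, List, TextIO, Tuple


def read_forms(form_file: TextIO) -> Dict[str, Tuple[List[int], List[str]]]:
    """Return a dictionary mapping each form name in form_file to a tuple of
    (syllable counts, rhyme symbols).

    Hypothetical sketch; assumes each description line looks like '8 A'.
    """
    forms: Dict[str, Tuple[List[int], List[str]]] = {}
    name = ''
    for line in form_file:
        line = line.strip()
        if not line:
            # A blank line ends the current poetry form.
            name = ''
        elif not name:
            # The first non-blank line of a block is the form name.
            name = line
            forms[name] = ([], [])
        else:
            syllables, rhyme = line.split()
            forms[name][0].append(int(syllables))
            forms[name][1].append(rhyme)
    return forms


sample = io.StringIO('Limerick\n8 A\n8 A\n5 B\n5 B\n8 A\n\nHaiku\n5 *\n7 *\n5 *\n')
forms = read_forms(sample)
print(forms['Haiku'])  # ([5, 7, 5], ['*', '*', '*'])
```

Because the whole first line of a block is taken as the name, names containing spaces, such as Quintain (English), are kept intact.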

We have provided poetry_forms.txt as an example poetry form description file. We will test your code with other poetry form descriptions as well.

Stanza-based poetry

Many poetry forms don't have a fixed number of lines. Instead, they specify what a stanza looks like, and then the poetry is made up of as many stanzas as the poet likes.

As an example drawn from Narodnaya Volya literature, here are the first two stanzas of a poem called The Beauteous Terrorist. The author, Henry Parkes, was inspired by Sophia Perovskaia, a prominent member of the Narodnaya Volya, to write the poem. Each stanza follows a simple ABAB rhyme scheme.

SOFT as the morning's pearly light,
Where yet may rise the thunder cloud,
Her gentle face was ever bright
With noble thought and purpose proud.

Dreamt ye that those divine blue eyes,
That beauty free from pride or blame,
Were fashioned but to terrorize
O'er Despot's power of sword and flame?

We will not consider stanza-based poems in this assignment.

Data Representation

We use the following Python definitions to create new types relevant to the problem domain. Read the comments in starter code file poetry_constants.py for detailed descriptions with examples.

POETRY_FORM Tuple[List[int], List[str]]

POETRY_FORMS Dict[str, POETRY_FORM]

CLEAN_POEM List[List[str]]

WORD_PHONEMES List[str]

LINE_PRONUNCIATION List[WORD_PHONEMES]

POEM_PRONUNCIATION List[LINE_PRONUNCIATION]

PRONOUNCING_DICTIONARY Dict[str, PHONEMES]
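To see how these types fit together, here is how they might be written as Python type aliases, followed by small example values. The authoritative definitions are in poetry_constants.py. This sketch assumes that PHONEMES above means the same thing as WORD_PHONEMES, and the example values are made up for illustration.

```python
from typing import Dict, List, Tuple

POETRY_FORM = Tuple[List[int], List[str]]
POETRY_FORMS = Dict[str, POETRY_FORM]
CLEAN_POEM = List[List[str]]
WORD_PHONEMES = List[str]
LINE_PRONUNCIATION = List[WORD_PHONEMES]
POEM_PRONUNCIATION = List[LINE_PRONUNCIATION]
# Assumption: PHONEMES in the handout is the same type as WORD_PHONEMES.
PRONOUNCING_DICTIONARY = Dict[str, WORD_PHONEMES]

# Syllable counts and rhyme symbols for the limerick form.
limerick: POETRY_FORM = ([8, 8, 5, 5, 8], ['A', 'A', 'B', 'B', 'A'])
forms: POETRY_FORMS = {'Limerick': limerick}

# A one-line "poem" containing the single word DAVID.
clean_poem: CLEAN_POEM = [['DAVID']]
pronouncing_dict: PRONOUNCING_DICTIONARY = {'DAVID': ['D', 'EY1', 'V', 'IH0', 'D']}
poem_pronunciation: POEM_PRONUNCIATION = [[pronouncing_dict['DAVID']]]
print(poem_pronunciation)  # [[['D', 'EY1', 'V', 'IH0', 'D']]]
```

Note the nesting: a poem pronunciation is a list of lines, each line is a list of words, and each word is a list of phonemes.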

Required Functions

This section contains a table with detailed descriptions of the functions that you must complete in the two starter code files. You'll need to add a second example to the docstrings for each function in the starter code.

For all poetry samples used in this assignment, you should assume that all words in the poems will appear as keys in the pronouncing dictionary. We will test with other pronouncing dictionaries, but we will always follow this rule.

You should follow the approach we've been using on large problems recently and write additional helper functions to break these high-level tasks down. Each helper function must have a clear purpose. Each helper function must have a complete docstring produced by following the Function Design Recipe. You should test your helper functions to make sure they work!

Functions to write in poetry_functions.py:

Function name: (Parameter types) -> Return type
Description (paraphrase to get a proper docstring description)

clean_poem: (str) -> CLEAN_POEM
The parameter represents a poem. This function should create and return a list of lists (one sublist per poem line) of capitalized words containing no leading or trailing punctuation. Punctuation internal to words, such as the apostrophe in DON'T, should be preserved. Hint: write as many helper functions as you need. It will make your life much easier. What are the tasks that need to be done?

extract_phonemes: (CLEAN_POEM, PRONOUNCING_DICTIONARY) -> POEM_PRONUNCIATION
The first parameter represents a clean poem (capitalized words in a nested list) and the second a pronouncing dictionary. The function is to return a POEM_PRONUNCIATION, which is a list of line pronunciations where each inner list is the pronunciation for a single line of poetry, based on the pronunciations in the PRONOUNCING_DICTIONARY.

phonemes_to_str: (POEM_PRONUNCIATION) -> str
The parameter represents a poem pronunciation. This function is to return a string containing all the phonemes in each word in each line of the parameter. The phonemes are separated by spaces, the words are separated by ' | ', and the lines are separated by '\n'. See the Pronunciation text box in the screenshot at the top of this handout for an example.

get_rhyme_scheme: (POEM_PRONUNCIATION) -> List[str]
The parameter represents a poem pronunciation. This function is to return a list of letters describing the rhyme scheme for the poem, starting the scheme with the letter A for the first line.

get_num_syllables: (POEM_PRONUNCIATION) -> List[int]
The parameter represents a poem pronunciation. This function is to return a list of the number of syllables on each line of the poem. Hint: write a helper function count_syllables that counts the number of syllables in a single line of poetry.
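The separators described for phonemes_to_str can be illustrated with nested joins. This is a sketch only: the name pronunciation_to_text is hypothetical, the sample data is made up, and it assumes the result has no trailing newline, which the description above does not settle.

```python
from typing import List


def pronunciation_to_text(poem_pronunciation: List[List[List[str]]]) -> str:
    """Return the phonemes in poem_pronunciation as text, with phonemes
    separated by spaces, words by ' | ', and lines by newlines.

    Assumption: no newline is added after the last line.
    """
    lines = []
    for line in poem_pronunciation:
        words = []
        for word in line:
            # Phonemes within one word are separated by single spaces.
            words.append(' '.join(word))
        lines.append(' | '.join(words))
    return '\n'.join(lines)


sample = [[['D', 'EY1', 'V', 'IH0', 'D'], ['G', 'AA1', 'SH']],
          [['W', 'AA1', 'SH']]]
print(pronunciation_to_text(sample))
# D EY1 V IH0 D | G AA1 SH
# W AA1 SH
```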

Functions to write in poetry_reader.py:

Function name: (Parameter types) -> Return type
Description (paraphrase to get a proper docstring description)

read_and_trim_whitespace: (TextIO) -> str
The parameter represents an open file containing a poem. This function should return the contents of the file as a single string, removing blank lines and leading and trailing whitespace on each line. The result should contain newlines separating the lines of the poem. This is used by the poetry.py program to read the poem file and show the contents in the Poem box (see the screenshot on the top of the handout).

read_pronouncing_dictionary: (TextIO) -> PRONOUNCING_DICTIONARY
The parameter represents an open file in the format of the CMU Pronouncing Dictionary. This function should accumulate and return the pronouncing dictionary based on the given file. This is used by the poetry.py program to read the pronouncing dictionary. To use a different dictionary file (for example, if you want to try it with a tiny dictionary to make any output easier to read), change the DICTIONARY_FILENAME constant at the top of file poetry.py temporarily.

read_poetry_form_descriptions: (TextIO) -> POETRY_FORMS
The parameter represents a poetry form description file that has been opened for reading. This function should accumulate and return a dictionary where each key is a poetry form name and each value is the poetry pattern for that form based on the given file. This is used by the poetry.py program to read the poetry forms file. To use a different poetry forms file (for example, if you want to try it with other poetry forms), change the POETRY_FORMS_FILENAME constant at the top of file poetry.py temporarily. Hint: write a helper function that reads a single POETRY_FORM.
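For the first function described above, the trimming behaviour might look like the sketch below. The name trim_poem is hypothetical, and the sketch assumes that the returned string does not end with a newline; the description only says that newlines separate the lines.

```python
import io
from typing import TextIO


def trim_poem(poem_file: TextIO) -> str:
    """Return the contents of poem_file with blank lines removed and with
    leading and trailing whitespace stripped from each line.

    Assumption: the lines are joined with newlines, with none at the end.
    """
    kept = []
    for line in poem_file:
        stripped = line.strip()
        # A line that is empty after stripping is a blank line; skip it.
        if stripped:
            kept.append(stripped)
    return '\n'.join(kept)


sample = io.StringIO('  I wish I had thought of a rhyme  \n\n   Before I ran all out of time!\n')
trimmed = trim_poem(sample)
print(trimmed)
# I wish I had thought of a rhyme
# Before I ran all out of time!
```

Passing an io.StringIO object instead of a real file keeps the example free of calls to open.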

A3 Checker

We are providing a checker module ( a3_checker.py ) that tests two things:

whether your code follows the Python Style Guidelines, and
whether your functions are named correctly, have the correct number of parameters, and return the correct types.

To run the checker, open a3_checker.py and run it. Be sure to scroll up to the top and read all messages.

If the checker passes without errors, it means:

Your code follows the style guidelines.
Your function names, number of parameters, and return types match the assignment specification. This does not mean that your code works correctly in all situations. We will run a different set of tests on your code once you hand it in, so be sure to thoroughly test your code yourself before submitting.

If the checker fails, carefully read the message provided:

It may have failed because your code did not follow the style guidelines. Review the error description(s) and fix the code style. Please see the PyTA documentation (http://www.cs.toronto.edu/~david/pyta/) for more information about errors.
It may have failed because:
    you are missing one or more functions,
    one or more of your functions is misnamed,
    one or more of your functions has the incorrect number or type of parameters, or
    one or more of your function return types does not match the assignment specification.

Read the error message to identify the problematic function, review the function specification in the handout, and fix your code.

Make sure the checker passes before submitting.

Running the checker program on MarkUs

In addition to running the checker program on your own computer, run the checker on MarkUs as well. You will be able to run the checker program on MarkUs once every 12 hours (note: we may have to revert to every 24 hours if MarkUs has any issues handling every 12 hours). This can help to identify issues such as uploading the incorrect file.

First, submit your work on MarkUs. Next, click on the "Automated Testing" tab and then click on "Run Tests". Wait for a minute or so, then refresh the webpage. Once the tests have finished running, you'll see results for the Style Checker and Type Checker components of the checker program (see both the Automated Testing tab and results files under the Submissions tab). Note that these are not actually marks, just the checker results. If there are errors, edit your code, run the checker program again on your own machine to check that the problems are resolved, resubmit your assignment on MarkUs, and (if time permits) after the waiting period has elapsed, rerun the checker on MarkUs.

Testing your Code

It is strongly recommended that you test each function as soon as you write it. As usual, follow the Function Design Recipe (we've provided the function name and types for you) to implement your code. Once you've implemented a function, run it against the examples in your docstrings and the unit tests you've defined.

How to tackle this assignment

Principles:

To avoid getting overwhelmed, deal with one function at a time. Start with functions that don't call any other functions; this will allow you to test them right away. The steps listed below give you a reasonable order in which to write the functions.
For each function that you write, start by adding at least one example call to the docstring before you write the function.
Keep in mind throughout that any function you have might be a useful helper for another function. Part of your marks will be for taking advantage of opportunities to call an existing function.
As you write each function, begin by designing it in English, using only a few sentences. If your design is longer than that, shorten it by describing the steps at a higher level that leaves out some of the details. When you translate your design into Python, look for steps that are described at such a high level that they don't translate directly into Python. Design a helper function for each of these high-level steps, and put a call to the helpers into your code. Don't forget to write a great docstring for each helper!

Steps:

Here is a good order in which to solve the pieces of this assignment.

1. Read this handout thoroughly and carefully, making sure you understand everything in it.
2. Read the poetry_functions.py and poetry_reader.py starter code to get an overview of what you will be writing.
3. Read the poetry_constants.py file and start to figure out the new type definitions.
4. Next, read the starter code poetry_reader.py again and implement and test those functions.
5. Next, read the starter code poetry_functions.py again and implement and test those functions.
6. Read the code provided in poetry.py and run it. If there are any problems with the results, try to identify which of your functions has an issue, and go back to testing that function.

Additional requirements

Do not call print, input, or open, except within the if __name__ == '__main__' block.
Do not use any break or continue statements.
Do not modify or add to the import statements provided in the starter code.
Do not add any code outside of a function definition.
Do not mutate objects unless specified.

Marking

These are the aspects of your work that will be marked for Assignment 3:

Correctness (80%): Your functions should perform as specified. Correctness, as measured by our tests, will count for the largest single portion of your marks. Once your assignment is submitted, we will run additional tests, not provided in the checker. Passing the checker does not mean that your code will earn full marks for correctness.

Coding style (20%):
Make sure that you follow the Python style guidelines that we have introduced and the Python coding conventions that we have been using throughout the semester. Although we don't provide an exhaustive list of style rules, the checker tests for style are complete, so if your code passes the checker, then it will earn full marks for coding style with two exceptions: docstrings and use of helper functions may be evaluated separately. For each occurrence of a PyTA error (http://www.cs.toronto.edu/~david/pyta/), a 1 mark (out of 20) deduction will be applied. For example, if a C0301 (line-too-long) error occurs 3 times, then 3 marks will be deducted.
Your program should be broken down into functions, both to avoid repetitive code and to make the program easier to read. If a function body is more than about 20 statements long, introduce helper functions to do some of the work, even if they will only be called once.
All functions, including helper functions, should have complete docstrings including preconditions when you think they are necessary.
Also, your variable names and names of your helper functions should be meaningful. Your code should be as simple and clear as possible.

No Remark Requests

No remark requests will be accepted. A syntax error could result in a grade of 0 on the assignment. Before the deadline, you are responsible for running your code and the checker program to identify and resolve any errors that will prevent our tests from running.

What to Hand In

The very last thing you do before submitting should be to run the checker program one last time.

Otherwise, you could make a small error in your final changes before submitting that causes your code to receive zero for correctness.
