Efficient Optical Music Recognition Validation using MIDI Sequence Data by Janelle C. Sands Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Master of Engineering in Electrical Engineering and Computer Science at the MASSACHUSETTS INSTITUTE OF TECHNOLOGY May 2020 © Massachusetts Institute of Technology 2020. All rights reserved.

Author...... Department of Electrical Engineering and Computer Science May 12, 2020

Certified by...... Michael Scott Cuthbert Associate Professor Thesis Supervisor

Accepted by ...... Katrina LaCurts Chair, Master of Engineering Thesis Committee

Efficient Optical Music Recognition Validation using MIDI Sequence Data by Janelle C. Sands

Submitted to the Department of Electrical Engineering and Computer Science on May 12, 2020, in partial fulfillment of the requirements for the degree of Master of Engineering in Electrical Engineering and Computer Science

Abstract

Despite advances in optical music recognition (OMR), resultant scores are rarely error-free. The power of these OMR systems to automatically generate searchable and editable digital representations of physical sheet music is lost in the tedious manual effort required to pinpoint and correct these errors post-OMR, or even to just confirm no errors exist. To streamline post-OMR error correction, I developed a corrector to automatically identify discrepancies between resultant OMR scores and corresponding Musical Instrument Digital Interface (MIDI) scores and then either automatically fix errors or, in ambiguous cases, notify the user to manually fix errors. This tool will be open source, so anyone can contribute to further improving the accuracy of OMR tools and expanding the amount of trusted digitized music.

Thesis Supervisor: Michael Scott Cuthbert Title: Associate Professor

Acknowledgments

Thank you to my advisor, Professor Cuthbert, for sharing your enthusiasm, expertise, and encouragement throughout this project. I am grateful for the invaluable learning experience and opportunity to engage in such an exciting interdisciplinary field under your advising.

I’m grateful to all of my professors and teachers for all you have taught me. Thank you for imparting a joy in learning and pushing me to grow. I’d especially like to thank the many music educators and musicians who have inspired and instilled in me this love of music. What you have taught me is not only integral to this work, but also my identity.

I’d also like to thank the incredible network of friends who have been there for me through this crazy and wonderful MIT adventure. Your pep talks, kindness, listening ears, advice, cooking, and humor made even the most stressful times okay.

Finally, thank you to my family for your love and support. I have leaned on your unwavering encouragement and vast collective knowledge in every area this project touches. I couldn’t imagine a better family to be by my side through this project and life. Thank you.

Contents

1 Introduction
1.1 The Overture: An Illustrative Example
1.2 The Problem: Trading Off Time versus Accuracy
1.3 The Overlooked Opportunity: Improving OMR with MIDI

2 Existing Digital Music Technology
2.1 History of Optical Music Recognition (OMR)
2.2 Demystifying OMR Systems
2.3 A Post-OMR Step
2.4 Algorithms for Aligning Two Different Versions of the Same Score
2.5 Symbolic Music Encoding
2.6 Human in the Loop
2.7 Useful Tools
2.7.1 music21 Toolkit
2.7.2 SmartScore: Accessible OMR
2.7.3 Viewing and Editing Music with MuseScore
2.7.4 IMSLP: Hundreds of Thousands of Music Scans
2.7.5 Online Classical Archives: Countless Works Without Visual Notation

3 Design and Implementation of a System for Efficient OMR Output Correction and Validation
3.1 Goals of OMR Validation
3.1.1 Goal 1: Accuracy is Guaranteed
3.1.2 Goal 2: Manual Intervention is Efficient
3.1.3 Goal 3: Automatic Correction is Reasonably Efficient
3.1.4 Goal 4: Easy for Developers to Expand
3.2 System Overview
3.2.1 An Accessible System
3.2.2 Automatic Fixing
3.2.3 Manual Fixing
3.2.4 Expected Output
3.3 Intended Inputs
3.3.1 One Part at a Time
3.3.2 Input Correctness Assumptions
3.3.3 Supported Instruments
3.4 Preliminary Fixing
3.4.1 Measure Alignment
3.4.2 Holistic Fixing
3.5 Tracking the Correction History
3.5.1 Aligning OMR and MIDI Parts
3.5.2 Fixing a Correction Generation
3.5.3 Creating the Next Generation
3.6 Automatic Fixing
3.6.1 The Process
3.6.2 Challenges
3.7 Manual Fixing and Approval
3.7.1 Correction Statuses For Display
3.7.2 Display Layout
3.7.3 Editing the Display
3.7.4 Rebuilding The Corrected Score

4 Error Fixers
4.1 Identifying Errors
4.1.1 Challenges With Existing Visualizations
4.1.2 A Better Way To Display Discrepancies
4.2 Classifying Errors
4.3 Initial Fixers for Measure Alignment
4.3.1 Shifting to Account for a Pickup Measure
4.3.2 Collapsing Repeats
4.3.3 Fixing Misread Multirests
4.4 Other Initial Fixers
4.4.1 Resolution
4.4.2 Resolution
4.4.3 Replacing Intruding Guitar
4.5 Iterative Fixers
4.5.1 Rest Representation Discrepancies
4.5.2 Misread Accidentals
4.5.3 Undoing the Rhythmic Harm of a Misread Dot
4.5.4 Articulations Marked By Different
4.5.5 Finding Missing MIDI Notes
4.5.6 Ornaments
4.6 Developers Guide to Fixers
4.7 Existing Fixers Fix Some, but Not All Discrepancies

5 Evaluating Usefulness
5.1 Evaluation Procedure
5.2 Metrics for Evaluation
5.2.1 Quantity of Content to Review
5.2.2 Accuracy Improvement due to Automatic Correction
5.3 Applying the Corrector to Quartet Music
5.3.1 Haydn’s First Quartet
5.3.2 Mozart’s First Quartet
5.4 Generalizing Results
5.4.1 Changes in Accuracy From Automatic Fixing
5.4.2 Discrepancies Can be Resolved and Some Errors Fixed
5.4.3 Effectively Reducing the Quantity of Content to Review
5.5 The Role of Human Error in MIDI Files
5.6 Truths Are Not Necessarily Ground Truths
5.7 Unlocking a New Piece of Music

6 Conclusion

A Example Corrector Display Score: Haydn Quartet
A.1 Haydn Quartet Op.1, No.1, i: Violin 1
A.2 Haydn Quartet Op.1, No.1, i: Violin 1
A.3 Haydn Quartet Op.1, No.1, i: Violin 1
A.4 Haydn Quartet Op.1, No.1, i: Violin 1
A.5 Haydn Quartet Op.1, No.1, i: Violin 2
A.6 Haydn Quartet Op.1, No.1, i: Violin 2
A.7 Haydn Quartet Op.1, No.1, i: Violin 2
A.8 Haydn Quartet Op.1, No.1, i: Violin 2
A.9 Haydn Quartet Op.1, No.1, i: Viola
A.10 Haydn Quartet Op.1, No.1, i: Viola
A.11 Haydn Quartet Op.1, No.1, i: Viola
A.12 Haydn Quartet Op.1, No.1, i: Viola
A.13 Haydn Quartet Op.1, No.1, i: Cello
A.14 Haydn Quartet Op.1, No.1, i: Cello
A.15 Haydn Quartet Op.1, No.1, i: Cello

B Example Corrector Display Score: Mozart Quartet
B.1 Mozart Quartet No.1, ii: Violin 1
B.2 Mozart Quartet No.1, ii: Violin 1
B.3 Mozart Quartet No.1, ii: Violin 1
B.4 Mozart Quartet No.1, ii: Violin 1
B.5 Mozart Quartet No.1, ii: Violin 1
B.6 Mozart Quartet No.1, ii: Violin 1
B.7 Mozart Quartet No.1, ii: Violin 2
B.8 Mozart Quartet No.1, ii: Violin 2
B.9 Mozart Quartet No.1, ii: Violin 2
B.10 Mozart Quartet No.1, ii: Violin 2
B.11 Mozart Quartet No.1, ii: Violin 2
B.12 Mozart Quartet No.1, ii: Violin 2
B.13 Mozart Quartet No.1, ii: Viola
B.14 Mozart Quartet No.1, ii: Viola
B.15 Mozart Quartet No.1, ii: Viola
B.16 Mozart Quartet No.1, ii: Cello
B.17 Mozart Quartet No.1, ii: Cello
B.18 Mozart Quartet No.1, ii: Cello

List of Figures

1-1 Excerpt of original sheet music for the viola part from the beginning of Haydn’s Op. 1 No. 1 Quartet movement 1
1-2 Excerpt from the MIDI file for the viola part from the beginning of Haydn’s Op. 1 No. 1 Quartet movement 1
1-3 OMR generated excerpt for the viola part from the beginning of Haydn’s Op. 1 No. 1 Quartet movement 1

3-1 Code for using the corrector
3-2 Marked correction statuses and their respective intervention cues
3-3 Difficult to detect differences without visual cues
3-4 Revealing of mistakes from Figure 3-3
3-5 A display including all four quartet parts is difficult to navigate
3-6 A display score for a single quartet part is easy to understand
3-7 Example of a substantially incorrect OMR output
3-8 Example of alignment changes and identifying problematic OMR elements
3-9 Example of a page in the display score
3-10 Example of a delayed discrepancy

4-1 Original display of aligned parts requires toggling between tabs
4-2 Improved stacked score display in MuseScore for identifying discrepancies
4-3 Viewing a display parts with different pickup measures is difficult
4-4 Content repeated in the Haydn MIDI part that needs collapsing
4-5 OMR misreading a as a chord from Haydn
4-6 OMR misreading a multirest as a single bar of rest from Haydn
4-7 Examples of key signatures
4-8 Example of key signature applied to a B flat major scale
4-9 Key signature discrepancy between OMR and MIDI from Haydn
4-10 Misread rehearsal mark as guitar from Haydn
4-11 Rest discrepancy from Haydn
4-12 Set of common accidentals to fix
4-13 An example of enharmonic notes
4-14 Logic to fix accidental errors
4-15 Additional incorrect dots cause errors
4-16 Different rhythmic representations of a short quarter note from Haydn
4-17 Example of MIDI omitting redundant parts from Haydn
4-18 Variety of ornament discrepancies from Haydn
4-19 Resolving a missing turn
4-20 Resolving a missing inverted turn
4-21 Resolving a missing eighth note
4-22 Resolving a missing sixteenth note tremolo
4-23 Resolving a missing trill
4-24 Resolving a missing inverted trill
4-25 Resolving a missing enharmonic inverted trill
4-26 Resolving a discrepancy caused by notating a trill differently
4-27 Resolving a missing nachschlag trill
4-28 Resolving a missing mordent
4-29 Resolving a missing inverted mordent

A-1 Violin 1 scan from Haydn’s Quartet Op. 1 No. 1, Movement 1
A-2 Violin 2 scan from Haydn’s Quartet Op. 1 No. 1, Movement 1
A-3 Viola scan from Haydn’s Quartet Op. 1 No. 1, Movement 1
A-4 Cello scan from Haydn’s Quartet Op. 1 No. 1, Movement 1

List of Tables

3.1 Logic for updating correction status based on the previous status and current discrepancies

4.1 Table of some potential OMR-MIDI discrepancies and the ways to identify and resolve them

5.1 Accuracy change from Haydn automatic fixing
5.2 Accuracy change from Mozart automatic fixing
5.3 Elements to manually correct from Haydn
5.4 Elements to manually correct from Mozart

Chapter 1

Introduction

1.1 The Overture: An Illustrative Example

Figure 1-1: Excerpt of original sheet music for the viola part from the beginning of Haydn’s Op. 1 No. 1 Quartet movement 1 [19]

It is an hour before doors open for the quartet’s recital. Both of the violinists and the cellist are unpacking their instruments for soundcheck when a phone rings. It is the violist. She is still at home. She is too sick to make the performance. Her ensemble members caution her to stay home, then snap into action to salvage the performance. Their teacher serendipitously is a violinist himself, with his violin on hand. He has subbed for the first violinist before and plays well with the group. The only hiccup is that he is not comfortable reading alto clef, so if he were to sightread the viola part, he would make errors. The members of the quartet hold themselves to a high standard. What does not cause the non-performer audience members to cringe will cause those familiar with the piece to lose respect. The teacher will only agree to play if the viola part is written using the treble clef he is familiar with, so he can easily and accurately play the part. It would take only a few mouse clicks using music notation software to modify the part if they had a digital representation of the original part, but they do not. The violist can scan and send images of the physical sheet music from home, but the computer cannot change the clef, as the image is simply a sequence of numbers representing the color of each image pixel. Someone would still need to painstakingly input each note into the music notation software. There is no time for that, and moreover, if there were, there would be a chance the person inputting the music could miscopy a note.

Figure 1-2: Excerpt from the MIDI file for the viola part from the beginning of Haydn’s Op. 1 No. 1 Quartet movement 1. Some of many errors are marked. There is an incorrect key signature (pink), missing notes (brown), incorrect notes (red), and even an inappropriate bass (orange). Blue shows the part should be shifted to the left. Green shows where a repeat sign should exist [20].

The teacher finds that there are several digital representations on the web and some even use treble clef for the viola part, but they are all MIDI. Upon opening and looking at several, he finds they are all a headache to read! Since MIDI files are generated primarily for audio playback, the visual notation is obtuse. Notes are represented in technically correct but abnormal ways. Even with the treble clef, reading the MIDI will be just as difficult as reading the original viola part. He cannot perform this without making mistakes.

Figure 1-3: OMR generated excerpt for the viola part from the beginning of Haydn’s Op. 1 No. 1 Quartet movement 1. Some of many errors are marked. There is an incorrect time signature (pink), extraneous key signature (dark blue), incorrect pitch (red), missing bars of rest (light blue), and a rogue guitar tablature (green) [19].

One of the sound technicians overhears the dilemma and suggests a tool to help them solve the problem: an optical music recognition (OMR) system called SmartScore [1]. All they need to do is run the music photos through the software to generate a digital MusicXML file [15], which can be easily understood and modified by music notation software. SmartScore could digitize the music in about a minute. The musicians agree. The violist scans her part at home and sends it to the technician. She inputs it into the software and the resultant score appears in digital format. It looks more or less correct, but there are some glaring errors, such as incorrect pitches. The quartet is devastated. If they can see some errors, how can they be sure less noticeable errors are not hiding throughout the part? They have no time to comb through the part to identify and fix all errors. What the quartet needs is to quickly generate a 100% accurate digital part they can trust to make this performance happen. Without a tool to do this, the quartet has no choice but to cancel the performance.

1.2 The Problem: Trading Off Time versus Accuracy

Musicians lack an efficient way to accurately digitize music. Digitized music is useful. It is editable, meaning one piece of digital music can be easily rearranged for performance by a different set of instrumental performers, increasing the variety of available music. It is also searchable, empowering computational musicologists to analyze compositions for interesting trends and patterns [24].

One option for digitizing music is to manually input music one note at a time. However, this process is tedious, time consuming, and prone to human errors. Thus, many musicians have been hopeful for the potential of OMR systems to automate the process [25].

While OMR systems have seen much improvement over the years, they lag behind optical character recognition (OCR) systems and are still error prone. Although many commercial OMR systems boast nearly 100% accuracy [8], systems combining these outputs for a more accurate output only achieve 95% accuracy [32]. This discrepancy can be explained by the fact that there is no standard evaluation for OMR systems, and the accuracy is highly contingent on how errors are counted, the complexity of the pieces used for evaluation, and the quality of the scans [7]. Thus, even systems boasting nearly perfect accuracy may only be nearly perfect in specific situations, and most systems struggle on less cleanly scanned and more complex parts, most noticeably handwritten music.

To be useful to and be accepted by the music community, OMR systems need assistance to achieve accuracy at least as good as existing publishers, or essentially 100% accuracy, or be able to be fixed to 100% accuracy easily. While advances in computer vision techniques could decrease errors, the only way to guarantee an accurate score is to manually comb through the resultant OMR score and check every note against the original score. This is nearly as tedious as entering each note by hand.

1.3 The Overlooked Opportunity: Improving OMR with MIDI

Classical music MIDI files are abundantly available for download [36], but lack visual encoding information necessary for producing readable sheet music [35]. By aligning these MIDI files with corresponding OMR outputs, which do encode visual information, OMR outputs can be efficiently validated. This validation is done in two stages: an initial automatic discrepancy identification and resolution stage using a suite of fixers, and a final manual approval phase in which a user approves automatic fixes and corrects unresolved discrepancies. Since error resolution is complex and sometimes ambiguous, human intervention is necessary. The corrector system enables efficient user validation by resolving as many discrepancies as possible and identifying limited regions for review. The novel corrector system design, implementation, and use are described below. It combines the strengths of MIDI encoding, OMR systems, algorithmic automation, and human analytical capabilities to overcome each method’s weaknesses, thereby expanding the library of validated and useful digital music.

Chapter 2

Existing Digital Music Technology

2.1 History of Optical Music Recognition (OMR)

Digitizing centuries of creative content has challenged researchers for decades. The field of optical character recognition (OCR), which focuses on digitizing text documents, emerged in the 1950s and experienced slow development due to its low initial accuracy [23]. However, in the last 30 years, OCR has seen the development of robust and reliable technologies, which have digitized countless handwritten and printed texts [23]. Optical music recognition (OMR) is often described as OCR for music [8]. Since its emergence in the 1960s, the field of OMR has made significant, but insufficient, advancements [33]. Despite much hype, many OMR systems are limited, only supporting a simple set of musical symbols in high quality scans of cleanly printed sheet music [7]. Researchers seek to extend the practicality of OMR systems to support handwritten music [34], varying image qualities, more extensive sets of musical notation, and more complex musical scores.

2.2 Demystifying OMR Systems

Most OMR systems can be divided into four stages: initial image preprocessing, segmentation of musical symbols, recognition of those music symbols, and the reconstruction of recognized symbols into digital music notation according to their structure on the page [33]. In the preprocessing and segmentation stages, researchers have developed better computer vision algorithms, such as improving binarization of the image and staff line detection and removal [3]. For recognition, researchers have experimented with size, orientation, and font-invariant symbol recognition, as well as ways to expand the number of supported symbols [33]. To further improve recognition, researchers used techniques such as k-Nearest Neighbors, neural nets trained on extensive data sets, and rules-based recognition [33]. Music notation is extensive, with standard music notation supporting a large set of symbols, but many symbols, such as notation for emerging modern techniques or specific instruments, styles, or time periods, are not yet supported in music software. Individual researchers have sought to improve recognition of certain types of symbols such as lute notation [12]. Structure reconstruction is particularly tricky because music’s meaning is derived from combining symbols. For example, the key signature at the beginning of a line can affect the actual pitch of a note transcribed a distance away.

Several software packages have made OMR techniques available to the public. These include SharpEye, SmartScore, PhotoScore, Audiveris, NoteScan in Nightingale, and MIDIScan in Finale [33]. While these and other systems developed by researchers claim varying high accuracies, typically above 90%, Byrd suggests that the lack of an OMR evaluation standard makes it difficult to compare and understand the accuracy of these systems [7]. However, it is clear none of the systems have 100% accuracy, or a number confidently close. To raise the accuracy of these OMR systems to 100%, an additional post-OMR step can be introduced.

2.3 A Post-OMR Step

Introducing an additional post-OMR processing step to improve accuracy is a technique used by other OMR researchers. For example, M. Church’s system corrects rhythm errors post-OMR by finding similar rhythms earlier or later in a piece to correct measures with incorrect rhythms [10]. Another example of post-OMR processing is E. Zhang’s OMRMIDICorrector, which corrects pitch mistakes due to accidental misreadings in resultant OMR scores by referencing a MIDI encoding of the same piece [37].

The idea of merging various outputs post-OMR is an established approach for improving accuracy and is inspired by similar post-OCR techniques. OCR researchers have sought to combine outputs of OCR systems using three steps: detecting an error as a discrepancy between outputs, finding the set of possible corrections, and then choosing the most likely correction [26]. Not only are different OCR systems used, some researchers also used different inputs, generated by applying filters to the original source image [4].

Combining multiple outputs has been explored for OMR as well. An early attempt was made to merge the outputs of several imperfect OMR software systems using a voting protocol weighted by the overall accuracy of each system [6]. In order to compare each software’s vote, the music needed to be aligned. However, at the time of that research, there was little success in aligning music. Until the development of a reliable alignment system robust enough to align the music even with high errors, these researchers determined that merging outputs is not a feasible way to improve the accuracy. Later on, utilizing a robust music alignment algorithm, other researchers successfully increased post-OMR pitch and rhythm accuracy to 95% by using multiple OMR recognizers [32].
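As a rough illustration of accuracy-weighted voting, the sketch below picks, for one already-aligned symbol, the reading backed by the largest total weight. It is only a minimal sketch of the general idea and is not the protocol of [6]; the system names, accuracy values, and string readings are hypothetical.

```python
from collections import defaultdict


def weighted_vote(readings, weights):
    """Return the reading backed by the largest total weight.

    readings: dict mapping system name -> that system's reading of one symbol
    weights:  dict mapping system name -> overall accuracy of that system
    """
    totals = defaultdict(float)
    for system, reading in readings.items():
        totals[reading] += weights[system]
    # Highest total weight wins; ties are broken alphabetically so the
    # result is deterministic.
    return min(totals, key=lambda reading: (-totals[reading], reading))


# Hypothetical systems and accuracies, for illustration only.
weights = {"system_a": 0.90, "system_b": 0.80, "system_c": 0.75}
readings = {
    "system_a": "C4 quarter",
    "system_b": "C#4 quarter",
    "system_c": "C#4 quarter",
}
winner = weighted_vote(readings, weights)
print(winner)
```

Here the two less accurate systems together outweigh the single most accurate one, which is why such a scheme depends on the readings being aligned symbol by symbol first.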

2.4 Algorithms for Aligning Two Different Versions of the Same Score

Aligning music and identifying differences has many applications, such as evaluating and correcting OMR outputs [14]. Alignment algorithms find a series of edits which could be applied to one version of music to change it into the other version of music [14]. These algorithms can also be used to measure the similarity between two versions of music [37].

To align music, complex musical notation must be abstracted into hashable, and consequently, comparable data types, such that only the relevant information is compared. One alignment technique hashes visual properties of music, such as type, note position, slurs, and beaming, into hierarchical tree structures for comparison [14]. E. Zhang’s StreamAligner hashes duration and pitch frequency of notes and rests [37]. The StreamAligner finds global alignment using a distance matrix populated with hash edit distances and produces a change list that provides information on how to edit the aligned source version to become the aligned target version.
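To make the idea of a distance matrix and change list concrete, here is a minimal sketch of global alignment by edit distance over hashed notes. It is not the StreamAligner implementation: the `hash_note` and `align` helpers are hypothetical, notes are reduced to (pitch, duration) tuples, and every edit is assumed to cost 1.

```python
def hash_note(pitch, quarter_length):
    """Reduce a note (or a rest, with pitch=None) to a comparable tuple."""
    return (pitch, quarter_length)


def align(source, target):
    """Globally align two hashed sequences by edit distance.

    Returns (distance, changes), where changes is an ordered list of
    (operation, source_item, target_item) tuples.
    """
    n, m = len(source), len(target)
    # dist[i][j] = number of edits needed to turn source[:i] into target[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # delete from source
                dist[i][j - 1] + 1,         # insert into source
                dist[i - 1][j - 1] + cost,  # keep or substitute
            )
    # Walk back from the bottom-right corner to recover the change list.
    changes = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if source[i - 1] == target[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                op = "keep" if cost == 0 else "substitute"
                changes.append((op, source[i - 1], target[j - 1]))
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            changes.append(("delete", source[i - 1], None))
            i -= 1
        else:
            changes.append(("insert", None, target[j - 1]))
            j -= 1
    changes.reverse()
    return dist[n][m], changes


# Hypothetical OMR and MIDI readings of the same bar: MIDI pitch numbers
# paired with durations in quarter notes.
omr = [hash_note(60, 1.0), hash_note(62, 1.0), hash_note(64, 2.0)]
midi = [hash_note(60, 1.0), hash_note(63, 1.0), hash_note(64, 2.0)]
distance, changes = align(omr, midi)
for change in changes:
    print(change)
```

The change list pinpoints the one note where the two versions disagree, which is the kind of localized discrepancy a fixer or a human reviewer can then examine.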

2.5 Symbolic Music Encoding

Development of standard music encodings, such as the file types of MusicXML and MIDI, has been critical to the field of OMR. While music can be stored as PDFs or image files, this encoding of pixels lacks musical meaning. Images and PDFs, hence, do not lend themselves to musical playback, easy editing such as transposing all notes a step higher, or searching for specific patterns of notes [15].

The Notation Interchange File Format (NIFF) meets this need by enabling encoding of both graphical properties, such as position and layout, and musical properties, such as pitch and duration [17]. However, despite OMR systems supporting NIFF output, the encoding has become virtually obsolete. This is because NIFF’s emphasis on encoding notes as graphical elements instead of as musical elements hinders integration with music editing software [16]. Conversely, symbolic file types like MusicXML and MIDI enable computers to encode and communicate musical concepts.

Developed in the 1980s, MIDI is a protocol for communicating musical data [35]. MIDI files store sequences of instructions, such as note on and note off, for a performer or device to generate music. Each instruction includes a time and a message [13]. The time is the amount of time in metronome clicks to wait before reading the next message. The message contains a command, followed by data. The command is the particular action which should be taken, and the data contains parameters specifying how the action should be done. For example, a common command is note on, which has parameters note velocity and pitch. Velocity is how intensely the note is sounded, or in other words, its volume. Pitch is how high or low a note sounds, which is related to the frequency of the vibrations. MIDI encodes pitch as a number, with Middle C, also known as C4, as 60. There are unique MIDI numbers for each key on the piano. MIDI files also encode important metadata at the beginning of the file, such as the time signature and the metronome marking, which dictates the speed of the commands. MIDI transcriptions are created by recording key presses from a musician into an instrument controller, such as a keyboard, or manually using music notation software [35].

MIDI files are widely useful in encoding musical information, but are not a perfect encoding, as visual information cannot be encoded within the parameters. As Michael Good, the founder of MusicXML, states, “MIDI contains sophisticated representations for how to make music sound on an instrument, but primitive representations for how to make music appear on a printed page” [16].

MusicXML was invented in 2000 to encode not only how music is played back, but also how a score looks. Since then, it has become an accepted standard for music encoding and is supported by all major music notation software [15]. However, its recent emergence as a music encoding standard means there is still a lot of music encoded as MIDI files. Moreover, more MIDI files continue to be generated since many existing instruments do not support the MusicXML output.
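The following sketch illustrates the instruction format described above using plain Python tuples rather than a real MIDI parser. The event layout (delta ticks, command, pitch, velocity) and the helper names are assumptions made for illustration; each delta is taken to be the wait before its own message, and pitch numbers follow the convention that 60 is C4.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def midi_to_name(number):
    """Name a MIDI pitch number, using the convention that 60 is C4."""
    octave = number // 12 - 1
    return f"{NOTE_NAMES[number % 12]}{octave}"


def notes_from_events(events):
    """Pair note_on and note_off events into (name, start_tick, length_ticks)."""
    now = 0
    started = {}  # pitch number -> tick at which the note began sounding
    notes = []
    for delta, command, pitch, velocity in events:
        now += delta  # advance the clock by the wait before this message
        if command == "note_on":
            started[pitch] = now
        elif command == "note_off":
            start = started.pop(pitch)
            notes.append((midi_to_name(pitch), start, now - start))
    return notes


# Hypothetical event stream: (delta ticks, command, pitch, velocity).
events = [
    (0, "note_on", 60, 64),
    (480, "note_off", 60, 0),
    (0, "note_on", 69, 64),
    (240, "note_off", 69, 0),
]
notes = notes_from_events(events)
print(notes)
```

Nothing in these events says how the notes should be drawn: whether a pitch is spelled as a sharp or a flat, which clef is used, or how durations are beamed must all be guessed by whatever software renders the file.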

2.6 Human in the Loop

Some researchers propose that in order to achieve the necessary accuracy, human intervention is necessary post-OMR, since music sometimes breaks its own rules and has an extensive symbol set with many rarely used symbols [9]. Furthermore, it seems some mistakes are more easily fixed by a person who can quickly and easily identify and remedy the issue. This is evident in E. Zhang’s suggestion for a deletion fixer that detects measures which are badly incorrect and deletes the entire measure, so a person can re-enter the notation correctly [37].

2.7 Useful Tools

There is an abundance of musical computer tools available for developing a post-OMR correction system. This project relies on the music21 Python library for development, the SmartScore OMR system for producing resultant OMR scores, public domain internet content for music inputs, and MuseScore for user interaction.

2.7.1 music21 Toolkit

The corrector system was built using music21. Music21 is a Python library for analyzing music [11]. It comprises tools for editing and analyzing music via Python, such as tools for alignment [37]. Music21 also provides a structure for storing music, which can be built by specifying code commands or by translating from existing music encodings, like MusicXML or MIDI. This structure can be exported as those same standard encodings. Music21 represents music as stream objects. A stream represents different hierarchical musical structures, such as a score, part, or measure. Streams can contain other streams or musical elements like notes, rests, or chords. Musical elements can have properties associated with them, such as an ornamental expression or a lyric.

2.7.2 SmartScore: Accessible OMR

While the corrector can correct a resultant score from any OMR system which outputs MusicXML files, the OMR system I used to generate resultant scores was SmartScore [1]. SmartScore processes inputted TIFF scanned image files into symbolic music encodings, displaying both the input and output in its interface. The resultant OMR score can be exported into a music encoding such as MusicXML. SmartScore is far from perfect, but it can decently convert high quality scans of simple scores. However, sometimes SmartScore is unable to read a page at all or disastrously misreads entire lines of music. Other times, noise on the page can cause a significant number of extraneous notes, making the output score difficult to read and understand.

2.7.3 Viewing and Editing Music with MuseScore

MuseScore is a free and open source music notation software with an intuitive interface for creating, editing, and playing back music [5]. MuseScore can be used for manual music entry: users can specify score structure, such as the number of measures and the set of parts, and then enter notes and other musical elements. Element inputting can be done using drag and drop palettes or keyboard shortcuts, making the interface accessible to novices and efficient for experts. Users can use MuseScore to edit inputted notes, as well as fix incorrect notes in an OMR output. MuseScore supports all standard music encoding types, such as MIDI and MusicXML. Furthermore, its scripting capabilities mean it could be seamlessly incorporated in an automated workflow.

2.7.4 IMSLP: Hundreds of Thousands of Music Scans

The International Music Score Library Project (IMSLP) is a free online library with hundreds of thousands of public domain sheet music scans [18]. The library includes entire scores, single instrument parts, and arrangements available for download. These files can be used as inputs for OMR systems like SmartScore. While the scans vary in quality and legibility, scores which are cleanly engraved and scanned with high density scanners are likely to produce low-error OMR resultant scores. I am grateful for the support of IMSLP’s founder, E. Guo, for allowing exceptions to rate limiting to enable automated collection of scans for this project. Such automation can streamline the process of expanding the amount of publicly available, trusted, digitized music.

2.7.5 Online Classical Archives: Countless Works Without Visual Notation

The Classical Archives has a large collection of community-contributed MIDI files available for download [36]. Coupled with the extensive repertoire available on IMSLP, there is huge potential to generate and validate an unprecedented volume of useful MusicXML files.

Chapter 3

Design and Implementation of a System for Efficient OMR Output Correction and Validation

To efficiently validate OMR-generated digital music, I created a post-OMR corrector. The corrector system identifies discrepancies between a part of music optically generated using OMR and that same part of music already stored as MIDI digitally sequenced audio. Using those discrepancies, the OMR part is corrected in two stages: automatically according to a set of fixing rules and manually by the user. The system first applies fixing rules to automatically fix discrepancies and marks those fixes and remaining discrepancies on the OMR part. A user then views this marked and modified OMR part alongside the original OMR and MIDI parts to approve fixes and manually resolve the discrepancies that were not resolved automatically. The final corrected OMR part can then be extracted and used.

3.1 Goals of OMR Validation

Such a tool for OMR validation can empower individual musicians to creatively modify any piece of music for performance and enable communities of musicians to expand and enhance the library of digital music. To be useful, the system should produce high accuracy music encodings from OMR outputs within a reasonable amount of time while limiting human effort. Additionally, the underlying code should enable easy, incremental community contribution to expand the system’s functionality.

3.1.1 Goal 1: Accuracy is Guaranteed

Music transcriptions must be accurate, and the same accuracy is needed for the digital versions created using OMR. Many musical pieces involve multiple voices performing their own parts simultaneously. If even one part has a note with an incorrect pitch, that note could clash cacophonously with the other parts. The probability of a mistake at any instant in an ensemble of $p$ parts playing from a piece of music with error rate $e$ is $1 - (1 - e)^p$, where $e$ is a probability between 0 and 1, $p$ is a positive integer, and errors in different parts are assumed to be independent. With a 5% error rate and a 20-part ensemble, there is an over 60% chance that there is a mistake in one of the parts at any instant, which could sound terrible. Furthermore, if one voice skips a note, it becomes unaligned with the ensemble, causing the same cacophonous result. Musicians take pride in trying to achieve perfection, and hitting all the right notes is a fundamental requirement of that goal. If the notes are transcribed incorrectly, it is almost impossible for the musicians to play the intended correct notes. For music analysis, high accuracy is also required to find matches when searching for musical queries. Hence, it is understandable that applications of digitized music from OMR demand high accuracy. However, correcting these small errors is tedious and time consuming, requiring careful human comparison of every digitally transcribed note to its original. Even if a part produced from an OMR system does not contain any mistakes, the same careful examination must be done to guarantee that the music is accurate enough for musicians to accept and read it. Thus, this system needs not only to produce accurate OMR parts, but also to provide a guarantee of that accuracy.
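As a quick check of the ensemble figure quoted above, the probability can be computed directly. This is a minimal sketch: the function name is illustrative and not part of the corrector, and it assumes that errors in different parts are independent.

```python
def ensemble_error_probability(error_rate: float, num_parts: int) -> float:
    """Probability that at least one of num_parts independent parts is wrong at a given instant."""
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    if num_parts < 1:
        raise ValueError("num_parts must be a positive integer")
    # Chance that every part is correct is (1 - e)^p; the complement is at least one mistake.
    return 1.0 - (1.0 - error_rate) ** num_parts


# 5% error rate, 20-part ensemble
probability = ensemble_error_probability(0.05, 20)
print(f"{probability:.3f}")  # prints 0.642
```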

3.1.2 Goal 2: Manual Intervention is Efficient

Perfect accuracy is only possible with human intervention because there are cases of ambiguity. While human effort is required, it should not be overused, and the time required to manually fix the automatically fixed part should be minimized. This can be achieved by minimizing the amount of content the user needs to review, the amount of time to approve each fix, and the amount of time to fix each remaining error. An easy-to-use program is integral to this goal.

3.1.3 Goal 3: Automatic Correction is Reasonably Efficient

While efficiency is important, the time required to automatically and manually fix a part is a reasonable trade-off to ensure high accuracy. The total time needed to generate a part from an OMR system and fix it with a corrector system should be less than the time needed to manually input the score. People already need to convert physical sheet music to digital formats and do so with existing time-intensive or error-prone techniques, so any workflow which improves accuracy without requiring more time is useful. Therefore, optimizing program run-times is not a high priority, as long as the automatic fixing stage can complete within the time it would take to perform the piece.

3.1.4 Goal 4: Easy for Developers to Expand

Ultimately, I am creating tools and an infrastructure that the music and programming community can easily contribute to, in order to make incremental efficiency and quality improvements to the OMR process. The initial implementation only accounts for a subset of possible OMR errors I have discovered from examining a few simple pieces of music. Future users will likely discover further OMR errors, which could also be automatically corrected. Alternatively, future users might want to customize fixer logic to perform better under a set of assumptions specific to their use cases. A corrector system that encourages collective contributions and enables individual customization is powerful. Thus, the developer interface must be easy to use.

3.2 System Overview

3.2.1 An Accessible System

The corrector is a Python class developed using the music21 library [11]. It calls a set of fixer functions also developed using the same library. The corrector validates an inputted MusicXML file generated from an OMR system using an inputted MIDI file. The inputted MIDI file, while originally encoded in the MIDI format, should first be converted to a MusicXML file for better processing in music21. Note that while MusicXML files encode visual information, the MusicXML copy of the MIDI file will not automatically infer the correct visual properties. This conversion can be done by opening the MIDI file in MuseScore and exporting it as a MusicXML file [5]. Since MuseScore supports scripting, this process can be automated to efficiently convert many MIDI files.

Since the corrector only corrects a single part of music at a time but accepts both scores of parts and single parts, an optional part number argument can specify which part should be used. When not specified, the corrector uses the first or only part. The corrector also optionally accepts a MusicXML ground truth file. Since the ground truth file encodes the state of the OMR file once fully corrected, it is only included as a reference to validate system behavior or to debug during further development. When an already-correct MusicXML file exists, it is redundant to correct an OMR-generated file.

A user instantiates a corrector instance in a Python file with loaded music21 streams corresponding to the OMR and MIDI files. Then, the user can call the corrector’s set up and fixing functions on that instance. The fix function returns the final score and a list of intermediary scores for manual review. These all have the format of a score of parts. Those parts are the original OMR part, the original MIDI part, the modified OMR part, and optionally, the ground truth part. Any of these scores can be displayed immediately in a music editor pop-up or saved as a file for later review.
Example Python code below demonstrates corrector use. The corrector can also be included in a Python script, which can be run from the terminal to correct a single piece of music or an entire directory structure of music. Because of this, the corrector supports integration into an automated workflow for greater efficiency.

    from music21 import *

    # omrMidiCorrector is the corrector module described in this chapter;
    # it must be importable alongside music21.

    # Load the OMR output and the MusicXML copy of the MIDI file
    omr = converter.parse("omrFilename.xml")
    midi = converter.parse("midiFilename.xml")
    partNum = 0
    groundTruth = None

    corrector = omrMidiCorrector.Corrector(omr, midi, partNum, groundTruth)
    corrector.setUp()
    finalScore, intermediaryScores = corrector.fix()

    # Pops open the final score in a music editor for manual review
    # Score can be saved from that editor
    finalScore.show()

    # Or save the score for manual review later
    finalScore.write("xml", "correctedScore.xml")

    # Can also extract and save only the corrected OMR part
    correctedOmrPart = finalScore.parts[2]  # 3rd part in 0-indexed parts
    correctedOmrPart.write("xml", "correctedOmr.xml")

Figure 3-1: Code for using the corrector
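The MIDI-to-MusicXML conversion step mentioned in this section could be scripted along the following lines. This is a sketch under stated assumptions, not the workflow used in this project: it relies on MuseScore's command-line `-o` export option, the executable name (`mscore` here) differs between installations, and the helper names, directory names, and output extension are hypothetical.

```python
import subprocess
from pathlib import Path
from typing import List


def build_convert_command(midi_path: Path, output_dir: Path,
                          executable: str = "mscore",
                          extension: str = ".musicxml") -> List[str]:
    """Build (but do not run) a MuseScore command that exports one MIDI file."""
    output_path = output_dir / (midi_path.stem + extension)
    # "-o" asks MuseScore to export to the given file; the format follows the extension.
    return [executable, "-o", str(output_path), str(midi_path)]


def convert_directory(midi_dir: Path, output_dir: Path,
                      dry_run: bool = True) -> List[List[str]]:
    """Build commands for every .mid file in midi_dir; run them unless dry_run is set."""
    commands = [build_convert_command(path, output_dir)
                for path in sorted(midi_dir.glob("*.mid"))]
    if not dry_run:
        output_dir.mkdir(parents=True, exist_ok=True)
        for command in commands:
            subprocess.run(command, check=True)
    return commands


# Hypothetical file names, used only to show the shape of the command
example = build_convert_command(Path("scores/quartet.mid"), Path("converted"))
```

Keeping command construction separate from execution makes it possible to inspect the planned conversions with `dry_run=True` before MuseScore is invoked on a large directory.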

3.2.2 Automatic Fixing

The corrector first applies initial fixes to the inputted streams to align measures and make corrections considering the scope of the entire part. Then, the corrector attempts to iteratively resolve local regions of discrepancy with a suite of fixers. Each fixer identifies and resolves a specific type of error. For example, one fixer is responsible for fixing misread dotted rhythms. OMR systems are prone to incorrectly adding an extra dot next to a note, consequently lengthening that note’s duration. This small visual error can be detected by comparing such a problematic OMR note with its corresponding MIDI note. If removing the dot causes the OMR note to have the same duration as the MIDI note, the fixer removes the dot. These fixers are described in detail in Chapter 4.
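The dotted rhythm check described above can be illustrated with a small sketch. To stay self-contained, it uses a simplified `SimpleNote` stand-in rather than music21 objects; the class and function names are hypothetical and do not reproduce the fixer from Chapter 4.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class SimpleNote:
    """Simplified stand-in for a note: undotted duration in quarter notes plus a dot count."""
    midi_pitch: int
    base_quarter_length: Fraction
    dots: int = 0

    @property
    def quarter_length(self) -> Fraction:
        # Each dot adds half of the previously added value: base * (2 - 1 / 2**dots)
        return self.base_quarter_length * (2 - Fraction(1, 2 ** self.dots))


def fix_extra_dot(omr_note: SimpleNote, midi_quarter_length: Fraction) -> bool:
    """Remove one dot from omr_note if doing so makes its duration match the MIDI duration."""
    if omr_note.dots == 0 or omr_note.quarter_length == midi_quarter_length:
        return False  # nothing to remove, or no duration discrepancy
    candidate = SimpleNote(omr_note.midi_pitch, omr_note.base_quarter_length,
                           omr_note.dots - 1)
    if candidate.quarter_length == midi_quarter_length:
        omr_note.dots -= 1
        return True
    return False


# A quarter note misread as a dotted quarter; the MIDI note lasts one quarter note
omr_note = SimpleNote(midi_pitch=60, base_quarter_length=Fraction(1), dots=1)
was_fixed = fix_extra_dot(omr_note, midi_quarter_length=Fraction(1))
```

Returning a boolean mirrors the convention, described later in this chapter, that a fixer reports whether it found and fixed a discrepancy.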

3.2.3 Manual Fixing

Figure 3-2: Marked correction statuses and their respective intervention cues

While the fixers can identify and fix several types of errors, they are not guaranteed to correctly fix every error, so user intervention is essential to guarantee the resultant OMR part is accurate. To approve fixes and resolve ambiguous discrepancies or incorrectly resolved fixes, the user needs a means to visualize what has been fixed, what still needs to be fixed, and the original OMR and MIDI inputs. When no more discrepancies can be resolved by the fixers, the corrector returns a MusicXML display score, which can be saved as a file for viewing and editing in a music editor. That score contains a marked and fixed copy of the original OMR part and the original OMR and MIDI parts. Elements, like notes and rests, in the fixed OMR part are colored according to their correction status: whether they were confidently fixed, tentatively fixed, never had a discrepancy, or still have a discrepancy. Users can identify areas requiring intervention and the extent of intervention needed from the visual color cues. Following manual corrections, the user can extract the human-computer modified OMR part.

Figure 3-3: Difficult to detect differences without visual cues. There are two incorrect notes in the upper version of Mozart’s String Quartet No. 1, movement 1, from the music21 corpus [11]. It is challenging to identify these errors without visual cues to make them stand out. See Figure 3-4 to reveal the differences.

Figure 3-4: Revealing of mistakes from Figure 3-3

3.2.4 Expected Output

The corrector system modifies the OMR part for output. The MIDI part is fixed in each generation only to resolve alignment discrepancies to better align the remaining content. Since the desired output is a high quality encoding of the music to possibly be printed and read by musicians, the OMR part is the better part to fix. The OMR part can be encoded in MusicXML format, which encodes visual information, such as the correct clef, otherwise absent in a MIDI encoding. The system preserves that visual information while verifying the pitch and rhythmic accuracy against the MIDI part. As the MIDI part is oblivious of display information, it can switch clefs randomly or use less ideal enharmonic choices. These are not detected as discrepancies by the aligner and so are never fixed. Thus, the MIDI part is modified as a side effect of correction, but not to the visual standard necessary to be an acceptable output. The corrected OMR output is the result of the corrector.

3.3 Intended Inputs

The corrector is initialized with an OMR stream, MIDI stream, and a number corresponding to which part to fix. When provided for evaluation, a ground truth stream can also be included. The inputted streams can be scores of parts or a part.

3.3.1 One Part at a Time

Figure 3-5: A display score including all four quartet parts is difficult to navigate. The first two pages of a display score with information for manual correction for all four quartet parts are shown. This display score includes four parts for each of the original four parts, totalling 16 parts. Not only are the parts difficult to differentiate, but lower parts are even cut off, as is the case for the last part on the first page.

While OMR systems can process a single part and scores of parts, the corrector corrects a single part at a time. A part consists of measures of notes and rests. Since a piece of music can consist of either a part or a score of parts, this ensures the corrector is modular. A part is the smallest reasonable grouping. Correcting measures one at a time is less ideal, because measures may not be aligned and OMR notes may incorrectly bleed into other measures. In other words, mistakes are not isolated to the scope of a measure.

Figure 3-6: A display score for a single quartet part is easy to understand. The first two pages of a display score with information for manual correction for a single quartet part are shown. It is easy to differentiate the two upper reference parts from the lower modified part and intended ground truth part.

Fixing a single part instead of a score of parts is easier. The display score produced by the corrector has three or four staves for each original staff, and occupies three or four times the vertical space of the inputted parts. The display score for fixing a single part fits on a computer screen, as shown in Figure 3-6. A display score for multiple parts occupies proportionally more vertical space, potentially making it more difficult to read, as shown in Figure 3-5. Scores can be corrected by independently correcting each part and then re-combining those corrected parts.

Fixing an entire score at once could be considered for future work. Reading such a display score is not unreasonable, as conductors frequently read scores of over a dozen parts. Moreover, scrolling and zooming could be used to navigate many parts. The benefit of fixing an entire score of parts is that other OMR parts can be used to correct each other. For example, different parts might double each other, and some properties should be the same across parts, such as measure structure, repeat locations, and time signature changes. M. Church successfully used other parts to inform rhythm correction [10].

The corrector can still use knowledge from other parts to help fix the selected part. This is enabled when scores of parts are corrector inputs. While the whole score is accepted, only the single indicated part will be fixed. When extra parts are included, the extra information aids the corrector in making more informed decisions. For example, the key signature fixer, which ensures MIDI and OMR have a consistent key signature, determines the best key signature by finding the most common key signature across all parts.
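The key signature vote described above might be sketched as follows. This is an illustration rather than the fixer's code: the function name is hypothetical, key signatures are represented as a count of sharps (negative for flats), and ties are resolved in favor of the first value encountered, which is a simplification.

```python
from collections import Counter
from typing import Iterable, Optional


def most_common_key_signature(sharps_per_part: Iterable[Optional[int]]) -> Optional[int]:
    """Return the most frequent key signature, given as a number of sharps (negative for flats)."""
    # Parts without a detected key signature (None) do not vote.
    counts = Counter(sharps for sharps in sharps_per_part if sharps is not None)
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Three parts read as D major (2 sharps); one part was misread with 3 sharps
best_key = most_common_key_signature([2, 2, 3, 2])
```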

3.3.2 Input Correctness Assumptions

The inputted OMR part must be substantially correct. The measure structure must be mostly correct, not missing many measures and not having additional measures, as the corrector works best when every measure exists, even if its contents are incorrect. The score display is easiest to read when there is a one-to-one mapping of every OMR measure to every MIDI measure, and the corrector currently has limited capability to adjust entirely incorrect measures.

Parts with many errors are too complex for the corrector to analyze and accurately fix. When errors are not isolated, and instead are near each other, or a single element has multiple errors, the fixers may not correctly diagnose the error’s cause and the appropriate fix. Compounding errors can combine in complex ways, and establishing logic to diagnose all these potential combinations is exponentially challenging. Therefore, the corrector makes a best effort to correct errors under the assumption that they are single and isolated. Current fixers can correctly fix discrepancies caused by a single error, but several compounding errors, such as incorrect pitch and rhythm on the same note, can cause unpredictable results. Since these multi-cause errors can exist and it is impossible to identify them at a glance, the system relies on people to account for these errors. The corrector does not reject these grossly misread OMR files, but will try its best.

Figure 3-7: Example of a substantially incorrect OMR output. This scan inputted into SmartScore produced the following highly incorrect output, primarily because the system did not correctly identify every staff line on the original sheet music [21]. This is evident in the three tiers of notes on each staff: too high, appropriately located, and too low. This is an example of an OMR output which is too grossly incorrect to be corrected by the corrector.

This system assumes that the input MIDI files have no audible errors. To the extent that this assumption is wrong (see evaluation), fewer OMR errors are corrected; in fact, new errors can be introduced.

3.3.3 Supported Instruments

The corrector corrects music for single voice instruments like violin, viola, cello, and bass, as well as ensembles of these instruments. Versions of the corrector could be extended to support other instruments. These instruments were chosen because they usually have a single voice and cover a large subset of repertoire. Piano music is intentionally omitted because the left and right hand parts can be interchanged in MIDI encodings, making it difficult to compare each hand part individually. This version of the corrector is used to correct string quartets, so it fixes notation, such as guitar tablature, which does not belong in quartet music. It could easily be generalized to other non-piano instruments.

3.4 Preliminary Fixing

3.4.1 Measure Alignment

The corrector attempts to align measures in the inputted parts if they are unaligned. Since the corrector system assumes fairly high quality OMR outputs, these adjustments are fairly basic. To account for all situations, a user may intervene after the initial set up and re-adjust the parts so the measures align. This is a reasonable ask of the user, as intervention through a visual interface is an established technique for mitigating complexity [9]. In this case, the user can use music notation software to approve the modifications.

Ensuring the parts have the same measure structure is critical because while the computer can still proceed with automatic fixing when measures are unaligned [37], unaligned measures are difficult for a user to compare and interpret. See Figure 4-3 as evidence. If the parts have a different number of measures, there is no way every OMR measure can be one-for-one aligned with every MIDI measure. There is a chance multiple measure errors will cancel each other out, such that the parts have the same number of measures, but are still misaligned. Missing a measure early in the part and then adding an extra measure later in the part is an example of this. This type of alignment error is fairly obvious from a cursory look at the display score.

OMR systems can err in many complex ways, and this large error space is not entirely accounted for in set up. For example, bar lines can be omitted or erroneously added [25]. This would manifest itself as misaligned measures and OMR measures which are too long or too short. The OMR system could misread and omit entire lines of music, as is the case in Figure 3-7, but such an output is considered too incorrect to be a good input to the corrector system.

If the user is confident the measures will be aligned, she does not need to reveal the initial generation and can proceed directly to the automatic fixing. Even if this is an incorrect assumption, she will see this in the final corrector output and can re-run the corrector on modified inputs.
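A rough pre-check for the measure structure problems described above could look like the sketch below. It is an assumption-laden simplification: each part is reduced to a list of measure lengths in quarter notes, the function name is hypothetical, and the alignment logic the corrector applies during set up is not reproduced here.

```python
from fractions import Fraction
from typing import List, Sequence, Tuple


def compare_measure_structure(omr_lengths: Sequence[Fraction],
                              midi_lengths: Sequence[Fraction]) -> Tuple[bool, List[int]]:
    """Return whether measure counts match and the 1-based numbers of measures whose lengths differ."""
    same_count = len(omr_lengths) == len(midi_lengths)
    mismatched = [
        number
        for number, (omr_len, midi_len)
        in enumerate(zip(omr_lengths, midi_lengths), start=1)
        if omr_len != midi_len
    ]
    return same_count, mismatched


# A bar line omitted by the OMR system merges two 4/4 measures into one long measure
omr_lengths = [Fraction(4), Fraction(8), Fraction(4)]
midi_lengths = [Fraction(4), Fraction(4), Fraction(4), Fraction(4)]
same_count, mismatched = compare_measure_structure(omr_lengths, midi_lengths)
```

A differing measure count, or a measure that is too long or too short, would be a cue for the user to inspect the initial generation before proceeding to automatic fixing.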

3.4.2 Holistic Fixing

The initial adjustments made by the corrector are described in depth in Chapter 4, but in brief, the corrector condenses MIDI repeats, unifies key signatures and meters, extends missed OMR multirests, and shifts MIDI parts to have the correct pickup. This correction is done holistically, considering the content across the entire part, or when applicable, across all other parts within the same score. After set up, the OMR and MIDI parts should have the same key signature, meter, and number of measures. Key signature and meter corrections should be done early on because these components can cause many secondary errors. Moreover, those fixers require knowledge of the entire piece, as opposed to later fixers, which typically consider the limited scope of only the elements near discrepancies.

The system finally extracts the desired parts from the inputted streams and creates the initial correction generation to snapshot the initial discrepancy state. The corrector attempts to fix that generation’s child.

3.5 Tracking the Correction History

A correction generation is a snapshot of the successive changes applied by the corrector. It is represented by four parts, which are stored as music21 streams. They are the initial OMR and MIDI parts and the modified OMR and MIDI parts. The initial OMR and MIDI parts are passed into the generation at its creation. These are the versions of the OMR and MIDI parts produced by the previous generation’s fix. The first generation uses the OMR and MIDI parts fixed at set up.

A generation’s initial OMR and MIDI parts are aligned to identify corresponding elements, as well as regions of discrepancy and similarity. Alignment is recalculated for every generation since earlier fixes can improve alignment. Problematic OMR elements are notes and rests in the OMR part which do not map to identical notes or rests in the MIDI part. Problematic OMR elements, their corresponding MIDI elements, and other nearby elements are analyzed by fixers. If an error is identified and a fix is determined, the corresponding elements in the generation’s modifiable OMR and MIDI parts are changed. To enable this, the generation has the functionality to map between corresponding elements in these internal parts.

Similar to previous drafts of a paper or previous commits in a code repository, the history of generations enables the user to understand the sequence of automatic fixes. These generations can be viewed independently in a music editor. Additionally, parts from the first and last generation can be used to produce a summary display score of the initial and end state for the user.

3.5.1 Aligning OMR and MIDI Parts

Figure 3-8: Example of alignment changes and identifying problematic OMR elements. Each arrow represents an entry in the aligner change list. Red OMR elements are the problematic OMR elements. The corresponding elements for each OMR element have arrows to them from that OMR element. Corrector fixers generally only consider the first occurrence of a corresponding MIDI note.

In addition to storing the four parts, the generation also stores information to map between corresponding elements. Applying the music21 aligner maps the OMR part to the MIDI part by comparing the hash of elements [37]. The default hash is composed of the MIDI pitch number of a note and its numerical duration. Other properties, such as articulation markings, are ignored. Only notes, chords, and rests are hashed and aligned. All other elements, such as repeat signs and key signatures, are not considered.

The aligner returns a change list, which can be used to build the internal generation mapping. The change list stores mappings of OMR elements to MIDI elements along with the change required to map between them. The change could be substitution, insertion, deletion, or no change. A change mapping of no change implies that corresponding pairs of elements are considered the same. Every OMR element occurs at least once in the change list and every MIDI element occurs exactly once. For example, if the OMR part has a single note and the MIDI part has that note repeated several times, the change list would include an entry mapping the OMR element to the first MIDI element with no change and additional mappings for the remaining MIDI elements from that same OMR element with a change of insertion. To instantly determine all related MIDI elements from the OMR element, the list is processed into a mapping.

While this change list provides useful information, finding the OMR elements that need to be fixed and their corresponding MIDI elements requires looping through the list with some logic. During generation initialization, following alignment, information is extracted and compiled from the change list into a structure which supports easy iteration through problematic initial OMR elements, and quick mapping to the relevant initial MIDI elements. The generation stores a list of problematic initial OMR elements, which supports iteration for sequential fixing. Problematic OMR elements are defined to have a mapping of substitution, insertion, or deletion with at least one MIDI element. The generation also supports retrieval of the first initial MIDI element corresponding with each problematic OMR element. This enables iterative analysis of problematic elements.
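The processing of a change list into a mapping and a list of problematic OMR elements, as described above, might look like the following sketch. The `Change` enum, the use of string identifiers for elements, and the function name are stand-ins chosen for illustration; they are not the aligner's or the generation's actual data structures.

```python
from collections import defaultdict
from enum import Enum
from typing import Dict, List, Tuple


class Change(Enum):
    NO_CHANGE = "no change"
    SUBSTITUTION = "substitution"
    INSERTION = "insertion"
    DELETION = "deletion"


# One change list entry: (OMR element id, MIDI element id, change required)
ChangeEntry = Tuple[str, str, Change]


def index_change_list(change_list: List[ChangeEntry]) -> Tuple[Dict[str, List[str]], List[str]]:
    """Return (OMR id -> related MIDI ids, problematic OMR ids in order of appearance)."""
    omr_to_midi: Dict[str, List[str]] = defaultdict(list)
    problematic: List[str] = []
    seen = set()
    for omr_id, midi_id, change in change_list:
        omr_to_midi[omr_id].append(midi_id)
        # An OMR element is problematic if any of its mappings is not "no change".
        if change is not Change.NO_CHANGE and omr_id not in seen:
            seen.add(omr_id)
            problematic.append(omr_id)
    return dict(omr_to_midi), problematic


# One matching note, then an OMR note that the MIDI part repeats three times
change_list = [
    ("omr0", "midi0", Change.NO_CHANGE),
    ("omr1", "midi1", Change.NO_CHANGE),
    ("omr1", "midi2", Change.INSERTION),
    ("omr1", "midi3", Change.INSERTION),
]
mapping, problematic = index_change_list(change_list)
first_midi = mapping[problematic[0]][0]  # first corresponding MIDI element
```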

3.5.2 Fixing a Correction Generation

When a fix for a discrepancy is found, the initial OMR and MIDI parts are not modified, but instead, the aptly named modifiable OMR and MIDI parts are modified. This is to enable comparison of the parts before and after the fix. Since one fix can reveal another discrepancy to resolve, the functionality is especially useful to future developers debugging new fixers or curious users who want to understand the chain of corrections. The modification could be changing the OMR part to be like the MIDI part, changing the MIDI part to be like the OMR part, or changing both parts. Regardless, the modifiable OMR and MIDI parts must no longer have that same discrepancy in the resolved region after the modification.

To modify the modifiable parts, a mapping is used to find the corresponding modifiable elements from the initial elements. Since modifiable parts are copies of the initial parts and music21 stores which element an element was copied from, by looping through the elements in the modifiable parts and examining their derivations, a bidirectional mapping can be created. This mapping is one to one.

After this modification is made, some mappings are no longer accurate. The mapping between initial and modifiable elements may no longer be one to one. One situation resulting in a one-to-many mapping is when the fix is inserting several notes after a problematic note. On the other hand, some elements in the mapping could be removed from the modified part entirely. Thus, after modification, the mapping functionality between initial and modified parts can no longer be guaranteed. At this point, a new generation must be created to make further discrepancy fixes.

Another critical reason for creating a new generation after one fix is to achieve better alignment. The alignment of elements following the discrepancy may have been impacted by that discrepancy such that the generation’s knowledge of corresponding OMR and MIDI elements is not correct for every note.
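The bidirectional mapping between initial and modifiable elements can be illustrated with a short sketch. It does not use music21: the `Element` class and its `derived_from` attribute are hypothetical stand-ins for the derivation information that music21 records when an element is copied.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(eq=False)  # eq=False keeps identity-based hashing, so elements can be dict keys
class Element:
    """Stand-in for a note or rest that remembers which element it was copied from."""
    name: str
    derived_from: Optional["Element"] = field(default=None, repr=False)


def copy_part(initial_part: List[Element]) -> List[Element]:
    """Copy every element and record its origin, loosely mimicking a derivation link."""
    return [Element(element.name, derived_from=element) for element in initial_part]


def build_mapping(modifiable_part: List[Element]
                  ) -> Tuple[Dict[Element, Element], Dict[Element, Element]]:
    """Build initial -> modifiable and modifiable -> initial mappings from derivations."""
    initial_to_modifiable: Dict[Element, Element] = {}
    modifiable_to_initial: Dict[Element, Element] = {}
    for element in modifiable_part:
        origin = element.derived_from
        if origin is not None:
            initial_to_modifiable[origin] = element
            modifiable_to_initial[element] = origin
    return initial_to_modifiable, modifiable_to_initial


initial_part = [Element("C4"), Element("D4"), Element("rest")]
modifiable_part = copy_part(initial_part)
forward, backward = build_mapping(modifiable_part)
```

Immediately after copying, the two dictionaries are one to one. Once a fix inserts or removes elements in the modifiable part, that property no longer holds, which is the reason given above for starting a new generation.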

3.5.3 Creating the Next Generation

The next generation is created from the previous generation after a fix has been applied to the modifiable parts. Its initial OMR part is a copy of the parent generation’s modified OMR part, and its initial MIDI part is a copy of the parent generation’s modified MIDI part. This relationship between generations allows for the storage and review of correction history. Parts internal to a generation should be isolated so that modifying a part in one generation mutates neither a part in another generation nor another part in the same generation.

Even though the initial parts of a generation have the same content as the previous generation’s modified parts, these should not reference the same stream. This is because they are displayed differently. The initial parts are displayed with lyric numbers marking corresponding dissimilar elements between aligned OMR and MIDI parts. On the other hand, the modifiable OMR part does not have these alignment numbers, but instead, its elements are colored according to their correction status. The MIDI modified part is never displayed, as it is only modified as a side effect to match the OMR modified part. Its modifications can be inferred.

3.6 Automatic Fixing

3.6.1 The Process

To correct an OMR part, the corrector starts with an initial generation initialized with the set up OMR part and MIDI part. These are the parts initially corrected to have the same measure structure and other unified features. The corrector loops through the problematic OMR elements in that generation, in the order they appear in the OMR part, and attempts to fix each one. If a problematic element cannot be fixed, the corrector proceeds to the next problematic element.

To attempt fixing a discrepancy, the corrector calls fixers from a suite of fixers with the problematic element and the entire generation as inputs. These fixers are described in detail in Chapter 4, but each can detect indicators of a specific error. For example, an indicator for a missing accidental is a half step pitch discrepancy. If an error is detected, the fixer then modifies the generation’s modifiable parts. The fixers return true or false outputs depending on whether a discrepancy is found and fixed. The fixers are called one at a time in a strategic order, such that earlier fixers are more specific than later ones. Once a fixer successfully identifies and resolves an error, the generation’s modifiable parts are modified to resolve the discrepancy.

As soon as a fix is applied, the generation no longer has a consistent internal mapping state and the alignment between initial parts could be improved. A child generation is created from the recently modified generation. This new generation is used for the next iteration of fixing. The corrector loops through the problematic OMR elements in the new generation, only considering problematic elements which occur after the most recently fixed element. This is to ensure the fixer does not become stuck in an infinite loop trying to re-fix the same element. While this condition should allow for fixing of most elements, problematic elements that occur at the same time, such as multiple wrong notes in a chord, might not all be considered. Moreover, if a fix shifts the element earlier, there might be new discrepancy content before the most recent offset, which the corrector will no longer consider. However, these are special edge cases, and the consequence of them occurring is that these skipped discrepancies will require manual fixing. This is a fair trade-off to protect against infinite looping as more fixers are implemented and included.

The iterative fixing continues until there are no more discrepancies between the OMR and MIDI parts or until the fixers cannot resolve the remaining discrepancies.
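The iterative process described above can be summarized in a compact sketch. It is a toy model under stated assumptions, not the corrector's code: notes are `(MIDI pitch, duration)` tuples, elements are aligned by position instead of by the aligner, and `Generation`, `half_step_pitch_fixer`, and `correct` are hypothetical names.

```python
import copy
from dataclasses import dataclass, field
from typing import Callable, List, Sequence, Tuple

Note = Tuple[int, float]  # (MIDI pitch number, duration in quarter notes)


@dataclass
class Generation:
    initial_omr: List[Note]
    initial_midi: List[Note]
    modifiable_omr: List[Note] = field(init=False)
    modifiable_midi: List[Note] = field(init=False)

    def __post_init__(self) -> None:
        # Modifiable parts are independent copies, so the initial parts stay untouched.
        self.modifiable_omr = copy.deepcopy(self.initial_omr)
        self.modifiable_midi = copy.deepcopy(self.initial_midi)

    def problematic_indices(self) -> List[int]:
        # Simplification: align by position and flag every differing pair.
        pairs = zip(self.initial_omr, self.initial_midi)
        return [i for i, (omr, midi) in enumerate(pairs) if omr != midi]

    def child(self) -> "Generation":
        return Generation(copy.deepcopy(self.modifiable_omr),
                          copy.deepcopy(self.modifiable_midi))


Fixer = Callable[[Generation, int], bool]


def half_step_pitch_fixer(gen: Generation, i: int) -> bool:
    """Treat a half step pitch discrepancy with equal durations as a missing accidental."""
    (omr_pitch, omr_dur), (midi_pitch, midi_dur) = gen.initial_omr[i], gen.initial_midi[i]
    if omr_dur == midi_dur and abs(omr_pitch - midi_pitch) == 1:
        gen.modifiable_omr[i] = (midi_pitch, omr_dur)
        return True
    return False


def correct(omr: List[Note], midi: List[Note], fixers: Sequence[Fixer]) -> List[Generation]:
    """Apply fixers one generation at a time and return the full generation history."""
    generations = [Generation(omr, midi)]
    last_fixed = -1
    while True:
        gen = generations[-1]
        for i in gen.problematic_indices():
            if i <= last_fixed:
                continue  # only look past the most recently fixed element
            # Fixers run in order; any() stops at the first one that reports a fix.
            if any(fixer(gen, i) for fixer in fixers):
                last_fixed = i
                generations.append(gen.child())
                break
        else:
            return generations  # no fixer resolved any remaining discrepancy


omr_part = [(60, 1.0), (61, 1.0), (64, 2.0), (66, 1.0)]
midi_part = [(60, 1.0), (62, 1.0), (64, 1.0), (67, 1.0)]
history = correct(omr_part, midi_part, [half_step_pitch_fixer])
final_omr = history[-1].initial_omr
```

In this example the two half step discrepancies are fixed in successive generations, while the duration discrepancy on the third note is left for manual review because no fixer in the list handles it.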

3.6.2 Challenges

Fixers follow a set of rules to detect errors and then resolve those errors. However, these fixers face three challenges: the error space is complex, as discrepancies can be caused by a compounding of several errors; the error space is large, as there are many different types of errors OMR systems can make; and the error resolution can be ambiguous in cases where detected discrepancies could have been caused by more than one type of error.

Discrepancies can be complex in that some discrepancies can be caused by multiple errors, such as a problematic note with both an extra accidental and a missed dotted rhythm. Fixers currently rely on other aspects of the element being correct in order to correctly diagnose and fix errors. For example, a pitch fixer will only adjust the pitch of an OMR note with equal duration to the corresponding MIDI note. Rulesets could consider possible fixes for combinations of errors, but this quickly becomes highly complex. The error space is large, as music can be misread by OMR systems in a variety of ways. The corrector currently resolves only a subset of the most prevalent and impactful errors. There is also ambiguity in which fix is best when the discrepancy could be the result of one of many errors. For example, it can be difficult to determine if a pitch error was caused by a misread accidental or misread notehead location for some close pitch discrepancies.

It is unrealistic for a set of logical fixers to account for all these challenges, but instead of giving up and declaring fixing too complex for automation, the corrector makes its best effort to correct discrepancies. It attempts to fix the common errors it supports. Less common errors, which are not accounted for in the suite of fixers, are not resolved or are resolved incorrectly. This is acceptable, as the system can automatically catch the most common errors, drastically reducing the number of remaining errors. Moreover, the corrector assumes all errors are isolated errors, distanced from other errors which could affect their manifestation. This is a reasonable assumption if the input score has fairly few errors, as such an input is less likely to have error collisions on a single note or on notes near one another. As for ambiguous resolutions, the fixer applies one possible fix, which could save time if this is the correct fix. However, this should not add additional time to manually change if the automatic fix is incorrect, as the discrepancy would have needed fixing regardless.

Thus, the automatic fixer attempts to make fixes, but may not be 100% accurate because of the complex error space. This is why a human in the loop is critical: to re-fix incorrectly resolved ambiguous errors, fix errors not supported by the corrector, re-fix incorrect fixes for elements with compounding errors, and approve correctly fixed errors. For human interaction, a user-friendly interface should be produced [9].

3.7 Manual Fixing and Approval

While automatically fixing errors is a time-saving measure, it is only worthwhile if the output score can be guaranteed to be accurate. Manual intervention through a user-friendly interface is necessary. That intervention must take minimal effort, and visual cues can help reduce that effort. Without such cues, finding potential errors is like searching for a needle in a haystack, as is evidenced in Figure 3-3.

3.7.1 Correction Statuses For Display

A generation’s modified OMR part is marked for easy user review. Notes and rests in the part are marked with a correction status. Correction statuses are no discrepancy, discrepancy, confident fix, and tentative fix. Elements with those correction statuses are colored black, red, green, or orange, respectively (Figure 3-2). The user can easily scan the display to identify fixes that need approval or discrepancies that still need to be resolved.

| Situation | Previous Generation Status | Is Problematic in Current Generation | Current Generation Status Should Be |
|---|---|---|---|
| New Note | n/a | Yes | Discrepancy |
| New Note | n/a | No | No Discrepancy |
| Previously Fixed Correctly | Confident Fix | No | Confident Fix |
| Previously Fixed Correctly | Tentative Fix | No | Tentative Fix |
| Previously Fixed Incorrectly | Confident Fix | Yes | Discrepancy |
| Previously Fixed Incorrectly | Tentative Fix | Yes | Discrepancy |
| Resolved Due to Other Fix | Discrepancy | No | Confident Fix |
| Now Appears Problematic | No Discrepancy | Yes | Discrepancy |

Table 3.1: Logic for updating correction status based on the previous status and current discrepancies

In the first generation, all problematic OMR elements are marked with the correction status of discrepancy and consequently colored red. A small technicality is that the very first generation, considered generation zero, actually has no modifications, so a user can view it to understand which notes initially had a discrepancy. When a fix is applied to the modifiable OMR part, the fix could affect multiple elements, and all elements related to the fix should be marked as either a confident fix or a tentative fix.

Elements which are marked confident or tentative fix might not be included in the problematic elements for future generations. This could be because the fix was successful and there is no longer a discrepancy at that location. It is critical that elements correctly fixed in a previous generation, which would have no discrepancy in a later generation, maintain their previous correction status of confident or tentative fix. Overwriting the previous fixed correction status with no discrepancy would result in the loss of nuanced fixing information.

If, however, elements which were tentatively or confidently fixed in a previous generation are still listed as problematic elements, their correction status should be replaced with that of discrepancy. This is because the previous fix was unsuccessful in the context of the piece, and these elements might still need to be manually fixed.

Elements that were discrepancies in a previous generation and were not fixed, but are no longer discrepancies, may have appeared as discrepancies only because of an error earlier in the part. These elements should be marked as confident fixes because they are most likely correct as the result of automatic fixing.

Elements that were not discrepancies in a previous generation and were not fixed, but are now discrepancies, should be considered as well. These elements may have aligned by chance due to earlier errors, and fixing those earlier errors reveals a new discrepancy. These elements should be marked as the discrepancy they have been revealed to be.

This coloring enables a user to identify at a glance the discrepancies which were automatically resolved according to fixer rules and to understand the confidence level of each resolution. Users will need to act differently, and with different amounts of distrust, towards differently colored notes.
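The status update rules of Table 3.1 can be condensed into a small function. The following is an illustrative sketch only: the `Status` enum and `update_status` are hypothetical names, and a previous status of `None` stands in for the "n/a" entry of a new note.

```python
from enum import Enum


class Status(Enum):
    """Correction statuses, with the display color as the value."""
    NO_DISCREPANCY = "black"
    DISCREPANCY = "red"
    CONFIDENT_FIX = "green"
    TENTATIVE_FIX = "orange"


def update_status(previous, is_problematic):
    """Return the status an element should have in the current generation.

    previous: Status from the previous generation, or None for a new note.
    is_problematic: whether the element is problematic in the current generation.
    """
    if is_problematic:
        # New, incorrectly fixed, or newly revealed problems are all discrepancies.
        return Status.DISCREPANCY
    if previous in (Status.CONFIDENT_FIX, Status.TENTATIVE_FIX):
        # Keep the nuanced fixing information of a successful earlier fix.
        return previous
    if previous is Status.DISCREPANCY:
        # Resolved as a side effect of another fix.
        return Status.CONFIDENT_FIX
    return Status.NO_DISCREPANCY
```

Checking the problematic flag first keeps the function short, since every problematic element becomes a discrepancy regardless of its history.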

3.7.2 Display Layout

Every correction generation has an associated score of relevant marked internal parts, as shown in Figure 3-9. This score can be viewed and edited in a music editor and then manually saved, or automatically saved for later viewing and editing. The fixed part can be extracted from this score. The generation display score consists of the following parts: the initial OMR part, the initial MIDI part, the modified OMR part, and optionally a ground truth part. The initial parts are aligned, such that the

Figure 3-9: Example of a page in the display score

Shown is an excerpt from Haydn’s first string quartet, violin 1 part3 after automatic correction [19, 20]. MIDI is the initial MIDI part after set up, and OMR-pre is the initial OMR part after set up. They are aligned in the first generation, as shown by the numbers in these parts. OMR-post is the modified OMR part of the last generation. Truth is the correct encoding, included for reference.

numbers mark corresponding elements within regions of discrepancy. This is useful in helping the user understand exactly what discrepancy the fixers are resolving. The modified OMR part is displayed such that elements are colored according to their correction status. Visually identifying the elements which need attention reduces the number of elements the user must manually check. Without this marked output, the user would need to check the pitch and duration of every single note and rest, which is tedious and error prone, as some mistakes look very similar to what is correct, as demonstrated in Figure 3-3.

The corrector returns two useful structures. The first is a display score, which uses the initial parts from the first generation, the modified OMR part from the last generation, and optionally a ground truth part. This display score shows the changes in the most condensed, summarized format by showing the initial uncorrected state and final corrected state. However, if a user wants to examine the particular history of changes to understand how one correction affects the chain of corrections, she can explore the second returned structure: a list of correction generations. Ultimately, the display score created from the first and last generation is the one viewed and modified by the user.

3.7.3 Editing the Display

Users can develop their own workflow for reviewing the outputted display file. The following is a proposed workflow, which I used. The user can open the resultant display score in a music editor, such as MuseScore [5]. On an initial look through the file, the user will notice if measures are badly aligned, as this is a big visual difference. If this is the case, the OMR part, and possibly the MIDI part, if either are significantly incorrect, should be modified and re-run through the corrector. Some modifications are easy enough to do from the music editor. For example, selecting and deleting stretches of extra misread measures can be easily done through the visual interface. Other modifications are more easily performed by manipulating the part as a stream in music21. For example, if the shifter shifts the MIDI part by the wrong duration, it is much easier to run the shifter on the stream with the correct adjustment offset than to manually move every element.

If the OMR part is very incorrect, it could be worthwhile to re-scan the original. For example, when scanning physical sheet music, the pages can be curled or the light can be too dappled, transforming the scanned image in such a way that it is less likely to be processed correctly by an OMR system. If re-scanning still does not produce a higher accuracy OMR output, the user should consider whether fixing all the inaccuracies saves time over manually inputting the notes. These vastly incorrect outputs are not the intended inputs of the corrector system. If a user decides to manually enter a piece of music, that encoding can still be double checked against the MIDI part using the corrector.

Figure 3-10: Example of a delayed discrepancy
An earlier discrepancy caused by a missing note is not detected until later because the alignment appears acceptable until the first red note. This is an excerpt from the violin 2 part of Mozart’s Quartet No. 1, movement 1 [30, 31].

The user can look through the score display for colored regions. It is important to consider the few elements surrounding a colored note. Sometimes areas which are discrepancies are not marked as such because of confusion from other nearby discrepancies, as shown in Figure 3-10. However, missed discrepancy markings should not occur far away from discrepancy markings. By only considering these regions and not needing to review the entire piece, the user saves time.

Fixes are marked in green or orange and are hopefully fixed correctly. If this is the case, the user simply needs to look at those notes, then select them together and color them black using tools in the music notation software. Alternatively, the user can leave the colors as they are, and later run a music21 function on the entire extracted OMR part to remove all stylistic coloring.

If a fix is not correct or the user needs to correct a red unresolved discrepancy, this can be done by looking at the original OMR and MIDI parts aligned immediately above to understand where the mistake occurred. If the appropriate correction is still unclear, the user can validate against the original scan. If the MIDI part is correct at an unresolved or incorrectly resolved discrepancy, those elements can be selected, copied, and pasted into the modifiable OMR discrepancy region. This can save the time of manually inputting the elements. Note that even if a discrepancy was fixed incorrectly, it is no more effort to correctly fix it than if it had never been resolved, as the user would need to take action to fix the issue manually in either case.

After manually fixing the OMR part, and optionally the MIDI part if there are errors, the OMR part can be extracted from the display score. To safeguard against human error in correction, this manually modified OMR part can be re-run through the corrector. There should be no more discrepancies, but if there are, those might have been overlooked areas, which can now be corrected. While there is a non-zero chance the user will incorrectly modify both the OMR and MIDI parts in the same way, this is highly unlikely and is therefore not a concern. Once the user has finished correcting and approving the part, it is ready to be extracted.

3.7.4 Rebuilding The Corrected Score

The fixed OMR part is stored as the third part in the display score and can be extracted with music21 by indexing into that part. That fixed part can be written to its own file. If, more likely, the OMR part is one of many in a score, each part is corrected separately. Each part will exist in its own display score and can be extracted. Using music21, the modified parts can replace the original parts in the original score, and this can be saved to a new file.

At the end of this process, the user will have generated an OMR part or score of parts which can be guaranteed to be correct as checked against a MIDI score. This requires only the minimal manual intervention of extracting and recombining parts following manual fixing. This high accuracy music can be used and trusted by performers. The validated MusicXML file can be quickly and easily modified by the quartet in Chapter 1. Clefs can be changed, all notes can be transposed for another instrument’s range, and endless other creative transformations are possible.

Chapter 4

Error Fixers

The fixers focus on correcting pitch errors, which affect how high or low a note sounds, and rhythm errors, which affect the duration of a note or rest, in optically generated music. While these errors are not the only ones OMR systems can make, they are the most essential to correct since musicians take liberties to interpret phrasing and articulation, even when printed in music, whereas pitch and rhythm are played exactly as written. Other types of errors include text errors, which affect numerical markings such as measure numbers and time signatures, as well as textual indicators for speed, volume, and feeling. Phrasing errors affect properties such as the volume and slurring of the notes, and articulation errors affect the style of the start and end of notes. Some errors, and means for correcting them, will be discussed in depth in this chapter.

Errors in OMR-produced files can be corrected by referencing MIDI files. MIDI sequence data is used for playback of music, and stores information on the timing, amplitude, and frequency of notes [35]. While MIDI files present accurate instructions for how to produce the music, visual notation generated from them can be difficult to read. For example, enharmonic notes, which sound the same but look different, are encoded identically in MIDI. MIDI rhythms can also be notated in representations that are difficult to read. Nevertheless, the non-ideal, but already existing, MIDI data can be used to validate optically generated notation. To do so, the fixers must overcome discrepancies between notations which communicate the same sound. By making strategic use of music domain knowledge and the strengths and weaknesses of MIDI encoding and OMR systems, it is possible to develop a set of rules, such as the ones in this chapter, to fix discrepancies between aligned MIDI and OMR versions of a part. However, because of the complex and ambiguous nature of errors, humans must be in the loop to approve automatic fixes and resolve ambiguous discrepancies.

4.1 Identifying Errors

As suggested by Byrd et al. [6], the first step in fixing errors when using multiple sources of truth is understanding the strengths of each system. They propose doing this by examining the types of errors made. From there, a list of specific rules can be created. While those authors suggest a large amount of music must be examined to generalize, I found combining an examination of a few pieces with knowledge of OMR weaknesses revealed many error patterns. Moreover, the corrector system is modifiable and modular, allowing others to contribute fixers as more error patterns are discovered. One process to identify errors is to examine discrepancies between the OMR and MIDI parts while consulting the ground truth of the original scan. To identify areas of discrepancy, OMR and MIDI parts can be aligned using music21’s aligner [37].
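To give a rough picture of what locating discrepancies between two aligned sequences involves, the sketch below compares two lists of (pitch name, duration) pairs. It deliberately swaps in Python's standard `difflib.SequenceMatcher` rather than music21's aligner, so it is an analogy and not the method used in this project; the function name `find_discrepancies` is hypothetical.

```python
from difflib import SequenceMatcher


def find_discrepancies(omr_notes, midi_notes):
    """Return (operation, omr_slice, midi_slice) for every non-matching region.

    Assumption: each note is a hashable (pitch name, duration) tuple.
    The operation is 'replace', 'delete', or 'insert', describing the change
    needed to turn the OMR sequence into the MIDI sequence.
    """
    matcher = SequenceMatcher(None, omr_notes, midi_notes, autojunk=False)
    return [
        (tag, omr_notes[i1:i2], midi_notes[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


omr = [("C4", 1.0), ("D4", 1.0), ("E4", 1.0), ("G4", 1.0)]
midi = [("C4", 1.0), ("D4", 1.0), ("F4", 1.0), ("G4", 1.0)]
for operation, omr_region, midi_region in find_discrepancies(omr, midi):
    print(operation, omr_region, midi_region)
```

Each reported region is a candidate for inspection against the original scan when looking for error patterns.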

4.1.1 Challenges With Existing Visualizations

The aligner locates and marks substitutions, insertions, and deletions to the OMR part to map its elements to those in the MIDI part. These corresponding numerical markings in both parts can be used to identify aligned passages. A computer can easily switch between these parts to understand and resolve the discrepancy according to a set of rules for recognizing and fixing discrepancies. However, without an appropriate display of these aligned parts, this task is difficult for people.

Originally, the two aligned parts were displayed in separate music editor tabs in MuseScore, so a user had to toggle between tabs to view one part at a time. Locating corresponding passages and understanding and resolving discrepancies requires remembering the location number and corresponding notes in the other part. This process requires context switching and would be much easier if corresponding passages could be viewed simultaneously, such as in the split screen view in SmartScore [1]. However, looking at both parts side by side, as in a split screen view, is still cognitively taxing. Even though both parts can be seen at once, the eyes must still oscillate between passages, re-locating the corresponding passages at each re-focus. This is a critical issue because the extra cognitive effort of visually aligning the passages distracts from the intended task of diagnosing the cause of the discrepancy. This discrepancy analysis is necessary for generating logic rules to resolve discrepancies.

Figure 4-1: Original display of aligned parts requires toggling between tabs

4.1.2 A Better Way To Display Discrepancies

To better understand the discrepancies, the passages should be physically aligned such that corresponding measures can be viewed in the same focus. SmartScore supports this by panning to the corresponding section in the original scan when the user mouses to sections of the OMR output [1]. The Audiveris OMR system further reduces the distance the eyes must travel to compare corresponding sections by overlaying OMR output over the original scan [2].

By adding the aligned parts to a score in music21, they can be vertically stacked. In this way, the corresponding MIDI measure is above the corresponding OMR measure. The discrepancies can be marked in red on the OMR part and their corresponding elements are easy to spot. I no longer had to remember what the MIDI part had while looking at the OMR part, but could instead look at both parts simultaneously. With this visualization, the errors were easier to identify.

Figure 4-2: Improved stacked score display in MuseScore for identifying discrepancies
The score displays the original MIDI part and original OMR part (OMR-pre) with numerical points marking corresponding areas when not aligned. Below these is included a copy of the OMR part (OMR-post) with problematic elements highlighted in red. The correct part is included at the very bottom for reference.

| Rule | Situation | Identifying | Solution |
|---|---|---|---|
| PITCH | | | |
| A | Different enharmonic spelling | OMR, MIDI have different pitches, but same MIDI pitch value | Use MIDI representation of pitch |
| B | Accidental Error | OMR, MIDI have different pitches, different MIDI pitch value | |
| B1 | Missing OMR accidental | Adding an accidental to OMR gives same MIDI pitch value | Use OMR pitch with accidental |
| B2 | Added OMR accidental | Removing an accidental from OMR gives same MIDI pitch value | Use OMR pitch without accidental |
| B3 | Misread OMR accidental | Modifying OMR note’s accidental gives same MIDI pitch value. Sharps can be modified to naturals. Flats can be modified to naturals or double flats. Naturals can be modified to flats or sharps. Double flats can be modified to single flat. Double sharps can’t be modified. | Use OMR pitch with modified accidental |
| C | Misread OMR line or space | OMR, MIDI have different pitches, different MIDI pitch value, within augmented third interval | Use OMR pitch transposed |
| D | Clef Error | | |
| D1 | Obscure Clef Change in MIDI | Different clefs but same MIDI pitch values | Use OMR clef |
| D2 | Extraneous MIDI Clef Change | Different clefs, different MIDI pitch values. Removing the MIDI clef doesn’t change later pitches. Previous MIDI clef is the same as the one under consideration. | Use OMR clef, don’t use MIDI clef |
| D3 | Misread/Missing Clef in OMR | Sequence of at least two OMR notes can be transposed to have the same MIDI pitch value as corresponding MIDI notes. MIDI and OMR sequences have the same rhythm. Misread clef is guaranteed if a nonconventional clef for that instrument is used. | Add clef or change clef in OMR |
| H | Other Pitch Errors | OMR and MIDI have different MIDI pitch values not resolved by rules in A, B, C | Requires manual fixing |
| RHYTHM | | | |
| I | Misrepresented MIDI Rests | Same total rest duration, different shape of rests | Use OMR rests |
| J | MIDI rhythm too specific | Rhythm difference less than a 32nd note on notes with same MIDI pitch value | Use OMR rhythm |
| K | Misread dot | Removing the dot from both MIDI and OMR gives same duration | Use MIDI rhythm |
| L | Ornaments | | |
| L1 | MIDI has an expanded ornament, OMR has an ornament on one note | MIDI has many busy notes, which could be expanded from the single OMR note, and the OMR note has ornaments in its expressions | Use OMR note to replace all busy notes |
| L2 | MIDI has an expanded ornament, OMR is missing an ornament on one note | MIDI has many busy notes, which could be expanded from the single OMR note, but the OMR note does not have any ornaments in its expressions | Use OMR note with appropriate ornament added onto it to replace all busy notes |
| M | Articulations | | |
| M1 | Different Shortness | One part has a longer note and the other has a short note and rest in the same duration | Use OMR representation |
| N | Missing Notes | One part has rests and other has notes for at least a measure | Use part with notes |
| R | OMR rhythm doesn’t add up to whole measure | OMR rhythm doesn’t add up to appropriate measure duration | Consider the result of M. Church’s algorithm on the problematic OMR measure [10]. Use MIDI rhythm confidently if it matches the result, or tentatively if it does not. |
| S | Other Rhythm Errors | | Consider the result of M. Church’s algorithm on the problematic OMR measure [10]. Use MIDI rhythm confidently if it matches the result, or tentatively if it does not. |
| OTHER | | | |
| T | Key Signature Mismatch | OMR and MIDI have different key signatures | |
| T1 | Key Signature Mismatch for Single Key Signature Piece | OMR and MIDI have at least one key signature that is different | Analyze the key and apply the one if matches either OMR or MIDI key |
| U | Pick up Mismatch | OMR and MIDI are aligned with identical, aligned notes falling on different beats most of the time | Shift MIDI to match OMR. Shift is the mode beat difference of aligned identical elements. |
| V | Unexpected Element | | |
| V1 | Chords in Non-guitar music | OMR has guitar tabs, but is not written for guitar | Convert guitar tabs to rehearsal letter marks |
| W | Time Signature Mismatch | OMR and MIDI have different time signatures | |
| W1 | Time Signature Mismatch for Single Time Signature Piece | OMR and MIDI have at least one time signature that is different | Choose or re-calculate the time signature with a measure duration equivalent to the most common measure duration and that adheres to beaming in OMR part |

Table 4.1: Table of some potential OMR-MIDI discrepancies and the ways to identify and resolve them
Green indicates a confident fix. Yellow is a tentative fix. Red cannot be fixed automatically. Grey indicates a variable correction status dependent upon further analysis. Purple rules are addressed in this project.
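As an illustration of how the pitch rules A and B1 through B3 in Table 4.1 might be checked, the following sketch classifies a single OMR/MIDI pitch pair. It is a simplified example, not the project's fixer code: pitches are hypothetical (step, accidental, octave) tuples, an accidental of `None` means no printed accidental, key signatures are ignored, and rules C onward are not covered.

```python
STEP_TO_SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# Rule B3: which accidentals each printed accidental may be modified to.
# Accidentals are semitone alterations: -2 double flat, -1 flat, 0 natural,
# 1 sharp, 2 double sharp (which cannot be modified).
B3_ALLOWED = {1: (0,), -1: (0, -2), 0: (-1, 1), -2: (-1,), 2: ()}


def midi_value(step, accidental, octave):
    """MIDI pitch value of a spelled pitch, with C4 = 60."""
    return 12 * (octave + 1) + STEP_TO_SEMITONE[step] + (accidental or 0)


def classify_pitch_discrepancy(omr, midi):
    """Return 'A', 'B1', 'B2', 'B3', or None if none of these rules applies."""
    normalize = lambda p: (p[0], p[1] or 0, p[2])
    if normalize(omr) == normalize(midi):
        return None  # same spelling, no pitch discrepancy
    target = midi_value(*midi)
    if midi_value(*omr) == target:
        return "A"  # different enharmonic spelling, same MIDI pitch value
    step, accidental, octave = omr
    if accidental is None:
        # B1: adding a sharp or flat to the OMR note gives the MIDI pitch value.
        if any(midi_value(step, added, octave) == target for added in (1, -1)):
            return "B1"
        return None
    # B2: removing the OMR accidental gives the MIDI pitch value.
    if midi_value(step, None, octave) == target:
        return "B2"
    # B3: modifying the OMR accidental gives the MIDI pitch value.
    allowed = B3_ALLOWED.get(accidental, ())
    if any(midi_value(step, changed, octave) == target for changed in allowed):
        return "B3"
    return None
```

Because key signatures are ignored here, removing a sharp and changing it to a natural give the same MIDI pitch value, so that case is reported as B2 before B3 is tried.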

4.2 Classifying Errors

Many different errors were identified in Haydn’s Opus 1 Quartet No. 1 movement [19], which inspired the table of errors shown in Table 4.1. This table specifies means for identifying the existence of specific errors and the proper resolution technique. Fixers to identify and resolve some of these errors were implemented and are described in this chapter.

Elements can be marked with the following correction statuses: confident fix, tentative fix, discrepancy, and no discrepancy. Elements with no discrepancy do not need to be fixed. Problematic elements which no fixers can identify remain discrepancies. Elements with a discrepancy that the fixers can fix are fixed either confidently or tentatively. Some fixers can identify only the most likely solution, whereas others can identify a solution that is highly likely to be correct. These are differentiated as tentative and confident fixes, respectively.

4.3 Initial Fixers for Measure Alignment

The technique of displaying OMR and MIDI parts vertically aligned in a score is only useful if the measures are aligned. In other words, the first measure in the OMR part must also be the first measure in the MIDI part, and so on for the second, third, and remaining measures. When the measures are not aligned, some fixers can be applied to attempt to re-align them. Sometimes, even after these fixers are applied, the parts may still be unaligned, as indicated by different numbers of measures in each part. In that case, since it is difficult for a human to understand and resolve errors in the later visual stage, the corrector will return an error to alert the user to manually adjust the measures. This is acceptable, as the corrector is not intended to correct all resultant OMR scores, but only ones with relatively few mistakes.
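The check described above amounts to comparing measure counts before any visual review. A minimal sketch follows, assuming each part is available as a list of measures; `MeasureAlignmentError` and `check_measure_alignment` are hypothetical names chosen for illustration.

```python
class MeasureAlignmentError(ValueError):
    """Raised when the OMR and MIDI parts have different numbers of measures."""


def check_measure_alignment(omr_measures, midi_measures):
    """Raise an error if the parts cannot be stacked measure by measure."""
    if len(omr_measures) != len(midi_measures):
        raise MeasureAlignmentError(
            f"OMR part has {len(omr_measures)} measures but MIDI part has "
            f"{len(midi_measures)}; adjust the measures manually and re-run."
        )
```

Equal measure counts do not prove the measures correspond, so this is only a necessary condition for a useful stacked display.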

The initial adjustments are of two types. One type, which encompasses key signature and meter fixing, is applied to the entire piece instead of to each problematic element. The other type improves measure alignment.

4.3.1 Shifting to Account for a Pickup Measure

Figure 4-3: Viewing display parts with different pickup measures is difficult
The MIDI part is missing a short pickup measure, as its first measure is the full duration. OMR and the ground truth correctly have a pickup measure, with a total duration of one eighth note.

One reason the measures could be misaligned is that the parts have different pickups. This manifests as the MIDI part starting at the wrong place in the first measure. Notes are then shifted into the wrong measure, making it difficult to visually compare the incorrectly shifted part with a correctly shifted one. The system can still automatically apply fixes since it relies on points of alignment, not physical offset from the beginning, but this error will make fixing challenging for a user. A shifter was created for the purpose of adjusting the pickup, such that notes belong to the correct corresponding measures between OMR and MIDI parts. The shifter can be run on the input to the corrector.

A pickup is a measure at the beginning of a piece, or following a repeat sign, which is shorter than the bar duration of the time signature. The bar duration of a time signature is how long a measure in a piece with a given time signature should last. The first note in a pickup measure starts some offset after the beginning of the measure. That offset time is either filled with rests, or, more commonly, omitted. Musicians follow the rule that if a measure is too short according to the time signature, its notes start such that the last musical element ends at the end of the measure. Pickups are critical to beats lining up in the correct position in measures.
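The rule that a short measure's last element ends at the end of the measure can be written as a one-line calculation. The sketch below is illustrative only; durations are measured in quarter notes, and `pickup_start_offset` is a hypothetical helper name.

```python
from fractions import Fraction


def pickup_start_offset(bar_duration, durations):
    """Offset into the measure at which a pickup's first element starts.

    bar_duration: full measure length in quarter notes (4 for 4/4 time).
    durations: lengths of the elements in the short measure, in quarter notes.
    """
    total = sum((Fraction(d) for d in durations), Fraction(0))
    if total >= bar_duration:
        return Fraction(0)  # a full measure is not a pickup
    # The omitted time comes first so the last element ends on the bar line.
    return Fraction(bar_duration) - total


# A pickup of one eighth note in 4/4 time starts 3.5 quarter notes into the measure.
print(pickup_start_offset(4, [Fraction(1, 2)]))
```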

Beats in a measure are signified by the position in the measure and receive different emphasis. The beat number is indicated by the position in the measure: the first beat is at the beginning of the measure, the last beat is at the end of the measure, and the other beats are in the middle. Each beat occupies the same duration and could contain multiple notes. Notes often span multiple beats. For a time signature with four beats in western music, the first beat has the most emphasis, the third has the second most emphasis, and the second and fourth have the least emphasis. For another example, consider a waltz, which consists of three beats: strong, weak, weak.

Music that is shifted such that the emphasized first beat is no longer in the first position of the measure is difficult to read and interpret. It is important that the pickup is preserved, so notes belonging to the first beats exist at the beginnings of measures and are not shifted later or earlier. This is true for all beats: they should not be shifted. Moreover, it is difficult to find corresponding notes between a shifted and non-shifted version as corresponding notes will not be in the same position in the measure. Because corresponding notes are not in the same position in the measure, they could even have different beaming, causing them to look different. Therefore, if one part is shifted and the other is not, it is difficult to assess and resolve discrepancies

50 between MIDI and OMR parts. In MIDI encoding, timing is critical to playback, but the divisions of notes into measures is not critical. MIDI encoding begins the score at the beginning of the first measure, without an offset. However, pickup measures begin some offset intothe measure. One way to express pickup measures in MIDI is to have a different time signature at the beginning of the pickup measure and then switch to the piece’s actual time signature for the following, non-pickup measure. This is extra effort, which is unnecessary, especially when the primary use case of the MIDI file is generating audio. Moreover, changing the time signature for the first measure would be a discrepancy from how the pickup would be notated in printed music. Hence, pieces with pickups can be displayed incorrectly by MIDI. This issue is detected when the beats of corresponding and identical notes be- tween MIDI and OMR have a different offset within their respective measures. More specifically, the most common offset is detected. This is because there could besome anomalies that would otherwise skew this number. To fix this issue, the MIDI part needs to be shifted the difference of that offset so the beats are aligned. TheOMR part is trusted because it is more likely to have the correct separation of notes into measures. This is due to the fact that it is generated from the visual representation, which values notes’ groupings into measures. Granted, OMR could still have a mis- take if bar lines are misread, so a human in the loop is still necessary to potentially apply a better shift. Shifting the MIDI part is not as simple as shifting the start duration of every note or other musical elements. Part streams contain measure streams which contain notes. Shifting the elements in a part means potentially moving elements from one measure to another: either partially or entirely. The shifter shifts a stream some duration earlier or later. 
When the stream contains multiple parts or other non-measure streams, it recursively shifts each contained stream and combines the results into the encompassing stream. This is allowable because musical elements never need to be moved between such streams. For other streams, the shifter ensures the stream has measures, making measures if they do not exist. The shifter copies elements, such as notes and rests, from the measured stream into a new measure until that measure is full. Once that measure is full, it is added to the new shifted stream and subsequent elements are added into another new measure, which is in turn added to the stream once it is full. This process is repeated until all elements in the measured stream have been considered.
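The detection and shifting steps can be illustrated with a minimal sketch. It is not the thesis implementation: notes are plain tuples rather than music21 streams, the function names are hypothetical, the shift is assumed to be non-negative, and a note that crosses a bar line is simply split in two.

```python
from collections import Counter


def most_common_offset_difference(pairs, bar_duration):
    """pairs: (midi_offset, omr_offset) of matched identical notes,
    each measured from the start of the note's own measure."""
    diffs = Counter((omr - midi) % bar_duration for midi, omr in pairs)
    return diffs.most_common(1)[0][0]


def shift_and_remeasure(notes, shift, bar_duration):
    """notes: (absolute_offset, duration, name) in quarter lengths.
    Returns a list of measures holding (offset_in_measure, duration, name).
    Assumes shift >= 0 (hypothetical simplification)."""
    measures = {}
    for offset, duration, name in notes:
        start = offset + shift
        remaining = duration
        while remaining > 0:
            index = int(start // bar_duration)
            # Room left before the next bar line.
            room = (index + 1) * bar_duration - start
            piece = min(room, remaining)
            measures.setdefault(index, []).append(
                (start - index * bar_duration, piece, name))
            start += piece
            remaining -= piece
    if not measures:
        return []
    return [measures.get(i, []) for i in range(max(measures) + 1)]
```

For a 4/4 piece whose MIDI part starts its one-beat pickup at offset 0, the most common difference would be 3 quarter lengths, and shifting by 3 places the pickup on the fourth beat of the first measure.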

4.3.2 Collapsing Repeats

Figure 4-4: Content repeated in the Haydn MIDI part that needs collapsing

While MIDI can include optional notation for bar lines and repeat signs, repeat signs are not necessarily included in MIDI files, particularly when playback is the primary use case. When repeat signs are not included, the material in the repeated section is written out explicitly multiple times. This can cause the MIDI part to be longer than the OMR part, meaning the measures will not be aligned for easy display.

Figure 4-5: OMR misreading a repeat sign as a chord from Haydn

OMR parts should contain repeat signs. However, as shown in Figure 4-5, repeat signs can be misread as a chord, since chords are visually similar: a vertical stem with circles along it. Other times, repeat signs are misread as bar lines when the dots are not detected. In either case, the measure structure of the OMR part is most likely accurate, even if repeat signs are omitted.

By applying music21’s repeat.RepeatFinder to the MIDI part, sections which repeat can be found and then collapsed. By simplifying the found repeats in the MIDI part, the OMR and MIDI measures have one less issue preventing them from being aligned. An additional benefit of collapsing repeats is minimizing the content necessary for review, as a person does not need to redundantly review the repeated section each time it occurs.

There is a possibility that the repeat finding and simplifying functionality will overly condense the MIDI part by finding additional short repeated sections which are not notated with a repeat sign in the sheet music. Consider a lower part such as the cello in Pachelbel’s Canon in D, which may repeat a section multiple times while other parts layer different variations over each repeat. While such a cello part could be condensed to a series of repeats, it should not be, because all parts should share the same measure structure: regions which repeat should repeat for all parts. This oversimplification of a single part can be avoided by passing all MIDI parts for that piece to the repeat finding and collapsing function. On the other hand, sections which should be collapsed into a repeat might not be collapsed if there is a mistake in one iteration of the section, causing the repeated material to differ across repeats. In either scenario, human intervention can fairly easily detect and fix these mistakes, as large stretches of mismatching measures are easy to spot.

To ensure the OMR part has the correct repeats included, the fixer could loop through the bar lines at the beginning and end of each corresponding measure and ensure the OMR measures have the same repeat signs as the modified MIDI measures.
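The reason for passing all parts together can be shown with a small stand-in for the repeat finder. This sketch does not use music21; `collapse_repeats` is a hypothetical helper that treats each measure as a hashable value and only collapses a block when it is immediately repeated in every part at once.

```python
def collapse_repeats(parts, min_length=2):
    """parts: list of parts, each a list of hashable measure contents,
    all of the same length. Returns (collapsed_parts, repeats), where
    repeats holds (start_index, block_length) in original measure indices."""
    # Combine the parts column-wise so a repeat must hold in all parts.
    columns = list(zip(*parts))
    kept, repeats = [], []
    i = 0
    while i < len(columns):
        found = 0
        # Prefer the longest block that is immediately repeated.
        for length in range((len(columns) - i) // 2, min_length - 1, -1):
            if columns[i:i + length] == columns[i + length:i + 2 * length]:
                found = length
                break
        if found:
            kept.extend(range(i, i + found))
            repeats.append((i, found))
            i += 2 * found  # skip the written-out second pass
        else:
            kept.append(i)
            i += 1
    return [[part[k] for k in kept] for part in parts], repeats
```

With a cello line `['a', 'b', 'a', 'b']` analyzed alone, the block collapses to `['a', 'b']`; analyzed together with a violin line `['p', 'q', 'r', 's']`, nothing is collapsed, which is the behavior the Pachelbel example calls for.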

4.3.3 Fixing Misread Multirests

Figure 4-6: OMR misreading a multirest as a single bar of rest from Haydn

Stretches of silence are represented by rests. A multirest is a series of multiple measures of rest, denoted by a single measure of rest with a number above it signifying the total number of measures of rest. See Figure 4-6 for an example. OMR systems can have difficulty interpreting this notation, sometimes incorrectly outputting a single measure of rest, seemingly ignoring the number above. Missing measures of rest cause measure misalignment.

This is fixed by looping through each of the OMR measures. A fix should be applied when an OMR measure m1 consisting entirely of rests is followed by an OMR measure with at least one non-rest element, and m1 corresponds to a MIDI measure m2 consisting entirely of rests that is itself followed by some number of MIDI measures of rest. To fix this, more measures of rest are inserted following measure m1 in the OMR part. The number of measures inserted is equal to the number of consecutive bars of rest immediately following MIDI measure m2, that is, the run of rest measures not interrupted by any measure with non-rest elements.
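A minimal sketch of this loop follows, under the assumption that each measure is reduced to a boolean (True when it contains only rests) and that the OMR and MIDI measures are otherwise in step. The name `fix_multirests` is hypothetical.

```python
def fix_multirests(omr, midi):
    """omr, midi: lists of booleans, True when a measure is entirely rests.
    Returns a new OMR list with the missing rest measures inserted."""
    fixed = []
    j = 0  # index of the MIDI measure corresponding to the current OMR one
    for i, is_rest in enumerate(omr):
        fixed.append(is_rest)
        next_has_notes = i + 1 < len(omr) and not omr[i + 1]
        if is_rest and next_has_notes and j < len(midi) and midi[j]:
            # Count the uninterrupted rest measures following MIDI measure m2.
            extra = 0
            while j + 1 + extra < len(midi) and midi[j + 1 + extra]:
                extra += 1
            fixed.extend([True] * extra)
            j += extra
        j += 1
    return fixed
```

For example, an OMR part read as note measure, one rest measure, note measure against a MIDI part with three rest measures in the middle gains two inserted rest measures.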

4.4 Other Initial Fixers

Other fixes are applied once at the beginning of the fixing process.

4.4.1 Key Signature Resolution

Figure 4-7: Examples of key signatures

Figure 4-8: Example of key signature applied to a B flat major scale

Figure 4-9: Key signature discrepancy between OMR and MIDI from Haydn. MIDI has an F major key signature with one flat, whereas OMR has a B flat major key signature with two flats. The OMR is correct for this specific instance.

The key signature is a set of sharps or flats, which are by default applied to the notes following it. For example, the key signature associated with B flat major has two flats: B flat and E flat. This means every note which is written as a B or E in the piece is played as a B flat or E flat. This simplifies notation, making it easier to read the music, as well as to analyze chord progressions.

OMR systems can misread key signatures by misreading the number of sharps or flats, reading key signatures where there are none, or omitting key signatures where there are. A single key signature mistake can propagate to create many secondary pitch errors throughout the piece, since the notes following will be incorrectly sharped or flatted. In previous work aligning OMR and MIDI parts, misread key signatures were identified as the cause of large numbers of pitch discrepancies between MIDI and OMR parts [37].

MIDI parts can be trusted to have the correct pitch frequency, but the key signature may not be correct. For example, the key signature of C major, which has no flats or sharps, could be used, but if most B and E notes are flatted, and the chord structure matches that of B flat major, the key signature of B flat major with B flat and E flat might be more appropriate. While the playback pitches are correct, the MIDI notation could be difficult to read. However, if the MIDI part has too many random accidentals due to a less ideal enharmonic choice, the analyzed key may be incorrect.

Thus, it is necessary to unify the key signature across the MIDI and OMR parts. This fixer assumes a piece with a single key signature, which is a reasonable assumption for many pieces, especially simpler ones. Using music21’s Krumhansl-Schmuckler key signature analysis algorithm, the most likely key for the MIDI part is determined [27]. This is the analyzed key.
The MIDI part is analyzed because it has correct pitch frequencies, even if abstrusely notated, as opposed to the OMR part, which could have independently misread pitches in addition to many secondarily misread pitches if there is an incorrectly read key signature. If the analyzed key matches any key in the OMR part, it is deemed to be the correct key. This is because it is unlikely that the OMR system misreads the key signature and that the misread key signature happens to match the analyzed key signature. It is more likely that a MIDI part would have an incorrect key signature and the analyzed key would match. However, if no matches are found in the OMR keys, the analyzed key is deemed to be the correct key if it matches any MIDI key. If there are still no matches, the correct choice is the most common OMR key, breaking ties by prioritizing earlier keys. If there are no OMR keys, the correct choice is the analyzed key, because all pieces must have a key signature.

With the correct key chosen, all key signatures are removed from the OMR and MIDI parts unless they are the correct key signature and at the very beginning of the piece. If the OMR or MIDI part is then missing a key signature, the correct key signature is inserted at the beginning of the part. When the fixer is finished, the OMR and MIDI parts will have the same key signature at the beginning of each, which is hopefully the correct key signature. Human intervention in the last stage can confirm this.
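The selection order described above can be summarized in a short sketch. Keys are represented as plain strings, the analyzed key is assumed to have been computed elsewhere (for instance by a key analysis routine), and `choose_key` is a hypothetical name rather than the corrector's actual function.

```python
from collections import Counter


def choose_key(analyzed, omr_keys, midi_keys):
    """analyzed: key found by analysis of the MIDI part.
    omr_keys, midi_keys: key signatures in score order."""
    # An analyzed key confirmed by either part is accepted
    # (the OMR part is consulted first, then the MIDI part).
    if analyzed in omr_keys or analyzed in midi_keys:
        return analyzed
    if omr_keys:
        counts = Counter(omr_keys)
        best = max(counts.values())
        # Ties go to whichever key appears earliest in the OMR part.
        return next(key for key in omr_keys if counts[key] == best)
    # No OMR keys at all: fall back to the analyzed key.
    return analyzed
```

For the situation in Figure 4-9, an analyzed key of B flat major that matches the OMR key would be chosen over the MIDI part's F major.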

4.4.2 Time Signature Resolution

Sometimes the meter at the beginning of an OMR part is misread. A meter is represented by a numerator, which is the number of beats in the measure, and a denominator, which determines the beat length: four divided by the denominator gives the quarter-length duration of the beat. A meter’s bar duration is defined as the total duration of a measure in a piece with that meter. MIDI parts can also have meter errors by having a meter with the correct bar duration but incorrect subdivisions. For example, 6/8 and 3/4 have the same bar duration of three quarter lengths, but 6/8 is counted in eighth notes and 3/4 in quarter notes.

This fixer is for pieces without meter changes, or in other words, pieces with a single meter. It takes in OMR and MIDI parts which are either scores of all the parts or a single part. This is because all parts in a piece have the same time signature, and more information can be useful in accurately finding the meter. First, the most common measure duration is calculated over all measures in the OMR and MIDI parts combined. Then, all meters in all the OMR parts are considered in order of occurrence and in part order. The first meter found which has the same bar duration as the most common measure duration is considered the correct meter. The chance that a meter is misread by OMR and still has the correct bar duration is low, since this is only possible if both the numerator and the denominator are misread and the misreading happens to preserve the ratio of the correct meter.

If no correct meter is found in any of the OMR parts, then the fixer adjusts each of the OMR meters to create two new meters: one assuming the numerator was misread and one assuming the denominator was misread. These new meters are constructed to have the correct bar duration. Since the numerator and denominator cannot themselves be fractions, a corresponding numerator or denominator may not exist. If both new meters are valid, they are compared to see which better fits the beaming structure of the current OMR part. Beaming of notes, or connecting smaller notes like eighth notes, is applied according to the breakdown of beats as dictated by the meter. Since OMR output is generated from a visual representation, its beaming should be mostly correct, barring errors. The metric used to evaluate the modified meters is the percentage of correctly beamed measures among the measures with duration equal to the most common measure duration. The modified meter with the highest value of this metric is deemed to be the correct meter.

However, if at this point, there is still not a correct meter, the fixer checks if there is a meter in any MIDI parts which has a bar duration equal to the most common measure duration. If so, that is the correct meter. If there is still no correct meter, all meters up to a set denominator of 13 (which covers the most common meters) are considered. The numerator is generated from the denominator such that the bar duration is equal to the most common measure duration and the numerator is an integer. The meter from these generated ones with the most accurate beaming is selected as the correct meter.
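The arithmetic behind the adjusted and generated meters can be sketched as follows. This is only an illustration: the function names are hypothetical, durations are exact fractions of a quarter length, and the beaming comparison that picks among the candidates is left out.

```python
from fractions import Fraction


def bar_duration(numerator, denominator):
    """Bar duration in quarter lengths, e.g. 6/8 -> 3."""
    return Fraction(4 * numerator, denominator)


def adjusted_meters(numerator, denominator, target):
    """Up to two meters with bar duration target: one replacing the
    numerator, one replacing the denominator. target must be a Fraction."""
    candidates = []
    new_numerator = target * denominator / 4
    if new_numerator.denominator == 1 and new_numerator > 0:
        candidates.append((int(new_numerator), denominator))
    new_denominator = Fraction(4 * numerator) / target
    if new_denominator.denominator == 1:
        candidates.append((numerator, int(new_denominator)))
    return candidates


def generated_meters(target, max_denominator=13):
    """All meters up to max_denominator whose numerator is a whole number."""
    result = []
    for denominator in range(1, max_denominator + 1):
        numerator = target * denominator / 4
        if numerator.denominator == 1 and numerator > 0:
            result.append((int(numerator), denominator))
    return result
```

For a most common measure duration of three quarter lengths, a misread 4/4 yields only 3/4 (no whole-number denominator gives 4 beats in three quarter lengths), while a misread 6/4 yields both 3/4 and 6/8, which would then be compared by beaming.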

Similar to the key signature fixer, this fixer replaces meters in both OMR and MIDI parts, such that each has the correct time signature exactly once at the beginning. After this is done, if the meter in the OMR part was previously incorrect, there is a chance that rests in full bars of rest have a duration equal to the previous incorrect bar duration. This is because full bars of rest are created with the assumption that they last the full bar duration. If the bar duration changes, any OMR measure containing only a single rest should be updated to the correct duration.

4.4.3 Replacing Intruding Guitar Tablatures

Figure 4-10: Misread rehearsal mark as guitar tablature from Haydn

SmartScore reads rehearsal letter markings as chord tablature notation. Since this corrector is applied to string quartets, symbols like chord tablature, which do not make sense for these instruments, can be removed. In this case, the appropriate rehearsal letter mark can replace the chord tablature according to a generalization of the following rule: if the chord is for A, the rehearsal mark is ’A’. This is helpful, since the aligner considers tablature notation as pitches to align, which could appear as discrepancies.

4.5 Iterative Fixers

4.5.1 Rest Representation Discrepancies

MIDI encodes periods of rest for the correct total amount of time, but sometimes produces the wrong notation. MIDI specifies note on and note off commands, and rests are the inferred silence between them [13]. Thus, MIDI has no way to encode specific visual notation for rests.

Figure 4-11: Rest discrepancy from Haydn. MIDI is incorrect and OMR is correct.

There are a variety of symbols to represent rests of different durations, and there is a correct way to divide a period of silence into rests according to the beat pattern of the time signature. For example, consider the time signature of 6/8, which has six eighth-note beats. These beats are grouped into two groups of three eighth notes each. Rests should reflect these beat and grouping patterns. If there is only an eighth note at the end of a 6/8 measure, the correct rest pattern at the beginning would be: quarter rest (equal to two eighth notes), eighth rest, quarter rest, as shown in Figure 4-11. MIDI parts could display this as any series of rests so long as the total duration is correct.

This fixer identifies and corrects periods of rest which are aligned at the beginning and have the same total duration, but have different rest representations. The OMR visual information can be trusted because it is validated by the duration information in the MIDI part.
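The condition this fixer checks can be written as a minimal sketch. Rest runs are reduced to lists of quarter-length durations, and `fix_rest_run` is a hypothetical name; the actual fixer operates on aligned score elements.

```python
def fix_rest_run(omr_rests, midi_rests, omr_start, midi_start):
    """omr_rests, midi_rests: rest durations in quarter lengths.
    omr_start, midi_start: offsets at which each run of rests begins.
    Returns the rest durations the MIDI part should display."""
    same_span = (omr_start == midi_start
                 and sum(omr_rests) == sum(midi_rests))
    if same_span and omr_rests != midi_rests:
        # Same silence, different notation: adopt the OMR representation.
        return list(omr_rests)
    return list(midi_rests)
```

In the 6/8 example, an OMR run of quarter, eighth, quarter (`[1, 0.5, 1]`) replaces a MIDI run of half, eighth (`[2, 0.5]`), because both begin together and total 2.5 quarter lengths.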

Figure 4-12: Set of common accidentals to fix

4.5.2 Misread Accidentals

Pitch dictates the frequency of the sound vibrations, which influences how high or low a note sounds. It is notated by a notehead’s vertical position on a specific line or space of the staff and, optionally, by specific accidentals before the notehead. As shown in Figure 4-12, the three most common accidental symbols are a sharp, flat, and natural. More complicated accidentals are double sharps and double flats, or even triple sharps or triple flats. For this system, I only consider the most common accidentals. A sharp sign in front of a note raises its pitch by a half step and a flat sign in front of a note lowers its pitch by a half step. An accidental applied to a note applies to all other notes following it in that measure with the same notehead vertical position, unless the accidental is canceled out with a natural sign. The natural sign in front of a note indicates the pitch should not be adjusted higher or lower from the pitch indicated by its position on the staff. Natural signs can also cancel out accidentals from the key signature, which otherwise apply to all notes of that type throughout the piece. For example, G major has one sharp in its key signature, which is F sharp, meaning all notes written as an F in the piece are played as F sharps unless they are preceded by a natural sign. A natural sign preceding a note that is already unmodified causes no change in the sound of the pitch. However, sometimes redundant accidentals are added before notes as reminders to the musician. These are called courtesy accidentals and can make reading the music easier.

Some pitches are enharmonic, meaning they represent the same frequency but have different names and appear differently in music. For example, the note F is two half steps below G, so F sharp, which is one half step above F, and G flat, which is one half step below G, meet at the same frequency. However, these notes belong to different keys, and in those keys take on different meanings. Notes are printed with the simplest accidental representation which is true to the key, so the specific pitch notation is important.

Figure 4-13: An example of enharmonic notes

MIDI can only encode the MIDI pitch number. There is a unique pitch number for each sounding musical frequency, and every sounding pitch has a MIDI pitch. Middle C and the B sharp below it both have a MIDI pitch number of 60. Hence, the information regarding the note name and accidental is lost, since there is a many-to-one mapping from written pitches to MIDI numbers. This makes MIDI notation much harder to read and analyze. However, MIDI pitches can be trusted to sound correct.

OMR systems can easily mistake accidentals. Accidentals can incorrectly be added where they do not exist, omitted where they do exist, or interchanged for other accidentals. Looking at the visual similarity of accidental signs, it is believable that OMR systems could interchange sharps and naturals, and flats and naturals, but not flats and sharps. Sharps and naturals have a similar box shape with lines extending. Naturals and flats both have vertical lines from an enclosed shape. Sharps and flats have too many visual differences to be reasonably interchanged in cleanly printed and scanned sheet music.

The fixer considers whether notes with a pitch discrepancy are enharmonic. If not, it considers whether modifying the accidental resolves the discrepancy. In the case of an enharmonic discrepancy, the OMR pitch is trusted. For example, if the MIDI pitch is G flat and the OMR pitch is F sharp, as shown in Figure 4-13, the fixer determines the OMR’s F sharp is correct with validation from the MIDI G flat. However, the aligner used for the corrector aligns notes based on their MIDI pitch number, so enharmonic differences are not identified as discrepancies for this fixer to resolve. This is another reason it is critical that the corrector corrects and outputs the OMR score: these enharmonic discrepancies are not detected, but can be left as is in the OMR part.

Figure 4-14: Logic to fix accidental errors. Incorrectly read OMR notes in the left column should have been either possible correction in the right column if an accidental error is at fault.

Therefore, when this fixer is called from the corrector, it only attempts to identify whether an accidental error occurred. The fixer considers three conditions for the state of the OMR note: it has a sharp, it has a flat, or it has either a natural or no accidental. In the case that there is a sharp and a discrepancy, the original scanned note could have had no accidental or a natural sign. Both cases can be tested by removing the accidental from the OMR note and checking whether that pitch, which is a half step lower, is enharmonic with the MIDI pitch. If they are enharmonic, the correct pitch is the OMR pitch without the sharp. In a similar way, when there is a flat on the OMR note and a discrepancy, the original scanned note could have had no accidental or a natural sign. Both cases can be tested by removing the accidental from the OMR note and checking whether that pitch, which is a half step higher, is enharmonic with the MIDI pitch. If they are enharmonic, the correct pitch is the OMR pitch without the flat.

A technicality of the music21 representation is that, for both scenarios of OMR originally having a sharp or flat, the correct pitch is marked as having no accidental as opposed to having a natural accidental. This is because an accidental sign will be automatically added in the display to the note if there are other notes or a key signature that would otherwise change the pitch up or down. Otherwise, excessive unnecessary natural signs make the music more difficult to read. The only issue with this is that courtesy natural signs will be lost if they are misread as a sharp or flat. The courtesy natural sign is used as a hint to the performer, oftentimes on a note they would be likely to read incorrectly. While this omission of a courtesy natural sign makes it more likely that musicians reading the music will make a mistake on that note, the note will technically be correct and not necessarily more difficult to read.

The other scenario is if there is no accidental or there is a natural sign on the OMR note. Both cause the pitch to be unchanged from that indicated by the notehead’s vertical staff position. It is possible, if there is no accidental sign, that a sharp or flat sign was ignored by the OMR system. It is also possible, if there is a natural sign on the OMR note, that it was supposed to be a sharp or flat that was misread. Both scenarios can be checked by testing if adding either a sharp or flat to the OMR note causes the OMR and MIDI notes to be enharmonic. If adding a sharp or flat causes the MIDI and OMR notes to be enharmonic, then the OMR pitch can be fixed to be its original pitch with the appropriate sharp or flat accidental.
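The three cases can be condensed into a minimal sketch. It is a simplification under stated assumptions: pitches are (step, alteration, octave) triples rather than music21 objects, key signature and earlier accidentals in the measure are ignored, "enharmonic" is tested by comparing MIDI numbers, and `fix_accidental` is a hypothetical name.

```python
STEP_TO_SEMITONE = {'C': 0, 'D': 2, 'E': 4, 'F': 5, 'G': 7, 'A': 9, 'B': 11}


def midi_number(step, alter, octave):
    """MIDI number of a written pitch; middle C ('C', 0, 4) is 60."""
    return 12 * (octave + 1) + STEP_TO_SEMITONE[step] + alter


def fix_accidental(step, alter, octave, midi_pitch):
    """alter: +1 for a sharp, -1 for a flat, 0 for a natural or no sign.
    Returns the corrected alteration, or None if no accidental change
    makes the OMR note match the MIDI pitch."""
    if midi_number(step, alter, octave) == midi_pitch:
        return alter  # no discrepancy to fix
    # A sharp or flat may be a misread natural or spurious sign;
    # a natural or missing sign may be a misread or dropped sharp or flat.
    candidates = [0] if alter != 0 else [1, -1]
    for candidate in candidates:
        if midi_number(step, candidate, octave) == midi_pitch:
            return candidate
    return None
```

For instance, an OMR F sharp aligned with MIDI pitch 65 is corrected to a plain F, while an OMR E aligned with MIDI pitch 63 gains a flat.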

This fixer resolves accidental errors very well, but it can imperfectly correct other pitch errors. The line or space a notehead is centered on can also be misread by OMR systems. For some misreadings of the line above or below a space, or the space above or below a line, the difference in pitch between the correct and misread pitch is a half step. In this case, the fixer will add or remove accidentals. This is not the correct fix, as the correct fix is adjusting the line or space the notehead is centered on. However, this fix is no less correct than the existing MIDI encoding, since the resulting note will sound correct. Moreover, fixes are still marked in the outputted file, so a person can quickly identify that the cause was not an accidental error, but a positional misreading. Adding another automated fixer would not resolve this issue, as it would be ambiguous which fixer is correct without knowledge of the correct pitch.

4.5.3 Undoing the Rhythmic Harm of a Misread Dot

Figure 4-15: Additional incorrect dots cause rhythm errors. The dots in the MIDI part indicate a short articulation. The red notes in the OMR part have incorrect extra rhythm dots. These dots cause the notes to be too long and should be removed.

Dots occur in music for two reasons: as a staccato above or below a notehead to denote a short articulation, or as a rhythm modifier to the right of a notehead. A notehead can be followed by any number of dots. One dot increases the duration of the note by half its duration. A second dot further increases the duration by a fourth of its duration, a third by an eighth, and so on. For example, an eighth note, which has a normal duration of half a quarter length, will have a modified duration of 0.75 quarter lengths with a dot to its right.

Misreading a small dot is highly possible. Many OMR computer vision algorithms rely on staff line removal to isolate elements and then reduce the noise [3]. If a staff line is not fully removed, an extra dot can appear. Other noise could be misinterpreted as a dot as well. On the other hand, original dots can be incorrectly removed during noise removal if they are considered noise. These small marks make a big difference in note duration.

For any aligned notes with different durations and at least one of the notes having a dotted rhythm, if removing all dots causes the notes to have the same duration, the duration of the OMR note should be changed to match the duration of the MIDI note. This is because, while the OMR system is prone to misreading, the sounding duration of MIDI notes can be trusted. This fix changes the duration of the OMR note, so the elements following that note must have their offsets adjusted.
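The dot arithmetic and the fix condition can be expressed as a short sketch. A note is reduced to an (undotted duration, number of dots) pair, the names are hypothetical, and the offset adjustment of the following elements is not shown. With $n$ dots, a note of undotted duration $d$ lasts $d\,(2 - 2^{-n})$.

```python
from fractions import Fraction


def dotted_duration(base, dots):
    """Duration of a note with the given undotted duration and dot count:
    base * (1 + 1/2 + 1/4 + ...) = base * (2 - 1 / 2**dots)."""
    return base * (2 - Fraction(1, 2 ** dots))


def fix_misread_dot(omr, midi):
    """omr, midi: (undotted_duration, dots) of two aligned notes.
    Returns the corrected OMR pair, or None if dots do not explain
    the difference in duration."""
    omr_base, omr_dots = omr
    midi_base, midi_dots = midi
    if dotted_duration(*omr) == dotted_duration(*midi):
        return omr  # durations already agree
    # At least one note is dotted and the undotted durations match,
    # so the MIDI duration is trusted.
    if (omr_dots or midi_dots) and omr_base == midi_base:
        return (midi_base, midi_dots)
    return None
```

A dotted eighth note evaluates to 3/4 of a quarter length, matching the example above, and an OMR dotted quarter aligned with a plain MIDI quarter loses its dot.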

4.5.4 Articulations Marked By Different Rhythms

Figure 4-16: Different rhythmic representations of a short quarter note from Haydn. MIDI is incorrect. OMR is the correct notation.

Composers can convey how short a note should be played either with articulation markings or by the duration of the note itself. As shown in Figure 4-16, a short note could be denoted by a quarter note with a staccato dot or as an eighth note followed by an eighth rest. Both notations will result in a shorter note that occupies one quarter length of duration. However, many musicians will distinguish between these notations, interpreting them differently. When there is a note in the OMR part which is aligned with a note of different duration in the MIDI part, such a difference in representation is a possible cause. The OMR is trusted if, in both parts, the aligned note together with the tied notes and rests immediately following it has the same total duration. This has the added benefit of resolving instances where MIDI represents a duration with an unnecessary tie.
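The duration comparison can be sketched as follows, assuming each part is flattened into a list of (kind, duration) events where kind is 'note', 'tied' (a note tied from the previous one), or 'rest'. The event encoding and function names are hypothetical.

```python
def span_duration(events, index):
    """Duration of the note at index plus any tied notes and rests
    that immediately follow it, in quarter lengths."""
    total = events[index][1]
    for kind, duration in events[index + 1:]:
        if kind not in ('rest', 'tied'):
            break
        total += duration
    return total


def omr_notation_is_validated(omr_events, omr_index, midi_events, midi_index):
    """True when both parts fill the same total duration from the aligned note."""
    return (span_duration(omr_events, omr_index)
            == span_duration(midi_events, midi_index))
```

A staccato quarter note in the OMR part (duration 1) is validated by a MIDI eighth note followed by an eighth rest (0.5 + 0.5), and likewise by a MIDI eighth tied to another eighth.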

4.5.5 Finding Missing MIDI Notes

Since MIDI’s primary function is audio playback, doubled parts can potentially be notated only once in a single part instead of redundantly in all of the parts where those notes should be played. The parts which should double, but incorrectly do not, instead have rests during this time.

Figure 4-17: Example of MIDI omitting redundant parts from Haydn

Any problematic OMR note which corresponds to a rest in the MIDI part is a potential start of this issue. The amount of space which can be filled is the minimum of the duration of uninterrupted MIDI rests and the duration of uninterrupted OMR notes. Candidates for filling that space are found by looking at that offset position in the other MIDI parts. Those candidates are copied into the MIDI part one at a time and are kept if the percent alignment increases.
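A rough sketch of the candidate search follows. It makes strong simplifying assumptions: every part is a list with one slot per time step holding a pitch number or None for a rest, and "percent alignment" is approximated by the fraction of slots that match the OMR part. The function name is hypothetical and the real corrector relies on its aligner instead.

```python
def fill_from_other_parts(omr, midi, other_parts):
    """Return a copy of midi in which rests under OMR notes are filled
    from other MIDI parts whenever that improves agreement with omr."""
    def agreement(candidate):
        return sum(a == b for a, b in zip(omr, candidate)) / len(omr)

    patched = list(midi)
    i = 0
    while i < len(omr):
        if omr[i] is not None and patched[i] is None:
            # Fillable space: uninterrupted OMR notes over MIDI rests.
            end = i
            while (end < len(omr) and omr[end] is not None
                   and patched[end] is None):
                end += 1
            # Try each other part in turn; keep a copy only if it helps.
            for other in other_parts:
                trial = patched[:i] + list(other[i:end]) + patched[end:]
                if agreement(trial) > agreement(patched):
                    patched = trial
            i = end
        else:
            i += 1
    return patched
```

In this toy encoding, a viola part with rests where the OMR shows notes would be filled from whichever other part carries the same pitches at that position.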

4.5.6 Ornaments

Figure 4-18: Variety of mordent ornament discrepancies from Haydn. OMR has the correct mordent in measure 23, but it is written out in the MIDI part. Both mordents in measure 24 are written out in the MIDI part, but the OMR part is missing both mordent symbols. Other non-ornament areas of discrepancy exist in these measures too.

Ornaments are flourishes added to a note, indicated by a symbol or marking above that note. These flourishes are generally several short duration notes with nearby pitches to the written note. In music notation, ornamental notes are not explicitly written on the staff and are instead inferred from the symbol. MIDI explicitly encodes each ornamental note because every sounding note needs a note on and note off encoding. OMR parts, and written notation in general, will encode a single note with an associated ornament expression. These are two ways of representing the same thing. However, the MIDI encoding of explicitly writing out the flourishes can be difficult to read because these flourishes consist of many short notes, meaning they add a lot of ink to the page to read.

These different ways of encoding an ornament are a discrepancy which needs to be resolved. Moreover, OMR systems could misread ornament symbols or markings, causing the note to be outputted without an associated ornament.

A set of ornament recognizers was created to identify situations where the simple OMR note can be expanded to the aligned busy MIDI notes occupying the same total duration, as specified by an ornament. The recognizers implemented can identify turns, trills, mordents, and tremolos from a series of busy notes, optionally with a simple note. When the simple note is provided, the recognizers check if the busy notes are an expansion of that simple note according to the ornament. When a simple note is not provided, the recognizer checks if the busy notes could be an expansion of any simple note, as specified by the ornament the recognizer is responsible for identifying. All ornament recognizers ensure the total duration of the busy notes equals the duration of the simple note. The recognizer returns a music21 expression for the recognized ornament, or false if none is recognized. The overall ornament checker must try to recognize ornaments in a specific order: specifically, it must try to recognize mordents before trills, as mordents are a subset of trills.

To resolve discrepancies caused by different ornament notations, the problematic OMR note and the corresponding MIDI notes occupying the same duration are passed as inputs to the ornament recognizers as the simple and busy notes respectively. When an OMR note is identified or confirmed to have an ornament, the appropriate ornament is added to that note’s expressions if that note does not already have any ornamental expressions. The discrepancy is resolved and that note is deemed correct with validation from the busy MIDI notes. The busy notes in the MIDI part are replaced with the modified OMR note.

Turn and Inverted Turn Recognizer

Figure 4-19: Resolving a missing turn

Figure 4-20: Resolving a missing inverted turn

Some ornaments are turns and inverted turns. Both are four notes which "turn" around the simple note. A regular turn, as shown in Figure 4-19, starts above the simple note, goes down to the simple note, goes down again, and then goes back up to the simple note. Inverted turns, such as the one in Figure 4-20, start below the simple note, go up to the simple note, go up again, and then go back down to the simple note. For all transitions between turn and inverted turn notes, the distance moved is a space or line on the staff, or in other words, an interval equal to a major second, minor second, or augmented second.

The recognizer ensures there are exactly four busy notes. When the simple note is provided, the recognizer also ensures the second and fourth busy notes are enharmonic with it. The intervals between the busy notes must be a major second, minor second, or augmented second, and the directions of the intervals must match those of either a turn or an inverted turn. Depending on the directions of the intervals, either a turn or an inverted turn is identified. If the busy notes do not meet the specifications detailed, the recognizer returns false, as neither a turn nor an inverted turn was found.
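The checks above can be sketched with MIDI pitch numbers standing in for notes. This is an approximation: a second is taken to be any move of one to three half steps, which cannot distinguish an augmented second from a minor third, and the function returns a string label instead of a music21 expression. `recognize_turn` is a hypothetical name.

```python
SECOND_SIZES = {1, 2, 3}  # minor, major, augmented second in half steps


def recognize_turn(busy, simple=None):
    """busy: four MIDI pitch numbers; simple: optional MIDI pitch number.
    Returns 'turn', 'inverted turn', or False."""
    if len(busy) != 4:
        return False
    # Equal MIDI numbers stand in for "enharmonic with the simple note".
    if simple is not None and (busy[1] != simple or busy[3] != simple):
        return False
    steps = [b - a for a, b in zip(busy, busy[1:])]
    if any(abs(step) not in SECOND_SIZES for step in steps):
        return False
    directions = [step > 0 for step in steps]
    if directions == [False, False, True]:   # down, down, up
        return 'turn'
    if directions == [True, True, False]:    # up, up, down
        return 'inverted turn'
    return False
```

Around middle C (60), the busy notes D, C, B, C form a turn and B, C, D, C form an inverted turn.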

Tremolo Recognizer

Figure 4-21: Resolving a missing eighth note tremolo

Figure 4-22: Resolving a missing sixteenth note tremolo

Another common ornament is a tremolo. Tremolos are marked by one or more slashes on the note stem and indicate that many shorter busy notes, with the same pitch as the notated note, are repeated for the notated note’s duration. The duration of each busy note is indicated by the number of slashes: eighth notes are one slash, sixteenth notes are two, and thirty-second notes are three. The recognizer checks that the busy notes are enharmonic or share the same pitch. When a simple note is provided, it also ensures every busy note is enharmonic with the simple note. It also checks that there are at least two busy notes. If the busy notes do not meet these specifications, the recognizer returns false. When returning a tremolo, the number of slashes is equal to log base two of one divided by the duration of a busy note, where the duration is measured in quarter notes.
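As a worked illustration of the slash formula, the following Python sketch applies the tremolo checks to hypothetical `(MIDI pitch number, duration in quarter notes)` pairs. Two points go beyond what the text states and are assumptions of this sketch: all busy notes are required to share one duration, and the result is rejected unless it is a whole number of at least one slash.

```python
import math
from fractions import Fraction
from typing import Optional, Sequence, Tuple


def recognize_tremolo(busy: Sequence[Tuple[int, Fraction]],
                      simple_pitch: Optional[int] = None) -> Optional[int]:
    """Return the number of tremolo slashes, or None if the notes do not qualify."""
    if len(busy) < 2:
        return None
    first_pitch, first_duration = busy[0]
    # All busy notes must share one pitch (same MIDI number here).
    if any(pitch != first_pitch for pitch, _ in busy):
        return None
    if simple_pitch is not None and simple_pitch != first_pitch:
        return None
    # Assumption of this sketch: the busy notes are evenly spaced.
    if any(duration != first_duration for _, duration in busy):
        return None
    # slashes = log2(1 / duration): 1/2 -> 1, 1/4 -> 2, 1/8 -> 3.
    slashes = math.log2(float(1 / first_duration))
    if slashes < 1 or slashes != int(slashes):
        return None
    return int(slashes)


sixteenths = [(60, Fraction(1, 4))] * 4   # four sixteenth notes filling a quarter note
print(recognize_tremolo(sixteenths, simple_pitch=60))
```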

Figure 4-23: Resolving a missing trill

Figure 4-24: Resolving a missing inverted trill

Figure 4-25: Resolving a missing enharmonic inverted trill

Figure 4-26: Resolving a discrepancy caused by notating a trill differently

Trill Recognizer

Trills are repeated oscillations between two notes which are within three half steps of each other. When a simple note is provided, the recognizer ensures one of those two notes has the same pitch as, or, as is the case in Figure 4-25, is enharmonic with the simple note. Trills typically oscillate between the notes more than once, but the minimum number of busy notes required to be classified as a trill is two. The busy notes are checked to ensure all odd busy notes are enharmonic with each other and all even busy notes are enharmonic with each other. The intervals between sequential busy notes are checked to ensure they are all within three half steps. If the busy notes are classified as a trill, a trill expression is created and assigned an interval size. When a simple note is provided, the assigned interval is the interval from the simple note to any non-enharmonic busy note. When a simple note is not provided, the interval is calculated from the first busy note to the second busy note. This is critical in determining the direction of the interval.
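A minimal Python sketch of these trill checks follows, under stated assumptions: pitches are MIDI numbers (so enharmonic notes compare equal), the two trill pitches are required to differ, and the returned value is a signed semitone count standing in for the interval that would be attached to a music21 trill expression. The function name is hypothetical.

```python
from typing import Optional, Sequence


def recognize_trill(busy_pitches: Sequence[int],
                    simple_pitch: Optional[int] = None) -> Optional[int]:
    """Return the trill interval in signed semitones, or None if not a trill."""
    if len(busy_pitches) < 2:
        return None
    first, second = busy_pitches[0], busy_pitches[1]
    # Odd-positioned notes share one pitch; even-positioned notes share the other.
    if any(pitch != (first, second)[i % 2] for i, pitch in enumerate(busy_pitches)):
        return None
    # The two pitches must be within three half steps (and, as assumed here, differ).
    if not 1 <= abs(second - first) <= 3:
        return None
    if simple_pitch is None:
        # Without a simple note, the direction comes from the first two busy notes.
        return second - first
    if simple_pitch not in (first, second):
        return None
    # With a simple note, measure from it to the other trill pitch.
    other = second if simple_pitch == first else first
    return other - simple_pitch


print(recognize_trill([62, 60, 62, 60], simple_pitch=60))
print(recognize_trill([62, 60, 62, 60]))
```

The two calls show why the simple note matters for direction: the same busy notes describe an upward trill from the written note, yet they would be read as a downward interval when only the busy notes are available.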

Figure 4-27: Resolving a missing nachschlag trill

A nachschlag is a series of extra notes at the end of a trill which help transition out of a trill. The trill recognizer can recognize nachschlag trills, such as the one in Figure 4-27. When a flag for allowing nachschlag is turned on, the recognizer loosens the requirements for a trill to also allow for nachschlag trills. There must be at least five notes for a nachschlag to exist at the end of the trill and the nachschlag cannot occupy more than half of the trill. In relaxing the requirements, the recognizer allows the busy notes in the last half of the trill to be any pitch. When there is a nachschlag, as recognized by some notes at the end not following the oscillation pattern, the nachschlag flag on the returned trill is set to true.
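The relaxed check can be sketched as follows. This is one possible reading of the rules above, not the thesis code: "half of the trill" is interpreted by note count instead of by duration, pitches are MIDI numbers, and the function name, parameter name, and `(interval, has_nachschlag)` return shape are all assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple


def recognize_trill_with_nachschlag(
        busy_pitches: Sequence[int],
        allow_nachschlag: bool = False) -> Optional[Tuple[int, bool]]:
    """Return (interval in semitones, has_nachschlag), or None if not a trill."""
    count = len(busy_pitches)
    if count < 2:
        return None
    first, second = busy_pitches[0], busy_pitches[1]
    if not 1 <= abs(second - first) <= 3:
        return None

    def follows_oscillation(index: int) -> bool:
        return busy_pitches[index] == (first, second)[index % 2]

    # Relaxation only applies when allowed and there are at least five notes.
    relaxed = allow_nachschlag and count >= 5
    # The first half (rounded up) must always oscillate strictly, so the
    # nachschlag can never occupy more than half of the notes.
    strict_count = (count + 1) // 2 if relaxed else count
    if not all(follows_oscillation(i) for i in range(strict_count)):
        return None
    # Any later note that breaks the pattern marks a nachschlag.
    has_nachschlag = any(not follows_oscillation(i)
                         for i in range(strict_count, count))
    return (second - first, has_nachschlag)


ending_with_nachschlag = [62, 60, 62, 60, 62, 60, 59, 60]
print(recognize_trill_with_nachschlag(ending_with_nachschlag, allow_nachschlag=True))
```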

Figure 4-28: Resolving a missing mordent

Figure 4-29: Resolving a missing inverted mordent

Mordent Recognizer

Mordents and inverted mordents are trills with three busy notes. As shown in Figure 4-28, mordents start on the simple note, go up, and then return to the simple note. Inverted mordents, such as the one in Figure 4-29, start on the simple note, go down, and then return to the simple note. Therefore, the term upper mordent often refers to a mordent and lower mordent refers to an inverted mordent. Mordents are confirmed by passing the busy notes and, if provided, the simple note to the trill recognizer. To be a mordent, the trill recognizer must recognize these notes as a trill. To ensure the busy notes are a mordent in particular, there must be exactly three busy notes. Either a mordent or inverted mordent is returned depending on the interval direction between the first and second note.
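The delegation to the trill check can be sketched in Python as below. The sketch is self-contained, so it carries its own simplified `is_trill` helper (MIDI pitch numbers, two distinct pitches at most three semitones apart); both function names are hypothetical, and the labels follow the naming used in the text above, where an upward second note gives a mordent.

```python
from typing import Optional, Sequence


def is_trill(busy_pitches: Sequence[int], simple_pitch: Optional[int] = None) -> bool:
    """Simplified trill check: strict alternation between two nearby pitches."""
    if len(busy_pitches) < 2:
        return False
    first, second = busy_pitches[0], busy_pitches[1]
    alternates = all(pitch == (first, second)[i % 2]
                     for i, pitch in enumerate(busy_pitches))
    close_enough = 1 <= abs(second - first) <= 3
    matches_simple = simple_pitch is None or simple_pitch in (first, second)
    return alternates and close_enough and matches_simple


def recognize_mordent(busy_pitches: Sequence[int],
                      simple_pitch: Optional[int] = None) -> Optional[str]:
    """Return 'mordent', 'inverted mordent', or None."""
    # A mordent is a trill with exactly three busy notes.
    if len(busy_pitches) != 3 or not is_trill(busy_pitches, simple_pitch):
        return None
    # Direction of the first interval selects the variant (per the text above).
    return "mordent" if busy_pitches[1] > busy_pitches[0] else "inverted mordent"


print(recognize_mordent([60, 62, 60], simple_pitch=60))
print(recognize_mordent([60, 59, 60], simple_pitch=60))
```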

Every Ornament Imaginable

Besides turns, tremolos, trills, and mordents, there are other ornaments which cause similar discrepancies, such as appoggiaturas and acciaccaturas. While these ornaments are not resolved by the existing suite of fixers, developers can add fixers to support these additional ornaments, as well as any other cause of discrepancy with a clear identification and resolution procedure.

4.6 Developers Guide to Fixers

Fixers are functions which accept a correction generation and a problematic OMR element. The OMR element has a discrepancy with the one or more MIDI elements it is aligned with and exists in the generation’s initial OMR part. If a fix should be applied, fixers modify a generation’s modifiable OMR and MIDI parts, but not its initial OMR and MIDI parts. Even though the OMR part is the output of the overall corrector, it is important to modify both the OMR and MIDI parts at the resolved discrepancies. This ensures that the next generation, which is based on the previous generation’s modifiable parts, will no longer have the same discrepancy there.

If a fix is found, the fixer returns true; otherwise, it returns false. The fixer should not return true unless it fixes something. If a fixer does not return true, it should not modify the generation. If it does return true, it must modify the generation. The MIDI elements aligned with the inputted problematic OMR element can be found using functions belonging to the generation. The corresponding modifiable elements can be found from the initial elements using other functions belonging to the generation.

New fixers can be added to the corrector, which has an ordered list of fixers to use. The corrector has a fix function which calls these fixers in that specific order. The order of fixers may be critical, as some fixers can detect a more specific fix, which may be more correct than a general fix. For example, a mordent could be classified as a trill, so it is important that a mordent fixer is applied before a trill fixer, as once a fix is found, it is applied, and a new generation is created.
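The contract described in this guide can be summarized with a small Python sketch. Everything named here is hypothetical and greatly simplified compared with the corrector: a `Generation` holds parts as tuples of MIDI pitch numbers aligned by index, a fixer is any function taking a generation and an index and returning a boolean, and `fix` stands in for the corrector's fix function.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence, Tuple


@dataclass
class Generation:
    """Toy correction generation: read-only initial parts plus modifiable copies."""
    initial_omr: Tuple[int, ...]
    initial_midi: Tuple[int, ...]
    modifiable_omr: List[int] = field(init=False)
    modifiable_midi: List[int] = field(init=False)

    def __post_init__(self) -> None:
        self.modifiable_omr = list(self.initial_omr)
        self.modifiable_midi = list(self.initial_midi)


Fixer = Callable[[Generation, int], bool]


def accidental_fixer(generation: Generation, index: int) -> bool:
    """Fix a pitch that is off by one semitone; return True only if it changed something."""
    if abs(generation.initial_omr[index] - generation.initial_midi[index]) != 1:
        return False  # no fix found, so the generation is left untouched
    # Only the modifiable part changes; the MIDI side already holds this pitch.
    generation.modifiable_omr[index] = generation.initial_midi[index]
    return True


def never_fixer(generation: Generation, index: int) -> bool:
    """A fixer that finds nothing and therefore modifies nothing."""
    return False


def fix(generation: Generation, index: int,
        fixers: Sequence[Fixer]) -> Optional[Generation]:
    """Try fixers in order; the first success produces the next generation."""
    for fixer in fixers:
        if fixer(generation, index):
            return Generation(tuple(generation.modifiable_omr),
                              tuple(generation.modifiable_midi))
    return None


first_generation = Generation(initial_omr=(60, 63), initial_midi=(60, 64))
next_generation = fix(first_generation, 1, [never_fixer, accidental_fixer])
print(next_generation.initial_omr)
```

Building the next generation from the modifiable parts is what keeps the resolved discrepancy from reappearing, while the first generation's initial parts stay available for comparison.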

4.7 Existing Fixers Fix Some, but Not All Discrepancies

Existing fixers can account for many different types of discrepancies between optically generated OMR parts and audio sequence data stored in MIDI files, in order to validate the OMR part. Many discrepancy causes and ways to both identify and resolve them are detailed in Table 4.1. Some are caused by true errors, such as misread accidentals or dots, whereas others, such as rest representation discrepancies, are caused by encoding musical instructions differently. Furthermore, some fixers, such as time signature and key signature fixers, are applied holistically at the beginning of the fixing process, while others are applied to localized discrepancies. Some discrepancies included in Table 4.1 do not have a corresponding fixer because they are assumed to already be correct in the OMR part and do not raise a discrepancy under the existing approach of hashing and aligning notes and rests according to their pitch and duration. One such example is a clef discrepancy: MIDI parts are riddled with unhelpful clef changes which are not present in the OMR part. Because clefs do not trigger a discrepancy and are kept with the OMR part, the correct clefs are preserved in the OMR part. Other discrepancies, such as incorrect OMR clefs or missing notes, are currently not identified or resolved by any fixers. However, developers can follow the template prescribed in the earlier developer’s guide to extend the fixing functionality to account for a wider range of errors. The evaluation in Chapter 5 documents the need for specific additional fixers and assesses the usefulness of existing fixers.

Chapter 5

Evaluating Usefulness

5.1 Evaluation Procedure

To evaluate the usefulness of the corrector, I digitally encoded two quartet movements: the first movement of Franz Joseph Haydn’s String Quartet in Bb, Hob.III:1, Op.1, No.1 and the second movement of Wolfgang Amadeus Mozart’s String Quartet No.1 in G, K.80.

I first collected PDF scans and MIDI files. I found both a score and individual parts of the Haydn movement in the MIT Lewis Music Library and personally scanned them, ensuring high-quality scans [22, 19]. The Mozart movement was downloaded from IMSLP [30]. Corresponding MIDI files are similar to those from the online Classical Archives [31, 20]. The ground truth files for both movements were sourced from the music21 corpus and modified to match the scanned edition [11].

I converted the scans into MusicXML files using the SmartScore X2 Pro OMR system [1]. To validate the OMR-produced MusicXML files, the MIDI and MusicXML files were used as inputs to the corrector system. The correction generations and final display score were saved as files for later examination and manual fixing. The corrector can also produce metrics for evaluation.

5.2 Metrics for Evaluation

5.2.1 Quantity of Content to Review

One goal of the system is to minimize the user intervention needed to validate OMR outputs, which is achieved by reducing the amount of content to manually review. Without the system, the user would need to review every element in the OMR part. This total number of elements serves as the baseline. This number is compared to the total number of discrepancies after alignment and before automatic fixing, and to the total number of discrepancies remaining after automatic fixing. The number of discrepancies before automatic fixing is the number of elements the user will need to review if using the corrector. However, some of these elements will require less effort to review. On average, elements which are automatically fixed will require less time to review than elements which still need correction, as the user likely will not need to modify many of the automatically fixed elements. The difference between the discrepancy count before automatic fixing and the count after automatic fixing is the number of discrepancies which were automatically resolved. The corrector should drastically limit the amount of content necessary to review.
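The arithmetic behind this metric is simple enough to sketch in Python. The function and key names are hypothetical; the example numbers are the Haydn violin 1 counts reported in Section 5.3.1 (366 elements, 113 discrepancies before automatic fixing, 34 after).

```python
from typing import Dict


def review_summary(total_elements: int, discrepancies_before: int,
                   discrepancies_after: int) -> Dict[str, float]:
    """Summarize how much content the user must review with the corrector."""
    return {
        # Discrepancies resolved by automatic fixing.
        "auto_fixed": discrepancies_before - discrepancies_after,
        # Share of the part the user reviews at all, relative to the baseline.
        "share_to_review": discrepancies_before / total_elements,
        # Share of the part still needing manual correction.
        "share_unresolved": discrepancies_after / total_elements,
    }


haydn_violin_1 = review_summary(366, 113, 34)
print(haydn_violin_1["auto_fixed"])
print(f"{haydn_violin_1['share_to_review']:.1%}")
```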

5.2.2 Accuracy Improvement due to Automatic Correction

Another metric to evaluate the effectiveness of the automatic correction stage of the corrector is the accuracy of the OMR part before and after automatic correction. The accuracy of the part is determined by its similarity score when aligned with the correct ground truth. The similarity score of aligned parts is the ratio of no-change operations to all change operations in the alignment [37]. Identical parts have a similarity score of 100% and entirely dissimilar parts have a similarity score of 0%. It is worth noting that the overall accuracy of the system is given by the accuracy of the OMR part after not only being automatically fixed, but also manually fixed. This accuracy should be 100%, as people can understand and resolve nuanced errors while being checked by the computer. However, 100% accuracy only guarantees correct pitches and rhythms. Other important musical components, such as dynamics and slurs, are not guaranteed to be correct by this version of the corrector and are not considered in this measure of accuracy.
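As a minimal sketch of this score, the Python function below assumes the alignment is available as a list of change labels and that the ratio is taken against all change operations, which is what makes identical parts score 100% and entirely dissimilar parts 0%. The label strings and function name are hypothetical and are not the aligner's own identifiers.

```python
from typing import Sequence

NO_CHANGE = "no_change"  # hypothetical label for an aligned pair that matches


def similarity_score(changes: Sequence[str]) -> float:
    """Fraction of alignment operations that are no-change operations."""
    if not changes:
        raise ValueError("an alignment needs at least one change operation")
    unchanged = sum(1 for operation in changes if operation == NO_CHANGE)
    return unchanged / len(changes)


toy_alignment = [NO_CHANGE, NO_CHANGE, "substitution", NO_CHANGE]
print(f"{similarity_score(toy_alignment):.0%}")
```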

5.3 Applying the Corrector to Quartet Music

To understand the capabilities and limitations of the corrector system, the first violin, second violin, viola, and cello parts in both quartet movements are examined. The following is an exploration of why the corrector produced the display scores included in the appendix. Some interesting behaviors are discussed.

5.3.1 Haydn’s First Quartet

The first movement of Franz Joseph Haydn’s String Quartet in Bb, Hob.III:1, Op.1, No.1 was selected for a variety of reasons. Due to its age, it is part of the public domain, and a wide range of editions are available, as well as a MIDI version. Pieces such as this one, which were composed during the Classical era, use ornamentation. Thus, this piece can evaluate the corrector’s handling of ornaments. Furthermore, this movement is simple since it has a single key signature and a single time signature. While this movement includes repeats, which are common enough that the system must demonstrate repeat support, it does not use more complicated repeat directions, such as D.S. al Coda.

Originally, I attempted to generate a MusicXML file from IMSLP’s available scans of this piece, but upon discovering that the low density of those scans means low accuracy in generated OMR parts, I decided to personally scan the sheet music. While I first scanned a study score of the movement [22], upon seeing that its resulting OMR output was still error-ridden because of the tiny score notes, I decided to scan individual parts [19]. Notes in individual parts can be printed larger than those in scores, and consequently resulted in higher accuracy OMR outputs. These fairly correct outputs were validated using the corrector.

The following sections analyze the results of the corrector’s automatic fixes. The automatically fixed output for all parts is included in the appendix, and it is recommended to view these alongside the following analysis. The MIDI and OMR parts in these scores have already been adjusted from the original MIDI input, according to the initial fixers. For this particular movement, the fixes applied to all MIDI parts are shifting all elements an eighth note earlier, condensing both repeated sections from their original expansion, and correcting the key signature to have two flats instead of one. The initial fix applied to the viola and cello OMR parts was expanding misread two-bar multirests. All OMR parts had misread rehearsal marks notated as guitar tablatures, which were also replaced.

Violin 1 Part

The first violin part typically carries the melody and consequently has more notes. The number of notes and rests in the violin 1 OMR part is 366. The corrector identified 113 discrepancies requiring review. After automatic correction, only 34 discrepancies remained.

As shown in the part included in Appendix A, the rest fixer correctly validated the OMR rest representations in measures 5, 9, 25, 32, 34, 36-38, and 42, marking these green. By resolving these discrepancies, some areas which were aligned such that they initially appeared problematic, but were originally correct, became appropriately aligned and could be marked as having no discrepancy. Examples of this are the green fixed notes at the end of measures where rest discrepancies were resolved, as well as areas following the rest discrepancy, such as measure 35.

The fixers also identified several incorrect OMR pitches. Some were correctly fixed, such as the missing flats on the first note in measures 37 and 38, and the extra flats on the second note in measures 38 and 43. However, other pitch discrepancies, which could not be fixed by adjusting the accidental, remain marked for manual correction. An example of an unresolved pitch error in the OMR part is the red note in measure 45. It is incorrect because of a misread position. There are also cases where the OMR and MIDI parts were misaligned, so the corresponding MIDI notes for the OMR notes were incorrectly assumed. This is intended behavior, but resulted in the OMR pitches being adjusted to match their incorrectly corresponding MIDI notes.

Some regions had incorrect content in the MIDI part, so they were marked as discrepancies. The corrector can automatically resolve mistakes only when the MIDI part is correct. However, when the MIDI part is incorrect, the discrepancy will still be detected and marked. These incorrect MIDI regions can be identified by consulting the original scan, but in this case, can also be identified by consulting the included ground truth. Some correct notes following this type of discrepancy are consequently marked as a discrepancy. By viewing not only the entire marked region, but also the notes before and after, the user can develop a better understanding of the situation. Examples of this MIDI error are in the middle of measure 20 and in measures 57 and 59. Another region, with an error in both the OMR and MIDI parts, is the first note of measure 23. This is not marked as a discrepancy because the OMR and MIDI parts agree on the pitch, and both are incorrect. Errors like this will sneak by undetected. This may be acceptable because the output score will be at least as accurate as the MIDI and OMR parts individually and, in most cases, more accurate than both.

Several rhythmic discrepancies between the OMR and MIDI parts were validated in the OMR part. Notably, differences of rhythm for articulation purposes, such as the OMR eighth note and eighth rest pairs in measures 23 and 24 were validated.

A major source of error in this part was misread additional rhythm-extending dots on eighth notes throughout. These were always correctly removed. However, in OMR measures with this issue, the rests or notes at the end often overflowed into the following measure. This is manifested as an additional rest or note in the second position in the measure following the problematic OMR measure. If that additional element is a note, it is generally tied in from the problematic OMR measure prior. In isolated cases, this overflow can be adjusted such that any extended material can be reclaimed into the problematic measure it belongs to, as is accomplished in measures 18, 27, and 49. The withdrawn content is marked in yellow as a tentative fix. Dots were fixed, but no withdrawing was necessary in measure 41. However, when the problematic measure is surrounded by other problematic measures, this duration reclaiming is not done correctly, as is evidenced in measures 19 through 22 and 28 through 32. Another illustrative situation occurs when the reclaiming must be done for several measures following, as is the case following measure 45. While measure 45 is corrected appropriately, the content in the following measures must also be shifted over. This goes to show that the fixers can automatically fix errors and discrepancies which are isolated, and otherwise can only make a best effort, which may later need to be manually adjusted.

The mordent ornament fixer is useful for this part, identifying discrepancies in which the MIDI notates the expansion of an ornament. The corrector resolved these discrepancies and added missing mordent signs on notes in the regions before both repeats. Notice that the extraneous mordent in measure 44 was not removed because note expressions are not hashed and compared. Customized hash functions are supported by the aligner and could be used to detect this issue in future work [37]. The corrector currently only supports adding missing ornaments.

The initial accuracy against the ground truth for this part was 83.56%, which was increased to 92.35% after automatic fixing. This part experienced the most significant increase in accuracy of all the parts across both movements.

Violin 2 Part

Without the corrector, the user would have had to check 314 elements, but only 126 discrepancies to consider were found. After automatic fixing, 73 remained. As seen in Appendix A, the most notable cause of discrepancy in this part is the missing MIDI music where the violin 2 part doubles the violin 1 part. In some cases, the fixer was able to locate this missing music in the violin 1 part to validate the OMR violin 2 part, such as on the first line. In other cases, the fixer was unable to do this because of an error immediately preceding, which shifted the alignment. This is seen in measure 39 and in the measures following measure 60.

Interesting accidental fixing occurred. In measure 13, a missing natural sign is added. Later, in measure 16, a misread natural sign is corrected. By consulting the appropriate content doubled in the violin 1 part, the corrector also identified an OMR pitch error in measure 5. However, this error remains a discrepancy because it cannot be resolved by adding an accidental.

The MIDI part is incorrect in measures 18 and 19. Because of this, the corresponding note for the red OMR F is the MIDI E, so the accidental fixer adds a flat to the OMR note to resolve this discrepancy. This will need re-fixing by a user. Even though the MIDI part is also incorrect in measure 21, the alignment is able to recover and the OMR part is mostly validated against the violin 1 part.

Similar to the first violin part, mordents are resolved and added when necessary before the repeats. However, the mordents at the end of the piece are not completely corrected because of other errors. Since appoggiaturas are not yet supported by the corrector, the misread appoggiatura in measure 34 is an unresolved discrepancy.

The OMR elements at the beginning of measure 24 are validated against the MIDI part since these are different rhythmic representations of the same articulation. Similar to the violin 1 part, rest discrepancies are resolved. Consequently, the following notes, which were originally correct but marked as discrepancies, are automatically marked as fixed. This is demonstrated after the repeat. The different rhythmic representations of the same thing are resolved in the first note of measure 51, but not resolved later for the following A. This is because the OMR rhythm is incorrect due to a missing dot, and the MIDI representation is simply different. This causes the OMR A to be aligned with the first MIDI A and the A flat in the following OMR measure to be aligned with the second, tied-in MIDI A. This confusion means the actual cause of discrepancy goes unnoticed and the blame is placed later. A similar situation occurs with the second note in measure 52. A user looking at the score will be able to understand the cause of these discrepancies and correct this region by consulting the reference OMR and MIDI parts and by looking backwards for the unmarked but problematic notes. Measures 49 and 50 together are a good example of removing an extra dot and correctly collapsing overflow notes back into the problematic measure.

Since the MIDI part is missing a note in measure 25, the alignment is off and the appropriate corresponding notes cannot be found to resolve the following rest discrepancies. In measures 57 and 58, an incorrect MIDI part is at fault.

The initial OMR accuracy against ground truth of 93.35% was only slightly increased to 93.67% after fixing. Despite the minor improvement, this can still be considered a success, since the limited number of elements needing review to achieve 100% accuracy is reasonable.

Viola Part

The viola part included in Appendix A originally had an incorrect additional first measure. It was not automatically removed during corrector setup, which made visual measure alignment difficult. Thus, I chose to manually remove the first measure to achieve the included score. The initial fixers did, however, correct a misread time signature in the OMR part, changing it from 3/8 to 6/8, and removed additional unnecessary key signatures throughout the OMR part. The initial fixers also expanded the OMR multirests in measures 16 and 55, which were misread as a single bar of rest. This is important for visual measure alignment. It is interesting that this MIDI viola part uses treble clef, which is not as appropriate as the alto clef used in the OMR part. Regardless of differing clefs, comparisons can still happen.

For this part, the user would have had to check 270 elements, but the corrector identified 143 issues to consider, which it reduced to 97 with automatic fixing. A large issue with correcting this OMR part was the many errors in the MIDI part. By correcting the MIDI part using the corrector output and re-running with the corrected MIDI part, 60 issues were identified, and ultimately reduced to 21 with automatic fixing. This illustrates not only the importance of using an accurate MIDI encoding, but also the possibility of correcting an OMR part even without an originally accurate MIDI part. The score included in the appendix uses the MIDI part prior to correction.

The MIDI part is incorrect in many places, such as measures 11, 13, 21, and 34. These MIDI errors can cause incorrect fixes, such as the addition of a sharp to the last OMR note in measure 22. That fix was applied to make the OMR note match the first MIDI note in measure 23, to which it was mapped. Similar to the violin 2 part, the viola MIDI is missing notes doubled with the violin 1 part. These are sometimes included to resolve discrepancies, as shown in the first line. However, another consequence of MIDI errors is that doubled violin 1 parts might not be successfully found, as is the case following measure 21.

Different representations caused discrepancies which could be resolved. For example, dotted quarter notes in measures 12 and 14 are correctly validated, even though the MIDI rhythm is not exactly the same. Similarly, the green OMR eighth note and eighth rest in measures 23 and 62 are validated. Furthermore, different rest representations are validated in the OMR part, such as in measures 27 and 29. By resolving the rest discrepancy in measure 27, the improved alignment shows the notes in measure 28 as consequently fixed even though they did not change.

The viola part was automatically fixed to 94.96% accuracy after initially having 93.17% accuracy when using the corrected MIDI file. The initial accuracy should be even lower than this, however, since I intervened and removed the suspicious additional measure at the beginning before re-running the corrector.

Cello Part

A user validating the cello part would have had to check 256 elements without the corrector system. However, 96 issues to consider were identified, which became 21 with automatic fixing. This automatically fixed part can be found in Appendix A.

Similar to the viola part, the cello OMR part was originally missing two-bar multirests in measures 16 and 55. The multirest was successfully expanded at measure 16, but not at measure 55. Because of an incorrect additional note in measure 55, the fixer did not accurately diagnose the missing multirest. Multiple compounded errors are complex to resolve, but a person could easily fix this issue. Since this error is not fixed, the measures are unaligned from measure 55 onward, and this is easily noticeable.

Different rhythmic representations of the same thing caused initial discrepancies, but were resolved by the fixers. Different representations of the same rests were validated in measures 4 and 8, and consequently, the notes originally marked as discrepancies in the following measures could also be validated. The rests in many other measures were validated in a similar fashion. Some of these are measures 25, 27, 29, 33, 35, 38, 42, 45, and 49. Similarly, different rhythmic representations of the same articulation were validated in measures 12, 31, and 32. This was unsuccessfully resolved in measures 51 through 53 because of the combination of a rhythm error in the OMR part and a discrepancy in representation between the OMR and MIDI parts.

An incorrect OMR rhythm in measures 4, 10, 37, and 45 is detected. These are currently not supported for fixing, but are still identified and could be fixed with additional fixers. The wrong rhythm in measure 37 causes some rest duration to overflow into the next measure, but this could be noticed upon user examination of the green area.

OMR tremolos are missing their slashes in this OMR part. Notes with and without tremolos appear similar because only the pitch and duration of the note itself, not the expressions on it, are hashed and compared. However, when comparing against the MIDI part, which notates each tremolo note explicitly, the missing slashes can be detected, as evidenced in measures 18 through 20. By fixing this issue, the later OMR parts became appropriately aligned with the MIDI part and could be validated.

Interestingly, the last note of the OMR part is a chord. It is a misread repeat sign. Chords are not correctly marked as a discrepancy when incorrect, so they will need attention in future work. An immediate workaround could be automatically marking all chords as discrepancies, regardless of whether they are actually incorrect, in order to ensure they are not overlooked. The trade-off is that this will increase the amount of content to manually review, but it can guarantee chord errors will not be overlooked. The need for this functionality is further demonstrated in the discussion of automatically fixing most Mozart parts.

The accuracy against the ground truth increased from 95.77% to 96.15%. The improvement was actually more than this because the missing tremolos were originally considered similar.

5.3.2 Mozart’s First Quartet

Wolfgang Amadeus Mozart’s String Quartet No.1, as shown in Appendix B, is a good option to supplement the Haydn quartet. While the first movement produced an OMR output with many errors, the second movement output was more accurate. Instead of using scans of individual parts, I used a scan of the entire score. This meant that missing multirests were not a concern, as scores explicitly notate each measure. Similar to the Haydn quartet, this piece has a single key signature and two repeated sections. Those repeats were condensed in the MIDI part during corrector setup. Since the MIDI and OMR files both had the correct key signature and time signature, neither was modified during setup. This piece features appoggiatura ornamentation. Since the correction system does not currently support appoggiaturas, they are only marked as discrepancies and not resolved. They could be resolved with an additional fixer. This goes to show that while the fixer does not account for every error it could fix, it is still useful in merely identifying these errors.

Violin 1 Part

The violin 1 part has 500 elements which would have needed review, but the corrector found 63 discrepancies to consider, eventually reducing this to 44 discrepancies with automatic fixing. These discrepancies and automatic fixes can be seen in Appendix B.

Appoggiatura discrepancies are observed in measures 1, 4, 8, 33, 34, 50, 53, and 57. This error can cause an extra tie to overflow into the next measure, as is the case between measures 1 and 2 and between measures 5 and 6. While this overflow is not explicitly marked, the user should be able to trace the sequence of potentially affected notes. If not, the user can run the manually corrected part through the corrector a second time.

Notes in unison go unmarked, as is the case in measures 33 and 49. This is a critical issue, which needs to be addressed in future work. Possible workarounds are discussed in the above Haydn cello part analysis.

The eighth note and eighth rest pairs in measures 2, 6, 23, 51, and 55 are appropriately validated with the corresponding MIDI quarter note. This is not successfully done in measure 21 because there is not only an error in the OMR part, in which an eighth note is misread as a quarter note, but also a discrepancy of representation between parts.

An incorrectly dotted rhythm is appropriately corrected in measure 26. On the other hand, an incorrect rhythm in measure 78 is identified, but an automatic fix is not supported. Manual adjustment should be easy. However, automatic fixing could also be possible using M. Church’s rhythm correction algorithm [10].

The accuracy of this part increased slightly from 93.94% to 94.68% during automatic fixing. An appoggiatura fixer could help increase this accuracy even more.

Violin 2 Part

As seen in Appendix B, the violin 2 part has 558 elements, of which 52 were identified for review, and ultimately 42 were still unresolved after automatic fixing. Similar to the violin 1 part, most discrepancies are related to appoggiaturas as shown in measures 1, 4, 5, 8, 34, 50, 57, and 83. In measure 6, incorrectly added OMR rests are identified and the OMR eighth note and eighth rest is validated with the MIDI quarter note. Similar validation occurs in measures 23, 51, and 72, and somewhat successfully occurs at the end of measure 55, which is not exactly correct because of a previous rhythm error. There is a missing note at the end of measure 9, which goes unnoticed due to alignment until the note changes in the following measure. The user can notice that discrepancy on the identical notes and look earlier for the cause. The cause for discrepancies on identical notes will always be earlier in the music, because notes before can affect the alignment of notes following, but not vice versa. Similarly, the missing OMR note at the end of measure 62 causes a misalignment. As a result, the first OMR note in measure 63 is adjusted to be enharmonic with the last MIDInote in measure 62. An incorrect rhythm in 74 causes a propagation of discrepancies.

Interestingly enough, the rhythm error in measure 61 is incorrectly fixed by adding a tremolo. Based on the corrector’s ability to recognize tremolos as one longer OMR note with several faster copies of that note during the same time in the MIDI part, it is reasonable that this fix would be applied. However, this is not the correct fix. A human user would be able to identify this. The incorrect note rhythm in measure 30 is identified but not automatically fixed. The issue in measure 77 is marked, but the problematic lower chord note is not marked, as chord error marking is not supported in the current corrector version. The accuracy against the ground truth decreased due to automatic fixing. It was originally 95.18%, and dropped slightly to 95.02% after fixing. This is because the fixers identified the wrong cause in some situations, or made an incorrect assumption about which notes correspond. While the result of automatic fixing is worse, human intervention can quickly bring the final OMR part to 100% accuracy. Moreover, additional fixers could be implemented to address some of the incorrectly resolved discrepancies.
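The tremolo pattern described here, one longer OMR note set against several faster copies of that note in the MIDI part, can be sketched as a simple predicate. This is a minimal illustration under assumed conventions: notes are (MIDI pitch, duration in quarter lengths) pairs, and the function name is hypothetical rather than the corrector's own.

```python
from fractions import Fraction

def looks_like_tremolo(omr_note, midi_notes):
    """Return True if several equal, faster MIDI notes repeat the OMR note's
    pitch and together fill exactly its duration (hypothetical helper)."""
    omr_pitch, omr_dur = omr_note
    if len(midi_notes) < 2:
        return False
    first_dur = midi_notes[0][1]
    same_pitch = all(pitch == omr_pitch for pitch, _ in midi_notes)
    same_speed = all(dur == first_dur for _, dur in midi_notes)
    fills_span = sum(dur for _, dur in midi_notes) == omr_dur
    return same_pitch and same_speed and fills_span

quarter = Fraction(1)
sixteenth = Fraction(1, 4)
eighth = Fraction(1, 2)

# A genuine tremolo: one quarter note in OMR, four sixteenths in MIDI.
genuine = looks_like_tremolo((67, quarter), [(67, sixteenth)] * 4)

# A possible false positive: the score has two repeated eighth notes, but the
# OMR reads a single quarter note. The pattern is indistinguishable.
misleading = looks_like_tremolo((67, quarter), [(67, eighth), (67, eighth)])
print(genuine, misleading)
```

Both calls satisfy the predicate, which suggests why a misread rhythm on repeated notes can be resolved as a tremolo even though that is not the correct fix.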

Viola Part

For the viola part, a user would have needed to check 445 elements. Before automatic fixing, the corrector decreased this number to 17, but after automatic fixing, this number increased to 21. This is because fixers operated under assumptions that were not necessarily true. Still, 21 elements is a drastic reduction from the original 445. This viola part is included in Appendix B. Similar to the other parts, most discrepancies are related to appoggiaturas, as shown in measures 4, 8, 33, 53, 57, and 82. Something interesting happens starting at measure 10. The misread OMR note could be replaced with a tremolo and is, even though this is not the correct solution. The reason for this is explained in the earlier Mozart violin 2 section, since that part also has the same issue. Further on, in measure 11, the additional rest causes a later misalignment, such that the OMR notes at the end of the measure correspond with the MIDI notes at the beginning of measure 12. To resolve the pitch discrepancy, a sharp sign is added to the OMR note to make these corresponding notes enharmonic.
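The accidental adjustment described above can be pictured with a small sketch that searches for the accidental that makes an OMR note sound at the paired MIDI pitch. The names are hypothetical, the note is reduced to a step letter and octave, key signatures are ignored, and only flat, natural, and sharp are tried. These are simplifying assumptions, not a description of the corrector's fixer.

```python
# Semitone offsets of the natural steps above C.
STEP_TO_SEMITONE = {'C': 0, 'D': 2, 'E': 4, 'F': 5, 'G': 7, 'A': 9, 'B': 11}
ACCIDENTAL_TO_ALTER = {'flat': -1, 'natural': 0, 'sharp': 1}

def midi_number(step, octave, alter=0):
    """MIDI pitch number of a notated pitch, using the convention C4 = 60."""
    return 12 * (octave + 1) + STEP_TO_SEMITONE[step] + alter

def accidental_matching_midi(step, octave, midi_pitch):
    """Return the accidental that makes the notated step sound at midi_pitch,
    or None if no single flat, natural, or sharp does."""
    for name, alter in ACCIDENTAL_TO_ALTER.items():
        if midi_number(step, octave, alter) == midi_pitch:
            return name
    return None

# An OMR F4 paired with MIDI pitch 66 is resolved by adding a sharp.
print(accidental_matching_midi('F', 4, 66))
```

A check of this kind only looks at the pair of notes it is given. If an earlier error has shifted the alignment, the pair itself is wrong, and the resulting accidental makes the two notes agree without making the OMR part correct.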

The OMR notes at the end of measure 12 are also then marked as a discrepancy as a result of earlier errors. The MIDI part has an error in measure 22, which is identified. Since MIDI can generally be trusted, automatic fixes are done according to that assumption, and the corrector does not attempt to fix MIDI errors. Upon consulting the original score, the user can identify the incorrect MIDI pitch as the cause of this discrepancy. Ultimately, this part decreased in accuracy due to fixing, going from 97.77% accuracy to 96.21%.

Cello Part

As shown in Appendix B, the cello part has similar success to the viola part. The cello part originally had 361 elements in need of review, which became 15 before automatic fixing and 16 after automatic fixing. Additionally, the accuracy dropped from 96.68% before automatic fixing to 96.12% after automatic fixing. This is due to an incorrect resolution of eighth notes missing a flag in measures 73 and 77. Tremolos were understandably, but incorrectly, used as the replacement. The logic for this is explained in the Mozart violin 2 part. This creates new discrepancies in the notes at the end of these measures. Again, most discrepancies are related to appoggiaturas, as shown in measures 4, 8, 53, and 57. Another error detected is the missing OMR note in the penultimate measure. Overall, these fixes are simple and would take a user only a few minutes to correct in order to produce a 100% accurate score.

5.4 Generalizing Results

5.4.1 Changes in Accuracy From Automatic Fixing

While the accuracy of the parts against the ground truth did not necessarily improve after automatic correction, it never decreased drastically. This small fluctuation downwards in accuracy can be attributed to fixers incorrectly resolving discrepancies because the correct correction is not yet supported. Future work to identify and resolve a wider range of errors could improve overall automatic correction performance.

| Part | Accuracy Before | Accuracy After | Change |
|---|---|---|---|
| Violin 1 | 83.56% | 92.35% | +8.79% |
| Violin 2 | 93.35% | 93.67% | +0.32% |
| Viola (with Corrected MIDI) | 93.17% | 94.96% | +1.79% |
| Cello | 95.77% | 96.15% | +0.38% |
| All Parts Average | 91.46% | 94.28% | +2.82% |

Table 5.1: Accuracy change from Haydn automatic fixing. Compares the part accuracy against the ground truth before and after automatic fixing by the corrector. Accuracy is the percent of no-change changes in the aligner’s change list [37].

| Part | Accuracy Before | Accuracy After | Change |
|---|---|---|---|
| Violin 1 | 93.94% | 94.68% | +0.74% |
| Violin 2 | 95.18% | 95.02% | -0.16% |
| Viola | 97.77% | 96.21% | -1.56% |
| Cello | 96.68% | 96.12% | -0.56% |
| All Parts Average | 95.89% | 95.51% | -0.38% |

Table 5.2: Accuracy change from Mozart automatic fixing. Compares the part accuracy against the ground truth before and after automatic fixing by the corrector. Accuracy is the percent of no-change changes in the aligner’s change list [37].

While automatic fixing did increase the accuracy of many parts, this increase is more significant than expressed. This is because notes are deemed similar only if they have the same pitch and rhythm, but ornaments and other musical properties are ignored. Therefore, notes missing tremolos are scored as perfectly accurate both before and after correction, even though they are more correct after the ornament is added. Moreover, manual user fixing following automatic fixing can ensure that the part is fixed to 100% accuracy. Manual user corrections can be further validated by re-running the manually corrected parts through the corrector a second time.
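The accuracy measure used in Tables 5.1 and 5.2, the percent of no-change entries in the aligner's change list, can be written out as a short sketch. The change list is assumed here to be a plain list of operation labels, and the label strings are hypothetical stand-ins for whatever the aligner actually records.

```python
def accuracy(change_list, no_change_label='no_change'):
    """Percent of entries in an aligner-style change list that are no-change.

    The list-of-labels representation and the label names are assumptions
    made for illustration.
    """
    if not change_list:
        raise ValueError('change list is empty')
    unchanged = sum(1 for op in change_list if op == no_change_label)
    return 100.0 * unchanged / len(change_list)

# Nine aligned elements that agree with the ground truth and one substitution.
changes = ['no_change'] * 9 + ['substitution']
print(accuracy(changes))
```

Because the labels depend only on pitch and rhythm, adding a missing ornament leaves every label, and therefore the computed percentage, unchanged.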

It is worth noting that the system is limited to ensuring outputs have the correct notes and rests, as determined by only their pitch and duration. Other musical elements and properties are not yet guaranteed, but are likely to be correct if a legible and high quality scan is used. This guarantee is acceptable, as correct notes and rests are not only the most important components, but are also most prone to having overlooked errors. Other musical properties can even acceptably differ across editions of the same piece, and many can be easily noticed and adjusted by a user.

5.4.2 Discrepancies Can be Resolved and Some Errors Fixed

The fixers were effective at resolving most causes of discrepancies that were not due to an error. This is critical in helping align the OMR and MIDI parts and limiting the amount of content that the user needs to carefully review. They successfully did this for different rest representations, rhythmic articulation representations, pitch representations, and ornament representations.

The corrector struggled most when multiple discrepancies compounded. For example, it had difficulty when there was an error and the MIDI and OMR parts had different representations. It also performed less well when there were several nearby errors and when there were several errors on a single element. The types of errors that the corrector was most successful in correcting were missing ornaments, extra rhythmic dots, and incorrect accidentals. Other errors are still brought to the attention of the user. Types of errors the corrector was not successful at resolving were rhythmic errors not due to a misread dot, pitch errors not due to a misread accidental, and ornamental errors due to an appoggiatura. Fixers for these types of errors are suggestions for future work, and I believe there is great potential in expanding the supported errors.
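The split between errors that are fixed automatically and errors that are only brought to the user's attention can be pictured as a small routing step. The sketch below is illustrative only: the category labels, the function name, and the measure numbers in the example are hypothetical, and only the grouping of error types follows the paragraph above.

```python
# Error types the text reports as most successfully corrected automatically.
AUTO_FIXABLE = frozenset({
    'missing_ornament',
    'extra_rhythmic_dot',
    'incorrect_accidental',
})

def route_discrepancies(discrepancies):
    """Split (measure, kind) pairs into those to fix and those to flag."""
    to_fix, to_review = [], []
    for measure, kind in discrepancies:
        if kind in AUTO_FIXABLE:
            to_fix.append((measure, kind))
        else:
            # Unsupported kinds are still marked for the user.
            to_review.append((measure, kind))
    return to_fix, to_review

# Hypothetical input, not taken from the evaluated scores.
found = [
    (3, 'extra_rhythmic_dot'),
    (5, 'appoggiatura'),
    (9, 'incorrect_accidental'),
    (12, 'rhythm_not_dot_related'),
]
to_fix, to_review = route_discrepancies(found)
print(to_fix)
print(to_review)
```

Supporting a new error type in such a design would amount to adding its label to the fixable set and supplying the matching fixer.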

Pitch and rhythm errors can be identified for review. However, chord discrepancies are currently not identified and marked. In addition to correctly processing chords, future work could address articulation and phrasing discrepancies, such as incorrect dynamics, ties, articulations, phrasings, or beaming. These properties would need to be included in the aligner’s hasher for comparison.
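To see why additional properties must enter the hasher before they can be compared, consider a minimal sketch in which a note is hashed from a configurable list of properties. The dictionary representation, the property names, and the function name are assumptions for illustration and do not reflect the aligner's actual hasher interface.

```python
def hash_note(note, properties=('pitch', 'duration')):
    """Build a comparison key from the selected properties of a note.

    `note` is a plain dict here; missing properties hash as None.
    """
    return tuple(note.get(name) for name in properties)

omr_note = {'pitch': 60, 'duration': 1.0, 'tie': None}
reference_note = {'pitch': 60, 'duration': 1.0, 'tie': 'start'}

# With pitch and duration only, the missing tie is invisible.
default_equal = hash_note(omr_note) == hash_note(reference_note)

# Once the tie is part of the key, the same pair is a discrepancy.
extended = ('pitch', 'duration', 'tie')
extended_equal = hash_note(omr_note, extended) == hash_note(reference_note, extended)
print(default_equal, extended_equal)
```

The trade-off is that every property added to the key must be reliably present in both sources, otherwise differences in representation would surface as false discrepancies.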

| Part | Elements in OMR | Discrepancies Before | Discrepancies After |
|---|---|---|---|
| Violin 1 | 366 | 113 | 34 |
| Violin 2 | 314 | 126 | 73 |
| Viola (with Corrected MIDI) | 270 | 60 | 21 |
| Cello | 256 | 96 | 21 |
| All Parts Total | 1206 | 395 | 149 |

Table 5.3: Elements to manually correct from Haydn. This table compares the total elements in the part with the number of elements with discrepancies before and after automatic fixing by the corrector.

| Part | Elements in OMR | Discrepancies Before | Discrepancies After |
|---|---|---|---|
| Violin 1 | 500 | 63 | 44 |
| Violin 2 | 558 | 52 | 42 |
| Viola | 445 | 17 | 21 |
| Cello | 361 | 15 | 16 |
| All Parts Total | 1864 | 147 | 123 |

Table 5.4: Elements to manually correct from Mozart. This table compares the total elements in the part with the number of elements with discrepancies before and after automatic fixing by the corrector.

5.4.3 Effectively Reducing the Quantity of Content to Review

Until the accuracy of the corrector improves, the system can still be effectively used by users to manually correct all errors. While some errors were correctly resolved automatically, others were incorrectly resolved, and some could not be resolved at all. The corrector was successful on all parts in limiting the number of elements the user needs to review. As OMR systems improve and accuracy goes up, the number of errors, and hence the content for review, will decrease. As is, after automatic fixing the user needs to review roughly an eighth of the Haydn elements (149 of 1206, Table 5.3) and less than a tenth of the Mozart elements (123 of 1864, Table 5.4). It is worth noting that users will still need to review the regions surrounding a discrepancy for possible unmarked, related discrepancies nearby, but these should follow predictable patterns.
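The share of each piece that remains for review follows directly from the totals in Tables 5.3 and 5.4. The short calculation below uses only those totals. The variable and function names are illustrative.

```python
# "All Parts Total" rows from Tables 5.3 (Haydn) and 5.4 (Mozart).
haydn = {'elements': 1206, 'before': 395, 'after': 149}
mozart = {'elements': 1864, 'before': 147, 'after': 123}

def review_share(piece, stage):
    """Percent of OMR elements flagged for review at the given stage."""
    return 100.0 * piece[stage] / piece['elements']

for name, piece in (('Haydn', haydn), ('Mozart', mozart)):
    before = review_share(piece, 'before')
    after = review_share(piece, 'after')
    print(f'{name}: {before:.1f}% before fixing, {after:.1f}% after fixing')
```

Rounded to one decimal place, this gives 32.8% before and 12.4% after fixing for the Haydn, and 7.9% before and 6.6% after fixing for the Mozart.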

5.5 The Role of Human Error in MIDI Files

Most MIDI files available on the internet are community sourced [36]. Since they are generated by people manually inputting notes into music notation software or playing music into an instrumental controller, human error can come into play. This corrector helped reveal the quality of some available MIDI files by reporting areas where they are incorrect. In this way, the corrector can play the additional role of a second pair of eyes to validate existing MIDI material. Fortunately, the corrector can still usefully validate an OMR output against an incorrect MIDI file, although doing so requires slightly more user effort and yields lower accuracy.

5.6 Truths Are Not Necessarily Ground Truths

This is one of the first times the music21 corpus ground truth files have had a second pair of eyes to check them, and it was surprising how many errors existed. An unexpected contribution from this project is the discovery that supposed ground truths may not be as correct as previously assumed, and the mistakes can lie latent for a while, as they are especially difficult to notice (e.g., see Figure 3-3). A major takeaway, therefore, is to be skeptical of content generated by people and to apply technology, such as this automated corrector system, to validate such content’s quality.

5.7 Unlocking a New Piece of Music

The Haydn and Mozart quartet movements were useful for not only evaluating the corrector system, but also identifying and correcting errors in existing MusicXML files. However, to expand the repertoire, I created a musical and visual encoding of the Rondo movement from Friedrich Kuhlau’s Grande Quartet for 4 flutes, Op.103, using individual part scans from IMSLP [28] and a MIDI file from Classical Archives [29]. This piece is interesting and valuable to unlock for several reasons. First of all, the piece itself is a flurry of rapid notes, many of which have accidentals, and some of which are expressed as ornaments. Being able to digitize and validate complex pieces of music is important. Additionally, while flute players are not uncommon, a MusicXML digital encoding of this piece, which lends itself to easy transposition for other instruments, such as those in a quartet, would increase this piece’s performance potential.

While the process for digitizing this movement still took a few hours, the corrector made the process significantly faster, easier, and less error-prone. First, I split the scanned PDF into individual movement files and extracted the flute parts from the two-handed MIDI piano parts. Then, I converted the files into MusicXML, using SmartScore for the scans and MuseScore for the MIDI file. Running the corrector on all parts revealed that the measure structure of the OMR output was not correct. This caused physical misalignment of measures, making the display score difficult to read. To overcome this challenge, I manually adjusted all measure structures in the OMR part, so the measures would align, even if their contents differed.

Rerunning the corrector on these adjusted OMR parts revealed a useful display score that I used to approve fixes and to make other adjustments. The first and second flute parts took the most time to correct because they both had many pitch errors caused by both misread accidentals and misread lines or spaces. When the system automatically corrected misread accidental errors, it was a relief to not need to make further adjustments. Despite needing to manually intervene for the many line and space errors, having those discrepancies marked made identifying those regions easier. Additionally, the corrector saved me a lot of time by automatically identifying ornaments across all parts. In contrast to the first two parts, the third and fourth parts were much easier to correct because these later parts serve a bass line role in the ensemble. Therefore, they had fewer notes and less room for error. For all parts, I still needed to personally verify properties not related to rhythm or pitch, such as staccato articulations, slurs and ties connecting notes, and dynamics. Prior to recombining the parts, I also removed misread text, since crescendo markings, publisher information, and other text were often misread.

While the process for digitizing this movement was still time consuming, the corrector helped me focus my effort, thereby requiring less time than the alternative methods of either manually inputting all notes or correcting the OMR output unassisted. This goes to show that the corrector can still be helpful even if the OMR output is riddled with errors. Moreover, with the corrector’s validation, I am confident that the MusicXML file I produced is highly accurate, so I can confidently add it to the music21 corpus for anyone to use. This contribution makes this movement accessible to not only conventional ensembles, such as quartets, but also to unconventional ensembles, such as a quartet of bassoon players. Therefore, the corrector system shows potential for liberating music for more creative and inclusive performances.

Chapter 6

Conclusion

OMR systems cannot guarantee 100% accuracy and moreover struggle on less cleanly scanned and more complex parts, most noticeably handwritten music. The music community needs a robust process to validate OMR outputs while minimizing human time and effort in the process. The post-OMR corrector addresses this need.

I implemented a post-OMR system, which uses digitally sequenced audio encoding to validate optically derived music notation. Integral to this system is a ruleset I created to classify discrepancies and determine fixes. Discrepancies are either automatically corrected using fixers I implemented, or in ambiguous cases, are marked for a user to manually correct. These resolved and unresolved discrepancies are marked in an output display I implemented for final, quick human approval. This same display and the necessary modifications to the music streams to physically align them in the display were essential contributions to initially diagnose these errors.

This fixing system is available as open source code with instructions on how to add new functionality to address more errors. Extending the functionality will enable the tool to automatically correct an even larger set of errors. Using MIDI files available on Classical Archives [36] and music scans from IMSLP [18], the user-corrector team can efficiently increase the amount of validated digital music that can be effortlessly edited and read by musicians. Not only does the corrector enable the expansion of digital repertoire, but it empowers musicians, like those in a quartet with a sick violist, to save performances on a day-to-day basis with creative music transforms.

Appendix A

Example Corrector Display Score: Haydn Quartet

Figure A-1: Violin 1 scan from Haydn’s Quartet Op. 1 No. 1, Movement 1

A.1 Haydn Quartet Op.1, No.1, i: Violin 1

[Corrector display score: Quartet Op. 1 No. 1, Movement 1, Violin 1, Franz Joseph Haydn, Presto. Each system shows four aligned staves: MIDI, OMR-pre, OMR-post, and Ground Truth.]

A.2 Haydn Quartet Op.1, No.1, i: Violin 1

[Display score, continued.]

A.3 Haydn Quartet Op.1, No.1, i: Violin 1

[Display score, continued.]

A.4 Haydn Quartet Op.1, No.1, i: Violin 1

[Display score, continued.]

Figure A-2: Violin 2 scan from Haydn’s Quartet Op. 1 No. 1, Movement 1

A.5 Haydn Quartet Op.1, No.1, i: Violin 2

[Corrector display score: Quartet Op. 1 No. 1, Movement 1, Violin 2, Franz Joseph Haydn, Presto. Each system shows four aligned staves: MIDI, OMR-pre, OMR-post, and Ground Truth.]

A.6 Haydn Quartet Op.1, No.1, i: Violin 2

[Display score, continued.]

A.7 Haydn Quartet Op.1, No.1, i: Violin 2

[Display score, continued.]

A.8 Haydn Quartet Op.1, No.1, i: Violin 2

[Display score, continued.]

Figure A-3: Viola scan from Haydn’s Quartet Op. 1 No. 1, Movement 1

A.9 Haydn Quartet Op.1, No.1, i: Viola

[Corrector display score: Quartet Op. 1 No. 1, Movement 1, Viola, Franz Joseph Haydn, Presto. Each system shows four aligned staves: MIDI, OMR-pre, OMR-post, and Ground Truth.]

A.10 Haydn Quartet Op.1, No.1, i: Viola

[Display score, continued.]

A.11 Haydn Quartet Op.1, No.1, i: Viola

[Display score, continued.]

A.12 Haydn Quartet Op.1, No.1, i: Viola

[Display score, continued.]

Figure A-4: Cello scan from Haydn’s Quartet Op. 1 No. 1, Movement 1

A.13 Haydn Quartet Op.1, No.1, i: Cello

[Score display: Quartet Op. 1 No. 1, Movement 1, Cello, Franz Joseph Haydn (Presto). Aligned MIDI, OMR-pre, OMR-post, and Ground Truth staves with numeric note labels; notation not recoverable from the text extraction.]

A.14 Haydn Quartet Op.1, No.1, i: Cello

[Score display, Cello (continued): aligned MIDI, OMR-pre, OMR-post, and Truth staves with numeric note labels; notation not recoverable from the text extraction.]

A.15 Haydn Quartet Op.1, No.1, i: Cello

[Score display, Cello (continued): aligned MIDI, OMR-pre, OMR-post, and Truth staves with numeric note labels; notation not recoverable from the text extraction.]

Appendix B

Example Corrector Display Score: Mozart Quartet

B.1 Mozart Quartet No.1, ii: Violin 1

[Score display: Quartet No. 1, Movement 2, Violin 1, Wolfgang Amadeus Mozart (Allegro). Aligned MIDI, OMR-pre, OMR-post, and Ground Truth staves with numeric note labels; notation not recoverable from the text extraction.]

B.2 Mozart Quartet No.1, ii: Violin 1

[Score display, Violin 1 (continued): aligned MIDI, OMR-pre, OMR-post, and Truth staves with numeric note labels; notation not recoverable from the text extraction.]

B.3 Mozart Quartet No.1, ii: Violin 1

[Score display, Violin 1 (continued): aligned MIDI, OMR-pre, OMR-post, and Truth staves with numeric note labels; notation not recoverable from the text extraction.]

B.4 Mozart Quartet No.1, ii: Violin 1

[Score display, Violin 1 (continued): aligned MIDI, OMR-pre, OMR-post, and Truth staves with numeric note labels; notation not recoverable from the text extraction.]

B.5 Mozart Quartet No.1, ii: Violin 1

[Score display, Violin 1 (continued): aligned MIDI, OMR-pre, OMR-post, and Truth staves with numeric note labels; notation not recoverable from the text extraction.]

B.6 Mozart Quartet No.1, ii: Violin 1

[Score display, Violin 1 (continued): aligned MIDI, OMR-pre, OMR-post, and Truth staves with numeric note labels; notation not recoverable from the text extraction.]

B.7 Mozart Quartet No.1, ii: Violin 2

[Score display: Quartet No. 1, Movement 2, Violin 2, Wolfgang Amadeus Mozart. Aligned MIDI, OMR-pre, OMR-post, and Ground Truth staves with numeric note labels; notation not recoverable from the text extraction.]

B.8 Mozart Quartet No.1, ii: Violin 2

[Score display, Violin 2 (continued): aligned MIDI, OMR-pre, OMR-post, and Truth staves with numeric note labels; notation not recoverable from the text extraction.]

B.9 Mozart Quartet No.1, ii: Violin 2

[Score display, Violin 2 (continued): aligned MIDI, OMR-pre, OMR-post, and Truth staves with numeric note labels; notation not recoverable from the text extraction.]

B.10 Mozart Quartet No.1, ii: Violin 2

[Score display, Violin 2 (continued): aligned MIDI, OMR-pre, OMR-post, and Truth staves with numeric note labels; notation not recoverable from the text extraction.]

B.11 Mozart Quartet No.1, ii: Violin 2

[Score display, Violin 2 (continued): aligned MIDI, OMR-pre, OMR-post, and Truth staves with numeric note labels; notation not recoverable from the text extraction.]

B.12 Mozart Quartet No.1, ii: Violin 2

[Score display, Violin 2 (continued): aligned MIDI, OMR-pre, OMR-post, and Truth staves with numeric note labels; notation not recoverable from the text extraction.]

B.13 Mozart Quartet No.1, ii: Viola

[Score display: Quartet No. 1, Movement 2, Viola, Wolfgang Amadeus Mozart. Aligned MIDI, OMR-pre, OMR-post, and Ground Truth staves with numeric note labels; notation not recoverable from the text extraction.]

B.14 Mozart Quartet No.1, ii: Viola

[Score display, Viola (continued): aligned MIDI, OMR-pre, OMR-post, and Truth staves with numeric note labels; notation not recoverable from the text extraction.]

B.15 Mozart Quartet No.1, ii: Viola

[Score display, Viola (continued): aligned MIDI, OMR-pre, OMR-post, and Truth staves with numeric note labels; notation not recoverable from the text extraction.]

B.16 Mozart Quartet No.1, ii: Cello

[Score display: Quartet No. 1, Movement 2, Cello, Wolfgang Amadeus Mozart. Aligned MIDI, OMR-pre, OMR-post, and Ground Truth staves with numeric note labels; notation not recoverable from the text extraction.]

B.17 Mozart Quartet No.1, ii: Cello

[Score display, Cello (continued): aligned MIDI, OMR-pre, OMR-post, and Truth staves with numeric note labels; notation not recoverable from the text extraction.]

B.18 Mozart Quartet No.1, ii: Cello

[Score display, Cello (continued): aligned MIDI, OMR-pre, OMR-post, and Truth staves with numeric note labels; notation not recoverable from the text extraction.]

Bibliography

[1] SmartScore X2 Professional Edition. https://www.musitek.com/.

[2] Hervé Bitteur. Audiveris, 2014. https://github.com/Audiveris/audiveris.

[3] Dorothea Blostein and Henry S Baird. A Critical Survey of Music Image Analysis. In Structured Document Image Analysis, pages 405–434. Springer, 1992.

[4] Costin-Anton Boiangiu, Radu Ioanitescu, Razvan-Costin Dragomir, et al. Voting-Based OCR System. The Proceedings of Journal ISOM, 10:470–486, 2016.

[5] T Bonte. Musescore: Open source music notation and composition software. Technical report, Free and Open source Software Developers’ European Meeting, 2009.

[6] Donald Byrd and Megan Schindele. Prospects for Improving OMR with Multiple Recognizers. In International Society for Music Information Retrieval Conference, pages 41–46, 2006.

[7] Donald Byrd and Jakob Grue Simonsen. Towards a Standard Testbed for Optical Music Recognition: Definitions, Metrics, and Page Images. Journal of New Music Research, 44(3):169–195, 2015.

[8] Jorge Calvo-Zaragoza, Jan Hajic Jr, and Alexander Pacha. Understanding Optical Music Recognition. Computer Research Repository, abs/1908.03608, 2019.

[9] Liang Chen and Christopher Raphael. Optical Music Recognition and Human-in-the-loop Computation. In International Society for Music Information Retrieval Conference, International Workshop on Reading Music Systems, 2018.

[10] Maura Church and Michael Scott Cuthbert. Improving Rhythmic Transcriptions via Probability Models Applied Post-OMR. In International Society for Music Information Retrieval Conference, pages 643–648, 2014.

[11] Michael Scott Cuthbert and Christopher Ariza. music21: A Toolkit for Computer-Aided and Symbolic Music Data. pages 637–642, 2010.

[12] Christoph Dalitz and Thomas Karsten. Using the Gamera Framework for Building a Lute Tablature Recognition System. In Proceedings of the Sixth International Society for Music Information Retrieval Conference, pages 478–481, 2005.

[13] Hélio Magalhães de Oliveira and Raimundo de Oliveira. Understanding MIDI: A Painless Tutorial on Midi Format. arXiv, 2017.

[14] Francesco Foscarin, Florent Jacquemard, and Raphaël Fournier-S’niehotta. A diff procedure for music score files. In 6th International Conference on Digital Libraries for Musicology, pages 58–64, 2019.

[15] Michael Good. Beyond PDF – Exchange and Publish Scores with MusicXML, Apr 2013. Musikmesse Music Industry Trade Fair.

[16] Michael Good and LLC Recordare. Lessons from the Adoption of MusicXML as an Interchange Standard. In Proceedings of XML, pages 5–7, 2006.

[17] Cindy Grande and Alan Belkin. The Development of the Notation Interchange File Format. Computer Music Journal, 20(4):33–43, 1996.

[18] Edward Guo. IMSLP/Petrucci Library Free Public Domain Sheet Music. http://imslp.org.

[19] Franz Joseph Haydn. String Quartet in Bb, Hob.III:1, Op.1, No.1 (’La chasse’). Edition Peters, MIT Lewis Music Library.

[20] Franz Joseph Haydn. String Quartet in Bb, Hob.III:1, Op.1, No.1 (’La chasse’). Accessed through Classical Archives https://www.classicalarchives.com/midi/composer/2679.html Uploaded by Edwards, Steven E.

[21] Franz Joseph Haydn. String Quartet in Bb, Hob.III:1, Op.1, No.1 (’La chasse’). Merton Music, Accessed through IMSLP http://imslp.org, Id 71570.

[22] Franz Joseph Haydn. String Quartet in Bb, Hob.III:1, Op.1, No.1 (’La chasse’). Edition Eulenburg, MIT Lewis Music Library.

[23] Noman Islam, Zeeshan Islam, and Nazia Noor. A Survey on Optical Character Recognition System. Journal of Information Communication Technology, 10(2), 2016.

[24] Jan Hajič jr, Marta Kolárová, Alexander Pacha, and Jorge Calvo-Zaragoza. How Current Optical Music Recognition Systems are Becoming Useful for Digital Libraries. In Proceedings of the 5th International Conference on Digital Libraries for Musicology, pages 57–61, 2018.

[25] Kirk Kassner. SmartScore X. General Music Today (Online), 21(3):35, 2008.

[26] Shmuel T Klein, M Ben-Nissan, and M Kopel. A Voting System for Automatic OCR Correction. Citeseer, 2002.

[27] Carol L Krumhansl. Cognitive foundations of musical pitch, volume 17. Oxford University Press, 2001.

[28] Friedrich Kuhlau. Grande Quartet for 4 Flutes, Op.103. Accessed through IMSLP http://imslp.org, Id 97477.

[29] Friedrich Kuhlau. Grande Quartet for 4 Flutes, Op.103. Accessed through Classical Archives https://www.classicalarchives.com/midi/composer/2848.html Uploaded by Sato, Takahiro.

[30] Wolfgang Amadeus Mozart. String Quartet No.1 in G, K.80 (’Lodi’). Internationale Stiftung Mozarteum, Online Publications, Accessed through IMSLP http://imslp.org, Id 422093.

[31] Wolfgang Amadeus Mozart. String Quartet No.1 in G, K.80 (’Lodi’). Accessed through Classical Archives https://www.classicalarchives.com/midi/composer/3052.html Uploaded by Findenegg, Gunter R.

[32] Victor Manuel Padilla Martin-Caro, Alex McLean, Alan Alexander Marsden, and Kia Ng. Improving Optical Music Recognition by Combining Outputs from Multiple Sources. In 16th International Society for Music Information Retrieval Conference, pages 517–523, 2015.

[33] Ana Rebelo, Ichiro Fujinaga, Filipe Paszkiewicz, Andre RS Marcal, Carlos Guedes, and Jaime S Cardoso. Optical music recognition: state-of-the-art and open issues. International Journal of Multimedia Information Retrieval, 1(3):173–190, 2012.

[34] JW Roach and JE Tatem. Using domain knowledge in low-level visual processing to interpret handwritten music: an experiment. Pattern recognition, 21(1):33–44, 1988.

[35] Joseph Rothstein. MIDI: A Comprehensive Introduction, volume 7. AR Editions, Inc., 1995.

[36] Pierre R Schwob. Classical Archives. https://www.classicalarchives.com.

[37] Emily H Zhang. An Efficient Score Alignment Algorithm and its Applications. Master’s thesis, Massachusetts Institute of Technology, 2017.
