Understanding Optical Music Recognition
Jorge Calvo-Zaragoza*, University of Alicante
Jan Hajič Jr.*, Charles University
Alexander Pacha*, TU Wien

Abstract

For over 50 years, researchers have been trying to teach computers to read music notation, a task referred to as Optical Music Recognition (OMR). However, this field is still difficult to access for new researchers, especially those without a significant musical background: few introductory materials are available, and the field has struggled with defining itself and building a shared terminology. In this tutorial, we address these shortcomings by (1) providing a robust definition of OMR and its relationship to related fields, (2) analyzing how OMR inverts the music encoding process to recover the musical notation and the musical semantics from documents, and (3) proposing a taxonomy of OMR, most notably a novel taxonomy of applications. Additionally, we discuss how deep learning affects modern OMR research, as opposed to the traditional pipeline. Based on this work, the reader should be able to attain a basic understanding of OMR: its objectives, its inherent structure, its relationship to other fields, the state of the art, and the research opportunities it affords.

Index Terms

Optical Music Recognition, Music Notation, Music Scores

I. INTRODUCTION

Music notation refers to a group of writing systems with which a wide range of music can be visually encoded so that musicians can later perform it. In this way, it is an essential tool for preserving a musical composition, facilitating permanence of the otherwise ephemeral phenomenon of music. In a broad, intuitive sense, it works in the same way that written text may serve as a precursor for speech.