A New Take on an Old Approach to Computational Musicology

HUMDRUMR: A NEW TAKE ON AN OLD APPROACH TO COMPUTATIONAL MUSICOLOGY Nathaniel Condit-Schultz Claire Arthur Georgia Institute of Technology Georgia Institute of Technology [email protected] [email protected] ABSTRACT proaches. Humanist music scholars’ deep, nuanced knowl- edge of musical culture, structure, and practice could be Musicology research is a fundamentally humanistic en- an invaluable asset to the computational/empirical music deavor. However, despite the productive work of a small research community. Unfortunately, the training neces- niche of humanities-trained computational musicologists, sary for fruitful computational scholarship is absent from most cutting-edge digital music research is pursued by most music curricula, which cover neither the fundamen- scholars whose primary training is scientific or compu- tal methodological principals of empirical research nor the tational, not humanistic. This unfortunate situation is necessary coding skills. The need to assist, and convince, prolonged, at least in part, by the daunting barrier that music scholars to learn programming has been an area of computer coding presents to humanities scholars with no active discussion for some time [7], in particular there is technical training. In this paper, we present humdrumR a need for software tools crafted to support their learning (“hum-drummer”), a software package designed to afford and research goals [32]. computational musicology research for both advanced and To make computational research truly appealing and novice computer coders. Humdrum is a powerful and in- accessible to traditional humanists and seasoned compu- fluential existing computational musicology framework in- tational researchers alike, we must juggle three conflict- cluding the humdrum syntax—a flexible text data format ing factors: flexibility, power, and usability. User-friendly with tens of thousands of extant scores available—and the interfaces like the Josquin Research Project’s Analysis Bash-based humdrum toolkit. HumdrumR is a modern Tools 1 , Theme Finder 2 , or rapscience.net’s Visualizer 3 replacement for the humdrum toolkit, based in the data- can be utilized by scholars with no special training. How- analysis/statistical programming language R. By combin- ever, such tools allow only variations of hard-coded anal- ing the flexibility and transparency of the humdrum syn- yses applied to limited, fixed databases. On the op- tax with the powerful data analysis tools and concise syn- posite extreme, any skilled programmer can code their tax of R, humdrumR offers an appealing new approach to own symbolic music data formats and analysis/parsing would-be computational musicologists. HumdrumR lever- software “from scratch”—as many computational projects ages R’s powerful metaprogramming capabilities to create [4, 11] have done—, allowing for the unlimited power an extremely expressive and composable syntax, allowing of the programming language of their choice, but requir- novices to achieve usable analyses quickly while avoiding ing substantially more effort and experience. The most many coding concepts that are commonly challenging for prominent modern computational musicology toolkit— beginners. music21 [10]—, in our estimation, falls too close to the later extreme, proving quite daunting to novice coders. 1. INTRODUCTION This paper describes humdrumR (“hum-drummer”), a software toolkit for symbolic musicological data analysis Though digital musicology has been a productive area of intended to be appealing and accessible to traditional mu- research for several decades (e.g., [2, 8, 12, 16, 18–20, 22– sicologists while also being useful to more experienced 24,26,28,29]), it remains a niche field within the musical- computational researchers. Learning lessons from the suc- side of academia. In fact, most cutting-edge, scientific cesses and failures of existing tools, humdrumR strikes a music research has been pursued by researchers with pri- powerful new balance between flexibility, power, and us- mary training in computer science and psychology. Fortu- ability: nately, recent years have seen a flourishing of “digital humanities” research in general, with increasing numbers of 1. Using the humdrum data syntax, humdrumR is ex- traditional humanities scholars adopting computational ap- tremely flexible and general in scope, allowing users to study any type of performance-art data that can c Nathaniel Condit-Schultz, Claire Arthur. Licensed un- be represented symbolically—musical scores, dance der a Creative Commons Attribution 4.0 International License (CC BY steps, trumpet fingerings, etc. 4.0). Attribution: Nathaniel Condit-Schultz, Claire Arthur. “hum- drumR: A New Take on an Old Approach to Computational Musicology”, 1 http://josquin.stanford.edu/search/ 20th International Society for Music Information Retrieval Conference, 2 http://www.themefinder.org/ Delft, The Netherlands, 2019. 3 http://rapscience.net/Analysis/gui.html 715 Proceedings of the 20th ISMIR Conference, Delft, Netherlands, November 4-8, 2019 2. Based in the R programming language, humdrumR programming environment (Python, R, MATLAB, Julia, is extremely powerful, capable of complex data ma- etc.) to achieve more complex data manipulation, statisti- nipulation and interfacing with R’s statistics, visual- cal testing, or data visualization. The burden of parsing and ization, and machine learning libraries. manipulating humdrum data in the higher-level language falls on the user, substantially increasing the need for cod- humdrumR 3. is designed to present a relatively low ing skills and prolonging the workflow. To make matters barrier of entry for non-technical researchers, offer- worse, installation of the humdrum toolkit can be fairly ing a concise, expressive syntax for applying music complicated, especially for Windows users, who must in- analyses while avoiding difficult coding paradigms. stall a Unix emulator to use the toolkit at all. In 2005, musicologist and psychologist Nicholas Cook ob- HumdrumR is a successor to the humdrum toolkit, re- served that to truly engage in computational musicology placing all the original toolkit’s functionality while adding “proper,” scholars must achieve “sufficient understanding significantly more. However, humdrumR uses the origi- of the symbolical processing and data representation on nal humdrum syntax specification, and is thus compatible which it’s based” [7]. We believe that the combination of with all existing humdrum data. In fact, both the techni- humdrum and R represents an ideal avenue for computa- cal design and methodological philosophy of the humdrum tional novices to develop this “sufficient” understanding: syntax are fundamental to humdrumR. enough to pursue quality computational research but with- out having to engage with general coding paradigms that 2.1 The Humdrum Syntax are irrelevant to their research. The humdrum syntax is an extremely general scheme for In this paper, we first review the relevant philosophi- representing musicological data in plain-text. The syntax cal and technical features of humdrum and R, noting how is basically tabular (tab-delineated columns) but with a few humdrumR incorporates these features (sections 2 and additional complexities: 3). We next contrast humdrum(R) coding philosophy and style with that of music21 (section 4). Finally, we de- • Columns of data are interpreted as spines, which scribe the principle features of the humdrumR package can dynamically start, end, split, or merge, creating and humdrumR syntax (section 5), including numerous spine paths. Spine paths allow the syntax to repre- code examples. sent many complex cases from music notation: For example, if the upper staff of a piano score splits temporarily into two voices (one with beams up, the 2. HUMDRUM other with beams down), this can be encoded by Humdrum 4 is a system for computational musicology re- splitting the spine representing that staff into two search, created by David Huron [17] (circa 1995) and columns (with a *^ token) before merging them maintained by Craig Sapp at Stanford’s Center for Com- again after the split passage (*v tokens). puter Assisted Research in the Humanities. Though no one digital musicology framework has ever truly dominated the • Humdrum records are tagged as interpretation field, humdrum is certainly among the most widely used records (representing local metadata) or data and influential systems, being cited as the direct inspiration records (notes, chords, etc.). By simply interspers- for some of its most successful competitors—music21 ing interpretation and data records, metadata can be [10] and MusicXML [13]—and with tens of thousands of concisely associated with specific data points, pas- scores available in humdrum format. Humdrum actually sages, or sections. For example, key information can has two distinct components: the humdrum syntax and the be quickly read, edited, or added to scores by simply humdrum toolkit. inserting lines containing tokens like *f#: (f# mi- The humdrum toolkit is a collection of Unix command- nor). This approach contrasts with the more verbose, line tools for parsing and analyzing humdrum data, largely hierarchical attributes and tags used in XMLs. written in Bash and AWK but with more recent “extra” • Finally, each “cell” (i.e., each line-column coor- 5 commands written in C++. The humdrum toolkit is fun- dinate) in a humdrum file can contain multiple damentally entwined with the Bash shell, in particular the data tokens—a feature commonly used to represent grep and

Load more