Intelligent User Interfaces for Music Discovery
Knees, P., et al. (2020). Intelligent User Interfaces for Music Discovery. Transactions of the International Society for Music Information Retrieval, 3(1), pp. 165–179. DOI: https://doi.org/10.5334/tismir.60

OVERVIEW ARTICLE

Intelligent User Interfaces for Music Discovery

Peter Knees*, Markus Schedl† and Masataka Goto‡

Assisting the user in finding music is one of the original motivations that led to the establishment of Music Information Retrieval (MIR) as a research field. This encompasses classic Information Retrieval-inspired access to music repositories that aims at meeting an information need of an expert user. Beyond this, however, music as a cultural art form is also connected to an entertainment need of potential listeners, requiring more intuitive and engaging means for music discovery. A central aspect in this process is the user interface. In this article, we reflect on the evolution of MIR-driven intelligent user interfaces for music browsing and discovery over the past two decades. We argue that three major developments have transformed and shaped user interfaces during this period, each connected to a phase of new listening practices. Phase 1 has seen the development of content-based music retrieval interfaces built upon audio processing and content description algorithms, facilitating the automatic organization of repositories and finding music according to sound qualities. These interfaces are primarily connected to personal music collections or (still) small commercial catalogs. Phase 2 comprises interfaces incorporating collaborative and automatic semantic description of music, exploiting knowledge captured in user-generated metadata. These interfaces are connected to collective web platforms. Phase 3 is dominated by recommender systems built upon the collection of online music interaction traces on a large scale. These interfaces are connected to streaming services.
We review and contextualize work from all three phases and extrapolate current developments to outline possible scenarios of the music recommendation and listening interfaces of the future.

Keywords: music access; music browsing; user interfaces; content-based MIR; community metadata; recommender systems

* Faculty of Informatics, TU Wien, Vienna, AT
† Institute of Computational Perception, Johannes Kepler University Linz, AT
‡ National Institute of Advanced Industrial Science and Technology (AIST), JP
Corresponding author: Peter Knees ([email protected])

1. Introduction

With its origins in Information Retrieval research, a fundamental goal of Music Information Retrieval (MIR) as a dedicated research field in the year 2000 was to develop technology to assist the user in finding music, information about music, or information in music (Byrd and Fingerhut, 2002). Since then, also driven by the developments in content-based analysis, semantic annotation, and personalization, intelligent music applications have had a significant impact on people's interaction with music. These applications comprise "active music-listening interfaces" (Goto and Dannenberg, 2019), which augment the process of music listening to increase the engagement of the user and/or give deeper insights into musical aspects, also to musically less experienced listeners. For accessing digital repositories of acoustic content, retrieving relevant pieces from music collections, and discovering new music, interfaces based on querying, visual browsing, or recommendation have facilitated new modes of interaction.

Revisiting an early definition of MIR by Downie (2004) as "a multidisciplinary research endeavor that strives to develop innovative content-based searching schemes, novel interfaces, and evolving networked delivery mechanisms in an effort to make the world's vast store of music accessible to all" 16 years later reveals that these developments were in fact intended and implanted into MIR from the beginning. Given the music industry landscape and how people listen to music today,1 this visionary definition has undoubtedly stood the test of time.

In this paper, we reflect on the evolution of MIR-driven user interfaces for music browsing and discovery over the past two decades, from organizing personal music collections to streaming a personalized selection from "the world's vast store of music". To this end, we connect major developments that have transformed and shaped MIR research in general, and user interfaces in particular, to the prevalent and emerging listening practices of the time. We identify three main phases that have each laid the foundation for the next, and review work that focuses on the specific aspects of these phases.

First, in Section 2, we investigate the phase of growing digital personal music collections and the interfaces built upon intelligent audio processing and content description algorithms. These algorithms facilitate the automatic organization of repositories and finding music in personal collections, as well as commercial repositories, according to sound qualities. Second, in Section 3, we investigate the emergence of collective web platforms and their exploitation for listening interfaces. The extracted user-generated metadata often pertains to semantic descriptions and complements the content-based methods that facilitated the developments of the preceding phase. This phase also constitutes an intermediate step towards the exploitation of collective listening data, which is the driving force behind the third, and ongoing, phase, connected to streaming services (Section 4). Here, the large-scale collection of online music interaction traces and their exploitation in recommender systems are the defining elements. Extrapolating these and other ongoing developments, we outline possible scenarios of the music recommendation and listening interfaces of the future in Section 5.

Note that the phases we identify in the evolution of user interfaces for music discovery also reflect the "three ages of MIR" described by Herrera (2018). Herrera refers to these three phases as "the age of feature extractors", "the age of semantic descriptors", and "the age of context-aware systems", respectively. We further agree on the already ongoing "age of creative systems" that builds upon MIR to facilitate new interfaces supporting creativity, as we discuss in Section 5. We believe that this strong alignment gives further evidence of the pivotal role of intelligent user interfaces in the development of MIR. While user interfaces, especially in the early phases, were often mere research prototypes, their development is tightly intertwined with ongoing trends. Thus, they provide essential gauges of the state of the art and, beyond that, give a perspective on what could be possible.

2. Phase 1: Content-Based Music Retrieval Interfaces

The late 1990s see two pivotal developments. On the one hand, the Internet gets established as a mainstream communication medium and distribution channel. On the other hand, technological […] with the title list and mere text searches based on bibliographic information are useful enough to browse the whole collection to choose pieces to listen to. However, as the accessible collection grows and becomes largely unfamiliar, such simple interfaces become insufficient (Cunningham et al., 2017; Cunningham, 2019), and new research approaches targeting the retrieval, classification, and organization of music emerge.

"Intelligent" interfaces for music retrieval become a research field of interest with the developments in content-based music retrieval (Casey et al., 2008). A landmark in this regard is the development of query-by-humming systems (Kageyama et al., 1993) and search engines indexing sound properties of loudness, pitch, and timbre (Wold et al., 1996), which initiate the emancipation of music search systems from traditional text- and metadata-based indexing and query interfaces. While interfaces are still very much targeted at presenting results in sequential order according to relevance to a query, starting in the early 2000s, MIR research proposes several alternatives to facilitate music discovery.

2.1 Map-based music browsing and discovery

Interfaces that allow content-based searches for music retrieval are useful when people can formulate good queries, and especially when users are looking for a particular work, but sometimes it is difficult to come up with an appropriate query when faced with a huge music collection and vague search criteria. Interfaces for music browsing and discovery are therefore proposed to let users encounter unexpected but interesting musical pieces or artists. Visualization of a music collection is one way to provide users with various bird's-eye views and comprehensive interactions. The most popular visualization is to project musical pieces or artists onto a 2D or 3D space ("map") by using music similarity. 2D visualizations also lend themselves to being applied on tabletop interfaces for intuitive access and interaction (e.g. Julià and Jordà, 2009). The trend of spatially arranging collections for exploration can be seen throughout the past 20 years and is still unbroken, cf. Figure 1.

One of the earliest interfaces is GenreSpace by Tzanetakis et al. (2001), which visualizes musical pieces with genre-specific colors in a 3D space (see Figure 1(a) for a greyscale image). The coloring of each piece is determined by automatic genre classification. The layout of pieces is determined by principal component analysis (PCA), which projects high-dimensional audio feature vectors into 3D
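A minimal sketch of this kind of PCA-based map layout, assuming scikit-learn is available; the feature vectors and genre labels below are synthetic placeholders, not GenreSpace's actual audio descriptors:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-ins for per-track audio descriptors (e.g. timbre
# statistics); a real system would extract these from the audio signal.
rng = np.random.default_rng(42)
features = rng.normal(size=(200, 40))   # 200 tracks, 40-dim feature vectors
genres = rng.integers(0, 4, size=200)   # hypothetical genre labels per track

# Project the high-dimensional feature vectors into 3D for a map view.
coords = PCA(n_components=3).fit_transform(features)

# Genre-specific colors, in the spirit of GenreSpace-style visualizations.
palette = ["red", "green", "blue", "orange"]
colors = [palette[g] for g in genres]

print(coords.shape)  # (200, 3)
```

Each track then becomes a colored point at its 3D coordinates; plotting the result (e.g. with a scatter plot) yields the kind of genre-colored map the interface presents to the user.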