An Approach for Developing a Mobile Accessed Music Search Integration Platform
Proceedings of the 2013 Federated Conference on Computer Science and Information Systems pp. 267–273
PROCEEDINGS OF THE FEDCSIS. KRAKÓW, 2013
978-1-4673-4471-5/$25.00 © 2013, IEEE

Marina Purgina, Andrey Kuznetsov, Evgeny Pyshkin
Institute of Computing and Control, St. Petersburg State Polytechnical University, St. Petersburg, Russia, 194021

Abstract—We introduce the architecture and the data model of the software for integrated access to music searching web services. We illustrate our approach by developing a mobile accessed application which allows users of Android touch screen devices to access several music searchers including Musipedia, Music Ngram Viewer, and FolkTuneFinder. The application supports various styles of music input query. We pay special attention to query style transformation aimed at fitting the requirements of the supported searching services. Using examples, we show how the developed tools help in discovering citations and similarity in music compositions.

I. INTRODUCTION

A variety of multimedia resources constitutes a considerable part of the present-day Web information content. The searching services usually provide special features to deal with different types of media such as books, maps, images, audio and video recordings, software, etc. Together with general-purpose searching systems, there are solutions using specialized interfaces adapted to the subject domains. Indeed, the quality of a searching service depends both on the efficiency of the algorithms it relies on and on its user interface facilities. As shown in our previous work, such interfaces include special syntax forms, user query visualization facilities, interactive assisting tools, components for non-textual query input, interactive and “clickable” concept clouds, and so on [1]. Depending on searching tasks, specialized user interfaces may support different kinds of input like mathematical equations or chemical changes, geographic maps, XML-based resource descriptions, software source code fragments, editable graphs, etc.

In text searching, such aspects as morphological and synonymic variations, malapropisms, spelling errors, and time dependency cause particular difficulties in the searching process. In the music searching domain there are specific complications such as tonality changes, omitted or incorrectly played notes or intervals, and time and rhythmic errors. Thus, although there are certain similarities between text and music information retrieval, they differ significantly [2].

In our previous work (see [3]) we analyzed and developed an improved EMD algorithm used to compare single voice and polyphonic music fragments represented in symbolic form. The human ability to recognize music is strongly related to the listener’s experience, which may itself be considered a product of intelligent music perception [4], [5]. Recently (see [6]) we also analyzed internal models of music representation (with most attention to a function-based representation), which are the foundation of various algorithms for melody extraction, main voice recognition, authorship attribution, etc. Music processing algorithms use previous user experience implicitly. As examples, we could cite the Skyline melody extraction algorithm [7], based on the empirical principle that the melody is often in the upper voice, or Melody Lines algorithms, based on the idea of grouping notes with closer pitches [8].

The remainder of the article is organized as follows. In Section II we review current music searching systems and approaches. We also introduce our experience in the domain of human centric computing and refer to some recent related works. In Section III we describe music query input styles and analyze possible transformations of music input forms so as to fit the requirements of searching services. Section IV contains the description of the developed Android application architecture. We show how it works and attempt to analyze the searching output from the point of view of a musicologist.

II. BACKGROUND AND RELATED WORKS

Apart from searching media by metadata descriptions (like author, title, genre, or production year information), the main scenarios of music searching are the following:
1) Searching music information by an existing audio fragment considered as an input.
2) Searching music by the written note score.
3) Searching compositions by human remembrance represented in the form of a sung, hummed, tapped or otherwise defined melody or rhythm fragment.

Searching by given audio fragments is supported by many specialized search engines such as Audiotag, Tunatic or Shazam [9]. As a rule, it is implemented on the basis of the so-called audio fingerprinting technique. The idea of this approach is to convert an audio fragment of fixed length to a low-dimensional vector by extracting certain spectral features from the input signal. Then this vector (being a kind of audio spectral fingerprint) is compared to fingerprints stored in some database [10], [11].

TABLE I
ACCESSING MUSIC SEARCHING WEB SERVICES

                      Access
Name                  GUI   Micro   Text        Soft        Web
Audiotag              +     –       –           –           +
Tunatic               +     +       –           –           –
Shazam                +     +       Partially   –           +
Midomi                +     +       –           –           +
Musipedia             +     +       Partially   SOAP        +
Ritmoteka             +     –       –           –           +
Songtapper            +     –       –           –           +
Music Ngram Viewer    +     –       +           REST/JSON   +
FolkTuneFinder        +     –       +           REST/JSON   +
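The audio fingerprinting idea outlined in the text (a fixed-length fragment reduced to a low-dimensional spectral vector, which is then compared with stored vectors) can be illustrated by a small sketch. It is a toy illustration under our own assumptions, not the method used by Audiotag, Tunatic or Shazam: the names `fingerprint`, `distance`, `best_match` and `tone`, the choice of eight equal-width frequency bands, and the Euclidean nearest-neighbour comparison are all hypothetical, and production systems rely on far more robust features.

```python
import cmath
import math


def fingerprint(samples, n_bands=8):
    """Map a fixed-length fragment to n_bands normalized band energies.

    Hypothetical feature choice: equal-width bands over the power spectrum.
    """
    n = len(samples)
    half = n // 2  # keep only frequencies below the Nyquist limit
    # Naive DFT power spectrum (O(n^2)); acceptable for a short toy frame.
    power = []
    for k in range(half):
        coeff = sum(s * cmath.exp(-2j * math.pi * k * t / n)
                    for t, s in enumerate(samples))
        power.append(abs(coeff) ** 2)
    width = half // n_bands
    bands = [sum(power[b * width:(b + 1) * width]) for b in range(n_bands)]
    total = sum(bands) or 1.0  # avoid division by zero for silent input
    return [energy / total for energy in bands]


def distance(a, b):
    """Euclidean distance between two fingerprints of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_match(query, database):
    """Return the key of the stored fingerprint closest to the query."""
    return min(database, key=lambda name: distance(query, database[name]))


def tone(freq, n=256, rate=8000):
    """Synthetic test signal: n samples of a sine wave at freq Hz."""
    return [math.sin(2 * math.pi * freq * t / rate) for t in range(n)]


# Assumed toy database of two reference fragments and one slightly detuned query.
database = {
    "low tone": fingerprint(tone(750.0)),
    "high tone": fingerprint(tone(3250.0)),
}
query = fingerprint(tone(800.0))
print(best_match(query, database))
```

In this sketch the detuned query still falls into the same frequency band as the low reference tone, which is the property a fingerprint needs: small distortions of the input should leave the vector close to the stored one.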
Fig. 1. Beethoven’s “Ode to Joy” fragment represented in Parsons code

The second scenario, searching by note score, implies two possibilities. The first one is to look for a given music fragment in the note sheet databases (this is beyond the scope of this paper). The main difficulty, which demands special music-oriented algorithms, is that the note score may contain user errors. The second possibility is to convert the melody into one of the forms discussed hereafter so as to use existing searching facilities.

Finally, in the case of searching by human remembrance, a system deals with the main voice, rhythm, melody contour or interval sequence, which is usually not perfectly defined. Hence, it is impossible to search directly within the binary contents of audio resources: we have no faithful audio fragment.

Thus, there are numerous ways to provide music input for a searching system. An extensive description of input styles used by music search engines may be found in [12]. Presently there are many searching web services that allow customers to use one of several possible styles to input a music query, including such services as Midomi, Musipedia, Ritmoteka, Songtapper, Music Ngram Viewer, and FolkTuneFinder.

Table I represents possible ways to access different music web searching services and pays attention to the following facilities:
• GUI: Graphical user interface
• Micro: Using microphone
• Text: Text mode
• Soft: Access for software applications by SOAP or similar software interaction protocols
• Web: Web interface

As can be seen, nowadays many services are accessible via browsers since they support Web interface features. Another important issue is the possibility to access some services from inside software applications by using open protocols. This makes it possible to create tools that do not limit users to only one service at a time. With such tools we are able to combine different features of different searchers, to access additional resources such as YouTube clips, and to deal with additional search attributes like genres, styles, time periods, etc.

Particularly, Musipedia service uses SOAP protocol de- [...] (both are also used in our work as target searching services) are based on the REST architectural style and their responses are wrapped in JSON format. A detailed description of the API usage rules and examples for the Music Ngram Viewer service may be found in [14].

With respect to music input styles, existing tools support the following ways to define a music fragment:
• To sing or to hum the theme and to transfer the recording to the music search engine.
• To write notes by using one of the known music notations directly (e.g. music score, Helmholtz or American pitch notation, MIDI notation, etc.).
• To tap the rhythm.
• To play the melody with the use of a virtual keyboard.
• To use a MIDI-compatible instrument or its software model.
• To enter Parsons code, or to set the melody contour by using “U, R, D” instructions as shown in Figure 1.¹
• To define keywords or enter a text query.

Table II lists some known music web searching services together with information about supported music query input styles, namely:
• Audio: Audio fragment
• Notes: Music score or pitch notation
• Hum: Singing or humming
• Rhythm: Tapping the rhythm
• VKB: Virtual keyboard generating note sequence with rhythm
• URD: Parsons code
• MIDI: MIDI-piano
• Text: Keywords or text query

III. MUSIC QUERY INPUT STYLES

As shown in the above section, we may define the music query by using different input styles. For a searching frame-

¹ Each pair of consecutive notes is coded as U (“sound goes Up”) if the second note is higher than the first note, R (“Repeat”) if the consecutive pitches are equal, and D (“Down”) otherwise. Some systems use S (“the Same”) instead of R to designate pitch repetition. Rhythm