HamoKara: A System for Practice of Backing Vocals for Karaoke
Mina Shiraishi, Kozue Ogasawara, and Tetsuro Kitahara
College of Humanities and Sciences, Nihon University
{shiraishi,ogasawara,kitahara}@kthrlab.jp

Copyright: © 2018 Mina Shiraishi et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 Unported License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
SMC2018 - 511

ABSTRACT

Creating harmony in karaoke by a lead vocalist and a backing vocalist is enjoyable, but backing vocals are not easy for non-advanced karaoke users. First, it is difficult to find musically appropriate submelodies (melodies for backing vocals). Second, the backing vocalist has to practice backing vocals in advance in order to sing them accurately, because singing submelodies is often influenced by the singing of the main melody. In this paper, we propose a backing vocals practice system called HamoKara. This system automatically generates a submelody with a rule-based or HMM-based method, and provides users with an environment for practicing backing vocals. Users can check whether their pitch is correct through audio and visual feedback. Experimental results show that the generated submelodies are musically appropriate to some degree, and the system helped users to learn to sing submelodies to some extent.

1. INTRODUCTION

Karaoke is one of the most familiar forms of entertainment related to music. At a bar or a designated room, a person who wants to sing orders the karaoke machine to play back the accompaniment of his/her favorite songs. During the playback of the accompaniment, he/she enjoys singing. Because karaoke is enjoyable for many people, techniques enhancing the entertainability of karaoke have been developed [1–3].

One way to enjoy karaoke is to create harmony with two singers. Typically, one person plays lead vocals (singing the main melody) and the other person plays backing vocals (singing a different melody). The melody sung by the backing vocalist (we call this a submelody here) typically has the same rhythm as the main melody but has different pitches (for example, a third above or below the main melody). If the lead vocalist and backing vocalist are able to sing in harmony with each other, the sound will be pleasant. However, this is not easy in practice. First, it is difficult to find a musically appropriate submelody. Particularly for songs that do not contain backing vocals in the original CD recordings, the backing vocalist has to create an appropriate submelody him/herself, which requires musical knowledge and skill. Second, accurately singing backing vocals in pitch requires advanced singing skill. If the two singers do not have sufficient singing skill, the pitch of the backing vocals may be influenced by the lead vocals. To avoid this, the backing vocalist needs to learn to sing accurately in pitch by practicing the backing vocals in advance.

In this paper, we propose a system called HamoKara, which enables a user to practice backing vocals. This system has two functions. The first function is automatic submelody generation. For popular music songs, the system generates submelodies and indicates them with both a piano-roll display and a guide tone. The second function is support of backing vocals practice. While the user is singing the indicated submelody, the system shows the pitch (fundamental frequency, F0) of the singing voice on the piano-roll display. The user can find out how close the pitch of his/her singing voice comes to the correct pitch.

The rest of the paper is organized as follows: in Section 2, we present a brief survey on related work from the perspectives of harmonization and singing training. In Section 3, we describe the overview of our system. In Section 4, we present the details of our automatic submelody generation methods. In Section 5, we report results of evaluations conducted to confirm the effectiveness of our system. Finally, we conclude the paper in Section 6.

2. RELATED WORK

2.1 Harmonization

Harmonization has two general approaches. The output of the first approach is a sequence of chord symbols, such as C-F-G7-C, for a given melody [4–6]. The second approach assigns concrete notes to voices other than the melody voice. The typical form in the latter approach of harmonization is a four-part harmony, which consists of soprano, alto, tenor, and bass voices. Four-part harmonization is a traditional part of the theoretical education of Western classical musicians, and so numerous researchers have attempted to generate it automatically [7–14].

Our task, the generation of submelodies for karaoke, is slightly different from the above-mentioned harmonization. In karaoke, one sings together with the accompaniment played back by the karaoke machine, so accompaniment data (typically MIDI-equivalent sequence data) are given in advance. Submelodies should be consonant with the accompaniment as well as with the main melody.

2.2 Singing practice support

As commonly known, almost all commercial karaoke machines have a singing scoring function. For example, in US Patent 5719344 [15], a technique has been proposed that is based on comparing the karaoke user's voice and a referential voice extracted from the original CD recording. In addition, Tsai et al. [16] also proposed an automatic evaluation method of karaoke singing.

Computer-assisted singing learning systems have been developed, such as Singad, Albert, Sino & See, and WinSINGAD (see [17] for details). These systems aim at self-monitoring the user's singing with visual feedback of the pitch, spectra, and/or other acoustic/musical features. In addition, Hirai et al. [18] developed the Voice Shooting Game to improve poor-pitch singing. In this game, the vertical position of the user's character on the screen is synchronized with the pitch of the user's singing voice. The user has to control his/her pitch accurately to control the vertical position of the character and to catch target items. Nakano et al. [19] developed a singing-skill visualization system called MiruSinger. This system extracts and visualizes acoustic features, particularly the pitch and vibrato sections, from the user's singing voice. Lin et al. [20] developed a real-time interactive application that runs on a smartphone, which enables the user to learn the intonation and timing while singing.

These studies and systems focus on solo singing, not on situations where two persons sing together in harmony. Fukumoto et al. [21] developed a system, RelaPitch, that enables the user to train relative pitch. Listening to a standard tone, the user tries to sing the pitch that has a specified interval from the standard tone. This system focused on training the creation of harmony by singing, but does not focus on creating harmony in actual songs.

3. SYSTEM OVERVIEW

3.1 Basic system design

The goal of our system is to enable the user to practice backing vocals for a song from popular music (especially Japanese popular music, or J-pop). To achieve this, the system has to support the following functions:

1. Submelody generation
Some songs do not have submelodies in the original CD recordings. For such songs, the system has to

completely corresponds to each note in the main melody one-for-one. Submelodies are often sung only at limited sections (such as chorus sections) in actual music scenes, but our system generates a submelody for the entire main melody. This is so that users can simply ignore the submelody indication except in the sections where they want to sing harmony.

It is possible to generate submelodies that have higher or lower pitch than the main melody. The user can select an upper submelody or a lower submelody at the start screen.

A typical phenomenon occurs when one person plays lead vocals and another person plays backing vocals: here, their singing pitches are influenced by each other. In particular, backing vocals may be greatly influenced by lead vocals because the backing vocalist has to sing an unfamiliar melody. One solution to this problem is to enable the backing vocalist to check his/her pitch accuracy during the practice phase. Our system enables the backing vocalist to control the volume balance between his/her own voice and the guide tone in order to easily check his/her voice. In addition, he/she can check the pitch accuracy through visual feedback. The pitch of his/her voice and the correct pitch are displayed on the piano-roll screen in real time.

3.2 System configuration

The block diagram of our system is shown in Figure 1. Loading and playing back MIDI data, generating submelodies, and displaying lyrics and submelodies on the screen are performed by the PC. The microphone for the backing vocals is connected to the USB audio interface communicating with the PC. The audio signal of the backing vocals is sent from the audio interface to the PC, and the accompaniment sound is sent from the PC to the audio interface. The volume of the backing vocals and the accompaniment can be adjusted independently on the audio interface.

In contrast, the microphone for lead vocals is connected to the analogue mixer, not to the audio interface. This is in order to not send the audio signal of the lead vocals to the PC. To make it possible to listen to both lead vocals and backing vocals, the headphones for both vocalists are connected to the analogue mixer.
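
As an illustration of the visual feedback described in Section 3.1, where the sung pitch (F0) is shown against the correct pitch on a piano roll, the following minimal Python sketch maps an F0 value in Hz to a fractional MIDI note number and measures its deviation from a target note in cents. It is not the authors' implementation: the function names, the A4 = 440 Hz reference tuning, and the 50-cent tolerance are all assumptions made for this example.

```python
import math

# Assumed reference tuning (not specified in the paper): A4 = 440 Hz = MIDI note 69.
A4_HZ = 440.0
A4_MIDI = 69


def hz_to_midi(f0_hz: float) -> float:
    """Convert a fundamental frequency in Hz to a fractional MIDI note number."""
    if f0_hz <= 0:
        raise ValueError("F0 must be positive")
    return A4_MIDI + 12.0 * math.log2(f0_hz / A4_HZ)


def cents_off(f0_hz: float, target_midi: int) -> float:
    """Signed deviation of the sung pitch from the target note, in cents.

    Positive means the singer is sharp, negative means flat.
    """
    return 100.0 * (hz_to_midi(f0_hz) - target_midi)


def is_on_pitch(f0_hz: float, target_midi: int, tolerance_cents: float = 50.0) -> bool:
    """Hypothetical accuracy check: within an assumed tolerance of the target note."""
    return abs(cents_off(f0_hz, target_midi)) <= tolerance_cents
```

In such a sketch, the fractional MIDI value could serve as the vertical position of the sung pitch on the piano roll, while the target note would come from the generated submelody.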