Virtual Voices on Hands: Prominent Applications on the Synthesis and Control of the Singing Voice
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
The Race of Sound: Listening, Timbre, and Vocality in African American Music
UCLA Recent Work Title The Race of Sound: Listening, Timbre, and Vocality in African American Music Permalink https://escholarship.org/uc/item/9sn4k8dr ISBN 9780822372646 Author Eidsheim, Nina Sun Publication Date 2018-01-11 License https://creativecommons.org/licenses/by-nc-nd/4.0/ 4.0 Peer reviewed eScholarship.org Powered by the California Digital Library University of California The Race of Sound Refiguring American Music A series edited by Ronald Radano, Josh Kun, and Nina Sun Eidsheim Charles McGovern, contributing editor The Race of Sound Listening, Timbre, and Vocality in African American Music Nina Sun Eidsheim Duke University Press Durham and London 2019 © 2019 Nina Sun Eidsheim All rights reserved Printed in the United States of America on acid-free paper ∞ Designed by Courtney Leigh Baker and typeset in Garamond Premier Pro by Copperline Book Services Library of Congress Cataloging-in-Publication Data Title: The race of sound : listening, timbre, and vocality in African American music / Nina Sun Eidsheim. Description: Durham : Duke University Press, 2018. | Series: Refiguring American music | Includes bibliographical references and index. Identifiers:lccn 2018022952 (print) | lccn 2018035119 (ebook) | isbn 9780822372646 (ebook) | isbn 9780822368564 (hardcover : alk. paper) | isbn 9780822368687 (pbk. : alk. paper) Subjects: lcsh: African Americans—Music—Social aspects. | Music and race—United States. | Voice culture—Social aspects— United States. | Tone color (Music)—Social aspects—United States. | Music—Social aspects—United States. | Singing—Social aspects— United States. | Anderson, Marian, 1897–1993. | Holiday, Billie, 1915–1959. | Scott, Jimmy, 1925–2014. | Vocaloid (Computer file) Classification:lcc ml3917.u6 (ebook) | lcc ml3917.u6 e35 2018 (print) | ddc 781.2/308996073—dc23 lc record available at https://lccn.loc.gov/2018022952 Cover art: Nick Cave, Soundsuit, 2017. -
Expression Control in Singing Voice Synthesis
Expression Control in Singing Voice Synthesis Features, approaches, n the context of singing voice synthesis, expression control manipu- [ lates a set of voice features related to a particular emotion, style, or evaluation, and challenges singer. Also known as performance modeling, it has been ] approached from different perspectives and for different purposes, and different projects have shown a wide extent of applicability. The Iaim of this article is to provide an overview of approaches to expression control in singing voice synthesis. We introduce some musical applica- tions that use singing voice synthesis techniques to justify the need for Martí Umbert, Jordi Bonada, [ an accurate control of expression. Then, expression is defined and Masataka Goto, Tomoyasu Nakano, related to speech and instrument performance modeling. Next, we pres- ent the commonly studied set of voice parameters that can change and Johan Sundberg] Digital Object Identifier 10.1109/MSP.2015.2424572 Date of publication: 13 October 2015 IMAGE LICENSED BY INGRAM PUBLISHING 1053-5888/15©2015IEEE IEEE SIGNAL PROCESSING MAGAZINE [55] noVEMBER 2015 voices that are difficult to produce naturally (e.g., castrati). [TABLE 1] RESEARCH PROJECTS USING SINGING VOICE SYNTHESIS TECHNOLOGIES. More examples can be found with pedagogical purposes or as tools to identify perceptually relevant voice properties [3]. Project WEBSITE These applications of the so-called music information CANTOR HTTP://WWW.VIRSYN.DE research field may have a great impact on the way we inter- CANTOR DIGITALIS HTTPS://CANTORDIGITALIS.LIMSI.FR/ act with music [4]. Examples of research projects using sing- CHANTER HTTPS://CHANTER.LIMSI.FR ing voice synthesis technologies are listed in Table 1. -
A Unit Selection Text-To-Speech-And-Singing Synthesis Framework from Neutral Speech: Proof of Concept 39 II.1 Introduction
Adding expressiveness to unit selection speech synthesis and to numerical voice production 90) - 02 - Marc Freixes Guerreiro http://hdl.handle.net/10803/672066 Generalitat 472 (28 de Catalunya núm. Rgtre. Fund. ADVERTIMENT. L'accés als continguts d'aquesta tesi doctoral i la seva utilització ha de respectar els drets de ió la persona autora. Pot ser utilitzada per a consulta o estudi personal, així com en activitats o materials d'investigació i docència en els termes establerts a l'art. 32 del Text Refós de la Llei de Propietat Intel·lectual undac F (RDL 1/1996). Per altres utilitzacions es requereix l'autorització prèvia i expressa de la persona autora. En qualsevol cas, en la utilització dels seus continguts caldrà indicar de forma clara el nom i cognoms de la persona autora i el títol de la tesi doctoral. No s'autoritza la seva reproducció o altres formes d'explotació efectuades amb finalitats de lucre ni la seva comunicació pública des d'un lloc aliè al servei TDX. Tampoc s'autoritza la presentació del seu contingut en una finestra o marc aliè a TDX (framing). Aquesta reserva de drets afecta tant als continguts de la tesi com als seus resums i índexs. Universitat Ramon Llull Universitat Ramon ADVERTENCIA. El acceso a los contenidos de esta tesis doctoral y su utilización debe respetar los derechos de la persona autora. Puede ser utilizada para consulta o estudio personal, así como en actividades o materiales de investigación y docencia en los términos establecidos en el art. 32 del Texto Refundido de la Ley de Propiedad Intelectual (RDL 1/1996). -
Challenges and Perspectives on Real-Time Singing Voice Synthesis S´Intesede Voz Cantada Em Tempo Real: Desafios E Perspectivas
Revista de Informatica´ Teorica´ e Aplicada - RITA - ISSN 2175-2745 Vol. 27, Num. 04 (2020) 118-126 RESEARCH ARTICLE Challenges and Perspectives on Real-time Singing Voice Synthesis S´ıntesede voz cantada em tempo real: desafios e perspectivas Edward David Moreno Ordonez˜ 1, Leonardo Araujo Zoehler Brum1* Abstract: This paper describes the state of art of real-time singing voice synthesis and presents its concept, applications and technical aspects. A technological mapping and a literature review are made in order to indicate the latest developments in this area. We made a brief comparative analysis among the selected works. Finally, we have discussed challenges and future research problems. Keywords: Real-time singing voice synthesis — Sound Synthesis — TTS — MIDI — Computer Music Resumo: Este trabalho trata do estado da arte do campo da s´ıntese de voz cantada em tempo real, apre- sentando seu conceito, aplicac¸oes˜ e aspectos tecnicos.´ Realiza um mapeamento tecnologico´ uma revisao˜ sistematica´ de literatura no intuito de apresentar os ultimos´ desenvolvimentos na area,´ perfazendo uma breve analise´ comparativa entre os trabalhos selecionados. Por fim, discutem-se os desafios e futuros problemas de pesquisa. Palavras-Chave: S´ıntese de Voz Cantada em Tempo Real — S´ıntese Sonora — TTS — MIDI — Computac¸ao˜ Musical 1Universidade Federal de Sergipe, Sao˜ Cristov´ ao,˜ Sergipe, Brasil *Corresponding author: [email protected] DOI: http://dx.doi.org/10.22456/2175-2745.107292 • Received: 30/08/2020 • Accepted: 29/11/2020 CC BY-NC-ND 4.0 - This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. 1. Introduction voice synthesis embedded systems, through the description of its concept, theoretical premises, main used techniques, latest From a computational vision, the aim of singing voice syn- developments and challenges for future research. -
First Report on User Requirements Identification and Analysis
See discussions, stats, and author profiles for this publication at: http://www.researchgate.net/publication/271530318 First Report on User Requirements Identification and Analysis TECHNICAL REPORT · AUGUST 2013 DOI: 10.13140/2.1.3418.6564 DOWNLOADS VIEWS 102 28 5 AUTHORS, INCLUDING: Francesca Pozzi Michela Ott Italian National Research Council Italian National Research Council 97 PUBLICATIONS 205 CITATIONS 193 PUBLICATIONS 291 CITATIONS SEE PROFILE SEE PROFILE Francesca Dagnino Alessandra Antonaci Italian National Research Council Italian National Research Council 33 PUBLICATIONS 24 CITATIONS 15 PUBLICATIONS 7 CITATIONS SEE PROFILE SEE PROFILE Available from: Francesca Pozzi Retrieved on: 06 July 2015 Project Title: i-Treasures: Intangible Treasures – Capturing the Intangible Cultural Heritage and Learning the Rare Know- How of Living Human Treasures Contract No: FP7-ICT-2011-9-600676 Instrument: Large Scale Integrated Project (IP) Thematic Priority: ICT for access to cultural resources Start of project: 1 February 2013 Duration: 48 months D2.1 First Report on User Requirements Identification and Analysis Due date of 1 August 2013 deliverable: Actual submission 6 August 2013 date: Version: 2nd version of D2.1 Main Authors: Francesca Pozzi (ITD-CNR), Marilena Alivizatou (UCL), Michela Ott (ITD-CNR), Francesca Dagnino (ITD-CNR), Alessandra Antonaci (ITD-CNR) D2.1 First Report on User Requirements Identification and Analysis i-Treasures ICT-600676 Project funded by the European Community under the 7th Framework Programme for Research and -
Singing Synthesis Framework from Neutral Speech: Proof of Concept Marc Freixes* , Francesc Alías and Joan Claudi Socoró
Freixes et al. EURASIP Journal on Audio, Speech, and Music Processing (2019) 2019:22 https://doi.org/10.1186/s13636-019-0163-y RESEARCH Open Access A unit selection text-to-speech-and- singing synthesis framework from neutral speech: proof of concept Marc Freixes* , Francesc Alías and Joan Claudi Socoró Abstract Text-to-speech (TTS) synthesis systems have been widely used in general-purpose applications based on the generation of speech. Nonetheless, there are some domains, such as storytelling or voice output aid devices, which may also require singing. To enable a corpus-based TTS system to sing, a supplementary singing database should be recorded. This solution, however, might be too costly for eventual singing needs, or even unfeasible if the original speaker is unavailable or unable to sing properly. This work introduces a unit selection-based text-to-speech-and-singing (US-TTS&S) synthesis framework, which integrates speech-to-singing (STS) conversion to enable the generation of both speech and singing from an input text and a score, respectively, using the same neutral speech corpus. The viability of the proposal is evaluated considering three vocal ranges and two tempos on a proof-of-concept implementation using a 2.6-h Spanish neutral speech corpus. The experiments show that challenging STS transformation factors are required to sing beyond the corpus vocal range and/or with notes longer than 150 ms. While score-driven US configurations allow the reduction of pitch-scale factors, time-scale factors are not reduced due to the short length of the spoken vowels. Moreover, in the MUSHRA test, text-driven and score-driven US configurations obtain similar naturalness rates of around 40 for all the analysed scenarios. -
Production of Greek Singing Voice, the Result Is Clearly Acceptable, but Not Satisfactory Enough to Be 3.1
Proceedings of the Inter. Computer Music Conference (ICMC07), Copenhagen, Denmark, 27-31 August 2007 A SCORE-TO-SINGING VOICE SYNTHESIS SYSTEM FOR THE GREEK LANGUAGE Varvara Kyritsi Anastasia Georgaki Georgios Kouroupetroglou University of Athens, University of Athens, University of Athens, Department of Informatics and Department of Music Studies, Department of Informatics and Telecommunications, Athens, Greece Telecommunications, Athens, Greece [email protected] Athens, Greece [email protected] [email protected] ABSTRACT [15], [11]. Consequently, a SVS system must include control tools for various parameters, as phone duration, In this paper, we examine the possibility of generating vibrato, pitch, volume etc. Greek singing voice with the MBROLA synthesizer, by making use of an already existing diphone database. Our Since 1980, various projects have been carried out all goal is to implement a score-to-singing synthesis system, over the world, through different optical views, languages where the score is written in a score editor and saved in and methodologies, focusing on the mystery of the the midi format. However, MBROLA accepts phonetic synthetic singing voice [10], [11], [13], [14], [17], [18], files as input rather than midi ones and because of this, and [19]. The different optical views extend from the we construct a midi-file to phonetic-file converter, which development of a proper technique for the naturalness of applies irrespective of the underlying language. the sound quality to the design of the rules that determine the adjunction of phonemes or diphones into phrases and The evaluation of the Greek singing voice synthesis their expression. Special studies on the emotion of the system is achieved through a psycho acoustic experiment. -
Analysis on Using Synthesized Singing Techniques in Assistive Interfaces for Visually Impaired to Study Music
GSTF Journal on Computing (JoC) Vol.4 No.4, April 2016 Analysis on Using Synthesized Singing Techniques in Assistive Interfaces for Visually Impaired to Study Music Kavindu Ranasinghe, Lakshman Jayaratne Abstract—Tactile and auditory senses are the basic types of depicted that singing output proposed by that initial research methods that visually impaired people sense the world. Their solution is the best output method compared to existing music interaction with assistive technologies also focuses mainly on Braille output with 81% user votes. tactile and auditory interfaces. This research paper discuss about the validity of using most appropriate singing synthesizing Singing best fits for a format in temporal domain compared techniques as a mediator in assistive technologies specifically to a visual format such as a notation and lyric script in spatial built to address their music learning needs engaged with music domain. But the expected outcome of the research is not just a scores and lyrics. Music scores with notations and lyrics are synthesized singing, but an interface which is capable to considered as the main mediators in musical communication successfully communicate all forms of information that can be channel which lies between a composer and a performer. Visually embedded within a music notation script through auditory impaired music lovers have less opportunity to access this main objects. Some of the recent researches on Auditory Displays mediator since most of them are in visual format. If we consider a which examine how human auditory system can be used as the music score, the vocal performer’s melody is married to all the primary interface channel for communicating and transmitting pleasant sound producible in the form of singing. -
State of Art of Real-Time Singing Voice Synthesis
State of art of real-time singing voice synthesis Leonardo Araujo Zoehler Brum1, Edward David Moreno1 1Programa de Pós-graduação em Ciência da Computação – Universidade Federal de Sergipe Av. Marechal Rondon, s/n, Jardim Rosa Elze – 49100-000 São Cristóvão, SE [email protected], [email protected] Abstract This work is organized as follows: Section 2 describes the theoretical requisites which serve as This paper describes the state of art of real- base for singing voice synthesis in general; Section time singing voice synthesis and presents its 3 presents a technological mapping of the patents concept, applications and technical aspects. A registered for this area; in Section 4 the systematic technological mapping and a literature review are review of literature is shown; Section 5 contains a made in order to indicate the latest developments in comparative analysis among the selected works; this area. We made a brief comparative analysis Section 6 discuss the challenges and future among the selected works. Finally, we have tendencies for this field; finally, Section 7 presents discussed challenges and future research problems. a brief conclusion. Keywords: Real-time singing voice synthesis, 2. Theoretical framework Sound Synthesis, TTS, MIDI, Computer Music. 1. Introduction Singing voice synthesis has two elements as input data: the lyrics of the song which will be synthesized and musical parameters that indicate The aim of singing voice synthesis is to computationally generate a song, given its musical sound qualities. The lyrics can be inserted according notes and lyrics [1]. Hence, it is a branch of text-to- to the orthography of the respective idiom or speech (TTS) technology [2] with the application of through some phonetical notation, like SAMPA, some techniques of musical sound synthesis.