1.4 a Pure Data Spectro-Morphological Analysis Toolkit for Sound-Based Composition
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
The Timbre Toolbox: Extracting Audio Descriptors from Musical Signals
The Timbre Toolbox: Extracting audio descriptors from musical signals Geoffroy Peetersa) Institut de Recherche et Coordination Acoustique/Musique (STMS-IRCAM-CNRS), 1 place Igor-Stravinsky, F-75004 Paris, France Bruno L. Giordano Centre for Interdisciplinary Research on Music Media and Technology (CIRMMT), Schulich School of Music, McGill University, 555 Sherbrooke Street West, Montre´al, Quebe´c H3A 1E3, Canada Patrick Susini and Nicolas Misdariis Institut de Recherche et Coordination Acoustique/Musique (STMS-IRCAM-CNRS), 1 place Igor-Stravinsky, F-75004 Paris, France Stephen McAdams Centre for Interdisciplinary Research on Music Media and Technology (CIRMMT), Schulich School of Music, McGill University, 555 Sherbrooke Street West, Montre´al, Quebe´c H3A 1E3, Canada (Received 24 November 2010; revised 9 March 2011; accepted 12 March 2011) The analysis of musical signals to extract audio descriptors that can potentially characterize their timbre has been disparate and often too focused on a particular small set of sounds. The Timbre Toolbox provides a comprehensive set of descriptors that can be useful in perceptual research, as well as in music information retrieval and machine-learning approaches to content-based retrieval in large sound databases. Sound events are first analyzed in terms of various input representations (short-term Fourier transform, harmonic sinusoidal components, an auditory model based on the equivalent rectangular bandwidth concept, the energy envelope). A large number of audio descrip- tors are then derived from each of these representations to capture temporal, spectral, spectrotempo- ral, and energetic properties of the sound events. Some descriptors are global, providing a single value for the whole sound event, whereas others are time-varying. -
A Psychoacoustic Model with Partial Spectral Flatness Measure for Tonality Estimation
A PSYCHOACOUSTIC MODEL WITH PARTIAL SPECTRAL FLATNESS MEASURE FOR TONALITY ESTIMATION Armin Taghipour1, Maneesh Chandra Jaikumar2, and Bernd Edler1 1 International Audio Laboratories Erlangen, Am Wolfsmantel 33, 91058 Erlangen, Germany 2 Fraunhofer Institute for Integrated Circuits IIS, Am Wolfsmantel 33, 91058 Erlangen, Germany [email protected] ABSTRACT quantization Psychoacoustic studies show that the strength of masking is, quantization spectral among others, dependent on the tonality of the masker: the decompo¡ coding PCM sition coded effect of noise maskers is stronger than that of tone maskers. signal audio Recently, a Partial Spectral Flatness Measure (PSFM) was in- signal quantization troduced for tonality estimation in a psychoacoustic model for perceptual audio coding. The model consists of an Infi- nite Impulse Response (IIR) filterbank which considers the perceptual model spreading effect of individual local maskers in simultaneous masking. An optimized (with respect to audio quality and Fig. 1. Basic structure of a transform based audio encoder. computational efficiency) PSFM is now compared to a sim- A transform decomposes frames of the input audio signal in ilar psychoacoustic model with prediction based tonality es- their spectral components. The psychoacoustic model calcu- timation in medium (48 kbit/s) and low (32 kbit/s) bit rate lates estimated masking thresholds for the frames with which conditions (mono) via subjective quality tests. 15 expert lis- it controls quantization steps for the individual subbands. teners participated in the subjective tests. The results are de- picted and discussed. Additionally, we conducted the subjec- to the best possible extent. However, at medium and low bit tive tests with 15 non-expert consumers whose results are also rates, the estimated masking threshold has to be violated. -
University of California, San Diego
UNIVERSITY OF CALIFORNIA, SAN DIEGO Physical and Perceptual Aspects of Percussive Timbre A dissertation submitted in partial satisfaction of the requirements for the degree Doctor of Philosophy in Music by William Brent Committee in charge: Professor Miller Puckette, Chair Professor David Borgo Professor Diana Deutsch Professor Shlomo Dubnov Professor Shahrokh Yadegari 2010 Copyright William Brent, 2010 All rights reserved. The dissertation of William Brent is approved, and it is acceptable in quality and form for publication on micro- film and electronically: Chair University of California, San Diego 2010 iii DEDICATION For my father. iv EPIGRAPH One of the most striking paradoxes concerning timbre is that when we knew less about it, it didn't pose much of a problem. |Philippe Manoury v TABLE OF CONTENTS Signature Page . iii Dedication . iv Epigraph . .v Table of Contents . vi List of Figures . ix List of Tables . xii Acknowledgements . xiii Vita and Publications . xv Abstract of the Dissertation . xvi Chapter 1 Introduction . .1 Chapter 2 Historical Overview of Timbre Studies . .7 2.1 Experimental Design . .7 2.2 Verbal Attribute Studies . 10 2.2.1 Von Bismarck . 10 2.2.2 Kendall & Carterette . 13 2.2.3 Freed . 15 2.3 Multidimensional Scaling . 17 2.3.1 Grey . 20 2.3.2 Iversen & Krumhansl . 24 2.3.3 McAdams . 27 2.3.4 Lakatos . 32 2.4 Summary . 34 Chapter 3 Objective Analysis . 37 3.1 Low level features . 38 3.1.1 Spectral Centroid . 38 3.1.2 Spectral Spread . 39 3.1.3 Spectral Skewness . 39 3.1.4 Spectral Kurtosis . 40 3.1.5 Spectral Brightness . -
Bigear: Inferring the Ambient and Emotional Correlates from Smartphone-Based Acoustic Big Data
BigEAR: Inferring the Ambient and Emotional Correlates from Smartphone-based Acoustic Big Data Harishchandra Dubey∗, Matthias R. Mehly, Kunal Mankodiya∗ ∗Department of Electrical, Computer and Biomedical Engineering, University of Rhode Island, Kingston, RI, USA yDepartment of Psychology, University of Arizona, Tucson, AZ, USA [email protected], [email protected], [email protected] Abstract—This paper presents a novel BigEAR big data frame- in naturalistic settings. The psychologists had an access to the work that employs psychological audio processing chain (PAPC) audio data and also the estimated mood labels that can help in to process smartphone-based acoustic big data collected when the making clinical decisions about the psychological health of the user performs social conversations in naturalistic scenarios. The overarching goal of BigEAR is to identify moods of the wearer participants. Recently, our team developed a novel framework from various activities such as laughing, singing, crying, arguing, named as BigEAR that employs a psychological audio pro- and sighing. These annotations are based on ground truth cessing chain (PAPC) to predict the mood of the speaker from relevant for psychologists who intend to monitor/infer the social DALs. The BigEAR application installed on the iPhone/iPad context of individuals coping with breast cancer. We pursued captures the audio data from ambient environment at random a case study on couples coping with breast cancer to know how the conversations affect emotional and social well being. instance throughout the day. The snippets are of small sizes In the state-of-the-art methods, psychologists and their team so that we do not record significant duration of a personal have to hear the audio recordings for making these inferences context.