Convention Program
Total Page:16
File Type:pdf, Size:1020Kb
AES 145TH CONVENTION OCT PROGRAM OCTOBER 17–20, 2018 JAVITS CONVENTION CENTER, NY, USA The Winner of the 145th AES Convention * * * * * Best Peer-Reviewed Paper Award is: Session P1 Wednesday, Oct. 17 The Effect of Pinnae Cues on Lead-Signal Localization 9:00 am – 12:00 noon Room 1E11 in Elevated, Lowered, and Diagonal Loudspeaker Config- urations—Wesley Bulla, Belmont University, Nashville, PERCEPTION TN, USA; Paul Mayo, University of Maryland, College Park, MD, USA Chair: Elizabeth McMullin, Samsung Research America, Valencia, CA, USA Convention Paper 10066 9:00 am To be presented on Thursday, Oct. 18, in Session 7—Perception—Part 2 P1-1 Improved Psychoacoustic Model for Efficient Perceptual Audio Codecs—Sascha Disch,1,2 Steven van de Par,3 * * * * * Andreas Niedermeier,1 Elena Burdiel Pérez,1 Ane Berasategui Ceberio,1 Bernd Edler2 The AES has launched an opportunity to recognize student 1 Fraunhofer Institute for Integrated Circuits IIS, members who author technical papers. The Student Paper Award Erlangen, Erlangen, Germany Competition is based on the preprint manuscripts accepted for the 2 Friedrich Alexander University, International Audio AES convention. Laboratories Erlangen, Erlangen, Germany A number of student-authored papers were nominated. The 3 University of Oldenburg, Oldenburg, Germany excellent quality of the submissions has made the selection process both challenging and exhilarating. Since early perceptual audio coders such as mp3, the The award-winning student paper will be honored during the underlying psychoacoustic model that controls the encod- Convention, and the student-authored manuscript will be consid- ing process has not undergone many dramatic changes. ered for publication in a timely manner for the Journal of the Audio Meanwhile, modern audio coders have been equipped with Engineering Society. semi-parametric or parametric coding tools such as audio Nominees for the Student Paper Award were required to meet bandwidth extension. Thereby, the initial psychoacoustic the following qualifications: model used in a perceptual coder, just considering added (a) The paper was accepted for presentation at the AES 145th quantization noise, became partly unsuitable. We propose Convention. the use of an improved psychoacoustic excitation model (b) The first author was a student when the work was conducted based on an existing model proposed by Dau et al. in 1997. and the manuscript prepared. This modulation-based model is essentially indepen- (c) The student author’s affiliation listed in the manuscript is an dent from the input waveform by calculating an internal accredited educational institution. auditory representation. Using the example of MPEG-H 3D (d) The student will deliver the lecture or poster presentation at Audio and its semi-parametric Intelligent Gap Filling the Convention. (IGF) tool, we demonstrate that we can successfully con- * * * * * trol the IGF parameter selection process to achieve overall improved perceptual quality. The Winner of the 145th AES Convention Convention Paper 10029 Student Paper Award is: Musical Instrument Synthesis and Morphing 9:30 am in Multidimensional Latent Space Using P1-2 On the Influence of Cultural Differences on the Variational, Convolutional Recurrent Perception of Audio Coding Artifacts in Music— Autoencoders—Emre Çakir, Tuomas Virtanen, Sascha Dick,1 Jiandong Zhang,2 Yili Qin,2 Nadja Tampere University of Technology, Tampere, Finland Schinkel-Bielefeld,1 Anna Katharina Leschanowsky,1 Convention Paper 10035 Frederik Nagel1 To be presented on Wednesday, October 17, 1 International Audio Laboratories Erlangen, a joint in Session 2—Signal Processing—Part 1 institution of Universität Erlangen-Nürnberg and 1 Audio Engineering Society 145th Convention Program, 2018 Fall Fraunhofer IIS, Erlangen, Germany 11:00 am 2 Academy of Broadcasting Planning, SAPPRFT, Beijing, China P1-5 Developing a Method for the Subjective Evaluation of Smartphone Music Playback—Elisabeth McMullin, Modern audio codecs are used all over the world, reach- Victoria Suha, Yuan Li, Will Saba, Pascal Brunet, ing listeners with many different cultures and languages. Samsung Research America, Valencia, CA USA This study investigates if and how cultural background influences the perception and preference of different To determine the preferred audio characteristics for me- audio coding artifacts, focusing on musical content. A dia playback over smartphones, a series of controlled dou- subjective listening test was designed to directly compare ble-blind listening experiments were run to evaluate the different types of audio coding and was performed with subjective playback quality of six high-end smartphones. Mandarin Chinese and German speaking listeners. Over- Listeners rated products based on their audio quality pref- all comparison showed largely consistent results, affirm- erence and left comments categorized by attribute. The de- ing the validity of the proposed test method. Differential vices were tested in different orientations in level-matched comparison indicates preferences for certain artifacts in and maximum-volume scenarios. Positional variation and different listener groups, e.g., Chinese listeners tended biases were accounted for using a motorized turntable and to grade tonality mismatch higher and pre-echoes worse audio playback was controlled remotely with remote-ac- compared to German listeners, and musicians preferred cess software. Test results were compared to spatially-av- bandwidth limitation over tonality mismatch when com- eraged measurements made using a multitone stimulus pared to non-musicians. and demonstrate that the smoothness of the frequency Convention Paper 10030 response is the most important aspect in smartphone pref- erence. Low frequency extension, decreased levels of non- linear distortion, and higher maximum playback level did 10:00 am not correlate with higher phone ratings. Convention Paper 10033 P1-3 Perception of Phase Changes in the Context of Musical Audio Source Separation—Chungeun Kim, Emad M. 11:30 am Grais, Russell Mason, Mark D. Plumbley, University of Surrey, Guildford, Surrey, UK P1-6 Investigation into the Effects of Subjective Test Interface Choice on the Validity of Results—Nicholas This study investigates into the perceptual consequence Jillings,1 Brecht De Man,1 Ryan Stables,1 Joshua D. of phase change in conventional magnitude-based source Reiss2 separation. A listening test was conducted, where the par- 1 Birmingham City University, Birmingham, UK ticipants compared three different source separation sce- 2 Queen Mary University of London, London, UK narios, each with two phase retrieval cases: phase from the original mix or from the target source. The participants’ Subjective experiments are a cornerstone of modern responses regarding their similarity to the reference research with a variety of tasks being undertaken by showed that (1) the difference between the mix phase and subjects. In the field of audio, subjective listening tests the perfect target phase was perceivable in the majority provide validation for research and aid fair comparison of cases with some song-dependent exceptions, and (2) between techniques or devices such as coding perfor- use of the mix phase degraded the perceived quality even mance, speakers, mixes, and source separation systems. in the case of perfect magnitude separation. The findings Several interfaces have been designed to mitigate biases imply that there is room for perceptual improvement by and to standardize procedures, enabling indirect compar- attempting correct phase reconstruction in addition to isons. The number of different combinations of interface achieving better magnitude-based separation. and test design make it extremely difficult to conduct a Convention Paper 10031 truly unbiased listening test. This paper resolves the larg- est of these variables by identifying the impact the inter- face itself has on a purely auditory test. This information 10:30 am is used to make recommendations for specific categories P1-4 Method for Quantitative Evaluation of Auditory of listening tests. Perception of Nonlinear Distortion: Part II—Metric Convention Paper 10034 for Music Signal Tonality and its Impact on Subjective [Paper presented by Brecht De Man] Perception of Distortions—Mikhail Pakhomov, Victor Rozhnov, SPb Audio R&D Lab, St. Petersburg, Russia Session P2 Wednesday, Oct. 17 9:00 am – 11:00 am Room 1E12 In the first part of the paper we have noticed that the impact of audible nonlinear distortions on subjective lis- SIGNAL PROCESSING—PART 1 tener preference is strongly dependent on the spectral structure of a test signal. In the second part we propose a method for considering the spectral characteristics of Chair: Emre Çakir, Tampere University of Technology, a test signal in the evaluation of the subjective percep- Tampere, Finalnd tion of audible nonlinear distortions. To describe the tonal structure of a music signal, a qualitative characteristic, 9:00 am tonality, is taken as a metric, and tonality coefficient is P2-1 Musical Instrument Synthesis and Morphing in proposed as a measure of this characteristic. Subjective Multidimensional Latent Space Using Variational, listening tests were performed to estimate how the audi- Convolutional Recurrent Autoencoders—Emre Çakir, tory perception of nonlinear distortions depends on the Tuomas Virtanen, Tampere University of Technology, tonal structure of a signal and the spectral distribution of Tampere, Finland the noise-to-mask ratio (NMR) Convention