Understanding Audio Production Practices of People with Vision Impairments


Abir Saha
Northwestern University
Evanston, IL, USA
[email protected]

Anne Marie Piper
University of California, Irvine
Irvine, CA, USA
[email protected]

ABSTRACT

The advent of digital audio workstations and other digital audio tools has brought a critical shift in the audio industry by empowering amateur and professional audio content creators with the necessary means to produce high quality audio content. Yet, we know little about the accessibility of widely used audio production tools for people with vision impairments. Through interviews with 18 audio professionals and hobbyists with vision impairments, we find that accessible audio production involves: piecing together accessible and efficient workflows through a combination of mainstream and custom tools; achieving professional competency through a steep learning curve in which domain knowledge and accessibility are inseparable; and facilitating learning and creating access by engaging in online communities of visually impaired audio enthusiasts. We discuss the deep entanglement between accessibility and professional competency and conclude with design considerations to inform future development of accessible audio production tools.

CCS CONCEPTS

• Human-centered computing → Empirical studies in accessibility.

KEYWORDS

Accessibility, audio production, blind, vision impairment

ACM Reference Format:
Abir Saha and Anne Marie Piper. 2020. Understanding Audio Production Practices of People with Vision Impairments. In The 22nd International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS ’20), October 26–28, 2020, Virtual Event, Greece. ACM, New York, NY, USA, 13 pages. https://doi.org/10.1145/3373625.3416993

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
ASSETS ’20, October 26–28, 2020, Virtual Event, Greece
© 2020 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-1-4503-7103-2/20/10...$15.00
https://doi.org/10.1145/3373625.3416993

1 INTRODUCTION

From contemporary pop music to award-winning musical masterpieces to educational podcasts, professionally produced audio begins as a few raw, untouched audio tracks that undergo hours of intricate polishing stages, such as editing, mixing, and mastering. This complex and detailed set of workflows, commonly known as audio production, is a skilled practice and a cornerstone of all types of audio content creation, including music, podcasts, audio drama, radio shows, sound art and so on. In modern times, audio content creation has increasingly become computer-supported – digital instruments are used to replicate sounds of physical instruments (e.g., guitars, drums, etc.) with high-fidelity. Likewise, editing, mixing, and mastering tasks are also mediated through the use of digital audio workstations (DAWs) and effects plugins (e.g., compression, equalization, and reverb). This computer-aided work practice is supported by a number of commercially developed DAWs, such as Pro Tools¹, Logic Pro² and REAPER³. In addition to these commercial efforts, academic researchers have also invested significant attention towards developing new digital tools to support audio production tasks (e.g., automated editing and mixing) [29, 57, 61].

¹ https://www.avid.com/pro-tools
² https://www.apple.com/logic-pro
³ https://www.reaper.fm

Despite a growing interest in computer-supported audio content creation within industry and academia (e.g., dedicated communities such as NIME, AES and ACM IMX), one area that has not received much attention is how people with vision impairments perform audio production tasks using computer-based tools. While emerging literature within HCI and accessibility has studied other forms of computer-supported creative work (e.g., photography [3, 40], drawing [9, 56], making and fabrication [5, 19], 3D printing [24, 64] and writing [20, 51]), accessibility in audio production remains relatively under-explored. The limited prior work that does exist has revealed that people with vision impairments face accessibility challenges in using digital audio production tools due to an often-exclusive use of visualizations (e.g., waveform and graphic equalizer) to represent audio information [50, 68]. Consequently, researchers have focused on creating accessible representations of this information by developing novel multimodal interfaces [26, 36, 50, 53, 68]. However, less is known about how people with vision impairments use mainstream audio production tools in their personal and professional practices. Understanding how people with vision impairments navigate existing tools and the associated challenges to achieve their audio production goals is essential in designing sustainable solutions and establishing a holistic view of how computer-based tools can better support both hobbyists’ and professionals’ work.

To help bridge this gap in the literature, we report findings from semi-structured interviews with 18 visually impaired professionals and hobbyists who produce audio content using various software and hardware tools. Our analysis reveals three main aspects of the audio production practices of people with vision impairments: (1) piecing together accessible and efficient workflows through a combination of mainstream and custom tools; (2) achieving professional competency through a steep learning curve in which domain knowledge and accessibility are deeply entangled; and (3) facilitating learning and creating access by engaging in online communities of visually impaired audio enthusiasts.

The present paper makes three primary contributions. First, our work presents an empirical understanding of the audio production practices of people with vision impairments. While prior work focuses on introducing accessible visualizations for digital audio interfaces [36, 50, 53], it leaves open questions around how blind people uniquely experience and navigate mainstream audio production tools to support their work. Second, our analysis provides evidence of the intertwined nature of professional competency and accessibility, in which developing domain specific skills and capabilities is inseparable from the work of mastering a complex set of largely inaccessible tools. Creating a more inclusive audio production industry, a profession our informants feel should be an ideal career path for people with vision impairments, requires understanding that accessibility currently constrains but is critical for professional success. Finally, we provide considerations for the future design of accessible audio production tools and resources to better support this community of professionals and hobbyists.

2 RELATED WORK

In grounding the present paper, below we review prior research on accessibility in audio and other forms of creative content production as well as literature on audio production tools and practices.

2.1 Accessibility in Creative Content Production

Our work is situated within a growing body of literature that focuses on understanding and designing new systems for improving

2.2 Accessibility in Audio Production

In line with the growing interest in accessibility in content production, researchers within HCI and assistive technology have recently started exploring accessibility in audio production for people with vision impairments. As an example, researchers [49, 50, 68] organized participatory design workshops with visually impaired people to design multimodal interfaces for three different DAW features: recognizing automation line anchor points (through sonification and pitch modification), peak meter (through sonification) and audio amplitude curve (through haptic feedback). Haenselmann et al. [26] designed a multitrack MIDI (Musical Instrument Digital Interface) sequencer that allows visually impaired users to execute all MIDI sequencer functions using the keys on an electronic musical keyboard, thereby eliminating the needs of interacting with computer keyboard and display. Others have designed audio editing and music production interfaces that visually impaired users can control using voice [14], game controllers [34], and tangible [36] and tabletop objects [53]. Much of this emerging research focused on designing novel tools to improve accessibility of specific audio production tasks leaving open questions around current practices of screen reader users. Our work contributes to this literature by developing an understanding of how blind professionals and hobbyists make use of mainstream audio production tools and thus informing future accessible design in this space.

2.3 Audio Production Tools and Practices

While research on accessibility in audio production is still at an early stage, there is a large body of work in digital audio and musical expression literature that investigates how sighted people interact with digital audio production tools and how the design of these tools shape