Abstract Page
IEEE Xplore - Abstract Page

Forensic voice comparison with secular shibboleths - A hybrid fused GMM-multivariate likelihood ratio-based approach using alveolo-palatal fricative cepstral spectra

This paper appears in: Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Date of Conference: 22-27 May 2011
Author(s): Rose, P.
On Page(s): 5900 - 5903
Product Type: Conference Publications

ABSTRACT
The suitability of voiceless fricative spectra for forensic voice comparison is explored within a Likelihood Ratio-based framework. Non-contemporaneous landline telephone recordings of 99 male Japanese speakers are compared using only tokens of their voiceless alveolo-palatal fricative [ç]. A subset of mean-cepstrally-subtracted LPC CCs from the fricative spectrum from dc to 5 kHz is used. GMM/UBM and multivariate likelihood ratios are extracted for the 99 target and 4851 non-target trials, and fused with logistic regression. An EER of 7.4% and log-LR cost of 0.26 are demonstrated. It is concluded that the [ç] spectrum does have some individualising potential.
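The trial counts and the log-LR cost quoted in the abstract can be illustrated with a short sketch. It is not taken from the paper: the function names are hypothetical, the pairing scheme (one same-speaker trial per speaker, one different-speaker trial per unordered pair of speakers) is an assumption that happens to reproduce the 99 and 4851 figures, and the cost function is the usual textbook definition of log-LR cost applied to natural-log likelihood ratios.

```python
import math


def trial_counts(n_speakers):
    # Assumption: one same-speaker (target) trial per speaker and one
    # different-speaker (non-target) trial per unordered speaker pair.
    return n_speakers, math.comb(n_speakers, 2)


def log2_one_plus_exp(x):
    # Numerically stable evaluation of log2(1 + e**x).
    return (max(x, 0.0) + math.log1p(math.exp(-abs(x)))) / math.log(2.0)


def cllr(target_llrs, non_target_llrs):
    # Log-LR cost: the average penalty for target trials with low LRs
    # plus the average penalty for non-target trials with high LRs.
    # Inputs are natural-log likelihood ratios.
    target_cost = sum(log2_one_plus_exp(-x) for x in target_llrs) / len(target_llrs)
    non_target_cost = sum(log2_one_plus_exp(x) for x in non_target_llrs) / len(non_target_llrs)
    return 0.5 * (target_cost + non_target_cost)


print(trial_counts(99))  # (99, 4851)
```

Under this definition a system that always reports a likelihood ratio of 1 (log-LR of 0) scores a cost of 1, so values well below 1, such as the 0.26 reported above, indicate that the likelihood ratios carry useful information.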
INDEX TERMS
IEEE Terms: Cavity resonators, Cepstral analysis, Forensics, Speaker recognition, Speech, Tongue
INSPEC Controlled Indexing: maximum likelihood estimation, speaker recognition
Non Controlled Indexing: LPC CC, UBM, alveolo-palatal fricative cepstral spectra, forensic voice comparison, fricative spectrum, fused GMM-multivariate likelihood ratio-based approach, noncontemporaneous landline telephone recordings, secular shibboleths
Author Keywords: Forensic Voice Comparison, GMM/UBM, Multivariate Likelihood Ratio, cepstrum, coronal fricative spectra

Additional Details
References (17)
On page(s): 5900
Conference Location: Prague
ISSN: 1520-6149
E-ISBN: 978-1-4577-0537-3
Print ISBN: 978-1-4577-0538-0
INSPEC Accession Number: 12176492
Digital Object Identifier: 10.1109/ICASSP.2011.5947704
Date of Current Version: 11 July 2011
Issue Date: 22-27 May 2011
http://ieeexplore.ieee.org/xpl/articleDetails.jsp?reload=true&arnumber=5947704&contentType=Conference+Publications[8/05/2012 2:24:27 PM]
2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2011)
Prague, Czech Republic
22 – 27 May 2011
Pages 1-844
IEEE Catalog Number: CFP11ICA-PRT
ISBN: 978-1-4577-0538-0
1/7

Copyright © 2011 by the Institute of Electrical and Electronics Engineers, Inc
All Rights Reserved

Copyright and Reprint Permissions: Abstracting is permitted with credit to the source. Libraries are permitted to photocopy beyond the limit of U.S. copyright law for private use of patrons those articles in this volume that carry a code at the bottom of the first page, provided the per-copy fee indicated in the code is paid through Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923. For other copying, reprint or republication permission, write to IEEE Copyrights Manager, IEEE Service Center, 445 Hoes Lane, Piscataway, NJ 08854. All rights reserved.

***This publication is a representation of what appears in the IEEE Digital Libraries. Some format issues inherent in the e-media version may also appear in this print version.

ISSN: 1520-6149

Additional Copies of This Publication Are Available From:
Curran Associates, Inc
57 Morehouse Lane
Red Hook, NY 12571 USA
Phone: (845) 758-0400
Fax: (845) 758-2633
E-mail: [email protected]
Web: www.proceedings.com

TABLE OF CONTENTS

AASP-L1: ACOUSTIC SOURCE SEPARATION I

AASP-L1.1: COMBINING HMM-BASED MELODY EXTRACTION AND NMF-BASED SOFT MASKING FOR SEPARATING VOICE AND ACCOMPANIMENT FROM MONAURAL AUDIO ..... 1
Yun Wang, Zhijian Ou, Tsinghua University, China

AASP-L1.2: ADAPTATION OF SOURCE-SPECIFIC DICTIONARIES IN NON-NEGATIVE MATRIX FACTORIZATION FOR SOURCE SEPARATION ..... 5
Xabier Jaureguiberry, Pierre Leveau, Simon Maller, Juan José Burred, Audionamix, France

AASP-L1.3: AN ACOUSTICALLY-MOTIVATED SPATIAL PRIOR FOR UNDER-DETERMINED REVERBERANT SOURCE SEPARATION ..... 9
Ngoc Q. K. Duong, Emmanuel Vincent, Rémi Gribonval, INRIA / Centre de Rennes - Bretagne Atlantique, France

AASP-L1.4: RESOLVING FD-BSS PERMUTATION FOR ARBITRARY ARRAY IN PRESENCE OF SPATIAL ALIASING ..... 13
Jani Even, Norihiro Hagita, ATR, Intelligent Robotics and Communication Laboratories, Japan

AASP-L1.5: A NON-NEGATIVE APPROACH TO SEMI-SUPERVISED SEPARATION OF SPEECH FROM NOISE WITH THE USE OF TEMPORAL DYNAMICS ..... 17
Gautham J. Mysore, Adobe Systems Inc., United States; Paris Smaragdis, University of Illinois Urbana-Champaign, United States

AASP-L1.6: ITAKURA-SAITO NONNEGATIVE MATRIX FACTORIZATION WITH GROUP SPARSITY ..... 21
Augustin Lefevre, Francis Bach, Ecole Normale Superieure, France; Cédric Févotte, CNRS LTCI / Télécom ParisTech, France

AASP-L2: MUSIC SIGNAL PROCESSING I

AASP-L2.1: MULTIPITCH ESTIMATION BY JOINT MODELING OF HARMONIC AND TRANSIENT SOUNDS ..... 25
Jun Wu, The University of Tokyo, Japan; Emmanuel Vincent, INRIA, France; Stanislaw Raczynski, Takuya Nishimoto, Nobutaka Ono, Shigeki Sagayama, The University of Tokyo, Japan

AASP-L2.2: FREQUENCY SELECTIVE PITCH TRANSPOSITION OF AUDIO SIGNALS ..... 29
Sascha Disch, Fraunhofer Institute for Integrated Circuits (IIS), Germany; Bernd Edler, International Audio Laboratories Erlangen, Germany

AASP-L2.3: IMPROVING MELODY EXTRACTION USING PROBABILISTIC LATENT COMPONENT ANALYSIS ..... 33
Jinyu Han, Northwestern University, United States; Ching-Wei Chen, Gracenote, United States

AASP-L2.4: POLYPHONIC MUSIC TRANSCRIPTION USING NOTE ONSET AND OFFSET DETECTION ..... 37
Emmanouil Benetos, Simon Dixon, Queen Mary University of London, United Kingdom

AASP-L2.5: AUTOMATIC MUSICAL THUMBNAILING BASED ON AUDIO OBJECT LOCALIZATION AND ITS EVALUATION ..... 41
Hiroyuki Nawata, Noriyoshi Kamado, Hiroshi Saruwatari, Kiyohiro Shikano, Nara Institute of Science and Technology, Japan

AASP-L2.6: SCORE INFORMED AUDIO SOURCE SEPARATION USING A PARAMETRIC MODEL OF NON-NEGATIVE SPECTROGRAM ..... 45
Romain Hennequin, Bertrand David, Roland Badeau, Institut TELECOM / TELECOM ParisTech, France

AASP-L3: SPATIAL AND MULTICHANNEL SIGNAL PROCESSING

AASP-L3.1: EFFICIENT RANGE EXTRAPOLATION OF HEAD-RELATED IMPULSE RESPONSES BY WAVE FIELD SYNTHESIS TECHNIQUES ..... 49
Sascha Spors, Jens Ahrens, Deutsche Telekom Laboratories, Germany

AASP-L3.2: EFFICIENCY EVALUATION AND ORTHOGONAL BASIS DETERMINATION IN FUNCTIONAL HRTF MODELING ..... 53
Mengqiu Zhang, Rodney A. Kennedy, Thushara D. Abhayapala, Australian National University, Australia

AASP-L3.3: SPATIAL SOUND REPRODUCTION SYSTEMS USING HIGHER ORDER LOUDSPEAKERS ..... 57
Mark Poletti, Industrial Research Ltd, New Zealand; Thushara D. Abhayapala, Australian National University, Australia

AASP-L3.4: CONVERTING 5.1 AUDIO RECORDINGS TO B-FORMAT FOR DIRECTIONAL AUDIO CODING REPRODUCTION ..... 61
Mikko-Ville Laitinen, Ville Pulkki, Aalto University, Finland

AASP-L3.5: AN ANALYTICAL APPROACH TO LOCAL SOUND FIELD SYNTHESIS USING LINEAR ARRAYS OF LOUDSPEAKERS ..... 65
Jens Ahrens, Sascha Spors, Deutsche Telekom Laboratories, Germany

AASP-L3.6: A METHODOLOGY FOR EVALUATING THE ACCURACY OF WAVE FIELD RENDERING TECHNIQUES ..... 69
Antonio Canclini, Politecnico di Milano, Italy; Paolo Annibale, University Erlangen-Nuremberg,