Cross-Correlation of Beat-Synchronous Representations
Cross-Correlation of Beat-Synchronous Representations for Music Similarity
Dan Ellis, Courtenay Cotton, and Michael Mandel
Laboratory for Recognition and Organization of Speech and Audio
Dept. Electrical Eng., Columbia Univ., NY USA
{dpwe,cvcotton,mim}@ee.columbia.edu
http://labrosa.ee.columbia.edu/

1. Music Similarity
2. Beat-Synchronous Representations
3. Cross-Correlation Similarity
4. Subject Tests

Correlation Music Similarity - Ellis, Cotton, Mandel 2008-04-03 - 1 /15

1. Music Similarity
• Goal: Computer predicts listeners’ judgments of music similarity
  e.g. for playlists, new music discovery
• Conventional approach: statistical models of broad spectrum (MFCCs)
• Evaluation?
  MIREX: 2004 onwards
  proxy tasks: genre classification, artist ID ...
  direct evaluation: subjects rate systems’ hits

Which is more similar?
• “Waiting in Vain” by Bob Marley & the Wailers
[Figure: spectrograms (freq / kHz vs. time / sec) of "Waiting in Vain" - Bob Marley, "Jamming" - Bob Marley, and "Waiting in Vain" - Annie Lennox]
• Different kinds of similarity

2. Chroma Features
• Chroma features map spectral energy into one canonical octave
  i.e. 12 semitone bins
[Figure: piano chromatic scale, shown as a spectrogram (freq / kHz vs. time / sec) and as IF chroma (chroma bin vs. time / frames)]
• Can resynthesize as “Shepard Tones”: all octaves at once
[Figure: Shepard tone resynth, spectrogram (freq / kHz vs. time / sec)]

Beat-Synchronous Chroma Features
• Beat + chroma features / 30ms frames → average chroma within each beat
  compact; sufficient?
[Figure: beat-synchronous chroma features]
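
The two steps on the chroma slides (folding spectral energy into 12 semitone bins, then averaging chroma within each beat) can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: the slides use instantaneous-frequency (IF) chroma, whereas this sketch simply assigns each spectral bin to its nearest equal-tempered pitch class. The function names (`pitch_class`, `chroma_from_spectrum`, `beat_sync`), the A4 = 440 Hz reference, and the assumption that beat boundaries are already known as frame indices are all hypothetical choices made for the example.

```python
import math


def pitch_class(freq_hz, a4_hz=440.0):
    """Map a frequency to a pitch class 0..11 (0 = C), assuming equal temperament."""
    midi = 69 + 12 * math.log2(freq_hz / a4_hz)  # MIDI note number, A4 = 69
    return int(round(midi)) % 12


def chroma_from_spectrum(magnitudes, bin_freqs_hz):
    """Fold one frame's spectral magnitudes into 12 semitone bins (all octaves summed)."""
    chroma = [0.0] * 12
    for mag, freq in zip(magnitudes, bin_freqs_hz):
        if freq > 0:  # skip the DC bin, which has no pitch
            chroma[pitch_class(freq)] += mag
    return chroma


def beat_sync(frames, beat_frames):
    """Average frame-level chroma vectors between consecutive beat boundaries.

    frames: list of 12-element chroma vectors (e.g. one per ~30 ms frame).
    beat_frames: sorted frame indices of beat boundaries (assumed given).
    Returns one averaged 12-element vector per inter-beat segment.
    """
    averaged = []
    for start, end in zip(beat_frames[:-1], beat_frames[1:]):
        segment = frames[start:end]
        if not segment:  # two beats fell on the same frame
            continue
        averaged.append([sum(column) / len(segment) for column in zip(*segment)])
    return averaged
```

With four frames and beat boundaries at frames 0, 2 and 4, `beat_sync` returns two vectors, each the mean of two frames. This is what makes the representation compact: its length depends on the number of beats rather than on the frame rate.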
3.