Handout Lecture / Tutorial on Music Information Retrieval
Lead-in

Music Information Retrieval
Andreas Rauber
Department of Software Technology and Interactive Systems
Vienna University of Technology
http://www.ifs.tuwien.ac.at/~andi
http://www.ifs.tuwien.ac.at/mir

Who am I?
Vienna University of Technology http://www.tuwien.ac.at
• Faculty of Computer Science http://www.cs.tuwien.ac.at
  – Department of Software Technology and Interactive Systems http://www.isis.tuwien.ac.at
    » Software and Information Engineering Group http://www.ifs.tuwien.ac.at
      - Andreas Rauber http://www.ifs.tuwien.ac.at/~andi
        Machine Learning, Neural Networks
        Text Mining, Digital Libraries
        Music Retrieval
        Digital Preservation

Activities
– Audio Feature Extraction
– Music Classification
– PlaySOM: Organisation of Music Archives
– PocketSOM: Browsing Music on Mobile Devices
– 3D Worlds for Music
– Audio Segmentation
– Chord Detection
– Blind Source Separation
– Text and Music (Lyrics, Bio, ...)

Who else is MIR@ifs?
– Thomas Lidy
– Robert Neumayer
– Rudolf Mayer
– Jakob Frank
Other members: Veronika Zenz, Peter Hlavac, Ewald Peiszer, Andreas Scharf, Andrei Grecu & others
Former members: Markus Frühwirth, Elias Pampalk, Stefan Leitich, David Laister, Doris Baum & others

Chorus
– Lead-in
– Chorus
– Verse 1: Music-IR
– Verse 2: Audio Features
– Verse 3: Classification and Benchmarking
– Verse 4: Clustering & Browsing
– Verse 5: Some other applications
– Fade-out

Music IR – Music?

What is "Music"?
Music, of course!
– Audio: wav, au, mp3, ...
– Symbolic: MIDI, mod, ...
– Scores: Scan, MusicXML
(images: www.samplesmith.com, www.westminster.gov.uk)
Text
– Song lyrics
– Artist biographies
– Websites: Fanpages, Album Reviews, Genre descriptions
Community data
– Playlists
– Market basket
– Band evolution
Video/Images
– Album covers
– Music videos
Music - Sound
http://www.phys.unsw.edu.au/jw/hearing.html
Sound as acoustic wave
Characterized by the properties of waves (frequency/wavelength, amplitude)
Frequency: pitch
– Humans can hear approx. 20 Hz to 20 kHz
– speech: 200 Hz to 8 kHz
Amplitude: Loudness
– measured as pressure in micropascal (µPa)
– hearing threshold: approx. 20 µPa
– logarithmic decibel scale

Music - Sound - Loudness
Source of sound                            sound pressure   sound pressure level
                                           (pascal)         (dB re 20 µPa)
immediate soft tissue damage               50000            approx. 185
threshold of pain                          100              134
hearing damage during short-term effect    20               approx. 120
jet engine, 100 m distant                  6–200            110–140
jack hammer, 1 m distant / discotheque     2                approx. 100
hearing damage during long-term effect     0.6              approx. 85
major road, 10 m distant                   0.2–0.6          80–90
passenger car, 10 m distant                0.02–0.2         60–80
TV set at home level, 1 m distant          0.02             approx. 60
normal talking, 1 m distant                0.002–0.02       40–60
very calm room                             0.0002–0.0006    20–30
rustling leaves, calm breathing            0.00006          10
auditory threshold at 2 kHz                0.00002          0

Music - Sound
Nyquist sampling theorem:
Exact reconstruction of a continuous-time baseband signal from its samples is possible if the signal is bandlimited and the sampling frequency is greater than twice the signal bandwidth.
Half the sampling frequency is the Nyquist frequency, i.e. a signal with a specific frequency must be sampled at more than twice that frequency for reconstruction.
More on sound, sound pressure, hearing thresholds, etc. later when we talk about feature extraction from sound.

Music - Sound
Different file formats for storing sound:
– lossless formats
  • WAV (may hold compressed audio, but usually lossless PCM)
  • FLAC, Shorten, Monkey's Audio, ATRAC Advanced Lossless, Apple Lossless, WMA Lossless, TTA
– lossy formats
  • MP3
  • ATRAC
  • AAC
  • Ogg Vorbis
  • WMA
  • ...
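The two columns of the loudness table above are related by the usual decibel formula, level = 20 · log10(p / p_ref) with p_ref = 20 µPa. The following is a minimal sketch of that conversion; the function names `spl_db` and `pressure_pa` are made up for this handout, and the pressures are assumed to be RMS values in pascal.

```python
import math

# Reference pressure: 20 µPa, the hearing threshold quoted in the table.
P_REF = 20e-6  # pascal


def spl_db(pressure: float) -> float:
    """Sound pressure level in dB re 20 µPa for a pressure given in pascal."""
    if pressure <= 0:
        raise ValueError("pressure must be positive")
    return 20.0 * math.log10(pressure / P_REF)


def pressure_pa(level_db: float) -> float:
    """Inverse conversion: sound pressure in pascal for a level in dB re 20 µPa."""
    return P_REF * 10.0 ** (level_db / 20.0)


# A few rows taken from the table above.
for label, p in [
    ("auditory threshold at 2 kHz", 0.00002),
    ("TV set at home level, 1 m distant", 0.02),
    ("jack hammer, 1 m distant", 2.0),
    ("threshold of pain", 100.0),
]:
    print(f"{label}: {spl_db(p):.0f} dB")
```

Because the scale is logarithmic, each factor of 10 in pressure adds 20 dB, which is why the table spans nine orders of magnitude in pascal but less than 200 dB.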
Music - Sound - PCM
PCM: Pulse Code Modulation
Digital representation of an analog signal where the magnitude of the signal is sampled regularly at uniform intervals, then quantized to a series of symbols
Used in WAV, CD recordings, ...
Quantization error: choosing a discrete value near the analog signal for each sample
Any frequency above or equal to 1/2 sampling frequency is lost

Music - Sound - MP3
Actually: MPEG-1 Audio Layer 3
Developed by a group around Fraunhofer, Thomson, and AT&T Bell Labs; several patent issues pending
Lossy compression, based on psycho-acoustic models
– differential encoding of stereo signal (lossless)
– focus on audible frequencies
– masking effects
– adaptive bit-depth encoding
– quantization and Huffman encoding

Music - Sound - MP3
ID3-Tags
Added later on to allow embedding of metadata
ID3v1: 30 characters per entry, few standard fields
ID3v2.4: UTF-8 support, tags at beginning of file
Used by search engines
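To make the PCM description above concrete, here is a minimal sketch of uniform sampling and quantization in plain Python. The helper names (`sample_sine`, `pcm_encode`, `pcm_decode`) and the chosen parameters (8 kHz sampling rate, 8 bits, a 440 Hz tone) are assumptions for illustration only; real WAV or CD audio typically uses 16 bits and 44.1 kHz.

```python
import math


def sample_sine(freq_hz: float, sample_rate_hz: float, n_samples: int) -> list[float]:
    """Sample a unit-amplitude sine wave at uniform intervals."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz) for n in range(n_samples)]


def pcm_encode(signal: list[float], bits: int = 16) -> list[int]:
    """Quantize samples in [-1.0, 1.0] to signed integer codes of the given bit depth."""
    max_code = 2 ** (bits - 1) - 1
    # Clip to the valid range, then round to the nearest integer code.
    return [round(max(-1.0, min(1.0, x)) * max_code) for x in signal]


def pcm_decode(codes: list[int], bits: int = 16) -> list[float]:
    """Map integer codes back to floating-point amplitudes."""
    max_code = 2 ** (bits - 1) - 1
    return [c / max_code for c in codes]


sample_rate = 8000          # samples per second (illustrative value)
tone = sample_sine(440.0, sample_rate, 64)

# Quantization error: each decoded sample is at most half a step away from the original.
codes = pcm_encode(tone, bits=8)
restored = pcm_decode(codes, bits=8)
step = 1 / (2 ** 7 - 1)
max_error = max(abs(a - b) for a, b in zip(tone, restored))
print(f"quantization step {step:.5f}, largest error {max_error:.5f}")

# Aliasing: a tone above half the sampling rate yields the same samples
# (up to sign) as a lower tone, so the original frequency cannot be recovered.
alias = sample_sine(sample_rate - 440.0, sample_rate, 64)
print(all(abs(a + b) < 1e-9 for a, b in zip(tone, alias)))
```

The last check illustrates why frequencies at or above half the sampling frequency are lost: after sampling, the 7560 Hz tone is indistinguishable from an inverted 440 Hz tone.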
Musical Instrument Digital Interface - MIDI
Symbolic Music File Format
Dave Smith, proposed in 1981
MIDI specification 1.0 in 1983
Interacting with keyboard produces messages
– Note-On, Aftertouch, and Note-Off
– 128 note pitches (note numbers 0–127)
Sequence of control commands

Musical Instrument Digital Interface - MIDI
Some MIDI examples (from: http://www.borg.com/~jglatt/files/midifile.htm)
– Orchestral: Bach: Brandenburg Concerto 4
– Orchestral: Star Trek Theme: Next Generation
– Classic: Beethoven: Für Elise
– 1950's Rock&Roll: Bill Haley: Rock Around the Clock
– 1950's Rock&Roll: Jerry Lee Lewis: Great Balls of Fire
– Pop: Elton John: Don't Let the Sun Go Down
– Pop: Phil Collins: Another Day in Paradise
– Heavy Metal: Queen: Another One Bites the Dust
– Heavy Metal: Van Halen: Jump

MOD
Similar to MIDI, but stores audio samples together with control instructions
should sound the same on every player
a.k.a. tracker modules (the first ever module-creating program was Soundtracker, created by Karsten Obarski in 1987)

MOD
Some examples (from http://modarchive.org)
– Classical: Dark Castle (Part 1)
– Classical: Canon in D
– Classical: Beethoven: Für Elise
– Guitar: Sweet Lorraine
– Latin: Heart and Soul
– Techno: 10KBlur
– Disco: Rob Hubbard

Scores
Also referred to as "Sheet Music"
Hand-written or printed form of musical notation
– Handwritten scores
– Printed scores
– Typeset scores
– MusicXML
Different IR tasks
– Scan & Optical Music Recognition (OMR)
– Score following
– Melodic retrieval
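Returning to the MIDI messages described above: a channel message such as Note-On is a status byte followed by data bytes, and the note pitch is just a number between 0 and 127. The sketch below decodes such a message and converts the note number to a frequency. It assumes equal temperament with note 69 tuned to 440 Hz and the octave naming in which note 60 is C4; both are common conventions rather than something stated on the slides, and the function names are hypothetical.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Upper four bits of the status byte select the message type.
MESSAGE_TYPES = {0x80: "Note-Off", 0x90: "Note-On", 0xA0: "Aftertouch"}


def note_frequency(note: int, a4_hz: float = 440.0) -> float:
    """Frequency in Hz of a MIDI note number, assuming equal temperament and A4 = note 69."""
    if not 0 <= note <= 127:
        raise ValueError("MIDI note numbers range from 0 to 127")
    return a4_hz * 2 ** ((note - 69) / 12)


def note_name(note: int) -> str:
    """Note name with octave, using the convention that note 60 is C4."""
    return f"{NOTE_NAMES[note % 12]}{note // 12 - 1}"


def describe_message(status: int, data1: int, data2: int) -> str:
    """Describe a three-byte channel message (status byte plus two data bytes)."""
    kind = MESSAGE_TYPES.get(status & 0xF0, "other")
    channel = (status & 0x0F) + 1  # lower four bits: channel 1..16
    return f"{kind} ch{channel} {note_name(data1)} value={data2}"


# Pressing middle C on channel 1 with velocity 100, then releasing it.
print(describe_message(0x90, 60, 100))
print(describe_message(0x80, 60, 0))
print(f"{note_name(69)} = {note_frequency(69):.1f} Hz")
```

A MIDI file is essentially a timed sequence of such messages, which is why it stays small and why the actual sound depends on the synthesizer that plays it.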
Handwritten scores
Handwritten / printed scores
Different styles of notation
– Neumes
– Staff
Complex annotations
Scanning scores
e.g. Musitek SmartScore: http://www.musitek.com/
Bach SheetmusicDemo: http://bach.nau.edu/UWDigital/Washington.html

Different styles of notation
http://en.wikipedia.org/wiki/Musical_notation
Ancient Greek: stone at Delphi containing the second of the two hymns to Apollo
Indian notation: Bhat notation
China: Qin notation

Music Typesetting / Scorewriter Software
Software used to automate the task of writing and engraving sheet music, a kind of word processor for music
Input via text editor or MIDI interface, some support Scan+OMR
Output: PS/PDF, graphics, MIDI, MusicXML
Popular programs:
– GNU LilyPond Software: http://lilypond.org/
– GUIDO Music Notation: http://www.salieri.org/GUIDO/
– Finale: http://www.finalemusic.com/
– Sibelius: http://www.sibelius.com/
– Comprehensive list: http://en.wikipedia.org/wiki/Scorewriter#Scorewriters

GNU LilyPond Software
http://lilypond.org/
Input: UTF-8, no graphical interface
some graphical editors produce LilyPond output (e.g. Rosegarden, NoteEdit, Canorus)
Output: compiled to PDF, SVG, MIDI, ...
Notes are entered in note, pitch and length format
Used by several projects (Mutopia, Musipedia)

LilyPond example (1/5, from http://en.wikipedia.org/wiki/GNU_LilyPond)

    #!lilypond firebreathers.ly -*- coding: utf-8; -*-
    %% Theme to "Fire Breathers", a homebrew NES game perpetually
    %% under development. Composed by Urpo Lankinen.
    %% Note: The composer has made this source code available
    %% to Wikipedia under the GFDL license. Other versions outside
    %% Wikipedia are typically under CC BY-SA license.
    %% This file uses Finnish note names (for example, where
    %% Americans use "F#" and "Bb", Finns use "Fis" and "B").
    %% Dutch note names are used by default.
    \include "suomi.ly"
    %% Optional language upgrade helper.
    \version "2.6.0"

LilyPond example (2/5, from http://en.wikipedia.org/wiki/GNU_LilyPond)

    %% The header block defines the titles and texts.
    \header {
      title = "Theme to ``Fire Breathers!''"
      instrument = "For the 2A03 or SID"
      composer = "Urpo Lankinen"
      enteredby = "Urpo Lankinen"
      updatedby = "Jan Nieuwenhuizen"
      date = "June 2005"
    }

LilyPond example (3/5 and 4/5, from http://en.wikipedia.org/wiki/GNU_LilyPond)
Melody = \relative c'' { %% This is the second voice. \clef treble SecondVoice = \relative c { \time 3/4