Music Representations
Lecture Music Processing
Meinard Müller, International Audio Laboratories Erlangen
Music Representations
. Score representation: symbolic description
. MIDI representation: hybrid description (models note events explicitly but may also encode performance subtleties)
. Audio representation: physical description (encodes a sound wave)
Score Representation
[Figure: elements of music notation: note head, stem, flag, beam; whole, half, quarter, eighth, and sixteenth notes; measure and bar line]
Score Representation
Types of score:
. Full score: shows music for all instruments and voices; used by conductors
. Piano (reduction) score: transcription for piano. Example: Liszt transcription of Beethoven symphonies
. Short score: reduction of a work for many instruments to just a few staves
. Lead sheet: specifies only melody, lyrics and harmonies (chord symbols); used for popular music to capture essential elements of a song

Score Representation
. Scanned image
. Various symbolic data formats
– Lilypond
– MusicXML
. Optical Music Recognition (OMR)
. Music notation software
– Finale
– Sibelius

Score Representation
[Figure: MusicXML example]

Score Representation
Musical score / sheet music:
. Graphical / textual encoding of musical parameters (note onsets, pitches, durations, tempo, measure, dynamics, instrumentation)
. Guide for performing music
. Leaves freedom for various interpretations
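To make the idea of a symbolic score encoding more concrete, the following sketch reads pitches and durations from a tiny MusicXML-style fragment using Python's standard XML parser. The fragment is a hypothetical, heavily reduced example (real MusicXML files carry many more elements), and the helper names `read_notes` and `midi_note_number` are illustrative assumptions, not part of any MusicXML tooling.

```python
import xml.etree.ElementTree as ET

# Hypothetical, reduced MusicXML-style fragment: two half notes (G4, E-flat 4).
SNIPPET = """
<score-partwise version="3.1">
  <part-list>
    <score-part id="P1"><part-name>Music</part-name></score-part>
  </part-list>
  <part id="P1">
    <measure number="1">
      <attributes><divisions>1</divisions></attributes>
      <note>
        <pitch><step>G</step><octave>4</octave></pitch>
        <duration>2</duration>
        <type>half</type>
      </note>
      <note>
        <pitch><step>E</step><alter>-1</alter><octave>4</octave></pitch>
        <duration>2</duration>
        <type>half</type>
      </note>
    </measure>
  </part>
</score-partwise>
"""

# Semitone offset of each note name within an octave starting at C.
STEP_TO_SEMITONE = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def midi_note_number(step, octave, alter=0):
    """Map a note name, octave, and accidental to a MIDI note number (C4 = 60)."""
    return 12 * (octave + 1) + STEP_TO_SEMITONE[step] + alter


def read_notes(xml_text):
    """Return (step, octave, alter, duration, midi_note_number) for each pitched note."""
    root = ET.fromstring(xml_text.strip())
    notes = []
    for note in root.iter("note"):
        pitch = note.find("pitch")
        if pitch is None:  # rests carry no <pitch> element
            continue
        step = pitch.findtext("step")
        octave = int(pitch.findtext("octave"))
        alter = int(pitch.findtext("alter", default="0"))
        duration = int(note.findtext("duration"))
        notes.append((step, octave, alter, duration,
                      midi_note_number(step, octave, alter)))
    return notes


for entry in read_notes(SNIPPET):
    print(entry)
```

The durations here are counted in the fragment's `divisions` unit (one division per quarter note in this sketch), which is musical time rather than physical time.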
MIDI Representation
. Musical Instrument Digital Interface (MIDI)
. Standard protocol for controlling and synchronizing digital instruments
. Standard MIDI File (SMF) is used for collecting and storing MIDI messages
. SMF file is often called MIDI file

MIDI Representation
[Figure: MIDI note numbers (MNN) ≙ piano keys]
MIDI Representation
MIDI parameters:
. MIDI note number (pitch) [0 : 127]
– p = 21, …, 108 ≙ "piano keys"
– p = 69 ≙ concert pitch A (440 Hz)
. Key velocity [0 : 127] ≙ intensity
. MIDI channel [0 : 15] ≙ instrument
. Note-on / note-off events ≙ onset time & duration
. Tempo measured in clock pulses or ticks (each MIDI event has a timestamp)
. Absolute tempo specified by
– ticks per quarter note (musical time)
– micro-seconds per tick (physical time)

MIDI Representation
Time (Ticks)   Message    Channel   Note Number   Velocity
60             NOTE ON    1         67            100
0              NOTE ON    1         55            100
0              NOTE ON    2         43            100
55             NOTE OFF   1         67            0
0              NOTE OFF   1         55            0
0              NOTE OFF   2         43            0
5              NOTE ON    1         67            100
0              NOTE ON    1         55            100
0              NOTE ON    2         43            100
55             NOTE OFF   1         67            0
0              NOTE OFF   1         55            0
0              NOTE OFF   2         43            0
5              NOTE ON    1         67            100
0              NOTE ON    1         55            100
0              NOTE ON    2         43            100
55             NOTE OFF   1         67            0
0              NOTE OFF   1         55            0
0              NOTE OFF   2         43            0
5              NOTE ON    1         63            100
0              NOTE ON    2         51            100
0              NOTE ON    2         39            100
240            NOTE OFF   1         63            0
0              NOTE OFF   2         51            0
0              NOTE OFF   2         39            0

MIDI Representation
[Figure: plot with pitch axis labeled 36/C2, 43/G2, 48/C3, 55/G3, 60/C4, 67/G4, 71/B4]

MIDI Representation
Piano roll representation:
. Piano roll: music storage medium used to operate a player piano
. Perforated paper rolls
. Holes in the paper encode the note parameters onset, duration, and pitch
. First pianola: 1895
Audio Representation
Various interpretations – Beethoven's Fifth
. Bernstein
. Karajan
. Scherbakov (piano)
. MIDI (piano)

Audio Representation
Waveform
[Figure: pressure-time plot at a specific location, showing compression and rarefaction; axes: time vs. air pressure, with the deviation from the average air pressure marked]

Audio Representation
Waveform
. Audio signal encodes change of air pressure at a certain location generated by a vibrating object (e.g. string, vocal cords, membrane)
. Waveform (pressure-time plot) is graphical representation of audio signal
. Parameters: amplitude, frequency / period

Audio Representation
Waveform
Pure tone (harmonic sound):
. Sinusoidal waveform
. Prototype of an acoustic realization of a musical note
Parameters:
. Period p: time between two successive high pressure points
. Frequency $f = 1/p$ (measured in Hz)
. Amplitude a: air pressure at high pressure points
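The relation between period, frequency, and amplitude of a pure tone can be checked with a few lines of code. This is a minimal sketch; the function names and the example values (a period of 2.5 ms and an amplitude of 0.5) are assumptions for illustration.

```python
import math


def frequency_from_period(period_s):
    """Frequency in Hz is the reciprocal of the period in seconds: f = 1 / p."""
    return 1.0 / period_s


def pure_tone(amplitude, period_s, t):
    """Pressure deviation of a sinusoid with the given amplitude and period at time t."""
    return amplitude * math.sin(2 * math.pi * t / period_s)


period = 0.0025  # seconds (assumed example value)
print(frequency_from_period(period))       # frequency in Hz
print(pure_tone(0.5, period, period / 4))  # a quarter period in: a high pressure point
```

Evaluating the tone at two time points that lie exactly one period apart gives the same pressure value, which is what the definition of the period states.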
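Audio Representation
Waveform
[Figure: sinusoidal waveform annotated with amplitude and period; axes: time in seconds vs. air pressure, with the deviation from the average air pressure marked]

Audio Representation
Waveform
[Figure: waveform plots; axes: time (seconds) vs. amplitude]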
Audio Representation
Waveform
[Figure: waveforms of two recordings, Bernstein (orchestra) and Glenn Gould (piano); horizontal axis: time (seconds)]

Audio Representation
Sound
. Sound: superposition of sinusoids
. When realizing musical notes on an instrument, one obtains a complex superposition of pure tones (and other noise-like components)
. Harmonics: integer multiples of fundamental frequency
– 1st harmonic ≙ fundamental frequency (e.g. 440 Hz)
– 2nd harmonic ≙ first overtone (e.g. 880 Hz)
– 3rd harmonic ≙ second overtone (e.g. 1320 Hz)
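The superposition of harmonics can be sketched directly from the definition: each harmonic is a sinusoid whose frequency is an integer multiple of the fundamental. The function names and the amplitude weights below are assumptions for illustration; real instrument sounds also contain noise-like components that this sketch ignores.

```python
import math


def harmonic_frequencies(fundamental_hz, count):
    """Frequencies of the first `count` harmonics (the 1st harmonic is the fundamental)."""
    return [k * fundamental_hz for k in range(1, count + 1)]


def superpose(fundamental_hz, amplitudes, t):
    """Sum of sinusoids at time t; amplitudes[k - 1] weights the k-th harmonic."""
    return sum(
        a * math.sin(2 * math.pi * k * fundamental_hz * t)
        for k, a in enumerate(amplitudes, start=1)
    )


print(harmonic_frequencies(440, 3))
# Assumed weights: each harmonic half as strong as the previous one.
print(superpose(440, [1.0, 0.5, 0.25], 0.0003))
```

Because every harmonic completes a whole number of cycles within one period of the fundamental, the sum repeats with that same period.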
Audio Representation
Pitch
. Property that correlates to the perceived frequency (≙ fundamental frequency)
. Example: middle A or concert pitch ≙ 440 Hz
. Slight changes in frequency have no effect on perceived pitch (pitch ≙ entire range of frequencies)
. Pitch perception: logarithmic in frequency. Example: Octave ≙ doubling of frequency

Audio Representation
Pitch
Equal-tempered scale: A system of tuning in which every pair of adjacent notes has an identical frequency ratio
Western music: 12-tone equal-tempered scale
. Each octave is divided up into 12 logarithmically equal parts
. Notes correspond to piano keys
. Reference: standard pitch (p = 69 ≙ concert pitch A, 440 Hz)
. Frequency of a note with MIDI pitch p: $F_{\mathrm{pitch}}(p) = 2^{(p-69)/12} \cdot 440\ \mathrm{Hz}$
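The 12-tone equal-tempered scale fixes the frequency of every MIDI pitch once the reference is chosen: moving up by one semitone multiplies the frequency by the twelfth root of two. A minimal sketch, assuming the reference p = 69 at 440 Hz and a hypothetical function name `f_pitch`:

```python
def f_pitch(p, reference_pitch=69, reference_hz=440.0):
    """Center frequency in Hz of MIDI pitch p in 12-tone equal temperament."""
    return 2 ** ((p - reference_pitch) / 12) * reference_hz


for p in (57, 60, 69, 81):
    print(p, round(f_pitch(p), 1))
```

Pitches 57, 69, and 81 are one octave apart each, so their frequencies double from one to the next, which illustrates the logarithmic nature of pitch perception.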
Audio Representation
Harmonics
[Figure: harmonics 1 to 16 of a fundamental; marked intervals: octave, fifth, major third]
. Harmonics: Frequency = integer multiples of fundamental frequency
. MIDI: Frequency = fundamental frequency of MIDI pitch
Deviation in cents (harmonic relative to the nearest MIDI pitch):
Harmonic   3    5    6    7    9    10   11   12   13   14   15
Cents      +2   -14  +2   -31  +4   -14  -49  +2   +41  -31  -12
[Audio examples: Harmonics, MIDI, Mix; stereo file: Harmonics vs. MIDI]

Audio Representation
Dynamics
. Intensity of a sound
. Energy of the sound per time and area
. Loudness: subjective (psychoacoustic) perception of intensity (depends on frequency, timbre, duration)
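The cent deviations listed for the harmonics can be reproduced numerically. One octave spans 1200 cents and one equal-tempered semitone 100 cents, so the deviation of the h-th harmonic is the distance of 1200 · log2(h) from the nearest multiple of 100. The function name below is an assumption for illustration.

```python
import math


def cents_from_equal_temperament(harmonic):
    """Deviation in cents of the given harmonic from the nearest equal-tempered pitch."""
    cents_above_fundamental = 1200 * math.log2(harmonic)
    nearest_semitone = 100 * round(cents_above_fundamental / 100)
    return cents_above_fundamental - nearest_semitone


for h in range(1, 17):
    print(h, round(cents_from_equal_temperament(h)))
```

Harmonics 1, 2, 4, 8, and 16 are whole octaves above the fundamental and therefore show no deviation.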
Audio Representation
Dynamics
. Intensity = energy / (time · area) = power / area, measured in W/m²
. Decibel (dB): logarithmic unit to measure intensity relative to a reference level
. Reference level: threshold of hearing (TOH), $P_0 := 10^{-12}\ \mathrm{W/m^2}$
. Intensity $P_1$ measured in dB: $\mathrm{dB}(P_1) := 10 \cdot \log_{10}(P_1 / P_0)$
. Examples:
– $P_1 = 10 \cdot P_0$: $P_1$ has a sound level of 10 dB
– $P_2 = 100 \cdot P_0$: $P_2$ has a sound level of 20 dB

Audio Representation
Dynamics
Source                           Intensity (W/m²)   Intensity level   × TOH
Threshold of hearing (TOH)       10^-12             0 dB              1
Whisper                          10^-10             20 dB             10^2
Pianissimo                       10^-8              40 dB             10^4
Normal conversation              10^-6              60 dB             10^6
Fortissimo                       10^-2              100 dB            10^10
Threshold of pain                10                 130 dB            10^13
Jet take-off                     10^2               140 dB            10^14
Instant perforation of eardrum   10^4               160 dB            10^16
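The decibel definition translates directly into code. The sketch below assumes the threshold of hearing, 10^-12 W/m², as reference level and uses a hypothetical function name `intensity_to_db`.

```python
import math

P0 = 1e-12  # threshold of hearing in W/m^2, used as reference level


def intensity_to_db(intensity, reference=P0):
    """Intensity level in dB relative to the reference intensity."""
    return 10 * math.log10(intensity / reference)


for label, intensity in [("whisper", 1e-10), ("fortissimo", 1e-2), ("jet take-off", 1e2)]:
    print(label, round(intensity_to_db(intensity)))
```

Multiplying the intensity by ten always adds 10 dB, which is why the intensity levels in the table grow linearly while the intensities grow by powers of ten.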
Audio Representation
Dynamics
[Figure: waveforms with upper envelope and lower envelope; axes: time vs. amplitude]
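Audio Representation
Dynamics
[Figure: amplitude envelope over time with attack (A), decay (D), sustain (S), and release (R) phases, between "key pressed" and "key released"]

Audio Representation
Loudness
Equal-loudness contours (phon)
[Figure: equal-loudness contours; axes: frequency (Hz), 20 to 10000, vs. intensity (dB), 0 to 120; the 0 phon contour is labeled as threshold of hearing]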
Audio Representation
Loudness
Equal-loudness contours (phon)
[Figure: equal-loudness contours for 0, 20, 40, 60, 80, 100, and 120 phon, with the threshold of hearing and the threshold of pain marked; axes: frequency (Hz), 20 to 10000, vs. intensity (dB), 0 to 120]

Audio Representation
Timbre
. Quality of musical sound that distinguishes different types of sound production such as voices or instruments
. Tone quality
. Tone color
. Depends on energy distribution in harmonics

Audio Representation
Timbre
All instruments play the same note C4 (261.6 Hz)
[Figure: time-frequency plots for piano, trumpet, violin, and flute; axes: time (seconds) vs. frequency (Hz)]
. Vibrato: frequency modulation
. Tremolo: amplitude modulation
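The difference between tremolo (amplitude modulation) and vibrato (frequency modulation) can be illustrated with two small synthesis routines. This is a rough sketch under assumed parameters: the modulation rate of 5 Hz, the depth values, the sampling rate of 8000 Hz, and all function names are illustrative choices, not taken from the slides.

```python
import math


def pure_tone(freq, duration, rate=8000):
    """Samples of an unmodulated sinusoid."""
    n = round(duration * rate)
    return [math.sin(2 * math.pi * freq * (k / rate)) for k in range(n)]


def tremolo(freq, duration, mod_freq=5.0, depth=0.3, rate=8000):
    """Amplitude modulation: the amplitude oscillates between 1 - depth and 1 + depth."""
    n = round(duration * rate)
    samples = []
    for k in range(n):
        t = k / rate
        amplitude = 1 + depth * math.sin(2 * math.pi * mod_freq * t)
        samples.append(amplitude * math.sin(2 * math.pi * freq * t))
    return samples


def vibrato(freq, duration, mod_freq=5.0, extent_hz=4.0, rate=8000):
    """Frequency modulation: the instantaneous frequency oscillates by +/- extent_hz."""
    n = round(duration * rate)
    beta = extent_hz / mod_freq  # modulation index of the phase
    samples = []
    for k in range(n):
        t = k / rate
        phase = 2 * math.pi * freq * t + beta * math.sin(2 * math.pi * mod_freq * t)
        samples.append(math.sin(phase))
    return samples


c4 = 261.6  # frequency of the note C4 in Hz
print(max(tremolo(c4, 1.0)), max(vibrato(c4, 1.0)))
```

With the modulation depth set to zero, both routines reduce to the plain sinusoid, so the modulation is the only thing that separates the three signals.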
Audio Representation
Digitization
. Conversion of continuous-time (analog) signal into a discrete signal
. Sampling (discretization of time axis)
. Quantization (discretization of amplitudes)
Examples:
. Audio CD: 44100 Hz sampling rate, 16 bits (65536 values) used for quantization
. Telephone: 8000 Hz sampling rate, 8 bits (256 values) used for quantization

Music Representations
[Diagram relating audio representations, symbolic representations, and sheet music representations; edge labels: transcription, synthesis / performance, rendering, OMR; annotations: physical time, musical time, time domain, image domain]
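The two steps of digitization, sampling and quantization, can be sketched as follows. The uniform quantizer below maps amplitudes in [-1, 1] to integer indices from 0 to 2^bits - 1; this index convention and all function names are simplifying assumptions, and actual audio formats may store sample values differently.

```python
import math


def sample(signal, duration_s, rate_hz):
    """Sampling: evaluate a continuous-time signal at equidistant time points."""
    n = round(duration_s * rate_hz)
    return [signal(k / rate_hz) for k in range(n)]


def quantize(x, bits):
    """Quantization: map an amplitude in [-1, 1] to one of 2**bits integer levels."""
    levels = 2 ** bits
    x = max(-1.0, min(1.0, x))  # clip values outside the representable range
    return round((x + 1.0) / 2.0 * (levels - 1))


def dequantize(index, bits):
    """Map a level index back to an amplitude in [-1, 1]."""
    return index / (2 ** bits - 1) * 2.0 - 1.0


def tone(t):
    """A 440 Hz sinusoid standing in for the analog signal."""
    return math.sin(2 * math.pi * 440 * t)


telephone = [quantize(x, 8) for x in sample(tone, 0.01, 8000)]
cd = [quantize(x, 16) for x in sample(tone, 0.01, 44100)]
print(len(telephone), len(cd))
print(2 ** 8, 2 ** 16)  # number of quantization values for 8 and 16 bits
```

Doubling the number of bits from 8 to 16 squares the number of available levels, so the rounding error per sample shrinks accordingly.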