Music Representations

Lecture Music Processing

Meinard Müller
International Audio Laboratories Erlangen

Music Representations

. Score representation: symbolic description

. MIDI representation: hybrid description (models note events explicitly but may also encode performance subtleties)

. Audio representation: physical description (encodes sound wave)

Score Representation

[Figure: elements of music notation: note head, stem, flag, beam; whole, half, quarter, eighth, and sixteenth notes; measure and bar line]

Types of score:

. Full score: shows music for all instruments and voices; used by conductors

. Piano (reduction) score: transcription for piano
  Example: Liszt's transcriptions of Beethoven's symphonies

. Short score: reduction of a work for many instruments to just a few staves

. Lead sheet: specifies only melody, lyrics, and harmonies (chord symbols); used for popular music to capture the essential elements of a song

Score Representation

. Scanned image

. Various symbolic data formats
  – Lilypond
  – MusicXML

. Optical Music Recognition (OMR)

. Music notation software
  – Finale
  – Sibelius

[Figure: MusicXML]

Musical score / sheet music:

. Graphical / textual encoding of musical parameters (note onsets, pitches, durations, tempo, measure, dynamics, instrumentation)

. Guide for performing music

. Leaves freedom for various interpretations
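
To make the idea of a textual encoding of musical parameters more concrete, the following sketch parses a small hand-written fragment in the style of MusicXML and extracts pitch and duration for each note. The fragment, the helper name `parse_notes`, and the choice of two divisions per quarter note are illustrative assumptions and are not taken from the lecture; real MusicXML files contain many more elements.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment in the style of MusicXML (illustrative assumption).
# <duration> counts "divisions"; here 2 divisions per quarter note are assumed,
# so 1 = eighth note and 4 = half note.
FRAGMENT = """
<measure number="1">
  <note><pitch><step>G</step><octave>4</octave></pitch><duration>1</duration></note>
  <note><pitch><step>G</step><octave>4</octave></pitch><duration>1</duration></note>
  <note><pitch><step>G</step><octave>4</octave></pitch><duration>1</duration></note>
  <note><pitch><step>E</step><alter>-1</alter><octave>4</octave></pitch><duration>4</duration></note>
</measure>
"""

# Semitone offset of each note letter above C.
SEMITONES = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def parse_notes(xml_text):
    """Return a list of (midi_note_number, duration_in_divisions) tuples."""
    root = ET.fromstring(xml_text.strip())
    notes = []
    for note in root.iter("note"):
        pitch = note.find("pitch")
        if pitch is None:  # rests carry no <pitch> element
            continue
        step = pitch.findtext("step")
        alter = int(pitch.findtext("alter", default="0"))  # -1 = flat, +1 = sharp
        octave = int(pitch.findtext("octave"))
        midi_note_number = 12 * (octave + 1) + SEMITONES[step] + alter
        notes.append((midi_note_number, int(note.findtext("duration"))))
    return notes


print(parse_notes(FRAGMENT))
```

The mapping to MIDI note numbers uses the convention that C4 corresponds to note number 60, so G4 maps to 67 and E-flat 4 to 63.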

MIDI Representation

. Musical Instrument Digital Interface (MIDI)

. Standard protocol for controlling and synchronizing digital instruments

. Standard MIDI File (SMF) is used for collecting and storing MIDI messages

. An SMF is often called a MIDI file

[Figure: MIDI note numbers (MNN) ≙ piano keys]

MIDI Representation

MIDI parameters:

. MIDI note number (pitch) [0 : 127]
  p = 21, …, 108 ≙ "piano keys"
  p = 69 ≙ concert pitch A (440 Hz)

. Key velocity [0 : 127] ≙ intensity

. MIDI channel [0 : 15] ≙ instrument

. Note-on / note-off events ≙ onset time & duration

. Time measured in clock pulses or ticks (each MIDI event has a timestamp)

. Absolute tempo specified by
  – ticks per quarter note (musical time)
  – microseconds per tick (physical time)

Example:

Time (ticks)  Message   Channel  Note number  Velocity
60            NOTE ON   1        67           100
0             NOTE ON   1        55           100
0             NOTE ON   2        43           100
55            NOTE OFF  1        67           0
0             NOTE OFF  1        55           0
0             NOTE OFF  2        43           0
5             NOTE ON   1        67           100
0             NOTE ON   1        55           100
0             NOTE ON   2        43           100
55            NOTE OFF  1        67           0
0             NOTE OFF  1        55           0
0             NOTE OFF  2        43           0
5             NOTE ON   1        67           100
0             NOTE ON   1        55           100
0             NOTE ON   2        43           100
55            NOTE OFF  1        67           0
0             NOTE OFF  1        55           0
0             NOTE OFF  2        43           0
5             NOTE ON   1        63           100
0             NOTE ON   2        51           100
0             NOTE ON   2        39           100
240           NOTE OFF  1        63           0
0             NOTE OFF  2        51           0
0             NOTE OFF  2        39           0

[Figure: pitch axis labeled with MIDI note number / note name: 36/C2, 43/G2, 48/C3, 55/G3, 60/C4, 67/G4, 71/B4]
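
The relation "note-on / note-off events ≙ onset time & duration" can be sketched in a few lines of Python. The sketch assumes that the time column holds delta times, that is, ticks elapsed since the previous event, as is usual in Standard MIDI Files; the names `Note`, `events_to_notes`, and `ticks_to_seconds`, as well as the value of 5000 microseconds per tick in the usage note, are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Note(NamedTuple):
    onset: int      # absolute onset time in ticks
    duration: int   # duration in ticks
    channel: int
    pitch: int      # MIDI note number
    velocity: int   # velocity of the note-on event


# First six rows of the example table:
# (delta time in ticks, message, channel, note number, velocity)
EVENTS = [
    (60, "NOTE ON", 1, 67, 100),
    (0, "NOTE ON", 1, 55, 100),
    (0, "NOTE ON", 2, 43, 100),
    (55, "NOTE OFF", 1, 67, 0),
    (0, "NOTE OFF", 1, 55, 0),
    (0, "NOTE OFF", 2, 43, 0),
]


def events_to_notes(events):
    """Pair each note-on with its matching note-off (assumes delta times)."""
    now = 0       # running absolute time in ticks
    active = {}   # (channel, pitch) -> (onset, velocity) of sounding notes
    notes = []
    for delta, message, channel, pitch, velocity in events:
        now += delta
        key = (channel, pitch)
        if message == "NOTE ON" and velocity > 0:
            active[key] = (now, velocity)
        elif key in active:
            # NOTE OFF, or NOTE ON with velocity 0, ends the note.
            onset, on_velocity = active.pop(key)
            notes.append(Note(onset, now - onset, channel, pitch, on_velocity))
    return notes


def ticks_to_seconds(ticks, microseconds_per_tick):
    """Convert musical time (ticks) to physical time (seconds)."""
    return ticks * microseconds_per_tick / 1_000_000


for note in events_to_notes(EVENTS):
    print(note)
```

With the hypothetical setting of 5000 microseconds per tick, `ticks_to_seconds(60, 5000)` gives an onset of 0.3 seconds for the first chord.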

MIDI Representation

Piano roll representation:

. Piano roll: music storage medium used to operate a player piano

. Perforated paper rolls

. Holes in the paper encode the note parameters onset, duration, and pitch

. First pianola: 1895

Audio Representation

Various interpretations of Beethoven's Fifth:

. Bernstein

. Karajan

. Scherbakov (piano)

. MIDI (piano)

Waveform

[Figure: pressure-time plot at a specific location; the air pressure deviates from the average air pressure, alternating between compression and rarefaction; axes: time, air pressure]

. Audio signal encodes change of air pressure at a certain location generated by a vibrating object (e.g. string, vocal cords, membrane)

. Waveform (pressure-time plot) is a graphical representation of the audio signal

. Parameters: amplitude, frequency / period

Pure tone (harmonic sound):

. Sinusoidal waveform

. Prototype of an acoustic realization of a musical note

Parameters:

. Period p: time between two successive high-pressure points

. Frequency $f = 1/p$ (measured in Hz)

. Amplitude a: air pressure at high-pressure points
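
The three parameters of a pure tone can be illustrated with a minimal sketch. It assumes the sinusoid has zero phase and is written as a · sin(2π f t); the function names are hypothetical.

```python
import math


def frequency_from_period(period_s):
    """Frequency in Hz for a period given in seconds (f = 1 / p)."""
    return 1.0 / period_s


def sinusoid(t, amplitude, frequency_hz):
    """Value of a zero-phase sinusoid at time t (seconds)."""
    return amplitude * math.sin(2 * math.pi * frequency_hz * t)


period = 1 / 440                       # period of concert pitch A in seconds
print(frequency_from_period(period))   # frequency in Hz
print(sinusoid(period / 4, 0.5, 440))  # a quarter period in: the high-pressure point
```

A quarter of a period after the start, the sinusoid reaches its maximum, which equals the amplitude a.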

[Figure: sinusoidal waveform annotated with amplitude and period; axes: time in seconds, air pressure (deviation from the average air pressure)]

Waveform

[Figure: waveforms of two recordings, Bernstein (orchestra) and Glenn Gould (piano); axis: time in seconds]

Sound

. Sound: superposition of sinusoids

. When realizing musical notes on an instrument, one obtains a complex superposition of pure tones (and other noise-like components)

. Harmonics: integer multiples of the fundamental frequency
  1. Harmonic ≙ fundamental frequency (e.g. 440 Hz)
  2. Harmonic ≙ first overtone (e.g. 880 Hz)
  3. Harmonic ≙ second overtone (e.g. 1320 Hz)
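
The superposition of harmonics can be sketched as follows. The sketch deliberately ignores the noise-like components mentioned above and assumes zero phase for every partial; the function names and the amplitude values are hypothetical.

```python
import math


def harmonic_frequencies(fundamental_hz, count):
    """Frequencies of the first `count` harmonics (integer multiples)."""
    return [k * fundamental_hz for k in range(1, count + 1)]


def superposition(t, fundamental_hz, amplitudes):
    """Sum of harmonics at time t; amplitudes[k - 1] weights the k-th harmonic."""
    return sum(
        a * math.sin(2 * math.pi * k * fundamental_hz * t)
        for k, a in enumerate(amplitudes, start=1)
    )


print(harmonic_frequencies(440, 3))
print(superposition(0.0005, 440, [1.0, 0.5, 0.25]))
```

Because every partial is an integer multiple of the fundamental, the sum repeats with the period of the fundamental, here 1/440 seconds.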

Pitch

. Property that correlates to the perceived frequency (≙ fundamental frequency)

. Example: middle A or concert pitch ≙ 440 Hz

. Slight changes in frequency have no effect on perceived pitch (pitch ≙ entire range of frequencies)

. Pitch perception: logarithmic in frequency
  Example: octave ≙ doubling of frequency

Pitch

Equal-tempered scale: a system of tuning in which every pair of adjacent notes has an identical frequency ratio

Western music: 12-tone equal-tempered scale

. Each octave is divided up into 12 logarithmically equal parts

. Notes correspond to piano keys

. Reference: standard pitch (concert pitch A, p = 69 ≙ 440 Hz)

. Frequency of a note with MIDI pitch p: $F(p) = 2^{(p-69)/12} \cdot 440$ Hz
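
As a small worked example of the 12-tone equal-tempered scale, the sketch below computes the frequency of a MIDI pitch from the reference p = 69 ≙ 440 Hz, with each semitone step multiplying the frequency by 2^(1/12). The function name and parameter names are hypothetical.

```python
def midi_pitch_to_frequency(p, reference_pitch=69, reference_hz=440.0):
    """Frequency in Hz of MIDI pitch p in 12-tone equal temperament."""
    return reference_hz * 2 ** ((p - reference_pitch) / 12)


for p in (21, 60, 69, 81, 108):
    print(p, round(midi_pitch_to_frequency(p), 1))
```

Raising the pitch by 12 (one octave) doubles the frequency, so p = 81 yields 880 Hz, and p = 60 (C4) yields about 261.6 Hz.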

Harmonics

[Figure: the first 16 harmonics (1 to 16) in music notation, with the intervals octave, fifth, and major third marked]

Harmonics: frequency = integer multiples of fundamental frequency

MIDI: frequency = fundamental frequency of MIDI pitch

Deviation in cents of each harmonic from the nearest MIDI pitch (harmonics 1, 2, 4, 8, and 16 are octaves of the fundamental and do not deviate):

Harmonic:             3    5    6    7    9   10   11   12   13   14   15
Deviation in cents:  +2  -14   +2  -31   +4  -14  -49   +2  +41  -31  -12

Audio examples: mix; stereo file (harmonics vs. MIDI)

Dynamics

. Intensity of a sound

. Energy of the sound per time and area

. Loudness: subjective (psychoacoustic) perception of intensity (depends on frequency, timbre, duration)
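
The cent deviations of the harmonics from the equal-tempered scale can be recomputed with a short sketch. It relies on the standard definition of the cent (1200 cents per octave, 100 cents per equal-tempered semitone), which is not spelled out on the slide, and it assumes that the fundamental itself lies exactly on an equal-tempered pitch. The function names are hypothetical.

```python
import math


def cents_above_fundamental(harmonic):
    """Distance of the k-th harmonic above the fundamental in cents."""
    return 1200 * math.log2(harmonic)


def deviation_from_equal_temperament(harmonic):
    """Signed deviation (in cents) from the nearest equal-tempered pitch."""
    cents = cents_above_fundamental(harmonic)
    nearest_semitone = round(cents / 100) * 100
    return cents - nearest_semitone


for k in range(1, 17):
    print(k, round(deviation_from_equal_temperament(k)))
```

For example, the third harmonic lies about 1902 cents above the fundamental, which is 2 cents above the equal-tempered pitch an octave plus a fifth higher (1900 cents).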

Dynamics

. $\text{intensity} = \dfrac{\text{energy}}{\text{time} \cdot \text{area}} = \dfrac{\text{power}}{\text{area}}$, measured in $\mathrm{W}/\mathrm{m}^2$

. Decibel (dB): logarithmic unit to measure intensity relative to a reference level

. Reference level: threshold of hearing (TOH), $P_0 := 10^{-12}\ \mathrm{W}/\mathrm{m}^2$

. Intensity $P_1$ measured in dB: $\mathrm{dB}(P_1) := 10 \cdot \log_{10}\left(\dfrac{P_1}{P_0}\right)$

. Examples:
  $P_1 = 10 \cdot P_0$ → $P_1$ has a sound level of 10 dB
  $P_2 = 100 \cdot P_0$ → $P_2$ has a sound level of 20 dB

Dynamics

Source                          Intensity (W/m²)  Intensity level  × TOH
Threshold of hearing (TOH)      10^-12            0 dB             10^0
Whisper                         10^-10            20 dB            10^2
Pianissimo                      10^-8             40 dB            10^4
Normal conversation             10^-6             60 dB            10^6
Fortissimo                      10^-2             100 dB           10^10
Threshold of pain               10                130 dB           10^13
Jet take-off                    10^2              140 dB           10^14
Instant perforation of eardrum  10^4              160 dB           10^16
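
The decibel definition can be checked numerically with a minimal sketch. The constant name `TOH` and the function names are hypothetical; the reference value is the threshold of hearing of 10^-12 W/m² given above.

```python
import math

TOH = 1e-12  # threshold of hearing in W/m^2, used as the reference level


def intensity_to_db(intensity, reference=TOH):
    """Intensity level in dB relative to the reference intensity."""
    return 10 * math.log10(intensity / reference)


def db_to_intensity(level_db, reference=TOH):
    """Inverse mapping: intensity in W/m^2 for a level given in dB."""
    return reference * 10 ** (level_db / 10)


for name, intensity in [("whisper", 1e-10), ("normal conversation", 1e-6), ("fortissimo", 1e-2)]:
    print(name, round(intensity_to_db(intensity)), "dB")
```

Each factor of 10 in intensity adds 10 dB, which is why a normal conversation (10^6 times the threshold of hearing) comes out at 60 dB.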

[Figure: two waveforms with upper envelope and lower envelope; axes: time, amplitude]

Dynamics

[Figure: ADSR amplitude envelope (attack A, decay D, sustain S, release R) between "key pressed" and "key released"; axis: amplitude]

Loudness

[Figure: equal-loudness contours (phon), shown from 0 phon to 120 phon in steps of 20 phon, with the threshold of hearing and the threshold of pain marked; axes: frequency in Hz (20 to 10000), intensity in dB (0 to 120)]

Timbre

. Quality of musical sound that distinguishes different types of sound production such as voices or instruments

. Tone quality

. Tone color

. Depends on energy distribution in harmonics

All instruments play the same note C4 (261.6 Hz)

[Figure: time-frequency plots for different instruments (piano, violin, flute); axes: time in seconds, frequency in Hz]

. Vibrato: frequency modulation

. Tremolo: amplitude modulation

Digitization

. Conversion of continuous-time (analog) signal into a discrete signal

. Sampling (discretization of time axis)

. Quantization (discretization of amplitudes)

Examples:

. Audio CD: 44100 Hz sampling rate, 16 bits (65536 values) used for quantization

. Telephone: 8000 Hz sampling rate, 8 bits (256 values) used for quantization

Music Representations

[Figure: overview of audio representations (physical time, time domain), symbolic representations (musical time), and sheet music representations (image domain); conversions labeled transcription, synthesis / performance, rendering, and OMR]
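
The two digitization steps, sampling and quantization, can be sketched as follows. The sketch samples a sinusoid at a given sampling rate and applies a uniform quantizer to amplitudes in [-1, 1]; the specific quantizer (equal-width cells, reconstruction at the cell midpoint) and the function names are assumptions made for illustration, not the scheme of any particular audio format.

```python
import math


def sample_sinusoid(frequency_hz, duration_s, sampling_rate_hz):
    """Sampling: evaluate a unit-amplitude sinusoid at times k / sampling_rate."""
    n = int(round(duration_s * sampling_rate_hz))
    return [math.sin(2 * math.pi * frequency_hz * k / sampling_rate_hz) for k in range(n)]


def quantize(x, bits):
    """Quantization: map x in [-1, 1] to the midpoint of one of 2**bits cells."""
    levels = 2 ** bits
    step = 2.0 / levels
    x = max(-1.0, min(1.0, x))                       # clip to the valid range
    index = min(levels - 1, int((x + 1.0) / step))   # cell index 0 .. levels - 1
    return -1.0 + (index + 0.5) * step


telephone = sample_sinusoid(440, 1.0, 8000)
quantized = [quantize(x, 8) for x in telephone]
print(len(telephone), "samples,", len(set(quantized)), "distinct amplitude values")
```

With 8 bits, at most 256 distinct amplitude values occur and the quantization error per sample is bounded by half a cell width, here 1/256.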