Techniques to Foster Drum Machine Expressivity
Je A. Bilmes
Perceptual Computing Group
MIT Media Lab oratory
Massachusetts Institute of Technology
Cambridge, MA 02139
Abstract
Electronic drum machines need algorithms to help them pro duce \expressive-
sounding" rhythmic phrases. In [Bilmes, 1992], I claim that three p erceptually separate ele-
ments characterize p ercussiverhythm: metric content, temp o variation, and deviations (for-
merly called event-shifts). Herein, I demonstrate that algorithms based on this mo del may
considerably facilitate repro duction of expressiverhythm. I describ e one such algorithm
which extracts the separate elements from a p ercussive p erformance. The p erformance is
then resynthesized with varying degrees of temp o variation and deviations. Without the de-
viations, the p erformance sounds mechanical. With them it sounds rich and alive. Conse-
quently, I claim that we should b egin a concentrated study on the separate elements char-
acterizing p ercussiverhythm, particularly deviations. To this e ect, development has b e-
gun on a graphical drum machine program with which deviations may b e explored.
2 Rhythm 1 Intro duction
There is no doubt that music devoid of b oth har- To err is human. Yet most users of drum machines
monyandmelodymay still contain considerable and music sequencers strive to eliminate \errs"
expression. Percussivemusic is a case in p oint, in musical p erformance. In fact, some computer
1
as anyone who has truly enjoyed traditional mu- musicians never turn o the quantize option, de-
sic from Africa, India, or Central or South Amer- stroying forever \ aws that make the p erformance
ica knows. sound sloppy." At the same time, other computer
Unsuitable for p ercussivemusic however, musicians complain ab out the mechanical quality
previous representations of expressive timing of computer music. They call for the development
([Clines, 1977], [Ja e, 1985], [Schloss, 1985], of techniques whichwould enable computers to
[Repp, 1990], [Wessel et al., 1991], [Anderson and sound b etter, i.e. more \human."
Kuivila, 1991], [Anderson and Bilmes, 1991], and There are two orthogonal criteria of p er-
[Desain and Honing, 1992]) can all (with the ex- formance. The rst is sheer technical pro ciency.
ception of [Desain and Honing, 1992]) b e reduced Clearly, computers have long surpassed humans
to temp o variation. on this axis. The other is expressivity, something
In [Bilmes, 1992], I intro duce a new mo del more elusive, something that gives music its emo-
of rhythmic expressivity. Sp eci cally, I state that tion, its feeling, its joy and sorrow, and its human-
b eat-based rhythm can b e characterized bythree ity. Music exudes humanity; computer music ex-
comp onents: metric structure, temp o variation, udes uniformity. This, I strive to eliminate.
and deviations (formerly called event-shift mo d-
Author's current address: Computer Science Division,
els). In this pap er, I argue that deviations are
EECS Department, U.C. Berkeley, Berkeley, CA 94720.
most imp ortant for p ercussive and non-Western
1
I use \computer musician" to refer to anyone who uses
music, and that they are indisp ensable study for
a computer to create music, and \computer music" to re-
any drum machine architect wishing to create an
fer to music created thereof.
expressive sounding pro duct.
When we listen to or p erform music, weof-
ten p erceive a high frequency pulse, frequently a
binary, trinary, or quaternary sub division of the
2
musical tactus . What do es it mean to p erceive
3
tatum . this pulse, or, as I will call it, 2000
. erceiving the tatum do es not necessarily
P .
king in the mind, likea clock. imply a conscious tic .
354
Often, it is an unconscious and intuitive pulse
Tatums/Minute
that can b e broughtinto the foreground of one's
12345678
twhenneeded.Perceiving the tatum im- though Tatum Num
(A)
plies that the listener or p erformer is judging and
ticipating musical events with resp ect to a high an +15%
frequency pulse. It is a natural p erception, p er- 0%
haps similar to the illusory contour in the well
-15%
known picture on the frontcover of Marr's b o ok
Deviation: % of current tatum duration. [Marr, 1982].
12345678
ays explicitly stated in The tatum is not alw Tatum Num
(B)
usic. How, then, is it implied? The
a piece of m Tempo = 300 tatums/minute.
tatum is the lowest level of the metric musical hi-
erarchy. Often, it is de ned by the smallest time
Figure 1: EquivalentTemp o Variation and Devi-
interval b etween successive notes in a rhythmic
ation Representations
phrase. For example, two sixteenth notes followed
by eighth notes would probably create a sixteenth
note tatum. Other times, however, the tatum is
memb ers of the ensemble slightly adjusting their
not as apparent; then, it might b est b e describ ed
own p ersonalized temp o. Furthermore, the temp o
as that time division which most highly coincides
change needed to represent deviations in a typi-
with all note onsets.
cal p erformance would b e at an unnaturally high
The tatum provides a useful means of de n-
frequency and high amplitude. Imagine, at each
ing twocomponents of b eat-based rhythm, temp o
successive tatum, varying the temp o b etween 2000
variation and deviations. Tatums pass byata
and 354 tatums p er minute (Figure 1A). Perceptu-
certain rate, and may b e measured in tatums p er
ally, this seems quite imp ossible. However it seems
minute. Temp o variation may b e expressed as
quite reasonable to assume that a p erson could, at
tatum duration (in seconds) as a function of tatum
a constant temp o of 300 tatums p er minute, play
numb er. Similarly, deviations may b e expressed
every other note 15 p ercent of a tatum early (Fig-
as deviation (in seconds) as a function of tatum
ure 1B). In other words, although they mightbe
numb er. That is, a deviation function determines
mathematically equivalent, temp o variation and
the amountoftimethatanevent metrically falling
deviations are di erent in more imp ortantways {
on a particular tatum should b e shifted when p er-
they are distinct functionally and conceptually.
formed. Note that temp o variation is de ned p er
The previous paragraph suggests that there
ensemble, whereas deviations are de ned p er p er-
must b e some (p er p erson) upp er limit on temp o
former.
oscillation frequency. That is, any p erformance
A common question asked is, if we assume
variation not accounted for by temp o variation b e-
temp o variation is also p er p erson, why not just
cause of its high frequency must b e owing to \de-
use temp o variation to represent deviations in a
viations." The following algorithm was develop ed
p erformance? That is, is there mathematically
with this assumption in mind.
any di erence b etween temp o variation and devia-
tions? The answer is no, there is no mathematical
di erence. Either can represent p erformance tim-
3 Timing Extraction Algo-
ing. There is, however, a p erceptual di erence.
When listening to or playing in a drum or
rithm
jazz ensemble, there are times when the temp o
In [Bilmes, 1992], I p oint out the necessityofan
is considered constant, even though memb ers are
algorithm that extracts deviations from a p erfor-
playing o the b eat. Notice, the concept of b e-
mance. Presented herein is one that extracts the
ing \o the b eat" suggests that there is deviation
quantized score, the temp o variation, and the de-
from some temp o generally followed by the ensem-
viations. The input to the algorithm is a list of
ble. There is no concept, however, of individual
attack times.
2
The stroke of the hand or baton in conducting, or, of-
The algorithm is given complete metric
ten, the main quarter note b eat
3
knowledge of the p erformer. That is, it knows the
When I asked Barry Verco e if this concept had a term,
he felicitously replied \Not until now. Call it temporal
time signature, the numb er of tatums p er b eat,
atom,ortatom." So, in honor of Art Tatum, whose tatum
the numb er of b eats p er measure, and where the
was faster than all others, I chose the word tatum.
in time, the reference instrumentinformstheen- b eginning of the measure is (the answer to \where
semble what the temp o is. The p erformance in- is one?"). There are twoversions of the algorithm;
strument, dep ending on whether it is dominant one is primarily for p ercussivemusic, the other is
in the ensemble (such as a lead drum), controls slightly more general.
when the temp o sp eeds up and slows down. That Version I of the algorithm requires a rep et-
is, wesay the reference instrument de nes the itive reference instrument (such as a b ell, a clave,
temp o, and the p erformance instrument controls or a bass). The reference instrumentisusedto
the temp o. Therefore, when obtaining the timing extract temp o and must rep eatedly play a known
of a dominant p erformance instrument, welook pattern. The pattern p erio d mustbeaninteger
slightly ahead, and compute multiple of the measure duration. Percussivemu-
sic normally contains such an instrument, so this
n+C
X
1
is not an unreasonable requirement. The algo-
0
T [n]= T [k ];
rithm pro duces the expressive timing of a perfor-
C +1
k =n
mance instrument relative to the reference instru-
where C is a parameter determining howfarinto
ment. In an ensemble, any instrument other than
the future we should lo ok. C dep ends on the p er-
the reference instrumentmay b e considered a p er-
formance instrument, and could equal zero; ac-
formance instrument.
cordingly, T [i]mayormaynotbeananticipatory
The algorithm rst computes a temp o func-
measure duration estimate.
tion using the reference instrument. The temp o
Creating a continuous time function, we
function is then transformed into a tatum dura-
4
tion function { tatum duration as a function of next linearly interp olate
tatum number. The tatum function determines
t x[n]
a normalized metric grid; i.e. a time grid spaced
; D (t)=D [n]+(D [n +1] D [n])
x[n +1] x[n]
so that grid markers determine the time p oints
of each tatum. The metric grid is then used to
where
judge the p erformance instrument. For each p er-
formance instrumentattack, the deviation is its
n = fn : x[n] t distance to the nearest grid marker. It follows that D (t) is an estimate, at time t,of Let L b e the numb er of tatums p er measure, the measure duration. R b e the numb er of reference instrumentattacks th Clearly, D (t) increases as temp o decreases, p er measure, x[n]bethe n reference instrument th and 1=D (t) decreases as the temp o decreases. We attack time, y [n]bethen p erformance instru- want the temp o-normalized time p oints. There- ment attack time, and let z [n]= x[n R]beour th fore, for each measure, we nd the time p oints that estimate of the starting time of the n measure divide the area under 1=D (t)into L equal area re- (if reference instrument attacks do not fall on the gions. The time p oints provide a tatum time func- measure starting p oints, weinterp olate, add en- tion, a function that gives the time p oint for each tries to x[n], and pretend that it do es). y [0] must tatum. lie past the rst measure's starting p oint. So, for each measure m,0 m