
Music from the Inside: Emotional Expression and the Understanding Listener

Belinda Marie Prakhoff

Submitted in total fulfilment of the requirements of the degree of Doctor of Philosophy

December 2013

School of Historical and Philosophical Studies The University of Melbourne

Produced on archival quality paper


Abstract

My thesis examines the view that training in musical analysis is essential for musical understanding. I defend the untrained lover of music against the formalist charge that their understanding of music must be inadequate. Following Stephen Davies’s account of understanding “by degrees”, my account emphasises music as heard rather than analysed. I construct a case against Eduard Hanslick’s views, which I argue have contributed to more recent formalist accounts that depend upon the flawed theory-dependence of observation thesis and a de-emphasis on expressive properties of music. In response to such formalist views, my account takes the expressive properties of music and places them at the centre of our musical understanding, arguing that untrained listeners are equipped to understand them through a combination of hardwiring and cultural immersion rather than formal study. I discuss aspects of modularity theory (with particular emphasis on its support for theory-neutral observation) to defend this claim, arguing as a part of this defence that music expresses basic emotions only. I also confront accusations of circularity made in the past against any account proposing that understanding expression is necessary to musical understanding. These accusations draw upon the subjectivity inherent in the response-dependent nature of musical expressive properties. I argue, with reference to Philip Pettit’s “ethocentric” conception of response-dependence, that this subjectivity is not so extreme as to lead to the kind of vacuous circularity such accusations assume. I conclude by sharpening the distinction, not always clearly drawn, between understanding and appreciation. I argue that it is appreciation, rather than understanding, that comes in “degrees”, and that the background knowledge and training previously thought to be essential to understanding are better placed within appreciation. I also deploy the Aristotelean notion of three kinds of friendship to argue that our conception of appreciation should be re-weighted to include the quality of our relationship with the work to provide a more inclusive model.


Declaration

This is to certify that:

i) The thesis comprises only my original work towards the PhD;

ii) Due acknowledgement has been made in the text to all other material used;

iii) The thesis is fewer than 100 000 words in length, exclusive of tables, maps, bibliographies and appendices.

Signed:

Belinda Prakhoff.


Acknowledgements

There are many people I would like to thank for their help in my completion of this thesis.

First of all, my supervisor, Karen Jones, whose patience and determination have been beyond measure throughout my candidature. Thanks too to John Armstrong, whose supervision in the early stages of my research was inspirational. The members of Karen’s research group, particularly Sam Gates-Scovelle, Trevor Pisciotta, Katinka Morton, Judy Chambers and Phillip Dragic, gave me valuable feedback and support, as did Monte Pemberton. I am also very grateful to Stephen Davies and Christopher Cordner for their advice and discussion regarding aspects of my work.

To my colleague and friend Clare McCausland, thanks are due not only for her proof-reading and comments but also for her support (via innumerable emails and coffees) throughout her own candidature and mine. I also thank my cousin and ex-housemate Olivia Watchman; my singing teachers Patricia Sage, Adrienne Dugger, Raymond Connell and Tania Ferris; my understanding employers at The University of Melbourne; and my parents, Andrew and Brenda Paterson, whose faith in me has been constant throughout. This work would not have been completed without their support.

Special thanks and love are reserved for my husband Rick Prakhoff, whose tolerance, understanding, vast musical knowledge and sensational cooking have all been essential to the writing process.

And finally, thanks are also due to my analysis lecturer, who could never have imagined where her one remark might lead.

Belinda Prakhoff

December 2013.


Table of Contents

Abstract
Declaration
Acknowledgements
Introduction: “But what are you hearing?”
    1. Motivation
    2. Methodology and caveats
    3. Understanding, Appreciation and Relationships
    4. Chapter by chapter outline
Chapter One: Overview
    1. Central questions
    2. Formalism: On the Musically Beautiful – Eduard Hanslick (1891)
    3. Emotivism: a short survey
Chapter Two: Analysis and Understanding
    1. The traditional view
    2. Two objections
Chapter Three: Experience and Understanding
    1. Expression, circularity and response-dependence
    2. Understanding by degrees
    3. Experiential formal meaning
    4. Expression as musical structure
Chapter Four: Kinds of Emotions
    1. Basic emotions
    2. Some objections
Chapter Five: Expression
    1. Basic emotions and music
    2. Cross-cultural basic emotions in music
    3. Experimental evidence and the problem of cultural distortion
    4. Basic emotions: three or six?
    5. Summary and conclusions
Chapter Six: Response
    1. Being moved
    2. The emotional response
    3. The aesthetic response – is it an emotion?
Chapter Seven: Appreciation
    1. Appreciation
    2. Kinds of relationships
Conclusion: Music from the inside
References


Introduction: “But what are you hearing?”

1. Motivation

I’ll begin with the motivation behind this thesis. Many years ago, a music analysis lecturer in my Masters course responded to my confession that I had difficulty analysing one of the set pieces by exclaiming in a bewildered tone: “but what are you hearing, then?” She was unable to conceive that it might be possible for me, by then an experienced performer, to hear the structures described under analysis under any other description; that is, she thought that if I did not have the analytical concepts at my disposal, then I would be unable to perceive those structures in the music itself. In her view, this would mean I was therefore unable to understand the music I had listened to, performed, and loved for as long as I could remember.

This was, to say the least, distressing. I had long been suspicious of the theoretical foundation of this particular analysis subject in any case; the idea was that the analyses undertaken on the works would directly “inform” and improve our performances of them, thereby justifying the analysis itself by giving it a practical application. We were directed to perform our pieces in class and were assessed on how well our performances demonstrated this informed relationship between score and performance. To me, as a singer, this had seemed a wrong-footed approach from the start, as essential information for me tended towards (at least at first) a complete understanding of the text and its interaction with the score rather than, say, the significance of the chord inversion in bar five of the piano reduction. But it also seemed wrong-footed in some more fundamental ways for the instrumentalists in my class. It seemed to me that in attempting to “inform” our performances as we were directed to do, our interpretations were being more tightly restrained by the score; that is, that our performances were being tied to the page beyond the point of any practical value. This seemed more and more the case the more detailed the analyses became, especially in light of the fact that some of the structural scored details being unearthed were in any case inaudible in a performance of the work – that is, they were properties of the work as scored rather than properties of the work as performed. On the other hand, it became equally apparent that those students who could imaginatively interpret beyond the directions contained in the score seemed, to me at least, to produce performances of greater character and beauty. It seemed that the scores were being interpreted for performance through another set of rules and conventions that were not part of the skill set of score analysis.

Given these different interpretative conventions, I suspected that my lecturer had made some questionable ontological assumptions about the nature of the relationship between the work as performed and the work as scored. The connection did not appear to be as close as she was assuming it to be.

But aside from these observations was the original suggestion that I didn’t understand the music I was hearing and performing. It certainly didn’t feel that way to me; I wasn’t experiencing some sort of structureless amorphous mash of sound when I listened to music, as my lecturer’s comment implied must be the case. I could isolate and describe various structural features of the music when I listened to it, but I sometimes used everyday “folk” rather than formal terminology in these descriptions. Further to this, I couldn’t see how music could be so important in my life, both as a professional performer and as a music lover, if I simply didn’t understand it. Instrumentalist colleagues who were better versed in formal analysis, noticing my discomfort, reassuringly pointed out that it was all “just another language”; just another way, that is, of describing the same musical structures.

If that were the case, I thought, then why tie just this one other language so tightly to the structures it describes that our very perception of musical structure falls under its concepts, let alone our understanding?

This account of theory-driven musical understanding also had more far-reaching implications. Most listeners, after all, are not trained musicologists. If it were the case that training in analysis is required to “properly” understand music, then the vast majority of listeners and music lovers are enjoying music that they don’t understand (or, on this rather extreme account, can’t even hear). Again, it seemed to me that an analysis-based theory of understanding is impoverished if it cannot account for the experiences of these untrained but apparently appreciative listeners. However, this theory appeared to be the mainstream view within music education, at least at that time. Someone somewhere had missed the point about musical understanding, I thought, but I still couldn’t be sure that it wasn’t me.

This thesis, then, is an extended exercise in finding out exactly what that point is. It explores the idea of a theory of musical understanding that might encompass the very different experiences of both ordinary listeners and trained musicologists.

The obvious starting point for such an alternative account was to look at one of the most objective and accessible features of musical works: the emotion expressed by the music. It was obvious because most listeners within a particular culture, whether they are trained in analysis or not, will be able to identify the broad emotional colour of the music they are hearing: they can state whether it is, for example, happy, sad or angry relatively easily. This is of course to be carefully distinguished from the completely different question of how the music makes the listener feel; that is, how the listener responds to the music emotionally.

Rather, all the listener needs to do here is simply recognise whether the music sounds happy or sad. This is something that we could expect the majority of listeners (who are suitably well-versed in that music’s culture) to be able to do. More recently, there are indicators in the psychological literature that this ability may even be cross-cultural – that is, there are indicators that we can understand the expressed emotion in a piece from an unfamiliar culture¹. Perhaps, then, a theory that can explain how we understand this expression – and how music expresses in the first place – might be the simplest starting point for a more inclusive theory of musical understanding.

When I eventually found my way into the literature some years later, I discovered that this starting point was anything but simple. To begin with, on the topic of understanding it was reassuring to find that I wasn’t attacking a straw man: there are plenty of philosophers who hold variations of the view espoused by my lecturer (one of whom, Mark DeBellis (1995), argues with her that musical structure could not be heard without the corresponding musicological concepts being internalised first). On the other hand, there are plenty of philosophers who argue with me against that view. One of these, Stephen Davies, includes the experiences of non-musicologist listeners by arguing that it is possible to understand music “by degrees”. Moreover, he argues that ordinary everyday “folk” musicological terminology can be used to describe musical structures as evidence of understanding, rather than insisting that formal musicological terminology is essential for either perception or understanding². This is the view upon which I will base my own account, with some developments and some notable differences.

1 For example, Philip Ball (2010) cites the study undertaken on the African Mafa, who could reliably identify broad emotional colours in Western music examples above chance levels using facial expression to match musically expressed emotions (Fritz et al 2009). More recently, Sievers et al (2013) offer stronger support for “universal” emotion recognition through movement rather than facial expression. These studies will be discussed in detail in chapter five.

But, returning to my survey of the literature, it also became obvious that not only are there confusions surrounding what we mean by musical understanding, but also surrounding what we mean by appreciation. Often the two terms are used almost interchangeably.

Sometimes appreciation is coupled with aesthetic response; sometimes it isn’t. Appreciation can also be seen to depend upon a solid understanding of a musical work as a foundation, which can involve not just a musicological grasp (folk or formal) but also a strong grasp of background contextual knowledge about the work (when it was composed, how it compares to similar works of the same period, how this performance of the work compares to other interpretations, etc); or it can be simply seen as an advanced form of understanding, in the sense of a higher degree of understanding than can be achieved just by uneducated listening. Somewhere in the middle of all this is the question of whether or not the listener likes the work, which is seen as either a requirement for appreciation or as a separate question altogether. It is conceivable, after all, that a listener might recognise a work as a great piece of art without actually liking it or remaining interested in engaging with it. Then again, appreciation might encompass more than merely recognising a work’s status as art. It might also refer to the way that a work we understand, appreciate and like actually becomes an important part of our lives; a marker of who we are. Most music lovers could easily name several works that fit into this category. It seems, then, as if this is another confusion that needs sorting out. If I am going to explain musical understanding, I will also need to be clear on the differences (if there are any) between understanding, appreciation, and liking.

2 See Davies 2011: Musical Understandings, chapter 7; also in Davies 2006a.


The most tangled confusion of all, however, lies in the distinction between the way that we recognise which emotion the music is expressing and the way the music makes us feel in response. The confusion arises when this distinction is not recognised in discussions of musical expression. I argue that the confusion was at least partially caused by the arguments put forward in the late 1800s by Eduard Hanslick, a prominent music critic of the time. He argues that any emotions experienced by listeners have nothing whatsoever to do with the proper understanding and appreciation of that music and, moreover, that anyone who does think such emotions are important lowers listeners to the same state as mindless drug addicts, wallowing in emotion instead of intellectually engaging with the music (1891, p.59). While it seems clear that he is here referring to emotional responses to music rather than the emotions the music is expressive of, he nonetheless proceeds to conflate the two into an enduring and, as I will argue, insidious view that the emotional expressiveness of music is irrelevant to our understanding of it. This view, in addition to the influence of analysts such as Heinrich Schenker (1925), has led to an overemphasis on the score as a means of understanding (accessed via detailed analysis of sometimes inaudible scored structural features), rather than the listening experience. That is, the idea of understanding music became one of score analysis rather than experience, of reading rather than listening, and the idea of listening became one of sensual “mindless” experience.

The insidiousness of this kind of view lies, I will argue, in its underlying assumptions and their consequences. Perception, for example, was and sometimes still is regarded as a process that needs to be fed with concepts gained through rigorous study and training. This assertion stems from the theory-dependence of observation thesis (Churchland 1988; Fodor 1983; DeBellis 1995). In the case of musical perception, the thesis dictates that musical structures may be heard and understood only insofar as we understand their corresponding theoretical concepts. The theory-dependence of observation thesis, then, underpins my lecturer’s argument: you quite literally won’t even hear structural features with which you are theoretically unacquainted. The connection between adopting this argument and the denigration of emotional expression is this: once scored structural features are seen as central to any understanding of the music (which tends to be a consequence of adopting my lecturer’s view of understanding as dependent upon theoretical training, as discussed above), heard features of the music tend to fall by the wayside. While the theory-dependence of observation argument is not appealed to by Hanslick himself, and is not required for his argument against expression, it has been explicitly recruited by contemporary philosophers of music (such as DeBellis (1995)) to achieve this separation between scored and heard properties. And, while it is clearly the case that scored properties such as key, tempo, and so on will influence the emotional colour of the work, there is also a clear sense in which musical expression is accessed through listening “from the inside” rather than through analysis. I will explain this point further below.

Hanslick’s view and variations of it are known as formalism; opposing theorists are variously known as expression theorists, emotivists, or even (as I will discuss in the case of Peter Kivy (1990)) cognitivists, depending upon the kind of theory of expression they defend. For now, I will refer to philosophers who hold that expression is a relevant part of musical understanding as emotivists or emotion theorists. The Hanslickian argument above for the hierarchical separation of (scored) structural and (heard) expressive properties is fuelled by a two-pronged companion argument that aims to undermine the emotivist position. The first prong is this: if the emotivists are right and music does express emotion, and it is important to our understanding of music, then they need to explain how it does this. It is, after all, far from obvious how music expresses emotion. Music is not like a language in that it has no semantic content. Sad music doesn’t simply mean sad like the word ‘sad’ does (Davies 2010, p.23). So how does the music carry or convey emotional states to the listener? I will call this the “how” question in what follows. Emotivists, as I will argue, need to provide a compelling response to this question in order to defeat the formalists.

The second prong of the Hanslickian companion argument is this: emotivists also need to provide an explanation of why the emotions expressed by music seem to be restricted to a small group of very broad emotional categories. Music without text seems to struggle to express complex emotional states such as jealousy, Schadenfreude or pride; but simpler emotional states such as joy, sadness or anger are much more easily identifiable by the listener. Moreover, it can sometimes be difficult to determine exactly what degree of emotion is being expressed by the music within each emotional category. We can tell the music is sad, but it can be hard to say whether it expresses grief, misery, or just everyday melancholia. So why is this the case? And, as Hanslick asks, how can emotional expression be relevant to musical understanding if sometimes the emotion being expressed by the music is only broadly identifiable (if at all)?

Both prongs of this argument have presented emotivists with major difficulties, and their Hanslickian opponents have consequently highlighted these difficulties as serious flaws in expression theory. The argument points back to Hanslick’s original assertion that emotional expression is irrelevant to musical understanding, an assertion he extends into the second prong of the argument above: if music cannot express individual emotions, then it cannot express emotions at all (Hanslick 1891, trans. Payzant 1986 p.8). We are left, then, with a polarised debate between the Hanslickians on the one side and the emotivists on the other. My thesis is therefore going to explore this polarisation with an initial aim of exposing any foundational misconceptions on either side of the argument. This exploration will then develop into an examination of what happens to the debate once these misconceptions are addressed.

An example of my approach is as follows. In response to this two-pronged argument, and in an attempt to answer the “how” question it poses in its first prong, what I want to ask is this: what happens to the debate once we start taking the focus of investigation away from the music itself, which I think has contributed to the emphasis on scored properties and analysis, and placing it on the listener instead? In the past, attempts to explain how music expresses have centred on either the musical structure and identifying which of its properties might be doing the expressive work in some symbolic sense (for example, Cooke 1959); or on the emotions being expressed, on the assumption that there are actual emotions being conveyed to the listener (for example, arousalist theories like Matravers 1998). Perhaps we should turn this on its head; perhaps the way we understand musical expression can tell us something about how music expresses and what it expresses.


While it is important not to confuse the process of expression with the cognitive mechanisms behind it (as I will discuss in chapter five, this is a distinction infrequently acknowledged, particularly in the psychological literature in the area), I will argue that moving the focus to the listener in this way can provide a better explanation of the role of musical expression in understanding music than that offered by the traditional, score-based view. Again, I agree with Davies (2003, 2011) that the key here, even while we focus on the listener in the way I suggest, is not to make the listener some kind of vessel for the music’s “real” emotions. Many past accounts, as Davies points out, tend to assume that the music is conveying what he calls genuine “occurrent” emotional states; that is, “real” emotions occurring in real time either being transmitted from music to listener, or being aroused in the listener in some way. This view not only leans towards a conflation between emotions expressed by the music and emotions felt in response by the listener, but also assumes that we always and only understand emotions as they occur in ourselves and in others. This is, however, not the case: we can and frequently do understand emotional expression without having to assume the emotions being expressed are simultaneously being felt. We can, that is, see a smiling face as happy even if the face’s owner isn’t happy at that moment, or the droopy face of a Basset hound as sad even though that Basset hound isn’t sad at that moment (Davies 2003, 2011; Kivy 1989). I’ll spend more time on this point in chapter one.

The above marks the beginning of my response to the first prong of the Hanslickian argument. I will also argue that my approach can provide an answer to the second prong, regarding the unusually broad character of the emotions being expressed by music. Again, I will argue that if we take the focus away from the music itself and look at the nature of the emotions we experience it as expressing, then this provides a clear explanation for the way that music seems to express only very broad categories of emotions. Hanslick, as I mentioned above, bases his argument on this point on a very firm view of the nature of the emotions. He argues that all emotions are identified by their propositional objects, and that every emotion must therefore have a thought or belief at its core. I will argue that once we examine the kinds of emotions expressible in music, it emerges that Hanslick is mistaken in his views about the nature of all emotions, and therefore mistaken in the conclusions he draws about musical expression based on these views. Not all emotions, as it turns out, have the kind of propositional core that Hanslick believes and requires them to have; some are more reflexive, instantly recognisable and broad in character. Once this is established, it emerges that these emotions tend to be the ones expressed in music³.

These responses to the two-pronged argument provide examples of the kind of stance I will be adopting in this thesis. But, as I have stated, I will also be delving a little further into the foundations of the debate. I will therefore, for instance, be examining the substantial controversy surrounding the arguments for the theory-dependence of observation itself. If, as I suggest above, the theory-dependence of observation is indeed a contributing factor in the separation between score and experience in musical understanding, I will argue that it is by no means as solid an hypothesis as it needs to be. This will, however, involve investing the discussion with a wider, more interdisciplinary scope. Given this, now is probably a good time to introduce my methodology in a little more detail. In the section to follow, I’ll outline my approach to the debate over musical expression and musical understanding.

2. Methodology and caveats

There is a surprising lack of consensus amongst philosophers on how some of the central concepts within the debate are to be defined and applied. It would be equally surprising if this lack of consensus were not behind some of the debate’s more enduring disagreements, such as the one concerning emotional expression. Given this situation, my aim is to prise apart the central concepts of appreciation and understanding with the intention of sharpening some of the distinctions in frequent use. While the two terms seem to be used almost interchangeably in everyday life, it seems clear to me that if they are indeed different processes, we need to sort out not only how they differ but also where they intersect. In addition, it seems clear that a definite view must be taken on the nature of the emotions themselves; it is after all inadvisable to attempt to construct a theory of musical expression, or a theory of understanding resting upon musical expression, if our conception of what is being expressed is misinformed.

3 I say “tend” because this argument is not without its own problems (see, for example, the objections Davies raises (2011)). I will explain in more detail in chapter five.


Aside from these basics, some of the foundational arguments employed or implied by my analysis lecturer and philosophers like DeBellis also need to be examined with an eye to whether or not they are up to the job of supporting such accounts. I will focus upon the “theory-dependence of observation” thesis to this end, arguing that it is not as strong a foundation as it needs to be. In addition there is the question of what, exactly, we mean when we talk about some of the basics of musical ontology contributing to the separation of scored and heard properties I discussed earlier. For example, when there is talk of musical structure (or at least when analysts talk of musical structure), this tends to mean the sort of structure recordable in a score rather than heard. Yet the music as scored does not share all of the same properties as the music as heard, as I indicated above; there also seems to be more to our understanding of musical expression than scored properties alone. The traditional emphasis on scored properties therefore seems to be worthy of examination if some of the most accessible heard properties of music (such as expressive properties) are being sidelined as a result.

This sidelining may even indicate, as Aaron Ridley suggests (1993), that there is a problem with our conception of musical structure itself. He argues that expressive properties of music are also structural properties, as opposed to content carried by the music. I will be discussing Ridley’s point briefly, as it offers some support for my stance against the scored/heard separation and against Hanslick more generally. But for the time being, I should clarify that on my account, the fact that we understand expression also means that we must be understanding other, more traditionally structural features of the music.

Expressive features, as Ridley argues, are part of musical structure but they play a critical role in our understanding of it. On his view and on mine, we understand expression along with the rest of the musical structure because scored musical structure in an expressive work simply doesn’t make sense to a listener without it⁴. This is why my proposed account acts as a defence of uneducated listeners: I argue that understanding expression in music is evidence of wider structural understanding. Understanding emotional expression, then, does not comprise musical understanding in and of itself.

4 It is important to be clear about what sort of sense this is: it is the “making sense of the structural whole” kind of sense, not the kind of semantic sense we mean in relation to language. I will discuss this further in chapter three.

This argument is part of the way that I will address the concerns expressed by Jerrold Levinson (1996) and Andrew Kania (2012), who believe that any account of musical understanding leading with expression is vulnerable to circularity. That is, we are not to define musical understanding in the following way:

a. Only understanding listeners can hear music’s expressive properties.

b. What counts as understanding?

c. The ability to hear expression in music.

This is not the definition I will defend, since it assumes that understanding emotional expression comprises musical understanding. Nor am I arguing that music has emotional properties because we understand them to be there, which Levinson also cautions against.

He thinks instead that “sensitivity to expressivity will come along with the rest” (1996, p.109; Kania 2012), which is also my view. That is, I am not assuming that expression is an effect of musical structure, as I suspect Levinson assumes, but a part of it. Part of the problem here is the response-dependent nature of these expressive properties. Being response-dependent, they are also inherently subjective in character (Pettit 1991), which lends them to the kinds of circular definitions that worry Levinson. I will address this matter in chapter three, and argue that Pettit’s “ethocentric” account of response-dependence is the key to countering Levinson’s concerns.

In summary, then, two of the formative aims of my thesis are, first, defining central concepts; and second, experimenting with the idea of leading with rather than devaluing the role of emotional expression in musical understanding. What characterises my argument, as opposed to the more traditional one, is how I go about constructing it. I will be adopting a wider, more “interdisciplinary” view of the area. There is nothing especially new about this per se, as plenty of philosophers (Davies, for example, and Paul Griffiths (1997)) have done the same. What is different about my approach is that the depth of my explorations will lead to a defence of some arguments that I think were too quickly dismissed in the past (for example, the role of “basic” emotions in musical expression, as I will explain below). The result, I will argue, is a path through the debate that highlights the advantages to be gained through incorporating relevant experimental evidence into the philosophical arguments regarding musical understanding. It will reveal the narrowness in the traditional approach, in that this alternative approach can account for the experiences of a far wider group of listeners and include the educated and uneducated alike, instead of offering an account for the educated alone.

I will therefore be choosing particular discussions from each of the key areas surrounding musical understanding, not as an overview of all of the important arguments in general, but as individual markers for where my own arguments sit within the debate. Through examining these marker points, I will be effectively outlining my own account by demonstrating where, how and why my opinions differ; I will be defining my account through what it does not support just as much as what it does. Constructing my argument by these means will therefore require forays not only into the philosophy of music, but also into the psychological literature on musical understanding; into emotion theory; and even into some areas in the philosophy of science (regarding the debate surrounding the theory-dependence of observation). These forays are all necessarily brief. But it will also emerge that the key to building a possible theory here might lie within cognitive science: specifically, in aspects of classical Fodorian modularity theory (Fodor 1983). For example, some sections of my argument rely upon only three of modularity theory’s foundational concepts (domain specificity, informational encapsulation and fast, mandatory response) paired with the idea of “basic emotions” from emotion theory; further than that, many of modularity theory’s finer points are not necessarily implied or required by my argument. But I think it is more than enough to show that there are significant gains to be made if the debate is reinvigorated outside of the terms of reference set by the more traditional philosophy of music/aesthetics-based discussions.

Two points need to be firmly established before proceeding, however. First, I am not about to claim that all musical understandings are somehow equal. My aim is not to argue that the average listener’s understanding of music is just as “valid” as the trained musicologist’s. Rather, my aim is to find a theory that does not exclude or deny the average listener’s understanding in the way that my lecturer inadvertently did, even given the fact that some listeners are going to be more talented and educated than others. I will argue that since the overwhelming majority of untrained listeners enjoy and value music, it seems unlikely that they do not perceive its major structural features or understand the way those features interact. It simply cannot be the case that these listeners mindlessly enjoy music in the same way that they enjoy their favourite ice-cream; the musical experience is more meaningful than that. The relationship they have with music is more meaningful than that. I will explain the crucial role of such relationships in the overview below.

Secondly, when I refer to “music”, unless otherwise specified I will be referring to Western classical music without text, or “music alone” (Kivy 1990). The reason for this, as argued by Kivy, is that text carries obvious semantic baggage that might tip off the listener about its emotional expression; I am interested in the music’s expressive properties rather than the texts’, when it has them5. I also acknowledge that some music within this genre (perhaps arguably) does not clearly express a particular emotional state. I’ll address this point further below at the close of the next section.

5 These assumptions about “music alone” and the expressive capabilities of music with text have more recently come into question (Ridley 2004; Davies 2011). I have some sympathy with the view that music with text can be part of these discussions, but I will refer to music alone unless otherwise specified when constructing my own account.

I can now move on to an overview of the argument in a little more detail. Having covered the motivation for the thesis and my methodology, in the next section I will introduce (or reintroduce, in some cases) the arguments themselves and outline how they fit together.

3. Understanding, Appreciation and Relationships

The discussion so far has revealed three foundational questions that need to be addressed by any account of musical understanding and appreciation. I will argue that answers to these questions require reassessment in the construction of a stronger defence of uneducated listeners. These questions are:

1. What defines musical understanding?

2. How is musical understanding distinguished from musical appreciation?

3. How does music express emotions?

The traditional answers, in my view, are either incomplete or inconsistent with empirical evidence. Questions one and two tend to be conflated, in the sense that understanding and appreciation are seen as two ways of describing the same process: the accumulation and application of formal theoretical knowledge. Question three is sometimes answered with a decided negative, in the sense that, against most listeners’ reports, music simply doesn’t express emotions at all; or, if it does, such emotional expression is an irrelevant side-effect of the intellectual business of musical understanding6.

I argue that such responses have been strongly influenced by Hanslick’s views. The reason why there is such disparity between the accepted analytical view of musical understanding and the experience of the majority of uneducated listeners (who are clearly understanding something about the music they are listening to, contrary to the accepted view), is the way that Hanslick and his formalist sympathisers seem to dismiss as equally irrelevant the emotions expressed by the music along with the emotions that he feels are “mindless” responses to the music. In so doing, Hanslick places understanding and emotional expression at opposite poles. Once I have outlined the problems with this view in chapters one and two, I will introduce my own account by adopting the opposite stance to Hanslick’s. By carefully distinguishing emotions expressed and emotions felt in response, that is, I can begin to argue that expressed emotions (at the very least) are actually an essential part of our musical understanding.

6 I will discuss examples of such views in chapters one, two and three. Examples include Hanslick (1891); DeBellis (1995); Zangwill (2004).

My argument stems from the following idea: if we are accounting for the understanding of the uneducated majority, and if one of the most accessible properties of music as heard is its emotional expression, then how we understand this emotional expression may tell us something about how to define wider musical understanding. The key to all three of the central questions above, I want to suggest, is discovering the kind of emotions that are musically expressible and the cognitive processes that recognise such expression. I will argue that, pace Hanslick, there are actually two kinds of emotions here: basic emotions “in” the music, which are reflexive or modular in production and do not draw upon beliefs or thoughts; and “higher level” cognitive emotions brought to the music by the listener that do. Basic emotions generally include very broad emotional states, such as sadness, joy, anger, and fear, although there is some controversy over exactly how many emotional states fit into this class (Ekman 1972). Higher-level emotions include more complex propositional states such as jealousy, nostalgia, and love.

The two key theories I will be working with are, accordingly, Paul Griffiths’ adaptation of basic or “affect program” emotions vs. higher-level emotions (1997), and Jerry Fodor’s “Modularity of Mind” theory (1983). My attitude to the use of these theories is experimental; I want to ask how much further they can stretch our theories of understanding, appreciation and expression. I will argue that the answer is: a long way. My fundamental hypothesis, given this proposed stance, is this: music is expressive of basic emotions only and we recognise this expression through the same modular processes by which we recognise the expression of basic emotions in other humans. There is considerable experimental evidence to support this hypothesis, although Davies (2010b and 2011) and Peter Goldie (2000) have questioned much of the pancultural evidence it requires for support. Their criticism is indeed a serious one: if species-wide hardwired modular processes are behind the recognition of musical emotional expression, then there should be experimental evidence to support this cross-culturally, as there is already such pancultural evidence for facial, vocal or bodily triggers for recognition of those same emotions. Davies argues that convincing pancultural evidence has not been obtained because it is essentially unobtainable (2010b & 2011). I will address this argument in detail in chapter five.

If, as I argue, I can overcome these criticisms, then not only can the resulting account admit the uneducated listener, it can also (as a beneficial side-effect) reinforce a theory of musical expression that was already a frontrunner: the contour theory (Davies 1980 & 2003; Kivy 1980). The contour theory asserts that music does not express occurrent emotions; rather, listeners are recognising the forms of these emotions’ expression as we do in other humans, animals, or even inanimate objects like trees (Davies 2003, chapter 11). This recognition takes place whether or not there are emotions occurring in the expressive being (human or dog) or object at the time. This recognition, then, turns on resemblance: we see a Basset hound’s face as sad because it looks like the face of a sad person; we see a drooping willow tree as sad because it resembles the bodily carriage of a sad person; and, the theory argues, we hear music as sad because it resembles, in some way, the expression of sadness in a person. The beneficial side effect of adopting basic emotions here is that it can ground the resemblance: if the recognition of basic emotions is modular, then this recognition might occur when triggered by music as it does when triggered by human voices, faces or movement. I argue that this side effect is symbiotically beneficial, in that the ease with which it enhances the contour theory provides further supporting evidence for basic emotions and modularity more generally.

So far, then, my hypothesis cements my proposed alignment between modular basic emotions and our understanding of musical expression and enhances the contour theory.

So what about appreciation? The key, for the time being at least, lies in the kinds of emotions involved. If, as I argue, our understanding of emotional expression in music is necessarily restricted to basic emotions, then by definition, any higher-level emotional responses must occur outside of the scope of this modular process. Such responses draw upon background knowledge and beliefs, or even simple associations. As such, they are informationally unencapsulated non-modular responses; they may incorporate what we know. Given this, I want to argue that higher-level emotions are aligned with appreciation. Such emotions are also likely to be felt in response rather than merely recognised at this level. We can respond emotionally to the work with higher-level emotions (such as nostalgia, to pick a common example) by drawing on previous experiences; and we can draw on what we know about the work (although such emotions are not necessarily part of being an appreciative listener: being nostalgic about a piece generally results from personal association rather than anything intrinsic to the work itself). What I mean by this is that the listener might feel nostalgia when the work itself might be expressing a different emotional state, such as happiness. Nostalgia is therefore attached to the work through its association with various events in the listener’s life that might trigger a feeling of nostalgia when hearing the work, rather than being present in the work itself. Sometimes, however, a sense of nostalgia may be experienced through degrees of sadness expressed in the music in combination with associated experiences of the listener. I will explain further in chapters six and seven.

Given this proposed alignment of kinds of emotions with understanding vs. appreciation, let’s return to the original three central questions and view them in light of the fundamental hypothesis above. Answering these three questions successfully through application of the hypothesis will not only clarify the distinction between understanding and appreciation, and offer further support for the contour theory, but also enable me to address my overarching aim: the construction of a more inclusive theory of musical understanding.

The answers I will be defending will be already apparent: in summary, understanding and appreciation are distinguished by the kinds of emotion inherent in each (modular-processed basic emotions in understanding and non-modular higher-level emotions in appreciation). The alignment of basic emotions with understanding enables me to argue that all listeners have a baseline understanding of the music of their culture, and very probably the emotional properties of the music of other cultures as well. The alignment of higher-level responses with appreciation enables me to argue that it is at this stage, and not at understanding, that theoretical knowledge may be drawn upon and personal associations made. And both of these points are reinforced by the answer to the third question: how music expresses in the first place. The answer is that the contour theory is the most accurate description of musical expression. By enhancing the contour theory with the idea of basic emotions and the modular processes behind their recognition, we can ground the resemblance behind the theory and hence strengthen it further.

My account therefore differs from Davies’ not just in my attitude to basic emotions but also in the clear distinction I draw between appreciation and understanding. In my view, it is appreciation, rather than understanding, that is better described as occurring “by degrees”. My clarification of the different kinds of emotions involved (and hence different cognitive processes) in understanding and appreciation supports the argument for this distinction. In addition to this is the observation that the traditional view of understanding and appreciation, with all of its fuzzy distinctions, also doesn’t ever really tell us how much knowledge is enough to “properly” understand or appreciate a work. There is a significant sense in which the goalposts are continually shifting towards the trained expert and no amount of knowledge is ever “enough”7. This does not mean that the accumulation of theoretical knowledge is a pointless exercise in the enhancement of our musical experiences. But this potentially infinite process, I argue, makes it impossible to distinguish between understanding and appreciation with any degree of rigour, with the result that both are seen to depend upon the attainment of formal theoretical knowledge. Instead, my account puts theoretical knowledge in its proper place: as a foundation for appreciation rather than for understanding. I argue that this reassignment both coheres more closely with the available empirical evidence regarding uneducated listeners and allows such listeners, through hardwiring and cultural immersion, to understand the works they love, and not merely in a trivial or “meaningless” sense of enjoyment. This understanding, I argue, forms the basis of their eventual regard for and appreciation of the work, which may be fuelled in turn by the accumulation of formal or even “folk” musicological knowledge.

I am not, then, simply setting out to argue that an uneducated listener really does understand a work. I am also setting out to offer an explanation as to the unquestionable meaning of their relationship with the works that they love arising out of that understanding. And I use the word “love” advisedly; in chapter seven, I develop a suggestion of Davies’ that appreciation involves a “species of love” for the music (Davies 2003, p. 201). I adapt Aristotle’s account of the three kinds of friendship (from Nicomachean Ethics, Book viii) as a means of demonstrating why distinguishing between the kinds of relationships we have with music is so important. I will argue that only one of the three kinds of relationships Aristotle defines applies to musical appreciation: the “character relationship”, in which the participants love each other for “their own sake” rather than for simple pleasure or utility. This is the only kind of relationship that would enable traditional aesthetic appreciation (in which a work is appreciated “for its own sake”), which adds weight to my assertion that such relationships normally stem from an aesthetic response to the work. Pleasure or utility relationships, I argue, are the kind assumed by Hanslick as grounds for evidence of non-understanding; his error was to assume these were the only kinds of relationships such listeners are capable of forming with music. Moreover, the analogy between relationships with music and relationships with people becomes even stronger in light of my earlier argument that the psychological mechanisms identified in musical understanding are the same as those that we use in relation to other people. This explains some of the sense of intimacy that may develop in our relationships with music. This, too, emerges as a further advantage for the contour theory of expression, as it might be the case that these same psychological mechanisms can underpin the resemblance upon which the theory turns.

7 This is a frustration shared by Nicholas Cook (1990). I discuss Cook’s views further in chapter seven.

One final point: as I mentioned earlier, leading a theory of understanding with emotional expression in this way raises the question of how such a theory would encompass our understanding of music that is not obviously expressive of any emotional state. It has been widely agreed (see, for example, Kivy 1990 and Budd 1985) that such music exists; and we can easily imagine that, for example, some of Bach’s drier works for harpsichord might sound to us to be more concerned with traditional structural features than with emotionally expressive ones. On my account, it should also be difficult to build a meaningful relationship with such works without the sense of intimacy gained through modular emotional recognition. Yet clearly listeners do seem to form such relationships with inexpressive works. So where does this fit into the account?

My first response is to point out that these works are by and large in the minority of cases. But my account can nonetheless encompass relationships formed with such works very easily. To begin with, our base-level understanding doesn’t just encompass emotional recognition. It also accommodates the recognition of other large-scale structural features. Our listening experience suggests that music does not suddenly become aural mush simply because it is not clearly expressive. Secondly, this less expressive music might inspire relationships weighted more towards higher degrees of formal appreciation than other, more expressive, pieces. What I mean by this is that we can value a work for its formal structure just as we can value it for its expressive properties; and I will not be arguing that all works will be equally accessible to all listeners. Just as some listeners will be better at listening than others, some works will require more formal study to appreciate than others. The beauty of my account, in its assertion that appreciation, rather than understanding, comes in degrees, is that it easily accommodates these differences.

This brings the overview section to a close. My thesis, then, is composed of a number of interweaving threads. I will be drawing together arguments about emotions, relationships and modularity over its course. In the next section, I will very briefly outline the arguments within each chapter. This is intended both as a reference point for the reader and as a further illustration of how my argument (and its defences) fits together.

4. Chapter by chapter outline

Given that I respond primarily to Hanslick throughout, chapter one (Overview) begins with a summary of his objections to emotivism in On the Musically Beautiful (1891). I will extract a number of these objections as a demonstration of the fundamental ways in which he has influenced the modern debate, especially in the idea that emotional expression is not important to musical understanding. It is this latter point, I will suggest, that lends weight to the assumption that scored (rather than heard) properties are what should be understood when we understand music. I will also argue that his account depends upon the separation between our rational and “animal” natures, and that this is, to say the least, not a strong foundation. In the second half of chapter one I will review the many kinds of emotivist theories available and outline some of their major problems, with an emphasis on their answer to the essential question of how music expresses. This question is essential because the lack of a convincing answer to it has so far enabled formalists to deny either the importance of musical expression in musical understanding, or the assertion that music expresses at all. I will also introduce the contour theory in more detail, with the aim of showing that its strength lies in the abandonment of occurrent emotional states as musical content.

Chapter two, Analysis and Understanding, is the first of two chapters exploring what we mean by musical understanding. In this chapter, I will examine the traditional musicologists’ view of understanding championed by my analysis lecturer - that is, the idea that perception of musical structures depends upon the internalisation by the listener of corresponding concepts in musical theory. This is the view I am setting out to challenge. The representative version of this view is Mark DeBellis’ (1995). I will discuss two objections to his account: first, I argue that the controversy surrounding the “theory-dependence of observation” thesis is a much bigger threat to his account than DeBellis allows; and, secondly, I argue that his account denies any kind of understanding to the listening majority. That this latter point is not supported, I will argue, is evidenced by the reports of uneducated listeners about their relationship with music, and by the way they can accurately describe what they are hearing. With regard to the former point, I will argue that Fodor’s modularity theory offers a viable alternative to the theory-dependence of observation in that it provides a strong model for reflexive theory-neutral perception. DeBellis’ criticisms of modularity theory are not as fatal as he claims. The purpose of this chapter is to dismantle the traditional view of understanding before I assemble my own, which will draw upon both Fodor’s defence of theory-neutral observation and upon the concept of domain-specific, encapsulated modular processes.

Chapter three, Experience and Understanding, outlines the key components of my alternative account. It has two main aims. The first is to address the general reluctance in the past to bring expressive properties into accounts of musical understanding. I will suggest that this stems from the fact that these properties are response-dependent, in the same way that we ascribe objective redness to an object because it provides us with subjective experiences of redness. Accordingly, I will clarify, with reference to Philip Pettit’s “ethocentric” conception of response-dependence (1991), whether response-dependence reduces the objectivity of expressive properties in music. I will also address the question of whether or not there is any vicious circularity involved in placing expressive properties front and centre in an account of musical understanding, as argued by Levinson (1996) and Kania (2012). I will argue that this circularity can be neutralised through Pettit’s ethocentrism enhanced by modularity theory, and also, in agreement with Aaron Ridley (1993), by arguing that expressive properties of music are structural properties.

The second aim is to detail the theoretical foundations of my account, in the light of my overall aim to accommodate the experiences of uneducated listeners more accurately. To this end, I will discuss Davies’ theory of “understanding by degrees” (2011), contrasting it with Jerrold Levinson’s sympathetic account of music “in the moment” (1997). These accounts of understanding will also contribute to my own, in that they both claim to reject theory-dependent listening and musical training as necessary to understanding. I will then discuss Constantijn Koopman and Stephen Davies’ account of “music from the inside” (2001), which is based on our listening experience providing access to the music’s “experiential meaning”. The combination of this account and Davies’ “understanding by degrees” will provide the basis of my own account of understanding.

Chapter four, Kinds of Emotions, looks at Griffiths’ arguments for basic emotions (1997). This chapter will therefore involve some extensive background forays into emotion theory. Griffiths points out that there is more experimental evidence for basic emotions than there is for any other “kind”; that the independence of basic emotions is such that they form their own natural kind; and that in asserting this we are not committed to Goldie’s “avocado pear” view of emotions, in which basic emotions form the “core” of higher-level emotional states (2000). I will outline Jesse Prinz’s (2004) theory as an example of such an “avocado pear” view. My defence of basic emotions, in combination with my defence of theory-neutral perception in chapter two, adds further weight to my commitment to modularity theory: modularity provides the strongest beginnings for an explanation of how perception can be theory neutral, and there is experimental evidence that basic emotions are modular in both production and recognition.

In chapter five, Expression, I will examine the controversial idea that basic emotions are the emotions expressed in music. There is considerable empirical evidence to suggest that at least some basic emotions are the only ones that can be reliably and objectively identified in music by listeners, either educated or uneducated. I will consider both Davies’ and Goldie’s objections to this suggestion. These objections turn mainly on the unreliability of the pancultural psychological evidence. Pancultural evidence of basic emotion recognition is essential as evidence of the hardwiring inherent in modularity theory; if basic emotions are indeed recognised through modular processes, then there ought to be evidence across all cultures to support this. Davies questions whether valid pancultural evidence will ever be attainable on two grounds: first, because psychologists in this area are unable to design their experiments adequately and tend to draw unsupported conclusions from their studies; and secondly, because of the cultural influence on musical expression (it may be the case that each culture effectively disguises the emotional expression in its own music by shaping that music’s structures). I argue that these objections are not fatal. Davies is right to question psychologists’ existing abilities, but this doesn’t indicate they will never design adequate experiments. I discuss the recent study by Beau Sievers et al. (2013), which shows how future research might avoid some of the objections Davies raises. More powerful are his later arguments about the nature of music itself preventing cross-cultural recognition, but I argue that he also seems to expect the listener to understand the cultural context of the emotions expressed before they may recognise those emotions. Slightly at odds with the rest of his view is Davies’ assertion that only three of the basic emotion set are expressed in music anyway (sadness, joy, and anger) and that the remaining basic emotions are inexpressible in music. I argue that in making this assertion, his concerns lie in defending his version of the contour theory rather than examining other factors that might determine which emotions composers choose to express through their music.

Chapter six, Response, will examine emotional response to music rather than recognition of emotional expression. I will distinguish between the simple “mirroring” of basic emotions due to emotional contagion, and the associative-type responses that rely upon “higher-level” unencapsulated emotions (such as nostalgia or the “our song” response). I aim to show that not only are such associative non-basic responses possible, but that there is also strong evidence for basic emotions being evoked in listeners due to mirroring the emotions being expressed in the music. To this end, I will briefly examine Peter Kivy’s argument to the contrary as an example case. Kivy (1990) has argued that music does not evoke “garden-variety” emotions; rather, he says, it simply “moves” us. I will examine psychologist Patrik Juslin’s objection to Kivy’s view (Juslin et al. 2010). Finally, I speculate that an initial aesthetic response might be distinguished from aesthetic appreciation, in that the initial response, like understanding, might be the formally uneducated starting point for an eventual appreciation of a piece. I close the chapter with an outline of the other kinds of emotional response to music, concentrating on one in particular: the feeling we have about music that is most like love (Ruby Meager 1958; Davies 2003).

Finally, in chapter seven, Appreciation, I pull together the threads of my account so far and describe the relationship between understanding and appreciation in greater detail. I begin the chapter with an overview of Davies’ account of appreciation, and I argue that Davies has overlooked the truly intuitive listener in his model. I also discuss the idea, introduced in chapter six, that we construct relationships with music. This is central to my account of appreciation, in that my argument was motivated by my observation that uneducated listeners can form lasting and meaningful relationships with works. I introduce this idea by reference to Aristotle’s three kinds of friendship: friendships of utility, pleasure, and “character” (in which the other is loved “for themselves”), and argue that relationships of character are indicative of a significant degree of appreciation of a piece of music “for itself”. I argue that the uneducated are not excluded from “character” relationships, just as they are not confined to utility or pleasure relationships only. I conclude that once we adopt this relationship-enhanced conception of appreciation, we can account for the relationships both educated and uneducated listeners have with music, allow for the input of musicological background knowledge and begin to describe, through character relationships, aesthetic appreciation.


Chapter One: Overview

This chapter provides an essential backdrop to the account of musical expression and musical understanding that I will assemble over the course of this thesis. In addition to summarising several complex discussions, it allows for some speculation as to the origin of some of the debate’s central premises. The impact of this analysis on the account of musical understanding and expression developed in this thesis will emerge over the next few chapters.

To begin with, however, I should start with some of the observations behind the whole argument about musical understanding to set some context for the chapter. I will briefly outline three such observations that might, at first glance, appear to be unrelated to this argument. First of all, music is the only art form that is inherently non-representational in any obvious sense, in that it is not “about” anything and does not refer to anything outside of itself. Added to this is the second, more complex point: it is also ontologically ambiguous or varied. Put as simply as possible, this means that it exists in different forms: as a score, a recording, or a performance. Each of these forms realises different properties of the work in terms of how we experience them, and how we conceive of the work as a piece of art.

Scores, for example, abstract the work out of the “real-time” of a performance and record it on paper in much the same way as a script of a play is recorded (that is, with instructions from the composer for realisation in performance with the expectation that a set of unwritten social and historical conventions will be applied by the performer in the interpretation of those instructions). A performer needs to be aware of both the meaning of the written instructions on the score and the accepted scope for their own interpretative input within the current interpretative convention (Davies 2011; Levinson 1990). Such scope is typically applied to the expressive aspects of the work; how the performer chooses to interpret instructions such as appassionato can greatly influence the work’s expressive impact. More unusual instructions, such as etwas ruhiger, aber trotzdem schwungvoll und enthusiastisch (“quieter, but still lively and enthusiastic”8) give even more expressive rope to the performer. Even tempo markings, which are standardised via metronome settings, are often nonetheless up for argument and may vary through each conductor’s and/or performer’s interpretation of a work. The point is that playing a piece of music using only the instructions as written – that is, without this interpretative input – often results in a formally correct but somewhat robotic performance. Due to the individual nature of such input, performances of a work will therefore have different properties to the same work’s scored incarnation. But this leads to further complications. Different performances of the same work might therefore have different properties as well, especially if we include errors in performance such as wrong notes. This raises immediate questions about the identity of the work, in that we need to establish when a performance of a work consisting entirely of wrong notes might cease to be a performance of that work9.

This situation adds to the ontological confusions about music in the following way. It means that a quick game of “locate the artwork” is, in the case of music as opposed to visual art or literature, particularly challenging (see also Davies 2006b, chapter 4; and 2003, chapter 2). It is hard to say, that is, what constitutes the musical artwork itself. While a piece of music can take the form of objects like a record or a score or even abstract tones, these are not the only forms in which it can exist. Each form also has properties not shared by the others. We can therefore argue that music as an artwork doesn’t (strictly speaking) always exist in locatable space, like a sculpture in a gallery or a book on a shelf might seem to do.

And even the forms in which music does exist are subject to potentially destabilising factors, such as performance convention (as we have seen above), which can also change over time. But even in the face of this destabilisation, things are not as grim as they sound.

There are other more robust identifying properties of a work that act to balance such shifting conventions. As Davies points out, one such identifying property might be the social and historical context in which the work was created, which constrains, for example, how we might interpret the work appropriately for performance, giving less scope for individual interpretation and hence less variation in expressive properties (2006b, p.81).

8 Richard Strauss’ Ariadne auf Naxos, bar no.114 in the Composer’s aria (p.83 piano/vocal score, Boosey & Hawkes).
9 This particular point (whether or not a performance qualifies as a performance of a work if it is full of errors) is discussed at some length by Levinson (1990).

Even given this potential solution, one of the major problems in locating a piece of music as an artwork is this: in other art forms (generally speaking), works can be identified by reference to their content, or what they are about. In “music alone”, as I claim, this is not possible. Music alone does not seem to refer to anything outside of itself, and in this sense at least, does not have any content. This, then, is the relationship between the first and second of the three contextual observations so far: music’s apparent lack of content adds to its ontological ambiguity, in that its content cannot be an identifying property10. The third observation can also be linked to the first two observations. It is this: it is objectively the case that some (even most) music is expressive of various emotional states, and that ordinary non-musicologist listeners are able to recognise and identify these emotional states (though not necessarily go on to experience these or other emotional states themselves)11.

These three observations are tied together as follows. The key to these ties, I want to suggest, is the expressive properties in music. To begin with, I think it is interesting that the properties falling between the cracks in the ontological discussion tend to be the emotionally expressive ones (many of the interpretative conventions observed by the performer, remember, are not always to be found in the score. These interpretative decisions, by definition, only influence the expressivity of the resulting performance of the work). I say “tend” because not all of these properties are emotional properties. Many of the remainder, however, are properties that neither play an identifying role nor influence how we experience the work itself. For example, while it may form a part of the historical or cultural context of the work’s creation, the work’s property of being composed on a Tuesday is normally not going to change how it sounds to us or add anything especially significant to its identity as a work. Neither is the property of its being composed in a leap year, or (in most cases) the paper chosen by the publishers for the first edition of the score. So are emotionally expressive properties a part of this body of more or less irrelevant properties? Where do they fit into the picture? I argue that they are actually essential to a satisfying, understanding listening experience of the music, yet not all are score-able; they may vary greatly from performance to performance, yet this does not change the identity of the work (within reason). I would add the observation that expressive properties are accessible to untrained listeners, yet these listeners may not be able to read music or perform it themselves. That is, I argue that the very expressive properties that fall through the ontological cracks in a piece of music are most accessible to the untrained listener.

10 Music with text, of course, is nominally “about” the subject of its text. However, see Ridley (2004) and Davies (2011) for discussion of the interaction of music and text in this context.
11 This view is shared by Levinson (1996; 1997); Davies (1994; 2006); and Budd (1985).

What I mean by forcing together the ambiguity of the work’s expressive properties and our untrained musical understanding so abruptly is this: musicologists (like my analysis lecturer) sometimes assume that the work’s “real” identity lies in its scored properties, and it is therefore these we must understand to “truly” understand the work. But I argue, and I will keep arguing throughout, that the accessibility of heard expressive properties in fact does not indicate that they are irrelevant to the “real” work’s understanding. There is nothing “secondary” about these heard properties; not only are they essential to the understanding of other major structural properties (in a way I will explain in more detail in chapter three), but their understanding is also evidence that other major structural properties are understood by the listener. These expressive properties cannot, that is, stand alone, independent of the structures around them, and yet be understood, as seems to be traditionally assumed.

1. Central questions

With this assertion, I have, then, begun to answer the first of three central questions regarding musical expression that have guided the debate so far. These are: first, given the ontological ambiguity described above, are these expressive properties really important to our understanding of music if some of them are only manifest in its performance (as I’m arguing they are)? Secondly, can these expressive properties play a role in identifying the work? And thirdly (and more broadly), how is something abstract, non-representational and clearly non-sentient able to be emotionally expressive in the first place?

Generally, traditional responses to these questions split into two main camps in the literature: the emotivists and the formalists. The formalists are on the negative side, arguing that emotional expression is, in essence, merely a side effect of music. They assume that music alone is contentless, and as such they do not need to explain how such music is expressive. Our understanding, on this account, comprises only the formal structural aspects of music; expressive properties are not structural properties. However, formalists need to explain how purely abstract musical tones can provide such profound and valuable listening experiences; how can music be this “meaningful” if it expresses nothing (Budd 1985, p.154)? The traditional emotivist solution, on the other hand, argues for the positive. Some emotivists argue that emotions are the content of music alone12. This solution makes emotionally expressive properties relevant to our understanding of music. Emotivists need to explain how this assertion might be understood, and in so doing address the question of how abstract musical tones can “carry” emotions as content in the way they propose (Budd 1985, p.153). However, I will argue that neither side has so far convincingly provided the required explanations13. In short, in the face of the evidence from our listening experience of musical expressivity, difficulties persist for both camps: the emotivists have difficulty explaining how music expresses emotions, while the formalists have difficulty explaining why they shouldn’t have to explain how music expresses emotions.

In this chapter I will be examining arguments on both sides, specifically regarding the question of whether emotional properties are important to understanding, and the equally divisive “how music expresses” question (that is, the first and third of the central questions above). This latter question is crucial because the lack of a convincing answer to it has so far enabled formalists to deny the importance of musical expression in musical understanding, or to argue that music simply does not express emotions at all. The second section will deal with a number of different responses to these questions from the emotivist camp, with particular emphasis on the (most promising, as I shall argue) contour theory. In the first section, however, I will look in some detail at the views of arch-formalist Eduard Hanslick (1891). I will argue that even though his account itself is incomplete and now archaic, when we extract some of its central premises we discover that almost all still feature prominently in the modern debate. I will list these premises and demonstrate that they are still accorded much more weight and influence than they deserve. Because of this influence, they have contributed to what can best be described as a bias towards the “work-as-scored” in constructing an account of musical understanding, leaving the evidence of our expressive listening experiences mostly unaccounted for.

12 For the time being, I am classing as “emotivist” any theory that characterises expressed emotions as part of our understanding of music as this is the key difference (in my view) between these theories and Hanslickian formalist theories. As I will explain shortly, not all emotivist theories claim that emotions are musical content, but all of them do need to explain how music is expressive. Different kinds of emotivist theories are discussed in overview in section 2 of this chapter.
13 The exception, as I mentioned in the introduction to this thesis, is the contour theory – but this theory does not propose that emotions are musical content. I will discuss the contour theory later in the chapter.

2. Formalism: On the Musically Beautiful – Eduard Hanslick (1891)

First of all, it is important to realise that Hanslick regarded On the Musically Beautiful as a “preliminary sketch or ground plan” for the new and complete aesthetics of music he planned to write in later years but never did (trans. Payzant 1986, p. xiii)14. It is, he says, the demolition work before the reconstruction; he concentrates on disagreeing with common views of the day about the place of emotions in our understanding of music rather than providing a fully-fledged alternative. In doing this demolition work, however, he nonetheless outlines his ideas of what constitutes beauty and how our understanding may allow us access to it. I will discuss this latter point below, but for now I’ll start with his dismantling of two popular ideas regarding our musical understanding: first, that it involves or comprises the emotions music arouses; and secondly, that it involves or comprises the emotions that music carries or represents. These two views, he says, “are similar in that both are false” (1986, p. 3). He devotes his opening two chapters to explaining why they are false. These chapters contain the substance of his negative view against the emotivists (1986, p. xvi).

14 All references are to the 1986 translation by Geoffrey Payzant (Hackett Publishing Company).


Hanslick sets up his argument by stating that, contrary to expectations, he is by no means going to argue that emotions are irrelevant to our experience of art; indeed, “the ultimate worth of the beautiful”, he allows, “is always based on the immediate manifestness of feeling” (1986, p.xxii). It is perfectly acceptable, he thinks, to talk of emotions being aroused by music; it’s just not what music is either for or about. The idea is to understand the beautiful in music in itself, which seems to refer to its scored structures, rather than in terms of the feelings we experience when we hear it (1986, p.1). In taking this position, he is setting up clear boundaries between our experience of the music and our (ideal) understanding of it. As such, the only sense in which we can talk about “content” in music is in the structure of the music itself – to Hanslick, form is not only form as traditionally understood, but content as well. He sees this as excluding any possibility that emotional expression may comprise such content. This view, however, emerges later in the work (1986, p. 29). For now, it is worth noting that one consequence of this view regarding structure is that only the musically educated can understand music, since, as he implies, the more accessible heard properties such as emotional expression are simply irrelevant to understanding (even if they are part of beauty’s “ultimate worth”).

He then proceeds to his arguments themselves. First, he examines the idea that the purpose of music is to arouse feelings in us. This he simply denies, stating that “beauty has no purpose at all”. Whilst we may experience feelings through the contemplation of beauty, “these have nothing to do with the beauty as such” (1986 p.3). In fact, in much the same way as trees falling unheard in a forest, beauty, says Hanslick, carries on being beautiful even if no-one is perceiving it (1986 p.3). Moreover, we don’t access beauty through feelings but through what Hanslick terms “pure” rational contemplation, as “imagination and not feeling is always the aesthetical authority” (1986 p.4). He is therefore setting out some of the major tenets of formalism here, which he sees as the only viable alternative to the popular ideas about music and the emotions. His approach at this point rests on his conviction that formalism is self-evidently the stronger view, given his conception of beauty as being “above” mere feeling.


Hanslick also argues that if our understanding of music is in fact based on the emotions (which it necessarily would be if the arousal of emotions were its purpose), then the relationship between music and the emotions would be less transient. As things stand, he says, our emotional interpretations of different pieces of music can vary dramatically according to our changing musical experiences, influenced as they are by changing social and educational contexts (1986 p.6). Our grandparents, for example, might have experienced a passionate emotional response to a work that seems banal to us these days (Hanslick points to Mozart’s works as an example of this – we might find a modern alternative in the impact of popular music in the 1930s and 40s). Moreover, the intensity of these musically-invoked emotions, he points out, does not make such emotions intrinsically valuable or meaningful in the context of musical understanding. Equally intense emotions can also be aroused by non-musical events such as winning the lottery or the terminal illness of a friend. So why, he asks, should we think the intensity of these emotions is profound just because they are aroused by music (1986 p.7)? Added to this, he says, is the lack of emotional consistency: the same piece of music can arouse different emotions on different occasions in the same listener, indicating that it is not just the music doing all of the arousing. Hanslick’s general point is that there is simply nothing special about the emotional aspect of our musical experience; nor is the connection between music and the emotions experienced adequately specific or reliable. As he says, “the effect of music upon feeling possesses neither the necessity nor the exclusiveness nor the constancy which a phenomenon would have to exhibit in order to be the basis of an aesthetical principle” (1986 p.7). This in itself is telling: to Hanslick, the point of the whole listening enterprise is to access musical beauty, and this is the measure of musical understanding.

This beauty has nothing to do with arousing feelings of any kind. Therefore feelings or emotions we experience have no impact on our musical understanding.

In some ways, this isn’t a startling position. While it might be argued that experiencing emotions in response to music may be a marker of understanding, it is not a requirement for it (Davies 2011, p.105). But if Hanslick has been thus far dismissing the idea that music arouses emotions, his second major argument in chapter two aims to demolish the view that emotions are the content of music; or, that “feelings are the content which music represents” or expresses (trans. Payzant 1986, p.8). Hanslick tackles the matter by invoking his conception of content and its relationship with the form of the artwork in the production of beauty. Beauty, he says, arises through the unity of a concept or idea embodied in the physical form of the work. This concept forms the basis of the work’s content, and, importantly, must be expressible in words; moreover, even though he acknowledges that there is “great diversity of content” in art, all content is reducible in this way (that is, “expressed in words and traced back to a concept”) (1986, p.8). He is referring to what the work represents or what it can be said to be about. “We say that this picture represents a flower girl”, he says, “this statue a gladiator, this poem one of Roland’s exploits” (1986, p.8).

It is obvious where Hanslick is taking this argument. In his view, there is no representational equivalent, no “aboutness”, in music. He acknowledges that music can indicate, by means of tempi, melodic line or key, a sense of vague feeling, or the “motion” in emotion; but we can never say one way or the other whether a particularly stirring musical passage is meant to represent the throes of hopeless love, or joyful love, or jealous rage. The emotions are certainly suggested by the music, but we have no idea which particular emotion is being suggested. All we have, says Hanslick, are “unspecific stirrings”, the “accompanying adjectives and never the noun” (1986 p.9). In addition to this point, and in direct contradiction to the popular views of his day, he also argues that the emotions are not vague entities suited to being musical content. Instead, they depend for their individuation, or their “conceptual essence”, upon thoughts and concepts: “the whole range of intelligible and rational thought, to which some people so readily oppose feeling” (1986 p.9). In essence, then, Hanslick is subscribing to a cognitive theory of the emotions, according to which all emotions have a concept or linguistically-mediated thought at their core. Music cannot contain such concepts; therefore, in his view, music cannot represent, contain, or be about specific emotions.

To illustrate this view, Hanslick points to the aria Che Faro Senza Euridice from Gluck’s opera Orphée. He argues that text with the opposite meaning (i.e. rather than grief at losing Euridice, joy at having found her again) would have no impact at all on the aria – the music serves just as well to highlight joy as it does grief (1986 p.17)15. Whilst it could be argued that this merely indicates that the aria is badly written, Hanslick highlights what he feels is a stronger demonstration of the same point. A number of pieces from Handel’s sacred work Messiah had originally been composed by Handel to set some “secular and mainly erotic” madrigal texts for the Princess Caroline of Hanover (Hanslick, trans. Payzant 1986, p.19). It is difficult to question the quality of Handel’s music, he says, so we can’t argue that this music is badly written; we are left with the conclusion that, in general, the same music can serve to fulfil completely opposing texts. This is because, says Hanslick, music can only ever provide the peripheries of emotions; it is rational thought, linguistically expressed, that forms the emotional conceptual core. This music cannot convey16.

This point is important because it highlights that to Hanslick, the emotions are not entirely irrational in nature, since they have a rational thought at their core. However, in the next few chapters of On the Musically Beautiful, Hanslick seems to insist on entirely the opposite view, and embarks on what can only be described as a tirade against emotivism based on the much more traditional opposition of emotions and rationality. In this tirade, I think that he inadvertently reveals his confusion, or (less charitably) bias about the nature of emotions upon which his account is based.

This confusion emerges at the end of chapter four, in which he discusses the “subjective impression of music” and tackles the question of where all this emotion in music originates. Given he has just established that emotions cannot be the content of music, he also rejects the idea that the composer’s emotions can be transferred by the music to the listener. He is careful to emphasise that whilst emotions are a necessary part of the process of creating or performing music, “emotion never becomes the subject of the work” (1986, p.46). Creative people, in other words, tend to be emotional types, but this is not what enables them to create, nor does their emotion find its way into the resultant composition.

15 See also Kivy’s discussion of this Gluck example (2001, chapter 3), in which he points out that Hanslick was also deeply inconsistent in his practice as a music critic. Whilst arguing that music was not able to clearly express emotion here, Hanslick regularly used extravagantly emotive terms throughout his musical criticism (Kivy 2001, p.40). My concern in this chapter is to establish Hanslick’s influence over the debate through On the Musically Beautiful rather than to dwell on his other inconsistencies; however, Kivy’s observation underlines a telling difference in Hanslick’s approaches to musical analysis and to listening.
16 This is also, of course, in direct opposition to my premise regarding the objectivity of musical expression. Kivy regards Hanslick’s position on this as “palpably false”, and self-evidently so (2001, p.96). I agree, as will be discussed later in this chapter and in chapter three.

He points to the example of women in defence of this view. They are, after all, emotional rather than rational creatures, yet “they have not amounted to much as composers” (1986, p.46)17. To Hanslick, then, it is plainly rationality alone that is the driving force in musical creation.

Hanslick moves on to consider the emotional state of the listener rather than the composer, and this is where his value system begins to emerge more clearly. It appears that in his view there are two ways of responding to music: an almost Pavlovian bodily response, resulting in the experience of vague but enjoyably intense feelings with no rational content; and an intellectualised act of “pure contemplation”, in which emotions are experienced not merely in a bodily sense but through the understanding of the music’s form as well (1986, p.57).

He devotes the rest of this chapter to an examination of the bodily sense of musical response, or “the intensive influence of music upon the nervous system” (1986, p.51).

Hanslick acknowledges that this influence is unquestionable, but he takes issue with how it is interpreted by physiologists and psychologists. He rejects completely the idea that we respond emotionally to music in the same unreflective way that a window frame will vibrate in response to certain frequencies because, he says, “this is not music” (1986, p.52).

Moreover, he argues that if our responses really were this reflexive, then composers could play upon our feelings as if they were a keyboard, producing exactly the same responses on every occasion regardless of context. This, he says, is very clearly not the case (1986, p.57).

For the same reasons, he argues that music therapy, a popular form of treatment for all sorts of ailments in his day, can’t work. “The bodily effect of music” he says, “is neither so strong, nor so reliable, nor so independent of physical and aesthetical preconditions, nor, finally, so manipulable at will that it could be a possibility as an effective cure” (1986, p.53).

17 To Hanslick’s credit, he does allude to the “circumstances in general” which may also have contributed to women’s exclusion from composition as a profession (1986 p.46).


What the problem reduces to, says Hanslick, is that physiology and psychology are unable to provide us with an explanation as to why certain musical features “act upon different nerves” in the way they do (i.e. why they produce various emotional states; why some keys sound “sad” to us, etc) (1986, p.55). Moreover, Hanslick argues that science never will be able to solve these mysteries. “Does physiology know how grief produces tears, how joy produces laughter? In fact it does not know what grief and joy are! Therefore let everyone take care not to seek from science explanations which it cannot give” (1986, p.55). We still need science to tell us which nerves respond to which auditory stimuli, but this, he says, will never tell us about music (which involves the rational mind), only about noise being processed by the body. The journey from nerve stimulus to conscious experience is to Hanslick inexplicable by definition: “All that lies on the other side of that mysterious divide which no investigator has crossed. There are paraphrases a thousandfold of this one ancient riddle: How the body is connected to the soul. This sphinx will never throw herself off her rock” (1986, p.56). Music, it seems, is in the domain of the non-physical soul and is therefore, by definition, out of the reach of scientific explanation.

How he tackles the question of listeners who are musically untrained yet profess to experience profound emotional states through music (chapter 5) is also revealing. This type of listener, whom I term the “ordinary” or uneducated listener, is the class of listener whose experiences I claim require explanation. Hanslick responds with a dismissal: such listeners understand nothing, or at least very little, about the music they are responding to.

“Slouched dozing in their chairs”, he hisses, “these enthusiasts allow themselves to brood and sway in response to the vibration of tones, instead of contemplating tones attentively … these people make up the most ‘appreciative’ audience and the ones most likely to bring music into disrepute” (1986, p.59). These listeners, “mindlessly at ease”, experience transports of pleasure at what Hanslick terms “the elemental in music”. The same effect would be produced by a warm bath as a symphony, such is the “effortless suppression of awareness” that music produces in them. They might as well, he adds, be using “ether or chloroform” instead (1986, p.59).


Hanslick is frustrated by these listeners experiencing music in this way whilst not, in his view, understanding it. He implies that they are degrading music by experiencing it at the bodily level only (that is, without the correct rational approach to allow for an aesthetic appreciation of the work). But to point this out to such listeners, he says with an embittered air, leads to one being labelled “cold. Unfeeling. Cerebral”, thereby indicating that they have as little understanding of his experience as he has of theirs (1986, p.69). It seems unlikely, for example, that these listeners would describe their experiences as “numbing”; rather, they would be able to label their emotive states just as articulately as Hanslick can his own. But the point for the time being is that Hanslick is interpreting the situation through the idea that music belongs to the lofty domain of the soul, and that the emotional listeners’ chief offence is that they are lowering music to the bodily level. Nowhere is this made clearer than in Hanslick’s declaration that:

The form (as tonal structure), as opposed to the feeling (as would-be content), is precisely the real content of the music, is the music itself, while the feeling produced can be called neither content nor form, but actual effect. In the same way, the supposed material, that-which-represents, is precisely that which is structured by mind, while what is allegedly that-which-is-represented, namely, the impression of feeling, inheres in the substratum of the tones and in large part conforms to physiological laws (trans. Payzant 1986, p.60).

Clearly, Hanslick’s arguments were greatly influenced by the attitudes of his day, particularly the Victorian-era fascination with the opposition between our animal and “civilised” natures. There seems to be a grid from which he is working in which rationality, musical form, the mind / soul, religion and men are on one side, and emotion / feelings, nature, the body and women are on the other (with science, at once rational yet “dealing in the physical”, uncomfortably straddling both). The rational side is clearly seen to be superior to the less controllable emotional side; the debate about musical aesthetics is just as clearly fuelled by the belief that the artistic value of music, which is spiritual in nature, cannot possibly be grounded in something as earthy, messy and bodily as the emotions. We have seen already that Hanslick separates reflexive feelings and emotions from the rational mind / soul; that he associates women with feeling and hence precludes them from composition; and that science is only concerned with the physical and not therefore with “rational” music. Given this, I can conclude that many of Hanslick’s arguments stem from this now outdated framework.

However, I suggested earlier that it is worthwhile spending this much time on Hanslick’s account because his views (or their consequences) have formed many of the premises of the current ongoing discussion. As Kivy notes, Hanslick’s book “has had an effect on the philosophy of music far out of proportion to its modest size” (2001, p.95). It is perhaps surprising to note that the premises of Hanslick’s argument actually comprise almost all of the most contentious topics within the modern debate about musical understanding and musical expression. Before moving on to the opposing views of emotivists, I want to set out these eight premises as clearly as possible.

They are as follows:

1. Since emotions cannot be carried as musical content (and are at any rate bodily, not “rational”, phenomena), what we understand when we understand music is formal structure. One consequence of this is that understanding music requires musical education. Without this education, rational musical beauty cannot be accessed.

2. Neither emotional expression nor emotional arousal has anything to do with this understanding.

3. Emotional expression is an effect of music, not its content; it is something the music does, not part of the music itself.

4. Recognising musical expression cannot be unreflective; it is a rational process involving training and judgement.

5. An aesthetic response must also be rational, not reflexive.

6. Emotions are individuated by thoughts and beliefs, not by feeling or bodily states.

7. Given (6), music does not carry enough information to express more than very broad, vague emotional states; it does this via its “motion” or dynamics. These emotions are so broad they cannot be readily identified.

8. Science will never be able to explain musical expression or the aesthetic experience.

My point in listing these premises is twofold. First, I am going to argue over the course of this thesis that all of them are (at least partly) false. Secondly, for the time being, I am attempting to demonstrate how these largely false premises have, in opposing emotional expression and rational understanding, led to a situation in which the resulting two sides of the debate (formalist and emotivist) have remained within their parameters ever since. Because of these outdated parameters, I want to argue, both sides invariably come up against some serious obstacles in their efforts to explain musical understanding. If, for example, we want to argue for Hanslick and the formalists, and state that the musical structure is what is understood and that this requires education to access, then there is the question of the listeners who are uneducated but whose experiences seem to go beyond the “mindless” emotional states he so deplores. These experiences suggest that at least some of the music’s properties might be accessible without formal education¹⁸. If, on the other hand, we want to side with the emotivists, and argue that emotional expression at least is important to our understanding of music, then there are two questions to be answered: first, how does the music express emotions, given the difficulties Hanslick describes; and second, how exactly does this fit into an account of understanding? What is being understood, if not only “formal” structural properties?

This is the polarisation in the debate I described earlier. However, more contextual discussion is required before proceeding. Bearing in mind Hanslick’s eight formalist premises (above), the next section will be a short survey of some of the emotivist theories in opposition to his views. This survey is not intended to be comprehensive, but rather will illustrate the leading emotivist responses to the question of how music expresses. I will show that those theories incorporating “occurrent” emotions (that is, those Hanslick defines as content-based theories in which the music carries “real” emotional states) have the most serious problems. The aim is to demonstrate the depth of the “how” question by discussing the difficulties besetting each of the attempts to answer it and, in the end, by showing how one theory - the Contour Theory, as defended by Stephen Davies and Peter Kivy (in differing forms) - overcomes these difficulties much more efficiently than its competitors. The motivation here is to mount a defence against the kind of criticism made by Hanslickian formalists of the whole idea of musical emotional expression comprising even a part of our musical understanding. In the context of Hanslick’s foundational premises above, if the emotivists cannot mount a compelling defence of their view, we are left with a situation in which the formalists, and indeed my analysis lecturer, may be right.

18 Hanslick argues that “pure rational contemplation” is required to understand the formal musical structures being heard. I interpret this to mean that he is saying that these structures must be identifiable through musical education only (or at least that this is how his view has been historically interpreted), as I will argue in chapter two.

If the most accessible of musical properties (expressive properties) are irrelevant to understanding music, then only the educated few may be said to truly understand it.

3. Emotivism: a short survey

I’ll begin with a brief examination of three theories that, I think, almost inadvertently skirt around the question of how music expresses rather than facing the question squarely. The first of these is a reductive approach that merely points out which compositional devices and/or structural features do the expressive work. That is, on this view, music sounds sad because it is in a minor key and has a slow, dragging tempo (an example of this kind of approach is Bernstein 1976¹⁹). On the face of it, this seems a sensible starting point; after all, whatever is doing the expressive work is going to be a part of the music itself. But there seems to be more to it than this; we need a mechanism rather than a description of expression. Pointing out compositional devices still doesn’t explain why being in a minor key sounds sad to us. It doesn’t explain, that is, how music expresses.

The second is the idea, supported by several philosophers (including Goodman 1968; Scruton 1997; and Zangwill 2007), that musical expression is metaphorical; that no “real” emotions are involved, only metaphorical ones. Zangwill, for example, argues that the emotion descriptions in music are not to be taken literally as the music being expressive of real emotion, but instead are “metaphorical descriptions of aesthetic properties” (2007, p.391). Further, he takes this to mean that, in agreement with Hanslick, “if emotion descriptions do not literally refer to emotions, then surely music itself has nothing essential to do with emotion” (Zangwill 2007, p.394). Zangwill argues that there is nothing particularly special about emotion descriptors in music anyway; we use plenty of other metaphorical descriptions of music as well, such as “delicate” or “balanced” or, as Scruton points out, “high” or “low”. Moreover, we also describe plenty of other abstract art forms in exactly the same emotive terms, so why, asks Zangwill, do we need to unpack those applying to music as if they were somehow more literal (2007, p.392)?

19 However, Bernstein then goes on to argue that such structural features carry a kind of semantics (much like language), and in so doing acknowledges the poverty of explanations turning on structural features alone. See below (3.2) for a brief discussion of the problems with Bernstein’s account.

I’ll spend a little time on Zangwill’s argument, as it is the strongest of these first three positions I will be discussing. First, Zangwill seems to assume that there are only two ways to describe emotions: either the literal (involving real occurrent emotions) or the non-literal or metaphorical (no real emotions involved). He describes Davies’ account as fitting into the literal category (or, at the very least, as using “dead” metaphors that now refer literally to emotions through overuse) (Zangwill 2007, pp.393-394). I will argue later in this chapter that there is actually a third way. It is the way Davies (2003) describes emotions in the contour theory of expression: emotions don’t have to be occurring at the time we understand their expression. They can be expressed without anybody actually feeling them, in much the same way as a Basset hound’s droopy face may look sad without the hound feeling sad. This doesn’t strike me as an especially metaphorical way of describing what is going on (by the generally accepted uses of “metaphorical”); instead, it seems that the connection between the dog’s face and the expression of the emotional state is simply one of resemblance, at something more than a metaphorical level (see also Budd 2008, p.5). So the suggestion that abandoning occurrent emotional states must lead only to metaphorical descriptions of them is, I argue, misleading.

Zangwill’s view regarding emotional metaphors also depends upon another fundamental agreement with Hanslick about the nature of emotions. Both he and Hanslick share the idea that emotions have a belief at their core; as Zangwill says, “anger, fear and pride are had by people, they are about something, they feel a certain way, and they are irrational if had without certain beliefs” (2007, p.393). Zangwill argues that when emotion terms are applied to something that cannot have these beliefs, such terms are “probably metaphorical” in nature (2007, p.393). But, as I will argue at length in the chapters to follow, not all emotions are so clearly classified. Experimental evidence has supported the idea that, in addition to such complex emotional states, there also exists a simpler kind of emotion which can occur without a core belief and can be recognised in expression in other humans almost reflexively following bodily, facial and vocal cues²⁰. While I am not going to defend this here (I will do so in chapter four), I do want to raise a flag for future reference over the certainty with which formalists regard the nature of the emotions, and over the importance of this certainty to their arguments regarding expression in music.

There are, then, some questions to be raised about the literal/non-literal distinction in Zangwill’s account and also about his conception of the nature of emotions. These are closely related to Zangwill’s use of the term “metaphor”. He does not attempt to provide “a general account of what metaphor is”, but rather opts for an account of the metaphorical uses of emotion terms only (2007, p.393). It emerges that, generally speaking, emotion terms applied to non-sentient beings or things must be metaphorical. I’ve just expressed my doubts about this very point on two fronts, so let’s revert instead to the more traditional understanding of what a metaphor is and compare it to Zangwill’s use of the term. Let’s say, just for argument’s sake, that music expresses in the same way that some poetry expresses: via suggestion and the creation of small or large-scale metaphor rather than via the clear and definitive expression that Hanslick seems to expect of music. But one problem with such an approach is an apparent breakdown in analogy between poetic and musical metaphors. Let’s say that poetic metaphors rely for their success not upon resemblance but upon the juxtaposition of linguistic semantic contents between the concept under description and the concept doing the describing. But as we will see, there is no equivalent to semantic content in music; there is nothing to juxtapose. Davies argues that this must mean the whole idea of music expressing metaphorically “must itself be a metaphor”, since there is no way that the metaphor itself can be unpacked (2010b, p.24)²¹. So while Zangwill’s account and others like it might solve the problem for expression theories of determining whose emotions are being expressed (nobody’s), it remains mysterious as to how music has this metaphorical power in the first place. Goodman (1968) and Scruton (1997) argue that this power necessarily remains mysterious; the metaphor cannot be analysed. Zangwill seems unperturbed by the question. But we do need to consider whether an explanation relying upon an inherently mysterious metaphor, or indeed a meta-metaphor, can go very far towards answering the original question of how music expresses emotions.

20 This assertion is not without controversy in itself, as I will discuss in later chapters. The point for now is that the situation regarding the nature of emotions is not as black and white as Zangwill requires it to be.

21 Malcolm Budd (2008 and 1985a) also questions Zangwill’s use of the term “metaphor”, or more specifically questions Zangwill’s assertion in an earlier work (2001) that there is no other way to express the aesthetic properties of the music than by these metaphorical emotion terms.

There is a further problem with the metaphorical approach: when we are recognising emotions expressed in music, we don’t experience that recognition as though it were a metaphorical version of a real emotion or, as Zangwill would have it, a metaphorical reference to an aesthetic property of the music. Rather, as Davies argues (2010b, p.26; 2011), encountering an emotion expressed in music is experienced in a very similar, immediate way to encountering an emotion expressed by another person. While it can be unwise to assume that our experience of psychological events is always an accurate reflection of how they are produced²², I would argue that until we have convincing evidence to the contrary (all other things being equal), an explanation that coheres with our experience explains more than one that doesn’t. This is especially the case when, as I will argue in later chapters, the experimental evidence supports our experience; it suggests that the similarity in emotional experience we feel between music and other people is very likely to be because the same psychological mechanisms are used to recognise particular emotions in both, regardless of what or who is expressing them.

This point about experience, then, is also a problem for an extension of Zangwill’s position, which states outright that music’s expressiveness is not merely metaphorical but sui generis and therefore unrelated to emotional expression in other contexts. While this spares everyone the need to go looking for a theory explaining musical expression, it does not adequately defend its rejection of this enterprise or account for its apparent disregard for this aspect of our musical experience. After all, it is possible to acknowledge that musical expression is going to be different to the expression of occurrent emotions in sentient creatures but it doesn’t necessarily follow that this difference is inherently mysterious (Davies 2010b, p.23). It doesn’t follow that we need to abandon any attempt to explain, as I have just argued above, why we use emotion terms and experience emotional expression in an equivalent way for both music and humans. Rather, any attempt to ignore experience in this way would need to be supported by some compelling experimental evidence that our experience is misleading.

22 For example, it might seem to us that the “Cartesian Theatre” model coheres with our experience of a centre of consciousness or personhood more accurately than the sorts of integrated models evidenced by “split brain” patients (see Gazzaniga & Le Doux 1978).

Given all this, I agree with Davies (2010b, pp.23-34) that any survey of viable theories of musical expression should stick to the commonly understood sense of terms like “sad” or “happy”; that is, musically expressed emotions need to be relatable to the emotions expressed by people. This is, after all, how we use the terms in application to music; this is what we mean when we say the music sounds “sad”. Given this, simply producing a new, never-before-offered sense of emotion terms applicable only to musical expression (or, for that matter, expression in other art forms) will not explain very much about how music expresses at all; it will simply need explanation in its turn. It postpones the required explanation rather than removing the need for one.

In the light of these problems with metaphor-based accounts, I would suggest that we need to turn to some alternative accounts that attempt to address the “how” question of musical expression whilst maintaining a connection to our everyday experiences of musical emotion. First of all, I will briefly examine some accounts that tackle the question as literally (or as unmetaphorically) as possible; that is, accounts that work on the assumption that the emotions being expressed are real emotions actually occurring at the time of expression (or “occurrent” emotions) (Davies 2003; 2011).

3.1 Music expresses the emotions of composers, listeners or hypothetical personae

This group of theories argues that if only sentient creatures can express occurrent emotions, then music, which is obviously not sentient, must be somehow channelling the emotions of sentient (albeit sometimes hypothetical) creatures and presenting them as “content”. The problem, of course, lies in explaining exactly how music does this. The expression theory argues that the music is expressing the emotions of the composer; that is, the composer expresses their emotions through the construction of the music (see, for example, Collingwood (1938) and more recently, Robinson (2005)). There are two major problems with this. The first is that it seems to be empirically false. Not all music is written by composers who are in the grip of one emotional state; and, moreover, some works take weeks or months to complete, which would imply that the composer maintained that emotional state for the duration of the composition period. This seems unlikely. The second is the “mask argument” (Davies 2010b, p.29). The natural means of expressing sadness is through weeping, wailing, etc., not (normally) through composition. A composer who composes a sad work to express their own sadness is therefore using the music as they would use an expressive theatrical mask: to match what they currently feel rather than actively convey that emotion. That is, the mask is already independently expressive whether or not its wearer is matching their own felt emotion to it. In the same sense, the music is already expressive, regardless of whether or not the composer is feeling that particular emotion. So, yet again, such theories do not actually explain the expressiveness of the music itself at all; all they highlight is that it is possible to match up felt emotional states with the states expressed in the music. A further explanation of how the mask/music itself is expressive is therefore still required.

Alternatives to the expression theory include the arousal theory, which argues that music expresses by virtue of the corresponding emotions it arouses in its listeners (see, for example, Derek Matravers 1998). This goes along the same lines as the argument that grass is green by virtue of the experiences it produces in normal human viewers. Music is sad in the same way that grass is green; it is sad because it causes sad experiences in listeners. This is therefore denying the distinction between recognised and aroused emotions in listeners; by arguing that music only expresses by virtue of arousal, there can be no sense in which a listener might recognise sadness in the music without feeling sad themselves. This, however, highlights the theory’s one obvious flaw: listeners regularly do recognise that the music is expressive of sadness without feeling sad themselves. Matravers has attempted to deal with this situation by claiming that even if the listeners in question (the “dry-eyed critic” as he puts it) think they don’t feel sad, they actually do – they just don’t realise it. He says they experience “incipient sadness” (1998, pp.200-201).

But I find the degree of unfalsifiability incorporated in this proposal unconvincing. Even acknowledging that emotions can be identified in subjects through physiological checks alone (while disregarding the testimony of those subjects), and even though there is plenty of empirical evidence suggesting that it is sometimes possible to “mask” emotions from our own conscious experience, these cases tend to be the exception rather than the rule. That is, Matravers once again comes up against the need to cohere with the majority’s experience (and this is one of the better demonstrations of why that need can be important). It is objectively the case that listeners routinely distinguish the emotion expressed by the music from the emotion that they themselves may mirror or bring to the experience (if any). As this is the case, any attempt to tie the expressed and felt emotions together in the way Matravers is suggesting - that is, by proposing, without any empirical evidence to speak of, that an “incipient” emotional state bestows expressivity on music - will fail²³.

A third group of theories argues instead that the music expresses imaginary or “make-believe” emotions, rather than real or metaphorical ones. Such theories are usually unpacked in one of two ways: either the listener identifies the music as expressing their own hypothetical emotional states and what it feels like to have them; or the music creates an imaginary subject for the emotions, and the listener is merely imaginatively observing the narrative unfolding around this hypothetical persona (see, for example, Walton 1988 for the former kind of theory, and Budd 1985, Levinson 1996 or Ridley 1995 for the latter kind). I agree with Davies regarding the major problem with either of these theories: while it might be possible to overlay such imaginative exercises on the music being heard, it is far from clear that we do this as a matter of course or that it is in this way that we understand the music’s expressiveness (2010b; 2003; 2006; see also Kania 2012).

23 See Kivy 2001 (chapter 7) for a more detailed examination of Matravers’ account.

But an important point here, in addition to these concerns, is this: any kind of theory of expression must explain how music may be expressive even when it seems to carry very little information. It is mysteriously expressive because we can’t immediately understand how it is doing it. Imaginative persona theories, however, seem to ignore this informational dearth. There simply isn’t enough information in the music for the listener to perform the sort of imaginative feats this theory requires. Contrast the situation with how we read a novel (Davies 2010b, p.31). The reader is carefully instructed, through the provision of extensive data, in how to imaginatively reproduce the specified characters and settings by the author. The composer provides no such guidance; what is provided is necessarily abstract. How do we know, for example, whether there is just one hypothetical persona in the music we are hearing? What if there are two or three and how could we tell if there were? So we are left to conclude, as Davies does, “what is imagined reveals more about the listener than about the music’s expressiveness”; it is the listener that provides this information, not the music (2010b, p.31).

If the main problem with this kind of theory is the lack of information provided by the music to the listener, then the next group of theories attempts to turn this on its head: it argues that music represents emotions symbolically. I will argue that such theories, which emphasise musical expressivity as content, struggle to account for the apparent expressivity of the musical structure itself.

3.2 Music expresses symbolically

On this view, more often than not employing an analogy with language, music expresses as a symbol, referring to content outside of itself. It comes to denote an emotion in this sense via convention or association. While it is easy to see the structural analogies between music and language (Cooke (1959), Lerdahl & Jackendoff (1983), and Bernstein (1976) have all, with varying degrees of success, exploited this analogy), theories based on such analogies tend to collapse once we reach the level of semantic content (Davies 2010b, p.25). There is simply no equivalent to linguistic semantic content in the case of music, meaning that the analogy - and therefore the theory - breaks down. Andrew Kania (2012) also summarises a related, more general problem with such “language analogy” theories, which is that the structure of the music itself seems to be doing the expressive work in the way that the structure of a sentence alone (without its semantics) cannot. “Even if music were about the emotions in the way that language can be”, he says, “that would not account for music's expressivity. The sentence ‘I am sad’ is about the emotions, but it is not expressive of sadness in the way a sad face is, though I could use either to express my sadness. Most people agree that music's relation to emotion is more like that of a sad face than that of a sentence.”²⁴ This objection also applies to theories such as Susanne Langer’s (1953), in which music is “about” emotions, but not in a linguistically symbolic way (Kania 2012).

Similar objections hold with regard to theories that try to avoid such problems by claiming that the symbolism operates only through association. This might work for some musical features (fanfares sound pompous, organ music is associated with Christian religious sentiment), but it leaves unexplained the “rightness” of some associations; the way that the music seems to “match” the emotions being expressed. These cases suggest the music itself has a hand in the expressiveness rather than our purely cultural decision to associate it with a particular emotion. That is, there are no fast-paced, major-key funeral dirges in Western music that I am aware of, suggesting that the expression of sadness is more than a matter of association alone. This kind of theory also misses the essential character of the emotional state that the music expresses that is reflected in our experience of that music. As Davies says, “it is not the case that music points or refers to emotions that it then goes on to describe” (2010b, p.25). Rather, related to Kania’s point above, we experience the emotions expressed as though another person is expressing them. “Registering music’s expressiveness”, says Davies, “is more like encountering a person who feels the emotion and shows it than like reading a description of emotion or than like examining the word ‘sad’” (2010b, p.26). There seems to be something expressive about the musical structure itself that is not accommodated by any theory turning on symbolism, be it raw association, linguistic, or non-linguistic.

24 From: "The Philosophy of Music", The Stanford Encyclopedia of Philosophy (Fall 2012 Edition), Edward N. Zalta (ed.), URL = .

These theories, then, all appear to have some serious flaws, which (with the exception of Zangwill’s metaphors) are mainly due to their insistence that the emotions expressed must be owned by someone or something, even if that owner is abstracted or hypothetical. The general problem seems to lie in locating and individuating these occurrent emotions, and balancing this difficulty against the difficulties caused by the lack of detailed information supplied by the music. Added to this general problem is the way that symbolic theories cannot adequately account for the expressivity we experience in the musical structure itself.

One theory, however, abandons the attempt to work with occurrent emotions altogether, either experienced or make-believe, and successfully avoids this problem. This is the contour theory (as espoused by Davies 1994; 2010b; and Kivy 1980; 1990). I will spend some time now introducing the fundamentals of the contour theory, since I will return to it in later chapters.

3.3 The contour theory

The contour theory takes the idea of music expressing via resemblance to human expressions of emotion, and adds the further premise that music is not expressing occurrent emotions at all. Music merely presents emotion characteristics that we recognise as expressive. The theory suggests that the phenomenology of our experiences of musical emotion can be explained by the way we associate the resemblance of musical emotional expression with human emotional expression. In this sense, the way we understand and react to musical expression is because of the way that particular features of the music might sound, for instance, like a sad person; or even the way that features of the music might resemble how it feels to be sad ourselves.

The theory begins with the observation that we routinely experience particular behaviours, comportments and physiognomies as expressive even when these features are not caused by an occurrent emotion (Davies 1980; 2010b, p.33; 1994; and Kivy 1989). The best example of this, discussed by Davies and Kivy in independently developed accounts (both 1980), is the “sad”, downcast face of a Basset hound or St Bernard dog. The dog itself might not be feeling sad (it is clearly not constantly sad), yet its face is more or less constantly expressive of sadness by virtue of its resemblance to the sad face of a human. It presents, in other words, emotion characteristics that we recognise as expressive of sadness without requiring an actual occurrent emotion to be expressed. Kivy explains this as the difference between an emotion being expressed by the dog’s face and the dog’s face being expressive of an emotion (1989; 2002, p.38). The same holds for another common example in a non-sentient being: the sad, drooping shape of a “weeping” willow. Again, and more obviously, the willow itself is not sad, but its bearing expresses sadness to us because of its resemblance to the drooping frame of a sad person (Davies 2010b).

The argument, then, is that music is like a Basset hound’s face. It expresses sadness by virtue of presenting emotion characteristics that resemble sadness in humans. It therefore does not need to account for the ownership of occurrent emotions, either real or make-believe; and it is not referring beyond itself to emotion states existing elsewhere in any symbolic sense. Rather, “the expressiveness is a property of the music itself” (Davies 2010b, p.31); the music presents characteristics that we recognise as expressive. So, just as the Basset hound can look sad, music can just sound sad in a similar way because the way it has been put together prompts us to recognise its expressive properties.

What needs to be established in this case is what it is about the emotions that the music resembles; upon what does the resemblance depend? There are three main contenders in answer to this question. The first, suggested by Kivy (1990), is that music alone expresses through its resemblance to human vocal expression of emotions. There is certainly something to this, particularly in baroque music, in which musical structures such as the sospiro (“sigh”) directly referred, via resemblance, to human expressions of sadness. But it should also be admitted that there is often not very much about a symphony orchestra in full flight that sounds like a person in the grips of an emotional state, at least not to anything more than an extremely abstract degree (Davies 2010b, p.27). On the larger, longer-term scale, then, this idea at least lacks explanatory punch. The second candidate, discussed by psychologist Patrik Juslin (2010, 2003), is that music resembles via the prosody of expressive speech (that is, via the contour, volume and phrasing we draw upon in adding expression to spoken sentences). Again, there seems to be something to this. However, again there is a question mark over whether this explanation can accommodate sustained, long-term expression of emotional states in music alone, particularly large-scale symphonies (Davies 2010b, p.27). Prosodic resemblance may unpack melodic expression in a short, simple, speech-like phrase of music, but it seems to tell us very little about what is going on when we understand the profound emotional impact of, say, the second movement of Beethoven’s 7th symphony. Such theories need an additional pointer about whether extended expressive works express via sections of short, prosodic resemblance that are merely experienced as one long expressive statement, or whether there is some other extended resemblance upon which such large-scale works turn.

The third and perhaps more promising candidate (supported by Davies 1980) addresses this need with the suggestion that music expresses through its resemblance to the dynamics of human expression; that is, through the way that we move, our posture or gestures when experiencing particular emotions. On this account, slow, downcast movements express sadness; happier pieces move quickly and lightly; anger can be expressed in loud, sudden movements; and all three of these examples apply to both music and humans. By widening the scale of resemblance in this way, this account avoids the “small-scale” problems associated with vocal or prosodic resemblance, which is a considerable advantage.

This version of the contour theory is well placed, then, to account for our experience of musical expression as being more like an encounter with a person than with a symbol. We recognise the sadness in music in the same way we recognise the sadness in the bearing of another person. This leaves two factors yet to be established: what these expressive properties in music are; and how, and indeed why, this recognition occurs. Davies argues that the properties most likely to be grounding the resemblance are dynamic properties such as movement and pattern in music; these, he thinks, do not suffer from the “short duration” problem surrounding Kivy’s vocal resemblances or Juslin’s prosodic ones, as I discussed above. If we concentrate on dynamics, he argues, then we have a clear explanation for the sense we have of musical expression unfolding as the music progresses in real time. Expressive character, he says, “is revealed only gradually” to the attentive listener (2010b, p.32). I’ll return to this point later on.

This leaves the “how” and “why” aspects to be dealt with. The experience we have of the powerful nature of emotional expression in music suggests that there is something particular about the way we recognise or understand this expression that enhances the experience. Kivy has argued that we are “evolutionarily programmed” to respond to such resemblances in the way that we do; we animate what we perceive (1989). This, I think, is deserving of more attention, but it also requires more empirical evidence to uphold. While constructing a theory of any evolutionary advantages conferred by this ability to recognise emotion is necessarily speculative (as many evolutionary theories are, as Steven Pinker has remarked (1997)), it might still be possible to construct a less speculative theory incorporating the cognitive mechanisms behind these responses, and perhaps even identifying the triggers that set them off. This, if successful, would provide the most thorough answer to the “how” question yet produced, in that it would be more firmly grounded in empirical evidence than previous theories. Being “evolutionarily programmed”, that is, doesn’t have to remain a mysterious process; it doesn’t have to be an end point of any explanation of how we go about perceiving and understanding our surroundings. It is possible to drop a few degrees down the scale of speculation by looking at the ways in which we might be doing that perceiving and understanding. This is one of the reasons why, in the next chapter, I will be looking in much more detail at Fodor’s modularity of mind thesis (1983).

With this direction in mind, in the chapters to follow I will be exploring the idea that the kind of empirically-based theory I am supporting in answer to the “how” question can strengthen and enhance the explanatory heft of the contour theory. I want to show, that is, that contrary to the views of the formalists there are very good explanations, supported by experimental evidence, for why we experience music as expressive via the kind of resemblance proposed by the contour theory. In doing so, I support another aim of my argument: to allow a theory of understanding to encompass expressive rather than, as has been done in the past, purely formal structural properties of the music. This intention extends into my further argument that to understand music, it is necessary (but not sufficient) to understand its emotional expressivity, where such expressivity exists. This is in contrast to the more traditional views in which theoretical knowledge is necessary (and sufficient) both to musical understanding and perception. But I have yet to fully elucidate this traditional view that I mean to oppose in this empirically-grounded way; or at least, to fully elucidate the way in which Hanslick’s eight premises have moulded the modern debate. In the next chapter, I will look more closely at a modern representative of this traditional view as a backdrop to my alternative account to follow.

Chapter Two: Analysis and Understanding

Higher levels of appreciation are laden with music-theoretic concepts. Such training thus characteristically makes conceptual what was, at most, formerly non-conceptual: it brings about conceptual representation of properties that the listener had heretofore represented, if at all, non-conceptually … conceptualization is at the heart of growth in the appreciation and understanding of music.

- Mark DeBellis (1995, pp. 6-8, italics mine).

As we have seen, there is a school of thought prevalent in some circles of musical academia according to which the aim in studying musicology and music analysis is to “properly” understand a musical work; consequently, those listeners remaining uneducated in analysis or musicology will be denied the opportunity to truly understand the music they hear (or perform). We have also seen, via the discussion of Hanslick’s views in the last chapter, where such a school of thought might have had its origins. When asked to consider what the untrained listener might experience, adherents of this view either indicate that such listeners must be having some kind of ineffable musical experience that they cannot quite put into words; or, as in the case of my analysis lecturer, they admit to some puzzlement as to what such listeners are even hearing without the benefit of musical terminology to conceptualise their experience.

I’ll start this chapter with an outline of a view that incorporates both of these responses.

Mark DeBellis (1995) has provided an account of musical understanding that represents what I refer to as the “traditional” view: that an extensive musical education is a necessary condition of musical understanding²⁵. He constructs an argument that purports to explain exactly why uneducated listeners do not understand or appreciate what they hear, and why educated listeners do. Put very simply, the account rests on the idea that recognition of musical structures during the listening experience depends upon the listener having the corresponding theoretical concepts at their disposal. Attaining the required theoretical concepts gives the listener an array of “perceptual concepts” that do the recognition work for them. Listeners lacking these concepts cannot hear the corresponding structures in the music; their perception is “non-conceptual” and therefore cannot be transformed into belief. Perception must be informed, that is, by theory.

²⁵ Although DeBellis’ later views were more moderate than the one now under examination, I am going to use this earlier position to unpack an extreme take on the traditional view because it perfectly describes my lecturer’s position. DeBellis later (1999, 2003 & 2005) allows both that most untrained listeners have extensive “folk” musical perceptual concepts and also that over-theorised listening can be distracting.

There are three reasons why I will be discussing DeBellis’ account in some detail here. First, it represents (if not always precisely) the analysis-based view of musical understanding that I think is flawed. The idea is that his account fills the role of my analysis lecturer’s opinion writ large. Secondly, it highlights one of the major debates that will form part of the foundations of my account. This debate concerns the “theory-dependence of observation”, or the view that what we know or believe constrains our observation of the world around us; in this case, the music we hear. DeBellis builds his view squarely upon this theory being an accurate picture of perception (i.e. perception is penetrable by theoretical concepts/beliefs). This puts him in disagreement with Jerry Fodor, whose alternative account, the modularity of mind theory (1983), defends a theory-neutral picture of perception (i.e. perception occurs independently of theoretical concepts/beliefs). DeBellis spends some time arguing against Fodor, and I will, accordingly, examine this disagreement in some detail. The fundamental difference between them, as DeBellis sees it, has to do with whether there is a distinction between cognition and perception (DeBellis 1995, p.8). Fodor argues there is a distinction (put simply, we perceive and then we form beliefs) and DeBellis, on the basis of the theory-ladenness of observation, argues that there isn’t (our perceptions are beliefs). I will argue that DeBellis’ arguments in this regard ultimately fail and that this failure strikes at the heart of his account²⁶. I also discuss McCauley & Henrich’s account of the disagreement between Fodor and Churchland (2006). I will argue that the distinction between diachronic and synchronic modular penetration they raise is not as harmful to Fodor’s account as they think it is. Again, as I argued in the last chapter, there is also the evidence in the reported experiences of uneducated listeners themselves to consider. My opposition to DeBellis is supported, I think, in the light of this evidence supplied by the uneducated listening majority, who generally do not report hearing an ineffable mish-mash of sound when listening to music. This is not consistent with the claims DeBellis makes regarding the theory dependence of observation, which imply that such listeners would not recognise an assortment of musical structural features. Both this inconsistency and Fodor’s modularity of mind theory will prove to be essential in building my alternative account in later chapters.

²⁶ I am therefore attacking DeBellis’ account from a different direction than Davies (2011, pp.100-105), whose focus is to establish that ordinary listeners can meaningfully describe their experience of the music, and therefore be said to understand it, without the need for DeBellis’ theoretical concepts. I will discuss Davies’ attack in section 1.1.2 below.

The third reason I want to focus on DeBellis’ account is more general in nature. As evidenced by the quote at the head of the chapter above, I will question whether DeBellis and others of his ilk are actually conflating appreciation with understanding, and whether this distinction is as important as I think it might be in the defence of uneducated listeners.

As will become apparent in later chapters, I am all for musical training as a means of developing a listener’s appreciation of the work and their relationship with it (where appreciation includes supplementary knowledge about the work); but I will also argue that this is not equivalent to claiming that such training is essential to either perceiving or understanding the work in the first place. Many of DeBellis’ examples, I will argue, count more towards background knowledge about the work (and hence appreciation of the work) than they count towards its baseline understanding. This distinction between understanding and appreciation, as I will explain, rests upon Fodor’s distinction between cognition and perception as mentioned above. But first of all, I’ll open the discussion in the next section with a broad summary of DeBellis’ account.

1. The traditional view

To begin with, DeBellis’ account turns on his distinction, as I mentioned above, between conceptual and non-conceptual listening. Conceptual listening is possible when the listener has acquired the relevant musical theory, leading in turn to the acquisition of perceptual concepts, which, during listening, can be converted to perceptual beliefs regarding the music’s structural features. These features can then be clearly reported upon using theoretical terminology. Non-conceptual listening is not informed by theory, and hence lacks the perceptual concepts under which to perceive the music. Structural properties of the music will therefore go unrecognised by the uneducated listener. Such non-conceptual listening cannot give rise to perceptual beliefs and is therefore, by definition, ineffable.

With reference to this distinction between conceptual and non-conceptual listening, DeBellis splits all listeners into three broad categories:

1. The trained listener (expert), whose listening is “theoretically informed” or “theory equivalent” (DeBellis 1995, p.1 & p.39). What this means is that the expert has the theoretical concepts to inform their listening experience; or, in DeBellis’ terms, the theoretical concepts they have learnt give rise to perceptual concepts they experience in the music. When the expert hears a dominant seventh chord, for example, they will identify it and experience it as a dominant seventh chord – or to put it in DeBellis’ terms, they will bring it under the concept of a dominant seventh chord and form the belief that it is a dominant seventh chord. This is to be contrasted with the experience of:

2. The ordinary (untrained) listener, whose listening is completely without theoretical background but who may have acquired some folk musicological knowledge through cultural immersion and/or experience. This listener may be aware of the harmonic function of the dominant seventh chord they hear (for example, that it will tend to resolve to the tonic – although that term is also only accessible to the expert), but because they do not have the concept or label for the dominant seventh, they cannot identify the chord they are hearing in that way or form any perceptual beliefs. Because of this lack of perceptual beliefs, theirs is a “non-conceptual” experience that cannot be effectively described or defined (DeBellis 1995, p.1).

3. The intermediate listener, who does have some theoretical training but has not yet reached the point where those acquired concepts can inform their listening experience (DeBellis 1995, p.36). They have not yet, that is, transformed the theoretical concepts into perceptual ones. They might know (in theory) what a dominant seventh chord is, but still not be able to label the heard chord like the expert listener can in practice. Hence, their listening experience remains just as non-conceptual as the ordinary listener’s, although it is further along the path to educated listening in that the concepts can be consciously applied upon reflection after the fact. Over time and practice, the claim is, the acquired theoretical concepts will become fully-fledged perceptual concepts.

It is important to realise that DeBellis is not simply arguing that intermediate and ordinary listeners do not have the correct label to apply to their hearing of the chord. He is saying that they lack the perceptual concept, which is best understood as an ability to discriminate between various aspects of perceptual experience. In this context, it means they cannot discriminate between or identify dominant seventh chords and other chords in the music they hear. He argues, then, that their listening is not just non-conceptual but “strongly non-conceptual” in that it cannot form part of a belief (1995, p.57, my italics). Their experience is therefore necessarily and literally ineffable.

This argument is of central importance in DeBellis’ account, so I will spend some time examining it more closely. If, say, we were to shift the context from musical properties to colours, we could say that a person who has the perceptual concept of red and orange can sort red and orange tokens into labelled boxes. If this person lacked the concepts of magenta and chartreuse, say, when confronted with tokens in those colours they would be unable to sort them correctly (Davies 2011, p.100). Of course, this person would be able to tell that the magenta tokens were a different colour to the chartreuse ones, and separate them accordingly, but to DeBellis this ability is not sufficient because it lacks the requisite concepts of “magenta” and “chartreuse”²⁷. On his view, if you don’t have the concept, you can’t perceive the property it conceptualises in what you are seeing or hearing; what you can perceive under such a concept-less state will be ineffable and therefore not count as understanding. What DeBellis is saying, then, is that the “what it is like” aspect of a perceptual experience is ineffable for the ordinary listener if there is no concept under which to experience it.

²⁷ This, however, is the very ability Davies argues is enough to begin the process of understanding in his counter-attack, because, he argues, it is indeed describable (2001, p.100). See section 1.1.2 below.

The expert listener, on the other hand, is not just experiencing a chord and then labelling it as a dominant seventh. They are actually experiencing it as a dominant seventh because their theoretical knowledge directly informs their perception²⁸. And since the expert listener can no longer separate theory and perception in their listening experience, they can find more to understand in it than the ordinary or intermediate listener can. DeBellis concludes that the expert listener therefore gains a superior, richer understanding than the intermediate and ordinary listeners, whose understanding is limited insofar as it is non-conceptual. He distinguishes between what he calls “theory-equivalent” hearing and “theory inequivalent” hearing (1995, pp.40 – 41), which means that his expert listeners have theoretical concepts (which are all couched in music-theoretical terminology) with which to categorise their experience, while the ordinary listeners do not. Because the ordinary listener lacks the theory, they cannot categorise their experience in the same way. Hence, there is less for them in the music to understand. This is why DeBellis’ reliance upon the theory-dependence of observation thesis is so central to his account, and why I have chosen to examine it critically below.

Before I do that, there are a few important points to note about DeBellis’ account.

Crucially, it turns upon the notion that when we are listening to music, to listen with understanding we must be able to easily describe what it is we are hearing – this, essentially, is what he means by “theory equivalence”. You can’t be said to understand the dominant seventh chord you have just heard, that is, unless you can identify and experience it as a dominant seventh chord. When asked to describe what you just heard, you will say “a dominant seventh chord”. This is because all the necessary theoretical concepts, on DeBellis’ view, are linguistically mediated; it is also why he thinks, in agreement with Paul Churchland (1988a), that musical training provides the perfect counter-example to Fodor’s argument for theory-neutral perception (DeBellis 1995, p.8; chapter 4). So what DeBellis is trying to do here is remove any overlap between what we hear, what we experience, and how we categorise and describe that experience. If it all maps perfectly together with no ragged edges, then we can be said to understand the work. But it only maps together this perfectly, DeBellis argues, when the experience and resulting descriptions are couched in music-theoretical terms. If, in the case of the ordinary listener, there is theory inequivalent hearing going on, then there will be large, unmapped and (in his view) indescribable areas of ineffable experience because of the lack of theoretical concepts. When asked to describe what they have heard, he argues that the ordinary listener will be unable to do so with any precision, and will omit most structural features from their description because they do not possess the theoretical concepts required to accurately experience and describe them. They cannot be said, then, to understand the work in a “true” sense.

²⁸ This does tend to explain my introductory anecdote about the analysis lecturer who could not comprehend what an uneducated listener might actually hear, as on this account perception and theory are experienced by the expert as completely fused. What I am arguing, however, is that this is not evidence of the strength of DeBellis’ theory; rather, it is evidence of the lack of empathetic comprehension on the part of the expert.

This means that as far as DeBellis is concerned, the only appropriate concepts under which to bring our perception are those acquired through training in musical analysis²⁹. So not only has he defended the idea that analysis yields a “true” understanding of a musical work, but he has also offered an explanation as to why this is the case through his account of theory fusing with perception in the expert listener. Thus he has defended the traditional Hanslickian view of what it is to understand music, and even more importantly, the traditional Hanslickian view of what is being understood.

Musical analysis operates over the structure of the music as recorded on a score. It is about identifying chords and harmonic structure, as well as the larger-scale overall formal architecture of the work. By insisting on the importance of training in musical analysis, DeBellis is essentially implying that scored properties are the only components of the music that are relevant to understanding³⁰. The experience of the ordinary listener without analytical training, in contrast (although this is not made explicit³¹), is therefore some kind of ineffable auditory mush; the experience of listening can therefore add nothing to their understanding of the work. The ordinary listener may have a sense of the larger structural features – the “final” quality of a cadence, for example – but they will be unable to articulate the importance of such features further than this. The score, then, takes centre stage over the listening experience in DeBellis’ account of musical understanding and appreciation.

²⁹ Although he does acknowledge that there might be ways to acquire the “relevant discriminative capacities” by another route, meaning by ways other than through music theory (1995 p.92), he does not develop this further here.

³⁰ DeBellis takes this view one step further: he advocates a “realist” perception of musical theory, in which “theories of music are empirical theories about structures or other properties of musical works, and perception is a matter of mentally representing sound-events as having such properties” (1995, p.107).

This is, almost exactly, the argument defended by Hanslick (as we saw in the previous chapter). It is a good demonstration of the influence of Hanslick’s account, and accordingly, of why I am characterising DeBellis’ account as the “traditional” view.

In the next section, I’m going to look at two broad objections to DeBellis’ account. The first objection attacks the theory-dependence of observation thesis, which, as I have stated, is the foundation of DeBellis’ argument. I will argue that this foundation is not as secure as he supposes. This will involve a modest foray into the debate on this topic in cognitive science.

The second objection is much more general, and points to the evidence supplied by the uneducated majority of listeners (whose reports indicate that they understand far more than DeBellis allows). I will argue, via a discussion of Davies’ response to DeBellis (Davies 2011, chapter 7), that DeBellis cannot account for the evidence in these reports.

2. Two objections

2.1 The theory-dependence of observation

The theory-dependence of observation thesis holds that what we know/believe about something can constrain what we actually perceive; what we know about the world, then, directly influences what we can observe in it. Extremist supporters argue that it is therefore not possible to perceive in an objectively theory-neutral way, a view that sometimes leads to a more general argument for scientific relativism (Couvalis 1997; McCauley & Henrich 2006). This thesis, then, is what DeBellis presupposes when he suggests that theory and perception are fused in the expert listener, and that without the necessary level of training in musical theory, the ordinary listener does not have the theoretical concepts required to even recognise the music’s essential structural features in sufficient detail or clarity. While the controversy surrounding this thesis is too complex to cover in much detail here, I will now summarise the arguments DeBellis offers to support this crucial foundation of his view.

³¹ But see p.125 (1995): in DeBellis’ discussion of Kivy’s “Howards End” example (Kivy 1990), Mrs Munt’s experience is described as an example of an untrained listener. DeBellis argues her experience lacks important conceptual dimensions: “someone might have a visual experience of a mountaintop that represents it as having a certain irregular shape without possessing concepts to specify that particular shape”.

DeBellis discusses the disagreement between Paul Churchland and Jerry Fodor, conducted (for the most part) in Churchland’s Scientific Realism and the Plasticity of Mind (1979)³² and Fodor’s Modularity of Mind (1983). For the purposes of this work we can understand the Churchland-Fodor debate as about whether we perceive objectively and then interpret our perceptions, forming beliefs (Fodor: theory-neutral); or whether the content of our perceptual experiences is simply provided by our belief systems, with built-in interpretation (Churchland: theory-laden)³³. Fodor argues that it is demonstrably the case that our perceptions are not influenced by our beliefs. His point is most clearly made in his use of the Müller-Lyer illusion (1985, p.2) (Figure 1, below).

Both lines in the diagram are in fact the same length when measured with a ruler, even though it looks to us as though the top line is longer. There are two points to note here: first, the lines don’t suddenly look the same length even though we now believe that they are (i.e. some beliefs, at least, don’t filter through to our perception); and second, we have no trouble with the idea that the matter can be settled by measuring the lines with a ruler (i.e. our perceptual experiences are routinely used to test theories). However, this checking of our observations would not be possible if our perceptions were theory-laden, as all perceptions would remain relative to the beliefs we already have (Fodor 1985, p.2).

³² DeBellis also refers to Churchland (1988a) and Fodor (1984). I find Fodor’s “Précis of The Modularity of Mind” (1985) provides a good overview of Fodor’s account and will be referring to it in this discussion.

³³ The debate is summarised in Couvalis (1997) against Churchland (pp.11 – 20), and in depth by McCauley & Henrich (2006). A further summary is provided by A. F. Chalmers (1976, pp.20 – 34), who argues, although not conclusively in my view, against inductivists such as Fodor.

Fodor then presents his modularity theory as an explanation of how such theory-neutral perception might work. Described as a computational theory of mind, modularity theory turns on the idea that the mind is composed of separate modules, each responsible for particular areas (domain specific) and each functioning in isolation from other modules (informationally encapsulated) (Fodor 1985, pp.3-4). There are, in this model, modular input systems informing a non-modular central processing unit (CPU). The input system modules are responsible for processing perceptual information – their operation is fast, mandatory (that is, you can’t control whether or not they do their job) and not accessible to conscious recall. Their output is extracted by the central processing unit, which then interprets and evaluates the information in the form of hypotheses. It is important to note that the central unit, being non-modular and therefore unencapsulated, is the level at which the beliefs and theories we have are brought to bear on the output of the perceptual modules. Those beliefs do not influence the reports the modules give, however, as the modules’ function is informationally encapsulated (Fodor 1983, 1985).

Similarly important is another point regarding perceptual modular reports: they are not just “dumb” reflexive responses. While they are encapsulated like reflexive responses, Fodor argues that they are smarter than that; perceptual processes are typically inferential processes (1985, p.2). They have, that is, access to very simple hardwired theories that serve to translate the environmental (or distal) stimuli into a report the CPU can work with. As Fodor explains, “perceptual systems have access to (implicit or explicit) theories of the mapping between distal causes and proximal effect. But that’s all they have” (1985, p.4). This sounds confusing in the light of the earlier distinction made between perception and cognition, but the distinction still holds: the perceptual modules have access to these basic theories only. They do not have access to higher cognitive beliefs. This is why we can rely upon a level of objectivity from our perceptual modules; we are all more or less on the same perceptual page regardless of education or belief systems. This point about the objectivity of modular systems is going to be essential to my arguments in later chapters.

This perceptual objectivity, says Fodor, is also a good evolutionary strategy: “prejudiced and wishful seeing makes for dead animals” (1985, p.2). The key is the speed at which an encapsulated perceptual module can operate when it is not required to sort through innumerable “higher-level” beliefs in order to produce a report that would, in the end, be relative to those beliefs anyway. It’s a survival advantage, that is, to have fast and unintelligent perceptual modules in order to react appropriately, quickly and as objectively as possible to perceived threats, like an approaching sabre-toothed tiger. If all of our perceptual experiences were influenced by our linguistically-mediated belief states, then the whole process of perceiving the tiger would be slower and potentially less reliable (and we would be at a higher risk of being eaten).

As objectors to Fodor’s theory, Churchland (and, by association, DeBellis) are therefore set the task of offering alternative explanations for both the illusion and its evolutionary implications34. Churchland’s account, as such an alternative explanation, turns on the idea that perception is almost completely theory-laden, in that causal proximal stimuli simply trigger our belief system into providing the content of our perceptual experience

(Churchland 1988b, pp.79-80; Couvalis p.17). So, in application to the Müller-Lyer illusion, he has an immediate problem – if the lines still look as though they are different lengths

(even after the acquisition of a belief that they aren’t) then it seems that the beliefs generating our perceptual content in this case are, at the very least, not the right ones.

34 The fundamental problem, though, as Fodor explains, lies in providing an explanation for the Poverty of the Stimulus argument (1985, p.2). This argument points out that there is typically more information in a perception than there is in the stimulus itself; i.e., perception seems more cognitive than reflexive. Churchland is clearly beguiled by this into arguing that perception just is cognitive in the same way higher cognition is (i.e., in accessing all beliefs) (1988b, p.79). Fodor, as I have explained, takes a moderate approach in bestowing the perceptual modules with just enough theory to satisfy the Poverty of the Stimulus argument.

Churchland’s response to this is to argue that theories may not have immediate perceptual effect, but that they can alter our perceptions over time (1988a). This presumably means that if we stare at the Müller-Lyer illusion for long enough, and possess the right belief states, the lines will eventually look of even length to us. This, I think, is unconvincing because it does not specify how long it will take to make this change, which belief states will be the most effective, and what evidence supports instances where such perceptual alterations have occurred. Moreover, the lack of evidence for the overall claim is concerning given the immediate appeal of the evidence Fodor supplies for the opposite position (that is, the continuing recalcitrance of the illusion in the face of our beliefs about it). Given this, the general consensus is that Churchland does not offer enough evidence for this claim, and that some of the neurophysiological data is also against him (i.e. such perceptual alteration may not be physically possible). For example, there is evidence that it is possible to soften the effects of visual illusions over time, but there is as yet no empirical evidence to suggest that they disappear entirely in subjects susceptible to them (Gilman 1991; McCauley & Henrich 2006, p.5).

I want to spend some time on this last point about empirical evidence. First of all, there is an often-cited study (Segall et al 1966) published years before Fodor’s theory that concludes that susceptibility to the Müller-Lyer illusion is not pancultural, in that some cultures do not fall for the illusion at all, while some acquire it as children in the first twenty years of development. This, it is argued, weakens Fodor’s case against Churchland, in that he can no longer claim that it shows that theory-neutral perception is possible, since it is assumed that the failure to see the illusion must be entirely cultural (for example, McCauley & Henrich 2006). But this argument against theory-neutrality, as McCauley & Henrich (2006) acknowledge, also raises a crucial distinction between two different kinds of modular penetration, or weakening of informational encapsulation, and one that I think lessens the damage done to Fodor’s account by this evidence. This distinction is between synchronic and diachronic penetration of modular input systems (McCauley & Henrich 2006, pp.5-7). Synchronic penetration occurs when acquiring a theory instantly (or very quickly) alters perception. This, then, is not supported by the Müller-Lyer illusion, since the illusion persists even after we acquire the belief that the lines are of equal length, regardless of whether or not the illusion is pancultural. Diachronic penetration, on the other hand, takes longer (weeks or even years), and concerns the effects on perception of experience and practical (rather than theoretical) training. For my purposes, then, this distinction emphasises the difference between fast changes to perceptual outputs by formal theory acquisition and slow changes to perceptual outputs through practical experience, or folk theory acquisition. It seems to me that DeBellis is therefore referring to synchronic penetration only in his discussion. He thinks, after all, that ingesting music theory is the only perceptual enabler for musical structure. Diachronic penetration, on the other hand, is invoked by Fodor himself to describe the linguistic input system. It describes how babies born in Norway speak Norwegian while babies born in Japan speak Japanese; while the underlying systems are hardwired, the babies’ respective cultural experience gradually shapes the language they eventually speak (McCauley and Henrich 2006, p.22).

It is important to keep this distinction between synchronic and diachronic permeability in mind when interpreting the Segall et al study (1966). As McCauley and Henrich point out, “nothing about any of the findings we have discussed establishes the synchronic cognitive penetrability of the Müller-Lyer stimuli. Nor do the Segall et al (1966) findings provide evidence that adults’ visual input systems are diachronically penetrable. They suggest that it is only during a critical developmental stage that human beings’ susceptibility to the Müller-Lyer illusion varies considerably and that that variation substantially depends on cultural variables”. Given this, I would argue that if adults who are responsive to the illusion are not susceptible to either form of permeability, and children only to diachronic within a particular window, then the question becomes one of establishing whether this impacts upon DeBellis’ and Churchland’s arguments for theory-dependence – which depend upon synchronic penetration of visual systems more than they do diachronic – more than it impacts upon Fodor’s arguments for theory-neutrality. Clearly, as McCauley and Henrich argue, it impacts unfavourably upon both arguments (2006, p.22). But I would argue that it leaves Churchland, and those who argue for scientific relativism generally, worse off than Fodor, since there remains no evidence for penetrability of either kind against the Müller-Lyer illusion in individuals susceptible to it.

Admittedly, the evidence under discussion in these studies is necessarily restricted to visual perception. Returning to DeBellis’ argument, he needs to show that aural perception is synchronically permeable by theory for his own account to succeed. However, his overall claim is much broader. Central to his defence of his theory is the example of trained musicians Churchland (1988a) discusses. According to this example, the musicians’ ingestion of music theory enables them to directly perceive complex musical structures in what they hear (1995, p.102)35. They literally experience a dominant seventh chord, for example, as a dominant seventh, exactly as DeBellis describes it above. This, he argues, shows us that perception in general can be altered over time36: that is, that synchronic permeability in music supports diachronic permeability in perception more generally. This seems to fit nicely with Churchland’s view about the theory-dependence of observation – and as Churchland himself uses musical perception as an example of the synchronic penetration of theory into aural perception, DeBellis seizes upon this to put his case for the “natural” perceptual capabilities of trained musicians and for perception in general (1995, pp.102-103). However, I am not convinced that he can argue from synchronic to diachronic or even vice versa, as he seems to be doing. Diachronic penetration is acknowledged to be less dependent upon formal theory, although he is arguing that we can use acquired formal theory over time to induce it.

35 This is presumably also what makes the difference between intermediate and trained listeners on his account; intermediate listeners have the necessary beliefs (synchronic) but have not yet put in the time (diachronic) to allow those beliefs to become perceptual concepts.

36 He also criticises Fodor’s decisions about which terms can be considered “basic categories” (i.e. basic perceptual categories like “dog” that we pick up in early childhood), with the aim of establishing that terms like “dominant seventh” can be regarded as basic (see DeBellis 1995, p.99). However, since I am arguing that his entire view stands or falls on his claims about the theory-dependence of observation, I will not discuss the details of his view here.

However, I think that there are some broader problems with DeBellis’ case. DeBellis states throughout his exposition that the “burden of proof” lies with Fodor with regard to “natural” musical perception in trained listeners (i.e. the evidence of such listeners’ experience indicates, on DeBellis’ view, that Fodor should be asking “why can’t perception be penetrated by music theory?” as opposed to “why should it be?”) (DeBellis 1995, p.92). It is not clear that this is the case. It is not enough for DeBellis to point to this “natural” perception of musical features under various theoretical concepts, which is after all the central prop of his case: he argues he can offer an explanation for it in the theory-dependence of observation. Fodor, however, has also offered an alternative explanation of the expert listener’s perceptions (1988). He suggests that it might just be practice in listening to music, as opposed to ingesting music theory, that produces this “natural” experience (much like language acquisition), and even more pertinently, that Churchland has to show that actual perceptual capacities are altered by musical education, as opposed to the listener just knowing more about music (1988 in 1990, p.260). The problem here for Churchland and DeBellis is the lack of evidence for beliefs penetrating all perception. As things stand in this regard, the advantages of the experimental evidence and wider explanatory scope are all on Fodor’s side. As I stated earlier, until Churchland and DeBellis can produce evidence for the diachronic alteration of perception by theory over time (as yet there is none, whereas Fodor has produced substantial evidence for the opposite case in the recalcitrance of visual illusions like the Müller-Lyer37), then it seems that any “burden of proof” must lie with them, not with Fodor.

37 DeBellis seems to believe that his insistence that it is “natural” to perceive as is such evidence (for example, see 1995, pp.102-103). However, this evidence is at best anecdotal (along the same lines of the evidence provided by my analysis lecturer) and therefore, in my view, not strong enough to challenge Fodor’s argument regarding the Müller-Lyer illusion.

In addition, Fodor’s account explains more than DeBellis’ without any undesirable consequences. Until Churchland (and therefore DeBellis) can account for the way that scientists objectively debate theories, as Fodor has done with his appeal to theory-neutral observation (in which we are all happy to settle disputes about the Müller-Lyer illusion with a ruler), Churchland (and therefore DeBellis) can’t consistently argue for both theory-dependence and scientific objectivity. With this small-scale example of scientific debate, Fodor is pointing out that if Churchland is right, we may be committed to scientific relativism, and that adopting the view that all observation is always theory-laden unavoidably commits us to it. “The thing is”, Fodor explains, “if you don’t think that theory-neutral observation can settle scientific disputes, you’re likely to think that they are settled by appeals to coherence, or convention, or – worse yet – by mere consensus” (1983, in DeBellis p.85). There are, then, serious consequences for adopting DeBellis’ view beyond those concerned with musical understanding. They are serious because DeBellis seems to be assuming that he can still appeal to scientific objectivity, even though his account depends upon the theory-ladenness of observation. In the light of all this, I would argue that the theory-dependent foundations for DeBellis’ view are much more unstable than he supposes, and that therefore Fodor’s account of theory-neutral perception is preferable, even in light of the doubts about the universality of the Müller-Lyer illusion raised by Segall et al (1966).

However, this theory-laden view of perception is only one part of DeBellis’ overall argument. He is, remember, supplying an explanation not only for how trained musicians might experience and understand music, but also for why it is that trained listeners can perceive more in the music they hear than untrained listeners. If these trained musicians perceive more, the implication is, then their understanding is superior to that of untrained listeners. And if the properties of the music essential to that understanding are only to be couched in musical theoretical concepts and experienced under such concepts, then there is not much left at all for the untrained musician to understand. Partly explicitly and partly by implication, then, on DeBellis’ view untrained listeners simply do not understand music because they cannot even perceive the musical structures necessary to that understanding. Their experience of music, that is, is non-conceptual.

The problem with this is that it sits uncomfortably with the reported experiences of untrained listeners. They do seem to be able to describe what they are hearing without theoretical concepts, and they do seem to be able to form opinions about the music and relationships with it that would not be possible if DeBellis is right and they can’t even properly perceive it. In the next section, I am going to expand upon this objection to DeBellis’ view, and examine an alternative argument by Davies that can accommodate the experiences of both the trained and untrained listener without the dangerous appeal to theory-dependence made by DeBellis. The overall aim is to offer an account in which untrained listeners can be said to understand the music they hear.

2.2 The listening majority – describing music and the purpose of analysis

The vast majority of listeners (ordinary or intermediate listeners) do not seem to be adequately accounted for by DeBellis, whose concern is with the experiences of the expert musician. The affection and interest that ordinary listeners hold for their favourite musical work also sits uncomfortably with the implication that, however much they enjoy the music or respond emotionally to it, they just don’t understand it. The idea that the true understanding of music is exclusively held by a minority group of musicologists who possess the required theoretical concepts seems implausible, given the possible depth and profundity of the ordinary listener’s musical experience. Moreover, the ordinary and intermediate listener can often describe their listening experience (and why they enjoy or even dislike a particular piece) in great detail. Such descriptions can include what would qualify as recognisably structural features of the music – certainly large scale features like repeated themes or contrasts between the movements in a symphony are easily discussed without using formal terminology.

Given this situation, I think that DeBellis needs to be able to account for the experience of ordinary listeners in such a way as to also account for the importance they place on it. As an alternative, Davies raises two objections to DeBellis’ account that offer such an explanation (2011, chapter 7). He argues that first, musicological concepts/language are by no means the only appropriate ways to understand/describe a listening experience; and secondly, that the kind of structural features that DeBellis thinks are crucial components of a superior understanding are in fact not important to understanding at all (or at least, not important in the fundamental way DeBellis thinks they are – they are at best “supplementary knowledge” more pertinent to appreciation (as I will argue) than to understanding (Davies 2011, p.102)).

Davies’ first objection is that a “folk” musicological understanding is just as effective as a formally trained one – that is, that an ordinary listener can achieve and demonstrate the same level of understanding using everyday language as the expert using musicological terminology can. DeBellis’ account, as we have seen, seems to turn on the idea that whatever is ineffable in the untrained listener’s experience – i.e. whatever the listener cannot accurately describe – is a marker of that listener’s lack of understanding of the work. This is essentially his basis for arguing that musical training provides the only concepts that can adequately describe the music being heard (Davies 2011, p.101). Davies argues that “there are always aspects of the detail of occurrent perception that cannot be described” (2011, p.99). We can struggle to describe, for example, the exact shade of green of a leaf. This might also explain why listening to music in real time tends to seem richer in detail than replaying it from memory “in your head”. So ineffability, in short, may not be problematic after all and may not indicate a serious lack of understanding.

Added to this is Davies’ point that while musical training may be an efficient way of bringing musical structural features to the attention of the trained listener, there is nothing in this acknowledgement to imply that it is the only way. “Even if she is not formally trained”, he says, “the ordinary listener possesses a "folk" or everyday musicology and a barrage of terms that serve it” (2011, p.103). Through exposure to music and, he says, “as a competent language user, she knows and uses words like tune, rhythm, beat, volume, discord, harmony, chord, note, and key” (2011, p.103). This folk terminology, he argues, is enough to express an understanding equivalent in depth to the trained listener’s. Davies argues that the structures that matter can be recognised by an untrained listener, even if they cannot formally name them; essentially the broad concepts of folk musicology are enough. For example, most listeners can identify the recapitulation as (at the very least) “the tune that came before”; and they can usually describe the shape, pace and colour of that tune in some detail38. This can involve some sophisticated feats of recognition, and can be described in folk musicological terms, or even hummed and whistled if need be. This kind of recognition, he says, is more relevant to hearing the work with understanding than being able to name the structure being recognised as precisely as DeBellis expects.

38 Malcolm Budd, in “Understanding Music” (1985) extends this point: “To experience music with musical understanding a listener must perceive various kinds of musical processes, structures, and relationships. But to perceive phrasing, cadences, and harmonic progressions, for example, does not require the listener to conceptualise them in musical terms” (pp.246-247, quoted in Levinson 1997, p.124).

Identifying this kind of recognition is tied to the second of Davies’ objections, which concerns the purpose of musical analysis itself. The kinds of folk musicological terms he identifies actually describe the larger-scale musical features that Davies argues should be accessed by all understanding listeners, trained or otherwise. Terms acquired through formal training, it emerges, don’t always tell you anything about the music that is important to the listening experience; they may not contribute to our heard understanding of it at all. The reason for this is that they identify the wrong features of the music – those that are (for the most part) score-based rather than heard. For example, one of the central requirements of formal analysis is often the breaking down of structural features and identifying their component parts. Davies argues that as far as our listening understanding goes, this is entirely the wrong way around – it is the larger structural features that are identifiable by ear that matter to the understanding listener. This is because, in his view, musical understanding for the listener proceeds in the individuation of melodic and harmonic gestalts as the work unfolds (2011, p.102). The listener is carried through a series of realisations about the work’s structure and colour, rather than dissecting, labelling and reassembling the musical components as they hear them.

This is, however, not simply due to the way we experience the music but to the way it is constructed. As Davies explains, “melodies are not merely strings of pitched tones (or intervals) and chords are not merely aggregates of pitched tones” (2011, p.102)39. The correct identification of chords, for example, depends absolutely upon the context in which they are set within the work. “In tonal music”, says Davies, “we normally identify them in terms of their harmonic functions and the relative frequency with which they are preceded by certain chords and resolve to others, so if there is a reductive analysis to be effected, it is from harmonic sequence to chord succession, not the other way…. Indeed, the principles of identity and individuation that govern the "simpler" constituents of musical sound structures usually depend on higher ones” (2011 pp. 102-103 and see 2001, p.58).

39 Davies addresses the question of how we recognise melodies in Musical Works and Performances (2001), pp.54–58, in which he argues that we don’t recall melody as a succession of pitches but as a shape.

This is in fact what makes learning how to formally analyse works much more difficult for a student who is used to either playing or listening to music without such analytical training (i.e. they have learnt “by ear”). Studying analysis is standardly done with the emphasis on working from a score rather than through listening. The rules governing identification of chords or even key signatures depend on the student’s having an overall grasp of the harmonic structure of the work (or the phrase) under examination before analysis of its component parts can take place. This can be confusing for the novice, as the whole exercise of working from a score seems to depend upon relative rules that are continually shifting, with little reference to how the music actually sounds. For example, the same notes can be correctly identified as one chord in one harmonic context and another chord in another harmonic context. I will quote Davies at some length in explanation:

It is a nice question if G-B-D (reading from the lowest to the highest note) is the same chord as B-G-D or D-B-G. The answer is not as obvious as most musicologists would assume. And if they are the same chord, are B-D-F, G-B-F, and G-D-F also the same? Were any of these combinations to occur in a context where it is preceded by a major chord on the fourth degree of the C-major scale or a minor chord on the second degree of the scale and followed by a tonic C major chord, they would rightly be identified as the same, I think. Meanwhile, a second occurrence of B-D-F might be properly regarded as a different chord from the earlier B-D-F if it resolved unexpectedly to A-C-E. (It might be heard in terms of an unsounded E, rather than G, root.) And C-E-G might be identified as the same chord as G-B-D if the piece had modulated to F in the meantime, because these two chords are functionally equivalent. The identification of chords is not settled simply by considering the intervals or the pitch types they involve (2011, p.102).

The point is nonetheless a simple one: DeBellis requires that listeners should be able to name individual components of chords as evidence of their understanding – indeed, not only name but perceive these components under their functional labels. But Davies is pointing out that the untrained listener is not losing very much in being unable to do this. Because of the way music is constructed and perceived, this ability is, at best, evidence of extended supplementary knowledge rather than evidence of superior understanding. In fact, he adds, we normally don’t assume people with perfect pitch (the ability to name absolute individual pitches without training) understand music any better because they can identify notes, so why should we assume so in this case (2011, p.102)? Moreover, recognition of any kind (outside of music theory) does not always depend upon the ability to label the object recognised; it is possible to recognise a face, for example, without remembering or even knowing what name to attach to it (Davies 2011, p.103). This is another blow for DeBellis’ reliance upon the theory-dependence of observation, or at least for his interpretation of it. Even if perceptual concepts are required to recognise musical structures, they certainly do not seem to consist in formal musicological terms. This will prove to be an important point in the chapters to follow.


So to summarise, the objections discussed here have shown that there are serious problems with the foundations of DeBellis’ view, and that there are further problems with crucial arguments about musicological terminology and musical structure. Davies’ objections focus on DeBellis’ claim that untrained listeners’ experiences are ineffable and beyond description. In this respect, Davies’ second objection therefore supports his first; folk musicological terms are sufficient indicators of understanding of music because the formal musicological terms insisted upon by DeBellis do not describe the large-scale structural features necessary for an understanding listener. Analytical skill, that is, may not produce a listener with superior understanding after all. Davies argues that there is nothing preventing an untrained listener from attaining an equivalent degree of understanding to that of a trained listener. I think that his second objection about analysis, then, leans towards a question about the nature of what is being understood. It suggests that analysing the sorts of features emphasised in a score may not accurately represent the work as we actually understand it when listening. It is pointing listeners away from exclusively score-based structural features and towards those accessed through the listening experience.

Davies’ argument against DeBellis also assumes that perception can be theory-neutral, as Fodor argues. In Fodor’s view, there is a clear distinction between the perceptual reports produced by the modules and the beliefs that we have about them, ensuring the neutrality of our immediate observations. I would also argue that this provides us with a significant degree of objectivity about our immediate perceptions. The modules are equipped with only very basic algorithms that are not penetrable by our beliefs, as the Müller-Lyer illusion demonstrates (at least for individuals susceptible to the illusion) (McCauley & Henrich 2006). This is, I think, what enables Davies to suggest that our musical perceptions can be described in different terminologies, folk as well as formal, because we are all perceiving the same music. Further to this, I want to suggest that the distinction between perception and cognition supported by modular theory might also map onto a distinction between musical understanding and musical appreciation; that is, a distinction between what we hear in the music and what we know or believe about it. It will be my aim in the next few chapters to argue that because of the modular way in which we perceive music, most listeners can achieve a working understanding of the music they are familiar with. Given this, it might be more appropriate to say that it is not understanding that comes in degrees, but appreciation. While we can learn more about the music, leading to a greater appreciation of it, our understanding comes courtesy of a number of factors including the reports of the perceptual modules, culture, and talent.

So far in this chapter, however, I have shown that the traditional score-based view of musical understanding (as exemplified by DeBellis’ account) has some serious problems. In the next chapter, then, I will begin to assemble an alternative account to the traditional view. I want to first of all examine the nature of these “heard” properties of music in more detail. As I argued in chapter one, these include expressive properties amongst the large-scale structural features in heard music. That such properties should be ignored by adherents of the traditional view of musical understanding might again, I want to suggest, tie back to the old Hanslickian bias towards scored properties and away from expressive ones. I will introduce the idea that expressive musical properties are response-dependent, and discuss response-dependence with a view to grounding musical perception more objectively in the (as I see it) modular processes behind it. This grounding will also act as part of my argument against accusations of circularity inherent in definitions of response-dependent properties (Levinson 1996; Pettit 1991). I will argue that such accusations have prevented any serious consideration of the role of expressive properties in musical understanding in the past. The core of my argument in the next chapter is a case for granting expressive musical properties their proper place in the overall picture of musical understanding and appreciation. However, I will first of all examine Davies’ own account of musical understanding in more detail, as it forms the foundation of my own account.


Chapter Three: Experience and Understanding

In general neither the lack of a certain concept of a particular phenomenon nor the inability to recognise instances of the phenomenon as falling under the concept prevents a person from being sensitive to the presence of the phenomenon in a work of art and alive to the aesthetic or artistic function of the phenomenon in the work….

Malcolm Budd (1985a)

In the last chapter, I discussed DeBellis’ view of musical understanding being dependent upon formal analytical concepts. I argued both that this view commits him to the theory-dependence of observation thesis and that it does not account for the apparently meaningful listening experiences of the uneducated majority. Moreover, the quote at the head of this chapter from Malcolm Budd exemplifies a now commonly-held view that understanding music is not a purely theory-dependent exercise40. The purpose of this chapter, then, is to develop a more inclusive version of this view as an alternative to those accounts based on theory-dependence. To be inclusive in the required sense, such an alternative account should stem from the listening experience rather than the score; I will argue that it should therefore turn upon some of the music’s salient heard properties, including its expressive properties (that is, only the emotion the music is expressive of, not the listener’s own emotional response to the work).

40 Other supporters of this view include Tanner & Budd 1985, Kivy 1990, Davies 1994, Levinson 1996 & 1997.

However, at the risk of generalising too widely, expressive properties tend not to feature heavily in accounts of musical understanding, either theory-dependent or otherwise. While expressive properties of the work feature in discussions principally concerned with our experience of the work, or in analyses of various expression theories, there remains a reluctance to bring the way we understand expressive features of music out of our experience and into our accounts of understanding more explicitly. This minor role given to emotional expression seems incongruous given the accepted objectivity of such expressive properties41. It appears that, even given the rejection of the need for analytical training, understanding nonetheless still concerns formal structural matters as traditionally conceived. A chief aim of this chapter, then, is to offer a diagnosis as to why expressive properties have not been taken to be central to understanding and to provide a reassurance that they can be. Once this is achieved, a further aim is to outline an account of musical understanding that places the expressive properties of music front and centre.

In my view, the reluctance to bring expressive properties into our accounts of musical understanding is partly due to the fact that expressive properties are response-dependent; that is, like redness, or bluntness, or smoothness, they are properties that we attribute to an object out there in the world because of our particularly human responses to that object (Pettit 1991; Levinson 1996). Nothing, that is, anchors that property to the object in the world other than our subjective response to it. When we look at an object and experience sensations of redness, we attribute the property of redness to that object. But another animal, whose eyes might not be configured to redness, might look at the same object and not experience redness at all (they might instead see it as green, or grey). Response-dependent properties are therefore “higher-order” or “secondary quality” properties in that they are subject-implicating (Pettit 1991, p.597). There is nothing other than our experience with which to underpin them.

So why would this response-dependence act as a deterrent to expression theorists? Because the response-dependent nature of expressive properties means that the only way to underpin a definition of understanding listeners is through reference back to those listeners’ understanding (Levinson 1996, p.109; Kania 2012). As Levinson explains, being able to discern the expressive qualities of music makes you an understanding listener. Yet,

41 Like Davies, I am taking it as read that, all things being equal, if two listeners within one culture listen to the same piece of textless music and one reports the piece as expressive of happiness and the other reports it as expressive of sadness, then “one of them is wrong” (Davies 2006b, p 146). But while there will always be a clear distinction between happy and sad music, two listeners arguing over whether or not a work is expressive of sadness or grief, on the other hand, may not truly disagree (Davies 2006b, p.146).

78 if we are concerned with musical expression in a theory of understanding, the only way we can specify an understanding listener is through the ability to hear expressive qualities in music. That is, if you understand music's expressive properties then you understand music; and you understand music if and only if you hear expressive properties in it. Levinson describes this situation as “vacuous”; Kania describes it as circular; either way, it is enough to have caused theorists in musical understanding to treat expressive properties with extreme caution in the past.

I will argue that this circularity (or vacuity) is not the problem it appears to be at first glance. In agreement with Philip Pettit (1991), I will argue that the problem arises out of a distorted view of response-dependence itself. Pettit argues that attempting to define such properties in this analytic way is the wrong approach. On his view, we should be describing how users acquire the concept of redness rather than attempting to define the property of redness itself. Such definitions might describe the situation from the theorists’ perspective (that is, looking in from the outside), but they do not describe how users of response-dependent concepts out in the field learn how to apply those concepts. They learn how to use them through a constant and unconscious development of “habits of response and practices of self-correction” rather than through the conscious application of a formal definition (Pettit 1991, p.601). This is Pettit’s “ethocentric” version of response-dependence, and I will argue that it can save expression-based accounts of musical understanding from accusations of circularity. My argument will turn on Pettit’s distinction between our concept of redness and the “Sloane St set” concept of “U-ness” from Nancy Mitford’s novels (1991, p.613). I will then show that modularity theory can reinforce Pettit’s more objective underpinning by showing how ethocentric practice might be part of the way we “train” our perceptual modules.

I will also examine the nature of musical expressive properties themselves more broadly. Traditionally, musical expressive properties have been seen as a side-effect of music, or as some kind of musical content, as I discussed in chapter one. I will argue (along with Aaron Ridley 1993, p.594), that musical expressive properties are not something the music does or creates, but a part of the music’s structure; and we understand expression along with the rest of musical structure because musical structure simply doesn’t make sense without it. This in turn allows me to argue that understanding expression in music is evidence of wider structural understanding. It does not comprise musical understanding in and of itself. This, as I will explain, is also a part of the defence against the charge of circularity. Understanding expression, in my view, can only be seen as damagingly circular in accounts in which it is necessary and sufficient for the understanding listener to understand expression. That is not the kind of account I wish to defend.

To this end, the chapter itself will be structured as follows. I will discuss Davies’ argument that understanding “comes in degrees”, and that a listener with no musical education can reach an equivalent level of understanding to that of an educated listener (2011, chapter 7). He also suggests that musical perception might be modular. I will argue that Davies’ account of musical understanding is much stronger at the lower end of the understanding degree scale, and that it raises the question of how to distinguish between understanding and appreciation (if at all). This point will be relevant to later chapters. I will also examine Levinson’s sympathetic account of “Music in the Moment” (1997), particularly his argument for “concatenationism”, in which he proposes that listeners don’t need or indeed have the kind of large-scale architectural concept of the music that analysts (like DeBellis) think they require.

Both of these accounts provide the perfect foundation for the defence of uneducated listeners. But neither, I will argue, adequately confronts the question of emotional expression and its role in this uneducated understanding of music. In section three, I will discuss one account that does: Constantijn Koopman and Stephen Davies’ view of music from the listener’s perspective, or “from the inside” (2001), in which they introduce the concept of “formal experiential meaning” in music as heard as opposed to music as analysed. This is very close to what I am aiming at with my own account of understanding. I think it expresses more clearly “what it is like” for the majority of listeners (Nagel 1974; Jackson 1982). However, Koopman and Davies’ casting of much of this experience as “innate” and therefore non-cognitive raises questions about the distinction between cultural immersion, learned behaviour and “innate” reflexes in cognition. I argue that these questions can be answered by some of the finer distinctions in Fodorian modularity theory, and that their conception of “innate” is actually more cognitive (in the sense of less “dumb” reflexive) than they allow. This is important because I will eventually argue that such modular perception counts as understanding.

First of all, however, I will confront the problems surrounding response-dependent expressive musical properties playing a central role in musical understanding. Generally speaking, philosophers tend to be leery of placing expressive properties in anything other than an incidental role in any theory of musical understanding. The sources of this leeriness, in my view, vary in substance. There are three main sources: first, habit (in that expression has been underemphasised since Hanslick argued it is unimportant to understanding); secondly, a tendency to view expression as an effect of music (that is, something like musical content); and thirdly, worries about the possible circularity involved, as I discussed above, in placing expression front and centre in any account of musical understanding due to the response-dependent nature of expressive properties. It is this third source, I argue, that is the most substantial. In the next section, I will describe the circularity it raises in further detail, return to the concept of response-dependence itself, and assert that it is largely the perceived subjectivity of response-dependent properties that allows for accusations of circularity in their definition. Once we unpack how this subjectivity is over-emphasised, I will argue that we can begin to dismantle the accusations of circularity that stem from it.

1. Expression, circularity and response-dependence

There are two levels to the accusations of circularity under examination here. The first is, as I mentioned above, the core of the argument against placing expressive properties of music front and centre in a theory of musical understanding. This is a problem for the kind of theory of musical understanding that I aim to construct: one that encompasses a contour theory of musical expression, which, as we have seen in chapter one, turns on resemblance between musical emotional expression and bodily or vocal emotional expression. It is a problem because the resemblance invoked makes musical expression a matter of response-dependence; in this case, the responses of the listeners are evidence of the resemblance between the musical expression of emotional states and other expressions of emotional states (Kania 2012; Levinson 1996, p.109). If it is then claimed within a theory of musical understanding that only understanding listeners hear the expression in the music, then the circularity arises as follows:

a. Only understanding listeners can hear music’s expressive properties.

b. What counts as understanding?

c. The ability to hear expression in music.

This, as Levinson says, provides a “vacuous” specification of an understanding listener (1996, p.109). But it’s not just vacuous at this level. The second accusation is directed towards formulations about the expressive properties of music themselves (say, the sadness expressible in music). These formulations face the same obstacles as the above outline of a theory of understanding due to the response-dependent nature of expressive properties, as follows:

1. A piece of music M is sad, iff M is such as to produce sensations of sadness (under normal conditions to normal listeners), or:

2. For a piece of music M to express sadness, M must sound sad (under normal conditions to normal listeners).

Both of these formulations offer circular and therefore uninformative analyses of expressive properties, since in both cases sensations of sadness are employed to account for sensations of sadness. So in order to achieve my immediate aim of placing expression into a theory of musical understanding, and my overall aim of discovering exactly how music goes about doing its expressing, I will need to overcome (or at least neutralise) the toxic circularity inherent in both levels of accusation above: that is, the circularity inherent in the role of expression in a theory of musical understanding and the circularity inherent in the response-dependent nature of expression itself. Since it is the latter circularity at the bottom of it all - in that the response-dependent nature of expressive properties causes the difficulties in producing a non-circular definition of understanding them in music - I will begin with an examination of formulations (1) and (2) above. Formulation (1) describes the arousal theory of expression, which already has its problems, as I argued in chapter one. It is also viciously circular. Given that I have already argued that the arousal theory is not viable, I will not be engaging in a rescue attempt for formulation (1) in this section. I am going to argue instead that I can rescue formulation (2) (which describes a version of the less problematic contour theory of expression) from circularity, and that in so doing I am able to give our understanding of such expression a role in my theory of musical understanding.

The key to my argument is that the above accusations are based on a distorted view of response-dependence itself. It is the subjectivity of response-dependent properties inherent in the whole concept of response-dependence, I want to say, that is behind the circularity. Our perceptual sensations and experiences, after all, cannot be directly compared with the perceptual experiences of others or examined by other perceivers. This is a much broader problem than the problems facing our understanding of musical expression; it extends into our perception of the world in general, since in fact many more aspects of our perceptual experience are response-dependent than just our experience of musical expression. There are many concepts that exist through our capacities for particular responses, and it is hard to see how beings lacking these capacities could access such concepts at all (for example, the concept of redness is response-dependent; smoothness is response-dependent, as is blandness or sharpness). Some philosophers even argue that response-dependence is a global phenomenon, and that none of our perceptual concepts can be completely objective, thereby placing realism itself into doubt (see Putnam 1981, p.63). I will argue that if this subjectivity can be de-emphasised, and our responses grounded by other less subjective means, then the accusations of circularity directed at our understanding of musical expression lose their force.

I will begin my argument with a discussion of Philip Pettit’s “ethocentric” conception of response-dependence, which allows him to argue that response-dependence can be compatible with realism (Pettit 1991, p.601). Pettit points out that much of the discussion of response-dependence (itself a relatively recent term coined by Mark Johnston in 198942) has focussed upon defining response-dependent properties, and that this is what has led us to circular definitions. To use the traditional example, the definition of the property of redness is “the property possessed by something which produces red sensations in us” (Pettit 1991, p.600). On this traditional view, something is red “if and only if it is such as to look red to normal observers in normal circumstances”. It is therefore “a priori that [these normal observers] will not be in ignorance or error about the redness of something presented to them” (Pettit 1991, pp.599 – 600).

Pettit finds this definition of the property of redness implausible, and not just because it is circular. It is implausible because, as a description of how we apply response-dependent concepts, it also assumes that we somehow already know what the sensation of redness is like (1991, p.600). The definition doesn’t explain, that is, how we acquire the concept of redness itself in order to recognise our red experiences. It seems unlikely that the concept of redness is innate, on the one hand; but on the other hand, the response-dependence of concepts like redness obviously rules out any form of objective comparison of other observers’ red sensations to establish their redness. We can’t be shown directly, that is, what other people’s red sensations are like. How, then, asks Pettit, do we acquire response-dependent concepts like redness in the first place, and how do we learn to employ them (1991, p.600)?

Pettit uses this question, and its inherent focus on response-dependent concepts rather than properties, to describe his ethocentric conception of response-dependence. This turns on adopting a different approach to the biconditional in the traditional formulation. Pettit argues that the biconditional describes what goes on from the theorists’ point of view, but not from the point of view of the actual user of the concept out in the field. It is not the case that in order to have a red sensation, an observer has to be consciously aware of what comprises normal conditions and normal observers. Rather, observers arrive at normal conditions and observers in practice: by noting the similarities between red objects (what do these objects have in common? They are all “that colour, pointing at relevant examples”) and then by noting under what circumstances observations of redness might not count (Pettit 1991, p.600). Abnormal circumstances, then, might be identified by noting how other observers apply the concept, or by noting exceptions about the other users themselves. This is what allows observers to form working suppositions about colour stability across a variety of circumstances and examples, and hence allows the observer to access the concept of the colour itself. As Pettit explains, “thus the colour which they identify by reference to certain examples as that colour is not whatever colour property the objects present, but whatever property they present under conditions that can be allowed to count” (1991, p.600). An ethocentric account of response-dependence, then, centres on the way the users themselves might access response-dependent concepts: through “habits of response and practices of self-correction” rather than through the conscious application of a formal definition (Pettit 1991, p.601). This way, observers can legitimately possess the concept of redness without being born with it, which doesn’t seem to be the case, and without the impossibility of comparing actual sensations with other observers.

42 In “Dispositional Theories of Value” Proceedings of the Aristotelian Society, supplementary volume 63, pp.139-74

It may be asked at this point, however, whether Pettit’s ethocentrism has merely shifted concepts like redness from an individual level of personal subjectivity to a group level: a kind of relativism in which a particular group of observers simply decides that a concept looks like that and supplies examples of situations in which that concept will apply for other members of the group to copy. Such a relativist view would not be compatible with realism. Pettit addresses this worry by providing a clear distinction between an ethocentric response-dependent concept and a relativist response-dependent concept. He discusses a contrast between the concept of redness and the concept of “U-ness”, as devised by Nancy Mitford (Pettit 1991, p.611). “U-ness” is a concept applied by the Sloane Square Set in Mitford’s novels to flag what can only be described as “cool” versus “uncool” behaviours. As Pettit explains, “to speak of lavatories is U, of bathrooms non-U; to lay cloth napkins at table is U, to lay paper napkins non-U; and so on…” (1991, p.611). And of course, the Sloanes are notorious for moving the goalposts on what is U and non-U as soon as they feel too many of the common non-Sloanes are catching on. Things that were once U suddenly become non-U and vice-versa.


The U-ness case is a perfect example of a group deciding upon what constitutes normal conditions and normal observers and thereby ensuring that anyone with access to the rules has access to the concept. Getting things right when it comes to U-ness, then, only has to do with the say-so of Sloanes. However, Pettit argues that this is not his ethocentric view of response-dependence. Contrasting the concepts of U-ness and redness, for example, reveals a significant difference between the two. Pettit argues that it is not merely the say-so of the group that dictates proper usage of the concept of redness, but things out there in the world; that is, when we see something as red, we see it “because the thing presents itself – and, if there is no misrepresentation, because it is – a certain way” (1991, p.613). We might say, then, that the distinction between U-ness and redness has to do with a realist disanalogy. Conditions conducive to the production of red sensations (such as light bouncing off surfaces in a particular way, etc.) exist out there in the world and then influence our usage of the concept of red, while conditions conducive to U-sensations cannot be located out there in the world at all. Pettit highlights this disanalogy by asking Plato’s Euthyphro question (is something holy because the gods love it or do they love it because it is holy?) of both U-ness and redness (1991, p.614). It is clear that “something is red or U because it evokes the U or red response amongst normal subjects”. But when we come to the converse question, the disanalogy emerges. Something evokes the red response in normal subjects because it is red, but you cannot say that something evokes the U response because it is U (Pettit 1991, p.614). This is because U-ness does not exist outside of the Sloanes’ random decision-making processes.

The disanalogy between U-ness and redness allows Pettit to argue that response-dependence can still be compatible with realism, when realism is understood to mean that there are things out there in the world that exist independently of our beliefs about them, and that are discovered, not invented. He can argue for this compatibility because response-dependent concepts can be more like redness than like U-ness; what constitutes a red sensation is not merely a matter of agreement between users of the concept. I argue that ethocentric response-dependence’s compatibility with realism also weakens any circularity in definitions of response-dependent properties. The Sloanes’ concept of U-ness has no connection with anything out there in the world; the concept is entirely relative to Sloane decision-making. It is therefore the case that a definition of U-ness must be vacuous and viciously circular. The property of U-ness is a property of anything that is U, in accordance with Sloane usage and with no possibility of appeal to anything out there in the world.

Redness, however, is a different case altogether. According to the ethocentric story, the practices of self-correction and convergence in acquiring the concept of redness do not align themselves exclusively to the practices of correction and convergence of the users and no further; the practices are themselves dictated by reference to something out there in the world. Any circularity in definitions of properties causing us to have experiences that in turn, after correction and convergence, cause us to acquire the concept itself, is therefore not as toxic as it would be were we to be discussing U-ness instead of redness. As I see it, in other words, the more objective our response-dependent concepts are through ethocentric practice, the less viciously circular our definitions of response-dependent properties will be.

We can now effectively apply this distinction between redness and U-ness to the expressive properties of music, and look to rescue formulation (2) of expressive properties (which describes the contour theory of expression and makes no assumptions about whether or not the emotions expressed and understood are then mirrored in the listener, unlike formulation (1)). Formulation (2) was, as a reminder: “For a piece of music M to express sadness, M must sound sad (under normal conditions to normal listeners)”.

The question, then, is whether our concept of musical expression is more like redness or more like U-ness. I argue that it is more like redness, in that whether a piece of music expresses sadness is not determined only by the say-so of a particular group of listeners. Evidence for a wider objectivity is in the ethocentric way we go about acquiring the concept of musical sadness: the process is clearly one of self-correction and convergence rather than keeping oneself in step with the shifting Sloaney conventions of a particular musical set. Sadness in music, that is, is one of the cases of sadness more generally that counts as sadness. The fact that uneducated listeners can accurately identify emotions expressed by music, and that expression is one of the more accessible of musical properties, is also evidence of expression being more than just a group decision43. And this evidence extends beyond the practices of acquiring the concept as well. For example, even if we were to argue that each culture has its own musical Sloanes, and that music expressing sadness is merely a decision they have made, empirical evidence is now mounting (as we will see in later chapters) that a small group of musically-expressed emotions can be recognised panculturally – that is, by non-Sloanes. But for now my point is this: if musically-expressed sadness is more like redness than it is like U-ness, then I argue that formulation (2) is not dangerously circular. Sounding sad is more than just a decision; it is grounded by something out there in the world.

Furthermore, adding a modular theory of perception to this story shows how such grounding might work. As I discussed in the last chapter, we have perceptual modules that are “set up to be set off by” particular emotional states in other humans. We can now flesh out formulation (2) even further with modular theory, and in so doing make it impossible to equate musical expression with U-ness. The ethocentric model of response-dependence can show that formulation (2) is not doomed to vacuous circularity. But, as an enhancement of this model, I can now propose a third answer to the question of what it is for a piece of music to express sadness:

3. For a piece of music M to express sadness, M must have a property that resembles those expressive properties our affect-recognition modules are configured to detect in other humans.

So how is this formulation, when added to Pettit’s ethocentric model, an improvement upon the traditional and “extremely implausible” definition of the property of redness (Pettit 1991, p.600)? This was, remember, that redness “is the property possessed by something which produces red sensations in us” under normal conditions (Pettit 1991, p.600). First of all, as Pettit himself points out, this is the wrong way of going about defining response-dependent properties, and not just because it is circular. It also assumes we already have the concept in question and are simply comparing the current experience to what we already know a red experience to be like. Rather, Pettit’s ethocentric conception of response-dependence centres on how we acquire the concept in the first place: through a process of self-correction and convergence. And I am suggesting here that this ethocentric practice might be a part of the modules’ training: it sets up the modules to recognise emotions in music. In other words, modular theory might help to explain how ethocentrism works. I will discuss this idea further in later chapters.

43 Levinson also makes a similar suggestion. Concerns about relativism or vacuity in the definition of the understanding listener can be at least partially allayed by noting that “listeners can show that they are to be accounted members of this class, that is, competent judges, not by convergence with the judgements of others with respect to that very work but by a background of convergence with respect to other works, encountered earlier, and also by displaying other of the skills, discrimination capacities and behavioural reactional capacities known, by observation and experience, to go with the capacity for reliable expressive judgement” (1996, p.109).

But before moving on to the rest of this chapter, I now want to return to the original problem at the beginning of this section: the first-level accusation that placing emotional expression front and centre in a theory of musical understanding is viciously circular. What I hope this discussion has demonstrated is that this accusation is not entirely accurate, since the circularity assumed in treating expression like U-ness has been shown to be unfounded. I am arguing that the first-level accusation treats expression like U-ness too. But there is a further point to be made here. It is this: I am not suggesting that musical understanding consists only of understanding emotional expression. It is not the case that, to take an unlikely example, if a listener has failed to notice that a piece of music is sad, they do not understand anything at all about the music. Rather, I am suggesting that understanding emotional expression is necessary, but not sufficient, for understanding an expressive piece of music as a listener and not an analyst.

Having thus assuaged at least some concerns about response-dependence and circularity, I now want to survey some existing accounts of musical understanding that might contribute towards my aim of providing a more inclusive, expression-based account. The first of these is Davies’ account of understanding “by degrees” (2011). While I will discuss Davies’ account in more detail in chapters five and seven, in this chapter I want to introduce its major points as a basis for my account to follow, with particular reference to what I mean by music “as heard”.


2. Understanding by degrees

As we saw in chapter two, DeBellis’ theory-dependent view of musical understanding leaves the apparent understanding of the majority of listeners (that is, untrained everyday listeners) unaccounted for. Davies, in response to this problem, articulates a way in which our musical understanding “comes in degrees” (2011, chapter 7). This way, without DeBellis’ insistence upon theoretical training, we can account for the experiences of untrained listeners according to their own experiences and attitudes towards the music they are hearing.

An effective way into Davies’ account is through his discussion of Levinson’s Music in the Moment (1997). Levinson, like Davies, is defending untrained listeners against views such as DeBellis’ – listeners who, “though untutored, are experienced, attentive and passionate” and therefore understanding (1997, p.ix). Levinson, again like Davies, attacks the idea behind the traditional analytical view of understanding: that listening must be informed by a theoretical grasp of the music’s overall form or structure (as, for instance, represented in a score) to qualify as understanding. Levinson calls this “architectonic awareness”, and argues that this traditional view has confused the awareness of overall form with the momentary awareness during the listening experience of much smaller structural sections that are tacitly, or unconsciously, correlated with earlier passages (or anticipated in later ones) by the listener (1997, p.ix). Because of this confusion, and contrary to popular belief, architectonic awareness in Levinson’s view is not only unnecessary for understanding but also more or less irrelevant (Levinson 1997, p.2). In the next section, I will discuss Levinson’s account and Davies’ response. My aim here is twofold: both to show that this kind of defence of untrained listeners is not as uncommon as many analysts might think, and also to show that such defences are convincing in their accommodation of untrained and trained listeners alike.


2.1 Music in the Moment

Levinson’s account is in turn based on an argument that Edmund Gurney first outlined in the 1880s44. Gurney’s view relies on his observation that we cannot ever perceive an entire musical work in the same way that we can take in, say, the façade of a beautiful building in one sweeping glance (Levinson 1997, p.3). This is because of the way that a musical structure unfolds in real time as we listen; it is never perceptually present to us as a whole. The most we can experience at once on this view are the short passages we can hear in a moment, and their connections to the musical structures in the few seconds prior and subsequent to that moment. It is because of this inability to perceive the whole work that Gurney (and subsequently Levinson) thinks that an understanding of the work’s formal structure is irrelevant to listening to it with understanding. Levinson calls this view “concatenationism”; it encompasses the idea that “musical understanding centrally involves neither the aural grasp of a large span of music as a whole, nor the intellectual grasp of large-scale connections between parts; understanding music is centrally a matter of apprehending individual bits of music and immediate progressions from bit to bit” (Levinson 1997, p.13)45.

There are two important aspects of this theory that Davies highlights as potentially problematic (2011, pp.96-97). First, without an overall grasp of structure (any grasp at all – not just the detailed grasp of the trained listener), but with the view that we can only clearly perceive what is in the musical moment, Levinson needs to explain how it is that we recognise recurring stretches of music that are separated by more than a few seconds.

Levinson argues that when such recognition occurs, it is with what he terms “unspecific” conscious content; the listener would describe their experience as “this bit, or something like it, occurred earlier” (1997, p.64). The missing overall grasp of form, he says, plus the evidence provided by our experience of what it is actually like to listen to a piece of music

44 Gurney’s book, The Power of Sound (1880), is described by Levinson as one of the most underrated works of that century, easily on a par with Hanslick’s in terms of contribution to the field (Levinson 1997, p.1).

45 Levinson also states that there are three other components to concatenationism concerning its implications for form, enjoyment and value (1997, p.14). All of these focus on the idea that there is no room whatsoever for the idea of an overall musical form as you would find recorded on a score – even form itself is described as “centrally a matter of cogency of succession, moment to moment and part to part”.

means that we may recognise a “bit” but not be able to say when or even where it occurred earlier46. This is contrasted with the traditional view, according to which the expected theory-driven report from the listener can identify time and structural place: “This bit is the same as or resembles the passage concluding the development section in the preceding movement” (Levinson 1997, p.64). This kind of “specific” content, says Levinson, is simply not a requirement for understanding.

Davies’ second criticism concerns Levinson’s argument that most, if not all, of a listener’s recognition and understanding is not accessible to consciousness – and therefore not describable by the listener in anything more than the unspecific terms just discussed (1997, pp.72-73). This means that the listener, as Davies points out, is “not able to describe the music in ways that reveal her grasp of it” (2011, p.96). Davies requires that listeners can describe their experience of the music as evidence of their understanding, using folk musicological terminology if need be.

The reasons why Davies raises these two objections against Levinson highlight the major differences between their two accounts. First and most importantly, Davies argues that while Levinson is correct to emphasise the experience of listening, he is also assuming structural understanding in his listeners even while defending concatenationism, since many of his examples involve a tacit understanding of such basic structural features as when the work begins and ends. For example, for an understanding listener to say “this bit, or something like it, occurred earlier” without some kind of structural and temporal backdrop, as Levinson asserts should be the case, could actually mean the bit in question occurred in an altogether different piece the listener heard last week, not in the same piece they are listening to now. Specific structural knowledge of a kind, in other words, is being assumed here, particularly in the case of a listener who is familiar with the work – it would be unusual at the very least to claim that such a listener understands a work they have heard many times before if they are unsure of the order of structural events within that

46 Levinson (1997, p.65) also refers to Nicholas Cook’s views here (1990, p.68), though, as Davies notes (2011, p.96), in contrast to Levinson, Cook does think that large-scale form is perceptible. Levinson quotes Cook: “Unless they have both the training and the inclination to track the form of a piece of music in theoretical terms as they listen, people experience recurrence without actually observing what it is that recurs”.

work (Davies 2011, p.96). Moreover, as Davies notes, Levinson allows that an understanding listener may know how to continue a familiar work (by, say, whistling the melody) (Levinson 1997, p.26; Davies 2011, p.96). Levinson states that continuing or reproducing a work in this way is a sign of understanding but adds that such abilities indicate only that the work can be reproduced from bit to bit, not that the listener has “engaged in conscious cognition of overall form or of large-scale formal relationships” (1997, p.26). Yet this is also puzzling and for the same reasons. It seems that either Levinson is arguing that “bit to bit” reproduction is all the listener is doing (this is tricky to establish when the whistled evidence for reproduction through architectonic listening would sound exactly the same as the whistled evidence for “bit to bit” listening); or he is again tacitly assuming some kind of basic overall structural knowledge. Essentially, Davies is arguing that it is impossible for the understanding listener to do away with “structural hearing” up to a point – and that this point stretches only as far as the average listening process rather than towards DeBellis’ trained analysts. What Levinson’s concatenationism actually shows, says Davies, is “not that grasping a work’s overarching form is irrelevant to musical understanding but that such awareness must arise from the listening experience. Musical form should be heard as flowing from interactions between the work's materials rather than being imposed intellectually and externally via musicologist's textbook schema” (2011, p.97). I agree with this. There are distinctive features of the work, in my view, that are available through the listening experience. As Alan H. Goldman observes (1992, p.38):

“the point is not to comprehend or apprehend an abstract form – one can do so more easily from score or diagram – but to have one's listening informed by an implicit grasp of structure, so that one can hear a cadence as a resolution or a development or variation as such in relation to a main theme.”

Davies extends this point about listening further. If we allow that some kind of informal structural understanding is required to improve Levinson’s account, then there is a sense in which the listener does need to be able to describe their experience as evidence of their understanding. Levinson, remember, does not require such descriptions; the unconscious does most of the comprehending work, so this is not accessible to or describable by the listener. However, Davies is not arguing that our entire listening experience is to be accompanied by a running linguistic commentary to qualify as understanding (pp.98-99). Rather, our descriptions are evidence of understanding only, not records of the process of understanding. He agrees with Levinson in part here; some of our processing is not accessible to consciousness, but not for the same reasons that Levinson favours. Gurney’s Victorian views on perception were by necessity focussed on our experience, given the limitations of the available cognitive theories of the time. Davies, however, updates his account of perception by adopting a modular explanation of recognition continuous with perception. As such, some, although not all, of the processing is encapsulated, and therefore fast, automatic and not retrievable for conscious reporting.

The best way of clarifying this point about modular processing is to summarise Davies’ own account, aside from the glimpses of it that have emerged through his objections to DeBellis and Levinson. We are now at a point where this may be easily done.

2.2 Understanding, appreciation and modularity

Imagine a concertgoer (let’s call her Mary) sitting in a recital hall and listening to a Beethoven symphony. The hall is full of different sounds. To begin with, there is traffic noise and birdsong from out in the street filtering through to the hall. There are the coughs and shuffles and whispers and program page-turns from her fellow audience members. Somebody’s mobile phone goes off. More of these extraneous noises are also coming from the concert platform itself. There is the squeak of violinists’ fingers on the strings, breathing from the players and the sound of turning score pages. There is the humming of an over-involved conductor, or the tap and rattle of valves and keypads on a clarinet (Davies 2011, pp.88-91). These sounds are not necessarily detrimental, as Davies points out – they all add to the dynamic atmosphere of live performance – but the point is this: Mary’s brain is apparently effortlessly sifting through all of these sounds and sorting out which is the music and which is the non-music47. Clearly this is not simply a matter of where the sounds are

47 Davies notes that this ability is what John Cage’s 4’33” trades upon (p.89, footnote; see also Davies 1997). While Cage himself seems to believe that his work (which consists of 4 minutes and 33 seconds of scored silence from a piano) was a piece of music in itself, I agree with Davies that it is in fact a piece of performance art about music. I think that the work actually highlights the way we can effortlessly sort music from non-music, rather than the usual interpretation that Cage offered: that all sound is music, or at least musical in nature.

coming from, as many are sourced at the stage – which is where the music is coming from as well. Equally clearly, some of the sounds are musical in themselves (like the birdsong, or the ringtone of the mobile phone), and yet still Mary instantly understands that they are not a part of the piece she is hearing.

This, then, is the first attainable degree of musical understanding that we, and Mary, seem to be automatically able to achieve: separating the music from the rest of the world, and setting boundaries around what it is that we are hearing and understanding. The next step, once this is done, is that Mary must hear the music as music (Davies 2011, p.90). There is an overlap here in that this is part of what her brain has already done in separating music and non-music. But “hearing as” is also a requirement of the next degree of understanding – that of understanding the piece being performed.

Davies unpacks this process as a series of recognitions, enabling Mary to map out for herself the work’s major structural features and expressive capacities. This starts with understanding where the music begins and ends, so that Mary will know if the music stops abruptly before it has finished, just as easily as she will understand the way the music is constructed towards its ending (Davies 2011, p.90). The beginnings and endings of movements, melodies and subsections should also be recognisable to Mary. She should be able to recognise the return of melodies, which might be difficult due to variations or elaborations on the original tune or orchestration (Davies 2011, pp.90-91). She should be able to recognise the tensions and resolutions within the work, and, says Davies, any expressive qualities that the work might have and how these interact with the structure of the piece; as he says, she will “consider what is expressed in the music in relation to more obviously structural elements” (p.91). Moreover, throughout these recognitions, Mary should feel that the music makes some kind of sense as a whole. Indicators of this understanding can include, as Levinson also noted, being able to continue a melody if it stops; or, being able to tell when a mistake in the performance alters the harmonic or melodic structure of the work (Davies 2011, p.92)48.

48 There are a variety of theories about the role of anticipation in musical understanding and enjoyment, including those of Leonard Meyer (1956). He argues that musical understanding is all about anticipation and expectation, and how we are pleased by these expectations sometimes being thwarted by the composer.

For Davies, if Mary has reached this point with the Beethoven piece she is hearing, then she can be said to understand it to a certain degree. Note that thus far, nothing has been mentioned about education, or what terminology Mary chooses to use to describe the structural features she has recognised in the work. This is because, up until this point, understanding to this degree is something that we are able to do through “hard wiring” and through exposure to the musical conventions of our own culture. We can practise it, and improve our listening skills, but essentially “we take in our mother music much as we take in our mother tongue, by osmosis at a very young age, with the result that its sense seems natural and inevitable and comprehending it is comparatively effortless” (2011, p.92). No theoretical training is required to attain this level of understanding. To argue otherwise, says Davies, is to argue the equivalent in the case of language: that grammatical theory must be taught before language users can be said to understand it. This, Davies correctly notes, is patently ridiculous. “Most people” he says, “are no more familiar with the technical terms of music analysis than they are with words such as "gerund," "dative case," or "pluperfect tense"” (2011, p.93). Just as we are all competent language users without this theory, so we are all competent music listeners without music theory – although skill levels, as they do with language, may vary.

This point about theory is emphasised by the way in which the important recognitions of musical features (such as recapitulations) take place. Part of the reason why education is irrelevant at this level of understanding is that these recognitions, Davies says, are modular functions of our brains, in the Fodorian sense (2011, pp.98-99). Being thus informationally encapsulated, perceptual modules are not penetrable by theory and the processes they perform are not retrievable to consciousness. As he points out, recognition of anything, be it a musical feature or a human face, can take place without our being able to place a name or a label to it. We are also unable to access how it was that we can recognise that person or feature – for instance, we certainly don’t recall to conscious thought every face that we have ever seen in order to ascertain whether or not we have seen that face before (Davies 2011, p.98). Rather, the process is fast, involuntary and unconscious – all hallmarks of modular function49.

The difference between Davies’ and Levinson’s accounts, then, reduces to this: Davies does not believe that we simply receive perceptual input and then unconsciously process it, resulting in a sense of ineffable understanding, as Gurney (and therefore Levinson) thinks. Rather, part of the input process is this modular processing of various features of the music, which then presents to consciousness already processed outputs that can then be interpreted and described. Recognising musical features is not a consciously intellectual exercise; nor is it, as DeBellis thought, one that accesses our knowledge base. Davies’ account reflects perfectly the objection I raised against DeBellis in chapter two: that the theory-dependence of observation cannot support a theory-driven account of musical understanding. The strongest alternative, as Davies also argues, is a modular account.

But returning to Mary, we can now safely say that she has attained a degree of understanding attainable by most untrained listeners who are interested and willing to improve their listening skills. According to Davies, there are a number of ways that Mary can heighten her degree of understanding (2011, pp.93-95). Most of these involve self-education in background knowledge – that is, acquiring information that is not immediately presented to her on the concert platform. One of the most basic examples is that Mary might be comparing the performance she is hearing with a recording of the same work she has at home. In hearing both versions, she may begin to prise apart the work itself and different interpretations of that work. She may compare this performance and interpretation to others she has heard. She may wish to look at the score if she has the background to make sense of it. She might read program notes to understand the aims of the conductor for that performance. She will also need to understand some of the basics of instrumental playing, such that she can successfully gauge the degree of difficulty of the

49 Steven Pinker (1997) also speculates on the possibility that our brains have a “perception module” that recognises objects “with the geometry of faces”, thereby allowing us not only to recognise faces in people, but also to see them in cars, or clouds and so on. “If objects other than faces (animals, facial expressions or even cars) have some of these geometric features, the module will have no choice but to analyse them, even if they are most useful for faces” (1997, pp.273-274).

work and whether or not the performers are reaching it. But most importantly for Davies, her understanding must be “historically informed”: in order to make all of these judgements, she must be aware of “the course of music history – of which styles came first and of how they changed and evolved, of technical changes in musical instruments, of alterations in the conventions of performance and in the presentation of performances….And above all, she must have some idea of the genealogies that relate composers to each other, so that she can recognise what is an original achievement as against established common ground and so that she can spot influences, quotations, allusions, caricatures, rebellions against, homages to, and so forth” (2011, p.93).

Davies argues that all of these background tasks can be undertaken “by degrees” as well – not only that, but from different perspectives. For example, an academic’s understanding of all of these matters will be more formal than Mary’s; a performer’s understanding of performance conventions and what they mean in practical terms may be more immediate than the academic’s50. So the scope that Davies is presenting here, although vast, is amenable to both the armchair listener and the expert. The question is whether or not the degree of understanding for each kind of listener is, once we get to the level of judgement rather than recognition, relative to their training – and if so, whether this means that Davies’ account, despite his insistence that the same levels of understanding are attainable by those using folk musicology as those using formal musicology, becomes closer to DeBellis’ (in essence if not in process) at the higher levels of understanding. The “untrained” listener is in effect, once the higher levels of understanding or appreciation are reached, an expert as well. The only difference appears to be in the formality of the necessary training.

Davies has rightly established that uneducated listeners can and do understand music, and they demonstrate this by describing the music in folk musicological terms. He has outlined, via the modularity theory, how this perceptual process occurs by virtue of a series of modular recognitions. However, in my view, on his account higher-level understanding or appreciation seems to occur via conscious non-modular judgements made on the strength

50 Davies discusses all of these various perspectives later in the same chapter (chapter 7, 2011).

of supplementary knowledge applied to the recognition-based understanding. I would suggest that, given this, he has supplied a solid explanation for the lower levels of understanding, but once we ascend into higher degrees of understanding (towards appreciation, on his view), we also seem to be moving away from modularity and back towards background knowledge and formal training. That is, Davies’ insistence on folk terminology seems to accommodate the lower levels of understanding perfectly, but whether or not it is possible to achieve the higher levels Davies describes without formal training (given the kind of knowledge he sets as a requirement) is less than certain. But however we choose to navigate this situation for now, it is clear that a distinction between understanding and appreciating music will require further thought once we abandon traditional score-based accounts of understanding and adopt in their place a modular-based account of musical perception and understanding. And whatever the implications of this may turn out to be, Davies has illustrated the most promising path against the traditionalists. He has also shown exactly how both the traditionalist accounts and Levinson’s more radical reaction to such accounts are flawed.

I will return to the distinction between understanding and appreciation, and Davies’ version in particular, in chapters six and seven. In this chapter so far, I have endeavoured to outline how the emotionally expressive properties of music, as I introduced in the previous chapter, may fit into and reinforce Davies’ experience-based view as essential to our understanding, rather than as “secondary” (in the Hanslickian sense of “disposable”) properties of music.

I will discuss this suggestion in detail in section four of this chapter. But before I do that, there is still some ground to be prepared regarding the whole idea of music as heard. In section three (to follow) I will examine an account by Constantijn Koopman and Stephen Davies (2001) that focuses upon our experience of music as listeners, and upon what kind of meaning we might draw from that experience. I will argue that because of their emphasis on experience, their account accommodates expression much more effectively than the Hanslickian view. However, it also falls behind on the objectivity stakes because of this emphasis; it is not grounded in anything other than our experience. The solution, I will argue, is to continue to reinforce modularity as a means to underpin the objectivity of ethocentrically response-dependent expressive properties in the way I proposed in section one.

3. Experiential formal meaning

Koopman and Davies, in a paper first published in 200151, focus upon another angle on our experience of “what it is like” to listen to music (and how important it is to our understanding of it). In elucidating their account, they make the leap from perception to understanding through meaning, with the emphasis firmly on listening rather than analysis. But this is a particular kind of meaning; it has little to do with the semantic meaning in language or even the meaning associated with value in artworks. Rather, experiential formal meaning, they argue, is the meaning accessed by a listener who listens with understanding. This is not dependent upon training in musical theory; rather, it is the kind of listening in which the majority of music lovers engage. They therefore highlight their account by contrasting it with the more traditional view of understanding, gained by analysing a score.

But to begin with, I will start with some further discussion of what experiential musical meaning is thought to be. Formal musical meaning, argue Koopman and Davies, is first of all to be distinguished from experiential formal meaning. Formal musical meaning is best understood as grasping how the structures and events within the music coherently relate to one another. It is not to be confused with the kind of meaning ascribed to a sentence; i.e. referring to something outside of itself. As they explain, “the question: ‘what is the meaning of event x in piece y?’ should not be taken as a request to specify some referent z which could be identified as the meaning of x. Typically, it is a request to elucidate the way event x coheres with the rest of piece y” (in Davies 2011, p.75).

Experiential formal meaning, then, is best understood as what is accessed via the experience of this coherence from the inside, as a listener, as it unfolds in real time. More than merely

51 I refer to it as Davies (2011) throughout.

perceiving a series of patterns in the music, it is an experience of the musical parts “as connected into a dynamic whole”; of feeling the music rather than reasoning through it (Davies 2011, p.75). In this sense, it is similar to Levinson’s (1997) view of “music in the moment”; it does not require that the listener has any view of the music’s overall structure and, equally, does not depend upon the listener’s grasp of musicological concepts. There is, then, a “what it is like” to hear the music, and Koopman and Davies argue that this is much more fundamental to our understanding than formal musical meaning. It is also, they stress, important to realise that each piece will provide a different experience because the “experiential content”, as they put it, of each work is determined by that work’s form (Davies 2011, p.76).

Their position is further enhanced in the contrast between this kind of understanding and the understanding we gain through musical analysis (Davies 2011, p.78). In traditional formal analysis, music is broken down into its component parts and examined as abstractions; as Koopman and Davies point out, “analysis treats music as architecture” (Davies 2011, p.78). This, in other words, is approaching music from the outside, as an inanimate object (Davies 2011, p.79). Opposed to this is experiential meaning’s approach, in which we feel as though we are interacting not with an inanimate object, but another feeling being, even another person. We can hear expressions of emotion within it; and we respond to the musical movement such that emotional expression involves us with empathy rather than dispassion. As Stanley Cavell points out, “objects of art do not merely interest and absorb, they move us; we are not merely involved with them, but concerned with them, and care about them; we treat them in special ways, invest them with a value which normal people otherwise reserve only for other people…. They mean something to us, not just the way statements do, but the way people do” (1977, pp.234-235). While it may be theoretically possible to be moved by the formal beauty of an analysed but unheard score in appreciating a work, it is unlikely to produce the kind of personal response Koopman and Davies describe in a listener’s understanding of it. It is the experience of the work from the inside, not as an abstracted construction, that allows for this kind of response52.

52 I will further develop this idea in chapter seven, arguing that not only do we empathetically respond to the music as if it were a person, but the appreciation process also incorporates the formation of a relationship with the work similar to a personal relationship, as Davies suggests elsewhere (2011).

Music’s experiential meaning, then, is response-dependent; it is revealed only in the experience of the listener. In what sense, then, is this meaning a property of the music and not, in a relativist frame, a property of the listener’s experience, or some sort of property that is inessential to understanding the work, much as a formalist might argue? Koopman and Davies explain that music has experiential meaning in much the same way as grass has response-dependent greenness (Davies 2011, p.76). Grass has greenness as a secondary property in that only suitably qualified creatures with the correct physical set-up can perceive green, as I have discussed in the previous section; yet we routinely ascribe the greenness to the grass and not to the experience because it is universally agreed amongst these suitably qualified observers that grass provides a green experience. There is therefore little reason to ascribe the property to the experience because there is no variation in experience amongst these observers. We are, then, much more accurate in describing the grass as producing green experiences amongst suitably qualified viewers than we would be if we were to ascribe this quality to individual experiences. Musical experiential meaning is seen to be analogous to this green-experience-producing property of grass in that the experiential meaning is agreed amongst suitably qualified listeners to be an objective property of the music (Davies 2011, p.77).

Koopman and Davies make a further point, however, that will prove to be central to the use of modularity theory as a basis for much of my own argument over the remaining chapters. The point is as follows: the analogy between grass’s greenness (or a red object’s redness) on the one hand and music’s experiential meaning on the other breaks down because, they argue, perceiving colours is simply an “innate” ability whereas “our ability to grasp a musical work’s experiential meaning results from a (largely unconscious) learning process in which we become acquainted with the conventions of the musical traditions to which it belongs” (Davies 2011, p.77). This is why, they add, the latter is better classed as “understanding” than as mere “perception”. I first of all want to compare this distinction between understanding and perception with Fodor’s distinction between perception and cognition, with the caveat that modular perception, as I discussed in chapter two, while not being a judgement, is nonetheless more than “dumb” reflexes. While I agree with Koopman and Davies that experience of listening to music is a form of understanding, on the basis of Fodor’s model (which, as I argued, has evidence to support it) I’m not sure that we should be so dismissive of innate abilities as “dumb”. I think that the “largely unconscious” learning process Koopman and Davies mention, through which we learn to grasp experiential meaning through cultural immersion, could be better described through Fodorian modular processes and that Koopman and Davies might be overestimating the “cultural” input and underestimating the cognitive scope of our “innate” (or modular) perceptual abilities. We need to be much clearer about which abilities are “innate”, and about how we learn abilities through cultural immersion; I suggest that modular theory can inform these distinctions and definitions. Moreover, this implies that the breakdown in the analogy between grass’s greenness and music’s experiential meaning may not be as extensive as Koopman and Davies suspect. On Fodor’s model, perceiving colours is more cognitive than they allow and understanding experiential meaning might be more “innate” than they allow.

So, as Fodor would say, here’s the thing: Koopman and Davies’ aim (and mine) is to show that uneducated listeners can understand more than has previously been acknowledged about music. In this way, we can account for the fact that the vast majority of listeners clearly do understand and value music, even without musicological training. Davies himself, in an earlier critique of Nicholas Cook’s (1990) account, made the point that the most effective way to do this is not by arguing that musicological training is irrelevant, as

Cook chose to do53. Rather, he says we should argue, “‘ordinary’ perception is more cognitive than is often acknowledged” (1994, p.336). This, as I discussed in chapter two, is also Fodor’s view. And the theory that best underpins the cognitive nature of perception, or at least the best theory of those available, I would suggest, is modularity theory54. It is important to realise, however, that this is not the sort of “cognitive” nature a formalist like

53 Because Cook’s account concerns appreciation more than understanding, I will return to a discussion of this point in chapter seven. 54 To be clear, I am not saying that Koopman and Davies’ account fails because it does not cohere with Fodor’s. Rather, I am saying that Fodor’s account is a stronger, empirically grounded way of accommodating their overall aims.

103

DeBellis is talking about. He means “cognitive” in the sense of “accessing beliefs and knowledge”; on his view, perception is cognitive because it relies on theoretical concepts.

On Fodor’s account, this is not the case. Rather, the sense of “cognitive” here (and it is a term that carries more than one sense between cognitive science and psychology) turns more upon the mechanisms behind it. Fodor explains how it is that perception is not simply a matter of uninterpreted input: the modules are informationally encapsulated but they have access to basic perceptual algorithms and can also be “trained”. This is, after all, the theory that underpins (on the Chomskyan account) the analogous ability we have to acquire a language. On this view, we are hardwired with a “universal grammar” which is then informed by the culture into which we are born. So the distinction Koopman and Davies draw between the simple innate (and, as they describe it, non-cognitive) ability to perceive colours and the more complex, yet still unconscious, learning process that is required to access musical experiential meaning may not be as clear as they suppose. There might indeed be more to the process of our understanding of musical experiential meaning that is modular, or “innate”, than they allow, even though I agree that much of our ability develops through cultural immersion rather than formal training. Certain aspects of musical experiential meaning may be perceived in the same way as we perceive colour (i.e. involuntarily), once we recast perception as more than “dumb” reflexes. I will expand on this suggestion, and point to which aspects these are (they include emotional expression), in chapter five.

For now, my aim in discussing Koopman and Davies’ account of experiential understanding was to show that response-dependent properties (such as expressive properties) are far from optional extras when it comes to understanding music. In the next section, in a discussion of Aaron Ridley’s argument (1993), I will argue the further point that these properties are not to be discarded because they are actually objective properties of the music’s structure and hence essential to the understanding of the work, whether experiential or formal. Both Koopman and Davies and Ridley argue that expressive properties are a part of the work’s meaning, in that our experience of the work doesn’t make sense without them. I want to suggest that we need not remain within the context of our experience to treat expressive properties as essential in this way; I want to suggest, that is, that Ridley is correct to argue that expressive properties are part of musical structure, but not for the reasons he defends.

4. Expression as musical structure

So far, I have aimed to establish via the discussion of DeBellis/Hanslick vs. Davies that the experiences of uneducated listeners strongly support Koopman and Davies’ listening-based account. In order to understand music, on their view, it is crucial that musical form be accessed experientially through listening rather than abstractly through a score-based analysis. The properties necessary to the understanding of the work are easily accessible through listening and we are provided, through hardwiring and through enculturation, with the necessary equipment to do so.

As I stated at the outset, it seems incongruous to me that accounts discussed so far have largely concentrated on whether the listener can understand relatively complex structural features such as the placing of the recapitulation rather than on some other features that strike the listener almost immediately – that is, whether the piece sounds happy or sad.

One way to explain this omission might be that, in the past, emotional expression was accounted for via a theory involving occurrent emotions (such as the expression theory, as we saw in my survey of emotivist theories in chapter one), which meant that the scored structure of the piece was seen to “carry” the emotional “content” of the work. The listener therefore had to ensure the whole structure was digested, on this view, before the emotional content could be identified, in much the same way as a sentence must be heard in its entirety before its meaning can be accurately understood. Even though such expression theories are ultimately unsuccessful, it seems that we were nonetheless left with the residual idea, associated with Hanslick’s view, that emotional expression is something that the music does, rather than it somehow being a property of the music itself. But as we have seen, this is not consistent with the music as experienced; most listeners will report being able to identify the emotion being expressed as the music unfolds.


Aaron Ridley (1993) addresses this situation head on, arguing that it isn’t even possible to separate emotional expression from the traditional “purely musical” features because both are structural properties of the music. This is quite a revolutionary idea; however, to Ridley it is not only a commonsensical suggestion, but also a necessary one. It just isn’t possible, he says, to form an understanding of the music without both kinds of structural properties (the purely musical and the expressive) interacting with each other – and, moreover, he argues that we already think of expressive properties as structural in the context of our listening experience anyway (1993, p.594). His argument for this solution to the traditional structure/expression dichotomy rests upon his account of understanding, to which we will now turn. It will be valuable both in the context of the discussion of expression theory to follow in chapter five, and in the construction of the experiential account of understanding in this chapter.

4.1 Aaron Ridley: structure and expression

Ridley’s account of musical understanding is based on the idea that all we have to work with in describing the understanding process is our experience of the music as we listen. In his view, music does not exist outside of our experience of it (at least, he says that if it does, it doesn’t tell us anything of aesthetic value when we speculate about its nature55). He argues that understanding a musical piece reduces to the ability to hear the music as music; and the ability to discern particular “perceptual properties” within the raw “sensory properties” of our listening experience56. Examples of perceptual properties include traditional structural features such as “melody, harmony, etc” - the “purely musical” (1993, p.590). The nature of the experiences these properties provide the listener can vary depending upon the listener’s previous training, interest, or even level of fatigue when listening, such that two listeners hearing the same sounds as music may experience differing perceptual properties over and above those basic ones that allow them to hear the music as music to begin with. The ordinary listener, then, may discern simpler perceptual properties within their experience than the trained musician. This is a similar concept to Davies’ view of “understanding by degrees”, as Ridley explains: “a tired or inattentive listener, or a listener whose rhythmic sense is relatively poor, may hear two passages as having the same rhythm where a more alert or gifted listener hears a rhythmic difference. So one kind of perceptual property (e.g. rhythm) may be possessed by an experience to varying degrees” (1993, p.590)57.

55 While I agree with the proposal that expressive properties are structural, I don’t agree with the ontological position upon which it is based. I don’t feel that experiential access to structural features necessarily indicates that such features are simply experiential, as discussed in section two. And, as Koopman and Davies (2001) argue above, we don’t need to argue that grass doesn’t exist outside our experience just because its greenness is a response-dependent property of grass. Ridley also argues that asking ontological questions doesn’t tell us anything about the work’s aesthetic value; I’m not convinced that it should. For an argument against Ridley on these latter grounds, see Kania (2008).

56 It should be noted that Ridley’s overall aim in this paper is to argue that it can be detrimental to the understanding of various works to have them performed as extracts out of context, in what he terms “bleeding chunks” (e.g. performing just “The Ride of the Valkyries” as an extract from the entire Die Walküre). While this forms an interesting contrast to Levinson’s view, it will not be discussed here.

The point in outlining Ridley’s account in this way is as follows. He holds that our musical experience also has expressive perceptual properties, and as I do, he objects to the traditional view that such properties are subordinate in our understanding to “purely musical” properties. As I have already discussed, such traditional views, like DeBellis’, tend to downplay the expressive and focus attention on formal features. Ridley cites Leonard Meyer’s expectation theory of understanding (1956) as an example of this traditional view (although he defends other aspects of that theory in this paper) (1993, p.593). Meyer’s view is essentially that listeners understand music when they form expectations about how it will proceed, and their enjoyment consists in being surprised or having those expectations thwarted. As stated earlier, Ridley argues that expressive perceptual properties are in fact essential to any level of understanding of a piece of music. His objection is based upon an appeal to our intuitions, both about our experience and also about how we normally classify expressive features of the music as part of that music (as opposed to something that music does or creates). The result is an objection to the traditional account that is very similar to the one I have been outlining so far in this chapter: understanding emotional expression in music is necessary to an understanding of the work itself. As he states, “the objection is simply that you cannot entirely grasp the ‘purely musical’ without grasping the expressive: the attempt to make sense of a musical experience which only has ‘purely musical’ perceptual properties will frequently fail” (1993, p.594).

57 Davies (2011, p.16) relates this kind of view to an argument by Kathleen Higgins (1997). She questions “the assumption that there is an aperspectival way of hearing music, free from the ‘distortions’ that result from taking a particular perspective”. This applies particularly to professional musicians, who, on her view, have the most “idiosyncratic” perspective. Like Davies, I think she is right to raise this question. I would suggest that our assumption that there is an aperspectival way of hearing is probably also a hangover from our score-based conception of music – the aperspectival hearing is assumed to be the correct hearing of all the scored properties of the music.

In defence of this objection, he argues that even if Meyer’s theory is adopted, listeners will form expectations about expressive properties of the music as well as about the “purely musical” ones, and in exactly the same way. For example, Mozart’s String Quintet K. 516 comprises three movements of sombre, tragic music followed by what is usually described as a joltingly and inappropriately jolly finale (Ridley 1993, p.594). The point is that this expressive failing, as it is generally understood to be, is a musical failing, not some kind of lesser consideration that need not detract from the piece itself. And there is no reason, he says, that this should be considered as anything other than a structural property of the music, since the listener has formed expectations about it and about other properties, and since it is clearly essential to even the most basic understanding of the work. As he concludes: “Expressive contrast, then, becomes a fact about structure”58(1993, p.594).

In presenting this argument, Ridley has provided us with another account of how the understanding listener processes expressive properties of the music in the same way as they process the “traditionally” structural ones; moreover, he adds that we assume this anyway in the commonsense way we regard musical expression. He sees this as an argument for emotional features being structural features (at least in the context of the listening experience, remembering his ontological proclivities). And over the course of this chapter, I have provided an explanation, by means of Davies’ modular account of understanding “by degrees”, of how the understanding of expressive properties Ridley proposes might be processed. This explanation coheres with the arguments presented earlier that emotional properties are first of all “secondary” in name only, and also essential to any degree of understanding of the work. Since emotional properties are understood in the same way as traditional structural properties, and since the vast majority of listeners readily access and understand them, I see this as an argument for emotional properties being structural properties as well. This seems the sleekest and most efficient way of accommodating the situation as discussed over the course of this chapter. It is also the final nail in the coffin for “occurrent emotion” expression theories, which by definition (as I have argued) deal in some form of emotional “content”, since expressive properties are now also structural properties. It therefore also works, as a beneficial side effect, to reinforce the contour theory of expression, as I will discuss in chapter five.

58 It is interesting to note that Ridley’s ontological views are probably behind his more direct statement of the nature of expressive contrast. He could have said, for example, that expressive contrast supervenes on facts about structure. However, music (in his view) exists only in experience; there are no structural layers over which expression might supervene. On the other hand, I am proposing an even more literal approach: expressive properties of music are as real as its other structural properties.

But there were two aspects of Davies’ theory that seemed to require further examination before concluding this chapter. The first is his (and indeed almost everyone’s) downplaying of the expressive properties of music in musical understanding, which (while understandable given the need to grasp the work’s major structural features) also seems incongruous given that most uneducated listeners can identify expressive features of a work. The second is some blurring around the question of the nature of the distinction between appreciation and understanding. I will address this second point in later chapters by distinguishing between those features that are processed via recognition (i.e. a modular process: fast, mandatory and not consciously retrievable) and those that are processed via judgement (i.e. a conscious process operating over the results of the modular processes). I then returned to the omission of expressive properties in music from accounts of musical understanding, which was, in the end, the focus of this chapter’s defence of the uneducated listener. I argued that the response-dependence of such expressive properties need not lead to a vicious circularity in our definitions of them, as Pettit’s ethocentric account shows.

This circularity cannot therefore be a reason to deny expression its place in a theory of understanding. I also argued that modularity theory could underpin the ethocentric process by which we access response-dependent concepts. To reinforce this point, via a discussion of Ridley’s account, I argued that there is significant evidence from our listening experience to suggest that response-dependent expressive properties are understood in the same way as other more basic structural features are (such as where the work begins and ends) – that is, as modular recognitions rather than as conscious judgements.


I then pointed out that Ridley’s argument for expressive features being best regarded as structural relies upon our listening experience, and upon a compelling rejection of the traditional subordination of emotional expression within accounts of musical understanding. Both of these arguments are important because they raise the status of emotional expression to something that is a part of the music itself, rather than a part of what it does. It also makes sense of the way that so many untrained listeners can access such features, understand them, and in so doing understand the music itself rather than understanding a “secondary” by-product. This also adds weight to Davies’ theory of understanding (or appreciation) by degrees, in that it shows precisely how the untrained but passionate and interested listener can be meaningfully said to be understanding the most accessible structural properties of the music they hear: emotional properties.

I now want to add to this picture the suggestion that the emotions being expressed are necessarily broad and basic due to the abstract nature of musical expression. There is simply not enough information in the music about the emotions being expressed to perform anything but the simplest recognitions on the part of the listener. What I am going to argue over the remainder of this thesis, then, is that emotional properties of the music are therefore processed via modular recognition rather than conscious deliberation. This argument has far-reaching implications for my account of musical understanding (and hence my defence of the uneducated listener) and will require considerable defence. I will begin in the next chapter with an examination of “basic” emotions and an argument for their role in musical expression.


Chapter Four: Kinds of Emotions

Psychologists have disputed whether the basic emotions are really basic, that is, whether the other emotions are really all based on these six. They have also disputed whether the basic emotions are emotions, suggesting instead that they are mere building blocks that form parts of more complex psychological states, and that it is these complex states that better deserve the name ‘emotions’. Emotions or not, however, the basic emotions clearly form part of what is going on in emotion episodes.

Paul Griffiths (2003, p.7)

I’ve been arguing so far that expression should be a part of a theory of musical understanding because it will make the theory more inclusive: expressive properties are one of the most accessible properties of music to uneducated listeners. I am putting forward two arguments to explain this accessibility: one is about the music itself (expressive properties are structural properties, as I argued in chapter three), and one is about the way we understand the expressive properties of music. I first introduced this latter argument in chapter two and will continue it now. I will argue, that is, that we understand expressive properties through modular processes rather than through theory-dependent processes.

This is why the resulting theory of understanding I am constructing is more inclusive, in that formal education is not required to access these properties. Our modules, trained by cultural immersion, do the job for us.

So if I am right, and if we understand musical expression via our perceptual modules, then I need some further evidence for this claim. This evidence is the subject matter for this chapter. We know already that there is a particular group of emotions (called basic or “affect program” emotions) thought to be both produced and recognised through modular processes. Based on an original study by psychologist Paul Ekman (1972), this group normally comprises only six emotional categories: anger, fear, disgust, sadness, joy, and surprise (Griffiths 1997, p.14)59. We also know that there is only a small group of very broad emotions that can be reliably recognised in music. As Hanslick says, music simply cannot carry the information required to definitively express complex “higher cognitive” emotions, such as envy, guilt, jealousy or love. But it seems just as obvious to most listeners that music can carry enough information to express a limited range of emotions such as joy, sadness, or anger60. These emotions are less complex because, unlike “higher-cognitive” emotions, they don’t have a thought or a belief at their core; like music, they don’t have to be about anything at the time they are experienced. It just so happens that this same limited range of emotions expressed in music is also described within basic emotion theory as the exact range thought to be modular in nature; that is, they are informationally encapsulated, involuntarily recognised, and pancultural.

Pancultural evidence in particular is important because if such emotions are modular in production they will be universal across the species. Further evidence of the modularity of basic emotions will therefore lie in pancultural recognition of these emotional states in other humans. The hypothesis is, very simply speaking, that modularity needs hardwiring; hardwiring needs to be across the species; therefore pancultural evidence will support the proposed modularity of both the production and recognition of basic emotions. However, empirical evidence for pancultural recognition of emotions in music is only recently emerging and has in the past been the subject of some controversy in the philosophical literature. Davies, for example, after some early support for the idea (2003, p.171), now thinks it untenable (2011). I don’t agree, and will discuss this particular controversy, and in fact the application of basic emotions to music in general, in the next chapter. There the idea will be to argue that while Hanslick was right to say that complex, higher emotional states are not clearly expressible in music, he was wrong to ignore that broader families of emotional states, which, as I say, happen to be almost perfectly encompassed within the category of basic emotions, are expressible in music. And this argument, I will suggest in the next chapter, should be the focus of an emotivist response to Hanslick. In fact, most of the problems in emotivism have arisen because we have been trying to encompass too much in our theories of emotional expression in music; or more specifically, too many emotions. If we concentrate on basic emotions, then many of the traditional pitfalls for emotivists surrounding the question of how complex emotional states are expressible in music no longer apply.

59 As noted in section one below, Ekman’s original group of “basic emotions” later expanded to fifteen emotions; this is part of his overall view that all emotions are basic, and any falling outside of the group is not an emotion but an “emotional plot” (Ekman 1999, p.62).

60 It is generally accepted that music can objectively express a limited range of broad emotional states (Davies 2003, 2006, 2010a, 2011).

In this chapter, however, I won’t be discussing the application of basic emotions to music very much at all. Instead, the focus will be on defining basic emotions themselves, their modularity, and the empirical evidence for their recognition in faces, voices, or bodily posture. It will emerge that, just as Hanslick assumes that all emotions must have a thought or belief at their core to qualify as emotions, most of the doubts in the literature about basic emotions concern first of all their very existence, and secondly their relationship to “higher-cognitive” emotional states. The central questions in such discussions are whether or not basic emotions are truly “basic”, in that they form the core of “higher” cognitive emotions, and whether or not basic emotions are truly emotions in the first place (Griffiths 2003, p.7). My concern in this chapter, given my intention to argue that basic emotions are expressed in music in the next one, is simply to establish that the evidence for basic emotions is strong, and that subsequent arguments concerning the nature of non-basic emotions should not detract from the strength of that evidence61.

I will do this as follows. Drawing on Paul Griffiths’ (1997) and Jesse Prinz’s (2004) discussions of basic emotions62, I will argue in agreement with Griffiths that if we concentrate on the empirical evidence, basic emotions are in fact much more strongly supported than “higher cognitive” emotions are (that is, the kind favoured by Hanslick).

Much of this evidence challenges some long-held assumptions about the nature of the emotions in general. Griffiths argues that the everyday use of the word “emotion” covers too much conceptual space; there is no one “kind of thing” that the word “emotion” applies to, and hence no one theory of the emotions that will explain everything that currently falls into that category. The emotions, that is, are not a natural kind. Basic emotions like sadness might therefore exist alongside more complex reflective and/or culturally influenced emotions such as guilt (and there is considerable experimental evidence to support this suggestion63) without confusing the answer to the traditional question of what “an emotion” actually is, since emotions can be different kinds of things. I do not intend, however, to get too deeply involved in the more general debate about natural kinds or the cognitivism vs. basic emotions debate in this chapter, other than noting that arguments about higher cognitive emotions have distracted attention away from the evidence for basic emotions. All I need to draw from these debates is the solid foundation of evidence for the existence of basic emotions, regardless of whether or not they form the basis of “higher-level” emotions, whether or not emotions are indeed a natural kind, and whether or not the category of basic emotions includes six, fifteen or three emotions. Evidence for basic emotions is also evidence for modularity; this in turn will fuel my argument for the modular recognition of these same emotions in music in chapter five. Compiling evidence for basic emotions, as I have said above, is therefore my aim for this chapter.

61 I will be using the term “basic” emotions rather than “affect program” throughout.

62 Joseph LeDoux (1998) also defends an account of basic emotions.

In the next section, with this aim in mind, I will first of all examine in more detail the divide between basic and cognitivist conceptions of emotion. I will look at what a basic emotion is thought to be, evidence in support of basic emotions, and some of the background motivation for Griffiths’ account. I will also look at some objections raised by Peter Goldie (2000) to the whole idea of stand-alone basic emotions. I will argue that some of Goldie’s objections turn on a disagreement about what, exactly, Griffiths is committed to in defending the idea of basic emotions.

1. Basic emotions

Just as the emotivists and formalists polarised the debate about musical expression, so the arguments within emotion theory continue between two broad camps: those who hold that emotions are identified through the bodily changes and sensations they either cause or comprise (feeling theorists); and those who hold that emotions are identified through the belief or thought at their core (cognitivists, or, as Griffiths calls them, the “propositional attitude school” (1997, p.2)).

63 See Griffiths (1997); Damasio (1994); LeDoux (1998). Hjort & Laver (eds.) (1997, pp.6-9) provide a good outline of the differences between cognitivism and social constructivism.

It’s not difficult to see how this situation has arisen. Consider, as Prinz asks, the emotions elicited by winning or losing an important competition (2004, p.3). If you have won, you will first of all have a thought: the realisation that you have won the prize you have wanted for so long (say, an Oscar). Then there are a series of other events occurring in no particular order, such as physiological changes: you may smile broadly, turn red, shed a tear, or your heart rate might increase. There are the effects on memory and attention: you may have further thoughts about new opportunities that will open to you, or memories about people who supported you in the past, disbelief, or self-congratulatory thoughts.

There are behavioural changes like leaping into or punching the air, hugging your companion, fanning yourself. Overarching all of this will be an almost dominant set of conscious feelings: a background “zing” of energy and excitement, for example.

If you have lost the Oscar to Gwyneth Paltrow, on the other hand, the same categories of events unfold. You will first have a thought registering your loss and Gwyneth’s win. There are the physiological changes: your posture may droop, you may wear a false smile, you may shed a tear, have a lump in your throat, turn pale. There are the effects on memory and attention: you may have thoughts about whether or not Gwyneth deserved her Oscar, regrets about your choices of roles in the past, your choice of agent, your poor self-esteem. There are behavioural changes: you may weakly applaud Gwyneth, try to avoid others’ attention, leave the auditorium, or bravely grasp the hand of your companion. And overarching all of this will be the conscious feeling: a physical pang, almost like pain.

What these two scenarios show, says Prinz, is that typically, emotional episodes contain a number of component parts. The problem, and this is what has caused the polarisation in the debate, consists in identifying which of these parts is essential to the individuation of the emotion. So, if the Oscar-winning emotion here is “elation”, which of these parts is essential to the make-up of “elation”? The thought? The overarching feeling? The behaviour? One way to deal with this question is by asking how the emotion would be individuated should one of these components be removed from consideration. If we subtracted, say, the physiological changes, would it still be an episode of “elation”? Can we conceive of an elated experience without such changes? Prinz calls this the “Problem of Parts” (2004, p.4).

This problem, it emerges, is behind the polarisation in the debate because the cognitivists (loosely speaking) argue that the thought is the individuating component – i.e. it would not be an episode of elation without the appropriate thoughts or beliefs at its core – while the feeling theorists argue that the feelings are the individuating component – i.e. that it would not be an episode of elation without that zing of energy and excitement. Meanwhile, somatic theorists argue that the physiological changes are the key, behaviourists argue that the behavioural changes are the important factor and judgementalists argue that there is a particular kind of thought that individuates emotions (Prinz 2004, p.10).

It is not difficult to point to examples that support each camp’s account (that is, the cognitivists and the feeling theorists). It might indeed be conceivable for some emotional experiences to lack a thought component, such as feeling afraid or apprehensive about your Oscar win without consciously knowing why (for the feeling theorists); or it might even be possible to know that you are quietly elated without apparent behavioural or physiological outcomes (for the cognitivists). The problem, though, is that for every emotional episode in which the feelings seem central, there will be just as many situations in which the most obvious identifying feature seems to be the thought - and vice versa. Each camp will therefore need to parry an array of possible counterexamples. Against the feeling theorists, the same physical changes may accompany several different emotional experiences, and might just as easily apply to another emotion altogether – a pounding heart, for example, can accompany or comprise “fear” or “excitement”. If such physical changes are the identifying component of the emotional experience, then how do we tell which emotion it is per episode? Against the cognitivists, it is questionable that we could conceive of elation simply as the thought “I have won an Oscar” without any accompanying feelings or physiological changes at all. There is also considerable evidence, drawn from experiments conducted by R.B. Zajonc (1980), to suggest that people “can have an emotional response to something they have no thoughts about in the everyday sense of ‘thoughts’” (Griffiths 1997, p.27)64. It seems, then, that each camp has some explaining to do if it is to insist on only one category of individuation criteria in this way.

Given this situation, another response has been to assert that all of these categories are essential components of an emotional state. That is, just as we could not reliably individuate an emotion from a set of physiological changes alone, neither can we conceive of an emotional experience comprising a thought alone, without any accompanying feelings or physiological changes. Perhaps, according to this approach, all (or a select group) of the components are equally important to the individuation process. But the problem with this kind of all-encompassing theory, says Prinz, is this: you need to be able to account for how the various components interact or “hang together” (p.19), especially in cases where one or the other component seems salient. That is, you need to provide an explanation as to how it all works and provide an explanation as to why all components of emotions are necessary against a hugely varied set of counterexamples in our emotional experiences. Prinz calls this the “Problem of Plenty” (2004, p.18).

Stuck between Parts and Plenty, then, the emotion theorists are left with two unpalatable alternatives: either argue for one kind of component, which seems counterintuitive given the number of counterexamples, or argue for all of them, which might account for more of the counterexamples but demands an encompassing explanation of the parts’ interaction as well65. There is, however, a third option. It may also be possible to argue that the terms of the original debate have been misconstrued. Paul Griffiths (1997) has taken an extreme form of this path, arguing that the reason why there is no one conception of emotional nature that covers all possible scenarios is that the term “emotion” is far too broad: the emotions are not, he argues, a natural kind66. The search for the nature of an “emotion” will therefore never be successful because the category of what we currently term “emotions” covers not one kind of thing, but potentially many. The two camps are therefore arguing over something that simply does not exist (1997, p.1). Given this, “the idea that we need a theory of the emotions, or a theory of some specific emotion”, he says, “may be a mistake” (1997, p.2).

64 Zajonc is essentially arguing, says Griffiths, that emotions are elicited by a reflex-like process rather than a consciously-mediated process (p.27). In support of this is the later finding that there are substantial physical differences between the emotions in autonomic nervous system (ANS) arousal even when uninterpreted by subjects themselves (Ekman, Levenson and Friesen 1983). Their experiments showed that “not only was it possible to distinguish positive emotions such as joy from negative emotions such as fear, anger and grief, it was possible to distinguish among those negative emotions” through measurements of ANS arousal indicators such as heart rate, skin temperature and muscular tension (Griffiths 1997, p.83). This indicates that such physical changes, rather than the subject’s conscious interpretation (which may be distorted by confabulation in any case), may indeed be the individuators.

65 For a comprehensive survey of the various failings across both “parts” and hybrid “plenty” accounts, see Prinz (2004, chapter one).

Griffiths is careful to point out that this does not mean that nothing is going on in people having emotional experiences. It just means that there might be several kinds of experience that we are mistakenly trying to lump together as one kind of experience. He places the blame for this situation emphatically at the feet of the cognitivists (or “propositional attitude theorists”), who, he claims, have “ignored” the experimental evidence of cognitive psychology and grounded their theories almost exclusively on armchair conceptual analysis (1997, p.3). Interestingly, he argues that one of the main motivations behind the cognitivist view is to rescue the emotions from the essentially non-rational feeling theory, which “bolsters false views in ethics and aesthetics that depend on the contrast between thought and emotion” (1997, p.2)67. Griffiths’ accusation here is based on his view of conceptual analysis, which turns upon our accepted use of the natural kind term. The way that we apply the term “emotion” under such conceptual analysis is used as a basis for a definition of emotions. This is because we assume that the meaning of natural kind terms is simply derived from our cultural and habitual application of those terms without reference to new experimental evidence or discoveries in other fields (Griffiths 1997, pp.3-4)68.

66 Griffiths is taking it as a given that the debate is about natural kinds, when natural kinds are understood to be those corroborated by scientific rather than cultural or personal evidence. Griffiths makes this definition clearer in 2003 and 2004 (p.903), when he states that “natural kinds are categories about which we can make inductive scientific discoveries”, and to avoid further confusion, renames “natural kinds” as “investigative kinds” (2004, p.907). This definition, however, forms part of Goldie’s objection (2000), as I discuss below.

67 This is clearly applicable to the formalist/emotivist debate in music, given Hanslick’s emphasis on that very separation.

68 It is unlikely that emotion theorists in general have relied as heavily upon conceptual analysis as he claims they have done. However, the bottom line is we should be keeping pace with (and passing a required critical eye over) findings in other relevant fields, whether we are philosophers or, as Davies later points out, psychologists (2011). I will discuss Davies’ point in chapter five.

Griffiths finds this practice objectionable, and points out that such an insulated view has long been rejected in the philosophy of language and in scientific circles. The way we use natural kind terms, he says, should be evidence-based rather than language-based. He points out that examples of such informed revisions of natural kind concepts are easily found. Barnacles are in fact crustaceans, like lobsters, rather than molluscs, like clams, and we have had to revise our usage of those kind terms accordingly. In Aristotelian science, there was a category of “superlunary objects”, which encompassed all objects outside of the orbit of the moon. Clearly this was a category determined by the extent of scientific knowledge at that time. It is too large a category to be of any explanatory value, and too large a category to admit of one overarching theory. A continuing search for a single theory of all superlunary objects is therefore doomed to failure. So, rather than assume that our usage of kind terms defines the nature of objects within that kind, we should instead be looking to revise our categorisations when evidence dictates in one of two ways: we can either revise the membership of established kinds, as in the case of the barnacles, or we can decide that the entire kind is not in fact a kind, as in the case of the superlunary objects (Griffiths 1997, p.4). Griffiths argues that the emotions are like superlunary objects: an outdated kind that is “not really a kind” at all (1997, p.4; p.14). On this view, then, there is no one theory that can explain how all the components of all emotions hang together in every circumstance because there is no such natural kind. Prinz’s “Problem of Plenty” therefore doesn’t exist.

As it turns out, on Griffiths’ view neither does the Problem of Parts. This problem, remember, lay in determining which component was essential to the identification of emotional states (very loosely speaking, the choice was between one of two general camps: thoughts/beliefs/judgements vs. feelings/physiological changes/behaviours). It was a problem because there seemed to be instances where the thought appeared central to the identification of the emotion, whilst in other instances the feeling seemed central. Again, according to Griffiths, this problem dissolves once we accept that there is no one kind of thing called an “emotion”. Given this, the apparently variable nature of the essential component can be explained because there are different kinds of states being identified rather than just one. It may therefore be the case that in some of these states, the thought is an identifier, whilst the physical changes are central in others. No one theory can capture all instances because there is more than one kind of state involved.

Accordingly, once we take into account the findings of current psychological research, the single domain we currently call “emotions”, Griffiths says, actually fractures into three parts (1997, p.14). These three parts are:

1. Affect program responses, or “basic” emotions. Based on the work of Charles Darwin (1872) and Paul Ekman (1972), these are, as we have seen, reflex-like informationally encapsulated modular processes encompassing a limited group of particular emotions: usually, surprise, fear, anger, disgust/contempt, sadness and joy (Griffiths 1997, p.8; p.16)69. As modularly processed states, such emotions are standardly of involuntary onset, short-lived, not influenced by thoughts/knowledge, and pancultural (that is, occurring in most human populations and not under cultural influence). A good example of a basic emotion is fear at the sight of a common harmless spider; sometimes this fear will arise even when the subject knows that such spiders are harmless (a phenomenon known as recalcitrant emotion or emotional lag). Griffiths maintains that the affect program theory is well supported by experimental evidence (elsewhere he states that there has been scientific consensus on their pancultural existence “for the past thirty years” (2003, p.7)). He cites Zajonc (1980, 1984) as listing a range of experiments tending to show that emotional responses can be triggered when the information that triggered the response is unavailable for higher cognitive purposes (1997, p.92). This strongly supports the modular reflex aspect of such states, and as Fodor has explained, reflexes are “informationally encapsulated with bells on” (1983, pp.71-72). There is also much to support the idea that such emotions can be recognised by facial or vocal expression by members of other cultures who are not familiar with the language or social customs (Griffiths 1997, p.9; Ball 2010; Prinz 2004; Oatley 2006; Zajonc 1980). This, when combined with the observations regarding equivalent states in primates generally, gives weight to the pancultural and evolutionary evidence in support of the theory (Griffiths 1997, p.13).

2. Socially sustained pretenses70. These also cover a limited range of emotional states and describe a socially constructed behaviour to depict emotions rather than an actual instance of that emotion, as in displaying angry behaviour in a particular socially-prescribed way. This is not to be confused with an affect program instance of anger, as this kind of emotion has “no more in common with the other emotions than a piece of playacting has in common with the behaviour it imitates” (1997, p.15). Whilst social constructivists may argue that all emotions are social constructs, meaning that none are pancultural or universal, Griffiths argues that social constructivist theory only adequately describes this limited range of specific pretenses (1997; LeDoux 1998).

3. Higher cognitive states. This is a much larger category, encompassing the more complex, consciously-mediated and non-reflex-like emotions such as envy, guilt, jealousy, and love (1997, p.9). These, in other words, are the kinds of unencapsulated emotions that the cognitivists set out to account for. They seem to involve consciously accessible thoughts or attributions and they can seem to operate over basic emotional states. Griffiths argues that this category is best approached via evolutionary psychology, although he adds that this is nevertheless the least defined and least understood category. This is possibly because of its complexity and the fact that there are many “evolutionary storytellers” peddling what are essentially educated guesses about the origins of such states71. The emotions in this category differ from socially sustained pretenses in that they are authentic instances of emotional states, and they differ from affect program emotions in that they are not informationally encapsulated (they draw upon knowledge or beliefs).

69 I say “usually” because there is some debate about exactly how many emotions are basic in this way, although Griffiths lists just these six (based on Ekman’s original proposal in 1972). Ekman (1999, p.62) has since expanded the list to fifteen, citing “amusement, anger, contempt, contentment, disgust, embarrassment, excitement, fear, guilt, pride in achievement, relief, sadness/distress, satisfaction, sensory pleasure, and shame” as basic emotions. Ekman discusses the controversy over this point, admitting that “there are probably more emotional words than there are emotions”, and that this may cause confusion in identifying such states. (See also Griffiths 2003, p.14).

70 Griffiths developed this category in later years, adding “social constructionist theory”, which refers to socially constructed emotions that may also include genuinely felt emotions (i.e., non-pretense socially constructed emotions). He also added “Machiavellian” emotions to this category but has expressed doubts about its viability since (2003).

One confusing but crucial aspect of this categorisation is, as Griffiths points out, that one “emotional” state can appear in various forms in all three categories. Anger, as we have seen, is one of these (Griffiths 1997, p.16). While there are affect program instances, in which we would experience “blind” anger without necessarily knowing why, there are social pretenses of anger (aggressive blustering to achieve a particular end, for example), and there are higher cognitive instances of anger, such as becoming enraged about a situation that draws upon knowledge and beliefs (say, outrage at the public’s rejection of the government’s climate change policy). But Griffiths’ point is that we should not be misled by the way we use the term “anger”. Even though it appears in these three modes, the fact that it is referred to by the same word does not mean it is the same kind of event each time. In a similar vein, he argues that even if we can’t bring ourselves to stop using the term “emotion” to cover events that fall into one or more of these categories, we should at least stop thinking that such events are all instances of the same natural kind.

However, for my purposes, I would like to propose that Griffiths’ tripartite categorisation be further simplified into 1) basic emotions (or affect program theory emotions) and 2) non-basic emotions (or non-encapsulated emotions, including Griffiths’ “higher cognitive states”). Whether or not Griffiths is right about emotions not being a natural (or “investigative”) kind, there certainly does seem to be a clear (albeit speculative) distinction between basic and non-basic emotions in terms of the degree of informational encapsulation in their processing. It is also clear that Griffiths is right to highlight affect program states as the category best supported by theory and experimental evidence, citing long-term “scientific consensus” on their existence (2003, p.7). Griffiths’ use of modular terminology indicates a very high degree of confidence in the modular nature of basic emotions; their informational encapsulation forms the basis of their definition on Griffiths’ account. He uses this definition to further reinforce his point that such modular states cannot be categorised with higher cognitive states from an “investigative” viewpoint. He uses this definition, that is, to support his argument against emotions being a natural (or “investigative”) kind.

71 Griffiths rejects these theories for the same reason he rejects conceptual analysis by philosophers: they do not take into account experimental evidence and/or there is inadequate testing of their claims (1997, p.9).

I now want to spend some time looking at some objections to Griffiths’ claims in order to further reinforce the distinction I am proposing between modular basic emotions and non-modular “higher cognitive states”. There is a very real sense in which these objections do not sit comfortably with the consensus that Griffiths claims exists about basic emotions being modular. That is, objections to Griffiths’ position on natural kinds tend to be at the expense of the modularity of basic emotions. The two accounts I will briefly examine in this light are Prinz (2004) and Goldie (2000). Since the modularity of basic emotions is also essential to my own account of understanding musical expression, much of the discussion to follow will also be relevant for the next chapter.

2. Some objections

Given that Griffiths’ argument is that basic emotions demonstrate that emotions are not a natural kind, Prinz’s account is interesting as he aims both to retain the category of basic emotions and to argue that Griffiths is wrong. Emotions are a natural kind, Prinz argues, because they all incorporate basic emotions in a particular way: either on their own, as per the unadulterated basic emotional instances, or, in the case of higher cognitive or “non-basic” emotions, as the foundation or core of the emotional state being experienced72. Jealousy, for example, might comprise a blend of anger, sadness and disgust when experienced against the appropriate cognitive context or thought, called a “calibration file” (Prinz 2004, pp.99–102). This is what saves the account from merely arguing for cognitivism, in that these thoughts are not components of the emotional state, but a context against which that state occurs.

72 As I discuss below, Goldie objects to this “avocado pear” view of emotions (2000, p.85). See also Ratcliffe 2006 for a brief review of Prinz’s account.

It emerges, however, that this proposal turns on a much more permeable sense of informational encapsulation than is usually intended by that term. Classical modularity theory suggests that we are hardwired to produce or recognise a basic emotional state in response to particular environmental triggers. Prinz asks us to consider that a significant number of our current emotional triggers might not have been the kind of events that were relevant to our early ancestors. We can be afraid of explicitly modern events such as asbestos causing cancer rather than more ancient concerns such as attack by neighbouring tribes or the local wildlife. How, then, do we account for these triggers if the connection between the event and the emotion is hardwired? Prinz argues that these triggers can be “recalibrated”. The modules can learn to recognise events other than the original trigger events, or even add more modern context to such events (Prinz 2004, p.99)73. Jealousy, for example, can consist of anger “recalibrated by a judgement about infidelity” (2004, p.100). But anger, on the other hand, even when it is about something (say, losing one’s job), is still just called “anger”. Prinz does not see why some emotions are traditionally considered “higher cognitive” (such as jealousy), and some (such as anger) are not, when both are subject to recalibration through cognitive judgements (2004, p.100). He argues that this provides enough processing similarity to render all emotional states part of the one natural kind. Calling any emotions “higher cognitive” is therefore simply arbitrary74.

However, it might be objected that the way Prinz constructs this argument greatly weakens the modular conception of basic emotions through relaxing informational encapsulation too much (see Jones 2006). This is a problem because it appears to render basic emotions much less “basic” and therefore contravenes the considerable experimental evidence supporting the modularity of basic emotions cited by Griffiths75. If Prinz is arguing for one natural kind, and that kind is both basic and rational, then it is possible to argue that (against his protestations to the contrary) the kind he ends up with resembles “higher cognitive states” more than it resembles reinterpreted basic emotions. This is because his “calibration file”, which endows the modules with more flexibility through judgements/appraisals, may simply serve the same purpose as a lesser degree of encapsulation (i.e. weakened modularity). That is, it is difficult to avoid the appearance that the thoughts comprising those judgements might become a true component of the emotion rather than a context for it. Given all this, Prinz needs to defend his account against the possible accusation that he is merely changing the labels around here rather than recasting a more rational modularity.

73 This idea, he says, is imported from Dretske (1986).

74 Ekman (1999) attempts an approach falling between Griffiths and Prinz. He argues that all emotions are basic, and those identified as “higher cognitive”, such as love, are not emotions as such but “emotional plots”, which endure over time and tend to involve more than one basic emotion as they unfold.

As Jones explains (2006), Prinz is aware of such objections. In order to be saved from this accusation of merely re-labelling thoughts as prior conditions rather than constituent parts (or playing a “cheap verbal trick”, as he puts it (2004, p.101)), his account needs to explain more than our current alternative account can. It needs to tell us, that is, more about the nature and function of this one natural kind of emotion than our current account does. But his defence against this objection instead turns on an accusation about the objection’s assumptions. He claims that we can’t argue that judgements are a constituent part of an emotion because the contents of the judgement itself can change. “To say that a higher cognitive emotion is reliably triggered by a particular judgement does not entail that the same emotion is reliably triggered by that very judgement” (2004, p.101). This, he says, is a “false assumption” and one that the re-labelling objection assumes.

However, I’m not convinced that the objection assumes anything of the kind. Higher cognitive emotions evidently do not require a particular thought to trigger a particular emotional state on each occasion. Rather, they just require the thought to be a constituent part of the emotion, and this thought can pertain to a number of contextual matters surrounding the identification of that emotional state. As far as I understand it, that is, Prinz’s argument, in which he describes the many different thoughts that can trigger the emotion of jealousy (“the judgement that one’s lover has been unfaithful … the judgement that one’s lover has been staying unusually late at work … the smell of an unfamiliar perfume on another’s clothes” (2004, p.101)), is just as effective a description of the higher cognitive state of jealousy as it is of his own “modular” one. Different triggers do not, that is, identify different emotional states in higher cognitive theory either; they all reduce to the constituent thought that identifies the state of jealousy (here grouped under the “possible markers of infidelity” banner). It is not clear, then, that Prinz has produced a knock-down defence against this objection. If Prinz cannot answer it (see also Jones 2006, p.22), then he can no longer claim that thoughts are not constituent parts of an emotional state, or at least not by calling them “prior conditions”. If he can no longer make that claim, then he can no longer make the associated claim of defending a truly modular account of the emotions (Jones 2006, p.22)76.

75 I should point out that this is a very general objection, skating over a lot of contextual detail, particularly in the exchange between Prinz and Jones, who (in 2006) is concerned with the rationality of the proposed form of basic emotions. Jones points out that the more modular (i.e. domain specific and encapsulated) an emotion is seen to be, the less it may contribute to practical rationality, which demands a wider range of application and inputs.

While there is more to be said on Prinz’s theory, I think there is enough doubt surrounding his claim to modularity to justify looking at some alternative positions. One such alternative is Peter Goldie’s view (2000, chapter 4). He questions the very scientific consensus on the modular nature of basic emotions upon which Griffiths relies, objects to the idea of basic emotions being “basic” in the way Prinz thinks they are, and champions “commonsense psychology” over Griffiths’ scientific approach to the whole question of basic emotions. His view is that we should abandon any attachment to the idea of “hardwiring” in favour of a “developmentally open” account of human emotional experience, in which our emotional capacities are shaped more by the environment and culture than anything else (2000, p.85). Goldie’s account is accordingly complex, so I will try to deal with each of these objections as simply as possible.

76 Jones makes a wider point regarding weakening of the notion of encapsulation by such re-labelling of any cognitive influence as input to modular function. If information suddenly becomes contextual input, and almost any kind of information can be thus described, then why should Prinz bother with the concept of encapsulation at all (Jones 2006, p.22)?


First, Goldie questions whether or not the pancultural evidence for facial recognition of basic emotions is as strong as Griffiths thinks it is (2000, pp.89-90). As I have explained, pancultural evidence is particularly important in support of modularity, as the required “hardwiring” will need to be present across the species. Goldie accepts the findings of Ekman’s 1972 experiment on facial recognition of basic emotions amongst the Fore in New Guinea (a group who had no prior experience with other cultures), and describes Ekman’s evidence for reliable recognition of facial expressions of basic emotions above chance levels as “robust”. However, he is critical of the way the study was conducted. Although Goldie does not expressly claim that Ekman’s study was what he terms a “judgement test”, he nonetheless cautions against taking experiments that rely upon judgement tests too seriously in this context. A judgement test in this context is one in which the participants are asked to choose from a specified set of emotion terms to describe the facial expression with which they are presented, rather than initiating the terms of that description themselves. Judgement tests may, he says (rightly), be influenced by the way each culture conceives of emotions, and they often involve the use of “forced choice” responses to clarify results (2000, p.90). He then proceeds to detail the many different linguistic characterisations of emotional states and points out, again quite rightly, that some cultures have words for emotional states that others do not (2000, p.91).

But all this, as it emerges, doesn’t have much impact upon Ekman’s findings regarding the Fore. As Griffiths explains (2003, p.7), the Fore participants were not given a forced choice test of the kind Goldie describes. To control for this very problem, participants were given pictures of faces expressing basic emotions and asked to match them with pictures of actors in corresponding scenes (such as a father sitting by the bed of a dead child, or the sudden appearance of an aggressive predator). This resulted in the Fore making reliable identifications of emotion states without being restricted by linguistic differences (if the Fore have a word for “sadness at the death of a child”, it is still, after all, sadness). While it could be admitted that only a few scenarios were pictured, and in that sense the choice was forced, it is very difficult to conceive of any other way to test the Fore, given that the whole point of testing them at all was the absence of prior cultural and linguistic contact. And, Griffiths adds, “Ekman also filmed the faces of Fore people acting out some of the same incidents and students back in the United States proved equally adept at identifying the intended emotion from these films (Ekman, 1972)” (2003, p.7). Also, while Goldie points out that “free choice” studies (wherein participants are not given a choice of words but provide their own) result in a huge variety of different terms used to describe basic emotion states, emotions like anger or sadness nonetheless tend to be identified more reliably than emotions such as contempt or disgust (and this is supported by the Russell (1994) study Goldie discusses). This will be a significant point in the next chapter, but for now there is little to suggest that Ekman’s results (at least) offer anything other than “robust” support for the modularity of basic emotions77.

The second of Goldie’s objections is that inherent in the support for basic emotions is also support for the idea that they provide the “core” of non-basic emotions, which is what Goldie terms the “avocado pear” conception of emotions (2000, p.7). Goldie prefers a cultural view in which there is no hardwired core surrounded by a softer culturally-fed filling; rather, he accounts for the evidence of pancultural similarities by presenting a view in which emotions have a “paradigmatic narrative structure” that is shared across cultures but which may also be culturally influenced (2000, p.92). Prinz also shares the view that basic emotion theory implies a commitment to non-basic emotion theory: “Anyone who believes in basic emotions must explain how they give rise to non-basic emotions” (2004, p.91). That is, both Goldie and Prinz are assuming that any theory of the emotions must encompass all emotions, both basic and non-basic. However, this only holds if you also hold the view that the emotions are a natural kind. If emotions are not a natural kind, then Griffiths is within his rights to merely present an explanation for basic emotions, as he has done, noting that no theory is available for non-basic emotions (or whatever we choose to call them) as yet. As Griffiths makes clear in his earlier work (1997), his argument for emotions not being a natural kind stems in part from his noting the level of scientific consensus on the existence of basic emotions. Adopting the scientific approach that he advocates, rather than conceptual analysis, means that we must abandon theories for which there is little supporting evidence. However, this does leave something of a gap in emotion theory. While I admit that Griffiths’ view leaves non-basic emotions more or less unexplained, it should also be emphasised that this, in a way, was his whole point: we need to find some other way of unpacking them. His later work (2003, 2004) acknowledges this, in that he addresses the question of the relationship between basic and non-basic emotions.

77 Goldie, consistent with his emphasis on “commonsense psychology”, also points out that while such studies might test recognition of expressions, they don’t test for intentionality (2000, p. 90). However, Griffiths cautions against expecting the studies to do so in the first place. They are concerned, he says, with output only; it is important not to confuse “pan-cultural” with “human nature” (1997, p. 9).

Goldie’s final point, about commonsense psychology, is best explained via his argument against the idea of basic emotions being emotions at all. I suspect that it also betrays a high level of expectation of what Griffiths’ original three-way categorisation of emotions sets out to explain (1997). Griffiths notes that it is possible for one emotion to be realised in each of the three categories, and uses the example of anger, as I discussed above. However, Goldie

(in discussing an objection to Ekman’s argument that all emotions are basic) uses anger as an example of a long-term emotional episode that draws on experiences and beliefs and has a definite object (2000, pp.105-106). This, because it flies against the “basic emotion” characteristics of automated, short-term and involuntary, he sees as an example against the whole idea of basic emotions being emotions, and suggests that we find another name for them, preferably “affect program responses”. This works more against Ekman, than it does against Griffiths, who freely admits that anger can be a non-basic emotion (in that it can fit into each of his three categories). Griffiths is also, as we saw in the quote at the head of this chapter, relatively unconcerned about whether or not we call these “emotions”, noting that

“when used in this context all these emotion words refer to phenomena less rich and varied than those they refer to in common speech” (2003, p.7). He would also probably be unconcerned by Goldie’s plea to “let us all stop talking about basic emotions: it just generates confusion” (2000, p.106). As I discussed above, all Griffiths needs for his argument to stand up (at least in its 1997 incarnation) is the consensus for the existence of basic emotions based on the evidence of modularity in conjunction with his view that emotions are not a natural kind. He has made it clear that he makes no claim about what the different kinds should be called. It seems to me, then, that Goldie’s arguments, since they are directed at basic emotions as a concept, have more force against Ekman than they do against Griffiths, whose point was much broader. Griffiths is not proposing that the

existence of basic emotions (or “affect program responses”) has anything to do with non-basic emotions, however they are thought to be composed.

In conclusion, let’s summarise the disagreement between Griffiths, Goldie and Prinz. As

Prinz outlines, there are two central problems and the responses to these problems characterise the three different accounts. These are the problem of parts (that is, which component of an emotional state is essential for its individuation: thought or feeling?); and the problem of plenty (that is, if we answer the problem of parts with “both”, how do they hang together?). This latter problem depends upon the closely related assumption that emotions are a natural kind, which in turn leads us to expect that any definition of basic emotions must incorporate a definition of non-basic emotions. Each of the accounts I have discussed also needs to accommodate or reject the empirical evidence for the modularity of basic emotions. Prinz tackles the problems, and coheres with the empirical evidence, by arguing that basic emotions are a part of non-basic emotions, and retain their modularity in that the characterising thought in non-basic emotions is not a component but a context.

On his account, emotions are a natural kind and a theory of the emotions must therefore accommodate both basic and non-basic emotional states. The problem with his account is that it is difficult to unpack his version of modularity without it looking less and less modular as the complexity of the emotional state increases. Goldie deals with the problems by rejecting the empirical evidence for basic emotions, and the whole idea of basic emotions themselves. They are not emotional states and therefore will not be a part of a theory of the emotions. Instead, he advocates a culturally contextual notion of what emotions are, and accounts for apparent pancultural evidence by proposing “narrative paradigms” that are culturally influenced (2000, chapter 4). The problem with this account is whether his rejection of the empirical evidence is justified; also, Goldie’s assumption that emotions are a natural kind (or at least, that inherent in the idea of basic emotions is the idea that they are component parts of non-basic emotions) steers his argument at cross-purposes to Griffiths (see, for example, Goldie 2000, p.87).

Finally, against Goldie and Prinz is Griffiths, who argues, as we have seen, that since emotions are not a natural kind, there is no reason to produce one theory to account for all

instances we currently regard as emotional. Since most of the empirical evidence (which he regards as sound) backs a modular view of basic emotions, he argues that we need to find another explanation for non-basic emotions. The problem with this view was his argument that the same emotional state might be both basic and non-basic, and yet not fall under the same explanatory theory. That is, it feels to us as though basic and non-basic anger might be more closely related than he allows.

This might be where we reach the core of the problem. While it feels in our emotional experience as though there should be some relationship between basic and non-basic anger, and that they are different versions of a similar kind of thing, I would agree with Griffiths in his view that we should nonetheless ruthlessly stick to the theory for which we have sound supporting evidence. In this case, we have much more supportive empirical evidence for basic emotions than we do for any other state, whether we call basic emotions

“emotions” or not. However, despite this, we seem led by our experience of the relationship between emotional types towards the other end of the emotional scale. An emphasis on this non-basic aspect of our experience in the literature, given this, is of course understandable.

I would suggest that there seems to be much more anxiety in the literature to produce an account of non-basic emotions than there is of basic ones, as evidenced by the admittedly small sample of Prinz’s and Goldie’s accounts just discussed. But I suspect that what is actually going on here is a very subtle (and largely unconscious) bias towards that traditional separation between the rational and the emotional or bodily states that Hanslick found appealing and that Griffiths warns against in aesthetics in particular (1997, p.2). I suspect, that is, that the kind of bias I discussed in chapter three, in which musically-expressed emotions understood through listening are seen to be somehow “secondary” (in a range of senses) to formal structural properties understood through analysis, might be leading to the tendency to argue, if not always literally, that basic emotions aren’t “real” emotions because they lack a “thought” component. Whilst Griffiths is ignoring much of our experience in his declaration of independence from traditional emotion theory, and while his accusation of an over-reliance by philosophers on conceptual analysis might go too far, his call for a new perspective on the problem is, I think, valuable in that it does

allow us to re-evaluate traditional approaches to the importance of this “rational” thought component in emotion theory.

However, the topic of emotional natural kinds is best left to emotion theorists. I, on the other hand, only require evidence for the modularity of basic emotions, whatever we choose to call them. The discussion of the objections against Griffiths’ evidence for modularity raises several important points for my argument in the chapters to follow. My argument for the way we understand emotional expression in music depends upon two premises: first, an at least significant degree of informational encapsulation in the modules recognising expressive features in other humans, animals and, as I will point out, musical structure. Secondly, my argument also depends upon there being reasonable grounds for concluding that, at the very least, there is a distinct processing difference between basic and non-basic emotions (regardless of whether this difference means that both varieties of emotions are not a natural kind and regardless of whether or not basic emotions are truly

“basic”). The discussion of the disagreement between Griffiths, Prinz and Goldie addressed both of these requirements, in that it showed that Prinz’s argument for processing similarity (the most comprehensive on offer at present) is at the cost of modularity - which is problematic for him because of the considerable experimental evidence supporting it in the context of basic emotions.

I think that in the light of the overview of emotion theory provided in this chapter, we can safely proceed with my two required premises sufficiently intact. I can conclude, for my purposes, that there is robust evidence for the existence of basic emotions, however we describe them. I can also conclude, for my purposes, that there is a processing difference between basic and non-basic emotions. That difference turns on the degree of informational encapsulation: higher for basic emotions, and lower for non-basic emotions, in that the latter can incorporate thoughts and beliefs. In the next chapter, I will (finally) turn the discussion back to music. I want to concentrate on why basic emotions, and their modular nature, are central to my account of emotional expression in music. The discussion of non-basic emotions will need to wait until chapters six and seven, in which I

will explain the distinctions between our understanding of emotional expression, our appreciation of it and our response to it.

Chapter Five: Expression

The sadness is to the music rather like the redness to the apple, than it is like the burp to the cider.

O. K. Bouwsma (1950)

While the previous chapter was concerned with defending “basic” emotions and their modularity, this chapter will focus on demonstrating how the concept of basic emotions strengthens and extends the “contour theory” of musical expression. Recall, from my discussion in chapter one, that the contour theory of expression claims that music expresses through resemblance to voices or bodily posture78. I will argue that applying the concept of basic emotions to the contour theory strengthens it by showing how the resemblance upon which the theory turns might actually work: adding basic emotions means that the contour theory now describes not only what is being resembled, but also the psychological mechanisms behind our recognition of that resemblance. These last two points are the only areas still requiring clarification within this already successful theory, as

I also discussed in chapter one. This chapter will therefore focus upon providing some answers to the related questions of how music expresses (the contour theory’s concern) and also why it expresses only a limited set of emotional categories (through the application of basic emotions to the contour theory). I will argue that music expresses basic emotions only, and that this is the key to the enhancement of the contour theory. I will conclude that not only does the combination of basic emotions and the contour theory clarify the above two points about resemblance, it also reveals why the emotions seem to us to be part of the music itself rather than something the music does or produces; or, as O. K. Bouwsma states, it reveals why the emotions expressed in music seem to be more like the “redness of the apple” than “the burp to the cider” (1950). I am not, then, merely pointing to experimental evidence within the psychological domain: arguing in this way will also enable me to address some philosophical concerns. To this end, I will argue that the contour theory’s claim to resemblance is not the “end of the philosophical line” for this

78 Kivy (1980) and Davies (1980, 1994, 2003) both defend versions of the contour theory.

theory (Kania 2012); I suggest that adding basic emotions to the mix in this way can clarify some of the disagreements about how music expresses and how we understand that expression.

A generic contour theory supports the idea that, as Davies puts it, “pieces present emotion characteristics, rather than giving expression to occurrent emotions, and they do so by virtue of resemblances between their own dynamic structures and behaviours or movements that, in humans, present emotion characteristics. The claim is not that somehow music refers beyond itself to occurrent emotions; music is not an iconic symbol of emotions as a result of resembling their outward manifestations. Rather, the claim is that the expressiveness is a property of the music itself” (2003, p.181)79. I intend to show that, given the contour theory was the strongest on offer to begin with, it is further reinforced once we begin to consider emotional expressiveness as a property of the music’s structure when structure is conceived experientially (as per Ridley’s suggestion (1993 & 2003) – see my discussion in chapter three). The contour theory is reinforced because, I want to say, it can explain more with this addition. I will argue that the contour theory, thus enhanced, can offer a detailed explanation of the “how” question of expression and suggest how we understand it.

But returning to the contour theory itself, both Kivy’s and Davies’ versions are grounded in some kind of resemblance. We are said to recognise certain features in the music that resemble other things in the world – like human voices (Kivy 1980); or human bodily movements and dynamics (Davies 2003); or the faces of sad-looking dogs (both, at various points, Kivy 1980, 1990 and Davies 2003, 2010b). This recognition leads us to correctly understand the emotion being expressed by the music (or, as Kivy prefers to phrase it, the emotion that the music is “expressive of” (1980)). Now here’s the point I want to make: the modular model of affect program or basic emotions can tell us more about how this resemblance actually works. What has been missing in any detail in prior formulations of the contour theory (although Davies (2003, p.171) does discuss the beginnings of this idea)

79 This formulation is much closer to Davies’ own account than it is to Kivy’s (Davies 2003, p.181). Kivy also grounds resemblance in human vocal expression rather than just bodily dynamics.

is first, the connection between the music itself (while it is doing its expressing) and the thing in the world it resembles; and secondly, how this resemblance is processed by us into an understanding of the emotion being expressed. I will argue that the enhanced contour theory can address these two points. It can do this by proposing that the music’s structure has properties that are expressive of basic emotions by virtue of resemblance to human expressions of those emotions. Because they are basic emotions, our brains recognise them via modular processing; in this sense, we are “hardwired” to do so.

My argument, then, is this: the evidence about the processing of basic emotions generally supports the idea that similar processes govern the recognition of such emotions in music.

If we adopt Ridley’s structural hypothesis (minus his ontological basis for it) and a version of the contour theory (in the light of experimental evidence), then we stand to explain much more about musical expression than we have in the past. Therefore, the way they work together offers support for the idea of basic emotions, the enhanced contour theory and Ridley’s structural hypothesis. My defence of this argument will therefore rest upon pancultural evidence of modular emotional recognition, as I discuss below. If these processes are modular, then, by definition, to the degree that they are hardwired they are also species-wide.

In addition to this argument, I will also speculate about what, in the end, it might mean to say that there are structural features of the music that are expressive of emotion in this way.

There has been resistance in the past to the idea that expression simply reduces to musical structure (see, for example, Davies 2003, p.172: “It will not do to reduce music’s expressiveness to a catalogue of technicalities and compositional devices”). But this is not what I am suggesting. Rather, I am arguing that because of the modular way in which such emotions are understood, asking what it is about a sad piece of music that makes it expressive of sadness is like asking what it is about a sad face that makes it expressive of sadness. The modules produce instant recognitions of various features – a minor key here, a downturned mouth there – and we recognise their emotional colour. It doesn’t suggest that sadness reduces to minor keys. It just shows that these are the kinds of features that the

simple fast processing mechanisms in our brains recognise80. To support this argument, I will be looking in more detail at some of the experimental evidence suggesting that facial and linguistic emotional recognition is pancultural, as I began to do in the last chapter. I will then propose (in accordance with the above argument regarding modularity and the enhanced contour theory), that musical emotional recognition operates through the same modular processes as facial and vocal recognition.

I should also note that Davies has raised some substantial objections to both the idea of basic emotions in musical expression and to the methodology of the psychological studies in the area, which he regards as often fatally flawed (2010a, 2010b and 2011). This is a problem for my argument because if the hypothesis regarding basic emotions being expressed in music is viable, then we ought to obtain experimental evidence analogous to the evidence on pancultural facial recognition (which, as Goldie suggested in the last chapter, may have some methodological problems of its own). It might be thought that future studies will produce such evidence, even if the current studies fail. But Davies argues that it is not simply the case that convincing evidence for pancultural basic emotion recognition in music has not yet been obtained; he argues that it is not obtainable. Further to this is his view that only three of the set of basic emotions (anger, joy and sadness) are expressible in music (2011, p.37). I will argue that while his point regarding psychologists’ methodology is compelling, his argument for the unobtainability of the required pancultural evidence rests on an exaggerated expectation of what the experiments will need to show; in my view, the hypothesis is simply that there is modular recognition of basic emotions in music as there is in faces or voices. It is not about whether there is evidence of musical understanding beyond this point; the postulated hardwiring does not extend into access to culturally-influenced contextual features of the music. As I mentioned in the last chapter, we need to distinguish carefully between “pancultural” and “human nature” here (Griffiths 1997, p.9).

80 Modularity theory stops short of speculating on the exact nature of these triggers. They are, as Dretske (1986) has suggested, best conceptualised as whatever the modules are “set up to be set off by”.

Before I start the discussion, however, I should insert a reminder that we are still limited to the emotions we recognise as being expressed by the music itself; this, as I will address in the next chapter, is a different question to any concerning the emotions the listener might feel in response to the music. With this reminder in place, I will begin the next section at the foundation of my argument for this chapter: that music is expressive of basic emotions.

1. Basic emotions and music

Consider, for a moment, a speculative evolutionary story about the adaptive advantages conferred by the emotions (Darwin 1872; Pinker 1997; Davies 2006b; Prinz 2004).

Emotions are advantageous because they tend to track features of our environments that may impact upon our survival. Leaving any psychological mechanisms out of the picture for the time being, it is not difficult to understand how fear, for example, may direct and focus our attention upon a predator (such as the well-worn scenario of the charging sabre-tooth tiger), and may even provide us with reflex-like avoidance responses such as running away. We have also seen how fast responses can be when we are faced with a spider or snake; our fear can be unconsciously triggered and our flight initiated before we are even aware of its cause (Griffiths 1997, pp.94-95). Emotions, that is, can save lives.

But there are also other advantages to be had. These turn upon our recognition of emotional states in other humans (and also in some animals) (Darwin 1872). Because we live in social groups, understanding the current general emotional states of others can serve to highlight aspects of our environment we may have overlooked. For example, in prehistoric times the fear shown by our companions may direct our attention to an approaching predator. This kind of emotional recognition happens in the same reflex-like way it does when we ourselves spot the predator: involuntarily and quickly. This maximises the survival advantages because it means the appropriate behaviour in the observer, once the recognition is made quickly, is initiated instantly if necessary. There is no time for conscious judgement; or at least, the risk of taking extra time to respond is not worth taking.

However, we might also see that our companion has mistaken something harmless for a dangerous animal (say, a bent stick for a snake). There are therefore three steps to this emotional response. First, the recognition (involuntary); then two possible options: either we are afraid as well simply because our companion is afraid (called “mirroring” or “emotional contagion”81), or we can make a judgement that our companion is mistaken, and the fear dissolves along with any inclination to flee. More often than not, both of these latter two steps occur, in that after we recognise fear, we mirror it, start to run, and then realise a mistake has been made after enough time has elapsed to make the conscious judgement (or not, as the case may be).

Importantly, emotional recognition in others can also serve to bond the group more closely together, which has safety advantages in and of itself (Cosmides & Tooby 2000). Some of these social bonding emotions tend to be more complex, and usually require some deduction from the observer as to their cause. These recognitions tend to be a slower process due to their complexity and due to the fact that they involve conscious thought. For example, we may deduce from observing the happiness or contentment of another member of the group that they value a friendship highly, and it is this friendship that has made them feel happy or contented (Davies 2006b, p.135). In this kind of instance, the group can gain advantages through more harmonious relationships founded on learning about others’ emotional lives and their motivations. It also strengthens protective impulses within the group; if you can understand the inner life of other group members, you are more likely to take care of them by sharing food, looking after their infants, or defending them against attack from hostile groups of other humans or from animals.

It is important to note one more aspect of this hypothetical scenario: the means by which this kind of recognition occurs. As we are offering an evolutionary explanation, we might be referring to a time before language was sophisticated enough to do this work for us82. So

81 For an explanation of emotional contagion, see Oatley et al (2006). I will discuss it in more depth in chapter six.

82 But there are several species of animals - for example, groups of modern Vervet monkeys - who have developed particular warning calls for particular predators (i.e., different calls for hawks than for snakes to indicate different evasive actions). While this cannot be called a language, it could indicate that there may always have been some form of rudimentary meaningful vocalisations in early human

these recognitions took place partly via visual input (they occur in response to facial expression or bodily movements and positions), and partly through aural input (there may also have been some rudimentary alarm call, or, even more probably, involuntary vocalisations such as screeching in anger, gasps of fear, or wails of sadness).

Griffiths quite rightly points out that this kind of speculation is best viewed as providing us with “how possibly” explanations until they are properly confirmed (1997, p.96). But I think it is not unreasonable to draw some support from this kind of story for a few ground-rule assumptions. First of all, it seems clear that the emotions we recognise quickly and involuntarily are (more often than not) the modular-processed basic emotions. So the fear-response example makes perfect sense – we can run from a predator without the time-consuming need to consciously think about it, and our fear provides that instant motivation (Zajonc 1980; Ekman 1999). As Griffiths says, “the modularity of our emotional responses can be seen as a mechanism for saving us from our own intelligence by rapidly and involuntarily initiating essential behaviours” (1997, p.95). We do not even need to be consciously aware of the stimulus and, importantly, we can continue to be motivated by this kind of emotional response even if it contradicts our currently held beliefs. That is, on this view emotional lag or recalcitrance due to the modules’ encapsulation is also a risk-minimiser; our modular basic emotions lead us to err on the side of caution regardless of our belief states. Given this, the second assumption we can draw out of this story is that the more complex, slower, social emotions (such as jealousy or love) require time, conscious accessibility, and reference to beliefs/desires/thoughts83. These are the emotional states I will discuss in the chapters to follow. For now, however, I will be concentrating on basic emotions.

social groups (Attenborough 2002, p.271). There are also other means by which vocalisations might convey information. Sell et al’s study (2010) supports the hypothesis that even modern humans can successfully assess the strength (or fighting ability) of another human (particularly other males) through the timbre of their voice alone, regardless of language.

83 Although, as Griffiths argues (1997), emotions like anger can exist as a basic emotion and as a non-basic emotion. However, I would suggest that while the same facial or vocal recognition cues apply in both cases, the non-basic state has further layers in its reference to beliefs/desires/thoughts.

These evolutionary considerations – particularly the social ones - can provide further reinforcement to the assertion that we recognise basic emotions through faces and voices.

Now, bearing all this in mind, we need to shift the focus back to music. Consider the many attempts in the past to attribute more complex representational powers of emotional expression to music. One of the extreme examples is Leonard Bernstein’s The Unanswered Question (1976), in which he argues that music is directly analogous to language, both in the structural sense (words, phrases and sentences as analogous to musical structures) and almost at the level of semantics (his view is that music means like poetry means: with some metaphorical or imaginative input on the part of the listener (1976, p.79)). Bernstein argues that there could be a musical phrase that means, albeit poetically, “Jack loves Jill” (1976, p.79). But this kind of theory ultimately fails in that it does not reflect either the experiences of ordinary listeners or the results of psychological studies: in practice, only the most general of emotional colours can be objectively identified by ordinary listeners

(Davies 2006b and 2010b). It is not the case that even trained listeners can access that much more about emotional content, either, however extensive their training and however adept they may be at recognising other complex structural scored features. This generality of expression does seem to be counterintuitive, as every musical work is differently constructed in the same way that different literary works are, and we would therefore expect different works to express significantly different stories; this is the assumption that

Bernstein trades upon. But while we may like to think that a particular piano sonata “tells a story” of exquisite romantic suffering and thwarted youthful love (and musical criticism, unhelpfully, is often written in such terms), when it comes down to it all that can be reliably said about the piece is that it sounds sad, or very sad, as the case may be. The same holds for happy works. Once we remove any projections, symbolism or associations that may be imposed upon the work, leaving bare the structure of the piece itself, music simply does not provide us with enough information to allow us to recognise any but the most general emotional colours in its expression. This, after all, was Hanslick’s original point; he sees this expressive vagueness as grounds for abandoning the idea that music expresses any emotions at all. Davies quite rightly disagrees, stating that “….what is unique to the musical work is not… the emotion expressed but rather the means of expression. Music usually expresses rather general emotions and two different works can express the same quality of

sadness. What is highly specific to the work is the detail of the means – the actual notes, harmonies, and so on – by which the emotion is expressed” (2011, p.130).

Now, finally, consider some findings in various psychological studies over the past few decades that support three foundational assumptions within my argument for basic emotions. These assumptions are: first, basic emotions fit into Fodorian modularity and the emotional models outlined by Ekman and Zajonc (that is, they are ancient, limbic system processes that fit the modular fast and involuntary model rather than the cognitivist slow and conscious model). Our initial speculative evolutionary story above offers modest

(speculative) support for this assumption. Griffiths also observes that Zajonc’s studies, showing that basic emotions are elicited by a reflex-like process rather than a consciously-mediated process, are clear evidence for modularity (Griffiths 1997, p.96). Secondly, early cross-cultural studies on the recognition of basic emotions by facial expression tend to show that basic emotions (Ekman’s 1972 foundational set of six: surprise, fear, anger, disgust, sadness and joy84) are most reliably identified. And thirdly, further studies have supported the idea that this same set of emotions can also be recognised through vocal expression; i.e. when the subject does not understand the language they are hearing, they can still reliably identify the emotional colour of the utterance within the range of basic emotions. For example, Griffiths refers to a study by Scherer et al (1991), which purports to show that “the classes of emotion which can be recognised in speech include those uncovered in the classic studies of facial expression” (1997, p.84)85. These classic studies involve basic emotions.

So we can say with some degree of confidence, then, that the following three points are supported by experimental evidence so far:

1. Ekman’s six basic emotions are best described using a modular model;

2. This set of basic emotions is recognised through facial expression;

84 This set is sometimes expanded to seven, to include contempt, but for the purposes of the discussion to follow in section 2.1, I am adopting the original set of six (Ekman 1972).

85 For further supportive studies, see Griffiths (1997) and Davies (in Schellekens & Goldie, 2010).

3. This same set of basic emotions is also recognised through non-linguistic vocal expressive colour;

There is a fourth point I would like to add to these, given our considerations above on the general nature of the emotions expressed by music:

4. This same set of basic emotions is also reliably recognised in music.

It is easy to see why this would be the neatest development of my argument up to this point. I have thus far proposed that the general nature of the emotions we can recognise in music is due to their basic nature, and because of the modular way by which we recognise them. Experience suggests, just for starters, that we can objectively recognise general states like happy, sad, or angry very easily in music just as we recognise those same emotions in faces or in voices.

There is a lot, then, riding on the neatness of this argument. If it collapses, then I stand to lose the enhancement of the contour theory and an answer to the question of how music expresses and how we understand that expression. But Davies argues that this “fantasy”, as he terms it, rests upon some deeply problematic foundations (2011, p.36). First of all, and probably most significantly, if we are going to argue for modular basic emotions from an evolutionary perspective, then (as I have already stated above) we need to show that there is evidence of universality across the species. That is, we need to be able to point to cross-cultural studies showing that facial, vocal, and musical recognition of such emotions occurs between members of different cultures. This is not, as it turns out, as easy as it may sound.

If we are positing hardwired capabilities, then we need to be able to isolate these from cultural influences that may mask or distort pancultural evidence. This will hold even if, as seems most likely, what I am proposing here is a similar kind of Chomskyan model to the one posited for language: put simply, a hardwired ability that is informed by cultural context, in that the modules are hardwired and informed by cultural surroundings in early childhood. Musical recognition of basic emotions is therefore analogous to, say, facial expression of basic emotions on this model. Particular basic emotions are hardwired to

correspond to particular basic facial expressions, subject to learned cultural themes or patterns86.

But as Davies points out, most of the current evidence of cross-cultural universality of recognition of emotional expression in music is far from reliable because of the many badly constructed and ill-informed psychological studies on the topic that do not adequately protect against cultural contamination. Secondly, Davies argues that not all basic emotions actually are expressed by music anyway (2011, p.37). Instead, he says that only three can be reliably identified: sadness, happiness and anger. This puts a significant dampener on the whole idea because it weakens the analogy between musical, facial and vocal recognition. If only three basic emotions are expressible by music, that is, it raises the question of why the other basic emotions aren’t, since they are processed in the same modular way and are all recognised through facial and vocal expression. In the next section, I will examine Davies’ arguments to this end. While I agree that many of the major psychological studies do suffer from methodological flaws, and that further better-informed experimental work is required (but not impossible to achieve), I think his argument that only three basic emotions are expressed in music is less of a threat than it first appears.

2. Cross-cultural basic emotions in music

First of all, like Goldie (2000), Davies cautions against the view that basic emotions are actual emotions in the first place. Rather, like Prinz, he sees them as components of more complex emotional states, stating that “affect programs are common, perhaps even necessary, elements in emotional episodes of happiness, disgust, and the like, but there can be more to these emotions, especially cognitively, than is covered by the affect program”

86 Ekman argues for this kind of model in his later (1999, p.8) discussions of the nature of basic emotions, in a way that fits the generality of musical emotional recognition perfectly. “Each emotion”, he says, “is not a single affective state but a family of related states. Each member of an emotion family shares the characteristics I have described. These shared characteristics within a family differ between emotion families, distinguishing one family from another. Put in other terms, each emotion family can be considered to constitute a theme and variations. The theme is composed of the characteristics unique to that family, the variations on that theme are the product of individual differences, and differences in the specific occasion in which an emotion occurs. The themes are the product of evolution, while the variations reflect learning.”


(2010a, p.11). Davies therefore leans towards the idea that emotions are a natural kind in highlighting the differences between basic emotions (like fear), and “higher” or “cognitive” emotions (like guilt). While he acknowledges that this is a relatively minor point (and then goes on to treat basic emotions as emotions for the duration of his discussion), it is worth noting his apparent disagreement with Griffiths’ views on the status of basic emotions. It is certainly incongruous with Davies’ partial acceptance of modularity theory (since this comprises most of the evidence that Griffiths cites), as I will argue shortly.

Secondly, Davies questions the validity of many, if not most, of the existing psychological studies on cross-cultural musical expressiveness (and its associations with vocal and facial interpretation), and states that “the results of this research program have so far been limited, inconclusive, and inadequate” (2010a, p.11). His point is as follows: psychological experiments are normally built on hypotheses made up of informed assumptions on the part of the experimenters. Psychologists, however, simply aren’t in a position to be making these assumptions about music; they lack the musical background to determine which assumptions are the important ones. They should, he suggests, refer to musicologists and philosophers in order to make the right assumptions and ask the right questions in the construction of their experiments and in the interpretation of their data. This, of course, works both ways. As he explains (2010b, p.15): “Sometimes philosophy seems to psychologists to be psychologising in a fashion that is uninformed and unrestrained by empirical data (and sometimes psychology looks to philosophers like unskilled philosophising!)”. This is a neat and convincing subversion of Griffiths’ earlier point, in which he blames philosophers for too much a priori language-based theorising and not enough referral to the experimental evidence produced by psychologists.

This subversion is convincing in the light of Davies’ examination of a number of badly-constructed studies in the area that have not only found publication in psychological journals, but have also been widely cited. It is simply not the case, Davies argues, that we can assume very much at all about the pancultural nature of the musical expression of basic emotions on the basis of these flawed studies, which are, as he resignedly says, “suspect in the ways that most psychologists’ studies are”: they draw conclusions from a limited range of musical examples; the musical extracts are normally very brief, out of context, and sometimes heard under unconducive laboratory conditions; and the choice of responses for the subjects is usually provided by the experimenter (thereby subtly directing these responses) (2011, p.42)87. But questionable methodology aside, he also accuses psychologists of missing one fundamental point: “Even where there is cross-cultural recognition of expression and that expression involves music or has a musical character, before one can credit the music with even partial responsibility for the transmission of affect it is necessary to control for other modes of universally recognizable auditory expression that could mediate the communication of affect” (2011, p.41). But, he says, in order to do this, the psychologist would require significant musicological and philosophical background. For example, in addition to other universal cues that might be missed, psychologists tend to ignore the ways in which emotional expression in music may be disguised or distorted by cultural convention (for example, Davies discusses a case in which a particular culture may attribute extra layers of expressiveness to the use of a certain flute to play on occasions of great social meaning, which renders the music almost completely “unfathomable” to listeners outside of that culture (2010a)). Psychologists also often lack enough philosophical background to realise that, for example, choosing a lullaby instead of “music alone” could mean that the music itself may not be doing the expressive work. Instead, it may be the effect of “motherese”, a particular “sing-song” prosodic effect used universally by mothers to infants, or it may be the text of the lullaby at work.
He adds that even vocal music should be viewed with caution as again, what we may be understanding (even in wordless laments) is the expressiveness of wailing grief-stricken voices rather than the expressiveness of the music itself (2011, p.41)88.

Davies cites several studies to support these accusations. But the study that best highlights his point is Gregory and Varney’s (1996), which crops up again and again in his discussion89. Theirs is perhaps the most striking example of the way in which assumptions

87 These are similar concerns to Goldie’s (2000) about some of the facial recognition experiments, as discussed in the last chapter.

88 Davies cites Unyk et al. (1992), and Trehub et al. (1993) as examples of problematic studies involving lullabies.

89 Gregory, AH and Varney N (1996): “Cross-cultural comparisons in the affective response to music” Psychology of Music, vol 24 pp.47-52.

made about the original hypothesis can distort the eventual conclusions drawn by experimenters. In the next section, I will discuss Gregory and Varney’s study, Davies’ criticisms of it, and present some more recent studies that may alleviate some of Davies’ concerns.

3. Experimental evidence and the problem of cultural distortion

Davies argues that Gregory and Varney (1996) make an error common to many other studies in the area: they confuse the emotion expressed by the music with the emotion the listener then feels90. In asking Western and Indian listeners to identify the mood expressed by the music they were hearing, they instead asked them how the music made them feel; and it “was plain throughout”, as Davies explains, “that they took these two very different questions to mean exactly the same” (2011, p.43). They also assume that musical expression is grounded entirely in learned association (Davies 2011, p.44). To this latter end, they compare Western music to Indian ragas. First, they asked their participants to state which season is depicted by both Vivaldi's "Spring" and the monsoon raga "Rag Kirvani," effectively equating spring with the monsoon season because there is “the same association with renewal and joyfulness” (Davies 2011, p.44). But again, this shows a failure to isolate the music’s expressiveness from other expressive factors. If the expressiveness of music depends on properties intrinsic to the music rather than contingent association, as is usually the case, the fact that “both kinds of music are associated with renewal might be irrelevant to their expressive characters” (Davies 2011, p.44). The participants, as it turns out, did not agree on this, and this led Gregory and Varney to conclude that there was no cross-cultural agreement on “renewal” being expressed by the two pieces of music; therefore the music’s expressiveness is not cross-culturally accessible. What they should have concluded, however, is that the association with “renewal” was cross-culturally opaque and that the two pieces still might express different emotions cross-culturally (or even, different degrees of “joyfulness”). The experimenters therefore have no grounds for such a general conclusion about cross-cultural musical expression.

90 They are not alone amongst psychologists in this: see also Darrow, Haack and Kuribayashi (1987). Such confusion is also noted and discussed by Juslin & Sloboda (2010), pp.82-83.


The point for now is that this is a perfect example of erroneous assumptions driving the conclusions of the experimenters. It demonstrates that amongst the necessary assumptions being made in experiment design, there is also a need to make a careful distinction between the cultural and musical factors at work. This is perhaps the subtlest of the distinctions to be made, and one that the earlier psychological literature in particular tends to miss.

However, I would suggest that even Davies might be over-compensating for this failing in his own criticisms of such studies, in that he tends to over-emphasise the cultural over the musical. What I mean by this is he sometimes draws the boundaries around musical structure to include its much wider cultural context. This is in keeping with his views on the identity of a musical work including aspects of its cultural and historical context (2006 & 2011), but I suspect that in this case, it’s all gone a little too far towards the cultural. For example, elsewhere (2010b, p.34), he has argued against the availability of musical cross-cultural evidence per se by claiming that different cultures may express different emotions on such occasions as a funeral; in one culture it may be a cause for celebration and release, in another, a cause for intense sadness. It would be impossible for an outsider to accurately identify the emotion expressed by the music played on these occasions, he says, without their knowing the cultural situation of the piece in question; that is, he suggests that unless the listener knows that a funeral is a cause for sadness in culture X, then they will not be able to recognise the sadness in the funeral music presented to them. But this, I think, is not quite the point. Because we are hypothesising about basic emotions, it is the music’s expressiveness that is under examination, not the cultural context in which it is normally heard. That is, I am arguing that the wider cultural context is not as relevant to our hearing of expressive properties in the music as Davies is suggesting. The hypothesis is simply that the expressive properties of the music may provide pancultural triggers for the recognition of basic emotions in accordance with modularity theory. It therefore doesn’t matter whether the listener knows that culture X thinks a funeral is a sad or a happy event; the whole point of basic emotion recognition is that such knowledge is not accessible by the modules. The outsider should therefore be able to recognise the happiness or sadness of the music regardless; its wider cultural application is an entirely different matter, by which the outsider may even be surprised once the emotion expressed is recognised.


But Davies also identifies a broader (and stronger) cultural problem in any study in this area. Because of the all-pervasive nature of modern Western culture (through film, television and radio), it is becoming increasingly difficult to find participants who have not had contact with Western music to some degree. This means that the musics being compared are often strikingly similar to begin with, leading to findings that cannot tell us very much about cross-cultural expression at all. Studies affected by this contamination include those by Krumhansl (1995, 1999, 2000), which compare English and Finnish folk hymns to Saami yoiks (Davies 2011, p.44). What may be required instead are comparisons between musics that are truly culturally disparate and unaffected by the omnipresence of Western music, rather than between musics that are merely different musical dialects. A more convincing choice in Davies’s view, and one that is potentially untainted by Western music, would be a study comparing the music of an Australian Aboriginal culture with that of the Inuit. This has so far not been attempted (Davies 2011, p.44). Finding groups of either of these cultures who have not yet experienced Western music could also now be much more problematic than it once was in the earlier stages of the last century.

This is indeed a serious problem, but one that more recent studies have recognised, as I will discuss below. There are two levels at which it is addressed. One level is, as just discussed, to find truly dissimilar and uncontaminated musics to compare. One of the more promising examples of this approach is the Fritz et al study (2006 & 2009), which involved an African people from Cameroon called the Mafa (see also Ball 2010; Thompson & Balkwill 2010, in Juslin & Sloboda 2010 p.770). The Mafa had never been exposed to Western music of any kind. When they listened through headphones to a series of short extracts from symphonic Western music (i.e. without vocal lines), they were asked to identify which emotion the music was expressing (this was limited to three emotions: happy, sad and scary) by pointing to the standard pictures of facial expressions of those emotions (from the 1976 Ekman study of cross-cultural recognition of basic emotions).

They were able to accurately identify, at rates above mere chance, the emotion being expressed in each extract even though the music itself was strange to them. While I agree with Davies that the conclusion drawn from this study of this one group of people was far too sweepingly strong: “that the emotional expressions conveyed by the Western musical excerpts can be universally recognized, similar to the largely universal recognition of human emotional facial expression and emotional prosody” (Davies 2011, p.43), it indicates that there might yet be corroborating evidence to follow in later studies.

It would certainly be valuable if such a conclusion were to be corroborated. It would support, for example, the theory under discussion about basic emotions, music and modularity (albeit noting that some basic emotions scored higher than others: for example, happiness was more readily recognised than sadness or “scariness” in the Mafa study91 (Thompson & Balkwill 2010, p.770)). However, Davies rightly points out that there are two potential problems with this study that need to be taken into account. These again are methodological weaknesses. They are also associated with a lack of musicological background. First, on top of the usual question marks surrounding out-of-context extracts, directed responses etc (as above), the sample size in the Mafa study was too small to justify a strong conclusion. Secondly, and perhaps more seriously, there is considerable musicological evidence to suggest that music from that region of Africa shares some modal characteristics with Western music (such as major and minor tonalities), meaning that the Mafa, while unacquainted with Beethoven, may yet be cognitively equipped to understand his music through the music of their own culture. According to Davies, the authors of the study did not acknowledge this, and nor did they supply any other information about the music of the Mafa (2010a). So even in those rare cases in which the participants have been isolated from Western music, we still need to take into account how close their musical dialect may be to it.

Davies’ overall point, however, is much broader than this. Not only must we ensure that the cultures under comparison are truly isolated from each other’s musics, but it seems he thinks that even this might be ultimately futile. His view, as I mentioned above, is that music is essentially structured via cultural and historical convention, and hence might

91 Interestingly, this was also the case for the German control group in this study, although the German group were better than the Mafa at recognising happiness as well (95% correct as opposed to the Mafa’s 65%). Thompson & Balkwill also report that Fritz et al found “the capacity to decode emotional expressions was significantly correlated with the degree of appreciation of the compositions”. It is not clear what they mean by “appreciation” in this context, but it seems to indicate both understanding and aesthetic appreciation (2010, p.770).

never be available to listeners from outside of that culture (at least, not without extensive experience or education) (2010b, p.34). We will never, he argues, be able to know for certain that music expresses basic emotions panculturally because these cultural influences are part of the musical structure itself and cannot be extracted from it. This is the major stumbling point, in his view, for true cross-cultural evidence of emotional recognition in music. It is much more likely that music may be distorted according to impenetrable culturally-specific conventions than facial or vocal cues might be in other studies of this type. He is therefore questioning whether any study, no matter how well constructed, will be able to compensate for such cultural distortion in music.

I have argued already that Davies’ expectations of this recognition might be skewed too far towards the cultural. However, a recent psychological study (Sievers et al, 2013)92 suggests a new approach by taking this possible cultural distortion of musical expression into account.

With considerable foresight, Sievers et al acknowledge that ensuring cultural isolation and insuring against the cultural distortion of expression are essential, and design their experiment accordingly93. They do this by finding an isolated population (the Kreung, in the village of L’ak in Northeastern Cambodia) to compare with a group of American students, and (more startlingly) by leaving comparisons of music out of the study altogether: at no point was either group asked to listen to the unfamiliar music of the other’s culture. Instead, participants were asked to use a computer program to create, rather than to recognise, prototype emotional expressions. These prototypes were then compared, and similarities in the major structures of the prototypes (factors such as dynamics and tempo) across the two cultures taken to indicate degrees of pancultural expression. And, importantly, the experimenters are careful to note that Kreung music is “formally dissimilar to Western music: it has no system of vertical pitch relations equivalent to Western tonal harmony, is constructed using different scales and tunings, and is performed on morphologically dissimilar instruments” (2012, p.3). It seems that the

92 Sievers, Beau; Polansky, Larry; Casey, Michael; Wheatley, Thalia 2013 “Music and movement share a dynamic structure that supports universal expressions of emotion” PNAS 110 (1) 70-75; published ahead of print December 17, 2012 (page numbers will reference this early online edition).

93 One of the authors of this study is composer Larry Polansky, who was also recruited to advise on musicological background.

concerns Davies expresses regarding musical similarities between the two groups in the Mafa study do not appear to hold the same force for this study; it also shows that it might indeed be possible to at least confront, if not completely evade, Davies’ more fundamental concerns about cultural distortion as well.

Given this advantage, I will now spend a little time outlining this study. To begin with, the observation behind it is in fact the same observation behind Davies’ own version of the contour theory: “that music and movement share a dynamic structure” (2012, p.1). The central hypothesis of the study, as the title suggests, takes this one step further: “music and movement share a dynamic structure that supports universal expressions of emotions” (2012, p.1). It is therefore this dynamic structure doing the expressive work, either within emotional expressions in movement or emotional expressions in music, just as Davies argues; but the experimenters also claim that this structure facilitates pancultural emotional expression in either format.

The study is designed along the following lines. The experimenters devised a computer program that can generate examples of music and of movement from a set of five features: “rate [tempo], jitter (regularity of rate), direction, step size, and dissonance/visual spikiness” (2012, p.1). Each feature is controlled by the participant on the computer screen through a set of corresponding slider bars. “This model”, say Sievers et al, “represented both music and movement in terms of dynamic contour: how changes in the stimulus unfold over time” (2012, p.1). Each group of participants (American and Kreung) was split roughly into two smaller groups; each smaller group was assigned either music or movement, and was unaware that the program could generate both. Every participant undertook the exercises alone. Manipulating the slider bars produced an expression of each assigned emotion either in music for the music group, or as an animated bouncing red ball on the screen for the movement group (with dissonance in music mode equivalent to a spiky rough surface on the ball in movement mode). Sievers et al then analysed and compared the structures of these creations, with the likely distortions of even this degree of cultural input taken into account. “Because many musical practices are culturally transmitted”, they acknowledge, “we did not expect both experiments to have precisely identical results. Rather, we hypothesized results from both cultures would differ in their details yet share core dynamic features enabling cross-cultural legibility” (2012, p.1).
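The idea of a single dynamic contour that can be rendered either as music or as movement may be easier to grasp with a small illustration. The following is only a minimal sketch and is not Sievers et al's program: the names `ContourSettings`, `generate_contour` and `render`, the value ranges, and the update rule are all assumptions made purely for illustration. Only the five feature names are taken from the passage quoted above.

```python
import random
from dataclasses import dataclass


@dataclass
class ContourSettings:
    """Hypothetical slider settings, named after the five quoted features."""
    rate: float        # events per second (tempo)
    jitter: float      # 0 = perfectly regular timing, 1 = very irregular (assumed range)
    direction: float   # -1 = always downward, +1 = always upward (assumed range)
    step_size: float   # size of each move (pitch interval or bounce height)
    dissonance: float  # 0 = smooth/consonant, 1 = spiky/dissonant (assumed range)


def generate_contour(settings, n_events=8, seed=0):
    """Return (onset_time, position) pairs: how the stimulus unfolds over time."""
    rng = random.Random(seed)
    time, position = 0.0, 0.0
    base_interval = 1.0 / settings.rate
    p_up = (settings.direction + 1.0) / 2.0  # probability of an upward step
    events = []
    for _ in range(n_events):
        events.append((round(time, 3), round(position, 3)))
        # Jitter perturbs the otherwise regular gap between events.
        time += base_interval * (1.0 + settings.jitter * rng.uniform(-0.5, 0.5))
        # Direction biases each step up or down; step_size sets how far it goes.
        sign = 1.0 if rng.random() < p_up else -1.0
        position += sign * settings.step_size
    return events


def render(events, settings, mode):
    """Label one and the same contour as music or as movement (assumed mapping)."""
    axis = "pitch" if mode == "music" else "height"
    texture = "dissonance" if mode == "music" else "spikiness"
    return [{"time": t, axis: p, texture: settings.dissonance} for t, p in events]


if __name__ == "__main__":
    settings = ContourSettings(rate=2.0, jitter=0.2, direction=0.6,
                               step_size=1.0, dissonance=0.1)
    contour = generate_contour(settings)
    print(render(contour, settings, "music")[:3])
    print(render(contour, settings, "movement")[:3])
```

The point of the sketch is simply that the contour is computed once, before any decision about modality is made; the same sequence of timed steps is then read either as pitch with dissonance or as the height of a ball with spikiness.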

But here is the most interesting feature of the study, from my perspective: the emotion set the experimenters chose for all participants comprised “angry, happy, peaceful, sad and scared” (2012, p.2). Four out of five, in other words, are basic emotions, although the experimenters do not discuss this (it seems that their concern is to demonstrate that cross-cultural emotional recognition simply occurs, rather than hypothesise about how or why it occurs). “Peaceful” was the only non-basic emotion; it was also (perhaps not surprisingly) the “least successfully identified emotion” both within and across both groups, as it has been found to be in previous studies (Sievers et al 2012, Supporting Information, p.4). As we also might expect, the study found that the structures of the dynamic and musical constructions from each cultural group for each basic emotion were significantly similar (in core dynamic features) both within and across cultures. The only exception was that “angry” did not correlate across cultures, although four of the five parameters were similar (2012, p.4). In the light of these findings, “there is evidence”, say Sievers et al, “that emotion expressed in music can be understood across cultures, despite dramatic cultural differences. There is also evidence that facial expressions and other emotional movements are cross-culturally universal, as Darwin theorized” (2012, p.1). I think, however, the findings indicate that it might be basic emotions that are panculturally understood (albeit with reservations about “angry” in this case). Moreover, the study shows that it is possible for psychologists to both acknowledge and to sidestep problems such as cultural distortion in the musical expression of these emotions.

It is also possible, however, to argue that the statement just quoted reveals that Sievers et al believe the evidence they have produced about musical expression is analogous to the evidence previously produced in earlier studies about vocal and facial expression. But the one glaring disanalogy is methodological: in the facial and vocal studies, expressions were directly recognised by the participants, not created and then structurally compared. But I would argue in Sievers et al’s defence that this disanalogy is more or less unavoidable. If we are to avoid accusations like Davies’ (of cultural distortions rendering the musics of other cultures opaque to outsiders), then the only way of ensuring this distortion does not occur is to avoid asking participants to listen to the music of other cultures. Sievers et al have devised a way around this with their computer program. My view, however, is that in the case of basic emotions, cultural distortion might not always be as insurmountable as the experimenters seem to expect. This is because of the modular process behind the recognition of basic emotions; we do not yet know what the triggers are that “set off” the modules. They might not be as vulnerable to distortion as other properties of the music might be. Also, as I argued above, sometimes the cultural influences distort properties of the music that are not central to its expressive capacity; it is essential to be clear about which these are. However, it is equally clear that playing it safe in the early stages of this research is advisable, and Sievers et al appear to have hit upon the only reasonable alternative approach to date.

So far, then, we have covered two of Davies’ major objections to the idea that music may express basic emotions in the same way as facial expression and vocal prosody do. Recall that his first objection is that basic emotions are not “true” emotions. I pointed out that while Goldie shares this view, Griffiths, Prinz and Ekman do not; it is, in the end, just one response to the problem of whether or not emotions are a natural kind. His second objection is the lack of psychological evidence supporting the idea of music expressing basic emotions. This is because, he argues, most of the major studies in the area have been flawed, largely due to a lack of musicological and philosophical background but partly due to some common methodological problems in the design of the studies. While I share Davies’ exasperation, I think there is more hope to be drawn from studies like Fritz et al (2006) about the Mafa, and certainly from Sievers et al (2013) than Davies allows. These studies may suggest that there is at least a correlation between the way we universally recognise basic emotions in faces, in voices, and in music.

Overall, then, the indications that we already have (given the evidence provided by facial and vocal recognition studies) show that, as the more successful studies have indicated, further correlation might be likely. We are, after all, concerned with the most accessible levels of musical complexity; we are not asking the subjects to analyse the works or explain their cultural significance, and it is essential to keep this modest ambition in mind. It might still be possible to reliably test for indications of correlation despite the undeniable cultural influences, as the Sievers et al study shows. There is not much to suggest, that is, that were we to achieve the required level of musicological and philosophical purity in a future study, results showing correspondence between the three modes of basic emotion recognition might not be obtainable. There is some evidence supplied by studies like Fritz et al and Sievers et al, albeit with the caveats discussed; and there is some evidence supplied by the numerous studies on facial and vocal recognition. Thompson and Balkwill (in Juslin & Sloboda 2010, pp.782–3), in a positive frame, also outline the direction for future empirical tests of the extent to which musical expressiveness is cross-culturally recognizable. Like Sievers et al, they acknowledge Davies’ concern about cultural disguises of expressiveness in music, and they also expect that more work will be done on vocal recognition in comparison to music (as Davies acknowledges himself in 2010a). Given this, the idea that basic emotions are expressed by music and are recognised through modular processing is by no means without experimental support, and there are strong indications that more might be obtained in the future.

Davies’ third objection, however, may yet have some force. This was, remember, the weakening effect on the analogy between faces, voices and music that results from his argument that only three of the six basic emotions (sadness, happiness and anger) can be expressed by music. This argument, if upheld, is damaging because it raises the following question: if basic emotions are recognised through modular processing (i.e. fast and involuntary), then why are only three recognisable in music when all six are recognisable through faces and voices? In the next subsection, I will argue that first of all, three out of six is still evidence for the recognition of basic emotions; and secondly, there are explanations that can account for the absence of some of the missing emotions other than the assumption that they are inexpressible in music.


4. Basic emotions: three or six?

Davies supports his argument that music can only express three basic emotions in two ways. The first way seems to reduce to the idea that music is expressive only of sadness, happiness or anger because that is how we talk about it. To this end, Davies argues that

Music can be sad, happy, or angry. Despite its non-sentience, we say the music is sad, or that it expresses sadness, or that its expressive mood is one of sadness, and we do not mean by this merely that it is saddening, though it may be that too. But the other basic affect programs do not seem to be susceptible of expression in pure music. Music can be surprising, disgusting, or even frightening – that is, it can elicit (and even be the object of) surprise, disgust or fear – but we would not say of it that it is surprised, disgusted, or frightened (2011, p.37).

Davies is saying, that is, that there is a distinction to be made between how we feel in response to the music and how we understand its emotional expression; this is a distinction with which we are already familiar. But he is also suggesting that this distinction does not hold with equal force for all basic emotions, and that this is evidenced in the way we talk about it. Because we do not say of music that it is afraid in the same way that we say it is sad, music is not expressive of fear (although we still may find it “scary”). He adds in a footnote that even music in horror movies does not express fear as such; it merely “builds tension”. The same holds, it seems, for disgust: “if someone expressed the view that a musical piece was disgusted, we would be hard pressed to guess what he might be trying to say” (2011, p.37). The bottom line appears to be that we can respond to the music by feeling the three remaining basic emotions (fear, disgust and surprise) but the music itself cannot express them.

This is, I think, unconvincing. To begin with, the way we talk about music being sad and don’t talk about it being afraid could simply be put down to convention; there is nothing intrinsically weirder about saying music is afraid than there is about saying it is sad, after all – music is literally neither. We just tend not to talk about it in that way. The fact that there is a lot more sad music around than there is “scary” music might also have contributed to this convention being widespread; there are many more shades of sadness to express, after all, than there are of fear, and it might be worth speculating about sadness simply being much more artistically interesting to composers. But more seriously, this kind of argument is exactly the type that Griffiths objects to in philosophy in general, as I discussed in chapter four (1997, pp.3-4). In just the same way as we should abandon the idea of “superlunary objects”, and classify barnacles as crustaceans instead of as molluscs, or even refrain from calling all emotions one natural kind, we should also be wary of drawing conclusions about the nature of musical expression based on our linguistic habits regarding it. While the musical properties in question are response-dependent properties, and molluscs, etc are not, the point remains that linguistic evidence is just one kind of evidence, and one that can be explained away. It holds no privilege, that is, simply because expressive properties are response-dependent, as I have argued in previous chapters.

Similarly, the assertion that music in horror movies merely “builds tension” rather than expresses fear is also questionable. We might just as easily argue it builds tension because it expresses fear, and that it is effective in the horror movie context for exactly that reason. I would also add that it is possible to hear scary music without feeling scared oneself. This framework is similar to the one accepted for “sad” music; we understand the sadness being expressed and we may, although not always, mirror that sadness94. This could just as easily hold for “music alone” as it does for film soundtracks.

These points aside, I think that Davies has another agenda in putting forward his assertion that only three basic emotions are expressible by music: he intends to use it in support of his version of the contour theory (that is, that the basis of the resemblance is human dynamics mapping onto musical dynamics). A quick rehash of some essential points is required before proceeding. As I discussed in chapter one, the contour theory differs from other expression theories in that it does not assume that the music conveys occurrent emotions; nobody is experiencing the emotion being expressed by the music, but the music resembles the emotion in the same way a basset hound’s face looks sad to us, even though the dog itself may not be experiencing sadness. As I have mentioned, Peter Kivy (1980) also supports a version of a contour theory, arrived at independently of Davies’. While Kivy thinks the resemblance can also be based on human vocal expression (sad wails or happy laughter dictating melodic contour and harmonies, for example), Davies, like Sievers et al (2013), thinks the resemblance is based on dynamics, or human movements (fast movement tends to express happiness, slow dragging movement expresses sadness, and sharp, sudden movements express anger).

94 To clarify: there is a degree of modular involuntariness about “mirroring” that can be confused with sadness being aroused by sad music by other pathways (for example, association). See Davies in Juslin & Sloboda 2010, p.37.

Davies ties his assertion about the three basic emotions to his contour theory in the following way. If resembling human dynamics is how music goes about its expressive business, then there is going to be a need for what Davies calls “temporal extension” in the musical phrase in question; that is, the structure of the music will need to be complex enough to permit dynamic resemblance (2011, p.38). What this reduces to is duration, and a certain threshold of musical complexity, both in the structure of the music itself and the time taken for the listener to experience and understand the emotion being expressed. It is not easy to resemble any kind of movement, for example, in just one single note; several bars are usually required to establish both tempo and phrasing, or phrase movement, such that the resemblance to human movement can be understood. On Davies’ account, temporal extension is therefore essential if dynamics are the basis of resemblance in the contour theory of expression. Given this, his argument is that happiness, sadness and anger fit this requirement very well, in that they can be clearly described by the music because they have temporal extension: they can all be experienced as emotions for hours, days, or even months, and be clearly understood by others by the carriage, gait or movement of the person experiencing them. Their expression in music is therefore particularly easily achieved. A sad person, for example, droops, moves slowly, and does not make any sudden gestures; a sad piece of music has long, lugubrious downward-moving phrases and slow tempi. In this sense, then, both the experience of the emotion and the experience of the listener understanding that emotion require temporal extension.

Davies argues that the remaining basic emotions (fear, disgust and surprise) are all “typically short-lived”, and “do not possess distinctive patterns of behavioural expression over the longer term” (2011, p.38). Because of this lack of temporal extension, he argues, musical structure cannot resemble the dynamics of these emotions in humans. Davies does acknowledge that music can gesture towards these short-lived emotions, but the music has insufficient time to express them clearly: “Fear behaviours, where the fear's object is not immediately present, indicate a general nervousness or anticipatory tension. These are states that music can express – for instance, with disjointed, darting phrases, or with throbbing discords that call for future resolution”, but even given this, the emotional states being expressed are “too general to count distinctively as of fear or disgust” (2011, p.38). Therefore, the argument is, only happiness, sadness and anger are properly expressible in music, and that is why we describe music as “sad”, “happy” or “angry” but not “afraid”, “surprised” or “disgusted”.

I have stated already that the requirement for temporal extension is tied to the need for the music to express through resemblance to human dynamics, on Davies’ account. But what if we return for a moment to Kivy’s rival account (1980)? He argues that expression turns on resemblance grounded in human vocal utterances (like wails, shrieks, and laughter) or the kinds of prosodic inflections used in speech. We have seen already that there is experimental evidence supporting cross-cultural recognition of all basic emotions through prosody only (i.e. when participants do not understand the semantic content of the utterance they are hearing) (Scherer et al 2001; Ekman 1993). But here’s the thing: remember that on Davies’ account, dynamics require temporal extension to be properly expressed in the music. It is not easy to express movement in one short note, as you would expect in musical expressions of anger or fear, for example. But as soon as we turn to prosodic expression, that situation changes. It is possible to describe, or at least indicate, an emotion in even one or maybe two notes. Imagine a single wail; a singer’s “sobbed” effect on a single wordless note; a violin’s vibrato. It is clear that the three less popular basic emotions – which, as Davies argues, have no dynamic temporal extension – might nonetheless be expressed in short, sharp bursts of prosodic colour. As Davies admits, “prosodic features can be extended through repetition and the like, but expressive prosodic features tend not to be temporally extended. Overall, it is less clear on this account of music's expressiveness than on the one presented earlier [that is, the dynamic resemblance account] why music is limited in the range of affect-program emotions it expresses” (2011, p.39).


Prosodic recognition is therefore a challenge to the idea that only three basic emotions are expressible by music. The experimental evidence showing correlation between cross-cultural facial and vocal recognition of all basic emotions adds to this challenge, reinforced by the evidence in Sievers et al’s findings (2013). There are also some pertinent facts about the process of modular recognition of basic emotions to add to the mix, namely that such recognition occurs quickly and involuntarily, and for very good evolutionary reasons of survival. Davies’ requirement for temporal extension, with its implication that we need a certain amount of time to arrive at a clear judgement of the music’s expressive nature, is at odds with these facts. Yet Davies seems to see it as a failing of the prosodic theory that it does not support the limitation of expressible basic emotions to only three. In fact, he suggests that the methodological flaws in existing psychological studies on music stem in part from a lack of recognition of the importance of temporal extension, a fixation with prosodic expression, and an inability to acknowledge that only three basic emotions are expressible in music. In his conclusion, he suggests that psychologists might “test why music expresses only some basic affects, given that the ones it does not express also have distinctive prosodic as well as physiognomic expressions. And rather than focusing exclusively on parallels between expressive music and prosodic speech, which tend to be local and short-term, they might pay more attention to large-scale expressiveness and its possible connections with musical structure and movement” (2011, p.46).

While I agree with Davies on the many philosophical and methodological failings of some psychological studies (as exemplified by Gregory & Varney’s), I find this conclusion puzzling. The flaws in the psychological approach are compellingly revealed in Davies’ discussion. But it is far from obvious that these flaws stem from psychologists failing to notice that not all basic emotions are expressible by music, and that temporal extension is the key to both this and our understanding of expression. It is certainly the case that it would be beneficial to focus a little more on dynamics as a basis of emotional expression (as Sievers et al have shown), and this might reveal a little more about musical expression, but it seems that to start from a position of “only three basic emotions are expressible by music”, established, as we have seen, partially on an argument based on language use, is to start from a position for which there is less than definitive evidence. Overall, Davies places too much weight on our language use with regard to the basic emotions expressed in music, and I think there is a hint of question-begging in his presupposition of his dynamic account as a support for the “three emotions only” argument. Both are tactics offered in support of his version of the contour theory rather than as arguments against the whole set of basic emotions being expressible in music.

5. Summary and conclusions

Let’s look at the arguments in this section again in summary. On the one hand, Davies is arguing that fear, disgust and surprise are not expressible (not just “not expressed”) because they lack temporal extension and therefore can’t be clearly resembled by music, and because of the way that we speak about them. Sadness, happiness and anger are, in contrast, easily recognised in music because of temporal correlation between the experience of those emotions and the musical structure; this is why we do not describe music as scared, disgusted or surprised. On the other hand, there are Kivy and a number of psychologists arguing that prosody is an important factor in vocal expression and therefore might be in musical expression as well. It seems that once Davies’ requirement for temporal extension is dropped, there is little to say that fear, disgust, and surprise are not at least expressible by music, since there is cross-cultural evidence that they can be clearly expressed vocally and facially (and according to Sievers et al, musically) along with sadness, happiness and anger. The evidence that our recognition of basic emotions is modular, and therefore fast and involuntary, also supports this view.

I have argued already against Davies’ language-based approach to “sad” music. However, I think he is right to say that fear, disgust and surprise are more difficult for us to recognise in music, which is, after all, the point of this discussion95. But I think he is right for different reasons than the ones he presents, and I would also suggest that his argument that these emotions are not expressible altogether is too strong. Rather than temporal extension – both in the emotion experience and the musical structure – being the key to clarity, I think there are other factors at play. First, we cannot ignore the ease and speed with which we understand prosodic emotional cues. We therefore cannot discount the possibility that fear, surprise and disgust may be understood as expressed by music, since prosodic cues seem to be important for this set of relatively “short-lived” basic emotions. I am suggesting that music may express these emotions, but they may be harder for us to understand in music than the other three. This is something that should be expected, given that it is also mirrored in the studies regarding other pathways to recognition of these emotions in facial or vocal expression – as far as I am aware, it is never the case that all six in the set are recognised with equal success (see my discussion of this in chapter four).

95 Although Sievers et al (2013) found fear to be easily recognised (see above).

Secondly, while fear, disgust and surprise may be expressible in music, they are less often expressed in music, or at least with less clarity. I therefore need to account for why this is the case in order to offer a real alternative to Davies’ argument regarding temporal extension. I have stated already that we cannot argue from ordinary language descriptions of music and that prosodic cues cannot be disregarded so completely, given the experimental evidence. I would also like to offer the following: while it is clear that surprise and disgust are genuinely short-lived as experiences, it is not as clear that fear is. If we take Ekman’s view that basic emotions are flagships for families of emotions – and this seems to be the general assumption – then fear itself, the experience of fear, can endure as fully-fledged fear or strong anxiety for just as long as various forms of sadness or joy. Davies admits that fear can be expressed through prosodic/dynamic cues that are extensively repeated, but I would suggest that its more enduring forms might be just as clearly expressed in music as sadness may be. So why isn’t it, given this? Why do we not count amongst the canon of Western music entire symphonies dedicated to the expression of fear or anxiety, as we do to sadness or joy96? As I have indicated above, I suspect that it’s just not as interesting artistically to composers or to audiences; it could be accounted for by cultural convention. While there are centuries of tradition now swathed around the apparent romanticism and glamour of melancholia, no such appealing context is available to fear, which seems to be largely left to film scores (as Davies indicates). And of course in contrast, the appeal of sharing in musical happiness, or relating to musical anger, is just as obvious. While I am speaking from a Western music perspective here, it would be interesting to explore whether or not there are any cross-cultural correlations with these apparent conventions.

96 Or if we do, such works are certainly in the minority. The Rite of Spring comes close in parts, but being a ballet, it is also close to a film score.

That leaves surprise and disgust to account for. While I think that the same argument can cover these emotions – that is, we are just not that musically into them – there is a little more to it. These basic emotions, more than the other four, tend to bleed into the others and are less robust on their own account. The experience of surprise, for instance, is more easily conceived as value-laden, in that it’s harder to imagine an experience of surprise that is neutral (i.e. just a surprise, not an unpleasant or pleasant surprise). This tends to lead to either fear or disgust (for unpleasant surprises), or happiness (for pleasant ones) very quickly. I would suggest that it is this that makes it hard to isolate in music, although it is much more clearly recognised in facial expression as the cross-cultural studies indicate.

The same would tend to hold for disgust; disgust we experience for a foodstuff, for example, might be experienced differently to disgust for another’s conduct. While the facial expression might be the same in each case, the latter experience may bleed into anger and mask its recognition in musical expression. So in short, these emotions are indeed harder to recognise in music. But it is not simply because they lack temporal extension and nor is it because they are inexpressible by music per se; there are other factors at play here.

While these considerations are speculative responses to Davies’ account, they can be reinforced further by a return to what we know about how these emotions are recognised by the responsible modules. As Davies says, music is not about resembling a facial expression so much as what it feels like to feel that emotion (2010a; 2011). However, this does not mean that the cues that allow the modules to recognise such experiences in others are going to be vastly different in each case. To return to some evolutionary speculation here, it would make little sense if, when observing another member of our social group on a prehistoric hunting trip, our modules only took account of their bodily dynamics indicating fear rather than their vocalisations. It might possibly be slower, for a start, although physical cues like these would be recognised faster than they might be resembled in music. It would make much more sense, simply from the point of view of risk minimisation and speed, to use both kinds of cues to trigger our own fear/flight (possibly through different encapsulated domain-specific modules working simultaneously). This would make maximum use of the available information. That is, if we return to looking at the modular mechanisms by which these emotions are recognised, it seems sensible to allow that both bodily dynamics and prosody can cue the modules into recognition in the interests of reaction speed.

It is now worth speculating about how this might inform the contour theory, given that this was the driving force behind Davies’ argument for temporal extension in the first place. If music expresses through resemblance grounded in human dynamics, then it requires temporal extension, as we have seen. Therefore, it is consistent to argue that any basic emotion that is too short-lived to support temporal extension cannot be represented in music. Kivy’s account of the contour theory, however, relies upon prosody rather than dynamics and as such does not need to allow for temporal extension at all. If we focus on the modular processing model, then it would make sense to suggest that the contour theory might ground its resemblances in both prosody and bodily dynamics in the same way. That is, a contour theory that best reflects what we know about modular basic emotion recognition, and that therefore best allows for the possibility of the entire set of basic emotions being expressed (in varying degrees of clarity, admittedly), is one that encompasses both Davies’ and Kivy’s accounts: one that encompasses both prosody and bodily dynamics.

Finally, a reminder of why all of this is important. The overall discussion within this chapter has centred upon seeing what happens when the idea of modular-processed basic emotions is applied to existing discussions of musical expression that incorporate these emotions. What happens is this: the focus on modularity shows that it might be possible to reconfigure the contour theory such that it is grounded in both dynamics and prosody, and in doing so, provide a clearer account of the perceptual mechanisms behind how we understand or recognise the basic emotions we hear expressed in music. Moreover, in referring to psychological studies in the area, we can see that while there were serious methodological and philosophical errors in many of the studies, there are some signs that cross-cultural recognition of at least some basic emotions in music might be supported (Fritz et al 2006; Sievers et al 2013) and further signs of its being supported by future psychological studies, as suggested by Balkwill & Thompson (2010).

Another reminder is also necessary. In this chapter, I discussed how we perceive and understand the emotions expressed in music, not how we might feel in response. I showed that music can only express a limited range of emotions, and that these (arguably) correspond with basic emotions. This is why the modular theory was brought to the forefront of the discussion. But I now need to return to the distinction I drew very early on between understanding and appreciation, and to what I think is a corresponding distinction between basic emotions and understanding on the one hand and higher-order cognitive emotions and appreciation on the other. As we will see, some recent psychological studies have shown that the most common emotional response is in “mirroring”, which means (since this is a modular process) that the most common emotions evoked by music – not just expressed by music – are basic emotions97. However, these emotional responses still lie outside of the primarily judgemental realm in which we decide how much we like the music; how motivated we are to return to it, and how the music rates as a piece of art. In this chapter, I have only discussed the basic emotions we perceive and sometimes mirror in music. In the next two chapters, I will discuss the other “higher-order” responses to music, in terms of what is meant by appreciation, why it is so often confused with understanding, and why, perhaps most importantly, “higher-order” rather than basic emotions are involved at this level. This discussion has to do with an idea I have so far avoided: the idea of aesthetic appreciation, or aesthetic emotions about, or in response to, an artwork. I want to explore another suggestion of Davies’ about such emotions: that the aesthetic response is a “species of love” (2003, p.201)98.

97 Juslin, P et al 2010 “How does music evoke emotions? Exploring the underlying mechanisms” in Juslin, P and Sloboda, J (eds.) 2010 Handbook of Music and Emotion: Theory, Research, Applications Oxford: Oxford University Press

98 See also Davies in Sukla (ed.) (2003).


Chapter Six: Response

“Is it Good Art?” “Well, I don’t quite know what you mean,” I said warily. “I think it is a remarkable example of its period. Probably in eighty years it will be greatly admired.” “But surely it can’t be good twenty years ago and good in eighty years, but not good now?” “Well, it may be good now. All I mean is that I don’t happen to like it much.” “But is there a difference between liking a thing and thinking it good?”

- Evelyn Waugh, Brideshead Revisited.

While the previous chapters up to this point have examined how music expresses basic emotions and how we understand such expression, this chapter looks at how we respond to it, and how we respond to the music itself. I am going to approach our response to music in the same way I have approached our musical understanding: by making some clearer distinctions between the variety of experiences that can fall under that one umbrella term. I have argued already in earlier chapters that Hanslickian conflations between our understanding of musical expression and our own emotional responses to music have (at least partially) resulted in a bias in the literature towards the educated listener’s musical understanding. My aim now is to demonstrate that once these understanding/response conflations are teased further apart, and the kinds of emotional responses we experience are better understood, then this bias towards the educated, score-based perspective is weakened even further. And it is merely a bias, I will continue to argue, because, in opposition to some of Hanslick’s views, not all of the emotional responses of the musically uneducated reveal a lack of understanding of the music itself. Rather, some kinds of emotional responses actually depend upon it.

This is not in itself a revolutionary view. Even the staunchest formalist acknowledges the possibility of feeling aesthetically “moved” by the beauty of the musical structure when it is revealed by formal study, as Hanslick does (Payzant 1986). But I am going to argue that first, the uneducated can experience versions of aesthetic responses too, and second, that other kinds of responses commonly experienced by uneducated listeners reveal rather than suppress understanding. I will argue, then, that the distinctions between the kinds of responses to music I want to make will also serve to strengthen my defence of the uneducated listener’s musical understanding. The kinds of responses to music I will discuss are as follows:

1. Emotional, which split into a) mirroring the emotion expressed in the music and b) emotions felt by association with the music but not necessarily expressed by it;

2. Aesthetic, which involves responding to the work as an artwork (is it “good art?”);

3. Liking or enjoying the work, and

4. Being “moved”, which has been classified as either a particular kind of emotional response and/or as part of an aesthetic response.

I should note first that aesthetic responses (type 2 above) are not necessarily emotional responses; and secondly that some, if not all, of these responses have generally been thought to be some of the components of an overall appreciation of the work99. However, I am deliberately examining them individually in this chapter as part of an argument that they can exist independently, as simple responses. We might like the work, for example, without responding to it aesthetically, or vice versa. My reasons for thinking this have partly to do with a revised conception of appreciation that I will outline in chapter seven, so I will explain more about what I mean then.

But I also take this individualised approach to emphasise my argument that uneducated listeners can experience all of these responses based on the understanding they have of the work. This is not in line with the traditional view. For example, Hanslick argues that emotional responses should have very little to do with the real business of understanding music, as I discussed in chapter one. But I think this suggests that Hanslick is conflating all emotional responses into those described by (1.b) above: emotional responses that are learned by association without having anything to do with the music itself. But looking at point (1) above, I will argue that there is empirical evidence to suggest that emotional responses can either be by association in this way, or they can “mirror” the emotion expressed by the music (that is, we can feel sad when we listen to sad music). Now if (as I argued in chapter five) basic emotions are expressed by music and uneducated listeners are equipped to recognise their expression through modular processes, then I want to argue in this chapter that the mirroring response is also dependent upon the same modular processes as those that enable recognition. This is part of the answer to the “how” question formalists ask of emotivists that I discussed in chapter one. With this “how” question in mind, in what follows I will focus more upon mirrored responses than on responses by association. I will examine some empirical evidence supporting the idea that we mirror the basic emotions expressed in music through emotional contagion; that is, these emotions are recognised and sometimes (but not always) reproduced through modular processes in listeners (Juslin et al 2010, p.622). The point, however, is that on my account, such contagion cannot occur without the listener recognising which basic emotion is being expressed in the first place.

99 See, for example, Davies (2011); Armstrong (2000).

Against Hanslick, then, I will argue that uneducated listeners can respond emotionally to a work with understanding. In a similar vein, I will also assert that uneducated listeners can respond aesthetically to a work, which will involve drawing a careful distinction between the aesthetic appreciation of a work and an aesthetic response to it. While a fully-developed appreciation of a work might require access to extensive background knowledge as is traditionally thought, I will argue that our initial aesthetic response to a work (that is, the listener’s realisation that the work is potentially of aesthetic interest) may not depend on such knowledge. I want to suggest that aesthetic response may be developed into aesthetic appreciation in that traditional sense, but that the response originates in an essentially gestalt-type recognition rather than in a judgement. My ultimate aim, then, is also to defend the ordinary listener’s capacity for aesthetic response just as I have defended their capacity for musical understanding. As I will also argue in chapter seven, I intend to describe and defend every aspect of the uneducated listener’s musical experience. Without such a comprehensive defence, my conception of modular baseline musical understanding would seem to describe only a fraction of uneducated listeners’ experience, which would not by itself explain the value such listeners place on the music in their lives.


My central argument in this chapter, then, is this: it is important to distinguish between the various types of responses to music listed above because conflating them supports the view that the uneducated listener cannot truly understand what they hear. Three of the most common of these conflations, and their effects, will be the focus of this chapter. They are:

1. Conflating kinds of emotional responses (1(a) and 1(b) above) leads to the mistaken formalist view that emotional responses reveal a lack of understanding;

2. Conflating aesthetic response with aesthetic appreciation leads to the mistaken formalist view that aesthetic responses are only possible through formal study; and

3. Conflating being “moved” and aesthetic response leads to the mistaken formalist view that both aesthetic responses and being appropriately moved are dependent upon formal study.

“Liking”, on the other hand, is a response I will describe in this chapter but discuss in detail in chapter seven, where I will argue that there are different kinds of “liking” too. Its relationship to the other kinds of responses is also complex, and dictated by hard-to-quantify factors such as taste. However, I will argue that “liking” does not always accompany aesthetic response, although it is likely to; in reply to the quotation at the head of this chapter, there is a difference between “liking a thing and thinking it good”.

I begin my discussion in section one with an example of an account that not only makes at least one of the above conflations, but also rejects some of the distinctions I draw between kinds of responses. Peter Kivy (1980; 1990) argues that music does not arouse emotions in listeners at all, either by association or by mirroring; rather, it “moves” listeners. Being “moved” on his account turns out to be a special kind of emotional state dependent upon considerable theoretical background100. He therefore rejects point (1) on my list of kinds of responses altogether and makes conflation (3) (above), in much the same way that Hanslick does. He also shares two of Hanslick’s eight basic premises (those I outlined in chapter one), and uses these to defend his view. They are: premise six, that emotions are

100 Kivy’s later views (1993 in 2001, for example) are slightly softer than this in allowing that music may have a “tendency” to make a listener happy, sad, etc (Kivy 2001, p.91; see also Kivy 1999).

individuated by thoughts and beliefs (i.e. a cognitivist view of emotions that excludes basic emotions); and premise eight, that science will never be able to explain musical expression or the aesthetic experience. I will argue that Kivy can provide little defence for this view on the basis of these premises, and that there is no empirical or indeed anecdotal evidence to support his argument that music does not arouse emotions in listeners. I will do this via a discussion based on psychologist Patrik Juslin’s examination of Kivy’s account, in which Juslin and his colleagues argue that there is no empirical evidence to support either aspect of Kivy’s view; that is, there is no evidence to suggest that “aesthetic” emotions exist and, conversely, considerable evidence to suggest that music does arouse particular emotional states in listeners (Juslin et al 2010).

Given this argument, the Juslin et al (2010) discussion of Kivy’s account will feature throughout the three sections of this chapter. I intend the discussion in section one to be a demonstration of how Kivy’s conflations lead to a theory of musical response that is biased towards the educated minority. In section two, in light of my argument that there is little evidence for Kivy’s view against emotional arousal, I will examine how music might arouse emotions in listeners as an alternative view, and discuss emotional contagion as a possible psychological mechanism for the mirroring response. In section three, I will argue against Kivy’s claim that the aesthetic response is a particular emotional state. I will draw a clear distinction between aesthetic response and aesthetic appreciation, and argue that the uneducated listener is capable of aesthetic response and, to a certain degree, aesthetic appreciation.

1. Being moved

I’ll begin with an overview of Kivy’s account (1980; 1990). First of all I’ll examine his argument for his belief that music cannot arouse emotions in listeners. Like Hanslick’s premise six, this turns on the idea that emotions must have a belief or thought at their core, which means they are just not musically conveyable (see also my discussions in chapter one and chapter four). Kivy’s argument is that all emotions must be about something. As an example, he discusses his rage at Uncle Charlie blaming his Aunt Bella for the failure of his (Uncle Charlie’s) business (1990, pp.148-149). It is clear, says Kivy, that his rage has as its object Uncle Charlie’s unreasonable behaviour, which gives his emotion an explanation and direction.

Kivy argues that this is the kind of explanation we need for the emotions evoked by music: a straightforward, commonsense, “Uncle Charlie” explanation. But, he asks, if music (music alone, or pure music) moves him to anger, “who or what am I angry at? At the music? The composer? … Where’s the Uncle Charlie?” (1990, p.49). He argues that it is obvious there is no Uncle Charlie in this case, and that therefore music does not evoke “garden-variety” emotions such as anger or sadness because of the lack of a garden-variety object for them (1990, p.152)101. To account for the many reports against this view – that is, to account for the many reports of music arousing emotions in listeners – he then suggests that listeners are being moved rather than having “garden-variety” emotional experiences.

His argument, then, is this:

1. Any emotion aroused by music must be objectless;

2. Garden-variety emotions must have an object;

3. Therefore music cannot arouse garden-variety emotions; instead, it moves listeners.

On Kivy’s view, our experience as listeners of music’s expressiveness arousing emotions in us is misleading. What is really going on, he says, is that the recognition of properties expressive of, say, sadness in the music moves us, rather than arouses sadness in us. There is the advantage here, he adds, that it is possible for music that is not emotionally expressive to move us, which can be difficult for opposing theories to explain. On Kivy’s view, we are moved by the beauty of the musical structure itself (the sheer beauty of the sound it embodies, or perhaps by the way it was perfectly constructed by the composer) whether or not it is emotionally expressive (1990, pp.158-160). Only in this sense, he says, can the music legitimately be the “object” of our response. Against all reports by the majority of

101 Kivy calls the commonly musically-evoked emotions “garden-variety” emotions (that is, basic emotions like sadness, anger and joy), although he also seems to view non-basic emotions such as hope or love as garden-variety (1990, p.149).

listeners, that is, he says that “a piece of music might move us (in part) because it is expressive of sadness, but it does not move us by making us sad” (1990, p.153).

Kivy’s motivation here is to argue against arousalist-type (or what he calls “emotivist”) theories, which propose in their strongest forms that music is expressive only by virtue of the emotions it arouses in listeners102. His main complaint against such theories is their lack of an explanation as to how music does this (this is the “how” question I have identified throughout), so he is here attempting to circumvent the question by presenting an account that doesn’t need to answer it. To highlight this alleged weakness in arousalist accounts and strengthen his own alternative account, he presents an “obvious, commonsensical” explanation that does away with the idea that music arouses everyday emotions in us at all and focuses upon music moving us instead (1990, p.151). But it is worth noting that the hardline arousalist view that music is expressive only by virtue of such arousal (for example, Matravers’ account (1998)) is making a different claim to the one Kivy is pursuing here. Kivy seems to confuse demolishing the arousal theory with showing that music does not arouse emotions in listeners, as Davies points out (2011, p.54, footnote 9). These are quite different claims, and there is “no inconsistency”, as Davies says, in both holding that the arousal theory is false and that music may nonetheless arouse emotions in listeners. Music may arouse emotions in listeners but be expressive on some other basis altogether - such as resemblance, for instance - as I have already argued in my defence of the contour theory in chapter five.

Leaving this possible confusion aside, Kivy is left with the task of explaining away the empirical evidence against his view supplied by the reports and studies showing that we

102 To clarify, Kivy distinguishes between “cognitivists” and “emotivists” in a different sense to the way I have been using those terms so far. By cognitivist, he means subscribing to the view that we can recognise expressive properties in music without feeling such emotions ourselves (I have used that term, on the other hand, in reference to the view in emotion theory that emotions require a belief or object at their core). Recognising expressive properties and then feeling the emotion, as I (and Davies) have been suggesting can be the case, is not an option on Kivy’s account due to the alleged lack of emotional object. By emotivist, Kivy means a view in which we cannot recognise expressive properties without feeling such emotions ourselves (I have used that term, on the other hand, in reference to the broader view that emotions are not only expressible by music but important to its understanding, as opposed to the formalist views of Hanslick).

actually do experience music as arousing emotions. Here he invokes Hanslick’s premise that science can have no part in explaining our musical and aesthetic experiences (premise eight as identified in chapter one) (Hanslick trans. Payzant 1986, pp.55-56). To this end, Kivy argues that the emotivist account is actually weakened by its tendency to rely upon psychological mechanistic explanations rather than philosophical ones (1990, p.149). He sees this as a weakness because, like Davies, he is sceptical about psychological methodology. However, unlike Davies, his scepticism seems to include science in general. “‘Scientific’ theories,” he observes, “come and they go … [they] range from the wildly false through the uselessly true to the highly controversial” (1990, p.149). Also unlike Davies, Kivy does not defend this view with examples from psychological literature, listing instead a series of past questionable psychological theories of emotion and stating that overall, he finds “something deeply wrong” about the way that such theories are “put in the service of musical aesthetics” (1990, p.149). In addition to this, his argument continues, why bother with psychology when “there is an Uncle Charlie explanation already in place” in his own theory (1990, pp.149-150)? While seemingly dismissive of the entire scientific enterprise on these grounds (an impression enhanced by his liberal use of scare quotes throughout for terms such as “scientific” and “psychology”), he is nevertheless careful to state that he is “not arguing that because we do not yet have a successful psychology and physiology of the emotions … we can never have one” (1990, p.149). Rather, his argument reduces to this: emotions all need a propositional object; we have a perfectly obvious explanation in operation here (i.e. music does not arouse garden-variety emotions; instead, it moves us); therefore there is no need to look to psychology at all (which is, on his view, a “peculiar” thing to want to do) (1990, p.149).

Clearly, Kivy’s stance on the relationship between psychology and philosophy is in line with Hanslick’s and in direct opposition to my own. It is not difficult to see how such an argument might have inspired Juslin et al, as psychologists, to mount a case against Kivy. They point to several psychological studies to argue not only that music evokes “garden-variety” emotions, but also that these are evoked in the majority of emotional responses to music. Moreover, in discussing the different mechanisms by which music can arouse emotions, Juslin et al argue that the experience Kivy refers to as being “moved” cannot be a single emotional state103, as he seems to imply. In my view, the studies they discuss, while subject to some of the flaws in construction I mentioned in chapter five, are nonetheless robust enough to support their argument.

2. The emotional response

What is at stake here, then, is not just a defence of scientific method but also a re-posing of the “how” question: if Kivy is wrong and music does cause emotions in us, then we need to explain how it does this. Juslin and his colleagues first examine Kivy’s claim that music does not arouse “garden-variety” emotions. By “garden-variety”, Kivy seems to mean anything other than the purely aesthetic emotion of “being moved”. Experience suggests otherwise: most listeners would state that they have experienced “garden-variety” emotions being aroused by music, as numerous studies attest (Juslin et al 2010, p.606). This sets up an immediate tension with Kivy’s assertion that his is the commonsense view. It does seem unlikely that there have been reporting errors on such a scale that nearly all listeners are wrong when they think they are sad or angry, when in reality they are merely “moved”104.

Juslin et al also identify that Kivy’s entire argument turns on his Hanslickian insistence that such emotions must have an object, or “belief-opportunity”, and that music cannot provide these. Kivy also seems to think that unless this emotional object can be identified, the emotion itself cannot have been experienced (2001, p.135). Perhaps, Juslin et al suggest, Kivy’s rigid adherence to an appraisal model of emotions is the real problem here, and one that may be addressed by looking at alternative theories for the evocation of musical emotions (2010, p.606).

103 I should note, however, that I agree with Kivy in that we can be “moved” by music. I don’t agree with his assumption that it follows from this that being “moved” is a particular emotional state, or that music does not evoke “garden-variety emotions”.

104 While this may be the case, people can certainly misreport the object or reason for their emotional state more frequently than they might be confused about the state itself. Matravers (1998, p.160) picks up this point about Kivy’s account, stating that “it is by no means necessary that we know why we feel the emotions we do … I might simply hate someone without knowing why”. Kivy (2001, p.135) rejects this, arguing in reply to Matravers that normally we do know, and that any case in which we don’t is pathological (or at least atypical), and therefore irrelevant. Matravers counters with a suggestion that Kivy thinks we must be “justified” in our emotions to be said to have them, and identifying our beliefs about them serves as such justification (1998, p.159; in Kivy 2001 p.136).


However, Juslin et al acknowledge that the psychological studies in the area so far have been inadequate. They note that most theories and most psychological studies have focussed on “the expressive properties in music that allow listeners to perceive emotion in music”, as in Cooke (1959), Juslin (2001) and Langer (1957), but almost none have focussed with any efficacy on emotional reactions to music. If they have focussed on reactions, they say, the results have shown a complete lack of interest in how these reactions are produced (2010, p.606). Like Davies (as we saw in the last chapter), Juslin et al acknowledge that there are problems with how the information is gathered in the first place. “Laboratory experiments do not capture all relevant aspects, and the artificial settings raise concerns about the generalisability of results. Field studies, on the other hand, could jeopardize the validity of causal inferences” (2010, p.607). In this respect too they are in agreement with Davies, in that they recognise that psychologists, by making ill-informed methodological assumptions, are contributing to “confusion and controversy in the field” (Juslin et al 2010, p.606). The best way to gather such data, they say, is through a combination of methods: survey studies, diary studies and experiments (2010, p.608). They also stress that in terms of participants, who takes part is just as important as how many take part, as previous skewed results from samples composed exclusively of university students have shown. It is therefore vital to feature “a listener sample that reflects the full heterogeneity of the population” in order to have anything approaching an accurate picture of emotional prevalence (2010, p.608).

This is promising with regard to the methodological concerns I discussed in chapter five. However, I am also interested in finding out what kinds of emotions are most often reported. I’ve already suggested that there are two kinds of emotional responses to music: emotions mirrored from the music itself and emotions associated with the music, which may have little bearing on the emotions the music is expressing. As I’ve already argued in previous chapters, the emotions music expresses are basic emotions. It would follow, then, that the results should show that the emotions being mirrored are basic emotions too. So where does that leave emotions by association? I would expect, given they are not restricted to basic emotions by the music’s expressive capacities, that emotions felt by association might be non-basic. Is this division supported by the study?


Juslin et al’s results appear to support this predicted division between basic and non-basic responses. To begin with, 84% of reported emotional responses to music were positive in nature, and 8% were reported as “general” (neither positive nor negative affect). The results were as follows: the leader, at over 50% (over 300 participants’ reports), was happy-elated; followed distantly by sad-melancholic, at 80 reports; calm-content, at 60 reports; and nostalgic-longing, at just over 50 reports. There is also a mix featured between basic (happy) and higher cognitive emotions (nostalgia), although it is clear that two basic emotions (joy and sadness) form the vast majority of reported experiences (notable also is that this follows the trend in Sievers et al (2013): “peaceful”, as a non-basic emotion, did not rate highly). In addition, it is clear that the vast majority of emotions experienced are single emotions; 89% of the reports featured “pure” emotions, such as “sad”, rather than mixed emotions like “anger and sadness” (Juslin et al 2009 in Juslin et al 2010, pp.609-610). Most importantly, Kivy’s view that most people experience being “moved” by music is not supported by the survey’s findings: only 10 of the 706 listeners reported “being moved” (2010, p.610)105.

Things are not looking hopeful, then, for Kivy’s assertion that we are moved by the sadness expressed in music rather than saddened by it. But in Kivy’s view, being moved is dependent upon a particular level of musical education, such that the listener is capable of understanding the “masterful” construction of the music. Listeners are moved because they understand and appreciate how the music does its expressing rather than what is being expressed. “I might,” he says, “… be impressed to the extent of being moved by the masterful way an aria is contrived to be expressive of an operatic character’s anger or joy” (1990, p.171). In his defence, then, Kivy may reply that only 10 of the 706 participants were educated enough to be “moved”, as no indication is given in the study as to levels of musical education or competence within the sample group. But Kivy’s overall claim is much stronger than a mere statement that we are only “moved” by music. It is also the claim that music simply does not cause any other everyday emotional state. If it seems to,

105 Juslin et al also note that even these 10 participants may simply have been indicating they were emotionally affected by the music and using the term “moved” because they couldn’t be more precise at the time – many of them were able to specify emotions when given the opportunity later (2010, p.610). This is not the sense of “moved” that Kivy needs.

this is because of the listener’s own associations with the work or some abnormality in their response. “If a piece of music makes someone sad, or frightened, or despairing, or angry”, he says, “you can be sure the reaction is either personal or pathological” (1990, p.170).

I would therefore argue that even if the 10 “moved” participants were adequately educated, Kivy is nonetheless committed either to the view that the reactions of the other 696 participants in the study were abnormal, if not actually pathological; or that their responses were all “personal” in origin; or a mixture of both. That so large a proportion of the participants should be pathological (and such a small proportion be “normal”) certainly seems unlikely. But what might Kivy mean by “personal”? He means that the music itself did not cause the emotion, but some personal association or conditioning did. He is therefore conflating the two kinds of emotional responses on my original list (point 1) in the same way that Hanslick does: he is assuming that only emotions felt by association can be triggered by music, and that since these emotions have nothing to do with the musical structure itself, they can be discarded as irrelevant to the understanding or even appreciation of that music. However, the study has not yet given us reason to reject this view. It is not clear that the emotions expressed in the music heard are causing the emotions reported so far in any case, and this might be a problem for the study. While nothing is said about education levels of the participants, nothing is said about the nature of the musical samples the participants were referring to either (all we are told is that they were all reporting their “most recent” experience of music. But was it classical? Popular? Did it have text? Had it played at their father’s funeral?). This might add a further complaint about poor methodology to the list of flaws in such studies, as clearly the controlled nature of the music heard is just as crucial as the randomised nature of the sample.

But Juslin et al explain that they did allow for this, in that the same participants were also asked what they thought caused their emotion (2010, p.611)106. The categories of responses differentiate between “musical factors” (such as tempo, the excellent performance, the melody); “lyrics” (such as the message of the lyrics, the words used); “memory factors”

106 Bearing in mind, as mentioned in a previous footnote, that we can be mistaken about the causes of our emotional states much more easily than we can be mistaken in identifying them.


(what Kivy calls “personal” causes and what I call “by association” responses, such as memories of the song played at a funeral), and factors such as “pre-existing mood” (2010, p.611). Overwhelmingly, musical factors were reported as the cause of the emotions (45%, as opposed to 10% for lyrics and 24% for memory factors). So while we are still in the dark about the actual music heard, some sort of effort was made to keep “pure” music apart from text or personal association, and the results show that “pure” music was thought to be the cause of the emotions experienced in almost half of the cases.

Given the survey format, this can only be taken as an indication of where future research might lead107. But it does look to me as though the only way to interpret this trend about musical causes indicated by the survey study is to adopt one of the following two positions:

1. Kivy is right about all emotions always requiring an object (meaning that music cannot evoke “garden-variety” emotions). Therefore nearly half of the participants (45%) reported pathological experiences; or

2. Kivy is wrong about all emotions always requiring an object. Therefore music can cause “garden-variety” emotions, and the 45% of participants who reported such musically-evoked emotions are reporting normal experiences.

If we adopt the first statement, then we are going to need a very good explanation of why nearly half of the reports were abnormal (beyond the acknowledged unreliability of self-reporting on causation). It is not immediately obvious where we should even begin to look for one. But if we adopt the second statement, which given the lack of explanation of widespread musical pathology seems to be the more likely of the two, then we are faced with Kivy’s objection that there is no explanation as to how music might cause such emotions (or no answer to the “how” question, as I put it). Juslin et al respond by outlining seven possible answers to the “how” question in seven possible psychological mechanisms that might produce emotional responses to music. I intend to focus only upon the one answer that is the most effective in explaining why basic emotions are most often reported:

107 Juslin et al acknowledge this by stating that all they are trying to establish is the prevalence of musical emotion, and that the information gathered may help to identify “causal factors that need to be addressed in subsequent studies” (2010, p.608).

that of emotional contagion (see Juslin et al 2010 pp.620-626 for the full list). This is because none of the other processes Juslin et al describe relies upon the expressive capacity of the music itself to do the emotional evoking, and Kivy’s major objection was precisely to the idea that sad music makes listeners sad. Contagion provides the strongest challenge to Kivy’s account because it shows how it is possible for us to recognise and respond to emotional expression in music without those responses having emotional objects (Davies 2011, p.51).

Emotional contagion, then, is the process “whereby the listener perceives the emotional expression of the music and then ‘mimics’ this expression internally” (Juslin et al 2010, p.622). It operates in the same modular way as emotional contagion through facial expression or through vocal expression, as supported by recent studies108. We perceive emotionally expressive features in music or faces or voices which we then automatically mimic ourselves such that we experience that same emotion. The important point is this: at its inception point, we do not hold belief states about the perceived emotion that make that perceived emotion an object for our response; there is nothing our response emotion is about or for (Davies 2011, p.48; p.51). Juslin et al state that this is a modular process whereby we reflexively react to a stimulus with the production of the same emotion (possibly, although this is controversial, through what Juslin et al term “so-called ‘mirror neurons’” (2010, p.622))109 without access to our beliefs/knowledge base. However, it should be noted that while contagion is modular, it is not always automatic. Juslin et al point out, as I have already discussed in chapter five, that we may often perceive emotions in music without feeling any musically-induced emotion at all (55–65% of listening

108 See also Hatfield, Cacioppo & Rapson (1994) for facial expression and Neumann and Strack (2000) for vocal expression.

109 Mirror neurons are generally thought to be active in subjects observing others performing particular actions that they know how to do – expressing emotions being one of them. When watching someone else do something familiar, mirror neurons activate the same areas in the brain that activate when the subject is actually doing the same thing themselves. They are therefore thought to be the mechanism facilitating the mimicry. Despite the controversy, there has been some investigation of mirror neurons in therapeutic disciplines such as speech pathology. See Wan, C et al (2010); and Perkins et al (2010).

episodes induce an emotion, but that leaves a significant proportion that don’t)110 (2010, p.632).

There are a few further points that need to be clarified about contagion. First, the emotions mirrored are objectless in their origin in that they are not “about or for the music”; they are also not justifiable in an “Uncle Charlie” sense and are therefore non-rational (Davies 2011, pp.51-52). So while the music is both the perceptual object and the cause of the emotion, listeners do not believe that the music itself is sad or suffering, so it is not the object of the emotion experienced. Second, it is important to be clear about the nature of the contagion here. We are not talking about a situation, say, in which I see that you are afraid, look around for the object of your fear – the traditional example of a charging sabre-toothed tiger, for instance – and then feel afraid myself. Rather, we are talking about a modular process whereby the perception of properties expressive of emotion in music reflexively causes the listener to mirror that emotion themselves, without there being a sense of “aboutness” to that emotion and without their conscious access to that process. Overall, I see the proposed psychological mechanism of contagion as further evidence that music only expresses basic emotions, since they best fit the fast, unconscious model being described. Davies, however, as we saw in the last chapter, has reservations about specifying basic emotions in this context111.

Even given these reservations, it seems that emotional contagion provides a strong challenge to Kivy’s brand of cognitivism. There are two levels to the response to Kivy under discussion: the idea of music-to-listener emotional contagion itself, and the added, more general support of the purported psychological mechanism of emotional contagion discussed by Juslin et al (2010). Both of these combine to interpret and explain the data Juslin et al collected in the survey: contrary to Kivy’s view, most listeners report that the expressive nature of the music causes them to mirror that same emotion themselves. This confronts Kivy’s two-pronged attack. First of all, there is his very Hanslickian view that all

110 Studies supporting this assertion include Juslin et al 2008; Juslin & Laukka 2004 (in Juslin et al 2010, p.632).

111 See also Jenefer Robinson (2005) for some objections to the contagion model (for example, she argues that contagion only occurs subconsciously, when we are not paying attention to the music). Davies counters her objections in 2011, pp.56-58.

emotions must have a propositional object to qualify as an emotion, and that music by definition cannot supply such an object. But establishing that objectless emotional contagion occurs shows that this is not the case: emotions can be transmitted via contagion in a number of contexts (from facial expression, vocal expression, music, the weather, and even the décor of a room, as Davies points out) (2011, p.65). This addresses Kivy’s account at the same level as his faith in Uncle Charlie; that is, it supplies his demand for a commonsense explanation as to how music might evoke objectless emotions in listeners. I think that this explanation accounts for the gathered data much more thoroughly than Kivy’s explanation; as discussed above, Kivy would have to dismiss the majority of experiences as abnormal if not pathological.

But there is a further advantage to the contagion theory over Kivy’s. I think contagion theory is enhanced if we assume that basic emotions are facilitating this contagion. I am therefore committed to the view that there must be emotionally expressive properties in all of the sentient and non-sentient things described above that can trigger the same mirroring response in observers. Juslin et al describe how this might occur in contagion theory. But is this view as convincing as it is convenient? Davies cautions that there may yet be some distance to go in unpacking the actual mechanism of emotional contagion (2011, p.65). He argues that all the psychologists have done so far is merely describe the phenomenon of contagion, rather than tell us anything about how it actually works. To then call this description a “mechanism”, he says, “is to identify emotional contagion with the causal mechanisms that activate it, as psychologists are prone to do” (2011, p.65).

To avoid this error, Davies remains “agnostic” about the mechanisms behind contagion. It is also another factor, I suspect, in his lack of enthusiasm for basic emotions. But I’m not convinced that this is a serious problem. As I have already discussed, modularity allows a similar agnosticism about the triggers for modular responses112. We don’t yet know what these are; they could be present in décors just as they are in music, and the same triggers could service both (this is what I propose with regard to emotional recognition in faces, voices, and human/musical dynamics). It might be argued that simply shifting the agnosticism back one gear in this way gains no advantage over Davies’ more cautious position. I’m not convinced it is a pointless gesture, however; as I have been arguing throughout, adopting basic emotions and modularity enables us to address a whole raft of questions regarding expression (and the understanding of that expression) more thoroughly. Moreover, when added to the kind of account Juslin et al are proposing, the mounting evidence so far for the existence of basic emotions also suggests that we are not making the kind of conflation that Davies fears.

112 The “trigger problem” has also been identified as a weakness for modularity theory: not simply in the sense that the triggers are unidentified, but also in the sense of questioning how the modules can filter through everything that isn’t a trigger. For discussion, see Fodor 2000 and Prinz 2006.

So, given this, where do we look for a mechanism, rather than a description, of emotional contagion that would satisfy Davies? There has also been some experimental work done on the question of mechanism outside of modularity theory that could nonetheless be interpreted as searching for a trigger for an emotional module. While some psychologists have made speculative gestures towards mimicry of facial, vocal, or bodily expressive behaviour (Hatfield, Cacioppo and Rapson 1994), or pheromones (Brennan 2004) acting as the mechanism for contagion between humans, there are, as Davies points out, immediate and obvious problems with transferring these mechanisms (particularly pheromones) to music or house décors (2011, p.62)113. But I think that if we start from the position I am proposing about modularity and basic emotions in music, it might reveal more about where we should be looking. For example, we know that on this view emotional contagion occurs via the expression of basic emotions. We also know that basic emotions, in this kind of consciously inaccessible non-“Uncle Charlie” context, are mainly mediated through modular processing. And we know that modular processing depends upon base-level triggers for fast, inaccessible, informationally encapsulated results (Griffiths 1997; Fodor 1983). So, what we are looking for, then, (if the contour theory is correct) are triggers that the modules pick up regardless of whether they occur in a face, a voice or a musical work; we need, that is, to look for the triggers that the modules are “set up to be set off by” (Dretske 1986; Prinz 2004, p.55). Modules, after all, are not smart enough to distinguish very much at all about the context in which their triggers originate. This is what I think explains Davies’ observation (2011, p.58) that the image of expressiveness is as evocative as the real thing – as long as the right triggers are there, the modules are not able to tell one from the other (see also Pinker 1997). So there may be something that a sad face and a sad piece of music have in common at this most basic of processing levels, and this is what we need to discover.

113 Although, as I argued in the last chapter, Sievers et al (2013) have since supplied some compelling evidence regarding bodily dynamics and music.

However, identifying this trigger is, as Davies says (2011), best left to the scientists (and this has always been the viewpoint within modularity theory more generally). We just don’t know yet what it is about a smile that is contagious, in much the same way as we don’t know exactly why a major key sounds happier than a minor (other than, of course, the level of explanation provided by the idea of emotional contagion and the contour theory respectively). The base level of explanation here could indeed be nothing more than the way the modules are wired (this would of course require further cross-cultural study). At the risk of waving the modular magic wand over the problem of emotional recognition and contagion, perhaps it really is, as Kivy says about his own theory, that simple.

In summary, then, I think that there remain a few ways in which we might find out more about the mechanisms for our responses in searching for an answer to the “how” question, even in the light of Davies’ caution. In this case, for example, the empirical approach showed that Kivy’s insistence that music does not arouse basic emotional states is indefensible; that is, the question still requires an answer. But what of Kivy’s elevation of the aesthetic response to music to a league beyond the grasp of psychological theory – that is, his reliance upon Hanslick’s eighth premise? In the next section, I will examine Juslin et al’s attack on this aspect of Kivy’s account when they raise the question of whether “aesthetic” emotions exist (Juslin et al 2010). They argue that there is no experimental evidence to support the existence of single emotions that are purely “aesthetic”. They also observe that overall aesthetic appreciation is much more complex than simple emotional evocation, thereby demonstrating how often appreciation itself is confused with simply feeling “moved”.


3. The aesthetic response – is it an emotion?

Juslin et al’s first concern is to demolish any idea that the experience of an “aesthetic response” is a single kind of emotional state. The idea of “aesthetic emotions” is more often implied or assumed in literature than it is directly analysed; it is also, as Juslin et al note, one of those everyday concepts that can either be used to define a “special” category of aesthetic emotions, or merely used in reference to any kind of emotion experienced in response to an artwork (Juslin et al 2010, p.634). Aesthetic emotions are also sometimes seen to be “special” not just in the sense of being a separate and somehow superior category of emotional states, but “special” in the sense of being profound and difficult to define – this can extend to a reluctance to even attempt to analyse their nature, as evidenced by Kivy’s and Hanslick’s discomfort at psychology having anything to do with aesthetics in general. Yet this ineffable quality has led to some questioning of aesthetic emotions as well.

For example, Neill (2003, p.424) questions the distinction between “aesthetic” and “garden-variety” emotions, noting that there is something “more than a little mysterious” about the difference between such states. Moreover, he suggests that when pressed, “there appear to be plenty of people for whom introspection reveals no such distinction in feeling” (2003, p.424).

In the face of such disagreement, the obvious approach is to investigate what kind of emotion these purported aesthetic emotions might be and establish how, exactly, they differ from other less “special” kinds of emotions. First of all, there is the idea that an aesthetic emotion is only experienced in response to an artwork; that is, there is something about the artwork that triggers this particular emotional response. But Juslin et al think this is going to mean that the postulated mechanism somehow “knows” “which stimuli are artworks and which are not” (Juslin et al 2010, p.634). They are therefore rejecting any possibility that there is some kind of aesthetic “trigger” setting off the production of a specialist aesthetic emotional response. This would mean, on my view, that such a response is unlikely to involve modular basic emotions. They base their view on what amounts to evolutionary speculation. “Musical emotions”, they say, “are evoked by general mechanisms that did not evolve specifically to respond to music” and as such are not able to tell when something is artistic; moreover, because of this, “many of the seemingly ‘obvious’ differences among musical and non-musical emotions do not hold up when examined closer” (Juslin et al 2010, p.635, footnote). If we accept the evolutionary assumption they are making (that is, that a response to artwork of any kind must be a side-effect of other more immediately advantageous emotional responses, as argued by Pinker (1997)), then this would tend to rule out the idea that aesthetic emotions might be modular in origin.

What we can draw from this is the following: if the aesthetic response involves an emotion, then it is more likely to be an appraisal-based emotional state than a modular one. But given the lack of consensus for the idea that “being moved” is the aesthetic emotion we’ve been looking for, there are only a handful of other non-basic candidates that might fit the description; that is, we might be experiencing these non-basic states and identifying them as aesthetic emotions in that context (i.e. when exposed to artworks). Possible candidates, in my view, might include awe; wonder; or even profound, object-directed joy, although I should note that some of these emotional states, as argued by Griffiths (1997), might also be classed as both basic and non-basic. But the idea that aesthetic emotions might be contextualised non-basic emotions is not supported by the study results either. Juslin et al think that reports of appropriately “aesthetic” non-basic emotional states (they cite “awe” as an example) were, as we saw above, too rarely reported to tally with the number of aesthetic responses actually experienced. This raises considerable doubt about the aesthetic response consisting in a special kind of single emotional state, since such responses are neither basic (in that Juslin et al think there can’t be a trigger) nor non-basic (either in and of themselves or in whatever context they are experienced). So, if there is no clear empirical evidence supporting the suggestion that the aesthetic response is a special kind of emotional state, perhaps the mystery can be explained by taking a different approach. How are we to unpack the second sense of “special” Juslin et al mentioned: the sense of vague profundity surrounding the notion of the aesthetic response?

What requires explanation, in the absence of particular aesthetic emotions, is the sense that an aesthetic response is nonetheless an intense and often emotional experience. This experience has traditionally been characterised in two contrasting ways: as an objective, detached unemotional judgement of aesthetic value; and as a deeply subjective, emotional experience (or emotional enough, at any rate, to have produced the term “aesthetic emotions” in the first place). Any explanation of the aesthetic response is therefore going to have to either encompass this contrast between rational detachment and emotional profundity or explain why one of the characterisations is inaccurate. Juslin et al go for the first option, presenting a triangular model of the aesthetic response that has three causally linked but also independently-realisable corners: emotion, aesthetic response, and “liking”.

“Liking” is here defined as a simple preference for one work over another; it is not understood to comprise an emotional state in itself (2010, p.636, fig. 22.7):

The suggested model is a good one, as it seems to cohere with the experiences of all listeners, whether educated or uneducated. The key to this coherence is that each corner of the model is causally linked in both directions with the others, and yet can be reported on its own. The idea is that this enables it to account for a wider range of possible scenarios, but it also removes any necessary connection between the three corners. For example, emotions, as we have seen, can be evoked by music through contagion or through association without any aesthetic response and without any sense of liking the work (“I hate this work – it’s so cheesy – but it makes me cry every time!”). In the same way, aesthetic judgements, as just outlined, can be made coldly, without emotion and without liking the work (“it is a great artwork but I don’t like it”). A work can also be “liked” without any emotion evoked and without any aesthetic judgement being made, as in the case of a favourite pop song114 (Juslin et al 2010, p.636).

This seems to be a promising model, in that it can encompass a wide range of possibilities.

But once the interrelationships between the three elements are discussed (that is, once we examine how they do interact, as distinct from how they don’t), these possibilities become considerably more complicated. Each element, as I mentioned above, is causally linked in both directions. This can mean, as Juslin et al rather anxiously note, that assigning a high aesthetic value to a work will probably mean that we will be moved by it, and more often than not being moved by a work will mean that we like it; and on the other hand, we might think the work more aesthetically valuable because it evoked an emotional response, or we may “react with an emotion because we happened to come across a piece of music that we like” (Juslin et al 2010, p.637). It becomes clear that the causal relationships between these three elements are difficult to define, to put it mildly. Adding to the confusion is the fact that “we still lack a proper psychological theory of how music is aesthetically evaluated” in the first place (Juslin et al 2010, p.637).

114 Although this latter case, I suspect, might be the most precarious of their statements. Juslin et al state that a pop song is not considered to be “art” and therefore “does not invite an aesthetic attitude”; however, that in itself (even if it were the case, which is by no means established) requires some kind of aesthetic judgement at some level.

In the light of all this complexity, Juslin et al conclude that we need to abandon the idea of “aesthetic” emotions. Rather, we should focus on a better understanding of the way that “general emotion-inducing mechanisms”, like those listed above, can be recruited into our processing of aesthetic experiences stemming from music (Juslin et al 2010, p.637). They stress in a footnote, however, that none of this theorising detracts from the special or unique qualities of our overall musical experience. It just means that there is nothing special or unique about the emotions evoked by those experiences. That quality of special uniqueness, they say, does exist; it is just to be found in “other aspects of the music experience” than “the mechanisms activated or the emotions aroused” (Juslin et al 2010, p.637, footnote; and Juslin & Västfjäll 2008b). So much, then, for Kivy’s account: Juslin et al have, in my view, comprehensively shown that there is no empirical support for Kivy’s argument that music is incapable of arousing emotions or for his argument that there are special “aesthetic” emotions.

However, Juslin et al’s 2010 paper raises some other questions I would now like to explore further. For example, they do not speculate upon the nature of these “other aspects of the music experience” they think are responsible for our overall aesthetic experience. It is, after all, not necessary to the argument at hand, which is entirely concerned with arguing against Kivy by dispensing with the traditional notion of “aesthetic” emotions, and (successfully, in my view) with defending the notion of emotional contagion as being behind most musical emotional responses, as we saw in the last section. But I would argue that it is at this point that the psychological approach (or at least this psychological approach) reaches its limits.

As I discussed in chapter five, sometimes identifying the right questions to investigate requires some philosophical input. Juslin et al’s puzzlement about the aesthetic response is fed by a failure to note that it may be the very “other aspects of the music experience” they identify that have a direct bearing on the way we explain that overall aesthetic experience.

Their puzzlement is also partly a result, I would argue, of a further failure to distinguish between the broad way in which the term “aesthetic response” can be applied to mean full-grown aesthetic appreciation or just the initial aesthetic “frisson” or spark in aesthetic response. Juslin et al might, in other words, be attempting to explain too much: our overall aesthetic appreciation is a much more complex process than our aesthetic response alone.

And this is why a conflation between the two terms, I think, has a direct bearing on what we think uneducated listeners are capable of. These terms need to be clearly distinguished, and we need to establish which of them involves background knowledge and experience and which may not. If we jump straight to full-blown educated appreciation, without examining how this appreciation began, then we are already stacking the cards against the uneducated.

What I mean by all this is the following. Juslin et al are right to place the aesthetic response as one part of the triangle, and they are right to identify the “other aspects” of the musical experience as important. But I think that they are more accurate than they realise in isolating the “aesthetic response” as independently operating from all of the other factors.


This isolation is accurate, I argue, because the response doesn’t access background knowledge other than that attained through cultural immersion, it doesn’t necessarily occur with an emotional state, and it doesn’t necessarily involve “liking”. This is because it is a much simpler state, I want to suggest, than the aesthetic appreciation it may eventually give rise to. This is why I think it is vitally important to distinguish an aesthetic response from an aesthetic appreciation of a work. The first is a reflexive recognition, or flagging, of the presence of potential aesthetic value (i.e. “this object is worthy of attention”115); the second is more of a relationship with the work based on a broader grasp of contextual knowledge and fuelled by an interest in the work (I will address this relationship in the next chapter).

There is a further aspect to this distinction that I want to introduce. I’ve argued already that the aesthetic response is a kind of baseline marker from which appreciation might grow. I now argue that aesthetic appreciation, growing from this baseline, can be experienced by degrees. I want to argue, as above, that uneducated listeners can have an aesthetic response to a work that is, like their understanding, made possible through a combination of hardwiring and cultural immersion. But any subsequent appreciation of that work, as I will detail in chapter seven, is dependent on many other factors as well as aesthetic response, including background knowledge (either through folk or formal pathways, as per Davies’ account) and “liking”. I want to spend some time on this distinction now in the hope that it can add some weight to the overall account of the uneducated listener’s understanding and appreciative experiences that I will be drawing together in this and the next chapter.

115 We may speculate that this recognition might be also modular in the same way that emotional recognition is modular, given that I argue that it does not depend upon beliefs/knowledge and is experienced as a similar kind of gestalt-type realisation. However, this is difficult to defend without experimental support; nor is it the only possible alternative explanation.

The foundation of my argument is the view that we neglect the experiences of the average everyday listener if we cannot account for such listeners’ musical aesthetic experiences as well as their musical understanding. I am arguing, that is, that ordinary listeners have such experiences; this appears to be going against the general formalist view that any kind of aesthetic experience in any degree is dependent upon a theory-based understanding of the work in question. But in the face of the evidence of uneducated experiences of musical understanding, as I have already argued, we likewise cannot rely on the traditional view of understanding depending upon formal education; the same evidence means that we cannot rely upon the traditional idea of the aesthetic response only being possible by virtue of even more extensive formal education. This is because there are plenty of everyday instances in which an uneducated consumer of art has a response to it that can only be described as aesthetic. Moreover, we are used to thinking in this everyday way about such aesthetic responses. It is easy to conceive of someone who hasn’t studied visual art, for example, having an aesthetic response to a painting in a gallery: they might be completely floored by a painting they have never seen before and know little about. Inexperienced consumers of art, that is, can legitimately respond aesthetically to works with an immediate, “it just is” kind of perceptual experience (in the same way that music just “makes sense” or “sounds right” when it is heard with the same level of understanding)116. Nor is there anything too unusual in their being unable to say why they think the work has aesthetic value, apart from gestures towards an ineffable sense of profundity. As I discussed earlier, while they can evidence aspects of their understanding of the work in folk terms, justifying their sense of aesthetic value can quite legitimately be another question.

116 The idea that humans might have evolved innate aesthetic capacities (an idea that would support the argument I am defending here) is not without scientific support. While such support remains speculative, there is strong archaeological evidence that music and visual art are as ancient as humanity itself. I would argue that to dismiss them as either recent cultural developments or inexplicable evolutionary offshoots is to dismiss any role they might have played in human cognitive evolution. The alternative view that the production of art (particularly music) is inherently cultural or somehow a side-effect of other, more obviously advantageous abilities such as language (as argued by Pinker (1997)) has been widely rejected amongst evolutionary psychologists (see, for example, Ball 2010 and Mithen 2005).

The same applies to everyday listeners in music as well as viewers of art. Because of the difficulty in describing these ineffable experiences, the question is one of evidence of aesthetic response; while an understanding listener might describe aspects of their understanding experience, it is not so easy to describe a truly aesthetic experience. Again, I would argue that we are already used to such ineffability being standard, and we tend to look for other indicators that an aesthetic response is present. One indicator in some cases might be the quality of the pleasure an aesthetic response accompanies (even though an aesthetic response need not accompany pleasure or emotion, it often does). The following extract, from George Eliot’s Middlemarch, describes one example in the experience of Caleb Garth, a local farm manager with little formal education but extensive experience of sacred music:

Caleb was very fond of music, and when he could afford it went to hear an oratorio that came within his reach, returning from it with a profound reverence for this mighty structure of tones, which made him sit meditatively, looking on the floor and throwing much unutterable language into his outstretched hands.

Caleb has experienced some kind of response that he is unable to clearly articulate. I think that the quality of the “fondness” he has for the work, however, indicates that what he has experienced is an aesthetic response. This kind of accompanying pleasure, I want to argue, is not simply sensual “enjoyment”, as formalists would be committed to arguing. Caleb clearly does not feel the same way about the oratorio as he does about his favourite ale. He is a “qualified listener” in Davies’ sense, with an evident baseline understanding of the work. There is therefore no reason to characterise his response as anything other than aesthetic. It is the quality of this pleasure he takes in the music, then, that is the defining feature here; it is one marker of an aesthetic response as opposed to “the sort of physiological or psychological tweak that accompanies merely sensuous delight” (Davies 2003, p.203).

Formalists would see the situation very differently. DeBellis’ interpretation of Caleb’s response, for instance, would be based not on his pleasure but on what he knows, which in this case is very little. The result would be a picture of Caleb as a kind of wondering ignorant: someone who is incapable of true understanding or true aesthetic response. But Caleb’s very different experience (as indicated by the quality of his pleasure) does not fit this traditional bill in terms of the truly aesthetic, and leaves the formalists without an explanation for it. Given this formalist failing, I would argue as an alternative explanation that this kind of pleasure, the kind that accompanies Caleb’s reaction to the music, is present because it is derived exclusively from his experiential (rather than theory-based) understanding of the work “from the inside”. Were his experience to lack this understanding, his pleasure would then become mere enjoyment as Davies states above (see also Davies 2003). So while Caleb’s understanding is the kind of folk understanding without formal education I discussed in chapter three, and he would know little about the works he was hearing (and would also be influenced in his response by the religious texts, but I am overlooking this point for now), I want to argue that his aesthetic response to the music is nonetheless of a similar level to his uneducated understanding of it. But, and this is the point I want to stress, his response is legitimately aesthetic in the same way his experiential understanding “from the inside” is legitimate. And as I will argue in chapter seven, his eventual appreciation of the music, which does depend on background knowledge about the work, can be built on this foundation.

I am therefore suggesting that the aesthetic response is something with which our brains are equipped as part of our abilities to understand the music of our own culture. This makes sense in two ways: first, our aesthetic appreciation has to start somewhere. Our experience of sudden revelation during an aesthetic response feels very unlike the slowly-formed judgements by which we arrive at aesthetic appreciation. And while we cannot always rely on evidence from phenomenology exclusively, in the absence of a viable alternative theory (given that formalists must support the deeply flawed theory-dependence of observation thesis, as I argued in chapter two), it seems a good place to start. Moreover, formalists are themselves operating from phenomenological evidence: their account is based upon their experiences of theory and perception being fused. But since this describes only a minority of experiences within an educated minority of listeners, again I would stress that the majority must be accounted for too. Secondly, and more importantly, our ordinary-level understanding of music alone would not account for our overall musical experience without an accompanying aesthetic evaluative ability and the quality of pleasure this carries with it. Our overall experience, that is, simply will not make sense without this aesthetic ability. It is a source of value; it is part of a motivation to return to the work to increase knowledge and appreciation of it, and it forms the basis, as I will later argue, for a relationship with the work – that is, where it rests on the spectrum of our personal tastes, its quality of “uniqueness” and the role it will play in our lives. While it may not be the case that we always experience an aesthetic response to a work, even a work that we “like” or enjoy, we tend to form a more profound relationship with works to which we do respond aesthetically. This relationship, I want to argue, is therefore another form of evidence of an aesthetic response to a work (along with the quality of the pleasure taken in it).

This overall relationship requires more than just the first aesthetic response, however. Our response also needs to encourage us to take an aesthetic interest in the object; we need to respond to the object for itself (Davies 2003, p.199; 2006). This means that our primary interest in the artwork should not be as a means to an end; we are not taking aesthetic interest in a piece of music if we listen to it simply to relax, for instance, any more than we would be taking an aesthetic interest in Rodin’s Thinker if we simply regarded it as a paperweight117. As I discussed in the Caleb Garth example, the aesthetic interest is a significantly different attitude to one we may adopt in perceiving something as casually pleasing. I am therefore going to be using the term “interest” here in two ways: first, to describe the concept of interest in the artwork for itself; and second, to describe a motivating interest for the listener, a reason to return to the work to develop a deeper appreciation of it. This latter sense can be (but is not always) sparked by the initial aesthetic response; the former sense is a necessary component of the initial aesthetic response.

117 However, I agree with Davies (2011, p.132) that listening to music for pleasure is not using music as mere means to an end in the prohibited way; this is because that kind of pleasure is not simply a bodily sensation that can be got from other sources (like eating ice-cream, for example). Rather, the pleasure is inseparable from the music because it has the music as its object; the music is “the unique means to the pleasure”. This is also discussed in Levinson (1996), chapters 1 and 2. I do not rule out, then, a scenario in which music is relaxing partially because of the aesthetic interest we might take in it.

Both senses, however, can be united in the appropriate way if we take into account a suggestion Davies makes when he footnotes that “it may not be fanciful to regard aesthetic appreciation as a species of love” (2003, p.201). Davies is here expanding upon an idea in a 1958 paper by Ruby Meager, in which she discusses the issue of how we are supposed to make meaningful aesthetic evaluations of artworks (which inherently involves acts of comparison to other artworks) if each artwork is aesthetically valuable because it is unique, and only to be judged for itself (Meager 1958, pp.50-51). The solution she proposes turns on the assertion that our relationship with an artwork is like our relationship with a person we love: in this case, Meager suggests, a girl called Rosie (1958, p.69). Rosie is not beloved simply because she has the properties of being witty or intelligent. Lots of other people have those properties and it is not merely those properties alone that prompt our love for her. Being witty, for example, can take a number of forms, but it is Rosie’s particular wit, and the particular way it is combined with her other commendable properties, that is loveable. Rosie is loveable because of the way those properties are united in her; Rosie is loved for herself, not as a means to an end and not because of general loveable properties like “wittiness”.

Davies suggests that this is similar to our eventual relationship with a piece of music for which we also have an aesthetic appreciation. We love music in this same sense of loving it for itself and the particular way that its properties are combined. It seems special and valuable to us. He develops this idea further (in Sukla 2003 and Davies 2011, p.135), arguing that the kind of love we are talking about here is not the all-consuming passion of a lover, but “the kind of love that sustains and fructifies an ongoing, developing relationship”. And, as it does in healthy relationships with other people, the relationship a music lover has with music will develop and contribute to their own identity as a person.

This is exactly the kind of relationship that develops over time as an aesthetic appreciation of the music grows; it becomes an important part of the listener’s life, a part of who they are. And this is not overstating the case at all. “For music lovers”, says Davies, “music is that central” (2011, p.135).

This is also a particularly relevant perspective given the way we can respond to musical emotional expression. We respond not to an inanimate object, but as though we were encountering another person expressing that emotion. This coheres nicely with the proposal of our forming a relationship with the work that is analogous to our relationships with other people. It also matches the way that we tend to enthuse about our favourite works, the way that we view them as unique, and value that uniqueness; we enthuse in a similar way about valued relationships with other people as well (such as, for example, romantic partners or children).


I will argue in the next chapter that forming such relationships, based as they may be on an aesthetic response, begins the process of aesthetic appreciation. But before I move on to the next chapter, I will summarise exactly what I am proposing about the distinction between the aesthetic response and aesthetic appreciation in this one. The response is, as I have argued, a realisation or recognition that does not depend upon education; appreciation, on the other hand, draws on knowledge about the work, understanding of the work, beliefs, and emotional responses, and can grow over hours, days or many years as the listener returns again and again to the work. The concept of appreciation is one with which we are already familiar; it is a relatively traditional view of that concept. But what, given this, of the aesthetic response? If it is a realisation, what is it a realisation of, exactly? And how does aesthetic interest fit into the picture?

The aesthetic response is, quite simply, a realisation that the artwork in question has an as yet undetermined degree of aesthetic value. As previously discussed, once this realisation occurs, the listener/viewer has a number of options open to them. They may find the work of aesthetic interest, which may in turn motivate them to “like” it, return to it, learn more about it, and begin to appreciate it such that they may make an aesthetic evaluation or judgement about it. Or they may find it of no aesthetic interest at all. Or they might respond aesthetically to a work without liking it or finding it of interest. They may also respond emotionally, which is more likely in the face of a strong aesthetic response; or they may not respond in that way at all. And the factor determining which of these routes, among others, may be taken is the as yet unmentioned idea of taste – which works are appealing, or which ones the listener “likes”, as Juslin et al also note in their triangular model (2010, p.636).

Taste is also informed by the listener’s understanding of the work, as is their eventual appreciation of it. And when it all comes together, when the listener aesthetically appreciates the work – meaning that this appreciation is based on solid understanding – the interest they take in it is analogous to the interest they take in a loving relationship with another person. Just as we respond to emotional expression in music as if we were encountering another person, so, I argue (in agreement with Davies and Meager), we begin to love that work and the properties that make it what it is “for itself” in the same way that we do when we form a relationship with another person118. We can also have relationships with music that vary in intensity, just as we do with people.

To summarise this chapter, then, I argued in agreement with Juslin et al (2010) that there is little or no empirical evidence to support Kivy’s view that music does not arouse emotions in listeners, and his view that there are purpose-built “aesthetic emotions” with which we respond to music. I then proposed an alternative account in which music might evoke basic emotions in listeners through emotional contagion, as well as through the more traditional (but indirect) method of association. I then pointed out that there is a distinction between aesthetic response and aesthetic appreciation, and that this distinction is frequently unobserved, to the detriment of uneducated listeners’ musical experiences. I outlined my view of the aesthetic response and the evidence of its existence, noting that the concept appears to be colloquially accepted in any case in application to uneducated listeners. I also argued that aesthetic appreciation might come “in degrees” and is a kind of relationship or friendship with a work analogous to those we form with other people. What now remains for me to provide is an explanation of how we are to understand appreciation itself thus conceived.

In the next chapter, I will construct this explanation. I will adopt an argument by Aristotle about different kinds of friendships (from Nicomachean Ethics, Book viii) and suggest that they are analogous to the different kinds of relationships we might have with music. It’s important to realise, however, that in taking this relationship-based stance on appreciation, I am not abandoning Davies’ emphasis on how such relationships must take place against a backdrop of contextual knowledge about the work. Rather, I will argue that appreciation can be gained “by degrees” by attaining this background knowledge through folk or formal pathways. But given my focus on reasserting ordinary listeners’ experiences of music as genuinely understanding and genuinely capable of aesthetic response, the motivating question for me will be how much contextual or background knowledge about the work is required. In answer to this question, I will tease out a difference between an intuitive listener, an autodidact, and a trained analyst and compare these listeners to Davies’ folk and formally-educated listeners. I will argue that a truly intuitive listener does not feature in Davies’ account and yet can understand and respond aesthetically to music. In my view, such a listener can also form a relationship with a work equal in quality to that formed by a trained analyst. I will argue that our concept of appreciation should be re-weighted to accommodate such relationships as well as background knowledge.

118 This idea also has some basis in John Armstrong’s conception of our intimacy with artworks. Our affection for art, says Armstrong, arises from a combination of fascination (or magnetism) and intimacy, or “being engaged in a personal and private way” with art (2000, p.2).


Chapter Seven: Appreciation

They would bring out a new album and for a few listenings it would leave me cold and confused. Then gradually it would begin to unravel itself in my mind. I would realise that the reason I was confused was that I was listening to something that was simply unlike anything that anybody had done before. “Another Girl”, “Good Day Sunshine”, and the extraordinary “Drive my Car”. These tracks are so familiar now that it takes a special effort of will to remember how alien they seemed at first to me. The Beatles were now not just writing songs, they were inventing the very medium in which they were working...Mozart and Bach and Shakespeare are always with us, but I grew up with the Beatles and I’m not sure what else has affected me as much as that.

- Douglas Adams, The Salmon of Doubt

Most of this thesis so far has been concerned with downscaling our expectations of musical understanding and upscaling the importance of both recognising musical expression and emotional or aesthetic responses to music. I have also argued that boundaries between understanding and appreciation exist, and that conflating the two puts uneducated listeners at a disadvantage, leaving much of their experience unaccounted for. My approach, however, has meant that what used to be at the forefront of traditional understanding/appreciation hybrid models – our background knowledge, gained through either folk or formal channels – has not yet featured in the story. In this final chapter, I will give it a place in a new, more inclusive model of appreciation. This new model, however, will be weighted more heavily towards the earlier stages of perception and understanding than it has been in the past. It will also be re-weighted to accommodate the relationship an uneducated listener might form with a work. I will argue that the quality of such relationships is unsustainable without a significant degree of true appreciation, and that these relationships should therefore feature in a re-conception of appreciation. I will also argue that it is appreciation, not understanding, that “comes in degrees”, and that to argue otherwise excludes the truly intuitive listener.


I will defend my conception of intuitive listeners later in the chapter. For now, my immediate aim is to construct a model of appreciation that better reflects the experiences of the majority of listeners: those who are formally uneducated, whose responses to the music they understand go beyond mere sensual enjoyment, and who are motivated to return to the music, learn more about it, and value it. It is this more complete picture, I want to say, that forms the basis of their appreciation of that music; it is also this complete picture that feeds a relationship with the work which feels, as Davies says, like “a species of love” (2003).

Adding the relationship dimension, I will argue, is crucial because we can now account for the importance of music in such listeners’ lives, whether their relationship with it is based upon personal association, aesthetic concerns, wider appreciation, or all of the above. My point at the end of the last chapter was that there are other factors at play in the formation of our relationship with the work than just the interaction of “liking”, aesthetic response, and emotional response (Juslin et al 2010). These other factors include appreciation, background knowledge, and the character of that relationship; they are mediated by talent and by taste. Overall, my aim is to emphasise the relationship we have with music as an often-overlooked aspect of our experience of it. Underlying this aim, then, will be the need to define the interaction between our relationship with and our appreciation of music. There are still some distinctions to be made.

In the course of re-assembling some of the distinctions I have already made, however, I will extend the above idea that the uneducated listener is capable of varying degrees of knowledge about the music that will impact their response to, appreciation of and relationship with the work. As we have seen, this knowledge encompasses historical context, background about the composer, comparisons with other interpretations of the piece, theoretical or folk-musicological knowledge, and so on (Davies 2001; 2003; 2007 and 2011). What I will question in this chapter is whether this application of knowledge, whether couched in folk or formal musicological terms, is best considered as “understanding” (as Davies and others have done); or is best considered as “appreciation”.

I have argued already that the ordinary listener’s understanding of heard music is a modular process, for which they are equipped both by disposition and culture. As an extension of this account, I will argue that the non-modular application of background knowledge to this modular understanding is more appropriately described as appreciation.

My version of the distinction between understanding and appreciation makes a stronger point about the way most people understand music; it is intended to highlight the way that this untrained understanding has traditionally been downgraded or overlooked (by, for example, DeBellis and Hanslick). It is also a distinction drawn to make a point about the way that most people love music, and to suggest that the uneducated, contrary to past implications, are not only able to understand the music they love, but are also able to assign or recognise a degree of aesthetic value in a work.

However, this distinction between understanding and appreciation is not part of an argument for the view that all musical appreciations are somehow “equal”, or that musical appreciation requires no effort, engagement or ability on the part of the listener. I don’t believe this is the case, and I will examine a statement of Davies’ (in which he rails against those who mistake enjoyment for understanding) to explain why (2003, p.232). In this chapter I will also not be buying into the conventional divide in the Western tradition between pop or rock music and classical music119. This divide is often raised in this context as an example of a perceived elitism in classical music. I will argue that we can have different kinds of relationships with both kinds of music, and that relationships with both kinds of music can be of identity-forming significance to the listener. Both kinds of music, too, can be loved for themselves.

What I am claiming here is that the kind of relationship a listener may form with a piece of music is central both to my response to the above elitism problem and to my account in general. To clarify what I mean by this, I will be adapting Aristotle’s account of the three kinds of friendship (or love) from Book viii of Nicomachean Ethics. These are friendships of utility, of pleasure, and “complete” friendships of character120. With a few caveats in place, as I will explain, there are two kinds of relationships traditionally associated with either uneducated listeners or inferior music: friendships for the sake of utility and friendships for the sake of pleasure. Aristotle views both of these as relationships of inferior quality and argues that such friendships are usually short-lived (Nic. Ethics, 1156a19-21). I will argue that these forms of friendship are analogous to the distinction between the aesthetic and the non-aesthetic interest in a work, in that artworks of any kind are not to be loved for their utility, or for the simple “sensual” pleasure they bring, but for themselves as artworks.

119 Most of the debate concerns Western classical “music alone”. But for relatively recent discussions concerning rock music, see Kania (2006) or Gracyk (1996).

120 All references are to the translation by Terence Irwin (1985).

This aesthetic love is analogous to Aristotle’s ideal of the “perfect” friendship of character (or “good”). In these friendships, the two participants love each other only by virtue of their goodness, or character; that is, for themselves (Nic. Ethics, 1156b7-10). This kind of relationship with an artwork is, I will argue, more often than not also associated with a high degree of appreciation of the work.

The analogy with Aristotle’s account also highlights a common confusion between our appreciation of music and our relationships with it. Not only does Aristotle acknowledge that friendships can be had “by degrees”, but he also specifies that it is a mistake to think that this means there is only one kind of friendship: “some people think there is only one species of friendship because friendship allows more and less. But here their confidence rests on an inadequate sign; for things of different species also allow more and less” (Nic. Ethics, 1155b12-15). I will argue that this is the point the traditional analytical view misses: not only does appreciation come in degrees, but it also interacts with the relationship the listener has with the work. Admitting that appreciation comes in degrees does not mean it is the only way to account for our musical experiences and to measure our musical capabilities.

How, then, do appreciation and relationships interact, if this interaction is of such importance to my account? Let’s take a look at how the experience might unfold for the ordinary listener heading towards forming an appreciation of a work. Assuming for the moment that, as asserted, this listener is equipped to understand (without formal training) the work’s expressive and other structural features, now is the time when their knowledge about the work swings into action. Now is the time, that is, when we move away from modular, informationally encapsulated processes and towards “higher-level” processes that draw upon what we know and what we believe about the work. We have also reached the point, then, where our understanding listener might have had a moment of revelatory aesthetic response, which may grow (with the application of background knowledge) into aesthetic interest. Of course, this is not a necessary condition of appreciation on my view; appreciation may not involve aesthetic interest (although it certainly helps if it does).

However, the interest and appreciation levels will also affect the intensity and quality of the relationship this listener will form with the work in the future. Some of these relationships feel like the love we feel for another person; they also become an enduring part of the listener’s identity. And this is why I think my proposed account offers a defence of the ordinary listener’s musical capabilities. My argument is that such listeners might form relationships with a work of equivalent quality to the trained analyst, but from “the inside”: through experience rather than formal study. These relationships are impossible to form without understanding and without a significant degree of appreciation. In the light of this, to say that such listeners cannot understand or even accurately perceive the music that holds such a position in their lives is, I think, simply indefensible.

My account of modular understanding shows how this degree of appreciation and intensity of relationship might be possible for ordinary listeners. It shows how such listeners can perceive and understand the works they hear. It also shows how their background folk musicological knowledge about the work may feed their appreciation of it, and how they may feed their motivation to increase this appreciation with aesthetic response and interest.

My concerns for this chapter, then, are threefold. First, I want to continue to defend the ordinary listener by presenting my account of appreciation “by degrees”, including relationships with musical works. This will turn upon the analogy with Aristotle’s views on friendship. Secondly, I will continue to highlight how the modular base of my account of understanding enables me to do this. Thirdly, and relatedly, I want to ask how much background knowledge enables such listeners to form an acceptable degree of appreciation of the work. I will argue that asking this question is essential to clarify the distinction between understanding and appreciation necessary to the defence of uneducated listeners.

If even the most dedicated analyst cannot answer this question with any accuracy, as I will argue is the case, then we are left with understanding and appreciation both being measured against the same set of ever-shifting goalposts.


To begin my discussion, I will turn for one last time to Davies’ account of appreciation and understanding. His concerns, at least in regard to ordinary listeners, are very similar to my own. He argues that background knowledge is essential to any account of musical understanding and appreciation, but it is the way in which the knowledge is gained that is the point. Formal training, he says, is not a requirement for musical understanding; folk musicology will do just as well. Looking at his account of appreciation in overview will enable me to address the three concerns for this chapter that I outlined above, with particular reference to the third. I will argue that Davies requires too much of the ordinary listener in terms of how much background knowledge they must acquire. With appeal to my account of aesthetic interest, and Aristotle’s view of “character” relationships, I will argue that it might be possible to lower the bar on background knowledge for the ordinary listener without compromising their understanding of the work or their relationship with it. This is not to say, of course, that a love of a work can replace an informed appreciation of it; a high degree of informed appreciation would usually intensify that love. But what I want to argue is that once we take factors such as talent and taste into account, we ought to be aiming at building the best kind of relationship with a work rather than acquiring an ever-expanding set of necessary facts about it. This may turn out to be much the same thing in most cases, but I will argue that the distinction remains particularly pertinent for the ordinary listener. I will briefly appeal to some aspects of Nicholas Cook’s account (1990) to support this position (although there are other aspects of his account with which I disagree).

First, however, is Davies’ account of appreciation.

1. Appreciation

On Davies’ view, appreciation is enabled by the understanding listener taking an aesthetic interest in the work “for itself”. It emerges that the listener must acquire considerable background knowledge in order to take such an interest. This is, importantly, contrary to the traditional Kantian “disinterested” model, in which knowledge is disregarded (and which, on my view, therefore better describes the aesthetic response rather than the aesthetic interest). As Davies explains:

…an aesthetic interest in a musical work for itself is not a form of disinterest requiring the listener to put aside all the knowledge and understanding of music she has gained through past experiences of listening to music. To take an aesthetic interest in a musical work is not usually to regard it as a totally isolated event to be appreciated and approached without reference to anything else. An interest in something for itself does not preclude one’s bringing to one’s experience of that thing a knowledge of the traditions and conventions within and against which it is intended to be understood and appreciated. To take an aesthetic interest in a musical work is to take an interest in it for itself. When the musical work is a symphony, for example, an interest in it for itself would appropriately be an interest in it as a symphony (2003, pp.203-204).

Davies means that the interest in the work “as a symphony” encompasses an awareness of the work’s place against an extensive background knowledge about the symphonic form, its history and development, comparisons with other symphonies, comparisons with other symphonies by other composers in other musical eras, and so on. When this awareness is sufficiently complete, he says, the listener will not only properly appreciate it but will also hear it differently. “An awareness of the traditions and conventions against which the work was written”, he says, “affects the way the work sounds” (2003, p.203, his italics).

While this might describe the reported experiences of more educated listeners, in the face of the theory-dependence of observation debate it is a controversial claim. However, rather than engaging in that debate, Davies refers instead to Higgins’ (1997) frustration with the whole idea of a “neutral” objective perspective on the work that ignores the personal perspective of the listener (see also Davies 2011, p.83). Higgins questions whether such neutrality is possible. She argues that philosophers seem to be working on the assumption that there is a standardised way of hearing the work, just as there is a standardised way of reading its score. But a listener’s own knowledge base and experience will, according to Higgins, inevitably affect how that listener hears the work. For example, it is difficult for professional musicians to listen to pieces without hearing the technical aspects of the playing or singing. It would be equally difficult for a musicologist to hear a piece without imposing an analytical perspective on the unfolding structure of the work. Moreover, I would argue that once this level of education and experience is achieved, trained listeners tend to lose the ability to imagine how a listener without this background might experience the work121. Since Davies talks about different kinds of understanding in exactly this vein (2011, chapter 7), he might as well be talking about kinds of experiences. Performers, composers, listeners and musicologists will have different interests in the work, and this will impact upon both their understanding and their experience of it; it will “sound” different to each of them.

But does this admission that the work sounds different to each of these listener categories challenge the modular account I am defending? Trained analysts, such as my lecturer or DeBellis, often report that their very perception is influenced by theory, as we have seen. If you know more about the music, they argue, you can hear more in it. So in other words, while I have been emphasising the way that the untrained majority of listeners experience music, does the experience of the trained minority undermine the modular concept of understanding by indicating that theory may bleed through to perception after all? Is there a difference between the trained minority’s claim and Higgins’ questioning of aperspectival listening? Do both these questions commit us to relativism, either through the trained minority’s theory-dependent observation or through Higgins’ rejection of objective hearing?

I don’t think that these questions are the problem for my account that they may seem to be.

I have argued that traditional views such as DeBellis’ might lead us to relativism due to their reliance upon the “theory dependence of observation” thesis. However, while Higgins’ argument may emphasise the personal aspect of our experience, it need not commit us to relativism, ontologically speaking: I agree with Davies that the musical features creating these experiences are still objective (2011, p.83). There are still musical features out there in the world, that is, that are being understood, albeit in the production of different listening experiences. Naomi Cumming (1993) also considers the question of relativism, but does so by taking on the theory-dependence of observation argument. She argues that accepting the modularity of input systems (the level I am claiming is the perceptual understanding level) does not preclude the recognition of learning as an influence on how that input is interpreted by the CPU at a higher level122 (what I am arguing is the appreciation level).

121 This was, remember, my analysis lecturer’s reaction to my own experience.

That is, there is nothing to say that the levels of encapsulation at the lower levels of perception, once presented as output, cannot be further interpreted at higher levels123. This interpretation can therefore account for the personal perspectives highlighted by Higgins.

Such interpretation is of course going to be seamless with perception within the experience of the trained listener, which is why, I think, we have ended up with accounts like DeBellis’; these experiences are persuasive. But Cumming’s account is more robust than such traditional views, in that it explains more: it allows for both the experimental evidence of modularity and (to a certain point) the way that background knowledge can be integrated into experience, without invoking the theory-dependence of observation and all of its unpleasant consequences, as I discussed in chapter two.

And as I discussed in chapter three, the way this background knowledge is obtained is, according to Davies, not that important. Folk musicological knowledge about musical structure, he says, is just as effective as a formal education for the purposes of the average listener, and the kind of background knowledge he views as necessary (about symphonies in general, for instance) can be gleaned through extensive listening and the avid reading of CD covers and concert programme notes (Davies 2011, p.94). This is, after all, the kind of knowledge one needs to form a relationship with a work. It is also a similar process to “getting to know” a person with whom you are forming a relationship (although ideally this involves less formal study). This analogy tends to reinforce the idea that we understand many aspects of musical expression in the same way that we understand personal emotional expression. Davies also ties this social aspect into the question of whether a listener’s motivation to improve their appreciation is related to their own self-development.

We also, he says, gain vital information about each other through musical tastes. While neglecting one’s musical development might not be unethical, “nevertheless”, he says, “music lovers do tend to judge each other in terms of their musical tastes. Perhaps this is because these are often so fundamental in their lives that they are integral to each person’s sense of identity and value” (2011, p.95).

122 See also Davies 2011, p.125.

123 I am not, that is, suggesting that perception is “massively” modular (ie, modular all the way up). As I explained in chapter three, I have been assuming a classically Fodorian model, in which modular outputs are analysed by a non-modular CPU. See Fodor (1983).

These points aside, it seems to me that on Davies’ account, the degree of education required for the average listener is perilously close to that of a formal musical education, at least in terms of content. From his perspective, this makes sense; he is, after all, arguing that the understanding of each kind of listener can be equivalent, and hence each understanding must encompass the same material to achieve this equivalence and build it into appreciation. Moreover, this is also a response to those who argue that observation is theory-laden, in that Davies has shown that the uneducated listener can perceive and understand as the educated listener can, without acquiring formal theoretical knowledge.

But let’s now view this from my perspective. As far as musical structure goes (that is, knowing which sounds belong to the music and which don’t; being able to predict and follow the path of the music as it progresses, being able to identify emotional colour etc), I agree it is straightforwardly the case that, as Davies says, “the relevant education is provided by exposure and non-technical commentary”, taken in “by osmosis at a very young age”, much as we “pick up our mother tongue” (2011, p.92). This, as I argued earlier, is what we require to understand the work, and I offered the modular account of how we attain such understanding. But when we reach the stage where I argue that understanding shifts into appreciation, I do not share Davies’ conviction that an equivalent appreciation is always going to mean acquiring exactly the same theoretical content as a trained listener. It is of course perfectly reasonable for the interested, or aesthetically interested listener to want to return to the work to listen to it again, compare various interpretations of it, or even read about it in CD sleeve notes as Davies suggests. But Davies then takes the requirements even further, stating that the interested listener should, in the crucial appreciative act of making comparisons with other interpretations of the same work, “study its score”, gain “some idea of the course of music history”, or even “consider the development and role of musical notations” (2011, pp.93-94). Taking things this far, I would suggest, seems beyond the reach of the average punter. It would be a devoted listener, indeed, who is able to read and “study a score”, especially a full score of a symphony, without formal education. I would also ask: even if they did manage to achieve this skill untutored, how far do they then need to take it to be viewed as a truly appreciative listener? Such a skill would certainly be of benefit if it were applied to a more analytical approach to the music. In which case, I would argue, folk musicology begins to be overtaken by formal musicology; because the material to be acquired is essentially identical, the difference between folk and formal methods of acquisition becomes less significant.

To me, this difference also raises the question of how much background knowledge is enough to qualify the average listener’s understanding as appreciation and/or to develop their aesthetic interest. While Davies’ requirements are extensive, I agree that there is no doubt they would improve and extend an appreciation of a musical work when appreciation is viewed on his terms. What is in doubt, from my perspective, is how comfortably this assertion sits with his claim that formal education, of the kind championed by DeBellis (1995), is not necessary to gain the same level of appreciation as a listener with a PhD in musicology. While I also agree that appreciating music takes effort and engagement, it seems to me that the boundaries of background contextual knowledge tend to expand beyond the reach of the average listener, stopping only when they reach the extent of educated opinion. But even this boundary is not always clearly defined, and this is my worry about Davies’ account in application to untrained listeners. The goalposts of understanding or appreciation, I think, have the potential to continually shift in the experts’ favour, further and further away from the ordinary listener’s untrained capabilities.

Nicholas Cook (1990) makes this point when he quotes (in disbelief) a statement from musicologist Leonard B. Meyer (1967): “…it is possible to read, memorise and perform music that one does not really understand”, to which Cook replies: “…one wants to add: and to compose, listen to, and write about it too!” (1990, p.185).

But before I tackle this problem of the shifting goalposts directly, I want to discuss Davies’ possible motivation for his position. In an earlier work (2003), Davies expresses an understandable frustration at the way that some uneducated listeners operate under the delusion that their enjoyment of music is genuine appreciation, and at the way that this is encouraged by the relativism he sees pervading the education system:


We live in an age where it is regarded as both offensive and false to suggest that there is not a democratic equality among all kinds of music in their artistic value and among all listeners in their understandings of music. It seems also to be widely held that understanding simply comes as a result of one’s giving oneself over to the music (as if there must be something wrong with a work that does not appeal on first hearing). The ideas that there are worthwhile degrees of musical understanding that might be attained only through years of hard work and that there are kinds of music that yield their richest rewards only to listeners prepared to undertake it smack of an intellectual elitism that has become unacceptable, not only in society at large but in the universities. “Anti-democratic” ideas are rejected not just for music, of course, but across the social and political board, but the case for musical “democracy” is especially strong, since almost everyone loves and enjoys some kind of music. Nevertheless, the arguments I have developed above suggest to me that many music lovers mistake the enjoyment they experience for the pleasure that would be afforded by deeper levels of understanding (2003, p.232).

This extract, then, expresses a similar frustration to Hanslick’s, whose “mindless drug addicts” wallow in their emotional responses to the music while thinking themselves musical connoisseurs. Hanslick blames emotions rather than relativism, but the baseline argument is more or less the same: it is unforgiveable to mistake enjoyment or “liking” for appreciation and/or informed understanding. Moreover, Davies argues that the listeners who do so are doing themselves a significant disservice, in that they are missing out on far greater pleasures than they know. Everyone’s understanding of music is not equal, he argues, and should not be viewed as such.

This is a perfectly reasonable view (from Davies, rather than from Hanslick). As is probably evident by now, I don’t approve of any form of relativism either [124]. However, missing from the picture so far is the concept of ability, or talent. It is one thing to deliberately delude oneself that one’s appreciation of music is genuine when there is every indication that it could be improved upon; it is quite another to lack the ability to improve it. Davies of course acknowledges this point as being behind the idea that understanding can “come in degrees”. “Not all listeners”, he says, “are equipped or sufficiently interested to probe the depths of works … the comprehending listener may find that understanding does not always yield appreciation” (2011, p.95). His overall view, then, is that the degree of understanding and eventual appreciation is determined by the education of the listener, their interest and application, and ultimately, their aptitude or talent. And this last point in particular is one with which I agree. Some listeners are simply going to be better at this than others. I think that, in the end, it is a question of aptitude just as much as the mode of education (if not its extent, as I will argue later) that determines the degree of appreciation gained.

[124] As Fodor says, “The thing is: I hate relativism. I hate it more than anything else, excepting, maybe, fibreglass powerboats” (1985, p.5). While I don’t have an opinion on powerboats, I agree with Fodor that a modular theory of mind is one way to guard against relativism, in that it puts forward a convincing case for theory-neutral perception (see Fodor 1985, p.5).

But this is where I think the difficulties arise for Davies’ account, as I stated above. There is a tension between what he thinks we should all learn to legitimately understand/appreciate music and his claim that the untrained listener can understand/appreciate just as much as the musicologist (albeit in “folk” terms). I would suggest that acquiring some of this material is beyond the reach (or at least interest) of the average untrained listener, and that this is at odds with the evidence of appreciation that their relationship with the work reveals. Davies is prescribing an ideal rather than describing what actually goes on. To illustrate my point, consider the differences between the following three listeners: the formally-trained musicologist; a talented and motivated autodidact, who may simply teach themselves more or less the same skills as the musicologist one way or another; and the more “instinctive” or intuitive appreciator, who simply has an excellent culturally-immersed “ear”, good musical memory and extensive musical experience. The autodidact has obtained a similar level of technical expertise to the musicologist through a different route. The intuitive appreciator, on the other hand, will not have score reading or analytical skills and may not have an interest in gaining any. They will, however, combine their talent for listening with basic background knowledge about a work; they can even form opinions about various interpretations of a work. But it is their “ear” that they rely upon to do this, rather than more formal skills like score-reading and analysis. I think that Davies is conflating the autodidact and the intuitive appreciator, in that the informal mode of education is conflated with intuitive “ear-based” talent and cultural immersion. If we want to defend the uneducated, intuitive listener, then we need to look further than whether or not the same theoretical content was acquired through formal or informal means. Of course, Davies would rightly respond that such listeners may simply understand or appreciate to a lesser degree than the autodidact and the trained listener. But I will argue below that the quality of their relationship with music indicates that a higher degree of appreciation is present in such listeners than Davies’ account allows.

My suggestion is this: if we measure understanding and appreciation through the kind of relationship the listener has with the work, rather than through their erudition, then we may find that some of the contextual knowledge traditionally viewed as essential to a “true” appreciation of a work might turn out not to be. Some aspects of Cook’s account (1990) cohere with this suggestion, although his view is rather more extreme. He argues that analysis and listening are two completely different skills. They are so different, in fact, that they comprise two discrete activities: the “musical listening” of the uneducated and the “musicological listening” of the educated (1990, p.152). They also result, according to Cook, in two discrete aesthetic experiences of two different kinds of beauty. While it may seem that the analyst can appreciate a heightened beauty about the work, this is “a musicological rather than a musical beauty” (1990, p.166). Most importantly, musicological listening is seen as actively detrimental to our enjoyment of the work because it leads to the listener becoming so preoccupied with the musical structure that they “cease hearing it as music at all” (1990, p.158).

Cook’s account is similar to mine, in that his aim is to champion the ordinary listener by arguing that whatever understanding music is, it doesn’t consist of musicological understanding alone. Rather, musicological understanding is merely “a partial understanding”, one that remains incomplete without the immediacy of the listening, performing or memorising experience. This is why I emphasise the heard expressive properties of music within my account of musical understanding. However, I don’t agree with Cook’s views on the nature of perception; I think his separation of analytical listening and musical listening goes too far. Every listener needs to listen formally to a certain degree in order to recognise the sounds of the piece as music in the first place, after all. As Ridley (1993) and Kivy (1992) have both pointed out, there is in even the most naïve listener an ability to make basic theoretical judgements such as when the piece begins and ends; which sounds belong to the music and which (like coughs from the audience) do not; and even when tunes reappear. This, they argue, is part of what the experience of listening involves (Davies 1994, pp.333-334). I have argued that this ability is part of what our perceptual modules unconsciously do for us; Cook, on the other hand, seems to be suggesting that it is all theoretical and all consciously applied as we listen. I have argued that this is not the case.

Just as an aside, I don’t agree with Cook’s views on aesthetics either; he equates the “aesthetic response” to music with simply enjoying the work (1990, pp.151-152). This creates entirely the wrong impression of the whole enterprise. In my view, the aesthetic response is not a matter of enjoyment alone because, first, such a view leaves the “it’s a great piece of art but I don’t like it” response unexplained, and second, if it were the case that aesthetic response were simple enjoyment, then our relationship with music would be of a different kind. It would be the opposite of what is traditionally viewed as aesthetic appreciation; we would love the music simply for the pleasure it gives rather than for itself, as an artwork.

Cook’s approach, then, does nothing to defend the uneducated or intuitive listener against accusations such as Davies’ and Hanslick’s above: pleasure should not be mistaken for appreciation.

Given these considerations, in the next section I am going to argue further about why the kind of relationship we have with a work is so important in this defence of the uneducated listener. Even given my reservations about Cook’s account, I share his suspicion that, if appreciation is measured via erudition, the game is very much under the control of the trained analysts, who often seem uncertain themselves about what exactly constitutes an adequate store of background knowledge (see Meyer’s quote above, in which he implies that even actual musicians don’t understand music). Whatever this store might be, it is certainly beyond the reach of many intuitive listeners, who, I argue, nonetheless have the ability, talent and understanding to fuel an intensity of relationship with the work that goes beyond mere enjoyment. Arguing, as Davies does, that “everyone loves music”, and that therefore this cannot be a measure of musical appreciation lest we descend into relativism, is to overlook some of the possible complexities of our relationship with music. Of course, he is right to say that there are many listeners who “mindlessly” enjoy music, and love it for that reason alone. But I argue that there are also many cases of uneducated listeners forming a more profound relationship with music: one that requires understanding, aesthetic interest and a significant degree of appreciation. That is, there is more than .

Perhaps, then, my seeking to determine a point at which a listener can attain appreciation by virtue of their musical education alone, without taking such relationships into account, is a futile exercise. For a start, this “appreciation point” would probably vary from listener to listener with experience, aptitude and other such factors. There is something faintly absurd about the question, too, in that its scope is so broad; I am essentially asking exactly what you need to know about music, however you go about gaining that knowledge, to be said to appreciate it. My point, though, is that the current view, in which the musicologists tend to answer this question with: “everything”, is equally broad and uninformative. So perhaps, rather than asking about the extent of our background knowledge, we need to find a different measure of the degree of a listener’s appreciation. I will argue that the question ought to be: can the intuitive listener form a relationship of equivalent quality with the music as the autodidact or the musicologist?

2. Kinds of relationships

There is one important sense in which they can. An intuitive listener can build a relationship with a piece that is central to their very being and identity. This concept of identity-forming relationships is in itself complex (and this complexity is frequently overlooked in discussions like this one) so I’ll take some time to outline what I mean by it now. It can in fact be very difficult to determine whether a particular trait is central to a person’s identity in the required sense (Rorty & Wong 1990, p.20). In pointing out this fact, however, I am not concerned with the traditional questions to do with personal identity such as individuation or identity over time; rather, I am concerned with the more practical everyday questions such as the way a person “fits” into social groups or the way they make sometimes life-altering decisions based on likes or dislikes. These questions, then, reduce to a more self-reflective question, such as: “who am I?” The answer to this question will involve appeals to traits (including relationships, likes and dislikes) that will mean the person feels “radically changed if the trait is lost or strongly modified” (Rorty & Wong 1990, p.20). The difficulty here is that such traits can be apparently superficial, such as liking to wear high heels; or deeper, such as loving a particular art form; and yet both can have significant impact on decisions, choices and beliefs. But I think it is possible to distinguish between these traits without imposing an external value system upon them. We can establish which traits are central to the person’s identity by reference to those that “are the focus of self-evaluation and self-esteem” (Rorty & Wong 1990, p.20), and also by reference to the quality of the trait itself. What I mean is this: while it is in principle conceivable that a particular relationship with high-heeled shoes might form the lynchpin of a person’s identity, this is less likely to be of the same level of profundity as a long-term relationship with a particular piece of music. The difference, as I will explain below, lies in the quality of the relationship itself behind that trait; that is, there are different ways of “liking”.

Returning to intuitive listeners, then, my claim is that they are capable of relationships of such high quality with works that these relationships shape their sense of self. Intuitive listeners’ musical abilities are such that they can form these relationships via an extension of the kind of baseline understanding I mentioned earlier, coupled with their own aesthetic and emotional responses to the work, their talent and their motivation. They can learn about the background to the piece, various interpretations of the work, the historical context of its creation, its comparative position in regard to other works of the same genre, and so on; but this kind of knowledge will not be at the forefront of their experience of the work. They will not approach it from that perspective, in Higgins’ sense (1997). Yet they clearly are not examples of Hanslick’s mindless addicts or Davies’ deluded postmodernists.

This is not a mere case of lazy sensual enjoyment, or it would not hold the intense importance in their lives that it does. And this in itself raises the question: if, on Davies’ account, such listeners are potentially very low on his scale of possible degrees of understanding (at least in comparison to the autodidact and the musicologist), how is it the case that the relationships they build with the music seem to be very high on the scale of personal importance?


The answer to this question is that appreciative relationships with a work, as I explained above, can take many forms. It might not only be a case of what you know about the work, and it certainly isn’t just a case of how you emotionally respond to the work, but it is more a case of how the work contributes to who you are. The level of impact the relationship has on your identity, moreover, is also a measure of the quality of that relationship. What I mean by this is best described by analogy with Aristotle’s account of the three kinds of friendship, or love. While Aristotle specifically stated that these categories cannot be applied to feelings for inanimate objects (or “soulless things”), since part of the definition is that they are mutual feelings, in the case of music I think we can make an exception (Nic. Ethics, 1155b27-32). We respond to music’s emotional expression, after all, in a way similar to that in which we respond to another person’s; there is also a sense in which the relationships we build with it can be as profoundly self-defining as those we share with other people. So while the analogy is not perfect, I intend to tweak it slightly in what follows.

The kinds of friendship Aristotle defines are friendships of utility; of pleasure; and “complete” friendships of character, or “good people” (Nic. Ethics, 1156a-1156b). Friendships of utility involve loving the other person only in virtue of their benefit to you; in musical terms, you might love a piece of music because it helps you stay energised at the gym or soothes you to sleep at night. Friendships of pleasure are, as the name suggests, pleasure-based; in musical terms, this describes loving a work simply for the pleasure it brings you (so this is closest to the experiences cited by Davies and Hanslick). Friendships of character, or the good, are the most interesting for my purposes. These relationships form when you love the other “for the friend’s own sake” because they are of good character (Nic. Ethics, 1156b-11). This is, in Aristotle’s terms, the “perfect friendship”; it is the most enduring kind of friendship, it takes time to form (for “they cannot know each other until they have shared the traditional [peck of] salt”), and it unites all the benefits of the other kinds of friendship in that it also gives pleasure and improves the participants’ own characters over time (Nic. Ethics, 1156b 26-29).

This latter kind of friendship, then, is the one I think best describes the ideal relationship formed by a listener with a piece of music. It is clearly analogous to the aesthetic approach of loving a work “for itself” in that it takes time to form. It also takes effort to form, as there is a long process of “getting to know” the work such that it can be loved in this way. The work’s properties must be both familiar and recognised for their aesthetic value, which in itself normally requires aesthetic response and interest. This friendship can also be part of the listener’s identity, in that it both shapes and improves their character; it tells you more about who you are. This relationship, then, is the early stage of an appreciative relationship with a work. What I now want to ask is how much of this relationship depends upon the traditional view that the “getting to know you” phase involves amassing knowledge to the (unspecified) degree required by the analysts. I argue that it is possible for the intuitive listener, who may not share an equivalent knowledge base, to form an equivalent “character” relationship with the work just as the analyst might.

What is emerging, then, are two senses of the term “appreciation”: the traditional knowledge-based sense and my knowledge-plus-relationship-based sense. But this latter sense is most emphatically not intended as a means of arguing that everyone’s appreciation of music is equal. A musicologist will always know more about analysis and music history than an instinctive listener; a professional performer will always hear technique in the playing; a conductor will always hear imbalances between the parts or unsteady tempi, for example. Each appreciative relationship will therefore deepen in each direction according to interest. But my point is that while they might know more about the music than the average punter, all of these listeners are also guided in their perspectives by their talents and experiences in the formation of their “character” relationship with the work. And this doesn’t mean any listener can claim this kind of character relationship either; such relationships do need to be based on understanding, some contextual knowledge, and, often, an aesthetic response. This is after all what I mean by “appreciation by degrees”. But my argument is that such a relationship formed without, for example, the ability to score-read does not make it a relationship of pleasure or utility. Such ability might deepen a listener’s knowledge of the work, or it might deepen their analytic interest. But it won’t necessarily change the intensity of the character relationship they have with the work.


Let’s now look more closely at the nature of this relationship. So far I have been talking about it in the usual way in such discussions: that is, as though all of these listeners are listening to the same piece, and as though that piece is classical “music alone” (say, a symphony by Beethoven). But some of the most intense “love” relationships are formed with pop or rock music, as it marks youthful rites of passage and becomes a significant part of the (largely teenage) listener’s identity. At first glance, I would argue that these would tend to fall not into the character relationship category, but into either “pleasure” or “utility”; Aristotle specifies that these latter two categories tend to be short-lived relationships, because needs and interests might change as the listener ages (Nic. Ethics, 1156a 34-35). I would add to this that such short-lived relationships have a similarly transient impact upon the listener’s identity. We are probably all familiar with the sense of disappointment when we return to a once-beloved piece of music from our youth; we are “just not the same person” anymore. The relationships formed in our teenage years with music (and indeed with some people) can take an intense turn that tends, upon mature reflection in our middle years, to seem incomprehensible or even mildly embarrassing. But there is likewise no question that our relationship with some pop music from our youth does remain with us, does remain dear to us and does become a part of our identity.

Mechanisms like personal association, involving emotional responses such as nostalgia, can be extremely powerful in tipping the balance of our value scale in our relationship with a work. But we can also have gone to the trouble of building a knowledge base about the work, learning to play it ourselves, or learning to see it as an artwork “for itself”. We have gone to the trouble, that is, of forming a character relationship with the work. And the interesting thing about using a pop music example is that it takes the debate out of the domain of the analysts; there is less of an emphasis on scores and more on listening experience. It is also a positive for my account that it can accommodate relationships with music alone and rock/pop music in the one model.

An example of this kind of character relationship is reflected in the Douglas Adams quote heading this chapter (2002, pp.3-6). He talks with some affection about his schooldays when he broke into the school Matron’s office to use her record player to listen to the Beatles’ latest release. When he was given detention for this misdemeanour, he says, “it seemed a small price to pay for what I now realise was art”. Many a young enthusiast has probably said the same about their band of the moment, but Adams, writing in early middle age, actually goes on to highlight exactly those properties that classify his relationship as a character relationship, including a significant degree of appreciation for the music for itself. First, the relationship has endured, and has shaped his identity: “I grew up with the Beatles, and I’m not sure what else has affected me as much as that” (2002, p.6). Second, he learnt more about the music itself and the traditions it belongs to, teaching himself to play “Blackbird” on guitar and noting, via his school choir singing experience, that there were “impossible harmonies and part playing you never heard in pop songs before” (2002, p.4). Third, he acknowledges that his knowledge base has affected the way the songs sound, and that their appeal was not immediate: “…it takes a special effort of will to remember how alien they seemed at first to me” (2002, p.5). And finally, it is worth noting that Adams also views Bach, another one of his heroes, as the “greatest genius who ever walked amongst us” (2002, p.82). He is not, in other words, an untalented listener.

My point is this: Adams is also not a musicologist, yet he has clearly filled the requirements for a profound relationship with these songs. I don’t think this point is diminished by my use of a pop music example, either. Were we to exchange the Beatles for, say, a Beethoven symphony, or even one of Adams’ favourite Brandenburg concertos, I would suggest that the quality of his relationship with those pieces (even though they are much more complex works) would not change significantly. I don’t think that the usual view of Beethoven or Bach works having a greater quantity of more complex structural and contextual properties to recognise, understand and appreciate is necessarily going to change very much about our relationship with them. While a listener like Adams may not have score reading skills, his modular understanding, aesthetic response, aesthetic interest, liking for the work, and knowledge of historical contextual matters are certainly enough to form a character relationship with it. What remains to be clarified, then, is whether his appreciation of these works is of a different quality or degree from his appreciation of the Beatles. I’ve said already that character relationships are not a substitute for informed appreciation, but that informed appreciation feeds these character relationships. I would now argue that character relationships cannot be formed without appreciation, whereas pleasure and utility relationships can. So I would argue that the formation of a character relationship is grounds for assuming at least a significant degree of appreciation in a listener.

To clarify further: I am saying that while we shouldn’t measure appreciation entirely in terms of relationships instead of erudition, we shouldn’t measure it exclusively through erudition either. I think that examples like Adams’ show that ordinary listeners have an appreciation of music without the skills that many analysts would regard as essential, and this evidence needs to be accommodated by any theory of appreciation claiming to champion such listeners. What I now suggest is this: if we measure appreciation in degrees and in terms of relationship quality (instead of Davies’ essentially erudition-based model), then we both avoid the problem of the ever-expanding erudition requirements and offer a theory of appreciation and understanding that can accommodate the ordinary listener’s interests as well as those of the trained expert. I would argue, then, that we are generally better served by adopting my knowledge-plus-relationship-based sense of the term “appreciation” than by adhering to either the traditional one or Davies’.

One last point remains to be made about my argument. It is still conceivable that a trained analyst may have all the background knowledge possible to accumulate about a work, but no feeling of love (of any of the three kinds) for it, as I also discussed in chapter six. “It’s a great work”, they might say, “but I don’t like it”. The work, in other words, has no impact on this analyst’s emotions, life or identity at all. What then? Can they still be said to appreciate it on my account? Yes, of course they can. This scenario of the “cold” analyst might fit more closely with traditional, erudition-based models of appreciation, but while I would lean towards Cook’s attitude to such appreciation (that it might be an appreciation of something other than the music itself), I don’t see that my account, in my anxiety to defend the ordinary listener, excludes such examples. The “cold” analyst has a degree of appreciation for the work, but it grows from a different perspective (in Higgins’ sense) to the appreciation developed by an intuitive listener. Perhaps, then, this is not only an account of appreciation “by degrees”, but also of different kinds of appreciation. Our perspectives on the work as listeners might take inevitably different paths towards a character relationship with it, or even away from any kind of relationship with it. But where such a relationship exists, I want to argue, there is also a significant degree of appreciation.

Finally, a few words in summary before I move on to the conclusion of this thesis. This chapter has covered considerable ground. My overall aim has been to show how an ordinary “intuitive” listener might truly be said to understand and appreciate the music that can be central to their identity. My motivation has lain in observing that much of the traditional discussion of appreciation has been weighted towards an emphasis on higher-level emotional states and beliefs from the perspective of a well-educated listener (be they an autodidact or an actual musicologist). This, I argue, has led to Davies’ account of understanding “by degrees” which, however inadvertently, omits the experiences of the truly intuitive listener. In response, I have developed an alternative approach to appreciation to lend further clarity to the distinction between understanding and appreciation. Just as I placed expression front and centre in earlier chapters, in this one I ask what happens when we incorporate the idea of a particular kind of relationship with music into a concept of appreciation (rather than understanding) “by degrees”. I think that this alternative approach can unpack our complex relationship with music in further and more inclusive detail. This approach can, for example, offer an explanation for the intensity of the relationship that many intuitive listeners experience with the music in their lives, an intensity that I argue would not exist if such listeners merely experienced the kind of sensual enjoyment so abhorred by Hanslick.

So I conclude that if the relationship thread, rather than the education thread, is teased out, it emerges that the formation of a “character” relationship by an intuitive listener, in which music is loved “for itself”, is in fact evidence of a significant degree of appreciation and understanding. These listeners, remember, are after all the majority of listeners (that is, the majority that accounts for the degrees of appreciation up to but not including the educated score-reading analyst). Such character relationships, then, are evidence that they can’t all be listening without legitimate understanding and legitimate appreciation; they can’t all be merely experiencing sensual enjoyment. It is after all very unusual to hold the same identity-forming intensity of character relationship with, say, ice-cream (although I allow it might be theoretically possible amongst ice-cream connoisseurs). However, in the face of all this it is still important to acknowledge that some listeners actually do listen with mere sensual enjoyment, just as some people really like ice-cream. These are, in the Aristotelian sense, relationships of utility or of pleasure. Such relationships tend to be short-lived and shallow (just like the corresponding relationships with people). Not all listeners are equal and not all musical relationships are equal. Everyone’s appreciation of music will, to a certain extent, take a different form, but this does not lead us to the edge of a relativist slippery slope. Some appreciations will be better informed than others, but this fact has as much to do with ability and perspective as it has with motivation and education. But I argue that bringing the quality of our relationship with music into our assessment of appreciation is going to account for many more kinds of appreciative experiences than relying upon background knowledge alone.

Conclusion: Music from the inside

“A sonnet written by a machine will be better appreciated by another machine.” Alan Turing, 1949

I began this thesis with the observation that there is a tension between the traditional view of musical understanding and the quality of the relationships that ordinary untrained listeners develop with the music they love. If, according to the traditional view, none of these listeners truly understand music because they lack the necessary musical training, it seems incongruous that they could nonetheless build passionate and even life-long attachments to it. Untrained listeners comprise, after all, the vast majority of music lovers.

A large proportion of musical experiences are therefore being excluded from the accepted theory of musical understanding, either explicitly, through the argument that such listeners simply do not understand the music they love; or implicitly, through a commitment to the idea that theory completely permeates perception. But there is more than enough evidence to suggest, through a combination of folk musicological descriptions of musical experiences and the “character” relationships that untrained listeners have with music, that such listeners are understanding listeners. And there is evidence, such as the Müller-Lyer illusion, to suggest that our perception is not always influenced by our beliefs to the required degree. So each of these views, both implicit and explicit, seems to me to be mistaken.

I followed this first observation with a second: alongside other major structural properties, some of the most objectively identifiable properties of music to listeners of all levels of training are its emotionally expressive properties. Even if we cannot identify specifics about the emotion being expressed by the music, we can usually identify the broad emotional category (sad, happy, angry and so on). It seems likely, then, that untrained listeners should at least be able to understand this property of the music even if they miss some of the less accessible ones. Yet expressive properties, particularly according to the more extreme traditional views, do not seem to feature heavily in accounts of musical understanding. On these accounts, understanding musical expression therefore plays only a minor role in understanding music, if it plays any role at all.

It is easy to trace the source of the attitude leading to this omission. Eduard Hanslick’s views make it abundantly clear that understanding expression has little to do with our understanding of music in general. It is also clear that his views were at least partly influenced by outdated Victorian ideas about the opposition of mind and body. It may therefore be somewhat tongue-in-cheek to blame him outright, but even if Hanslick was not solely responsible for such attitudes, his are certainly typical of them. This observation raises the following question: if the current traditional view of understanding does not account for the experiences of the untrained listening majority and omits musical expression, would a view of understanding that leads with musical expression be able to account for them more effectively? Early in the thesis, it seemed to me that setting out to unpack musical understanding by means of unpacking our understanding of musical expression might be the way forward, since musical expression tends to be left out of traditional views in spite of its accessibility. Hanslick seems to leave it out because of its accessibility, in fact; anything that is both easily accessible and a source of “sensual” pleasure to listeners has, in his view, nothing to do with the business of understanding music.

But there was one final observation to be made at this early stage. The boundaries between musical understanding and musical appreciation also seem to me to be blurry. This could of course be because there simply is no useful distinction between the two. However, while often the two terms seem to be interchangeable in the literature, one point is widely accepted: it will be impossible to properly appreciate a work without adequately understanding it first. So the exercise becomes one of establishing at which point, if there is one, understanding becomes appreciation, and how this boundary might be defined.

Consolidating all of these observations led to my formulation of the three questions guiding my thesis, as I set out in the introduction:

1. What defines musical understanding?

2. How is musical understanding distinguished from musical appreciation?

3. How does music express emotions?

The approach, as per the above strategy to lead with expression, was therefore to examine 3) first to see if it might shed any light on 1) and 2). In doing so, I sought to ask: how much more might we be able to explain if we adopt my approach? And what would we need to do to shore up the resulting account?

As it turns out, this approach enabled me to adapt certain aspects of two accounts into the foundation of my answers to the three questions. These are Jerry Fodor’s modularity of mind theory (1983) and Paul Griffiths’ arguments for basic emotions (and their modularity) (1997). These theories have been, in their individual ways, controversial. I have briefly noted already a debate between Fodor (2001) and Fiona Cowie (1999) about the nativism inherent in modularity theory. And the way that Griffiths accuses philosophers in emotion theory of “ignoring” scientific evidence, in particular, has produced considerable irritation amongst emotion theorists.

But over the course of this thesis, I have not had to involve myself very deeply in any of the more controversial aspects of any of these accounts in order to assemble my argument. In making the case for basic emotions, I don’t need to take sides over whether or not emotions are a natural kind, I don’t need to find a theory to explain any link between basic and non-basic emotions, and I certainly don’t need to accuse emotion theorists of ignoring anything. I just need to establish that basic emotions are well supported by scientific evidence, and this is relatively unproblematic even given some of the problems with the relevant psychological studies. In modularity theory’s case, first, like Fodor I don’t think that attacks on nativism itself are dangerous until there is a viable alternative (see Fodor 2001). Secondly, the simple fact of the matter is that modularity theory can still underpin theory-neutral perception better than any other theory available - at least, the aspects of modularity theory that I require, which are the ones that are best supported by experimental evidence. For example, I argued in chapter two that informational encapsulation remains the strongest explanation for the recalcitrance of the Müller-Lyer illusion. I also argued that DeBellis’ and Churchland’s arguments are not convincing in defence of the alternative “theory dependence of observation” account, and that they have relativistic consequences. More importantly, the recognition and production of basic emotions can best be explained in terms of informational encapsulation, as supported by the pancultural evidence I discussed. The use of modularity theory in this way does not leave the experiences of trained listeners unaccounted for either. As Cumming (1993) suggests, and as I agreed in chapter seven, the modularity of input systems (i.e. at understanding level, on my account) might not preclude the recognition of learning as an influence at higher, appreciative levels of interpretation.

Bearing all this in mind, let’s now look at my eventual answers to the three questions above to show just how much may be explained by adopting my suggested approach. I stated in the introduction that my fundamental hypothesis is as follows:

Music is expressive of basic emotions only and we recognise this expression through the same modular processes by which we recognise the expression of basic emotions in other humans.

Again, this hypothesis was firmly supported, with a complex defence that I built upon Fodorian modularity and psychological experimental evidence. Davies (2011) and Goldie (2000) question such psychological evidence. Given this, my defence involved one slight concession to Davies: it may indeed be the case that music does not, rather than cannot, express all of the standard basic emotion set. It is certainly the case that (as Davies argues) anger, sadness, joy and (as I argue) fear, are expressible and recognisable in music. But given the elasticity of the basic emotion set (Ekman’s formulation, as we saw, ranges from seven to sixteen possible candidates), I’m not sure that this is such a problem anyway. The musical case may even provide evidence that the basic emotion set actually comprises four rather than seven emotions, since the four easily recognised in music tend to match those emotions most reliably recognised in facial or vocal studies, as I suggested. In any case, Davies’ view that the whole idea of basic emotions being expressed by music at all is a “fantasy” is too harsh, given that he admits that at least some of the basic emotions set clearly are expressed by music.

But these matters aside, let’s look at each of my three answers. They each refer back to the hypothesis above as defended throughout. I’ll begin with the third question:

3. How does music express emotions?

Music expresses basic emotions through our modular recognition of its resemblance to human expressions of emotions. I defended a version of the contour theory and argued that, once we accept the fundamental hypothesis, we can allow that the resemblance may turn on both dynamic and vocal/prosodic cues, as it does in the recognition of basic emotion states in other humans. If both kinds of cues or triggers are allowed in this way, we can account for both long and short term expressive passages in the music (that is, entire phrases or passages seem to require dynamic resemblance; short term expression relies more heavily on vocal resemblances, once we eliminate cultural factors). This resolves the debate between Davies and Kivy on this point and allows for a stronger version of the contour theory.

With this model in place, I also examined what I argue to be the distinction between understanding and appreciation. This distinction turns on the kind of emotions involved in each, and also incorporates the relationship listeners may have with a work, as follows:

1. What defines musical understanding?

Musical understanding operates at the level at which basic emotions operate; that is, through encapsulated modular-level processes. We are equipped at birth with the ability to learn, via immersion in a culture, that culture’s musical lexicon in much the same way that we are equipped with the ability to learn that culture’s language. The idea is that musical understanding is a more complex process than the simple perception of, say, colours and shapes; the modular operations presiding over musical perception qualify as understanding by virtue of the level of relative complexity at which these modules operate. These operations enable us to distinguish features such as when the music starts and stops; which sounds are part of the music and which are extraneous; which sections of the music repeat and whether they are modified; whether or not the music makes sense; and which emotion the music is expressive of, in Kivy’s sense of the term. This is the baseline understanding of which all listeners are capable (talent allowing).

The important point about this is that such understanding is gleaned from music as heard rather than music as scored. This, I think, is my concession to Nicholas Cook’s account (1990). I argued that there are certain response-dependent properties of the music that are essential to its baseline understanding that are not accessible through a score alone. But I disagree with Cook when he argues that scored properties are irrelevant or even detrimental to our experience of music. Rather, I argued that they are part of the answer to the remaining question:

2. How is musical understanding distinguished from musical appreciation?

If understanding is to be defined by its encapsulated processes, then appreciation is defined by its unencapsulated processes; that is, this is the stage encompassing what we know, believe and even feel in response to the music. Score analysis is therefore part of appreciation rather than understanding, in that it can increase background knowledge about the music as scored rather than simply as heard. Such analysis will enable the listener to identify features of the work more easily and can be undertaken in either formal musicological terms or folk musicological terms, as Davies argues. This, then, is where appreciation becomes a matter of degree as a listener’s knowledge of and regard for a work develop.

My major departure from the traditional measures of appreciation, however, lies in my argument that the listener’s relationship with a work be incorporated as evidence of a significant degree of both understanding and appreciation. This relationship must be of a particular kind to qualify: it must be, in the Aristotelian sense, a “character” relationship, a love of the music “for itself” rather than for pleasure or utility. Bringing these relationships into the picture is best viewed as a change in emphasis away from background knowledge and towards relationships; I do not suggest for a moment that relationships can replace knowledge about a work. I argued that this focus upon relationships fits more pieces of the puzzle together, in that such relationships are likely to grow from an aesthetic response and interest in the work, which in turn provides the motivation necessary to return to the work and learn more about it (either formally or informally). It also avoids a problem I identified within the traditional, knowledge-based view of appreciation, in which it seems that the knowledge base required to achieve it is potentially infinitely expandable. But most importantly, this emphasis on character relationships within appreciation (and de-emphasis on acquiring background knowledge) means that the ordinary intuitive listener can appreciate the music they love “for itself”. I want to conclude, that is, that I present a more inclusive account of musical appreciation. Outside of the terms of the traditional erudition-based account, we are no longer obliged to say of the uneducated listening majority that they do not understand or appreciate the music they love.

The idea of the modular recognition of basic emotions can also, it emerged, begin to explain that very love itself. Many uneducated listeners might not proceed very far along the appreciative path as it is defined above. Yet the personal nature of their character relationships with music is still to be explained. We have always had a sense, as Davies observes (2011), that when interacting with music, particularly expressive music, it feels more like interacting with another person than like understanding a word or concept. The modular theory of basic emotion recognition argues that we use the same modular processes in recognising the musical expression of basic emotions as we do in the human expression of basic emotions (at least, the pancultural experimental evidence supports facial and vocal recognition, even if conclusive evidence for musical recognition is yet to be produced). My suggestion, then, is that we feel like we are interacting with another human in recognising musical expression because the recognition process is the same in both cases. This also gives us the starting point for an explanation of the way our relationship with music feels like our relationships with other people. When these relationships with music achieve character status, they may also contribute to our very identities, just as our relationships with people might.

There were also some beneficial side-effects of adopting my account. First, my emphasis on experiential understanding “from the inside” made it possible to accept Ridley’s suggestion that music’s expressive properties might be both response-dependent and structural, even if we do not accept his ontological views behind it. This is an important part of my account. Expressive properties become just another objective structural property of the music, like its more traditional large-scale structural properties. Our recognition and understanding of these expressive properties occurs in the same way as we recognise such properties as when the work begins and ends, for example. This argument, along with Pettit’s “ethocentric” account of response-dependence (1991), also defends against any accusation of circularity in my expression-driven account; recognising expressive properties as structural reinforces the argument that their response-dependence does not diminish their objectivity, as I discussed in chapter three.

Secondly, I suggested that our initial aesthetic response (that is, the realisation that the work may be of aesthetic interest) might not be dependent upon background knowledge. This aesthetic response, as I argued, must be very carefully distinguished from aesthetic appreciation. But having made this distinction, the idea offers an explanation of the more profound experiences of uneducated listeners like Caleb Garth; furthermore, it adds weight to the meaningful nature of the character relationship such listeners might build with the work.

Given all of this, I can conclude that if my account can be further reinforced with experimental evidence, it will be able to offer an explanation of musical understanding and appreciation that can apply to ordinary, intuitive, untrained listeners just as well as to trained ones. This conclusion is therefore a significant advance over the traditional view as I outlined it in the early chapters of this thesis, in which we were obliged to state that untrained listeners not only don’t understand music, but also may not even be able to hear it correctly. At this stage, then, it might be a useful exercise to return to Hanslick’s premises to see just how far the hypothesis I have defended departs from the traditional view, which, as I have argued, turns upon Hanslick’s formulation of the situation. As a reminder, Hanslick’s view can be summarised in the following eight points:

1. What we understand when we understand music is formal structure. This requires education.

2. Neither emotional expression nor emotional arousal has anything to do with this understanding.

3. Emotional expression is an effect of music, not its content; it is something the music does, not part of the music itself.

4. Recognising musical expression cannot be unreflective or reflexive.

5. An aesthetic response must be rational, not reflexive.

6. Emotions are individuated by thoughts and beliefs, not by feeling or bodily states.

7. Given (6), music does not carry enough information to express more than very broad vague emotional states; it does this via its “motion” or dynamics.

8. Science will never be able to explain musical expression or the aesthetic experience.

Almost all of these premises have been challenged by the discussion over the course of this thesis. The only point that has some force remaining is point seven: “music does not carry enough information to express more than very broad vague emotional states”. Hanslick argues that this means music is not expressive of emotions in any real sense at all. I argued that this is in fact not a problem because the “broad emotional states” expressed are basic emotions, and that modularity theory combined with the contour theory provides the beginnings of an explanation as to how music may do this. Moreover, both Sievers et al (2013) and Davies’ version of the contour theory show how “motion” or dynamics can more than adequately express basic emotions. So Hanslick could only argue that this is a weakness because he was working with the wrong kind of emotions; he assumes all emotions are non-basic. I showed that both he (in chapter one) and Kivy (in chapter six) have little basis for this assumption.

Finally, Hanslick’s last premise turns on the view that science will never explain the regard we have for music; it will never explain the profundity of our relationship with it. Our attitude to science has changed so much since he made that point that it now hardly seems relevant. From a physicalist perspective, for example, there is very little room left for any account based on a separation between body and mind, or “soul”, as Hanslick’s is. To this end, I chose the quote from Alan Turing at the head of this conclusion very deliberately. Much, indeed nearly all, of my account depends upon our uneducated intuitive listeners knowing what experiencing emotions “from the inside” feels like. That is part of the emotional recognition process and that is part of what makes our relationship with music seem meaningful. We require experiential knowledge of the emotions to recognise their musical expression. Turing, I think, was making this point about his sonnet-writing machine; this is why he thinks that art made by machines is best appreciated by other machines. To properly appreciate such sonnets, that is, you need to know what being a machine feels like. My view of music “from the inside”, then, includes the view that when we understand emotional expression in music, this is not simply a musical expression of an emotional state; it is also an expression of what it feels like to be human. Turing’s sonnet-writing machine is intended to show that science and art are not opposed, as Hanslick thought they were; it is also an acknowledgement of what there is yet to discover.

References

Adams, Douglas 2002 The Salmon of Doubt London: Pan Macmillan Ltd

Aristotle Nicomachean Ethics Trans. Irwin, Terence (1985) Indianapolis: Hackett Publishing Company

Armstrong, John 2000 The Intimate Philosophy of Art London: Allen Lane

Attenborough, David 2002 The Life of Mammals London: BBC Books

Ball, Philip 2010 The Music Instinct London: The Bodley Head

Bernstein, Leonard 1976 The Unanswered Question Cambridge, MA: Harvard University Press

Bouwsma, O.K. 1950 “The Expression Theory of Art” in M. Black (ed.) Philosophical Analysis Englewood Cliffs, NJ: Prentice-Hall

Brennan, Teresa 2004 The Transmission of Affect Ithaca: Cornell University Press

Budd, Malcolm 1985a Music and the Emotions, London: Routledge & Kegan Paul

- 1985b “Understanding Music” Proceedings of the Aristotelian Society, vol. 59, supplement

- 2008 Aesthetic Essays Oxford: Oxford University Press

Cavell, Stanley (ed.) 1977 Must We Mean What We Say? Cambridge: Cambridge University Press

Chalmers, AF 1976 What is this thing called Science? St Lucia: University of Queensland Press

Churchland, Paul 1979 Scientific Realism and the Plasticity of Mind Cambridge: Cambridge University Press

- 1988a: “Perceptual plasticity and theoretical neutrality” Philosophy of Science vol. 55, pp.167 – 187

- 1988b Matter and Consciousness (revised edition) Cambridge, Mass: MIT Press

Collingwood, R.G. 1938 The Principles of Art Oxford: Oxford University Press.

Cook, Nicholas 1990 Music, Imagination and Culture New York: Oxford University Press

Cooke, Deryck 1959 The Language of Music Oxford: Oxford University Press

Cosmides, Leda & Tooby, John 2000 in Michael Lewis & Jeannette Haviland-Jones (eds.) Handbook of Emotions, 2nd Edition NY: Guilford

Couvalis, George 1997 The Philosophy of Science: Science and Objectivity London: Sage Publications

Cowie, Fiona 1999 What’s Within? Nativism Reconsidered New York: Oxford University Press

Cumming, Naomi 1993 “Music analysis and the perceiver: a perspective from functionalist philosophy” Current Musicology, vol. 54, pp.38-53

Damasio, Antonio 1994 Descartes’ Error New York: Putnam

Darrow, Alice-Ann; Haack, Paul; Kuribayashi, Fumio 1987 “Descriptors and Preferences for Eastern and Western Musics by Japanese and American Nonmusic Majors” Journal of Research in Music Education vol. 35 (4), pp.237-48

Darwin, Charles 1872 The Expression of the Emotions in Man and Animals Chicago: University of Chicago Press (1965 edition)

Davies, Stephen 1980 “The Expression of Emotion in Music” Mind, vol. 89, pp.67-86 (Also in 2003 Themes in the Philosophy of Music Oxford: Oxford University Press)

- 1994 Musical Meaning and Expression Ithaca: Cornell University Press

- 1997 “John Cage’s 4’33”: is it music?” Australasian Journal of Philosophy, vol 75, pp.448-62. Also in 2003 Themes in the Philosophy of Music Oxford: Oxford University Press

- 2001 Musical Works and Performances: A Philosophical Exploration Oxford: Oxford University Press

- 2003 Themes in the Philosophy of Music Oxford: Oxford University Press

- 2006a “Musical Understandings” in Becker, A & Vogel, M (eds.), Vogel, M (trans.) Musikalischer Sinn: Beiträge zu einer Philosophie der Musik Frankfurt: Suhrkamp Verlag (and in Davies 2011)

- 2006b The Philosophy of Art Oxford: Blackwell

- 2010a “Cross Cultural Expressiveness” in Elisabeth Schellekens and Peter Goldie (eds.), Philosophical Aesthetics and Aesthetic Psychology. Oxford: Oxford University Press, 2010

- 2010b “Emotions expressed and aroused by music: philosophical perspectives” in Juslin, Patrik & Sloboda, John (eds.) 2010 Handbook of Music and Emotion: Theory, Research, Applications Oxford: Oxford University Press

- 2011 Musical Understandings & Other Essays on the Philosophy of Music Oxford: Oxford University Press

DeBellis, Mark 1995 Music and Conceptualization Cambridge: Cambridge University Press

- 1999 “The Paradox of Musical Analysis” Journal of Music Theory vol. 43, pp.83-99

- 2003 “Schenkerian Analysis and the Intelligent Listener” Monist, vol. 86, pp.579–607

- 2005 “Conceptual and Nonconceptual Modes of Music Perception” Postgraduate Journal of Aesthetics vol. 2, pp.45-61

Dretske, F 1986 “Misrepresentation” in Bogdan, R (ed.) Belief: Form, Content and Function Oxford: Oxford University Press (pp.17-36)

Ekman, Paul, and Friesen, W.V 1969 “The repertoire of non-verbal behaviour: categories, origins, usage and coding” Semiotica vol. 1, pp.49-98

Ekman, Paul 1972 “Universals and cultural differences in facial expressions of emotion” in Cole, J (ed.) Nebraska Symposium on Motivation Lincoln: University of Nebraska Press

- 1992 “An Argument for Basic Emotions” Cognition & Emotion, vol. 6, pp.169-200

- 1999 “Basic Emotions” in Dalgleish, T and Power, M (eds.) Handbook of Cognition and Emotion Chichester: Wiley, pp.55-60

Faucher, Luc & Tappolet, Christine (eds.) 2006 The Modularity of Emotions Calgary: University of Calgary Press

Fodor, J 1983 The Modularity of Mind Cambridge: MIT Press

- 1984 “Observation Reconsidered” Philosophy of Science vol. 51 Reprinted in 1990: A Theory of Content and Other Essays Cambridge, Mass: MIT Press

- 1985 “Précis of The Modularity of Mind” The Behavioural and Brain Sciences, vol. 8, pp.1-42

- 1988 “Reply to Churchland’s ‘Perceptual Plasticity and Theoretical Neutrality’” Philosophy of Science, vol. 55. Reprinted in 1990: A Theory of Content and Other Essays Cambridge, Mass: MIT Press

- 2000 The Mind Doesn’t Work That Way: The Scope and Limits of Computational Psychology, Cambridge, MA: MIT Press.

- 2001 “Doing without What’s Within: a critique of Fiona Cowie’s critique of nativism” Mind, vol.110, no. 437, pp.99-148

Fritz, T; Jentschke, S; Gosselin N; Sammler, D; Peretz, I; Turner, R; Friederici, AD; Koelsch, S 2009 “Universal recognition of three basic emotions in music” Current Biology vol. 19, pp.573-6

Fritz, T; Sammler, D; Koelsch, S 2006: “How far is music universal? An intercultural comparison” In Baroni, M; Addessi, R C; Costa, M (eds.) 9th International Conference on Music Perception and Cognition (p.88) Bologna: Bononia University Press

Gazzaniga, M.S. & LeDoux, J.E. 1978 The Integrated Mind. New York: Plenum Press.

Gilman, D 1991 “The neurobiology of observation” Philosophy of Science, vol. 58, pp.496 – 502

Goldie, Peter 2000 The Emotions Oxford: Oxford University Press

Goldman, Alan 1992 “The Value of Music” Journal of Aesthetics and Art Criticism, vol. 50, pp.35-44

Goodman, Nelson 1968 Languages of Art Indianapolis: Bobbs-Merrill

Gracyk, Theodore A 1996 Rhythm and Noise: An Aesthetics of Rock Music Durham, NC: Duke University Press

Gregory, AH & Varney N 1996: “Cross-cultural comparisons in the affective response to music” Psychology of Music, vol. 24 pp.47-52

Griffiths, Paul E 1997 What Emotions Really Are: The Problem of Psychological Categories Chicago: The University of Chicago Press

- 2003 “Basic Emotions, Complex Emotions, Machiavellian Emotions” in Philosophy and the Emotions A. Hatzimoysis (ed.), Cambridge CUP pp.39-67

- 2004 “Emotions as Natural Kinds and Normative Kinds” Philosophy of Science vol. 71 (5 Supplement: Proceedings of the 2002 Biennial Meeting of the PSA), pp. 901-911.

Gurney, Edmund 1880 The Power of Sound New York: Basic Books (1966)

Hanslick, Eduard 1891 On the Musically Beautiful, Indianapolis: Hackett Publishing Company (Translation: Payzant, Geoffrey 1986)

Hatfield, Elaine; Cacioppo, John T and Rapson, Richard L 1994 Emotional Contagion New York: Cambridge University Press

Higgins, Kathleen 1997 “Musical Idiosyncrasy and Perspectival Listening” in J. Robinson (ed.) Music and Meaning Ithaca: Cornell University Press, pp.83-102

Hjort, Mette & Laver, Sue (eds.) 1997 Emotion and the Arts Oxford: Oxford University Press

Jackson, Frank 1982 “Epiphenomenal Qualia” Philosophical Quarterly vol. 32, pp.127 - 136

James, William 1884 “What is an emotion?” Mind, vol. 9, pp.188-205

Jones, Karen 2006 “Quick and Smart? Modularity and the Pro-Emotion Consensus” in Faucher & Tappolet (eds.) The Modularity of Emotions Calgary: Calgary University Press

Juslin, Patrik; Liljeström, Simon; Västfjäll, Daniel; Lundqvist, Lars-Olov 2010 “How does music evoke emotions? Exploring the underlying mechanisms” in Juslin, Patrik & Sloboda, John (eds.) 2010 Handbook of Music and Emotion: Theory, Research, Applications Oxford: Oxford University Press (pp.605 – 642)

Juslin Patrik N & Sloboda, John A (eds.) 2001 Music and Emotion: Theory and Research Oxford: Oxford University Press

- 2010 Handbook of Music and Emotion: Theory, Research, Applications Oxford: Oxford University Press

Juslin, Patrik & Västfjäll, Daniel 2008a “Emotional responses to music: the need to consider underlying mechanisms” Behavioral and Brain Sciences, vol. 31, pp.559-75

Juslin, Patrik & Västfjäll, Daniel 2008b “All emotions are not created equal: reaching beyond the traditional disputes” Behavioral and Brain Sciences, vol. 31, pp.600-621

Kania, Andrew 2006 “Making Tracks: The Ontology of Rock Music” The Journal of Aesthetics and Art Criticism vol. 64, no. 4 (Autumn, 2006), pp.401-414

- 2008 “Piece for the end of time: in defence of musical ontology” British Journal of Aesthetics vol. 48, no. 1, pp.65–79

- 2012 “The Philosophy of Music”, The Stanford Encyclopedia of Philosophy (Fall 2012 Edition), Edward N. Zalta (ed.), URL = .

Kivy, Peter 1980 The Corded Shell Princeton: Princeton University Press

- 1989 Sound Sentiment. Philadelphia, PA: Temple University Press

- 1990 Music Alone: Philosophical Reflections on the Purely Musical Experience Ithaca: Cornell University Press

- 1993 “Auditor’s Emotions: Contention, Concession and Compromise” Journal of Aesthetics and Art Criticism vol. 51, pp.1-12

- 1999 “Feeling the Musical Emotions” British Journal of Aesthetics, vol. 39, pp.1–13

- 2001 New Essays on Musical Understanding Oxford: Clarendon Press

- 2002 Introduction to a Philosophy of Music Oxford: Clarendon Press

Koopman, Constantijn; Davies, Stephen 2001 “Musical Meaning in a Broader Perspective” Journal of Aesthetics and Art Criticism, vol. 59, no. 3 (Summer, 2001), pp.261-273

Krumhansl, Carol L 1995 “Music Psychology and Music Theory: Problems and Prospects” Music Theory Spectrum, vol.17, pp.53-80

- 1997 “An Exploratory Study of Musical Emotions and Psychophysiology” Canadian Journal of Experimental Psychology, vol. 51, pp.336-52

Krumhansl, Carol L; Louhivuori, Jukka; Toiviainen, Petri; Järvinen, Topi; Eerola, Tuomas 1999 “Melodic Expectation in Finnish Spiritual Folk Hymns: Convergence of Statistical, Behavioural and Computational Approaches” Music Perception, vol. 17, pp.151-95

Krumhansl, Carol L; Toivanen, Pekka; Eerola, Tuomas; Toiviainen, Petri; Järvinen, Topi; Louhivuori, Jukka 2000 “Cross-cultural Music Cognition: Cognitive Methodology Applied to North Sami Yoiks” Cognition vol. 76, pp.13-58

Langer, Susanne K 1953 Feeling and Form London: Routledge & Kegan Paul.

- 1957 Philosophy in a New Key (3rd edn) Cambridge, MA: Harvard University Press

LeDoux, Joseph 1998 The Emotional Brain London: Orion Books Ltd

Lerdahl, F & Jackendoff, R 1983 A Generative Theory of Tonal Music London: MIT Press

Levinson, Jerrold 1990 “Music and Negative Emotion” in Music, Art and Metaphysics Ithaca: Cornell University Press

- 1996 The Pleasures of Aesthetics: Philosophical Essays Ithaca: Cornell University Press

- 1997 Music in the Moment Ithaca: Cornell University Press

- 2003 “Philosophical aesthetics: an overview”. In Levinson, Jerrold (ed.) Oxford Handbook of Aesthetics (pp.3-24) Oxford: Oxford University Press

Matravers, Derek 1998 Art and Emotion Oxford: Oxford University Press

McCauley, R. N. and Henrich, J. 2006 “Susceptibility to the Müller-Lyer illusion, theory-neutral observation, and the diachronic penetrability of the visual input system” Philosophical Psychology vol. 19, pp.79–101

Meager, Ruby 1958 “The Uniqueness of a Work of Art” Proceedings of the Aristotelian Society, New Series, vol. 59 (1958 - 1959), pp.49-70

Meyer, Leonard B 1956 Emotion and Meaning in Music Chicago: University of Chicago Press

Mithen, Steven 2005 The Singing Neanderthals: The Origins of Music, Language, Mind and Body London: Weidenfeld and Nicolson

Nagel, Thomas 1974 “What Is It Like to Be a Bat?” Philosophical Review, vol. 83, pp.435 – 450

Neill, Alex 2003 “Art and Emotion” in Levinson, Jerrold (ed.) Oxford Handbook of Aesthetics (pp.421-35) Oxford: Oxford University Press

Neill, Alex & Ridley, Aaron (eds.) 2002 Arguing About Art (2nd edition) London: Routledge

Oatley, Keith; Keltner, Dacher and Jenkins, Jennifer 2006 Understanding Emotions (2nd ed.) Oxford: Blackwell Publishing Ltd

Perkins et al 2010 “Mirror neuron dysfunction in autism spectrum disorders” Journal of Clinical Neuroscience vol. 17, pp.1239-1243

Pettit, Philip 1991 “Realism and Response-Dependence” Mind, vol.100, no.4, pp.587-626

Pinker, Steven 1997 How the Mind Works New York: W W Norton

Prinz, Jesse J. 2004 Gut Reactions: A Perceptual Theory of Emotion Oxford: Oxford University Press

- 2006 “Is the mind really modular?” In R. Stainton (ed.), Contemporary Debates in Cognitive Science (pp 22–36). Oxford: Blackwell.

Ratcliffe, Matthew 2006 “Review of Gut Reactions (Jesse Prinz)” Philosophical Books vol. 47 no. 2 April 2006 pp.170–175

Ridley, Aaron 1993 “Bleeding chunks: some remarks about musical understanding” The Journal of Aesthetics and Art Criticism, vol. 51, no. 4 (Autumn, 1993), pp.589-596

- 1995 Music, Value and the Passions Ithaca: Cornell University Press

- 2003 “Against musical ontology” Journal of Philosophy, vol. 100, pp.203-220

- 2004 The Philosophy of Music: Theme and Variations Edinburgh: Edinburgh University Press

Robinson, Jenefer 2005 Deeper than reason: Emotion and its role in literature, music and art Oxford: Clarendon Press

Rorty, Amelie and Wong, David 1990 “Aspects of Identity and Agency” in Flanagan, Owen and Rorty, Amelie (eds.) 1990: Identity, Character and Morality: Essays in Moral Psychology Cambridge, Massachusetts: MIT Press

Russell, J 1994 “Is there Universal Recognition of Emotion from Facial Expression? A Review of Cross-Cultural Studies” Psychological Bulletin, vol. 115, pp.102-41

Scruton, Roger 1997 The Aesthetics of Music Oxford: Oxford University Press

Segall, M., Campbell, D. and Herskovits, M. J. 1966 The Influence of Culture on Visual Perception New York: Bobbs-Merrill

Sell, Aaron; Bryant, Gregory; Cosmides, Leda; Tooby, John; Sznycer, Daniel; von Rueden, Christopher; Krauss, Andre; Gurven, Michael 2010 “Adaptations in humans for assessing physical strength from the voice” Proceedings of the Royal Society B: Biological Sciences, vol. 277, pp.3509-3518

Schenker, H 1925 The Masterwork in Music: A Yearbook Cambridge: Cambridge University Press (1994)

Scherer KR, Banse R, Wallbott HG 2001 “Emotion inferences from vocal expression correlate across languages and cultures” Journal of Cross Cultural Psychology vol. 32, no.1, pp.76–92.

Sievers, Beau; Polansky, Larry; Casey, Michael; Wheatley, Thalia 2013 “Music and movement share a dynamic structure that supports universal expressions of emotion” PNAS 110 (1) pp.70-75; published ahead of print December 17, 2012

Sloboda, John A 1985 The Musical Mind: The Cognitive Psychology of Music Oxford: Clarendon Press

Sukla, A.C (ed.) 2003 Art and Experience Westport: Praeger

Tanner, Michael and Budd, Malcolm 1985 “Understanding Music” Proceedings of the Aristotelian Society, Supplementary Volumes, vol. 59, pp.215 - 248

Thompson, William and Balkwill, Laura-Lee 2010 “Cross-cultural similarities and differences” in Juslin, P & Sloboda, J 2010 Handbook of Music and Emotion: Theory, Research, Applications Oxford: Oxford University Press

Trehub, Sandra E; Unyk, Anna M; Trainor, Laurel J 1993 “Adults Identify Infant-directed Music Across Cultures” Infant Behavior and Development vol. 16, no. 2, pp.193-211

Unyk, Anna M; Trehub, Sandra E; Trainor, Laurel J; Schellenberg, E. Glenn 1992 “Lullabies and Simplicity: A Cross-Cultural Perspective” Psychology of Music, vol. 20, pp.15-28

Walton, Kendall 1970 “Categories of Art” The Philosophical Review vol. 79, no. 3, pp.334-367

- 1988 “What is Abstract About the Art of Music?” Journal of Aesthetics and Art Criticism vol. 46, pp.351-64

- 1994 “Listening with Imagination: Is Music Representational?” Journal of Aesthetics and Art Criticism vol. 52, pp.47–61; also in Robinson, Jenefer (ed.) 1997

Wan, C et al 2010 “From music making to speaking: engaging the mirror neuron system in autism” Brain Research Bulletin vol. 82, pp.161-168

Zajonc, R. B. 1980 “Feeling and thinking: preferences need no inferences” American Psychologist vol. 35, pp.151 – 175

- 1984 “On the primacy of affect” in Scherer, K. R. & Ekman, P (eds.) 1984 Approaches to Emotion Hillsdale, NJ: Erlbaum

Zangwill, Nick 2004 “Against Emotion: Hanslick Was Right About Music”, British Journal of Aesthetics, vol. 44, pp.29–44

- 2007 “Music, Emotion and Metaphor” Journal of Aesthetics and Art Criticism vol. 65, no. 4, pp.391-400

Zemach, Eddy M 2002 “The Role of Meaning in Music” British Journal of Aesthetics vol. 42, pp.169–78

Minerva Access is the Institutional Repository of The University of Melbourne

Author/s: Prakhoff, Belinda Marie

Title: Music from the inside: emotional expression and the understanding listener

Date: 2013

Persistent Link: http://hdl.handle.net/11343/39622

File Description: Music from the Inside: Emotional Expression and the Understanding Listener