
The Experimental of Musical

The LeBlanc Model, Arousal, Complexity, and Prototypicality

A thesis submitted in fulfilment

Of the requirements for the degree of

Master of Music

Finn Pursell

University of NSW

April 2004

Acknowledgements

I would like to thank Dr. Robert Walker who has advised me on the project and offered invaluable criticism. I would also like to thank Dr. Gary McPherson for getting me started and Dr. Christine Logan for her support. I would also like to thank the student body, particularly in the music department, for taking part in the study. Also, special thanks must be given to Christina who helped me collect data. Lastly, I would like to thank my parents for supporting me in the endeavor.

SYNOPSIS

The thesis is a review of some of the published literature on the determinants of music preference, and an exploration of the nature of music preference among a sample of university students in Sydney, Australia. The review initially gives the reader a picture of a cross-section of the literature on music preference, followed by a focus on a single determinant. This is done in order to give the reader a broad perspective as well as an in-depth look at a single factor, not only to investigate one aspect of music preference thoroughly without losing sight of the entire picture, but also to detail the designs used by practising researchers in the field. The empirical research is an exploratory study on the effects of training, age, and gender on patterns of preference for styles of music.

Abstract

Patterns of preference for 12 styles of music were examined, and the effects of gender, age, and training investigated. Subjects responded by indicating their preference on a ten-point rating scale, and information on gender, training, and age was collected. Analysis of variance revealed significant main effects for training and . In addition, the style by training, style by age, and style by age and training interactions were all significant. Music students gave higher preference ratings on average than non-music students overall. Non-music students tended to give higher preference ratings for the pop and rock genres, while music students gave higher ratings for the jazz, blues, folk, and classical genres of music. Older students gave higher ratings than younger students for the seven classical genres, while there was little difference in ratings for the genres of music. The 3-way interaction revealed differences in the way young and old subjects responded to styles of music by training. Results broadly support findings in the mainstream literature regarding the effects of age and training on preference; however, the lack of effect of gender does not correspond to the popular regarding the importance of gender influences.

Table of Contents

ABSTRACT……………………………………………………………………………..i

TABLE OF CONTENTS…………………………………………………………….ii-iii

1. INTRODUCTION………………………………………………………………..1-5

2. THE LEBLANC MODEL………………………………………………..…...... 6-36
    2.1 The Music……………………………………………….….....7-14
        2.11 Physical properties……………………………………….7
        2.12 Complexity………………………………………………7
        2.13 Referential Meaning…………………………………...8-9
        2.14 Performance Quality………………………………….9-11
        2.15 The Media…………………………………………...11-14
    2.2 The Environment…………………………………………....15-16
        2.21 Peer Group, family, educators and authority figures……15
        2.22 Incidental Conditioning………………………...…...15-16
    2.3 Intervening Variables………………………………….….....17-19
    2.4 Relatively Stable Factors of the Individual…………………19-35
        2.41 Personality…………………………………………..19-21
        2.42 Maturation…………………………………………..21-23
        2.43 Gender…………………………………………....…23-25
        2.44 Training……………………………………………...25-27
        2.45 Ethnic Group…………………………………...…...27-29
        2.46 Culture…………………………………………....…29-35
    2.5 Final Stages and Conclusions…………………………….…35-36

3. EXPOSURE, AROUSAL, COMPLEXITY & PROTOTYPICALITY….…...37-111
    3.1 Zajonc’s Mere Exposure Theory………………………….....37-41
    3.2 Berlyne’s Arousal Model and the Wundt Curve………….....41-48
    3.3 The Optimal Complexity Model……………………….……48-51
    3.4 The Concept of Complexity…………………………….…...51-57
    3.5 Assessing the Optimal Complexity Model…………….…..58-104
        3.51 Mull……………………………………………….…….58
        3.52 Hargreaves…………………………………………..58-62
        3.53 Heyduk…………………………………………...... 62-65
        3.54 Hargreaves…………………………………………..66-67
        3.55 Radocy…………………………………………...... 68-69
        3.56 Burk & Gridley……………………………………...70-71
        3.57 Orr & Ohlsson………………………………………71-74
        3.58 Steck & Machotka…………………………………..74-77
        3.59 Vitz………………………………………………….78-87
        3.60 Crozier………………………………………………88-91
        3.61 Smith………………………………………………..91-96
        3.62 North & Hargreaves……………………………….96-104
    3.6 The Prototypicality Models………………………………104-110
    3.7. Conclusions……………………………………………….110-111

4. THE STUDY……………………………………………………………….112-164
    4.1 Introduction………………………………………………..112-118
    4.2 Method…………………………………………………….118-124
    4.3 Analysis & Results……………………………………...... 124-146
    4.4 Discussion…………………………………………………146-154

REFERENCES …………………………………………………………………..155-164

APPENDIX…………………………………………………………………………...i-xii
    A1 Questionnaire…………………………………………………….....i
    A2 Preference scores………………………………………...... ii-vi
    A4 Analysis………………………………………………..……..vii-viii
    A5 Means & S.D…………………………………..…...... ix-xii

CH. 1

Introduction

The problems associated with studying responses to music are numerous and varied, involving issues relating to the subject, object, and context of the listening experience. Preference decisions occur so frequently and in so many contexts that to ask why one option is chosen over another is exceptionally difficult. The reasons for an outcome of a decision may be based on hundreds of factors, some of which contribute to the determination of the aesthetic experience and some that do not, yet nevertheless affect the outcome of the preference decision. Many of these factors interact and exert forces without the conscious realization of the individual. How, then, can we possibly study factors that are involved in shaping our tastes and influencing our preference decisions with regard to music, if there are so many that impinge on the decision-making process and do so in such complex ways? The fact that seemingly identical experiments can give rise to contradictory or ambiguous results attests to the difficulty faced by experimental researchers studying response to music, and the complexity of the phenomenon. In order to begin to study music preference we must first arrive at an adequate definition of the term, narrow down the range of our enquiry, and look for frequently occurring patterns that are readily observable and explainable. Fortunately, an extensive amount of research has already been done and a number of influential theories and models formulated. This gives us a point of departure and a convenient entry point for evaluating the literature, enabling us to better understand the nature of music preference.

The definition of preference that will be adhered to in the present inquiry is one given by Price (1986:151) and accepted by Burnsed (1998: 397). Preference is defined as “an act of choosing, esteeming, or giving advantage to one thing over another”. This is a useful definition as it bypasses the problems inherent in a more complete definition. In this sense it is superficial as it does not attempt to explain any of the underlying motives associated with preferring one object to another. It simply determines that preference for an object occurs if it is chosen more often, or given an advantage, over other objects. Consequently, a shopper at a supermarket will demonstrate a preference for one brand of bread over another simply by choosing it more often than the other brands of bread. Similarly, if a shopper rates one brand of bread higher on the liking end of a ten-point rating scale than another brand, an advantage is given to that brand. It is the preferred brand. This would hold true according to this definition even if circumstances were such that the person choosing the bread was told to pick a brand by parents despite actually liking the taste of another brand better. Preference measures, at best, try to be objective, valid, and reliable, and the issues concerning the reasons behind an apparent preference are problems for the researcher to work out.
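The behavioural character of this definition can be illustrated with a minimal sketch. The function names and the sample data below are hypothetical and are not taken from Price (1986) or from the present study; the sketch only assumes that preference is read off either from how often an option is chosen or from its mean rating, with no reference to the motive behind the response.

```python
from collections import Counter
from statistics import mean


def preferred_by_choice(choices):
    """Return the option that was chosen most often."""
    counts = Counter(choices)
    return counts.most_common(1)[0][0]


def preferred_by_rating(ratings):
    """Return the option with the highest mean rating.

    `ratings` maps each option to a list of scores, e.g. on a ten-point scale.
    """
    return max(ratings, key=lambda option: mean(ratings[option]))


# Hypothetical illustration: a shopper's purchases and ratings of bread brands.
choices = ["brand_a", "brand_b", "brand_a", "brand_a", "brand_c"]
ratings = {"brand_a": [8, 7, 9], "brand_b": [5, 6, 4]}

print(preferred_by_choice(choices))
print(preferred_by_rating(ratings))
```

Under this reading, a brand picked on a parent's instruction still counts as the preferred brand, which is exactly the sense in which the definition is superficial.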

The question of how preference relates to taste can perhaps be best understood by considering one of Even Rune’s definitions quoted by Farnsworth (1969: 97), which states that taste “is the ensemble of preferences shown by an artist in his choice of elements from nature and tradition, for his works of art”. Although this definition is based on the perspective of an artist who demonstrates his tastes through his work, it can take on a larger connotation simply by replacing “artist” with “person”, and removing the last part of the sentence, “for his works of art”. Now we have a generalized definition based on the cumulative outcome of preference decisions. Restriction of the context of this definition of taste is dependent on the modifier that precedes it. Hence, musical taste can be understood along the parameters mentioned but within a restricted field. The concept is eloquently summed up by Farnsworth (1969: 97) who describes musical taste as the “overall attitudinal set one has toward the phenomenon that collectively comprises music”.

Questions concerning music preference revolve around a number of influential models and theories that have drawn experimental inquiries from many different quarters. The field of experimental aesthetics, which can be subsumed under the title of preference theory but which is really an extension of the wider body of traditional aesthetic enquiry, has largely been advanced by the ideas of Gustav Theodor Fechner and Daniel Berlyne. Fechner pioneered the empirical study of the golden section hypothesis, opening up traditional questions regarding the aesthetic properties of art to systematic scientific investigation. Although Fechner found little evidence of the superior aesthetic power of the golden section proportion (roughly 3:5) in and crucifixes, in a seminal study (1876) incorporating rectangles whose sides were in varying proportions to each other, Fechner found that subjects preferred rectangles whose sides fit the proportions of the golden section. However, recent studies have largely rejected the significance of the golden section proportion, primarily as a result of finding equally striking results in favor of preferences for other proportions (Hoge 1997). Despite this, the idea that certain formal qualities of aesthetic objects do seem to play a critical role in determining aesthetic value and preferences has pervaded the experimental aesthetics and music preference literature ever since, and has delivered the age-old debate over the sources of aesthetic power in art into an empirical setting.
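As a brief arithmetic check on the phrase “roughly 3:5”, the following sketch compares the golden section with two simple whole-number ratios. The variable and function names are illustrative only; the single fact relied on is that the golden ratio equals (1 + √5) / 2.

```python
import math

# Golden ratio: long side divided by short side, about 1.618.
GOLDEN_RATIO = (1 + math.sqrt(5)) / 2


def short_to_long(short, long):
    """Express a rectangle's proportion as short side divided by long side."""
    return short / long


golden = 1 / GOLDEN_RATIO           # about 0.618
approx_3_5 = short_to_long(3, 5)    # 0.6
approx_5_8 = short_to_long(5, 8)    # 0.625

for label, value in [("golden", golden), ("3:5", approx_3_5), ("5:8", approx_5_8)]:
    print(f"{label}: {value:.3f} (difference from golden: {abs(value - golden):.3f})")
```

The ratio 3:5 therefore understates the golden section by a little under 0.02, while 5:8 lies somewhat closer.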

Daniel Berlyne (1971) developed a psychobiological theory of preference which explains response to aesthetic objects in terms of arousal potential and hedonic tone. Berlyne suggested that hedonic value is a function of the collative, psychophysical, and ecological properties of the stimulus, and can be described by the Wundt Curve, formulated (by Wilhelm Wundt) to describe the relationship between pleasantness and stimulus intensity. Although certain aspects of Berlyne’s theory have been challenged recently, it is a benchmark theory for explaining aesthetic response and has led to the development of the Optimal Complexity Model, which relates preference to complexity and familiarity in the form of an inverted-U-shaped curve. First articulated by Edward Walker (1980), who termed it the “Hedgehog Theory”, the basic form of the model has been studied extensively by researchers, particularly David Hargreaves (1984), as it applies to explaining patterns of preference for music.
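The inverted-U relationship can be made concrete with a toy function. This is a sketch under stated assumptions, not Berlyne’s or Wundt’s formulation: the quadratic form, the 0 to 1 scale, and the parameter names `optimum` and `width` are all hypothetical, and the symmetric shape is a simplification that the Wundt Curve itself need not share.

```python
def hedonic_value(arousal_potential, optimum=0.5, width=0.5):
    """Toy inverted-U: highest at `optimum`, falling off quadratically.

    Assumed form only. The value is 1.0 at the optimum, 0.0 at a distance
    of `width` from it, and negative (unpleasant) beyond that distance.
    """
    return 1.0 - ((arousal_potential - optimum) / width) ** 2


# Evaluate the curve at eleven evenly spaced levels of arousal potential.
levels = [i / 10 for i in range(11)]
values = [hedonic_value(x) for x in levels]
best = levels[values.index(max(values))]

for x, v in zip(levels, values):
    print(f"arousal potential {x:.1f} -> hedonic value {v:+.2f}")
```

In terms of the Optimal Complexity Model, `optimum` would stand for the level of complexity a given listener finds most pleasing, which the model allows to differ between listeners and to shift with familiarity.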

Another theory is the Mere Exposure Theory proposed by Robert Zajonc (1970), which puts significant emphasis on exposure as a determinant of preference. Although this last theory has largely taken a back seat to the optimal complexity model, it has been given its due merit and makes for an interesting topic of debate, particularly with regard to music appreciation.

Albert LeBlanc’s (1982) model for explaining sources of variation in musical taste focuses on arranging relevant variables impacting on musical preference and taste into a hierarchy. The net result is a useful schematic for understanding where variables can come into play in the context of a single music listening experience. All of these models and theories have been studied by a sizeable number of researchers, resulting in a significant expansion of the detail and scope of preference theory.

Structure of the Thesis

The present enquiry is divided into two sections consisting of the i) literature review and ii) empirical research portion of the thesis. The literature review is structured around the models mentioned in the introduction, which are presented in the following manner:

i) Literature Review

Ch. 2 - Albert LeBlanc’s Interactive Theory of Music Preference-The discussion focuses on the mechanics of the model and inserts reviews of studies in order to capture the nature of the research done on variables presented in the model. These reviews do not pretend to make significant headway in determining the comparative strength and manner of influence of variables, but rather select studies in order to give the reader some sense of the type of literature engaging specific factors represented in the model, and the nuance of findings made regarding their effects on preference.

Ch. 3 - Zajonc’s Mere Exposure Theory-The theory is introduced with a study by Robert Zajonc. Studies by Bradly, Schuckert, Hargreaves, and Cantor are reviewed and implications of results obtained are considered with respect to the Mere Exposure Theory.

- Daniel Berlyne’s Arousal model-Discussion is centered on the concept of the model including explication of the origins and function of the Wundt curve and nature of the collative, psychophysical, and ecological variables and their proposed relationship to hedonic tone.

- The Optimal Complexity Model-The basic mechanics of the model are described. Ideas by Edward Walker regarding the applicability of the model to human behavior in general are touched upon. Following is a review of literature that has sought to test hypotheses regarding the applicability and validity of the Optimal Complexity Model within the context of subjects’ affective response to music. The studies are grouped according to similarity of design. First in the review are studies using naturalistic stimuli presented in lab conditions i) with repetition, followed by those in which ii) repetition is omitted. Next are studies in which iii) artificial stimuli are presented in lab conditions (with repetition present in one case), and iv) naturalistic stimuli are presented in natural settings. This last group represents an attempt to optimize the ecological validity of results.

-Several of the prototypicality models are covered in response to North & Hargreaves’ findings that prototypicality may have been an important influencing factor on determining music preferences in their studies. The literature on prototypicality and music preference is introduced.

ii) Empirical research

Ch. 4 - The empirical research consists of a study incorporating scored data. The outline of the study can be understood in the following manner: The premise of the study is introduced by presenting relevant questions and recapping findings obtained in the mainstream literature. Specific hypotheses are outlined, followed by discussion of the questionnaire, methodology, and results. Results and their implications for the literature are discussed and considerations for the direction and manner of future research are presented. The final section can be viewed as an appendix of sorts, and includes the questionnaire used in the study and condensed data formats.

CH. 2

Albert LeBlanc’s Interactive Theory of Music Preference

LeBlanc’s (1982) Interactive Theory of Music Preference diagrams the structure of a single music preference decision. The model is built as a hierarchy of factors consisting of eight stages representing the interaction of input information and characteristics of the listener. Level eight is the first stage of the listening experience and includes all relevant variables that are associated with the point at which information is coming in. These include factors (see diag. below) pertaining to the stimulus, and those pertaining to the cultural environment in which it is perceived.

Fig. 2.1. LeBlanc’s Model of Sources of Variation in Music Preference

LeBlanc (1999: 73)

The Music

Physical Properties

Physical properties include objective features of the stimulus such as frequency, amplitude, and waveform. Also included under this heading is tempo, which LeBlanc himself has engaged extensively in a series of studies that point to a preference among subjects for comparatively faster tempos over slower ones. Although the earlier studies in the series (LeBlanc 1981, 1983) tended to confound tempo effects with other variables, such as music style and performing medium, two studies provide ample evidence of tempo effects. In the first of these studies, LeBlanc (1983) isolates tempo effects from unwanted interactions by incorporating music selections entirely from within the confines of one musical genre and one performing medium. Only instrumental jazz selections of four varying tempos are used. Results are conclusive. For every increase in tempo an increase in preference is observed, with significant differences in liking between the slowest tempo and all other tempos. LeBlanc’s latest study of tempo (2000-2001) is nearly identical in design but for the addition of a country variable. Jazz instrumental selections of four differing tempos were played to participants from China, Italy, Brazil, South Africa, and the USA. Results indicated that once again tempo was a significant factor, with the faster tempos most preferred.

Complexity

LeBlanc includes complexity as an input variable and adheres to the idea that an optimal level of complexity is preferred by the listener, with optimal levels varying among listeners. Although complexity has been given much attention, particularly in the experimental aesthetics literature, the degree to which, and the manner in which, complexity alone determines music preferences has been a topic of some debate and involves careful consideration of the optimal complexity model. This will be introduced and discussed in chapter three.

Referential Meaning

Referential meaning of a stimulus refers to situations where extra-musical ideas are connected to music, as in the case of a person of Christian faith with little musical training listening to the Messiah. In this case, preference may be based largely on the individual’s identification with the religious connotations of the music rather than the work’s compositional merits. One of the most convincing studies done in recent years concerning the effects of referential meaning in music is a study conducted by Miller & Strongman (2002) on the effects of religious and secular music on preference ratings of religious and non-religious subjects. Two secular pop and two religious pop excerpts (as determined by the nature of the lyrical content of the music), one slow and one fast in each group, were selected and matched in pairs by similarity for use as music stimuli. Subjects responded to music excerpts by scoring the extent to which they agreed on 41 emotional response adjectives using a seven-point Likert-type scale anchored by the terms “strongly agree” (7) and “strongly disagree” (1). Of the 41 adjectives, principal component analysis revealed that underlying dimensions (principal factors) consisted of eight adjectives divided into “negative”, “energetic”, and “awesome” components. The effects of excerpt type (religious/non-religious/fast/slow) on scorings for Pentecostal and non-Pentecostal groups on the three affective dimensions were then studied via an analysis of variance.

Fig. 2.2. Means and s.d. for dimensions by group, music style and tempo.

Miller & Strongman (2002:18)

The important findings were in relation to the interactions of subject group and music group (secular/religious) variables on the “awesome” and “energetic” dimensions. For the awesome dimension, no difference in scoring was observed between secular and religious excerpts on the part of the non-Pentecostal group, while a significant difference was observed on the same dimension for the Pentecostal group, with religious excerpts receiving a higher mean score than secular excerpts. For the energetic dimension the pattern was the same, with little difference in scoring on the part of the non-Pentecostal group, but a higher scoring given to the religious excerpt by the Pentecostal group. This indicates that the religious subjects tended to experience heightened emotional levels in response to the religious excerpt with increases in the intensity of energetic feelings and awe. Interestingly, both subject groups rated the secular excerpts higher on the negative dimension. Nevertheless, the significant interactions do indicate that religious and non-religious groups experience religious excerpts differently, presumably as a result of the religious connotation of the music with which religious subjects identify. The overall result of this, in Miller & Strongman’s terms, is the increased ability of religious music to elicit disassociated and heightened emotional states among members of the church congregation, and in essence, bring the congregation closer to having a “religious” experience. The power of music to affect emotional states as a result of the religious meaning and significance of the music is demonstrated in this study.

Performance Quality

The performance quality variable includes aspects relating to the production of the music by the performer and encompasses both expressive capabilities as well as the performer’s sex appeal and/or charisma, a factor that is clearly prominent in pop culture. It is well documented that physically attractive persons tend to be judged much more favorably than unattractive people, a phenomenon that has been explored in the context of the music performance setting. One notable study by North & Hargreaves (1997) clearly demonstrated the effects of performer attractiveness on subjects’ attitudes towards music. Slides of physically attractive and unattractive persons were paired with the music of artists and presented to subjects, with half the subjects rating the attractiveness of the supposed performer’s photo and half rating their preferences for the music excerpts. In addition, subjects rating the music excerpts were asked to rate the degree to which they perceived the music as sophisticated, intelligent, likely to be popular, innovative, poised, emotionally warm, sensitive, masculine, profound, and possessing of artistic merit. Subjects in the performer-rating group rated the performer on equivalent scales. The following main effects were observed in each group, suggesting that a performer’s attractiveness plays a large part in forming impressions, not only of the performer’s character, but also of certain aspects of the alleged performer’s music.

Fig. 2.3. Output for the analysis of variance

North & Hargreaves (1997: 80)

Performer rating results indicate that the majority of positive attributes were perceived as being more prominent in physically attractive performers. Interestingly, there were no significant differences in ratings for unattractive and attractive performers on the masculine, innovative, sensitive, profound, and artistic merit measures, indicating that a number of positive attributes were not affected by performer attractiveness. This is especially surprising for the innovative, profound and artistic merit measures, which perhaps reflect counter-effects created by the popular image of the truly talented artist as being somewhat lacking in the social graces and sex appeal departments. Strikingly, these results do not correspond perfectly with the music-rating group.

Fig. 2.4.

North & Hargreaves (1997: 81)

As can be seen, music allegedly performed by the physically attractive artists was given a substantial advantage in ratings for a number of the attributes. However, the significantly higher ratings for the attractive performers on the artistic merit scale do not correspond to results obtained in the performer-rating group. Although there were additional inconsistencies, such as the rating patterns between groups for the poised, feminine, and emotionally warm measures, this perhaps can be disregarded as these attributes do not necessarily reflect specifically positive or negative characteristics that one might associate with music and music performance. Despite these enigmatic details, preference and attractiveness scores were strongly positively correlated (+.79), suggesting that with regard to overall preferences there is a strong trend in favor of a preference for the music of physically attractive performers, a finding that is given considerable weight by the pop music industry which takes great pains to advertise artists in a sexually enticing light.
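For readers unfamiliar with the statistic, the correlation reported above is of the kind computed in the following sketch of Pearson’s r. The scores are hypothetical illustrative values, not data from North & Hargreaves (1997), and the function name `pearson_r` is chosen here for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical mean scores for five excerpts (illustration only).
attractiveness = [2.0, 3.5, 5.0, 6.5, 8.0]
preference = [3.0, 3.5, 5.5, 6.0, 7.5]

r = pearson_r(attractiveness, preference)
print(f"r = {r:+.2f}")
```

A value near +1 indicates that excerpts paired with more attractive performers also tend to receive higher preference scores; it says nothing by itself about why.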

The Media

The media are placed directly between input variables of the music stimulus, and variables relating to the cultural environment in which the music is experienced. The media affect preference by selecting music from an array and making it available to the listener. In this way the cultural variables, which affect what the media select, limit the range of possibilities for the music stimulus variables. Thus, certain combinations of the stimulus variable may be unusual in one culture, but extremely common in another.

The media are dependent on the means of content delivery, which today include internet, television, radio, CD player, record player, tape player, and the like. When the term “media” is used, we generally make the distinction between the organizations that produce and distribute content and the means by which it is consumed. Thus, the music media includes radio stations that broadcast music, as well as record companies that sign artists and market their product. In LeBlanc’s terms, the media factor consists of the net effect of both the organizations which have particular motivations (i.e. to make profit or play a particular role), and the availability and usage of the instruments that bring content to the individual. The structure of the media has changed significantly since the industrial revolution and the invention of modern instruments of content transmission, and has unquestionably altered our lives as a result of the comparatively quick and easy delivery of content made available, at one level or another, to almost everyone, anywhere. In fact, it cannot be escaped and, at least in the developed world, pervades every aspect of our lives (Tagg 1981).

Roe (1985) reports that in Sweden the typical person is subject to three hours of music per day, with two of those hours spent listening to records (that figure today would no doubt lean increasingly towards the CD player) and the remaining hour spent exposed to music on the radio, TV, or other media. Much of the media literature is devoted to researching the role and presence of the television in what Christenson (1985: 327) describes as a “pro-television skew”. One of the few studies examining music listening and TV watching habits of subjects was done by Larson (1983), who had a sample of high school students carry beepers throughout the day and evening. Subjects were paged at random times every two hours and completed a self-report describing their present activity and social context of activity. In addition, subjects responded to scales measuring the degree and nature of their involvement, including items detailing current affective state and motivation, and filled out a demographics section. Larson found that subjects spent more time watching TV than listening to music, with younger subjects watching more TV than the older ones. Time spent listening to music followed closely behind, particularly when background music was included. Subjects listening to music were more often alone than not, while those watching TV were more often present with family members. Further, when watching TV alone, subjects more often reported feelings of loneliness and negative affective sensations, whereas listening to music alone was more often associated with positive affects. Larson interprets these findings as supporting the idea that television content, in general, is adult oriented, expressing values pertinent to the working adult life, whereas music on the radio or on record/CD affords the adolescent the opportunity to tap into a youth culture expressing corresponding values. 
Although the degree that the television can be seen as being adult oriented seems questionable considering some of the content today, Larson’s study is a particularly fine example of research which effectively taps into the private and inner workings of the everyday lives of adolescents and their relationship to the media.

Although only briefly alluded to in the former study, the question of the patterning of values with musical forms both in the media and out, is of great interest to researchers. A “massification hypothesis” (i.e. the breakdown of distinct cultures into one homogenous mass) has been put forward as the net result of an increasingly globalized media. Others argue that a diversity of taste cultures and taste publics continue to be present in all their varied forms (Gans 1974), and much evidence has since been published to support this idea (Russell 1986).

Research by Fox & Wince (1975) has been particularly insightful in identifying the types of people who are associated with distinct taste cultures, as identified by subjects’ music of choice. Self-report questionnaires were administered to a sample of undergraduate university students in Iowa, who responded on a 5-point like-dislike scale to nine labels of generalized music styles. In addition, demographics were recorded. A factor analysis was then performed to determine the underlying dimensions that best explained the preference response patterns for styles of music. The first factor loaded highly on jazz, blues, and classical music, while the second loaded highly on current pop hits and easy listening, the third on folk music, the fourth on social protest music, and the last on country and western. Factor scores were calculated for each individual, allowing for an observation of the extent to which individuals are involved in each taste culture, as defined in terms of the underlying dimensions reported above. A multiple classification analysis was performed in order to assess the relationship between demographic variables, such as age and gender (categorical independent variables or dummy variables), on factor loadings, with the dependent measure accounted for in the form of adjusted subclass means. Results indicate that for the first factor (primarily jazz/blues factor) preference (or membership) was strongly related to hometown size, religious affiliation, and father’s education and occupation. Fox interprets this as a result of the tendency for jazz and blues to be historically rooted in a distinctly urban environment (less so for blues, but certainly for jazz). Atheists, agnostics, and Jews were identified as being strongly affiliated with this taste culture, which is not surprising as jazz, in particular, was until relatively recently considered offensive to traditional tastes and values (Roe 1985). 
Father’s occupation reflected a tendency for farmers and generally rural occupations to be negatively associated with membership of this taste culture, while parents with higher-positioned urban occupations were positively associated with membership. For the popular hits factor, age and training were of particular interest. Membership was stronger for women than men and for young adolescents over young adults. Folk music was strongly associated with membership by women and negatively associated with hometown size. For rock and protest music, surprisingly, there was little evidence of a tendency for males to be members over females. However, age, hometown size, and religious preferences were pertinent. Membership was higher for younger persons and stronger for atheists and agnostics, as well as for persons residing in larger urban areas. For the country and western factor, males had a stronger membership than females. Surprisingly, hometown size and religious disposition had little presence on this dimension. Results confirm the concepts of “taste publics” and “taste cultures” to which persons are affiliated to one degree or another, and disconfirm a massification hypothesis. The identification of taste public characteristics commonly associated with corresponding taste cultures further corroborates the notion that even within the relatively homogenous subpopulation of university undergraduates, taste publics do exist and can do so within the framework of a digitized and commercialized media without compromising the identity and values so peculiar to each.

The Environment

Peer Group, Family, Educators, and Authority Figures

The peer group variable includes social circles that influence an individual’s preference through the tastes of its members. The family, educator and authority figure variables act in a similar way, although with perhaps a tendency to impart values and principles rather than individual tastes. Notable studies on the effects of authority figure bias on subjects’ judgments of musical excerpts include a study by Radocy (1976) and a study by Alpert (1982). Radocy observed that when music majors were presented with bogus information regarding performances and the authorship of compositions by the researcher and assistant (the authority figures), subjects were susceptible to bias in favor of the bias conditions. Although some anomalies did appear regarding comparison of the second of each pair of strong bias conditions with the control, there were consistent bias effects, supporting the notion that conformity to authority is a viable factor within a music preference context. Alpert’s study incorporated verbal and behavioral measures in order to observe the effects of disc jockey, peer, and teacher approval on subjects’ preferences for classical, country, and rock music. Results were mixed depending on the measure studied. Music teacher and disc jockey approval for classical music increased subjects’ preference, and peer approval decreased preference, as measured by a music selection recorder. Strikingly, on the verbal rating measure, peer approval increased preference in a similar manner to the disc jockey and music teacher approval groups. These results are evidence of the complexity with which social groups of different kinds can influence an individual’s behavior and attitude towards music. Indeed, each of these environmental group factors may vary in strength of influence at different points of time in an individual’s life and may work in opposition to each other. 
This is illustrated in the stereotypical case of a young Caucasian adolescent who drops violin lessons and immerses himself in the rap culture in order to anger his parents.

Incidental Conditioning

Incidental conditioning occurs when an emotional state is attached to a piece of music such that it is evoked at a later listening of the piece. A popular example of this, cited by LeBlanc (1982: 34), is the case of a couple who happen to listen to a certain piece of music on their honeymoon and are constantly reminded of the blissful experience whenever the piece is heard subsequently. There is very little on conditioning and preference for music in the music education, , and perception literature. This is no doubt a result of the difficulty in studying the phenomenon as it occurs in a music preference context. For those interested in pursuing this topic, the best place to start is within the general psychology literature with the concept of conditioning.

Ivan Pavlov was the first to experimentally demonstrate classical conditioning. Pavlov (1928) inserted tubes into the salivary glands of dogs and measured the amount of salivation that resulted when food was shown to them. He then paired the sound of a metronome with food and found that the dogs salivated in the presence of the metronome ticking alone. However, salivation decreased as it became apparent (to the dogs) that no food followed presentation of the metronome. In this context, the food is the unconditioned stimulus while the metronome is the conditioned stimulus. The unconditioned response is salivation in the presence of food, while the conditioned response is salivation in the presence of the metronome alone. In LeBlanc’s example, a similar labeling of stimulus and response components can be done. However, in this context the response is an emotional one, and hence we speak of a conditioned emotional response (Grusec et al. 1990). Here the unconditioned stimulus (US) is the presence of the other person (food in Pavlov’s case) in the experience, while the unconditioned response (UR) is the feeling of bliss. The conditioned stimulus (CS) is the music, which becomes paired with the presence of the other person and on later listening produces the conditioned emotional response (CER), or bliss, as it “tricks” the mind into thinking that the person (or unconditioned stimulus) is present. A clear understanding of this process and the related concept of operant conditioning paves the way for the execution of an experiment researching the role of conditioning in shaping music preferences. A successful study would be bound to be one of the few, if not the first, of its type, at least in the music preference literature, and as such would make for an intriguing avenue of research.

Intervening Variables

Basic Attention, Current Affective State, and Physiological Enabling Conditions

Intervening variables constitute the next three stages and include physiological enabling conditions, basic attention, and current affective state. These variables affect the degree to which all prior variables at the input stage influence the musical experience and preference decision. All three variables are static ones that are related to the immediate context of the listening experience, and include factors internal to the listener and ones pertaining to the immediate characteristics of the listening environment. If the piece of music was heard outdoors near a waterfall, the extent to which the background noise interferes with the experience would be an example of a physiological enabling condition (or disabling condition in this case). Another example of this variable is whether or not the listener has cotton balls in his ears, or to quote LeBlanc’s (1982: 35) example, can hear frequencies over four thousand hertz.

Basic attention and current affective state variables include factors pertaining to the energy level, focus of attention, and mood of the individual during the listening experience. Clearly, the basic attention factor is critical to the successful processing of any musical stimulus, as a distracted individual may fail to process information altogether. This variable is given a great deal of weight by music researchers, particularly those conducting studies that assess the attitudes towards music of young children, who are known to have much shorter attention spans than older children and adults. Consequently, researchers pay great heed to the length of playing time chosen for music excerpts in order to account for this. Current affective state seems to have a slightly more subtle effect on the outcome of music preference decisions.

Konecni (1976) studied the effect of arousal on preference for melodies of varying complexity. Subjects participated in two fictitious experiments “blind” to the true aims of each, before undergoing the experimental treatment of interest. In the first experiment, subjects solved sets of anagrams along with a “confederate”, who behaved differently according to two conditions. In the first, the confederate quietly solved anagrams without disturbing the subject, while in the second (annoy condition), the confederate quickly solved anagrams and then proceeded to insult the subject over their performance on the task. In the second experiment, half the annoyed subjects were sent to a room and left alone for seven minutes. The other half participated in a memory experiment in which the confederate was “randomly chosen” to partake as the recipient of electric shock administered by the subject in response to incorrect recall (in fact, no shocks were delivered). Combinations of the conditions of both experiments resulted in the conditions “annoy-wait”, “annoy-shock”, and “no annoy-wait”. Following this, all subjects listened to melodies of varying complexity, as determined a priori in the manner of Paul Vitz (1966), through a set of headphones. Subjects were instructed to press one of two buttons and told that a melody would sound for 10 seconds following. This was to be done 50 times, with each button producing either a complex or simple melody. The proportion of complex melodies chosen was tabulated and analyzed with respect to the three conditions. Results indicated that subjects in the annoy-wait condition (aroused subjects) chose the simple melodies more often than subjects in the other conditions. Subjects given an opportunity to shock the confederate (thus reducing arousal via a catharsis-type effect) showed no differences in comparison to the control (no annoy-wait). The implication of this study as deduced by Konecni is clear-cut. 
A person who is annoyed, and thus has a higher current level of generalized arousal, will tend to shun arousing stimuli in an attempt to prevent a further increase in arousal which is already above the optimum preferred level. Clearly, as evidenced by this study, current affective state is important in determining preference. This is certainly very intuitive as well, as someone who is extremely bored will seek some activity or stimulus that will stimulate them, whereas someone who has had too much excitement for one day may choose to have a cold beer and watch the box instead.
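
The tabulation step in Konecni’s design (the proportion of complex melodies chosen out of 50 button presses, summarized by condition) can be illustrated with a minimal sketch. The choice counts, the variable names, and the function `mean_complex_proportion` are all hypothetical and are used purely for illustration; they are not Konecni’s data or his analysis procedure.

```python
from collections import defaultdict

# Hypothetical data: (condition, number of complex-melody choices out of 50 presses).
# These counts are invented for illustration and are NOT Konecni's results.
TRIALS_PER_SUBJECT = 50
choices = [
    ("annoy-wait", 15),
    ("annoy-wait", 20),
    ("annoy-shock", 25),
    ("annoy-shock", 30),
    ("no annoy-wait", 26),
    ("no annoy-wait", 28),
]


def mean_complex_proportion(records, trials=TRIALS_PER_SUBJECT):
    """Return the mean proportion of complex melodies chosen in each condition."""
    by_condition = defaultdict(list)
    for condition, n_complex in records:
        # Convert each subject's count into a proportion of the total presses.
        by_condition[condition].append(n_complex / trials)
    return {cond: sum(props) / len(props) for cond, props in by_condition.items()}


proportions = mean_complex_proportion(choices)
for condition, proportion in proportions.items():
    print(f"{condition}: {proportion:.2f}")
```

With these invented counts the annoy-wait group has the lowest mean proportion of complex choices, which is the direction of the effect reported above. A real analysis would go on to test the differences between conditions statistically.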

Together, these three variables act as what LeBlanc (1982: 35) describes as a “gate and filter”, controlling what passes through for processing and evaluation. Each of these variables also has the potential to interact not only with variables classified under the intervening variable category, but also with variables at the input stage and the remaining influence variable stage.

Relatively Stable Factors of the Individual

Level four, the last of the influence variable levels, represents relatively stable factors pertaining to the individual and includes auditory sensitivity, musical ability, training, personality, gender, ethnic group, socio-economic status, maturation, and memory. Of these factors, ethnic group, training, gender, and maturation (or age) have been given the most attention in the music preference literature. In addition, a good amount of exciting research has been done on personality and preference for music.

Personality

David Rawlings (1997) explored the relationship between personality and music preference in a study incorporating the Music Preference Scale (MPS) developed by Little and Zuckerman (1986), and the revised NEO personality inventory developed by Costa and McCrae (1992). The MPS is a 75-item questionnaire consisting of demographics and 60 labels of styles of music with examples for each. Labels of individual styles are divided into 10 general genres of music. Rawlings made slight adjustments to the general categories and by means of a factor analysis determined that all items could be explained in terms of three underlying dimensions, termed “breadth of preference” (factor 2), “popular music” (factor 3), and “rock music” (factor 1).

The NEO personality inventory supports a five-factor model of personality, consisting of the dimensions “agreeableness”, “extraversion”, “neuroticism”, “openness”, and “conscientiousness”. The three factors describing the underlying dimensions of the MPS were analyzed in relation to subjects’ responses on the NEO personality inventory, with the personality factors as independent variates and the general music style dimensions as dependent variates. Canonical correlation (see Hair et al. 1998) yielded three significant canonical functions in which linear composites of the dependent and independent variates were explored. Additional analysis was conducted on facets of the personality dimensions and their relationships to the dependent variates. For the first analysis, for the first canonical function, strong negative loadings were observed on the independent variates of music training, music interest, and openness. A corresponding high negative loading for breadth of music preference was observed in addition to a positive loading for the popular music factor. Results suggest that subjects with high preferences for popular music will tend to have low levels of music training and interest, as well as exhibit narrow-mindedness (i.e. opposite of openness). Furthermore, such persons will tend to have an aversion to classical, jazz, and the remaining styles of music represented under the “breadth of preference” factor.

Fig. 2.5.

Rawlings (1997: 125)

For the second canonical function, strong negative loadings were apparent for breadth of music preference and pop music, and a high positive loading was present for rock music. On the independent variates, strong negative loadings were observed for gender, music training, and conscientiousness, suggesting that preference for rock music is associated with being male and un-conscientious, with little training in music. For the final function, all loadings were negative apart from one, which was very weak at any rate. Strong negative loadings were present for all dependent variates, while for the independent variates, strong negative loadings were observed on the agreeableness, openness, and extraversion factors, suggesting that persons with these personality characteristics will have a tendency to be open to all forms of music. A consideration of the remaining analysis conducted on the facets of the openness and extraversion factors yielded additional details on the nuance of personality and its relationship to preference for styles of music. For our purposes, the conspicuousness of the loadings on the training variable and the general finding that gender is associated with liking for particular styles of music (males with certain personality dispositions coupled with preference for rock music) are of considerable interest, as will become apparent in the empirical portion of the thesis. With respect to the underlying theory regarding personality and music preference, it has been suggested (Eysenck 1967) that low or high tonic levels of cortical arousal characterize certain personalities. Extroverts and sensation seekers have a low tonic, or resting, level and as a result are driven towards attaining higher levels of arousal. This is reflected in the behavior (and preferences) of such individuals, who appear “stimulus hungry”. 
Questions regarding the biological ramifications of this general idea still remain, and indeed, even within the context of an arousal hypothesis, there is considerable disagreement (Rawlings 1996).

Maturation

Hargreaves’ (1995) “open-earedness” hypothesis, which appears to be inspired by Murray Schafer’s (1967) pedagogical text “Ear-Cleaning”, suggests that young children are less opinionated about unusual and uncommon music than adults and “may show less evidence of acculturation to normative standards of good taste” (Hargreaves 1995: 243). This idea parallels Schafer’s advice to students to “clean the ears” of conventional frames of reference when listening to music, and perceive the sounds as a child, as if for the first time. LeBlanc (1996) develops this notion into a structured model for explaining maturation effects on music preference. Younger children tend to be more open to a wide variety of sounds and music, with this trait diminishing with the approach of adolescence, then rebounding at young adulthood, before declining again as the individual matures and ages. This model implies that younger children will have a tendency to be more receptive to world and art music than children in the adolescence stage, who will tend to develop an increasing preference for popular styles of music. This trend reverses again at young adulthood, with art and world music again getting the upper hand. A slight leveling off for all styles is predicted with old age. The strongest evidence for these trends has come from LeBlanc (1996), who observed a U-shaped trend in preferences for art, jazz, and rock genres among American subjects ranging from ages six to ninety-one.

Fig. 2.6. Preference means for styles of music by age group

LeBlanc (1996: 56). Music preference means by grade level. 13 = university students; 14 = young adults.

Although consistently higher preferences for rock music were observed up to grade eight, the general U-shaped trend and striking interactions (although somewhat more premature than one might prefer) occurring between grade levels eight and fourteen offer substantial support for the open-earedness hypothesis. Much music preference research has confirmed that young children tend to like pop and rock genres better than serious music, so the higher preferences for rock music for those age levels in LeBlanc’s sample should come as no surprise. This does not refute the validity of the open-earedness hypothesis, as it simply states that a young child will tend to be more open to serious and unusual music than those in their early teens, which is in fact the case in LeBlanc’s sample. What is unusual is the outlier for art music at grade five, the dip in preference for art music at grade twelve, and the consistently lower preference ratings for jazz music at the university level, which at this stage we would expect to surpass preference levels for rock or pop genres of music. Despite these details, some of which may be a result of random variation, the U-shaped trend is evident as confirmed by the hypothesis test of the goodness-of-fit of the quadratic model (p=.05).
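
What a quadratic (U-shaped) trend means in practice can be shown with a minimal sketch that fits a model of the form y = a + b·x + c·x² by least squares. A positive coefficient c, together with a turning point inside the observed range, indicates a U shape. The grade levels and preference means below are hypothetical values constructed to lie exactly on a parabola so that the result can be checked by hand; they are not LeBlanc’s data. The helper `fit_quadratic` is an illustrative name, and the sketch does not reproduce the hypothesis test reported above.

```python
def fit_quadratic(xs, ys):
    """Least-squares fit of y = a + b*x + c*x**2 via the normal equations."""
    # Sums of powers of x and of y * x**k needed for the 3x3 normal equations.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented matrix [A | t], where A[i][j] = sum of x**(i + j).
    m = [[s[i + j] for j in range(3)] + [t[i]] for i in range(3)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


# Hypothetical grade levels and mean preference ratings (NOT LeBlanc's data),
# generated from y = 6.0 - 0.7*x + 0.05*x**2.
grades = [1, 3, 5, 7, 9, 11, 13]
means = [5.35, 4.35, 3.75, 3.55, 3.75, 4.35, 5.35]

a, b, c = fit_quadratic(grades, means)
turning_point = -b / (2 * c)  # grade level at which preference is lowest
print(f"a={a:.2f}, b={b:.2f}, c={c:.2f}, turning point at grade {turning_point:.1f}")
```

In this constructed example the quadratic coefficient is positive and the minimum falls at grade seven, the pattern the open-earedness hypothesis would predict for the school years. Real preference means would scatter around the curve, which is why a goodness-of-fit test is required.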

Gender

Gender effects tend to be less prominent than age effects. However, Hargreaves (1995) did come up with significant findings in a study assessing subjects’ preferences for styles of music. This particular study stands out from many, as it incorporates a standard survey format and elicits responses from subjects to style labels rather than music excerpts played on a tape. Although Hargreaves recognizes the problems that can be caused by variability in subjects’ understandings of what constitutes a style, he also points out the benefit gained by not having to deal with a possibly small number of unrepresentative music excerpts and the relatively easy access to a large number of preference judgments based on individuals’ conceptions of style labels. Hargreaves found that in his sample, girls tended to have significantly more music training than boys. Further, boys showed greater preferences for heavy metal and rock music genres, while girls had greater preferences for reggae, pop, jazz, folk, and classical genres. The popular explanation for this is that boys tend to be more extroverted and aggressive in nature, and thus are more apt to identify with the so-called “masculine” traits that are commonly associated with heavy metal and rock music. Girls, on the other hand, will tend to prefer music of a more feminine disposition, and as a consequence be more likely to gravitate towards art music and take up an instrument. Hargreaves points out that recent evidence accumulated by Crowther (1982) indicates that girls may be more receptive than boys to a wider variety of music in general. If so, the differences in ratings between genders could be explained as a result of stronger dislike ratings for certain genres given by boys. 
Further complicating the matter, Hargreaves found that training and preference for those genres most preferred by females were positively correlated, suggesting that higher preferences for those genres on the part of the female group may have been a result of the prevalence of music training in addition to gender traits. The question remains as to which factor has the stronger influence. Do gender traits largely cause females to take up formal music training as a result of a genetic disposition common to all females, or do socially defined variables play a more important role in seeing that females are more likely to receive formal training? If the latter is the case, then at least in Hargreaves’ sample, training, not gender, becomes the dominant factor in explaining preference discrepancies.

Christenson (1988), one would suspect, would place substantial weight on the importance of gender as a distinguishing factor in shaping music preferences. He points out that to gloss over gender differences is equivalent to ignoring social boundaries, a sin magnified by the fact that gender differences can be prominent (and in fact may be especially prominent) even within relatively homogeneous music subcultures. Christenson cites ideas by Frith (1981) and Brake (1980) to illustrate a culturally constructed phenomenon that is believed to be largely responsible for gender differences in music preferences. Frith argues that the “work” of the female, ultimately, is to attract a husband and bear children, and as a result, females are taught (or pressured) by society to adopt traits and tendencies to this end. This notion is reflected in music preferences, with females adopting the “softer”, more romantic popular styles, and males taking to the harder rock styles. Brake goes a step further, arguing that the pop culture itself enforces gender role stereotypes. There is music for males and music for females, each reflecting traditional gender qualities. Consequently, popular styles must be understood and engaged as the sum of two very separate and distinct cultures, each having its own peculiar characteristics and nuances. Christenson’s results seem to support this conception of the nature of music preference in relation to gender. Twenty-six style labels were chosen in consultation with music publications and radio directors, followed by a pilot survey consisting of a random sample of 110 undergraduate students. The style labels were deemed appropriate in terms of representing the general spread of style categories commonly associated with the student body, and the main body of the survey commenced. Subjects rated their preferences for each style category and provided demographic information, including gender. 
Results indicated that females preferred mainstream pop, soul, 70s disco, and black gospel significantly more than males, who preferred contemporary rhythm and blues and three different categories of rock significantly more than females.

Training

LeBlanc (1982: 36) distinguishes the training variable from the musical ability variable on the grounds that subjects with equivalent levels of training can differ extensively in terms of natural musical ability. However, clearly any person receiving training will in most cases exhibit some sort of advantage in music appreciation and performance in comparison to someone who has received no training. Further, one can speculate that a person with an undeveloped natural ability will not achieve their full potential without some sort of training. Consequently, training must be considered as a significant variable in determining not only a person’s success in music, but also a person’s response to music. It can be stipulated that training enhances a person’s ability to appreciate music and increases the potential aesthetic power of music. Researchers and educators alike are well aware of the importance of education in enriching a person’s musical experiences, and much of the music education literature focuses specifically on examining the best ways of maximizing the efficiency and effectiveness of music pedagogy. However, a number of researchers have looked at the extent to which response to music differs for trained persons and musically naïve persons. Hargreaves (1980) investigated the effect of training on quality and preference ratings for familiar and unfamiliar music excerpts. Fifty-four undergraduate students were divided into two groups on the basis of training, resulting in 27 subjects in the trained group and 27 subjects in the untrained group. Both groups listened to four classical music selections and four pop music selections, with selections consisting of two familiar and two unfamiliar excerpts for each style of music. Subjects rated each excerpt for liking and quality.

Fig. 2.7. Effects of training on mean liking and quality scores

Hargreaves (1980: 16)

Both the trained and untrained groups rated the classical excerpts as being of higher quality than the pop excerpts. However, the trained group had much stronger preferences for classical excerpts, while there was only a marginally stronger preference for familiar pop excerpts on the part of the untrained group. Hargreaves explains these results as indicative of a greater tendency for untrained persons to fragment their attitude to music on the basis of their cognitive and affective responses. Persons with high levels of training, on the other hand, will be more likely to prefer the music they regard as being of higher quality. According to these results, untrained persons will be more likely to realize that a piece of music is of poor quality, yet still prefer it over one of higher quality. This suggests that for untrained subjects, serious listening plays a comparatively smaller role than it does for trained persons, who have developed an ability to appreciate more complex and unusual music.

Bias on the part of both untrained and trained subjects must also be considered as a very likely effect in this context. For most music students the central topic of study is Western classical music. For a student to study classical music seriously, putting large amounts of time and effort into the endeavor, yet like pop music better, would result in cognitive dissonance. This student would be under severe stress as a result of the incompatibility of motivational factors coming into play. On the one hand, the student would be struggling to convince himself that study of classical music is worthwhile, and on the other hand, would seek out pop music as it has greater aesthetic value for him. This tension would continually be demanding some kind of resolution such that the two clashing motivations are reconciled. One way of achieving this would be through denial, which would manifest itself in higher rated preferences for classical genres of music. This same type of bias effect could also explain the discrepancy in preference and quality ratings on the part of the untrained group. It is possible that the untrained group may be equally pressured by Western society (which for the most part deems classical genres of music to be of higher quality than pop genres) to recognize pop music as being inferior. The bias in this case would inflate the quality ratings for the classical genres, and may explain why there was so little difference in preference ratings for classical and pop excerpts for this group.

Ethnic Group

The effects of ethnic group membership on musical preference are dramatic, as has been observed by Morrison (1998) and McCrary (1993). Both studies focus on the preference responses of African-American and American-Caucasian subjects to music excerpts. The results of both suggest that minority groups will tend to exhibit much stronger preferences for music styles of their own culture, with this tendency increasing in strength with age from childhood to young adulthood. Results stop short of confirming the wider behavioral concept of same group identification, as the Caucasian subjects displayed no such tendencies in their preference responses to music excerpts. Although the results in Morrison’s study are more striking, McCrary did incorporate vocal music from popular, jazz, folk, and gospel genres, allowing for observation of same group identification effects among subjects of both races. Because McCrary divided excerpts according to those exhibiting Western European and African-American vocal stylings, and included a measure of subjects’ ability to identify the performer’s race, subjects’ race scorings could be correlated with subsequent preference scorings. Results indicated that African-American college subjects’ race scorings tended to be highly correlated with their preference scorings in the direction of a higher preference for music identified as performed by same race performers, and a lower preference for music identified as performed by other race performers. Despite similar performer race identification scorings, no such pattern was found among Caucasian subjects, whose preference means for black and white performers were roughly equal.

Morrison restricted his excerpts to the jazz genre, and paired slides of the performers with each excerpt according to two conditions. In the “correct slide” condition, the correct match of performer to excerpt was made, whereas in the “incorrect slide” condition, the race was reversed for each excerpt such that subjects were presented with slides of the race opposite to that of the actual performer’s race on each excerpt. A control was included in which no slides were presented in conjunction with excerpts.

Fig. 2.8. Like/dislike means for slide conditions by race of performer and subject

Morrison (1998: 215)

Results indicated that African-American subjects completely reversed their preference ratings according to a bias in favor of performers of the same race. White subjects were not influenced in this way, suggesting that information regarding the performer’s race was not as significant in contributing to music preference as it was for African-American subjects. Although this trend may be peculiar to the United States, it certainly gives good reason for conducting further cross-group studies of this sort.

Culture

Culture is not listed as a single variable in LeBlanc’s model. Rather, the environment variables and some of the level four variables, including ethnic group and socio-economic status, can all be seen as falling under the larger heading of culture, as each contributes to the cultural identity of an individual. An African person of tribal origin can thus be understood as having particular values on the environment variables for example, which may differ markedly from those of persons of European origin. These differences distinguish cultures, and allow for the conception of culture as a single variable.

Exemplary cross-cultural studies include investigations by Fung (1999/2000), Morrison (1999), and Darrow (1987). Although Fung’s study on the music style preferences of young Hong Kong students is really a within-culture study, it makes for a good cross-cultural comparison if looked at in light of results obtained by LeBlanc (1996) on the effects of maturation on style preferences. Fung incorporated the same 18 excerpts used by LeBlanc to represent Western art, jazz, and Western rock excerpts, in addition to 12 Eastern excerpts taken from cantopop and “Sizhu” (traditional bamboo and silk ensemble music) genres, for presentation to students in Hong Kong varying from ages six to fifteen. The same general trend of a decrease in preference, in absolute terms, for all styles, was observed as age increased to young adolescence. However, there were marked dissimilarities in the magnitude of differences in preference for styles of music between the Hong Kong subjects in Fung’s sample and the American subjects in LeBlanc’s sample.

Fig. 2.9. Preference by grade level for music excerpts of different cultural backgrounds.

Fung (1999/2000: 59)

As can be seen, in terms of preference, Hong Kong subjects made little distinction between jazz and Western rock excerpts. However, the Western art music excerpts were preferred substantially more across all ages. Further, Hong Kong students preferred cantopop more than Western rock music and art music, a trend that held throughout the entire sample. Although cantopop can be viewed as holding a place in Hong Kong society equivalent to that which Western rock and pop hold in American society, there was no overlap in preferences between classical and popular styles with the approach of adolescence, as observed in LeBlanc’s sample. In addition, the general U-shaped trend predicted by the open-earedness hypothesis was absent. This can partly be explained as a result of the smaller age range present in Fung’s sample, as college students and adults were not surveyed, which is where the upturn in preferences is predicted to be most prominent. Despite this, the differences in trend between Fung’s and LeBlanc’s samples are apparent. Subjects in LeBlanc’s sample developed a rising trend in preferences for music across all styles at about grades eight and nine, a pattern absent in Fung’s sample. It is also interesting to note the extraordinarily low preference scores for sizhu music across all ages, a trend which we would expect to be similar to preferences for Western art music, if we were to continue the analogy of the equivalence of Chinese popular and traditional styles to their counterparts in Western culture. This is not the case, and it can be presumed that the low preferences for sizhu music are a result of the lack of attention given to traditional Eastern genres of music in comparison to Western art genres in Hong Kong schools. Indeed, Fung notes that for most Hong Kong students, sizhu would not be encountered very often, and in the sample was considered old fashioned and boring by the majority of subjects. 
In addition to these observations, an unexpected finding cropped up which contradicts LeBlanc’s established hypothesis on the effects of tempo on preference. In Fung’s sample, the cantopop excerpts had the slowest mean tempo yet were preferred most, suggesting that tempo preference trends must be considered within a cultural context and cannot be assumed to generalize indiscriminately across populations.

In Morrison’s sample, non-music and music university students from the USA, Mainland China, and Hong Kong listened to jazz excerpts, Western classical excerpts, and Chinese classical excerpts. Subjects rated preference on a nine-point scale and indicated the reasons for their preference ratings. American persons rated jazz excerpts highest, while jazz was the least preferred selection among Chinese and Hong Kong persons. There was little difference in preference for Western classical excerpts between American and Hong Kong persons, while Chinese persons gave significantly higher preference ratings for this genre than both American and Hong Kong persons. The highest ratings were given by Chinese persons for Chinese classical music, while the lowest ratings were given by American persons for the same genre.

Fig. 2.10. Preference means for styles of music by country

Morrison (1999: 10)

The explanation for these preference trends, as advanced by Morrison, is that as a general rule, persons will simply have a tendency to prefer those styles of music more closely related to their own culture than those that are different. Consequently, Americans will give the highest ratings for jazz excerpts, Chinese persons the highest ratings for Chinese traditional excerpts, and so on. It is possible that in Morrison’s sample, if Eastern or cantopop excerpts had been included, Hong Kong persons might have given higher preference ratings for these styles of music than for both Eastern and Western classical music excerpts, a trend which would be in accordance with that observed by Fung. However, Morrison’s explanation does not account for the fact that Chinese subjects preferred Western classical music more than American subjects did; intuitively, we would expect their ratings to be lower than, or on par with, those of American subjects. This does not invalidate the notion that persons will prefer music similar to that of their own culture most, as this idea only implies a specific pattern of preferences within homogeneous cultural groups. The anomaly observed in Morrison’s sample arises from a comparison of preferences between two different homogeneous cultural groups. However, this does lead one to suspect that a bias factor was coming into play for Chinese subjects, for it does appear that their preference trend is transposed a few points higher than the level we might normally expect.

Analysis of the free response data revealed cultural differences as well. Over 4,500 responses were categorized on the basis of being analytical, metaphorical, stylistic, judgmental, or “other” in nature. Of the total number of responses given by each nationality, the highest proportion of responses for the American subject group was “analytical”. Hong Kong and Chinese subjects both gave a higher proportion of “metaphoric” responses, each of which was considerably higher than the proportion of responses given by American subjects for that category. Morrison suggests that this tendency on the part of respondents in Hong Kong and China reflects a characteristic of the Chinese way of approaching and thinking about art in general, which corresponds to a value system that emphasizes symbolic representation and emotive and visual association in art, rather than concept.

Morrison’s study makes some progress in studying the use of terminology to represent musical concepts by subjects of varying backgrounds. However, it does not incorporate specific descriptors in the form of semantic differential scales, and consequently is less conclusive with regard to differences in the perception and description of specific attributes of music across cultures. In fact, the question of consistency of the experience of music, reflected by consistency in the use of terminology to represent musical concepts, remains open to debate. (1757) observes that words will tend to have some consistency of use within cultures. The implication is that there will be little discrepancy in the use of words to describe particular attributes of music between individuals within roughly homogeneous cultures. Terms such as “tender” or “dignified” will have roughly the same musical connotation within a culture. However, it is unclear how the same attributes in a Mozart piano concerto, as perceived by Westerners, for example, would be experienced and described by indigenous persons from countries such as Papua New Guinea or Angola.

Darrow explored this question in a less extreme context, comparing preferences and use of descriptors for Eastern and Western music by Japanese and American subjects recruited from universities in their respective countries. Darrow incorporated nine terms taken from Hevner’s (1935) adjective list, as descriptors, from which subjects could choose one to describe each of 18 excerpts presented to them on tape. In addition, subjects rated their level of preference for each excerpt on a 7-point rating scale. Excerpts included selections from traditional and contemporary Western classical music, jazz, and chant, as well as Eastern traditional ceremonial music, court music, Eastern chant, koto, and “enka”, or easy listening music. Pop excerpts of both cultures were not included in the study. Darrow found that there was considerable agreement among subjects from both cultural groups as to which descriptor best characterized the quality of the music, particularly for the Western excerpts. For six pieces, both Japanese and American subjects gave the highest proportion of choices in favor of the same descriptor. However, there was also considerable disagreement, much of which was a result of differences in choice of descriptors for the Eastern excerpts. For example, for the Japanese “nogaku”, a traditional ceremonial music, a significant proportion of the American subjects chose “mournful” as the most appropriate descriptor, whereas an equivalent proportion of Japanese subjects favored “majestic” as most descriptive of the music. Clearly, these two terms, as understood in Western society, designate quite different emotional states. Although there are slight similarities in tone, as one may think of a funeral march as both majestic and mournful, it is unlikely that this alone could explain the discrepancy.
It is more probable that certain sounds are favored by Eastern cultures, which over time and through tradition have come to be associated with certain types of sentiments, while those very same (or similar) types of sounds in Western culture have developed associations with sentiments of a very different sort. Although it is not certain how this might impact on musical preferences, it does suggest that perception of musical concepts may vary across cultures depending on the cultures and music concerned, and cause differences in preferences for music as a result. In fact, Western styles of music, particularly the classical styles, were generally preferred most by both groups, with Japanese students displaying marginally higher levels of preference for Eastern genres than American subjects. A striking exception to this was the relatively high preference scores given by both groups for “enka”, or Eastern easy listening music. An additional finding, which contrasts with the trend observed in Fung’s sample, is the strong preference of Japanese subjects for the jazz excerpt. Apart from this, results are in keeping with findings of most cross-cultural studies covering developed Eastern nations. In Japan’s case, high preference trends for Western genres of music can be explained as a result of the significant amount of exposure to Western ideas, culture, and music Japan has received since the close of World War II, a phenomenon not reciprocated to the same extent in the West, for Eastern music (Darrow 1987: 246). The strong preferences for classical genres can be interpreted as resulting from the focus and attention given to these genres of music in the Japanese school curriculum, which in the case of music, has largely been modeled after Western music curriculums.
This form of enculturation can account for the dual nature of preferences of Japanese subjects, and suggests that although perceptions may differ regarding the qualities of music in certain contexts, the effects of this in itself may have little bearing on overall preference trends, and are only of great prominence in cases where two cultures are drastically different and unfamiliar with one another.

The LeBlanc Model - Final Stages and Conclusions

The final stages of the LeBlanc model depict the point at which the listener processes information and reflects on the listening experience. Decisions regarding whether or not to continue listening, and whether the music is liked, disliked, or neither liked nor disliked, are made at this stage. If, once the piece is processed, the person remains neutral or undecided regarding his or her music experience, repeated sampling may be required, in which case the individual starts at level eight once again and progresses upward to the higher processing stages. If a decision is made regarding the music experience, the piece may be rejected, in which case processing ends, or the piece may be accepted, in which case the individual may repeat the listening experience, perhaps at a heightened level of attention. Once a decision regarding a music excerpt is made, all future preference decisions regarding the same piece will take into account prior decisions; this is reflected in the LeBlanc model by the arrows going from the rejection/acceptance phase to the processing phase.

Although the levels of the model reflect where in the listening experience factors tend to come into play, and in this sense the model orders variables on a time continuum, it combines the linear time orientation of a decision tree with the time-free attribute characteristic of some statistical models. The dual nature of LeBlanc’s model makes it difficult to apply to the study of a single music experience, and in many ways the model is too detailed for its own good. For example, it could be argued that the ethnic group variable is appropriate for inclusion under the environment heading in the input stage, as it is similar to the peer and family group variables. In addition, the complexity of a stimulus for an individual, for a single listening session, is really the perception of objective properties of the stimulus, which may vary according to the processing capabilities of the individual, in which case complexity is not really an objective trait of the object, but rather, a subjective perception of the individual. The same problem arises with constructs such as novelty. Should novelty and complexity be considered as properties of the object, the subject, or both? In that case, should there be corresponding counterparts for these constructs in the level representing relatively stable traits of the listener? Despite these questions, which are certainly difficult to tackle in this context, the model is extremely helpful in enabling one to place specific studies into the framework of a single and complete music listening experience, and in this regard, perhaps has more value as a model of the structure of the music preference literature. It allows us to view extremely detailed and specific studies regarding the effects of tempo on preference, for example, from a wider perspective.
LeBlanc’s model, in essence, provides us with a useful bird’s-eye view of the factors involved in shaping musical taste, and offers a vantage point from which to proceed in sampling the literature in order to study specific variables and their relations to each other.

The Mere Exposure Theory proposed by Robert Zajonc is not specifically addressed in the LeBlanc model. However, LeBlanc’s model does have a circular component, represented by the repeated sampling and repetition of stimulus headings and depicted by the arrow going from heightened attention to the input variables. Thus, repetition effects are implicitly factored into the model as one can picture going through the hierarchy up to the decision stages of the model with each repetition. The issue of repetition effects on preference is explicitly engaged in the Mere Exposure Theory as well as the Optimal Complexity Model.

CH 3. Zajonc’s Mere Exposure Theory

The Mere Exposure Theory proposes that exposure is a sufficient condition to produce liking for unfamiliar objects. Specifically, the relationship between exposure and liking is logarithmic, with increases in liking greatest in the initial stages of exposure, less pronounced in the later stages, and minimal for very familiar stimuli. Zajonc (1970) explains this pattern as occurring when unfamiliar and unique characteristics of a stimulus lose novel traits and become familiar. Novel, or otherwise unfamiliar objects, hamper stimulus generalization and produce conflict in the subject. This creates stress and a general dislike for the object. With repetition, the novelty of the object wears off and stimulus generalization occurs, resulting in greater consistency in response and a reduction in conflict. This decreases the stress level of the subject, causing the object to be better liked.
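
The logarithmic form described above can be illustrated with a minimal sketch. The function names, the baseline, and the gain coefficient are hypothetical and chosen only for illustration; they are not values reported by Zajonc. The sketch simply shows that, under a logarithmic relationship, each additional exposure adds less liking than the one before it.

```python
import math


def liking(exposures, baseline=0.0, gain=1.0):
    """Hypothetical liking score after a number of exposures (>= 1).

    baseline and gain are illustrative assumptions, not fitted values.
    """
    if exposures < 1:
        raise ValueError("exposures must be at least 1")
    return baseline + gain * math.log(exposures)


def successive_gains(max_exposures, baseline=0.0, gain=1.0):
    """Increase in liking contributed by each additional exposure."""
    scores = [liking(n, baseline, gain) for n in range(1, max_exposures + 1)]
    return [later - earlier for earlier, later in zip(scores, scores[1:])]


# Gains from the 2nd through the 11th exposure: always positive, always shrinking.
gains = successive_gains(11)
```

In this sketch the largest gain comes from the second exposure and the smallest from the eleventh, which corresponds to the pattern of large early increases and minimal later ones.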

Fig. 3.1. The Mere Exposure Theory

[Figure: affective rating (y-axis, 0 to 25) plotted against exposure frequency (x-axis, 1 to 11).]

Robert Zajonc (1969), one of the strongest proponents of this theory, clearly demonstrated the effects of repetition on liking for stimuli in a landmark study incorporating Turkish words displayed with varying frequencies in two local American university newspapers. At the end of a 25-day exposure period, questionnaires were administered to a random sample of university students and good/bad ratings collected. Zajonc found that ratings increased with increases in frequency of the word in a logarithmic fashion. Although Zajonc asked subjects to rate on a 7-point scale whether the words meant something good or bad, ratings were taken as indicative of an attitude to the object, which was treated as equivalent to preference. Consequently, words that were thought to mean something bad by a subject were interpreted in the experiment as being disliked, and words thought to mean something good were understood as liked. Although this method of interpreting ratings has questionable validity, there is no overt reason to understand ratings as not reflecting levels of preference. This study effectively demonstrates the influence of repetition on subjects’ affective ratings of stimuli, but within a limited context. Of contention is the idea expressed by Zajonc that exposure alone is of the greatest consequence to liking for a stimulus, especially one as complex and varied as music. However, studies incorporating music excerpts have arrived at results compatible with Zajonc’s theory.

Bradley (1971) conducted a study in which 14 classes of grade seven students listened to 12 music excerpts over a 14-week exposure period in music listening sessions. Three excerpts of each of four different types of music, classified as tonal, polytonal, atonal, or electronic, were used. Pre- and post-test preference means revealed significant differences in liking for all four categories of music, suggesting that the exposure period had the effect of enhancing subjects’ affective response to excerpts in the form of an increase in liking. Results were conclusive enough to warrant Bradley’s recommendation to use repetition as a means of encouraging music appreciation in the classroom. Although this study reveals increases in preference following exposure, questions remain regarding the nature of the increase. Because only a pre- and post-test were administered, no conclusions could be drawn regarding the shape of the increasing trend, which could be linear or curvilinear. Further, the lack of a control group allowed for the possibility that confounds distorted the results. Consequently, any conclusions drawn regarding the applicability of the Mere Exposure Theory to these results must be tempered by the fact that the design excluded a control group and the possibility of testing hypotheses regarding the nature of the relationship between preference and exposure.

In a study by Schuckert (1968), a jazz and a classical music excerpt were played to a group of 20 children in four listening sessions. In an initial testing phase, preference ratings were obtained for each excerpt for each child, followed by four sessions in which the least preferred selection was played to subjects individually. A post-test revealed preference shifts in the expected direction among roughly half the children. However, these results were not significant. Despite this, Schuckert concludes that improvements in design, such as increasing the length and number of exposure periods and including a control group, might yield substantial results in favor of significant exposure effects.

Bradley’s results provide circumstantial evidence for the Mere Exposure Theory, while Schuckert’s results are somewhat equivocal, casting some doubt on the degree to which exposure alone affected liking. Clearly, there are many variables involved in shaping an individual’s preference for music, as suggested by LeBlanc, and the variation in degree of effect observed in these two studies suggests that exposure may not be the most important factor in determining preference. The fact that two similar types of experiments produced somewhat differing results indicates that there are other variables worth studying in addition to exposure; variables that may have had little, if any, presence in Zajonc’s study.

Zajonc’s study does provide strong evidence for the Mere Exposure Theory. The stimuli used were presented in a very subtle manner and for most respondents were undoubtedly devoid of any meaningful content, yet words that were presented most frequently were liked most. It is hard to imagine that in this context any other factor could have done a better job in explaining preference trends, although it is possible that higher rated preferences for words with higher exposure frequencies may to some extent reflect an individual’s recognition (whether conscious or subconscious) of a word. Despite the striking results obtained in Zajonc’s study, it is not certain that exposure would influence preference for music excerpts in the same way. One would think that some characteristic of the music, such as content, would have the strongest influence in shaping preference, and that exposure effects would vary according to stimulus and subject characteristics. For this reason Zajonc’s findings appear to have strong implications for the advertising industry (particularly subliminal advertising), but are less informative in explaining exposure effects with regard to music.

Cantor’s results are diametrically opposed to those of Zajonc and cast further doubt on the applicability of the Mere Exposure Theory to a range of different contexts. Cantor (1968) used ten “low meaning figures” of various shapes to test the effects of familiarity on liking for shapes. Familiar and unfamiliar figures (familiar figures consisting of those presented to subjects in a familiarization session) were presented to two groups of 26 subjects, with liking ratings tabulated for each figure. Means for familiar and unfamiliar figures were calculated for each subject and compared. The grand mean for liking of unfamiliar figures was significantly higher than that for familiar figures. Cantor explains these findings as a result of a “tedium producing property” inherent in familiar stimuli. Conversely, it is suggested that novel stimuli have unique properties that can arouse either a positive or a negative aesthetic response, and that these properties are lost with repetition and absent in familiar stimuli. If this were the case, one would intuitively expect a novel stimulus that initially arouses a positive response solely on the basis of its novelty (such as a catchy tune) to quickly lose its appeal with repetition. This is in fact the opposite of what the Mere Exposure Theory proposes, which predicts an increase or minimal change (for familiar stimuli) in liking with exposure instead. The contrasting results obtained by Zajonc’s and Cantor’s studies suggest that there is an unexplained element responsible for the discrepancy. In the context of affective response to music, certainly both scenarios implied by Cantor’s and Zajonc’s results make sense intuitively and can readily be seen in everyday life. Indeed, Russell (1987) points out that often there are times when a piece of music “grows on one”, becoming more attractive with repeated listening.
Equally common are situations in which a piece of music that is initially liked quickly becomes disliked with repeated listening. The question is, all else equal, what variable or combination of variables causes one piece of music to be liked more with repeated listening while another loses its appeal? The trends observed in a study by Hargreaves give some hint as to how these two scenarios can be reconciled.

Hargreaves (1980) incorporated 100 words selected randomly from every tenth page of the dictionary to observe the effects of familiarity on favorability. A between-subjects design was used, with subjects assigned to either a familiarity group or preference group. Words were printed onto index cards and subjects asked to sort them into five piles corresponding to levels of familiarity and liking. Mean scores for familiarity and liking were then calculated and graphed. A moderately significant linear relationship was observed between familiarity and liking when all words were included in the analysis. However, when words were separated into two groups, with the first group containing all words with a mean familiarity score of less than 2.5, and the second consisting of words with a mean greater than 2.5, two different trends were observed. For the first group, there was a positive correlation between liking and familiarity, while in the second, a negative correlation was observed. In addition, for all words combined, there were significant departures from linearity with a strong quadratic component observed in the relationship between familiarity and liking. Results from this experiment suggest that even in contexts where words alone are used, preference will not necessarily increase in a logarithmic fashion with increases in familiarity. Rather, liking may be related to familiarity in a more complex way than presumed by Zajonc. It is possible that familiarity, to some extent, acts as a confounding variable, obscuring more important influences of some unknown variable. Hargreaves suggests that Zajonc’s and Cantor’s conflicting results can be accounted for by considering the preference/familiarity relationship as occurring in the form of an inverted-U-shaped curve.
Hargreaves draws on Daniel Berlyne’s theory of aesthetic preference in an effort to explain the preference/familiarity relationship, and makes some slight adjustments to the shape of the Wundt curve for explaining this trend.
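
The split analysis described above can be sketched as follows. The data here are synthetic and generated from an assumed inverted-U relationship peaking at a familiarity score of 2.5; they are not Hargreaves' data, and the helper names are hypothetical. The sketch only shows how an inverted-U relationship yields a positive correlation below the split point and a negative correlation above it.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Synthetic mean familiarity scores on a 1-5 scale (assumed, not observed).
familiarity = [1.0 + 0.25 * i for i in range(17)]
# Assumed inverted-U: liking is highest at a familiarity score of 2.5.
liking = [4.0 - 0.4 * (f - 2.5) ** 2 for f in familiarity]

# Split the items at a mean familiarity of 2.5, as in the analysis described.
low = [(f, l) for f, l in zip(familiarity, liking) if f < 2.5]
high = [(f, l) for f, l in zip(familiarity, liking) if f > 2.5]

r_low = pearson([f for f, _ in low], [l for _, l in low])
r_high = pearson([f for f, _ in high], [l for _, l in high])
```

With these assumed values, `r_low` is positive and `r_high` is negative, while a single straight line fitted to all items would hide the change of direction.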

Berlyne’s Arousal Model and the Wundt Curve

Berlyne (1971) theorized that much of human behavior can be understood according to a drive to initiate levels of arousal that produce positive or rewarding experiences and minimize and avoid negative or painful experiences. These two poles, which can be understood as holding opposite points on a hedonic spectrum, are generally referred to as positive and negative hedonic value. Positive hedonic value encompasses pleasure, reward value, positive feedback, attractiveness and any sensations and motivations that signal events that benefit the well-being of the individual. Negative hedonic value includes pain, repulsiveness, punishment value, aversion, negative feedback, and any form of noxious sensation that signals events that may be injurious to the well-being and survival of the individual. Hedonic tone is argued to be directly linked to the intensity of sensations, with the arousal of hedonic sites in the brain leading to the instigation of hedonic experience and corresponding behavior patterns. Hedonic sites, consisting of an aversion system and a primary and secondary reward system, are linked by numerous pathways connecting them to one another and to sense organs and other brain structures. The aversion system inhibits the primary reward system, such that activation of the aversion system leads to de-activation of the primary reward system. In this way a pleasurable sensation can quickly become unpleasurable with arousal of the aversion system. The secondary reward system works in the opposite direction by inhibiting the aversion system. Activation of this system can indirectly lead to positive hedonic sensation by preventing the aversion system from inhibiting the primary reward system. Arousal of the reward and aversion systems can be achieved through stimulation of sense organs, release of chemicals in the blood that affect hormones and neurotransmission, and activation of brain structures which have connecting pathways to the brain stem.

Arousal is linked to hedonic tone via the Wundt Curve, which rests on the assumption that the firing thresholds of neurons in the brain are normally distributed, with the mean threshold for the aversion system higher than that of the primary reward system. Generalized or non-specific arousal can be said to initially activate the neurons in the primary reward system. As arousal increases beyond the primary reward system’s mean firing threshold, neurons in the aversion system begin to fire, decreasing hedonic tone. At very high levels of generalized arousal, the majority of neurons in the aversion system are excited and a negative hedonic experience is produced. This concept can be seen in the shape of the Wundt curve as the resultant of the algebraic sum of two separate and contrasting hedonic curves. The most efficient way of conceptualizing this is to reconstruct the Wundt curve using two hypothetical sets of data points distributed according to the normal distribution probability density function.

Fig. 3.2.

Kenkel (1996: 258)

The distribution function for the standard normal curve:*

Washington (1995: 596)

*For the standard normal curve the mean = 0 and S.D. = 1; thus the first equation reduces to the second.
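
For reference, the two equations referred to here can be written in their usual textbook form. The symbols μ (mean), σ (standard deviation), x, and z are assumed notation and may differ from that used by Kenkel and Washington. Setting μ = 0 and σ = 1 in the first expression gives the second.

```latex
% General normal curve with mean \mu and standard deviation \sigma
f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}

% Standard normal curve (\mu = 0, \sigma = 1)
f(z) = \frac{1}{\sqrt{2\pi}} \, e^{-\frac{z^{2}}{2}}
```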

The probabilities of a hypothetical set of data points varying along the x-axis can be approximated using the formula for the normal curve. For our purposes, the simplest method is to approximate relative frequencies and cumulative frequencies using the formula for the standard normal curve, then transpose the curve to the right on the x-axis to represent mean firing thresholds greater than zero. Next, the cumulative frequency curve for the aversion system must be flipped over on its side such that the curve descends towards an asymptote, paralleling the one approached by the cumulative frequency curve for the primary reward system. Lastly, the two curves are subtracted from one another and superimposed on a y-axis representing hedonic tone, to arrive at the final form of the Wundt curve. This process is described in the following steps.
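
The construction outlined above can be sketched numerically. This is a minimal illustration under stated assumptions: the mean firing thresholds (2 and 4), the asymptote heights (1.0 and 1.5), and all function names are hypothetical values chosen for the example, not figures taken from Berlyne. The cumulative curve for each system is a standard normal cumulative curve shifted to the right, and hedonic tone is the reward curve minus the aversion curve.

```python
import math


def normal_cdf(x, mean=0.0, sd=1.0):
    """Cumulative normal distribution, computed from the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def wundt(arousal, reward_mean=2.0, aversion_mean=4.0,
          reward_height=1.0, aversion_height=1.5):
    """Hedonic tone as reward activation minus aversion activation.

    All four parameters are illustrative assumptions. The aversion system
    has the higher mean threshold and the higher asymptote.
    """
    reward = reward_height * normal_cdf(arousal, reward_mean)
    aversion = aversion_height * normal_cdf(arousal, aversion_mean)
    return reward - aversion


# Evaluate the curve over arousal levels 0.0, 0.1, ..., 10.0.
arousal_levels = [i * 0.1 for i in range(101)]
curve = [wundt(a) for a in arousal_levels]

# Locate the arousal level at which hedonic tone is highest.
peak_index = max(range(len(curve)), key=curve.__getitem__)
peak_arousal = arousal_levels[peak_index]
```

Under these assumed values the curve starts near zero, rises to a positive peak between the two mean thresholds, then falls through zero and levels off at a negative value equal to the difference between the two asymptote heights.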

Fig. 3.3. Calculation of approximate relative frequencies (probabilities) for x values of the standard normal curve and its transposition.

Fig. 3.4. Graph of the standard normal curve and two transposed curves in relative and cumulative relative frequency form.

At this point the graphs of the relative frequencies can be disregarded, and the cumulative relative frequencies conceived of in terms of cumulative frequencies. One can picture the aversion system as having a cumulative frequency curve which reaches an asymptote higher up on the y-axis than the primary reward system, as the degree of activation (or number of excited neurons) increases to a higher level than that reached by the primary reward system. In order to identify the hedonic polarity of each system, the negative hedonic curve must be flipped over on its side, and the y-axis extended down, with descent from the origin indicating increased neural excitation in the aversion system. This is depicted in the following diagram.

Fig. 3.5. Graphs of the two hedonic curves

Berlyne (1971: 88)

At point X1 the aversion system begins to activate as generalized arousal has gone well past the mean firing threshold (represented by the blue line) for the primary reward system and is approaching the mean for the firing threshold of neurons in the aversion system. The distance YA must be larger than the distance YR, otherwise summation of the two curves would simply yield a curve that does not dip down into the negative hedonic value region. If the two distances are equal in length, the lowest level reached is zero, while if YR is larger, then the curve will flatten out to an asymptote above zero, indicating positive hedonic tone. Clearly, this is not appropriate as negative hedonic tone must be represented in the model. Consequently, the distance YA must be greater than YR. The final stage entails subtraction of the two curves, and substitution of hedonic tone for degree of activity on the y-axis.
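
The constraint on the two distances can be stated compactly. The notation below is assumed for illustration: Φ is the standard normal cumulative distribution function, μ_R and μ_A are the mean firing thresholds of the primary reward and aversion systems, and Y_R and Y_A are the heights of the two asymptotes.

```latex
% Hedonic tone as the difference of the two cumulative curves
H(x) = Y_R \, \Phi(x - \mu_R) - Y_A \, \Phi(x - \mu_A), \qquad \mu_A > \mu_R

% Since \Phi \to 1 as x \to \infty, the curve levels off at
\lim_{x \to \infty} H(x) = Y_R - Y_A

% which lies below zero only when
Y_A > Y_R
```

If Y_A = Y_R the limit is zero, and if Y_R > Y_A the limit is positive, which matches the three cases described in the paragraph.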

Fig. 3.6. The Wundt Curve depicted defining the relationship between arousal potential and hedonic tone.

Berlyne (1971: 89)

The resultant curve reaches a peak at the point at which all neurons in the primary reward system have been activated and the aversion system begins to be activated. As arousal potential increases further, hedonic tone decreases until the point of indifference is surpassed and negative hedonic tone experienced. The curve descends towards an asymptote at the extreme end in the negative hedonic tone direction. This represents a person’s tendency to exhibit symptoms of low arousal at the upper limits of the body’s negative hedonic capacity. The body can shut down and a person can go into shock in cases where physical damage to the body is absorbed. High levels of mental stress can lead to psychotic depression and catatonic states. Pavlov, who observed that dogs tend to cower and give up when subjected to high levels of inescapable electric shock, refers to the phenomenon as supra-maximal inhibition and considers it a last-ditch protective device of the body (Berlyne 1971). It makes little sense, mathematical constraints aside, for the curve to continue indefinitely to ever increasing levels of negative hedonic tone, as the body seems limited in its capacity to experience negative sensations. At some point behavioral correlates are lost, and it becomes difficult to suggest that negative hedonic value is becoming more negative. If anything, at the most extreme levels of arousal potential, behavior becomes symptomatic of states of low arousal, in which case the curve could well be seen to increase to higher levels of hedonic tone rather than level off. Clearly this is inappropriate, as death or serious injury may follow shortly at levels of generalized arousal beyond this point.

Although response to aesthetic objects rarely approaches extremes in either direction, it has been well documented that artists do go into to through compositional inspiration, and audiences can react very strongly to artistic objects. This is perhaps best illustrated by considering the reaction of the first audience to Stravinsky’s The Rite of Spring at its premiere in Paris, in which much of the audience became physically ill in the hallways. One can only speculate that the constant barrage of dissonant and irregular rhythms, coupled with jarring brass, must have jolted those of more traditional tastes out of their dinners. Many of the characteristics that must have proven so indigestible at the ill-fated concert can be summed up to constitute a specific class of variables that Berlyne suggests is of the greatest consequence in determining hedonic tone.

Berlyne used the term “collative” to designate these variables, as they require a summing up of elements of a stimulus (complexity), or comparison of a stimulus attribute to the same attribute on another stimulus (novelty). The remaining determining variables include those Berlyne terms psychophysical and ecological. Psychophysical variables include the physical properties of objects such as rhythm, timbre, and sound intensity. Ecological variables are those factors that have been consistently accompanied by significant biological occurrences. To cite Berlyne’s example (1971: 69), the rhythmic stroking of a mother’s caress for a young child is associated with safety and the absence of pain. The rattle or hissing noise produced by a snake is for some animals a signal of danger and a threat to survival. All variables can be classified under these three general headings, and each contributes to the arousal potential of an object.

Although the curve relates hedonic value to arousal potential, which is defined as the aggregate of stimulus variable arousal potentials, it is possible to conceive of the contribution of each stimulus factor (such as novelty) to hedonic value in isolation from the others, in the manner of the Wundt curve. In view of this, zero novelty is represented at the point where the x-axis meets the y-axis, with novelty increasing at points further to the right along the x-axis. As can be seen at point “La” (see Fig. 3.6), stimuli with very little novelty have very little effect on hedonic value, producing states of indifference in the subject. With increases in novelty, hedonic value becomes positive, reaching a peak at an intermediate level of novelty, represented by point “X1”. Further increases in novelty reduce hedonic value until a point is reached at which the stimulus is so novel that it is disliked, represented by the descending portion of the curve at points beyond “X2”.

Hargreaves (1980) relates favorability to familiarity in the form of an inverted U-shaped curve with the abscissa presented in Berlyne’s depiction of the Wundt curve, reversed, as familiarity is understood as being the opposite of novelty. In this depiction, high levels of familiarity are represented on the right side of the peak of the curve and low levels on the left.

Fig. 3.7. Hargreaves’ depiction of the curve relating favourability to familiarity

Hargreaves (1980: 163)

As can be seen, the curve now descends below zero on both sides of the peak. In addition, the curve is quadratic, whereas the Wundt curve has its origins in the Gaussian distribution. It is not clear if Hargreaves intended this, but the theory underlying the Wundt curve certainly does not allow for conflation of functions for describing the shape of the curve relating preference to arousal. The Wundt curve specifically incorporates the Gaussian distribution, which is integral to the foundations of Berlyne’s arousal model and the construction of the curve itself. However, it is safe to assume that the general contour of the Wundt curve (with the slight adjustments mentioned earlier), which in this context can perhaps be best described as an inverted U-shaped curve, is Hargreaves’ primary intention. In Hargreaves’ curve, in a similar fashion to the Wundt curve, an intermediate level of familiarity elicits maximal levels of preference, while levels to either side of this optimal point produce lower levels of liking. This general shape does mirror the principal concept underlying the Optimal Complexity Model, which relates preference to both familiarity and complexity in this manner.

The Optimal Complexity Model

The Optimal Complexity Model is an alternative to the Mere Exposure Theory for explaining preferences. According to this model, every stimulus has objective characteristics that combine to form an objective and unvarying level of complexity. Each individual perceives these physical characteristics in a unique way, such that each stimulus has a corresponding subjective level of complexity, determined on the one hand by the features of the stimulus itself, and on the other, by the person perceiving the object. Every individual can be regarded as an information processing system that has a limited capacity for assimilating, processing, and responding to stimuli. For each person there is an optimal or preferred level of information that is sought at any point in time. This level is dependent on the information processing capacity of the individual and can vary substantially depending on the experience of the subject, the objective complexity of the stimulus, and the context in which it is perceived. Preference for an object is thus determined by the degree to which its subjective complexity lies above or below the individual’s optimum level of complexity.

Fig. 3.8. The Optimal Complexity Model

The Optimal Complexity Model relates liking to complexity in the form of an inverted U shape function, with liking increasing with increases in subjective complexity before reaching an optimal level. From this point, further increases in subjective complexity result in decreases in liking. Lower levels of liking on the rising side of the curve reflect how very simple stimuli will tend to have little aesthetic value as a result of an information content level below the optimum for an individual. In this case, the stimulus is boring or uninteresting. As complexity increases, liking increases until an optimum point is reached producing a maximal level of liking. Further increases in complexity frustrate the subject. In this case, the stimulus has too much information to be assimilated by the observer at a given point in time. The stimulus has surpassed the information processing capacity of the subject, causing a reduction in liking. Familiarity has the effect of reducing subjective complexity levels such that with repeated exposure of a stimulus familiarity increases monotonically, decreasing subjective complexity and increasing liking. In this way a complex piece of music, which normally lies beyond the optimally preferred level of complexity on the inverted U shape curve producing low levels of liking, will decrease in subjective complexity with repetition, producing higher levels of liking.
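
The relationship described above can be illustrated with a small numerical sketch. The functional forms are assumptions made purely for illustration and are not taken from the literature reviewed here: liking is modelled as a downward-opening parabola centred on a hypothetical optimum, and each exposure is assumed to shrink subjective complexity by a fixed proportion. The names `OPTIMUM`, `DECAY`, `subjective_complexity`, `liking`, and `liking_over_exposures` are likewise hypothetical.

```python
# Toy sketch of the Optimal Complexity Model (illustrative assumptions only).
OPTIMUM = 5.0   # hypothetical optimal level of subjective complexity
DECAY = 0.85    # assumed proportion of subjective complexity retained per exposure


def subjective_complexity(initial: float, exposures: int) -> float:
    """Assumed rule: each exposure multiplies subjective complexity by DECAY."""
    return initial * DECAY ** exposures


def liking(complexity: float) -> float:
    """Assumed inverted U: liking peaks at OPTIMUM and falls off on either side."""
    return -(complexity - OPTIMUM) ** 2


def liking_over_exposures(initial: float, n_exposures: int) -> list:
    """Liking after 0, 1, ..., n_exposures hearings of the same stimulus."""
    return [liking(subjective_complexity(initial, k)) for k in range(n_exposures + 1)]


if __name__ == "__main__":
    complex_piece = liking_over_exposures(initial=9.0, n_exposures=8)
    simple_piece = liking_over_exposures(initial=5.0, n_exposures=8)
    for k, (c, s) in enumerate(zip(complex_piece, simple_piece)):
        print(f"exposure {k}: complex piece {c:7.2f}   simple piece {s:7.2f}")
```

Under these assumptions, the piece that starts above the optimum gains in liking over the first few exposures and then declines once its subjective complexity falls below the optimum, whereas the piece that starts at the optimum can only lose liking with repetition.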

Edward Walker (1981), who has largely been responsible for developing the Optimal Complexity Model, applies it to a wide range of contexts and in fact introduces complexity as a fundamental determinant of all human behavior. In this way, the individual is equated to a hedgehog, which has one response to stimuli, and that is to roll up into a ball. Similarly, although humans have numerous ways of responding to stimuli, human behavior in general is determined almost exclusively by a preference for optimal levels of complexity in the environment and its experience. The theory becomes the hedgehog as it explains all behavior in terms of one principal factor, psychological complexity. Humans can be understood as responding principally to the need to maintain optimal psychological complexity levels in the stream of psychological events. Thus, in a manner of speaking, humans also become a hedgehog of a sort, as motivation and behavior are boiled down to response to a single construct. The following diagram produced by Walker best illustrates this concept.

Fig. 3.9. The Hedgehog Theory of Behavior

Walker (1980: 11)

In this diagram the solitary circle to the left represents the current psychological event experienced by an individual. Of options “a” to “e”, event “b” has a complexity level closest to the optimum for the individual and consequently is chosen over the other events. Following experience of event “b” is a decline in its corresponding subjective complexity level, so that the event with complexity closest to the optimally preferred level is now event “d” which is consequently chosen. Following this, and the subsequent decrease in that event’s subjective complexity level, event “c” is chosen as this event is no longer familiar and once again has a complexity level closest to the optimum. The same procedure continues, resulting in the subject choosing event “e” followed by event “a”. Although this diagram depicts the choice of options of an individual when the person has control over what choices he can make, it is obvious that we do not always have complete power over our lives. Some “options” may be imposed on us, thus the individual will experience these events according to where they lie on the complexity/preference curve. If the event is extremely complex the subject will experience a corresponding level of negative emotion, which may elicit avoidance behavior. Of course, in cases where physical harm is imminent, complexity of the threatening event may not be the issue, yet the response may be the same, that is, the subject will react to avoid the event. In this way, Walker makes the optimal complexity model a behavioral model, explaining human behavior in terms of basic drives equivalent in some respects to those of hunger, thirst, and sexual satiation.
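
The choice procedure described for the diagram can be sketched as a simple loop: pick the event nearest the optimum, lower its complexity, and repeat. The numerical values below are hypothetical, chosen only so that the sketch reproduces the order of choices described in the text; they are not Walker’s values, and the rule that an experienced event retains a fixed fraction of its complexity is an assumption.

```python
# Toy sketch of the choice sequence in Walker's diagram (all values hypothetical).
OPTIMUM = 5.0     # assumed optimal complexity level for the individual
RETAINED = 0.25   # assumed fraction of complexity an event retains once experienced


def choose_events(complexities: dict, n_choices: int) -> list:
    """Repeatedly pick the event closest to OPTIMUM, then lower its complexity."""
    levels = dict(complexities)  # copy so the caller's values are left untouched
    chosen = []
    for _ in range(n_choices):
        # The preferred event is the one whose complexity is nearest the optimum.
        event = min(levels, key=lambda name: abs(levels[name] - OPTIMUM))
        chosen.append(event)
        # Experiencing the event reduces its subjective complexity.
        levels[event] *= RETAINED
    return chosen


if __name__ == "__main__":
    # Hypothetical starting complexity levels for events "a" to "e".
    events = {"a": 8.0, "b": 5.2, "c": 6.2, "d": 4.5, "e": 7.0}
    print(choose_events(events, 5))
```

With these starting values the events are chosen in the order b, d, c, e, a. Different starting values or a different familiarity rule would give a different order, which is why the sketch should be read as an illustration of the mechanism rather than a reconstruction of the diagram.
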
Although numerous studies in the experimental aesthetics and music preference literature corroborate the general findings of the Optimal Complexity Model within a limited context, Walker’s application of the model to a wide range of contexts as a general behavioral principle is perhaps somewhat ambitious, and at any rate, in terms of detailed commentary, is beyond the scope of this paper.

The Concept of Complexity

Before assessing the Optimal Complexity Model in depth, it makes sense to clarify definitions of complexity and their means of measurement. No single definition of complexity is consistently used. Although definitions appear to vary according to context and disciplinary field, there is some consistency in the use of the term to describe certain qualities. These commonalities are centered on the concept of variety. A complex object may have more interacting parts than another, or a task may consist of a larger number of different events within a sequence than another. Each of these examples shares this common trait, which describes the concept. Measures of complexity, on the other hand, rarely capture the full meaning of the concept, are numerous, if not limitless in number, and may differ dramatically depending on context and researcher. For example, the measure of complexity that is the focus of attention of computational complexity theory is the time and space required by a computer to solve a problem. Here the measure of complexity is based on the length of running time of the algorithm used to solve the problem and the space needed for recording the computation (Urquhart 2003: 19). In music preference studies, a researcher may order objects on a complexity continuum depending on a single factor, such as the number of different chords in a progression. Consensual complexity is derived from the tabulation of the complexity ratings of numerous persons and is generally obtained by calculation of the mean of this set. The first two measures in these examples are objective in nature as they occur on a ratio and interval scale and conform to a universal standard. The last measure can be viewed as a subjective measure as the ratings are dependent on numerous individual perceptions that occur at an ordinal level and do not conform to any recognized standard.
Although the objective measures of complexity mentioned above can be very useful for experimental purposes, they only reflect specific dimensions of the broader concept, while the subjective measure is in want of explanation. The definition of complexity incorporated by motivational researchers given below, and which can be understood as representing subjective or psychological complexity in the optimal complexity model, is an attempt to fully capture the nature of the concept. Complexity is defined by Dember & Earl (1957: 94) as the discrepancy of the probability of the occurrence of stimulus components from the expectation of their occurrence. This definition represents a theoretical construct “conceptually equivalent” to the information theory metric for amount of information, or more specifically, entropy, defined as observational variety or actual diversity and expressed by Shannon’s formula:

Fig. 3.10.

Krippendorff (1986: 16)
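
In its standard form, Shannon’s measure is H = -Σ p_i log2 p_i, where p_i is the probability of the i-th outcome; the notation in Krippendorff may differ in detail. A minimal sketch of the calculation follows, with function names (`informativeness`, `entropy`) and the example distributions chosen here purely for illustration.

```python
import math


def informativeness(p: float) -> float:
    """Information, in bits, conveyed by an outcome of probability p."""
    return -math.log2(p)


def entropy(probabilities) -> float:
    """Shannon entropy in bits: the probability-weighted mean informativeness."""
    # Outcomes with zero probability contribute nothing to the sum.
    return sum(p * informativeness(p) for p in probabilities if p > 0)


if __name__ == "__main__":
    four_equal = [0.25] * 4          # four equally likely outcomes
    six_equal = [1 / 6] * 6          # six equally likely outcomes
    skewed = [0.7, 0.1, 0.1, 0.1]    # hypothetical unequal probabilities
    print(entropy(four_equal))
    print(round(entropy(six_equal), 2))
    print(round(entropy(skewed), 2))
```

When all N outcomes are equally likely the sum reduces to log2 N, so the weighting only matters when probabilities differ, in which case the entropy is lower than log2 N.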

In order to explain how this equation comes about, it is useful to take the example given by Floridi (2003) of a coin tossing experiment. If two coins are tossed the following four outcomes are possible: <h,h>, <h,t>, <t,h>, <t,t>. The amount of information inherent in each symbol (in this case, a symbol pair, i.e. <-,->) is defined by the reduction in logical possibilities or the decrease in data deficit, also known as Shannon’s uncertainty, given by the occurrence of that outcome or symbol pair. Thus, the amount of information given by the symbol <h,t>, or a heads and tails outcome, is given by the logarithm to the base two of the number of logical possibilities N, where use of logarithms represents the method of counting options by the value of the exponent. In this case there are four equally likely outcomes, so that for any symbol pair the amount of information received is equal to log2 N, or log2 4 = 2 bits. Similarly, upon determining the outcome of a six-sided die rolling experiment where each outcome is equi-probable, 2.58 bits (log2 6) of information is obtained. In cases where outcomes of an experiment are not equi-probable, our measure must take into account the probability of each symbol occurring, defined by the frequency of that symbol divided by the total frequency of all symbols. This is represented in the following extraction:

Fig. 3.11.

Krippendorff (1986: 14)

Where NA is the total frequency, Na is the frequency of the symbol, and I = informativeness = -log2 Pa. Although this particular expression, taken from Krippendorff (1986), is formulated toward explaining the equivalence of different forms in the context of amount of information in a transmitted message, it can be directly applied in our coin tossing experiment. If, on four separate trials, the outcomes are <h,h>, <h,t>, <t,h>, and <t,t>, then using equation 2, NA = 4 and Na = 1. Incorporating this, we find that for the symbol <h,t> the amount of information obtained is 2 bits, where I = -log2 (1/4) = -log2 (.25) = 2. In the first equation we arrive at the same result, as it simply sums the weighted informativeness of each symbol, where each weight = ¼ = .25. Thus the computation (.25)(2) + (.25)(2) + (.25)(2) + (.25)(2) also yields 2 bits of information. If it is known that outcomes are equi-probable (and it can be assumed that given an infinite number of n trial experiments, on average outcomes will have equal frequencies), as in cases where fair coins are used, the weightings are not needed. In cases where the probability of events differs for each symbol, such as in the case of determining the information content of a string obtained using unfair coins, weightings are needed. An additional weighting is required in more complicated cases, such as Markov chains, where occurrences of symbols are interdependent. Here the occurrence of a symbol may depend on the occurrence of a specific number of other symbols. This is exemplified in the illustration given by Edwards (1964: 46).

Fig. 3.12. Sequence of letters: BACAABADACBADAABACADABAACBABBADACBADAABA

The frequencies of each letter are: A=20, B=10, C=5, D=5. The total number of letters = 40, and the probability of each letter is found by dividing its frequency by the total frequency: A=20/40=.50, B=10/40=.25, C=5/40=.125, D=5/40=.125.

The frequency of each letter following each other letter can be represented in the form of a matrix of proportions, where letters on the top row represent the frequency of the letter following the corresponding letters in the columns. As can be seen, A follows an A, four times out of a total of 19 cases in which A is followed by another letter (no letters follow the last A hence proportion is 4/19 instead of 4/20). B follows an A, five times out of a total of 19 cases, A follows a B, nine times out of 10 possible cases and so on.

Fig. 3.13. Matrix of the proportions for each letter following every other letter

Edwards (1964: 47)
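
The proportions in the matrix can be recovered directly from the letter sequence in Fig. 3.12. The following is a minimal sketch; the function names are hypothetical, and the output is laid out with one row per preceding letter, which may not match the orientation of Edwards’ table.

```python
from collections import Counter

SEQUENCE = "BACAABADACBADAABACADABAACBABBADACBADAABA"


def transition_counts(sequence: str) -> dict:
    """Count how often each letter is immediately followed by each letter."""
    counts = {}
    for current, following in zip(sequence, sequence[1:]):
        counts.setdefault(current, Counter())[following] += 1
    return counts


def transition_proportions(sequence: str) -> dict:
    """Convert the counts for each preceding letter into proportions."""
    proportions = {}
    for current, followers in transition_counts(sequence).items():
        # The total is 19 for A because the final A has no following letter.
        total = sum(followers.values())
        proportions[current] = {nxt: n / total for nxt, n in followers.items()}
    return proportions


if __name__ == "__main__":
    counts = transition_counts(SEQUENCE)
    for current in sorted(counts):
        total = sum(counts[current].values())
        row = ", ".join(f"{nxt}: {counts[current][nxt]}/{total}" for nxt in "ABCD")
        print(f"after {current} -> {row}")
```

Counting adjacent pairs in this way confirms the proportions quoted in the text: 4/19 for A following A, 5/19 for B following A, and 9/10 for A following B.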

From here application of the first formula results in:

(-4/19 log2 4/19) + (-5/19 log2 5/19) + (-5/19 log2 5/19) + (-5/19 log2 5/19), which is equal to (.473) + (.507) + (.507) + (.507), or approximately 1.994 bits of information given by identification of the letter following letter A. However, we must now take into account the probability of the occurrence of A in the sequence, as the probabilities for letters following A can only be valid if they actually follow an A with the probability of occurring P(A). The second weighting procedure is arrived at by multiplying the initial result by the probability of A = .50. This is done for each letter with final products summed to give us the average amount of information gain per letter, given interdependencies.

Fig. 3.14. Information obtained for identifying letters following letters A to D.

Edwards (1964: 48)

The following expression describes the entire calculation.

Fig. 3.15.

Edwards (1964: 48)

As can be seen, the greatest amount of information is yielded by identification of the letter following A, with .997 bits obtained. This makes sense as A is the most frequent letter, and intuitively one would expect more variability in the outcomes of letter pairs as a result. By contrast C, which occurs only five times, should vary less with regard to the outcomes of its letter pairs, and indeed it does, as reflected in the corresponding low information value (.121 bits). By summing all of the products (.997 + .117 + .121 + .000 = 1.235) we are calculating the average information obtained by identification of a letter following any other letter in the chain, and as such have a specific value which can be understood as reflecting the complexity of the chain. Thus, another Markov chain in which the summed products yield a higher information value, say 124 bits, can be said to be more complex than the other.
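
The whole calculation, including the second weighting by the probability of each preceding letter, can be checked against the sequence in Fig. 3.12. This is a sketch only; the function names are hypothetical, and P(A) is taken as 20/40 = .50 as in the text, while the proportions for letters following A are taken over the 19 cases in which A has a follower.

```python
import math
from collections import Counter

SEQUENCE = "BACAABADACBADAABACADABAACBABBADACBADAABA"


def entropy(counts) -> float:
    """Shannon entropy, in bits, of a collection of frequency counts."""
    counts = list(counts)
    total = sum(counts)
    return sum((n / total) * math.log2(total / n) for n in counts if n > 0)


def weighted_information(sequence: str) -> dict:
    """Entropy of the letters following each letter, weighted by that letter's probability."""
    letter_counts = Counter(sequence)
    followers = {}
    for current, following in zip(sequence, sequence[1:]):
        followers.setdefault(current, Counter())[following] += 1
    return {
        letter: (letter_counts[letter] / len(sequence)) * entropy(followers[letter].values())
        for letter in sorted(followers)
    }


if __name__ == "__main__":
    contributions = weighted_information(SEQUENCE)
    for letter, bits in contributions.items():
        print(f"{letter}: {bits:.3f} bits")
    print(f"total: {sum(contributions.values()):.3f} bits per letter")
```

The per-letter contributions agree with the values quoted above to three decimal places, and the total agrees with 1.235 to within rounding, since the text sums values that have already been rounded.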

The conception of the complexity of visual perceptions as defined in terms of the information theory metric is described by Dember & Earl (1957) in accordance with Coombs’ (1951) measurement theory. For a stimulus “j”, at moment “h”, for subject “i”, Q is the measure of some attribute. Following exposure to this stimulus, retention of information will result in the expectancy C for that attribute of the stimulus. If the stimulus is altered on that attribute and the subject again is exposed to it, there will be some discrepancy between the expected value C of that attribute and the altered observed value Q. Hence, the absolute value of this discrepancy is given by Phij = |Chij - Qhij|. Consider an example parallel to the one given by Dember and Earl of a pattern of two different and adjacent textures in which each texture is perceived in separate scans. In the first scan, a value for C is obtained, while in the second, the adjacent texture is perceived as Q, and the value for P is obtained. As the discrepancy increases, so does the surprise value of the stimulus (pattern) component and the amount of information obtained in the succeeding scan, given according to the graph of the function -Pi log2 Pi.

Fig. 3.16. The information curve.

Edwards (1964: 41)

For the case of component A: P(A) = 4/19

-Pa log2 Pa = -(4/19) log2 (4/19) = [-(.21) x log10 .21] / log10 2* = .142/.30 = .47

For the case of component B: P(B) = 5/19

-Pb log2 Pb = -(5/19) log2 (5/19) = [-(.26) x log10 .26] / log10 2 = .152/.30 = .51

*The calculation requires transformation of the logarithm to the base 2 into an expression incorporating common logarithms, where logb X = loga X / loga b, a = 10 (the base of common logarithms), b = 2, and X = the value whose logarithm we are trying to find.
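
The change of base used in these calculations can be sketched as follows. The function name `information_term` is hypothetical, and the search for the peak of the curve over a grid of probabilities is an illustrative addition rather than part of Edwards’ presentation.

```python
import math


def information_term(p: float) -> float:
    """Value of -p * log2(p), computed with common (base 10) logarithms."""
    if p <= 0:
        return 0.0  # by convention the term vanishes as p approaches zero
    log2_p = math.log10(p) / math.log10(2)  # change of base: log2 x = log10 x / log10 2
    return -p * log2_p


if __name__ == "__main__":
    for label, p in (("A", 4 / 19), ("B", 5 / 19)):
        print(f"component {label}: {information_term(p):.3f}")
    # The curve rises from zero, reaches a single peak, and falls back to zero at p = 1.
    peak_p = max((k / 1000 for k in range(1, 1000)), key=information_term)
    print(f"largest value near p = {peak_p:.3f}")
```

Without rounding the intermediate values, the two terms come out at roughly .473 and .507. The curve peaks where p equals 1/e (about .368), so components with intermediate probabilities contribute most to the total.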

The construction of universal standards of measurement for subjective perceptions has been a constant pursuit of researchers working in the domain of psychometrics, who have largely modeled their endeavors after the fruits of physics (Coombs 1951: 1). Although consideration of this issue in any depth is beyond the scope of this paper, the approach adopted by Dember & Earl is an example of such an endeavor and entails ranking perceptions according to hypothesized values for the expected value C and their discrepancy from observed value Q. The subsequent size of the discrepancy for a component can then be located on the curve according to the hypothesized decrease in uncertainty, or data deficit, given by the stimulus component, and perceptions ordered. A large discrepancy will reflect a small probability of the occurrence of Q, while no discrepancy occurs when C = Q, and Q can be said to be completely redundant, having a probability of occurring = 1.00. In this case no information is obtained as the perception of the stimulus on some attribute does not vary or change.

In two studies by Vitz (1964, 1966), to be reviewed later, the usefulness of the information metric for representing complexity is assessed. At present, it is sufficient to know that measures of complexity vary, with some making greater progress in attempting to capture the full concept than others. Although the mathematical theory of communication and its derivation for the quantitative measurement of information is widely accepted, its application to experimental aesthetics as a means for explaining the concept of psychological complexity appears to have less solid grounding, and is only briefly mentioned here as an entry point to the literature.

Assessing the Optimal Complexity Model

Helen Mull (1957) conducted one of the early studies incorporating music excerpts in order to observe the effects of repetition on liking. Although this study was executed before the Optimal Complexity Model attained its current recognizable form, results prefigure those of later studies that more clearly demonstrate the plausibility of the model. In Mull’s design, a Schoenberg excerpt and a Hindemith excerpt were played to subjects individually in two one-hour sessions, with subjects arranged into two groups with the order of presentation of excerpts reversed for each group. Subjects rated liking after each hearing and raised their hands for passages that were particularly liked or disliked less. Liking increased with exposure frequency, with a high degree of consistency in liking for specific passages. The most frequent reasons cited for liking a passage were melody and lack of dissonance. These results suggest that an intermediate level of complexity was most preferred, characterized by simple melodic phrasing and less dissonance. This particular design, like Bradley’s design, is perhaps not ideally suited to assessing exposure effects, as it only includes measures of preference on two separate occasions. In this respect results can be interpreted as supporting both the Optimal Complexity Model and the Mere Exposure Theory. However, the fact that the most preferred passages appeared to be the ones exhibiting an intermediate level of complexity does suggest that the answer to Mull’s question as to whether repeated listening brings out the charms in music is yes, it may, by bringing subjective complexity to an optimal level.

Hargreaves (1984), in a similar manner to Mull, investigated the Optimal Complexity Model by having subjects rate music excerpts. In the first experiment, 59 subjects, consisting of undergraduates and adult education students, met in three separate classes for three hours per week in a “Psychology of Music” course. Subjects listened to an easy listening excerpt and a jazz excerpt three times each at different points in the session. Subjects rated each excerpt at each listening for preference and familiarity on a five-point rating scale. Analysis of variance revealed that the easy listening piece was rated as more familiar and more liked. In addition, there was a significant interaction of the “piece” and “playing” variables for both the familiarity and preference scores, and a significant main effect of “playing” for the familiarity scores. However, there was no significant difference in liking for excerpts across playing (i.e. no main effect of “playing” for liking scores). It should come as no surprise that familiarity increased each time the selection was played; however, the lack of effect of repetition on liking does not conform to changes predicted by the Optimal Complexity Model. Despite this, the significant interactions can be explained as a result of the different complexity levels of the two excerpts. The jazz piece may have been perceived as more complex than the easy listening piece, with subjective complexity levels for the jazz excerpt further away from the optimum level than the easy listening piece. With repetition, the subjective complexity of the easy listening piece was reduced to a point below the optimum level producing a reduction in liking. For the jazz piece, subjective complexity shifted closer to the optimum level, producing increases in liking. Although no main effect for the playing variable was evident, the significant interaction does suggest that some form of the Optimal Complexity Model was at work in this case.

In the second experiment, a popular, a jazz, and a classical excerpt were presented to subjects in three exposure sessions held at one-week intervals, with each excerpt played four times per session. The added number of exposures resulted in a larger sample of data, with scores covering a larger range of the objective familiarity variable. In addition, the combination of continuous exposure and exposure at intervals made it possible to observe the strength of recovery effects. Preference means for the popular and classical excerpts across listening over weekly sessions correspond to predictions made by the optimal complexity model. If it can be assumed that pop excerpts are less complex than classical excerpts for the average listener, then according to the model, one would expect pop excerpts to reach an optimal complexity level sooner than the classical excerpt. This would be reflected in liking means for the pop excerpt converging to a peak before those for the classical excerpt. Further, a decline in liking for the pop excerpt would be expected to occur as a result of subjective complexity decreasing below an optimum level before onset of a decline in liking for the classical excerpt. If avant-garde jazz can be assumed to be more complex for the average listener than both classical and pop excerpts, we would expect the same pattern of preference but occurring at later stages of the repetition sequence. In fact, the graph of the means indicates that preference trends for classical and pop excerpts did correspond to the predicted pattern.

Fig. 3.17. Subjects’ liking means for music excerpts

Subjects’ familiarity means for excerpts

Hargreaves (1984: 44)

Liking for pop excerpts rose to a peak sooner than did liking for the classical excerpt. In addition, onset of a gradual decline in liking occurred for pop excerpts roughly at week two, while for classical excerpts the decline in liking began to occur at about the beginning of week three, again mirroring the trend predicted by the Optimal Complexity Model. However, there was very little change in liking across all listening sessions for the avant-garde jazz excerpt. This result does not conform to the expected pattern in preference, and suggests that the jazz excerpt was so unfamiliar that as a consequence it had a subjective complexity level well beyond the optimum level, such that repetition effects had a minimal influence on simplifying the object and moving subjective complexity closer to an optimum level. As a result, only a fractional change in liking was observed. Indeed, Hargreaves suggests that for any excerpt and any individual, there is an optimal discrepancy between the objective complexity of an excerpt and its corresponding subjective complexity, such that any constant increases in familiarity produce a maximal increase in liking. This may have led to the conspicuousness of the changes in preference for the pop and classical excerpts. By the same token, any deviation from this optimal level of discrepancy may produce progressively smaller increases in liking with any constant increase in familiarity. The latter may have been the case for the jazz excerpt. This notion is further corroborated by the fact that familiarity for each excerpt tended to increase at a similar rate across listening sessions.

A particularly striking pattern was the lack of inverted U effects within listening sessions. Only in the first week of listening for the pop excerpt was there a rise, followed by a decline in liking, indicating that recovery effects had very little influence in this experiment and that inverted U trends may be more prominent in designs incorporating repetition at intervals rather than repetition within sessions. That said, presumably at some point the time between intervals will be so great as to render familiar excerpts unfamiliar, displacing preference and familiarity to an original level at each interval and decreasing the likelihood of observing inverted U shaped trends in preference. This suggests that inverted U effects can best be studied by incorporating designs in which time intervals between listening sessions are of intermediate length, with listening sequences consisting of a sufficient number of sessions to adequately sample both sides of the inverted U curve.

Another interesting aspect of the results obtained in this study, which may lend greater nuance to the nature of the Optimal Complexity Model, lies in the consistency of ranking of both preferences and levels of familiarity for style excerpts across all listening sessions. The pop excerpt received consistently higher ratings than both the classical and jazz excerpts, while the classical excerpt held the middle ground, and the jazz excerpt the low ground. These ranks held despite changes in ratings over time, indicating that for relatively homogeneous subject groups, although ranks may change for pieces within a style, ranking is not as susceptible to change for pieces between styles. This suggests that inverted U effects are tempered by a tendency for subjects to maintain “watertight divisions” between styles of music, thus limiting the ability for repetition and changes in subjective complexity to elicit corresponding changes in preference. This may explain why by week three, preference for the popular excerpt did not dip below the preference level reached by the classical excerpt, but rather, as though it were an asymptote, moved as close to it as possible without crossing boundaries. Hargreaves speculates that an element of “musical prejudice”, or stereotyping, may be responsible for the consistency in ranking of ratings, and that further research might do well to attempt to “disentangle” inverted U effects from stereotyping effects in order to accurately determine the strength of each of their influences.

Heyduk (1975) introduces measures of complexity into the design in order to observe the effects of both complexity and repetition on liking. Four piano compositions varying in complexity were played to 120 university students and initial liking and subjective complexity ratings obtained. Following this, subjects were divided into four groups in which one of each of the compositions was played 16 times, with liking and complexity ratings obtained in a final listening session in which all four compositions were played once for each person. Of immediate concern in this study is the problem of objectively measuring the complexity of each composition, such that they can be arranged on a continuum of increasing complexity. This problem was resolved by way of the compositional process. The length and basic structure of each piece was made equivalent, resulting in a set of three variations on an original theme. This ensured that pieces all had common attributes and only varied on the experimental characteristics of interest in the study. Complexity was manipulated by systematically varying the chord structure (number of different chords used) and rhythm of each variation in the following manner:

Fig. 3.18. Three variations on an original theme by Ronald Heyduk

Heyduk (1975: 86)

Composition A consisted of two different chords and no syncopation, composition B, four different chords with syncopations occurring simultaneously in both hands, composition C, eight different chords and syncopation in the left hand only, and composition D, twelve different chords and different syncopations in both hands. This process ensured a range of complexity levels and the possibility of assessing and ordering compositions on the basis of complexity. As it turned out, subjective complexity ratings were congruent with objective measures for each piece. This was reflected in the rank order of mean complexity ratings for each piece corresponding to that determined by the compositional method. This result reflects the validity of the measure of objective complexity that was used, and is significant in itself, as one of the problems associated with conducting a study of this sort is in obtaining reliable and valid measures of objective complexity. Having a measure of objective complexity in this context is important as it allows us to introduce stimuli into the design, that we can be reasonably sure vary systematically in complexity, resulting in an ability to elicit a wide range of subjective complexity levels. This ensures that both sides of the optimal complexity curve are sampled, thereby excluding this question as a possible cause in the case of monotonic increasing or decreasing trends turning up in the results.

Mean liking ratings collected in the initial listening phase showed a preference for the composition with an intermediate level of complexity, with liking ratings increasing in a curvilinear fashion with increases in complexity, before dropping for the most complex piece. In addition, 85 of the 120 subjects revealed patterns in preference corresponding to the overall group trend, a number significantly greater than would be expected by chance. Results in the initial phase of the study were taken to reflect optimal subjective complexity levels for each subject, and predictions made on the basis of the Optimal Complexity Model as to the effects of repetition on liking. Predictions were then compared to the actual results obtained at the end of the exposure period. The post-test revealed 88 unimodal inverted U-shaped functions, confirming 55 predictions; only 33 predictions were incorrect. In comparison, only 21 predictions made on the basis of the Mere Exposure Theory were accurate. These results further support the validity of the Optimal Complexity Model and indicate that when making predictions about changes in liking following exposure, the Optimal Complexity Model performs much better than the Mere Exposure Model.
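The logic behind such predictions can be illustrated with a minimal sketch. It assumes, purely for illustration, that liking falls off with the squared distance of subjective complexity from an optimum, and that each hearing reduces subjective complexity by a fixed proportion. The functions `liking` and `liking_over_exposures`, the optimum of 5.0, the decay of 0.9, and the starting complexities are all hypothetical and are not taken from Heyduk; only the figure of 16 exposures comes from the study.

```python
def liking(subjective_complexity, optimum=5.0):
    """Inverted U: liking drops with squared distance from the optimum (assumed form)."""
    return -(subjective_complexity - optimum) ** 2


def liking_over_exposures(initial_complexity, exposures, decay=0.9, optimum=5.0):
    """Liking before any exposure and after each hearing.

    Assumes subjective complexity shrinks by the factor `decay` per hearing.
    """
    return [
        liking(initial_complexity * decay ** n, optimum)
        for n in range(exposures + 1)
    ]


# A piece starting below the optimum: repetition only moves it further away.
simple_piece = liking_over_exposures(4.0, 16)

# A piece starting above the optimum: liking rises to a peak, then declines.
complex_piece = liking_over_exposures(9.0, 16)
peak_exposure = complex_piece.index(max(complex_piece))

print("simple piece declines:", simple_piece[0] > simple_piece[-1])
print("complex piece peaks after exposure number:", peak_exposure)
```

Under these assumptions the model predicts a steady decline for a piece that begins below the optimum and a rise followed by a fall for a piece that begins above it, which is the kind of differential prediction that the Mere Exposure Theory, with its monotonic increase, does not make.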

Although the results substantiate the Optimal Complexity Model, one criticism of the present study remains. Whenever compositions of a Western classical idiom are used as stimuli, subjects inevitably judge them against the body of literature that already exists, with which they may be familiar to one degree or another. It is possible that the most complex composition was less characteristic of traditional piano music than the most preferred piece. Certainly, the compositions with lower complexity levels (A and B) resemble unfinished pieces awaiting further refinement. Further, it could be argued that lower preference ratings for the most complex piece were a result of the composition’s unusual rhythmic patterns, which may have been perceived by subjects as uncharacteristic of the norm. A balance between what is recognizable as appropriate within the Western classical idiom and compositional originality is perhaps established with the most preferred piece. Consequently, prototypicality may have acted as a confounding variable in this study. Certainly, it is quite likely that prototypicality and complexity combined to produce the preference trends, which raises the problem of how one differentiates these effects. Incorporating a measure of prototypicality would remedy this and allow for observation of the extent to which each of these factors explains variance in the dependent variable.

Thus far, all the studies that have been reviewed have incorporated exposure, in the form of repetition, as an independent variable to assess the Optimal Complexity Model. However, there are many ways to study the model, one of which is to do away with repetition altogether and incorporate an equivalent variable instead. As repetition can be considered a measure of familiarity (specific objective familiarity to be precise), it makes sense to substitute repetition with other measures of familiarity, such as training or age, each of which can be considered a nonspecific measure. This can be convenient, as the study is reduced to only one or two testing sessions, thereby greatly reducing the strain on the researcher’s resources. Further, study of the training and age variables in the context of the Optimal Complexity Model provides us with additional perspectives and allows for the possibility of enhancing (or diminishing, as the case may be) the soundness of the model. The following studies are good examples of research that contributes results with implications for the Optimal Complexity Model, without the use of exposure as an independent variable.

Hargreaves (1986) had subjects rate liking for recordings of familiar and unfamiliar nursery rhymes played on a synthesizer. In addition, subjects rated liking for near and far approximations to music. Sixteen subjects in each of six different age groups representing an age range covering young children to adults were tested, with the very young subjects making ratings using pictures of faces with expressions corresponding to levels of liking. The following mixed design ANOVA (two-way) revealed significant main effects as well as a significant interaction of stimulus sequence and age.

Fig. 3.19. Cell means for the two-way ANOVA

Hargreaves (1986: 67)

The hypotheses Hargreaves outlined to test the validity of the Optimal Complexity Model were firstly, that liking would vary with increasing age in the form of an inverted U-shaped curve, and secondly, that this would only be the case for the familiar and unfamiliar melodies. It was proposed that the approximations should become preferred less with age, as older subjects will more readily recognize the unusual and unmusical quality of the approximations, which as a consequence will have a lower familiarity level than that of the melodies. Results indicated that the youngest age group gave the highest ratings for all sequences. Hargreaves dismisses the validity of the ratings for this age group, as the youngest children seemed to be responding less to the stimuli than to the test administrator. Consequently, ratings may reflect children’s attitudes to the researcher rather than the stimuli. If these results are excluded, one can see an inverted U-shaped curve relating liking to age group for the familiar and unfamiliar melodies, with the highest preferences for both the unfamiliar and familiar melody sequences occurring for the ten to eleven year age group. On each side of these peaks, preferences decline. Explained in terms of the Optimal Complexity Model, results suggest that an optimum level of familiarity and a corresponding optimum complexity level occur for children in the ten to eleven year age group. For subjects in the older age groups, familiarity levels for melodies are beyond the optimum, resulting in subjective complexity levels below the most preferred level. The opposite is the case for the younger age group, with familiarity levels below the optimum and corresponding subjective complexity levels above the optimum. The approximations to music were liked less than the real life music, with preferences declining steadily with age. 
The youngest age groups did not distinguish the unfamiliar melodies from the approximations, while the older age groups did make a distinction, reflected in the higher preference ratings for the unfamiliar melodies than for the approximations. The Optimal Complexity Model can explain this trend as a result of a decrease in familiarity with age and a corresponding increase in perceived complexities of sequences, with levels above the optimum for the latter, and below the optimum for the former. Although this method for explaining the preference trends is very elegant, of concern is the notion that approximations will become less familiar with age, and as a consequence, will be perceived as more complex. One would think that a fully grown adult, as a result of greater capacity, will perceive approximations as less complex than will young children. In this case, the declining preference trend can be interpreted as reflecting an increase in the perceived atypicality of sequences. The higher preferences for the approximations on the part of younger subjects can then be explained as a result of an “open-earedness” effect rather than familiarity and perceived complexity levels closer to an optimum level. However, this said, it is only intuitive to think that the typical sequences will be familiar and atypical sequences unfamiliar. If so, prototypicality and familiarity have equivalent roles in this context, in which case results do not invalidate the Optimal Complexity Model.

Radocy (1982) incorporated both measures of subjective familiarity and subjective complexity. A tape of 15 music excerpts was played to 139 university students, and preference, complexity, and familiarity ratings made on a 5-point scale, collected. In addition, subjects were distinguished as either musicians or non-musicians, thus including a measure of general objective familiarity in the form of training. Of the total subject pool, data from 36 musicians and 36 non-musicians was randomly selected for analysis, resulting in a mixed design ANOVA with equal sample sizes for the between-subjects factor. Analysis revealed significant differences in mean complexity ratings among the familiarity levels, significant differences in mean preference ratings for levels of complexity, and significant differences in mean preference ratings for levels of familiarity for both musicians and non-musicians considered separately. Generally, differences observed for the non-musician groups were larger than that of the musician group, as reflected in the substantially higher F-statistics for the former. There were strong quadratic trends in the preference/complexity relationships, strong linear trends in the preference/familiarity relationships, and strong linear, cubic, and quartic trends in the complexity/familiarity relationships.

Fig. 3.20. Output for the mixed design ANOVA

Radocy (1982: 94)

Together, these findings can be taken to support the Optimal Complexity Model. Firstly, the quadratic relationships observed between preference and complexity suggest that subjects tended to prefer an intermediate level of complexity, with extremes in either direction producing lower liking scores. The strong linear trends observed between complexity and familiarity indicate that subjects labeled the most complex pieces as the least familiar, and the least complex pieces as most familiar. The significant cubic and quartic trends observed between this pair of variables are harder to explain. However, the significant linear relation corresponds to the trend predicted by the Optimal Complexity Model, as preference is an inverted U-shaped function of both familiarity and complexity, which are in turn inverse linear functions of one another. The linear trends observed between preference and familiarity indicate that subjects’ preference increased with familiarity. All of these results are compatible with the Optimal Complexity Model, except for the last. What should have been observed was a significant quadratic component relating preference to familiarity. The fact that this was not the case immediately suggests that this particular study sampled only the rising portion of the inverted U curve. However, this cannot be the reason for the linear trend in the data, as a significant quadratic trend was observed in the complexity/preference relationship, and a significant linear trend was observed in the complexity/familiarity relationship, indicating that a wide range of complexity and familiarity levels was sampled. The question then remains as to why there was a significant quadratic component in the preference/complexity relationship, but not in the preference/familiarity relationship. One explanation lies in the fact that this particular study was naturalistic; it incorporated “real” music as stimuli, enhancing the design’s ecological validity. 
The reason why only a linear trend was observed in the preference/familiarity relationship may have been due to feedback effects. None of the subjects were too familiar (to the point of dislike) with any of the excerpts, as in the natural environment people can always stop listening to a piece of music before it reaches a level of familiarity that elicits dislike. This control over exposure would enable persons to return to the excerpt at a later date with a fresh perspective, ensuring that none of the excerpts encountered by subjects in the study would be so familiar as to be disliked. Of course, if an exposure period was included in the study, repetition effects may result in familiarity levels increasing beyond the optimum level for a subject and producing dislike. This would result in an inverted U effect, similar to that found in Hargreaves’ study (discussed prior), which did incorporate multiple exposures.

In a similar vein to Radocy, Burke & Gridley (1990) had both non-musicians and musicians rate their preference for music excerpts varying on a continuum of complexity. The excerpts were chosen from a set of ten by judges who rated each on a complexity scale. The four excerpts that had the lowest, middle and highest average ranks were chosen. The excerpts, from least to most complex, were (1) Bach’s C major prelude and fugue from Book I, (2) Debussy’s “Maiden with the Flaxen Hair”, (3) Grieg’s “Wedding Day at Troldhaugen” and (4) Piano Sonata No. 1 by Boulez. Results revealed an inverted U-shaped function relating preference to complexity for both musicians and non-musicians.

Fig. 3.21. Preference means for each excerpt by musical sophistication

Excerpts representing four levels of complexity: Bach/Debussy/Grieg/Boulez

Burke & Gridley (1990: 689)

As expected, musicians gave higher preference ratings than non-musicians. The highest preference mean for the musicians was for the Debussy excerpt, while for the non-musicians the highest preference mean was for the Grieg excerpt. Strangely, a peak in preference was reached for a higher complexity level on the part of non-musicians. If it can be assumed that non-musicians will have an optimum preference for a complexity level which is lower than that for musicians, we would expect the peak to occur “before” that of musicians. That is, the peak for musicians should be reached at a higher complexity level than for non-musicians as a result of musicians’ greater musical sophistication. The fact that the opposite was the case suggests that other factors were relevant. Gridley remarks that musicians might have a greater fondness for the Debussy excerpt regardless of complexity. This may be because it is a very well-known piano piece that is often encountered and performed, and in general terms, can be considered as more popular among musicians than the Grieg excerpt. The peak at complexity level two may then be partly a result of this bias factor (if it can be called a bias factor), rather than the result of the excerpt having a perceived complexity level closer to an optimum. Another reason for this premature peaking may be that pieces were arranged on a complexity continuum on the basis of judges’ personal perceptions. Gridley does indicate that the judges’ perceptions of the complexity of excerpts may have been different from those of subjects. Subjects may have perceived the Grieg excerpt as less complex than the Debussy excerpt, in which case the occurrence of the peaks should be reversed for the musician and non-musician group. If this were done, we would get the expected trend predicted by the Optimal Complexity Model. 
Although these scenarios may very well explain the anomaly, some doubt regarding the applicability of the Optimal Complexity Model is raised, and suggests that complexity effects have subtle influences that are not easily observed or separated from those of other factors.

Orr & Ohlsson (2001) investigated the applicability of the Optimal Complexity Model to different styles of music by having subjects rate complexity and preference for specially improvised compositions. In the first experiment, two professional jazz and bluegrass musicians improvised a total of 80 compositions, with half in the jazz genre and half in the bluegrass genre. In addition, musicians improvised each excerpt according to a complexity continuum consisting of five levels, with eight excerpts at each complexity level for each of the jazz and bluegrass designations. Because of the relatively large number of excerpts representing each complexity level, it was possible to arrive at a more accurate test of the validity of complexity distinctions by having subjects rate perceived complexity levels for each excerpt and plotting the combined means against each performer complexity level designation. This method represents an improvement over Burke & Gridley’s design, in which only four excerpts were used, with each excerpt representing a different complexity level. The weakness of that design lies in the greater possibility of a confound distorting the placement of excerpts on the complexity continuum. The increased number of excerpts representing each complexity level reduces the probability of this happening in the present study.

Fig. 3.22. Subject complexity means plotted against performer designations

Orr & Ohlsson (2001: 113)

The graph of the means indicates that perceived complexity levels on the part of subjects increased steadily with increases in performer complexity, with pairwise tests revealing significant differences between each adjacent pair of means for all cases except the level three and four comparison for the jazz excerpts. Overall, this result indicates that subjects perceived the complexity of improvisations in accord with the musicians’ designations.

The graphs of the preference means plotted against the complexity mean ratings for each excerpt revealed both significant linear and quadratic relationships for the graph of all 80 excerpts, and for jazz and bluegrass excerpts graphed separately. In each case, the quadratic component explained more variance in the dependent variable than did the linear regression, reflected in the former having higher values for eta-squared (r-squared).

Fig. 3.23. Preference means against complexity means for all eighty improvisations and jazz and bluegrass improvisations graphed separately.

Orr & Ohlsson (2001: 114)
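The comparison of linear and quadratic components can be illustrated with a short sketch that fits both polynomials by least squares and compares the proportion of variance explained. The data below are synthetic and are not taken from Orr & Ohlsson; the helper names `polyfit` and `r_squared` are hypothetical, and the fitting method (normal equations solved by elimination) is simply one convenient choice rather than the authors' procedure.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients [c0, c1, ...] via the normal equations."""
    n = degree + 1
    # Augmented matrix [X^T X | X^T y] for the polynomial design matrix X.
    a = [
        [sum(x ** (i + j) for x in xs) for j in range(n)]
        + [sum(y * x ** i for x, y in zip(xs, ys))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for row in range(n):
            if row != col:
                factor = a[row][col] / a[col][col]
                a[row] = [v - factor * p for v, p in zip(a[row], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def r_squared(xs, ys, coeffs):
    """Proportion of variance in ys explained by the fitted polynomial."""
    mean_y = sum(ys) / len(ys)
    predicted = [sum(c * x ** k for k, c in enumerate(coeffs)) for x in xs]
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Synthetic mean ratings (illustrative only): preference peaks at mid complexity.
complexity = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
preference = [2.1, 2.9, 3.6, 4.0, 4.1, 3.9, 3.4, 2.8, 2.0]

linear = polyfit(complexity, preference, 1)
quadratic = polyfit(complexity, preference, 2)
r2_linear = r_squared(complexity, preference, linear)
r2_quadratic = r_squared(complexity, preference, quadratic)

# Location of the fitted peak: the vertex of the parabola.
peak = -quadratic[1] / (2 * quadratic[2])

print(f"linear r^2 = {r2_linear:.3f}, quadratic r^2 = {r2_quadratic:.3f}")
print(f"fitted peak at complexity {peak:.2f}")
```

With data of this shape the quadratic term accounts for far more variance than the linear term. Where most observations fall on one side of the fitted peak, the two values of r-squared draw closer together, which is the situation described for the jazz excerpts.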

When each graph was split according to the ascending and descending portion of the curve, and correlations calculated for each half, it was found that for the jazz genre only two excerpts lay to the left of the peak of the curve, while in the case of bluegrass music there were ten improvisations to the left of the peak. This suggests that for the jazz excerpts, a negative linear trend is most appropriate for explaining the relationship, despite the lower value of eta-squared for this trend. This is further supported by the small difference in the value of eta-squared for the quadratic and linear components (.65 compared to .58).

In experiment two, a between-subjects design was used, with different subjects rating preference and liking for excerpts. In addition, 59 improvisations were used in order to reduce fatigue effects. The same pattern of results was obtained in this experiment as the first, with both significant linear and quadratic components observed for all excerpts and jazz and bluegrass excerpts graphed separately. In addition, the quadratic component explained more variance in preference ratings than the linear component. Further, when graphs were split according to the rising and descending portions of the curve, it was again found that very few excerpts lay to the left of the peak for the jazz genre. Orr and Ohlsson conclude that results support an inverted U relationship for the bluegrass genre but not for the jazz genre, suggesting that the applicability of the Optimal Complexity Model may be restricted according to style of music. However, Orr & Ohlsson do acknowledge that the negative linear relationship observed for the jazz genre may have arisen from sampling one side of the complexity curve. Subjects’ lowest mean complexity rating was higher for the jazz genre than it was for the bluegrass genre, allowing for the possibility that jazz excerpts were comparatively more complex than bluegrass excerpts, resulting in most jazz excerpts having a subjective complexity level beyond the optimum. The congruity of the subjective complexity ratings given by subjects with the levels designated by the performers would seem to refute this scenario, and indeed, the authors do indicate that the actual subjective complexity range was greater for the jazz genre, suggesting that if any style suffered from restricted sampling, it was the bluegrass genre rather than the jazz genre. Further, the simple jazz excerpts were in fact quite simple, and as noted by the authors, it would be hard to arrive at simpler improvisations that still resembled music. 
Thus, the tentative conclusion reached by the authors is that the inverted U-shape trend relating preference to complexity may be style specific, or that it is easily obscured by other factors that have a stronger influence on determining preference trends. Specifically, the authors point to the prototypicality variable as a possible confound for complexity. Very simple and very complex pieces may tend to be less typical of music of a style. Consequently, the pieces of moderate complexity are also the ones that are most prototypical of a style and as a result, the most readily accepted. In light of this possibility, the peaks for the jazz genre at a lower complexity level may reflect a recognition and preference for the most prototypical excerpts located primarily in the lower complexity level designations. For bluegrass music, the most prototypical excerpts can be interpreted as possibly occurring at the higher complexity levels marking the peak of the curve. In each case, trends can be explained as a result of a prototypicality variable rather than a complexity factor. In addition, results may have been influenced by a context effect of the type researched by Steck and Machotka.

In contrast to the previous studies reviewed, Steck & Machotka (1975) constructed artificial tone sequences in order to determine the nature of the preference/complexity relationship without the interference of extraneous variables, such as prototypicality, that are commonly associated with real music. Sequences were determined by randomly selecting sinusoidal tones from a set of six frequencies belonging to a non-Western ten-note scale. Compositions were all roughly ten seconds in length, with the number of tones presented within this time span varying according to the table below.

Fig. 3.24. Definitions of complexity levels

Steck & Machotka (1975: 171)

The least complex sequence consisted of five tones, with each tone lasting for a duration of two seconds, while the most complex sequence consisted of 120 tones, with each tone lasting one twelfth of a second (a .025 sec rest between tones explains the discrepancy). Sequences were presented to subjects in distributions, with the first consisting of all complexity levels, and the following four consisting of complexity levels 1-8, 3-10, 7-14, and 9-16, respectively. Each complexity level for the sub-interval distributions was represented by two different sequences, while for the total distribution, each complexity level was represented by only one sequence, resulting in a total of 80 different sequences. Each distribution was presented to 60 undergraduate university students in two separate random orders, and preferences for each sequence rated on a seven-point scale. Ratings were then plotted against complexity levels for each subject, for each distribution, resulting in a total of five graphs per subject. Graphs for each subject had the same general shape, with peaks at the same relative point for each distribution, indicating that preference peaks were entirely dependent on context. This phenomenon is demonstrated in the diagram below.
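The total of 80 sequences can be checked with a few lines of arithmetic. The sketch assumes sixteen complexity levels in the full distribution, which is implied by the highest sub-interval (9-16) but is not stated outright in the description above; the variable names are illustrative only.

```python
# Full distribution: every complexity level, one sequence per level.
# Sixteen levels is an assumption inferred from the 9-16 sub-interval.
full_levels = range(1, 17)
sequences_per_level_full = 1

# Sub-interval distributions as (lowest level, highest level), two sequences per level.
sub_intervals = [(1, 8), (3, 10), (7, 14), (9, 16)]
sequences_per_level_sub = 2

full_count = len(full_levels) * sequences_per_level_full
sub_count = sum(
    (high - low + 1) * sequences_per_level_sub for low, high in sub_intervals
)
total_sequences = full_count + sub_count

print(full_count, sub_count, total_sequences)
```

Sixteen sequences for the full distribution plus sixty-four for the four sub-intervals gives the 80 different sequences reported, which supports the assumption of sixteen levels.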

Fig. 3.25. Preferences of subjects on each sub-interval for subjects with peak preferences on the full distribution for complexity level four (.991 sec) and twelve (.153), represented in the first and second columns respectively.

Steck & Machotka (1975: 172)

As can be seen, in each column the peak is at the same relative point on the graph, resulting in peak preferences for different levels of complexity for each distribution. Further confirmation of context acting as a primary determinant of the placement of the peak of the preference curve was arrived at by standardizing the range of each distribution, correlating peak points on standardized ranges, and conducting t-tests for the differences between means across all subjects for each pair of distributions. Results revealed significant correlations and no significant differences between each possible pair of ranges.

Fig. 3.26. Correlations and t-tests between comparisons of the five ranges

Steck & Machotka (1975: 173)
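One plausible reading of "standardizing the range" is to express each subject's peak as a proportion of the distance between the lowest and highest complexity level in a distribution. The sketch below follows that reading; it is an assumption, and the original procedure may have differed. The function `relative_peak_position` and the ratings are hypothetical.

```python
def relative_peak_position(ratings_by_level):
    """Position of the most preferred level within its range, from 0.0 to 1.0."""
    levels = sorted(ratings_by_level)
    lowest, highest = levels[0], levels[-1]
    peak_level = max(levels, key=lambda level: ratings_by_level[level])
    return (peak_level - lowest) / (highest - lowest)


# Hypothetical ratings for one subject on two sub-interval distributions.
ratings_levels_1_to_8 = {1: 2, 2: 3, 3: 5, 4: 6, 5: 5, 6: 4, 7: 3, 8: 2}
ratings_levels_9_to_16 = {9: 2, 10: 3, 11: 5, 12: 6, 13: 5, 14: 4, 15: 3, 16: 2}

position_low = relative_peak_position(ratings_levels_1_to_8)
position_high = relative_peak_position(ratings_levels_9_to_16)

print(position_low, position_high)
```

In this constructed case the absolute peaks fall at levels four and twelve, yet the relative positions are identical. A subject behaving in this way would contribute to a high correlation between distributions and to the absence of a mean difference, which is the pattern interpreted as a context effect.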

Results suggest that for any one subject at a given point in time there is no absolute optimal complexity level, but rather that subjects habituate to different contexts such that optimal complexity levels are determined by context. In this way, a subject may prefer a two-part invention of intermediate complexity given a possible choice of all two-part inventions. However, if the range of that choice is expanded to include three-part inventions and preludes and fugues, the most preferred piece may represent a complexity level well beyond that indicated initially, but will maintain the same relative place in the array of pieces. This may have been a factor in Orr & Ohlsson’s study, and supports the validity of the observed negative linear trend as a better predictor of preference for the jazz genre, and further backs up the claim that the preference/complexity relationship may vary between styles of music for subjects.

The results of this study regarding the effects of context on the preference/complexity relationship are potentially lethal to conclusions drawn by researchers who have not adequately taken into consideration this context effect. However, this study incorporated statistically generated sequences that have very little similarity to real music often used by researchers in preference studies. In this respect, it is possible that a context effect is especially prominent, or even peculiar, to cases in which artificial stimuli are used. Further research on context effects may include real world music as stimuli in order to allow for a wider generalization of results. Consequently, a number of the past and upcoming studies under review are not all that vulnerable to the criticism of failing to take into account a context factor. Only those studies especially similar to the present one will address this problem.

Similarly to Steck & Machotka, Vitz (1964) conducted a study in which artificial tone sequences of varying complexity were generated in order to investigate the preference/complexity relationship. Unlike any of the previous studies, Vitz incorporated the information theory metric for entropy in order to arrange stimuli on a continuum of complexity. Sixteen tones were generated electronically and tone sequences constructed according to random selection of frequencies. Each sequence was then recorded onto a tape and presented to subjects individually, who rated each on an eight-point pleasantness scale. The sequences were divided into three groups according to the number of tones presented per second, and information values calculated for each, as presented in the table below.

Fig. 3.27. Tone sequence characteristics

Vitz (1964: 179)

For each information value, per speed division (calculated according to Shannon’s formula), two sequences were used. For sequences 1-18, tones were randomly drawn from only eight of the 16 tones, while the last 12 sequences incorporated the full range of possibilities. For sequences 9, 10, 17, 18, 29 and 30, the occurrence of each tone was equi-probable, as shown by the equivalence of the probabilities for the “dominant tone” and “each other tone” categories. Calculation of the H value (information value) for each of these sequences is done in the following manner:

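For sequences 9, 10, 17 and 18: $H = 8 \times \left[ \left( -\tfrac{1}{8} \log_{10} \tfrac{1}{8} \right) / \log_{10} 2 \right] = 3$

For sequences 29 and 30: $H = 16 \times \left[ \left( -\tfrac{1}{16} \log_{10} \tfrac{1}{16} \right) / \log_{10} 2 \right] = 4$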

For sequences in which the probability of occurrence per tone differed, the H value per tone is calculated using each differing probability and the values for all tones summed. Alternatively, rather than calculating by hand, values can be read from a graph of the function $-p_i \log_2 p_i$.
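The calculation can also be carried out programmatically. The following is a minimal sketch of Shannon's formula, $H = -\sum_i p_i \log_2 p_i$, applied to the two equi-probable cases described above. The function name `entropy_bits` is hypothetical, and the unequal distribution at the end is an invented illustration rather than a row from Vitz's table.

```python
import math


def entropy_bits(probabilities):
    """Shannon entropy H = -sum(p * log2(p)), in bits per tone."""
    if not math.isclose(sum(probabilities), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Equi-probable cases described in the text.
h_eight = entropy_bits([1 / 8] * 8)      # sequences 9, 10, 17 and 18
h_sixteen = entropy_bits([1 / 16] * 16)  # sequences 29 and 30

# Hypothetical unequal case (not from Vitz's table): one dominant tone with
# probability 0.5, the remaining 0.5 shared equally among seven other tones.
h_unequal = entropy_bits([0.5] + [0.5 / 7] * 7)

print(h_eight, h_sixteen, round(h_unequal, 3))
```

The equi-probable cases reproduce the values of 3 and 4 given above. The unequal case yields a lower value than the equi-probable eight-tone case, reflecting the fact that a dominant tone makes a sequence more predictable.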

In addition, the amount of information presented per second for each tone sequence was calculated by multiplying H/ tone by the speed (tones per sec). Subjects’ mean pleasantness ratings were calculated and means graphed over the rate at which sequences presented information. In experiment one, 24 subjects rated sequences 1-10, while in experiment two, subjects rated sequences 11-18, and for experiment three, the remaining sequences (19-30) were rated. In experiment four, subjects rated sequences presenting tones at both two and four tones per second in order to test if speed affected pleasantness ratings separately from rate of information transfer. In the last experiment, subjects chose the most preferred sequence from a pair, with four random orders of two pairs of sequences used as test stimuli. For each pair, a sequence at a speed of two tones per second was matched with a sequence at four tones per second. A paired-comparison design was incorporated in this last case in order to further test hypotheses regarding the effect of speed on pleasantness ratings.
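The rate calculation described at the start of this paragraph is a simple product, sketched below. The function names are hypothetical, and the 3 bits per tone value is used only as an example input. The speeds of two and four tones per second are those compared in experiment four, and 5.6 bits per second is a transfer rate at which the two speeds are compared later in the discussion.

```python
import math


def information_rate(bits_per_tone, tones_per_second):
    """Information presented per second: H per tone multiplied by speed."""
    return bits_per_tone * tones_per_second


def bits_per_tone_for_rate(target_rate, tones_per_second):
    """H per tone needed to reach a given rate at a given speed."""
    return target_rate / tones_per_second


# Example: a sequence carrying 3 bits per tone at the two speeds compared.
rate_at_two = information_rate(3.0, 2)
rate_at_four = information_rate(3.0, 4)

# The same transfer rate can arise from different combinations of H and speed.
h_needed_at_two = bits_per_tone_for_rate(5.6, 2)
h_needed_at_four = bits_per_tone_for_rate(5.6, 4)

print(rate_at_two, rate_at_four)
print(round(h_needed_at_two, 2), round(h_needed_at_four, 2))
```

Because the rate is a product, a slow sequence with high uncertainty per tone and a fast sequence with low uncertainty per tone can present information at the same rate. This is what allows speed and rate of information transfer to be separated experimentally.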

The results from experiments one, two, and three revealed monotonic increasing trends, with pleasantness ratings leveling off somewhat for the sequences presenting information at the higher rates of transfer.

Fig. 3.28. Pleasantness ratings over rate of information transfer for experiments 1 & 2.

Vitz (1964: 180)

For experiments one and two, a flattening out of the curve occurs at rates of 4.4 bits per second and 8.8 bits per second. No peak was apparent in either of the curves as expected according to the Optimal Complexity Model. Experiment three addressed the possibility that the complexity range in experiment two was not large enough to elicit an inverted U-shaped curve.

Fig. 3.29. Pleasantness/rate of H/ sec for experiment three.

Vitz (1964: 181)

In this experiment, the range for the rate of information transfer was extended from the maximum of 12 bits per second in the previous experiment, to a maximum of 32 bits per second in the present one. The curve flattens out even more, with an apparent peak being reached at 32 bits per second. However, no decline in pleasantness ratings was observed and it remains to be seen if the nature of the curves in these experiments is asymptotic.

This is possible considering the fact that it would be very difficult to increase the rate of information transfer, suggesting that a suitably wide complexity range is in fact sampled in this case. Vitz notes that this experiment already pushes the boundaries of humans’ ability to discriminate between different pitches presented alone and in quick succession. However, Vitz does not dismiss altogether the possibility that a limited complexity range was sampled. Nor does he dismiss the Optimal Complexity Model for explaining the hypothesized relationship between pleasantness and rate of information transfer. Rather, he argues that the measure of information used (obtained by –Pi Log Pi) is inadequate as it does not take into account the “magnitude of stimulus differences”. To cite Vitz’s example, no distinction is made between the amount of information presented by a sequence in which a tone at 105 cycles per second follows a tone at 100 cycles per second, and a sequence in which the successive tone is at 1000 cycles per second. In the latter case, one would intuitively expect more information to be presented as the difference between successive tones is extremely large and represents a larger frequency range. This implicitly increases the possible size of the value obtained for the variance measure used to describe a stimulus set, and the corresponding probabilities for the occurrence of individual components of the stimulus set.

Results obtained in experiments four and five suggest that the more general term “variation” proposed by Vitz may be better suited as an independent variable than the very specific quantity obtained using Shannon’s formula. The hypothesis outlined for the last two experiments was that only a single curve describing the pleasantness/information transfer rate relationship would be evident, as pleasantness should not vary depending on speed and information/tone value. However, the hypothesis was not confirmed. Two separate curves were evident, with the largest discrepancy in pleasantness ratings occurring for sequences at transfer rates of 5.6 bits/second. For the sequences played at two tones per second, pleasantness ratings scored a mean close to +1 in the pleasantness direction, while for the same transfer rate for the sequences in which tones were played at four tones per second, pleasantness ratings scored a mean closer to –1 in the unpleasantness direction.

Fig. 3.30. Pleasantness ratings by information transfer rate for speeds of 2 and 4 tones per second.

Vitz (1964: 182)

Experiment five substantiates Vitz’s claim that the information metric given by –Pi log Pi is inadequate alone as a measure of complexity in this context, as 69% of all choices were made in favor of sequences presented at two tones per second. If the information metric is indeed unreliable alone as a measure of complexity, it is possible that sequences with lower rates of information transfer can in fact be perceived as more complex if the differences between pitches in these sequences are comparatively greater than those in sequences labeled as more complex. This would result in sequences which should be placed to the right on the x-axis being placed more to the left, thus preventing the inverted U curve from taking shape.

Although the general findings of this study do not support the Optimal Complexity Model, they do not debunk it either, as results are interpreted as failing to produce the expected pattern primarily due to the inadequacy of the information measure. The fact that a majority of subjects preferred the sequences played at a slower speed indicates that the rate of information transfer is not the sole predictor of preference; rather, it is a single dimension of complexity which needs to be better represented by two, if not more, measures. Certainly this is very likely, and Vitz makes a strong case, but it is interesting to note that Vitz dismisses outright the idea that the fundamental shape of the inverted U-shaped curve relating preference to complexity may be invalid, and that the monotonically increasing trend observed is in fact valid. As it turns out, results obtained in a second study by Vitz indicate that this reasoning is well founded and suggest that any single measure of complexity utilized alone may fail to capture the full nuance of the concept and produce misleading results. In addition, in the following study, Vitz adds additional dimensions to the sequences in order to increase the range of complexity, suggesting that he does give credence to the possibility that in the first study, the range for complexity was restricted as a result of varying parameters on a single dimension. The later study approaches something more akin to real world music in this respect, and suggests that in the previous study, a relatively wide complexity range was sampled on a single dimension only, but in terms of overall complexity, sampled a comparatively smaller range. This is verified in the following discussion of the second study of the series.

In the follow-up study, Vitz (1966) constructed 18 tone sequences representing six levels of magnitude on three separate dimensions, with each level matched for each dimension. The two dimensions of duration and loudness were added to the design of the previous study in order to increase the complexity of sequences. Further, variance was introduced as an additional measure of objective complexity in order to capture the magnitude of component differences in stimulus component sets. The resulting chart ranks sequences according to successive increases in stimulus variation.

Fig. 3.31. Tone sequence characteristics

Vitz (1966: 76)

Level one represents the lowest level of complexity, while level six represents the highest level of complexity. The method for increasing stimulus variation involved systematically increasing the range of stimulus components from which sequences were composed, with increases equivalent across all dimensions for each level of complexity. As an example, for level five, the tone sequences consisted of frequencies drawn from the set of eight used on the preceding levels, along with four additional frequencies. Similar procedures were used to increase the information content and variability of stimulus sets for duration and loudness. Information content was calculated for each level, on each dimension, using Shannon’s formula, as was the variance.

Fig. 3.32. Population form of the computation for variance used to describe dimension categories.

For category 2a: the sum of the squared deviations from the mean of the frequencies used to compose the sequence = 11,888.75. Dividing by the number of deviations gives the mean squared deviation (the variance) = 2,972.1875. Rounded to the nearest thousand and expressed in thousands = 3.
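The arithmetic for category 2a can be checked with a short sketch. The frequencies themselves are not listed here, so the snippet starts from the reported sum of squared deviations; the count of four deviations is an inference from the two reported figures (11,888.75 / 2,972.1875 = 4), and `population_variance` is a hypothetical helper showing the general population form.

```python
def population_variance(values):
    """Population variance: mean of the squared deviations from the mean."""
    mean = sum(values) / len(values)
    squared_deviations = [(v - mean) ** 2 for v in values]
    return sum(squared_deviations) / len(values)

# Figures reported for category 2a.
sum_of_squared_deviations = 11888.75
number_of_deviations = 4  # assumption: implied by 11888.75 / 2972.1875

variance_2a = sum_of_squared_deviations / number_of_deviations
variance_in_thousands = round(variance_2a / 1000)

print(variance_2a, variance_in_thousands)
```

Dividing by the number of deviations, rather than by one less than that number, is what makes this the population form rather than the sample form of the variance.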

In order to test the validity of stimulus variation category ranks, 36 subjects rated the “amount of variation or unexpected change” for each sequence on a five-point scale, where 0 indicates no change, and +4 indicates very much change. These labels can be understood as equivalent to measures of subjective complexity, while the variance and information measures can be understood as objective in nature. The subjective perception of sequences tended to match the ranks given according to the objective measures, as depicted in the following chart:

Fig. 3.33. Mean ratings given for sequences at each level of complexity

Vitz (1966: 76)

The results confirm the validity of the method for grouping sequences into strata according to magnitude of stimulus variation, marking an improvement on the method used by Vitz in the previous study.

Following confirmation of the validity of the ranking of stimuli, sequences were played to subjects, who rated each for pleasantness on a nine-point scale. Unlike the findings of the previous study, the graph of mean pleasantness ratings over subjective complexity ratings revealed an inverted U shape trend, with pleasantness ratings peaking at sequence three.

Fig. 3.34. Pleasantness means over subjective complexity

Vitz (1966: 77)

In the previous study, at the highest levels of complexity, pleasantness ratings approached an asymptote rather than curving down from a peak. The lack of an inverted U-shaped curve was attributed primarily to the shortcomings of the information metric for ordering sequences according to increasing levels of complexity, rather than to a restricted complexity range. This was argued on the basis that it would be hard to increase the complexity range on the dimension used, as distinctions would not be perceived at increased rates of speed or among an increased number of pitches. It was suggested that the only means of increasing the complexity of sequences would be to add additional dimensions on which sequences could be varied, and that a measure of variability incorporating a measure of central tendency would be needed to take into account the magnitude of stimulus differences. Both remedies were implemented and the expected trend observed, suggesting that one, or more likely both, of the explanations cited for failing to produce the expected curve in the previous study were responsible factors.

In the second experiment of this study, subjects were divided according to musical interest and years of formal training, resulting in a high and low music group. Subjects rated pleasantness for six sequences representing the full range of complexity levels, and means were graphed over subjective complexity by music group.

Fig. 3.35. Pleasantness by subjective complexity and training

Vitz (1966: 78)

As can be seen, trends roughly approximated inverted U shaped curves of the Optimal Complexity Model, with the high (appreciation) music group scoring higher for the more complex sequences and lower for the less complex sequences, than the low music (appreciation) group. These results are a strong confirmation of the Optimal Complexity Model. We would expect optimal complexity levels for the high music group to be higher than that for the low music group. Consequently, a peaking for the low music group is expected “before” that of the high music group. Although the latter group has two peaks rather than one, the first peak may be a result of random variability as there was very little difference between the peak and the succeeding point for this group.

With respect to the results obtained by Steck & Matchotka regarding the effects of context on rating trends, it is possible that the inverted U-shaped curve reflects only a transient level of optimal complexity that may really be an artifact of the range of the complexity of sequences used. An equivalent range for complexity transposed up to a higher complexity level (by increasing the number of voices, for example) may in fact yield identical results, making dubious the idea that a single absolute optimal level of complexity exists for any person at a given point in time. Steck & Matchotka conducted their study after Vitz, so no discussion of this problem is engaged in Vitz’s study. However, because of the similarity in design and method for constructing stimuli, science demands some sort of explanation, or attempt to reconcile results with the relevant findings of other researchers. The most plausible argument, in defense of Vitz’s conclusions, is that results in the first study did not reveal a similar inverted U-shaped trend, which would be expected if context alone was responsible for the trend in the second study of the series. The fact that only a substantial increase in the complexity of sequences resulted in the expected function gives particular credence to the notion that in this series of studies, an absolute optimal level of complexity did exist for subjects, as a peak and descent was only revealed upon increasing complexity beyond the point sampled in the previous study.

Although the results of the second study are striking in their contrast with results obtained in the first study of the pair, findings must be tempered by the knowledge that the low music group in the third experiment of the second study consisted of only 14 individuals, well below the prescribed minimum per-group size of 30 needed to obtain valid and reliable results. In order for the independent groups t-test (t(42) = 2.49, p = .02) comparing overall group means to have an acceptable degree of power, the crucial assumption that must be met is that subject scores are normally distributed and symmetric about a true population mean for each group. Clearly this cannot be assumed in this case, and we are left to ask why the author of the study glossed over this deficit. Despite this, the two studies taken together represent elegant examples of an effective means for testing the validity of the Optimal Complexity Model by using artificial stimuli presented in lab settings. The advantage in having a large degree of control over attributes of the stimuli is apparent, as unwanted effects can be filtered out or negated, allowing for systematic study of the effects of specific dimensions of music and their incremental change on subjects’ preference responses.

Crozier (1974) conducted an experiment nearly identical to Vitz’s second study. Crozier incorporated 12 of the exact same stimuli (two for each uncertainty level as opposed to three) and presented them to 48 subjects who rated each on a 7-point displeasing-pleasing and simple-complex scale.

Fig. 3.36

Crozier (1974: 30)

Mean ratings were then graphed over uncertainty levels and analysis carried out. The reason for comparing subjective ratings of the complexity of stimuli to the levels obtained by the information theory metric was to determine the validity of the grouping of stimuli into complexity strata on the basis of their corresponding information values. Similarly to Vitz, Crozier obtained a significant linear relationship between subjective ratings and uncertainty levels for the group as a whole and for the group divided according to musical training. This indicates that the method used to order stimuli on a complexity continuum was valid.

There was both a significant quadratic (F=33.96, p<.001) and significant linear (F=57.94, p<.001) component in the pleasingness/uncertainty relationship, indicating that the data is compatible with an inverted-U hypothesis. Although Crozier indicates that the data is better described as an “r-shaped” function, he does not make this ground for dismissing an optimal complexity effect in the present study.

In addition to measuring subjects’ responses to stimuli by way of the pleasingness scale, Crozier also had subjects rate stimuli on a 7-point beautiful/ugly scale. Any conclusions drawn regarding the results obtained in this portion of the study must be considered in light of the fact that the construct beauty does not represent a means by which one can assess the Optimal Complexity Model, as the model relates preference (in the form of liking or pleasingness) to complexity. It does not explicitly relate beauty to complexity, and one can imagine that beauty is somewhat different in terms of acting as a dependent variable than pleasingness and liking. The latter two are measures of a subject’s action, that is, the degree to which a sensation or experience is present in the subject. The construct beauty on the other hand, describes characteristics of the object perceived by the subject. However, despite this introspective logic, the beauty/uncertainty relationship was not only strongly quadratic, but when subjects were divided into two groups on the basis of training, conformed almost exactly to predictions made on the basis of the Optimal Complexity Model.

Fig. 3.37.

Crozier (1974: 43)

One can almost see the Optimal Complexity Model in all its splendor in these two graphs. In the bottom graph, the data can be interpreted as a result of the different optimally preferred levels of complexity for the music and non-music group. Specifically, the non-music group would have a lower optimal level and therefore would evidence a downturn in preference ratings before the music group. This of course is exactly what is depicted in the graphs. The only problem is that the dependent variable measures beauty, not preference or liking. In short, this last result is not wholly pertinent to commenting on the Optimal Complexity Model, but lends some food for thought in regards to a probably consistent component of subjects’ responses to aesthetic objects. That is, objects most liked have an optimal level of complexity, and as a consequence are deemed beautiful.

In a second study, Crozier (1974) had subjects respond to the same stimuli on the same three scales, but this time varied the characteristics of the sample of subjects tested. Subjects consisted of 24 persons in each of three different age groups. The first consisted of persons eight to nine years of age, the second of persons 14-15 years of age, and the last of persons 20+ years of age. As in the previous experiments, the data was analyzed and means for each age group graphed over uncertainty level. For the complexity scale, a strong linear relationship between rated complexity and uncertainty was found. As in the previous experiment, this indicates that subjects’ perceptions of the complexity level of stimuli corresponded to the objective measure obtained using the information theory metric. For the pleasingness scale, graphs of the means revealed a pattern strongly reminiscent of that obtained in the previous study.

Fig. 3.38.

Crozier (1974: 95)

As can be seen, the youngest age group reached a peak at the lowest complexity level, followed by the 14-15 year old group and adult group. This corresponds to the trend predicted by the Optimal Complexity Model, as very young subjects are expected to have a smaller capacity to process and store information. Consequently, their optimally preferred level of complexity is lower than that of older persons.

The trends observed in the data collected on the ugly-beautiful scale, again, corresponded to those observed in the previous experiment. Interestingly, the smoothness of the curves obtained offered a more elegant picture of the Optimal Complexity Model than those obtained using the pleasing/displeasing scale.

Fig. 3.39.

Crozier (1974: 96)

Although the trends in this case also corresponded to an inverted U hypothesis, the apparent plateau in the data for the eight-year-old group going from the 1 bit to 6.17 bit complexity levels, and the rise in pleasingness following the downturn in pleasingness going from the 8.17 bit to the 9.17 bit complexity level, offer only a rough approximation of the ideal form of the model. The reason why the data obtained on the ugly-beautiful scale presented a better “fit” in both experiments is unclear. It can only be suggested that certain aspects of the aesthetic experience incorporate judgments of beauty, which in turn are closely linked to the complexity of the object and the degree to which it is liked.

Smith (1986) orders sequences on a complexity continuum on the basis of decreasing rigidity of constraints, implemented according to rule structure, in order to study the preference/complexity relationship. The net result of this method is a set of tone sequences vaguely similar to those constructed by Vitz, with the exception that sequences on the lower complexity levels are more representative of traditional tonal music. The rule structure consisted of five levels, each representing a different level of complexity, defined in this context according to the degree to which rule structure decreases uncertainty. At level five, the most complex level, the seven-tone sequence was constructed from the full 12-tone chromatic scale. At level four, four of the five middle tones were diatonic to C-major. At level three, this same rule applied to all five of the middle notes. At level two, in addition to the previous rule, the first three notes formed a major triad, and at level one, in addition to the same rules applied as those used for level two, the closing tones of the sequence implied a dominant-tonic cadence. A total of 20 different sequences were constructed, with four representing each level of rule structure, and half of the total divided according to complex and simple melody contour and zero/non-zero excursion. Complex contours were defined as those in which there were four changes in pitch direction, while simple contours had only two changes in pitch direction. Zero excursion melodies ended the sequence on the same note that it was started, while non-zero excursion melodies did not.

Fig. 3.40. Tone sequences divided according to rule structure

Smith (1986: 24)

The experiment was composed of four sections in which sequences were played to subjects individually, who rated each on a 6-point pleasingness scale in a practice trial, pre-repetition trial, repetition trial, and post-repetition trial. In the pre-repetition trial, all 20 sequences were played for each individual, with sequences randomly chosen without replacement in order to control order effects. In the repetition trials, a sequence was randomly chosen from each level of complexity and repeated ten times, resulting in a total of 50 sequence exposures per person. The post-repetition trial was identical to the pre-repetition trial. The training factor was studied at each phase of the experiment, with subjects divided into three groups according to levels of training representing an untrained group, trained group, and highly trained group. Two separate sets of ANOVAs were performed on the data. Data obtained in the repetition trials were studied in one analysis, and data obtained in the pre- and post-repetition trials were studied together in another.

The main findings of the repetition phase provided ample support for the Optimal Complexity Model. The following graph of the pleasingness means for repetitions by level of complexity shows increasing linear preference trends for complexity levels S5 to S4 and a decreasing linear trend for S1.

Fig. 3.41. Mean ratings by repetition and complexity level

Smith (1986: 26)

The results suggest that the optimal level of complexity for all subjects combined lay somewhere between complexity levels S1 and S2, as repetition for S1 decreased pleasantness ratings while ratings increased for S2. One can visualize sequences on each of these two levels of complexity as occurring to either side of the peak of the curve.
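This reading can be made concrete with a toy model. The sketch below is only an illustration under stated assumptions: the quadratic form of the preference curve, the optimum of 1.5, the starting complexity values for S1 and S2, and the fixed simplification step per repetition are all hypothetical and are not taken from Smith's data.

```python
def pleasingness(perceived_complexity, optimum):
    """Toy inverted-U curve: highest at the optimum, lower on either side."""
    return -(perceived_complexity - optimum) ** 2

def ratings_over_repetitions(initial_complexity, optimum, step, repetitions):
    """Assume each repetition lowers perceived complexity by a fixed step."""
    return [
        pleasingness(initial_complexity - step * r, optimum)
        for r in range(repetitions)
    ]

OPTIMUM = 1.5   # hypothetical optimum lying between S1 and S2
STEP = 0.04     # hypothetical amount of simplification per repetition

s1 = ratings_over_repetitions(1.0, OPTIMUM, STEP, 10)  # starts below the optimum
s2 = ratings_over_repetitions(2.0, OPTIMUM, STEP, 10)  # starts above the optimum

# S1 moves away from the peak with each repetition; S2 moves toward it.
s1_falls = all(a > b for a, b in zip(s1, s1[1:]))
s2_rises = all(a < b for a, b in zip(s2, s2[1:]))
print(s1_falls, s2_rises)
```

Under these assumptions a sequence just below the optimum loses pleasingness monotonically with repetition while a sequence just above it gains, which is the pattern the interpretation relies on.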

With repetition, the sequences at level S2, which are beyond the optimal level of complexity, simplify towards a preferred point, and sequences at level S1, which are below the optimum level of complexity, simplify further to the point of converging on a preferred level equal to that approached by sequences on level S2. The sequences at increasingly higher levels of complexity are located at points further beyond the optimal point of complexity, eliciting progressively lower ratings for pleasingness. What is somewhat surprising is the absence of a significant two-way interaction of the training variable with the repetition variable. However, there was a significant interaction of training and complexity level, with the highly trained group giving higher ratings than the other two groups for all levels of complexity. The latter result is expected as highly trained subjects should have higher optimal levels of complexity than less trained and untrained subjects, so that sequences beyond the optimal level for the overall group shift closer to the peak of the curve for the highly trained group. The only anomaly is the higher ratings given for the least complex sequences, which should shift further down from the peak of the curve, producing lower pleasingness means than those for the other two subject groups. For the two-way interaction, repetition would be expected to influence ratings by shifting the perceived complexity of sequences for the trained subject group further towards the point of decreasing preferences. For the highly trained group, the preference trend should reverse at a point earlier in the repetition trials, as the peak is traversed earlier resulting in a rising and then decreasing preference trend, thus contrasting with trends for the other subject groups and creating a significant two-way interaction.
Although this was not observed, the significant two-way interaction of training and complexity level and the significant effect of repetition across complexity levels do support the Optimal Complexity Model, and suggest that had the sample size been increased, the more subtle influence of training in conjunction with repetition might have been forthcoming. As for the possibility of a context effect of the type observed by Steck & Matchotka, Smith dismisses this as an irrelevant factor in this study on the grounds that for the repetition trials, trends were all monotonic for each level of complexity. The context effect is explained as only a “possible” cause of inverted U functions.

The only major finding from comparison of the pre- and post-repetition trials was that there was no difference in overall mean ratings between the two trials. If exposure has the effect of simplifying stimuli, one would expect sequences in the post-repetition trials to be far simpler than those in the pre-repetition trials, and as a result mean ratings should differ. Specifically, sequences representing the higher complexity levels, initially located at a point lower down on the curve beyond the optimal level, should continue to move towards the peak of the curve and eventually traverse the peak and descend down the other side. In the extreme possibility that overall mean ratings for pre- and post-repetition trials are not significantly different, yet movement has occurred, points will hold the same range of pleasingness scores and result in equal means, but with placement for the simple and complex sequences reversed. This was not the case, indicating instead that there was no movement along the optimal complexity curve, and that in fact, sequences reverted back to their original levels of perceived complexity from the time of the close of the pre-repetition trials to the time of the post-repetition trials. This is another example of the recovery type effect observed by Hargreaves in his study incorporating repetition at intervals, and is explained as a common phenomenon that poses no threat to the validity of the Optimal Complexity Model.

Smith concludes that the overall findings support the Optimal Complexity Model. She further argues that results in the two phases of the study (repetition trials, vs. pre- and post-repetition trials), were products of two separate cognitive processes. Although discussion of this hypothesis is not relevant in terms of assessing the Optimal Complexity Model, it is interesting to note that Smith identifies the repetition effects with habituation, or a process of identifying and becoming accustomed to the sameness of sequences as they are repeated. The trends observed in the post- and pre-repetition trials on the other hand, are identified as occurring as a result of a process of recognition that relies more on discrimination and identification of the nature of rule structures (whether consciously understood or heard). This is a particularly finely crafted distinction to be made taking into account the plausibility that identification of a rule structure may also develop with repetition of sequences. Further, the notion that repetition effects are a result of a process of habituation, is quite similar in tone to the type of language used to describe the Mere Exposure Theory, which states that repetition is a sufficient condition to produce liking in stimuli. This of course is certainly not Smith’s intent, as results are firmly explained as supporting the Optimal Complexity Model. However, the implication of the description of the proposed cognitive process responsible for the repetition effects appears to be slightly at odds with the concept underlying the Optimal Complexity Model, which is based on the perception and processing of stimulus content.

The studies reviewed thus far have all had varying results in terms of commenting on the Optimal Complexity Model. Studies incorporating naturalistic or real world music have been reviewed, as have studies incorporating artificially constructed tone sequences resembling anything but music, as in the case of the last three studies. Of these, all have used the same dependent variable, except for the study by Smith and the studies by Vitz and Crozier. The latter used pleasingness rather than liking as the dependent measure. Although one may question the validity of making comparisons between studies in which different dependent variables were used, researchers have found that pleasingness and liking as dependent measures correlate strongly (Russell 1994). In addition to this, the similarities in the way in which the two terms are used in language suggest that the two constructs are similar. Indeed, as Russell points out (Russell 1994: 142), many preference studies do not differentiate between the two, and often use the concepts interchangeably. Thus, the present review is quite robust in terms of standing up to criticisms of consistency. However, it is not complete. None of the previous experiments have directly studied the presence of naturalistic stimuli in everyday life in order to answer research questions, although certainly results have been obtained in order to comment on the possible effects of certain factors in the real world. The following studies by North & Hargreaves fill this gap by presenting naturalistic stimuli to subjects in real world settings. This represents an attempt to minimize the degree to which subjects feel and behave like guinea pigs with regard to the experiment. In this respect, it is hypothesized that behavior and response of subjects differs little, if at all, from that occurring in natural settings in which a preference experiment is not present.
This attempt to maximize the ecological validity of results is in response to an observation made by Konecni (1982: 498). He states that much of the experimental aesthetics literature has “treated aesthetic preference and choice as if they, and the process of appreciation itself, normally occur within a social, emotional and cognitive vacuum, as if they were independent of the contexts in which people enjoy aesthetic stimuli in daily life”. Although there is no guarantee that the ensuing experiments will report behavior which will not be sullied by the intrusion of researchers, only the best attempt can be made to minimize the researchers’ presence in the study.

In the first study of the pair, North & Hargreaves (1996) examined the complexity and the appropriateness variables with reference to rated preferences of subjects attending a yoga and aerobics class. Seven new age/ambient house music excerpts were selected, of which five served as experimental stimuli, and the remaining two served to introduce and close the list. The five experimental excerpts were selected from an array of 60 excerpts on the basis that five different complexity levels were represented, ranging from very low to very high complexity. Subjects rated each excerpt on an 11-point scale for complexity, and either preference or appropriateness, with roughly half the subjects in each class in each rating group. The excerpts were played in the last 20 minutes of each class during the class routine, followed by administration of the questionnaires. The mean ratings for each excerpt for each class and scale were calculated, and mean ratings graphed over one another for each class.

Fig. 3.42. Mean liking scores over complexity

North & Hargreaves (1996: 542)

The graph of the liking means over subjective complexity means revealed a significant quadratic trend in the relationship for the yoga group, and a significant linear trend in the relationship for the aerobics group. For both groups the quadratic model explained more variance in the dependent variable than did the linear model (given by eta squared). The present results confirm the inverted U hypothesis for the yoga group only. The fact that the quadratic model was non-significant for the aerobics group but still explained more variation (83.7% compared to 82.8%) is somewhat enigmatic and requires a closer study of the conditions under which eta-squared can yield results discrepant from those expected according to the significance of fit for the regressions.

Eta-squared, or the coefficient of determination, is calculated by dividing the sum of the squares of the regression (SSR) by the sum of the squares total (SST). The SST is obtained by summing the squared deviations of raw scores from the mean of Y, and the SSR is obtained by summing the squared deviations of the regression scores from the mean of Y. For the hypothesis test of the goodness of fit of the linear model (E(Y | X) = β0 + β1Xi), the test concerns whether the value of the slope is significantly different from zero. That is, if having information regarding values of X is any better than making predictions on the basis of the intercept alone, then the slope will be significantly different from zero, as 0·Xi + β0 will always equal the intercept. Consequently, in the case where there are outliers well below the mean of Y, it is possible to have a significant linear relationship despite a very poor value for R-squared, if there is a linear clustering of scores above the mean of Y. This may arise when outliers bring the regression line closer to the mean of Y such that the distance of raw scores from the mean of Y is larger than the distance of regression points from the mean of Y (i.e. SSR is small in comparison to SST, yielding a small value for R-squared). It is particularly likely that if the trend is vaguely quadratic, some points will have the effect of outliers with respect to calculation of R-squared for the linear model. In this case, it is likely that the value of R-squared for the quadratic model will be higher despite the model’s non-significance. This implies that the relationship between preference and complexity in the aerobics class was in fact vaguely quadratic, but that the trend was too weak to be significant. This is apparent in the curve depicted in the graph, and suggests that with increases in sample size the quadratic trend may have become more striking.
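The SSR/SST calculation can be sketched for a linear and a quadratic fit to the same points. The five data points below are invented for illustration and are not North & Hargreaves' means; the helper names `solve`, `polyfit`, and `r_squared` are likewise hypothetical. One point worth noting is that the linear model is nested within the quadratic model, so a least-squares quadratic fit can never explain less variance than the linear fit, whether or not its extra term is significant.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest power first) via the normal equations."""
    k = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(k)]
    return solve(a, b)

def r_squared(xs, ys, degree):
    """R-squared = SSR / SST for a polynomial regression of the given degree."""
    coeffs = polyfit(xs, ys, degree)
    mean_y = sum(ys) / len(ys)
    fitted = [sum(c * x ** p for p, c in enumerate(coeffs)) for x in xs]
    ssr = sum((f - mean_y) ** 2 for f in fitted)   # regression scores vs. mean of Y
    sst = sum((y - mean_y) ** 2 for y in ys)       # raw scores vs. mean of Y
    return ssr / sst

# Hypothetical mean ratings for five excerpts (not the published data).
complexity = [1.0, 2.0, 3.0, 4.0, 5.0]
liking = [7.0, 7.5, 6.8, 5.6, 3.1]

r2_linear = r_squared(complexity, liking, 1)
r2_quadratic = r_squared(complexity, liking, 2)
print(r2_linear, r2_quadratic)
```

With only a handful of points, as here, the extra variance explained by the quadratic term may be too small relative to the lost degree of freedom to reach significance, which is one way the pattern reported for the aerobics group could arise.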

A similar pair of relationships was observed between the appropriateness and complexity measures. Again, the quadratic model explained more variance in both groups than did the linear model, despite the non-significance of the quadratic model for the aerobics data.

Fig. 3.43. Appropriateness over complexity

North & Hargreaves (1996: 541)

Interestingly, both the linear and quadratic models were significant for the yoga group, although the linear trend is not readily apparent. The fit of the linear model can only vaguely be seen as a negative or inverse linear trend. These trends indicate that for the yoga group, the most appropriate excerpts were those with an intermediate level of complexity, while for the aerobics group this was not as clear. Rather, it appears that appropriateness declined quite steadily with increases in complexity. Again, due to the higher value for R-squared for the quadratic model, one would speculate that increasing the sample size would yield a significant quadratic trend, reinforcing the suggestion of a valid inverted U relationship between appropriateness and complexity.

A positive linear relationship was observed in the liking/appropriateness data, indicating that preference (or liking) increased steadily with increases in the perceived appropriateness of the excerpts. In addition to the goodness of fit tests and calculations for R-squared, a series of t-tests was conducted in order to determine if there were significant differences between group means for each of the three scales. Results indicated that the yoga group assigned higher overall ratings for liking and appropriateness than the aerobics group, while no differences were observed in ratings between groups for complexity. North interprets this last result as providing evidence contrary to the Optimal Complexity Model, suggesting that additional factors were influencing preference. As excerpts did not vary between groups in terms of perceived complexity, the difference in liking scores cannot be attributed to differences in perceived complexity. The logical explanation for the difference is that higher perceptions of appropriateness in the yoga group elicited higher preference ratings. Here we have a confirmation of the influence of perceived appropriateness on preference, and a weighting in favor of accepting that the nature of the relationship between liking and perceived complexity, and appropriateness and perceived complexity, for the aerobics group, is indeed inverse linear. This implies that an intermediate level of complexity is not the only issue, and that the perceived degree of appropriateness of excerpts for the listening situations can have an overriding influence. Despite this, the clear inverted U trends for the yoga group, and the higher values for R-squared given by the quadratic model for the preference/complexity and appropriateness/complexity relationships, indicate that complexity was an influencing factor.
North & Hargreaves appear to be somewhat ambivalent in favoring one explanation over the other and take a neutral position, suggesting that both factors had important influences.

In the second study of the pair, North & Hargreaves (1996) presented five musical excerpts to diners at a university cafeteria who rated preference on an 11-point scale. In addition, subjects rated the number of aspects of the environment they would like to change. Excerpts consisted of 20 short pieces of music divided into five groups of four, representing low, moderate, and high complexity new age music, and moderate complexity organ music. The addition of the moderate complexity organ music permitted researchers to compare the effect of music styles on preference and determine whether complexity alone was the primary influencing factor, or whether other factors had an important presence. A stall was set up and excerpts played through a loudspeaker, followed by collection of data. A one-way ANOVA and Tukey’s honest significant difference test (HSD) indicated that the mean preference score for moderate complexity new age music was significantly higher than that of other groups, which exhibited no significant differences between one another. Further, the largest number of items about the environment that subjects would like to change before listing the music was for the moderate complexity new age genre. The two trends matched each other closely, as can be seen in the graph of the means.

Fig. 3.44. Mean preference scores and mean number of aspects cited before music for each genre.

North & Hargreaves (1996: 498)

The inverted U shape in the preference/complexity data within the new age genre category supports the validity of the Optimal Complexity Model. However, the lower preference ratings for the mechanical organ music, which had an equivalent level of complexity to the moderate complexity new age music, indicate that complexity alone did not determine preferences. The trend between rated preferences and the number of aspects of the environment subjects would like to change also suggests that the inverted U trend between preference and complexity is susceptible to, in North’s terms, “breaking down” depending on the intrusion of other factors. That is, the number of aspects cited before mentioning the music increased to a maximum as expected according to the Optimal Complexity Model, but only for the moderate complexity new age music. The same result did not hold for the moderate complexity organ music. North suggests that other factors worth studying, in order to disentangle the effects of variables such as complexity from those of others, are ones relating to the socio-psychological characteristics of the music. The degree to which music of a style is an appropriate exemplar of the type of music expected in a given listening context is included in this category. These two studies make some progress to this end and clearly indicate that the other factors can override effects of complexity and produce results which may appear to invalidate the basic tenets of the model, raising doubts about the model’s applicability and overall usefulness in predicting preference trends as they occur in everyday life.

North & Hargreaves’ studies appear to be motivated in particular by the consistent and enigmatic presence of the prototypicality variable as a determining factor of music preference. In each case, the importance of the prototypicality of stimulus excerpts is referred to. In the first study, appropriate excerpts are ones that match the prototype excerpt commonly heard or expected for a given setting. Consequently, the prototypicality variable had an indirect effect on preference ratings by determining the degree to which excerpts were perceived as appropriate for the setting. In the second study, the mechanical organ music of moderate complexity was disliked, with this finding explained as a result of the rarity with which this type of music is encountered as background music in the cafeteria. The mechanical organ music has few traits in common with the perceived prototype harbored by most cafeteria diners. Both studies implicitly refer to the importance of prototypicality in influencing subjects’ aesthetic response to music, and suggest that prototypicality is a worthy topic of study in terms of commenting on the Optimal Complexity Model.

It has already been mentioned that prototypicality may have been a confound for complexity in a number of studies, including Heyduk’s famous preference study using specially written compositions. If, in this case, prototypicality was in fact the primary determinant of preferences, and complexity was only correlated with this factor without having much influence on its own, then it must be recognized that prototypicality is a serious contender for explaining music preferences and a stumbling block for the Optimal Complexity Model. It has been mentioned that distinguishing the effects of prototypicality from the effects of other variables, such as complexity, is a tricky matter. In addition, the similarity between the concepts of prototypicality and familiarity makes differentiation between their effects that much harder. One would think that typical objects are inherently familiar and atypical objects are unfamiliar (or novel). If this is true, the Optimal Complexity Model accounts for prototypicality by building familiarity into the model. Despite this intuition, the validity of the familiarity/prototypicality equivalence seems unlikely. The statement made earlier implies correlation only and is not the same as the statement “familiarity is prototypicality”. If the relationship is purely correlational, then there is always the possibility that in some circumstances prototypicality will behave differently from familiarity, in which case these concepts must be treated as separate and individual variables. This is best illustrated by way of a simple thought experiment. If one thinks of a bird, a familiar image is created. The strange-looking bustard, however, would for most people clearly not be prototypical of the category of birds. Yet it may still be familiar, as it is without a doubt a bird and not a strange thing: it has wings, feathers, and a beak.
In addition to this logic, differences in the way the terms are used in language also suggest a distinction in their meanings. Familiarity is the degree to which a person recognizes an object. A person may be more or less familiar with something. Prototypicality is a construct that describes an object. The concepts differ in the perspective each adopts. Familiarity describes the degree of familiarity in the subject, so to speak, whereas prototypicality describes the degree of familiarity in the object, that is, the degree to which it has traits “familiar” with the concept of the object.

From this discussion, not only do we better understand the meaning of prototypicality, but we also arrive at the characteristics that distinguish prototypicality from familiarity. Although much of the past discussion may resemble an exercise in semantics, or at best, seem trivial, it does adequately determine that familiarity and prototypicality are distinct constructs and must be treated as such. Consequently, the problem of prototypicality as a confound for complexity in reviewed studies is a serious possibility, and gives rise to an element of doubt regarding the validity and applicability of the Optimal Complexity Model. A clarification of the nature and nuance of prototypicality is warranted and paves the way for developing a better understanding of how the prototypicality variable operates within the context of music perception and appreciation.

The Prototypicality Models

In his research on subjects’ aesthetic choice patterns for furniture, Whitfield (1979) indicates that the origins of the term “prototype” date back to Aristotle, who determined that all objects fit into categories that are discrete and bounded. Any one object is lumped into one category or another on the basis of its possession of a few “critical features” which characterize the object. Chairs are used for sitting. Consequently, a flat plane that comfortably fits the buttocks must be one of the critical features. Further, as sitting requires that the legs and feet extend from the body at a comfortable angle, the flat plane, or “seat,” must be raised above the ground by a certain amount. A support is needed for the seat. This support commonly appears in the form of “legs”. As a result, chairs are readily recognized according to whether they have a seat and legs.

Whitfield points out that more recently, research by Rosch (1975) puts less emphasis on the importance of the idea of bounded categories, and instead stresses the notion that objects are perceived in terms of family resemblances. Exemplars, or prototypes, are determined according to the degree that they bear features in common with other objects of a class, where the class is the category denoted by the concept. A chair of the type found in classrooms may be a good prototype, and as such be readily perceived and recognized as a chair. A rock, on the other hand, may not as easily be perceived as a chair as it does not bear a family resemblance to the prototype. The degree to which objects deviate from the prototype, which can be understood as holding the peak or mean of a symmetrical distribution of like objects, determines the extent to which objects are perceived as belonging to that category of concept. In this depiction, the y-axis measures the number of common or shared attributes, and the x-axis lists the objects.

Fig. 3.45. Depiction of the family resemblance hypothesis

In our example of the prototypical chair, the classroom chair may hold the peak of the curve while the rock, which has fewer common attributes than most other chairs, is placed in one of the tails of the distribution. If we designate the distribution as normal, we see that no category with distinct bounds is possible, as the tails approach zero on the y-axis asymptotically and so continue indefinitely in either direction. From this observation it is possible to determine that any object will have the possibility of being perceived as falling into any other family of objects. There will always be some form of common attribute in any object with every other object, although this commonality becomes vanishingly small as objects diverge from the exemplar (i.e. move out into the tails of the distribution). This observation brings us to the second qualification for the identification of a prototype for a concept. Not only must the prototype have the most shared traits within its concept class, it must also have a minimum number of traits shared by objects belonging to another concept class. An object can then be classified on the basis of determining which prototype has the most traits in common with the object. This is demonstrated below:

Fig. 3.46. Depiction of the overlap in the domain of two different prototypes

As can be seen, the mean of each of the respective distributions has the largest number of traits in common with other objects belonging to the same concept class, at the same time as having a minimum number of traits in common with objects belonging to the other concept class. Objects to the left of the distribution for prototype A may have fewer traits in common with prototype B than prototype A, but they also have fewer traits in common with other objects belonging to the same class. The prototype maintains a balance between being maximally discrepant from the objects of other concept classes, and most representative of objects of its own concept class. In this depiction it is clear that the rock (to continue with the original example) now can be understood as having traits in common with two concept classes, class A and class B, each of which is represented by a prototype. The question is: To which concept class does the rock most properly belong? Prototype A represents the concept “chair”, and prototype B represents the concept “boulder”. The rock, located in the far right tail of the distribution for prototype A, has no traits in common with the chair prototype and two traits in common with the boulder prototype. Clearly, the rock belongs in concept class B as it has more traits in common with the prototype for this class.

Reed (1972), in his research on categorization of schematic faces varying on dimensions of facial attributes, develops mathematical models for describing the hypothesized means by which objects are perceived and classified into categories. In his depictions, a distinction between distance models and probability models is made. The family resemblance model described above is a probability model, as assimilation of an object to a concept class is dependent on cue validity, where the validity of a cue is understood as its frequency of occurrence in one class divided by its total frequency of appearance in all classes under consideration. In the above example the cue validities of the squares would be calculated as follows:

Total frequency for all square types = 8
For concept A: frequency of square = 5
For concept B: frequency of square = 6
Cue validity of square for concept A = 5/8 = .625
Cue validity of square for concept B = 6/8 = .75

In this case, the square (an attribute) has more validity as a cue for concept B, the boulder, than for the chair. Everything else being equal, the attribute denoted by the square will be more likely to lead us to perceive an object containing that attribute as a boulder rather than a chair. In Reed’s study, the model that Rosch termed the family resemblance model is known as the cue validity model, and is placed in the context of an object being assigned on the basis of consideration of all cues apparent in the object. Placement of the object is done by averaging the cue validities of the relevant cues for each category, and placing the object in the class with the highest resulting validity. In the example of an object with the attributes represented by the square, circle, and x, the procedure entails finding the cue validity for each of these cues and averaging them for each concept class separately, and then placing the object accordingly.
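The averaging procedure just described can be illustrated with a short Python sketch. It is an unweighted simplification, not Reed’s own computation. The attribute counts are hypothetical and are not the counts shown in the figure, and the function names `cue_validity` and `classify` are invented for the example.

```python
# Hypothetical attribute counts for each concept class (not taken from the figure)
frequencies = {
    "chair":   {"square": 3, "circle": 6, "x": 1},
    "boulder": {"square": 5, "circle": 2, "x": 4},
}


def cue_validity(cue, category, freqs):
    """Frequency of the cue in one category divided by its frequency in all categories."""
    total = sum(counts.get(cue, 0) for counts in freqs.values())
    return freqs[category].get(cue, 0) / total if total else 0.0


def classify(cues, freqs):
    """Average the cue validities for each category and pick the highest average."""
    scores = {
        category: sum(cue_validity(cue, category, freqs) for cue in cues) / len(cues)
        for category in freqs
    }
    return max(scores, key=scores.get), scores


# An object showing all three attributes: square, circle, and x
label, scores = classify(["square", "circle", "x"], frequencies)
print(label, scores)
```

With these illustrative counts, the validities of any one cue sum to one across the two classes, so the averaged scores for the two classes also sum to one.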

Another model, known as the prototype model, simply takes the mean of each attribute for the concept class. The prototype is defined as the object with the mean value for each of these attributes. In Reed’s study, schematic faces vary on four dimensions: length of nose, distance between eyes, height of mouth, and height of forehead. The mean nose length, mean distance between eyes, and so on, are calculated for each separate category, and the resulting values used on those dimensions to represent the prototype for that category of faces. The prototype model is a distance model, as any schematic is placed on the basis of its degree of similarity to the prototypes expressed in terms of distance. If the distance from face X to prototype A is less than or equal to its distance to prototype B, face X is perceived as belonging to category A. The distance models rely on a transformation of similarity measures to distance measures in some n-dimensional (2 or 3 dimensional) Euclidean space. Although the exact computation for this transformation is complex, the intuition is straightforward. Similar objects are closer together in the sample space, and different objects are farther apart. In this way one can measure the distance an object lies from prototypes within this sample space, and determine on this basis whether an object will be perceived as belonging to the class of one prototype or another.
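A minimal Python sketch of the unweighted prototype model follows. It makes two simplifying assumptions that are not Reed’s: the four feature values are treated directly as coordinates (skipping the similarity-to-distance transformation mentioned above), and the face measurements are hypothetical. The names `prototype` and `classify` are invented for the example.

```python
import math

# Hypothetical faces: (nose length, eye distance, mouth height, forehead height)
categories = {
    "A": [(3.0, 2.0, 1.0, 4.0), (3.4, 2.2, 1.2, 4.2), (2.6, 1.8, 0.8, 3.8)],
    "B": [(5.0, 3.0, 2.0, 2.0), (5.4, 3.2, 2.2, 2.2), (4.6, 2.8, 1.8, 1.8)],
}


def prototype(faces):
    """Mean value on each dimension across the faces of one category."""
    return tuple(sum(dimension) / len(faces) for dimension in zip(*faces))


def classify(face, prototypes):
    """Assign the face to the category with the nearest prototype (Euclidean distance).

    On a tie, min() returns the first category listed, so category A wins a tie
    with B, in line with the less-than-or-equal rule described in the text.
    """
    return min(prototypes, key=lambda name: math.dist(face, prototypes[name]))


prototypes = {name: prototype(faces) for name, faces in categories.items()}
label = classify((3.2, 2.1, 1.1, 3.9), prototypes)
print(prototypes, label)
```

The weighted variants described by Reed would adjust these distances so that some dimensions count for more than others. That step is omitted here.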

An additional feature of all of the models described by Reed is a weighting procedure. It makes sense, as indicated as early as Bruner (1957), that some features have more value in determining what class an object belongs to than others. For the prototypical chair, one would expect that the seat feature is more important than the back support, and as such should be given more emphasis with regard to determining whether an object belongs to the class of chairs or not. This is done by a weighting procedure, which for the cue validity model entails multiplying cue validity by a prior probability (the probability of a cue leading to assimilation to a category before categories are viewed by the subject), and a sample probability, defined in terms of the total frequency of the occurrence of an attribute. The weighting procedure for the prototype model incorporates matrix algebra in order to adjust distances of objects within the sample space. In total, Reed explains four separate models, with weighted features for three of them, and presents the mathematical computations for each in his appendix.

Researchers from a wide range of disciplinary fields have contributed additional nuances to the concept of prototypicality. The current discussion represents an entry point to a wider body of literature, much of which lies in the domain of categorization in visual perception. Certainly, the specific computations and weightings for these models are not easily applied to the context of music (auditory) perception, primarily as a result of the temporal nature of music. The music perception literature uses a different language in detailing prototypicality theory, centered on the concept of the schema. Much of this literature has incorporated schema theory to explain obtained results when examining the perception of melody and tune structures (Rosner & Meyer 1982; Attneave & Olsen 1971; Deutsch 1972; Dowling 1978), motive structures (Welker 1982), and harmony structures (Bharucha 1987; Krumhansl, Bharucha & Castellano 1982; Krumhansl & Castellano 1983). The defining characteristic of schema theory is that it not only takes into account the object in terms of those traits that make it atypical or typical, but that it also puts substantial emphasis on underlying perceptual processes that occur within the subject. Objects are understood on the basis of a match or mismatch of the object with abstract representations (or schemas) of it (Schubert 1996), which are formed in the psyche of the subject over time and through experience. Although consideration of this literature in any depth merits its own chapter, a study by Melera (1990) sheds a great deal of light on the problem of prototypicality and complexity within a schema-oriented framework.

Melera (1990: 279) writes that “...aesthetic pleasure [derived from the experience of music] comes from an exquisite game of expectational cat and mouse with the composer, in which the listener enjoys the tensions and the resolutions, the problems posed, and problems solved, the confusions followed by comprehension.” Here the central theme is that pleasure is derived when a music object is “optimally discrepant from a schematic ideal”. Melera points to an array of literature which supports this view against the alternative, which is that prototypical objects are most preferred, where the prototypical object is the schema or abstract representation of the object. One can imagine the prototypical sonata as the textbook sonata bearing all the most typical and common structural traits. The optimal discrepancy theory would suggest that such a piece of music would be boring as it would offer little in terms of pleasantly surprising the listener. Rather, maximum pleasure is derived from the sonata that adheres to the critical formal aspects of the textbook sonata but which deviates from the norm just enough to elicit the response of “boy that’s clever” from the listener. Indeed many of the Beethoven sonatas and even Mozart sonatas do just this (e.g. the Mozart C-major sonata K.545, whose development starts off in the parallel minor of the close of the exposition instead of continuing in the dominant), although it could be argued that “maximally discrepant” for Beethoven was a great deal more discrepant than for Mozart and contemporaries of Beethoven’s day. Melera studies the applicability of these two hypotheses and arrives at some interesting findings that have implications for the prototypicality/complexity dilemma.

A prototypical harmonic progression was constructed along with six transformational levels of the prototype and played to groups of novice, expert, and high-level expert listeners, who rated each for pleasingness, unusualness, complexity, and interestingness. Each transformational level represented an increasingly syntactically atypical progression. Normalized scores were then regressed onto one another for various combinations of the variables. Results indicated that novices and low-level experts had the greatest preference for the prototype, with preference decreasing steadily with increases in unusualness, complexity, and syntactic atypicality. The opposite trend was observed for the high-level expert group. For the novices and the low-level expert group, the results support a preference-for-prototypes model rather than an optimal discrepancy model, while it is hypothesized that if the range of syntactic atypicality were increased, the trend for the high-level expert group would be congruous with the optimal discrepancy model. The pertinent findings with regard to complexity were, firstly, that no inverted U shaped trend was observed relating pleasingness to complexity, and secondly, that complexity and unusualness ratings were highly correlated, indicating that as syntactic atypicality increased so did the perceptions of the degree of complexity. Melera points out that atypicality can be viewed as a “highly psychologized” form of complexity, but distinguishes the traditional literature on complexity (the new experimental aesthetics school), which has a tendency to incorporate information theory or objective measures of complexity, from the prototypicality school, which has often incorporated schema theory, on the grounds that each has produced somewhat differing results.
Melera proposes that the “still newer experimental aesthetics” would do well to integrate the relative strengths of both schools, but by starting on the premise that ultimately we are dealing with human beings who each have prior formed knowledge structures regarding music and aesthetic objects in general, which are pertinent to the determination of aesthetic behavior.

Conclusions

The present chapter has touched upon a range of models and theories, which in the aesthetics, music preference, and perception literature have been given a great deal of attention over the last fifty years. The Mere Exposure Theory was introduced, with the literature pointing to its applicability to only a very narrow range of contexts. Berlyne’s psychobiological theory of aesthetic preference, which is no doubt the single most important theory in the aesthetics literature, was briefly illustrated, and its relationship to the more specific domain of the Optimal Complexity Model was discussed. The concept of complexity has been given much attention by Berlyne and others as one of the most important variables in determining aesthetic preference for objects. Consequently, much effort was spent researching the validity of the Optimal Complexity Model, and indirectly, specific claims of Berlyne’s aesthetic theory. The literature on complexity produced a wide range of results, some of which cast strong doubts on the applicability and validity of the Optimal Complexity Model. The review as a whole suggests that the inverted U hypothesis relating preference to complexity is certainly an important trend.

However, its lack of consistency in terms of appearing in nature makes it an extremely fleeting and enigmatic phenomenon. The effect of prototypicality adds to the difficulty in arriving at any concrete conclusions regarding the role of complexity in determining aesthetic preference, and suggests that any one variable alone can ultimately account for only a portion of the relevant theory. This, of course, is implicitly stated in Albert LeBlanc’s taxonomy, which reserves a single place for complexity among many in the network of relevant influences. It is perhaps partly vogue that has seen such a huge volume of published literature on complexity, and certainly much of this is apparent in journals from the 50s, 60s, and 70s. More recently, research has tended to accord complexity a less substantial place in the literature, focusing instead on a wider range of influencing variables. From this chapter, it is clear that a more detailed review of Berlyne’s aesthetic theory is needed, as well as a more thorough review of the growing body of literature producing results contrary to those implied by Berlyne’s theory. The entrance of prototypicality serves as a useful countering force to the persistent drum beat of the formalists in their infatuation with complexity, and offers a line of enquiry that must be pursued within the more specific domain of music preference. A further clarification of the role of prototypicality not only provides commentary on the Optimal Complexity Model, but also opens the way for a study of alternate theories and models for explaining the ambiguity and variation in findings in the vast body of literature on preference behavior.


Ch. 4

Abstract

Patterns of preference for 12 styles of music were examined, and the effects of gender, age, and training investigated. Subjects responded by indicating their preference on a ten-point like/dislike scale, and information on gender, training, and age was collected. Analysis of variance revealed significant main effects for training and style. In addition, the style by training, style by age, and style by age and training interactions were all significant. Music students gave higher preference ratings on average than non-music students overall. Non-music students tended to give higher preference ratings for the pop and rock genres, while music students gave higher ratings for the jazz, blues, folk, and classical genres of music. Older students gave higher ratings than younger students for the seven classical genres, while there was little difference in ratings for the other genres of music. The 3-way interaction revealed differences in the way young and old subjects responded to styles of music by training. Results broadly support findings in the mainstream literature regarding the effects of age and training on preference; however, the lack of effect of gender does not correspond to the popular perception regarding the importance of gender influences.

Introduction

The vast number of music preference studies can roughly be divided into those examining the effects of the characteristics of the listener, the environment, and the music, on music preference (Finnas 1989; Wapnick 1976). This artificial division is useful as a means of grouping studies and topic areas, lending some structure to a field that is notoriously fragmented (Hargreaves 1986). However, it does mislead one to think of music preference in terms of these three separate domains. Interactions are of particular importance, especially those occurring across domains, as clearly listening and music preference behaviour cannot occur without a context, subject, and musical object. The interaction, of course, is, and always has been, one of the phenomena of most interest in all empirical research, with this preoccupation perhaps best captured in a quote by Fisher (1926: 511).

“No aphorism is more frequently repeated in connection with field trials, than that we ask Nature few questions, or ideally, one question at a time. The writer is convinced that this view is wholly mistaken. Nature, he suggests, will best respond to a logical and carefully thought out questionnaire; indeed, if we ask her a single question, she will often refuse to answer until some other topic has been discussed.”

Here, the point is that truth is only revealed upon considering two or more questions (factors) and their relation to one another (interaction). Although interactions are certainly grappled with, the music preference literature per se suffers from lopsidedness as a result of a tendency for individual studies to filter out, or overlook, potentially worthwhile variables in an apparent attempt to simplify an extremely complex and varied phenomenon. The net result is an accumulation of many different parts of the picture. This has most likely reduced the detail and scope of the body of theory, which arises most naturally out of a need to explain results and give some solid and consistent meaning to findings. For the most part, only pockets of theory pertaining to subject, object, and environment are extant in the published music journals, with object variables, particularly complexity, yielding the most comprehensive body of theory. This makes it difficult to construct a theoretical framework from which the diversity of influences taking place in the music listening experience can be mapped and understood, and the number of attempts to do so is limited, with LeBlanc’s (1982) interactive model one of the few efforts to structure the phenomenon in its entirety. Certainly, more attention should be given to complex interactions, especially those between the domains of subject, object, and environment. This is clearly difficult, and often the only means of observing the influence of any one variable is by eliminating potential confounds. This means restricting the enquiry and using controls to enable a solid determination of the effect of one variable alone. LeBlanc (1981, 1983, 1983) found that a process of filtration can be extremely effective in coming to terms with a factor’s influence, and it is certain that much of the literature has been given its present structure as a result of this methodological consideration.
A respectable number of studies in the music preference literature do look at complex interactions of factors across domains. Object factors are commonly examined with particular combinations of subject factors, which are often associated with one another as they are striking features of the sample (gender, training, cultural background, and age). The current study follows in the tradition of engaging complex interactions between subject variables, yet maintains a distance between subject and object factors, as the design required. Subjects responded to labels of styles of music rather than actual music, thus the focus of this study was on subject factors.

The empirical research takes an exploratory approach to studying patterns of preference for styles of music as determined by gender, age, and training. The phenomenon of primary interest was the interaction of these variables, as already published literature leaves little ambiguity with respect to the nuance of variable influences generally. LeBlanc has studied the maturation (or age) variable extensively, and has developed a formal theory known as the “open-earedness hypothesis” for explaining these effects. The theory states that young children will be open to a diversity of musical styles, with maturation diminishing open-mindedness to music until early adulthood, whereupon persons again evidence greater appreciation for a range of music. “Open-earedness” declines once again with the advent of old age. Some of the most convincing evidence of this pattern has come from LeBlanc (1996), who observed the entire breadth of the trend, and Fung (1999/2000), who observed the predicted trend in subjects in the young to adolescent age range. Part of the aim of the present study was to duplicate these results, albeit within a fairly narrow age range among the young adult/adult group.

Gender has been given an enormous amount of attention, particularly with regard to musical instrument choice and gender role stereotyping. Abeles & Porter (1978) found patterns in the degree to which instruments were perceived as masculine or feminine among students, with loud and percussive instruments viewed as masculine and the softer timbred instruments, such as flute and woodwinds, deemed feminine. A possible parental influence was also uncovered in shaping student perceptions regarding appropriateness of instrument choice, depending on gender. Further, gender differences were observed to increase with age, and educational (authority) biases may be a precipitating factor. Much of the more recent literature has arrived at similar findings (Delzell & Leppla 1992; Fortney, Boyle & DeCarbo 1993; Conway 2000; Harrison 2000), suggesting that gender differences still predominate, not only with respect to instrument choice, but also with regard to gender roles in general (Shepard & Hess 1975; Brines 1994). Frith (1981) and Brake (1980) argue that traditional views of gender roles are reinforced by society, which has placed a premium on gentle and sensitive traits for women, and more competitive and aggressive traits for men. This societal pressure is argued to be primarily responsible for gender differences in preference for styles of music, with males opting for harder rock genres, and females preferring the more romantic and danceable forms of pop. Evidence in the music preference literature would seem to corroborate this view (Christenson & Peterson 1988; Roe 1985; Fox & Wince 1975), although questions remain as to the degree to which differences are culturally derived or inherent. For example, the choice not to take up certain instruments may be a result of physical factors. Females, who on average are smaller and physically weaker than males, may be put off by larger instruments such as the trombone, which require a certain degree of physical strength and stamina to play (Fortney 1993). In addition, studies have found gender differences in preference for styles of music, with corresponding personality correlates (Rawlings 1997), suggesting that certain personality dispositions may be more prominent depending on gender, which in turn may explain musical instrument choice and/or music style preference.

Crowther & Durkin (1982) studied the interaction of gender and age. In his review of the literature, Crowther points to a gender paradox with respect to attitude and performance in the music educational system, and gender disparities in the professional music world. It has been well documented that in schools females tend to have more positive attitudes towards music and perform better generally (Bentley 1975; Shuter 1968). However, there is a larger proportion of males with professional performing careers (Jacob 1980). Crowther speculates that this may be the product of a societal pressure that stresses the importance for males of traits that lead to successful employment and financial security, while traits important in developing and fostering interpersonal relationships are stressed for females. As such, success in the music educational system is seen as reflecting this tendency. Scrutiny of data obtained by Crowther & Durkin corroborates the finding that females are generally more positive toward music and music education, but does not support the hypothesis that age affects gender trends differentially. A musical interests and attitude to music questionnaire was administered to a sample of secondary school students in a rural town in Southern England. Results indicated that attitudes to music in general (without respect to specific genres) increased overall with age for both males and females. In addition, females had more positive attitudes at all ages, with this difference most prominent in the early school years, although the interaction of age and gender was not significant.

Training would seem to have obvious and clear effects on music preferences. However, somewhat disparate findings have been produced over the effects of individual music appreciation courses on attitudes, and what Price (1988) calls opinions. Price defines opinions as preference responses to individual music excerpts that are experienced, whereas attitudes reflect preference responses for categories of music in the absence of the experience of specific musical stimuli. Price (1990; 1988), in a series of studies, observed that increases in knowledge of composers and styles of music were not accompanied by increases in preference opinions for music excerpts, although attitudes were affected positively. This finding conflicts with the earlier findings of Bradly (1971), who observed increases in preference following an instructional period. Further, studies incorporating exposure periods alone have also found positive changes in preference (Mull 1957; Schukert 1968; Hargreaves 1984), although these findings are explained in terms of a different theoretical construct. Shehan (1985) observed increases in preference ratings for excerpts of non-Western and Western genres of music following a teaching period, as did Flowers (1988) who measured the effects of two different training on preference for excerpts from Western classical symphonic works. Despite the discrepancy in findings, training can be assumed to positively influence general attitudes to music, although this may not be the case when looking at individuals per se.

This is further corroborated by studies comparing preference responses of music and non-music students. Hargreaves (1980) found that, overall, music students had a greater preference for classical (taught) genres of music than pop genres, with preference means for classical music higher for music students than non-music students. Interestingly, little difference was observed in the preference ratings of non-music students between classical and pop genres. Geringer (1982), using the OMLR (operant music listening recorder), found that music majors spent more time listening to Bach and Beethoven than to Barry Manilow and John Denver, whereas non-music majors spent more time listening to the latter two artists. Clearly, training has significant effects on attitude if not “opinion”, and it is certainly the goal of music departments in secondary and tertiary education for this to be the case. One can interpret Price’s findings as a result of ineffective or poor teaching method. Specific changes to the teaching method may have resulted in the predicted (and desirable) outcomes.

An additional finding in Geringer’s study concerns the age variable. An additional group of 5th and 6th grade elementary students listened to excerpts via the OMLR. These younger subjects spent less time listening to the classical composers and more time to the pop artists than both the music majors and non-majors, indicating that age increased listening times (or preference) for classical genres and decreased listening time for the non-classical genre, with training augmenting this trend considerably in either direction. Consequently, it appears training affects preference differentially depending on style of music, with age having a slightly weaker presence in the interaction. The purpose of the present enquiry, apart from observing the effects of training in general, was to see if similar interactions to the one above would turn up. A similar interaction of gender with the style and training variable was also of interest. Hargreaves (1995) conducted one of the few studies examining this interaction. Subjects, consisting of boys and girls aged between 11 and 16 years, indicated their preference for labels of styles of music. In addition, demographic data was obtained detailing subjects’ musical training. Results indicated that females had more training than males. Preference increased significantly with age for jazz, country/western, folk, and reggae styles of music, while preference decreased for rock, heavy metal, and rap. Males preferred rock and heavy metal significantly more than females, while females preferred reggae, pop, jazz, classical, and opera more than males. No significant interactions were observed for the age/gender combination across styles of music. Further, training was significantly correlated with classical, jazz, opera, folk, blues, and rock. The anomaly in this study is the peculiar response to rock music. Given previous research, we would not expect rock music to be positively correlated with training. Further, according to LeBlanc’s open-earedness hypothesis, rock should be preferred more among the older teenagers in the sample. However, despite these details, the general findings do correspond roughly to expected trends. A significant style by age interaction and style by gender interaction were observed. Somewhat disappointing was the lack of interactions of training with age and gender variables, and part of the purpose of the present research was to determine if such interactions were apparent in the data.

Note: Geringer did not report the interaction described above due to violations of basic assumptions. Geringer shifted to non-parametric tests as a result, and reported the significance of chi-square test statistics for levels of the within-subject factor (composer) and levels of the between-subject factor (subject group), followed by post-hoc tests run on each level separately (i.e. differences between mean listening times of artists/composers for non-music majors, music majors, and elementary school students). Although the three-way interaction was not reported, observation of the differences would suggest that it was significant. Consultation of the graph of mean listening times for generalized genres (i.e. classical vs. non-classical) further supports this notion, and is presented in the discussion.

Method

Subjects were 163 students recruited from a university in Sydney, Australia. The total subject pool consisted of 73 music students and 90 non-music students. The music preference survey consisted of labels of 12 genres of music with a 10-point dislike-like scale associated with each. In addition, the option “don’t know” (listed as “DK” in the survey) was included for each label in order to discourage subjects with little or no knowledge of a music style from randomly choosing a preference score and contributing to overall means. A demographics section was also included in order to collect data on subjects’ age, gender, and musical training.

The questionnaire recorded subjects’ responses to labels of music rather than actual music excerpts. This method has obvious shortcomings, particularly with regard to reliability. Subjects responding to style labels, such as “classical” or “modern” music, contribute scores on the basis of their personal understanding of what types of music constitute those particular labels. Consequently, for any two subjects, any one label may have differing domains. This would be particularly true of labels that have consistent meaning and application within select or specialized groups. The label “post-modern” music, for example, may for an educated music student encompass the music of John Cage and Steve Reich, but not Stravinsky, while for the lay person or non-music student, all of the above may fit conveniently into the label “classical” music. However, the benefits in this context were deemed to outweigh the disadvantages in much the same way they did for Hargreaves (1995: 245), who pointed out that the strength of using style labels is that:

“participants are responding to their conception of the generic style as a whole rather than to the specific and possibly unrepresentative features of particular examples of it, and that a very broad range of musical experience can be rapidly summarized.”

In Hargreaves’ study, style labels were chosen on the basis of a pilot study in which labels were tested informally among subjects as to their meaningfulness, and the labels that were most representative and most often used were chosen for the main body of research. The means for choosing labels in the present study differed somewhat, as labels were primarily chosen with regard to their use and meaning in music academia. The labels commonly used in music textbooks and music history classes to designate time periods representing roughly homogeneous stylistic content were chosen. The labels used to represent what is commonly viewed as a whole as “art music” or “classical music” consisted of the labels “medieval”, “renaissance”, “baroque”, “classical”, “romantic”, “modern”, and “post-modern” music. For non-art music genres the commonly used and fairly unambiguous labels “jazz”, “blues”, “pop”, and “rock” music were chosen. “Folk” and “ethnic” were also included as labels; however, the latter was disregarded and removed from the study as it quickly became apparent that many persons of non-Caucasian origin found the term offensive. For future reference, it is recommended that the term “world music”, incorporated by Fung (1996), be used instead. This was not done in the present investigation due to the complications inherent in making changes to the survey during the collection phase of the study.

A stratified random sampling method was chosen as a means of obtaining the desired minimum quotas of subjects. It was determined on an a priori basis that the objective would be to have at least 25 subjects for each two-way combination of between-subject variables, in order to maximize power. Ideally, the aim was to obtain the same numbers of subjects for each variable combination. That is, 30 male music students, 30 female music students, 30 male non-music students, and 30 female non-music students, with the same numbers of young to old students for each training group. The same distribution was not needed for the gender/age variable combination, as this interaction was not of great interest. The three-way between-subject combinations were not considered, as it was determined that the overall sample size would not be large enough to accommodate such an objective, and would have little analytical value anyway.

As it turned out, hitting these quotas was quite difficult. In the music department there were comparatively fewer males than females, and of these, a good portion opted not to participate in the survey. Further complicating the matter was the treatment of the “don’t know” option (labelled DK). Two out of three methods for treating this data involved cutting out subjects from the study. Consequently, with commencement of the data entry phase, the number of valid cases was revised, and it was decided not to re-sample in the interest of keeping a time constraint on the sampling phase of the study. Methods for dealing with DK data included retaining all surveys with DK data, cutting surveys with more than two DK categories chosen, and cutting all surveys with DK options chosen. This trimming of the original sample led to the creation of a total of three different samples to be analysed. In addition, for the samples in which DK data was included, imputation methods had to be incorporated, further increasing the number of different samples open for analysis. Three imputation methods were used and analysis of the resulting samples carried out, as discussed in the analysis section.
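The three treatments of DK data can be illustrated with a minimal sketch. It is not the procedure actually run for the study (which used SPSS); the DK marker, the function names, and the toy surveys are assumptions introduced purely for illustration.

```python
# Illustrative sketch only: DK is represented by None (an assumption),
# and each survey is a list of ratings, one per style label.
DK = None


def count_dk(ratings):
    """Count how many style labels a subject answered with DK."""
    return sum(1 for r in ratings if r is DK)


def build_samples(surveys):
    """Return the three samples: no cuts, N2 cuts, and all cuts."""
    no_cuts = list(surveys)                              # keep every survey
    n2_cut = [s for s in surveys if count_dk(s) <= 2]    # drop surveys with more than two DKs
    all_cut = [s for s in surveys if count_dk(s) == 0]   # drop surveys with any DK
    return no_cuts, n2_cut, all_cut


# Toy data: three hypothetical subjects rating four styles.
surveys = [
    [7, 8, 5, 6],       # no DK responses
    [DK, 8, 5, 6],      # one DK response
    [DK, DK, DK, 6],    # three DK responses
]
no_cuts, n2_cut, all_cut = build_samples(surveys)
print(len(no_cuts), len(n2_cut), len(all_cut))  # 3 2 1
```

The first two samples still contain DK responses, which is why imputation is needed before they can be analysed.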

Fig. 4.1. No cuts

The numbers are the same across all styles, as style is a within-subjects variable. As can be seen above, there were 46 young music students (age 15-20), 27 old music students (age 21+), 37 young non-music students (age 15-20) and 53 old non-music students (age 21+). This represented an acceptable sample size for tests of the significance of the interaction of the style variable with the training and age variable combination. Total sample size N=163.

Fig. 4.2. No cuts

As depicted in the above table, there were 28 male music students, 45 female music students, 41 male non-music students, and 49 female non-music students. This represented an acceptable sample size for testing the significance of the interaction of the style variable with the training and gender variable combination. Total sample size N=163.

It was clear from the data for the total uncut sample that there were some subjects who indicated they did not know many of the styles. These surveys were viewed as poor in terms of the quality of the data, as the subjects in these cases were either unfamiliar with the styles or unmotivated. The second method entailed cutting out these subjects, in essence picking out the proverbial “bad apples” and keeping the good ones. This was done by removing all subjects who chose the DK option for more than two genres of music. The resultant per-cell sample size is depicted in the following tables.

Fig. 4.3. N2 Cuts

For the N2 sample, there were 45 young music students, 27 old music students, 25 young non-music students, and 42 old non-music students. This represents a deficit from the total sample of 1 young music student, 0 old music students, 12 young non-music students, and 11 old non-music students. The resultant per-cell sample size was deemed adequate for analysis of the style by training/age interaction. Revised sample size N=139. Deficit=24.

Fig. 4.4. Method 2: N2 cuts, per-cell sample size for the training/gender combination

There were 28 male music students, 44 female music students, 31 male non-music students, and 36 female non-music students. This represented a deficit from the total sample of 0 male music students, 1 female music student, 10 male non-music students, and 13 female non-music students. The revised sample size was deemed adequate for analysis of the style by training/gender interaction. Revised N=139. Deficit=24.

Fig. 4.5. All cuts

For the sample in which all surveys with DK options chosen were cut, there were 37 young music students, 24 old music students, 18 young non-music students, and 32 old non-music students. This represents a deficit from the total sample of 9 young music students, 3 old music students, 19 young non-music students, and 21 old non-music students. The decreased per-cell size was too small for any reliable analysis of the style by training/age interaction. However, the test of the significance of the main effect of training was still possible as there were a total of 61 music students and 50 non-music students in this sample. Revised N=111. Deficit=52.

Fig. 4.6. All cuts

There were 23 male music students, 27 female music students, 25 male non-music students, and 36 female non-music students in the current revised sample. This represents a deficit from the total sample of 5 male music students, 18 female music students, 16 male non-music students, and 13 female non-music students. The per-cell sample size was deemed adequate for testing the significance of the style by training/gender interaction. Revised N=111. Deficit=52.

Analysis and Results

Analysis was carried out using SPSS 11.5 (Statistical Package for the Social Sciences). Analysis of variance (ANOVA) was conducted for each sample using the “general linear model” command, with “preference score” as the dependent variable, “style” as the within-subjects independent variable, and “age”, “training”, and “gender” as the between-subjects independent variables. The resulting mixed design ANOVA yielded significance tests for the following main effects and interactions:

Fig 4.7.

Within-subjects effects

Main effects: Style
2-way interactions: Style * Age, Style * Gender, Style * Training
3-way interactions: Style * Age * Training, Style * Age * Gender, Style * Gender * Training
4-way interaction: Style * Age * Gender * Training

Between-subjects effects

Main effects: Age, Gender, Training
2-way interactions: Age * Gender, Age * Training, Gender * Training
3-way interaction: Age * Gender * Training

The effects of primary interest are in italics. The other effects were of only marginal interest, and as it turned out, were all non-significant. The purple shaded effects were significant across all imputation methods and samples, while the grey shaded effects were only significant for a portion of the sample/imputation combinations. A total of three different imputation methods were used and analysis was carried out for each sample and imputation method combination, resulting in a total of seven separate ANOVAs. Results were fairly consistent across samples and imputation methods. For the effects of interest, the age and gender main effects were non-significant, as were the style * gender and style * gender * training interactions, across all samples and imputation methods. The only between-subjects main effect that was significant was that for training. For the within-subjects effects, the main effect of style was significant, as was the style * training interaction. The style * age interaction was significant for all samples except the one in which all subjects with DK data were cut. The style * training * age interaction was significant for all samples except for the N2Cut sample with the cold deck imputation method under the Greenhouse-Geisser correction. The significant within-subjects effects and interactions are summarized in the table overleaf.

Fig. 4.8. SIGNIFICANCE (p<.05) OF WITHIN-SUBJECT MAIN EFFECTS AND INTERACTIONS

Greenhouse-Geisser

Within-subjects    Nocuts                  N2Cut                   Allcut
effect             Cold    Mean    EMMS    Cold    Mean    EMMS
F1                 .000    .000    .000    .000    .000    .000    .000
F1*Age             .016    .011    .011    .000    .003    .003    .085
F1*Train           .000    .000    .000    .000    .000    .000    .000
F1*Train*age       .011    .007    .007    .056    .048    .041    .043

Huynh-Feldt

Within-subjects    Nocuts                  N2Cut                   Allcut
effect             Cold    Mean    EMMS    Cold    Mean    EMMS
F1                 .000    .000    .000    .000    .000    .000    .000
F1*Age             .013    .008    .008    .002    .002    .002    .073
F1*Train           .000    .000    .000    .000    .000    .000    .000
F1*Train*age       .009    .005    .005    .048    .041    .048    .034

Key: Sample: Nocuts=all surveys with DK data included, N2Cut=all surveys with more than two DK options chosen are removed, Allcut=all surveys with DK options chosen are removed.
Key: Imputation method: Cold=cold deck imputation, Mean=mean substitution, EMMS=estimated marginal mean substitution.

The Greenhouse-Geisser and Huynh-Feldt corrections were used (Howell 2002: 486-487), as sphericity could not be assumed for any of the samples. The corrections simply reduce the degrees of freedom (df) for the treatment or interaction and error term. All else equal, this has the effect of pushing the critical value of F further into the tail of the F-distribution, making it harder to obtain significant F statistics and a rejection of the null hypothesis. This is done in compensation for violations of the sphericity assumption, and provides a more conservative test of the null hypothesis.
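The way such a correction reduces the degrees of freedom can be sketched as follows. This is a minimal illustration of the standard Greenhouse-Geisser estimate of epsilon computed from a covariance matrix of the repeated measures, not the SPSS routine itself; the function names and the example values are assumptions, and in a mixed design the covariance matrix would be pooled across the between-subjects groups.

```python
def gg_epsilon(cov):
    """Greenhouse-Geisser epsilon from a k x k covariance matrix (list of lists)."""
    k = len(cov)
    row_means = [sum(row) / k for row in cov]
    col_means = [sum(cov[i][j] for i in range(k)) / k for j in range(k)]
    grand_mean = sum(row_means) / k
    # Double-centre the matrix: remove row and column means, add back the grand mean.
    centred = [
        [cov[i][j] - row_means[i] - col_means[j] + grand_mean for j in range(k)]
        for i in range(k)
    ]
    trace = sum(centred[i][i] for i in range(k))
    sum_of_squares = sum(x * x for row in centred for x in row)
    return trace ** 2 / ((k - 1) * sum_of_squares)


def corrected_df(epsilon, df_effect, df_error):
    """Multiply both degrees of freedom by epsilon, as the corrections do."""
    return epsilon * df_effect, epsilon * df_error


# When sphericity holds exactly (here, an identity covariance matrix),
# epsilon is 1 and the degrees of freedom are left unchanged.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
print(gg_epsilon(identity))  # 1.0

# Hypothetical values: an epsilon of 0.5 halves both degrees of freedom.
print(corrected_df(0.5, 11, 1100))  # (5.5, 550.0)
```

Epsilon ranges from 1/(k-1) to 1, so the further the data depart from sphericity, the larger the reduction in df and the more conservative the test.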

The imputation methods include cold deck imputation, mean substitution, and estimated marginal mean substitution (EMMS), all of which will be discussed briefly. Cold deck imputation is the simplest of the lot, involving the insertion of a single value for all the missing values. This value is determined on an a priori basis, and for the present study, involved taking the mean of the scale (5.5), and inserting it in place of all DK options. This ensured that the effect of imputation was one of creating gravity towards the middle of the scale. The centre of the scale was chosen as it is most representative of neutral feelings (i.e. neither like nor dislike), which was deemed to be closer in nuance to a “don’t know” response than any other part of the scale. In addition, this was done so that subjects with no opinion of a genre of music were given a score that was most likely (in the absence of any other information) to be closest to any decision made following exposure to a genre.

Mean substitution involved taking the mean value for a style of music and substituting that value into each of the cases in which DK was opted for that style. For this method, if the overall mean for a particular style was very low or high, the effect of imputation would be minimal as it would simply create a gravity centring on the mean calculated prior to imputation. Here the attempt was to alter the prior calculations of style means as little as possible, regardless of polarity. Means were calculated for each style by training group, and imputed accordingly. This was done due to the significant interaction of style and training, which indicated that there were differences in the way that music and non-music students responded to styles of music. Consequently, it made sense to take into account a subject’s level of training when imputing values. This was not done for each training/age combination, as this interaction was not significant across all imputation methods and samples. In addition, a further division of groups for imputing means would lower the number of observations from which the mean was calculated. Taking into account age/training sub-groups was not deemed worthwhile for this reason. In this respect, taking into account the larger training groups was considered a good compromise.
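The two simpler imputation methods can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually run: DK is represented by None, the scale is assumed to run from 1 to 10 (giving the midpoint of 5.5), and the function names and toy ratings are hypothetical.

```python
DK = None             # assumed marker for a "don't know" response
SCALE_MIDPOINT = 5.5  # mean of a rating scale assumed to run from 1 to 10


def cold_deck(ratings, value=SCALE_MIDPOINT):
    """Replace every DK with one value fixed in advance."""
    return [value if r is DK else r for r in ratings]


def group_mean_substitution(rows):
    """Replace each DK with the mean rating of the subject's training group.

    rows: list of (training_group, rating) pairs for ONE style of music.
    Assumes every group has at least one non-DK rating.
    """
    totals, counts = {}, {}
    for group, rating in rows:
        if rating is not DK:
            totals[group] = totals.get(group, 0) + rating
            counts[group] = counts.get(group, 0) + 1
    means = {group: totals[group] / counts[group] for group in totals}
    return [(g, means[g] if r is DK else r) for g, r in rows]


# Toy ratings for a single style.
rows = [("music", 8), ("music", 6), ("music", DK),
        ("non-music", 4), ("non-music", DK)]

print(cold_deck([r for _, r in rows]))  # [8, 6, 5.5, 4, 5.5]
print(group_mean_substitution(rows))
# [('music', 8), ('music', 6), ('music', 7.0), ('non-music', 4), ('non-music', 4.0)]
```

In the toy data the two methods impute different values for the same subjects: cold deck pulls both DK responses to 5.5, whereas group mean substitution leaves each training group's mean for the style unchanged.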

The estimated marginal mean substitution (EMMS) procedure (SPSS advanced models 12.0, 2003) incorporates maximum likelihood techniques to estimate marginal (group) means, which are then imputed for the DK data. SPSS has a default setting for estimating means, which can be altered according to the researcher’s needs, under the “mixed model analysis” command.

As there was little difference in the significance of effects across imputation methods, the EMMS procedure was picked to represent effects and interactions in graph format. This was done for the sample in which all subjects with DK options chosen more than twice were removed (N2 sample). This sample was chosen as it was a good compromise between the other two samples, and represented a means by which DK data could be included in subsequent analysis without being too badly tainted by those subjects who decided to opt out instead of responding.

The pattern in DK data itself was of considerable interest, and a check for the randomness of missing data was conducted in order to identify missing data processes. Although the larger frequency of DK data for non-music students, on the surface, would point to a tendency for non-music students to choose the DK option more readily than music students, a correlation matrix was run on dichotomised data to check for randomness of the DK options. Initially, the sample was separated into two groups on the basis of training, and frequencies run under the “descriptives” command for each. A total of 38 missing values were present in the sample, of which 24 (63%) were given by non-music students and 14 (37%) were given by music students. 12 (86% of music total) of the missing values given by music students were in the non-classical genres, of which 7 were given to folk music. For the classical genres, music students had only one missing value for each of renaissance and medieval styles, with no missing values given to the others. For non-music students 22 (92% of non-music total) missing values were present among the classical genres, while only one missing value was given to each of jazz and blues genres. No missing values were present among the pop, rock, and folk genres. The correlation matrix of dichotomised data clearly reveals this pattern in the missing (DK) data. The pattern is striking. The blank spaces indicate styles for which no correlation could be computed, which we would expect in cases where no missing data is present (i.e. all 0s), while correlations are given wherever there is at least one missing value. As can be seen, there is a box-shaped pattern for the music group, with DK options chosen only for the non-classical genres. For the non-music group, the pattern is characterized by a cross shape, with DK options chosen for jazz, blues, and the classical genres.
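The dichotomisation step and the resulting correlations can be sketched as follows. This is an illustrative reconstruction, not the SPSS output: DK is assumed to be coded as None, the function names and toy ratings are hypothetical, and the sketch returns None where a style has no DK responses at all, since a column of all 0s has no variance to correlate.

```python
import math


def dichotomise(ratings):
    """Code each response as 1 if DK (None) and 0 otherwise."""
    return [1 if r is None else 0 for r in ratings]


def pearson(x, y):
    """Pearson correlation of two equal-length lists, or None if undefined."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        return None  # a constant column (e.g. no DK responses) has no variance
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return cross / math.sqrt(ss_x * ss_y)


# Toy ratings from four hypothetical subjects for three styles.
folk = dichotomise([None, 7, None, 5])   # [1, 0, 1, 0]
jazz = dichotomise([None, 8, 6, 9])      # [1, 0, 0, 0]
rock = dichotomise([9, 8, 7, 6])         # [0, 0, 0, 0]

print(round(pearson(folk, jazz), 3))  # 0.577
print(pearson(folk, rock))            # None
```

Running such a matrix separately for each training group is what allows the group-specific pattern of DK responses to show up.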

Fig. 4.9.

Key: M=music students, NM=non-music students

Clearly, there is an underlying missing (DK) data process present, which is determined by the training factor. If a correlation matrix were run on the combined sample, the pattern would disappear, as we would see missing data across all genres of music. This further corroborates the notion that the variable of importance with respect to explaining missing data patterns is training.

Graphs of the means¹ for styles of music for all subjects combined, subjects partitioned on the basis of training, gender, and age separately, and age and training combined, indicate the direction of the effects and interactions. This is necessary as the ANOVAs simply indicate that there are differences between the ways subjects responded for at least two groups, but do not indicate the direction of differences, or where the differences lie. Consequently, further analysis is needed to gain insight into the nature of the relationships between factors and the dependent variable. The most efficient method for detecting these nuances is through graphical depictions of the group means, contrasts analysis, and t-tests for assessing the significance of simple effects. However, the t-tests (and indeed contrasts) must be used sparingly, as numerous t-tests will result in an increase in the probability of making a type I error. Consequently, the t-test was used for cases of particular interest only, while contrasts analysis was used to test the significance of differences of interest between grouped levels of the within-subjects factor, for levels of the between-subjects factor. Comparisons between clumped groups were chosen on an a priori basis under the hypothesis that subjects would respond similarly for like styles, and that differences would be most prominent between amalgamated groups depending on subject variables. Factor analysis was performed to determine the final structure of clumped groups, and is presented along with contrasts analysis following graphs of the means for styles of music alone and in combination with the between-subject factors for the initial style format.
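The reason for using t-tests sparingly can be made concrete with a short calculation. The sketch below assumes the tests are independent, which is a simplification, and the function name is hypothetical; it shows how the chance of at least one type I error grows with the number of tests run at the .05 level.

```python
def familywise_error(alpha, n_tests):
    """Probability of at least one type I error across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


# One test per style of music (12 styles) at alpha = .05, versus a single test.
for n_tests in (1, 3, 12):
    print(n_tests, round(familywise_error(0.05, n_tests), 3))
```

Under this assumption, running one t-test for each of the 12 styles would give close to an even chance of at least one spurious significant result, which is why the tests were reserved for cases of particular interest.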

The main effect of style (F=37.643, p=.000) indicates that subjects responded differently across styles of music. Inspection of the graph of the style means reveals high preferences among subjects for jazz, rock, classical, romantic, and modern genres of music, all of which had a mean greater than seven. The lowest preference scores were given to folk, medieval, and renaissance music, each of which had a mean below the mid-five range.

¹ Means and standard deviations are included in the appendix.

Fig. 4.10. Preference means for styles of music

A partitioning of scores was needed in order to determine how music students and non-music students contributed to the overall trend observed above. Further, as there was at least one significant difference between means for training groups for a style of music (inferred from the significant style by training interaction, F=8.342, p=.000), it was clear that the graph of style means by training group would yield somewhat differing trend lines. In fact, the difference in the two trend lines was striking, as can be seen below.

Fig. 4.11. Preference means for styles of music by training

KEY: 0=non music students, 1=music students

Music students consistently gave higher ratings for jazz, blues, and folk music, as well as the classical type genres of music, while the non-music students gave higher ratings for rock and pop music. The largest differences in mean ratings between training groups were for folk, rock, pop, baroque, and romantic music, with the difference in each case greater than one rating point. Surprisingly, there was a convergence in mean ratings for modern music, with the difference non-significant (t=-.542, p=.589). There were also fairly small differences in means between training groups for post-modern and classical music, each of which was in the half-point range. Apart from this, differences in mean ratings were fairly consistent across genres of music, staying roughly in the one to one and a half point range.

Inspection of the graphs of the style by age interaction revealed similar differences in mean ratings for rock music and the classical genres of music, and little difference in ratings for jazz, blues, folk, and pop music.

Fig. 4.12. Preference means by age

Key: 0=age 15-20, 1=age 21+

Older subjects consistently gave higher ratings for the classical genres of music, while younger subjects gave higher ratings for rock music (t=2.335, p=.021). Despite these differences, the two trend lines closely matched the trend line for the entire subject group, suggesting that the style*age interaction was less substantial than the style*training interaction. This is corroborated by the corresponding F-statistics and their p-values. For the style*training interaction, the observed F-statistic of 8.342 (p=.000) was larger than that for the style*age interaction, which had an F-statistic of 3.155 (p=.003 Greenhouse-Geisser, p=.002 Huynh-Feldt).

Although the style*gender interaction was not significant, there were relatively large differences in means between males and females for romantic and medieval music, with the difference for the former significant at alpha=.05 (t=1.998, p=.048).

Fig. 4.13. Preference means for styles of music by gender

Key: 0=female, 1=male

However, as Levene’s test for the equality of variances indicated that the two gender groups had different variances (F=8.177, p=.005), an alternate test statistic was consulted, which indicated no difference in means between gender groups (t=1.912, p=.059), although significance was nearly reached at the .05 level.
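The text does not name the alternate test statistic. A common choice when group variances differ is Welch's unequal-variance t-test, and the sketch below assumes that is what was consulted. The function name and the sample data in the usage note are hypothetical and are not taken from the study.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance t-test.

    t compares the two sample means without pooling the variances;
    df is the Welch-Satterthwaite approximation.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    # Squared standard error contributed by each group
    # (statistics.variance uses the n - 1 denominator).
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    mean_diff = statistics.mean(sample_a) - statistics.mean(sample_b)
    t = mean_diff / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


if __name__ == "__main__":
    # Made-up ratings, purely to show the call.
    t, df = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
    print(round(t, 3), round(df, 3))
```

Because the degrees of freedom shrink when the variances are unequal, the same mean difference can cross the .05 threshold under the pooled test yet fall just short under the corrected one, which is the pattern reported above.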

The three-way interaction of style*age*training was significant (F=2.037, p=.048 Greenhouse-Geisser, p=.041 Huynh-Feldt). The most striking pattern occurs with respect to the differences in ranks of training/age groups for the rock, pop, and romantic genres. For rock and, approximately, pop music, the highest ratings were given by young non-music students followed by old non-music students, young music students, and old music students. For romantic music, this ranking reversed, with the highest ratings given by old music students followed by young music students, old non-music students, and young non-music students.

Fig. 4.14. Preference means for styles of music by age and training

Key: 1=non-music students age 15-20, 2=non-music students age 21+, 3=music students age 15-20, 4=music students age 21+.

This pattern in ranking was mimicked roughly for the classical genres, with young music students and old non-music students closely matched across styles. For rock music, these two groups also responded similarly, with the higher ratings given by old non-music students. Surprisingly, for pop music, the older music group had a higher mean than the younger music group, although this difference was so slight that it was most likely due to random error. For jazz, blues, and folk music, the highest ratings were given by young music students, followed by old music students, old non-music students, and young non-music students. The most striking differences were between the young non-music student group and both old and young music student groups. For all styles, except rock and pop music, young non-music students ranked lowest with means well below those of music students. For rock and pop music, the young non-music students had means well above the other groups.

It was hypothesized that subjects would respond similarly for like genres of music. Consequently, the most striking differences were expected to occur between groups of like genres, depending on the significance of the between-subject factors, and contrasts were expected to prove most effective in untangling interactions if implemented on clumped genres. The pre-determined clumping led to the formation of three general groups. The first consisted of jazz, blues, and folk music. The second consisted of rock and pop, and the last consisted of the classical genres. A factor analysis with varimax rotation was performed to verify the validity of the groupings.

Fig. 4.15. Factor Analysis Rotated Component Matrix(a)

                    Component (factors)
              1         2         3         4
BAROQUE     .812
RENAISS     .807
CLASS       .791
MEDEIVAL    .734
ROMANT      .702                .359
FOLK        .491      .462
BLUES                 .874
JAZZ                  .874
MOD                             .906
POSTMOD                         .859
POP                                       .818
ROCK                                      .757

Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization.
a Rotation converged in 5 iterations.

As can be seen, the structure of subject responses is best explained in terms of four underlying dimensions. Folk and the classical genres, minus post-modern and modern music, loaded highly on factor one, while the latter two styles, in the company of romantic music, loaded highly only on factor three. Not surprisingly, rock and pop music were relegated to their own factor, while jazz and blues loaded highly on factor two along with folk music. This very nearly supports the predetermined hypothesized groupings, which predicted that folk, jazz, and blues genres would group together as would rock and pop, while the remaining classical genres would constitute the remaining group. Raising the eigenvalue threshold for inclusion of factors to 1.40 would certainly have lent support to the original hypothesized groupings, as can be seen from the scree test below.
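The grouping described above can be reproduced mechanically from the rotated loadings. The sketch below is illustrative only: the loading values are those reported in Fig. 4.15, their assignment to components follows the description in the text, and the helper names are hypothetical. Each style is assigned to the component on which it loads most highly, and the gap between the two highest loadings flags styles that straddle two components.

```python
from collections import defaultdict

# Rotated loadings reported in Fig. 4.15, keyed by component number.
# Cells left blank in the reported matrix are simply omitted.
LOADINGS = {
    "BAROQUE": {1: 0.812},
    "RENAISS": {1: 0.807},
    "CLASS": {1: 0.791},
    "MEDEIVAL": {1: 0.734},
    "ROMANT": {1: 0.702, 3: 0.359},
    "FOLK": {1: 0.491, 2: 0.462},
    "BLUES": {2: 0.874},
    "JAZZ": {2: 0.874},
    "MOD": {3: 0.906},
    "POSTMOD": {3: 0.859},
    "POP": {4: 0.818},
    "ROCK": {4: 0.757},
}


def group_by_primary_factor(loadings):
    """Assign each style to the component with its highest loading."""
    groups = defaultdict(list)
    for style, by_factor in loadings.items():
        primary = max(by_factor, key=by_factor.get)
        groups[primary].append(style)
    return dict(groups)


def cross_loading_gap(by_factor):
    """Difference between the two highest reported loadings,
    or None when only one loading is reported."""
    ranked = sorted(by_factor.values(), reverse=True)
    if len(ranked) < 2:
        return None
    return ranked[0] - ranked[1]


if __name__ == "__main__":
    for factor, styles in sorted(group_by_primary_factor(LOADINGS).items()):
        print(factor, styles)
    print("FOLK gap:", round(cross_loading_gap(LOADINGS["FOLK"]), 3))
```

With this rule folk falls on component one, but only by a small margin over component two, which is why it sits between the classical and jazz/blues groupings.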

Fig. 4.16. Scree plot of eigenvalues for component factors in descending order of variance explained in the dependent variable by corresponding factor.

[Scree plot: eigenvalue (y-axis, 0 to 5) against component number (x-axis, 1 to 12)]

The scree test suggests that factor one is clearly the most prominent of the four. Inspection of the percentages of variance explained in the dependent variable by each rotated factor supports this notion, with factor one explaining 27.61% of the variability, followed by factors two, three, and four, which explained 16.86, 14.75, and 11.91% of the variance respectively. The four-factor model explains a total of 71.28% of the variability in the dependent variable, while a three-factor model explains nearly 60% of the variability.
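The link between an eigenvalue threshold and a share of variance can be shown with a short calculation. This is a sketch under stated assumptions, not the author's procedure: it assumes a principal component analysis of the correlation matrix of the twelve style ratings, so that total variance equals twelve, and it assumes the default retention threshold was an eigenvalue of 1.0 (the text states only the raised threshold of 1.40). The relation applies to unrotated components; the rotated percentages quoted in the text are redistributed among the retained factors and need not match it.

```python
N_STYLES = 12  # number of style ratings entered into the analysis


def eigenvalue_to_percent(eigenvalue: float, n_variables: int = N_STYLES) -> float:
    """Percentage of total variance carried by one unrotated component
    when the variables are standardized (total variance = n_variables)."""
    if n_variables <= 0:
        raise ValueError("n_variables must be positive")
    return 100.0 * eigenvalue / n_variables


# Rotated percentages for factors one to three, as reported in the text.
REPORTED_THREE_FACTOR = (27.61, 16.86, 14.75)


if __name__ == "__main__":
    # Three-factor total, described in the text as "nearly 60%".
    print(round(sum(REPORTED_THREE_FACTOR), 2))
    # Variance share implied by the assumed default and the raised threshold.
    for threshold in (1.0, 1.40):
        print(threshold, round(eigenvalue_to_percent(threshold), 2))
```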

The component (factor) plot provides a useful visual for observing the structure of the model layout. As a plot can show no more than three dimensions, component four is removed from the plot.

Fig. 4.17. Component Plot in Rotated Space

[Three-dimensional plot with axes Component 1, Component 2, and Component 3, showing the positions of the labels blues, jazz, folk, renaiss, medeival, baroque, postmod, class, romant, rock, mod, and pop]

The cluster of labels along the left wall represents the styles which loaded highly on factor one, and includes the classical genres plus folk music, minus modern and post-modern music. The latter two can be seen at the right wall, with high loadings on factor three. Jazz and blues genres are in their rightful place at the top of the junction of the back walls, while rock and pop genres are in limbo near the floor and mid-range of the plot, which we would expect as factor four is not represented.

Contrasts analysis, in accord with the results obtained in the factor analysis, was carried out in order to determine the significance of differences between groupings for between-subjects factors. As the style*gender interaction was not significant, no further analysis was done for this combination. The style*age interaction was clear and fairly consistent for styles, and a single t-test reported on the non-reduced model was deemed adequate. The method used for the style*training contrasts involved running separate contrasts for each level of the training variable (for music and non-music students separately). The three-way interaction was the most complicated to deal with, and it was decided not to run contrasts, but to instead run two ANOVAs separately on rock and romantic music, followed by post hoc tests. Graphs of grouped genres make for a good visual reference for understanding the ensuing analysis.

Fig. 4.18. Underlying dimensions: preference means for factors 1-4

Fig. 4.19. Tests of Within-Subjects Contrasts

Source    DMSION                 Type III Sum of Squares   df   Mean Square   F        Sig.
DMSION    Level 1 vs. Level 4    234.862                   1    234.862       39.614   .000
          Level 2 vs. Level 4    5.451                     1    5.451         .790     .376
          Level 3 vs. Level 4    13.768                    1    13.768        1.939    .166
          Level 2 vs. Level 1    168.754                   1    168.754       39.240   .000
          Level 3 vs. Level 1    134.901                   1    134.901       23.497   .000
          Level 2 vs. Level 3    1.893                     1    1.893         .275     .601

Key: level 1=factor 1, level 2=factor 2, level 3=factor 3, level 4=factor 4.

For the above contrasts, all comparisons are significant except for those in which factors two and three, two and four, and three and four are compared. If we refresh our memories regarding the content of these factors, the implications will be clear.

FACTOR CONTENT
1. = Baroque/Renaissance/Classical/Medieval/Romantic/Folk
2. = Blues/Jazz
3. = Modern/Post-modern
4. = Pop/Rock

Factor one, representing the classical genres and folk music, was preferred significantly less than all the other factors. None of the remaining comparisons was significant, suggesting that the classical genres, for the entire sample, were largely unpopular, or at least were preferred less than the other music groupings. However, if we run contrasts on each level of the training variable for grouped genres, a different picture emerges. A quick scan of the graph would seem to indicate that for both music and non-music students, factor one scored lower than factor two. However, for music students, it is not clear that there is any difference in ratings between factors one and three or between factors one and four. For the non-music students, there was clearly no difference in mean ratings between factors two and three, and certainly there was a significant difference between factors one and four, with factor four getting the upper hand. Consequently, for these comparisons, contrasts were not performed, as the outcomes could quite safely be assumed, thus reducing the familywise error rate. Contrasts were conducted for the remaining comparisons.

Fig. 4.20. Means for underlying dimensions (factors) by training

Key: 0=non-music students, 1=music students

Fig. 4.21. Tests of Within-Subjects Contrasts by training group

MUSIC STUDENTS
Measure: MEASURE_1

Source     DMNSION                Type III Sum of Squares   df   Mean Square   F        Sig.
DMNSION    Level 2 vs. Level 1    83.318                    1    83.318        24.423   .000
           Level 3 vs. Level 1    26.600                    1    26.600        5.089    .027
           Level 4 vs. Level 1    .625                      1    .625          .123     .727
           Level 2 vs. Level 4    69.510                    1    69.510        12.006   .001
           Level 2 vs. Level 3    15.764                    1    15.764        3.018    .087

NONMUSIC
Measure: MEASURE_1

Source     DMNSION                Type III Sum of Squares   df   Mean Square   F        Sig.
DMNSION    Level 2 vs. Level 1    76.686                    1    76.686        15.153   .000
           Level 3 vs. Level 1    111.312                   1    111.312       17.856   .000
           Level 4 vs. Level 1    378.005                   1    378.005       54.124   .000
           Level 2 vs. Level 4    114.175                   1    114.175       14.471   .000

Key: level 1=factor 1, level 2=factor 2, level 3=factor 3, level 4=factor 4.

For both groups, factor one scored significantly lower than factor two, while for music students, a significant difference was observed in mean ratings between factors one and three and between factors two and four. For music students, no difference was found between factors one and four or between factors two and three. For non-music students, a clear difference was observed between factors two and four, with four receiving higher ratings. In addition, non-music students rated factor one significantly lower than factor three, which is to be expected considering the earlier comparison between factors one and two. A t-test indicated no difference (t=-1.099, p=.273) between mean ratings for music and non-music students for factor three. The clear differences between training groups for factors one, two and four were assumed due to the significance of the style by training interaction reported earlier. The effect of age seemed to disappear completely with the reduction of data into a factor model, as can be seen in the graph of the means below.

Fig. 4.22. Preference means for underlying dimensions (factors) by age.

Key: 0= age 15-20 (young), 1=age 21+ (older)

However, the interaction was still significant (F=4.076, p=.007), and a t-test indicated differences between age groups for factor one (t=-3.627, p=.000) with the other differences between age groups, no doubt, contributing to the overall significance of the interaction.

For the three-way dimension*training*age interaction, significance was only approached at the .05 level (F=2.358, p=.071), which, although unfortunate, is not too surprising, as for the ungrouped genres the interaction was only marginally significant (F=2.037, p=.041). Notice that the F statistic in the latter case is smaller, yet still significant. This is a result of the larger degrees of freedom in that case, which lower the critical value of F and may have helped push the F statistic into the rejection region. Consequently, the significance of the three-way interaction must be tempered with the knowledge regarding the lack of significance observed in the interaction for the four-factor model. Despite this, ANOVAs and post hoc tests were run for the romantic and rock genres to determine whether there were differences between subject groups.

Analysis indicated a lack of significance for the age*training interaction for the rock genre (F=.442, p=.507). Analysis of variance was run for the romantic genre and the result for the age*training interaction was equally disappointing (F=1.11, p=.294). On top of this was the loss of the age effect (equivalent to the style*age interaction for the mixed model design) and a rebound of a gender effect which approached significance (F=3.546, p=.062). The net result suggests that the age*training interaction observed earlier was not present within either style individually but rather became significant in the repeated measures design as a result of consideration of scores across all styles simultaneously. The chart of homogeneous subsets further corroborates this notion.

Fig. 4.23. ROCK Tukey HSD

                  Subset (preference means)
AGETRN1    N      1          2
4          27     6.2963
3          45     7.3062
2          42     7.4524
1          25                8.8400
Sig.              .117       1.000

Key: 1=young non-music, 2=older non-music, 3=young music, 4=older music

For rock music, the old non-music students and young and old music students all belonged to the same group, as no significant differences were observed in comparisons between each of these. Only the young non-music group differed significantly from the others.

For the romantic genre of music, the old non-music group belonged to both the music subset and non-music subset, while only the young non-music group differed significantly from the music groups.

Fig. 4.24. ROMANT Tukey HSD

                  Subset (preference means)
AGETRN1    N      1          2
1          25     6.1200
2          42     7.0123     7.0123
3          45                8.0222
4          27                8.2593
Sig.              .380       .117

Key: 1=young non-music, 2=older non-music, 3=young music, 4=older music

However, the two charts taken together indicated significant differences across levels for both genres, with the order of means reversed for each, as depicted in the graph of the three-way interaction for un-clumped styles. The significant differences are roughly between the trained and untrained group for romantic music, and between the young non-music group and all others for rock music. Thus, we have significant differences resulting from a training factor (most apparent for romantic music) and an age*training combination (apparent for rock music), indicating that the three-way interaction observed in the repeated measures design is indeed accurate. It could be argued that in this context any effects with p-values close to the .05 mark should be treated with scepticism, and that only those effects significant at the .01 level can be used for drawing solid conclusions. With this in mind, the three-way interaction initially observed must be viewed as weak yet clearly present, as evidenced by the post hoc tests.

Discussion

The most striking findings were the significant main effect of training, the lack of main effects for the age and gender variables, the significant interactions of the training and age variables, and the lack of interaction of the gender variable with other variables. The significant main effect of style is expected: just as not all persons prefer herring to chocolate cake, not all persons prefer post-modern music to pop. Similarly, one cannot expect all persons’ preferences to be equally divided among genres of music.

Music students had a higher overall mean preference than non-music students (t=-4.789, p=.000). This is not surprising considering the published literature on training, which suggests that music education has a positive influence on preference. The obtained result also does not conflict with Price’s unusual findings regarding a lack of change in terms of opinion, yet a positive change with respect to attitude. As subjects responded to style labels rather than music excerpts, their responses, in line with Price’s distinction, reflect attitude rather than opinion.

The higher preferences on the part of music students are particularly reassuring, as one of the central goals of any form of music education, apart from increasing a person’s general level of music knowledge, is to open the student to an array of musical possibilities and enrich their lives by giving them the means of heightening the aesthetic experience of music. Indeed, if it were found that music students had lower levels of preference than non-music students, a further study of the reasons for this would be warranted. The next logical step would be to determine where the high and low preferences occur and, depending on the outcome of the enquiry, to move to remedy a possibly ineffective music curriculum. This is achieved by examining the style by training interaction.

This interaction was the most significant of all interactions in the sample. Music students had higher mean ratings than non-music students for all the classical genres of music, as well as for folk, blues, and jazz genres. Non-music students had higher preferences than music students for rock and pop music. This finding corresponds to previous research, and indicates that the taught genres are most affected in terms of changes in preference. One can speculate that the red line trend (non-music students) represents the preference trend of the university minus the music department, and as such, reflects the preference of a typical university student (as the proportion of non-music students is considerably higher than that for music students). Participation in the music department can be said to push up this “normal” or expected trend for the taught genres, and push down the trend for the pop genres. Thus, the effect of training is active, and has direct implications for the everyday lives of individuals, supporting the notion that the music department under question is effective in its teaching methods as it is able to produce significant changes in attitudes. Sceptics may wonder as to the degree to which this trend is a result of a naturally occurring bias in favour of the taught genres on the part of music students. Certainly this may partly be the case, but as music educators well know, a person whose sole motivation is pretence of preference will generally show deficiencies in performance and participation. Such persons are obviously not a majority in the present department, as evidenced by various criterion variables, such as quality and quantity of student performances. The current result reflects the desired trend, and research of a similar nature at other departments could well profit, in terms of commenting on teaching effectiveness, by duplicating this result.

The absence of a main effect of age does not correspond to LeBlanc’s open-earedness hypothesis, which predicts that young adults will be more accepting of a variety of music than adolescents, who will be more narrow-minded, with preference limited to popular genres. Given the variety of genres of music on the present questionnaire, and the number of classical genres, a lower level of preference would be expected for the younger age group. However, it must be mentioned that the young age group consisted of subjects aged 15-20, certainly not representative of an adolescent age group, but rather a straddling of the adolescent to young adult age group. The old age group consisted of students 21+ years of age. Consequently, the comparison is really between a mix of adolescents and young adults on the one hand, and adults on the other, in which case a gradual decline in preference is expected starting from about 20 years of age onwards. The age spread in the current sample was not large enough to observe such overall differences between the two age groups. However, the style by age interaction was significant.

Young subjects consistently rated the classical genres lower than did the older subjects, and in addition, gave higher ratings for rock music. This trend roughly mirrors the style by training trend, although the differences, group wise for genres, are clearly less prominent. Consequently, one can speculate that the cut-off point from which the preference by age trend begins to gradually decline, is somewhere between age 21 and 24, which constituted the bulk of the age range in the older age group. The reason that the age effect was not visible overall is no doubt a result of the offsetting effect created by the marginally higher preferences given on the part of young students for the jazz, blues, folk, and rock genres. Thus, the interaction is consistent with the open-earedness hypothesis to a degree. The higher preferences on the part of older subjects indicated that they were more open to a variety of classical type genres than young persons. This result suggests that it is quite plausible that inclusion of an additional age group, consisting of students age 10 to 14 (i.e. adolescents), would result in a trend line with lower means than both university age groups on the jazz, blues, folk, and classical genres, but higher means for rock, and especially, pop music.

The lack of a main effect of gender is surprising considering Crowther’s findings, which indicated that females had more positive attitudes to music than males. The popular perception is that females do better than males in music at the educational level, and that females will be more receptive to music in general, especially for those genres appealing to the . A consideration of the style by gender interaction sheds considerable light on the anomaly. There were marginal differences in preferences for styles of music between genders, with a significantly higher level of preference on the part of females over males for romantic music. This is in keeping with the findings of Hargreaves, who observed that females preferred opera and classical music more than males. The expected higher preferences on the part of females for pop music were not apparent, nor were the higher preferences expected for males for rock music, with this latter difference only marginal. The trends clearly show that in this particular sample, gender made very little difference in shaping music preferences. This goes against the mainstream literature, and suggests that gender differences in preference among the university population are slight. From observation of the graphs of means, it is clear that gender has a weaker presence than both training and age, which are clearly more influential in determining preference. A study of the graph of the means obtained in Hargreaves’ study makes for a worthwhile comparison of the magnitude of the differences by gender and training variables. In order to highlight the similarities and contrasts in trend lines between the present research and Hargreaves’ study, means obtained by the latter for styles of music were arranged in an order paralleling that of the current research questionnaire. In addition, as Hargreaves used a scale where high scores indicate dislike and low scores liking, graphs of the means were flipped upside-down to allow for easier comparison.
Similarly to the current research, the graph of preference means by gender reveals two closely matching trend lines with a peak reached for pop music.

Fig. 4.25. Preference by Gender

Key: 0=female, 1=male

The comparatively low mean for classical music in Hargreaves’ sample suggests that university students preferred classical music more than did subjects in that sample. However, as the current research divided the categories of classical and opera music into seven sub-categories, some of which were rated quite low, it is possible that for clumped genres the means would be more closely matched to those in Hargreaves’ sample. Folk was not particularly liked in either sample, while jazz received a fairly lukewarm reception in Hargreaves’ sample. As can be seen, there are similarities in response for genres of music. A clearer picture would arise if identical labels were used. Overall, it is clear that the gender factor was much more prominent in Hargreaves’ sample, and a further exploration of the style by age interaction sheds further light on the relative strength of factors.

Fig. 4.26. Preference by Age

Key: 0=young, 1=older

The style by age trends mirror those of the gender trends, in a similar manner to that observed in the present research. It is not clear overall, from comparison of the age and gender interactions, which factor was more prominent. However, striking and large differences are apparent in the age interaction, particularly with respect to rap, country, folk, and jazz, which were preferred more by the younger subjects.

The style by age/training interaction was significant, with the trends reversed for pop and classical genres, although the rank ordering is approximate. Few studies have grappled with this complex interaction and unfortunately Hargreaves, in his analysis, does not distinguish training groups. However, Geringer does, and observed a significant style by age/training interaction. Although Geringer used a behavioural measure, a moderately strong correlation has been documented between behavioural and attitudinal measures (Price 1987). The interaction tells us that training was most significant in determining preference, followed by the age variable. The extent to which age contributed to the interaction in comparison to training can be understood if one looks at the magnitude of the difference between music majors and non-music majors (for either genre), and compares this to the distance resulting from the difference between non-music majors and elementary school students.

Fig. 4.27. Mean listening times for classical and non-classical (pop/rock) genres by group

[Line graph: mean listening time (y-axis, 0 to 300) against music type (x-axis: classical, non-classical), with one line per group]

Key: elem=elementary students, M=music majors, NM=non-music majors

The three-way interaction appears to be substantial as each trend line “crosses” each other trend line. Despite the absence of a formal report on the significance of this interaction, the graph of the means strongly suggests that the training factor is most dominant, followed by the age variable. In the present research, the style by training/age interaction was significant, though only at the .05 level. The graph of the means is conceptually equivalent to the one observed in Geringer’s study, although the age factor had much more presence. This is especially apparent between young and old non-music students for the rock and pop genres, and between young and old music students for medieval, renaissance, baroque, and post-modern music. Observation of the two middle trend lines (old non-music vs. young music) stresses the importance of the training and age variables. Age increases preference for non-music students up to a point comparable to young music students, or vice versa, such that persons of differing ages and levels of training appear to have similar levels of appreciation for a diversity of genres (particularly classical ones). Interestingly, this is not the case for jazz, blues, and folk music.

Clumping genres together has both advantages and disadvantages. The disadvantage is that constructs which respondents distinguish are robbed of their identity and given a new one; this, in fact, is the very basis of factor analysis. However, the advantage is that any further analysis is simplified, as the amount of data is reduced and ordered into fewer categories or dimensions. Although the dimensions themselves are not always easily explained, that is, given an identity, the process is valid and very useful not only in coming to terms with the “hidden” constructs that may be responsible for certain patterns in preference, but also for arriving at a structural conception of the phenomenon being studied.

Factor analysis of the current research data yielded some similar groupings to those obtained by Fox & Wince (1975), who found that jazz and blues music formed a single factor, as did current pop hits and easy listening, and folk and classical music. Strikingly, folk loaded highly on the “jazz/blues” factor for both studies, with the difference in factor loadings for folk music between each factor almost identical across studies (.029 for the current study and .02 for Fox and Wince). This consistency is quite remarkable, and suggests that folk music does indeed have a dual nature, perhaps as a result of the folk-like elements in some classical music (e.g. Bartók) and the corresponding folk-like elements in jazz and blues (e.g. John Lee Hooker). Although an interesting finding in its own right, the primary value of the similarities of groupings of styles of music is as a validation of the soundness of factor groupings.

The subsequent analysis of the underlying dimensions largely duplicates the overall findings extracted from analysis of the un-clumped data. However, as observed in the graphs of the means, the differences are far less striking. This is expected, as each clumped factor is really a composite of several separate genres, each of which has its highs and lows. The net result is a watering down of differences, and a corresponding watering down of effects and interactions. This is clearly seen with respect to the style by age interaction, which appears non-existent in the graph (yet is still significant), and the style by age/training interaction, which now only approaches significance. However, Tukey post hoc tests indicate that differences in the latter interaction are present, and that the “watering down” effect was most likely responsible for the lack of significance of the interaction.
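
The “watering down” effect can be illustrated with the age-group means reported in the appendix. The following is a minimal sketch in Python, not part of the original analysis. The means are copied from the AGE1 table and the groupings from the key to the clumped-genre tables; the names `YOUNG`, `OLD`, `DIMENSIONS`, `age_gap`, and `composite_gap` are illustrative. It assumes every respondent rated every genre, so that a composite mean equals the average of its genre means.

```python
from statistics import mean

# Mean ratings by age group, copied from the AGE1 table in the appendix.
YOUNG = {  # AGE1 = 0 (15-20)
    "jazz": 7.66, "blues": 6.68, "folk": 5.40, "rock": 7.86, "pop": 6.72,
    "medieval": 4.01, "renaiss": 4.33, "baroque": 5.69, "class": 6.90,
    "romant": 7.34, "mod": 6.90, "postmod": 6.21,
}
OLD = {  # AGE1 = 1 (21+)
    "jazz": 7.33, "blues": 6.52, "folk": 5.17, "rock": 7.00, "pop": 6.83,
    "medieval": 4.66, "renaiss": 5.21, "baroque": 6.12, "class": 7.38,
    "romant": 7.50, "mod": 7.41, "postmod": 6.36,
}

# Genre groupings taken from the key to the clumped-genre tables.
# The short labels are hypothetical names chosen for this sketch.
DIMENSIONS = {
    "classical/folk": ["baroque", "renaiss", "class", "medieval", "romant", "folk"],
    "blues/jazz": ["blues", "jazz"],
    "modern/post-modern": ["mod", "postmod"],
    "pop/rock": ["pop", "rock"],
}


def age_gap(genre):
    """Old mean minus young mean for a single genre."""
    return OLD[genre] - YOUNG[genre]


def composite_gap(genres):
    """Old minus young for the composite (average) of several genres."""
    return mean(OLD[g] for g in genres) - mean(YOUNG[g] for g in genres)


if __name__ == "__main__":
    for name, genres in DIMENSIONS.items():
        largest = max(abs(age_gap(g)) for g in genres)
        print(f"{name}: largest single-genre gap {largest:.2f}, "
              f"composite gap {abs(composite_gap(genres)):.2f}")
```

For the pop/rock dimension, for example, the age gap for rock alone is 0.86 scale points, whereas the gap for the composite is under 0.4, because the small gap for pop runs in the opposite direction and partly cancels it.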

The most important result of the present study is the absence of gender differences in music preference. The training influence is not too surprising considering the past literature, and the age influence is of interest for its implications for the open-earedness hypothesis. However, the absence of a gender influence is startling, as it would appear from the literature that such differences are an established fact. There is no clear reason for this inconsistency, and it is unlikely to be a case of the study simply “getting it wrong”: the sensitivity of the observed interactions suggests that the scale did not fail to detect gender differences, but rather that they were not present. The reason remains open to speculation. It may be that the convergence of gender trends over time, illustrated in the musical instrument choice studies, has run its course in the present sample. That is, the convergence process which some authors suggest exists has here been completed, such that no gender differences remain. Further research on this question would benefit from a follow-up study using exactly the same questionnaire, which may yield an answer to this anomaly. In addition, a follow-up would allow for observation of the comparative strength, or influence, of each of the age, training, and gender variables, and their rank ordering as determinants of music preference.

In conclusion, it is clear that two of the three factors (age and training) had a conspicuous presence in the data when considering preference ratings for the different genres of music. A gender effect, surprisingly, was largely absent. Results suggest that gender has only a marginal influence on preference for styles of music, while training has the potential to change the tastes of individuals. This reaffirms the importance of having music as a field of study in the educational system. If, indeed, persons studying music in school show increased preferences for music that they might otherwise shun, then it is possible that music education may lead to change in the value system adhered to by an individual, and open the way to a wider range of aesthetic possibilities in the experience of art and beauty, ultimately enhancing a person’s quality of life. Age appears to complement the fruits of the music educator’s labour. The frustrated music teacher who sees children and young adults wasting away on superficial qualities may perhaps take some small satisfaction from the notion that, like a budding wine, the adolescent may yet see the growth of a more profound aesthetic faculty.

References

Abeles, H.F. & Porter, S.Y. (1978). The sex-stereotyping of musical instruments. JRME, 26 (2), 65-75.

Alpert, J. (1982). The effect of disc jockey, peer, and music teacher approval of music on music selection and preference. JRME, 30 (3), 173-186.

Attneave, F. & Olsen, R.K. (1971). Pitch as medium: A new approach to psychophysical scaling. American Journal of Psychology, 84, 147-166.

Bentley, A. (1975). Music in Education: A point of view. Windsor: NFER.

Berlyne, D. (1971). Aesthetics and Psychobiology. New York: Appleton-Century-Crofts.

Bharucha, J.J. & Stoeckig, K. (1987). Priming of chords: Spreading activation or overlapping frequency spectra? Special issue: The understanding of melody and rhythm. Perception and Psychophysics, 41, 519-524.

Bradley, I.L. (1971). Repetition as a factor in the development of preferences. JRME, 19, 295-298.

Brake, M. (1980). The Sociology of Youth Culture and Youth Subcultures. London: Routledge & Kegan Paul.

Bragg, B.W.E. & Crozier, J.B. (1974). The development with age of verbal and exploratory responses to sound sequences varying in uncertainty level. In D.E. Berlyne (Ed.), Studies in the New Experimental Aesthetics: Steps Towards an Objective Psychology of Aesthetic Appreciation (pp. 27-87). Washington D.C.: Hemisphere Publishing.

Brines, J. (1994). Economic dependency, gender, and the division of labor at home. American Journal of Sociology, 100 (3), 652-688.

Bruner, J.S. (1957). On perceptual readiness. Psychological Review, 64 (2), 123-151.

Burke, M.J. & Gridley, M.C. (1990). Musical preferences as a function of stimulus complexity and listeners sophistication. Perceptual & Motor Skills, 71, 687-690.

Burnsed, V. (1998). The effects of expressive variation in dynamics on the musical preferences of elementary school children. JRME, 46 (3), 397-404.

Cantor, G.N. (1968). Children’s “like-dislike” ratings of familiarized and nonfamiliarized visual stimuli. Journal of Experimental Child Psychology, 6, 651-657.

Christenson, P.G. (1985). Genre and gender in the structure of music preferences. Communication Research, 15 (3), 282-301.

Christenson, P.G. (1985). Children’s use of audio media. Communication Research, 12 (3), 327-343.

Conway, C. (2000). Gender and musical instrument choice: A phenomenological investigation. Bulletin of the Council for Research in Music Education, 146, 1-15.

Crozier, J.B. (1974). Verbal and exploratory responses to sound sequences varying in uncertainty level. In D.E. Berlyne (Ed.), Studies in the New Experimental Aesthetics: Steps Towards an Objective Psychology of Aesthetic Appreciation (pp. 27-87). Washington D.C.: Hemisphere Publishing.

Crowther, R.D. & Durkin, K. (1982). Sex and age related differences in the musical behavior, interests and attitudes towards music of 232 secondary school students. Educational Studies, 8 (2), 131-139.

Darrow, A., Haack, P. & Kuribayashi, F. (1987). Descriptors and preferences for Eastern and Western musics by Japanese and American nonmusic majors. JRME, 35 (4), 237-248.

Dember, W.N. & Earl, R.W. (1957). Exploratory, manipulatory & curiosity behaviors. Psychological Review, 64 (2), 91-96.

Delzell, J.K. & Leppla, D.A. (1992). Gender association of musical instruments and preferences of fourth-grade students for selected instruments. JRME, 40 (2), 93-103.

Deutsch, D. (1972). Octave generalization and tune recognition. Perception and Psychophysics, 11, 411-412.

Dowling, W.J. (1978). Scale and contour: Two components of a theory of memory for melodies. Psychological Review, 85, 341-354.

Edwards, E. (1964). Information Transmission. London: Chapman & Hall.

Eysenck, H.J. (1967). The Biological Basis of Personality. Springfield: Thomas.

Farnsworth, P. (1969). The Social Psychology of Music. Ames: The Iowa State University Press.

Fechner, G.T. (1876/1997). Various attempts to establish a basic form of beauty: Experimental aesthetics, golden section and square. Empirical Studies of the Arts, 15 (2), 115-129.

Finnas, L. (1989). How can musical preferences be modified? Bulletin of the Council for Research in Music Education, 102, 1-58.

Fisher, R.A. (1926). The arrangement of field experiments. Journal of the Ministry of Agriculture of Great Britain, 33, 503-513.

Floridi, L. (2003). Information. In L. Floridi (Ed.), Philosophy of Computing and Information (pp. 40-59). Oxford: Blackwell Publishing.

Flowers, P.J. (1988). The effects of teaching and learning experiences, tempo, and mode on undergraduates’ and children’s symphonic music preferences. JRME, 36 (1), 19-34.

Fortney, P.M., Boyle, D.J., & DeCarbo, N.J. (1993). A study of middle school band students’ instrument choices. JRME, 41 (1), 28-39.

Fox, W. & Wince, M. (1975). Musical taste cultures and taste publics. Youth & Society, 7 (2), 198-224.

Frith, S. (1981). Sound effects: Youth, leisure and the politics of rock ‘n’ roll. New York: Pantheon.

Fung, V.C. (1999/2000). Music style preferences of young students in Hong Kong. Bulletin of the Council for Research in Music Education, 143, 50-63.

Fung, V.C. (1996). Musicians’ and non-musicians’ preferences for world musics: Relation to musical characteristics and familiarity. JRME, 44 (1), 60-83.

Gans, H.J. (1974). Popular Culture and High Culture. New York: Basic Books.

Geringer, J.M. (1982). Verbal and operant music listening preferences in relationship to age and musical training. Psychology of Music, Special issue: Proceedings of the Ninth International Seminar on Research in Music Education, 47-50.

“GLM Repeated Measures Model.” (2003). SPSS Advanced Models 12.0, p. 32.

Grusec, J.E., Lockhart, R.S., & Walters, G.C. (1990). Foundations of Psychology. Toronto: Pitman Ltd.

Hair, J.F., Anderson, R.E., Tatham, R.L., & Black, W.C. (1998). Multivariate Data Analysis. New Jersey: Prentice-Hall Inc.

Hargreaves, D. J. (1995). Effects of age, gender, and training on musical preferences of British secondary school students. JRME, 43 (3), 242-250.

Hargreaves, D.J. & Castell, K.C. (1986). Development of liking for familiar and unfamiliar melodies. Bulletin of the Council for Research in Music Education, 91, 65-69.

Hargreaves, D.J. & North, A.C. (1997). The effect of physical attractiveness on responses to pop music performers and their music. Empirical Studies of the Arts, 15 (1), 75-89.

Hargreaves, D.J. (1984). The effects of repetition on liking for music. JRME, 32 (1), 35-47.

Hargreaves, D.J. (1986). The Developmental Psychology of Music. Cambridge: Cambridge University Press.

Harrison, A.C. (2000). Children’s gender-typed preferences for musical instruments: An intervention study. Psychology of Music, 28, 81-97.

Heyduk, R.G. (1975). Rated preference for musical compositions as it relates to complexity and exposure frequency. Perception & Psychophysics, 17 (1), 84-91.

Hevner, K. (1935). Expression in music: A discussion of experimental studies and theories. Psychological Review, 42, 188-204.

Hoge, H. (1997). The golden section hypothesis: Its last funeral. Empirical Studies of the Arts, 15 (2), 233-255.

Howell, D.C. (2002). Statistical Methods for Psychology. Pacific Grove: Duxbury.

Hume, D. (1757). Of the standard of taste. In D. Cooper (Ed.), Aesthetics: The classic readings (pp. 76-93). Oxford: Blackwell Publishers.

Jacob, A. (Ed.) (1980). British music yearbook, 6th ed. A. & C. Black.

Konecni, V.J. & Crozier, J.B. (1976). Anger and expression of aggression: Effects on aesthetic preference. Scientific Aesthetics, 1 (1), 47-55.

Krippendorff, K. (1986). Information Theory: Structural Models for Qualitative Data. London: Sage Publications.

Krumhansl, C.L., Bharucha, J., & Castellano, M.A. (1982). Key distance effects and perceived harmonic structure in music. Perception and Psychophysics, 32, 96-108.

Krumhansl, C.L. & Castellano, M.A. (1983). Dynamic process in music perception. Memory and Cognition, 11, 325-334.

Larson, R. (1983). Television and music; contrasting media in adolescent life. Youth and Society, 15 (1), 13-31.

LeBlanc, A. (1981). Effects of style, tempo and performing medium on children’s music preference. JRME, 29 (2), 143-156.

LeBlanc, A. (1982). An interactive theory of music preference. Journal of Music Therapy, 19 (1), 28-45.

LeBlanc, A. (1983). Effect of tempo on children’s music preference. JRME, 31 (4), 283-293.

LeBlanc, A. (1983). Effects of tempo and performing medium on children’s music preference. JRME, 31 (1), 57-66.

LeBlanc, A. (1996). Music style preferences of different age listeners. JRME, 44 (1), 49-59.

LeBlanc, A. (2000-2001). Tempo preferences of youth listeners in Brazil, China, Italy, South Africa, and the United States. Bulletin of the Council for Research in Music Education, 147, 97-101.

Little, P. & Zuckerman, M. (1986). Sensation seeking and music preference. Personality and Individual Differences, 7 (4), 575-577.

McCrary, J. (1993). Effects of listeners’ and performers’ race on music preferences. JRME, 41 (3), 200-211.

Miller, M.M. & Strongman, K.T. (2002). The emotional effects of music on religious experience: A study of the Pentecostal-charismatic style of music and worship. Psychology of Music, 30, 8-27.

Morrison, S.J. (1998). A comparison of preference responses of white and African-American students to musical versus musical/visual stimuli. JRME, 46 (2), 208-222.

Morrison, S.J. (1999). Preference responses and use of written descriptors among music and nonmusic majors in the United States, Hong Kong and the People’s Republic of China. JRME, 47 (1), 5-17.

Mull, H.K. (1957). Effects of repetition on the enjoyment of modern music. Journal of Psychology, 43, 155-162.

North, A.C. & Hargreaves, D.J. (1996). Responses to music in aerobic exercise and yogic relaxation classes. British Journal of Psychology, 87, 535-547.

North, A.C. & Hargreaves, D.J. (1996). Responses to music in the dining area. Journal of Applied Social Psychology, 26 (6), 491-501.

Orr, M.G. & Ohlsson, S. (2001). The relationship between musical complexity and liking in jazz and bluegrass. Psychology of Music, 29 (2), 108-127.

Pavlov, I.P. (1928). Conditioned reflexes: An investigation of the physiological activity of the cerebral cortex. Twenty-five Years of Objective Study of the Higher Nervous Activity (Behavior) of Animals. London: Laurence & Wishart.

Price, H.E. (1988). The effect of a music appreciation course on student’s verbally expressed preferences for composers. JRME, 36 (1), 35-45.

Price, H.E. (1990). Changes in musical attitudes, opinions, and knowledge of music appreciation students. JRME, 38 (1), 39-48.

Radocy, R.E. (1976). Effects of authority figure biases on changing judgments of musical events. JRME, 24 (3), 119-128.

Radocy, R.E. (1982). Preference for classical music: Test for a hedgehog. Psychology of Music, Special Issue, 91-95.

Rawlings, D. (1997). Music preference and the five-factor model of the NEO personality inventory. Psychology of Music, 25, 133-148.

Reed, S.K. (1972). Pattern recognition and categorization. Cognitive Psychology, 3, 382-407.

Roe, K. (1985). Swedish youth and music; listening patterns and motivations. Communication Research, 12 (3), 353-362.

Rosch, E. & Mervis, C.B. (1975). Family resemblances: Studies in the internal structure of categories. Cognitive Psychology, 7, 573-605.

Rosner, B.S. & Meyer, L.B. (1982). Melodic processes and the perception of music. In D. Deutsch (Ed.), The Psychology of Music (pp. 317-343). New York: Academic Press.

Russell, P. (1994). Preferability, pleasingness, and interestingness: Relationship between evaluative judgments in empirical aesthetics. Empirical Studies of the Arts, 12 (2), 141-157.

Russell, P. (1987). Effects of repetition on the familiarity and likeability of popular music recordings. Psychology of Music, 15, 187-197.

Schafer, R.M. (1967). Ear Cleaning. Don Mills, Ontario: BMI.

Schubert, E. (1996). Enjoyment of negative emotions in music: An associative network explanation. Psychology of Music, 24, 29-46.

Schuckert, R.F. (1968). An attempt to modify the musical preferences of preschool children. JRME, 16, 39-44.

Shehan, P.K. (1985). Transfer of preference from taught to untaught pieces of non-Western music genres. JRME, 33 (3), 149-158.

Shepard, W.O. & Hess, D.T. (1975). Attitudes in four age groups toward sex role division in adult occupations and activities. Journal of Vocational Behavior, 6, 27-39.

Shuter, R. & Gabriel, C. (1981). The Psychology of Musical Ability. London: Methuen.

Sluckin, W., Colman, A.M. & Hargreaves, D.J. (1980). Liking for words as a function of the experienced frequency of their occurrence. British Journal of Psychology, 71, 163-169.

Smith, K.C. & Cuddy, L.L. (1986). The pleasingness of melodic sequences: Contrasting effects of repetition and rule-familiarity. Psychology of Music, 14, 17-32.

Smith, D.J. & Melara, R.J. (1990). Aesthetic preference and syntactic prototypicality in music: ‘Tis the gift to be simple’. Cognition, 34, 279-298.

Steck, L. & Machotka, P. (1975). Preference for musical complexity: Effects of context. Journal of Experimental Psychology: Human Perception & Performance, 104 (2), 170-174.

Tagg, P.D. (1981). On the specificity of musical communication. University of Gothenburg, Musicology dept. Stencil No 8115.

Urquhart, A. (2003). Complexity. In L. Floridi (Ed.), Philosophy of Computing and Information (pp. 18-27). Oxford: Blackwell Publishing.

Vitz, P.C. (1964). Preferences for rates of information presented by sequences of tones. Journal of Experimental Psychology, 68 (2), 176-183.

Vitz, P.C. (1966). Affect as a function of stimulus variation. Journal of Experimental Psychology, 71 (1), 74-79.

Walker, E. (1980). Psychological Complexity and Preference: A Hedgehog Theory of Behavior. Monterey, California: Brooks/Cole.

Wapnick, J. (1976). A review of research on attitude and preference. Bulletin of the Council for Research in Music Education, 48, 1-20.

Washington, J.A. (1995). Basic Technical Mathematics with Calculus. Reading, Massachusetts: Addison-Wesley Publishing Co.

Welker, R.L. (1982). Abstraction of themes from melodic variations. Journal of Experimental Psychology: Human Perception and Performance, 8, 435-447.

Whitfield, T.W.A. & Slatter, P.E. (1979). The effects of categorization and prototypicality on aesthetic choice in a furniture selection task. British Journal of Psychology, 70, 65-75.

Zajonc, R.B. (1970). Brainwash: Familiarity breeds comfort. Psychology Today, 33-35 & 60-62.

Zajonc, R.B. (1970). Exposure and affect: A field experiment. Psychonomic Science, 17, 216-217.


Means and Standard Deviations for sample N2cut- (EMMS)

Mean
             jazz  blues  folk  rock   pop  medieval  renaiss  baroque  class  romant   mod  postmod
Grand Total  7.50   6.60  5.29  7.43  6.77      4.34     4.77     5.90   7.14    7.42  7.15     6.29

StdDev
             jazz  blues  folk  rock   pop  medieval  renaiss  baroque  class  romant   mod  postmod
Grand Total  2.20   2.22  2.56  2.19  2.37      2.17     2.33     2.37   2.36    2.37  2.24     2.36



Mean, by AGE1
AGE1  jazz  blues  folk  rock   pop  medieval  renaiss  baroque  class  romant   mod  postmod
0     7.66   6.68  5.40  7.86  6.72      4.01     4.33     5.69   6.90    7.34  6.90     6.21
1     7.33   6.52  5.17  7.00  6.83      4.66     5.21     6.12   7.38    7.50  7.41     6.36

StdDev, by AGE1
AGE1  jazz  blues  folk  rock   pop  medieval  renaiss  baroque  class  romant   mod  postmod
0     2.14   2.24  2.59  1.94  2.24      2.17     2.38     2.43   2.37    2.43  2.59     2.60
1     2.27   2.20  2.54  2.35  2.50      2.14     2.21     2.30   2.35    2.33  1.79     2.09

Mean, by TRAIN
TRAIN  jazz  blues  folk  rock   pop  medieval  renaiss  baroque  class  romant   mod  postmod
0      6.88   6.08  4.55  7.97  7.39      4.00     4.37     5.08   6.89    6.68  7.05     5.98
1      8.07   7.09  5.97  6.93  6.20      4.65     5.14     6.67   7.36    8.11  7.25     6.57

StdDev, by TRAIN
TRAIN  jazz  blues  folk  rock   pop  medieval  renaiss  baroque  class  romant   mod  postmod
0      2.33   2.30  2.61  2.00  2.50      2.07     2.25     2.36   2.71    2.45  2.20     2.29
1      1.92   2.03  2.33  2.25  2.09      2.23     2.36     2.12   1.97    2.09  2.28     2.40

Mean, by GENDER1
GENDER1  jazz  blues  folk  rock   pop  medieval  renaiss  baroque  class  romant   mod  postmod
0        7.54   6.70  5.40  7.42  6.96      4.58     4.97     5.89   7.27    7.76  7.21     6.22
1        7.44   6.46  5.14  7.44  6.51      4.00     4.50     5.92   6.95    6.96  7.07     6.37

StdDev, by GENDER1
GENDER1  jazz  blues  folk  rock   pop  medieval  renaiss  baroque  class  romant   mod  postmod
0        2.07   2.17  2.64  2.20  2.19      2.17     2.27     2.14   2.20    2.03  2.07     2.18
1        2.38   2.29  2.47  2.20  2.58      2.15     2.40     2.67   2.56    2.72  2.46     2.59

Mean, by AGE1 and TRAIN
AGE1  TRAIN  jazz  blues  folk  rock   pop  medieval  renaiss  baroque  class  romant   mod  postmod
0     0      6.68   5.60  3.84  8.84  7.80      3.40     3.89     4.61   6.44    6.12  6.72     6.12
0     1      8.20   7.27  6.26  7.31  6.12      4.36     4.58     6.29   7.16    8.02  7.00     6.27
1     0      7.00   6.36  4.98  7.45  7.14      4.36     4.65     5.37   7.16    7.02  7.24     5.90
1     1      7.85   6.78  5.48  6.30  6.33      5.14     6.08     7.30   7.70    8.26  7.67     7.07

StdDev, by AGE1 and TRAIN
AGE1  TRAIN  jazz  blues  folk  rock   pop  medieval  renaiss  baroque  class  romant   mod  postmod
0     0      2.27   2.26  2.56  1.07  2.40      1.98     2.52     2.50   2.90    2.67  2.84     2.76
0     1      1.88   2.03  2.20  2.11  1.92      2.22     2.28     2.20   2.00    2.02  2.47     2.54
1     0      2.38   2.30  2.58  2.24  2.56      2.07     2.04     2.26   2.59    2.28  1.74     2.00
1     1      2.01   2.04  2.50  2.38  2.37      2.20     2.22     1.86   1.90    2.23  1.88     2.07

Means & S.D. for clumped genres (factor1)

By DMNSION
DMNSION  Mean  StdDev
1        5.81    1.78
2        7.05    2.08
3        6.72    2.09
4        7.10    1.87

By DMNSION and AGE1
DMNSION  AGE1  Mean  StdDev
1        0     5.61    1.85
1        1     6.01    1.70
2        0     7.16    2.05
2        1     6.93    2.10
3        0     6.56    2.39
3        1     6.89    1.73
4        0     7.29    1.63
4        1     6.91    2.07

By DMNSION and TRAIN
DMNSION  TRAIN  Mean
1        0      5.26
1        1      6.32
2        0      6.47
2        1      7.58
3        0      6.52
3        1      6.91
4        0      7.68
4        1      6.56

Key:
Dimension: 1=Baroque/Renaissance/Classical/Medieval/Romantic/Folk, 2=Blues/Jazz, 3=Modern/Post-Modern, 4=Pop/rock
Age: 0=15-20, 1=21+
Training: 0=non-music, 1=music