<<

Modeling Cultural

Richard McElreath Joseph Henrich

January 12, 2006

For publication in Oxford Handbook of Evolutionary , Dunbar and Barret, editors

1 Introduction and they die. Darwin’s insight was to see that organisms vary, and the processes of their lives When Darwin left for his voyage around the affect which types spread and which diminish. world on the Beagle, he took with him the The key to explaining long run change in na- first volume of Charles Lyell’s Principles of ture, to explaining the origin of new species, of Geology. Later in the voyage he received the whole new types of organisms, and of life itself second volume by post somewhere in South was to apply Lyell’s principle of uniformitar- America. Lyell never accepted Darwin’s ac- ianism to populations. By keeping track of count of evolution by , pre- how the small events of everyday life change sumably because of his religious beliefs. It is the composition of populations, we can ex- ironic then that Lyell’s work played a crucial plain great events over long time scales. role in the development of Darwin’s thinking. Biologists have been thinking this way In some ways Lyell’s principle of uniformitari- ever since Darwin, but it is still news in most anism is as central to Darwinism as is natural parts of the social sciences. Are people prod- selection. ucts of their societies or are societies prod- Before Lyell, it was common to explain ucts of people? The answer must be “both,” the features of the earth’s geology in terms but theory in the social sciences has tended of past catastrophes: floods, earthquakes and to take one side or the other (Marx’s di- other cataclysms. In contrast, Lyell tried to alectic being an obvious exception). In evo- explain what he observed in terms of the cu- lutionary models, this classical conflict be- mulative action of processes that we could ob- tween explanations at the level of the society serve every day in the world around us—the (think Durkheimian social facts) and explana- sinking of lands and the build up of sediments. tions at the level of individuals (think micro- By appreciating the accumulated small effects ) simply disappears. Population of such processes over long time spans, great models allow explanation and real causation changes could be explained. at both levels (and more than two levels) to Darwin took the idea of small changes over exist seamlessly and meaningfully in one the- long time spans and applied it to populations ory. We don’t have to choose between atom- of organisms. Darwin was a good naturalist istic and group-level explanations. Instead, and knew a lot about the everyday lives of one can build models about how individuals plants and animals. They mate, they repro- can create population-level effects which then duce, they move from one place to another, change individuals in powerful ways.

1 Cultural evolutionary models are much these assumptions into mathematical expres- the same as better-known genetic ones: events sions that tell us how the frequencies of each in the lives of individuals interact at the scale cultural and genetic variant (and the covari- of populations to produce feedback and pow- ance among them, if necessary) change during erful long-term effects on behavior. There are each stage of the life cycle. These expressions, three basic steps. called recursions, do the work of integrating events in the lives of individuals into micro- 1. One begins by specifying the structure evolutionary consequences—changes observ- of the population. How large is it? Is it able over short time spans. The next goal is to sub-divided? How do sub-divisions af- deduce the long-term macro-evolutionary con- fect one another? How does migration sequences of the assumptions. This is done by work? How is the population size regu- finding any combinations of cultural and ge- lated? netic variants that lead to steady states, equi- libria, and what combinations of environmen- 2. Then one defines the life cycle of the or- tal conditions and life-cycle variables make ganism. How does mating work? When different equilibria possible. Some of these is learning possible? What states do equilibria will be stable, meaning the popu- individuals pass through from birth to lation will be attracted to them, while oth- death? ers will be unstable, meaning the population 3. Finally, one defines the different herita- will move away from them. Stable equilibria ble variants possible in the model. What are candidates for long-term evolutionary out- is the range of strategies or mutations comes, and unstable ones are important be- over which evolution operates? How do cause they often inform us as to how likely these variants affect events in the life the population is to reach any of the stable cycle of the organism, such as death or equilibria or how much time it may spend at development, including learning, atten- each. tion, and inference? Thus by writing down formal expressions Since cultural evolutionary models can that capture assumptions about how tiny contain two interacting biological systems of events in the lives of individuals affect sur- inheritance, and genes, the answers to vival, reproduction, and the probabilities of these questions can be different for each sys- being a cultural parent, evolutionary mod- tem. For example, individuals may be able to els allow one to deduce the population-level acquire many different socially-learned behav- evolutionary consequences of individual-level iors, but the range of possible genetically in- psychologies, decision rules, and behavior. At herited learning strategies may be very small. the same time, since these expressions simul- The number of genetic parents has an upper taneously define how events in the life-cycle limit of two (for most vertebrates at least), affect the population and how the popula- but cultural parents can be many and the tion affects individuals, it is a two-way street. contributions among them can be very un- The mass action arising from individuals in- equal. In some cultural evolutionary mod- tegrates up at the population level to have els, the contribution of each parent is typically potentially powerful affects on the fates of in- non-additive in ways most people consider im- dividuals with different cultural and genetic possible in genetics. variants. These different fates in turn lead to After the structure of the model is com- further changes in the population, which lead pletely specified, the objective is to transform to yet more consequences for individuals.

2 It is not easy to keep all of these balls in ple, we could construct a very simple genetic the air simultaneously. The slipperiness of model in which the change (∆) in the fre- verbal reasoning is famous, and that is per- quency of an allele, p, is a function of envi- haps the reason why so many fields, from ronmental state, E. This system would have philosophy to economics to physics, use for- a single recursion: malism to make deductions about complex systems. The steady stream of interesting ∆p = F (p, E), and counter-intuitive results that emerge from where the function F (p, E) is to be specified these formalisms has demonstrated their value depending upon what model of adaptation to and made them centerpieces of theory devel- the environment we might choose. It might opment. be that E has little effect on individuals with Many social scientists and biologists work different alleles, or it might be that E favors on how individuals make decisions and how one over the others. It might be that E is behavior is acquired. Fewer ask how those fluctuating, so that selection favors different decisions and mechanisms of learning aggre- alleles at different times. The change might gate at the population level. Our position is depend upon p itself, as it does in the example that both are inherently interesting and cru- of sickle-cell anemia and other cases of over- cial for understanding evolving systems, in- dominance. But nowhere do we allow in such cluding culture. In the remainder of this pa- a system for E itself to evolve in response to per, we explore three key, and sometimes con- p. troversial, issues in the evolution of culture The scientific question is whether such which arise by examining the population pro- models are sufficient to model the evolution cesses cultural inheritance may generate. We of a given human phenotype. If we only knew invite the reader to join us in a tour of this genotypes and the state of the environment, biological frontier and see how formal popu- could we predict the behavior of organisms in lation models of cultural systems may clarify the next time period? When the answer to and address questions about human behavior, this question is “no,” we need at least one psychology, and society. more equation:

∆p = F (p, q, E), 2 Why bother with cultural ∆q = G(p, q, E), evolution? where q is the frequency of some cultural vari- Some phenotypes need more than ant (a dialect, say), and G(p, q, E) a function genes and environment, to be rep- telling us how dialect responds to environ- resented in a formal model. ment, E, and its own previous state, q, and the frequency of an allele, p. Sometimes people ask us why we should This all sounds rather complex. And it even bother with modeling ? can be. However, when important parts of Why are genetic models not sufficient? What phenotype are acquired during development scientific payoff is there in the added complex- and depend upon previous phenotypes, some ity? system like this is useful for understanding These are fine questions, and they have how the organism evolves. Unless we think fine answers. The basic issue is to identify the existing behaviors could be predicted solely minimal requirements for representing evolu- from knowing the environment and the distri- tion of phenotype in a species. For exam- bution of genes, at some point evolutionary

3 models must incorporate the dynamics of be- if all the above is true, culture can still be havioral inheritance. No heroic assumptions an evolving system that leads to cumulative are required for behavioral inheritance to ex- adaptation. This does not mean that evolved ist: if portions of phenotype depend upon the psychology has no role to play in how cul- phenotypes of other individuals, then weak or ture evolves (we think psychology has a huge strong inheritance of behavior can exist. In role to play in understanding culture), but we the long run, in a given model, it might turn think it does mean that dismissing cultural out that behavioral dynamics have little effect evolution on the basis of imperfection of the on the outcome. In others, it will make a huge genetic analogy is unwarranted. difference. Many people—enthusiasts of the “meme” Cultural evolutionary models (as well as approach and critics alike—seem to have niche construction models, see Odling-Smee been persuaded by ’ abstract et al. 2003) can model just the non-genetic statements on what is required for adaptive behavioral dynamics, as if q above did not evolution to occur. In The Extended Pheno- depend upon p, as well as joint dynamics type (1982), he argued that any successfully of a coupled gene-culture system (Boyd and replicating entity must exhibit (1) longevity, Richerson, 1985; Cavalli-Sforza and Feldman, (2) fecundity, and (3) fidelity. The entity must 1981; Durham, 1991). In each case, however, last long enough (longevity) to make copies of the structure of the model is decided by the itself (fecundity) that are reasonably similar question of interest. In the rest of this review, to it (fidelity). Some have interpreted this we show how cultural evolution models have to mean that anything with high mutation been used to address questions about human rates cannot be a successful replicator. Thus if behavior. cultural ideas change in the process of social learning, the conclusion is that they do not 3 Transmission in noisy sys- constitute an evolving system at all (see cita- tions in Henrich and Boyd 2002). Similarly, if tems cultural variants are continuous and blended entities, then they never exactly replicate, and The imperfection of the analogy again cannot produce adaptive evolution. between genetic and cultural evo- lution does not mean culture does These conclusions are unfounded. Read not evolve. very generally, Dawkins’ conditions are neces- sary and sufficient—There must be some heri- While evolutionary principles are equally ap- tability for adaptive evolution to occur. How- plicable to almost any dynamical system, ever, there are many ways to produce heri- many researchers approach models of cultural table variation. So in the strict sense many transmission and evolution via an analogy people have read them, while Dawkins’ con- with genetic evolution. This has led some to ditions are sufficient, they are definitely not be concerned about the strength of this anal- necessary. Reverse-engineering DNA may tell ogy (Sperber, 2000). If cultural variants are us how inheritance can work, but it does not not discrete, are prone to “mutation,” and are tell us how it must work. Henrich, Boyd and strongly affected by learning biases, then is Richerson (forthcoming) examines the prob- it appropriate to speak of “transmission” of lems with this reverse-engineering in greater culture at all? While we have no particular depth. attachment to the term “transmission,” we In this section, we address concerns arising think the answer is definitively “yes.” Even from the gene-culture analogy by demonstrat-

4 ing ways that transmission can deviate sub- after one generation of learning. To simplify stantially from the genetic analogy but never- their presentation, assume that there are only theless produce both heritable variation and two cultural parents and that each contributes adaptive evolution. Our broader message is equally to socialization. Let  be the vari- that biologists and social scientists alike have ance in error in cultural learning. When  is tended to think too narrowly in terms of the large, learning is noisy. When  = 0, cultural genes metaphor. Many other systems of inher- variants replicate perfectly. After some cal- itance are possible in principle, and culture is culus, the variation in cultural behavior, V , only one. We believe that it is more produc- after learning is (see pages 73-74): tive to drop the genetic analogy and instead 1 study cultural transmission and evolution on V 0 = (V + ). its own. 2 If  = 0, then the above has only one sta- 3.1 Noisy learning can maintain ble value, V 0 = V = 0. Blending reduces heritable cultural variation variation each generation until it is all gone. In this case, Jenkin was correct. However, if Before the union of genetics and Darwinism,  > 0, the equilibrium amount of variation most biologists, including Darwin, thought (found where V 0 = V ) is: that inheritance was a blending process: off- spring were a mix of parental phenotypes. Vˆ = . Darwin was troubled by Fleeming Jenkin’s (1864) argument that natural selection could Thus if there is substantial noise, there will not produce adaptations, because inheritance be substantial variation at equilibrium. This would quickly deplete the variation natural se- variation can be subject to selective forces and lection relies upon. Fisher’s (1918) argument produce adaptive change, just as in the ge- reconciling genetics with continuous pheno- netic case. typic variation purportedly rescued Darwin, The population comes to rest at the but in reality both Jenkin’s argument and amount of variation above, because while those who think Fisher saved Darwin are sim- blending inheritance does deplete variation, ply wrong: blending inheritance can preserve error in learning replenishes it. In the long variation, and particulate inheritance is nei- run, the balance between these two forces ther necessary nor sufficient to preserve vari- results in the expected amount of variation ation (Maynard Smith, 1998, has a chapter in the population being equal to the aver- that examines this problem). age amount of error individuals commit when Boyd and Richerson (1985) presented a learning. Blending can never reduce variation simple model to prove this point. The below this amount, because as soon as it does, model assumes that (1) cultural variants more mistakes are made, and more variation are continuous (non-discrete), and (2) naive is pumped into the system. Variation cannot individuals sample n cultural parents and stay forever above this amount, because learn- adopt a weighted average of their observed ing averages out any “error” that exceeds that behavior—blending inheritance. Observa- inherent in the learning itself. tions and reconstructions are prone to an arbi- Boyd and Richerson also showed that if trary amount of error, however, and therefore cultural parents assort by phenotype, then as- inheritance here involves continuous traits, sortment can help to maintain variation. This blending, and noise/error. They derive a re- might occur if similar types inhabit similar en- cursion for the variation in cultural behavior vironments or if similar types are more likely

5 to mate and jointly socialize their offspring. ferences, and these cognitive attractors may When this happens, the parents being blended swamp any evolutionary dynamics possible in together are more similar to one another and culture. therefore the loss of variation due to blending We do not doubt that psychological bi- is less than in the case above. If parents are ases for learning exist, and their importance weighted unequally (mom is more important has long been a part of cultural evolutionary than dad), this will also tend to slow the rate modeling (see Boyd and Richerson’s (1985) at which blending reduces variation, because direct bias). However, Henrich and Boyd unequal weighting reduces the effective num- (2002) have addressed whether strong innate ber of cultural parents. inferential mechanisms swamp adaptive cul- How cultural learning actually works is a tural evolution–influenced by selective forces good empirical question, but models like this like imitating successful people–by deriving a one prove that the argument that cultural model of cultural transmission that assumes variants cannot evolve in a meaningful way, continuously varying representations under because they are (1) not discrete entities like the influence of weak selective transmission genes and (2) prone to error, is simply not and strong cognitive attractors. This model a valid deduction. Likewise, the observation addresses the complaint that culturally trans- that culture does evolve does not imply that mitted ideas are rarely if ever discrete, but there are any units analogous to genes nor instead blend, as well as the complaint that that imitation and other forms of social learn- cognitive influences on social learning swamp ing are highly accurate. transmission effects such that cultural vari- We also think that the empirical evidence ation is not heritable. Using a very gen- is quite strong that many aspects of hu- eral model, they show that these complaints man behavior (including technology) evolve are deductively invalid. In fact, they derive in a Darwinian fashion (Richerson and Boyd, a non-intuitive conclusion: If inferential bi- 2005). Many of these are not plausibly ge- ases are sufficiently strong relative to selective netic, in any immediate sense. Thus non- forces, a continuous representation (quantita- deductive philosophical arguments that cul- tive blending) model reduces to the discrete- ture cannot evolve seem very suspicious, es- trait replicator dynamic commonly used in pecially when there are existing deductive ar- population models of both culture and genes. guments to the contrary. Thus, powerful and biased inferential mecha- nisms actually mean that even a weak selec- 3.2 Noisy learning can produce tive component will eventually determine the adaptive evolution final equilibrium of the system, in true Dar- winian fashion. The important assumption Some authors (Sperber and Hirschfeld, 2004) here is that learners’ psychology have multiple have made a lot out of the results of experi- inferential systems—that is, there are multi- ments that resemble games of “telephone” (as ple cognitive attractors. Strong cognitive bi- it is called in North America) or “Chinese ases do not swamp selective effects, but rather whispers” (as it is called in Britain). When make discrete models better estimates of the pairs of individuals pass a signal along a chain, actual dynamics. the message tends to be corrupted. Thus, we might conclude, social learning is too er- In two other models in the paper, Henrich ror prone to maintain variation or content in and Boyd (2002) construct systems with large and of itself. Strong innate inferential mecha- amounts of transmission error to show that ac- nisms may be needed to stabilize cultural dif- curate individual-level replication of cultural

6 variants is not necessary for selective forces tence may lead females to strive for rank be- to generate either cultural inertia or cumula- cause of its downstream consequences, in ad- tive cultural adaptation. Their second model dition to its immediate resource access effects shows that if learners aggregate information (Boyd, 1982; Leimar, 1996). from multiple cultural parents using a con- Extra- or “epigenetic” (Maynard Smith, formist bias (see Henrich & McElreath, this 1990) systems like this are increasingly rec- volume) they can dramatically reduce the av- ognized: everywhere biologists look, they find erage noise/error in their inferences, suggest- hints of inheritance systems either built on top ing that our psychology may have genetically of genes or built from entirely different mecha- evolved a conformist bias, in part, to reduce nisms. If the key question is what mechanisms transmission error (among other reason, see account for heritable phenotypic differences Henrich & Boyd 1998). In the aggregate, this among organisms, then the answer appears bias creates heritability at population level, to be “many.” Jablonka and Lamb’s Evolu- and can lead to cultural inertia. In the third tion in Four Dimensions (2005) mounts the model, Henrich and Boyd combine all the empirically rich argument that heritable dif- potential problems with models of cultural ferences in many species are due to the action evolution, assuming continuous (non-discrete) of several inheritance systems (genetic, epi- cultural representations, incomplete transmis- genetic, behavioral and symbolic), sometimes sion, and substantial inferential transforma- interacting, sometimes acting in parallel. tions. Despite these assumptions, they con- If one thinks about cell division for a mo- struct a model which produces adaptive cul- ment, it is obvious that processes other than tural evolution. the replication of DNA are needed to explain how it works. Organelles need to be copied 3.3 Other inheritance systems (Sheahan et al., 2004), and the genetic code In many baboons, females inherit dominance itself needs to be copied (and this is not con- rank from their mothers and sisters (Silk tained in the DNA, nor could it be). Beyond and Boyd, 1983). In these species, fitness cell division, adult phenotypes depend upon is strongly affected by this extra-genetic in- imprinting and other forms of learning that heritance: any female adopted at birth into may channel the environments offspring are a high-ranking matriline would be better off exposed to (a kind of niche construction— than if she were adopted into a low-ranking Odling-Smee et al. 2003). And finally, most matriline. And this female will have her dom- biologists believe that DNA was certainly not inance rank before she fights a single mem- the first form of hereditary biological mate- ber of her social group. Dominance is herita- rial (Szathm´aryand Maynard Smith, 1995). ble, has important effects on fitness, and yet Thus some inheritance systems must be able the mechanism of inheritance is at least partly to sometimes create complementary and even non-genetic. The rules of how this inheritance usurping inheritance systems. works are complicated and very unlike genes. In light of these plausible “inheritance sys- It probably depends upon the composition of tems,” it appears that human culture may not ones own matriline, the composition of the be so special or surprising at all, in the sense entire social group, and local resource den- of being a non-genetic system of inheritance. sity and feeding competition. And yet no pri- Organisms as diverse as arabidopsis (a small matologist could completely understand ba- plant related to mustard that is a favorite boon biology without taking this complicated of geneticists), common fruit flies and single- extra-genetic pedigree into account. Its exis- celled microscopic animals such as paramecia

7 exhibit heritable differences due at least in innovations, and reorganizations of human so- part to mechanisms other than the sequence cieties happen much faster than the average of nucleotides in their DNA. The existence of tempo of genetic evolution. Despite the mas- social learning as a system of inheritance and sive differences in behavior and social orga- adaptation that functions in complement to nization among human societies, there is lit- DNA may turn out to be unremarkable. tle genetic variation among groups within our To someone who makes formal models species (P¨a¨abo, 2001), leading most social sci- of evolutionary systems, the question that entists to infer that differences among human we must answer is whether it will be suffi- groups are due to rapid cultural evolution, not cient to represent human (or any other or- selection on genes. ganism’s) evolution with just state variables While we agree that cultural evolution is for its alleles. If we require state variables typically absolutely faster than genetic evolu- for early childhood experience, imprinting, tion, at least in the short term, this is only or behaviors acquired via social learning, to part of the story. The danger with the sum- make useful models of our own evolution, mary we just gave is that it encourages the then attempts to construct culture-free mod- view that cultural and genetic evolution lead els are simply scientifically inadequate. As to similar outcomes, only on different time with each of the possible systems above (e.g. scales. The relative rates of competing evo- Jablonka and Lamb, 1991; P´aland Mikl´os, lutionary forces are very different in the two 1999; Maynard Smith, 1990), the specific dy- systems. Population geneticists tend to think namics and consequences of cultural learn- of evolution as the result of the balance of ing may be rather unique and very important forces acting on alleles. Migration, mutation, for understanding both micro- and macro- and selection all act to alter allele frequencies, evolution. but appreciating the balance of these forces In the next two sections, we explore mod- is what makes population genetics predictive. els of the possible dynamic consequences of Because the balance is likely quite different in cultural inheritance. While such models do cultural models (and presumably the real sys- not tell us how actually tems the models caricature), quite different works, they direct our attention to possibili- outcomes are possible. ties we are unlikely to consider, if we consider DNA to be the only important source of her- 4.1 The balance of selective forces itable variation in our species. and migration For our discussion, we focus on the relative 4 The relative strength of strengths of two forces, migration and selec- forces of cultural evolution tion. Selection in the cultural case refers to learning forces that favor some behavioral Cultural evolution may be most variants over others. For example, people different in the relative difference probably prefer to imitate the successful (see in strength of evolutionary forces, Henrich and McElreath, this volume), and rather than the absolute speed of this favors behaviors that lead to success. its evolution. An ounce of mixing is a pound of effect, in most models of genetic evolution. In large It is commonly observed that cultural evolu- animals like ourselves, migration among sub- tion may be much faster than genetic evolu- populations is typically a very strong force. tion. Styles of dress and speech, technological This strong force of migration tends to unify

8 subpopulations of alleles with respect to selec- rieties, to spread. We want to make some- tion. However this is only true because mea- thing of the opposite point: hybrid corn dif- sured selection coefficients tend to be small, fused much more quickly than we might ex- relative to the force of mixing (Endler, 1986). pect, based upon its payoff difference with If selection were stronger (and it sometimes existing strategies and how natural selection is—see again Endler 1986), then more differ- would respond. ences could be maintained among sub-groups. If we take the genetic replicator model But in a cultural model, the strength of and use it to model the diffusion of hybrid learning biases that, for example, favor behav- corn, we can get a feeling for how much iors with higher payoffs over behaviors with stronger selective forces can be in cultural lower payoffs can be arbitrarily strong. Nat- evolution. This thought experiment violates ural selection of ideas does occur, such as many truths. We are assuming a year is the when different fertility ideologies influence the generation time, and that there is no individ- differential growth of religious groups (Stark, ual decision-making beyond imitation of suc- 2005). A school of American archaeology used cessful strategies. However, the ordinary pop- to argue that most important cultural and ulation genetic replicator dynamic and that technological change came about through nat- for simple imitation models is very similar ural selection of this kind (see Boone and (Gintis, 2000, provides a general derivation). Smith, 1998, for references). It would there- The most basic model, in which individuals fore be useful to consider how strong such se- compare their own payoff against that of a lection is, relative to what we might consider random individual and preferentially copy the fairly fast genetic evolution—such as the 4% strategy with the higher payoff, yields: increase in the depth of finch beaks Peter and ∆p = p(1 − p)β(w − w¯), Rosemary Grant recorded on Daphne Major, 1 an island in the Galapagos, during a two-year where p is the frequency of the cultural vari- draught in 1976 and 1977. This strength of ant (hybrid corn), β a rate parameter, and w1 selection is sufficient to produce beaks sub- andw ¯ have similar meanings to the genetic stantially deeper in less than a decade, assum- model, payoff to the behavior of interest and ing selection would continue at the same rate the average payoff, respectively. (Grant and Grant, 1993). Figure 1 shows these models with two An extrapolation from an empirical exam- strengths of “selection,” compared to the ac- ple of cultural evolution will help to make tual spread of hybrid corn. At the actual pay- clear how much stronger “selective” forces— off difference between hybrid corn and then- by which we mean forces what favor different existing varieties, the spread would have been variants in a non-random way—can be in cul- far slower than observed. A difference as tural systems. The classic study of the diffu- large as 50% is needed to predict the dif- sion of technological innovations is the Ryan fusion of hybrid corn in 13 years. The ac- and Gross (1943) study of the diffusion of hy- tual spread lagged behind this prediction for brid corn in Iowa farmers. Hybrid corn be- as much as half of the diffusion period, but came available in Iowa in 1928 and was even- then accelerated, so clearly other forces were tually adopted by nearly all farmers by 1941, at work in this example (Henrich, 2001). For over a period of 13 years. For those complet- current purposes, note that whatever social ing and reviewing the study, the shock was learning mechanisms are at work here must how long it took hybrid corn, which had a magnify observed payoff differences. Consider 20% increase in yield over then-existing va- also that a 20% difference in yield is unlikely

9 Figure 1: The diffusion of hybrid corn, modeled with the simple replicator dynamic presented in the text, for two strengths of “selection.” The dashed curve shows the predicted spread using the actual payoff difference between hybrid corn and then-existing varieties (20%). If this were natural selection, a 20% difference in fitness would be tremendous and rare, from one generation to the next. At this strength, the curve falls far short of predicting a spread in about 13 years. The solid curve shows the predicted spread for a 50% advantage, which is capable of predicting a spread in the approximately 13 years it took for hybrid corn to diffuse. See Henrich (2001).

10 to result in a 20% difference in reproduction altruism when: or survivorship important to natural selection  on genes. Many other behaviors matter for var(pi)β(wi, pi) > E var(pij)β(wij, pij) , aggregate fitness of an individual. Thus the where p is the frequency of an altruism gene magnitude of “selection” in this case of cul- i in population subdivision i, w is the average tural diffusion seems even larger in compari- i fitness in group i, and p and w are the fre- son to typical genetic estimates. ij ij quency of altruism and fitness of individual Because selective forces, arising from hu- j in group i, respectively. β(x, y) indicates man psychology, that favor some variants over the slope of the linear regression of x on y others may be strong, and especially strong (∂x/∂y). Thus the beta coefficients above are relative to mixing, cultural evolution may pro- selection gradients for different components of duce outcomes that are very unlikely in ge- fitness. In plain language, this condition can netic evolution. In this section, we explain one be read as: important case in which cultural evolution- ary models produce equilibrium results that The product of the variance in al- are possible, but highly unlikely, in analogous truism among groups and the rate genetical models. As discussed in Henrich of change in the average fitness of & McElreath (this volume), other examples individuals in a group as a func- may include ethnic marking (McElreath et al., tion of the number of altruists in 2003) and ethnocentrism (Boyd and Richer- the group son, 1985; Gil-White, 2001). must exceed the average product of the vari- ance within each group and the 4.2 for altruistic rate of change in individuals fit- behavior ness as a function of the amount of altruism the individual exhibits. By focusing on competition among cultural groups (), several “Groups” here are defined by the scale at modeling efforts have demonstrated how the which helping behavior benefits other individ- cultural transmission of behaviors related uals. Behaviors that aid brothers and behav- to cooperation and punishment may explain iors that aid entire ethnic groups are both gov- some otherwise puzzling patterns of human erned by this fact. However, common descent prosociality (Boyd and Richerson, 1985; Boyd maintains genetic variation among kin groups, et al., 2003). See Henrich (2004) and Hen- while variation among large groups is much rich and Henrich (forthcoming) for reviews. harder to sustain. Mixing is very strong in The reason these models can result in stable animals like ourselves, leading to either very cooperative equilibria, while analogous genet- little equilibrium variance among large groups ical models cannot, is due to the plausibility of individuals or the steady leaching away of of strong imitation forces opposing forces of variation (see the model by Rogers, 1990). If mixing (Boyd et al., 2003). learning forces like conformity effectively re- Mixing is an enemy of altruism because se- duce mixing of cultural variants, then varia- lective forces can only produce altruism when tion among groups can remain high enough the between-group variance in behavior is to support group selection. There is nothing large enough to overcome within-group selec- heretical about this statement. W. D. Hamil- tion opposing altruism. Price (1972) and later ton himself saw as a special case Hamilton (1975) showed that selection favors of this general condition (see Hamilton, 1975).

11 The key issue in any model of the evolution of itatively different outcomes for an organ- altruism is what forces are available to main- ism? But the evolution of sexual reproduc- tain variation among groups. tion transformed how traits are inherited and In the cultural case, it is plausible, al- created equally (if not more) novel evolution- though hardly yet proven empirically, that ary dynamics. Models of sexual selection strong learning dynamics combined with weak of animal signals have no problem produc- effective migration can result in more variance ing situations in which males produce and than analogous genetical models (Boyd et al., females prefer costly ornaments that lower 2003, models this process). This in turn might the overall fitness of the population (Fisher, produce selection on culturally transmitted 1930; Lande, 1981). Few people have a prob- ideas that lead to self-sacrifice. Groups with lem calling such equilibria fundamentally Dar- such ideas may either defeat their neighbors in winian, even though evolution sometimes pro- open conflict, because they can muster more ceeds quite differently in sexual than asex- fighters to the field of battle, or defend them- ual species. Similarly, we should not balk at selves better from aggression, because they noticing that a genetically-evolved system for can recruit more people to dig trenches, build acquiring behavior via social learning might walls, or mount a defense of arms. end up producing equilibria that are not the self-same ones the genes themselves would be We must caution the reader to avoid a mis- selected to arrive at. take others have made in understanding this hypothesis. Cultural group selection trades off the very fact that human ethnic groups 5 Gene-culture are well-mixed genetically, but still maintain appreciable cultural distinctiveness. Alleles Gullibility may be an adaptation, for group-oriented self-sacrifice are unlikely because critical thinking is costly. to spread, because genes move among ethnic and other cultural groups quite often. How- Over the very long run, cultural dynamics ever, this mixing does not always destroy cul- cannot continue to always outrun genetics. tural variation. Immigrants do not necessar- Genes must have an eventual influence. One ily erode the variation in such ideas among reason could be that, as variation among cul- groups, because immigrants may quickly con- tural variants diminishes, the rate of evolu- form to local beliefs, even though they cannot tion will slow, and then lagging changes in change their alleles. The group selection is on genetic variants will become more important. culturally transmitted beliefs, not on physi- Also, the cultural system should eventually cal bodies. It is possible to construct a work- reach some stationary distribution, even if it ing cultural group selection model (Boyd and is stochastic. Then selection on genes, how- Richerson, 2002) in which comparison across ever slow, may determine how this equilibrium groups generates the equilibrium shifts, not shifts. Even rates of change in classic organic differential reproduction or survival of groups. evolution appear to vary on different scales In this case, the group selection may involve (Penny, 2005). Thus it seems that ignoring no differential death or birth of human bodies genes in the long run is probably a mistake. at all. In this final section we present a very An effect like this might seem initially simple model of gene-culture coevolution. It implausible. Would a system of phenotypic helps explain one way to model the joint evo- transmission like social learning, created by lution of transmission systems with very dif- genetical evolution, actually lead to qual- ferent rates of change. Also, this model al-

12 lows us the opportunity to explain a few im- these two-dimensional systems can be solved portant predictions about behavior that arise with matrix techniques. However, if the cul- from gene-culture models. tural dynamics are fast enough relative to the genetic dynamics, the cultural dimension p 5.1 When culture is much faster will come to rest at its steady state,p ˆ, very than genes quickly. This can be true either because there are many opportunities to learn and update One way to deal with the difference in rates behavior per selection event or because selec- is to assume that the distribution of cul- tion coefficients are weak, compared to the tural variants reaches an equilibrium instan- rate of change due to learning (see the previ- taneously, with respect to genetic evolution. ous section). If either is true, then the system The distribution of alleles then responds to arrives at a cultural equilibrium quickly, and this stationary distribution of cultural vari- q will respond to this value. As q changes un- ants. Provided cultural dynamics are suf- der selection on genes, of course,p ˆ will also ficiently faster than genetic ones, then this change. But now since p instantly reaches its method yields a good approximation of the steady-state for any given value of q, we have joint system dynamics. Boyd and Rich- a one-dimensional system again: erson (1985) and Alan Rogers (1988) have used this tactic to derive joint evolutionary ∆ˆp = F (q), equilibria for simultaneous cultural and ge- ∆q = G(ˆp, q). netic recursions, without resorting to more- complex multi-dimensional techniques. Nu- With such a system, all we have to worry merical analysis of the recursions shows that about is the stability of the genetic equilib- the infinitely-fast-culture assumption does not ria. The cultural equilibrium just responds result in misleading results. to it. You might think that this means the The basic problem is that the change in genes run the show, and that such a model frequency of a single cultural variant can be produces the same outcomes as a culture-free represented in a one-dimension system by the model. But as we will demonstrate, not even abstract function: the simplest models back up that intuition. Here is a model in the spirit of Rogers ∆p = F (p). (1988). We think this very simple model This means we can compute the change in the demonstrates the vulnerability of some com- frequency, if we know the current frequency. monly held beliefs about what kinds of be- But if we add a simultaneous second recursion havior we expect natural selection to produce. for genes that specify how culture is acquired, Imagine a simple organism capable of imitat- then we have a two-dimensional system with ing the behavior of older individuals, in ad- two functions: dition to investing effort in updating behavior through individual trial and error. We use the ∆p = F (p, q), discrete formulation, but as with all models ∆q = G(p, q). of this type, there is an equivalent continuous formulation (in which individuals do some im- Now we must know both the frequency of itation and some individual learning). Each the cultural variant and the genes influenc- generation, individuals learn according to an ing social learning in order to find the change inherited allele (individual or social) and then in either. The trick is to determine stabil- receive payoffs determined by whether what ity in such systems. In principle, stability in they have learned is adaptive under current

13 circumstances. With the above assumptions, we can write fitness expressions for each allele, I for individ- First, a caveat: People sometimes com- ual learners and S for social learners. Let a plain that it is unrealistic to consider a pure be the frequency of currently adaptive behav- “social learning” strategy, because real people ior among adults of the previous generation. always make inferences while being influenced Then: by the behavior of others. We agree. All social learning depends upon individual psychology W (I) = B − C, and how that process works. If we expressed W (S) = Ba. this model in its completely equivalent contin- uous form, with a family of mixed strategies The variable a is the frequency of adaptive that rely upon a mix of individual and so- behavior at any one moment, but it changes cial influence, fewer people would complain. over time. This implies a recursion for how The version we present here is better for illus- a changes, and this process will depend upon trating the insights we wish to draw from it. how the population learns. Let L be the fre- Models are like cartoons: there is an optimal quency of individual learners in the popula- amount of detail, and often that amount is tion. Then the frequency in the next genera- very small. We caution readers of such mod- tion is: els not to get hung up on vague words like a0 = u(0) + (1 − u)L(1) + (1 − L)a. “social learning” that have different meanings The parameter u is the fraction of the time in different sub-disciplines, but instead to at- that the environment changes, rendering all tend to the structure of the assumptions. As past behavior maladaptive, each generation. others have shown, equivalent models can be The rest of the time, L of the population derived under the assumption that individu- learned for themselves and arrived at adaptive als are entirely Bayesian updaters, but able behavior with certainty. The remaining 1 − L to observe what other people do (Boyd and of the population imitates and transmits the Richerson, 2005; Bikhchandani et al., 1992). previously adaptive frequency a. Suppose an infinite number of behaviors Now we apply the assumption that cul- are possible, but only one is adaptive for cur- tural dynamics are much faster than genetic rent environmental circumstances and yields dynamics. This allows us to find the steady a payoff B. All others yield a payoff of zero. state value of a, call thisa ¯, for any given L. This assumption just sets the scale of payoffs, This exists where a0 = a and is: so we lose no generality with it. The environ- L(1 − u) a¯ = . ment itself changes state, making a new be- 1 − (1 − L)(1 − u) havior optimal, with probability u each gen- Over the long run, the fitness of social learners eration. When this happens, since there exists will depend upon this value. We pluga ¯ into a very large number of possible behaviors, we the expression for W (S) and find the value of assume all existing behavior in the population L that yields a genetic equilibrium, the end- is rendered maladaptive. Individual learners point of the long-term selection on genes. The pay a cost for experimentation and mistakes equilibrium frequency of individual learning, (C), but they always arrive at the currently Lˆ turns out to be: adaptive behavior. In contrast, social learn- B/C − 1 ers pay no up-front costs, but they just copy Lˆ = . a random adult from the previous generation, 1/u − 1 so they have no guarantee of acquiring the This expression tells us how the stable fre- currently adaptive behavior. quency of individual learning responds to the

14 costs of learning and the unpredictability of Richerson and Boyd (2005) call this effect the environment. The quantity B/C is the the costly information hypothesis: when infor- ratio of the benefits of acquiring adaptive be- mation about the world is costly to acquire, havior to the costs of learning it. As this goes it pays to rely upon cheaper ways of learn- down, learning is more costly, and the fre- ing. Consider what proportion of behavior is quency of individual learning declines. The adaptive to current circumstances, when C/B second effect is that as the environment be- is very small, perhaps 1/100. In this case, comes less predictable (u increases), the de- because information is very cheap to acquire, nominator decreases and the equilibrium fre- most individuals (if not all) will be individ- quency of individual learning increases. If the ual learners, and the expected accuracy of be- world is unstable, what your parents did may havior, a¯ˆ, will be nearly 100%. But when in- no longer be adaptive, so it pays more to think formation is costly, because it is dangerous, for yourself. time-consuming, or difficult to acquire and The most obvious result of this model is process, then the expected accuracy will be that natural selection can easily favor sub- much smaller. stantial amounts of social learning. Unless u When we look at a population of animals is very large or B/C is very small, there will and ask why they behave as they do, this be a substantial frequency of social learners at model (and many others like it, see Boyd and equilibrium. Richerson 1995) suggests it will be risky to assume that development (in this case, learn- 5.2 Gullibility as an adaptation ing) is irrelevant to our explanations of what behavior we will see. If the costs of informa- An interesting further deduction from the tion are high, then substantial portions of the above model is the frequency of adaptive be- population will be practicing maladaptive be- havior once genes also reach equilibrium. Call havior. this a¯ˆ. This is found by substituting the value of Lˆ for L in the expression fora ¯. After sim- Moreover, this will be the optimal strat- plification: egy, from the point of view of the genes. Any more individual learning would not be C a¯ˆ = 1 − . an equilibrium, even though it would lead to B more accurate behavior. What is happening This result is very interesting. Notice that it is that social learning saves fitness costs at one does not depend upon u. Natural selection point in the life cycle, only to pay other fit- adjusts learning in response to u so that, at ness costs later. Even models of cumulative equilibrium, the value of socially-acquired be- culture (Boyd and Richerson, 1996) show the havior, a¯ˆ, is governed only by the cost of in- same tradeoff. When we sample behaviors, formation. When the world is relatively sta- we might not notice the information-gathering ble from one generation to the next, there are costs paid by individual learners and conclude more social learners at equilibrium, which re- that individual learning has higher fitness, be- duces the expected value of socially-acquired cause on average individual learners practice behavior. However, the countervailing effect more-accurate behavior. is that, in a more-stable world, adaptive be- The social learners in this very stylized havior has a better chance to accumulate, so a model are gullible. They believe whatever smaller number of individual learners can pro- the previous generation demonstrates. In this vide the same expected accuracy of behavior case, gullibility is an adaptation, because the as a large number, in an unstable world. costs one would have to pay to verify all the

15 suggested behavior in the world would be too tocene environments, human societies proba- great. Some individual learning is always fa- bly showed widespread non-adaptive and mal- vored, because otherwise the population can- adaptive behaviors, because such behaviors not track the environment at all. But large are a side-effect of relying upon social learning doses of gullibility can be adaptive, because as an adaptation. Thus we suspect that even information is costly. Pleistocene foragers did a lot of “silly” things, from the perspective of genetic fitness. This Our impression of real human societies is would not mean that culture, as a system of that many people will believe nearly anything behavioral evolution, is a maladaptation, no you tell them, at least at first. Many read- more than the cost of producing sons means ers of this chapter will be successful students sexual reproduction is a maladaptation. But or professional scholars, who have substantial it would unfortunately mean that predicting experience with teaching. Isn’t it amazing behavior is harder than simply applying opti- that students are willing to take our word on mality criteria. It would mean that both de- so many abstruse topics? We think models velopment and population dynamics are cen- like this one suggest an answer: being gullible tral to Darwinian explanations of individual when a problem is abstruse is adaptive, be- behavior. cause it is often beyond the individual’s means to verify the accuracy of it alone. If we in- sisted on learning everything for ourselves, we References would miss out on many very adaptive solu- tions. Given how hard it is for agricultural sci- Bikhchandani, S., Hirshleifer, D., and Welch, entists to decide what crops in what propor- I. (1992). A theory of fads, fashion, cus- tions to plant, it seems implausible that many tom, and cultural change as informational real agriculturists, who have to live off their cascades. Journal of Political Economy, produce, can afford to experiment and analyze 100:992–1026. their year-to-year yields (Henrich, 2002). Boone, J. and Smith, E. (1998). Is it evolution The cost of being adaptively gullible may yet? A critique of evolutionary archaeology. be that we are sometimes, perhaps often, lead Current , 39:141–173. astray. The universal existence of magical thinking might be a symptom of this tradeoff. Boyd, R. (1982). Density-dependent mortal- After all, if you cannot disprove that there are ity and the evolution of social interactions. dangerous spirits in the forest, it may be best Animal Behaviour, 30:972–982. to just trust that there are. We think this possibility is provocative, because it suggests Boyd, R., Gintis, H., Bowles, S., and Richer- that behavioral maladaptation in some do- son, P. J. (2003). The evolution of altruis- mains may be a standard feature of human so- tic punishment. Proceedings of the National cieties, even in our proper evolutionary niche. Academy of Sciences USA, 100:3531–3535. Clearly some of the non-adaptive things hu- Boyd, R. and Richerson, P. (1996). Why cul- mans do, like gorge on fatty foods, are proba- ture is common, but cultural evolution is bly products of our minds being adapted to a rare. Proceedings of the British Academy, world in which fatty foods were rare. But this 88:77—93. kind of out-of-equilibrium explanation of mal- adaptive behavior is not the only possibility. Boyd, R. and Richerson, P. J. (1985). Culture If the above model captures the texture of hu- and the Evolutionary Process. Univ Chicago man cultural adaptations, then even in Pleis- Press, Chicago.

16 Boyd, R. and Richerson, P. J. (1995). Why Grant, B. R. and Grant, P. R. (1993). Evo- does culture increase human adaptability? lution of darwin’s finches caused by a rare Ethology and , 16:125–143. climatic event. Proceedings of the Royal So- ciety London B, 251:111–117. Boyd, R. and Richerson, P. J. (2002). Group beneficial norms spread rapidly in a struc- Hamilton, W. D. (1975). Innate social ap- tured population. Journal of Theoretical titudes of man: an approach from evolu- Biology, 215:287–296. tionary genetics. In Fox, R., editor, Bioso- cial Anthropology, pages 133–153. Malaby Boyd, R. and Richerson, P. J. (2005). Ra- Press, London. tionality, imitation, and tradition. In The Origin and Evolution of , chap- Henrich, J. (2001). Cultural transmission and ter 18, pages 379–396. Oxford University the diffusion of innovations: Adoption dy- Press, Oxford. namics indicate that biased cultural trans- Cavalli-Sforza, L. L. and Feldman, M. W. mission is the predominate force in behav- (1981). Cultural transmission and evolu- ioral change and much of sociocultural evo- tion: a quantitative approach. Princeton lution. American Anthropologist, 103:992– University Press, Princeton. 1013.

Dawkins, R. (1982). The Extended Phenotype: Henrich, J. (2002). Decision-making, cultural The Long Reach of the Gene. Oxford Uni- transmission and adaptation in economic versity Press, Oxford. anthropology. In Ensminger, J., editor, Theory in Economic Anthropology, pages Durham, W. H. (1991). Coevolution: Genes, 251–295. AltaMira Press. Culture, and Human Diversity. Stanford University Press, Stanford. Henrich, J. (2004). Cultural group selection, coevolutionary processes and large-scale co- Endler, J. (1986). Natural Selection in the operation. Journal of Economic Behavior Wild. Princeton University Press, Prince- and Organization, 53:3–35. ton. Henrich, J. and Boyd, R. (2002). On modeling Fisher, R. A. (1918). The correlation between cognition and culture: Why replicators are relatives on the supposition of Mendelian not necessary for cultural evolution. Jour- inheritance. Transactions of the Royal So- nal of Cognition and Culture, 2:87–112. ciety of Edinburgh, 52:399–433. Henrich, J., Boyd, R., and Richerson, P. J. Fisher, R. A. (1930). The genetical theory of (forthcoming). Five common mistakes in natural selection. Dover, New York. cultural evolution. In Sperber, D., editor, Gil-White, F. J. (2001). Are ethnic groups bi- Epidemiology of Ideas. Open Court Pub- ological ’species’ to the human brain?: Es- lishing. sentialism in our cognition of some social categories. Current Anthropology, 42:515– Henrich, N. and Henrich, J. (forthcoming). 554. The Origins of Cooperation: A cultural and evolutionary exploration among the Iraqi Gintis, H. (2000). Game Theory Evolving. Chaldeans of metro-Detroit. Oxford UP, Princeton University Press, Princeton. New York.

17 Jablonka, E. and Lamb, M. (2005). Evolu- Richerson, P. J. and Boyd, R. (2005). Not by tion in Four Dimensions. MIT Press, Mas- Genes Alone: How culture transformed hu- sachusetts. man evolution. University of Chicago Press, Chicago. Jablonka, E. and Lamb, R. M. (1991). Sex chromosomes and speciation. Proceedings Rogers, A. R. (1988). Does biology constrain of the Royal Society of London, B, 243:203– culture? American Anthropologist, 90:819– 208. 831. Jenkin, F. (1864). The origin of species. North Rogers, A. R. (1990). Group selection by se- British Review, 46:277–318. lective emigration: The effects of migration and kin structure. American Naturalist, Lande, R. (1981). Models of speciation by sex- 135:398–413. ual selection on polygenic traits. Proceed- ings of the National Academy of Sciences Ryan, B. and Gross, N. (1943). The diffusion USA, 78:3721–3725. of hybrid seed corn in two iowa communi- ties. Rural Sociology, 8:15–24. Leimar, O. (1996). Life history analysis of the Trivers-Willard sex-ratio problem. Behav- Sheahan, M. B., Rose, R. J., and McCurdy, ioral Ecology, 7:316–325. D. W. (2004). Organelle inheritance in plant cell division: the actin cytoskeleton is Maynard Smith, J. (1990). Models of a dual required for unbiased inheritance of chloro- inheritance system. Journal of Theoretical plasts, mitochondria and endoplasmic retic- Biology, 143:41–53. ulum in dividing protoplasts. The Plant Maynard Smith, J. (1998). Evolutionary Ge- Journal, 37:379–390. netics. Oxford University Press, Oxford. Silk, J. B. and Boyd, R. (1983). Fe- McElreath, R., Boyd, R., and Richerson, P. J. male cooperation, competition, and mate (2003). Shared norms and the evolution choice in matrilineal macaque groups. In of ethnic markers. Current Anthropology, Wasser, S. K., editor, Social Behavior of Fe- 44:122–129. male Vertebrates, pages 315–347. Academic Press, New York. Odling-Smee, F. J., Laland, K. N., and Feld- man, M. W. (2003). Niche Construction: Sperber, D. (2000). An objection to the The Neglected Process in Evolution. Num- memetic approach to culture. In Aunger, ber 37 in Monographs in Population Biol- R., editor, Darwinizing Culture: The Sta- ogy. Princeton University Press, Princeton. tus of as a Science, pages 163– 173. Oxford University Press, Oxford. P¨a¨abo, S. (2001). The human genome and our view of ourselves. Science, 16:1219–1220. Sperber, D. and Hirschfeld, L. A. (2004). The cognitive foundations of cultural stability P´al, C. and Mikl´os, I. (1999). Epigenetic and diversity. Trends in Cognitive Sciences, inheritance, genetic assimilation and spe- 8:40–46. ciation. Journal of Theoretical Biology, 200:19–37. Stark, R. (2005). The rise of Mormonism. Penny, D. (2005). Relativity for molecular Columbia University Press, New York. clocks. Nature, 436:183–184. Szathm´ary, E. and Maynard Smith, J. (1995). Price, G. R. (1972). Extension of covariance The major evolutionary transitions. Nature, selection mathematics. Annals of Human 374:227–232. Genetics, 35:485–490. 18