Chapter 54: Biology, Genetics and Evolution

Antonio Benítez-Burraco1, Koji Fujita2, Koji Hoshi3 and Ljiljana Progovac4

1. Department of Spanish, Linguistics and Theory of Literature (Linguistics), Faculty of Philology, University of Seville, Seville, Spain

2. Department of Human Coexistence, Graduate School of Human and Environmental Studies, Kyoto University, Kyoto, Japan

3. Faculty of Economics, Keio University, Yokohama, Japan

4. Department of English, Linguistics Program, Wayne State University, Detroit, USA

Overview

1. Introduction
2. The core view: Biolinguistics and the Chomskyan approach to language evolution
3. The brain
4. Genetics
5. Subsequent views/developments
5.1. Gradualist view within Minimalism
5.2. Neurobiological continuity under Minimalism
5.3. Evolutionary continuity under Minimalism
6. Future prospects

1. Introduction

In this chapter we first look at the core view of the biology of language associated with Minimalism, including the Biolinguistics Program (section 2). Next, we consider research on the brain (section 3) and genetics (section 4) associated with this framework. Finally, we introduce some subsequent views of language evolution which break away from the saltationist, discontinuous nature of the mainstream approach (section 5), and draw some conclusions regarding future prospects (section 6).

2. The core view: Biolinguistics and the Chomskyan approach to language evolution

Rather than a thorough introduction, we can provide here only a brief perspective on the Biolinguistics Program, associated with linguist Noam Chomsky. The very term Biolinguistics in the modern sense is linked to the events that started in the 1950s, with the early work of Noam Chomsky (e.g. his Ph.D. manuscript The Logical Structure of Linguistic Theory, 1955), and Eric Lenneberg’s influential (1967) book, Biological Foundations of Language, which popularized the idea of the critical period in language acquisition (for some discussion, see e.g. Hoshi 2017). This kind of approach, which studies language in a biological framework, saw a revival at the turn of the 21st century, when it picked up pace significantly, with its own dedicated journal, Biolinguistics, as well as hundreds of publications addressing the topic, including several co-edited volumes. In the editorial introduction to their newly established journal Biolinguistics, biolinguists Cedric Boeckx and Kleanthes Grohmann adopt the following five main topics of inquiry for this enterprise (Boeckx and Grohmann 2007), posed already in the early days (see Chomsky 1986, 1988):

(1)
1. What is knowledge of language?
2. How is that knowledge acquired?
3. How is that knowledge put to use?
4. How is that knowledge implemented in the brain?
5. How did that knowledge emerge in our species?

The charge of this chapter is to address the last two questions, i.e., how language is represented in the brain, and how it evolved in our species, from the point of view of Minimalism. On the face of it, the term Biolinguistics evokes a field that studies language in a biological framework, considering how to cross-fertilize crucial postulates and findings in biological sciences with those of linguistic sciences. However, the term has been used somewhat differently and more narrowly, to refer to a rather specific approach to the nature of the faculty of language, in particular a specific theoretical approach to syntax, leading to a focus that is perhaps unexpected.

First, Biolinguistics is not primarily or necessarily associated with biology, but is instead associated more broadly with the ‘laws of nature,’ often invoking physics or mathematics (see e.g. Lenneberg, 1972, who warned against this trend in generative grammar). For example, it is often considered in this approach that the essence of syntax is better illuminated by regular and orderly principles of nature, such as those postulated in physics and mathematics, rather than by biological postulates such as natural selection and evolutionary tinkering (see e.g. Jacob 1977 for the relevance of tinkering in biological, adaptationist approaches to evolution). It has been Noam Chomsky’s long-held view that invoking natural selection via tinkering can be symptomatic of a lack of understanding: “if you take a look at anything that you don’t understand, it’s going to look like tinkering,” but when things are properly understood, one realizes that there is much more order in nature (Chomsky 2002, p. 139).

Chomsky (2002, 2005), among others, has claimed that human language/grammar can be a by-product of other phenomena, such as the increase in brain size, or general laws of physics, rather than being an agent in its own evolution (see e.g. the quote from Berwick and Chomsky 2011 below, according to which “language is something like a snowflake, assuming its particular form by virtue of laws of nature…”).1 This stands in stark contrast to the arguments put forth by e.g. Pinker and Bloom (1990), who state that the only way to evolve a truly complex design that serves a particular purpose, such as language, is through a sequence of changes with small effects, and with intermediate stages (but see e.g. Lewontin, 1998, for a criticism of such views). Pinker and Bloom’s argument is based on the analogy with the intricate structure of the eye, stating that evolution is the only physical process that can create an eye because it is the only physical process in which the criterion of being good at seeing can play a causal role. On the other hand, Noam Chomsky’s arguments have to do with (i) his view that syntax is an all-or-nothing package, not decomposable into stages (a view which seems to be shared by many other followers of Biolinguistics2), as well as with (ii) his long-held view that there are no genetic differences among humans when it comes to language abilities (e.g. Chomsky 2002: 147; Berwick and Chomsky 2016).

1 These claims can be related to e.g. the Thompson-style (1917) view that physical laws are part of biology, in the sense that biological forms can be determined by physical/mathematical laws, and according to which natural selection is considered to be only of secondary importance.

2 This view of syntax and its evolution can be found in e.g.: Berwick (1998); Bickerton (1990, 1998); Lightfoot (1991); Chomsky (2002, 2005); Berwick and Chomsky (2011; 2016); Piattelli-Palmarini (2010); Piattelli-Palmarini and Uriagereka (2004; 2011); Moro (2008); Hornstein (2009); Miyagawa (2017); Miyagawa et al. (2014); Di Sciullo (2013).

In their (2016) book Berwick and Chomsky did conclude that natural selection may have been responsible for spreading the one beneficial mutation that they postulate was responsible for the emergence of language (see below), but this was only after the bulk of the book discussed what they consider to be the problems with natural selection, and the problems with Darwin not having had a mathematical mind. Moreover, invoking selection for one single mutation is very different from invoking it at multiple steps in the evolution of a trait, as has been the case with e.g. the structure of the eye, involving intermediate stages and tinkering with its many interacting components, where each step brings some (small) benefits, and reveals continuity with other species. Nonetheless, Berwick and Chomsky’s (2016) claims are consistent with Chomsky’s early views (e.g. 1972, 97) expressed in the context of language evolution that “it is perfectly safe to attribute this development to ‘natural selection,’ so long as we realize that there is no substance to this assertion, that it amounts to nothing more than a belief that there is some naturalistic explanation for these phenomena.”

These considerations seem to be unexpected features of the Biolinguistics enterprise, considering its name. Even when this approach invokes biology, it typically expresses opposition to the postulates of natural and sexual selection, as well as to gradualist, adaptationist approaches to the evolution of syntax/language, which are widely utilized in biology. Instead, Biolinguistics is primarily associated with Noam Chomsky and his followers’ view of syntax, which is based on the assumption of the innateness and universality of syntactic/linguistic principles across all humans, and which advocates that this innate capacity was a result of one single evolutionary event, perhaps a slight re-wiring of the brain due to a single minor mutation (see below). As such, this view is saltationist in nature,3 with a rather late postulated date of emergence of language, in humans only, advocating discontinuity between humans and other species, including our closest relatives among the primate group.4 The reason and rationale for the saltationist view seemingly come from the postulates of the Minimalist Program (although see e.g. Clark 2013 for an explanation of why the theoretical framework need not determine whether a saltationist or gradualist approach is adopted; see also section 5). According to Berwick and Chomsky (2011: 29-31) “the simplest assumption, hence the one we adopt…, is that the generative procedure emerged suddenly as the result of a minor mutation. In that case we would expect the generative procedure to be very simple… The generative process is optimal. … Language is something like a snowflake, assuming its particular form by virtue of laws of nature… Optimally, recursion can be reduced to Merge… There is no room in this picture for any precursors to language – say a language-like system with only short sentences” (see also Chomsky 2002, 2005).5

3 This kind of saltationist view of language evolution has been challenged by e.g. Pinker and Bloom (1990); Newmeyer (1991, 1998, 2005); Jackendoff (1999, 2002); Givón (e.g. 2002; 2009); Gil (2005); Culicover and Jackendoff (2005); Tallerman (2014); Heine and Kuteva (2007); Hurford (2007, 2012); Dediu and Ladd (2007); Dediu and Levinson (2013); Progovac (2008, 2010, 2015, 2016a, 2019); Fitch (2017a,b); Fisher (2017).

While Biolinguistics in this general sense has been around since the 1950s, it was not until the advent of Minimalism in the 1990s (e.g. Chomsky 1995), and specifically with the formulation of the Strong Minimalist Thesis (SMT), that these views took center stage in the theory of syntax itself. As summarized in Berwick and Chomsky (2016, p. 94), “UG [Universal Grammar] reduces to the simplest computational principles… sometimes called the Strong Minimalist Thesis (SMT).” The claim is that the “generative process is optimal,” based on “efficient computation” (p. 71), and that “this newly emerged computational system for thought… is perfect, in so far as SMT is correct” (p. 80). According to SMT, language is an optimal solution to legibility conditions (e.g. Chomsky 2000: 96; see also Epstein, Kitahara, and Seely 2010),6 where human syntax reduces to a single optimal operation Merge. In their influential article, Hauser, Chomsky and Fitch (2002) claim that recursion, which is considered in this Biolinguistics approach to reduce to Merge, is the only operation that needed to evolve in humans to yield language, and the title of Berwick’s (2011) paper is “All you need is Merge.” Evoking SMT, Berwick and Chomsky (2016) define Merge as “the simplest possible mode of recursive generation: an operation that takes two objects … and forms from them a new object … the set.” From there it appears to follow that “we simply do not have as much to explain” (Berwick and Chomsky 2016, p. 11): given how simple syntax must be, the evolution of syntax amounted to just one single, unremarkable event. “UG [Universal Grammar] must meet the condition of evolvability, and the more complex its character, the greater the burden on some future account” of its evolution (Berwick and Chomsky 2016, p. 93). In other words, (i) in order for syntax to be evolvable, syntax itself has to be extremely simple, and, (ii) given that syntax must be so simple (as per (i)), syntax must have arisen through one single, minor mutation.7

4 In his earlier work, Chomsky (2005) postulated that language emerged in humans around 50,000 years ago. Dediu and Levinson (2013) directly challenged this view when it comes to both timing and continuity with other hominin species, estimating that language dates back to the common ancestor of humans and Neanderthals, to some 400,000-500,000 years ago, allowing for the possibility that Neanderthals and Denisovans had some forms of language (for a response, see e.g. Berwick, Hauser, and Tattersall 2013). In their (2016) book, Berwick and Chomsky shifted their early estimated date to up to 200,000 years ago (p. 157).

5 In fact, Piattelli-Palmarini (2010, p. 160), one of the founders of Biolinguistics, flirts with the idea that not just syntax, but language in its entirety, arose as one single event, stating that it is illusory to think that words can exist outside of full-blown syntax, or that any proto-language can be reconstructed in which words are used, but not full-blown syntax (for similar ideas, see also DiSciullo 2013; Nóbrega and Miyagawa 2015).

6 However, the terms “optimal” and “perfect” have not been defined in such a way as to allow this proposal to be subject to empirical verification and falsification (see e.g. Johnson and Lappin 1999; Progovac 2015).
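The definition of Merge quoted above (an operation that takes two objects and forms a new object, the set) can be made concrete with a small sketch. This is only an illustration under stated assumptions, not a formalization taken from Berwick and Chomsky: the function names merge and depth, the use of Python frozensets to stand in for unordered sets, and the example words are all hypothetical choices, and labels and features are left out entirely.

```python
from typing import FrozenSet, Union

# Assumption for illustration: a syntactic object is either a lexical item
# (a plain string) or an unordered set produced by an earlier Merge.
SyntacticObject = Union[str, FrozenSet["SyntacticObject"]]


def merge(x: SyntacticObject, y: SyntacticObject) -> SyntacticObject:
    """Take two syntactic objects and form the unordered set {x, y}."""
    return frozenset({x, y})


def depth(obj: SyntacticObject) -> int:
    """Count the layers of Merge needed to build obj (0 for a bare word)."""
    if isinstance(obj, str):
        return 0
    return 1 + max(depth(member) for member in obj)


# Merge can apply to its own output, which is the sense in which it is recursive.
vp = merge("grow", "tomatoes")   # {grow, tomatoes}
clause = merge("Elena", vp)      # {Elena, {grow, tomatoes}}
```

Because the output is an unordered set, merge("grow", "tomatoes") and merge("tomatoes", "grow") are the same object in this sketch; linear order would have to be imposed separately, at externalization.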

So, in that respect, Biolinguistics, which is largely associated with Minimalism,8 is primarily a view of syntax, a linguistic approach, which posits some assumptions regarding how certain theoretical postulates of syntax can be likened to natural laws. If that is all there is to this enterprise, the name Biolinguistics would seem misleading for the following reasons: (i) this approach invokes the laws of physics and mathematics to explain the nature of syntax, much more readily than it invokes the postulates of biology; (ii) it is doubtful, or at least ambivalent, about any role or relevance of natural selection via tinkering and of intermediate stages in the evolution of language/syntax; (iii) initially at least, and consistent with the previous item, this approach assumed genetic identity, rather than individual variability, when it comes to language abilities; and (iv) it assumes discontinuity with other species, including other hominins. What Biolinguists cast doubt on (assumptions (ii-iv)) are exactly the most productive postulates and achievements of biological sciences when it comes to evolution. It is here that, curiously, the Biolinguistics approach still seems to be about biology and linguistics. In fact, it is an approach whose central premise is that biology plays only a minor, secondary role in the evolution of human language. That seems to be the common thread that binds various contributions to this enterprise. As such, this enterprise tends to be rather critical of any dealings with language evolution that invoke biology more than in this minimal, inconsequential way; according to e.g. Berwick and Chomsky (2016), to understand evolution “requires a more subtle mathematical analysis, and so far as we can make out, none of the recent books on the evolution of language seem to have grasped this in full.” Another example is Berwick, Hauser, and Tattersall’s (2013) response to Dediu and Levinson’s (2013) proposal, mentioned previously.

In the rest of this chapter we explore the biological implications (and plausibility) of Minimalism, with a focus on the brain (section 3) and on its genetic underpinnings (section 4). In section 5 we discuss some research programs inspired by the Minimalist enterprise that aim to reconcile it with current approaches to the biology of language. Some conclusions and future prospects can be found in section 6.

7 In her review of Berwick and Chomsky (2016), Progovac (2016b) found this argument circular.

8 To clarify, while researchers who associate with Biolinguistics are typically practitioners of Minimalism, this does not mean at all that all Minimalist syntacticians are practitioners of biolinguistics. In fact, only a minority of them are.

3. The brain

Syntacticians working within Minimalism and previous frameworks leading to it have been quite interested in the neurobiological basis of their theoretical postulates, and there has emerged some noteworthy research on how the brain processes syntax, coming both from the studies of aphasia and from neuroimaging experiments. On the other hand, it has also been suggested that the study of how syntactic structures are represented and processed in the brain has reached an impasse, leading to a state of cross-sterilization, rather than cross-fertilization, between the fields of theoretical linguistics and neuroscience. We consider both sides of the picture here.

Especially fruitful seem to have been studies that rely on the following hierarchy of projections, widely adopted in Minimalism (e.g. Chomsky 1995; Adger 2003), although this type of theoretical postulate certainly predates Minimalism (see e.g. Stowell 1983; Kitagawa 1985; Koopman and Sportiche 1991). According to this rather stable postulate, sentences are analyzed as hierarchical constructs, consisting of several layers of structure composed in a binary fashion, including, but not limited to, the following layers:

(2) CP > TP > vP > SC/VP

Here CP is the top layer, typically associated with the clause type (e.g. whether the sentence expresses a question or a statement), as well as with sentence embedding. TP is a Tense Phrase layer, accommodating the grammatical expression of tense and finiteness. vP is a (higher) Verb Phrase, accommodating transitivity and agency, and VP/SC is the basic Verb Phrase/Small Clause. Syntactic derivation of a transitive sentence, such as Elena will grow tomatoes, proceeds from the most basic, inner layer, the VP grow tomatoes (3a), by adding the vP layer, which accommodates the agent Elena (3b).9 The TP layer, in this case headed by the tense auxiliary will, is then superimposed over the verbal layers (3c). (3c) also shows that the surface subject of the sentence, Elena, moves to the specifier of TP, taking its position before the tense auxiliary will, and leaving a copy of itself in the underlying position (marked here by angle brackets, in place of the cross-out notation).

(3) a. [VP grow tomatoes] →

b. [vP Elena [VP grow tomatoes]] →

c. [TP Elena [T’ will [vP <Elena> [VP grow tomatoes]]]]

In order to form e.g. a wh-question, the CP layer is activated by moving the wh-word/phrase to the specifier of CP, as in e.g. What will Elena grow?

(4) [CP What [C’ will [TP Elena [T’ <will> [vP <Elena> [VP grow <what>]]]]]]

This framework thus relies heavily on syntactic layering, as well as on syntactic movement, called Move, which has recently in Minimalism been subsumed under Merge, and dubbed Internal Merge. For an accessible introduction to the nuts and bolts of the Minimalist analysis of syntactic phenomena, see e.g. Adger (2003) and Carnie (2013).
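The layer-by-layer derivation just described can be traced in a short sketch. It is an illustration only, built on assumptions that are not part of the framework itself: the names Phrase, Copy, external_merge, internal_merge, replace_with_copy and show are hypothetical, category labels are supplied by hand rather than computed, head movement of will is treated like any other Internal Merge, and lower copies are printed in angle brackets.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Copy:
    """An unpronounced lower copy left behind by movement."""
    item: "Node"


@dataclass(frozen=True)
class Phrase:
    """A labeled constituent built from two parts."""
    label: str
    parts: Tuple["Node", ...]


Node = Union[str, Copy, Phrase]


def external_merge(label: str, left: Node, right: Node) -> Phrase:
    """Combine two separate objects into a new labeled constituent."""
    return Phrase(label, (left, right))


def replace_with_copy(node: Node, target: Node) -> Node:
    """Return node with every occurrence of target wrapped as a Copy."""
    if node == target:
        return Copy(node)
    if isinstance(node, Phrase):
        return Phrase(node.label, tuple(replace_with_copy(p, target) for p in node.parts))
    return node


def internal_merge(label: str, target: Node, host: Phrase) -> Phrase:
    """Re-merge target at the edge of host, leaving a copy in its base position."""
    return Phrase(label, (target, replace_with_copy(host, target)))


def show(node: Node) -> str:
    """Render a structure in labeled-bracket notation."""
    if isinstance(node, str):
        return node
    if isinstance(node, Copy):
        return "<" + show(node.item) + ">"
    return "[" + node.label + " " + " ".join(show(p) for p in node.parts) + "]"


# Declarative: VP, then vP, then TP with movement of the subject.
vp = external_merge("VP", "grow", "tomatoes")
little_vp = external_merge("vP", "Elena", vp)
tp = internal_merge("TP", "Elena", external_merge("T'", "will", little_vp))

# Wh-question: same lower layers, then the CP layer on top.
vp_q = external_merge("VP", "grow", "what")
tp_q = internal_merge("TP", "Elena",
                      external_merge("T'", "will", external_merge("vP", "Elena", vp_q)))
cp = internal_merge("CP", "what", internal_merge("C'", "will", tp_q))

print(show(tp))  # [TP Elena [T' will [vP <Elena> [VP grow tomatoes]]]]
print(show(cp))  # [CP what [C' will [TP Elena [T' <will> [vP <Elena> [VP grow <what>]]]]]]
```

The design choice worth noting is that internal_merge reuses an element already inside the structure rather than introducing a new one, which is the intuition behind subsuming Move under Merge.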

There are brain studies and findings that have targeted these layers and constructs directly. For example, it has been proposed that there is a processing cost in manipulating all the syntactic layers, especially the top layers, and that these layers get “pruned” in agrammatic aphasia (e.g. Friedmann 2006; Friedmann and Grodzinsky 1997; see also Kolk 2006). There are also recent studies which targeted specifically the processing of structures lacking vP and TP layers in healthy individuals (Progovac et al. 2018a,b). When it comes to Move, it has been found that sentences which involve certain kinds of Move exhibit increased activation in the left Inferior Frontal Gyrus (IFG), clustering around (and outside) Broca’s area, i.e. Brodmann Areas (BA) 44, 45, 46 and 47 (Ben-Shachar, Palti, and Grodzinsky 2004; Constable et al. 2004; Friederici et al. 2006; Grodzinsky 2010; Grodzinsky and Friederici 2006; Stromswold et al. 1996).10 Likewise, Broca’s aphasics have been reported to have difficulties comprehending structures involving syntactic movement (e.g. Caramazza and Zurif 1976; Grodzinsky 2000; Zurif 1995).11 For the involvement of Broca’s area in syntactic processing, and more precisely of BA 44, the reader is also referred to Friederici (2017), Zaccarella and Friederici (2015), Opitz and Friederici (2007), and Hagoort and Indefrey (2014: 356).

9 In this inner VP layer, the syntactic function of the noun phrase (NP) tomatoes is not yet determined, given that from this same inner layer one can also derive the intransitive sentence Tomatoes will grow, as given below in (i). In other words, the NP inside the inner layer can become either a subject or an object of the sentence (TP), depending on what is merged in the higher layers. This observation will become relevant in section 5.
(i) a. [VP grow tomatoes] → b. [TP Tomatoes [T’ will [VP grow <tomatoes>]]] (with the lower copy of tomatoes in angle brackets, and abstracting away from the recent tendency to project a vP layer even in intransitive sentences)

At the same time, it has become clear that Broca’s area is not the sole center for syntactic processing, but rather only an integral part of a larger circuit that involves subcortical structures, including basal ganglia (e.g. Gibson 1996; Lieberman 2000, 2009; Vargha-Khadem et al. 2005; Ullman 2006; Ardila et al. 2016a,b; see also Lenneberg 1967 for a detailed discussion on the role of subcortical structures for language processing in the brain). The basal ganglia (a collection of subcortical structures, the largest of which is the striatum, encompassing the caudate nucleus and the putamen) are highly interconnected with cortical regions, especially in the frontal lobes, including Broca’s area (e.g. Ardila et al. 2016a; Draganski et al., 2008; Ford et al., 2013; Frey et al., 2008). The involvement of the striatum in syntactic processing was demonstrated across different languages in lesion studies, including in patients with Parkinson’s and Huntington’s disease (Moro et al., 2001; Newman et al., 2010; Teichmann et al., 2005, Teichmann et al., 2008). Early models (e.g. Ullman 2001) suggested a role of the basal ganglia in computing hierarchical structures, as part of the procedural memory. Recent models (e.g. Murphy 2021) support the view that the basal ganglia, but also other subcortical structures, particularly the thalamus and the hippocampus, are involved in the memory buffering of the hierarchically organized syntactic objects, seemingly contributing to the modulation of fronto-temporal activity responsible for the labelling of syntactic structures (e.g. Complementizer Phrase, Tense Phrase, Verb Phrase, and the like) (for the involvement of fronto-temporal areas in syntactic representations, see Meyer et al., 2020). More specifically, Murphy (2021) proposes that whereas the categorization aspect of labeling recruits a network of posterior temporal regions, the maintenance of categories in memory for constructing multi-phrases also demands the involvement of subcortical areas. Additionally, Murphy (2021) proposes that the basal ganglia could be involved as well in the early stages of externalization, specifically, in the sequencing of syntactic information. As for the thalamus, it would also be involved in lexicalizing conceptual representations (Benítez-Burraco and Murphy 2016, Murphy 2021), whereas the hippocampus might contribute to stringing together large clusters of (cortical) linguistic features into coherent lexical items and phrases, as part of the episodic memory (Piai et al. 2016; see also Benítez-Burraco 2021).

10 For further findings correlating an increase in syntactic complexity to an increase in neural activation in certain specific areas of the brain, the reader is referred to Brennan et al. (2012); Caplan (2001); Indefrey et al. (2001); Just et al. (1996); Pallier, Devauchelle, and Dehaene (2011).

11 However, it has also been reported that there is a much more complex and indirect relationship between damage to Broca’s area and syntactic processing deficits (e.g. Mohr et al. 1978; Novick, Trueswell, and Thompson-Schill 2010; Thothathiri, Kimberg, and Schwartz 2012).

The emerging picture is thus that language processing involves a distributed network of interconnected modules in the left hemisphere, with the right hemisphere also being involved (see e.g. Bookheimer 2002; Embick et al. 2000; Friederici, Meyer, and von Cramon 2000; Moro et al. 2001; Brennan et al. 2012). As concluded in Grodzinsky and Friederici (2006), each subpart of the linguistic system, including syntax, is neurologically decomposable into subsystems with a distinct neuro-functional architecture. Overall, when it comes to its processing, syntax proves to be a complex phenomenon that recruits multiple loci in the brain. In section 4, we discuss some evidence for the genetic underpinnings of the Broca’s-basal ganglia connectivity, and for its evolutionary significance.

On the other hand, it has been suggested that the study of how syntactic structures are represented and processed in the brain has reached an impasse, failing to achieve cross-fertilization between the fields of theoretical linguistics and neuroscience (e.g. Poeppel and Embick 2005; also Fedorenko and Kanwisher 2009). The claim is that the difficulty lies in the inherent mismatch in conceptual granularity between the concrete biological units of neuroscience, such as neurons, axons and dendrites, and the abstract postulates of linguistic theory (Poeppel and Embick 2005). In this respect, it has been proposed that the remedy (or the missing piece) may come from the consideration of the evolution of syntax, i.e. the decomposition of syntax into its evolutionary primitives, yielding linguistic constructs more commensurate with the postulates of neuroscience, and specifically with the subtractive method used in neuroscience (Progovac 2010; Progovac et al. 2018a,b; section 5.1; see also in particular section 5.2 for a novel neurobiological approach to language under a Minimalist lens).

4. Genetics

Two lines of research support the implementation of a genetic approach to Minimalism. On the one hand, and this predates the advent of the Minimalist Program, empirical evidence suggests that humans are born with some species-specific ability to acquire a language (e.g., Lenneberg 1967). It is still hotly debated whether this ability is specific to language, or whether it results from the interplay of cognitive devices otherwise involved in functions not related to language, or even whether it is culturally driven, considering that in humans, complex cognitive abilities may be fruits rather than seeds of cultural selection, as suggested by e.g. Heyes (2020). Focusing on syntax, Chomsky has repeatedly hypothesized, as noted, that one single mutation might have brought about Merge via a slight re-wiring of the brain (see Berwick and Chomsky 2016 for a particular proposal on such a re-wiring of the brain; see also section 5.3 below). At the very least, the putative gene bearing this mutation deserves to be found. On the other hand, many cognitive disorders exist that impact on our language abilities, and most notably, on our syntactic abilities: the problems that patients experience with syntax can be formalized using a minimalist approach (e.g. Wexler, 1999; Penke, 2015). Because many of these disorders impacting on language have a genetic basis (see Koomar and Michaelson, 2020 for a recent discussion), it can be expected that the non-mutated variants of the affected genes play some role in the development and interconnection of the brain circuits enabling syntactic processing, and potentially, accounting for our Merge ability, as discussed in the previous section. Although, as also noted, a direct translation of specific components of Minimalism (e.g. Merge) to specific brain areas is very problematic, putative candidates accounting for syntactic deficits in these clinical conditions are expected to provide some hints about how the neurotypical brain develops and unfolds in order to provide us with the ability for Merge. Moreover, documenting the changes accumulated in these genes should enable one to address two important concerns of any biological approach to language, generally, and of Minimalism, specifically. Analyzing such changes, if any, that occurred during our evolutionary history should enable one to achieve a more exact view of how language evolved, and particularly, of whether a Merge-ready brain emerged from one single mutation during human evolution. Likewise, examining such changes, if any, found in present-day human populations should enable one to assess a second key assumption of the minimalist approach to language, namely, the purportedly homogeneous nature of the human faculty for language, including our merging abilities. It is well beyond the scope of this chapter to provide even a rough summary of the past and ongoing research about the genetic underpinnings of human language, either from an ontogenetic perspective, or a phylogenetic perspective, or both (evo-devo) (but see Benítez-Burraco, 2013; Graham and Fisher, 2015; Deriziotis and Fisher, 2017; Den Hoed and Fisher, 2020 among many others). Instead, we will use FOXP2, the renowned “language gene,” to discuss the two important questions formulated above. These two aspects are intimately interrelated, and the case of FOXP2 illustrates the benefits (and the limitations) of a genetic approach to human syntax under a Minimalist view, and more generally, of the genetic analysis of human language.

Regarding the first problem we wish to address, namely, how our Merge-ready brain emerged, it is worth highlighting that the connectivity of the Broca’s area–basal ganglia network, as discussed in section 3, seems to have been strengthened relatively recently in evolution. FOXP2 and other genes seem to have played a crucial role in these changes.12 For example, it has been shown that humanized FOXP2 alleles inserted into the Foxp2 gene of mice affect their basal ganglia, increasing dendrite lengths and synaptic plasticity of the medium spiny neurons in the striatum (Enard et al., 2009), ultimately improving learning abilities by favoring the transition from a declarative to a procedural working memory (Schreiweis et al. 2014). Furthermore, according to Enard et al. (2002), there is evidence for positive selection of changes in FOXP2 in the line of descent of humans. Based on this and other comparable research (e.g. Vernes et al. 2007; Fisher 2017, 37), it has been proposed that specific mutations in FOXP2 (and a handful of other related genes) led to increased synaptic plasticity and denser neuronal connectivity of the human brain, better connecting the fronto-striatal network (Dediu 2015; Hillert 2014). Moreover, it has been reported that the processing of hierarchical syntax places higher demands on the brain capacity, i.e. on the established language networks, such as the Broca’s area–basal ganglia network, and on working memory. For example, Makuuchi and Friederici (2013) report the results of fMRI experiments during sentence reading which found not only a processing hierarchy from (i) the visual system (specifically, the fusiform gyrus, where the visual word form area is located) to (ii) working memory (specifically, the inferior frontal sulcus and the intraparietal sulcus) to (iii) core language systems (specifically, the pars opercularis and the middle temporal gyrus), but also a clear increase of connectivity between working memory regions and language regions as the processing load increases for syntactically complex sentences. Wynn and Coolidge (2004) have proposed that working memory may have been enhanced in modern humans, compared to Neanderthals, potentially contributing to the capacity for innovation and experimentation. Interestingly, it has been suggested that there is also a relation between the regulation of the expression of the FOXP2 gene and procedural or working memory (Bosman et al. 2004, Ullman and Pierpont 2005).

12 The initial report on the FOXP2 mutation as being human specific (Enard et al. 2002) was considered as an argument for the saltationist view of language evolution (see section 2), i.e. for the view that language, or at least syntax, emerged suddenly and recently, only in humans, as a result of one mutation. However, subsequent findings indicate that Neanderthals also had a derived variant of FOXP2 (Krause et al. 2007), which prompted a revision of the initial conclusions, as discussed in the paper by Piattelli-Palmarini and Uriagereka (2011) on the FOXP2 gene, titled “A geneticist’s dream, [but] a linguist’s nightmare.” This shift is also reflected in DiSciullo et al.’s (2010) statement that the foundation for language is clearly polygenic.

As with most of the genes currently related to language, this “FOXP2 window to language” was made possible by the study of language disorders. Specifically, the involvement of FOXP2 in language was established by the discovery that the KE family’s congenital language impairment was caused by a mutation in the FOXP2 gene (Fisher et al. 1998; Lai et al. 2001). In an fMRI study, Liégeois et al. (2003) found not only that the affected KE members exhibited under-activation in Broca’s area and its right homologue, but also that both the caudate nucleus and the putamen (structures of the basal ganglia) were sites of morphological abnormality in the affected individuals. Later, other genes potentially relevant for language processing and evolution were found to interact with FOXP2. One of them is CNTNAP2, which is downregulated by FOXP2 and which has been linked to language impairment in children (Vernes et al. 2008) (more on this gene below). Over the years, many other genes with an impact on our language abilities have been found to show signals of positive selection in the human lineage (e.g. Diller and Cann 2012; Fitch 2010, 291; see Dediu and Ladd 2007 specifically for ASPM and MCPH1). Overall, some of these specific genetic mutations are expected to have contributed to the enhanced capability of cortico-basal ganglia circuits regulating critical aspects of language, cognition, and motor control (Lieberman 2009). More generally, consideration of the molecular etiology of other cognitive disorders entailing language deficits, like autism spectrum disorders or schizophrenia, has served as an excellent source of candidate genes for language and an excellent source for further hypotheses (and testing grounds) about the biological basis for language, also from a Minimalist perspective (see e.g. Benítez-Burraco and Murphy 2016 on autism; or Murphy and Benítez-Burraco 2017 on schizophrenia; see section 5.2 below). Overall, the main lesson of the genetic approach to language is that human language seems to be polygenic, involving “many genes with small effects,” in the words of Dediu and Ladd (2007: 10944). Moreover, as most differences between humans and other hominins do not pertain to proteins themselves, but to how, when, and in what amounts they are synthesized (Reilly and Noolan, 2016; Franchini and Pollard, 2017), the evolutionary emergence of language can be expected to have resulted mostly from subtle changes in gene regulation patterns, reinforcing the view of a deep evolutionary continuity of the cognitive and biomechanical machinery supporting language (see section 5.3 below on putative precursors of Merge). Ultimately, some of these genetic changes are expected to have facilitated cultural processes that also played a crucial role in language evolution (see section 5.1 below with regard to Minimalism). Benítez-Burraco and Dediu (forthcoming) provide a recent, updated review of some of the key candidates for all these changes.

Ongoing research on human genetic diversity, particularly after the advent of next-generation genomic facilities, has made it possible to address the question of whether humans are genetically identical or not when it comes to language abilities. Syntacticians pursuing Minimalism, and generative grammar more generally, had for a long time assumed or claimed that there is no individual variability in one’s language abilities (with the exception of cognitive disorders), either phenotypically or genetically. Most recently this view was expressed in Berwick and Chomsky’s (2016, p. 54) claim that any infant from a Stone Age tribe in the Amazon, “if brought to [today’s] Boston, [would] be indistinguishable in linguistic and other cognitive functions” from the rest of the children. This claim would suggest not only that all humans are identical in their genetic basis for language today, but also that humans have always had this same genetic basis, even in prehistory. This would be consistent with the saltationist, one-gene approach to the evolution of language, as outlined in section 2. However, this belief in genetic identity across individuals and times has begun to erode, just as has the one-gene view of language evolution, with even Berwick and Chomsky (2016, p. 177) acknowledging that there may indeed exist “some language variation in normal human populations that is being uncovered by genome sequencing.” Similarly, DiSciullo et al. (2010, p. 13) note that “within the past five years, this picture [of linguistic and genetic uniformity among humans] has completely changed; … it is now well established that genes affect speech and language in individuals and there are now many demonstrable associations between inter-individual differences in genetic makeup and inter-individual differences in speech and language abilities.” CNTNAP2, a gene regulated by the protein FOXP2, is one notable example. Thus, Kos et al. (2012) found that whereas some SNP variants in human populations do impair language processing, there are others that impact on language processing by otherwise healthy adults, accounting in part for normal variability in (Whitehouse et al. 2011), and for specific patterns of neural activation in healthy adults (Whalley et al. 2011; Kos et al. 2012).

In brief, three of the most important biological considerations when it comes to language, i.e. how it is represented in the brain, how it varies across individuals, and how it evolved, have begun to be addressed empirically, and new answers are beginning to emerge, changing some long-held views and assumptions, but also raising further questions and hypotheses for testing. These kinds of correlations that can be discovered between genes and certain language (in)abilities, as well as between brain activation and certain language constructions, qualify for what Boeckx and Grohmann (2007) call “biolinguistics in the strong sense,” that is, a true interdisciplinary cross-fertilization. This contrasts with biolinguistics research in the weak sense, which, as we discussed in section 2, does not necessarily address biology. In the next section, we briefly discuss three recent approaches to Minimalism under this strong biolinguistic perspective.

5. Subsequent views/developments

While the Minimalist Program may have a methodological advantage in approaching the logical problems of language implementation in the brain and language evolution, it has its own drawbacks when evaluated in a broader perspective. Among others, it is heavily oriented towards a modular view of the brain, especially apparent in the famous model of the faculty of language advanced by Hauser et al. (2002), which is also related to the bias towards locationist approaches to the neurobiological characterizations of language, which seem to have reached, as noted in section 3, an impasse. Additionally, the research agenda associated with the Minimalist Program is heavily biased towards a saltational and discontinuous model of language evolution, which is especially apparent in language-specific components: syntax and the lexicon. While it is often suggested by the practitioners of Minimalism that Merge-based hierarchical syntax appeared suddenly only once in the human lineage in a very short evolutionary time, and that other species lack human-like lexical atoms (concepts), it is worth pointing out that evolution is very often a matter of old traits recruited or recycled for new functions. In this section we mention some approaches which use the tools of Minimalism to make a case for a gradualist and continuous model of language evolution.

5.1. Gradualist view within Minimalism

Within Minimalism, this core view (of syntax being undecomposable, and evolving as one single event) has been challenged by Progovac’s work (e.g. 2008, 2010, 2015, 2016a, 2019). She has proposed that certain postulates of Minimalism (and predecessors), in particular the clausal hierarchy introduced in section 3 (and reproduced below), provide an especially useful, precise tool for decomposing syntax into its evolutionary primitives. Progovac reconstructed the bottom layer of this hierarchy, i.e. the (flat) small clause layer, as the approximation of the earliest grammar, and the foundation (common denominator) upon which various types of hierarchical grammars are built. This earliest stage of syntax would have been intransitive and absolutive-like, unable to distinguish subjects from objects grammatically, relying on the combinatorial operation of Proto-Merge, only capable of generating (non-recursive) two-slot grammars (see section 5.3 for more discussion on Merge and Proto-Merge).

(5) CP > TP > vP > SC/VP

The idea is that languages (i.e. their speakers) make use of this foundational layer to gradually “tinker” different solutions to the same problem, yielding significant cross-linguistic variation in how subsequent layers are built, including how transitivity is expressed: by ergative-absolutive means; by accusative-nominative means; by duplicating the small clause structure (serial verb means); etc. In this sense, Progovac’s approach promotes an evolutionary scenario in which cultural innovations in language directly interacted with biological forces, via the mechanism of natural or sexual selection, specifically for the benefits of using language, giving language an active, causal role in its evolution, as well as in the evolution of the brain (see Pinker and Bloom 1990; also section 2).

This gradualist, incremental approach to the evolution of syntax is well-positioned to identify continuity with other species, most notably through its identification of the simplest, earliest stages of grammar, which seem accessible to other species as well. In this respect, Progovac (2017) has proposed that various two-slot combinations that other species are capable of, such as compound-like creations by e.g. the bonobo Kanzi (hide-peanut) and the chimpanzee Washoe (water-bird for a duck) constitute the point of continuity with such constructs still found in various guises in human languages as well, as illustrated below with linguistic “fossils.” In this sense, Progovac looks for continuity in the most basic constructions of human language, rather than in the most advanced. In order to identify what these most basic constructions are, she claims that one needs to rely on an explicit linguistic theory, such as Minimalism (but not necessarily Minimalism), and to have an explicit, precise theory of language evolution, which can be subjected to empirical testing and verification (Progovac 2019).

Of particular interest to Progovac’s approach is the identification of proxies/approximations/“living fossils” of this proto-grammar, including the so-called “exocentric” (headless) verb-noun compounds (e.g. kill-joy, tattle-tale, scatter-brain, turn-coat, turn-table, stink-bug) and the so-called small clauses (e.g. Me first; Case closed; Problem solved; Fingers crossed), both prevalent to different degrees across languages (Progovac 2015). The identification of these proxies allows Progovac to test her hypothesis through neuroimaging experiments, as well as to cross-fertilize it with biological theories, such as the Self-Domestication Hypothesis (SDH). Regarding neuroimaging, Progovac et al.’s (2018a,b) fMRI experiments found that, in contrast to their more hierarchical counterparts, the processing of small clause approximations of proto-syntax invokes significantly less activation in the Broca’s area–basal ganglia network, the network whose connectivity has been bolstered in recent evolution (see sections 3 and 4). Additionally, they found that the processing of the “fossil” verb-noun compounds (the kill-joy type), in contrast to the hierarchical -er counterparts (i.e. joy-kill-er type), involves a robust effect in the fusiform gyrus area (BA 37), the area where visual processing and certain non-compositional semantic processing (e.g. concreteness, metaphor) come together. They ascribed this finding in part to the visceral, raw effect of these creations, which also often happen to be derogatory and humorous. In fact, when referring to humans, these compositions specialize for insult, i.e. verbal aggression (Progovac and Locke 2009; Progovac 2015), which led Progovac and Benítez-Burraco (2019) to propose that self-domestication (SD) in early human evolution not only favored the emergence of a less aggressive phenotype, but more precisely a phenotype prone to replace (reactive) physical aggression with (reactive) verbal aggression. In this view, early SD processes were engaged in an intense feedback loop with early language, contributing to the solidification of these early grammars, with a high adaptive value for survival.

5.2. Neurobiological continuity under Minimalism

As discussed in section 3, most neurobiological approaches to language with a Minimalist persuasion have tried to map operations, like Merge, Labeling, or Search, and representations of lexical and categorial features, like nouns or adjectives, onto the brain. As noted, these attempts have mostly resulted in controversial findings and some sort of an impasse, mostly because the precise location of operations and representations is far from clear, but particularly because it is doubtful that the identified regions are solely devoted to language processing. In a series of related papers and books, Benítez-Burraco and Murphy (Benítez-Burraco and Murphy, 2016; 2019; Murphy and Benítez-Burraco, 2016; 2018a; 2018b; Jiménez-Bravo et al., 2017; Murphy, 2018; Murphy, 2021) have tried to improve this Minimalist approach to language in the brain by focusing on computational primitives of language function as postulated by Poeppel and Embick (2005). Specifically, they have relied on brain oscillations, that is, repetitive patterns of neural activity that are hypothesized to synchronize neural function at different scales across the brain in order to increase information transfer. A growing body of research suggests that specific computational and representational properties of the brain can be attributed to these oscillations (e.g. Poeppel, 2012; Ding et al. 2016; Meyer, 2017). As with most approaches to the biology of language, they started from the examination of language deficits in highly prevalent clinical conditions entailing problems with language, particularly, autism and schizophrenia, which entail language impairment mostly at the syntax-semantics interface, and specific language impairment and developmental dyslexia, which entail language impairment mostly at the syntax-phonology interface. These two interfaces are important aspects of the faculty of language according to Minimalist models of language (Hauser et al. 2002; Berwick et al. 2013).

In these authors’ model, the combinatorial power of Merge and the cyclic power of Labeling are implemented via “cross-frequency” (i.e., between distinct frequencies) coupling, specifically via δ-θ inter-regional phase-amplitude coupling, which gives rise to multiple sets of linguistic syntactic and semantic features.13 Later, β and γ sources are hypothesized to be coupled with θ oscillations for achieving syntactic predictions and conceptual bindings, respectively. Finally, α sources, which are also involved in the early stages of binding, are hypothesized to synchronize distant cross-cortical γ sites required for the θ-γ phase-amplitude coupling of working memory, allowing for the reapplication of Merge to its own output, and ultimately, accounting for recursive hierarchical phrase structures, the core distinctive feature of human language (see Figure for a schematic view; see Murphy and Benítez-Burraco, 2018a, Benítez-Burraco and Murphy, 2019, and Murphy 2021 for details).

Figure. Putative neural codes for language under an oscillatory view, representing the different cross-frequency couplings proposed to implement hierarchical phrase structures (above), and a map of the brain locations for the hypothesized codes (below). Adapted from Benítez-Burraco and Murphy, 2019; Figures 1 and 2.

13 The relevant representative neural oscillations/brain rhythms are classified by frequency as follows: delta (δ: ~0.5–4Hz), theta (θ: ~4–8Hz), alpha (α: ~8–12Hz), beta (β: ~12–30Hz), and gamma (γ: ~30–150Hz) (see Murphy 2021 and references cited therein).

Importantly, the involved oscillations are not unique to language, as they contribute to other cognitive abilities as well, being involved in domain-general processes, such as working memory, attention, and the like. Thus, it can be expected that the specificity of language results from a conspiracy of a range of domain-general oscillatory processes and that the disruption of this particular entrainment leads to the sort of language deficits found in disorders. At the same time, these oscillations are highly heritable features (van Beijsterveldt et al. 1996; Linkenkaer-Hansen et al. 2007, Müller et al. 2017), including those involved in language processing (Araki et al. 2016), although they stem from genes controlling the brain’s neurochemistry (Begleiter and Porjesz, 2006). Accordingly, these genes cannot be regarded as specific to language, either. But again, it is possible to identify a set of genes controlling the phasal and cross-frequency couplings described above, with most of these genes being involved in neurotransmitter function (see Murphy and Benítez-Burraco, 2018a for a detailed characterization). This ultimately means that the consideration of brain oscillations could enable one to formulate more robust bridging theories between genes (including genes affected in disorders impacting on language) and language features (including the problems with language experienced by patients), at least more robust than the current genes-neuroanatomy-language links, because, in essence, oscillations are closer to gene function and are also more primitive than standard cognitive labels (see Murphy and Benítez-Burraco 2018a, 2019 for a detailed discussion, as well as for a number of linking hypotheses between genes, particular oscillations, and specific aspects of language processing in a coherent Minimalist framework).

Importantly too, because these oscillations are shared across many species (Buzsáki et al., 2013; Murphy and Benítez-Burraco, 2018b), an evolutionary continuity can be expected between language and the perceptual and cognitive abilities of animals, particularly, of primates and extinct hominins. Accordingly, the basic computational processes of language, like merging, searching, maintaining, etc., can be hypothesized to have resulted from small tweaks to the phasal and coupling properties of neural oscillations found in mammals, yielding modifications to their scope and form that would be human-specific.

This approach is expected to yield more precise hypotheses about the evolution of (specific operations and representations of) language. In a recent paper, Murphy and Benítez-Burraco (2018a) relied on our knowledge of the evolutionary changes in the sequences of genes important for language-related neural oscillations to infer some properties of the Neanderthal brain oscillatory activity, and ultimately, to postulate some differences between the two species in cognitive functions important for language. According to Murphy (2018), the human-specific pattern of brain oscillatory behavior might have arisen via specific changes in brain morphology (‘globularity’), as our more globular braincase resulted in a reduction of the ‘spatial inequalities’ (Salami et al., 2003) between cortical and subcortical regions.

Several conclusions of relevance for achieving a truly biolinguistic approach to Minimalism can be drawn from this research program. As summarized in Benítez-Burraco and Murphy (2018a:1), (i) “computational operations of language can be decomposed into generic processes”; (ii) “these generic processes interact in dynamic ways and can be implemented via neural oscillations”; (iii) “these oscillations implement a multiplexing algorithm for the combination and interpretation of linguistic representations”; (iv) this implementation likely occurs via “migrating oscillations”, that is, oscillations that cycle cross-cortically and that form spatially coherent waves that move across the cortex in order to coordinate neural activity at large scale (Zhang et al., 2018); and (v) “this multiplexing algorithm appears to be species-specific”.

5.3. Evolutionary continuity under Minimalism

Considering that virtually everything is continuous in evolution albeit “qualitatively discontinuous” (Lenneberg, 1967; see also footnote 14 below), it is possible that Merge had some evolutionary precursor, a pre-existing capacity in humans and other species which was later co-opted for syntactic computation, rather than Merge having simply “emerged suddenly as the result of a minor mutation” (Berwick and Chomsky, 2016: 70).14 Berwick and Chomsky refer to “a slight rewiring of the brain” and “a slight extension of existing cortical wiring” (p. 79). In the two previous sections, we have provided reasons in favor of a deeper continuity between language (and specifically, Merge) and animal cognition/communication. In the previous section, we have shown that this rewiring of the brain might have consisted in a change in the oscillatory entrainment of the brain. In this section we will explore two views (Fujita 2009, 2017, and Hoshi 2018, 2019) that argue that this slight rewiring cannot be responsible for the saltational emergence of Merge, but is more likely involved in deriving Merge from pre-existing functions. This is in line with the Darwinian evolutionary continuity (of the biological underpinnings of language) defended in the two previous sections.

14 The term “continuity” here refers to the Darwinian continuous natural phylogenetic history of species, accompanied by descent with modification. Although Lenneberg (1967) uses the term “discontinuity,” it was to emphasize the qualitative difference between human language and animal communication systems. The continuity hypothesis as discussed in this section is fully compatible with Lenneberg’s view (see Trettenbrein, 2017).

Firstly, Fujita (2009, 2017) proposes that Merge evolved from motor action planning, as observed in human and non-human tool use and tool making.15 He borrows from cognitive archaeology the insight that tools and language have strong evolutionary links, and in particular applies Greenfield’s (1991) Grammar of Action to differentiate distinct ways of applying Merge. Greenfield identified three methods of Action Grammar, i.e., Pairing strategy, Pot strategy and Subassembly strategy, in the order of increasing complexity. Just as the Subassembly strategy is almost uniquely human, Merge in the Subassembly fashion (Sub-Merge) is crucial in forming uniquely human hierarchical syntax. According to Fujita (2009, 2017), the evolution of Sub-Merge is the key event in language evolution. The three strategies of Action Grammar identified by Greenfield seem to be paralleled by linguistic Merge in the following way:

(6) a. Pairing Strategy (Proto-Merge)
       i. Merge (tea, cup) → {tea, cup}
       ii. Merge (green, tea) → {green, tea}
    b. Pot Strategy (Pot-Merge)
       following (6ai), Merge (green, {tea, cup}) → {green, {tea, cup}}
    c. Subassembly Strategy (Sub-Merge)
       following (6aii), Merge ({green, tea}, cup) → {{green, tea}, cup}

15 See also the discussion in Chapter 52. The idea is based on the observation that a Merge-like hierarchical and recursive combinatorial operation is shared by other species in motor domains, including insect navigation by path integration, but most notably animal tool use, such as nut cracking using an anvil stone and a hammer stone (plus, occasionally, a wedge stone) by chimpanzees. In these cases, the animal manipulates and combines concrete objects into a hierarchical structure, parallel to the way Merge combines syntactic objects into hierarchical linguistic structure.

The first, simplest operation is just a non-recursive combination of two objects, and it corresponds to Progovac’s (2015) Proto-Merge (see section 5.1). The difference between the second, Pot-Merge, and the third, most complex Sub-Merge explains the structural ambiguity of green tea cup, “a tea cup which is green” vs. “a cup for green tea.” In contrast to Pot-Merge, Sub-Merge requires a kind of “multiple attention” to the targets of operation, consequently yielding different patterns of labeling. In (6b), the target of operation is always “cup,” which is reflected by the fact that tea cup and green tea cup are both types of cup (“cup” consistently provides the label). In (6c), by contrast, the targets of operation are “tea” and “cup”; green tea is a type of tea (“tea” is the label because it is the target) but green tea cup is a type of cup (“cup” is the new label). The ability to utilize multiple attention in this sense may be linked to human self-domestication (SD). Unlike animals in the wild which need to react promptly to outside stimuli (predators, food, etc.) for survival, domesticated animals, free from those worries, can have the luxury of delaying their response and directing their attention to communicative choices. If so, then this would be a further illustration of how SD contributed to the evolution of human language, in this case especially of complex hierarchical syntax, in addition to Progovac and Benítez-Burraco’s (2019) conjecture in section 5.1 regarding early syntax (in this respect, see also Benítez-Burraco and Progovac 2020). In sum, Fujita’s proposal is that linguistic Merge evolved gradually in this order, and that human language emerged when Sub-Merge became available, essentially arguing for a sensorimotor (SM) origin of Merge.
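The contrast in (6) can be made concrete with a small sketch. It is only an illustration, not part of Fujita’s proposal: it assumes that Merge can be modeled as the formation of an unordered two-member set, and the names merge and depth are hypothetical helpers introduced here.

```python
def merge(x, y):
    """Merge(X, Y) = {X, Y}: form an unordered two-member set (illustrative model)."""
    return frozenset({x, y})

# (6a) Pairing strategy (Proto-Merge): a single combination of two atoms.
tea_cup = merge("tea", "cup")        # {tea, cup}
green_tea = merge("green", "tea")    # {green, tea}

# (6b) Pot strategy (Pot-Merge): add an atom to an already formed set.
pot = merge("green", tea_cup)        # {green, {tea, cup}}

# (6c) Subassembly strategy (Sub-Merge): build a subunit first, then combine it.
sub = merge(green_tea, "cup")        # {{green, tea}, cup}

def depth(obj):
    """Nesting depth: 0 for an atom, 1 + the deepest member for a set."""
    if isinstance(obj, frozenset):
        return 1 + max(depth(member) for member in obj)
    return 0
```

Under these assumptions the two outputs have the same depth but different constituents: pot contains the subunit {tea, cup}, whereas sub contains {green, tea}, which mirrors the two readings of green tea cup. The sketch does not model labeling.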

Secondly, Hoshi (2018, 2019) puts forth a hypothesis for a conceptual-intentional (CI) origin of Merge, claiming that Merge, as a label-free recursive set-formation operation, was exapted from the recursive set-formation sub-component of the cognitive process of categorization, which itself can be characterized as a composite cognitive process of labeling plus set-formation, yielding labeled category sets. Building on Lenneberg’s (1967) insight, he proposes that the precursor for Merge was the interrelational mode of categorization which relates more than one object to form a category or more than one category to form a super-category.

In general, the cognitive operation of categorization can be regarded as an unordered set-formation recursive function (call it Cat) on a par with Merge. Merge, defined as Merge (X, Y) = {X, Y}, is a set-formation recursive function that takes as input either conceptual atoms or syntactic objects (SOs) already generated by the function and produces an unordered set as output. Likewise, the cognitive process of Cat under the label κ (Catκ) is an n-ary set-formation recursive function that takes cognitive representations of any elements to the extent that they “satisfy” the label κ as its relevant concept for determining the category membership (see Hoshi 2018, 2019 for more details). In principle, Cat can take as input not only cognitive representations related to either objects in the world or abstract constructs in the mind (= “low-order” Cat) but also cognitive representations related to category sets already “generated” by Cat (= “higher-order” Cat), depending upon the availability of appropriate labels for super-category sets. Notice that the crucial difference between Merge and Cat is that the former is label-free while the latter is inherently labeled. As such, if Merge evolved from Cat, the labeling sub-component of Cat must have been separated from it, leaving the label-free unordered set-formation recursive function sub-component of it to be co-opted/exapted for Merge in the biological evolution of human language.16 Note that Cat per se has been preserved and its power has been enhanced due to feedback of Merge as input to the category labeling in our species. Hoshi also points out that, unlike Merge, which has both external and internal versions (External Merge (EM) and Internal Merge (IM)) thanks to its label-free nature, Cat only permits its external version, where objects/categories to be interrelated under a category/super-category are independent of one another. The absence of an internal version of Cat is attributed to its inherently obligatory category labeling as part of the cognitive operation of categorization, unlike the label-free Merge (see Hoshi, 2018, 2019 for details on this important point).
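The difference between label-free Merge and inherently labeled Cat can be sketched as follows. This is a rough illustration under stated assumptions, not Hoshi’s formalization: the class Category, the functions cat and strip_label, and the membership tests passed as satisfies are all hypothetical names introduced here.

```python
from dataclasses import dataclass

def merge(x, y):
    """Label-free binary set formation: Merge(X, Y) = {X, Y}."""
    return frozenset({x, y})

@dataclass(frozen=True)
class Category:
    """A labeled category set: a label (kappa) plus an unordered set of members."""
    label: str
    members: frozenset

def cat(label, satisfies, *items):
    """n-ary labeled set formation: keep the items that satisfy the label's concept."""
    return Category(label, frozenset(item for item in items if satisfies(item)))

# Low-order Cat: the inputs are representations of objects.
edible = cat("edible", lambda x: x in {"apple", "nut"}, "apple", "nut", "stone")
potable = cat("potable", lambda x: x == "water", "water", "sand")

# Higher-order Cat: the inputs are category sets already formed by Cat.
consumable = cat("consumable", lambda c: isinstance(c, Category), edible, potable)

def strip_label(category):
    """Drop the label, leaving only the unordered set (the part exapted for Merge)."""
    return category.members
```

On these assumptions, strip_label(edible) is the same object that merge("apple", "nut") produces, which is one way to picture the claim that Merge corresponds to Cat minus its labeling sub-component.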

The idea of Merge evolving from the cognitive operation of categorization Cat in the domain of the CI system is based on the fact that (at least) all the vertebrates, including our species, categorize experience in the environment (Lenneberg 1967, Bickerton 1990, Hurford 2007, Tallerman 2009, inter alia). For instance, animals must differentiate between categories such as “edible/inedible, potable/non-potable, harmful/harmless, ally/enemy, predator/prey, light/dark, solid/liquid, etc., using the perceptual and motoric properties of objects around them” (Tallerman 2009: 188). 16 Hoshi (2019) speculates that the binarity of Merge derived from the imposition of the bipartite “function- argument” structure of the characteristic function of the category label κ in Cat (see Hoshi 2019 for details). For the notions of functions and arguments in representations and computations in the brain that are relevant in this context, see, e.g., Gallistel and King (2010). 24 While Fujita’s SM-origin of Merge and Hoshi’s CI origin of Merge may strike the reader as being incompatible at first blush, they are in fact complementary to each other and ideally should be synthesized, as per the following. The two hypotheses explored above share the view that label-free recursive set-forming nature of Merge was already present in its precursors, so that no drastic change took place in the emergence of Merge as far as operational procedures go.17 From the perspective of this synthetic hypothesis, it can be surmised that some biological change took place in the neural basis of the basal ganglia and its related network, presumably including Broca’s area, in bringing about the emergence of Merge, probably via some changes in the pattern of neural oscillations (see also Okanoya, 2017, for evidence that a certain neural change in the basal ganglia led to complexity of courtship songs in Bengalise finches during the course of domestication). 
While it has been widely assumed that Broca’s area is involved in the processing of syntax in human language (see, e.g., Friederici 2017), Fitch and Martins (2014) discuss in detail that Broca’s area is rather involved in hierarchical processing not only in language but also in music and action (Boeckx et al. 2014 claim that the area is responsible for the linearization of the outputs of Merge). Furthermore, as also discussed in section 3, syntactic hierarchical structure building is facilitated by the basal ganglia (Balari and Lorenzo 2013, Balari et al. 2013, Teichmann et al. 2015). Given these considerations, it seems more fruitful to investigate the neural connection and interaction between Broca’s area and the basal ganglia in seeking the origin(s) of Merge. In this connection, it is noteworthy that interaction between the pre-frontal cortex and the basal ganglia is crucial for categorization (Seger 2008, Seger and Miller 2010, Villagrasa et al. 2018) and that the basal ganglia also participate in the network for motor control, including , and cognitive processes, including categorization and syntax (Lieberman 2000, 2002, 2006, 2007). As extensively discussed in Lieberman (2006) and references cited therein, lesions/pathology of the basal ganglia result in dysfunction of motor control, categorization, and comprehension of complex syntax, as observed in Broca’s syndrome and Parkinson’s disease. This fact also strongly suggests a close interrelation among Merge, motor control, and categorization, as implied by the motor control-categorization integrative hypothesis for the origin of Merge in syntax in our species.

17 In fact, Greenfield’s (1991) Pot strategy and Subassembly strategy, Fujita’s (2009, 2017) Pot-Merge and Sub-Merge, and Hoshi’s (2018, 2019) low-order Cat and higher-order Cat bear a striking resemblance to one another, respectively, in terms of their nature, which is highly likely to be more than a mere accident. In the Subassembly strategy, Sub-Merge, and higher-order Cat, one needs to form sub-units/sub-categories before constructing a larger unit/category that contains them, which would require multiple attention to the sub-units/categories in an expanded working memory space in our species, due to human self-domestication (SD), as discussed for Sub-Merge in the text.
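The contrast in footnote 17 between building a structure one item at a time and first forming sub-units that are then combined can be illustrated with a small sketch. This is only an illustrative toy, not the formalization given by Greenfield, Fujita, or Hoshi: the function names `merge`, `pot_merge`, and `sub_merge` are hypothetical, and the use of Python frozensets to stand in for label-free set formation is an assumption made here for concreteness.

```python
from functools import reduce


def merge(alpha, beta):
    """Toy binary, label-free set formation: returns the unordered set {alpha, beta}."""
    return frozenset({alpha, beta})


def pot_merge(items):
    """Pot-style assembly (assumed reading): add one item at a time to a single growing unit.

    Requires at least two items.
    """
    first, second, *rest = items
    return reduce(lambda unit, item: merge(item, unit), rest, merge(first, second))


def sub_merge(left_items, right_items):
    """Subassembly-style assembly (assumed reading): build two sub-units first, then combine them."""
    return merge(pot_merge(left_items), pot_merge(right_items))


# One growing unit: {c, {a, b}}
pot_result = pot_merge(["a", "b", "c"])

# Two separately built sub-units combined: {{a, b}, {c, d}}
sub_result = sub_merge(["a", "b"], ["c", "d"])
```

In the second case both sub-units must be held available before the final combination, which is the property the footnote links to an expanded working memory space.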

6. Future prospects

In conclusion, we contend that any approach to language evolution should be founded on a linguistic theory, whether Minimalism or some other, rather than on impressionistic characterizations of what language is and how it evolved. At the same time, we believe that the future of this field lies in moving away from those postulates of Minimalism that reinforce the strong bias towards a saltationist and discontinuous model of language evolution, which does not allow much room for exploring alternative models, or for testing and falsification. Furthermore, in our view, a true cross-fertilization between linguistics and biology can only work if it goes both ways, where the biological methods need to be adjusted to the constraints and limits of linguistics, but where linguistic theories also need to learn from, and adjust to, the limits and constraints of biology, as Lenneberg (1967) originally intended. In sum, we contend that the future of the field that studies the biology of language lies (i) in the reliance on both linguistic theory and biological theory; (ii) in cross-fertilizing the two in such a way that each learns from the other, and adjusts to the other; and (iii) in moving away from the saltationist, discontinuous bias in Minimalism, and opening up to exploring gradualist, continuous approaches, those that can be subjected to empirical and experimental scrutiny. In this respect, three of the most important biological considerations when it comes to language, i.e. how it is represented in the brain, how it varies across individuals, and how it evolved, have begun to be addressed empirically, and new answers are beginning to emerge, changing some long-held views and assumptions, but also raising further questions and hypotheses for testing.

References

Adger, D. 2003. Core Syntax: A Minimalist Approach. Oxford: Oxford University Press.

Araki, T., Hirata, M., Yanagisawa, T., Sugata, H., Onishi, M., Watanabe, Y., Ogata, S., Honda, C., Hayakawa, K., Yorifuji, S., Osaka Twin Research Group, 2016. Language-related cerebral oscillatory changes are influenced equally by genetic and environmental factors. Neuroimage 142, 241-247.

Ardila, A., Bernal, B., and Rosselli, M. (2016a). How localized are language brain areas? A review of Brodmann areas involvement in oral language. Arch. Clin. Neuropsychol. 31, 112–122. doi: 10.1093/arclin/acv081

Ardila, A., Bernal, B., and Rosselli, M. (2016b). Why Broca’s area damage does not result in classical Broca’s aphasia? Front. Hum. Neurosci. 10:249. doi: 10.3389/fnhum.2016.00249

Balari, S. and G. Lorenzo. (2013). Computational Phenotypes: Towards an Evolutionary Developmental Biolinguistics. Oxford: Oxford University Press.

Balari, S., A. Benítez-Burraco, V. M. Longa and G. Lorenzo. (2013). The fossils of language: What are they, who has them, how did they evolve? In C. Boeckx and K. K. Grohmann (eds.), The Cambridge Handbook of Biolinguistics, 489-523. Cambridge: Cambridge University Press.

Begleiter, H., and Porjesz, B. (2006). Genetics of human brain oscillations. Int. J. Psychophysiol. 60, 162–171. doi: 10.1016/j.ijpsycho.2005.12.013

Benítez-Burraco, A. (2021). Mental time travel, language evolution, and human self-domestication. Cognitive Processing https://doi.org/10.1007/s10339-020-01005-2.

Benítez-Burraco, A., and Murphy, E. (2016). The Oscillopathic Nature of Language Deficits in Autism: From Genes to Language Evolution. Frontiers in human neuroscience, 10, 120. https://doi.org/10.3389/fnhum.2016.00120

Benítez-Burraco, A., and Murphy, E. (2019). Why Brain Oscillations Are Improving Our Understanding of Language. Frontiers in behavioral neuroscience, 13, 190. https://doi.org/10.3389/fnbeh.2019.00190

Benítez-Burraco, A., and Progovac, L. (2020). A four-stage model for language evolution under the effects of human self-domestication. Language and Communication 73, 1-17. doi.org/10.1016/j.langcom.2020.03.002

Benítez-Burraco, A., and Dediu, D. (forthcoming) The evolution of language and speech: what we know from genetics. In Lock A, Sinha C, Gontier N, eds, The Oxford Handbook on Human Symbolic Evolution. Oxford: Oxford University Press

Ben-Shachar, M., Palti, D., and Grodzinsky, Y. (2004). Neural correlates of syntactic movement: converging evidence from two fMRI experiments. NeuroImage 21, 1320–1336. doi: 10.1016/j.neuroimage.2003.11.027

Berwick, R. C., Friederici, A., Chomsky, N., and Bolhuis, J. J. (2013). Evolution, brain, and the nature of language. Trends Cog. Sci. 17, 89–98. doi: 10.1016/j.tics.2012.12.002

Berwick, R., and Chomsky, N. (2011). The Biolinguistic Program. The current state of its development. In A.M. Di Sciullo, and C. Boeckx (eds.), The Biolinguistic Enterprise: New Perspectives on the Evolution and Nature of the Human Language Faculty, 19-41. Oxford: Oxford University Press.

Berwick, R., and Chomsky, N. (2016). Why Only Us? Language and Evolution. Cambridge, MA and London, England: MIT Press.

Berwick, R.C. (1998). Language evolution and the Minimalist Program: The origins of syntax. In J.R. Hurford, M. Studdert–Kennedy, and C. Knight (eds.), Approaches to the Evolution of Language: Social and Cognitive Bases, 320–340. Cambridge: Cambridge University Press.

Berwick, R.C. (2011). All you need is Merge: Biology, computation, and language form from the bottom-up. In A.M. Di Sciullo, and C. Boeckx (eds.), The Biolinguistic Enterprise: New Perspectives on the Evolution and Nature of the Human Language Faculty, 461-491. Oxford: Oxford University Press.

Berwick, R.C., Hauser, M.D., and Tattersall, I. (2013). Neanderthal language? Just-so stories take center stage. Frontiers in Psychology 4. doi: 10.3389/fpsyg.2013.00671.

Bickerton, D. (1990). Language and Species. Chicago: University of Chicago Press.

Bickerton, D. (1998). Catastrophic evolution: The case for a single step from protolanguage to full human language. In J.R. Hurford, M. Studdert-Kennedy, and C. Knight (eds.), Approaches to the Evolution of Language: Social and Cognitive Bases, 341-358. Cambridge: Cambridge University Press.

Bickerton, D. (2007). Language evolution: A brief guide for linguists. Lingua 117: 510-526.

Bickerton, D. (2014). More than Nature Needs: Language, Mind, and Evolution. Cambridge, MA: Harvard University Press.

Boeckx, C. and Grohmann, K.K. (2007). The Biolinguistics manifesto. Biolinguistics 1, 1–8.

Boeckx, C., Martinez-Alvarez, A., and Leivada, E. (2014). The functional neuroanatomy of serial order in language. Journal of Neurolinguistics 32, 1-15.

Bookheimer, S. (2002). Functional MRI of language: new approaches to understanding the cortical organization of semantic processing. Annu. Rev. Neurosci. 25, 151–188. doi: 10.1146/annurev.neuro.25.112701.142946

Bosman, C., García, R., and Aboitiz, F. (2004). FOXP2 and the language working-memory system. Trends in cognitive sciences, 8(6), 251–252. https://doi.org/10.1016/j.tics.2004.04.006

Brennan, J., Nir, Y., Hasson, U., Malach, R., Haeger, D. J., and Pylkkänen, L. (2012). Syntactic structure building in the anterior temporal lobe during natural story listening. Brain Lang. 120, 163–173. doi: 10.1016/j.bandl.2010.04.002

Burzio, L. (1981). Intransitive verbs and Italian auxiliaries. Ph.D. dissertation, Massachusetts Institute of Technology, Cambridge, MA.

Caplan, D. (2001). Functional neuroimaging studies of syntactic processing. J. Psycholing. Res. 30, 297–320. doi: 10.1023/A:1010495018484

Caramazza, A., and Zurif, E. B. (1976). Dissociation of algorithmic and heuristic processes in sentence comprehension: evidence from aphasia. Brain Lang. 3, 572–582. doi: 10.1016/0093-934X(76)90048-1

Carnie, A. (2013). Syntax: A Generative Introduction. 3rd edn. Oxford, UK: Wiley-Blackwell.

Chomsky, N. (1955). The Logical Structure of Linguistic Theory.

Chomsky, N. (1995). The Minimalist Program. Cambridge, MA: MIT Press.

Chomsky, N. (2000). Minimalist inquiries: The framework. In R. Martin, D. Michaels, and J. Uriagereka (eds.), Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik, 89-155. Cambridge, MA: MIT Press.

Chomsky, N. (2002). On Nature and Language. Edited by A. Belletti, and L. Rizzi. Cambridge: Cambridge University Press.

Chomsky, N. (2005). Three factors in language design. Linguistic Inquiry 36: 1-22.

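Chomsky, N. (2010). Some simple evo-devo theses: How true might they be for language? In R.K. Larson, V. Deprez, and H. Yamakido (eds.), The Evolution of Human Language: Biolinguistic Perspectives. Cambridge: Cambridge University Press.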

Christiansen, M.H., and Chater, N. (2008). Language as shaped by the brain. Behavioral and Brain Sciences 31: 489–558.

Clark, B. (2013). Syntactic theory and the evolution of syntax. Biolinguistics 7: 169-197.

Constable, R. T., Pugh, K. R., Berroya, E., Mencl, W. E., Westerveld, M., Ni, W., et al. (2004). Sentence complexity and input modality effects in sentence comprehension: an fMRI study. NeuroImage 22, 11–21. doi: 10.1016/j.neuroimage.2004.01.001

Culicover, P.W., and Jackendoff, R. (2005). Simpler Syntax. New York: Oxford University Press.

Darwin, C. (1872). The Expression of the Emotions in Man and Animals. London: John Murray.

Darwin, C. M. A. (1874). The Descent of Man, and Selection in Relation to Sex. New edition, revised and augmented. New York: Hurst and Company.

Dediu, D. (2015). An Introduction to Genetics for Language Scientists: Current Concepts, Methods, and Findings. Cambridge: Cambridge University Press.

Dediu, D., and Ladd, D.R. (2007). Linguistic tone is related to the population frequency of the adaptive haplogroups of two brain size genes, ASPM and Microcephalin. Proceedings of the National Academy of Sciences of the USA 104: 10944-10949.

Dediu, D., and Levinson, S.C. (2013). On the antiquity of language: The reinterpretation of Neandertal linguistic capacities and its consequences. Frontiers in Psychology 4, 397. doi: 10.3389/fpsyg.2013.00397.

den Hoed, J., and Fisher, S.E. (2020). Genetic pathways involved in human speech disorders. Curr. Opin. Genet. Dev. 65, 103-111. doi: 10.1016/j.gde.2020.05.012

Deriziotis, P., and Fisher, S. E. (2017). Speech and Language: Translating the Genome. Trends in genetics : TIG, 33(9), 642–656. https://doi.org/10.1016/j.tig.2017.07.002

Di Sciullo, A-M. (2013). Exocentric compounds, language and proto-language. Language and Information Society 20: 1-26.

Di Sciullo, A.M., and Boeckx, C. (eds.) (2011). The Biolinguistic Enterprise: New Perspectives on the Evolution and Nature of the Human Language Faculty. Oxford Studies in Biolinguistics 1. Oxford: Oxford University Press.

Diller, K.C., and Cann, R.L. (2013). Genetics, evolution, and the innateness of language. In R. Botha and M. Everaert (eds.), The Evolutionary Emergence of Language, 244-258. Oxford: Oxford University Press.

Ding, N., Melloni, L., Zhang, H., Tian, X., and Poeppel, D. 2016. Cortical tracking of hierarchical linguistic structures in connected speech. Nature Neuroscience 19, 158-164.

Di Sciullo, A. M., Piattelli-Palmarini, M., Wexler, K., Berwick, R. C., Boeckx, C., Jenkins, L., Uriagereka, J., Stromswold, K., Lai-Shen Cheng, L., Harley, H., Wedel, A., McGilvray, J., van Gelderen, E., and Bever, T.G. (2010). The biological nature of human language. Biolinguistics 4.1: 004–034.

Draganski, B., Kherif, F., Klöppel, S., Cook, P.A., Alexander, D.C., Parker, G.J.M., Deichmann, R., Ashburner, J., and Frackowiak, R.S.J. (2008). Evidence for segregated and integrative connectivity patterns in the human basal ganglia. The Journal of Neuroscience 28: 7143–7152.

Embick, D., Marantz, A., Miyashita, Y., O’Neil, W., and Sakai, K. L. (2000). A syntactic specialization for Broca’s area. Proc. Natl. Acad. Sci. U.S.A. 97, 6150–6154. doi: 10.1073/pnas.100098897

Enard, W., Gehre, S., Hammerschmidt, K. …, and Pääbo, S. (2009). A humanized version of FOXP2 affects cortico-basal ganglia circuits in mice. Cell 137: 961-67.

Enard, W., Przeworski, M., Fisher, S.E., Lai, C.S.L., Wiebe, V., Kitano, T., Monaco, A.P., and Pääbo, S. (2002). Molecular evolution of FOXP2, a gene involved in speech and language. Nature 418: 869-872.

Epstein, D.S., Kitahara, H., and Seely, D. (2010). Uninterpretable features: What are they and what do they do? In M.T. Putnam (ed.), Exploring Crash-Proof Grammars, 125-142. Amsterdam: John Benjamins.

Fedorenko, E., and Kanwisher, N. (2009). A new approach to investigating the functional specificity of language regions in the brain. Poster at the Neurobiology of Language Conference, Chicago.

Fisher, S.E. (2017). Evolution of language: Lessons from the genome. Psychonomic Bulletin & Review 24: 34-40.

Fisher, S.E., Vargha-Khadem, F., Watkins, K.E., Monaco, A.P., and Pembrey, M.E. (1998). Localization of a gene implicated in a severe speech and language disorder. Nature Genetics 18: 168-170.

Fitch, W.T. (2008). Co-evolution of phylogeny and glossogeny: There is no “logical problem of language evolution.” Behavioral and Brain Sciences 31.5: 521-522.

Fitch, W.T. (2010). The Evolution of Language. Cambridge: Cambridge University Press.

Fitch, W.T. (2017a). Preface to the Special Issue on the Biology and Evolution of Language. Psychonomic Bulletin & Review 24: 1-2.

Fitch, W.T. (2017b). Empirical approaches to the study of language evolution. Psychonomic Bulletin & Review 24: 3-33.

Fitch, W. T. and M. D. Martins. (2014). Hierarchical processing in music, language, and action: Lashley revisited. Annals of the New York Academy of Sciences 1316, 1-18.

Ford, A. A., Triplett, W., Sudhyadhom, A., Gullett, J., McGregor, K., FitzGerald, D. B., et al. (2013). Broca’s area and its striatal and thalamic connections: a diffusion-MRI tractography study. Front. Neuroanat. 7:8. doi: 10.3389/fnana.2013.00008

Franchini, L. F., and Pollard, K. S. (2017). Human evolution: the non-coding revolution. BMC biology, 15(1), 89. https://doi.org/10.1186/s12915-017-0428-9

Frey, S., Campbell, J. S. W., Pike, G. B., and Petrides, M. (2008). Dissociating the human language pathways with high angular resolution diffusion fiber tractography. J. Neurosci. 28, 11435–11444. doi: 10.1523/JNEUROSCI.2388-08.2008

Friederici, Angela D. (2017). Language in Our Brain: The Origins of a Uniquely Human Capacity. Cambridge, MA: MIT Press.

Friederici, A. D., Fiebach, C. J., Schlesewsky, M., Bornkessel, I. D., and von Cramon, D. Y. (2006). Processing linguistic complexity and grammaticality in the Left Frontal Cortex. Cereb. Cortex 16, 1709–1717. doi: 10.1093/cercor/bhj106

Friederici, A. D., Meyer, M., and von Cramon, D. Y. (2000). Auditory language comprehension: an event-related fMRI study on the processing of syntactic and lexical information. Brain Lang. 74, 289–300. doi: 10.1006/brln.2000.2313

Friedmann, Na’ama and Yosef Grodzinsky. 1997. Tense and Agreement in Agrammatic Production: Pruning the Syntactic Tree. Brain and Language 56, 397-425.

Friedmann, Na’ama. 2006. Speech production in Broca’s agrammatic aphasia: Syntactic tree pruning. In Y. Grodzinsky and K. Amunts (eds.), Broca’s Region, pp. 63-82. New York: Oxford University Press.

Fujita, K. (2009). A prospect for evolutionary adequacy: Merge and the evolution and development of human language. Biolinguistics 3, 128-153.

Fujita, K. (2017). On the parallel evolution of syntax and lexicon: A Merge-only view. Journal of Neurolinguistics 43B, 178-192.

Gallistel, C.R., and King, A.P. (2010). Memory and the Computational Brain: Why Cognitive Science Will Transform Neuroscience. Malden, MA: Wiley-Blackwell.

Gibson, K. (1996). “The ontogeny and evolution of the brain, cognition, and language,” in Handbook of Human Symbolic Evolution, eds A. Lock and C. R. Peters (Oxford: Clarendon Press), 407–431.

Gil, D. (2005). Isolating-monocategorial-associational language. In H. Cohen, and C. Lefebvre (eds.), Handbook of Categorization in Cognitive Science, 347–379. Amsterdam: Elsevier.

Givón, T. (2002). Bio-linguistics: The Santa Barbara Lectures. Amsterdam: John Benjamins.

Givón, T. (2009). The Genesis of Syntactic Complexity: Diachrony, Ontogeny, Neuro-cognition, Evolution. Amsterdam/Philadelphia: John Benjamins.

Gopnik, M. (1990). Feature-blind grammar and dysphasia. Nature 344: 715-715.

Gopnik, M., and Crago, M.B. (1991). Familial aggregation of a developmental language disorder. Cognition 39: 1-50.

Gould, S.J., and Eldredge, N. (1977). Punctuated equilibria: The tempo and mode of evolution reconsidered. Paleobiology 3: 115-151.

Graham SA, Fisher SE (2015) Understanding language from a genomic perspective. Annu Rev Genet. 49:131-60. doi: 10.1146/annurev-genet-120213-092236.

Greenfield, P.M. (1991). Language, tools, and brain: The ontogeny and phylogeny of hierarchically organized sequential behaviour. Behavioral and Brain Sciences 14, 531-595.

Grodzinsky, Y. (2000). The neurology of syntax: Language use without Broca’s area. Behavioral and Brain Sciences, 23, 1-21.

Grodzinsky, Y. (2010). The picture of the linguistic brain: How sharp can it be? Reply to Fedorenko and Kanwisher. Language and Linguistics Compass, 4/8, 605-622.

Grodzinsky, Y., and Friederici, A.D. (2006). Neuroimaging of syntax and syntactic processing. Current Opinion in Neurobiology, 16, 240-246.

Hagoort, P., and Indefrey, P. (2014). The neurobiology of language: Beyond single words. Annual Review of Neuroscience, 37, 347-62. doi: 10.1146/annurev-neuro-071013-013847

Hauser, M. D., Chomsky, N., and Fitch, W. T. (2002). The Faculty of Language: What is it, who has it, and how did it evolve? Science 298, 1569–1579.

Heim, S., Eickhoff, S. B., and Amunts, K. (2009). Different roles of cytoarchitectonic BA 44 and BA 45 in phonological and semantic verbal fluency as revealed by dynamic causal modelling. Neuroimage 48(3), 616-24. doi: 10.1016/j.neuroimage.2009.06.044

Heine, B., and Kuteva, T. (2007). The Genesis of Grammar. A Reconstruction. Oxford: Oxford University Press.

Hillert, D. (2014). The Nature of Language: Evolution, Paradigms and Circuits. New York: Springer.

Hinton, G.E., and Nowlan, S.J. (1987). How learning can guide evolution. Complex Systems 1: 495-502.

Hornstein, N. (2009). A Theory of Syntax: Minimal Operations and Universal Grammar. Cambridge: Cambridge University Press.

Hoshi, K. (2017). Lenneberg’s contributions to the biology of language and child aphasiology: Resonation and brain rhythmicity as key mechanisms. Biolinguistics 11.SI, 83-113.

Hoshi, K. (2018). Merge and Labeling as Descent with Modification of Categorization: A Neo-Lennebergian Approach. Biolinguistics 12, 39-54.

Hoshi, K. (2019). More on the Relations among Categorization, Merge and Labeling, and Their Nature. Biolinguistics 13, 1-21.

Hurford, J. R. (2007). The Origins of Meaning. Language in the Light of Evolution. Oxford: Oxford University Press.

Hurford, J. R. (2012). The Origins of Grammar. Language in the Light of Evolution II. Oxford: Oxford University Press.

Indefrey, P., Hagoort, P., Herzog, H., Seitz, J., and Brown, C. M. (2001). Syntactic processing in left prefrontal cortex is independent of lexical meaning. NeuroImage 14, 546–555. doi: 10.1006/nimg.2001.0867

Jackendoff, R. (1999). Possible stages in the evolution of the language capacity. Trends in Cognitive Sciences 3: 272–279. doi:10.1016/S1364-6613(99)01333-9.

Jackendoff, R. (2002). Foundations of Language: Brain, Meaning, Grammar, Evolution. Oxford: Oxford University Press.

Jackendoff, R., and Wittenberg, E. (2014). What you can say without syntax: A hierarchy of grammatical complexity. In F.J. Newmeyer, and L.B. Preston (eds.), Measuring Grammatical Complexity, 65-82. Oxford: Oxford University Press.

Jacob, F. (1977). Evolution and tinkering. Science 196: 1161-1166.

Jiménez-Bravo, M., Marrero, V., and Benítez-Burraco, A. (2017). An oscillopathic approach to developmental dyslexia: from genes to speech processing. Behav. Brain Res. 329, 84–95. doi: 10.1016/j.bbr.2017.03.048

Johnson, D.E., and Lappin, S. (1999). Local Constraints vs. Economy. CSLI Publications: Stanford Monographs in Linguistics.

Just, M. A., Carpenter, P. A., Keller, T. A., and Thulborn, K. R. (1996). Brain activation modulated by sentence comprehension. Science 274, 114–116. doi: 10.1126/science.274.5284.114

Kitagawa, Y. (1985). Small but clausal. Chicago Linguistic Society 21: 210-220.

Kolk, Herman. 2006. How language adapts to the brain: An analysis of agrammatic aphasia. In Progovac et al. (eds.), The Syntax of Nonsententials: Multidisciplinary Perspectives, 229–258. Amsterdam: John Benjamins.

Koomar, T., and Michaelson, J. J. (2020). Genetic Intersections of Language and Neuropsychiatric Conditions. Current psychiatry reports, 22(1), 4. https://doi.org/10.1007/s11920-019-1123-z

Koopman, H., and Sportiche, D. (1991). The position of Subjects. Lingua 85: 211–258.

Kos, M., van den Brink, D., Snijders, T.M., Rijpkema, M., Franke, B., Fernandez, G., and Hagoort, P. (2012). CNTNAP2 and language processing in healthy individuals as measured with ERPs. Plos One 7: e46995.

Krause, J., Lalueza-Fox, C., Orlando, L., Enard, W., Green, R., Burbano, H., Hublin, J.-J., Hänni, C., Fortea, J., Rasilla, M., Bertranpetit, J., Rosas, A., and Pääbo, S. (2007). The derived FOXP2 variant of modern humans was shared with Neanderthals. Current Biology 17(1-5): 53-60.

Lai, C.S., Fisher, S.E., Hurst, J.A., Vargha-Khadem, F., and Monaco, A.P. (2001). A forkhead-domain gene is mutated in a severe speech and language disorder. Nature, 413, 519–523.

Lenneberg, E. H. (1967). Biological Foundations of Language. New York: Wiley.

Lenneberg, E. H. (1972). Language and brain: Developmental aspects. A report based on an NRP Work Session held November 19-21, 1972, and updated by participants. Neuroscience Research Programme Bulletin 12(4), 511-656.

Lewontin, R.C. (1998). Evolution of cognition: Questions we will never answer. In D. Scarborough, and S. Sternberg (eds.), An Invitation to Cognitive Science, vol. 4: Methods, Models, and Conceptual Issues, 107–132. Cambridge, MA: MIT Press.

Lieberman, P. (2000). Human Language and Our Reptilian Brain: The Subcortical Bases of Speech, Syntax, and Thought. Cambridge, MA: Harvard University Press.

Lieberman, P. (2002). On the nature and evolution of the neural bases of human language. Yearbook of Physical Anthropology 45, 36-62.

Lieberman, P. (2006). Toward an Evolutionary Biology of Language. Cambridge, MA: Harvard University Press.

Lieberman, P. (2007). The evolution of human speech: Its anatomical and neural bases. Current Anthropology 48, 39-66.

Lieberman, P. (2009). FOXP2 and human cognition. Cell 137: 801-802.

Liégeois, F, Baldeweg, T., Connelly, A., Gadian, D.G., Mishkin, M., and Vargha-Khadem, F. (2003). Language fMRI abnormalities associated with FOXP2 gene mutation. Nature Neuroscience 6.11: 1230-7.

Lightfoot, D. (1991). Subjacency and sex. Language and Communication 11: 67–69.

Linkenkaer-Hansen, K., Smit, D.J., Barkil, A., van Beijsterveldt, T.E., Brussaard, A.B., Boomsma, D.I., et al., 2007. Genetic contributions to long-range temporal correlations in ongoing oscillations. J Neurosci 27, 13882-9

Makuuchi M, Friederici AD. (2013). Hierarchical functional connectivity between the core language system and the working memory system. Cortex 49(9):2416-23. doi: 10.1016/j.cortex.2013.01.007. Epub 2013 Jan 28. PMID: 23480847.

Marcus, G. (2008). Kluge: The Haphazard Construction of the Human Mind. Boston and New York: Houghton Mifflin Company.

McBrearty, S. (2007). Down with the revolution. In P. Mellars, K. Boyle, O. Bar-Yosef, and C. Stringer (eds.), Rethinking the Human Revolution: New Behavioral and Biological Perspectives on the Origin and Dispersal of Modern Humans, 133-151. University of Cambridge: McDonald Institute for Archeological Research.

McBrearty, S., and Brooks, A. (2000). The revolution that wasn’t: A new interpretation of the origin of modern human behavior. Journal of Human Evolution 39: 453-563.

Mellars, P. (2002). Archeology and the origins of modern humans: European and African perspectives. In T.J. Crow (ed.), The Speciation of Modern Homo Sapiens, 31–47. Oxford: Oxford University Press.

Mellars, P. (2007). Introduction: Rethinking the Human Revolution: Eurasian and African perspectives. In P. Mellars, K. Boyle, O. Bar-Yosef, and C. Stringer (eds.), Rethinking the Human Revolution: New Behavioral and Biological Perspectives on the Origin and Dispersal of Modern Humans, 1-11. University of Cambridge: McDonald Institute for Archeological Research.

Meyer, L. 2017. The neural oscillations of speech processing and language comprehension: state of the art and emerging mechanisms. European Journal of Neuroscience doi:10.1111/ejn.13748.

Meyer, L., Sun, Y., & Martin, A.E. (2020). Synchronous, but not entrained: exogenous and endogenous cortical rhythms of speech and language processing. Language, Cognition and Neuroscience 35(9): 1089-1099.

Miyagawa, S. (2017). Integration Hypothesis: A parallel model of language development in evolution. In S. Watanabe, M.A. Hofman, and T. Shimizu (eds.), Evolution of the Brain, Cognition, and Emotion in Vertebrates, 225-250. Brain Science Series. Springer Japan. doi 10.1007/978-4-431-56559-8_11.

Miyagawa, S., Ojima, S., Berwick, R.C., and Okanoya, K. (2014). The integration hypothesis of human language evolution and the nature of contemporary languages. Frontiers in Psychology 5, doi: 10.3389/fpsyg.2014.00564.

Mohr, J. P., Pessin, M. S., Finkelstein, S., Funkenstein, H. H., Duncan, G. W., and Davis, K. R. (1978). Broca aphasia: pathologic and clinical. Neurology 28, 311–311. doi: 10.1212/WNL.28.4.311

Moro, A. (2008). The Boundaries of Babel: The Brain and the Enigma of Impossible Languages. Cambridge, MA: The MIT Press.

Moro, A., Tettamanti, M., Perani, D., Donati, C., Cappa, S. F., and Fazio, F. (2001). Syntax and the brain: disentangling grammar by selective anomalies. NeuroImage 13, 110–118. doi: 10.1006/nimg.2000.0668

Müller, V., Anokhin, A. P., Lindenberger, U., 2017. Genetic influences on phase synchrony of brain oscillations supporting response inhibition. Int J Psychophysiol 115, 125-132.

Murphy, E. (2018). “A domesticated code: on the emergence of the oscillatory basis of phrase structure,” in The Evolution of Language: Proceedings of the 12th International Conference (Evolang 12), eds C. Cuskley, M. Flaherty, L. McCrohon, H. Little, A. Ravignani, and T. Verhoef (Torun: Nicolaus Copernicus University), 335–338. doi: 10.12775/3991-1.081

Murphy, E. (2021). The Oscillatory Nature of Language. Cambridge: Cambridge University Press.

Murphy, E., and Benítez-Burraco, A. (2016). Bridging the gap between genes and language deficits in schizophrenia: an oscillopathic approach. Front. Hum. Neurosci. 10:422. doi: 10.3389/fnhum.2016.00422

Murphy, E., and Benítez-Burraco, A. (2018a). Toward the language oscillogenome. Front. Psychol. 9:1999. doi: 10.3389/fpsyg.2018.01999

Murphy, E., and Benítez-Burraco, A. (2018b). Paleo-oscillomics: inferring aspects of Neanderthal language abilities from gene regulation of neural oscillations. J. Anthropol. Sci. 96, 111–124. doi: 10.4436/JASS.96010

Newbury, D.F., and Monaco, A.P. (2010). Genetic advances in the study of speech and language disorders. Neuron 68: 309-320.

Newman, A. J., Supalla, T., Hauser, P., Newport, E. L., and Bavelier, D. (2010). Dissociating neural subsystems for grammar by contrasting word order and inflection. Proc. Natl. Acad. Sci. U.S.A. 107, 7539–7544. doi: 10.1073/pnas.1003174107

Newmeyer, F.J. (1991). Functional explanation in linguistics and the origins of language. Language and Communication 11: 1-28.

Newmeyer, F.J. (1998). On the supposed “counterfunctionality” of Universal Grammar: Some evolutionary implications. In J. R. Hurford, M. Studdert-Kennedy, and C. Knight (eds.), Approaches to the Evolution of Language: Social and Cognitive Bases, 305-319. Cambridge: Cambridge University Press.

Newmeyer, F.J. (2005). Possible and Probable Languages: A Generative Perspective on Linguistic Typology. Oxford: Oxford University Press.

Nóbrega, V., and Miyagawa, S. (2015). The precedence of syntax in the rapid emergence of human language in evolution as defined by the integration hypothesis. Frontiers in Psychology. doi.org/10.3389/fpsyg.2015.00271.

Novick, J. M., Trueswell, J. C., and Thompson-Schill, S. L. (2010). Broca’s area and language processing: evidence for the cognitive control connection. Lang. Linguist. Compass 4, 906–924. doi: 10.1111/j.1749-818X.2010.00244.x

Okanoya, K. (2017). Sexual communication and domestication may give rise to the signal complexity necessary for the emergence of language: An indication from songbird studies. Psychonomic Bulletin & Review 24: 106-110.

Opitz, B., and Friederici, A. D. (2007). Neural basis of processing sequential and hierarchical syntactic structures. Hum. Brain Mapp. 28, 585–592. doi: 10.1002/hbm.20287

Pallier, C., Devauchelle, A. D., and Dehaene, S. (2011). Cortical representation of the constituent structure of sentences. Proc. Natl. Acad. Sci. U.S.A. 108, 2522–2527. doi: 10.1073/pnas.1018711108

Penke, M. (2015). Syntax and language disorders. In T. Kiss and A. Alexiadou (eds.), Syntax – Theory and Analysis: An International Handbook, 1833-1874. Berlin: Walter de Gruyter.

Piai, V., Anderson, K.L., Lin, J.J., Dewar, C., Parvizi, J., Dronkers, N.F., & Knight, R.T. (2016). Direct brain recordings reveal hippocampal rhythm underpinnings of language processing. PNAS 113(40): 11366-11371.

Piattelli-Palmarini, M. (2010). What is language, that it may have evolved, and what is evolution, that it may apply to language? In R.K. Larson, V. Deprez, and H. Yamakido (eds.), The Evolution of Human Language: Biolinguistic Perspectives, 148–162. Cambridge: Cambridge University Press.

Piattelli-Palmarini, M., and Uriagereka, J. (2004). Immune syntax: The evolution of the language virus. In L. Jenkins (ed.), Variation and Universals in Biolinguistics, 341–377. Oxford: Elsevier.

Piattelli-Palmarini, M., and Uriagereka, J. (2011). A geneticist’s dream, a linguist’s nightmare: The case of FOXP2 gene. In A.M. Di Sciullo and C. Boeckx (eds.), The Biolinguistic Enterprise: New Perspectives on the Evolution and Nature of the Human Language Faculty, 100-125. Oxford: Oxford University Press.

Pinker, S., and Bloom, P. (1990). Natural language and natural selection. Behavioral and Brain Sciences 13: 707-784.

Poeppel, D., 2012. The maps problem and the mapping problem: Two challenges for a cognitive neuroscience of speech and language. Cogn. Neuropsychol. 29, 34–55. doi:10.1080/02643294.2012.710600.

Poeppel, David and David Embick. 2005. Defining the relation between Linguistics and Neuroscience. In Anne Cutler (ed.), Twenty-First Century Psycho-linguistics: Four Cornerstones, 103–118. Mahwah, NJ: Lawrence Erlbaum.

Progovac, L. (2008). What use is half a clause? In A. Smith, K. Smith, and R. Ferrer i Cancho (eds.), Evolution of Language: Proceedings of the 7th International Conference, Barcelona, Spain, March 12–15, 259–266. Hackensack, NJ: World Scientific.

Progovac, L. (2010). Syntax: Its evolution and its representation in the brain. Biolinguistics 4.2-3: 233-254.

Progovac, L. (2015). Evolutionary Syntax. Oxford Studies in the Evolution of Language. Oxford: Oxford University Press.

Progovac, L. (2016a). A Gradualist scenario for language evolution: Precise linguistic reconstruction of early human (and Neandertal) grammars. Frontiers in Psychology 7:1714. doi: 10.3389/fpsyg.2016.01714.

Progovac, L. (2016b). Review of Robert Berwick and Noam Chomsky’s 2016 book Why Only Us: Language and Evolution. Cambridge, MA: MIT Press. Language 92.4: 992-6.

Progovac, L. (2017). Where is continuity likely to be found? Commentary on ‘The social origins of language’ by Robert M. Seyfarth and Dorothy L. Cheney. Edited and introduced by Michael Platt, 46-61. Princeton and Oxford: Princeton University Press.

Progovac, L. (2019). A Critical Introduction to Language Evolution: Current Controversies and Future Prospects. Springer Briefs in Linguistics: Expert Briefs. Cham, Switzerland: Springer.

Progovac, L., and Benítez-Burraco, A. (2019). From physical aggression to verbal behavior: Language evolution and self-domestication feedback loop. Frontiers in Psychology 10: 2807. doi: 10.3389/fpsyg.2019.02807

Progovac, L., and Locke, J.L. (2009). The urge to merge: Ritual insult and the evolution of syntax. Biolinguistics 3.2-3: 337-354.

Progovac, L., Rakhlin, N., Angell, W., Liddane, R., Tang, L., and Ofen, N. (2018a). Diversity of grammars and their diverging evolutionary and processing paths: Evidence from Functional MRI study of Serbian. Frontiers in Psychology. Special Issue: Languages as Adaptive Systems, edited by E. Aboh, and U. Ansaldo. doi: 10.3389/fpsyg.2018.00278.

Progovac, L., Rakhlin, N., Angell, W., Liddane, R., Tang, L., and Ofen, N. (2018b). Neural correlates of syntax and proto-syntax: An fMRI study. Frontiers in Psychology 9:2415, 1-16. doi: 10.3389/fpsyg.2018.02415.

Rakhlin, N., and Grigorenko, E. (2014). (A)typical language development: genetic and environmental influences. In Handbook of Communication Disorders, eds. R. A. Bahr and E. Silliman. Routledge, pp. 11–21.

Reilly SK, Noonan JP (2016) Evolution of gene regulation in humans. Annu Rev Genomics Hum Genet. 17:45-67. doi:10.1146/annurev-genom-090314-045935

Salami, M., Itami, C., Tsumoto, T., Kimura, F. (2003). Change of conduction velocity by regional myelination yields constant latency irrespective of distance between thalamus and cortex. PNAS, 100, 6174-6179.

Schreiweis C, Bornschein U, Burguière E, Kerimoglu C, Schreiter S, Dannemann M, Goyal S, Rea E, French CA, Puliyadi R, Groszer M, Fisher SE, Mundry R, Winter C, Hevers W, Pääbo S, Enard W, Graybiel AM (2014) Humanized Foxp2 accelerates learning by enhancing transitions from declarative to procedural performance. Proc Natl Acad Sci U S A. 111(39):14253-8. doi: 10.1073/pnas.1414542111.

Seger, C. A. (2008). How do the basal ganglia contribute to categorization? Their role in generalization, response selection, and learning via feedback. Neuroscience and Biobehavioral Reviews 32, 265-273.

Seger, C.A., and Miller, E.K. (2010). Category learning in the brain. Annual Review of Neuroscience 33: 203-219.

Stowell, T. (1981). Origins of phrase structure. Ph.D. Dissertation, Massachusetts Institute of Technology, Cambridge, MA.

Stromswold, K., Caplan, D., Alpert, N., and Rauch, S. (1996). Localization of syntactic comprehension by positron emission tomography. Brain Lang. 52, 452–473. doi: 10.1006/brln.1996.0024

Tallerman, M. (2009). The origins of the lexicon: how a word-store evolved. In Rudolf Botha and Chris Knight (eds.), The Prehistory of Language, 181-200. Oxford: Oxford University Press.

Tallerman, M. (2014). No syntax saltation in language evolution. Language Sciences 46 (Part B): 207-219.

Teichmann, M., Dupoux, E., Kouider, S., Brugières, P., Boissé, M.-F., Baudic, S., et al. (2005). The role of the striatum in rule application: the model of Huntington’s disease at early stage. Brain 128, 1155–1167. doi: 10.1093/brain/awh472

Teichmann, M., Gaura, V., Démonet, J.-F., Supiot, F., Delliaux, M., Verny, C., et al. (2008). Language processing within the striatum: evidence from a PET correlation study in Huntington’s disease. Brain 131, 1046–1056. doi: 10.1093/brain/awn036

Teichmann, M., C. Rosso, J-B Martini, I. Bloch, P. Brugières, H. Duffau, S. Lehéricy and A.-C Bachoud-Lévi. (2015). A cortical-subcortical syntax pathway linking Broca’s area and the striatum. Human Brain Mapping 36, 2270-2283.

Thompson, D’Arcy W. 1917. On Growth and Form. Cambridge: Cambridge University Press.

Thothathiri, M., Kimberg, D. Y., and Schwartz, M. F. (2012). The neural basis of reversible sentence comprehension: evidence from voxel-based lesion symptom mapping in aphasia. J. Cogn. Neurosci. 24, 212–222. doi: 10.1162/jocn_a_00118

Trettenbrein, P. C. (2017). 50 years later: A tribute to Eric Lenneberg’s Biological Foundations of Language. Biolinguistics 11.SI, 21-30.

Ullman, M.T. (2001). The declarative/procedural model of lexicon and grammar. Journal of Psycholinguistic Research 30: 37-69.

Ullman, M.T. (2006). Is Broca’s area part of a basal ganglia thalamocortical circuit? Cortex 42: 480-485.

Ullman, M. T., and Pierpont, E. I. (2005). Specific language impairment is not specific to language: the procedural deficit hypothesis. Cortex 41(3), 399–433. https://doi.org/10.1016/s0010-9452(08)70276-4

Uriagereka, J., Stromswold, K., Lai-Shen Cheng, L., Harley, H., Wedel, A., McGilvray, J., van Gelderen, E., and Bever, T. G. (2010). The Biological Nature of Human Language. Biolinguistics 4.1: 4–34. http://www.biolinguistics.eu

Van Beijsterveldt, C.E., Molenaar, P.C., de Geus, E.J., Boomsma, D.I., 1996. Heritability of human brain functioning as assessed by electroencephalography. Am J Hum Genet 58, 562–573.

Vargha-Khadem, F., Gadian, D.G, Copp, A., and Mishkin, M. (2005). FOXP2 and the neuroanatomy of speech and language. Nature Reviews Neuroscience 6: 131-138.

Vernes, S.C., Newbury, D.F., Abrahams, B.S., …, and Fisher, S.E. (2008). A functional genetic link between distinct developmental language disorders. The New England Journal of Medicine 359: 2337–2345. doi: 10.1056/NEJMoa0802828.

Vernes, S.C., Spiteri, E., Nicod, J., Groszer, M., Taylor, J.M., Davies, K.E., Geschwind, D.H., and Fisher, S.E. (2007). High-throughput analysis of promoter occupancy reveals direct neural targets of FOXP2, a gene mutated in speech and language disorders. The American Journal of Human Genetics 81: 1232-1250.

Villagrasa, F., J. Baladron, J. Vitay, H. Schroll, E. G. Antzoulatos, E. K. Miller and F. H. Hamker. (2018). On the role of cortex-basal ganglia interactions for category learning: A neurocomputational approach. The Journal of Neuroscience 38, 9551-9562.

Voight, B.F., Kudaravalli, S., Wen, X., and Pritchard, J.K. (2006). A map of recent positive selection in the human genome. PLOS Biology 4(3): e72. doi: 10.1371/journal.pbio.0040072.

Wexler, K. (1999). Very early parameter setting and the unique checking constraint: A new explanation of the optional infinitive stage. Lingua 106, 23-79.

Whalley, H.C., O'Connell, G., Sussmann, J.E., …, and Hall, J. (2011). Genetic variation in CNTNAP2 alters brain function during linguistic processing in healthy individuals. American Journal of Medical Genetics Part B-Neuropsychiatric Genetics 156B: 941-948.

Whitehouse, A.J.O., Bishop, D.V.M., Ang, Q.W., Pennell, C.E., and Fisher, S.E. (2011). CNTNAP2 variants affect early language development in the general population. Genes Brain Behavior 10: 451-456.

Wu, Jie Qiong (15 January 2014). An Overview of Researches on Biolinguistics. Canadian Social Science. pp. 171–176. CiteSeerX 10.1.1.820.7700

Wynn, T., and Coolidge, F.L. (2004). The expert Neandertal mind. J. Hum. Evol. 46, 467–487. doi: 10.1016/j.jhevol.2004.01.005

Zaccarella E and Friederici AD (2015) Merge in the Human Brain: A Sub-Region Based Functional Investigation in the Left Pars Opercularis. Front. Psychol. 6:1818. doi: 10.3389/fpsyg.2015.01818

Zhang H, Watrous AJ, Patel A, Jacobs J. (2018) Theta and Alpha Oscillations Are Traveling Waves in the Human Neocortex. Neuron. 98(6):1269-1281.e4. doi: 10.1016/j.neuron.2018.05.019.

Zurif, E. B. (1995). “Brain regions of relevance to syntactic processing,” in An Invitation to cognitive science, eds L. Gleitman and M. Liberman (Cambridge, MA: MIT Press), 381–398.
