
Linguistics and the Human Sciences

LHS (print) issn 1742–2906
LHS (online) issn 1743–1662

Article

What sort of minded being has language? Anticipatory dynamics, arguability and agency in a normatively and recursively self-transforming learning system Part 2

Paul J. Thibault

Abstract

Part 2 is a direct continuation of Part 1, which was published in Linguistics and the Human Sciences (2005) Volume 1(2). The numbering of the sections and the figures and tables continues directly from Part 1. The two parts should therefore be read as a single overall article. In Part 2, I consider some specific instances of learning systems involving humans and in relation to the theoretical issues discussed in Part 1. In Part 2, the implications of the systemic-functional theory of language for a unified view of cognition and semiosis as distributed activity on diverse time-scales are further discussed in the light of the issues that are discussed above.

Keywords: affordance, agent, anticipatory dynamics, discourse, cognition, distributed activity, learning, metafunction, semiosis, values

Affiliation

Paul J. Thibault, Professor of Linguistics and Media Communication, Faculty of Humanities, Agder University College, Kristiansand, Norway email: [email protected]

LHS vol 1.3 2005 355–401 doi: 10.1558/lhs.2005.1.3.355

©2005, equinox publishing LONDON

13 Interpersonal negotiation and anticipatory dynamics in an episode of bonobo-human interaction

What can interaction between human caregivers and bonobos such as and , who have been co-reared by human caregivers at the Language and Cognition Research Center in Atlanta, Georgia, tell us about the anticipatory dynamics of interpersonal negotiation and experiential representation? In this Section, I shall discuss an episode that occurred between Janine, a former researcher at the Center, and Kanzi, a resident bonobo. In the episode of bonobo-human interaction to be analysed, Janine telephones the Center from her home. Sue is together with Kanzi in a laboratory at the Center. Sue relays Kanzi’s lexigram sign-making to Janine during the telephone conversation. In the ensuing telephone discussion with Kanzi, who is in the Center’s laboratory along with Sue, Janine undertakes to bring a number of surprises for Kanzi. Four hours later she comes to the Center with the promised surprises in her backpack. A transcript of the episode is presented in Table 3. This transcript presents the first phase of the overall episode, i.e. the telephone conversation prior to Janine’s coming to the Center. The follow-up episode of her arrival at the Center and meeting with Kanzi and Sue is discussed in Thibault (2005a).

No. | Action | Language

1 | | Janine: I’m just talking to … Kanzi I want to tell you something (???? on the phone) Kanzi I was going to come see you I’m going to bring a backpack with some surprises would you like some surprises? can you tell me if you’d like some surprises?

2 | Kanzi touches lexigram | Kanzi:

3 | | Janine: you would / Sue: surprise (overlapping with Janine) / Janine: ok

4 | Sue points to bench top as Kanzi gets down on to floor | Sue: Kanzi come up here

5 | Kanzi returns to bench top and takes telephone to ear | Janine: would you like any food? tell me what food you’d like

6 | Kanzi touches lexigram | Kanzi:

7 | Janine laughs | Janine: some food surprise

8 | | Sue: food surprise [excited, rising intonation]

9 | | Janine: Kanzi would you like some juice or some um M&M’s or some sugar cane?

10 | | Kanzi:

11 | | Sue: [excited voice] M&M’s

12 | | Janine: you like M&M’s ok Kanzi is there any other food that you would like me to bring in the backpack?

13 | Kanzi touches lexigram | Kanzi:

14 | | Janine: a ball ok I can bring a ball I’m going to bring them to see you

FOUR HOURS LATER JANINE COMES TO THE CENTER

Note: in the transcription (???) indicates indecipherable linguistic text

Table 3: Transcript of telephone conversation between Janine, Sue, and Kanzi, prior to Janine’s visit to the Center; transcribed from the NHK (Japan) video documentary Kanzi II.

Kanzi’s participation in and contribution to the exchange show that he is strongly attracted to its dynamics. By the same token, he is able to influence its development according to his wants and needs. Moreover, the representational resources that the lexigram affords him mean that he can link inner needs and desires to external representational resources that can be taken up, interpreted and negotiated by others in ways that maintain the episodic flow of the activity. His ability to participate in and actively shape episodes such as this one shows an emergent capacity to play out a limited range of social roles as a competent member of the bonobo-human society to which he belongs. In this first instance, Kanzi’s signs seek to fulfil his wants, needs, and gratifications. The values are based on instrumental criteria of pleasure gratification in the here-now. However, Kanzi’s integration of lexigram signs to the activity shows a limited representational capacity to go beyond the here-now. They exhibit a limited ability to temporally generalise behavioural expectations. His representations are lifted out of the flow of the activity and function as signs that can anticipate and shape the flow of the activity (Cowley, 2005; Thibault, 2005a: 113). His representational capacity does not simply express concrete wants and needs, though it certainly does that too. Rather, his interaction with Janine shows that he has temporally generalisable expectations about how the interaction should proceed. His limited capacity for experiential representation indicates some ability to abstract to actions and their expected outcomes in other times and places. Expectations about actions can be located on a timeline of past, present and future actions and their outcomes. 
In lifting things out of the flux in this way, representations create episodic units, consisting of events, participants, and outcomes; expectations constitute perspectives on these events. The chunking of experience into episodic units is a kind of proto-experiential meaning. Expectations, on the other hand, entail the perspectives of the observers who interpret these events from their perspectives. Expectation therefore is a kind of proto-modality; it is the means whereby the satisfying of needs and the attaining of wants is no longer tied to the self’s immediate gratification of pleasure or pain. Expectation in this sense lifts the activity out of the here-now. Kanzi’s needs and wants are mediated by indexical-symbolic resources such as those afforded by the lexigram. Kanzi’s pressing into service of these resources in episodes such as the one under consideration here means that he has achieved an elementary level of indexical-symbolic mediation of his needs and wants. His actions are interpreted by his human interlocutors (e.g. Sue, Janine) in terms of culturally mediated needs and wants. Moreover, these same mediational resources allow for the emergence and recognition of a human-bonobo social group with its semiotically mediated social-interpersonal participant roles (Benson, 2002; Benson et al., 2002). With this level of symbolic achievement, Kanzi has acquired some ability to generalise from individuals to a range of role-types that characterise the pan-human socio-cultural group of which he is a member and the activities that characterise the social organisation of the group and its members. This ability creates expectations as to who can and should fulfil these roles from one occasion to another. This achievement further entails the development of generalised expectations concerning the behaviours of the participants who enact particular roles and the responsibilities that are assigned to these roles. 
At this point, we might venture that Kanzi has an implicit theory of himself and his human interlocutors as social agents who can be held responsible for their actions on the basis of generalisable expectations about identities, roles and actions. Such expectations partially lift these identities, roles and actions out of the flux of the here-now activity and place them on a time-line of expectations concerning past, present, and future actions. In this way, Kanzi’s bodily performances on level L in the three-level hierarchy (Section 2, Part 1) are contextually integrated to and reorganised by an emergent system of symbolically mediated pan-human role types, along with their associated expectations and responsibilities on level L+1 of the emergent pan-human ecosocial semiotic system.

Kanzi’s lexigram signs in this episode anticipate a desired action outcome at the same time that this is calibrated with an elementary experiential representation. Each sign that he produces during the course of the interaction with Janine progressively indicates a more precisely defined subclass of surprise, as the progression throughout the episode shows. The increasingly precise experiential representations guide the action selections in relation to his expectations and the perspectives that inform his expectations. The grounding of Kanzi’s utterances is far more implicit and less highly specified compared to the grammatical resources of the clause. Tense, modality and polarity (Section 5, Part 1) are not present in his utterances. Nevertheless, his utterances function as moves in the discourse between himself and Janine. On this basis, we can say that his utterances specify a point of reference in relation to the here-now (the ground) of the speech event and therefore the modal or proto-modal perspectives that such points of reference entail. Consider the following excerpt from Table 3 above:

3 Janine: can you tell me if you’d like some surprises?
4 Kanzi: touches lexigram:
5 [Sue: surprise]
6 Janine: you would, ok

Janine’s interrogative utterance asks for Kanzi’s modal perspective on the proposition in her clause. That is, she grounds her utterance in terms of a modal assessment of ‘ability’ (can) and ‘desire’ (would like) on the part of her interlocutor, Kanzi. In other words, she seeks to make Kanzi modally responsible for the meaning of the proposition. That is built into the meaning of interrogative mood. Thus, the selection of second person deixis (you) as Subject ties Kanzi to the ground as the participant who is required to take up the speaking role of responding to Janine’s interrogative utterance. From Janine’s perspective, as expressed in her utterance, Kanzi is positioned as the embodiment of a symbolically mediated role-type which entails certain expectations and concomitant responsibilities concerning the appropriate course of action to adopt with respect to this role-type. Kanzi’s response, i.e. , does not, of course, have any of the grammatical features discussed above. Nor is his perspective as semantically specified as Janine’s. Nevertheless, it clearly conforms to the expectations created by Janine’s utterance just as he takes up the corresponding participant role and negotiates it in ways that lead to a situationally appropriate resolution of this part of the exchange, as shown in Janine’s response in (3). In this sense, Kanzi’s response and the associated role relation are of a semantic type that was correctly anticipated by Janine’s utterance. However, his own perspectives also inform Kanzi’s response. These perspectives accordingly inform the grounding of his utterance. First, his utterance is located in time with respect to the moment of utterance. In the present case, his lexigram sign is located as temporally removed from the ground, i.e. as ‘not yet attained, but desired’. 
Second, his utterance is also grounded in terms of a proto-modal expectation that Janine will fulfil the desired surprise at some future moment. Third, his utterance is grounded by being tied to his role as first person speaker, who responds to Janine’s interrogative utterance in a contextually appropriate way. Kanzi’s embodiment of the speaking role in question is itself a form of indexical grounding in the here-now. However, the symbolic dimensions of his utterance also show a capacity to integrate his embodiment to role-types and actions that transcend his embodiment in ways that integrate him to a set of social roles and ways of acting. The structure of this symbolic grounding is more implicit and less specified than the resources of clause grammar in human language. However, it is not absent and clearly functions in this episode as well as in other related episodes in Kanzi’s interactions with humans in the pan-human cultural environment that he participates in (Benson, 2002; Benson et al., 2002; Benson and Greaves, 2005; Cowley, 2005; Thibault, 2004c; 2005a). The implicit grounding of his lexigram sign can be unpacked as follows:

1 implicit first- and second-person speaking and listening roles;
2 temporal grounding in relation to implicit timeline of past, present, and future mediated by memory-governed meanings;
3 implicit modal perspective {Expectation} evaluating the proposed action from his first-person perspective along with limited capacity to generalise expectations and role relations to the second person perspective of the addressee.

The utterance is grounded in terms of the implicit proto-modality {Expectation}. Kanzi’s utterance therefore locates the utterance in terms of a modal stance that I shall roughly gloss as ‘not yet attained, but expected at some future time’. 
Moreover, his utterance is also grounded by the implicit person deixis that provides a link between his utterance, the speech event, and the role relations of himself and Janine in the unfolding event. The absence of explicit grammatical features equivalent to person deixis suggests a certain leeway in the possible interpretations of the ways in which the utterance is linked to the I-you of the speech event. It is, of course, Kanzi’s utterance and in this sense it is clearly enough linked to Kanzi as first bonobo addresser. By the same token, Janine’s two-part response, i.e. you would, ok, takes up and negotiates Kanzi’s move both as an answer to her question (you would) and as a proposed action, i.e. that she give him the surprise (ok). In other words, Kanzi’s lexigram sign can be interpreted both as a kind of response statement to her question (‘yes I would’) and a proposal for action (‘get me a surprise’). The grounding of Kanzi’s utterance can be unpacked as follows:

[I want [I expect [FOOD – SURPRISE] YOU GET IT FOR ME]]]

Each individual’s contributions to the interaction – his or her discourse moves – are modalised takes on the emerging situation which constitute a certain kind of interactive potential. For example, Kanzi’s move is an indication of both a want and an expectation concerning the future outcome of that want. His lexigram sign does not stand in a correspondence relation with something that is already there in the situation. Rather, his sign constitutes a particular development of the interaction at the same time that it anticipates possible future developments, as shown above. Furthermore, both Kanzi and Janine treat his move as consistent with this possibility in ways that guide and modulate the future development of their interaction. It is also possible that a given move and the perspective that this implicates are not consistent with the perspectives of other participants, their understandings or with some feature of the situation to hand. In such cases, the representation of the situation may be in error, false, incorrect, wrong, inappropriate, and so on. However, a recursively self-transforming learning system is able to detect such error and to correct it (Bickhard, 2005). The interpersonal grounding of discourse moves with respect to the speaker-listener’s temporal and modal perspectives on that move and its referent situation, along with the person deixis of the Subject, means that such error can be detected and corrected in and through the dialogic uptake, negotiation and development of these perspectives. There is no need for the term to stand in a relation of correspondence with an external state of affairs. There are no entities ‘out there’ in the real world that the term might correspond to. That is not what representation is all about. Rather, the representation is an exploratory anticipation of interactive potentiality. It is future-oriented and anticipatory. 
Moreover, it may anticipate erroneously, incorrectly, inappropriately, and so on, in ways that can be argued about and corrected through further discourse negotiation. Furthermore, the grounding resources of both the expression and content strata show that representation is always tied to dialogically coordinated action and interaction and the always embodied and contextually embedded temporal and modal perspectives which ground representations in particular contexts.

The interaction between Kanzi and Janine is future-oriented and anticipatory. Kanzi’s use of explicit representational resources as afforded by the lexigram also implicates modalised stances on the interaction. In his case, I have called these proto-modalities to indicate that these are vaguer, less specified analogues of the semantically more highly specified modalities in natural (human) languages. Kanzi’s encounter with Janine on the phone is a contingent event. From Kanzi’s perspective, Janine, who is herself socialised to the pan-human cultural environment, affords Kanzi possibilities of interaction. At the beginning of their phone conversation, Janine announces her intention to bring some surprises to Kanzi. Kanzi has no trouble understanding what Janine says in spite of his being initially somewhat disconcerted by the telephone voice. The point is that Janine’s discourse move affords different semantic possibilities for its further negotiation. There are different possible perspectives on it and different ways of taking up and negotiating its meaning potential. Some of these are directly available in the linguistic form of the utterance. Others require the potential meaning of the move to be turned into an actuality by further development of this potential through negotiation. This is what happens here. The first mention of ‘surprise’ is grounded as a future-oriented intention or undertaking (I’m going to bring a backpack with some surprises). Janine then checks out Kanzi’s own perspective (would you like some surprises? can you tell me if you’d like some surprises?). Kanzi confirms his interest. In doing so, he gives a further interactive twist to Janine’s prior moves and therefore helps to nudge the negotiation down a certain pathway so as to bring a particular modalised take on the situation closer to its realisation as an actualised material affordance that he can exploit for his own gratification. 
Each participant’s moves are a form of interactive potential that anticipates possible future developments of the system of relations. These developments are informed by the perspectives of participants. The representations that are so produced must be meaningful to the agent that uses them. The agent must have in its central nervous system a semiotic model – a system of interpretance – that enables it to interpret and use its memories as signs about its external and internal environments. Such an internalised model enables the agent to use memories of past external and internal events as signs that can regulate its responses to present (actual) and future (possible) events in its environment. A model of this kind creates expectations and therefore allows the agent to anticipate possible environmental events and contingencies. Memories are potential meanings rather than stored ‘information’ because memories exist in the perspective of a self in relation to the contexts in which the memories have meaning for the self. The ‘retrieval’ of a given memory from the past in newly contingent present circumstances is always a recontextualisation of that memory in response to present circumstances (see also Peng, 2003: 42–3). Memory then becomes cognition and functions to select and determine the organism’s orientation to its environment both present and future. As Peng (op. cit.) points out, memory and cognition ‘are two sides of the same coin’. Memory is the tale that wags the dog of anticipation. Arguability means that agents can adapt to the failure of their discursively posited propositions and proposals by reorganising the interaction itself so that error is corrected. The success of a proposition or proposal is always a modal success (Halliday, 1994: 76–8; Halliday and Matthiessen, 2004: 117–20; Section 5, Part 1). This means that through its own interactive processes the agent can modify and add to its own agency. 
In doing so, it learns to anticipate and therefore to act on its own agency and the perspectives that its agency affords the agent for action, interaction, and learning. In the following sections, I shall consider some of the ways in which the combined anticipatory dynamics of interpersonal negotiation and experiential construal contribute to the building up of a matrix of interactive potential in a learning situation involving a small group of young children.

14 Introducing the episode to be analysed

The brief episode to be analysed in this section is from a video recording of a computer game involving six children, aged between 7 and 9. The game takes place in the school library at a school near Nottingham in England and lasts for about 75 minutes. On the prompting of the adult supervisor (Anthony), who is also present, the children divide themselves into two groups, consisting of three boys versus three girls. The episode is transcribed in Appendix I. The computer game was developed by Anthony Baldry and is based on the computer programme HyperContext (Baldry, 1996: 149–56; Baldry et al., 1994; Piastra and Lombardi, 2000: 247–62; Baldry, 2005). During the game, the children are required both to assign story fragments to an appropriate genre and to assemble them in an appropriate order. The game also prompted children to invent their own stories. At the start of the game, the following instructions are shown on the computer screen:

Decide which group (Group A or Group B) will start the game and click the corresponding button. Here are 5 stories: choose the one you like best and click it.

1. Once upon a time a lion lived in a forest …
2. A group of aliens live on a planet very far away from the Earth …
3. According to the myth, there was a time when proud heroes did courageous undertakings …
4. A young sailor is in search of adventure …
5. The gang of three thieves is in town again …

Can you find the right continuation to the stories the computer will show you? Click the ‘Forward’ and ‘Back’ buttons at the top of the screen to see all the possible continuations. When you’ve found the story continuation you think is right click the ‘Choose’ Button. Divide yourselves into two teams: every time your team makes a good move you win 5 points. So before you make your choice, discuss it carefully among your team. After you’ve made your choice, pass the mouse to the other team. Try to use the story fragments to invent your own continuations and try and explain what types of stories are being told. This will help you answer the questions at the end. Good luck! May the best team win! Click here to continue.

The episode transcribed in Appendix I refers to the second story in the above list, which is about a group of aliens from another planet. The following text about the aliens appears on the computer screen at this stage of the game and requires a response from one of the two groups who are playing the game. All the inhabitants are shocked by the appearance of these strange beings. They keep on asking each other ‘Will they be peaceful and friendly or will they be really nasty?’ Before you go on: pretend that the ‘invaders’ are really nasty, horrid creatures. Your group must give a description of their characteristics.

The transcript starts at this point in the game when two of the boys, designated Boy1 and Boy2 in the analysis and discussion below, on the supervisor’s prompting, engage in the task of describing the characteristics of the aliens. Only the first few utterances of this extended episode, which lasts approximately 02:30 minutes, are shown in the transcription for the purposes of the analysis and discussion below.

15 Affordances for action and meaning

In the theory of distributed cognition that is of interest here, the brain makes use of external resources of various kinds in order to simplify the computational tasks it is required to carry out. Cognition is distributed between brain, body, and external environment. What are the forms of external scaffolding or the semiotic technologies that enable Boy1 to perform the action that we observe? In the present case, these external resources will be subdivided into two analytical groups: (1) environmental affordances that agents can use; and (2) the perceived regularities of the events in which they participate or which they observe.

Relevant affordances include: the texts and images displayed on the computer screen; the mouse; the instructor; the other children; one’s own body; the spatial organisation of the room; the furniture.

Regularities of the situation and/or the activities which take place include: the division into two teams of boys and girls; playing the game and its rules; story genres; instructional genres.

In the situation shown in Transcription I, Boy1 is confronted with sets of contextual constraints and possibilities for action that interact with each other in a seamless way whenever he makes a decision as to what to do. Thus, he can respond to specific affordances, he can go with the flow of the activity and its regularities and related expectations, he can formulate new plans for acting in and maybe changing the situation, and so on. Boy1 takes up and responds to the instruction shown on the computer screen. This instruction is also repeated by Anthony. The instruction asks the boy’s team to describe the characteristics of the aliens. In responding, Boy1 takes up a performance role in a definable social activity-structure type or discourse genre. This can be schematised as follows: give instruction^respond with appropriate action. We can assume that his response is guided by a proximate intention, which can be schematised as follows: respond to instruction: describe characteristics of the aliens. The specific course of action that he adopts involves a very close synchronisation of his vocalisations with other sensorimotor activity as he acts upon Boy2 and turns him into an impromptu alien, as the discussion in this section shows. Language, as we shall see below, has a number of different roles to play in this particular situation in relation to Boy1’s actions with Boy2. 
These roles relate to factors such as:

1 Affective modulation of action: self-confident; pleasurable: having fun;
2 Recruiting external affordances in the fulfilment of specific intentions and plans;
3 Exploring a novel task and coming up with interesting solutions;
4 The role of values and motivations in fulfilling the goal of describing their characteristics;
5 Social norms: pleasing the instructor, getting approval;
6 Social norms: solidarity with norms of the game; friendly competition with others.

In one sense, we can say that Boy1 and Boy2 engage in a cooperative form of joint bodily engagement when Boy1 pulls Boy2’s nose and Boy2 willingly plays the role of the recipient of Boy1’s action. Boy1 thus uses selected parts of Boy2’s body (his nose and his hair) and clothing (his tie) as an external cognitive-semiotic resource. Boy1 uses Boy2 to create a link between the textual mention of the aliens in the displayed text on the computer screen and Boy2’s body. In so doing, he compares Boy2 to the aliens in the screen text. At the same time, Boy1 is also creating a link between Boy2’s body and the goal that he (Boy1) posits in the immediate discourse context, i.e. responding to the instructions on the computer screen. In so doing, he is relating the newly contingent relation he has established between the aliens in the screen text and Boy2-as-alien to his own goal in this particular phase of the activity. Both the vocal activity and Boy2’s body are environmental affordances which Boy1 can exploit in order to bring about some change both in the external environment as well as in the brains of himself and his addressees. In both cases, Boy1 modulates environmental affordances in order to articulate specific effects and furthermore to produce or induce particular effects and changes in others. In the case of his vocal activity, it is his own vocal apparatus that is modulated in the production of speech sounds. In the case of Boy2’s body, Boy1 performs actions on it with his own body (particularly with his hands) in order to modulate and reshape Boy2 in the service of his specific goals at this point in the interaction. 
It is in the process of articulating and modulating or reshaping a material resource for interactive purposes that the addresser not only gets stimulus information to the perceptual systems of other individuals, but also provides the brain with pattern-completing material for the specification of second-order meanings which are not necessarily tied to the immediate environment and which have the potential to span and connect diverse space-time scales (Lemke, 2000; Thibault, 2000; 2004a; 2004b). The first boy’s utterance, together with the action of pulling the other boy’s ears, initiates a cascade of interactive and exploratory potential. The initial utterance draws attention to certain possibilities for visual and haptic exploration that the second boy’s body parts (e.g. ears) and clothing (e.g. tie) afford the development of the aliens theme. Each utterance anticipates further interactive potential for such exploration as its coupling to specific body parts or items of clothing shows. The second boy’s body parts and clothing constitute multiple possibilities for further interactive potential as the aliens theme is further developed by each successive move, as set out below. At a later stage in this episode (see Table 4), the second boy joins in with his own contributions after having been the recipient of the first boy’s exploratory and interactive efforts earlier on. In this way, we see how the aliens theme and the local resources that get drawn into its matrix can be taken up and negotiated from the different perspectives of different agents. Moreover, the body parts and items of clothing can potentially be re-accessed on different occasions and mediated by different discursive uptakes by the same or different agents even when the particular situation to hand has come to pass. Each discourse move is a complex coupling in real-time of linguistic, perceptual, and other semiotic modalities. 
It is a time-locked anticipatory interactive potential. As such, it is also a learning potential that contributes to the construction of a matrix of interactive potential that can be accessed in different ways by different agents who enter the matrix (see Hoey, 2001: 93–118 for the idea of the matrix, which I have adapted to suit my own purposes here). Table 4 presents a matrix view of this interactive potential in this episode.

Move | Boy 1 | Boy 2

1 | well they sort of look like this + pulls ears | low level sound in response to head being twisted

2 | with their ties go like that + pulls tie | indecipherable soft vocalisation

3 | and they have three claws

4 | they have a … they have a crinkled face + Jo passes his right hand over Ja’s face; his left hand rests on Ja’s right shoulder on ‘face’ | stands

5 | and their nose is long + pulls nose | says aaahh (in response to nose being pulled)

6 | and their two front teeth are huge + Ja’s right hand moves across Jo’s mouth, gently drawing Jo’s head to one side; Jo resists and returns head to previous position | extended vocalisation after ‘huge’

7 | and the rest of them are tiny

8 | all their hair is sticking up + Ja’s left hand takes Jo’s hair and pulls it up

9 | and they’ve got slime all over … + Ja moves his two hands over Jo and turns towards An.

10 | looks toward camera and smiles | they’ve got very long tongues

11 | like um green slime

12 | and they’ve got very long tongues

13 | Jo begins to extend both arms before him | makes low level ‘alien’ sound

14 | Ja twists his arms, makes contorted movements with his body | and they talk like …

15 | they talk like this

16 | ‘aaaaaah’

17 | both Ja and Jo continue ‘alien’ body movements | both Ja and Jo continue ‘alien’ body movements

18 | oooooaaaaayau (increasing in volume, reaching a crescendo on ‘yau’)

Table 4: Matrix view of interaction between the two boys in ‘Aliens’ episode

In trying to figure out what the aliens might look like or, in other words, to find a solution to the problem of describing the characteristics of the aliens as required by the instruction displayed on the computer screen in the game, Boy1 radically simplifies the cognitive complexity of the problem by exploiting and modulating an environmental resource, viz. Boy2’s body. He does so by producing a match between what he does with Boy2’s body and the mention of the aliens in the text on the computer screen. Rather than relying on inner mental representations of aliens that are somehow communicated or transmitted from one mind to another, this match is based on the computationally much more economical and direct process of visually attending to Boy2’s reconfigured alien body and relating it back to the text on the computer screen. This helps to explain the relevance of language here. The attributive clause, well they sort of look like this, functions to draw out and to bring into focus the interactive potential of this particular affordance. In reconfiguring Boy2’s body in this way and by proposing a comparison between this and the aliens in the computer text, Boy1 produces an elegant and immediate solution to the problem of describing the characteristics of the aliens. On this account, the burden of the explanation is not placed on inner computational processes inside the brain of the individual, but on the ways in which computation and semiosis are distributed among brains, bodies, and resources and affordances in the external ecosocial semiotic environment.

In this view, language in all its manifestations is a resource that both constrains and enables particular ways of acting and understanding. Language is an external resource that enables cognitive tasks to be reshaped and extended (Clark, 1997; 2001a; 2001b). For example, Boy1’s utterance well they sort of look like this specifies the possibility of a visual match or comparison between the aliens previously mentioned in the computer text and what he does with Boy2 in the here-now of the utterance. In doing so, it directs attention to a new source of perceptual stimulation. Moreover, it reshapes the cognitive-semiotic landscape by allowing a visual display to complete and extend in a different modality the meaning of a linguistic utterance and by extension the prior mention of ‘aliens’ in the computer text. By the same token, Boy1’s utterance fits into the generic activity of give instruction^respond to instruction in contextually appropriate ways. Boy1 dialogically responds to the instruction, interprets it, and further develops its local meaning potential in ways that are relevant to the current situation and to his own goals in it. The instruction does not therefore play an executive role in the discourse; it, too, is a local resource. It affords its further uptake and semantic development in ways that depend on the contingencies of the situation as well as on the genre and other social conventions and constraints at play.

All modes of semiotic activity can accordingly be seen as cascading webs of resources that are distributed across potentially many different space-time scales. Instead of saying that these resources are (potential) representations of things in the world – real or imagined – the matter can be formulated differently. The production of material semiotic artifacts and texts of all kinds does not provide stimulus information about the external environment (the world around us) in exactly the same way that events in the environment do. In the first instance, artifacts of this kind are man-made objects and processes which have been modulated so as to attract the attention of the observer.
Moreover, they direct his or her attention to second- and higher-order patterns that are not perceived except in the mind’s eye, so to speak, and which belong to or can be integrated to a shared cognitive-semiotic environment linking many different space-time scales. The activity of the individual’s body and brain is not only an integral and functioning part of larger-scale semiotic and material processes, but this activity is also contextually extended and completed by semiotic-material artifacts which exist in the external environment of the body-brain, but which are selectively imported into its internal dynamics by the body-brain’s participation in these same activities. However, the starting point of such processes lies in the articulatory processes whereby material expressive resources – somatic or extra-somatic – are modulated and shaped so as to (1) get the attention of others by providing their perceptual systems with stimulus information; and (2) use this information to specify meanings which are not present in the stimulus information or its mechanical source. One of the things that language does from this point of view is to guide and shape the activity of self and others and to enable them to orient to and understand phenomena that might not be accessible to direct perception.

In doing what he does to Boy2, Boy1 does more than simply play a role in the maintenance of an ongoing activity, i.e. responding to the instructions both displayed on the screen and read by the researcher to the children. In this perspective, Boy1 shows his ability to adapt an available resource to fit the requirements of the activity to hand and therefore to conform to people’s expectations concerning the conduct of such interpersonally coordinated activities.
In exploiting Boy2 as an articulatory resource in his immediate environment and in modulating this resource in order to achieve his own interactive ends, Boy1 also shows his ability to interpolate what he does with Boy2 into the flow of the activity itself. In other words, Boy1 makes Boy2 into a sign. The activities which Boy1 performs on Boy2 are contextually integrated with other aspects of the overall activity such that aspects of Boy2 are made into signs of aliens at the same time that the resulting signs are the means whereby diverse features of the situation are integrated into an emerging understanding of the aliens. In this way, Boy1 acts on Boy2 in ways that other participants and observers can contextualise as having a meaning and relevance to the ongoing situation. In one sense, what Boy1 does is a productive uptake of the prior instruction. As such, it is a contribution to a jointly constructed social activity and its regularities. In another sense, it is a creative act: Boy1 creates an on-the-fly sign which enables him to direct and modulate the flow of the activity, to creatively modify the activity, and to build connections with meanings and events in other times and places removed from the present activity. The articulation and modulation of expressive resources, when connected up with the larger-scale ecosocial semiotic environment in the way described earlier, means that such resources can participate in processes of semiotic mediation between individuals and across diverse space-time scales. Moreover, the mediating role of the signs that arise in this way is always embedded in interpersonally coordinated activity. Boy1’s response to the instruction on the computer screen is not, however, an external response to a behavioural stimulus. Rather, it is motivated by and intended for a specific addressee. Boy1 treats Boy2 as a mediational resource (Vygotsky, [1934] 1986; Hasan, 1992b).
The meaning-making potential of such resources is not limited to the here-now scale, but can be used so as to have effects on diverse space-time scales. This is not unlike the ways in which some features of the human voice enable it to produce effects beyond the here and now; Boy1’s articulation and modulation of Boy2’s body as a sign means that it can take on a particular meaning in the situation. Moreover, by virtue of his actions serving both to achieve an immediate goal in the discourse (respond to the instruction) and to play its part in the enactment and maintenance of an interpersonally coordinated activity, he provides his interlocutors with resources for developing and extending their own understanding of the task to hand. Later in the episode, Boy2 provides his own uptake on and further development of the situation, which was instigated by Boy1. This gives rise to the dialogical interweaving of the voices and bodily actions of the two boys as they co-develop the meaning of the aliens that Boy1 initiated in Visual Frame 9 (see Transcription in Appendix I). The use of their vocalisations – linguistic and non-linguistic – further shows how specific sound patterns are connected to particular goals and understandings of the event. In the next Section, I shall consider the grounding functions of deixis and how these function to integrate language and action.

16 Deixis, grounding, and the integration of language and action

From the point of view of the activity in which the discourse is embedded, deixis can be understood in connection with the ways in which discourse items are grounded in relation to the activity and the participants in that activity. The linguistic component of the activity emerges out of and in relation to other aspects of the overall activity. At the same time, the linguistic aspects also shape the non-linguistic aspects of the activity. In the clause, they sort of look like this, the item referred to by this is defined in terms of the way it relates both to Boy1 as speaker and to Boy1’s immediate goal in the discourse (responding to the displayed text on the computer screen about the aliens by describing them). Deixis functions to index relevant contextual features relative to the subjective purview of the speaker and listener. The deictic word this functions to coordinate the discourse participants in a shared visual-spatial-temporal purview.

Generally speaking, the meaning of the demonstrative pronoun this can be described in terms of a number of semantic parameters, as follows: [deixis: spatial: proximal; number: singular; reference category: thing; phoricity: exophoric or endophoric]. The deictic category [proximal] indicates that the item referred to is near the speaker without explicitly referring to the speaker. This category may or may not refer to physical distance from the speaker; it can also indicate the speaker’s subjective stance or perspective on the item referred to. The category of [number] quantifies the item referred to as a single instance. The reference category specifies the experiential class of thing referred to, as indicated by the Head noun report in, for example, the nominal group this report. In English, the demonstrative pronouns this and that can have either endophoric or exophoric reference. In the present example, the demonstrative this points forward to the Head noun report in its nominal group; this presents this item to the reader, which is the first clause of the text in which the item occurs. Martin (1992: 98–102), following Ellis (1971), refers to this type of endophoric reference as esophoric reference as distinct from cataphoric reference. Esophoric reference points forward to an item within the same nominal group; cataphoric reference points forward to an item beyond the nominal group in which the phoric item occurs.

The meaning of this depends on its temporal synchronisation with the non-linguistic things that Boy1 does to Boy2. The utterance gives shape and meaning to an emergent non-linguistic action. Boy1 directs the attention of those present to the action he performs on Boy2. In this way, he coordinates the participation of all concerned in a shared frame of bodily attention and experience. In Uexküll’s (1909; 1982) terms, the utterance does so by connecting the structure of the organism to its Umwelt. In the present example, the deictic this extends and further develops the meaning of the hands: the hands create a vector which links one boy’s body-space to the other’s in a shared orientational framework. The deictic this creates a shared vector of interest and attention that extends from one body-space to another. Boy1 treats Boy2 as an extra-somatic affordance with interactive potential. As we shall see in more detail below, this means that deictics are directly implicated in perception-action routines and hence in intentionally directed bodily activity when agents interact with their ecosocial semiotic environment. The spatial meaning of this is ‘proximate’ in contrast with the ‘distal’ meaning of that. These meanings are centred on the subjective reality of ego and ego’s living, interacting body.
The spatial meanings of these deictic categories derive from and extend the operations and effects of our perceptual-motor systems; the perceptual-motor system creates the spatial structure and organisation in terms of which we orient to and interact with our surroundings. The brain receives a continual stream of exteroceptive multimodal stimulus information about the world outside the body. On this basis, the brain forms models of the environment with which the body interacts. At the same time, the brain also receives a continual stream of proprioceptive information from the muscles, limbs, joints, and so on, of the body. On this basis, the brain forms a model of the body as the locus of one’s subjective experience of the self (Damasio, [1994] 1996; 2000). The brain’s models of the body and of the outside world form a unity and are the basis of our sense of a self in relation to yet distinct from the nonself beyond our body frame (Thibault, 2004a: 195–201). Boy1’s utterance well they sort of look like this locates the action he performs within his own body sphere as the source of the action. The action is also directed to someone else’s body (Boy2) and provides onlookers with visual and cinesic stimulus information that enables them to orient to and to interpret Boy1’s action from their own perspective.

Language has grammatical resources for re-grounding a proposition in the perspective of the addressee independently of the author/speaker of the utterance (Thibault, 2004a: 190–4; see also Section 5, Part 1). A non-linguistic action such as Boy1’s pulling Boy2’s ears, tie, and so on, is, however, no less grounded in the here-now of the speech situation. It occurs in a closely synchronised temporal and spatial relation with the linguistic aspects of the activity. At the same time, its own meaning and significance is precisely determined by the way it (normatively) functions in that activity. Non-linguistic and linguistic dimensions of activity are co-temporal (Harris, 1981: 157–64).
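
The semantic parameters given in this Section for the demonstrative this, and the distinction between exophoric, esophoric, and cataphoric reference, can be restated as a small feature bundle. The following is only an illustrative sketch in Python and not part of the analysis itself: the names Distance, Phoricity, Demonstrative, and classify_forward_reference are hypothetical, the entry for that assumes that it shares all parameters of this apart from distance, and the classification covers only the forward-pointing and situational cases discussed above (it does not model anaphoric reference).

```python
from dataclasses import dataclass
from enum import Enum


class Distance(Enum):
    PROXIMAL = "proximal"  # near the speaker, physically or in subjective stance
    DISTAL = "distal"


class Phoricity(Enum):
    EXOPHORIC = "exophoric"    # referent lies in the situation, outside the text
    ESOPHORIC = "esophoric"    # points forward within the same nominal group
    CATAPHORIC = "cataphoric"  # points forward beyond the nominal group


@dataclass(frozen=True)
class Demonstrative:
    """Feature bundle: [deixis: spatial; number; reference category]."""
    form: str
    distance: Distance
    number: str
    reference_category: str


THIS = Demonstrative("this", Distance.PROXIMAL, "singular", "thing")
# Assumption: 'that' differs from 'this' only in the distance parameter.
THAT = Demonstrative("that", Distance.DISTAL, "singular", "thing")


def classify_forward_reference(referent_in_text: bool,
                               same_nominal_group: bool) -> Phoricity:
    """Hypothetical helper covering only the cases discussed in this Section."""
    if not referent_in_text:
        return Phoricity.EXOPHORIC
    return Phoricity.ESOPHORIC if same_nominal_group else Phoricity.CATAPHORIC


# Boy1's 'this' in 'they sort of look like this': the referent is what he
# does to Boy2's body, i.e. something in the situation rather than the text.
boy1_this = classify_forward_reference(referent_in_text=False,
                                       same_nominal_group=False)

# 'this report': the referent is the Head noun of the same nominal group.
this_report = classify_forward_reference(referent_in_text=True,
                                         same_nominal_group=True)
```

On this sketch, Boy1’s this is classified as exophoric and the this of this report as esophoric. What the bundle leaves out is precisely what the discussion above emphasises: the temporal synchronisation of the deictic with the bodily action that completes its meaning.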

17 Extending the body-brain: coupling vocal activity, body movement and environment

All meanings must in some way be embodied. The initial problem then is to find ways of getting stimulus information to the perceptual systems of other individuals. This is a primary function of the material expression stratum of any semiotic modality. However, the expression stratum is not an inert material that carries a meaning or content from one mind to another. The expression stratum must be enacted and modulated by specific articulatory processes. These may be in the form of somatic or extra-somatic material resources. Boy1’s use of Boy2’s body along with his vocalisation can be seen in this light. Discourses, activities, participants, and contexts are not separate entities that are then hooked up (Ziemke, 1999: 178–9). The deictic and other grounding resources of language are embodied resources. They have both co-evolved in the species and co-developed in individuals through the cross-coupling and mutual specification of human agents and their environments (see Section 2, Part 1). This perspective directly implicates the role of the body as the means through which agents interact with and participate in their environments. On both the expression and content strata of natural languages, deictic and other indexical resources embed the agent in its physical and semiotic environments. If we go back to our example, we can see that the grounding resources of language on the content stratum are attuned to and extend in meaningful ways the sensori-motor capacities of the body. Thus, the deictic meaning ‘proximal’ in the case of this interprets a visual-spatial field in relation to the speech event; it also implicates an indexical act of pointing to something in that field. In the present example, this involves more than merely pointing to a designated object.
Boy1 reaches for, grasps, and pulls Boy2’s ears in order to turn them into ‘alien’ ears for the purposes of the comparison with the aliens mentioned in the computer text. What is referred to is a graspable and manipulable entity (Boy2’s ears) which is made relevant to the participants in the situation through a combination of different modes of attention (Gibson, [1966] 1983: 50) such as looking, touching and language. Moreover, the grounding resources of the linguistic utterance extend and hone these modes of attention in contextually relevant ways. A further mode of attention that is relevant is afforded by the basic orienting system of body movement: the upright posture of the two boys and their close physical proximity to each other enable a jointly coordinated action-structure to be created in space and time on the basis of the interaction between the two boys’ bodies.

The vocalisation in question qua vibratory event is in the first instance a source of stimulus information; it gives the listener information about the speaker. However, the sound is also used to direct the attention of the listener towards something in the environment that is external to the speaker. There is a great deal of synchronisation between the vocalisation and the bodily interaction between the two boys. Boy1 takes Boy2’s ears and pulls them outwards whilst he gently twists his head to one side. Whilst Boy1 performs this action on Boy2, Boy2 holds his outstretched arms and hands before himself and utters a very soft non-linguistic vocalisation in response to Boy1’s action. The vocalisation and the action are synchronous both in time and space. They occur at the same time; moreover, the vocalisation directs the listener’s attention to the visual-spatial event performed by Boy1 on Boy2. The vocalisation is used to specify something in the shared environment of Boy1 and the others present.
Observers are both listening to what Boy1 says and looking at the action he performs. However, this ‘something in the environment’ is not objectively ‘out there’, so to speak, but is created in and through the joint bodily action of the two boys.

For the purposes of our analysis and discussion, the utterance can be divided into two tone units. The first tone unit has rising tone while the second has falling-rising tone. Rising tone characterises a constituent in the discourse that is already given or presumed in the discourse. Brazil refers to this tone as Referring tone. Falling tone characterises a constituent that is presented as new information. Brazil refers to this tone as Proclaiming tone (see Brazil, 1981: 48–9). The rising unit and the falling-rising unit coincide in time with two distinct micro-phases in Boy1’s body movement and body posture. Figure 2 shows the analysis of the sound wave of this utterance, including the pitch contour.

Figure 2: Praat spectrogram and spectrograph analysis of well they sort of look like this; pitch settings: 150–500 Hz; spectrogram settings: 0–8000 Hz.

The synergy of vocalisation and body movement in Boy1’s utterance can be analysed as follows. Immediately prior to this utterance, Boy1 was leaning towards the computer screen in response to Anthony’s verbal instruction ‘come on …’. Prior to this moment, Boy1 was standing upright. He leans briefly towards the screen to attend to it in response to what Anthony says shortly before uttering the clause that is the focus of our analysis here. During the tone unit that has rising tone (well), Boy1 moves from leaning towards the computer screen to the upright position. The combination of this tone unit and this phase in Boy1’s body movement constitutes one specific area of concern and orientation, viz. the text on the screen about the aliens and therefore the textual source of the ‘they’. In the second phase, there is a shift in focus and orientation away from the computer screen and its text about the aliens to a new interactional space – both physical and semiotic. This entails a new focus on what Boy1 does to Boy2.

With the onset of the tone unit with falling-rising tone (they sort of look like this), Boy1 turns towards Boy2 and then takes Boy2’s ears and pulls them in order to turn Boy2 into an alien. In the first part of the vocalisation, falling pitch extends over they sort of look like. On uttering the word this, Boy1’s voice rises sharply in pitch; it takes on a creaky quality at the same time that the single syllable word this is lengthened over the time span of Boy1’s holding and pulling Boy2’s ears. In the micro-time of the utterance, there is a very high degree of synchronisation between Boy1’s vocalisation and the actions he performs on Boy2. Both Boy1’s vocalisation and his body movements are perceptual-motor activities. In the real-time of the utterance, they constitute a coupled system, as defined above.
The rise in pitch, the lengthening of [ðɪ::s], the increase in amplitude, and the creaky quality in Boy1’s voice on this syllable coincide with the pulling of Boy2’s ears and therefore with the first stage in the presented scene which the two boys jointly enact. These features of Boy1’s voice, along with the body movement, give rise to a synergy of perceptual-motor factors and sustain a focus of attention across a number of perceptual modalities, notably the auditory, visual, and movement modalities. From the points of view of the two boys, we could also add to this list the haptic modality of touch and haptic exploration, for Boy1’s uttering of /ðɪs/ also coincides with his taking Boy2’s ears in his hands and pulling them. This action itself elicits a barely audible vocal response from Boy2.

This synergy of interacting semiotic and perceptual modalities gives rise to an interesting and pleasure-giving scene. The micro-temporal dynamics of just one vocalisation (see Cowley, 1994; 1997; 1998), coupled to the joint body movements of the two boys, suggest some possible ways in which different modalities – vocal and cinesic – become coupled in real-time. It is this coupling that leads to the perception of the resulting whole as having distinct phases and hence a distinctive episodic or event structure (see Sarles, 1977: 236). The changes in, for example, amplitude, pitch, and voice quality that I described above coincide with distinctive phases of body posture and movement. In turn, the coupling of these factors to various perceptual modalities produces an orientational frame for observers of the scene.

The synergy of vocal activity and body movement in the present example gives us a glimpse into the ways in which the two are inseparable components of a single overall bodily activity. The vocalisations and movements of one of the boys elicit a response from the other; in turn, this response gives rise to a further response from the other, and so on.
The loop of action-followed-by-response between the two boys is dyadic and is increasingly elaborated as their interaction proceeds. Boy1 turns towards Boy2 at the onset of his initial utterance well they sort of … . Boy2 responds by spreading his arms and hands outwards almost as if he were going to embrace Boy1. In the final part of Boy1’s utterance, viz. look like this, Boy1 takes Boy2’s ears in his two hands and pulls the two ears outwards.

In both of these micro-level phases of the activity, the vocalisation and the body movements of the two boys are inseparable components of a single process. It is not the case that the vocalisation simply comments on an otherwise separate body movement or that the body movement demonstrates or reflects what is said. Instead, the cinesic and the vocal modalities are really different manifestations in different perceptual modalities of the one overall process (Sarles, 1977: 200). The vocalisation and the body movement together function to specify something that is external to the individual’s body in the shared environment of the participants in this event.

In other words, the visual, aural, cinesic, haptic, and other sources of stimulus information about the observed event tell us about more than just the biomechanical sources of these events. Both the vocalisation and the movement direct the attention of speaker and listeners and onlookers not only to the observable scene which the two boys enact, but also to a second-order perception of the aliens which Boy1 has imagined and which, through his actions, he is able to propose to the others present. This second-order perception is integrated to the developing story about the aliens in the unfolding discourse. In their different ways, both modalities contribute to this process. The focus is not simply on the observable body movements of the two boys per se. These movements belong to the first-order reality of phenomena that we can perceive through our perceptual systems. The focus is also on the ways in which the body movements symbolise things (the aliens) which are not present, but which can act as sources of stimulation in their own right. This symbolic potential is a result of the time-locked integration of vocalisation and body movement qua external resources or affordances that brains can lock into for the purposes of integrating their own patterns of neural activity with patterns of external activity.
It is this time-locked pattern-completing activity that enables meaning to emerge. The vocalisation and the bodily action afford further interactive and representational potential in ways that multiply the possibilities of the emerging semiotic-cognitive matrix (Section 15). The attention of the participants is accordingly directed to phenomena which are not directly available to perception, but which constitute a second-order environment of meanings that speaker and listener can access and share on larger time scales than the micro-temporal dynamics of the real-time interaction that has been the focus in this Section.

Cognition, in this perspective, emerges and takes place on larger time-scales than the real-time micro-dynamics of semiosis that are the focus here. By the same token, the diverse time-scales of cognition and semiosis are seamlessly interwoven with each other. Semiosis mediates between micro-temporal body dynamics and cognitive processes on larger, slower time-scales. What we normally call cognition – e.g. knowledge structures, longer-term memory, and so on – is not distinct from or opposed to semiosis, but is its reorganisation and contextual integration with meanings and their still little understood forms of organisation on larger time-scales.

18 The body as locus of grounding

If grounding were limited to perception-action resources of the acoustic and visual kind afforded by vocalisations and body movements on the expression stratum, there would still be more limited possibilities for intersubjective convergence. There would be a more limited form of reflexivity though without the possibilities for the discursive negotiation and renegotiation of the dialogically coordinated takes and uptakes of addressers and addressees in discourse. Grounding would be confined to the perceptual purview afforded by the stimulus information in the here-now context of utterance. There would not be the same possibilities for connecting and integrating diverse, often widely separated, space-time scales and re-grounding these in the here-now of the current observer’s perspective. The lexicogrammatical resources of the Mood system provide these extended resources in addition to the resources of the expression stratum mentioned above.

The stimulus information which the expression stratum makes available is an external resource which brains match to their own patterns of time-locked neural activity and in ways which impact on and change consciousness of the self and others in meaningful ways (see Section 17 above). The brain is able to do this because of the entraining and modulating of its intrinsic neural dynamics to those of a higher-scalar ecosocial semiotic system. In this way, complex computational problems and cognitive tasks can be reduced to simpler, pattern-completing discursive operations of the kind the brain can manage and feel comfortable with (Clark, 2001a). Language therefore affords the reduction of topological-continuous variation and complexity to patterns of typological-categorial difference and relative simplicity.
Language provides small-scale semiotic models for modelling large-scale world phenomena, and for manipulating, rearticulating and modulating somatic and extra-somatic environmental affordances in ways that can integrate diverse space-time scales beyond the here-now scale. The detection of changes in amplitude, pitch, tone, and voice quality in the speaking voice and the matching of these things to the perception and understanding of event-like or episodic structures are user-friendly resources of the body. These resources naturally and spontaneously lend themselves to the kind of pattern-detection and pattern-completion activities that the brain is most at home with in the process of extending itself beyond the body into its ecosocial semiotic environment (Thibault, 2004a: 157–9).

Boy1’s vocal and other activity functions in the creation and negotiation of a shared understanding of the activity. This can only happen if both he and the others understand themselves as all taking part in and sharing the same activity. For example, the pronoun they can only be understood if all the participants refer it back to the text about the aliens on the computer screen. This understanding arises because all of the participants are engaged in various ways in the creating of a story about the aliens. Boy1 therefore assumes that the others are engaged in the same activity and to some extent at least they all share the same frame of reference.

How is such a shared frame of reference created? The body movement is grounded in sensori-motor activity in ways that are relevant to the perspectives of the participants in the activity as well as onlookers, though in different ways. The body movement is also grounded in a binocular field of vision relative to the position of an observer. The information in the field of vision of the observer is exteroceptive information about events in the observer’s environment.
The position of the observer is also grounded in proprioceptive information about the observer’s activities (Gibson, [1979] 1986: 114–15). Clearly, the exteroceptive and the proprioceptive information that is picked up by observers of and participants in the event will be different because their points of observation are different. Nevertheless, both exteroceptive and proprioceptive sources of information about the event and the observer’s activities will be picked up and integrated in both cases.

In the case of vocalisations, the listener orients in the first instance to a vibratory event and its source (Section 2, Part 1). However, the lexicogrammatical and semantic dimensions of this vocal event are not only grounded in the sensori-motor activity of articulation; they are also grounded in relation to a signifying intention. It is this signifying intention which represents the conditions of possibility of the more specific forms of indexical-symbolic grounding that are made possible by the systems of tense, deixis, and modality in natural languages. Human movement too is grounded in relation to a global signifying intention and cannot be reduced to neuroanatomy. Boy1’s actions towards Boy2 are signifying acts in this sense. Such signifying acts are not reducible to the visual field alone. Rather, they constitute an intentional field of signifying acts which have as their centre the signifying body of the person who performs them (Merleau-Ponty, [1942] 1963: 157; Thibault, 2004b).

Stimulus information about environmental events beyond the body is available for perceptual pick up from a point of observation in the environment of an animal (Gibson, [1979] 1986: 43). A point of observation may or may not be occupied by a particular individual on any given occasion, though it has the potential to be.
In one sense, the perceptual pick up of the stimulus information by an individual means that that individual undergoes a unique experience. However, there are also important ways in which the members of a given species share a common environment and shared experiences of that environment. Factors such as the following bear this out: (1) points of observation may be occupied on different occasions by different individuals; (2) the information about environmental events is information which forms part of the environment of a given species and therefore specifies possible courses of action for individuals of that species; and (3) the neuroanatomical capacities of individuals of the same species must be sufficiently similar in spite of the many differences that also exist due to contingencies of neural wiring, individual experience, and so on. The above-mentioned factors indicate that there is a fair degree of convergence in the way that members of the same species perceive the phenomena of their environments.

19 Experiential meaning as guide to activity: rethinking representation

Boy1’s use of Boy2’s body is a spontaneous and unplanned response to particular contingencies. He interprets and shapes Boy2’s body in ways that are relevant to the activity and which allow him to engage in that activity with others. In this perspective, Boy1 uses Boy2 as an external resource just like other available resources such as the texts and images displayed on the computer screen, the suggestions made by the researcher, the contributions of the two groups of children to the game, the rules of the game, and so on. All of these are resources that Boy1 and others can draw on and manipulate in order to engage in the activity and to create meanings in and through their activity. Boy1 uses Boy2 to develop and extend the meaning of the text on the computer screen. The attributive clause of visual comparison creates a link between the computer text and Boy2’s body. In the clause, the pronoun ‘they’ creates an anaphoric cohesive link back to the already mentioned text on the screen about the aliens; the demonstrative pronoun ‘this’ points forward to the (non-verbal) actions which Boy1 performs on Boy2 (pulling his nose and neck tie). The verbal utterance and the non-verbal action are fully integrated as inseparable components of the one overall activity. The declarative clause in effect holds in its scope both the clause itself and the action as an extension of the meaning of the clause, viz. [[[mood: decl [[proposition [+ body action]]]. As we saw above, the demonstrative meaning of ‘this’ effectively points to the action as a demonstration. The linguistic utterance does not simply ‘represent’ an action that occurs in some other modality. The meaning of the clause in relation to the action performed is neither fixed nor determinate, but open and negotiable. It is more accurate to say that the experiential meaning of the clause does not represent anything; rather, it functions as an always negotiable guide to activity, not its representation. The clause operationalises a meaning in relation to the bodily action. In this particular case, the experiential meaning of the utterance creates a basis for the visual comparison between the previously mentioned ‘they’ and the action Boy1 performs on Boy2. The utterance functions to orient the participants (Boy1 and his listeners/onlookers) to this particular phase of the ongoing activity and its development. It creates a joint focus of attention across a spread of perceptual modalities at the same time that it creates a semiotic bridge between the text on the screen and the action which Boy1 performs on Boy2.

The joint focus of attention which is sustained here involves the following perceptual modalities: listening; looking; haptic exploration (grasping, touching); and movement. I shall start with some observations concerning movement. Boy1 and Boy2 are standing in close proximity to each other and facing each other for the most part during this phase of the activity whereas the other children are all seated. The fact that they are standing and facing each other allows for an interactional dynamics which both demarcates them from the rest of the group at the same time that it gives Boy1 the opportunity to attend to Boy2’s ears, nose, his tie, and later his hair in the way described. The pattern of posture and movement vis-à-vis the action performed itself enables Boy1 to reach towards and to grasp and manipulate Boy2’s nose, tie, and so on. Boy1 focuses on items that are located on or are close to Boy2’s face. The deictic-indexical grounding of the utterance in the situation therefore involves a number of different modes of attention (listening, looking, touching, and general orientation).
The grounding function of deixis cross-couples the utterance, the perceptual field, and the participants in ways that make shared interpretations of perceived phenomena possible. All of these factors together create a coupled system, i.e. there is mutual interaction among all the components of the system in real time so that they jointly shape and guide the activity. The language used does not need to represent a perceived situation; participants can directly attend to the situation through the various perceptual modalities available to them. When Boy1 says ‘well they sort of look like this’, the understanding of this utterance also involves concurrent visual processing of the scene that is enacted between the two boys. The clause does not represent in symbolic form something which is already there or which existed as a prior thought in Boy2’s head. Instead, the clause extends the meaning of the event and provides an orientational frame for further reflection on and action in relation to the event. In this sense, the clause is an external indexical-symbolic resource that enables the body-brains of the participants in this event to adopt a shared intersubjective frame of reference. The utterance builds on, extends, and complements the computational resources that the brain uses in perceptual-motor tasks; it is not in this view a totally different form of computation that reorganises the brain along radically new lines. This can be better understood if we look at the ways in which the expression and content strata of language together function to ground an utterance in relation to its context. The linguistic utterance has its own specific intrinsic properties which, when coupled to those of the participants in the discourse event, extend the cognitive-semiotic resources of human agents.
In so doing, it gives rise to an extended system that is able to perform operations that the individual’s body-brain system alone would not otherwise be able to do.

20 Integrating linguistic and non-linguistic dimensions of the activity

Clauses such as ‘well they sort of look like this’ and ‘they talk like this’ indicate that the action performance which follows the clause in both cases is a demonstration by the speaker of a particular referent situation. The clause frames the speaker’s performance of the non-verbal action, lifts it out of the surrounding flux and indicates that it is to be evaluated or viewed in a certain way. Specifically, the clause frames the activity: it indicates that the performed activity is a demonstration of something that is signified by the activity. The clause also provides an experiential interpretation of the activity. The attributive clause creates a comparison – visual and aural – between the two participants – Carrier (they) and Attribute (this + demonstrated action). The activities performed are, in both cases, non-linguistic demonstrations of things that the aliens purportedly do. In demonstrating the aliens in this way, i.e. through non-linguistic bodily and vocal enactments, rather than providing a linguistic description, e.g. ‘they look like dwarves with two very long long teeth’, the demonstrated alien action stands outside the frame of reference which is created by the framing clause; it is independent of the framing clause and could stand alone as an action which could be enacted in its own right. If the clause provided a linguistic description, as in the invented example above, then the perspective would be that of the speaker and the speaker’s linguistic interpretation of the aliens. In the present case, the demonstrated enactment locates the ‘deictic’ centre of the two actions in the non-linguistic performance itself rather than in the frame of reference provided by the framing clause. The reference point is in such cases the directly demonstrated activity rather than its linguistic interpretation by the speaker of the framing clause.
The non-linguistic action and the non-linguistic vocalisation do not of course have tense, deixis, and mood in the way that linguistic clauses do. The contrast is not between direct and indirect speech, so it is irrelevant whether the non-linguistic performance has tense, mood, and deixis or not. These are categories belonging to the linguistic semiotic. Rather, the contrast is between (1) the linguistic interpretation of the activity from the perspective of the speaker of the framing clause, as in the invented example above about the dwarves and (2) its non-linguistic demonstration from the perspective of the demonstrated activity and its performers. In the latter case, the multimodal relationship between linguistic utterance and non-linguistic action or vocalisation is foregrounded.

21 Experiential meaning and the affordances of the environment

Boy1 picks up visual and other stimulus information about his surroundings (Section 15). This information directly specifies possible courses of action. He does not have recourse to pre-formed plans that are then executed by a central programme in his head. Instead, the presence in his immediate environment of Boy2 suggests possible courses of action. Thus, Boy1 sees that Boy2’s nose and tie can be held and manipulated in ways that fit in with his own immediate project or goal. Boy1’s action is spontaneous rather than premeditated. Boy2’s presence provides an opportunity for the kind of improvisatory activity that we witness when Boy1 pulls Boy2’s nose and tie so as to turn him into an alien. Boy1 thus locks into and exploits Boy2 – his body parts and clothing – as a locally available resource for his own purposes. At the same time, Boy2 is a co-agent in the realisation of Boy1’s purpose. Rather than saying that Boy1 runs through a set of choices in a pre-given system of options in language, action, and so on, we can say that Boy1 interacts with the affordances in his environment; it is the interaction between Boy1, the linguistic and other semiotic options available to him, and the resources available to him in his immediate environment which give rise to specific ‘choices’ in the course of the activity itself. It is in this perspective that we can understand the meaning of the clause ‘they sort of look like this’. The experiential structure of the clause is analysed in Table 5.

well they sort of look like this

Carrier | Process: attribution: mental: perception: visual | Attribute: Circumstance: comparison: Comparison

Table 5: Experiential structure of the clause ‘well they sort of look like this’

The meaning of this clause is based on visual comparison between different domains. In the present case, the two domains are (1) the aliens who are mentioned in the displayed text on the computer screen (they) and (2) the action that Boy1 performs on Boy2 (pulling his nose, tie, and hair). In the context to hand, the uttering of the clause + the action performed is a demonstration or an act of showing. The meaning of the clause is connected to both visual perception and to the act of indicating or pointing to something in the visual purview. Rather than drawing on a stored knowledge base that the agent needs to run through every time he or she encounters a new situation, the agent’s resources for acting and representing are built up in the course of his or her interactions with particular situations.

Attributive clauses of the kind mentioned above are encountered in situations of visual comparison and demonstration of items. Particular situations are experienced in terms of the significance they have for the agent’s current understanding and goals rather than in terms of a pre-stored knowledge base of cognitive or semantic structures. It is in this way that the agent is able to apply its knowledge of typical situations to future contingencies. Deictic and other resources enable the grounding of the clause in the particular situation. The participatory and situated character of the activities in which such clauses are used is accordingly emphasised. As we shall see in Section 22 below, this entails something very different from the grounding of an abstract experiential semantic structure in the speech event.

22 Interpersonal meaning and the dynamics of anticipation

Interpersonally, the utterance is a declarative proposition. It functions to invite addressees to accept or to agree with the proposition that the speaker asserts, i.e. to align addressees to its perspectives and evaluations. The verb process ‘look’ is instantiated as an instance of the process-type by the Finite element in the verbal group (Langacker, 1991; Halliday, 1994: 75–8; Davidse, 1997). In the demonstration context of this clause, the use of non-progressive present tense indicates that the process is instantiated as being simultaneous with the ground of the speech event. The process is also grounded in terms of the modal operator ‘sort of’. In this way, the speaker provides an assessment as to the degree to which the Carrier ‘looks like’ the Attribute. In the present context, Boy1 indicates that the enacted scene may serve as an approximate, rather than an exact, guide to the appearance of the aliens. The clause is also grounded by the person deixis of the Subject (Davidse, 1997). The process ‘look’ is tied to the third person participant ‘they’ as Subject of this clause. The Subject is the modally responsible element about which the proposition is affirmed. In other words, the proposition in this clause makes an arguable claim about the Subject. It is the element that is being made modally responsible for the validity of the proposition (or proposal).

In the present case, the speaker asserts an arguable claim concerning the degree to which the Subject (they) visually resembles the action performed on Boy2. This claim can be related to higher-order norms concerning criteria of visual resemblance, acceptable degrees of accuracy, approximation, and so on. The claim is grounded in higher-order validity claims in terms of which the validity of any given proposition can be established, debated, and so on (Habermas, 1979; Hasan, 1992a; Thibault and van Leeuwen, 1996; Halliday and Matthiessen, 2004: 119). In turn, these norms can be related to still more general principles or criteria of rationality concerning the understanding and interpretation of visual perception, the basis for making comparisons, as well as providing means for regulating disputes between alternative perceptions and points of view.

It is this interpersonal grounding of the proposition in a particular speech event which enables a reflexive relationship to be articulated between the speaker-cum-observer and the grammatical Subject about which the proposition is made. The proposition is therefore grounded in the temporal and modal perspective of the speaker as a unique, context-specific type-specification of the process type. In this way, a proposition is tied to an observer’s viewpoint. At the same time, the proposition also enables a dialogically coordinated response to it on the part of addressees. This means that any given addressee, in a given time and place, can re-ground the proposition from his or her observational viewpoint. Thus, addressees can take up the meaning of the addresser’s proposition and further negotiate its meaning according to the stance they may have on the addresser’s proposition. For example, an addressee can agree or disagree with, believe or disbelieve, affirm or deny, accept or reject, and so on, a given proposition (Martin, 1992: 76–8; Thibault, 1999: 588–92).
The interpersonal-dialogical resources of clause grammar therefore provide standardised lexicogrammatical formats and discursive procedures for reaching intersubjective agreement through the processes of orienting to, responding to and negotiating the meanings of propositions in discourse. While many aspects of experience may be subjective, personal, and, in a sense unrepeatable, language provides resources for reaching intersubjective agreement and disagreement about the world even though the perspectives of different addressers and addressees may be different.

Boy1’s utterance is a response to the text on the computer screen, which is also read aloud by the instructor. However, his response is not merely backward looking or reactive. Instead, it is anticipatory: its semantics is oriented to and anticipates a certain kind of response. This does not mean that this utterance causes Boy1’s response in the same way that efficient causes cause. The instruction cannot determine what will come next in any kind of deterministic way. The instruction is not an external force that impacts on Boy1 as an efficient cause. Instead, the semantics of the utterance is more like a formal cause that modulates the range of possible responses in the sense that it constrains the semantic possibilities that could constitute a response. It entrains or enslaves the responses to its own semantics. However, Boy1 is not merely enslaved to the prior discourse move. A recursively self-transforming learning system can modulate its own dynamics. It can adjust its responses to the environment so as to maintain and change itself, rather than be pushed around by external factors, including other people’s utterances. Moreover, the semantics of the instruction is subtly indeterminate, rather than being fixed.
This indeterminacy, which is a normal condition of natural language in use, allows for multiple, though always contextually constrained, uptakes of the semantic possibilities of the utterance and its further development in discourse negotiation. A recursively self-transforming system is characterised by positive feedback processes or circular causality – its own processes are necessary for the ongoing maintenance and transformation of the system. The recursive processes of dialogic engagement with others involve this kind of circular causality. The earliest dyadic forms of exchange between mother and infant represent the beginnings of this process. Mother and infant produce and orient to each other’s kinetic and other bodily activity (smiles, vocalisations, facial expressions, gaze). In their affect-charged joint bodily engagements with each other, their dialogically synchronised and coordinated interactions give rise to emergent inter-individual patterns of bodily activity. Infant and mother both act and respond to each other, in the process setting off inter-individual patterns that, in time, lead to their higher-level reorganisation as language-activity (Thibault, 2004a: Chapter 3). Self-organising processes of this kind are semiotic processes; their dynamical forms of organisation select which environmental information is relevant to the system and therefore which information is to be imported into its internal dynamics. Rather than being ‘reactive’, the system is able to anticipate the environment’s responses. The semiotic closure of the system imposes its own internal dynamic such that the system is not dominated by external factors. Dialogic moves in discourse are not pushed along by external factors. Instead, they are dynamical attractors within the semantic phase space of the developing discourse’s trajectory.
Normatively self-transforming learning systems have modal perspectives as part of their semantic resources; they can anticipate future developments of the systems they interact with and can modulate their own activity accordingly. The metafunctional organisation of language (Halliday, 1979) means that linguistic action is calibrated with a representation. In this way, the world qua interactive potential is made to fit with one’s representations.

A given discourse’s meaning-making trajectory embodies the meanings and perspectives of the agent(s) from which it is sourced. Language, in this view, is not the encoding of propositions in grammatical form. Rather, grammar is the operation of contextual constraints that embody the meanings and perspectives of agents and enable these meanings and perspectives to flow into action. The capacity for anticipation is a modulation of an internal state of the agent arising from its internal system of interpretance and the modal perspectives that this affords the agent. In this way, agents can adaptively and creatively modify their actions relative to the agents they interact with. An action vector has magnitude and orientation in space, time and modal perspective. The grounding of a discourse move is a vector in this sense. Modalised perspectives have their semantic source at an agent. These perspectives have semantic force. The prosodic and scopal nature of their linguistic realisations means that they have quantity; they spread across a trajectory in a cumulative fashion and modify its shape in accordance with the modalised perspective of the agent who is the source of the trajectory (e.g. the speaker). They therefore extend over a given stretch of discourse so as to guide and modulate the action trajectory of the agent along the temporal duration of the trajectory. Discourse moves are temporally and modally grounded in relation to the speech event at the same time that they anticipate further responses (e.g. of others) and adjust and modulate their own activity accordingly. The Mood element in the clause does not ground the clause in objective Newtonian space-time, but, rather, in terms of the semantic (modal) perspectives from which the move flowed as well as the attractors that constrain and shape its future development.
The present (here-now) grounding of discourse moves in time, modal evaluation and polarity are oriented to and shaped by future states of the system in ways that modify the present state of the system (agent). Representations seek to approximate future interactive potential. Such approximations are normative because they can be connected to and corrected by higher-order norms (e.g. truthfulness, goodness, badness, sincerity, importance, and many others) that function to correct or to steer the agent along some courses of preferred or positively valued action rather than others. A representation is associated with a value-stance that comprises a small-scale model on the basis of which the agent can anticipate future interactive potential and orient to it.

The relative semantic indeterminacy of discourse moves is a form of bottom-up enabling constraint. The semiotic variety at this level is a means of enlarging the overall phase space of the system by providing new possibilities. It is a form of interactive potential in the sense discussed before. At the same time, top-down contextual constraints impose order and organisation on the emerging whole by limiting the possibilities on the lower level. Emergent global organisation limits the degrees of freedom of the individual contributions on the lower level. Each person’s dialogic move in discourse is a functioning component in a more global system; their possibilities for action are constrained by the overall system of relations of which they are a functioning part. This means: (1) the degrees of freedom of each person’s contribution – e.g. their possible discourse moves – to the evolving whole are reduced and constrained by the dynamics of the system as a whole; and (2) each person’s contribution is now a functioning part in a larger whole, rather than being an independent component emanating from the lower level per se.
As a functioning component in a more global system of relations, each contributor’s discourse move functions to maintain the current state of the system. Function is therefore defined in terms of the current state of the system; a given system component is functional insofar as it plays a role in maintaining that system. However, a recursively self-transforming system is one that can act both to maintain and to change itself. All learning systems are recursively self-transforming systems. The selection of a given discourse move is a means of interacting with the agent’s environment; as such, the selection is functional in maintaining the system. A discourse move is, of course, selected by the system (the agent) so that the agent’s own actions contribute to the agent’s self-transformation. The selection of a discourse move such as ‘well they sort of look like this’ involves anticipation; it anticipates possible future developments of the system (e.g. possible responses by others, possible environmental contingencies, the potentialities of environmental affordances for further action, and so on). It is not merely backward looking or reactive; it does not merely hark back to what was said previously. The selection of discourse moves in some unfolding discourse event means making selections that constitute appropriate responses to the current environment. The current environment affords the selection of interactions – discourse moves – that serve to maintain and to transform the system by anticipating possible future developments of the system. The utterance therefore anticipates potential courses of action that the other boy qua affordance for action makes available. The interpersonal semantics of the utterance also specify the speaker’s evaluative orientation to the referent situation of the clause.
The modal operator ‘sort of’ indicates the speaker’s assessment of the proposition and grounds it in relation to a particular domain of validity in terms of which the proposition can be argued. In the present example, the modal operator locates the proposition in a validity domain that can be glossed as [speaker’s informal assessment of degree of accuracy], i.e. the degree to which the visual comparison made in the clause process accurately compares the two entities according to the speaker. Here, the modality means something like ‘looking like but not completely accurately according to the speaker’. The process instance ‘look’ is thus grounded in a framework of modal assessment by the speaker at the same time that the present tense in the Finite element grounds the proposition as simultaneous with the current speech event. Importantly, the modality ‘sort of’ in this example highlights the fact that the visual comparison may be contested as wrong, inaccurate, misleading, and so on. The modal operator in the present case maintains a degree of dialogical openness and heteroglossic diversity regarding the possible representations of the referent situation, rather than monological closure and monoglossic uniformity. The modality locates the speaker’s evaluation in a heteroglossic space of possible alternative evaluations and possible alternative representations that can be argued about. These alternatives can contribute to the further development of the proposition, to its rejection, to its replacement with another one that is more appropriate to the environmental contingency and the possible courses of action that this affords. The heteroglossic openness of the proposition anticipates possible responses, possible renegotiations of its meaning potential by the speaker as well as by other interactants.
The visual comparison in the experiential semantics of the clause may prove to be in error and therefore of a type that is not suitable, useful or appropriate for further courses of action in relation to the environment. The other boy may not be a suitable resource for demonstrating what the aliens look like. Judgments as to suitability, usefulness, appropriateness, and so on, are normative judgments. They are grounded in systems of norms which can be referred to in discourse negotiation to resolve disputes and conflicting evaluations concerning such things as the appropriateness, truthfulness, rightness, and so on, of representations. In anticipating its future development along its trajectory, a recursively self-maintenant learning system can assert propositions that incorrectly anticipate future developments of the system and the agent-environment relations. In calibrating action with representation, the experiential dimension of the utterance’s meaning helps to constrain action along appropriate pathways.

The instruction on the computer screen constrains possible responses to it. In this way, the combination of imperative mood and experiential selections in its semantics evokes a self-organised set of alternatives that anticipate possible future responses. The semantics of this move proposes a course of action (‘describe’) and calibrates this with an experiential construal that mentions ‘aliens’ and ‘nasty horrid creatures’. In responding to this move, Boy1 does not have to process an indefinitely large number of alternative possibilities for responding. The prior move on the computer screen delimits the set to a manageable subset from which the response will be selected.
This does not mean that there is an already existing set of choices that one makes off the shelf, so to speak; rather, the computer text in our example both specifies and anticipates the semantic space – the range of possible responses – in terms of which a given response is organised. Broadly speaking, this may be characterised as a continuum of possibilities – linguistic and non-linguistic – between ‘Describe’ and ‘Don’t describe’ the aliens. Other semantic possibilities are not countenanced by this particular space. This is not the same as saying that other possibilities are not ruled out. Rather, a response such as starting a conversation about the school football team instead of responding to the instruction would be a radical re-constitution of the semantic space along quite different lines. From the perspective of the instruction and its semantics, this possibility is simply not on the semantic radar screen. What we see here is how the instruction organises the prior existing semantic space in terms of a determinate set of possibilities – a constrained contrast set of alternatives – that constrain but do not determine possible future responses. In this way, we see how the instructional text on the computer screen plays its role in a distributed network of relations that self-organise along the meaning-making trajectory of the discourse. The decision by Boy1 to turn his companion into an alien is not of course predicted or determined by this prior semantic space. Nor does the decision originate uniquely inside his head. The point is that his response is semantically congruent with that space at the same time that it is its further development.

This view of dialogue is rather different from the more orthodox claim that meaning in interaction is ‘making and fulfilling dialogic claims in initiative and reactive actions which relate to certain propositional states’ (Weigand, 2004: 378).
Weigand’s view of dialogue as ‘initiative and reactive action in interdependence’ (2004: 378) is founded on disembodied ‘propositional states’ in which action is reduced to a passive ‘reacting’ to prior action inputs. The action-reaction model is past-oriented and mechanistic; it looks back to a prior input – an initiative action – which ‘rationally determines’ it (Weigand, 2004: 378). The anticipatory model, on the other hand, is future-oriented; rather than being rationally determined by someone’s initiative action, a response is a future-oriented development of it. The action-reaction model is passive and deterministic in conformity with the logic of all disembodied and formal systems.

A given speaker’s utterance affords multiple possible uptakes and developments. Some of these are available in the formal selections that characterise a given move; others may require further contextual development and negotiation in order to bring a particular potentiality into interactional actuality. Weigand’s model of dialogue is based on the sequential formalisms characteristic of machine logic and decontextualised formal models. It fails to show that dialogic interaction depends on the embodied perspectives and orientations characteristic of agents who live in ecosocial time and space (Thibault, 2004b: 24–6) rather than on disembodied ‘propositional states’ (Weigand, 2004: 378). Utterances are not disembodied propositional states à la Weigand; nor are they formal encodings of meanings. In a functional view of the kind put forward here, the lexicogrammatical form of the utterance has to do with the structure of an operation that is performed on a context, including the discourse move of one’s interlocutor. Its grammatical shape is the shape of an operation rather than an encoding. The grammatical shape of the utterance enacts the interpersonal layer of its meaning (Davidse, 1997). Operations generate new points of departure, new modal orientations, new experiential representations, and so on. Utterances act on and help to constitute situations and actions; situations and actions are matrices of interactive potentialities that constitute future possibilities of interaction. In operating on this matrix of potentialities, utterances, on account of their metafunctional organisation, do not represent something already there; rather, they fine-tune and bring into interactive focus a progressive differentiation of the total realm of possibilities as affordances for future-oriented interactive potential. Boy1’s utterance ‘well they sort of look like this’ operates on a matrix of affordances as possibilities for future interaction.
The experiential semantics of the clause does not encode a prior content that was present in the environment or in the speaker’s mind; its categorial semantic distinctions, in conjunction with the choice of mood, act on this matrix and call forth some of its possibilities as affording the development of future-oriented interactive potential. The interpersonal shape of the clause acts on a representation and anchors it to its ground at the same time that the experiential semantics of the clause progressively differentiates and brings into focus a representation as an affordance for future action from the total range of possibilities in the given situation. The experiential semantics of the clause focuses on features that are relevant to the attainment of the speaker’s goal at this particular stage in the discourse. It is in this sense that we can say that the clause is a guide to action, rather than a representation of some pre-existing aspect of the speaker’s world. The ‘representation’ does not hark back to something that is ‘re-presented’ by the experiential semantics of this clause. It is not then a representation of features in the environment that the agent detects. Instead, the experiential semantics of the clause looks forward; it is future-oriented. I shall now explain this notion in more detail. The clause is not a representation of properties or features of its referent situation that are perceptually picked up by the agent and which already exist in the situation. Rather, the clause instantiates a semantic relationship with relevant properties or features of its environment. In the current perspective, this means that the clause focuses on or attends to properties which constitute further indications of interactive potential, rather than representing already given features of the environment. The clause does not represent to speaker and listeners some features of the environment that were present prior to the uttering of the clause. Instead, its semantic features create, in relation to the relevant environment, semantic specifications as to further interactive potential with that environment. If the former case held, then the clause would be backwards looking, as if Boy2’s ears, tie, and hair and the action that Boy1 performs in relation to these items corresponded to some privileged prior locus of representation which the clause re-presents to the participants in the situation. Instead, in the perspective argued for here, the experiential construal or representation is future-oriented; it looks forward to and seeks out possibilities in the environment, including of course other agents, that afford further possibilities for interaction and of course the further development of the current interaction. In a nutshell, there is nothing there that the representation can correspond to.
There is no natural kind ‘out there’ in the world that the clause simply stands for; no alien-like features of Boy2 that Boy1 simply represents. At this point in the argument, we can bring in the relevance of the systemic dimension of the clause’s meaning potential. As a selection from a set of possible meaning-making choices in the area of transitivity, the experiential semantics of the clause specifies what the system is supposed to represent, not what it corresponds to in fact. The system therefore provides a normative basis to representation. This is what makes it possible for a particular representation to be argued about in the dialogical space that is opened up by the interpersonal dimension of the clause’s meaning. Just as the speaker affirms or asserts a modalised declarative proposition about the referent situation as construed from his perspective, his interlocutors can respond in various ways – both linguistically and non-linguistically – which may support the assertion, contradict it, deny it, and so on.

The experiential construal of the situation is open to negotiation, revision, and replacement with other possible construals if it is found to be in some way an inadequate or defective representation. This therefore also allows for the possibility that the experiential selections in the clause misconstrue the situation. That is, it allows for other possible experiential takes and associated evaluative stances on the situation as well as for the possibility that the current representation can be further developed, corrected and transformed by subsequent negotiation. The representation, as we saw above, is grounded by the resources of person deixis and Finiteness in the mood element of the clause. It is these resources that enable propositions and proposals to be grounded in terms of interpersonal norms that can be agreed upon, disputed, and so on (Halliday, 1994: 75; Halliday and Matthiessen, 2004: 117; Habermas, 1979; Hasan, 1992a; Thibault, 1999; 2002), rather than on the basis of a potentially infinite regress of representations of representations of … etc. Representations are grounded with reference to interpersonal norms of arguability concerning the domain(s) of validity in relation to which the meaning of the representation is determined (see above). These norms also reflect the perspectives of embodied agents in the modalised space-time of discursive negotiation, rather than a procrustean bed of more atomistic or primitive representations that constitute the foundations of a given representation. A regress of the kind mentioned above would presuppose that there are always other more atomistic representations on which the current one is founded.
If we simply said, for example, that the representational meaning of the demonstrative pronoun ‘this’ in the present clause can be explained in terms of a more basic, more atomistic set of features such as [deixis: spatial: proximal; number: singular; reference category: thing; phoricity: exophoric or endophoric], we would still be saying nothing about what the given linguistic term is supposed to represent in this context. I don’t have time to uncover more and more levels of representations and nor do speakers and listeners when they produce representations in the contexts that matter to them. They simply do what they have to, when they have to, in response to whatever the environment throws at them. The affordances of the environment specify their own models of interactive potential. In actual fact, the representation is an emergent property of the agent’s interaction with relevant aspects of his environment. It is instructive in this regard that the present example can be seen as functioning in a semi-formal learning situation for the children. Boy1 responds to the prior discourse move on the part of the computer/instructor. In doing so, he not only performs actions – linguistic and non-linguistic – that constitute a response to the prior move.

Importantly, his response re-orients the meanings that are being negotiated in ways that are future-oriented, not simply backwards looking. The selection of the clause ‘well they sort of look like this’ is not a selection from a pre-given system of options, but an adaptive modification of systemic possibilities – of system potential – so as to construct a representation – an experiential construal – that is not oriented to the representation of properties or features that are already there ‘in’ the situation. Rather, the experiential semantics of the clause asserts what is supposed to be represented by grounding it in an interpersonal-normative space of arguability. At the same time, the clause specifies, semantically speaking, ways in which the referent situation can henceforth be oriented to and therefore ways in which the situation can be developed through future interaction. This is precisely what happens in the present instance. We have here a theory of cognition that recognises the centrality of the interpersonal dimension of semiosis. This dimension functions to ground experiential representations in the way described above. Such a theory also shows how the agent’s interactions with its environment achieve a dialogical closure (Bråten, 1992; 1998; 2002; Thibault, 2000; 2005b): each interaction loops back on, supports and increases the agency that made these interactions possible in the first place. The recursive dialogical closure of the system is the means whereby it interacts with its environment and in the process maintains and develops its own agency and therefore its own individuality, its own learning, its own knowledge.

23 Conclusion

In detecting and responding to relevant stimuli in the external environment, as well as in its internal milieu, the agent can select from the mood system in order to respond to affordances in its environment with appropriate actions. This means that agents select from among different mood options according to their interpretations of the environment and their assessment of the best way to respond to it with an appropriate course of action. In this sense, linguistic and non-linguistic forms of action are exactly the same. A human agent qua recursively self-transforming dynamic open system is a system that learns and individuates along its temporal (historical-biographical) trajectory. The normative aspect of its representations or semantic construals is seen as a high-level contribution to the maintenance of the system qua cognitive system. These semantic construals are functional in the learning and in the self-transformation of the system.

The dialogic-interpersonal basis of the grammar of natural languages provides a normative basis for the agent’s evaluations of what is ‘appropriate’, ‘good’, ‘right’, ‘truthful’, ‘interesting’, ‘important’, and so on for the ongoing maintenance of the agent as a system that learns and through learning changes itself. The dialogical-interpersonal criteria of arguability cited above provide a basis for grounding these representations in normatively scaled validity claims and values that ultimately derive from the community. The fact of arguability in discursive interaction means that the agent’s own representations can be both self-corrected and corrected by others. These representations can also be further developed by the self and others on the basis of the same validity claims and values in relation to which the perspectives and the courses of action of both self and others are normatively grounded. Our semantic construals of the environment through the experiential resources of the grammar of natural languages are normative precisely because and insofar as they are functional in the recursive self-transformation of the agent qua learning and individuating system. The future learning and individuation of the system depend on the ongoing maintenance of these functional representations. Arguability and the recursivity of dialogue therefore provide corrective feedback that enables the agent’s trajectory to connect to normative principles that guide, modulate and ground agents and their activity in their ecosocial environment at the same time that agents and their activity can be corrected. The observations made in this section also make sense from an evolutionary perspective. Organisms have developmental, hence type-specific, constraints that prevent random change or what Darwinists like to call ‘blind variation’.
Constraints of this kind are information-semiotic constraints because evolutionary lineages have developed them in evolution to minimise the possibility of error variation. Organisms can distinguish between what is good and bad for them. This is a form of normative knowledge that has accumulated over generations as semiotic-informational constraints, rather than mechanistic selectional constraints emanating ‘from above’. A given lineage does not go on repeating the same error from generation to generation, but learns how to correct these errors by anticipating them and reorganising its own semiotic-informational constraints accordingly.

Appendix I: Transcription of vocal and cinesic layers in the ‘Aliens’ text, showing multimodal realisation of discourse moves

VF 1 2 3 4 moves Instruction Prompt Show Affect + Affect-Solidarity Focus In your group must come on give a description of their characteristics B1 mickey mouse medium loud (loud) laughter from whole group in response to Ja Cinesic B1 points at Mickey Mouse icon on screen, leads towards screen during the point; then retracting to upright standing 2 position; turns head to In. then briefly back to screen

VF 5 6 7 8 Act- Instruction Prompt^Directive Focus Attention+ type Anticipate B1 In nasty horrid come on^see creatures what you … B1 B2 Cinesic leans towards screen spreads arms at waist height 3

VF 9 10 11 12 Act- Acknowledge^ Respond^Track Respond Respond+Track type Respond Action Action In B1 well^they sort of look like this with their ties go like that B2 low level sound indecipherable in response to vocalisation head being twisted Cinesic B1 turns towards B1 takes B2’s ears B1 pulls B2’s tie B1 twists tie B2 and pulls them away from his outwards and chest gently twists B2’s head; B2 holds his arms outspread

Key to Notations

Participants
In = adult instructor
B1 = first boy
B2 = second boy

Cross-referencing between participants and act-types
italics: adult Instructor (In)
normal: first boy (B1)
bold: second boy (B2)
unbroken underline: Group B (the boys’ group)
broken underline: both groups in unison (Group A and Group B)

Other symbols
VF = Visual Frame
^ = is followed by
+ = is simultaneous with

Note 1 I am grateful to Anthony Baldry for kindly providing me with the video tape of the complete episode from which the brief sequence transcribed is taken and for his agreeing to its present use in this paper. The transcription and analysis shown here are entirely my own responsibility.

References

Baldry, A. P. (1996) Laboratorio virtuale: creazione di percorsi multipli. In A. P. Baldry and N. Prozzo (eds) Ipermedia nell’educazione linguistica 149–56. Rome: Armando.
Baldry, A. P. (2005) Children’s use and awareness of genre: A case study of the evolution from multimodal transcription to multimodal concordancing based on system networks. In C. Taylor Torsello, M. Gotti and C. Taylor (eds) I Centri Linguistici: approcci, progetti e strumenti per l’apprendimento e la valutazione. Atti del 3° convegno nazionale AICLU, 423–42. Trieste: EUT.
Baldry, A. P., Bolognesi, R. and Piastra, M. (1994) Retraction with face saving: modeling conversational interaction through dynamic hypermedia. Alt-J, Association for Learning Technology Journal 2(2): 27–37.
Benson, J. (2002) Bonobo-human discourse: where does Kanzi’s ‘bad surprise’ come from? In C. Tatilon and A. Baudot (eds) La Linguistique fonctionnelle au tournant du siècle 201–6. Toronto: Éditions du GREF.
Benson, J., Greaves, W., O’Donnell, M. and Taglialatela, J. (2002) Evidence for symbolic language processing in a bonobo (Pan paniscus). Journal of Consciousness Studies 9(12): 33–56.
Benson, J. and Greaves, W. (eds) (2005) Functional Dimensions of -Human Discourse. London and New York: Equinox.
Bickhard, M. H. (2005) Function, anticipation, representation. Retrieved from http://www.lehigh.edu/~mhb0/mhb0.html March 2005.
Bråten, S. (1992) The virtual other in infants’ minds and social feelings. In A. Heen Wold (ed.) The Dialogical Alternative: towards a theory of language and mind 77–97. Oslo: Scandinavian University Press.
Bråten, S. (1998) Intersubjective communion and understanding: development and perturbation. In S. Bråten (ed.) Intersubjective Communication and Emotion in Early Ontogeny 372–82. Cambridge and Paris: Cambridge University Press and Editions de la Maison des Sciences de l’Homme.
Bråten, S. (2002) Altercentric perception by infants and adults in dialogue: Ego’s virtual participation in Alter’s complementary act. In M. I. Stamenov and V. Gallese (eds) Mirror Neurons and the Evolution of Brain and Language 273–94. Amsterdam/Philadelphia: John Benjamins.
Brazil, D. (1981) Intonation. In R. M. Coulthard and M. M. Montgomery (eds) Studies in Discourse Analysis 39–70. London: Routledge.
Clark, A. (1997) Being There: putting brain, body and world together again. Cambridge, MA: The MIT Press.
Clark, A. (2001a) Where brain, body, and world collide. In G. M. Edelman and J. Changeux (eds) The Brain 257–80. New Brunswick and London: Transaction Publishers.
Clark, A. (2001b) Reasons, robots and the extended mind. Mind & Language 16(2): 121–45.
Cowley, S. J. (1994) Conversational functions of rhythmical patterning: a behavioural perspective. Language & Communication 14(4): 353–76.
Cowley, S. J. (1997) Conversation, coordination, and vertebrate communication. Semiotica 115(1/2): 27–52.
Cowley, S. J. (1998) Of timing, turn-taking, and conversations. Journal of Psycholinguistic Research 27(5): 541–71.
Cowley, S. J. (2005) Languaging: how babies and bonobos lock onto human modes of life. International Journal of Computational Cognition 3(1): 44–56.
Damasio, A. [1994] (1996) Descartes’ Error: emotion, reason and the human brain. London and Oxford: Macmillan Papermac.
Damasio, A. (2000) The Feeling of What Happens: body, emotion and the making of consciousness. London: William Heinemann.
Davidse, K. (1997) The Subject-Object versus the Agent-Patient asymmetry. Paper presented at the congress ‘Objects, grammatical relations and semantics’, University of Gent, 23–24 May 1997.
Ellis, J. (1971) The definite article in translation between English and Twi. In M. Houis (ed.) Actes du Huitième Congrès de la Société Linguistique de l’Afrique Occidentale 1969 Vol. 1, 367–80. Abidjan.
Gibson, J. J. [1966] (1983) The Senses Considered as Perceptual Systems. Westport, Connecticut: Greenwood Press.
Gibson, J. J. [1979] (1986) The Ecological Approach to Visual Perception. Hillsdale, NJ and London: Lawrence Erlbaum.
Habermas, J. [1976] (1979) Communication and the Evolution of Society. (Translated by Thomas McCarthy.) London and Melbourne: Heinemann.
Halliday, M. A. K. (1979) Modes of meaning and modes of expression: types of grammatical structure and their determination by different semantic functions. In D. J. Allerton, E. Carney and D. Holdcroft (eds) Function and Context in Linguistic Analysis: a Festschrift for William Haas 57–79. Cambridge: Cambridge University Press.
Halliday, M. A. K. [1985] (1994) Introduction to Functional Grammar. (Second edition.) London and Melbourne: Arnold.
Halliday, M. A. K. [1984] (2004) The complete ‘Nigel Transcripts’. In J. J. Webster (ed.) The Language of Early Childhood, Volume 4 in the Collected Works of M. A. K. Halliday. (CD supplement.) London and New York: Continuum.
Halliday, M. A. K. and Matthiessen, C. M. I. M. (2004) Introduction to Functional Grammar. (Third edition.) London and New York: Arnold.
Harris, R. (1981) The Language Myth. London: Duckworth.
Hasan, R. (1992a) Rationality in everyday talk: from process to system. In J. Svartvik (ed.) Directions in Corpus Linguistics: Proceedings of Nobel Symposium 82, Stockholm, 4–8 August 1991, 257–307. Berlin and New York: Mouton de Gruyter.
Hasan, R. (1992b) Speech genre, semiotic mediation and the development of the higher mental functions. Language Sciences 14(4): 489–528.
Hoey, M. (2001) Textual Interaction: an introduction to written discourse analysis. London and New York: Routledge.
Langacker, R. W. (1991) Foundations of Cognitive Grammar. Volume II. Descriptive application. Stanford, CA: Stanford University Press.
Lemke, J. L. (2000) Across the scales of time: artefacts, activities, and meanings in ecosocial systems. Mind, Culture, and Activity 7(4): 273–90.
Martin, J. R. (1992) English Text: System and structure. Philadelphia/Amsterdam: John Benjamins.
Merleau-Ponty, M. [1942] (1963) The Structure of Behavior. (Translated by A. L. Fisher.) Pittsburgh: Duquesne University Press.
Peng, F. C. C. (2003) The anthropology of language disorders: a first approximation. Lingua Posnaniensis XLV: 39–62.
Piastra, M. and Lombardi, L. (2000) The HyperContext Web project: dynamic authoring for distance learning. In A. P. Baldry (ed.) Multimodality and multimediality in the distance learning age 247–62. Campobasso: Palladino Editore.
Thibault, P. J. (1999) Communicating and interpreting relevance through discourse negotiation: an alternative to relevance theory. Journal of Pragmatics 31: 557–94.
Thibault, P. J. (2000) The dialogical integration of the brain in social semiosis: Edelman and the case for downward causation. Mind, Culture, and Activity 7(4): 291–311.
Thibault, P. J. (2002) Interpersonal meaning and the discursive construction of action, attitudes and values: the Global Modal Program of one text. In P. Fries, M. Cummings, D. Lockwood, and W. Spruiell (eds) Relations and Functions Within and Around Language 56–116. London and New York: Continuum.
Thibault, P. J. (2004a) Agency and Consciousness in Discourse: self-other dynamics as a complex system. London and New York: Continuum.
Thibault, P. J. (2004b) Brain, Mind, and the Signifying Body: an ecosocial semiotic theory. London and New York: Continuum.
Thibault, P. J. (2004c) Agency, individuation, and meaning-making: reflections on an episode of bonobo-human interaction. In G. Williams and A. Lukin (eds) Language Development: functional perspectives on evolution and ontogenesis 108–32. London and New York: Continuum.
Thibault, P. J. (2005a) Brains, bodies, contextualizing activity and language: do humans (and bonobos) have a language faculty, and can they do without one? Linguistics and the Human Sciences 1(1): 101–28.
Thibault, P. J. (2005b) The interpersonal gateway to the meaning of mind: unifying the inter- and intra-organism perspectives on language. In R. Hasan, C. M. I. M. Matthiessen, and J. J. Webster (eds) Continuing Discourse on Language: a functional perspective, Vol 1. London: Equinox.
Thibault, P. J. and van Leeuwen, T. (1996) Grammar, society, and the speech act: renewing the connections. Journal of Pragmatics 25: 561–85.
Uexküll, J. von (1909) Umwelt und Innenwelt der Tiere. Berlin: Verlag von Julius Springer.
Uexküll, J. von (1982) The theory of meaning. Semiotica 42(1): 25–82.
Vygotsky, L. [1934] (1986) Thought and Language. (Translated by A. Kozulin.) Cambridge, MA and London: The MIT Press.
Weigand, E. (2004) Empirical data and theoretical models. Pragmatics & Cognition 12(2): 375–88.
Ziemke, T. (1999) Rethinking grounding. In A. Riegler, M. Peschl, and A. von Stein (eds) Understanding Representation in the Cognitive Sciences 177–90. New York: Plenum Press.