<<

SOCIAL CHUNKING IN WORKING 1

1

2

3

4

5 Chunking by Social Relationship in

6

7 Ilenia Paparella and Liuba Papeo

8 Institut des Sciences Cognitives— Marc Jeannerod, UMR5229, Centre National de la

9 Recherche Scientifique (CNRS) & Université Claude Bernard Lyon1, France

10

11

12

13

14 Author Note

15 Ilenia Paparella https://orcid.org/0000-0002-7683-2503

16 Liuba Papeo https://orcid.org/0000-0003-3056-8679

17 The authors have no conflicts of interest to disclose.

18

19 Correspondence concerning this article should be addressed to Ilenia Paparella, now at

20 GIGA-Research, Cyclotron Research Center-In Vivo Imaging Unit, 8 allée du Six Août,

21 Batiment B30, University of Liège, 4000 Liège (Belgium).

22 Email: [email protected]

23

SOCIAL CHUNKING IN WORKING MEMORY 2

24 Abstract

25 Working memory (WM) uses knowledge and relations to organize and store multiple

26 individual items in a smaller set of structured units, or chunks. We investigated whether a

27 crowd of individuals that exceeds the WM is retained and, therefore, recognized more

28 accurately, if individuals are represented as interacting with one another –i.e., they form

29 social chunks. Further, we asked what counts as a social chunk in WM: two individuals

30 involved in a meaningful interaction or just spatially close and face-to-face. In three

31 experiments with a delayed change-detection task, participants had to report whether a

32 probe-array was the same of, or different from a sample-array featuring two or three dyads

33 of bodies either face-to-face (facing array) or back-to-back (non-facing array). In Experiment

34 1, where facing dyads depicted coherent, meaningful interactions, participants were more

35 accurate to detect changes in facing (vs. non-facing) arrays. A similar advantage was found

36 in Experiment 2, even though facing dyads depicted no meaningful interaction. In

37 Experiment 3, we introduced a secondary task (verbal shadowing) to increase WM load.

38 This manipulation abolished the advantage of facing (vs. non-facing) arrays, only when

39 facing dyads depicted no meaningful interactions. These results show that WM uses

40 representation of interaction to chunk crowds in social groups. The mere facingness of

41 bodies is sufficient on its own to evoke representation of interaction, thus defining a social

42 chunk in WM; although the lack of semantic anchor makes chunking fainter and more

43 susceptible to interference of a secondary task.

44

45 Keywords: working memory, chunking, perceptual grouping, social cognition, scene

46 perception.

SOCIAL CHUNKING IN WORKING MEMORY 3

47 Living in a social world requires humans to process information about conspecifics and the

48 relationships between them. In scenarios that feature multiple faces or bodies, such as any

49 urban scene, vision exploits markers of interpersonal involvement to detect and recognize

50 social groups –i.e., people who engage in social relationship. One of such markers is the

51 relative positioning of bodies in space: nearby bodies in a face-to-face configuration are

52 more likely to be interpreted as interacting than bodies in other spatial configurations (Zhou,

53 Han, Liang, Hu, & Kuai, 2019); they are more likely to be attended to in a crowd (Papeo,

54 2020; Papeo, Goupil, & Soto-Faraco, 2019; Vestner, Gray, & Cook, 2020; Vestner, Tipper,

55 Hartley, Over, & Rueschemeyer, 2019) and to break into visual awareness under low-

56 visibility conditions (Papeo, Stein, & Soto-Faraco, 2017). Visual efficiency has been

57 accounted for by grouping, that is, the processing of multiple bodies as a single

58 perceptual/attentional unit, promoted by visuo-spatial cues of interaction such as spatial

59 proximity and facingness.

60 Here, we asked whether the advantage of grouping people by virtue of socially

61 relevant relations drags up to systems outside and beyond visual perception. A system that

62 might benefit from the representation of relationship between social agents is working

63 memory (WM). WM is the system that supports the temporary storage of a limited amount of

64 information for further cognitive operations (Ardila, 2003; Baddeley, 2000, 2003; Baddeley &

65 Logie, 1999). The limits of WM capacity can be exceeded through chunking, the process of

66 binding and storing multiple items into a single unit (Cowan, 2000; Mathy & Feldman, 2012;

67 Miller, 1956). Chunking in WM exploits a variety of cues, from perceptual similarity and low-

68 level perceptual features to semantic relatedness, and statistical regularities (Brady, Konkle,

69 & Alvarez, 2009; Brady & Tenenbaum, 2013; Hollingworth, 2007; Kaiser, Stein, & Peelen,

70 2014; Luck & Vogel, 1997; O’Donnell, Clement, & Brockmole, 2018).

71 Recent findings suggest that social relationship may be an effective principle of

72 chunking in WM. For example, it has been reported that body movements performed as part

73 of a meaningful interaction were more likely to be recognized in a short-delayed recognition

74 task, relative to movements performed by isolated agents (Ding, Gao, & Shen, 2017). In the

SOCIAL CHUNKING IN WORKING MEMORY 4

75 authors’ interpretation, movements that gave rise to interaction were chunked and stored as

76 a single unit, thus increasing WM efficiency. In another study, infants as young as 16 months

77 showed to rely on knowledge about social relations to chunk sets of dolls in social units

78 (Stahl & Feigenson, 2014). In particular, after seeing dolls interacting in pairs, infants were

79 capable to remember two pairs, i.e. four dolls, which exceeded the three-item limit of their

80 WM.

81 The above effects have been interpreted as the result of embedding individuals into

82 the representation of a meaningful social interaction. But, can facingness alone, in the

83 absence of a semantically specified interaction, trigger chunking of bodies in WM? In visual

84 perception and attention, effects of grouping have been found for bodies postures oriented

85 toward one another without giving rise to any meaningful, coherent interaction (Papeo et al.,

86 2017; 2019; Vestner et al., 2019). This circumstance raises the possibility that facingness –

87 i.e., the mutual perceptual accessibility of two bodies– is sufficient on its own to trigger the

88 representation of interaction that binds two bodies together. In other words, it is possible that

89 individuals represent face-to-face bodies as intrinsically meaningful, which would yield a WM

90 advantage, irrespective of whether the two facing bodies actually realize a semantically

91 specified interaction.

92 We addressed this in three experiments on female and male human adults, using a

93 delayed change-detection task. Participants saw static arrays of four or six bodies, which

94 approached or exceeded the WM capacity, arranged in two or three face-to-face (facing

95 arrays) or back-to-back dyads (non-facing array). Facing pairs could give rise to a coherent,

96 semantically explicit, interaction (meaningful set –MF), or not (meaningless set –ML). We

97 measured whether facing arrays were remembered better than non-facing arrays and, if so,

98 whether such advantage applied to MF as well as ML facing arrays (Experiments 1-2).

99 Finally, in Experiment 3, we asked participants to perform the same task with a concurrent

100 shadowing task (i.e., continuous word repetition). With this manipulation, we investigated the

101 effect of WM load on the performance with MF vs. ML arrays. Precisely, we tested whether

102 semantic specification provided an anchor point that made the representation of MF (vs. ML)

SOCIAL CHUNKING IN WORKING MEMORY 5

103 interactions stronger and less subjected to interference in WM. Moreover, visual items can

104 be stored in WM in the form of corresponding verbal labels, when available, a strategy that is

105 especially common when items are assigned to a meaning (Donkin, Nosofsky, Gold, &

106 Shiffrin, 2015). Since verbal shadowing especially interferes with verbal retrieval and

107 rehearsal, Experiment 3 could offer insight on the format of representation of dyads in WM

108 (e.g., verbal or visuo-spatial).

109 In summary, with this study, we sought to isolate the effect of body positioning (facing

110 vs. non-facing) from the effect of representing a familiar, semantically specified interaction, in

111 promoting chunking by social relationship in WM. Given that, here, performance depended

112 on the possibility to structure a crowd in social (multiple-person) units, the results of this

113 study shed light on what counts as a social unit in WM.

114

115 Experiment 1

116 Experiment 1 tested the participants’ performance in detecting a change in a crowded array,

117 with bodies forming facing or non-facing dyads. All facing dyads depicted coherent,

118 meaningful interactions.

119 Participants

120 Twenty healthy adults (18 females; mean age 22.8 ± 3.9 standard deviation, SD)

121 participated in Experiment 1 as paid volunteers. All had normal or corrected-to-normal vision,

122 reported no history of neurological or psychiatric conditions and no assumption of

123 psychoactive substances or medications. Participants gave informed consent prior to

124 participation in the study. Experiments 1 was exploratory with respect to the sample size.

125 Sensitivity analysis (GPower 3.1) estimated a minimum detectable effect (i.e., the smallest

2 126 true effect, which would be statistically significant with alpha = 0.05, and power = 0.80) of ηp

127 0.10 for the effect of positioning (facing vs. non-facing arrays) with a sample size of 20.

128 Results of Experiment 1 were used to calculate the sample size for the following

129 experiments. The local ethics committee approved this and the following experiments.

130

SOCIAL CHUNKING IN WORKING MEMORY 6

131 Stimuli

132 Meaningful-interaction (MF) dyads. We created gray-scale images of a human body in 48

133 different poses in lateral view, using Daz3D (Daz Productions, Salt Lake City, UT) and the

134 Image Processing Toolbox in MATLAB (The MathWorks, Natick, MA). Body poses were

135 compatible with one of six social interactions: communicating, talking, dancing, fighting,

136 quarreling and waving goodbye. Single bodies were paired and positioned face-to-face, so to

137 depict one of the six aforementioned interactions, yielding a total of 26 interacting dyads.

138 The meaningfulness of the interactions was evaluated in a rating and naming study involving

139 an independent sample of participants (see below).

140 Meaningless-interaction (ML) dyads: Forty-eight images of a human body in 48 new

141 poses in lateral view figures were created as above, and then randomly paired in 24 facing

142 dyads, so that the pairing gave rise to no familiar interaction.

143 Rating study. The meaningfulness of the above 24 ML and 24 MF dyads was evaluated

144 with a rating and naming study involving 19 native-French speakers (11 females, mean age

145 27.5 ±6.8 SD) external to the main experiments. All participants saw all the MF and ML

146 facing dyads in a random order. Dyads appeared at the center of a computer screen

147 subtending a visual angle of ~10°. For each dyad, participants were asked to rate the

148 meaningfulness of the scene by clicking on a Likert scale from 0 to 10 (0 = meaningless; 10

149 = very meaningful) displayed under the dyad. After a blank, participants were prompted to

150 provide a verbal description of the stimulus using one or few words. There was no time limit

151 to respond. The study was conducted online using Google Forms.

152 For each dyad, we computed: a) a score of meaningfulness corresponding to the

153 mean rate across participants; and b) a score of semantic consistency, representing the

154 percentage of participants who agreed on the expected meaning of a stimulus (descriptions

155 with similar meaning were considered semantically consistent; for example, “parler”, to talk,

156 “argumentée un point”, to make a point, were taken as descriptions compatible with the

157 general meaning “talking”).

SOCIAL CHUNKING IN WORKING MEMORY 7

158 The results of this study confirmed our a priori categorization of the dyads as MF or

159 ML. In particular, dyads that we had categorized as MF, had high meaningfulness scores

160 (mean = 7.71 ±1.14) and high semantic consistency (mean = 84.8 ±17.06). Dyads that we

161 had categorized as ML were given low meaningfulness scores (mean = 4.9 ± 0.92) and low

162 consistency (mean = 32.5 ±15.95). The two sets differed significantly for both

163 meaningfulness, t(23) = 9.54, p < .001, and consistency, t(23) = 9.97, p < .001.

164 Next, we created two sets of stimuli for the main experiments. For the MF set, we

165 selected three exemplars for each of the four categories that had obtained the highest

166 values of meaningfulness and semantic consistency: “Talking”, “Dancing”, “Fighting” and

167 “Quarreling” (mean meaningfulness = 7.67 ±1.35; mean consistency = 94.75 ±10.39). For

168 each of the selected dyads, we created a mirror version, yielding a total of 24 meaningful

169 dyads. Twenty-four non-facing dyads were created by swapping the position of the two

170 figures in each facing dyad (i.e., the figure on the left side was moved to the right side and

171 vice versa). The distance between two bodies in a dyad was matched across facing and

172 non-facing stimuli. To this end, we considered the distance between the centers of the two

173 minimal bounding boxes around each body (facing vs. non-facing dyads: t(23) = 1.04, p =

174 0.305); the distance between the closest points of the two bodies (facing vs. non-facing

175 dyads: t(23) = 1.41, p = 0.170).

176 For the ML set, 12 dyads were randomly selected amongst the dyads with the lowest values

177 of meaningfulness and consistency. Those dyads were flipped on the horizontal axis,

178 yielding a total of 24 meaningless dyads. The positioning of the two bodies within each dyad

179 was swapped to obtain 24 non-facing dyads. Across ML facing and non-facing stimuli, we

180 matched distances between the centers of the two bounding boxes around each body (t(23)

181 = 1.16, p = 0.256), and between the closest extremities of bodies, t(23) = 0.89, p = 0.382). In

182 Experiment 1, we only used the MF set. The ML set will be considered in in Experiment 2. All

183 stimuli are available upon request to the authors.

184

SOCIAL CHUNKING IN WORKING MEMORY 8

185 Arrays. We created arrays featuring two (set2; 50% of arrays) or three (set3) facing dyads.

186 In set2-arrays, dyads were placed on the right and left side, equally distant from a cross in

187 central fixation; in set3, dyads were place in correspondence of three angles of a (invisible)

188 triangle centered on the fixation. In each array, dyads were all facing (50%) or all non-facing.

189 In a facing array, facing dyads belonged to different semantic categories. For each

190 facing array, we created a non-facing array involving the very same dyads with bodies

191 presented back-to-back. Each array (sample) was paired with another array (probe) that

192 could be identical (same trials; 50%) or differ from the sample for one dyad of a category not

193 shown in the sample array (different trials). For example, if the sample-array showed

194 exemplars of “Fighting” and “Talking”, in a different trial, the probe-array would show the

195 “Fighting” dyad and a new one chosen from the remaining two categories, (“Quarreling” or

196 “Dancing”). For each participant, a new set of 432 arrays was created (160 sample-arrays

197 and corresponding same/different probe-arrays for the main experiment; 24 and 32 sample-

198 arrays and corresponding same/different probe-arrays for familiarization and training,

199 respectively), equally distributed across the eight experimental conditions: set2 and set3 of

200 facing and non-facing same and different arrays (see Figure 1B-C for examples of facing

201 and non-facing arrays of MF and ML dyads, respectively).

202 Procedure

203 Participants were seated on a height-adjustable chair in front of a screen for stimulus

204 presentation, with their eyes aligned to, and 60 cm away from the center of the screen. Each

205 trial began with a central fixation cross presented for 1400 ms followed by the sample-array

206 shown for 2000 ms. After an interval of 1000 ms, the probe-array appeared for 2000 ms (see

207 Figure 1A). For each trial, participants were asked to report whether the probe was the same

208 or different relative to the sample. Using the numeric keypad of a keyboard in front of them,

209 they had to press “1” for same and “2” for different, with the right index and middle finger,

210 respectively. Participants performed five blocks of 32 trials, four for each experimental

211 condition, yielding 20 trials for each of the eight conditions (set2 or set3, facing or non-facing

212 arrays in same or different trials) and a total of 160 trials. Stimuli of the eight conditions were

SOCIAL CHUNKING IN WORKING MEMORY 9

213 randomly interleaved in a block. Every two blocks, participants were invited to take a break.

214 The experiment was preceded by a familiarization block of 24 trials, during which

215 participants were free to ask questions and address the experimenter, and a training block of

216 32 trials, identical to the proper experiment. Response accuracy and reaction times (RTs)

217 were recorded for each trial. Images were displayed on a 17-in CRT monitor (1024x768

218 pixels resolution, 85-Hz refresh rate). Individual dyads subtended ~3° of visual angle and

219 their center was far ~2° from the fixation cross. Stimulus presentation and response

220 collection were controlled using the Psychophysics Toolbox extensions (Brainard, 1997)

221 through MATLAB. The entire experiment lasted ~30 minutes. In the debriefing at the end of

222 the experiment, participants were asked about strategies they might have used, to complete

223 the task.

224

225 Results

226 For each participant, for each condition, we computed mean proportion of correct responses

227 (accuracy) and mean RTs. No participant performed below or above 2 SD from the group

228 accuracy mean; therefore, they were all included in the final analyses. Mean accuracy rates

229 and RTs were analyzed in two separate repeated-measures ANOVAs with factors Spatial

230 position (facing/non-facing), Set size (set2/set3) and Trial type (same/different). In this and in

231 the following experiments, participants’ accuracy proved more sensitive than RTs to the

232 effects of experimental manipulations. Therefore, here we focus on accuracy results only.

233 RT results conformed to the pattern of accuracy results or showed no difference across

234 conditions, which in either case ruled out a speed-accuracy trade-off. A full report of RT

235 results is provided as Supplemental Information.

236 As illustrated in Figure 1B, accuracy values revealed an advantage for facing over

237 non-facing arrays, which especially emerged in different-trials. This pattern was confirmed by

238 the statistical analysis. The ANOVA revealed a main effect of Spatial position, F(1,19) =

2 239 10.5, p = .004, ηp = .354, showing that participants were more accurate with facing than

240 non-facing arrays. Spatial position significantly interacted with Trial type, F(1,19) = 15.1, p <

SOCIAL CHUNKING IN WORKING MEMORY 10

2 241 .001, ηp = .443. That is, the advantage for facing over non-facing arrays emerged in

242 different-trials, t(19) = 5.03, p < .001, but not in same-trials, t(19) = 0.17, p = .864. This

243 pattern applied to conditions with both set2, t(19) = 29.29, p<.001; and set3, t(19) = 2.72, p =

244 .013 (Trial type x Set size x Spatial position, F(1,19) = 2.14, p = .159) .

2 245 Also significant were the main effects of Set size, F(1,19) = 244.5, p < .001, ηp =

246 .927, reflecting overall better performance for set2- than set3-trials, and of Trial type, F(1,19)

2 247 = 64.8, p < .001, ηp = .773, reflecting better performance in same- than different-trials. The

2 248 interaction between the two factors was significant, F(1,19) = 53.1, p < .001, ηp = .736,

249 revealing a larger difference between set2 and set3, in different-trials, t(19) = 11.57, p <

250 .001, than in same-trials, t(19) = 2.093, p = .05. Finally, the Set size x Spatial position

2 251 interaction was not significant, F(1,19) = 1.834, p = .192, ηp = .088. These data, and those

252 of the following experiments, are available upon request to the authors.

253 In sum, we found a WM advantage of facing over non-facing dyads in a task that

254 required participants to hold the representation of visual stimuli in WM, for delayed

255 recognition in the probe array. The advantage emerged particularly in different-trials., In

256 same-trials, performance was near ceiling. The advantage for facing trials is compatible with

257 the hypothesis that facingness favors chunking; alternatively, it could reflect the difference

258 between processing of meaningful events in facing arrays vs. processing of meaningless

259 scenes in non-facing arrays, where body positioning broke the meaning of the interaction-

260 events. Experiment 2 spoke to that question.

261

262 Experiment 2

263 Results of Experiment 1 could reflect the advantage of processing facing (vs. non-facing)

264 dyads, or the advantage of processing meaningful (vs. meaningless) scenes. In Experiment

265 2, we sought to disentangle the effect of spatial positioning from the effect of meaning, by

266 presenting dyads of facing and non-facing bodies, which had no obvious semantic content in

267 either condition.

268 Participants

SOCIAL CHUNKING IN WORKING MEMORY 11

269 Twenty healthy adults (14 females; mean age 24.2±4.4) participated as paid volunteers. All

270 had normal or corrected-to-normal vision, reported no history of neurological or psychiatric

271 conditions and no assumption of psychoactive substances or medications. They gave

272 informed consent prior to participation. The sample size was identical to Experiment 1, as

273 Experiment 2 sought to test whether the effects in Experiment 1 could be replicated with a

274 new stimulus set.

275 Stimuli and procedures

276 Stimuli of Experiment 2 were formed using the same procedure of Experiment 1 (see

277 Stimulus section of Experiment 1), except that they involved dyads from the ML set. As in

278 Experiment 1, a unique set of 432 arrays was created for each participant (160 arrays of

279 two/three facing or non-facing dyads and as many identical or different arrays), in addition to

280 64 unique arrays (32 sample and 32 probes) for training and 48 arrays (24 samples and 24

281 probes) for the instructions and familiarization phase (the same 48 arrays were used for all

282 participants). To create the different-probe arrays, one dyad in the 50% of sample arrays

283 was replaced by a new dyad randomly selected from the remaining dyads of the ML set.

284 Experimental setting, task, and procedures were identical to Experiment 1, except that here

285 participants were instructed to respond after the probe-array disappeared from the screen

286 (after 2000 ms from probe onset), to encourage the exploration of the arrays.

287

288 Results

289 Mean accuracy values of all participants were within 2 SD from the group mean; therefore,

290 they were all included in the analysis. Mean proportions of correct responses (accuracy)

291 were analyzed in a 2 Spatial position x 2 Trial type x 2 Set size repeated-measures ANOVA

292 (see Supplementary materials for RT analyses and results).

293 As shown in Figure 1C, we found no main effect of Spatial Position, F(1,19) = 3.53, p

294 = .075, but a significant interaction between Spatial Position and Trial type, F(1,19) = 11.4,

2 295 p= .003, ηp = .375. Congruent with Experiment 1, the interaction revealed an advantage for

296 facing over non-facing dyads in different-trials, t(19) = 3.06, p = .006, but not in same-trials,

SOCIAL CHUNKING IN WORKING MEMORY 12

297 t(19) = 0.98, p = .336. A trend for the Set size by Spatial position interaction, F(1,19) = 3.74,

2 298 p= .068, ηp = .164, showed that the advantage of facing vs. non-facing arrays was stronger

299 with set3-arrays, t(19) = 2.33, p = .030, than with set2-arrays, t(19)= 0.23, p= .814. The three

300 way interaction between Trial type, Set size and Spatial position, however, did not reach the

2 301 significance, F(1,19) = 1.97, p = .175, ηp = .094.

2 302 We also found a main effect of Trial type, F(1,19) = 7.38, p = .014, ηp = .279, and Set

2 303 size, F(1,19) = 54.25, p < .001, ηp = .74, showing overall higher accuracy for same (vs.

304 different) trials and for set2 (vs. set3) trials, respectively. Finally, a significant interaction

305 between Trial type and Set size, F(1,19) = 8.29, p= .009, showed that the difference in

306 performance with set2- and set3-arrays was stronger in different- than in same-trials .

307 In summary, results of Experiment 2 were congruent with those of Experiment 1, in

308 showing a performance advantage in trials with facing-dyad arrays, when participants had to

309 detect a change in the probe (i.e., in different-trials). In Experiment 1, the advantage was

310 found for arrays of facing dyads depicting meaningful interactions. Here, we replicated the

311 effect with arrays of facing dyads that did not give rise to any obvious meaningful interaction.

312 This result suggests that facingness, even in the absence of a semantically coherent

313 interaction content, defined a relation that binds two bodies together in WM.

314 To compare results of Experiment 1 and 2, we performed an ANOVA with the same

315 within-subjects factor as above and Group (Experiment 1: MF group and Experiment 2: ML

316 group) as the only between-subjects factor. This analysis confirmed that the effects in

317 Experiments 1 and 2 were statistically comparable. In particular, we found a main effect of

2 318 Spatial position, F(1,39) = 12.3, p = .001, ηp = .244, and a significant two-way interaction

2 319 between Spatial Position and Trial type, F(1,39) = 26.0, p < .001, ηp = .406. The interaction

320 reflected higher accuracy for facing, relative to non-facing arrays in different-trials, t(39) =

321 5.46, p<.001, but not in same-trials, t(39) = .801, p = .427. The analysis also revealed an

2 322 interaction between Spatial Position and Set size, F(1,39) = 5.1, p= .029, ηp = .119, showing

323 that the advantage of facing over non-facing arrays occurred in set3-trials, t(39) = 3.62, p <

324 .001, but not in set2-trials, t(39) = .963, p = .341. We also found a trend for the interaction

SOCIAL CHUNKING IN WORKING MEMORY 13

2 325 between Spatial Position, Set size and Trial type F(1,39) = 4.0, p= .052, ηp = .095, showing

326 that the advantage of facing arrays was the strongest in the condition with set3-different

327 trials, allegedly the most demanding of all conditions.

2 328 Also significant were the main effects of Trial type, F(1,39) = 48.0, p < .001, ηp =

2 329 .558, and Set size, F(1,39) = 204.6, p < .001, ηp = .843, the Trial type by Set size

2 330 interaction, F(1,39) = 49.6, p= .029, ηp = .566, the Trial type by Group interaction, F(1,39) =

2 331 6.389, p= .015, ηp =. 143, and the Trial type by Set size by Group interaction, F(1,39) =

2 332 7.820, p= .008, ηp =. 170. This interaction showed that, while performance with same-trials

333 was overall very good, the drop in the performance accuracy with set3 different-trials (vs.

334 set2 different-trials) was more pronounced in the MF, than in the ML group. Importantly, the

335 factors Group and Spatial Position did not interact, meaning that the effect of Spatial Position

336 did not change across groups (effect of Group: F(1,39), <1, n.s.; Group x Spatial position:

337 F(1,39) < 1, n.s.; Group x Set size: F(1,39) = 2.65, < 1; Group x Trial type x Spatial position:

338 F(1,39) < 1, n.s.; Group x Set size x Spatial position: F(1,39) < 1, n.s.; Group x Trial type x

339 Set size x Spatial position: F(1,39) < 1, n.s.).

340 The last analysis confirmed an advantage for facing over non-facing dyads,

341 irrespective of whether the face-to-face positioning of bodies defined a familiar, semantically

342 coherent interaction (Experiment 1) or not (Experiment 2). Together, these results suggest

343 that the facingness on its own benefits WM performance, possibly by driving chunking of

344 bodies.

345 The debriefing of participants after the experiment provided insight on the mechanism

346 underlying the performance in the current task. Participants of both groups consistently

347 reported to have spontaneously labeled the facing dyads in the arrays. This circumstance

348 solicited a number of considerations. First, participants used labeling for facing, but not for

349 non-facing dyads, as if only the former spontaneously triggered a search for meaning, or

350 semantic retrieval. As a result, the processing of facing and non-facing might have recruited

351 different mechanisms and subsystems of WM, in particular, verbal through phonological loop

352 and visuo-spatial through the visuo-spatial sketchpad, respectively (Baddeley, 2003;

SOCIAL CHUNKING IN WORKING MEMORY 14

353 Baddeley & Hitch, 1994). Second, participants used labeling for both MF and ML facing

354 dyads, as if facingness on its own prompted participants to treat and interpret dyads as

355 meaningful stimuli. However, only ML-facing dyads were associated with an entry in

356 semantic memory. Semantic knowledge can support chunking in WM, even without the

357 mediation of verbal labeling (Daniel Kaiser, Stein, & Peelen, 2015; O’Donnell et al., 2018). In

358 contrast, in the case of unfamiliar, less coherent interaction-like scenes (i.e., ML-facing

359 dyads), chunking might be more effortful and, therefore, might benefit more from the support

360 of labeling. Following this reasoning, we asked whether a difference in the processing of

361 facing vs. non-facing dyads would emerge, if we added a secondary task that would interfere

362 with the WM performance, by abolishing the opportunity for verbal labeling. Experiment 3

363 was conceived to address this possibility and, by doing so, to test possible differences in the

364 format of representation of MF vs. ML stimuli, in WM.

365

366 Experiment 3

367 In Experiments 1-2, numerous participants reported to have spontaneously labeled facing

368 dyads and rehearsed those labels in the interval elapsing between sample and probe. Here,

369 we introduced verbal shadowing through the continuous repetition of a verbal message, in

370 order to increase the WM load and, specifically, occupy the phonological subsystem of WM

371 that holds representations in a verbal format (Baddeley, 2003). By doing so, we assessed

372 whether the semantic difference between MF- and ML- facing dyads made the latter more

373 susceptible to a concurrent task that interfered with the opportunity for verbal labeling and

374 rehearsal.

375

376 Participants.

377 Twenty-seven healthy adults (24 females; mean age 23.61 ±4.82 SD) participated as paid

378 volunteers. This sample size was established based on the size of the effect of Spatial

2 379 Position found in Experiment 1 (ηp = .354, alpha = 0.05, beta = 0.95). All participants had

380 normal or corrected-to-normal vision, reported no history of neurological or psychiatric

SOCIAL CHUNKING IN WORKING MEMORY 15

381 conditions and no assumption of psychoactive substances or medications. All gave informed

382 consent prior to participation.

383

384 Stimuli and procedure

385 Stimuli, apparatus and procedures were identical to Experiments 1-2, except for the

386 following changes. First, all participants saw all the stimuli of both Experiments 1 and 2,

387 which doubled the number of trials (320 in total) and conditions (16 conditions: same- and

388 different-trials with set2 and set3 arrays of MF- and ML- facing and non-facing dyads).

389 Stimuli based on the MF set and those based on the ML set were presented in separate

390 runs, with the order alternating between participants. Second, concurrently with the delayed

391 change-detection task, participants performed a shadowing task throughout the trial. To

392 implement this task (see Figure 2A), the trial began with two target-digits presented for 500

393 ms in red ink, at the center of the screen (1°×1° visual angle). Participants were instructed to

394 read the two target-digits and repeat them aloud throughout the trial. Meanwhile, the trial

395 unfolded identical to Experiments 1-2, with sample- and probe-arrays for the change-

396 detection task. After the participant responded to the change-detection task, a red digit

397 appeared on the screen and participants had to decide whether this was one of the two

398 target-digits. If so, they had to press the key “F” with the left index. If no response was

399 provided after 2000ms, it counted as a “no” response and the next trial began. The

400 experiment lasted ~85 min (~70 minutes for task + ~15 of breaks).

401

402 Results

403 All participants performed within 2 SD from the group accuracy mean and were all

404 included in the analysis. Only trials in which participants provided a correct response in the

405 secondary task were considered for further analysis (mean rejected trials 5.92 ±5.94 SD).

406 Mean accuracy values were analyzed in a 2 Set (MF/ML) x 2 Spatial position (facing/non-

407 facing) x 2 Set size (set2/set3) x 2 Trial type (same/different) repeated-measures ANOVA.

SOCIAL CHUNKING IN WORKING MEMORY 16

408 Results showed no main effect of Spatial position, F(1,26) = 1.27, p= .269, but a significant

2 409 interaction between Spatial position and Trial type, F(1,26) = 9.65, p = .004, ηp = .270.

410 Consistent with Experiments 1-2, this interaction showed higher accuracy for facing, than for

411 non-facing arrays in different-trials, t(26) = 2.36, p= .026, but not in same-trials, t(26) = 1.97,

412 p = .056 (see Figure 2B). Results also showed significant interactions between Set and

2 413 Spatial position, F(1,26) = 12.68, p = .001, ηp = .327, and between Set, Set size and Spatial

2 414 position, F(1,26) = 20.44, p = .001, ηp = .440. All effects were qualified by a significant

415 interaction between Set, Spatial position, Set size and Trial type, F(1,26) = 3.78, p = .063,

2 416 ηp = .127. This interaction revealed that the advantage for facing over to non-facing dyad-

417 arrays emerged in different-set3 trials, and only when facing dyads depicted semantically

418 coherent interactions (MF set), t(26) = 3.52, p = .001. The advantage disappeared with

419 facing dyads from the ML set. If anything, in different-set3 trails of the ML set, participants

420 were more accurate with non-facing, than with facing-dyad arrays, t(26) = 2.27, p = .031.

421 Performance with facing and non-facing arrays did not differ in all the conditions with same-

422 trials and with set2-trials (all ps > 0.250).

2 423 Finally, also significant were the effect of Trial type, F(1,26) =73.72, p< .001, ηp =

424 .739, reflecting higher accuracy with same- than different-trials, and the effect of Set size,

2 425 F(1,26) =422.01, p< .001, ηp = .941, reflecting higher accuracy with set2- than set3-trials. All

426 other interactions were not significant (Set x Trial type, F(1,26) < 1, n.s.; Set x Set size,

427 F(1,26) < 1, n.s.; Set size x Spatial position, F(1,26) < 1, n.s.; Set x Trial type x Set size,

428 F(1,26) = 1.073, p > .250; Set x Trial type x Spatial Position, F(1,26) = 2.598, p= .119; Trial

429 type x Set size x Spatial position, F(1,26) < 1, n.s.).

430 To summarize, like in Experiments 1-2, the WM advantage of facing, over non-facing

431 arrays emerged in the most demanding of the experimental conditions, i.e., the condition in

432 which participants had to detect a change in an array of six bodies (i.e., set3-arrays). With

433 the concurrent shadowing task, however, the advantage was only found for facing arrays

434 featuring familiar, meaningful interactions (MF set). This result demonstrates that the

435 semantic relation defined by MF-facing dyads provided an effective for chunking in WM,

SOCIAL CHUNKING IN WORKING MEMORY 17

436 irrespective of the opportunity for verbal labeling. Interference with verbal labeling and/or

437 rehearsal had catastrophic effect on performance with ML-facing arrays, which was poorer

438 than performance with non-facing arrays. This suggests that the advantage of ML-facing

439 dyads in Experiment 2 reflected a strategic, impromptu attribution of meaningful relations,

440 represented in WM in the form of verbal labels.

441

442 Discussion

443 Keeping in mind an array of more than four items for a short period of time depends on the

444 possibility to chunk multiple items in units in WM. We investigated whether social

445 relationship between social agents (i.e., human bodies) influenced the representation of

446 crowded scenarios in WM, by offering a structure to organize representations of single

447 individuals in social chunks. In addressing this question, we asked what kind of relationship

448 could bind two bodies together in WM. We considered the representation of a familiar,

449 meaningful interaction (i.e., fighting, talking, quarreling, or dancing; Experiment 1) versus the

450 mere, socially relevant, face-to-face positioning of two bodies (Experiment 2). In both cases,

451 we found that arrays of facing dyads were retained and recognized batter than arrays of non-

452 facing dyads. The conditions that proved most sensitive to the effect of positioning were

453 those in which a change occurred in arrays that exceeded the WM capacity of four items,

454 making chunking mandatory to succeed in the task (Awh, Barton, & Vogel, 2014; Cowan,

455 2000). Those conditions emphasized relations that could bind single bodies in social chunks:

456 facingness and social interaction.

457 Participants in Experiments 1-2 reported to use verbal labeling to encode and

458 remember facing dyads. Verbal labeling is a common, relatively undemanding strategy to

459 hold information by phonological maintenance or rehearsal. In visual WM tasks, it provides

460 an additional source of storage, whereby a verbal code is used to recall visual information.

461 This strategy can be prevented with shadowing by continuous word repetition (Baddeley,

462 Gathercole, & Papagno, 1998; Robbins et al., 1996). We reasoned that, in the case of

SOCIAL CHUNKING IN WORKING MEMORY 18

463 meaningless stimuli, verbal labeling could enhance a faint relationship prompted by spatial

464 positioning, thus facilitating binding of face-to-face bodies. If so, shadowing of verbal

465 and phonological rehearsal should abolish the advantage for meaningless-facing

466 dyads. Results of Experiment 3 supported this hypothesis: verbal shadowing left unhindered

467 the advantage for meaningful-facing dyads, while it had a detrimental effect on the

468 performance with meaningless-facing dyads.

469 Thus, without any instruction to do so, participants used different mechanisms to

470 process facing and non-facing dyads. In particular, the results of Experiments 1-3 suggest

471 that different subsystems supported the maintenance of facing and non-facing dyads in WM:

472 the visuo-spatial sketchpad holding visual representations for non-facing dyads and –in

473 addition to, or instead of the visuo-spatial sketchpad– the phonological loop holding visual

474 information in a verbal code for facing dyads (Baddeley, 2000; Baddeley & Hitch, 1994).

475 While it is possible that chunking and storage of both meaningful-facing bodies in

476 Experiment 1 and meaningless-facing bodies in Experiment 2 took advantage of verbal

477 labeling, only the former could still be bound together without labeling (Experiment 3).

478 The advantage of meaningful-facing dyads without the support of the phonological

479 loop highlighted the contribution of semantic knowledge about interaction, which provided a

480 structure to organize new information in WM. Indeed, WM can exploit various cues in the

481 visual input, which are related to prior knowledge, and can be used to form groups of

482 associated items that are thus stored together (Brady et al., 2009; Chen & Cowan, 2009;

483 Cowan, 2000; Ericsson & Kintsch, 1995). Current models of WM suggest that this

484 mechanism recruits the episodic buffer, a component of WM for temporary storage of

485 episodes, with access to and from long-term memory (Baddeley, 2000; Ericsson & Kintsch,

486 1995; Rossi-Arnaud, Pieroni, & Baddeley, 2006).

487 A similar mechanism might have operated in the processing of meaningless-facing

488 dyads. Our contention is that facingness triggered a general, underspecified representation

489 of social interaction; however, without an anchor to a specific semantic entry, and without the

490 additional support for chunking provided by labeling (Experiment 3), the underspecified

SOCIAL CHUNKING IN WORKING MEMORY 19

491 representation often failed to support discrimination between two instances of interaction

492 (two meaningless-facing dyads), making the change from sample to probe hard to detect for

493 meaningless-facing dyads.

494 The current results contribute to demonstrate that, among familiar associations and

495 semantic relations (Brady et al., 2009; Chase & Simon, 1973; Curby, Glazek, & Gauthier,

496 2009; Feigenson & Halberda, 2008; Daniel Kaiser et al., 2015; Kibbe & Feigenson, 2013),

497 individuals can use knowledge of social relationship (i.e., knowledge about the structure of

498 familiar dyadic interactions) for chunking in WM. In previous studies, chunking by social

499 relationship has been emphasized by showing meaningful social interactions

500 (physical/communicative exchanges) between social agents facing and moving toward each

501 other (Bellot, Abassi, & Papeo, 2021; Ding et al., 2017; Stahl & Feigenson, 2014). Here, we

502 set conditions to isolate the effects of spatial cues (i.e., spatial proximity and positioning) and

503 semantic relations (i.e., category of interaction). Thus, we showed that just facingness,

504 without a familiar, semantically explicit interaction, can establish a relationship –as faint as it

505 might be– that triggers chunking, to the benefit of WM capacity.

506 Our results also shed light on the relationship between visual perception and WM.

507 Research on scene perception has shown that visuo-spatial cues of interaction, such as

508 proximity and facingness, independently from the meaningfulness of the stimuli, trigger

509 automatic grouping of multiple bodies in the visual system (Adibpour, Hochmann, & Papeo,

510 2020), yielding increased efficiency in stimulus detection and recognition (Papeo, 2020;

511 Papeo et al., 2019, 2017; Strachan, Sebanz, & Knoblich, 2019; Vestner et al., 2019). The

512 current results show that grouping by facingness drags up to WM. In sum, facingness

513 defines a relationship that is exploited for efficiency in visual perception and for chunking in

514 WM.

515 A number of questions remain open. One concerns the representation of single

516 bodies that form an interacting (or seemingly interacting) dyad. In visual search for bodies

517 through a crowd, participants rapidly access two facing bodies as a group (vs. non-facing

518 bodies), but with a cost in the access to individual bodies of that group (Papeo et al., 2019).

SOCIAL CHUNKING IN WORKING MEMORY 20

519 A similar cost might be found in WM processing. Previous research involving familiar objects

520 has shown that compressing information in WM increases capacity, but reduces the number

521 of features that are encoded for each individual component (Alvarez, 2011; Alvarez &

522 Cavanagh, 2004; Brady & Alvarez, 2011). This cost however might vary depending on the

523 object class. For socially relevant stimuli such as faces, WM exhibits not only greater

524 capacity relative to non-face objects (Curby & Gauthier, 2007), but also improved resolution

525 (Scolari, Vogel, & Awh, 2008). Ding et al. (2017) tested WM for arrays of interacting or non-

526 interacting body dyads by asking participants to report whether a single body was present in

527 the previous array or not. A performance advantage was found for bodies seen in interacting

528 dyads. Those results encourage the hypothesis that the representation of single bodies (or

529 single actions) in WM is enhanced, rather than impoverished, in a meaningful-interaction

530 context (see also Abassi & Papeo, 2020; Bellot, Abassi, & Papeo, 2021; Neri, Luu, & Levi,

531 2006).

532 Another open question concerns the features that are more likely to be encoded in

533 the WM representation of an interacting dyad. Research on visual perception of the gist of

534 events has shown that individuals are extremely rapid and efficient at extracting information

535 about agent-patient roles (Hafri, Papafragou, & Trueswell, 2013; Hafri, Trueswell, &

536 Strickland, 2018) and action coherence (Glanemann, Zwitserlood, Bölte, & Dobel, 2016),

537 from the physical structure of the visual input. Are these the features of an interaction that

538 are most likely to be encoded in WM and pass into the long-term memory representation of a

539 social event? And, what other visuo-spatial features, besides facingness, can trigger the

540 representation of social interaction in WM? What happens to the WM representation of

541 social interactions that lack prototypical features of social interaction, such as facingness

542 and spatial proximity? These are some of the questions to address in order to understand

543 how people encode and remember one of the most important aspects of their visual world:

544 social interaction.

545 In conclusion, we showed that WM uses knowledge about social relationship to

546 chunk bodies in groups (dyads), thus increasing its capacity. Facingness alone can drive this

SOCIAL CHUNKING IN WORKING MEMORY 21

547 mechanism: It solicits semantic encoding of the stimuli, as revealed by spontaneous

548 labeling, which provides a principle for chunking in WM. Thus, two people mutually

549 accessible to one another form an intrinsically meaningful representation that human

550 cognition readily processes as a social unit, before the interaction is fully realized or

551 understood.

552

SOCIAL CHUNKING IN WORKING MEMORY 22

553

554 Acknowledgements

555 This work was supported by a European Research Council Starting Grant awarded to L.P.

556 (Project: THEMPO, Grant Agreement 758473) and by the Amici di Claudio Demattè

557 Scholarship (12° edition) awarded to I.P.

558

SOCIAL CHUNKING IN WORKING MEMORY 23

559

560 References

561 Abassi, E., & Papeo, L. (2020). The representation of two-body shapes in the human visual

562 cortex. Journal of Neuroscience, 40(4), 852–863.

563 https://doi.org/10.1523/JNEUROSCI.1378-19.2019

564 Adibpour, P., Hochmann, J.-R., & Papeo, L. (2020). Spatial relations trigger visual binding of

565 people. BioRxiv. https://doi.org/10.1101/2020.08.06.239749

566 Alvarez, G. A. (2011). Representing Multiple Objects as an Ensemble Enhances Visual

567 Cognition Representing multiple objects as an ensemble enhances visual cognition.

568 https://doi.org/10.1016/j.tics.2011.01.003

569 Alvarez, G. A., & Cavanagh, P. (2004). The Capacity of Visual Short- Term Memory Is Set

570 Both by Visual Information Load and by Number of Objects. Psychological Science,

571 17(5), 1019–1029. https://doi.org/10.1208/s12249-015-0428-4

572 Ardila, A. (2003). Language representation and working memory with bilinguals. Journal of

573 Communication Disorders, 36(3), 233–240. https://doi.org/10.1016/S0021-

574 9924(03)00022-4

575 Awh, E., Barton, B., & Vogel, E. K. (2014). Visual Working Memory a Fixed Number of

576 Represents Items Regardless of Complexity, 18(7), 622–628.

577 https://doi.org/10.1111/j.1467-9280.2007.01949.x

578 Baddeley, A. (2000). The episodic buffer: a new component of working memory? Trends in

579 Cognitive Sciences, 4(11), 417–423. https://doi.org/10.1016/S1364-6613(00)01538-2

580 Baddeley, A. (2003). Working memory: looking back and looking forward. Nature Reviews

581 Neuroscience, 4(10), 829–839. https://doi.org/10.1038/nrn1201

582 Baddeley, A. D., & Hitch, G. J. (1994). Developments in the concept of working memory.

583 Neuropsychology, 8(4), 485. https://doi.org/10.1037/0894-4105.8.4.485

584 Baddeley, A. D., & Logie, R. H. (1999). Working memory: The multiple-component model.

585 https://doi.org/10.1017/CBO9781139174909.005

586 Baddeley, A., Gathercole, S., & Papagno, C. (1998). The phonological loop as a language

SOCIAL CHUNKING IN WORKING MEMORY 24

587 learning device. Psychological Review, 105(1), 158.

588 Bellot, E., Abassi, E., & Papeo, L. (2021). Moving toward versus away from another: how

589 body motion direction changes the representation of bodies and actions in the visual

590 cortex. Cerebral Cortex, in press, doi.org/10.1093/cercor/bhaa382

591 Brady, T. F., & Alvarez, G. A. (2011). Hierarchical encoding in visual working memory:

592 Ensemble statistics bias memory for individual items. Psychological Science, 22(3),

593 384–392. https://doi.org/10.1177/0956797610397956

594 Brady, T. F., Konkle, T., & Alvarez, G. A. (2009). Compression in Visual Working Memory:

595 Using Statistical Regularities to Form More Efficient Memory Representations. Journal

596 of Experimental Psychology: General, 138(4), 487–502.

597 https://doi.org/10.1037/a0016797

598 Brady, T. F., & Tenenbaum, J. B. (2013). A probabilistic model of visual working memory:

599 Incorporating higher order regularities into working memory capacity estimates.

600 Psychological Review, 120(1), 85–109. https://doi.org/10.1037/a0030779

601 Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial Vision, 10(4), 433–436.

602 https://doi.org/10.1163/156856897X00357

603 Chase, W. G., & Simon, H. A. (1973). The mind’s eye in chess. Visual Information

604 Processing. WG Chase. New York, Academic Press. https://doi.org/10.1016/B978-0-

605 12-170150-5.50011-1

606 Chen, Z., & Cowan, N. (2009). Core verbal working-memory capacity: The limit in words

607 retained without covert articulation. Quarterly Journal of Experimental Psychology,

608 62(7), 1420–1429. https://doi.org/10.1080/17470210802453977

609 Cousineau, D. (2005). Confidence intervals in within-subject designs: A simpler solution to

610 Loftus and Masson’s method. Tutorials in Quantitative Methods for Psychology, 1(1),

611 42–45.

612 Cowan, N. (2000). The magical number 4 in short-term memory: A reconsideration of mental

613 storage capacity Behavioral and Brain Sciences. Behavioral and Brain Sciences, (4),

614 87–185. https://doi.org/10.1017/S0140525X01003922

SOCIAL CHUNKING IN WORKING MEMORY 25

615 Curby, K. M., & Gauthier, I. (2007). A visual short-term memory advantage for faces.

616 Psychonomic Bulletin & Review, 14(4), 620–628. https://doi.org/10.3758/BF03196811

617 Curby, K. M., Glazek, K., & Gauthier, I. (2009). A visual short-term memory advantage for

618 objects of expertise. Journal of Experimental Psychology: Human Perception and

619 Performance, 35(1), 94. https://doi.org/10.1037/0096-1523.35.1.94

620 Ding, X., Gao, Z., & Shen, M. (2017). Two Equals One: Two Human Actions During Social

621 Interaction Are Grouped as One Unit in Working Memory. Psychological Science, 28(9),

622 1311–1320. https://doi.org/10.1177/0956797617707318

623 Donkin, C., Nosofsky, R., Gold, J., & Shiffrin, R. (2015). Verbal labeling, gradual decay, and

624 sudden death in visual short-term memory. Psychonomic Bulletin & Review, 22(1),

625 170–178. https://doi.org/10.3758/s13423-014-0675-5

626 Ericsson, K. A., & Kintsch, W. (1995). Long-term working memory. Psychological Review,

627 102(2), 211. https://doi.org/10.1037/0033-295X.102.2.211

628 Feigenson, L., & Halberda, J. (2008). Conceptual knowledge increases infants’ memory

629 capacity. Proceedings of the National Academy of Sciences, 105(29), 9926–9930.

630 https://doi.org/10.1073/pnas.0709884105

631 Glanemann, R., Zwitserlood, P., Bölte, J., & Dobel, C. (2016). Rapid apprehension of the

632 coherence of action scenes. Psychonomic Bulletin and Review, 23(5), 1566–1575.

633 https://doi.org/10.3758/s13423-016-1004-y

634 Hafri, A., Papafragou, A., & Trueswell, J. C. (2013). Getting the Gist of Events: Recognition

635 of Two-Participant Actions from Brief Displays, 71(2), 233–236.

636 https://doi.org/10.1038/mp.2011.182.doi

637 Hafri, A., Trueswell, J. C., & Strickland, B. (2018). Encoding of event roles from visual

638 scenes is rapid, spontaneous, and interacts with higher-level visual processing.

639 Cognition, 175(February), 36–52. https://doi.org/10.1016/j.cognition.2018.02.011

640 Hollingworth, A. (2007). Object-position binding in visual memory for natural scenes and

641 object arrays. Journal of Experimental Psychology: Human Perception and

642 Performance, 33(1), 31–47. https://doi.org/10.1037/0096-1523.33.1.31

SOCIAL CHUNKING IN WORKING MEMORY 26

643 Kaiser, D., Stein, T., & Peelen, M. V. (2014). Object grouping based on real-world

644 regularities facilitates perception by reducing competitive interactions in visual cortex.

645 Proceedings of the National Academy of Sciences, 111(30), 11217–11222.

646 https://doi.org/10.1073/pnas.1400559111

647 Kaiser, Daniel, Stein, T., & Peelen, M. V. (2015). Real-world spatial regularities affect visual

648 working memory for objects. Psychonomic Bulletin and Review, 22(6), 1784–1790.

649 https://doi.org/10.3758/s13423-015-0833-4

650 Kibbe, M., & Feigenson, L. (2013). Infants use statistical regularities to chunk items in visual

651 working memory. Journal of Vision, 13(9), 333. https://doi.org/10.1167/13.9.333

652 Luck, S. J., & Vogel, E. K. (1997). The capacity of visual working memory for features and

653 conjunctions. Nature, 390(6657), 279–281. https://doi.org/10.1038/36846

654 Mathy, F., & Feldman, J. (2012). What’s magic about magic numbers? Chunking and data

655 compression in short-term memory. Cognition, 122(3), 346–362.

656 https://doi.org/10.1016/j.cognition.2011.11.003

657 Miller, G. A. (1956). The magical number seven, plus or minus two: Some limits on our

658 capacity for processing information. Psychological Review, 63(2), 81.

659 https://doi.org/10.1037/h0043158

660 Neri, P., Luu, J. Y., & Levi, D. M. (2006). Meaningful interactions can enhance visual

661 discrimination of human agents. Nature Neuroscience, 9(9), 1186–1192.

662 https://doi.org/10.1038/nn1759

663 O’Donnell, R. E., Clement, A., & Brockmole, J. R. (2018). Semantic and functional

664 relationships among objects increase the capacity of visual working memory. Journal of

665 Experimental Psychology: Learning Memory and Cognition, 44(7), 1151–1158.

666 https://doi.org/10.1037/xlm0000508

667 Papeo, L. (2020). Twos in human visual perception. Cortex, 132, 473-478.

668 https://doi.org/10.1016/j.cortex.2020.06.005

669 Papeo, L., Goupil, N., & Soto-Faraco, S. (2019). Visual search for people among people.

670 Psychological Science. https://doi.org/10.1177/0956797619867295

SOCIAL CHUNKING IN WORKING MEMORY 27

671 Papeo, L., Stein, T., & Soto-Faraco, S. (2017). The Two-Body Inversion Effect.

672 Psychological Science, 28(3), 369–379. https://doi.org/10.1177/0956797616685769

673 Robbins, T. W., Anderson, E. J., Barker, D. R., Bradley, A. C., Fearnyhough, C., Henson, R.,

674 … Baddeley, A. D. (1996). Working memory in chess. Memory & Cognition, 24(1), 83–

675 93. https://doi.org/10.3758/BF03197274

676 Rossi-Arnaud, C., Pieroni, L., & Baddeley, A. (2006). Symmetry and binding in visuo-spatial

677 working memory. Neuroscience, 139(1), 393–400.

678 https://doi.org/10.1016/j.neuroscience.2005.10.048

679 Scolari, M., Vogel, E. K., & Awh, E. (2008). Perceptual expertise enhances the resolution but

680 not the number of representations in working memory. Psychonomic Bulletin & Review,

681 15(1), 215–222. https://doi.org/10.3758/PBR.15.1.215

682 Stahl, A. E., & Feigenson, L. (2014). Social knowledge facilitates chunking in infancy. Child

683 Development, 85(4), 1477–1490. https://doi.org/10.1111/cdev.12217

684 Strachan, J. W. A., Sebanz, N., & Knoblich, G. (2019). The role of emotion in the dyad

685 inversion effect. PloS One, 14(7), e0219185.

686 https://doi.org/10.1371/journal.pone.0219185

687 Vestner, T., Gray, K. L. H., & Cook, R. (2020). Why are social interactions found quickly in

688 visual search tasks? Cognition, 200, 104270.

689 https://doi.org/10.1016/j.cognition.2020.104270

690 Vestner, T., Tipper, S. P., Hartley, T., Over, H., & Rueschemeyer, S. A. (2019). Bound

691 Together: Social Binding Leads to Faster Processing, Spatial Distortion, and Enhanced

692 Memory of Interacting Partners. Journal of Experimental Psychology: General.

693 https://doi.org/10.1037/xge0000545

694 Zhou, C., Han, M., Liang, Q., Hu, Y.-F., & Kuai, S.-G. (2019). A social interaction field model

695 accurately identifies static and dynamic social groupings. Nature Human Behaviour.

696 https://doi.org/10.1038/s41562-019-0618-2

697

698

SOCIAL CHUNKING IN WORKING MEMORY 28

699 Figure Caption

700 Figure 1. Example trials and accuracy results of Experiments 1-2. A) Trial organization in

701 Experiments 1-2. Participants saw two arrays (sample and probe) with either two or three facing or

702 non-facing dyads. Participants had to report whether the probe was the same or different relative to

703 the sample. Represented here is a trial with set2-same facing-array. B) Example of stimulus-arrays

704 (left) and accuracy results (right) of Experiment 1. Arrays showed facing meaningful-interaction

705 dyads or non-facing dyads. Accuracy results (proportion of correct responses) are shown as a

706 function of Spatial position of bodies in dyads (facing or non-facing), Set size (Set2 or Set3) and Trial

707 type (different or same). Error bars represent ±1 within-subjects Standard Error of the Mean (SEM)

708 (Cousineau, 2005). Asterisks indicate significant effects (*p < .05, ***p < .001). C) Example of

709 stimulus-arrays (left) and accuracy results (right) from Experiment 2. Arrays showed facing or

710 non-facing dyads. Accuracy results are shown as in Experiment 1. Asterisks indicate significant

711 differences (*p < .05, **p < .01).

712

713 Figure 2. Example trial and accuracy results of Experiment 3. A) Trial organization in

714 Experiments 3. Participants saw two target-digits to repeat aloud throughout the trail, then two arrays

715 (sample and probe) with either two or three facing or non-facing dyads and, finally a digit. They first

716 had to report whether sample- and probe-arrays were the same or different and then whether the final

717 digit was one of the target ones. B) Accuracy results of Experiment 3. Accuracy (mean proportions

718 of correct responses) is shown as a function of Spatial position of bodies in dyads (facing or non-

719 facing), Set size (set2 or set3), Trial type (different or same) and Stimulus set (meaningful, MF or

720 meaningless, ML). Error bars represent ±1 SEM within subjects. Asterisks indicate significant

721 differences (*p < .05, **p < .01).

722

723

724

725

SOCIAL CHUNKING IN WORKING MEMORY 29

726 Figure 1

727

SOCIAL CHUNKING IN WORKING MEMORY 30

728

729 Figure 2

730

731

732

733

734

735

736

737

738

739

740

741

742

743

744

745

746

747

748