bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

1 The spatial transcriptomic landscape of the healing intestine following damage 2

3 Sara M. Parigi1,2, *, Ludvig Larsson3, *, Srustidhar Das1,2, Ricardo O. Ramirez Flores4, Annika 4 Frede1,2, Kumar P. Tripathi1,2, Oscar E. Diaz1,2, Katja Selin1,2, Rodrigo A. Morales1,2, Xinxin 5 Luo1,2, Gustavo Monasterio1,2, Camilla Engblom5, Nicola Gagliani1,2, 6, Julio Saez-Rodriguez4, 6 Joakim Lundeberg3 and Eduardo J. Villablanca1,2, &, #

7

8

9 1 Division of Immunology and Allergy, Department of Medicine Solna, Karolinska Institute and 10 University Hospital, Stockholm, Sweden 11 2 Center of Molecular Medicine, Stockholm, Sweden 12 3 Science for Life Laboratory, Department of Technology, KTH Royal Institute of 13 Technology, Stockholm, Sweden 14 4 Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for 15 Computational Biomedicine, Bioquant, Heidelberg, Germany 16 5 Department of Cell and Molecular Biology, Karolinska Institute, Stockholm, Sweden 17 6 I. Department of Medicine and Department of General, Visceral and Thoracic Surgery, 18 University Medical Center Hamburg-Eppendorf, Hamburg, Germany 19 20 21 *These authors contributed equally to this work 22 & Lead contact 23 # Correspondence should be addressed to E.J.V. ([email protected]).

1

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

24 Abstract 25 The intestinal barrier is composed of a complex cell network defining highly compartmentalized 26 and specialized structures. Here, we use spatial transcriptomics (ST) to define how the 27 transcriptomic landscape is spatially organized in the steady state and healing murine colon. At 28 steady state conditions, we demonstrate a previously unappreciated molecular regionalization of 29 the colon, which dramatically changes during mucosal healing. Here, we identified spatially- 30 organized transcriptional programs defining compartmentalized mucosal healing, and regions 31 with dominant wired pathways. Furthermore, we showed that decreased p53 activation defined 32 areas with increased presence of proliferating epithelial stem cells. Finally, we used our 33 resource to map transcriptomics modules associated with human diseases demonstrating that 34 ST can be used to inform clinical practice. Overall, we provide a publicly available resource 35 defining principles of transcriptomic regionalization of the colon during mucosal healing and a 36 framework to develop and progress further hypotheses. 37 38 Keywords: Spatial transcriptomics, mucosal healing, colon

2

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

39 Introduction 40 41 The intestine is divided into the small and large bowels that together host the highest density of 42 commensal microbiota, which in turn is spatially heterogeneous across the proximal-distal axis 43 1. The geographically heterogeneous microbial exposure has contributed to the establishment of 44 a highly compartmentalized organ that has distinct functions depending on the proximal-distal 45 location 2,3. For example, vitamin A-metabolizing and consequently retinoic acid 46 production and function are higher in the proximal compared to distal small intestine 4, 47 generating a proximal-to-distal gradient. Although it is broadly accepted that the small intestine 48 is highly compartmentalized, whether a clear molecular regionalization exists in the colon is yet 49 to be determined. 50 51 The intestine relies on the constant regeneration of the intestinal epithelium to maintain 52 homeostasis. Breakdown in regenerative pathways may lead to pathogen translocation and the 53 development of chronic intestinal pathologies, such as inflammatory bowel disease (IBD) 5. 54 Therefore, the intestinal barrier must quickly adapt to promote tissue regeneration and healing 55 following injury. However, the cellular and molecular circuitry at steady state conditions and how 56 it adapts upon challenge is yet to be fully characterized. 57 The intestine offers a unique opportunity to investigate common principles of tissue repair at the 58 barrier because of its spatial organization, which is fundamental to its function. When intestinal 59 barrier injury occurs, damaged epithelial cells are shed and replaced by mobilizing intestinal 60 stem cell (ISC)-derived cells 6, a phenomenon highly dependent on signals coming from the 61 neighboring microenvironment (niche) 7. Similarly, immune cells are recruited or expanded in 62 situ to protect the host from invading pathogens and to orchestrate the healing process by 63 providing resolving signals 8. Although initially considered as a mere structural support, stromal 64 cells, which includes fibroblasts, endothelial/lymphatic cells, pericytes and glial cells, are also 65 actively involved in barrier healing through tissue remodeling, matrix deposition, 66 neoangiogenesis, muscle contraction, and production of pro-regenerative signals 9. Therefore, 67 immune, epithelial and stromal cells must quickly adapt within a defined microenvironment and 68 establish a molecular network to promote tissue repair. However, whether different segments of 69 the intestine and their microenvironments possess distinct types of tissue repair mechanisms is 70 currently unknown. 71 Our previous study unveiled the temporal transcriptomic dynamic of the colonic tissue over the 72 course of dextran sodium sulphate (DSS) colitis, identifying and pathways differentially

3

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

73 modulated during acute injury or regeneration 10. Although bulk or single cell RNA sequencing 74 studies provide unbiased transcriptome analysis, the spatial context within the tissue is typically 75 lost. In contrast, targeted technologies for spatial analysis (e.g. in situ RNA- 76 sequencing, fluorescence in situ hybridization [FISH], RNA-scope) require knowledge of specific 77 candidate genes to interrogate and thus do not allow an unsupervised investigation of pathways 78 enriched in healing areas. 79 To overcome these limitations, we exploited spatial transcriptomics (ST), an unbiased 80 technology allowing sequencing of polyadenylated transcripts from a tissue section, which can 81 be spatially mapped onto the histological brightfield image 11. ST allowed us to uncover an 82 unprecedented view of the molecular regionalization of the murine colon, which was further 83 validated on human intestinal specimens. By comparing ST of colonic tissue under steady state 84 and upon mucosal healing (i.e. from DSS-treated mice), we identified and spatially mapped 85 transcriptional signatures of tissue repair processes, immune cell activation/recruitment, pro- 86 regenerative pathways, and tissue remodeling. The spatial landscape of pathway activity during 87 mucosal healing also unveiled a negative correlation between p53 activity and proliferating 88 epithelial stem cells. Moreover, targeted mapping of genes associated with disease outcome in 89 human IBD patients and IBD risk variants identified from genome-wide association studies 90 (GWAS) allowed us to infer their involvement in specific pathological processes based on their 91 localization in areas with distinct histological properties. 92 93 Results 94 95 Spatial transcriptomics revealed distinct molecular regionalization of murine colonic 96 epithelium at steady state condition 97 To characterize the transcriptomic landscape of the colon tissue at steady state condition, we 98 processed frozen colons for ST using the Visium (10X Genomics) platform (Fig. 1A). The pre- 99 filtered dataset corresponded mostly to protein coding genes (Extended Data Fig. 1A). Upon 100 filtering out non-coding RNAs (ncRNAs) and mitochondrial protein coding genes, the resulting 101 dataset consists of 2604 individual spots, with an average of ~4125 genes and ~11801 unique 102 transcripts per spot (Extended Data Fig. 1B). First, we deconvolved the spatial transcriptomic 103 dataset using non-negative matrix factorization (NNMF) to infer activity maps 12, and we 104 restricted the analysis to only 3 factors that capture the most basic structure of the colon at 105 steady state conditions (d0) (Fig. 1B). We identified 3 basic structural transcriptomic landscapes 106 that were histologically discernible as intestinal epithelial cells (IEC) (NNMF_3), muscle

4

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

107 (NNMF_2), and a mixture between lamina propria (LP) and IEC (NNMF_1), which were 108 indistinguishable from the IEC towards the most distal colon (Fig. 1B). Analysis of the top 109 contributing genes for each factor confirmed the identity of the muscle and IEC, and the mixed 110 signature between IEC, muscle, and LP (Fig. 1C). Using immunohistochemistry (IHC) data from 111 the human protein atlas 13, we validated the specific expression of CDH17 (also known as liver- 112 intestine cadherin or LI cadherin)14 and TAGLN (transgelin, smooth muscle marker) in the IEC 113 and muscularis layer, respectively (Fig. 1D). By contrast, ADH1 (alcohol dehydrogenase 1) 114 showed a mixed expression between the LP and IEC compartments (Fig. 1D). To better 115 visualize the molecular regionalization both across the proximal-distal and serosa-luminal axis, 116 we digitally unrolled the colon (Extended Data Fig. 2A and Fig. 1E) as described in methods. In 117 line with the ST expression in Fig. 1C, the muscle, LP/IEC and proximal IEC factors were 118 enriched in the corresponding regions of the distal-proximal and serosa-luminal axis of the 119 digitally unrolled colon (dark dots in Fig. 1E). Among the genes driving factor 3 (NNMF_3) we 120 found Car1, Mettl7b, Emp1, Fabp2, and Hmgcs2 that were highly expressed in the proximal 121 colon (Fig. 1F and Extended Data Fig. 2B). Our data aligns with reports showing Car1 promoter- 122 driven expression in the proximal but not distal colon 15. In contrast, Retnlb, Sprr2a2 and Ang4 123 were enriched in the mid colon, and Prdx6, Tgm3, Ly6g, Eno3, and B4galt1 were enriched in 124 the distal colon (Fig. 1F and Extended Data Fig. 2B). Because they have not been previously 125 described as markers for the distinct colonic compartments, we used qPCR to validate the 126 region-specific mRNA expression of genes coding for the ketogenic rate-limiting , 127 mitochondrial 3-hydroxy-3-methylglutaryl-CoAsynthase 2 (HMGCS2), the antimicrobial peptide 128 4 (ANG4), and the Beta-1,4-galactosyltransferase 1 (B4GALT1)( Fig. 1G). 129 Therefore, ST analysis distinguished a stratification between the LP, IEC, and muscle 130 compartment which is clearly evident at the proximal but not distal colon. 131 Some studies have shown structural and functional differences between proximal and distal 132 colon, specifically with respect to the epithelium 15-20; however, systematic and unsupervised 133 molecular regionalization of the colon is lacking. To objectively identify genes differentially 134 expressed in specific compartments of the colon, we used the NNMF method to distinguish 135 relevant sources of variability of the data 12. Detailed factor analyses of the naive colon (denoted 136 with “n”) resulted in a more pronounced/apparent colonic compartmentalization, in which the top 137 genes defining a factor were sufficient to specifically demarcate the regionalization in the colon 138 (Extended Data Fig. 3A-B). In particular, these factors defined a proximal-distal and a serosa- 139 luminal axis (Fig. 1H). Functional enrichment analysis using the top contributing genes of factors 140 defining the proximal and distal colonic IEC suggested that the murine proximal colon is

5

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

141 specialized in water absorption, while the distal colon is specialized in solute transport 142 (Extended Data Fig. 3C), indicating functional differences between the proximal and distal IEC 143 compartments. Altogether, ST permitted the identification of a previously unappreciated level of 144 colonic molecular compartmentalization in the steady state colon, as summarized in Fig. 1I. 145 146 Visualization of lymphoid structures by factors enriched with B-cell associated genes 147 Next, we examined the capacity of our dataset to resolve macroscopic structures, such as 148 lymphoid clusters, within the tissue. We observed an enrichment of B cell-associated genes in 149 factors 1n, 3n and 9n (Fig. 2A). Upon mapping these factors onto the colonic tissue, we 150 observed that factors 1n and 3n defined structures that resembled lymphoid aggregates, known 151 as isolated lymphoid follicles (ILF) and/or cryptopatches (CP) (Fig. 2B). In contrast, factor 9n 152 defined the colonic LP and was characterized by high expression of genes such as Igha, Jchain, 153 Igkc characteristic of plasma cells (Fig. 2B). Among top-listed genes found in factor_3n, we 154 validated Clu protein expression in lymphoid follicles 21 (Fig. 2C). Similarly, expression of 155 JCHAIN, a small 15 kDa glycoprotein produced by plasma cells that regulates multimerization of 156 secretory IgA and IgM and facilitates their transport across the mucosal epithelium 22, was 157 validated by IHC in the human colonic LP (Fig. 2C). Pathways analysis confirmed that these 158 factors were associated with immune responses (Fig. 2D). Whereas pathways associated with 159 factor 3n suggested sites of lymphocyte priming (defined by processes associated with 160 lymphocyte activation), pathways associated with factor 9n suggested sites of effector immune 161 responses (defined by processes associated with adaptive immunity) (Fig. 2D). Interestingly, 162 factor_1n, which defined the cells/region overlying the ILF, was characterized by genes such as 163 Ccl20 known to recruit CCR6+ B cells 23,24 and Il22ra2/Il22bp, a soluble receptor that neutralizes 164 the effects of IL-22, a pleiotropic cytokine primarily expressed by lymphoid tissue inducer cells 165 (LTis) 25. In addition, factor_1n was associated with pathways involved in cell mobilization and 166 response to external stimulus (Fig. 2D). Because of the observation that factor 1n defines ILFs, 167 it is tempting to propose that factor 1n-defined structures may serve as an anlagen for further 168 maturation into ILF (factor_3n). 169 170 Factor analysis identified molecular signatures that define areas associated with the 171 enteric nervous system 172 Factor 6n was characterized by an enrichment of enteric nervous system (ENS)-associated 173 genes which were located in the muscle area (Fig. 2E). Among the top-listed genes, we 174 validated ubiquitin C-terminal L1 (UCHL1) (Fig. 2E), which is specifically expressed in

6

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

175 neurons 26. Functional enrichment analysis confirmed that factor_6n defined a transcriptomic 176 profile associated with the ENS (Fig. 2F). In summary, the resolution of our Visium dataset 177 permitted the identification of known structures (e.g. ENS and ILFs; Fig. 2G), thereby providing 178 a platform to further investigate specific molecular circuitry within such regions. 179 180 Molecular landscape of intestinal mucosal healing 181 Next, we sought to spatially resolve the colonic transcriptomic landscape during mucosal 182 healing. We took advantage of our recent work showing that by day (d)14, the intestinal barrier 183 integrity is restored following damage induced by dextran sodium sulfate (DSS)10. Therefore, we 184 treated wild type (WT) mice with DSS in drinking water for 7 days followed by 7 days of recovery 185 and d14 colonic tissue was taken to generate frozen Swiss-rolls to be processed for ST (Fig. 186 3A). Despite the recovery at a physiological level (i.e. body weight gain) (Extended Data Fig. 187 4A), the colonic tissue after DSS treatment did not fully return to homeostasis, as demonstrated 188 by its reduced length (a sign of inflammation) (Extended Data Fig. 4B). 189 At the histological level, large lymphoid patches, as well as the muscle and mucosal layer 190 across the intestine, were easily identified (Extended Data Fig. 4C). Hematoxylin and eosin 191 (H&E) sections annotated by a blinded pathologist revealed the heterogeneity of the tissue, 192 including the presence of isolated lymphoid follicles (ILFs), as well as areas with edema, 193 hyperplasia, crypt duplication, and normal tissue (Extended Data Fig. 4D). Of note, the distal 194 colon (center of the Swiss roll) showed marked alterations, whereas the proximal colon (outer 195 Swiss roll) seemed non-affected (Extended Data Fig. 4D). 196 The d14 ST dataset consisted of 3630 individual spots with a number of unique genes per spot 197 (nFeature_RNA) that was comparable to the d0 tissue section (Extended Data Fig. 1A). To first 198 appreciate how the process of mucosal healing spatially altered the colonic transcriptome, the 199 ST data from d0 and d14 were embedded in 3 dimensions using Uniform Manifold 200 Approximation and Projection (UMAP). The values of these 3 dimensions were then re-scaled 201 into a unit cube (with a range of 0 to 1) and used as channels in CMYK color space to generate 202 a specific color for each ST spot (Fig. 3B, bottom part). Interestingly, lymphoid follicles 203 (identified by H&E staining) and areas in the proximal colon showed high similarity (i.e. same 204 color) between d0 and d14 samples (Fig. 3C), suggesting that these structures are 205 transcriptionally less affected during the process of mucosal healing following intestinal injury. 206 Vice versa, in line with the histo-pathological scoring, the distal portion of the d14 colon was the 207 most dramatically affected region.

7

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

208 To visualize how the colonic tissue is transcriptionally organized in different areas, we integrated 209 the data from d0 and d14 using harmony 27 and performed cluster analysis. We annotated 17 210 distinct clusters, which were visualized by embedding the data in 2 dimensions with UMAP (Fig. 211 3C). Differentially up-regulated genes per cluster are summarized in a heatmap showing the top 212 conserved genes in each cluster (Fig. 3D). Each cluster defined a distinct geographic area of 213 the tissue (Extended Data Fig. 5) For instance, cluster 12 designated the ENS, with scattered 214 expression in the submucosal layer, whereas cluster 0 mapped spatially to the proximal colon 215 (Fig. 3E). Interestingly, genes defining cluster 0 expanded towards the mid colon during 216 mucosal healing (d14)(Fig. 3E). Among these genes, Muc2 and Reg3b showing expanded 217 expression towards the mid colon (Fig. 3F) play a key role in establishing the barrier integrity. 218 Using qPCR, we validated the expanded expression of Reg3b during mucosal healing (Fig. 3G). 219 Overall, cluster analysis revealed that despite the existence of a conserved transcriptional 220 colonic regionalization, the process of tissue healing underlies the emergence of distinct 221 molecular signatures and alters the distribution of specific gene expression. 222 223 Non-negative matrix factorization analysis revealed a previously unappreciated 224 transcriptomic regionalization during mucosal healing 225 Even though the majority of the tissue was defined by clusters equally represented on both time 226 points, some clusters displayed a partial or drastic enrichment during tissue healing. Cluster 3 227 (found in the distal colon at the interface between the LP and the muscularis layer), cluster 11 228 and 16 (localized in the damaged area of d14 distal colon) and cluster 13 (marking lymphoid 229 follicles) were drastically enriched during d14 (Fig. 4A and S5). To visualize how the process of 230 mucosal healing alters the transcriptomic landscape of the colon, we deconvolved the d0 and 231 d14 datasets jointly into 20 factors using NNMF (Extended Data Fig. 6 and S7). Among these, 8 232 factors were defined by genes expressed in specific regions during mucosal healing (d14), but 233 not at d0 (Fig. 4B). In the proximal colon, factor 1 was characterized by genes involved in bile 234 acid and fatty acid metabolism (e.g. Cyp2c55, an enzyme involved in the metabolism of 19- 235 hydroxyeicosatetranoic acid). By contrast, factors positioned in the distal colon were 236 characterized by genes involved in inflammatory processes (e.g. Duoxa2 and Il18) and tissue 237 remodeling (e.g. Col1a1 and Col1a2), among others (Fig. 4C). Due to their close proximity and 238 their marked enrichment during mucosal healing, we focused on factors 5, 7, 14, and 20. Factor 239 5 delineated an edematous area, which was histologically characterized by inflammation and 240 positioned right beneath a severely injured epithelial layer with complete loss of crypt 241 architecture (i.e. factor 14). Pathway analysis revealed that factor 5 was associated with

8

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

242 processes involving anatomical structure development, cell adhesion, and extracellular matrix 243 (ECM) organization (Fig. 4D). Among the top genes defining this factor, we found Igfbp5 and 244 Igfbp4, as well as collagens Col1a1 and Col1a2 (Fig. 4C). In contrast, factor 14 was 245 characterized by the expression of genes involved in stress response (e.g. Duoxa2 and 246 Aldh1a3) and leukocyte infiltration (e.g. Ly6a ansd Cxcl5), which indicate an acute response to 247 a barrier breach and tissue damage. 248 At the end of the colonic tissue, the anus separates a mucosal tissue with a monolayered 249 epithelium (the rectum) from a stratified squamous epithelium (skin). Homeostatic breakdown 250 resulting from colonic inflammation generates an area of epithelial instability wherein a 251 heterogeneous tissue at the interface between skin and colonic epithelium appears (i.e. 252 enlarged multilayered crypt-like structures with squamous, but not cornified, appearance). 253 Factor 7, characterized by the expression of several keratins (e.g. Krt13, Krt5, Krt14 and Krt6a) 254 and pathways involving keratinocytes differentiation and wound healing, delineated this area 255 (Fig. 4C-D). 256 Finally, factor 20 predominantly defined the distal epithelium undergoing hyperplasia and crypts 257 arborization, which indicates epithelial repair. revealed that this factor was 258 associated with organogenesis (e.g. Hoxb13) and response to organic substrates (e.g. 259 Cyp2c68) (Fig. 4C-D). Overall, DSS-induced injury resulted in an assorted co-occurrence of 260 different histopathological processes within the murine colon. Furthermore, our analysis 261 revealed a previously unappreciated heterogeneous transcriptional and regional landscape of 262 tissue repair (Fig. 4E). 263 264 Predictive algorithms revealed coordinated signaling pathways depending on location. 265 We interrogated if distinct signaling pathways could be inferred by the spatially organized 266 transcriptional profiles by using PROGENy 28,29. Unlike other Gene Set Enrichment tools (as 267 KEGG), PROGENy estimates signalling pathway activities by looking at expression changes of 268 downstream genes in signaling pathways, which provides a more accurate estimation of the 269 activity of the pathway. A score for each of the 14 pathways annotated in PROGENy (i.e. Wnt, 270 VEGF, Trail, TNFa, TGFb, PI3K, p53, NFkB, MAPK, JAK/STAT, Hypoxia, Estrogen, Androgen 271 and EGFR) was estimated for each ST spot on d0 and d14 slides (see methods). First, we 272 computed a correlation matrix to understand how the spatially organized transcriptional 273 programs identified by NNMF (i.e. factors of Fig. 4) could be explained by signalling pathway 274 activities. We observed two main groups, in which, group 2 pathways (Androgen, JAK-STAT, 275 NFkB, TNFa, p53, Hypoxia and Trail), characteristic of an inflammatory/acute response to

9

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

276 damage, were associated with factors comprising damaged distal epithelium (e.g. factor 7, 10, 277 14, 20) and proximal epithelium (factor 11, 19)( Fig. 5A-B). In contrast, group 1 pathways 278 (TGFb, Wnt, PI3K, Estrogen and EGFR), normally regulating pro-regenerative/tissue 279 remodeling processes, were associated with factors defining the tissue beneath the damaged 280 epithelium (e.g. factor 5 and 17), the muscle layer (factor 2, 6 and 12) and lymphoid follicles 281 (factor 9) (Fig. 5B). In addition, MAPK and VEGF pathways were associated with similar factors, 282 and as expected, the spatial patterns of MAPK and VEGF pathways activity were comparable 283 (Fig. 5C). Of note, at steady state conditions (d0) MAPK and VEGF pathways were 284 homogeneously active along the mid-distal colon, whereas during mucosal healing, their 285 activation was higher within the damaged/regenerating areas (Fig. 5C, black arrows). 286 287 Shared and complementary pathway activities during mucosal healing 288 Comparable TNFa, NFkB, and JAK-STAT pathway activation scores between some factors 289 (e.g. factor 10 and 14) (Fig. 5A) suggest interconnectivity between these inflammatory 290 pathways. To test this possibility we further analyze the spatial pattern of these pathways. In 291 particular, the activities of the TNFa and NFkB pathways were almost identical within the colon, 292 regardless of the time point analyzed (Fig. 5D). Higher TNFa and NFkB activities were 293 appreciated in areas associated with injury and ILFs (Fig. 5D). Of note, in the absence of 294 damage/inflammation (d0), the spatial distribution of TNFa and NFkB showed activity confined 295 to the ILF luminal edge (Fig. 5D, d0), in agreement with previous studies showing that TNF 296 drives ILF organogenesis 30. Whether ILF forms where subclinical local damage occurs or 297 whether their presence, which allows dynamic exchange with the external environment, causes 298 subclinical inflammation, remains to be explored. 299 In contrast, JAK-STAT pathway activation showed co-occurrence with TNF and NFkB mostly in 300 the damaged area (factor 14), but not in ILFs (Fig. 5D). These results suggest that although all 301 three pathways may play a role within the damaged tissue, TNFa and NFkB, but not JAK-STAT, 302 are involved in the formation/function of ILFs. Unlike these pathways, androgen and estrogen 303 pathways showed mutually exclusive patterns of activity. Higher androgen pathway activity was 304 observed in areas of injured epithelium, while higher estrogen activity was associated with the 305 muscle layer (arrows, Fig. 5E), suggesting that these pathways negatively regulate each other 306 during mucosal healing. 307 308 Low p53 pathway activity is associated with proliferating crypts

10

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

309 Activation of the p53 pathway was homogeneously distributed across the proximal-distal axis, 310 but it showed more activity in the luminal side compared with the LP and muscle layer (Fig. 5F). 311 Interestingly, p53 activity was lower in the damaged area (box ii in Fig. 5F). Activation of p53 312 triggers cell cycle arrest, senescence, and apoptosis 31, suggesting that spots with decreased 313 p53 activity might be enriched in proliferating cells within the damaged area. To test this 314 possibility, we overimposed lower p53 activity spots onto the H&E images and showed co- 315 localization with the bottom of crypts (Fig. 5F, H&E boxes). To investigate if proliferating stem 316 cell signatures co-localize with low p53 activity spots, we took advantage of our single cell RNA 317 sequencing (scRNAseq) dataset of intestinal epithelial cells from d14 colon and identified a 318 population of proliferating stem cells (Fig. 5G). We mapped the stem cell core signature onto the 319 d14 colon ST datasets, and we superimposed the spots with high scores in the stem cell core 320 onto the H&E section. In agreement with our hypothesis, spots containing high scores (Fig. 5G) 321 coincided with low p53 activity (Fig. 5F). Pearson correlation analysis confirmed that ST spots 322 with high stem cell scores negatively correlated with p53 activity (Fig. 5H). Thus, our data 323 suggest that low p53 activity allows the identification of proliferating crypts during mucosal 324 healing. In summary, we spatially positioned clinically relevant pathways predicted by 325 PROGENy and showed that these pathways are highly coordinated during mucosal healing. 326 327 Integration of human datasets with mouse spatial transcriptomic 328 To enquire about the translational potential of the murine colonic ST, we investigated whether 329 human datasets could be integrated into murine ST data. Towards this end, we took advantage 330 of a human developing gut dataset 32 and mapped 31 distinct epithelial and stromal cells onto 331 our ST dataset (Fig. 6A). We observed correlations between human cells types and distinct 332 murine ST factors (Fig. 6B), indicating specific localization of human cell signatures within the 333 mouse colon. Interestingly, the signature of human proximal enterocytes is highly correlated 334 with factor 1 (Fig. 6B), defining the most proximal epithelium in mice (Extended Data Fig. 8A). 335 Human distal enterocytes and absorptive cells highly correlated with factors 3 and 10 (Fig. 6B), 336 which defines the most distal epithelium in mice (Extended Data Fig. 8A). These results indicate 337 that the transcriptomic features defining proximal and distal epithelial cells are conserved 338 between mouse and humans. On the other hand, two human stromal cells characterized by the 339 expression of the chemokines CCL21 and CXCL13 uniquely and strongly correlated with factor 340 9, defining lymphoid follicles (Fig. 6B), which is in agreement with the well-known role of these 341 chemokines in ILF development 33. Interestingly, these stromal cells mapped in complementary 342 patterns within the mouse ILF (Fig. 6C), suggesting that the coordinated action of these cells

11

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

343 might determine the recruitment/localization of immune cells within the follicle. Next, we 344 analyzed the damage/regeneration area (factor 5 and 14) which correlated with S1 (Stromal 1, 345 fibroblast marking bulk of submucosal structural cells in human), S1-COL6A5 and S1-IFIT3 346 human cells (two subtypes of S1) (Fig. 6B) and mapped in a complementary pattern of 347 distribution (Extended Data Fig. 8B). 348 We then extend our analysis to other cell types during mucosal healing (Fig. 6D and Extended 349 Data Fig. 8C). Interestingly, immune cells, mesothelium, endothelium and fibroblast signatures 350 were spatially enriched within defined areas during mucosal healing. In particular, fibroblasts 351 were dominant in factor 5 (remodeling) and immune cells in factor 9 (lymphoid follicles), 352 whereas endothelial and mesothelial cells colocalized within factor 7 (keratinization) (Fig. 6D 353 and Extended Data Fig. 8D). Among 10 distinct immune cell types identified 32, monocytes and 354 SPP1+ macrophages were enriched in factor 14 (danger response) and in factor 5 (tissue 355 remodeling) respectively, in line with their known roles in acute response to injury and matrix 356 deposition/wound healing (Extended Data Fig. 8E). Lymphocytes and dendritic cells, instead, 357 were enriched in factor 9 (lymphoid follicles) (Fig. 6E and Extended Data Fig. 8E). Further 358 analysis showed how these immune cells are heterogeneously distributed within the ILF (Fig. 359 6E). These results provide a proof-of-concept and support the notion that principles of spatial 360 distribution within the colonic tissue appear to be conserved between species and highlight 361 murine ST as a valuable platform for exploring and translating findings on distribution patterns of 362 cells/genes within a tissue. 363 364 Mapping transcriptomic datasets onto ST to inform medical practice 365 In order to establish a framework to integrate existing knowledge with ST datasets, we took 366 advantage of our longitudinal RNAseq dataset of colonic tissue collected during acute epithelial 367 injury and the recovery phase in the DSS-induced colitis model 10. In this study, we identified 368 sets of genes (called modules) displaying characteristic expression patterns, with some genes 369 being: a) downregulated upon injury (modules 2, 8, 7), b) upregulated during the 370 acute/inflammatory phase (modules 1, 3, 4, 9), and c) upregulated during the recovery phase of 371 DSS colitis (modules 5, 6) 10. To identify whether the different temporally regulated processes 372 (i.e. modules) were enriched in specific areas of the tissue, we computed a correlation matrix 373 between the gene signature of modules and ST factors (Fig. 7A). Higher gene enrichment was 374 found in module (m)1 and m6, characterized by genes induced during the inflammatory and 375 recovery phase, respectively (Fig 7A). Among the genes shared between factor 9 and m1, 376 Ptprc, Cd72 and Lyz2 encode for proteins expressed by immune cells and map predominantly

12

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

377 to lymphoid follicles and damaged areas in d14 (Fig. 7Bi). In contrast, ~60% of the top driving 378 genes defining factor 15 (i.e. ENS) were shared with m6, and their expression was distributed in 379 the submucosa/muscularis layer where neuronal bodies reside (Fig. 7Ci). GO enrichment 380 analysis of m1 and m2 also confirmed that the most dominant pathways were involved with 381 inflammatory responses and chemical synaptic transmission (Fig. 7Bii and 7Cii). The correlation 382 between temporal transcriptomic modules and spatial factors suggests that factor 9 (lymphoid 383 follicles) and factor 15 (ENS) are characterized by an ongoing inflammatory and regenerative 384 profile, respectively. Thus, the integration of longitudinal and ST data can be a powerful tool to 385 unveil biological processes related to diseases in time and space. 386 387 Spatial distribution of genes defining UC1 and UC2 profiles 388 We then sought to investigate if clinically relevant patient gene signatures could be mapped 389 onto mouse ST datasets. Toward this, we used the recent gene signature identifying ulcerative 390 colitis (UC) subgroups of patients: UC1 and UC2 10. This molecular classification is clinically 391 relevant because the UC1-related transcriptomic signature is associated with poor responses to 392 biological therapies and is enriched with genes involved in neutrophil activity. In contrast, 393 approximately 70% of UC2 patients achieved a clinical response to anti-TNF antibodies 10. In 394 addition, Smillie et al., 34 showed that inflammatory fibroblasts and monocytes mainly drive anti- 395 TNF resistance, and many of the genes upregulated in UC1 patients colonic tissue are highly 396 expressed by these cell types. To further understand the spatial distribution of differentially 397 expressed genes between UC1 and UC2 patients, we mapped all genes upregulated in either 398 UC1 and UC2 patients onto colon ST. Whereas genes defining UC2 patients were 399 homogeneously expressed across the colon (d0 and d14) (Fig. 7D), genes defining UC1 were 400 mostly localized within the damage/repair area at d14 (Fig. 7D). Further investigation revealed 401 that all the main functional classes upregulated in UC1 patients, including collagen synthesis 402 (Col12a1, Col4a1, Col4a2, Col7a1), ECM breakdown (Mmp3), Wnt-signalling pathway (Wnt5a), 403 cytokine signalling (Il11, Il1b, Il33, Il1r2, Il1rn, Tnfrsf11b, Csf2rb, Csf3r, Socs3, Trem1, Cxcr2) 404 and innate immunity (S100a9, S100a8, C5ar1, Sell, S100a4), were also upregulated in areas of 405 tissue damage Fig. S9A). In summary, these results suggest that UC1 patients may possess 406 higher tissue damage and ulceration compared with UC2 patients. 407 408 Defining the topography of IBD risk genes 409 Spatial transcriptomics of the colon undergoing injury/repair provides an opportunity to 410 comprehensively map IBD-associated risk genes. Therefore, we interrogated the expression

13

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

411 pattern of various human IBD-risk genes 35-39 on the ST profile of d0 and d14 colonic tissue. Out 412 of the 122 interrogated genes, 95 IBD-risk genes were selected based on the existence of their 413 murine ortholog and their detectable expression in the ST dataset. In order to identify whether 414 the spatial expression of these variants defined topographic patterns within the tissue, we 415 computed a correlation matrix (Extended Data Fig. 9B). Cluster analysis resulted in three main 416 co-expression clusters, with cluster 3 possessing the highest spatial expression correlation 417 between genes (Extended Data Fig. 9B). Functional annotation of the genes within this cluster 418 revealed enrichment in pathways related to immune cell recruitment (e.g. Itgal, Icam1, Itga4), 419 activation (e.g. Cd6, Plcg2, Ncf4, Il10ra), and antigen presentation (e.g. Tap1, Tap2, Psmb8). 420 To understand the spatial distribution of genes belonging to cluster 3, we mapped them onto the 421 colonic tissue using ModuleScore, a Seurat function assigning a score in the ST dataset to a set 422 of predefined genes (i.e. cluster 3 genes). In line with the functional annotation of cluster 3 423 genes, we observed enrichment in lymphoid follicles areas both on d0 and d14 (Fig. 7E). To 424 understand which ST factors were enriched with IBD risk genes, we performed Gene Set 425 Enrichment Analysis (GSEA) and calculated overrepresentation scores of IBD-risk variants in 426 the NNMF dataset (Extended Data Fig. 6-7). This analysis showed that factor 9, defining 427 lymphoid follicles within the tissue (Extended Data Fig. 6), was the only factor with significant 428 enrichment of IBD risk genes (Fig. 7F). Altogether, this analysis revealed that the expression of 429 a subset of human IBD-risk genes spatially co-occur within the murine colon. Their specific 430 expression pattern suggests that colonic tissue lymphoid follicles might define the area to 431 potentially target when developing therapeutic strategies for IBD patients displaying aberrant 432 immune activation. 433 434 Discussion 435 We and others have deeply characterized the transcriptomic landscape during mucosal healing 436 in the colon and small bowel 10,40,41. However, these studies lacked the spatial resolution 437 describing where genes were expressed. Here, we spatially placed cell populations and 438 pathways that might play pivotal roles in driving tissue response to damage. The current study 439 uncovered spatial transcriptomic patterns that are present at steady state conditions and that 440 arise in response to damage; these spatial transcriptomic patterns were characterized by unique 441 transcriptional signatures and coincided with different histological processes. Moreover, we 442 profiled the regional distribution of different biological processes, such as acute response to 443 injury or a regenerative response. Finally, we demonstrated the clinical relevance of this dataset

14

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

444 as seen by conserved spatial localization of gene signatures in human tissue and 445 transcriptomics data and by testing the distribution of clinically-relevant genes. 446 447 A recent study characterized the transcriptomic landscape during human gut development 32. 448 Here, we further these results by taking advantage of murine colonic Swiss rolls that fit the 6.5 449 mm2 area constraints provided by the manufacturer to perform spatial transcriptomics. This 450 approach enabled us to visualize the transcriptomic landscape of the whole colon in the same 451 slide, including the most proximal and distal segments. In combination with bioinformatics tools 452 (NNFM analysis), we uncovered a previously unappreciated molecular regionalization of the 453 colonic tissue in steady state conditions. This analysis allowed the identification of distinct 454 epithelial, LP, and muscularis/submucosa genetic programs depending on their proximal to 455 distal colon localization. Importantly, when mapping human cells into our ST datasets, we 456 observed conservation in transcriptomic features defining proximal and distal locations, 457 suggesting that our newly described molecular segmentation is conserved across mammals. 458 459 Using the entire murine colon, we provided a detailed analysis of a previously unappreciated 460 compartmentalization of the tissue repair process. In line with previous reports 42, our unbiased 461 analysis of the transcriptomic landscape during mucosal healing reveals that while dramatic 462 transcriptomic changes occur in the distal colon, the proximal colon remains almost comparable 463 to the steady state. Two potential scenarios can be proposed: a) the level of damage is 464 homogenous and the proximal colon heals faster compared with the distal colon; or b) the 465 proximal colon is more protected compared with the distal colon. Dramatic changes in the distal 466 rather than the proximal colon are in agreement with the phenotype observed in UC patients, 467 where the focus of inflammation extends proximally from the rectum 43. In addition, genes and/or 468 pathways, such as the JAK-STAT and TNFa pathway 44 or genes/pathways characterizing UC1 469 patients 10, were found to be dominant in the distal colon, suggesting that DSS-induced colitis is 470 a clinically relevant experimental model of UC1. Whether higher levels of damage/tissue repair 471 in the distal colon depend on microbiota, host-induced responses to the microenvironment, or 472 just different kinetics, remains to be addressed. 473 474 At steady state conditions, we identified molecular signatures associated with lymphoid 475 structures. Unlike Peyer’s patches (PP) that are macroscopically visible in the murine small 476 intestine, CP and ILF cannot be dissected and analyzed separately from the rest of the colonic 477 tissue for transcriptomic readouts. For instance, enrichment in the NFkB and TNFa pathway

15

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

478 activity was detected in the lumen-facing area corresponding to the epithelial layer overlaying 479 lymphoid clusters in steady state. Because these pathways are usually associated with immune 480 activation/inflammatory responses, these data suggest that ILF-associated epithelium is 481 undergoing inflammation. Moreover, expression of clusterin (Clu) alone was found to be highly 482 specific and sufficient in defining the isolated lymphoid follicles (ILF). Previous studies have 483 reported Clu expression in follicular dendritic cells (FDC), 45 as well as M cells 21 in the Peyer’s 484 patches. While Clu expression in FDCs serves as a pro-survival factor for germinal center B 485 cells in the follicle, the role of Clu in M cells is not clear. Interestingly, a recent study 40 identified 486 Clu as a marker of intestinal stem cells (ISC), known as revival stem cells, which are rarely 487 found in steady state, but are predominantly found in regenerating intestine following injury. 488 Whether the expression of Clu in the ILF- and follicle-associated epithelium has any bearing 489 during colonic infection and regeneration needs to be further investigated. 490 491 Besides the proximal-distal variance in the transcriptomic alterations during tissue repair, our 492 NNMF analysis revealed a high degree of compartmentalization within the distal colon itself. At 493 least 5 factors were delineating topographically and transcriptionally distinct areas in the distal 494 colon undergoing different biological responses to tissue injury. Such heterogeneity is likely the 495 result of different healing programs, such as skin-like re-epithelialization (factor 7), acute 496 damage (factor 14) and tissue regeneration (factor 20). Supporting this notion, the integration of 497 our RNAseq kinetic dataset 10 into the ST map revealed that certain areas of the distal colon 498 were transcriptionally closer to samples from the acute phase of DSS colitis (i.e. d6 to d8). The 499 asynchronous nature of the healing process may be associated with varying degrees of 500 exposure to the external environment and the elaborate architecture of the tissue. Similarly, in 501 human IBD, the “patchiness of the inflammatory response” is a well-known characteristic of 502 Crohn’s disease. In UC, the inflammation is traditionally thought to be continuous with 503 increasing intensity in distal colon, but longitudinal sampling has revealed episodes of both 504 macroscopic and microscopic patchiness of inflammation 46. Our dataset thus provides a 505 valuable resource to interrogate the transcriptional programs underlying distinct temporal and 506 biological processes of tissue healing. 507 PROGENy allows the prediction of pathways activated in specific regions of the colon. Our 508 analysis revealed a strong correlation between pathway activities, such as TNFa and NFkB, 509 suggesting that one pathway might depend completely on the other. We also identified that the 510 area of injury is characterized by several pathways that are also increased on the ILF edge at 511 steady state conditions, suggesting that the formation of lymphoid follicles result in the induction

16

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

512 of damage-associated pathways. Finally, decreased p53 activity represents a good strategy to 513 identify damage-associated proliferating crypts. These data suggest that within the same region, 514 intestinal crypts are heterogeneous in their response to damage; our dataset provides a toolkit 515 to investigate this composite response. 516 517 ST to inform clinical practice 518 Our study provides evidence that ST can be used to map clinically relevant genes and 519 pathways. Genes characterizing a newly described UC subgroup were associated with poor 520 treatment response in the damaged area of the regenerating colon. Previous studies confirm 521 these results; increased inflammation severity predicted poor response to anti-TNF treatments 522 47 or blocking cell recruitment to the inflamed intestine using anti-a4b7 antibodies 48. As a result 523 of severe inflammation, colonic ulceration may lead to lower therapeutic responses due to 524 decreased blood drug concentration/drug leakage 49. Our data suggest that UC1 patients have 525 increased tissue damage compared with UC2 patients, which might contribute to their poor 526 response to biological therapies. 527 528 Acknowledgments 529 We thank members of the Villablanca lab for helpful comments and Maria Lim for editorial 530 assistance. We thank Kim Thrane for her invaluable expertise in setting up the Visium 531 experiments and for help with tissue imaging. C.E. was supported by the Marie Sklodowska- 532 Curie grant agreement No 844712. L.L. was funded by grants from the Helmsley foundation. 533 E.J.V. was supported by grants from the Swedish Research Council, VR grant K2015-68X- 534 22765-01-6 and 2018-02533, Formas grant nr. FR-2016/0005, Cancerfonden (19 0395 Pj), and 535 the Wallenberg Academy Fellow program (2019.0315). The computations and data handling 536 were enabled by resources provided by the Swedish National Infrastructure for Computing 537 (SNIC) at KTH partially funded by the Swedish Research Council through grant agreement no. 538 2018-05973. Some schematics were partially created with BioRender.com. 539 540 Author Contributions 541 SMP, SD, AF, OED, RM, XL, GM, and CE performed experiments. LL, ROR, KPT, and KS 542 performed bioinformatics analysis of the transcriptomic data. EJV and SD conceived the idea. 543 NG, JSR, and JL provided resources. SMP, LL, and EJV wrote the paper. All authors discussed 544 the data, read, and approved the manuscript. 545

17

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

546 Declaration of Interests 547 E.J.V. has received research grants from F. Hoffmann-La Roche. C.E., L.L. and J.L. are 548 scientific consultants for 10X Genomics Inc. 549 550 Main Figure legends 551 552 Fig. 1, Spatial transcriptomics reveals molecular regionalization of the murine colonic 553 tissue in steady state. 554 (A) Schematics of the experimental design: the colonic tissue from a naive wild type mouse 555 was processed as a Swiss roll for spatial transcriptomic (ST) with VISIUM 10X 556 technology. 557 (B) Colon Swiss rolls shown in hematoxylin and eosin staining (left) and with each ST spot 558 color coded based on non-negative matrix factorization (NNMF) (right). ST spots 559 belonging uniquely to one factor are colored in red, blue and green for NNMF1, 2 and 3 560 respectively. ST spots shared between different factors are colored with respective 561 intermediate gradation of these 3 colors. 562 (C) Top: spatial distribution of the 3 factors distinguishing muscle, lamina propria (LP) and 563 intestinal epithelial cells (IEC). Bottom: heatmap showing the top 20 genes defining each 564 factor. 565 (D) Immunohistochemical staining of CDH17, TAGLN, ADH1 in healthy human colonic 566 tissue (from Human Protein Atlas). 567 (E) Digitally unrolled colonic tissue, showing the distribution of the 3 factors from Fig. 1C 568 along the serosa-luminal and proximal-distal axis. 569 (F) Proximal to distal distribution of Hmgcs2, Ang4 and B4galt1 expression in colonic swiss 570 rolls (left) and digitally unrolled colon (right). 571 (G) qPCR validation of regional expression of Hmgcs2, Ang4 and B4galt1 in proximal, mid 572 and distal colonic biopsies from wild type mice (n=3, each dot represents one mouse). 573 (H) Spatial distribution of 9 out of 20 factors in the naive colon displaying transcriptional 574 regionalization along the serosa-luminal and proximal-distal axis. Each ST spot is 575 assigned a color-coded score based on the expression of the genes defining each 576 factor. 577 (I) Schematic representation of the colon (top) and top genes annotated in Factors 578 describing molecular regionalization of the naive colon (bottom). Factors are grouped

18

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

579 based on their proximal-distal distribution and color-coded (i.e. grey-pink-purple) based 580 on their serosa-luminal distribution. 581 582 Fig. 2, Identification and regional distribution of lymphoid follicle, B cell-associated, and 583 enteric nervous system signatures in the naive murine colon. 584 (A) Heatmap of the top genes defining factors 1n, 3n and 9n enriched in B cell signature. 585 (B) Spatial distribution of B cell-associated factors in the naive colon. 586 (C) Immunohistochemical staining of CLU (enriched in lymphoid follicles) and JCHAIN 587 (localized in the lamina propria) in healthy human colonic tissue (from Human Protein 588 Atlas). 589 (D) Functional enrichment analysis (GO) of factors 1n, 3n and 9n. 590 (E) Top: spatial distribution (left) and top genes (right) defining factor 6n (enteric nervous 591 system). Bottom: immunohistochemical staining of UCHL1 (neuronal marker) in the 592 colonic submucosa of healthy human colonic tissue (from Human Protein Atlas). 593 (F) Pathway analysis (GO) of factor 6n. 594 (G) Schematic representation of spatial distribution of B cell factors (i.e. 1n, 3n and 9n from 595 Panels A-D) and ENS-factor 6n (from Panel E-F). 596 597 598 Fig. 3, Changes of the molecular topography during mucosal healing are dominant at the 599 distal colon 600 (A) Schematic representation of the experiment: colitis was induced by dextran sodium 601 sulfate (DSS) administration in drinking water for 7 days followed by 7 days of regular 602 water to promote tissue repair. Colonic tissue from a wild-type naive mouse (d0, from 603 Fig. 1) and from a mouse undergoing colonic regeneration (d14) were processed as 604 Swiss roll for spatial transcriptomic using Visium 10X technology. 605 (B) Top: Hematoxylin and eosin staining of colonic tissue from d0 and d14. Bottom: spatial 606 representation of UMAP values in CMYK colors on colon d0 and d14. Spots with the 607 same color in the two time points represent transcriptionally similar regions. 608 (C) Uniform Manifold Approximation and Projection (UMAP) representation of 16 color- 609 coded clusters defining regional transcriptome diversity in the colonic d0 and d14 610 datasets combined.

19

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

611 (D) Heatmap showing expression of top genes defining each cluster (color-coding on top) in 612 the ST datasets from the two timepoints (light blue columns: colon d0; pink columns: 613 colon d14). 614 (E) Schematic representation of cluster 0 and cluster 12 distribution in colon d0 (on the left) 615 and d14 (on the right). 616 (F) Expression of selected genes in cluster 0 onto ST. 617 (G) qPCR validation of regional expression of Reg3b in proximal, mid and distal colonic 618 biopsies from wild type mice at steady state conditions (d0) and during mucosal healing 619 (d14)(n=3, each dot represents one mouse). 620 621 Fig. 4, Non-negative matrix factorization reveals eight distinct molecular patterns during 622 colon mucosal healing 623 (A) UMAP representation of 16 clusters in d0 and d14 colon. 624 (B) Hematoxylin and eosin images displaying overlaid spots with the highest factor weight. 625 (C) Top 10 genes defining the indicated NNMFs (factors). 626 (D) Functional enrichment analysis (Gene Ontology, GO) based on the top genes defining 627 factor 5, 7, 14 and 20. 628 (E) Schematic representation summarizing the expression pattern between selected factors. 629 Biological processes associated with each factor are indicated in brackets. 630 631 Fig. 5, Predictive algorithm reveals pathway-specific spatial patterns during mucosal 632 healing. 633 (A) Correlation matrix between non-negative matrix factorization (NNMF) and pathway 634 activity scores determined by PROGENy 635 (B) Schematic of the colon area at d14 displaying the distribution of the indicated factors. 636 (C) Spatial transcriptomic (ST) spot heatmaps of the colon at d0 (upper swiss rolls) and d14 637 (lower swiss rolls) showing pathways scores predicted by PROGENy. Arrows indicate 638 areas of tissue damaged as defined by factor 14 shown in Extended Data Fig. 7 and Fig. 639 4A-C. 640 (D) ST spot heatmaps showing TNFa, NFkB, JAK-STAT pathway activity on d0 (upper 641 Swiss rolls) and d14 (lower Swiss rolls). Selected areas indicated as ILF (isolated 642 lymphoid follicles) or “i” and “ii” and outlined in black are magnified below each Swiss 643 roll. Arrows in “ii” indicate the presence of an ILF as defined by factor 9 Extended Data 644 Fig. 6.

20

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

645 (E) Spatial distribution of androgen and estrogen pathway activity at d14. Selected areas 646 (indicated as “i” and “ii”) are magnified. Arrows indicate an example of the muscle layer 647 showing opposite expression patterns between the two pathways. 648 (F) Spatial distribution of p53 pathway activity at d0 and d14. Selected areas indicated as “i” 649 and “ii” on colon d14 are magnified on the right. Hematoxylin and eosin magnifications 650 show the overlaid spots with the lowest p53 activity shown in “i” and “ii”. 651 (G) Left: UMAP visualization of IEC clusters from scRNAseq on colon d14 (Frede et al, 652 unpublished; GSE: 163638 ). Middle: ST spots from colon d14 are color-coded based on 653 the enrichment of stem cell core signature identified from scRNAseq dataset. Selected 654 areas indicated as “i” and “ii” on colon d14 are magnified on the right. Right: Hematoxylin 655 and eosin magnifications showing the overlaid spots with the highest stem cell signature 656 shown in “i” and “ii”. 657 (H) Pearson correlation between PROGENy predicted pathways and stem cell signature on 658 the ST dataset. 659 660 Fig. 6, Human cell type mapping onto murine spatial transcriptomic datasets 661 (A) Scheme showing the integration of published human single cell RNAseq 32 and our 662 mouse Visium datasets. 663 (B) Correlation matrix between transcriptomic profiles from human single cell datasets 32 and 664 factors defining transcriptomics patterns in mouse ST. 665 (C) Integration of human stromal cell transcriptomic profiles (S4.CCL21+ and S4.CXCL13+) 666 onto visium datasets at day 0. 667 (D) Integration of human intestinal cell transcriptomic profiles onto visium datasets at day 14. 668 (E) Integration of human immune cell transcriptomic profiles onto visium datasets at day 14 669 and magnification of the isolated lymphoid follicle area. 670 671 Fig. 7, Spatial transcriptomic (ST) allows mapping transcriptomic signatures with clinical 672 relevance 673 (A) Correlation matrix between transcriptomic modules distinguishing the processes of 674 inflammation and mucosal healing during DSS-induced colitis 10 and factors defining 675 transcriptomics patterns in ST. 676 (B) (i) Venn diagram and spatial representation of overlapping genes between module 1 and 677 factor 9. (ii) relative mean expression and Gene Ontology (GO) of genes belonging to 678 module 1.

21

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

679 (C) (i) Venn diagram and spatial representation of overlapping genes between module 6 and 680 factor 15. (ii) relative mean expression and Gene Ontology (GO) of genes belonging to 681 module 6. 682 (D) Spatial distribution of genes defining UC1 and UC2 patients on mouse ST colon d0 and 683 d14. 684 (E) Spatial distribution of IBD risk genes from Cluster 3 (Extended Data Fig. 9B) on colon d0 685 (left) and d14 (right). 686 (F) Gene Set Enrichment Analysis for IBD risk genes in Cluster 3 (Extended Data Fig. 9B) 687 and NNMF factors (Extended Data Fig. 6-7). 688 689 690

22

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

691 References 692 1. Li, N., et al. Spatial heterogeneity of bacterial colonization across different gut segments 693 following inter-species microbiota transplantation. Microbiome 8, 161 (2020). 694 2. Mowat, A.M. & Agace, W.W. Regional specialization within the intestinal immune 695 system. Nat Rev Immunol 14, 667-685 (2014). 696 3. Fenton, T.M., et al. Immune Profiling of Human Gut-Associated Lymphoid Tissue 697 Identifies a Role for Isolated Lymphoid Follicles in Priming of Region-Specific Immunity. 698 Immunity 52, 557-570 e556 (2020). 699 4. Villablanca, E.J., et al. MyD88 and retinoic acid signaling pathways interact to modulate 700 gastrointestinal activities of dendritic cells. Gastroenterology 141, 176-185 (2011). 701 5. Guan, Q. A Comprehensive Review and Update on the Pathogenesis of Inflammatory 702 Bowel Disease. J Immunol Res 2019, 7247238 (2019). 703 6. Blanpain, C., Horsley, V. & Fuchs, E. Epithelial stem cells: turning over new leaves. Cell 704 128, 445-458 (2007). 705 7. Santos, A.J.M., Lo, Y.H., Mah, A.T. & Kuo, C.J. The Intestinal Stem Cell Niche: 706 Homeostasis and Adaptations. Trends Cell Biol 28, 1062-1078 (2018). 707 8. Brazil, J.C., Quiros, M., Nusrat, A. & Parkos, C.A. Innate immune cell-epithelial crosstalk 708 during wound repair. J Clin Invest 129, 2983-2993 (2019). 709 9. Owens, B.M. & Simmons, A. Intestinal stromal cells in mucosal immunity and 710 homeostasis. Mucosal Immunol 6, 224-234 (2013). 711 10. Czarnewski, P., et al. Conserved transcriptomic profile between mouse and human 712 colitis allows unsupervised patient stratification. Nat Commun 10, 2892 (2019). 713 11. Stahl, P.L., et al. Visualization and analysis of gene expression in tissue sections by 714 spatial transcriptomics. Science 353, 78-82 (2016). 715 12. Lin, X. & Boutros, P.C. Optimization and expansion of non-negative matrix factorization. 716 BMC Bioinformatics 21, 7 (2020). 717 13. Uhlen, M., et al. Proteomics. Tissue-based map of the human proteome. Science 347, 718 1260419 (2015). 719 14. Baumgartner, W. Possible roles of LI-Cadherin in the formation and maintenance of the 720 intestinal epithelial barrier. Tissue Barriers 1, e23815 (2013). 721 15. Tetteh, P.W., et al. Generation of an inducible colon-specific Cre enzyme mouse line for 722 colon cancer research. Proc Natl Acad Sci U S A 113, 11859-11864 (2016).

23

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

723 16. Triff, K., et al. Genome-wide analysis of the rat colon reveals proximal-distal differences 724 in histone modifications and proto-oncogene expression. Physiol Genomics 45, 1229- 725 1243 (2013). 726 17. Tan, C.W., Hirokawa, Y., Gardiner, B.S., Smith, D.W. & Burgess, A.W. Colon 727 cryptogenesis: asymmetric budding. PLoS One 8, e78519 (2013). 728 18. Brennan, F.E. & Fuller, P.J. Acute regulation by corticosteroids of channel-inducing 729 factor gene messenger ribonucleic acid in the distal colon. Endocrinology 140, 1213- 730 1218 (1999). 731 19. Renes, I.B., et al. Alterations in Muc2 biosynthesis and secretion during dextran sulfate 732 sodium-induced colitis. Am J Physiol Gastrointest Liver Physiol 282, G382-389 (2002). 733 20. Fleming, R.E., et al. Carbonic anhydrase IV expression in rat and human gastrointestinal 734 tract regional, cellular, and subcellular localization. J Clin Invest 96, 2907-2913 (1995). 735 21. Verbrugghe, P., Kujala, P., Waelput, W., Peters, P.J. & Cuvelier, C.A. Clusterin in 736 human gut-associated lymphoid tissue, tonsils, and adenoids: localization to M cells and 737 follicular dendritic cells. Histochem Cell Biol 129, 311-320 (2008). 738 22. Castro, C.D. & Flajnik, M.F. Putting J chain back on the map: how might its expression 739 define plasma cell development? J Immunol 193, 3248-3255 (2014). 740 23. He, H., et al. CCR6(+) B lymphocytes responding to tumor cell-derived CCL20 support 741 hepatocellular carcinoma progression via enhancing angiogenesis. Am J Cancer Res 7, 742 1151-1163 (2017). 743 24. Suan, D., et al. CCR6 Defines Memory B Cell Precursors in Mouse and Human 744 Germinal Centers, Revealing Light-Zone Location and Predominant Low Antigen Affinity. 745 Immunity 47, 1142-1153 e1144 (2017). 746 25. Savage, A.K., Liang, H.E. & Locksley, R.M. The Development of Steady-State Activation 747 Hubs between Adult LTi ILC3s and Primed Macrophages in Small Intestine. J Immunol 748 199, 1912-1922 (2017). 749 26. Reinicke, A.T., et al. Ubiquitin C-terminal hydrolase L1 (UCH-L1) loss causes 750 neurodegeneration by altering protein turnover in the first postnatal weeks. Proc Natl 751 Acad Sci U S A 116, 7963-7972 (2019). 752 27. Korsunsky, I., et al. Fast, sensitive and accurate integration of single-cell data with 753 Harmony. Nat Methods 16, 1289-1296 (2019). 754 28. Schubert, M., et al. Perturbation-response genes reveal signaling footprints in cancer 755 gene expression. Nat Commun 9, 20 (2018).

24

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

756 29. Holland, C.H., Szalai, B. & Saez-Rodriguez, J. Transfer of regulatory knowledge from 757 human to mouse for functional genomics analysis. Biochim Biophys Acta Gene Regul 758 Mech 1863, 194431 (2020). 759 30. Furtado, G.C., et al. TNFalpha-dependent development of lymphoid tissue in the 760 absence of RORgammat(+) lymphoid tissue inducer cells. Mucosal Immunol 7, 602-614 761 (2014). 762 31. Hafner, A., Bulyk, M.L., Jambhekar, A. & Lahav, G. The multiple mechanisms that 763 regulate p53 activity and cell fate. Nat Rev Mol Cell Biol 20, 199-210 (2019). 764 32. Fawkner-Corbett, D., et al. Spatiotemporal analysis of human intestinal development at 765 single-cell resolution. Cell 184, 810-826 e823 (2021). 766 33. Mebius, R.E. Organogenesis of lymphoid tissues. Nat Rev Immunol 3, 292-303 (2003). 767 34. Smillie, C.S., et al. Intra- and Inter-cellular Rewiring of the Human Colon during 768 Ulcerative Colitis. Cell 178, 714-730 e722 (2019). 769 35. de Lange, K.M., et al. Genome-wide association study implicates immune activation of 770 multiple integrin genes in inflammatory bowel disease. Nat Genet 49, 256-261 (2017). 771 36. Uniken Venema, W.T., Voskuil, M.D., Dijkstra, G., Weersma, R.K. & Festen, E.A. The 772 genetic background of inflammatory bowel disease: from correlation to causality. J 773 Pathol 241, 146-158 (2017). 774 37. Verstockt, B., Smith, K.G. & Lee, J.C. Genome-wide association studies in Crohn's 775 disease: Past, present and future. Clin Transl Immunology 7, e1001 (2018). 776 38. Liu, J.Z. & Anderson, C.A. Genetic studies of Crohn's disease: past, present and future. 777 Best Pract Res Clin Gastroenterol 28, 373-386 (2014). 778 39. Jostins, L., et al. Host-microbe interactions have shaped the genetic architecture of 779 inflammatory bowel disease. Nature 491, 119-124 (2012). 780 40. Ayyaz, A., et al. Single-cell transcriptomes of the regenerating intestine reveal a revival 781 stem cell. Nature 569, 121-125 (2019). 782 41. Lukonin, I., et al. Phenotypic landscape of intestinal organoid regeneration. Nature 586, 783 275-280 (2020). 784 42. Yan, Y., et al. Temporal and spatial analysis of clinical and molecular parameters in 785 dextran sodium sulfate induced colitis. PLoS One 4, e6073 (2009). 786 43. Tontini, G.E., Vecchi, M., Pastorelli, L., Neurath, M.F. & Neumann, H. Differential 787 diagnosis in inflammatory bowel disease colitis: state of the art and future perspectives. 788 World J Gastroenterol 21, 21-46 (2015).

25

bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

789 44. Troncone, E., Marafini, I., Del Vecchio Blanco, G., Di Grazia, A. & Monteleone, G. Novel 790 Therapeutic Options for People with Ulcerative Colitis: An Update on Recent 791 Developments with Janus Kinase (JAK) Inhibitors. Clin Exp Gastroenterol 13, 131-139 792 (2020). 793 45. Huber, C., et al. Lymphotoxin-beta receptor-dependent genes in lymph node and 794 follicular dendritic cell transcriptomes. J Immunol 174, 5526-5536 (2005). 795 46. Kim, B., Barnett, J.L., Kleer, C.G. & Appelman, H.D. Endoscopic and histological 796 patchiness in treated ulcerative colitis. Am J Gastroenterol 94, 3258-3262 (1999). 797 47. Arias, M.T., et al. A panel to predict long-term outcome of infliximab therapy for patients 798 with ulcerative colitis. Clin Gastroenterol Hepatol 13, 531-538 (2015). 799 48. Barre, A., Colombel, J.F. & Ungaro, R. Review article: predictors of response to 800 vedolizumab and ustekinumab in inflammatory bowel disease. Aliment Pharmacol Ther 801 47, 896-905 (2018). 802 49. Brandse, J.F., et al. Loss of Infliximab Into Feces Is Associated With Lack of Response 803 to Therapy in Patients With Severe Ulcerative Colitis. Gastroenterology 149, 350-355 804 e352 (2015). 805

26

Figure 1 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

a b c factors d NNMF_1 NNMF_2 NNMF_3

500um 500um 500um H&E (d0) Base factors (d0)

500um CDH17

2 3 0.8

0 0 0

IEC TAGLN

LP/IEC

NNMF_1 NNMF_2 NNMF_3 muscle ADH1 e f digitally unrolled Swiss roll digitally unrolled proximal mid distal Hmgcs2 proximal mid distal NNMF3_1 g Validation 2.0 Hmgcs2 3 Hmgcs2 Ang4 B4galt1 1.0 2 proximal 1 8 * 15 * * 0.0 0 3 Ang4 6 10 NNMF3_2 2 1.0 Ang4 4 5 ** 3 1 0.5 2 2 mid

1 log2(FC) 0 0 0.0 0 0 NNMF3_3 -5 -1 B4galt1 -2 B4galt1 mid 2.0 mid 2 mid distal distal 1.0 1 distal proximal proximal distal proximal serosa-luminal axis 0.0 serosa-luminal axis 0 position along the proxima-distal axis position along the proxima-distal axis h factor_18n factor_17n factor_15n factor_4n factor_19n i

Enriched genes per factor Luminal

proximal-distal proximal mid distal 20n 18n 8n 15n 4n 19n 7n factor_8n factor_14n factor_20n factor_7n Gm42418 S100g Muc2 Ang4 Saa1 Sync Tagln Rgs5 Slc51a Agr2 Guca2a Prdx6 Fxyd4 Ckb Malat1 Cyp2c55 Nupr1 Fcgbp Car4 Prdx6 Acta2 Pcp4 Car1 Tmsb10 Aqp4 Aqp8 Krt8 Myh11 Mgp Slc10a2 Adamec1 Clca1 Slc20a1 Smim24 Pcp4 Lars2 Hmgcs2 CD24a Ggh Tgm3 Krt19 Tpm2

serosa Myl9 Tmigd1 Ccl6 Retnlb Hist1h1c Il1rn Myl9 Myh11 Cyp4b1 Txndc5 Mptx1 Atp12a Tns4 Nr4a1 lamina-propria Mylk Akr1c19 Hmgcs1 Spink1 Spink1 Ldha Csrp1 proximal-distal proximal-distal Ckb Mep1a Plet1 Slc9a3 Slc37a2 Lgals3 Cnn1 Figure 2 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

a b c B cell-associated factor 1n 3n 9n factor_1n factor_3n factor_9n Ccl20 Ubd Lyz2 0 1 2 0 5 10 0 1 2 3 Fabp5 Il22ra2 Ighm Clu Cd52 CLU Coro1a Cd79b H2-DMb2 Igha Jchain 6 Igkc 4 Apoe Ly6c2 2 Hao2 0 JCHAIN

d factor_1n factor_3n

immune system process GeneRatio immune system process response to external stimulus leukocyte activation GeneRatio defense response 0.3 lymphocyte activation 0.3 regulation of immune system process 0.4 cell activation 0.4 immune response 0.5 positive regulation of immune system process 0.5 cell migration 0.6 immune response response to stress 0.6 0.7 adaptive immune response localization of cell regulation of immune system process cell motility B cell activation −log10(p_value) response to biotic stimulus −log10(p_value) leukocyte proliferation 16 biological process involved in interspec... lymphocyte proliferation 14 response to other organism 13 mononuclear cell proliferation response to external biotic stimulus regulation of immune response 12 11 positive regulation of immune system pro... regulation of lymphocyte proliferation 10 locomotion 9 regulation of mononuclear cell proliferation anatomical structure development positive regulation of immune response 8 response to stimulus regulation of leukocyte proliferation cell adhesion regulation of leukocyte activation regulation of cell motility regulation of lymphocyte activation biological adhesion regulation of cell activation 9 11 13 15 7.5 10.0 12.5 15.0

factor_9n

regulation of immune system process regulation of immune response GeneRatio adaptive immune response 0.1 lymphocyte mediated immunity innate immune response 0.2 immunoglobulin mediated immune response 0.3 defense response to other organism humoral immune response mediated by circ... immune response −log10(p_value) defense response 6 B cell mediated immunity antigen processing and presentation of p... 5 leukocyte mediated immunity 4 humoral immune response adaptive immune response based on somati... 3 immune system process synapse pruning cellular response to nitrogen compound response to other organism response to external biotic stimulus 3 4 5 6

e f factor_6n factor_6n trans−synaptic signaling neuron projection development GeneRatio Uchl1 neuron development 0.2 Prph synaptic signaling nervous system development 0.3 Sncg vesicle−mediated transport in synapse Snap25 neurogenesis Stmn2 generation of neurons −log10(p_value) chemical synaptic transmission 5.5 Scg2 anterograde trans−synaptic signaling Eef1a2 modulation of chemical synaptic transmis... 5.0 regulation of trans−synaptic signaling Pcsk1n neuron differentiation 4.5 0.00 0.05 0.10 cellular component morphogenesis 4.0 gene weight neurotransmitter transport behavior 3.5 plasma membrane bounded cell projection ... regulation of neurotransmitter levels cell projection organization UCHL1 positive regulation of synaptic transmis... g 4 5 Lymphoid structures ENS (Panel A-D) (Panel E-F) Figure 3 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

a b d clusters d0 d14 condition Agr2 Muc2 d0 Cd24a Atp2a3 d14 Hmgcs2 Nupr1 Cldn15 Sult1a1 B3gnt6 Plet1 Car1 Cyp2c55 2200002D01Rik 1810065E05Rik Slc51a Mgat4c Emp1 Phgr1 Ugdh Tagln Acta2 4 Tpm2 Eef1a1 Csrp1 Cnn1 Sycn 2 Flna Cald1 Ckb Igfbp5 mt−Nd1 Col1a2 0 Gsn c C1qa Cd34 Tnxb B3gnt7 −2 Adh1 clusters Sval1 Klk1 0 9 Cela1 e d0 d14 Wfdc2 −4 1 10 Hoxb13 Cluster 12 Rps3a1 2 Ckmt1 11 Best2 3 12 Spink4 Oit1 4 13 Gpx2 Rps26 5 14 Clca3b Rps15a 6 15 Kcne3 St6galnac6 7 16 Tpm1 Mylk UMAP 2 (harmony) 8 Sparcl1 Myh11 Rgs5 Myl6 UMAP 1 (harmony) Myl9 Des Cluster 0 Malat1 Aqp4 Zg16 Fcgbp Igha Slc6a14 f Hao2 Clca1 Sema4g Igkc Guca2a Prdx6 d0 d14 Ly6g Tgm3 Muc2 Atp12a Saa1 Cyp2d12 Slc15a1 Spink1 Ggh Lypd8 Actg2 Reg3β Slc9a3 g Krt20 Car4 d0 d14 Aqp8 Ceacam1 Misp 15 **** Myo15b *** Txn1 Ltf Krt5 Ppbp 10 Psca 0 Stmn3 Nsg2 G Zcchc12 Tubb3 5 Rab3c Eef1a2 Reg3b Gng3 -5 Phox2b Pirt Rtn1 log2(FC) 0 Cd79b Ltb H2−DMb2 Cd37 -5 -10 Cd79a Cxcr5 Tnfrsf13c Fcer2a

mid Cd19 mid Cd52 distal distal Apoe B2m Vcam1 proximal proximal Arhgdib Figure 4 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

a b d0 d14 d0 d14

0.6 3 factor 5 d0 factor 1 0 0 13 11 16 3

factor 7 5 15 factor 10 d14 0 0 13

UMAP.harmony_2 11 16 3

15 factor 14 factor 17 3 0 0 UMAP.harmony_1 factor 20

factor 18 4 4

0 0 c factor 1 factor 5 factor 7 factor 10 factor 14 factor 17 factor 18 factor 20 Cyp2c55 Igfbp5 Krt13 Saa1 Clca4b Rps26 Krt8 Spink1 Car1 Col1a2 Krt5 Spink1 Wfdc18 Gpx2 Car2 Hoxb13 Hmgsc2 Mgp Ltf Car4 Ly6a Rplp0 Hao2 Sval1 Tmbim6 Col1a1 Krt6a Clca1 Aldh1a3 Prdx1 Krt19 Cyp2c68 mt-Co3 Igfbp4 Krt14 Aqp8 Clca3b Spink1 Aqp4 Cd9 Tspan8 Apoe Psca Ggh Duoxa2 Rps15a Txn1 Smim24 Emp1 Il6st Ly6d Txn1 Mfsd4a Rpsa Cox6b1 Il18 Mgat4c Postn Calml3 Zg16 Marcksl1 Rps14 Acads S100a6 Cox5a Lrp1 Lgals7 Tff3 Plac8 Ube2s Guca2a Aoc1 Phgr1 C1s1 Krt6b Prdx6 Cxcl5 Nme1 Zg16 B3galt5 0 1.5 0 0.3 0 0.075 0 0.15 0 0.1 0 0.15 0 0.09 0 0.09

d e factor 5 factor 7 anatomical structure develop. gene tissue develop. response to organic substance epidermis develop. gene ratio ratio cell adhesion keratinocyte differentiation biological adhesion 0.1 epithelial cell differentiation 0.1 cell. resp. to organic substance epidermal cell differentiation cell. resp. to chemical stimulus epithelium develop. cell differentiation skin develop. multicellular organism develop. 0.6 response to wounding 0.5 ECM organization wound healing extracellular organization keratinization 2.5 3.5 4.5 2 5 8 factor 14 factor 20

response to stress resp. to oxy-containing comp. reg. of cell. component movement gene regulation of localization factor 5 (regeneration) regulation of cell motility ratio anat. structure morphogenesis gene localization 0.15 response to acid chemical ratio factor 7 (keratinization) regulation of locomotion response to organic substance 0.1 response to cytokine tissue development factor 14 (danger response) resp.to oxy-containing compound peptide hormone secretion 0.55 0.3 factor 17 (protein synthesis) positive regulation of cell motility peptide secretion cell. component assembly blood vessel development factor 20 (organogenesis) localization of cell regulation of peptide secretion 2.8 3.4 4 3 4 5 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made Figure 5 available under aCC-BY-ND 4.0 International license.

c MAPK VEGF a b Group 2 factor 1 factor 3 Androgen factor 7 JAK−STAT factor 10 NFkB factor 11 TNFa factor 14 d0 p53 factor 19 Hypoxia factor 20 Trail MAPK factor 2 VEGF .5 factor 5 TGFb factor 6 WNT 0 factor 9 EGFR factor 12 Estrogen -.5 factor 17 9 2 15 6 5 121317 8 16 4 1811 1 19 3 1014 7 20 PI3K d14 factors d Group 1 TNFa NFkB JAK-STAT e d14 Androgen Estrogen i i d0 Androgen Estrogen 2 2 ILF ILF ILF -4 -2 ii ii i i ii ii

i i i p53 ii ii ii f H&E d14 2.5 d0 d14 i i 2 -5 i ii ii -4 i ii ILF i ii ILF i ii ILF ii

g IEC d14 stem cell core h (GSEO) i 200 p53 0.6 Estrogen seurat i -0.2 150 Hypoxia ii NFkB module score alue) TNFa PI3K v − JAK-STAT 100 MAPK Trail ii EGFR

Stem cells log10(p Androgen − 50

UMAP2 WNT 0 VEGF TGFb -1.0 -0.5 0.0 0.5 1.0 Pearson correlation UMAP1 FigurebioRxiv preprint6 doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

a gut mesenchymal/ colon spatial b epithelial cells (human) transcriptomics factor_12 (mouse) factor_15 factor_5 factor_2 correlation 0.8 factor_6 factor_3 0.4 31 factor_10 cell types factor_1 0.0 factor_11 −0.4 Fawkner-Corbett et al, our data factor_19 Cell, 2021 factor_13 −0.8 c factor_17 factor_9 S4 CCL21+ (d0) S4 CXCL13+ (d0) factor_4 factor_18 factor_8 factor_14 factor_20 factor_7 factor_16

S3 TA TA ve pti S phase rythroid ximal r ransitionalS1 IFIT3. s Distal E o S4 CXCL14 T oximallast S2 (1) oximal S2 (2) G2M phase Pr Enterocytes S3 Pr b Pr s rly Enterocytese last oximal Stem Cells Distal Abso Fibro b Pr Distal Enterocytes Fibro ximal Ea o oximal Matur Pr Pr S1 blets EECs S3 EBF Go S4 CCL21 Distal S2 EnterocytesOP2 Cells S3 HAND1S1 COL6A5 y Progenitors e OT blast ProgenitorS3 Progenitor r Distal Stem Cellsoximal Progenitor oximal Enterocytes Fibro Pr Secreto Pr BEST4 Distal Matur gut mesenchymal/epithelial cells (human)

d immune cells endothelium e immune cells Mac SPP1 Mac DC

ILC3 B cells pre-B cells

pre-B cells naive T cells B cells ILC3

fibroblasts mesothelium

monocytes DC Macrophages Macrophages SPP1 bioRxiv preprint doi: https://doi.org/10.1101/2021.07.01.450768; this version posted July 3, 2021. The copyright holder for this preprint (which Figurewas not certified 7by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY-ND 4.0 International license.

a type b i ii factor 9 d0 d14 relative mean 2 1.3 (lymphoid expression 1.2 follicles) m1 8 1.1 7 1 73 5 0.9 0.8 27 days 0 7 14 6 0.7 556 DSS 4 type modules down m1 GO enrichment 1 repair inflammatory response inflammation Coro1a Sash3 Unc93b1 Ly6e Rasal3 3 Myo1g Sell Il21r Wdfy4 Dok3 m1 neutrophil degranulation response to lipopolysaccharide 9 Lcp1 Arhgdib Inpp5d Cd22 Ppp1r18 Ptprc Cd72 Il2rg Il16 Tbc1d10c 15 2 6 9 12 10 20 4 3 11 1 19 17 18 8 13 7 14 5 16 -log10(p-value) Lyz2 Rac2 Limd2 Rps15a Ptprcap Laptm5 Acap1 ST factors

c d0 d14 i d0 d14 factor 15 d (ENS) 41

59 top 17 773 UC1 genes m6 ii relative mean d0 d14 expression GO enrichment m6 chemical synaptic transmission m6 DNA replication potassium ion transmemb. transp. 0 7 14 -log10(p-value) 4 UC2 genes DSS days

e IBD risk genes (cluster 3) f d0 d14 IBD risk genes enrichment in Factors

2.0

value) 1.5 9 p (

0 1.0 8 0.5 16 -Log1 0.0 factors