ARTICLE

https://doi.org/10.1038/s41467-020-18873-z OPEN Predicting cell-to-cell communication networks using NATMI ✉ Rui Hou 1, Elena Denisenko1, Huan Ting Ong 2, Jordan A. Ramilowski 3,4 & Alistair R. R. Forrest 1,4

Development of high throughput single-cell sequencing technologies has made it cost- effective to profile thousands of cells from diverse samples containing multiple cell types. To study how these different cell types work together, here we develop NATMI (Network

1234567890():,; Analysis Toolkit for Multicellular Interactions). NATMI uses connectomeDB2020 (a data- base of 2293 manually curated - pairs with literature support) to predict and visualise cell-to-cell communication networks from single-cell (or bulk) expression data. Using multiple published single-cell datasets we demonstrate how NATMI can be used to identify (i) the cell-type pairs that are communicating the most (or most specifically) within a network, (ii) the most active (or specific) ligand-receptor pairs active within a network, (iii) putative highly-communicating cellular communities and (iv) differences in intercellular communication when profiling given cell types under different conditions. Furthermore, analysis of the Tabula Muris (organism-wide) atlas confirms our previous prediction that autocrine signalling is a major feature of cell-to-cell communication networks, while also revealing that hundreds of ligands and their cognate receptors are co-expressed in individual cells suggesting a substantial potential for self-signalling.

1 Harry Perkins Institute of Medical Research, QEII Medical Centre and Centre for Medical Research, The University of Western Australia, Nedlands, WA 6009, Australia. 2 Ear Science Institute Australia, and Ear Sciences Centre, The University of Western Australia, Nedlands, WA 6009, Australia. 3 Advanced Medical Research Center, Yokohama City University, Yokohama, Kanagawa 236-0004, Japan. 4 RIKEN Center for Integrative Medical Sciences, Yokohama, ✉ Kanagawa 230-0045, Japan. email: [email protected]

NATURE COMMUNICATIONS | (2020) 11:5011 | https://doi.org/10.1038/s41467-020-18873-z | www.nature.com/naturecommunications 1 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-18873-z

n 2015, we published the first draft map of predicted cell-to- plasma membrane or both (Methods). Additionally, to facilitate cell communication between major human cell types based on the use of NATMI for other organisms, we provide inferred I – the expression levels of 708 ligands and 691 receptors mea- ligand receptor pairs for multiple vertebrate species based on sured in 144 purified human primary cell types, and a manually human ortholog mappings provided in the NCBI HomoloGene curated set of 2422 human ligand–receptor pairs (1894 pairs Database23. had literature support)1. Our major findings were that any given cell type expresses tens to hundreds of different ligands and receptors, that extensive autocrine signalling is a common feature Predicting cell-to-cell communication using NATMI. The of all cell types studied thus far and that cell-to-cell commu- workflow for cell-to-cell communication analysis in NATMI nication networks are highly connected via hundreds of predicted starts by the user providing input data files of expression ligand–receptor paths. and cell labels for single-cell data analysis. Using the con- Recently, with the availability of high throughput single-cell nectomeDB2020 ligand–receptor pair list described above, platforms2–10, expression profiling across an ever-growing variety NATMI extracts the expression levels of every ligand and of cell types from dissociated tissues without prior purification receptor expressed in each cell type (Supplementary Fig. 1a–c). has become common. To understand how individual cells and cell Edges between any pair of cell types are then predicted based on types within these complex samples communicate, multiple the expression of the ligand in one cell type and the expression of groups have since used our ligand–receptor pairs from 20151 to its cognate receptor in the other cell type (Supplementary Fig. 2a). infer cell-to-cell communication in developing heart11, kidney12, As cells can express multiple ligands and cognate receptors, cell liver13, lung14, cancer15,16, and cortex17. pairs are connected by multiple edges that are accordingly Here, we present NATMI (Network Analysis Toolkit for weighted by the expression of ligands and receptors in these Multicellular Interactions), a user-friendly tool primarily designed cell types (Supplementary Fig. 2b). Moreover, the same ligand- to process single-cell datasets, which can also be s and receptors can be expressed by multiple cell types. This applied to bulk transcriptomics and proteomics data. In brief, makes any given ligand–receptor pair a hyperedge that can NATMI uses connectomeDB2020 (a newly updated curated connect multiple cell types and the underlying structure of a real ligand–receptor pair list) or user-specified ligand–receptor pairs cell-to-cell communication network—a weighted-directed-multi- to predict and visualise the network of cell-to-cell communication hyperedge network (Supplementary Fig. 2c). In NATMI, how- between cell types (or clusters) in these datasets (Supplementary ever, we reduce these to weighted-directed-multi-edge networks Fig. 1). (Supplementary Fig. 2d) and for simplicity refer to these as ‘cell- We demonstrate several ways NATMI can be used to sum- to-cell communication networks’. Lastly, NATMI introduces the marise and extract biological insights from cell-to-cell commu- concept of cell-connectivity-summary networks that merge the nication networks (Supplementary Fig. 1). Specifically, NATMI many ligand–receptor edges drawn from one cell type to another can: (1) show all cell types predicted to communicate via a user- into a combined weighted cell-connectivity-summary edge to specified ligand–receptor pair (Supplementary Fig. 1d); (2) show summarise how strongly (or specifically) each cell type is com- all ligand–receptor pairs used for communication between a user- municating to another cell type (Supplementary Fig. 1f). specified pair of cell types (Supplementary Fig. 1e); (3) summarise the entire communication network to show how strongly or specifically each cell type in a complex sample communicates to Edge weights in cell-to-cell communication networks. For each every other cell type, thus identifying highly communicating cell analysed dataset NATMI creates an edge file that summarises the pairs or communities (Supplementary Fig. 1f); and (4) compare levels and fractions of cells in each cell type expressing each communication networks from two different conditions and ligand and receptor. From this it calculates two different edge identify edges (ligand–receptor pairs) that differ (delta network) weights. The mean-expression edge weights are calculated by between them (Supplementary Fig. 1g). NATMI (Python script) multiplying the mean-expression level of the ligand in the send- and the updated ligand–receptor lists are freely available at ing cell type by the mean expression of the receptor in the target https://github.com/forrest-lab/NATMI/. cell type. This weighting is useful to emphasise highly expressed ligands and receptors but provides no discrimination between cell-type-specific and housekeeping edges. The specificity-based Results edge weights, on the other hand, help identify the most specific Updated ligand–receptor pair lists. To facilitate the exploration edges in the network regardless of expression levels and are cal- of intercellular interactions, in 2015 we published a set of 1894 culated as the product of the ligand and receptor specificities, ligand–receptor pairs with primary literature support and an where each specificity is defined as the mean expression of the additional 528 putative pairs (secreted and plasma-membrane ligand/receptor in a given cell type divided by the sum of with high throughput –protein interaction (PPI) the mean expression of that ligand/receptor across all cell types. evidence)1. Here, we present connectomeDB2020, an updated set The specificity-based edge weights range from 0 to 1 where a of 2293 ligand–receptor pairs with primary literature support weight of 1 means both the ligand and receptor are only (Supplementary Data 1). These consist of 1751 pairs from our expressed in one (not necessarily the same) cell type. 2015 resource, 121 pairs from CellphoneDB v2.018, 50 pairs from To better demonstrate the utility of different edge weighting RNA-Magnet19, 22 pairs from SingleCellSignalR20, 9 pairs from metrics in NATMI, we reanalysed communication between 12 ICELLNET21, and 340 new manually curated pairs pre- defined cell types in a previously published cardiac dataset11 and dominantly reported since the original publication. Supplemen- predicted 126,738 cell-ligand–receptor-cell edges (Supplementary tary Data 1 also lists pairs from each resource excluded due to Data 2). Ranking edges based on expression weighting (Fig. 1a, lack of primary evidence (these include 143 of our original pairs shows the edges for the top 20 most highly expressed removed after user feedback and further checks revealed that the ligand–receptor pairs) differed substantially from those based PubMed ID supplied from HPRD22 was incorrect and did on specificity weighting (Fig. 1b, shows the edges for the top 20 not support the interaction). To allow users to discriminate most specific ligand–receptor pairs). Notably, the ligand–receptor contact-dependent signalling from soluble ligand-mediated sig- pairs ranked by expression weighting were also detected in a nalling, we have now annotated all ligands as either secreted, broader range of communicating cell type pairs (noticeable when

2 NATURE COMMUNICATIONS | (2020) 11:5011 | https://doi.org/10.1038/s41467-020-18873-z | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-18873-z ARTICLE

a Top 20 most expressed ligand–receptor pairs b Top 20 most specific ligand–receptor pairs Max Expression ll1rn->ll1r2 0.8 App->Cd74 Ccl5->Ccr1 Apoe->Lrp1 ll1b->ll1r2 Ccl5->Ccr1 Lgi4->Adam11 ll1b->ll1r2 Lgi4->Adam23 Mif->Cd74 Ptprz1->Ncam1 0.6 Rps19->C5ar1 Ncam1->Ptprz1 Apoe->Sorl1 Btla->Cd79a Ccl5->Sdc4 Pglyrp1->Trem1 Vim->Cd44 L1cam->Erbb3 C1qb->Lrp1 Dcn->Tlr2 Hp->Cd163 Ccl5->Ccr5 L1cam->Ptprz1 0.4 Lgals1->Ptprc Ptprz1->L1cam

Ccl5->Sdc1 Ligand–receptor pair Ccl12->Ccr1 Ligand–receptor pair Apoe->Scarb1 Cxcl2->Cxcr2 Rarres2->Ccrl2 Vtn->Plaur C1qa->Cd93 Mmrn2->Clec14a 0.2 Col1a1->Cd36 Ptn->Ptprz1 Psap->Gpr37l1 ll34->Ptprz1 Mfge8->Pdgfrb Col18a1->Kdr

T cell->B cell B cell->B cell B cell->B cell Pericyte->B NKcell cell->B cell T cell->DC-like cell B cell->DC-like cell Fibroblast 1->BFibroblast cellT cell->GranulocyteMacrophage->B 2->B cell cell T cell->MacrophageB cell->GranulocyteB cell->Macrophage T cell->Granulocyte NK cell->Granulocyte Schwann cell->B cell Pericyte->DC-like cellNK cell->DC-like cell NK cell->Fibroblast 2 Endothelial cell->B cell Pericyte->MacrophageNK cell->Macrophage Macrophage->PericyteNK cell->Schwann cell NK cell->GranulocytePericyte->GranulocytePericyte->DC-like cell Pericyte->Schwann cell Fibroblast 2->FibroblastFibroblast 1 1->DC-likeFibroblast 2->DC-likecell cell Fibroblast 2->FibroblastDC-like cell->DC-like 2 cell Macrophage->FibroblastGranulocyte->GranulocyteFibroblast 1Schwann 1->Macrophage cell->DC-likeFibroblast 2->Macrophage cellMacrophage->FibroblastMacrophage->DC-like 2 cell Macrophage->GranulocyteDC-like Fibroblastcell->MacrophageGranulocyte->Macrophage 2->Granulocyte DC-like DC-likecell->Granulocyte cell->MacrophageFibroblast 2->Granulocyte Schwann cell->MacrophageMacrophage->Macrophage SchwannSmooth cell->Fibroblast muscle cell->B 1 cell Granulocyte->GranulocyteGranulocyte->MacrophageMacrophage->GranulocyteFibroblast 2->Schwann cell Endothelial cell->DC-like cell Schwann cell->Schwann cell Endothelial cell->Fibroblast 2 Endothelial cell->Macrophage Schwann cell->Endothelial cell Smooth muscle cell->DC-like cell Endothelial cell->Endothelial cell Smooth muscle cell->Macrophage Smooth muscleSmooth cell->Granulocyte muscle cell->DC-like cell Cell-type pair Cell-type pair

Fig. 1 Top 20 ligand–receptor pairs ranked by expression or specificity weighting in the Skelly et al.11 dataset. Ligand-receptor pairs ranked by a expression (product of mean ligand expression level × mean receptor expression level). Rows are scaled by max. b Specificity (product of ligand specificity × receptor specificity). Ligand–receptor pairs with high expression weights (a) may be broadly used by many cell-type pairs, while those with high specificity weights (b) tend to be limited to one or only a few cell-type pairs.

a Heatmap view of unfiltered edges b Network view of unfiltered edges c Circos view of unfiltered edges Endothelial cell (824 cells) B cell 14 46 33 42 43 42 51 33 30 33 32 19 Fibroblast 1 DC-like cell Fibroblast 1 (5902 cells) (190 cells) DC-like cell 23 101 67 83 87 85 102 89 62 77 80 46 Endothelial cell Endothelial cell 12 74 77 79 85 72 83 61 60 66 78 34

Fibroblast 1 18 119 104 120 124 98 125 87 89 106 111 51 Fibroblast 2 B cell Fibroblast 2 (592 cells) (294 cells) Fibroblast 2 25 127 113 135 147 103 137 91 99 123 122 57 DC-like cell Granulocyte 18 70 59 69 74 69 84 55 54 57 60 30

Macrophage 17 99 64 80 83 82 96 87 60 71 78 46

NK cell 18 78 47 61 63 67 81 61 46 48 53 33 Granulocyte T cell (44 cells) (126 cells) Granulocyte B cell Pericyte 13 69 62 69 73 70 77 53 55 65 72 28

Schwann cell 17 83 76 98 99 71 93 61 67 98 85 35 T cell

Cell-type expressing ligand (sending) Smooth muscle cell 15 77 72 87 92 71 89 56 67 75 87 31

T cell 13 48 37 49 48 45 55 38 35 37 40 21 Macrophage T cell B cell Smooth muscle cell

NK cell Macrophage Smooth muscle cell Pericyte (1292 cells) (631 cells) DC-like cell DC-like Fibroblast 1 Fibroblast 2 Fibroblast Granulocyte Macrophage Schwann cell Schwann NK cell Endothelial cell Schwann cell NK cell Schwann cell Smooth muscle cell Smooth muscle (54 cells) (172 cells) Pericyte Cell-type expressing receptor (receiving) Pericyte (398 cells)

Fig. 2 Cell-connectivity-summary-network visualisations in NATMI. To identify cell types that are communicating ‘more’ than others in Skelly et al.11, the number of ligand–receptor pairs (detected in more than 20% of cells) connecting each pair of cell types were counted. Resulting simple edge-count-based cell-connectivity-summary network is shown using three distinct visualisation methods available in NATMI. a Heatmap view. Rows indicate cells expressing the ligands and columns indicate cells expressing the receptors. Number of ligand–receptor pairs connecting the cell–cell pairs are indicated and coloured relative to max count. b Network-graph view and c Circos-view for the same network. For b, c, edges are coloured by ligand expressing cell type, arrows indicate target cell type and thickness is proportional to the number of ligand–receptor pairs connecting the two cell types. The NATMI directed heatmap visualisation is asymmetric. This recognises that cell communication is directional and that although one cell type may send many signals to another there is no guarantee that it is reciprocated. comparing the larger width of Fig. 1a to that of Fig. 1b) and thus The heatmap view, however, avoids the problem and reconfirms kept potentially less informative housekeeping edges. one of the major predictions of the cardiac study11 that fibroblasts are the most trophic, with edges from fibroblasts to 10 of the 12 Cell-connectivity-summary networks. One of the primary aims cell types dominating the network. In Fig. 3, we show how NATMI can be used to filter the network based on expression of cell-to-cell communication network analysis is to identify fi which cell types are mutually coordinating their activities by levels or speci cities of the ligands and receptors involved. – Filtering by expression weights (Fig. 3a) can provide users a ligand receptor-mediated communication. Our analyses indicate, fi however, that all cell types have substantial potential to com- higher con dence that the ligands and receptors are expressed at sufficient levels. For the cardiac dataset, we explored both the municate with each other. Consequently, this leads to the ques- fi fi tion of which cell types are communicating the most? Or the most ltered by expression and un ltered network (Fig. 2) yielded, fi however, a similar conclusion that the fibroblasts are the most speci cally? The simplest strategy to measure the degree of fi fi communication from one cell type to another is to count the trophic. In contrast, ltering on speci city weights (Fig. 3b) number of ligand–receptor pairs connecting them. Figure 2 highlights a different set of top cell-to-cell pairs. In particular, summarises the cell-connectivity-summary network for the above autocrine signalling of Schwann cells, endothelial cells and granulocytes, fibroblast and Schwann cell signalling to endothelial cardiac dataset based on simple edge count using three different fi visualisations—heatmap, network graph, and circos plot. High cells, and broblast, granulocyte and pericyte signalling to granulocytes is highlighted while the broad signalling from connectivity of these networks makes the network graph and fi fi fi circos views not easily interpretable due to over-plotting issues. broblasts seen in the un ltered and expression ltered networks

NATURE COMMUNICATIONS | (2020) 11:5011 | https://doi.org/10.1038/s41467-020-18873-z | www.nature.com/naturecommunications 3 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-18873-z

acCount of edges with expression weight > 100 Count of edges detected by CellPhoneDB, P-value ≤ 0.05 e Top 10 pairs in a, b, c, d a b c d B cell 12 27 18 28 23 35 33 21 28 26 18 9 B cell 8 13 9 13 10 11 14 11 6 7 5 6 Fibroblast 2 > Schwann cell DC-like cell 14 41 27 36 33 52 46 38 41 38 31 15 DC-like cell 11 43 31 30 21 33 36 42 28 32 28 14 Endothelial cell 11 46 41 44 37 52 44 34 47 44 38 9 Endothelial cell 7 26 42 44 28 25 29 29 29 27 35 9 Fibroblast 1 > Endothelial cell

Fibroblast 1 15 65 51 57 53 68 64 44 64 58 54 13 Fibroblast 1 12 58 67 83 51 46 67 59 52 57 63 19 Fibroblast 2 > Endothelial cell Fibroblast 2 15 58 45 56 56 59 64 37 60 59 49 15 Fibroblast 2 19 58 65 74 62 53 63 45 54 59 59 24 Fibroblast 1 > Granulocyte Granulocyte 15 45 37 41 38 57 54 36 48 44 34 13 Granulocyte 11 26 24 27 20 26 31 26 22 18 18 9 Schwann cell > Schwann cell Macrophage 14 46 28 39 34 54 47 45 42 38 31 18 Macrophage 9 50 35 41 29 39 39 44 32 32 35 19 Endothelial cell > Endothelial cell NK cell 10 32 22 27 22 41 39 32 28 25 17 12 NK cell 14 34 14 21 17 27 34 32 15 10 10 16 Pericyte 13 52 38 46 42 57 52 31 50 51 46 11 Pericyte 7 19 21 28 16 23 21 21 15 20 27 6 Granulocyte > Granulocyte

Schwann cell 13 46 44 47 47 50 51 31 51 63 41 13 Schwann cell 8 26 33 44 32 23 26 18 27 39 26 7 Fibroblast 2 > Fibroblast 1 Cell-type expressing ligand (sending) Cell-type expressing Smooth muscle cell 12 40 38 43 38 44 47 25 45 43 36 12 ligand (sending) Cell-type expressing Smooth muscle cell 7 16 29 33 21 18 18 12 23 20 23 4 Fibroblast 2 > Fibroblast 2 T cell 11 24 16 26 21 32 31 20 25 24 16 9 T cell 7 10 9 14 9 11 14 13 9 7 7 6 Fibroblast 1 > Macrophage T cell B cell T cell B cell Fibroblast 2 > Macrophage NK cell NK cell Pericyte Pericyte Fibroblast 2 > Granulocyte DC-like cell DC-like DC-like cell DC-like Fibroblast 1 Fibroblast 2 Fibroblast Fibroblast 1 Fibroblast 2 Fibroblast Granulocyte Granulocyte Macrophage Macrophage Schwann cell Schwann Schwann cell Schwann Endothelial cell Endothelial cell Schwann cell > Fibroblast 1 Smooth muscle cell Smooth muscle Smooth muscle cell Smooth muscle Macrophage > NK cell Cell-type expressing receptor (receiving) Cell-type expressing receptor (receiving) Schwann cell > Endothelial cell

bdCount of edges with specificity weight > 0.1 Sum of specificity weights of edges Fibroblast 1 > Fibroblast 1

B cell 1 1 0 0 0 2 0 1 0 0 0 1 B cell 1.07 0.63 0.43 0.63 0.36 1.03 0.70 0.86 0.35 0.29 0.19 0.25 Fibroblast 1 > NK cell

DC-like cell 1 1 0 0 0 3 1 5 0 3 1 1 DC-like cell 0.58 1.43 1.01 1.21 0.86 1.82 1.43 2.50 0.93 1.48 0.87 0.65 Fibroblast 1 > Smooth muscle cell Endothelial cell 0 3 15 4 3 4 1 6 6 2 3 0 Endothelial cell 0.29 1.51 5.80 2.11 1.86 1.93 1.36 1.99 2.50 1.47 2.07 0.37 Fibroblast 1 > DC-like cell Fibroblast 1 1 1 12 5 1 11 1 3 5 2 3 0 Fibroblast 1 0.79 2.68 4.40 3.55 2.47 3.78 2.82 2.54 3.00 2.73 2.66 0.78 Fibroblast 2 > DC-like cell Fibroblast 2 0 1 13 6 6 10 4 1 8 11 6 2 Fibroblast 2 0.69 2.78 4.76 4.05 3.84 3.11 3.01 2.07 3.49 4.43 3.01 0.89 Granulocyte 1 3 6 4 3 11 7 5 2 3 1 0 Granulocyte 0.72 1.54 2.45 1.63 1.47 5.29 2.79 2.20 1.18 1.46 0.81 0.37 Fibroblast 1 > Pericyte

Macrophage 0 7 3 3 1 7 3 11 4 2 2 4 Macrophage 0.47 2.90 1.90 1.88 1.23 2.99 1.95 3.48 1.86 1.36 1.56 1.21 Fibroblast 2 > Pericyte NK cell 1 4 2 0 3 5 9 7 2 1 0 1 NK cell 0.87 1.75 0.96 0.98 0.95 2.57 2.40 2.42 1.04 0.77 0.51 0.62 Pericyte 0 1 3 1 0 9 3 1 1 2 2 0 Pericyte 0.31 1.15 1.59 1.60 1.02 2.65 1.53 1.06 1.33 1.86 1.87 0.31 Schwann cell 0 3 11 9 8 4 1 3 3 25 3 0 Schwann cell 0.28 1.58 3.62 3.98 2.75 1.87 1.65 1.42 1.68 10.3 1.83 0.42 Cell-type expressing ligand (sending) Cell-type expressing Cell-type expressing ligand (sending) Cell-type expressing Smooth muscle cell 1 1 7 2 1 2 0 2 4 0 5 1 Smooth muscle cell 0.48 0.79 2.22 1.80 1.24 1.55 1.02 0.85 1.69 1.15 1.75 0.52 T cell 1 1 0 0 0 1 1 2 0 0 0 0 T cell 0.35 0.49 0.44 0.56 0.32 0.83 0.86 0.64 0.37 0.29 0.21 0.18 T cell T cell B cell B cell NK cell NK cell Pericyte Pericyte DC-like cell DC-like DC-like cell DC-like Fibroblast 1 Fibroblast 2 Fibroblast Fibroblast 1 Fibroblast 2 Fibroblast Granulocyte Granulocyte Macrophage Macrophage Schwann cell Schwann Schwann cell Schwann Endothelial cell Endothelial cell Smooth muscle cell Smooth muscle Smooth muscle cell Smooth muscle Cell-type expressing receptor (receiving) Cell-type expressing receptor (receiving)

Fig. 3 Impact of different edge filters and metrics on identifying top communicating cell-type pairs. Heatmap views of the Skelly et al.11 cell- connectivity-summary network shown in Fig. 2 after filtering ligand–receptor pairs based on a mean-expression weight (≥100), b specificity (≥0.1) or c p values obtained by using CellPhoneDB18 (≤0.05). As an alternative to hard filtering the network, view in d is weighted by the sum of the specificities. e Compares the top 10 communicating cell type pairs identified in a–d. is diminished. We next compared our results with those obtained communication network methods is to understand the general by filtering edges based on p values calculated by CellPhoneDB18. principles of cell-to-cell communication within multicellular The resulting heatmap (Fig. 3c) is similar to that observed for the organisms. Previously, analysis of the FANTOM5 (bulk expres- expression filtered network (Fig. 3a) suggesting NATMI may sion) dataset1 revealed that most cell types express tens to over a better highlight high specificity edges. (Note, the heatmap shown hundred different ligands and receptors, and that hematopoietic in Fig. 3c should not be confused with those generated by cells tend to express fewer ligands and receptors than cells from CellPhoneDB which are symmetric. NATMI heatmaps are other lineages. Importantly, it also predicted a substantial asymmetric and have direction from the ligand expressing cell potential for autocrine signalling, with over 50% of the ligands type to the receptor expression cell type.) Lastly, the network can and receptors detected in each cell type having cognate partners also be summarised using the summed-specificity weights expressed in the same cell type. To examine whether these between each cell type pair (Fig. 3d). This generates a similar observations were consistent when using single-cell expression network to that in Fig. 3b, without requiring to set an arbitrary data, we repeated the analysis by applying NATMI to the Tabula threshold on specificity. Noticeably, as each approach generates a Muris atlas24 (a mouse cell atlas containing 44,949 FACS sorted different view of the network and highlights different most- cells from 20 organs and classified into 117 organ-resident cell communicating cell type pairs (Fig. 3e), users need to consider types). these differences when interpreting their own cell-to-cell com- munication networks. In NATMI, the user can choose any of its Autocrine, self, and intra-organ signalling in Tabula Muris.At built-in approaches, however, we recommend to use summed a detection rate threshold of 20% (commonly applied to single- fi fi speci city for most analyses as this captures speci c signalling cell datasets11,25), most cell types in the Tabula Muris dataset fi between cell types (Fig. 3d). Different edge ltering methods are expressed over a hundred ligands and receptors, with hemato- further explained in a concept Supplementary Fig. 3. poietic cell types expressing fewer ligands/receptors than other lineages (Supplementary Fig. 4a). Notably, almost half of the Application of NATMI to an organism-wide single-cell ligands detected in any given cell type in Tabula Muris had dataset. One of the ultimate aims of developing intercellular cognate receptors (and vice versa) detected in the same cell type

4 NATURE COMMUNICATIONS | (2020) 11:5011 | https://doi.org/10.1038/s41467-020-18873-z | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-18873-z ARTICLE

a c e Autocrine signalling potential of each cell type of Tabula Muris Secreted ligand mediated (outgoing edges) Plasma membrane ligand mediated (outgoing edges) 80 45 45 Cell ontology Cell-to-cell edge category Cell-to-cell edge category Endothelial 40 Cell > different tissue 40 Cell > different tissue 70 Epithelial Random Random Hematopoietic 35 35 Mesenchymal Cell > same tissue Cell > same tissue Nervous system 60 30 Random 30 Random Other Cell > same cell (autocrine) Cell > same cell (autocrine) 25 Random 25 Random 50 20 20

40 15 15 Autocrine receptors (%) Autocrine

Percentage of outgoing edges Percentage 10 of outgoing edges Percentage 10 30 5 5

20 0 0 20 30 40 50 60 70 80 1–13 1–13 14–26 27–39 40–52 53–65 66–78 79–91 14–26 27–39 40–52 53–65 66–78 79–91 Autocrine ligands (%) 92–104 92–104 105–117 105–117 Rank interval of outgoing edges Rank interval of outgoing edges

b Self-signalling potential of each cell type of Tabula Muris d Secreted ligand mediated (incoming edges)f Plasma membrane ligand mediated (incoming edges) 200 45 45 Bladder cell | Bladder Cell-to-cell edge category Cell-to-cell edge category 40 40 175 Mesenchymal cell | Trachea Different tissue > cell Different tissue > cell Random Random Stromal cell | Mammary_Gland 35 35 150 Same tissue > cell Same tissue > cell Mesenchymal stem cell of adipose | Fat 30 Random 30 Random 125 Same cell > cell (autocrine) Same cell > cell (autocrine) Leukocyte | 25 Random 25 Random 100 20 20

75 Median 54 co-detected L–R pairs 15 15

50 Erythrocyte | Aorta of incoming edges Percentage 10 of incoming edges Percentage 10

25 5 5 Number of L–R Pairs co-detected in 20% of cells Number of L–R Pairs

0 0 0

Cell-type 1–13 1–13 14–26 27–39 40–52 53–65 66–78 79–91 14–26 27–39 40–52 53–65 66–78 79–91 92–104 92–104 105–117 105–117 Rank interval of incoming edges Rank interval of incoming edges

Fig. 4 Autocrine and self-signalling potential in Tabula Muris. NATMI was run across the 117 cell types identified in the Tabula Muris dataset (at the detection rate threshold of 20%). a Autocrine signalling potential of each cell type in Tabula Muris. Each point corresponds to a cell type, and colours indicate broad lineage classes. X-axis shows the fraction of ligands expressed by a given cell type where the cognate receptor is also expressed in the same cell type. Y-axis shows the reciprocal for the fraction of receptors on a given cell type where the cognate ligand is also expressed. Red lines show the mean fractions of ligands and receptors. b Self-signalling potential of each cell type in Tabula Muris. The dashed line indicates the median number of ligand–receptor pairs co-detected in more than 20% of cells of each cell type (at 10CPM threshold). In c–f, cell-to-cell pairs are classified as autocrine (blue), intra-organ (pink), and inter-organ (green) summary edges. The summary edges are then ranked by summed specificity and the distribution of ranks for each of the three classes shown. Error bars represent standard deviation of ranks. c The distribution of ranks of secreted ligand mediated outgoing edges. d The distribution of ranks of secreted ligand mediated incoming edges. e The distribution of ranks of plasma-membrane ligand mediated outgoing edges. f The distribution of ranks of plasma-membrane ligand mediated incoming edges. The average distributions of autocrine, intra-organ and inter-organ edges through 100× randomised ligand–receptor pairs are shown as dashed lines. further confirming our previous prediction of large potential for least 20% of cells of at least one cell type at an expression autocrine signalling in cell-to-cell communication networks threshold of 10CPM (Supplementary Data 4 and Supplementary (Fig. 4a). Fig. 4b). Next, using the single-cell resolution data we could discrimi- To examine autocrine signalling in more detail, we next nate whether a ligand and its cognate receptor were actually generated cell-connectivity-summary networks weighted by the expressed in the same cell or two different cells of the same summed specificity between each of the 117 cell types in the cell type (a distinct advantage over bulk measurements where Tabula Muris dataset (Supplementary Data 5 and Supplementary autocrine and self-signalling cannot be discriminated). Across the Fig. 5). To discriminate between contact-dependent and contact- 117 cell types in Tabula Muris, a median of 54 ligand–receptor independent signalling, we generated two separate analyses based pairs were co-detected in at least 20% of the same cells (from each on pairs that involved either secreted ligands or plasma- cell type) at an expression threshold of 10 counts per million membrane ligands. (CPM, Fig. 4b, Supplementary Data 3 lists the ligand–receptor For each of these 117 cell types, we ranked their connections pairs co-detected in each cell type). This extends on our original (outgoing and incoming edges) by their summed-specificity observations of substantial autocrine signalling potential weights and classified cell-to-cell summary edges as autocrine, using the FANTOM5 data1 and is the first finding that cognate intra-organ (excluding autocrine) and inter-organ. In Fig. 4c–f, ligands and receptors are co-expressed in a substantial fraction we then plotted the distribution of ranks for autocrine (‘blue’), of single cells suggesting their potential for self-signalling. intra-organ (‘pink’), and inter-organ (‘green’) summary edges. Moreover, we observed that a substantial fraction of all When compared, autocrine edges had higher rankings, meaning ligand–receptor pairs (31%, 719/2293) were co-detected in at that, on average, autocrine edges tend to be more specific than

NATURE COMMUNICATIONS | (2020) 11:5011 | https://doi.org/10.1038/s41467-020-18873-z | www.nature.com/naturecommunications 5 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-18873-z intra-organ and inter-organ edges (solid lines in Fig. 4c–f), epithelial cells were also predicted to receive signals from whereas repeated analyses using randomly permuted receptor- pancreatic beta cells and epithelial cells of the trachea. We also ligand pairs abolished these differences (dashed lines in Fig. 4c–f). predicted two smaller communities of cardiac muscle cells A slight enrichment was also observed for intra-organ signalling communicating to endocardial cells, and pancreatic delta cells for outgoing plasma-membrane ligand-mediated edges (Fig. 4e), communicating to enteroendocrine cells of the large intestine. while no such enrichment was found for the secreted ligand- Lastly, oligodendrocyte precursor cells (OPC) were predicted to mediated edges (Fig. 4c, d) and for the plasma-membrane undergo strong autocrine signalling and receive incoming receiving edges (Fig. 4f). communication from bladder cells. Repeating this analysis using To test whether the observed higher autocrine rankings depend the FANTOM5 bulk primary cell data also predicted a large on the edge weighting method used, we repeated the analysis community with hepatocytes as a central broadcasting node using simple edge counts and summed-expression weights and (Supplementary Fig. 6a). again observed higher (albeit to a lesser degree than shown in Examining the most specific ligand–receptor pairs involved in Fig. 4) rankings of autocrine signalling for the edge-count-based each cellular interaction identified both well-known and novel analysis (Supplementary Fig. 4c–j). Another repeated analysis pairs that appear to be biologically relevant (Supplementary using the FANTOM5 bulk data further confirmed autocrine edges Data 6). The most specific ligands driving communication from had higher ranks (Supplementary Fig. 6c–f). Hence, we conclude cardiac muscle cells to endocardial cells included factors relevant that autocrine signalling is a major predicted feature of cell-to-cell to cardiac development and homoeostasis such as Angpt1, communication networks. Bmp10, Vegfa, Vegfb, and Nppa. The most specific communica- tion was carried out via Nppa-Npr3, where Npr3 was previously reported as an endocardial marker26, and Nppa was reported to Prediction of cellular communities in the Tabula Muris.To 27 examine whether the summed-specificity weighted cell- be expressed in cardiomyocytes . These expression patterns have also been recently confirmed in another single-cell analysis of connectivity-summary networks might help reveal sets of cell 28 types that work together within an organ or to achieve a biolo- . gical process, we carried out hierarchical clustering of cell types Similarly, for pancreatic delta cells communicating to enter- fi oendocrine cells of the large intestine we also identified by the vectors of their summed-speci city weights (Supplemen- fi tary Fig. 5). For both the secreted ligand and plasma-membrane somatostatin (Sst) as the most speci c factor generated by delta 29 fi ligand mediated networks, this failed to reveal any underlying cells and con rmed that two of its receptors, Sstr1 and Sstr5, are fi 30 clustering of cell types into organs, tissues or cellular commu- speci cally expressed in enteroendocrine cells . Interestingly, Sst has been shown to signal via Sstr5 to induce production in nities. Instead, cells tended to cluster by lineage (indicated as 31 colour bars in Supplementary Fig. 5). the large intestine . Additionally, a recent study of intestinal fi delta cells validated Sst-Sstr5 mediated signalling to intestinal We next examined the top 10 summed-speci city edges 32 based on the secreted and plasma-membrane ligands and enteroendocrine cells . In the case of the hepatocyte-centred community, we re- visualised them as cell-connectivity-summary networks which fi revealed distinct cell communities for both secreted and plasma- identi ed the well-known endocrine relationship from hepato- membrane ligands (Fig. 5). For the connections involving cytes to megakaryocyte-erythroid progenitors mediated by Thpo and its receptor Mpl33. In addition to identifying such specific secreted ligands, we observed four disconnected communities fi (Fig. 5a). The largest community involved hepatocytes broad- edges, we also predicted that multiple brinogens produced by casting to basophils, microglial cells, megakaryocyte-erythroid hepatocytes (Fga, Fgb, and Fgg) are used to signal via Itgb1 progenitors, and proximal tubule epithelial cells. Proximal tubule (proximal tubule cells), Itga2b (megakaryocyte progenitors and

a Secreted-ligand-based cell-connectivity-summary network in Tabula Muris b PM-ligand-based cell-connectivity-summary network in Tabula Muris Bergmann glial cell | Brain_non-myeloid (40 cells) Pancreatic D cell | Basophil | Marrow Microglial cell | Brain_myeloid (140 cells) (25 cells) (4394 cells)

1.24 1.22 0.85 0.53 endothelial cell | Lung Oligodendrocyte precursor 1.95 Hepatocyte| (693 cells) cell | Brain_non-myeloid Enteroendocrine cell | Large_Intestine (391 cells) (203 cells) (59 cells) 1.41 1.41 Megakaryocyte-erythroid 0.65 progenitor cell | Marrow Epithelial cell 0.50 0.58 of proximal Astrocyte | Brain_ (55 cells) 0.68 non-myeloid Endocardial cell | Heart tubule | 0.87 (432 cells) (165 cells) Epithelial cell | Trachea (219 cells) 0.49 Neuron | Brain_non-myeloid (201 cells) 0.87 (281 cells) 0.98 Leukocyte | Skin 0.92 Type B pancreatic cell | Pancreas (15 cells) Cardiac muscle cell | Heart (449 cells) (133 cells) Keratinocyte | Tongue Bladder cell | Bladder (330 cells) (695 cells) Keratinocyte stem cell | Skin 0.93 (1404 cells) 0.86 0.94 0.87 Oligodendrocyte precursor cell | Brain_non-myeloid (203 cells) Epidermal cell | Skin (276 cells)

Fig. 5 Putative secreted and plasma-membrane-ligand mediated communities identified in Tabula Muris. Visualisation of cell types connected by the top ten summed-specificity edges in Tabula Muris based on a secreted ligands and b plasma-membrane ligands. Coloured shadows highlight the isolated communities found in each network, edge labels correspond to the summed-specificity weight of the edge. Dashed oval in a highlights highly specific signalling from hepatocytes to multiple cell types via secreted ligands. Dashed oval in b highlights a set of four brain-derived cell types communicating via plasma-membrane ligands.

6 NATURE COMMUNICATIONS | (2020) 11:5011 | https://doi.org/10.1038/s41467-020-18873-z | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-18873-z ARTICLE basophil), Itgb2 (basophils) and Itgam (basophils and microglial a cells). Comparison of the cellular composition between 3 and 8 m time point

For the plasma-membrane ligand-mediated interactions, we T cell observed two communities, which consisted of cell types that B cell were more likely to be physically collocated than those seen from Endothelial cell Source the secreted ligand pairs. One community consisted of seven cell Stromal cell 3 m 18 m types including four brain-derived cell types (neurons, OPCs, Cell type Macrophage astrocytes, and Bergmann glial cells), while the other consisted of Basal cell keratinocytes predicted to strongly signal with themselves and Luminal epithelial cell of mammary gland epidermal cells (Fig. 5b). Repeating this analysis using the 0 5 10 15 20 25 30 35 40 FANTOM5 bulk primary cell data also predicted a community of Fraction (%) neural cells (Supplementary Fig. 6b). Examining the most specific ligand–receptor pairs involved in signalling within the epidermal b Log2-transformed fold changes of counts of edges between 3 and 8 m time point cell community (Supplementary Data 6) identified binding between multiple desmogleins and desmocollins (Dsg–Dsc), as Higher in 3 m > 2 fold Luminal epithelial cell of mammary gland Less than 2 fold Stromal cell well as ephrins and eph receptors (Efn–Eph). These proteins are Higher in 18 m > 2 fold known to be expressed in skin and are important to its biological 34–39 functioning . Similarly, for the edges between the four B cell nervous system cell types (neuron, OPC, astrocyte, and Bergmann T cell glial cell) multiple neuroligins were predicted to signal via multiple neurexins and multiple Slirtk ligands via Ptprs. Interestingly, although neuroligin–neurexin complexes are known key interactors at neuronal synapses, growing evidence indicates that they are also used for interactions involving astrocytes, oligodendrocytes and OPCs40,41. We also observed interacting Macrophage pairs with more restricted cell tropism, such as Rtn4 signalling to Endothelial cell Lingo1 (used between neurons and OPCs) and Cntn6, Dll3, Cntn1 Basal cell signalling to Notch1 (used between OPC and astrocytes). Fig. 6 Differential analysis of cell-connectivity-summary networks from aging mammary gland in Tabula Muris. a Comparison of population Differential network analysis in NATMI. Lastly, we used fraction of seven cell types in the 3- and 18-month-old murine mammary NATMI to predict age-related changes in cell communication gland. b Comparison of edge-count-based cell-connectivity-summary within the murine mammary gland (mammary glands from 3- networks at two time points. Summary edges with twofold or more active and 18-month-old mice profiled in the Tabula Muris Senis42 were ligand–receptor pairs connecting them at 3 months than 18 months are compared). A simple edge count analysis, at a detection rate shown in red while those with twofold or more active ligand–receptor pairs threshold of 20%, revealed that there were substantially more connecting them at 18 months than 3 months are shown in blue. Edge ligand–receptor edges predicted as active at 3 months than at thickness is weighted by the log2-transformed fold changes between the 18 months (2045 edges were detected at both ages; 1247 edges two networks. Cell types (nodes) with a twofold difference in cell fraction were detected at 3 months only; and 340 edges were detected at are shown in red (higher in the 3-month timepoint) and blue (higher in the 18 months only, Supplementary Data 7 and 8). Examining dif- 18-month timepoint). ferences in the cell-connectivity-summary networks based on the 3- and 18-month-old mammary gland (Fig. 6) revealed specific Discussion cell types were driving these age-related differences. In particular, Here, we have described NATMI, a tool to help users explore cell- edges involving basal cells (basal cell > basal cell, luminal epi- to-cell communication using scRNA-seq or other omics expres- thelial cell > basal cell, stromal cell > basal cell and basal cell > sion data. In the first part of the manuscript, we described stromal cell) were more than twofold higher in the 3-month-old the underlying network concepts and explored the effect of dif- mammary gland than the 18-month sample. Similarly, edges ferent edge weighting approaches. To help identify cell types involving B and T lymphocytes (B cell > B cell, B cell > T cell and communicating ‘the most’, we introduced the concept of cell- T cell > T cell) were more than twofold higher in the older mice connectivity-summary networks and demonstrated how the (Fig. 6). weight metric and edge filtering used can influence the order of Furthermore, 266 (78.2%) of the 340 edges only detected in the top communicating edges. We recommend using cell specificity 18-month-old mammary gland involved signalling to or from T weighting for most applications; however, both simple edge count and B cells, while only 141 (11.3%) of the 1247 edges exclusively and expression-weighted edges are provided in our outputs as active at 3 months involved lymphocytes. Conversely, 613 default to yield additional insights on the communicating net- (49.2%) of the 1247 ligand–receptor edges only detected at works. As a note, the specificity weights were calculated based on 3 months involved signalling to or from basal cells while only 52 the input dataset and thus should be considered local or dataset (15.6%) of the 340 edges exclusively active at 18 months involved specific. With larger reference datasets becoming available basal cells. through efforts such as the Human Cell Atlas52, we will aim to Examining the basal cell data in more detail identified 9 provide global specificities based on expression across most receptors (Gpc1, Procr, Fzd7, Itga5, Ldlr, Tlr2, Lrp6, Ephb1, and profiled cell types in future versions of NATMI. Tfrc) and 8 ligands (Tgfa, Ngf, Col5a2, Il11, Col4a2, Jag1, Col18a1 Applying NATMI to the Tabula Muris data (the broadest and Hspg2) at least twofold down-regulated at 18 months single-cell atlas to date), we reconfirmed our original findings (Supplementary Data 8). Notably, many of these top down- from FANTOM5 bulk data that autocrine signalling is a major regulated ligands and receptors are known to be important in predicted component of cell-to-cell communication networks. maintenance of normal mammary basal stem cells and are This work is also the first to systematically assess the potential for implicated in basal-like and triple negative breast cancer43–51. individual cells to express cognate ligand–receptor pairs and to

NATURE COMMUNICATIONS | (2020) 11:5011 | https://doi.org/10.1038/s41467-020-18873-z | www.nature.com/naturecommunications 7 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-18873-z

find substantial potential for their self-signalling. Together, these Fig. 3, we also showed that edges filtered based on expression or results predicted that most cell types are major contributors to p values (calculated in CellPhoneDB) are very similar whereas the their own niche53,54. specificity metric used in NATMI identifies a different set of In contrast to expectations that cells based in the same tissue edges, which tend to be more unique and specific to a given pair would tend to communicate more than those in different tissues, of communicating cells. we found essentially no difference between ranks for intra-tissue We also acknowledge there are several limitations and caveats and inter-tissue edges (Fig. 4c–f) across all cell types within the to the case studies presented and, more generally, to the meth- Tabula Muris atlas. Focusing on the cells connected by the pre- odology used here. First, for the single-cell data analyses we have dicted top 10 summary edges identified biologically plausible used the original cellular annotations provided by the authors of communities (Fig. 5), which included communities involving cells each manuscript11,24,64, which can potentially affect the accuracy from the same tissue and communities involving more distant of predicted networks. As the quality of user-defined clusters/cell endocrine relationships. type annotations will define the quality of NATMI predictions, we Lastly, we have demonstrated how NATMI can be used to recommend optimising the clustering/cell type annotation prior identify changes in cell-to-cell communication by comparing to running NATMI. Additionally, the edges that we highlighted networks predicted from comparable datasets (e.g., paired sam- in the manuscript are based on experiments without replicates ples or time-course experiments in which two datasets under (unfortunately common in current single-cell analyses) making consideration have almost identical cellular composition). Using them potentially specific only to the samples tested. Moreover, for the Tabula Muris Senis dataset, we showed that signalling single-cell transcriptome data, the predictions are necessarily involving mammary basal cells is down-regulated between 3 and based on the mRNA expression levels rather than on protein 18 months (Fig. 6). Notably, this is observed both with specificity concentration and the data is subject to dropouts where a weakly weighted summary edges and with simple edge counts. expressed ligand or receptor may be missed in shallowly profiled We acknowledge that NATMI is not the first tool to attempt cells. As mentioned above, we do not model heterodimerization cell-to-cell communication analyses at the single-cell level. Sup- or requirements for co-receptors. We also do not have estimates plementary Data 9 systematically compares features and of the binding kinetics for each ligand–receptor pair, which can approaches used in NATMI and 13 other methods18–21,55–63. The vary across interacting pairs and, ultimately, in conjunction with major discriminating features incorporated in NATMI are that the number of functional ligand/receptor molecules dictate the (1) NATMI uses connectomeDB2020, the most comprehensive physiological response of the interacting pair. set of ligand–receptor pairs with primary literature support to In conclusion, we expect NATMI will greatly facilitate analysis date (note, a substantial number of ligand–receptor pairs in other of cell-to-cell communication networks based on single-cell gene resources lack primary literature support, Supplementary Data 1), expression data as well as for bulk RNA-seq and proteomic data. (2) NATMI can identify and visualise the cell-types that are Note, a distinct advantage of using single-cell data for building communicating the most or the most specifically (both the these networks is that rare and difficult to isolate cell types can be directed heatmap visualisation and summed-specificity weighting more easily profiled while avoiding many of the biases introduced of cell-connectivity summary edges is unique to NATMI) (Fig. 3), by cell isolation/purification and from cell culture needed for bulk (3) NATMI allows users to easily identify and visualise top profiling. Thus the generation of cell-connectivity-summary ligand–receptor pairs based on expression or specificity (Fig. 1) networks from this data will allow users to highlight the cell and (4) NATMI allows comparison of networks to identify types that are communicating the most and then further explore changes in both ligand–receptor signalling and overall cell-to-cell individual ligand–receptor pairs that are driving these connec- signalling between cells under different conditions (Fig. 6). tions. The next exciting, but non-trivial, challenge will be to Our comparison to the 13 other tools also highlights three validate these inferred intercellular communication pathways features that are currently not incorporated into NATMI, such as experimentally (using for example such methods as PIC-seq65). (1) the handling of heteromeric complexes (CellPhoneDB18, 19 RNA-Magnet ), (2) the prediction of downstream signalling Methods 55 59 consequences (NicheNet , SoptSC ), and (3) the analysis of High quality manually curated ligand–receptor interaction database.To spatial transcriptomics data (RNA-magnet19, SpaOTsc61). The expand upon our previous list of ligand–receptor pairs, we first incorporated pairs modelling of heteromers pioneered in CellPhoneDB18 acknowl- from CellPhoneDB v2.018. We downloaded 1144 interactions from https://www. cellphonedb.org/downloads/interactions_cellphonedb.csv and then extracted edges that many receptors and ligands only function as hetero- 693 simple:simple interaction pairs (CellPhoneDB internal ID of format ‘CPI- mers. As the full set of biologically relevant heteromers is still yet SSxxx’. Heteromeric complexes were excluded). All protein/gene identifiers were to be described, modelling heteromers based on what is currently updated to HGNC IDs and then compared to our previous catalogue. In total, 478 known can result in biases toward well-studied interactors, thus of these were already in our 2015 resource1. For the remaining 215, we used we do not model them in NATMI. Similarly, although using PubMed and google scholar searches to search for primary literature evidence. Finally, 121 pairs with primary literature evidence were kept. changes in the expression of predicted target induced by To further expand upon our previous interaction lists we used the following ligand mediated signalling (used in NicheNet55) may provide strategy: (1) manual literature search for ligand–receptor pairs using terms ‘ligand’, greater confidence that the ligand–receptor pair is active and ‘receptor’, ‘cytokine’, ‘growth factor’ and for genes from known ligand and receptor inducing a change of state in the target cell we suspect that families that were not yet covered in the database (2) systematic extraction of PPI pairs from STRINGDB (https://stringdb.org/cgi/download.pl? the predictions are biased toward well-studied ligands and cell sessionId=oCkT8UeKh8rN&species_text=Homo+sapiens. Specifically, physical- types (292 of the ligands in connectomeDB2020 are not in binding interactions ‘9606.protein.actions.v11.0.txt’ with score ≥700, and NicheNet). Lastly, although NATMI has not been specifically experimental interaction ‘9606.protein.links.full.v11.0.txt’ experiments score ≥700) designed for spatial data, cell type labels based on clusters that involving known ligands and receptors from the merged list above and putative fi secreted and plasma-membrane proteins from our previous publication1. In total, incorporate spatial context information (e.g., tumour in ltrating 3147 putative pairs involving a secreted protein and 1777 involving 2 plasma- T cells, tumour proximal T cells, tumour distal T cells) can also be membrane proteins were compared to our previous lists and the remainder was analysed. manually curated. This combined strategy identified a further 340 ligand–receptor Furthermore, comparison of NATMI and CellPhoneDB18 pairs that were absent from our 2015 resource and from CellPhoneDB. During the – peer review, we manually reviewed and incorporated an additional 50 pairs from using the same input and ligand receptor pair lists (Supple- 19 20 fi Supplementary Table 3 of RNA-magnet , 22 pairs from SingleCellSignalR mentary Notes 1 and 2) revealed NATMI identi es similar edges (source: https://github.com/SCA-IRCM/LRdb/blob/master/LRdb_122019.txt), and but is faster, especially with limited computational resources. In 9 pairs from Supplementary Table 1 of ICELLNET21.

8 NATURE COMMUNICATIONS | (2020) 11:5011 | https://doi.org/10.1038/s41467-020-18873-z | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-18873-z ARTICLE

Lastly, feedback from a user of our previous database of one of our entries being expressed a ligand at a lower mean-expression level than another may still be the incorrect prompted us to re-review the literature support of our original database. major cell type producing the ligand if it is more abundant (e.g., 500 cells of cell This led to exclusion of 143 pairs, mostly from HPRD, where the PubMedID did type A expressing mean 10CPM of a ligand vs 10 cells of cell type B expressing not actually support the interaction. Excluded putative ligand–receptor pairs from mean 80CPM of the same ligand → 5000 in cell type A vs 800 in cell type B). Care our 2015 resource, CellPhoneDB, RNA-magnet, SingleCellSignalR and ICELLNET must be taken in interpreting total-expression weights as they assume unbiased are listed along with the reasons for removal in a separate tab of Supplementary (truly representative) sampling of the cells in the sample. Data 1. Cell-connectivity-summary-network edge weights. To summarise cell-to-cell User supplied ligand–receptor interactions (optional). Currently, NATMI uses connectivity within the network, NATMI generates a matrix of cell-connectivity- connectomeDB2020 as the default ligand–receptor pair list, but users can also summary-network edges. These can be weighted by edge-count or summed provide their own custom ligand–receptor pairs lists, as described at https://github. expression and specificity. Using the ligand–receptor weights described above users com/forrest-lab/NATMI. can generate edge-count based summaries that simply count the number of ligand–receptor pairs, from cell-type1 to cell-type2, that pass a set of user-defined thresholds. For example, count all pairs observed at a detectionThreshold of 20%, Identification of ortholog ligand/receptors in other vertebrate species. The an expressionThreshold of 10CPM, or with a specificityThreshold of 0.1. In con- curated ligand–receptor database provided in this manuscript is based on inter- trast, the summed-specificity weight sums all specificity weights from cell-type1 to actions between human ligands and receptors. To run NATMI on other species we cell-type2 without applying a hard threshold. Cell type pairs connected by many extracted the homologues of interacting pairs from the NCBI HomoloGene specific edges will have high summed-specificity weights. Database23. The homologous pairs for 20 commonly requested species including mouse, rat, zebrafish (https://www.ncbi.nlm.nih.gov/homologene/statistics/) can be automatically inferred by NATMI based on the input gene expression data. Note, Visualising cell-to-cell communication edges using NATMI. NATMI uses as there are known species specific ligand–receptor pairs, we recommend to always multiple plot types to visualise cell-to-cell communication networks. With the check the literature and validate reported edges when applying NATMI to other Skelly et al.11 cardiac dataset as a demonstration, we used NATMI to generate species. For the case studies presented here using mouse single-cell data, we did not heatmaps of top-ranked ligand–receptor pairs based on expression and specificity systematically review the human–mouse ortholog pairs but we might consider weights (Fig. 1) and three different cell-connectivity-summary networks visuali- systematically annotating all pairs in a future database release. There are, however, sations (heatmap, network graph and circos views, Figs. 2 and 3). some drawbacks to limiting the pairs to those that have been confirmed in the To generate these plots, users can run the following commands: ‘python species of interest only vs a broader survey based on the largest collection of ExtractEdges.py --emFile expression.matrix.txt --annFile annotation.txt --out available pairs followed by post analysis confirmation. Our recommendation to outfolder’ and ‘python VisInteractions.py --sourceFolder outfolder’. By default, this end-users is to run NATMI with all L–R pairs available in connectomeDB2020, will generate nine cell-connectivity-summary-network visualisations (three views the predicted top L–R pairs and then confirm the pairs of interest in the with three kinds of edge weighting strategies (summed expression, summed literature or experimentally. Note it is also highly dependent on the degree of specificity, and edge count)), two heatmaps of top-ranked ligand–receptor orthology and the coverage/paucity of ligand–receptor studies in the species pairs based on average expression and specificity weights, and filtered considered. ligand–receptor-mediated edge list, adjacency matrices of cell-connectivity- summary networks. To visualise delta networks (as shown in Fig. 6), users should first extract edges Required input data fi . NATMI requires the user to provide a gene expression le in the two conditions by running the following commands: ‘python ExtractEdges. (NATMI can process both raw and normalised gene expression data but we py --emFile expression.matrix.condition1.txt --annFile annotation.condition1.txt recommend users to use a normalisation appropriate for their data), with columns --out condition1.folder’ and ‘python ExtractEdges.py --emFile expression.matrix. containing expression measurements for a single cell (or bulk measurement) and condition2.txt --annFile annotation.condition2.txt --out condition2.folder’. These rows corresponding to genes. For single-cell analyses, the user also needs to provide ‘ fi are then compared using the command: python DiffEdges.py --refFolder a metadata le with the mapping between each cell in the dataset and a cell type/ condition1.folder --targetFolder condition2.folder --out compare.folder’.To cluster label. Step-by-step instructions on how to generate or export these two ‘ 66 67 visualise the delta networks use the command: python VisInteractions.py compare. tables from Seurat and SCANPY are provided online in the GitHub resource: folder’. Consequently, six heatmaps and six network graphs of the delta networks https://github.com/forrest-lab/NATMI. For the expression data table, all datasets fi based on three kinds of edge weighting strategies (averaged expression, summed used in this paper were normalised by total number of unique molecular identi ers specificity, and edge count) and two change quantification methods (absolute (or reads if the data does not use UMIs) and then rescaled by multiplying by difference and fold change) are generated. Adjacency matrices of cell-connectivity- 1,000,000 (i.e., counts per million UMIs/reads). fi – summary networks in the two conditions and delta networks are also provided for Using connectomeDB2020 (or a user-speci ed ligand receptor pair list) further exploration. NATMI extracts the expression levels of every ligand and receptor in the provided More details (such as how to apply different edge filters) can be found in single-cell (or bulk) dataset. When using single-cell data, NATMI summarises the NATMI’s GitHub tutorial page: https://github.com/forrest-lab/NATMI expression of every ligand and receptor for each cell type/cluster specified in the metadata file (by calculating mean expression, total expression and fraction of cells the gene is detected in). For the analyses presented in this manuscript we consider a Reporting summary. Further information on research design is available in the Nature gene as expressed in a given cell type if more than 20% of cells of the cell type had Research Reporting Summary linked to this article. non-zero read counts for that gene. The user however is free to alter the fi detectionThreshold parameter to suit the capture ef ciency of their dataset and can Data availability use expressionThreshold parameter to filter out weakly expressed ligands and All data analysed within this paper are publicly available. The Skelly et al.11 mouse receptors. cardiac dataset can be downloaded from ArrayExpress (experiment E-MTAB-6173). The Tabula Muris data can be downloaded from figshare (https://figshare.com/articles/Single- Ligand–receptor edge weights. NATMI outputs weights of edges from a ligand- cell_RNA-seq_data_from_Smart-seq2_sequencing_of_FACS_sorted_cells/5715040). The producing cell type/cluster to a receptor-expressing cell-type/cluster using three Tabula Muris Senis mammary gland data can also be downloaded from figshare (https:// metrics. These are mean-expression weight, specificity weight and total-expression figshare.com/articles/Single-cell_RNA-seq_data_from_microfluidic_emulsion/5715025). weight. The mean-expression weight is calculated as the product of the mean The FANTOM5 CAGE expression data for human protein-coding genes in the 144 expression of the ligand in a cell type/cluster and the mean expression of the human primary cells are publicly available on FANTOM5 website (https://fantom.gsc. → mean = receptor in a cell type/cluster: edge(cell-type1 cell-type2) ligand1-receptor1 riken.jp/5/suppl/Ramilowski_et_al_2015/data/ExpressionGenes.txt). mean mean fi cell-type1 ligand1 × cell-type2 receptor1. The speci city weight is calculated as the product of (1) the mean expression of the ligand in a cell type divided by the sum of the mean expression of the ligand across all cell types in the dataset and (2) Code availability the mean expression of the receptor in a cell type divided by the sum of the mean NATMI, an open-source Python tool, and connectomeDB2020 database are available at expression of the receptor across all cell types in the dataset: edge(cell-type1 → cell- GitHub (https://github.com/forrest-lab/NATMI). The repository also includes specificity = mean ∑ mean −1 type2) ligand1-receptor1 cell-type1 ligand1 ×( (cell-type ligand1)) × installation instructions, format requirements, detailed function descriptions, example mean ∑ mean −1 cell-type2 receptor1 ×( (cell-type receptor1)) . workflows for the user and FAQs section. In addition, we provide the output of an Note, we do not use total-expression weight for any of the analyses presented in example workflow in the repository so users can preview the analysis results before the manuscript, however it is provided as a feature for future applications. The installing NATMI. total-expression weight is calculated as the product of the sum of expression of the ligand in a cell type and the sum of expression of the receptor in a cell-type: edge → sum = sum (cell-type1 cell-type2) ligand1-receptor1 cell-type1 ligand1 × cell- Received: 30 March 2020; Accepted: 16 September 2020; sum type2 receptor1. Weighting by total-expression acknowledges that each cell type in the network is present at different abundancies, thus a cell type that weakly

NATURE COMMUNICATIONS | (2020) 11:5011 | https://doi.org/10.1038/s41467-020-18873-z | www.nature.com/naturecommunications 9 ARTICLE NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-18873-z

References 34. Hunt, D. et al. Spectrum of dominant in the desmosomal cadherin 1. Ramilowski, J. A. et al. A draft network of ligand-receptor-mediated desmoglein 1, causing the skin disease striate palmoplantar keratoderma. Eur. multicellular signalling in human. Nat. Commun. 6, 7866 (2015). J. Hum. Genet. 9, 197 (2001). 2. Hashimshony, T., Wagner, F., Sher, N. & Yanai, I. CEL-Seq: single-cell RNA- 35. Kim, J. H. et al. A homozygous nonsense in the DSG3 gene causes Seq by multiplexed linear amplification. Cell Rep. 2, 666–673 (2012). acantholytic blisters in the oral and Laryngeal Mucosa. J. Investig. Dermatol. 3. Picelli, S. et al. Smart-seq2 for sensitive full-length transcriptome profiling in 139, 1187–1190 (2019). single cells. Nat. Methods 10, 1096–1098 (2013). 36. Ayub, M. et al. A homozygous nonsense mutation in the human desmocollin- 4. Macosko, E. Z. et al. Highly parallel genome-wide expression profiling of 3 (DSC3) gene underlies hereditary hypotrichosis and recurrent skin vesicles. individual. Cells Nanoliter Droplets. Cell 161, 1202–1214 (2015). Am. J. Hum. Genet. 85, 515–520 (2009). 5. Hashimshony, T. et al. CEL-Seq2: sensitive highly-multiplexed single-cell 37. Hafner, C., Becker, B., Landthaler, M. & Vogt, T. Expression profile of Eph RNA-Seq. Genome Biol. 17, 77 (2016). receptors and ephrin ligands in human skin and downregulation of EphA1 in 6. Han, X. P. et al. Mapping the mouse cell atlas by Microwell-seq. Cell 172, nonmelanoma skin cancer. Mod. Pathol. 19, 1369 (2006). 1091–1107 (2018). 38. Walsh, R. & Blumenberg, M. Specific and shared targets of ephrin A signaling 7. Rosenberg, A. B. et al. Single-cell profiling of the developing mouse brain and in epidermal keratinocytes. J. Biol. Chem. 286, 9419–9428 (2011). spinal cord with split-pool barcoding. Science 360, 176–182 (2018). 39. Ventrella, R. et al. EphA2 transmembrane domain is uniquely required for 8. Cao, J. et al. The single-cell transcriptional landscape of mammalian keratinocyte migration by regulating Ephrin-A1 levels. J. Investig. Dermatol. organogenesis. Nature 566, 496–502 (2019). 138, 2133–2143 (2018). 9. Saikia, M. et al. Simultaneous multiplexed amplicon sequencing and 40. Proctor, D. T. et al. Axo‐glial communication through neurexin‐neuroligin transcriptome profiling in single cells. Nat. Methods 16,59–62 (2019). signaling regulates myelination and oligodendrocyte differentiation. Glia 63, 10. Keren-Shaul, H. et al. MARS-seq2.0: an experimental and analytical pipeline 2023–2039 (2015). for indexed sorting combined with single-cell RNA sequencing. Nat. Protoc. 41. Stogsdill, J. A. et al. Astrocytic neuroligins control astrocyte morphogenesis 14, 1841–1862 (2019). and synaptogenesis. Nature 551, 192–197 (2017). 11. Skelly, D. A. et al. Single-cell transcriptional profiling reveals cellular diversity 42. Almanzar, N. et al. A single-cell transcriptomic atlas characterizes ageing and intercommunication in the mouse heart. Cell Rep. 22, 600–610 (2018). tissues in the mouse. Nature 583, 590–595 (2020). 12. Magella, B. et al. Cross-platform single cell analysis of kidney development 43. Reedijk, M. et al. JAG1 expression is associated with a basal and shows stromal cells express Gdnf. Dev. Biol. 434,36–47 (2018). recurrence in lymph node-negative . Breast Cancer Res. Treat. 13. Camp, J. G. et al. Multilineage communication regulates human liver bud 111, 439–448 (2008). development from pluripotency. Nature 546, 533–538 (2017). 44. Wang, D. et al. Identification of multipotent mammary stem cells by 14. Cohen, M. et al. Lung single-cell signaling interaction map reveals Basophil receptor expression. Nature 517,81–84 (2015). role in macrophage imprinting. Cell 175, 1031–1044 (2018). 45. Wang, D. et al. Protein C receptor is a therapeutic stem cell target in a distinct 15. Jerby-Arnon, L. et al. A cancer cell program promotes T cell exclusion and group of breast cancers. Cell Res. 29, 832–845 (2019). resistance to checkpoint blockade. Cell 175, 984–997 (2018). 46. Chakrabarti, R. et al. ΔNp63 promotes stem cell activity in mammary gland 16. Costa, A. et al. Fibroblast heterogeneity and immunosuppressive environment development and basal-like breast cancer by enhancing Fzd7 expression and in human breast cancer. Cancer Cell 33, 463 (2018). Wnt signalling. Nat. Cell Biol. 16, 1004 (2014). 17. Hrvatin, S. et al. Single-cell analysis of experience-dependent transcriptomic 47. Xiao, Y. et al. Integrin α5 down-regulation by miR-205 suppresses triple states in the mouse visual cortex. Nat. Neurosci. 21, 120–129 (2018). negative breast cancer stemness and metastasis by inhibiting the Src/Vav2/ 18. Efremova, M., Vento-Tormo, M., Teichmann, S. A. & Vento-Tormo, R. Rac1 pathway. Cancer Lett. 433, 199–209 (2018). CellPhoneDB: inferring cell–cell communication from combined expression of 48. Lindvall, C. et al. The Wnt co-receptor Lrp6 is required for normal mouse multi-subunit ligand–receptor complexes. Nat. Protoc. 15, 1484–1506 (2020). mammary gland development. PLoS ONE 4, e5813 (2009). 19. Baccin, C. et al. Combined single-cell and spatial transcriptomics reveal the 49. Breunig, C. et al. TGF β1 regulates HGF‐induced cell migration and molecular, cellular and spatial bone marrow niche organization. Nat. Cell Biol. hepatocyte growth factor receptor MET expression via C‐ets‐1 and miR‐128‐ 22,38–48 (2020). 3p in basal‐like breast cancer. Mol. Oncol. 12, 1447–1463 (2018). 20. Cabello-Aguilar, S. et al. SingleCellSignalR: inference of intercellular networks 50. Kalscheuer, S. et al. Discovery of HSPG2 (Perlecan) as a therapeutic target in from single-cell transcriptomics. Nucleic Acids Res. 48, e55 (2020). triple negative breast cancer. Sci. Rep. 9, 12492 (2019). 21. Noël, F. et al. ICELLNET: a transcriptome-based framework to dissect 51. Chakravarthy, R., Mnich, K. & Gorman, A. M. Nerve growth factor (NGF)- intercellular communication. Preprint at https://www.biorxiv.org/content/ mediated regulation of p75(NTR) expression contributes to chemotherapeutic 10.1101/2020.03.05.976878v1 (2020). resistance in triple negative breast cancer cells. Biochem. Biophys. Res. 22. Keshava Prasad, T. et al. Human protein reference database—2009 update. Commun. 478, 1541–1547 (2016). Nucleic Acids Res. 37, D767–D772 (2008). 52. Regev, A. et al. The human cell atlas. eLife 6, e27041 (2017). 23. Sayers, E. W. et al. Database resources of the National Center for 53. Doğaner, B. A., Yan, L. K. & Youk, H. Autocrine signaling and quorum Biotechnology Information. Nucleic Acids Res. 47, D23–D28 (2019). sensing: extreme ends of a common spectrum. Trends Cell Biol. 26, 262–271 24. Schaum, N. et al. Single-cell transcriptomics of 20 mouse organs creates a (2016). Tabula Muris: The Tabula Muris Consortium. Nature 562, 367 (2018). 54. Chacón-Martínez, C. A., Koester, J. & Wickström, S. A. Signaling in the stem 25. Geddes, T. A. et al. Autoencoder-based cluster ensembles for single-cell RNA- cell niche: regulating cell fate, function and plasticity. Development 145, seq data analysis. BMC Bioinforma. 20, 660 (2019). dev165399 (2018). 26. Zhang, H. et al. Endocardium minimally contributes to coronary 55. Browaeys, R., Saelens, W. & Saeys, Y. NicheNet: modeling intercellular in the embryonic ventricular free walls. Circ. Res. 118, 1880–1893 (2016). communication by linking ligands to target genes. Nat. Methods 17, 159–162 27. Claycomb, W. C. Atrial-natriuretic-factor mRNA is developmentally regulated (2019). in heart ventricles and actively expressed in cultured ventricular cardiac 56. Kumar, M. P. et al. Analysis of single-cell RNA-Seq identifies cell-cell muscle cells of rat and human. Biochem. J. 255, 617–620 (1988). communication associated with tumor characteristics. Cell Rep. 25, 1458–1468 28. Cui, Y. et al. Single-cell transcriptome analysis maps the developmental track (2018). e1454. of the human heart. Cell Rep. 26, 1934–1950 (2019). 57. Cillo, A. R. et al. Immune landscape of viral- and carcinogen-driven head and 29. Hökfelt, T. et al. Cellular localization of somatostatin in endocrine-like cells neck cancer. Immunity 52, 183–199 (2020). and neurons of the rat with special references to the A1-cells of the pancreatic 58. Choi, H. et al. Transcriptome analysis of individual stromal cell populations islets and to the hypothalamus. Eur. J. Endocrinol. 200,5–41 (1975). identifies stroma-tumor crosstalk in mouse lung cancer model. Cell Rep. 10, 30. Jepsen, S. L. et al. Paracrine crosstalk between intestinal L-and D-cells controls 1187–1201 (2015). secretion of glucagon-like peptide-1 in mice. Am. J. Physiol.-Endocrinol. 59. Wang, S., Karikomi, M., MacLean, A. L. & Nie, Q. Cell lineage and Metab. 317, E1081–E1093 (2019). communication network inference via optimization for single-cell 31. Song, S., Li, X., Geng, C., Li, Y. & Wang, C. Somatostatin stimulates colonic transcriptomics. Nucleic Acids Res. 47, e66 (2019). MUC2 expression through SSTR5-Notch-Hes1 signaling pathway. Biochem. 60. Zhang, J. et al. Single-cell transcriptome-based multilayer network biomarker Biophys. Res. Commun. 521, 1070–1076 (2019). for predicting prognosis and therapeutic response of gliomas. Brief. Bioinform. 32. Jepsen, S. L. et al. Paracrine crosstalk between Intestinal L-and D-cells controls 21, 1080–1097 (2020). secretion of glucagon-like peptide-1 in mice. Am. J. Physiol.-Endocrinol. 61. Cang, Z. & Nie, Q. Inferring spatial and signaling relationships between cells Metab. 317, E1081–E1093 (2019). from single cell transcriptomic data. Nat. Commun. 11, 2084 (2020). 33. Qian, S., Fu, F., Li, W., Chen, Q. & de Sauvage, F. J. Primary role of the liver in 62. Tsuyuzaki, K., Ishii, M. & Nikaido, I. Uncovering hypergraphs of cell-cell production shown by tissue-specific knockout. Blood 92, interaction from single cell RNA-sequencing data. Preprint at https://www. 2189–2191 (1998). biorxiv.org/content/10.1101/566182v1 (2019).

10 NATURE COMMUNICATIONS | (2020) 11:5011 | https://doi.org/10.1038/s41467-020-18873-z | www.nature.com/naturecommunications NATURE COMMUNICATIONS | https://doi.org/10.1038/s41467-020-18873-z ARTICLE

63. Wang, Y. et al. iTALK: an R Package to characterize and illustrate intercellular Competing interests communication. Preprint at https://www.biorxiv.org/content/10.1101/ The authors declare no competing interests. 507871v1 (2019). 64. Pisco, A. O. et al. A single cell transcriptomic atlas characterizes aging tissues Additional information in the mouse. Nature 583, 590–595 (2020). 65. Giladi, A. et al. Dissecting cellular crosstalk by sequencing physically Supplementary information is available for this paper at https://doi.org/10.1038/s41467- interacting cells. Nat. Biotechnol. 38, 629–637 (2020). 020-18873-z. 66. Satija,R.,Farrell,J.A.,Gennert,D.,Schier,A.F.&Regev,A.Spatialreconstruction of single-cell gene expression data. Nat. Biotechnol. 33, 495–502 (2015). Correspondence and requests for materials should be addressed to A.R.R.F. 67. Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15 (2018). Peer review information Nature Communications thanks the anonymous reviewer(s) for their contribution to the peer review of this work. Peer reviewer reports are available.

Reprints and permission information is available at http://www.nature.com/reprints Acknowledgements We would like to acknowledge and thank Prof. A. Swarbrick, Dr. S. Wu and Dr. R. Tothill Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in for their feedback on NATMI. This work was carried out with the support of a collaborative published maps and institutional affiliations. cancer research grant provided by the Cancer Research Trust ‘Enabling advanced single-cell cancer genomics in Western Australia’, a grant from Cancer Council of Western Australia and an Australian National Health and Medical Research Council project grant Open Access APP1146323. R.H. is supported by an Australian Government Research Training Program This article is licensed under a Creative Commons (RTP) Scholarship. A.R.R.F. was supported by funds raised by the MACA Ride to Conquer Attribution 4.0 International License, which permits use, sharing, Cancer and a Senior Cancer Research Fellowship from the Cancer Research Trust. A.R.R.F. adaptation, distribution and reproduction in any medium or format, as long as you give is currently supported by an Australian National Health and Medical Research Council appropriate credit to the original author(s) and the source, provide a link to the Creative Fellowship APP1154524. Analysis was made possible with computational resources pro- Commons license, and indicate if changes were made. The images or other third party ’ vided by the Pawsey Supercomputing Centre with funding from the Australian Government material in this article are included in the article s Creative Commons license, unless and the Government of Western Australia. J.A.R. is supported by a Research Grant from indicated otherwise in a credit line to the material. If material is not included in the ’ MEXT to the RIKEN Center for Integrative Medical Sciences. article s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/ Author contributions licenses/by/4.0/. NATMI was developed by R.H. with help from J.A.R. The project was conceived by A.R. R.F. The updated ligand–receptor pair lists were curated by A.R.R.F., H.T.O., and J.A.R. © The Author(s) 2020 The paper was written by R.H. and A.R.R.F. with help from E.D. and J.A.R.

NATURE COMMUNICATIONS | (2020) 11:5011 | https://doi.org/10.1038/s41467-020-18873-z | www.nature.com/naturecommunications 11