<<

OPEN Network Analysis Reveals Functional

SUBJECT AREAS: Cross-links between Disease and CANCER GENETICS DATA INTEGRATION Inflammation GENETIC INTERACTION Yunpeng Zhang1*, Huihui Fan1*, Juan Xu1, Yun Xiao1, Yanjun Xu1, Yixue Li1,2 & Xia Li1 BIOINFORMATICS

1College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, Heilongjiang 150086, Chin, 2Shanghai Received Center for Bioinformation Technology, Shanghai, People’s Republic of China. 11 July 2013 Accepted Connections between inflammation and diseases are suggested important in understanding the genetic 18 November 2013 mechanisms of diseases. However, studies on the functional cross-links between inflammation and disease genes are still in their early stages. We integrated the –protein interaction (PPI), inflammation genes, Published and –disease associations to construct a disease-inflammation network (DIN). We found that nodes, 5 December 2013 which are both inflammation and disease genes (namely inter-genes), are topologically important in the DIN structure. Via mapping inter-genes to PPI, we classified diseases into two categories, which are significantly different in Intimacy measuring the contribution of inflammation genes to the connections between disease pairs. Furthermore, we constructed a cross-talking subpathways network. As indicated, the Correspondence and cross-subpathway analysis shows great performance in capturing higher-level relationship among requests for materials inflammation and disease processes. Collectively, The network-based analysis provides us a rather should be addressed to promising insight into the intricate relationship between inflammation and disease genes. Y.X.L. ([email protected]) or X.L. (lixia@hrbmu. ne of the major tasks for contemporary biology and medicine is to decipher the underlying mechanisms edu.cn) of complex diseases. Inflammation has been proposed as the seventh hallmark of cancer1, which O significantly contributes to different development stages of various diseases. Researches on the links between inflammation and disease genes are thus helpful to understand the complex nature of human genetic * These authors diseases. However, few systematic studies on the functional links have been reported. contributed equally to During the past decades, great efforts have been dedicated to identifying disease-related genes, , and this work. metabolites, which directly or indirectly connect to each other through computational or experimentally vali- dated interactions. In an early study on inflammatory diseases, such as rheumatoid arthritis and inflammatory bowel disease, Heller et al.2 identified disease-related genes by using cDNA microarrays. Later, with the advent of sequencing, Jones et al.3 implicated PALB2 as a susceptibility gene of pancreatic cancer with the use of exomic sequencing. An early systemic study of human disease genes completed by Wu et al.4, integrated human protein– protein interactions and known gene-phenotype associations to systematically predict disease genes related to various phenotypes. Turner et al.5 and Furney et al.6 also contributed to the studies of human disease genes. However, few studies have focused on the relationships among genes corresponding to different disease pheno- types. Goh et al.7 constructed a disease phenome network (human disease network, HDN) and a disease genome network (disease gene network, DGN), which indicated a common genetic origin of many diseases through disease-gene association pairs. Recently, inflammation, as an important factor in the initiation and progression of various diseases, has been generally accepted. Donoso et al.8 explored the role of inflammation in age-related macular degeneration. In another study on colorectal cancer, Itzkowitz et al.9 emphasized the important roles of inflammation. Nevertheless, none of the studies have systematically characterized the functional links between inflammation and disease genes at a network level. In this study, we focused on inflammation, controlled by both environmental and genetic factors and which is one of the most pivotal factors in inducing various diseases, to study the functional cross-links between inflam- mation and disease genes, as well as the mechanisms of their action. We integrated the human PPI network, inflammation genes, and gene–disease associations to construct a disease-inflammation network (DIN) and then dissected the topological characteristics of the network. In order to describe the relationships between inflam- mation and disease genes from a systems perspective, we classified diseases into type I diseases, which are significantly associated with inflammation genes, and type II diseases, which are not. Subsequently, we defined Intimacy as the contribution of inflammation genes to the connections between one disease and another, which

SCIENTIFIC REPORTS | 3 : 3426 | DOI: 10.1038/srep03426 1 www.nature.com/scientificreports tends to be higher in type I diseases. Finally, we showed that inflam- first examined the overlap among disease genes, inflammation genes, mation genes make great contributions to connections between most and PPI nodes (Figure 2A). As shown, we could classify the nodes in genetic diseases, especially in the associations between infection and the DIN as disease genes only, inflammation genes only, or both immunity, as well as between infection and cancer, at a network level disease and inflammation genes (henceforth, inter-genes). Subsequ- and a subpathway network level. ently, we determined the topological characteristics of the DIN, such as the degree, clustering, and topological coefficient. The degree Results distribution followed p(k) / k20.9536 using all the nodes in the DIN Construction of the DIN . The local inflammatory microenviron- (Figure 2B), which showed that the DIN is a scale-free biological ment, which is essential for the initiation and progression of various network. In a scale-free network11,mostofthenodeshaveonlya diseases, can be depicted by representative inflammation genes. To few interactions, whereas a few nodes with a large number of interpret the functional links between genetic diseases and inflam- interactions are tended to be hubs. We further examined the degree matory microenvironment, we used disease and inflammation genes distributions of three kinds of nodes in the DIN. As shown, the to construct a disease-inflammation network (DIN), integrating PPI general degree distribution of nodes that are inter-genes was information and gene–disease associations. There are 2831 disease- significantly greater than that of nodes that are disease genes only, related genes curated by the GAD and 231 inflammation genes generated from the GO database10. We mapped all these genes to with a p-value of 0.005 (Wilcoxon’s Rank-Sum Test, Figure 2C). the PPI network of the HPRD and then extracted the maximal Similarly, it was also significantly higher, when comparing the connected component as the DIN (Figure 1). degree distribution of inter-genes with inflammation genes only (p As shown in Figure 1, the DIN contained 1867 nodes with 6252 5 0.0012, Wilcoxon’s Rank-Sum Test, Figure 2C). While those nodes interactions. Finally, 1815 disease genes and 160 inflammation genes in the DIN that are inflammation genes only had a median degree of were included. 3, which is definitely equal to that of nodes that are disease genes only (p 5 0.0628, Wilcoxon’s Rank-Sum Test, Figure 2C). Besides, in order Dissection of the DIN. To delicately depict the functional cross- to properly control the analysis of inflammation genes and disease links between disease and inflammation genes in the network, we genes, we again compared them with non-disease genes in the original

CALCA

HCRTR22 PPRX HBE1

AMY1Y1A CALCRLCCRL TNFT FSFF44 HBA2 ABCC8 RDH5 LAAMCAMC2HCRTT APOC2 PLAA HBGHBGG2 CCXCL6 HCRTR1HC 1 CCR7CCCRC 7 DRD4 CXCXCLCL3 CCL2CC 24 LAMAMA2 HRH4HRH44 CHCHRM2HR 2 CYP2C9C 9 GNRHRR P2RYY122 CXCR3CCXXCRC 3 VIP AAKAP1KAP110 PTPRPRT APOC1 SLC2C22AA4 JPHJJPPH2 ACTN3CCYYP17AGCKAK1R ABCBCCCC2 KEL PIGRR CCCL1CCCL1CLCL119 RLBPR B 1 GCGCGR SLCSLLCCO1ACO A2 AVP CCYPP2P2E11 AVPAVVVPR11B ADCDCCYAPPTGPGDRGGD1 R CYYPCYPCCY1YYPAP2P222C19 MMTUS1MTTUS11 AADRARA2RRAA2B EDNREDNE NNRRA APOL1 FKBPF P1B PODXDXL EDN33 KIR2DK 2DDLDL4 TTFR2R22 APOA F DRP2DCETPCETCEETPETT2P CHRM1C LGALSS2 KCNJ1KCCN 1 SHHS KLRCK 2 NPR2 ADAM1AMAM1M 0 NPHSS2 PRRCPCPLOPOXSPPIN PCSKPCSCSC K9 GJB6 BPI P2RY1P2RRYY1 CCL2LL220 GCCK NPPCN PCP SCG3 HFEF GLGLS ABABCCBBCCCC9 IHH CCL223 LAMB GNGT1 OXTROXOOXTXXT CACNCACNA1CAAC 1C KIR3DL1 PVR CACNA1A11A FCACAR KCNQ4 GP1BB LOXLXL1 HBAA1 ADORA2AADDORRARA2A2A SLC2C A2 CD1B SAAA2 RGS10S110 SCN4SC 4A CACCNA1CN 1S SOD3 ACACE AOXA 1 KIR3KIR3DS3DDS1 SCARB1 PDLD IIMM5 CNNP CD1A NPFFRNNP FFR22 TBXABXA2R HTR3A SESEMAEMEMAMA4F CCHCRCCHCC CCRR1 PORPPOOOR KLRDLRDD1 GHRH VAPAPB SNAPAP29 SLCLLC1CC11A1 HLA-G ACE2 XK LCL T NPR3NP 3 CRRH NPPN B RERECE K AVPAVVPPR1R1A DRDDDRRD5RDD DKKD K1 CHCHGH A ADORA1 HTHHTR1FTR F RYRRYR22 LDHLDHHA MUUTETEPSEDESSMDMBMCRHCRCRHR8HRHHR2R22 MPOO KISSK 1R SNTBSSNN 1 CACACNA1AC AA11D cancer CRHCRRHBP MMATN3 GPR50GGPGPR5PRPR50 SLCLLC6CC66A2CDCD1ED GHRHRR CPLCPCPLXPLXX1 NMB PTGIRPPTGI TATAPP2 NPHS1PHHSHS11 GABGABRA4 KIR2K R2DS1 UC4UUCC4 SLC11A6 KIR2DR2DDSS4 SNCSNCC FFHHL1 KIRK R3DR33DL3D 2 CRHR1CCRHRHR1 HHLA-- KIRKKIR2D2DLD 1 aging TRTRDMT1R 1 CD2424 GJBGJBB2 KIR2DR DS2 GHSR GP9G TNFSF1F 5 T SNAPA 25 SLC6A4 PYYY DTNDTTNT A GNBGNGNBN 1 TAPBP KIR2DL2 RNASE1 FZD3Z TNXT B UMPS CCP OXTOOXT cardiovascular NPY1R CD1DCCDD1DDD11D1D NPY2R NPY5R NR1NRNR B SCNN1A SEELE P RNH1 GHRL LTB GAGGABAABBRG2 FTH1 CD8D8D8A SLC6LLCC6C6A9 AANSCNNSSCNCNNCCNG NN1GG CALC U ABCAABBCACA1 PIKPIP K3C3C3 MMP12MP 2 MTMTNR1N 1A SHMTMT2MMTTT22 HHLA-DRBB1 CLOCK HLA-H A A HIVVEP3VEVEP3TPTP33P1 chemdependency NEDD4DDD4D4L KRKRT19 MAOOA GABRG3 CCS MEP1ME 1 RAIRRA 1 DTNBBP1 HLAHLA-DMA-DMA MB HOMERH MERMEM R1R1 HLHLA-DMHLAA MA SERPINFSE PINNFF1F RYR1RYYRGJBGGJJBJ1B1 DNDNASEA 1 TRA@ GJAG 5 GABRG1 PDLDLIM1 FTL CASCASASR CECENPENPEE PEE HESX1 S100AS10S100SS100A10A0A10AA12 GRGGRIKR K4 BBSB 4 developmental TNFRSRSF6 SCNN1B NPYN MMYL2 PRKGPR 1 KCNKKCNQ2 TNFSF8T 8 SERPRPRPIPPINIPIINI I11 TACRR1 NPPPPPA DLXD 3 PCPCM1CMMM1 APAP3BB1 UCP1 AADAMTMTTSSTS1T8SIA13SIAIAA2 KIFKIFFF2A F3F F7F7 AGEA ER LITAL TTAAF PYGPPYYGGMMCYP19ACYP1C A1PROP1 PTHPTHH SORLS RLL1 CFTRCFCFTFTR MPZLMMPP GAD1 ADD3 TGM1M1M1GABARAGABABARAA AP HLA-DRBH 5 HLA-DRHLHLAHLA-DDRRA GP1BGGPP1BPP1BA TNFRSFTNFRRSFF8F8 TMPRSTTMPTMMMPRMPPPRRSRSSSS6S6 TMSBTMTMSBSB4X XRCC3 hematological F10F F99 CRYGCCRCRYGRRYY C BUB1B11B PLA2G5PPER3PEER GP5 A3 PROCPRPROPROCOOCC TNNT2TTNN T2 GFRARA2 LRPLRL 6 KCNEKCK 4 GYS1YS1 TPM1 PPP1R33A GGAD2 TACRCRR2 TFPI2TFPITFPTFFPPI2P 2 MYO9MYOM 9B PCMT1PCPCMTT1 DEK PGM1PGMPGPGM1 CCPBPB2 ZIC2 PROPRROROOSOS1S1 SSERRPMGMMGSINAA1ST3ST03 GGCXG X DLSDLDLSL T SOX3 TACACRR3 TNNI3TTN I33 PKDKD2 LDHLDHHB PER22 KCNEK E2 KCNEKKCKCNC 1 immune SULT2BSU B1 TAC4 PITX2PITTXX2X2 HPNN IISSLS 1 SLN KLK1 PRSSSS1 S10S 00A1 RADRAAD51D515511C PLPLN SLC5LC5A7 EN1

ST8SIAS 4 TTIMIMMELELESL SS SSNAISN I1 FANCFANANCG CIDEEA INNPP1 THBTTHHBH D FG PER1ER POU3F4POU3F4F44 ZICC1 infection BUBUBB3 PROCCRRPROZZ S1S10011000B RARAX KCNKCKCCNNQ1 NKX2-2 GLIG 2 ACA D SLC12A2AA3 ST14S 4 PTGS2PTPPTGTTGSGSS2S2 GRMGRRMM3M SPINK5 BUBB1 ACADA ADDM TSPYL1 KCNHKCKCNKC 2 ZNF350ZN 500 COMPCOCOMMP TSHTSHB IL122 kidney RREGREEG33A EFEMPEFEMEF MPMPP11 MMRN11 ARGARG2 GDNGGDNNFF RAD54L COL1OLOLL111AA1 S10100A2 LMAN1 D1D1 ALXX4 RECQL CCFICF NME1NMNMEME1E1 TPMTPPMM3M3 STSTK1TK 1 GH1G LOXLLO 4FCGR33B GFRAFRARA1 CCCCK BACEBACEE2 C1QA AANAT EIF2B5 DDCDDDC WWNK4 FGGF C4C4 GPX1X1 G6G6PCC SHSHFM1HHFF EGR2 POPOLR2L B metabolic PAWPPAAWWR TIPPARA P TYRP1TTYY 1 NCNCOA7 IIL18RILL18RL188RRARAP AZU1 AGRGRP JPHH3 RPA3 C1RC1R PKLPKLR DLCLLCC1 ERCC44 PPPARGPPAPPARARGCAARARGRGCGC1C1B HAPAPP1 PCBDPCBDC D1 POMCPOP C TERRF1RRFF11 POU5POPOU5F1OU5FU5FU55F15FFF11 LIGL G1 CRABPR BPP2 ININHBI B VHVHL DEEFB1 ERCC1 DISC11 PAXAX6 HOXC4HHOXOXC4OXCXC4XCCC44 REG1A MC4RC44R CSTSTTACOL9L9A1 SULT1SSULSUULULT1ELT1LTT11E1E1 BRCBRCACACA2 PRDM2PRRDMRDRDMM2 POU3F2P 3F3 2 LIGG4 AALOX5AAP F13F 3B OLIGOL 2 neurological CC66 COL9L9A3 NCDNCCD2 ALOXALALOALOXLLOOXX5 AAICDA LECTCT1 HLA-DPAAA-DP-D- A1 TTK RNF1RNRNF11391 9 C55 CD209CD209 SOSOXSOX10100 PPOPOOOX FLG CLLEC4C M CSTCCSSTT3T3 CD163CD1CDCD16D1D1163166363 PCSK6 INHINN A ILL18BP C3 PLPPLPP1 SIMMM1 CCKCCCKBR CD86CDCD8866 MYMMYOYO6YOYO6O6 LL1CACAM MECP22 normal variation TUBBTTUUBB1 ICACAMAM CCKACKAR NAT2 MC5RMMCC55 MASPM SP2 AHRR HLA-DHLA--DDDPBPB1 CD80C GRIKGRG K3 MC1RMMCC1R1R PMCHP CH MASPM 1 T6GGALNG HEYNAY2YAC2C2 IGHG1IGGHGGGHG1GHHGHGG11 PAARP3 TYRR SLC3L 3A2 other NRF1NR IL13RAL133 2 MBLBBL2 CR1CRR1 HTR22KIKAR2RR2DDL5DDLL5A MC2RMC TNS1TTNNS1N BBTTLA SLAMF1 NAT1 C2C2 C4C4SFTB TPDD CKCKM

MC3R3 C3ARC3ARR1 IL5ILIL5L5R5RA ASIA P LILLRBLRRBRB4 FCN2 ITGAX CHRMCHRMM33FKBPFKBP5 FAFABABPSSP4PRY2PRYP Y2Y2 HTRRAA1 CCALCCAALA CR MGEAEEAA5 CDCD72D7D772 pharmacogenomic HLLA-DRBB3 TSNATSTSNSNAS AX NAA10 CFFB IAAAPPP TNFTTNFSNFFSF1FSFSF1SSFF1F 3 IL4 FCFCER2CERR22 RPGR NRIP2 EPHBEP 1 CD55CDCCD5 ERCC6C MUTYHYHH CFHR5HR5R PAX9 NTF3NNTTF3F3F3 PAK3 ILI 5 ERCC5 MED12ME GATAATA6 ZFHHX3 NPHP4 CFHR4FHRHR44 INNSRR psychological IL12B SYNN2 RTRTTN4RN R DBP SYNSYYN3N33 ABCCCBB1 PRF1P SLSLC16AA2 SSSTR5TR5 ACATAT2 PON2 PCPCSKCFK1FHR3HHR3 SLC7C77A8 ENPP1 XBPXBXBP1 MRC1RC11 SLCC33MCACHR1CHC1 R1 PCSKPCSSK2 MINK1 FOXAXA2 ITGAIT AL TWITTWWWISWIIISSST1ST1 MUC20M CKB IGKIG C RENR ONEONONNENECEECUTECUECCCUUTUT1T FRK RPGRIPP1 IL2L23L22323R MSHMSMSH6H6H6 reproduction LIPE MYLMYYLL3 FOXP3 GLULLUUL CCDDC42SED 2SEE2 MFAP4 ANKANK2 HRH LIGG3 NEFH IL12RB12R2RB2RRBB2 DMTF1DDMMT 1 IL1ILL1L15 NOTNNOOTCH3OTC 3 DYNCDYNC1HNC11HH11 MPGPG OPTNN COLCOLL171717A7A1 BMBBMPMPMP7 CDTCDTT1T1 DMBT1 SFTPA1 CIITIITTA IL1ILI 0 KLFKKLLLFF5 ARVARARVCRV F EXOC4EXE OC CCL18 DDBDDBB2 BRDB D2 unknown PSPSENENNENNEENEN NFE2LL1 TNFRSRSF1313B APHAPH1H1B PLAPPL GL1 HOXD4HOXDXD4D44 GDF5GDF5DFF55 ILL1L15R15R1 RA KLKLFF4 NEFEFL SLC7LC A5 PLCEPLPLCEE1 PVRLP 2 APPOPPOLFBXOOLOBXLXO2O11 M6PR6 IL12RBIL 2 1 TPH1 BMPB P4 AMHMHHR2 TFCP2TFTFCCPCP2P HNH F1B IL23A MGP FAFAMAAMM46A IL9I 9R MSH3HH3 SCG5 MMERTRTR K RYRYK WNKWN 1 CDH23CDH2DH23 XPCXPXPCPC vision MT2MT2A IL10RL1L10RA S10SS100A0 A GNBGGN 3 CBCBBFB EPBPPB4B42 MEFVMME V PPAFAHAH1BB1 GGAS6 TTNFRSN SFFF1117 TNFRSF10DTTNFRSF1FRRS SPRSP SP44 RENNBP CDD97 NOGN EHMT2EH CES1CCE BMP8P B CAC D TNFSF133B TMPRSSTM S3 USH1CUSH C MAMAD2L2 GAGATAATATA4 GABPGAG BPA MYBPYBB C BRIP1 KLLFL 2 EXXO1XO KLF12 SPISP NK7 GGAAATAAT 3 MD SSTR1T ARSR A PRL TUBTUBBAB 8 LZTSS1 GUSSB GCSHGCGCSHH ZFPMZFZFPPM2P 2 MYM D88888 CAMAM2 HOHOOXOXD13D13 CYBCCYYBBB IRF4IIR 4 HOXBXB6 IFI2I22727 RRAAD23A B SSTSST IL21I R PPTGDSPTGDPT DSS ZFZZFP36FP3366 TLRTLR6 IL1A TOR1TOOR1A HOXD1HOXD10OXDXXD1D10 TTFAPTFTFA 2B SMN1N11 ETV6ET 6 TTF1TTTFTF1F1 MTPNPPN OASO L KLF6 NEURNEUUROGOGCDCG1GDKN21KN D HSPASPAA2 USH1GG SFTPA2 DLD PRPH2RPH2PH THPOTTHHPO TCAT AP IRF8IRRF8 PANK4 NLRCNNLLRL C RNRNAASEEL BPYB 2 UCPUCPP33 TBPLT L1 ALLDHLDLDHH CYBCCYBA ATXNATTXXXN7 inflammation UBE2B D1 PTGESS ADIPORAAD OORR1 MTMR22 PTGER2GER2R2 BBMP5 IFNIFFNW1FNFNWNW1 CRXC X AKR1BRR1 1 IL10R0RB LSP1 NOD2OD TXNTTXXXNIPXNNIP RP11 PMS2 PTGER3GER3 PTGER1PTGT R11 PTGPTGES2BMBE 2PP115 SAT1 MBDMBD4 DYDYX1Y 1C11C1 PHOXO 2A IRFF5 UBQQLN1 GGPD2 ATTTXNTXN2 NFATTC4 TNFRSF13CTNFRSF1 CLCN5 HPCAL1 CL2CCL2A1 FXNF FXYD6F CTNNNA3POU2AFOUU AF1 PTTGTGFRR HK USP40 TOR1B ILL9 ACCNC 1 NFATC3 CARD11 SCN1CN1B TCP1TCPC IL221 GABRB1 C2orf3 CARDR 8 ATP7AT B AIRRE UFUFD1D1L APOM PCYTT1A POLPOLI USP2SP2 KHDRBSD S3 SSTR44 NFKBIZ RRM2RR B POLOLO K CPT1CPPT1A SOD1SODOD KLF1K 3 MBD2 NFFRKFRRKKB CHRNACHRNNA4 NKX2-5 JARID2 REV1 SGMSS 1 SSERPISERPINHNHH1H1 PNNNP PMS1PM PRPHH ROM1R M IL1RLL L1 DNNMTT3B CASP5 SP100 GBP2 IL2IL 2 CYCYPYP11BB2 KIF5I A OAZ1 SCAS P IL1RN INSSSIG2GMG2GGLL NOXX44 NOX1NOXO NLRN P12 MLH3 NRL SLC4C A1 TXNRD1TXN 1 PDDC RBMX GLDC PPP3R1R11 TLR7 CYP11BB1 GABBRA3 CARD1CA 8 IL1F10

CHRNACCHRHRH 3 SLC22AS A1 GGABRD CHRNB2 TFAM HSPA13 NLRP2 RRM1M1 RHARHAAGOAOAZ3 SLC22AA2 ODDCC1 NLRPN 3

CHRNA2

Figure 1 | The disease and inflammation network (DIN). The network is constructed by mapping disease genes and inflammation genes to the PPI network and then generating the maximal connected component as the DIN. Inflammation genes are colored in grey and disease genes are colored according to their classes. Those genes that are both inflammation genes and disease genes are drawn with black border. MD denotes genes involved in multiple disease classes.

SCIENTIFIC REPORTS | 3 : 3426 | DOI: 10.1038/srep03426 2 www.nature.com/scientificreports

A HPRD Inflammation

Inflammation genes only

Inter-genes; namely both inflammation and disease genes Disease genes only

Disease

3 B 10 C

p<0.01 p<0.01

102

101 Degree Number of genes 101

0 0 10 10 100 101 102 Inter-genes Disease genes only Inflammation genes only Degree DE0 1 10 10

0 10-1 10

-2 -1

Clustering coefficient 10 10 Topological coefficient

-3 -2 10 10 0 1 2 100 101 102 10 10 10 Degree Degree

Figure 2 | Topological analysis of the DIN. The number of overlapping genes among the nodes of the PPI network, disease genes, and inflammation genes are shown in (A). Black arrow indicates the three categories of inter-genes, inflammation genes only and disease genes only in the DIN. (B) Degree distribution for all the nodes in the DIN is plotted on the x-axis, and the numbers of genes are plotted on the y-axis. (C) Degree distribution for three categories of nodes in the DIN. Significance tests were based on the Wilcoxon’s Rank-Sum Test and p values for comparisons between inter-genes and disease genes only, and between inter-genes and inflammation genes only are both less than 0.01. Clustering coefficients (D) and topological coefficients (E) for all the nodes in the DIN are plotted on the y-axis, and the corresponding degrees are plotted on the x-axis.

PPI network. As indicated, the general degree distribution is signifi- (p 5 2.4687e-75, Wilcoxon’s Rank-Sum Test, Supplementary Figure 1). cantly higher when comparing inflammation genes with non-disease Moreover, in the DIN, the shortest paths show a similar tendency just genes (p 5 6.8293e-07, Wilcoxon’s Rank-Sum Test, Supplementary as the degree distributions for the three kinds of nodes, with nodes Figure 1), as well as comparing disease genes with non-disease genes that are inter-genes having significantly shorter length of paths.

SCIENTIFIC REPORTS | 3 : 3426 | DOI: 10.1038/srep03426 3 www.nature.com/scientificreports

The clustering coefficient of each node in the network is a measure define the significance of the Intimacy for the pair of diseases of the tendency of nodes in a network to form clusters or groups12.In (Figure 5A). Figure 2D, we found that, with an increase in the node degree, the As indicated by the results, Intimacy, among IRD tended to be clustering coefficient decreases. In addition to clustering coefficient, higher than that in NIRD, which suggests the potential functions of the topological coefficient is also computed13, which is used to mea- inflammation genes in bridging the connection between each pair of sure the extent to which a node shares links with the others in a diseases. Biologically, the computation of Intimacy is designed to network. As shown in Figure 2E, with an increase in the node degree, take the direction of disease conversion between each pair of diseases the topological coefficient increases, which was consistent with pre- into consideration. Therefore, in order to measure the general level of vious researches on biological network structures14–16. Additionally, Intimacy between each disease pair, we ranked all disease pairs based in consideration of the accuracy of topological analysis, we also on the sum of Intimacy for each disease pair with direction informa- compared mean values of degree (Supplementary Figure 2), cluster- tion (i.e. sum of Intimacy from disease A to B and that from disease B ing coefficient (Supplementary Figure 3), and topological coefficient to A; Supplementary Table 1). The most associated disease classes (Supplementary Figure 4) of all nodes in the DIN with random bridged by inflammation genes are ‘‘immune’’ and ‘‘infection’’, distribution, separately. 1000 random DINs were extracted from which has already been supported by literature19–21. Therefore, 1000 random PPI networks, using edge permutation with degree of Intimacy defined by inflammation genes, which are also part of each node in the original PPI unchanged. As shown, all these topo- immune response and infection, contributing to the connections logical parameters were significantly higher as comparing with the between immune and infection is unsurprisingly the most relevant random distributions. one. Others ranking in the top 5 includes disease pair of cardiovascu- To further confirm the reality of the DIN, the number of nodes and lar and metabolic (Supplementary Figure 5)22,23, cardiovascular and edges in the real DIN were also compared with the random distri- immune (Supplementary Figure 6)24,25, and aging and cancer bution. As shown in Figure 3, the average number of nodes and edges (Supplementary Figure 7)26,27, whose connections have already been across 1000 random DINs were significantly smaller than that of the shown in relation with the bridgeness of inflammation genes. Disease real DIN. pair of normal variation and metabolic was suggested by us to be newly involved. As supported by the literature28,29, metabolic system Characterization of the functional cross-links between disease and is one of the most fundamental requirements for survival, whose inflammation genes at a network level. To explore the functional proper function is mutually dependent on immune response. cross-links between disease and inflammation genes, we further Inflammation could cause disequilibrium of the mutual dependence mapped 121 inter-genes to the human PPI network and then of metabolic and immune systems and then lead to chronic disorders calculated the significance of the overlap between inter-genes and of homeostasis. Beta-glucuronidase belonging to the disease class of the PPI network through hypergeometric distribution. With 121 normal mutation, which is generally known to be associated with inter-genes mapped to the human PPI network, a significant inflammation in the exudates from gingival30. We therefore inferred overlap was observed with a p-value of 3.6713e-32 (Figure 4A, the disease pair of normal variation and metabolic could be bridged hypergeometric distribution). Subsequently, we focused on the 121 by inflammation genes in proper conditions. overlapping genes, which formed a new network with the maximal To further explain in detail the Intimacy bridged by inflammation connected component containing 32 genes and 49 relationships genes, we focused on two specific disease pairs. The maximal con- (Figure 4B). As shown by the result, inflammation genes tend to be nected component of the genes of each given pair of diseases was associated with multiple disease genes, which indicates an important generated after mapping these genes to the human PPI network and role of inflammation genes in contributing to genetic diseases in the then defined as the gene module of this pair of disease classes. We network. We also examined the distribution of 121 genes into 18 took the gene modules of two pairs of diseases as examples: the pair of disease classes according to the GAD classification system by infection and immunity (Figure 5B) and the pair of infection and excluding the classes ‘‘unknown’’ and ‘‘other’’ (Figure 4C). We cancer (Figure 5C). Generally, genes with higher degree, such as showed that the class ‘‘immune’’ overlaps with most inflammation NFKB1 (Figure 5B), RXRA (Figure 5B), RELA (Figure 5C) and genes, which suggests that alterations in the focal inflammatory CCR5 (Figure 5C), tend to be hubs in the gene modules, and are microenvironment are more associated with immunity. Further- believed to have much more impact on the global structure of the more, some inter-genes are shared by different disease classes, module networks. The NFKB1 (Figure 5B) is the termed as multiple diseases (MD), as illustrated in Figure 4B, most abundant form of NF-kappa-B, which is complexed with which shows the multiple functions of inflammation in various the product of the gene RELA (Figure 5C). NF-kappa-B31,32 is a genetic diseases (Figure 4C). transcription factor that is activated by various intra- and extracel- To show the statistical significance of the overlap between inflam- lular stimuli, such as , oxidant free radicals, ultraviolet irra- mation and disease genes, the p-value and fold-enrichment ratios diation, and bacterial or viral products. The expression of genes (FER) were calculated (Figure 4D). Only four classes, namely regulated by Rel/NFKB members is involved in immunity and apop- chemo-dependency, developmental, normal variation, and psycho- totic and oncogenic processes; thus, NFKB133, as an inflammation logical disease, were not significant (p . 0.001). We termed these gene, is important in linking infection and immunity. Because NF- diseases as non-inflammation-related diseases (NIRD). On the con- KB plays a well-known function in the regulation of inflammation, trary, those diseases with a significant overlap with inflammation we thus reasoned that inflammation bridges infection and immunity. were termed as inflammation-related diseases (IRD). In addition to Epidemiologic studies34,35 have shown that chronic inflammation the most highly associated class ‘‘immune’’, the disease class cardio- predisposes individuals to various types of cancer. Furthermore, it vascular, infection, and cancer were also in relation to inflammation, is estimated that the underlying infections that could cause chronic which has been supported by literature17,18. In order to measure the inflammation are linked to approximately 15% of all deaths from contribution of inflammation genes to the connections between one cancer worldwide36. RELA, as an important inflammatory factor, is disease to another, we further examined the Intimacy of each pair of involved in linking infection and cancer through the mediator of diseases in the two disease categories of IRD and NIRD (details in inflammation37–39. We thus conclude that the inflammation genes Methods). For each pair of diseases, we constructed 1000 random NFKB1 and RELA are important in the link between infection and pseudo-inflammation gene sets, and then computed the random immunity, as well as between infection and cancer. Additionally, Intimacy values to construct a random distribution. By comparing disease gene RXRA (Figure 5B) and CCR5 (Figure 5C) have been the real Intimacy with the random distribution of Intimacy, we could partially validated to be linked to the process of immune40,41 and

SCIENTIFIC REPORTS | 3 : 3426 | DOI: 10.1038/srep03426 4 www.nature.com/scientificreports

A

0.10

0.05 Density

DIN

0 1740 1760 1780 1800 1820 1840 1860 1880 B number of nodes 0.10

0.08

0.06 Density 0.04

DIN 0.02

0 4800 5000 5200 5400 5600 5800 6000 6200 6400 number of edges

Figure 3 | The further analysis of the DIN. Density plot of the random number of nodes (A) and edges (B) in 1000 random DINs, with the real values of the number of nodes and edges in the DIN indicated by a red downward arrow. infection, respectively42,43. Nevertheless, further researches on these structure data. Using the iSubpathwayMiner software package, two and some other genes are needed to understand the cellular and disease class-related and inflammation-related subpathways were molecular mechanisms mediated by them underlying those complex generated according to 15149 unique gene-disease associations diseases. involving 18 disease classes and 2831 disease genes. Any two sub- pathways that are significantly enriched for common genes were Further dissection of the functional cross-links based on disease- connected by an edge in the final subpathway–subpathway net- and inflammation-related subpathways. To further assess the func- work if the gene overlapping between them was significant (p , tional cross-links between disease and inflammation genes, we 0.01, hypergeometric distribution). We thus constructed a subpath- constructed a subpathway–subpathway network based on disease- way–subpathway cross-talk network (Figure 6A) with 202 subpath- related subpathways, inflammation-related subpathways, and pathway way nodes and 716 edges.

SCIENTIFIC REPORTS | 3 : 3426 | DOI: 10.1038/srep03426 5 www.nature.com/scientificreports

CXXCXCLCLL1 CXXCCLL6 B CXXCXCLCLL22

CCXXCXCRR1R CCXXCRR2R cancer C55 aging cardiovascular IL88 C33 chemdependency CC33ARR1R developmental CCCLL4L CCXCRXCRR4R hematological CCCLL2L immune CCCCLCL1171 OORMM1M CC4AA CCCRR55 infection CCCCLCL11616 CCCLL8L kidney BMMMPRR1BR CCCRR4R CCCCLCL1111 CCCRCCR1R metabolic BBCLL6 BBMPP2P CCCLL3L3 neurological BBMPP6P CCCRR3R CCCLL7L HHDDACC4C TTGFBGFBB1B normal variation CCCCLCL2262 CXXXCLL10L CCEBPEBPPBP CCCCLCL2L224 CCIITTA other pharmacogenomic IIL1AL1AA RELAA TTLRR5R TTLRR1R1 psychological NNFKBB11 reproduction ILLL1RA1RAAPA RXRAA ITITTGBB2B unknown TOOOLLLIPLIL KKNGG1G1 FOSS TTLR1LR101 vision IIL1BL1BB CCD114 MD KKLLKBB1B inflammation ILLL233R3R ILILL233A CDDD400LG0LGCCD4404 CCYBBBBB CCYBBA TNNFFRRSSF4 TNNNFSSF4S pharmacogenomic (17) CCMAA1A SEERRPIINNA3 MMSMS4AS4AA2A AKT11 C normal variation (5) reproduction (13) neurological (23) vision (11) A psychological (13) metabolic (40) aging (10) kidney (20) 9028 genes in PPI network cancer (34)

189121 2168 Overlap infection (44) cardiovascular (48) Inflammation Disease chemdependency (7) developmental (6) hematological (9) immune (82)

D infection FER=8.44 kidney FER=7.24 immune FER=6.25 aging FER=5.49 cardiovascular FER=4.31 vision FER=3.86 hematological FER=3.74 reproduction FER=3.39 pharmacogenomic FER=3.30 metabolic FER=3.29 normal variation cancer FER=2.66 neurological FER=2.50 chemdependency developmental psychological 0 5 101520253035404550

Figure 4 | Analysis for the importance of inter-genes at the level of network. (A) The number of the overlapping genes between inflammation and disease in human PPI network. (B) A subnetwork was generated by mapping inter-genes to the human PPI network. (C) Distribution of inter-genes into different disease classes. (D) Fold enrichment ratios (FERs) were computed according to the overlapping between inflammation genes and genes from different disease classes.

SCIENTIFIC REPORTS | 3 : 3426 | DOI: 10.1038/srep03426 6 www.nature.com/scientificreports

A chemodependency 3 developmental normal variation 2.5 psychological neurological cancer 2 metabolic pharmacogenomic 1.5 reproduction -logP hematological

vision 1 cardiovascular aging 0.5 immune kidney

infection 0

kidney aging vision cancer infection immune metabolic neurological reproduction psychological cardiovascularhematological developmental LEPFCGR2B normal variation B FCGR2A pharmacogenomic C CXCL5 C7 chemodependency FCGR1A CRP FGF2 C4A C6 PTX3 IL1A C1QA ADRA1A C4B CASP1 CXCL2 IL1RN CR1 C5 infection CFH C3AR1 ADRB2 CXCR2 infection C2 IL1R1 TBP CXCR1 CFBMASP1 NR1I2 immnue CD46 MMP7 cancer LRP1 VDR TGFBR1 IL1B PPARG MMP13 IL8 GC MD CR2 THRB IL12A MMP9SPP1 MD C3 TGFB2 F2 TGFB1 MMP2 RXRA MMP3 CTSG PPARA TGFBR2 NFKBIATNFSF11 IGFBP3 TGFBR3 CCL7 AGT PARP1 CCL3 CCL4 MMP1 THBS1 CMA1 F5 IL2RA EDN1 CYBA CCR3 NCF1 AR CCL8 NCF4 CYBB EDN1 IRF2 CCL16 ACE CCL2 MMP8 NCF2 SERPINA3 SOCS3 CCL5 KITLG NFKB1 ESR1 NOS1 CCL11 IRF1 CCL3L1 BDKRB2 NOS3 CCR1 CMA1 ETS1 STAT6 AGTR1 STAT5A PPIA FCER1A NR3C1 S100A8 AGT MS4A2 IL23A CCR5 TLR9 PTPN6 PTPN11 HSPD1 IL12RB1 SMAD3 ARL6IP5 ITGAL PTPRCCXCR4 CD55 CXCL12 JAK1 TLR10 CREB1 PLCG1 ACVR2A TLR1 MMP1 IL23R TLR3 TLR6 CEBPB JAK2 SOCS1 SELE CAT EGFR AHR CD4 XRCC5 LBP CD14 TERT STAT1 EIF2AK2 RUNX1 XRCC6 TLR2 TIRAP CD4 BMPR1A PTK2 TLR5 DPP4 BMP6 PPP2R1B JAK1 STAT5A SELL SIGIRRPTGS2 TLR4 STAT3 SELPLG NCOA3 SELP PPP1R13L CCL2 CD24 TAF1 SELL HSPA8 VCAM1 TP53 IL1RL1 SELE CXCR4 IL4R MIF TOLLIP LCK CXCL10 CD40 IL13 STAT1 CCR5 PGR CD44 PPARD PTPN11 ELANE IRAK1 CCL11 IL6 MMP14 CXCR3 TRAF6 IL1RAP CCL16 CHEK1 CAV1 MS4A1 IL18R1 MMP3 ZBTB16 IRF3 CD40LG CCL3L1 RELA CXCL12PTPRC CCR1 MMP7 IGKC CCL8 MMP13 CCL3 CD40 IL1R1 F2 TP53BP2 CD40LG SPP1 IRF4 BCL6 IGHG1 CCL26 CCR3 PTGS2 IL1R2 CCL7 PRKCA PIK3R1 CCL5 CCL4 TRAF5 IL4R A2M IGFBP5 CTSG TP53 IL1RN IL1A CCR4 CCL24 NOTCH1 TP53BP1 IL1B MIF PML ADRB2 HCK MMP2 BTRC EP300 CCL17 MEN1 IL13 CTGF ITGAV VCAM1 NR3C1 AR TP73 NME1 IFNG ABL1 CASP1 IRF1 COL2A1TGFB1SPARC C3 IL13RA1IL13RA2 IL8 TNF NFKB1 STAT6 SMAD3 NLRC4 CXCR1 HRAS DAXX CSF1 IL2RA WT1 CXCL2 CXCR2 TNFRSF1B PARP1 NOD1 CASP9 CDK2 MDM2 TGFBR3 MMP9 TNFRSF1A CFLAR ESR1BRCA1 SMAD2 IL12A LTBP4 TXN GC CR2CR1 TGFB2 DCN GSK3B NLRP1 NOD2 CTNNB1 NFKBIA LRP1

Figure 5 | Intimacy heatmap of disease pairs and examples indicating the functional cross-links between inflammation genes and disease genes. (A) Examination of the Intimacy bridged by inflammation genes using all disease pairs from different disease classes. Diseases are ordered by their FER values (fold enrichment ratios), and minus 10-based logarithm p-value is showed as the right-side color bar. (B , C) Examples show the functional roles of inflammation genes in bridging the connections between disease pairs. Gene modules were generated from disease pair of infection and immunity, and disease pair of infection and cancer, separately.

In parallel with the two gene modules (Figure 5B and 5C) gener- as well as inflammation- and immune-related subpathways. Some ated from the PPI network, two disease- and inflammation-related new genes were also included, such as IRAK4, IRAK1, and MYD88. subpathway-based subnetworks extracted from the subpathway Shared by multiple immune-, inflammation-, and infection-related network were much more representative of the bridgeness of disease subpathways, such as path:05145_1, path:04722_1, and path: and inflammation genes. For example, five inflammation-related 04620_5, the inflammation gene IRAK144 is a critical mediator of subpathways (path:04620_3, path:04620_11, path:04722_1, path: innate immunity, which is also important in the immune response 05131_2, and path:04620_9) mediated the connections between corresponding to viral infection45,46. Similarly, another IRAK family infection and immunity (Figure 6B). In agreement with Figure 5B, member, IRAK4, is also in control of the immune response to intra- NFKB1 and RELA also emerged as the essential genes in linking cellular infection, such as Chlamydia pneumonia47. Furthermore, the infection and immunity when searching for the common genes sharing of path:05145_1, path:04620_3, and path:04620_11 by the shared by both inflammation- and infection-related subpathways, inflammation gene MYD88 is also critical in the connection between

SCIENTIFIC REPORTS | 3 : 3426 | DOI: 10.1038/srep03426 7 www.nature.com/scientificreports

Psychological-path:04730_2

A other-path:04610_1infection-path:05140_1 neurological-path:04720_1 neurological-path:04020_2 neurological-path:05140_1 neurological-path:05150_1

Psychological-path:04730_1 infection-path:04610_3 Psychological-path:04920_1 cardiovascular-path:04610_1 immune-path:05140_1 pharmacogenomic-path:00140_16 cancer-path:04012_1 pharmacogenomic-path:04920_1 immune-path:04610_1 infection-path:05145_6 reproduction-path:00140_1 hematological-path:04610_1 cancer-path:04620_4 metabolic-path:04012_1 pharmacogenomic-path:05214_1 pharmacogenomic-path:04510_1 developmental-path:05014_2 cardiovascular-path:04620_6 other-path:04510_1 immune-path:05140_5 pharmacogenomic-path:04010_4 infection-path:04620_5 metabolic-path:05214_2 infection-path:04620_6 cancer-path:05214_1 aging-path:04510_1 pharmacogenomic-path:05212_1 metabolic-path:05145_1 cancer-path:04722_1 infection-path:05160_4 infection-path:05142_1 cancer-path:05140_4 cancer-path:04210_5 metabolic-path:04520_3 metabolic-path:05212_1 infection-path:04012_2 cancer-path:04114_5 infection-path:04620_2 immune-path:05145_1 inflammation-path:04620_3 neurological-path:05200_7 pharmacogenomic-path:04722_1 metabolic-path:05223_5 inflammation-path:04620_11 cancer-path:05218_4 cancer-path:05145_1 immune-path:04062_1 cancer-path:05214_2 metabolic-path:04722_1 inflammation-path:04722_1 metabolic-path:04210_3 aging-path:04930_1 metabolic-path:04114_2 cardiovascular-path:04512_10 immune-path:04210_4 pharmacogenomic-path:04620_2 infection-path:05200_4 infection-path:05145_1 neurological-path:04620_2 infection-path:05212_2 metabolic-path:05214_1 immune-path:04620_2inflammation-path:05131_2 metabolic-path:04150_1pharmacogenomic-path:05200_1 metabolic-path:05200_3 hematological-path:04512_1 immune-path:04010_9 chemdependency-path:00380_6 cancer-path:05200_13metabolic-path:04620_5 reproduction-path:04114_1 metabolic-path:04960_1 metabolic-path:04510_1cancer-path:05200_5 cancer-path:04620_3 developmental-path:05145_5 metabolic-path:05218_7 cancer-path:04920_1 other-path:04930_1pharmacogenomic-path:05214_2 cancer-path:04010_3unknown-path:04930_1 cancer-path:05142_1 neurological-path:00350_1 cardiovascular-path:04920_1 metabolic-path:04115_1 metabolic-path:00350_1 infection-path:05140_2 pharmacogenomic-path:05218_4 cardiovascular-path:03320_1 pharmacogenomic-path:04960_2metabolic-path:05200_7 vision-path:04115_1 cardiovascular-path:04512_12 chemdependency-path:00380_4 inflammation-path:04620_9 cardiovascular-path:04115_1 Psychological-path:00590_1 metabolic-path:03320_1 other-path:04115_1 pharmacogenomic-path:04115_1metabolic-path:05200_16 developmental-path:05200_15 immune-path:04630_2 cancer-path:05214_4 metabolic-path:04630_2 cancer-path:05142_15 reproduction-path:00982_21 infection-path:04650_1 pharmacogenomic-path:04630_1 metabolic-path:05215_3cancer-path:04115_1 other-path:00350_2 Psychological-path:04722_1 metabolic-path:05222_2 neurological-path:04114_4immune-path:04115_1 developmental-path:04115_4 cancer-path:05142_11 other-path:00982_21 vision-path:04110_1 pharmacogenomic-path:00982_23 cancer-path:05200_10 cancer-path:05200_14cancer-path:05200_29 variation-path:00982_2developmental-path:05416_3 metabolic-path:04920_1 pharmacogenomic-path:05221_1 cancer-path:04110_1 hematological-path:00982_19 cancer-path:05215_4 metabolic-path:00982_20 metabolic-path:04350_6 pharmacogenomic-path:05215_3vision-path:04610_1 developmental-path:04210_4 infection-path:04630_1 neurological-path:00982_21 metabolic-path:00052_2 normal cancer-path:05220_10 aging-path:04910_1 metabolic-path:04930_2 cancer-path:05014_2 metabolic-path:04110_1 neurological-path:05215_3cancer-path:05200_31 Psychological-path:00350_1 metabolic-path:05221_1immune-path:04650_1 cancer-path:05222_1 immune-path:03320_2 cancer-path:05221_1 Psychological-path:00982_11 aging-path:05218_2 cancer-path:04310_2 other-path:03320_1 cardiovascular-path:04110_1 other-path:04920_1 pharmacogenomic-path:04110_1 chemdependency-path:00350_1

neurological-path:04115_1other-path:05211_2 cancer-path:04350_6 metabolic-path:04910_1 cancer-path:05210_3 cancer-path:05213_1 pharmacogenomic-path:03320_3metabolic-path:04950_2 other-path:05211_4 metabolic-path:00071_1 neurological-path:03320_1 other-path:05200_15 cancer-path:03320_2

Psychological-path:03320_1 neurological-path:05014_1 developmental-path:00140_2 reproduction-path:00140_3 infection-path:00982_19 metabolic-path:00670_1 developmental-path:00670_1 chemdependency-path:00982_13 metabolic-path:00140_4 metabolic-path:00140_7 developmental-path:00140_1 pharmacogenomic-path:00240_5 cancer-path:00140_12

cancer-path:00982_25 metabolic-path:00140_5 cancer-path:00670_1pharmacogenomic-path:00670_1 reproduction-path:00140_4

B C cancer-path:03320_2 infection-path:05160_4 infection-path:05140_1 cancer-path:00670_1 infection-path:05140_2 infection-path:05160_4 infection-path:00982_19 cancer-path:05214_1 infection-path:05145_6 infection-path:05145_6 cancer-path:00982_25 infection-path:04610_3infection-path:04620_6 immune-path:04620_2 infection-path:04012_2 cancer-path:05142_15 infection-path:04012_2 cancer-path:00140_12 immune-path:04630_2 cancer-path:04114_5 inflammation-path:04620_3 cancer-path:04722_1 infection-path:04650_1 inflammation-path:04620_3 infection-path:05142_1 immune-path:04062_1 infection-path:05200_4 cancer-path:05142_11cancer-path:05218_4 inflammation-path:04620_11 infection-path:05212_2 infection-path:04620_6 cancer-path:05214_2 infection-path:05200_4 cancer-path:05140_4 immune-path:04650_1 inflammation-path:04620_11 cancer-path:04620_4 cancer-path:05214_4 immune-path:05145_1 inflammation-path:04722_1 infection-path:05212_2 infection-path:05142_1 infection-path:04650_1 inflammation-path:05131_2 inflammation-path:05131_2 cancer-path:04115_1 immune-path:04210_4 infection-path:04630_1 cancer-path:04010_3 infection-path:04620_2 inflammation-path:04620_9 cancer-path:05200_5 immune-path:05140_5 infection-path:05140_2 infection-path:05145_1 cancer-path:04110_1cancer-path:05215_4 cancer-path:05200_29 cancer-path:05200_14 immune-path:04010_9 inflammation-path:04722_1 inflammation-path:04620_9 cancer-path:05142_1 infection-path:05140_1 infection-path:04620_2 cancer-path:05210_3 infection-path:05145_1 cancer-path:05220_10 cancer-path:05221_1 immune-path:05140_1 cancer-path:04210_5 cancer-path:04920_1 infection-path:00982_19 cancer-path:05145_1 cancer-path:05200_31 infection-path:04620_5 cancer-path:05200_13 immune-path:04610_1 infection-path:04610_3 cancer-path:05222_1 cancer-path:04012_1 cancer-path:04310_2 infection-path:04630_1 infection-path:04620_5 cancer-path:04620_3 immune-path:03320_2 immune-path:04115_1 cancer-path:05014_2 cancer-path:05200_10 cancer-path:04350_6 cancer-path:05213_1

Figure 6 | The cross-talking subpathway network based on disease- and inflammation-related subpathways. The whole subpathway–subpathway cross- talking network contained 202 subpathway nodes and 716 edges. The rectangles in the network correspond to disease- and inflammation-related subpathways. The nodes are colored according to their categories, which contains 18 disease classes obtained from the GAD database (A). (B , C) Examples of subpathway-supathway network showing functional connections bridged by inflammation genes. The network in (B) was generated by extracting inflammation-, immune-, and infection-related subpathways, and the network in (C) was generated through extracting inflammation-, cancer-, and infection-related subpathways.

SCIENTIFIC REPORTS | 3 : 3426 | DOI: 10.1038/srep03426 8 www.nature.com/scientificreports infection and immunity mediated by inflammation. Takeuchi et al.48 associated with an inflammatory microenvironment. This might confirmed that mice with MyD88 deficiency are highly susceptible to thus suggest that path:04620 (Toll-like signaling pathway) Staphylococcus aureus infection and that immune cells can be acti- could be a good exhibitor of cancer progression activated by inflam- vated through the TLR7 MyD88-dependent signaling pathway49. mation and infection. With regard to the connection between infection and cancer mediated by inflammation (Figure 6C), the IL1B and TNF contained Discussion in path: 04620_9 are also implicated in the link between inflam- The identification of human disease-associated genes has long been mation and cancer, in addition to the above-discussed inflammation of central importance in the study of human genetics. With the genes NFKB1, RELA, MyD88, IRAK4, and IRAK1. Gene poly- establishment of the human disease–disease network, a shift has been morphisms in IL1B and TNF could induce cancer risk through an seen from the study of disease genes to the study of associations inflammatory microenvironment in several different popula- between various disease genes. Because the close relationship tions50,51. In addition, the NFKB1 and RELA that reside in the NF- between inflammation and various disease phenotypes has been kB node of path:04620_9 and the NF-kB signaling pathway that widely accepted55, studies of the roles of inflammation in carcinogen- resides upstream of path:04620_9 are important in the tumor-pro- esis have emerged. However, systematic studies on the functional moting processes activated by inflammation and infection52. links remain in their early stages. Interestingly, three subpathways (path:04620_3, path:04620_9, Inflammation contributes to the diverse progression of human and path:04620_11) reside in the same KEGG pathway, path:04620 complex diseases, such as immunity and cancer. To study the func- (Toll-like receptor signaling pathway; Figure 7), which is associated tional cross-links and the underlying mechanisms of their action, we with infection, inflammation, and cancer53,54. Furthermore, two integrated the PPI network, gene–disease information, and inflam- inflammation-related subpathways, path:04620_3 and path:04620_ mation genes to construct a DIN network. By further dissecting 11, which are associated with immunity and infection mediated by topological parameters, such as the shortest paths of the DIN, we MYD88, are located upstream of path:04620. Moreover, the subpath- found that nodes that are inter-genes are important in the mainten- way path:04620_9 is located downstream of path:04620. Collectively, ance of the network structure, which is consistent with a previous path:04620_9 is related to the promotion of tumor, which is tightly study7. In the present study, we confirmed the topological import-

Figure 7 | Detailed information of subpathways (i.e. path:04620_3, path:04620_11, and path:04620_9) in the KEGG (i.e. path:04620). Nodes marked by red asterisk are genes significantly overlapping with the corresponding subpathways.

SCIENTIFIC REPORTS | 3 : 3426 | DOI: 10.1038/srep03426 9 www.nature.com/scientificreports

ance of disease genes in the DIN. In addition, we showed that inter- can define i(gjo,gkp) from the following transformation: genes, as both inflammation and cancer genes, are topologically 1 i(gjo,gkp)~ , ð2Þ important. After mapping inter-genes to the PPI network, we com- 1zd(gjo,gkp) puted the FER values between inter-genes and different disease gene sets and then classified diseases as either IRD or NIRD. Based on the where d(gjo,gkp) is the shortest path length between gjo and gkp. two classifications, we further examined the Intimacy of each pair of Generation of random networks. The PPI network was randomized 1000 times diseases mediated by inflammation genes. As shown by the results, using edge permutation, with the degree corresponding to each node in the original the Intimacy of IRD was found to be higher than that of NIRD, which PPI network kept unchanged. The edge permutation approach has been widely is a good indicator of the functional cross-links between inflam- applied on various kinds of networks to generate randomized networks60. All 1000 random PPI networks were used to construct 1000 random DINs via mapping mation and various disease phenotypes. inflammation and disease genes to those random PPI networks and then extracting We comprehensively examined all the disease pairs via ranking the maximal connected component, separately. them based on the summed Intimacy for each pair (i.e. sum the Intimacy from disease A to B and disease B to A; disease A and Random test and statistical analysis. In order to compute the significance of disease B belong to one disease pair; Supplementary Table 1). In Intimacy bridged by inflammation genes, we constructed 1000 random pseudo- inflammation gene sets. Each pseudo-inflammation gene set contained the same total, we found that there were only one pair of inflammatory dis- number of genes as the real inflammation gene set, and each pseudo-inflammation eases that ranked in the top 5 of the table, together with other three gene had the same degree as the real one. Given a pair of diseases, we computed the pairs of inflammation-related diseases; their connections bridged by real Intimacy bridged by the real inflammation gene set, and then computed a inflammation were all confirmed by other researches. In addition to random distribution of Intimacy values using the 1000 pseudo-inflammation gene sets. Subsequently, we could define the significance of the Intimacy for the pair of those literature-supported pairs, the remaining pair ranking in the diseases, via comparing with the random distribution of Intimacy. top 5 was newly found by us that might be involved in the bridgeness The significance of the overlap between inflammation genes and disease genes by inflammation genes in proper conditions (i.e. disease pair of nor- against nodes of the human PPI network, and the overlap between gene sets from different subpathways, were computed by hypergeometric distribution as follows: mal variation and metabolic). As shown by the examples of gene modules extracted from the PPI network, inflammation was indeed m N{m X found to be important in mediating between infection and immun- k k n{k P(X§kjN,m,n)~1{ , ð3Þ ity, as well as between infection and cancer. Collectively, as suggested i~0 N by the results, our integrated approach could not only recur the n connections bridged by inflammation genes between those well- considering that a set of N elements has two subsets with m and n elements, known inflammatory disease pairs, but also predict new connections, respectively. We calculated the probability of containing at least k overlapping ele- which could therefore help us to study the functional roles of inflam- ments using the formula. mation genes between disease pairs, via their structural importance at the level of network. Construction of subpathway–subpathway network. We used the method that has Additionally, we also constructed a subpathway–subpathway net- been incorporated into the CRAN package iSubpathwayMiner (http://cran.r-project. org/web/packages/iSubpathwayMiner/) to identify the disease-related subpathways. work based on disease- and inflammation-related subpathways to In this method, the subpathway regions were located by lenient distance similarity of characterize the cross-links, as illustrated by the two examples of signature nodes within the pathway structure. Subsequently, we used hypergeometric disease-related subpathways mediated by inflammation-related sub- test to identify disease-related subpathways. pathways. Furthermore, this network-based analysis adds a new layer of complexity to the study of human diseases in that it considers 1. Colotta, F., Allavena, P., Sica, A., Garlanda, C. & Mantovani, A. Cancer-related inflammation as an important factor in the initiation and progres- inflammation, the seventh hallmark of cancer: links to genetic instability. sion of diseases. Carcinogenesis 30, 1073–1081 (2009). 2. Heller, R. A. et al. Discovery and analysis of inflammatory disease-related genes using cDNA microarrays. Proc Natl Acad Sci U S A 94, 2150–2155 (1997). Methods 3. Jones, S. et al. Exomic sequencing identifies PALB2 as a pancreatic cancer Data. A set of inflammation genes was obtained from the categories susceptibility gene. Science 324, 217 (2009). ‘‘inflammatory response’’ (GO:0006954) and ‘‘regulation of inflammatory response’’ 4. Wu, X., Jiang, R., Zhang, M. Q. & Li, S. Network-based global inference of human (GO:0050727), namely the human inflammation gene set containing 231 genes. The disease genes. Mol Syst Biol 4, 189 (2008). human disease gene set was generated on May 2012 from the Genetic Association 5. Turner, F. S., Clutterbuck, D. R. & Semple, C. A. POCUS: mining genomic Database (GAD, http://geneticassociationdb.nih.gov/), which includes 2831 disease sequence annotation to predict disease genes. Genome Biol 4, R75 (2003). genes corresponding to 18 disease classes. The database56 is an comprehensive archive 6. Furney, S. J., Alba, M. M. & Lopez-Bigas, N. Differences in the evolutionary of associated genes of human complex diseases and disorders, which also includes history of disease genes affected by dominant or recessive mutations. BMC summary data extracted from published papers on candidate genes and GWAS Genomics 7, 165 (2006). studies. The human PPI network (http://www.hprd.org)57, which involves 9028 7. Goh, K. I. et al. The human disease network. Proc Natl Acad Sci U S A 104, proteins with 35865 high-confidence interactions, is then used to construct DIN. The 8685–8690 (2007). database seems the most integrated for human proteins in the public domain and 8. Donoso, L. A., Kim, D., Frost, A., Callahan, A. & Hageman, G. The role of widely used in scientific researches58,59. inflammation in the pathogenesis of age-related macular degeneration. Surv Ophthalmol 51, 137–152 (2006). Fold-enrichment ratio and Intimacy between each disease pair. The fold- 9. Itzkowitz, S. H. & Yio, X. Inflammation and cancer IV. Colorectal cancer in enrichment ratio (FER) is defined as the ratio between the observed value and inflammatory bowel disease: the role of inflammation. Am J Physiol Gastrointest expected value (O/E ratio). An estimator is used to measure whether the observed Physiol 287, G7–17 (2004). overlap when mapping inflammation genes to the gene set of each disease class is large 10. Ashburner, M. et al. Gene ontology: tool for the unification of biology. The Gene enough to be significant. Ontology Consortium. Nat Genet 25, 25–29 (2000). For the given pair of diseases dk and dj with corresponding disease gene sets gk1,…, 11. Jeong, H., Tombor, B., Albert, R., Oltvai, Z. N. & Barabasi, A. L. The large-scale gkm and gj1,…,gjn, the Intimacy is defined to describe the contribution of inflam- organization of metabolic networks. Nature 407, 651–654 (2000). mation genes in bridging the connections between disease dk to dj. Considering the 12. Soffer, S. N. & Vazquez, A. Network clustering coefficient without degree- disease information passed from dk to dj based on the human PPI network connec- correlation biases. Phys Rev E Stat Nonlin Soft Matter Phys 71, 057101 (2005). tion, we can define how much dj is influenced by dk by treating genes related to dj as 13. Xu, J. & Li, Y. Discovering disease-genes by topological features in human protein- abnormal genes (upregulated or downregulated). Let I(dk R dj) denote the intensity protein interaction network. Bioinformatics 22, 2800–2805 (2006). with which dj is influenced by dk, as follows: 14. Xu, J. & Li, Y. Discovering disease-genes by topological features in human Xn protein–protein interaction network. Bioinformatics 22, 2800–2805 (2006). 1 I(dk?dj)~ maxfi(gjo,gkp)g(o~1,2,:::m) ð1Þ 15. Stelzl, U. et al. A human protein-protein interaction network: a resource for n p~1 annotating the proteome. Cell 122, 957–968 (2005). 16. Ravasz, E., Somera, A. L., Mongru, D. A., Oltvai, Z. N. & Baraba´si, A.-L. where n is the total number of disease genes of dj and i(gjo,gkp) is the Intimacy between Hierarchical organization of modularity in metabolic networks. Science 297, the disease pair. By using the network-based method, the shortest path method, we 1551–1555 (2002).

SCIENTIFIC REPORTS | 3 : 3426 | DOI: 10.1038/srep03426 10 www.nature.com/scientificreports

17. Infection, inflammation, and mental illness. The body’s immune response to 45. Kawai, T. & Akira, S. Innate immune recognition of viral infection. Nat Immunol infectious agents may imperil the brain. Harv Ment Health Lett 26, 1–3 (2009). 7, 131–137 (2006). 18. Grivennikov, S. I., Greten, F. R. & Karin, M. Immunity, inflammation, and cancer. 46. Saito, T. & Gale, M. Jr. Regulation of innate immunity against hepatitis C virus Cell 140, 883–899 (2010). infection. Hepatol Res 38, 115–122 (2008). 19. Karin, M., Lawrence, T. & Nizet, V. Innate immunity gone awry: linking microbial 47. Trumstedt, C. et al. Role of IRAK4 and IRF3 in the control of intracellular infections to chronic inflammation and cancer. Cell 124, 823–835 (2006). infection with Chlamydia pneumoniae. J Leukoc Biol 81, 1591–1598 (2007). 20. Fox, J. G. et al. Concurrent enteric helminth infection modulates inflammation 48. Takeuchi, O., Hoshino, K. & Akira, S. Cutting edge: TLR2-deficient and MyD88- and gastric immune responses and reduces helicobacter-induced gastric atrophy. deficient mice are highly susceptible to Staphylococcus aureus infection. Nature medicine 6, 536–542 (2000). J Immunol 165, 5392–5396 (2000). 21. Lindholt, J. S. & Shi, G.-P. Chronic inflammation, immune response, and infection 49. Hemmi, H. et al. Small anti-viral compounds activate immune cells via the TLR7 in abdominal aortic aneurysms. European journal of vascular and endovascular MyD88-dependent signaling pathway. Nat Immunol 3, 196–200 (2002). surgery 31, 453–463 (2006). 50. Zhang, W. H., Wang, X. L., Zhou, J., An, L. Z. & Xie, X. D. Association of 22. Manabe, I. Chronic inflammation links cardiovascular, metabolic and renal interleukin-1B (IL-1B) gene polymorphisms with risk of gastric cancer in Chinese diseases. Circ J 75, 2739–2748 (2011). population. 30, 378–381 (2005). 23. Tracy, R. P. Inflammation, the metabolic syndrome and cardiovascular risk. Int J 51. Garza-Gonzalez, E. et al. Role of the polymorphic IL-1B, IL-1RN and TNF-A Clin Pract Suppl, 10–17 (2003). genes in distal gastric cancer in Mexico. Int J Cancer 114, 237–241 (2005). 24. Cengel, A. Epicardial , metabolic syndrome, inflammation, and 52. Karin, M. Nuclear factor-kappaB in cancer development and progression. Nature cardiovascular risk. Turk Kardiyol Dern Ars 40, 696–698 (2012). 441, 431–436 (2006). 25. Lin, G. M., Chu, K. M. & Han, C. L. The influence of inflammation outweighing 53. Chen, K. et al. Toll-like receptors in inflammation, infection and cancer. Int the metabolic syndrome on cardiovascular risk and mortality. Int J Cardiol 146, Immunopharmacol 7, 1271–1285 (2007). 431 (2011). 54. Killeen, S. D., Wang, J. H., Andrews, E. J. & Redmond, H. P. Exploitation of the 26. Caruso, C., Lio, D., Cavallone, L. & Franceschi, C. Aging, longevity, inflammation, Toll-like receptor system in cancer: a doubled-edged sword? Br J Cancer 95, and cancer. Ann N Y Acad Sci 1028, 1–13 (2004). 247–252 (2006). 27. Vasto, S. et al. Inflammation, ageing and cancer. Mech Ageing Dev 130, 40–45 55. Coussens, L. M. & Werb, Z. Inflammation and cancer. Nature 420, 860–867 (2009). (2002). 28. Hotamisligil, G. S. Inflammation and metabolic disorders. Nature 444, 860–867 56. Becker, K. G., Barnes, K. C., Bright, T. J. & Wang, S. A. The genetic association (2006). database. Nat Genet 36, 431–432 (2004). 29. Navab, M., Gharavi, N. & Watson, A. D. Inflammation and metabolic disorders. 57. Li, Z. et al. Characterization the regulation of herpesvirus miRNAs from the view Curr Opin Clin Nutr Metab Care 11, 459–464 (2008). of human protein interaction network. BMC Syst Biol 5, 93 (2011). 30. Bang, J., Cimasoni, G. & Held, A. J. Beta-glucuronidase correlated with 58. Nibbe, R. K., Koyuturk, M. & Chance, M. R. An integrative -omics approach to inflammation in the exudate from human gingiva. Arch Oral Biol 15, 445–451 identify functional sub-networks in human colorectal cancer. PLoS Comput Biol (1970). 6, e1000639 (2010). 31. Jiang, Z., Wu, W. & Qian, M. L. Cellular damage and apoptosis along with changes 59. Xia, K. et al. Identification of the proliferation/differentiation switch in the cellular in NF-kappa B expression were induced with contrast agent enhanced ultrasound network of multicellular organisms. PLoS Comput Biol 2, e145 (2006). in gastric cancer cells and hepatoma cells. Cancer Cell Int 12, 8 (2012). 60. Milo, R. et al. Network motifs: simple building blocks of complex networks. 32. Wagner, A. E., Boesch-Saadatmandi, C., Dose, J., Schultheiss, G. & Rimbach, G. Science 298, 824–827 (2002). Anti-inflammatory potential of allyl-isothiocyanate--role of Nrf2, NF-(kappa) B and microRNA-155. J Cell Mol Med 16, 836–843 (2012). 33. Zhao, X., Ross, E. J., Wang, Y. & Horwitz, B. H. Nfkb1 inhibits LPS-induced IFN- beta and IL-12 p40 production in by distinct mechanisms. PLoS One Acknowledgments 7, e32811 (2012). This work was supported in part by the Creative Research Groups of The National Natural 34. Hanahan, D. & Weinberg, R. A. Hallmarks of cancer: the next generation. Cell 144, Science Foundation of China (Grant No. 81121003, 91129710, 91029717, 61170154 and 646–674 (2011). 31200997), the Specialized Research Fund for the Doctoral Program of Higher Education of 35. Nathan, C. & Ding, A. Nonresolving inflammation. Cell 140, 871–882 (2010). China (Grant Nos. 20102307110022), the Science Foundation of Heilongjiang Province 36. Balkwill, F. & Mantovani, A. Inflammation and cancer: back to Virchow? Lancet (Grant Nos. D200834 and C201207) and the Science Foundation of Heilongjiang Province 357, 539–545 (2001). Education Department (Grant Nos. YJSCX2012-230HLJ, Grant Nos.YJSCX2012-252HLJ). 37. Yazdani, S. et al. Nuclear factor kappaB1/RelA mediates the inflammation and/or survival of human airway exposed to sulfur mustard. J Recept Signal Transduct Res 31, 367–373 (2011). Author contributions 38. Steinbrecher, K. A., Harmel-Laws, E., Sitcheran, R. & Baldwin, A. S. Loss of Y.P.Z. and H.H.F. conceived and designed the research, performed the molecular genetics epithelial RelA results in deregulated intestinal proliferative/apoptotic manipulations and the experiments, analyzed data and wrote the paper. X.L. and Y.X.L. homeostasis and susceptibility to inflammation. J Immunol 180, 2588–2599 designed the research, analyzed the data and wrote the paper. Y.X. participated in the design (2008). of the study and contributed to the mathematical analysis of the data. J.X. contributed to the design of the study and helped to write the paper. Y.J.X. designed the research, analyzed data 39. Algul, H. et al. Pancreas-specific RelA/p65 truncation increases susceptibility of and wrote the paper. All authors read and approved the final anuscript. acini to inflammation-associated cell death following cerulein pancreatitis. J Clin Invest 117, 1490–1501 (2007). 40. Li, M., Zhang, J., Wu, Y. & Li, J. The regulation of thymic stromal lymphopoietin Additional information in gut immune homeostasis. Dig Dis Sci 56, 2215–2220 (2011). Supplementary information accompanies this paper at http://www.nature.com/ 41. Clement, S., Pascarella, S. & Negro, F. Hepatitis C virus infection: molecular scientificreports pathways to steatosis, resistance and . Viruses 1, 126–143 (2009). Competing financial interests: The authors declare no competing financial interests. 42. Fox, J. G. et al. Concurrent enteric helminth infection modulates inflammation How to cite this article: Zhang, Y.P. et al. Network Analysis Reveals Functional Cross-links and gastric immune responses and reduces helicobacter-induced gastric atrophy. between Disease and Inflammation Genes. Sci. Rep. 3, 3426; DOI:10.1038/srep03426 Nat Med 6, 536–542 (2000). (2013). 43. Choe, H. et al. The beta-chemokine receptors CCR3 and CCR5 facilitate infection by primary HIV-1 isolates. Cell 85, 1135–1148 (1996). This work is licensed under a Creative Commons Attribution- 44. Gottipati, S., Rao, N. L. & Fung-Leung, W. P. IRAK1: a critical signaling mediator NonCommercial-NoDerivs 3.0 Unported license. To view a copy of this license, of innate immunity. Cell Signal 20, 269–276 (2008). visit http://creativecommons.org/licenses/by-nc-nd/3.0

SCIENTIFIC REPORTS | 3 : 3426 | DOI: 10.1038/srep03426 11