[Figure S1, panel A: volcano plot. x-axis: log2 fold change; y-axis: FDR-adjusted P value; legend: not significant, up-regulated, down-regulated. Panel B: PPI network of DEGs; nodes are labeled with gene names.]

Figure S1. Analysis of differentially expressed genes (DEGs). (A) Volcano plot of transcriptome profiling results in HCC tissue samples versus adjacent normal tissue samples. (B) PPI network of DEGs. Nodes represent the DEGs and are labeled with their names. The size and color of nodes vary with their degree centrality in the network. The width of edges varies with their confidence.

[Figure S2, panels A to H: Kaplan–Meier survival curves. x-axis: Months (0 to 120); y-axis: Percent survival (%); groups: Low TPM and High TPM. Annotations printed in each panel:]

Panel  Gene      n (low)  n (high)  HR (high)  P (HR)   Logrank P
A      PAMR1     182      182       0.9        0.57     0.56
B      NAT2      182      182       0.72       0.63     0.082
C      STIL      182      182       1.6        0.014    0.013
D      NUDT10    177      146       0.88       0.51     0.51
E      C16ORF59  182      182       2          0.00023  0.00018
F      SFTA1P    181      181       1.2        0.35     0.36
G      ZADH2     182      182       0.74       0.096    0.74
H      VMO1      182      182       0.94       0.73     0.74

Figure S2. Survival analysis of candidate genes other than NOX4 and FLVCR1. (A) PAMR1; (B) NAT2; (C) STIL; (D) NUDT10; (E) C16ORF59; (F) SFTA1P; (G) ZADH2; (H) VMO1. The data were analyzed by the Kaplan–Meier method; a log-rank P < 0.05 was considered statistically significant.

[Figure S3: three-level PPI network; nodes are labeled with gene names.]

Figure S3. Vector illustration of the three-level PPI network
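The Kaplan–Meier estimate behind the survival curves in Figure S2 can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `kaplan_meier` and the toy follow-up data are assumptions made for illustration only, and the log-rank test is not shown.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times  : follow-up time per patient (e.g. months)
    events : 1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)      # everyone (event or censored) leaving at t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]     # remove events and censored cases after t
    return curve


# Hypothetical toy cohort (not data from this study).
times = [5, 8, 8, 12, 20, 20, 30]
events = [1, 1, 0, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"month {t}: S(t) = {s:.3f}")
```

In practice the cohort would be split into Low TPM and High TPM groups, one curve estimated per group, and the two compared with a log-rank test.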

Table S1. Top 10 hub genes identified using the four parameters

Hub genes  Degree  Betweenness Centrality  Closeness Centrality  Stress
GAPDH      422     0.066                   0.455                 25518406
CDK1       317     0.015                   0.413                 5852942
SRC        264     0.035                   0.431                 11025368
IL6        258     0.021                   0.418                 7803430
EGF        215     0.019                   0.418                 7584486
MYC        223     0.018                   0.418                 6875820
BDNF       215     0.016                   0.418                 6311752
MMP9       207     0.014                   0.411                 5842810
JUN        201     0.014                   0.409                 5703898
FOS        205     0.013                   0.409                 5202498
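As a rough illustration of two of the parameters in Table S1, the sketch below computes degree and closeness centrality on a tiny undirected graph. The node names `P1` to `P5` and their edges are hypothetical, and the closeness formula used here (number of reachable nodes divided by the sum of shortest-path distances) is one common definition; the software used to produce Table S1 may normalize differently.

```python
from collections import deque

# Hypothetical toy interaction network as an adjacency map (undirected).
adj = {
    "P1": {"P2", "P3", "P4"},
    "P2": {"P1"},
    "P3": {"P1"},
    "P4": {"P1", "P5"},
    "P5": {"P4"},
}


def degree(adj, node):
    """Number of direct interaction partners of a node."""
    return len(adj[node])


def closeness(adj, node):
    """Reachable nodes divided by the sum of shortest-path distances."""
    dist = {node: 0}
    queue = deque([node])
    while queue:                      # breadth-first search from `node`
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


for n in sorted(adj):
    print(n, degree(adj, n), round(closeness(adj, n), 3))
```

A node such as `P1`, which touches most of the network directly, scores high on both measures, which is the intuition behind ranking hub genes by these parameters.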

Degree indicates the number of direct interactions between a protein and the others, while the centralities represent the likelihood that a protein functionally holds communicating nodes together in a biological network. After the calculation, the top 10 genes with high values in these four parameters were identified as hub genes, which have many interactions with other proteins and participate in many pathways.

Code S1. Code of RF classifier

import sys

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import preprocessing
from sklearn import svm
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import accuracy_score
from sklearn.metrics import roc_auc_score
from sklearn.metrics import roc_curve
from sklearn.metrics import classification_report
from sklearn.model_selection import cross_val_score

# Print arrays and data frames in full. Recent NumPy versions reject
# threshold=np.nan, so sys.maxsize is used instead.
np.set_printoptions(threshold=sys.maxsize)
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.width', 20000)

feature = []
name = 'name'    # column holding the sample name
label = 'label'  # column holding the class label

# Load the two data sets and join them column-wise.
dataset1 = pd.read_csv('dataset1.csv')
dataset2 = pd.read_csv('dataset2.csv')
dataset = pd.concat([dataset1, dataset2], axis=1)

# Every column except the sample name and the label is used as a feature.
x_columns = [x for x in dataset.columns if x not in [label, name]]

# x_columns = [
# 'ENSG00000086991','ENSG00000149090','ENSG00000156006','ENSG00000123473','ENSG00000122824',
# 'ENSG00000162062','ENSG00000225383','ENSG00000162769','ENSG00000180011','ENSG00000182853'
# # # ,'ENSG00000021852','ENSG00000111713'
# # ,'ENSG00000064703','ENSG00000162365','ENSG00000206195',
# # 'ENSG00000116031','ENSG00000104938','ENSG00000036448','ENSG00000115919','ENSG00000121207',
# # 'ENSG00000267690','ENSG00000204356','ENSG00000121289','ENSG00000114805','ENSG00000135409',
# # 'ENSG00000198300','ENSG00000123485','ENSG00000105383','ENSG00000103671','ENSG00000145824',
# # 'ENSG00000019169','ENSG00000152133'
# # ,'ENSG00000073969','ENSG00000148773','ENSG00000109674',
# # 'ENSG00000220008','ENSG00000233124','ENSG00000235295','ENSG00000168899','ENSG00000144061',
# # 'ENSG00000185359','ENSG00000122218','ENSG00000197114','ENSG00000100889','ENSG00000239998',
# # 'ENSG00000242019','ENSG00000173281'
# # 'ENSG00000000003','ENSG00000000005','ENSG00000000419','ENSG00000000457','ENSG00000000460',
# # 'ENSG00000000938','ENSG00000000971','ENSG00000001036','ENSG00000001084','ENSG00000001167'
# # 'ENSG00000001460','ENSG00000001461','ENSG00000001497','ENSG00000001561','ENSG00000001617','ENSG00000001626',
# 'ENSG00000001629','ENSG00000001630','ENSG00000001631','ENSG00000002016','ENSG00000002330'
# ]

# print(x_columns)
# for a in dataset[x_columns].columns:
#     dataset[a] = preprocessing.scale(dataset[a])

# Hold out 30% of the samples as the test set
X_train, X_test, y_train, y_test = train_test_split(
    dataset[x_columns], dataset[label], test_size=0.3, random_state=42)

# dataset_train = pd.read_csv('train.csv')
# dataset_test = pd.read_csv('validate.csv')
# X_train = dataset_train[x_columns]
# X_test = dataset_test[x_columns]
# y_train = dataset_train[label]
# y_test = dataset_test[label]
# print(dataset[label].value_counts())
# print(y_train.value_counts())
# print(y_test.value_counts())

# parameter_space = {
#     'max_features': range(10, 160, 10)
# }

# clf = svm.SVC(probability=True)
clf = RandomForestClassifier(random_state=42)

# grid = GridSearchCV(clf, parameter_space, cv=5, scoring='f1')
# grid.fit(X_train, y_train)
# print(' the best parameter:')
# print(grid.best_params_)
# clf = grid.best_estimator_

clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
y_pred_pro = clf.predict_proba(X_test)
# Predicted probability of the positive class (label 1)
y_scores = pd.DataFrame(y_pred_pro, columns=clf.classes_.tolist())[1].values
y_true = y_test
print('\naccuracy:', accuracy_score(y_true, y_pred))
# print(y_pred)
# print(y_pred_pro)


def get_result(y_true, y_pred):
    """Print the confusion matrix as counts, then as percentages of each true class."""
    TN = 0
    FN = 0
    FP = 0
    TP = 0
    for x, y in zip(y_true, y_pred):
        if x == 1 and y == 1:
            TP += 1
        if x == 0 and y == 1:
            FP += 1
        if x == 1 and y == 0:
            FN += 1
        if x == 0 and y == 0:
            TN += 1
    # Rows are predicted labels, columns are true labels
    print('\nconfusion matrix:')
    print('\t', 1, '\t', 0)
    print('1\t%d\t%d' % (TP, FP))
    print('0\t%d\t%d' % (FN, TN))
    # Same layout, with each column normalised by the number of true samples in that class
    print('\nConfusion matrix:')
    print('\t', 1, '\t', 0)
    print('1\t%.2f%%\t%.2f%%' % (TP/(TP+FN)*100, FP/(FP+TN)*100))
    print('0\t%.2f%%\t%.2f%%' % (FN/(TP+FN)*100, TN/(FP+TN)*100))


get_result(y_true, y_pred)

# Precision (p), recall (r) and F1 score per class
print('\np value, r value, f1 score:')
print(classification_report(y_true, y_pred))

auc_value = roc_auc_score(y_true, y_scores)
print('\nauc area:', auc_value)

# ROC curve with the chance diagonal for reference
fpr, tpr, thresholds = roc_curve(y_true, y_scores)
plt.figure()
lw = 2
plt.plot(fpr, tpr, color='darkorange', linewidth=lw,
         label='ROC curve(area=%0.4f)' % auc_value)
plt.plot([0, 1], [0, 1], color='navy', linewidth=lw, linestyle='--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver operating characteristic example')
plt.legend(loc='lower right')
plt.show()

# (pd.DataFrame(clf.feature_importances_, index=x_columns)).to_csv('fi.csv')

# 10-fold cross-validated accuracy on the full data set
scores = cross_val_score(clf, dataset[x_columns], dataset[label], cv=10, scoring='accuracy')
print(scores)
print(scores.mean())
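The get_result function in Code S1 only prints its tallies. As a minimal sketch, assuming binary labels coded 0/1 as in that listing, the same counts can be returned as values and turned into sensitivity and specificity, which are the diagonal entries of the percentage matrix it prints. The function names and demo labels below are hypothetical and are not part of the original code.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels coded 0/1."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        elif truth == 0 and pred == 0:
            tn += 1
    return tp, fp, fn, tn


def sensitivity_specificity(tp, fp, fn, tn):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP); NaN if a class is absent."""
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical demo labels, not results from the study.
y_true_demo = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred_demo = [1, 0, 1, 0, 1, 0, 0, 1]

counts = confusion_counts(y_true_demo, y_pred_demo)
sens, spec = sensitivity_specificity(*counts)
print(counts, sens, spec)
```

Guarding the divisions avoids the ZeroDivisionError that the printed percentages would raise if the test split happened to contain only one class.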

Code S2. Code of DT classifier

import sys

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import preprocessing
from sklearn import svm
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import accuracy_score
from sklearn.metrics import roc_auc_score
from sklearn.metrics import roc_curve
from sklearn.metrics import classification_report

# Print arrays and data frames in full (threshold=np.nan is rejected by current NumPy)
np.set_printoptions(threshold=sys.maxsize)
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.width', 20000)

feature = []
name = 'name'
label = 'label'
dataset1 = pd.read_csv('dataset1.csv')
dataset2 = pd.read_csv('dataset2.csv')
# Column-wise concatenation of the two files (rows aligned on index)
dataset = pd.concat([dataset1, dataset2], axis=1)
# Use every column except the label and sample-name columns as features
x_columns = [x for x in dataset.columns if x not in [label, name]]

# for a in dataset[x_columns].columns:
#     dataset[a] = preprocessing.scale(dataset[a])

# Hold out 30% of the samples as the test set (no fixed seed in this listing)
X_train, X_test, y_train, y_test = train_test_split(
    dataset[x_columns], dataset[label], test_size=0.3)

# dataset_train = pd.read_csv('train.csv')
# dataset_test = pd.read_csv('validate.csv')
# X_train = dataset_train[x_columns]
# X_test = dataset_test[x_columns]
# y_train = dataset_train[label]
# y_test = dataset_test[label]
# print(dataset[label].value_counts())
# print(y_train.value_counts())
# print(y_test.value_counts())

# parameter_space = {
#     'max_features': range(10, 160, 10)
# }

# clf = svm.SVC(probability=True)
clf = DecisionTreeClassifier(random_state=42)

# grid = GridSearchCV(clf, parameter_space, cv=5, scoring='f1')
# grid.fit(X_train, y_train)
# print('the best parameter:')
# print(grid.best_params_)
# clf = grid.best_estimator_

clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
y_pred_pro = clf.predict_proba(X_test)
# Predicted probability of the positive class (label 1)
y_scores = pd.DataFrame(y_pred_pro, columns=clf.classes_.tolist())[1].values
y_true = y_test
print('\naccuracy:', accuracy_score(y_true, y_pred))
# print(y_pred)
# print(y_pred_pro)


def get_result(y_true, y_pred):
    """Print the confusion matrix as counts, then as percentages of each true class."""
    TN = 0
    FN = 0
    FP = 0
    TP = 0
    for x, y in zip(y_true, y_pred):
        if x == 1 and y == 1:
            TP += 1
        if x == 0 and y == 1:
            FP += 1
        if x == 1 and y == 0:
            FN += 1
        if x == 0 and y == 0:
            TN += 1
    # Rows are predicted labels, columns are true labels
    print('\nconfusion matrix:')
    print('\t', 1, '\t', 0)
    print('1\t%d\t%d' % (TP, FP))
    print('0\t%d\t%d' % (FN, TN))
    # Same layout, with each column normalised by the number of true samples in that class
    print('\n confusion matrix:')
    print('\t', 1, '\t', 0)
    print('1\t%.2f%%\t%.2f%%' % (TP/(TP+FN)*100, FP/(FP+TN)*100))
    print('0\t%.2f%%\t%.2f%%' % (FN/(TP+FN)*100, TN/(FP+TN)*100))


get_result(y_true, y_pred)

# Precision (p), recall (r) and F1 score per class
print('\np value, r value, f1 score:')
print(classification_report(y_true, y_pred))

auc_value = roc_auc_score(y_true, y_scores)
print('\nauc area:', auc_value)

# ROC curve with the chance diagonal for reference
fpr, tpr, thresholds = roc_curve(y_true, y_scores)
plt.figure()
lw = 2
plt.plot(fpr, tpr, color='darkorange', linewidth=lw,
         label='ROC curve(area=%0.4f)' % auc_value)
plt.plot([0, 1], [0, 1], color='navy', linewidth=lw, linestyle='--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver operating characteristic example')
plt.legend(loc='lower right')
plt.show()
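The listings report the area under the ROC curve through roc_auc_score. For binary labels this area equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The following is a minimal sketch of that pairwise view, with a hypothetical function name and made-up demo scores. It is meant to illustrate what the number means, not to replace the library call.

```python
def auc_pairwise(y_true, y_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, y_scores) if t == 1]
    neg = [s for t, s in zip(y_true, y_scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical demo data, not results from the study.
demo_labels = [0, 0, 1, 1]
demo_scores = [0.1, 0.4, 0.35, 0.8]
demo_auc = auc_pairwise(demo_labels, demo_scores)
print(demo_auc)
```

Here three of the four positive-negative pairs are ordered correctly, so the sketch gives 0.75. The quadratic loop is fine for illustration but slow for large test sets.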

Code S3. Code of GBDT classifier

import sys

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn import preprocessing
from sklearn import svm
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import accuracy_score
from sklearn.metrics import roc_auc_score
from sklearn.metrics import roc_curve
from sklearn.metrics import classification_report
from sklearn.model_selection import cross_val_score

# Print arrays and data frames in full (threshold=np.nan is rejected by current NumPy)
np.set_printoptions(threshold=sys.maxsize)
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.width', 20000)

feature = []
name = 'name'
label = 'label'
dataset1 = pd.read_csv('dataset1.csv')
dataset2 = pd.read_csv('dataset2.csv')
# Column-wise concatenation of the two files (rows aligned on index)
dataset = pd.concat([dataset1, dataset2], axis=1)
print(dataset[label].value_counts())
# Use every column except the label and sample-name columns as features
x_columns = [x for x in dataset.columns if x not in [label, name]]

# x_columns = ['ENSG00000184374','ENSG00000130812','ENSG00000171307','ENSG00000106809','ENSG00000196924'
#              ,'ENSG00000185052','ENSG00000270673','ENSG00000169604'
#              ,'ENSG00000135842','ENSG00000169184'
#              ,'ENSG00000174099'
#              ,'ENSG00000101335','ENSG00000151468','ENSG00000155254'
#              ]

# for a in dataset[x_columns].columns:
#     dataset[a] = preprocessing.scale(dataset[a])

# Hold out 30% of the samples as the test set
X_train, X_test, y_train, y_test = train_test_split(
    dataset[x_columns], dataset[label], test_size=0.3, random_state=42)

# dataset_train = pd.read_csv('train.csv')
# dataset_test = pd.read_csv('validate.csv')
# X_train = dataset_train[x_columns]
# X_test = dataset_test[x_columns]
# y_train = dataset_train[label]
# y_test = dataset_test[label]
# print(dataset[label].value_counts())
# print(y_train.value_counts())
# print(y_test.value_counts())

# parameter_space = {
#     'max_features': range(10, 160, 10)
# }

# clf = svm.SVC(probability=True)
clf = GradientBoostingClassifier(random_state=42)

# grid = GridSearchCV(clf, parameter_space, cv=5, scoring='f1')
# grid.fit(X_train, y_train)
# print('the best parameter:')
# print(grid.best_params_)
# clf = grid.best_estimator_

clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
y_pred_pro = clf.predict_proba(X_test)
# Predicted probability of the positive class (label 1)
y_scores = pd.DataFrame(y_pred_pro, columns=clf.classes_.tolist())[1].values
y_true = y_test
print('\naccuracy:', accuracy_score(y_true, y_pred))
# print(y_pred)
# print(y_pred_pro)


def get_result(y_true, y_pred):
    """Print the confusion matrix as counts, then as percentages of each true class."""
    TN = 0
    FN = 0
    FP = 0
    TP = 0
    for x, y in zip(y_true, y_pred):
        if x == 1 and y == 1:
            TP += 1
        if x == 0 and y == 1:
            FP += 1
        if x == 1 and y == 0:
            FN += 1
        if x == 0 and y == 0:
            TN += 1
    # Rows are predicted labels, columns are true labels
    print('\nconfusion matrix:')
    print('\t', 1, '\t', 0)
    print('1\t%d\t%d' % (TP, FP))
    print('0\t%d\t%d' % (FN, TN))
    # Same layout, with each column normalised by the number of true samples in that class
    print('\nconfusion matrix: ')
    print('\t', 1, '\t', 0)
    print('1\t%.2f%%\t%.2f%%' % (TP/(TP+FN)*100, FP/(FP+TN)*100))
    print('0\t%.2f%%\t%.2f%%' % (FN/(TP+FN)*100, TN/(FP+TN)*100))


get_result(y_true, y_pred)

# Precision (p), recall (r) and F1 score per class
print('\np value, r value, f1 score:')
print(classification_report(y_true, y_pred))

auc_value = roc_auc_score(y_true, y_scores)
print('\nauc area:', auc_value)

# ROC plotting is disabled in this listing (kept inside a string literal)
'''
fpr, tpr, thresholds = roc_curve(y_true, y_scores)
plt.figure()
lw = 2
plt.plot(fpr, tpr, color='darkorange', linewidth=lw,
         label='ROC curve(area=%0.4f)' % auc_value)
plt.plot([0, 1], [0, 1], color='navy', linewidth=lw, linestyle='--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver operating characteristic example')
plt.legend(loc='lower right')
plt.show()
'''

# (pd.DataFrame(clf.feature_importances_, index=x_columns)).to_csv('fi_GBDT.csv')

# 5-fold cross-validated accuracy on the full data set
scores = cross_val_score(clf, dataset[x_columns], dataset[label], cv=5, scoring='accuracy')
print(scores.mean())
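The final step of Code S1 and Code S3 averages accuracy over cross-validation folds. The sketch below shows the idea in plain Python under simplifying assumptions: folds are contiguous and unstratified, whereas scikit-learn's cross_val_score uses stratified folds for classifiers when cv is an integer. The stand-in majority-class model and all names and data are hypothetical.

```python
def kfold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_val_accuracy(fit_predict, X, y, k=5):
    """Accuracy on each held-out fold; fit_predict(X_train, y_train, X_test) returns predictions."""
    scores = []
    for test_idx in kfold_indices(len(y), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        preds = fit_predict(
            [X[i] for i in train_idx],
            [y[i] for i in train_idx],
            [X[i] for i in test_idx],
        )
        correct = sum(1 for p, i in zip(preds, test_idx) if p == y[i])
        scores.append(correct / len(test_idx))
    return scores


def majority_class(X_train, y_train, X_test):
    """Hypothetical stand-in model: always predict the most common training label."""
    most_common = max(set(y_train), key=y_train.count)
    return [most_common] * len(X_test)


# Hypothetical demo data, not the study's expression matrix.
X_demo = [[i] for i in range(10)]
y_demo = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]
fold_scores = cross_val_accuracy(majority_class, X_demo, y_demo, k=5)
mean_score = sum(fold_scores) / len(fold_scores)
print(fold_scores, mean_score)
```

Each sample is held out exactly once, so the mean is an estimate of accuracy on unseen data rather than on the data the model was fitted to.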