A Multi-Objective Genetic Algorithm to Find Active Modules from Multiplex Biological Networks: Supplementary material

Elva-María Novoa-del-Toro1,∗ Efrén Mezura-Montes2 Matthieu Vignes3 Frédérique Magdinier1 Laurent Tichit4 Anaïs Baudot1,∗

1 Aix Marseille Univ, INSERM, Marseille Medical Genetics (MMG), Marseille, France
2 University of Veracruz, Artificial Intelligence Research Center, Mexico
3 School of Fundamental Sciences, Massey University, New Zealand
4 Aix Marseille Univ, CNRS, Centrale Marseille, I2M UMR 7373, Marseille, France
∗ To whom correspondence should be addressed

Contents

Figure S1

Figure S2

Figure S3

Figure S4

Figure S5

Figure S6

Figure S7

Figure S8

Figure S9

Figure S10

Figure S11

Figure S12

Figure S13

Figure S14

Table S1

Table S2

Table S3

Figure S1: Sizes of the subnetworks identified by PinnacleZ in the experiment using the network PPI 1 and the simulated data with normal distribution

Figure S2: Sizes of the subnetworks identified by PinnacleZ in the experiment using the network PPI 2 and the sampled data from RNA-Seq TCGA breast cancer dataset

Figure S3: Sizes of the subnetworks identified by COSINE in the experiment using the network PPI 1 and the simulated data with normal distribution

Figure S4: Sizes of the subnetworks identified by COSINE in the experiment using the network PPI 2 and the sampled data from RNA-Seq TCGA breast cancer dataset

Figure S5: Sizes of the subnetworks identified by jActiveModules in the experiment using the network PPI 1 and the simulated data with normal distribution

Figure S6: Sizes of the subnetworks identified by jActiveModules in the experiment using the network PPI 2 and the sampled data from RNA-Seq TCGA breast cancer dataset

Figure S7: Sizes of the subnetworks identified by all the methods in the experiment using the network PPI 1 and the simulated data with normal distribution

Figure S8: Sizes of the subnetworks identified by all the methods in the experiment using the network PPI 2 and the sampled data from RNA-Seq TCGA breast cancer dataset

Figure S9: F1 score values of jActiveModules, COSINE, PinnacleZ, and MOGAMUN corresponding to the two experiments from the benchmark
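For readers unfamiliar with the metric in Figure S9, the following is a minimal sketch of the standard F1 score (harmonic mean of precision and recall) applied to gene sets. It assumes a found subnetwork is compared with a ground-truth module as plain sets of gene names; the function name `f1_score` and the toy gene labels are hypothetical, and the exact benchmark protocol may differ from this simplification.

```python
def f1_score(predicted, truth):
    """Standard F1 score between a predicted gene set and a ground-truth gene set."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    if true_positives == 0:
        # No overlap: precision and recall are both zero, so F1 is zero.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy example: the labels are placeholders, not genes from the benchmark.
truth = {"A", "B", "C", "D"}
predicted = {"B", "C", "D", "E", "F"}
score = f1_score(predicted, truth)  # precision 3/5, recall 3/4
```

A score of 1 means the found subnetwork matches the ground-truth module exactly, and a score of 0 means the two share no gene.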

Figure S10

Eighteen active modules obtained by applying MOGAMUN on Yao’s biopsies dataset [1] (see Table S1 for the list of samples). The color of the nodes represents the fold-change, where green and red nodes correspond to under- and over-expressed genes, respectively. Nodes with a bold black border correspond to significantly differentially expressed genes (FDR < 0.05 and absolute log2 fold-change > 1). Blue and white nodes correspond to genes with no associated transcriptomics data and no deregulation, respectively.
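The legend above can be read as a small decision rule. The sketch below is an illustration only, not the code used to draw the figures: the function `node_style` is hypothetical, it treats a log2 fold-change of exactly zero as "no deregulation" (an assumption), and it ignores the color gradient that the figures use to show the magnitude of the fold-change.

```python
def node_style(fdr, log2_fold_change, fdr_cutoff=0.05, lfc_cutoff=1.0):
    """Map a gene's statistics to a fill color and border, following the legend."""
    # Genes without transcriptomics data are drawn in blue.
    if fdr is None or log2_fold_change is None:
        return {"fill": "blue", "bold_border": False}
    # Assumption: zero fold-change stands for "no deregulation" (white).
    if log2_fold_change == 0:
        fill = "white"
    elif log2_fold_change < 0:
        fill = "green"  # under-expressed
    else:
        fill = "red"  # over-expressed
    # Bold border: FDR < 0.05 and absolute log2 fold-change > 1.
    significant = fdr < fdr_cutoff and abs(log2_fold_change) > lfc_cutoff
    return {"fill": fill, "bold_border": significant}


# Hypothetical values, for illustration only.
strong_up = node_style(0.01, 2.3)
weak_down = node_style(0.01, -0.5)
no_data = node_style(None, None)
```

Both thresholds are strict inequalities, so a gene with an FDR of exactly 0.05 would not receive a bold border under this reading.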

Nodes: ORC1, MCM2, ORC6, MCM6, MCM4, FCGR3A, ORC5, ORC2, CDC7, MCM5, CDC45, DBF4, FCGR3B, CDK2, RPA1, MCM7, MCM3

Figure S10.1. Yao’s dataset, biopsies: Active module 1

Nodes: PRC1, POLA1, JUN, CDK2, FCGR3B, CDK1, MYC, BARD1, CDKN1A, CCNA2, ACTC1, PCDHB4, BRCA1, EP300, ABCC8, BRCA2, TP53

Figure S10.2. Yao’s dataset, biopsies: Active module 2

Nodes: STAT3, PLCG2, SOS1, SYK, CBL, SRC, EGFR, PIK3R1, ACTC1, PIK3CG, LYN, AKT1, HSPB2, CXCL14, PRKDC, MAPK8

Figure S10.3. Yao’s dataset, biopsies: Active module 3

Nodes: ABCC8, DUSP1, VAMP2, FCGR3B, HSPB2, ACTC1, FCGR3A, HCK, SYK, OGDHL, STAT3, LYN, PLCG2, CBL, KIT, PIK3CG

Figure S10.4. Yao’s dataset, biopsies: Active module 4

Nodes: EGFR, PIK3R1, SYK, GRB2, SHC1, PLCG2, SRC, PIK3R2, CBL, FYN, LYN, STAT3, OGDHL, ERBB2, ABL1

Figure S10.5. Yao’s dataset, biopsies: Active module 5

Nodes: MAPK14, RB1, SOS1, EP300, GRB2, SYK, PLCG1, CDKN1A, AKT1, EGFR, CDK1, CBL, STAT3, OGDHL, PIK3R1, LYN

Figure S10.6. Yao’s dataset, biopsies: Active module 6

Nodes: PLCG1, GAB1, CBL, VAV1, FYN, SHC1, CRK, GRB2, PIK3R1, STAT3, EGFR, SRC, FCGR3B, ERBB2, SYK

Figure S10.7. Yao’s dataset, biopsies: Active module 7

Nodes: STAT3, OGDHL, CREBBP, UBE2I, EP300, FCGR3B, CKS1B, JUN, ATF2, FOSL1, DUSP1, JUND, HSPB2, MAPK13, ACTC1

Figure S10.8. Yao’s dataset, biopsies: Active module 8

Nodes: SRC, GRB2, EGFR, PIK3R1, MET, LYN, SYK, ERBB2, SHC1, KIT, CBL, FYN, LCK, OGDHL, STAT3

Figure S10.9. Yao’s dataset, biopsies: Active module 9

Nodes: ACTC1, CDT1, RPA1, DBF4, MCM7, CDC7, MCM6, MCM4, CDK2, MCM2, CDK1, ORC6, ORC5, ORC2, CDKN1A, ORC1

Figure S10.10. Yao’s dataset, biopsies: Active module 10

Nodes: ACTC1, PFN2, ENAH, RB1, MAPK1, SP1, CDK2, TP53, SRC, CCND2, CDKN1A, MYC, SYK, SMAD2, STAT3, CREBBP, EP300

Figure S10.11. Yao’s dataset, biopsies: Active module 11

Nodes: CREBBP, HIF1A, HDAC1, SMAD3, MAPK1, EP300, TP53, SP1, RELA, MYC, ESR1, JUN, RB1, FCGR3B, MAPK14

Figure S10.12. Yao’s dataset, biopsies: Active module 12

Nodes: PIK3R2, FCGR3A, FCGR3B, PIK3R1, HCK, GRB2, CBL, LILRB2, GJA4, SRC, KIT, SYK, STAT3, CREBBP, OGDHL

Figure S10.13. Yao’s dataset, biopsies: Active module 13

Nodes: MYC, DUSP1, EP300, MAPK14, RB1, JUN, MAPK8, EGFR, HSPB2, ACTC1, CDK1, STAT3, FCGR3B, SYK, CSF2RB, PIK3CG, OGDHL

Figure S10.14. Yao’s dataset, biopsies: Active module 14

Nodes: OGDHL, STAT3, RB1, JUN, SMAD3, EP300, HDAC1, CDKN1A, CREBBP, MYC, RELA, ESR1, SP1, BRCA1, TP53

Figure S10.15. Yao’s dataset, biopsies: Active module 15

Nodes: ACTC1, CDH5, CDK1, BRCA1, NDC80, CDK2, RELA, CDKN1A, KAT5, KPNA2, UBE2I, BARD1, TRRAP, TP53, EP300, KAT2A, MYC

Figure S10.16. Yao’s dataset, biopsies: Active module 16

Nodes: OGDHL, HIF1A, STAT3, CREBBP, RB1, HDAC3, MAPK14, PIK3R1, EP300, SP1, SMAD3, MYC, CDKN1A, KAT2B, ESR1

Figure S10.17. Yao’s dataset, biopsies: Active module 17

Nodes: ABCC8, ACTC1, ITGA1, SLC5A4, HSPB2, SOX17, DAAM2, RUVBL1, CLEC14A, FCGR3B, ADAM33, ALYREF, DDX39A, CENPN, KPNA2, DUSP1, APLNR, LSM7, GLRX3

Figure S10.18. Yao’s dataset, biopsies: Active module 18

Figure S11

Ten active modules obtained by applying MOGAMUN on Yao’s myoblasts dataset [1] (see Table S1 for the list of samples). The color of the nodes represents the fold-change, where green and red nodes correspond to under- and over-expressed genes, respectively. Nodes with a bold black border correspond to significantly differentially expressed genes (FDR < 0.05 and absolute log2 fold-change > 1). Blue and white nodes correspond to genes with no associated transcriptomics data and no deregulation, respectively.

Nodes: LRRTM4, NRXN2, SHANK3, GFRA1, DLGAP4, MDM2, UBC, USP15, JUN, ZG16B, UBQLN1, SMAD3, SMAD2, OTUB1, EP300

Figure S11.1. Yao’s dataset, myoblasts: Active module 1

Nodes: PDLIM7, GFRA1, CLIP3, UBC, FAM149A, PTN, TRAF2, LRRTM4, UBQLN4, NRXN1, RAI2, BRCA2, ZG16B, APBA2, APP

Figure S11.2. Yao’s dataset, myoblasts: Active module 2

Nodes: NOTCH2NL, HOXA1, KRTAP10-5, LCE1B, KRTAP10-3, KRTAP10-9, KRTAP9-2, KRTAP4-2, SPRY2, KRTAP10-8, KRTAP10-7, CHRD, KCND3, NRXN1, GFRA1, LRRTM4

Figure S11.3. Yao’s dataset, myoblasts: Active module 3

Nodes: MCM3, ORC5, CDC45, ORC6, MCM6, MCM2, MCM7, MCM5, CDC7, CDC6, ORC1, MCM4, DBF4, RAI2, MCM10, ORC2, RHOJ, GFRA1

Figure S11.4. Yao’s dataset, myoblasts: Active module 4

Nodes: CBL, CTNNB1, RET, ERBB2, LRRTM4, NRXN3, AFDN, GFRA1, EGFR, PIK3R1, HRAS, DOK1, SRC, SHC1, PTK2

Figure S11.5. Yao’s dataset, myoblasts: Active module 5

Nodes: ARAP2, ARHGEF7, APP, RHOG, HOMER2, PIK3R1, SHANK3, LRRTM4, GFRA1, NRXN3, SHC1, EGFR, CRK, GRB2, RET

Figure S11.6. Yao’s dataset, myoblasts: Active module 6

Nodes: NRXN2, LRRTM4, NRXN3, NLGN1, NLGN2, NLGN3, NRXN1, AFDN, RIT1, PTN, SGTA, ZG16B, EPHB2, HRAS, GFRA1, NRTN

Figure S11.7. Yao’s dataset, myoblasts: Active module 7

Nodes: LRRTM4, NRXN3, NRXN1, MCM6, MCM5, KCND3, GFRA1, MCM2, CDC7, ORC6, ORC1, ORC2, MCM3, MCM7, DBF4, LTV1, ORC5, MCM4

Figure S11.8. Yao’s dataset, myoblasts: Active module 8

Nodes: GFRA1, GAB1, SRC, PTN, PIANP, NRXN2, LRRTM4, CBL, SHC1, RET, CRK, PIK3R1, FYN, GRB2, ERBB2

Figure S11.9. Yao’s dataset, myoblasts: Active module 9

Nodes: LRRTM4, CDK2, BRCA1, NRXN1, MCM5, CDC7, KCND3, ORC1, MCM7, GFRA1, ORC6, MCM3, CCNA2, MCM2, MCM6, ORC2, DBF4, CDC6, ORC5, MCM4, CDT1

Figure S11.10. Yao’s dataset, myoblasts: Active module 10

Figure S12

Twenty-three active modules obtained by applying MOGAMUN on Yao’s myotubes dataset [1] (see Table S1 for the list of samples). The color of the nodes represents the fold-change, where green and red nodes correspond to under- and over-expressed genes, respectively. Nodes with a bold black border correspond to significantly differentially expressed genes (FDR < 0.05 and absolute log2 fold-change > 1). Blue and white nodes correspond to genes with no associated transcriptomics data and no deregulation, respectively.

Nodes: TRIM49B, TRIM49, TRIM49D2, TRIM49C, PRAMEF12, TRIM49D1, TCEB3B, ZNF705B, MBD3L2, TRIM43, ARGFX, ZIM3, USP29, MBD3L3, PRAMEF11, TPRX1, PRAMEF2, HNRNPCL1

Figure S12.1. Yao’s dataset, myotubes: Active module 1

Nodes: CCNA1, PRAMEF12, PSMD10, TRIM49D2, MDM2, CCNE1, UBE2D1, TRIM43, RB1, MBD3L2, TRIM49D1, CDKN1A, PSMA3, PSMD4, TRAF2, CCNB1

Figure S12.2. Yao’s dataset, myotubes: Active module 2

Nodes: UBE2C, CDK2, CCNB1, CDC20, CCNA1, SRC, AR, FZR1, UBE2D1, TRIM43, TRIM49B, MBD3L2, TRIM49D1, PRAMEF12, TRIM49D2

Figure S12.3. Yao’s dataset, myotubes: Active module 3

Nodes: CCNA1, FZR1, TP53, CDC6, CDC20, CCNB1, CDK7, CDC25A, CCNA2, FOXM1, CDKN1A, CDK2, CDK1, ARID4A, UBE2C, CCNE1

Figure S12.4. Yao’s dataset, myotubes: Active module 4

Nodes: SNW1, CDKN1A, CCNE1, CDC25A, RB1, CCNA1, UBE2D1, CCNB1, EP300, HDAC1, TRIM43, TRIM27, CDK2, HIST1H2BA, MBD3L2

Figure S12.5. Yao’s dataset, myotubes: Active module 5

Nodes: KLHL38, NOTCH2NL, MGAT5B, CREB5, KRTAP10-8, KRTAP10-3, KRTAP10-7, HOXA1, KRTAP10-9, KRTAP4-2, KRTAP5-9, KRTAP9-2, KRTAP5-6, SPRY2, LCE3E

Figure S12.6. Yao’s dataset, myotubes: Active module 6

Nodes: MDM2, CCNA1, CCNA2, CCNB1, RB1, TP53, CDKN1B, UBE2D1, CDK2, CDKN1A, EP300, SKP2, HDAC1, BRCA1, CCNE1

Figure S12.7. Yao’s dataset, myotubes: Active module 7

Nodes: KRTAP5-9, KRTAP10-8, KRTAP10-7, KRTAP10-3, MEOX2, KLHL38, NOTCH2NL, KRTAP10-9, PTGER3, TRIM42, TRIM43, PRAMEF12, TRIM49B, MBD3L2, TRIM49D1

Figure S12.8. Yao’s dataset, myotubes: Active module 8

Nodes: PRAMEF11, ZIM3, HNRNPCL1, TPRX1, PRAMEF2, HIST2H2AC, HIST1H2BA, MBD3L3, RELA, TRIM43, EP300, UBE2D1, HIST2H2BE, CTNNB1, CREBBP, LEF1

Figure S12.9. Yao’s dataset, myotubes: Active module 9

Nodes: MBD3, PRAMEF12, RBBP4, TRIM49B, MTA2, TRIM49D2, MBD3L2, HDAC1, GATAD2B, TRIM49D1, TRIM43, RBBP7, HDAC2, GATAD2A, TRIM49, CHD4, HIST1H2BA

Figure S12.10. Yao’s dataset, myotubes: Active module 10

Nodes: KRTAP9-4, LCE4A, SPRY2, CREB5, PTGER3, KRTAP10-7, KRTAP5-9, MEOX2, KRTAP4-2, KRTAP10-8, KRTAP10-9, LCE1B, KRTAP10-3, NOTCH2NL, KLHL38

Figure S12.11. Yao’s dataset, myotubes: Active module 11

Nodes: PRAMEF12, TRIM49D2, MBD3L2, TRIM49B, TRIM49D1, TRIM43, UBE2D1, RB1, MCM3, CDC6, CDKN1A, MCM2, CCNE1, UBE2C, CCNA1, CDKN1B, CCNB1, MCM5, CDK2, SKP2

Figure S12.12. Yao’s dataset, myotubes: Active module 12

Nodes: KRTAP10-8, PTGER3, KRTAP4-12, KRTAP10-3, KRTAP10-7, LCE1B, ADAMTSL4, KRTAP9-2, HOXA1, GLRX3, KRTAP10-9, KRTAP5-9, SPRY2, KRTAP4-2, KRTAP10-5

Figure S12.13. Yao’s dataset, myotubes: Active module 13

Nodes: USP29, TRIM39, HNRNPCL1, TRIM43, RBCK1, CDKN1A, UBE2D1, UBE2D4, UBE2D2, UBE2C, UBC, SKP2, CDK2, TRIM63, CCNA1, CDC20, CCNB1

Figure S12.14. Yao’s dataset, myotubes: Active module 14

Nodes: CLSPN, CDC6, CCNE1, CDC25A, UBE2C, CCNB1, FZR1, CCNA1, UBE2D1, TRIM43, TRIM49C, MBD3L2, TRIM49D1, PRAMEF12, TRIM49D2, TRIM49B

Figure S12.15. Yao’s dataset, myotubes: Active module 15

Nodes: CDK1, CDC25A, CCNE1, CDKN1A, CENPA, SKP2, CDK2, RPA1, CCNB1, CDC20, EP300, FZR1, UBE2D1, CCNA1, CDC6, UBE2C

Figure S12.16. Yao’s dataset, myotubes: Active module 16

Nodes: KRTAP9-2, SPRY2, ADAMTSL4, LCE3E, KRTAP9-4, KRTAP5-9, KRTAP4-12, HOXA1, KRTAP10-5, OTX1, ZNF679, PRAMEF2, HNRNPCL1, TPRX1, ARGFX, USP29, PRAMEF11, MBD3L3

Figure S12.17. Yao’s dataset, myotubes: Active module 17

Nodes: TRIM43, CCNA1, UBE2D1, BRCA1, CCNE1, CDC27, CDK2, CDC20, CDK1, CDKN1B, CDC6, CDT1, CCNB1, E2F1, CDC25C

Figure S12.18. Yao’s dataset, myotubes: Active module 18

Nodes: CCNK, POLR2A, POLR2F, POLR2C, POLR2D, BRCA1, POLR2B, POLR2G, TCEB3B, POLR2H, TRIM43, MBD3L3, PRAMEF12, PRAMEF2, HNRNPCL1

Figure S12.19. Yao’s dataset, myotubes: Active module 19

Nodes: PRAMEF9, KRTAP10-9, KRTAP9-2, NOTCH2NL, KRTAP9-4, KRTAP5-9, LCE3E, LCE1B, KRTAP4-12, KRTAP5-6, OTX1, KRTAP4-2, KRTAP10-3, HOXA1, KRTAP10-1, KRTAP10-5

Figure S12.20. Yao’s dataset, myotubes: Active module 20

Nodes: RB1, CDKN1A, CDC6, ORC1, CCNA2, CCNE1, SKP2, CDK2, MCM4, CCNA1, E2F1, FZR1, CDT1, BRCA1, ORC2

Figure S12.21. Yao’s dataset, myotubes: Active module 21

Nodes: UBE2C, E2F1, FZR1, CDC20, CCNB1, CDK2, CCNA2, CDK1, CDC6, CCNE1, CCNA1, CDT1, CDKN1A, TP53, EP300, CDC25A

Figure S12.22. Yao’s dataset, myotubes: Active module 22

Nodes: KRTAP9-4, KRTAP10-5, KRTAP4-2, KRTAP5-6, KRTAP9-2, LCE1B, NOTCH2NL, KRTAP5-9, SPRY2, KRTAP10-8, KRTAP10-9, KRTAP4-12, KRTAP10-3, PTGER3, KLHL38

Figure S12.23. Yao’s dataset, myotubes: Active module 23

Figure S13

Twenty-three active modules obtained by applying MOGAMUN on Banerji’s 2017 dataset [2] (see Table S2 for the list of samples). The color of the nodes represents the fold-change, where green and red nodes correspond to under- and over-expressed genes, respectively. Nodes with a bold black border correspond to significantly differentially expressed genes (FDR < 0.05 and absolute log2 fold-change > 1). Blue and white nodes correspond to genes with no associated transcriptomics data and no deregulation, respectively.

Nodes: MAPK10, MAPK8, PTGDS, EGF, EP300, TP53, DSP, SEPP1, ERBB2, SHC1, GRB2, AGT, EGFR, ERBB3, CDK1, TGFA, CD4, JAK2, PIK3R1, NEDD4

Figure S13.1. Banerji’s 2017 dataset: Active module 1

Nodes: RELA, KAT2B, SMAD2, NCOA1, PPARG, TP53, HDAC1, JUN, NR4A1, SMAD3, CREBBP, EP300, ESR1, RXRA, PPARGC1A, NCOA6

Figure S13.2. Banerji’s 2017 dataset: Active module 2

Nodes: UBE2I, NCOA1, SIRT1, AR, CREBBP, NR3C1, ESR1, JUN, PPARGC1A, FOS, TP53, NCOA6, ATF2, RELA, SMAD3

Figure S13.3. Banerji’s 2017 dataset: Active module 3

Nodes: PTGDS, A2M, NRG1, SOS1, NRAS, SRC, HSP90AA1, GAB1, FYN, MAPK10, HRAS, EGF, EGFR, SHC1, PIK3R1

Figure S13.4. Banerji’s 2017 dataset: Active module 4

Nodes: A2M, SEPP1, BRCA1, CASQ2, FOS, NR4A1, ESR1, SKP2, CDK2, TP53, SMAD3, ACTB, EP300, CREBBP, KAT2B

Figure S13.5. Banerji’s 2017 dataset: Active module 5

Nodes: CATSPER1, HOXA1, ADAMTSL4, PLSCR1, KRTAP5-9, KRTAP4-2, KRTAP10-1, KRTAP10-5, KRTAP10-3, KRTAP10-8, KRTAP5-6, KRTAP10-9, CHRD, NOTCH2NL, ZNF417, KRTAP10-7

Figure S13.6. Banerji’s 2017 dataset: Active module 6

Nodes: KRTAP10-1, KRTAP10-5, SPRY2, KRTAP5-6, OTX1, CHRD, HOXA1, KRTAP10-3, KRTAP5-9, CATSPER1, CREB5, KRTAP9-2, KRTAP4-2, KRTAP10-8, KRTAP10-9

Figure S13.7. Banerji’s 2017 dataset: Active module 7

Nodes: SHC1, PIK3R1, PTK2, GRB2, PTEN, IRS1, EGF, CBL, TP53, EGFR, HSP90AA1, UBC, LRRK2, TGFA, PTGDS

Figure S13.8. Banerji’s 2017 dataset: Active module 8

Nodes: APP, ABCA8, SHC1, A2M, PDK4, CASQ2, MAPK10, ACTA2, LRRK2, NEDD4, LUM, EGFR, PTGDS, CBLB, SEPP1, NDN

Figure S13.9. Banerji’s 2017 dataset: Active module 9

Nodes: NR4A1, PROX1, NFKB1, NR3C2, NCOA1, EP300, SMAD3, CDK2, MAPK10, TP53, BRCA1, SP1, SKP2, SMAD2, ESR1

Figure S13.10. Banerji’s 2017 dataset: Active module 10

Nodes: ERBB3, CRKL, CBL, SHC1, GRB2, EGFR, MAPK10, A2M, ESR1, PTGDS, TIMP3, HSP90AA1, CASQ2, SEPP1, CACNA1A, TP53, CDK1

Figure S13.11. Banerji’s 2017 dataset: Active module 11

Nodes: MCM10, NR3C2, PTGDS, ACTA2, ABCA8, CDC6, FMOD, MAPK10, MCM7, CASQ2, LUM, PRPH2, NDRG2, SEPP1, NR4A1, A2M

Figure S13.12. Banerji’s 2017 dataset: Active module 12

Nodes: BRCA1, CREBBP, MAPK10, MAPK14, FOS, UBE2I, MYC, SP1, ESR1, TP53, RELA, JUN, SMAD3, EP300, MAPK1

Figure S13.13. Banerji’s 2017 dataset: Active module 13

Nodes: TIMP3, GRB2, A2M, CASQ2, EGF, BCR, ERBB3, LUM, JAK2, MAPK10, LYN, ABCA8, LRRK2, PIK3R1, SHC1

Figure S13.14. Banerji’s 2017 dataset: Active module 14

Nodes: A2M, TIMP3, NR4A1, MCM2, CDK1, CASQ2, LRRK2, FOS, PTGDS, LUM, CDC6, MAPK10, NR3C2, SEPP1, PSMC2

Figure S13.15. Banerji’s 2017 dataset: Active module 15

Nodes: SHC1, PLCG1, ERBB3, GRB2, CRKL, EGFR, CBL, EGF, PIK3R1, ABL1, SOS1, SRC, PIK3R2, GAB1, CRK

Figure S13.16. Banerji’s 2017 dataset: Active module 16

Nodes: HDAC1, NR4A1, SMAD3, JUN, CREBBP, TP53, PML, RXRA, NR3C1, EP300, UBE2I, KAT2B, MDM2, PPARG, PPARGC1A

Figure S13.17. Banerji’s 2017 dataset: Active module 17

Nodes: TP53, JUN, MAPK8, CREBBP, FOS, EP300, KAT2B, SIRT1, ESR1, PPARGC1A, NR4A1, NCOA3, NCOA1, AR, NCOA2

Figure S13.18. Banerji’s 2017 dataset: Active module 18

Nodes: HIST3H2A, AGT, KAT2B, NR4A1, ZFP2, HIST4H4, ZNF43, TRIM28, SIRT1, HIST1H3J, TP53, HIST3H2BB, ZNF429, EP300, MAPK10, PBX1, HIST1H4L, PPARGC1A, ZNF681, FOXO1, ZNF726

Figure S13.19. Banerji’s 2017 dataset: Active module 19

Nodes: PAMR1, CHRD, SEPP1, TIMP3, CASQ2, RASSF9, POLE3, FMOD, ABCA8, MAPK10, COL14A1, PTGDS, A2M, NR3C2, LUM, PYGM

Figure S13.20. Banerji’s 2017 dataset: Active module 20

Nodes: SHC1, PLCG1, ERBB3, FYN, SRC, EGFR, HCK, PIK3R1, GRB2, SYK, GAB1, ERBB2, EGF, CBL, VAV1

Figure S13.21. Banerji’s 2017 dataset: Active module 21

Nodes: ESR1, PPARGC1A, FOS, NR4A1, EGFR, TP53, CREBBP, MAPK10, NCOA1, EP300, JUN, ACTB, NR3C2, KAT2B, HIF1A, SMAD5

Figure S13.22. Banerji’s 2017 dataset: Active module 22

Nodes: ATF2, SMAD3, RELA, HDAC1, TP53, MAPK8, BRCA1, JUN, MAPK10, CREBBP, EP300, SP1, UBE2I, NR3C1, ESR1

Figure S13.23. Banerji’s 2017 dataset: Active module 23

Figure S14

Seventeen active modules obtained by applying MOGAMUN on Banerji’s 2019 dataset [3] (see Table S3 for the list of samples). The color of the nodes represents the fold-change, where green and red nodes correspond to under- and over-expressed genes, respectively. Nodes with a bold black border correspond to significantly differentially expressed genes (FDR < 0.05 and absolute log2 fold-change > 1). Blue and white nodes correspond to genes with no associated transcriptomics data and no deregulation, respectively.

Nodes: FZR1, MDM2, UBE2D1, HIST2H2BE, TP53, UBC, HIST1H4L, CREBBP, SMAD3, PBX1, YY1, EP300, MYC, H2AFJ, HIST1H3J, KAT2B, HIST1H4A, NCOA1, RXRA, SMARCC1

Figure S14.1. Banerji’s 2019 dataset: Active module 1

Nodes: RELA, MAPK10, ATF2, EP300, SMAD3, NCOA3, ESR1, JUN, CREBBP, MYC, BRCA1, TP53, MAPK1, SP1, SMAD2

Figure S14.2. Banerji’s 2019 dataset: Active module 2

Nodes: SNCA, MAPK1, CSNK2A1, JUN, RAF1, YY1, SMAD3, CDK4, CCND2, MYC, TP53, EP300, RB1, CDKN1B, SKP2

Figure S14.3. Banerji’s 2019 dataset: Active module 3

Nodes: ERBB4, NR3C2, TRAF6, MAPK10, CPT1C, PPARGC1A, SMAD3, FHL2, MAPK12, SMAD2, TP53, NGFR, MAPK14, NCOA1, BCL2, RXRA

Figure S14.4. Banerji’s 2019 dataset: Active module 4

Nodes: KCND3, CYYR1, A2M, CILP, MAPK10, DOC2B, PTGDS, FMOD, GHR, SEPP1, ABCA8, LUM, CIT, FMO3, PAMR1

Figure S14.5. Banerji’s 2019 dataset: Active module 5

Nodes: CSNK2A1, CREBBP, SMAD2, HIST1H4L, MAPK8, SMARCE1, STAT3, MYC, BCL2, MAPK14, KAT2B, JUN, SMAD3, MAPK10, RB1, TP53

Figure S14.6. Banerji’s 2019 dataset: Active module 6

Nodes: NCOA3, TP53, PPARGC1A, MAPK10, MAPK14, EGFR, RXRA, SMAD3, MAPK11, YY1, SMAD2, H2AFJ, ERBB4, SP1, NEDD4, ABCA8, SMARCC1

Figure S14.7. Banerji’s 2019 dataset: Active module 7

Nodes: NEDD4, SNCA, SMAD3, CTNNB1, MAPK1, JUN, FYN, MAPK10, SMAD2, MAPK14, CDK5R1, DPYSL5, ESR1, TRAF6, TP53

Figure S14.8. Banerji’s 2019 dataset: Active module 8

Nodes: SQSTM1, TRAF6, SPSB4, CDK2, TP53, MAPK14, NGFR, HIST1H4L, UBB, RB1, UBC, SMAD2, KAT2B, H2AFV, SMARCC1, CHD3, SNCA, HIST1H3J, H2AFJ

Figure S14.9. Banerji’s 2019 dataset: Active module 9

Nodes: MAPK10, SKIL, ESR1, TP53, SMAD2, ATF2, RELA, SMAD3, MYC, CREBBP, JUN, CEBPB, KAT2B, EP300, MAPK1

Figure S14.10. Banerji’s 2019 dataset: Active module 10

Nodes: BRCA1, EP300, SMAD2, ESR1, MYC, NCOA6, TP53, CREBBP, MAPK14, SMAD3, NCOA1, JUN, NCOA3, RELA, MAPK1, KAT2B, MAPK10, PPARGC1A

Figure S14.11. Banerji’s 2019 dataset: Active module 11

Nodes: NCOA1, PPARGC1A, PPARG, MAPK10, KAT2B, CREBBP, ESR1, EP300, ATF2, JUN, MYC, TP53, SMAD3, SP1, SMAD2

Figure S14.12. Banerji’s 2019 dataset: Active module 12

Nodes: CIT, KPNA2, FMO3, A2M, LUM, ABCA8, CDK1, GHR, LIMS2, CILP, FMOD, PAMR1, ATP1A2, KCND3, NR3C2, RASSF9

Figure S14.13. Banerji’s 2019 dataset: Active module 13

Nodes: SMAD2, NCOA1, TP53, JUN, NCOA6, CREBBP, MAPK1, KAT2B, EP300, MED1, RXRA, PPARGC1A, MAPK14, AR, NCOA3

Figure S14.14. Banerji’s 2019 dataset: Active module 14

Nodes: NFKB1, TP53, SMAD2, PPARGC1A, JUN, YY1, PPARG, CREBBP, MYC, SP1, EP300, ESR1, NCOA1, NCOA6, SMAD3

Figure S14.15. Banerji’s 2019 dataset: Active module 15

Nodes: SMARCC1, NR3C2, AKT1, RB1, MYC, NCOA1, MDM2, RXRA, TP53, PPARG, BCL2, GSK3B, CREBBP, PPARGC1A, MAPK10, MAPK1

Figure S14.16. Banerji’s 2019 dataset: Active module 16

Nodes: FMOD, MAPK10, NR3C2, GHR, FMO3, PLN, CIT, PAMR1, CILP, LUM, ABCA8, HSPB6, KCND3, DOC2B, RASSF9

Figure S14.17. Banerji’s 2019 dataset: Active module 17

Sample ID   Type      Origin
F1          Patient   Biopsy
F2          Patient   Biopsy
F3          Patient   Biopsy
F4          Patient   Biopsy
F5          Patient   Biopsy
F6          Patient   Biopsy
F7          Patient   Biopsy
F8          Patient   Biopsy
F9          Patient   Biopsy
C1          Control   Biopsy
C2          Control   Biopsy
C3          Control   Biopsy
C4          Control   Biopsy
C5          Control   Biopsy
C6          Control   Biopsy
C7          Control   Biopsy
C8          Control   Biopsy
C9          Control   Biopsy
F4          Patient   Myoblast
F6          Patient   Myoblast
C21         Control   Myoblast
C22         Control   Myoblast
F4          Patient   Myotube
F6          Patient   Myotube
C20         Control   Myotube
C21         Control   Myotube
C22         Control   Myotube

Table S1: Samples from Yao’s datasets [1]. Downloaded from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE56787

ID              Type      Batch
54 12 r1        Patient   1
54 12 r2        Patient   1
54 12 r3        Patient   1
54 6 r1         Control   1
54 6 r2         Control   1
54 6 r3         Control   1
54 2 r1         Patient   2
54 2 r2         Patient   2
54 2 r3         Patient   2
54 A10 r1       Control   2
54 A10 r2       Control   2
54 A10 r3       Control   2
54 A5 r1        Patient   2
54 A5 r2        Patient   2
54 A5 r3        Patient   2
12ABic r3       Patient   3
12ABic r1       Patient   3
12ABic r2       Patient   3
12UBic r3       Control   3
12UBic r2       Control   3
12UBic r1       Control   3
16ABic r1       Patient   3
16ABic r2       Patient   3
16ABic r3       Patient   3
16UBic r3       Control   3
16UBic r2       Control   3
16UBic r1       Control   3

Table S2: Samples from Banerji’s 2017 dataset [2]. Downloaded from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE102812

ID              Type      Batch
54 12 T8 r1     Patient   1
54 12 T8 r2     Patient   1
54 12 T8 r3     Patient   1
54 6 T8 r1      Control   1
54 6 T8 r2      Control   1
54 6 T8 r3      Control   1
54 2 T8 r1      Patient   2
54 2 T8 r2      Patient   2
54 2 T8 r3      Patient   2
54 A10 T8 r1    Control   2
54 A10 T8 r2    Control   2
54 A10 T8 r3    Control   2
54 A5 T8 r1     Patient   2
54 A5 T8 r2     Patient   2
54 A5 T8 r3     Patient   2
12A T8 r1       Patient   3
12A T8 r2       Patient   3
12A T8 r3       Patient   3
12U T8 r1       Control   3
12U T8 r2       Control   3
12U T8 r3       Control   3
16A T8 r1       Patient   3
16A T8 r2       Patient   3
16A T8 r3       Patient   3
16U T8 r1       Control   3
16U T8 r2       Control   3
16U T8 r3       Control   3

Table S3: Samples from Banerji’s 2019 dataset [3]. Downloaded from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE123468

References

[1] Z. Yao, L. Snider, J. Balog, R. J. Lemmers, S. M. Van Der Maarel, R. Tawil, and S. J. Tapscott. DUX4-induced gene expression is the major molecular signature in FSHD skeletal muscle. Human Molecular Genetics 23, 2014.

[2] C. R. Banerji, M. Panamarova, H. Hebaishi, R. B. White, F. Relaix, S. Severini, and P. S. Zammit. PAX7 target genes are globally repressed in facioscapulohumeral muscular dystrophy skeletal muscle. Nature Communications 8, 2017.

[3] C. R. Banerji, M. Panamarova, J. Pruller, N. Figeac, H. Hebaishi, E. Fidanis, ..., and P. S. Zammit. Dynamic transcriptomic analysis reveals suppression of PGC1α/ERRα drives perturbed myogenesis in facioscapulohumeral muscular dystrophy. Human Molecular Genetics 28, 2019.
