
Analysis of cellular responses to microwave irradiation in E. coli and change of oxygen level and culture medium in human cancer cell lines using RNA-seq based transcriptomic profiling

Eunike Ilona Hilson

Biotechnology

Submitted in partial fulfillment

of the requirements for the degree of

Master of Science

Faculty of Mathematics and Science, Brock University

St Catharines, Ontario

© 2021

Abstract

RNA sequencing (RNA-seq) is an application of next-generation sequencing (NGS) whose primary objective is differential gene expression (DGE) analysis at the transcriptomic level. Among the NGS technologies, the Illumina platforms are the current standard for RNA-seq analysis because of their cost efficiency and sequencing accuracy. In this study, we employed Illumina-based RNA-seq to examine the transcriptomic profile change in E. coli cells after exposure to microwave irradiation (MWI) and in cancer cell lines in response to different culture conditions, using a breast cancer cell line (MCF7) and a prostate cancer cell line (PC3) as the models.

Our examination of gene expression changes in E. coli showed that the non-thermal effects of MWI led E. coli cells to enter the stationary phase, with most of the downregulated genes involved in metabolic and biosynthesis pathways. MWI also upregulated the expression of genes important for the maintenance of membrane integrity and for adhesion associated with bacterial motility. In comparison with other similar studies, our methodology allowed us to observe the impact of the non-thermal effects of MWI at 2.45 GHz via simultaneous cooling.

Our examination of the transcriptomic profile of MCF7 and PC3 cells in response to oxygen level and culture medium changes showed that gene expression in MCF7 is highly affected by both factors when compared to PC3, especially in DMEM at 18% O2. DNA replication, cell cycle, and viral carcinogenesis are the most affected pathways observed across the different culture conditions in both cell lines. In PC3, only the legionellosis pathway appears to be strongly affected by culture medium changes at 5% O2, involving 8 differentially expressed genes (DEGs) important for cancer cell development. DGE analysis also provides the transcriptomic profile of MCF7 and PC3, showing that different nutrient composition (between DMEM and Plasmax) and oxygen levels (5% O2 and 18% O2) changes the and various signaling pathways in both cell lines differently, suggesting that the oxygen level and culture medium are important factors impacting the outcome of cell culture-based experiments in a cell type-specific fashion.

Keywords: RNA-seq, differential gene expression analysis, differentially expressed genes, transcriptomic profile, microwave irradiation, E. coli, culture conditions, cancer cell lines, MCF7, PC3.


Acknowledgment

First, I would like to thank Brock University for all the support and wonderful experiences I had during my study in Canada. I am grateful to my supervisor, Dr. Ping Liang, for his support, guidance, and encouragement throughout my master's program and in completing my research, and for helping me understand the subject matter and broaden my skills in Biotechnology as well as Bioinformatics. I am thankful to my committee members, Dr. Jeff Stuart and Dr. Tony Yan, for their support and guidance, which helped me understand more about the research topics by allowing me to participate in their research projects. I also want to acknowledge all my colleagues in Dr. Ping Liang’s lab (Zakia Dahi, Jerry Tang, Robert Martin, Arsala Ali, Radesh Nattamai, Daniel Tang, Vinay Kumar Chundi, Marina Casavecchia, Bruce Racey, and Dr. Kai Hu), Fereshteh Moradi (Dr. Jeff Stuart’s Ph.D. student), and Frank Betancourt Montoya (Dr. Tony Yan’s Ph.D. student) for their help and support in completing this thesis. Finally, I would like to thank my family and friends in Canada and Indonesia for their continued support, physically and spiritually, in achieving my goals, especially Mr. and Mrs. Horvarth and family, Rebecca Joseph, Daislyn Vidal, Pau Pin, Felicia Marija, Katelyn Stachow, Central Church Community, Brock Power to Change (P2C), my parents, my sister and her husband, my husband and his family, Adriana Nana, Mertha Prana, Dircia Cannisia C, and Dr. Dhira Satwika. May all the glory, honor, and praise be to God alone, for he makes all things possible and beautiful in his time.


Table of contents

Abstract
Acknowledgment
List of Tables
List of Figures
List of Abbreviations

Chapter 1: General introduction on differential gene expression analysis by RNA sequencing
1.1. RNA-sequencing for differential gene expression analysis
1.2. Overview of RNA-seq workflow for DGE analysis
1.3. Research design: experimental design for RNA-seq based DGE analysis
1.3.1. Expression variation
1.3.2. Level of replications
1.3.3. Sequencing read depth
1.3.4. Sequencing read length
1.3.5. Library type: single-end or paired-end sequencing
1.3.6. Selection of RNA species
1.3.7. Strand and non-strand-specific sequencing
1.3.8. Reference availability: reference vs non-reference
1.4. DGE analysis
1.4.1. Pre-processing
1.4.2. Read mapping
1.4.3. De novo transcriptome assembly
1.4.4. Quantification of reads abundance
1.4.5. Normalization
1.4.6. Generating the raw DEGs list
1.4.7. Filtering the list of DEGs
1.4.8. Enrichment analysis
1.5. Overall research objectives

Chapter 2: Analysis of cellular responses to microwave radiation in E. coli using RNA-seq based transcriptome profiling
2.1. Introduction and related literature review
2.1.1. Microwave irradiation
2.1.2. MWI disrupts the cellular membrane activity
2.1.3. MWI changes the enzymatic activity
2.1.4. Research objectives
2.2. Methods and materials
2.2.1. Sample preparation
2.2.2. Library preparation and sequencing
2.2.3. Assessment of RNA-seq data
2.2.4. RNA-seq reads alignment
2.2.5. DGE analysis
2.2.6. DGE downstream analysis
2.2.7. Identification of unannotated transcripts
2.2.8. Functional enrichment analysis
2.2.9. Analysis of co-expressed DEGs
2.2.10. Comparing the transcriptomic data with the previous proteomic data
2.2.11. Computational analysis
2.3. Results
2.3.1. Overview of the RNA-seq data: quality check and summary statistics
2.3.2. Overview of the gene expression profile in response to MWI
2.3.3. List of DEGs common in 3 tools
2.3.4. Functional enrichment analysis of DEGs using DAVID
2.3.6. Co-expressed analysis
2.3.7. Comparing the transcriptomic data with proteomic data from the previous study
2.4. Discussion
2.4.1. E. coli reacted to enhance membrane integrity and adhesion for survival in response to MWI
2.4.1.1. The increased activity of bacterial efflux pumps
2.4.1.2. Disruption of membrane responsible for stabilizing membrane integrity
2.4.1.3. Increased activity of the PTS, controlling carbon uptake and metabolism
2.4.1.4. E. coli adhesion increased by MWI
2.4.2. E. coli shuts down most metabolism and biosynthesis to enter the stationary phase in response to MWI
2.4.2.1. E. coli enters stationary phase during MWI
2.4.2.2. The downregulated expression of genes coding for F1 sector of membrane-bound ATP synthase
2.4.2.3. The suppression of NAD+ biosynthesis
2.4.2.4. The downregulated expression of genes coding for aminoacyl-tRNA synthetases
2.4.2.5. The suppression of one carbon pool by folate
2.4.2.6. The suppression of iron transport
2.4.3. The differences between transcriptomic and proteomic data from E. coli in response to MWI
2.4.4. Concluding statements and future perspectives

Chapter 3: Assessment of different culture condition’s impact on cell’s physiology in culture by gene profiling
3.1. Introduction and related literature review
3.1.1. The development of cell culture technique
3.1.2. Cell culture medium
3.1.3. The impact of different culture medium components on cell culture
3.1.4. The impact of oxygen on cell physiology in culture
3.1.4.1. The impact of hyperoxia in cell culture
3.1.4.2. The impact of hypoxia in cell culture
3.1.5. Interaction between medium’s composition and oxygen level
3.1.6. Research objectives
3.2. Methods and Materials
3.2.1. Sample preparation
3.2.2. Library preparation and sequencing
3.2.3. Assessment of RNA-seq data
3.2.4. RNA-seq reads alignment
3.2.5. DGE analysis
3.2.6. Functional enrichment analysis
3.2.7. Analysis of co-expressed DEGs
3.2.8. Computational analysis
3.3. Results
3.3.1. Summary statistics for the RNA-seq data
3.3.2. Overview of the gene expression profile in response to oxygen level and medium change
3.3.3. DGE in response to oxygen level changes in MCF7
3.3.4. DGE in response to oxygen level changes in PC3
3.3.5. DGE in response to culture medium changes in MCF7
3.3.6. DGE in response to culture medium changes in PC3
3.3.7. Common and differential response to culture conditions between PC3 and MCF7
3.4. Discussion
3.4.1. The impact of culture conditions on MCF7 and PC3
3.4.2. The most affected KEGG Pathways in response to culture conditions in MCF7 and PC3
3.4.2.1. Cell-cycle
3.4.2.2. DNA replication
3.4.2.2.1. p53 signaling pathway
3.4.2.2.2. MMR pathway
3.4.2.2.3. DNA polymerase proofreading
3.4.2.2.4. MCM
3.4.2.3. Viral infections contribute to carcinogenesis in cell lines
3.4.2.4. Biosynthesis of antibiotics associated with various cancer metabolism
3.4.2.4.1. pathway
3.4.2.4.2. ACLY function in metabolism, fatty acid (FA) synthesis, and mevalonate pathways
3.4.2.4.3. FH and IDH play role in mitochondria dysfunction
3.4.2.4.4. MVD, FDFT, and SQLE involved in cholesterol biosynthesis
3.4.2.4.5. Biosynthesis of nucleotides
3.4.2.5. Glycolysis/
3.4.3. Two biological processes associated with response to hypoxia were affected by culture conditions in MCF7 and PC3
3.4.4. Concluding statements and future perspectives

References
Appendix Chapter 2
Appendix Chapter 3


List of Tables

Table 2.1. Alignment statistics for RNA-seq data
Table 2.2. Statistics for RNA-seq data based on the unique mapping reads
Table 2.3. Pairwise comparison of gene expression between control (CTR) and treated (MW) samples
Table 2.4. The most significantly enriched GO terms and KEGG pathways from up-regulated DEGs in E. coli in response to MWI
Table 2.5. The most significantly enriched GO terms and KEGG pathways from down-regulated DEGs in E. coli in response to MWI
Table 2.6. Overlap genes between the transcriptomic and proteomic analysis in response to MWI
Table 3.1. Applied experimental conditions for PC3 and MCF7 cell lines
Table 3.2. Description of the experimental condition involved in 3 DGE comparisons
Table 3.3. Sequencing coverage statistic of RNA-seq data
Table 3.4. Alignment statistics for RNA-seq data
Table 3.5. Pairwise comparison of gene expression profile among all samples based on Pearson correlation*
Table 3.6. Number of differentially expressed genes (DEGs) for pairwise comparisons
Table 3.7. Enriched GO terms and KEGG pathways among the DEGs in MCF7 between 5% and 18% O2 in DMEM
Table 3.8. Enriched GO terms and KEGG pathways among the DEGs in MCF7 between 5% and 18% O2 in Plasmax
Table 3.9. Enriched GO terms and KEGG pathways among DEGs in PC3 between 5% and 18% O2 in DMEM
Table 3.10. Enriched GO terms and KEGG pathways among DEGs in PC3 between 5% and 18% O2 in Plasmax
Table 3.11. Enriched GO terms and KEGG pathways among DEGs in MCF7 between DMEM and Plasmax at 5% O2
Table 3.12. Enriched GO terms and KEGG pathways among DEGs in MCF7 between DMEM and Plasmax at 18% O2
Table 3.13. Enriched GO terms and KEGG pathways among DEGs in PC3 between DMEM and Plasmax at 5% O2
Table 3.14. Enriched GO terms and KEGG pathways among DEGs in PC3 between DMEM and Plasmax at 18% O2
Table 3.15. Comparison of the enriched KEGG pathway among the different lists of DEGs
Table 3.16. Comparison of the enriched biological process GO terms among the different lists of DEGs


List of Figures

Figure 2.1. Scatter boxplot showing a gene expression distribution pattern of control and treated E. coli samples
Figure 2.2. Scatter box plot of the three replicates of two conditions, showing the varieties of gene distribution
Figure 2.3. A network of the most significantly enriched GO terms and KEGG pathways from downregulated DEGs
Figure 2.4. The comparison of significantly enriched biological processes among co-expressed, up- and down-regulated DEGs
Figure 2.5. The comparison of the most significantly enriched molecular functions among co-expressed, up- and down-regulated DEGs
Figure 2.6. The comparison of the most significantly enriched cellular components among co-expressed, up- and down-regulated DEGs
Figure 2.7. The comparison of the most significantly enriched KEGG pathways among co-expressed, up- and down-regulated DEGs
Figure 3.1. Distribution of gene expression values in MCF7 (left) and PC3 (right) under different culture conditions
Figure 3.2. The number of differentially expressed genes (DEGs) for pairwise comparisons among different culture conditions for PC3 and MCF7
Figure 3.3. Venn diagram analysis for DEGs in response to oxygen level change in DMEM (left) and Plasmax (right)
Figure 3.4. Venn diagram analysis for DEGs in response to medium change at 5% O2 (left) and 18% O2 (right)
Figure 3.5. Stack bar plots summarizing the effect of oxygen level and culture medium changes on MCF7 and PC3 based on the number of enriched biological processes (green), molecular functions (blue), and KEGG pathways (orange) among all DEGs


List of Abbreviations

Basic Local Alignment Search Tool (BLAST)

Database for Annotation, Visualization and Integrated Discovery (DAVID)

Differentially Expressed Genes (DEGs)

Differential Gene Expression (DGE)

Dulbecco’s Modified Eagle’s Medium (DMEM)

Empirical Analysis of Digital Gene Expression Data in R (edgeR)

Fold Change (FC)

Fragments Per Kilobase of transcript per Million mapped reads (FPKM)

Gene Ontology (GO)

Kyoto Encyclopedia of Genes and Genomes (KEGG)

Microwave (MW)

Microwave Irradiation (MWI)

Paired-end (PE)

Polymerase Chain Reaction (PCR)

Quality Control (QC)

Ribonucleic acid (RNA)

Ribosomal RNA (rRNA)

RNA sequencing (RNA-seq)

Spliced Transcripts Alignment to a Reference (STAR)


Chapter 1: General introduction on differential gene expression analysis by RNA sequencing

1.1. RNA sequencing and its applications

RNA sequencing (RNA-seq), one of the applications of next-generation sequencing (NGS), is a powerful high-throughput sequencing assay that combines discovery with an unlimited dynamic range of quantification for whole-transcriptome analysis, with reduced technical variability and high cost efficiency (Conesa et al., 2016; Tandonnet and Torres, 2017; Van Verk et al., 2013). As the successor of microarray-based transcriptome profiling, RNA-seq has extended the scope and depth of transcriptome studies for both model and non-model organisms, because it does not require existing genomic sequences to be available (Han et al., 2015).

In addition to the analysis of differential gene expression (DGE) as the primary application (Stark et al., 2019), RNA-seq can be used for novel gene identification, splicing analysis, and detection of mutation, RNA editing products, and sequence variants in a (Conesa et al., 2016; Han et al., 2015). Since its first appearance in 2008, almost 7,000 research publications involving the use of RNA-seq (Weber, 2015), covering organisms such as Zea mays, Arabidopsis thaliana, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Mus musculus, and Homo sapiens, have been archived (Stark et al., 2019; Wang et al., 2010). RNA-seq has also been used for gene discovery and gene expression quantification in species without a sequenced genome or in non-model organisms, and has been particularly successful in the study of unknown and regulators (Weber, 2015). Several studies in plant metabolism and crop domestication have successfully applied RNA-seq to discover relevant monoterpene indole alkaloids from Asterids (Góngora-Castillo et al., 2012), sesquiterpenes in tomato (Solanum lycopersicum) (Schilmiller et al., 2010), novel components of xyloglucan biosynthesis during seed development in Tropaeolum majus (Jensen et al., 2012), and glycerolipid biosynthesis in oilseeds from four different plant species (Troncoso-Ponce et al., 2011).

At present, the Illumina platform is the most commonly used NGS technology and the standard NGS platform for RNA-seq (Conesa et al., 2016; Van Verk et al., 2013). It can currently yield up to 3 billion reads per sequencing run and can handle total RNA, messenger RNA (mRNA), targeted RNA, and small RNA (Illumina, 2017; Van Verk et al., 2013). Both total RNA and mRNA sequencing allow the identification and quantification of common and rare transcripts and the detection of splice variants, novel transcripts, and gene fusions (Illumina, 2017). Total RNA sequencing (also known as whole-transcriptome sequencing) is the most comprehensive approach and typically involves sequencing all RNA molecules, both coding and noncoding. If the research goal is to focus primarily on the coding genes, then mRNA-Seq represents the best choice. The mRNA-Seq protocol uses a selection method to enrich for polyadenylated (poly(A)) RNA, which represents only a small percentage of the total RNA molecules (Invitrogen, 2019).

A targeted RNA-seq approach on the Illumina NGS technology combines RNA (complementary DNA, cDNA) library preparation with the enrichment of transcripts of interest using complementary capture oligonucleotide probes as baits, and does not rely on poly-A tail enrichment or ribosomal RNA (rRNA) depletion (Hardwick et al., 2019; Mittempergher et al., 2019). Targeted RNA-seq has been applied to the study of long noncoding RNA (lncRNA), non-coding RNA (ncRNA), and micro RNA (miRNA), and the study of these RNAs has grown as their importance in transcriptional and translational regulation has become more evident (Illumina, 2017).


1.2. Overview of RNA-seq workflow for DGE analysis

There is no universally optimal pipeline applicable to all experiments and applications associated with RNA-seq analysis. Researchers have been independently building pipelines based on the research goals, the organism being studied, and whether a reference genome or transcriptome sequence is available. The current standards and most resources for RNA-seq data analysis are based on Illumina sequencing. A typical analysis consists of research design and DGE analysis, with optional advanced analyses depending on the specific research goals. The key considerations in RNA-seq research design are the experimental design and the methods for library preparation. These two steps are crucial for a successful DGE analysis using RNA-seq (Conesa et al., 2016).

The standard workflow begins in the laboratory with experimental design, followed by RNA extraction (e.g., mRNA enrichment or ribosomal RNA depletion) and sequencing (cDNA synthesis, preparation of an adaptor-ligated sequencing library, and the actual sequencing) (Stark et al., 2019). The last step is to perform DGE analysis, which includes pre-processing for read quality checks, aligning and assembling the reads, quantifying transcript abundance, normalizing the data, and filtering to create lists of differentially expressed genes (DEGs). Further biological insight into an experimental system can be gained by performing enrichment analysis for Gene Ontology (GO) terms, pathways, and other biological database terms.

1.3. Research design: experimental design for RNA-seq based DGE analysis

For an RNA-seq project, a good experimental design followed by a suitable sequencing protocol is needed (Conesa et al., 2016) to obtain high-quality and biologically meaningful data. This can be achieved by considering several aspects: controls, the level of biological and technical replication, expression variation, the selection of the RNA-extraction protocol and sequencing protocol (e.g., library protocol, sequencing length and depth, strand specificity), and the availability of a reference for the organism being studied.

1.3.1. Expression variation

DGE experiments compare the relative measure of transcriptional activity across several biological conditions to identify genes that are differentially expressed between conditions. This task is complicated by expression noise resulting from biological and technical variability, which introduces a level of uncertainty that needs to be considered when the expression values from two or more conditions are compared. There are two major obstacles in identifying whether the observed difference in the expression of a gene between the two conditions is statistically significant or not. First, the observed read counts are a product of both the library size (total number of reads) and the fractional expression of the genes, and they need to be appropriately normalized before they can be meaningfully compared across conditions. Second, due to cost, time, and workload constraints, RNA-seq experiments typically consist of only a few replicates, making DGE particularly difficult due to a lack of statistical power in the experiment (Hansen et al., 2012).
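The first obstacle, the dependence of raw counts on library size, can be illustrated with a minimal sketch. It assumes a simple counts-per-million (CPM) scaling, which is only one of several normalization approaches and is not necessarily the method used later in this thesis; the gene names and counts are hypothetical.

```python
def counts_per_million(counts):
    """Scale raw read counts so that each library sums to one million.

    `counts` maps gene name -> raw read count for a single sample.
    """
    library_size = sum(counts.values())
    if library_size == 0:
        raise ValueError("library has no mapped reads")
    return {gene: n * 1_000_000 / library_size for gene, n in counts.items()}


# Hypothetical toy libraries: identical fractional expression,
# but the treated sample was sequenced twice as deeply.
control = {"geneA": 100, "geneB": 300, "geneC": 600}
treated = {"geneA": 200, "geneB": 600, "geneC": 1200}

cpm_control = counts_per_million(control)
cpm_treated = counts_per_million(treated)

for gene in control:
    print(gene, control[gene], treated[gene], cpm_control[gene], cpm_treated[gene])
```

In this toy case the raw counts differ two-fold for every gene, yet the scaled values agree, which is why counts should not be compared across conditions before normalization.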

Moreover, true gene expression is a stochastic process and is known to vary between individuals or units in the same population. Stochasticity in gene expression is generally viewed as detrimental to cellular function, yet it can also be advantageous. It is manifested as fluctuations in the abundance of expressed molecules at the single-cell level, producing variability and heterogeneity within populations of genetically identical cells. For instance, genetically identical cells exposed to the same environmental conditions can show significant variation in molecular content and marked differences in phenotypic characteristics that are linked to stochasticity in gene expression (Gierliński et al., 2015).

In a typical experiment, variation in gene expression measurements can be decomposed as (Chhangawala et al., 2015):

Var (Expr) = Across Group Variability + Measurement Error + Biological Variability.

Across-group variability is the variation in gene expression between the conditions under consideration in an experiment; it can be measured by comparing samples from different biological groups and is typically the quantity of research interest. The second component, measurement error, can be estimated with technical replicates, that is, different aliquots of the same sample measured with the same technology multiple times (Chhangawala et al., 2015). The third component, biological variability, is particular to each experimental system and is harder to control. True biological variability can only be measured by considering expression measurements taken from multiple biological samples within the same group (Chhangawala et al., 2015; Illumina, 2020).
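As a rough illustration of how the three components relate to replicate structure, the sketch below computes naive estimates for a single gene from a nested layout (group, biological sample, technical replicate). The data are hypothetical, and the estimators are deliberately simple: for example, the variance of sample means still contains a share of measurement error, which a proper mixed-model analysis would separate out.

```python
from statistics import mean, variance

# Hypothetical expression values for one gene:
# group -> biological sample -> technical replicate measurements
data = {
    "control": {"c1": [10.0, 10.4], "c2": [11.8, 12.2], "c3": [9.0, 9.4]},
    "treated": {"t1": [20.1, 19.9], "t2": [22.0, 22.4], "t3": [18.0, 18.4]},
}


def variance_components(data):
    """Return naive estimates of the three variance components."""
    technical_variances = []   # spread among technical replicates of a sample
    biological_variances = []  # spread among sample means within a group
    group_means = []
    for samples in data.values():
        sample_means = []
        for replicates in samples.values():
            technical_variances.append(variance(replicates))
            sample_means.append(mean(replicates))
        biological_variances.append(variance(sample_means))
        group_means.append(mean(sample_means))
    return {
        "measurement_error": mean(technical_variances),
        "biological": mean(biological_variances),
        "across_group": variance(group_means),
    }


components = variance_components(data)
for name, value in components.items():
    print(f"{name}: {value:.3f}")
```

The layout makes the practical point of the text concrete: measurement error needs technical replicates to be estimated, and biological variability needs several biological samples per group.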

1.3.2. Level of replications

Having enough biological replicates is essential in an experiment to capture the biological variability between samples and to give confident results in quantitative analysis. Replicates allow outlier samples to be identified and, if necessary, removed or down-weighted before performing the data analysis (Han et al., 2015). Measurement variability (e.g., effect size, false-positive and false-negative rates), reproducibility (e.g., maximum sample size), and biological variability (e.g., within-group variation) are the factors that determine the optimal number of replicates in an RNA-seq experiment (Stark et al., 2019; Illumina, 2020). The variability in the measurements is influenced by technical noise and biological variation, while reproducibility in RNA-seq is usually high at the level of sequencing. Other steps, such as RNA extraction and library preparation, can be noisier and may introduce biases in the data; these can be minimized by adopting good experimental procedures (Illumina, 2020).

The large impact that ‘bad’ replicates have on the underlying properties of RNA-seq data can be reduced, but not eliminated, through improvements in sequencing techniques such as paired-end (PE) and longer reads. Increasing the number of biological replicates can also mitigate the risk of ‘bad’ biological replicates skewing the interpretation of the data (Hansen et al., 2012). It is suggested that a minimum of 4 to 6 biological replicates should be used, with an emphasis on the necessity of measuring biological variance. More replicates are likely to be required for highly diverse samples, such as clinical tissue from cancer patients and tumors (Stark et al., 2019; Illumina, 2020).

1.3.3. Sequencing read depth

Another important factor in pre-analysis is determining the sequencing depth, defined as the number of sequenced reads for a given sample (Stark et al., 2019; Illumina, 2020). The ability to find transcripts and detect differential expression, especially for genes and transcripts with low expression levels, is largely determined by the sequencing depth, which raises the question of how many reads should be generated in an RNA-seq experiment to obtain robust results. Knowledge of the relationships among sequencing depth, feature detection, and differential expression is needed for experimental design purposes (Hardcastle and Kelly, 2010). According to Illumina, most experiments require 5–200 million reads per sample: (1) 5–25 million reads for gene expression profiling; (2) 30–60 million reads for a more global view of gene expression; (3) 100–200 million reads for an in-depth view of the transcriptome or to assemble new transcripts; (4) 3 million reads for targeted RNA expression; and (5) 1–5 million reads for miRNA-Seq or small RNA analysis (Illumina, 2020). For DGE experiments in eukaryotic genomes, read depths of around 10–30 million reads per sample are required. However, it has been shown across multiple species that depths of 1 and 30 million reads per sample provide similar transcript abundance estimation for the most highly expressed parts of the transcriptome. If only relatively large changes in the expression of the most highly expressed genes are important, and if there are adequate biological replicates, less sequencing may be enough. The estimate of read depth can be validated after sequencing by checking the distribution of reads among the samples and by checking saturation curves to determine whether further sequencing is likely to increase the sensitivity of the experiment. The total number of reads required is determined by multiplying the number of samples by the desired read depth (Stark et al., 2019).

Sequencing saturation is a measure of the fraction of library complexity that was sequenced in a given experiment; it depends on the library complexity and the sequencing depth. Different cell types will have different amounts of RNA and thus will differ in the total number of different transcripts in the final library (also known as library complexity). The formula for calculating sequencing saturation is as follows: Sequencing Saturation = 1 - (n_deduped_reads / n_reads), where n_deduped_reads is the number of unique (valid cell-barcode, valid UMI, gene) combinations among confidently mapped reads and n_reads is the total number of confidently mapped, valid cell-barcode, valid UMI reads (10x Genomics, 2018).
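The two calculations described above, the total number of reads for an experiment and the sequencing saturation, can be written out as a short sketch. The function names and the example numbers are illustrative assumptions; only the two formulas themselves come from the text.

```python
def total_reads_required(n_samples, reads_per_sample):
    """Total reads for a run: number of samples times desired read depth."""
    if n_samples < 0 or reads_per_sample < 0:
        raise ValueError("inputs must be non-negative")
    return n_samples * reads_per_sample


def sequencing_saturation(n_deduped_reads, n_reads):
    """Sequencing Saturation = 1 - (n_deduped_reads / n_reads)."""
    if n_reads <= 0:
        raise ValueError("n_reads must be positive")
    if not 0 <= n_deduped_reads <= n_reads:
        raise ValueError("n_deduped_reads must lie between 0 and n_reads")
    return 1 - n_deduped_reads / n_reads


# Hypothetical design: 12 samples at 20 million reads each.
print(total_reads_required(12, 20_000_000))

# Hypothetical library: 10 million confidently mapped reads,
# of which 2 million are unique combinations.
print(round(sequencing_saturation(2_000_000, 10_000_000), 3))
```

A saturation close to 1 means most reads are repeats of combinations already seen, so additional sequencing would add little new information; a value close to 0 means the library is still largely unsampled.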

1.3.4. Sequencing read length

Current RNA-seq protocols use an mRNA fragmentation approach before sequencing to gain sequence coverage of whole transcripts, where the total number of reads for a given transcript is proportional to the expression level of the transcript multiplied by the length of the transcript. This means that a long transcript will have more reads mapped to it, and more power to detect its differential expression, than a short transcript with a similar expression level (Han et al., 2015). The read lengths offered by NGS platforms have increased substantially over time, and sequencers have been improved from single-end (SE) sequencing to paired-end (PE) sequencing, allowing both ends of a fragment to be sequenced. The standard read length is 100 bp for PE reads, and it is also possible to run 300 bp PE reads (Oshlack and Wakefield, 2009). According to Illumina, read length depends on the application and the final size of the library: (1) 50–75 bp SE reads for gene expression or RNA profiling; (2) PE reads (e.g., 2 x 75 bp or 2 x 100 bp) for transcriptome analysis (e.g., novel transcriptome assembly and annotation projects); and (3) 50 bp SE reads for small RNA analysis (Illumina, 2020).

In many sequencing applications, the length of the sequencing reads has a great impact on the usefulness of the data, as longer reads give more coverage of the sequenced DNA than short reads. This is less applicable when using RNA-seq to examine DGE, where the important factor is the ability to determine the origin of each read in the transcriptome. Once a read’s position can be mapped unambiguously, longer reads do not add much value in a quantification-based analysis. However, the ideal read length for RNA-seq depends on the research goals. If only a list of DEGs is desired, then 50 bp single-end reads would be enough for most studies. In contrast, for splice variant detection and specific isoform identification, longer reads should be used, including PE reads (Stark et al., 2019).

1.3.5. Library type: single-end or paired-end sequencing

Sequencing can involve SE or PE reads, although PE is preferable for de novo transcript discovery or isoform expression analysis (Conesa et al., 2016). In SE sequencing, only one end (3ʹ or 5ʹ) of each cDNA fragment is used to generate a sequence read, while PE sequencing generates two reads for each fragment (one for 3ʹ and one for 5ʹ). In assays where coverage of as many nucleotides as possible is desired, PE sequencing is preferred. However, sequencing every base of a transcript fragment is not required for DGE analysis if only the count of the reads mapped to a transcript after alignment is needed (Han et al., 2015).

Studies have shown that using short SE reads compromised the ability to detect isoforms, as fewer reads were seen spanning a splice junction. PE sequencing can additionally help in preventing ambiguous read mapping and is preferred in alternative-exon quantification, fusion transcript detection, and de novo transcript discovery, particularly when working with poorly annotated transcriptomes. Despite its advantages, the use of PE reads leads to increased cost in reagents and an increase in sequencing running time (Conesa et al., 2016; Han et al., 2015).

1.3.6. Selection of RNA species

For eukaryotes, the selection of RNAs involves choosing whether to enrich for mRNA using poly(A) selection or to deplete rRNA (Conesa et al., 2016; Zhang et al., 2012). One important aspect of the experimental design before constructing RNA-Seq libraries is the RNA extraction protocol used to enrich or deplete a “total” RNA sample for particular RNAs (Han et al., 2015; Sultan et al., 2012). The total RNA pool includes rRNA, precursor messenger RNA (pre-mRNA), mRNA, and various classes of ncRNA. In most cell types, the majority of RNA molecules are rRNA, typically accounting for over 90% of the total cellular RNA, leaving only 1–2% as the mRNA that we are normally interested in (Conesa et al., 2016; Sultan et al., 2012; Zhang et al., 2012). If the rRNA transcripts are not removed before library construction, they will consume the bulk of the sequencing reads, reducing the overall depth of sequence coverage and thus limiting the detection of other less-abundant RNAs (Sultan et al., 2012).

Two methods are generally used: oligo-dT enrichment for mRNA selection and rRNA depletion for whole-transcriptome analysis. In oligo-dT enrichment, mRNA is separated from genomic DNA using biotinylated oligo(dT) primers by selecting transcripts containing a poly(A) tail. This poly(A) selection strategy has the advantage of capturing the most informative transcripts, such as mRNAs and most long non-coding RNAs (lncRNAs), while excluding the undesirable rRNA and transfer RNA (tRNA) in a single step (Saliba et al., 2014; Stark et al., 2019). The majority of published RNA-seq data are generated using this method, which focuses sequencing on the protein-coding regions of the transcriptome to study the expression of most protein-coding genes but misses non-coding RNA (Han et al., 2015; Stark et al., 2019). Poly(A) selection typically requires a relatively high proportion of mRNA with minimal degradation, as measured by the RNA integrity number (RIN), and normally yields a higher overall fraction of reads falling onto known exons (Xie et al., 2014).

rRNA removal is achieved either by separating rRNAs from other RNAs (pull-out methods) or by selective degradation of rRNA by RNase H using sequence- and species-specific oligonucleotide probes. The oligonucleotide probes are complementary to both cytoplasmic rRNAs and mitochondrial rRNAs. Pull-out methods incorporate biotinylated probes and streptavidin-coated magnetic beads used to remove the oligo-rRNA complexes from the solution. RNase H methods degrade the resulting oligo-DNA:RNA hybrid using RNase H. A recent comparison of these methods shows that, in high-quality RNA, the two methods can reduce rRNA to under 20% of the subsequent RNA-seq reads, with RNase H methods being much less variable than pull-out approaches (Stark et al., 2019).

1.3.7. Strand and non-strand-specific sequencing

Strand information is vital since many genes undergo anti-sense transcription, which serves a regulatory role and has also been associated with diseases (Sarantopoulou et al., 2019). In eukaryotes, RNA molecules complementary to mRNA can also be produced through anti-sense transcription (Borodina et al., 2011). A particularly relevant aspect in transcriptomics is the information on transcript orientation, which facilitates the detection of overlapping transcripts in opposite orientations and allows an accurate measure of gene activity levels (Conesa et al., 2016). In a standard treatment or control experiment, an important segment of signal comes from the anti-sense strand of annotated RNA, and if the data generated are not strand-specific, then all such reads will be quantified as “sense” signal (Sarantopoulou et al., 2019).

Originally, RNA-Seq was not strand-specific, which made it difficult during data analysis to distinguish between the original RNA molecule and its complement (Borodina et al., 2011). The first generation of Illumina-based RNA-seq used random hexamer priming to reverse-transcribe poly(A)-selected mRNA. This methodology did not retain the information on which DNA strand is actually expressed and therefore complicates the analysis and quantification of antisense or overlapping transcripts (Zhang et al., 2012). The library preparation step is therefore crucial for closely reflecting the original cellular RNA content of a sample (Sultan et al., 2012).

Several strand-specific RNA-Seq (ssRNA-Seq) protocols were developed for generating strand-specific RNA-seq libraries, based either on the ligation of adapters to the RNA molecules or on modifications of the double-stranded cDNA synthesis procedure. Ligating adapters directly to single-stranded RNA molecules imposes no restrictions on RNA length and is the only choice for the analysis of short RNA molecules (e.g., miRNA). This principle is used in many commercial preparation kits, but these methods are laborious and sensitive to rRNA contamination (Borodina et al., 2011). Among the available protocols, the deoxy-UTP (dUTP) strand-marking protocol has been rated as the leading methodology for antisense-transcription identification, providing excellent library complexity, strand specificity, coverage evenness, agreement with known annotation, and accuracy for expression profiling (Conesa et al., 2016; Zhang et al., 2012). In this method, the RNA is first reverse-transcribed into a cDNA:RNA hybrid using random primers. To synthesize the second cDNA strand, dUTP is used instead of deoxythymidine triphosphate (dTTP), marking the second cDNA strand for subsequent degradation with uracil-DNA glycosylase (UDG) to preserve strand information (Zhang et al., 2012). However, dUTP-based methods are laborious and time-consuming, since gel purification steps and reagent calibration are required (Conesa et al., 2016).

Because of the limitations of dUTP-based methods, commercial kits, such as the Illumina TruSeq RNA protocol, were used for many high-throughput RNA-seq production pipelines. Despite its optimization for total RNA and its ability to be processed in a 96-well microtiter plate format, a major drawback of the Illumina TruSeq protocol is the loss of the original RNA strand information. To solve this problem, a modification of the Illumina TruSeq protocol with the dUTP protocol has been reported, which introduces strand specificity into the Illumina method. This modification turns the original protocol into a simple, scalable poly(A)+ library preparation method (Conesa et al., 2016). Another protocol for preparing strand-specific RNA-Seq libraries combines rRNA removal using the Ribo-Zero rRNA Removal Kit (Epicentre, Madison, WI, USA) with the dUTP method for ensuring strand specificity. This protocol has proven advantageous in terms of time and cost efficiency, and it optimizes the performance of size selection and DNA recovery to allow the use of small amounts of starting RNA (Chen et al., 2017). Although a wide range of strand-specific protocols are available and proven to be more beneficial, many researchers still rely on non-strand-specific RNA-Seq library preparation, where RNA is first converted into double-stranded cDNA, mostly because it is simple, straightforward, and stable (Borodina et al., 2011).

1.3.8. Reference availability: reference vs non-reference

When a reference genome is available, RNA-seq analysis will normally involve the mapping of the reads onto the reference to determine which transcripts are expressed (Conesa et al., 2016). Mapping-first approaches start with aligning all the reads to a reference genome and then merging sequences with overlapping alignment, spanning splice junctions with reads and paired-ends (Conesa et al., 2016; Grabherr et al., 2011). By contrast, if no reference genome is available for the organism, then the analysis has to start with the transcriptome assembly to be used as the reference transcriptome (Conesa et al., 2016). More specifically, such an assembly- first (de novo) method uses the reads to assemble transcripts directly into longer contigs and then to treat these contigs as the expressed transcriptome in which reads are mapped back for quantification (Conesa et al., 2016; Grabherr et al., 2011).

Mapping solely to the reference transcriptome of a known species precludes the discovery of new unannotated transcripts and focuses the analysis on quantification alone (Conesa et al., 2016). Mapping-first approaches promise, in principle, maximum sensitivity, but they depend on correct read-to-reference alignment, a task that is complicated by splicing, sequencing errors, and the lack or incompleteness of reference genomes in many cases. Conversely, assembly-first approaches can cope with reference sequences that are gapped, highly fragmented, or substantially altered, as is the case in cancer cells (Grabherr et al., 2011). It can be challenging to accurately assemble short reads from non-model organisms without any reference genome, a process called de novo transcriptome assembly. Because this is the first step, problems at this stage (e.g., incomplete assembly, assembly errors, and redundancy) cause difficulties for downstream analyses including ortholog and paralog identification, alignment, and matrix construction. They also increase the amount of missing data in the final aligned matrix, ultimately limiting the amount of useful transcriptomic data (Yang and Smith, 2013).

1.4. DGE analysis

The next step in DGE analysis is transcriptome profiling to determine the differentially expressed genes. At least five steps are required for the core analysis, including pre-processing of raw reads with a quality-control check to ensure high-quality input data, read mapping, transcript assembly, quantification of transcript abundance, data normalization, and generation of the DEGs list, followed by enrichment analysis (Conesa et al., 2016; Stark et al., 2019). The starting point for DGE analysis is the set of data files generated by the sequencing service provider, usually in FASTQ format, which contains sequencing base calls and per-base quality scores, followed by pre-processing and DGE analysis (Cock et al., 2009).
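To make the FASTQ structure concrete, the following is a minimal Python sketch of a reader for four-line FASTQ records. It assumes the Phred+33 quality encoding and unwrapped (single-line) sequences, which is an assumption about the input files and not something stated in this thesis. The function name and the example record are hypothetical.

```python
from io import StringIO


def parse_fastq(handle):
    """Yield (read_id, sequence, phred_scores) from a four-line-per-record FASTQ stream.

    Assumes Phred+33 encoding: quality = ord(character) - 33.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of file
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line is ignored
        quality_line = handle.readline().rstrip("\n")
        scores = [ord(ch) - 33 for ch in quality_line]
        yield header[1:], sequence, scores  # drop the leading '@'


# Hypothetical single-record example
example = StringIO("@read1\nACGT\n+\nII5!\n")
for read_id, seq, quals in parse_fastq(example):
    print(read_id, seq, quals)
```

In practice an established parser would normally be used; the sketch only shows how each base call is paired with one quality score.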

A typical RNA-seq pipeline for generating the DEGs list consists of three major steps. First, reads are mapped to the genome or transcriptome. Second, mapped reads for each sample are assembled into gene-level, exon-level, or transcript-level expression summaries, depending on the aims of the experiment. Finally, the summarized data are normalized in concert with the statistical testing of DGE, leading to a ranked list of genes with associated p-values and fold changes (FC) that will be used for the filtering of DEGs (Oshlack and Wakefield, 2009). Each step requires distinct computational tools, and the number of computational approaches has greatly expanded over the past 10 years. Analytical practice diverges substantially at each stage, and differences in the approaches and combinations of tools used in a pipeline can affect the biological conclusions drawn from the data. Additionally, the optimal set of tools to use will depend on the specific biological question and the availability of computational resources (Stark et al., 2019).
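The last step of the pipeline above, filtering a ranked list by p-value and fold change, can be illustrated with a short Python sketch. The thresholds (p-value at most 0.05, absolute log2 fold change at least 1), the field names, and the example genes are hypothetical and serve only as placeholders; they are not the cut-offs used in this study.

```python
def filter_degs(results, max_p=0.05, min_abs_log2fc=1.0):
    """Keep genes passing both thresholds, ranked by ascending p-value.

    `results` is a list of dicts with the hypothetical keys
    'gene', 'log2_fc', and 'p_value'.
    """
    kept = [
        row for row in results
        if row["p_value"] <= max_p and abs(row["log2_fc"]) >= min_abs_log2fc
    ]
    return sorted(kept, key=lambda row: row["p_value"])


# Hypothetical statistical test output
results = [
    {"gene": "g1", "log2_fc": 2.3, "p_value": 0.001},
    {"gene": "g2", "log2_fc": 0.4, "p_value": 0.0005},  # significant but small change
    {"gene": "g3", "log2_fc": -1.8, "p_value": 0.02},
    {"gene": "g4", "log2_fc": 3.0, "p_value": 0.2},     # large change, not significant
]
print([row["gene"] for row in filter_degs(results)])
```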

1.4.1. Pre-processing

After obtaining the RNA-Seq reads, their quality should be evaluated to detect sequencing errors, polymerase chain reaction (PCR) artifacts, or contamination. This involves analyzing sequence quality, GC content, the presence of adaptors, overrepresented k-mers, and duplicated reads. Acceptable levels are experiment- and organism-specific, but ideally the values should be consistent across the samples in the same experiment. Outliers with over 30% disagreement are suggested to be discarded (Conesa et al., 2016; Stark et al., 2019). FastQC (Andrew, 2010) is a popular tool for performing these analyses on Illumina reads. It flags any potential abnormalities that may have occurred during library preparation or the sequencing reaction. NGS quality control (QC) toolkit (Patel et al., 2012) is also an easy-to-use stand-alone package for quality check, filtering, trimming, generating statistics, and conversion between different file formats/variants of NGS data from any sequencing platform. In general, read quality decreases towards the 3’ end of reads, and if it becomes too low, bases should be removed to improve mappability (Conesa et al., 2016; Han et al., 2015). Other software tools, such as the FASTX-Toolkit (Gordon and Hannon, 2010) and Trimmomatic (Bolger et al., 2014), can be used to discard low-quality reads, trim adaptor sequences, and eliminate poor-quality bases (Conesa et al., 2016; Stark et al., 2019).
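The idea of removing low-quality bases from the 3’ end can be sketched in a few lines of Python. This is a deliberately simplified illustration and is not the algorithm implemented by Trimmomatic or the FASTX-Toolkit; the quality threshold of 20 and the example read are hypothetical.

```python
def trim_3prime(sequence, qualities, min_quality=20):
    """Remove bases from the 3' end while their quality is below min_quality."""
    end = len(sequence)
    while end > 0 and qualities[end - 1] < min_quality:
        end -= 1
    return sequence[:end], qualities[:end]


# Hypothetical read whose quality drops towards the 3' end
seq, quals = trim_3prime("ACGTAC", [38, 37, 30, 12, 8, 2])
print(seq, quals)
```

Real trimming tools typically use sliding windows or error-probability models, so that a single low-quality base inside an otherwise good read does not truncate it.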

1.4.2. Read mapping

After pre-processing of the raw reads is completed, the first step in a typical transcript-level RNA-seq processing workflow is aligning reads to a transcriptome or a reference genome, converting each sequence read to one or more genomic coordinates for further use in estimating transcript abundances (Bray et al., 2016; Conesa et al., 2005; Stark et al., 2019). The task of mapping is to find a unique location where a short read is identical or closely similar to the reference. However, the reference is never a perfect representation of the actual biological source of RNA being sequenced. In addition to sample-specific attributes, such as single nucleotide polymorphisms (SNPs) and insertions or deletions (indels), there is also the consideration that the reads arise from a spliced transcriptome rather than a genome (Oshlack and Wakefield, 2009). Alignment to the genome is slower than to the transcriptome because the mapping of intron-spanning reads requires greater computer processing time, but it provides important information on the presence of different mRNA splice variants. For transcriptome alignments, software tools such as Bowtie (Langmead et al., 2009) and Burrows-Wheeler Aligner (BWA) (H. Li and Durbin, 2009) are commonly used, whereas splice-aware aligners such as TopHat (Trapnell et al., 2009), Spliced Transcripts Alignment to a Reference (STAR) (Dobin et al., 2013), or Hierarchical Indexing for Spliced Alignment of Transcripts (HISAT) (Kim et al., 2015) are popular choices for mapping reads to an annotated reference genome (Oshlack and Wakefield, 2009; Stark et al., 2019; Van Verk et al., 2013).

BWA is a read alignment package that is based on a backward search with Burrows-Wheeler Transform (BWT) to efficiently align short sequencing reads against a large reference sequence, allowing mismatches and gaps (H. Li and Durbin, 2009). The BWT is an algorithm that takes a block of data and rearranges it using a sorting algorithm. The resulting output block contains the same data elements that it started with, differing only in their ordering. The actual output of the BWT consists of two things: a copy of the last column (L), and the primary index, an integer indicating which row contains the original first character (Nelson, 1996). Bowtie is an ultra-fast short-read mapping program using a technique borrowed from data compression, the Burrows-Wheeler transform, for indexing the reference genome to produce a memory-efficient data structure. TopHat is a software package that identifies splice sites ab initio by large-scale mapping of RNA-Seq reads. TopHat aligns all sites, relying on an efficient 2-bit-per-base encoding and a data layout that effectively uses the cache on modern processors. This strategy works well in practice because TopHat first maps non-junction reads (those contained within exons) using Bowtie to produce a memory-efficient data structure (Trapnell et al., 2009). STAR software was designed to specifically address many of the challenges in RNA-seq data mapping and uses a novel strategy for spliced alignments via sequential maximum mappable seed search in uncompressed suffix arrays followed by a seed clustering and stitching procedure. STAR was designed to align the non-contiguous sequences directly to the reference genome (Dobin et al., 2013). HISAT uses an indexing scheme based on the Burrows-Wheeler transform and the Ferragina-Manzini (FM) index to create a fast spliced aligner that uses a modest amount of random-access memory (RAM). HISAT employs two types of indexes for alignment: a whole-genome FM index to anchor each alignment and numerous local FM indexes for very rapid extensions of these alignments (Kim et al., 2015).
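As an illustration of the Burrows-Wheeler transform described earlier in this section, the following Python sketch builds all rotations of a string, sorts them, and returns the last column together with a primary index. Here the primary index is taken to be the row of the sorted matrix that holds the unrotated input, which is one common convention and an assumption of this sketch. The naive implementation is for explanation only; aligners such as BWA and Bowtie use far more efficient index constructions.

```python
def bwt(text):
    """Naive Burrows-Wheeler transform: returns (last_column, primary_index)."""
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    last_column = "".join(rotation[-1] for rotation in rotations)
    primary_index = rotations.index(text)  # row holding the original string
    return last_column, primary_index


def inverse_bwt(last_column, primary_index):
    """Rebuild the original text by repeatedly prepending L and re-sorting."""
    table = [""] * len(last_column)
    for _ in range(len(last_column)):
        table = sorted(last_column[i] + table[i] for i in range(len(last_column)))
    return table[primary_index] if table else ""


last, index = bwt("banana")
print(last, index)                # nnbaaa 3
print(inverse_bwt(last, index))   # banana
```

The output contains the same characters as the input in a different order, and identical characters tend to be grouped together, which is what makes the transform useful for compression and indexing.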

More recently, computationally efficient ‘alignment-free’ tools, such as Sailfish (Patro et al., 2014), Kallisto (Bray et al., 2016), and Salmon (Patro et al., 2017), have been developed to associate sequencing reads directly with transcripts without a separate quantification step. Sailfish quantifies by extracting k-mers from reads followed by exact matching of the k-mers using a hash table. However, shredding reads into k-mers discards valuable information present in the complete reads, since each k-mer can align to more transcripts than the read itself. Kallisto, a method based on the pseudoalignment of reads and fragments, addresses this issue by focusing only on identifying the transcripts from which the reads could have originated rather than trying to pinpoint exactly how the sequences of the reads and transcripts align. The accuracy of Kallisto is similar to that of existing RNA-seq quantification tools, and pseudoalignments explicitly preserve the information provided by k-mers across reads, which enables a substantial improvement over Cufflinks and Sailfish (Bray et al., 2016; Patro et al., 2014, 2017).
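The difference between per-k-mer lookup and read-level compatibility can be shown with a toy Python sketch. It builds a hash table from k-mers to the transcripts that contain them and then intersects the transcript sets of all k-mers in a read. This is a strong simplification made for illustration: it is not the data structure used by Sailfish or Kallisto, and the transcript sequences, names, and value of k are hypothetical.

```python
from collections import defaultdict


def kmers(sequence, k):
    """All overlapping k-mers of a sequence, in order."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def build_index(transcripts, k):
    """Hash table mapping each k-mer to the set of transcripts containing it."""
    index = defaultdict(set)
    for name, sequence in transcripts.items():
        for kmer in kmers(sequence, k):
            index[kmer].add(name)
    return index


def compatible_transcripts(read, index, k):
    """Transcripts that contain every k-mer of the read (set intersection)."""
    compatible = None
    for kmer in kmers(read, k):
        hits = index.get(kmer, set())
        compatible = set(hits) if compatible is None else compatible & hits
    return compatible or set()


# Two hypothetical isoforms sharing a common 5' region
transcripts = {"t1": "ACGTACGGA", "t2": "ACGTACCTT"}
index = build_index(transcripts, k=5)
print(sorted(compatible_transcripts("ACGTAC", index, k=5)))  # shared region
print(sorted(compatible_transcripts("GTACGG", index, k=5)))  # specific to t1
```

A single k-mer such as the one starting the second read may be shared by several transcripts, but requiring all k-mers of the read to agree narrows the candidate set, which is the information that is lost when k-mers are treated independently.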

A novel quantification procedure, Salmon, employs a new dual-phase statistical inference procedure and sample-specific bias models that account for sequence-specific, fragment GC-content, and positional biases. It achieves the same order-of-magnitude benefits in speed as Kallisto and Sailfish but with greater accuracy. Salmon is capable of either mapping sequencing reads itself, using a fast and lightweight procedure called quasi-mapping, or accepting precomputed read alignments in the form of SAM or BAM files (Patro et al., 2017). Overall, these ‘alignment-free’ tools have demonstrated good performance in characterizing more highly abundant and longer transcripts; however, they are less accurate in quantifying low-abundance or short transcripts (Stark et al., 2019).

1.4.3. De novo transcriptome assembly

In the absence of high-quality genome annotation containing known exon boundaries, or if it is desirable to associate reads with transcripts rather than genes, assembly of transcripts from the reads is performed first. Assembly tools such as StringTie (Pertea et al., 2015) and SOAPdenovo-Trans (Xie et al., 2014) use the gaps identified in the alignments to derive exon boundaries and possible splice sites, while Trinity (Grabherr et al., 2011; Xie et al., 2014) does not. StringTie uses a genome-guided transcriptome assembly approach along with concepts from de novo genome assembly, first grouping the reads into clusters and creating a splice graph for each cluster, and then estimating its expression level. StringTie assembles transcripts and estimates their expression levels simultaneously, which improves transcript assembly (Pertea et al., 2015).

Trinity generates a de novo RNA-seq assembly by partitioning RNA-seq data into many independent de Bruijn graphs (DBG) and uses parallel computing to reconstruct transcripts from these graphs, including alternatively spliced isoforms. Trinity can leverage Illumina’s strand-specific paired-end libraries, but it can also accommodate non-strand-specific and single-end-read data. Furthermore, Trinity was converted into a modular platform that seamlessly uses third-party tools, such as Jellyfish, to build the initial k-mer catalog. Other third-party tools integrated into Trinity have enhanced the utility of its reconstructed transcriptomes. For example, Trinity now supports tools, such as RSEM (Li and Dewey, 2011), edgeR (Robinson et al., 2009), and DESeq (Love et al., 2014), that take its output transcripts and test for differential expression while accounting for both technical and biological sources of variation and multiple hypothesis testing (Haas et al., 2013; Illumina, 2019).
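The de Bruijn graph structure mentioned above can be illustrated with a short Python sketch in which each k-mer from the reads becomes an edge between its (k-1)-mer prefix and suffix. This shows only the basic graph construction under a hypothetical choice of k and toy reads; it is not Trinity's implementation, which additionally handles sequencing errors, read coverage, and graph partitioning.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes that follow it."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])  # edge: prefix -> suffix
    return graph


# Two hypothetical overlapping reads
graph = de_bruijn_graph(["ACGTC", "CGTCA"], k=3)
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```

Following the edges from AC through CG, GT, and TC to CA spells out ACGTCA, the sequence from which both toy reads were taken; a node with more than one outgoing edge would indicate a branch such as an alternative splice form or a sequencing error.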

SOAPdenovo-Trans is a DBG-based de novo assembler for transcriptome data, derived from the SOAPdenovo2 (Luo et al., 2015) genome assembler, which adopts and improves on concepts from Trinity and Oases (Schulz et al., 2012) for the error-removal model. The SOAPdenovo-Trans algorithm consists of two main steps: (i) contig assembly and (ii) transcript assembly, with the use of a strict transitive reduction method to simplify the scaffolding graphs and provide more accurate results. SOAPdenovo-Trans provides higher contiguity, lower redundancy, and faster execution. These de novo transcript assembly tools are particularly useful when the reference genome annotation may be missing or incomplete or where aberrant transcripts (for example, in tumor tissue) are of interest. Transcriptome assembly methods may benefit from the use of PE reads and/or longer reads that have a greater likelihood of spanning splice junctions. However, a complete de novo assembly of a transcriptome from RNA-seq data is not generally required for determining DGE (Stark et al., 2019).

1.4.4. Quantification of reads abundance

Once reads have been mapped to the genomic or transcriptomic locations, the next step in the analysis process is to assign them to genes or transcripts to determine their abundance. First, the data need to be translated into a quantitative measure of gene expression, which can be achieved by counting the number of reads mapped to each gene/transcript (Van Verk et al., 2013). The quantification of read abundances for individual genes or transcripts relies on counting sequence reads that overlap with the known genes through the use of a transcriptome annotation. However, allocating reads to specific isoforms using short reads requires an estimation step, as many reads will not span splice junctions and therefore cannot be unambiguously assigned to a specific isoform. Even if only gene-level differential expression is being studied, quantifying differences in isoforms may give more accurate results in the case of a gene shifting its expression between isoforms of different lengths (Stark et al., 2019).

Quantification tools, including RSEM, Cufflinks, and HTSeq (Anders et al., 2015), as well as the alignment-free direct quantification tools mentioned above, are commonly used in this process. Read-count-based tools such as HTSeq or featureCounts (Liao et al., 2014) will generally discard many aligned reads, including those that are multi-mapped or overlap with multiple expression features. As a result, homologous and overlapping transcripts may be eliminated from subsequent analyses (Stark et al., 2019). RSEM allocates ambiguous reads using expectation maximization (EM) for quantifying gene and isoform abundances from SE or PE RNA-Seq data, whereas reference-free alignment methods such as Kallisto include these reads in their transcript count estimates, which can produce bias in the results. RSEM generates abundance estimates, 95% credibility intervals, and visualization files as outputs and can also simulate RNA-seq data based on parameters learned from real data sets. In contrast to other tools, RSEM does not require a reference genome, and in combination with a de novo transcriptome assembler it enables accurate transcript quantification for species without sequenced genomes (Oshlack and Wakefield, 2009; Stark et al., 2019).

Estimating the expression levels of individual splice variants of a gene is more complex, as splice variants of a gene typically share a set of exons, and thus only a minor fraction of the reads will align uniquely to the distinct regions of a splice variant. Methods such as Cufflinks use statistical models to estimate the proportion of reads that can be assigned to individual splice variants by counting the number of reads that map to full-length transcripts. Cufflinks can also perform isoform quantification through maximum likelihood estimation of relative isoform expression. In some cases, the determination of the splice variants of a gene present within a sample could require very deep sequencing or simply be impossible with current technology (Oshlack and Wakefield, 2009; Van Verk et al., 2013). Alternative approaches, such as HTSeq, can quantify expression without assembling transcripts by counting the number of reads that map to an exon, while featureCounts implements highly efficient hashing and feature blocking techniques (Oshlack and Wakefield, 2009). Also, transcript abundance estimates can be converted to read count equivalents using a package such as tximport (Soneson et al., 2016). The results of the quantification step are usually combined into an expression matrix, with a row for each expression feature (gene or transcript) and a column for each sample, with the values being either actual read counts or estimated abundances (Stark et al., 2019).
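The counting strategy and the resulting expression matrix can be sketched in Python as follows. Each aligned read is represented only by the set of genes it overlaps; reads overlapping exactly one gene are counted and all others are discarded, which mimics in simplified form the behavior described above for read-count-based tools. The representation, function names, and example data are hypothetical, and this is not the actual HTSeq or featureCounts implementation.

```python
from collections import Counter


def count_reads(read_assignments):
    """Count reads per gene, discarding ambiguous or unassigned reads.

    `read_assignments` is a list of sets, one per aligned read, holding
    the genes that the read overlaps.
    """
    counts = Counter()
    for genes in read_assignments:
        if len(genes) == 1:  # unambiguous: exactly one overlapping gene
            counts[next(iter(genes))] += 1
    return counts


def expression_matrix(counts_by_sample, genes):
    """Rows are genes, columns are samples (in sorted sample order)."""
    samples = sorted(counts_by_sample)
    matrix = {
        gene: [counts_by_sample[sample].get(gene, 0) for sample in samples]
        for gene in genes
    }
    return samples, matrix


# Hypothetical assignments for two samples
control = count_reads([{"geneA"}, {"geneA"}, {"geneB"}, {"geneA", "geneB"}, set()])
treated = count_reads([{"geneA"}, {"geneB"}, {"geneB"}, {"geneB"}])
samples, matrix = expression_matrix(
    {"control": control, "treated": treated}, ["geneA", "geneB"]
)
print(samples)
print(matrix)
```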

1.4.5. Normalization

Following quantification of expression levels, a common objective is to identify genes that are differentially expressed between different conditions. DGE analysis requires that gene expression values be comparable among samples. Generally, quantified gene or transcript counts are also normalized to account for differences in read depth and technical biases (Conesa et al., 2016; Stark et al., 2019; Van Verk et al., 2013). Two important types of sequencing bias need to be considered when normalizing count data: within-sample or between-gene bias, primarily caused by differences in transcript length, and between-sample bias, mainly from differences in sequencing depth (Oshlack and Wakefield, 2009; Van Verk et al., 2013).

Methods for normalizing an expression matrix can be complex. Straightforward transformations, applied after removing features with uniformly low read abundance, can adjust abundance quantities to account for differences in GC content and read depth (Stark et al., 2019). Comparative studies have shown empirically that the choice of the normalization method has a major impact on the final results and biological conclusions. Normalization is therefore considered an essential step in DGE analysis from RNA-seq data (Oshlack and Wakefield, 2009; Stark et al., 2019). However, most computational normalization methods rely on two key assumptions: first, the expression levels of most genes remain the same across replicate groups; and second, different sample groups do not exhibit a meaningful difference in overall mRNA level. It is particularly important to carefully consider whether and how to perform normalization when these basic assumptions may not hold true (Stark et al., 2019).

Reads per kilobase of transcript per million mapped reads (RPKM), fragments per kilobase of transcript per million mapped reads (FPKM), and transcripts per million (TPM) normalize for the most important factor when comparing samples, sequencing depth, which can differ significantly between samples (Conesa et al., 2016). RPKM and FPKM values are useful for analyzing differences in the abundance of alternative splice variants between samples, as the correction for the length of each splice variant is essential for this type of analysis. Although RPKM and FPKM are popular normalization strategies, they have been demonstrated to bias subsequent calls for differential expression in favor of longer transcripts (Van Verk et al., 2013). Methods based on RPKM are now recognized to be insufficient and have been replaced by those that correct for more subtle differences between samples, such as quartile or median normalization (Stark et al., 2019). TPM, which effectively normalizes for differences in the composition of the transcripts in the denominator, is considered more comparable between samples of different origins and composition but can still suffer from some biases (Conesa et al., 2016).
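The RPKM and TPM measures discussed above follow their standard definitions, which can be written as a short Python sketch. The gene names, counts, and lengths are hypothetical. For paired-end data, FPKM is computed in the same way as RPKM but with fragments counted instead of reads.

```python
def rpkm(counts, lengths_bp):
    """RPKM_i = 1e9 * count_i / (length_i * total mapped reads)."""
    total_reads = sum(counts.values())
    return {
        gene: 1e9 * count / (lengths_bp[gene] * total_reads)
        for gene, count in counts.items()
    }


def tpm(counts, lengths_bp):
    """TPM_i = 1e6 * (count_i / length_i) / sum_j (count_j / length_j)."""
    rates = {gene: count / lengths_bp[gene] for gene, count in counts.items()}
    total_rate = sum(rates.values())
    return {gene: 1e6 * rate / total_rate for gene, rate in rates.items()}


# Hypothetical sample: the long gene has three times the reads and three times
# the length of the short gene, so both end up with the same normalized value.
counts = {"short_gene": 100, "long_gene": 300}
lengths_bp = {"short_gene": 1000, "long_gene": 3000}
print(rpkm(counts, lengths_bp))
print(tpm(counts, lengths_bp))
```

Because TPM values are divided by the sum of length-normalized rates, they add up to one million in every sample, whereas the sum of RPKM values can differ from sample to sample.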

RPKM, FPKM, and TPM approaches rely on normalizing methods that are based on total or effective counts and tend to perform poorly when samples have heterogeneous transcript distributions which can be considered as highly and differentially expressed features that can skew the count distribution. Normalization methods that can deal with heterogeneous transcript distributions are trimmed mean of M- values (TMM), DESeq, PoissonSeq, and UpperQuartile, which ignore highly variable and/or highly expressed features. TMM can compensate for these cases and is recommended over methods based on the total read count (Conesa et al., 2016; Stark et al., 2019; Van Verk et al., 2013). Additional factors that interfere with intra-sample comparisons include changes in transcript length across samples or conditions, positional biases in coverage with the transcript (which are accounted for in Cufflinks), average fragment size, and the GC contents of genes. The NOISeq (Tarazona et al., 2015), an R package, contains a wide variety of diagnostic plots to identify sources of biases in RNA-seq data and to apply appropriate

23 normalization procedures in each case. Finally, despite these sample-specific normalization methods, batch effects may still be present in the data. These effects can be minimized by appropriate experimental design or removed by batch correction methods such as COMBAT

(Johnson et al., 2007) or ARSyN (Nueda et al., 2012).
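The length- and depth-based normalizations discussed in this section can be illustrated with a minimal sketch. It assumes a single sample described by a list of raw read counts and a list of transcript lengths in base pairs; the function names and the toy values are hypothetical and are not taken from any of the cited tools.

```python
import math

def rpkm(counts, lengths_bp):
    """Reads per kilobase of transcript per million mapped reads (one sample)."""
    total = sum(counts)
    # count / (length in kb) / (total reads in millions) = count * 1e9 / (total * length)
    return [c * 1e9 / (total * length) for c, length in zip(counts, lengths_bp)]

def tpm(counts, lengths_bp):
    """Transcripts per million: normalize by length first, then rescale to 1e6."""
    rates = [c / length for c, length in zip(counts, lengths_bp)]
    scale = sum(rates)
    return [r * 1e6 / scale for r in rates]

# Hypothetical toy sample: three transcripts whose counts are proportional to length.
counts = [100, 200, 700]
lengths = [1000, 2000, 7000]

rpkm_values = rpkm(counts, lengths)
tpm_values = tpm(counts, lengths)
```

In this toy sample all three transcripts receive the same RPKM and the same TPM, because their counts differ only through their lengths. TPM values always sum to one million within a sample, which is the property that makes them easier to compare across samples than RPKM.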

1.4.6. Generating the raw DEGs list

The goal of a DGE analysis is to identify the list of genes with significant changes in abundance across experimental conditions. This means taking a table of summarized count data or an expression matrix for each library and performing statistical testing between samples of interest to determine which transcript features are likely to have changed their level of expression

(Oshlack and Wakefield, 2009; Van Verk et al., 2013). As RNA-seq quantification is based on read counts that are absolutely or probabilistically assigned to transcripts, the earlier approaches to computing differential expression used discrete probability distributions, such as the Poisson or negative binomial, followed by Fisher’s exact test without accounting for biological variability among samples (Conesa et al., 2016; Oshlack and Wakefield, 2009).

The negative binomial distribution, also known as the gamma-Poisson distribution, is a generalization of the Poisson distribution that allows for additional variance (overdispersion) beyond the variance expected from randomly sampling from a pool of molecules, which is characteristic of RNA-seq data. However, the use of discrete distributions is not required for accurate analysis of differential expression if the sampling variance of small read counts is considered (Conesa et al., 2016). While the technical variability of RNA-seq is extremely low compared with microarray data, the biological variability can be addressed by analyzing several replicates through permutation-derived methods. Serial analysis of gene expression has been developed for biological variability assessment, in which larger-scale

datasets are used so that an additional dispersion parameter can be estimated based on an extended Poisson distribution, allowing extensive molecular characterization capability (Oshlack and Wakefield, 2009).
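For reference, the overdispersion described above is commonly expressed through the mean-variance relationship of the two distributions. The symbols used here (K for the read count of a gene, μ for its mean, and α for the dispersion) are notation introduced for this illustration and are not taken from the cited works.

```latex
\mathrm{Var}(K) = \mu \quad \text{(Poisson)}, \qquad
\mathrm{Var}(K) = \mu + \alpha\,\mu^{2} \quad \text{(negative binomial, } \alpha \ge 0\text{)}
```

Setting α = 0 recovers the Poisson case, so the dispersion term can be read as the extra, mostly biological, variance on top of the sampling variance.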

For most applications, many replicates may be too costly, and many methods have solved this problem by modeling biological variability and measuring the significance with a limited number of samples, applying pairwise or multiple group comparisons. Several programs (e.g.

Cuffdiff, DESeq, DESeq2, and edgeR) offer a good solution for this purpose and have been applied in numerous studies for biomedical and clinical research. Since RNA-seq read counts are integer numbers that range from zero to millions and are highly skewed, many kinds of transformation algorithms have been applied to the counts so that the numbers can be fit to statistical distribution models for differential expression detection. For instance, PoissonSeq, a method for normalization, testing, and false discovery rate estimation for RNA-seq data, is based on a Poisson log-linear model (Oshlack and Wakefield, 2009).

Some other methods, such as edgeR, DESeq, and DESeq2, use the negative binomial as the reference distribution, take raw read counts as input, and introduce possible bias sources into the statistical model to perform an integrated normalization as well as a differential expression analysis. These methods are computationally efficient and can provide comparable results (Conesa et al., 2016; Han et al., 2015; Van Verk et al., 2013). edgeR combines two statistical components (an over-dispersed Poisson model and an empirical Bayes method) for examining differential expression of replicated count data. The over-dispersed Poisson model is used to account for both biological and technical variability, and the empirical Bayes method is used to moderate the degree of overdispersion across transcripts, improving the reliability of inference. The methodology can be used even with the most minimal levels of replication, i.e., at least one phenotype or

experimental condition is replicated. DESeq is used to analyze count data and test for differential expression, supporting multi-factor designs through negative binomial generalized linear models (GLMs), while DESeq2 uses shrinkage estimation for dispersions and FCs to improve stability and interpretability of estimates. DESeq2 enables a more quantitative analysis focused on the strength rather than the mere presence of differential expression (Oshlack and Wakefield, 2009).

baySeq (Hardcastle and Kelly, 2010) and EBSeq (Leng et al., 2013) are Bayesian approaches, also based on the negative binomial model, that define a collection of models to describe the differences among experimental groups and to compute the posterior probability of each one of them for each gene. Other approaches include data transformation methods that consider the sampling variance of small read counts and create discrete gene expression distributions that can be analyzed by regular linear models. Finally, non-parametric approaches such as NOISeq or SAMseq (Li and Tibshirani, 2013) make minimal assumptions about the data and estimate the null distribution for inferential analysis from the actual data alone. For small-scale studies that compare two samples with no or few replicates, the estimation of the negative binomial distribution can be noisy (Conesa et al., 2016).

Besides the read coverage, another factor that affects the estimated transcript abundance is isoform abundance. RSEM can accurately estimate gene and isoform expression levels and can be used even for species without a reference genome assembly (Oshlack and Wakefield,

2009). However, estimating the relative abundance of different mRNA splice variants and modeling differential isoform expression is recognized as a more challenging task. Cuffdiff, implemented in the Cufflinks package, is a robust and accurate tool for isoform-level differential analysis of RNA-seq data and can test for differences in alternative splicing and isoform expression between two samples. Cuffdiff uses isoform levels in

the analysis but tends to require more computational power with more variation in the results

(Oshlack and Wakefield, 2009; Stark et al., 2019; Van Verk et al., 2013).

1.4.7. Filtering the list of DEGs

Filtering to remove features with uniformly low read abundance is straightforward and has been shown to improve the detection of true differential expression (Stark et al., 2019).

Several methods can be used to assess the significant DEGs or transcripts, such as t-tests,

ANOVAs, p-value cut-offs, Bonferroni corrections, array normalization, Fisher’s exact test, and fold change (FC) cut-offs (Tarazona et al., 2015). To enable gene set analyses, dividing each gene’s t-statistic by the square root of its length has been suggested as a way to minimize the effect of length bias before finalizing the DEGs (Oshlack and Wakefield, 2009). Current statistical methods such as the significance analysis of microarrays (SAM) and the t-test often have insufficient statistical power for this application scenario. T-tests have been widely used to identify deviation from the mean, but with a large number of tests (~15,000 genes assayed) the number of false positives increases and the results may reveal little, if anything, about the biology

(Borodina et al., 2011; Marioni et al., 2008).

More frequently, researchers use the FC metrics, calculated as the ratios of the gene expression levels between two conditions, to detect the genes with FC values greater than an arbitrary threshold as DEGs (Marioni et al., 2008). FC lends itself to a more biologically meaningful assessment yet still encounters problems with identifying what is significant to the organism (Borodina et al., 2011). However, this method does not provide any statistical control.

The lowly and highly expressed genes in both conditions may easily have large FC simply as a result of technical variations (Marioni et al., 2008). To address this problem, a new algorithm has been proposed to identify DEGs based on the significant reproducibility of genes with top-ranked FCs or average expression differences (ADs) between paired case-control replicates.

However, it still cannot obtain DEGs with false discovery rate (FDR) control (Borodina et al.,

2011). For instance, at a stringent FDR, deviations from the Poisson model do not lead to the identification of an appreciable number of false-positive differentially expressed genes (Schulz et al., 2012).

All these filtering methods lead to a reduction in DEGs which may inadvertently reduce or increase the power of analysis. Changing the significance level as well as the FC cut-off has been used to obtain more than one interpretation of the data. The number of significantly expressed genes was overwhelming at a p-value ≤ 0.05, and even upon tightening the cut-off to a p-value ≤ 0.02, the data may be reduced by almost half yet remain too large for understanding the biological response. FC offers more meaningful insight and proved to be a better eliminator of background noise, as fewer genes were left after applying an FC cut-off than after applying p-value cut-offs. As the FC level increases above 2, the number of genes significantly decreases

(Borodina et al., 2011).

1.4.8. Enrichment analysis

In most cases, creating lists of DEGs is not the final step of RNA-seq experiments.

Further biological insight into an experimental system can be gained by studying the biological functions represented by the identified DEGs. A standard transcriptomics study is often about the characterization of the molecular functions or pathways in which DEGs are involved (Conesa et al., 2016). Many tools focusing on gene set testing, network inference, and knowledge databases have been designed for analyzing lists of DEGs from microarray datasets (Oshlack and

Wakefield, 2009). The two main approaches to the functional characterization developed first for microarray technology are (a) comparing a list of DEGs against the rest of the genes in the

genome for overrepresented functions, and (b) gene set enrichment analysis (GSEA), which is based on ranking the transcriptome according to a measurement of differential expression

(Conesa et al., 2016). The issue with the use of functional characterization based on microarray technology for RNA-seq is gene length bias in RNA-seq data, in which longer genes have higher counts (at the same expression level). This results in greater statistical power to detect long and highly expressed DEGs. This can dramatically affect the results of downstream analyses, such as

Gene Ontology (GO) terms enrichment among DEGs (Oshlack and Wakefield, 2009).

To solve this problem, several RNA-seq specific tools have been proposed (Conesa et al.,

2016). GO-seq (Young et al., 2010) is an approach developed specifically for RNA-seq data that can incorporate length or total count bias into gene set tests (Xie et al., 2014). GO-seq estimates a bias effect on differential expression results and adapts the traditional hypergeometric statistic used in the functional enrichment test to account for this bias (Conesa et al., 2016). Similarly, the

Gene Set Variation Analysis (GSVA) (Hänzelmann et al., 2013) or SeqGSEA (Wang and Cairns,

2014) packages also combine splicing and implement enrichment analyses similar to GSEA.
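The over-representation approach (a) and the hypergeometric statistic mentioned above can be sketched as follows. This is the plain, uncorrected hypergeometric tail probability, not the length-bias-adjusted version implemented in GO-seq; the function name, argument names, and the toy numbers are assumptions made for illustration.

```python
from math import comb, isclose

def hypergeom_enrichment_p(n_genome, n_term, n_deg, n_overlap):
    """P(X >= n_overlap) when n_deg genes are drawn without replacement from
    n_genome genes, of which n_term are annotated with the term of interest."""
    upper = min(n_term, n_deg)
    total = comb(n_genome, n_deg)
    # Sum the probability of every overlap size from the observed one upwards.
    tail = sum(
        comb(n_term, k) * comb(n_genome - n_term, n_deg - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Hypothetical toy case: 10 genes in the genome, 5 carry the term,
# 2 DEGs were found and both carry the term.
p_toy = hypergeom_enrichment_p(n_genome=10, n_term=5, n_deg=2, n_overlap=2)
```

For the toy case the probability is C(5,2)/C(10,2) = 10/45. Because every gene is treated as equally likely to be called a DEG, this test inherits the gene length bias described above, which is the motivation for the RNA-seq specific tools.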

Functional analysis requires the availability of enough functional annotation data for the transcriptome under study. Resources such as gene ontology (GO) (Ashburner et al., 2000),

Bioconductor (Huber et al., 2016), Database for Annotation, Visualization and Integrated

Discovery (DAVID) (Huang et al., 2009), and Babelomics (Medina et al., 2010) contain annotation data for most model species. The use of standard vocabularies such as from the GO database allows for some exchangeability of functional information across orthologs. A popular tool such as Blast2GO (Conesa et al., 2005) allows large-scale annotation of complete transcriptome datasets against a variety of databases and controlled vocabularies. Typically,

between 50 and 80% of the transcripts reconstructed from RNA-seq data can be annotated with functional terms in this way (Conesa et al., 2016).

1.5. Overall research objectives

In this study, we propose to use Illumina-based RNA-seq to examine the gene expression profile change in E. coli cells exposed to microwave irradiation (MWI) and to understand how cells respond to changes in oxygen level and culture medium using human breast cancer cell lines

(MCF7) and human prostate cancer cell lines (PC3) as the models. In general, the specific objectives of these studies will cover: (1) the identification of DEGs specific to the treatment; (2) the determination of the GO terms (e.g. biological process, molecular function, and cellular component) and KEGG pathways enriched among these DEGs. Results from the study of E. coli in response to MWI are expected to give new insight into the specific effects of MWI and will be compared with the previous study by Mazinani et al. (2019). Results from the study of cell lines in response to different culture conditions are expected to provide some important guidance in better designing cell culture-based research.


Chapter 2: Analysis of non-thermal effects of microwave irradiation on Escherichia coli using RNA-seq based transcriptome profiling

2.1 Introduction and related literature review

2.1.1. Microwave irradiation

The effects of electromagnetic fields (EMF) on living organisms have been an important research topic for many years, and EMF have been shown to have a major impact on biological systems (Said-Salman et al., 2019; Salmen et al., 2018). Microwaves (MW) are a type of non-ionizing electromagnetic radiation with frequencies ranging from 300 MHz to 300 GHz (Raval et al., 2014) and wavelengths from one meter to one millimeter (Janković et al., 2014).

Three types of MW effects have been investigated in the literature: thermal, MWI-specific, and non-thermal effects (Stanisavljev et al., 2017).

The thermal theory explains the effects of microwave irradiation (MWI) by the increase of temperature in the reaction mixture and corresponding enhancement of the reaction rate

(Stanisavljev et al., 2017). It further claims that MWI’s bioeffects can be explained solely by differences in the temperature profiles between MWI and conventional heating systems and that no basis for direct electromagnetic interaction with living systems exists independent of such temperature-mediated effects (Shamis et al., 2012). Specific MWI effects are essentially thermal but cannot be reproduced by conventional heating and may be connected with the various phases in the system with differential absorptions of MW energy and capacity for heating (Stanisavljev et al., 2017). Non-thermal effects of MWI are defined as effects not caused by heating; their mechanism of action is still not completely understood and is the most controversial. It is argued that a “non-thermal” effect cannot be said to exist in MWI without careful control of instantaneous temperature, since an unmeasured energy transfer occurs

between the MWI and the sample, suggesting that the expression “MWI-specific effect” may be more suitable than “non-thermal effect” (Shamis et al., 2012).

It is generally believed that the destruction of microorganisms by MWI is mainly due to its thermal effect, but several researchers have attempted to ascertain if such irradiation has a nonthermal effect on microorganisms (Woo et al., 2000). It is generally accepted that MW causes dielectric heating in biological systems as a result of the absorption of the MW energy by a dielectric material (primarily water) with the MWI energy transformed into heat due to the internal resistance of rotation (Shamis et al., 2012). In other words, MW as a form of electromagnetic wave increases the temperature as the result of MW energy transferred to the medium, leading to uncontrollable heating (H. Zhang and Datta, 2000). In addition, the effect of

MWI on the growth of microorganisms primarily depends on the frequency of the irradiation and the total energy absorbed by the microorganisms. When MWI is applied at certain frequencies with high energy and for a sufficiently long period, their thermal effect is most likely dominant, leading to cell inactivation. In contrast, when microorganisms were irradiated with MWI at temperatures lower than the thermal destruction level, various effects are observed, ranging from killing the organisms to enhanced growth of the organisms (Janković et al., 2014).

2.1.2. MWI disrupts the cellular membrane activity

One of the mechanisms that might contribute to bacterial killing by MWI is the increase in the conductivity and permeability of cell membranes, which enhances diffusion, including the mobility of ions (Asay et al., 2008; Shamis et al., 2011). This mechanism involves three steps.

First, MWI increases the movements of ions causing the reversible development of cell membrane pores. Second, the development of the pores causes localized structural disarrangements of the cell membrane, which results in the emergence of pores that allows the

free transport of ionic and molecular materials through the cellular envelope. Third, disrupted partitioning of the ions changes the protein structures, resulting in leakage of vital intracellular molecules out of the bacterial cells. In this mechanism, there may be other different MWI-related biological effects and biochemical reactions that may lead to cellular death (Janković et al.,

2014; Shamis et al., 2011).

A study by Shamis et al. (2012) shows that MW exposure with a frequency of 18 GHz at

20-40°C on E. coli induces the formation of temporary pores within the bacterial membrane.

Another study also shows that MWI using power equal to or higher than 400 Watts for 30 minutes disrupts membrane integrity but not motility (Rougier et al., 2014). This study also reports that exposure to 2.45-GHz discontinuous microwaves (DW) up to 37°C induces E. coli membrane modifications and that the temperature increase from room temperature up to 37°C is responsible for the measured effects of MW on membrane integrity. A study by Janković et al. (2014) also found non-thermal effects of MWI in E. coli and Bacillus licheniformis. With

E. coli cultures exposed to MWs (18 GHz, absorbed power 1500 kW/m², electric field 300 V/m) at temperatures below 40°C, transient morphological changes and openings of pores in the cell membrane were observed. B. licheniformis spores were irradiated with MWI (2.45 GHz, 2 minutes, 2 kW), which caused spore cortex hydrolysis, swelling, and finally rupture, as well as rupture of the inner membrane, whereas, in Bacillus subtilis, cell walls were disrupted, and the aggregation of cytoplasmic (intracellular) proteins was detected. The authors of this study suggested that these effects could not be attributed to thermal damage because the same temperatures from other heat sources did not produce similar changes.


2.1.3. MWI changes the enzymatic activity

Besides the effects on cell membranes, MWI also causes non-thermal acceleration of enzymatic reactions, such as non-aqueous esterification (formation of ester-type of chemical compounds in non-aqueous systems), and this effect is substrate concentration-dependent.

Moreover, MWI can accelerate the synthesis of glycopeptides, which may occur in microbial cells (Janković et al., 2014). An increase in the activity levels by MWI has been reported in other studies and is considered as the result of specific MWI effects and/or the interaction of

MWI’s heating and electromagnetic effects on the bacteria. It was argued that MWI interacts directly with the material causing temperature increases due to the interaction of the molecular electric fields with the electromagnetic waves at their different phases, thus producing ionic conduction and dipolar rotation that result in a kinetic excitation (Shamis et al., 2012).

An earlier study demonstrated that MWI at 2.45 GHz with a peak temperature of 46 °C impacted the activity and production of several key enzyme systems from cell lysates and walls of Staphylococcus aureus at various levels of MWI exposure (Dreyfuss and Chipley, 1980). The enzymes observed in this study were malate dehydrogenase, α-ketoglutarate dehydrogenase, cytochrome c oxidase (COX), cytoplasmic adenosine triphosphatase, and glucose-6-phosphate dehydrogenase. The authors suggested that the levels of enzymatic activity were highly correlated with the growth stimulation of the culture by the temperature and that the amount of cell biomass was possibly responsible for the enzyme production.

Another study also reported that MWI exposure at 8.53 GHz and 23°C decreased the level of fluorescence of enhanced green fluorescent protein (EGFP) in E. coli but increased the temperature of EGFP solution (Copty et al., 2006). The authors suggested that the specific microwave effect is potentially related to MWI absorption by the water molecules bound to the EGFP barrel in the folded state. The absorption of MWI results in the water bound to the EGFP barrel being heated more efficiently than the water in the ambient solution, leading to a change in the EGFP molecule that affects its fluorescence.

A study by Mazinani et al. (2015) suggested that there are changes in the secondary structure of trypsin exposed to MWI, as shown by the significant increase of trypsin activity when the reaction mixture was irradiated with MWI at a constant temperature. Their more recent study showed that acceleration of enzymatic reactions by MWI at constant bulk temperature is enzyme-dependent (Mazinani and Yan, 2016). For example, the enzymatic activity of α-amylase and phosphatase towards the hydrolysis of starch and 4-nitrophenyl phosphate, respectively, was not affected by exposure to MWI (10 Watts); however, trypsin activity was increased by MWI at a constant temperature (Mazinani et al., 2015). They also reported that cellular metabolism in E. coli was modified after exposure to MWI at 2.45 GHz with power up to 10 Watts for 5 hours

(hrs) as shown by 10 identified proteins showing different levels of abundance based on mass spectrometric analysis (Mazinani et al., 2019). Among the 10 identified proteins, 6 proteins with decreased levels of abundance are involved in the citric acid cycle (TCA cycle) whereas the remaining 4 proteins with increased levels of abundance are involved in the aminoacyl-tRNA biosynthesis pathway.

2.1.4. Research objectives

Overall, the focus of this study is to examine the gene expression profile change at the transcriptome level in E. coli cells after exposure to MWI using RNA-seq. Three specific objectives include: (1) to identify genes showing up-regulation and down-regulation specific to

MWI; (2) to identify genes showing the highest expression changes in response to the treatment; and (3) to understand the biology of MWI’s impact via enrichment analysis of differentially

expressed genes by determining the GO terms (e.g. biological process, molecular function, and cellular component) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways enriched among these genes. Compared to previous studies, this study was conducted based on

Mazinani et al. (2019) in aerobic conditions with relatively low power (up to 10 W) and with the culture temperature held constant at 37°C via simultaneous cooling to allow examination of the non-thermal and microwave-specific effects of MWI. A proteomic analysis of total proteins under the same experimental conditions has been conducted by Mazinani et al. (2019).

This study will serve to analyze the microwave-specific effect of MWI in E. coli at the transcriptional level and to compare it with the proteomic data previously generated by Mazinani et al. (2019).

2.2. Methods and Materials

2.2.1. Sample preparation

Sample preparation was performed by Dr. Tony Yan’s research group from the

Department of Chemistry, Brock University. Chemicals were purchased from Sigma-Aldrich and were used without further purification. E. coli DE3 was acquired from Dr. Charles Déspres’s lab in the Department of Biological Science, Brock University. A CEM Discovery Coolmate MW system was used for this experiment and the speed of the stirring bar was measured by an Omega

HHT13-Kit tachometer to be 800 ± 20 rpm (Mazinani et al., 2019).

Individual bacterial colonies from Luria Bertani (LB) agar plates were inoculated into LB broth in a shaking incubator at 37°C and 260 rpm until an OD600 around 0.93 was reached. A portion of this pre-culture (0.5 mL) was then added to fresh LB medium (4.5 mL) and exposed to a maximal 10 Watts MWI at 2.45 GHz for five hours while the culture temperature was maintained at 37°C through simultaneous cooling with constant stirring. As a control, E.

coli cultures were incubated in an oil-bath at 37°C and stirred. A portion of each control and treated sample (4.7 mL) was transferred into a Falcon tube (15 mL) followed by centrifugation at

255 relative centrifugal force (rcf) for 15 minutes. The cell pellet was resuspended in fresh phosphate-buffered saline (PBS) (7 mL) followed by centrifugation at 255 rcf for 15 minutes; this wash was performed three times, and the pellet was stored at -80°C after being flash-frozen in liquid nitrogen. A total of three biological replicates were obtained for both control and treatment conditions and were used for sequencing.

RNA extraction was performed at Norgen Biotek Corp (Thorold, ON, Canada) as a paid service. Briefly, each pellet was used in an RNA purification protocol for Gram-negative bacteria using Norgen’s Total RNA Purification Plus Kit (cat#: 48300). RNA was eluted in 45 µL. Before rRNA depletion, 1 µL of 1/500 diluted RNA was tested on an Agilent RNA Pico Chip using an Agilent 2100 Bioanalyzer and 2 µL of purified RNA was used for RNA quantification using a Thermo Fisher Scientific NanoDrop spectrophotometer. A portion of isolated RNA (3 µg) was used per sample for library preparation according to the manufacturer’s instructions with NEB NEBNext® Ultra™ II Directional RNA Library Prep Kit for Illumina

(cat#: E7760) with bacterial rRNA depletion using Thermo Fisher RiboMinus™ Transcriptome

Isolation Kit, bacteria (cat#: K155004). The RNA was then concentrated using Norgen’s RNA

Clean-up and Concentration Micro-Elute Kit (cat#: 61000).

2.2.2. Library preparation and sequencing

RNA sequencing including sequencing library construction was also performed at

Norgen Biotek Corp (Thorold, ON, Canada) as a paid service. In brief, RNA was fragmented and cDNA was synthesized using random hexamer primers, followed by adaptor ligation, size selection, and PCR enrichment to generate a double-stranded cDNA library (New England


Biolabs, Inc.). Sequencing was performed with an Illumina NextSeq 500 platform to construct a

PE library using NextSeq 500/550 Mid Output Kit v2. The raw read files were saved in compressed FASTQ format (Cock et al., 2009), which contains the base calls and per-base quality scores.

2.2.3. Assessment of RNA-seq data

Sequencing data quality for all samples was assessed using FastQC v0.11.8 (Andrews, 2010), and the sequencing coverage was then calculated. Using awk, a domain-specific language for text processing available from the Linux shell, the total sequence length of each raw read file was calculated. The total length of the exons in the Escherichia coli HUSEC2011CHR1.37 reference annotation in “gene transfer format” (gtf) (Ensembl Genomes, release 37) was used as the total exon length of the E. coli transcriptome, and the sequencing coverage was calculated by dividing the total sequence length of the PE raw reads for each sample by the total exon length of the transcripts.
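The coverage calculation described above can be sketched in a few lines. The thesis used awk for this step; the Python version below is only an illustrative equivalent under stated assumptions: FASTQ records are exactly four lines each, the GTF is tab-delimited with the feature type in the third column, and overlapping exons are simply summed. All function names and the toy inputs are hypothetical.

```python
def total_read_bases(fastq_lines):
    """Sum read lengths; the sequence is the 2nd line of every 4-line FASTQ record."""
    return sum(len(line.strip()) for i, line in enumerate(fastq_lines) if i % 4 == 1)

def total_exon_length(gtf_lines):
    """Sum (end - start + 1) over all GTF rows whose feature column is 'exon'."""
    total = 0
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 5 and fields[2] == "exon":
            total += int(fields[4]) - int(fields[3]) + 1
    return total

def coverage(read_bases, exon_bases):
    """Sequencing coverage = total sequenced bases / total exon length."""
    return read_bases / exon_bases

# Hypothetical toy inputs: two reads (8 and 4 bases) and one 6-bp exon.
fastq = ["@r1", "ACGTACGT", "+", "IIIIIIII", "@r2", "ACGT", "+", "IIII"]
gtf = [
    "#!toy annotation",
    "chr\tsrc\tgene\t1\t10\t.\t+\t.\tgene_id \"g1\";",
    "chr\tsrc\texon\t1\t6\t.\t+\t.\tgene_id \"g1\";",
]
toy_coverage = coverage(total_read_bases(fastq), total_exon_length(gtf))
```

For paired-end data the read bases of both mate files would be added together before dividing. With real files, the lists would be replaced by open file handles (for example via `gzip.open(path, "rt")` for compressed FASTQ).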

2.2.4. RNA-seq reads alignment

The reads in FASTQ format from all samples were mapped individually using STAR v2.7.0a (Dobin et al., 2013) to the reference genome, E. coli HUSEC2011CHR1 from Ensembl Genomes (release 37), with a dataset of known splice sites for E. coli HUSEC2011CHR1.37 in gtf (Ensembl Genomes, release 37) as part of the inputs.

2.2.5. DGE analysis

DGE analysis was initially performed using three tools: Cufflinks (Trapnell et al., 2012), edgeR (Robinson et al., 2009), and DESeq2 (Love et al., 2014). For analysis with the

Cufflinks package, the STAR alignment files were processed using Cufflinks for transcript assembly, followed by Cuffmerge and then by Cuffdiff to calculate FPKMs and the statistical significance of expression changes. The tab-delimited text output was visualized using scatter boxplots, and the pairwise comparison among samples was performed using Pearson correlation and paired t-tests in the R statistical computing environment.

For analysis by edgeR and DESeq2, a gene-level database of all annotated transcripts was generated from the transcript reference using the function makeTxDbFromGFF from GenomicFeatures. The BamFileList function from Rsamtools was used to read the alignment files in BAM format, and the mapped reads were counted with the summarizeOverlaps function from GenomicAlignments, using the outputs from GenomicFeatures and Rsamtools, to generate a gene-based count matrix. Additionally, a table of sample information was generated with the data.frame function. Finally, the count matrix and sample table were combined into a data frame using the DataFrame function for further analysis with edgeR and DESeq2.

For DESeq2, a DESeqDataSet object was created from the above data frame with the DESeqDataSet function, followed by the DESeq and results functions. The DESeq function was used to run the standard differential expression analysis between the control and test groups, and the results function was then used to extract a result table containing the log2(FC), adjusted p-value, as well as other related information.

For edgeR, the DGEList object covering the information of the genes and grouping was created using the DGEList function. To minimize the technical influence on differential expression, the calcNormFactors function was applied to normalize the RNA composition by minimizing the FC between samples. The default method is a trimmed mean of M-values

(TMM). After the dispersion estimates were obtained, genes were tested for differential expression using likelihood ratio tests with the glmFit and glmLRT functions, followed by the topTags function.


2.2.6. DGE downstream analysis

DEG lists from the above analyses were filtered based on the p-value (≤ 0.05) for

Cufflinks and edgeR, and adjusted p-value for DESeq2 (≤ 0.05). Further filtering by log2(FC)

(≥ 1, equivalent to FC ≥ 2) was also applied to the results from the three methods; for Cuffdiff only, a minimum FPKM value of 5 in at least one sample in the comparison was additionally required. The list of common DEGs from the three methods was obtained using the Linux grep command.
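The filtering and intersection steps above can be sketched as follows. The thesis obtained the common list with grep; the Python version below is an illustrative alternative, not the script that was used. It assumes that the log2(FC) cut-off is applied to the absolute value so that down-regulated genes are retained, and the record layout, gene names, and values are all hypothetical.

```python
def filter_degs(rows, p_key, max_p=0.05, min_abs_log2fc=1.0, min_fpkm=None):
    """Return the set of gene IDs passing the p-value, fold change and FPKM filters."""
    kept = set()
    for row in rows:
        if row[p_key] > max_p:
            continue  # not significant at the chosen level
        if abs(row["log2fc"]) < min_abs_log2fc:
            continue  # change smaller than two-fold in either direction (assumption)
        if min_fpkm is not None and max(row["fpkm"]) < min_fpkm:
            continue  # expressed too weakly in every sample of the comparison
        kept.add(row["gene"])
    return kept

# Hypothetical toy results for the three tools.
cuffdiff = [
    {"gene": "geneA", "log2fc": 2.1, "p": 0.001, "fpkm": [12.0, 50.0]},
    {"gene": "geneB", "log2fc": -1.5, "p": 0.010, "fpkm": [3.0, 1.0]},
    {"gene": "geneC", "log2fc": 0.4, "p": 0.001, "fpkm": [40.0, 52.0]},
]
edger = [
    {"gene": "geneA", "log2fc": 1.9, "p": 0.002},
    {"gene": "geneB", "log2fc": -1.4, "p": 0.020},
    {"gene": "geneC", "log2fc": 0.5, "p": 0.030},
]
deseq2 = [
    {"gene": "geneA", "log2fc": 2.0, "padj": 0.004},
    {"gene": "geneB", "log2fc": -1.6, "padj": 0.030},
    {"gene": "geneD", "log2fc": 3.0, "padj": 0.200},
]

cuffdiff_degs = filter_degs(cuffdiff, "p", min_fpkm=5.0)  # raw p-value plus FPKM filter
edger_degs = filter_degs(edger, "p")                      # raw p-value
deseq2_degs = filter_degs(deseq2, "padj")                 # adjusted p-value
common_degs = cuffdiff_degs & edger_degs & deseq2_degs
```

In this toy example geneB passes edgeR and DESeq2 but is removed from the Cuffdiff list by the FPKM filter, so only geneA remains in the common list.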

2.2.7. Identification of unannotated transcripts

For DEGs lacking function annotation and assignment of gene symbol, their RNA sequences were used to search against E. coli K12 cDNA (release 44) from Ensembl Bacteria

DNA sequencing using BLAST+ v2.7.1 (Altschul et al., 1990). Specifically, DNA sequences in FASTA format for genes on the DEG list were extracted from the reference genome using

BEDTools v2.28.0 (Quinlan and Hall, 2010). The BLASTN program was run using an e-value cut-off of 1e-20 and 80% identity, with the maximum number of alignments set to 1. Gene symbols from the matched E. coli K12 cDNA were transferred to the corresponding transcripts of DEGs.

For those with no match from the above step, their DNA sequences were used to search the non-redundant (nr) protein sequences database using the command-line version of the NCBI

BLASTX (Ramsay et al., 2000) using an e-value of 1e-20 and maximum sequences of 10 with a minimum of 90% sequence identity. If a sequence matches a known protein, the gene symbol and function description of the matched protein was assigned to the unannotated transcript.

2.2.8. Functional enrichment analysis

Functional enrichment analysis of the DEG list was performed using DAVID version 6.7 (Huang et al., 2009). E. coli str. K-12 substr. MG1655 was used as the selected species and background for the DAVID enrichment analysis. Biological process, cellular component, and molecular function terms from the GO database and KEGG pathways were identified. Cutoffs on the raw p-value and the Benjamini-corrected p-value (both ≤ 0.05) were applied to collect the significantly enriched GO terms and KEGG pathways. The result was visualized using the EnrichmentMap v3.2.1 (Merico et al., 2010) plug-in for Cytoscape v3.7.2 (Shannon et al., 1971), based on the genes shared among GO terms and KEGG pathways, with a p-value cutoff of ≤ 0.05, a false discovery rate (FDR) q-value cutoff of ≤ 0.25, and the default edge cutoff of 37.5% for genes shared between GO terms and KEGG pathways.
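For readers unfamiliar with the Benjamini correction, the sketch below shows the general Benjamini-Hochberg step-up adjustment of a list of raw p-values. It is a generic illustration of the procedure; DAVID computes its corrected values internally, and whether its implementation matches this sketch in every detail is an assumption.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted
```

A term would then be kept when both its raw p-value and its adjusted value are ≤ 0.05.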

2.2.9. Analysis of co-expressed DEGs

As an alternative to grouping the DEGs by up- or down-regulation in response to MWI, all DEGs were combined and pairwise Pearson correlations were computed from the FPKM values across all six samples using an in-house Perl script. The genes were then placed into groups in which each gene is connected to at least one other gene by a significant positive expression correlation, followed by enrichment analysis for GO terms and KEGG pathways for groups with at least 100 genes.
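The grouping idea can be sketched in Python as below. This is not the in-house Perl script: it is a minimal reconstruction that links two genes when their Pearson correlation reaches a cutoff and then collects linked genes into groups with a union-find structure. The cutoff `min_r=0.9` is a hypothetical stand-in for the significance criterion, which is not specified in the text.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant profile: correlation undefined, treated as none
    return sxy / math.sqrt(sxx * syy)


def coexpression_groups(fpkm, min_r=0.9):
    """Group genes linked by positive correlation >= min_r (hypothetical cutoff).

    fpkm maps a gene ID to its list of FPKM values across samples.
    """
    parent = {gene: gene for gene in fpkm}

    def find(gene):
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path compression
            gene = parent[gene]
        return gene

    for a, b in combinations(fpkm, 2):
        if pearson(fpkm[a], fpkm[b]) >= min_r:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for gene in fpkm:
        groups.setdefault(find(gene), set()).add(gene)
    # Keep only genes connected to at least one other gene.
    return [members for members in groups.values() if len(members) > 1]
```

With six samples per gene the pairwise loop is quadratic in the number of genes, which remains manageable for a list of about 1,400 DEGs.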

2.2.10. Comparing the transcriptomic data with the previous proteomic data

The list of 42 proteins shown to be differentially expressed under the same experimental condition via proteomic analysis (Mazinani et al., 2019) was compared to the list of DEGs from this analysis. The gene-level expression of these proteins was retrieved from the results of DGE analysis using Cufflinks.

2.2.11. Computational analysis

All RNA-seq analyses up to the generation of the DEG lists were performed on Compute Canada high-performance computing facilities (https://www.computecanada.ca), while statistical analyses were performed using RStudio v1.2.1335 (https://www.rstudio.com) on desktop computers.

2.3. Results

2.3.1. Overview of the RNA-seq data: quality check and summary statistics

The raw sequence read data for each of the six samples (3 controls and 3 MWI-treated samples) were subjected to a quality check using FastQC. All six samples fulfilled the quality requirements for per base and per tile sequence quality, per sequence quality score, per sequence GC content, per base N content, and adapter content. Issues were found in sequence length distribution, sequence duplication levels, per base sequence content, and overrepresented sequences. The sequence length distribution was high for sequences >72 bp (Appendix Figure 1). Moreover, there was a high percentage of sequence duplication (Appendix Figure 2), and the per base sequence content was flagged (Appendix Figure 3). There was also a very high number of overrepresented sequences with no matches found in a database of common contaminants. For RNA-seq, these problems are not considered to adversely affect the downstream analysis, as they naturally occur in RNA-seq libraries prepared using random hexamers as primers for reverse transcription.

Table 2.1 summarizes the RNA-seq data quality as the percentage of reads mapped to the reference. The percentage of uniquely mapped reads (18.88-45.54%) was lower than that of multi-mapped reads (51.57-77.22%), and only a small percentage of reads was unmapped in all samples (1.53-3.61%). For a very good library, uniquely mapped reads exceed 90%, and for a good library they should be above 80%. Low mapping rates (<50%) are indicative of a problem with library preparation or data processing, including insufficient depletion of rRNA, poor sequencing quality, and exogenous RNA/DNA contamination (Dobin and Gingeras, 2015).


One strategy that has previously been used to handle low mapping rates is to discard the multi-mapping reads and keep only uniquely mapped reads for expression estimation (Zhang et al., 2013). In our analysis, read mapping was done using STAR (Dobin et al., 2013), which discards the multi-mapping and unmapped reads, so only uniquely mapped reads were used for the DGE analysis. Before alignment, the total number of reads per sample ranged between 13 and 44 million, with a sequencing coverage depth ranging from 448 to 1428 times (Table 2.2), which is considered sufficient for in-depth analysis of DGE (Illumina, 2020). After alignment, the total number of reads per sample and the sequencing coverage decreased but remained in the range of 5-25 million reads per sample, which is still sufficient for a study focused on profiling highly expressed genes (Illumina, 2020).

Table 2.1. Alignment statistics for RNA-seq data.

| Sample | Number of reads | Unique mapping* (%) | Multi-mapping* (%) | Unmapped* (%) |
|---|---|---|---|---|
| CTR 1 | 32,240,612 | 31.7 | 66.09 | 1.68 |
| CTR 2 | 88,104,658 | 29.26 | 68.73 | 1.56 |
| CTR 3 | 51,160,082 | 36.66 | 61.36 | 1.53 |
| MW 1 | 27,660,392 | 18.88 | 77.22 | 3.61 |
| MW 2 | 78,902,186 | 45.54 | 51.57 | 2.63 |
| MW 3 | 53,738,308 | 43.01 | 53.65 | 3.06 |
| Average | 55,301,040 | 34.175 | 63.10 | 2.35 |

Abbreviation: CTR, control sample; MW, sample treated with MWI. *Alignment was done by STAR using the E. coli HUSEC2011 genome from Ensembl Bacteria (release 37).


Table 2.2. Statistics for RNA-seq data based on the unique mapping reads.

| Sample | Before alignment: number of reads | Before alignment: total PE length | Before alignment: coverage*** | Unique mapping (%) | After alignment: number of reads* | After alignment: total PE length** | After alignment: coverage*** |
|---|---|---|---|---|---|---|---|
| CTR 1 | 32,240,612 | 2,435,388,268 | 522 | 31.7 | 10,220,274 | 772,018,081 | 166 |
| CTR 2 | 88,104,658 | 6,654,690,780 | 1428 | 29.26 | 25,779,423 | 1,947,162,522 | 418 |
| CTR 3 | 51,160,082 | 3,863,450,101 | 829 | 36.66 | 18,755,286 | 1,416,340,807 | 304 |
| MW 1 | 27,660,392 | 2,088,754,608 | 448 | 18.88 | 5,222,282 | 394,356,870 | 85 |
| MW 2 | 78,902,186 | 5,956,754,796 | 1278 | 45.54 | 35,932,056 | 2,712,706,134 | 582 |
| MW 3 | 53,738,308 | 4,057,754,281 | 871 | 43.01 | 23,112,846 | 1,745,240,116 | 374 |
| Average | 55,301,040 | 4,176,132,139 | 896 | 34.175 | 18,899,130.42 | 1,427,193,159 | 306 |

Abbreviation: CTR, control sample; MW, sample treated with MWI. *Calculated as the unique mapping (%) times the number of reads before alignment. **Calculated as the unique mapping (%) times the total PE length before alignment. ***Calculated as the total length divided by the total exon length of the reference transcripts of E. coli HUSEC2011.37 (4,661,235 bps).
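The footnote calculations of Table 2.2 can be reproduced with a few lines of Python. The sketch below is illustrative only; the function names are hypothetical, and the only inputs are the values reported in the table and the exon length of 4,661,235 bp given in the footnote.

```python
# Total exon length of the E. coli HUSEC2011 reference transcripts (Table 2.2 footnote).
EXON_LENGTH_BP = 4_661_235


def coverage(total_pe_length_bp, exon_length_bp=EXON_LENGTH_BP):
    """Sequencing coverage: total sequenced length divided by total exon length."""
    return total_pe_length_bp / exon_length_bp


def after_alignment(value_before, unique_mapping_pct):
    """Scale a before-alignment value by the unique mapping percentage."""
    return value_before * unique_mapping_pct / 100


# CTR 1 as a worked example.
reads_after = after_alignment(32_240_612, 31.7)      # about 10.22 million reads
coverage_before = coverage(2_435_388_268)            # about 522x
coverage_after = coverage(772_018_081)               # about 166x
```

Rounded to whole numbers, these reproduce the CTR 1 row of the table (10,220,274 reads, 522x and 166x coverage).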

2.3.2. Overview of the gene expression profile in response to MWI

We first examined the overall expression profile of the samples by comparing the distribution of gene expression levels using box plots. As shown in Figure 2.1, while both the control and treatment samples show a mean value of around 10 FPKM, the range of gene expression is substantially wider in the control group (CTR) than in the treatment group (MW), with more genes expressed in the former (1214 genes) than in the latter (1072 genes); the latter is missing most of the low-expression genes seen in the former.


Figure 2.1. Scatter boxplots showing the gene expression distribution pattern of control and treated E. coli samples. Scatter boxplots were generated based on the Cuffdiff FPKM values for the control samples combined (CTR) and samples treated with MWI combined (MW).

To examine the degree of variation between replicates within the groups, we performed a paired t-test and calculated the Pearson correlation coefficient for each pairwise comparison. The three control samples showed quite similar expression distribution patterns (correlation r>0.98 and a low difference in median value based on the paired t-test), as expected for good consistency in technical replicates of normal controls. Similarly, within the treatment group (MW), all three samples showed similar mean expression values (p-values from paired t-test >0.05) and high correlation (r>0.98). However, MW1 was less similar to MW2 and MW3, having the lowest correlation values with the other two samples in the group. This is also visible in its fewer outliers at the lower bound of the boxplots compared to the other two MW samples (Figure 2.2).

The similarity between groups is much lower than that between samples from the same group, as indicated by the lower correlation values and by the significant differences in the median values reflected in the lower p-values for paired t-tests between the groups (Table 2.3). Therefore, good consistency was obtained for replicates within the groups, while a significant difference in gene expression was observed between the control group and the test group, as we anticipated.

Figure 2.2. Scatter box plot of the three replicates of two conditions, showing the varieties of gene distribution. Scatter box plot was generated based on the Cuffdiff result. Control samples (CTR) (n = 3) and samples treated with MWI (MW) (n = 3).

Table 2.3. Pairwise comparison of gene expression between control (CTR) and treated (MW) samples.

| | CTR1 | CTR2 | CTR3 | MW1 | MW2 | MW3 |
|---|---|---|---|---|---|---|
| CTR1 | | 0.981328 | 0.9911 | 0.934478 | 0.965516 | 0.955503 |
| CTR2 | 0.271838 | | 0.984956 | 0.938534 | 0.949303 | 0.937012 |
| CTR3 | 0.01414 | 0.04023 | | 0.958605 | 0.977769 | 0.968846 |
| MW1 | 0.000466 | 2.49E-05 | 6.09E-05 | | 0.979063 | 0.980417 |
| MW2 | 1.17E-05 | 3.47E-07 | 1.34E-08 | 3.97E-01 | | 0.997405 |
| MW3 | 2.71E-05 | 2.89E-05 | 5.19E-06 | 5.41E-01 | 1.41E-01 | |

*Gene expression is determined by the FPKM values from Cuffdiff output. Pearson correlation coefficient values are shown above the diagonal; p-values from paired t-tests are shown below the diagonal.


2.3.3. List of DEGs common in 3 tools

To identify DEGs, we used three popular RNA-seq tools and took the DEGs shared among all three for the highest confidence. A total of 2249 of 4595 genes (Cuffdiff), 2997 of 5373 genes (edgeR), and 2729 of 5373 genes (DESeq2) were identified as differentially expressed. Further filtering was applied based on p-value ≤ 0.05, FC ≥ 2, and a minimum FPKM value ≥ 5 in at least one sample (applied only for Cuffdiff). In total, 1379 DEGs (753 up-regulated and 626 down-regulated) were shared among the three lists and met the filtering criteria. Among these, 560 DEGs lacked annotation and were subjected to further analysis using BLASTN and BLASTX to retrieve any functional information (Appendix Table 2.1 and 2.2).

Interestingly, among the 1379 DEGs, 36 were turned on by MWI (i.e., no expression in the control) (Appendix Table 2.3), and none were turned off (i.e., expressed in the control but not in the MWI samples). Among the 36 genes, 4 unknown genes (gene:HUS2011_0170, gene:HUS2011_2516, gene:HUS2011_1980, and gene:HUS2011_1010) and a gene coding for a small toxic polypeptide (hokA) showed the highest FPKM. The most up-regulated DEGs, with the highest FC, are heat shock chaperone (ibpA), carbohydrate-specific outer membrane porin, cryptic (bglH), TMAO reductase III (TorYZ), cytochrome c-type subunit (torY), gene:HUS2011_0220, and heat shock chaperone (ibpB) (Appendix Table 2.4). The most down-regulated DEGs are isocitrate lyase (aceA), galactitol-1-phosphate dehydrogenase, Zn-dependent and NAD(P)-binding (gatD), malate synthase A (aceB), alpha-galactosidase, NAD(P)-binding (melA), and pseudo (gatC) (Appendix Table 2.4).


2.3.4. Functional enrichment analysis of DEGs using DAVID

To understand the biology represented by the DEGs, we performed GO term and KEGG pathway enrichment analysis for the annotated up-regulated (679) and down-regulated (614) DEGs, separately and combined, using the DAVID tool (Huang et al., 2009). To be considered significantly enriched, we required both the raw p-value and the Benjamini-corrected p-value to be ≤ 0.05. As shown in Table 2.4 (Appendix Table 2.5), the GO terms plasma membrane, integral component of membrane, and pilus, together with the pentose and glucuronate interconversions pathway, were significantly enriched among the up-regulated DEGs. Most of these genes code for proteins in the plasma membrane and integral component of membrane, and 140 genes overlap between these 2 cellular components (Appendix Table 2.6). The pentose and glucuronate interconversions pathway was the only significantly enriched KEGG pathway.

Table 2.4. The most significantly enriched GO terms and KEGG pathways among the up-regulated DEGs in E. coli in response to MWI.

Cellular Component

| Term | p-value | Fold Enrichment | Benjamini | Count |
|---|---|---|---|---|
| GO:0005886~plasma membrane | 8.17E-11 | 1.38994 | 3.92E-09 | 221 |
| GO:0016021~integral component of membrane | 5.33E-07 | 1.402052 | 1.28E-05 | 150 |
| GO:0009289~pilus | 1.40E-04 | 2.853954 | 0.002233 | 16 |

KEGG Pathway

| Term | p-value | Fold Enrichment | Benjamini | Count |
|---|---|---|---|---|
| eco00040:Pentose and glucuronate interconversions | 9.38E-06 | 5.342857 | 6.94E-04 | 11 |

The most significantly enriched GO terms and KEGG pathways based on the p-value (≤ 0.05) and Benjamini-corrected p-value (≤ 0.05).

Among the 614 down-regulated DEGs, 4 cellular components from GO terms and 14 KEGG pathways were enriched (Table 2.5 and Appendix Table 2.7). A high number of down-regulated DEGs are localized in the cytosol (263/614). The 14 enriched KEGG pathways are those related to metabolism, biosynthesis of antibiotics, secondary metabolites, and bacterial-type flagellum hook. A network of significantly enriched GO terms and KEGG pathways showed that the majority of down-regulated DEGs are involved in 9 KEGG pathways and code for proteins in the cytosol (Figure 2.3). In total, 46 links were generated, showing the interactions between GO terms and KEGG pathways based on the number of overlapping genes.

The connection between metabolic pathways, biosynthesis of antibiotics, biosynthesis of amino acids, and biosynthesis of secondary metabolites was revealed by the significant number of overlapping genes between them (linked by the thicker lines; the overlapping genes are listed in Appendix Table 2.8). The impact of MWI on metabolism and biosynthesis is also supported by the genes showing the largest change in expression, including aceA, aceB, and gatD, all of which are directly involved in metabolism, indicating that shutting down all major metabolic pathways is a prominent response to MWI.

Enrichment analysis was also performed after combining the up- and down-regulated DEGs. Only 14 KEGG pathways, and none of the GO terms, were significantly enriched (Appendix Table 2.9). In comparison with the enrichment results for the separate up- and down-regulated DEG lists (Table 2.4 and 2.5), the enriched GO terms were no longer in the list, while among the KEGG pathways, one carbon pool by folate appeared as a new entry and pentose and glucuronate interconversions and fatty acid metabolism were no longer significantly enriched. Therefore, we reasoned that enrichment analysis treating the up- and down-regulated DEG lists separately is more informative.


Table 2.5. The most significantly enriched GO terms and KEGG pathways from down-regulated DEGs in E. coli.

Cellular Component

| Term | p-value | Fold Enrichment | Benjamini | Count |
|---|---|---|---|---|
| GO:0005829~cytosol | 2.60E-22 | 1.563201 | 1.77E-20 | 263 |
| GO:0009424~bacterial-type flagellum hook | 2.20E-04 | 4.9119 | 0.007462 | 8 |
| GO:0016020~membrane | 0.001381 | 1.543005 | 0.030834 | 48 |
| GO:0045261~proton-transporting ATP synthase complex, catalytic core F(1) | 0.003007 | 6.139875 | 0.049902 | 5 |

KEGG Pathway

| Term | p-value | Fold Enrichment | Benjamini | Count |
|---|---|---|---|---|
| eco01130:Biosynthesis of antibiotics | 1.97E-12 | 2.084017 | 1.97E-10 | 76 |
| eco01110:Biosynthesis of secondary metabolites | 8.09E-12 | 1.803606 | 4.05E-10 | 99 |
| eco01100:Metabolic pathways | 3.18E-11 | 1.397885 | 1.06E-09 | 180 |
| eco01200:Carbon metabolism | 1.50E-09 | 2.336703 | 3.74E-08 | 47 |
| eco00260:Glycine, serine and threonine metabolism | 1.48E-08 | 3.398715 | 2.96E-07 | 23 |
| eco01230:Biosynthesis of amino acids | 6.53E-08 | 2.118875 | 1.09E-06 | 47 |
| eco00630:Glyoxylate and dicarboxylate metabolism | 1.96E-06 | 2.854488 | 2.80E-05 | 22 |
| eco01120:Microbial metabolism in diverse environments | 1.17E-05 | 1.58901 | 1.46E-04 | 69 |
| eco00190:Oxidative phosphorylation | 1.03E-04 | 2.526871 | 0.001138 | 19 |
| eco00620:Pyruvate metabolism | 1.32E-04 | 2.294785 | 0.001314 | 22 |
| eco00020:Citrate cycle (TCA cycle) | 2.71E-04 | 2.864469 | 0.002464 | 14 |
| eco01212:Fatty acid metabolism | 0.002291 | 2.786524 | 0.018933 | 11 |
| eco00010:Glycolysis / Gluconeogenesis | 0.003752 | 2.127891 | 0.026496 | 16 |
| eco00270:Cysteine and methionine metabolism | 0.003674 | 2.384706 | 0.027914 | 13 |

The most significantly enriched GO terms and KEGG pathways based on the p-value (≤ 0.05) and Benjamini-corrected p-value (≤ 0.05).


Figure 2.3. A network of the most significantly enriched GO terms and KEGG pathways from downregulated DEGs. Each circular node is a gene set with a diameter proportional to the number of genes involved. The inner node color represents the p-value of the enrichment as shown in the scale bar: the darker the colour, the more significant the enrichment. Lines represent the fraction of overlapping genes between the nodes: the thicker the line, the greater the overlap. KEGG pathways are indicated with a prefix of “ECO”, while GO terms are prefixed with “GO:”.

2.3.6. The co-expressed DEGs analysis

A co-expression analysis was performed to identify DEGs showing correlated expression, both positive and negative, among the 6 samples, and the DEGs were grouped based on their strongest pairwise correlation connections. Although the analysis starts from the DEGs, grouping by expression correlation may offer refined sub-lists of these DEGs, which may reveal different or additional enrichment of GO terms and KEGG pathways. The analysis placed the 1379 DEGs in 8 groups. Genes in five of the groups (G1-3 and G5-6) showed significantly enriched GO terms and KEGG pathways (Figure 2.4-2.7), and these were compared with those of the up- and down-regulated DEGs. Nine biological processes and 7 molecular functions appeared as new entries, mostly enriched in G2 and G6, respectively (Figure 2.4 and 2.5), whereas the pore complex was the only additional enriched cellular component, found in G1 (Figure 2.6). The comparison of enriched KEGG pathways revealed that propanoate metabolism; methane metabolism; alanine, aspartate and glutamate metabolism; flagellar assembly; and two-component system appeared as new entries among the co-expressed genes, mostly in G2 and G3, whereas one carbon pool by folate had already been found enriched among all DEGs combined but not in the separate lists of up- and down-regulated DEGs (Figure 2.7), indicating that the grouping of co-expressed DEGs does offer some additional information.

Figure 2.4. Uniquely enriched GO terms of biological processes among co-expressed, up- and down-regulated DEGs. The lists of significantly enriched GO terms and KEGG pathways were filtered by p-value and Benjamini-corrected p-value (≤ 0.05).


Figure 2.5. The comparison of the most significantly enriched molecular functions among co-expressed, up- and down-regulated DEGs. The lists of the most significantly enriched GO terms and KEGG pathways were filtered by p-value and Benjamini-corrected p-value (≤ 0.05).

Figure 2.6. The comparison of the most significantly enriched cellular components among co-expressed, up- and down-regulated DEGs. The lists of the most significantly enriched GO terms and KEGG pathways were filtered by p-value and Benjamini-corrected p-value (≤ 0.05).


Figure 2.7. The comparison of the most significantly enriched KEGG pathways among co-expressed, up- and down-regulated DEGs. The lists of the most significantly enriched GO terms and KEGG pathways were filtered by p-value and Benjamini-corrected p-value (≤ 0.05).


2.3.7. Comparing the transcriptomic data with proteomic data

An earlier proteomic study of a similar experiment done in the same lab (Mazinani et al., 2019) identified a total of 42 proteins as differentially expressed in response to MWI. We were interested in finding out how much agreement exists between the proteomic and transcriptomic analyses. For this, we compared the list of genes/proteins from the proteomic study with the list of DEGs obtained in this study. In total, 16 genes are shared between the two studies (Table 2.6). However, the pattern of expression change was mostly different between the two studies. Only leuS, mutS, sucA, and acnB showed the same pattern in both studies, being down-regulated in response to MWI, while the proteomic study showed the majority of the genes, including the 2 genes involved in the aminoacyl-tRNA biosynthesis pathway (lysU and proS), with an increased level of expression, opposite to the pattern of change at the transcript level. Overall, RNA-seq identified more genes showing differential expression than the proteomic analysis, which is a common phenomenon. In summary, we found a low rate of agreement between the transcriptomic and proteomic data, limited to 4 down-regulated genes.


Table 2.6. Shared genes between the transcriptomic and proteomic analysis in response to MWI.

| Protein spot* | Gene symbol** | Protein accession* | Protein FC (p-value)* | RNA-seq FC (p-value)** | Gene description*** |
|---|---|---|---|---|---|
| Spot 1 | leuS | P07813 | 1.3 (0.004) ↓ | 2.8 (0.001) ↓ | leucyl-tRNA synthetase |
| Spot 1 | mutS | P23909 | 1.3 (0.004) ↓ | 2.2 (0.001) ↓ | methyl-directed mismatch repair protein |
| Spot 1 | sucA | P0AFG3 | 1.3 (0.004) ↓ | 51.8 (5.00E-05) ↓ | 2-oxoglutarate decarboxylase, thiamine triphosphate-binding |
| Spot 1 | acnB | P36683 | 1.3 (0.004) ↓ | 9.2 (5.00E-05) ↓ | aconitate hydratase 2; aconitase B; 2-methyl-cis-aconitate hydratase |
| Spot 2 | pflB | P09373 | 3.0 (<0.001) ↑ | 5.2 (5.00E-05) ↓ | formate C-acetyltransferase 1, anaerobic; pyruvate formate-lyase 1 |
| Spot 3 | ptsP | P37177 | 2.0 (<0.001) ↑ | 3.2 (5.00E-05) ↓ | PEP-protein phosphotransferase enzyme I; GAF domain-containing protein |
| Spot 3 | pflB | P09373 | 2.0 (<0.001) ↑ | 5.2 (5.00E-05) ↓ | formate C-acetyltransferase 1, anaerobic; pyruvate formate-lyase 1 |
| Spot 4 | rnb | P30850 | 1.6 (<0.001) ↑ | 3.0 (5.00E-05) ↓ | ribonuclease II |
| Spot 5 | lysU | P0A8N5 | 1.4 (0.001) ↑ | 4.0 (5.00E-05) ↓ | lysine tRNA synthetase, inducible |
| Spot 5 | rpsA | P0AG67 | 1.4 (0.001) ↑ | 2.6 (5.00E-05) ↓ | 30S ribosomal subunit protein S1 |
| Spot 5 | proS | P16659 | 1.4 (0.001) ↑ | 3.4 (0.0054) ↓ | prolyl-tRNA synthetase |
| Spot 8 | curA | P76113 | 1.9 (<0.001) ↑ | 3.0 (5.00E-05) ↓ | curcumin/dihydrocurcumin reductase, NADPH-dependent |
| Spot 8 | fbaA | P0AB71 | 1.9 (<0.001) ↑ | 6.8 (5.00E-05) ↓ | fructose-bisphosphate aldolase, class II |
| Spot 8 | malE | P0AEX9 | 1.9 (<0.001) ↑ | 4.5 (5.00E-05) ↓ | maltose transporter subunit |
| Spot 8 | aspC | P00509 | 1.9 (<0.001) ↑ | 5.1 (5.00E-05) ↓ | aspartate aminotransferase, PLP-dependent |
| Spot 10 | dps | P0ABT2 | 1.4 (0.001) ↑ | 4.2 (5.00E-05) ↓ | Fe-binding and storage protein; stress-inducible DNA-binding protein |

Arrows show upregulated (↑) and downregulated (↓) expression, with the associated number being the fold change of expression between the control and MW groups. For spots containing several genes, the spot-level protein FC is repeated on each row. *Based on the result from Proteome Discoverer (version 2.2, Thermo Scientific) against the Uniprot E. coli (strain K12) (Mazinani et al., 2019). **Based on the result from Cuffdiff DGE analysis against the Escherichia coli HUSEC2011 genome from Ensembl Bacteria (release 44). ***Based on DAVID enrichment analysis using Escherichia coli str. K-12 substr. MG1655.
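The agreement between the two data sets can be checked with a short script. The sketch below is illustrative only: the directions are transcribed from Table 2.6 under the assumption that each spot-level protein FC applies to every gene listed for that spot, and the variable and function names are hypothetical.

```python
# (gene, protein-level direction, transcript-level direction), read from Table 2.6.
# Assumption: the spot-level protein FC applies to every gene in that spot.
TABLE_2_6 = [
    ("leuS", "down", "down"), ("mutS", "down", "down"),
    ("sucA", "down", "down"), ("acnB", "down", "down"),
    ("pflB", "up", "down"),   # Spot 2
    ("ptsP", "up", "down"), ("pflB", "up", "down"),  # Spot 3
    ("rnb", "up", "down"),
    ("lysU", "up", "down"), ("rpsA", "up", "down"), ("proS", "up", "down"),
    ("curA", "up", "down"), ("fbaA", "up", "down"),
    ("malE", "up", "down"), ("aspC", "up", "down"),
    ("dps", "up", "down"),
]


def concordant_genes(rows):
    """Genes whose direction of change agrees at the protein and transcript level."""
    return sorted({gene for gene, protein, rna in rows if protein == rna})


agreeing = concordant_genes(TABLE_2_6)
```

Under this reading of the table, `agreeing` contains only acnB, leuS, mutS, and sucA, the four genes down-regulated in both analyses.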


2.4. Discussion

In this study, we aimed to examine the effect of MWI on E. coli at the transcriptome level using RNA-seq as the state-of-the-art gene expression profiling approach. In comparison with other studies on the impact of MWI, our methodology addresses the critical issue of overheating during MWI, a side effect reported by previous studies. This was achieved by applying MWI at 2.45 GHz with simultaneous cooling for 5 hours, which allowed us to observe the non-thermal effect of MWI without it being obscured by the impact of overheating. Our findings also provide new insight into the specific impact of MWI treatment, namely the up-regulation of genes associated with membrane integrity (as previously reported by other studies) and with adhesion for biofilm formation, causing suppression of motility (not yet reported by other studies). The shutting down of metabolic and biosynthesis pathways as E. coli enters the stationary phase, shown by the downregulation of genes involved in 14 KEGG pathways (Table 2.5), indicated that cellular metabolism in E. coli was modified during MWI exposure, as previously reported by Mazinani et al. (2015). We also identified several downregulated genes that are involved in ATP synthesis and hydrolysis, nicotinamide adenine dinucleotide (NAD+) biosynthesis, protein biosynthesis, one carbon pool by folate activation, and iron transport. These effects of gene downregulation in E. coli in response to MWI have not yet been reported by other studies as MWI-specific effects.

Although we were not able to measure the enzymatic activity corresponding to the DEGs in this study, our results suggest that enzymatic activity involved in various signaling and metabolic processes was disrupted by MWI. This is supported by the comparison of our transcriptomic data with the proteomic data from Mazinani et al. (2019), which showed that protein-coding genes involved in the TCA cycle and tRNA biosynthesis were differentially expressed in response to MWI. We believe that more enzymes might have been impacted by MWI, as we found that most of these DEGs code for various enzymes in E. coli. Below, we discuss the relevance of our results in the context of the existing literature.

2.4.1. E. coli reacted to enhance membrane integrity and adhesion for survival in response to MWI

Our gene expression profiling identified a total of 679 DEGs up-regulated in response to MWI, and these genes showed enrichment for the plasma membrane, integral component of membrane, pilus, and pentose and glucuronate interconversions (Table 2.4), with a high number of DEGs found in the plasma membrane (221) and integral component of membrane (150). Genes involved in efflux systems, OM maintenance, PTS, and bacterial adhesion were also found to be involved in maintaining membrane integrity to resist the stress conditions as a response to MWI (Said-Salman et al., 2019). This is in agreement with the result of a previous study showing that the conductivity and permeability of materials increased after MWI, enhancing the diffusion and mobility of ions (Shamis et al., 2011).

2.4.1.1. The increased activity of bacterial efflux pumps

Among the 140 up-regulated DEGs associated with both the plasma membrane and integral component of membrane, 4 DEGs code for efflux systems: the lactose/glucose efflux system (setB), the putative multidrug or homocysteine efflux system (hsrA), the threonine and homoserine efflux system (rhtA), and the zinc efflux system (zitB) (Appendix Table 2.6). Further, 42 of these DEGs are annotated as putative transporter, putative inner membrane transporter, putative MFS transporter, sugar transporter, or arabinose transporter. Several DEGs code for inner membrane proteins, including 2 membrane fusion protein (MFP) components of efflux pumps (yibH and yiaV), 2 inner membrane efflux pump-associated proteins (ydhI and aaeX), a paraquat-inducible, SoxRS-regulated inner membrane protein (pqiA), and a preprotein membrane subunit (secG).

Bacterial efflux pumps (EPs) are formed by proteins that are localized and embedded in the plasma membrane of the bacterium. Their function is to recognize agents that have penetrated the protective cell wall of the organism and reached the periplasm or cytoplasm, and to extrude them before they reach their intended targets. Moreover, EPs also recognize toxic compounds that are products of the metabolism of the bacterium and hence perform excretory functions. In other words, EPs are transporters of noxious compounds from within the bacterial cell to the external environment (Amaral et al., 2014). Interestingly, the up-regulation of 2 genes coding for multidrug efflux system proteins (acrF and emrD) and a gene coding for a putative multidrug or homocysteine efflux system (hsrA), all located in the plasma membrane and integral component of membrane, suggests that the concentration of noxious agents exceeded the capacity of the intrinsic EPs to extrude them, leading to over-expression of the main EP and resulting in a multi-drug resistant (MDR) phenotype, a prevalent form of clinical resistance (Amaral et al., 2014).

Additionally, several of the up-regulated DEGs potentially play roles in EPs. These include citrate/succinate antiporter; citrate carrier (citT), potassium translocating ATPase, subunit B (kdpB), voltage-gated potassium channel (kch), alanine exporter, alanine-inducible, stress-responsive (alaE), amino acid exporter for proline, lysine, glutamate, homoserine (yahN), and oxidative stress resistance protein; putative MATE family efflux pump; UV and mitomycin C inducible protein (dinF). The upregulation of these genes can be related to the function of EPs as transporters of noxious compounds and to their ability to utilize sources of energy (e.g., ATP and the proton motive force (PMF)) (Amaral et al., 2014).


2.4.1.2. Disruption of membrane proteins responsible for stabilizing membrane integrity

The OM of gram-negative bacteria is a complex structure with a major role in the adaptation of the bacterium to various external environments, while passively and selectively controlling the influx and efflux of important solutes, peptides or proteins, nucleic acids, and other organic compounds such as lipids and polysaccharides (Amaral et al., 2014). The significant change in expression of several genes coding for outer membrane proteins (OMPs) also revealed that the OM was affected by MWI (Confer and Ayalew, 2013). Some of these genes were up-regulated (e.g., ompC, ompG, and ompN), while others were down-regulated (e.g., ompA, ompF, and ompW) by MWI (Appendix Table 2.11).

OMPs are highly expressed under optimal environmental conditions, and their level of expression is adjusted when it is necessary to minimize the penetration of noxious compounds or maximize access to nutrients (Amaral et al., 2014). The two major OMPs in E. coli are OmpC and OmpF, consisting of three 16-stranded β-barrels defining a transmembrane pore in the outer membrane porin. The expression levels of the porins OmpC and OmpF control the permeability of the OM to glucose and nitrogen uptake under nutrient limitation. Moreover, OmpC and OmpF synthesis is reduced or markedly reduced, respectively, during the adaptation process, along with the upregulation of EPs (Amaral et al., 2014). In our study, these two genes showed a large change in expression in response to MWI, with the up-regulation of ompC indicating increased activity of OmpC to deal with the increase in noxious compounds produced, whereas the downregulation of ompF is potentially associated with the increased activity of EPs (Appendix Table 2.11). OmpA is a major heat-modifiable OMP in E. coli and one of the best-characterized OMPs; it has both structural and ion-permeable porin roles, with its ionic pore controlled by a salt-influenced electrostatic gating mechanism for bacterial survival during osmotic stress (Confer and Ayalew, 2013).

OMPs also function together with lipoproteins in membrane structure and stability, active and passive ion and solute transport, signal transduction, defense, and catalysis (Confer and Ayalew, 2013). Lipoproteins have been shown to play a role in the stabilization of the outer membrane, substrate transport, outer membrane protein assembly, and cell signaling (Bernstein, 2011). We noticed that a murein lipoprotein (lpp) found in the membrane was also downregulated by MWI (Appendix Table 2.7), potentially as a consequence of the downregulated expression of OMPs. Lpp is the most abundant protein in E. coli and is surface-exposed, with the bulk of the protein embedded in the OM (Bernstein, 2011). Lpp functions to anchor the OM to the bacterial cell wall, aiding in the stability and durability of the bacterial cell (Gonzalez et al., 2015).

The impact of MWI on membrane integrity is also supported by the downregulation of alpha-ketoglutarate transporter (kgtP) in the membrane and HU, DNA-binding transcriptional regulator, alpha subunit (hupA) in the cytosol and membrane (Appendix Table 2.7), both of which also contribute to membrane integrity. KgtP is a hydrophobic membrane protein that co-transports α-ketoglutarate and protons into E. coli (Seol and Shatkin, 1993). hupA plays an important role in the regulation of genes involved in processes related to virulence and tolerance to different types of stress, such as anaerobiosis, medium acidification, osmolarity increase, and UV irradiation (Conforte et al., 2019).

The downregulation of lpp and 3 OMPs by MWI is potentially responsible for the upregulation of genes involved in a variety of efflux systems and in the transport of ions and metabolites induced by stress conditions. Downregulation of OMPs in E. coli, by affecting their structural and ion-permeable porin roles, promotes bacterial survival during osmotic stress and in various external environments, while passively and selectively controlling the influx and efflux of important solutes, peptides or proteins, nucleic acids, and other organic compounds such as lipids and polysaccharides (Confer and Ayalew, 2013). Consequently, the increase in the conductivity and permeability of cell membranes, which enhances diffusion, including the mobility of ions, could also lead to bacterial killing (Asay et al., 2008; Shamis et al., 2011).

2.4.1.3. Increased activity of PTS controlling carbon uptake and metabolism

In bacteria, the phosphoenolpyruvate-dependent carbohydrate phosphotransferase system (PTSsugar) drives the uptake of carbohydrates across the inner membrane, as well as their subsequent phosphorylation (Strickland et al., 2019). Several DEGs coding for fused mannose-specific PTS enzymes in the integral components of the membrane and the plasma membrane were found to be upregulated by MWI (Appendix Table 2.6). These DEGs encode the maltose- and glucose-specific PTS enzyme IIB and IIC components (malX), a putative enzyme IIC component of PTS (fryC), the trehalose-specific PTS enzyme IIB and IIC components (treB), and the fused beta-glucoside-specific PTS enzymes: IIA component/IIB component/IIC component (bglF).

Multiple mechanisms are involved in PTS regulation, in which transcriptional regulation can increase or decrease the concentration of the enzymes and substrates for phosphotransfer depending on the amount of metabolites in the growth medium. The PTSsugar comprises a five-step cascade initiated by phosphoenolpyruvate, a glycolytic intermediate, and involves three proteins, namely, enzyme I (EI), the phosphocarrier protein (HPr), and enzyme II (EII) (Strickland et al., 2019). EI and HPr are soluble cytoplasmic proteins that participate in the transport of all PTS carbohydrates and are referred to as the general PTS components, unless they are part of a multidomain fusion protein that possesses specificity for one carbohydrate. By contrast, the utilization of a carbohydrate taken up by a non-PTS active transporter requires energy for transport and extra ATP for phosphorylation of the carbohydrate molecule in the carbohydrate kinase reaction (Kotrba et al., 2001). PTS proteins, through their extent of phosphorylation or the nature of the carbohydrate, are also associated with metabolic pathway regulation and cell signaling (Strickland et al., 2019), which could potentially contribute to the decreased activity of several metabolic and biosynthesis pathways as described below.
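The five phosphotransfer steps of the cascade can be written out as the chain below. This is a sketch of the textbook scheme rather than a result of this study; the split of EII into its IIA and IIB domains follows the component names listed above, and the subscripts "out" and "in" are notation introduced here for the two sides of the inner membrane.

```latex
\[
\mathrm{PEP} \rightarrow \mathrm{EI} \rightarrow \mathrm{HPr} \rightarrow \mathrm{EIIA} \rightarrow \mathrm{EIIB} \rightarrow \text{carbohydrate}
\]
\[
\mathrm{PEP} + \text{carbohydrate}_{\text{out}} \longrightarrow \text{pyruvate} + \text{carbohydrate-P}_{\text{in}}
\]
```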

2.4.1.4. E. coli adhesion increased by MWI

Adhesion to a surface is a survival mechanism for bacteria and is generally recognized as the first step in biofilm formation (Busscher and van der Mei, 2012). Our results showed that MWI also enhanced the expression of 16 genes in the pilus, possibly associated with bacterial adhesion and colonization (Table 2.5). Among the upregulated DEGs, 9 (yadN, yadM, yadL, yraH, yraK, ydeS, ydeQ, yfcV, and ygiL) code for putative fimbrial-like adhesin proteins, whereas ygiI and ybgO code for fimbrial proteins (Appendix Table 2.6). A link has been described between strong adhesion forces between bacteria and substratum surfaces, which cause membrane stresses, and an increase in the percentage of dead cells on a surface, for which the term “stress deactivation” was coined. Adhering bacteria react to the membrane stresses caused by the adhesion forces, which regulate their adhering state on a surface, leading to the transformation of bacteria from a planktonic to a biofilm phenotype (Busscher and van der Mei, 2012).

Biofilm formation starts with microbial adhesion to the surface as a result of non-covalent interactions such as electrostatic interactions, hydrophobic interactions, or van der Waals forces (Różalska and Sadowska, 2018). During the early phase of biofilm development, the synthesis of flagella is repressed as attachment to the surface makes the adhered cells sessile. Several small molecules such as cyclic-diguanylic acid (c-di-GMP) are responsible for the shift from the planktonic to the sessile state (Sharma et al., 2016). In line with this, our data showed that 8 genes (Table 2.5 and Appendix Table 2.7), coding for flagellar components of the cell-proximal portion of the basal-body rod (flgB, flgC, flgE, and flgG), the flagellar hook-filament junction proteins (flgK and flgL), and the flagellar hook protein (flgD and flgE), were suppressed by MWI, affecting bacterial motility and taxis (Kato et al., 2019; Matsunami et al., 2016).

2.4.2. E. coli shuts down most metabolism and biosynthesis to enter the stationary phase in response to MWI

Our results revealed that a large number of genes involved in metabolism and biosynthesis pathways in E. coli were downregulated in response to MWI. In total, 14 KEGG pathways were affected, including the biosynthesis pathways of antibiotics, secondary metabolites, and amino acids, as well as metabolic pathways, carbon metabolism, glycine, serine and threonine metabolism, glyoxylate and dicarboxylate metabolism, microbial metabolism in diverse environments, oxidative phosphorylation, pyruvate metabolism, citrate cycle (TCA cycle), fatty acid metabolism, glycolysis/gluconeogenesis, and cysteine and methionine metabolism. Interestingly, 47 genes involved in carbon metabolism were found to be specifically involved in one-carbon pool by folate, an enriched KEGG pathway based on all combined DEGs (Appendix Table 2.9), showing that combining all DEGs provides more specific information for analyzing the effect of MWI on E. coli. In addition, ATP synthesis and hydrolysis, NAD+ biosynthesis, protein biosynthesis, and iron transport were also suppressed by MWI. The significant downregulation of these metabolic pathways indicates that E. coli cells prepare themselves to enter a stationary phase for better survival under MWI stress. The relevance of these pathways is discussed in the sections below.
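As an illustration of what "enriched" means for a KEGG pathway, the sketch below computes a one-sided hypergeometric over-representation p-value. It is not necessarily the test used by the enrichment tool in this study; the function name is hypothetical, and of the four input counts only the overlap of 47 genes comes from the text, while the other three are placeholders chosen for illustration.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, pathway_genes, deg_count, overlap):
    """Return P(X >= overlap) for a hypergeometric draw.

    deg_count genes are drawn without replacement from total_genes,
    of which pathway_genes are annotated to the pathway of interest.
    """
    upper = min(pathway_genes, deg_count)
    # Sum the probability mass of every overlap at least as large as observed.
    tail = sum(
        comb(pathway_genes, k) * comb(total_genes - pathway_genes, deg_count - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(total_genes, deg_count)


# Only overlap=47 is taken from the text; the other counts are placeholders.
p = hypergeom_enrichment_p(total_genes=4000, pathway_genes=110,
                           deg_count=800, overlap=47)
print(f"p = {p:.3g}")
```

A small p-value indicates that the pathway contains more DEGs than expected by chance for a gene list of that size; in practice such p-values are also corrected for testing many pathways at once.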

2.4.2.1. E. coli enters stationary phase during MWI

Microorganisms often switch between phases of growth (log phase) and non-growth (stationary phase) to adapt to environmental conditions, and they can enter a dormant state (Said-Salman et al., 2019). The stationary phase is characterized by repressed metabolic activity and tolerance to bactericidal antibiotics (Stokes et al., 2019). In the stationary phase, bacteria can still preserve certain metabolic activity but are incapable of division due to their reduced adaptation capacity (Said-Salman et al., 2019). It was suggested that during MWI, part of the bacterial population enters the dormant state through metabolic suppression and growth arrest to survive the stress conditions, as shown by the downregulated expression of genes involved in several metabolic and biosynthesis pathways (Table 2.5 and Appendix Table 2.7). Most of these gene products are localized in the cytoplasm, specifically in the cytosol (Table 2.5), which is the major aqueous environment in which most cellular activity takes place, with some of the enzymes embedded in the plasma membrane (Kumar and Dubey, 2019).

Our findings also showed that the effects of MWI treatment on bacteria are similar to those of bacteriostatic compounds reported by Lobritz et al. (2015), who analyzed the proteome and metabolome of treated cells. They reported that bacteriostatic compounds decrease cellular respiration, downregulate glycolysis and gluconeogenesis, pyruvate metabolism, and the TCA cycle, and lead to the accumulation of ATP, adenosine diphosphate (ADP), adenosine monophosphate (AMP), NADH, and central carbon metabolites (Lobritz et al., 2015).

Oxidative phosphorylation was also suppressed by MWI and as a result of lacking some of the necessary proteins from oxidative phosphorylation, the TCA cycle was affected, and the fermentative metabolism was increased to prevent the accumulation of NADH and pyruvate.

2.4.2.2. The downregulated expression of genes coding for the F1 sector of membrane-bound ATP synthase

ATP synthase (also called F-ATPase or FoF1) in the membranes of bacteria, mitochondria, or chloroplasts plays a central role in energy transduction by synthesizing most of the cellular ATP from ADP and inorganic phosphate (Pi) (Nakanishi-Matsui et al., 2016). This enzyme utilizes the electrochemical energy stored in a proton gradient to form the high-energy phosphate bond of ATP and achieves energy coupling as a dual-engine rotary nanomotor (Nakanishi-Matsui et al., 2016; Shah et al., 2013). Based on our findings, 5 genes coding for different subunits of the F1 sector of membrane-bound ATP synthase were downregulated by MWI, as shown by the enrichment of proton-transporting ATP synthase complex, catalytic core F(1) among the enriched cellular components (Table 2.5). The downregulated ATP synthase genes code for the alpha subunit (atpA), epsilon subunit (atpC), beta subunit (atpD), gamma subunit (atpG), delta subunit (atpH), subunit c (atpE), and subunit b (atpF) (Appendix Table 2.7). Among these, atpA, atpD, and atpG were involved in 2 significantly enriched KEGG pathways (metabolic pathways and oxidative phosphorylation) and localized in 2 cellular components, the membrane and the proton-transporting ATP synthase complex, catalytic core F(1).

F-ATPase (FoF1), consisting of the catalytic sector F1 and the transmembrane proton transport sector Fo, synthesizes or hydrolyzes ATP coupled with proton transport (Sekiya et al.,

2009). F1-ATPase can consume ATP to drive rotation of the γ-subunit inside the ring of three

αβ-subunit heterodimers in 120° power strokes (Martin et al., 2014). A critical feature of the F1-

ATPase is the inherent asymmetry of the three β subunits in different conformations, βTP, βDP, and βE, referring to the nucleotide bound in each catalytic site, ATP, ADP, or empty, respectively

(Sekiya et al., 2009). In most living organisms, the Fo component of the FoF1 complex uses energy derived from a proton-motive force across a membrane to power the F1-dependent synthesis of ATP from ADP and Pi (Martin et al., 2014). In the rotational catalysis of ATP synthesis, proton transport through the interface between the a and c subunits drives rotation of the εγc10–15 rotor complex. The γ subunit rotation in the α3β3 hexamer induces alternative conformational changes of the catalytic site in each β subunit, and the product ATP is released from the ATP-bound β subunit. In the reverse direction, successive ATP hydrolysis at the β subunit catalytic site drives rotation of the εγc10–15 complex resulting in proton transport

(Nakanishi-Matsui et al., 2016).
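The rotary mechanism described above implies a simple stoichiometry, sketched here under two idealized textbook assumptions that are not taken from this study: each full rotation passes all three catalytic β sites once, and each c subunit carries one proton per rotation. The symbol n_c (number of c subunits in the ring) is notation introduced for this sketch.

```latex
\[
3 \times 120^{\circ} = 360^{\circ}
\quad\Longrightarrow\quad
3\ \mathrm{ATP}\ \text{per full rotation}
\]
\[
\frac{\mathrm{H^{+}}}{\mathrm{ATP}} = \frac{n_c}{3},
\qquad
n_c = 10 \;\Rightarrow\; \frac{10}{3} \approx 3.3,
\qquad
n_c = 15 \;\Rightarrow\; \frac{15}{3} = 5
\]
```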

The downregulation of all genes coding for the catalytic sector F1 suggested that the rotational catalysis of ATP synthesis was suppressed by MWI, possibly decreasing the level of ATP production and increasing the accumulation of cellular ADP and Pi. The decreased level of ATP synthesis also contributed to the suppression of other metabolic and biosynthesis pathways and further supports the possibility that E. coli enters the stationary phase during MWI, as described previously. In the reverse direction, proton transport was also decreased due to the suppression of the atpD gene, which is important for ATP hydrolysis.

2.4.2.3. The suppression of NAD+ biosynthesis

NAD+ has been identified as a co-substrate for DNA , poly-ADP-ribose polymerases (PARPs), and CD38/157 ADP-ribosyl cyclase reactions (Yoshida and Imai, 2018).

ADP-ribosylation is a reversible, post-translational, covalent modification of proteins in which the ADP-ribose moiety of NAD+ is enzymatically transferred to specific amino acid residues

(His, Arg, or Cys) of the target protein, with the subsequent release of nicotinamide (Aguilera et al., 2009). Our data showed that MWI suppressed NAD+ biosynthesis by downregulating the expression of genes coding for several enzymes, such as NADH:ubiquinone oxidoreductase (nuoB and nuoC), NAD(P)H:quinone oxidoreductase (wrbA), malate dehydrogenase, NAD(P)-binding (mdh), and enoyl-[acyl-carrier-protein] reductase, NADH-dependent (fabI) (Appendix Table 2.7). These proteins are also involved in metabolic pathways and are localized in the membrane, whereas wrbA, fabI, and mdh are also found in the cytosol. wrbA is also involved in the biosynthesis of secondary metabolites, whereas fabI is involved in fatty acid metabolism. nuoB and nuoC are involved in oxidative phosphorylation. mdh is also involved in the biosynthesis of antibiotics and secondary metabolites, carbon metabolism, microbial metabolism in diverse environments, glyoxylate and dicarboxylate metabolism, pyruvate metabolism, TCA cycle, and cysteine and methionine metabolism.

NAD+ and its reduced form NADH, as well as NADP+ and NADPH, have been well studied as coenzymes for many redox reactions (Yoshida and Imai, 2018), and they contribute to the storage of energy (Crowley and Kyte, 2014). The reduced forms release the stored energy when the reverse reaction occurs and they become oxidized, ultimately by oxygen. NADH is a carrier of hydride ions and has a central role in the metabolism of all cells. NAD+ and NADH are referred to as pyridine nucleotides because the nicotinamide contains a pyridine ring. These pyridine nucleotides, by accepting or donating the formal equivalent of a hydride ion, accomplish the respective oxidation or reduction of other substrates (Crowley and Kyte, 2014).

Since the NAD+ biosynthesis was suppressed by MWI, redox reactions, oxidation-reduction of other substrates, and other metabolic pathways were also affected as previously described above.

This also suggests that in the presence of oxygen, MWI forces E. coli to undergo anaerobic metabolism.

The downregulation of genes coding for dehydrogenases can also be associated with NAD+ biosynthesis, since they function in oxidation. These genes code for choline dehydrogenase, a flavoprotein (betA), and dihydrolipoyl dehydrogenase; E3 component of pyruvate and 2-oxoglutarate dehydrogenases complexes; glycine cleavage system L protein; dihydrolipoamide dehydrogenase (lpd). betA and lpd are also involved in metabolic pathways and glycine, serine, and threonine metabolism, whereas lpd is also involved in the biosynthesis of antibiotics and secondary metabolites, carbon metabolism, glyoxylate and dicarboxylate metabolism, microbial metabolism in diverse environments, pyruvate metabolism, TCA cycle, and glycolysis/gluconeogenesis (Appendix Table 2.7). Choline dehydrogenase catalyzes the four-electron oxidation of choline to glycine-betaine via a betaine-aldehyde intermediate (Gadda and McAllister-Wilkins, 2003). A single mutation in lpd is known to be important in reducing inhibition of the pyruvate dehydrogenase (PDH) complex by NADH during fermentative growth, in which NADH pools are high in comparison to aerobic growth (Sun et al., 2012). The downregulation of betA and lpd further supports the conclusion that MWI suppressed NAD+ biosynthesis.

2.4.2.4. The downregulated expression of genes coding for aminoacyl-tRNA synthetases

Aminoacyl-tRNA synthetases (aaRSs) form an activated intermediate of their cognate amino acid, transfer the amino acid to the tRNA, and serve as translation factors required for protein biosynthesis. The intrinsic proofreading capacities of the aaRSs and their balanced expression contribute greatly to the accuracy of translation of the genetic code. aaRSs are also involved in several regulatory processes via their product, the charged tRNA. The synthesis of several E. coli aaRSs is controlled by different mechanisms acting at the transcriptional or translational level. Protein synthesis requires the specific binding of mRNA and aminoacyl-tRNAs to the ribosome. In addition to the specific binding of ribosomes, mRNA, and tRNA, translation requires amino acids, nucleotides, and specialized proteins (Putzer and Laalami, 2003). Our results showed that MWI also downregulates several genes important for protein synthesis whose products are localized in the cytosol and that code for aaRSs (Appendix Table 2.7). These include genes coding for alanyl-tRNA synthetase (alaS), phenylalanine tRNA synthetase, beta subunit (pheT), tyrosyl-tRNA synthetase (tyrS), and lysine tRNA synthetase, inducible (lysU) (Table 2.6 and Appendix Table 2.10). The 30S ribosomal subunit protein S1 (rpsA) was also downregulated by MWI. In line with the previous study by Mazinani et al. (2019), our results suggest that starvation and heat regulate cellular tRNA levels as a response to stresses after MWI. This is supported by the agreement between the proteomic and transcriptomic data, which both show the downregulation of leuS, lysU, and rpsA in response to MWI (Table 2.6).

For the mRNA to be translated correctly, the ribosome must be placed at the correct start location. The start of translation requires the formation of an initiation complex, including mRNA, the 30S ribosomal subunit, fMet-tRNAfMet, GTP (guanosine-5'-triphosphate), and three protein initiation factors, named IF-1, IF-2, and IF-3 (Shen, 2019). The latter include aaRSs and translation factors transiently associated with the ribosomes. These proteins catalyze sequential steps during translation, starting with the charging of tRNA, ribosome-dependent polypeptide synthesis, final release of the protein, and ribosome recycling. aaRSs play a central role in protein biosynthesis by catalyzing the attachment of a given amino acid to the 3' end of its cognate tRNA by forming an energy-rich aminoacyl-adenylate (Putzer and Laalami, 2003).
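The attachment via an aminoacyl-adenylate described above corresponds to the standard two-step aminoacylation reaction, written out below as a general textbook scheme. Here "aa" stands for any amino acid and PP_i for pyrophosphate; both are notation introduced for this sketch.

```latex
\[
\mathrm{aa} + \mathrm{ATP} \;\rightleftharpoons\; \mathrm{aa\text{-}AMP} + \mathrm{PP_i}
\]
\[
\mathrm{aa\text{-}AMP} + \mathrm{tRNA^{aa}} \;\longrightarrow\; \mathrm{aa\text{-}tRNA^{aa}} + \mathrm{AMP}
\]
```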

Tyrosyl-tRNA synthetase (TyrRS), encoded by tyrS, is one of the first aaRSs to be studied by mutational and structural analyses. The loop bearing the KMSKS signature motif (“KMSKS loop”) of TyrRS has a major role in the catalysis of the tyrosine activation reaction, and TyrRS also catalyzes the esterification of its cognate tRNA (Kobayashi et al., 2005). Alanyl-tRNA synthetase, encoded by the alaS gene, is a tetramer of the α4 type (Putzer and Laalami, 2003). The downregulation of these genes potentially inhibits protein synthesis via the tyrosine activation reaction and by repressing synthetase transcription due to their low abundance and the presence of alanine (Putzer and Laalami, 2013).

Phenylalanyl-tRNA synthetase (PheST) is a tetrameric enzyme of the α2β2 type, in which pheS and pheT encode the small and large subunits, respectively. In E. coli, there are two lysyl-tRNA synthetases, encoded by the genes lysS and lysU, capable of activating the same tRNA iso-acceptor family. The lysU gene is normally silent at low temperatures, and its expression is induced by a variety of stimuli such as high temperature, anaerobiosis, low external pH, the stationary phase, or the presence of specific metabolites including leucine or leucine-containing dipeptides in the culture medium (Putzer and Laalami, 2013). This contrasts with our findings: the DUF218 superfamily vancomycin high-temperature exclusion protein (sanA), enriched in the plasma membrane (Appendix Table 2.5), and the leucine efflux protein (leuE), which indicates the presence of leucine or leucine-containing dipeptides, were upregulated by MWI, yet lysU was downregulated. In other words, high temperature, anaerobiosis, and the stationary phase resulting from MWI exposure suppressed the expression of lysU.

2.4.2.5. The suppression of one-carbon pool by folate

Folate is a B vitamin that is present in cells as a family of enzyme cofactors that carry and chemically activate 1-carbons at the oxidation level of formate, formaldehyde, and methanol (Stover et al., 2009), and it is important for DNA replication and repair, mitochondrial protein synthesis, and amino acid interconversions and catabolism (Chon et al., 2017). Folate-activated 1-carbons are required for the de novo synthesis of purines and thymidylate and for the remethylation of homocysteine to methionine, an essential amino acid important for protein synthesis and polyamine synthesis (Chon et al., 2017; Stover et al., 2009).

Our findings revealed that 47 genes involved in carbon metabolism (Table 2.5) are also involved in one-carbon pool by folate. Among these genes, 11 are crucial in the folate-activated 1-carbons mechanism (Appendix Table 2.9), and their expression is shown in Appendix Table 2.12. Among these are 4 genes coding for homocysteine-N5-methyltetrahydrofolate transmethylase, B12-dependent (metH), serine hydroxymethyltransferase (glyA), 10-formyltetrahydrofolate:L-methionyl-tRNA(fMet) N-formyltransferase (fmt), and phosphoribosylaminoimidazole-succinocarboxamide synthetase (purC). The downregulation of metH, glyA, fmt, and purC has a great impact on other synthesis processes and pathways related to the folate-activated 1-carbons mechanism. In addition, we found that the genes coding for adenylosuccinate lyase (purB), phosphoribosylglycinamide synthetase phosphoribosylamine-glycine ligase (purD), IMP cyclohydrolase and phosphoribosylaminoimidazolecarboxamide formyltransferase (purH), phosphoribosylformyl-glycine amide synthetase (purL), phosphoribosylglycinamide formyltransferase 2 (purT), formyltetrahydrofolate hydrolase (purU), aminomethyltransferase, tetrahydrofolate-dependent, and thymidylate synthetase (thyA) were also downregulated and possibly contributed to this pathway. Moreover, all these genes are also involved in most of the KEGG pathways, except oxidative phosphorylation, pyruvate metabolism, TCA cycle, glycolysis/gluconeogenesis, fatty acid metabolism, and oxocarboxylic acid metabolism (Appendix Table 2.7).

Serine hydroxymethyltransferase (SHMT), encoded by the glyA gene, is a key enzyme in the serine pathway in several bacteria for the assimilation of one-carbon (C1) compounds, through the addition of tetrahydrofolate (THF) to glycine, which yields serine, the main intermediate in the pathway (Zuo et al., 2007). SHMT catalyzes the reversible reaction of 3-hydroxy amino acids (L-serine, L-threonine, allothreonine, 3-phenylserine) to glycine and 5,10-methylenetetrahydrofolate (Chon et al., 2017; Zuo et al., 2007). The downregulation of the glyA gene reduces the catalysis of this reversible reaction, which forms the glycine and 5,10-methylenetetrahydrofolate important for the folate-activated 1-carbons mechanism. This is also supported by the suppression of glycine, serine, and threonine metabolism, involving 23 downregulated genes as shown in Table 2.5.
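For L-serine, the reversible SHMT reaction described above can be written as the standard textbook equation below; it is included as an illustration and is not a result of this study.

```latex
\[
\text{L-serine} + \mathrm{THF} \;\rightleftharpoons\; \text{glycine} + 5{,}10\text{-methylene-THF} + \mathrm{H_2O}
\]
```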

In mitochondria, glycine, dimethylglycine, and sarcosine are catabolized to produce formaldehyde, which is condensed with THF to generate methylene-THF. The activated formaldehyde of methylene-THF is oxidized, forming 10-formyl-THF, which serves to formylate fMet-tRNA, a specialized tRNA that is aminoacylated with methionine, for mitochondrial protein synthesis (Chon et al., 2017; Stover et al., 2009). 10-Formyltetrahydrofolate:L-methionyl-tRNA(fMet) N-formyltransferase (encoded by the fmt gene) catalyzes the N-formylation of the initiator fMet-tRNA (Guillon et al., 1992). The downregulation of glyA potentially leads to the downregulation of the fmt gene, which is needed to formylate fMet-tRNA for mitochondrial protein synthesis.

In the cytoplasm, folate-activated 1-carbon units function in an interdependent anabolic network comprised of 3 biosynthetic pathways: de novo purine biosynthesis, which requires 10-formyl-THF for the C2 and C8 carbons of the purine ring; de novo thymidylate biosynthesis, which requires methylene-THF for the reductive methylation of deoxyuridylate (dUMP) to form thymidylate (dTMP); and the remethylation of homocysteine to methionine, which requires 5-methyl-THF (Stover et al., 2009). Since 10-formyl-THF production was suppressed by the downregulation of glyA, de novo purine synthesis and de novo thymidylate biosynthesis were also suppressed.

Additionally, our data showed the downregulation of the purC gene, which codes for a key enzyme in the bacterial de novo purine biosynthesis pathway that forms inosine monophosphate (IMP), the foundation of purine-based nucleotides, from 5-phosphoribosyl pyrophosphate (PRPP) (Tuntland et al., 2015). The purC product is also referred to as 5-aminoimidazole-4-(N-succinylcarboxamide) ribonucleotide (SAICAR) synthetase, which converts ATP, L-aspartate, and 5-aminoimidazole-4-carboxyribonucleotide (CAIR) to SAICAR (Ginder et al., 2006). The downregulation of purC potentially leads to increased activity of adenylosuccinate synthetase and adenylosuccinate lyase but a decrease in L-alanosine toxicity as a result of the low production of L-alanosyl-5-amino-4-imidazolecarboxylic acid ribonucleotide in the SAICAR synthetase reaction (Ginder et al., 2006).

2.4.2.6. The suppression of iron transport

Iron is an essential element but can also be toxic through Fenton chemistry, in which iron(II) catalyzes the decomposition of hydrogen peroxide, resulting in hydroxyl radical and hydroxide production. The hydroxyl radical product can damage DNA, proteins, and lipids (Orchard et al., 2012). Based on our results, the gene coding for the 2,3-dihydroxybenzoate-AMP ligase component of the enterobactin synthase multienzyme complex (entE), important for iron transport, was downregulated by MWI (Appendix Table 2.7). Enterobactin, the tris-(N-(2,3-dihydroxybenzoyl)serine) trilactone siderophore of E. coli, is synthesized by a three-protein (EntE, B, F), six-module non-ribosomal peptide synthetase (NRPS). NRPSs activate a broad range of amino acids, including numerous non-proteinogenic amino acids, and incorporate them into growing peptidyl chains as an elongating series of peptidyl-S-enzyme intermediates (Ehmann et al., 2000). Enterobactin is also one of the specialized systems for iron transport once it binds iron and becomes ferric enterobactin (Orchard et al., 2012). With the limited production of enterobactin due to the suppression of entE gene expression, the elongation series of peptidyl-S-enzyme intermediates and iron transport activity were suppressed, resulting in the accumulation of iron in cells and leading to damage to DNA, proteins, and lipids.
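For reference, the Fenton reaction mentioned at the start of this section, in which iron(II) decomposes hydrogen peroxide into a hydroxyl radical and a hydroxide ion, is conventionally written as:

```latex
\[
\mathrm{Fe^{2+}} + \mathrm{H_2O_2} \;\longrightarrow\; \mathrm{Fe^{3+}} + {}^{\bullet}\mathrm{OH} + \mathrm{OH^{-}}
\]
```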

This finding is also supported by the downregulation of genes involved in the stress response to iron accumulation, such as those coding for a stress protein member of the CspA family (cspC) and bacterioferritin, an iron storage and detoxification protein (bfr) (Appendix Table 2.7). Other downregulated genes involved in this process are dps, coding for the Fe-binding and storage protein; stress-inducible DNA-binding protein, and kgtP. CspC is produced mainly at 37°C and was originally identified as a multicopy suppressor of a temperature-sensitive chromosome-partitioning mutant. CspC is also involved in the regulation of the expression of RpoS, a global stress response regulator, and UspA, a protein responding to numerous stresses, during normal growth as well as during the stress response (Phadtare and Inouye, 2001). The downregulation of cspC indicates that MWI suppressed the production of CspC, which is important for the stress response. The inactivation of the stress response to iron accumulation possibly disrupted dps activity.

Dps acts as an iron storage protein together with bfr and kgtP to ensure a low steady-state concentration of hydroxyl radicals, a major hazard for various cellular target molecules (e.g., DNA, proteins, or lipids). In addition to iron storage, Dps also binds to DNA, especially in the stationary phase, contributing to the formation of stable and highly ordered complexes termed biocrystals. These complexes cover the chromosomal DNA and contribute to stress protection. The Dps protein is also suited to diminishing iron-mediated oxidative stress: it combines the ability to use hydrogen peroxide (rather than molecular oxygen) for ferrous iron oxidation with the concomitant storage of iron as a mineralized core to avoid cell damage. Dps also contributes to defense against copper stress in growing cells of E. coli (Thieme and Grass, 2010). With the suppression of dps, bfr, and copA, iron-mediated oxidative stress occurs as a result of the limited amount of iron storage protein, potentially leading to cell death.

2.4.3. The differences between transcriptomic and proteomic data from E. coli in response to MWI

Based on the comparison of the DEGs from our analysis with the differentially expressed proteins identified by Mazinani et al. (2019), we found little agreement between them. Only 16 DEGs were shown to be differentially expressed at the protein level, with only 4 genes showing a similar expression change pattern at the transcript and protein levels (Table 2.6). This shows that mRNA and protein expression data from the same cells under similar conditions do not agree closely, as has also been reported in previous studies.
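The agreement described above can be expressed as a simple direction-concordance fraction. The sketch below is a minimal illustration, not the procedure used in this study: the function name and the gene labels are hypothetical placeholders (they are not the genes in Table 2.6), and only the counts of 16 shared genes and 4 concordant genes are taken from the text.

```python
def direction_agreement(transcript_dir, protein_dir):
    """Count genes present in both datasets and those changing in the same direction.

    Both arguments map a gene label to "up" or "down".
    Returns (shared, concordant, fraction of shared genes that are concordant).
    """
    shared = sorted(set(transcript_dir) & set(protein_dir))
    concordant = [g for g in shared if transcript_dir[g] == protein_dir[g]]
    fraction = len(concordant) / len(shared) if shared else 0.0
    return len(shared), len(concordant), fraction


# Placeholder labels and directions, arranged to reproduce the reported counts.
genes = [f"gene{i:02d}" for i in range(1, 17)]
rna = {g: "down" for g in genes}
protein = {g: ("down" if i < 4 else "up") for i, g in enumerate(genes)}

shared, concordant, fraction = direction_agreement(rna, protein)
print(shared, concordant, fraction)  # 16 4 0.25
```

With the reported counts, only a quarter of the genes detected at both levels change in the same direction.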

Several physical properties contribute to the differences between mRNA and protein expression (Haider and Pal, 2013). These include the overall structure of the mRNA, which is sensitive to temperature, codon composition (codon bias), the number of ribosomes in a transcriptional unit (ribosome density), the variability of mRNA expression levels during the cell cycle, the average half-lives of mRNA and proteins, and experimental errors arising from the type of data extraction. The average half-life of eukaryotic mRNA is reported to be 10-20 h, whereas the average half-life of eukaryotic proteins is 48-72 h, depending on the amino-terminal residue. Post-translational factors (e.g., phosphorylation, ubiquitination, and localization of proteins) and variation in the synthesis and degradation of different proteins also lead to variable protein half-lives. The low agreement between the transcriptomic and the proteomic data may also be due in part to siRNAs that regulate gene expression post-transcriptionally during growth phase adaptation (Bathke et al., 2019). siRNAs can affect gene expression by altering the stability of mRNAs or by promoting or inhibiting translation.

Despite the low agreement between the two omics datasets, features at the transcriptomic and proteomic levels might reflect the different biological processes or pathways involved in generating mRNA and proteins. This illustrates the importance of combining transcriptomic and proteomic data analyses so that they complement each other (Niu et al., 2018).

2.4.4. Concluding statements and future perspectives

In this study, through gene expression profiling, we demonstrated that MWI contributes to significant alteration of gene expression in E. coli. Specifically, our results show that MWI downregulates the expression of genes involved in important metabolic and biosynthesis pathways and upregulates genes important for maintaining membrane integrity in E. coli. By inactivating most of these metabolic and biosynthesis pathways, E. coli can survive under stress conditions by switching from the growth phase to the stationary phase and promoting efflux systems and transport activity to reduce noxious agents and toxic metabolites/materials. MWI also induces adhesion forces that promote biofilm formation and colonization, leading to the suppression of cell motility, as a bacterial survival mechanism.

In conclusion, our data provide comprehensive transcriptomic profiling of E. coli in response to MWI, revealing its specific impact on the expression of genes related to metabolic and biosynthetic pathways, biofilm formation, motility, and membrane integrity. Future studies can focus on validating the pattern of expression changes for genes involved in the critical pathways impacted by MWI using quantitative polymerase chain reaction (qPCR) and on designing more defined experiments to interrogate the coordination/interaction of different pathways in a cell survival response.


Chapter 3: Assessment of the impact of different culture conditions on cell physiology in culture by gene profiling

3.1 Introduction and related literature review

3.1.1. The development of cell culture technique

Standard cell culture techniques currently in use were first developed in the mid-20th century. They were in turn built on the work of the 19th-century English physiologist Sydney Ringer, who succeeded in maintaining isolated animal hearts in salt solutions that contained sodium, potassium, calcium, and magnesium salts. In 1885, Wilhelm Roux removed and maintained a portion of the medullary plate of an embryonic chicken in a warm saline solution, which then established the cell culture principle for growing animal cells. Later, Ross Granville Harrison published some of his experiments from 1907 to 1910, further establishing the methodology for tissue culture. In the 1940s and 1950s, cell culture techniques significantly advanced, and with these breakthroughs in animal cell culture, viral culture and production were also developed (Pham, 2018). In recent years, the use of cell culture technology in biomedical research has increased. Cell culture-based applications have been developed in various areas, including the assessment of the efficacy and toxicity of drugs, the manufacture of vaccines and biopharmaceuticals, and assisted reproductive technology (Yao and Asayama, 2017).

3.1.2. Cell culture medium

The culture medium is one of the important factors in cell culture technology. A medium supports cell survival and proliferation, as well as cellular functions, meaning that the quality of the medium directly affects research results, the biopharmaceutical production rate, and treatment outcomes of assisted reproductive technology. At present, synthetic media can be classified into several groups based on the type of supplements added, for example, serum-containing medium and serum-free medium (e.g., protein-free medium and chemically defined medium) (Yao and Asayama, 2017).

Serum-containing medium contains various serum-derived natural substances (e.g., fetal bovine serum) that provide essential components such as growth factors, non-polar nutrients, and trace elements. These substances, however, leave the medium composition undefined, and their concentrations can fluctuate from batch to batch, which impairs the reproducibility of results between laboratories and poses a risk of microbial contamination (Ackermann and Tardito, 2019; Bae et al., 2017). Nevertheless, serum-containing medium can be designed easily and used effectively for a variety of cell types (Yao and Asayama, 2017).

In serum-free medium, fetal bovine serum is replaced with chemically defined components. Serum-free media therefore have a defined composition, resulting in high reproducibility of results. Among serum-free media, the protein-free and chemically defined subgroups provide additional stability and reproducibility for culture systems, facilitating the identification of cellular secretions and reducing the risk of microbial contamination. However, serum-free medium is difficult to design, and only specific cell types have been successfully cultivated this way (Yao and Asayama, 2017). To achieve a chemically defined medium that allows cells to be cultured without serum supplementation, essential components must be included in the formulation (Ackermann and Tardito, 2019).

3.1.3. The impact of different culture medium components on cell culture

In recent years, it has been shown that nutrient concentrations affect the metabolism of cultured cells and their response to stresses and various stimuli, leading to discrepancies in metabolic phenotypes between cultured cells and tumors (Ackermann and Tardito, 2019; Voorde et al., 2019). This is due to the lack of some metabolites that are normally present in human fluids, or to the presence of some metabolites, such as glucose, glutamine, or pyruvate, at supra-physiological concentrations. In addition, compounds irrelevant for human pathophysiology are commonly supplemented at millimolar concentrations, with uncharacterized, yet inevitable consequences on cell metabolism (Ackermann and Tardito, 2019).

Earlier studies examined the effect of many different culture medium compositions on cell culture. Ringer’s solution (1882), a balanced salt solution containing only inorganic salts with glucose added as a nutrient, has been used successfully to keep tissues and cells alive outside the body for short periods. In 1911, Margaret R. Lewis and Warren H. Lewis reported the importance of glucose concentration for chick embryo cells and the requirement of glutathione for the control of the redox environment during cell cultivation. In 1913, adding embryonic extract to blood plasma was found to dramatically increase cellular proliferation and extend the culture period of fibroblasts from the chick embryo heart. Later, insulin was discovered by Frederick Banting and Charles Best (1921) and was supplemented into the culture medium in the 1960s, when the combination of insulin with low-concentration serum yielded more effective growth of baby hamster kidney cells. After 1946, it was reported that supplementation with dialyzed blood plasma could sustain cells only for a short period, indicating that the low-molecular-weight fraction was essential for the survival of cells (Yao and Asayama, 2017).

Currently available commercial cell culture media were designed to allow continuous growth of specific cell types with minimal amounts of nutrients and serum (e.g., Eagle’s minimal essential medium), without recapitulating the metabolic environment or reproducing the physiological cellular environment of the tissue of origin (Ackermann and Tardito, 2019; Voorde et al., 2019). Later, these media were modified by multiplying the concentration of selected nutrients, such as glucose and glutamine, by a factor of 4, to avoid nutrient exhaustion when leaving the culture unattended for longer periods (e.g., Dulbecco’s modified Eagle’s medium (DMEM)) (Voorde et al., 2019). Several studies have also examined the impact of glucose, calcium, phosphate, cysteine, and arginine in cell culture. Two studies reported that high glucose concentration induces cellular senescence, inhibits cell differentiation (Wu et al., 2009), and is responsible for the induction of epithelial-mesenchymal transition (EMT) in breast cancer and renal tubular cells in different cell culture formulations (Kim et al., 2015), while reduction of glucose enhanced proliferation (Wu et al., 2009). Higher calcium and lower phosphate induced alkaline phosphatase (ALP) activity, whereas a greater calcium concentration strongly stimulated matrix mineralization for final osteoblast proliferation and differentiation (Wu et al., 2009). Also, with excessive concentrations of pyruvate in cell culture, the proliferation of cancer cells was shown to depend less on mitochondrial respiration. Moreover, the high concentration of cysteine found in historic media enhances glutamine consumption, and the high concentration of arginine in a medium such as DMEM reverses the direction of the reaction catalyzed by the urea cycle enzyme argininosuccinate lyase (Voorde et al., 2019).

In 2019, Plasmax™ (Voorde et al., 2019) was introduced as a ‘physiologic’ medium aiming to recapitulate the nutrient composition of human plasma (Ackermann and Tardito, 2019). Plasmax™ contains 66 organic components. Amongst these, arginine and pyruvate are ~10-fold less abundant than in DMEM (Ackermann and Tardito, 2019). A recent study comparing triple-negative breast cancer (TNBC) cell lines cultured in Plasmax™ or DMEM-F12 (a commercially available, nutrient-rich medium) showed that the adoption of a more physiological culture medium substantially influences the colony-forming capacity of cancer cells by preventing ferroptosis (Voorde et al., 2019). The authors suggested that the supra-physiological concentrations of nutrients (e.g., glucose and glutamine) present in most commercial media cause a significant change in the metabolism of cancer cells (Voorde et al., 2019). Another recent study reported that a natural polyphenolic compound known as resveratrol (RES) affects mitochondria in the presence of galactose in DMEM, promoting oxidative phosphorylation. This study suggested that RES can affect cellular reactive oxygen species (ROS) metabolism, either directly as an antioxidant or pro-oxidant or indirectly by regulating the expression of ROS-producing enzymes/organelles or antioxidant enzymes (Fonseca et al., 2018).

3.1.4. The impact of oxygen on cell physiology in culture

Oxygen (O2) is essential for the viability and function of most metazoan organisms and is closely regulated at both the organismal and cellular levels (Liu et al., 2006). The principal role of O2 in mammalian physiology is to function as the terminal electron acceptor in the electron transport chain (ETC). In this capacity, a single oxygen atom is reduced to H2O in the presence of two protons and two electrons. Such reductive/oxidative (redox) reactions are fundamental in O2 physiology and are utilized by cytosolic enzymatic systems in addition to mitochondria (Keeley and Mann, 2019). The delivery of O2 is determined by the metabolic requirements and functional status of each organ and tissue. The balance between delivery and consumption determines the O2 partial pressure (pO2), which is specific to each organ and generally much lower than that of the atmosphere (21%) (Yazdani, 2016).


Physoxia is the normal oxygen level found in healthy tissues, which ranges from about 2% to 9% (Jing et al., 2019), partially depending on the tissue of origin, and, notably, many prostate and pancreatic tumors are profoundly hypoxic (McKeown, 2014). Levels of O2 higher and lower than physoxia are defined as hyperoxia and hypoxia, respectively (Yazdani, 2016). O2 participates in many metabolic reactions and cellular processes ranging from metabolism to signaling, some of which are sensitive to O2 levels in the range between hypoxia and hyperoxia (Stuart et al., 2018; Yazdani, 2016). Rapidly proliferating tumors outgrow their surrounding vasculature, resulting in a drop from physoxic levels to hypoxic levels of less than 2% (Jing et al., 2019). It is generally accepted that oxygenation in hypoxic tumor tissues is poorer than in the respective normal tissues, at 1%–2% O2 or below. However, the oxygen concentration commonly used in the laboratory setting is hyperoxic relative to the physoxic conditions of the respective organs (Muz et al., 2015).

3.1.4.1. The impact of hyperoxia in cell culture

In the history of the earth and the development of life on it, increased O2 levels enabled the shift from the inefficient anaerobic respiration found in prokaryotes to more efficient aerobic respiration in eukaryotes. An appropriate supply of O2 to tissues is necessary for their optimal function and continued survival. However, controlling the O2 level in cell cultures remains a challenge, so experimental conditions are often neither precise nor consistent, and cells are exposed to O2 levels higher than physoxia (hyperoxia). Exposure to higher O2 levels results in changes within the cells such as altered phenotypes and gene expression levels. Stress can also be induced when the cells are isolated from their organ and kept under culture conditions at O2 levels different from physoxia (Yazdani, 2016).


One potential direct consequence of hyperoxia in cell culture that has been studied is increased cellular production of ROS and reactive nitrogen species (RNS) (Stuart et al., 2018). Cellular O2 utilization and its consequences strongly influence cellular physiology through the formation of ROS, which play functional roles in cell signaling (Keeley and Mann, 2019). Hyperoxia increases the level of ROS inside the cell and has a direct effect on both the cell cycle and cell viability. ROS can induce DNA damage, thus increasing cell cycle length because of multiple repair needs (De Bels et al., 2020). In mammalian cells, ROS/RNS are produced by a wide range of organelles and enzymes, including NADPH oxidase (NOX), nitric-oxide synthase (NOS), monoamine oxidase (MAO), xanthine oxidase/oxidoreductase (XO/XOR), lipoxygenase (LOX), cyclooxygenase (COX), heme oxygenase (HOX), and the trans-plasma membrane redox system (tPMRS) (Stuart et al., 2018). The production of ROS/RNS probably contributes to the observed effects of high O2 levels on cell senescence, differentiation, apoptosis, and various signaling pathways (Maddalena et al., 2017; Stuart et al., 2018).

Maddalena et al. (2017) investigated cellular H2O2 production under standard (18%) and physiological (5%) O2 levels and found that the rate of H2O2 production in six commonly used cell lines is much greater at 18% than at 5% O2. They suggested that the increased production of H2O2 (and other ROS) under these conditions causes the accumulation of DNA damage and cellular senescence, and contributes to the altered regulation of protein activities via redox modifications of specific amino acids. Moreover, hyperoxia can be deleterious to cancer cells by leading to apoptosis in solid tumors, cell cycle blocking in carcinoma cells, or anticancer immune-surveillance due to T cell and NK cell stimulation, mobilization of stem progenitor cells, and change in cytokine expression (De Bels et al., 2020).


3.1.4.2. The impact of hypoxia in cell culture

Hypoxia, a pathological reduction in O2 availability, is now accepted as a hallmark of disease conditions (Keeley and Mann, 2019). In mammalian systems, hypoxia affects cellular metabolism and gene expression. Rapid and reversible effects on cell signaling, contractility, ion flux, and redox state are critical for neural, cardiovascular, and pulmonary function, and serve to balance energy supply and demand in the face of reduced capacity for oxidative metabolism (Arsham et al., 2003). Hypoxia can trigger far-reaching signaling cascades via processes such as the unfolded protein response (UPR), mammalian target of rapamycin (mTOR) signaling, and hypoxia-inducible factor (HIF)-mediated gene regulation. This, in turn, can lead to reduced metabolic rates, temporary cell cycle arrest, maintenance of an undifferentiated state, and upregulated production of pro-angiogenic and pro-survival signals (Al-Ani et al., 2018).

To survive O2 deficiency, cells activate a variety of adaptive mechanisms, and among the immediate effects of hypoxia is rapid inhibition of mRNA translation (Arsham et al., 2003; Liu et al., 2006). A study by Arsham et al. (2003) showed that modest hypoxia (1.5% O2) inhibits mRNA translation via the rapid hypophosphorylation of mTOR targets, including eIF4E binding protein (4EBP1), p70S6K, and rpS6. As a result, ribosome biogenesis, via enhanced translation of "TOP" mRNAs encoding ribosomal proteins and elongation factors, is increased. Translation, especially at the initiation step, is highly regulated and exquisitely sensitive to cellular stress (Arsham et al., 2003).

Liu et al. (2006) demonstrated that hypoxia controls protein synthesis by concomitantly inhibiting multiple key translational regulators independent of hypoxia-inducible factor (HIF) activity, with the AMP-activated protein kinase (AMPK)/TSC2/Rheb pathway being an important mechanism for this HIF-independent regulation. HIF-1 or HIF-2 is implicated in the mechanisms of cell apoptosis. It is noteworthy that HIF is under-expressed during stable hyperoxia and over-expressed right after returning to physoxia in noncancer cells, confirming that transient hyperoxia is perceived as a hypoxic stimulus (De Bels et al., 2020).

3.1.5. Interaction between medium compositions and oxygen level

Only very limited research has been done on the effect of the medium’s composition on the cell’s response to the O2 level. It has been reported that fetal bovine serum has a negative effect on O2-enhanced metabolism in primary rat hepatocytes cultured on an O2-carrying matrix. Also, monolayer cultures of renal tubular epithelia were shown to be affected when the medium volume covering them was increased, resulting in a decreased supply of O2 and a shift from oxidative metabolism to increased glycolysis. In static systems where oxygenation only occurs through surface aeration, the O2 transfer rate (OTR) depends on the depth of the medium, for which the suggested optimum is 0.2 cm, equivalent to a volume of 0.2 mL/cm2 (Yazdani, 2016).

In addition, a previous study showed that the impact of oxygen tension may be associated with fluctuations in substrate concentration, including glucose. For example, the use of 5.5 mM and 25 mM glucose concentrations and different oxygen levels (2.5%, 8%, and 21%) together affected cell proliferation and the formation of ROS. Hyperoxia (> 21%) led to a 65% reduction in cell numbers, whereas the interaction of low oxygen tension (1%) and high glucose concentration (25 mM) elicited maximal exosome secretion (Salomon and Rice, 2017).

Furthermore, oxygen tension can exert a significant effect on viral propagation in vitro and possibly in vivo. For example, hypoxia can induce the growth of viruses that naturally target tissues exposed to low oxygen (Vassilaki and Frakolaki, 2017).


3.1.6. Research objectives

The overall goal of this study is to better understand how cells respond to changes in oxygen levels and culture media via gene expression profiling using RNA-seq, with a human breast cancer cell line (MCF7) and a human prostate cancer cell line (PC3) as the models. The five specific objectives are: (1) to identify genes that are affected by oxygen levels; (2) to identify genes that are affected by different culture media; (3) to identify genes that are affected by the combination of different oxygen levels and culture media; (4) to determine the biological processes, molecular functions, cellular components, and KEGG pathways in which these genes are involved; and (5) to analyze the differences between the two cell lines in response to changes in oxygen level, culture medium, and the combination of the two. Results from this study are expected to provide some guidelines for better designing cell culture-based research to minimize unintended variations caused by culture conditions and for better interpreting the results.

3.2. Methods and Materials

3.2.1. Sample preparation

Sample preparation was done by Dr. Jeff Stuart’s research group in the Department of Biological Sciences, Brock University. Both cell lines (PC3 and MCF7) were acquired from the American Type Culture Collection (Manassas, Virginia, USA) and cultured according to the distributor’s protocol in high glucose DMEM (Sigma, Burlington, ON, Canada), or in Plasmax (Voorde et al., 2019). Before harvesting, PC3 and MCF7 cell lines were cultured in a humidified 5% CO2 atmosphere at 37°C held at either 5% O2 or allowed to equilibrate with the atmosphere to about 18.5% O2 for two weeks in CO2 incubators (Thermo Fisher Scientific Canada) with O2 regulated via nitrogen flushing. Three technical replicates were obtained for each of the experimental conditions. Cells were tested with DAPI DNA fluorescence staining to check for mycoplasma contamination. Total RNA was extracted using the miRNeasy kit (Qiagen, Burlington, ON) according to the manufacturer’s instructions. A total of 8 RNA samples, each representing a mix of three technical replicates for one of the four conditions listed in Table 3.1 for the PC3 and MCF7 cell lines, were sent for sequencing by Novogene (California, USA).

Table 3.1. Applied experimental conditions for PC3 and MCF7 cell lines.

| No. | Experimental conditions |
| --- | --- |
| 1 | 5% O2 in Plasmax medium |
| 2 | 5% O2 in DMEM |
| 3 | 18% O2 in Plasmax medium |
| 4 | 18% O2 in DMEM |

3.2.2. Library preparation and sequencing

RNA samples were subject to sample tests, library preparation, and sequencing by Novogene. Quality checks (QC) were performed on the RNA samples, including preliminary quantitation with Nanodrop, assessment of RNA degradation by agarose gel electrophoresis, and an RNA integrity and quantitation check with the Agilent 2100. mRNA was enriched using oligo(dT) beads to select polyA-carrying molecules, and the Ribo-Zero kit was used to remove rRNA. RNA was subject to fragmentation before cDNA synthesis using random hexamer primers. Sequencing adaptor ligation, size selection, and PCR enrichment were performed to generate the cDNA sequencing library.

The cDNA library was subject to QC processes, consisting of a preliminary library concentration test with Qubit v2.0, an insert size test using the Agilent 2100, and Q-PCR for precise quantification of the effective library concentration. mRNA sequencing was performed on the Illumina NovaSeq 6000 S4 platform via paired-end (PE) sequencing at 150 bp x 2. Sequencing data were subject to QC for sequencing error rate and GC content, and were filtered by removing adapters, reads containing N > 10%, and reads with low-quality bases (Q score ≤ 5). The raw read files were provided to us in compressed FASTQ format, which contains sequence base-calls and per-base quality scores.
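The read-level filter described above can be illustrated with a minimal Python sketch. It is not the vendor's pipeline. The Phred+33 quality encoding and the function names are assumptions, and the text does not state what proportion of low-quality bases triggers removal, so the `max_low_q_fraction` value of 0.50 below is a hypothetical placeholder.

```python
def phred33_scores(quality_string):
    """Convert a FASTQ quality string to integer Q scores (Phred+33 assumed)."""
    return [ord(ch) - 33 for ch in quality_string]


def keep_read(sequence, quality_string, max_n_fraction=0.10,
              low_q_cutoff=5, max_low_q_fraction=0.50):
    """Return True if a read passes the N-content and low-quality filters.

    max_n_fraction and low_q_cutoff follow the text (N > 10%, Q <= 5).
    max_low_q_fraction is a hypothetical threshold, not taken from the text.
    """
    if not sequence or len(sequence) != len(quality_string):
        return False
    n_fraction = sequence.upper().count("N") / len(sequence)
    scores = phred33_scores(quality_string)
    low_q_fraction = sum(q <= low_q_cutoff for q in scores) / len(scores)
    return n_fraction <= max_n_fraction and low_q_fraction <= max_low_q_fraction


# Toy reads: a clean read, a read with 20% N, and a read with 60% low-quality bases.
print(keep_read("ACGTACGTAC", "IIIIIIIIII"))
print(keep_read("NNACGTACGT", "IIIIIIIIII"))
print(keep_read("ACGTACGTAC", "!!!!!!IIII"))
```

For paired-end data, a pair would typically be dropped if either mate fails; that choice is also an assumption here.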

3.2.3. Assessment of RNA-seq data

QC was performed using FastQC v0.11.8 (Andrews, 2010). The total number of reads in the FASTQ files was counted using Linux shell utilities to calculate the sequencing coverage depth for each sample. The total exon length of the transcripts from Homo sapiens GRCh38 from Ensembl (release 97) in “general feature format” (gff) was calculated as the length of the human reference transcriptome. The general equation for coverage calculation was based on the Lander/Waterman method as described below (Illumina, 2014):

$$C = \frac{LN}{G}$$

where C stands for coverage, G is the haploid genome/transcriptome length, L is the read length, and N is the number of reads.
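As a worked illustration of the Lander/Waterman relation (coverage equals read length times number of reads, divided by reference length), the short Python sketch below recomputes the coverage of one sample. The read count for MCF7 5D and the transcriptome length are the values reported in Table 3.3; the function name is hypothetical.

```python
def sequencing_coverage(num_reads, read_length, reference_length):
    """Lander/Waterman coverage: C = L * N / G."""
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    return read_length * num_reads / reference_length


# Values reported in Table 3.3 for sample MCF7 5D.
TRANSCRIPTOME_LENGTH = 354_866_726   # total exon length of the reference (bp)
READ_LENGTH = 150                    # bp per read
MCF7_5D_READS = 79_501_568           # number of reads

coverage_mcf7_5d = sequencing_coverage(MCF7_5D_READS, READ_LENGTH,
                                       TRANSCRIPTOME_LENGTH)
print(round(coverage_mcf7_5d))  # 34, matching the coverage listed in Table 3.3
```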

3.2.4. RNA-seq reads alignment

The reads in the FASTQ file format from all samples were mapped to the Homo sapiens reference genome, GRCh38 from Ensembl (release 97), using STAR v2.7.0a (Dobin et al., 2013). To improve alignment accuracy and identify potential spliced sequencing reads correctly, a dataset of known splice sites, Homo sapiens GRCh38 from Ensembl (release 97) in gff format, was used as the input for the mapping runs.

3.2.5. DGE analysis

DGE analysis was performed using the Cufflinks package with the recommended default settings (Trapnell et al., 2012). Cufflinks was used to assemble the transcripts from the aligned reads for each sample. The resulting assemblies were merged and integrated with the reference transcripts file using Cuffmerge, followed by the Cuffdiff utility to calculate and test the statistical significance of changes in expression between each pair of the experimental conditions listed in Table 3.2. Using the ggplot function from the R statistical computing environment, the outputs of Cuffdiff were visualized as scatter boxplots, and pairwise comparison across samples was performed using Pearson correlation and a paired t-test based on fragments per kilobase of transcript per million mapped reads (FPKM). Downstream DGE analysis was performed first by filtering the individual DEG lists from the above step based on the p-value (≤ 0.05) and log2(FC) ≥ 1 in either direction (equal to FC ≥ 2) to determine the statistically significant changes in gene expression. The lists were further filtered by requiring the FPKM value to be a minimum of 5 in at least one of the two samples in comparison. Venn diagrams were generated using a web tool from http://bioinformatics.psb.ugent.be/webtools/Venn/ to find the overlapping genes across different DEG lists.
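The three filtering criteria (p-value ≤ 0.05, at least a 2-fold change, and FPKM ≥ 5 in at least one of the two samples) can be sketched in Python as follows. This is an illustration only: the record layout, the gene names, and the values are hypothetical rather than Cuffdiff output, and taking the absolute value of log2(FC) assumes that both up- and down-regulated genes are retained.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str
    fpkm_a: float            # FPKM in the first sample of the comparison
    fpkm_b: float            # FPKM in the second sample of the comparison
    log2_fold_change: float
    p_value: float


def is_deg(result, max_p=0.05, min_abs_log2_fc=1.0, min_fpkm=5.0):
    """Apply the p-value, fold-change, and minimum-FPKM filters."""
    return (result.p_value <= max_p
            and abs(result.log2_fold_change) >= min_abs_log2_fc
            and max(result.fpkm_a, result.fpkm_b) >= min_fpkm)


# Hypothetical records, one per gene, for a single pairwise comparison.
results = [
    GeneResult("geneA", 10.0, 40.0, 2.0, 0.01),    # passes all three filters
    GeneResult("geneB", 1.0, 4.0, 2.0, 0.01),      # fails: FPKM < 5 in both samples
    GeneResult("geneC", 50.0, 60.0, 0.26, 0.001),  # fails: fold change < 2
    GeneResult("geneD", 30.0, 6.0, -2.32, 0.20),   # fails: p-value > 0.05
]

degs = [r.gene for r in results if is_deg(r)]
print(degs)  # ['geneA']
```

Overlaps between DEG lists, as shown in a Venn diagram, correspond to set intersections of such lists.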

Table 3.2. Description of the experimental condition involved in 3 DGE comparisons.

| Comparison | Experimental condition |
| --- | --- |
| Combinatory effects | 5% O2 in Plasmax vs 18% O2 in DMEM |
| | 18% O2 in Plasmax vs 5% O2 in DMEM |
| Different oxygen levels | 5% vs 18% O2 in Plasmax |
| | 5% vs 18% O2 in DMEM |
| Different culture media | 5% O2 in Plasmax vs DMEM |
| | 18% O2 in Plasmax vs DMEM |

3.2.6. Functional enrichment analysis

Functional enrichment analysis using the Database for Annotation, Visualization, and Integrated Discovery (DAVID) version 6.7 (Huang et al., 2009) was performed for the DEG lists. Homo sapiens was used as the selected species and background. Enriched GO terms and KEGG pathways were identified using the default settings of DAVID functional annotation. A raw p-value and a Benjamini-corrected p-value (≤ 0.05) were applied for identifying the most statistically significant enriched GO terms and KEGG pathways. The result was visualized using the EnrichmentMap v3.2.1 (Merico et al., 2010) plug-in for Cytoscape v3.7.2 (Paul Shannon et al., 1971) to see the connectivity between GO terms and KEGG pathways based on genes that are shared among them, by setting the p-value (0.05), the default edge cut-off (37.5% similarity) for genes that are shared between GO terms and KEGG pathways, and the false discovery rate (FDR) q-value (0.25).
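To illustrate what a Benjamini-corrected p-value means in practice, the sketch below implements the Benjamini-Hochberg adjustment in plain Python. It assumes that the "Benjamini" value reported by DAVID corresponds to this procedure; the function name and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four enriched terms.
raw_p = [0.01, 0.04, 0.03, 0.005]
adjusted_p = benjamini_hochberg(raw_p)
significant = [p <= 0.05 for p in adjusted_p]
print(adjusted_p)
print(significant)
```

A term is then retained when its adjusted value is at or below the chosen cut-off (0.05 in this chapter).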

3.2.7. Analysis of co-expressed DEGs

All combined DEGs from PC3 and MCF7 were used to run a pairwise Pearson correlation based on the FPKM values from all eight samples using an in-house Perl script. These genes were then placed into groups, in which each gene is connected to at least one other gene with a significant positive expression correlation (r ≥ 0.95). Each group with 100 or more genes was analyzed using the DAVID enrichment analysis tool to identify enriched GO terms and KEGG pathways, which were then compared to those for the up- and down-regulated DEGs from individual pairwise comparisons.
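The grouping step can be sketched in Python as below. This is not the in-house Perl script. It assumes that a "group" is a connected component of the graph whose edges link gene pairs with r ≥ 0.95, which is one reading of the description above; the gene names and FPKM vectors are hypothetical and use four values per gene instead of eight for brevity.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant profiles carry no correlation information
    return cov / math.sqrt(var_x * var_y)


def coexpression_groups(fpkm_by_gene, min_r=0.95):
    """Group genes linked, directly or transitively, by r >= min_r."""
    genes = list(fpkm_by_gene)
    parent = {g: g for g in genes}

    def find(g):
        # Union-find lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    linked = set()
    for i, g1 in enumerate(genes):
        for g2 in genes[i + 1:]:
            if pearson(fpkm_by_gene[g1], fpkm_by_gene[g2]) >= min_r:
                parent[find(g1)] = find(g2)
                linked.update((g1, g2))

    groups = {}
    for g in linked:  # genes without any partner are left out
        groups.setdefault(find(g), set()).add(g)
    return sorted((sorted(m) for m in groups.values()),
                  key=lambda m: (-len(m), m))


fpkm = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [1.1, 2.0, 3.2, 3.9],
    "g4": [4.0, 3.0, 2.0, 1.0],
    "g5": [5.0, 1.0, 4.0, 2.0],
}
print(coexpression_groups(fpkm))  # [['g1', 'g2', 'g3']]
```

The all-pairs loop is quadratic in the number of genes, which is acceptable for a few thousand DEGs but would need a vectorized implementation for larger sets.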

3.2.8. Computational analysis

All steps of RNA-seq analysis up to the generation of DGE lists were performed using Compute Canada high-performance computing facilities (https://www.computecanada.ca), while statistical analyses were performed using RStudio v1.2.1335 (https://www.rstudio.com) on desktop computers.

3.3. Results

3.3.1. Summary statistics for the RNA-seq data

RNA-seq was used to analyze changes in expression at the transcriptional level in PC3 and MCF7 cells cultured at 5% or 18% O2 and in DMEM or Plasmax. The FastQC results showed that the quality of the RNA-seq data met the recommended cut-offs for sequence quality, GC content, N content, length distribution, overrepresented sequences, and adapter content. Some issues were flagged for per base sequence content and sequence duplication levels, but these usually do not adversely affect the downstream analysis.

The sequencing coverage for the RNA-seq data was in the range of 34-52 times, with the number of reads per sample between about 80 and 122 million (Table 3.3). Table 3.4 summarizes the alignment statistics of the RNA-seq data from the STAR aligner. The unique mapping rates were very high (91.78-94.43%), while those of multi-mapping and unmapped reads were very low (2.36-2.99% and 3.16-5.15%, respectively), which indicated high-quality RNA-seq data with uniquely mapped reads exceeding 90% (Dobin and Gingeras, 2015). The multi-mapping and unmapped reads were discarded by the STAR aligner, keeping only the uniquely mapped reads for the downstream analyses.

Table 3.3. Sequencing coverage statistic of RNA-seq data.

| Sample | Number of read pairs | Number of reads | Total length* | Coverage** |
| --- | --- | --- | --- | --- |
| MCF7 5D | 39,750,784 | 79,501,568 | 11,925,235,200 | 34 |
| MCF7 5P | 53,307,181 | 106,614,362 | 15,992,154,300 | 45 |
| MCF7 18D | 45,768,533 | 89,537,270 | 13,430,590,500 | 38 |
| MCF7 18P | 61,205,716 | 91,537,066 | 13,730,559,900 | 39 |
| PC3 5D | 46,849,091 | 122,411,432 | 18,361,714,800 | 52 |
| PC3 5P | 43,746,521 | 87,493,042 | 13,123,956,300 | 37 |
| PC3 18D | 54,162,389 | 93,698,182 | 14,054,727,300 | 40 |
| PC3 18P | 44,768,635 | 108,324,778 | 16,248,716,700 | 46 |
| Average | 48,694,856 | 97,389,713 | 14,608,456,875 | 41 |

*The total number of reads times the length of a read (150 bp).
**The total length divided by the total exon length of the human reference transcriptome in Homo sapiens (354,866,726 bp).


Table 3.4. Alignment statistics for RNA-seq data.

| Sample | Number of reads | Unique mapping* (%) | Multi-mapping* (%) | Unmapped* (%) |
| --- | --- | --- | --- | --- |
| MCF7 5D | 79,501,568 | 92.94 | 2.59 | 4.40 |
| MCF7 5P | 106,614,362 | 92.86 | 2.68 | 4.38 |
| MCF7 18D | 89,537,270 | 91.78 | 2.99 | 5.15 |
| MCF7 18P | 91,537,066 | 93.04 | 2.89 | 3.98 |
| PC3 5D | 122,411,432 | 94.43 | 2.36 | 3.16 |
| PC3 5P | 87,493,042 | 93.44 | 2.55 | 3.94 |
| PC3 18D | 93,698,182 | 93.13 | 2.50 | 4.31 |
| PC3 18P | 108,324,778 | 92.71 | 2.96 | 4.27 |

Abbreviations: PC3, human prostate cancer cell line; MCF7, breast cancer cell line; 5P, 5% O2 in Plasmax medium; 5D, 5% O2 in DMEM; 18P, 18% O2 in Plasmax medium; 18D, 18% O2 in DMEM.
*Alignment was performed against the Homo sapiens GRCh38 genome from Ensembl (release 97) using STAR.

3.3.2. Overview of the gene expression profile in response to oxygen level and medium changes

We started the gene expression analysis by first examining and comparing the overall expression profiles across samples and cell lines. As shown in Figure 3.1, the overall distribution of gene expression levels in FPKM is broadly similar among all samples for both cell lines. The majority of the genes show an expression level in log10(FPKM) between -2.5 (FPKM = 0.0032-0.0035) and 2.5 (FPKM = 280-310), with mean values also being similar. This pattern matches what is observed in most gene expression profiling studies: the number of differentially expressed genes is ~10% of all genes, as can be seen in volcano plots (Appendix Figure 3.1 and 3.2). To get a more precise measure of the degree of expression profile similarity, we performed pairwise Pearson correlation analysis based on the FPKM values from Cufflinks’ output. As shown in Table 3.5, as a general trend, a stronger correlation is seen among samples of the same cell line than across cell lines, with r values >0.86 for PC3 samples and >0.90 for MCF7 samples (excluding MCF7 5D), and with r values across cell lines all below 0.8, mostly between 0.62 and 0.77, indicating that the overall gene expression profile is mostly a characteristic of the cell type. MCF7 at 5% O2 in DMEM was an exception to this pattern, as it showed a much lower correlation with all other samples, including the other 3 MCF7 samples (r values ranging from 0.44 to 0.46) and the PC3 samples (r values ranging from 0.33 to 0.36), as shown by the values highlighted in yellow in Table 3.5, indicating a hypoxic gene expression signature in MCF7 cells cultured in DMEM.

Figure 3.1. Distribution of gene expression values in MCF7 (left) and PC3 (right) under different culture conditions. Scatter box plots showing the distribution of gene expression values in log10(FPKM). MCF7, breast cancer cell line; PC3, prostate cancer cell line; 5P, 5% O2 in Plasmax medium; 5D, 5% O2 in DMEM; 18P, 18% O2 in Plasmax medium; 18D, 18% O2 in DMEM.


Table 3.5. Pairwise comparison of gene expression profile among all samples based on Pearson correlation*.

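| | PC3 5D | PC3 5P | PC3 18D | PC3 18P | MCF7 5D | MCF7 5P | MCF7 18D | MCF7 18P |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PC3 5D | 1 | 0.9621234 | 0.9242322 | 0.8789873 | 0.3447927 | 0.6808806 | 0.6789881 | 0.634247 |
| PC3 5P | | 1 | 0.9480266 | 0.9305387 | 0.3619882 | 0.7626288 | 0.7447194 | 0.6751694 |
| PC3 18D | | | 1 | 0.864688 | 0.3326478 | 0.703074 | 0.7091266 | 0.6275336 |
| PC3 18P | | | | 1 | 0.3363413 | 0.7701498 | 0.7110999 | 0.6788219 |
| MCF7 5D | | | | | 1 | 0.443164 | 0.4590847 | 0.4452072 |
| MCF7 5P | | | | | | 1 | 0.9377025 | 0.9022873 |
| MCF7 18D | | | | | | | 1 | 0.9027213 |
| MCF7 18P | | | | | | | | 1 |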

*, Pairwise Pearson coefficient values between each of all possible pairs of samples based on gene expression values in FPKM. Coefficient values above 0.9 are labeled in red font, while those below 0.5 are highlighted in yellow background.

To further analyze the impact of oxygen level and culture medium on gene expression, we performed differential gene expression analysis and collected the number of DEGs for comparisons within each of the two cell lines independently. Specifically, DEGs were identified as genes showing significantly higher expression in either of the compared samples based on p-value ≤ 0.05, FC ≥ 2, and FPKM ≥ 5 in at least one of the two samples in the comparison. As shown in Figure 3.2 (with more details provided as volcano plots in Appendix Figure 3.1 and 3.2), the total number of DEGs is highly variable across the comparisons, ranging from 163 in PC3 for 18D vs. 5D to 619 in PC3 for 18D vs. 18P. Some interesting observations were made based on the number of DEGs as a measure of the gene expression profile differences among these samples (Table 3.2).
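
The three DEG criteria stated above (p-value ≤ 0.05, FC ≥ 2, and FPKM ≥ 5 in at least one sample) can be expressed as a small filter. This is a sketch under stated assumptions, not the pipeline used in the study: the record type, the function names, and the toy values are hypothetical, and treating a gene with FPKM = 0 in one condition as having an infinite fold change is our assumption.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str
    fpkm_cond1: float
    fpkm_cond2: float
    p_value: float


def classify_deg(g, p_max=0.05, fc_min=2.0, fpkm_min=5.0):
    """Return 'cond1' or 'cond2' (the side with higher expression), or None."""
    if g.p_value > p_max:
        return None
    high = max(g.fpkm_cond1, g.fpkm_cond2)
    low = min(g.fpkm_cond1, g.fpkm_cond2)
    if high < fpkm_min:  # FPKM >= 5 is required in at least one sample
        return None
    # Assumption: FPKM = 0 in one condition counts as infinite fold change.
    fold_change = float("inf") if low == 0 else high / low
    if fold_change < fc_min:
        return None
    return "cond1" if g.fpkm_cond1 > g.fpkm_cond2 else "cond2"


def count_degs(results):
    """Count DEGs per direction, mirroring the Cond1/Cond2/Total layout."""
    counts = {"cond1": 0, "cond2": 0}
    for g in results:
        label = classify_deg(g)
        if label is not None:
            counts[label] += 1
    counts["total"] = counts["cond1"] + counts["cond2"]
    return counts


# Hypothetical records for illustration only.
toy_results = [
    GeneResult("geneA", 20.0, 5.0, 0.01),   # DEG, higher in cond1
    GeneResult("geneB", 3.0, 12.0, 0.03),   # DEG, higher in cond2
    GeneResult("geneC", 4.0, 1.0, 0.001),   # fails the FPKM threshold
    GeneResult("geneD", 50.0, 30.0, 0.001), # fails the fold-change threshold
    GeneResult("geneE", 8.0, 0.0, 0.2),     # fails the p-value threshold
]
print(count_degs(toy_results))
```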

Overall, PC3 showed a lower response to condition changes than MCF7, with the total number of DEGs across all 6 pairwise comparisons being 2,576 for PC3 vs. 3,156 for MCF7 (Table 3.6). More specifically, PC3 had fewer DEGs than MCF7 for 18D vs. 5D, 5D vs. 5P, and 18D vs. 5P, but more DEGs than MCF7 for 18P vs. 5P, 18D vs. 18P, and 5D vs. 18P (Figure 3.2, Table 3.6). Interestingly, the largest number of DEGs for both cell lines was observed in response to medium change at 18% oxygen (18D vs. 18P), being 619 and 588 for PC3 and MCF7, respectively, while that at 5% oxygen (5D vs. 5P) was 254 and 510 for PC3 and MCF7, respectively. This result indicates that both cell lines showed their largest response to medium change at 18% O2 and a lower response to medium change at 5% O2, with MCF7 showing a much higher response than PC3 at 5% O2 (510 vs. 254) and a comparable response at 18% O2 (588 vs. 619).

In response to oxygen level change, PC3 showed the least response in DMEM (18D vs. 5D), with 163 DEGs, the smallest number among all comparisons, while the number in Plasmax (18P vs. 5P) was 587, the second highest for PC3. In comparison, for MCF7, the number of DEGs for oxygen level change in DMEM (18D vs. 5D) and Plasmax (18P vs. 5P) was 563 and 471, respectively, with the latter being close to the lowest for MCF7 (458). These results indicate that PC3 has the least response to oxygen level change in DMEM (18D vs. 5D) but a much higher response in Plasmax (18P vs. 5P), while MCF7 has a lower response to oxygen level change in Plasmax (18P vs. 5P) than in DMEM (18D vs. 5D), a pattern opposite to that of PC3. The differences between the two cell lines are also seen in the distribution of DEGs between the two compared conditions (Figure 3.2). For example, opposite patterns between the two cell lines are seen for 18P vs. 5P, 18D vs. 18P, 18D vs. 5P, and 5D vs. 18P. Even in the remaining two comparisons (18D vs. 5D and 5D vs. 5P), where the pattern is the same, the ratio of DEGs between the two compared conditions is quite different between the two cell lines.

In summary, by the number of DEGs, the two cell lines responded differently to oxygen level and culture medium change, with PC3 showing an overall lower response than MCF7. For the response to oxygen level, PC3 showed the least response in DMEM but a higher response in Plasmax, while MCF7 showed more response in DMEM than in Plasmax. For the response to medium change, both cell lines showed the largest response at 18% oxygen and a smaller one at 5% oxygen, with MCF7 responding more than PC3 at 5% oxygen and at a comparable level at 18% oxygen. Combined changes in oxygen level and medium did not seem to have a cumulative positive or negative effect on the degree of response in either cell line, as they did not consistently yield the largest or smallest number of DEGs among all condition changes (Table 3.6). For this reason, we focused our further analyses on the independent effects of changes in oxygen level and medium.

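Table 3.6. Number of differentially expressed genes (DEGs) for pairwise comparisons

| Comparison | Cell line | Cond1 | Cond2 | Total |
|---|---|---|---|---|
| 18D vs 5D | PC3 | 78 | 85 | 163 |
| 18D vs 5D | MCF7 | 184 | 379 | 563 |
| 18P vs 5P | PC3 | 267 | 320 | 587 |
| 18P vs 5P | MCF7 | 345 | 126 | 471 |
| 5D vs 5P | PC3 | 180 | 74 | 254 |
| 5D vs 5P | MCF7 | 269 | 241 | 510 |
| 18D vs 18P | PC3 | 375 | 244 | 619 |
| 18D vs 18P | MCF7 | 146 | 442 | 588 |
| 18D vs 5P | PC3 | 267 | 121 | 388 |
| 18D vs 5P | MCF7 | 217 | 349 | 566 |
| 5D vs 18P | PC3 | 331 | 234 | 565 |
| 5D vs 18P | MCF7 | 86 | 372 | 458 |
| Total | PC3 | 1498 | 1078 | 2576 |
| Total | MCF7 | 1247 | 1909 | 3156 |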

Figure 3.2. The number of differentially expressed genes (DEGs) for pairwise comparisons among different culture conditions for PC3 and MCF7. Bar plots showing the number of DEGs between conditions, defined as genes showing higher expression on either side of the comparison based on p-value (≤ 0.05), fold change (≥ 2), and FPKM (≥ 5 in at least one of the two samples). Cond1, first condition in a pairwise comparison; Cond2, second condition in the comparison. MCF7, breast cancer cell line; PC3, prostate cancer cell line; 5P, 5% O2 in Plasmax medium; 5D, 5% O2 in DMEM; 18P, 18% O2 in Plasmax medium; 18D, 18% O2 in DMEM.

3.3.3. DGE in response to oxygen level changes in MCF7

In response to oxygen level change in MCF7 in DMEM (18D vs. 5D), a total of 563 DEGs were identified, among which 184 genes showed higher expression at 18% oxygen, while 379 genes showed higher expression at 5% oxygen (Table 3.6 and Figure 3.2). In comparison, in Plasmax (18P vs. 5P), a total of 471 DEGs were identified, among which 345 genes showed higher expression at 18% oxygen, while 126 genes showed higher expression at 5% oxygen. Therefore, the total number of DEGs in response to oxygen level change differed between the two media, and the distribution of DEGs between the two oxygen levels was opposite in the two media (Table 3.6 and Figure 3.2).

To understand the biology represented by these gene expression changes, we performed enrichment analysis for GO terms and KEGG pathways for all DEGs and for the DEGs showing higher expression in each condition. The 563 DEGs in response to O2 level change in MCF7 in DMEM showed enrichment for GO terms involving 42 biological processes, 23 cellular components, and 9 molecular functions, as well as 16 KEGG pathways (p-value and Benjamini-corrected p-value ≤ 0.05) (Table 3.7, Appendix Table 3.1). Among these, the main themes include processes promoting the cell cycle (e.g., G1/S and G2/M transition of mitotic cell cycle and cell division), DNA replication, and DNA damage repair (e.g., base excision repair (BER), mismatch repair (MMR), and homologous recombination (HR)) (Table 3.7). As seen in Appendix Figure 3.3, the enriched GO terms and KEGG pathways associated with cell cycle, DNA replication, and DNA damage repair were mostly connected to one another by sharing a minimum of 37.5% of genes between them, with most of their genes also functioning in protein binding and localized in the nucleus, nucleoplasm, cytosol, and cytoplasm. Notably, most of the associated genes showed higher expression at 5% O2. Those with higher expression at 18% O2 were enriched for the p53 signaling pathway, the FoxO signaling pathway, and viral infections. The GO term “cellular response to hypoxia” was also enriched in MCF7, involving 7 genes up-regulated at 5% oxygen and 5 genes up-regulated at 18% oxygen (Table 3.7).
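
The two significance criteria used for the enrichment results (p-value and Benjamini-corrected p-value ≤ 0.05) can be illustrated with a generic sketch. The statistical test applied by the enrichment tool is not restated in this section, so the one-sided hypergeometric test below is an assumption chosen because it is a common choice for over-representation analysis; the term names, gene counts, and background size are invented for illustration.

```python
from math import comb


def hypergeom_enrichment_p(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes among n DEGs
    when K of the N background genes carry the annotation."""
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical example: 563 DEGs against a background of 20,000 genes.
# term -> (DEGs annotated with the term, background genes annotated)
toy_terms = {"term_A": (38, 250), "term_B": (12, 300), "term_C": (5, 400)}
names = list(toy_terms)
raw_p = [hypergeom_enrichment_p(toy_terms[t][0], 563, toy_terms[t][1], 20000)
         for t in names]
adj_p = benjamini_hochberg(raw_p)
for name, p, q in zip(names, raw_p, adj_p):
    status = "enriched" if p <= 0.05 and q <= 0.05 else "not enriched"
    print(name, f"p={p:.3g}", f"adjusted={q:.3g}", status)
```

A term passing only the raw p-value threshold corresponds to the entries marked with a single asterisk in the enrichment tables.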

Table 3.7. Enriched GO terms and KEGG pathways among the DEGs in MCF7 between 5% and 18% O2 in DMEM.

| Biological Process | All DEGs | 5D | 18D |
|---|---|---|---|
| GO:0000070~mitotic sister chromatid segregation | 10 | 10 | - |
| GO:0000079~regulation of cyclin-dependent protein serine/threonine kinase activity | 10 | 7 | 3* |
| GO:0000082~G1/S transition of mitotic cell cycle | 28 | 27 | - |
| GO:0000083~regulation of transcription involved in G1/S transition of mitotic cell cycle | 9 | 9 | - |
| GO:0000086~G2/M transition of mitotic cell cycle | 23 | 21 | - |
| GO:0000281~mitotic cytokinesis | 7 | 7 | - |
| GO:0000722~telomere maintenance via recombination | 8 | 8 | - |
| GO:0000731~DNA synthesis involved in DNA repair | 8 | 8 | - |
| GO:0000732~strand displacement | 6 | 6 | - |
| GO:0006096~glycolytic process | 8 | 8 | - |
| GO:0006260~DNA replication | 38 | 38 | - |
| GO:0006268~DNA unwinding involved in DNA replication | 5 | 5 | - |
| GO:0006270~DNA replication initiation | 13 | 13 | - |
| GO:0006271~DNA strand elongation involved in DNA replication | 5 | 5 | - |
| GO:0006281~DNA repair | 32 | 27 | - |
| GO:0006297~nucleotide-excision repair, DNA gap filling | 6 | 6 | - |
| GO:0006974~cellular response to DNA damage stimulus | 20 | 13 | 7* |
| GO:0006977~DNA damage response, signal transduction by p53 class mediator resulting in cell cycle arrest | 12 | 8 | 4* |
| GO:0007049~cell cycle | 21 | 17 | - |
| GO:0007051~spindle organization | 7 | 7 | - |
| GO:0007052~mitotic spindle organization | 7 | 7 | - |
| GO:0007059~chromosome segregation | 14 | 13 | - |
| GO:0007062~sister chromatid cohesion | 21 | 21 | - |
| GO:0007067~mitotic nuclear division | 47 | 44 | - |
| GO:0007080~mitotic metaphase plate congression | 8 | 8 | - |
| GO:0007093~mitotic cell cycle checkpoint | 7 | 7 | - |
| GO:0009612~response to mechanical stimulus | 9 | - | 7 |
| GO:0009636~response to toxic substance | 10 | - | 6* |
| GO:0015949~nucleobase-containing small molecule interconversion | 6 | 5 | - |
| GO:0031145~anaphase-promoting complex-dependent catabolic process | 11 | 11 | - |
| GO:0032508~DNA duplex unwinding | 8 | 7 | - |
| GO:0034501~protein localization to kinetochore | 5 | 5 | - |
| GO:0042493~response to drug | 23 | 16 | 5** |
| GO:0051301~cell division | 60 | 56 | - |
| GO:0051439~regulation of ubiquitin-protein ligase activity involved in mitotic cell cycle | 7 | 7 | - |
| GO:0051726~regulation of cell cycle | 15 | 11 | 4* |
| GO:0061621~canonical glycolysis | 8 | 8 | - |
| GO:0071456~cellular response to hypoxia | 12 | 7* | 5* |
| GO:0090307~mitotic spindle assembly | 7 | 7 | - |
| GO:0097193~intrinsic apoptotic signaling pathway | 7 | - | 4* |
| GO:1901796~regulation of signal transduction by p53 class mediator | 13 | 13 | - |

| Cellular Component | All DEGs | 5D | 18D |
|---|---|---|---|
| GO:0000775~chromosome, centromeric region | 12 | 12 | - |
| GO:0000776~kinetochore | 16 | 15 | - |
| GO:0000777~condensed chromosome kinetochore | 16 | 16 | - |
| GO:0000784~nuclear chromosome, telomeric region | 12 | 11 | - |
| GO:0000790~nuclear chromatin | 15 | 11* | - |
| GO:0000793~condensed chromosome | 6 | 6 | - |
| GO:0000922~spindle pole | 17 | 17 | - |
| GO:0000942~condensed nuclear chromosome outer kinetochore | 4 | 4 | - |
| GO:0005634~nucleus | 245 | 184 | 61* |
| GO:0005654~nucleoplasm | 162 | 140 | - |
| GO:0005657~replication fork | 5 | 5 | - |
| GO:0005737~cytoplasm | 207 | 151 | 56* |
| GO:0005813~centrosome | 30 | 28 | - |
| GO:0005819~spindle | 17 | 16 | - |
| GO:0005829~cytosol | 158 | 126 | - |
| GO:0005874~microtubule | 25 | 21 | - |
| GO:0005876~spindle microtubule | 9 | 7 | |
| GO:0005971~ribonucleoside-diphosphate reductase complex | 3 | 2 | - |
| GO:0030496~midbody | 18 | 16 | - |
| GO:0032133~chromosome passenger complex | 4 | 4 | - |
| GO:0042555~MCM complex | 7 | 7 | - |
| GO:0051233~spindle midzone | 7 | 7 | - |
| GO:0072686~mitotic spindle | 7 | 7 | - |

| Molecular Function | All DEGs | 5D | 18D |
|---|---|---|---|
| GO:0000405~bubble DNA binding | 4 | 4 | - |
| GO:0003677~DNA binding | 72 | 51* | - |
| GO:0003678~DNA helicase activity | 7 | 6* | - |
| GO:0003682~chromatin binding | 28 | 23 | - |
| GO:0003688~DNA replication origin binding | 5 | 5 | - |
| GO:0003697~single-stranded DNA binding | 12 | 12 | - |
| GO:0005515~protein binding | 339 | 241 | 98* |
| GO:0005524~ATP binding | 84 | 72 | - |
| GO:0042802~identical protein binding | 40 | 26* | 14* |

| KEGG Pathway | All DEGs | 5D | 18D |
|---|---|---|---|
| hsa00010:Glycolysis / Gluconeogenesis | 9 | 9 | - |
| hsa00240:Pyrimidine metabolism | 11 | 10 | - |
| hsa01230:Biosynthesis of amino acids | 10 | 10 | - |
| hsa03030:DNA replication | 13 | 13 | - |
| hsa03410:Base excision repair | 10 | 10 | - |
| hsa03430:Mismatch repair | 6 | 6 | - |
| hsa03440:Homologous recombination | 6 | 6 | - |
| hsa03460:Fanconi anemia pathway | 10 | 10 | - |
| hsa04068:FoxO signaling pathway | 13 | 8* | 5* |
| hsa04110:Cell cycle | 35 | 32 | - |
| hsa04114:Oocyte meiosis | 17 | 15 | - |
| hsa04115:p53 signaling pathway | 22 | 8 | 14 |
| hsa04914:Progesterone-mediated oocyte maturation | 12 | 11 | - |
| hsa05161:Hepatitis B | 15 | - | 8* |
| hsa05166:HTLV-I infection | 25 | 17 | 8* |
| hsa05219:Bladder cancer | 7 | 5* | - |

Abbreviations: 5D, 5% O2 in DMEM; 18D, 18% O2 in DMEM. Significantly enriched GO terms and KEGG pathways were determined by p-value and Benjamini-corrected p-value ≤ 0.05 unless indicated otherwise; *, significantly enriched based on p-value (≤ 0.05) only; **, not significantly enriched; (-), no result.

For MCF7’s response to oxygen level change in Plasmax, the 471 DEGs showed enrichment for 8 biological processes, 15 cellular components, 9 molecular functions, and 2 KEGG pathways (Table 3.8, Appendix Table 3.2), fewer than in DMEM. Table 3.8 shows that the majority of genes associated with the enriched GO terms and KEGG pathways had higher expression at 18% O2 in Plasmax, which is opposite to the oxygen effect in DMEM. Cell cycle (e.g., negative regulation of cell proliferation and G1/S transition of mitotic cell cycle) and DNA replication initiation were the most affected by this condition and were strongly associated with one another, with most of their genes expressed higher at 18% O2 and localized in the nucleus and nucleoplasm (Appendix Figure 3.4). In addition, genes functioning in protein binding were strongly involved in most of the GO terms and KEGG pathways related to DNA replication and cell cycle (Appendix Figure 3.4).

Table 3.8. Enriched GO terms and KEGG pathways among the DEGs in MCF7 at 5% and 18% O2 in Plasmax.

| Biological Process | All DEGs | 5P | 18P |
|---|---|---|---|
| GO:0000082~G1/S transition of mitotic cell cycle | 21 | - | 19 |
| GO:0006260~DNA replication | 22 | - | 22 |
| GO:0006270~DNA replication initiation | 11 | - | 11 |
| GO:0000083~regulation of transcription involved in G1/S transition of mitotic cell cycle | 8 | - | 8 |
| GO:0045429~positive regulation of nitric oxide biosynthetic process | 10 | - | 9 |
| GO:0071353~cellular response to interleukin-4 | 8 | - | 8 |
| GO:0008285~negative regulation of cell proliferation | 25 | - | 21 |
| GO:0043524~negative regulation of neuron apoptotic process | 13 | 5* | 8* |

| Cellular Component | All DEGs | 5P | 18P |
|---|---|---|---|
| GO:0005737~cytoplasm | 174 | 43* | 131 |
| GO:0070062~extracellular exosome | 107 | 23** | 84 |
| GO:0005654~nucleoplasm | 105 | 27* | 78 |
| GO:0005634~nucleus | 170 | 44* | 126 |
| GO:0005829~cytosol | 115 | - | 91 |
| GO:0005615~extracellular space | 57 | - | 51 |
| GO:0005925~focal adhesion | 25 | - | 23 |
| GO:0042555~MCM complex | 5 | - | 5 |
| GO:0015629~actin cytoskeleton | 17 | - | 16 |
| GO:0016020~membrane | 78 | - | 62 |
| GO:0001726~ruffle | 10 | - | 8 |
| GO:0031012~extracellular matrix | 18 | - | 18 |
| GO:0005884~actin filament | 8 | - | 6* |
| GO:0000784~nuclear chromosome, telomeric region | 11 | - | 8* |
| GO:0030141~secretory granule | 8 | - | 6* |

| Molecular Function | All DEGs | 5P | 18P |
|---|---|---|---|
| GO:0005515~protein binding | 275 | 68* | 207 |
| GO:0005524~ATP binding | 64 | 20* | 44* |
| GO:0001786~phosphatidylserine binding | 8 | - | 8 |
| GO:0001077~transcriptional activator activity, RNA polymerase II core promoter proximal region sequence-specific binding | 18 | 6* | 12* |
| GO:0042803~protein homodimerization activity | 36 | 9** | 27* |
| GO:0003678~DNA helicase activity | 6 | - | 5* |
| GO:0008134~transcription factor binding | 19 | 5** | 14* |
| GO:0005200~structural constituent of cytoskeleton | 11 | - | 9* |
| GO:0005544~calcium-dependent phospholipid binding | 8 | - | 8 |

| KEGG Pathway | All DEGs | 5P | 18P |
|---|---|---|---|
| hsa04110:Cell cycle | 19 | - | 16 |
| hsa03030:DNA replication | 10 | - | 10 |

Abbreviations: 5P, 5% O2 in Plasmax; 18P, 18% O2 in Plasmax. The most significantly enriched GO terms and KEGG pathways were determined by p-value and Benjamini-corrected p-value ≤ 0.05 unless indicated otherwise; *, significantly enriched based on p-value (≤ 0.05); **, not significantly enriched; (-), no result.

3.3.4. DGE in response to oxygen level changes in PC3

In response to oxygen level change in PC3 growing in DMEM (18D vs. 5D), a total of 163 DEGs were identified, among which 78 genes showed higher expression at 18% O2, while 85 genes showed higher expression at 5% O2 (Table 3.6 and Figure 3.2). In comparison, in PC3 cells growing in Plasmax (18P vs. 5P), a total of 587 DEGs were identified, among which 267 genes showed higher expression at 18% O2, while 320 genes showed higher expression at 5% O2. The total number of DEGs was much higher in Plasmax than in DMEM, but in both media more DEGs showed higher expression at 5% O2 (Table 3.6 and Figure 3.2).

Among all DEGs in DMEM, the GO terms “extracellular exosome” and “extracellular space” were enriched (p-value and Benjamini-corrected p-value ≤ 0.05), with genes in extracellular space showing higher expression only at 5% O2, whereas genes in extracellular exosome showed higher expression at both 5% O2 (18 genes) and 18% O2 (21 genes) (Table 3.9, Appendix Table 3.3). There is no gene shared between these GO terms. Among the 587 DEGs in response to oxygen level change in Plasmax, 39 biological processes, 32 cellular components, 10 molecular functions, and 4 KEGG pathways were significantly enriched (Table 3.10, Appendix Table 3.4). More genes associated with the enriched GO terms and KEGG pathways showed higher expression at 5% O2 than at 18% O2 (Table 3.10), at a level similar to that of MCF7 in DMEM by the number of DEGs (Table 3.6). The most affected GO terms and KEGG pathways at 5% O2 in Plasmax are mostly related to cell cycle regulation (e.g., cell division, mitotic nuclear division, cell proliferation, and oocyte meiosis) and the p53 signaling pathway. A high number of shared genes is seen between cell cycle and each of regulation of ubiquitin-protein ligase activity involved in mitotic cell cycle, oocyte meiosis, and negative regulation of ubiquitin-protein ligase activity involved in mitotic cell cycle (Appendix Figure 3.5). Eighteen GO terms showed no overlapping genes with other GO terms and KEGG pathways, suggesting that they involve gene sets distinct from those of the other enriched terms and pathways. In comparison, genes showing higher expression at 18% O2 in Plasmax are associated with viral carcinogenesis (e.g., response to virus, viral genome replication, and defense response to virus), RNA polymerase II promoter activity (e.g., positive regulation of transcription from RNA polymerase II promoter in response to endoplasmic reticulum stress and negative regulation of transcription from RNA polymerase II promoter), and apoptosis (e.g., negative regulation of apoptotic process and intrinsic apoptotic signaling pathway in response to endoplasmic reticulum stress) (Table 3.10, Appendix Figure 3.5). The GO term “cellular response to hypoxia” was also enriched, with 10 genes showing higher expression at 5% O2 and 6 genes showing higher expression at 18% O2, whereas for “response to hypoxia” 11 genes showed higher expression, all at 18% O2. These findings show that PC3 is highly impacted by oxygen level change in Plasmax, in a pattern opposite to MCF7, which showed a higher response to oxygen level change in DMEM. Genes associated with “cellular response to hypoxia” showed DGE in both media at 5% O2 as expected.

Table 3.9. Enriched GO terms and KEGG pathways among DEGs in PC3 between 5% and 18% O2 in DMEM.

| Cellular Component | All DEGs | 5D | 18D |
|---|---|---|---|
| GO:0070062~extracellular exosome | 39 | 18* | 21* |
| GO:0005615~extracellular space | 23 | 15* | - |

Abbreviations: 5D, 5% O2 in DMEM; 18D, 18% O2 in DMEM. The most significantly enriched GO terms and KEGG pathways were determined by p-value and Benjamini-corrected p-value ≤ 0.05 unless indicated otherwise; *, significantly enriched based on p-value (≤ 0.05); **, not significantly enriched; (-), no result.

Table 3.10. Enriched GO terms and KEGG pathways among the DEGs in PC3 at 5% and 18% O2 in Plasmax.

| Biological Process | All DEGs | 5P | 18P |
|---|---|---|---|
| GO:0051301~cell division | 42 | 40 | - |
| GO:0007062~sister chromatid cohesion | 21 | 20 | - |
| GO:0007067~mitotic nuclear division | 29 | 29 | - |
| GO:0000086~G2/M transition of mitotic cell cycle | 21 | 18 | - |
| GO:0000082~G1/S transition of mitotic cell cycle | 17 | 15 | - |
| GO:0008283~cell proliferation | 33 | 25 | - |
| GO:0071456~cellular response to hypoxia | 16 | 10* | 6* |
| GO:0042493~response to drug | 29 | 12* | 17* |
| GO:0060337~type I interferon signaling pathway | 13 | 13* | - |
| GO:0043066~negative regulation of apoptotic process | 36 | 18* | 18* |
| GO:0006260~DNA replication | 19 | 19* | - |
| GO:0070059~intrinsic apoptotic signaling pathway in response to endoplasmic reticulum stress | 9 | - | 8* |
| GO:0000070~mitotic sister chromatid segregation | 8 | 8 | - |
| GO:0045766~positive regulation of angiogenesis | 15 | - | 10* |
| GO:0006977~DNA damage response, signal transduction by p53 class mediator resulting in cell cycle arrest | 11 | 9 | - |
| GO:1990440~positive regulation of transcription from RNA polymerase II promoter in response to endoplasmic reticulum stress | 6 | - | 6* |
| GO:0043627~response to estrogen | 11 | 4** | 7* |
| GO:0007052~mitotic spindle organization | 8 | 8 | - |
| GO:0034097~response to cytokine | 10 | 4** | 6* |
| GO:0007059~chromosome segregation | 11 | 11 | - |
| GO:0000122~negative regulation of transcription from RNA polymerase II promoter | 43 | - | 20* |
| GO:0032355~response to estradiol | 12 | - | 8* |
| GO:0031145~anaphase-promoting complex-dependent catabolic process | 11 | 9 | - |
| GO:0045071~negative regulation of viral genome replication | 8 | - | 7* |
| GO:0000278~mitotic cell cycle | 8 | 6 | - |
| GO:0016925~protein sumoylation | 13 | 9* | - |
| GO:0009636~response to toxic substance | 11 | - | 7* |
| GO:0001666~response to hypoxia | 16 | - | 11 |
| GO:0051436~negative regulation of ubiquitin-protein ligase activity involved in mitotic cell cycle | 10 | 8 | - |
| GO:0070301~cellular response to hydrogen peroxide | 9 | 8 | - |
| GO:0043434~response to peptide hormone | 8 | - | 6* |
| GO:0007093~mitotic cell cycle checkpoint | 7 | 6 | - |
| GO:0009612~response to mechanical stimulus | 9 | 4** | 5* |
| GO:0002931~response to ischemia | 7 | 4* | 3** |
| GO:0010595~positive regulation of endothelial cell migration | 8 | - | 5* |
| GO:0000083~regulation of transcription involved in G1/S transition of mitotic cell cycle | 6 | 6 | - |
| GO:0051439~regulation of ubiquitin-protein ligase activity involved in mitotic cell cycle | 6 | 6 | - |
| GO:0009615~response to virus | 12 | - | 11* |
| GO:0000910~cytokinesis | 8 | 7 | - |

| Cellular Component | All DEGs | 5P | 18P |
|---|---|---|---|
| GO:0005654~nucleoplasm | 152 | 102 | 50* |
| GO:0005737~cytoplasm | 227 | 139 | 88* |
| GO:0005829~cytosol | 161 | 105 | 56* |
| GO:0016020~membrane | 116 | 79 | - |
| GO:0030496~midbody | 20 | 20 | - |
| GO:0000776~kinetochore | 15 | 15 | - |
| GO:0070062~extracellular exosome | 130 | 61* | 69* |
| GO:0000777~condensed chromosome kinetochore | 15 | 15 | - |
| GO:0005819~spindle | 17 | 16 | - |
| GO:0005634~nucleus | 213 | 121 | 92* |
| GO:0016363~nuclear matrix | 14 | 8 | 6* |
| GO:0051233~spindle midzone | 7 | 7 | - |
| GO:0000775~chromosome, centromeric region | 10 | 10 | - |
| GO:0032154~cleavage furrow | 9 | 8 | - |
| GO:0005657~replication fork | 6 | 5 | - |
| GO:0000942~condensed nuclear chromosome outer kinetochore | 4 | 4 | - |
| GO:0005615~extracellular space | 65 | - | 41* |
| GO:0000796~condensin complex | 4 | 4 | - |
| GO:0005643~nuclear pore | 10 | 8 | - |
| GO:0005876~spindle microtubule | 8 | 8 | - |
| GO:0005813~centrosome | 27 | 24 | - |
| GO:0000786~nucleosome | 11 | - | 11* |
| GO:0005925~focal adhesion | 25 | 18 | - |
| GO:0000779~condensed chromosome, centromeric region | 4 | 3 | - |
| GO:0015629~actin cytoskeleton | 17 | 12 | - |
| GO:0031965~nuclear membrane | 17 | 12 | - |
| GO:0030141~secretory granule | 9 | - | 9* |
| GO:0000788~nuclear nucleosome | 7 | - | 7* |
| GO:0042555~MCM complex | 4 | 4 | - |
| GO:0043234~protein complex | 24 | 15* | - |
| GO:0005635~nuclear envelope | 13 | 12 | - |
| GO:0034399~nuclear periphery | 5 | 4 | - |

| Molecular Function | All DEGs | 5P | 18P |
|---|---|---|---|
| GO:0003682~chromatin binding | 26 | - | 15* |
| GO:0000982~transcription factor activity, RNA polymerase II core promoter proximal region sequence-specific binding | 6 | - | 4* |
| GO:0005515~protein binding | 333 | 194 | 139* |

| KEGG Pathway | All DEGs | 5P | 18P |
|---|---|---|---|
| hsa04110:Cell cycle | 25 | 22 | - |
| hsa05203:Viral carcinogenesis | 21 | - | 13* |
| hsa04115:p53 signaling pathway | 11 | 7* | 4** |
| hsa04114:Oocyte meiosis | 13 | 12 | - |

Abbreviations: 5P, 5% O2 in Plasmax; 18P, 18% O2 in Plasmax. The most significantly enriched GO terms and KEGG pathways were determined by p-value and Benjamini-corrected p-value ≤ 0.05 unless indicated otherwise; *, significantly enriched based on p-value (≤ 0.05); **, not significantly enriched; (-), no result.

3.3.5. DGE in response to culture medium changes in MCF7

In response to culture medium change in MCF7 cells growing at 5% O2 (5D vs. 5P), a total of 510 DEGs were identified, among which 269 genes showed higher expression in DMEM, while 241 genes showed higher expression in Plasmax (Table 3.6 and Figure 3.2). In comparison, for MCF7 cells growing at 18% O2 (18D vs. 18P), a total of 588 DEGs were identified, the largest list of DEGs for MCF7 in this study. Among these, 146 genes showed higher expression in DMEM, while 442 genes showed higher expression in Plasmax. The total number of DEGs was higher at 18% O2 than at 5% O2, and the distribution of DEGs between the two culture media differed: it was nearly even at 5% O2 but skewed toward Plasmax at 18% O2 (Table 3.6 and Figure 3.2).

Enrichment analysis of all DEGs in response to medium change in MCF7 at 5% O2 showed enrichment of 31 biological processes, 15 cellular components, 11 molecular functions, and 7 KEGG pathways (Table 3.11, Appendix Table 3.5). The main themes associated with these enriched GO terms and KEGG pathways include regulation of cell cycle (e.g., cell division and G1/S transition of mitotic cell cycle) and DNA replication, which are connected by sharing common genes, with most of their genes expressed higher in DMEM at 5% O2 (Table 3.11, Appendix Table 3.6). The majority of genes showing higher expression in Plasmax at 5% O2 are involved in viral carcinogenesis and viral infections (e.g., Measles and Hepatitis B) (Table 3.11). Interestingly, response to virus, defense response to virus, type I interferon signaling pathway, interferon-gamma-mediated signaling pathway, and negative regulation of viral genome replication were also upregulated in Plasmax at 5% O2, and they shared common genes (Appendix Figure 3.6). Cellular response to hypoxia was also affected by the culture medium at 5% O2, with 7 genes showing higher expression in DMEM and 5 genes showing higher expression in Plasmax (Table 3.11).

Table 3.11. Enriched GO terms and KEGG pathways among DEGs in MCF7 between DMEM and Plasmax at 5% O2.

| Biological Process | All DEGs | 5D | 5P |
|---|---|---|---|
| GO:0060337~type I interferon signaling pathway | 18 | - | 18 |
| GO:0009615~response to virus | 21 | 6* | 15* |
| GO:0000082~G1/S transition of mitotic cell cycle | 20 | 19 | - |
| GO:0006270~DNA replication initiation | 12 | 12 | - |
| GO:0006260~DNA replication | 22 | 22 | - |
| GO:0051607~defense response to virus | 22 | - | 20 |
| GO:0045071~negative regulation of viral genome replication | 11 | - | 11 |
| GO:0006334~nucleosome assembly | 17 | - | 14 |
| GO:0070059~intrinsic apoptotic signaling pathway in response to endoplasmic reticulum stress | 9 | - | 8 |
| GO:0071353~cellular response to interleukin-4 | 8 | 7 | - |
| GO:0042493~response to drug | 25 | 13* | 12* |
| GO:0043065~positive regulation of apoptotic process | 24 | - | 16 |
| GO:0000083~regulation of transcription involved in G1/S transition of mitotic cell cycle | 7 | 7 | - |
| GO:0071456~cellular response to hypoxia | 12 | 7* | 5* |
| GO:0043627~response to estrogen | 10 | 7 | - |
| GO:0006268~DNA unwinding involved in DNA replication | 5 | 5 | - |
| GO:0051290~protein heterotetramerization | 8 | 4* | 4* |
| GO:0098609~cell-cell adhesion | 20 | - | 13* |
| GO:0032508~DNA duplex unwinding | 8 | 8 | - |
| GO:0000722~telomere maintenance via recombination | 7 | 7 | - |
| GO:0071480~cellular response to gamma radiation | 6 | 4* | - |
| GO:0051591~response to cAMP | 8 | - | 6 |
| GO:0031100~organ regeneration | 8 | 4* | 4* |
| GO:0006139~nucleobase-containing compound metabolic process | 8 | 6* | - |
| GO:0042542~response to hydrogen peroxide | 8 | - | 5* |
| GO:0051301~cell division | 22 | 22 | - |
| GO:0061621~canonical glycolysis | 6 | 6 | - |
| GO:0051726~regulation of cell cycle | 12 | - | 7* |
| GO:0000079~regulation of cyclin-dependent protein serine/threonine kinase activity | 7 | 4* | - |
| GO:0034340~response to type I interferon | 4 | - | 4 |
| GO:0060333~interferon-gamma-mediated signaling pathway | 9 | 9 | - |

| Cellular Component | All DEGs | 5D | 5P |
|---|---|---|---|
| GO:0005654~nucleoplasm | 142 | 90 | 52 |
| GO:0016020~membrane | 112 | 74 | 38 |
| GO:0005829~cytosol | 146 | 92 | 54* |
| GO:0005634~nucleus | 205 | 99 | 106 |
| GO:0005737~cytoplasm | 195 | 104 | 91 |
| GO:0042555~MCM complex | 6 | 6 | - |
| GO:0070062~extracellular exosome | 111 | 71 | - |
| GO:0000786~nucleosome | 13 | - | 13 |
| GO:0000784~nuclear chromosome, telomeric region | 15 | 10 | 5** |
| GO:0005739~mitochondrion | 58 | 9 | 24* |
| GO:0000788~nuclear nucleosome | 8 | - | 8 |
| GO:0043209~myelin sheath | 14 | 13 | - |
| GO:0042470~melanosome | 11 | - | 7* |
| GO:0005913~cell-cell adherens junction | 20 | 13 | - |
| GO:0000790~nuclear chromatin | 14 | 8* | 6** |

| Molecular Function | All DEGs | 5D | 5P |
|---|---|---|---|
| GO:0005515~protein binding | 302 | 166 | 136 |
| GO:0005524~ATP binding | 81 | 56 | - |
| GO:0003725~double-stranded RNA binding | 10 | - | 7 |
| GO:0003677~DNA binding | 73 | 34* | 39 |
| GO:0046982~protein heterodimerization activity | 29 | - | 22 |
| GO:0003688~DNA replication origin binding | 5 | 5 | - |
| GO:0003682~chromatin binding | 25 | 15* | 10** |
| GO:0098641~cadherin binding involved in cell-cell adhesion | 20 | 13* | - |
| GO:0000982~transcription factor activity, RNA polymerase II core promoter proximal region sequence-specific binding | 6 | - | 6 |
| GO:0042802~identical protein binding | 38 | 23* | 15** |
| GO:0042393~histone binding | 12 | 6* | 6* |

| KEGG Pathway | All DEGs | 5D | 5P |
|---|---|---|---|
| hsa04110:Cell cycle | 21 | 17 | - |
| hsa03030:DNA replication | 11 | 11 | - |
| hsa04115:p53 signaling pathway | 12 | 5* | 7 |
| hsa05162:Measles | 15 | - | 11 |
| hsa05203:Viral carcinogenesis | 19 | - | 13 |
| hsa05161:Hepatitis B | 15 | - | 9 |
| hsa01130:Biosynthesis of antibiotics | 18 | 18 | - |
| hsa01100:Metabolic pathways | - | 41 | - |

Abbreviations: 5D, 5% O2 in DMEM; 5P, 5% O2 in Plasmax. The most significantly enriched GO terms and KEGG pathways were determined by p-value and Benjamini-corrected p-value ≤ 0.05 unless indicated otherwise; *, significantly enriched based on p-value (≤ 0.05); **, not significantly enriched; (-), no result.

For MCF7’s response to medium change at 18% O2, the 588 DEGs showed enrichment for 14 biological processes, 6 cellular components, 2 molecular functions, and one KEGG pathway (Table 3.12, Appendix Table 3.6), fewer than for culture medium change at 5% O2 (Table 3.11). Nevertheless, cell cycle (e.g., G1/S transition of mitotic cell cycle and cell division), DNA replication, negative regulation of transcription from RNA polymerase II promoter, protein binding, and ATP binding were the most affected by culture medium at both oxygen levels, with more genes showing higher expression in Plasmax than in DMEM at 18% O2; these terms also share common genes (Table 3.12, Appendix Figure 3.7). Only several genes involved in lipoprotein metabolic process and protein binding, as well as a subset of genes localized in the cytosol and cytoplasm, were expressed higher in DMEM (Table 3.12). In summary, our results demonstrate that MCF7 responded more to medium change at 18% O2 than at 5% O2. At 18% O2, genes associated with cell cycle/DNA replication showed higher expression in Plasmax, while at 5% O2 genes associated with these pathways showed higher expression in DMEM.

Table 3.12. Enriched GO terms and KEGG pathways among DEGs in MCF7 between DMEM and Plasmax at 18% O2.

| Biological Process | All DEGs | 18D | 18P |
|---|---|---|---|
| GO:0006260~DNA replication | 27 | - | 27 |
| GO:0060337~type I interferon signaling pathway | 17 | - | 17 |
| GO:0000082~G1/S transition of mitotic cell cycle | 19 | - | 18 |
| GO:0071222~cellular response to lipopolysaccharide | 16 | - | 13 |
| GO:0006270~DNA replication initiation | 9 | - | 9 |
| GO:0009615~response to virus | 14 | - | 13 |
| GO:0045071~negative regulation of viral genome replication | 9 | - | 9 |
| GO:0051301~cell division | 27 | - | 24 |
| GO:0000122~negative regulation of transcription from RNA polymerase II promoter | 43 | - | 36* |
| GO:0060333~interferon-gamma-mediated signaling pathway | 11 | - | 11 |
| GO:0000731~DNA synthesis involved in DNA repair | 8 | - | 8 |
| GO:0000732~strand displacement | 7 | - | 7 |
| GO:0042157~lipoprotein metabolic process | 8 | 6* | - |
| GO:0051607~defense response to virus | 16 | - | 15 |

| Cellular Component | All DEGs | 18D | 18P |
|---|---|---|---|
| GO:0005654~nucleoplasm | 130 | - | 111 |
| GO:0005634~nucleus | 215 | - | 175 |
| GO:0005737~cytoplasm | 207 | 51* | 156 |
| GO:0042555~MCM complex | 6 | - | 6 |
| GO:0005615~extracellular space | 65 | - | 54 |
| GO:0005829~cytosol | 132 | 36* | 96* |

| Molecular Function | All DEGs | 18D | 18P |
|---|---|---|---|
| GO:0005515~protein binding | 332 | 77* | 255 |
| GO:0005524~ATP binding | 75 | * | 59 |

| KEGG Pathway | All DEGs | 18D | 18P |
|---|---|---|---|
| hsa03030:DNA replication | 10 | - | 10 |

Abbreviations: 18D, 18% O2 in DMEM; 18P, 18% O2 in Plasmax. The most significantly enriched GO terms and KEGG pathways were determined by p-value and Benjamini-corrected p-value ≤ 0.05 unless indicated otherwise; *, significantly enriched based on p-value (≤ 0.05); **, not significantly enriched; (-), no result.

3.3.6. DGE in response to culture medium changes in PC3

In PC3 cells growing at 5% O2, the change of culture medium (5D vs. 5P) yielded a total of 254 DEGs, of which 180 genes showed higher expression in DMEM and 74 genes showed higher expression in Plasmax (Table 3.6 and Figure 3.2). In comparison, in PC3 cells growing at 18% O2 (18D vs. 18P), a total of 619 DEGs were identified, the largest number for PC3, of which 375 genes showed higher expression in DMEM and 244 genes showed higher expression in Plasmax. The total number of DEGs was thus more than 2 times higher at 18% O2 than at 5% O2, and in both cases more DEGs showed higher expression in DMEM than in Plasmax. PC3 was similar to MCF7 in responding more strongly at 18% O2, but MCF7 showed a similar number of DEGs between the two media at both oxygen levels (Table 3.6 and Figure 3.2).
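The counts reported above can be checked with a few lines of arithmetic. The following is only a minimal sketch: the numbers are those quoted in the text for PC3, while the dictionary layout and variable names are illustrative assumptions rather than part of the analysis pipeline.

```python
# DEG counts for the PC3 medium comparisons, as quoted in the text above.
# The dictionary keys and variable names are illustrative only.
deg_counts = {
    "5D vs 5P": {"higher_in_DMEM": 180, "higher_in_Plasmax": 74},
    "18D vs 18P": {"higher_in_DMEM": 375, "higher_in_Plasmax": 244},
}

# Total DEGs per comparison (sum of both expression directions).
totals = {
    name: c["higher_in_DMEM"] + c["higher_in_Plasmax"]
    for name, c in deg_counts.items()
}

# Ratio of DEG totals between the two oxygen levels.
fold = totals["18D vs 18P"] / totals["5D vs 5P"]

# Fraction of DEGs with higher expression in DMEM for each comparison.
dmem_share = {
    name: c["higher_in_DMEM"] / totals[name] for name, c in deg_counts.items()
}

for name in deg_counts:
    print(f"{name}: total={totals[name]}, DMEM share={dmem_share[name]:.2f}")
print(f"18% O2 / 5% O2 ratio of DEG totals: {fold:.2f}")
```

The totals reproduce the 254 and 619 DEGs given in the text, and the ratio is consistent with the statement that the number of DEGs was more than 2 times higher at 18% O2.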

Enrichment analysis showed that positive regulation of apoptotic process and Legionellosis were enriched among all DEGs at 5% O2, with the genes involved mostly showing higher expression in DMEM (Table 3.13, Appendix Table 3.7). DEGs with higher expression in Plasmax were enriched only for Legionellosis, with 5 genes.

Table 3.13. Enriched GO terms and KEGG pathways among DEGs in PC3 between DMEM and Plasmax at 5% O2.

Biological Process | All DEGs | 5D | 5P
GO:0043065~positive regulation of apoptotic process | 15 | 12* | -

KEGG Pathway | All DEGs | 5D | 5P
hsa05134:Legionellosis | 8 | 3** | 5

Abbreviations: 5D, 5% O2 in DMEM; 5P, 5% O2 in Plasmax. The most significantly enriched GO terms or KEGG pathways are determined by the p-value and Benjamini corrected p-value ≤ 0.05 unless indicated otherwise; * Significantly enriched based on p-value (≤ 0.05); ** Not significantly enriched; (-) No result.

The 619 DEGs identified in response to medium change at 18% O2 showed enrichment for 7 biological processes, 5 cellular components, and 2 molecular functions (Table 3.14, Appendix Table 3.8). DEGs with higher expression in Plasmax showed enrichment for GO terms associated with intrinsic apoptotic signaling pathway in response to ER stress, type I interferon signaling pathway, positive regulation of angiogenesis, cellular response to hypoxia, transcription factor binding, and protein heterodimerization activity. More DEGs with higher expression in DMEM showed enrichment for gap junction assembly, negative regulation of cell migration, and protein binding, whereas an equal number of genes involved in response to hypoxia showed higher expression in each medium (Table 3.14). Among these enriched GO terms, only protein binding shared a minimum of 37.5% of its genes with cellular response to hypoxia, intrinsic apoptotic signaling pathway in response to ER stress, transcription factor binding, and protein heterodimerization activity; these genes were mostly localized in cytosol and cytoplasm (Appendix Figure 3.8).

In summary, PC3 showed its largest response to medium change at 18% O2, as was the case for MCF7, with more of the DEGs involved in the enriched molecular functions showing higher expression in Plasmax than in DMEM. In comparison, DEGs involved in enriched GO terms at 5% O2 mostly showed higher expression in DMEM than in Plasmax.

Table 3.14. The list of enriched GO terms and KEGG pathways among DEGs in PC3 between DMEM and Plasmax at 18% O2.

Biological Process | All DEGs | 18D | 18P
GO:0070059~intrinsic apoptotic signaling pathway in response to endoplasmic reticulum stress | 9 | - | 8
GO:0060337~type I interferon signaling pathway | 12 | - | 11
GO:0045766~positive regulation of angiogenesis | 15 | - | 10
GO:0016264~gap junction assembly | 5 | 4* | -
GO:0071456~cellular response to hypoxia | 13 | - | 9
GO:0001666~response to hypoxia | 18 | 9* | 9*
GO:0030336~negative regulation of cell migration | 13 | 8* | 5*

Cellular Component | All DEGs | 18D | 18P
GO:0005737~cytoplasm | 222 | 146 | 76*
GO:0005829~cytosol | 153 | 104 | 49**
GO:0016020~membrane | 108 | 79 | -
GO:0005925~focal adhesion | 28 | 22 | -
GO:0070062~extracellular exosome | 122 | 67** | 55

Molecular Function | All DEGs | 18D | 18P
GO:0008134~transcription factor binding | 24 | - | 14*
GO:0005515~protein binding | 329 | 206* | 123*
GO:0046982~protein heterodimerization activity | 32 | - | 19

Abbreviations: 18D, 18% O2 in DMEM; 18P, 18% O2 in Plasmax. The most significantly enriched GO terms or KEGG pathways are determined by the p-value and Benjamini corrected p-value ≤ 0.05 unless indicated otherwise; * Significantly enriched based on p-value (≤ 0.05); ** Not significantly enriched; (-) No result.
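The significance markers used in the table footnotes combine a raw p-value with a Benjamini corrected p-value. The sketch below illustrates how such markers could be derived, assuming that the correction corresponds to the standard Benjamini-Hochberg procedure; the term names and p-values are hypothetical and are not taken from the enrichment results, and the function names are illustrative only.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def footnote_marker(p, q, alpha=0.05):
    """Map raw (p) and adjusted (q) p-values to the table footnote markers."""
    if p <= alpha and q <= alpha:
        return ""    # most significantly enriched (no marker)
    if p <= alpha:
        return "*"   # significant based on the raw p-value only
    return "**"      # not significantly enriched


# Hypothetical raw p-values for four enriched terms (illustrative only).
raw = {"term_A": 0.001, "term_B": 0.02, "term_C": 0.04, "term_D": 0.30}
adjusted = dict(zip(raw, benjamini_hochberg(list(raw.values()))))
flags = {term: footnote_marker(raw[term], adjusted[term]) for term in raw}

for term in raw:
    print(term, raw[term], round(adjusted[term], 4), repr(flags[term]))
```

In this hypothetical example, a term can pass the raw p-value threshold yet miss the corrected threshold, which is the situation marked with a single asterisk in the tables.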

3.3.7. Common and differential response to culture conditions between PC3 and MCF7

To understand the commonalities and differences between PC3 and MCF7 in their response to oxygen level and culture medium changes, we compared their DEGs and enriched GO terms and KEGG pathways. We first compared the lists of DEGs via Venn diagram analysis. For the response to oxygen level change, 18 and 89 DEGs were common between the two cell lines in DMEM and Plasmax, respectively (Figure 3.3; lists of genes available in Appendix Tables 3.9 and 3.10), indicating that the two cell lines are more similar in Plasmax. In more detail, the largest number of shared DEGs between the two cell lines was seen at 18% O2 in Plasmax (42 DEGs shared between PC3 18P and MCF7 18P), and the smallest number of shared DEGs was seen between PC3 at 18% O2 and MCF7 at 5% O2 in DMEM (2 DEGs shared between PC3 18D and MCF7 5D).

Figure 3.3. Venn diagrams showing a comparison of DEGs in response to oxygen level change in DMEM (left) and Plasmax (right). MCF7, breast cancer cell line; PC3, prostate cancer cell line; 5P, 5% O2 in Plasmax medium; 5D, 5% O2 in DMEM; 18P, 18% O2 in Plasmax medium; 18D, 18% O2 in DMEM.
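The Venn diagram comparison amounts to intersecting the DEG lists of the two cell lines. A minimal sketch of that operation is given below; the gene symbols are placeholders and the helper function `shared_degs` is hypothetical, so this only illustrates the set logic, not the actual gene lists or the tool used to draw the figure.

```python
from itertools import product

# Hypothetical DEG lists keyed by (cell line, condition with higher expression).
# Gene symbols are placeholders, not genes from the appendix tables.
deg_lists = {
    ("MCF7", "5D"): {"GENE1", "GENE2", "GENE3"},
    ("MCF7", "18D"): {"GENE4", "GENE5"},
    ("PC3", "5D"): {"GENE2", "GENE6"},
    ("PC3", "18D"): {"GENE3", "GENE5", "GENE7"},
}


def shared_degs(lists, line_a, line_b):
    """Intersect every DEG list of line_a with every DEG list of line_b."""
    shared = {}
    for (la, cond_a), (lb, cond_b) in product(lists, repeat=2):
        if la == line_a and lb == line_b:
            shared[(cond_a, cond_b)] = lists[(la, cond_a)] & lists[(lb, cond_b)]
    return shared


pairwise = shared_degs(deg_lists, "MCF7", "PC3")
# Genes found in at least one MCF7 list and at least one PC3 list.
common = set().union(*pairwise.values())

for (cond_a, cond_b), genes in pairwise.items():
    print(f"MCF7 {cond_a} and PC3 {cond_b}: {len(genes)} shared")
print(f"Common between the two cell lines: {len(common)}")
```

Each pairwise intersection corresponds to one overlapping region between an MCF7 set and a PC3 set in a four-set Venn diagram.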

In response to different culture media, 35 and 109 DEGs were common between MCF7 and PC3 at 5% and 18% O2, respectively (Figure 3.4; lists of genes available in Appendix Tables 3.11 and 3.12), suggesting that the two cell lines are more similar at 18% O2. More specifically, the largest number of shared DEGs was seen at 18% O2 in Plasmax (75 DEGs shared between PC3 18P and MCF7 18P), while the smallest number of shared DEGs was seen between PC3 in DMEM and MCF7 in Plasmax, both at 5% O2 (1 DEG shared between PC3 5D and MCF7 5P).

Overall, based on the number of shared DEGs between the two cell lines, the highest similarity was seen at 18% O2 in Plasmax, and very low similarity was seen between 5% O2 in DMEM and a few other conditions (i.e., PC3 5P and MCF7 5D, PC3 18D and MCF7 5D, PC3 18D and MCF7 18D, and PC3 5D and MCF7 18D), mostly associated with DMEM. This seems to suggest that in the more physiological medium, Plasmax, cells tend to behave more consistently.

Figure 3.4. Venn diagrams showing comparison of DEGs in response to medium change at 5% O2 (left) and 18% O2 (right). MCF7, breast cancer cell line; PC3, prostate cancer cell line; 5P, 5% O2 in Plasmax medium; 5D, 5% O2 in DMEM; 18P, 18% O2 in Plasmax medium; 18D, 18% O2 in DMEM.

We then compared the enriched KEGG pathways among the different culture conditions. As seen in Table 3.15, the largest number of pathways was impacted in MCF7 in response to oxygen level change in DMEM, while the smallest number of pathways was impacted in PC3 in response to oxygen level change in DMEM and to medium change at 18% O2. The cell cycle, DNA replication, and p53 signaling pathways were the most commonly impacted pathways. This agrees with the results of the corresponding analysis of biological process GO terms, in which the terms related to cell cycle and DNA replication were the most impacted (Table 3.16).

Table 3.15. Comparison of the enriched KEGG pathway among the different lists of DEGs.

Columns: 5D vs 18D (MCF7, PC3) | 5P vs 18P (MCF7, PC3) | 5P vs 5D (MCF7, PC3) | 18P vs 18D (MCF7, PC3)

KEGG pathway
hsa04115:p53 signaling pathway
hsa04110:Cell cycle
hsa03030:DNA replication
hsa05203:Viral carcinogenesis
hsa04114:Oocyte meiosis
hsa03410:Base excision repair
hsa05166:HTLV-I infection
hsa03460:Fanconi anemia pathway
hsa05161:Hepatitis B
hsa01230:Biosynthesis of amino acids
hsa03430:Mismatch repair
hsa00010:Glycolysis / Gluconeogenesis
hsa00240:Pyrimidine metabolism
hsa05219:Bladder cancer
hsa03440:Homologous recombination
hsa04914:Progesterone-mediated oocyte maturation
hsa04068:FoxO signaling pathway
hsa05162:Measles
hsa01130:Biosynthesis of antibiotics
hsa05134:Legionellosis

The presence of the KEGG pathway is highlighted in green, while the absence of the KEGG pathway is highlighted in yellow.

Table 3.16. Comparison of the enriched biological process GO terms among different lists of DEGs.

Columns: 5D vs 18D (MCF7, PC3) | 5P vs 18P (MCF7, PC3) | 5P vs 5D (MCF7, PC3) | 18P vs 18D (MCF7, PC3)

Biological process
GO:0007049~cell cycle
GO:0000278~mitotic cell cycle
GO:0000082~G1/S transition of mitotic cell cycle
GO:0000086~G2/M transition of mitotic cell cycle
GO:0007093~mitotic cell cycle checkpoint
GO:0000070~mitotic sister chromatid segregation
GO:0007062~sister chromatid cohesion
GO:0007051~spindle organization
GO:0007052~mitotic spindle organization
GO:0090307~mitotic spindle assembly
GO:0007059~chromosome segregation
GO:0007067~mitotic nuclear division
GO:0007080~mitotic metaphase plate congression
GO:0000722~telomere maintenance via recombination
GO:0006334~nucleosome assembly
GO:0016264~gap junction assembly
GO:0008283~cell proliferation
GO:0051301~cell division
GO:0031100~organ regeneration
GO:0051726~regulation of cell cycle
GO:0051436~negative regulation of ubiquitin-protein ligase activity involved in mitotic cell cycle
GO:0051439~regulation of ubiquitin-protein ligase activity involved in mitotic cell cycle
GO:0000079~regulation of cyclin-dependent protein serine/threonine kinase activity
GO:0000122~negative regulation of transcription from RNA polymerase II promoter
GO:0008285~negative regulation of cell proliferation
GO:0000083~regulation of transcription involved in G1/S transition of mitotic cell cycle
GO:0006260~DNA replication
GO:0006270~DNA replication initiation
GO:0032508~DNA duplex unwinding
GO:0006268~DNA unwinding involved in DNA replication
GO:0006271~DNA strand elongation involved in DNA replication
GO:0000732~strand displacement
GO:0006281~DNA repair
GO:0000731~DNA synthesis involved in DNA repair
GO:0006297~nucleotide-excision repair, DNA gap filling
GO:0006974~cellular response to DNA damage stimulus
GO:1901796~regulation of signal transduction by p53 class mediator
GO:0006977~DNA damage response, signal transduction by p53 class mediator resulting in cell cycle arrest
GO:0000910~cytokinesis
GO:0000281~mitotic cytokinesis
GO:0034097~response to cytokine
GO:0060333~interferon-gamma-mediated signaling pathway
GO:0060337~type I interferon signaling pathway
GO:0034340~response to type I interferon
GO:0001525~angiogenesis
GO:0010595~positive regulation of endothelial cell migration
GO:0030336~negative regulation of cell migration
GO:0098609~cell-cell adhesion
GO:0097193~intrinsic apoptotic signaling pathway
GO:0070059~intrinsic apoptotic signaling pathway in response to endoplasmic reticulum stress
GO:0043065~positive regulation of apoptotic process
GO:0043066~negative regulation of apoptotic process
GO:0043524~negative regulation of neuron apoptotic process
GO:0045071~negative regulation of viral genome replication
GO:0009615~response to virus
GO:0051607~defense response to virus
GO:0016925~protein sumoylation
GO:0034501~protein localization to kinetochore
GO:0042157~lipoprotein metabolic process
GO:0051290~protein heterotetramerization
GO:0006094~gluconeogenesis
GO:0006096~glycolytic process
GO:0061621~canonical glycolysis
GO:0070301~cellular response to hydrogen peroxide
GO:0071222~cellular response to lipopolysaccharide
GO:0071353~cellular response to interleukin-4
GO:0071456~cellular response to hypoxia
GO:0071480~cellular response to gamma radiation
GO:0001666~response to hypoxia
GO:0002931~response to ischemia
GO:0009612~response to mechanical stimulus
GO:0009636~response to toxic substance
GO:0032355~response to estradiol
GO:0042493~response to drug
GO:0042542~response to hydrogen peroxide
GO:0043434~response to peptide hormone
GO:0043627~response to estrogen
GO:0051591~response to cAMP
GO:0006139~nucleobase-containing compound metabolic process
GO:0015949~nucleobase-containing small molecule interconversion
GO:0031145~anaphase-promoting complex-dependent catabolic process
GO:0045429~positive regulation of nitric oxide biosynthetic process

The presence of an enriched biological process GO term in a specific DEG list is highlighted in green, while the absence of the biological process is highlighted in yellow.

Overall, based on the number of enriched GO terms and KEGG pathways between the two cell lines and among the different culture conditions, MCF7 was highly affected by changes in oxygen level and culture medium, especially at 5D vs 18D. In comparison, PC3 was much less affected, in particular at 5D vs 18D and 5D vs 5P. Interestingly, while MCF7 was highly affected by oxygen level in DMEM, a condition in which PC3 showed little effect, PC3 was more affected than MCF7 by oxygen level in Plasmax (Figures 3.2 and 3.5). For the response to culture medium change, MCF7 was more affected than PC3 at both oxygen levels. This was also the case when condition changes involved both the medium and the oxygen level.

Figure 3.5. Stacked bar plots summarizing the effect of oxygen level and culture medium changes on MCF7 and PC3 based on the number of enriched biological processes (green), molecular functions (blue), and KEGG pathways (yellow) among all DEGs. MCF7, breast cancer cell line; PC3, prostate cancer cell line; 5P, 5% O2 in Plasmax medium; 5D, 5% O2 in DMEM; 18P, 18% O2 in Plasmax medium; 18D, 18% O2 in DMEM.

3.4. Discussions

In this study, we examined the impact of oxygen level in cultured cells using PC3 and MCF7 as the models and compared their response in two different culture media by performing RNA-seq based gene expression profiling. We discuss below the relevance of our results to the existing literature.

3.4.1. The impact of culture conditions on MCF7 and PC3

The effects of oxygen level changes and culture medium changes were examined based on the number of DEGs and of significantly enriched biological processes, molecular functions, and KEGG pathways. In response to oxygen level changes, MCF7 was highly affected in DMEM, whereas PC3 was highly affected in Plasmax (Figures 3.2 and 3.5). Interestingly, oxygen level changes had less impact on PC3 growing in DMEM, as shown by a smaller number of DEGs, and only 2 cellular components (extracellular exosome and extracellular space) were significantly enriched, with their genes showing higher expression at 5% O2 in DMEM. PC3 seems to be resistant to oxygen level changes in DMEM; as previously reported, both O2 and glucose seem to have little impact on cell growth rates in PC3 (Fonseca et al., 2018). The high number of enriched GO terms and KEGG pathways in MCF7, compared to PC3, showed that MCF7 is very sensitive to oxygen level changes in DMEM, in which most of these pathways were suppressed at 18% O2 (Table 3.7). However, the comparison of the impact of oxygen level changes between DMEM and Plasmax showed a decreased number of DEGs and KEGG pathways in MCF7 growing in Plasmax, while PC3 growing in Plasmax showed the opposite trend (Figures 3.2 and 3.5). Interestingly, most of the DEGs in MCF7 showed higher expression at 18% O2 in Plasmax (Table 3.8), whereas in PC3 most of the DEGs showed higher expression at 5% O2 in Plasmax (Table 3.10).

In response to culture medium changes, both cell lines were more highly affected at 18% O2 than at 5% O2, as shown by the total number of DEGs (Figure 3.2). Interestingly, the enrichment analysis showed that as the level of oxygen increased, the number of affected GO terms and KEGG pathways decreased in MCF7 but increased in PC3 (Figure 3.5). However, MCF7 was the most affected by culture medium changes, as shown by its higher number of significantly enriched GO terms and KEGG pathways compared to PC3. DNA replication was the most affected pathway in MCF7 (5D vs 5P and 18D vs 18P) but not in PC3, in which only Legionellosis was affected, at 5D vs 5P (Table 3.15). At 5% O2, the p53 signaling pathway, cell cycle, viral carcinogenesis, hepatitis B, and measles were also significantly enriched in MCF7. In contrast, based on the enriched biological processes, response to hypoxia, gap junction assembly, negative regulation of cell migration, type I interferon signaling pathway, and intrinsic apoptotic signaling pathway in response to endoplasmic reticulum stress were enriched in PC3 at 18% O2 (Table 3.16), with most of the genes involved in these biological processes showing higher expression in Plasmax than in DMEM (Table 3.14).

In response to the combinatory effect, both cell lines were highly affected by the standard cell-culture condition (18% O2 in DMEM) and the physiological condition (5% O2 in Plasmax) (Appendix Tables 3.13-3.16); we also compared the enrichment results between 5% O2 in DMEM and 18% O2 in Plasmax (Appendix Tables 3.17-3.20). Looking at the lists of DEGs separately to determine the specific impact of the standard cell-culture and physiological conditions, both cell lines showed a high number of DEGs expressed at 18% O2 in DMEM, whereas at 5% O2 in Plasmax, MCF7 showed a higher number of DEGs than PC3 (Figure 3.2). As can be seen in Figure 3.5, the number of the most significantly enriched GO terms and KEGG pathways among all DEGs was lower for the standard culture condition vs the physiological condition than for 5% O2 in DMEM vs 18% O2 in Plasmax in both cell lines. No significantly enriched GO terms or KEGG pathways were found in PC3, whereas a low number of significantly enriched GO terms and KEGG pathways were found in MCF7.

Based on the enrichment analysis in MCF7, genes involved in the p53 signaling pathway (KEGG pathway) and 3 biological processes (intrinsic apoptotic signaling pathway in response to endoplasmic reticulum stress, negative regulation of transcription from RNA polymerase II promoter, and positive regulation of transcription from RNA polymerase II promoter) were upregulated by the standard culture condition, whereas the majority of genes involved in the enriched GO terms showed higher expression in the physiological condition (Appendix Tables 3.13 and 3.14). In PC3, only 3 cellular components were the most significantly enriched among all DEGs (Appendix Table 3.15), and DEGs involved in the anaphase-promoting complex-dependent catabolic process, negative regulation of apoptotic process, extracellular space, protein binding, NOD-like receptor signaling pathway, cell cycle, TNF signaling pathway, and Legionellosis were expressed only in the physiological condition (Appendix Table 3.16).

These findings showed that both oxygen level and culture medium contributed to the changes in gene expression in both cell lines, affecting various metabolic and signaling pathways. Intratumoral hypoxia is a common finding in advanced cancers. Both breast cancer and prostate cancer tissues are characterized by low oxygen tension, or hypoxia (Ma et al., 2015; Zhang et al., 2016), and significant changes in gene expression at high oxygen levels are expected for both cell lines. For example, as we can see from the combinatory effect, the physiological condition upregulated the expression of most genes important for various pathways and biological processes in both cell lines. This effect can also be seen at different oxygen levels and culture media. Also, 18% O2 in DMEM and Plasmax contributed to the suppression of biological processes, molecular functions, and pathways important for cell growth and survival, such as cell cycle, DNA replication, glucagon signaling pathway, central carbon metabolism in cancer, glycolysis/gluconeogenesis, starch and sucrose metabolism, and metabolic pathways. We suggest that 5% O2 is more suitable for MCF7 and PC3 cell growth, as it is closer to their physiological oxygen level. However, the use of oxygen levels lower than 5% for cell culture is strongly recommended, as it will provide information more similar to their physiological condition.

In addition to this, the compositions of Plasmax and DMEM are very different, which also contributed to the significant changes in gene expression in both cell lines (the compositions of Plasmax and DMEM were reviewed in Ackermann and Tardito (2019)). Plasmax was formulated to provide a nutrient composition for cell growth that is closer to human plasma (Ackermann and Tardito, 2019) than that of DMEM, in which most of the nutrients essential for cell growth are possibly not provided, leading to changes in metabolism and various signaling pathways in both cell lines (Voorde et al., 2019). In addition, DMEM was widely used for cell culture before the new Plasmax formulation was obtained, which possibly changed the adaptation of the cell lines after the transition from DMEM to Plasmax. Moreover, oxidative stress, the key player in glucose toxicity, was reported to increase more under intermittent high glucose, such as in Plasmax with its reduced glucose content, than under sustained high glucose (DMEM, with a higher glucose content than Plasmax), because chronic exposure to high glucose might induce some metabolic variations, including antioxidant defenses or feedback regulation, as a defense mechanism of cells (Kim et al., 2012). Intermittent exposure to high glucose in Plasmax might reduce such adaptation (Kim et al., 2012) and possibly contributed to the drastic changes in metabolism and signaling pathways in both cell lines growing in DMEM and Plasmax, specifically in MCF7. This suggests that more studies are needed to observe the impact of Plasmax and DMEM using oxygen levels lower than 5%, or close to physiological oxygen levels, for both cell lines.

3.4.2. The most affected KEGG Pathways in response to culture conditions in MCF7 and PC3

Based on the comparison of enrichment analysis among all DEGs in both cell lines in response to oxygen level changes, culture medium changes, and the combinatory effect, cell cycle, DNA replication, and p53 signaling pathway were the most affected KEGG pathways (Appendix Figures 3.9 and 3.10). This impact can be observed in MCF7 and PC3 in response to oxygen level changes in Plasmax and DMEM (Tables 3.7-3.10) as well as at 18P vs 18D and 5P vs 5D. MCF7 grown in DMEM, for example, showed that GO terms and KEGG pathways mostly associated with DNA replication in response to DNA damage (e.g. DNA replication, p53 signaling pathway, BER, MMR, and HR) were upregulated at 5% O2 in DMEM (5D vs 18D) (Table 3.7), whereas MCF7 at 18% O2 in Plasmax (5P vs 18P) showed the opposite result (Table 3.8). A similar impact of hypoxia can also be observed in PC3 at 5% O2 in Plasmax (Table 3.10). In addition, viral carcinogenesis and hepatitis B were affected in 3 comparisons, followed by biosynthesis of antibiotics, glycolysis/gluconeogenesis, and oocyte meiosis, which were affected in two comparisons. The relevance of our results in the context of existing literature is discussed below.

3.4.2.1. Cell-cycle

Cell cycle events are regulated and driven by a variety of cyclin and cyclin-dependent kinase (CDK) combinations at the G1, G1/S, S, and G2/M phases (Maziero et al., 2020), and cyclins A and B (CCNA1-2 and CCNB1-3) function in meiosis regulation (Li et al., 2019). We identified 16 differentially expressed genes that play a major role in the cell cycle (KEGG pathway) and its associated biological processes: CCNB1, CCNB2, CCNA2, CDK1, CDK2, the cell division cycle associated genes (CDCA2-5, 7, and 8), the cell division cycle genes (CDC6, 25A, and 27), and growth arrest and DNA damage (GADD45) alpha (GADD45A) and gamma (GADD45G) (Appendix Table 3.21).

The change in expression of CDK1, CDK2, CCNB1, CCNB2, and CCNA2 potentially leads to delays in mitotic resumption and progression in oocyte meiosis (Li et al., 2019), whereas the expression change of CDCA8 results in kinetochore–spindle misattachments and ectopic spindle pole formation, and also leads to defective cell proliferation and p53 accumulation (Zhang et al., 2020). The expression changes of CDC25A possibly affected the inactivation of CDK activity in the mitotic cell cycle, which is achieved by phosphorylation of the crucial residues Tyr 15 and Thr 14, located within the Cdk ATP-binding loop, and is reversed when CDC25 phosphatases dephosphorylate these residues. Inactivation of Cdk/cyclin complexes can also be achieved by ubiquitin-mediated degradation of cyclins. However, specific checkpoints that are activated by damaged or un-replicated DNA may also rapidly inhibit Cdk activity, delaying the progression of the cell cycle to provide time to repair DNA or to complete replication (Donzelli and Draetta, 2003).

3.4.2.2. DNA replication

MCF7 and PC3 are characterized by hypoxia, which is known to be associated with increased chromosomal instability, gene amplification, down-regulation of DNA damage repair pathways, and altered sensitivity to DNA damaging agents, thus contributing to the cancer phenotype (Ma et al., 2015) and affecting the DNA replication process. To ensure the accuracy of DNA replication, which is crucial to maintaining the integrity of the genome, the DNA damage response (DDR), mediated by various cell cycle checkpoints, either activates the DNA repair system or induces cellular apoptosis/senescence when DNA damage arises (Zhang et al., 2016). The DDR engages signaling pathways that regulate the recognition of DNA damage, the recruitment of DNA repair factors, the initiation and coordination of DNA repair pathways, transit through the cell cycle, and apoptosis (Li et al., 2016). Based on the enrichment results, the DNA replication process involving DDR was affected (Appendix Figures 3.13 and 3.14), as also shown by several DEGs involved in DNA replication, the p53 signaling pathway, the MMR pathway, and DNA polymerase proofreading, as well as genes coding for MCM proteins important for DDR (Appendix Table 3.21), as described below. In MCF7, genes involved in the KEGG pathway “DNA replication” were upregulated by 5% O2 in DMEM (5D vs 18D and 5D vs 5P), 18% O2 in Plasmax (5P vs 18P and 18D vs 18P), and the standard cell culture condition (standard cell culture condition vs physiological condition), whereas in PC3, these genes were only affected by 5% O2 in Plasmax (5P vs 18P).

3.4.2.2.1. p53 signaling pathway

p53 is a tumor suppressor and the product of the TP53 gene, and it plays an important role in the regulation of DDR. Under normal physiological conditions, p53 is expressed at an extremely low level (Yoshihara et al., 2012). The tumor suppressor p53 is one of the most frequently mutated or deleted genes in human cancers (Chen et al., 2014). It has been implicated in regulating an assortment of cellular events including cell cycle arrest, apoptosis, senescence, aging in response to cellular damage or insult, guarding against genomic instability, and playing a critical role in DNA repair (Chen et al., 2014; de Marval and Zhang, 2011). It has also been reported to be involved in differentiation, inflammation, immune response, metabolism, hormone-induced processes, transcription, epigenome, and autophagy (Issaeva, 2019). p53 contributes to the maintenance of the G2/M checkpoint by transcriptional repression of both cdc25C and cyclin B, and by upregulation of p21, which can inhibit cyclin B–cdk1 complexes, 14-3-3 sigma proteins, which target cdc25C proteins to the cytoplasm, and GADD45, a protein that can inhibit cyclin B–Cdc2 complexes (Senturk and Manfredi, 2013). When DNA damage occurs, cells stop the cell cycle at the G1/S, S, or G2/M phase and activate checkpoint pathways regulating replication and DNA repair (Yun et al., 2012).

Our data show that the p53 signaling pathway was affected in both cell lines (Appendix Figures 3.13 and 3.14), with genes in this pathway mostly upregulated by 18% O2 in DMEM in MCF7 (5D vs 18D), 5% O2 in Plasmax in PC3 (5P vs 18P) and MCF7 (5P vs 5D), and also 18% O2 in Plasmax in MCF7 (18D vs 18P). This is also supported by significantly enriched GO terms, including DNA damage response, signal transduction by p53 class mediator resulting in cell cycle arrest, regulation of signal transduction by p53 class mediator, and other related biological processes and molecular functions. In total, 10 genes important in the p53 signaling pathway were differentially expressed at different levels of oxygen and in different culture media (Appendix Table 3.21). These genes are tumor protein p53 (TP53), Serpin Family E Member 1 (SERPINE1), Serpin Family F Member 1 (SERPINF1), 2 Cyclin-Dependent Kinase Inhibitors (CDKN1A and CDKN2B), 3 Growth Arrest and DNA Damage (GADD45) genes (inducible alpha (GADD45A), inducible beta (GADD45B), and inducible gamma (GADD45G)), and BUB1 Mitotic Checkpoint Serine/Threonine Kinase (BUB1).

The upregulation of CDKN1A accomplishes cell cycle arrest by inhibiting cyclin–CDK complexes: its product, p21, binds to the Cdk2/cyclin E complex, which otherwise promotes cell cycle progression, in the presence of DNA damage (Senturk and Manfredi, 2013; Yun et al., 2012). This in turn represses the action of the DNA replication-related enzymes phosphorylated by Cdk2, so that the progress of the cell cycle is arrested (Yun et al., 2012). The disruption of mitosis is also supported by the differentially expressed BUB1, which plays multiple roles in chromosome segregation and the spindle checkpoint during mitosis. By phosphorylating Cell Division Cycle 20 (CDC20), a key regulator of APC/C activity, BUB1 ensures that activation of the anaphase-promoting complex (APC) (also known as cyclosome, APC/C), which controls sister chromatid separation and mitotic exit, is delayed until all the chromosomes have achieved proper bipolar connections to the mitotic spindle (Ha and Breuer, 2012). Importantly, the downregulation of BUB1, BRCA2, and CDC20 inhibits p53-dependent transcriptional activation, cell cycle arrest, and apoptosis, leading to genome instability and resulting in aneuploidy (Ha and Breuer, 2012). The differential expression of GADD45 suggests that stress signals regulate p53 to activate GADD45, promoting cell survival or senescence (de Marval and Zhang, 2011).

3.4.2.2.2. MMR pathway

The post-replication surveillance by the MMR system plays an important role in several cellular processes, including the DDR as the result of DNA replication errors (Li et al., 2016; Rayner et al., 2016). The MMR pathway, an important tumor suppressor pathway, resolves single-nucleotide misincorporations and small insertion/deletion loops (IDLs) created by the DNA polymerase to counteract replication errors, improving replication fidelity. MMR proteins also activate cell cycle checkpoints and cell death pathways in response to certain DNA lesions. In developing tumors, loss of this DNA damage response may allow cells to tolerate excessive DNA damage, further contributing to the risk for tumorigenesis (Gupta and Heinen, 2019). Our results showed that 4 genes that play a major role in the MMR pathway, namely Exonuclease 1 (EXO1), Proliferating Cell Nuclear Antigen (PCNA), Replication Factor C Subunit 2 (RFC2), and Replication Factor C Subunit 3 (RFC3), were differentially expressed (Appendix Table 3.21).

These findings suggest a disruption of the re-synthesis step in the MMR pathway, which is carried out by DNA polymerase, with PCNA acting as the processivity factor for DNA polymerase (Gupta and Heinen, 2019; Mastrocola and Heinen, 2010) and the RFCs responsible for loading PCNA at the double-strand/single-strand DNA junction, allowing DNA polymerase to access the replication site (Ogi et al., 2010). Moreover, the downregulation of EXO1 also potentially disrupts the MMR pathway, as EXO1 is the only exonuclease known to participate in the excision step of MMR and is required for in vitro MMR reactions (Li et al., 2016). Also, the inactivation of PCNA possibly affected other DNA repair processes such as DNA polymerase proofreading and nucleotide excision repair (NER).

3.4.2.2.3. DNA polymerase proofreading

To maintain the integrity of the DNA helix in the presence of DNA damage caused by a multitude of endogenous and exogenous genotoxic agents, a network of defense mechanisms has evolved, including accurate and efficient DNA repair processes. Besides the MMR pathway, one of these processes is DNA polymerase proofreading, facilitated by DNA polymerase Delta (Pol δ) and DNA polymerase Epsilon (Pol ε) (Overmeer et al., 2010). Our data showed that 4 genes involved in DNA polymerase proofreading in response to DNA damage were differentially expressed at different oxygen levels and in different culture media in MCF7 (Appendix Table 3.21). These genes are DNA Polymerase Delta 2, Accessory Subunit (POLD2), DNA Polymerase Delta 3, Accessory Subunit (POLD3), DNA polymerase epsilon, catalytic subunit (POLE), and DNA polymerase epsilon 2, accessory subunit (POLE2).

The downregulation of POLE, POLE2, POLD2, and POLD3 suppressed the DNA polymerase proofreading activity that is important for DNA repair in response to DNA damage. This potentially affects a wider range of cellular activities such as BER, NER, MMR, DSB repair, cell cycle checkpoint regulation, and propagation of chromatin modification states (Rayner et al., 2016). BER, which uses specific DNA glycosylases, is the primary repair pathway for oxidized DNA damage: it excises the damaged bases, followed by cleavage at the abasic site by AP-endonuclease 1 and gap repair involving Pol β (Li et al., 2016), and this process could potentially be affected by the downregulation of these genes. Moreover, reduced levels of POLD3 specifically can cause telomere shortening and loss and chromosome breaks, compromising meiosis and possibly leading to meiotic defects and increased DNA damage, which in turn induces apoptotic cell death (Zhou et al., 2018).

3.4.2.2.4. MCM proteins

MCM is composed of six subunits (MCM2-7) and functions both as a component of the pre-replicative complex (pre-RC) at origins during the initiation of replication and as a helicase at replication forks during elongation (Yun et al., 2012). Expression levels of MCM subunits are down-regulated in differentiated somatic cells, in keeping with their function in cell proliferation. On the other hand, enhanced expression of MCM proteins has been reported in many patient-derived cancer cells. The roles of the MCM proteins in cancer progression have been linked to at least two cancer hallmarks, enhanced proliferation and regulation of replicative stress (Seo and Kang, 2018). We found that 7 genes coding for Minichromosome Maintenance (MCM) proteins (MCM2, MCM3, MCM4, MCM5, MCM6, MCM7, and Minichromosome Maintenance 10 Replication Initiation Factor (MCM10)), as well as GINS complex subunits 1 and 3 (GINS1 and GINS3) (Appendix Table 3.22), were differentially expressed in response to different oxygen levels and culture media in MCF7 and to different oxygen levels in Plasmax in PC3. This is also supported by the enrichment of the cellular component “MCM complex” in MCF7 in response to different oxygen levels (Tables 3.7 and 3.8) and culture media (Tables 3.11 and 3.12), as well as in PC3 in response to different oxygen levels in Plasmax (Table 3.10).

The changes in expression of MCM2-7 and MCM10 decrease the production of MCM proteins, leading to DNA damage and genome instability, which activates the DNA damage checkpoint. In this case, the absence of MCM2, MCM3, and MCM4 proteins would impair DNA damage recognition, whereas under normal conditions these proteins are phosphorylated by the ataxia telangiectasia mutated (ATM) and ATM and Rad3-related (ATR) kinases during the recognition of DNA damage. Moreover, phosphorylation of MCM4, which is regulated by ubiquitination, inactivates the helicase activity of the MCM complex and can directly stop DNA replication (Yun et al., 2012). Importantly, in the absence of MCM10, the interaction of CDC45 and GINS with MCM2–7 is delayed and the interaction of RPA with origin DNA is reduced during the S phase in vivo, leading to a growth defect as a result of defective DNA replication. In addition, MCM2 phosphorylation by the Dbf4-dependent kinase Cdc7 (DDK) is reduced in the absence of MCM10 in vivo and increased in the presence of MCM10 in vitro, supporting the role of MCM10 in stimulating MCM2 phosphorylation by DDK during the S phase (Perez-Arnaiz et al., 2016).

3.4.2.3. Viral infections potentially contribute to carcinogenesis in cell lines

Oncogenicity refers to the capacity of viruses to cause cancer, and the viruses associated with malignancies are known as tumor viruses. The known tumor viruses are Hepatitis B virus (HBV), Hepatitis C virus (HCV), Epstein-Barr virus (EBV), human herpesvirus 8 (HHV-8), Human papillomavirus (HPV), and Human T lymphotropic virus type I (HTLV-I) (Ahmadi Ghezeldasht et al., 2013). Measles virus (MV) is not considered a tumor virus, but it is a highly infectious virus that is transmitted through aerosols and droplets and causes measles. The disease is associated with transient immune suppression and an increased risk of childhood morbidity and mortality for a period of more than 2 years (Laksono et al., 2018).

Based on our findings, viral carcinogenesis, HTLV-I infection, measles, herpes simplex infection, Epstein-Barr virus infection, hepatitis B, hepatitis C, and influenza A were significantly enriched among the culture conditions (Tables 3.7, 3.8, 3.10, and 3.11; Appendix Tables 3.18 and 3.20). This is also supported by the enrichment of several biological processes related to viral carcinogenesis, such as response to the virus, defense response to the virus, response to type I interferon, type I interferon signaling pathway, interferon-gamma-mediated signaling pathway, and negative regulation of viral genome replication. Interestingly, several genes involved in the cell cycle and DNA replication were also shown to be involved in various viral infections and viral carcinogenesis (e.g., CCNA2, CDC20, CDK1, CDK2, CDKN1A, CDKN2B, TP53, PCNA, POLE, POLE2, POLD2, and POLD3). Additional genes that are differentially expressed in response to culture conditions and involved in various viral infections are listed in Appendix Table 3.23.

The DEGs involved in viral carcinogenesis and specific viral infections are consistent with the view that viruses can function as carcinogenic agents, utilizing a variety of carcinogenic mechanisms to transform human cells in conjunction with additional carcinogenic factors (Chen et al., 2014). It has been reported that chronic infection with HBV is an important risk factor for the development of hepatocellular carcinoma, a malignant tumor (Elgui de Oliveira, 2007). Another study found an association between the presence of MV and classical Hodgkin lymphoma (cHL), in which apoptosis modulation is a possible mechanism of action for MV in cancers (Benharroch et al., 2014). In addition, oxygen tension was reported as one of the mechanisms that possibly induce the activation of genes involved in viral carcinogenesis (Santra et al., 2019), several of which are also involved in the cell cycle, DNA replication, and viral infections. Vassilaki and Frakolaki (2017) reported that oxygen tension can exert a significant effect on viral propagation in vitro and possibly in vivo: hypoxia restricts the replication of viruses that naturally infect tissues exposed to ambient oxygen and promotes the growth of viruses that naturally target tissues exposed to low oxygen.

3.4.2.4. Biosynthesis of antibiotics associated with various cancer metabolism

The enrichment results showed that biosynthesis of antibiotics was significantly enriched in MCF7 (5P vs 5D) and PC3 (5D vs 18P) (Appendix Figures 3.9 and 3.10), involving 18 genes with higher expression in DMEM compared to Plasmax. These genes are lactate dehydrogenase A (LDHA), mevalonate diphosphate decarboxylase (MVD), phosphofructokinase, platelet (PFKP), hexokinase 2 (HK2), fructose-bisphosphatase 1 (FBP1), phosphoglycerate mutase 1 (PGAM1), ATP citrate lyase (ACLY), adenylate kinase 4 (AK4), isocitrate dehydrogenase 3 (NAD(+)) alpha (IDH3A), farnesyl-diphosphate farnesyltransferase 1 (FDFT1), pyruvate kinase, muscle (PKM), squalene epoxidase (SQLE), isocitrate dehydrogenase (NADP(+)) 2, mitochondrial (IDH2), ornithine aminotransferase (OAT), phosphoribosyl aminoimidazole carboxylase/phosphoribosyl aminoimidazole succinocarboxamide synthase (PAICS), phosphoribosyl pyrophosphate synthetase 2 (PRPS2), enolase 1 (ENO1), and fumarate hydratase (FH).

A previous study showed that the enriched pathway hsa01130 (biosynthesis of antibiotics) was significantly associated with disease resistance, an important part of immune function. This suggests that hsa01130 was associated with immune function and affected the occurrence and development of coronary heart disease by causing the build-up of plaque through the antigen-antibody reaction (Tang et al., 2018). There is little evidence or information on the biosynthesis of antibiotics involving all of these genes in cancer cells. However, their functions were found to be associated with various metabolic processes in cancer, such as glycolysis, fatty acid synthesis, the mevalonate pathway, mitochondrial function, and nucleotide biosynthesis. The involvement of these genes in these metabolic processes is discussed in more detail below.

3.4.2.4.1. Glycolysis pathway

Seven genes involved in the biosynthesis of antibiotics, PKM, LDHA, PFKP, ENO1, FBP1, HK2, and PGAM1, were also previously reported to be involved in the glycolysis pathway. PKM is an enzyme that is involved in the final step of glycolysis and catalyzes the formation of ATP from ADP as phosphoenolpyruvate is converted to pyruvate (Li et al., 2020). LDHA plays an important role in glycolysis by converting pyruvate to lactate and NADH to NAD+, and it also functions as a single‐stranded DNA binding protein (SSB) (Feng et al., 2018). Elevated levels of LDHA in tumor cells are considered a metabolic adaptation to anaerobic glycolysis, in which more glucose is consumed and more lactate is formed. An increase in LDHA results in increased expression of the proteins involved in apoptosis inhibition, Bcl-2 and Bcl-XL, which both inhibit mitochondrial cytochrome c release, and at the same time decreases the Bax level, preventing apoptotic cell death (Urbańska and Orzechowski, 2019).

PFKP is one of the rate-limiting glycolytic enzymes and irreversibly catalyzes the formation of fructose 1,6-bisphosphate and ADP from fructose 6-phosphate and ATP (Chen et al., 2018; Lee et al., 2018). ENO-α (ENO1) is one of the isoforms of enolase (ENO), the glycolytic enzymes responsible for the conversion of 2-phospho-D-glycerate to phosphoenolpyruvate during glycolysis, which supports cancer cell proliferation and metastasis; its upregulation in several tumor tissues plays a pivotal role in tumorigenesis and cancer metastasis (Sun et al., 2019; Zhu et al., 2015). FBP1 functions both as an enzyme of gluconeogenesis in the maintenance of low glucose metabolism and as a tumor suppressor in the maintenance of both a low ROS level and high self-renewal, processes relevant to tumorigenesis (Dai et al., 2017). The HK2 enzyme catalyzes the first step of glycolysis by phosphorylating glucose to glucose-6-phosphate (G-6-P) and promotes tumor progression in the glycolytic pathway of cancer cells (Bao et al., 2018). PGAM1 is expressed at various levels within normal tissues during cellular differentiation or transformation (Ohba et al., 2020) and is commonly upregulated in human cancers due to loss of TP53 (Hitosugi et al., 2012). Inhibition of PGAM1 results in increased 3-phosphoglycerate (3-PG) and decreased 2-phosphoglycerate (2-PG) levels in cancer cells, leading to significantly decreased glycolysis, pentose phosphate pathway flux, and biosynthesis, as well as attenuated cell proliferation and tumor growth (Hitosugi et al., 2012).

3.4.2.4.2. ACLY function in glucose metabolism, fatty acid (FA) synthesis, and mevalonate pathways

ACLY is a cross-link between glucose metabolism, the fatty acid (FA) synthesis and mevalonate pathways, and histone acetylation, which is required for DNA transcription and replication. ACLY has been found to be overexpressed in many aggressive cancers such as lung, prostate, bladder, breast, liver, stomach, and colon tumors (Chypre et al., 2012; Icard et al., 2020). ACLY is a cytosolic enzyme responsible for the synthesis of cytoplasmic acetyl-CoA and oxaloacetate from citrate and CoA, with simultaneous hydrolysis of ATP to ADP and phosphate. Acetyl-CoA is used to produce the long-chain fatty acid palmitate in the FA synthesis pathway and is also a precursor for the mevalonate pathway, which synthesizes the farnesyl-pyrophosphate (FPP) involved in cholesterol biosynthesis as well as geranylgeranyl-pyrophosphate (GG-PP) (Chypre et al., 2012). The upregulation of ACLY in DMEM at 5% O2 suggests that both the fatty acid and mevalonate pathways were stimulated. Moreover, the upregulation of ACLY decreases cytosolic citrate, stimulating glucose consumption, glycolysis, oncogenic signaling, and proliferative pathways, and promoting many aspects of cancer growth. Its overexpression also correlates with poor differentiation and prognosis (Icard et al., 2020).

3.4.2.4.3. FH and IDH play role in mitochondria dysfunction

Research on the role of mitochondrial dysfunction in cancer has led to the discovery that mutations in genes encoding mitochondrial enzymes, including fumarate hydratase (FH) and isocitrate dehydrogenase (IDH), cause hereditary and sporadic forms of cancer (Schmidt et al., 2020). FH, or fumarase, is a nuclear-encoded mitochondrial enzyme that takes part in the TCA cycle, catalyzing the reversible conversion between fumarate and L-malate (Ibarrola et al., 2018). Loss of FH has been associated with increased production of ROS and leads to the truncation of the TCA cycle, causing the accumulation of fumarate, which enhances oxidative stress (Ibarrola et al., 2018; Schmidt et al., 2020).

IDH is one of the primary enzymes in the TCA cycle and consists of three self-regulating enzymes (IDH1, IDH2, IDH3) (Yang et al., 2014). IDH1 and IDH2 are nicotinamide adenine dinucleotide phosphate (NADP)-dependent enzymes that catalyze the oxidative decarboxylation of isocitrate to alpha-ketoglutarate (α-KG) (Upadhyay et al., 2017). IDH2 is also recognized as a key generator of NADPH, which is important in mediating the redox status and protecting cells from oxidative stress-induced injury, thereby decreasing cellular vulnerability (Choi et al., 2018; Kim et al., 2020). IDH2 is also crucial for glutathione (GSH) turnover and defense against ROS (Kim et al., 2020). IDH3 catalyzes the same reaction in the mitochondria, but in a NAD-dependent fashion (Upadhyay et al., 2017). IDH3 catalyzes the decarboxylation of isocitrate into α-KG and reduces NAD+ to NADH via its catalytic subunit, IDH3α, which reduces the damage caused by ROS. IDH3α was found to be upregulated after exposure to H2O2, indicating that ROS can induce the up-regulation of IDH3α (Yang et al., 2014).

These results suggest that the upregulation of FH prevented the increased production of ROS and the truncation of the TCA cycle, showing its important role in controlling oxidative stress. In addition, IDH2 and IDH3 were also upregulated, which contributed to the defense against ROS and reduced the damage caused by ROS. The upregulation of IDH3 indicates the presence of a low level of ROS at 5% O2 in DMEM, which is in line with the upregulation of FBP1 at a low ROS level.

3.4.2.4.4. MVD, FDFT, and SQLE involved in cholesterol biosynthesis

FDFT1, a gene that encodes squalene synthase (SQS or FDFT), is a key regulator of the cholesterol biosynthesis pathway, directing intermediates produced from mevalonate to either the nonsterol pathway or the cholesterol synthetic pathway (Colak et al., 2018; Park et al., 2014). FDFT catalyzes the formation of squalene from two molecules of farnesyl diphosphate (FPP), the first step in the sterol biosynthesis pathway (Colak et al., 2018). Squalene epoxidase (SE), encoded by the human SQLE gene, is present at very low levels in most non-cholesterolemic mammalian tissues, while it is highly expressed in the liver and neural tissue. SE catalyzes the first oxygenation step of cholesterol biosynthesis, the conversion of squalene to 2,3-oxidosqualene, which is then cyclized to form either lanosterol or cycloartenol. If the squalene oxygenation step catalyzed by SE is perturbed, the synthesis of sterols and cell membranes, or even cell growth, will subsequently be affected (Cirmena et al., 2018). These results suggest that DMEM enhanced the production of acetyl-CoA, leading to its accumulation in the cytosol as a result of incomplete mitochondrial oxidation. This acetyl-CoA then stimulated MVA metabolism and cholesterol biosynthesis through the upregulation of MVD, FDFT1, and SQLE. MVA metabolism and cholesterol biosynthesis are important for supporting tumor cell growth.

3.4.2.4.5. Biosynthesis of nucleotides

Based on our results, 4 DEGs (PAICS, PRPS2, AK4, and OAT) function in the biosynthesis of nucleotides. Cancer cells rely on the de novo purine and pyrimidine biosynthetic pathways to meet their nucleotide needs, using 5-phosphoribosyl-1-pyrophosphate (PRPP) to generate inosine 5′-monophosphate (Berg et al., 2002; Chakravarthi et al., 2018). This pathway involves PAICS, a de novo purine metabolic enzyme that uses 5-aminoimidazole ribonucleotide (AIR) to generate N-succinocarboxamide-5-aminoimidazole ribonucleotide, which is then converted to aminoimidazole-4-carboxamide ribonucleotide by adenylosuccinate lyase (Chakravarthi et al., 2018; Zhou et al., 2019). PRPS2, a member of the phosphoribosyl pyrophosphate synthetase (PRPS) family, acts as a molecular rheostat for the nucleotide biosynthesis pathway in cancer cells, controlling the flow of ribose-5-phosphate from the pentose phosphate pathway into the nucleotide biosynthetic precursor PRPP (Cunningham et al., 2014).

In line with the increased production of nucleotides, cancer cells subsequently activated nucleotide metabolism through the upregulation of AK4, a unique member of the adenylate kinase (AK) family involved in energy metabolism and homeostasis of cellular adenine nucleotide composition (Liu et al., 2009). AK4 has been shown to increase at the protein level in cultured cells exposed to hypoxia (Liu et al., 2009), in line with our finding that AK4 was upregulated at 5% O2 in DMEM. Besides serving as building blocks in nucleotide biosynthesis, non-essential amino acids are also used as substrates for OAT. OAT acts on ornithine, a non-essential amino acid, to yield glutamate semialdehyde in mitochondria (Sivashanmugam et al., 2017). OAT is also involved in the ultimate formation of proline from ornithine and is needed for spindle formation, at least in rapidly proliferating cancer cells (Liu et al., 2019).

These results suggest that the upregulation of PAICS and PRPS2 at low oxygen levels in DMEM increased the activity of nucleotide biosynthesis, including purine and pyrimidine biosynthesis, as a source of the cells' nucleotide needs. The increase in nucleotide biosynthesis activated nucleotide metabolism through the upregulation of AK4. The upregulation of OAT also suggests that amino acids were used not only for the biosynthesis of nucleotides but also for proline biosynthesis, both of which are important for cancer cell proliferation in MCF7.
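The assignment of the 18 "biosynthesis of antibiotics" DEGs to the metabolic processes discussed in Sections 3.4.2.4.1 to 3.4.2.4.5 can be tallied with a short script. The following is a minimal sketch rather than part of the analysis pipeline: the group labels and variable names are illustrative assumptions, and the gene symbols are taken from the text above.

```python
# Illustrative grouping of the 18 DEGs (symbols from the text) by the
# metabolic process under which each gene is discussed in this section.
# The group labels are hypothetical names chosen for this sketch.
groups = {
    "glycolysis": ["PKM", "LDHA", "PFKP", "ENO1", "FBP1", "HK2", "PGAM1"],
    "acetyl-CoA supply (ACLY)": ["ACLY"],
    "mitochondrial function": ["FH", "IDH2", "IDH3A"],
    "cholesterol biosynthesis": ["MVD", "FDFT1", "SQLE"],
    "nucleotide biosynthesis": ["PAICS", "PRPS2", "AK4", "OAT"],
}

# Flatten the groups and check that no gene is assigned twice.
all_genes = [gene for genes in groups.values() for gene in genes]
duplicates = {gene for gene in all_genes if all_genes.count(gene) > 1}

# Count genes per group and in total.
counts = {name: len(genes) for name, genes in groups.items()}
total = sum(counts.values())

for name, n in counts.items():
    print(f"{name}: {n}")
print(f"total: {total}, duplicates: {sorted(duplicates)}")
```

Under this grouping, the five categories account for all 18 genes with no gene counted twice, which matches the number of genes reported for the enriched pathway.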

3.4.2.5. Glycolysis/Gluconeogenesis

Cancers outgrow their supply through continuous proliferation and a high rate of nutrient consumption, giving rise to gradients of O2, glucose, other nutrients, and tumor-cell-derived metabolites such as lactate (Grasmann et al., 2019). Gluconeogenesis, also known as de novo glucose synthesis, is the synthesis of glucose from non-carbohydrate carbon substrates (e.g., glycerol, lactate, pyruvate, and glucogenic amino acids) (Owczarek et al., 2020; Wang et al., 2019) and involves many glycolytic reactions run in the reverse direction (Grasmann et al., 2019). The liver is the main organ responsible for this process, followed by the kidneys (Owczarek et al., 2020). This anabolic pathway plays an equally important role in controlling aerobic glycolysis in cancer cells (Wang et al., 2019).

Based on our results, glycolysis/gluconeogenesis was affected in the 5D vs 18D comparison in MCF7 and in the 5D vs 18P comparison in PC3 (Appendix Figures 3.13 and 3.14). Glycolysis/gluconeogenesis and canonical glycolysis were suppressed in MCF7 at 18% O2 in DMEM (5D vs 18D) and at 5% O2 in Plasmax (5D vs 5P). The glycolytic process was suppressed at 18% O2 in DMEM in MCF7 (5D vs 18D), whereas gluconeogenesis was suppressed at 5% O2 in DMEM in PC3 (5D vs 18P). In total, 15 genes involved in the glycolysis/gluconeogenesis pathway, with specific involvement in 3 biological processes and 1 KEGG pathway, were differentially expressed (Appendix Table 3.24). These genes are PKM, LDHA, PGAM1, HK2, ENO1, PFKP, triosephosphate isomerase 1 (TPI1), 6-phosphofructo-2-kinase/fructose-2,6-biphosphatase 3 (PFKFB3), phosphoglucomutase 1 (PGM1), dehydrogenase E1 and transketolase domain containing 1 (DHTKD1), glyceraldehyde-3-phosphate dehydrogenase (GAPDH), phosphofructokinase, liver type (PFKL), hexokinase 1 (HK1), phosphoenolpyruvate carboxykinase 2, mitochondrial (PCK2), and phosphoglycerate kinase 1 (PGK1).
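The overlap between the glycolysis/gluconeogenesis DEGs and the "biosynthesis of antibiotics" DEGs of Section 3.4.2.4 can be checked with a simple set comparison. This is a minimal sketch that compares gene symbols only, as listed in the text; it does not account for the different pairwise comparisons in which each list was obtained, and Appendix Table 3.24 remains the authoritative source. The variable names are illustrative assumptions.

```python
# Gene symbols as listed in the text for each enriched term.
antibiotics_degs = {
    "LDHA", "MVD", "PFKP", "HK2", "FBP1", "PGAM1", "ACLY", "AK4", "IDH3A",
    "FDFT1", "PKM", "SQLE", "IDH2", "OAT", "PAICS", "PRPS2", "ENO1", "FH",
}
glycolysis_degs = {
    "PKM", "LDHA", "PGAM1", "HK2", "ENO1", "PFKP", "TPI1", "PFKFB3",
    "PGM1", "DHTKD1", "GAPDH", "PFKL", "HK1", "PCK2", "PGK1",
}

# Symbols present in both lists, and symbols unique to the glycolysis list.
shared = sorted(antibiotics_degs & glycolysis_degs)
glycolysis_only = sorted(glycolysis_degs - antibiotics_degs)

print("shared:", shared)
print("glycolysis/gluconeogenesis only:", glycolysis_only)
```

With the symbols as listed, six genes (ENO1, HK2, LDHA, PFKP, PGAM1, and PKM) appear in both lists. FBP1, which is discussed under glycolysis in Section 3.4.2.4.1, is not among the 15 genes listed here.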

These results suggest that gluconeogenesis is disrupted, with glucose production decreasing while glucose consumption increases in cancer cells, potentially leading to cell death (Yadav et al., 2017). The suppression of gluconeogenesis is possibly facilitated by AMPK via the down-regulation of Forkhead box protein O1 (FoxO1) target genes through increased FoxO1 phosphorylation (Yadav et al., 2017). Another mechanism involved in the suppression of gluconeogenesis is the suppression of TGF-β activity: TGF-β activates the TGF-β receptor I and II kinases and the Smad transcription factor network and interacts with FoxO1, LKB1 (also known as Serine/Threonine Kinase 11, STK11), and AMPK to regulate metabolic and nutrient sensory pathways and glucose metabolism (Yadav et al., 2017). This is in line with our finding that FoxO signaling pathways in MCF7 were also affected by different oxygen levels in DMEM, with FOXO6 and 12 other genes differentially expressed (Appendix Table 3.1). Among these DEGs, SMAD family member 3 (SMAD3) is one potential FoxO DNA-binding partner activated by the transforming growth factor-β (TGF-β) pathway (Bollinger et al., 2014). The inactivation of hypoxia-inducible factor-1 (HIF-1), the regulator of glucose metabolism, can also contribute to gluconeogenesis suppression by shifting energy metabolism from oxidative phosphorylation to anaerobic glycolysis (Owczarek et al., 2020). In addition, low expression of LDHA lowers the oxidation of exogenous lactate, a mechanism by which cancer cells use nutrients other than glucose, sparing glucose in favor of lactate oxidation (Grasmann et al., 2019).

Overall, the most affected KEGG pathways show that the changes in gene expression in response to culture conditions in both cell lines are strongly associated with the regulation of the cell cycle, which potentially leads to the activation of specific checkpoints by DNA damage. This DNA damage delays the progression of the cell cycle to provide time to repair DNA or to complete replication. The DNA damage potentially induced a stress signal via p53-mediated cell death, leading to cell cycle arrest and apoptosis and resulting in genome instability. The DNA damage then modulates the DNA MMR pathway, which is facilitated by DNA polymerase, and affects other DNA repair processes such as DNA polymerase proofreading and NER. The mechanism of DNA repair is also correlated with the MCM protein family as a modulator of DNA damage recognition and DNA replication, preventing growth defects that result from defective DNA replication. The enriched GO terms and KEGG pathways associated with viral carcinogenesis and viral infections point to the mechanisms of viral oncogenicity, affected by oxygen levels, through which viruses can act as carcinogenic agents contributing to cancer development. The enrichment of biosynthesis of antibiotics shows the importance of the DEGs involved in this pathway for various metabolic processes in cancer, such as glycolysis, fatty acid synthesis, the mevalonate pathway, mitochondrial function, and nucleotide biosynthesis. Glycolysis/gluconeogenesis was also disrupted, possibly through AMPK, FoxO signaling pathways, and TGF-β activity.

3.4.3. Two biological processes associated with response to hypoxia were affected by culture conditions in MCF7 and PC3

Based on the comparison of enriched biological process GO terms, “cellular response to hypoxia” was affected in MCF7 by the change of oxygen level in DMEM and the change of medium at 5% O2, whereas in PC3, “cellular response to hypoxia” and “response to hypoxia” were affected by the change of oxygen level in Plasmax and the change of culture medium at 18% O2, as well as by 5% O2 in DMEM vs 18% O2 in Plasmax. We identified a total of 27 genes in cellular response to hypoxia and 29 genes in response to hypoxia showing differential expression in MCF7 and PC3, with 3 genes, vascular endothelial growth factor A (VEGFA), BCL2 interacting protein 3 (BNIP3), and heme oxygenase 1 (HMOX1), shared between the two biological processes (Appendix Table 3.25). VEGFA is one of the HIF targets and facilitates the passage of metastatic cancer cells through the vessel wall (Jing et al., 2019). BNIP3 is another HIF target whose expression is induced by hypoxia independently of the medium and shows a significant decrease in Plasmax compared to DMEM-F12 (Voorde et al., 2019). HMOX1 plays a role in mediating a potent antioxidant response in fumarate hydratase (FH)-deficient cells, which is associated with the role of mitochondrial dysfunction in cancer (Schmidt et al., 2020). Interestingly, 3 genes reported above as differentially expressed in MCF7 and PC3, CCNB1, CCNA2, and TP53, which are involved in the cell cycle, DNA replication, the p53 signaling pathway, and viral carcinogenesis (specifically TP53 and CCNA2), were also involved in “cellular response to hypoxia”.

These findings suggest that the oxygen levels experienced by MCF7 and PC3 cells growing in DMEM and Plasmax cause the cells to respond differently to decreased oxygenation, affecting genes important for the cellular response to hypoxia, the cell cycle, DNA replication, and the p53 signaling pathway, potentially leading to cell death or cell survival. Exposure to a short period of hypoxia allows cells to survive through the regulation of apoptosis and autophagy and through metabolic adaptation, decreasing oxidative metabolism and ROS production (Muz et al., 2015). However, exposure to hyperoxia or longer hypoxia is strongly associated with an increased level of ROS production, affecting the cell cycle and cell viability. ROS can induce a high frequency of DNA breaks and the accumulation of DNA replication errors, and hypoxia hampers DNA repair systems including homologous recombination and mismatch repair, leading to genetic instability and mutagenesis (De Bels et al., 2020; Muz et al., 2015). This supports our earlier results, in which the most affected biological processes and KEGG pathways were associated with the cell cycle and with DNA repair mechanisms in response to DNA damage, possibly because of the increased level of ROS production in response to oxygen level and culture medium changes.

3.4.4. Concluding statements and future perspectives

In this study, we employed RNA-seq based gene expression profiling to analyze the effects of oxygen level and culture medium on cells, using two human cancer cell lines, PC3 and MCF7, as the models. Our results showed that, overall, the effect of oxygen level and culture medium is highly variable in a cell type-specific fashion. In other words, different cell types may respond differently to a change in oxygen level, depending also on the type of culture medium. Of the two cell lines, MCF7 was more affected by culture conditions than PC3 overall, with more genes showing differential expression and more biological processes and pathways impacted. More specifically, both cell lines showed the largest impact for medium change under 18% O2; while MCF7 showed a large impact of oxygen level change in DMEM, PC3 showed a larger response to the oxygen level in Plasmax. The most impacted biological processes and pathways include the cell cycle, DNA replication, and p53 signaling pathways, which are strongly associated with hypoxia as a characteristic of the microenvironment of most cancer cells, including MCF7 and PC3, potentially as a result of an increased level of ROS production. Follow-up studies may be designed to validate the critical parts of the observed gene expression alteration patterns and cellular activity using quantitative PCR and cell proliferation assays.

It can be concluded from our study that both oxygen level and culture medium, as key parts of the cell culture conditions, affect the gene expression of cells, involving various metabolic and signaling pathways in a cell line-specific pattern. It can be inferred from our study that the results of a specific treatment obtained using different culture conditions or different cell lines may not be comparable. Furthermore, the use of physiological conditions (i.e., 5% oxygen and Plasmax) over the current standard cell culture condition (i.e., 18% oxygen in DMEM) may be preferable for many reasons.

References

Ackermann, T., and Tardito, S. (2019). Cell Culture Medium Formulation and Its Implications in

Cancer Metabolism. Trends in Cancer, 5(6), 329–332.

https://doi.org/10.1016/j.trecan.2019.05.004

Aguilera, L., Giménez, R., Badia, J., Aguilar, J., and Baldoma, L. (2009). NAD+-dependent

post-translational modification of Escherichia coli glyceraldehyde-3-phosphate

dehydrogenase. International Microbiology, 12(3), 187–192.

https://doi.org/10.2436/20.1501.01.97

Ahmadi Ghezeldasht, S., Shirdel, A., Ali Assarehzadegan, M., Hassannia, T., Rahimi, H., Miri,

R., and Rahim Rezaee, S. A. (2013). Human T lymphotropic virus type I (HTLV-I)

oncogenesis: Molecular aspects of virus and host interactions in pathogenesis of adult T cell

leukemia/lymphoma (ATL). Iranian Journal of Basic Medical Sciences, 16(3), 179–195.

https://doi.org/10.22038/ijbms.2013.730

Al-Ani, A., Toms, D., Kondro, D., Thundathil, J., Yu, Y., and Ungrin, M. (2018). Oxygenation

in cell culture: Critical parameters for reproducibility are routinely not reported. PLoS ONE,

13(10), 1–13. https://doi.org/10.1371/journal.pone.0204269

Altschul, S. F., Gish, W., Miller, W., Myers, E. W., and Lipman, D. J. (1990). Basic Local

Alignment Search Tool. Journal of Molecular Biology, 215(3), 403–410.

Amaral, L., Martins, A., Spengler, G., and Molnar, J. (2014). Efflux pumps of Gram-negative

bacteria: What they do, how they do it, with what and how to deal with them. In Frontiers

in Pharmacology: Vol. 4 JAN. Frontiers Research Foundation.

https://doi.org/10.3389/fphar.2013.00168

Anders, S., Pyl, P. T., and Huber, W. (2015). HTSeq-A Python framework to work with high-

throughput sequencing data. Bioinformatics, 31(2), 166–169.

https://doi.org/10.1093/bioinformatics/btu638

Andrews, S. (2010) FastQC: A Quality Control Tool for High Throughput Sequence Data.

Available at https://www.bioinformatics.babraham.ac.uk/projects/fastqc/

Arsham, A. M., Howell, J. J., and Simon, M. C. (2003). A novel hypoxia-inducible factor-

independent hypoxic response regulating mammalian target of rapamycin and its targets.

Journal of Biological Chemistry, 278(32), 29655–29660.

https://doi.org/10.1074/jbc.M212770200

Asay, B., Tebaykina, Z., Vlasova, A., and Wen, M. (2008). Membrane Composition as a Factor

in Susceptibility of Escherichia coli C29 to Thermal and Non-thermal Microwave

Radiation. Journal of Experimental Microbiology and Immunology, 12(April), 7–13.

Ashburner, M., Ball, C. A., Blake, J. A., Botstein, D., Butler, H., Cherry, J. M., Davis, A. P.,

Dolinski, K., Dwight, S. S., Eppig, J. T., Harris, M. A., Hill, D. P., Issel-Tarver, L.,

Kasarskis, A., Lewis, S., Matese, J. C., Richardson, J. E., Ringwald, M., Rubin, G. M.,

and Sherlock, G. (2000). Gene Ontology: tool for the unification of biology. Nature

Genetics, 25(1), 25–29. https://doi.org/10.1038/75556

Bae, J. S., Kim, S. M., and Lee, H. (2017). The Hippo signaling pathway provides novel anti-

cancer drug targets. Oncotarget, 8(9), 16084–16098.

https://doi.org/10.18632/oncotarget.14306

Bainbridge, M. N., Warren, R. L., Hirst, M., Romanuik, T., Zeng, T., Go, A., Delaney, A.,

Griffith, M., Hickenbotham, M., Magrini, V., Mardis, E. R., Sadar, M. D., Siddiqui, A. S.,

Marra, M. A., and Jones, S. J. M. (2006). Analysis of the prostate cancer cell line LNCaP

transcriptome using a sequencing-by-synthesis approach. BMC Genomics, 7, 1–11.

https://doi.org/10.1186/1471-2164-7-246

Bao, F., Yang, K., Wu, C., Gao, S., Wang, P., Chen, L., and Li, H. (2018). New natural

inhibitors of hexokinase 2 (HK2): Steroids from Ganoderma sinense. Fitoterapia,

125(January), 123–129. https://doi.org/10.1016/j.fitote.2018.01.001

Bathke, J., Konzer, A., Remes, B., McIntosh, M., and Klug, G. (2019). Comparative analyses of

the variation of the transcriptome and proteome of Rhodobacter sphaeroides throughout

growth. BMC Genomics, 20(1), 1–13. https://doi.org/10.1186/s12864-019-5749-3

Berg, J. M., Tymoczko, J. L., and Stryer, L. (2002). Chapter 25, Nucleotide Biosynthesis. In:

Biochemistry. 5th edition. New York: W H Freeman; Available at

https://www.ncbi.nlm.nih.gov/books/NBK21216

Bernstein, H. D. (2011). The double life of a bacterial lipoprotein. Molecular Microbiology,

79(5), 1128–1131. https://doi.org/10.1111/j.1365-2958.2011.07538.x

Bolger, A. M., Lohse, M., and Usadel, B. (2014). Trimmomatic: A flexible trimmer for Illumina

sequence data. Bioinformatics, 30(15), 2114–2120.

https://doi.org/10.1093/bioinformatics/btu170

Borodina, T., Adjaye, J., and Sultan, M. (2011). A strand-specific library preparation protocol for

RNA sequencing. In Methods in Enzymology (1st ed., Vol. 500). Elsevier Inc.

https://doi.org/10.1016/B978-0-12-385118-5.00005-0

Bray, N. L., Pimentel, H., Melsted, P., and Pachter, L. (2016). Near-optimal probabilistic RNA-

seq quantification. Nature Biotechnology, 34(5), 525–527. https://doi.org/10.1038/nbt.3519

Busscher, H. J., and van der Mei, H. C. (2012). How do bacteria know they are on a surface and

regulate their response to an adhering state? PLoS Pathogens, 8(1), 1–3.

https://doi.org/10.1371/journal.ppat.1002440

Chakravarthi, B. V. S. K., Rodriguez Pena, M. D. C., Agarwal, S., Chandrashekar, D. S.,

Hodigere Balasubramanya, S. A., Jabboure, F. J., Matoso, A., Bivalacqua, T. J., Rezaei, K.,

Chaux, A., Grizzle, W. E., Sonpavde, G., Gordetsky, J., Netto, G. J., and Varambally, S.

(2018). A Role for De Novo Purine Metabolic Enzyme PAICS in Bladder Cancer

Progression. Neoplasia (United States), 20(9), 894–904.

https://doi.org/10.1016/j.neo.2018.07.006

Chen, G., Liu, H., Zhang, Y., Liang, J., Zhu, Y., Zhang, M., Yu, D., Wang, C., and Hou, J.

(2018). Silencing PFKP inhibits starvation-induced autophagy, glycolysis, and epithelial

mesenchymal transition in oral squamous cell carcinoma. Experimental Cell Research,

370(1), 46–57. https://doi.org/10.1016/j.yexcr.2018.06.007

Chen, W., Jia, Q., Song, Y., Fu, H., Wei, G., and Ni, T. (2017). Alternative Polyadenylation:

Methods, Findings, and Impacts.

Genomics, Proteomics and Bioinformatics, 15(5), 287–300.

https://doi.org/10.1016/j.gpb.2017.06.001

Chen, Y., Williams, V., Filippova, M., Filippov, V., and Duerksen-Hughes, P. (2014). Viral

carcinogenesis: Factors inducing DNA damage and virus integration. Cancers, 6(4), 2155–

2186. https://doi.org/10.3390/cancers6042155

Cheung, F., Haas, B. J., Goldberg, S. M. D., May, G. D., Xiao, Y., and Town, C. D. (2006).

Sequencing Medicago truncatula expressed sequenced tags using 454 Life Sciences

technology. BMC Genomics, 7, 1–10. https://doi.org/10.1186/1471-2164-7-272

Chhangawala, S., Rudy, G., Mason, C. E., and Rosenfeld, J. A. (2015). The impact of read length

on quantification of differentially expressed genes and splice junction detection. Genome

Biology, 16(1), 1–10. https://doi.org/10.1186/s13059-015-0697-y

Choi, S. jeong, Piao, S., Nagar, H., Jung, S. byel, Kim, S., Lee, I., Kim, S. min, Song, H. J., Shin,

N., Kim, D. W., Irani, K., Jeon, B. H., Park, J. W., and Kim, C. S. (2018). Isocitrate

dehydrogenase 2 deficiency induces endothelial inflammation via p66sh-mediated

mitochondrial oxidative stress. Biochemical and Biophysical Research Communications,

503(3), 1805–1811. https://doi.org/10.1016/j.bbrc.2018.07.117

Chon, J., Stover, P. J., and Field, M. S. (2017). Targeting nuclear

thymidylate biosynthesis. Molecular Aspects of Medicine, 53, 48–56.

https://doi.org/10.1016/j.mam.2016.11.005

Chypre, M., Zaidi, N., and Smans, K. (2012). ATP-citrate lyase: A mini-review. Biochemical

and Biophysical Research Communications, 422(1), 1–4.

https://doi.org/10.1016/j.bbrc.2012.04.144

Cirmena, G., Franceschelli, P., Isnaldi, E., Ferrando, L., De Mariano, M., Ballestrero, A., and

Zoppoli, G. (2018). Squalene epoxidase as a promising metabolic target in cancer treatment.

Cancer Letters, 425, 13–20. https://doi.org/10.1016/j.canlet.2018.03.034

Cock, P. J. A., Fields, C. J., Goto, N., Heuer, M. L., and Rice, P. M. (2009). The Sanger FASTQ

file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants.

Nucleic Acids Research, 38(6), 1767–1771. https://doi.org/10.1093/nar/gkp1137

Colak, Y., Coskunpinar, E. M., Senates, E., Oltulu, Y. M., Yaylim, I., Gomleksiz, O. K., Ozan

Tiryakioglu, N., Hasturk, B., Ekmekci, C. G., and Aydogan, H. Y. (2018). Assessment of

the rs2645424 C/T single nucleotide polymorphisms in the FDFT1 gene, hepatic

expression, and serum concentration of the FDFT in patients with nonalcoholic fatty liver

disease. Meta Gene, 18(June), 46–52. https://doi.org/10.1016/j.mgene.2018.07.006

Conesa, A., Götz, S., García-Gómez, J. M., Terol, J., Talón, M., and Robles, M. (2005).

Blast2GO: A universal tool for annotation, visualization and analysis in functional

genomics research. Bioinformatics, 21(18), 3674–3676.

https://doi.org/10.1093/bioinformatics/bti610

Conesa, A., Madrigal, P., Tarazona, S., Gomez-Cabrero, D., Cervera, A., McPherson, A.,

Szcześniak, M. W., Gaffney, D. J., Elo, L. L., Zhang, X., and Mortazavi, A. (2016). A

survey of best practices for RNA-seq data analysis. Genome Biology, 17(1), 1–19.

https://doi.org/10.1186/s13059-016-0881-8

Confer, A. W., and Ayalew, S. (2013). The OmpA family of proteins: Roles in bacterial

pathogenesis and immunity. Veterinary Microbiology, 163(3–4), 207–222.

https://doi.org/10.1016/j.vetmic.2012.08.019

Conforte, V. P., Malamud, F., Yaryura, P. M., Toum Terrones, L., Torres, P. S., De Pino, V.,

Chazarreta, C. N., Gudesblat, G. E., Castagnaro, A. P., R. Marano, M., and Vojnov, A. A.

(2019). The histone-like protein HupB influences biofilm formation and virulence in

Xanthomonas citri ssp. citri through the regulation of flagellar biosynthesis. Molecular

Plant Pathology, 20(4), 589–598. https://doi.org/10.1111/mpp.12777

Copty, A. B., Neve-Oz, Y., Barak, I., Golosovsky, M., and Davidov, D. (2006). Evidence for a

specific microwave radiation effect on the green fluorescent protein. Biophysical Journal,

91(4), 1413–1423. https://doi.org/10.1529/biophysj.106.084111

Crowley, T. E., and Kyte, J. (2014). Introduction to Enzymes Catalyzing Oxidation-Reductions

with the Coenzyme NAD(P). In: Experiments in the Purification and Characterization of

Enzymes. Academic Press, 1–9. https://doi.org/10.1016/B978-0-12-409544-1.02002-6

Cunningham, J. T., Moreno, M. V., Lodi, A., Ronen, S. M., and Ruggero, D. (2014). Protein and

nucleotide biosynthesis are coupled by a single rate-limiting enzyme, PRPS2, to drive

cancer. Cell, 157(5), 1088–1103. https://doi.org/10.1016/j.cell.2014.03.052

Dai, J., Ji, Y., Wang, W., Kim, D., Fai, L. Y., Wang, L., Luo, J., and Zhang, Z. (2017). Loss of

fructose-1,6-bisphosphatase induces glycolysis and promotes apoptosis resistance of cancer

stem-like cells: an important role in hexavalent chromium-induced carcinogenesis.

Toxicology and Applied Pharmacology, 331, 164–173.

https://doi.org/10.1016/j.taap.2017.06.014

De Bels, D., Tillmans, F., Corazza, F., Bizzari, M., Germonpre, P., Radermacher, P., Orman, K.

G., and Balestra, C. (2020). Hyperoxia alters ultrastructure and induces apoptosis in

leukemia cell lines. Biomolecules, 10(2), 1–14. https://doi.org/10.3390/biom10020282

de Marval, P. L. M., and Zhang, Y. (2011). The RP-Mdm2-p53 pathway and tumorigenesis.

Oncotarget, 2(3), 234–238. https://doi.org/10.18632/oncotarget.228

Dobin, A., Davis, C. A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., Batut, P., Chaisson,

M., and Gingeras, T. R. (2013). STAR: Ultrafast universal RNA-seq aligner.

Bioinformatics, 29(1), 15–21. https://doi.org/10.1093/bioinformatics/bts635

Donzelli, M., and Draetta, G. F. (2003). Regulating mammalian checkpoints through Cdc25

inactivation. EMBO Reports, 4(7), 671–677. https://doi.org/10.1038/sj.embor.embor887

Dreyfuss, M. S., and Chipley, J. R. (1980). Comparison of effects of sublethal microwave

radiation and conventional heating on the metabolic activity of Staphylococcus aureus.

Applied and Environmental Microbiology, 39(1), 13–16.

https://doi.org/10.1128/aem.39.1.13-16.1980

Ehmann, D. E., Shaw-Reid, C. A., Losey, H. C., and Walsh, C. T. (2000). The EntF and EntE

adenylation domains of Escherichia coli enterobactin synthetase: Sequestration and

selectivity in acyl-AMP transfers to thiolation domain cosubstrates. Proceedings of the

National Academy of Sciences of the United States of America, 97(6), 2509–2514.

https://doi.org/10.1073/pnas.040572897

Elgui de Oliveira, D. (2007). DNA viruses in human cancer: An integrated overview on

fundamental mechanisms of viral carcinogenesis. Cancer Letters, 247(2), 182–196.

https://doi.org/10.1016/j.canlet.2006.05.010

Emrich, S. J., Barbazuk, W. B., Li, L., and Schnable, P. S. (2007). Gene discovery and

annotation using LCM-454 transcriptome sequencing. Genome Research, 17(1), 69–73.

https://doi.org/10.1101/gr.5145806

Feng, Y., Xiong, Y., Qiao, T., Li, X., Jia, L., and Han, Y. (2018). Lactate dehydrogenase A: A

key player in carcinogenesis and potential target in cancer therapy. Cancer Medicine, 7(12),

6124–6136. https://doi.org/10.1002/cam4.1820

Fonseca, J., Moradi, F., Valente, A. J. F., and Stuart, J. A. (2018). Oxygen and glucose levels in

cell culture medium determine resveratrol’s effects on growth, hydrogen peroxide

production, and mitochondrial dynamics. Antioxidants, 7(11).

https://doi.org/10.3390/antiox7110157

Gadda, G., and McAllister-Wilkins, E. E. (2003). Cloning, expression, and purification of

choline dehydrogenase from the moderate halophile Halomonas elongata. Applied and

Environmental Microbiology, 69(4), 2126–2132. https://doi.org/10.1128/AEM.69.4.2126-

2132.2003

Gierliński, M., Cole, C., Schofield, P., Schurch, N. J., Sherstnev, A., Singh, V., Wrobel, N.,

Gharbi, K., Simpson, G., Owen-Hughes, T., Blaxter, M., and Barton, G. J. (2015).

Statistical models for RNA-seq data derived from a two-condition 48-replicate experiment.

Bioinformatics, 31(22), 3625–3630. https://doi.org/10.1093/bioinformatics/btv425

Ginder, N. D., Binkowski, D. J., Fromm, H. J., and Honzatko, R. B. (2006). Nucleotide

complexes of Escherichia coli phosphoribosylaminoimidazole succinocarboxamide

synthetase. Journal of Biological Chemistry, 281(30), 20680–20688.

https://doi.org/10.1074/jbc.M602109200

Gonzalez, T., Gaultney, R. A., Floden, A. M., and Brissette, C. A. (2015). Escherichia coli

lipoprotein binds human plasminogen via an intramolecular domain. Frontiers in

Microbiology, 6(OCT), 1–10. https://doi.org/10.3389/fmicb.2015.01095

Gordon, A., and Hannon, G. J. (2010). Fastx-Toolkit. FASTQ/A Short-Reads Pre-Processing

Tools (unpublished). Available online at http://hannonlab.cshl.edu/fastx_toolkit/

Grabherr, M. G., Haas, B. J., Yassour, M., Levin, J. Z., Thompson, D. A., Amit, I., Adiconis, X.,

Fan, L., Raychowdhury, R., Zeng, Q., Chen, Z., Mauceli, E., Hacohen, N., Gnirke, A.,

Rhind, N., Di Palma, F., Birren, B. W., Nusbaum, C., Lindblad-Toh, K., … Regev, A.

(2011). Full-length transcriptome assembly from RNA-Seq data without a reference

genome. Nature Biotechnology, 29(7), 644–652. https://doi.org/10.1038/nbt.1883

Grasmann, G., Smolle, E., Olschewski, H., and Leithner, K. (2019). Gluconeogenesis in cancer

cells – Repurposing of a starvation-induced metabolic pathway? Biochimica et Biophysica

Acta - Reviews on Cancer, 1872(1), 24–36. https://doi.org/10.1016/j.bbcan.2019.05.006

Guillon, J., Mechulam, Y., Blanquet, S., and Fayat, G. (1992). Disruption of the Gene for

Met-tRNAfMet Formyltransferase Severely Impairs Growth of Escherichia coli. Journal of

Bacteriology, 174(13), 4294–4301.

Gupta, D., and Heinen, C. D. (2019). The mismatch repair-dependent DNA damage response:

Mechanisms and implications. DNA Repair, 78(April), 60–69.

https://doi.org/10.1016/j.dnarep.2019.03.009

Góngora-Castillo, E., Childs, K. L., Fedewa, G., Hamilton, J. P., Liscombe, D. K., et al.

(2012). Development of Transcriptomic Resources for Interrogating the Biosynthesis of

Monoterpene Indole Alkaloids in Medicinal Plant Species. PLoS ONE, 7(12),

e52506. https://doi.org/10.1371/journal.pone.0052506

Ha, G. H., and Breuer, E. K. Y. (2012). Mitotic kinases and p53 signaling. Biochemistry

Research International, 2012. https://doi.org/10.1155/2012/195903

Haas, B. J., Papanicolaou, A., Yassour, M., Grabherr, M., Blood, P. D., Bowden, J., Couger, M.

B., Eccles, D., Li, B., Lieber, M., Macmanes, M. D., Ott, M., Orvis, J., Pochet, N., Strozzi,

F., Weeks, N., Westerman, R., William, T., Dewey, C. N., … Regev, A. (2013). De novo

transcript sequence reconstruction from RNA-seq using the Trinity platform for reference

generation and analysis. Nature Protocols, 8(8), 1494–1512.

https://doi.org/10.1038/nprot.2013.084

Haider, S., and Pal, R. (2013). Integrated Analysis of Transcriptomic and Proteomic Data.

Current Genomics, 14(2), 91–110. https://doi.org/10.2174/1389202911314020003

Han, Y., Gao, S., Muegge, K., Zhang, W., and Zhou, B. (2015). Advanced applications of RNA

sequencing and challenges. Bioinformatics and Biology Insights, 9, 29–46.

https://doi.org/10.4137/BBI.S28991

Hansen, K. D., Irizarry, R. A., and Wu, Z. (2012). Removing technical variability in RNA-seq

data using conditional quantile normalization. Biostatistics, 13(2), 204–216.

https://doi.org/10.1093/biostatistics/kxr054

Hardcastle, T. J., and Kelly, K. A. (2010). BaySeq: Empirical Bayesian methods for identifying

differential expression in sequence count data. BMC Bioinformatics, 11.

https://doi.org/10.1186/1471-2105-11-422

Hardwick, S. A., Bassett, S. D., Kaczorowski, D., Blackburn, J., Barton, K., Bartonicek, N.,

Carswell, S. L., Tilgner, H. U., Loy, C., Halliday, G., Mercer, T. R., Smith, M. A., and

Mattick, J. S. (2019). Targeted, high-resolution RNA sequencing of non-coding genomic

regions associated with neuropsychiatric functions. Frontiers in Genetics, 10(APR), 1–17.

https://doi.org/10.3389/fgene.2019.00309

Hänzelmann, S., Castelo, R., and Guinney, J. (2013). GSVA: Gene set variation analysis for

microarray and RNA-Seq data. BMC Bioinformatics, 14. https://doi.org/10.1186/1471-

2105-14-7

Hitosugi, T., Zhou, L., Elf, S., Fan, J., Kang, H. B., Seo, J. H., Shan, C., Dai, Q., Zhang, L., Xie,

J., Gu, T. L., Jin, P., Alečković, M., LeRoy, G., Kang, Y., Sudderth, J. A., DeBerardinis, R.

J., Luan, C. H., Chen, G. Z., … Chen, J. (2012). Phosphoglycerate Mutase 1 Coordinates

Glycolysis and Biosynthesis to Promote Tumor Growth. Cancer Cell, 22(5), 585–600.

https://doi.org/10.1016/j.ccr.2012.09.020

Huang, D. W., Sherman, B. T., and Lempicki, R. A. (2009). Systematic and integrative analysis

of large gene lists using DAVID Bioinformatics Resources. Nature Protocols, 4(1), 44–

57. https://doi.org/10.1038/nprot.2008.211

Huber, W., Carey, V. J., Gentleman, R., Anders, S., Carlson, M., Carvalho, B. S., Bravo, H. C.,

Davis, S., Gatto, L., … Morgan, M. (2015). Orchestrating high-throughput genomic

analysis with Bioconductor. Nature Methods, 12(2), 115–121.

https://doi.org/10.1038/nmeth.3252

Ibarrola, J., Sádaba, R., Garcia-Peña, A., Arrieta, V., Martinez-Martinez, E., Alvarez, V.,

Fernández-Celis, A., Gainza, A., Santamaría, E., Fernández-Irigoyen, J., Cachofeiro, V.,

Fay, R., Rossignol, P., and López-Andrés, N. (2018). A role for fumarate hydratase in

mediating oxidative effects of galectin-3 in human cardiac fibroblasts. International

Journal of Cardiology, 258, 217–223. https://doi.org/10.1016/j.ijcard.2017.12.103

Icard, P., Wu, Z., Fournel, L., Coquerel, A., Lincet, H., and Alifano, M. (2020). ATP citrate

lyase: A central metabolic enzyme in cancer. Cancer Letters, 471(December 2019), 125–

134. https://doi.org/10.1016/j.canlet.2019.12.010

Illumina. (2014). Estimating Sequencing Coverage.

http://genome.ucsc.edu/ENCODE/protocols/dataStandards/

Illumina. (2017). An introduction to Next-Generation Sequencing Technology. Pub.No.770-

2012-008-B. https://www.illumina.com/content/dam/illumina-

marketing/documents/products/illumina_sequencing_introduction.pdf

Illumina. (2020). Consideration of RNA-Seq read length and coverage. Available at

https://support.illumina.com/bulletins/2017/04/considerations-for-rna-seq-read-length-and-

coverage-.html

Invitrogen. (2019). Choosing between total RNA-Seq and mRNA-Seq. Available at

http://assets.thermofisher.com/TFS-Assets/BID/Technical-Notes/collibri-stranded-rna-

library-prep-kit-total-rna-seq-mrna-seq-technical-note.pdf

Issaeva, N. (2019). P53 signaling in cancers. Cancers, 11(3), 14–16.

https://doi.org/10.3390/cancers11030332

Janković, S. M., Milošev, M. Z., and Novaković, M. L. (2014). The Effects of Microwave

Radiation on Microbial Cultures. Hospital Pharmacology, 1(2), 102–108.

www.hophonline.org

Jensen, J. K., Schultink, A., Keegstra, K., Wilkerson, C. G., and Pauly, M. (2012). RNA-seq

analysis of developing nasturtium seeds (Tropaeolum majus): Identification and

characterization of an additional galactosyltransferase involved in xyloglucan biosynthesis.

Molecular Plant, 5(5), 984–992. https://doi.org/10.1093/mp/sss032

Jing, X., Yang, F., Shao, C., Wei, K., Xie, M., Shen, H., and Shu, Y. (2019). Role of hypoxia in

cancer therapy by regulating the tumor microenvironment. Molecular Cancer, 18(1), 1–15.

https://doi.org/10.1186/s12943-019-1089-9

Johnson, W. E., Li, C., and Rabinovic, A. (2007). Adjusting batch effects in microarray

expression data using empirical Bayes methods. Biostatistics, 8(1), 118–127.

https://doi.org/10.1093/biostatistics/kxj037

Kato, T., Makino, F., Miyata, T., Horváth, P., and Namba, K. (2019). Structure of the native

supercoiled flagellar hook as a universal joint. Nature Communications, 10(1).

https://doi.org/10.1038/s41467-019-13252-9

Keeley, T. P., and Mann, G. E. (2019). Defining physiological normoxia for improved

translation of cell physiology to animal models and humans. Physiological Reviews, 99(1),

161–234. https://doi.org/10.1152/physrev.00041.2017

Kim, D., Langmead, B., and Salzberg, S. L. (2015). HISAT: A fast spliced aligner with low

memory requirements. Nature Methods, 12(4), 357–360.

https://doi.org/10.1038/nmeth.3317

Kim, H., Lee, J. H., and Park, J. W. (2020). Down-regulation of IDH2 sensitizes cancer cells to

erastin-induced ferroptosis. Biochemical and Biophysical Research Communications,

525(2), 366–371. https://doi.org/10.1016/j.bbrc.2020.02.093

Kim, M., Chung, H., Yoon, C., Lee, E., Kim, T., Kim, T., Kwon, M., Lee, S., Rhee, B., and Park,

J. (2012). Increase of INS-1 cell apoptosis under glucose fluctuation and the involvement of

FOXO-SIRT pathway. Diabetes Research and Clinical Practice, 98(1), 132–139.

https://doi.org/10.1016/j.diabres.2012.04.013

Kim, S. W., Kim, S. J., Langley, R. R., and Fidler, I. J. (2015). Modulation of the cancer cell

transcriptome by culture medium formulations and cell density. International Journal of

Oncology, 46(5), 2067–2075. https://doi.org/10.3892/ijo.2015.2930

Kobayashi, T., Takimura, T., Sekine, R., Vincent, K., Kamata, K., Sakamoto, K., Nishimura, S.,

and Yokoyama, S. (2005). Structural snapshots of the KMSKS loop rearrangement for

amino acid activation by bacterial tyrosyl-tRNA synthetase. Journal of Molecular Biology,

346(1), 105–117. https://doi.org/10.1016/j.jmb.2004.11.034

Kotrba, P., Inui, M., and Yukawa, H. (2001). Bacterial Phosphotransferase System (PTS) in

Carbohydrate Uptake and Control of Carbon Metabolism. Journal of Bioscience and

Bioengineering, 92(6), 502–517.

Kumar, P., and Dubey, K. (2019). Chapter 13 - Citric Acid Cycle Regulation: Back Bone for

Secondary Metabolite Production. In: New and Future Developments in Microbial

Biotechnology and Bioengineering, Elsevier Inc. 168-181. https://doi.org/10.1016/B978-0-

444-63504-4.00013-X

Laksono, B. M., de Vries, R. D., Verburgh, R. J., Visser, E. G., de Jong, A., Fraaij, P. L. A.,

Ruijs, W. L. M., Nieuwenhuijse, D. F., van den Ham, H. J., Koopmans, M. P. G., van Zelm,

M. C., Osterhaus, A. D. M. E., and de Swart, R. L. (2018). Studies into the mechanism of

measles-associated immune suppression during a measles outbreak in the Netherlands.

Nature Communications, 9(1), 1–10. https://doi.org/10.1038/s41467-018-07515-0

Langmead, B., Trapnell, C., Pop, M., and Salzberg, S. L. (2009). Ultrafast and memory-efficient

alignment of short DNA sequences to the human genome. Genome Biology, 10(3).

https://doi.org/10.1186/gb-2009-10-3-r25

Lee, J. H., Liu, R., Li, J., Wang, Y., Tan, L., Li, X. J., Qian, X., Zhang, C., Xia, Y., Xu, D., Guo,

W., Ding, Z., Du, L., Zheng, Y., Chen, Q., Lorenzi, P. L., Mills, G. B., Jiang, T., and Lu, Z.

(2018). EGFR-Phosphorylated Platelet Isoform of Phosphofructokinase 1 Promotes PI3K

Activation. Molecular Cell, 70(2), 197-210.e7. https://doi.org/10.1016/j.molcel.2018.03.018

Leng, N., Dawson, J. A., Thomson, J. A., Ruotti, V., Rissman, A. I., Smits, B. M. G., Haag, J.

D., Gould, M. N., Stewart, R. M., and Kendziorski, C. (2013). EBSeq: An empirical Bayes

hierarchical model for inference in RNA-seq experiments. Bioinformatics, 29(8), 1035–

1043. https://doi.org/10.1093/bioinformatics/btt087

Li, B., and Dewey, C. N. (2011). RSEM: Accurate transcript quantification from RNA-Seq data

with or without a reference genome. BMC Bioinformatics, 12. https://doi.org/10.1186/1471-

2105-12-323

Li, H., and Durbin, R. (2009). Fast and accurate short read alignment with Burrows-Wheeler

transform. Bioinformatics, 25(14), 1754–1760.

https://doi.org/10.1093/bioinformatics/btp324

Li, J., Qian, W. P., and Sun, Q. Y. (2019). Cyclins regulating oocyte meiotic cell cycle

progression. Biology of Reproduction, 101(5), 878–881.

https://doi.org/10.1093/biolre/ioz143

Li, J., and Tibshirani, R. (2013). Finding consistent patterns: A nonparametric approach for

identifying differential expression in RNA-Seq data. Statistical Methods in Medical

Research, 22(5), 519–536. https://doi.org/10.1177/0962280211428386

Li, X., Turanli, B., Juszczak, K., Kim, W., Arif, M., Sato, Y., Ogawa, S., Turkez, H., Nielsen, J.,

Boren, J., Uhlen, M., Zhang, C., and Mardinoglu, A. (2020). Classification of clear cell

renal cell carcinoma based on PKM alternative splicing. Heliyon, 6(2), e03440.

https://doi.org/10.1016/j.heliyon.2020.e03440

Li, Z., Pearlman, A. H., and Hsieh, P. (2016). DNA mismatch repair and the DNA damage

response. DNA Repair, 38, 94–101. https://doi.org/10.1016/j.dnarep.2015.11.019

Liao, Y., Smyth, G. K., and Shi, W. (2014). featureCounts: An efficient general purpose

program for assigning sequence reads to genomic features. Bioinformatics, 30(7), 923–930.

https://doi.org/10.1093/bioinformatics/btt656

Liu, L., Cash, T. P., Jones, R. G., Keith, B., Thompson, C. B., and Simon, M. C. (2006).

Hypoxia-induced energy stress regulates mRNA translation and cell growth. Molecular

Cell, 21(4), 521–531. https://doi.org/10.1016/j.molcel.2006.01.010

Liu, R., Ström, A. L., Zhai, J., Gal, J., Bao, S., Gong, W., and Zhu, H. (2009). Enzymatically

inactive adenylate kinase 4 interacts with mitochondrial ADP/ATP translocase.

International Journal of Biochemistry and Cell Biology, 41(6), 1371–1380.

https://doi.org/10.1016/j.biocel.2008.12.002

Liu, Y., Li, K., Wang, L., and Wang, S. (2019). Ornithine aminotransferase promoted the

proliferation and metastasis of non-small cell lung cancer via upregulation of miR-21.

Journal of Cellular Physiology, 234, 12828–12838. https://doi.org/10.1002/jcp.27939

Lobritz, M. A., Belenky, P., Porter, C. B. M., Gutierrez, A., Yang, J. H., Schwarz, E. G., Dwyer,

D. J., Khalil, A. S., and Collins, J. J. (2015). Antibiotic efficacy is linked to bacterial

cellular respiration. Proceedings of the National Academy of Sciences of the United States

of America, 112(27), 8173–8180. https://doi.org/10.1073/pnas.1509743112

Love, M. I., Huber, W., and Anders, S. (2014). Moderated estimation of fold change and

dispersion for RNA-seq data with DESeq2. Genome Biology, 15(12).

https://doi.org/10.1186/s13059-014-0550-8

Luo, R., Liu, B., Xie, Y., Li, Z., Huang, W., Yuan, J., He, G., Chen, Y., Pan, Q., Liu, Y., Tang,

J., Wu, G., Zhang, H., Shi, Y., Liu, Y., Yu, C., Wang, B., Lu, Y., Han, C., … Wang, J.

(2015). Erratum to “SOAPdenovo2: An empirically improved memory-efficient short-read

de novo assembler” [GigaScience, (2012), 1, 18]. GigaScience, 4(1), 1.

https://doi.org/10.1186/s13742-015-0069-2

Ma, T., Schreiber, C. A., Knutson, G. J., Khattouti, A. El, Sakiyama, M. J., Hassan, M.,

Charlesworth, M. C., Madden, B. J., Zhou, X., Vuk-Pavlović, S., and Gomez, C. R. (2015).

Effects of oxygen on the antigenic landscape of prostate cancer cells. BMC

Research Notes, 8(1), 1–15. https://doi.org/10.1186/s13104-015-1633-7

Maddalena, L. A., Selim, S. M., Fonseca, J., Messner, H., McGowan, S., and Stuart, J. A. (2017).

Hydrogen peroxide production is affected by oxygen levels in mammalian cell culture.

Biochemical and Biophysical Research Communications, 493(1), 246–251.

https://doi.org/10.1016/j.bbrc.2017.09.037

Marioni, J. C., Mason, C. E., Mane, S. M., Stephens, M., and Gilad, Y. (2008). RNA-seq: An

assessment of technical reproducibility and comparison with gene expression arrays.

Genome Research, 18(9), 1509–1517. https://doi.org/10.1101/gr.079558.108

Martin, J. L., Ishmukhametov, R., Hornung, T., Ahmad, Z., and Frasch, W. D. (2014). Anatomy

of F1-ATPase powered rotation. Proceedings of the National Academy of Sciences of the

United States of America, 111(10), 3715–3720. https://doi.org/10.1073/pnas.1317784111

Mastrocola, A. S., and Heinen, C. D. (2010). Nuclear reorganization of DNA mismatch repair

proteins in response to DNA damage. DNA Repair, 9(2), 120–133.

https://doi.org/10.1016/j.dnarep.2009.11.003

Matsunami, H., Barker, C. S., Yoon, Y. H., Wolf, M., and Samatey, F. A. (2016). Complete

structure of the bacterial flagellar hook reveals extensive set of stabilizing interactions.

Nature Communications, 7, 1–10. https://doi.org/10.1038/ncomms13425

Maziero, R. R. D., Guaitolini, C. R. de F., Paschoal, D. M., Crespilho, A. M., Sestari, D. A. O.,

Dode, M. A. N., and Landim-Alvarenga, F. da C. (2020). Effects of the addition of oocyte

meiosis-inhibiting drugs on the expression of maturation-promoting factor components and

organization of cytoplasmic organelles. Reproductive Biology, 20(1), 48–62.

https://doi.org/10.1016/j.repbio.2019.12.005

Mazinani, S. A., DeLong, B., and Yan, H. (2015). Microwave radiation accelerates trypsin-

catalyzed peptide hydrolysis at constant bulk temperature. Tetrahedron Letters, 56, 5804–

5807. https://doi.org/10.1016/j.tetlet.2015.09.003

Mazinani, S. A., Noaman, N., Pergande, M. R., Cologna, S. M., Coorssen, J., and Yan, H.

(2019). Exposure to microwave irradiation at constant culture temperature slows the growth

of Escherichia coli DE3 cells, leading to modified proteomic profiles. RSC Advances,

9(21), 11810–11817. https://doi.org/10.1039/c9ra00617f

Mazinani, S. A., and Yan, H. (2016). Impact of microwave irradiation on enzymatic activity at

constant bulk temperature is enzyme-dependent. Tetrahedron Letters, 57(14), 1589–1591.

https://doi.org/10.1016/j.tetlet.2016.02.104

McKeown, S. R. (2014). Defining normoxia, physoxia and hypoxia in tumours - Implications for

treatment response. British Journal of Radiology, 87(1035), 1–12.

https://doi.org/10.1259/bjr.20130676

Medina, I., Carbonell, J., Pulido, L., Madeira, S. C., Goetz, S., Conesa, A., Tárraga, J., Pascual-

Montano, A., Nogales-Cadenas, R., Santoyo, J., García, F., Marbà, M., Montaner, D., and

Dopazo, J. (2010). Babelomics: An integrative platform for the analysis of transcriptomics,

proteomics and genomic data with advanced functional profiling. Nucleic Acids Research,

38(SUPPL. 2), 210–213. https://doi.org/10.1093/nar/gkq388

Merico, D., Isserlin, R., Stueker, O., Emili, A., and Bader, G. D. (2010). Enrichment map: A

network-based method for gene-set enrichment visualization and interpretation. PLoS ONE,

5(11). https://doi.org/10.1371/journal.pone.0013984

Mittempergher, L., Delahaye, L. J. M. J., Witteveen, A. T., Spangler, J. B., Hassenmahomed, F.,

Mee, S., Mahmoudi, S., Chen, J., Bao, S., Snel, M. H. J., Leidelmeijer, S., Besseling, N.,

Bergstrom Lucas, A., Pabón-Peña, C., Linn, S. C., Dreezen, C., Wehkamp, D., Chan, B. Y.,

Bernards, R., … Glas, A. M. (2019). MammaPrint and BluePrint Molecular Diagnostics

Using Targeted RNA Next-Generation Sequencing Technology. Journal of Molecular

Diagnostics, 21(5), 808–823. https://doi.org/10.1016/j.jmoldx.2019.04.007

Muz, B., de la Puente, P., Azab, F., and Azab, A. K. (2015). The role of hypoxia in cancer

progression, angiogenesis, metastasis, and resistance to therapy. Hypoxia, 3, 83–92.

https://doi.org/10.2147/hp.s93413

Nakanishi-Matsui, M., Sekiya, M., and Futai, M. (2016). ATP synthase from Escherichia coli:

Mechanism of rotational catalysis, and inhibition with the ε subunit and phytopolyphenols.

Biochimica et Biophysica Acta - Bioenergetics, 1857(2), 129–140.

https://doi.org/10.1016/j.bbabio.2015.11.005

Nelson, M. (1996). Data compression with the Burrows-Wheeler transform. Dr. Dobb’s Journal of

Software Tools, 21(9), 46–50. http://www.dogma.net/markn/articles/bwt/bwt.htm

Niu, H., Wang, J., Zhuang, W., Liu, D., Chen, Y., Zhu, C., and Ying, H. (2018). Comparative

transcriptomic and proteomic analysis of Arthrobacter sp. CGMCC 3584 responding to

dissolved oxygen for cAMP production. Scientific Reports, 8(1), 1–13.

https://doi.org/10.1038/s41598-017-18889-4

Nueda, M. J., Ferrer, A., and Conesa, A. (2012). ARSyN: A method for the identification and

removal of systematic noise in multifactorial time course microarray experiments.

Biostatistics, 13(3), 553–566. https://doi.org/10.1093/biostatistics/kxr042

Patel, R. K., and Jain, M. (2012). NGS QC Toolkit: A Toolkit for Quality Control of Next Generation

Sequencing Data. PLoS ONE, 7(2), e30619. https://doi.org/10.1371/journal.pone.0030619

Ogi, T., Limsirichaikul, S., Overmeer, R. M., Volker, M., Takenaka, K., Cloney, R., Nakazawa,

Y., Niimi, A., Miki, Y., Jaspers, N. G., Mullenders, L. H. F., Yamashita, S., Fousteri, M. I.,

and Lehmann, A. R. (2010). Three DNA Polymerases, Recruited by Different Mechanisms,

Carry Out NER Repair Synthesis in Human Cells. Molecular Cell, 37(5), 714–727.

https://doi.org/10.1016/j.molcel.2010.02.009

Orchard, S. S., Rostron, J. E., and Segall, A. M. (2012). Escherichia coli enterobactin synthesis

and uptake mutants are hypersensitive to an antimicrobial peptide that limits the availability

of iron in addition to blocking Holliday junction resolution. Microbiology, 158(2), 547–559.

https://doi.org/10.1099/mic.0.054361-0

Oshlack, A., and Wakefield, M. J. (2009). Transcript length bias in RNA-seq data confounds

systems biology. Biology Direct, 4, 1–10. https://doi.org/10.1186/1745-6150-4-14

Overmeer, R. M., Gourdin, A. M., Giglia-Mari, A., Kool, H., Houtsmuller, A. B., Siegal, G.,

Fousteri, M. I., Mullenders, L. H. F., and Vermeulen, W. (2010). Replication Factor C

Recruits DNA Polymerase δ to Sites of Nucleotide Excision Repair but Is Not Required for

PCNA Recruitment. Molecular and Cellular Biology, 30(20), 4828–4839.

https://doi.org/10.1128/mcb.00285-10

Owczarek, A., Gieczewska, K., Jarzyna, R., Jagielski, A. K., Kiersztan, A., Gruza, A., and

Winiarska, K. (2020). Hypoxia increases the rate of renal gluconeogenesis via hypoxia-

inducible factor-1-dependent activation of phosphoenolpyruvate carboxykinase expression.

Biochimie, 171–172, 31–37. https://doi.org/10.1016/j.biochi.2020.02.002

Park, E. M., Nguyen, L. N., Lim, Y. S., and Hwang, S. B. (2014). Farnesyl-diphosphate

farnesyltransferase 1 regulates hepatitis C virus propagation. FEBS Letters, 588(9), 1813–

1820. https://doi.org/10.1016/j.febslet.2014.03.043

Patro, R., Duggal, G., Love, M. I., Irizarry, R. A., and Kingsford, C. (2017). Salmon provides

fast and bias-aware quantification of transcript expression. Nature Methods, 14(4), 417–

419. https://doi.org/10.1038/nmeth.4197

Patro, R., Mount, S. M., and Kingsford, C. (2014). Sailfish enables alignment-free isoform

quantification from RNA-seq reads using lightweight algorithms. Nature Biotechnology,

32(5), 462–464. https://doi.org/10.1038/nbt.2862

Shannon, P., Markiel, A., Ozier, O., Baliga, N. S., Wang, J. T., Ramage, D., Amin, N.,

Schwikowski, B., and Ideker, T. (2003). Cytoscape: A software environment for integrated

models of biomolecular interaction networks. Genome Research, 13(11), 2498–2504.

https://doi.org/10.1101/gr.1239303

Perez-Arnaiz, P., Bruck, I., and Kaplan, D. L. (2016). Mcm10 coordinates the timely assembly

and activation of the replication fork helicase. Nucleic Acids Research, 44(1), 315–329.

https://doi.org/10.1093/nar/gkv1260

Pertea, M., Pertea, G. M., Antonescu, C. M., Chang, T. C., Mendell, J. T., and Salzberg, S. L.

(2015). StringTie enables improved reconstruction of a transcriptome from RNA-seq reads.

Nature Biotechnology, 33(3), 290–295. https://doi.org/10.1038/nbt.3122

Phadtare, S., and Inouye, M. (2001). Role of CspC and CspE in regulation of expression of RpoS

and UspA, the stress response proteins in Escherichia coli. Journal of Bacteriology, 183(4),

1205–1214. https://doi.org/10.1128/JB.183.4.1205-1214.2001

Pham, P.C. (2018). Chapter 19 - Medical Biotechnology: Techniques and Applications. In:

Omics Technologies and Bio-Engineering. Academic Press, 449-469.

https://doi.org/10.1016/B978-0-12-804659-3.00019-1

Putzer, H., and Laalami, S. (2003). Regulation of the Expression of Aminoacyl-tRNA

Synthetases and Translation Factors. Translation Mechanisms, 107, 388–415.

Quinlan, A. R., and Hall, I. M. (2010). BEDTools: A flexible suite of utilities for comparing

genomic features. Bioinformatics, 26(6), 841–842.

https://doi.org/10.1093/bioinformatics/btq033

Ramsay, L., Macaulay, M., Degli Ivanissevich, S., MacLean, K., Cardle, L., Fuller, J., Edwards,

K. J., Tuvesson, S., Morgante, M., Massari, A., Maestri, E., Marmiroli, N., Sjakste, T.,

Ganal, M., Powell, W., and Waugh, R. (2000). A simple sequence repeat-based linkage map

of Barley. Genetics, 156(4), 1997–2005.

Raval, S., Chaudhari, V., Gosai, H., and Kothari, V. (2014). Effect of low power microwave

radiation on pigment production in bacteria. Microbiology Research, 5(1).

https://doi.org/10.4081/mr.2014.5511

Rayner, E., Van Gool, I. C., Palles, C., Kearsey, S. E., Bosse, T., Tomlinson, I., and Church, D.

N. (2016). A panoply of errors: Polymerase proofreading domain mutations in cancer.

Nature Reviews Cancer, 16(2), 71–81. https://doi.org/10.1038/nrc.2015.12

Robinson, M. D., McCarthy, D. J., and Smyth, G. K. (2009). edgeR: A Bioconductor package for

differential expression analysis of digital gene expression data. Bioinformatics, 26(1), 139–

140. https://doi.org/10.1093/bioinformatics/btp616

Rougier, C., Prorot, A., Chazal, P., Leveque, P., and Leprat, P. (2014). Thermal and nonthermal

effects of discontinuous microwave exposure (2.45 Gigahertz) on the cell membrane of

Escherichia coli. Applied and Environmental Microbiology, 80(16), 4832–4841.

https://doi.org/10.1128/AEM.00789-14

Różalska, B. and Sadowska, B (2018). Chapter 18 - In Vivo Resistance Mechanisms:

Staphylococcal Biofilms. In: Pet-To-Man Travelling Staphylococci. Academic Press, 237-251.

https://doi.org/10.1016/B978-0-12-813547-1.00018-2

Said-Salman, I. H., Jebaii, F. A., Yusef, H. H., and Moustafa, M. E. (2019). Global gene

expression analysis of Escherichia coli K-12 DH5α after exposure to 2.4 GHz wireless

fidelity radiation. Scientific Reports, 9(1). https://doi.org/10.1038/s41598-019-51046-7

Saliba, A. E., Westermann, A. J., Gorski, S. A., and Vogel, J. (2014). Single-cell RNA-seq:

Advances and future challenges. Nucleic Acids Research, 42(14), 8845–8860.

https://doi.org/10.1093/nar/gku555

Salmen, S. H., Alharbi, S. A., Faden, A. A., and Wainwright, M. (2018). Evaluation of effect of

high frequency electromagnetic field on growth and antibiotic sensitivity of bacteria. Saudi

Journal of Biological Sciences, 25(1), 105–110. https://doi.org/10.1016/j.sjbs.2017.07.006

Salomon, C., and Rice, G. E. (2017). Role of exosomes in placental homeostasis and pregnancy

disorders. Progress in Molecular Biology and Translational Science, 145, 163–179.

https://doi.org/10.1016/bs.pmbts.2016.12.006

Santra, T., Herrero, A., Rodriguez, J., von Kriegsheim, A., Iglesias-Martinez, L. F., Schwarzl, T.,

Higgins, D., Aye, T. T., Heck, A. J. R., Calvo, F., Agudo-Ibáñez, L., Crespo, P., Matallanas,

D., and Kolch, W. (2019). An Integrated Global Analysis of Compartmentalized HRAS

Signaling. Cell Reports, 26(11), 3100-3115.e7. https://doi.org/10.1016/j.celrep.2019.02.038

Sarantopoulou, D., Tang, S. Y., Ricciotti, E., Lahens, N. F., Lekkas, D., Schug, J., Guo, X. S.,

Paschos, G. K., FitzGerald, G. A., Pack, A. I., and Grant, G. R. (2019). Comparative

evaluation of RNA-Seq library preparation methods for strand-specificity and low input.

Scientific Reports, 9(1), 1–10. https://doi.org/10.1038/s41598-019-49889-1

Schilmiller, A. L., Miner, D. P., Larson, M., McDowell, E., Gang, D. R., Wilkerson, C., and

Last, R. L. (2010). Studies of a biochemical factory: Tomato trichome deep expressed

sequence tag sequencing and proteomics. Plant Physiology, 153(3), 1212–1223.

https://doi.org/10.1104/pp.110.157214

Schmidt, C., Sciacovelli, M., and Frezza, C. (2020). Fumarate hydratase in cancer: A

multifaceted tumour suppressor. Seminars in Cell and Developmental Biology, 98(March

2019), 15–25. https://doi.org/10.1016/j.semcdb.2019.05.002

Schubiger, C. B., Orfe, L. H., Sudheesh, P. S., Cain, K. D., Shah, D. H., and Calla, D. R. (2015).

Entericidin is required for a probiotic treatment (Enterobacter sp. Strain C6-6) to protect

trout from cold-water disease challenge. Applied and Environmental Microbiology, 81(2),

658–665. https://doi.org/10.1128/AEM.02965-14

Schulz, M. H., Zerbino, D. R., Vingron, M., and Birney, E. (2012). Oases: Robust de novo RNA-

seq assembly across the dynamic range of expression levels. Bioinformatics, 28(8), 1086–

1092. https://doi.org/10.1093/bioinformatics/bts094

Sekiya, M., Nakamoto, R. K., Al-Shawi, M. K., Nakanishi-Matsui, M., and Futai, M. (2009).

Temperature dependence of single molecule rotation of the Escherichia coli ATP synthase

F1 sector reveals the importance of γ-β subunit interactions in the catalytic dwell. Journal of

Biological Chemistry, 284(33), 22401–22410. https://doi.org/10.1074/jbc.M109.009019

Senturk, E. and Manfredi, J. J. (2013). p53 and cell cycle effects after DNA damage. Methods in

Molecular Biology (Clifton, N.J.), 962:49-61. https://doi.org/10.1007/978-1-62703-236-0_4.

Seo, Y. S., and Kang, Y. H. (2018). The human replicative helicase, the CMG complex, as a

target for anti-cancer therapy. Frontiers in Molecular Biosciences, 5(MAR), 1–21.

https://doi.org/10.3389/fmolb.2018.00026

Seol, W., and Shatkin, A. J. (1993). Membrane topology model of Escherichia coli α-

ketoglutarate permease by phoA fusion analysis. Journal of Bacteriology, 175(2), 565–567.

https://doi.org/10.1128/jb.175.2.565-567.1993

Shah, N. B., Hutcheon, M. L., Haarer, B. K., and Duncan, T. M. (2013). F1-ATPase of

Escherichia coli: The ε-inhibited state forms after ATP hydrolysis, is distinct from the ADP-

inhibited state, and responds dynamically to catalytic site ligands. Journal of Biological

Chemistry, 288(13), 9383–9395. https://doi.org/10.1074/jbc.M113.451583

Shamis, Y., Croft, R., Taube, A., Crawford, R. J., and Ivanova, E. P. (2012). Review of the

specific effects of microwave radiation on bacterial cells. In Applied Microbiology and

Biotechnology (Vol. 96, Issue 2, pp. 319–325). https://doi.org/10.1007/s00253-012-4339-y

Shamis, Y., Taube, A., Mitik-Dineva, N., Croft, R., Crawford, R. J., and Ivanova, E. P. (2011).

Specific electromagnetic effects of microwave radiation on Escherichia coli. Applied and

Environmental Microbiology, 77(9), 3017–3022. https://doi.org/10.1128/AEM.01899-10

Sharma, G., Sharma, S., Sharma, P., Chandola, D., Dang, S., Gupta, S., and Gabrani, R. (2016).

Escherichia coli biofilm: development and therapeutic strategies. Journal of Applied

Microbiology, 121(2), 309–319. https://doi.org/10.1111/jam.13078

Shen, C. (2019). Chapter 4 - Gene Expression: Translation of the Genetic Code. In: Diagnostic

Molecular Biology. Academic Press, 87-116. https://doi.org/10.1016/B978-0-12-802823-0.00004-3

Sivashanmugam, M., J., J., V., U., and K.N., S. (2017). Ornithine and its role in metabolic

diseases: An appraisal. Biomedicine and Pharmacotherapy, 86, 185–194.

https://doi.org/10.1016/j.biopha.2016.12.024

Soneson, C., Love, M. I., and Robinson, M. D. (2016). Differential analyses for RNA-seq:

Transcript-level estimates improve gene-level inferences [version 2; referees: 2 approved].

F1000Research, 4, 1–22. https://doi.org/10.12688/F1000RESEARCH.7563.2

Stanisavljev, D., Gojgić-Cvijović, G., and Bubanja, I. N. (2017). Scrutinizing microwave effects

on glucose uptake in yeast cells. European Biophysics Journal, 46(1), 25–31.

https://doi.org/10.1007/s00249-016-1131-4

Stark, R., Grzelak, M., and Hadfield, J. (2019). RNA sequencing: the teenage years. Nature

Reviews Genetics, 20(11), 631–656. https://doi.org/10.1038/s41576-019-0150-2

Stokes, J. M., Lopatkin, A. J., Lobritz, M. A., and Collins, J. J. (2019). Bacterial Metabolism and

Antibiotic Efficacy. In Cell Metabolism (Vol. 30, Issue 2, pp. 251–259). Cell Press.

https://doi.org/10.1016/j.cmet.2019.06.009

Stover, P. J. (2009). One-carbon metabolism–genome interactions in folate-associated

pathologies. The Journal of Nutrition, 139(12), 2402–2405.

https://doi.org/10.3945/jn.109.113670

Strickland, M., Kale, S., Strub, M. P., Schwieters, C. D., Liu, J., Peterkofsky, A., and Tjandra, N.

(2019). Potential Regulatory Role of Competitive Encounter Complexes in Paralogous

Phosphotransferase Systems. Journal of Molecular Biology, 431(12), 2331–2342.

https://doi.org/10.1016/j.jmb.2019.04.040

Stuart, J. A., Fonseca, J., Moradi, F., Cunningham, C., Seliman, B., Worsfold, C. R., Dolan, S.,

Abando, J., and Maddalena, L. A. (2018). How supraphysiological oxygen levels in

standard cell culture affect oxygen-consuming reactions. Oxidative Medicine and Cellular

Longevity, 2018. https://doi.org/10.1155/2018/8238459

Sultan, M., Dökel, S., Amstislavskiy, V., Wuttig, D., Sültmann, H., Lehrach, H., and Yaspo, M.

L. (2012). A simple strand-specific RNA-Seq library preparation protocol combining the

Illumina TruSeq RNA and the dUTP methods. Biochemical and Biophysical Research

Communications, 422(4), 643–646. https://doi.org/10.1016/j.bbrc.2012.05.043

Sun, L., Lu, T., Tian, K., Zhou, D., Yuan, J., Wang, X., Zhu, Z., Wan, D., Yao, Y., Zhu, X., and

He, S. (2019). Alpha-enolase promotes gastric cancer cell proliferation and metastasis via

regulating AKT signaling pathway. European Journal of Pharmacology, 845(September

2018), 8–15. https://doi.org/10.1016/j.ejphar.2018.12.035

Sun, Z., Do, P. M., Rhee, M. S., Govindasamy, L., Wang, Q., Ingram, L. O., and Shanmugam, K.

T. (2012). Amino acid substitutions at glutamate-354 in dihydrolipoamide dehydrogenase of

Escherichia coli lower the sensitivity of pyruvate dehydrogenase to NADH. Microbiology,

158(5), 1350–1358. https://doi.org/10.1099/mic.0.055590-0

Tandonnet, S., and Torres, T. T. (2017). Traditional versus 3′ RNA-seq in a non-model species.

Genomics Data, 11, 9–16. https://doi.org/10.1016/j.gdata.2016.11.002

Tang, Y., Ke, Z., Peng, Y., and Cai, P. (2018). Co-expression analysis reveals key gene modules

and pathway of human coronary heart disease. Journal of Cellular Biochemistry, 119(2), 2102–2109.

https://doi.org/10.1002/jcb.26372

Tarazona, S., Furió-Tarí, P., Turrà, D., Di Pietro, A., Nueda, M. J., Ferrer, A., and Conesa, A.

(2015). Data quality aware analysis of differential expression in RNA-seq with NOISeq

R/Bioc package. Nucleic Acids Research, 43(21). https://doi.org/10.1093/nar/gkv711

Thieme, D., and Grass, G. (2010). The Dps protein of Escherichia coli is involved in copper

homeostasis. Microbiological Research, 165(2), 108–115.

https://doi.org/10.1016/j.micres.2008.12.003

Trapnell, C., Pachter, L., and Salzberg, S. L. (2009). TopHat: Discovering splice junctions with

RNA-Seq. Bioinformatics, 25(9), 1105–1111. https://doi.org/10.1093/bioinformatics/btp120

Trapnell, C., Roberts, A., Goff, L., Pertea, G., Kim, D., Kelley, D. R., Pimentel, H., Salzberg, S.

L., Rinn, J. L., and Pachter, L. (2012). Differential gene and transcript expression analysis

of RNA-seq experiments with TopHat and Cufflinks. Nature Protocols, 7(3), 562–578.

https://doi.org/10.1038/nprot.2012.016

Troncoso-Ponce, M. A., Kilaru, A., Cao, X., Durrett, T. P., Fan, J., Jensen, J. K., Thrower, N. A.,

Pauly, M., Wilkerson, C., and Ohlrogge, J. B. (2011). Comparative deep transcriptional

profiling of four developing oilseeds. Plant Journal, 68(6), 1014–1027.

https://doi.org/10.1111/j.1365-313X.2011.04751.x

Tuntland, M. L., Wolf, N. M., and Fung, L. W. M. (2015). Differences in the purification and

solution properties of PurC gene products from Streptococcus pneumoniae and Bacillus

anthracis. Protein Expression and Purification, 114, 143–148.

https://doi.org/10.1016/j.pep.2015.05.016

Upadhyay, V. A., Brunner, A. M., and Fathi, A. T. (2017). Isocitrate dehydrogenase (IDH)

inhibition as treatment of myeloid malignancies: Progress and future directions.

Pharmacology and Therapeutics, 177, 123–128.

https://doi.org/10.1016/j.pharmthera.2017.03.003

Urbańska, K., and Orzechowski, A. (2019). Unappreciated role of LDHA and LDHB to control

apoptosis and autophagy in tumor cells. International Journal of Molecular Sciences, 20(9),

1–15. https://doi.org/10.3390/ijms20092085

Van Verk, M. C., Hickman, R., Pieterse, C. M. J., and Van Wees, S. C. M. (2013). RNA-Seq:

Revelation of the messengers. Trends in Plant Science, 18(4), 175–179.

https://doi.org/10.1016/j.tplants.2013.02.001

Vassilaki, N., and Frakolaki, E. (2017). Virus–host interactions under hypoxia. Microbes and

Infection, 19(3), 193–203. https://doi.org/10.1016/j.micinf.2016.10.004

Voorde, J. Vande, Ackermann, T., Pfetzer, N., Sumpton, D., Mackay, G., Kalna, G., Nixon, C.,

Blyth, K., Gottlieb, E., and Tardito, S. (2019). Improving the metabolic fidelity of cancer

models with a physiological cell culture medium. Science Advances, 5(1).

https://doi.org/10.1126/sciadv.aau7314

Wang, X., and Cairns, M. J. (2014). SeqGSEA: A Bioconductor package for gene set enrichment

analysis of RNA-Seq data integrating differential expression and splicing. Bioinformatics,

30(12), 1777–1779. https://doi.org/10.1093/bioinformatics/btu090

Wang, Y., Liu, L. L., Tian, Y., Chen, Y., Zha, W. H., Li, Y., and Wu, F. J. (2019). Upregulation

of DAPK2 ameliorates oxidative damage and apoptosis of placental cells in hypertensive

disorder complicating pregnancy by suppressing human placental microvascular endothelial

cell autophagy through the mTOR signaling pathway. International Journal of Biological

Macromolecules, 121, 488–497. https://doi.org/10.1016/j.ijbiomac.2018.09.111

Wang, Z., Gerstein, M., and Snyder, M. (2010). RNA-Seq: A revolutionary tool for transcriptomics.

Nature Reviews Genetics, 10(1), 57–63. https://doi.org/10.1038/nrg2484

Weber, A. P. M. (2015). Discovering new biology through sequencing of RNA. Plant

Physiology, 169(3), 1524–1531. https://doi.org/10.1104/pp.15.01081

Weber, A. P. M., Weber, K. L., Carr, K., Wilkerson, C., and Ohlrogge, J. B. (2007). Sampling

the arabidopsis transcriptome with massively parallel pyrosequencing. Plant Physiology,

144(1), 32–42. https://doi.org/10.1104/pp.107.096677

Woo, I. S., Rhee, I. K., and Park, H. D. (2000). Differential damage in bacterial cells by

microwave radiation on the basis of cell wall structure. Applied and Environmental

Microbiology, 66(5), 2243–2247. https://doi.org/10.1128/AEM.66.5.2243-2247.2000

Wu, X., Lin, M., Li, Y., Zhao, X., and Yan, F. (2009). Effects of DMEM and RPMI 1640 on the

biological behavior of dog periosteum-derived cells. Cytotechnology, 59(2), 103–111.

https://doi.org/10.1007/s10616-009-9200-5

Xie, Y., Wu, G., Tang, J., Luo, R., Patterson, J., Liu, S., Huang, W., He, G., Gu, S., Li, S., Zhou,

X., Lam, T. W., Li, Y., Xu, X., Wong, G. K. S., and Wang, J. (2014). SOAPdenovo-Trans:

De novo transcriptome assembly with short RNA-Seq reads. Bioinformatics, 30(12), 1660–

1666. https://doi.org/10.1093/bioinformatics/btu077

Yadav, H., Devalaraja, S., Chung, S. T., and Rane, S. G. (2017). TGF-β1/Smad3 pathway targets

PP2A-AMPK-FoxO1 signaling to regulate hepatic gluconeogenesis. Journal of Biological

Chemistry, 292(8), 3420–3432. https://doi.org/10.1074/jbc.M116.764910

Yang, C., Fan, J., Zhuang, Z., Fang, Y., Zhang, Y., and Wang, S. (2014). The role of NAD+-

dependent isocitrate dehydrogenase 3 subunit α in AFB1 induced liver lesion. Toxicology

Letters, 224(3), 371–379. https://doi.org/10.1016/j.toxlet.2013.10.037

Yang, Y., and Smith, S. A. (2013). Optimizing de novo assembly of short-read RNA-seq data for

phylogenomics. BMC Genomics, 14. https://doi.org/10.1186/1471-2164-14-328

Yao, T., and Asayama, Y. (2017). Animal-cell culture medium: History, characteristics, and

current issues. Reproductive Medicine and Biology, 16(2), 99–117.

https://doi.org/10.1002/rmb2.12024

Yazdani, M. (2016). Technical aspects of oxygen level regulation in primary cell cultures: A

review. Interdisciplinary Toxicology, 9(3–4), 85–89. https://doi.org/10.1515/intox-2016-0011

Yoshihara, Y., Wu, D., Kubo, N., Sang, M., Nakagawara, A., and Ozaki, T. (2012). Inhibitory

role of E2F-1 in the regulation of tumor suppressor p53 during DNA damage response.

Biochemical and Biophysical Research Communications, 421(1), 57–63.

https://doi.org/10.1016/j.bbrc.2012.03.108.

Yoshida, M., and Imai, S. (2018). Chapter 2 - Regulation of Sirtuins by Systemic NAD+

Biosynthesis. In: Introductory Review on Sirtuins in Biology, Aging, and Disease.

Academic Press, 7-25. https://doi.org/10.1016/B978-0-12-813499-3.00002-2.

Young, M. D., Wakefield, M. J., Smyth, G. K., and Oshlack, A. (2010). Gene ontology analysis

for RNA-seq: accounting for selection bias. Genome Biology, 11(2).

https://doi.org/10.1186/gb-2010-11-2-r14

Yun, H. J., Hyun, S. K., Park, J. H., Kim, B. W., and Kwon, H. J. (2012). Widdrol activates

DNA damage checkpoint through the signaling Chk2-p53-Cdc25A-p21-MCM4 pathway in

HT29 cells. Molecular and Cellular Biochemistry, 363(1–2), 281–289.

https://doi.org/10.1007/s11010-011-1180-z

Zhang, Changquan, Zhao, L., Leng, L., Zhou, Q., Zhang, S., Gong, F., Xie, P., and Lin, G.

(2020). CDCA8 regulates meiotic spindle assembly and chromosome segregation during

human oocyte meiosis. Gene, 741(February), 144495.

https://doi.org/10.1016/j.gene.2020.144495

Zhang, Chuanzhao, Samanta, D., Lu, H., Bullen, J. W., Zhang, H., Chen, I., He, X., and

Semenza, G. L. (2016). Hypoxia induces the breast cancer stem cell phenotype by HIF-

dependent and ALKBH5-mediated m6A-demethylation of NANOG mRNA. Proceedings of

the National Academy of Sciences of the United States of America, 113(14), E2047–E2056.

https://doi.org/10.1073/pnas.1602883113

Zhang, H., and Datta, A. K. (2000). Coupled electromagnetic and thermal modeling of

microwave oven heating of foods. Journal of Microwave Power and Electromagnetic

Energy, 35(2), 71–85. https://doi.org/10.1080/08327823.2000.11688421

Zhang, Z., Theurkauf, W. E., Weng, Z., and Zamore, P. D. (2012). Strand-specific libraries for

high throughput RNA sequencing (RNA-Seq) prepared without poly(A) selection. Silence,

3(1), 1–9. https://doi.org/10.1186/1758-907X-3-9

Zhou, S., Yan, Y., Chen, X., Wang, X., Zeng, S., Qian, L., Wei, J., Yang, X., Zhou, Y., Gong,

Z., and Xu, Z. (2019). Roles of highly expressed PAICS in lung adenocarcinoma. Gene,

692(January), 1–8. https://doi.org/10.1016/j.gene.2018.12.064

Zhou, Z., Wang, L., Ge, F., Gong, P., Wang, H., Wang, F., Chen, L., and Liu, L. (2018). Pold3 is

required for genomic stability and telomere integrity in embryonic stem cells and meiosis.

Nucleic Acids Research, 46(7), 3468–3486. https://doi.org/10.1093/nar/gky098

Zhu, X., Miao, X., Wu, Y., Li, C., Guo, Y., Liu, Y., Chen, Y., Lu, X., Wang, Y., and He, S.

(2015). ENO1 promotes tumor proliferation and cell adhesion mediated drug resistance

(CAM-DR) in Non-Hodgkin’s Lymphomas. Experimental Cell Research, 335(2), 216–223.

https://doi.org/10.1016/j.yexcr.2015.05.020

Zuo, Z., Zheng, Z., Liu, Z., Yi, Q., and Zou, G. (2007). Cloning, DNA shuffling and expression

of serine hydroxymethyltransferase gene from Escherichia coli strain AB90054. Enzyme and

Microbial Technology, 40, 569–577. https://doi.org/10.1016/j.enzmictec.2006.05.018

10x Genomics. 2018. What is sequencing saturation? Available at

https://kb.10xgenomics.com/hc/en-us/articles/115005062366-What-is-sequencing-saturation-

10x Genomics. 2018. How is sequencing saturation calculated? Available at

https://kb.10xgenomics.com/hc/en-us/articles/115003646912-How-is-sequencing-saturation-calculated-

Appendix Chapter 2

Methods

1. Alignment using STAR

Building the STAR index

[ilona@cedar5 E.coli]$ STAR --runMode genomeGenerate \
    --genomeDir ./STAR/STARIndex/ \
    --runThreadN 12 \
    --genomeFastaFiles Escherichia_coli.HUSEC2011CHR1.dna_sm.chromosome.Chromosome.fa \
    --sjdbGTFfile Escherichia_coli.HUSEC2011CHR1.37.chromosome.Chromosome.gtf \
    1> ./STAR/STARIndex.log 2> ./STAR/STARIndex.err

Alignment using STAR (CTR1 shown as a representative sample)

[ilona@cedar5 E.coli]$ STAR --runMode alignReads \
    --genomeDir ./STAR/STARIndex/ \
    --runThreadN 12 \
    --readFilesIn ./RawReads/CTR1.R1.fastq ./RawReads/CTR1.R2.fastq \
    --outFileNamePrefix ./STAR/new.STAR_out/CTR1. \
    --outSAMunmapped Within \
    --outSAMtype BAM SortedByCoordinate \
    --outSAMstrandField intronMotif \
    --outFilterIntronMotifs RemoveNoncanonical \
    1> ./STAR/new.STAR_out/CTR1.star.log 2> ./STAR/new.STAR_out/CTR1.star.err &
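The CTR1 call above stands in for all six samples. A minimal sketch of how the same command could be generated for each sample is given below. It only prints the commands (a dry run) and assumes that the remaining raw read files follow the CTR1 naming pattern; the names SAMPLES, star_cmd and print_star_commands are hypothetical helpers introduced for illustration and are not part of the recorded workflow.

```shell
# Dry run: print the STAR alignment command for each sample without running it.
# SAMPLES, star_cmd and print_star_commands are illustrative names (assumptions).
SAMPLES="CTR1 CTR2 CTR3 MW1 MW2 MW3"

# Build the alignment command for one sample, mirroring the CTR1 call.
# Assumes every sample has ./RawReads/<sample>.R1.fastq and .R2.fastq.
star_cmd() {
    s="$1"
    echo "STAR --runMode alignReads --genomeDir ./STAR/STARIndex/ --runThreadN 12" \
         "--readFilesIn ./RawReads/${s}.R1.fastq ./RawReads/${s}.R2.fastq" \
         "--outFileNamePrefix ./STAR/new.STAR_out/${s}." \
         "--outSAMunmapped Within --outSAMtype BAM SortedByCoordinate" \
         "--outSAMstrandField intronMotif --outFilterIntronMotifs RemoveNoncanonical"
}

# Print one command per sample, one per line.
print_star_commands() {
    for s in $SAMPLES; do
        star_cmd "$s"
    done
}

print_star_commands
```

Printing rather than executing keeps the sketch safe to run anywhere; the printed lines can be inspected before they are submitted on the cluster.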

2. Differential gene expression analysis using the Cufflinks package

Transcript assembly using Cufflinks (CTR3 shown as a representative sample)

[ilona@cedar5 E.coli]$ cufflinks -u -p 16 --library-type fr-firststrand \
    -G ./Reference/Escherichia_coli.HUSEC2011CHR1.37.chromosome.Chromosome.gtf \
    -b ./Reference/Escherichia_coli.HUSEC2011CHR1.dna_sm.chromosome.Chromosome.fa \
    -o ./Cufflinks/CTR3.Cufflinks_out \
    ./STAR_out/CTR3.Aligned.sortedByCoord.out.bam \
    1>./Cufflinks/CTR3.Cufflinks_out/Cufflinks.log 2>./Cufflinks/CTR3.Cufflinks_out/Cufflinks.err

List all the transcripts.gtf files from the Cufflinks output in one file, "gtf_files.txt"

[ilona@cedar1 new.STAR]$ find -name transcripts.gtf > gtf_files.txt

Combine the transcripts.gtf files from all samples using Cuffmerge

[ilona@cedar1 new.STAR]$ cuffmerge -o ./Cuffmerge \
    -g ../Reference/Escherichia_coli.HUSEC2011CHR1.37.chromosome.Chromosome.gtf \
    -s ../Reference/Escherichia_coli.HUSEC2011CHR1.dna_sm.chromosome.Chromosome.fa \
    gtf_files.txt 1>./Cuffmerge/Cuffmerge.log 2>./Cuffmerge/Cuffmerge.err &

Differential gene expression analysis using Cuffdiff

[ilona@cedar1 Cufflinks]$ cuffdiff -o Cuffdiff \
    -b ../../Reference/Escherichia_coli.HUSEC2011CHR1.dna_sm.chromosome.Chromosome.fa \
    --compatible-hits-norm --multi-read-correct --no-update-check --FDR --verbose --quiet \
    -p 16 -L CTR,MW -u ../Cuffmerge/merged.gtf \
    ./CTR1/CTR1.Aligned.sortedByCoord.out.bam,./CTR2/CTR2.Aligned.sortedByCoord.out.bam,./CTR3/CTR3.Aligned.sortedByCoord.out.bam \
    ./MW1/MW1.Aligned.sortedByCoord.out.bam,./MW2/MW2.Aligned.sortedByCoord.out.bam,./MW3/MW3.Aligned.sortedByCoord.out.bam \
    > Cuffdiff/STAR.Cuffdiff.log 2> Cuffdiff/STAR.Cuffdiff.err

Creating a scatter boxplot in R based on the Cuffdiff result

> library(ggplot2)
> boxplot <- ggplot(E.coli.Data.Frame, aes(x = condition, y = log10(FPKM)))
> boxplot <- boxplot + geom_boxplot()
> boxplot <- boxplot + theme_bw(base_size=12)
> boxplot <- boxplot + theme(strip.background = element_blank(), strip.text.x = element_blank())
> boxplot <- boxplot + theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())
> boxplot <- boxplot + theme(axis.text.x=element_text(family="Arial",color="black"), axis.text.y=element_text(family="Arial",color="black"))
> boxplot_1 <- boxplot + geom_boxplot() + geom_jitter(size=0.01, position = position_jitter(width = 0.21, height = 0))

Pearson correlation analysis between the two conditions based on the Cuffdiff result

> withMW1 <- read.delim("~/Pearson Correlation/withMW1.txt", header=FALSE)
> withMW1.cor <- cor(withMW1, method="pearson")
> write.table(withMW1.cor, file="withMW1.cor.txt", sep="\t")

Paired t-tests on the Cuffdiff FPKM values between samples

> t.test(E.coli.FPKM$CTR1, E.coli.FPKM$CTR2, var.equal=TRUE, paired=TRUE)$p.value
[1] 0.2718378
> t.test(E.coli.FPKM$CTR1, E.coli.FPKM$CTR3, var.equal=TRUE, paired=TRUE)$p.value
[1] 0.01413811
> t.test(E.coli.FPKM$CTR2, E.coli.FPKM$CTR3, var.equal=TRUE, paired=TRUE)$p.value
[1] 0.04022994
> t.test(E.coli.FPKM$CTR1, E.coli.FPKM$MW1, var.equal=TRUE, paired=TRUE)$p.value
[1] 0.0004660669
> t.test(E.coli.FPKM$CTR1, E.coli.FPKM$MW2, var.equal=TRUE, paired=TRUE)$p.value
[1] 1.167842e-05
> t.test(E.coli.FPKM$CTR1, E.coli.FPKM$MW3, var.equal=TRUE, paired=TRUE)$p.value
[1] 2.712154e-06
> t.test(E.coli.FPKM$CTR2, E.coli.FPKM$MW1, var.equal=TRUE, paired=TRUE)$p.value
[1] 2.487065e-05
> t.test(E.coli.FPKM$CTR2, E.coli.FPKM$MW2, var.equal=TRUE, paired=TRUE)$p.value
[1] 3.471824e-07
> t.test(E.coli.FPKM$CTR2, E.coli.FPKM$MW3, var.equal=TRUE, paired=TRUE)$p.value
[1] 2.889156e-05
> t.test(E.coli.FPKM$CTR3, E.coli.FPKM$MW1, var.equal=TRUE, paired=TRUE)$p.value
[1] 6.093775e-05
> t.test(E.coli.FPKM$CTR3, E.coli.FPKM$MW2, var.equal=TRUE, paired=TRUE)$p.value
[1] 1.344243e-08
> t.test(E.coli.FPKM$CTR3, E.coli.FPKM$MW3, var.equal=TRUE, paired=TRUE)$p.value
[1] 5.186369e-06
> t.test(E.coli.FPKM$MW1, E.coli.FPKM$MW2, var.equal=TRUE, paired=TRUE)$p.value
[1] 0.3970932
> t.test(E.coli.FPKM$MW1, E.coli.FPKM$MW3, var.equal=TRUE, paired=TRUE)$p.value
[1] 0.5414028
> t.test(E.coli.FPKM$MW2, E.coli.FPKM$MW3, paired=TRUE)$p.value
[1] 0.1406625
> t.test(E.coli.FPKM$MW2, E.coli.FPKM$MW3, var.equal=TRUE, paired=TRUE)$p.value
[1] 0.1406625
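Cuffdiff, as used in step 2, takes one positional argument per condition, and each argument is a comma-separated list of that condition's replicate BAM files with no spaces inside the list. The sketch below shows one way such lists could be assembled; it assumes the ./<sample>/<sample>.Aligned.sortedByCoord.out.bam layout of the Cuffdiff call, and bam_list is a hypothetical helper introduced here for illustration only.

```shell
# Join the replicate BAM paths of one condition with commas (no spaces).
# bam_list is an illustrative helper (assumption); usage: bam_list <condition> <replicates>
bam_list() {
    cond="$1"
    n="$2"
    out=""
    i=1
    while [ "$i" -le "$n" ]; do
        path="./${cond}${i}/${cond}${i}.Aligned.sortedByCoord.out.bam"
        if [ -z "$out" ]; then
            out="$path"
        else
            out="${out},${path}"
        fi
        i=$((i + 1))
    done
    echo "$out"
}

ctr_bams=$(bam_list CTR 3)
mw_bams=$(bam_list MW 3)

# The two lists are passed as separate arguments, in the order given to -L (CTR,MW).
echo "$ctr_bams $mw_bams"
```

Building the lists programmatically avoids stray spaces or trailing commas, either of which would change how the replicates are grouped into conditions.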

3. Creating STAR.SE.Rdata for edgeR and DESeq2

[ilona@gra-login2 R.withMW1]$ sbatch R.sh
[ilona@gra-login2 R.withMW1]$ cat R.sh
#!/bin/bash
#SBATCH --time=24:00:00
#SBATCH --account=def-pliang
#SBATCH --job-name=STAR_SummarizedOverlaps
#SBATCH --nodes=1
#SBATCH --cpus-per-task=32
#SBATCH --mem=20G
module load nixpkgs/16.09 gcc/7.3.0 r/3.6.0
export R_LIBS=~/local/R_libs/
mpirun -np 1 R CMD BATCH Sum.Overlaps.R Sum.Overlaps.txt
[ilona@gra-login2 R.withMW1]$ cat Sum.Overlaps.R
library("GenomicFeatures")
EC <- makeTxDbFromGFF("Escherichia_coli.HUSEC2011CHR1.37.chromosome.Chromosome.gtf", format="gtf")
exonsByGene <- exonsBy(EC, by="gene")
fls <- list.files(pattern="bam$", full=TRUE)
library("Rsamtools")
bamLst <- BamFileList(fls, yieldSize=100000)
library("GenomicAlignments")
se <- summarizeOverlaps(exonsByGene, bamLst, mode="Union", inter.feature=TRUE, singleEnd=FALSE, ignore.strand=FALSE, fragments=TRUE)
save(se, file="STAR.SE.Rdata")

4. DGE analysis using edgeR

[ilona@cedar1 edgeR]$ module load nixpkgs/16.09 gcc/7.3.0 r/3.6.0
[ilona@cedar1 edgeR]$ R
> library("GenomicFeatures")
> library("Rsamtools")
> library("GenomicAlignments")
> library("edgeR")
> load("STAR.SE.Rdata")
> sampleTable <- data.frame(
+   "SampleName" = c("CTR1","CTR2","CTR3","MW1","MW2","MW3"),
+   "FileName" = c("CTR1.Aligned.sortedByCoord.out.bam","CTR2.Aligned.sortedByCoord.out.bam","CTR3.Aligned.sortedByCoord.out.bam","MW1.Aligned.sortedByCoord.out.bam","MW2.Aligned.sortedByCoord.out.bam","MW3.Aligned.sortedByCoord.out.bam"),
+   "Treatment" = c("Control","Control","Control","Microwave","Microwave","Microwave"))
> colData(se) <- DataFrame(sampleTable)
> colnames(se) <- sampleTable$SampleName
> save(se, file="New.STAR.SE.Rdata")
> genetable <- data.frame(gene.id=rownames(se))
> countdata <- assay(se)
> coldata <- colData(se)
> group <- c(rep("Control",3), rep("Treatment",3))
> edgeR.DGEList <- DGEList(counts=countdata, samples=coldata, genes=genetable, group=group)
> save(edgeR.DGEList, file="edgeR.DGEList.Rdata")
> design <- model.matrix(~ group, edgeR.DGEList$samples)
> save(design, file="design.Rdata")
> new.edgeR.DGEList <- calcNormFactors(edgeR.DGEList)
> new.edgeR.DGEList <- estimateDisp(new.edgeR.DGEList, design)
> fit <- glmFit(new.edgeR.DGEList, design)
> lrt <- glmLRT(fit, coef=ncol(design))
> tt <- topTags(lrt, n=nrow(new.edgeR.DGEList), p.value=0.05)
> tt10 <- topTags(lrt)
> save(new.edgeR.DGEList, file="new.edgeR.DGEList.Rdata")
> save(fit, file="fit.Rdata")
> save(lrt, file="lrt.Rdata")
> save(tt, file="tt.Rdata")
> save(tt10, file="tt10.Rdata")
> write.csv(tt, file="tt.csv")
> write.csv(tt10, file="tt10.csv")
> tt.all <- topTags(lrt, n=nrow(new.edgeR.DGEList), sort.by="none")
> write.csv(tt.all, file="tt.all.csv")
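Once the full edgeR table has been exported, the numbers of up- and down-regulated genes at the 0.05 cutoff used in topTags can be tallied from the command line. The following is a sketch only: it assumes the exported CSV has a header row containing columns named logFC and FDR, uses a small toy file with made-up values in place of tt.all.csv, and count_degs is a hypothetical helper.

```shell
# Count up- and down-regulated genes below an FDR cutoff in an edgeR-style CSV.
# count_degs is an illustrative helper (assumption); usage: count_degs <cutoff> [file]
# Reads standard input when no file is given.
count_degs() {
    cutoff="$1"
    shift
    awk -F',' -v cutoff="$cutoff" '
        NR == 1 {
            # Locate the logFC and FDR columns by name (assumed header layout)
            for (i = 1; i <= NF; i++) {
                gsub(/"/, "", $i)
                if ($i == "logFC") fc = i
                if ($i == "FDR") fdr = i
            }
            next
        }
        $fdr + 0 < cutoff + 0 {
            if ($fc + 0 > 0) up++; else down++
        }
        END { printf "up=%d down=%d\n", up, down }
    ' "$@"
}

# Toy stand-in for tt.all.csv; gene names and values are hypothetical.
csv=$(mktemp)
cat > "$csv" <<'EOF'
"","gene.id","logFC","logCPM","LR","PValue","FDR"
"1","geneA",2.10,5.3,30.1,0.00001,0.0004
"2","geneB",-1.75,6.1,22.4,0.0002,0.003
"3","geneC",0.20,4.8,0.9,0.34,0.51
"4","geneD",-0.90,7.2,12.0,0.0005,0.006
EOF

count_degs 0.05 "$csv"   # prints: up=1 down=2
rm -f "$csv"
```

Looking the columns up by header name, rather than by position, keeps the sketch usable if the exported table carries extra annotation columns.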

5. DGE analysis using DESeq2

[ilona@cedar1 R]$ module load nixpkgs/16.09 gcc/7.3.0 r/3.6.0
[ilona@cedar1 R]$ R
> library("GenomicFeatures")
> library("Rsamtools")
> library("GenomicAlignments")
> library("DESeq2")
> load("STAR.SE.Rdata")
> sampleTable <- data.frame(
+   "SampleName" = c("CTR1","CTR2","CTR3","MW1","MW2","MW3"),
+   "FileName" = c("CTR1.Aligned.sortedByCoord.out.bam","CTR2.Aligned.sortedByCoord.out.bam","CTR3.Aligned.sortedByCoord.out.bam","MW1.Aligned.sortedByCoord.out.bam","MW2.Aligned.sortedByCoord.out.bam","MW3.Aligned.sortedByCoord.out.bam"),
+   "Treatment" = c("Control","Control","Control","Microwave","Microwave","Microwave"))
> write.csv(sampleTable, file="E.coli.sampleTable.csv")
> colData(se) <- DataFrame(sampleTable)
> colnames(se) <- sampleTable$SampleName
> ddsFull <- DESeqDataSet(se, design = ~Treatment)
> as.data.frame(colData(ddsFull)[,c("SampleName","FileName","Treatment")])
     SampleName                           FileName Treatment
CTR1       CTR1 CTR1.Aligned.sortedByCoord.out.bam   Control
CTR2       CTR2 CTR2.Aligned.sortedByCoord.out.bam   Control
CTR3       CTR3 CTR3.Aligned.sortedByCoord.out.bam   Control
MW1         MW1  MW1.Aligned.sortedByCoord.out.bam Microwave
MW2         MW2  MW2.Aligned.sortedByCoord.out.bam Microwave
MW3         MW3  MW3.Aligned.sortedByCoord.out.bam Microwave
> dds <- DESeq(ddsFull)
estimating size factors
estimating dispersions
gene-wise dispersion estimates
mean-dispersion relationship
final dispersion estimates
fitting model and testing
> DESeq.Result <- results(dds)
> save(dds, file="dds.Rdata")
> write.csv(DESeq.Result, file="DESeq.Result.csv")
> save(DESeq.Result, file="DESeq.Result.Rdata")

6. Co-expression analysis to reveal the major GO terms and KEGG pathways

[ilona@gra-login3 ~]$ projects/ctb-pliang/lianglab/bin/perl/Stat_row2allpw_cor.pl -g 0.95 -o l -v1 \
    projects/ctb-pliang/ilona/Correlation/E.coli.DEGs.FPKM.txt \
    > projects/ctb-pliang/ilona/Correlation/E.coli.grouping.cor.txt
1379 lines processed.
processing groups.
printing groups.

Gene groups containing more than 100 genes were subjected to enrichment analysis using the DAVID web tools.


Results

Appendix Figure 2.1. Sequence length distribution of the CTR_0.R1 FASTQ file determined by FastQC, showing that most sequences were longer than 72 bp.

Appendix Figure 2.2. Quality check with FastQC showed a high percentage of sequence duplication in the CTR_0.R1 FASTQ file. The blue line shows how duplication levels are distributed across the full sequence set, and its peaks suggest a large number of different highly duplicated sequences. The red line shows the de-duplicated sequences and the proportion that come from each duplication level in the original data.


Appendix Figure 2.3. Per base sequence content of the CTR_0.R1 FASTQ file determined by FastQC showed biased base composition at the start of the reads, where the lines in the plot do not run parallel to each other.


Appendix Table 2.1. Results of BLASTN and BLASTX analysis for unannotated upregulated DEGs.

Gene symbol* | Methods | Accession Number | Symbol** | Gene description***
gene:HUS2011_4268 | BLASTN | AIZ53381 | yiaK | 2,3-diketo-L-gulonate reductase, NADH-dependent
gene:HUS2011_4129 | BLASTN | AIZ53285 | yhiM | acid resistance protein, inner membrane
gene:HUS2011_3172 | BLASTN | AIZ52472 | alaE | alanine exporter, alanine-inducible, stress-responsive
gene:HUS2011_3661 | BLASTN | AIZ52807 | yqhD | aldehyde reductase, NADPH-dependent
gene:HUS2011_3727 | BLASTN | AIZ52874 | ygjK | alpha-glucosidase
gene:HUS2011_0358 | BLASTN | AIZ53059 | yahN | amino acid exporter for proline, lysine, glutamate, homoserine
gene:HUS2011_2047 | BLASTN | AIZ51509 | ydiV | anti-FlhD4C2 factor, inactive EAL family phosphodiesterase
gene:HUS2011_1821 | BLASTN | AIZ51325 | ydeA | arabinose efflux transporter, arabinose-inducible
gene:HUS2011_3106 | BLASTN | AIZ52400 | grcA | autonomous glycyl radical
gene:HUS2011_1459 | BLASTN | AIZ51004 | dauA | C4-dicarboxylic acid transporter
gene:HUS2011_5071 | BLASTN | AIZ54070 | bdcA | c-di-GMP-binding biofilm dispersal mediator protein
gene:HUS2011_0341 | BLASTN | AIZ52922 | yahA | c-di-GMP-specific phosphodiesterase
gene:HUS2011_4478 | BLASTN | AIZ53525 | chrR | chromate reductase, Class I, flavoprotein
gene:HUS2011_4660 | BLASTN | AIZ53702 | yiiG | conserved lipoprotein
gene:HUS2011_3896 | BLASTN | AIZ53036 | yhdP | conserved membrane protein, predicted transporter
gene:HUS2011_2252 | BLASTN | AIZ51661 | yebG | conserved protein regulated by LexA
gene:HUS2011_3370 | BLASTN | AIZ52650 | yqeH | conserved protein with bipartite regulator domain
gene:HUS2011_2006 | BLASTN | AIZ51466 | ydhS | conserved protein with FAD/NAD(P)-binding domain
gene:HUS2011_0954 | BLASTN | AIZ54726 | ybjD | conserved protein with nucleoside triphosphate hydrolase domain
gene:HUS2011_3445 | BLASTN | AIZ52709 | fau | conserved protein, 5-formyltetrahydrofolate cyclo-ligase family
gene:HUS2011_3110 | BLASTN | AIZ52404 | yfiP | conserved protein, DTW domain
gene:HUS2011_5003 | BLASTN | AIZ54007 | yjfM | conserved protein, DUF1190 family
gene:HUS2011_0493 | BLASTN | AIZ54306 | ybaA | conserved protein, DUF1428 family
gene:HUS2011_4999 | BLASTN | AIZ54002 | yjfI | conserved protein, DUF2170 family
gene:HUS2011_5001 | BLASTN | AIZ54004 | yjfK | conserved protein, DUF2491 family
gene:HUS2011_1734 | BLASTN | AIZ51244 | ydcX | conserved protein, DUF2566 family
gene:HUS2011_2267 | BLASTN | AIZ51678 | yebB | conserved protein, DUF830 family
gene:HUS2011_4463 | BLASTN | AIZ53510 | yidB | conserved protein, DUF937 family
gene:HUS2011_4619 | BLASTN | AIZ53663 | yihF | conserved protein, DUF945 family
gene:HUS2011_0706 | BLASTN | AIZ54530 | ybfE | conserved protein, LexA-regulated
gene:HUS2011_5000 | BLASTN | AIZ54003 | yjfJ | conserved protein, PspA/IM30 family


gene:HUS2011_3733 | BLASTN | AIZ52880 | ygjQ | conserved protein, SanA family, DUF218 superfamily
gene:HUS2011_5066 | BLASTN | AIZ54065 | ridA | conserved protein, UPF0131 family
gene:HUS2011_5069 | BLASTN | AIZ54069 | yjgH | conserved protein, UPF0131 family
gene:HUS2011_4482 | BLASTN | AIZ53529 | cbrC | conserved protein, UPF0167 family
gene:HUS2011_0010 | BLASTN | AIZ50784 | yaaW | conserved protein, UPF0174 family
gene:HUS2011_2700 | BLASTN | AIZ51999 | yejL | conserved protein, UPF0352 family
gene:HUS2011_0011 | BLASTN | AIZ50883 | yaaI | conserved protein, UPF0412 family
gene:HUS2011_3026 | BLASTN | AIZ52319 | yfgF | cyclic-di-GMP phosphodiesterase, anaerobic
gene:HUS2011_4199 | BLASTN | AIZ53324 | yhjH | cyclic-di-GMP phosphodiesterase, FlhDC-regulated
gene:HUS2011_1183 | BLASTN | AIZ50836 | ycdT | diguanylate cyclase, membrane-anchored
gene:HUS2011_4903 | BLASTN | AIZ53947 | yjdL | dipeptide and tripeptide permease
gene:HUS2011_2894 | BLASTN | AIZ52183 | dsdX | D-serine permease
gene:HUS2011_4976 | BLASTN | AIZ53977 | epmA | Elongation Factor P Lys34 lysyltransferase
gene:HUS2011_4984 | BLASTN | AIZ53986 | queG | epoxyqueuosine reductase, cobalamine-stimulated; queosine biosynthesis
gene:HUS2011_1446 | BLASTN | AIZ50990 | ycgR | flagellar velocity braking protein, c-di-GMP-regulated
gene:HUS2011_1029 | BLASTN | AIZ54804 | zapC | FtsZ stabilizer
gene:HUS2011_1727 | BLASTN | AIZ51238 | ydcR | fused predicted DNA-binding transcriptional regulator/predicted aminotransferase
gene:HUS2011_5004 | BLASTN | AIZ54008 | yjfC | glutathionylspermidine synthase homolog
gene:HUS2011_2045 | BLASTN | AIZ51507 | ydiE | hemin uptake protein HemP homolog
gene:HUS2011_2376 | BLASTN | AIZ54186 | azuC | hypothetical protein
gene:HUS2011_0118 | BLASTN | AIZ50903 | yacH | hypothetical protein
gene:HUS2011_0357 | BLASTN | AIZ53035 | yahL | hypothetical protein
gene:HUS2011_0668 | BLASTN | AIZ54494 | ybeQ | hypothetical protein
gene:HUS2011_0719 | BLASTN | AIZ54543 | ybfA | hypothetical protein
gene:HUS2011_0878 | BLASTN | AIZ54645 | ybiJ | hypothetical protein
gene:HUS2011_0980 | BLASTN | AIZ54753 | ycaK | hypothetical protein
gene:HUS2011_1379 | BLASTN | AIZ50924 | ycfJ | hypothetical protein
gene:HUS2011_1448 | BLASTN | AIZ50992 | ycgY | hypothetical protein
gene:HUS2011_1836 | BLASTN | AIZ51334 | ydeI | hypothetical protein
gene:HUS2011_1787 | BLASTN | AIZ51295 | ydeM | hypothetical protein
gene:HUS2011_1788 | BLASTN | AIZ51296 | ydeN | hypothetical protein
gene:HUS2011_2007 | BLASTN | AIZ51468 | ydhT | hypothetical protein
gene:HUS2011_2147 | BLASTN | AIZ51554 | ydjY | hypothetical protein
gene:HUS2011_2199 | BLASTN | AIZ51606 | yeaR | hypothetical protein
gene:HUS2011_2241 | BLASTN | AIZ51650 | yebW | hypothetical protein
gene:HUS2011_2375 | BLASTN | AIZ51720 | yecJ | hypothetical protein
gene:HUS2011_2351 | BLASTN | AIZ51694 | yecT | hypothetical protein

gene:HUS2011_2586 | BLASTN | AIZ51883 | yegL | hypothetical protein
gene:HUS2011_2627 | BLASTN | AIZ51926 | yehI | hypothetical protein
gene:HUS2011_2810 | BLASTN | AIZ52125 | yfcI | hypothetical protein
gene:HUS2011_2852 | BLASTN | AIZ52166 | yfdF | hypothetical protein
gene:HUS2011_2904 | BLASTN | AIZ52194 | yfdX | hypothetical protein
gene:HUS2011_2945 | BLASTN | AIZ52237 | yfeK | hypothetical protein
gene:HUS2011_3030 | BLASTN | AIZ52322 | yfgI | hypothetical protein
gene:HUS2011_3173 | BLASTN | AIZ52473 | ygaC | hypothetical protein
gene:HUS2011_3498 | BLASTN | AIZ52754 | yggM | hypothetical protein
gene:HUS2011_3500 | BLASTN | AIZ52756 | yggN | hypothetical protein
gene:HUS2011_3889 | BLASTN | AIZ53028 | yhcN | hypothetical protein
gene:HUS2011_4006 | BLASTN | AIZ53166 | yhfS | hypothetical protein
gene:HUS2011_4008 | BLASTN | AIZ53168 | yhfU | hypothetical protein
gene:HUS2011_4012 | BLASTN | AIZ53172 | yhfY | hypothetical protein
gene:HUS2011_4126 | BLASTN | AIZ53283 | yhiJ | hypothetical protein
gene:HUS2011_4211 | BLASTN | AIZ53334 | yhjR | hypothetical protein
gene:HUS2011_4292 | BLASTN | AIZ53404 | yibH | hypothetical protein
gene:HUS2011_4715 | BLASTN | AIZ53752 | yijF | hypothetical protein
gene:HUS2011_4818 | BLASTN | AIZ53854 | yjbL | hypothetical protein
gene:HUS2011_4819 | BLASTN | AIZ53855 | yjbM | hypothetical protein
gene:HUS2011_4833 | BLASTN | AIZ53868 | yjcB | hypothetical protein
gene:HUS2011_4901 | BLASTN | AIZ53945 | yjdO | hypothetical protein
gene:HUS2011_4863 | BLASTN | AIZ53905 | yjdP | hypothetical protein
gene:HUS2011_5074 | BLASTN | AIZ54073 | yjgL | hypothetical protein
gene:HUS2011_5234 | BLASTN | AIZ54151 | yjjI | hypothetical protein
gene:HUS2011_5223 | BLASTN | AIZ54139 | yjjZ | hypothetical protein
gene:HUS2011_0328 | BLASTN | AIZ52795 | ykgI | hypothetical protein
gene:HUS2011_1411 | BLASTN | AIZ50960 | ymgC | hypothetical protein
gene:HUS2011_1418 | BLASTN | AIZ50963 | ymgD | hypothetical protein
gene:HUS2011_2245 | BLASTN | AIZ51654 | yobA | hypothetical protein
gene:HUS2011_2920 | BLASTN | AIZ52211 | ypeC | hypothetical protein
gene:HUS2011_3664 | BLASTN | AIZ52809 | yqhG | hypothetical protein
gene:HUS2011_3693 | BLASTN | AIZ52840 | yqiI | hypothetical protein
gene:HUS2011_0925 | BLASTN | AIZ54694 | ybjM | inner membrane protein
gene:HUS2011_1042 | BLASTN | AIZ54816 | yccS | inner membrane protein
gene:HUS2011_1384 | BLASTN | AIZ50930 | ycfT | inner membrane protein
gene:HUS2011_1390 | BLASTN | AIZ50937 | ycfZ | inner membrane protein
gene:HUS2011_2244 | BLASTN | AIZ51653 | yebZ | inner membrane protein
gene:HUS2011_4441 | BLASTN | AIZ53489 | yidI | inner membrane protein
gene:HUS2011_2121 | BLASTN | AIZ51529 | ydjM | inner membrane protein regulated by LexA
gene:HUS2011_4198 | BLASTN | AIZ53323 | yhjG | Inner membrane protein, AsmA family


gene:HUS2011_0992 | BLASTN | AIZ54766 | ycaI | inner membrane protein, ComEC family of competence proteins
gene:HUS2011_4481 | BLASTN | AIZ53528 | cbrB | inner membrane protein, creBC regulon
gene:HUS2011_0495 | BLASTN | AIZ54308 | ylaC | inner membrane protein, DUF1449 family
gene:HUS2011_3695 | BLASTN | AIZ52842 | yqiJ | inner membrane protein, DUF1449 family
gene:HUS2011_4110 | BLASTN | AIZ53266 | yhhQ | inner membrane protein, DUF165 family
gene:HUS2011_1962 | BLASTN | AIZ51422 | ydgK | inner membrane protein, DUF2569 family
gene:HUS2011_0936 | BLASTN | AIZ54705 | ybjO | inner membrane protein, DUF2593 family
gene:HUS2011_0407 | BLASTN | AIZ53571 | yaiY | inner membrane protein, DUF2755 family
gene:HUS2011_4293 | BLASTN | AIZ53405 | yibI | inner membrane protein, DUF3302 family
gene:HUS2011_0327 | BLASTN | AIZ52785 | ykgB | inner membrane protein, DUF417 family
gene:HUS2011_0565 | BLASTN | AIZ54376 | ybcI | inner membrane protein, DUF457 family
gene:HUS2011_5079 | BLASTN | AIZ54077 | yjgN | inner membrane protein, DUF898 family
gene:HUS2011_0009 | BLASTN | AIZ54730 | yaaH | inner membrane protein, Grp1_Fun34_YaaH family
gene:HUS2011_3737 | BLASTN | AIZ52885 | ygjV | inner membrane protein, Imp-YgjV family
gene:HUS2011_4620 | BLASTN | AIZ53664 | yihG | inner membrane protein, Predicted acyltransferase
gene:HUS2011_4237 | BLASTN | AIZ53349 | yhjX | inner membrane protein, predicted oxalate-formate antiporter
gene:HUS2011_4714 | BLASTN | AIZ53751 | yijE | inner membrane protein, predicted permease
gene:HUS2011_4197 | BLASTN | AIZ53322 | yhjE | inner membrane protein, predicted transporter
gene:HUS2011_4635 | BLASTN | AIZ53677 | yihN | inner membrane protein, predicted transporter
gene:HUS2011_1721 | BLASTN | AIZ51233 | ydcO | inner membrane protein, predicted transporter, function unknown
gene:HUS2011_0061 | BLASTN | AIZ54435 | yabI | inner membrane protein, SNARE_assoc family
gene:HUS2011_0863 | BLASTN | AIZ54628 | ybhM | inner membrane protein, UPF0005 family
gene:HUS2011_3356 | BLASTN | AIZ52635 | ygdQ | inner membrane protein, UPF0053 family
gene:HUS2011_0163 | BLASTN | AIZ51338 | yadS | inner membrane protein, UPF0126 family
gene:HUS2011_4343 | BLASTN | AIZ53459 | yicG | inner membrane protein, UPF0126 family
gene:HUS2011_4253 | BLASTN | AIZ53367 | yiaA | inner membrane protein, YiaAB family
gene:HUS2011_4254 | BLASTN | AIZ53368 | yiaB | inner membrane protein, YiaAB family
gene:HUS2011_1237 | BLASTN | AIZ54281 | insA | IS1 repressor TnpA
gene:HUS2011_0220 | BLASTN | AIZ51951 | yafT | lipoprotein
gene:HUS2011_0943 | BLASTN | AIZ54713 | ybjP | lipoprotein
gene:HUS2011_3124 | BLASTN | AIZ52419 | yfiL | lipoprotein
gene:HUS2011_2632 | BLASTN | AIZ51930 | yehR | lipoprotein, DUF1307 family
gene:HUS2011_1653 | BLASTN | AIZ51137 | ynaI | mechanosensitive channel protein, very small conductance

gene:HUS2011_4279 | BLASTN | AIZ53392 | yiaV | membrane fusion protein (MFP) component of efflux pump, signal anchor
gene:HUS2011_2446 | BLASTN | AIZ51784 | yedY | membrane-anchored, periplasmic TMAO, DMSO reductase
gene:HUS2011_0622 | BLASTN | AIZ54449 | ybdL | methionine aminotransferase, PLP-dependent
gene:HUS2011_3452 | BLASTN | AIZ52716 | scpA | methylmalonyl-CoA mutase
gene:HUS2011_2692 | BLASTN | AIZ51992 | yejF | microcin C transporter, ATP-binding subunit; ABC family
gene:HUS2011_2017 | BLASTN | AIZ51478 | ynhG | murein L,D-transpeptidase
gene:HUS2011_3029 | BLASTN | AIZ52321 | yfgH | outer membrane integrity lipoprotein
gene:HUS2011_3665 | BLASTN | AIZ52810 | yqhH | outer membrane lipoprotein, Lpp paralog
gene:HUS2011_0390 | BLASTN | AIZ53359 | yaiO | outer membrane protein
gene:HUS2011_3696 | BLASTN | AIZ52844 | yqiK | PHB family membrane protein, function unknown
gene:HUS2011_4480 | BLASTN | AIZ53527 | yieH | phosphoenolpyruvate and 6-phosphogluconate phosphatase
gene:HUS2011_3512 | BLASTN | AIZ52767 | yghG | pilotin, required for secretin (GspDbeta) OM localization; part of defective Gsp-beta operon; verified
gene:HUS2011_2956 | BLASTN | AIZ52250 | yfeX | porphyrinogen oxidase, cytoplasmic
gene:HUS2011_3972 | BLASTN | AIZ53129 | tusB | protein required for 2-thiolation step of mnm(5)-s(2)U34-tRNA synthesis
gene:HUS2011_2204 | BLASTN | AIZ51611 | yeaW | putative 2Fe-2S cluster-containing protein
gene:HUS2011_1030 | BLASTN | AIZ54802 | ycbX | putative 2Fe-2S cluster-containing protein; 6-N-hydroxylaminopurine resistance protein
gene:HUS2011_3088 | BLASTN | AIZ52381 | yfhL | putative 4Fe-4S cluster-containing protein
gene:HUS2011_1074 | BLASTN | AIZ50802 | yccM | putative 4Fe-4S membrane protein
gene:HUS2011_4483 | BLASTN | AIZ53531 | yieK | putative 6-phosphogluconolactonase
gene:HUS2011_2034 | BLASTN | AIZ51496 | ydiO | putative acyl-CoA dehydrogenase
gene:HUS2011_0263 | BLASTN | AIZ52122 | yafP | putative acyltransferase with acyl-CoA N-acyltransferase domain
gene:HUS2011_1454 | BLASTN | AIZ50999 | ycgV | putative adhesin
gene:HUS2011_4196 | BLASTN | AIZ53320 | yhjD | putative alternate lipid exporter, suppressor of msbA and KDO essentiality, inner membrane protein
gene:HUS2011_3919 | BLASTN | AIZ53056 | yhdX | putative amino-acid transporter subunit
gene:HUS2011_3920 | BLASTN | AIZ53057 | yhdY | putative amino-acid transporter subunit
gene:HUS2011_2828 | BLASTN | AIZ52141 | yfcJ | putative arabinose efflux transporter
gene:HUS2011_1941 | BLASTN | AIZ51399 | ydgI | putative arginine/ornithine antiporter transporter
gene:HUS2011_0319 | BLASTN | AIZ52691 | ecpC | putative aromatic compound dioxygenase
gene:HUS2011_3634 | BLASTN | AIZ52781 | yghS | putative ATP-binding protein
gene:HUS2011_3635 | BLASTN | AIZ52782 | yghT | putative ATP-binding protein
gene:HUS2011_4530 | BLASTN | AIZ53574 | yifB | putative bifunctional enzyme and transcriptional regulator
gene:HUS2011_3400 | BLASTN | AIZ52665 | ygeW | putative carbamoyltransferase


gene:HUS2011_2900 | BLASTN | AIZ52190 | yfdE | putative CoA-, NAD(P)-binding
gene:HUS2011_4192 | BLASTN | AIZ53316 | yhjA | putative cytochrome C peroxidase
gene:HUS2011_2182 | BLASTN | AIZ51594 | yeaJ | putative diguanylate cyclase
gene:HUS2011_2924 | BLASTN | AIZ52215 | yfeA | putative diguanylate cyclase
gene:HUS2011_3685 | BLASTN | AIZ52832 | ygiD | putative dioxygenase, LigB family
gene:HUS2011_1842 | BLASTN | AIZ51341 | ydfI | putative D-mannonate oxidoreductase, NAD-dependent
gene:HUS2011_0212 | BLASTN | AIZ51855 | yafC | putative DNA-binding transcriptional regulator
gene:HUS2011_0402 | BLASTN | AIZ53530 | yaiV | putative DNA-binding transcriptional regulator
gene:HUS2011_0872 | BLASTN | AIZ54638 | ybiH | putative DNA-binding transcriptional regulator
gene:HUS2011_0979 | BLASTN | AIZ54751 | ycaN | putative DNA-binding transcriptional regulator
gene:HUS2011_1641 | BLASTN | AIZ51127 | ycjW | putative DNA-binding transcriptional regulator
gene:HUS2011_1650 | BLASTN | AIZ51135 | ycjZ | putative DNA-binding transcriptional regulator
gene:HUS2011_1722 | BLASTN | AIZ51234 | ydcN | putative DNA-binding transcriptional regulator
gene:HUS2011_2187 | BLASTN | AIZ51596 | yeaM | putative DNA-binding transcriptional regulator
gene:HUS2011_2934 | BLASTN | AIZ52224 | yfeR | putative DNA-binding transcriptional regulator
gene:HUS2011_3087 | BLASTN | AIZ52380 | yfhH | putative DNA-binding transcriptional regulator
gene:HUS2011_3457 | BLASTN | AIZ52719 | ygfI | putative DNA-binding transcriptional regulator
gene:HUS2011_4195 | BLASTN | AIZ53319 | yhjC | putative DNA-binding transcriptional regulator
gene:HUS2011_4724 | BLASTN | AIZ53761 | yijO | putative DNA-binding transcriptional regulator
gene:HUS2011_0138 | BLASTN | AIZ51101 | yadK | putative fimbrial-like adhesin protein
gene:HUS2011_0139 | BLASTN | AIZ51110 | yadL | putative fimbrial-like adhesin protein
gene:HUS2011_0140 | BLASTN | AIZ51121 | yadM | putative fimbrial-like adhesin protein
gene:HUS2011_0145 | BLASTN | AIZ51161 | yadN | putative fimbrial-like adhesin protein
gene:HUS2011_0734 | BLASTN | AIZ54557 | ybgO | putative fimbrial-like adhesin protein
gene:HUS2011_1792 | BLASTN | AIZ51300 | ydeQ | putative fimbrial-like adhesin protein
gene:HUS2011_1794 | BLASTN | AIZ51303 | ydeS | putative fimbrial-like adhesin protein
gene:HUS2011_2846 | BLASTN | AIZ52159 | yfcV | putative fimbrial-like adhesin protein
gene:HUS2011_3690 | BLASTN | AIZ52836 | ygiL | putative fimbrial-like adhesin protein
gene:HUS2011_3797 | BLASTN | AIZ52936 | yraH | putative fimbrial-like adhesin protein
gene:HUS2011_3800 | BLASTN | AIZ52939 | yraK | putative fimbrial-like adhesin protein
gene:HUS2011_3014 | BLASTN | AIZ52307 | focB | putative formate transporter
gene:HUS2011_1630 | BLASTN | AIZ51116 | ycjM | putative glucosyltransferase
gene:HUS2011_1649 | BLASTN | AIZ51134 | ycjY | putative hydrolase
gene:HUS2011_1699 | BLASTN | AIZ51212 | ynbC | putative hydrolase

gene:HUS2011_0891 | BLASTN | AIZ54659 | ybiP | putative hydrolase, inner membrane
gene:HUS2011_3829 | BLASTN | AIZ52968 | yhbX | putative hydrolase, inner membrane
gene:HUS2011_3839 | BLASTN | AIZ52979 | yhbE | putative inner membrane permease
gene:HUS2011_1497 | BLASTN | AIZ51040 | ychE | putative inner membrane protein
gene:HUS2011_1981 | BLASTN | AIZ51441 | ydhI | putative inner membrane protein
gene:HUS2011_2228 | BLASTN | AIZ51637 | yebO | putative inner membrane protein
gene:HUS2011_2803 | BLASTN | AIZ52117 | yfcC | putative inner membrane protein
gene:HUS2011_4007 | BLASTN | AIZ53167 | yhfT | putative inner membrane protein
gene:HUS2011_4592 | BLASTN | AIZ53634 | yigM | putative inner membrane protein
gene:HUS2011_1811 | BLASTN | AIZ51316 | yneE | putative inner membrane protein, bestrophin family
gene:HUS2011_3511 | BLASTN | AIZ52765 | yqgA | putative inner membrane protein, DUF554 family
gene:HUS2011_2919 | BLASTN | AIZ52210 | yfeO | putative ion channel protein
gene:HUS2011_0202 | BLASTN | AIZ51723 | yaeF | putative lipoprotein
gene:HUS2011_0203 | BLASTN | AIZ51723 | yaeF | putative lipoprotein
gene:HUS2011_0442 | BLASTN | AIZ53882 | yajI | putative lipoprotein
gene:HUS2011_3335 | BLASTN | AIZ52611 | ygdI | putative lipoprotein
gene:HUS2011_0258 | BLASTN | AIZ52052 | yafL | putative lipoprotein and C40 family peptidase
gene:HUS2011_2905 | BLASTN | AIZ54265 | ypdI | putative lipoprotein involved in colanic acid biosynthesis
gene:HUS2011_4461 | BLASTN | AIZ53508 | yidX | putative lipoproteinC
gene:HUS2011_1412 | BLASTN | AIZ50961 | ycgG | putative membrane-anchored cyclic-di-GMP phosphodiesterase
gene:HUS2011_4834 | BLASTN | AIZ53869 | yjcC | putative membrane-anchored cyclic-di-GMP phosphodiesterase
gene:HUS2011_0494 | BLASTN | AIZ54307 | ylaB | putative membrane-anchored cyclic-di-GMP phosphodiesterase
gene:HUS2011_2181 | BLASTN | AIZ51593 | yeaI | putative membrane-anchored diguanylate cyclase
gene:HUS2011_0911 | BLASTN | AIZ54679 | yliF | putative membrane-anchored diguanylate cyclase
gene:HUS2011_0977 | BLASTN | AIZ54749 | ycaD | putative MFS-type transporter
gene:HUS2011_4277 | BLASTN | AIZ53391 | yiaT | putative outer membrane protein
gene:HUS2011_3799 | BLASTN | AIZ52938 | yraJ | putative outer membrane protein
gene:HUS2011_2115 | BLASTN | AIZ51522 | ydiY | putative outer membrane protein, acid-inducible
gene:HUS2011_0621 | BLASTN | AIZ54448 | ybdH | putative oxidoreductase
gene:HUS2011_1791 | BLASTN | AIZ51299 | ydeP | putative oxidoreductase
gene:HUS2011_2011 | BLASTN | AIZ51472 | ydhV | putative oxidoreductase
gene:HUS2011_2205 | BLASTN | AIZ51613 | yeaX | putative oxidoreductase
gene:HUS2011_4160 | BLASTN | AIZ53286 | yhiN | putative oxidoreductase with FAD/NAD(P)-binding domain
gene:HUS2011_1634 | BLASTN | AIZ51120 | ycjQ | putative oxidoreductase, Zn-dependent and NAD(P)-binding
gene:HUS2011_2172 | BLASTN | AIZ51583 | ydjL | putative oxidoreductase, Zn-dependent and NAD(P)-binding

gene:HUS2011_1934 | BLASTN | AIZ51391 | ydgD | putative peptidase
gene:HUS2011_2597 | BLASTN | AIZ51892 | yegQ | putative peptidase
gene:HUS2011_3814 | BLASTN | AIZ52954 | yhbU | putative peptidase (collagenase-like)
gene:HUS2011_0988 | BLASTN | AIZ54762 | ycaL | putative peptidase with chaperone function
gene:HUS2011_1994 | BLASTN | AIZ51452 | ydhO | putative peptidase, C40 clan
gene:HUS2011_3692 | BLASTN | AIZ52839 | yqiH | putative periplasmic pilin chaperone
gene:HUS2011_2844 | BLASTN | AIZ52158 | yfcS | putative periplasmic pilus chaperone
gene:HUS2011_3806 | BLASTN | AIZ52946 | yraQ | putative permease
gene:HUS2011_1696 | BLASTN | AIZ51208 | ydbD | putative PF10971 family periplasmic methylglyoxal resistance protein
gene:HUS2011_4477 | BLASTN | AIZ53524 | yieE | putative phosphopantetheinyl transferase, COG2091 family
gene:HUS2011_3492 | BLASTN | AIZ52748 | yggR | putative pilus retraction ATPase
gene:HUS2011_0131 | BLASTN | AIZ51033 | yadE | putative polysaccharide deacetylase lipoprotein
gene:HUS2011_1784 | BLASTN | AIZ51293 | yddB | putative porin protein
gene:HUS2011_3127 | BLASTN | AIZ52422 | yfiB | putative positive effector of YfiN activity, OM lipoprotein
gene:HUS2011_3815 | BLASTN | AIZ52955 | yhbV | putative protease
gene:HUS2011_4577 | BLASTN | AIZ53620 | yigE | putative protein, DUF2233 family
gene:HUS2011_5233 | BLASTN | AIZ54150 | yjjW | putative pyruvate formate lyase activating enzyme
gene:HUS2011_3752 | BLASTN | AIZ52897 | yqjF | putative quinol oxidase subunit
gene:HUS2011_1575 | BLASTN | AIZ51056 | yciE | putative rubrerythrin/ferritin-like metal-binding protein
gene:HUS2011_1851 | BLASTN | AIZ51379 | ynfE | putative selenate reductase, periplasmic
gene:HUS2011_3753 | BLASTN | AIZ52898 | yqjG | putative S-transferase
gene:HUS2011_1631 | BLASTN | AIZ51117 | ycjN | putative sugar transporter subunit: periplasmic-binding component of ABC superfamily
gene:HUS2011_5032 | BLASTN | AIZ54034 | ytfH | putative transcriptional regulator, HxlR-type, DUF24 family
gene:HUS2011_3717 | BLASTN | AIZ52864 | yqjI | putative transcriptional regulator, PadR family
gene:HUS2011_4425 | BLASTN | AIZ53475 | nepI | putative transporter
gene:HUS2011_0006 | BLASTN | AIZ54434 | yaaJ | putative transporter
gene:HUS2011_0043 | BLASTN | AIZ53686 | yaaU | putative transporter
gene:HUS2011_0952 | BLASTN | AIZ54723 | ybjE | putative transporter
gene:HUS2011_0924 | BLASTN | AIZ54693 | ybjL | putative transporter
gene:HUS2011_1834 | BLASTN | AIZ51332 | ydeE | putative transporter
gene:HUS2011_1996 | BLASTN | AIZ51454 | ydhP | putative transporter
gene:HUS2011_2165 | BLASTN | AIZ51575 | ydjE | putative transporter
gene:HUS2011_2188 | BLASTN | AIZ51598 | yeaN | putative transporter
gene:HUS2011_2203 | BLASTN | AIZ51610 | yeaV | putative transporter
gene:HUS2011_2232 | BLASTN | AIZ51641 | yebQ | putative transporter

gene:HUS2011_2901 | BLASTN | AIZ52191 | yfdV | putative transporter
gene:HUS2011_3246 | BLASTN | AIZ52540 | ygbN | putative transporter
gene:HUS2011_3760 | BLASTN | AIZ52907 | yhaO | putative transporter
gene:HUS2011_4223 | BLASTN | AIZ53341 | yhjV | putative transporter
gene:HUS2011_4355 | BLASTN | AIZ53470 | yicJ | putative transporter
gene:HUS2011_4449 | BLASTN | AIZ53494 | yidE | putative transporter
gene:HUS2011_4977 | BLASTN | AIZ53978 | yjeM | putative transporter
gene:HUS2011_2628 | BLASTN | AIZ51927 | yehL | putative transporter subunit: ATP-binding component of ABC superfamily
gene:HUS2011_4047 | BLASTN | AIZ53206 | yhgA | putative transposase
gene:HUS2011_0248 | BLASTN | AIZ53279 | yhhI | putative transposase
gene:HUS2011_0579 | BLASTN | AIZ53279 | yhhI | putative transposase
gene:HUS2011_0723 | BLASTN | AIZ53279 | yhhI | putative transposase
gene:HUS2011_0724 | BLASTN | AIZ53279 | yhhI | putative transposase
gene:HUS2011_4484 | BLASTN | AIZ53532 | yieL | putative xylanase
gene:HUS2011_4202 | BLASTN | AIZ53326 | yhjJ | putative zinc-dependent peptidase
gene:HUS2011_2528 | BLASTN | AIZ51825 | plaP | putrescine importer, low affinity
gene:HUS2011_1520 | BLASTN | AIZ51365 | ydfA | Qin prophage; putative protein
gene:HUS2011_3757 | BLASTN | AIZ52902 | yhaK | redox-sensitive bicupin
gene:HUS2011_4291 | BLASTN | AIZ53402 | rhsA | rhsA element core protein RshA
gene:HUS2011_4862 | BLASTN | AIZ53904 | rpiB | ribose 5-phosphate isomerase B/allose 6-phosphate isomerase
gene:HUS2011_2957 | BLASTN | AIZ52251 | yfeY | RpoE-regulated lipoprotein
gene:HUS2011_3060 | BLASTN | AIZ52352 | yfhR | S9 peptidase family protein, function unknown
gene:HUS2011_1841 | BLASTN | AIZ51340 | ydfZ | selenoprotein, function unknown
gene:HUS2011_3701 | BLASTN | AIZ52848 | ygiM | SH3 domain protein
gene:HUS2011_4754 | BLASTN | AIZ53792 | yjaZ | stationary phase growth adaptation protein
gene:HUS2011_3808 | BLASTN | AIZ52949 | yhbO | stress-resistance protein
gene:HUS2011_2733 | BLASTN | AIZ52040 | yfaQ | tandem DUF2300 domain protein, function unknown
gene:HUS2011_1471 | BLASTN | AIZ51017 | ldrC | toxic polypeptide, small
gene:HUS2011_5072 | BLASTN | AIZ54071 | bdcR | transcriptional repressor for divergent bdcA
gene:HUS2011_0255 | BLASTN | AIZ52031 | yafQ | translation inhibitor toxin of toxin-antitoxin pair YafQ/DinJ
gene:HUS2011_3168 | BLASTN | AIZ52469 | ygaV | tributyltin-inducible repressor of ygaVP
gene:HUS2011_0918 | BLASTN | AIZ54686 | ybjG | undecaprenyl pyrophosphate phosphatase
gene:HUS2011_3417 | BLASTN | AIZ52684 | uacT | uric acid permease
gene:HUS2011_2447 | BLASTN | ER3413_2035 | yedZ | -
gene:HUS2011_3911 | BLASTN | ER3413_3347 | hdJ | -
gene:HUS2011_0910 | BLASTN | ER3413_851 | yliE | -
gene:HUS2011_1238 | BLASTN | ER3413_3673 | xylF | -
gene:HUS2011_5106 | BLASTN | ER3413_3673 | xylF | -

gene:HUS2011_0573 | BLASTN | AIZ54384 | sfmF | putative fimbrial-like adhesin protein
gene:HUS2011_0654 | BLASTN | AIZ54480 | ybeD | conserved protein, UPF0250 family
gene:HUS2011_1427 | BLASTN | AIZ50971 | pliG | hypothetical protein
gene:HUS2011_2756 | BLASTN | AIZ52066 | nudI | nucleoside triphosphatase
gene:HUS2011_4034 | BLASTN | AIZ53195 | yhgE | putative inner membrane protein
gene:HUS2011_4078 | BLASTN | AIZ53232 | yhhW | quercetinase activity in vitro
gene:HUS2011_4290 | BLASTN | AIZ53403 | yibA | putative lyase containing HEAT-repeat
gene:HUS2011_4354 | BLASTN | AIZ53469 | yicI | putative alpha-glucosidase
gene:HUS2011_4960 | BLASTN | AIZ53958 | yjeH | putative transporter
gene:HUS2011_0787 | BLASTN | AIZ54611 | ybhC | acyl-CoA thioesterase, lipoprotein
gene:HUS2011_1072 | BLASTX | - | - | -
gene:HUS2011_3674 | BLASTX | - | - | -
gene:HUS2011_5189 | BLASTX | WP_038976843.1 | - | adhesin
gene:HUS2011_5196 | BLASTX | WP_000738579.1 | - | amino acid ABC transporter substrate-binding protein
gene:HUS2011_1417 | BLASTX | WP_001390463.1 | - | autotransporter outer membrane beta-barrel domain-containing protein
gene:HUS2011_5135 | BLASTX | WP_000625670.1 | - | DUF2254 domain-containing protein
gene:HUS2011_1542 | BLASTX | APE80338.1 | - | endopeptidase
gene:HUS2011_0018 | BLASTX | WP_077627289.1 | - | fimbria/pilus outer membrane usher protein
gene:HUS2011_1797 | BLASTX | WP_137456420.1 | - | fimbrial protein
gene:HUS2011_2845 | BLASTX | WP_001112829.1 | yfcU | fimbrial usher protein YfcU
gene:HUS2011_0137 | BLASTX | EAA3064927.1 | - | fimbrial-like adhesin
gene:HUS2011_2466 | BLASTX | WP_000480487.1 | - | glycosyltransferase family 9 protein
gene:HUS2011_1648 | BLASTX | EAB7521485.1 | - | hypothetical protein
gene:HUS2011_0017 | BLASTX | WP_053897244.1 | - | hypothetical protein
gene:HUS2011_0170 | BLASTX | WP_071600053.1 | - | hypothetical protein
gene:HUS2011_0792 | BLASTX | WP_000188871.1 | - | hypothetical protein
gene:HUS2011_1479 | BLASTX | WP_000069486.1 | - | hypothetical protein
gene:HUS2011_2194 | BLASTX | WP_000310501.1 | - | hypothetical protein
gene:HUS2011_2470 | BLASTX | APE79413.1 | - | hypothetical protein
gene:HUS2011_2479 | BLASTX | EGK23134.1 | - | hypothetical protein
gene:HUS2011_2745 | BLASTX | WP_001009400.1 | - | hypothetical protein
gene:HUS2011_3354 | BLASTX | WP_152071814.1 | - | hypothetical protein
gene:HUS2011_3501 | BLASTX | WP_001347978.1 | - | hypothetical protein
gene:HUS2011_3687 | BLASTX | WP_134798122.1 | - | hypothetical protein
gene:HUS2011_4419 | BLASTX | ESP32249.1 | - | hypothetical protein
gene:HUS2011_4468 | BLASTX | WP_000250224.1 | - | hypothetical protein
gene:HUS2011_5155 | BLASTX | WP_016234075.1 | - | hypothetical protein
gene:HUS2011_5178 | BLASTX | APE81848.1 | - | hypothetical protein
gene:HUS2011_4711 | BLASTX | KDX42473.1 | - | hypothetical protein AC16_4204
gene:HUS2011_2679 | BLASTX | KDT06510.1 | - | hypothetical protein AC66_2320

gene:HUS2011_2517 | BLASTX | APT62020.2 | - | hypothetical protein BUE82_08605
gene:HUS2011_1980 | BLASTX | ABV06040.1 | - | hypothetical protein EcHS_A1719
gene:HUS2011_1303 | BLASTX | ABG69046.1 | - | Hypothetical protein ECP_1033
gene:HUS2011_5170 | BLASTX | ABG71875.1 | - | hypothetical protein ECP_3904
gene:HUS2011_4262 | BLASTX | APE77751.1 | - | hypothetical protein FORC29_0137
gene:HUS2011_1010 | BLASTX | ASI51897.1 | - | Hypothetical protein FORC43_3600
gene:HUS2011_4097 | BLASTX | EFK53292.1 | - | hypothetical protein HMPREF9345_00355
gene:HUS2011_3700 | BLASTX | EFK46855.1 | - | hypothetical protein HMPREF9346_01479
gene:HUS2011_3523 | BLASTX | AFS87942.1 | - | hypothetical protein O3O_21110
gene:HUS2011_3394 | BLASTX | APJ64785.1 | - | hypothetical protein RG25_21960
gene:HUS2011_4953 | BLASTX | WP_001218815.1 | - | integrase arm-type DNA-binding domain-containing protein
gene:HUS2011_3484 | BLASTX | OSK64305.1 | - | LOW QUALITY PROTEIN: hypothetical protein EAEG_02305, partial
gene:HUS2011_2703 | BLASTX | EYZ58963.1 | - | membrane protein
gene:HUS2011_5211 | BLASTX | WP_113073644.1 | - | MFS transporter
gene:HUS2011_1843 | BLASTX | WP_000151243.1 | - | MHS family MFS transporter
gene:HUS2011_3623 | BLASTX | WP_121865827.1 | - | N-acetyltransferase
gene:HUS2011_3691 | BLASTX | ASQ68667.1 | - | outer membrane usher protein
gene:HUS2011_2516 | BLASTX | WP_137502388.1 | - | phosphotriesterase
gene:HUS2011_3151 | BLASTX | BAI32365.1 or OJS29288.1 | - | predicted IS602 transposase OrfA or transposase
gene:HUS2011_4066 | BLASTX | WP_122991610.1 | - | PstS family phosphate ABC transporter substrate-binding protein
gene:HUS2011_0401 | BLASTX | CCQ27280.1 | - | putative flagellar structural protein
gene:HUS2011_0683 | BLASTX | APE81146.1 | - | Putative membrane protein
gene:HUS2011_2112 | BLASTX | SQY54839.1 | - | putative type III effector protein (ankyrin repeat protein B)
gene:HUS2011_5209 | BLASTX | WP_001387313.1 | btsT | pyruvate/proton symporter BtsT (btsT gene)
gene:HUS2011_0684 | BLASTX | WP_096961215.1 | - | rhomboid family intramembrane serine protease
gene:HUS2011_3918 | BLASTX | WP_000181142.1 | - | Rpn family recombination-promoting nuclease/putative transposase
gene:HUS2011_0460 | BLASTX | WP_024174603.1 | - | sel1 repeat family protein
gene:HUS2011_4809 | BLASTX | WP_001347654.1 | - | SopA family protein
gene:HUS2011_3612 | BLASTX | WP_024177769.1 | gspS2 | type II secretion system pilot lipoprotein GspS-beta (gspS2 gene)
gene:HUS2011_4356 | BLASTX | WP_001218908.1 | - | tyrosine-type recombinase/integrase
gene:HUS2011_4545 | BLASTX | STI82318.1 | - | uncharacterised protein
gene:HUS2011_1651 | BLASTX | WP_001324051.1 | - | Unnamed protein product
gene:HUS2011_2626 | BLASTX | WP_001215604.1 | - | WGR and DUF4132 domain-containing protein
gene:HUS2011_4301 | BLASTX | WP_136770576.1 | - | YjiK family protein
gene:HUS2011_3630 | BLASTX | WP_109955477.1 | - | YtfJ family protein


gene:HUS2011_0246 | BLASTX | KQJ27762 | - | hypothetical protein | -
gene:HUS2011_0538 | BLASTX | MZV17504.1 | - | hypothetical protein | -
gene:HUS2011_0788 | BLASTX | WP_178943752.1 | - | tyrosine-type recombinase/integrase | -
gene:HUS2011_0796 | BLASTX | AQZ87531.1 | - | Host-nuclease inhibitor protein gam | -
gene:HUS2011_1290 | BLASTX | WP_033547090.1 | - | domain-containing protein | -
gene:HUS2011_1525 | BLASTX | WP_109554494.1 | - | DUF977 family protein | -
gene:HUS2011_3615 | BLASTX | CEK06843.1 | - | Conserved hypothetical protein | -

* Gene symbol after DEG analysis using E. coli HUSEC2011.37 (release 37) from Ensembl Genomes.
** Gene symbol after BLASTN analysis using E. coli K12 complementary deoxyribonucleic acid (cDNA) (release 44) from Ensembl Bacteria and BLASTX search on the NCBI website.
*** Gene description after BLASTN and BLASTX, validated using the gene description from DAVID enrichment analysis.


Appendix Table 2.2. Results of BLASTN and BLASTX analysis for unannotated downregulated DEGs.

Gene symbol* | Methods | Accession Number | Symbol** | Gene description***
gene:HUS2011_2753 | BLASTN | AIZ52060 | rhmA | 2-keto-3-deoxy-L-rhamnonate aldolase
gene:HUS2011_3252 | BLASTN | AIZ52541 | nlpD | activator of AmiC murein hydrolase activity, lipoprotein
gene:HUS2011_1294 | BLASTN | AIZ50841 | ycdX | alkaline phosphatase required for swarming
gene:HUS2011_1409 | BLASTN | AIZ50956 | bluF | anti-repressor for YcgE, blue light-responsive; FAD-binding; has c-di-GMP phosphodiesterase-like EAL domain, but does
gene:HUS2011_1338 | BLASTN | AIZ50882 | flgA | assembly protein for flagellar basal-body periplasmic P ring
gene:HUS2011_1741 | BLASTN | AIZ51252 | yncE | ATP-binding protein, periplasmic, function unknown
gene:HUS2011_3966 | BLASTN | AIZ53122 | bfr | bacterioferritin, iron storage and detoxification protein
gene:HUS2011_4892 | BLASTN | AIZ53936 | yjdF | conserved inner membrane protein
gene:HUS2011_0728 | BLASTN | AIZ54551 | ybgI | conserved metal-binding protein, NIF3 family
gene:HUS2011_2190 | BLASTN | AIZ51600 | yoaF | conserved outer membrane lipoprotein
gene:HUS2011_1753 | BLASTN | AIZ51261 | yddE | conserved predicted enzyme, PhzC-PhzF family
gene:HUS2011_0947 | BLASTN | AIZ54717 | ybjT | conserved protein with NAD(P)-binding Rossmann-fold domain
gene:HUS2011_3883 | BLASTN | AIZ53021 | yhcM | conserved protein with nucleoside triphosphate hydrolase domain
gene:HUS2011_5036 | BLASTN | AIZ54040 | ytfK | conserved protein, DUF1107 family
gene:HUS2011_0667 | BLASTN | AIZ54493 | ybeL | conserved protein, DUF1451 family
gene:HUS2011_5086 | BLASTN | AIZ54085 | yjgR | conserved protein, DUF853 family with NTPase fold
gene:HUS2011_2383 | BLASTN | AIZ51725 | yecA | conserved protein, UPF0149 family
gene:HUS2011_3443 | BLASTN | AIZ52707 | ygfB | conserved protein, UPF0149 family
gene:HUS2011_0416 | BLASTN | AIZ53643 | yaiI | conserved protein, UPF0178 family, downregulated by beryllium
gene:HUS2011_1486 | BLASTN | AIZ51032 | ychJ | conserved protein, UPF0225 family
gene:HUS2011_1504 | BLASTN | AIZ51046 | yciU | conserved protein, UPF0263 family
gene:HUS2011_2798 | BLASTN | AIZ52113 | yfbU | conserved protein, UPF0304 family
gene:HUS2011_2594 | BLASTN | AIZ51891 | yegP | conserved protein, UPF0339 family
gene:HUS2011_2193 | BLASTN | AIZ51604 | yeaQ | conserved protein, UPF0410 family
gene:HUS2011_3683 | BLASTN | AIZ52830 | ygiB | conserved protein, UPF0441 family
gene:HUS2011_1333 | BLASTN | AIZ50877 | yceH | conserved protein, UPF0502 family
gene:HUS2011_1398 | BLASTN | AIZ50944 | ycfD | cupin superfamily protein
gene:HUS2011_1738 | BLASTN | AIZ51249 | curA | curcumin/dihydrocurcumin reductase, NADPH-dependent
gene:HUS2011_2191 | BLASTN | AIZ51601 | yeaP | diguanylate cyclase

gene:HUS2011_0101 | BLASTN | AIZ54803 | yacG | DNA gyrase inhibitor
gene:HUS2011_4616 | BLASTN | AIZ53660 | yihD | DUF1040 protein YihD
gene:HUS2011_3680 | BLASTN | AIZ52827 | yqiB | DUF1249 protein YqiB
gene:HUS2011_3261 | BLASTN | AIZ52549 | ygbE | DUF3561 family inner membrane protein
gene:HUS2011_1712 | BLASTN | AIZ51225 | ydcH | DUF465 family protein, function unknown
gene:HUS2011_2236 | BLASTN | AIZ51645 | msrC | free methionine-(R)-sulfoxide reductase
gene:HUS2011_2219 | BLASTN | AIZ51627 | yoaE | fused predicted membrane protein/conserved protein
gene:HUS2011_3073 | BLASTN | AIZ52366 | yphE | fused predicted sugar transporter subunits of ABC superfamily: ATP-binding components
gene:HUS2011_5247 | BLASTN | AIZ54165 | yjjK | fused predicted transporter subunits of ABC superfamily: ATP-binding components
gene:HUS2011_5046 | BLASTN | AIZ54050 | ytfQ | galactofuranose binding protein: periplasmic-binding component of ABC superfamily
gene:HUS2011_0915 | BLASTN | AIZ54682 | gstB | glutathione S-transferase
gene:HUS2011_2806 | BLASTN | AIZ52120 | yfcF | glutathione S-transferase
gene:HUS2011_4286 | BLASTN | AIZ53401 | yibF | glutathione S-transferase homolog
gene:HUS2011_1744 | BLASTN | AIZ51254 | yncG | glutathione S-transferase homolog
gene:HUS2011_3684 | BLASTN | AIZ52831 | ygiC | glutathionylspermidine synthase homolog
gene:HUS2011_3618 | BLASTN | AIZ52772 | glcG | hypothetical protein
gene:HUS2011_0120 | BLASTN | AIZ50925 | yacL | hypothetical protein
gene:HUS2011_0213 | BLASTN | AIZ51877 | yafD | hypothetical protein
gene:HUS2011_0385 | BLASTN | AIZ53321 | yaiL | hypothetical protein
gene:HUS2011_1005 | BLASTN | AIZ54780 | ycbK | hypothetical protein
gene:HUS2011_1086 | BLASTN | AIZ50814 | yccJ | hypothetical protein
gene:HUS2011_1377 | BLASTN | AIZ50922 | ycfP | hypothetical protein
gene:HUS2011_1439 | BLASTN | AIZ50982 | ycgB | hypothetical protein
gene:HUS2011_1428 | BLASTN | AIZ50973 | ycgL | hypothetical protein
gene:HUS2011_1442 | BLASTN | AIZ51021 | ychN | hypothetical protein
gene:HUS2011_1751 | BLASTN | AIZ51258 | yddH | hypothetical protein
gene:HUS2011_1940 | BLASTN | AIZ51398 | ydgH | hypothetical protein
gene:HUS2011_2004 | BLASTN | AIZ51464 | ydhQ | hypothetical protein
gene:HUS2011_2024 | BLASTN | AIZ51485 | ydiH | hypothetical protein
gene:HUS2011_2117 | BLASTN | AIZ51524 | ydiZ | hypothetical protein
gene:HUS2011_2173 | BLASTN | AIZ51584 | yeaC | hypothetical protein
gene:HUS2011_2240 | BLASTN | AIZ51648 | yebV | hypothetical protein
gene:HUS2011_2526 | BLASTN | AIZ51823 | yeeD | hypothetical protein
gene:HUS2011_2754 | BLASTN | AIZ52064 | yfaY | hypothetical protein
gene:HUS2011_2832 | BLASTN | AIZ52146 | yfcL | hypothetical protein

gene:HUS2011_3166 | BLASTN | AIZ52465 | ygaU | hypothetical protein
gene:HUS2011_3671 | BLASTN | AIZ52818 | ygiW | hypothetical protein
gene:HUS2011_3758 | BLASTN | AIZ52905 | yhaL | hypothetical protein
gene:HUS2011_3884 | BLASTN | AIZ53022 | yhcB | hypothetical protein
gene:HUS2011_3979 | BLASTN | AIZ53137 | yheV | hypothetical protein
gene:HUS2011_4830 | BLASTN | AIZ53865 | yjbR | hypothetical protein
gene:HUS2011_4963 | BLASTN | AIZ53961 | yjeI | hypothetical protein
gene:HUS2011_5006 | BLASTN | AIZ54010 | yjfN | hypothetical protein
gene:HUS2011_1724 | BLASTN | AIZ51236 | yncJ | hypothetical protein
gene:HUS2011_2247 | BLASTN | AIZ51656 | yobB | hypothetical protein
gene:HUS2011_2227 | BLASTN | AIZ51636 | yobF | hypothetical protein
gene:HUS2011_3070 | BLASTN | AIZ52363 | yphB | hypothetical protein
gene:HUS2011_3748 | BLASTN | AIZ52893 | yqjC | hypothetical protein
gene:HUS2011_0867 | BLASTN | AIZ54634 | ybhQ | inner membrane protein
gene:HUS2011_2250 | BLASTN | AIZ51659 | yebE | inner membrane protein, DUF533 family
gene:HUS2011_3754 | BLASTN | AIZ52899 | yhaH | inner membrane protein, DUF805 family
gene:HUS2011_1943 | BLASTN | AIZ51400 | ydgC | inner membrane protein, GlpM family
gene:HUS2011_3431 | BLASTN | AIZ52696 | yqfA | inner membrane protein, hemolysin III family HylIII
gene:HUS2011_2667 | BLASTN | AIZ51966 | yeiH | inner membrane protein, UPF0324 family
gene:HUS2011_2527 | BLASTN | AIZ51824 | yeeE | inner membrane protein, UPF0394 family
gene:HUS2011_2645 | BLASTN | AIZ51944 | yohC | inner membrane protein, Yip1 family
gene:HUS2011_3049 | BLASTN | AIZ52342 | iscX | Iron binding protein associated with IscS; putative molecular adaptor of IscS function
gene:HUS2011_3430 | BLASTN | AIZ52695 | ygfZ | iron-sulfur cluster repair protein, plumbagin resistance
gene:HUS2011_2272 | BLASTN | AIZ51683 | yecD | isochorismatase family protein
gene:HUS2011_1382 | BLASTN | AIZ50928 | ycfS | L,D-transpeptidase linking Lpp to murein
gene:HUS2011_1646 | BLASTN | AIZ51133 | ycjG | L-Ala-D/L-Glu epimerase
gene:HUS2011_1839 | BLASTN | AIZ51337 | ydfG | malonic semialdehyde reductase, NADPH-dependent; L-allo-threonine dehydrogenase, NAD(P)-dependent; also oxidizes L-serine, D-serine, D-threonine and
gene:HUS2011_3749 | BLASTN | AIZ52894 | yqjD | membrane-anchored ribosome-binding protein
gene:HUS2011_1004 | BLASTN | AIZ54779 | ycbB | murein L,D-transpeptidase
gene:HUS2011_1308 | BLASTN | AIZ50854 | ymdB | O-acetyl-ADP-ribose deacetylase; RNase III inhibitor during cold shock; putative cardiolipin synthase C regulatory
gene:HUS2011_3476 | BLASTN | AIZ52736 | loiP | Phe-Phe periplasmic metalloprotease, OM lipoprotein; low salt-inducible; heat shock protein that binds Era
gene:HUS2011_5253 | BLASTN | AIZ54170 | ytjC | phosphatase

gene:HUS2011_3111 | BLASTN | AIZ52405 | pka | protein lysine acetyltransferase
gene:HUS2011_3506 | BLASTN | AIZ52761 | yggX | protein that protects iron-sulfur proteins against oxidative damage
gene:HUS2011_3903 | BLASTN | AIZ53043 | acuI | putative acryloyl-CoA reductase
gene:HUS2011_3368 | BLASTN | AIZ52648 | yqeF | putative acyltransferase
gene:HUS2011_3699 | BLASTN | AIZ52847 | ygiF | putative adenylate cyclase
gene:HUS2011_4840 | BLASTN | AIZ53875 | yjcE | putative cation/proton antiporter
gene:HUS2011_0249 | BLASTN | AIZ51970 | yafV | putative C-N hydrolase family amidase, NAD(P)-binding
gene:HUS2011_0877 | BLASTN | AIZ54643 | ybiC | putative dehydrogenase
gene:HUS2011_2912 | BLASTN | AIZ52202 | ypdC | putative DNA-binding protein
gene:HUS2011_1709 | BLASTN | AIZ51222 | ydcI | putative DNA-binding transcriptional regulator
gene:HUS2011_2666 | BLASTN | AIZ51965 | yeiE | putative DNA-binding transcriptional regulator
gene:HUS2011_3399 | BLASTN | AIZ52664 | ygeV | putative DNA-binding transcriptional regulator
gene:HUS2011_3756 | BLASTN | AIZ52901 | yhaJ | putative DNA-binding transcriptional regulator
gene:HUS2011_3076 | BLASTN | AIZ52369 | yphH | putative DNA-binding transcriptional regulator
gene:HUS2011_2530 | BLASTN | AIZ51827 | yeeZ | putative epimerase, with NAD(P)-binding Rossmann-fold domain
gene:HUS2011_0876 | BLASTN | AIZ54642 | ybiB | putative family 3 glycosyltransferase
gene:HUS2011_3314 | BLASTN | AIZ52592 | yqcA | putative flavoprotein
gene:HUS2011_4595 | BLASTN | AIZ53636 | ysgA | putative hydrolase
gene:HUS2011_2661 | BLASTN | AIZ51960 | yeiB | putative inner membrane protein
gene:HUS2011_1654 | BLASTN | AIZ51140 | ynaJ | putative inner membrane protein, DUF2534 family
gene:HUS2011_0408 | BLASTN | AIZ53575 | yaiZ | putative inner membrane protein, DUF2754 family
gene:HUS2011_1671 | BLASTN | AIZ51180 | ydbJ | putative lipoprotein, DUF333 family
gene:HUS2011_1006 | BLASTN | AIZ54781 | ycbL | putative metal-binding enzyme
gene:HUS2011_2349 | BLASTN | AIZ51692 | yecM | putative metal-binding enzyme
gene:HUS2011_3108 | BLASTN | AIZ52402 | yfiF | putative methyltransferase
gene:HUS2011_3734 | BLASTN | AIZ52881 | ygjR | putative NAD(P)-binding dehydrogenase
gene:HUS2011_3434 | BLASTN | AIZ52699 | ygfF | putative NAD(P)-binding oxidoreductase with NAD(P)-binding Rossmann-fold domain
gene:HUS2011_3916 | BLASTN | AIZ53055 | yhdV | putative outer membrane protein
gene:HUS2011_1960 | BLASTN | AIZ51420 | ydgJ | putative oxidoreductase
gene:HUS2011_2161 | BLASTN | AIZ51571 | ydjA | putative oxidoreductase
gene:HUS2011_1334 | BLASTN | AIZ50878 | yceM | putative oxidoreductase with NAD(P)-binding Rossmann-fold domain
gene:HUS2011_0356 | BLASTN | AIZ53024 | yahK | putative oxidoreductase, Zn-dependent and NAD(P)-binding
gene:HUS2011_3071 | BLASTN | AIZ52364 | yphC | putative oxidoreductase, Zn-dependent and NAD(P)-binding

gene:HUS2011_2118 | BLASTN | AIZ51525 | yniA | putative phosphotransferase/kinase
gene:HUS2011_2911 | BLASTN | AIZ52201 | ypdB | putative response regulator in two-component system with YpdA
gene:HUS2011_1320 | BLASTN | AIZ50864 | yceA | putative rhodanese-related sulfurtransferase
gene:HUS2011_3639 | BLASTN | AIZ52786 | yghU | putative S-transferase
gene:HUS2011_3072 | BLASTN | AIZ52365 | yphD | putative sugar transporter subunit: membrane component of ABC superfamily
gene:HUS2011_3074 | BLASTN | AIZ52367 | yphF | putative sugar transporter subunit: periplasmic-binding component of ABC superfamily
gene:HUS2011_4042 | BLASTN | AIZ53200 | yhgF | putative transcriptional accessory protein
gene:HUS2011_4954 | BLASTN | AIZ53951 | yjdC | putative transcriptional regulator
gene:HUS2011_3241 | BLASTN | AIZ52535 | ygbI | putative transcriptional regulator, DeoR family
gene:HUS2011_4246 | BLASTN | AIZ53357 | yiaG | putative transcriptional regulator, HTH_CROC1 family
gene:HUS2011_3676 | BLASTN | AIZ52822 | ygiN | quinol monooxygenase
gene:HUS2011_3746 | BLASTN | AIZ52890 | yqjA | required, with yghB, for membrane integrity; inner membrane protein
gene:HUS2011_2772 | BLASTN | AIZ52082 | elaB | ribosome-binding protein, probably membrane-anchored, function unknown
gene:HUS2011_1321 | BLASTN | AIZ50865 | yceI | secreted protein
gene:HUS2011_2797 | BLASTN | AIZ52111 | yfbT | sugar phosphatase
gene:HUS2011_4829 | BLASTN | AIZ53864 | yjbQ | thiamin phosphate synthase
gene:HUS2011_1295 | BLASTN | AIZ50842 | ycdY | YcdX chaperone, redox enzyme maturation protein (REMP) required for swarming
gene:HUS2011_2179 | BLASTN | AIZ51590 | yeaG | protein kinase, function unknown; autokinase
gene:HUS2011_2581 | BLASTN | AIZ51876 | yegE | putative diguanylate cyclase, GGDEF domain signaling
gene:HUS2011_0479 | BLASTN | AIZ54183 | ybaV | conserved protein, ComEA homolog
gene:HUS2011_0998 | BLASTN | AIZ54774 | ycbJ | hypothetical protein
gene:HUS2011_5192 | BLASTX | EGI43241.1 | - | 2-hydroxyglutaryl-CoA dehydratase, D-component subfamily
gene:HUS2011_0826 | BLASTX | EFK26425.1 | - | bacteriophage lambda head decoration protein D
gene:HUS2011_0648 | BLASTX | MGS83472.1 | - | deaminated glutathione amidase
gene:HUS2011_2600 | BLASTX | WP_072113137.1 | - | DeoR/GlpR transcriptional regulator
gene:HUS2011_0828 | BLASTX | MHU81261.1 | - | DNA-packaging protein FI
gene:HUS2011_1475 | BLASTX | WP_001169669.1 | ychN | DsrE/F sulfur relay family protein YchN
gene:HUS2011_1172 | BLASTX | CDL43138.1 | - | FIG00638688:hypothetical protein
gene:HUS2011_4171 | BLASTX | WP_001674693.1 | - | hypothetical protein
gene:HUS2011_0827 | BLASTX | WP_000063238.1 | - | major capsid protein
gene:HUS2011_0832 | BLASTX | WP_000056717.1 | - | phage tail protein
gene:HUS2011_5206 | BLASTX | CQR83726.1 | - | putative GTPase
gene:HUS2011_0829 | BLASTX | WP_000297127.1 | - | tail attachment protein, partial

* Gene symbol after DEG analysis using E. coli HUSEC2011.37 (release 37) from Ensembl Genomes.
** Gene symbol after BLASTN analysis using E. coli K12 complementary deoxyribonucleic acid (cDNA) (release 44) from Ensembl Bacteria and BLASTX search on the NCBI website.
*** Gene description after BLASTN and BLASTX, validated by using gene description after DAVID enrichment analysis.

Appendix Table 2.3. List of 36 out of 753 DEGs that are turned on after MWI treatment.
Gene Symbol | CTR (FPKM) | MW (FPKM) | log2(FC) | FC | p-value
gene:HUS2011_0170 | 0 | 173.949 | inf | inf | 5.00E-05
gene:HUS2011_2516 | 0 | 141.675 | inf | inf | 0.00045
gene:HUS2011_1980 | 0 | 136.603 | inf | inf | 5.00E-05
gene:HUS2011_1010 | 0 | 119.332 | inf | inf | 5.00E-05
hokA | 0 | 83.077 | inf | inf | 5.00E-05
gene:HUS2011_1651 | 0 | 62.9859 | inf | inf | 0.0004
gene:HUS2011_1471 | 0 | 61.4552 | inf | inf | 5.00E-05
gene:HUS2011_4711 | 0 | 49.1549 | inf | inf | 0.00015
gene:HUS2011_3484 | 0 | 43.5716 | inf | inf | 0.00145
gene:HUS2011_2679 | 0 | 39.6816 | inf | inf | 0.02395
gene:HUS2011_1411 | 0 | 35.2344 | inf | inf | 5.00E-05
cnu | 0 | 32.3625 | inf | inf | 5.00E-05
gene:HUS2011_3354 | 0 | 30.3113 | inf | inf | 5.00E-05
gene:HUS2011_2745 | 0 | 26.5611 | inf | inf | 5.00E-05
cspH | 0 | 22.8309 | inf | inf | 5.00E-05
gene:HUS2011_0140 | 0 | 22.5742 | inf | inf | 5.00E-05
gene:HUS2011_3674 | 0 | 20.2315 | inf | inf | 0.0103
gene:HUS2011_2905 | 0 | 20.187 | inf | inf | 0.02395
gene:HUS2011_4901 | 0 | 18.7519 | inf | inf | 5.00E-05
gene:HUS2011_4419 | 0 | 18.2201 | inf | inf | 0.002
gene:HUS2011_2194 | 0 | 16.5644 | inf | inf | 0.02395
gene:HUS2011_4097 | 0 | 14.3045 | inf | inf | 0.01295
gene:HUS2011_4254 | 0 | 13.7917 | inf | inf | 5.00E-05
gene:HUS2011_1290 | 0 | 9.93356 | inf | inf | 5.00E-05
gene:HUS2011_1303 | 0 | 8.86343 | inf | inf | 5.00E-05
gene:HUS2011_0328 | 0 | 8.51601 | inf | inf | 5.00E-05
gene:HUS2011_0573 | 0 | 8.43618 | inf | inf | 0.00035
gene:HUS2011_3615 | 0 | 8.15123 | inf | inf | 5.00E-05
fimZ | 0 | 8.13938 | inf | inf | 5.00E-05
gene:HUS2011_4262 | 0 | 7.8383 | inf | inf | 0.00145
gene:HUS2011_4818 | 0 | 7.74003 | inf | inf | 5.00E-05
gene:HUS2011_2470 | 0 | 6.3168 | inf | inf | 0.02395
gene:HUS2011_1792 | 0 | 6.13968 | inf | inf | 5.00E-05
gene:HUS2011_0538 | 0 | 6.03917 | inf | inf | 0.0179
tdcR | 0 | 5.93738 | inf | inf | 5.00E-05
safA | 0 | 5.86117 | inf | inf | 0.00105
Abbreviations: FC, fold change; FPKM, Fragments Per Kilobase of transcript per Million mapped reads.

Appendix Table 2.4. List of the 10 DEGs with the highest FC in each direction, excluding genes with infinite FC.

10 up-regulated genes with the highest FC
Gene Symbol | CTR (FPKM) | MW (FPKM) | log2(FC) | FC | p-value
ibpA | 3.89997 | 558.277 | 7.16138 | 143.149 | 5.00E-05
bglH | 0.149892 | 20.4098 | 7.08919 | 136.1634 | 0.044
torY | 0.929654 | 113.268 | 6.92884 | 121.8389 | 5.00E-05
gene:HUS2011_0220 | 0.188968 | 16.4136 | 6.4406 | 86.85915 | 0.0026
ibpB | 7.89025 | 588.568 | 6.22099 | 74.59434 | 5.00E-05
gene:HUS2011_2628 | 0.17068 | 10.6057 | 5.9574 | 62.13792 | 0.044
gene:HUS2011_1630 | 0.878849 | 51.6526 | 5.87708 | 58.77301 | 5.00E-05
gene:HUS2011_2846 | 1.10096 | 58.2335 | 5.72501 | 52.89338 | 5.00E-05
gene:HUS2011_5178 | 0.267313 | 13.9062 | 5.70106 | 52.02216 | 0.044
gene:HUS2011_2267 | 0.682773 | 33.6679 | 5.62383 | 49.31053 | 0.00045

10 down-regulated genes with the highest FC
Gene Symbol | CTR (FPKM) | MW (FPKM) | log2(FC) | FC | p-value
aceA | 6323.92 | 46.9463 | -7.07366 | -134.705 | 5.00E-05
gatD | 4608.37 | 43.0142 | -6.7433 | -107.136 | 5.00E-05
aceB | 5594.3 | 78.832 | -6.14903 | -70.9648 | 5.00E-05
melA | 2933.49 | 44.6114 | -6.03906 | -65.7565 | 5.00E-05
gatC | 3447.5 | 58.1006 | -5.89085 | -59.3367 | 5.00E-05
sucB | 4354.74 | 80.644 | -5.75488 | -53.9996 | 5.00E-05
sucA | 5066.2 | 97.7266 | -5.69601 | -51.8405 | 5.00E-05
acs | 770.026 | 17.5194 | -5.45788 | -43.9528 | 5.00E-05
gatB | 2455.24 | 78.1995 | -4.97256 | -31.3971 | 5.00E-05
gene:HUS2011_1172 | 2343.38 | 77.1291 | -4.92517 | -30.3826 | 5.00E-05
Abbreviations: FC, fold change; FPKM, Fragments Per Kilobase of transcript per Million mapped reads.
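
The log2(FC) and FC columns of Appendix Tables 2.3 and 2.4 can be recomputed from the two FPKM columns. The following is a minimal sketch, not the pipeline used in this work. It assumes, based only on the tabulated values, that log2(FC) is log2(MW/CTR), that FC is reported as a signed value (the negative reciprocal of the ratio for down-regulated genes), and that a zero CTR value gives `inf`. The function name `fold_change` is hypothetical.

```python
import math


def fold_change(ctr_fpkm: float, mw_fpkm: float) -> tuple[float, float]:
    """Return (log2 fold change, signed fold change) of MW relative to CTR.

    Conventions assumed here (inferred from the tables, not stated in the text):
    - log2(FC) = log2(MW / CTR)
    - FC is the ratio itself when up-regulated and the negative
      reciprocal of the ratio when down-regulated
    - zero expression in exactly one condition gives an infinite value
    """
    if ctr_fpkm < 0 or mw_fpkm < 0:
        raise ValueError("FPKM values must be non-negative")
    if ctr_fpkm == 0 and mw_fpkm == 0:
        return math.nan, math.nan  # no expression in either condition
    if ctr_fpkm == 0:
        return math.inf, math.inf  # gene "turned on" (Appendix Table 2.3)
    if mw_fpkm == 0:
        return -math.inf, -math.inf  # gene "turned off"
    ratio = mw_fpkm / ctr_fpkm
    log2_fc = math.log2(ratio)
    signed_fc = ratio if ratio >= 1 else -1 / ratio
    return log2_fc, signed_fc


if __name__ == "__main__":
    # FPKM values copied from Appendix Tables 2.3 and 2.4
    examples = [("ibpA", 3.89997, 558.277),
                ("aceA", 6323.92, 46.9463),
                ("hokA", 0.0, 83.077)]
    for gene, ctr, mw in examples:
        log2_fc, fc = fold_change(ctr, mw)
        print(f"{gene}: log2(FC) = {log2_fc:.5f}, FC = {fc:.3f}")
```

Under these assumptions the sketch reproduces the tabulated ibpA and aceA values to within rounding, which suggests that the negative FC values in the down-regulated block are reciprocal ratios rather than differences.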

Appendix Table 2.5. The result of significant GO terms and KEGG pathways from up-regulated DEGs in E. coli in response to microwave irradiation.

Cellular component
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
GO:0005886~plasma membrane | 8.17E-11 | 1.389940079 | 3.92E-09 | 221 | YABI, GUDP, ALAE, HYCD, HTPG, YIBI, YIBH, YDJM, DTPD, YNAI, ECNA, YIAV, PHOR, YDJE, ACRF, ACRE, YGBN, GNTP, MALX, GNTT, LPXT, YICJ, GNTU, RHTA, GLCA, YICG, AZUC, YBCI, GABP, KUP, GADC, YLAB, YLAC, PTSG, YCHE, MGRB, CDH, PLAP, MDFA, YIAB, YHAO, DGOT, YIAA, DEGP, YAAU, CHIQ, YAAJ, HSRA, ANSP, YCFZ, ARSB, YIHN, YCFT, NEPI, YAFL, CSGG, YHBE, METQ, YAFT, TTDT, GSPC, CSGD, MGTA, YIHG, MDTE, MDTF, MDTK, DINF, NARK, YIGM, CBRB, YPDI, TYRP, ARTP, GFCA, YEHR, CITT, YCGG, AAEX, YDGI, YHBX, YDGK, YBHM, YCDT, YIDE, YQGA, YDHP, YIDI, APBE, MREB, AAEA, AAEB, YADS, XANQ, YDHI, CRED, PQIA, MNGA, YEJF, YAEF, ARAE, FOCB, ARAG, YHDY, PSIE, YHDX, BGLF, PITB, ZITB, HDED, YJGN, MNTH, YFIB, SETB, SETC, YHIM, YQIK, YBJE, YQIJ, YBJG, YHHQ, YCCM, YGDI, YBJM, YBJP, YBJO, BASS, YGDQ, YDCO, YCCS, YBJL, YAJI, CADC, YHGE, CADB, WECH, CSRD, DNAA, YLIF, YAIY, UHPT, FRUA, YNFE, KDPB, TRG, UHPB, YQJF, YFGF, PSUT, YFGH, DNAK, UACT, YDEE, EMRD, YDEA, YCAD, YCAI, YHFT, YAHN, NIRC, LEUE, PSPD, YEAJ, YEAI, TDCC, DSDX, RSEP, FHUF, DCUC, YGID, FRWC, DCUD, YEAN, MURP, PSPG, YFEO, YFEA, YJCC, PPPA, SECG, LEPA, DAUA, PROP, FIEF, ALX, TREB, YGHG, YNEE, YFDV, GLPF, XYLE, HOKC, PYRD, KCH, PGAC, YFCJ, PGAD, FRYC, YRAQ, NDH, YFCC, YJEH, YEBZ, OPGC, OPGB, YJEM, ADIC, TSGA, YEBQ, YEBO, YHJG, YHJE, YHJD, YGJV, CLCB, GCD, YEAV, GPT, YHJX, YHJV, YGJQ
GO:0016021~integral component of membrane | 5.33E-07 | 1.40205183 | 1.28E-05 | 150 | YABI, GUDP, ALAE, PITB, HYCD, ZITB, YIBH, HDED, YJGN, YDJM, DTPD, YNAI, MNTH, YFIB, SETB, YIAV, SETC, PHOR, YDJE, ACRF, MALX, YFHR, YICJ, RHTA, YICG, YBCI, YQIK, YQIJ, YBJG, YHHQ, YCCM, YBJM, YBJO, YGDQ, BASS, YDCO, YCCS, YBJL, KUP, CADC, YHGE, CSRD, YLIF, YAIY, UHPT, FRUA, KDPB, TRG, YLAB, YLAC, UHPB, YCHE, MGRB, YQJF, YFGF, CDH, YDEE, EMRD, YDEA, YHAO, DGOT, YCAD, YCAI, YHFT, YAHN, YAAU, NIRC, YAAJ, LEUE, HSRA, YCFZ, YIHN, YEAJ, YEAI, TDCC, DCUC, YCFT, NEPI, DCUD, YHBE, YEAN, PSPG, YFEO, YCFJ, TTDT, GSPC, YJCC, PPPA, SECG, YIHG, MDTG, PROP, MDTF, MDTK, DINF, FIEF, YIGM, ALX, YNEE, TREB, YFDV, CBRB, TORY, GFCA, HOKA, CITT, HOKC, AAEX, YDGK, YHBX, KCH, YBHM, YCDT, YIDE, YQGA, YDHP, YIDI, YRAJ, PGAA, PGAC, YFCJ, PGAD, FRYC, FRYA, YRAQ, YFCC, AAEA, AAEB, YADS, YEBZ, OPGC, OPGB, YDHI, CRED, PQIA, YEBQ, YEBO, YHJG, YHJE, PAGP, YEAV, GCD, YIDX, YHJX, YHDY, PSIE, YHDX, BGLF, YHJV, YGJQ
GO:0009289~pilus | 1.40E-04 | 2.853954391 | 0.002233 | 16 | SFMF, YRAK, YRAH, YADK, SFMA, YADN, YADM, CSGA, YQII, YADL, CSGB, YDES, YBGO, YDEQ, YFCV, YGIL

KEGG Pathway
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
eco00040:Pentose and glucuronate interconversions | 9.38E-06 | 5.342857143 | 6.94E-04 | 11 | ARAA, YIAK, ARAB, UXAB, UXUA, UGD, UXAC, UXUB, YBHC, RSPA, RSPB

Appendix Table 2.6. List of up-regulated DEGs that are shared between 2 cellular components.
Interaction | Overlap size | Similarity coefficient | Overlap genes
GO:0005886~PLASMA MEMBRANE AND GO:0016021~INTEGRAL COMPONENT OF MEMBRANE | 140 | 0.769697 | YABI, GUDP, ALAE, HYCD, YIBH, YDJM, DTPD, YNAI, YIAV, PHOR, YDJE, ACRF, MALX, YICJ, RHTA, YICG, YBCI, KUP, YLAB, YLAC, YCHE, MGRB, CDH, YHAO, DGOT, YAAU, YAAJ, HSRA, YCFZ, YIHN, YCFT, NEPI, YHBE, TTDT, GSPC, YIHG, MDTF, MDTK, DINF, YIGM, CBRB, GFCA, CITT, AAEX, YHBX, YDGK, YBHM, YCDT, YIDE, YQGA, YDHP, YIDI, AAEA, AAEB, YADS, YDHI, CRED, PQIA, YHDY, PSIE, YHDX, BGLF, PITB, ZITB, HDED, YJGN, MNTH, YFIB, SETB, SETC, YQIK, YQIJ, YBJG, YHHQ, YCCM, YBJM, YBJO, BASS, YGDQ, YDCO, YCCS, YBJL, CADC, YHGE, CSRD, YLIF, YAIY, UHPT, FRUA, KDPB, TRG, UHPB, YQJF, YFGF, YDEE, EMRD, YDEA, YCAD, YCAI, YHFT, YAHN, NIRC, LEUE, YEAJ, YEAI, TDCC, DCUC, DCUD, YEAN, PSPG, YFEO, YJCC, PPPA, SECG, PROP, FIEF, ALX, TREB, YNEE, YFDV, HOKC, KCH, PGAC, YFCJ, PGAD, FRYC, YRAQ, YFCC, YEBZ, OPGC, OPGB, YEBQ, YEBO, YHJG, YHJE, GCD, YEAV, YHJX, YHJV, YGJQ
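
The similarity coefficient is not defined in this appendix. The value in Appendix Table 2.6 is numerically consistent with an equal-weight mean of the Jaccard coefficient and the overlap coefficient, a "combined" coefficient of the kind offered by enrichment-network tools such as the EnrichmentMap app for Cytoscape. The sketch below illustrates that calculation using the gene-set sizes from Appendix Table 2.5 (221 and 150 genes) and the overlap size of 140. The formula is an inference from the numbers, and the function name `combined_similarity` and its `weight` parameter are hypothetical.

```python
def combined_similarity(size_a: int, size_b: int, overlap: int,
                        weight: float = 0.5) -> float:
    """Weighted mean of the Jaccard and overlap coefficients of two gene sets.

    size_a, size_b: number of genes in each set
    overlap:        number of genes shared by both sets
    weight:         weight given to the Jaccard coefficient (assumed 0.5)
    """
    smaller = min(size_a, size_b)
    if smaller <= 0:
        raise ValueError("both gene sets must be non-empty")
    if not 0 <= overlap <= smaller:
        raise ValueError("overlap must lie between 0 and the smaller set size")
    union = size_a + size_b - overlap
    jaccard = overlap / union            # |A ∩ B| / |A ∪ B|
    overlap_coefficient = overlap / smaller  # |A ∩ B| / min(|A|, |B|)
    return weight * jaccard + (1 - weight) * overlap_coefficient


if __name__ == "__main__":
    # Plasma membrane (221 genes) vs. integral component of membrane (150 genes)
    print(round(combined_similarity(221, 150, 140), 6))
```

With these inputs the Jaccard coefficient is 140/231 and the overlap coefficient is 140/150, and their mean rounds to the tabulated 0.769697.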

Appendix Table 2.7. Enriched GO terms and KEGG pathways from down-regulated DEGs in E. coli in response to MWI.

Cellular Component
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
GO:0005829~cytosol | 2.60E-22 | 1.563201 | 1.77E-20 | 263 | MTFA, YIBF, ALAS, ELAB, PAND, ROB, YDJA, PGI, GLTD, PGM, GLTA, PGK, BOLA, GLTX, QUEF, POLA, GLCB, GABT, SUCB, SUCA, GATA, RIBC, RLUF, RNB, GATD, GATB, THYA, PURT, ACS, YCHN, TPIA, ASD, MANA, DHAK, PURU, NUDE, PTSI, NUDF, MOAE, ZAPB, PTSH, TYRB, PTSN, PPSA, HFQ, TALA, ALDA, TALB, RUTR, FOLX, ACPP, GRXB, GRXD, FKLB, MUTS, MGSA, FLGD, LPLA, FOLE, FLGE, YCFP, MSRC, MSRB, METH, ACNB, LUXS, HEME, BFR, ACNA, DXS, YIHD, HEML, SLYD, YDFG, EVGA, NLPD, PEPD, YBGI, TYRS, RSMB, TIG, GLYA, YPDC, PEPB, YHCB, RIMJ, AVTA, ASCB, YCDX, ILVN, YCDY, ILVD, LTAE, BETI, SPEC, GDHA, BETB, RPIA, RBSD, TKTA, NADE, SSEA, DDLA, GCVH, YBIB, YBIC, YCEH, GCVP, IHFA, OSMC, BGLA, GCVT, SELD, SODB, MAEB, MAEA, THRA, THRB, THRC, LYSC, CLPA, FUMA, RRAA, FUMC, ICLR, FABI, PATA, YCBL, ASPA, FABG, ASPC, MFD, YFIF, PNCA, ENTE, TOPB, KDSA, ENTH, FADD, TAS, ISCU, UBIC, ISCX, ISCR, LYSS, GPMA, AROG, YCCJ, TNAA, LYSU, ISCS, AROD, YDCI, FDOG, AROC, EDA, AROA, UBID, UBIG, YGFZ, YHGF, GND, MDH, FBAA, TATA, ALLR, SERS, YAIL, STHA, PPA, FMT, YGGX, SERC, GLNE, POXB, TESB, LEUS, GLNB, YGFB, LPD, SSB, FABB, DAPD, FABA, PHET, DAPB, PHES, FEAR, GLMU, GLMS, PPIB, EFP, TDH, FLDA, YEEZ, RAIA, CRR, DEF, HUPA, CRP, HUPB, CSPD, PCK, CSPE, CSPC, ORN, SECA, KBL, PFLA, PFLB, PROS, CAN, HPF, YGIN, PURH, FDX, PURL, MELA, PURC, PROA, LRP, PURB, PYRE, GHRA, TRXA, GMHB, GSHB, TRXB, MRP, PFKB, DCD, FRE, YECM, HISS, YTJC, YFCL, MIND, MINE, MINC, ACCB, DCP, FADR, ACCC, ASNA, SDAA, ASNB, YFBU, YECD, CYSK, RIHA, PMBA, CYSE, ACEA, HPT, ICD, CYSB, PRMB, WRBA, GSTB, YJDC, CYSS, DKSA, CRL
GO:0009424~bacterial-type flagellum hook | 2.20E-04 | 4.9119 | 0.007462 | 8 | FLGL, FLGK, FLGD, FLGC, FLGB, FLGG, FLGF, FLGE
GO:0016020~membrane | 0.001381 | 1.543005 | 0.030834 | 48 | TPIA, CRR, HUPA, MDH, CSPC, ALAS, DPS, FABI, MRCB, OMPA, ASPA, NHAA, MALK, NUOC, GOR, NUOB, BFR, BETA, ECNB, PPA, KGTP, PFLB, SLT, ENTE, GLNH, RPSA, ATPA, ATPD, TALB, LPP, TIG, TYRS, LPD, GLYA, RBSB, WRBA, PURC, LACY, ATPG, PHET, SPPA, LYSS, CYDD, LYSU, TNAA, FDOG, EDA, SODB
GO:0045261~proton-transporting ATP synthase complex, catalytic core F(1) | 0.003007 | 6.139875 | 0.049902 | 5 | ATPA, ATPD, ATPC, ATPH, ATPG

KEGG Pathway
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
eco01130:Biosynthesis of antibiotics | 1.97E-12 | 2.084017 | 1.97E-10 | 76 | LYSA, THRA, FRDD, FRDC, LYSC, ALAA, PCK, FUMA, FUMC, YQEF, ASPC, ACNB, ACNA, GLTD, DXS, PGI, PGM, GLTA, PGK, PURH, GLCF, ENTC, PURL, FADB, FADA, GLYA, PURD, PROA, PURC, PURB, SUCB, SUCA, AROH, GPMA, AROG, AVTA, AROD, AROC, AROA, PFKB, PURT, ACS, TPIA, ASD, ILVN, GND, MDH, FBAA, YTJC, ILVD, ILVB, LTAE, SPEC, ACCB, TYRB, ACCC, SDAA, RPIA, SERB, SERC, GARK, CYSK, TALA, TALB, TKTA, ICD, LPD, GCVH, DAPB, SDHB, GLMU, GLMS, GCVP, PUTA, GCVT, YSGA
eco01110:Biosynthesis of secondary metabolites | 8.09E-12 | 1.803606 | 4.05E-10 | 99 | LYSA, FRDD, THRA, FRDC, THRB, THRC, LYSC, ALAA, PCK, FUMA, FUMC, YQEF, PAND, METH, ACNB, ASPC, HEME, ACNA, PGI, DXS, GLTD, HEML, BGLX, PGM, GLTA, PGK, PURH, GLCF, ENTC, PURL, GLTX, FADB, FADA, HEMX, GLYA, PURD, PROA, GLCB, PURC, PURB, GLGP, UBIA, SUCB, TRPC, SUCA, GHRA, UBIC, RIBC, AROH, GPMA, AROG, AVTA, AROD, AROC, AROA, UBID, UBIG, UBIF, PFKB, PURT, ACS, TPIA, ASD, ILVN, MANA, GND, MDH, FBAA, YTJC, ILVD, LTAE, ILVB, SPEC, ACCB, TYRB, ACCC, ASNA, SDAA, ASNB, RPIA, CYSK, TALA, TALB, ALDB, TKTA, CYSE, ICD, HPT, ACEA, ACEB, LPD, WRBA, GCVH, DAPB, SDHB, GCVP, PUTA, GCVT, YSGA
eco01100:Metabolic pathways | 3.18E-11 | 1.397885 | 1.06E-09 | 180 | LYSA, MAEB, THRA, THRB, THRC, LYSC, ALAA, FUMA, FUMC, FABI, PAND, PATA, ASPA, FABG, PUUA, NUOC, ASPC, NUOB, NUOA, PGI, GLTD, BGLX, PGM, GLTA, PGK, NADR, PNCA, SAD, GLCF, ATPA, ATPD, ENTC, ATPC, KDSA, GLTX, FADB, GATY, NUOL, FADA, NUOM, LACZ, FADD, NUON, QUEF, POLA, ATPF, QUEC, GLCB, FADE, NUOI, ATPE, ATPH, ATPG, UBIA, GABT, SUCB, GATA, SUCA, UBIC, RIBC, AROH, GPMA, AROG, ISCS, AROD, AROC, FDOG, EDA, GATD, PGSA, AROA, UBID, UBIG, GATB, THYA, UBIF, PURT, ACS, TPIA, ASD, MANA, GND, MDH, FBAA, DHAK, MOAE, TYRB, PPSA, STHA, SERB, SERC, GARK, TALA, ALDB, TALB, GARR, LPD, FABB, DAPD, FABA, DAPB, GLMU, GLMS, PUTA, CYOA, LPLA, FOLE, FRDD, FRDC, PCK, YQEF, METH, LUXS, ACNB, HEME, ACNA, DXS, PGPB, HEML, PFLB, YDFG, PURH, PEPD, PURL, DADX, HEMX, GLYA, PURD, PROA, PEPB, PURC, PURB, GLGP, PYRE, TRPC, GHRA, MURG, GMHB, AVTA, PEPN, ULAD, GSHB, PFKB, DCD, ILVN, YTJC, ILVD, LTAE, ILVB, SPEC, GDHA, ACCB, ACCC, BETB, MAZG, BETA, ASNA, SDAA, ASNB, RPIA, CYSK, TKTA, NADE, CYSH, CYSE, ICD, HPT, ACEA, ACEB, SSEA, DACA, DDLA, DACC, GCVH, SDHB, GCVP, GCVT, YSGA, SELD, CYSM
eco01200:Carbon metabolism | 1.50E-09 | 2.336703 | 3.74E-08 | 47 | ACS, MAEB, FRDD, MAEA, TPIA, FRDC, GND, PCK, MDH, FBAA, FUMA, YTJC, FUMC, YQEF, ACNB, ACCB, ACNA, ACCC, PPSA, PGI, SDAA, RPIA, GLTA, SERB, SERC, PGK, CYSK, TALA, TALB, TKTA, CYSE, ICD, FADB, ACEA, ACEB, LPD, GLYA, GLCB, SUCB, SUCA, SDHB, GCVP, GPMA, GCVT, FDOG, EDA, PFKB
eco00260:Glycine, serine and threonine metabolism | 1.48E-08 | 3.398715 | 2.96E-07 | 23 | GARK, THRA, THRB, ASD, THRC, LYSC, LPD, GLYA, YTJC, GCVH, LTAE, GHRA, GCVP, GPMA, TDH, BETB, KBL, BETA, GCVT, SDAA, SERB, YDFG, SERC
eco01230:Biosynthesis of amino acids | 6.53E-08 | 2.118875 | 1.09E-06 | 47 | LYSA, TPIA, THRA, THRB, ASD, ILVN, THRC, LYSC, ALAA, FBAA, YTJC, ILVD, LTAE, ILVB, METH, ASPC, ACNB, LUXS, TYRB, ACNA, ASNA, GLTD, SDAA, RPIA, GLTA, SERB, SERC, PGK, CYSK, TALA, TALB, TKTA, CYSE, ICD, GLYA, PROA, DAPD, TRPC, DAPB, AROH, AROG, GPMA, AROD, AROC, AROA, CYSM, PFKB
eco00630:Glyoxylate and dicarboxylate metabolism | 1.96E-06 | 2.854488 | 2.80E-05 | 22 | GARK, GLCF, ALDA, DMLA, MDH, ACEA, GARR, LPD, ACEB, GLYA, GCVH, YQEF, GLCB, PURU, GHRA, ACNB, GCVP, ACNA, GCVT, FDOG, EDA, GLTA
eco01120:Microbial metabolism in diverse environments | 1.17E-05 | 1.58901 | 1.46E-04 | 69 | LYSA, MAEB, FRDD, THRA, THRB, FRDC, THRC, LYSC, PCK, FUMA, FUMC, YQEF, MHPC, ACNB, ACNA, GLTD, PGI, PGM, GLTA, PGK, SAD, GLCF, FADB, FADA, GLYA, GLCB, SUCB, SUCA, GHRA, GPMA, DLD, ULAD, FDOG, EDA, PFKB, ACS, TPIA, ASD, GND, MDH, FBAA, YTJC, LTAE, AGP, ACCB, RHMA, ACCC, PPSA, RPIA, SERB, SERC, CYSK, ALDA, TALA, ALDB, TALB, CYSH, TKTA, CYSE, ICD, ACEA, ACEB, SSEA, LPD, DAPD, DAPB, SDHB, MGSA, YSGA
eco00190:Oxidative phosphorylation | 1.03E-04 | 2.526871 | 0.001138 | 19 | ATPA, ATPD, FRDD, ATPC, FRDC, NUOL, NUOM, NUON, ATPF, ATPE, NUOI, ATPH, ATPG, SDHB, NUOC, NUOB, NUOA, PPA, CYOA
eco00620:Pyruvate metabolism | 1.32E-04 | 2.294785 | 0.001314 | 22 | ACS, ALDA, MAEB, POXB, FRDD, ALDB, MAEA, FRDC, PCK, MDH, LPD, ACEB, FUMA, FUMC, YQEF, GLCB, GHRA, ACCB, ACCC, DLD, PPSA, PFLB
eco00020:Citrate cycle (TCA cycle) | 2.71E-04 | 2.864469 | 0.002464 | 14 | FRDD, FRDC, ICD, MDH, PCK, LPD, FUMA, FUMC, SUCB, SDHB, SUCA, ACNB, ACNA, GLTA
eco01212:Fatty acid metabolism | 0.002291 | 2.786524 | 0.018933 | 11 | FABI, FABG, ACCB, ACCC, FADB, FADA, FADD, YQEF, FADE, FABB, FABA
eco00010:Glycolysis / Gluconeogenesis | 0.003752 | 2.127891 | 0.026496 | 16 | ACS, TPIA, ALDB, CRR, PCK, FBAA, LPD, YTJC, AGP, GPMA, ASCB, PGI, BGLA, PGM, PFKB, PGK
eco00270:Cysteine and methionine metabolism | 0.003674 | 2.384706 | 0.027914 | 13 | CYSK, METH, THRA, ASPC, LUXS, ASD, TYRB, LYSC, CYSE, MDH, SSEA, SDAA, CYSM

Appendix table 2.8. Lists of interactions between GO terms and KEGG pathways from down-regulated DEGs based on the overlap genes. Overlap Similarity Interaction Overlap genes size coefficient LYSA, TPIA, THRA, THRB, ASD, ILVN, THRC, ECO01230:BIOSYNTHESIS LYSC, ALAA, FBAA, YTJC, ILVD, LTAE, OF AMINO ACIDS AND ILVB, METH, ASPC, ACNB, TYRB, ACNA, ECO01110:BIOSYNTHESIS 41 0.6314083 ASNA, GLTD, SDAA, RPIA, GLTA, PGK, OF SECONDARY CYSK, TALA, TALB, TKTA, CYSE, ICD, METABOLITES GLYA, PROA, TRPC, DAPB, AROH, AROG, GPMA, AROD, AROC, AROA ECO01200:CARBON METABOLISM AND MDH, YQEF, ACNB, ACNA, ACEA, ACEB, ECO00630:GLYOXYLATE 13 0.411526 LPD, GLYA, GLCB, GCVP, GCVT, FDOG, EDA AND DICARBOXYLATE METABOLISM ECO00020:CITRATE CYCLE (TCA CYCLE) AND FRDD, FRDC, ICD, MDH, PCK, LPD, FUMA, 13 0.5292857 ECO01110:BIOSYNTHESIS FUMC, SUCB, SDHB, SUCA, ACNB, ACNA OF SECONDARY METABOLITES ECO00620:PYRUVATE METABOLISM AND ACS, FRDD, ALDB, FRDC, PCK, MDH, LPD, ECO01110:BIOSYNTHESIS 15 0.4116638 ACEB, FUMA, FUMC, YQEF, GLCB, GHRA, OF SECONDARY ACCB, ACCC METABOLITES ECO01200:CARBON ACS, MAEB, FRDD, MAEA, FRDC, PCK, MDH, METABOLISM AND 16 0.5145798 FUMA, FUMC, YQEF, ACCB, ACCC, PPSA, ECO00620:PYRUVATE ACEB, LPD, GLCB METABOLISM ECO01200:CARBON METABOLISM AND FRDD, FRDC, PCK, MDH, FUMA, FUMC, 13 0.5997024 ECO00020:CITRATE ACNB, ACNA, ICD, LPD, SUCB, SUCA, SDHB CYCLE (TCA CYCLE) ECO00630:GLYOXYLATE AND DICARBOXYLATE GLCF, MDH, ACEA, LPD, ACEB, GLYA, METABOLISM AND 14 0.3836024 GCVH, YQEF, GLCB, GHRA, ACNB, GCVP, ECO01110:BIOSYNTHESIS ACNA, GCVT OF SECONDARY METABOLITES TPIA, THRA, ASD, ILVN, LYSC, ALAA, FBAA, ECO01230:BIOSYNTHESIS YTJC, ILVD, LTAE, ILVB, ASPC, ACNB, TYRB, OF AMINO ACIDS AND ACNA, GLTD, SDAA, RPIA, GLTA, SERB, 36 0.5898753 ECO01130:BIOSYNTHESIS SERC, PGK, CYSK, TALA, TALB, TKTA, ICD, OF ANTIBIOTICS GLYA, PROA, DAPB, AROH, AROG, GPMA, AROD, AROC, AROA ACS, FRDD, TPIA, FRDC, GND, PCK, MDH, ECO01200:CARBON FBAA, FUMA, YTJC, FUMC, YQEF, ACNB, METABOLISM AND ACCB, ACNA, ACCC, PGI, SDAA, 
RPIA, GLTA, ECO01110:BIOSYNTHESIS 39 0.5971366 PGK, CYSK, TALA, TALB, TKTA, CYSE, ICD, OF SECONDARY FADB, ACEA, ACEB, LPD, GLYA, GLCB, METABOLITES SUCB, SUCA, SDHB, GCVP, GPMA, GCVT ECO01130:BIOSYNTHESIS THRA, FRDD, FRDC, LYSC, PCK, FUMA, 45 0.551087 OF ANTIBIOTICS AND FUMC, YQEF, ACNB, ACNA, GLTD, PGI, PGM,

216

ECO01120:MICROBIAL GLTA, PGK, GLCF, FADB, FADA, GLYA, METABOLISM IN SUCB, SUCA, GPMA, PFKB, ACS, TPIA, ASD, DIVERSE GND, MDH, FBAA, YTJC, LTAE, ACCB, ACCC, ENVIRONMENTS RPIA, SERB, SERC, CYSK, TALA, TALB, TKTA, ICD, LPD, DAPB, SDHB, YSGA ECO00620:PYRUVATE METABOLISM AND ACS, ALDA, MAEB, FRDD, ALDB, FRDC, PCK, ECO01120:MICROBIAL 19 0.5637626 MDH, LPD, ACEB, FUMA, FUMC, YQEF, METABOLISM IN GLCB, GHRA, ACCB, ACCC, DLD, PPSA DIVERSE ENVIRONMENTS MAEB, THRA, THRB, THRC, LYSC, PCK, ECO01120:MICROBIAL FUMA, FUMC, ACNB, ACNA, GLTD, PGI, METABOLISM IN PGM, GLTA, PGK, GLYA, GLCB, SUCB, SUCA, DIVERSE GHRA, GPMA, FDOG, EDA, PFKB, ACS, TPIA, 50 0.4509713 ENVIRONMENTS (ACEB, ASD, GND, MDH, FBAA, YTJC, LTAE, ACCB, ACEA) AND ACCC, PPSA, RPIA, SERC, CYSK, ALDA, GO:0005829~CYTOSOL TALA, TALB, TKTA, CYSE, ICD, ACEA, SSEA, LPD, DAPD, DAPB, MGSA ECO01230:BIOSYNTHESIS LYSA, TPIA, THRA, THRB, ASD, THRC, LYSC, OF AMINO ACIDS AND FBAA, YTJC, LTAE, ACNB, ACNA, GLTD, ECO01120:MICROBIAL 28 0.4569632 RPIA, GLTA, SERB, SERC, PGK, CYSK, TALA, METABOLISM IN TALB, TKTA, CYSE, ICD, GLYA, DAPD, DIVERSE DAPB, GPMA ENVIRONMENTS ECO00260:GLYCINE, THRA, THRB, ASD, THRC, LYSC, LPD, GLYA, SERINE AND THREONINE 19 0.448624 YTJC, GCVH, LTAE, GHRA, GCVP, GPMA, METABOLISM AND TDH, BETB, KBL, GCVT, SDAA, YDFG GO:0005829~CYTOSOL MAEB, THRA, THRB, THRC, LYSC, FUMA, FUMC, PGI, GLTD, PGM, GLTA, PGK, SAD, ECO01100:METABOLIC GLCF, FADB, FADA, GLCB, SUCB, SUCA, PATHWAYS AND GPMA, FDOG, EDA, ACS, TPIA, ASD, GND, ECO01120:MICROBIAL MDH, FBAA, PPSA, SERB, SERC, TALA, 61 0.604263 METABOLISM IN ALDB, TALB, LPD, DAPD, DAPB, FRDD, DIVERSE FRDC, PCK, YQEF, ACNB, ACNA, GLYA, ENVIRONMENTS GHRA, ULAD, PFKB, YTJC, LTAE, ACCB, ACCC, RPIA, CYSK, TKTA, CYSH, CYSE, ICD, ACEA, ACEB, SSEA, SDHB ECO01100:METABOLIC THRA, THRB, THRC, LYSC, GPMA, ASD, PATHWAYS AND SERB, LPD, YDFG, GLYA, GHRA, YTJC, ECO00260:GLYCINE, 19 0.4646739 LTAE, BETB, BETA, SDAA, GCVH, GCVP, SERINE AND THREONINE GCVT METABOLISM THRA, FRDD, FRDC, LYSC, ALAA, PCK, 
FUMA, FUMC, YQEF, ASPC, ACNB, ACNA, GLTD, DXS, PGI, PGM, GLTA, PGK, PURH, ECO01130:BIOSYNTHESIS GLCF, ENTC, PURL, FADB, FADA, GLYA, OF ANTIBIOTICS AND PURD, PROA, PURC, PURB, SUCB, SUCA, ECO01110:BIOSYNTHESIS 70 0.7938596 AROH, GPMA, AROG, AVTA, AROD, AROC, OF SECONDARY AROA, PFKB, PURT, ACS, TPIA, ASD, ILVN, METABOLITES GND, MDH, FBAA, YTJC, ILVD, ILVB, LTAE, SPEC, ACCB, TYRB, ACCC, SDAA, RPIA, CYSK, TALA, TALB, TKTA, ICD, LPD, GCVH, DAPB, SDHB, GCVP, PUTA, GCVT, YSGA

217

ACS, MAEB, FRDD, TPIA, FRDC, GND, PCK, ECO01200:CARBON MDH, FBAA, FUMA, YTJC, FUMC, YQEF, METABOLISM AND ACNB, ACCB, ACNA, ACCC, PPSA, PGI, RPIA, ECO01120:MICROBIAL 42 0.7305923 GLTA, SERB, SERC, PGK, CYSK, TALA, TALB, METABOLISM IN TKTA, CYSE, ICD, FADB, ACEA, ACEB, LPD, DIVERSE GLYA, GLCB, SUCB, SUCA, SDHB, GPMA, ENVIRONMENTS FDOG, EDA ECO01120:MICROBIAL METABOLISM IN DIVERSE PCK, PGI, PGM, GPMA, PFKB, ACS, TPIA, ENVIRONMENTS 12 0.4571918 FBAA, YTJC, AGP, ALDB, LPD (ACEB,ACEA) AND ECO00010:GLYCOLYSIS / GLUCONEOGENESIS ECO00020:CITRATE CYCLE (TCA CYCLE) FRDD, FRDC, ICD, MDH, PCK, LPD, FUMA, AND 13 0.5487013 FUMC, SUCB, SDHB, SUCA, ACNB, ACNA ECO01130:BIOSYNTHESIS OF ANTIBIOTICS ECO01212:FATTY ACID METABOLISM AND FABG, ACCB, ACCC, FADB, FADA, FADD, 9 0.4338162 ECO01100:METABOLIC YQEF, FADE, FABB PATHWAYS THRA, THRB, THRC, LYSC, PCK, FUMA, FUMC, PAND, METH, ACNB, ASPC, HEME, ACNA, PGI, DXS, GLTD, HEML, PGM, GLTA, PGK, PURH, PURL, GLTX, GLYA, PROA, ECO01110:BIOSYNTHESIS GLCB, PURC, PURB, SUCB, SUCA, GHRA, OF SECONDARY UBIC, RIBC, GPMA, AROG, AVTA, AROD, 76 0.5167055 METABOLITES AND AROC, AROA, UBID, UBIG, PFKB, PURT, ACS, GO:0005829~CYTOSOL TPIA, ASD, ILVN, MANA, GND, MDH, FBAA, YTJC, ILVD, LTAE, SPEC, ACCB, TYRB, ACCC, ASNA, SDAA, ASNB, RPIA, CYSK, TALA, TALB, TKTA, CYSE, ICD, HPT, ACEA, LPD, WRBA, GCVH, DAPB, GCVP, GCVT ECO00020:CITRATE CYCLE (TCA CYCLE) FRDD, FRDC, ICD, MDH, PCK, LPD, FUMA, AND 13 0.5001973 FUMC, SUCB, SDHB, SUCA, ACNB, ACNA ECO01100:METABOLIC PATHWAYS ECO00620:PYRUVATE ACS, MAEB, FRDD, ALDB, FRDC, PCK, MDH, METABOLISM AND 17 0.4323096 LPD, ACEB, FUMA, FUMC, YQEF, GLCB, ECO01100:METABOLIC GHRA, ACCB, ACCC, PPSA PATHWAYS THRA, LYSC, PCK, FUMA, FUMC, ASPC, ACNB, ACNA, GLTD, DXS, PGI, PGM, GLTA, PGK, PURH, PURL, GLYA, PROA, PURC, PURB, SUCB, SUCA, GPMA, AROG, AVTA, ECO01130:BIOSYNTHESIS AROD, AROC, AROA, PFKB, PURT, ACS, OF ANTIBIOTICS AND 59 0.493515 TPIA, ASD, ILVN, GND, MDH, FBAA, YTJC, GO:0005829~CYTOSOL ILVD, LTAE, SPEC, ACCB, TYRB, ACCC, SDAA, 
RPIA, SERC, CYSK, TALA, TALB, TKTA, ICD, LPD, GCVH, DAPB, GLMU, GLMS, GCVP, GCVT ECO00630:GLYOXYLATE 17 0.4323096 GLCF, MDH, ACEA, GARR, LPD, ACEB,


AND DICARBOXYLATE GLYA, GCVH, YQEF, GLCB, GHRA, ACNB, METABOLISM AND GCVP, ACNA, GCVT, FDOG, EDA ECO01100:METABOLIC PATHWAYS ECO00010:GLYCOLYSIS / GLUCONEOGENESIS ACS, TPIA, CRR, PCK, FBAA, LPD, YTJC, 13 0.4306861 AND GPMA, ASCB, PGI, BGLA, PGM, PFKB GO:0005829~CYTOSOL ACS, MAEB, MAEA, TPIA, GND, PCK, MDH, FBAA, FUMA, YTJC, FUMC, ACNB, ACCB, ECO01200:CARBON ACNA, ACCC, PPSA, PGI, SDAA, RPIA, GLTA, METABOLISM AND 39 0.4868493 SERC, PGK, CYSK, TALA, TALB, TKTA, GO:0005829~CYTOSOL CYSE, ICD, ACEA, LPD, GLYA, GLCB, SUCB, SUCA, GCVP, GPMA, GCVT, FDOG, EDA ACS, MAEB, FRDD, TPIA, FRDC, GND, PCK, MDH, FBAA, FUMA, YTJC, FUMC, YQEF, ECO01200:CARBON ACNB, ACCB, ACNA, ACCC, PPSA, PGI, METABOLISM AND 45 0.6023498 SDAA, RPIA, GLTA, SERB, SERC, PGK, CYSK, ECO01100:METABOLIC TALA, TALB, TKTA, CYSE, ICD, FADB, ACEA, PATHWAYS ACEB, LPD, GLYA, GLCB, SUCB, SUCA, SDHB, GCVP, GPMA, GCVT, FDOG, EDA ECO00020:CITRATE CYCLE (TCA CYCLE) ICD, MDH, PCK, LPD, FUMA, FUMC, SUCB, 10 0.3758694 AND SUCA, ACNB, ACNA GO:0005829~CYTOSOL TPIA, THRA, THRB, ASD, ILVN, THRC, LYSC, FBAA, YTJC, ILVD, LTAE, METH, ASPC, ECO01230:BIOSYNTHESIS ACNB, LUXS, TYRB, ACNA, ASNA, GLTD, OF AMINO ACIDS AND 39 0.4868493 SDAA, RPIA, GLTA, SERC, PGK, CYSK, TALA, GO:0005829~CYTOSOL TALB, TKTA, CYSE, ICD, GLYA, PROA, DAPD, DAPB, AROG, GPMA, AROD, AROC, AROA LYSA, THRA, LYSC, ALAA, FUMA, FUMC, ASPC, PGI, GLTD, PGM, GLTA, PGK, GLCF, ENTC, FADB, FADA, SUCB, SUCA, AROH, GPMA, AROG, AROD, AROC, AROA, PURT, ECO01100:METABOLIC ACS, TPIA, ASD, GND, MDH, FBAA, TYRB, PATHWAYS AND SERB, SERC, GARK, TALA, TALB, LPD, 75 0.7006034 ECO01130:BIOSYNTHESIS DAPB, GLMU, GLMS, PUTA, FRDD, FRDC, OF ANTIBIOTICS PCK, YQEF, ACNB, ACNA, DXS, PURH, PURL, GLYA, PURD, PROA, PURC, PURB, AVTA, PFKB, ILVN, YTJC, ILVD, LTAE, ILVB, SPEC, ACCB, ACCC, SDAA, RPIA, CYSK, TKTA, ICD, GCVH, SDHB, GCVP, GCVT ECO01230:BIOSYNTHESIS OF AMINO ACIDS AND THRA, ASD, LYSC, METH, ASPC, LUXS, ECO00270:CYSTEINE 9 0.4343891 TYRB, SDAA, CYSE AND METHIONINE METABOLISM 
ECO00190:OXIDATIVE PHOSPHORYLATION ATPA, ATPD, FRDD, ATPC, FRDC, NUOL, AND 17 0.4940717 NUOM, NUON, ATPF, ATPE, NUOI, ATPH, ECO01100:METABOLIC ATPG, SDHB, NUOC, NUOB, NUOA PATHWAYS ECO01230:BIOSYNTHESIS 45 0.6023498 TPIA, THRA, THRB, ASD, ILVN, THRC, LYSC,


OF AMINO ACIDS AND ALAA, FBAA, YTJC, ILVD, LTAE, ILVB, ECO01100:METABOLIC METH, ASPC, ACNB, LUXS, TYRB, ACNA, PATHWAYS ASNA, GLTD, SDAA, RPIA, GLTA, SERB, SERC, PGK, CYSK, TALA, TALB, TKTA, CYSE, ICD, GLYA, PROA, DAPD, TRPC, DAPB, AROH, AROG, GPMA, AROD, AROC, AROA, CYSM ECO00270:CYSTEINE AND METHIONINE METABOLISM AND METH, THRA, ASPC, ASD, TYRB, LYSC, 9 0.3898432 ECO01110:BIOSYNTHESIS CYSE, MDH, SDAA OF SECONDARY METABOLITES THRA, THRB, THRC, LYSC, ALAA, FUMA, FUMC, PAND, ASPC, PGI, GLTD, BGLX, PGM, GLTA, PGK, GLCF, ENTC, GLTX, FADB, FADA, GLCB, UBIA, SUCB, SUCA, UBIC, RIBC, AROH, GPMA, AROG, AROD, AROC, AROA, UBID, UBIG, UBIF, PURT, ACS, TPIA, ECO01100:METABOLIC ASD, MANA, GND, MDH, FBAA, TYRB, TALA, PATHWAYS AND ALDB, TALB, LPD, DAPB, PUTA, FRDD, ECO01110:BIOSYNTHESIS 96 0.7471436 FRDC, PCK, YQEF, METH, ACNB, HEME, OF SECONDARY ACNA, DXS, HEML, PURH, PURL, HEMX, METABOLITES GLYA, PURD, PROA, PURC, PURB, GLGP, TRPC, GHRA, AVTA, PFKB, ILVN, YTJC, ILVD, LTAE, ILVB, SPEC, ACCB, ACCC, ASNA, SDAA, ASNB, RPIA, CYSK, TKTA, CYSE, ICD, HPT, ACEA, ACEB, GCVH, SDHB, GCVP, GCVT ECO01110:BIOSYNTHESIS OF SECONDARY PCK, PGI, PGM, GPMA, PFKB, ACS, TPIA, METABOLITES AND 11 0.3966346 FBAA, YTJC, ALDB, LPD ECO00010:GLYCOLYSIS / GLUCONEOGENESIS ECO00020:CITRATE CYCLE (TCA CYCLE) AND FRDD, FRDC, ICD, MDH, PCK, LPD, FUMA, ECO01120:MICROBIAL 13 0.5571429 FUMC, SUCB, SDHB, SUCA, ACNB, ACNA METABOLISM IN DIVERSE ENVIRONMENTS ECO00630:GLYOXYLATE AND DICARBOXYLATE METABOLISM AND GLCF, ALDA, MDH, ACEA, LPD, ACEB, ECO01120:MICROBIAL 14 0.4090909 GLYA, YQEF, GLCB, GHRA, ACNB, ACNA, METABOLISM IN FDOG, EDA DIVERSE ENVIRONMENTS ECO01110:BIOSYNTHESIS OF SECONDARY THRA, THRB, THRC, LYSC, GLYA, GHRA, METABOLITES AND 15 0.3961804 GPMA, ASD, YTJC, LTAE, SDAA, LPD, GCVH, ECO00260:GLYCINE, GCVP, GCVT SERINE AND THREONINE METABOLISM ECO01110:BIOSYNTHESIS LYSA, FRDD, THRA, FRDC, THRB, THRC, 52 0.6009495 OF SECONDARY LYSC, PCK, FUMA, FUMC, YQEF, ACNB, 220

METABOLITES AND ACNA, PGI, GLTD, PGM, GLTA, PGK, GLCF, ECO01120:MICROBIAL FADB, FADA, GLYA, GLCB, SUCB, SUCA, METABOLISM IN GHRA, GPMA, PFKB, ACS, TPIA, ASD, GND, DIVERSE MDH, FBAA, YTJC, LTAE, ACCB, ACCC, ENVIRONMENTS RPIA, CYSK, TALA, TALB, ALDB, TKTA, CYSE, ICD, ACEA, ACEB, LPD, DAPB, SDHB, YSGA ECO01100:METABOLIC PATHWAYS AND THRA, LYSC, ASPC, ASD, MDH, TYRB, METH, ECO00270:CYSTEINE 11 0.4532967 LUXS, SDAA, CYSE, SSEA AND METHIONINE METABOLISM ECO00270:CYSTEINE AND METHIONINE METH, THRA, ASPC, LUXS, ASD, TYRB, 11 0.4438316 METABOLISM AND LYSC, CYSE, MDH, SSEA, SDAA GO:0005829~CYTOSOL ACS, FRDD, TPIA, FRDC, GND, PCK, MDH, ECO01200:CARBON FBAA, FUMA, YTJC, FUMC, YQEF, ACNB, METABOLISM AND ACCB, ACNA, ACCC, PGI, SDAA, RPIA, GLTA, 37 0.6087333 ECO01130:BIOSYNTHESIS SERB, SERC, PGK, CYSK, TALA, TALB, TKTA, OF ANTIBIOTICS ICD, FADB, LPD, GLYA, SUCB, SUCA, SDHB, GCVP, GPMA, GCVT MAEB, THRA, THRB, THRC, LYSC, FUMA, FUMC, FABI, PAND, PATA, ASPA, FABG, ASPC, PGI, GLTD, PGM, GLTA, PGK, PNCA, KDSA, GLTX, FADD, QUEF, POLA, GLCB, GABT, SUCB, GATA, SUCA, UBIC, RIBC, GPMA, AROG, ISCS, AROD, AROC, FDOG, EDA, GATD, AROA, UBID, UBIG, GATB, THYA, PURT, ACS, TPIA, ASD, MANA, GND, MDH, FBAA, DHAK, MOAE, TYRB, PPSA, ECO01100:METABOLIC STHA, SERC, TALA, TALB, LPD, FABB, DAPD, PATHWAYS AND 120 0.5190918 FABA, DAPB, GLMU, GLMS, LPLA, FOLE, GO:0005829~CYTOSOL PCK, METH, LUXS, ACNB, HEME, ACNA, DXS, HEML, PFLB, YDFG, PURH, PEPD, PURL, GLYA, PROA, PEPB, PURC, PURB, PYRE, GHRA, GMHB, AVTA, GSHB, PFKB, DCD, ILVN, YTJC, ILVD, LTAE, SPEC, GDHA, ACCB, ACCC, BETB, ASNA, SDAA, ASNB, RPIA, CYSK, TKTA, NADE, CYSE, ICD, HPT, ACEA, SSEA, DDLA, GCVH, GCVP, GCVT, SELD


Appendix Table 2.9. The enriched GO terms and KEGG pathways from all DEGs in E. coli in response to MWI. KEGG Pathway Fold Term p-value Benjamini Count Genes Enrichment LYSA, THRA, IDI, THRB, THRC, LYSC, ALAA, FUMA, FUMC, ANSB, PAND, ASPC, GLTD, PGI, GNTK, BGLX, PGM, GLTA, PGK, GLCF, ENTC, MALY, GLTX, ARGA, FADB, FADA, GLCB, SUCB, UBIA, SUCA, UBIC, RIBC, AROH, GPMA, AROG, AROF, AROD, AROC, AROA, UBID, UBIG, ARGI, UBIF, PURT, ACS, TPIA, ASD, MANA, GND, CADA, MDH, FBAA, TYRB, CDH, TALA, ALDB, eco01110:Biosynthesis TALB, LPD, PLSB, DAPB, GUAB, of secondary 9.79E-10 1.5426741 1.04E-07 125 PUTA, FRDD, FRDC, PCK, CPSG, metabolites YQEF, METL, METK, METH, ACNB, HEME, ACNA, METB, DXS, META, HEML, GLPX, PURH, PURL, GLPD, GLYA, HEMX, PURD, PROA, PURC, PURB, GLGP, TRPC, GHRA, AVTA, FRMA, PFKB, TYNA, ILVN, YTJC, ILVC, ILVD, ILVB, LTAE, SPEC, ACCB, ACCC, ASNA, RPIB, ASNB, SDAA, RPIA, MQO, LDCC, CYSK, TKTA, CYSE, ICD, HPT, ACEA, ACEB, WRBA, GCVH, SDHB, GCD, GCVP, GCVT, GPT, YSGA LYSA, THRA, IDI, FRDD, FRDC, LYSC, ALAA, PCK, FUMA, FUMC, YQEF, METL, ASPC, ACNB, ACNA, METB, GLTD, PGI, DXS, GNTK, GLPX, PGM, GLTA, PGK, PURH, GLCF, ENTC, PURL, ARGA, FADB, FADA, GLYA, PURD, PROA, PURC, PURB, SUCB, SUCA, AROH, GPMA, eco01130:Biosynthesis AROG, AVTA, AROF, AROD, AROC, 1.27E-08 1.6532377 6.71E-07 89 of antibiotics AROA, FRMA, PFKB, ARGI, PURT, ACS, TPIA, ASD, ILVN, GND, MDH, FBAA, YTJC, ILVC, ILVD, ILVB, LTAE, SPEC, ACCB, TYRB, ACCC, RPIB, SDAA, RPIA, MQO, SERB, SERC, GARK, CYSK, TALA, TALB, TKTA, ICD, LPD, GCVH, DAPB, SDHB, GLMU, GCD, GLMS, GCVP, PUTA, GCVT, YSGA MAEB, MAEA, FRDD, FRDC, PCK, SCPA, FUMA, FUMC, YQEF, ACNB, METF, ACNA, PGI, GNTK, GLPX, eco01200:Carbon 6.13E-08 1.8860416 2.17E-06 56 GLTA, PGK, FADB, KDGK, GLYA, metabolism GLCB, SUCB, SUCA, GPMA, FDOG, EDA, FRMB, PFKB, FRMA, ACS, TPIA, GND, MDH, FBAA, YTJC,


ACCB, ACCC, PPSA, RPIB, SDAA, RPIA, MQO, SERB, SERC, CYSK, TALA, TALB, TKTA, CYSE, ICD, ACEA, ACEB, LPD, SDHB, GCVP, GCVT ALAA, PHOA, ANSB, PAND, PUUA, PGI, FAU, GLTD, GNTK, PGM, BGLX, GLTA, PGK, NADR, SAD, GLCF, ATPA, ATPD, ATPC, MALY, GLTX, ARGA, GATY, KDGK, QUEF, POLA, QUEC, ATPF, GLCB, ATPE, ATPH, ATPG, GABT, SUCB, SUCA, GATA, RIBC, OXC, GATD, GATB, THYA, ARGI, PURT, ACS, TPIA, ASD, MANA, UXUA, BIOA, DHAK, MOAE, TYRB, PPSA, UXUB, CDD, RUTC, GARK, TALA, TALB, ALDB, GARR, PLSB, GUAB, GGT, PUTA, LPLA, CYOA, FOLE, FRDD, FRDC, KBAY, YQEF, METL, METK, METH, LUXS, ACNB, METF, HEME, ACNA, METB, DXS, META, HEML, YDFG, PEPD, SUHB, GLYA, HEMX, PEPB, GLGP, TRPC, AVTA, PEPN, ILVN, YBHC, RSPA, RSPB, ILVC, ILVD, LTAE, ILVB, SPEC, GDHA, UXAB, UXAC, BETB, BETA, RPIB, RPIA, NADA, TKTA, NADE, SSEA, DDLA, GCVH, eco01100:Metabolic 3.13E-07 1.2363013 8.29E-06 235 ARAA, SDHB, ARAB, GCVP, GCVT, pathways SELD, LYSA, MAEB, IDI, THRA, THRB, THRC, LYSC, FUMA, FUMC, FABI, PATA, ASPA, FABG, NUOC, ASPC, NUOB, NUOA, UGD, PNCB, PNCA, ENTC, KDSA, FADB, NUOL, FADA, NUOM, LACZ, FADD, NUON, FADE, NUOI, UBIA, UBIC, AROH, GPMA, AROG, AROF, ISCS, AROD, FDOG, AROC, EDA, AROA, PGSA, UBID, UBIG, UBIF, CADA, GND, MDH, FBAA, FRUA, STHA, SERB, SERC, LPD, FABB, DAPD, FABA, DAPB, GLMU, GLMS, PCK, CPSG, SCPA, PGPB, PFLB, GLPX, PURH, DADX, PURL, PURD, PURC, PROA, PURB, PYRE, GHRA, MURG, HIUH, GMHB, MURI, PYRD, ULAD, GSHB, ULAG, FRMA, PFKB, DCD, TYNA, YTJC, ACCB, ACCC, MAZG, ASNA, SDAA, ASNB, MQO, LDCC, CYSK, ADIA, CYSH, CYSE, HPT, ACEA, ICD, ACEB, DACA, DACC, HOLE, GCD, GPT, YSGA, CYSM THRA, TYNA, THRB, ASD, THRC, eco00260:Glycine, LYSC, YTJC, METL, LTAE, KBL, serine and threonine 9.39E-07 2.5025602 1.99E-05 25 BETB, BETA, SDAA, SERB, SERC, metabolism YDFG, GARK, LPD, GLYA, GCVH,


GHRA, GCVP, GPMA, TDH, GCVT LYSA, THRA, THRB, THRC, LYSC, ALAA, METL, METK, METH, ASPC, ACNB, LUXS, ACNA, METB, GLTD, META, GLTA, PGK, MALY, ARGA, GLYA, PROA, TRPC, AROH, AROG, eco01230:Biosynthesis GPMA, AROF, AROD, AROC, AROA, 1.50E-06 1.7407639 2.64E-05 57 of amino acids ARGI, PFKB, TPIA, ASD, ILVN, FBAA, YTJC, ILVC, ILVD, ILVB, LTAE, TYRB, ASNA, RPIB, SDAA, RPIA, SERB, SERC, CYSK, TALA, TALB, TKTA, CYSE, ICD, DAPD, DAPB, CYSM GARK, GLCF, ALDA, DMLA, ACEA, eco00630:Glyoxylate FUCO, MDH, GARR, ACEB, SCPA, and dicarboxylate 2.61E-05 2.1973699 3.96E-04 25 LPD, GLYA, GCVH, YQEF, GLCB, metabolism PURU, GHRA, ACNB, GCVP, ACNA, OXC, GCVT, FDOG, EDA, GLTA LYSA, LDHA, MAEB, THRA, FRDD, THRB, FRDC, THRC, HCHA, LYSC, PCK, SCPA, FUMA, FUMC, YQEF, METL, MHPC, ACNB, METF, ACNA, GLTD, PGI, GNTK, GLPX, PGM, GLTA, PGK, SAD, GLCF, FADB, FADA, KDGK, GLYA, GLCB, SUCB, SUCA, GHRA, HIUH, GPMA, DLD, eco01120:Microbial ULAD, FDOG, EDA, ULAG, FRMB, metabolism in diverse 2.24E-04 1.372833 0.0029635 88 FRMA, PFKB, ACS, TPIA, ASD, GND, environments MDH, FBAA, FRUA, YTJC, LTAE, AGP, ACCB, ACCC, RHMA, PPSA, RPIB, RPIA, MQO, SERB, SERC, CYSK, ALDA, TALA, TALB, ALDB, CYSH, FUCK, TKTA, FUCI, CYSE, ICD, FUCO, ACEA, LPD, SSEA, ACEB, DAPD, DAPB, SDHB, MGSA, YSGA, FUCA CYSK, THRA, ASD, MALY, LYSC, eco00270:Cysteine CYSE, MDH, SSEA, METL, METK, and methionine 4.09E-04 2.236771 0.0048091 18 METH, ASPC, LUXS, TYRB, METB, metabolism SDAA, META, CYSM ACS, MAEB, LDHA, MAEA, FRDD, FRDC, HCHA, PCK, MDH, FUMA, eco00620:Pyruvate 0.0021063 1.7665131 0.0201135 25 FUMC, YQEF, ACCB, ACCC, PPSA, metabolism PFLB, MQO, ALDA, POXB, ALDB, LPD, ACEB, GLCB, GHRA, DLD ACS, TPIA, ALDB, MALX, CRR, PCK, eco00010:Glycolysis / FBAA, LPD, YTJC, AGP, PTSG, 0.0019935 1.8919355 0.0209297 21 Gluconeogenesis GPMA, BGLB, ASCB, PGI, BGLA, GLPX, PGM, PFKB, PGK, FRMA FRDD, FRDC, PCK, MDH, ICD, LPD, eco00020:Citrate cycle 0.004029 2.07905 0.0350331 15 FUMA, FUMC, SUCB, SDHB, SUCA, (TCA cycle) ACNB, ACNA, MQO, GLTA eco00670:One carbon 
PURH, PURT, METH, METF, FAU, 0.0044468 2.5740619 0.0356868 10 pool by folate GCVT, GLYA, FMT, THYA, PURU eco00190:Oxidative 0.0053665 1.8018433 0.0399228 20 ATPA, ATPD, FRDD, ATPC, FRDC,


phosphorylation NUOL, NUOM, NUON, ATPF, ATPE, NUOI, NDH, ATPH, ATPG, SDHB, NUOC, NUOB, NUOA, PPA, CYOA

Appendix Table 2.10. DEGs that are associated with tRNA biosynthesis important for protein biosynthesis.

Down-regulated DEGs
iscS   cysteine desulfurase (tRNA sulfurtransferase), PLP-dependent
hisS   histidyl tRNA synthetase
cysS   cysteinyl-tRNA synthetase
proS   prolyl-tRNA synthetase
alaS   alanyl-tRNA synthetase*
fmt    10-formyltetrahydrofolate:L-methionyl-tRNA(fMet) N-formyltransferase
tyrS   tyrosyl-tRNA synthetase*
gltX   glutamyl-tRNA synthetase
pheT   phenylalanine tRNA synthetase, beta subunit*
lysU   lysine tRNA synthetase, inducible*
leuS   leucyl-tRNA synthetase
pheS   phenylalanine tRNA synthetase, alpha subunit
serS   seryl-tRNA synthetase

Up-regulated DEGs
tadA   tRNA-specific adenosine deaminase
trmA   tRNA m(5)U54 methyltransferase, SAM-dependent; tmRNA m(5)U341 methyltransferase
queA   S-adenosylmethionine:tRNA ribosyl transferase-isomerase
tusB   mnm(5)-s(2)U34-tRNA synthesis 2-thiolation protein
cca    fused tRNA nucleotidyl transferase/2’3’-cyclic phosphodiesterase/2’nucleotidase and phosphatase
dusB   tRNA-dihydrouridine synthase B
rluA   dual specificity 23S rRNA pseudouridine(746), tRNA pseudouridine(32) synthase, SAM-dependent
mnmG   5-methylaminomethyl-2-thiouridine modification at tRNA U34

* Significantly enriched in the cytosol after DAVID enrichment analysis


Appendix Table 2.11. The expression of DEGs that are localized in the membrane.

Gene Symbol  CTR (FPKM)  MW (FPKM)  log2(FC)  FC        p-value
ompC         2.9164      37.3448    3.67865   12.8051   5.00E-05
ompG         0.181618    7.2749     5.32395   40.05605  0.0262
ompN         1.05237     13.9272    3.72618   13.23413  5.00E-05
ompA         3866.03     1410.15    -1.455    -2.74157  0.00045
ompF         23769.4     3023.73    -2.97471  -7.86095  5.00E-05
ompW         24.6986     7.69969    -1.68156  -3.20774  5.00E-05
lpp          48992.2     7703.68    -2.66893  -6.35958  5.00E-05
ecnB         485.572     123.321    -1.97727  -3.93746  5.00E-05
kgtP         578.237     133.793    -2.11166  -4.32188  5.00E-05
hupA         2631.46     922.319    -1.51252  -2.85309  0.0001
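The log2(FC) and FC columns above are consistent with the FPKM ratio MW/CTR, where a ratio below 1 is reported as its negative reciprocal (for ompA, 1410.15/3866.03 gives log2(FC) of about -1.455 and FC of about -2.74). The shell sketch below reproduces that arithmetic. The function name fold_change is hypothetical, and the sign convention is inferred from the tabulated values rather than stated in the text.

```shell
#!/bin/sh
# fold_change CTR_FPKM MW_FPKM
# Prints "log2FC<TAB>signedFC" for one gene.
# Assumption: FC is MW/CTR when the ratio is >= 1, otherwise -(CTR/MW),
# which is the convention the table values appear to follow.
fold_change() {
    awk -v ctr="$1" -v mw="$2" 'BEGIN {
        ratio = mw / ctr
        l2 = log(ratio) / log(2)                    # log base 2 of the ratio
        fc = (ratio >= 1) ? ratio : -1 / ratio      # signed fold change
        printf "%.3f\t%.3f\n", l2, fc
    }'
}

fold_change 2.9164 37.3448     # ompC (up-regulated)
fold_change 3866.03 1410.15    # ompA (down-regulated)
```

The two calls should give values close to the ompC and ompA rows of the table, rounded to three decimals. A control FPKM of zero is not handled in this sketch.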

Appendix Table 2.12. The expression of down-regulated genes that are associated with a folate-activated 1-carbon mechanism.

Gene Symbol  CTR (FPKM)  MW (FPKM)  log2(FC)  FC        p-value
glyA         1169.98     142.287    -3.03961  -8.22268  5.00E-05
fmt          398.093     179.041    -1.15282  -2.22347  0.0002
metH         371.019     47.6137    -2.96204  -7.79227  5.00E-05
purB         82.4647     31.9326    -1.36875  -2.58246  5.00E-05
purC         420.395     83.7014    -2.32842  -5.02256  5.00E-05
purD         96.7492     21.6072    -2.16274  -4.47764  0.0004
purH         74.3382     23.0064    -1.69207  -3.2312   5.00E-05
purL         152.833     35.1717    -2.11947  -4.34534  5.00E-05
purT         140.706     24.0994    -2.54562  -5.83857  5.00E-05
purU         255.917     98.917     -1.37138  -2.58719  5.00E-05
thyA         287.532     99.6657    -1.52855  -2.88496  5.00E-05


Appendix Chapter 3

Methods

1. Quality check using FastQC
[ilona@gra-login3 Dataset]$ emacs fastQC.sh
#!/bin/bash
#SBATCH --time=24:00:00
#SBATCH --account=def-pliang
#SBATCH --job-name=Cell-line.fastQC
#SBATCH --nodes=1
#SBATCH --cpus-per-task=32
#SBATCH --mem=50G
module load fastqc
fastqc MCF7_5D_1.fq MCF7_5D_2.fq -o ./fastQC
fastqc MCF7_5P_1.fq MCF7_5P_2.fq -o ./fastQC
fastqc MCF7_18D_2_1.fq MCF7_18D_2_2.fq -o ./fastQC
fastqc MCF7_18P_1.fq MCF7_18P_2.fq -o ./fastQC
fastqc PC3_5D_1.fq PC3_5D_2.fq -o ./fastQC
fastqc PC3_5P_1.fq PC3_5P_2.fq -o ./fastQC
fastqc PC3_18D_1.fq PC3_18D_2.fq -o ./fastQC
fastqc PC3_18P_1.fq PC3_18P_2.fq -o ./fastQC

2. Coverage calculation
Use fatools to calculate the total sequence length from the FASTA file
[ilona@gra-login1 CellLine]$ fatools -p S DB/Homo_sapiens.GRCh38.dna.primary_assembly.fa
Calculate the exon length
[ilona@gra-login2 ilona]$ awk -F "\t" '($3 == "exon"){len +=($5-$4+1)}; END{print len}' CellLine/DB/Homo_sapiens.GRCh38.97.gtf
Calculate the total length of the paired-end sequences from the raw reads in the FASTQ files
[ilona@gra-login2 Dataset]$ awk 'NR%4 == 2 {len +=length($0)}; END{print len}' MCF7_18P_*.fq
Calculate the number of read pairs from the FASTQ file
[ilona@gra-login2 Dataset]$ grep "@" MCF7_18P_1.fq|wc -l
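As a hedged alternative for the read-count and coverage steps: the "@" character can also occur in FASTQ quality strings, so counting lines that contain "@" may overestimate the number of reads. The sketch below counts one read per four-line record instead and divides the total bases by a target length. The function names fastq_stats and coverage are hypothetical, and the sketch assumes uncompressed, four-line-per-record FASTQ files.

```shell
#!/bin/sh
# fastq_stats FILE...
# Prints "reads<TAB>bases", counting the sequence line (2nd of every 4)
# of each record. FNR restarts per file, so several files can be passed.
fastq_stats() {
    awk 'FNR % 4 == 2 { reads++; bases += length($0) }
         END { printf "%.0f\t%.0f\n", reads, bases }' "$@"
}

# coverage TOTAL_BASES TARGET_LENGTH
# Prints the average depth, i.e. sequenced bases divided by the length of
# the genome or the summed exon length (whichever is used as the target).
coverage() {
    awk -v b="$1" -v t="$2" 'BEGIN { printf "%.2f\n", b / t }'
}

# Example usage (file names taken from the commands above):
#   fastq_stats MCF7_18P_1.fq MCF7_18P_2.fq
#   coverage <total_bases> <exon_length>
```

Passing both mate files to fastq_stats gives the total bases for the pair; passing only the first mate file gives the number of read pairs.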

3. Alignment using STAR
Building the STAR index
STAR --runMode genomeGenerate --genomeDir STARIndex --runThreadN 12 --genomeFastaFiles GRCh38_latest_genomic.fna --sjdbGTFfile GRCh38_latest_genomic.gff 1> STARIndex.log 2> STARIndex.err
Alignment using STAR, with PC3_5P as a representative sample
STAR --runMode alignReads --genomeDir ./STAR/STARIndex/ --runThreadN 12 --readFilesIn Dataset/PC3_5P_1.fq Dataset/PC3_5P_2.fq --outFileNamePrefix ./STAR/STAR_out/PC3_5P. --outSAMunmapped Within --outSAMtype BAM SortedByCoordinate --outSAMstrandField intronMotif --outFilterIntronMotifs RemoveNoncanonical 1> ./STAR/STAR_out/PC3_5P.log 2> ./STAR/STAR_out/PC3_5P.err


4. Differential gene expression analysis using the Cufflinks package
Transcript assembly using Cufflinks, with PC3_5D as a representative sample
cufflinks -u -p 16 --library-type fr-firststrand -G DB/Homo_sapiens.GRCh38.97.gtf -b DB/Homo_sapiens.GRCh38.dna.primary_assembly.fa -o Cufflinks/PC3_5D STAR/STAR_out/PC3_5D.Aligned.sortedByCoord.out.bam 1>Cufflinks/PC3_5D/PC3_5D.log 2>Cufflinks/PC3_5D/PC3_5D.err
List all transcripts.gtf files from the Cufflinks output in one file, “gtf_files.txt”
[ilona@gra-login3 Cufflinks]$ ls
MCF7_18D MCF7_18P MCF7_5D MCF7_5P PC3_18D PC3_18P PC3_5D PC3_5P
[ilona@gra-login3 Cufflinks]$ find -name transcripts.gtf > gtf_files.txt
Merge the transcripts.gtf files from all samples using Cuffmerge
cuffmerge -o Cuffmerge/ -g ../DB/Homo_sapiens.GRCh38.97.gtf -s ../DB/Homo_sapiens.GRCh38.dna.primary_assembly.fa gtf_files.txt 1> Cuffmerge/Cuffmerge.log 2> Cuffmerge/Cuffmerge.err
MCF7 differential gene expression analysis using Cuffdiff
cuffdiff -o Cuffdiff/MCF7/18D_5P/ -b ../DB/Homo_sapiens.GRCh38.dna.primary_assembly.fa --compatible-hits-norm --multi-read-correct --no-update-check --FDR --verbose --quiet -p 16 -L MCF7_5P,MCF7_18D -u Cuffmerge/merged.gtf ../STAR/STAR_out/MCF7_5P.Aligned.sortedByCoord.out.bam ../STAR/STAR_out/MCF7_18D.Aligned.sortedByCoord.out.bam 1>Cuffdiff/MCF7/18D_5P/Cuffdiff.log 2>Cuffdiff/MCF7/18D_5P/Cuffdiff.err
PC3 differential gene expression analysis using Cuffdiff
cuffdiff -o Cuffdiff/PC3/18D_5P/ -b ../DB/Homo_sapiens.GRCh38.dna.primary_assembly.fa --compatible-hits-norm --multi-read-correct --no-update-check --FDR --verbose --quiet -p 16 -L PC3_5P,PC3_18D -u Cuffmerge/merged.gtf ../STAR/STAR_out/PC3_5P.Aligned.sortedByCoord.out.bam ../STAR/STAR_out/PC3_18D.Aligned.sortedByCoord.out.bam
Creating a volcano plot in R based on the Cuffdiff result
> setwd("~/Volcano/MCF7/6")
> Cuffdiff<-read.table("gene_exp.diff", header=TRUE)
> with(Cuffdiff, plot(log2foldchange, -log10(pvalue), pch=20, main="MCF7 5D vs 18P", col="black", cex=0.5))
> with(subset(Cuffdiff, pvalue<0.05 & log2foldchange>=2), points(log2foldchange, -log10(pvalue), pch=20, col="red", cex=0.5))
> with(subset(Cuffdiff, pvalue<0.05 & log2foldchange<=-1), points(log2foldchange, -log10(pvalue), pch=20, col="green", cex=0.5))
Creating a scatter boxplot in R based on the Cuffdiff result
> boxplot <- ggplot(MCF7.Data.Frame, aes(x = condition, y=log10(FPKM)))
> boxplot <- boxplot + geom_boxplot()
> boxplot <- boxplot + theme_bw(base_size=12)
> boxplot <- boxplot + theme(strip.background = element_blank(), strip.text.x = element_blank())
> boxplot <- boxplot + theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank())
> boxplot <- boxplot + theme(axis.text.x=element_text(family="Arial",color="black"), axis.text.y=element_text(family="Arial",color="black"))


> boxplot_1 <- boxplot + geom_boxplot() + geom_jitter(size=0.01, position = position_jitter(width = 0.21, height = 0))
Pearson correlation analysis between two conditions based on the Cuffdiff result
> Pearson_Celllines <- read.delim("~/Pearson Correlation/Pearson_Celllines.txt")
> Celllines_Pearson<-cor(Pearson_Celllines, method ="pearson")
> write.table(Celllines_Pearson, file = "Celllines_Pearson", sep="\t")
Paired t-test on the Cuffdiff results across samples
> t.test(PC3_Pearson$PC3_5D_FPKM, PC3_Pearson$PC3_5P_FPKM,var.equal=TRUE, paired=TRUE)$p.value
[1] 0.001084947
> t.test(PC3_Pearson$PC3_5D_FPKM, PC3_Pearson$PC3_18D_FPKM,var.equal=TRUE, paired=TRUE)$p.value
[1] 0.00114821
> t.test(PC3_Pearson$PC3_5D_FPKM, PC3_Pearson$PC3_18P_FPKM,var.equal=TRUE, paired=TRUE)$p.value
[1] 3.026933e-07
> t.test(PC3_Pearson$PC3_5P_FPKM, PC3_Pearson$PC3_18D_FPKM,var.equal=TRUE, paired=TRUE)$p.value
[1] 0.1234485
> t.test(PC3_Pearson$PC3_5P_FPKM, PC3_Pearson$PC3_18P_FPKM,var.equal=TRUE, paired=TRUE)$p.value
[1] 7.140998e-07
> t.test(PC3_Pearson$PC3_18P_FPKM, PC3_Pearson$PC3_18D_FPKM,var.equal=TRUE, paired=TRUE)$p.value
[1] 0.0004786915
Continue for all comparisons between samples in MCF7 and between the two groups (MCF7 and PC3)
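For readers who want to spot-check a single pairwise Pearson coefficient outside R, the following shell sketch computes r for two columns of a tab-delimited FPKM table with one header row. It is only an illustration of the standard formula, not the procedure used for the reported results; the function name pearson and the input layout are assumptions.

```shell
#!/bin/sh
# pearson FILE COL_X COL_Y
# Prints the Pearson correlation coefficient between two numeric columns
# (1-based indices) of a tab-delimited file whose first line is a header.
pearson() {
    awk -F'\t' -v x="$2" -v y="$3" '
        NR > 1 {                          # skip the header row
            n++
            sx  += $x;      sy  += $y
            sxx += $x * $x; syy += $y * $y
            sxy += $x * $y
        }
        END {
            num = n * sxy - sx * sy
            den = sqrt((n * sxx - sx * sx) * (n * syy - sy * sy))
            if (den == 0) { print "NA"; exit }   # constant column or no data
            printf "%.4f\n", num / den
        }' "$1"
}

# Example usage (hypothetical column positions):
#   pearson Pearson_Celllines.txt 1 2
```

The single-pass sums keep the sketch short; for very large FPKM values a two-pass calculation on centered data would be numerically safer.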

5. Creating the list of significant DEGs
Creating the list of significant DEGs for MCF7
[ilona@gra-login1 18P_5P]$ awk -F"\t" -v threshold=0.05 '$12<=threshold' gene_exp.diff >0.05.18P_5P.gene_exp.diff

Creating the list of upregulated DEGs for MCF7
[ilona@gra-login3 MCF7_DEGs]$ for f in 18D_18P 18D_5D 18D_5P 18P_5P 5D_5P; do awk -F"\t" -v threshold=1 '$10>=threshold' 0.05.$f.gene_exp.diff > 0.05.LFC1.$f.gene_exp.diff; done
[ilona@gra-login3 MCF7_DEGs]$ for f in 18D_18P 18D_5D 18D_5P 18P_5P 5D_5P; do awk -F"\t" -v threshold=5 '$9>=threshold' 0.05.LFC1.$f.gene_exp.diff > FPKM.0.05.LFC1.$f.gene_exp.diff; done
[ilona@gra-login1 5D_18P]$ awk -F"\t" -v threshold=1 '$10>=threshold' 0.05.5D_18P.gene_exp.diff>0.05.LFC1.5D_18P.gene_exp.diff
[ilona@gra-login1 5D_18P]$ awk -F"\t" -v threshold=5 '$9>=threshold' 0.05.LFC1.5D_18P.gene_exp.diff>FPKM.0.05.LFC1.5D_18P.gene_exp.diff
Creating the list of downregulated DEGs for MCF7
[ilona@gra-login3 MCF7_DEGs]$ for f in 18D_18P 18D_5D 18D_5P 18P_5P 5D_5P; do awk -F"\t" -v threshold=-1 '$10<=threshold' 0.05.$f.gene_exp.diff > 0.05.LFC-1.$f.gene_exp.diff; done


[ilona@gra-login3 MCF7_DEGs]$ for f in 18D_18P 18D_5D 18D_5P 18P_5P 5D_5P; do awk -F"\t" -v threshold=5 '$8>=threshold' 0.05.LFC-1.$f.gene_exp.diff > FPKM.0.05.LFC-1.$f.gene_exp.diff; done
[ilona@gra-login1 5D_18P]$ awk -F"\t" -v threshold=-1 '$10<=threshold' 0.05.5D_18P.gene_exp.diff>0.05.LFC-1.5D_18P.gene_exp.diff
[ilona@gra-login1 5D_18P]$ awk -F"\t" -v threshold=5 '$8>=threshold' 0.05.LFC-1.5D_18P.gene_exp.diff>FPKM.0.05.LFC-1.5D_18P.gene_exp.diff

Continue with the same steps to create the list of significant DEGs for PC3.
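The three filtering passes in this step (p-value ≤ 0.05 in column 12, |log2(FC)| ≥ 1 in column 10, and FPKM ≥ 5 in column 9 for upregulated or column 8 for downregulated genes) can also be written as a single pass. The sketch below is one possible way to do so, not the commands that produced the reported lists. The function name filter_degs is hypothetical. Unlike the commands above, it keeps the header line and skips rows whose fold change is reported as infinite; both are choices made for this sketch.

```shell
#!/bin/sh
# filter_degs DIRECTION FILE [P_MAX] [LFC_MIN] [FPKM_MIN]
# DIRECTION is "up" or "down"; FILE is a Cuffdiff gene_exp.diff table.
# Defaults (0.05, 1, 5) mirror the thresholds used in the commands above.
filter_degs() {
    direction=$1
    file=$2
    awk -F'\t' -v dir="$direction" -v p="${3:-0.05}" \
        -v lfc="${4:-1}" -v fpkm="${5:-5}" '
        NR == 1      { print; next }   # keep the header (sketch-only choice)
        $10 ~ /inf/  { next }          # skip infinite fold changes (assumption)
        $12 + 0 > p  { next }          # p-value filter
        dir == "up"   && $10 + 0 >=  lfc && $9 + 0 >= fpkm { print }
        dir == "down" && $10 + 0 <= -lfc && $8 + 0 >= fpkm { print }
    ' "$file"
}

# Example usage:
#   filter_degs up   gene_exp.diff > FPKM.0.05.LFC1.gene_exp.diff
#   filter_degs down gene_exp.diff > FPKM.0.05.LFC-1.gene_exp.diff
```

Doing the three tests in one awk call avoids the intermediate files, at the cost of no longer having the p-value-only list on disk.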

6. Co-expression analysis to reveal the major GO terms and KEGG pathways
[ilona@gra-login3 ~]$ projects/ctb-pliang/lianglab/bin/perl/Stat_row2allpw_cor.pl -g 0.95 -o l -v1 projects/ctb-pliang/ilona/Correlation/Cell-Lines.DEGs.FPKM.txt >projects/ctb-pliang/ilona/Correlation/Cell-Lines.grouping.cor.txt
2463 lines processed.
processing groups.
Gene groups containing more than 100 genes were subjected to enrichment analysis using the DAVID web tools.


Results

Appendix Figure 3.1. Volcano plot showing the expression distribution of DEGs from different comparisons in MCF7 using Cuffdiff result. Red and green dots indicate significant DEGs (p-value ≤ 0.05) showing up and down-regulated expression (FC ≥ 2.0), respectively.


Appendix Figure 3.2. Volcano plot showing the expression distribution of DEGs from different comparisons in PC3 using Cuffdiff result. Red and green dots indicate significant DEGs (p-value ≤ 0.05) showing up and down-regulated expression (FC ≥ 2.0), respectively.


Appendix Figure 3.3. A network of significantly enriched GO terms and KEGG pathways among DEGs in MCF7 at 5D vs 18D based on the overlapped genes with a minimum of 37.5% similarity. Each circular node is a gene set with a diameter proportional to the number of the genes involved. The inner node color represents the p-value of the enrichment in the range of 0 – 0.05. The darker the node, the more strongly the term/pathway is enriched (p-value < 0.05). Lines between nodes represent the fraction of overlapped genes between the nodes, and the thicker the line, the greater the overlap. KEGG pathways are indicated with a prefix of “hsa”, while GO terms are prefixed with “GO:”.


Appendix Figure 3.4. A network of significantly enriched GO terms and KEGG pathways among DEGs in MCF7 at 5P vs 18P based on the overlapped genes with a minimum of 37.5% similarity. Each circular node is a gene set with a diameter proportional to the number of the genes involved. The inner node color represents the p-value of the enrichment in the range of 0 – 0.05. The darker the node, the more strongly the term/pathway is enriched (p-value < 0.05). Lines between nodes represent the fraction of overlapped genes between the nodes, and the thicker the line, the greater the overlap. KEGG pathways are indicated with a prefix of “hsa”, while GO terms are prefixed with “GO:”.


Appendix Figure 3.5. A network of the most significantly enriched GO terms and KEGG pathways among DEGs in PC3 at 5P vs 18P based on the minimum of 37.5% overlapped genes. Each circular node is a gene set with a diameter proportional to the number of the genes involved. The inner node color represents the p-value of the enrichment in the range of 0 – 0.05. The darker the node, the more strongly the term/pathway is enriched (p-value < 0.05). Lines between nodes represent the fraction of overlapped genes between the nodes, and the thicker the line, the greater the overlap. KEGG pathways are indicated with a prefix of “hsa”, while GO terms are prefixed with “GO:”.


Appendix Figure 3.6. A network of significantly enriched GO terms and KEGG pathways among DEGs in MCF7 at 5D vs 5P based on the minimum of 37.5% overlapped genes. Each circular node is a gene set with a diameter proportional to the number of the genes involved. The inner node color represents the p-value of the enrichment in the range of 0 – 0.05. The darker the node, the more strongly the term/pathway is enriched (p-value < 0.05). Lines between nodes represent the fraction of overlapped genes between the nodes, and the thicker the line, the greater the overlap. KEGG pathways are indicated with a prefix of “hsa”, while GO terms are prefixed with “GO:”.


Appendix Figure 3.7. A network of significantly enriched GO terms and KEGG pathways among DEGs in MCF7 at 18D vs 18P based on the minimum of 37.5% overlapped genes. Each circular node is a gene set with a diameter proportional to the number of the genes involved. The inner node color represents the p-value of the enrichment in the range of 0 – 0.05. The darker the node, the more strongly the term/pathway is enriched (p-value < 0.05). Lines between nodes represent the fraction of overlapped genes between the nodes, and the thicker the line, the greater the overlap. KEGG pathways are indicated with a prefix of “hsa”, while GO terms are prefixed with “GO:”.


Appendix Figure 3.8. A network of significantly enriched GO terms and KEGG pathways among DEGs in PC3 at 18D vs 18P based on the minimum of 37.5% overlapped genes. Each circular node is a gene set with a diameter proportional to the number of the genes involved. The inner node color represents the p-value of the enrichment in the range of 0 – 0.05. The darker the node, the more strongly the term/pathway is enriched (p-value < 0.05). Lines between nodes represent the fraction of overlapped genes between the nodes, and the thicker the line, the greater the overlap. KEGG pathways are indicated with a prefix of “hsa”, while GO terms are prefixed with “GO:”.


Appendix Figure 3.9. The comparison of the most significantly enriched KEGG pathways (p-value and Benjamini corrected p-value ≤ 0.05) from all DEGs in MCF7. MCF7, breast cancer cell line; 5P, 5% O2 in Plasmax medium; 5D, 5% O2 in DMEM; 18P, 18% O2 in Plasmax medium; 18D, 18% O2 in DMEM.

Appendix Figure 3.14. The comparison of the most significantly enriched KEGG pathways (p-value and Benjamini corrected p-value ≤ 0.05) from all DEGs in PC3. PC3, prostate cancer cell line; 5P, 5% O2 in Plasmax medium; 5D, 5% O2 in DMEM; 18P, 18% O2 in Plasmax medium; 18D, 18% O2 in DMEM.


Appendix Table 3.1. Enriched GO terms and KEGG pathways among all DEGs in MCF7 in response to oxygen level change in DMEM. Biological Process Fold Term p-value Enrichment Benjamini Count Genes KIFC1, KNTC1, AURKA, PTTG1, FAM83D, CDCA8, CDCA7, MIS18A, CDCA2, CDCA5, CCNA2, TUBA1B, CDCA4, CDCA3, KIF14, CDK1, CDC6, KIF11, ANAPC5, LIG1, CCNF, TPX2, TACC3, UBE2C, MCM5, CDK2, TACC1, DCLRE1A, MAD2L1, GO:0051301~cell division 2.14E-28 5.850871 5.02E-25 60 TIMELESS, CCND3, SGO1, ZWINT, CDCA7L, CKS1B, HAUS5, NEK2, NR3C1, CCNG1, CCNG2, TUBB, NCAPH, NCAPG2, NCAPG, BUB1, SKA3, NUF2, CENPF, KIF18B, NDC80, SPDL1, BIRC5, CDC20, KNSTRN, REEP4, CDC25A, SMC4, CCNB1, CCNB2, KNL1 KIF22, HAUS5, NEK2, KNTC1, PKMYT1, ANLN, AURKA, NR3C1, CEP55, AURKB, PTTG1, CCNG1, CCNG2, FAM83D, NCAPG2, MIS18A, INCENP, CDCA2, GO:0007067~mitotic nuclear BUB1, SKA3, CCNA2, CDCA5, ASPM, 4.06E-24 6.468201 4.77E-21 47 division CDCA3, CDK1, CDC6, KIF11, ANAPC5, CCNF, KIF15, NUF2, TPX2, CENPF, NDC80, CDC20, BIRC5, REEP4, CDK2, CDC25A, DCLRE1A, CCNB2, TIMELESS, NOLC1, KNL1, PLK1, SGO1, CIT CLSPN, BLM, TICRR, CHEK1, MCM10, CDT1, CDC45, MCM7, POLE2, ORC6, ORC1, EXO1, RECQL4, CDK1, SSRP1, GO:0006260~DNA CDC6, DTL, GINS3, LIG1, POLE, BRIP1, 1.30E-23 8.367375 1.01E-20 38 replication RMI2, MCM2, MCM3, MCM4, CDK2, CDC25A, MCM5, MCM6, POLD3, TIMELESS, RFC2, RRM2, POLD2, RRM1, PCNA, CHTF18, CHAF1A IQGAP3, PKMYT1, MCM10, CDT1, TYMS, CDC45, MCM7, POLE2, CDKN2C, ORC6, GO:0000082~G1/S transition RANBP1, ORC1, CDCA5, CDC6, CDK1, 8.12E-19 9.369042 4.76E-16 28 of mitotic cell cycle POLE, MCM2, CDKN3, MCM3, MCM4, CDK2, CDC25A, MCM5, MCM6, CDKN1A, DHFR, RRM2, PCNA CLSPN, KIF22, XRCC2, BLM, TICRR, UNG, RPS27L, CHEK1, PTTG1, RRM2B, TTC5, POLE2, FANCG, FANCA, EXO1, RECQL4, GO:0006281~DNA repair 2.16E-12 4.6475 1.01E-09 32 SSRP1, CDK1, NUDT1, LIG1, POLE, TRIM28, RAD54L, CDK2, ATRX, UHRF1, BTG2, FANCD2, CHAF1A, PARP2, GADD45A, UBE2T KIF22, CENPM, NUF2, KNTC1, CENPF, BIRC5, NDC80, SPDL1, CDC20, AURKB, GO:0007062~sister 1.55E-11 6.95856 
6.07E-09 21 CENPH, CDCA8, MAD2L1, SGO1, KNL1, chromatid cohesion PLK1, INCENP, ZWINT, BUB1, CENPU, CDCA5 CDC6, CDC45, MCM7, POLE2, POLE, GO:0006270~DNA 4.54E-11 13.86535 1.52E-08 13 ORC6, MCM2, MCM3, MCM10, ORC1, replication initiation MCM4, MCM5, MCM6 GO:0000086~G2/M CDK1, NES, HAUS5, NEK2, TPX2, 7.57E-11 5.729868 2.22E-08 23 transition of mitotic cell PKMYT1, BIRC5, CHEK1, AURKA, OPTN,

240 cycle CDC25A, CDK2, HMMR, CCNB1, PLK4, CDKN1A, TUBB, CCNB2, PLK1, CIT, MELK, CALM1, TUBB4B GO:0000070~mitotic sister KIFC1, MAD2L1, PLK1, NEK2, ZWINT, 1.96E-08 13.65203 5.10E-06 10 chromatid segregation KIF18B, NDC80, ESPL1, KNSTRN, SMC4 KIF11, NEK2, NUF2, CENPF, NDC80, GO:0007059~.05E-08 7.026781 1.65E-05 14 NR3C1, KNSTRN, SGO1, HJURP, INCENP, segregation MIS18A, CDCA2, SKA3, TOP2A CKS1B, HRAS, E2F8, DTYMK, AURKB, MCM10, CITED2, ZFP36L1, FAM83D, TYMS, MCM7, BCL2, PRMT5, BUB1, GO:0008283~cell 1.22E-07 2.984051 2.61E-05 32 MATK, PDK1, CDK1, MKI67, DLGAP5, proliferation KIF15, FSCN1, TPX2, CENPF, TACC3, TACC1, CDC25A, UHRF1, SERPINF1, PLK1, PCNA, TCF19, MELK GO:0000083~regulation of transcription involved in 1.68E-07 13.35525 3.29E-05 9 G1/S transition of mitotic cell CDK1, CDC6, TYMS, CDC45, DHFR, cycle RRM2, PCNA, ORC1, CDT1 GO:0000079~regulation of cyclin-dependent protein CDC6, CDKN1A, BLM, CDKN2C, PKMYT1, 1.41E-06 8.751303 2.54E-04 10 serine/threonine kinase CCNG1, CDKN3, CCNA2, GADD45A, activity CDC25A GO:0006977~DNA damage response, signal transduction CCNB1, E2F1, CDK1, CDKN1A, BTG2, 1.58E-06 6.605822 2.65E-04 12 by p53 class mediumtor E2F7, BAX, PCNA, AURKA, GADD45A, resulting in cell cycle arrest GTSE1, CDK2 GO:0007051~spindle KIF11, TTK, AURKA, RANBP1, AURKB, 3.78E-06 14.93191 5.92E-04 7 organization KNSTRN, ASPM TXNIP, E2F2, CKS1B, GMNN, RBL1, DTYMK, SUV39H1, AURKA, CDC20, GO:0007049~cell cycle 6.16E-06 3.302911 9.03E-04 21 AURKB, MCM2, CDKN3, TACC1, RCBTB1, UHRF1, TSPYL2, CCND3, NOLC1, HJURP, CHTF18, CHAF1A GO:0061621~canonical PKM, TPI1, PFKFB3, PGAM1, HK2, PFKP, 7.11E-06 10.50156 9.81E-04 8 glycolysis GAPDH, ENO1 DDX39A, BLM, DTL, ZMAT3, SUV39H1, GO:0006974~cellular CHEK1, RPS27L, MCM10, CDKN1A, response to DNA damage 1.19E-05 3.281739 0.001552 20 MCM7, TIMELESS, BTG2, BBC3, BCL2, stimulus WDR76, SPATA18, H2AFX, FANCG, TOP2A, UBE2T E2F2, DTL, CCNF, RBL1, PKMYT1, CENPF, GO:0051726~regulation of 1.59E-05 4.128639 0.001968 15 CCNG1, MYBL2, CCNG2, 
CDC25A, cell cycle CCNB1, CCNB2, PLK1, JUN, GADD45A GO:0000722~telomere maintenance via 3.12E-05 8.53252 0.00366 8 POLD3, POLE2, RFC2, LIG1, POLE, POLD2, recombination PCNA, BRCA2 GO:0051439~regulation of ubiquitin-protein ligase 4.01E-05 10.38742 0.004468 7 activity involved in mitotic CCNB1, CDK1, ANAPC5, PLK1, CDC20, cell cycle UBE2C, CDK2 GO:0006096~glycolytic LDHA, TPI1, PGM1, PGAM1, HK2, 4.75E-05 8.030607 0.005052 8 process DHTKD1, GAPDH, ENO1 GO:0000731~DNA synthesis EXO1, POLD3, XRCC2, BLM, POLE, 5.78E-05 7.801161 0.005888 8 involved in DNA repair BRCA2, BRIP1, RMI2 GO:0007080~mitotic KIF14, CCNB1, KIFC1, KIF22, CDCA8, 8.42E-05 7.379477 0.008203 8 metaphase plate congression SPDL1, CEP55, CDCA5 TXNIP, CDK1, HMGB2, LDHA, GGH, GO:0042493~response to CTPS1, CENPF, FOSB, AK4, GAL, RAD54L, 9.28E-05 2.58221 0.008683 23 drug CCNB1, TYMS, FOS, CDKN1A, MCM7, JUN, BCL2, OXCT1, PTN, IGFBP2, THBS1,

DNMT3B GO:0031145~anaphase- CCNB1, CDK1, MAD2L1, ANAPC5, promoting complex- 9.85E-05 4.75229 0.008857 11 PSMC3, PLK1, AURKA, CDC20, PTTG1, dependent catabolic process AURKB, UBE2C SLC29A1, ZFP36L1, CCNB1, E2F1, BBC3, GO:0071456~cellular 1.12E-04 4.26626 0.009694 12 BCL2, EDN1, MST1, SUV39H1, BNIP3, response to hypoxia PTN, CCNA2 GO:0034501~protein 1.32E-04 17.06504 0.010993 5 localization to kinetochore CDK1, KNL1, TTK, SPDL1, AURKB GO:0006268~DNA unwinding involved in DNA 1.32E-04 17.06504 0.010993 5 replication MCM7, MCM2, MCM4, TOP2A, MCM6 GO:0000281~mitotic KIF4A, PLK1, ANLN, STMN1, CEP55, 1.62E-04 8.238295 0.013063 7 cytokinesis RACGAP1, KIF20A GO:0007052~mitotic spindle CCNB1, KIF11, WDR62, TTK, NDC80, 1.98E-04 7.963686 0.01538 7 organization AURKA, STMN1 GO:0097193~intrinsic SIVA1, CDKN1A, HRAS, CASP4, BBC3, 1.98E-04 7.963686 0.01538 7 apoptotic signaling pathway BAX, SART1 GO:0032508~DNA duplex RECQL4, GINS1, ATRX, CDC45, BLM, 2.63E-04 6.205469 0.0197 8 unwinding BRIP1, MCM3, MCM5 GO:0007093~mitotic cell HRAS, MAD2L1, ZWINT, KNTC1, BUB1, 2.88E-04 7.465955 0.020258 7 cycle checkpoint TTK, CHEK1 GO:1901796~regulation of EXO1, SSRP1, BLM, RFC2, PRMT5, TPX2, signal transduction by p53 2.80E-04 3.578154 0.020344 13 BRIP1, AURKA, CHEK1, RMI2, AURKB, class mediumtor TTC5, CDK2 GO:0009612~response to TXNIP, CCNB1, BTG2, JUN, FOSB, 3.02E-04 5.206284 0.02064 9 mechanical stimulus IGFBP2, THBS1, CXCL12, CITED2 GO:0006297~nucleotide- excision repair, DNA gap 5.62E-04 8.53252 0.035977 6 filling POLD3, RFC2, LIG1, POLE, POLD2, PCNA GO:0090307~mitotic spindle KIFC1, TUBGCP3, KIF11, NEK2, TPX2, 5.60E-04 6.636405 0.03688 7 assembly BIRC5, MYBL2 GO:0015949~nucleobase- containing small molecule 6.85E-04 8.19122 0.042571 6 RRM2, DTYMK, RRM1, CTPS1, RRM2B, interconversion AK4 GO:0006271~DNA strand elongation involved in DNA 7.63E-04 11.37669 0.046041 5 replication POLD3, GINS1, GINS3, POLD2, PCNA GO:0009636~response to CDK1, DHRS2, TYMS, FOS, CDKN1A, 8.34E-04 4.015304 0.047778 
10 toxic substance BAX, BCL2, SLC6A14, FAS, DNMT3B GO:0000732~strand 8.28E-04 7.876173 0.048663 6 displacement EXO1, XRCC2, BLM, BRCA2, BRIP1, RMI2 Cellular Component Fold Term p-value Enrichment Benjamini Count Genes XRCC2, CRABP2, CDKN2AIPNL, PKMYT1, AURKA, RPS27L, AURKB, MCM10, SART1, PGR, CDCA8, CDCA7, PACSIN3, INCENP, MIS18A, CDCA2, ORC6, H2AFX, CDCA5, CCNA2, ORC1, DDX39A, H1F0, ANAPC5, DTL, NEIL3, USP1, LIG1, POLE, RBL1, RMI2, OPTN, DEPDC1, MRTO4, GO:0005654~nucleoplasm 1.29E-20 2.051157 5.50E-18 162 DCLRE1A, TIMELESS, SGO1, RFC2, JUN, SIVA1, HMGB1, KIF4A, HMGB2, BLM, LMNB1, TICRR, PFKFB3, ANLN, CHEK1, RRM2B, MYBL2, CCNG1, TTC5, RIOK2, RPS26, POLE2, NCAPG2, HNRNPD, SYBU, WDHD1, ASF1B, DNMT3B, GINS1, GINS3, SUV39H1, LMCD1, TOMM40, BRIP1, SMAD3, NR4A1, BRCA2, ATAD2, CDC20,

S100A14, RAD54L, POLD3, CDKN1A, DHFR, NOLC1, PSMC3, PLK1, POLD2, PCNA, PARP2, POP7, SNRNP25, E2F1, IER2, CLSPN, E2F2, E2F7, EZH2, BNIP3, GTSE1, CDT1, FOS, ACOT7, CDC45, MCM7, PRMT5, FANCE, FANCG, KDM5B, FANCA, TOP2A, TSEN15, EGR1, CDK1, CDC6, TPX2, MCM2, UBE2C, MCM3, MCM4, CDK2, MCM5, MCM6, TRAP1, CCND3, HIST2H2BE, FANCD2, HIST2H2BF, RRM2, ZMIZ1, NUP205, RRM1, NCAPH2, KPNA2, GADD45A, UBE2T, CKS1B, MFSD8, UNG, ZNF367, NR3C1, TYMS, TSPYL2, HJURP, BUB1, EXO1, SSRP1, CENPM, TONSL, GMNN, TRIM28, AFF3, CENPF, BIRC5, RACGAP1, CDC25A, CENPH, SMC4, CCNB1, CCNB2, RPS6KA4, PHF19, KNL1, CHTF18, NOP56, CENPU, KIF20A, CALM1 KIFC1, MCRIP1, LDHA, HRAS, H1FX, AURKA, AURKB, PTTG1, CITED2, PGR, CDCA8, CDCA7, H2AFV, MIS18A, WDR76, H2AFZ, H2AFX, FAS, CDCA5, CCNA2, CDCA4, H1F0, MAGI2, ANAPC5, LIG1, POLE, ESPL1, OPTN, DEPDC1, UHRF1, DCLRE1A, MAD2L1, SGO1, ZWINT, JUN, CDCA7L, TICRR, NEK2, ZNF76, DUSP12, CHEK1, TPI1, NCAPG2, DNMT3B, TUBB4B, RECQL4, MKI67, SMAD6, NUF2, GGH, SMAD3, CDC20, NDC80, SPDL1, CSRP2, RAD54L, DLX3, WDR62, PCNA, TCF19, CHAF1A, PARP2, SNRNP25, POP7, TCOF1, EZH2, BNIP3, YBX2, CDT1, PKM, PBXIP1, FANCE, TOP2A, FANCA, ZFP36, CDC6, CCNF, ANP32E, TPX2, FOSB, MXD4, CCND3, FANCD2, HIST2H2BE, HIST2H2BF, RRM2, GADD45A, RASD1, UNG, ZNF367, ZFP36L1, NCAPH, NCAPG, BCL2, BAZ2B, TRIP13, SSRP1, GMNN, DLGAP5, TRIM28, AFF3, BIRC5, CDKN3, GO:0005634~nucleus 6.63E-18 1.594853 1.41E-15 245 CABYR, MT1X, PLEKHF1, ATRX, CCNB1, CCNB2, RPS6KA4, DUSP1, KNL1, SVIL, MT2A, HIST1H2AI, GAMT, CALM1, CRABP2, DTYMK, KNTC1, EIF5A, RPS27L, APOBEC3H, MCM10, FOXO6, RCBTB1, CDKN2C, ORC6, ORC1, ASPM, NET1, DDX39A, NUDT1, DTL, USP1, NEIL3, RBL1, RMI2, BASP1, TACC1, MRTO4, ASCL2, TIMELESS, CPE, SDCBP, SWT1, HMGB1, HMGB2, CNBP, HMGB3, FGFR3, BLM, TFAP4, MSMB, ITGB4, SESN2, TIMP3, SESN1, SESN3, TUBB, HNRNPD, ASRGL1, ASF1B, DDIAS, GINS1, RFX5, LPP, GINS3, LMCD1, SUV39H1, ATAD2, BRCA2, NR4A1, BRIP1, POLD3, CDKN1A, CORO1A, PLK1, PSMC3, PNRC1, POLD2, TMPO, DCXR, E2F1, IER2, KIF22, IER3, PTGES2, E2F7, LYAR, E2F8, 
CCHCR1, FOS, WARS, CDC45, MCM7, PRMT5, RANBP1, KDM5B, EGR1, KIF14, CDK1, EGR3, PFKP, MCM2, MCM3, MCM4, CDK2, MCM5, MCM6, DHRS2, CMSS1,

BTG1, NRGN, KPNA2, UBE2T, MELK, FRK, NR3C1, CALCOCO1, TYMS, TSPYL2, HJURP, SAPCD2, PLEKHO1, GAPDH, ENO1, EXO1, TXNIP, NACC1, CNTD2, FLT4, CENPF, KIF18B, KNSTRN, COTL1, RACGAP1, CDC25A, CENPH, SMC4, ADI1, BAX, SP6, CENPU, GDF15 HRAS, LDHA, DTYMK, CRABP2, KNTC1, PGAM1, IQGAP3, PKMYT1, EIF5A, AURKA, PTTG1, AURKB, ITPKA, SART1, FAH, CDCA8, PGPEP1, CDKN2C, INCENP, VPS13A, FAS, CDCA5, ORC1, CDCA3, NET1, MATK, NUDT1, ANAPC5, ESPL1, OPTN, RND1, MAD2L1, TNNT1, SGO1, ZWINT, JUN, PGM1, PDE5A, SDCBP, STMN1, KIF4A, CNBP, SRM, PFKFB3, NEK2, CTPS1, CHEK1, DUSP12, SESN2, RIOK2, SESN1, TK1, RPS26, TPI1, RAC3, HNRNPD, IDH2, ASRGL1, GALE, BMF, FH, TUBB4B, ODC1, GABARAPL1, SMAD6, FSCN1, GGH, NUF2, HGD, FN3KRP, SMAD3, CDC20, SPDL1, NDC80, DDX58, CDKN1A, PLK4, CORO1A, DHFR, BBC3, GO:0005829~cytosol 5.44E-12 1.680067 7.71E-10 158 PSMC3, PLK1, AHCYL2, SAT1, IER3, KIF22, PTGES2, RHOQ, RHOV, GTSE1, CDT1, PKM, MTHFD1, WARS, FOS, ACOT7, MCM7, PBXIP1, PRMT5, ARHGAP11A, DHTKD1, KIF14, ZFP36, CDK1, CDC6, KIF11, ACTA2, KIF15, PFKP, TPX2, PADI2, UBE2C, LDLRAP1, CDK2, BTG2, RRM2, RRM1, CTSH, KPNA2, AARS2, PPFIA4, HAUS5, PLBD1, HK2, HMMR, ZFP36L1, TUBGCP3, TYMS, NCAPH, NCAPG, BCL2, BUB1, SH2B2, GAPDH, ABCA12, ENO1, TXNIP, CENPM, GMNN, NAT1, CENPF, BIRC5, RACGAP1, CDC25A, CENPH, SMC4, CCNB1, ADI1, CCNB2, KNL1, BAX, MT2A, PHGDH, SP6, GAMT, CENPU, CIT, CALM1 KIF22, NEK2, CENPF, TTK, NDC80, AURKB, KNSTRN, WDR81, CENPH, GO:0000776~kinetochore 7.19E-09 6.962867 7.64E-07 16 MAD2L1, SGO1, PLK1, INCENP, ZWINT, BUB1, SKA3 HRAS, LDHA, MCRIP1, XRCC2, DZIP3, EDN1, PGAM1, PTTG1, SART1, CITED2, CDCA7, MIS18A, CDCA2, FAS, GOLGA8A, CDCA5, CCNA2, MAGI2, ESPL1, OPTN, SGO1, ZWINT, PGM1, CDCA7L, SIVA1, KIF4A, NEK2, DUSP12, LIF, RPS26, MYO15B, SKA3, WDHD1, DNMT3B, FH, RECQL4, ODC1, MKI67, SMAD6, SMAD3, CDC20, S100A14, NOLC1, PCNA, PARP2, GO:0005737~cytoplasm 1.64E-08 1.39729 1.39E-06 207 POP7, SNRNP25, CLSPN, TCOF1, EZH2, BNIP3, YBX2, PKM, FANCG, WDR34, TOP2A, FANCA, ZFP36, CDC6, ACTA2, CARD10, CCND3, HIST2H2BE, FANCD2, 
HIST2H2BF, NUP205, RRM2, RRM1, GADD45A, GULP1, ZFP36L1, NCAPG, BCL2, BUB1, PHLDA3, ABCA12, SSRP1, GMNN, DLGAP5, AFF3, BIRC5, CDKN3, MT1X, CABYR, CCNB1, RPS6KA4, DUSP1, KNL1, SVIL, MT2A, TEX19, CHTF18,

GAMT, NOP56, DRAM1, CALM1, CRABP2, KNTC1, TTK, EIF5A, APOBEC3H, MCM10, FOXO6, RCBTB1, KRT80, PACSIN3, CDKN2C, ORC1, ASPM, DDX39A, NUDT1, DTL, RMI2, BASP1, TACC3, TACC1, MRTO4, ASCL2, RELT, SPATA18, SDCBP, STMN1, HMGB2, HMGB3, BLM, RRM2B, SESN2, TTC5, SESN1, CCNG2, SESN3, TUBB, ASRGL1, TFF3, DDIAS, GINS1, LPP, SELENOW, FSCN1, LMCD1, BRCA2, NR4A1, TOMM40, BRIP1, DDX58, CORO1A, PSMC3, PLK1, TROAP, TMPO, IER2, KIF22, FAM83D, CCHCR1, WARS, ACOT7, CDC45, MCM7, PRMT5, DDX60, KLHL24, RANBP1, KDM5B, EGR1, CDK1, KIF11, PFKP, PADI2, MCM2, DAPK2, UBE2C, CDK2, DHRS2, BTG1, ZMIZ1, KPNA2, UBE2T, SHCBP1, FRK, NR3C1, CALCOCO1, TUBGCP3, TYMS, TSPYL2, HJURP, SAPCD2, PLEKHO1, SH2B2, GAPDH, ENO1, TXNIP, EXO1, NES, NACC1, TONSL, FLT4, CENPF, KIF18B, KNSTRN, RACGAP1, COTL1, CDC25A, ACTL8, SMC4, ADI1, BAX, GDF15 CENPM, NEK2, NUF2, KNTC1, NDC80, GO:0000777~condensed BIRC5, KNSTRN, CENPH, MAD2L1, SGO1, 1.99E-08 6.48267 1.41E-06 16 chromosome kinetochore HJURP, KNL1, INCENP, ZWINT, BUB1, CENPU MCM7, TONSL, MCM2, MCM3, MCM4, GO:0042555~MCM complex 3.91E-08 27.41629 2.38E-06 7 MCM5, MCM6 CDC6, KIF11, NEK2, TPX2, KNTC1, CENPF, SPDL1, CDC20, KNSTRN, TACC3, GO:0000922~spindle pole 7.14E-08 5.497631 3.79E-06 17 CCNB1, TUBGCP3, MAD2L1, SGO1, PLK1, WDR62, CALM1 KIF14, CDK1, KIF4A, NEK2, CENPF, AURKA, BIRC5, AURKB, CEP55, GO:0030496~midbody 1.41E-07 4.918537 6.66E-06 18 RACGAP1, TACC1, CDCA8, PLK1, INCENP, SVIL, ASPM, SHCBP1, KIF20A KIFC1, HAUS5, KIF11, KIF15, TPX2, CENPF, TTK, BIRC5, CDC20, AURKA, GO:0005819~spindle 3.13E-07 4.952411 1.33E-05 17 NR3C1, AURKB, TUBGCP3, PLK1, INCENP, SHCBP1, KIF20A CDCA8, MKI67, HJURP, SGO1, INCENP, GO:0000775~chromosome, 4.79E-07 7.420951 1.85E-05 12 MIS18A, NUF2, SUV39H1, CENPF, BIRC5, centromeric region NDC80, CDCA5 GO:0051233~spindle KIF14, CDC6, CDCA8, PLK1, AURKA, 9.92E-06 12.98666 3.24E-04 7 midzone AURKB, RACGAP1 KIF14, KIF22, KIFC1, KIF4A, GABARAPL1, HAUS5, KIF11, NEK2, KIF15, TPX2, KIF18B, BIRC5, AURKA, RACGAP1, GO:0005874~microtubule 9.56E-06 2.833562 
3.39E-04 25 WDR81, REEP4, TUBGCP3, TUBB, PBXIP1, INCENP, SYBU, STMN1, TUBA1B, KIF20A, TUBB4B HAUS5, XRCC2, TTC26, NEK2, CHEK1, AURKA, CEP55, WDR81, TUBGCP3, CDC45, TSPYL2, NCAPG, RANBP1, CDK1, GO:0005813~centrosome 1.32E-05 2.48236 4.00E-04 30 DTL, KIF15, BRCA2, CENPF, ESPL1, CDC20, MCM3, CDK2, CCNB1, PLK4, CCNB2, PLK1, SGO1, WDR62, PCNA, CALM1 GO:0005876~spindle CDK1, KIF4A, KIF11, PLK1, SKA3, BIRC5, 2.83E-05 7.210128 8.00E-04 9 microtubule AURKA, AURKB, CALM1 245

GO:0000942~condensed nuclear chromosome outer 8.84E-05 35.24952 0.002345 4 kinetochore CCNB1, PLK1, BUB1, NDC80 GO:0032133~chromosome 2.16E-04 28.19961 0.005393 4 passenger complex CDCA8, BIRC5, AURKA, AURKB GO:0000793~condensed HMGB1, HMGB2, MKI67, FANCD2, 5.93E-04 8.459884 0.013907 6 chromosome TOP2A, CDK2 CDK1, KIF22, MAD2L1, AURKA, ESPL1, GO:0072686~mitotic spindle 9.69E-04 6.01821 0.021462 7 RACGAP1, KNSTRN H1F0, E2F1, HMGB2, EZH2, SMAD3, GO:0000790~nuclear CALCOCO1, CITED2, UHRF1, TIMELESS, 0.001234 2.7396 0.023567 15 chromatin H2AFV, HIST1H2AI, H2AFX, CHAF1A, ASF1B, CDCA5 GO:0000784~nuclear ATRX, CDK1, DCLRE1A, MCM7, PCNA, chromosome, telomeric 0.001181 3.253802 0.023638 12 BRCA2, MCM2, MCM3, MCM4, ORC1, region MCM5, MCM6 GO:0005657~replication fork 0.001127 10.3675 0.023684 5 UHRF1, XRCC2, PCNA, H2AFX, CHEK1 GO:0005971~ribonucleoside- diphosphate reductase 0.002355 35.24952 0.04264 3 complex RRM2, RRM1, RRM2B Molecular Function Fold Term p-value Enrichment Benjamini Count Genes MCRIP1, LDHA, HRAS, XRCC2, DZIP3, EDN1, PGAM1, AURKA, AURKB, PTTG1, SART1, CITED2, FAH, PGR, CDCA8, MIS18A, INCENP, SERPINE1, H2AFZ, VPS13A, H2AFX, FAS, CCNA2, CDCA5, CDCA4, CDCA3, H1F0, MAGI2, ESPL1, OPTN, DEPDC1, RND3, UHRF1, RND1, MAD2L1, MELTF, SERPINF1, SGO1, RFC2, ZWINT, JUN, PGM1, CDCA7L, MRPL48, SIVA1, KIF4A, ENPP1, LMNB1, TICRR, NEK2, MST1, CHEK1, DUSP12, TAGLN2, MYBL2, TK1, LIF, ALCAM, RPS26, TPI1, RAC3, NCAPG2, SYBU, SKA3, LETMD1, WDHD1, DNMT3B, FH, RECQL4, ODC1, MKI67, SMAD6, FAM111B, NUF2, HGD, SMAD3, CDC20, NDC80, SPDL1, CSRP2, S100A14, RAD54L, NOLC1, WDR62, FAM214A, PCNA, CHAF1A, PARP2, SNRNP25, POP7, CLSPN, KCNJ15, BCKDK, GO:0005515~protein binding 8.04E-15 1.326706 5.48E-12 339 ZMAT3, TCOF1, EZH2, BNIP3, GTSE1, CDT1, PKM, MTHFD1, PBXIP1, NEURL1B, FANCG, WDR34, TOP2A, FANCA, ZFP36, CDC6, NDUFB10, CCNF, ARHGEF19, TPX2, ARRDC4, LDLRAP1, CARD10, MXD4, TRAP1, CCND3, FANCD2, RRM2, NUP205, RRM1, NCAPH2, GADD45A, RASD1, CCDC170, PPFIA4, TBC1D9, UNG, C5, HK2, HMMR, ZFP36L1, 
NCAPH, NCAPG, BCL2, BUB1, THBS1, BAZ2B, ABCA12, TRIP13, PDK1, SSRP1, DLGAP5, GMNN, TRIM28, AFF4, BIRC5, NPY1R, ANKRD40, CDKN3, ITPR1, NAT9, MT1X, TMPRSS4, CCNB1, ATRX, PLEKHF1, CCNB2, RPS6KA4, PHF19, DUSP1, TFRC, KNL1, SVIL, MT2A, HIST1H2AI, CHTF18, NOP56, CIT, DRAM1, KIF20A, CALM1, MPZL2, FAM189B, CRABP2, KNTC1, PKMYT1, EIF5A, TTK, RPS27L,

APOBEC3H, MCM10, FOXO6, NRCAM, KRT80, PACSIN3, CDKN2C, SLC25A23, ORC6, ELOVL6, ORC1, SAMD4A, MATK, DDX39A, NUDT1, DTL, USP1, RBL1, MGP, BASP1, TACC3, TACC1, MRTO4, TNNT1, TIMELESS, SPATA18, SDCBP, STMN1, HMGB1, CNBP, HMGB2, SLC38A2, FGFR3, HMGB3, SRM, BLM, TFAP4, MSMB, ITGB4, ITGB5, RRM2B, CCNG1, TTC5, SESN2, SESN1, TIMP3, TUBB, POLE2, HNRNPD, TFF3, ASF1B, BMF, GABARAPL1, LPP, RFX5, FSCN1, SUV39H1, NR4A1, BRIP1, TOMM40, BRCA2, KCTD6, DDX58, PPIF, POLD3, CORO1A, PLK4, CDKN1A, BBC3, PLK1, PSMC3, PNRC1, POLD2, TROAP, AHCYL2, ANTXR1, E2F1, SAT1, E2F2, KIF22, IER3, IER5, PTGES2, LYAR, E2F7, E2F8, RHOV, LGR4, FAM83D, FOS, WARS, CDC45, ACOT7, MCM7, DDX60, PRMT5, TUBA1B, KDM5B, EGR1, KIF14, CDK1, KIF15, MCM2, UBE2C, MCM3, DAPK2, GAL, MCM4, CDK2, MCM5, MCM6, DHRS2, BTG2, NRM, BTG1, SLC41A3, SGF29, CTSH, KPNA2, MELK, SHCBP1, CKS1B, FRK, ACCS, NR3C1, CEP55, CALCOCO1, TUBGCP3, C1QTNF6, TSPYL2, HJURP, PLEKHO1, SH2B2, SCNN1A, GAPDH, ENO1, EXO1, TXNIP, SLC12A2, TONSL, FLT4, CENPF, KIF18B, KNSTRN, COTL1, RACGAP1, CDC25A, CENPH, SMC4, ACTL8, ADI1, KCNN4, BAX, CENPU, IGFBP2, GDF15, IGFBP5 KIF22, KIFC1, BCKDK, XRCC2, DTYMK, PKMYT1, TTK, AURKA, AURKB, ACSF3, ITPKA, MTHFD1, PKM, WARS, MCM7, DDX60, ATP8B2, TOP2A, ORC1, MATK, KIF14, DDX39A, CDK1, CDC6, KIF11, ACTA2, LIG1, KIF15, TPX2, PFKP, ABCC10, MCM2, MCM3, UBE2C, DAPK2, MCM4, MCM5, CDK2, MCM6, TRAP1, GO:0005524~ATP binding 5.41E-09 1.931767 1.86E-06 84 RFC2, ATP9A, RRM1, AARS2, UBE2T, MELK, FRK, KIF4A, FGFR3, BLM, ENPP1, NEK2, PFKFB3, HK2, CTPS1, CHEK1, RIOK2, TK1, MYO15B, BUB1, AGK, TRIP13, ABCA12, RECQL4, PDK1, MKI67, FLT4, KIF18B, ATAD2, BRIP1, AK4, RAD54L, SMC4, DDX58, ATRX, PLK4, RPS6KA4, NOLC1, PSMC3, PLK1, CHTF18, CIT, CLCN7, KIF20A TICRR, EZH2, TTC5, CALCOCO1, CITED2, FOS, CDC45, HNRNPD, TOP2A, CDCA5, GO:0003682~chromatin ORC1, DNMT3B, EXO1, SSRP1, CDK1, 3.08E-05 2.462056 0.007014 28 binding SMAD6, TRIM28, POLE, SUV39H1, ATAD2, CENPF, MCM5, ATRX, DLX3, JUN, PCNA, CHAF1A, UBE2T GO:0003678~DNA helicase ATRX, MCM7, MCM2, MCM3, 
MCM4, 5.00E-05 10.02775 0.008541 7 activity MCM5, MCM6 HMGB1, CDC45, CNBP, HMGB2, MCM7, GO:0003697~single-stranded 7.85E-05 4.436239 0.010713 12 XRCC2, BLM, NEIL3, BRCA2, MCM10, DNA binding MCM4, MCM6

GO:0003688~DNA replication origin binding | 1.97E-04 | 15.62766 | 0.022244 | 5 | CDC45, ORC6, MCM2, MCM10, MCM5
GO:0042802~identical protein binding | 3.16E-04 | 1.836094 | 0.030468 | 40 | LDHA, IER5, SRM, ACCS, E2F7, BNIP3, CTPS1, MCM10, TK1, SYP, C1QTNF6, HJURP, BCL2, GALE, FAS, THBS1, GAPDH, TRIP13, DDX39A, SMAD6, HGD, SMAD3, NDC80, BIRC5, OPTN, DAPK2, MCM6, KCTD6, DDX58, UHRF1, PLK4, MAD2L1, TFRC, PSMC3, JUN, BAX, PCNA, SDCBP, CHAF1A, DCXR

KEGG Pathway
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
hsa04110:Cell cycle | 3.73E-22 | 8.158207 | 8.36E-20 | 35 | E2F1, E2F2, PKMYT1, TTK, CHEK1, PTTG1, CDC45, MCM7, CDKN2C, BUB1, ORC6, CCNA2, ORC1, CDC6, CDK1, ANAPC5, RBL1, SMAD3, ESPL1, CDC20, MCM2, MCM3, MCM4, CDK2, CDC25A, MCM5, MCM6, CCNB1, CDKN1A, MAD2L1, CCNB2, CCND3, PLK1, PCNA, GADD45A
hsa04115:p53 signaling pathway | 2.57E-15 | 9.490656 | 2.86E-13 | 22 | CDK1, ZMAT3, CHEK1, RRM2B, CCNG1, SESN2, CCNG2, SESN1, GTSE1, CDK2, SESN3, CCNB1, CDKN1A, CCNB2, CCND3, BBC3, RRM2, BAX, SERPINE1, FAS, THBS1, GADD45A
hsa03030:DNA replication | 1.27E-09 | 10.43732 | 9.52E-08 | 13 | POLD3, MCM7, POLE2, RFC2, LIG1, POLE, POLD2, PCNA, MCM2, MCM3, MCM4, MCM5, MCM6
hsa03410:Base excision repair | 1.10E-06 | 8.758594 | 4.93E-05 | 10 | POLD3, HMGB1, POLE2, LIG1, UNG, NEIL3, POLE, POLD2, PCNA, PARP2
hsa04114:Oocyte meiosis | 1.05E-06 | 4.426641 | 5.85E-05 | 17 | CDK1, ANAPC5, PKMYT1, AURKA, CDC20, ESPL1, PTTG1, CDK2, ITPR1, PGR, CCNB1, CCNB2, MAD2L1, SGO1, PLK1, BUB1, CALM1
hsa05166:HTLV-I infection | 5.91E-06 | 2.844819 | 2.21E-04 | 25 | HLA-DQB1, E2F1, E2F2, HRAS, CHEK1, PTTG1, FOS, POLE2, CDKN2C, RANBP1, EGR1, ZFP36, ANAPC5, POLE, SMAD3, CDC20, POLD3, CDKN1A, WNT7B, MAD2L1, CCND3, BAX, JUN, POLD2, PCNA
hsa03460:Fanconi anemia pathway | 6.92E-05 | 5.453464 | 0.002212 | 10 | BLM, FANCD2, USP1, BRCA2, FANCE, BRIP1, RMI2, FANCG, FANCA, UBE2T
hsa04914:Progesterone-mediated oocyte maturation | 1.78E-04 | 3.986671 | 0.004981 | 12 | CCNB1, PGR, CDK1, MAD2L1, CCNB2, ANAPC5, PLK1, BUB1, PKMYT1, CCNA2, CDC25A, CDK2
hsa05161:Hepatitis B | 4.43E-04 | 2.990003 | 0.010964 | 15 | E2F1, E2F2, EGR3, HRAS, BIRC5, CDK2, DDX58, FOS, CDKN1A, BAX, BCL2, JUN, PCNA, FAS, CCNA2
hsa01230:Biosynthesis of amino acids | 7.52E-04 | 4.014356 | 0.016718 | 10 | PKM, TPI1, BCAT2, PHGDH, PGAM1, IDH2, PFKP, GPT2, GAPDH, ENO1
hsa03430:Mismatch repair | 9.41E-04 | 7.540007 | 0.018992 | 6 | EXO1, POLD3, RFC2, LIG1, POLD2, PCNA
hsa00010:Glycolysis / Gluconeogenesis | 0.001987 | 3.882541 | 0.036447 | 9 | PKM, LDHA, TPI1, PGM1, PGAM1, HK2, PFKP, GAPDH, ENO1
hsa04068:FoxO signaling pathway | 0.002159 | 2.804057 | 0.036553 | 13 | GABARAPL1, HRAS, BNIP3, SMAD3, CCNG2, FOXO6, CDK2, CCNB1, PLK4, CDKN1A, CCNB2, PLK1, GADD45A
hsa00240:Pyrimidine metabolism | 0.002392 | 3.147891 | 0.037596 | 11 | POLD3, TYMS, POLE2, RRM2, DTYMK, POLE, POLD2, RRM1, CTPS1, RRM2B, TK1
hsa05219:Bladder cancer | 0.002566 | 4.93472 | 0.037638 | 7 | E2F1, E2F2, CDKN1A, HRAS, FGFR3, THBS1, DAPK2
hsa03440:Homologous recombination | 0.002807 | 5.980006 | 0.038583 | 6 | POLD3, XRCC2, BLM, POLD2, BRCA2, RAD54L

Appendix Table 3.2. Enriched GO terms and KEGG pathways in MCF7 in response to oxygen level changes in Plasmax.

Biological Process
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
GO:0000082~G1/S transition of mitotic cell cycle | 6.78E-13 | 8.251018 | 1.82E-09 | 21 | CDC6, MCM2, MCM10, MCM3, MCM4, CDC25A, CDK2, MCM5, CDT1, RBBP8, MCM6, CCNE2, TYMS, CDC45, EIF4EBP1, DHFR, RRM2, PRIM2, PCNA, ID4, ORC1
GO:0006260~DNA replication | 2.77E-10 | 5.688259 | 3.72E-07 | 22 | CDC6, LIG1, BRIP1, MCM2, MCM10, MCM3, MCM4, CDC25A, CDK2, MCM5, CDT1, RBBP8, MCM6, CDC45, TIMELESS, RFC2, RRM2, RRM1, POLD2, PCNA, ORC1, DSCC1
GO:0006270~DNA replication initiation | 3.24E-09 | 13.77625 | 2.90E-06 | 11 | CCNE2, CDC6, CDC45, PRIM2, MCM2, MCM3, MCM10, MCM4, ORC1, MCM5, MCM6
GO:0000083~regulation of transcription involved in G1/S transition of mitotic cell cycle | 9.78E-07 | 13.93961 | 5.24E-04 | 8 | CDC6, TYMS, CDC45, DHFR, RRM2, PCNA, ORC1, CDT1
GO:0045429~positive regulation of nitric oxide biosynthetic process | 8.97E-07 | 9.320087 | 6.01E-04 | 10 | HSP90AB1, P2RX4, TNF, HSP90AA1, CLU, ESR1, SMAD3, JAK2, INSR, KLF4
GO:0071353~cellular response to interleukin-4 | 1.35E-06 | 13.35879 | 6.04E-04 | 8 | HSP90AB1, CORO1A, XBP1, GATA3, MCM2, IL24, TUBA1B, IMPDH2
GO:0008285~negative regulation of cell proliferation | 5.94E-05 | 2.530074 | 0.022511 | 25 | DLC1, FRK, CYP1B1, TFAP4, E2F7, IGFBP6, CXCL8, ADORA1, LIF, CD9, PTGES, GATA3, GPNMB, COL18A1, CDC6, BRIP1, IL24, SLC9A3R1, HMGA1, RERG, DHRS2, SCIN, JAK2, PMP22, KLF4
GO:0043524~negative regulation of neuron apoptotic process | 1.14E-04 | 3.946915 | 0.037583 | 13 | HSP90AB1, NES, CEBPB, NRP1, MSH2, AARS, SOD1, CORO1A, UNC5B, FYN, BCL2, HMOX1, JAK2

Cellular Component
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
DLC1, RBPMS2, TUBB2B, CRABP2, PDLIM3, MCM10, CALB2, AQP3, IL17RB, ACTG2, EIF4EBP1, CDCA7, PITPNC1, EIF1, MX2, ORC1, FTL, RET, SOCS3, AARS, RELB, PIM1, ACTN1, KRT10, PKIB, OPTN, VASH1, DDIT4, KRT17, TAGLN, HSPB8, VEGFA, CLIP1, STC1, TNFAIP3, SEPT9, OSTF1, TALDO1, HACD3, CHCHD1, CLU, BEX2, AFAP1L2, CDC37,
JRK, HNRNPA3, GO:0005737~cytoplasm 1.24E-07 1.408896 5.15E-05 174 LIF, TUBB, PSMB7, SBK1, TRIM68, NDRG4, CRMP1, PSMB3, ADRA2A, IDH1, DYRK2, HIP1, FH, PLAT, XPOT, PARD6B, MOCS2, S100P, MYO1B, PODXL, FSCN1, SMAD3, BRIP1, CELSR2, EVL, SHANK3, GAS6, SREBF2, CORO1A, PLK1, PCNA, SYTL2, RFX2, FABP5, KCTD11, NAT8L, KLF4, SLC9A1, HSP90AB1, NUAK2, TLN2, SULT2B1, RNF187, MCF2L, CKB, PKM, FAM83D, ACOT7, CDC45, LONP1, TRIM3,

GSN, DDX60, BAG3, FAM129A, IMPDH2, SERTAD1, CDC6, ARHGEF2, HSP90AA1, LIMK1, SARS, ESR1, MCM2, PALLD, SLC9A3R1, FLNB, CDK2, DHRS2, MAST4, SDC1, DOK3, CCND3, FANCD2, RRM2, RRM1, SCIN, GADD45G, ERN1, OSBPL10, GADD45A, UBE2T, SRGAP1, ABLIM2, FRK, IRX3, HAUS1, MKNK2, KITLG, IVNS1ABP, HPRT1, NECAB1, IARS, TYMS, ZNF703, MARVELD1, INPP5J, XBP1, BCL2, SAPCD2, MSI1, MTCL1, GSTO1, NFATC2, ZBP1, ABCA12, NES, GDI2, RBM24, CEBPB, HIST1H2BD, ANXA5, SOD1, COTL1, CDC25A, SAMD1, TRIM21, SH3BP5, FSD1, NUPR1, SVIL, MT2A, PBX1, JAK2, ID4, ID3, DUSP9, HPGD CRABP2, CXCL12, ACTG2, COL12A1, RAB27B, INSR, FTL, AARS, HLA-A, ACTN1, KRT10, KRT19, GGACT, MELTF, KRT17, KRT16, COL1A2, NEU1, TNFAIP3, PRPS2, OSTF1, TALDO1, CLU, CDC37, PSMB7, TUBB, PSMB3, IDH2, HNRNPD, IDH1, GALE, RHOBTB3, FH, COL18A1, PLAT, PARD6B, KIF3B, S100P, MOCS2, MYO1B, PODXL, FSCN1, CD63, GAS6, P2RX4, CORO1A, NRF1, PKP1, FREM2, THSD4, GO:0070062~extracellular PCNA, HIST1H3E, PABPC1L, PDZK1, 3.26E-07 1.609494 6.79E-05 107 exosome HIST1H3H, FABP5, SLC9A1, HSP90AB1, GREB1, NPNT, IGFBP6, SULT2B1, RHOQ, CKB, MTHFD1, PKM, ACOT7, GSN, SEMA3B, SLC25A1, FAM129A, TUBA1B, IMPDH2, HSP90AA1, SARS, SLC9A3R1, FLNB, DHRS2, SDC1, SCIN, RRM1, SMS, FRK, HPRT1, GPRC5A, IARS, ANXA6, CD9, TGM1, CALML5, GSTO1, THBS1, SCNN1B, HSPA8, GDI2, HIST1H2BD, HSPG2, GYG1, SOD1, ANXA5, COTL1, COL5A1, ANXA2, GFPT1, GFRA1, PRSS23, HPGD CRABP2, FSTL3, CBX4, MCM10, PGR, EIF4EBP1, CDCA7, GATA3, PRIM2, PITPNC1, ORC1, H1F0, LIG1, RELB, OPTN, TIMELESS, RFC2, HSPB8, DSCC1, MYBL2, HNRNPA3, PSMB7, PSMB3, HNRNPD, DYRK2, ASF1B, XPOT, SMAD3, BRIP1, NOP10, ABCG1, SREBF2, NRF1, DHFR, PKP1, PLK1, POLD2, PCNA, HIST1H3E, KAT6B, HIST1H3H, KLF4, FABP5, SLC9A1, HSP90AB1, E2F1, E2F2, E2F7, RNF187, GO:0005654~nucleoplasm 6.99E-07 1.594728 9.72E-05 105 CDT1, CCNE2, FOS, CDC45, LONP1, ACOT7, CDC6, HSP90AA1, LIMK1, ESR1, MCM2, MCM3, MCM4, HMGA1, MCM5, CDK2, MCM6, NRIP1, RBBP8, CCND3, FANCD2, RRM2, RRM1, GADD45A, UBE2T, PMEPA1, UNG, MKNK2, ZNF367, TRIB3, FHL2, EHF, IVNS1ABP, NECAB1, 
IARS, TYMS, XBP1, CAMK2B, TCEA1, NFATC2, HSPA8, HIST1H2BD, CEBPB, MSH2, CEBPG, SOD1, CDC25A, CENPH, DUSP4, ID1, JAK2, PBX1, LSM10, CENPU, ID3, HPGD DLC1, TUBB2B, CRABP2, FSTL3, CBX4, GO:0005634~nucleus 1.08E-05 1.327446 9.01E-04 170 MCM10, CALB2, AQP3, PGR, CDCA7, GATA3, EIF1, MX2, ORC1, H1F0, ZNF593,

LIG1, RELB, PIM1, KRT10, PKIB, OPTN, ASCL1, SCCPDH, UHRF1, TIMELESS, KRT16, HSPB8, STC1, AMFR, TNFAIP3, TALDO1, TFAP4, MSMB, CHCHD1, CLU, CIART, BEX2, MYBL1, JRK, HNRNPA3, TUBB, PSMB7, TRIM68, CDYL2, PSMB3, HNRNPD, DYRK2, ASF1B, HELLS, HIP1, REEP6, PARD6B, MOCS2, S100P, SMAD3, BRIP1, CSRP2, NOP10, SREBF2, DLX3, RERG, NRF1, CORO1A, PKP1, PLK1, POLD2, PCNA, TCF19, ZNF385B, HIST1H3E, RFX2, AREG, KAT6B, FOXI1, HIST1H3H, E2F1, HSP90AB1, NUAK2, E2F7, SULT2B1, SLFN5, RNF187, CDT1, CCNE2, PKM, FOS, LONP1, CDC45, GSN, HMOX1, SLC25A1, MYB, IMPDH2, SERTAD1, CDC6, EGR3, HSP90AA1, HIST1H1C, ESR1, MCM2, PALLD, SLC9A3R1, MCM3, MCM4, HMGA1, CDK2, MCM5, NRIP1, MCM6, RBBP8, DHRS2, CCND3, HOXC13, FANCD2, RRM2, GADD45G, NRGN, GADD45A, UBE2T, FRK, IRX3, IRX5, UNG, MKNK2, ZNF367, TRIB3, FHL2, EHF, NECAB1, TYMS, ZNF703, MARVELD1, XBP1, BCL2, SAPCD2, MSI1, TCEA1, NFATC2, HSPA8, ZBP1, TRIP13, RBM24, CEBPB, HIST1H2BD, CEBPG, FADS1, RMDN3, SOD1, COTL1, CDC25A, TRIM21, CENPH, ANXA2, FSD1, CYBA, DUSP4, NUPR1, FYN, ID1, SVIL, MT2A, LSM10, PBX1, JAK2, ID4, CENPU, ID3, PRSS23, DUSP9 DLC1, MOCOS, CRABP2, CALB2, ACTG2, EIF4EBP1, ANK3, ORC1, MX2, FTL, SOCS3, AARS, RELB, ACTN1, SPIRE2, OPTN, DDIT4, TRAPPC9, CLIP1, TNFAIP3, PACS1, MYL7, TALDO1, CLU, IL32, EPHB3, CDC37, TK1, PSMB7, TRIM68, NDRG4, CRMP1, PSMB3, IDH2, HNRNPD, IDH1, GALE, FH, RHOBTB3, XPOT, PARD6B, KIF3B, MOCS2, FSCN1, SMAD3, EVL, SREBF2, RERG, CORO1A, DHFR, PLK1, FABP5, HSP90AB1, NRP1, SULT2B1, RHOQ, MCF2L, CDT1, GO:0005829~cytosol 1.00E-05 1.466833 0.001044 115 CKB, MTHFD1, PKM, CCNE2, FOS, ACOT7, GSN, BAG3, HMOX1, IMPDH2, EFR3B, CDC6, ARHGEF2, HSP90AA1, LIMK1, SARS, HMGA1, FLNB, CDK2, RRM2, RRM1, SMS, SRGAP1, HAUS1, TRIB3, HPRT1, TPM1, STARD13, IARS, TYMS, XBP1, INPP5J, BCL2, RASGRP1, CAMK2B, GSTO1, WIPF1, NFATC2, PAPSS2, HSPA8, PHLDA1, ZBP1, ABCA12, GDI2, GYG1, SOD1, CDC25A, TRIM21, ANXA2, CENPH, FYN, GFPT1, MT2A, JAK2, CENPU, DUSP9, HPGD NRP1, IGFBP6, FSTL3, CXCL12, MCF2L, CKB, KRT81, ACTG2, TNFRSF11B, GSN, IL4R, HMOX1, COL12A1, 
SEMA3B, LTB, ACTN1, KRT10, F7, IL24, GAL, TCN1, GO:0005615~extracellular space 2.37E-05 1.789261 0.001413 57 MMP13, VASH1, IL20, CHGA, CD36, MELTF, VEGFA, COL1A2, STC1, TNFAIP2, TNF, MSMB, CLU, CXCL8, KITLG, IL32, CX3CL1, ABCA3, LIF, CD9, MTCL1, THBS1, HSPA8, PLAT, COL18A1, HIST1H2BD, PODXL,

HSPG2, PPFIBP2, CD63, SOD1, GAS6, ANXA2, DKK1, FJX1, AREG DLC1, GDI2, ARHGEF2, NRP1, HACD3, TLN2, LIMK1, HSPG2, FHL2, ACTN1, EVL, GO:0005925~focal adhesion 2.10E-05 2.703521 0.001456 25 ANXA5, PALLD, CSRP2, FLNB, HMGA1, ANXA6, CD9, CYBA, SDC1, CCND3, GSN, SVIL, HSPA8, SLC9A1 GO:0042555~MCM complex 3.51E-05 23.49059 0.001826 5 MCM2, MCM3, MCM4, MCM5, MCM6 H1F0, ABLIM2, TLN2, FSCN1, PDLIM3, FHL2, SLC9A3R1, PALLD, IVNS1ABP, GO:0015629~actin cytoskeleton 6.38E-05 3.297303 0.002953 17 FLNB, CORO1A, GSN, SVIL, PSMB3, WIPF1, NFATC2, SEPT9 HSP90AB1, NRP1, NPNT, IL17RB, MTHFD1, FOS, LONP1, ELOVL5, PTGES, HMOX1, FAM129A, INSR, IMPDH2, FTL, RET, HSP90AA1, PCYOX1L, LIMK1, AARS, HLA- A, ESR1, KRT10, MCM3, SLC9A3R1, CHPT1, MCM4, MCM5, SCCPDH, SDC1, CD36, NRM, CHRM1, VEGFA, NEU1, AMFR, RAPGEFL1, GO:0016020~membrane 2.35E-04 1.499127 0.009755 78 CALCR, TNF, LMF2, KITLG, IL32, SFXN2, EHBP1L1, ABCA3, IARS, SLC29A1, ANXA6, CD9, BCL2, RASGRP1, PLXND1, LFNG, HSPA8, HIP1, GDI2, KIF3B, MSH2, FADS1, SCD, SYT12, EVL, GYG1, ANXA5, ANXA2, SREBF2, SYNE3, RERG, CYBA, P2RX4, CORO1A, LAMP3, SLC7A2, ABCC3, SYTL2, JAK2, RIT1, HIST1H3E, HIST1H3H INPP5J, TLN2, PODXL, FSCN1, CLIP1, GO:0001726~ruffle 2.72E-04 4.698118 0.010263 10 ACTN1, WIPF1, SLC9A3R1, PALLD, ANXA2 PLAT, COL18A1, NES, HSP90AA1, CLU, GO:0031012~extracellular HSPG2, SOD1, MMP13, FLNB, COL5A1, 7.05E-04 2.571267 0.024213 18 matrix ANXA2, PKM, TUBB, THSD4, COL1A2, COL12A1, THBS1, HSPA8 CORO1A, FYN, MYO1B, RHOQ, ACTN1, GO:0005884~actin filament 8.30E-04 5.204069 0.026289 8 WIPF1, PALLD, TPM1 MSH2, PIF1, PCNA, MCM2, HIST1H3E, GO:0000784~nuclear MCM3, MCM4, ORC1, HIST1H3H, MCM5, chromosome, telomeric region 0.001036 3.577798 0.030388 11 MCM6 PLAT, CYBA, CHGA, VEGFA, FSTL3, GO:0030141~secretory granule 0.001791 4.571142 0.048603 8 THBS1, RAB27B, GAL Molecular Function Fold Term p-value Benjamini Count Genes Enrichment RBPMS2, MOCOS, DLC1, SYT1, TUBB2B, PDLIM3, FSTL3, ADORA1, PGR, INSIG1, RAB27B, FTL, GTPBP2, H1F0, CYP1A1, PIM1, OPTN, F7, 
VASH1, KRT19, ATP2C2, UHRF1, CD36, KRT17, MELTF, GGACT, TAGLN, KRT16, SCYL2, RFC2, VEGFA, AMFR, PMP22, OSTF1, TALDO1, SCN1B, CHCHD1, MYBL2, KCNJ3, CDC37, TK1, JRK, LIF, HNRNPA3, CRMP1, FH, HIP1, GO:0005515~protein binding 1.24E-09 1.279496 8.79E-07 275 PLAT, KIF3B, FAM111B, SMAD3, CD63, CSRP2, NOP10, GAS6, SYNE3, PKP1, PCNA, SYTL5, RIT1, SYTL2, AREG, AGR3, KAT6B, PDZK1, KLF4, SLC9A1, NRP1, NUAK2, RNF187, CKB, CDT1, CCNE2, PKM, MTHFD1, LONP1, KISS1R, BAG3, IL4R, MYB, FAM129A, SERTAD1, CDC6, ARHGEF2, HIST1H1C, ACKR3, FLNB, RBBP8, DOK3, CCND3, FANCD2, RRM2, 252

RRM1, GADD45G, GADD45A, SRGAP1, TBC1D9, UNG, MKNK2, FHL2, CXCL8, HPRT1, STARD13, XBP1, BCL2, TCEA1, WIPF1, THBS1, ABCA12, TRIP13, PHLDA1, ZBP1, GDI2, RMDN3, HSPG2, SOD1, TRIM21, DKK1, FYN, SVIL, MT2A, PBX1, DUSP9, CLPB, CRABP2, CBX4, MCM10, KRT81, EIF4EBP1, UNC5B, ELOVL5, ANK3, GATA3, PITPNC1, ORC1, INSR, MX2, ZNF593, RET, SOCS3, RELB, HLA-A, ACTN1, IL24, ASCL1, TIMELESS, HSPB8, COL1A2, CLIP1, TNFAIP3, PRPS2, DSCC1, SEPT9, PACS1, MYL7, HACD3, TFAP4, MSMB, CIART, CLU, BEX2, IL32, CX3CL1, PSMB7, TUBB, NDRG4, CDYL2, PSMB3, HNRNPD, ADRA2A, DYRK2, ASF1B, PLXND1, HELLS, RHOBTB3, REEP6, PARD6B, S100P, PODXL, FSCN1, BRIP1, EVL, SHANK3, SREBF2, P2RX4, CORO1A, NRF1, PLK1, POLD2, RFX2, HIST1H3E, IFI6, HIST1H3H, FABP5, SEL1L, HSP90AB1, E2F1, E2F2, TLN2, E2F7, SULT2B1, LGR4, FAM83D, FOS, CDC45, ACOT7, TRIM3, GSN, HMOX1, DDX60, TUBA1B, IMPDH2, EFR3B, HSP90AA1, LIMK1, RAB3IL1, ESR1, MCM2, MCM3, SLC9A3R1, GAL, PALLD, MCM4, HMGA1, CDK2, MCM5, MCM6, NRIP1, DHRS2, SDC1, NRM, HOXC13, ERN1, CLDN1, PMEPA1, CALCR, TNFRSF21, FRK, TNF, LMF2, HAUS1, KITLG, TRIB3, TPM1, GPRC5A, IARS, ANXA6, CD9, ZNF703, INPP5J, TGM1, INPP5F, CAMK2B, GSTO1, NFATC2, SCNN1B, GPNMB, HSPA8, CEBPB, MSH2, CEBPG, GYG1, COTL1, ANXA5, CDC25A, CENPH, ANXA2, SH3BP5, CYBA, ID1, LSM10, ID4, JAK2, ID3, CENPU HSP90AB1, SEPHS2, NUAK2, CLPB, SLFN5, CKB, MTHFD1, PKM, ACTG2, LONP1, DDX60, ORC1, INSR, CDC6, RET, HSP90AA1, LIMK1, LIG1, SARS, AARS, PIM1, MCM2, MCM3, MCM4, MCM5, CDK2, MCM6, MAST4, ATP2C2, KIF1A, RFC2, GO:0005524~ATP binding 1.25E-05 1.749794 0.004398 64 SCYL2, RRM1, ERN1, PRPS2, UBE2T, FRK, MKNK2, TRIB3, EPHB3, ABCA3, TK1, IARS, SBK1, ENTPD8, CAMK2B, DYRK2, PAPSS2, HSPA8, HELLS, TRIP13, ABCA12, RHOBTB3, KIF3B, MYO1B, MSH2, PIF1, BRIP1, ABCG1, P2RX4, FYN, PLK1, ABCC3, JAK2 GO:0001786~phosphatidylserine SYT1, RASGRP1, SCIN, OSBPL10, SYTL2, 2.68E-05 8.837642 0.006305 8 binding THBS1, GAS6, HSPA8 GO:0001077~transcriptional activator activity, RNA polymerase II core promoter 7.19E-05 3.117515 0.012649 18 CEBPB, TFAP4, 
CEBPG, ESR1, EHF, MYBL1, proximal region sequence- MYBL2, PGR, DLX3, FOS, NRF1, HOXC13, specific binding GATA3, PBX1, MYB, NFATC2, FOXI1, KLF4 RBPMS2, TFAP4, HPRT1, ANXA6, TYMS, ACOT7, XBP1, RASGRP1, BCL2, HMOX1, GO:0042803~protein OXCT1, ADRA2A, INPP5F, IDH1, MTCL1, 1.15E-04 2.015709 0.016102 36 homodimerization activity CAMK2B, GALE, HIP1, HSP90AA1, CEBPB, MSH2, ACTN1, SMAD3, SOD1, ABCG1, P2RX4, ASCL1, NRF1, CORO1A, TIMELESS,

ID1, VEGFA, ERN1, CLIP1, HPGD, PRPS2
GO:0003678~DNA helicase activity | 2.45E-04 | 10.21852 | 0.024464 | 6 | PIF1, MCM2, MCM3, MCM4, MCM5, MCM6
GO:0008134~transcription factor binding | 2.24E-04 | 2.734534 | 0.02613 | 19 | E2F1, E2F2, ARHGEF2, CEBPB, CEBPG, ESR1, PIM1, FHL2, SMAD3, HMGA1, FOS, ID1, BCL2, GATA3, HNRNPD, PBX1, ID3, NFATC2, KAT6B
GO:0005200~structural constituent of cytoskeleton | 3.59E-04 | 4.087409 | 0.031317 | 11 | TUBB, KRT19, TUBB2B, KRT17, TLN2, KRT16, ANK3, TUBE1, TPM1, TUBA1B, HIP1
GO:0005544~calcium-dependent phospholipid binding | 5.04E-04 | 5.637806 | 0.038884 | 8 | ANXA6, SYT1, SYT12, C2CD4C, SYTL5, SYTL2, ANXA5, ANXA2

KEGG Pathway
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
hsa04110:Cell cycle | 7.67E-08 | 4.705537 | 1.77E-05 | 19 | E2F1, CDC6, E2F2, SMAD3, MCM2, MCM3, MCM4, CDC25A, CDK2, MCM5, MCM6, CCNE2, CDC45, CCND3, PLK1, GADD45G, PCNA, ORC1, GADD45A
hsa03030:DNA replication | 1.48E-06 | 8.530506 | 1.71E-04 | 10 | RFC2, LIG1, POLD2, PRIM2, PCNA, MCM2, MCM3, MCM4, MCM5, MCM6

Appendix Table 3.3. Enriched GO terms and KEGG pathways in PC3 in response to oxygen level changes in DMEM.

Cellular Component
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
GO:0070062~extracellular exosome | 9.43E-05 | 1.872896 | 0.020171 | 39 | RAB7A, CAB39L, FAM20A, MRAS, MST1, TSPAN8, CDH1, LSR, C5ORF46, RPS28, SH3D21, COL6A3, SERPINE1, NDRG1, LOXL4, GALE, SCNN1B, FAM129A, HIST1H4H, KCNMA1, COBLL1, S100P, SLC12A2, GMDS, EFEMP2, MTMR11, GLUL, SDC1, RELT, EEF1A1P5, ARMC9, AOX1, COL1A2, PHGDH, DSP, GDF15, IGFBP3, LCP1, ITGA2B
GO:0005615~extracellular space | 3.25E-04 | 2.304996 | 0.034451 | 23 | HMGB2, STC2, TNC, MST1, CST1, TLE2, SPARC, CXADR, PTHLH, ADM, EEF1A1P5, SEMA7A, COL6A3, SERPINE1, VEGFA, COL1A2, SEMA3D, LOXL4, GDF15, IGFBP3, APLN, LCP1, ANGPTL4

Appendix Table 3.4. Enriched GO terms and KEGG pathways in PC3 in response to oxygen level changes in Plasmax.

Biological Process
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
GO:0051301~cell division | 2.37E-13 | 3.86023 | 6.80E-10 | 42 | NEK2, AURKA, LLGL2, FAM83D, VRK1, CDCA8, NCAPH, NCAPG2, NCAPG, CDCA2, BUB1, SKA3, LMLN, CCNA2, KIF14, CDK1, PARD6B, CDC6, KIF11, TPX2, CENPF, KIF18B, NDC80, CENPE, CDC20, BIRC5, PPP1CC, CDC27, MCM5, NCAPD3, TACC1, CDK2, CDC25A, SMC4, NCAPD2, CCNB1, SGO2, KNL1, ZWINT, KIF20B, MIS18BP1, SMC1A
GO:0007062~sister chromatid cohesion | 4.50E-11 | 6.558643 | 6.45E-08 | 21 | CENPO, XPO1, KIF18A, AHCTF1, CENPF, BIRC5, NDC80, CDC20, CENPE, PPP1CC, CENPI, MRE11, REC8, CDCA8, SGO2, KNL1, PLK1, INCENP, ZWINT, BUB1, SMC1A
GO:0007067~mitotic nuclear division | 3.76E-09 | 3.761649 | 3.59E-06 | 29 | NEK2, ANLN, AURKA, CEP55, FAM83D, VRK1, NCAPG2, INCENP, BUB1, CDCA2, SKA3, LMLN, CCNA2, ASPM, CDC6, CDK1, KIF11, TPX2, CENPF, NDC80, BIRC5, CDC20, CDC25A, CDK2, PLK1, KNL1, KIF20B, MIS18BP1, CIT
GO:0000086~G2/M transition of mitotic cell cycle | 8.50E-09 | 4.930951 | 6.09E-06 | 21 | CDK1, NES, HSP90AA1, NEK2, TPX2, SKP2, BIRC5, CHEK1, AURKA, TPD52L1, CDC25A, CDK2, HMMR, CCNB1, PLK4, CDKN1A, CDKN2B, PLK1, CIT, TUBB4A, MELK
GO:0000082~G1/S transition of mitotic cell cycle | 9.48E-08 | 5.36143 | 5.44E-05 | 17 | CDC6, CDK1, POLA1, IQGAP3, SKP2, MCM2, CDKN3, CDC25A, CDK2, MCM5, MCM6, TYMS, CDKN1A, EIF4EBP1, PLK2, RRM2, PCNA
GO:0008283~cell proliferation | 1.43E-07 | 2.900446 | 6.82E-05 | 33 | STIL, POLA1, CBFA2T3, FAM83D, TYMS, CSE1L, BCL2, BUB1, GRPR, USP13, TFDP1, ERCC2, CDK1, BST2, MKI67, DLGAP5, TP53, TPX2, SKP2, CENPF, CDC27, TACC1, CDC25A, DDIT4, MRE11, UHRF1, TACSTD2, PLK1, PCNA, TCF19, AREG, PDZK1, MELK
GO:0071456~cellular response to hypoxia | 2.52E-07 | 5.36143 | 1.03E-04 | 16 | E2F1, STC2, MST1, TP53, BNIP3, CCNB1, FMN2, EIF4EBP1, ERO1A, BBC3, GATA6, HMOX1, BCL2, VEGFA, BMP7, CCNA2
GO:0042493~response to drug | 3.02E-07 | 3.068713 | 1.08E-04 | 29 | XPO1, LDHA, SORD, ASS1, CDH1, CDH3, MDK, FOS, TYMS, GATA6, BCL2, GATA3, JUND, SEMA3C, THBS1, TXNIP, CDK1, HSP90AA1, ANXA1, CENPF, AK4, ACACB, GAL, CCNB1, CYBA, CDKN1A, ABAT, LRP8, IGFBP2
GO:0060337~type I interferon signaling pathway | 5.30E-07 | 6.534243 | 1.69E-04 | 13 | EGR1, BST2, IFITM3, IFI35, HLA-F, OASL, IFI27, ISG15, IRF7, XAF1, MX1, MX2, IFI6
GO:0043066~negative regulation of apoptotic process | 7.97E-07 | 2.545207 | 2.28E-04 | 36 | STIL, IL6ST, FIGNL1, CBX4, BNIP3, AURKA, ASNS, EPCAM, GATA6, XBP1, BCL2, TGM2, RARA, THBS1, DHCR24, KIF14, CDK1, TBX3, ANXA1, TP53, BIRC5, MRE11, PPIF, ASCL1, DHRS2, AMIGO2, FMN2, CDKN1A, KRT18, PLK2, PLK1, UCP2, VEGFA, HSPB1, SIAH2, WNT7A
GO:0006260~DNA replication | 1.55E-06 | 3.943246 | 4.03E-04 | 19 | EXO1, CLSPN, CDC6, CDK1, CCDC88A, DTL, POLA1, CHEK1, MCM2, CDC25A, CDK2, MCM5, MRE11, MCM6, DNA2, RRM2,


RRM1, PCNA, NFIC
GO:0070059~intrinsic apoptotic signaling pathway in response to endoplasmic reticulum stress | 5.70E-06 | 8.77325 | 0.00136 | 9 | ATF4, CEBPB, ERO1A, XBP1, CHAC1, BBC3, BCL2, ERN1, TRIB3
GO:0000070~mitotic sister chromatid segregation | 7.86E-06 | 10.29395 | 0.001731 | 8 | PLK1, NEK2, ZWINT, KIF18B, NDC80, ESPL1, SMC1A, SMC4
GO:0045766~positive regulation of angiogenesis | 1.30E-05 | 4.195902 | 0.002648 | 15 | SASH1, CXCL8, XBP1, GATA6, ETS1, HMOX1, SERPINE1, VEGFA, RHOB, IL1B, HSPB1, ADM2, THBS1, RUNX1, DDAH1
GO:0006977~DNA damage response, signal transduction by p53 class mediator resulting in cell cycle arrest | 1.91E-05 | 5.707329 | 0.00342 | 11 | CCNB1, E2F1, CDK1, CDKN1A, PLK2, PCNA, TP53, AURKA, GTSE1, CDK2, TFDP1
GO:1990440~positive regulation of transcription from RNA polymerase II promoter in response to endoplasmic reticulum stress | 1.86E-05 | 16.08429 | 0.003556 | 6 | ATF4, ATF3, CEBPB, XBP1, TP53, CREB3L1
GO:0043627~response to estrogen | 2.93E-05 | 5.443914 | 0.00441 | 11 | LDHA, CAV1, HSP90AA1, GATA6, HMOX1, GATA3, ESR1, IGFBP2, CTNNA1, GAL, WNT7A
GO:0007052~mitotic spindle organization | 2.91E-05 | 8.578289 | 0.004621 | 8 | CCNB1, STIL, KIF11, PLK2, TTK, NDC80, AURKA, SMC1A
GO:0034097~response to cytokine | 2.79E-05 | 6.186266 | 0.004686 | 10 | TYMS, FOS, IFI27, TNFRSF11A, IL6ST, BCL2, JUND, MAPKAPK3, RARA, SPARC
GO:0007059~chromosome segregation | 4.38E-05 | 5.203741 | 0.006255 | 11 | KIF11, HJURP, NEK2, INCENP, CDCA2, CENPF, SKA3, CENPE, NDC80, SRPK1, ERCC2
GO:0000122~negative regulation of transcription from RNA polymerase II promoter | 6.67E-05 | 1.921179 | 0.009064 | 43 | E2F1, MEF2C, XPO1, CAV1, JDP2, EFNA1, FST, CBX4, TRIB3, CBX2, CXXC5, TSC22D3, GATA6, XBP1, NRARP, GATA3, JUND, PRMT6, RARA, MYB, TXNIP, EGR1, TBX3, HIST1H1C, TBX2, RBL1, ASXL1, TP53, ESR1, ZHX2, NRIP1, ASCL2, ASCL1, UHRF1, IFI27, ATF3, PLK1, IRF7, VEGFA, LRP8, ZFPM1, PARP1, NFIC
GO:0032355~response to estradiol | 1.16E-04 | 4.242011 | 0.014997 | 12 | TXNIP, IFI27, ASS1, ETS1, PCNA, ANXA1, ESR1, RARA, AREG, IGFBP2, BMP7, WNT7A
GO:0031145~anaphase-promoting complex-dependent catabolic process | 1.60E-04 | 4.47917 | 0.019743 | 11 | CCNB1, CDK1, PSMD13, PSME1, PLK1, PSMD2, SKP2, AURKA, CDC20, PSMD6, CDC27
GO:0045071~negative regulation of viral genome replication | 2.04E-04 | 6.433716 | 0.024032 | 8 | OASL, ISG15, BST2, C19ORF66, IFITM3, SLPI, MX1, SRPK1
GO:0000278~mitotic cell cycle | 2.39E-04 | 6.276797 | 0.027045 | 8 | XRCC2, RRM1, CENPF, KIF18B, CENPE, MYB, PPP2R2C, TFDP1
GO:0016925~protein sumoylation | 2.79E-04 | 3.574287 | 0.030329 | 13 | NDC1, IFIH1, CDCA8, INCENP, NUP50, TP53, PCNA, CBX4, BIRC5, CBX2, SMC1A, NUP155, PARP1
GO:0009636~response to toxic substance | 2.94E-04 | 4.162993 | 0.030778 | 11 | CDK1, DHRS2, TYMS, FOS, CDKN1A, SDC1, NUPR1, BCL2, CDH1, ASNS, NEFL
GO:0001666~response to hypoxia | 3.10E-04 | 2.992426 | 0.031194 | 16 | EGR1, CAV1, LDHA, BNIP3, CBFA2T3, DDIT4, ASCL2, CYBA, UCP2, ETS1, HMOX1, VEGFA, ABAT, THBS1, MB, ERCC2
GO:0051436~negative regulation of ubiquitin-protein ligase activity involved in mitotic cell cycle | 3.35E-04 | 4.530786 | 0.032548 | 10 | CCNB1, CDK1, PSMD13, PSME1, PSMD2, FBXO43, CDC20, PSMD6, CDC27, CDK2
GO:0070301~cellular response to hydrogen peroxide | 3.53E-04 | 5.07925 | 0.033162 | 9 | PPIF, CDK1, ETS1, PCNA, ANXA1, AXL, RHOB, BNIP3, ECT2
GO:0043434~response to peptide hormone | 3.76E-04 | 5.848833 | 0.034186 | 8 | STC2, JUND, ANXA1, TFF1, AREG, SPARC, BMP7, NEFL
GO:0007093~mitotic cell cycle checkpoint | 3.95E-04 | 7.036877 | 0.034765 | 7 | CDKN2B, PLK2, ZWINT, BUB1, TTK, CHEK1, SMC1A
GO:0009612~response to mechanical stimulus | 4.49E-04 | 4.907072 | 0.038226 | 9 | TXNIP, CCNB1, INHBB, ETS1, TNC, JUND, ASNS, IGFBP2, THBS1
GO:0002931~response to ischemia | 4.70E-04 | 6.823639 | 0.038874 | 7 | MEF2C, EGR1, PPIF, EIF4EBP1, CAV1, PANX1, BCL2
GO:0010595~positive regulation of endothelial cell migration | 4.98E-04 | 5.594536 | 0.039974 | 8 | SASH1, ETS1, GATA3, VEGFA, RHOB, SPARC, THBS1, WNT7A
GO:0000083~regulation of transcription involved in G1/S transition of mitotic cell cycle | 5.97E-04 | 8.391804 | 0.046428 | 6 | CDK1, CDC6, TYMS, RRM2, PCNA, POLA1
GO:0051439~regulation of ubiquitin-protein ligase activity involved in mitotic cell cycle | 5.97E-04 | 8.391804 | 0.046428 | 6 | CCNB1, CDK1, PLK1, CDC20, CDC27, CDK2
GO:0009615~response to virus | 6.16E-04 | 3.5093 | 0.046637 | 12 | MEF2C, IFIH1, OASL, BST2, DDX60, IRF7, IFITM3, GATA3, HSPB1, IFI44, MX1, MX2
GO:0000910~cytokinesis | 6.50E-04 | 5.36143 | 0.047823 | 8 | PLK1, INCENP, RHOB, AHCTF1, BIRC5, ESPL1, CIT, ECT2

Cellular Component
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
GO:0005654~nucleoplasm | 1.71E-14 | 1.835772 | 7.45E-12 | 152 | MEF2C, XPO1, DBF4B, XRCC2, CBX4, CBX2, AURKA, HIST1H2BO, EIF4EBP1, CDCA8, HIST1H2BK, ISG15, CDKN2B, GATA6, GATA3, INCENP, CDCA2, RARA, CCNA2, DDX39A, OPA1, DTL, EFTUD2, RBL1, SKP2, ZHX2, NEIL1, DEPDC1, NCAPD3, MRE11, NCAPD2, AQR, SGO2, HSPB8, RAD18, KIF4A, SSH1, LMNB1, LITAF, MAPKAPK3, AHCTF1, ANLN, CHEK1, CXXC5, CMPK2, LLGL2, HNRNPM, VRK1, CSE1L, NCAPG2, NUP50, WDHD1, RUNX1, ASF1B, TFDP1, HIST1H4H, GINS1, UBE2L6, NR4A1, ATAD2, CDC20, S100A14, CDC27, ABCG1, DNA2, ATF4, CDKN1A, ATF3, PKP1, ETS1, PLK1, PCNA, POP1, HOXB9,
SMC1A, PARP1, HIST1H3H, E2F1, CLSPN, KYNU, FOXA2, BNIP3, IFI44L, CBFA2T3, GTSE1, FOS, SMAP2, FANCI, PRMT6, PSMD2, PSMD6, KPNB1, EGR1, CDK1, CDC6, SGK1, HSP90AA1, GEN1, NOL9, TP53, TPX2, ESR1, MCM2, CDK2, MCM5, NRIP1, MCM6, EPB41L2, SMTN, PSME1, RIF1, HIST2H2BE, RRM2, IPO5, RRM1, ESRP1, SIAH2, ZFPM1, KPNA2, PMEPA1, KPNA1, POLA1, TRIB3, TYMS, TSPYL2, XBP1, HJURP, KRT8, BUB1, TRIP12, ERCC2, EXO1, CENPO, ECI2, CEBPB, HIST1H2BD, ADARB1, TONSL, HIST1H2BG, ANXA1, CENPF, BIRC5,


RACGAP1, CDC25A, CENPI, SMC4, CCNB1, PSMD13, KNL1, IRF7, KIF20B, MIS18BP1 MEF2C, LDHA, DBF4B, XRCC2, PGAM1, AQP3, HIST1H2BO, G2E3, PIP5KL1, HIST1H2BK, AIF1L, CDCA2, RARA, CCNA2, PLS3, OPA1, SKP2, ZHX2, ESPL1, IFI44, NCAPD2, KRT18, KRT17, SIPA1L1, ZWINT, VEGFA, C12ORF57, KIF4A, PLCXD3, NEK2, KRT20, DENND2D, VRK1, RAC2, UBASH3B, NUP50, MYO15B, SKA3, ARNTL2, WDHD1, FH, MKI67, CDC20, CELSR2, TPD52L1, S100A14, CDC27, OASL, ETS1, PCNA, PLCXD1, GAS2L3, CLSPN, FOXA2, HIP1R, TTLL4, BNIP3, RNF182, YBX2, SMAP2, FANCI, CDC6, HERC6, SLC3A2, TP53, FLNC, SRPK1, SLIT2, EPB41L2, EPB41L3, CLIC3, HIST2H2BE, RRM2, CKAP2L, RRM1, SIAH2, MYO5A, FUT8, POLA1, EEA1, NCAPG, XBP1, BCL2, C19ORF66, BUB1, LMLN, TRIP12, HIST1H2BD, HIST1H2BG, DLGAP5, BIRC5, CDKN3, MID1, CCNB1, NUPR1, KNL1, AOX1, KIF20B, PSAT1, PAICS, XPO1, STIL, TTK, SLC7A5, EIF4EBP1, CDKN2B, ZNF185, EIF1, MX1, MX2, ASPM, DDX39A, GO:0005737~cytoplasm 2.42E-11 1.461616 5.28E-09 227 BST2, DTL, EFTUD2, NEIL1, PKIB, PNPLA2, PPP1CC, ANKRD13A, TACC1, DDIT4, MRE11, ASCL2, HSPB8, HSPB1, RAD18, CRACR2B, SSH1, LITAF, ASS1, MAPKAPK3, CXXC5, BICC1, MDK, LLGL2, CSE1L, TFF3, RUNX1, ARHGDIB, GINS1, CKAP2, PARD6B, S100P, SACS, NR4A1, EVL, PLEKHA4, CORO1C, ATF4, MYO10, PLK2, PLK1, PPIC, TROAP, SMC1A, MYLK, KYNU, FIGNL1, ANO1, SULT2B1, IFI44L, FAM83D, DDX60, KPNB1, EGR1, CDK1, SGK1, HSP90AA1, KIF11, SARS, ESR1, MAN1A1, MCM2, SLC9A3R1, ECT2, CDK2, DHRS2, SDC1, SMTN, PSME1, RIF1, FRMD4A, IPO5, ERN1, ZFPM1, KPNA2, KPNA1, SHCBP1, IRX3, CNN3, CDH1, CDH3, DNAH5, NDC1, TYMS, TSC22D3, TSPYL2, ZNF703, HJURP, KRT8, PAFAH1B3, PLCD4, TNPO1, NEFL, ERCC2, TXNIP, EXO1, NES, CEBPB, ADARB1, NF2, TONSL, KIF18A, ANXA1, KIF18B, CENPF, CENPE, DPYSL3, SPARC, RACGAP1, CDC25A, CENPI, SMC4, IRF7, VSTM2L, GDF15 MEF2C, STIL, XPO1, LDHA, PGAM1, IQGAP3, VPS53, AURKA, SLC7A5, CMBL, EIF4EBP1, CDCA8, ISG15, CDKN2B, INCENP, RBCK1, IL1B, DEPDC1B, MX1, ITPK1, MX2, DDAH1, CHAC1, SKP2, ESPL1, PNPLA2, PPP1CC, CTNNA1, DDIT4, NCAPD2, MRE11, SGO2, TACSTD2, ZWINT, GO:0005829~cytosol 
4.00E-11 1.633002 5.82E-09 161 HSPB1, PRPS1, RBP4, IFIH1, KIF4A, ASS1, NEK2, DIAPH3, MAPKAPK3, AHCTF1, ASAP1, CHEK1, MYO9A, IFI35, TK1, CMPK2, VRK1, CSE1L, RAC2, PPP2R2C, TUBB4A, FH, ARHGDIB, PARD6B, UBE2L6, CDC20, EVL, NDC80, ACACB, CDC27, RERG, OASL, CDKN1A, MYO10, PLK4, GBE1, PLK2, BBC3, PLK1, MYH14, SMC1A,


MYLK, KYNU, FERMT1, SULT2B1, ANPEP, GTSE1, FOS, USP18, PBXIP1, HMOX1, PRMT6, PSMD2, RHOB, ARHGAP11A, PSMD6, KPNB1, SHC2, EFR3B, DHCR24, KIF14, CDK1, CDC6, SGK1, CCDC88A, HSP90AA1, KIF11, SARS, HERC6, TP53, TPX2, FLNC, ARHGAP23, ECT2, CDK2, FMN2, ARHGAP31, PSME1, RRM2, RRM1, SIAH2, KPNA2, KPNA1, MYO5A, SORD, AP1M2, HK1, TRIB3, ASNS, EEA1, TPM3, HMMR, TYMS, TSC22D3, NCAPH, NCAPG, XBP1, BCL2, PAFAH1B3, TGM2, BUB1, XAF1, TBC1D1, TNPO1, NEFL, TRIP12, TXNIP, CENPO, KIF18A, CENPF, BIRC5, CENPE, DPYSL3, RACGAP1, MID1, CDC25A, CENPI, SMC4, CCNB1, PSMD13, KNL1, IRF7, AOX1, PHGDH, PSAT1, CIT, PAICS XPO1, LDHA, EFNA1, IL6ST, PGAM1, VPS53, TTK, SLC7A5, PIP5KL1, SEMA7A, CREB3L1, DDX39A, OSBP2, OPA1, BST2, EFTUD2, PNPLA2, FIBCD1, TACC1, NCAPD3, HLA-F, NCAPD2, AQR, TACSTD2, VEGFA, CAV1, KIF4A, LMNB1, PANX1, ASAP1, HNRNPM, LGALS3BP, ERO1A, RAC2, CSE1L, NCAPG2, LFNG, HIST1H4H, ARHGDIB, MKI67, SYT12, EVL, NDC80, PPIF, RERG, PLEKHA4, OASL, SLC7A2, MYH14, PARP1, HIST1H3H, LDLR, HELZ, GTSE1, SLC1A4, FOS, FANCI, P4HA1, GO:0016020~membrane 6.12E-10 1.772882 6.67E-08 116 HMOX1, PSMD2, DOCK10, PTDSS1, KPNB1, DHCR24, KIF14, CDK1, KIF11, HSP90AA1, CCDC88A, NOL9, SLC3A2, ESR1, SPINT1, MAN1A1, SLC9A3R1, MCM5, SLIT2, DDR1, SDC1, IPO5, KPNA2, MELK, MYO5A, SORD, FUT8, TNC, CDH1, EEA1, CEP55, CDH3, PLPP2, HMMR, ANXA6, NDC1, NCAPH, NCAPG, BCL2, TAP1, BUB1, PAFAH1B3, LMLN, EHD1, B4GALNT1, ECI2, NF2, CENPE, NUP155, SLC16A3, CCNB1, CYBA, PSMD13, TFRC, KREMEN2, LRP8, CIT, PAICS KIF14, CDK1, KIF4A, SSH1, NEK2, CENPF, BIRC5, AURKA, CENPE, CEP55, RACGAP1, GO:0030496~midbody 8.65E-09 5.212964 7.54E-07 20 PPP1CC, ECT2, TACC1, CDCA8, PLK1, INCENP, KIF20B, ASPM, SHCBP1 XPO1, NEK2, KIF18A, CENPF, TTK, NDC80, GO:0000776~kinetochore 1.02E-07 6.226596 7.44E-06 15 CENPE, CENPI, PLK1, INCENP, ZWINT, FBXO28, BUB1, SKA3, SMC1A LDHA, EFNA1, IL6ST, FAM20A, FAM20C, PGAM1, KIAA1324, SLC7A5, CMBL, SERPINE1, AIF1L, IL1B, RAB25, DDAH1, ADAM9, MB, SOGA1, BST2, H2AFJ, PCOLCE2, KRT18, KRT17, TACSTD2, ST14, HSPB1, ABAT, MGAT5, 
RBP4, ASS1, GO:0070062~extracellular 1.36E-07 1.554988 8.48E-06 130 IFITM3, MST1, AHCTF1, HNRNPM, exosome LGALS3BP, CSE1L, RAC2, TFF2, TFF3, ENTPD2, TUBB4A, HIST1H4H, FH, ARHGDIB, PARD6B, S100P, ASXL1, ATAD2, AK4, S100A14, GBE1, PKP1, PPIC, PCNA, MYH14, PCSK1N, PDZK1, HIST1H3H, MYLK, TM7SF3, GM2A, FIGNL1, GREB1,


ANO1, SULT2B1, ANPEP, EPCAM, SLC1A4, PSMD2, SEMA3C, RHOB, HIST3H2A, DOCK10, PSMD6, KPNB1, CDK1, HSP90AA1, GEN1, SARS, SLC3A2, SPINT1, MAN1A1, SLC9A3R1, ARHGAP23, SLIT2, EPB41L2, DHRS2, DDR1, SDC1, CLIC3, PSME1, HIST2H2BE, RRM1, SLPI, SLC27A2, MYO5A, HIST1H2AC, SORD, FUT8, CDH1, EEA1, TPM3, PRSS8, ANXA6, ZG16B, KRT8, COL6A3, TGM2, PAFAH1B3, EHD1, THBS1, TNPO1, HIST1H2BD, HIST1H2BG, TMC4, ANXA1, AXL, PCK2, RACGAP1, PSMD13, TFRC, KNL1, AOX1, PHGDH, GFRA1, METRNL, GDF15, IGFBP2, PSAT1, WNT7A, PAICS CENPO, NEK2, AHCTF1, NDC80, BIRC5, GO:0000777~condensed 2.58E-07 5.797175 1.41E-05 15 CENPE, PPP1CC, SGO2, HJURP, KNL1, chromosome kinetochore INCENP, ZWINT, FBXO28, BUB1, SMC1A KIF11, TPX2, CENPF, TTK, BIRC5, AURKA, GO:0005819~spindle 5.88E-07 4.723979 2.85E-05 17 CDC20, MID1, CDC27, FMN2, VRK1, RIF1, PLK1, INCENP, HSPB1, SHCBP1, ERCC2 MEF2C, LDHA, DBF4B, AURKA, H1FX, AQP3, HIST1H2BO, CDCA8, HIST1H2BK, CREB3L1, RARA, CCNA2, SKP2, ZHX2, ESPL1, H2AFJ, DEPDC1, NCAPD2, UHRF1, REC8, HES4, TACSTD2, ZWINT, NEK2, STK17B, AHCTF1, CHEK1, MYBL1, IFI35, VRK1, UBASH3B, NCAPG2, ARNTL2, TUBB4A, MKI67, CDC20, NDC80, ACACB, CDC27, RERG, ZFHX4, PKP1, ETS1, PCNA, TCF19, AREG, PARP1, FOXA2, BNIP3, HELZ, YBX2, USP18, PBXIP1, MYB, DHCR24, CDC6, HIST1H1C, HERC6, TPX2, TP53, SLC3A2, SRPK1, EPB41L2, AMIGO2, CLIC3, HIST2H2BE, RRM2, ESRP1, SIAH2, HIST1H2AC, ZNF467, POLA1, NCAPH, NCAPG, XBP1, BCL2, C19ORF66, GBX2, TBC1D1, TRIP12, TRIP13, HIST1H2BD, TBX3, TBX2, DLGAP5, HIST1H2BG, BIRC5, CDKN3, CCNB1, PSMD13, DUSP2, NUPR1, KNL1, KIF20B, TJP3, XPO1, JDP2, CBX4, GO:0005634~nucleus 1.10E-06 1.322591 4.81E-05 213 CBX2, CDKN2B, GATA6, GATA3, EIF1, MX1, MX2, TFPI2, ASPM, DDX39A, DTL, RBL1, NEIL1, PKIB, PPP1CC, TACC1, MRE11, ASCL2, ASCL1, HSPB8, HSPB1, RAD18, IFIH1, ASS1, DIAPH3, MAPKAPK3, CMPK2, MEIS3, CSE1L, JUND, FBXO43, SSX2IP, RUNX1, ASF1B, HIST1H4H, TFDP1, GINS1, PARD6B, S100P, SACS, ATAD2, NR4A1, DNA2, ATF4, CDKN1A, ATF3, PLK1, HOXB9, SMC1A, HIST1H3H, E2F1, FIGNL1, SULT2B1, CBFA2T3, FOS, HMOX1, PRMT6, 
PSMD2, RHOB, EGR1, KIF14, CDK1, SGK1, HSP90AA1, ZFX, ESR1, MCM2, SLC9A3R1, ECT2, MCM5, CDK2, NRIP1, MCM6, DHRS2, ADRB2, RIF1, IPO5, NRGN, ZFPM1, KPNA2, KPNA1, MELK, IRX3, IRX5, TRIB3, TYMS, TSC22D3, TSPYL2, ZNF703, HJURP, KRT8, XAF1, TNPO1, SYNPO, ERCC2, TXNIP, EXO1, ADARB1, CEBPB, FOXL1, NF2, SAMD11, KIF18A, ANXA1, KIF18B, CENPF, CENPE, RACGAP1, CDC25A, CENPI, SMC4,


CYBA, IRF7, GDF15, NFIC
GO:0016363~nuclear matrix | 5.74E-06 | 4.852893 | 2.28E-04 | 14 | HNRNPM, KIF4A, UHRF1, ZNF703, CEBPB, LMNB1, KRT8, TP53, POLA1, CENPF, AHCTF1, SPARC, MYB, SRPK1
GO:0051233~spindle midzone | 1.30E-05 | 12.38765 | 4.72E-04 | 7 | KIF14, CDC6, CDCA8, PLK1, KIF20B, AURKA, RACGAP1
GO:0000775~chromosome, centromeric region | 4.23E-05 | 5.89888 | 0.001419 | 10 | CDCA8, MKI67, SGO2, HJURP, INCENP, CENPF, CENPE, BIRC5, NDC80, SMC1A
GO:0032154~cleavage furrow | 6.49E-05 | 6.438565 | 0.00202 | 9 | PLK4, NF2, SSH1, RHOB, CEP55, RACGAP1, PPP1CC, ECT2, MYLK
GO:0005657~replication fork | 1.04E-04 | 11.86716 | 0.002834 | 6 | UHRF1, XRCC2, PCNA, TP53, RAD18, CHEK1
GO:0000942~condensed nuclear chromosome outer kinetochore | 1.02E-04 | 33.62362 | 0.002954 | 4 | CCNB1, PLK1, BUB1, NDC80
GO:0005615~extracellular space | 1.22E-04 | 1.62252 | 0.003129 | 65 | IL6ST, FAM20C, ANPEP, VGF, HIST1H2BK, SEMA7A, HMOX1, SERPINE1, IL1B, SEMA3C, APLN, TSKU, ADAM9, SOGA1, STC2, SPINT1, GAL, TCN1, SLIT2, IL20, INHBB, DDR1, CHGA, PLEKHH3, TACSTD2, HIST2H2BE, ST14, VEGFA, HSPB1, SLPI, RBP4, SORD, TNC, CD109, MST1, CXCL8, CHEK1, ZG16B, PRSS8, LGALS3BP, C1QTNF6, COL6A3, TFF2, TFF3, TFF1, THBS1, FGFBP1, OLFM1, SRGN, CPA4, HIST1H2BD, HIST1H2BG, AXL, ANXA1, DPYSL3, SPARC, TFRC, POP1, METRNL, AREG, IGFBP2, PCSK1N, BMP7, GDF15, WNT7A
GO:0000796~condensin complex | 2.49E-04 | 26.89889 | 0.00601 | 4 | NCAPH, NCAPG, SMC4, NCAPD2
GO:0005643~nuclear pore | 2.70E-04 | 4.669947 | 0.006181 | 10 | NDC1, IPO5, NUP50, AHCTF1, NUP155, NPIPA1, KPNB1, KPNA2, MX2, KPNA1
GO:0005876~spindle microtubule | 2.89E-04 | 6.113385 | 0.006273 | 8 | CDK1, KIF4A, KIF11, PLK1, SKA3, BIRC5, AURKA, CDC27
GO:0005813~centrosome | 4.45E-04 | 2.131074 | 0.009198 | 27 | STIL, XRCC2, NEK2, AURKA, CHEK1, CEP55, SLC1A4, TSPYL2, NCAPG, CKAP2, CDK1, DTL, GEN1, CENPF, ESPL1, CDC20, SLC9A3R1, CDC27, CDK2, CCNB1, PLK4, PLK2, PLK1, CKAP2L, PCNA, KIF20B, RAD18
GO:0000786~nucleosome | 4.73E-04 | 3.934678 | 0.009336 | 11 | HIST1H2BO, HIST1H2AC, HIST1H2BD, HIST1H2BK, HIST1H1C, HIST2H2BE, HIST1H2BG, H1FX, H2AFJ, HIST1H3H, HIST1H4H
GO:0005925~focal adhesion | 6.67E-04 | 2.149848 | 0.012576 | 25 | CAV1, CNN3, TNC, FERMT1, CDH1, ANXA6, RAC2, ZNF185, TGM2, AIF1L, RHOB, LMLN, ADAM9, ANXA1, EVL, FLNC, CTNNA1, PPP1CC, NEXN, CORO1C, EPB41L2, CYBA, ARHGAP31, SDC1, HSPB1
GO:0000779~condensed chromosome, centromeric region | 8.33E-04 | 19.21349 | 0.014426 | 4 | CEBPB, CENPE, NCAPD3, NCAPD2
GO:0015629~actin cytoskeleton | 8.27E-04 | 2.622025 | 0.014924 | 17 | GAS2L3, CNN3, STK17B, CDH1, ANLN, SLC9A3R1, CTNNA1, NDC1, SLC16A3, CORO1C, FMN2, SMTN, NCAPG, ZNF185, AIF1L, RARA, SYNPO
GO:0031965~nuclear membrane | 0.001395 | 2.496076 | 0.023137 | 17 | XPO1, LMNB1, DTL, NR4A1, AHCTF1, TRIB3, NPIPA1, NUP155, NDC1, SLC16A3, BCL2, NUP50, IPO5, PLCD4, MX1, TNPO1, KPNB1


GO:0030141~secretory granule | 0.001546 | 4.089359 | 0.024676 | 9 | CYBA, CHGA, VEGFA, TFF3, IL1B, SCG5, PCSK1N, THBS1, GAL
GO:0000788~nuclear nucleosome | 0.001802 | 5.349212 | 0.027694 | 7 | HIST1H2BO, HIST1H2BD, HIST1H2BK, HIST2H2BE, HIST1H2BG, HIST3H2A, HIST1H3H
GO:0042555~MCM complex | 0.001912 | 14.94383 | 0.028362 | 4 | TONSL, MCM2, MCM5, MCM6
GO:0043234~protein complex | 0.00291 | 1.958657 | 0.040165 | 24 | MEF2C, SASH1, RBP4, PARD6B, CAV1, HSP90AA1, PANX1, NEK2, ANXA1, TP53, CDC20, PPP1CC, CDCA8, SDC1, EIF4EBP1, CDKN1A, ZNF703, INCENP, AIF1L, SSX2IP, ASF1B, PARP1, HIST1H3H, HIST1H4H
GO:0005635~nuclear envelope | 0.002846 | 2.749101 | 0.040578 | 13 | NDC1, DHRS2, XPO1, RAC2, CSE1L, LMNB1, RRM1, POLA1, CENPF, BNIP3, NUP155, PARP1, KPNB1
GO:0034399~nuclear periphery | 0.003075 | 8.005623 | 0.041093 | 5 | ATF4, IPO5, AHCTF1, KPNB1, TNPO1

Molecular Function
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
GO:0005515~protein binding | 4.91E-09 | 1.240084 | 3.87E-06 | 333 | MEF2C, PDP1, LDHA, XRCC2, DBF4B, EFNA1, IL6ST, FAM20A, FST, FAM20C, PGAM1, VPS53, AURKA, CDCA8, G2E3, ISG15, SEMA7A, INCENP, SERPINE1, CREB3L1, RAB25, RARA, RAB26, CCNA2, ADAM9, OSBP2, OPA1, ZHX2, SKP2, ESPL1, DEPDC1, CTNNA1, NCAPD3, NCAPD2, UHRF1, AQR, KRT18, KRT17, SGO2, TACSTD2, ZWINT, VEGFA, RBP4, KIF4A, LMNB1, NEK2, MST1, ASAP1, CHEK1, KRT20, DENND2D, MYO9A, IFI35, TK1, VRK1, UBASH3B, NCAPG2, NUP50, SKA3, WDHD1, PPP2R2C, TUBB4A, OLFM1, FH, MKI67, FAM111B, ASXL1, SPTSSB, UBE2L6, CDC20, NDC80, TPD52L1, ACACB, S100A14, CDC27, PKP1, ETS1, POP1, PCNA, AREG, PARP1, BMP7, PDZK1, GAS2L3, CLSPN, LDLR, HIP1R, BNIP3, RNF182, GTSE1, USP18, SMAP2, TNFRSF11A, PBXIP1, FANCI, P4HA1, MYB, USP13, CDC6, HIST1H1C, TP53, TPX2, SLC3A2, ACKR3, FLNC, SRPK1, SLIT2, INHBB, EPB41L3, AMIGO2, CLIC3, RRM2, RRM1, ESRP1, SIAH2, USP5, POLA1, CXCL8, HK1, EEA1, HMMR, NCAPH, XBP1, NCAPG, BCL2, C19ORF66, BUB1, TBC1D1, THBS1, TRIP12, TRIP13, TBX3, TBX2, DLGAP5, HIST1H2BG, NTNG1, AXL, BIRC5, MID1, CDKN3, PCK2, SLC16A3, CCNB1, DUSP2, TFRC, KNL1, KIF20B, LRP8, TJP3, CIT, PAICS, WNT7A, XPO1,
STIL, JDP2, CBX4, TTK, CBX2, PUS7L, NRCAM, EIF4EBP1, CDKN2B, GATA6, GATA3, FBXO28, RBCK1, MX1, MX2, DDX39A, BST2, CHAC1, DTL, EFTUD2, RBL1, FIBCD1, PPP1CC, TACC1, MRE11, ASCL1, HSPB8, HSPB1, RAD18, PTGFRN, CRACR2B, PRPS1, CAV1, IFIH1, ASS1, LITAF, SSH1, IFITM3, MAPKAPK3, NINJ1, CXXC5, LLGL2, HNRNPM, CSE1L, ERO1A, JUND, TFF2, TFF3, SSX2IP, TFF1, ASF1B, RUNX1, SRGN, ARHGDIB,


HIST1H4H, TFDP1, PARD6B, S100P, HENMT1, NR4A1, EVL, SELENOM, CORO1C, PPIF, DNA2, ATF4, MYO10, CDKN1A, PLK4, ATF3, PLK2, BBC3, PLK1, UCP2, PPIC, TROAP, HOXB9, SMC1A, IFI6, HIST1H3H, MYLK, E2F1, FIGNL1, ANO1, SULT2B1, CBFA2T3, FAM83D, EPCAM, FOS, TMEM171, HMOX1, DDX60, PRMT6, PSMD2, RHOB, PSMD6, SHC2, KPNB1, EFR3B, EGR1, KIF14, CDK1, SGK1, HSP90AA1, NOL9, ESR1, MCM2, GAL, SLC9A3R1, ECT2, CDK2, MCM5, MCM6, NRIP1, DHRS2, DDR1, SDC1, ADRB2, PSME1, IPO5, ERN1, SLPI, ZFPM1, KPNA2, KPNA1, PMEPA1, MELK, SHCBP1, AP1M2, PPP4R1, TRIB3, CDH1, ASNS, CEP55, PLPP2, TPM3, PRSS8, ANXA6, C1QTNF6, TSPYL2, ZNF703, HJURP, KRT8, TAP1, TGM2, PAFAH1B3, SCG5, FBN2, EHD1, TNPO1, NEFL, FGFBP1, ERCC2, SYNPO, EXO1, TXNIP, CENPO, ADARB1, CEBPB, NF2, TONSL, ANXA1, KIF18A, CENPF, KIF18B, DPYSL3, CENPE, SPARC, NUP155, RACGAP1, CDC25A, CENPI, SMC4, CYBA, KCNN4, IRF7, VSTM2L, MIS18BP1, IGFBP2, GDF15
GO:0046982~protein heterodimerization activity | 1.17E-08 | 2.814204 | 4.62E-06 | 40 | MEF2C, HIST1H2AC, RBP4, CAV1, JDP2, PANX1, POLA1, BNIP3, HIST1H2BO, FOS, HIST1H2BK, XBP1, BCL2, PAFAH1B3, HIST3H2A, RARA, RUNX1, NEFL, HIST1H4H, CEBPB, HIST1H2BD, HIST1H2BG, TP53, AXL, ZHX2, NR4A1, TPD52L1, BIRC5, H2AFJ, CTNNA1, MID1, ABCG1, SMC4, CYBA, ATF4, ATF3, HIST2H2BE, VEGFA, SMC1A, HIST1H3H
GO:0005524~ATP binding | 2.24E-07 | 1.794408 | 5.89E-05 | 82 | XRCC2, FIGNL1, FAM20C, TTLL4, TTK, HELZ, AURKA, PIP5KL1, DDX60, ABCB10, ITPK1, KIF14, DDX39A, CDK1, CDC6, SGK1, KIF11, HSP90AA1, NOL9, SARS, TP53, TPX2, MCM2, MCM5, SRPK1, CDK2, MCM6, DDR1, RRM1, ERN1, SLC27A2, MELK, PRPS1, MYO5A, IFIH1, KIF4A, ASS1, NEK2, MAPKAPK3, STK17B, TRIB3, HK1, CHEK1, ASNS, MYO9A, DNAH5, CMPK2, TK1, VRK1, MYO15B, TAP1, BUB1, TGM2, LMTK3, RUNX1, EHD1, ENTPD2, TRIP13, ERCC2, MKI67, AXL, KIF18A, KIF18B, ATAD2, UBE2L6, CENPE, ACACB, AK4, ABCG1, SMC4, DNA2, OASL, PLK4, MYO10, PLK2, PLK1, KIF20B, MYH14, SMC1A, CIT, PAICS, MYLK
GO:0008574~ATP-dependent microtubule motor activity, plus-end-directed | 1.18E-04 | 11.54651 | 0.023037 | 6 | KIF14, KIF4A, KIF11, KIF18A, KIF20B, KIF18B
GO:0008536~Ran GTPase binding | 2.49E-04 | 7.633527 | 0.027658 | 7 | XPO1, CSE1L, IPO5, NUP50, BIRC5, KPNB1, TNPO1
GO:0019901~protein kinase binding | 2.37E-04 | 2.262215 | 0.03061 | 26 | E2F1, CAV1, DBF4B, POLA1, PGAM1, TRIB3, AURKA, FAM83D, VRK1, CDKN2B, XBP1, GATA6, CCNA2, KIF14, SASH1, KIF11, TP53, TPX2, RACGAP1, PPP1CC, CDC25A, CCNB1, PLK1, DOK7, HSPB1,


PARP1
GO:0008017~microtubule binding | 2.25E-04 | 2.831116 | 0.034903 | 18 | KIF14, GAS2L3, KIF4A, CCDC88A, OPA1, KIF11, CNN3, KIF18A, KIF18B, BIRC5, CENPE, RACGAP1, MID1, FAM83D, PLK1, KIF20B, MX1, MX2
GO:0042802~identical protein binding | 4.29E-04 | 1.790814 | 0.036879 | 41 | CAV1, LDHA, SORD, ASS1, LDLR, BNIP3, TK1, C1QTNF6, HJURP, UBASH3B, BCL2, PAFAH1B3, THBS1, NEFL, TRIP13, DDX39A, HSP90AA1, TP53, ZHX2, SKP2, ESR1, NDC80, TPD52L1, BIRC5, MID1, SLIT2, MCM6, UHRF1, PLK4, ATF3, TFRC, ETS1, HSPB8, VEGFA, ERN1, PCNA, RAD18, HSPB1, PARP1, PAICS, PRPS1
GO:0003682~chromatin binding | 4.23E-04 | 2.17543 | 0.040852 | 26 | MEF2C, JDP2, POLA1, CBX4, CBX2, FOS, MEIS3, GATA6, GATA3, PRMT6, CREB3L1, EXO1, CDK1, CEBPB, ESR1, TP53, ATAD2, CENPF, NCAPD3, MCM5, NCAPD2, ASCL1, REC8, NUPR1, PCNA, SMC1A
GO:0000982~transcription factor activity, RNA polymerase II core promoter proximal region sequence-specific binding | 5.53E-04 | 8.534378 | 0.042644 | 6 | FOS, JDP2, ATF3, ETS1, IRF7, ARNTL2

KEGG Pathway
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
hsa04110:Cell cycle | 3.64E-12 | 5.77873 | 8.34E-10 | 25 | E2F1, TTK, CHEK1, CDKN2B, BUB1, CCNA2, TFDP1, CDC6, CDK1, RBL1, TP53, SKP2, CDC20, ESPL1, MCM2, CDC27, CDC25A, CDK2, MCM5, MCM6, CCNB1, CDKN1A, PLK1, PCNA, SMC1A
hsa05203:Viral carcinogenesis | 2.60E-05 | 2.936159 | 0.002968 | 21 | CDK1, HIST1H2BD, IL6ST, HIST1H2BG, RBL1, TP53, SKP2, CHEK1, CDC20, CDK2, HLA-F, HIST1H2BO, CDKN1A, ATF4, HIST1H2BK, CDKN2B, HIST2H2BE, IRF7, CREB3L1, CCNA2, HIST1H4H
hsa04115:p53 signaling pathway | 9.22E-05 | 4.705784 | 0.007011 | 11 | CCNB1, CDK1, CDKN1A, BBC3, RRM2, SERPINE1, TP53, CHEK1, THBS1, GTSE1, CDK2
hsa04114:Oocyte meiosis | 4.38E-04 | 3.356869 | 0.024786 | 13 | CCNB1, CDK1, REC8, PLK1, BUB1, FBXO43, AURKA, CDC20, ESPL1, SMC1A, PPP1CC, CDC27, CDK2
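
As an aid to reading the "Fold Enrichment" and "Benjamini" columns in these appendix tables, the following minimal Python sketch shows the standard textbook definitions of fold enrichment and of the Benjamini-Hochberg adjustment. It is an illustration under stated assumptions, not the procedure used to generate the tables: the function names and all input numbers are hypothetical, and the enrichment software's own implementation may differ in detail (for example, in how the background gene set is defined or in which p-value variant is adjusted).

```python
def fold_enrichment(list_hits, list_total, pop_hits, pop_total):
    """Ratio of the term's frequency in the DEG list to its frequency
    in the background population (hypothetical helper)."""
    return (list_hits / list_total) / (pop_hits / pop_total)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical example values, not taken from the tables.
    print(fold_enrichment(10, 100, 50, 1000))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Because the adjustment depends on every term tested within a category, the "Benjamini" values in the tables cannot be recomputed from the listed rows alone; only terms passing the significance cut-off are shown.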


Appendix Table 3.5. Enriched GO terms and KEGG pathways in MCF7 in response to culture medium changes at 5% O2.

Biological Process
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
GO:0060337~type I interferon signaling pathway | 6.09E-13 | 10.47173 | 1.50E-09 | 18 | EGR1, SP100, OAS3, RSAD2, OAS2, STAT1, IFI35, ISG20, STAT2, OASL, IFIT1, ISG15, IRF7, XAF1, MX1, MX2, IFI6, ADAR
GO:0009615~response to virus | 1.14E-11 | 7.108083 | 1.41E-08 | 21 | IFIH1, CYP1A1, OAS3, RSAD2, IFI44, OAS2, IVNS1ABP, HMGA1, CXCL12, ISG20, DDX58, OASL, IFIT1, DDX60, GATA3, IRF7, STMN1, MX1, MX2, ENO1, ADAR
GO:0000082~G1/S transition of mitotic cell cycle | 2.41E-11 | 7.300552 | 1.98E-08 | 20 | CDC6, POLE, MCM2, MCM10, MCM3, MCM4, CDC25A, CDK2, MCM5, MCM6, CCNE2, TYMS, CDC45, CDKN1A, DHFR, PLK2, RRM2, PRIM2, PCNA, ORC1
GO:0006270~DNA replication initiation | 3.52E-10 | 13.96231 | 2.17E-07 | 12 | CCNE2, CDC6, CDC45, POLE, PRIM2, MCM2, MCM3, MCM10, MCM4, ORC1, MCM5, MCM6
GO:0006260~DNA replication | 1.06E-09 | 5.284658 | 5.21E-07 | 22 | CLSPN, CDC6, LIG1, POLE, BRIP1, MCM2, MCM10, MCM3, MCM4, CDC25A, CDK2, MCM5, MCM6, CDC45, RFC3, TIMELESS, RFC2, RRM2, RRM1, PCNA, ORC1, DSCC1
GO:0051607~defense response to virus | 3.36E-09 | 4.964375 | 1.38E-06 | 22 | OAS3, BNIP3, RSAD2, IFI44L, OAS2, STAT1, DDIT4, STAT2, ISG20, PLSCR1, NLRC5, OASL, IFIT1, SERINC5, ISG15, DDX60, IFIT5, C19ORF66, BNIP3L, MX1, MX2, ADAR
GO:0045071~negative regulation of viral genome replication | 7.13E-08 | 10.23902 | 2.51E-05 | 11 | PLSCR1, IFIT1, OASL, ISG15, C19ORF66, OAS3, RSAD2, PARP10, MX1, ISG20, ADAR
GO:0006334~nucleosome assembly | 1.16E-07 | 5.318974 | 3.58E-05 | 17 | H1F0, HIST1H2BD, HIST1H1C, ANP32E, H1FX, MCM2, HIST2H4A, HIST2H3D, HIST1H2BO, HIST1H2BK, HIST2H2BE, HIST2H2BF, HIST1H3E, HIST3H2BB, ASF1B, HIST1H3H, HIST1H4H
GO:0070059~intrinsic apoptotic signaling pathway in response to endoplasmic reticulum stress | 1.93E-06 | 10.1544 | 5.28E-04 | 9 | ATF4, CASP4, CEBPB, XBP1, CHAC1, BBC3, ERN1, TRIB3, PPP1R15A
GO:0071353~cellular response to interleukin-4 | 2.21E-06 | 12.41094 | 5.45E-04 | 8 | HSP90AB1, XBP1, GATA3, FASN, MCM2, NFIL3, TUBA1B, IMPDH2
GO:0042493~response to drug | 2.48E-06 | 3.061909 | 5.56E-04 | 25 | XRCC5, TXNIP, RET, LDHA, HSP90AA1, VAV3, CYP1A1, AK4, GAL, STAT1, RAD51, TYMS, FOS, CYBA, CDKN1A, ACSL1, FYN, JUN, GATA3, OXCT1, JUND, TGIF1, LRP8, HSPD1, SLC9A1
GO:0043065~positive regulation of apoptotic process | 6.53E-06 | 2.978625 | 0.001343 | 24 | BID, TXNIP, LDHA, VAV3, TNFRSF12A, GRIN1, BNIP3, GAL, DAPK3, JMY, NOTCH1, ATF4, NUPR1, DUSP1, HMOX1, GADD45G, BNIP3L, FAM162A, HSPD1, GADD45B, BMF, PHLDA3, GADD45A, SLC9A1
GO:0000083~regulation of transcription involved in G1/S transition of mitotic cell cycle | 2.45E-05 | 11.33173 | 0.004648 | 7 | CDC6, TYMS, CDC45, DHFR, RRM2, PCNA, ORC1
GO:0071456~cellular response to hypoxia | 5.15E-05 | 4.654102 | 0.009035 | 12 | SLC29A1, E2F1, BBC3, HMOX1, VEGFA, BNIP3L, MST1, BNIP3, STC1, FAM162A, CCNA2, SLC9A1
GO:0043627~response to estrogen | 5.56E-05 | 5.728126 | 0.009104 | 10 | LDHA, KRT19, HSP90AA1, HMOX1, GATA3, ESR1, ARSA, F7, HSPD1, GAL
GO:0006268~DNA unwinding involved in DNA replication | 9.40E-05 | 18.61641 | 0.014396 | 5 | MCM2, HMGA1, MCM4, RAD51, MCM6
GO:0051290~protein heterotetramerization | 1.13E-04 | 7.091965 | 0.01634 | 8 | RRM2, XRCC6, RRM1, FARSA, HIST1H3E, HIST2H4A, HIST1H3H, HIST1H4H
GO:0098609~cell-cell adhesion | 1.36E-04 | 2.747809 | 0.018451 | 20 | HSP90AB1, LDHA, S100P, DIAPH3, FSCN1, SLC3A2, PFKP, H1FX, STAT1, PKM, PSMB6, BAG3, CAPG, FASN, HIST1H3E, AHSA1, PAICS, HIST1H3H, HSPA8, ENO1
GO:0032508~DNA duplex unwinding | 1.54E-04 | 6.769603 | 0.019796 | 8 | XRCC5, GINS1, DHX9, CDC45, XRCC6, BRIP1, MCM3, MCM5
GO:0000722~telomere maintenance via recombination | 1.80E-04 | 8.144678 | 0.021918 | 7 | RFC3, RFC2, LIG1, POLE, PRIM2, PCNA, RAD51
GO:0071480~cellular response to gamma radiation | 1.93E-04 | 10.63795 | 0.022409 | 6 | XRCC5, EGR1, CYBA, CDKN1A, XRCC6, RAD51
GO:0051591~response to cAMP | 2.05E-04 | 6.475272 | 0.02277 | 8 | FOS, LDHA, SDC1, DUSP1, JUN, JUND, VGF, STAT1
GO:0031100~organ regeneration | 2.36E-04 | 6.337501 | 0.02498 | 8 | PKM, CDKN1A, NOTCH1, F7, CXCL12, CCNA2, PRPS2, GSTP1
GO:0006139~nucleobase-containing compound metabolic process | 3.07E-04 | 6.078827 | 0.031129 | 8 | SLC29A1, TYMS, OAS3, BRIP1, OAS2, AK4, PRPS2, TK1
GO:0042542~response to hydrogen peroxide | 3.96E-04 | 5.840442 | 0.038326 | 8 | TXNIP, LDHA, SDC1, DUSP1, JUN, HMOX1, HSPD1, STAT1
GO:0051301~cell division | 5.17E-04 | 2.340348 | 0.046176 | 22 | CDC6, KIF11, LIG1, KNSTRN, CDC25A, CDK2, MCM5, FSD1, SPC24, CCNE2, VRK1, NCAPH, TUBB, CDCA7, CCND3, TIMELESS, NCAPG, ZWINT, BUB1, CCNA2, TUBA1B, HELLS
GO:0061621~canonical glycolysis | 5.58E-04 | 8.592188 | 0.046403 | 6 | PKM, PFKFB3, PGAM1, HK2, PFKP, ENO1
GO:0051726~regulation of cell cycle | 5.05E-04 | 3.603176 | 0.046858 | 12 | CCNE2, E2F2, JUN, GADD45G, JUND, BEX2, GADD45B, MYBL2, MX2, GADD45A, CDC25A, HSPA8
GO:0000079~regulation of cyclin-dependent protein serine/threonine kinase activity | 5.52E-04 | 6.682813 | 0.047491 | 7 | CCNE2, CDC6, CDKN1A, CCNA2, GADD45A, SERTAD1, CDC25A
GO:0034340~response to type I interferon | 6.17E-04 | 21.27589 | 0.047983 | 4 | SP100, ISG15, C19ORF66, MX1
GO:0060333~interferon-gamma-mediated signaling pathway | 6.06E-04 | 4.719653 | 0.048681 | 9 | TRIM38, OASL, SP100, TRIM68, IRF7, OAS3, OAS2, STAT1, TRIM21

Cellular Component
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
GO:0005654~nucleoplasm | 1.98E-16 | 1.981938 | 8.84E-14 | 142 | XRCC5, CRABP2, XRCC6, CBX4, MCM10, ISG20, HIST1H2BO, SIN3B, CDCA7, PACSIN3, HIST1H2BK, ISG15, GATA3, PRIM2, CCNA2, ORC1, H1F0, SNRPA1, LIG1, POLE, PARP10, HES1, RFC3, TIMELESS, RFC2, JUN, TGIF1, DSCC1, FGFR4, PFKFB3, OAS3, UBA7, MYBL2, CMPK2, HNRNPA3, HNRNPM, VRK1, PSMB7, PSMB6, CSE1L, PSMB3, HNRNPD, SYBU, WDHD1, ASF1B, HIST1H4H, GINS1, DHX9, BRIP1, ATAD2, RBMX, ABCG1, HIST2H3D, NOTCH1, ATF4, CDKN1A, DHFR, ATF3, PARP9, PKP1, PSMC1, PCNA, POP1, HIST1H3E, FABP5, HIST1H3H, KLF4, SLC9A1, ADAR, E2F1, HSP90AB1, IER2, CLSPN, E2F2, FOSL2,


HELZ2, E2F7, BNIP3, IFI44L, RNF187, HIST2H4A, PDCD4, CCNE2, HSPH1, FOS, ACOT7, CDC45, FANCI, PSMD1, PSMD3, FANCF, HIST3H2BB, EGR1, CDC6, HSP90AA1, SP100, LIMK1, ESR1, MCM2, MCM3, MCM4, HMGA1, CDK2, MCM5, JMY, RAD51, MCM6, CCND3, HIST2H2BE, HIST2H2BF, RRM2, RRM1, KPNA2, GADD45A, UBE2T, UNG, ZNF367, TRIB3, EHF, IVNS1ABP, TYMS, XBP1, BUB1, BDH1, HSPA8, RAD51AP1, CEBPB, HIST1H2BD, TONSL, CEBPG, GMNN, ACLY, STAT1, CDC25A, CENPI, STAT2, CENPH, PHF19, IRF7, CAPG, LTA4H, HPGD XRCC5, AKNA, LDHA, QPCTL, XRCC6, PGAM1, ELOVL1, SLC2A6, PIP5KL1, SLC2A1, GLE1, FTL, RET, CHPT1, LPCAT4, PARP14, VEGFA, NEU1, STMN1, SFXN1, SFXN2, OAS2, SLC29A1, HNRNPM, CSE1L, FOLR1, MSLN, GYS1, HIST1H4H, DHX9, MGAT4A, LGALS3, SCD, SYT12, REEP1, DOCK8, RBMX, APOL2, PLEKHA4, APOL3, OASL, LAMP3, PARP9, PSMC1, HIST1H3E, HSPD1, CLCN7, HIST1H3H, FABP6, ADAR, HSP90AB1, LDLR, HELZ2, HNRNPLL, GO:0016020~membrane 1.03E-12 1.978182 2.05E-10 112 HIST2H4A, MTHFD1, FOS, FANCI, HMOX1, PSMD1, PSMD3, FAM129A, IMPDH2, DHCR24, HSP90AA1, KIF11, LIMK1, SLC3A2, PFKP, ESR1, MCM3, SLC9A3R1, MCM4, MCM5, SLC7A11, SDC1, NRM, CHRM1, FARSA, KPNA2, PPP1R15A, BID, HK2, EEA1, EHBP1L1, HMMR, ANXA6, NCAPH, ACSL1, NCAPG, TAP1, FASN, BUB1, HSPA8, ENO1, B4GALNT1, GDI2, FADS1, FADS3, NLGN2, TSPAN13, ACLY, ANXA5, NUP155, TMPRSS4, RAB32, PLSCR1, CYBA, SLC6A9, LRP8, SLC15A3, PAICS MOCOS, CTNNAL1, XRCC5, LDHA, CRABP2, XRCC6, PGAM1, CCT2, AMOTL2, FAH, NLRC5, PACSIN1, ISG15, SLC2A1, MX1, MX2, ORC1, RNF146, FTL, CHAC1, FBP1, SPIRE2, DDIT4, TRIM38, RENBP, JUN, ZWINT, STMN1, IFIH1, MVD, PFKFB3, DIAPH3, OAS3, UBA7, ACP5, OAS2, SESN2, IFI35, TK1, CMPK2, SPC24, VRK1, PSMB7, PSMB6, CSE1L, TRIM68, PSMB3, HNRNPD, GYS1, IDH2, BMF, FH, DHX9, VAV3, FSCN1, DOCK8, DDX58, NOTCH1, OASL, CDKN1A, DHFR, PLK2, PARP9, BBC3, PSMC1, HSPD1, GO:0005829~cytosol 8.15E-12 1.711355 1.08E-09 146 AHSA1, FABP5, FABP6, HSP90AB1, SAT1, PDCD4, CCNE2, PKM, MTHFD1, HSPH1, FOS, ACOT7, USP18, BAG3, HMOX1, PSMD1, PSMD3, DHTKD1, ERRFI1, IMPDH2, DHCR24, CDC6, HSP90AA1, KIF11, LIMK1, 
SARS, HERC6, PFKP, PADI2, MID1IP1, HMGA1, CDK2, RRM2, RRM1, FARSA, PPP1R15A, KPNA2, SMS, GSTP1, BID, PPFIA4, HK2, TRIB3, EEA1, VARS, HPRT1, TPM3, HMMR, TYMS, NCAPH, NCAPG, XBP1, FASN, BUB1, XAF1, MTMR4, HSPA8, ENO1, TXNIP, GDI2, SPSB1, GMNN, DPYSL5, ACLY, STAT1, CAPN2, CDC25A, TRIM21,


CENPI, STAT2, CENPH, PLSCR1, IFIT1, FYN, IRF7, BNIP3L, LTA4H, VPS28, PAICS, HPGD XRCC5, LDHA, MCRIP1, XRCC6, H1FX, ISG20, HIST1H2BO, SIN3B, CDCA7, HIST1H2BK, PHTF2, TIGD2, CCNA2, H1F0, TYRO3, LIG1, POLE, PARP10, HES1, UHRF1, PARP12, JUN, PARP14, ZWINT, TGIF1, STC1, PABPC4, UBA7, OAS2, IFI35, JRK, HNRNPA3, VRK1, DHX9, LGALS3, KLF10, CSRP2, NOTCH1, PARP9, PKP1, PCNA, ADAR, BNIP3, RNF187, ZBTB38, PKM, CCNE2, USP18, SLC25A1, HIST3H2BB, ERRFI1, SERTAD1, DHCR24, CDC6, BATF2, HIST1H1C, HERC6, ANP32E, SLC3A2, JMY, RAD51, CCND3, HIST2H2BE, HIST2H2BF, RRM2, GADD45G, ZSCAN18, GADD45B, GADD45A, HIST1H2AC, UNG, ZNF324B, ZNF367, EHF, NCAPH, NCAPG, XBP1, C19ORF66, TRIP13, HIST1H2BD, GMNN, DLGAP5, FADS1, PHF10, STAT1, CAPN2, TRIM21, STAT2, FSD1, PLSCR1, NUPR1, DUSP1, FYN, CAPG, DUSP8, AKNA, CRABP2, GO:0005634~nucleus 8.34E-11 1.471046 8.30E-09 205 CBX4, MCM10, NLRC5, GATA3, MX1, MX2, ORC1, RNF146, PKIG, TACC2, ASCL1, TIMELESS, MNX1, IFIH1, DIAPH3, BEX2, STIP1, SP110, SESN2, ZMYND8, CMPK2, SPC24, TUBB, MEIS3, PSMB7, PSMB6, TRIM68, CSE1L, CDYL2, FOLR1, PSMB3, JUND, HNRNPD, AUTS2, MFAP3L, ASF1B, HELLS, HIST1H4H, GINS1, S100P, ATAD2, BRIP1, RBMX, IDH3A, HIST2H3D, CDKN1A, ATF4, ATF3, DYRK1B, PSMC1, HIST1H3E, HIST1H3H, HSP90AB1, E2F1, IER2, FOSL2, E2F7, HNRNPLL, HIST2H4A, PDCD4, FOS, CDC45, HMOX1, PSMD1, PSMD3, NFIL3, IMPDH2, EGR1, HSP90AA1, SP100, ESR1, PFKP, MID1IP1, MCM2, SLC9A3R1, MCM3, HMGA1, DAPK3, MCM4, MCM5, CDK2, MCM6, DHRS2, KPNA2, UBE2T, GSTP1, IRX3, TRIB3, MSX2, TSC22D1, TYMS, MSI1, XAF1, HSPA8, SYNPO, ENO1, TXNIP, CEBPB, RAD51AP1, CEBPG, KNSTRN, CDC25A, CENPI, CENPH, CYBA, IRF7, LTA4H, GDF15 LDHA, MCRIP1, XRCC6, PGAM1, CCT2, ISG20, HIST1H2BO, SIN3B, CDCA7, PIP5KL1, HIST1H2BK, GLE1, CCNA2, FTL, ARC, PIM3, IFI44, PARP10, HES1, KRT17, PARP14, ZWINT, VEGFA, STC1, SLC2A10, PABPC4, OAS3, OAS2, HNRNPA3, JRK, VRK1, FAM162A, WDHD1, FH, DHX9, LGALS3, APOL3, OASL, PARP9, PCNA, APOL6, AHSA1, KLF4, ADAR, SLC9A1, CLSPN, GO:0005737~cytoplasm 1.34E-09 1.451004 1.07E-07 195 BNIP3, RNF187, 
PKM, HSPH1, FANCI, BAG3, FAM129A, HIST3H2BB, ERRFI1, SERTAD1, CDC6, HERC6, SLC3A2, JMY, RAD51, DOK3, CCND3, HIST2H2BE, RRM2, HIST2H2BF, GADD45G, RRM1, FARSA, GADD45B, GADD45A, BID, EEA1, HPRT1, NCAPG, XBP1, C19ORF66, BUB1, FASN, PHLDA3, GDI2, HIST1H2BD, GMNN, DLGAP5, ACLY, CAPN2, STAT1, TRIM21, STAT2, FSD1, DUSP1, NUPR1, SAMD9, CAPG, HPGD,


DUSP8, PAICS, CRABP2, MCM10, KANK2, TRIM16L, NLRC5, PACSIN1, PACSIN3, MX1, MX2, ORC1, RET, PKIG, FBP1, DDIT4, TACC2, MNX1, STMN1, FGFR4, BEX2, SESN2, ZMYND8, TUBB, PSMB7, PSMB6, TRIM68, CSE1L, PSMB3, TFF3, MFAP3L, GINS1, S100P, FSCN1, BRIP1, REEP1, DNMBP, PLEKHA4, DDX58, ATF4, PLK2, PSMC1, HSPD1, FABP5, FABP6, HSP90AB1, IER2, IFI44L, PDCD4, ACOT7, CDC45, DDX60, PSMD3, KLHL24, IMPDH2, EGR1, KIF12, KIF11, HSP90AA1, SP100, LIMK1, SARS, ESR1, PFKP, PADI2, MCM2, SLC9A3R1, DAPK3, CDK2, DHRS2, SDC1, ERN1, KPNA2, PPP1R15A, UBE2T, GSTP1, SHCBP1, IRX3, IVNS1ABP, TSC22D1, TYMS, MSI1, BDH1, ENO1, TXNIP, CEBPB, TONSL, DPYSL5, ISOC1, RGS16, ANXA5, KNSTRN, CENPI, CDC25A, ACTL8, IFIT1, IRF7, LTA4H, VPS28, GDF15 GO:0042555~MCM TONSL, MCM2, MCM3, MCM4, MCM5, 1.27E-06 25.90476 8.39E-05 6 complex MCM6 LDHA, CRABP2, FAM20C, PGAM1, CCT2, CXCL12, FAH, PACSIN3, SERPINE1, SLC2A1, COL12A1, FTL, FBP1, RENBP, KRT19, KRT17, NEU1, STMN1, PRPS2, MST1, ACP5, HNRNPM, PSMB7, SERINC5, TUBB, PSMB6, CSE1L, FOLR1, PSMB3, IDH2, HNRNPD, TFF3, FAM162A, HIST1H4H, FH, MGAT4A, S100P, VAV3, LGALS3, B4GAT1, FSCN1, ATAD2, AK4, CD63, RBMX, HIST2H3D, PKP1, FREM2, PCNA, HIST1H3E, HSPD1, PABPC1L, PDZK1, AHSA1, HIST1H3H, GO:0070062~extracellular 2.33E-06 1.53438 1.33E-04 111 FABP5, SLC9A1, HSP90AB1, HIST2H4A, exosome MTHFD1, PKM, HSPH1, ACOT7, PSMD1, PSMD3, SLC25A1, FAM129A, TUBA1B, IMPDH2, GOLM1, KIF12, HSP90AA1, SARS, SLC3A2, PFKP, PADI2, SLC9A3R1, DHRS2, MTMR11, SDC1, HIST2H2BE, HIST2H2BF, RRM1, SMS, GSTP1, BID, HIST1H2AC, EEA1, HPRT1, TPM3, ANXA6, TGM1, FASN, PHLDA3, HSPA8, ENO1, GDI2, HIST1H2BD, ACLY, ISOC1, CAPN2, ANXA5, DBI, PLSCR1, CAPG, ARSA, LTA4H, GDF15, VPS28, HPGD, PAICS HIST1H2BO, H1F0, HIST1H2AC, HIST1H2BD, HIST1H2BK, HIST1H1C, HIST2H2BE, H1FX, GO:0000786~nucleosome 5.07E-06 5.37386 2.52E-04 13 HIST1H3E, HIST2H4A, HIST1H3H, HIST2H3D, HIST1H4H XRCC5, SP100, XRCC6, MCM2, HIST2H4A, GO:0000784~nuclear MCM3, MCM4, MCM5, RAD51, MCM6, chromosome, telomeric 6.45E-06 4.483516 2.85E-04 15 PCNA, HIST1H3E, ORC1, 
HIST1H3H, region HIST1H4H HSP90AB1, E2F1, UQCRC1, BNIP3, KANK2, PKM, MTHFD1, ATAD3A, ACOT7, CASP4, DHTKD1, CYP1A1, LIG1, RHBDD1, DDIT4, RAD51, DHRS2, ERN1, PPP1R15A, OAT, GO:0005739~mitochondrion 9.59E-05 1.693249 0.003809 58 GSTP1, BID, UNG, RSAD2, SFXN1, OAS2, SESN2, VARS, CMPK2, ANXA6, TYMS, ACSL1, PSMB3, OXCT1, TAP1, SYBU, FASN, IDH2, FAM162A, XAF1, BDH1, FH, PDK1, BRI3BP, FADS1, AK4, ABCG1, IDH3A, CYBA,


RAB32, PARP9, BBC3, FYN, BNIP3L, HDHD3, HSPD1, IFI6, SLC9A1 HIST1H2BO, HIST1H2BD, HIST1H2BK, GO:0000788~nuclear 1.19E-04 7.064935 0.004281 8 HIST2H2BE, HIST2H2BF, HIST1H3E, nucleosome HIST3H2BB, HIST1H3H GDI2, UQCRC1, HSP90AA1, FSCN1, PGAM1, GO:0043209~myelin sheath 1.53E-04 3.578947 0.005076 14 STIP1, CCT2, IDH3A, PKM, PACSIN1, SERINC5, HSPD1, TUBA1B, HSPA8 HSP90AB1, ANXA6, RAB32, HSP90AA1, GO:0042470~melanosome 2.70E-04 4.231966 0.008239 11 SLC2A1, CAPG, SLC3A2, FASN, CD63, GPNMB, HSPA8 HSP90AB1, LDHA, S100P, DIAPH3, FSCN1, GO:0005913~cell-cell SLC3A2, PFKP, H1FX, STAT1, PKM, PSMB6, 7.32E-04 2.406015 0.020595 20 adherens junction BAG3, CAPG, FASN, HIST1H3E, AHSA1, PAICS, HIST1H3H, HSPA8, ENO1 E2F1, H1F0, HIST1H2AC, UHRF1, CEBPB, GO:0000790~nuclear 0.00149 2.818653 0.038802 14 RAD51AP1, TIMELESS, GATA3, JUND, ESR1, chromatin ASF1B, STAT1, KLF4, RAD51 Molecular Function Fold Term p-value Benjamini Count Genes Enrichment XRCC5, MOCOS, MCRIP1, LDHA, XRCC6, FAM20C, PGAM1, CCT2, AMOTL2, FAH, S1PR3, SIN3B, ISG15, SERPINE1, INSIG1, GLE1, CCNA2, FTL, GTPBP2, H1F0, TYRO3, CYP1A1, PIM3, PARP10, F7, HES1, KRT19, RFC3, UHRF1, ATP2C2, KRT17, RFC2, ZWINT, JUN, VEGFA, TGIF1, PABPC4, UBA7, OAS3, MST1, OAS2, MYBL2, IFI35, TK1, JRK, HNRNPA3, VRK1, SYBU, GYS1, FAM162A, WDHD1, FH, DHX9, LGALS3, KLF10, FAM111B, CD63, CSRP2, NOTCH1, PKP1, PARP9, POP1, PCNA, AHSA1, PDZK1, PHYKPL, KLF4, ADAR, SLC9A1, CLSPN, LDLR, BNIP3, RNF187, ZBTB38, CCNE2, PKM, MTHFD1, HSPH1, USP18, FANCI, BAG3, FANCF, FAM129A, ERRFI1, SERTAD1, GOLM1, CDC6, BATF2, HIST1H1C, SLC3A2, LENG8, RAD51, JMY, INHBB, FAM167A, DOK3, CCND3, RRM2, RRM1, GADD45G, GO:0005515~protein FARSA, GADD45B, GADD45A, BID, PPFIA4, 3.74E-09 1.256092 1.41E-06 302 binding UNG, USP5, HK2, EEA1, HPRT1, HMMR, NCAPH, XBP1, NCAPG, C19ORF66, BUB1, FASN, MTMR4, TRIP13, PDK1, GDI2, SPSB1, DLGAP5, GMNN, ACLY, STAT1, CAPN2, TRIM21, STAT2, TMPRSS4, PLSCR1, RAB32, PHF19, DUSP1, FYN, BNIP3L, SAMD9, CAPG, ARSA, MPPED2, LRP8, PAICS, CTNNAL1, FAM189B, CRABP2, 
CBX4, GUCD1, MCM10, KANK2, ELOVL1, NRCAM, NLRC5, PACSIN1, PACSIN3, UNC5B, GATA3, SLC2A1, MX1, ORC1, MX2, RNF146, SNRPA1, RET, CHAC1, FBP1, ASCL1, TRIM38, TIMELESS, STMN1, PRPS2, DSCC1, FGFR4, GPN3, IFIH1, TNFRSF12A, BEX2, STIP1, SESN2, ZMYND8, SPC24, HNRNPM, PSMB7, TUBB, CSE1L, CDYL2, PSMB3, MSLN, JUND, HNRNPD, TFF3, MFAP3L, ASF1B, BMF, HELLS, HIST1H4H, VAV3, S100P, GRIN1, FSCN1, BRIP1, REEP1, DOCK8, RBMX, DNMBP, HIST2H3D, DDX58, ATF4, CDKN1A,


ATF3, PLK2, BBC3, DYRK1B, PSMC1, HDHD3, HSPD1, HIST1H3E, IFI6, HIST1H3H, FABP5, HSP90AB1, SAT1, E2F1, E2F2, FOSL2, HELZ2, E2F7, HNRNPLL, HIST2H4A, PDCD4, FOS, CDC45, ACOT7, HMOX1, DDX60, PSMD3, NFIL3, TUBA1B, IMPDH2, EGR1, SP100, HSP90AA1, RAB39B, LIMK1, RAB3IL1, ESR1, MCM2, MID1IP1, MCM3, GAL, SLC9A3R1, MCM4, HMGA1, DAPK3, CDK2, MCM5, SLC7A11, MCM6, DHRS2, SDC1, NRM, ERN1, KPNA2, PPP1R15A, GSTP1, SHCBP1, TRIB3, RSAD2, VARS, TPM3, MSX2, TSC22D1, ANXA6, TAP1, TGM1, GPNMB, HSPA8, ENO1, SYNPO, TXNIP, RAD51AP1, CEBPB, TONSL, CEBPG, DPYSL5, ISOC1, KNSTRN, NUP155, RGS16, ANXA5, CDC25A, CENPI, CENPH, ACTL8, CYBA, KCNN4, IFIT1, IRF7, LTA4H, VPS28, GDF15, IGFBP5, SH3BP2 XRCC5, HSP90AB1, SEPHS2, HELZ2, XRCC6, FAM20C, CCT2, MTHFD1, PKM, HSPH1, NLRC5, ATAD3A, PIP5KL1, DDX60, ORC1, TYRO3, CDC6, RET, KIF12, KIF11, HSP90AA1, LIMK1, LIG1, SARS, PFKP, PIM3, MCM2, MCM3, MCM4, DAPK3, MCM5, CDK2, MCM6, RAD51, ATP2C2, KIF1A, GO:0005524~ATP binding 3.33E-09 1.979703 2.51E-06 81 RENBP, CBWD1, RFC2, RRM1, ERN1, FARSA, UBE2T, PRPS2, IFIH1, FGFR4, MVD, PFKFB3, UBA7, OAS3, HK2, TRIB3, OAS2, VARS, CMPK2, TK1, VRK1, ACSL1, TAP1, BUB1, HELLS, HSPA8, TRIP13, PDK1, DHX9, ATAD2, BRIP1, ACLY, AK4, ABCG1, DDX58, OASL, ADCY9, PLK2, ATP2A3, FYN, DYRK1B, PSMC1, HSPD1, CLCN7, PAICS GO:0003725~double- DDX58, HSP90AB1, IFIH1, OASL, DDX60, 3.84E-05 5.989994 0.009595 10 stranded RNA binding OAS3, SLC3A2, HSPD1, OAS2, TUBA1B IER2, XRCC5, E2F1, E2F2, CLSPN, HELZ2, E2F7, XRCC6, H1FX, HIST2H4A, HIST1H2BO, FOS, HIST1H2BK, FANCI, GATA3, PHTF2, PRIM2, TIGD2, NFIL3, HIST3H2BB, ORC1, IMPDH2, EGR1, BATF2, SP100, LIG1, POLE, ESR1, HMCES, MCM2, MCM3, MCM4, HMGA1, MCM6, RAD51, HES1, ASCL1, RFC3, GO:0003677~DNA binding 6.59E-05 1.593396 0.012326 73 RFC2, HIST2H2BE, HIST2H2BF, JUN, MNX1, TGIF1, ZSCAN18, DSCC1, HIST1H2AC, IFIH1, ZNF324B, ZNF367, EHF, SP110, JRK, MEIS3, XBP1, WDHD1, HIST1H4H, ENO1, DHX9, HIST1H2BD, CEBPB, CEBPG, BRIP1, STAT1, TRIM21, STAT2, PLSCR1, OASL, ATF4, NUPR1, IRF7, PCNA, ADAR HIST1H2AC, BNIP3, 
HIST2H4A, HIST1H2BO, FOS, HIST1H2BK, XBP1, HIST3H2BB, HIST1H4H, TYRO3, CEBPB, HIST1H2BD, GO:0046982~protein 8.40E-05 2.278774 0.012575 29 LIMK1, CEBPG, CAPN2, ABCG1, HIST2H3D, heterodimerization activity CYBA, NOTCH1, ATF4, ATF3, TIMELESS, HIST2H2BE, HIST2H2BF, JUN, VEGFA, BNIP3L, HIST1H3E, HIST1H3H GO:0003688~DNA 1.56E-04 16.60862 0.019331 5 replication origin binding CDC45, MCM2, HSPD1, MCM10, MCM5 FOSL2, CBX4, ZMYND8, FOS, SIN3B, MEIS3, GO:0003682~chromatin 1.99E-04 2.336251 0.021156 25 CDC45, GATA3, HNRNPD, AUTS2, ORC1, binding HELLS, CEBPB, POLE, ESR1, ATAD2, RBMX,


HMGA1, MCM5, RAD51, ASCL1, NUPR1, JUN, PCNA, UBE2T
GO:0098641~cadherin binding involved in cell-cell adhesion | 4.09E-04 | 2.519928 | 0.030304 | 20 | HSP90AB1, LDHA, S100P, DIAPH3, FSCN1, SLC3A2, PFKP, H1FX, STAT1, PKM, PSMB6, BAG3, CAPG, FASN, HIST1H3E, AHSA1, PAICS, HIST1H3H, HSPA8, ENO1
GO:0000982~transcription factor activity, RNA polymerase II core promoter proximal region sequence-specific binding | 3.33E-04 | 9.531903 | 0.030846 | 6 | FOS, ATF3, BATF2, FOSL2, JUN, IRF7
GO:0042802~identical protein binding | 3.82E-04 | 1.853779 | 0.03149 | 38 | LDHA, CLDN9, LDLR, E2F7, BNIP3, HPRT1, MCM10, AMOTL2, TK1, SLC2A1, MSI1, GLE1, TRIP13, FTL, HSP90AA1, SP100, FBP1, ESR1, STAT1, RBMX, DAPK3, TRIM21, MCM6, STAT2, RAD51, DDX58, UHRF1, ATF3, FYN, JUN, BNIP3L, VEGFA, ERN1, PCNA, PAICS, PHYKPL, OAT, PRPS2
GO:0042393~histone binding | 5.15E-04 | 3.593996 | 0.034663 | 12 | UHRF1, TONSL, ANP32E, ATAD2, MCM2, HIST1H3E, ASF1B, HIST2H4A, ZMYND8, HIST1H3H, HIST1H4H, HIST2H3D

KEGG Pathway
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
hsa04110:Cell cycle | 7.68E-09 | 4.833991 | 1.74E-06 | 21 | E2F1, E2F2, CDC6, MCM2, MCM3, MCM4, CDC25A, CDK2, MCM5, MCM6, CCNE2, CDC45, CDKN1A, CCND3, GADD45G, BUB1, PCNA, GADD45B, ORC1, GADD45A, CCNA2
hsa03030:DNA replication | 2.53E-07 | 8.721646 | 2.88E-05 | 11 | RFC3, RFC2, LIG1, POLE, PRIM2, PCNA, MCM2, MCM3, MCM4, MCM5, MCM6
hsa04115:p53 signaling pathway | 1.69E-05 | 5.112281 | 0.001279 | 12 | BID, CCNE2, CDKN1A, CCND3, BBC3, RRM2, GADD45G, SERPINE1, GADD45B, SESN2, GADD45A, CDK2
hsa05162:Measles | 2.05E-04 | 3.219199 | 0.011559 | 15 | IFIH1, OAS3, OAS2, STAT1, CDK2, STAT2, DDX58, CCNE2, CCND3, BBC3, FYN, IRF7, MX1, HSPA8, ADAR
hsa05203:Viral carcinogenesis | 2.70E-04 | 2.645501 | 0.012177 | 19 | HIST1H2BD, SP100, HIST2H4A, CDK2, PKM, CCNE2, HIST1H2BO, CDKN1A, ATF4, CCND3, HIST1H2BK, HIST2H2BE, HIST2H2BF, JUN, IRF7, PSMC1, HIST3H2BB, CCNA2, HIST1H4H
hsa05161:Hepatitis B | 5.03E-04 | 2.952783 | 0.018845 | 15 | E2F1, E2F2, IFIH1, STAT1, CDK2, STAT2, DDX58, CCNE2, FOS, CDKN1A, ATF4, JUN, IRF7, PCNA, CCNA2
hsa01130:Biosynthesis of antibiotics | 0.001125 | 2.423511 | 0.03583 | 18 | LDHA, MVD, PFKP, HK2, FBP1, PGAM1, ACLY, AK4, IDH3A, FDFT1, PKM, SQLE, IDH2, OAT, PAICS, PRPS2, ENO1, FH


Appendix Table 3.6. Enriched GO terms and KEGG pathways in MCF7 in response to culture medium changes at 18% O2.

Biological Process
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
GO:0006260~DNA replication | 1.40E-12 | 5.679724 | 3.97E-09 | 27 | BLM, NFIX, MCM10, CDT1, CDC45, MCM7, POLE2, EXO1, RECQL4, CDC6, SSRP1, DTL, LIG1, GINS3, BRIP1, RMI2, MCM2, MCM3, MCM4, CDC25A, MCM5, RBBP8, POLD3, RFC2, RRM2, POLD2, CHTF18
GO:0060337~type I interferon signaling pathway | 5.93E-11 | 8.660922 | 8.43E-08 | 17 | IFITM3, OAS3, HLA-A, RSAD2, STAT1, IFI35, HLA-F, ISG20, IFIT3, OASL, IFIT1, IFI27, ISG15, IRF7, XAF1, MX2, IFI6
GO:0000082~G1/S transition of mitotic cell cycle | 1.74E-09 | 6.073634 | 1.64E-06 | 19 | CDC6, IQGAP3, PKMYT1, MCM2, MCM10, MCM3, MCM4, CDC25A, MCM5, CDT1, RBBP8, CDC45, DHFR, MCM7, PLK2, POLE2, RRM2, RANBP1, CDCA5
GO:0071222~cellular response to lipopolysaccharide | 1.82E-06 | 4.616754 | 0.001289 | 16 | ZFP36, HMGB2, TNF, CEBPB, ASS1, LITAF, CXCL8, IL24, CMPK2, B2M, CD36, SERPINE1, TFPI, ZC3H12A, TNFAIP3, GSTP1
GO:0006270~DNA replication initiation | 4.01E-06 | 9.170388 | 0.002278 | 9 | CDC6, CDC45, MCM7, POLE2, MCM2, MCM3, MCM10, MCM4, MCM5
GO:0009615~response to virus | 3.14E-05 | 4.149832 | 0.011088 | 14 | IFIT3, ODC1, IFIT1, OASL, TNF, IFITM3, IRF7, OAS3, RSAD2, IFI44, CCL5, CXCL12, MX2, ISG20
GO:0045071~negative regulation of viral genome replication | 2.37E-05 | 7.336311 | 0.011139 | 9 | IFIT1, OASL, TNF, ISG15, IFITM3, OAS3, RSAD2, CCL5, ISG20
GO:0051301~cell division | 3.04E-05 | 2.515307 | 0.012268 | 27 | KIFC1, CKS1B, HAUS5, NEK3, KNTC1, NR3C1, CCNG1, TUBB, CDCA7, NCAPG2, SKA3, CDCA5, TUBA1B, CDCA4, CDC6, LIG1, CENPF, KIF18B, BIRC5, TACC3, SEPT10, MCM5, CDC25A, RBBP8, DCLRE1A, ZWINT, MAD2L2
GO:0000122~negative regulation of transcription from RNA polymerase II promoter | 4.90E-05 | 1.947292 | 0.01535 | 43 | E2F1, HMGB1, AEBP1, TNF, USP2, WFS1, SOX3, E2F8, SNCA, EDN1, CBX4, FHL2, TRIB3, NFIX, CBX2, CXXC5, AURKB, CBX8, CITED2, MSX2, NRARP, OVOL1, ZFP36, SATB2, RBL1, RELB, KLF11, LMCD1, SUV39H1, HES6, STAT1, RBBP8, ASCL2, ASCL1, UHRF1, IFI27, DKK1, CD36, BTG2, TRPS1, IRF7, MAD2L2, KLF4
GO:0060333~interferon-gamma-mediated signaling pathway | 5.72E-05 | 5.051607 | 0.016101 | 11 | OASL, IRF7, MT2A, OAS3, HLA-A, CAMK2B, JAK2, STAT1, TRIM21, HLA-F, B2M
GO:0000731~DNA synthesis involved in DNA repair | 7.72E-05 | 7.45276 | 0.019728 | 8 | EXO1, POLD3, XRCC3, BLM, BRCA2, BRIP1, RMI2, RBBP8
GO:0000732~strand displacement | 1.09E-04 | 8.778491 | 0.025493 | 7 | EXO1, XRCC3, BLM, BRCA2, BRIP1, RMI2, RBBP8
GO:0042157~lipoprotein metabolic process | 1.34E-04 | 6.864384 | 0.028805 | 8 | APOL2, APOL3, APOL1, APOA1, HSPG2, PRKACB, APOL6, ABCG1
GO:0051607~defense response to virus | 1.70E-04 | 3.161777 | 0.033903 | 16 | IFITM3, OAS3, UNC93B1, RSAD2, IFI44L, APOBEC3H, STAT1, ISG20, IFIT3, NLRC5, OASL, IFIT1, ISG15, IFIT5, ZC3H12A, MX2

Cellular Component
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
GO:0005654~nucleoplasm | 4.44E-08 | 1.584687 | 2.03E-05 | 130 | XRCC3, CRABP2, CBX4, CDKN2AIPNL, PKMYT1, CBX2, AURKB, MCM10, CBX8, SART1, ISG20, HIST1H2BO, CDCA7, ISG15, HIST1H2BK, H2AFX, PRKACB, CDCA5, H1F0,

SATB2, MYO6, DTL, USP1, LIG1, RELB, RBL1, RMI2, DCLRE1A, RFC2, HSPB8, MAD2L2, HMGB1, FGFR4, HMGB2, LMNB1, LITAF, BLM, SOX3, OAS3, UBA7, RRM2B, CXXC5, MYBL2, CCNG1, CMPK2, POLE2, NCAPG2, EIF3E, HNRNPD, ASF1B, GINS3, SUV39H1, LMCD1, BRIP1, UBE2L6, NR4A1, BRCA2, ATAD2, S100A14, RAD54L, ABCG1, PSMB9, POLD3, ATF4, DHFR, SYNE2, PSMC3, TRPS1, POLD2, FABP5, POP7, KLF4, E2F1, E2F2, KYNU, FOSL2, HELZ2, ELF4, IFI44L, PDCD4, CDT1, PHC3, ACOT7, CDC45, MCM7, HIST3H2BB, FANCA, TSEN15, CDC6, ARID5A, TLE3, CCNC, MCM2, MCM3, MCM4, MCM5, RBBP8, FANCD2, RRM2, NUP205, CAND1, NCAPH2, UBE2T, CKS1B, USP2, FKBP5, ZNF367, TRIB3, FHL2, NR3C1, NECAB1, TSPYL2, ZC3H12A, CAMK2B, EXO1, SSRP1, CENPM, CEBPB, TONSL, CEBPG, AFF3, CENPF, BIRC5, STAT1, CDC25A, IRF7, CAPG, CHTF18, JAK2, TP53INP1 KIFC1, XRCC3, TUBB2B, SNCA, H1FX, AURKB, AQP3, ISG20, CITED2, HIST1H2BO, CDCA7, HIST1H2BK, WDR76, SMOX, H2AFX, CDCA5, CDCA4, H1F0, GTPBP3, LIG1, ZNF48, PIM1, ESPL1, ZNF787, UHRF1, DCLRE1A, PARP12, KRT16, HES4, ZWINT, STC1, MAD2L2, NEK3, UBA7, IFI35, NCAPG2, AHNAK2, TUBB4B, RECQL4, MKI67, LGALS3, KLF11, CSRP1, RAD54L, SYNE2, TRPS1, PKP3, TCF19, ZNF385B, POP7, NUAK2, ELF4, ELF5, TCOF1, HELZ, YBX2, KCNIP3, CDT1, USP18, HIST3H2BB, FANCA, ERRFI1, ZFP36, CDC6, HERC6, TLE3, CCNC, RBBP8, CLIC3, FANCD2, ZNF692, RRM2, CAND1, RASD1, FHL2, ZNF367, NECAB1, ZFP36L1, TRIP13, ZBP1, SSRP1, CEP131, LMX1B, AFF3, BIRC5, STAT1, TRIM21, MT1X, ATRX, MT2A, CAPG, GAMT, DUSP9, TP53INP1, TOB1, AKNA, LMO2, CRABP2, KNTC1, CBX4, CBX2, APOBEC3H, CBX8, GO:0005634~nucleus 1.96E-07 1.34744 4.48E-05 215 MCM10, FOXO6, CALB2, NLRC5, APOA1, MX2, SATB2, MYO6, DTL, USP1, RBL1, RELB, PKIB, RMI2, BASP1, HES6, ASCL2, ASCL1, CPE, HSPB8, SDCBP, SWT1, TNFAIP3, HMGB1, HMGB2, FGFR3, BLM, TFAP4, ASS1, ITGB4, TIMP3, SESN1, SESN3, CMPK2, TUBB, MEIS3, PEG10, EIF3E, OVOL1, HNRNPD, ASF1B, S100P, GINS3, LMCD1, SUV39H1, ATAD2, BRCA2, NR4A1, BRIP1, PSMB9, POLD3, IKBKE, ATF4, CORO1A, PSMC3, POLD2, CYFIP2, SPTBN1, MAFA, DCXR, HDAC6, E2F1, IER3, KIF22, AEBP1, PTGES2, 
FOSL2, E2F8, SULT2B1, HNRNPLL, PDCD4, PHC3, CDC45, MCM7, PLCB4, HMOX1, RANBP1, ARID5A, MCM2, MCM3, MCM4, MCM5, DHRS2, CMSS1, NRGN, UBE2T, GSTP1, GDAP1, TRIB3, NFIX, NR3C1, MSX2, TSPYL1, TSPYL2, ZNF703, ZC3H12A, XAF1, SYNPO, EXO1, CEBPB, CEBPG, FLT4, KIF18B, CENPF, COTL1, CDC25A, CYBA, IRF7, BAX, JAK2, APAF1, THEMIS2 XRCC3, TUBB2B, SNCA, EDN1, PDLIM3, GO:0005737~cytoplasm 4.72E-07 1.34525 7.19E-05 207 SART1, AQP3, ISG20, B2M, CITED2, HIST1H2BO, ACTG2, CDCA7, PIP5KL1,


HIST1H2BK, RAB29, SMOX, CDCA5, ARC, PIM1, ESPL1, IFI44, SPAG9, DCLRE1B, TAGLN, KRT17, ZWINT, STC1, MAD2L2, RTP4, NEK3, OAS3, UBA6, CDC42EP1, CRMP1, FBXO6, SKA3, AHNAK2, RECQL4, PLAT, ODC1, MKI67, LGALS3, CELSR2, S100A14, SEPT10, GAS6, RPS6KL1, APOL3, OASL, SYNE2, APOL6, KLF4, POP7, NUAK2, ELF5, TCOF1, UFC1, YBX2, BAG3, HIST3H2BB, FANCA, ERRFI1, ZFP36, CDC6, HERC6, PJA2, MAST4, BGN, CLIC3, FANCD2, RRM2, NUP205, CAND1, ABCD1, KIAA1147, NECAB1, ZFP36L1, DNALI1, EXOC5, PHLDA3, ABCA12, ZBP1, SSRP1, RTN4R, AFF3, BIRC5, STAT1, MT1X, TRIM21, SAMD9, MT2A, CAPG, TEX19, CHTF18, GAMT, DUSP9, DRAM1, TOB1, TP53INP1, AP1G1, CRABP2, KNTC1, APOBEC3H, MCM10, SLC7A5, FOXO6, CALB2, NLRC5, KRT80, ZNF185, MX2, BPNT1, SATB2, MYO6, SOCS2, DTL, RELB, PKIB, RMI2, TMSB10, BASP1, METTL7A, TACC3, ASCL2, RALGAPA2, SPATA18, HSPB8, SDCBP, TNFAIP3, HMGB2, FGFR4, BLM, LITAF, ASS1, CXXC5, RRM2B, CCL5, SESN1, MDK, SESN3, TUBB, PEG10, EIF3E, ADRA2A, S100P, PODXL, FSCN1, LMCD1, BRCA2, NR4A1, BRIP1, EVL, SHANK3, DNMBP, PSMB9, IKBKE, CORO1A, ATF4, PLK2, PSMC3, CYFIP2, SPTBN1, FABP5, HDAC6, KIF22, AEBP1, KYNU, TLN2, SULT2B1, IFI44L, PDCD4, ACOT7, CDC45, MCM7, TRIM3, RANBP1, MCM2, DHRS2, PANK3, SPATS2L, UBE2T, GSTP1, SHCBP1, GDAP1, NR3C1, TSPYL2, ZNF703, ZC3H12A, MTCL1, EXO1, NES, CEBPB, TONSL, FLT4, CENPF, KIF18B, COTL1, CDC25A, GDPD1, IFIT3, SLC17A5, IFIT1, IRF7, BAX, JAK2, THEMIS2 GO:0042555~MCM 2.47E-06 22.62446 2.82E-04 6 MCM7, TONSL, MCM2, MCM3, MCM4, MCM5 complex AEBP1, NRP1, SNCA, IGFBP6, EDN1, VGF, CXCL12, B2M, ACTG2, TNFRSF11B, METRN, APOA1, HIST1H2BK, HMOX1, SERPINE1, SEMA3B, TSKU, IL24, GAL, MMP13, RALGAPA2, CHGA, CD36, CPE, MELTF, TFPI, SEMA4C, COL1A2, SDCBP, STC1, TNFAIP2, GO:0005615~extracellular 9.45E-05 1.637628 0.008603 65 CTSH, GSTP1, HMGB1, HMGB2, TNF, ENPP1, space CSF1, C5, OAS3, CXCL8, IL32, CX3CL1, CCL5, TIMP3, LGALS3BP, COL7A1, MSLN, MTCL1, OLFM1, FN1, PLAT, COL18A1, LGALS3, PODXL, SAAL1, LMCD1, HSPG2, GAS6, WNT7B, APOL1, DKK1, METRNL, IGFBP2, BMP7 AP1G1, CRABP2, SNCA, KNTC1, 
IQGAP3, PKMYT1, AURKB, SLC7A5, ITPKA, SART1, CALB2, HOOK1, NLRC5, ACTG2, PGPEP1, APOA1, ISG15, RAB29, SMOX, PRKACB, MX2, CDCA5, BPNT1, MATK, MYO6, SOCS2, CHAC1, RELB, ESPL1, SPAG9, RAB18, ZWINT, SDCBP, GO:0005829~cytosol 1.46E-04 1.351325 0.011074 132 TNFAIP3, MYL7, SRM, ASS1, OAS3, UBA7, ACP5, UBA6, IL32, SESN1, IFI35, TK1, CMPK2, CRMP1, EIF3E, FBXO6, HNRNPD, GALE, TUBB4B, ODC1, GABARAPL1, FSCN1, HGD, UBE2L6, EVL, PSMB9, EIF4B, IKBKE, OASL, CORO1A, DHFR, PLK2, PSMC3, CYFIP2,


SPTBN1, AHCYL2, IDI1, FABP5, HDAC6, GNAZ, IER3, KIF22, KYNU, PTGES2, NRP1, SULT2B1, RHOQ, PDCD4, KCNIP3, CDT1, ACOT7, USP18, PLCB4, MCM7, BAG3, HMOX1, RHOD, ERRFI1, ZFP36, CDC6, HERC6, BTG2, RRM2, CTSH, EIF5A2, GSTP1, HAUS5, ABCD1, TRIB3, ZFP36L1, RASGRP1, CAMK2B, XAF1, EXOC5, PAPSS2, ZBP1, PHLDA1, ABCA12, CEP131, CENPM, SPSB1, MAT2A, NAT1, CENPF, BIRC5, STAT1, CDC25A, TRIM21, IFIT3, IFIT1, BAX, IRF7, MT2A, JAK2, GAMT, APAF1, DUSP9, CIT, TP53INP1 Molecular Function Fold Term p-value Benjamini Count Genes Enrichment ATP1B1, XRCC3, TUBB2B, SNCA, EDN1, PDLIM3, AURKB, SART1, CITED2, B2M, HOOK1, ISG15, RAB29, SERPINE1, H2AFX, RAB27B, CDCA5, CDCA4, H1F0, GTPBP3, PIM1, ESPL1, RND3, SPAG9, UHRF1, CD36, DCLRE1B, KRT17, MELTF, TAGLN, RAB18, KRT16, RFC2, ZWINT, MAD2L2, RTP4, SCN1B, NEK3, ENPP1, LMNB1, CCDC93, UBA7, OAS3, UBA6, ASB13, TAGLN2, MYBL2, KCNJ3, CD74, IFI35, TK1, ALCAM, CDC42EP1, CRMP1, NCAPG2, FBXO6, SKA3, AHNAK2, FCHO2, OLFM1, PLAT, RECQL4, ODC1, MKI67, LGALS3, FAM111B, KLF11, UBE2L6, HGD, S100A14, SEPT10, RAD54L, GAS6, SYNE2, APOL1, TRPS1, NPW, BMP7, KLF4, POP7, KCNJ15, NRP1, NUAK2, ELF4, ZMAT3, TCOF1, UFC1, CDT1, KCNIP3, USP18, BAG3, NEURL1B, ERRFI1, FANCA, ZFP36, CDC6, ARHGEF19, TLE3, CCNC, ACKR3, MMP15, RBBP8, CLIC3, FANCD2, RRM2, NUP205, SEMA4C, CAND1, NCAPH2, KCNH2, EIF5A2, RASD1, USP2, CSF1, ABCD1, C5, CXCL8, FHL2, ZFP36L1, DNALI1, COL7A1, GO:0005515~protein EXOC5, CD24, ZBP1, ABCA12, TRIP13, 6.35E-10 1.25583 4.75E-07 332 binding PHLDA1, SSRP1, CEP131, LMX1B, SPSB1, MAT2A, HSPG2, AFF4, RTN4R, FZD1, BIRC5, STAT1, TRIM21, NAT9, ITPR1, MT1X, ATRX, DKK1, MT2A, SAMD9, CAPG, CHTF18, SECISBP2L, CIT, DUSP9, PXYLP1, DRAM1, TP53INP1, TOB1, LMO2, AP1G1, WFS1, CRABP2, KNTC1, CBX4, PKMYT1, CBX2, APOBEC3H, CBX8, MCM10, FOXO6, NRCAM, NLRC5, KRT80, APOA1, PRKACB, MX2, MATK, SATB2, MYO6, SOCS2, CHAC1, DTL, USP1, RBL1, RELB, HLA-A, TMSB10, FIBCD1, IL24, BASP1, TACC3, GNS, ASCL1, SPATA18, HSPB8, COL1A2, SDCBP, TNFAIP3, HMGB1, MYL7, FGFR4, HMGB2, SLC38A2, FGFR3, SRM, ASS1, 
LITAF, BLM, TFAP4, IFITM3, ITGB4, ITGB5, IL32, CX3CL1, CXXC5, RRM2B, CCNG1, CCL5, SESN1, TIMP3, PEG10, TUBB, ECE1, POLE2, EIF3E, MSLN, ADRA2A, HNRNPD, ASF1B, PLXND1, TMEM30A, FN1, GABARAPL1, S100P, PODXL, FSCN1, SUV39H1, BRCA2, NR4A1, BRIP1, EVL, SELENOM, SHANK3, DNMBP, PSMB9, EIF4B, POLD3, IKBKE, ATF4, CORO1A, PLK2, PSMC3, SLC7A1, POLD2, CYFIP2,


SPTBN1, SCARA3, AHCYL2, ANTXR1, IFI6, FABP5, SEL1L, HDAC6, E2F1, E2F2, KIF22, IER3, FOSL2, PTGES2, HELZ2, TLN2, E2F8, SULT2B1, BCAM, HNRNPLL, SDC4, PDCD4, PHC3, CDC45, ACOT7, TRIM3, PLCB4, MCM7, HMOX1, TUBA1B, ARID5A, MCM2, MCM3, GAL, MCM4, MCM5, DHRS2, BTG2, NRM, SGF29, CTSH, GSTP1, SHCBP1, CALCR, CKS1B, TNF, FKBP5, TRIB3, RSAD2, NFIX, NR3C1, MSX2, TSPYL2, ZNF703, TAP1, ZC3H12A, CAMK2B, SCNN1B, SYNPO, EXO1, CEBPB, SLC12A2, TONSL, FLT4, CEBPG, CENPF, KIF18B, TSPAN15, COTL1, CDC25A, IFIT3, CYBA, IFIT1, BAX, IRF7, JAK2, APAF1, IGFBP2, THEMIS2, IGFBP5
GO:0005524~ATP binding | 1.20E-05 | 1.667073 | 0.004479 | 75 | KIFC1, KIF22, ATP1B1, XRCC3, NUAK2, HELZ2, PKMYT1, HELZ, AURKB, ITPKA, ACTG2, NLRC5, MCM7, PIP5KL1, ATP8B2, PRKACB, MATK, CDC6, MYO6, TRPM7, LIG1, PIM1, MCM2, MCM3, MCM4, MCM5, MAST4, KIF1A, PANK3, RFC2, SLC27A2, UBE2T, FGFR4, FGFR3, NEK3, ASS1, BLM, ENPP1, ABCD1, UBA7, OAS3, TRIB3, UBA6, CMPK2, TK1, TAP1, DDX60L, CAMK2B, AGK, PAPSS2, ENTPD2, TRIP13, ABCA12, RECQL4, MKI67, MAT2A, FLT4, ATAD2, UBE2L6, BRIP1, KIF18B, RAD54L, ABCG1, RPS6KL1, ATRX, IKBKE, OASL, PLK2, PSMC3, CHTF18, JAK2, APAF1, CLCN6, CIT, ABCC5

KEGG Pathway
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
hsa03030:DNA replication | 3.57E-06 | 7.674029 | 8.31E-04 | 10 | POLD3, MCM7, POLE2, RFC2, LIG1, POLD2, MCM2, MCM3, MCM4, MCM5


Appendix Table 3.7. Enriched GO terms and KEGG pathways in PC3 in response to culture medium changes at 5% O2.

Biological Process
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
GO:0043065~positive regulation of apoptotic process | 3.37E-05 | 3.887037 | 0.047033 | 15 | TXNIP, ARHGEF3, LDHA, EEF1A2, BNIP3, NTSR1, PLEKHG2, ADM, DUSP1, MAP3K9, RASSF2, TGM2, BCL6, FAM162A, GADD45B

KEGG Pathway
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
hsa05134:Legionellosis | 8.00E-06 | 10.61574 | 0.001439 | 8 | CXCL1, CXCL3, EEF1A2, CXCL2, CXCL8, NFKBIA, BNIP3, HSPA1A

Appendix Table 3.8. Enriched GO terms and KEGG pathways in PC3 in response to culture medium changes at 18% O2.

Biological Process
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
GO:0070059~intrinsic apoptotic signaling pathway in response to endoplasmic reticulum stress | 7.51E-06 | 8.449514 | 0.010718 | 9 | ATF4, CEBPB, XBP1, CHAC1, BBC3, ERN1, TRIB3, PPP1R15A, ITPR1
GO:0060337~type I interferon signaling pathway | 5.55E-06 | 5.809041 | 0.015792 | 12 | EGR1, IFI27, ISG15, BST2, IRF7, IFITM3, XAF1, MX2, IFI35, IFI6, HLA-F, IFNAR1
GO:0045766~positive regulation of angiogenesis | 1.97E-05 | 4.041072 | 0.018673 | 15 | WNT5A, SASH1, C5, CXCL8, GATA2, XBP1, GATA6, HMOX1, VEGFA, RHOB, ZC3H12A, IL1B, HSPB1, ADM2, DDAH1
GO:0016264~gap junction assembly | 6.73E-05 | 19.36347 | 0.027197 | 5 | PKP2, GJA1, CTNNA1, GJA5, GJC1
GO:0071456~cellular response to hypoxia | 5.89E-05 | 4.195418 | 0.027785 | 13 | FMN2, EIF4EBP1, STC2, GATA6, BBC3, HMOX1, VEGFA, MST1, TP53, BNIP3, NDRG1, BMP7, SLC9A1
GO:0001666~response to hypoxia | 4.19E-05 | 3.242255 | 0.029574 | 18 | EGR1, LDHA, NF1, BNIP3, PDLIM1, CBFA2T3, ITPR1, DDIT4, ASCL2, CYBA, EP300, PLOD1, HMOX1, VEGFA, ABAT, CAT, DPP4, MB
GO:0030336~negative regulation of cell migration | 5.31E-05 | 4.239581 | 0.030003 | 13 | PTPRJ, RAP2A, NF2, BST2, ROBO1, PKP2, NF1, RHOB, ABHD2, DPYSL3, IL24, LDLRAD4, IGFBP5

Cellular Component
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
GO:0005737~cytoplasm | 6.96E-08 | 1.359205 | 3.21E-05 | 222 | MEF2C, KIFC2, MEF2A, LDHA, XRCC2, PGAM1, PDLIM1, PIP5KL1, TRAK1, AIF1L, PLS1, PLS3, OPA1, TNIK, PIM1, SKP2, ZHX2, KRT10, PIM3, IFI44, UBR1, SPAG9, KRT17, TAGLN, SIPA1L1, ZWINT, VEGFA, PGM1, ROR1, PDE4DIP, UNC13B, ADD3, C12ORF57, PLCXD3, UBA6, NFKBIA, IGF2BP3, DENND2D, UBASH3B, WDHD1, SMAD9, MKI67, KLF12, CELSR2, TPD52L1, S100A14, CDC27, DHX40, RBPJ, SLC9A1, PLCXD1, FOXA2, BNIP3, RLIM, RNF182, YBX2, ANKRD17, SMAP2, AGAP1, ALS2CL, LRRC37B, ZFP36, ATF7IP, SLC3A2, TP53, ARHGEF12, FLNC, PFDN2, MAST2,


SERPINB5, HIST2H2BE, RRM2, RRM1, GADD45A, FUT8, POLA1, MTMR2, NCAPG, XBP1, BUB1, AGO1, AGO2, PRKAA2, AGL, TRIP12, DLGAP5, WWTR1, CDKN3, MID1, RPS6KA3, NUPR1, RPS6KA1, PYGL, SVIL, MT2A, PSAT1, DUSP8, SLC7A5, KANK2, EIF4EBP1, PRKAR2A, EIF4EBP2, CDKN2B, ROBO1, AAK1, STK39, MX2, FRS2, BST2, STMN3, SOCS3, EFTUD2, CHP1, SOCS5, TACC1, DDIT4, ASCL2, TNS3, MIB1, EP300, ARRB1, HSPB8, HSPB1, RAD18, CRACR2B, SEPT9, HMGB2, ASS1, NEDD9, CXXC5, BICC1, SESN2, MDK, LLGL2, GALM, PTK2, CSE1L, DDX3X, TFF3, NDRG1, SLC30A7, PARD6B, S100P, ZMYM4, MYO1B, LPP, IREB2, SACS, DOCK5, GCN1, RIMKLA, PLEKHA4, CORO1C, ATF4, MYO10, PPIG, PLK2, CIRBP, TRIP6, PLEKHA2, KYNU, FIGNL1, PREX1, ANO1, SULT2B1, SPRY4, FRMD3, TRIM2, DDX60, EGR1, PFKL, ESR1, PFKP, EML6, EML4, DHRS2, ARHGAP33, PSME1, RIF1, PPM1H, ERN1, ZFPM1, SPATS2L, KPNA3, KPNA2, PPP1R15A, PRKD3, KPNA1, IRX3, CNN3, USP9X, UPP1, SPOCK1, PALMD, NDC1, TSC22D3, ZNF703, PAFAH1B3, MSI2, ZC3H12A, ETNK1, MTCL1, KDM3A, KLHL42, PAFAH1B2, NEDD4L, NEFL, EXO1, NES, CEBPB, NF2, NF1, CENPF, CENPE, RCAN1, DPYSL3, RCAN3, CENPI, SMC4, SH3BGRL, IRF7, GDF15 MEF2C, LDHA, CAPZA1, SNRPD1, PGAM1, VPS53, SLC7A5, EIF4EBP1, PRKAR2A, ISG15, CDKN2B, ANK3, PIK3CA, RBCK1, IL1B, STK39, PRKACB, MX2, DDAH1, STAG2, C2CD2, CHAC1, SOCS3, SKP2, CHP1, UBR1, CTNNA1, GCC2, DDIT4, SPAG9, MIB1, SGO2, TACSTD2, ARRB1, ZWINT, PGM1, HSPB1, UNC13B, ADD3, PRPS1, ASS1, PFKFB3, NFKBIA, UBA6, IL32, IGF2BP3, SESN2, EPHB3, MYO9A, IFI35, PTK2, CSE1L, GYS1, NDRG1, TUBB4A, PARD6B, SMAD9, IREB2, CDC27, KIF3C, LCN2, RERG, MYO10, CDKN1A, GBE1, PLK2, BBC3, MYH14, KYNU, PREX1, SULT2B1, GJA1, RLIM, FOS, MKLN1, PLCB4, GO:0005829~cytosol 2.15E-07 1.475628 4.96E-05 153 HMOX1, PRMT6, RHOB, ARHGAP11A, RANBP2, CAT, EFR3B, DHCR24, ZFP36, RAP2A, ARHGEF3, CCDC88A, PFKL, ARHGEF6, PFKP, TP53, IPO8, ARHGEF12, FLNC, PGM2L1, FMN2, ARHGAP31, ARHGAP33, PSME1, RRM2, RRM1, PSME4, KPNA3, KPNA2, PPP1R15A, KPNA1, LIMS1, AP1M2, USP9X, UPP1, HK1, TRIB3, ASNS, TPM2, MTMR2, TSC22D3, NCAPG, XBP1, PAFAH1B3, AGO1, BUB1, ETNK1, 
AGO2, ABCD3, PRKAA2, PAFAH1B2, XAF1, NEDD4L, TBC1D1, TNRC6B, PAPSS2, NEFL, AGL, TRIP12, PDS5B, NF1, CENPF, CENPE, DPYSL3, LARS2, WWTR1, MID1, CENPI, SMC4, AP2A2, RPS6KA3, RPS6KA1, PYGL, IRF7, MT2A, PSAT1, CIT LDHA, PGAM1, VPS53, SLC7A5, PRKAR2A, PIP5KL1, LRRC59, SPRED3, GNG2, FRS2, STAG2, OSBP2, PLD1, OPA1, BST2, EFTUD2, ERP29, KRT10, MED14, FIBCD1, TACC1, GCC2, HLA-F, GO:0016020~membrane 1.67E-06 1.569531 2.56E-04 108 TACSTD2, MLEC, VEGFA, LCLAT1, ADD3, UNC13B, SUCO, GCNT2, PANX1, CERS6, IL32, SFXN1, LGALS3BP, CSE1L, GYS1, LFNG, SLC30A7, MKI67, SYT12, NIPA1, GCN1, RERG,


PLEKHA4, SLC7A2, SLC7A1, MYH14, PLEKHA2, ACOX1, LDLR, CLSTN3, HELZ, SPRY4, ANKRD17, FOS, P4HA1, HMOX1, RANBP2, CAT, AGAP1, DENND5B, DOCK10, AGPAT3, DPP4, DHCR24, RAP2A, CCDC88A, MAN1A2, PFKL, MPP2, PFKP, ESR1, SLC3A2, ARHGEF12, EML4, GRAMD1B, KPNA2, PPP1R15A, GALNT7, FUT8, USP9X, PALMD, PLPP2, ANXA6, NDC1, NCAPG, BUB1, PAFAH1B3, ETNK1, AGO2, SEC22B, ABCD3, KDM3A, SLC12A2, NF2, NF1, FADS3, CENPE, NUP155, ITPR1, CYBA, TFRC, KREMEN2, ADGRL1, CIT, SLC15A3 LIMS1, CNN3, GNA12, NEDD9, GJA1, PDLIM1, SPRY4, ANXA6, PRKAR2A, PTK2, AIF1L, RHOB, GO:0005925~focal 1.07E-04 2.28955 0.009836 28 CAT, DPP4, ADAM9, LPP, CHP1, FLNC, CTNNA1, adhesion CORO1C, TNS3, CYBA, ARHGAP31, ITGA5, SVIL, HSPB1, TRIP6, SLC9A1 LDHA, CAPZA1, PGAM1, KIAA1324, SLC7A5, COX5B, PRKAR2A, APOD, PLOD1, PLS1, AIF1L, IL1B, GNG2, PRKACB, DDAH1, ADAM9, MB, PTPRJ, TNIK, BST2, ERP29, PTPRS, CHP1, KRT10, H2AFJ, NEBL, SPAG9, KRT17, TACSTD2, ST14, PGM1, COL1A2, ABAT, HSPB1, LAMC1, MGAT5, UGGT1, WNT5A, ASS1, IFITM3, MST1, GALM, LGALS3BP, DDX3X, CSE1L, TFF2, TFF3, NDRG1, TUBB4A, SLC30A7, PARD6B, S100P, MYO1B, AK4, S100A14, LCN2, GBE1, PI3, MYH14, PCSK1N, GO:0070062~extracellular MUC5B, SLC9A1, PXDN, TM7SF3, GM2A, CLSTN3, 1.04E-04 1.387611 0.011958 122 exosome FIGNL1, GREB1, ANO1, SULT2B1, GJA1, GPC4, EFHD1, RHOB, SEMA3C, HIST3H2A, CAT, DOCK10, DPP4, PPP2R1B, RAP2A, PFKL, MAN1A2, SLC3A2, PFKP, ARHGEF12, DHRS2, PFDN2, PSME1, SERPINB5, HIST2H2BE, RRM1, SLPI, SLC27A2, HIST1H2AC, CPM, GALNT7, FUT8, C5, ALDH3A2, NDUFB1, ANXA6, MTMR2, COL6A3, PAFAH1B3, HS6ST2, NEDD4L, PAFAH1B2, ECI1, COX7A2, SLC12A2, TMC5, NID1, PCK2, SH3BGRL, TFRC, PYGL, GFRA1, METRNL, IGFBP2, GDF15, PSAT1 Molecular Function Fold Term p-value Benjamini Count Genes Enrichment HMGB2, KAT2B, CEBPB, FOXA2, CEBPG, TAF9B, GO:0008134~transcription TP53, ESR1, PIM1, NFKBIA, CENPF, CXXC5, 4.80E-05 2.627189 0.019848 24 factor binding GATA2, FOS, EP300, DDX3X, HES4, GATA6, ARRB1, ID1, GATA3, PIAS2, ZFPM1, RBPJ MEF2C, LDHA, MEF2A, SLC9A7, XRCC2, PGAM1, VPS53, ISG15, 
APOD, TRAK1, PIK3CA, STAG2, S100A2, ADAM9, PTPRJ, GTPBP2, PLD1, OSBP2, OPA1, TNIK, PIM1, ZHX2, SKP2, PTPRS, ZHX3, PIM3, MED14, CTNNA1, GCC2, SPAG9, KRT17, TAGLN, SGO2, ZNF783, TACSTD2, ZWINT, VEGFA, PGM1, ROR1, PDE4DIP, PIAS2, UNC13B, GO:0005515~protein 3.33E-05 1.164267 0.027468 329 TAF9B, MST1, NFKBIA, UBA6, IGF2BP3, binding DENND2D, MYO9A, IFI35, UHMK1, UBASH3B, GYS1, WDHD1, OLFM1, TUBB4A, SMAD9, MKI67, KLF12, FAM111B, TPD52L1, S100A14, CDC27, PKP2, ICE1, POP1, AREG, RBPJ, BMP7, SLC9A1, CEP104, LDLR, GJA1, BNIP3, RNF182, ANKRD17, SMAP2, P4HA1, DPP4, ATF7IP, ZFP36, ARHGEF3, RAP2A, EMSY, HIST1H1C, ARHGEF6, TP53,


SLC3A2, ACKR3, FLNC, ARHGEF12, PFDN2, MAST2, SERPINB5, RRM2, RRM1, ESRP1, G0S2, GADD45A, RASD1, APOOL, LIMS1, CXCL2, C5, POLA1, CXCL8, HK1, MTMR2, XBP1, NCAPG, BUB1, AGO1, ABCD3, AGO2, ETV1, PRKAA2, TBC1D1, TRIP12, AGL, BCL9, PDS5B, PDK3, DLGAP5, WWTR1, MID1, CDKN3, PCK2, ITPR1, ATRX, RLF, RPS6KA3, AP2A2, DUSP2, RPS6KA1, TFRC, PYGL, SLC16A9, SVIL, MAPK15, MT2A, SH3RF3, TMTC3, ADGRL1, CIT, JDP2, FAM189B, GNA12, F2RL1, CAPZA1, SNRPD1, COX5B, KANK2, NRCAM, GATA2, EIF4EBP1, PRKAR2A, MUTYH, EIF4EBP2, CDKN2B, ROBO1, GATA6, ANK3, AAK1, GATA3, FBXO28, RBCK1, STK39, PRKACB, FRS2, MX2, BST2, CHAC1, SOCS3, EFTUD2, CHP1, FIBCD1, IL24, SOCS5, NTSR1, TACC1, MOXD1, NEBL, ASCL1, MIB1, TNS3, EP300, ARRB1, HSPB8, COL1A2, HSPB1, LCLAT1, RAD18, PTGFRN, SEPT6, SMARCA1, UGGT1, CRACR2B, PRPS1, SEPT9, WNT5A, HMGB2, ASS1, TNFRSF12A, IFITM3, NEDD9, IL32, ABCA1, CXXC5, SESN2, LLGL2, PTK2, TOMM7, CSE1L, DDX3X, P2RY1, TFF2, TFF3, NDRG1, SSX2IP, TFF1, PARD6B, S100P, ZMYM4, NIN, LPP, HENMT1, IREB2, DOCK5, CORO1C, ATF4, CDKN1A, MYO10, ATF3, PPIG, PLK2, BBC3, ITGA5, SLC7A1, FAAH, CIRBP, SCARA3, TRIP6, IFI6, PLEKHA2, MUC5B, RSF1, FIGNL1, PREX1, ANO1, SULT2B1, CBFA2T3, SPRY4, ARHGAP21, FOS, TRIM2, PLCB4, HMOX1, DDX60, RTF1, PRMT6, RHOB, RANBP2, EFR3B, EGR1, PPP2R1B, PFKL, RAB3IL1, ESR1, IPO8, GAL, IFNAR1, MCM6, DHRS2, ARHGAP33, KIF1B, PSME1, ERN1, SLPI, PSME4, ZFPM1, KPNA3, KPNA2, PPP1R15A, PRKD3, KPNA1, AP1M2, USP9X, TRIB3, ASNS, PLPP2, ANXA6, C1QTNF6, ZNF703, SLC35B4, PAFAH1B3, MSI2, SEC22B, ZC3H12A, ETNK1, SCG5, KLHL42, PAFAH1B2, NEDD4L, FBN2, TNRC6B, NEFL, FGFBP1, EXO1, CEBPB, KAT2B, NF2, SLC12A2, CEBPG, NF1, CENPF, DPYSL3, CENPE, RCAN3, NUP155, CENPI, SMC4, CYBA, ANXA10, ID1, IRF7, IGFBP2, GDF15, IGFBP5, F2R MEF2C, HIST1H2AC, MEF2A, JDP2, PANX1, TAF9B, POLA1, BNIP3, FOS, XBP1, P2RY1, GO:0046982~protein PAFAH1B3, HIST3H2A, PAFAH1B2, NEFL, CEBPB, 1.07E-04 2.139417 0.029478 32 heterodimerization activity CEBPG, TP53, ZHX2, ZHX3, TPD52L1, H2AFJ, NTSR1, MID1, CTNNA1, ABCG1, SMC4, CYBA, ATF4, ATF3, 
HIST2H2BE, VEGFA


Appendix Table 3.9. DEGs shared between PC3 and MCF7 in response to oxygen level changes in DMEM.

Comparison | Overlap size | Elements
MCF7 5P vs PC3 5P | 3 | plakophilin 1 (PKP1); polo like kinase 1 (PLK1); family with sequence similarity 83 member D (FAM83D)
MCF7 5P vs PC3 18P | 17 | heme oxygenase 1 (HMOX1); S100 calcium binding protein P (S100P); tribbles pseudokinase 3 (TRIB3); histone cluster 1 H3 family member h (HIST1H3H); DNA damage inducible transcript 4 (DDIT4); chromobox 4 (CBX4); ATP binding cassette subfamily G member 1 (ABCG1); eukaryotic translation initiation factor 1 (EIF1); seryl-tRNA synthetase (SARS); eukaryotic translation initiation factor 4E binding protein 1 (EIF4EBP1); nuclear protein 1, transcriptional regulator (NUPR1); histone cluster 1 H2B family member d (HIST1H2BD); vascular endothelial growth factor A (VEGFA); histone cluster 1 H1 family member c (HIST1H1C); CCAAT/enhancer binding protein beta (CEBPB); endoplasmic reticulum to nucleus signaling 1 (ERN1); dehydrogenase/reductase 2 (DHRS2)
MCF7 18P vs PC3 5P | 27 | fumarate hydratase (FH); MYB proto-oncogene like 1 (MYBL1); thymidine kinase 1 (TK1); syndecan 1 (SDC1); solute carrier family 6 member 6 (SLC6A6); ribonucleotide reductase regulatory subunit M2 (RRM2); anti-silencing function 1B histone chaperone (ASF1B); neurogranin (NRGN); E2F transcription factor 1 (E2F1); nuclear receptor interacting protein 1 (NRIP1); cell division cycle 6 (CDC6); minichromosome maintenance complex component 2 (MCM2); cyclin dependent kinase 2 (CDK2); ribonucleotide reductase catalytic subunit M1 (RRM1); annexin A6 (ANXA6); proliferating cell nuclear antigen (PCNA); minichromosome maintenance complex component 6 (MCM6); nestin (NES); transcription factor 19 (TCF19); thymidylate synthetase (TYMS); minichromosome maintenance complex component 5 (MCM5); ubiquitin like with PHD and ring finger domains 1 (UHRF1); family with sequence similarity 111 member B (FAM111B); thyroid hormone receptor interactor 13 (TRIP13); EFR3 homolog B (EFR3B); heat shock protein 90 alpha family class A member 1 (HSP90AA1); cell division cycle 25A (CDC25A)
MCF7 18P vs PC3 18P | 42 | AC044784.1; AC093323.1; atypical chemokine receptor 3 (ACKR3); aquaporin 3 (Gill blood group) (AQP3); amphiregulin (AREG); achaete-scute family bHLH transcription factor 1 (ASCL1); BCL2, apoptosis regulator (BCL2); cadherin EGF LAG seven-pass G-type receptor 2 (CELSR2); chromogranin A (CHGA); C-X-C motif chemokine ligand 8 (CXCL8); cytochrome b-245 alpha chain (CYBA); DexD/H-box helicase 60 (DDX60); estrogen receptor 1 (ESR1); Enah/Vasp-like (EVL); Fos proto-oncogene, AP-1 transcription factor subunit (FOS); galanin and GMAP prepropeptide (GAL); GATA binding protein 3 (GATA3); GDNF family receptor alpha 1 (GFRA1); growth regulation by estrogen in breast cancer 1 (GREB1); heat shock protein family B (small) member 8 (HSPB8); interferon alpha inducible protein 6 (IFI6); interleukin 20 (IL20); iroquois homeobox 3 (IRX3); iroquois homeobox 5 (IRX5); potassium two pore domain channel subfamily K member 6 (KCNK6); keratin 17 (KRT17); LFNG O-fucosylpeptide 3-beta-N-acetylglucosaminyltransferase (LFNG); MX dynamin like GTPase 2 (MX2); MYB proto-oncogene, transcription factor (MYB); par-6 family cell polarity regulator beta (PARD6B); PDZ domain containing 1 (PDZK1); protein kinase (cAMP-dependent, catalytic) inhibitor beta (PKIB); prostate transmembrane protein, androgen induced 1 (PMEPA1); RAS like estrogen regulated growth inhibitor (RERG); solute carrier family 7 member 2 (SLC7A2); SLC9A3 regulator 1 (SLC9A3R1); sulfotransferase family 2B member 1 (SULT2B1); synaptotagmin 12 (SYT12); transcobalamin 1 (TCN1); thrombospondin 1 (THBS1); X-box binding protein 1 (XBP1); zinc finger protein 703 (ZNF703)


Appendix Table 3.10. DEGs shared between PC3 and MCF7 as the effect of different oxygen levels in Plasmax.

Comparison | Overlap size | Elements
MCF7 5D vs PC3 5D | 9 | BCL2 interacting protein 3(BNIP3), chromosome transmission fidelity factor 18(CHTF18), UDP-galactose-4-epimerase(GALE), high mobility group box 2(HMGB2), kinesin family member 18B(KIF18B), MAD2 mitotic arrest deficient-like 1 (yeast)(MAD2L1), peroxisome proliferator activated receptor gamma(PPARG), RELT tumor necrosis factor receptor(RELT), ubiquitin conjugating enzyme E2 C(UBE2C)
MCF7 5D vs PC3 18D | 2 | phosphoglycerate dehydrogenase(PHGDH), brain cytoplasmic RNA 1(BCYRN1)
MCF7 18D vs PC3 5D | 4 | serpin family E member 1(SERPINE1), golgin A8 family member A(GOLGA8A), neuropeptide Y receptor Y1(NPY1R), macrophage stimulating 1(MST1)
MCF7 18D vs PC3 18D | 3 | family with sequence similarity 214 member A(FAM214A), growth differentiation factor 15(GDF15), solute carrier family 12 member 2(SLC12A2)


Appendix Table 3.11. DEGs shared between PC3 and MCF7 in response to culture medium change at 5% O2.

Comparison | Overlap size | Elements
MCF7 5P vs PC3 5P | 15 | CCAAT/enhancer binding protein beta(CEBPB), ChaC glutathione specific gamma-glutamylcyclotransferase 1(CHAC1), dual specificity phosphatase 1(DUSP1), family with sequence similarity 129 member A(FAM129A), growth arrest and DNA damage inducible beta(GADD45B), growth differentiation factor 15(GDF15), interferon alpha inducible protein 6(IFI6), protein phosphatase 1 regulatory subunit 15A(PPP1R15A), S100 calcium binding protein P(S100P), serpin family E member 1(SERPINE1), sestrin 2(SESN2), solute carrier family 3 member 2(SLC3A2), synaptopodin(SYNPO), trefoil factor 3(TFF3), tribbles pseudokinase 3(TRIB3)
MCF7 5P vs PC3 5D | 1 | thioredoxin interacting protein(TXNIP)
MCF7 5D vs PC3 5P | 3 | calcium/calmodulin dependent protein kinase II inhibitor 1(CAMK2N1), four jointed box 1(FJX1), insulin like growth factor binding protein 5(IGFBP5)
MCF7 5D vs PC3 5D | 16 | adenylate kinase 4(AK4), BCL2 interacting protein 3(BNIP3), carbonic anhydrase 12(CA12), cellular retinoic acid binding protein 2(CRABP2), family with sequence similarity 162 member A(FAM162A), fibronectin type III domain containing 10(FNDC10), glycogen synthase 1(GYS1), hexokinase 2(HK2), hydroxyprostaglandin dehydrogenase 15-(NAD)(HPGD), potassium calcium-activated channel subfamily N member 4(KCNN4), lactate dehydrogenase A(LDHA), 6-phosphofructo-2-kinase/fructose-2,6-biphosphatase 3(PFKFB3), phosphofructokinase, platelet(PFKP), PTPRF interacting protein alpha 4(PPFIA4), solute carrier family 2 member 1(SLC2A1), solute carrier organic anion transporter family member 4A1(SLCO4A1)


Appendix Table 3.12. DEGs shared between PC3 and MCF7 in response to culture medium changes at 18% O2.

Comparison | Overlap size | Elements
MCF7 18P vs PC3 18P | 75 | AC093323.1, atypical chemokine receptor 3(ACKR3), adrenomedullin 2(ADM2), achaete-scute family bHLH transcription factor 1(ASCL1), achaete-scute family bHLH transcription factor 2(ASCL2), argininosuccinate synthase 1(ASS1), activating transcription factor 4(ATF4), bone morphogenetic protein 7(BMP7), chromosome 16 open reading frame 91(C16orf91), chromosome 19 open reading frame 48(C19orf48), CCN3, CCAAT/enhancer binding protein beta(CEBPB), CCAAT/enhancer binding protein gamma(CEBPG), cadherin EGF LAG seven-pass G-type receptor 2(CELSR2), ChaC glutathione specific gamma-glutamylcyclotransferase 1(CHAC1), chromogranin A(CHGA), collagen type I alpha 2 chain(COL1A2), C-X-C motif chemokine ligand 8(CXCL8), CXXC finger protein 5(CXXC5), cytochrome b-245 alpha chain(CYBA), dehydrogenase/reductase 2(DHRS2), docking protein 7(DOK7), fatty acid desaturase 3(FADS3), fibrinogen C domain containing 1(FIBCD1), galanin and GMAP prepropeptide(GAL), H1 histone family member X(H1FX), hes family bHLH transcription factor 4(HES4), major histocompatibility complex, class I, F(HLA-F), high mobility group box 2(HMGB2), heme oxygenase 1(HMOX1), heat shock protein family B (small) member 8(HSPB8), immediate early response 5 like(IER5L), interferon alpha inducible protein 27(IFI27), interferon induced protein 35(IFI35), interferon induced protein 44(IFI44), interferon alpha inducible protein 6(IFI6), interferon induced transmembrane protein 3(IFITM3), insulin like growth factor binding protein 2(IGFBP2), interleukin 24(IL24), interleukin 32(IL32), interferon regulatory factor 7(IRF7), ISG15 ubiquitin-like modifier(ISG15), keratin 17(KRT17), kynureninase(KYNU), LFNG O-fucosylpeptide 3-beta-N-acetylglucosaminyltransferase(LFNG), galectin 3 binding protein(LGALS3BP), lymphocyte antigen 6 complex, E(LY6E), midkine (neurite growth-promoting factor 2)(MDK), meteorin like, glial cell differentiation regulator(METRNL), metallothionein 2A(MT2A), MX dynamin like GTPase 2(MX2), NOTCH-regulated ankyrin repeat protein(NRARP), neuronal cell adhesion molecule(NRCAM), neurexophilin 4(NXPH4), olfactomedin 1(OLFM1), Pim-1 proto-oncogene, serine/threonine kinase(PIM1), phosphatidylinositol-4-phosphate 5-kinase like 1(PIP5KL1), RAB31, member RAS oncogene family(RAB31), ring finger protein 223(RNF223), S100 calcium binding protein A14(S100A14), S100 calcium binding protein P(S100P), solute carrier family 15 member 3(SLC15A3), solute carrier family 27 member 2(SLC27A2), solute carrier family 7 member 2(SLC7A2), solute carrier family 7 member 5(SLC7A5), sulfotransferase family 2B member 1(SULT2B1), synaptotagmin 12(SYT12), transgelin(TAGLN), tribbles pseudokinase 3(TRIB3), tsukushi, small leucine rich proteoglycan(TSKU), VGF nerve growth factor inducible(VGF), XIAP associated factor 1(XAF1), Y-box binding protein 2(YBX2), zinc finger CCCH-type containing 12A(ZC3H12A), zinc finger protein 703(ZNF703)
MCF7 18P vs PC3 18D | 15 | complement C5(C5), centromere protein F(CENPF), citron rho-interacting serine/threonine kinase(CIT), exonuclease 1(EXO1), family with sequence similarity 111 member B(FAM111B), mitogen-activated protein kinase kinase kinase 9(MAP3K9), marker of proliferation Ki-67(MKI67), microtubule crosslinking factor 1(MTCL1), nestin(NES), neurogranin(NRGN), 3’-phosphoadenosine 5’-phosphosulfate synthase 2(PAPSS2), ribonucleotide reductase regulatory subunit M2(RRM2), scavenger receptor class A member 3(SCARA3), spermatogenesis associated serine rich 2 like(SPATS2L), ZW10 interacting kinetochore protein(ZWINT)
MCF7 18D vs PC3 18P | 6 | ATP binding cassette subfamily G member 1(ABCG1), insulin like growth factor binding protein 5(IGFBP5), low density lipoprotein receptor class A domain containing 4(LDLRAD4), ras related dexamethasone induced 1(RASD1), small nucleolar RNA host gene 5(SNHG5), ZFP36 ring finger protein(ZFP36)
MCF7 18D vs PC3 18D | 11 | ATRX, chromatin remodeler(ATRX), CRYBG1, helicase with zinc finger(HELZ), inositol 1,4,5-trisphosphate receptor type 1(ITPR1), mannosidase alpha class 1A member 2(MAN1A2), phospholipase C beta 4(PLCB4), polo like kinase 2(PLK2), protein kinase cAMP-activated catalytic subunit beta(PRKACB), solute carrier family 12 member 2(SLC12A2), sperm associated antigen 9(SPAG9), ubiquitin like modifier activating enzyme 6(UBA6)


Appendix Table 3.13. Enriched GO terms and KEGG pathways in MCF7 in response to 18% O2 in DMEM and 5% O2 in Plasmax.

Biological Process
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
GO:0060337~type I interferon signaling pathway | 3.28E-08 | 7.465955 | 4.40E-05 | 14 | IFIT3, IFIT1, OASL, ISG15, IFITM3, IRF7, OAS3, RSAD2, XAF1, MX1, STAT1, MX2, IFI6, ADAR
GO:0009615~response to virus | 1.99E-08 | 5.584922 | 5.34E-05 | 18 | ODC1, IFIH1, ACTA2, IFITM3, OAS3, RSAD2, IFI44, IVNS1ABP, CXCL12, DDX58, IFIT3, OASL, IFIT1, IRF7, GATA3, MX1, MX2, ADAR
GO:0070059~intrinsic apoptotic signaling pathway in response to endoplasmic reticulum stress | 3.00E-07 | 10.34245 | 2.68E-04 | 10 | ATF4, CEBPB, XBP1, CHAC1, BAX, BCL2, ERN1, TRIB3, PPP1R15A, ITPR1
GO:0000122~negative regulation of transcription from RNA polymerase II promoter | 1.68E-05 | 2.038324 | 0.011193 | 43 | FRK, SNCA, EDN1, CBX4, TRIB3, CTCF, CBX2, CXXC5, AURKB, CBX8, CITED2, MSX2, XBP1, GATA3, PRMT6, BHLHE40, TPR, NFIL3, MYB, ZNF496, CEBPA, ZFP36, ZNF593, HIST1H1C, PKIG, KLF11, NR4A2, LMCD1, ESR1, SMAD3, RB1, STAT1, STAT3, NRIP1, ASCL2, ASCL1, ATF3, PLK1, IRF7, TRPS1, VEGFA, SMARCA2, KLF4
GO:0051607~defense response to virus | 2.78E-05 | 3.516433 | 0.012328 | 17 | IFITM3, OAS3, UNC93B1, RSAD2, IFI44L, STAT1, DDIT4, IFIT3, NLRC5, OASL, IFIT1, ISG15, IFIT5, BCL2, MX1, MX2, ADAR
GO:0045944~positive regulation of transcription from RNA polymerase II promoter | 2.34E-05 | 1.843929 | 0.012469 | 53 | AKNA, FOSL2, HELZ2, EDN1, FSTL3, CTCF, ZBTB38, CITED2, PGR, FOS, NLRC5, GATA3, SERPINE1, H2AFZ, MYB, SERTAD1, AR, ARHGEF2, SOX12, ESR1, CCNC, RB1, NRIP1, ASCL1, VEGFA, NFE2L1, SMARCA2, EHF, NR3C1, MYBL2, XBP1, ARMCX3, PPP3CA, CEBPA, CEBPB, CEBPG, MET, NR4A2, NR4A1, SMAD3, STAT1, STAT3, DLX3, DDX58, NRF1, ATF4, ATF3, IRF7, TRPS1, PBX1, BMP7, KLF4, SLC9A1
GO:0051301~cell division | 3.89E-05 | 2.535377 | 0.014783 | 26 | KIFC1, CKS1B, NEK3, NR3C1, CCNG1, FAM83D, NDE1, CDCA8, TPR, CDCA4, BOD1, ARHGEF2, CCNF, KIF18B, CENPF, BIRC5, CDC20, RB1, UBE2C, TACC3, SEPT10, RGS14, CDC25B, DCLRE1A, RCC2, SEPT9
GO:0045071~negative regulation of viral genome replication | 1.41E-04 | 6.826016 | 0.046233 | 8 | IFIT1, OASL, ISG15, IFITM3, OAS3, RSAD2, MX1, ADAR

Cellular Component
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
GO:0005737~cytoplasm | 8.52E-10 | 1.433815 | 3.55E-07 | 212 | SNCA, EDN1, ACBD6, CITED2, PIP5KL1, HIST1H2BK, HIST1H2BJ, PHTF1, FAS, FTL, ARC, YARS, MPDZ, AARS, DCDC2, KRT10, ESPL1, IFI44, ST13, KRT17, PARP14, VEGFA, SLC2A10, RTP4, NEK3, OAS3, AFAP1L2, DUSP12, HSPA1A, CTIF, CDC42EP1, FBXO6, C19ORF24, ODC1, MKI67, LGALS3, SMAD3, CDC20, S100A14, SEPT10, RPS6KL1, TNKS1BP1, APOL3, OASL, RGS3, APOL6, POP7, KLF4, SLC9A1, ADAR, MTRNR2L12, RNF187, LONP1, DYNLL1, ZNF146, FAM129A, HIST3H2BB, ERRFI1, SERTAD1, ZFP36, AR, ARHGEF2, ACTA2, HERC6, SLC3A2, GMPR, JMY, DOK3, BGN, NUP205, GADD45B, NXT1, TRIM14, EEA1, FEM1B, DNALI1, XBP1, BCL2, PPP3CA, MARS, SSRP1, GDI2, SHMT2, HIST1H2BD, TRIM26, AFF3, DGKH, BIRC5, STAT1, CAPN2, STAT3, MT1X, TRIM21, CAPN1, RAB30, NUPR1, SVIL, SAMD9, MT2A, CAPG, TEX19, CHTF18, PBX1, OGFR, PSAT1, DRAM1, HPGD, DUSP8, TP53INP1, STIL, STYX, SLC7A5, NLRC5, GSTM3, EIF4EBP1, MX1, MX2, BPNT1, RET, CARS, NUDT1, RBL2, SOCS3, DTX3L, PKIG, FBP1, BASP1, METTL7A, TACC3, DDIT4, ASCL2, MAD2L1BP, MNX1, SDCBP, NFE2L1, SEPT9, FGFR4, LITAF, ASS1, PNPT1, BEX2, RRM2B, CXXC5, SESN1, FTH1, SESN3, PEG10, TRIM68, SBK1, EIF3E, MFAP3L, S100P, MYO1B, LMCD1, NR4A2, NR4A1, REEP1, DDX58, ATF4, PLK2, PLK1, KCTD17, CPNE3, FABP6, IFI44L, FES, PDCD4, FAM83D, WARS, TPR, IMPDH2, KIF12, HSP90AA1, SARS, ESR1, ARHGAP29, UBE2C, PALLD, DHRS2, ERN1, INPP4B, SPATS2L, PPP1R15A, GSTP1, IRX3, FRK, PPP1R12B, KITLG, NR3C1, BRSK1, IVNS1ABP, SHQ1, TUBGCP3, SAPCD2, MSI2, RTN4RL2, EXO1, BOD1, CEBPB, CENPF, KIF18B, COTL1, RGS16, RGS14, CDC25B, SH3BP5, IFIT3, SH3BP4, IFIT1, IRF7, BAX, LTA4H
GO:0005829~cytosol | 2.25E-08 | 1.544822 | 4.68E-06 | 145 | MOCOS, STIL, STYX, SNCA, IQGAP3, AURKB, SLC7A5, NLRC5, EIF4EBP1, CDCA8, GSTM3, ISG15, VPS13A, PRKACB, FAS, MX1, MX2, BPNT1, NET1, FTL, YARS, CARS, NUDT1, CHAC1, SOCS3, DTX3L, AARS, FBP1, ESPL1, NPEPPS, GCC2, DDIT4, ST13, RENBP, RND1, RCC2, SDCBP, PACS1, IFIH1, ASS1, OAS3, UBA7, CTPS2, HSPA1A, DUSP12, EPHB3, RIOK2, SESN1, FTH1, CMPK2, TRIM68, EIF3E, FBXO6, TUBB4B, ODC1, GABARAPL1, OSBPL6, HGD, SMAD3, CDC20, DDX58, TNKS1BP1, OASL, PLK2, RGS3, PLK1, CPNE3, IDI1, FABP6, IER3, PTGES2, FES, RHOV, PDCD4, WARS, FOS, MCCC2, NDE1, USP18, DYNLL1, HMOX1, PRMT6, ARHGAP11A, ERRFI1, DUS1L, IMPDH2, RHOG, ZFP36, AR, ARHGEF2, HSP90AA1, SGK3, ACTA2, SARS, HERC6, ARHGAP29, MID1IP1, UBE2C, GMPR, ARHGAP26, ATP6V1A, INPP4B, PPP1R15A, AARS2, GSTP1, NXT1, PPP1R12B, TRIB3, EEA1, ASNS, STARD13, SHQ1, TUBGCP3, BEST1, XBP1, BCL2, PPP3CA, XAF1, PAPSS1, HSPA8, MARS, GDI2, CENPM, SPSB1, NAT1, TRIM26, CENPF, BIRC5, STAT1, CAPN2, STAT3, TRIM21, CDC25B, CAPN1, IFIT3, IFIT1, FYN, BAX, IRF7, MT2A, PHGDH, LTA4H, PSAT1, HPGD, TP53INP1


GO:0005634~nucleus | 1.16E-06 | 1.330533 | 1.61E-04 | 204 | KIFC1, SNCA, FSTL3, H1FX, AURKB, CITED2, NONO, PGR, ZNF777, CDCA8, HIST1H2BK, HIST1H2BJ, PHTF1, H2AFZ, H2AFX, FAS, CDCA4, H1F0, YARS, GTPBP3, ZNF48, DCDC2, KRT10, ESPL1, ZNF787, DCLRE1A, PARP12, RCC2, HES4, PARP14, NEK3, UBA7, DUSP12, TUBB4B, MKI67, LGALS3, KLF11, SMAD3, CDC20, ZNF524, ZNF628, DLX3, TNKS1BP1, PKP1, TRPS1, AREG, METTL16, POP7, ADAR, RNF187, ZBTB38, LONP1, USP18, DYNLL1, HIST2H2AC, ZNF146, SERTAD3, MYB, HIST3H2BB, ERRFI1, ZNF496, SERTAD1, ZFP36, AR, BATF2, HIST1H1C, HERC6, CCNF, SLC3A2, CCNC, JMY, GADD45B, RASD1, HIST1H2AC, EHF, FEM1B, XBP1, BCL2, TCEA1, PPP3CA, SSRP1, SHMT2, HIST1H2BD, AFF3, BIRC5, DGKH, STAT1, CAPN2, STAT3, MT1X, ZNF22, TRIM21, NUPR1, FYN, SVIL, MT2A, CAPG, PBX1, OGFR, DUSP8, TP53INP1, AKNA, STYX, CBX4, CBX2, CBX8, NLRC5, GSTM3, GATA3, ZNF445, MX1, MX2, NET1, ZNF593, NUDT1, RBL2, USP1, DTX3L, PKIG, PISD, BASP1, NPEPPS, ASCL2, ASCL1, MAD2L1BP, CPE, MNX1, SDCBP, NFE2L1, SMARCA2, IFIH1, FGFR3, ASS1, BEX2, SP110, SESN1, TIMP3, FTH1, CMPK2, SESN3, PEG10, TRIM68, CDYL2, EIF3E, MFAP3L, BHLHE40, S100P, LMCD1, NR4A2, NR4A1, NRF1, ATF4, ATF3, PLK1, CPNE3, IER3, PTGES2, FOSL2, CTCF, HNRNPLL, HIST2H4A, PDCD4, WARS, FOS, HMOX1, PRMT6, TPR, NFIL3, IMPDH2, HSP90AA1, ARID5A, ESR1, RB1, MID1IP1, PALLD, NRIP1, DHRS2, DDB2, GSTP1, IRX3, FRK, TRIB3, NR3C1, BRSK1, MSX2, CENPB, SAPCD2, XAF1, HSPA8, SYNPO, CEBPA, EXO1, CEBPB, CNTD2, CEBPG, KIF18B, CENPF, COTL1, RGS14, SH3BP4, IRF7, BAX, LTA4H, LSM10
GO:0005654~nucleoplasm | 4.84E-05 | 1.433518 | 0.005037 | 113 | FSTL3, CBX4, CBX2, AURKB, CBX8, NONO, PGR, CDCA8, EIF4EBP1, ISG15, HIST1H2BK, GATA3, HIST1H2BJ, H2AFX, PRKACB, H1F0, RBL2, USP1, DTX3L, DCLRE1A, SMARCA2, FGFR4, LITAF, UBA7, OAS3, HSPA1A, RRM2B, CXXC5, CCNG1, MYBL2, RIOK2, CMPK2, EIF3E, NR4A2, LMCD1, SMAD3, NR4A1, CDC20, S100A14, TNKS1BP1, NRF1, ATF4, ATF3, PKP1, RGS3, PLK1, TRPS1, KLF4, POP7, ADAR, SLC9A1, FOSL2, HELZ2, CTCF, RNF187, IFI44L, HIST2H4A, PDCD4, FOS, LONP1, PRMT6, TPR, HIST3H2BB, TSEN15, AR, HSP90AA1, ARID5A, SOX12, ESR1, CCNC, RB1, IRF2BP1, UBE2C, JMY, NRIP1, NUP205, DDB2, NXT1, SNX18, CKS1B, PPP1R12B, TRIB3, EHF, NR3C1, BRSK1, IVNS1ABP, FEM1B, SHQ1, XBP1, TCEA1, PPP3CA, HSPA8, EXO1, SSRP1, CENPM, HIST1H2BD, CEBPB, CEBPG, CENPF, AFF3, BIRC5, STAT1, STAT3, ZNF22, CDC25B, IRF7, CAPG, CHTF18, LSM10, LTA4H, PBX1, HPGD, TP53INP1
GO:0000790~nuclear chromatin | 1.23E-04 | 3.110897 | 0.010247 | 17 | H1F0, HIST1H2AC, AR, CEBPB, ESR1, SMAD3, CBX8, STAT1, STAT3, CITED2, NRIP1, HIST2H2AC, TRPS1, GATA3, H2AFX, SMARCA2, KLF4
GO:0000786~nucleosome | 3.21E-04 | 4.132937 | 0.022085 | 11 | H1F0, HIST1H2AC, HIST1H2BD, HIST1H2BK, HIST1H1C, HIST2H2AC, HIST1H2BJ, H2AFZ, H1FX, H2AFX, HIST2H4A
GO:0005819~spindle | 6.41E-04 | 3.502595 | 0.032865 | 12 | KIFC1, TUBGCP3, ARHGEF2, MAD2L1BP, PLK1, CENPF, BIRC5, CDC20, NR3C1, RB1, AURKB, RGS14
GO:0070062~extracellular exosome | 6.22E-04 | 1.356928 | 0.036363 | 108 | IL6ST, FAM20C, CXCL12, SLC7A5, GSTM3, SERPINE1, H2AFZ, H2AFX, ZNF445, FAS, PRKACB, BPNT1, FTL, COCH, NUDT1, RBL2, AARS, FBP1, KRT10, NPEPPS, BASP1, METTL7A, ST13, RENBP, KRT19, NME3, GGACT, CPE, KRT17, SDCBP, TWSG1, ASS1, IFITM3, ITGB5, RRM2B, TAGLN2, ACAT1, TIMP3, FTH1, EIF3E, TUBB4B, MGAT4A, S100P, LGALS3, MYO1B, HGD, S100A14, ADGRG1, P2RX4, NRF1, CD55, PKP1, FREM2, THSD4, CPNE3, ANTXR1, PDZK1, SLC9A1, NPNT, HIST2H4A, WARS, DYNLL1, HIST2H2AC, FAM129A, IMPDH2, RHOG, KIF12, HSP90AA1, ACTA2, SARS, SLC3A2, DHRS2, ATP6V1A, MTMR11, BGN, GSTP1, FRK, SNX18, HIST1H2AC, CSF1, EEA1, ANXA6, TGM1, RTN4RL2, THBS1, HSPA8, MARS, ECI1, SHMT2, GDI2, HIST1H2BD, SLC12A2, NPR3, PCK2, CAPN2, COTL1, CAPN1, SH3BP4, WNT7B, TOM1L2, BAX, CAPG, PHGDH, GFRA1, LTA4H, METRNL, PSAT1, HPGD
GO:0005739~mitochondrion | 0.001075 | 1.539019 | 0.048606 | 58 | MRPS34, PTGES2, MRPL41, CLPB, SNCA, TMEM143, WDR81, MCCC2, LONP1, DYNLL1, GPT2, NUDT1, GTPBP3, RPUSD1, AARS, PISD, PALLD, MRPS2, DDIT4, COX6C, ATP6V1A, COQ3, DHRS2, NME3, ERN1, PPP1R15A, OAT, AARS2, GSTP1, PNPT1, RSAD2, RSAD1, HSPA1A, CTPS2, RRM2B, ACAT1, FTH1, CMPK2, ANXA6, BCL2, XAF1, PPP3CA, MRPL58, AGK, ECI1, SHMT2, GABARAPL1, PCK2, CAPN1, SH3BP5, IFIT3, RAB32, FYN, BAX, METAP1D, AGR2, IFI6, SLC9A1

Molecular Function
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
GO:0005515~protein binding | 3.15E-07 | 1.214651 | 2.38E-04 | 311 | MOCOS, MRPL41, IL6ST, SNCA, FAM20C, EDN1, FSTL3, AURKB, ADORA1, CITED2, NONO, PGR, S1PR3, CDCA8, ISG15, SERPINE1, INSIG1, H2AFZ, VPS13A, H2AFX, FAS, CDCA4, FTL, GTPBP2, H1F0, YARS, GTPBP3, MPDZ, DCDC2, ESPL1, F7, GCC2, ST13, KRT19, ATP2C2, NME3, RND1, KRT17, RCC2, GGACT, VEGFA, MRPL48, RTP4, NEK3, ENPP1, UBA7, OAS3, DUSP12, HSPA1A, TAGLN2, MYBL2, CTIF, CDC42EP1, FBXO6, TBKBP1, ODC1, MKI67, LGALS3, OSBPL6, KLF11, HGD, SMAD3, CDC20, S100A14, ZNF524, SEPT10, SYNE3, CD55, PKP1, RGS3, TRPS1, AREG, BMP7, AGR2, PDZK1, KLF4, POP7, ADAR, SLC9A1, KCNJ15, LDLR, ZMAT3, RNF187, ZBTB38, MCCC2, LONP1, USP18, POMGNT2, DYNLL1, SERTAD3, NEURL1B, MYB, FAM129A, ZNF496, ERRFI1, SERTAD1, ZFP36, ARHGEF2, AR, BATF2, HIST1H1C, CCNF, SLC3A2, CCNC, JMY, DOK3, NUP205, GADD45B, RASD1, NXT1, MFSD5, CSF1, EEA1, FEM1B, STARD13, DNALI1, XBP1, BCL2, TCEA1, CD24, PPP3CA, THBS1, SSRP1, GDI2, SHMT2, SPSB1, TRIM26, BIRC5, STAT1, ANKRD40, PCK2, CAPN2, STAT3, TRIM21, ITPR1, MT1X, CAPN1, RAB32, FYN, TOM1L2, SVIL, MT2A, SAMD9, CAPG, CHTF18, OGFR, PBX1, PXYLP1, DRAM1, TP53INP1, STIL, CLPB, STYX, CBX4, CBX2, CBX8, NRCAM, NLRC5, EIF4EBP1, UNC5B, GATA3, PRKACB, MX1, MX2, COCH, CARS, ZNF593, RET, NUDT1, RBL2, CHAC1, SOCS3, DTX3L, USP1, FBP1, BASP1, TACC3, ASCL1, MAD2L1BP, SDCBP, NFE2L1, SMARCA2, SEPT9, PACS1, TWSG1, FGFR4, IFIH1, SLC38A2, FGFR3, ASS1, LITAF, TNFRSF12A, IFITM3, SNX7, PNPT1, ITGB5, BEX2, CTPS2, CXXC5, RRM2B, CCNG1, TIMP3, SESN1, FTH1, IKBIP, PEG10, CDYL2, EIF3E, MSLN, BHLHE40, MFAP3L, GABARAPL1, S100P, MET, NR4A2, NR4A1, REEP1, DDX58, P2RX4, ATF4, NRF1, ATF3, PLK2, PLK1, SLC7A1, KCTD17, GPATCH4, CPNE3, ANTXR1, IFI6, IER3, FOSL2, PTGES2, HELZ2, CTCF, HNRNPLL, RHOV, FES, HIST2H4A, PDCD4, LGR4, FAM83D, FOS, WARS, NDE1, HMOX1, PRMT6, TPR, NFIL3, RHOG, DCAF16, IMPDH2, HSP90AA1, SGK3, ARID5A, RAB3IL1, ESR1, IRF2BP1, RB1, MID1IP1, UBE2C, PALLD, SLC7A11, ARHGAP26, NRIP1, DHRS2, DDB2, ERN1, SGF29, INPP4B, PPP1R15A, GSTP1, SNX18, CKS1B, FRK, PPP1R12B, KITLG, TRIB3, RSAD2, ASNS, NR3C1, NAALADL2, SHQ1, MSX2, ANXA6, TUBGCP3, TGM1, MSI2, GPNMB, HSPA8, SYNPO, EXO1, CEBPA, CEBPB, SLC12A2, CEBPG, CENPF, KIF18B, COTL1, RGS16, RGS14, CDC25B, IFIT3, SH3BP5, SH3BP4, IFIT1, BAX, IRF7, LSM10, LTA4H, IGFBP5
GO:0001077~transcriptional activator activity, RNA polymerase II core promoter proximal region sequence-specific binding | 5.77E-06 | 3.198481 | 0.002179 | 22 | AKNA, CEBPA, AR, CEBPB, CEBPG, SOX12, ESR1, NR4A2, NR4A1, EHF, NR3C1, MYBL2, STAT3, PGR, DLX3, FOS, NRF1, ATF4, GATA3, PBX1, MYB, KLF4
GO:0046982~protein heterodimerization activity | 2.37E-04 | 2.139824 | 0.03518 | 29 | HIST1H2AC, HIST2H4A, ADORA1, FOS, HIST1H2BK, XBP1, BCL2, HIST2H2AC, HIST1H2BJ, H2AFZ, H2AFX, BHLHE40, PPP3CA, HIST3H2BB, CEBPB, HIST1H2BD, CEBPG, NR4A2, ADIPOR2, NR4A1, SMAD3, BIRC5, CAPN2, ATF4, ATF3, BAX, VEGFA, SDCBP, PBX1
GO:0000978~RNA polymerase II core promoter proximal region sequence-specific DNA binding | 3.03E-04 | 2.319615 | 0.037448 | 24 | AKNA, CEBPA, AR, FOSL2, CEBPB, BATF2, ESR1, SMAD3, EHF, CTCF, NR3C1, MYBL2, STAT1, STAT3, ASCL2, PGR, ASCL1, FOS, NRF1, ATF4, ATF3, H2AFZ, PBX1, MYB
GO:0044212~transcription regulatory region DNA binding | 1.68E-04 | 2.899519 | 0.041452 | 18 | CEBPA, AR, KLF11, SNCA, ARID5A, CBX4, SMAD3, CTCF, BASP1, STAT3, MSX2, FOS, ATF4, ATF3, XBP1, GATA3, SMARCA2, KLF4
GO:0000979~RNA polymerase II core promoter sequence-specific DNA binding | 2.28E-04 | 5.417522 | 0.042272 | 9 | NLRC5, FOS, CEBPB, IRF7, GATA3, H2AFZ, NR4A2, STAT1, NFIL3

KEGG Pathway
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
hsa04115:p53 signaling pathway | 5.54E-05 | 4.997292 | 0.01289 | 11 | ZMAT3, BAX, SERPINE1, DDB2, FAS, RRM2B, THBS1, GADD45B, CCNG1, SESN1, SESN3


Appendix Table 3.14. Enriched GO terms and KEGG pathways among DEGs in MCF7 between 5% O2 in Plasmax and 18% O2 in DMEM.

Biological Process | All DEGs | 5P | 18D
GO:0060337~type I interferon signaling pathway | 14 | 14 | -
GO:0009615~response to virus | 18 | 14 | -
GO:0070059~intrinsic apoptotic signaling pathway in response to endoplasmic reticulum stress | 10 | 4* | 6*
GO:0000122~negative regulation of transcription from RNA polymerase II promoter | 43 | 24* | 19*
GO:0051607~defense response to virus | 17 | 16 | -
GO:0045944~positive regulation of transcription from RNA polymerase II promoter | 53 | 29* | 24*
GO:0051301~cell division | 26 | 20 | -
GO:0045071~negative regulation of viral genome replication | 8 | 8 | -

Cellular Component | All DEGs | 5P | 18D
GO:0005737~cytoplasm | 212 | 133 | 79
GO:0005829~cytosol | 145 | 91 | 54*
GO:0005634~nucleus | 204 | 138 | -
GO:0005654~nucleoplasm | 113 | 71* | 42*
GO:0000790~nuclear chromatin | 17 | 8** | 9*
GO:0000786~nucleosome | 11 | 11 | -
GO:0005819~spindle | 12 | 10 | -
GO:0070062~extracellular exosome | 108 | - | 51
GO:0005739~mitochondrion | 58 | 37* | 21**

Molecular Function | All DEGs | 5P | 18D
GO:0005515~protein binding | 311 | 191* | 120
GO:0001077~transcriptional activator activity, RNA polymerase II core promoter proximal region sequence-specific binding | 22 | 10* | 12
GO:0046982~protein heterodimerization activity | 29 | 16* | 13*
GO:0000978~RNA polymerase II core promoter proximal region sequence-specific DNA binding | 24 | 14* | 10*
GO:0044212~transcription regulatory region DNA binding | 18 | 9* | 9*
GO:0000979~RNA polymerase II core promoter sequence-specific DNA binding | 9 | 6* | -
GO:0042802~identical protein binding | 38* | - | 21
GO:0003707~steroid hormone receptor activity | 6* | - | 6

KEGG pathway | All DEGs | 5P | 18D
hsa04115:p53 signaling pathway | 11 | - | 9

Abbreviations: 5P, 5% O2 in Plasmax; 18D, 18% O2 in DMEM. The most significantly enriched GO terms and KEGG pathways were determined by p-value and Benjamini-corrected p-value ≤ 0.05 unless indicated otherwise; * significantly enriched based on p-value (≤ 0.05); ** not significantly enriched; (-) no result.
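The significance marks used in these tables can be illustrated with a minimal sketch. It assumes that the "Benjamini" column is a Benjamini-Hochberg adjusted p-value computed over the terms tested together in one category; the function names and the five raw p-values below are hypothetical and are not taken from the thesis data.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Rank p-values from smallest to largest, remembering original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that each adjusted value is the
    # minimum of p * m / rank over this rank and all larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def footnote_mark(p_value, adjusted_p, alpha=0.05):
    """Map one term to the table marks: '' (both pass), '*' or '**'."""
    if p_value <= alpha and adjusted_p <= alpha:
        return ""    # significant by p-value and by corrected p-value
    if p_value <= alpha:
        return "*"   # significant by raw p-value only
    return "**"      # not significantly enriched


# Hypothetical raw p-values for five terms (illustrative only).
raw = [0.001, 0.008, 0.039, 0.041, 0.20]
adj = benjamini_hochberg(raw)
marks = [footnote_mark(p, q) for p, q in zip(raw, adj)]
print(list(zip(raw, [round(q, 5) for q in adj], marks)))
```

In this toy input the third and fourth terms pass the raw threshold but not the corrected one, which corresponds to the single-asterisk entries in the tables.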


Appendix Table 3.15. Enriched GO terms and KEGG pathways in PC3 in response to 18% O2 in DMEM and 5% O2 in Plasmax.

Cellular Component
Term | p-value | Fold Enrichment | Benjamini | Count | Genes
GO:0005925~focal adhesion | 2.01E-05 | 2.955018 | 0.007077 | 22 | LIMS1, CAV1, LPP, HSPG2, FERMT1, CHP1, FBLIM1, NEDD9, CDH1, TLE2, SPRY4, PLAUR, LPXN, SDC1, ITGAV, ZNF185, TGM2, CAT, ZYX, TRIP6, PLAU, EHD3
GO:0005615~extracellular space | 2.90E-04 | 1.754523 | 0.04998 | 45 | CXCL1, CPM, HMGB2, LYPD3, CXCL3, IGFBP6, CXCL2, CXCL8, SPOCK1, CXADR, ABCA3, GPC4, KRT81, ATXN10, SERPINE1, TFF2, SEMA3D, TFF3, LOXL4, CAT, TFF1, MTUS1, ANGPTL4, CPA4, ZP3, PODXL, HSPG2, SPINT1, TLE2, SPARC, LCN2, INHBB, DDR1, SH3BGRL, WNT7B, CPE, SERPINB5, CST4, COL1A2, LIPG, FJX1, HBEGF, SLPI, PLAU, MUC5B

Appendix Table 3.16. Enriched GO terms and KEGG pathways among DEGs in PC3 between 5% O2 in Plasmax and 18% O2 in DMEM.

Cellular Component | All DEGs | 5P | 18D
GO:0005925~focal adhesion | 22 | 9* | 13*
GO:0005615~extracellular space | 45 | 22* | -

Abbreviations: 5P, 5% O2 in Plasmax; 18D, 18% O2 in DMEM. The most significantly enriched GO terms and KEGG pathways were determined by p-value and Benjamini-corrected p-value ≤ 0.05 unless indicated otherwise; * significantly enriched based on p-value (≤ 0.05); ** not significantly enriched; (-) no result.


Appendix Table 3.17. Enriched GO terms and KEGG pathways in MCF7 in response to 5% O2 in DMEM and 18% O2 in Plasmax. Biological Process Fold Term p-value Benjamini Count Genes Enrichment EGR1, SP100, IFITM3, OAS3, HLA-A, GO:0060337~type I RSAD2, SAMHD1, OAS2, STAT1, interferon signaling 3.06E-20 15.12437 7.86E-17 23 HLA-E, IFI35, PSMB8, ISG20, HLA-F, pathway IFIT3, IFI27, OASL, IFIT1, ISG15, IRF7, XAF1, MX2, IFI6 IFITM3, HERC5, OAS3, BNIP3, SAMHD1, RSAD2, IFI44L, PMAIP1, GO:0051607~defense OAS2, STAT1, ISG20, IFIT3, PLSCR1, 1.07E-12 6.376547 1.38E-09 25 response to virus NLRC5, IFIT1, OASL, UNC13D, ISG15, BCL2, C19ORF66, IFIT5, DDX60, ZC3H12A, MX2, DHX58 GO:0060333~interferon- SP100, NMI, OAS3, HLA-A, OAS2, gamma-mediated signaling 9.92E-10 8.891242 8.50E-07 15 STAT1, HLA-E, TRIM21, HLA-F, B2M, pathway OASL, IRF7, MT2A, CAMK2B, JAK2 IFIH1, IFITM3, OAS3, RSAD2, IFI44, GO:0009615~response to OAS2, CCL5, ISG20, DDX58, IFIT3, 4.87E-08 6.121486 3.13E-05 16 virus OASL, IFIT1, DDX60, IRF7, MX2, DHX58 GO:0032480~negative DDX58, NLRC5, IFIH1, ISG15, HERC5, regulation of type I 3.44E-07 12.62556 1.47E-04 9 UBA7, UBE2L6, TNFAIP3, DHX58 interferon production GO:0045071~negative PLSCR1, IFIT1, OASL, ISG15, regulation of viral genome 3.08E-07 10.5213 1.58E-04 10 C19ORF66, IFITM3, OAS3, RSAD2, replication CCL5, ISG20 FRK, EFNA1, SOX3, FHL2, CXXC5, JUND, NFATC2, TXNIP, EGR1, SATB2, SP100, HIST1H1C, KLF10, GO:0000122~negative KLF11, RELB, SMAD3, STAT1, regulation of transcription 3.49E-06 2.279616 9.97E-04 39 RBBP8, NRIP1, HIST2H3D, HES1, from RNA polymerase II ASCL1, NOTCH1, IFI27, DKK1, promoter SALL4, CD36, ZNF217, ATF3, IRF2BPL, PLK1, ID1, IRF7, SPDEF, TGIF1, ZFPM1, ID3, CUX1, KLF4 GO:0034097~response to FOS, IFI27, SP100, PTGES, JUN, BCL2, 3.22E-06 8.09331 0.001034 10 cytokine JUND, RELB, ACP5, STAT1 FRK, HIST1H2AC, CYP1B1, CXCL8, SOX4, LIF, PTGES, TFF1, GPNMB, GO:0008285~negative COL18A1, B4GALT1, PTPRK, KLF10, regulation of cell 3.00E-06 2.869446 0.001101 27 KLF11, IL24, 
CDKN3, IFIT3, DHRS2, proliferation CDKN1A, NOTCH1, ADM, BTG1, JUN, JAK2, PMP22, KLF4, TOB1 COL18A1, RET, CSF1, PODXL, GO:0030335~positive SMAD3, CCL5, DAPK3, GTSE1, regulation of cell 8.23E-06 3.888308 0.002113 17 NOTCH1, ZNF703, ADRA2A, migration SEMA3B, JAK2, RHOD, THBS1, GPNMB, INSR GO:0070059~intrinsic apoptotic signaling ATF4, CASP4, CEBPB, CHAC1, BBC3, 1.00E-05 10.20248 0.002334 8 pathway in response to BCL2, PMAIP1, PPP1R15A endoplasmic reticulum 296

stress AKNA, FOSL2, LMO2, HELZ2, FSTL3, NFKBIA, SOX4, LIF, PGR, FOS, NLRC5, MEIS3, JUND, SERPINE1, ZC3H12A, KDM3A, NFATC2, GO:0045944~positive SERTAD1, EGR1, SATB2, CEBPB, regulation of transcription 1.66E-05 1.973415 0.003546 46 LMX1B, CEBPG, RELB, SMYD3, from RNA polymerase II SMAD3, NR4A1, STAT1, TET2, NRIP1, promoter DDX58, HES1, ASCL1, PLSCR1, NOTCH1, ATF4, SALL4, ATF3, IRF2BPL, HOXC13, JUN, IRF7, SPDEF, KAT6B, FOXI1, KLF4 COL18A1, NRP1, CYP1B1, S100A7, EFNA1, HSPG2, CXCL8, ACKR3, GO:0001525~angiogenesis 2.44E-05 3.397013 0.004815 18 NRCAM, ID1, JUN, HMOX1, SERPINE1, ZC3H12A, ADM2, PLXND1, TNFAIP2, FN1 GO:0071407~cellular CCNB1, P2RY6, CYP1B1, CEBPB, response to organic cyclic 7.15E-05 6.419778 0.01305 9 NFKBIA, CCL5, STAT1, AXIN1, compound IGFBP5 GO:0051591~response to FOS, LDHA, DUSP1, JUN, JUND, 9.61E-05 7.319167 0.015323 8 cAMP AREG, VGF, STAT1 NUAK2, GULP1, NFKBIA, BNIP3, PMAIP1, PDCD4, G2E3, CASP4, BCL2, BUB1, ZC3H12A, XAF1, BMF, AXIN1, GO:0006915~apoptotic 9.06E-05 2.226731 0.01541 30 PHLDA1, KLF11, TPX2, PIM1, NR4A1, process PIM3, ESPL1, IL24, STAT1, DAPK3, GAS6, PLSCR1, BBC3, JAK2, TNFAIP3, PPP1R15A GO:0031100~organ PKM, CDKN1A, NOTCH1, ADM, 1.11E-04 7.163441 0.016593 8 regeneration CCNA2, GAS6, GSTP1, ANXA3 CEBPB, CEBPG, IFITM3, OAS3, HLA- A, CXCL8, SMAD3, SAMHD1, IL32, GO:0006955~immune OAS2, CX3CL1, IL24, HLA-E, CCL5, 1.90E-04 2.399157 0.025355 24 response HLA-F, B2M, LIF, NOTCH1, TNFRSF11B, CD36, IL4R, THBS1, LTB, IFI6 GO:0042542~response to TXNIP, LDHA, DUSP1, JUN, HMOX1, 1.88E-04 6.601602 0.026464 8 hydrogen peroxide BCL2, AREG, STAT1 PTPRK, ARC, FGFR4, EFNA1, PODXL, GO:0016477~cell 2.37E-04 3.425541 0.028654 14 GAS6, COL5A1, HES1, FAM83D, migration PARP9, BTG1, JAK2, THBS1, NFATC2 GO:0032020~ISG15- 2.51E-04 28.05681 0.028866 4 ISG15, HERC5, UBA7, UBE2L6 protein conjugation NUAK2, FHL2, BNIP3, NFKBIA, AURKA, PDCD4, BAG3, BCL2, GO:0043066~negative THBS1, KIF14, EGR3, SOCS2, PIM1, regulation of apoptotic 2.32E-04 2.312374 0.029344 25 SMAD3, PIM3, 
GAS6, IFIT3, DHRS2, process ASCL1, CDKN1A, DUSP1, PLK2, ID1, PLK1, GSTP1 GO:0045668~negative NOTCH1, ID1, SMAD3, ID3, AREG, regulation of osteoblast 2.87E-04 7.553756 0.031603 7 TOB1, IGFBP5 differentiation GO:0042493~response to COL18A1, TXNIP, RET, LDHA, FZD1, 3.60E-04 2.630326 0.037793 19 drug COMT, STAT1, B2M, CCNB1, FOS,

297

TNFRSF11B, CDKN1A, JUN, BCL2, JUND, ABAT, TGIF1, THBS1, NFATC2 GO:0030336~negative PTPRK, CYP1B1, EPPK1, BCL2, regulation of cell 4.20E-04 4.430022 0.040691 10 SERPINE1, CX3CL1, IL24, TPM1, migration SRGAP1, IGFBP5 CYP1B1, ADM, BTG1, HMOX1, GO:0045766~positive 4.08E-04 4.025542 0.041098 11 SERPINE1, CXCL8, ZC3H12A, ADM2, regulation of angiogenesis CX3CL1, THBS1, ANXA3 GO:0042149~cellular ATF4, NUAK2, BCL2, SLC2A1, response to glucose 5.42E-04 8.707285 0.048538 6 ZC3H12A, PMAIP1 starvation HIST1H2BO, HIST1H2BK, HIST1H1C, GO:0006334~nucleosome HIST2H2BE, HIST2H2BF, SMYD3, 5.36E-04 3.89023 0.049743 11 assembly H1FX, HIST3H2BB, KAT6B, HIST1H3H, HIST2H3D Cellular Component Fold Term p-value Benjamini Count Genes Enrichment MCRIP1, LDHA, S100A7, PDLIM3, PGAM1, PTTG1, CALB2, AQP3, ISG20, B2M, HIST1H2BO, NLRC5, ACTG2, G2E3, PIP5KL1, PACSIN3, HIST1H2BK, CDCA2, SMOX, MX2, CCNA2, ASPM, FTL, ARC, RET, SATB2, SOCS2, DTX3L, RELB, PIM1, ESPL1, TMSB10, PIM3, IFI44, OPTN, METTL7A, HES1, KRT17, TAGLN, PARP14, TNFAIP3, FGFR4, OAS3, NFKBIA, SOX4, AFAP1L2, CXXC5, OAS2, CCL5, LIF, CRMP1, MYO15B, ADRA2A, TFF3, AHNAK2, DHX58, AXIN1, PLAT, MOCS2, S100P, LGALS3, PODXL, SMYD3, SMAD3, NR4A1, CDC20, CELSR2, EVL, GAS6, PSMB8, DNMBP, PSMB9, PLEKHA4, DDX58, APOL3, OASL, ATF4, SALL4, GO:0005737~cytoplasm 1.13E-08 1.4534 3.89E-06 172 PLK2, PARP9, ULK1, PLK1, SYTL2, APOL6, KLF4, GAS2L3, IER2, KYNU, NUAK2, BNIP3, IFI44L, PDCD4, PKM, FAM83D, TRIM3, GSN, BAG3, DDX60, KLHL24, HIST3H2BB, ERRFI1, SERTAD1, EGR1, KIF12, SP100, CCDC88C, HERC6, HERC5, PADI2, DAPK3, ECT2, DHRS2, MAST4, SMTN, BGN, ADM, CLIC3, HIST2H2BE, BTG1, HIST2H2BF, ZFPM1, SPATS2L, PPP1R15A, KPNA2, GSTP1, SRGAP1, SHCBP1, FRK, NMI, GULP1, ABCD1, NECAB1, ZFP36L2, ZNF703, C19ORF66, BCL2, BUB1, ZC3H12A, MTCL1, KDM3A, EXOC5, NFATC2, ZBP1, TXNIP, RBM24, CEBPB, EPPK1, NXNL2, DLGAP5, KNSTRN, STAT1, RGS16, CDKN3, TRIM21, ANXA3, IFIT3, CCNB1,

298

IFIT1, DUSP1, KNL1, IRF7, SVIL, SAMD9, MT2A, CAPG, TEX19, MAPK8IP2, JAK2, ID3, GDF15, THEMIS2, TOB1 LDHA, S100A7, PGAM1, AURKA, PTTG1, PMAIP1, AMOTL2, CALB2, NLRC5, ACTG2, ISG15, SLC2A1, SMOX, MX2, FTL, SOCS2, CHAC1, DTX3L, RELB, ESPL1, OPTN, RND1, JUN, TNFAIP3, MYL7, IFIH1, PFKFB3, OAS3, UBA7, ACP5, NFKBIA, IL32, OAS2, IFI35, CMPK2, CRMP1, GYS1, BMF, AXIN1, MOCS2, UBE2L6, SMAD3, CDC20, EVL, DOCK8, PSMB8, PSMB9, DDX58, NOTCH1, OASL, CDKN1A, PLK2, PARP9, PLK1, ULK1, BBC3, MYH14, KYNU, NRP1, GO:0005829~cytosol 1.13E-06 1.530763 1.93E-04 115 PDCD4, GTSE1, PKM, FOS, USP18, PLCB4, PBXIP1, GSN, BAG3, HMOX1, RHOD, ERRFI1, KIF14, HERC6, HERC5, TPX2, PADI2, ECT2, CUX1, PPP1R15A, KPNA2, GSTP1, SRGAP1, PPFIA4, ABCD1, HK2, COMT, TPM1, HMMR, ZFP36L2, BCL2, RASGRP1, BUB1, CAMK2B, EXOC5, XAF1, WIPF1, NFATC2, PAPSS2, HSPA8, ZBP1, PHLDA1, TXNIP, MAT2A, SPSB1, STAT1, TRIM21, CCNB1, IFIT3, PLSCR1, IFIT1, KNL1, IRF7, MT2A, SP6, JAK2 S100A4, NRP1, FSTL3, VGF, B2M, KRT81, ACTG2, TNFRSF11B, HIST1H2BK, GSN, IL4R, HMOX1, SERPINE1, COL12A1, SEMA3B, LTB, GOLM1, IL24, INHBB, CHGA, CD36, ADM, HIST2H2BE, COL1A2, GO:0005615~extracellular 2.75E-05 1.801726 0.003134 55 TNFAIP2, GSTP1, MSMB, CSF1, OAS3, space CXCL8, IL32, CX3CL1, CCL5, LIF, LGALS3BP, C1QTNF6, MSLN, TFF3, MTCL1, TFF1, THBS1, HSPA8, FN1, PLAT, B4GALT1, COL18A1, LGALS3, PODXL, HSPG2, GAS6, DKK1, APOL1, IRF2BPL, AREG, GDF15 AKNA, S100A4, LDHA, MCRIP1, LMO2, S100A7, FSTL3, AURKA, H1FX, PTTG1, PMAIP1, CALB2, AQP3, ISG20, HIST1H2BO, PGR, NLRC5, HIST1H2BK, SMOX, MX2, CCNA2, ASPM, TYRO3, SATB2, DTX3L, RELB, GO:0005634~nucleus 3.86E-05 1.311961 0.003303 161 PIM1, ESPL1, OPTN, DEPDC1, HES1, ASCL1, PARP12, HES4, JUN, PARP14, SPDEF, TGIF1, TNFAIP3, IFIH1, MSMB, UBA7, NFKBIA, SOX4, SP110, OAS2, IFI35, CMPK2, MEIS3, JUND, AHNAK2, AXIN1, KLF6, MOCS2, S100P, LGALS3, KLF10, KLF11,


SMAD3, NR4A1, CDC20, TET2, IDH3A, PSMB8, HIST2H3D, PSMB9, NOTCH1, ATF4, CDKN1A, ATF3, SALL4, PARP9, IRF2BPL, PLK1, DYRK1B, ZNF385B, AREG, KAT6B, FOXI1, HIST1H3H, IER2, FOSL2, NUAK2, BNIP3, HNRNPLL, PDCD4, PKM, FOS, USP18, PLCB4, PBXIP1, GSN, HMOX1, HIST3H2BB, ERRFI1, SERTAD1, KIF14, EGR1, EGR3, SP100, HIST1H1C, HERC6, TPX2, HERC5, DAPK3, ECT2, NRIP1, RBBP8, DHRS2, CLIC3, HIST2H2BE, BTG1, HOXC13, HIST2H2BF, ZFPM1, CUX1, RASD1, KPNA2, GSTP1, HIST1H2AC, FRK, FHL2, NECAB1, ZFP36L2, ZNF703, C19ORF66, BCL2, ZC3H12A, KDM3A, XAF1, NFATC2, HSPA8, SYNPO, ZBP1, TXNIP, RBM24, CEBPB, LMX1B, DLGAP5, CEBPG, SAMHD1, KNSTRN, STAT1, CDKN3, TRIM21, CCNB1, PLSCR1, DUSP1, KNL1, ID1, IRF7, SVIL, MT2A, CAPG, SP6, JAK2, ID3, PRSS23, GDF15, THEMIS2, TOB1 AKNA, ATP1B1, LDHA, NRP1, EFNA1, HELZ2, NPNT, PGAM1, HNRNPLL, SLC26A2, GTSE1, B2M, FOS, PIP5KL1, PTGES, HMOX1, SLC2A1, LBR, INSR, FTL, KIF14, PTPRK, RET, HLA-A, METTL7A, HLA-E, SLC7A11, HLA-F, UNC13D, CD36, PARP14, RAPGEFL1, KPNA2, PPP1R15A, CALCR, ABCD1, CSF1, GO:0016020~membrane 9.84E-05 1.544407 0.006728 77 HK2, IL32, COMT, OAS2, HMMR, LGALS3BP, ECE1, BCL2, RASGRP1, TAP1, MSLN, BUB1, GYS1, KDM3A, PLXND1, LFNG, HSPA8, B4GALT1, LGALS3, FADS3, EVL, DOCK8, ANXA3, PLEKHA4, APOL2, CCNB1, APOL3, PLSCR1, RAB32, OASL, PARP9, SLC7A2, TENM3, ABCC3, SYTL2, JAK2, MYH14, SLC15A3, CLCN7, HIST1H3H IER2, KYNU, FOSL2, HELZ2, FSTL3, BNIP3, AURKA, IFI44L, PDCD4, GTSE1, ISG20, HIST1H2BO, PGR, FOS, PACSIN3, HIST1H2BK, ISG15, CDCA2, HIST3H2BB, CCNA2, EGR1, SATB2, SP100, DTX3L, RELB, TPX2, GO:0005654~nucleoplasm 1.35E-04 1.458184 0.0077 92 DEPDC1, OPTN, RBBP8, NRIP1, HES1, SMTN, HIST2H2BE, HIST2H2BF, JUN, TGIF1, ZFPM1, CUX1, KPNA2, PMEPA1, FGFR4, NMI, PFKFB3, SOX3, UBA7, OAS3, FHL2, SOX4, ANLN, CXXC5, NECAB1, CMPK2, SYBU, BUB1, ZC3H12A, CAMK2B,


KDM3A, NFATC2, HSPA8, CEBPB, CEBPG, SMYD3, SAMHD1, NR4A1, UBE2L6, SMAD3, CDC20, STAT1, PSMB8, ABCG1, HIST2H3D, PSMB9, CCNB1, NOTCH1, ATF4, CDKN1A, ATF3, ZNF217, SALL4, PARP9, ID1, PLK1, IRF2BPL, KNL1, IRF7, CAPG, JAK2, ID3, KAT6B, USP43, KLF4, HIST1H3H GO:0030670~phagocytic TCIRG1, RAB32, RAB31, HLA-A, 3.55E-04 5.983174 0.017262 8 vesicle membrane HLA-E, ANXA3, HLA-F, B2M HIST1H2BO, HIST1H2BK, GO:0000788~nuclear 4.39E-04 7.020031 0.01866 7 HIST2H2BE, HIST2H2BF, HIST3H2A, nucleosome HIST3H2BB, HIST1H3H HIST1H2BO, HIST1H2AC, GO:0000786~nucleosome 0.001304 4.224821 0.04376 9 HIST1H2BK, HIST1H1C, HIST2H2BE, H1FX, KAT6B, HIST1H3H, HIST2H3D COL18A1, PLAT, PKM, PLSCR1, HAPLN1, LGALS3BP, BGN, LGALS3, GO:0031012~extracellular 0.001206 2.534258 0.044952 17 SERPINE1, COL1A2, HSPG2, matrix COL12A1, MGP, THBS1, COL5A1, HSPA8, FN1 Molecular Function Fold Term p-value Benjamini Count Genes Enrichment S100A4, ATP1B1, MCRIP1, LDHA, EFNA1, S100A7, PDLIM3, FSTL3, PGAM1, AURKA, PTTG1, AMOTL2, B2M, PGR, G2E3, ISG15, SERPINE1, CCNA2, FTL, PTPRK, TYRO3, PIM1, ESPL1, PIM3, OPTN, DEPDC1, HES1, CD36, UNC13D, RND1, KRT17, TAGLN, JUN, TGIF1, PMP22, CHORDC1, SCN1B, OAS3, UBA7, NFKBIA, OAS2, KCNJ3, IFI35, LIF, CRMP1, SYBU, GYS1, AHNAK2, DHX58, PLAT, KLF6, LGALS3, KLF10, KLF11, UBE2L6, SMAD3, CDC20, GAS6, NOTCH1, ZNF217, APOL1, GO:0005515~protein PARP9, SYTL5, SYTL2, AREG, 4.66E-08 1.252789 3.00E-05 266 binding KAT6B, KLF4, GAS2L3, NRP1, NUAK2, BNIP3, GTSE1, PKM, USP18, PBXIP1, BAG3, IL4R, ERRFI1, GOLM1, SERTAD1, HIST1H1C, TPX2, HERC5, ACKR3, ARRDC3, RBBP8, INHBB, ADM, CLIC3, RASD1, SRGAP1, PPFIA4, NMI, TBC1D9, CSF1, ABCD1, HK2, FHL2, CXCL8, HMMR, ZFP36L2, BCL2, C19ORF66, BUB1, WIPF1, EXOC5, THBS1, PHLDA1, ZBP1, PDK1, SPSB1, MAT2A, LMX1B, DLGAP5, HSPG2, FZD1, NPY1R, STAT1, CDKN3, TRIM21, CCNB1, PLSCR1, RAB32, DKK1, DUSP1, KNL1, SVIL, MT2A,


SAMD9, CAPG, TOB1, LMO2, FAM189B, PMAIP1, KRT81, NRCAM, NLRC5, PACSIN3, SLC2A1, MX2, INSR, SAMD4A, RET, SATB2, SOCS2, CHAC1, DTX3L, RELB, HLA-A, MGP, TMSB10, IL24, ASCL1, COL1A2, TNFAIP3, MYL7, IFIH1, FGFR4, MSMB, IFITM3, SOX4, IL32, CXXC5, CX3CL1, CCL5, P2RY6, ECE1, MSLN, JUND, ADRA2A, TFF3, TFF1, PLXND1, BMF, FN1, AXIN1, S100P, PODXL, NR4A1, EVL, DOCK8, TET2, PSMB8, DNMBP, HIST2H3D, PSMB9, DDX58, ATF4, CDKN1A, ATF3, SALL4, PLK2, BBC3, PLK1, ULK1, DYRK1B, HIST1H3H, IFI6, FOSL2, IER5, HELZ2, HNRNPLL, PDCD4, LGR4, FAM83D, FOS, TRIM3, PLCB4, GSN, HMOX1, DDX60, PXMP4, LBR, EGR1, KIF14, SP100, DAPK3, ECT2, SLC7A11, NRIP1, DHRS2, BTG1, HOXC13, CLDN1, ZFPM1, KPNA2, PPP1R15A, PMEPA1, GSTP1, SHCBP1, CALCR, FRK, RSAD2, COMT, TPM1, ZNF703, C1QTNF6, TAP1, ZC3H12A, CAMK2B, NFATC2, SCNN1B, GPNMB, SCNN1A, HSPA8, SYNPO, TXNIP, C17ORF82, CEBPB, CEBPG, SAMHD1, KNSTRN, RGS16, IFIT3, KCNN4, IFIT1, ID1, IRF7, MAPK8IP2, JAK2, ID3, KLHL35, GDF15, THEMIS2, SH3BP2, IGFBP5

KEGG Pathway

| Term | p-value | Fold Enrichment | Benjamini | Count | Genes |
|---|---|---|---|---|---|
| hsa05203:Viral carcinogenesis | 3.75E-06 | 3.477316 | 8.43E-04 | 20 | EGR3, SP100, HLA-A, NFKBIA, CDC20, PMAIP1, HLA-E, HLA-F, PKM, HIST1H2BO, CDKN1A, ATF4, HIST1H2BK, HIST2H2BE, GSN, HIST2H2BF, JUN, IRF7, HIST3H2BB, CCNA2 |
| hsa05161:Hepatitis B | 4.77E-05 | 3.687154 | 0.003572 | 15 | EGR3, IFIH1, HSPG2, CXCL8, NFKBIA, STAT1, DDX58, FOS, CDKN1A, ATF4, IRF7, BCL2, JUN, NFATC2, CCNA2 |
| hsa05168:Herpes simplex infection | 4.67E-05 | 3.311051 | 0.005237 | 17 | IFIH1, SP100, OAS3, HLA-A, NFKBIA, OAS2, STAT1, HLA-E, CCL5, HLA-F, DDX58, FOS, IFIT1, JUN, IRF7, TAP1, JAK2 |
| hsa04380:Osteoclast differentiation | 2.75E-04 | 3.537041 | 0.015355 | 13 | CALCR, FOSL2, CSF1, RELB, FHL2, ACP5, NFKBIA, STAT1, FOS, TNFRSF11B, JUN, JUND, NFATC2 |
| hsa05169:Epstein-Barr virus infection | 5.66E-04 | 3.505818 | 0.025158 | 12 | DDX58, CDKN1A, JUN, BCL2, RELB, ENTPD8, HLA-A, NFKBIA, HLA-E, TNFAIP3, CCNA2, HLA-F |
| hsa05166:HTLV-I infection | 6.94E-04 | 2.525846 | 0.02571 | 18 | EGR1, NRP1, RELB, HLA-A, FZD1, SMAD3, NFKBIA, CDC20, PTTG1, HLA-E, HLA-F, FOS, CDKN1A, ATF4, ATF3, JUN, SLC2A1, NFATC2 |
| hsa05160:Hepatitis C | 0.001166 | 3.215863 | 0.032284 | 12 | DDX58, CDKN1A, IFIT1, CLDN9, IRF7, OAS3, CLDN1, CXCL8, NFKBIA, OAS2, STAT1, CLDN23 |
| hsa04066:HIF-1 signaling pathway | 0.00136 | 3.712759 | 0.033447 | 10 | PDK1, CDKN1A, PFKFB3, HMOX1, BCL2, SLC2A1, SERPINE1, HK2, CAMK2B, INSR |
| hsa05164:Influenza A | 0.0011 | 2.867786 | 0.034772 | 14 | DDX58, IFIH1, JUN, IRF7, OAS3, CXCL8, NFKBIA, RSAD2, JAK2, OAS2, CCL5, STAT1, KPNA2, HSPA8 |


Appendix Table 3.18. Enriched GO terms and KEGG pathways among DEGs in MCF7 between 5% O2 in DMEM and 18% O2 in Plasmax.

| Biological Process | All DEGs | 5D | 18P |
|---|---|---|---|
| GO:0060337~type I interferon signaling pathway | 23 | - | 23 |
| GO:0051607~defense response to virus | 25 | - | 24 |
| GO:0060333~interferon-gamma-mediated signaling pathway | 15 | - | 15 |
| GO:0009615~response to virus | 16 | - | 16 |
| GO:0032480~negative regulation of type I interferon production | 9 | - | 9 |
| GO:0045071~negative regulation of viral genome replication | 10 | - | 10 |
| GO:0000122~negative regulation of transcription from RNA polymerase II promoter | 39 | - | 35 |
| GO:0034097~response to cytokine | 10 | - | 10 |
| GO:0008285~negative regulation of cell proliferation | 27 | - | 24 |
| GO:0030335~positive regulation of cell migration | 17 | - | 16 |
| GO:0070059~intrinsic apoptotic signaling pathway in response to endoplasmic reticulum stress | 8 | - | 8 |
| GO:0045944~positive regulation of transcription from RNA polymerase II promoter | 46 | | 43 |
| GO:0001525~angiogenesis | 18 | - | 16 |
| GO:0071407~cellular response to organic cyclic compound | 9 | 5* | 4* |
| GO:0051591~response to cAMP | 8 | - | 7 |
| GO:0006915~apoptotic process | 30 | - | 24 |
| GO:0031100~organ regeneration | 8 | - | 6* |
| GO:0006955~immune response | 24 | - | 24 |
| GO:0042542~response to hydrogen peroxide | 8 | - | 7 |
| GO:0016477~cell migration | 14 | - | 12 |
| GO:0032020~ISG15-protein conjugation | 4 | - | 4 |
| GO:0043066~negative regulation of apoptotic process | 25 | 7* | 18* |
| GO:0045668~negative regulation of osteoblast differentiation | 7 | 4* | - |
| GO:0042493~response to drug | 19 | - | 16* |
| GO:0030336~negative regulation of cell migration | 10 | - | 8* |
| GO:0045766~positive regulation of angiogenesis | 11 | - | 10 |
| GO:0042149~cellular response to glucose starvation | 6 | - | 5* |
| GO:0006334~nucleosome assembly | 11 | - | 11 |

| Cellular Component | All DEGs | 5D | 18P |
|---|---|---|---|
| GO:0042612~MHC class I protein complex | 4 | - | 4* |
| GO:0005737~cytoplasm | 172 | 37 | 135 |
| GO:0005829~cytosol | 115 | 31 | 84 |
| GO:0005615~extracellular space | 55 | - | 53 |
| GO:0005634~nucleus | 161 | 33* | 128 |
| GO:0016020~membrane | 77 | 18* | 59* |
| GO:0005654~nucleoplasm | 92 | 24 | 68* |
| GO:0030670~phagocytic vesicle membrane | 8 | - | 8 |
| GO:0000788~nuclear nucleosome | 7 | - | 7 |
| GO:0000786~nucleosome | 9 | - | 9 |
| GO:0031012~extracellular matrix | 17 | - | 14* |

| Molecular Function | All DEGs | 5D | 18P |
|---|---|---|---|
| GO:0005515~protein binding | 266 | 58 | 208 |

| KEGG Pathway | All DEGs | 5D | 18P |
|---|---|---|---|
| hsa05203:Viral carcinogenesis | 20 | - | 17 |
| hsa05161:Hepatitis B | 15 | - | 14 |
| hsa05168:Herpes simplex infection | 17 | - | 17 |
| hsa04380:Osteoclast differentiation | 13 | - | 13 |
| hsa05169:Epstein-Barr virus infection | 12 | - | 11 |
| hsa05166:HTLV-I infection | 18 | - | 15 |
| hsa05160:Hepatitis C | 12 | - | 12 |
| hsa04066:HIF-1 signaling pathway | 10 | 4* | 6** |
| hsa05164:Influenza A | 14 | - | 12 |

Abbreviations: 5D, 5% O2 in DMEM; 18P, 18% O2 in Plasmax. The most significantly enriched GO term or KEGG pathway is determined by the p-value and the Benjamini-corrected p-value (both ≤ 0.05) unless indicated otherwise; * significantly enriched based on p-value only (≤ 0.05); ** not significantly enriched; (-) no result.
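The marker scheme in the footnote above can be sketched as a small function. This is only an illustration of the stated thresholds, not the procedure actually used to build the table; the function name `significance_marker` and the second and third example rows are hypothetical, while the first row reuses the p-value and Benjamini value reported for GO:0042149 in Appendix Table 3.17.

```python
def significance_marker(p_value, benjamini, alpha=0.05):
    """Return the footnote marker for one enriched term.

    ""   -> p-value and Benjamini-corrected p-value are both <= alpha
    "*"  -> only the uncorrected p-value is <= alpha
    "**" -> not significantly enriched
    """
    if p_value <= alpha and benjamini <= alpha:
        return ""
    if p_value <= alpha:
        return "*"
    return "**"


# (term, p-value, Benjamini); only the first row uses values from the appendix,
# the other two are made-up numbers for illustration.
rows = [
    ("GO:0042149~cellular response to glucose starvation", 5.42e-04, 0.048538),
    ("hypothetical term A", 0.01, 0.20),
    ("hypothetical term B", 0.30, 0.90),
]

for term, p, bh in rows:
    print(f"{term}: '{significance_marker(p, bh)}'")
```

A count such as "18*" in the table would then correspond to a term whose subset p-value passes the 0.05 cut-off while its Benjamini-corrected value does not.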


Appendix Table 3.19. Enriched GO terms and KEGG pathways in PC3 in response to 5% O2 in DMEM and 18% O2 in Plasmax. Biological Process Fold Term p-value Benjamini Count Genes Enrichment GO:0070059~intrinsic ATF4, CEBPB, ERO1A, XBP1, apoptotic signaling pathway in 2.50E-08 11.17232 7.09E-05 11 CHAC1, BBC3, BCL2, ERN1, response to endoplasmic TRIB3, PPP1R15A, ITPR1 reticulum stress KCNMA1, PAM, LDHA, NF1, BNIP3, PDLIM1, CBFA2T3, GO:0001666~response to ITPR1, DDIT4, ASCL2, PKM, 2.08E-07 4.092188 2.95E-04 21 hypoxia CYBA, PLOD1, PLOD2, HMOX1, ABAT, PAK1, THBS1, DPP4, MB, ANGPTL4 IFI27, ISG15, BST2, IRF7, IFITM3, GO:0060337~type I interferon 1.79E-05 5.760729 0.007218 11 XAF1, MX1, MX2, IFI35, IFI6, signaling pathway HLA-F GO:0061621~canonical PKM, PFKL, PFKFB3, PGAM1, 8.00E-06 10.31291 0.007524 8 glycolysis HK2, PFKP, HK1, PGK1 GO:0042149~cellular ATF4, XBP1, BCL2, SLC2A1, 1.76E-05 9.24606 0.008258 8 response to glucose starvation TP53, UPP1, ZC3H12A, ASNS GO:1990440~positive regulation of transcription ATF4, ATF3, CEBPB, XBP1, from RNA polymerase II 1.53E-05 16.75848 0.008614 6 TP53, CREB3L1 promoter in response to endoplasmic reticulum stress PTPRJ, RAP2A, NF2, BST2, PKP2, GO:0030336~negative BCL2, NF1, RHOB, DPYSL3, 2.47E-05 4.586532 0.008694 13 regulation of cell migration IL24, SLC9A3R1, LDLRAD4, IGFBP5 XPO1, PAM, LDHA, ASS1, CENPF, CTPS1, CDH1, AK4, MDK, LCN2, TYMS, FOS, CYBA, GO:0042493~response to drug 1.47E-05 2.756329 0.01037 25 CDKN1A, APOD, GATA6, BCL2, GATA3, JUND, SEMA3C, ABAT, LRP8, COL1A1, THBS1, IGFBP2 WNT5A, FIGNL1, TAF9B, NFKBIA, BNIP3, ASNS, GLI2, EPCAM, PHIP, PTK2, GATA6, SQSTM1, XBP1, BCL2, THBS1, GO:0043066~negative 4.35E-05 2.283574 0.011133 31 DHCR24, ANGPTL4, KIF14, regulation of apoptotic process TP53, BIRC5, PIM3, NTSR1, ASCL1, AMIGO2, FMN2, DHRS2, CDKN1A, RPS6KA3, RPS6KA1, HSPB1, WNT7A STC2, JUND, GJA1, COL1A1, GO:0043434~response to 4.03E-05 6.855743 0.011341 9 TFF1, AREG, SPARC, BMP7, peptide hormone NEFL WNT5A, SASH1, C5, CXCL8, 
GO:0045766~positive XBP1, GATA6, HMOX1, RHOB, 3.79E-05 4.080326 0.011865 14 regulation of angiogenesis ZC3H12A, IL1B, HSPB1, ADM2, THBS1, ANGPTL4 GO:0034976~response to 7.32E-05 4.915822 0.017116 11 ATF4, CEBPB, ERO1A, XBP1,

306 endoplasmic reticulum stress BBC3, TMX4, ERN1, CXCL8, TRIB3, THBS1, PPP1R15A PXDN, NPNT, TNC, FBN1, NF1, CCDC80, CDH1, NID1, SPARC, GO:0030198~extracellular 8.18E-05 3.078089 0.017663 18 ATP7A, PTK2, COL9A3, ERO1A, matrix organization ITGA5, COL6A3, COL1A1, FBN2, THBS1 CXCL1, ACHE, POLA1, GLI2, CBFA2T3, FAM83D, TYMS, CSE1L, BCL2, BUB1, GRPR, GO:0008283~cell 1.07E-04 2.380987 0.021457 26 PDK1, BST2, MKI67, DLGAP5, proliferation TP53, TPX2, SKP2, CENPF, TACC1, DDIT4, UHRF1, TACSTD2, TCF19, AREG, PDZK1 FMN2, ERO1A, STC2, GATA6, GO:0071456~cellular 1.31E-04 4.189621 0.024495 12 BBC3, HMOX1, BCL2, TP53, response to hypoxia BNIP3, NDRG1, ADAM8, BMP7 GO:0000278~mitotic cell XRCC2, CEP250, RRM1, CENPF, 1.86E-04 6.539896 0.032353 8 cycle KIF18B, CENPE, PAK1, PPP2R2C GPD2, RBP4, ATF4, ATF3, PGM1, GO:0006094~gluconeogenesis 2.93E-04 6.093994 0.047679 8 PGAM1, PCK2, PGK1 Cellular Component Fold Term p-value Benjamini Count Genes Enrichment MEF2C, XPO1, LDHA, CRABP2, PGAM1, IQGAP3, VPS53, SLC7A5, CMBL, ISG15, CDKN2B, SLC2A1, RBCK1, IL1B, DEPDC1B, MX1, ITPK1, MX2, C2CD2, CHAC1, RELB, SKP2, ESPL1, PNPLA2, CTNNA1, DDIT4, TNNT2, SPAG9, MIB1, SGO2, TACSTD2, ZWINT, PGM1, HSPB1, PRPS1, RBP4, IFIH1, ASS1, PFKFB3, NFKBIA, UBA6, CTPS1, CTPS2, IGF2BP3, SESN2, EPHB3, MYO9A, IFI35, TK1, CMPK2, PTK2, CSE1L, GYS1, NDRG1, BMF, PPP2R2C, GO:0005829~cytosol 1.89E-10 1.621849 8.39E-08 154 TUBB4A, PARD6B, UBE2L6, EVL, LCN2, RERG, ATP7A, CDKN1A, MYO10, GBE1, BBC3, MYH14, SMC1A, KYNU, SULT2B1, GJA1, GLI2, RHOU, GTSE1, PKM, FOS, CEP250, HMOX1, PRMT6, RHOB, PAK1, ARHGAP11A, ANO7, EFR3B, DHCR24, KIF14, RAP2A, CCDC88A, PFKL, SARS, HERC6, PFKP, TP53, TPX2, FLNC, ECT2, FMN2, ARHGAP31, PSME1, RRM2, RRM1, KPNA2, PPP1R15A, KPNA1, PPFIA4, AP1M2, HK2, UPP1, HK1, TRIB3, ASNS, FAM13A, TYMS,


TSC22D3, NCAPG, SQSTM1, XBP1, BCL2, PAFAH1B3, AGO1, BUB1, ETNK1, AGO2, XAF1, EXOC5, TBC1D1, TNPO1, PAPSS2, NEFL, MTMR4, TRIP12, NF1, CENPF, BIRC5, CENPE, DPYSL3, WWTR1, MID1, CENPI, SMC4, ICK, RPS6KA3, RPS6KA1, KNL1, PYGL, IRF7, MT2A, PHGDH, DPYD, PSAT1, PGK1, CIT, HPGD LDHA, CRABP2, PGAM1, KIAA1324, SLC7A5, CMBL, APOD, PLOD1, PLOD2, SLC2A1, PLS1, AIF1L, IL1B, ADAM9, MB, PTPRJ, TNIK, SOGA1, BST2, LAD1, ERP29, SPAG9, RND3, KRT17, TACSTD2, ST14, PGM1, ABAT, HSPB1, MGAT5, WNT5A, RBP4, PAM, ACADSB, ASS1, IFITM3, LGALS3BP, CSE1L, TFF2, TFF3, NDRG1, TUBB4A, HIST1H4H, PARD6B, S100P, EFEMP2, ATAD2, AK4, S100A14, ADGRG1, LCN2, GBE1, PI3, MYH14, PCSK1N, PDZK1, HIST1H3H, MUC5B, PXDN, GO:0070062~extracellular TM7SF3, GM2A, FIGNL1, 8.11E-07 1.527628 1.80E-04 123 exosome GREB1, NPNT, ANO1, SULT2B1, GJA1, LSR, EPCAM, PKM, SLC1A4, EFHD1, DIP2B, CEP250, RHOB, SEMA3C, HIST3H2A, DPP4, PPP2R1B, KCNMA1, RAP2A, PFKL, SARS, SLC3A2, PFKP, SLC9A3R1, DHRS2, PSME1, HIST2H2BE, RRM1, SLPI, HIST1H2AC, FUT8, C5, CDH1, GPRC5A, PRSS8, PHIP, ANXA6, SQSTM1, COL6A3, PAFAH1B3, THBS1, TNPO1, SHMT2, TMC5, HIST1H2BG, TMC4, FBN1, NID1, PCK2, TFRC, PYGL, KNL1, PHGDH, GFRA1, METRNL, IGFBP2, GDF15, PGK1, PSAT1, WNT7A, HPGD MEF2C, LDHA, XRCC2, PGAM1, PDLIM1, AQP3, G2E3, PIP5KL1, HIST1H2BK, AIF1L, CDCA2, PLS1, ADAM8, PLS3, TNIK, OPA1, SKP2, ZHX2, ESPL1, GO:0005737~cytoplasm 2.98E-06 1.323736 3.30E-04 198 PIM3, IFI44, SPAG9, KRT17, SIPA1L1, ZWINT, PGM1, HAS2, C12ORF57, PLCXD3, UBA6, NFKBIA, IGF2BP3, DENND2D, UBASH3B, PPP1R3G, OBSL1, WDHD1, MKI67, TPD52L1,


CELSR2, S100A14, DHX40, RBPJ, PLCXD1, CLSPN, FOXA2, TTLL4, BNIP3, RNF182, YBX2, PKM, SMAP2, DIP2B, P4HA2, HERC6, SLC3A2, TP53, FLNC, EPB41L3, HIST2H2BE, CKAP2L, RRM2, RRM1, GADD45A, FUT8, POLA1, NCAPG, SQSTM1, XBP1, BCL2, BUB1, AGO1, AGO2, EXOC5, LMLN, TRIP12, SHMT2, HIST1H2BG, DLGAP5, BIRC5, CDKN3, MID1, WWTR1, RPS6KA3, RPS6KA1, NUPR1, KNL1, PYGL, SVIL, MT2A, RASSF2, DPYD, PSAT1, HPGD, DUSP8, XPO1, CRABP2, TTK, SLC7A5, CDKN2B, EIF1, MX1, MX2, BST2, EFTUD2, RELB, PKIB, PNPLA2, TACC1, DDIT4, PTHLH, ASCL2, TNS3, MIB1, RAD18, HSPB1, CRACR2B, ASS1, CXXC5, BICC1, SESN2, MDK, PTK2, CSE1L, PLCH2, TFF3, NDRG1, PARD6B, S100P, LPP, ARID3A, EVL, RIMKLA, PLEKHA4, CORO1C, MYO10, ATF4, CIRBP, SMC1A, TRIP6, PLEKHA2, KYNU, FIGNL1, ANO1, SULT2B1, IFI44L, FAM83D, TRIM9, DDX60, PAK1, PFKL, SARS, ESR1, PFKP, SLC9A3R1, ECT2, DHRS2, PSME1, RIF1, ERN1, KPNA2, PPP1R15A, KPNA1, SHCBP1, CNN3, UPP1, SPOCK1, CDH1, PALMD, NDC1, TYMS, TSC22D3, ZNF703, HJURP, PAFAH1B3, ETNK1, ZC3H12A, PLCD4, KLHL42, KDM3A, TNPO1, NEFL, EXO1, NES, CEBPB, NF2, TONSL, NF1, KIF18B, CENPF, CENPE, DPYSL3, SPARC, CENPI, SMC4, IRF7, VSTM2L, GDF15 XPO1, LDHA, PGAM1, VPS53, TTK, SLC7A5, PIP5KL1, ATP2B4, SEMA7A, SLC2A1, CREB3L1, SPRED3, ANKZF1, OSBP2, OPA1, BST2, EFTUD2, ERP29, PNPLA2, FIBCD1, TACC1, HLA-F, GO:0016020~membrane 2.52E-06 1.586904 3.73E-04 100 TACSTD2, PAM, ACHE, PANX1, CTPS1, SFXN1, LGALS3BP, ERO1A, CSE1L, NCAPG2, GYS1, LFNG, HIST1H4H, MKI67, SYT12, NIPA1, EVL, RERG, ATP7A, PLEKHA4, SLC7A2, SLC7A1, MYH14, PLEKHA2,


HIST1H3H, NPNT, HELZ, GLI2, GTSE1, SLC1A4, FOS, DIP2B, P4HA1, HMOX1, DENND5B, DPP4, DHCR24, KCNMA1, KIF14, RAP2A, CCDC88A, PFKL, MPP2, PFKP, ESR1, SLC3A2, SLC9A3R1, GRAMD1B, KPNA2, PPP1R15A, FUT8, TNC, HK2, CDH1, PALMD, CEP55, PLPP2, ANXA6, NDC1, NCAPG, BCL2, BUB1, PAFAH1B3, ETNK1, AGO2, KDM3A, LMLN, NF2, NF1, CENPE, NUP155, ITPR1, SLC16A3, CYBA, TFRC, LRP8, CIT, PGK1 CNN3, TNC, GJA1, CDH1, PDLIM1, RHOU, ANXA6, PTK2, AIF1L, RHOB, PAK1, LMLN, DPP4, ADAM9, LPP, DOCK7, GO:0005925~focal adhesion 8.84E-06 2.589372 7.83E-04 29 EVL, FLNC, CTNNA1, CSRP2, CORO1C, RND3, TNS3, CYBA, ARHGAP31, ITGA5, SVIL, HSPB1, TRIP6 PXDN, VGF, HIST1H2BK, APOD, SEMA7A, HMOX1, IL1B, SEMA3C, APLN, LTB, TSKU, ADAM9, SOGA1, STC2, IL24, TCN1, PTHLH, CHGA, TACSTD2, HIST2H2BE, ST14, ANOS1, HSPB1, SLPI, COL1A1, CXCL1, WNT5A, RBP4, PAM, ACHE, GO:0005615~extracellular TNC, CXCL3, C5, CXCL2, CD109, 7.28E-05 1.658768 0.005363 64 space CXCL8, SPOCK1, PRSS8, LGALS3BP, C1QTNF6, COL6A3, TFF2, TFF3, TFF1, THBS1, FGFBP1, MTMR4, OLFM1, ANGPTL4, HIST1H2BG, FBN1, DPYSL3, SPARC, LCN2, TFRC, POP1, METRNL, AREG, GDF15, PCSK1N, BMP7, IGFBP2, WNT7A, MUC5B SGO2, HJURP, KNL1, ZWINT, GO:0000777~condensed 1.86E-04 4.414145 0.011711 11 FBXO28, RASSF2, BUB1, CENPE, chromosome kinetochore BIRC5, SEPT6, SMC1A KIF14, RAP2A, SVIL, SLC2A1, KLHL13, CENPF, BIRC5, CENPE, GO:0030496~midbody 3.31E-04 3.518251 0.01818 13 CEP55, SEPT6, ECT2, TACC1, SHCBP1 XPO1, ZWINT, FBXO28, RASSF2, GO:0000776~kinetochore 5.00E-04 4.310108 0.024294 10 BUB1, CENPF, TTK, CENPE, SMC1A, CENPI Molecular Function Fold Term p-value Benjamini Count Genes Enrichment GO:0005515~protein binding 3.56E-06 1.193234 0.001395 308 MEF2C, LDHA, SLC9A7, XRCC2,


PGAM1, VPS53, G2E3, ATP2B4, ISG15, APOD, INSIG2, MAP3K9, SEMA7A, YEATS2, CREB3L1, ADAM8, ADAM9, PTPRJ, OSBP2, OPA1, TNIK, ZHX2, SKP2, ESPL1, PIM3, CTNNA1, RND3, SPAG9, UHRF1, KRT17, SGO2, TACSTD2, ZWINT, PGM1, PIAS2, RBP4, TAF9B, NFKBIA, UBA6, IGF2BP3, DENND2D, MYO9A, IFI35, UHMK1, TK1, UBASH3B, NCAPG2, GYS1, OBSL1, WDHD1, PPP2R2C, TUBB4A, OLFM1, MKI67, FAM111B, UBE2L6, TPD52L1, CSRP2, S100A14, PKP2, POP1, AREG, RBPJ, BMP7, PDZK1, CLSPN, GJA1, BNIP3, GLI2, RNF182, GTSE1, PKM, FAM168A, SMAP2, TNFRSF11A, CEP250, P4HA1, DPP4, KCNMA1, RAP2A, EMSY, HIST1H1C, TPX2, TP53, SLC3A2, ACKR3, FLNC, EPB41L3, AMIGO2, RRM2, RRM1, ESRP1, GADD45A, RASD1, PPFIA4, CXCL2, C5, POLA1, HK2, HK1, CXCL8, PHIP, XBP1, SQSTM1, NCAPG, BCL2, BUB1, AGO1, USP37, AGO2, ETV1, EXOC5, TBC1D1, THBS1, MTMR4, TRIP12, BCL9, PDK1, SHMT2, PDK3, DLGAP5, HIST1H2BG, BIRC5, WWTR1, MID1, CDKN3, PCK2, ITPR1, SLC16A3, RLF, RPS6KA3, DUSP2, RPS6KA1, TFRC, PYGL, KNL1, SLC16A9, SVIL, RASSF2, MT2A, LRP8, DPYD, CIT, PGK1, WNT7A, XPO1, JDP2, F2RL1, CRABP2, TTK, NRCAM, CDKN2B, GATA6, GATA3, FBXO28, SLC2A1, RBCK1, ANKZF1, MX1, MX2, BST2, CHAC1, EFTUD2, RELB, FIBCD1, IL24, NTSR1, TACC1, TNNT2, ASCL1, MIB1, TNS3, ANOS1, HSPB1, RAD18, COL1A1, PTGFRN, SEPT6, CRACR2B, PRPS1, WNT5A, PAM, ACHE, IFIH1, ASS1, IFITM3, CTPS2, CXXC5, SESN2, PTK2, CSE1L, ERO1A, JUND, TFF2, TFF3, NDRG1, SSX2IP, TFF1, BMF, ANGPTL4, HIST1H4H, PARD6B, S100P, NIN, LPP, EFEMP2, HENMT1, ARID3A, EVL, CORO1C, ATP7A,


DNA2, ATF4, CDKN1A, MYO10, ATF3, BBC3, ITGA5, SLC7A1, CIRBP, HOXB9, SCARA3, TRIP6, SMC1A, IFI6, PLEKHA2, MUC5B, HIST1H3H, FIGNL1, ANO1, SULT2B1, CBFA2T3, RHOU, FAM83D, EPCAM, FOS, TRIM9, HMOX1, DDX60, PRMT6, RHOB, PAK1, EFR3B, KIF14, PPP2R1B, PFKL, RAB3IL1, ESR1, SLC9A3R1, ECT2, MCM6, DHRS2, PSME1, ERN1, SLPI, SGF29, KPNA2, PPP1R15A, KPNA1, PMEPA1, SHCBP1, AP1M2, TRIB3, CDH1, ASNS, CEP55, GPRC5A, PLPP2, PRSS8, ANXA6, C1QTNF6, ZNF703, HJURP, SLC35B4, PAFAH1B3, ZC3H12A, ETNK1, SCG5, KLHL42, FBN2, TNPO1, NEFL, FGFBP1, EXO1, CEBPB, NF2, TONSL, CEBPG, NF1, FBN1, CENPF, KIF18B, DPYSL3, CENPE, SPARC, NUP155, CENPI, SMC4, CYBA, ICK, ANXA10, IRF7, VSTM2L, MIS18BP1, IGFBP2, GDF15, IGFBP5, F2R MEF2C, HIST1H2AC, RBP4, JDP2, PANX1, TAF9B, POLA1, BNIP3, FOS, HIST1H2BK, XBP1, BCL2, PAFAH1B3, HIST3H2A, GO:0046982~protein NEFL, HIST1H4H, CEBPB, 2.78E-06 2.488528 0.002178 34 heterodimerization activity CEBPG, HIST1H2BG, TP53, ZHX2, TPD52L1, BIRC5, NTSR1, MID1, CTNNA1, ABCG1, SMC4, CYBA, ATF4, ATF3, HIST2H2BE, SMC1A, HIST1H3H SLC9A7, ACHE, KYNU, BNIP3, TTK, ASNS, ANXA6, TYMS, PLOD1, XBP1, SQSTM1, TRIM9, BCL2, MAP3K9, HMOX1, ABCB10, DPP4, EMSY, GO:0042803~protein CCDC88A, CEBPB, STC2, BST2, 2.49E-05 2.004759 0.006506 43 homodimerization activity ERP29, ZHX2, ARID3A, CENPF, TPD52L1, BIRC5, NTSR1, WWTR1, MID1, ECT2, ABCG1, LCN2, ASCL1, ATF3, TFRC, PYGL, ERN1, ABAT, DPYD, HPGD, PRPS1 MYH15, XRCC2, FIGNL1, TTLL4, TTK, HELZ, PKM, ATP2B4, PIP5KL1, MAP3K9, DDX60, GO:0005524~ATP binding 1.02E-04 1.593578 0.019839 70 ABCB10, PAK1, ITPK1, IPMK, ATP8B3, KIF14, TNIK, PFKL, SARS, TP53, TPX2, PFKP, PIM3, MCM6, RRM1, ERN1, PRPS1,


IFIH1, ASS1, PFKFB3, HK2, TRIB3, UBA6, CTPS1, HK1, CTPS2, ASNS, EPHB3, MYO9A, UHMK1, CMPK2, TK1, PTK2, BUB1, ETNK1, PAPSS2, PDK1, MKI67, PDK3, KIF18B, ATAD2, UBE2L6, CENPE, AK4, RIMKLA, ABCG1, SMC4, ATP7A, DNA2, ICK, RPS6KA3, MYO10, RPS6KA1, PYGL, MYH14, DHX40, SMC1A, PGK1, CIT

KEGG Pathway

| Term | p-value | Fold Enrichment | Benjamini | Count | Genes |
|---|---|---|---|---|---|
| hsa05230:Central carbon metabolism in cancer | 5.16E-05 | 5.031184 | 0.011944 | 11 | SLC16A3, PKM, PDK1, PFKL, SLC2A1, TP53, PGAM1, HK2, PFKP, HK1, SLC7A5 |
| hsa00010:Glycolysis / Gluconeogenesis | 4.00E-04 | 4.369006 | 0.045506 | 10 | PKM, LDHA, PFKL, PGM1, PGAM1, HK2, PFKP, HK1, PCK2, PGK1 |
| hsa01130:Biosynthesis of antibiotics | 8.46E-04 | 2.485387 | 0.048094 | 18 | LDHA, SHMT2, PFKL, ASS1, PFKP, HK2, PGAM1, HK1, AK4, PCK2, CMBL, PKM, PGM1, PHGDH, PGK1, PSAT1, PAPSS2, PRPS1 |


Appendix Table 3.20. Enriched GO terms and KEGG pathways among DEGs in PC3 between 5% O2 in DMEM and 18% O2 in Plasmax.

| Biological Process | All DEGs | 5D | 18P |
|---|---|---|---|
| GO:0070059~intrinsic apoptotic signaling pathway in response to endoplasmic reticulum stress | 11 | - | 9 |
| GO:0001666~response to hypoxia | 21 | 13 | 8* |
| GO:0060337~type I interferon signaling pathway | 11 | - | 11 |
| GO:0061621~canonical glycolysis | 8 | 7 | - |
| GO:0042149~cellular response to glucose starvation | 8 | - | 7 |
| GO:1990440~positive regulation of transcription from RNA polymerase II promoter in response to endoplasmic reticulum stress | 6 | - | 6 |
| GO:0030336~negative regulation of cell migration | 13 | 6* | 7* |
| GO:0042493~response to drug | 25 | - | 16* |
| GO:0043066~negative regulation of apoptotic process | 31 | 16* | 15* |
| GO:0043434~response to peptide hormone | 9 | - | 6 |
| GO:0045766~positive regulation of angiogenesis | 14 | - | 10 |
| GO:0034976~response to endoplasmic reticulum stress | 11 | - | 9 |
| GO:0030198~extracellular matrix organization | 18 | 12* | - |
| GO:0008283~cell proliferation | 26 | 16* | 10* |
| GO:0071456~cellular response to hypoxia | 12 | 5** | 7* |
| GO:0000278~mitotic cell cycle | 8 | 7 | - |
| GO:0006094~gluconeogenesis | 8 | - | 4* |
| GO:0034097~response to cytokine | 8 | - | 6 |
| GO:0071222~cellular response to lipopolysaccharide | 11 | - | 8 |

| Cellular Component | All DEGs | 5D | 18P |
|---|---|---|---|
| GO:0005829~cytosol | 154 | 94 | 57 |
| GO:0070062~extracellular exosome | 123 | 57** | 64 |
| GO:0005737~cytoplasm | 198 | 122 | - |
| GO:0016020~membrane | 100 | 64 | 76* |
| GO:0005925~focal adhesion | 29 | 20 | 9** |
| GO:0005615~extracellular space | 64 | - | 42 |
| GO:0000777~condensed chromosome kinetochore | 11 | 11 | - |
| GO:0030496~midbody | 13 | 13 | - |
| GO:0000776~kinetochore | 10 | 10 | - |
| GO:0005819~spindle | 11 | 10 | - |

| Molecular Function | All DEGs | 5D | 18P |
|---|---|---|---|
| GO:0005515~protein binding | 308 | 181* | 124* |
| GO:0046982~protein heterodimerization activity | 34 | - | 23 |
| GO:0042803~protein homodimerization activity | 43 | 24* | 19* |
| GO:0005524~ATP binding | 70 | 53 | - |

| KEGG Pathway | All DEGs | 5D | 18P |
|---|---|---|---|
| hsa05230:Central carbon metabolism in cancer | 11 | 9 | - |
| hsa00010:Glycolysis / Gluconeogenesis | 10 | 8 | - |
| hsa01130:Biosynthesis of antibiotics | 18 | 12* | - |

Abbreviations: 5D, 5% O2 in DMEM; 18P, 18% O2 in Plasmax. The most significantly enriched GO term or KEGG pathway is determined by the p-value and the Benjamini-corrected p-value (both ≤ 0.05) unless indicated otherwise; * significantly enriched based on p-value only (≤ 0.05); ** not significantly enriched; (-) no result.
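For readers unfamiliar with the Benjamini-corrected p-value referred to in the footnote, the following is a minimal sketch of the standard Benjamini-Hochberg step-up adjustment. It is an assumption that the enrichment tool applies the procedure in exactly this form, and the function name `benjamini_hochberg` and the input p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest p-value), and a
    running minimum taken from the largest rank downward keeps the adjusted
    values monotone and capped at 1.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four terms.
raw = [0.01, 0.04, 0.03, 0.005]
print(benjamini_hochberg(raw))
```

Terms whose adjusted value stays at or below 0.05 would be reported without a marker in the tables above; terms that pass only on the raw p-value would carry a single asterisk.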


Appendix Table 3.21. Lists of DEGs and the summary of their specific involvement in cell cycle and DNA replication.

Biological Process

| Gene | Term |
|---|---|
| CDK1; CDK2; CDC25A; CDC6; EXO1; PCNA; RFC2; POLE; POLE2; POLD2; POLD3 | DNA replication |
| CCNB1; CCNB2; CCNA2; CDK1; CDK2; CDCA8; CDC25A; BUB1; CDC20; CDC6 | Cell division |
| CDC25A; CDK1; CDK2; CDC6; CDKN1A; PCNA; POLE; POLE2 | G1/S transition of mitotic cell cycle |
| CCNB1; CCNB2; CDK1; CDK2; CDC25A; CDKN1A; CDKN2B | G2/M transition of mitotic cell cycle |
| CCNB1; CDK1; CDK2; CDKN1A; GADD45A; PCNA; TP53 | DNA damage response; signal transduction by p53 class mediator resulting in cell cycle arrest |
| PCNA; RFC2; BRCA2; POLE; POLE2; POLD2; POLD3 | Telomere maintenance via recombination |
| CCNB2; CCNA2; CDK1; CDK2; CDC25A; CDC6 | Mitotic nuclear division |
| CDK1; CDC25A; SERPINF1; BUB1; PCNA; TP53 | Cell proliferation |
| CCNB1; CCNB2; CDC25A; GADD45A; GADD45B; GADD45G | Regulation of cell cycle |
| CDK1; CDK2; GADD45A; EXO1; POLE; POLE2 | DNA repair |
| CCNA2; CDC25A; CDC6; BUB1; CDC20 | Mitotic nuclear division |
| CCNA2; CDC25A; CDC6; CDKN1A; GADD45A | Regulation of cyclin-dependent protein serine/threonine kinase activity |
| PCNA; RFC2; POLE; POLD2; POLD3 | Nucleotide-excision repair; DNA gap filling |
| CCNB1; CDK1; CDK2; CDC20 | Negative regulation of ubiquitin-protein ligase activity involved in the mitotic cell cycle; regulation of ubiquitin-protein ligase activity involved in mitotic cell cycle |
| EXO1; BRCA2; POLE; POLD3 | DNA synthesis involved in DNA repair |
| CCNB1; CCNA2; TP53 | Cellular response to hypoxia |
| CCNB1; CDK1; CDKN1A | Response to drug |
| CCNB1; CDK1; CDC20 | Anaphase-promoting complex-dependent catabolic process |
| CDK1; CDC6; PCNA | Regulation of transcription involved in G1/S transition of mitotic cell cycle |
| CDK1; CDKN1A; TP53 | Negative regulation of the apoptotic process |
| CDK2; EXO1; RFC2 | Regulation of signal transduction by p53 class mediator |
| CDC6; POLE; POLE2 | DNA replication initiation |
| CDCA8; PCNA; TP53 | Protein sumoylation |
| CDCA8; BUB1; CDC20 | Sister chromatid cohesion |
| GADD45A; GADD45B; GADD45G | Positive regulation of the apoptotic process |
| PCNA; RFC2; POLD3 | Telomere maintenance via recombination |
| CCNB1; CDCA8 | Mitotic metaphase plate congression |
| CDK1; PCNA | Cellular response to hydrogen peroxide |
| CCNA2; CDKN1A | Organ regeneration |
| CDK1; CDKN1A | Response to a toxic substance |
| BUB1; CDKN2B | Mitotic cell cycle checkpoint |
| EXO1; BRCA2 | Strand displacement |
| PCNA; POLD3 | DNA strand elongation involved in DNA replication |
| CCNB1 | Mitotic spindle organization; response to mechanical stimulus |
| CDK1 | Protein localization to the kinetochore |
| CDC6 | Negative regulation of cell proliferation |
| CDKN1A | Cellular response to gamma radiation |
| SERPINE1 | Positive regulation of angiogenesis; cellular response to lipopolysaccharide |
| PCNA | Response to estradiol |
| TP53 | Negative regulation of transcription from RNA polymerase II promoter; transcription factor binding |

Molecular Function

| Gene | Term |
|---|---|
| CCNB1; CCNB2; CCNA2; CDK1; CDK2; CDCA8; CDC25A; CDC6; SERPINE1; CDKN1A; CDKN2B; BUB1; BRCA2; CDC20; GADD45A; GADD45B; GADD45G; EXO1; PCNA; RFC2; POLE2; POLD2; POLD3; TP53 | Protein Binding |
| CCNB1; CCNA2; CDC25A; CDKN2B; TP53 | Protein kinase binding |
| CDK1; CDK2; BUB1; RFC2; TP53; CDC6 | ATP Binding |
| CDK1; EXO1; PCNA; POLE; TP53 | Chromatin binding |
| EXO1; PCNA; RFC2; POLE; POLE2 | DNA binding |
| PCNA; TP53 | Identical protein binding |
| TP53 | Protein heterodimerization activity |

KEGG Pathway

| Gene | Term |
|---|---|
| CCNB1; CCNB2; CCNA2; CDK1; CDK2; CDC25A; CDC6; PCNA; TP53; CDKN1A; CDKN2B; BUB1; CDC20; GADD45A; GADD45B; GADD45G | Cell cycle |
| CCNB1; CCNB2; CDK1; CDK2; SERPINE1; CDKN1A; GADD45A; GADD45B; GADD45G; TP53 | p53 signaling pathway |
| CCNB1; CCNB2; CCNA2; CDK1; CDK2; CDC25A | Progesterone-mediated oocyte maturation |
| PCNA; RFC2; POLD2; POLD3; POLE; POLE2 | DNA replication |
| CCNB1; CCNB2; CDK1; CDK2; BUB1 | Oocyte meiosis |
| CCNA2; CDK1; CDK2; CDKN1A; TP53 | Viral carcinogenesis |
| CCNB1; CCNB2; CDK2; GADD45A | FoxO signaling pathway |
| CCNA2; CDK2; CDKN1A; PCNA | Hepatitis B |
| EXO1; PCNA; RFC2; POLD3 | Mismatch repair |
| PCNA; POLD3; POLE; POLE2 | Base excision repair; HTLV-I infection |
| POLD3; POLE; POLE2 | Pyrimidine metabolism |
| POLD3 | Homologous recombination |


Appendix Table 3.22. List of DEGs encoding MCM proteins and a summary of their involvement in biological processes, molecular functions and KEGG pathways.

Biological Process

| Gene | Term |
|---|---|
| MCM2; MCM3; MCM4; MCM5; MCM6; MCM7; MCM10 | DNA replication |
| MCM2; MCM3; MCM4; MCM5; MCM6; MCM7; MCM10 | G1/S transition of mitotic cell cycle; DNA replication initiation |
| MCM2; MCM4; MCM6; MCM7 | DNA unwinding involved in DNA replication |
| MCM3; MCM5; GINS1 | DNA duplex unwinding |
| MCM2 | Cellular response to interleukin; Cell cycle |
| MCM5 | Cell division |
| GINS1; GINS3 | DNA strand elongation involved in DNA replication |

Molecular Function

| Gene | Term |
|---|---|
| MCM2; MCM3; MCM4; MCM5; MCM6; MCM7; MCM10 | Protein binding |
| MCM2; MCM3; MCM4; MCM5; MCM6; MCM7 | ATP binding; DNA helicase activity |
| MCM2; MCM3; MCM4 | DNA binding |
| MCM2; MCM5; MCM10 | DNA replication origin binding |
| MCM2 | Histone binding |
| MCM5 | Chromatin binding |
| MCM6 | Identical protein binding |
| MCM10 | Single-stranded DNA binding |

KEGG Pathway

| Gene | Term |
|---|---|
| MCM2; MCM3; MCM4; MCM5; MCM6; MCM7 | Cell cycle; DNA replication |


Appendix Table 3.23. Lists of DEGs involved in KEGG pathways associated with viral carcinogenesis.

| Gene | KEGG Pathway |
|---|---|
| JUN; CCND3; CHEK1; ATF4; CCNE2; IRF7; CCNA2; CDC20; CDK1; CDK2; CDKN1A; CDKN2B; TP53; CREB3L1; HIST1H4H; HIST1H2BD; HIST1H2BG; HIST1H2BO; HIST1H2BK; HIST2H4A; HIST2H2BE; HIST2H2BF; HIST3H2BB; RBL1; HLA-F; IL6ST; PKM; PSMC1; SKP2; SP100 | Viral carcinogenesis |
| JUN; BAX; EGR1; E2F1; E2F2; FOS; HRAS; CCND3; CHEK1; CDC20; CDKN1A; CDKN2C; PCNA; POLE; POLE2; POLD2; POLD3; ANAPC5; HLA-DQB1; AD2L1; PTTG1; RANBP1; SMAD3; WNT7B; ZFP36 | HTLV-I infection |
| JUN; BAX; EGR1; E2F1; E2F2; FOS; HRAS; ATF4; CCNE2; IRF7; CCNA2; CDK2; CDKN1A; PCNA; BCL2; BIRC5; DDX58; FAS; IFIH1; STAT1; STAT2 | Hepatitis B |
| DDX58; CDKN1A; IFIT1; CLDN9; IRF7; OAS3; CLDN1; CXCL8; NFKBIA; OAS2; STAT1; CLDN23 | Hepatitis C |
| CCND3; CCNE2; IRF7; CDK2; DDX58; IFIH1; STAT1; STAT2; ADAR; BBC3; FYN; HSPA8; MX1; OAS2; OAS3 | Measles |
| IFIH1; SP100; OAS3; HLA-A; NFKBIA; OAS2; STAT1; HLA-E; CCL5; HLA-F; DDX58; FOS; IFIT1; JUN; IRF7; TAP1; JAK2 | Herpes simplex infection |
| DDX58; CDKN1A; JUN; BCL2; RELB; ENTPD8; HLA-A; NFKBIA; HLA-E; TNFAIP3; CCNA2; HLA-F | Epstein-Barr virus infection |
| DDX58; IFIH1; JUN; IRF7; OAS3; CXCL8; NFKBIA; RSAD2; JAK2; OAS2; CCL5; STAT1; KPNA2; HSPA8 | Influenza A |

Appendix Table 3.24. Lists of DEGs involved in the glycolysis/gluconeogenesis pathway.

Biological Process

| Gene | Term |
|---|---|
| PKM, TPI1, PFKFB3, PGAM1, HK2, PFKP, GAPDH, ENO1, PFKL, HK1, PGK1 | Canonical glycolysis |
| LDHA, TPI1, PGM1, PGAM1, HK2, DHTKD1, GAPDH, ENO1 | Glycolytic process |
| GPD2, RBP4, ATF4, ATF3, PGM1, PGAM1, PCK2, PGK1 | Gluconeogenesis |

KEGG Pathway

| Gene | Term |
|---|---|
| PKM, LDHA, TPI1, PGM1, PGAM1, HK2, PFKP, GAPDH, ENO1, PFKL, HK1, PCK2, PGK1 | Glycolysis / Gluconeogenesis |

Appendix Table 3.25. Lists of DEGs involved in biological processes associated with response to hypoxia.

| Biological process | Number of genes | Gene |
|---|---|---|
| Cellular response to hypoxia and response to hypoxia | 3 | VEGFA, BNIP3, HMOX1 |
| Cellular response to hypoxia | 24 | CCNB1, STC1, BNIP3L, PTN, EIF4EBP1, SLC29A1, EDN1, ERO1A, NDRG1, BBC3, BCL2, E2F1, ZFP36L1, MST1, STC2, CCNA2, ADAM8, TP53, GATA6, SLC9A1, SUV39H1, FMN2, BMP7, FAM162A |
| Response to hypoxia | 26 | ASCL2, ETS1, KCNMA1, EGR1, LDHA, MB, DPP4, ERCC2, ABAT, THBS1, CAV1, PDLIM1, UCP2, CYBA, CAT, PAM, PAK1, PKM, CBFA2T3, EP300, ANGPTL4, DDIT4, PLOD2, ITPR1, PLOD1, NF1 |
