DOWNREGULATION OF THE UNFOLDED RESPONSE BY METFORMIN MAY CONFER ANTINEOPLASTIC EFFECTS IN COLORECTAL CANCER

by

Mary L. Fay

A thesis submitted to the Department of Pathology & Molecular Medicine

In conformity with the requirements for

the degree of Master of Science

Queen’s University

Kingston, Ontario, Canada

July, 2021

Copyright © Mary L. Fay, 2021 Abstract

In 2020, there were 1.93 million cases of colorectal cancer (CRC) diagnosed and 935 thousand CRC-related deaths (1–3). Approximately 50 percent of patients are diagnosed with stage

III or IV disease, limiting the efficacy of available treatments (4). Type 2 diabetes mellitus (T2D) is a risk factor for CRC development and progression. Nonetheless, metformin-treated diabetic

CRC patients tend to have better clinical outcomes than diabetic CRC patients not exposed to metformin. Thus, we sought to investigate the molecular underpinnings of metformin’s protective effects to determine whether its mechanism could be exploited to improve patient outcomes.

We performed a targeted transcriptomic analysis of primary CRC tissue samples (N = 272).

Next, a supervised learning algorithm was used to pinpoint features discriminating between metformin-treated and diet-controlled diabetic CRC samples, as well as those discriminating between non-diabetic samples based on their five-year survival status; these features were used to generate corresponding classification models. Finally, we analyzed the differential expression patterns in diabetic treatment groups and non-diabetic survival categories to identify expression profiles suggesting that a gene and/or pathway may be linked to metformin’s antineoplastic effects.

Our results show downregulation of TMEM132 in metformin-treated samples (p = 0.053) and non-diabetics with good clinical outcomes (p = 0.053) relative to diet-controlled and non- diabetics with poor survival, respectively; SCNN1A was upregulated in metformin-treated samples

(p = 0.044) and non-diabetics with good clinical outcomes (p = 0.011) relative to diet-controlled samples and those with poor clinical outcomes, respectively; FN1 was downregulated in metformin-treated samples relative to diet-controlled samples (p = 0.121). These findings suggest a role for the unfolded protein response (UPR) in mediating metformin-related CRC protective

ii effects, by enhancing apoptosis. We also show that sFas, an antiapoptotic protein, is downregulated in metformin-treated samples relative to diet-controlled samples (p = 0.005).

Therefore, metformin may synergistically enhance apoptosis through the regulation of Fas alternative splicing, inhibiting CRC development and progression. Taken together, we outline two key cellular processes that may be implicated in metformin’s antineoplastic action and provide justification for further investigation of these pathways to determine if they may be targets for novel CRC therapies.

iii

Disclaimer of Transdisciplinary Research Contributions

I conducted the analyses and generated all the figures and tables presented in this thesis. However, several members of the Queen’s University Colorectal Cancer Transdisciplinary Research Group

(CoRecT) were involved in developing resources that were used during the course of this investigation, and those contributions are described below.

Electronic Database

The analysis was performed on data compiled in the electronic clinical database, which was constructed in 2013 (5). Since then, the pathology data has been updated by Dr. Christine Orr and the survival data has been updated by Ms. Nicole Archer.

mRNA Quantification

The custom gene list used in our NanoString nCounter assay was generated by Dr. Chris Nicol.

FFPE tissue block annotation on primary tumour specimens was performed by Dr. David Hurlbut to guide the removal of malignant tissue samples for subsequent RNA extraction. RNA preparation for the first two batches of NanoString data was done by Dr. Christine Orr (NanoString Batch 1) and Ms. Melinda Chelva (NanoString Batch 2).

iv

Acknowledgements

First, I would like to thank my supervisor, Dr. Scott Davey, for this amazing opportunity. To be continually challenged and supported throughout this process has allowed me to learn more than I could have imagined when I first began my studies at Queen’s University, making this experience invaluable. I will be forever grateful.

I would also like to thank Dr. Harriet Feilotter and Dr. David Hurlbut for their assistance and guidance throughout this project and during my work as an undergraduate student. It has truly been a privilege to learn from and work with you both. I would also like to thank Dr. Kathrin Tyryshkin, whose bioinformatics instruction was instrumental in allowing me to perform these analyses.

I also want to recognize Nicole Archer and Nick Zhang for their technical expertise, without whom, this work would not have been possible. To Nicole, thank you for your patience, assistance, and for all of the encouragement; it has been such a joy to work with you.

To my family; Mom, Dad, Evan, Matt, Marty, Oma, Opa, Nanny & Papa, thank you for your continual love and support. To have grown up surrounded by such caring people is something I am beyond thankful for. Mom and Dad, thank you for teaching me to always do my best and to never give up when faced with a challenge.

To an amazing friend, advocate, and three-time cancer survivor; Dylan Buskermolen, your courage, positive attitude, selflessness, and ability to embrace unimaginable challenges are a true inspiration.

v

Table of Contents

Abstract ...... ii Disclaimer of Transdisciplinary Research Contributions ...... iv Acknowledgements ...... v List of Figures ...... ix List of Tables ...... x List of Abbreviations ...... xi Chapter 1 Introduction ...... 1 1.1 Colorectal Cancer: A Growing Global Burden ...... 1 1.2 Key Clinicopathological Features & their Effect on Patient Outcomes ...... 1 1.2.1 Stage at Diagnosis ...... 1 1.2.2 Location of the Primary Tumour ...... 2 1.2.3 Colorectal Cancer Grade ...... 4 1.3 Clinical Management of Colorectal Cancer Patients ...... 4 1.3.1 Colorectal Cancer Screening ...... 4 1.3.2 Colorectal Cancer Diagnosis & Treatment ...... 6 1.4 Colorectal Cancer & Diabetes ...... 7 1.5 Type 2 Diabetes Mellitus ...... 8 1.6 Metformin in Diabetes...... 9 1.7 Metformin in Colorectal Cancer...... 10 1.7.1 Metformin & Cellular Proliferation in Cancer Onset & Progression ...... 11 1.7.2 Metformin & Epithelial-Mesenchymal Transition ...... 12 1.7.3 Metformin in Cellular Invasive & Metastatic Potential ...... 13 1.7.4 Metformin & Cell Stemness ...... 14 1.7.5 Metformin in Immune Markers & Inflammation ...... 15 1.8 Insulin-Treatment & Colorectal Cancer ...... 16 1.9 CRC Disease Heterogeneity: A Challenge in Research & Clinical Applications ...... 17 1.10 Gene Expression Profiling...... 19 1.11 Formalin-Fixed Paraffin-Embedded Samples ...... 19 1.12 NanoString Technology ...... 21 1.12.1 Performance of NanoString® Platform on FFPE Samples ...... 22 1.13 Machine Learning...... 23 1.13.1 Feature Selection for Dimension Reduction ...... 24 1.13.2 Unsupervised Machine Learning ...... 26 1.13.3 Supervised Machine Learning ...... 28 1.13.4 k-Fold Cross-Validation ...... 29 1.14 Hypothesis & Specific Aims ...... 30

vi

Chapter 2 Material and Methods ...... 31 2.1 Selection of Patient Cohort...... 31 2.2 Clinical & Pathological Data Retrieval ...... 32 2.3 mRNA Quantification ...... 32 2.3.1 RNA Extraction from FFPE Samples ...... 33 2.3.2 Selection of for Custom nCounter Codeset...... 33 2.4 Data Preprocessing ...... 34 2.4.1 Initial Quality Control ...... 34 2.4.2 Batch Calibration in nSolver Software ...... 35 2.4.3 Normalization & Additional Quality Control ...... 35 2.4.4 Data Visualization for Inspection for Batch Effects & Outliers ...... 37 2.4.5 Forgoing Filtering of Features with Low Expression Levels ...... 37 2.5 Data Analysis ...... 38 2.5.1 Assigning Data to Training & Validation Cohorts ...... 38 2.5.2 Feature Selection for Dimension Reduction ...... 38 2.5.3 Unsupervised Machine Learning: Hierarchical Clustering ...... 39 2.5.4 Supervised Machine Learning: Building a Classification Algorithm ...... 39 2.5.5 Classifying Metformin-Treated & Diet-Controlled Samples...... 39 2.5.6 Classifying Samples According to their Five-Year Survival Status ...... 40 2.5.7 Investigation of Gene Expression Profiles in Key Discriminating Features ...... 40 2.6 Statistical Analysis ...... 41 Chapter 3 Results ...... 42 3.1 mRNA Quantification in the Study Cohort ...... 42 3.1.1 RNA Extraction...... 42 3.1.2 NanoString nCounter Assay & Initial Quality Control ...... 44 3.1.3 Evaluation of Housekeeping Genes ...... 46 3.1.4 Data Normalization & Additional Quality Control...... 48 3.1.5 Data Visualization Indicates that Preprocessing was Successful...... 48 3.2 Non-T2D CRC Patients Display Higher Minimum Survival Times, Comorbid T2D is More Common in Male CRC Patients & Insulin-Treated Patients Present with Higher Grade Tumours...... 52 3.3 Bioinformatics: Analysis of Gene Expression Data ...... 54 3.3.1 Randomly Partitioned Training & Validation Cohorts did not Differ in their Distribution of Key Clinicopathological features ...... 56 3.3.2 A Supervised Learning Approach Identifies 43 Features Discriminating Between Metformin- Treated & Diet-Controlled Samples & 61 Features Discriminating Between Non-Diabetic CRC Samples Based on their Five-Year Survival Status ...... 56 3.3.3 The Classification Models Developed Using the Molecular Data Show Some Capacity to Predict Diabetic Treatment Group and Five-Year Survival Status ...... 60 3.4 Investigation of Discriminating Genes Common in the Two Analyses Reveals Five Distinct Expression Profiles ...... 63 3.4.1 Expression of TMEM132A & SCNN1A in Metformin-Treated Samples Showed the Same Overall Trend in Non-Diabetic Samples Alive at Five Years ...... 65 3.4.2 Expression of sFas, FN1, & SPG20 in Metformin-Treated Samples Showed the Same Overall Trend in Non-Diabetic Samples ...... 69

vii

3.4.3 Expression of SORL1 & HNF4A Differentiate Between Metformin-Treated & Diet-Controlled Samples ...... 72 3.4.4 Expression of FOXC2, OCT4 & AKT1 Differentiate Between Non-Diabetic Samples Based on Five-Year Survival Status ...... 72 3.4.5 Expression of TGFBR1, JAG1 & HGF Differentiate Based on Diabetic Status ...... 74 Chapter 4 Discussion ...... 75 4.1 Colorectal Cancer: A Lack of Effective Treatments Driving High Mortality...... 75 4.2 Three Biologically Related Genes Identified as Potential Biomarkers of Metformin Action ...... 76 4.2.1 Downregulation of the Unfolded Protein Response May Protect Against CRC ...... 77 4.2.2 GRP78: A Key Player in the UPR to Restore Proteostasis & Sustain Cell Survival ...... 78 4.2.3 Downregulation of GRP78 Leads to Persistent Stress & the Initiation Apoptosis ...... 81 4.2.4 Upregulation of the Unfolded Protein Response in Cancer Pathology & Type 2 Diabetes ...... 83 4.2.5 Upregulation of the UPR Drives Aggressive Malignancies & Poor Patient Outcomes ...... 84 4.2.6 Metformin May Downregulate sFas to Synergistically Enhance Apoptosis ...... 87 4.2.7 sFas in Malignant Pathology & Type 2 Diabetes ...... 89 4.3 Suppression of the UPR & sFas: Implications for CRC Immunotherapy ...... 90 4.4 Implications of Metformin in the CRC Consensus Molecular Subtypes ...... 92 4.5 Three Additional Gene Expression Profiles Identified ...... 92 4.5.1 The Role of Genes Differentiating Between Metformin-Treated & Diet-Controlled Samples in Cancer Pathology Remain Unclear ...... 93 4.5.2 Decreased Expression of FOXC2, OCT4 & AKT1 in Patient Samples with Better Survival Contradicts the Functional Role of these Genes Reported in the Current Literature ...... 94 4.5.3 Invasive & Metastatic Genes Differentiate Between Diabetic & Non-Diabetic Samples ...... 95 4.6 The Classification Algorithms Displayed Some Predictive Power ...... 96 4.7 Robust Performance of the NanoString nCounter® Assay on Degraded RNA Samples ...... 97 4.8 Limitations of the Investigation ...... 98 4.9 Conclusions & Future Directions ...... 99 References ...... 101 Appendix A Supplementary Tables ...... 169

viii

List of Figures

Figure 1. One Dimensional vs Multidimensional Data Analysis Between Conditions...... 25 Figure 2. Overview of the mRNA Quantification Process with NanoString nCounter® Technology & Data Preprocessing in nSolver® Software...... 43 Figure 3. Visualization of Preprocessed Data...... 51 Figure 4. General Overview of the Bioinformatics Workflow...... 55 Figure 5. Features Selected & the Frequency of Selection Using Multiple Iterations of a Supervised Learning Algorithm...... 59 Figure 6. Confusion Matrices for the Classification Algorithms Built on the Training Data & their Corresponding Performance Once Applied to the Validation Data...... 62 Figure 7. Overview of the Workflow to Analyze Commonly Identified Discriminating Genes...... 64 Figure 8. Box Plots of TMEM132A & SCNN1A, as well as sFas, FN1, & SPG20 Gene Expression...... 66 Figure 9. Box Plots of SORL1 & HNF4A; FOXC2, OCT4 & AKT1; TGFBR1, JAG1 & HGF Gene Expression...... 73 Figure 10. Overview of the Unfolded Protein Response to Restore Endoplasmic Reticulum Homeostasis...... 79

ix

List of Tables

Table 1. Overview of Samples Flagged for Limit of Detection Quality Control...... 45 Table 2. Evaluation of Housekeeping Genes Used for Normalization of Data...... 47 Table 3. Overview of Samples Flagged for Normalization...... 49 Table 4. Distribution of Clinicopathological Features Across Diabetic Status & Treatment Groups...... 53 Table 5. Distribution of Key Clinicopathological Features in the Training and Validation Cohorts for the Metformin-Treated vs Diet-Controlled Analysis & Five-Year Survival Analysis in Non- Diabetic Samples...... 57 Table 6. Summary of Model Performance on Training & Validation Data...... 62 Table 7. Comparison of TMEM132A & SCNN1A Expression in Metformin-Treated, Diet- Controlled, Non-Diabetic Five-Year Alive & Non-Diabetic Five-Year Deceased Samples...... 68 Table 8. Comparison of sFas, FN1 & SPG20 Expression in Metformin-Treated, Diet-Controlled, Non-Diabetic Five-Year Alive, Non-Diabetic Five-Year Deceased & All Non-Diabetic Samples...... 70

x

List of Abbreviations

AMPK AMP-Activated Protein Kinase AMP Adenosine Monophosphate ATF6 Activating Transcription Factor 6 BMI Body Mass Index CEA Carcinoembryonic Antigen CMS Consensus Molecular Subtypes CSCs Cancer Stem Cells CRC Colorectal Cancer eIF2 Eukaryotic Initiation Factor 2 EMT Epithelial-Mesenchymal Transition ER Endoplasmic Reticulum FFPE Formalin-Fixed Paraffin-Embedded FOBT Fecal Occult Blood Test HbA1c Glycosylated Hemoglobin IQR Interquartile Range IRE1 Inositol-Requiring Enzyme-1 LVI Lymphovascular Invasion mRNA Messenger Ribonucleic Acid PERK Protein Kinase R-Like ER Kinase PNI Perineural Invasion QC Quality Control RIN RNA Integrity Number RT-qPCR Quantitative Reverse Transcription Polymerase Chain Reaction T2D Type 2 Diabetes Mellitus UPR Unfolded Protein Response XBP1s X-Box-Binding Protein 1

xi

Chapter 1

Introduction

1.1 Colorectal Cancer: A Growing Global Burden

Colorectal cancer (CRC) is a highly lethal and prevalent disease. In 2020, the World Health

Organization reported CRC to be the 3rd most common malignancy worldwide, with 1.93 million new cases diagnosed in that year alone (1). Furthermore, this disease ranks as the second leading cause of cancer-related death, with a staggering 935 thousand deaths in the same year (6).

Unfortunately, these rates are projected to climb in the coming years, further exacerbating this global public health problem (2,3,7). Another major concern is that the incidence of early-onset disease, defined as cases diagnosed before the age of 50, is increasing at an alarming rate (8–10).

These emerging epidemiological trends foreshadow a bleak future for CRC patients if clinical management strategies are not improved, and underscore the urgent need to develop novel therapies to improve survival rates in this growing patient population.

1.2 Key Clinicopathological Features & their Effect on Patient Outcomes

1.2.1 Stage at Diagnosis

The stage of disease at the time of diagnosis is one of the strongest predictors of a patient’s outcome and is a critical piece of information used to help guide treatment decisions (11). The most commonly used staging system for CRC is the American Joint Committee on Cancer Tumour

Node Metastasis system (12). This system evaluates the degree of tumour development, as well as lymph node and distant metastasis, to describe the extent of disease growth and spread beyond the location of the primary tumour. In general, Stage I CRC is described as a malignancy that has

1

grown into the submucosa or muscularis propria, with the absence of lymph node or distant metastasis (12). Stage II CRC is diagnosed when cancer has invaded the outermost layers of the colon or rectum, and may be growing into nearby organs, but has not spread to lymph nodes or distant sites (12). Patients with Stage III disease display lymph node invasion but do not have distant metastasis (12). Finally, Stage IV disease is used to classify patients with the greatest extent of spread, including both lymph node and distant metastasis (12). Studies report as high as a 92 percent five-year relative survival rate for patients diagnosed with Stage I disease; this rate drops to 12 percent for patients diagnosed with Stage IV CRC (13,14). The rapid decline in survival as stage increases is largely attributed to a lack of effective therapies for CRC patients. Surgical resection of the tumour remains the standard of care for CRC patients, and as the disease advances, spreading from the site of origin, options for curative treatment are extremely limited (15). These trends highlight the extreme limitations of current CRC treatment strategies and emphasize the urgent need to develop new therapies that will improve patient outcomes.

1.2.2 Location of the Primary Tumour

The location of the primary tumour is another important clinicopathological feature with prognostic implications and can be broadly classified as right-sided, left-sided, or rectal CRC.

Right-sided tumours are those originating in the ascending colon and proximal two-thirds of the transverse colon, while left-sided tumours arise anywhere in the distal third of the transverse colon through the descending and sigmoid colon, and rectal tumours are located in the rectum (16).

Despite falling under a common diagnosis of CRC, several studies report distinctions in disease presentation, stage at diagnosis, morphology, molecular features, and pathogenesis of these tumours, suggesting that disease sidedness may be an important consideration when devising the

2

best clinical management strategy for a patient (17–21). Importantly, these discrepancies are linked to clinical outcomes as patients presenting with left-sided tumours are reported to have a lower risk of cancer-related mortality than those presenting with right-sided disease (22,23). Several hypotheses have been proposed to explain the differential survival rates based on tumour sidedness. There is evidence to suggest that right-sided tumours tend to present at a higher stage, delaying the initiation of treatment and thus, reducing a patient’s response to therapy (24).

Moreover, studies report that left-sided tumours display polypoid morphology, while right-sided tumours tend to display flat morphology making them more difficult to detect with a colonoscopy; consequently, right-sided tumours may be detected at a later stage, leading to poorer survival

(16,25). Furthermore, right-sided CRC is more likely to be characterized by tumours that are poorly differentiated, which is an indicator of aggressive disease phenotypes (23). Other studies report that the underlying molecular biology and mutational profiles of these tumours differ considerably, contributing to the discrepancy in clinical outcomes based on the location of the primary tumour (26–28). Anatomically, the colon is considered a single organ, however, colon organogenesis arises from two distinct areas of the primitive gut, the midgut and the hindgut, underscoring innate developmental differences (29). Additionally, differential gene expression analysis of the right and left colon confirm that there are major differences in the underlying molecular biology between the right and left colon of both healthy and malignant tissue (27,30,31).

Thus, the pathological differences in right and left-sided disease may arise from fundamental biological differences attributed to the embryonic origins of colon development. These differences are important considerations that should be accounted for when studying the molecular pathology of CRC to uncover targets for future therapies and stratify patients for disease management.

3

1.2.3 Colorectal Cancer Grade

Colorectal malignancies are further classified using a tumour grading system to provide additional prognostic information. Cancer grade is determined using a histological analysis to evaluate cellular arrangement, differentiation, and morphology, as well as the rate of proliferation and the presence of necrotic cells (32). Low-grade tumours are characterized by malignant cells that are normally arranged, well-differentiated, and slow-growing (32). These are contrasted by high-grade tumours which are comprised of fast-growing, undifferentiated or poorly differentiated cells, with abnormal arrangement and morphology (32). Tumour grade is a key prognostic factor, as high-grade CRC is associated with poorer overall survival across all stages of disease, compared to low-grade disease (33–35).

1.3 Clinical Management of Colorectal Cancer Patients

The current clinical management strategies for CRC patients follow a general progression of screening, diagnostic testing, evaluation of the extent of disease/severity, implementation of a treatment plan, follow-up and monitoring, and/or palliative care (36).

1.3.1 Colorectal Cancer Screening

The introduction of screening programs, to increase early detection rates and improve a patient’s therapeutic potential, is a key component of CRC disease management. Screening is recommended for healthy individuals and is generally initiated around the age of 50 (37). There are several available tests to screen for early pathological changes, however, flexible sigmoidoscopy and the fecal occult blood test (FOBT) are two of the most commonly administered

4

in Canada (38). A flexible sigmoidoscopy allows a physician to examine the rectum and sigmoid colon, using a narrow and flexible tube equipped with a light and camera, for the presence of polyps or cancer before a patient may present with outward symptoms (39). A FOBT is a less invasive procedure, that is used to look for the presence of blood in the stool as an indicator of polyps or cancer, but requires follow-up with a colonoscopy if test results are abnormal (40). These programs are reported to be successful in reducing both the incidence and mortality of CRC when the implementation and patient uptake of these initiatives are effective (41,42).

Nonetheless, the improvements in overall survival rates following the implementation of

CRC screening have been reported to be modest at best, as the vast majority of patients are diagnosed after their disease has progressed from an early stage, greatly limiting their capacity for curative treatment. One study reported that only 20 percent of patients are diagnosed with Stage I disease, underscoring the inadequacy of current clinical management strategies (4). Another concern is the rapidly growing incidence of early-onset disease, defined as cases diagnosed before the age of 50, the age at which routine screening programs are initiated, thereby greatly circumventing the effectiveness of these interventions (10). Correct implementation and adherence to screening guidelines may also influence the efficacy of these programs in reducing CRC-related mortality. For example, studies report that to be effective, a FOBT must be performed annually on several stool samples, as polyps and malignancies do not bleed continuously (43). Studies report that long-term adherence to fecal occult blood testing is relatively low, and thereby may compromise the efficacy of this intervention (44). A lack of CRC screening uptake may also contribute to the relatively minimal impact these programs have been reported to have on overall survival. One study reported that in 2015, only 62.6 percent of eligible American citizens received screening; this metric drops to 57 percent in people ages 50-64 (42,45). There are several factors

5

that may limit patient uptake of screening including age, race, socioeconomic status, accessibility and communication with family physicians, geographic location, and patient awareness (45).

Moreover, flexible sigmoidoscopy is also limited in that it only allows approximately one-third of the colon to be examined, increasing the possibility of missing early lesions located in the proximal colon (46). Taken together, despite screening efforts, many CRC patients are not diagnosed until their disease has progressed, highlighting the need for the development of effective treatment options for these patients.

1.3.2 Colorectal Cancer Diagnosis & Treatment

If screening tests indicate that a patient may have CRC, additional testing can be performed to make a definitive diagnosis. A colonoscopy, which allows a physician to examine the entire colon to look for pathological lesions, is the most commonly used test to diagnose CRC (47).

During the colonoscopy, the physician will often take a biopsy of any lesions for pathological assessment; this testing is accompanied by tumour staging and grading to help guide clinical decisions and treatment planning, which usually involves surgical resection of the tumour (48). A complete blood count test may also be ordered to assess for anemia which can arise from chronic intestinal bleeding at the sight of CRC lesions. Furthermore, clinicians may test for the presence of biomarkers that could indicate compromised organ function and cancer spread (47).

At early stages of disease, resection of the tumour or lesion may involve a less invasive procedure; as CRC advances, more aggressive surgical approaches are often required, such as the removal of entire sections of the colon (48). Following surgical resection of the tumour, patients are monitored periodically to assess for disease recurrence. Carcinoembryonic antigen and carbohydrate antigen 19-9 are two currently evaluated tumour markers that are measured from a

6

blood test; they may be used to monitor a patient’s response to therapy or to monitor for disease recurrence (47). Chemotherapy and radiotherapy may be administered as adjuvant treatments following surgical resection, or may be indicated prior to surgery to help shrink a tumour, making it more operable; these therapies can also be used in some cases to help relieve pain (38). The conventional treatment options are aggressive and are associated with several side effects (49).

However, they are very effective in treating patients with early-stage disease (49). While some targeted therapies have been developed, they are only indicated for patients with advanced disease, when surgical resection is no longer an option, and are mainly used to prolong survival as their curative potential remains limited (50,51). Furthermore, only patients with certain mutational profiles will benefit from targeted therapies, while those that can be treated inevitably develop therapy resistance (52). Given the relatively small number of treatment options for CRC patients, and their limited efficacy in those with advanced disease, there is an urgent need to develop novel therapies that offer curative potential.

1.4 Colorectal Cancer & Diabetes

A well-documented major risk factor for the development and progression of colorectal cancer is type 2 diabetes mellitus (T2D) (53). CRC and T2D are both complex diseases with diverse pathological features, nonetheless, these health conditions exhibit similarities in the factors that encompass their clinical risk profiles including age, diet and lifestyle, obesity, as well as gender (54–57). Moreover, the incidence of T2D is growing at an alarming rate, compounding the public health burden of CRC. It is reported that globally, one in eleven adults are living with diabetes and 90 percent of these cases are T2D (58). The growing prevalence of T2D globally coincides with an increasing number of CRC patients with comorbid T2D, which led to the

7

identification of an important epidemiological trend between diabetic treatment and CRC patient outcomes. Notably, metformin, a front-line oral antihyperglycemic agent, has been linked to a reduced risk in the development of CRC, as well as improved clinical outcomes in those with CRC

(59–61). These findings support the need to further investigate the potential therapeutic benefit of metformin in CRC patients, as well as the mechanistic underpinnings of its antineoplastic effects.

1.5 Type 2 Diabetes Mellitus

T2D is a metabolic disorder characterized by peripheral insulin resistance, and/or pancreatic insufficiency, which leads to the deregulation of glucose homeostasis and several diabetic complications (62). Glucose is the primary source of energy for most cell types, and uptake of this substrate requires insulin, a peptide hormone produced by pancreatic beta cells (63).

Insulin mediates glucose uptake through a series of signalling cascades which are initiated upon the binding of its receptor, a tyrosine kinase protein. Once bound by insulin, the receptor undergoes autophosphorylation to become activated, triggering subsequent phosphorylation of the insulin- receptor substrates and activation of the PI3K/AKT pathway (64). Phosphorylation of AS160, a substrate of AKT, results in the translocation of GLUT4 to the plasma membrane to facilitate glucose uptake (64–66). Insulin resistance occurs when the biological response to this hormone is reduced or attenuated, resulting in poor glucose uptake, heightened levels of plasma glucose, and increased insulin production from pancreatic beta cells to compensate and maintain glucose homeostasis (67). Over time, the pancreas may no longer be able to overcome the weakened cellular response to insulin, leading to prediabetes and eventually T2D. While the etiology of peripheral insulin resistance is complex and poorly understood, it has been associated with sedentary lifestyles, chronic overnutrition, and obesity (68,69). Furthermore, upregulation of lipid

8

production and deregulation of lipid disposal has been reported to promote insulin resistance in skeletal muscle through the inhibition of the PI3K/AKT pathway (70,71). These findings suggest that the PI3K/AKT axis may have a central role in the regulation of circulating glucose levels, and thus, alterations in this pathway may lead to loss of glucose homeostasis and the onset of T2D.

1.6 Metformin in Diabetes

Metformin, a biguanide, is a front-line therapy used to treat T2D patients and is the most commonly prescribed antidiabetic drug worldwide due to its clinical efficacy, cost-effectiveness, and safety profile (72,73). The physiological effects of metformin include the reduction of hepatic glucose production as a primary effect, decreased glucose absorption and increased glucagon-like peptide 1 secretion in the intestines, as well as increased peripheral glucose uptake and utilization to improve insulin sensitivity and regulate glucose homeostasis (74–76). In addition to its role in lowering blood glucose levels, metformin also promotes weight loss, reduces circulating lipid levels, prevents vascular complications, reduces inflammation, and protects against malignant pathologies (77–81). While the pleiotropic molecular mechanisms underpinning metformin’s antidiabetic effects are complex and not fully understood, it has been well-documented in the literature that metformin alters mitochondrial function by inhibiting Complex I of the mitochondrial respiratory chain (82,83). It is thought that this triggers cellular energy depletion, and the initiation of several downstream signalling pathways, via the activation of the AMP- activated protein kinase (AMPK), to induce cellular reprogramming to an energy-conserving state

(84). One of the downstream effects is the non-competitive inhibition of fructose-1-6- bisphosphatase, a key enzyme involved in gluconeogenesis (84). Thus, hepatic glucose production is reduced as a consequence of elevated AMP levels, thereby helping to regulate glucose

9

homeostasis. In addition to its regulation of glucose homeostasis through the modulation of cellular energy, metformin antagonizes glucagon signalling to suppress hepatic glucose production, by preventing the accumulation of cyclic AMP, which downregulates the transcription of gluconeogenic genes (85,86). More recently, it has been documented that metformin also inhibits hepatic glucose production via a redox-dependent mechanism (87). Studies report that metformin directly inhibits glycerol-3-phosphate dehydrogenase to alter cytosolic redox balance and subsequently inhibit hepatic glucose production (88). Inhibition of this enzyme results in the decreased conversion of glyceraldehyde-3-phosphate, a by-product of the triglyceride catabolism, to dihydroxyacetone phosphate (89). The reduction of dihydroxyacetone phosphate suppresses the conversion of lactate and glycerol to glucose, thereby decreasing hepatic gluconeogenesis (90,91).

These studies outline some of the proposed effects of metformin’s antidiabetic effects. However, a comprehensive understanding of the molecular mechanisms underlying its therapeutic benefit remains unclear.

1.7 Metformin in Colorectal Cancer

In addition to its antidiabetic effects, metformin has cancer-protective effects in various malignancies, including CRC (77,92,93). Several epidemiological studies have reported an inverse relationship between metformin use and risk of CRC development and progression amongst T2D patients (59,61,94–98). Furthermore, studies report a dose-dependent relationship whereby higher doses of metformin and longer treatment duration are linked to enhanced protection (61). At the cellular level, metformin inhibits proliferation, epithelial-mesenchymal transition (EMT), invasive and metastatic potential, stemness characteristics, as well as immune markers and inflammation in

CRC cell lines, which are fundamental characteristics of malignant pathologies (99–107). Despite

10

extensive epidemiological and cellular evidence of the CRC-protective effects of metformin, the molecular mechanisms by which this antidiabetic drug exerts its antineoplastic effects remain poorly understood and are an ongoing area of scientific investigation. Nonetheless, metformin’s action as an inhibitor of mitochondrial Complex I to reduce cellular respiration and drive energy depletion is widely documented, indicating that alterations in cell metabolism and energy balance may be a key component underlying the observed cancer-protective effects of metformin (72,108–

110).

1.7.1 Metformin & Cellular Proliferation in Cancer Onset & Progression

The ability to sustain proliferative signalling is a well-established hallmark of malignant pathologies. The mTOR pathway has important regulatory roles in cell growth, division, and survival. To coordinate these critical biological functions, mTOR integrates cellular growth factor signals with nutrient and energy levels to determine whether to promote or inhibit these energy- intensive processes (111,112). Activation of the AMPK pathway, as a consequence of cellular energy depletion following inhibition of mitochondrial Complex I, is proposed to initiate several downstream signalling cascades with antiproliferative effects (99). In particular, it has been reported that activation of AMPK reprograms cells to an energy-conserving state; consequently, the mTOR pathway and downstream protein synthesis necessary for cellular division is suppressed

(99). Inhibition of mTOR has also been reported independent of AMPK activation, as a result of heightened levels of reactive oxygen species following the alterations in cellular respiration by metformin, constituting an additional mechanism by which cellular proliferation may be downregulated in metformin-exposed cells (99). Furthermore, reduced expression of EGF1R, a cell surface receptor that binds the potent mitogen EGF1, was also reported in cells following

11

metformin treatment (99). The binding of EGFR1 with its ligand induces downstream signalling of several pro-oncogenic molecular pathways involved in sustained cellular proliferation including the RAS/MAPK and PI3K/AKT/mTOR pathways (113). These studies outline some preliminary work in elucidating the molecular underpinnings of metformin’s antiproliferative effects.

1.7.2 Metformin & Epithelial-Mesenchymal Transition

EMT is another key feature of malignant pathologies, whereby normal epithelial cells lose characteristics of maturation, including cell polarity, cell-cell adhesion, cytoskeleton remodeling, and extracellular matrix adhesion, and develop mesenchymal properties which include enhanced migratory capacity (114). EMT is associated with disease progression, therapy resistance, and poor patient outcomes in several malignancies including CRC (115). The microRNAs miR-34 and miR-

200 and the transcription factors SNAIL and ZEB1 have been widely reported as regulators of

EMT, with SNAIL and ZEB1 being potent inducers of EMT, while miR-34 and miR-200 act as tumour suppressors to inhibit their function (116,117). Interestingly, studies report that metformin treatment of CRC cells in vitro resulted in the upregulation of the epithelial marker E-Cadherin, and the downregulation of the mesenchymal marker vimentin (101). The researchers also reported that this coincided with a reduction in SNAIL and ZEB1 expression and an increase in mi-R200, suggesting that metformin may inhibit EMT via this molecular pathway to improve clinical outcomes in CRC patients (101).

12

1.7.3 Metformin in Cellular Invasive & Metastatic Potential

Cellular invasion and metastatic potential are key characteristics of malignant disease that constitute the basis of poor patient survival by altering a localized pathology to a systemic and highly lethal disease. Following invasion, dissemination, and metastasis from the location of the primary tumour, treating patients with the intent to cure is virtually impossible (49). Thus, devising therapeutic strategies to minimize CRC metastasis would improve treatment opportunities for patients, as well as overall survival rates. A recent study reports a potential anti-metastatic effect of metformin treatment in vitro and in CRC xenograft mouse models, indicating that this may be one mechanism of metformin’s CRC-protective effects (107). Furthermore, epidemiological studies report that the rate of distant metastasis in metformin-treated diabetic CRC patients was significantly lower than the rate observed in diabetic CRC patients who were not exposed to metformin (118). There is an abundance of molecular pathways that may be implicated in metformin’s anti-metastatic effects. One pathway that has been widely documented in CRC carcinogenesis and tumour metastasis is the WNT/-catenin pathway. The WNT/-catenin signalling axis is an evolutionarily conserved pathway with important roles in cell fate determination during embryonic development, as well as the maintenance of adult tissue homeostasis, however, deregulation of this pathway is widely documented in cancer pathology, and in particular CRC (119–121). Under normal physiological conditions, the transcription factor

-catenin is sequestered in a complex of which mediate its phosphorylation, ubiquitination, and subsequent degradation in the cytosol (122,123). When the WNT ligand is produced to mediate cell-cell communication, it binds its receptor on the surface of cells, triggering the release of -catenin. As unphosphorylated -catenin builds up in the cytosol it begins to migrate to the nucleus, where it regulates transcription of genes involved in proliferation, tissue

13

differentiation and homeostasis, as well as cell migration and invasion (120,124). A recent study performed an immunohistochemical analysis of -catenin in diabetic CRC patients treated with and without metformin. The researchers reported a significantly lower expression level and nuclear accumulation rate of -catenin in metformin-treated patients, suggesting that this pathway may be a target of metformin, contributing to its anti-metastatic effects (118).

1.7.4 Metformin & Cell Stemness

Disease recurrence, therapy resistance, and distant metastasis are major causes of death among CRC patients, and cancer stem cells (CSCs) may help enable these pathophysiological phenomena (52). CSCs are a unique subpopulation of tumour cells characterized by their capacity for self-renewal, ability to generate heterogeneous cell populations, and indefinite proliferative potential to sustain tumour growth (125,126). Furthermore, CSCs are capable of inducing cellular quiescence, which may facilitate CRC relapse, reduce the efficacy of treatment and promote metastatic dissemination (127). Given that traditional therapies target mature, actively dividing cells, it is thought that cellular adoption of a quiescent state may enable CSCs to escape therapy- induced death, representing a potential mechanism of treatment resistance (128). Moreover, patients who are treated and believed to be in remission may harbor small populations of these cells, providing an opportunity to reinitiate tumour growth when the microenvironment is favourable (129). In addition to several other antineoplastic effects, metformin has also been reported to suppress cell stemness. CD44 and LGR5 are two well-documented CRC stem cell markers associated with lymph node and distant metastasis, as well as overall poor patient prognosis (130). One study reported that expression of these genes declined substantially following

14

metformin treatment, suggesting that metformin’s antineoplastic effects may be linked to the inhibition of stem cell populations (99).

1.7.5 Metformin in Immune Markers & Inflammation

Tumour promoting inflammation is a well-recognized feature of malignant pathologies.

While acute inflammation is reported to have cancer-protective effects, uncontrolled and chronic inflammation can facilitate tumourigenesis and disease progression (131). For example, uncontrolled inflammatory bowel disease is a widely accepted risk factor for CRC, with supporting evidence from epidemiological reports (132). Recurrent and chronic inflammatory processes create a cycle of inflammatory signalling molecule production, followed by the recruitment of immune cells and the generation of an abundance of reactive oxygen and nitrogen species, which result in sustained tissue damage and exacerbation of inflammatory signalling (133). Reactive oxygen species are potent oxidizing agents with mutagenic capability, thus, chronic inflammation and production of these molecules increase the risk for spontaneous tumourigenesis, as well as cancer promotion (134). It has been reported that one of metformin’s pleiotropic antineoplastic effects involves its ability to suppress pathophysiological inflammation. A recent study reports that at the molecular level, metformin inhibits NADPH oxidase, an enzyme that catalyzes the production of reactive oxygen species (105,135). As a result of NADPH oxidase inhibition, NF-

B signalling is suppressed, a pathway directly involved in the production of proinflammatory cytokines, chemokines, as well as inflammation-induced proliferation and angiogenesis (105,135).

The findings from this study support the inhibition of inflammation as one mechanism of metformin’s cancer-protective effects.

15

1.8 Insulin-Treatment & Colorectal Cancer

While the improved overall survival of metformin-treated diabetic CRC patients has been well-documented in the literature, there is also considerable evidence to suggest that insulin treatment may enhance the risk of CRC development and progression in patients with T2D. This raises the question of whether metformin really does exhibit protective effects, or whether the discrepancy in survival is primarily due to the negative effects of insulin in other T2D-CRC patients. Moreover, metformin is a front-line antidiabetic agent, often prescribed as a monotherapy for patients with mild cases of T2D, while insulin is often indicated for more severe T2D cases, therefore, metformin-treated patients may present with less severe diabetic comorbidities than insulin-treated patients, contributing to the differential CRC patient outcomes reported in the literature (75,136–141). These factors add to the difficulty in elucidating the CRC-protective effects of metformin and/or the cancer-promoting effects of insulin. Nonetheless, a population- based cohort study reported that after adjusting for several key covariates, insulin treatment in T2D patients was associated with a significantly elevated risk of developing CRC (142). Moreover, at the molecular level, hyperinsulinemia is reported to potentiate tumourigenesis by facilitating the metabolic reprogramming of cells to accommodate elevated nutrient availability, consequently, energy-intensive processes including cell growth, proliferation, and survival are upregulated

(143,144). It is thought that this occurs via the overstimulation of the insulin-like growth receptor-

1, which insulin is reported to bind, thereby initiating downstream oncogenic signalling cascades

(145–147). Taken together, further investigation of insulin’s effects on CRC pathology is necessary to determine how metformin and insulin impact diabetic CRC patient outcomes, and how this information can be used to improve treatment strategies for these patients.

16

1.9 CRC Disease Heterogeneity: A Challenge in Research & Clinical Applications

CRC is an umbrella term used to classify a group of diverse malignant pathologies arising in the large intestine and rectum. At the clinical level, it has been reported that disease presentation varies depending on the anatomical location of the primary tumour. For example, patients with right-sided disease present more commonly with generalized symptoms such as anemia, fatigue, or pain, while patients with left-sided disease tend to present with localized symptoms such as changes in bowel habits, bleeding, or through screening procedures (148). These trends suggest that disease pathogenesis may differ based on tumour sidedness. CRC also varies at the histological level, with three major histological subtypes being reported. These subtypes include adenocarcinoma, mucinous adenocarcinoma, and signet ring cell carcinoma. Adenocarcinoma is the most common colorectal malignancy, accounting for more than 90 percent of CRC cases, and is characterized by the malignant transformation of glandular cells in the colorectal mucosa (149).

Mucinous adenocarcinoma tumours are less common and display a high volume of extracellular mucin, while signet ring cell carcinomas are extremely rare and are characterized by the presence of an intracytoplasmic vacuole containing mucin, which causes disruption of the nucleus to the periphery (149). The histological subtypes highlight another level of CRC disease heterogeneity, which is associated with prognostic distinctions (150). Efforts have also been made to stratify CRC at the molecular level and four consensus molecular subtypes (CMS) have been described (151).

Moreover, these molecular subtypes are characterized by intrinsic biological differences and are associated with important clinical and prognostic variables (151). Namely, CMS1 (Microsatellite

Instability Immune) is distinguished by its high mutational burden, microsatellite unstable phenotype, and extensive immune activity. These tumours have been more commonly reported in females with right-sided disease, high grade tumours, and poor survival following relapse. CMS2

17

(Canonical) displays upregulation of WNT and MYC oncogenic pathways, as well as epithelial differentiation. CMS2 tumours more commonly present in patients with left-sided disease and are associated with better clinical outcomes. CMS3 (Metabolic) is characterized by signatures of metabolic deregulation and are reported to present more commonly in the sigmoid colon or rectum.

Finally, CMS4 (Mesenchymal) is distinguished by TGF-β signalling, angiogenesis and invasive capacity; these tumours have been associated with diagnosis at an advanced stage and poor clinical outcomes.

These distinctions outline some aspects of CRC heterogeneity, which constitute a significant obstacle for researchers and clinicians working to effectively treat CRC patients.

Developing an understanding of the clinical, histological, and molecular differences that can affect a patient’s disease course is an important first step in uncovering targets for future therapies, as well as identifying predictive biomarkers that will allow these to be implemented effectively.

Given the highly heterogeneous nature of this disease, pinpointing the key molecular features underlying disease onset and progression, which may be promising therapeutic targets, is an enormous challenge. Nonetheless, the evidence to support metformin’s antineoplastic effects is well-documented in the literature, outlining a novel approach for studying CRC pathology. In addition, the rapid expansion of the fields of molecular biology and cancer genetics, due to modern-day technological advances, provides a unique opportunity for CRC research and the development of novel therapies. Notably, the ability to collect comprehensive transcriptomic data is revolutionizing the way researchers and clinicians study cancer pathology and approach the clinical management of patients, by shifting the focus from macroscopic disease to developing an understanding of the molecular pathogenesis. However, processing and interpreting large amounts of data to elucidate meaningful results is a challenge in itself. To overcome the complexity and

18

time-consuming aspects of transcriptomics research, the application of computing power and machine learning to optimize data analysis is quickly gaining momentum.

1.10 Gene Expression Profiling

Gene expression profiling is a powerful technique for studying the molecular pathology of diseases, including cancer. This technique focuses on elucidating gene expression patterns, at the level of transcription, under established conditions of interest, to provide a broad overview of cellular function (152). In particular, the analysis of this data enables researchers and clinicians to gain a better understanding of diseases like CRC that are highly heterogeneous in nature.

Consequently, gene expression data can provide important clinical information, including key prognostic and predictive biomarkers, to guide patient management and improve outcomes (153).

Moreover, studying gene expression data is a research avenue that may facilitate the improvement of early detection methods to maximize therapeutic potential, as well as uncovering targets for the development of novel treatments, by shifting the focus from macroscopic disease to its molecular underpinnings (154). Nonetheless, generating meaningful gene expression data requires obtaining

RNA of sufficient quality, a parameter that can be affected based on the tissue preservation technique and storage of samples (155). In addition, the platform used to quantify RNA levels also impacts the quality of the data produced for analysis.

1.11 Formalin-Fixed Paraffin-Embedded Samples

Formalin-fixed paraffin-embedded (FFPE) tissue samples have been a vital source of clinical data and a key component of molecular analysis in medical research for decades (156).

19

FFPE is a tissue preservation technique that involves fixing the sample in formaldehyde as a means of preserving proteins and other fragile biological structures, including cell morphology, for analysis; next, the fixed tissues are embedded in a paraffin wax block for storage (157). This technique provides readily available samples for clinical and experimental research; it is also used clinically to assess for diagnostic, prognostic, and predictive biomarkers (158). Another method of tissue preservation, denoted fresh-frozen tissue samples, involves snap freezing the specimen by submerging it in liquid nitrogen as soon as possible after it is collected, followed by storage at

-80C (159). A main advantage of FFPE tissue preservation over fresh-frozen sample preservation is that it is highly cost-effective, as samples can be stored at ambient temperatures for long periods of time, and thus do not require special storage equipment (160). Consequently, FFPE samples are routinely collected in the clinical setting, making them highly abundant and available for research purposes, while providing vast clinical information due to their utility in pathology (161). In contrast, the high rate of deterioration of fresh-frozen samples at room temperature makes this method of tissue preservation less practical for many clinical workflows (162).

Nonetheless, formaldehyde fixation of tissues is linked to extensive RNA degradation

(161). Moreover, the duration, as well as the temperature of FFPE sample storage may also affect

RNA integrity (163). The high level of RNA degradation in FFPE tissues is an inherent flaw of this preservation technique and also represents a significant obstacle in the molecular analysis of these samples for research purposes. Notably, poor RNA quality has been reported to contribute to unreliable results when generating gene expression data using reverse-transcription quantitative polymerase chain reaction (RT-qPCR), a commonly used technique for measuring gene expression

(164). Given the lower yield and poorer quality of nucleic acid obtained from FFPE samples compared to fresh-frozen samples, some researchers suggest that fresh-frozen samples are

20

preferable biospecimens for gene expression analysis studies, using conventional mRNA quantification techniques (159,165,166). Despite the poorer quality of nucleic acid associated with the use of FFPE samples, fresh-frozen samples are often not readily available or abundant, and thus, the use of FFPE samples is essential for transcriptomic research (165). Fortunately, the advent of NanoString nCounter technology in 2003 offers a new approach for obtaining transcriptomic data, and this platform has been reported to facilitate the collection of robust gene expression data using FFPE samples for molecular research (167).

1.12 NanoString Technology

NanoString nCounter technology is a direct detection tool for gene expression analysis that utilizes molecular barcode technology to quantify analytes of interest (168). The use of this novel technology is becoming increasingly widespread due to the unique advantages of the assay, including its simplicity, high sensitivity, and reproducibility (169). The assay uses a custom set of molecular probes to tag RNA of interest with a capture and reporter probe that are specific to the target RNA; this hybridization process results in the generation of unique target-probe complexes

(170). Following this, unhybridized probes are removed to leave only the purified target-probe complexes of interest, which are then immobilized and aligned on the imaging surface (170).

Finally, using an automated fluorescence microscope, the sample is scanned to count the labeled barcodes (170). This molecular barcoding technology can be used for several types of analytes, including RNA, miRNA, protein, and DNA. However, for the purposes of this project, mRNA is the target of interest.

The use of nCounter technology has several advantages including its ability to generate highly reproducible data, its capacity to perform well on FFPE samples which often exhibit poor

21

RNA integrity, its ability to analyze a large number (up to 800 RNA) of distinct targets in a single sample, as well as its flexibility to the type of sample input (168). This method is faster than other available technologies such as Next Generation Sequencing for mRNA quantification and is simpler than RT-qPCR as it does not involve reverse transcription to cDNA, followed by nucleic acid amplification (169,171,172). These additional steps introduce variability into traditional gene expression analysis assays and increase the chance of sample contamination, as well as the potential for loss of amplifiable templates (170). As a consequence, this may lead to the collection of biased data, emphasizing the value of NanoString’s direct detection approach in molecular research (170).

1.12.1 Performance of NanoString® Platform on FFPE Samples

As was previously discussed, the use of FFPE samples for gene expression studies is particularly challenging due to low mRNA yields and more importantly, the high level of RNA degradation. The nCounter direct detection system overcomes these issues by using a highly automated, probe-based system, that does not rely on enzymes or amplification protocols, thereby simplifying the assay and reducing the potential for biased results (173). A study conducted by

Reis et al. evaluated the performance of the nCounter system in quantifying mRNA for gene expression analysis from FFPE samples, and paired fresh-frozen specimens, the gold standard sample type for molecular analysis; the performance of the nCounter system was also compared to RT-qPCR, for mRNA quantification on the same paired sample types (174). The RNA integrity of FFPE samples was reported to be substantially lower than that of fresh-frozen samples, underscoring the difference in analyte quality based on the sample type (174). Nonetheless, the researchers reported a correlation coefficient of 0.90 between the results of the nCounter system 22

on the paired sample types, highlighting the robust performance of this platform on FFPE samples, despite the poor quality RNA (174). In contrast, the correlation coefficient in the different sample types using RT-qPCR was reported to be 0.50, and the researchers reported that lower RNA integrity corresponded to higher threshold cycles, as well as the loss of amplifiable templates

(174). Thus, when compared to RT-qPCR, NanoString technology offers superior performance in the mRNA quantification of archival FFPE samples.

1.13 Machine Learning

Machine learning is a branch of artificial intelligence that exploits computer systems to analyze patterns in data, using mathematical models and algorithms that have the capacity to adapt and learn from the information presented, to ultimately draw conclusions (175). It is a powerful tool for analyzing large quantities of data, which surpass human capacity for assessment. This is a particularly relevant problem in molecular biology and cancer genetics research, whereby large amounts of data can be generated, however, the sheer volume of data often makes it incomprehensible without the application of computing power (176). Moreover, machine learning approaches offer several advantages over conventional analysis methodologies, thereby optimizing data analysis and the capacity to draw inferences. For example, when analyzing gene expression data, a common analysis technique involves identifying genes that are differentially expressed between the groups of interest (177). This methodology relies on the statistical analysis of expression levels, of genes individually, between the comparison conditions. However, there are several limitations to this approach. First, the p-value is arbitrarily set, often at 0.05. This value represents the probability of obtaining the observed difference, or a larger difference, in the outcome being measured, given that no difference exists between the two groups of interest (178). 23

Setting a p-value of 0.05 as a threshold for statistical significance minimizes the chance of a type

I error (the null hypothesis is rejected but should not have been), conversely, the chance of a type

II error (the null hypothesis is not rejected but should have been) may be large, indicating that this type of analysis could fail to identify true differentially expressed genes. In addition to this limitation, univariable differential gene expression analysis is unable to detect high dimensional interactions and non-linear patterns. Machine learning techniques provide a more effective approach to data analysis by identifying genes that discriminate between groups of interest, thereby facilitating the detection of patterns that may exist in higher dimensional space (Figure 1) (179).

There are two main classes of machine learning, termed supervised machine learning and unsupervised machine learning. These branches of machine learning are applied in different research contexts; namely, unsupervised machine learning is a versatile research tool for performing multidimensional exploratory data analyses, while supervised machine learning is typically applied to detect patterns in datasets for the purpose of generating classification algorithms (180,181).

1.13.1 Feature Selection for Dimension Reduction

The “Curse of Dimensionality” is a major obstacle in machine learning applications that can lead to poor model performance if left unaccounted for. This well-documented phenomenon postulates that as the number of features/input variables (i.e. dimensions) of the algorithm increases, the number of samples necessary to maintain a certain level of accuracy in the machine learning application grows exponentially (182). When applying machine learning to gene expression data analysis, this is a relatively common problem, as there are often a large number of features (genes) relative to the number of samples, which can cause the performance of the model

24

A B

C

Figure 1. One Dimensional vs Multidimensional Data Analysis Between Conditions. (A) and (B) depict hypothetical gene expression levels of individual genes between two conditions, providing a visual representation of differential gene expression analysis. These graphs allow trends in the data to be seen, whereby the mean expression level tends to be higher in one condition compared to the other. However, the two conditions cannot be separated in one dimension. (C) When the data is plotted in two dimensions, the different conditions become distinguishable. Machine-learning applications provide an opportunity to analyze data and detect patterns that may be present in higher dimensional space by considering multiple features and measuring their effectiveness in discriminating between two conditions.

25

to decline substantially (183). This is a problem that can be overcome by reducing the number of features or increasing the number of samples.

Given that obtaining a sufficient number of samples to counteract this effect is often not feasible in transcriptomic research, where the number of features typically far outweigh the number of samples, methods of feature selection have been developed. Some of these methods apply supervised machine learning algorithms to select features with the highest discriminatory power between the conditions or groups of interest. Discriminating features are those that can best separate classes in N-dimensional space (184). The feature selection process allows informative features to be retained for further data analysis and machine learning applications, while simultaneously removing noisy or redundant features that are irrelevant to the analysis, contributing to poor learning, over-fitting, and deteriorating model performance (158). In addition to improving the accuracy of machine learning applications, feature selection is also highly advantageous for translational research, as the use of a small set of genes in clinical applications is more practical than assessing a large gene panel. Once the dimensions of the data have been reduced, unsupervised and supervised learning techniques can be more effectively applied.

1.13.2 Unsupervised Machine Learning

Unsupervised machine learning uses mathematical models to perform exploratory analyses on datasets with no predefined labels for the algorithm to reference (186). Without providing the algorithm any information about the dataset, the machine learning algorithms evaluate the data to uncover patterns that may be hidden in multidimensional space, facilitating the grouping and separation of data points based on intrinsic similarities and differences (187). These methods are

26

often applied when researchers have minimal prior knowledge of the projected gene expression patterns for the specific groups of interest.

Hierarchical clustering is a commonly used method of unsupervised learning in gene expression analysis for discovery-based science. This data mining approach utilizes an algorithm to group samples that are similar together, resulting in a set of nested clusters whereby each group includes samples that are broadly similar (188). These clusters can be visualized in the dendrogram, which is a structure that depicts the hierarchical relationship between samples. When applying this type of unsupervised learning model, the researcher must specify the similarity and linkage parameters. Euclidean distance, which is defined in mathematics as the length of a line segment between two points in space, is commonly used as a similarity metric for hierarchical clustering (189,190). Nonetheless, samples can also be grouped based on the strength of correlation between samples (191). The optimal performance of these metrics depends on the data set they are being applied to, and thus, it is important to evaluate different measures of similarity to optimize the algorithm (192). The researcher must also specify the linkage parameter for the function, which defines how the algorithm will merge the components of the dendrogram (193).

For example, some common linkage measures include single link, average link, and complete link.

Single link defines the distance between two clusters as the distance between the closest points within the two clusters (194). This can be problematic if two clusters that are not close together happen to have two data points that are near one another, thus causing the distinction between the clusters to be lost. Average link computes the average value for each cluster and defines the distance between two clusters as the difference between the centroids (194). Complete link defines the distance between two clusters as the distance between the farthest pair of points (194). Average and complete link methods can also be problematic as they are biased towards spherical clusters.

27

Taken together, clustering is sensitive to noisy data and outliers, emphasizing the importance of dimension reduction prior to using these methods.

1.13.3 Supervised Machine Learning

In contrast to unsupervised learning, supervised machine learning is a data analysis technique whereby an algorithm is provided a dataset and corresponding reference labels for individual samples, based on the groups of interest (195). The model is then trained using the labeled dataset to generate a classification algorithm that can accurately predict a sample’s group based on patterns that are identified in high dimensional feature space (195). The first step of supervised learning involves randomly partitioning the dataset into training and validation cohorts. Typically, a substantially larger portion of the data (approximately 75-80%) is allocated to the training dataset to provide ample opportunity for the algorithm to learn from the data. It is important to ensure that the training and validation cohorts do not differ significantly in their distribution of key covariates, as the model should be evaluated on a sample that best approximates what is seen in the general population. Once the model has been trained on the training dataset, the classifier performance is evaluated on the validation cohort (approximately 20-25% of the data), to assess the generalizability of the model. If the accuracy of the model drops substantially when classifying samples in the validation cohort, this indicates that the model may have been overfitted to the data.

Reducing the dimensionality of the data and applying cross-validation are two common approaches to prevent model overfitting.

It is important to note that there are many different types of classification algorithms, and all available options should be trained to determine which model performs optimally with the current dataset. The reason for assessing several models is explained by a well-established concept

28

in machine learning, denoted the “No Free Lunch Theorem”. This theorem states that there are no superior or inferior classification algorithms overall, rather that the outperformance of one algorithm over another is a consequence of the intrinsic properties of the dataset on which it was trained, underscoring the need to train multiple models to optimize classification performance

(196,197).

1.13.4 k-Fold Cross-Validation

One way to minimize the potential for overfitting is to apply k-fold cross-validation, a resampling technique with k-iterations, when the model is being built. To do this, the training dataset is randomly split into k-mutually exclusive subsets termed “folds”, where k is a number selected based on the number of samples in the training set. Then using k-iterations, the model is trained with all folds except one, which becomes the test subset. For each iteration, the model is trained on a distinct combination of folds and tested on the remaining fold. The proportion of samples in each training and test subset can be calculated as follows: 푛푡푟푎푖푛 = (푘 − 1)/푘 and

1 푛 = . The selection of k is not a precise science and depends on the dataset used for model 푡푒푠푡 푘 training. Ideally, k should be selected such that each fold contains sufficient variation and is representative of the underlying distribution of the larger dataset (198). Ultimately, cross- validation is an effective method to optimize fitting the model to the data, while balancing its rate of generalizability.

29

1.14 Hypothesis & Specific Aims

The molecular mechanisms of CRC pathogenesis remain unclear, highlighting a significant barrier to improving treatment options for this deadly disease. Nonetheless, several studies have reported that metformin-treated diabetic CRC patients tend to have better clinical outcomes than diet-controlled diabetic patients (59,60,142). Given this correlation, studying the gene expression profiles in CRC patients stratified by diabetic treatment may provide insight into the molecular features of disease onset and progression, as well as the molecular underpinnings of metformin’s antineoplastic effects. Thus, we hypothesize that metformin treatment leads to the altered regulation of key metabolic and oncogenic pathways, that can be measured using a targeted gene expression assay. Furthermore, we hypothesize that these gene expression profiles may predict clinical outcomes, and pinpoint important molecular biomarkers of metformin action. To test this hypothesis, we had the following specific aims:

1. Quantify mRNA counts for 151 metformin-related genes in the study cohort using

NanoString nCounter technology.

2. Identify mRNA signatures that discriminate between metformin-treated samples and diet-

controlled samples, as well as those discriminating between patients based on their 5-year

survival status, and generate corresponding classification models.

3. Evaluate individual genes and assess for differential expression patterns that may provide

insight into the molecular underpinnings of metformin’s antineoplastic effects.

30

Chapter 2

Material and Methods

This work was done in accordance with Queen’s University Health Science Ethics

Approval PATH 128-12: Validation of novel biomarkers for colorectal cancer.

2.1 Selection of Patient Cohort

The patient cohort includes 272 CRC patients who were diagnosed and treated between

2005 and 2013 at the Kingston General Hospital. These patients were selected for inclusion in this analysis retrospectively, following a review of patient medical records stored in the internal electronic medical database. The selection of cases for inclusion in the analysis was made primarily based on a patient’s prior history of T2D, as well as their diabetic treatment protocol. All CRC patients with a previous history of T2D (N = 194) were selected for inclusion in the analysis.

Among the T2D patients included, 55 managed their diabetes through diet-control, 72 were treated with metformin, 18 were treated with insulin, 14 were treated with other oral antihyperglycemic monotherapies, and 35 patients received a combination therapy. The other oral monotherapies included glyburide (N = 9), gliclazide (N = 2), glipizide (N = 1), repaglinide (N = 1), or rosiglitazone (N = 1), and the combination therapies included metformin and insulin (N = 9), metformin and another oral antidiabetic agent (N = 19), metformin, insulin and another oral antihyperglycemic medication (N = 6), as well as one patient treated with insulin and gliclazide.

A patient’s treatment regimen was defined as exposure to a monotherapy or combination therapy prior to CRC diagnosis. In addition to the diabetic patients, 78 cases without a prior history of T2D were randomly selected from the database to be included in the study.

31

2.2 Clinical & Pathological Data Retrieval

Patient biopsies of the primary tumour were obtained at the time of CRC diagnosis.

Following surgical resection, the tissue samples were fixed with formalin and embedded in paraffin for tissue preservation and subsequent pathological analysis. Several clinical parameters are recorded in the database including patient demographics, glycosylated hemoglobin level

(%HbA1c), body mass index (BMI), comorbidities, CRC risk factors, manner of disease presentation, as well as treatment and survival data. Moreover, the database also encompasses extensive pathological data, including the location of the primary tumour, tumour stage in accordance with the American Joint Committee on Cancer Tumour Node Metastasis system, histological grade, indicators of invasiveness including lymphovascular and perineural invasion, as well as records of lymph node and distant metastasis. Patients are followed and survival data continues to be updated, ensuring that changes in clinical events are accurately tracked to provide comprehensive information for research purposes. To do this, patient files are first checked to see if a patient is listed as deceased; if there is no information, then obituaries are checked and confirmed by cross-referencing with the last address listed, the date of birth, as well as through communication with the emergency contact if one is listed. The most recent date of patient follow- up was July 17, 2019, and this work is performed by Ms. Nicole Archer.

2.3 mRNA Quantification

In total, three temporally distinct batches of RNA extraction and quantification using

NanoString nCounter® technology (Seattle, WA) were performed to generate the total dataset used for this analysis. The methodologies for these assays were consistent across batches, however, each batch of RNA extraction was performed by different researchers. The first batch of RNA 32

extraction and quantification was completed in 2018 by Dr. Christine Orr (N = 96), the second was performed in 2020 by Ms. Melinda Chelva (N = 81), and the third in 2021 for this study (N = 95).

2.3.1 RNA Extraction from FFPE Samples

Prior to RNA extraction from the FFPE patient samples, histopathological evaluation and tissue block annotation were performed by a certified pathologist, Dr. David Hurlbut, to confirm the presence of tumour content to guide the removal of tissue cores. RNA was extracted using the

RecoverAll Total Nucleic Acid Isolation Kit (Streetsville, Canada), according to instructions provided by the manufacturer (199). First, cores from the FFPE tissue samples were deparaffinized in xylene, proteins were digested with a protease, the nucleic acid was isolated, DNA was enzymatically digested, and finally, purified RNA was extracted. The concentration of the RNA extracted from each sample was determined by measuring the absorbance at 260 nm, using a

NanoDrop 1000 Spectrophotometer (Fitchburg, WI). RNA integrity was assessed using an

Agilent 2100 Bioanalyzer (Santa Clara, CA). Samples with RNA concentrations falling below

40ng/L (N = 6), the threshold for the NanoString nCounter assay, were excluded from further analysis. Despite the low level of RNA integrity observed across samples, none were excluded based on this metric given the robust performance of nCounter technology on highly degraded

RNA, which is a well-known feature of FFPE samples (174).

2.3.2 Selection of Genes for Custom nCounter Codeset

A custom gene panel was designed by Dr. Chris Nicol to quantify the expression of 151 predetermined metformin-related genes (Supplementary Table 1). These genes were selected 33

based on their involvement in key CRC pathological processes which are most affected by metformin, and consequently, may play a role in the improved outcomes of these patients. The critical processes include cellular proliferation/cancer progression, EMT, cell stemness, immune markers and inflammation. The first step in developing the custom gene panel involved creating a comprehensive list of genes related to these key processes. This was done using two main approaches; namely, via an extensive review of the literature and data mining, or by selecting genes included in commercially available gene panels designed to evaluate these processes. In total, 2013 genes were identified for their potential involvement in metformin’s protective effects in CRC. Next, this list was further refined to 151 genes for the final gene panel. To do this, genes were prioritized for selection based on their involvement in more than one key process. Next, an

Ingenuity Pathway Analysis, a bioinformatics application that facilitates the functional analysis, integration, and improved understanding of high volume datasets, was performed to pinpoint genes most likely to play a significant role in any single process or to link the processes. In addition, seven housekeeping genes were included in the codeset, as well as eight negative controls and six positive controls. The assay was performed by Queen’s Laboratory for Molecular Pathology.

2.4 Data Preprocessing

2.4.1 Initial Quality Control

Using nSolver software (Seattle, WA) provided by NanoString, the reported quality control metrics were evaluated to screen for potentially problematic samples. In total, five samples were flagged for QC parameters; specifically, they did not pass the Limit of Detection QC check, indicating that the assay was not sufficiently sensitive with these samples. Three of these samples were re-run in later NanoString cartridges; they failed the same QC metric the second time,

34

indicating poor sample quality, and thus the corresponding data was not adequately robust.

Consequently, these samples were not included in any additional steps in the analysis.

2.4.2 Batch Calibration in nSolver Software

Following the removal of poor quality samples from the data set, a batch calibration was performed to account for the small variation that may exist in the probe sets from different lots.

Following instructions provided by the manufacturer, the technical replicates from our

NanoString® assay were specified in the imported data, samples from distinct batches were labeled, and the calibration process was automated in the nSolver software (200). To consolidate the data, the software computed the percent difference in raw counts for each unique probe among the technical replicates, thereby generating gene-specific scaling factors used to scale one lot to another.

2.4.3 Normalization & Additional Quality Control

The raw data was normalized in a two-step process to account for the two major sources of technical variability associated with this assay. Positive control normalization was used to remove variability related to the nCounter platform, and codeset content normalization was used to remove variability related to sample input. Prior to normalization, the raw data was examined to ensure that the housekeeping genes used for codeset content normalization displayed low variability across samples and that genes with high, moderate, and low expression levels were included, thereby validating their ability to normalize the data against inconsistencies in sample input. To evaluate these genes, the interquartile range and median were computed for each

35

housekeeping gene across all samples; next, the median, lower and upper quartile of the medians were calculated. Housekeeping genes with low expression levels displayed median expression levels below the lower quartile of the medians. Moderately expressed genes displayed median expression levels between the lower and upper quartile of the medians, and highly expressed genes exhibited median expression levels above the upper quartile of the medians. In the nSolver® program, the housekeeping genes and positive control genes were used to calculate lane-specific normalization factors. These are generated by calculating a geometric mean of the positive controls or housekeeping genes for each sample lane, next the arithmetic mean of the geometric means across all sample lanes is computed, and following that, the arithmetic mean is divided by each geometric mean to generate lane-specific normalization factors (173). Finally, the counts for each gene across all samples are multiplied by both normalization factors to correct for sample input variability and technical variability, thereby leaving biologically relevant results (173).

Following normalization, an additional round of quality control was performed to ensure that the normalization process did not introduce any variability into the data unintentionally. To assess for any potential bias that may have been introduced, the calculated scaling factors were assessed for their potential to skew the data. Samples were flagged if their positive control scaling factor exceeded three-fold (i.e. less than 0.3 or greater than 3.0), which can occur if samples display relatively low counts for the positive controls and may indicate poor assay efficiency. Moreover, samples were flagged if their codeset content/housekeeping gene scaling factor exceeded ten-fold

(i.e. less than 0.1 or greater than 10.0), indicating that the housekeeping genes for that sample displayed much lower counts than the average, and thus, were of poor quality. Two samples were flagged during this analysis for extreme codeset content scaling factors and thus were excluded from further analysis. No samples were flagged for an extreme positive control scaling factor.

36

2.4.4 Data Visualization for Inspection for Batch Effects & Outliers

Computations and graphical representations of the data were generated using MATLAB software (Natick, MA). The normalized data was visually inspected to assess for potentially problematic samples, thereby increasing the confidence that the data was sufficiently robust for inclusion in the in-depth analysis. Total mRNA counts were computed by summing the molecular counts for each gene within a sample, and subsequently plotted to screen for samples that may differ significantly from the majority of the data. Next, the interquartile ranges (IQRs) were computed for each sample. These values were plotted to assess for variability in mRNA expression levels within each sample, as well as to compare these values across samples. Finally, the normalized data was log-transformed and visualized as a box plot, providing a summary of several important parameters, including the center of the data, the skewness, and the IQR across samples, as well as enabling the screening for potential outlying samples and/or batch effects.

2.4.5 Forgoing Filtering of Features with Low Expression Levels

The NanoString assay for quantifying mRNA expression utilized a custom code set of 151 key genes, with documented biological implications in CRC pathology. These genes were also selected on the basis of their potential interaction with metformin. Given the careful selection and strict inclusion criteria for these features, it was decided that features displaying relatively lower expression levels would not be filtered from the dataset to ensure that genes with potential discriminatory power would not be inadvertently removed.

37

2.5 Data Analysis

2.5.1 Assigning Data to Training & Validation Cohorts

The samples were randomly partitioned into two groups. Eighty percent of the data was designated for the training set and twenty percent was used for the validation set. Statistical tests were computed using IBM SPSS Statistics Version 26 (Armonk, NY), to ensure that the training and validation cohorts did not differ significantly in their distribution of the key clinicopathological features (age at diagnosis, gender, tumour sidedness, stage at diagnosis, and grade), thereby demonstrating that the dispersal of these covariates amongst both cohorts best approximated that of the general population (201).

2.5.2 Feature Selection for Dimension Reduction

Feature selection is a method of dimension reduction used to optimize data for the application of machine learning techniques. Following the partitioning of data into training and validation cohorts, a supervised machine learning approach was used to identify features with the greatest discriminatory power between metformin-treated patients and those who were diet- controlled, within the training cohort. Discriminating features are variables, in this case the mRNAs within the targeted gene panel, that have the ability to separate classes in N-dimensional space. The focus of this project involved elucidating the molecular discrepancies between metformin-treated and diet-controlled patients that may contribute to the differential clinical outcomes documented in the literature. Consequently, the sequentialfs function in MATLAB was applied to the training dataset with five-fold cross-validation to identify features that best discriminate between metformin-treated and diet-controlled diabetic CRC patients. Fifty iterations were performed to generate a list of discriminating features; 25 iterations were performed using

38

an ensemble of weak tree learners and the other 25 were performed using an ensemble of linear discriminant models. The sequentialfs algorithm works by sequentially adding features to candidate feature subsets until the performance of the model no longer improves (202). To do this, sequentialfs uses a function handle to call an internal function, that computes the specified criterion used to determine which features are selected (203). For this purpose, the criterion is defined as the misclassification rate of the model using each subset of candidate features. The same methodology was used to identify features that best discriminate between samples based on their five-year survival status in the non-diabetic subset of our study cohort.

2.5.3 Unsupervised Machine Learning: Hierarchical Clustering

Hierarchical clustering was performed in MATLAB using the clustergram function, which generates a heat map and a corresponding dendrogram, to provide a visual representation of the hierarchical clusters formed from the molecular data (204). This function was applied using both the Euclidean distance and correlation coefficients as measures of similarity; three different linkage parameters were assessed, including average, single, and complete linkage. The performance of the algorithm was evaluated using each distinct combination of similarity and linkage measures to determine which selection optimized the model performance on the dataset.

2.5.4 Supervised Machine Learning: Building a Classification Algorithm

2.5.5 Classifying Metformin-Treated & Diet-Controlled Samples

Following dimension reduction, the selected features were used to build a classifier in the

MATLAB Classification Learner App. To do this, the training dataset was used to train several

39

algorithms for the classification of samples as either “metformin-treated” or “diet-controlled”, with five-fold cross-validation. The “No Free Lunch Theorem” states that no classification method is superior or inferior, therefore, it is important to try multiple algorithms to determine which one performs best with the data set being used (197). As a result, all of the available models were trained in the MATLAB application and their performances were evaluated. The performances of the models were also explored with the sequential removal of features to determine if these actions would improve the performance accuracy. Finally, the optimized model was exported and applied to the validation data set to predict the diabetic-treatment group, thereby assessing for potential overfitting to the training data and thus, the generalizability of the algorithm.

2.5.6 Classifying Samples According to their Five-Year Survival Status

The features with the greatest discriminatory power between non-diabetic patients that survived for five years following their cancer diagnosis and those that did not were used to build a second algorithm to classify patients according to their five-year survival status. Using the same workflow that was described in the aforementioned section, a classification algorithm was trained and optimized, then exported and applied to the validation cohort to assess its generalizability.

2.5.7 Investigation of Gene Expression Profiles in Key Discriminating Features

Following feature selection, we further investigated the expression patterns of genes that were selected in the diabetic treatment analysis and the five-year survival analysis. Box plots for gene expression in metformin-treated samples, diet-controlled samples, and non-diabetic samples grouped based on their five-year survival status were generated for each of the common features

40

and evaluated for patterns that may suggest a gene was linked to metformin’s antineoplastic effects. Two-tailed independent sample t-tests were also computed to evaluate patterns in expression that were trending towards, or that reached statistical significance at the five percent level. Tests were performed specifically to evaluate whether expression differed between metformin-treated and diet-controlled samples, metformin-treated samples and non-diabetic samples alive at five years, metformin-treated samples and non-diabetic samples deceased at five years, diet-controlled samples and non-diabetic samples alive at five years, diet-controlled samples and non-diabetic samples deceased at five years, non-diabetic samples alive and non-diabetic samples deceased at five years, metformin-treated samples and non-diabetics, as well as diet- controlled samples and non-diabetics.

2.6 Statistical Analysis

All statistical tests were computed using IBM SPSS Statistics Version 26 (201). A one- way ANOVA was used to evaluate whether key clinicopathological features measured on a continuous scale (i.e. age at diagnosis, BMI, %HbA1c, CEA level) differed in their mean values between diabetic treatment groups. A Kruskal-Wallis test was performed to test for statistically significant differences in key clinicopathological features recorded as ordinal data across diabetic treatment groups (i.e. stage at diagnosis, histopathological grade), and a chi-square test was used to evaluate dichotomous and nominal variables (i.e. gender, tumour sidedness, two-year survival, five-year survival, perineural invasion, and lymphovascular invasion). To assess for differences in these parameters between training and validation cohorts, a two-tailed independent samples t-test was performed for continuous variables, a chi-square test was performed for dichotomous variables, and a Mann Whitney U-Test was used for ordinal variables.

41

Chapter 3

Results

3.1 mRNA Quantification in the Study Cohort

To investigate metformin’s capacity to alter the regulation of key metabolic and oncogenic pathways, we used a targeted gene panel of 151 metformin-related genes to quantify mRNA of interest in our study cohort. An overview of the workflow in the mRNA quantification process and data preprocessing is displayed in Figure 2, and details of each step are described below.

3.1.1 RNA Extraction

RNA was extracted from 116 FFPE primary tumour specimens and samples were evaluated for their RNA concentration and quality. Six samples were excluded from further steps of the analysis as they fell below the required threshold concentration of 40ng/L for the NanoString nCounter assay (Figure 2). The RNA integrity number (RIN) of the samples was also evaluated using an Agilent 2100 Bioanalyzer to assess the degree of fragmentation. Despite displaying high levels of RNA degradation (mean RIN = 2.3), no samples were excluded based on their RIN number, due to documentation of the robust performance of NanoString nCounter® technology on highly degraded RNA (174). To satisfy the number of samples that could be processed in the

NanoString nCounter assay, two non-diabetic samples were randomly excluded, leaving 108 samples for mRNA quantification.

42

Figure 2. Overview of the mRNA Quantification Process with NanoString nCounter® Technology & Data Preprocessing in nSolver® Software.

43

3.1.2 NanoString nCounter Assay & Initial Quality Control

Following the initial NanoString runs, three diabetic samples were flagged for the Limit of

Detection QC metric. To increase the number of diabetic samples in our cohort for the subsequent analysis, we randomly excluded three non-diabetic samples from later runs to allow the samples that failed QC to be re-assessed in later NanoString cartridges. These samples were re-run using the stock concentration, thereby maximizing RNA input.

Following the nCounter assay, the raw data in the form of RCC files was imported into the nSolver software to be preprocessed. At this step, our raw data (N = 105) was merged with the raw NanoString data that was previously collected (N = 177). First, the quality control metrics reported in the NanoString assay were examined to look for any problematic samples. Five samples were flagged for the Limit of Detection QC, and three of these samples were the re-run samples indicating that failure likely resulted due to poor sample quality, rather than errors in the experimental protocol (Table 1). These samples were excluded from additional preprocessing steps and data analyses to ensure they did not bias the normalization of the data and the results of the study, respectively.

To account for potential differences in the efficiency of nCounter probes that were manufactured at different time points, we performed a batch calibration experiment using nSolver software. This facilitated the consolidation of data from separate batches and effectively removed extraneous noise that may have been introduced due to inconsistencies in the manufacturing of the

NanoString probe sets (200). Using the recommended guidelines provided by the nSolver 4.0

User Manual, we specified three reference samples that were included in separate batches to calibrate the data. Batch calibration is an automated process in the software; to scale one batch to another, the software uses the mRNA counts from the technical replicates to compute the percent

44

Table 1. Overview of Samples Flagged for Limit of Detection Quality Control. The limit of detection is computed by calculating the geometric mean of the negative controls for each sample, plus two standard deviations. Raw counts of the positive control POS_E, the 0.5fM synthetic positive control probe, must be greater than this limit to pass QC. All listed samples were from NanoString Batch 3. *Indicates a sample that was re-run.

Value of POS_E Sample Limit of Detection QC of mRNA (0.5fM Positive Control)

s10-18868 203.92 129 s09-12330 98.71 80 s13-9192 123.47 106 s13-19467 360.63 93 p07-20271 106.02 101 s10-18868* 283.35 76 s13-19467* 381.75 42 s13-9192* 231.06 76

45

difference in raw counts for each gene with respect to the reference sample in the separate batch, thereby generating gene-specific scaling factors (200). Finally, the software multiplies the unique scaling factor generated for each gene to the raw counts for the same gene in all other samples of the separate batch, thus accounting for variability in the probes from different manufacturing lot numbers. Following batch calibration, the technical replicates were removed from the data set to ensure they would not bias subsequent analyses.

3.1.3 Evaluation of Housekeeping Genes

After the first round of quality control and batch calibration, 274 samples remained (Batch

1: N = 96, Batch 2: N = 81, Batch 3: N = 95). The custom codeset included seven housekeeping genes, which were selected for inclusion on the basis that their expression levels are assumed to remain relatively constant across samples despite differences in disease pathology, diabetic treatment groups, or other covariates. Prior to the normalization process, the raw data was assessed to ensure that the housekeeping genes used to generate the codeset content normalization factors displayed low variability across samples and that genes exhibiting high (N = 2), moderate (N = 3), and low (N = 2) expression levels were included. A summary of the computed interquartile ranges, medians, and quartiles of the medians, to evaluate the variability of expression in these genes across samples, as well as their relative expression levels can be seen in Table 2. The inclusion of housekeeping genes across a range of expression levels ensured that the resulting codeset content normalization factors had the capacity to remove technical variability, related to sample input, across samples with varying levels of expression.

46

Table 2. Evaluation of Housekeeping Genes Used for Normalization of Data. Using Microsoft Excel® the interquartile range (IQR) and median expression value was computed for each housekeeping gene across all samples included in the normalization process. Finally, the quartiles of the medians were computed to classify genes as having “Low”, “Moderate” or “High” expression levels.

Housekeeping IQR Across Median Across Quartiles of Median Values Expression Gene Samples Samples Level

Lower Quartile: 673.5 HPRT1 553.25 518.5 Low GUSB 539.25 673.5 Low

Median: 1243 CLTC 737.5 798.5 Moderate TUBB 1621.25 1243 Moderate PGK1 5097.25 4934.5 Moderate

Upper Quartile: 12734.5 GAPDH 10835 12734.5 High RLPL0 15963 15674 High

47

3.1.4 Data Normalization & Additional Quality Control

The raw data was normalized in a two-step process using nSolver software provided by

NanoString, to remove the two main sources of technical variability, thereby leaving biologically relevant differences for the analysis. To ensure that the normalization process did not inadvertently bias the data, an additional round of quality control was performed, by investigating the magnitude of the computed normalization factors. Two samples were flagged for extreme Codeset Content

Normalization Factors, indicating that they likely displayed significantly lower mRNA counts in their housekeeping genes compared to other samples, thus suggesting that the samples may have been of poor quality (Table 3). Further investigation of the magnitudes of the corresponding scaling factors revealed that they well exceeded the ten-fold limit, with values at 15.3 and 15.73, respectively. Moreover, mRNA counts for the housekeeping genes in the flagged samples revealed extremely low counts, with the percent deviation between mRNA counts for a housekeeping gene in the flagged samples and that of the average counts across all samples for the same gene falling between an 87 percent to 98 percent discrepancy. Given these extreme values, the samples were deemed to be of poor quality and thus were excluded from the analysis to ensure they would not skew the results. Following the removal of flagged samples, a total of 272 samples remained.

3.1.5 Data Visualization Indicates that Preprocessing was Successful

After the preprocessing steps in the nSolver® software were complete, we visualized the data to confirm that quality control, batch calibration, and normalization were successful. Using

MATLAB software, graphical representations of several key parameters in our data were produced to screen for trends suggestive of batch effects or poor quality samples (205). To do this,

48

Table 3. Overview of Samples Flagged for Normalization. The percent deviation from the average level of expression was computed by subtracting the mRNA counts of a flagged sample from the mean count number across all samples; the difference was then divided by the mean count number across all samples and multiplied by 100.

Codeset Content Normalization Flagged Samples

Housekeeping Mean Count Gene of All s09-9525 s11-9855 Samples mRNA % Deviation mRNA % Deviation Counts Counts

CLTC 910.675 27 97.0 64 93.0 GAPDH 14756.125 1218 91.7 1158 92.2 GUSB 777.2 90 88.4 98 87.4 HPRT1 611.657143 32 94.8 24 96.1 PGK1 5986.26786 724 87.9 544 90.9 RLPL0 17969.5893 2223 87.6 823 95.4 TUBB 1658.95357 25 98.5 35 97.9

49

we plotted the total mRNA counts, the interquartile ranges, and box plots for each sample in our dataset, in the order that they were processed in the NanoString assay. We evaluated total mRNA counts across samples to look for samples that may display relatively low counts, which could indicate poor quality data. No samples displayed expression levels that approached the lower limit, suggesting that the data produced was of high quality (Figure 3A). Some samples exhibited high total mRNA counts, signifying that these samples may have elevated expression of some of the selected genes; this is not an indicator of low-quality samples, as variability in expression patterns is expected when using a targeted gene panel, and thus reflects legitimate biological discrepancies between samples (Figure 3A). Next, the IQR values for mRNA expression in each sample were evaluated. The IQR values display a degree of variability between samples. However, all IQR values fall within the lower and upper bounds (Figure 3B). Furthermore, no samples display extremely low levels of variability, which is often an indicator of poor sample quality or assay failure. Given that the custom code set is composed of 151 predetermined genes that are biologically implicated in CRC pathology, and that may interact with metformin, it is expected that mRNA expression will differ between samples. Moreover, CRC is a highly heterogeneous disease, displaying inconsistent mutational profiles between patients, as well as intratumoural heterogeneity; consequently, the IQR variability suggests that the molecular counts reflect biologically relevant differences. Following that, box plots were generated for each sample, providing a comprehensive summary of key metrics including the center of the data, the spread, the IQR, the skewness, and the presence or absence of outliers, and facilitates their comparison between samples (Figure 3C/D). The medians across samples are similar, showing only a low degree of variability. Furthermore, following log transformation, the data for each sample approximates a normal distribution, as the data is not heavily skewed. The vast majority of samples

50

5 mRNA Total Counts A 10

10

8

s

t n

u 6

o

C

l

a

t o

4 T

2

0

0 4 8 5 6 9 8 2 2 0 7 2 6 4 3 2 5 0 6 7 0 1 1 3 8 7 6 7 7 0 1 2 8 8 3 1 7 1 4 2 5 2 9 8 0 6 3 4 3 4 9 9 4 4 0 0 9 3 3 2 3 4 3 4 1 9 7 8 3 1 1 7 4 2 5 9 0 1 4 3 2 8 6 8 5 0 2 6 0 4 0 3 5 5 1 6 5 1 8 1 7 0 9 3 4 5 9 1 8 0 0 0 9 7 8 5 6 7 0 6 7 5 9 1 2 6 4 9 6 0 8 8 2 1 0 8 6 8 8 2 1 4 9 1 3 0 3 8 0 5 2 9 0 3 2 4 3 7 7 8 0 3 6 3 8 5 4 4 6

0 4 9 3 8 2 7 7 6 1 4 9 1 3 1 4 2 3 3 6 6 3 7 6 3 5 8 9 0 0 6 5 8 8 4 4 3 3 0 6 4 0 1 3 6 9 5 7 2 4 5 6 3 3 6 0 5 5 2 5 8 4 8 0 9 7 8 1 1 9 5 5 4 4 4 3 1 8 1 6 8 5 5 3 7 7 4 5 6 1 9 0 7 0 4 3 7 9 8 7 3 3 3

4 9 3 0 7 5 2 0 0 3 4 3 5 9 8 8 9 2 5 0 5 8 2 0 5 1 7 0 7 5 8 9 2 1 1 5 3 3 6 6 4 6 1 5 4 6 5 4 4 7 6 7 9 3 0 2 4 8 6 1 0 8 1 7 3 2 7 4 5 8 3 9 8 6 3 6 0 7 6 5 9 8 9 6 2 0 7 1 3 2 3 5 4 2 5 1 6 9 0 0 2 1 0 4 5 4 5 9 2 0 0 6 3 0 5 3 8 5 0 7 9 7 8 9 1 2 9 0 1 9 8 5 0 1 3 2 9 3 3 0 7 1 3 6 8 2 6 3 8 2 7 7 9 2 1 7 5 7 2 5 6 6 4 5 0 6 6 9 6

4 3 0 2 1 5 9 4 3 4 1 5 1 1 1 4 2 6 0 6 6 0 7 7 9 1 3 8 6 4 9 3 8 7 1 9 7 7 8 2 4 8 3 5 6 7 1 6 2 9 9 2 0 4 1 2 1 5 0 4 6 3 6 1 2 4 0 6 1 5 4 8 9 7 3 3 7 7 3 7 9 8 8 7 3 9 3 3 1 5 4 5 7 8 7 0 8 6 9 2 4 6 3

1 7 6 2 6 8 8 8 9 3 0 6 2 5 8 9 3 6 5 2 5 2 6 6 2 1 7 0 1 3 9 1 0 2 8 8 2 2 9 6 6 1 7 5 2 5 4 0 4 0 1 0 8 4 8 0 6 1 5 9 7 4 8 3 7 6 1 2 2 9 4 1 6 0 0 4 8 9 3 6 9 0 3 6 5 4 1 2 1 0 4 5 1 4 0 8 7 2 9 3 5 5 9 9 4 9 2 2 0 8 6 4 0 7 9 8 1 6 7 5 6 6 9 1 4 4 2 6 7 0 4 2 3 4 8 5 1 5 6 9 2 5 6 0 2 0 2 5 1 0 3 3 2 6 8 3 3 5 4 8 1 5 0 5 8 8 3 7 5

7 8 2 1 4 3 3 8 5 8 2 4 1 2 7 5 4 7 9 4 1 8 8 5 6 2 9 5 8 7 2 3 3 4 3 9 8 7 9 4 7 9 6 5 6 8 8 3 6 9 2 6 9 9 2 2 5 0 5 8 4 2 1 5 0 8 4 9 0 7 5 5 1 6 1 6 6 3 5 0 9 1 4 2 2 5 1 2 9 6 4 2 1 3 7 3 3 7 2 1 0 9 7

- -

- - - - -

6 0 2 5 8 3 4 5 1 2 1 7 1 3 2 7 8 8 4 6 0 1 5 6 4 3 5 3 8 0 0 1 0 1 3 8 3 7 7 1 0 1 0 4 1 2 1 7 5 2 2 4 8 1 3 9 2 4 6 8 4 4 9 0 8 0 8 8 2 9 2 1 7 5 7 2 3 7 1 1 9 2 7 4 9 7 8 4 2 0 0 1 3 1 9 1 1 3 5 5 7 8 8 0 0 8 0 0 4 9 0 6 9 2 2 4 6 0 0 2 3 4 9 7 2 4 1 3 4 5 6 7 2 6 3 3 3 8 0 1 2 4 1 4 8 0 3 7 2 1 5 3 3 0 1 6 7 5 1 6 7 8 8 5 0 0 1

8 8 7 3 2 8 8 3 6 7 9 2 1 2 4 9 2 6 9 2 3 9 2 5 1 0 2 7 9 9 9 3 4 4 5 6 0 1 0 1 2 0 6 0 7 2 4 3 2 2 2 3 1 5 4 7 0 1 1 8 8 2 4 5 6 2 3 3 6 1 1 3 8 4 2 3 1 7 3 3 5 1 6 5 1 4 7 0 2 7 5 3 5 5 4 0 2 4

------8 ------7 ------

- - - - - 1 - 3 ------1 - - 1 ------9 - -

1 2 1 1 2 1 1 1 1 1 2 2 1 1 2 2 1 2 2 2 1 2 2 1 1 2 2 2 1 1 1 2 1 1 1 1 1 2 1 1 1 1 1 1 1 2 2 1 2 1 1 1 1 1 1 1 2 2 1 1 1 1 1 2 0 1 1 0 0 0 0 0 0 1 0 1 0 1 0 0 0 1 0 0 1 1 1 1 0 0 1 1 0 0 1 1 2 0 1 2 0 0 1 0 0 0 0 0 0 0 0

1 1 1 2 1 1 1 1 2 2 1 1 1 2 2 1 1 2 2 1 1 1 2 2 0 1 1 2 1 1 2 1 2 2 1 1 2 1 0 1 1 1 1 1 1 1 1 1 1 2 1 1 1 0 2 1 1 1 1 1 1 2 0 1 2 1

- - 7 7 7 8 8 8 ------6 6 7 7 - - - - 9 1 5 6 - 7 8 - - - - - 0 0 5 5 5 6 - 8 - 8 ------7 7 - 8 9 0 - 1 1 5 - - - 6 - - - 0 9 0 1 5 - 5 - - 7 7 - 9 9 - 0 - - - - - 7 - - - 9 1 1 ------0 ------7 7 ------7 ------8 5 - - - - 8 8 - - - - 8 ------5 - - -

- - 1 - - 2 2 - - - - 3 - - 1 1 1 - - - - - 1 3 - 1 - - - 2 2 2 2 3 3 ------1 1 1 - - - 2 ------0 - - 1 1 - - - 9 ------0 - 1 - - - - - 1 2 3 9 1 - 2 2 2 - - - - - 0 9 - 1 - - -

6 7 0 0 0 0 0 0 9 9 9 1 5 5 0 0 0 0 9 9 9 9 0 1 0 0 7 0 0 9 9 9 0 0 1 1 0 0 0 0 6 0 8 0 9 9 0 1 5 5 7 0 0 8 0 0 1 1 1 1 0 6 6 6 0 7 7 8 0 1 1 0 5 0 7 7 0 0 8 0 0 0 1 5 5 6 6 7 0 7 7 9 0 1 1 s s s 4 7 7 7 7 7 7 7 7 7 4 7 7 7 7 7 0 0 8 8 8 9 7 7 7 7 7 7 0 8 3 6 6 7 7 7 7 8 7 7 7 s 7 7 7 7 7 7 8 0 0 7 7 7 7 0 0 7 7 7 8 0 7 7 8 6 7 7 7 s 0 7 7 7

1 1 1 2 2 1 1 3 3 3 3 1 1 1 1 1 3 3 3 3 3 1 1 1 2 2 2 1 1 1 1 1 1 7 7 7 7 7 8 1 1 2 2 2 1 2 8 9 2 2 8 9 1 1 1 1 2 2 9 0 0 1 1 1 2 0 0 1 3 3 1 1 1 2 8 0 0 1 1 1 1 0 1 2 1 1 1 3 3 3 9 9 0 0 1 2 1 2

0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 s s s s s s s s s s s s s s s s 0 0 0 0 0 0 0 0 0 0 s 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 s s 0 0 s 0 0 0 0 0 p p 0 0 0 0 p p s s 0 0 0 0 p s s s 0 0 0 s s s s s 0 0 0 0 s s p 0 0 0

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 1 1 1 0 0 1 1 0 0 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 0 0 1 1 1 1

s s s s s s s s s s s s s s s s s s s s s s s s s s s s s p p p p p p p p s s s s p p p s s s s s p p s s s s s s s p p p p p s s s s s p p p p s s p p p p s s s s p p p s s s s p p p p s s s s p p p s s Sample ID Block Number

B mRNA Data IQR

1600

1400

1200

R 1000

Q I

800

600

400

0

4

8

5

6

9

8

2

2

0

7

2

6

4

3

2

5

0

6

7

0

1

1

3

8

7

6

7

7

0

1

2

8

8

3

1

7

1

4

2

5

2

9

8

0

6

3

4

3

4

9

9

4

4

0

0

9

3

3

2

3

4

3

4

1

9

7

8

3

1

1

7

4

2

5

9

0

1

4

3

2

8

6

8

5

0

2

6

0

4

0

3

5

5

1

6

5

1

8

1

7

0

9

3

4

5

9

1

8

0

0

0

9

7

8

5

6

7

0

6

7

5

9

1

2

6

4

9

6

0

8

8

2

1

0

8

6

8

8

2

1

4

9

1

3

0

3

8

0

5

2

9

0

3

2

4

3

7

7

8

0

3

6

3

8

5

4

4

6

0

4

9

3

8

2

7

7

6

1

4

9

1

3

1

4

2

3

3

6

6

3

7

6

3

5

8

9

0

0

6

5

8

8

4

4

3

3

0

6

4

0

1

3

6

9

5

7

2

4

5

6

3

3

6

0

5

5

2

5

8

4

8

0

9

7

8

1

1

9

5

5

4

4

4

3

1

8

1

6

8

5

5

3

7

7

4

5

6

1

9

0

7

0

4

3

7

9

8

7

3

3

3

4

9

3

0

7

5

2

0

0

3

4

3

5

9

8

8

9

2

5

0

5

8

2

0

5

1

7

0

7

5

8

9

2

1

1

5

3

3

6

6

4

6

1

5

4

6

5

4

4

7

6

7

9

3

0

2

4

8

6

1

0

8

1

7

3

2

7

4

5

8

3

9

8

6

3

6

0

7

6

5

9

8

9

6

2

0

7

1

3

2

3

5

4

2

5

1

6

9

0

0

2

1

0

4

5

4

5

9

2

0

0

6

3

0

5

3

8

5

0

7

9

7

8

9

1

2

9

0

1

9

8

5

0

1

3

2

9

3

3

0

7

1

3

6

8

2

6

3

8

2

7

7

9

2

1

7

5

7

2

5

6

6

4

5

0

6

6

9

6

4

3

0

2

1

5

9

4

3

4

1

5

1

1

1

4

2

6

0

6

6

0

7

7

9

1

3

8

6

4

9

3

8

7

1

9

7

7

8

2

4

8

3

5

6

7

1

6

2

9

9

2

0

4

1

2

1

5

0

4

6

3

6

1

2

4

0

6

1

5

4

8

9

7

3

3

7

7

3

7

9

8

8

7

3

9

3

3

1

5

4

5

7

8

7

0

8

6

9

2

4

6

3

1

7

6

2

6

8

8

8

9

3

0

6

2

5

8

9

3

6

5

2

5

2

6

6

2

1

7

0

1

3

9

1

0

2

8

8

2

2

9

6

6

1

7

5

2

5

4

0

4

0

1

0

8

4

8

0

6

1

5

9

7

4

8

3

7

6

1

2

2

9

4

1

6

0

0

4

8

9

3

6

9

0

3

6

5

4

1

2

1

0

4

5

1

4

0

8

7

2

9

3

5

5

9

9

4

9

2

2

0

8

6

4

0

7

9

8

1

6

7

5

6

6

9

1

4

4

2

6

7

0

4

2

3

4

8

5

1

5

6

9

2

5

6

0

2

0

2

5

1

0

3

3

2

6

8

3

3

5

4

8

1

5

0

5

8

8

3

7

5

7

8

2

1

4

3

3

8

5

8

2

4

1

2

7

5

4

7

9

4

1

8

8

5

6

2

9

5

8

7

2

3

3

4

3

9

8

7

9

4

7

9

6

5

6

8

8

3

6

9

2

6

9

9

2

2

5

0

5

8

4

2

1

5

0

8

4

9

0

7

5

5

1

6

1

6

6

3

5

0

9

1

4

2

2

5

1

2

9

6

4

2

1

3

7

3

3

7

2

1

0

9

7

-

-

-

-

-

-

-

6

0

2

5

8

3

4

5

1

2

1

7

1

3

2

7

8

8

4

6

0

1

5

6

4

3

5

3

8

0

0

1

0

1

3

8

3

7

7

1

0

1

0

4

1

2

1

7

5

2

2

4

8

1

3

9

2

4

6

8

4

4

9

0

8

0

8

8

2

9

2

1

7

5

7

2

3

7

1

1

9

2

7

4

9

7

8

4

2

0

0

1

3

1

9

1

1

3

5

5

7

8

8

0

0

8

0

0

4

9

0

6

9

2

2

4

6

0

0

2

3

4

9

7

2

4

1

3

4

5

6

7

2

6

3

3

3

8

0

1

2

4

1

4

8

0

3

7

2

1

5

3

3

0

1

6

7

5

1

6

7

8

8

5

0

0

1

8

8

7

3

2

8

8

3

6

7

9

2

1

2

4

9

2

6

9

2

3

9

2

5

1

0

2

7

9

9

9

3

4

4

5

6

0

1

0

1

2

0

6

0

7

2

4

3

2

2

2

3

1

5

4

7

0

1

1

8

8

2

4

5

6

2

3

3

6

1

1

3

8

4

2

3

1

7

3

3

5

1

6

5

1

4

7

0

2

7

5

3

5

5

4

0

2

4

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

8

-

-

-

-

-

-

-

-

-

-

-

-

-

-

7

-

-

-

-

-

-

-

-

-

-

-

-

-

-

1

-

3

-

-

-

-

-

-

-

-

-

-

1

-

-

1

-

-

-

-

-

-

-

-

-

-

-

-

9

-

-

1

2

1

1

2

1

1

1

1

1

2

2

1

1

2

2

1

2

2

2

1

2

2

1

1

2

2

2

1

1

1

2

1

1

1

1

1

2

1

1

1

1

1

1

1

2

2

1

2

1

1

1

1

1

1

1

2

2

1

1

1

1

1

2

0

1

1

0

0

0

0

0

0

1

0

1

0

1

0

0

0

1

0

0

1

1

1

1

0

0

1

1

0

0

1

1

2

0

1

2

0

0

1

0

0

0

0

0

0

0

0

1

1

1

2

1

1

1

1

2

2

1

1

1

2

2

1

1

2

2

1

1

1

2

2

0

1

1

2

1

1

2

1

2

2

1

1

2

1

0

1

1

1

1

1

1

1

1

1

1

2

1

1

1

0

2

1

1

1

1

1

1

2

0

1

2

1

-

-

7

7

7

8

8

8

-

-

-

-

-

-

6

6

7

7

-

-

-

-

9

1

5

6

-

7

8

-

-

-

-

-

0

0

5

5

5

6

-

8

-

8

-

-

-

-

-

-

-

7

7

-

8

9

0

-

1

1

5

-

-

-

6

-

-

-

0

9

0

1

5

-

5

-

-

7

7

-

9

9

-

0

-

-

-

-

-

7

-

-

-

9

1

1

-

-

-

-

-

-

-

-

-

-

0

-

-

-

-

-

-

7

7

-

-

-

-

-

-

-

-

-

-

7

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

8

5

-

-

-

-

8

8

-

-

-

-

8

-

-

-

-

-

-

-

5

-

-

-

-

-

1

-

-

2

2

-

-

-

-

3

-

-

1

1

1

-

-

-

-

-

1

3

-

1

-

-

-

2

2

2

2

3

3

-

-

-

-

-

-

1

1

1

-

-

-

2

-

-

-

-

-

-

-

0

-

-

1

1

-

-

-

9

-

-

-

-

-

-

-

-

-

-

0

-

1

-

-

-

-

-

1

2

3

9

1

-

2

2

2

-

-

-

-

-

0

9

-

1

-

-

-

6

7

0

0

0

0

0

0

9

9

9

1

5

5

0

0

0

0

9

9

9

9

0

1

0

0

7

0

0

9

9

9

0

0

1

1

0

0

0

0

6

0

8

0

9

9

0

1

5

5

7

0

0

8

0

0

1

1

1

1

0

6

6

6

0

7

7

8

0

1

1

0

5

0

7

7

0

0

8

0

0

0

1

5

5

6

6

7

0

7

7

9

0

1

1

s

s

s

4

7

7

7

7

7

7

7

7

7

4

7

7

7

7

7

0

0

8

8

8

9

7

7

7

7

7

7

0

8

3

6

6

7

7

7

7

8

7

7

7

s

7

7

7

7

7

7

8

0

0

7

7

7

7

0

0

7

7

7

8

0

7

7

8

6

7

7

7

s

0

7

7

7

1

1

1

2

2

1

1

3

3

3

3

1

1

1

1

1

3

3

3

3

3

1

1

1

2

2

2

1

1

1

1

1

1

7

7

7

7

7

8

1

1

2

2

2

1

2

8

9

2

2

8

9

1

1

1

1

2

2

9

0

0

1

1

1

2

0

0

1

3

3

1

1

1

2

8

0

0

1

1

1

1

0

1

2

1

1

1

3

3

3

9

9

0

0

1

2

1

2

0

0

0

0

0

1

0

0

0

0

0

0

0

0

0

0

1

1

0

0

0

0

1

1

0

0

0

0

1

0

0

0

0

0

0

0

0

0

0

1

0

0

0

0

0

0

0

0

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

0

0

0

0

0

0

0

0

0

0

s

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

s

s

0

0

s

0

0

0

0

0

p

p

0

0

0

0

p

p

s

s

0

0

0

0

p

s

s

s

0

0

0

s

s

s

s

s

0

0

0

0

s

s

p

0

0

0

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

0

0

0

0

0

0

1

1

1

1

0

0

1

1

0

0

1

1

1

0

0

1

1

1

1

1

1

1

1

1

1

1

1

0

1

1

1

1

1

1

1

0

0

1

1

1

1

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

p

p

p

p

p

p

p

p

s

s

s

s

p

p

p

s

s

s

s

s

p

p

s

s

s

s

s

s

s

p

p

p

p

p

s

s

s

s

s

p

p

p

p

s

s

p

p

p

p

s

s

s

s

p

p

p

s

s

s

s

p

p

p

p

s

s

s

s

p

p

p

s s Sample ID Block Number

C mRNA Data Box Plot (1st Half of Data)

15

s

t

n

u o

C 10

A

N R

m 5

0

0

4

3

8

7

6

1

4

1

3

3

3

6

6

3

3

8

9

0

4

3

3

0

6

0

4

2

0

7

2

6

4

6

7

0

1

6

0

1

2

8

8

5

9

0

6

3

4

3

4

9

4

3

4

3

4

9

7

8

2

9

0

3

6

5

0

2

6

0

0

3

5

4

3

9

2

1

2

7

4

3

4

1

9

1

1

1

2

6

0

6

6

0

6

9

5

3

8

6

0

6

5

8

8

4

9

7

7

8

2

4

9

8

5

6

9

8

2

0

3

4

3

5

9

3

2

5

0

5

0

5

8

1

3

8

7

7

7

7

5

8

9

2

1

3

1

7

1

4

2

4

2

1

8

4

6

5

4

4

7

6

9

4

3

0

0

9

8

3

2

3

8

1

7

1

2

7

4

1

1

7

4

6

5

6

0

1

4

5

2

8

9

8

2

0

7

1

3

4

3

5

4

5

1

6

7

8

0

1

4

5

9

8

5

8

2

5

1

2

1

4

2

7

9

4

1

8

7

7

6

1

9

5

8

4

9

3

8

7

1

9

8

7

9

4

1

7

3

0

7

5

2

0

9

3

0

6

2

5

8

8

9

2

5

2

5

2

2

0

5

1

7

0

7

3

9

1

0

2

1

5

3

3

6

6

6

6

7

5

2

5

4

0

4

0

1

7

9

4

0

2

4

1

6

1

0

4

8

3

3

6

1

2

3

8

3

9

8

0

3

4

8

7

6

6

9

8

3

6

5

4

1

2

1

2

4

5

1

2

5

1

8

8

2

3

2

3

3

3

6

7

9

4

1

2

7

4

4

2

6

9

2

3

7

5

2

2

1

0

2

7

2

3

3

4

3

4

5

6

0

1

6

0

6

2

6

8

8

8

1

2

1

7

1

3

8

9

3

6

4

6

0

1

6

6

2

1

5

0

1

0

0

1

0

1

8

8

2

2

9

6

0

1

0

5

1

2

1

7

5

2

2

0

8

1

8

0

6

4

5

9

7

4

9

0

7

0

8

8

5

9

4

1

6

7

0

7

2

9

3

1

9

0

2

6

4

9

7

8

4

0

0

0

1

4

0

8

1

1

7

1

2

8

8

1

1

1

1

2

2

2

4

5

9

1

1

1

2

2

8

9

1

5

1

2

2

7

9

9

9

3

4

1

1

1

2

2

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

1

2

2

5

8

3

4

5

1

1

2

1

1

1

2

7

8

8

1

1

2

2

5

6

4

3

1

3

8

1

2

2

1

2

3

8

3

7

7

1

2

1

2

4

1

2

2

1

1

2

2

4

8

2

3

9

2

1

6

8

4

1

1

2

8

1

1

1

2

2

9

2

1

1

5

1

2

3

7

1

1

9

1

7

1

1

1

1

1

2

2

2

1

3

1

9

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

-

1

1

1

2

2

2

2

3

3

3

3

3

1

1

1

1

1

3

3

3

3

3

3

3

1

1

2

2

2

2

2

2

2

3

3

7

7

7

7

7

6

7

7

7

7

8

8

8

9

9

9

1

5

5

6

6

7

7

9

9

9

9

9

1

5

6

7

7

8

9

9

9

0

0

0

0

5

5

5

6

6

8

8

8

9

9

0

1

5

5

7

7

7

8

8

9

0

1

1

1

5

6

6

6

6

7

7

8

8

9

0

1

5

5

5

7

7

7

7

8

9

9

0

0

5

5

6

6

7

7

7

7

9

9

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

1

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

0

1

0

0

0

0

0

0

0

0

0

0

0

1

0

0

0

0

0

0

0

0

1

1

1

1

0

0

0

0

0

0

0

0

0

0

1

1

0

0

0

0

0

0

0

0

1

1

1

1

0

0

0

0

0

0

0

0

0

0

1

1

0

0

0

0

0

0

0

0

0

0

1

1

0

0

0

0

0

0

0

0

0

0

1

1

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s

s s Sample ID Block Number

D

Figure 3. Visualization of Preprocessed Data. Samples were plotted in the order that they were processed to screen for batch effects. (A) The total molecular counts of each gene were summed for every sample, to screen for samples with low counts. Outliers fall outside the red line. (B) The IQR of the data provides an indicator of expression variability, to screen for samples with low levels of variability. (C/D) The box plots indicate the IQR, the median (the line that splits the box), the skewness, the lower and upper limits (the whiskers), and outliers (indicated with red plus signs). The top panel displays the first half of the samples processed and the bottom panel displays the second half of the samples processed.

51

contain mRNAs with outlying molecular counts (both high and low). However, no samples differ substantially from the norm and there are no consecutive trends in samples indicating a lack of sample outliers and batch effects, respectively. Following these assessments, it was concluded that the data was amply robust for subsequent in-depth analysis.

3.2 Non-T2D CRC Patients Display Higher Minimum Survival Times, Comorbid T2D is

More Common in Male CRC Patients & Insulin-Treated Patients Present with Higher Grade

Tumours

To investigate potential differences in the presentation of key clinicopathological features based on diabetic status and treatment group, several statistical tests were performed. Diabetic-

CRC patients (N = 194) were compared to non-diabetic CRC patients (N = 78), and the following diabetic treatment groups, diet-controlled (N = 55), metformin-treated (N = 72), insulin-treated (N

= 18), other oral antidiabetic agents (N = 4) and combination therapies (N = 35), were compared to one another. The key clinicopathological features included in this analysis were: age at diagnosis, gender, tumour sidedness, stage at diagnosis, histopathologic grade, two-year survival, five-year survival, minimum survival time, body mass index (BMI), glycosylated hemoglobin

(%HbA1c), lymphovascular invasion (LVI), perineural invasion (PNI), and circulating carcinoembryonic antigen (CEA) levels. A comprehensive breakdown of the distribution of these clinicopathological features based on diabetic status and treatment groups can be seen in Table 4.

A two-tailed independent samples t-test and a one-way ANOVA were used to test for differences in the mean of continuous variables (i.e. age at diagnosis, minimum survival time,

BMI, %HbA1C and CEA level) between patients based on their diabetic status and treatment groups, respectively. Our results show that diabetic CRC patients differed from non-diabetic CRC

52

Table 4. Distribution of Clinicopathological Features Across Diabetic Status & Treatment Groups. Statistical tests were performed to assess for differences in key clinicopathological features between non-diabetic patients and all diabetic patients, and to assess for differences amongst diabetic treatment groups. *Indicates a p-value < 0.05.

Clinical/ Diabetic Status Diabetic Treatment Groups Pathological Non-T2D All T2D DietTx MetTx InsTx OthTx MixedTx Feature (N = 78) (N = 194) (N = 55) (N = 72) (N = 18) (N = 14) (N = 35) Age at Diagnosis (mean, median) 70, 70 70, 72 72, 74 72, 71 65, 69 72, 74 67, 70

Gender Male (%) 34 (44) 118 (61)* 36 (65) 42 (58) 11 (61) 7 (50) 22 (63) Female (%) 44 (56) 76 (39) 19 (35) 30 (42) 7 (39) 7 (50) 13 (37)

Tumour Sidedness Rectal (%) 14 (18) 60 (31) 16 (29) 19 (26) 8 (44) 5 (36) 12 (34) Left (%) 22 (28) 58 (30) 17 (31) 22 (31) 4 (22) 5 (36) 10 (29) Right (%) 41 (53) 76 (39) 22 (40) 31 (43) 6 (33) 4 (29) 13 (37)

Stage at Diagnosis Stage 1 (%) 7 (9) 35 (18) 7 (13) 17 (24) 1 (6) 1 (7) 9 (26) Stage 2 (%) 27 (35) 60 (31) 17 (31) 21 (29) 4 (22) 7 (50) 11 (31) Stage 3 (%) 34 (44) 85 (44) 28 (51) 29 (40) 10 (56) 4 (29) 14 (40) Stage 4 (%) 10 (13) 14 (7) 3 (5) 5 (7) 3 (17) 2 (14) 1 (3)

Histologic Grade Low (%) 68 (87) 166 (86) 48 (87) 62 (86) 11 (61) 13 (93) 32 (91) Moderate (%) 4 (5) 2 (1) 0 (0) 1 (1) 1 (6) 0 (0) 0 (0) High (%) 6 (8) 26 (13) 7 (13) 9 (13) 6 (33)* 1 (7) 3 (9)

Survival 2 Year (%) 65 (83) 161 (83) 45 (82) 59 (82) 14 (78) 12 (86) 31 (89) 5 Year (%) 48 (62) 125 (64) 36 (65) 42 (58) 10 (56) 10 (71) 27 (77) Minimum Days 2747, 2271, 2396, 2254, 2142, 1876, 2335, (mean, median) 3114* 2287 2500 2234 2020 2150 2316

Mean BMI 27.4 28.2 27.6 29.6 26.4 29.0 27.1 (range) (18-44) (17-53) (17-37) (19-53) (19-38) (25-33) (17-37) Not Specified 52 (65) 113 (58) 34 (62) 43 (60) 8 (44) 9 (64) 19 (54)

Mean %HbA1C N/A 7.8 6.9 8.3 8.3 7.4 7.9 (range) (4.7-60) (5-16.2) (4.7-60) (6-11) (5.7-9.5) (6-14.1) Not Specified 76 (39) 27 (49) 25 (35) 6 (33) 6 (43) 12 (34)

Invasiveness LVI (%) 24 (31) 59 (30) 20 (36) 21 (29) 8 (44) 3 (21) 8 (23) PNI (%) 12 (15) 21 (11) 6 (11) 8 (11) 3 (17) 0 (0) 4 (11)

Mean CEA 122.4 257.8 372.5 131.7 92.3 1246 20.2 (range) (0.6-4234) (0-8766) (0.7-8766) (0-1118) (1.1-919) (3.1-8039) (1.7-150) Not Specified 16 (21) 87 (45) 21 (38) 38 (53) 3 (17) 7 (50) 18 (51)

53

patients in their minimum survival times (p = 0.028). The average minimum survival time for diabetic-CRC patients (M = 2271) was significantly lower than the average minimum survival time for non-diabetic CRC patients (M = 2747). A Chi-Square test was used to test for differences in categorical variables (i.e. gender, tumour sidedness, two-year survival, five-year survival, LVI, and PNI) between patients based on their diabetic status and treatment groups. Our results indicate that non-diabetic CRC patients and diabetic CRC patients differ in their gender distributions (p =

0.017), with males being more likely to present with comorbid T2D. To test for differences in ordinal data (i.e. stage at diagnosis and histopathologic grade) based on diabetic status and treatment groups, Mann-Whitney U tests and Kruskal-Wallis tests were performed, respectively.

The results of these tests indicated that insulin-treated CRC patients differed from the other diabetic treatment groups in their histopathological grade, with insulin-treated patients presenting with higher-grade tumours (p = 0.035).

3.3 Bioinformatics: Analysis of Gene Expression Data

To investigate whether gene expression profiles discriminated between metformin-treated and diet-controlled diabetic CRC primary tumour samples, as well as if molecular profiles discriminated between non-diabetic CRC patient specimens based on their five-year survival status, we applied supervised learning techniques to identify genes with the strongest predictive power for our classes of interest, namely metformin-treated, diet-controlled, five-year alive (non-

T2D) and five-year deceased (non-T2D). The selected features were then used to generate models that could be evaluated based on their corresponding classification performance (Figure 4).

54

Figure 4. General Overview of the Bioinformatics Workflow. Following data preprocessing, the data was randomly partitioned into training and validation cohorts. Statistical tests were performed to ensure the distribution of key clinical and pathological features did not differ between the two cohorts. Next, feature selection was performed on the training data, to reduce the dimensionality, then hierarchical clustering, an unsupervised learning method was performed as an exploratory analysis. Finally, the selected features were used to build a classification algorithm with the training data. As a final step, after the model was optimized, it was exported and applied to the validation data to assess the performance generalizability.

55

3.3.1 Randomly Partitioned Training & Validation Cohorts did not Differ in their

Distribution of Key Clinicopathological features

To ensure that the models were built in an unbiased way, we randomly partitioned our data into training and validation cohorts. Roughly 80 percent of the samples were allocated to the training set for the purpose of generating a classification algorithm, and 20 percent were assigned to the validation cohort to assess the generalizability of the final model. Next, we performed statistical tests to evaluate the distribution of key clinicopathological features in our training and validation cohorts, thereby confirming that the dispersal of features best approximated that of the general population. Our results revealed that there were no significant differences in the patterns of clinicopathological feature distribution between the two cohorts, thus reducing the potential for these covariates to bias the model building and testing processes (Table 5).

3.3.2 A Supervised Learning Approach Identifies 43 Features Discriminating Between

Metformin-Treated & Diet-Controlled Samples & 61 Features Discriminating Between Non-

Diabetic CRC Samples Based on their Five-Year Survival Status

To narrow the scope of our investigation and pinpoint key genes that may be involved in metformin’s antineoplastic effects, we applied a supervised learning model to identify discriminating features from our targeted list of 151 genes, thereby effectively reducing the dimensions of our dataset. The function applied for this process was sequentialfs, with five-fold cross-validation using MATLAB software. The internal function called to compute the custom criterion, which measures the performance of a model developed with the corresponding candidate feature subset to determine what features are selected, was fitcensemble. Feature selection using these functions and specifications occurred as follows. The sequentialfs function sequentially

56

Table 5. Distribution of Key Clinicopathological Features in the Training and Validation Cohorts for the Metformin-Treated vs Diet-Controlled Analysis & Five-Year Survival Analysis in Non-Diabetic Samples. A Chi-Square statistic was used to investigate differences in categorical variables, a Mann-Whitney U test was used to evaluate differences in ordinal data, and a two- tailed independent samples t-test was used to investigate differences in continuous variables.

MetforminTx & DietTx Analysis Five-Year Survival Analysis

Clinical/ Pathological Train Test Statistical Assessment Train Test Statistical Assessment Feature Cohort Cohort Test P-value Cohort Cohort Test P-value (N = 102) (N = 25) Statistic (N = 62) (N = 16) Statistic

Diabetic 2 = 1264 0.938 N/A N/A Treatment MetTx (%) 58 (57) 14 (56) N/A N/A DietTx (%) 44 (43) 11(44) N/A N/A

Age at 72, 72 73, 75 t = -0.418 0.677 70, 70 69, 69 t = 0.084 0.934 Diagnosis (mean, median)

Gender 2 = 2.365 0.124 2 = 0.027 0.870 Male (%) 66 (65) 12 (48) 27 (44) 7 (44) Female (%) 36 (35) 13 (52) 35 (56) 9 (56)

Tumour 2 = 0.531 0.767 2 = 0.046 0.977 Sidedness Rectal (%) 28 (27) 7 (28) 11 (18) 3 (19) Left (%) 30 (29) 9 (36) 18 (29) 5 (31) Right (%) 44 (43) 9 (36) 33 (54) 8 (50)

Stage at U = 1211 0.681 U = 446.0 0.449 Diagnosis Stage 1 (%) 18 (18) 6 (24) 6 (10) 0 (0) Stage 2 (%) 31 (30) 7 (28) 22 (35) 6 (37) Stage 3 (%) 47 (46) 10 (40) 26 (42) 8 (50) Stage 4 (%) 6 (6) 2 (8) 8 (13) 2 (13)

Histologic U = 1231 0.655 U = 486.5 0.432 Grade Low (%) 89 (87) 21 (84) 54 (87) 15 (94) Moderate (%) 1 (1) 0 (0) 2 (3) 1 (6) High (%) 12 (12) 4 (16) 6 (10) 0 (0)

57

added features from the full gene list into candidate feature subsets. Five-fold cross-validation facilitated the division of the data into five non-overlapping subsets, whereby each iteration utilized a unique combination of training data (N = 4/5) and test data (N = 1/5). Next, the internal function fitcensemble is called and uses the training data to generate five distinct models whose classification accuracies were computed based on the model’s performance on the corresponding test data subsets. These accuracies for each mutually exclusive fold were then averaged, thereby providing an estimate of the overall accuracy of the model for a specified candidate feature subset, denoted the custom criterion. This process occurred repeatedly, as new features were sequentially added to the feature subsets, until the accuracy of the model no longer improved. At this point, the feature subset with the optimal accuracy was the output of the function.

This process was performed for the two separate analyses; namely, the metformin-treated against diet-controlled sample analysis and the investigation of non-diabetic samples grouped based on their five-year survival status. For each analysis, 50 iterations of the sequentialfs function were computed to increase the probability of identifying features with true discriminatory power.

The number of features selected in a single iteration for the metformin-treated compared to diet- controlled analysis ranged from one to eight, with the average number being 3.62. For the five- year survival analysis, the number of features selected per iteration ranged from one to six, with

2.88 being the average. In total, 43 unique features were identified as discriminating between metformin-treated samples and diet-controlled samples, while 61 features were identified for the five-year survival analysis (Figure 5).

58

A

B

Figure 5. Features Selected & the Frequency of Selection Using Multiple Iterations of a Supervised Learning Algorithm. (A) Discriminatory features selected using the sequentialfs function in MATLAB® between metformin-treated and diet-controlled samples. The bar graph plots the frequency at which each feature was selected over 50 iterations of the algorithm. (B) Features discriminating between non-diabetic samples based on their five-year survival status using the previously mentioned supervised learning algorithm and software.

59

3.3.3 The Classification Models Developed Using the Molecular Data Show Some Capacity to Predict Diabetic Treatment Group and Five-Year Survival Status

To investigate the capacity to classify samples according to metformin-treated or diet- controlled based on the transcriptomic data, we used a supervised machine learning approach to develop a model based on multidimensional pattern recognition intrinsic to the molecular data in the subset of genes identified during feature selection. The model was developed using the

Classification Learner Application in the MATLAB® software. Given the “No Free Lunch

Theorem”, we trained all available models to determine which would perform best with our data.

Starting with every feature (N = 43) that was selected, we trained all twenty-five available algorithms; next, we successively removed features starting with those that were least frequently selected and retrained the models. This process was repeated until the accuracy of the classifier no longer improved and conversely began to drop.

The optimal performance on the training data occurred with 11/42 of the features (genes selected at a frequency of less than 20 percent were excluded) and the corresponding accuracy achieved was 75.5 percent, using an ensemble of random subspace K-Nearest Neighbors (KNN) models. The value of K for this model was one, indicating that the algorithm was optimized by classifying samples into the category of that of the closest data point, measured based on the

Euclidean distance. The ensemble method of classification improves the performance accuracy by generating several weak models that are ultimately combined to form an ensemble that exhibits a greater accuracy than any individual learner. Using the random subspace method, features were randomly selected and used to build 30 distinct models; these were then combined to produce the final model. The main advantage of this method is that it facilitates the consideration of various features, thereby ensuring that one feature or a small set of features that may have a high level of

60

predictive power in the training data set is not over-emphasized in the model, which could lead to over-fitting and poor predictive performance on other data sets.

Of the total 44 diet-controlled samples, the algorithm correctly identified 25, and of the 58 metformin-treated samples, the model accurately classified 52 (Figure 6). These values correspond to a sensitivity (the true positive rate) of 89.7 percent for metformin-treated samples and 56.8 percent for diet-controlled samples. Correspondingly, the rates of false negatives were 10.3 and

43.2 percent, respectively. The positive predictive values (PPV), the probability that a sample classified as diet-controlled was from a diet-controlled patient, and the probability that a sample classified as metformin-treated was treated with metformin were 80.6 and 73.2 percent, respectively. To investigate the generalizability of this model to other data, we then applied the classifier to our test data. The performance of the model on the validation cohort was 46.2 percent.

Of the eleven diet-controlled samples, three were correctly identified, and of the 14 metformin- treated samples, nine were correctly classified. The drop in model performance indicates that this model was not generalizable to the test data. A comprehensive summary of model performance in the training and validation data can be seen in Table 6.

Using the same methodology, we generated an algorithm to classify non-diabetic samples based on their five-year survival status. The highest accuracy achieved was 80.6 percent including

19/61 of the selected features, using a Fine KNN model with Euclidean distance being the measure of similarity between two points, and where K was equal to one. Of the 37 non-diabetic samples that survived for five years following their CRC diagnosis, 34 were correctly recognized, and of the 25 deceased, 16 were accurately classified. These values correspond to a true positive rate of

91.9 percent for five-year alive samples and 64.0 percent for five-year deceased samples.

Correspondingly, the rates of false negatives were 8.1 and 36.0 percent, respectively. The PPVs

61

Table 6. Summary of Model Performance on Training & Validation Data. TPR = True Positive Rate; FNR = False Negative Rate; PPV = Positive Predictive Value; FDR = False Discovery Rate Metformin-Treated & Diet-Controlled Analysis Five-Year Survival Analysis (Non-Diabetics) Training Validation Training Validation Parameter Data Data Parameter Data Data (N = 102) (N = 25) (N = 62) (N = 16) Overall Accuracy 75.5 46.2 Overall Accuracy 80.6 68.8 TPR TPR MetTx 89.7 64.3 5-Y Alive 91.9 90.9 DietTx 56.8 27.3 5-Y Deceased 64.0 20.0 FNR FNR MetTx 10.3 35.7 5-Y Alive 8.1 9.1 DietTx 43.2 72.7 5-Y Deceased 36.0 80.0 PPV PPV MetTx 73.2 52.9 5-Y Alive 71.9 71.4 DietTx 80.6 37.5 5-Y Deceased 84.2 50.0 FDR FDR MetTx 26.8 47.1 5-Y Alive 28.1 28.6 DietTx 19.4 62.5 5-Y Deceased 15.8 50.0

Training Cohort Validation Cohort

Diet 25 19 Diet 3 8 True Class Metformin 6 52 Metformin 5 9 Diet Metformin Diet Metformin

Alive 34 3 Alive 10 1 True Class Deceased 9 16 Deceased 4 1 Alive Deceased Alive Deceased Predicted Class Predicted Class Figure 6. Confusion Matrices for the Classification Algorithms Built on the Training Data & their Corresponding Performance Once Applied to the Validation Data. This figure summarizes the performance of the model on the training and validation cohorts. Model predictions are on the x-axis and the true classes are shown on the y-axis. The blue boxes represent correct predictions made by the model and the red boxes indicate incorrect classifications. The shade (i.e. dark or light) of the box corresponding to the proportion of samples falling into a quadrant, with darker quadrants representing larger proportions of the data and lighter quadrants representing a smaller proportion of the data. The top panel corresponds to the metformin and diet analysis, while the bottom panel represents the five-year survival analysis.

62

were 79.1 and 84.2 percent for five-year alive and five-year deceased samples, respectively. The performance of this model only dropped to 68.8 percent when applied to the validation cohort. Of the 11 five-year alive samples, 10 were correctly identified, and of the five five-year deceased samples, one was correctly identified. The marginal drop in performance accuracy suggested that this model retained some predictive power which may be generalizable to other data. However, the model was biased towards predicting the more common class within the data set, underscoring the need for further investigation to confirm the utility of the algorithm.

3.4 Investigation of Discriminating Genes Common in the Two Analyses Reveals Five

Distinct Expression Profiles

To investigate whether the gene expression profiles in our samples may help pinpoint important molecular biomarkers of metformin action, we analyzed discriminating genes identified in the metformin-treated against diet-controlled analysis that were also selected in the five-year survival analysis. In total, 24 genes were commonly identified. For each gene, we assessed the differential gene expression patterns in metformin-treated samples, diet-controlled samples, and non-diabetic samples based on five-year survival status and performed corresponding two-tailed independent sample t-tests to pinpoint patterns that were statistically significant at the five percent level. Five unique expression patterns were identified and are summarized in Figure 7.

63

Figure 7. Overview of the Workflow to Analyze Commonly Identified Discriminating Genes. Discriminating genes were identified using a supervised machine learning approach. Next, statistical analyses were performed to investigate expression patterns in discriminating genes common to the two separate analyses. This resulted in the identification of five unique differential gene expression patterns.

64

3.4.1 Expression of TMEM132A & SCNN1A in Metformin-Treated Samples Showed the

Same Overall Trend in Non-Diabetic Samples Alive at Five Years

Following the evaluation of gene expression patterns in the discriminatory features that were identified in the diabetic treatment analysis and five-year survival analysis, two genes were pinpointed for displaying an expression pattern that suggested they may be implicated in metformin’s protective effects in CRC. In particular, the expression levels of TMEM132A and

SCNN1A in metformin-treated samples were comparable to the expression levels seen in non- diabetic samples that were alive after five years, while relative expression levels for these genes moved in the opposite direction in samples taken from diet-controlled diabetic patients and non- diabetic patients that were deceased five years following their diagnosis (Figure 8A).

The results of the statistical analysis showed that expression of TMEM132A in metformin- treated samples differed from diet-controlled samples (p = 0.053), approaching statistical significance at the five percent level. Inspection of the two group means indicates that expression levels of this gene in metformin-treated samples (M = 6.49) were lower than expression in diet- controlled samples (M = 6.74). Expression levels in TMEM132A also differed between metformin- treated samples and non-diabetic samples that were deceased five years following their diagnosis

(p = 0.065) and trended towards statistical significance. The mean expression level of this gene in metformin-treated samples (M = 6.49) was found to be lower than that of the non-diabetic samples deceased after five years (M = 6.77). Furthermore, expression in the diet-controlled samples (M =

6.74) was found to be higher than the expression in non-diabetic samples alive at five years (M =

6.46), nearing statistical significance (p = 0.057). Finally, expression levels also differed in non- diabetic samples based on five-year survival status (p = 0.053), with those that were alive at five years displaying lower expression (M = 6.46) than those that were deceased at five years (M =

65

A

tTx

tTx

MetTx

Die

MetTx

Die

Y Alive Y

Y Alive Y -

-

5

5

Y Deceased Y

- Deceased Y -

5 5

B

tTx

tTx

tTx

T2D

T2D

T2D

-

-

-

MetTx

MetTx Die

MetTx

Die

Die

Y Alive Y

Y Alive Y

Y Alive Y

-

-

-

Non

Deceased

Non

Non

5

5

5

Y Deceased Y

Y Deceased Y

Y Y

- -

-

5

5 5

Figure 8. Box Plots of TMEM132A & SCNN1A, as well as sFas, FN1, & SPG20 Gene Expression. (A) From left to right, gene expression was plotted as box plots in metformin-treated, diet- controlled, five-year alive and five-year deceased sample subsets within the data set. (B) Box plots are shown in the same categories that were previously described but also includes all non-diabetic samples. In general, the end of the bottom whisker represents the minimum expression level seen in the corresponding sample group (excluding outliers). This extends to the bottom end of the box which represents the lower quartile. The line dividing the box in two represents the median, and the top of the box is the upper quartile. The top whisker represents the highest value (excluding outliers). The X on the box designates the mean of the data. Prior to generating these graphs, the gene expression data was log-transformed to approximate a normal distribution, to satisfy the assumptions for statistical analyses.

66

6.77). Moreover, metformin-treated samples did not differ from non-diabetic samples alive after five years, and diet-controlled samples did not differ from non-diabetic samples that were deceased after five years. These results confirm an expression pattern for TMEM132A whereby metformin- treated samples and non-diabetic samples alive at five years display comparable expression levels that are opposite to that of diet-controlled samples and non-diabetic samples deceased at five years.

A summary of these statistical test results can be seen in Table 7.

Expression levels for SCNN1A followed the same pattern, indicating that expression in metformin-treated samples followed the same overall trend as seen in non-diabetics that were alive at five years, and occurred opposite that of diet-controlled samples and non-diabetic samples that were deceased at five years (Figure 8A). However, expression of SCNN1A was found to be upregulated in metformin-treated samples and non-diabetic samples with better overall survival, relative to diet-controlled samples and non-diabetics with poorer overall survival. The results of the statistical analysis showed that expression of SCNN1A was higher in metformin-treated samples (M = 10.22) compared to diet-controlled samples (M = 9.81), and this was statistically significant (p = 0.044). Expression of this gene also differed between metformin-treated samples

(M = 10.22) and non-diabetic samples that were deceased at five years (M = 9.79). This result was not statistically significant at the five percent level, however, it was trending towards significance

(p = 0.101). Lastly, expression levels also differed based on five-year survival status in non- diabetic samples (p = 0.045), with significantly higher expression in samples with better survival

(M = 10.44) compared to samples with poorer survival times (M = 9.81). No significant differences in expression were seen between metformin-treated samples and non-diabetics alive at five years, as well as between diet-controlled samples and non-diabetics that were deceased at five years. This expression profile suggests that SCNN1A may be coordinately regulated in response to metformin

67

Table 7. Comparison of TMEM132A & SCNN1A Expression in Metformin-Treated, Diet- Controlled, Non-Diabetic Five-Year Alive & Non-Diabetic Five-Year Deceased Samples. Two-tailed independent sample t-tests were performed following log transformation of the expression data to satisfy the assumption of normality for this statistical test. M = mean, SD = standard deviation, t = test statistic, df = degrees of freedom, p = probability of the difference occurring by random chance, g = Hedge’s g effect size.

Variable M SD t df p g TMEM132A -1.95 125 0.053 0.349 MetTx 6.49 0.661 DietTx 6.74 0.767

TMEM132A 0.252 118 0.801 0.047 MetTx 6.49 0.661 5-Y Alive 6.46 0.692

TMEM132A -1.86 98 0.065 0.415 MetTx 6.49 0.661 5-Y Deceased 6.77 0.757

TMEM132A 1.93 101 0.057 0.381 DietTx 6.74 0.767 5-Year Alive 6.46 0.692

TMEM132A -0.29 82 0.772 0.067 DietTx 6.74 0.767 5-Year Deceased 6.79 0.746

TMEM132A -1.97 75 0.053 0.463 5-Year Alive 6.46 0.692 5-Year Deceased 6.79 0.746

SCNN1A 2.04 125 0.044 0.365

MetTx 10.22 1.09 DietTx 9.81 1.15

SCNN1A -1.03 118 0.307 0.191 MetTx 10.22 1.09 5-Y Alive 10.44 1.34

SCNN1A 1.66 98 0.101 0.369 MetTx 10.22 1.09 5-Y Deceased 9.79 1.28

SCNN1A -2.58 101 0.011 0.510 DietTx 9.81 1.15 5-Year Alive 10.44 1.34

SCNN1A -0.023 82 0.981 0.005 DietTx 9.81 1.15 5-Year Deceased 9.81 1.26

SCNN1A 2.04 75 0.045 0.480 5-Year Alive 10.44 1.34 5-Year Deceased 9.81 1.26

68

treatment to confer antineoplastic effects. A full summary of the statistics can be seen in Table 7.

3.4.2 Expression of sFas, FN1, & SPG20 in Metformin-Treated Samples Showed the Same

Overall Trend in Non-Diabetic Samples

The investigation of gene expression profiles also identified three genes, sFas, FN1 and

SPG20, with an expression pattern where metformin-treated samples differed from diet-controlled samples and displayed similar mRNA counts to non-diabetic samples (Figure 8B). Given that non- diabetic CRC patients tend to have better clinical outcomes than patients with comorbid T2D, these genes may also be implicated in metformin’s CRC protective effects.

Statistical analysis (Table 8) confirmed that expression of sFas differed between metformin-treated and diet-controlled samples (p = 0.005), with metformin-treated samples (M =

6.97) exhibiting lower expression than diet-controlled samples (M = 7.38). Additionally, diet- controlled samples also differed from non-diabetic samples (p = 0.0001), with significantly higher expression of sFas found in diet-controlled samples (M = 7.38) compared to non-diabetic samples

(M = 6.83); metformin-treated samples did not differ from the non-diabetic samples. Expression of FN1 followed a similar trend, where diet-controlled samples differed significantly from non- diabetic samples (p = 0.016), with higher levels of expression being observed in the diet-controlled group (M = 12.5) relative to the non-diabetic samples (M = 11.85). Finally, SPG20 expression in diet-controlled samples was significantly higher (M = 7.80, p = 0.03) than that of the non-diabetic samples (M = 7.35). Conversely, expression of FN1 and SPG20 did not differ significantly between metformin-treated samples and diet-controlled samples. However, the differential expression of these genes in the diabetic treatment groups trended towards significance (p = 0.121, p = 0.138).

69

Table 8. Comparison of sFas, FN1 & SPG20 Expression in Metformin-Treated, Diet- Controlled, Non-Diabetic Five-Year Alive, Non-Diabetic Five-Year Deceased & All Non- Diabetic Samples. Two-tailed independent sample t-tests were performed following log transformation of the expression data to satisfy the assumption of normality for this statistical test. M = mean, SD = standard deviation, t = test statistic, df = degrees of freedom, p = probability of the difference occurring by random chance, g = Hedge’s g effect size.

Variable M SD t df p g sFas -2.89 125 0.005 0.518 MetTx 6.97 0.734 DietTx 7.38 0.865

sFas 0.263 118 0.793 0.0489 MetTx 6.97 0.734 5-Y Alive 6.94 0.818

sFas 1.92 98 0.057 0.428 MetTx 6.97 0.734 5-Y Deceased 6.65 0.815

sFas 2.69 101 0.008 0.531 DietTx 7.38 0.865 5-Year Alive 6.94 0.818

sFas 3.69 82 0.0001 0.846 DietTx 7.38 0.865 5-Year Deceased 6.67 0.807

sFas 1.40 75 0.167 0.328 5-Year Alive 6.94 0.818 5-Year Deceased 6.67 0.807

sFas 1.08 148 0.281 0.177 MetTx 6.97 0.734 Non-T2D 6.83 0.819

sFas 3.71 131 0.0001 0.654 DietTx 7.38 0.865 Non-T2D 6.83 0.819

FN1 -1.56 125 0.121 0.280 MetTx 12.05 1.57 DietTx 12.50 1.60

FN1 0.869 118 0.387 0.162 MetTx 12.05 1.57 5-Y Alive 11.81 1.40

FN1 0.356 98 0.722 0.0794 MetTx 12.05 1.57 5-Y Deceased 11.93 1.56

FN1 2.29 101 0.024 0.453 DietTx 12.50 1.60 5-Year Alive 11.81 1.40

70

FN1 1.62 82 0.108 0.373 DietTx 12.50 1.60 5-Year Deceased 11.91 1.54

FN1 -0.282 75 0.779 0.0663 5-Year Alive 11.81 1.40 5-Year Deceased 11.91 1.54

FN1 0.839 148 0.403 0.138 MetTx 12.05 1.57 Non-T2D 11.85 1.45

FN1 2.43 131 0.016 0.429 DietTx 12.50 1.60 Non-T2D 11.85 1.45

SPG20 -1.49 125 0.138 0.267 MetTx 7.51 0.956 DietTx 7.80 1.19

SPG20 0.357 118 0.722 0.0665 MetTx 7.51 0.956 5-Y Alive 7.44 1.16

SPG20 1.35 98 0.18 0.301 MetTx 7.51 0.956 5-Y Deceased 7.20 1.06

SPG20 1.52 101 0.131 0.300 DietTx 7.80 1.19 5-Year Alive 7.44 1.16

SPG20 2.29 82 0.025 0.525 DietTx 7.80 1.19 5-Year Deceased 7.20 1.05

SPG20 0.933 75 0.354 0.219 5-Year Alive 7.44 1.16 5-Year Deceased 7.20 1.05

SPG20 0.946 148 0.346 0.155 MetTx 7.51 0.956 Non-T2D 7.35 1.12

SPG20 2.20 131 0.03 0.388 DietTx 7.80 1.19 Non-T2D 7.35 1.12

71

3.4.3 Expression of SORL1 & HNF4A Differentiate Between Metformin-Treated & Diet-

Controlled Samples

The differential gene expression analysis revealed two genes whose expression differed between metformin-treated samples and diet-controlled samples (Figure 9A). Expression of

SORL1 was higher in metformin-treated samples (M = 9.83) compared to diet-controlled samples

(M = 9.46), a difference that was statistically significant (p = 0.026). HNF4A also differentiated between metformin-treated samples and diet-controlled samples (p = 0.012), with significantly greater expression levels being found in metformin-treated samples (M = 10.12) relative to diet- controlled samples (M = 9.62). Further investigation revealed that there were no significant differences in metformin-treated samples and non-diabetic samples alive at five years, nor between metformin-treated samples and non-diabetic samples deceased five years following CRC diagnosis. Expression in diet-controlled samples relative to the non-diabetic survival groups also did not differ. Furthermore, there were no significant differences between the non-diabetic samples based on their five-year survival status, indicating that while these genes may be implicated in metformin action, they do not appear to be related to clinical outcomes, based on our data.

3.4.4 Expression of FOXC2, OCT4 & AKT1 Differentiate Between Non-Diabetic Samples

Based on Five-Year Survival Status

The analysis of common discriminating features pinpointed three genes that exhibited differences in expression in non-diabetic samples based on five-year survival status (Figure 9B).

FOXC2 was expressed at a significantly lower level in non-diabetic samples that did not survive for five years (M = 4.61, p = 0.001), compared to non-diabetic samples that did survive (M = 5.63).

Expression of OCT4 also differentiated between non-diabetic samples based on their five-year 72

A

tTx

tTx

MetTx Die

MetTx

Die

Y Alive Y

Y Alive Y

-

-

5

5

Y Deceased Y

Y Deceased Y -

-

5 5

B

tTx

tTx tTx

MetTx

MetTx

MetTx

Die

Die Die

Y Alive Y Alive Y

Y Alive Y

- -

-

Deceased

5 5

5

Y Deceased Y

Y Deceased Y Y

- -

-

5

5 5

C

tTx tTx tTx

T2D T2D T2D

- - -

MetTx MetTx

MetTx

Die Die Die

Y Alive Y Alive Y Alive Y

- - -

Non Non Non

5 5 5

Y Deceased Y Deceased Y Deceased Y

- -

-

5 5 5

Figure 9. Box Plots of SORL1 & HNF4A; FOXC2, OCT4 & AKT1; TGFBR1, JAG1 & HGF Gene Expression. These plots highlight three distinct expression patterns. (A) Genes differentiating between metformin-treatment & diet-control. (B) Genes differentiating between five-year survival status. (C) Genes differentiating based on diabetic status. 73

survival status, with significantly lower levels found in the five-year deceased group (M = 7.84, p

= 0.044) relative to the five-year alive group (M = 8.29). Lastly, AKT1 also displayed significantly lower expression levels in non-diabetic samples with poorer clinical outcomes (M = 8.82, p =

0.011) compared to non-diabetic samples with greater survival times (M = 9.16). There were no significant differences in the expression of these genes between metformin-treated and diet- controlled samples, suggesting that while the expression of these genes may be related to clinical outcomes, they do not appear to be related to metformin action based on our data.

3.4.5 Expression of TGFBR1, JAG1 & HGF Differentiate Based on Diabetic Status

In our investigation of gene expression patterns in the discriminatory features common to the diabetic treatment analysis and five-year survival analysis, we identified three genes whose expression profiles differentiated between metformin-treated samples and non-diabetic samples, as well as diet-controlled samples and nondiabetic samples (Figure 9C). Expression of TGFBR1 was found to be significantly higher in metformin-treated samples (M = 9.50, p = 0.001) and diet- controlled samples (M = 9.69, p = 0.0001) compared to non-diabetic samples (M = 9.31). A similar trend was seen for JAG1, where expression in metformin-treated samples (M = 8.98, p = 0.04) and diet-controlled samples (M = 9.01, p = 0.052) tended to be greater than that of non-diabetic samples

(M = 8.71). Lastly, HGF expression profiles also differentiated between these groups of interest, with higher levels of expression being found in metformin-treated samples (M = 5.83, p = 0.001) and diet-controlled samples (M = 5.66, p = 0.038) relative to non-diabetic samples (M = 5.18). No significant differences in expression of these genes were seen between metformin-treated and diet- controlled samples, as well as between non-diabetic samples based on five-year survival status, thus indicating that these genes may be biomarkers of diabetic status.

74

Chapter 4

Discussion

4.1 Colorectal Cancer: A Lack of Effective Treatments Driving High Mortality

CRC is a highly prevalent disease, causing substantial mortality worldwide and placing significant economic strains on healthcare systems with limited resources to support patient care

(206). Moreover, the global burden of this devastating disease is projected to increase considerably by 2030, and the incidence of early-onset disease is rising at an alarming rate, adding to this already significant public health problem. This growing burden of disease coincides with the rapid growth in the prevalence of T2D, a serious metabolic disorder and a well-known risk factor for the development and progression of CRC, further exacerbating this problem (53). Current efforts to alleviate this societal burden include screening programs to increase the rates of early detection and consequently maximize the opportunity for curative treatment, which is typically achieved through the surgical resection of the tumour. Nonetheless, these programs have been largely unsuccessful as the vast majority of patients are diagnosed after their disease has progressed, when the therapeutic potential of currently available treatments is undeniably inadequate. These clinical realities underscore the urgent need to develop effective therapies for these patients.

Despite extensive research, the molecular underpinnings of this deadly disease remain poorly understood. Developing an understanding of these features is a crucial component needed to improve patient care and consequently, overall survival. More recently, epidemiological trends indicate that metformin-treated diabetic CRC patients tend to have better clinical outcomes than diabetic CRC patients not exposed to metformin. This scientific breakthrough highlights a unique opportunity to investigate the molecular mechanisms whereby metformin confers its protective

75

effects in CRC patients, which may help elucidate novel targets for the design of more effective therapies.

4.2 Three Biologically Related Genes Identified as Potential Biomarkers of Metformin

Action

In this study, we used a targeted panel of 151 metformin-related genes and NanoString nCounter® technology to quantify gene expression in our data set. Next, we applied a machine learning approach to investigate gene expression profiles discriminating between metformin- treated and diet-controlled diabetic CRC samples, as well as those discriminating between non- diabetic CRC samples based on their corresponding five-year survival status. Finally, we evaluated differential expression patterns in the identified discriminatory genes to help pinpoint key biomarkers of metformin action, that may provide insight into its antineoplastic effects in CRC.

Our results showed that three biologically related genes were identified as discriminatory features in our analysis of metformin-treated and diet-controlled samples, as well as our analysis of five-year survival in non-diabetic samples. In particular, TMEM132A and FN1 were downregulated in samples from metformin-treated patients relative to samples from diet-controlled patients, and SCNN1A was upregulated in samples from metformin-treated patients compared to samples from diet-controlled patients. Furthermore, expression levels of TMEM132A and SCNN1A in metformin-treated samples showed the same overall trend that was seen in non-diabetic samples from patients that were alive five years following their diagnosis and were opposite to that seen in non-diabetic samples from patients deceased at five years, thereby substantiating the evidence to suggest that these genes may be coordinately regulated in response to metformin treatment. While the expression of FN1 did not differ based on five-year survival status in the non-diabetic sample

76

subset, expression levels in metformin-treated patient samples were similar to the levels present in non-diabetic patient samples. Given that non-diabetic CRC patients tend to have better clinical outcomes than their diabetic counterparts, this expression profile also points to FN1 as a gene that may be associated with metformin’s protective action. These three genes are functionally related, in that they all have been implicated in the biological phenomenon denoted the unfolded protein response (UPR). Moreover, their expression patterns in our data set suggest that metformin treatment may lead to the downregulation of the UPR to promote its antineoplastic effects in CRC, thus improving patient outcomes.

4.2.1 Downregulation of the Unfolded Protein Response May Protect Against CRC

The UPR is an adaptive response to endoplasmic reticulum (ER) stress, induced by the accumulation of misfolded or unfolded proteins in the ER; the activation of this pathway initiates intracellular signalling cascades that promote cell survival by alleviating ER stress (207). This evolutionarily conserved pathway has been identified as a potential driver in the pathogenesis of several human diseases, including T2D and cancer, whereby upregulation of this pathway leads to insulin resistance, enhanced cellular proliferation, and survival (208–214). Hyperactivity of this response is reported to coincide with elevated levels of GRP78, a protein that is functionally associated with the protein product of TMEM132A (215,216). While there is a paucity of research investigating the effects of metformin on TMEM132A, a recent study reported that metformin- treatment suppresses GRP78 to enhance apoptosis in multiple myeloma, suggesting that this may be a potential mechanism by which metformin exerts its antineoplastic effects (217). SCNN1A is another gene that may be implicated in the regulation of this adaptive response, by interacting with

GRP78 to promote its degradation. While the functional role of this gene in the UPR has yet to be

77

investigated, increased expression of the homologous protein SCNN1B is a good prognostic factor in gastric cancer; at the molecular level, this protein was reported to initiate polyubiquitination of

GRP78 to induce its proteasomal degradation, thereby downregulating the UPR to suppress malignant cell growth and metastasis (218). Our results showed that TMEM132A was suppressed and SCNN1A was upregulated in metformin-treated samples. Thus, metformin may downregulate the UPR, by regulating these genes, thereby inhibiting cellular proliferation and evasion of apoptosis to promote its CRC protective effects. Furthermore, our results also showed that the oncogene FN1, which is transcriptionally upregulated downstream of ER stress sensor signalling, was expressed at lower levels in metformin-treated samples compared to diet-controlled samples, providing additional support for metformin’s potential role in downregulating the UPR.

4.2.2 GRP78: A Key Player in the UPR to Restore Proteostasis & Sustain Cell Survival

TMEM132A encodes a protein that binds to the molecular chaperone GRP78, which is localized in the endoplasmic reticulum (ER) under normal physiological conditions (219). While the functions of GRP78-binding protein have yet to be characterized, the role of GRP78 has been extensively studied. Nonetheless, preliminary studies report that GRP78-binding protein may have similar functions to GRP78 (236–240). GRP78 is essential for the maintenance of a normally functioning ER, an organelle with crucial roles in the synthesis, modification, and processing of proteins (220). In addition, GRP78 is known to be a master regulator of the UPR through its interactions with the three ER transmembrane stress sensors, PERK, IRE1, and ATF6 (221).

Under conditions of ER homeostasis, GRP78 is bound to these sensors, keeping them in their inactive conformation (Figure 10A). As the number of unfolded or misfolded proteins increases in the ER, stress is induced (Figure 10B) (221). In an attempt to restore homeostasis, GRP78

78

Figure 10. Overview of the Unfolded Protein Response to Restore Endoplasmic Reticulum Homeostasis. (A) In a state of ER homeostasis, proteins in the ER are correctly folded and are translated at a normal rate; consequently, GRP78 is bound to the three transmembrane ER stress sensors, ATF6, PERK, and IRE1, keeping them in their inactive conformation. (B) The accumulation of misfolded or unfolded proteins in the ER, which can occur due to pathological conditions such as cancer and T2D, triggers GRP78 dissociation from the ER stress sensors, to facilitate protein folding. This results in the activation of the three ER stress sensors, triggering downstream signalling to regulate transcription. ATF6 translocates to the Golgi apparatus where it is proteolytically cleaved to form the active transcription factor ATF6f. ATF6f then translocates to the nucleus where it acts as a transcription factor to upregulate UPR genes to restore proteostasis. At the same time, PERK dimerizes to become activated, triggering the phosphorylation of eukaryotic initiation factor 2 (eIF2) to inhibit global protein translation thus preventing the accumulation of additional proteins in the ER; increased protein translation of ATF4 is simultaneously favoured. ATF4 also acts as a transcription factor to increase the transcription of UPR genes. Finally, IRE1 becomes activated via oligomerization and downregulates protein synthesis by selectively degrading mRNA through its endoribonuclease activity, in a process denoted the RIDD pathway. It also increases the production of the XBP1s transcription factor, to induce UPR gene transcription, via the alternative splicing of its pre-mRNA. Taken together, these pathways coordinate an adaptive response to ER stress and attempt to restore homeostasis to sustain cell survival.

79

dissociates from the ER stress sensors and binds to the unfolded polypeptides, thereby expanding the protein folding capacity of the cell. This coincides with the activation of the ER stress sensors, which initiate distinct intracellular signalling pathways with pleiotropic effects and transcriptomic remodelling, collectively termed the UPR (222,223). The primary goal of this adaptive response is to overcome ER stress to sustain cell survival (215,224).

Following its release from GRP78, PERK facilitates cellular adaptation to ER stress through multiple mechanisms, including the global suppression of protein translation to inhibit the accumulation of misfolded proteins by phosphorylating the alpha subunit of eukaryotic initiation factor 2 (eIF2) (225,226). Furthermore, the phosphorylation of eIF2 selectively increases the translation of ATF4, a transcription factor that upregulates chaperone proteins, including GRP78, as well as pro-survival genes, to facilitate protein folding and prevent cell death, respectively (225).

The activated isoform of IRE1 also works to restore proteostasis via its endoribonuclease activity, which selectively degrades mRNA to downregulate protein synthesis, in a process denoted the RIDD pathway (227,228). In addition, it activates the XBP1s transcription factor through alternative splicing, which enhances transcription of chaperone proteins to increase protein folding, as well as the transcription of ER membrane biosynthesis proteins to accommodate the heightened level of protein molecules present (229–233). Furthermore, it upregulates the transcription of proteins involved in the endoplasmic reticulum-associated protein degradation pathway, to degrade misfolded proteins (229–231). Remarkably, it has also been reported that activation of IRE1 may promote tumour metastasis in CRC through XBP1s-dependent transcription of the oncogene FN1 (234).

The third main pathway in the UPR occurs through the activation of ATF6. This arm of the UPR is reported to increase transcription of GRP78, as well as other chaperone proteins such

80

as GRP94 and PDI, to further escalate the cell’s protein folding capability (235). It is also reported to increase the transcription of XBP1 mRNA, thereby simultaneously enhancing the IRE1 pathway of the UPR, to further promote this pro-survival adaptive response (230).

The initiation of signalling through the three branches of the UPR comprises an aggressive attempt to promote cell survival and evade apoptosis by regulating gene expression to alleviate ER stress and restore protein homeostasis. While the role of GRP78 in the UPR has been thoroughly studied, the function of TMEM132A in this adaptive response remains poorly understood.

However, it is reported that expression of this gene may promote cell survival through the regulation of stress-response genes, suggesting that GRP78 and the protein product of TMEM132A may function together in the UPR (236–240). Our results indicate that the transcription of

TMEM132A, a potential regulator of the UPR, and FN1, an oncogene and downstream target of this adaptive response, are suppressed in metformin-treated patient samples, indicating that this antidiabetic drug may be associated with the downregulation of the UPR.

4.2.3 Downregulation of GRP78 Leads to Persistent Stress & the Initiation Apoptosis

If the UPR is unable to restore normal ER function, the pro-survival pathways shut down, and instead apoptotic cell death is initiated (241). Given the critical role of GRP78 in this adaptive response to restore ER homeostasis, it has been reported that inhibition of this protein may lead to the disruption of the UPR via the reduced capacity to increase protein folding, driving chronic ER stress, and consequently the activation of the pro-apoptotic pathway to bestow cancer-protective effects (213,242). Furthermore, suppression of TMEM132A has also been reported to increase cellular susceptibility to apoptosis, suggesting this gene may also contribute to this effect (238).

81

The mechanism of GRP78 downregulation to promote apoptosis is poorly understood.

However, a recent study reported that SCNN1B displayed tumour suppressor functions in vitro and in vivo in gastric cancer by interacting with the ER chaperone to promote its proteasomal degradation (218). Upregulation of SCNN1B, and the coinciding decrease in GRP78, was associated with the inhibition of cellular proliferation and migration, as well as the initiation of programmed cell death (218). SCNN1B is a gene that encodes the -subunit of the epithelial sodium channel. This protein complex is a non-voltage gated channel comprised of three subunits, denoted alpha, beta, and gamma, and plays a key role in the maintenance of fluid and ion balance across several tissues (243). SCNN1A encodes a homologous protein, the -subunit of the epithelial sodium channel, which displays sequence similarity to SCNN1B and is part of the same protein complex, suggesting they may exhibit similar functions (244). Studies have also reported the hypermethylation of the SCNN1A promoter, and subsequent silencing of expression, in neuroblastoma and breast cancer with poor prognoses (245,246). To our knowledge, the role of

SCNN1A in the regulation of the UPR has not yet been investigated. However, our results indicate that SCNN1A was upregulated in metformin-treated samples relative to diet-controlled patient samples and non-diabetic samples from patients that were deceased at five years, and corresponded to expression levels seen in non-diabetic samples from patients that were alive five years following their CRC diagnosis. Moreover, the expression pattern of SCNN1A seen in these four groups of interest was opposite that of TMEM132A, suggesting that these genes may be interacting to downregulate the UPR in metformin-treated samples, leading to prolonged ER stress, and the initiation of apoptosis to protect against CRC. Given the reported functions of SCNN1B in the literature, in conjunction with our findings of SCNN1A’s expression pattern, further investigation

82

of this gene’s role in the regulation of the UPR, as well as its potential interaction with metformin, is warranted.

4.2.4 Upregulation of the Unfolded Protein Response in Cancer Pathology & Type 2 Diabetes

The saturation of the ER’s protein folding capacity, and consequently ER stress, is a common event in cancer pathology. Malignant cells are predisposed to heightened rates of protein misfolding due to the upregulation of protein translation to sustain uncontrolled cellular proliferation, the relative nutrient deprivation secondary to hypoxic tumour microenvironments coupled with increased metabolic demands, high mutational burdens, and the deregulation of normal cellular processes, thereby triggering chronic ER stress (247). Consequently, to facilitate cancer development and progression, malignant cells constitutively activate the UPR in an attempt to restore normal ER functioning to evade apoptosis (248).

The activation of UPR signalling also has an integral role in the pathology of T2D, whereby elevated levels of circulating glucose and insulin trigger ER stress, leading to the upregulation of

UPR signalling and GRP78 expression (249,250). Constitutive activation of the UPR is also reported to potentiate insulin resistance, a key feature of T2D, via the suppression of GLUT4 expression and subsequent inhibition of insulin-induced glucose uptake into cells (251). Other studies report that ER stress and UPR activation impairs the delivery of insulin receptors to the plasma membrane, highlighting another mechanism where abnormal ER function is implicated in insulin resistance (252). Increased transcription of UPR target genes has also been reported in

T2D, substantiating the evidence for UPR activity in the pathogenesis of T2D. Namely, FN1, a known oncogene and downstream target of the ER stress sensor IRE1, was reported to be upregulated with more than a 40-fold change in expression in the endothelial cells of T2D and

83

prediabetic patients, relative to controls (253). Thus, the amplification of the UPR and subsequent

GRP78 and FN1 expression constitutes a common molecular event that not only links these two distinct pathologies but also highlights a potential oncogenic pathway that may be synergistically activated in CRC patients with comorbid T2D, contributing to poor clinical outcomes.

4.2.5 Upregulation of the UPR Drives Aggressive Malignancies & Poor Patient Outcomes

Pathological upregulation of the UPR survival response is thought to be a key oncogenic adaptation, contributing to transcriptomic remodelling in many cancers including CRC, which facilitates the development of aggressive tumours to drive poor patient outcomes (254–260).

Furthermore, upregulation of GRP78, the master regulator of the UPR, has been well-documented in the literature in several malignancies, suggesting that this may be a critical mediator of cancer pathogenesis (261–266). In addition to its role in sustaining cell survival via the UPR, GRP78 has been implicated in several other malignant processes, including angiogenesis, tumour metastasis, therapeutic resistance, increased cell stemness, as well as the induction of proinflammatory cytokine signalling (267–271). More recently, evidence has been accumulating to suggest that the loss of protein homeostasis in malignant pathology may also induce the expression and translocation of GRP78 to subcellular locations outside of the ER, where it can interact with signalling molecules to escalate several oncogenic signalling pathways (272).

Aberrant expression of GRP78 on the surface of malignant cells, coupled with a secreted isoform of GRP78 into the tumour microenvironment, is thought to be an additional mechanism that malignant cells exploit to promote oncogenic signalling. In vitro studies have reported this phenomenon in multiple CRC cell lines, indicating that GRP78 was not only pathologically expressed on the surface of cells, but was also secreted as a soluble isoform (273). These

84

researchers reported that the soluble isoform of GRP78 acted in an autocrine manner, binding to cell surface GRP78 to mediate pro-survival and proliferative signalling (273). A key pathway that was found to be upregulated subsequent to the cell surface GRP78-GRP78 interaction was

PI3K/AKT signalling (273). The PI3K/AKT pathway is well-documented in the literature as an important oncogenic signalling process, especially in CRC, with an estimated 60 to 70 percent of all cases displaying its aberrant activity (274,275). Moreover, another study reported that there was an increased localization of GRP78 at the cell surface in diabetic mice, which also triggered the activation of PI3K/AKT signalling (276). The results of these studies suggest that this mechanism of PI3K/AKT oncogenic signalling may occur synergistically in CRC patients with comorbid T2D. Furthermore, the activation of the PI3K/AKT pathway was also reported to enhance oncogenic WNT/β-catenin signalling, another pathway whose abnormal activity is highly prevalent in CRC (277). This is thought to occur through the phosphorylation and inactivation of

GSK-3β by AKT, which promotes the stabilization and translocation of β-catenin to the nucleus to initiate transcription of its target genes, consequently facilitating the development of an invasive and metastatic phenotype (273,278). Moreover, a recent study revealed that TMEM132A may play a critical role in oncogenic WNT signalling, through the stabilization of the WNT protein to enhance its signalling capacity, as significantly lower levels of the WNT protein were found in

TMEM132A knockout cell lines compared to wildtype cells (279). The investigators reported that the loss of TMEM132A resulted in the destabilization of β-catenin, and the downregulation of target genes in this pathway. Thus, TMEM132A may act to promote oncogenic signalling pathways in the UPR through its interactions with GRP78, as well as by enhancing the WNT pathway.

FN1, an oncogene that is transcriptionally upregulated in response to UPR activation, is also reported to contribute to disease progression and poor patient outcomes. The protein encoded

85

by this gene, fibronectin, is a glycoprotein expressed as a dimer or multimer on the surface of cells, and is implicated in cellular adhesion and migration during embryogenesis, the coagulation of blood, wound healing, immune responses, as well as cellular proliferation and tumour metastasis in cancer pathology (280). Elevated expression of this gene has been reported in colorectal tumour tissue relative to levels in normal colonic tissue (281). Moreover, the researchers in this study reported that expression of FN1 correlated with several important clinicopathological features, including age, lymphovascular invasion, as well as overall survival (281). Further analysis revealed that over-expression of this gene inhibited apoptosis, and enhanced cellular invasive and metastatic potential, driving poor patient outcomes (281). Another study evaluated the role of FN1 in metastatic melanoma and reported that FN1 was upregulated in metastatic cells displaying a mesenchymal phenotype when compared to primary tumour cells (282). Moreover, suppression of

FN1, mediated by siRNA, resulted in the loss of cellular invasive and migratory capacity and coincided with the induction of apoptosis (282). Following that, a KEGG analysis of the protein- protein interaction network indicated that FN1 was closely linked to PI3K/AKT signalling, suggesting that it may promote cell survival, proliferation, and metastasis via the upregulation of this oncogenic pathway (282). Another recent study in human glioma cells reported similar results, whereby FN1 prevented apoptosis and promoted disease progression and invasion through the amplification of PI3K/AKT signalling (283). The results of these studies provide convincing evidence for the pathologic role of FN1 in cancer progression, potentially through the enhancement of PI3K/AKT signalling, which may be exacerbated due to UPR hyperactivity.

Taken together, the upregulation of the UPR is thought to be a central feature of CRC pathology, driving disease progression and poor clinical outcomes through multiple molecular mechanisms. Namely, the upregulation of the UPR drives increases in the expression of GRP78,

86

which is thought to promote cellular proliferation and invasive capacity by activating the

PI3K/AKT and WNT/β-catenin signalling pathways. Moreover, GRP78 plays a critical role in the restoration of ER homeostasis, thereby preventing apoptosis. UPR activation also enhances FN1 transcription, further driving disease progression and cancer invasion by inhibiting apoptosis and promoting EMT. The results from our analysis indicated that TMEM132A and FN1 were downregulated in metformin-treated patient samples, suggesting that transcription of TMEM132A may be suppressed in response to metformin treatment to downregulate the UPR and its oncogenic activity, thereby conferring antineoplastic effects in CRC.

4.2.6 Metformin May Downregulate sFas to Synergistically Enhance Apoptosis

Our differential gene expression analysis revealed that sFas may also be an important biomarker of metformin action. sFas is the mRNA that encodes the soluble isoform of the cell surface death receptor, which plays a key role in the initiation of apoptosis (284). Production of this protein occurs via the alternative splicing of Fas pre-mRNA, to remove exon 6 which encodes the transmembrane domain of the protein, resulting in the soluble isoform (285). This protein is reported to inhibit programmed cell death by binding and sequestering the Fas-ligand, thereby suppressing Fas-mediated apoptosis (286). The loss of Fas function and/or expression has been associated with sustained tumour growth in cancer pathology, as well as resistance to immunotherapies that target the Fas receptor to promote apoptosis in malignant cell populations

(287). Thus, enhanced expression of the soluble isoform, sFas, is associated with an antiapoptotic, tumourigenic, and therapy-resistant effect (286). A recent study reported that metformin may regulate the alternative splicing of Fas pre-mRNA, shifting its production from the soluble antiapoptotic form to the membrane-bound proapoptotic form (288). The results of our statistical

87

analyses fit with the current literature, showing that expression levels of sFas were significantly lower in metformin-treated patient samples relative to diet-controlled patient samples, and were similar to expression levels seen in non-diabetic samples. Given, the antiapoptotic function of sFas documented in the literature, coupled with its gene expression profile in this data set, our results suggest that downregulation of sFas production to enhance apoptosis may be an additional mechanism of metformin’s CRC-protective action.

SPG20 was another gene that displayed a similar expression profile to that of sFas and

FN1, whereby suppression of SPG20 transcription was seen in metformin-treated samples relative to diet-controlled samples, and expression levels in metformin-treated samples showed the same overall trend that was seen in the non-diabetic samples. This gene encodes a protein that has roles in endosomal trafficking and microtubule dynamics, as well as the degradation and intracellular transport of the epidermal growth factor receptor (289). Given its function in the degradation of the epidermal growth factor receptor, downregulation of this gene is linked to tumourigenesis in several malignancies, including CRC (290). Furthermore, this is reported to occur through the hypermethylation of the SPG20 promoter, leading to its decreased transcription (290). This molecular signature has been reported to be present in 89 percent of colorectal malignant tissue, in contrast to only one percent of normal mucosa samples (291). This expression profile suggests a cancer-promoting effect in the metformin-treated samples, as well as the non-diabetic samples, compared to the diet-controlled group, which contradicts the current literature that indicates a cancer-protective effect in metformin-treated patients, as well as non-diabetic patients.

Nonetheless, the statistical analysis revealed that the differential expression profile for SPG20 was the least significant among the genes in this category. Thus, further investigation is needed to

88

determine whether the relationship in our data reflects true biological differences, and if so, what this means with respect to metformin-treatment; or whether this finding is an artifact.

4.2.7 sFas in Malignant Pathology & Type 2 Diabetes

It is hypothesized that the conversion of the membrane-bound isoform Fas, to the soluble protein sFas, may represent an early event in tumourigenesis, whereby premalignant cells gain the capacity to evade apoptosis. One study investigating the clinical utility of soluble Fas as a diagnostic tool in gastric and intestinal adenocarcinoma reported that newly diagnosed cancer patients displayed significantly higher serum levels of sFas than controls; elevated levels of this circulating biomarker was also present in patients with pre-neoplastic lesions, suggesting this may be an early alteration that promotes tumour development and progression (292). The cellular microenvironment is proposed to play a role in the regulation of the alternative splicing of Fas mRNA precursors. In particular, studies report that exposure to hypoxia alters alternative splicing to upregulate the production of sFas (285). Malignant pathologies are characterized by uncontrolled cellular growth and proliferation, consequently, virtually all solid tumours experience inadequate blood supplies, creating a hypoxic microenvironment that may favour the production of the soluble Fas isoform (285,293,294). Furthermore, hyperglycemia in T2D is also associated with insufficient tissue oxygen levels, suggesting that the co-occurrence of malignant disease and

T2D may exacerbate hypoxic cellular microenvironments, driving the production of sFas to promote cancer onset and progression through the evasion of apoptosis (295).

Moreover, the regulation of Fas-mediated programmed cell death has been linked to

PI3K/AKT signalling, a key oncogenic pathway involved in CRC onset and disease progression, that has also been associated with aberrant GRP78 expression on the plasma membrane and FN1 89

overexpression. Specifically, the inhibition of PI3K/AKT signalling is reported to enhance cellular sensitivity to Fas-mediated apoptosis, while activation of the PI3K/AKT pathway is reported to suppress Fas-mediated apoptosis (296,297). Thus, downregulation of TMEM132A/GRP78 and

FN1 in metformin-treated CRC patients may suppress PI3K/AKT signalling, leading to a synergistic cellular apoptotic effect by enhancing Fas-dependent programmed cell death.

Taken together, upregulation of the soluble isoform of Fas may be a common molecular event in the pathology of cancer and T2D, thereby promoting tumourigenesis, disease progression, and poor clinical outcomes in diabetic CRC patients. This is thought to occur primarily through the evasion of programmed cell death, due to the binding and neutralization of the Fas-ligand.

However, suppression of apoptosis may be further enhanced by the upregulation of PI3K/AKT oncogenic signalling, which is a common occurrence in colorectal malignancies. The results from our analysis revealed that expression of sFas was significantly lower in metformin-treated patient samples compared to those that were diet-controlled, suggesting that metformin treatment may lead to the downregulation of the antiapoptotic isoform of this protein, thereby contributing to its antineoplastic effects in CRC.

4.3 Suppression of the UPR & sFas: Implications for CRC Immunotherapy

Cancer immunotherapy is a form of treatment that exploits the body’s immune system, to target malignant cells for destruction as a means of treating cancer (298). This line of treatment has been approved to treat many different types of cancer, including advanced-stage CRC (299).

Nonetheless, while immunotherapies have shown great promise in treating several human malignancies, the efficacy of these treatments is highly variable, and only a small subset of patients respond well (300). Moreover, a patient’s response to immunotherapy is highly dependent on their 90

mutational profile and tumour mutational burden (301). Thus, devising ways to improve these therapies to help a larger proportion of patients is needed.

It has been reported that UPR oncogenic signalling may have extensive immunomodulatory effects in the tumour microenvironment, thereby enabling cancer cell immune escape to promote disease progression (302,303). Conversely, prolonged ER stress, which may occur due to a lack of and/or ineffective UPR signalling, is reported to enhance immunotherapy by facilitating the production and exposure of tumour antigens on the surface of cells, thereby flagging them for immunogenic cell death (304,305). Given our results, whereby UPR adaptive genes were downregulated in metformin-treated samples, it is possible that this may lead to chronic

ER stress and the presentation of tumour antigens on the cell surface, thus enhancing immunotherapies. Furthermore, the cell surface death receptor, Fas, has also been implicated in the response to cancer immunotherapies. A recent study reported that suppression of Fas expression was associated with resistance to immune checkpoint inhibitor immunotherapy, a treatment that blocks immune checkpoints to prevent dampening of the immune response, in CRC

(306). Given our findings that sFas, the protein isoform that reduces Fas proapoptotic action, displayed lower expression levels in metformin-treated samples, there is reason to believe that this regulatory effect downstream of metformin-treatment may contribute to enhanced patient responses to immunotherapy. Taken together, downregulation of the UPR and sFas production following metformin-treatment may synergistically augment the efficacy of cancer immunotherapy, however, further investigation of these hypotheses is needed.

91

4.4 Implications of Metformin in the CRC Consensus Molecular Subtypes

The CMSs are considered to be the most robust method of classifying CRC to date (151).

Moreover, patient responses to CRC therapies vary considerably and these subtypes may facilitate the stratification of patients for personalized therapies due to their intrinsic biological differences.

Given the distinguishing features of the CRC CMSs, further investigation to determine whether the protective effects of metformin may be enhanced for a specific molecular subtype is needed to determine how this antidiabetic drug may be exploited to improve clinical outcomes. For example, downregulation of the production of sFas in favour of the pro-apoptotic isoform following metformin treatment may lead to enhanced immunogenic malignant cell death in the microsatellite instability immune subtype; inhibition of the UPR may also facilitate heightened expression of abnormal antigens on the cell surface, flagging cells for destruction. Conversely, the extent of metformin’s antineoplastic effects may be marginal at best for CMS1 tumours due to their inherent immunogenic nature, while the benefit may be more prominent in the other subtypes which do not display these signatures. Thus, investigation of metformin’s CMS-specific antineoplastic action is needed to better understand how this antidiabetic drug may be exploited in CRC patients.

4.5 Three Additional Gene Expression Profiles Identified

In our search for differential gene expression patterns that may help pinpoint important biomarkers of metformin action, we identified three additional classes of genes. Namely, genes differentiating between metformin-treated samples and diet-controlled samples, genes differentiating based on five-year survival status in non-diabetic samples, as well as genes differentiating between samples based on their diabetic status. While these molecular profiles did

92

not appear to be directly related to metformin’s antineoplastic action, they are worth noting for further investigation and are described below.

4.5.1 The Role of Genes Differentiating Between Metformin-Treated & Diet-Controlled

Samples in Cancer Pathology Remain Unclear

Two genes, SORL1 and HNF4A, were found to differ significantly in their expression levels in metformin-treated samples compared to diet-controlled samples, suggesting they may be implicated in metformin action. However, there were no significant differences in expression levels of these genes in the five-year survival categories, indicating that they were less likely to be related to the cancer-protective aspect of metformin treatment. SORL1 encodes an intracellular sorting protein comprised of an extracellular domain, a transmembrane domain, and a short cytoplasmic domain (307,308). This multifunctional protein has been implicated in the regulation of lipid metabolism in the development of obesity, whereby overexpression of SORL1 suppresses insulin-induced lipolysis, resulting in excess adiposity (309–312). Nonetheless, the role of SORL1 in cancer pathology remains unclear. Some studies report that SORL1 displays oncogenic functions in breast cancer by enhancing cell survival and proliferation; conversely, other studies report that

SORL1 initiates pro-apoptotic signalling in nerve cells when it functions as a co-receptor and binds precursor nerve growth factors (307,313). HNF4A also differentiated between metformin-treated and diet-controlled samples. This gene encodes the hepatocyte nuclear factor 4 alpha, a transcription factor involved in the development of several organs including the intestines (314).

The role of this gene in cancer pathology is also controversial. Some studies report that HNF4A acts as an oncogene by regulating transcription to promote cellular proliferation and survival (315).

Nonetheless, other researchers report that loss of HNF4A expression may reduce the sensitivity of

93

liver cancer to targeted therapy while enhancing cellular invasive and migratory capacity by upregulating EMT (316). Thus, further investigation of the involvement of these genes in CRC pathogenesis, as well as their potential interaction with metformin, is needed.

4.5.2 Decreased Expression of FOXC2, OCT4 & AKT1 in Patient Samples with Better

Survival Contradicts the Functional Role of these Genes Reported in the Current Literature

The expression levels of FOXC2, OCT4, and AKT1 were found to be lower in non-diabetic samples with poorer survival times, suggesting that these genes may be biomarkers for clinical outcomes. However, expression levels did not differ significantly between metformin-treated samples and diet-controlled samples, indicating that the effect may be independent of metformin action. An extensive review of the current literature revealed that these genes all display oncogenic functions. Namely, FOXC2 encodes a transcription factor that promotes EMT, thereby increasing cellular invasion and metastasis in several malignancies, including CRC (317–320). OCT4, a gene encoding a transcription factor with roles in the maintenance of stem cell pluripotency and embryonic development, is also reported to contribute to the development of metastatic disease in

CRC and poor patient outcomes (321–323). Upregulation of AKT1 has also been associated with malignant disease progression in multiple cancers, including CRC, where knockout of AKT1 was reported to suppress metastatic potential (324–326). Given that lower expression levels were exhibited for these oncogenes in the non-diabetic patient samples with greater survival, contradicting what we would expect based on the current literature, further investigation to determine the reasons for these discrepancies is needed. Our work involved investigating molecular profiles at the transcriptomic level. Consequently, an analysis of the proteome may help explain these unexpected findings. For example, one study investigating the role of the

94

p53/miRNA-37b/AKT1 axis in CRC pathology reported that AKT1 mRNA levels were elevated following the treatment of cells with a cytotoxic agent to induce DNA damage (327). However, the AKT1 protein was strongly inhibited, facilitating cellular apoptosis and increased sensitivity to chemotherapy (327). Thus, despite the transcriptional upregulation of these oncogenes, inhibition may occur at the protein level, thereby contributing to improved survival.

4.5.3 Invasive & Metastatic Genes Differentiate Between Diabetic & Non-Diabetic Samples

The final expression profile that we identified encompassed genes differentiating between samples based on their diabetic status. TGFBR1, JAG1, and HGF, all showed increased expression in the metformin-treated and diet-controlled diabetic samples relative to the non-diabetic samples, with the greatest difference in expression levels being for TGFBR1. No significant differences were seen between metformin-treated samples and diet-controlled samples, nor were there differences between non-diabetic patient samples based on their five-year survival status. Thus, our results indicate that these genes do not appear to be biomarkers of metformin-action or clinical outcomes. Nonetheless, they may be implicated in the pathology of T2D, and thereby could contribute to poor survival. TGFBR1 encodes a protein subunit of the TGF-β receptor complex, that initiates intracellular signalling when bound to TGF-β (328). This receptor plays an important role in cell signalling during embryogenesis, as well as the regulation of cell growth and death

(328). The role of TGF-β signalling in cancer pathology remains controversial. Some studies report that loss of function of this signalling pathway coincides with early tumourigenesis; conversely, as cancer progresses, TGF-β signalling has been reported to switch functions and instead contributes to disease progression, as well as the promotion of invasion and metastasis (329–333).

The implications of JAG1 in cancer pathology are less disputed. This gene encodes the ligand for

95

the notch 1 receptor and is reported to be overexpressed in multiple distinct malignant pathologies

(334). Furthermore, upregulation of this gene has been correlated with poor clinical outcomes in cancer patients through increased cellular proliferation, therapy resistance, and invasive metastatic potential (335–338). Finally, HGF encodes a growth factor that regulates cellular proliferation, morphogenesis, and motility across various distinct tissue types (339). The oncogenic functions of this gene in several cancers, including CRC, are well documented in the literature, indicating that upregulation of HGF is associated with the development of metastatic disease and poor patient outcomes (340–343). Taken together, our results suggest that comorbid T2D may coincide with the promotion of invasive metastatic disease in CRC, via the upregulation of these genes.

4.6 The Classification Algorithms Displayed Some Predictive Power

To investigate the capacity to predict diabetic treatment, as well as five-year survival status, we generated two distinct classification algorithms using a supervised learning approach. Our model was able to predict whether a sample was from a metformin-treated or diet-controlled diabetic CRC patient, with an accuracy of 75.5 percent on the training data; this was not generalizable to our test data. Several reasons may explain these results. First, our study was limited due to our relatively small sample size. Consequently, the survival advantage in metformin- treated patients that we were investigating was not significant in our data set. Thus, at the molecular level, the differences in metformin-treated and diet-controlled patient samples within our data set may have been subtle, making it difficult to generate a highly accurate model. Furthermore, access to a larger sample size would facilitate a stage-specific investigation, as previous work has reported that the greatest difference in survival between metformin-users and non-metformin-users occurred in patients with stage III disease (344). Moreover, we did not account for metformin

96

dosage or duration of treatment. Hence, some samples included in the metformin-treated group may not have been exposed to the antidiabetic drug for a long enough period, and/or at a high enough dosage to experience the protective effect, which may have biased our results.

Additionally, we did not investigate the contribution of insulin treatment to the differential survival relationship documented in the literature. There is evidence to suggest insulin may have a cancer- promoting effect, and thus, the cancer-protective effect of metformin may be relatively small.

Finally, given that our work utilized a targeted gene panel, it is possible that key discriminatory genes may not have been included in our analysis, leading to the suboptimal performance of the algorithm. In contrast, our model to predict five-year survival status achieved a performance of

80.6 percent on the training data and dropped only marginally, to 68.8 percent on the test data, thus revealing that this model exhibited greater predictive power. Nonetheless, this was also limited by a relatively small sample size, and an unequal distribution of the two classes within our data set. Consequently, our model tended to predict the more common class in our data set when applied to the test data, which may have deceptively inflated the performance accuracy. Therefore, confirmation of the model’s performance on a larger sample size is needed.

4.7 Robust Performance of the NanoString nCounter® Assay on Degraded RNA Samples

In general, the NanoString nCounter® assay performed well on our samples, despite exhibiting highly degraded RNA. We processed a total of 105 unique samples, of which only seven did not pass QC. In an attempt to augment the number of diabetic samples for our analysis, three diabetic samples that failed QC in early NanoString runs were re-processed using the stock concentration to maximize RNA input, in place of three randomly selected non-diabetic samples in later NanoString runs. These samples did not pass QC when they were re-run, indicating that

97

QC failure likely resulted due to poor sample quality rather than experimental errors or low RNA concentration, and therefore, these samples should not be re-evaluated unless there is reason to believe that failure resulted due to a technical error in the assay.

4.8 Limitations of the Investigation

There were several limitations in our investigation. First, the CRC-protective effects of metformin, which are documented in the literature, are reported in analyses with substantially larger sample sizes than what was used in this study. Consequently, the survival advantage in metformin-treated patients was not found to be statistically significant in our study population.

Therefore, conducting a similar investigation with a larger sample size may yield more compelling results, while reducing the probability of missing significant genes that may not have been detected due to an inadequate sample size. Furthermore, the use of a custom gene panel imparts the inherent possibility of missing key genes that may not have been included in the codeset. However, the careful selection of genes for inclusion in the analysis likely minimized this number considerably.

Next, this analysis did not account for the dosage or the duration of metformin treatment; given that higher dosages and longer durations of treatment are reported to contribute to a greater CRC- protective effect, these variables would be important to account for in future work (61). Moreover, we were not able to account for BMI and glycosylated hemoglobin levels in the diabetic population, which are important indicators of diabetic severity that could bias clinical outcomes, and thus should be considered. In addition, our sample population contained an extremely low level of insulin-treated patients. Given that insulin is reported to have a cancer-promoting effect, studying how this contributes to the differential survival seen between metformin-users and non- metformin-users, would also be a critical piece of information to determine the extent of

98

metformin’s antineoplastic effects. Finally, our investigation was conducted at the level of transcription. Given that molecular biology is highly complex, with cellular regulatory pathways occurring not only in the transcriptome, but also downstream at the protein level, further investigation is required to determine how remodelling of the transcriptome leads to biological changes at the protein level, and how this ultimately contributes to CRC pathology.

4.9 Conclusions & Future Directions

CRC is a frequently diagnosed malignancy and a major cause of death globally. In 2020,

1.93 million new cases were reported and in the same year, 935 thousand people died from this illness (345). Despite the implementation of screening programs, 50 percent of cases are diagnosed once their disease has progressed to stage III or IV when current treatment options are undeniably inadequate. T2D, a highly prevalent metabolic disorder whose incidence is growing worldwide, is a major risk factor for the development and progression of CRC (53,346). Nonetheless, the growing prevalence of CRC patients with comorbid T2D led to an important discovery; namely, metformin-treated diabetic CRC patients tend to have better clinical outcomes than those who are not treated with metformin (59,60). Thus, to investigate the molecular underpinnings of this relationship, we performed a targeted transcriptomic analysis of primary colorectal tumour samples from metformin-treated, diet-controlled, and non-diabetic patients, to identify genes that may be implicated in metformin’s antineoplastic mechanism of action.

Here we identified 43 genes discriminating between metformin-treated and diet-controlled diabetic CRC patient samples, as well as 61 genes discriminating between non-diabetic patient samples based on their five-year survival status. Further investigation of genes common to these separate analyses revealed three biologically related genes with expression profiles suggesting they 99

may be biomarkers of metformin’s antineoplastic effects in CRC. Namely, our results indicate that the UPR may be downregulated in response to metformin treatment by suppressing TMEM132A and FN1, while upregulating SCNN1A, to promote pro-apoptotic signalling in malignant cells.

Furthermore, we also identified sFas as a potential biomarker of metformin action, suggesting that this antidiabetic drug may synergistically enhance programmed cell death by downregulating the soluble form of the Fas receptor. We also noted three other categories of unique expression profiles; genes differentiating between metformin-treated and diet-controlled patient samples, non- diabetic patient samples based on clinical outcomes, as well as patient samples based on diabetic status. While these categories did not appear to be directly related to metformin’s cancer-protective effects, they may provide insight into metformin’s mechanism of action, molecular markers related to patient outcomes, as well as features of CRC with comorbid T2D, respectively.

The work described here was conducted as a discovery-based analysis, thus, further validation of our findings is necessary. Future work should ideally be performed with larger sample sizes and should account for metformin dosage, duration of treatment, as well as diabetic severity.

Nonetheless, our results highlight two potential mechanisms of metformin’s protective action in

CRC, warranting further investigation. Performing cell-based assays to confirm that metformin downregulates the UPR and sFas production to promote apoptosis in malignant cells would be a critical next step. Furthermore, investigating these postulated mechanisms at the protein level, to elucidate metformin’s specific targets, would be important in determining whether these molecular pathways may be targetable for the design of novel CRC therapies, to ultimately improve patient outcomes.

100

References

1. Cancer [Internet]. [cited 2021 Feb 26]. Available from: https://www.who.int/news- room/fact-sheets/detail/cancer

2. Dekker E, Tanis PJ, Vleugels JLA, Kasi PM, Wallace MB. Colorectal cancer [Internet]. Lancet. Lancet Publishing Group; 2019 [cited 2021 Apr 28]. page 1467–80. Available from: https://pubmed.ncbi.nlm.nih.gov/31631858/

3. Colorectal cancer rates expected to climb - Cancer Care Ontario [Internet]. [cited 2020 Dec 17]. Available from: https://www.cancercareontario.ca/en/cancer-facts/colorectal- cancer-rates-expected-climb-again-future

4. Moghadamyeghaneh Z, Alizadeh RF, Phelan M, Carmichael JC, Mills S, Pigazzi A, et al. Trends in colorectal cancer admissions and stage at presentation: impact of screening. Surg Endosc. Springer New York LLC; 2016;30:3604–10.

5. Wang A, Ramjeesingh R, Chen CH, Hurlbut D, Hammad N, Mulligan LM, et al. Reduction in membranous immunohistochemical staining for the intracellular domain of epithelial cell adhesion molecule correlates with poor patient outcome in primary colorectal adenocarcinoma. Curr Oncol. Multimed Inc.; 2016;23:e171–8.

6. Cancer [Internet]. [cited 2020 Dec 10]. Available from: https://www.who.int/news- room/fact-sheets/detail/cancer

7. Cancer [Internet]. [cited 2020 Dec 17]. Available from: https://www.who.int/news- room/fact-sheets/detail/cancer

8. Siegel RL, Torre LA, Soerjomataram I, Hayes RB, Bray F, Weber TK, et al. Global patterns and trends in colorectal cancer incidence in young adults. Gut. BMJ; 2019;gutjnl- 2019-319511.

9. Kasi PM, Shahjehan F, Cochuyt JJ, Li Z, Colibaseanu DT, Merchea A. Rising Proportion of Young Individuals With Rectal and Colon Cancer. Clin Colorectal Cancer. Elsevier Inc.; 2019;18:e87–95.

10. Hofseth LJ, Hebert JR, Chanda A, Chen H, Love BL, Pena MM, et al. Early-onset colorectal cancer: initial clues and current views [Internet]. Nat. Rev. Gastroenterol. Hepatol. Nature Research; 2020 [cited 2021 Feb 26]. page 352–64. Available from: https://www.nature.com/articles/s41575-019-0253-4

11. Siegel RL, Miller KD, Fedewa SA, Ahnen DJ, Meester RGS, Barzi A, et al. Colorectal cancer statistics, 2017. CA Cancer J Clin [Internet]. 2017 [cited 2019 Nov 29];67:177–93. Available from: http://doi.wiley.com/10.3322/caac.21395

101

12. Colorectal Cancer Stages | Rectal Cancer Staging | Colon Cancer Staging [Internet]. [cited 2021 Apr 27]. Available from: https://www.cancer.org/cancer/colon-rectal- cancer/detection-diagnosis-staging/staged.html

13. Siegel RL, Miller KD, Fedewa SA, Ahnen DJ, Meester RGS, Barzi A, et al. Colorectal cancer statistics, 2017. CA Cancer J Clin [Internet]. American Cancer Society; 2017 [cited 2020 Apr 15];67:177–93. Available from: http://doi.wiley.com/10.3322/caac.21395

14. Survival statistics for colorectal cancer - Canadian Cancer Society [Internet]. [cited 2021 Apr 27]. Available from: https://www.cancer.ca/en/cancer-information/cancer- type/colorectal/prognosis-and-survival/survival-statistics/?region=on

15. Rentsch M, Schiergens T, Khandoga A, Werner J. Surgery for colorectal cancer - Trends, developments, and future perspectives [Internet]. Visc. Med. S. Karger AG; 2016 [cited 2021 Apr 27]. page 184–91. Available from: www.karger.com/vis

16. Baran B, Mert Ozupek N, Yerli Tetik N, Acar E, Bekcioglu O, Baskin Y. Difference Between Left-Sided and Right-Sided Colorectal Cancer: A Focused Review of Literature. Gastroenterol Res. Elmer Press, Inc.; 2018;11:264–73.

17. Mik M, Berut M, Dziki L, Trzcinski R, Dziki A. Right-and left-sided colon cancer-clinical and pathological differences of the disease entity in one organ. Arch Med Sci. Termedia Publishing House Ltd.; 2017;13:157–62.

18. Jiang Y, Yan X, Liu K, Shi Y, Wang C, Hu J, et al. Discovering the molecular differences between right- and left-sided colon cancer using machine learning methods. BMC Cancer [Internet]. BioMed Central Ltd; 2020 [cited 2021 May 11];20:1–11. Available from: https://doi.org/10.1186/s12885-020-07507-8

19. Natsume S, Yamaguchi T, Takao M, Iijima T, Wakaume R, Takahashi K, et al. Clinicopathological and molecular differences between right-sided and left-sided colorectal cancer in Japanese patients. Jpn J Clin Oncol [Internet]. Oxford University Press; 2018 [cited 2021 May 11];48:609–18. Available from: https://academic.oup.com/jjco/article/48/7/609/4996059

20. Ulanja MB, Rishi M, Beutler BD, Sharma M, Patterson DR, Gullapalli N, et al. Colon Cancer Sidedness, Presentation, and Survival at Different Stages. J Oncol. Hindawi Limited; 2019;2019.

21. Petrelli F, Tomasello G, Borgonovo K, Ghidini M, Turati L, Dallera P, et al. Prognostic survival associated with left-sided vs right-sided colon cancer a systematic review and meta-analysis [Internet]. JAMA Oncol. American Medical Association; 2017 [cited 2021 May 11]. page 211–9. Available from: https://jamanetwork.com/

102

22. Lee JM, Han YD, Cho MS, Hur H, Min BS, Lee KY, et al. Impact of tumor sidedness on survival and recurrence patterns in colon cancer patients. Ann. Surg. Treat. Res. Korean Surgical Society; 2019. page 296–304.

23. Li Y, Feng Y, Dai W, Li Q, Cai S, Peng J. Prognostic Effect of Tumor Sidedness in Colorectal Cancer: A SEER-Based Analysis. Clin Colorectal Cancer. Elsevier Inc.; 2019;18:e104–16.

24. Boeckx N, Koukakis R, de Beeck KO, Rolfo C, Van Camp G, Siena S, et al. Primary tumor sidedness has an impact on prognosis and treatment outcome in metastatic colorectal cancer: Results from two randomized first-line panitumumab studies. Ann Oncol. Oxford University Press; 2017;28:1862–8.

25. Nawa T, Kato J, Kawamoto H, Okada H, Yamamoto H, Kohno H, et al. Differences between right- and left-sided colon cancer in patient characteristics, cancer morphology and histology. J Gastroenterol Hepatol [Internet]. Blackwell Publishing; 2008 [cited 2020 Apr 16];23:418–23. Available from: http://doi.wiley.com/10.1111/j.1440- 1746.2007.04923.x

26. Birkenkamp-Demtroder K, Olesen SH, Sørensen FB, Laurberg S, Laiho P, Aaltonen LA, et al. Differential gene expression in colon cancer of the caecum versus the sigmoid and rectosigmoid. Gut. 2005;54:374–84.

27. Peng Q, Lin K, Chang T, Zou L, Xing P, Shen Y, et al. Identification of genomic expression differences between right-sided and left-sided colon cancer based on bioinformatics analysis. Onco Targets Ther [Internet]. Dove Medical Press Ltd.; 2018 [cited 2021 Apr 27];11:609–18. Available from: /pmc/articles/PMC5797455/

28. Liang L, Zeng J, Qin X, Chen J, Luo D, Chen G. Distinguishable Prognostic Signatures of Left- and Right-Sided Colon Cancer: a Study Based on Sequencing Data. Cell Physiol Biochem [Internet]. S. Karger AG; 2018 [cited 2021 May 1];48:475–90. Available from: https://www.karger.com/Article/FullText/491778

29. Carethers JM. One colon lumen but two organs [Internet]. Gastroenterology. W.B. Saunders; 2011 [cited 2021 Apr 27]. page 411–2. Available from: /pmc/articles/PMC3424279/

30. Komuro K, Tada M, Tamoto E, Kawakami A, Matsunaga A, Teramoto KI, et al. Right- and left-sided colorectal cancers display distinct expression profiles and the anatomical stratification allows a high accuracy prediction of lymph node metastasis. J Surg Res. Academic Press Inc.; 2005;124:216–24.

31. Glebov OK, Rodriguez LM, Nakahara K, Jenkins J, Cliatt J, Humbyrd C-J, et al. Distinguishing Right from Left Colon by the Pattern of Gene Expression. Cancer Epidemiol Prev Biomarkers. 2003;12.

103

32. Grading colorectal cancer - Canadian Cancer Society [Internet]. [cited 2020 Apr 16]. Available from: https://www.cancer.ca/en/cancer-information/cancer- type/colorectal/grading/?region=on

33. Barresi V, Bonetti LR, Branca G, Di Gregorio C, Ponz De Leon M, Tuccari G. Colorectal carcinoma grading by quantifying poorly differentiated cell clusters is more reproducible and provides more robust prognostic information than conventional grading. Virchows Arch. Springer; 2012;461:621–8.

34. Ueno H, Hase K, Hashiguchi Y, Shimazaki H, Tanaka M, Miyake O, et al. Site-specific tumor grading system in colorectal cancer: Multicenter pathologic review of the value of quantifying poorly differentiated clusters. Am J Surg Pathol. 2014;38:197–204.

35. Barresi V, Reggiani Bonetti L, Ieni A, Caruso RA, Tuccari G. Poorly Differentiated Clusters: Clinical Impact in Colorectal Cancer. Clin. Colorectal Cancer. Elsevier Inc.; 2017. page 9–15.

36. Colorectal Cancer – Cancer Care Ontario [Internet]. [cited 2021 May 11]. Available from: https://www.cancercareontario.ca/en/types-of-cancer/colorectal

37. What Should I Know About Screening for Colorectal Cancer? | CDC [Internet]. [cited 2021 May 11]. Available from: https://www.cdc.gov/cancer/colorectal/basic_info/screening/index.htm

38. Screening Strategies For Colorectal Cancer Systematic Review & Recommendations TECHNICAL REPORT [Internet]. 2001. Available from: http://www.ctfphc.org

39. Flexible Sigmoidoscopy | NIDDK [Internet]. [cited 2021 May 1]. Available from: https://www.niddk.nih.gov/health-information/diagnostic-tests/flexible-sigmoidoscopy

40. Smith RA, Andrews KS, Brooks D, Fedewa SA, Manassaram-Baptiste D, Saslow D, et al. Cancer screening in the United States, 2018: A review of current American Cancer Society guidelines and current issues in cancer screening. CA Cancer J Clin. American Cancer Society; 2018;68:297–316.

41. Siegel RL, Miller KD, Goding Sauer A, Fedewa SA, Butterly LF, Anderson JC, et al. Colorectal cancer statistics, 2020. CA Cancer J Clin [Internet]. Wiley; 2020 [cited 2021 Apr 28];70:145–64. Available from: https://onlinelibrary.wiley.com/doi/abs/10.3322/caac.21601

42. Potter MB. Strategies and Resources to Address Colorectal Cancer Screening Rates and Disparities in the United States and Globally. Annu Rev Public Health [Internet]. Annual Reviews ; 2013 [cited 2021 Apr 28];34:413–29. Available from: http://www.annualreviews.org/doi/10.1146/annurev-publhealth-031912-114436

104

43. Cock C, Anwar S, Byrne SE, Meng R, Pedersen S, Fraser RJL, et al. Low Sensitivity of Fecal Immunochemical Tests and Blood-Based Markers of DNA Hypermethylation for Detection of Sessile Serrated Adenomas/Polyps. Dig Dis Sci [Internet]. Springer New York LLC; 2019 [cited 2021 May 11];64:2555–62. Available from: https://link.springer.com/article/10.1007/s10620-019-05569-8

44. Fenton JJ, Elmore JG, Buist DSM, Reid RJ, Tancredi DJ, Baldwin LM. Longitudinal adherence with fecal occult blood test screening in community practice. Ann Fam Med [Internet]. Annals of Family Medicine, Inc; 2010 [cited 2021 Apr 28];8:397–401. Available from: /pmc/articles/PMC2939414/

45. Alyabsi M, Meza J, Islam KMM, Soliman A, Watanabe-Galloway S. Colorectal Cancer Screening Uptake: Differences Between Rural and Urban Privately-Insured Population. Front Public Heal [Internet]. Frontiers Media S.A.; 2020 [cited 2021 Apr 28];8:532950. Available from: https://www.frontiersin.org/articles/10.3389/fpubh.2020.532950/full

46. Iovanescu D, Frandes M, Lungeanu D, Burlea A, Miutescu BP, Miutescu E. Diagnosis reliability of combined flexible sigmoidoscopy and fecal-immunochemical test in colorectal neoplasia screening. Onco Targets Ther [Internet]. Dove Medical Press Ltd.; 2016 [cited 2021 May 1];9:6819–28. Available from: /pmc/articles/PMC5104292/

47. Diagnosis of colorectal cancer - Canadian Cancer Society [Internet]. [cited 2021 May 3]. Available from: https://www.cancer.ca/en/cancer-information/cancer- type/colorectal/diagnosis/?region=on

48. Werner J, Heinemann V. Standards and Challenges of Care for Colorectal Cancer Today [Internet]. Visc. Med. S. Karger AG; 2016 [cited 2021 May 3]. page 156–7. Available from: /pmc/articles/PMC4945777/

49. Survival statistics for colorectal cancer - Canadian Cancer Society [Internet]. [cited 2020 Apr 16]. Available from: https://www.cancer.ca/en/cancer-information/cancer- type/colorectal/prognosis-and-survival/survival-statistics/?region=on

50. Feng QY, Wei Y, Chen JW, Chang WJ, Ye LC, Zhu DX, et al. Anti-EGFR and anti- VEGF agents: Important targeted therapies of colorectal liver metastases. World J Gastroenterol [Internet]. Baishideng Publishing Group Co; 2014 [cited 2021 May 1];20:4263–75. Available from: /pmc/articles/PMC3989962/

51. Seow HF, Yip WK, Fifis T. Advances in targeted and immunobased therapies for colorectal cancer in the genomic era [Internet]. Onco. Targets. Ther. Dove Medical Press Ltd.; 2016 [cited 2021 May 1]. page 1899–920. Available from: /pmc/articles/PMC4821380/

105

52. Van Der Jeught K, Xu HC, Li YJ, Lu X Bin, Ji G. Drug resistance and new therapies in colorectal cancer [Internet]. World J. Gastroenterol. Baishideng Publishing Group Co., Limited; 2018 [cited 2020 Dec 12]. page 3834–48. Available from: /pmc/articles/PMC6141340/?report=abstract

53. De Kort S, Masclee AAM, Sanduleanu S, Weijenberg MP, Van Herk-Sukel MPP, Oldenhof NJJ, et al. Higher risk of colorectal cancer in patients with newly diagnosed diabetes mellitus before the age of colorectal cancer screening initiation. Sci Rep. Nature Publishing Group; 2017;7.

54. De Bruijn KMJ, Arends LR, Hansen BE, Leeflang S, Ruiter R, Van Eijck CHJ. Systematic review and meta-analysis of the association between diabetes mellitus and incidence and mortality in breast and colorectal cancer [Internet]. Br. J. Surg. Br J Surg; 2013 [cited 2021 May 3]. page 1421–9. Available from: https://pubmed.ncbi.nlm.nih.gov/24037561/

55. Shikata K, Ninomiya T, Kiyohara Y. Diabetes mellitus and cancer risk: Review of the epidemiological evidence [Internet]. Cancer Sci. Cancer Sci; 2013 [cited 2021 May 3]. page 9–14. Available from: https://pubmed.ncbi.nlm.nih.gov/23066889/

56. Nguyen SP, Bent S, Chen YH, Terdiman JP. Gender as a Risk Factor for Advanced Neoplasia and Colorectal Cancer: A Systematic Review and Meta-analysis. Clin Gastroenterol Hepatol. W.B. Saunders; 2009;7:676-681.e3.

57. Kautzky-Willer A, Harreiter J, Pacini G. Sex and gender differences in risk, pathophysiology and complications of type 2 diabetes mellitus [Internet]. Endocr. Rev. Endocrine Society; 2016 [cited 2021 May 3]. page 278–316. Available from: https://academic.oup.com/edrv/article/37/3/278/2354724

58. Zheng Y, Ley SH, Hu FB. Global aetiology and epidemiology of type 2 diabetes mellitus and its complications. Nat. Rev. Endocrinol. Nature Publishing Group; 2018. page 88–98.

59. Marks AR, Pietrofesa RA, Jensen CD, Zebrowski A, Corley DA, Doubeni CA. Metformin use and risk of colorectal adenoma after polypectomy in patients with type 2 diabetes mellitus. Cancer Epidemiol Biomarkers Prev. American Association for Cancer Research Inc.; 2015;24:1692–8.

60. Kim HJ, Lee SJ, Chun KH, Jeon JY, Han SJ, Kim DJ, et al. Metformin reduces the risk of cancer in patients with type 2 diabetes. Med (United States). Lippincott Williams and Wilkins; 2018;97.

61. Chang YT, Tsai HL, Kung YT, Yeh YS, Huang CW, Ma CJ, et al. Dose-Dependent Relationship Between Metformin and Colorectal Cancer Occurrence Among Patients with Type 2 Diabetes—A Nationwide Cohort Study. Transl Oncol [Internet]. Neoplasia Press, Inc.; 2018 [cited 2020 Oct 18];11:535–41. Available from: /pmc/articles/PMC5884217/?report=abstract

106

62. Kaku K. Pathophysiology of Type 2 Diabetes and Its Treatment Policy. JMAJ.

63. Wilcox G. Insulin and Insulin. Clin Biochem Rev [Internet]. The Australian Association of Clinical Biochemists; 2005 [cited 2021 May 1];26:19–39. Available from: http://www.ncbi.nlm.nih.gov/pubmed/16278749

64. Huang X, Liu G, Guo J, Su ZQ. The PI3K/AKT pathway in obesity and type 2 diabetes [Internet]. Int. J. Biol. Sci. Ivyspring International Publisher; 2018 [cited 2021 Apr 23]. page 1483–96. Available from: /pmc/articles/PMC6158718/

65. Thorens B, Mueckler M. Glucose transporters in the 21st Century [Internet]. Am. J. Physiol. - Endocrinol. Metab. Am J Physiol Endocrinol Metab; 2010 [cited 2021 Apr 29]. Available from: https://pubmed.ncbi.nlm.nih.gov/20009031/

66. Osorio-Fuentealba C, Contreras-Ferrat AE, Altamirano F, Espinosa A, Li Q, Niu W, et al. Electrical stimuli release ATP to increase GLUT4 translocation and glucose uptake via PI3Kγ-Akt-AS160 in skeletal muscle cells. Diabetes [Internet]. Diabetes; 2013 [cited 2021 Apr 29];62:1519–26. Available from: https://pubmed.ncbi.nlm.nih.gov/23274898/

67. Insulin Resistance & Prediabetes | NIDDK [Internet]. [cited 2021 May 1]. Available from: https://www.niddk.nih.gov/health-information/diabetes/overview/what-is- diabetes/prediabetes-insulin-resistance

68. Petersen MC, Shulman GI. Mechanisms of insulin action and insulin resistance [Internet]. Physiol. Rev. American Physiological Society; 2018 [cited 2021 Apr 3]. page 2133–223. Available from: /pmc/articles/PMC6170977/

69. Czech MP. Insulin action and resistance in obesity and type 2 diabetes [Internet]. Nat. Med. Nature Publishing Group; 2017 [cited 2021 Apr 28]. page 804–14. Available from: /pmc/articles/PMC6048953/

70. Erion DM, Shulman GI. Diacylglycerol-mediated insulin resistance [Internet]. Nat. Med. Nature Publishing Group; 2010 [cited 2021 May 1]. page 400–2. Available from: https://www.nature.com/articles/nm0410-400

71. Pagel-Langenickel I, Bao J, Pang L, Sack MN. The role of mitochondria in the pathophysiology of skeletal muscle insulin resistance [Internet]. Endocr. Rev. Endocr Rev; 2010 [cited 2021 May 1]. page 25–51. Available from: https://pubmed.ncbi.nlm.nih.gov/19861693/

72. Vial G, Detaille D, Guigas B. Role of Mitochondria in the Mechanism(s) of Action of Metformin. Front Endocrinol (Lausanne) [Internet]. Frontiers Media S.A.; 2019 [cited 2021 Apr 28];10:294. Available from: https://www.frontiersin.org/article/10.3389/fendo.2019.00294/full

107

73. Schernthaner G, Schernthaner GH. The right place for metformin today. Diabetes Res. Clin. Pract. Elsevier Ireland Ltd; 2020. page 107946.

74. Rena G, Hardie DG, Pearson ER. The mechanisms of action of metformin. Diabetologia [Internet]. Springer; 2017 [cited 2019 Sep 5];60:1577–85. Available from: http://www.ncbi.nlm.nih.gov/pubmed/28776086

75. Koffert JP, Mikkola K, Virtanen KA, Andersson AMD, Faxius L, Hällsten K, et al. Metformin treatment significantly enhances intestinal glucose uptake in patients with type 2 diabetes: Results from a randomized clinical trial. Diabetes Res Clin Pract. Elsevier Ireland Ltd; 2017;131:208–16.

76. Horakova O, Kroupova P, Bardova K, Buresova J, Janovska P, Kopecky J, et al. Metformin acutely lowers blood glucose levels by inhibition of intestinal glucose transport. Sci Rep [Internet]. Nature Publishing Group; 2019 [cited 2021 Apr 29];9:1–11. Available from: https://doi.org/10.1038/s41598-019-42531-0

77. Schulten H-J. Pleiotropic Effects of Metformin on Cancer. Int J Mol Sci [Internet]. MDPI AG; 2018 [cited 2021 Apr 29];19:2850. Available from: http://www.mdpi.com/1422- 0067/19/10/2850

78. Chukir T, Mandel L, Tchang BG, Al-Mulla NA, Igel LI, Kumar RB, et al. Metformin- induced weight loss in patients with or without type 2 diabetes/prediabetes: A retrospective cohort study. Obes Res Clin Pract. Elsevier Ltd; 2021;15:64–8.

79. Isoda K, Young JL, Zirlik A, MacFarlane LA, Tsuboi N, Gerdes N, et al. Metformin inhibits proinflammatory responses and nuclear factor-κB in human vascular wall cells. Arterioscler Thromb Vasc Biol [Internet]. Lippincott Williams & Wilkins; 2006 [cited 2021 Apr 29];26:611–7. Available from: http://www.atvbaha.org

80. Chen C, Kassan A, Castañeda D, Gabani M, Choi SK, Kassan M. Metformin prevents vascular damage in hypertension through the AMPK/ER stress pathway. Hypertens Res [Internet]. Nature Publishing Group; 2019 [cited 2021 Apr 29];42:960–9. Available from: https://www.nature.com/articles/s41440-019-0212-z

81. Solymár M, Ivic I, Pótó L, Hegyi P, Garami A, Hartmann P, et al. Metformin induces significant reduction of body weight, total cholesterol and LDL levels in the elderly – A meta-analysis. Fürnsinn C, editor. PLoS One [Internet]. Public Library of Science; 2018 [cited 2021 Apr 29];13:e0207947. Available from: https://dx.plos.org/10.1371/journal.pone.0207947

82. Fontaine E. Metformin-Induced Mitochondrial Complex I Inhibition: Facts, Uncertainties, and Consequences. Front Endocrinol (Lausanne) [Internet]. Frontiers Media SA; 2018 [cited 2021 Apr 28];9:753. Available from: /pmc/articles/PMC6304344/

108

83. Hou WL, Yin J, Alimujiang M, Yu XY, Ai LG, Bao YQ, et al. Inhibition of mitochondrial complex I improves glucose metabolism independently of AMPK activation. J Cell Mol Med [Internet]. Blackwell Publishing Inc.; 2018 [cited 2021 Apr 28];22:1316–28. Available from: /pmc/articles/PMC5783883/

84. Hunter RW, Hughey CC, Lantier L, Sundelin EI, Peggie M, Zeqiraj E, et al. Metformin reduces liver glucose production by inhibition of fructose-1-6-bisphosphatase. Nat Med [Internet]. Nature Publishing Group; 2018 [cited 2021 Apr 28];24:1395–406. Available from: /pmc/articles/PMC6207338/

85. Miller RA, Chu Q, Xie J, Foretz M, Viollet B, Birnbaum MJ. Biguanides suppress hepatic glucagon signalling by decreasing production of cyclic AMP. Nature [Internet]. Nature; 2013 [cited 2021 May 1];494:256–60. Available from: https://pubmed.ncbi.nlm.nih.gov/23292513/

86. Johanns M, Lai YC, Hsu MF, Jacobs R, Vertommen D, Van Sande J, et al. AMPK antagonizes hepatic glucagon-stimulated cyclic AMP signalling via phosphorylation- induced activation of cyclic nucleotide phosphodiesterase 4B. Nat Commun [Internet]. Nature Publishing Group; 2016 [cited 2021 May 1];7. Available from: https://pubmed.ncbi.nlm.nih.gov/26952277/

87. Lamoia TE, Shulman GI. Cellular and Molecular Mechanisms of Metformin Action [Internet]. Endocr. Rev. Endocrine Society; 2021 [cited 2021 May 1]. page 77–96. Available from: https://academic.oup.com/edrv

88. Madiraju AK, Qiu Y, Perry RJ, Rahimi Y, Zhang XM, Zhang D, et al. Metformin inhibits gluconeogenesis via a redox-dependent mechanism in vivo. Nat Med [Internet]. Nature Publishing Group; 2018 [cited 2021 May 1];24:1384–94. Available from: /pmc/articles/PMC6129196/

89. Mráček T, Drahota Z, Houštěk J. The function and the role of the mitochondrial glycerol- 3-phosphate dehydrogenase in mammalian tissues. Biochim. Biophys. Acta - Bioenerg. Elsevier; 2013. page 401–10.

90. Madiraju AK, Erion DM, Rahimi Y, Zhang XM, Braddock DT, Albright RA, et al. Metformin suppresses gluconeogenesis by inhibiting mitochondrial glycerophosphate dehydrogenase. Nature [Internet]. Nature Publishing Group; 2014 [cited 2021 May 1];510:542–6. Available from: https://www.nature.com/articles/nature13270

91. Rotondo F, Ho-Palma AC, Remesar X, Fernández-López JA, Romero MDM, Alemany M. Glycerol is synthesized and secreted by adipocytes to dispose of excess glucose, via glycerogenesis and increased acyl-glycerol turnover. Sci Rep [Internet]. Nature Publishing Group; 2017 [cited 2021 May 1];7:1–14. Available from: www.nature.com/scientificreports/

109

92. Daugan M, Dufaÿ Wojcicki A, d’Hayer B, Boudy V. Metformin: An anti-diabetic drug to fight cancer [Internet]. Pharmacol. Res. Academic Press; 2016 [cited 2021 Apr 23]. page 675–85. Available from: https://pubmed.ncbi.nlm.nih.gov/27720766/

93. Del Barco S, Vazquez-Martin A, Cufí S, Oliveras-Ferraros C, Bosch-Barrera J, Joven J, et al. Metformin: Multi-faceted protection against cancer [Internet]. Oncotarget. Impact Journals LLC; 2011 [cited 2021 Apr 23]. page 896–917. Available from: https://pubmed.ncbi.nlm.nih.gov/22203527/

94. Bradley MC, Ferrara A, Achacoso N, Ehrlich SF, Quesenberry CP, Habel LA. A cohort study of metformin and colorectal cancer risk among patients with diabetes mellitus. Cancer Epidemiol Biomarkers Prev. American Association for Cancer Research Inc.; 2018;27:525–30.

95. Dulskas A, Patasius A, Linkeviciute-Ulinskiene D, Zabuliene L, Urbonas V, Smailyte G. Metformin increases cancer specific survival in colorectal cancer patients—National cohort study. Cancer Epidemiol. Elsevier Ltd; 2019;62:101587.

96. Shin CM, Kim N, Han K, Kim B, Jung JH, Oh TJ, et al. Anti-diabetic medications and the risk for colorectal cancer: A population-based nested case-control study. Cancer Epidemiol. Elsevier Ltd; 2020;64:101658.

97. Yang WT, Yang HJ, Zhou JG, Liu J Le. Relationship between metformin therapy and risk of colorectal cancer in patients with diabetes mellitus: a meta-analysis. Int J Colorectal Dis [Internet]. Springer; 2020 [cited 2020 Oct 12];35:2117–31. Available from: https://link.springer.com/article/10.1007/s00384-020-03704-w

98. Ha Lee J, Il Kim T, Min Jeon S, Pil Hong S, Hee Cheon J, Ho Kim W. The effects of metformin on the survival of colorectal cancer patients with diabetes mellitus Patients and Methods Patients. [cited 2019 Sep 5]; Available from: https://journals.scholarsportal.info/pdf/00207136/v131i0003/752_teomotccpwdm.xml

99. Mogavero A, Maiorana MV, Zanutto S, Varinelli L, Bozzi F, Belfiore A, et al. Metformin transiently inhibits colorectal cancer cell proliferation as a result of either AMPK activation or increased ROS production. Sci Rep [Internet]. Nature Publishing Group; 2017 [cited 2020 Dec 12];7. Available from: https://moh- it.pure.elsevier.com/en/publications/metformin-transiently-inhibits-colorectal-cancer-cell- proliferati-2

100. Wang Y, Wu Z, Hu L. Epithelial-Mesenchymal Transition Phenotype, Metformin, and Survival for Colorectal Cancer Patients with Diabetes Mellitus II. Gastroenterol Res Pract. Hindawi Limited; 2017;2017.

110

101. Wang Y, Wu Z, Hu L. The regulatory effects of metformin on the [SNAIL/miR- 34]:[ZEB/miR-200] system in the epithelial-mesenchymal transition(EMT) for colorectal cancer(CRC). Eur J Pharmacol [Internet]. Elsevier B.V.; 2018 [cited 2021 May 2];834:45–53. Available from: https://pubmed.ncbi.nlm.nih.gov/30017802/

102. Amable G, Martínez-León E, Picco ME, Nemirovsky SI, Rozengurt E, Rey O. Metformin inhibition of colorectal cancer cell migration is associated with rebuilt adherens junctions and FAK downregulation. J Cell Physiol [Internet]. Wiley-Liss Inc.; 2020 [cited 2021 May 2];235:8334–44. Available from: https://pubmed.ncbi.nlm.nih.gov/32239671/

103. Kim JH, Lee KJ, Seo Y, Kwon JH, Yoon JP, Kang JY, et al. Effects of metformin on colorectal cancer stem cells depend on alterations in glutamine metabolism. Sci Rep [Internet]. Nature Publishing Group; 2018 [cited 2021 May 2];8:1–13. Available from: www.nature.com/scientificreports

104. Montales MTE, Simmen RCM, Ferreira ES, Neves VA, Simmen FA. Metformin and soybean-derived bioactive molecules attenuate the expansion of stem cell-like epithelial subpopulation and confer apoptotic sensitivity in human colon cancer cells. Genes Nutr [Internet]. Springer Verlag; 2015 [cited 2021 May 2];10. Available from: https://pubmed.ncbi.nlm.nih.gov/26506839/

105. Nguyen TT, Ung TT, Li S, Lian S, Xia Y, Park SY, et al. Metformin inhibits lithocholic acid-induced interleukin 8 upregulation in colorectal cancer cells by suppressing ROS production and NF-kB activity. Sci Rep [Internet]. Nature Publishing Group; 2019 [cited 2021 May 2];9:1–13. Available from: https://doi.org/10.1038/s41598-019-38778-2

106. Jones GR, Molloy MP. Metformin, Microbiome and Protection Against Colorectal Cancer [Internet]. Dig. Dis. Sci. Springer; 2020 [cited 2021 May 2]. page 1409–14. Available from: https://link.springer.com/article/10.1007/s10620-020-06390-4

107. Sang J, Tang R, Yang M, Sun Q. Metformin Inhibited Proliferation and Metastasis of Colorectal Cancer and presented a Synergistic Effect on 5-FU. Biomed Res Int. Hindawi Limited; 2020;2020.

108. Wheaton WW, Weinberg SE, Hamanaka RB, Soberanes S, Sullivan LB, Anso E, et al. Metformin inhibits mitochondrial complex I of cancer cells to reduce tumorigenesis. Elife. eLife Sciences Publications Ltd; 2014;2014.

109. Howell JJ, Hellberg K, Turner M, Talbott G, Kolar MJ, Ross DS, et al. Metformin Inhibits Hepatic mTORC1 Signaling via Dose-Dependent Mechanisms Involving AMPK and the TSC Complex. Cell Metab. Cell Press; 2017;25:463–71.

110. Madiraju AK, Qiu Y, Perry RJ, Rahimi Y, Zhang XM, Zhang D, et al. Metformin inhibits gluconeogenesis via a redox-dependent mechanism in vivo. Nat Med [Internet]. Nature Publishing Group; 2018 [cited 2021 Apr 29];24:1384–94. Available from: https://www.nature.com/articles/s41591-018-0125-4

111

111. Howell JJ, Ricoult SJH, Ben-Sahra I, Manning BD. A growing role for mTOR in promoting anabolic metabolism. Biochem Soc Trans [Internet]. Biochem Soc Trans; 2013 [cited 2021 May 2];41:906–12. Available from: https://pubmed.ncbi.nlm.nih.gov/23863154/

112. Laplante M, Sabatini DM. MTOR signaling in growth control and disease [Internet]. Cell. Elsevier B.V.; 2012 [cited 2021 May 2]. page 274–93. Available from: https://pubmed.ncbi.nlm.nih.gov/22500797/

113. Wee P, Wang Z. Epidermal growth factor receptor cell proliferation signaling pathways [Internet]. Cancers (Basel). MDPI AG; 2017 [cited 2021 Apr 24]. Available from: /pmc/articles/PMC5447962/

114. Roche J. The epithelial-to-mesenchymal transition in cancer. Cancers (Basel). MDPI AG; 2018.

115. Kim AY, Kwak JH, Je NK, Lee Y hee, Jung YS. Epithelial-mesenchymal transition is associated with acquired resistance to 5-fluorocuracil in HT-29 colon cancer cells. Toxicol Res [Internet]. Korean Society of Toxicology; 2015 [cited 2021 May 2];31:151–6. Available from: /pmc/articles/PMC4505345/

116. Siemens H, Jackstadt R, Hünten S, Kaller M, Menssen A, Götz U, et al. miR-34 and SNAIL form a double-negative feedback loop to regulate epithelial-mesenchymal transitions. Cell Cycle [Internet]. Taylor and Francis Inc.; 2011 [cited 2021 May 2];10:4256–71. Available from: https://pubmed.ncbi.nlm.nih.gov/22134354/

117. Title AC, Hong SJ, Pires ND, Hasenöhrl L, Godbersen S, Stokar-Regenscheit N, et al. Genetic dissection of the miR-200–Zeb1 axis reveals its importance in tumor differentiation and invasion. Nat Commun [Internet]. Nature Publishing Group; 2018 [cited 2021 May 2];9:1–14. Available from: www.nature.com/naturecommunications

118. Zhang Y, Guan M, Zheng Z, Zhang Q, Gao F, Xue Y. Effects of metformin on CD133+ colorectal cancer cells in diabetic patients. PLoS One [Internet]. Public Library of Science; 2013 [cited 2021 May 3];8. Available from: /pmc/articles/PMC3836781/

119. Cheng X, Xu X, Chen D, Zhao F, Wang W. Therapeutic potential of targeting the Wnt/β- catenin signaling pathway in colorectal cancer. Biomed. Pharmacother. Elsevier Masson SAS; 2019. page 473–81.

120. Steinhart Z, Angers S. Wnt signaling in development and tissue homeostasis [Internet]. Development. Development; 2018 [cited 2021 May 3]. Available from: https://pubmed.ncbi.nlm.nih.gov/29884654/

121. Shang S, Hua F, Hu ZW. The regulation of β-catenin activity and function in cancer: Therapeutic opportunities [Internet]. Oncotarget. Impact Journals LLC; 2017 [cited 2021 May 3]. page 33972–89. Available from: /pmc/articles/PMC5464927/

112

122. Wiese KE, Nusse R, van Amerongen R. Wnt signalling: Conquering complexity. Dev [Internet]. Company of Biologists Ltd; 2018 [cited 2021 May 3];145. Available from: https://pubmed.ncbi.nlm.nih.gov/29945986/

123. Nusse R, Clevers H. Wnt/β-Catenin Signaling, Disease, and Emerging Therapeutic Modalities. Cell. Cell Press; 2017. page 985–99.

124. Sedgwick AE, D’Souza-Schorey C. Wnt signaling in cell motility and invasion: Drawing parallels between development and cancer [Internet]. Cancers (Basel). MDPI AG; 2016 [cited 2021 May 3]. Available from: /pmc/articles/PMC5040982/

125. Ajani JA, Song S, Hochster HS, Steinberg IB. Cancer stem cells: The promise and the potential. Semin Oncol. W.B. Saunders; 2015;42:S3–17.

126. Batlle E, Clevers H. Cancer stem cells revisited. Nat. Med. Nature Publishing Group; 2017. page 1124–34.

127. Luo M, Li JF, Yang Q, Zhang K, Wang ZW, Zheng S, et al. Stem cell quiescence and its clinical relevance. World J Stem Cells [Internet]. Baishideng Publishing Group Co; 2020 [cited 2021 May 2];12:1307–26. Available from: /pmc/articles/PMC7705463/

128. Touil Y, Igoudjil W, Corvaisier M, ed erique Dessein A-F, er J, Vandomme ome, et al. Human Cancer Biology Colon Cancer Cells Escape 5FU Chemotherapy-Induced Cell Death by Entering Stemness and Quiescence Associated with the c-Yes/YAP Axis. Clin Cancer Res [Internet]. 2014 [cited 2019 Nov 25];20. Available from: http://clincancerres.aacrjournals.org/

129. Chang JC. Cancer stem cells. Medicine (Baltimore) [Internet]. Lippincott Williams and Wilkins; 2016 [cited 2021 May 2];95:S20–5. Available from: https://journals.lww.com/00005792-201609081-00005

130. Wang X, Zhang X, Xu L, Shi G, Zheng H, Sun B. Expression of stem cell markers CD44 and Lgr5 in colorectal cancer and its relationship with lymph node and liver metastasis. Natl Med J China [Internet]. Chinese Medical Association; 2018 [cited 2020 Dec 14];98:2899–904. Available from: https://europepmc.org/article/med/30293346

131. Singh N, Baby D, Rajguru J, Patil P, Thakkannavar S, Pujari V. Inflammation and cancer. Ann Afr Med [Internet]. Wolters Kluwer Medknow Publications; 2019 [cited 2021 May 2];18:121–6. Available from: /pmc/articles/PMC6704802/

132. Lukas M. Inflammatory bowel disease as a risk factor for colorectal cancer. Dig Dis [Internet]. Dig Dis; 2010 [cited 2021 May 2]. page 619–24. Available from: https://pubmed.ncbi.nlm.nih.gov/21088413/

113

133. Wu Y, Antony S, Meitzler JL, Doroshow JH. Molecular mechanisms underlying chronic inflammation-associated cancers [Internet]. Cancer Lett. Elsevier Ireland Ltd; 2014 [cited 2021 May 2]. page 164–73. Available from: /pmc/articles/PMC3935998/

134. Sakai A, Nakanishi M, Yoshiyama K, Maki H. Impact of reactive oxygen species on spontaneous mutagenesis in Escherichia coli. Genes to Cells [Internet]. Genes Cells; 2006 [cited 2021 May 2];11:767–78. Available from: https://pubmed.ncbi.nlm.nih.gov/16824196/

135. Liu T, Zhang L, Joo D, Sun SC. NF-κB signaling in inflammation [Internet]. Signal Transduct. Target. Ther. Springer Nature; 2017 [cited 2021 May 2]. page 17023. Available from: /pmc/articles/PMC5661633/

136. Patient education: Type 2 diabetes: Treatment (Beyond the Basics) - UpToDate [Internet]. [cited 2021 Apr 8]. Available from: https://www.uptodate.com/contents/type-2-diabetes- treatment-beyond-the-basics?topicRef=1743&source=see_link#H9

137. Marín-Peñalver JJ, Martín-Timón I, Sevillano-Collantes C, Cañizo-Gómez FJ del. Update on the treatment of type 2 diabetes mellitus. World J Diabetes [Internet]. Baishideng Publishing Group Inc.; 2016 [cited 2021 Apr 8];7:354. Available from: /pmc/articles/PMC5027002/

138. Brown A, Guess N, Dornhorst A, Taheri S, Frost G. Insulin‐associated weight gain in obese type 2 diabetes mellitus patients: What can be done? Diabetes, Obes Metab [Internet]. Blackwell Publishing Ltd; 2017 [cited 2021 Apr 9];19:1655–68. Available from: https://onlinelibrary.wiley.com/doi/abs/10.1111/dom.13009

139. Insulin - StatPearls - NCBI Bookshelf [Internet]. [cited 2021 Apr 9]. Available from: https://www.ncbi.nlm.nih.gov/books/NBK560688/

140. Petznick A. Insulin Management of Type 2 Diabetes Mellitus [Internet]. Am. Fam. Physician. 2011 Jul. Available from: http://www.endotext.org/diabetes/diabetes17/diabetesframe17.htm.

141. Hodish I. Insulin therapy, weight gain and prognosis. Diabetes, Obes Metab [Internet]. Blackwell Publishing Ltd; 2018 [cited 2021 Apr 9];20:2085–92. Available from: http://doi.wiley.com/10.1111/dom.13367

142. Chen CH, Lin CL, Hsu CY, Kao CH. Insulin enhances and metformin reduces risk of colorectal carcinoma in type-2 diabetes. QJM [Internet]. Oxford University Press; 2020 [cited 2021 May 14];113:194–200. Available from: https://academic.oup.com/qjmed/article/113/3/194/5583744

114

143. Gutiérrez-Salmerón M, Lucena SRR, Chocarro-Calvo A, Garcia-Martinez JM, Martín Orozco RM, García-Jiménez C. Metabolic and hormonal remodeling of colorectal cancer cell signaling by diabetes. Endocr Relat Cancer [Internet]. Bioscientifica Ltd.; 2021 [cited 2021 May 14];1. Available from: https://erc.bioscientifica.com/view/journals/erc/aop/erc- 21-0092/erc-21-0092.xml

144. Clayton PE, Banerjee I, Murray PG, Renehan AG. Growth hormone, the insulin-like growth factor axis, insulin and cancer risk [Internet]. Nat. Rev. Endocrinol. Nature Publishing Group; 2011 [cited 2021 Apr 11]. page 11–24. Available from: https://www.nature.com/articles/nrendo.2010.171

145. Hua H, Kong Q, Yin J, Zhang J, Jiang Y. Insulin-like growth factor receptor signaling in tumorigenesis and drug resistance: A challenge for cancer therapy [Internet]. J. Hematol. Oncol. BioMed Central Ltd.; 2020 [cited 2021 May 14]. page 1–17. Available from: https://doi.org/10.1186/s13045-020-00904-3

146. LeRoith D, Scheinman E, Bitton-Worms K. The Role for Insulin and Insulin-like Growth Factors in the Increased Risk of Cancer in Diabetes. Rambam Maimonides Med J. Rambam Health Care Campus; 2011;2.

147. Yuan J, Yin Z, Tao K, Wang G, Gao J. Function of insulin-like growth factor 1 receptor in cancer resistance to chemotherapy [Internet]. Oncol. Lett. Spandidos Publications; 2018 [cited 2021 May 14]. page 41–7. Available from: /pmc/articles/PMC5738696/

148. Broman KK, Bailey CE, Parikh AA. Sidedness of Colorectal Cancer Impacts Risk of Second Primary Gastrointestinal Malignancy. Ann Surg Oncol. Springer New York LLC; 2019;26:2037–43.

149. Nagtegaal ID, Odze RD, Klimstra D, Paradis V, Rugge M, Schirmacher P, et al. The 2019 WHO classification of tumours of the digestive system [Internet]. Histopathology. Blackwell Publishing Ltd; 2020 [cited 2021 May 3]. page 182–8. Available from: https://pubmed.ncbi.nlm.nih.gov/31433515/

150. Wu X, Lin H, Li S. Prognoses of different pathological subtypes of colorectal cancer at different stages: A population-based retrospective cohort study. BMC Gastroenterol [Internet]. BioMed Central Ltd.; 2019 [cited 2021 May 3];19:164. Available from: https://bmcgastroenterol.biomedcentral.com/articles/10.1186/s12876-019-1083-0

151. Guinney J, Dienstmann R, Wang X, De Reyniès A, Schlicker A, Soneson C, et al. The consensus molecular subtypes of colorectal cancer. Nat Med [Internet]. Nature Publishing Group; 2015 [cited 2021 Jan 17];21:1350–6. Available from: https://www.nature.com/articles/nm.3967

152. Gene expression profiling - Latest research and news | Nature [Internet]. [cited 2021 May 5]. Available from: https://www.nature.com/subjects/gene-expression-profiling

115

153. Guler EN. Gene Expression Profiling in Breast Cancer and Its Effect on Therapy Selection in Early-Stage Breast Cancer. Eur J Breast Heal [Internet]. Galenos Yayinevi; 2017 [cited 2021 Apr 7];13:168–74. Available from: /pmc/articles/PMC5648272/

154. Aarøe J, Lindahl T, Dumeaux V, Sæbø S, Tobin D, Hagen N, et al. Gene expression profiling of peripheral blood cells for early detection of breast cancer. Breast Cancer Res [Internet]. BioMed Central Ltd.; 2010 [cited 2021 Apr 7];12:1–11. Available from: https://link.springer.com/articles/10.1186/bcr2472

155. Opitz L, Salinas-Riester G, Grade M, Jung K, Jo P, Emons G, et al. Impact of RNA degradation on gene expression profiling. BMC Med Genomics [Internet]. BioMed Central; 2010 [cited 2021 Apr 7];3:36. Available from: /pmc/articles/PMC2927474/

156. Kokkat TJ, Patel MS, McGarvey D, Livolsi VA, Baloch ZW. Archived formalin-fixed paraffin-embedded (FFPE) blocks: A valuable underexploited resource for extraction of DNA, RNA, and protein. Biopreserv Biobank [Internet]. Mary Ann Liebert Inc.; 2013 [cited 2021 Apr 6];11:101–6. Available from: /pmc/articles/PMC4077003/

157. Gaffney EF, Riegman PH, Grizzle WE, Watson PH. Factors that drive the increasing use of FFPE tissue in basic and translational cancer research [Internet]. Biotech. Histochem. Taylor and Francis Ltd; 2018 [cited 2021 Apr 6]. page 373–86. Available from: https://doi.org/10.1080/10520295.2018.1446101

158. Ma CX, Bose R, Ellis MJ. Prognostic and predictive biomarkers of endocrine responsiveness for estrogen receptor positive breast cancer. Adv Exp Med Biol [Internet]. Springer New York LLC; 2016 [cited 2021 Apr 6]. page 125–54. Available from: https://link.springer.com/chapter/10.1007/978-3-319-22909-6_5

159. Shabihkhani M, Lucey GM, Wei B, Mareninov S, Lou JJ, Vinters H V., et al. The procurement, storage, and quality assurance of frozen blood and tissue biospecimens in pathology, biorepository, and biobank settings. Clin Biochem [Internet]. Elsevier Inc.; 2014 [cited 2021 May 5];47:258–66. Available from: /pmc/articles/PMC3982909/

160. Mathieson W, Thomas G. Using FFPE Tissue in Genomic Analyses: Advantages, Disadvantages and the Role of Biospecimen Science [Internet]. Curr. Pathobiol. Rep. Springer; 2019 [cited 2021 May 5]. page 35–40. Available from: https://link.springer.com/article/10.1007/s40139-019-00194-6

161. Pennock ND, Jindal S, Horton W, Sun D, Narasimhan J, Carbone L, et al. RNA-seq from archival FFPE breast cancer samples: Molecular pathway fidelity and novel discovery. BMC Med Genomics [Internet]. BioMed Central Ltd.; 2019 [cited 2021 May 5];12:1–18. Available from: https://doi.org/10.1186/s12920-019-0643-z

116

162. Chen X, Deane NG, Lewis KB, Li J, Zhu J, Washington MK, et al. Comparison of Nanostring nCounter® Data on FFPE Colon Cancer Samples and Affymetrix Microarray Data on Matched Frozen Tissues. PLoS One [Internet]. Public Library of Science; 2016 [cited 2021 Apr 6];11. Available from: /pmc/articles/PMC4866771/

163. Schmeller J, Wessolly M, Mairinger E, Borchert S, Hager T, Mairinger T, et al. Setting out the frame conditions for feasible use of FFPE derived RNA. Pathol Res Pract. Elsevier GmbH; 2019;215:381–6.

164. Vermeulen J, De Preter K, Lefever S, Nuytens J, De Vloed F, Derveaux S, et al. Measurable impact of RNA quality on gene expression results from quantitative PCR. Nucleic Acids Res [Internet]. Oxford University Press; 2011 [cited 2021 May 5];39:e63. Available from: /pmc/articles/PMC3089491/

165. Mathieson W, Thomas GA. Why Formalin-fixed, Paraffin-embedded Biospecimens Must Be Used in Genomic Medicine: An Evidence-based Review and Conclusion. J Histochem Cytochem [Internet]. SAGE Publications Ltd; 2020 [cited 2021 May 5];68:543–52. Available from: /pmc/articles/PMC7400666/

166. Gao XH, Li J, Gong HF, Yu GY, Liu P, Hao LQ, et al. Comparison of Fresh Frozen Tissue With Formalin-Fixed Paraffin-Embedded Tissue for Mutation Analysis Using a Multi-Gene Panel in Patients With Colorectal Cancer. Front Oncol [Internet]. Frontiers Media S.A.; 2020 [cited 2021 May 5];10:310. Available from: http://genetics.

167. Bondar G, Xu W, Elashoff D, Li X, Faure-Kumar E, Bao T-M, et al. Comparing NGS and NanoString platforms in peripheral blood mononuclear cell transcriptome profiling for advanced heart failure biomarker development. J Biol Methods [Internet]. Journal of Biological Methods; 2020 [cited 2021 May 5];7:e123. Available from: /pmc/articles/PMC6974694/

168. Technology for Research and Translational Medicine | NanoString [Internet]. [cited 2021 May 5]. Available from: https://www.nanostring.com/

169. Eastel JM, Lam KW, Lee NL, Lok WY, Tsang AHF, Pei XM, et al. Application of NanoString technologies in companion diagnostic development [Internet]. Expert Rev. Mol. Diagn. Taylor and Francis Ltd; 2019 [cited 2021 Apr 6]. page 591–8. Available from: https://www.tandfonline.com/doi/abs/10.1080/14737159.2019.1623672

170. Goytain A, Ng T. NanoString nCounter Technology: High-Throughput RNA Validation. Methods Mol Biol [Internet]. Humana Press Inc.; 2020 [cited 2021 May 5]. page 125–39. Available from: https://link.springer.com/protocol/10.1007/978-1-4939-9904-0_10

171. Stark R, Grzelak M, Hadfield J. RNA sequencing: the teenage years [Internet]. Nat. Rev. Genet. Nature Publishing Group; 2019 [cited 2020 Oct 29]. page 631–56. Available from: www.nature.com/nrg

117

172. Kukurba KR, Montgomery SB. RNA sequencing and analysis. Cold Spring Harb Protoc [Internet]. Cold Spring Harbor Laboratory Press; 2015 [cited 2021 Apr 6];2015:951–69. Available from: /pmc/articles/PMC4863231/

173. MAN-C0011-04 Gene Expression Data Analysis Guidelines FOR RESEARCH USE ONLY. Not for use in diagnostic procedures [Internet]. 2017. Available from: https://www.nanostring.com/products/analysis-software/nsolver,

174. Reis PP, Waldron L, Goswami RS, Xu W, Xuan Y, Perez-Ordonez B, et al. MRNA transcript quantification in archival samples using multiplexed, color-coded probes. BMC Biotechnol [Internet]. BMC Biotechnol; 2011 [cited 2021 May 5];11. Available from: https://pubmed.ncbi.nlm.nih.gov/21549012/

175. Foundations of Machine Learning, second edition - Mehryar Mohri, Afshin Rostamizadeh, Ameet Talwalkar - Google Books [Internet]. [cited 2021 May 11]. Available from: https://books.google.ca/books?hl=en&lr=&id=dWB9DwAAQBAJ&oi=fnd&pg=PR5&dq =machine+learning&ots=AypP- TpYn_&sig=Y4uSg4A2ksU4JEMB0_uRco98Mkk&redir_esc=y#v=onepage&q=machine learning&f=false

176. Xu C, Jackson SA. Machine learning and complex biological data [Internet]. Genome Biol. BioMed Central Ltd.; 2019 [cited 2021 May 12]. page 1–4. Available from: https://doi.org/10.1186/s13059-019-1689-0

177. Differential gene expression analysis | Functional genomics II [Internet]. [cited 2021 May 11]. Available from: https://www.ebi.ac.uk/training/online/courses/functional-genomics- ii-common-technologies-and-data-analysis-methods/rna-sequencing/performing-a-rna- seq-experiment/data-analysis/differential-gene-expression-analysis/

178. Greenland S, Senn SJ, Rothman KJ, Carlin JB, Poole C, Goodman SN, et al. Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations. Eur J Epidemiol [Internet]. Springer Netherlands; 2016 [cited 2021 May 3];31:337–50. Available from: /pmc/articles/PMC4877414/

179. Camacho DM, Collins KM, Powers RK, Costello JC, Collins JJ. Next-Generation Machine Learning for Biological Networks. Cell. Cell Press; 2018. page 1581–92.

180. T Akinsola JE, Jet A, O HJ. Supervised Machine Learning Algorithms: Classification and Comparison Emerging Memory Technologies View project The Use Of BIG DATA in Mobile Analytics View project Supervised Machine Learning Algorithms: Classification and Comparison. Int J Comput Trends Technol [Internet]. 2017 [cited 2021 May 4];48. Available from: http://www.ijcttjournal.org

181. Usama M, Qadir J, Raza A, Arif H, Yau KLA, Elkhatib Y, et al. Unsupervised Machine Learning for Networking: Techniques, Applications and Research Challenges. IEEE

118

Access. Institute of Electrical and Electronics Engineers Inc.; 2019;7:65579–615. 182. Chen L. Curse of Dimensionality. Encycl Database Syst [Internet]. Springer US; 2009 [cited 2021 May 4]. page 545–6. Available from: https://link.springer.com/referenceworkentry/10.1007/978-0-387-39940-9_133

183. Cai J, Luo J, Wang S, Yang S. Feature selection in machine learning: A new perspective. Neurocomputing. Elsevier B.V.; 2018;300:70–9.

184. Discriminative Feature - an overview | ScienceDirect Topics [Internet]. [cited 2021 May 24]. Available from: https://www.sciencedirect.com/topics/computer- science/discriminative-feature

185. Introduction to Feature Selection - MATLAB & Simulink [Internet]. [cited 2021 May 22]. Available from: https://www.mathworks.com/help/stats/feature-selection.html

186. What is Unsupervised Learning? | IBM [Internet]. [cited 2021 May 4]. Available from: https://www.ibm.com/cloud/learn/unsupervised-learning

187. Gentleman R, Carey VJ. Unsupervised Machine Learning. Bioconductor Case Stud [Internet]. Springer New York; 2008 [cited 2021 May 12]. page 137–57. Available from: https://link.springer.com/chapter/10.1007/978-0-387-77240-0_10

188. Murtagh F, Contreras P. Methods of Hierarchical Clustering. 2011 [cited 2021 May 12]; Available from: http://arxiv.org/abs/1105.0121

189. Irani J, Pise N, Phatak M. Clustering Techniques and the Similarity Measures used in Clustering: A Survey. Int. J. Comput. Appl. 2016.

190. Measures of Similarity and Distance [Internet]. [cited 2021 May 4]. Available from: http://www.analytictech.com/mb876/handouts/distance_and_correlation.htm

191. Choi S-S, Cha S-H, Tappert CC. A Survey of Binary Similarity and Distance Measures.

192. Tamasauskas D, Sakalauskas V, Kriksciuniene D. Evaluation framework of hierarchical clustering methods for binary data. Proc 2012 12th Int Conf Hybrid Intell Syst HIS 2012. 2012. page 421–6.

193. Carlsson G, Mémoli F. Multiparameter hierarchical clustering methods. Stud Classif Data Anal Knowl Organ [Internet]. Kluwer Academic Publishers; 2010 [cited 2021 May 12]. page 63–70. Available from: https://link.springer.com/chapter/10.1007/978-3-642-10745- 0_6

194. Single-Link, Complete-Link & Average-Link Clustering [Internet]. [cited 2021 May 12]. Available from: https://nlp.stanford.edu/IR-book/completelink.html

119

195. What is Supervised Learning? | IBM [Internet]. [cited 2021 May 12]. Available from: https://www.ibm.com/cloud/learn/supervised-learning

196. Whitley D, Watson JP. Complexity theory and the no free lunch theorem. Search Methodol Introd Tutorials Optim Decis Support Tech [Internet]. Springer US; 2005 [cited 2021 May 4]. page 317–39. Available from: https://link.springer.com/chapter/10.1007/0- 387-28356-0_11

197. Gómez D, Rojas A. An empirical overview of the no free lunch theorem and its effect on real-world machine learning classification. Neural Comput [Internet]. MIT Press Journals; 2016 [cited 2021 Jan 16];28:216–28. Available from: https://www.mitpressjournals.org/doix/abs/10.1162/NECO_a_00793

198. machine learning - How to calculate the fold number (k-fold) in cross validation? - Data Science Stack Exchange [Internet]. [cited 2021 May 4]. Available from: https://datascience.stackexchange.com/questions/28158/how-to-calculate-the-fold- number-k-fold-in-cross-validation

199. RecoverAllTM Total Nucleic Acid Isolation Kit for FFPE [Internet]. [cited 2020 Dec 17]. Available from: https://www.thermofisher.com/order/catalog/product/AM1975?ca&en#/AM1975?ca&en

200. Analysis Software User Manual nSolver 4.0 Analysis Software User Manual FOR RESEARCH USE ONLY. Not for use in diagnostic procedures [Internet]. 2018. Available from: www.nanostring.com

201. Downloading IBM SPSS Statistics 26 [Internet]. [cited 2020 Dec 21]. Available from: https://www.ibm.com/support/pages/downloading-ibm-spss-statistics-26

202. Sequential feature selection using custom criterion - MATLAB sequentialfs [Internet]. [cited 2021 May 7]. Available from: https://www.mathworks.com/help/stats/sequentialfs.html

203. Sequential feature selection using custom criterion - MATLAB sequentialfs [Internet]. [cited 2021 May 23]. Available from: https://www.mathworks.com/help/stats/sequentialfs.html

204. Working with the Clustergram Function - MATLAB & Simulink [Internet]. [cited 2021 May 8]. Available from: https://www.mathworks.com/help/bioinfo/ug/working-with-the- clustergram-function.html

205. MATLAB - MathWorks - MATLAB & Simulink [Internet]. [cited 2020 Dec 21]. Available from: https://www.mathworks.com/products/matlab.html

120

206. Gervès-Pinquié C, Girault A, Phillips S, Raskin S, Pratt-Chapman M. Economic evaluation of patient navigation programs in colorectal cancer care, a systematic review [Internet]. Health Econ. Rev. Springer Verlag; 2018 [cited 2020 Dec 21]. page 12. Available from: https://healtheconomicsreview.biomedcentral.com/articles/10.1186/s13561-018-0196-4

207. Hetz C, Zhang K, Kaufman RJ. Mechanisms, regulation and functions of the unfolded protein response [Internet]. Nat. Rev. Mol. Cell Biol. Nature Research; 2020 [cited 2021 Jun 20]. page 421–38. Available from: www.nature.com/nrm

208. Berry C, Lal M, Binukumar BK. Crosstalk between the unfolded protein response, microRNAs, and insulin signaling pathways: In search of biomarkers for the diagnosis and treatment of type 2 diabetes [Internet]. Front. Endocrinol. (Lausanne). Frontiers Media S.A.; 2018 [cited 2021 Jun 20]. page 1. Available from: www.frontiersin.org

209. Back SH, Kaufman RJ. Endoplasmic reticulum stress and type 2 diabetes. Annu Rev Biochem. 2012;81:767–93.

210. Siri M, Hosseini S V, Dastghaib S, Mokarram P. Crosstalk between Endoplasmic Reticulum Stress, Unfolded Protein Response (UPR) and Wnt Signaling Pathway in Cancer. Ann Color Res [Internet]. Shiraz University of Medical Sciences; 2020 [cited 2021 Jun 4];8:9–16. Available from: http://colorectalresearch.sums.ac.ir/

211. Deldicque L, Cani PD, Philp A, Raymackers JM, Meakin PJ, Ashford MLJ, et al. The unfolded protein response is activated in skeletal muscle by high-fat feeding: Potential role in the downregulation of protein synthesis. Am J Physiol - Endocrinol Metab [Internet]. American Physiological Society; 2010 [cited 2021 Jun 20];299:695–705. Available from: http://www.ajpendo.org

212. Amen OM, Sarker SD, Ghildyal R, Arya A. Endoplasmic reticulum stress activates unfolded protein response signaling and mediates inflammation, obesity, and cardiac dysfunction: Therapeutic and molecular approach. Front Pharmacol [Internet]. Frontiers Media S.A.; 2019 [cited 2021 Jun 20];10:977. Available from: www.frontiersin.org

213. Liang G, Fang X, Yang Y, Song Y. Knockdown of CEMIP suppresses proliferation and induces apoptosis in colorectal cancer cells: Downregulation of GRP78 and attenuation of unfolded protein response. Biochem Cell Biol [Internet]. Canadian Science Publishing; 2018 [cited 2021 Jun 20];96:332–41. Available from: https://cdnsciencepub.com/doi/abs/10.1139/bcb-2017-0151

214. Gong C, Hu X, Xu Y, Yang J, Zong L, Wang C, et al. Berberine inhibits proliferation and migration of colorectal cancer cells by downregulation of GRP78. Anticancer Drugs. Lippincott Williams and Wilkins; 2020;141–9.

121

215. Bartoszewska S, Collawn JF. Unfolded protein response (UPR) integrated signaling networks determine cell fate during hypoxia [Internet]. Cell. Mol. Biol. Lett. BioMed Central Ltd.; 2020 [cited 2021 Jun 21]. page 1–20. Available from: https://doi.org/10.1186/s11658-020-00212-1

216. TMEM132A transmembrane protein 132A [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/54972

217. Jagannathan S, Abdel-Malek MAY, Malek E, Vad N, Latif T, Anderson KC, et al. Pharmacologic screens reveal metformin that suppresses GRP78-dependent autophagy to enhance the anti-myeloma effect of bortezomib. Leukemia [Internet]. Nature Publishing Group; 2015 [cited 2021 Jun 6];29:2184–91. Available from: https://pubmed.ncbi.nlm.nih.gov/26108695/

218. Qian Y, Wong CC, Xu J, Chen H, Zhang Y, Kang W, et al. Sodium channel subunit SCNN1B suppresses gastric cancer growth and metastasis via GRP78 degradation. Cancer Res [Internet]. American Association for Cancer Research Inc.; 2017 [cited 2021 Jun 6];77:1968–82. Available from: http://cancerres.aacrjournals.org/

219. Jin X, Kim DK, Riew TR, Kim HL, Lee MY. Cellular and Subcellular Localization of Endoplasmic Reticulum Chaperone GRP78 Following Transient Focal Cerebral Ischemia in Rats. Neurochem Res [Internet]. Springer New York LLC; 2018 [cited 2021 Jun 21];43:1348–62. Available from: https://pubmed.ncbi.nlm.nih.gov/29774449/

220. Schwarz DS, Blower MD. The endoplasmic reticulum: Structure, function and response to cellular signaling [Internet]. Cell. Mol. Life Sci. Birkhauser Verlag AG; 2016 [cited 2021 Jun 21]. page 79–94. Available from: /pmc/articles/PMC4700099/

221. Wang M, Wey S, Zhang Y, Ye R, Lee AS. Role of the unfolded protein response regulator GRP78/BiP in development, cancer, and neurological disorders [Internet]. Antioxidants Redox Signal. Mary Ann Liebert, Inc.; 2009 [cited 2021 Jun 21]. page 2307–16. Available from: /pmc/articles/PMC2819800/

222. Park KW, Eun Kim G, Morales R, Moda F, Moreno-Gonzalez I, Concha-Marambio L, et al. The Endoplasmic Reticulum Chaperone GRP78/BiP Modulates Prion Propagation in vitro and in vivo. Sci Rep [Internet]. Nature Publishing Group; 2017 [cited 2021 Jun 21];7:1–13. Available from: www.nature.com/scientificreports

223. Walter P, Ron D. The unfolded protein response: From stress pathway to homeostatic regulation [Internet]. Science (80-. ). American Association for the Advancement of Science; 2011 [cited 2021 Jun 21]. page 1081–6. Available from: https://science.sciencemag.org/content/334/6059/1081

122

224. Karagöz GE, Acosta-Alvear D, Walter P. The unfolded protein response: Detecting and responding to fluctuations in the protein-folding capacity of the endoplasmic reticulum. Cold Spring Harb Perspect Biol [Internet]. Cold Spring Harbor Laboratory Press; 2019 [cited 2021 Jun 21];11. Available from: https://pubmed.ncbi.nlm.nih.gov/30670466/

225. Hamanaka RB, Bobrovnikova-Marjon E, Ji X, Liebhaber SA, Diehl JA. PERK-dependent regulation of IAP translation during ER stress. Oncogene [Internet]. Nature Publishing Group; 2009 [cited 2021 Jun 21];28:910–20. Available from: www.nature.com/onc

226. Baird TD, Wek RC. Eukaryotic initiation factor 2 phosphorylation and translational control in metabolism [Internet]. Adv. Nutr. Oxford University Press; 2012 [cited 2021 Jun 25]. page 307–21. Available from: /pmc/articles/PMC3649462/

227. Bae D, Moore KA, Mella JM, Hayashi SY, Hollien J. Degradation of Blos1 mRNA by IRE1 repositions lysosomes and protects cells from stress. J Cell Biol [Internet]. Rockefeller University Press; 2019 [cited 2021 Jun 25];218:1118–27. Available from: https://doi.org/10.1083/jcb.201809027

228. Coelho DS, Domingos PM. Physiological roles of regulated Ire1 dependent decay. Front Genet [Internet]. Frontiers Research Foundation; 2014 [cited 2021 Jun 25];5. Available from: /pmc/articles/PMC3997004/

229. Prischi F, Nowak PR, Carrara M, Ali MMU. Phosphoregulation of Ire1 RNase splicing activity. Nat Commun [Internet]. Nature Publishing Group; 2014 [cited 2021 Jun 21];5:1– 11. Available from: www.nature.com/naturecommunications

230. Yoshida H, Matsui T, Yamamoto A, Okada T, Mori K. XBP1 mRNA is induced by ATF6 and spliced by IRE1 in response to ER stress to produce a highly active transcription factor. Cell [Internet]. Cell Press; 2001 [cited 2021 Jun 21];107:881–91. Available from: https://pubmed.ncbi.nlm.nih.gov/11779464/

231. Zhang M, Han N, Jiang Y, Wang J, Li G, Lv X, et al. EGFR confers radioresistance in human oropharyngeal carcinoma by activating endoplasmic reticulum stress signaling PERK-eIF2α-GRP94 and IRE1α-XBP1-GRP78. Cancer Med [Internet]. Blackwell Publishing Ltd; 2018 [cited 2021 Jun 21];7:6234–46. Available from: https://onlinelibrary.wiley.com/doi/full/10.1002/cam4.1862

232. Wang Y, Xing P, Cui W, Wang W, Cui Y, Ying G, et al. Acute endoplasmic reticulum stress-independent unconventional splicing of XBP1 mRNA in the nucleus of mammalian cells. Int J Mol Sci [Internet]. MDPI AG; 2015 [cited 2021 Jun 25];16:13302–21. Available from: www.mdpi.com/journal/ijmsArticle

233. Shuda M, Kondoh N, Imazeki N, Tanaka K, Okada T, Mori K, et al. Activation of the ATF6, XBP1 and grp78 genes in human hepatocellular carcinoma: A possible involvement of the ER stress pathway in hepatocarcinogenesis. J Hepatol. Elsevier; 2003;38:605–14.

123

234. Xie Y, Liu C, Qin Y, Chen J, Fang J. Knockdown of IRE1ɑ suppresses metastatic potential of colon cancer cells through inhibiting FN1-Src/FAK-GTPases signaling. Int J Biochem Cell Biol. Elsevier Ltd; 2019;114:105572.

235. Baumeister P, Luo S, Skarnes WC, Sui G, Seto E, Shi Y, et al. Endoplasmic Reticulum Stress Induction of the Grp78/BiP Promoter: Activating Mechanisms Mediated by YY1 and Its Interactive Chromatin Modifiers. Mol Cell Biol [Internet]. American Society for Microbiology; 2005 [cited 2021 Jun 21];25:4529–40. Available from: /pmc/articles/PMC1140640/

236. Oh-hashi K, Naruse Y, Amaya F, Shimosato G, Tanaka M. Cloning and characterization of a novel GRP78-binding protein in the rat brain. J Biol Chem [Internet]. J Biol Chem; 2003 [cited 2021 Jul 4];278:10531–7. Available from: https://pubmed.ncbi.nlm.nih.gov/12514190/

237. Oh-hashi K, Hirata Y, Koga H, Kiuchi K. GRP78-binding protein regulates cAMP- induced glial fibrillary acidic protein expression in rat C6 glioblastoma cells. FEBS Lett [Internet]. FEBS Lett; 2006 [cited 2021 Jul 4];580:3943–7. Available from: https://pubmed.ncbi.nlm.nih.gov/16806201/

238. Oh-Hashi K, Imai K, Koga H, Hirata Y, Kiuchi K. Knockdown of transmembrane protein 132A by RNA interference facilitates serum starvation-induced cell death in Neuro2a cells. Mol Cell Biochem [Internet]. Springer; 2010 [cited 2021 Jul 4];342:117–23. Available from: https://link.springer.com/article/10.1007/s11010-010-0475-9

239. Oh-Hashi K, Koga H, Nagase T, Hirata Y, Kiuchi K. Characterization of the expression and cell-surface localization of transmembrane protein 132A. Mol Cell Biochem [Internet]. Mol Cell Biochem; 2012 [cited 2021 Jul 4];370:23–33. Available from: https://pubmed.ncbi.nlm.nih.gov/22821197/

240. Oh-hashi K, Sone A, Hikiji T, Hirata Y, Vitiello M, Fedele M, et al. Transcriptional and post-transcriptional regulation of transmembrane protein 132A. Mol Cell Biochem [Internet]. Kluwer Academic Publishers; 2015 [cited 2021 Jul 4];405:291–9. Available from: https://pubmed.ncbi.nlm.nih.gov/25926156/

241. Kim C, Kim B. Anti-cancer natural products and their bioactive compounds inducing ER stress-mediated apoptosis: A review [Internet]. Nutrients. MDPI AG; 2018 [cited 2021 Jun 21]. page 1021. Available from: www.mdpi.com/journal/nutrients

242. Thakur G, Sathe G, Kundu I, Biswas B, Gautam P, Alkahtani S, et al. Membrane Interactome of a Recombinant Fragment of Human Surfactant Protein D Reveals GRP78 as a Novel Binding Partner in PC3, a Metastatic Prostate Cancer Cell Line. Front Immunol [Internet]. Frontiers Media S.A.; 2021 [cited 2021 Jun 21];11:1. Available from: www.frontiersin.org

124

243. SCNN1B sodium channel epithelial 1 subunit beta [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 Jun 21]. Available from: https://www.ncbi.nlm.nih.gov/gene/6338

244. SCNN1A - Amiloride-sensitive sodium channel subunit alpha - Homo sapiens (Human) - SCNN1A gene & protein [Internet]. [cited 2021 May 16]. Available from: https://www.uniprot.org/uniprot/P37088

245. Roll JD, Rivenbark AG, Jones WD, Coleman WB. DNMT3b overexpression contributes to a hypermethylator phenotype in human breast cancer cell lines. Mol Cancer [Internet]. Mol Cancer; 2008 [cited 2021 Jun 21];7. Available from: https://pubmed.ncbi.nlm.nih.gov/18221536/

246. Roll JD, Rivenbark AG, Sandhu R, Parker JS, Jones WD, Carey LA, et al. Dysregulation of the epigenome in triple-negative breast cancers: Basal-like and claudin-low breast cancers express aberrant DNA hypermethylation. Exp Mol Pathol [Internet]. Academic Press Inc.; 2013 [cited 2021 Jun 21];95:276–87. Available from: https://pubmed.ncbi.nlm.nih.gov/24045095/

247. Clarke HJ, Chambers JE, Liniker E, Marciniak SJ. Endoplasmic Reticulum Stress in Malignancy. Cancer Cell. Cell Press; 2014. page 563–73.

248. Taguchi Y, Horiuchi Y, Kano F, Murata M. Novel prosurvival function of Yip1A in human cervical cancer cells: Constitutive activation of the IRE1 and PERK pathways of the unfolded protein response. Cell Death Dis [Internet]. Nature Publishing Group; 2017 [cited 2021 Jun 21];8:e2718–e2718. Available from: www.nature.com/cddis

249. Inageda K. Insulin modulates induction of glucose-regulated protein 78 during endoplasmic reticulum stress via augmentation of ATF4 expression in human neuroblastoma cells. FEBS Lett. No longer published by Elsevier; 2010;584:3649–54.

250. Dong Y, Fernandes C, Liu Y, Wu Y, Wu H, Brophy ML, et al. Role of endoplasmic reticulum stress signalling in diabetic endothelial dysfunction and atherosclerosis [Internet]. Diabetes Vasc. Dis. Res. SAGE Publications Ltd; 2017 [cited 2021 Jun 22]. page 14–23. Available from: /pmc/articles/PMC5161113/

251. Raciti GA, Iadicicco C, Ulianich L, Vind BF, Gaster M, Andreozzi F, et al. Glucosamine- induced endoplasmic reticulum stress affects GLUT4 expression via activating transcription factor 6 in rat and human skeletal muscle cells. Diabetologia [Internet]. Diabetologia; 2010 [cited 2021 Jun 22];53:955–65. Available from: https://pubmed.ncbi.nlm.nih.gov/20165829/

125

252. Brown M, Dainty S, Strudwick N, Mihai AD, Watson JN, Dendooven R, et al. Endoplasmic reticulum stress causes insulin resistance by inhibiting delivery of newly synthesized insulin receptors to the cell surface. Mol Biol Cell [Internet]. American Society for Cell Biology; 2020 [cited 2021 Jun 22];31:2597–629. Available from: https://www.molbiolcell.org/doi/abs/10.1091/mbc.E18-01-0013

253. Moradipoor S, Ismail P, Etemad A, Wan Sulaiman WA, Ahmadloo S. Expression Profiling of Genes Related to Endothelial Cells Biology in Patients with Type 2 Diabetes and Patients with Prediabetes. Biomed Res Int. Hindawi Limited; 2016;2016.

254. Dauer P, Sharma NS, Gupta VK, Nomura A, Dudeja V, Saluja A, et al. GRP78-mediated antioxidant response and ABC transporter activity confers chemoresistance to pancreatic cancer cells. Mol Oncol [Internet]. John Wiley and Sons Ltd.; 2018 [cited 2021 Jun 22];12:1498–512. Available from: https://pubmed.ncbi.nlm.nih.gov/29738634/

255. Gifford JB, Huang W, Zeleniak AE, Hindoyan A, Wu H, Donahue TR, et al. Expression of GRP78, master regulator of the unfolded protein response, increases chemoresistance in pancreatic ductal adenocarcinoma. Mol Cancer Ther [Internet]. American Association for Cancer Research Inc.; 2016 [cited 2021 Jun 22];15:1043–52. Available from: https://pubmed.ncbi.nlm.nih.gov/26939701/

256. Gu YJ, Li HD, Zhao L, Zhao S, He W Bin, Rui L, et al. GRP78 confers the resistance to 5-FU by activating the c-Src/LSF/TS Axis in hepatocellular carcinoma. Oncotarget [Internet]. Impact Journals LLC; 2015 [cited 2021 Jun 22];6:33658–74. Available from: /pmc/articles/PMC4741793/

257. Yao X, Tu Y, Xu Y, Guo Y, Yao F, Zhang X. Endoplasmic reticulum stress confers 5- fluorouracil resistance in breast cancer cell via the GRP78/OCT4/lncRNA MIAT/AKT pathway. Am J Cancer Res [Internet]. e-Century Publishing Corporation; 2020 [cited 2021 Jun 22];10:838–55. Available from: http://www.ncbi.nlm.nih.gov/pubmed/32266094

258. Chern YJ, Wong JCT, Cheng GSW, Yu A, Yin Y, Schaeffer DF, et al. The interaction between SPARC and GRP78 interferes with ER stress signaling and potentiates apoptosis via PERK/eIF2α and IRE1α/XBP-1 in colorectal cancer. Cell Death Dis [Internet]. Nature Publishing Group; 2019 [cited 2021 Jun 22];10:1–14. Available from: https://doi.org/10.1038/s41419-019-1687-x

259. Avril T, Vauléon E, Chevet E. Endoplasmic reticulum stress signaling and chemotherapy resistance in solid cancers. Oncogenesis [Internet]. Springer Science and Business Media LLC; 2017 [cited 2021 Jun 22];6:e373–e373. Available from: https://pubmed.ncbi.nlm.nih.gov/28846078/

260. Urra H, Dufey E, Avril T, Chevet E, Hetz C. Endoplasmic Reticulum Stress and the Hallmarks of Cancer [Internet]. Trends in Cancer. Cell Press; 2016 [cited 2021 Jun 22]. page 252–62. Available from: https://pubmed.ncbi.nlm.nih.gov/28741511/

126

261. Chevet E, Hetz C, Samali A. Endoplasmic reticulum stress–activated cell reprogramming in oncogenesis [Internet]. Cancer Discov. American Association for Cancer Research Inc.; 2015 [cited 2021 Jun 22]. page 586–97. Available from: https://pubmed.ncbi.nlm.nih.gov/25977222/

262. Wu MJ, Jan CI, Tsay YG, Yu YH, Huang CY, Lin SC, et al. Elimination of head and neck cancer initiating cells through targeting glucose regulated protein78 signaling. Mol Cancer [Internet]. Mol Cancer; 2010 [cited 2021 Jun 22];9. Available from: https://pubmed.ncbi.nlm.nih.gov/20979610/

263. Huang LW, Lin CY, Lee CC, Liu TZ, Jeng CJ. Overexpression of GRP78 is associated with malignant transformation in epithelial ovarian tumors. Appl Immunohistochem Mol Morphol [Internet]. Appl Immunohistochem Mol Morphol; 2012 [cited 2021 Jun 22];20:381–5. Available from: https://pubmed.ncbi.nlm.nih.gov/22495374/

264. Xi J, Chen Y, Huang S, Cui F, Wang X. Suppression of GRP78 sensitizes human colorectal cancer cells to oxaliplatin by downregulation of CD24. Oncol Lett [Internet]. Spandidos Publications; 2018 [cited 2021 Jun 22];15:9861–7. Available from: http://www.spandidos-publications.com/10.3892/ol.2018.8549/abstract

265. Niu Z, Wang M, Zhou L, Yao L, Liao Q, Zhao Y. Elevated GRP78 expression is associated with poor prognosis in patients with pancreatic cancer. Sci Rep [Internet]. Nature Publishing Group; 2015 [cited 2021 Jun 22];5. Available from: https://pubmed.ncbi.nlm.nih.gov/26530532/

266. Rangel DF, Dubeau L, Park R, Chan P, Ha DP, Pulido MA, et al. Endoplasmic reticulum chaperone GRP78/BiP is critical for mutant Kras-driven lung tumorigenesis. Oncogene [Internet]. Springer Nature; 2021 [cited 2021 Jun 22];40:3624–32. Available from: https://doi.org/10.1038/s41388-021-01791-9

267. Lee AS. GRP78 induction in cancer: Therapeutic and prognostic implications [Internet]. Cancer Res. Cancer Res; 2007 [cited 2021 Jun 22]. page 3496–9. Available from: https://pubmed.ncbi.nlm.nih.gov/17440054/

268. Dong D, Stapleton C, Luo B, Xiong S, Ye W, Zhang Y, et al. A critical role for GRP78/BiP in the tumor microenvironment for neovascularization during tumor growth and metastasis. Cancer Res [Internet]. NIH Public Access; 2011 [cited 2021 Jun 22];71:2848–57. Available from: /pmc/articles/PMC3078191/

269. Bian T, Tagmount A, Vulpe C, Vijendra KC, Xing C. CXL146, a novel 4H-chromene derivative, targets GRP78 to selectively eliminate multidrug-resistant cancer cells. Mol Pharmacol [Internet]. American Society for Pharmacology and Experimental Therapy; 2020 [cited 2021 Jun 22];97:402–8. Available from: https://pubmed.ncbi.nlm.nih.gov/32276963/

127

270. Conner C, Lager TW, Guldner IH, Wu MZ, Hishida Y, Hishida T, et al. Cell surface GRP78 promotes stemness in normal and neoplastic cells. Sci Rep [Internet]. Nature Research; 2020 [cited 2021 Jun 22];10:1–11. Available from: https://www.nature.com/articles/s41598-020-60269-y

271. Leonard A, Grose V, Paton AW, Paton JC, Yule DI, Rahman A, et al. Selective Inactivation of Intracellular BiP/GRP78 Attenuates Endothelial Inflammation and Permeability in Acute Lung Injury. Sci Rep [Internet]. Nature Publishing Group; 2019 [cited 2021 Jun 22];9:1–13. Available from: https://www.nature.com/articles/s41598-018- 38312-w

272. Casas C. GRP78 at the centre of the stage in cancer and neuroprotection [Internet]. Front. Neurosci. Frontiers Research Foundation; 2017 [cited 2021 Jun 22]. page 177. Available from: www.frontiersin.org

273. Fu R, Yang P, Wu HL, Li ZW, Li ZY. GRP78 secreted by colon cancer cells facilitates cell proliferation via PI3K/Akt signaling. Asian Pacific J Cancer Prev [Internet]. Asian Pacific Organization for Cancer Prevention; 2014 [cited 2021 Jun 22];15:7245–9. Available from: https://pubmed.ncbi.nlm.nih.gov/25227822/

274. Malinowsky K, Nitsche U, Janssen KP, Bader FG, Späth C, Drecoll E, et al. Activation of the PI3K/AKT pathway correlates with prognosis in stage II colon cancer. Br J Cancer [Internet]. Nature Publishing Group; 2014 [cited 2021 Jun 23];110:2081–9. Available from: www.bjcancer.com

275. Efferth T. Signal Transduction Pathways of the Epidermal Growth Factor Receptor in Colorectal Cancer and their Inhibition by Small Molecules. Curr Med Chem [Internet]. Bentham Science Publishers Ltd.; 2012 [cited 2021 Jun 23];19:5735–44. Available from: https://pubmed.ncbi.nlm.nih.gov/23033949/

276. Van Krieken R, Mehta N, Wang T, Zheng M, Li R, Gao B, et al. Cell surface expression of 78-kDa glucose-regulated protein (GRP78) mediates diabetic nephropathy. J Biol Chem [Internet]. Elsevier BV; 2019 [cited 2021 Jun 4];294:7755–68. Available from: https://pubmed.ncbi.nlm.nih.gov/30914477/

277. Bian J, Dannappel M, Wan C, Firestein R. Transcriptional Regulation of Wnt/β-Catenin Pathway in Colorectal Cancer [Internet]. Cells. NLM (Medline); 2020 [cited 2021 Jun 23]. Available from: /pmc/articles/PMC7564852/

278. Li X, Hu W, Zhou J, Huang Y, Peng J, Yuan Y, et al. CLCA1 suppresses colorectal cancer aggressiveness via inhibition of the Wnt/beta-catenin signaling pathway. Cell Commun Signal [Internet]. BioMed Central Ltd.; 2017 [cited 2021 Jun 23];15:1–13. Available from: https://biosignaling.biomedcentral.com/articles/10.1186/s12964-017- 0192-z

128

279. Li B, Niswander LA. TMEM132A, a Novel Wnt Signaling Pathway Regulator Through Wntless (WLS) Interaction. Front Cell Dev Biol [Internet]. Frontiers Media S.A.; 2020 [cited 2021 Jul 5];8:1426. Available from: www.frontiersin.org

280. FN1 fibronectin 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 Apr 24]. Available from: https://www.ncbi.nlm.nih.gov/gene/2335

281. Cai X, Liu C, Zhang TN, Zhu YW, Dong X, Xue P. Down-regulation of FN1 inhibits colorectal carcinogenesis by suppressing proliferation, migration, and invasion. J Cell Biochem [Internet]. Wiley-Liss Inc.; 2018 [cited 2021 Jun 24];119:4717–28. Available from: https://pubmed.ncbi.nlm.nih.gov/29274284/

282. Li B, Shen W, Peng H, Li Y, Chen F, Zheng L, et al.

Fibronectin 1 promotes melanoma proliferation and metastasis by inhibiting apoptosis and regulating EMT

. Onco Targets Ther [Internet]. Dove Medical Press Ltd.; 2019 [cited 2021 Jun 24];Volume 12:3207–21. Available from: /pmc/articles/PMC6503329/

283. Liao YX, Zhang ZP, Zhao J, Liu JP. Effects of Fibronectin 1 on Cell Proliferation, Senescence and Apoptosis of Human Glioma Cells Through the PI3K/AKT Signaling Pathway. Cell Physiol Biochem [Internet]. S. Karger AG; 2018 [cited 2021 Jun 24];48:1382–96. Available from: https://pubmed.ncbi.nlm.nih.gov/30048971/

284. FAS Fas cell surface death receptor [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/355

285. Peciuliene I, Vilys L, Jakubauskiene E, Zaliauskiene L, Kanopka A. Hypoxia alters splicing of the cancer associated Fas gene. Exp Cell Res. Academic Press; 2019;380:29– 35.

286. Sehgal L, Mathur R, Braun FK, Wise JF, Berkova Z, Neelapu S, et al. FAS-antisense 1 lncRNA and production of soluble versus membrane Fas in B-cell lymphoma. Leukemia [Internet]. 2014 [cited 2021 Jun 4];28:2376–87. Available from: www.nature.com/leu

287. Xiao W, Ibrahim ML, Redd PS, Klement JD, Lu C, Yang D, et al. Loss of fas expression and function is coupled with colon cancer resistance to immune checkpoint inhibitor immunotherapy. Mol Cancer Res [Internet]. American Association for Cancer Research Inc.; 2019 [cited 2021 Mar 27];17:420–30. Available from: http://mcr.aacrjournals.org/

288. Laustriat D, Gide J, Barrault L, Chautard E, Benoit C, Auboeuf D, et al. In vitro and in vivo modulation of alternative splicing by the biguanide metformin. Mol Ther - Nucleic Acids [Internet]. Nature Publishing Group; 2015 [cited 2021 Jun 6];4:e262. Available from: /pmc/articles/PMC4877444/

289. SPART spartin [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/23111

129

290. He L, Fan X, Li Y, Cui B, Shi Z, Zhou D, et al. Aberrant methylation status of SPG20 promoter in hepatocellular carcinoma: A potential tumor metastasis biomarker. Cancer Genet. Elsevier Inc.; 2019;233–234:48–55.

291. Lind GE, Raiborg C, Danielsen SA, Rognum TO, Thiis-Evensen E, Hoff G, et al. SPG20, a novel biomarker for early detection of colorectal cancer, encodes a regulator of cytokinesis. Oncogene [Internet]. Nature Publishing Group; 2011 [cited 2021 Jun 30];30:3967–78. Available from: /pmc/articles/PMC3174365/

292. Boroumand-Noughabi S, Sima HR, Ghaffarzadehgan K, Jafarzadeh M, Raziee HR, Hosseinnezhad H, et al. Soluble Fas might serve as a diagnostic tool for gastric adenocarcinoma. BMC Cancer [Internet]. BioMed Central; 2010 [cited 2021 Jun 23];10:275. Available from: http://www.biomedcentral.com/1471-2407/10/275

293. Jing X, Yang F, Shao C, Wei K, Xie M, Shen H, et al. Role of hypoxia in cancer therapy by regulating the tumor microenvironment [Internet]. Mol. Cancer. BioMed Central Ltd.; 2019 [cited 2021 Jun 23]. page 1–15. Available from: https://doi.org/10.1186/s12943-019- 1089-9

294. Shao C, Yang F, Miao S, Liu W, Wang C, Shu Y, et al. Role of hypoxia-induced exosomes in tumor biology. [cited 2021 Jun 23]; Available from: https://doi.org/10.1186/s12943-018-0869-y

295. Li W, Liu H, Qian W, Cheng L, Yan B, Han L, et al. Hyperglycemia aggravates microenvironment hypoxia and promotes the metastatic ability of pancreatic cancer. Comput Struct Biotechnol J [Internet]. Elsevier B.V.; 2018 [cited 2021 Jun 23];16:479– 87. Available from: /pmc/articles/PMC6232646/

296. Liu W, Jing ZT, Xue CR, Wu SX, Chen WN, Lin XJ, et al. PI3K/AKT inhibitors aggravate death receptor-mediated hepatocyte apoptosis and liver injury. Toxicol Appl Pharmacol. Academic Press; 2019;381:114729.

297. Osaki M, Kase S, Adachi K, Takeda A, Hashimoto K, Ito H. Inhibition of the PI3K-Akt signaling pathway enhances the sensitivity of Fas-mediated apoptosis in human gastric carcinoma cell line, MKN-45. J Cancer Res Clin Oncol [Internet]. Springer; 2004 [cited 2021 Jun 5];130:8–14. Available from: https://link.springer.com/article/10.1007/s00432- 003-0505-z

298. Immunotherapy for Cancer - National Cancer Institute [Internet]. [cited 2021 Jun 28]. Available from: https://www.cancer.gov/about-cancer/treatment/types/immunotherapy

299. Immunotherapy for Colorectal Cancer | Immunotherapy for Rectal Cancer [Internet]. [cited 2021 Jun 28]. Available from: https://www.cancer.org/cancer/colon-rectal- cancer/treating/immunotherapy.html

130

300. Sambi M, Bagheri L, Szewczuk MR. Current challenges in cancer immunotherapy: Multimodal approaches to improve efficacy and patient response rates [Internet]. J. Oncol. Hindawi Limited; 2019 [cited 2021 Jun 28]. Available from: /pmc/articles/PMC6420990/

301. Ott PA, Bang YJ, Piha-Paul SA, Abdul Razak AR, Bennouna J, Soria JC, et al. T-cell– inflamed gene-expression profile, programmed death ligand 1 expression, and tumor mutational burden predict efficacy in patients treated with pembrolizumab across 20 cancers: KEYNOTE-028. J Clin Oncol [Internet]. American Society of Clinical Oncology; 2019 [cited 2021 Jun 28];37:318–27. Available from: https://www.researchwithrutgers.com/en/publications/t-cellinflamed-gene-expression- profile-programmed-death-ligand-1-

302. Ramirez MU, Hernandez SR, Soto-Pantoja DR, Cook KL. Endoplasmic reticulum stress pathway, the unfolded protein response, modulates immune function in the tumor microenvironment to impact tumor progression and therapeutic response [Internet]. Int. J. Mol. Sci. MDPI AG; 2020 [cited 2021 Jun 28]. Available from: /pmc/articles/PMC6981480/

303. Chen X, Cubillos-Ruiz JR. Endoplasmic reticulum stress signals in the tumour and its microenvironment [Internet]. Nat. Rev. Cancer. Nature Research; 2021 [cited 2021 Jun 28]. page 71–88. Available from: https://doi.org/10.1038/

304. Li Y, Zhang X, Wan X, Liu X, Pan W, Li N, et al. Inducing Endoplasmic Reticulum Stress to Expose Immunogens: A DNA Tetrahedron Nanoregulator for Enhanced Immunotherapy. Adv Funct Mater [Internet]. Wiley-VCH Verlag; 2020 [cited 2021 Jun 28];30:2000532. Available from: https://onlinelibrary.wiley.com/doi/full/10.1002/adfm.202000532

305. Cubillos-Ruiz JR, Bettigole SE, Glimcher LH. Tumorigenic and Immunosuppressive Effects of Endoplasmic Reticulum Stress in Cancer [Internet]. Cell. Cell Press; 2017 [cited 2021 Jun 29]. page 692–706. Available from: /pmc/articles/PMC5333759/

306. Xiao W, Ibrahim ML, Redd PS, Klement JD, Lu C, Yang D, et al. Loss of fas expression and function is coupled with colon cancer resistance to immune checkpoint inhibitor immunotherapy. Mol Cancer Res [Internet]. American Association for Cancer Research Inc.; 2019 [cited 2021 Jun 28];17:420–30. Available from: https://pubmed.ncbi.nlm.nih.gov/30429213/

307. Pietilä M, Sahgal P, Peuhu E, Jäntti NZ, Paatero I, Närvä E, et al. SORLA regulates endosomal trafficking and oncogenic fitness of HER2. Nat Commun [Internet]. Nature Publishing Group; 2019 [cited 2021 Jun 25];10:1–16. Available from: https://doi.org/10.1038/s41467-019-10275-0

308. SORL1 sortilin related receptor 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/6653

131

309. Klinger SC, Glerup S, Raarup MK, Mari MC, Nyegaard M, Koster G, et al. SorLA regulates the activity of lipoprotein lipase by intracellular trafficking. J Cell Sci. The Company of Biologists; 2011;124:1095–105.

310. Spoelgen R, Adams KW, Koker M, Thomas A V., Andersen OM, Hallett PJ, et al. Interaction of the apolipoprotein E receptors low density lipoprotein receptor-related protein and sorLA/LR11. Neuroscience [Internet]. Neuroscience; 2009 [cited 2021 Jun 25];158:1460–8. Available from: https://pubmed.ncbi.nlm.nih.gov/19047013/

311. Schmidt V, Schulz N, Yan X, Schürmann A, Kempa S, Kern M, et al. SORLA facilitates insulin receptor signaling in adipocytes and exacerbates obesity. J Clin Invest [Internet]. American Society for Clinical Investigation; 2016 [cited 2021 Jun 25];126:2706–20. Available from: https://pubmed.ncbi.nlm.nih.gov/27322061/

312. Whittle AJ, Jiang M, Peirce V, Relat J, Virtue S, Ebinuma H, et al. Soluble LR11/SorLA represses thermogenesis in adipose tissue and correlates with BMI in humans. Nat Commun [Internet]. Nature Publishing Group; 2015 [cited 2021 Jun 25];6:1–12. Available from: www.nature.com/naturecommunications

313. Nykjaer A, Lee R, Teng KK, Jansen P, Madsen P, Nielsen MS, et al. Sortilin is essential for proNGF-induced neuronal cell death. Nature [Internet]. Nature; 2004 [cited 2021 Jun 25];427:843–8. Available from: https://pubmed.ncbi.nlm.nih.gov/14985763/

314. HNF4A hepatocyte nuclear factor 4 alpha [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 8]. Available from: https://www.ncbi.nlm.nih.gov/gene/3172

315. Pan J, Silva TC, Gull N, Yang Q, Plummer JT, Chen S, et al. Lineage-specific epigenomic and genomic activation of oncogene HNF4A promotes gastrointestinal adenocarcinomas. Cancer Res [Internet]. American Association for Cancer Research Inc.; 2020 [cited 2021 Jun 25];80:2722–36. Available from: https://pubmed.ncbi.nlm.nih.gov/32332020/

316. Xu Q, Li Y, Gao X, Kang K, Williams JG, Tong L, et al. HNF4α regulates sulfur amino acid metabolism and confers sensitivity to methionine restriction in liver cancer. Nat Commun [Internet]. Nature Research; 2020 [cited 2021 Jun 25];11. Available from: https://pubmed.ncbi.nlm.nih.gov/32770044/

317. Cui YM, Jiao HL, Ye YP, Chen CM, Wang JX, Tang N, et al. FOXC2 promotes colorectal cancer metastasis by directly targeting MET. Oncogene [Internet]. Nature Publishing Group; 2015 [cited 2021 Mar 26];34:4379–90. Available from: www.nature.com/onc

318. Børretzen A, Gravdal K, Haukaas SA, Beisland C, Akslen LA, Halvorsen OJ. FOXC2 expression and epithelial–mesenchymal phenotypes are associated with castration resistance, metastasis and survival in prostate cancer. J Pathol Clin Res [Internet]. Wiley- Blackwell Publishing Ltd; 2019 [cited 2021 Jun 25];5:272–86. Available from: https://pubmed.ncbi.nlm.nih.gov/31464093/

132

319. Pan K, Xie Y. LncRNA FOXC2-AS1 enhances FOXC2 mRNA stability to promote colorectal cancer progression via activation of Ca2+-FAK signal pathway. Cell Death Dis [Internet]. Springer Nature; 2020 [cited 2021 Jun 25];11:1–14. Available from: https://doi.org/10.1038/s41419-020-2633-7

320. Jiramongkol Y, Lam EWF. FOXO transcription factor family in cancer and metastasis [Internet]. Cancer Metastasis Rev. Springer; 2020 [cited 2021 Jun 25]. page 681–709. Available from: https://doi.org/10.1007/s10555-020-09883-w

321. Fujino S, Miyoshi N, Ito A, Yasui M, Matsuda C, Ohue M, et al. HNF1A regulates colorectal cancer progression and drug resistance as a downstream of POU5F1. Sci Rep [Internet]. Springer Science and Business Media LLC; 2021 [cited 2021 Jun 25];11:10363. Available from: https://doi.org/10.1038/s41598-021-89126-2

322. Miyoshi N, Fujino S, Ohue M, Yasui M, Takahashi Y, Sugimura K, et al. The POU5F1 gene expression in colorectal cancer: a novel prognostic marker. Surg Today [Internet]. Springer Tokyo; 2018 [cited 2021 Jun 25];48:709–15. Available from: https://pubmed.ncbi.nlm.nih.gov/29488015/

323. Fujino S, Miyoshi N. OCT4 gene expression in primary colorectal cancer promotes liver metastasis. Stem Cells Int. Hindawi Limited; 2019;2019.

324. Ou R, Mo L, Tang H, Leng S, Zhu H, Zhao L, et al. circRNA-AKT1 Sequesters miR-942- 5p to Upregulate AKT1 and Promote Cervical Cancer Progression. Mol Ther - Nucleic Acids. Cell Press; 2020;20:308–22.

325. Sahlberg SH, Mortensen AC, Haglöf J, Engskog MKR, Arvidsson T, Pettersson C, et al. Different functions of AKT1 and AKT2 in molecular pathways, cell migration and metabolism in colon cancer cells. Int J Oncol [Internet]. Spandidos Publications; 2017 [cited 2021 Jun 25];50:5–14. Available from: http://www.affymetrix.

326. Xu G, Wang H, Yuan D, Yao J, Meng L, Li K, et al. RUNX1-activated upregulation of lncRNA RNCR3 promotes cell proliferation, invasion, and suppresses apoptosis in colorectal cancer via miR-1301-3p/AKT1 axis in vitro and in vivo. Clin Transl Oncol [Internet]. Springer; 2020 [cited 2021 Jun 25];22:1762–77. Available from: https://link.springer.com/article/10.1007/s12094-020-02335-5

327. Gong H, Yu C, Gang H, Zhang Y, Qing Y, Wang Y, et al. P53/microRNA-374b/AKT1 regulates colorectal cancer cell apoptosis in response to DNA damage. Int J Oncol [Internet]. Spandidos Publications; 2017 [cited 2021 Jun 25];50:1785–91. Available from: http://www.spandidos-publications.com/10.3892/ijo.2017.3922/abstract

328. TGFBR1 transforming growth factor beta receptor 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/7046

133

329. Neel J-C, Humbert L, Lebrun J-J. The Dual Role of TGFβ in Human Cancer: From Tumor Suppression to Cancer Metastasis. ISRN Mol Biol. Hindawi Limited; 2012;2012:1–28.

330. Yu Y, Xiao CH, Tan LD, Wang QS, Li XQ, Feng YM. Cancer-associated fibroblasts induce epithelial-mesenchymal transition of breast cancer cells through paracrine TGF-β signalling. Br J Cancer [Internet]. Nature Publishing Group; 2014 [cited 2021 Jun 26];110:724–32. Available from: www.bjcancer.com

331. Mishra AK, Parish CR, Wong ML, Licinio J, Blackburn AC. Leptin signals via TGFB1 to promote metastatic potential and stemness in breast cancer. PLoS One [Internet]. Public Library of Science; 2017 [cited 2021 Jun 26];12:e0178454. Available from: https://doi.org/10.1371/journal.pone.0178454

332. Duan W, Qian W, Zhou C, Cao J, Qin T, Xiao Y, et al. Metformin suppresses the invasive ability of pancreatic cancer cells by blocking autocrine TGF-ß1 signaling. Oncol Rep [Internet]. Spandidos Publications; 2018 [cited 2021 Jun 26];40:1495–502. Available from: http://www.spandidos-publications.com/10.3892/or.2018.6518/abstract

333. Yeh HW, Hsu EC, Lee SS, Lang YD, Lin YC, Chang CY, et al. PSPC1 mediates TGF-β1 autocrine signalling and Smad2/3 target switching to promote EMT, stemness and metastasis. Nat Cell Biol [Internet]. Nature Publishing Group; 2018 [cited 2021 Jun 26];20:479–91. Available from: https://doi.org/10.1038/s41556-018-0062-y

334. JAG1 jagged canonical Notch ligand 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/182

335. Chen J, Zhang H, Chen Y, Qiao G, Jiang W, Ni P, et al. miR-598 inhibits metastasis in colorectal cancer by suppressing JAG1/Notch2 pathway stimulating EMT. Exp Cell Res. Academic Press Inc.; 2017;352:104–12.

336. Xiu M xi, Liu Y meng, Kuang B hai. The oncogenic role of Jagged1/Notch signaling in cancer [Internet]. Biomed. Pharmacother. Elsevier Masson SAS; 2020 [cited 2021 Jun 26]. Available from: https://pubmed.ncbi.nlm.nih.gov/32593969/

337. Lee J, Lee J, Kim JH. Association of Jagged1 expression with malignancy and prognosis in human pancreatic cancer. Cell Oncol [Internet]. Springer Science and Business Media B.V.; 2020 [cited 2021 Jun 26];43:821–34. Available from: https://link.springer.com/article/10.1007/s13402-020-00527-3

338. Sugiyama M, Oki E, Nakaji Y, Tsutsumi S, Ono N, Nakanishi R, et al. High expression of the Notch ligand Jagged-1 is associated with poor prognosis after surgery for colorectal cancer. Cancer Sci [Internet]. Blackwell Publishing Ltd; 2016 [cited 2021 Mar 26];107:1705–16. Available from: https://pubmed.ncbi.nlm.nih.gov/27589478/

134

339. HGF hepatocyte growth factor [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 10]. Available from: https://www.ncbi.nlm.nih.gov/gene/3082

340. Huang CY, Zhou QY, Hu Y, Wen Y, Qiu ZW, Liang MG, et al. Hepatocyte growth factor is a prognostic marker in patients with colorectal cancer: A meta-analysis. Oncotarget [Internet]. Impact Journals LLC; 2017 [cited 2021 Jun 26];8:23459–69. Available from: /pmc/articles/PMC5410318/

341. Yao J feng, Li X jun, Yan L kun, He S, Zheng J bao, Wang X rong, et al. Role of HGF/c- Met in the treatment of colorectal cancer with liver metastasis. J Biochem Mol Toxicol [Internet]. John Wiley and Sons Inc.; 2019 [cited 2021 Apr 25];33. Available from: https://pubmed.ncbi.nlm.nih.gov/30897285/

342. Mira A, Morello V, Céspedes MV, Perera T, Comoglio PM, Mangues R, et al. Stroma- derived HGF drives metabolic adaptation of colorectal cancer to angiogenesis inhibitors. Oncotarget [Internet]. Impact Journals LLC; 2017 [cited 2021 Mar 26];8:38193–213. Available from: /pmc/articles/PMC5503526/

343. Parizadeh SM, Jafarzadeh‐Esfehani R, Fazilat‐Panah D, Hassanian SM, Shahidsales S, Khazaei M, et al. The potential therapeutic and prognostic impacts of the c‐MET/HGF signaling pathway in colorectal cancer. IUBMB Life [Internet]. Blackwell Publishing Ltd; 2019 [cited 2021 Mar 26];71:iub.2063. Available from: https://onlinelibrary.wiley.com/doi/abs/10.1002/iub.2063

344. Garrett CR, Hassabo HM, Bhadkamkar NA, Wen S, Baladandayuthapani V, Kee BK, et al. Survival advantage observed with the use of metformin in patients with type II diabetes and colorectal cancer. Br J Cancer [Internet]. Nature Publishing Group; 2012 [cited 2021 Jun 28];106:1374–8. Available from: www.bjcancer.com

345. Cancer [Internet]. [cited 2021 Jun 26]. Available from: https://www.who.int/news- room/fact-sheets/detail/cancer

346. Chatterjee S, Khunti K, Davies MJ. Type 2 diabetes. Lancet. Lancet Publishing Group; 2017. page 2239–51.

347. Shi G, Jin Y. Role of Oct4 in maintaining and regaining stem cell pluripotency [Internet]. Stem Cell Res. Ther. BioMed Central; 2010 [cited 2021 May 9]. page 1–9. Available from: http://stemcellres.com/content/1/5/39

348. Zhou H, Hu Y, Wang W, Mao Y, Zhu J, Zhou B, et al. Expression of oct-4 is significantly associated with the development and prognosis of colorectal cancer. Oncol Lett [Internet]. Spandidos Publications; 2015 [cited 2021 Mar 25];10:691–6. Available from: /pmc/articles/PMC4509036/

135

349. let-7 inhibition of ES cell reprogramming | Pathway - PubChem [Internet]. [cited 2021 May 9]. Available from: https://pubchem.ncbi.nlm.nih.gov/pathway/WikiPathways:WP3299

350. ACTA2 actin alpha 2, smooth muscle [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 9]. Available from: https://www.ncbi.nlm.nih.gov/gene/59

351. Zhao B, Baloch Z, Ma Y, Wan Z, Huo Y, Li F, et al. Identification of Potential Key Genes and Pathways in Early-Onset Colorectal Cancer Through Bioinformatics Analysis. Cancer Control [Internet]. SAGE Publications Ltd; 2019 [cited 2021 Mar 25];26:107327481983126. Available from: http://journals.sagepub.com/doi/10.1177/1073274819831260

352. Jeon M, You D, Bae SY, Kim SW, Nam SJ, Kim HH, et al. Dimerization of EGFR and HER2 induces breast cancer cell motility through STAT1-dependent ACTA2 induction. Oncotarget [Internet]. Impact Journals LLC; 2017 [cited 2021 Apr 23];8:50570–81. Available from: /pmc/articles/PMC5584169/

353. Almanza-Aguilera E, Hernáez Á, Corella D, Sanllorente A, Ros E, Portolés O, et al. Cancer Signaling Transcriptome Is Upregulated in Type 2 Diabetes Mellitus. J Clin Med [Internet]. MDPI AG; 2020 [cited 2021 Apr 23];10:85. Available from: https://www.mdpi.com/2077-0383/10/1/85

354. AFP alpha fetoprotein [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 9]. Available from: https://www.ncbi.nlm.nih.gov/gene/174

355. Ren F, Weng W, Zhang Q, Tan C, Xu M, Zhang M, et al. Clinicopathological features and prognosis of AFP-producing colorectal cancer: A single-center analysis of 20 cases. Cancer Manag Res [Internet]. Dove Medical Press Ltd; 2019 [cited 2021 Mar 25];11:4557–67. Available from: /pmc/articles/PMC6529609/

356. Yin L, Lin Y, Wang X, Su Y, Hu H, Li C, et al. The family of apoptosis-stimulating proteins of p53 is dysregulated in colorectal cancer patients. Oncol Lett [Internet]. Spandidos Publications; 2018 [cited 2021 Apr 23];15:6409–17. Available from: http://www.spandidos-publications.com/10.3892/ol.2018.8151/abstract

357. AHNAK AHNAK nucleoprotein [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 9]. Available from: https://www.ncbi.nlm.nih.gov/gene/79026

358. Sohn M, Shin S, Yoo JY, Goh Y, Lee IH, Bae YS. Ahnak promotes tumor metastasis through transforming growth factor-β-mediated epithelial-mesenchymal transition. Sci Rep [Internet]. Nature Publishing Group; 2018 [cited 2021 Mar 25];8:14379. Available from: www.nature.com/scientificreports/

136

359. Zhang Z, Liu X, Huang R, Liu X, Liang Z, Liu T. Upregulation of nucleoprotein AHNAK is associated with poor outcome of pancreatic ductal adenocarcinoma prognosis via mediating epithelial-mesenchymal transition. J Cancer [Internet]. Ivyspring International Publisher; 2019 [cited 2021 Apr 23];10:3860–70. Available from: /pmc/articles/PMC6636292/

360. AKT1 AKT serine/threonine kinase 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 9]. Available from: https://www.ncbi.nlm.nih.gov/gene/207

361. AKT2 AKT serine/threonine kinase 2 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 9]. Available from: https://www.ncbi.nlm.nih.gov/gene/208

362. AKT3 AKT serine/threonine kinase 3 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 9]. Available from: https://www.ncbi.nlm.nih.gov/gene/10000

363. Núñez-Olvera SI, Chávez-Munguía B, del Rocío Terrones-Gurrola MC, Marchat LA, Puente-Rivera J, Ruíz-García E, et al. A novel protective role for microRNA-3135b in Golgi apparatus fragmentation induced by chemotherapy via GOLPH3/AKT1/mTOR axis in colorectal cancer cells. Sci Rep [Internet]. Nature Research; 2020 [cited 2021 Mar 25];10:1–14. Available from: https://doi.org/10.1038/s41598-020-67550-0

364. Agarwal E, Robb CM, Smith LM, Brattain MG, Wang J, Black JD, et al. Role of Akt2 in regulation of metastasis suppressor 1 expression and colorectal cancer metastasis. Oncogene [Internet]. Nature Publishing Group; 2017 [cited 2021 Mar 25];36:3104–18. Available from: https://www.nature.com/articles/onc2016460

365. ANXA3 annexin A3 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 9]. Available from: https://www.ncbi.nlm.nih.gov/gene/306

366. Yang L, Men W-L, Yan K-M, Tie J, Nie Y-Z, Xiao H-J. MiR-340-5p is a potential prognostic indicator of colorectal cancer and modulates ANXA3.

367. Du R, Liu B, Zhou L, Wang D, He X, Xu X, et al. Downregulation of annexin A3 inhibits tumor metastasis and decreases drug resistance in breast cancer. Cell Death Dis [Internet]. Nature Publishing Group; 2018 [cited 2021 Apr 23];9:1–11. Available from: https://www.nature.com/articles/s41419-017-0143-z

368. Tong M, Fung TM, Luk ST, Ng KY, Lee TK, Lin CH, et al. ANXA3/JNK Signaling Promotes Self-Renewal and Tumor Growth, and Its Blockade Provides a Therapeutic Target for Hepatocellular Carcinoma. Stem Cell Reports [Internet]. Cell Press; 2015 [cited 2021 Apr 23];5:45–59. Available from: /pmc/articles/PMC4618447/

369. APC APC regulator of WNT signaling pathway [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 9]. Available from: https://www.ncbi.nlm.nih.gov/gene/324

137

370. Zhang L, Shay JW. Multiple Roles of APC and its Therapeutic Implications in Colorectal Cancer. JNCI J Natl Cancer Inst [Internet]. Oxford University Press; 2017 [cited 2021 Mar 25];109. Available from: https://academic.oup.com/jnci/article/doi/10.1093/jnci/djw332/3113843

371. AURKA aurora kinase A [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 9]. Available from: https://www.ncbi.nlm.nih.gov/gene/6790

372. Jacobsen A, Bosch LJW, Martens-De Kemp SR, Carvalho B, Sillars-Hardebol AH, Dobson RJ, et al. Aurora kinase A (AURKA) interaction with Wnt and Ras-MAPK signalling pathways in colorectal cancer. Sci Rep [Internet]. Nature Publishing Group; 2018 [cited 2021 Mar 25];8:1–11. Available from: www.nature.com/scientificreports

373. Liu YD, Ji CB, Li SB, Yan F, Gu QS, Balic JJ, et al. Toll-like receptor 2 stimulation promotes colorectal cancer cell growth via PI3K/Akt and NF-κB signaling pathways. Int Immunopharmacol. Elsevier B.V.; 2018;59:375–83.

374. BIRC3 baculoviral IAP repeat containing 3 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 9]. Available from: https://www.ncbi.nlm.nih.gov/gene/330

375. Zhang S, Yang Y, Weng W, Guo B, Cai G, Ma Y, et al. Fusobacterium nucleatum promotes chemoresistance to 5-fluorouracil by upregulation of BIRC3 expression in colorectal caner. J Exp Clin Cancer Res [Internet]. BioMed Central Ltd.; 2019 [cited 2021 Mar 26];38:14. Available from: https://jeccr.biomedcentral.com/articles/10.1186/s13046- 018-0985-y

376. BMP7 bone morphogenetic protein 7 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 9]. Available from: https://www.ncbi.nlm.nih.gov/gene/655

377. Veschi V, Mangiapane LR, Nicotra A, Di Franco S, Scavo E, Apuzzo T, et al. Targeting chemoresistant colorectal cancer via systemic administration of a BMP7 variant. Oncogene [Internet]. Springer Nature; 2020 [cited 2021 Mar 26];39:987–1003. Available from: https://doi.org/10.1038/s41388-019-1047-4

378. BRAF B-Raf proto-oncogene, serine/threonine kinase [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 9]. Available from: https://www.ncbi.nlm.nih.gov/gene/673

379. Sanz-Garcia E, Argiles G, Elez E, Tabernero J. BRAF mutant colorectal cancer: Prognosis, treatment, and new perspectives. Ann. Oncol. Oxford University Press; 2017. page 2648–57.

380. MUC16 mucin 16, cell surface associated [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 9]. Available from: https://www.ncbi.nlm.nih.gov/gene/94025

138

381. Jiang H, Kang B, Huang X, Yan Y, Wang S, Ye Y, et al. Cancer IgG, a potential prognostic marker, promotes colorectal cancer progression. Chinese J Cancer Res [Internet]. Chinese Journal of Cancer Research; 2019 [cited 2021 Mar 26];31:499–510. Available from: /pmc/articles/PMC6613500/

382. Hu J, Sun J. MUC16 mutations improve patients’ prognosis by enhancing the infiltration and antitumor immunity of cytotoxic T lymphocytes in the endometrial cancer microenvironment. Oncoimmunology [Internet]. Taylor and Francis Inc.; 2018 [cited 2021 Apr 23];7. Available from: /pmc/articles/PMC6169595/

383. CAMK2N1 calcium/calmodulin dependent protein kinase II inhibitor 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 9]. Available from: https://www.ncbi.nlm.nih.gov/gene/55450

384. Wu C, Miao C, Tang Q, Zhou X, Xi P, Chang P, et al. MiR‐129‐5p promotes docetaxel resistance in prostate cancer by down‐regulating CAMK2N1 expression. J Cell Mol Med [Internet]. Blackwell Publishing Inc.; 2020 [cited 2021 Mar 26];24:2098–108. Available from: https://onlinelibrary.wiley.com/doi/abs/10.1111/jcmm.14050

385. Liu Q, Song J, Pan Y, Shi D, Yang C, Wang S, et al. Wnt5a/caMKII/ERK/CCL2 axis is required for tumor-associated macrophages to promote colorectal cancer progression. Int J Biol Sci [Internet]. Ivyspring International Publisher; 2020 [cited 2021 Apr 23];16:1023– 34. Available from: /pmc/articles/PMC7053330/

386. CCND1 cyclin D1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 9]. Available from: https://www.ncbi.nlm.nih.gov/gene/595

387. Xu J, Lin DI. Oncogenic c-terminal cyclin d1 (ccnd1) mutations are enriched in endometrioid endometrial adenocarcinomas. PLoS One [Internet]. Public Library of Science; 2018 [cited 2021 Mar 26];13. Available from: /pmc/articles/PMC6029777/

388. CD82 CD82 molecule [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 9]. Available from: https://www.ncbi.nlm.nih.gov/gene/3732

389. Lee JH, Bae JA, Lee JH, Seo YW, Kho DH, Sun EG, et al. Glycoprotein 90K, downregulated in advanced colorectal cancer tissues, interacts with CD9/CD82 and suppresses the Wnt/β-catenin signal via ISGylation of β-catenin. Gut [Internet]. BMJ Publishing Group; 2010 [cited 2021 Mar 26];59:907–17. Available from: https://gut.bmj.com/content/59/7/907

390. Wu J, Liang S, Bergholz J, He H, Walsh EM, Zhang Y, et al. ΔNp63α activates CD82 metastasis suppressor to inhibit cancer cell invasion. Cell Death Dis [Internet]. Nature Publishing Group; 2014 [cited 2021 Apr 23];5:e1280. Available from: /pmc/articles/PMC4611714/

139

391. Yi Y, Zhang W, Yi J, Xiao ZX. Role of p53 family proteins in metformin anti-cancer activities [Internet]. J. Cancer. Ivyspring International Publisher; 2019 [cited 2021 Apr 23]. page 2434–42. Available from: /pmc/articles/PMC6584340/

392. CD83 CD83 molecule [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 9]. Available from: https://www.ncbi.nlm.nih.gov/gene/9308

393. Michielsen AJ, Hogan AE, Marry J, Tosetto M, Cox F, Hyland JM, et al. Tumour Tissue Microenvironment Can Inhibit Dendritic Cell Maturation in Colorectal Cancer. Dieli F, editor. PLoS One [Internet]. Public Library of Science; 2011 [cited 2021 Mar 26];6:e27944. Available from: https://dx.plos.org/10.1371/journal.pone.0027944

394. CDH1 cadherin 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 10]. Available from: https://www.ncbi.nlm.nih.gov/gene/999

395. Grünhage F, Jungck M, Lamberti C, Berg C, Becker U, Schulte-Witte H, et al. Association of familial colorectal cancer with variants in the E-cadherin (CDH1) and cyclin D1 (CCND1) genes. Int J Colorectal Dis [Internet]. Springer; 2008 [cited 2021 Mar 26];23:147–54. Available from: https://link.springer.com/article/10.1007/s00384-007- 0388-6

396. CDH1 cadherin 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 Apr 23]. Available from: https://www.ncbi.nlm.nih.gov/gene/999

397. CDH2 cadherin 2 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 10]. Available from: https://www.ncbi.nlm.nih.gov/gene/1000

398. Yan X, Yan L, Liu S, Shan Z, Tian Y, Jin Z. N-cadherin, a novel prognostic biomarker, drives malignant progression of colorectal cancer. Mol Med Rep [Internet]. Spandidos Publications; 2015 [cited 2021 Mar 26];12:2999–3006. Available from: http://www.spandidos-publications.com/10.3892/mmr.2015.3687/abstract

399. CDK8 cyclin dependent kinase 8 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 Apr 23]. Available from: https://www.ncbi.nlm.nih.gov/gene/1024

400. Firestein R, Bass AJ, Kim SY, Dunn IF, Silver SJ, Guney I, et al. CDK8 is a colorectal cancer oncogene that regulates β-catenin activity. Nature [Internet]. Nature Publishing Group; 2008 [cited 2021 Mar 26];455:547–51. Available from: https://www.nature.com/articles/nature07179

401. CDKN2A cyclin dependent kinase inhibitor 2A [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 Apr 23]. Available from: https://www.ncbi.nlm.nih.gov/gene/1029

140

402. Xing X, Cai W, Shi H, Wang Y, Li M, Jiao J, et al. The prognostic value of CDKN2A hypermethylation in colorectal cancer: A meta-analysis. Br J Cancer [Internet]. Nature Publishing Group; 2013 [cited 2021 Mar 26];108:2542–8. Available from: /pmc/articles/PMC3694241/

403. CEACAM5 CEA cell adhesion molecule 5 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 Apr 23]. Available from: https://www.ncbi.nlm.nih.gov/gene/1048

404. Li Y, Xing C, Wei M, Wu H, Hu X, Li S, et al. Combining red blood cell distribution width (RDW-CV) and CEA predict poor prognosis for survival outcomes in colorectal cancer. J Cancer [Internet]. Ivyspring International Publisher; 2019 [cited 2021 Mar 26];10:1162–70. Available from: /pmc/articles/PMC6400666/

405. Eftekhar E, Jaberie H, Naghibalhossaini F. Carcinoembryonic Antigen Expression and Resistance to Radiation and 5-Fluorouracil-Induced Apoptosis and Autophagy. Int J Mol Cell Med [Internet]. Babol University of Medical Sciences; 2016 [cited 2021 Apr 23];5:80–9. Available from: http://www.ncbi.nlm.nih.gov/pubmed/27478804

406. CFC1 cripto, FRL-1, cryptic family 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 10]. Available from: https://www.ncbi.nlm.nih.gov/gene/55997

407. Chikaraishi K, Takenobu H, Sugino RP, Mukae K, Akter J, Haruta M, et al. CFC1 is a cancer stemness-regulating factor in neuroblastoma. Oncotarget [Internet]. Impact Journals LLC; 2017 [cited 2021 Apr 23];8:45046–59. Available from: /pmc/articles/PMC5542166/

408. Liu Y, Qin Z, Yang K, Liu R, Xu Y. Cripto-1 promotes epithelial-mesenchymal transition in prostate cancer via Wnt/β-catenin signaling. Oncol Rep [Internet]. Spandidos Publications; 2017 [cited 2021 Apr 23];37:1521–8. Available from: http://www.spandidos-publications.com/10.3892/or.2017.5378/abstract

409. Shani G, Fischer WH, Justice NJ, Kelber JA, Vale W, Gray PC. GRP78 and Cripto Form a Complex at the Cell Surface and Collaborate To Inhibit Transforming Growth Factor β Signaling and Enhance Cell Growth. Mol Cell Biol [Internet]. American Society for Microbiology; 2008 [cited 2021 Mar 26];28:666–77. Available from: https://pubmed.ncbi.nlm.nih.gov/17991893/

410. CLEC4D C-type lectin domain family 4 member D [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 10]. Available from: https://www.ncbi.nlm.nih.gov/gene/338339

411. Umansky V, Blattner C, Gebhardt C, Utikal J. The Role of Myeloid-Derived Suppressor Cells (MDSC) in Cancer Progression. Vaccines [Internet]. MDPI AG; 2016 [cited 2021 Mar 26];4:36. Available from: http://www.mdpi.com/2076-393X/4/4/36

141

412. Alshetaiwi H, Pervolarakis N, McIntyre LL, Ma D, Nguyen Q, Rath JA, et al. Defining the emergence of myeloid-derived suppressor cells in breast cancer using single-cell transcriptomics. Sci Immunol [Internet]. NLM (Medline); 2020 [cited 2021 Mar 26];5:6017. Available from: https://immunology.sciencemag.org/content/5/44/eaay6017

413. CNRIP1 cannabinoid receptor interacting protein 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 10]. Available from: https://www.ncbi.nlm.nih.gov/gene/25927

414. Zhang T, Cui G, Yao YL, Wang QC, Gu HG, Li XN, et al. Value of CNRIP1 promoter methylation in colorectal cancer screening and prognosis assessment and its influence on the activity of cancer cells. Arch Med Sci [Internet]. Termedia Publishing House Ltd.; 2017 [cited 2021 Mar 26];13:1281–94. Available from: /pmc/articles/PMC5701694/

415. COL1A1 collagen type I alpha 1 chain [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 10]. Available from: https://www.ncbi.nlm.nih.gov/gene/1277

416. Zhang Z, Wang Y, Zhang J, Zhong J, Yang R. COL1A1 promotes metastasis in colorectal cancer by regulating the WNT/PCP pathway. Mol Med Rep [Internet]. Spandidos Publications; 2018 [cited 2021 Mar 26];17:5037–42. Available from: http://www.spandidos-publications.com/10.3892/mmr.2018.8533/abstract

417. COL3A1 collagen type III alpha 1 chain [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 10]. Available from: https://www.ncbi.nlm.nih.gov/gene/1281

418. Wang J, Liu F. COL3A1 promotes colorectal cancer growth and its potential regulation mechanism. Fudan Univ J Med Sci. Fudan University; 2018;45:285–90.

419. Su B, Zhao W, Shi B, Zhang Z, Yu X, Xie F, et al. Let-7d suppresses growth, metastasis, and tumor macrophage infiltration in renal cell carcinoma by targeting COL3A1 and CCL7. Mol Cancer [Internet]. BioMed Central Ltd.; 2014 [cited 2021 Apr 24];13:1–13. Available from: https://link.springer.com/articles/10.1186/1476-4598-13-206

420. CREBBP CREB binding protein [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 Apr 24]. Available from: https://www.ncbi.nlm.nih.gov/gene/1387

421. Liu K, Wang JF, Zhan Y, Kong DL, Wang C. Prognosis model of colorectal cancer patients based on NOTCH3, KMT2C, and CREBBP mutations. J Gastrointest Oncol [Internet]. AME Publishing Company; 2021 [cited 2021 Mar 26];12:79–88. Available from: /pmc/articles/PMC7944156/

422. Horton SJ, Giotopoulos G, Yun H, Vohra S, Sheppard O, Bashford-Rogers R, et al. Early loss of Crebbp confers malignant stem cell properties on lymphoid progenitors. Nat Cell Biol [Internet]. Nature Publishing Group; 2017 [cited 2021 Apr 24];19:1093–104. Available from: https://pubmed.ncbi.nlm.nih.gov/28825697/

142

423. CTNNB1 catenin beta 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 Apr 24]. Available from: https://www.ncbi.nlm.nih.gov/gene/1499

424. Zhang L, Dong X, Yan B, Yu W, Shan L. CircAGFG1 drives metastasis and stemness in colorectal cancer by modulating YY1/CTNNB1. Cell Death Dis [Internet]. Springer Nature; 2020 [cited 2021 Mar 26];11:1–15. Available from: https://doi.org/10.1038/s41419-020-2707-6

425. KRT19 keratin 19 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 10]. Available from: https://www.ncbi.nlm.nih.gov/gene/3880

426. Cytokeratin 19 Fragment - an overview | ScienceDirect Topics [Internet]. [cited 2021 Mar 26]. Available from: https://www.sciencedirect.com/topics/medicine-and- dentistry/cytokeratin-19-fragment

427. Lee JH. Clinical Usefulness of Serum CYFRA 21-1 in Patients with Colorectal Cancer. Nucl Med Mol Imaging (2010) [Internet]. Nucl Med Mol Imaging; 2013 [cited 2021 Mar 26];47:181–7. Available from: https://pubmed.ncbi.nlm.nih.gov/24900105/

428. EGF epidermal growth factor [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 10]. Available from: https://www.ncbi.nlm.nih.gov/gene/1950

429. Bouché O, Beretta GD, Alfonso PG, Geissler M. The role of anti-epidermal growth factor receptor monoclonal antibody monotherapy in the treatment of metastatic colorectal cancer. Cancer Treat Rev. 2010;36.

430. EGFR epidermal growth factor receptor [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 10]. Available from: https://www.ncbi.nlm.nih.gov/gene/1956

431. EGFR expression in colorectal cancer. Nat Clin Pract Gastroenterol Hepatol [Internet]. Nature Publishing Group; 2005 [cited 2021 Mar 26];2:69–69. Available from: http://www.nature.com/articles/ncponc0092-nrgastro

432. EPAS1 endothelial PAS domain protein 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 10]. Available from: https://www.ncbi.nlm.nih.gov/gene/2034

433. Xue X, Ramakrishnan S, Anderson E, Taylor M, Zimmermann EM, Spence JR, et al. Endothelial PAS domain protein 1 activates the inflammatory response in the intestinal epithelium to promote colitis in mice. Gastroenterology [Internet]. W.B. Saunders; 2013 [cited 2021 Apr 24];145:831–41. Available from: https://pubmed.ncbi.nlm.nih.gov/23860500/

434. Rawłuszko-Wieczorek AA, Horbacka K, Krokowicz P, Misztal M, Jagodziński PP. Prognostic potential of DNA methylation and transcript levels of HIF1A and EPAS1 in colorectal cancer. Mol Cancer Res [Internet]. American Association for Cancer Research Inc.; 2014 [cited 2021 Mar 26];12:1112–27. Available from: http://mcr.aacrjournals.org/

143

435. ERBB2 erb-b2 receptor tyrosine kinase 2 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 10]. Available from: https://www.ncbi.nlm.nih.gov/gene/2064

436. Pectasides E, Bass AJ. ERBB2 emerges as a new target for colorectal cancer. Cancer Discov [Internet]. American Association for Cancer Research Inc.; 2015 [cited 2021 Mar 26];5:799–801. Available from: /pmc/articles/PMC4527096/

437. Subramanian J, Katta A, Masood A, Vudem DR, Kancha RK. Emergence of ERBB2 Mutation as a Biomarker and an Actionable Target in Solid Cancers . Oncologist [Internet]. Wiley; 2019 [cited 2021 Apr 24];24:e1303. Available from: /pmc/articles/PMC6975965/

438. Ruiz-Saenz A, Dreyer C, Campbell MR, Steri V, Gulizia N, Moasser MM. HER2 amplification in tumors activates PI3K/Akt signaling independent of HER3. Cancer Res [Internet]. American Association for Cancer Research Inc.; 2018 [cited 2021 May 10];78:3645–58. Available from: http://cancerres.aacrjournals.org/

439. ERBB3 erb-b2 receptor tyrosine kinase 3 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 10]. Available from: https://www.ncbi.nlm.nih.gov/gene/2065

440. Zhang L, Castanaro C, Luan B, Yang K, Fan L, Fairhurst JL, et al. ERBB3/HER2 signaling promotes resistance to EGFR blockade in head and neck and colorectal cancer models. Mol Cancer Ther [Internet]. American Association for Cancer Research Inc.; 2014 [cited 2021 Mar 26];13:1345–55. Available from: www.aacrjournals.org

441. Goossens-Beumer IJ, Zeestraten ECM, Benard A, Christen T, Reimers MS, Keijzer R, et al. Clinical prognostic value of combined analysis of Aldh1, Survivin, and EpCAM expression in colorectal cancer. Br J Cancer [Internet]. Nature Publishing Group; 2014 [cited 2021 Mar 26];110:2935–44. Available from: www.bjcancer.com

442. Kim JH, Bae JM, Song YS, Cho NY, Lee HS, Kang GH. Clinicopathologic, molecular, and prognostic implications of the loss of EPCAM expression in colorectal carcinoma. Oncotarget [Internet]. Impact Journals LLC; 2016 [cited 2021 Apr 24];7:13372–87. Available from: /pmc/articles/PMC4924648/

443. Wang A, Ramjeesingh R, Chen CH, Hurlbut D, Hammad N, Mulligan LM, et al. Reduction in membranous immunohistochemical staining for the intracellular domain of epithelial cell adhesion molecule correlates with poor patient outcome in primary colorectal adenocarcinoma. Curr Oncol [Internet]. Multimed Inc.; 2016 [cited 2021 May 10];23:e171–8. Available from: /pmc/articles/PMC4900837/

444. FBN1 fibrillin 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 Apr 24]. Available from: https://www.ncbi.nlm.nih.gov/gene/2200

144

445. Li WH, Zhang H, Guo Q, Wu X Di, Xu Z Sen, Dang CX, et al. Detection of SNCA and FBN1 methylation in the stool as a biomarker for colorectal cancer. Dis Markers. Hindawi Limited; 2015;2015.

446. Liu W, Liu P, Gao H, Wang X, Yan M. Long non‐coding RNA PGM5‐AS1 promotes epithelial‐mesenchymal transition, invasion and metastasis of osteosarcoma cells by impairing miR‐140‐5p‐mediated FBN1 inhibition. Mol Oncol [Internet]. John Wiley and Sons Ltd; 2020 [cited 2021 Apr 24];14:2660–77. Available from: https://onlinelibrary.wiley.com/doi/10.1002/1878-0261.12711

447. Wang Z, Liu Y, Lu L, Yang L, Yin S, Wang Y, et al. Fibrillin-1, induced by Aurora-A but inhibited by BRCA2, promotes ovarian cancer metastasis. Oncotarget [Internet]. Impact Journals LLC; 2015 [cited 2021 Apr 24];6:6670–83. Available from: /pmc/articles/PMC4466642/

448. FGFBP1 fibroblast growth factor binding protein 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 Apr 24]. Available from: https://www.ncbi.nlm.nih.gov/gene/9982

449. Chen J, Liu Q-M, Du P-C, Ning D, Mo J, Zhu H-D, et al. Type-2 11β-hydroxysteroid dehydrogenase promotes the metastasis of colorectal cancer via the Fgfbp1-AKT pathway. Am J Cancer Res [Internet]. e-Century Publishing Corporation; 2020 [cited 2021 Mar 26];10:662–73. Available from: http://www.ncbi.nlm.nih.gov/pubmed/32195034

450. FGFR1 fibroblast growth factor receptor 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 Apr 24]. Available from: https://www.ncbi.nlm.nih.gov/gene/2260

451. Bae JM, Wen X, Kim TS, Kwak Y, Cho NY, Lee HS, et al. Fibroblast growth factor receptor 1 (FGFR1) amplification detected by droplet digital polymerase chain reaction (DDPCR) is a prognostic factor in colorectal cancers. Cancer Res Treat [Internet]. Korean Cancer Association; 2020 [cited 2021 Mar 26];52:74–84. Available from: https://pubmed.ncbi.nlm.nih.gov/31096734/

452. Chong Y, Thakur N, Paik KY, Lee EJ, Kang CS. Prognostic significance of stem cell/ epithelial-mesenchymal transition markers in periampullary/pancreatic cancers: FGFR1 is a promising prognostic marker. BMC Cancer [Internet]. BioMed Central Ltd.; 2020 [cited 2021 May 30];20. Available from: https://pubmed.ncbi.nlm.nih.gov/32171280/

453. MIOS meiosis regulator for oocyte development [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 10]. Available from: https://www.ncbi.nlm.nih.gov/gene/54468

454. MIOS Gene - GeneCards | MIO Protein | MIO Antibody [Internet]. [cited 2021 Mar 26]. Available from: https://www.genecards.org/cgi-bin/carddisp.pl?gene=MIOS

145

455. Wu J, Wang Y, Xu X, Cao H, Sahengbieke S, Sheng H, et al. Transcriptional activation of FN1 and IL11 by HMGA2 promotes the malignant behavior of colorectal cancer. Carcinogenesis [Internet]. Oxford University Press; 2016 [cited 2021 Mar 26];37:511–21. Available from: https://academic.oup.com/carcin/article- lookup/doi/10.1093/carcin/bgw029

456. Kim SA, Cho EJ, Lee S, Cho YY, Kim B, Yoon JH, et al. Changes in serum fibronectin levels predict tumor recurrence in patients with early hepatocellular carcinoma after curative treatment. Sci Rep [Internet]. Nature Research; 2020 [cited 2021 May 10];10. Available from: https://pubmed.ncbi.nlm.nih.gov/33277619/

457. Vaidya A, Wang H, Qian V, Gilmore H, Lu ZR. Overexpression of Extradomain-B Fibronectin is Associated with Invasion of Breast Cancer Cells. Cells [Internet]. NLM (Medline); 2020 [cited 2021 May 10];9. Available from: https://pubmed.ncbi.nlm.nih.gov/32756405/

458. Alanen J, Appelblom H, Korpimaki T, Kouru H, Sairanen M, Gissler M, et al. Glycosylated fibronectin as a first trimester marker for gestational diabetes. Arch Gynecol Obstet [Internet]. Springer Science and Business Media Deutschland GmbH; 2020 [cited 2021 May 10];302:853–60. Available from: https://pubmed.ncbi.nlm.nih.gov/32653948/

459. Vu T, Datta P. Regulation of EMT in Colorectal Cancer: A Culprit in Metastasis. Cancers (Basel) [Internet]. MDPI AG; 2017 [cited 2021 Apr 24];9:171. Available from: http://www.mdpi.com/2072-6694/9/12/171

460. FOXD3 forkhead box D3 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 10]. Available from: https://www.ncbi.nlm.nih.gov/gene/27022

461. He GY, Hu JL, Zhou L, Zhu XH, Xin SN, Zhang D, et al. The FOXD3/miR-214/MED19 axis suppresses tumour growth and metastasis in human colorectal cancer. Br J Cancer [Internet]. Nature Publishing Group; 2016 [cited 2021 Mar 26];115:1367–78. Available from: www.bjcancer.com

462. Li K, Guo Q, Yang J, Chen H, Hu K, Zhao J, et al. FOXD3 is a tumor suppressor of colon cancer by inhibiting EGFRRas- Raf-MEK-ERK signal pathway. Oncotarget [Internet]. Impact Journals LLC; 2017 [cited 2021 Apr 24];8:5048–56. Available from: /pmc/articles/PMC5354891/

463. FOXO3 forkhead box O3 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 Apr 25]. Available from: https://www.ncbi.nlm.nih.gov/gene/2309

464. Bullock MD, Bruce A, Sreekumar R, Curtis N, Cheung T, Reading I, et al. FOXO3 expression during colorectal cancer progression: Biomarker potential reflects a tumour suppressor role. Br J Cancer [Internet]. Nature Publishing Group; 2013 [cited 2021 Mar 26];109:387–94. Available from: /pmc/articles/PMC3721407/

146

465. Song S, Cheng D, Wei S, Wang X, Niu Y, Qi W, et al. Preventive effect of genistein on AOM/DSS-induced colonic neoplasm by modulating the PI3K/AKT/FOXO3 signaling pathway in mice fed a high-fat diet. J Funct Foods. Elsevier Ltd; 2018;46:237–42.

466. FZD7 frizzled class receptor 7 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 10]. Available from: https://www.ncbi.nlm.nih.gov/gene/8324

467. Dong X, Liao W, Zhang L, Tu X, Hu J, Chen T, et al. RSPO2 suppresses colorectal cancer metastasis by counteracting the Wnt5a/Fzd7-driven noncanonical Wnt pathway. Cancer Lett. Elsevier Ireland Ltd; 2017;402:153–65.

468. Ye C, Xu M, Lin M, Zhang Y, Zheng X, Sun Y, et al. Overexpression of FZD7 is associated with poor survival in patients with colon cancer. Pathol Res Pract [Internet]. Elsevier GmbH; 2019 [cited 2021 May 29];215. Available from: https://pubmed.ncbi.nlm.nih.gov/31176574/

469. GSC goosecoid homeobox [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 10]. Available from: https://www.ncbi.nlm.nih.gov/gene/145258

470. Zlobec I, Lugli A. Epithelial mesenchymal transition and tumor budding in aggressive colorectal cancer: tumor budding as oncotarget. [Internet]. Oncotarget. Impact Journals, LLC; 2010 [cited 2021 Mar 26]. page 651–61. Available from: /pmc/articles/PMC3248128/

471. Voutsadakis I. Epithelial-Mesenchymal Transition (EMT) and Regulation of EMT Factors by Steroid Nuclear Receptors in Breast Cancer: A Review and in Silico Investigation. J Clin Med [Internet]. MDPI AG; 2016 [cited 2021 Apr 25];5:11. Available from: http://www.mdpi.com/2077-0383/5/1/11

472. Duda P, Akula SM, Abrams SL, Steelman LS, Martelli AM, Cocco L, et al. Targeting GSK3 and Associated Signaling Pathways Involved in Cancer [Internet]. Cells. NLM (Medline); 2020 [cited 2021 Apr 25]. Available from: /pmc/articles/PMC7290852/

473. Wang HL, Hart J, Fan L, Mustafi R, Bissonnette M. Upregulation of Glycogen synthase kinase 3β in human colorectal adenocarcinomas correlates with accumulation of CTNNB1. Clin Colorectal Cancer. Elsevier Inc.; 2011;10:30–6.

474. Tsai CC, Tsai CK, Tseng PC, Lin CF, Chen CL. Glycogen Synthase Kinase-3β Facilitates Cytokine Production in 12-O-Tetradecanoylphorbol-13-Acetate/Ionomycin-Activated Human CD4+ T Lymphocytes. Cells [Internet]. NLM (Medline); 2020 [cited 2021 May 10];9. Available from: https://pubmed.ncbi.nlm.nih.gov/32521784/

475. HDAC2 histone deacetylase 2 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 10]. Available from: https://www.ncbi.nlm.nih.gov/gene/3066

147

476. Mao QD, Zhang W, Zhao K, Cao B, Yuan H, Wei LZ, et al. MicroRNA-455 suppresses the oncogenic function of HDAC2 in human colorectal cancer. Brazilian J Med Biol Res [Internet]. Associacao Brasileira de Divulgacao Cientifica; 2017 [cited 2021 Mar 26];50. Available from: http://dx.doi.org/10.1590/1414-431X20176103

477. HIF1A hypoxia inducible factor 1 subunit alpha [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 10]. Available from: https://www.ncbi.nlm.nih.gov/gene/3091

478. Baba Y, Nosho K, Shima K, Irahara N, Chan AT, Meyerhardt JA, et al. HIF1A overexpression is associated with poor prognosis in a cohort of 731 colorectal cancers. Am J Pathol [Internet]. Elsevier Inc.; 2010 [cited 2021 Mar 26];176:2292–301. Available from: /pmc/articles/PMC2861094/

479. Lyou Y, Chen G, Waterman M. Abstract 385: HIF1A can regulate Wnt signaling in human colon cancer cells. Cancer Res [Internet]. American Association for Cancer Research (AACR); 2018 [cited 2021 Apr 25]. page 385–385. Available from: https://cancerres.aacrjournals.org/content/78/13_Supplement/385

480. Zhou B, Guo R. Genomic and regulatory characteristics of significant transcription factors in colorectal cancer metastasis. Sci Rep [Internet]. Nature Publishing Group; 2018 [cited 2021 Mar 26];8:17836. Available from: www.nature.com/scientificreports/

481. HPSE heparanase [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 8]. Available from: https://www.ncbi.nlm.nih.gov/gene/10855

482. Liu X, Zhou Z, Li W, Zhang S, Li J, Zhou M-J, et al. Heparanase Promotes Tumor Growth and Liver Metastasis of Colorectal Cancer Cells by Activating the p38/MMP1 Axis. Front Oncol [Internet]. Frontiers Media S.A.; 2019 [cited 2021 Mar 26];9:216. Available from: https://www.frontiersin.org/article/10.3389/fonc.2019.00216/full

483. HSP90AA1 heat shock protein 90 alpha family class A member 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/3320

484. Chu SH, Liu YW, Zhang L, Liu B, Li L, Shi JZ. Regulation of survival and chemoresistance by HSP90AA1 in ovarian cancer SKOV3 cells. Mol Biol Rep [Internet]. Springer; 2013 [cited 2021 Mar 26];40:1–6. Available from: https://link.springer.com/article/10.1007/s11033-012-1930-3

485. IGF1 insulin like growth factor 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/3479

486. Li Z, Pan W, Shen Y, Chen Z, Zhang L, Zhang Y, et al. IGF1/IGF1R and microRNA let- 7e down-regulate each other and modulate proliferation and migration of colorectal cancer cells. Cell Cycle [Internet]. Taylor and Francis Inc.; 2018 [cited 2021 Mar 26];17:1212–9. Available from: https://www.tandfonline.com/doi/full/10.1080/15384101.2018.1469873

148

487. IGF1R insulin like growth factor 1 receptor [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/3480

488. Afshar S, Najafi R, Sedighi Pashaki A, Sharifi M, Nikzad S, Gholami MH, et al. MiR-185 enhances radiosensitivity of colorectal cancer cells by targeting IGF1R and IGF2. Biomed Pharmacother. Elsevier Masson SAS; 2018;106:763–9.

489. IGF2 insulin like growth factor 2 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/3481

490. Livingstone C. IGF2 and cancer [Internet]. Endocr. Relat. Cancer. Society for Endocrinology; 2013 [cited 2021 Mar 26]. page R321–39. Available from: http://erc.endocrinology-journals.org

491. IGFBP3 insulin like growth factor binding protein 3 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 8]. Available from: https://www.ncbi.nlm.nih.gov/gene/3486

492. Hou Y, Luo P, Ji G, Chen H. Clinical significance of serum IGFBP‐3 in colorectal cancer. J Clin Lab Anal [Internet]. John Wiley and Sons Inc.; 2019 [cited 2021 Mar 26];33:e22912. Available from: https://onlinelibrary.wiley.com/doi/abs/10.1002/jcla.22912

493. IGFBP4 insulin like growth factor binding protein 4 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 8]. Available from: https://www.ncbi.nlm.nih.gov/gene/3487

494. Durai R, Yang SY, Sales KM, Seifalian AM, Goldspink G, Winslet MC. Increased apoptosis and decreased proliferation of colorectal cancer cells using insulin-like growth factor binding protein-4 gene delivered locally by gene transfer. Color Dis [Internet]. John Wiley & Sons, Ltd; 2007 [cited 2021 Mar 26];9:625–31. Available from: http://doi.wiley.com/10.1111/j.1463-1318.2006.01190.x

495. IL17RC interleukin 17 receptor C [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 8]. Available from: https://www.ncbi.nlm.nih.gov/gene/84818

496. Hurtado CG, Wan F, Housseau F, Sears CL. Roles for Interleukin 17 and Adaptive Immunity in Pathogenesis of Colorectal Cancer [Internet]. Gastroenterology. W.B. Saunders; 2018 [cited 2021 Mar 26]. page 1706–15. Available from: /pmc/articles/PMC6441974/

497. IL2RB interleukin 2 receptor subunit beta [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 8]. Available from: https://www.ncbi.nlm.nih.gov/gene/3560

149

498. He L, Zhang W, Yang S, Meng W, Dou X, Liu J, et al. Impact of genetic variants in IL- 2RA and IL-2RB on breast cancer risk in Chinese Han women. Biochem Genet [Internet]. Springer; 2021 [cited 2021 Mar 26];1–17. Available from: https://link.springer.com/article/10.1007/s10528-021-10029-y

499. Jia Z, Zhang Z, Yang Q, Deng C, Li D, Ren L. Effect of IL2RA and IL2RB gene polymorphisms on lung cancer risk. Int Immunopharmacol. Elsevier B.V.; 2019;74:105716.

500. IL6 interleukin 6 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/3569

501. Zeng J, Tang ZH, Liu S, Guo SS. Clinicopathological significance of overexpression of interleukin-6 in colorectal cancer. World J Gastroenterol [Internet]. Baishideng Publishing Group Co., Limited; 2017 [cited 2021 Mar 26];23:1780–6. Available from: /pmc/articles/PMC5352918/

502. Zhang X, Hu F, Li G, Li G, Yang X, Liu L, et al. Human colorectal cancer-derived mesenchymal stem cells promote colorectal cancer progression through IL- 6/JAK2/STAT3 signaling. Cell Death Dis [Internet]. Nature Publishing Group; 2018 [cited 2021 Mar 26];9:1–13. Available from: https://www.nature.com/articles/s41419- 017-0176-3

503. IL6ST interleukin 6 cytokine family signal transducer [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/3572

504. Yue Y, Zhang Q, Wu S, Wang S, Cui C, Yu M, et al. Identification of key genes involved in JAK/STAT pathway in colorectal cancer. Mol Immunol. Elsevier Ltd; 2020;128:287– 97.

505. IL8 interleukin 8 [Anas platyrhynchos (mallard)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/101804010

506. Jin WJ, Xu JM, Xu WL, Gu DH, Li PW. Diagnostic value of interleukin-8 in colorectal cancer: A case-control study and meta- Analysis. World J Gastroenterol [Internet]. WJG Press; 2014 [cited 2021 Mar 26];20:16334–42. Available from: /pmc/articles/PMC4239526/

507. ILK integrin linked kinase [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/3611

508. ILK Gene - GeneCards | ILK Protein | ILK Antibody [Internet]. [cited 2021 Mar 26]. Available from: https://www.genecards.org/cgi-bin/carddisp.pl?gene=ILK

150

509. INA internexin neuronal intermediate filament protein alpha [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/9118

510. Li Y, Bai L, Yu H, Cai D, Wang X, Huang B, et al. Epigenetic inactivation of α-internexin accelerates microtubule polymerization in colorectal cancer. Cancer Res [Internet]. American Association for Cancer Research Inc.; 2020 [cited 2021 Mar 26];80:5203–15. Available from: https://cancerres.aacrjournals.org/content/80/23/5203

511. IRS4 insulin receptor substrate 4 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/8471

512. Sanmartín-Salinas P, Toledo-Lobo MV, Noguerales-Fraguas F, Fernández-Contreras ME, Guijarro LG. Overexpression of insulin receptor substrate-4 is correlated with clinical staging in colorectal cancer patients. J Mol Histol [Internet]. Springer Netherlands; 2018 [cited 2021 Mar 26];49:39–49. Available from: https://link.springer.com/article/10.1007/s10735-017-9745-0

513. ITGA5 integrin subunit alpha 5 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/3678

514. Yoo HI, Kim BK, Yoon SK. MicroRNA-330-5p negatively regulates ITGA5 expression in human colorectal cancer. Oncol Rep [Internet]. Spandidos Publications; 2016 [cited 2021 Mar 26];36:3023–9. Available from: http://www.mirbase.org/

515. ITGB1 integrin subunit beta 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/3688

516. Liu QZ, Gao XH, Chang WJ, Gong HF, Fu CG, Zhang W, et al. Expression of ITGB1 predicts prognosis in colorectal cancer: A large prospective study based on tissue microarray. Int J Clin Exp Pathol [Internet]. E-Century Publishing Corporation; 2015 [cited 2021 Mar 26];8:12802–10. Available from: www.ijcep.com/

517. KRAS KRAS proto-oncogene, GTPase [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/3845

518. dos Santos W, Sobanski T, de Carvalho AC, Evangelista AF, Matsushita M, Berardinelli GN, et al. Mutation profiling of cancer drivers in Brazilian colorectal cancer. Sci Rep [Internet]. Nature Publishing Group; 2019 [cited 2021 Mar 26];9:1–13. Available from: https://doi.org/10.1038/s41598-019-49611-1

519. LMNB1 lamin B1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/4001

151

520. de Angelis PM, Fjell B, Kravik KL, Haug T, Tunheim SH, Reichelt W, et al. Molecular characterizations of derivatives of HCT116 colorectal cancer cells that are resistant to the chemotherapeutic agent 5-fluorouracil. Int J Oncol [Internet]. Spandidos Publications; 2004 [cited 2021 Mar 26];24:1279–88. Available from: http://www.spandidos- publications.com/10.3892/ijo.24.5.1279/abstract

521. He C, Huang C, Zhou R, Yu H. CirCLMNB1 promotes colorectal cancer by regulating cell proliferation, apoptosis and epithelial-mesenchymal transition. Onco Targets Ther [Internet]. Dove Medical Press Ltd.; 2019 [cited 2021 Mar 26];12:6349–59. Available from: /pmc/articles/PMC6691939/

522. MAL mal, T cell differentiation protein [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/4118

523. Ma R, Xu Y, Wang M, Peng W. Suppression of MAL gene expression is associated with colorectal cancer metastasis. Oncol Lett [Internet]. Spandidos Publications; 2015 [cited 2021 Mar 26];10:957–61. Available from: https://pubmed.ncbi.nlm.nih.gov/26622604/

524. MGMT O-6-methylguanine-DNA methyltransferase [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/4255

525. Inno A. Role of MGMT as biomarker in colorectal cancer. World J Clin Cases [Internet]. Baishideng Publishing Group Inc.; 2014 [cited 2021 Mar 26];2:835. Available from: /pmc/articles/PMC4266830/

526. MITF melanocyte inducing transcription factor [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 8]. Available from: https://www.ncbi.nlm.nih.gov/gene/4286

527. Guhan SM, Artomov M, McCormick S, Njauw CN, Stratigos AJ, Shannon K, et al. Cancer risks associated with the germline MITF(E318K) variant. Sci Rep [Internet]. Nature Research; 2020 [cited 2021 Mar 26];10:1–5. Available from: https://doi.org/10.1038/s41598-020-74237-z

528. Capowski EE, Simonett JM, Clark EM, Wright LS, Howden SE, Wallace KA, et al. Loss of MITF expression during human embryonic stem cell differentiation disrupts retinal pigment epithelium development and optic vesicle cell proliferation. Hum Mol Genet [Internet]. Hum Mol Genet; 2014 [cited 2021 May 8];23:6332–44. Available from: https://pubmed.ncbi.nlm.nih.gov/25008112/

529. MLH1 mutL homolog 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/4292

152

530. Ackermann A, Schrecker C, Bon D, Friedrichs N, Bankov K, Wild P, et al. Downregulation of SPTAN1 is related to MLH1 deficiency and metastasis in colorectal cancer. Suzuki H, editor. PLoS One [Internet]. Public Library of Science; 2019 [cited 2021 Mar 26];14:e0213411. Available from: https://dx.plos.org/10.1371/journal.pone.0213411

531. MMP3 matrix metallopeptidase 3 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/4314

532. Ji Y, Li J, Li P, Wang L, Yang H, Jiang G. C/EBPβ Promotion of MMP3-Dependent Tumor Cell Invasion and Association with Metastasis in Colorectal Cancer. Genet Test Mol Biomarkers [Internet]. Mary Ann Liebert Inc.; 2018 [cited 2021 Mar 26];22:5–10. Available from: https://www.liebertpub.com/doi/abs/10.1089/gtmb.2017.0113

533. MMP9 matrix metallopeptidase 9 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/4318

534. Marshall DC, Lyman SK, McCauley S, Kovalenko M, Spangler R, Liu C, et al. Selective Allosteric Inhibition of MMP9 Is Efficacious in Preclinical Models of Ulcerative Colitis and Colorectal Cancer. Cominelli F, editor. PLoS One [Internet]. Public Library of Science; 2015 [cited 2021 Mar 26];10:e0127063. Available from: https://dx.plos.org/10.1371/journal.pone.0127063

535. MSH2 mutS homolog 2 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/4436

536. Engel C, Ahadova A, Seppälä TT, Aretz S, Bigirwamungu-Bargeman M, Bläker H, et al. Associations of Pathogenic Variants in MLH1, MSH2, and MSH6 With Risk of Colorectal Adenomas and Tumors and With Somatic Mutations in Patients With Lynch Syndrome. Gastroenterology. W.B. Saunders; 2020;158:1326–33.

537. MSH6 mutS homolog 6 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/2956

538. Chen W, Pearlman R, Hampel H, Pritchard CC, Markow M, Arnold C, et al. MSH6 immunohistochemical heterogeneity in colorectal cancer: comparative sequencing from different tumor areas. Hum Pathol. W.B. Saunders; 2020;96:104–11.

539. MST1R macrophage stimulating 1 receptor [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/4486

540. Xie N, Yao Y, Wan L, Zhu T, Liu L, Yuan J. Next-generation sequencing reveals lymph node metastasis associated genetic markers in colorectal cancer. Exp Ther Med [Internet]. Spandidos Publications; 2017 [cited 2021 Mar 26];14:338–43. Available from: http://www.spandidos-publications.com/10.3892/etm.2017.4464/abstract

153

541. MUC1 mucin 1, cell surface associated [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/4582

542. Betge J, Schneider NI, Harbaum L, Pollheimer MJ, Lindtner RA, Kornprat P, et al. MUC1, MUC2, MUC5AC, and MUC6 in colorectal cancer: expression profiles and clinical significance. Virchows Arch. Springer Verlag; 2016;469:255–65.

543. Niv Y. MUC1 and colorectal cancer pathophysiology considerations [Internet]. World J. Gastroenterol. Baishideng Publishing Group Inc; 2008 [cited 2021 Mar 26]. page 2139– 41. Available from: /pmc/articles/PMC2703837/

544. MYC MYC proto-oncogene, bHLH transcription factor [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/4609

545. Xiang JF, Yin QF, Chen T, Zhang Y, Zhang XO, Wu Z, et al. Human colorectal cancer- specific CCAT1-L lncRNA regulates long-range chromatin interactions at the MYC locus. Cell Res [Internet]. Nature Publishing Group; 2014 [cited 2021 Mar 26];24:513–31. Available from: www.cell-research.com

546. NANOG Nanog homeobox [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/79923

547. Zhang J, Espinoza LA, Kinders RJ, Lawrence SM, Pfister TD, Zhou M, et al. NANOG modulates stemness in human colorectal cancer. Oncogene [Internet]. Nature Publishing Group; 2013 [cited 2021 Mar 26];32:4397–405. Available from: www.nature.com/onc

548. NDRG1 N-myc downstream regulated 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/10397

549. Mi L, Zhu F, Yang X, Lu J, Zheng Y, Zhao Q, et al. The metastatic suppressor NDRG1 inhibits EMT, migration and invasion through interaction and promotion of caveolin-1 ubiquitylation in human colorectal cancer cells. Oncogene [Internet]. Nature Publishing Group; 2017 [cited 2021 Mar 26];36:4323–35. Available from: www.nature.com/onc

550. NOTCH1 notch receptor 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/4851

551. Fender AW, Nutter JM, Fitzgerald TL, Bertrand FE, Sigounas G. Notch-1 Promotes Stemness and Epithelial to Mesenchymal Transition in Colorectal Cancer. J Cell Biochem [Internet]. Wiley-Liss Inc.; 2015 [cited 2021 Mar 26];116:2517–27. Available from: https://pubmed.ncbi.nlm.nih.gov/25914224/

552. SPP1 secreted phosphoprotein 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/6696

154

553. Cheng Y, Wen G, Sun Y, Shen Y, Zeng Y, Du M, et al. Osteopontin promotes colorectal cancer cell invasion and the stem cell-like properties through the PI3K-AKT-GSK/3β-β /catenin pathway. Med Sci Monit [Internet]. International Scientific Information, Inc.; 2019 [cited 2021 Mar 26];25:3014–25. Available from: /pmc/articles/PMC6496974/

554. PAK1 p21 (RAC1) activated kinase 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/5058

555. Huynh N, Liu KH, Baldwin GS, He H. P21-activated kinase 1 stimulates colon cancer cell growth and migration/invasion via ERK- and AKT-dependent pathways. Biochim Biophys Acta - Mol Cell Res [Internet]. Biochim Biophys Acta; 2010 [cited 2021 Mar 26];1803:1106–13. Available from: https://pubmed.ncbi.nlm.nih.gov/20595063/

556. PFK2 phosphofructokinase 2 [Arabidopsis thaliana (thale cress)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/834832

557. Shi L, Pan H, Liu Z, Xie J, Han W. Roles of PFKFB3 in cancer [Internet]. Signal Transduct. Target. Ther. Springer Nature; 2017 [cited 2021 Mar 26]. page 1–10. Available from: www.nature.com/sigtrans

558. PIK3CA phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 8]. Available from: https://www.ncbi.nlm.nih.gov/gene/5290

559. Cathomas G. PIK3CA in Colorectal Cancer. Front Oncol [Internet]. Frontiers Research Foundation; 2014 [cited 2021 Mar 26];4:35. Available from: http://journal.frontiersin.org/article/10.3389/fonc.2014.00035/abstract

560. PLEK2 pleckstrin 2 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/26499

561. Liu E-C, Lee T-T, Wu S-L, Lee T-L, Tsai M-F. Abstract 1035: PLEK2 promotes epithelial mesenchymal transition through increasing Snail1 in lung cancer cells. Cancer Res [Internet]. American Association for Cancer Research (AACR); 2019 [cited 2021 Mar 26]. page 1035–1035. Available from: https://cancerres.aacrjournals.org/content/79/13_Supplement/1035

562. Role of pleckstrin-2 in cancer--《中国药理学与毒理学杂志》2019年10期 [Internet]. [cited 2021 Mar 26]. Available from: https://www.cnki.com.cn/Article/CJFDTotal- YLBS201910027.htm

563. PMS2 PMS1 homolog 2, mismatch repair system component [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/5395

155

564. Truninger K, Menigatti M, Luz J, Russell A, Haider R, Gebbers JO, et al. Immunohistochemical analysis reveals high frequency of PMS2 defects in colorectal cancer. Gastroenterology. W.B. Saunders; 2005;128:1160–71.

565. PPARA - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/?term=PPARA

566. Tong JL, Zhang CP, Nie F, Xu XT, Zhu MM, Xiao SD, et al. MicroRNA 506 regulates expression of PPAR alpha in hydroxycamptothecin- resistant human colon cancer cells. FEBS Lett. No longer published by Elsevier; 2011;585:3560–8.

567. Giordano Attianese GMP, Desvergne B. Integrative and systemic approaches for evaluating PPARβ/δ (PPARD) function [Internet]. Nucl. Recept. Signal. SAGE Publications; 2015 [cited 2021 May 16]. page e001. Available from: /pmc/articles/PMC4419664/

568. Tan NS, Vázquez-Carrera M, Montagner A, Sng MK, Guillou H, Wahli W. Transcriptional control of physiological and pathological processes by the nuclear receptor PPAR beta/delta. [cited 2021 Mar 26]; Available from: https://hal.archives- ouvertes.fr/hal-01594607

569. Tan EH, Arthur Wahli W, Soon Tan N. 675. Exploring the Effects of Stroma-Derived PPARB Signaling in Colorectal Tumorigenesis. Mol. Ther. 2016.

570. PPARG peroxisome proliferator activated receptor gamma [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 8]. Available from: https://www.ncbi.nlm.nih.gov/gene/5468

571. Ogino S, Shima K, Baba Y, Nosho K, Irahara N, Kure S, et al. Colorectal Cancer Expression of Peroxisome Proliferator-Activated Receptor γ (PPARG, PPARgamma) Is Associated With Good Prognosis. Gastroenterology [Internet]. W.B. Saunders; 2009 [cited 2021 Mar 26];136:1242–50. Available from: /pmc/articles/PMC2663601/

572. Girnun G. PPARG: A New Independent Marker for Colorectal Cancer Survival [Internet]. Gastroenterology. W.B. Saunders; 2009 [cited 2021 May 8]. page 1157–60. Available from: http://www.gastrojournal.org/article/S0016508509002029/fulltext

573. PPL periplakin [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/5493

574. Li X, Zhang G, Wang Y, Elgehama A, Sun Y, Li L, et al. Loss of periplakin expression is associated with the tumorigenesis of colorectal carcinoma. Biomed Pharmacother. Elsevier Masson SAS; 2017;87:366–74.

156

575. PPP2R1A protein phosphatase 2 scaffold subunit Aalpha [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/5518

576. Tamaki M, Goi T, Hirono Y, Katayama K, Yamaguchi A. PPP2R1B gene alterations inhibit interaction of PP2A-Aβ and PP2A-C proteins in colorectal cancers. Oncol Rep [Internet]. Spandidos Publications; 2004 [cited 2021 Mar 27];11:655–9. Available from: http://www.spandidos-publications.com/10.3892/or.11.3.655/abstract

577. Jeong AL, Han S, Lee S, Su Park J, Lu Y, Yu S, et al. Patient derived mutation W257G of PPP2R1A enhances cancer cell migration through SRC-JNK-c-Jun pathway. Sci Rep [Internet]. Nature Publishing Group; 2016 [cited 2021 Mar 27];6:1–12. Available from: www.nature.com/scientificreports

578. PPP2R5C protein phosphatase 2 regulatory subunit B’gamma [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/5527

579. (No Title) [Internet]. [cited 2021 Mar 27]. Available from: https://link.springer.com/content/pdf/10.1007/s13277-016-5145-4.pdf

580. Mumby M. PP2A: Unveiling a Reluctant Tumor Suppressor [Internet]. Cell. Cell; 2007 [cited 2021 Mar 27]. page 21–4. Available from: https://pubmed.ncbi.nlm.nih.gov/17632053/

581. Hayes J, Peruzzi PP, Lawler S. MicroRNAs in cancer: Biomarkers, functions and therapy [Internet]. Trends Mol. Med. Elsevier Ltd; 2014 [cited 2021 Mar 27]. page 460–9. Available from: http://dx.doi.org/10.1016/j.molmed.2014.06.005

582. PRRG4 proline rich and Gla domain 4 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/79056

583. Zhang L, Qin Y, Wu G, Wang J, Cao J, Wang Y, et al. PRRG4 promotes breast cancer metastasis through the recruitment of NEDD4 and downregulation of Robo1. Oncogene [Internet]. Springer Nature; 2020 [cited 2021 Mar 27];39:7196–208. Available from: https://www.nature.com/articles/s41388-020-01494-7

584. Uhlen M, Zhang C, Lee S, Sjöstedt E, Fagerberg L, Bidkhori G, et al. A pathology atlas of the human cancer transcriptome. Science (80- ). American Association for the Advancement of Science; 2017;357.

585. PTEN phosphatase and tensin homolog [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/5728

586. Frattini M, Saletti P, Romagnani E, Martin V, Molinari F, Ghisletta M, et al. PTEN loss of expression predicts cetuximab efficacy in metastatic colorectal cancer patients. Br J

157

Cancer [Internet]. Nature Publishing Group; 2007 [cited 2021 Mar 27];97:1139–45. Available from: www.bjcancer.com

587. Salvatore L, Calegari MA, Loupakis F, Fassan M, Di Stefano B, Bensi M, et al. PTEN in Colorectal Cancer: Shedding Light on Its Role as Predictor and Target. Cancers (Basel) [Internet]. MDPI AG; 2019 [cited 2021 Mar 27];11:1765. Available from: https://www.mdpi.com/2072-6694/11/11/1765

588. PTK2 protein tyrosine kinase 2 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/5747

589. Yang H, Li X, Meng Q, Sun H, Wu S, Hu W, et al. CircPTK2 (hsa_circ_0005273) as a novel therapeutic target for metastatic colorectal cancer. Mol Cancer [Internet]. BioMed Central Ltd.; 2020 [cited 2021 Mar 27];19:13. Available from: https://molecular- cancer.biomedcentral.com/articles/10.1186/s12943-020-1139-3

590. RAB5A RAB5A, member RAS oncogene family [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/5868

591. Takeda M, Koseki J, Takahashi H, Miyoshi N, Nishida N, Nishimura J, et al. Disruption of endolysosomal Rab5/7 efficiently eliminates colorectal cancer stem cells. Cancer Res [Internet]. American Association for Cancer Research Inc.; 2019 [cited 2021 Mar 27];79:1426–37. Available from: http://cancerres.aacrjournals.org/content/canres/79/7/1426/F1.large.jpg.

592. RAC1 Rac family small GTPase 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/5879

593. Kotelevets L, Chastre E. Rac1 signaling: From intestinal homeostasis to colorectal cancer metastasis [Internet]. Cancers (Basel). MDPI AG; 2020 [cited 2021 Mar 27]. Available from: /pmc/articles/PMC7140047/

594. RET ret proto-oncogene [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/5979

595. Lipson D, Capelletti M, Yelensky R, Otto G, Parker A, Jarosz M, et al. Identification of new ALK and RET gene fusions from colorectal and lung cancer biopsies. Nat Med [Internet]. Nature Publishing Group; 2012 [cited 2021 Mar 27];18:382–4. Available from: https://www.nature.com/articles/nm.2673

596. Le Rolle AF, Klempner SJ, Garrett CR, Seery T, Sanford EM, Balasubramanian S, et al. Identification and characterization of RET fusions in advanced colorectal cancer. Oncotarget [Internet]. Impact Journals LLC; 2015 [cited 2021 Mar 27];6:28929–37. Available from: /pmc/articles/PMC4745701/

158

597. RGS2 regulator of G protein signaling 2 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/5997

598. Linder A, Hagberg Thulin M, Damber JE, Welén K. Analysis of regulator of G-protein signalling 2 (RGS2) expression and function during prostate cancer progression. Sci Rep [Internet]. Nature Publishing Group; 2018 [cited 2021 Mar 27];8:1–14. Available from: www.nature.com/scientificreports

599. Kim Y, Ghil S. Regulators of G-protein signaling, RGS2 and RGS4, inhibit protease- activated receptor 4-mediated signaling by forming a complex with the receptor and Gα in live cells. Cell Commun Signal [Internet]. BioMed Central Ltd; 2020 [cited 2021 Mar 27];18:1–13. Available from: https://link.springer.com/articles/10.1186/s12964-020- 00552-7

600. RUNX2 RUNX family transcription factor 2 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/860

601. Ji Q, Cai G, Liu X, Zhang Y, Wang Y, Zhou L, et al. MALAT1 regulates the transcriptional and translational levels of proto-oncogene RUNX2 in colorectal cancer metastasis. Cell Death Dis [Internet]. Nature Publishing Group; 2019 [cited 2021 Mar 27];10:1–17. Available from: https://doi.org/10.1038/s41419-019-1598-x

602. RHOA ras homolog family member A [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/387

603. Dopeso H, Rodrigues P, Bilic J, Bazzocco S, Cartón-García F, Macaya I, et al. Mechanisms of inactivation of the tumour suppressor gene RHOA in colorectal cancer [Internet]. Br. J. Cancer. Nature Publishing Group; 2018 [cited 2021 Mar 27]. page 106– 16. Available from: /pmc/articles/PMC5765235/

604. RPS6KB1 ribosomal protein S6 kinase B1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/6198

605. Wang X, Yao Y, Zhu X. The influence of aberrant expression of GLI1/p-S6K on colorectal cancer. Biochem Biophys Res Commun. Elsevier B.V.; 2018;503:3198–204.

606. SCNN1A sodium channel epithelial 1 subunit alpha [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/6337

607. Zhang GL, Pan LL, Huang T, Wang JH. The transcriptome difference between colorectal tumor and normal tissues revealed by single-cell sequencing. J Cancer [Internet]. Ivyspring International Publisher; 2019 [cited 2021 Mar 27];10:5883–90. Available from: /pmc/articles/PMC6843882/

159

608. Geng Y, Zhao S, Jia Y, Xia G, Li H, Fang Z, et al. MiR–95 promotes osteosarcoma growth by targeting SCNN1A. Oncol Rep [Internet]. Spandidos Publications; 2020 [cited 2021 Mar 27];43:1429–36. Available from: http://www.spandidos- publications.com/10.3892/or.2020.7514/abstract

609. Peixoto P, Etcheverry A, Aubry M, Missey A, Lachat C, Perrard J, et al. EMT is associated with an epigenetic signature of ECM remodeling genes. Cell Death Dis [Internet]. Nature Publishing Group; 2019 [cited 2021 Jun 4];10:1–17. Available from: https://doi.org/10.1038/s41419-019-1397-4

610. SERPINB5 serpin family B member 5 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/5268

611. SERPINB5 - Serpin B5 - Homo sapiens (Human) - SERPINB5 gene & protein [Internet]. [cited 2021 May 16]. Available from: https://www.uniprot.org/uniprot/P36952

612. Baek JY, Yeo HY, Chang HJ, Kim K-H, Kim SY, Park JW, et al. Serpin B5 is a CEA- interacting biomarker for colorectal cancer. Int J Cancer [Internet]. John Wiley & Sons, Ltd; 2014 [cited 2021 Mar 27];134:1595–604. Available from: http://doi.wiley.com/10.1002/ijc.28494

613. Chang IW, Liu KW, Ragunanan M, He HL, Shiue YL, Yu SC. SERPINB5 expression: Association with CCRT response and prognostic value in rectal cancer. Int J Med Sci [Internet]. Ivyspring International Publisher; 2018 [cited 2021 Mar 27];15:376–84. Available from: /pmc/articles/PMC5835708/

614. SIRT1 sirtuin 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/23411

615. Sun LN, Zhi Z, Chen LY, Zhou Q, Li XM, Gan WJ, et al. SIRT1 suppresses colorectal cancer metastasis by transcriptional repression of miR-15b-5p. Cancer Lett [Internet]. Elsevier Ireland Ltd; 2017 [cited 2021 Mar 27];409:104–15. Available from: https://pubmed.ncbi.nlm.nih.gov/28923398/

616. SMAD3 SMAD family member 3 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/4088

617. Xie W, Rimm DL, Lin Y, Shih WJ, Reiss M. Loss of smad signaling in human colorectal cancer is associated with advanced disease and poor prognosis. Cancer J [Internet]. Cancer J; 2003 [cited 2021 Mar 27];9:302–12. Available from: https://pubmed.ncbi.nlm.nih.gov/12967141/

618. SMAD4 SMAD family member 4 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/4089

160

619. Fleming NI, Jorissen RN, Mouradov D, Christie M, Sakthianandeswaren A, Palmieri M, et al. SMAD2, SMAD3 and SMAD4 mutations in colorectal cancer. Cancer Res [Internet]. American Association for Cancer Research; 2013 [cited 2021 Mar 27];73:725– 35. Available from: http://cancerres.aacrjournals.org/

620. SNAI1 snail family transcriptional repressor 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/6615

621. Zhu Y, Wang C, Becker SA, Hurst K, Nogueira LM, Findlay VJ, et al. miR-145 Antagonizes SNAI1-Mediated Stemness and Radiation Resistance in Colorectal Cancer. Mol Ther. Cell Press; 2018;26:744–54.

622. SNAI2 snail family transcriptional repressor 2 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/6591

623. SNAI2 Gene - GeneCards | SNAI2 Protein | SNAI2 Antibody [Internet]. [cited 2021 Mar 27]. Available from: https://www.genecards.org/cgi-bin/carddisp.pl?gene=SNAI2

624. Du F, Li X, Feng W, Qiao C, Chen J, Jiang M, et al. SOX13 promotes colorectal cancer metastasis by transactivating SNAI2 and c-MET. Oncogene [Internet]. Springer Nature; 2020 [cited 2021 Mar 27];39:3522–40. Available from: https://www.nature.com/articles/s41388-020-1233-4

625. SNCA synuclein alpha [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/6622

626. Yang Q, Wang S, Ma J, Li X, Liu X, Peng M, et al. Identification the potential of stool- based SNCA methylation as a candidate biomarker for early colorectal cancer detection. Transl Cancer Res [Internet]. AME Publishing Company; 2017 [cited 2021 Mar 27];6:169–76. Available from: http://dx.doi.org/10.21037/tcr.2017.01.30

627. Akil H, Perraud A, Mélin C, Jauberteau M-O, Mathonnet M. Fine-Tuning Roles of Endogenous Brain-Derived Neurotrophic Factor, TrkB and Sortilin in Colorectal Cancer Cell Survival. Mattson MP, editor. PLoS One [Internet]. Public Library of Science; 2011 [cited 2021 Mar 27];6:e25097. Available from: https://dx.plos.org/10.1371/journal.pone.0025097

628. SOX2 SRY-box transcription factor 2 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2020 Apr 18]. Available from: https://www.ncbi.nlm.nih.gov/gene/6657

629. Takeda K, Mizushima T, Yokoyama Y, Hirose H, Wu X, Qian Y, et al. Sox2 is associated with cancer stem-like properties in colorectal cancer. Sci Rep [Internet]. Nature Publishing Group; 2018 [cited 2019 Sep 26];8:17639. Available from: http://www.nature.com/articles/s41598-018-36251-0

161

630. Chaudhary S, Islam Z, Mishra V, Rawat S, Ashraf GM, Kolatkar PR. Sox2: A Regulatory Factor in Tumorigenesis and Metastasis. Curr Protein Pept Sci. Bentham Science Publishers Ltd.; 2019;20:495–504.

631. Neumann J, Bahr F, Horst D, Kriegl L, Engel J, Mejías-Luque R, et al. SOX2 expression correlates with lymph-node metastases and distant spread in right-sided colon cancer. BMC Cancer [Internet]. 2011 [cited 2019 Nov 22];11:518. Available from: http://bmccancer.biomedcentral.com/articles/10.1186/1471-2407-11-518

632. SOX9 SRY-box transcription factor 9 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/6662

633. Carrasco-Garcia E, Lopez L, Aldaz P, Arevalo S, Aldaregia J, Eganã L, et al. SOX9- regulated cell plasticity in colorectal metastasis is attenuated by rapamycin. Sci Rep [Internet]. Nature Publishing Group; 2016 [cited 2021 Mar 27];6:1–12. Available from: www.nature.com/scientificreports/

634. Prévostel C, Blache P. The dose-dependent effect of SOX9 and its incidence in colorectal cancer. Eur. J. Cancer. Elsevier Ltd; 2017. page 150–7.

635. SPARC secreted protein acidic and cysteine rich [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 8]. Available from: https://www.ncbi.nlm.nih.gov/gene/6678

636. Kurtul N, Taşdemir EA, Ünal D, Izmirli M, Eroglu C. SPARC: As a prognostic biomarker in rectal cancer patients treated with chemo-radiotherapy. Cancer Biomarkers [Internet]. IOS Press; 2017 [cited 2021 Mar 27];18:459–66. Available from: https://pubmed.ncbi.nlm.nih.gov/28009327/

637. Rezvani N, Alibakhshi R, Vaisi-Raygani A, Bashiri H, Saidijam M. Detection of SPG20 gene promoter-methylated DNA, as a novel epigenetic biomarker, in plasma for colorectal cancer diagnosis using the Methylight method. Oncol Lett [Internet]. Spandidos Publications; 2017 [cited 2021 Mar 27];13:3277–84. Available from: http://www.spandidos-publications.com/10.3892/ol.2017.5815/abstract

638. Zhou Z, Wang W, Xie X, Song Y, Dang C, Zhang H. Methylation-induced silencing of SPG20 facilitates gastric cancer cell proliferation by activating the EGFR/MAPK pathway. Biochem Biophys Res Commun. Elsevier B.V.; 2018;500:411–7.

639. STAT3 signal transducer and activator of transcription 3 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/6774

640. Wei N, Li J, Fang C, Chang J, Xirou V, Syrigos NK, et al. Targeting colon cancer with the novel STAT3 inhibitor bruceantinol. Oncogene [Internet]. Nature Publishing Group; 2019 [cited 2021 Mar 27];38:1676–87. Available from: https://pubmed.ncbi.nlm.nih.gov/30348989/

162

641. Mihaly SR, Ninomiya-Tsuji J, Morioka S. TAK1 control of cell death [Internet]. Cell Death Differ. Nature Publishing Group; 2014 [cited 2021 Mar 27]. page 1667–76. Available from: www.nature.com/cdd

642. Singh A, Sweeney MF, Yu M, Burger A, Greninger P, Benes C, et al. TAK1 inhibition promotes apoptosis in KRAS-dependent colon cancers. Cell [Internet]. NIH Public Access; 2012 [cited 2021 Mar 27];148:639–50. Available from: /pmc/articles/PMC3291475/

643. Grant SFA, Thorleifsson G, Reynisdottir I, Benediktsson R, Manolescu A, Sainz J, et al. Variant of transcription factor 7-like 2 (TCF7L2) gene confers risk of type 2 diabetes. Nat Genet. 2006;38:320–3.

644. Wenzel J, Rose K, Haghighi EB, Lamprecht C, Rauen G, Freihen V, et al. Loss of the nuclear Wnt pathway effector TCF7L2 promotes migration and invasion of human colorectal cancer cells. Oncogene [Internet]. Springer Nature; 2020 [cited 2021 Mar 27];39:3893–909. Available from: https://doi.org/10.1038/s41388-020-1259-7

645. TFPI2 tissue factor pathway inhibitor 2 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/7980

646. Hu H, Chen X, Wang C, Jiang Y, Li J, Ying X, et al. The role of TFPI2 hypermethylation in the detection of gastric and colorectal cancer. Oncotarget [Internet]. Impact Journals LLC; 2017 [cited 2021 Mar 27];8:84054–65. Available from: /pmc/articles/PMC5663576/

647. TGFA transforming growth factor alpha [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/7039

648. Stathopoulos GP, Batziou C, Trafalis D, Koutantos J, Batzios S, Stathopoulos J, et al. Treatment of Colorectal Cancer with and without Bevacizumab: A Phase III Study. Oncology [Internet]. Karger Publishers; 2010 [cited 2021 Mar 27];78:376–81. Available from: https://www.karger.com/Article/FullText/320520

649. Shim KS, Kim KH, Park BW, Yi SY, Choi JH, Han WS, et al. Increased serum levels of transforming growth factor-α in patients with colorectal cancer. Dis Colon Rectum [Internet]. Springer; 1998 [cited 2021 Mar 27];41:219–24. Available from: https://link.springer.com/article/10.1007/BF02238252

650. TGFB1 transforming growth factor beta 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/7040

651. Stanilova S, Stanilov N, Julianov A, Manolova I, Miteva L. Transforming growth factor- β1 gene promoter -509C/T polymorphism in association with expression affects colorectal cancer development and depends on gender. Ahmad A, editor. PLoS One [Internet]. Public Library of Science; 2018 [cited 2021 Mar 27];13:e0201775. Available from: https://dx.plos.org/10.1371/journal.pone.0201775

163

652. TGFB2 transforming growth factor beta 2 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/7042

653. Pellatt AJ, Mullany LE, Herrick JS, Sakoda LC, Wolff RK, Samowitz WS, et al. The TGFβ-signaling pathway and colorectal cancer: Associations between dysregulated genes and miRNAs. J Transl Med [Internet]. BioMed Central Ltd.; 2018 [cited 2021 Mar 27];16:191. Available from: https://translational- medicine.biomedcentral.com/articles/10.1186/s12967-018-1566-8

654. Jung B, Staudacher JJ, Beauchamp D. Transforming Growth Factor β Superfamily Signaling in Development of Colorectal Cancer [Internet]. Gastroenterology. W.B. Saunders; 2017 [cited 2021 Mar 27]. page 36–52. Available from: http://dx.doi.org/10.1053/j.gastro.2016.10.015

655. TGFB3 transforming growth factor beta 3 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/7043

656. Chen X, Liu J, Zhang Q, Liu B, Cheng Y, Zhang Y, et al. Exosome-mediated transfer of miR-93-5p from cancer-associated fibroblasts confer radioresistance in colorectal cancer cells by downregulating FOXA1 and upregulating TGFB3. [cited 2021 Mar 27]; Available from: https://doi.org/10.1186/s13046-019-1507-2

657. Zhou R, Huang Y, Cheng B, Wang Y, Xiong B. TGFBR1*6a is a potential modifier of migration and invasion in colorectal cancer cells. Oncol Lett [Internet]. Spandidos Publications; 2018 [cited 2021 Mar 27];15:3971–6. Available from: http://www.spandidos-publications.com/10.3892/ol.2018.7725/abstract

658. TGFBR2 transforming growth factor beta receptor 2 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/7048

659. THBS1 thrombospondin 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/7057

660. Liu X, Xu D, Liu Z, Li Y, Zhang C, Gong Y, et al. THBS1 facilitates colorectal liver metastasis through enhancing epithelial–mesenchymal transition. Clin Transl Oncol [Internet]. Springer; 2020 [cited 2021 Mar 27];22:1730–40. Available from: https://link.springer.com/article/10.1007/s12094-020-02308-8

661. Song G, Xu S, Zhang H, Wang Y, Xiao C, Jiang T, et al. TIMP1 is a prognostic marker for the progression and metastasis of colon cancer through FAK-PI3K/AKT and MAPK pathway. J Exp Clin Cancer Res [Internet]. BioMed Central Ltd.; 2016 [cited 2021 Mar 27];35:148. Available from: http://jeccr.biomedcentral.com/articles/10.1186/s13046-016- 0427-7

164

662. TIMP1 TIMP metallopeptidase inhibitor 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/7076

663. TIMP3 TIMP metallopeptidase inhibitor 3 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/7078

664. Huang HL, Liu YM, Sung TY, Huang TC, Cheng YW, Liou JP, et al. TIMP3 expression associates with prognosis in colorectal cancer and its novel arylsulfonamide inducer, MPT0B390, inhibits tumor growth, metastasis and angiogenesis. Theranostics [Internet]. NLM (Medline); 2019 [cited 2021 Mar 27];9:6676–89. Available from: /pmc/articles/PMC6771239/

665. García-Bilbao A, Armañanzas R, Ispizua Z, Calvo B, Alonso-Varona A, Inza I, et al. Identification of a biomarker panel for colorectal cancer diagnosis. BMC Cancer [Internet]. BioMed Central; 2012 [cited 2021 Mar 27];12:43. Available from: http://bmccancer.biomedcentral.com/articles/10.1186/1471-2407-12-43

666. Wörthmüller J, Blum W, Pecze L, Salicio V, Schwaller B. Calretinin promotes invasiveness and EMT in malignant mesothelioma cells involving the activation of the FAK signaling pathway. Oncotarget [Internet]. Impact Journals LLC; 2018 [cited 2021 May 8];9:36256–72. Available from: /pmc/articles/PMC6284738/

667. TNFAIP3 TNF alpha induced protein 3 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/7128

668. Liu K, Yao H, Wen Y, Zhao H, Zhou N, Lei S, et al. Functional role of a long non-coding RNA LIFR-AS1/miR-29a/TNFAIP3 axis in colorectal cancer resistance to pohotodynamic therapy. Biochim Biophys Acta - Mol Basis Dis. Elsevier B.V.; 2018;1864:2871–80.

669. Ungerbäck J, Belenki D, Jawad ul-Hassan A, Fredrikson M, Fransén K, Elander N, et al. Genetic variation and alterations of genes involved in NFκB/TNFAIP3- and NLRP3- inflammasome signaling affect susceptibility and outcome of colorectal cancer. Carcinogenesis [Internet]. Oxford Academic; 2012 [cited 2021 Mar 27];33:2126–34. Available from: https://academic.oup.com/carcin/article- lookup/doi/10.1093/carcin/bgs256

670. TNFAIP6 TNF alpha induced protein 6 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/7130

671. Guo F, Yuan Y. Tumor necrosis factor alpha-induced proteins in malignant tumors: Progress and prospects [Internet]. Onco. Targets. Ther. Dove Medical Press Ltd.; 2020 [cited 2021 Mar 27]. page 3303–18. Available from: /pmc/articles/PMC7182456/

672. TNF tumor necrosis factor [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/7124

165

673. Shen Z, Zhou R, Liu C, Wang Y, Zhan W, Shao Z, et al. MicroRNA-105 is involved in TNF-α-related tumor microenvironment enhanced colorectal cancer progression article. Cell Death Dis [Internet]. Nature Publishing Group; 2017 [cited 2021 Mar 27];8:1–13. Available from: http://cancergenome.nih.gov/

674. TP53 tumor protein p53 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/7157

675. Naccarati A, Polakova V, Pardini B, Vodickova L, Hemminki K, Kumar R, et al. Mutations and polymorphisms in TP53 gene--an overview on the role in colorectal cancer. Mutagenesis [Internet]. Oxford Academic; 2012 [cited 2021 Mar 27];27:211–8. Available from: https://academic.oup.com/mutage/article-lookup/doi/10.1093/mutage/ger067

676. TWIST1 twist family bHLH transcription factor 1 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/7291

677. Khot M, Sreekumar D, Jahagirdar S, Kulkarni A, Hari K, Faseela EE, et al. Twist1 induces chromosomal instability (CIN) in colorectal cancer cells. Hum Mol Genet [Internet]. Oxford University Press; 2020 [cited 2021 Mar 27];29:1673–88. Available from: https://academic.oup.com/hmg/article/29/10/1673/5825274

678. TWIST2 twist family bHLH transcription factor 2 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/117581

679. Long L, Huang G, Zhu H, Guo Y, Liu Y, Huo J. Down-regulation of miR-138 promotes colorectal cancer metastasis via directly targeting TWIST2. J Transl Med [Internet]. BioMed Central; 2013 [cited 2021 Mar 27];11:1–10. Available from: https://link.springer.com/articles/10.1186/1479-5876-11-275

680. VDR vitamin D receptor [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/7421

681. Messaritakis I, Koulouridi A, Sfakianaki M, Vogiatzoglou K, Gouvas N, Athanasakis E, et al. The Role of Vitamin D Receptor Gene Polymorphisms in Colorectal Cancer Risk. Cancers (Basel) [Internet]. MDPI AG; 2020 [cited 2021 Mar 27];12:1379. Available from: https://www.mdpi.com/2072-6694/12/6/1379

682. VEGFA vascular endothelial growth factor A [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/7422

683. Takahashi N, Iwasa S, Taniguchi H, Sasaki Y, Shoji H, Honma Y, et al. Prognostic role of ERBB2, MET and VEGFA expression in metastatic colorectal cancer patients treated with anti-EGFR antibodies. Br J Cancer [Internet]. Nature Publishing Group; 2016 [cited 2021 Mar 27];114:1003–11. Available from: www.bjcancer.com

166

684. VEGFC vascular endothelial growth factor C [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/7424

685. Tacconi C, Ungaro F, Correale C, Arena V, Massimino L, Detmar M, et al. Activation of the VEGFC/VEGFR3 pathway induces tumor immune escape in colorectal cancer. Cancer Res [Internet]. American Association for Cancer Research Inc.; 2019 [cited 2021 Mar 27];79:4196–210. Available from: http://cancerres.aacrjournals.org/

686. VIM vimentin [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/7431

687. Gu C, Wang X, Long T, Wang X, Zhong Y, Ma Y, et al. FSTL1 interacts with VIM and promotes colorectal cancer metastasis via activating the focal adhesion signalling pathway. Cell Death Dis [Internet]. Nature Publishing Group; 2018 [cited 2021 Mar 27];9:1–14. Available from: http://jaspar.

688. WNT5A Wnt family member 5A [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/7474

689. Liu Q, Yang C, Wang S, Shi D, Wei C, Song J, et al. Wnt5a-induced M2 polarization of tumor-associated macrophages via IL-10 promotes colorectal cancer progression. [cited 2021 Mar 27]; Available from: https://doi.org/10.1186/s12964-020-00557-2

690. WNT5B Wnt family member 5B [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/81029

691. Nie X, Liu H, Liu L, Wang YD, Chen WD. Emerging Roles of Wnt Ligands in Human Colorectal Cancer [Internet]. Front. Oncol. Frontiers Media S.A.; 2020 [cited 2021 Mar 27]. page 1341. Available from: /pmc/articles/PMC7456893/

692. ZEB1 Gene - GeneCards | ZEB1 Protein | ZEB1 Antibody [Internet]. [cited 2021 Mar 27]. Available from: https://www.genecards.org/cgi-bin/carddisp.pl?gene=ZEB1

693. Lindner P, Paul S, Eckstein M, Hampel C, Muenzner JK, Erlenbach-Wuensch K, et al. EMT transcription factor ZEB1 alters the epigenetic landscape of colorectal cancer cells. Cell Death Dis [Internet]. Springer Nature; 2020 [cited 2021 Mar 27];11:1–13. Available from: https://doi.org/10.1038/s41419-020-2340-4

694. S Anchez-Till E, De Barrios O, Siles L, Amendola PG, Darling DS, Cuatrecasas M, et al. Human Cancer Biology ZEB1 Promotes Invasiveness of Colorectal Carcinoma Cells through the Opposing Regulation of uPA and PAI-1. 2013 [cited 2021 Mar 27]; Available from: http://clincancerres.aacrjournals.org/

695. ZEB2 zinc finger E-box binding homeobox 2 [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/9839

167

696. Li MZ, Wang JJ, Yang S Bin, Li WF, Xiao L Bin, He YL, et al. ZEB2 promotes tumor metastasis and correlates with poor prognosis of human colorectal cancer. Am J Transl Res [Internet]. E-Century Publishing Corporation; 2017 [cited 2021 Mar 27];9:2838–51. Available from: /pmc/articles/PMC5489885/

697. Chi H. Regulation and function of mTOR signalling in T cell fate decisions [Internet]. Nat. Rev. Immunol. Nature Publishing Group; 2012 [cited 2021 Mar 27]. page 325–38. Available from: www.nature.com/reviews/immunol

698. MTOR mechanistic target of rapamycin kinase [Homo sapiens (human)] - Gene - NCBI [Internet]. [cited 2021 May 16]. Available from: https://www.ncbi.nlm.nih.gov/gene/2475

699. Francipane MG, Lagasse E. mTOR pathway in colorectal cancer: An update [Internet]. Oncotarget. Impact Journals LLC; 2014 [cited 2021 Mar 27]. page 49–66. Available from: /pmc/articles/PMC3960188/

700. CLTC Gene - GeneCards | CLH1 Protein | CLH1 Antibody [Internet]. [cited 2021 Mar 27]. Available from: https://www.genecards.org/cgi-bin/carddisp.pl?gene=CLTC

701. Tristan C, Shahani N, Sedlak TW, Sawa A. The diverse functions of GAPDH: Views from different subcellular compartments [Internet]. Cell. Signal. NIH Public Access; 2011 [cited 2021 Mar 27]. page 317–23. Available from: /pmc/articles/PMC3084531/

702. GUSB gene: MedlinePlus Genetics [Internet]. [cited 2021 Mar 27]. Available from: https://medlineplus.gov/genetics/gene/gusb/

703. HPRT1 Gene - GeneCards | HPRT Protein | HPRT Antibody [Internet]. [cited 2021 Mar 27]. Available from: https://www.genecards.org/cgi-bin/carddisp.pl?gene=HPRT1

704. PGK1 gene: MedlinePlus Genetics [Internet]. [cited 2021 Mar 27]. Available from: https://medlineplus.gov/genetics/gene/pgk1/

705. RPLP0 Gene - GeneCards | RLA0 Protein | RLA0 Antibody [Internet]. [cited 2021 Mar 27]. Available from: https://www.genecards.org/cgi-bin/carddisp.pl?gene=RPLP0

706. TUBB Gene - GeneCards | TBB5 Protein | TBB5 Antibody [Internet]. [cited 2021 Mar 27]. Available from: https://www.genecards.org/cgi-bin/carddisp.pl?gene=TUBB

168

Appendix A Supplementary Tables

S-T1: Genes Included in the Custom NanoString nCounter Codeset, Their Biological Function & Reason for Inclusion…………………………………………………………………………170

169

Supplementary Table 1. Genes Included in the Custom NanoString nCounter Codeset, Their Biological Function & Reason for Inclusion

Gene Category Biological Function Reason for Inclusion OCT4 Cell Transcription factor Heightened expression of Stemness necessary for OCT4 has been associated with maintaining CRC pathogenesis and pluripotency and self- aggressive clinical features, renewal in embryonic including higher grade stem cell and germ cell tumours, and poor patient populations (347). outcomes (348). This gene is reported to interact with the miRNA let-7 inhibition of embryonic cell reprogramming (349).

ACTA2 Invasion & Encodes smooth muscle Amplification of this gene has Metastasis cell actin alpha 2, a been linked to early metastasis protein involved in cell in lung cancer & has been structure, motility and identified as a hub gene in signalling (350). CRC bioinformatics analysis (351). Overexpression is linked to poor disease-free survival in CRC patients (352,353). This gene is reported to be involved in NOTCH signalling pathways, which is a regulator of tissue homeostasis (350).

AFP Cancer Encodes the major Some cases of CRC have been Onset & plasma protein during linked to elevated AFP Progression fetal life and is production and amongst CRC produced by the yolk patients, those with AFP sac and liver (354). producing tumours tended to have more aggressive tumours (355,356). Reported to be involved in SMAD protein signal transduction, which is important for the regulation of cell development and growth (354).

170

AHNAK Epithelial- Encodes a Reported to play a role in Mesenchymal nucleoprotein with tumour metastasis by inducing Transition many different epithelial-to-mesenchymal functions including cell transition via TGF signalling structure, migration, (358,359). blood-brain barrier formation, and regulation of calcium channels in cardiac tissue (357).

AKT1 Cancer A family of genes Aberrant activity of this protein AKT2 Onset & encoding is linked to tumourigenesis and AKT3 Progression serine/threonine kinases several different malignancies which have a regulatory including CRC. These genes role in cellular function as oncogenes in many proliferation, human cancers and have been differentiation and linked to tumour metastasis survival. They have (363,364). Signalling of also been shown to play PI3K/AKT is reported to be an important role in enhanced by insulin (64). insulin signalling (360– These genes may also play a 362). role in AMPK signalling pathways (361).

ANXA3 Cell Encodes a protein in the Elevated expression has been Stemness/ annexin family with documented in CRC pathology. Cancer regulatory roles in This gene functions as an Onset & signal transduction and oncogene (366–368). Progression/ cell growth. Also plays Downregulation of this gene Invasion & a role in coagulation has been reported to inhibit Metastasis pathways (365). proliferation, invasion and migration of CRC cells (366).

APC Invasion & Encodes critical tumour This tumour suppressor has Metastasis suppressor protein that been widely documented as a regulates cell division, highly mutated gene in CRC cell-cell adhesion, and where the protein is truncated cellular migration leading to a loss of tumour (369). suppressor function and cancer progression (370). This gene’s role in regulating WNT signalling is well documented.

171

AURKA Cancer A kinase that helps AUKRA overexpression has Onset & regulate the cell cycle. been documented in CRC and Progression Is present at the has been shown to upregulate centrosome during WNT and Ras-MAPK signalling to further exacerbate segregation (371). CRC pathogenesis (372). It has also been implicated in P53 signalling (371).

BIRC3 Cancer Encodes a protein with Upregulation of this gene is Onset & several functions linked to resistance to Progression/ including the regulation chemotherapy in CRC patients Immune of apoptosis and (375). The gene has been Markers & mitogenic processes implicated in NFB immune Inflammation (373,374). signalling, as well as anti- apoptosis signalling (374).

BMP7 Cell The protein plays a key Loss of function of this gene Stemness role in the cellular can lead to the development of differentiation of cancer stem cell populations mesenchymal cells to and therapy resistance. Some distinct tissues in the studies report that enhancing bone and cartilage the function of BMP7 (376). promotes the differentiation of these cells, facilitating more effective treatment (377).

BRAF Cancer Key protein involved in Mutations in BRAF can lead to Onset & signal transduction in constitutive activation of Progression the RAS/MAPK oncogenic signalling pathways pathway with several that promote uncontrolled cell downstream effects growth and proliferation, including cell reduced treatment efficacy and proliferation and poor patient outcomes (379). survival (378).

CA 125 Immune Encodes mucin 16, a Elevated serum levels of this Markers & membrane associated protein can be used in cancer Inflammation protein that has been diagnosis and prognosis and used as a biomarker for may provide some information several cancers. Plays a regarding the management of key role in generating CRC patients (381). Studies the mucous barrier on report this protein may be the apical surface of involved in the inhibition of epithelial tissues (380). antitumour immune responses,

172

while promoting tumourigenic inflammation (382).

CAMK2N1 Immune Inhibits CAMKII, a Downregulation of this protein Markers & protein kinase which is linked to therapy resistance, Inflammation/ has nearly 40 aggressive disease and poor Invasion & downstream targets survival in several Metastasis (383). malignancies (384). Loss of inhibition of CAMKII enables WNT/CAMKII signalling to recruit tumour associated macrophages, inflammation, and cancer invasion (385).

CCND1 Cancer A protein that helps CCND1 is often overexpressed Onset & regulate cyclin in many human cancers Progression D1/cyclin D4 including CRC. complexes which are Overexpression has been crucial in cell cycle reported to lead to nuclear regulation (386). retention of the protein and constitutive activation of CDK1/4 complexes (387).

CD82 Invasion & Membrane glycoprotein Downregulation of this protein Metastasis that has important roles has been reported to correlate in cell adhesion (388). with more advanced stage disease and poor patient outcomes in several cancers including CRC (389). This protein is reported to be a suppressor of metastasis, and is reported to be involved in PI3K/HER2/Ras Pathway and downstream P63/CD82 pathways (390,391).

CD83 Immune A cell surface protein Downregulation of this protein Markers & marker for mature has been reported in the CRC Inflammation dendritic cells. Has tumour microenvironment, been shown to localize thereby suppressing an immune in the thymus and play response to abnormal antigens a role in T-cell produced by a developing development via tumour (393). antigen presentation (392).

173

CDH1 Invasion & Encodes the protein E- Variants in this gene have been Metastasis cadherin which has linked to familial CRC and has important roles in cell- been shown to increase the risk cell adhesion, for the development and proliferation, and progression of this disease, as motility in epithelial well as enhanced proliferation, cells (394). invasion and metastatic potential (395,396).

CDH2 Epithelial- Encodes the protein N- Upregulation of this protein Mesenchymal cadherin which plays a has been reported during Transition role in cell-cell epithelial-to-mesenchymal adhesion, neural transition in cancer development and pathogenesis of several cartilage and bone malignancies; researchers also establishment. This report that it is linked to poorer protein is a CRC patient prognosis (398). mesenchymal marker (397).

CDK8 Cancer A cyclin-dependent CDK8 is reported as an Onset & kinase localized in the oncogene, whereby its Progression nucleus that functions upregulation is linked to to regulate transcription overactivity of the WNT/- of key genes encoding catenin pathway that has been cellular machinery for widely documented in CRC cell cycle progression (400). (399).

CDKN2A Cancer This gene encodes Mutations in this gene that lead Onset & several proteins with to loss of tumour suppressor Progression tumour suppressor function result in enhanced cell functions. P16 and P14 growth, proliferation, are two that have been inhibition of cellular well studied. The senescence and poor patient protein products work outcomes (402). This gene is to regulate cell cycle reported to interact with the c- progression by MYC signalling pathway, a stabilizing P53 and pathway that promotes rapid inhibiting CDK4 (401). cell division and inhibits apoptosis (401).

174

CEA Cancer Carcinoembryonic This protein is used clinically Onset & antigen (CEA) is cell to help diagnose, monitor and Progression/ surface glycoprotein manage CRC patients. Higher Invasion & that is highly expressed levels of this protein are Metastasis during fetal indicative of cancer development. Elevated pathogenesis, poor response to levels after birth is therapy and poor clinical linked to outcomes (404). Increased carcinogenesis. Has expression may inhibit roles in cell adhesion apoptosis and autophagy (405). (403).

CFC1 Cell Encodes a protein CFC1 encodes an oncoprotein Stemness involved in the that enhances cellular signalling to regulate proliferation and self-renewal embryonic development capacity (407). Overexpression (406). of this gene following embryonic development is linked to CRC pathology and stem cell characteristics (408,409).

CLEC4D Immune Encodes an endocytic This gene has been Markers & receptor with roles in documented as a marker of Inflammation cell-cell adhesion, myeloid derived suppressor signalling and immune cells which may result in responses (410). cancer promotion by sustaining an immunosuppressive environment (411,412).

CNRIP1 Cancer Encodes a protein that Evaluation of the methylation Onset & binds to the c-terminus status of this gene may provide Progression/ of the cannabinoid diagnostic/prognostic Invasion & receptor type 1 to information for CRC patients. Metastasis inhibit its activity (413). Hypomethylation of the CNRIP1 promoter or enhanced expression of the gene is linked to inhibition of CRC proliferation and cell migratory capacity (414).

175

COL1A1 Invasion & Encodes the protein that Studies have reported Metastasis is part of type 1 overexpression of this protein collagen which is in many different malignancies important for the and may promote cell structural integrity of migration and overstimulation many tissues in the of the oncogenic WNT/- body (415). catenin pathway (416).

COL3A1 Immune Encodes the protein Overexpression of this protein Markers & denoted type III has been documented in Inflammation/ collagen which is several cancers including CRC Invasion & important for the and is linked to enhanced Metastasis supporting many tissues cellular proliferation, in the body (417). recruitment of tumour macrophages, and metastatic potential (418,419).

CREBBP Cell Encodes a Mutations in this gene are Stemness transcriptional regulator reported in CRC and may serve protein that is important as biomarkers for the diagnosis for promoting cellular and prognosis of patients (421). differentiation in many In particular, loss of normal tissues. Plays a critical function of this gene is linked role in tissue to increased stem cell homeostasis during characteristics (422). embryonic development, and also regulates cellular growth by manipulating the transcriptome via chromatin remodeling (420).

CTNNB1 Cancer Encodes the protein Mutations in this gene are well Onset & beta-catenin. This documented in CRC Progression/ protein is part of a pathogenesis with Cell protein complex that overactivation of the WNT/- Stemness/ make up adherens catenin pathway leading to Invasion & junctions between cells. enhanced cell stemness, Metastasis These junctions help proliferation, cellular regulate epithelial cell invasion/metastasis and poor growth and contact patient outcomes (424). (423).

176

CYFRA 21-1 Cancer Encodes a cytokeratin Degradation products of this Onset & protein which is a protein are detected in the Progression structural protein found blood of many cancer patients, in epithelial cells (425). including CRC patients (426). Studies report that this tumour marker may be useful in diagnostics and treatment monitoring (427). This gene is reported to interact with the P38 mitogen kinase signalling pathway (425).

EGF Cancer Encodes epidermal Hyperactivity of this molecule Onset & growth factor, a known is linked to the development Progression mitogen, which and progression of CRC and stimulates cellular other malignancies, via the proliferation in a variety activation of several oncogenic of cell lines (428). signalling pathways (113,429). This gene has been implicated in the RAS/MAPK and AKT/PI3K/mTOR pathways (428).

EGFR Cancer Encodes the receptor Overexpression of this receptor Onset & tyrosine kinase, is linked to the development Progression epidermal growth factor and progression of CRC as receptor, which well as several other cancers regulates the (113,431). This gene has been development of implicated in the RAS/MAPK epithelial tissue and and AKT/PI3K/mTOR homeostasis (430). pathways (430).

EPAS1 Immune Encodes the protein Reduced expression of this Markers & hypoxia-inducible gene has been reported in CRC Inflammation/ factor 2-alpha. This compared to control tissue Epithelial- protein has functional samples; furthermore, low Mesenchymal roles in tissue expression of EPAS1 mRNA Transition adaptation to variable may be linked to poor clinical oxygen content by outcomes (434). This gene has altering gene been reported to interact with transcription. the AKT/PI3K/mTOR pathway Expression is associated (432). with an inflammatory response (432,433).

177

ERBB2 Cancer Encodes the receptor Mutations in this gene have Onset & tyrosine kinase HER2 been reported in several Progression which stimulates cancers including CRC. The cellular proliferation mutational status of this gene is when its ligand is important for guiding treatment bound (435). decisions as some therapies targeting this receptor have been developed (436,437). Activation of this gene has been reported to stimulate the AKT/PI3K pathway (438).

ERBB3 Cancer Encodes a receptor Activation of ERBB3 Onset & tyrosine kinase that signalling has been linked to Progression forms heterodimers CRC pathology and reduced with other receptors efficacy of EGFR inhibitor when its ligand is treatments in CRC patients bound to activate (440). cellular proliferation and differentiation pathways (439).

EpCAM Invasion & Encodes a Downregulation or inactivation Metastasis transmembrane of EpCAM is linked to CRC glycoprotein that invasive capacity and poor functions as a cell-cell clinical outcomes (441–443). adhesion molecule, as well as a regulator of cell signalling, division, and differentiation (439).

FBN1 Invasion & Encodes a preprotein Enhanced FBN1 expression Metastasis that is cleaved and has been reported to coincide processed to produce a with the development of CRC glycoprotein that is (445). In addition, upregulation transported from cells of this gene has also been to the extracellular reported with the development matrix, with roles in of tumour metastasis in several maintaining connective malignancies (446,447). This tissue structure. The gene is reported to be involved other protein product, in integrin signalling (444). asprosin, has roles in maintaining glucose homeostasis (444).

178

FGFBP1 Cancer Encodes a carrier Enhanced expression of this Onset & protein for growth gene is linked to aberrant Progression/ factors and is secreted cellular proliferation as well as Invasion & by cells to stimulate invasion and metastasis in Metastasis proliferation, CRC (449). differentiation and migration (448).

FGFR1 Cell Encodes a fibroblast Amplification of this gene has Stemness growth factor receptor. been reported as a prognostic The transmembrane indicator of poor clinical protein binds fibroblasts outcomes for CRC patients and initiates (451). This gene has also be intracellular signalling reported to play a role in EMT to stimulate cell processes and cell stemness in division and other malignancies (452). differentiation (450).

FLJ20323 Cancer Encodes a protein that Aberrant activity of this protein Onset & regulates meiosis in has been linked to CRC Progression oocytes during through its involvement with development (453). the PI3K/AKT/mTOR signalling pathway (454).

FN1 Invasion & Encodes a plasma Hyperactivity of this gene has Metastasis protein as well as cell been linked to CRC pathology, surface protein with the mediation of tumour roles in cell adhesion, metastasis and poor patient cell migration, outcomes (455). Moreover, coagulation and wound elevated levels of this protein healing (280). are associated with disease recurrence, malignant cell invasion, and diabetic pathological changes (456– 458). This gene has been implicated in AKT/PI3K and P38/MMP2 signalling pathways (280).

179

FOXC2 Epithelial- Encodes a transcription Enhanced expression and Mesenchymal factor important in the nuclear location of this protein Transition regulation of have been reported in advanced organogenesis during stages of CRC and has been fetal development shown to promote tumour (458). invasion and metastasis capacity (317,459).

FOXD3 Invasion & Encodes a transcription Deregulation of this Metastasis factor that regulates transcription factor has been embryonic stem cells reported in CRC via loss of and has important roles function mutations (461). in cell growth and cell Several studies have reported cycle progression (460). the tumour suppressor function of this gene (462). This gene is reported to be implicated in the FOXD3/miR-214/MED19 and EGFR/RAS/MAPK and WNT signalling pathways (460).

FOXO3 Cancer Encodes a transcription This gene has been reported to Onset & factor with roles in have conflicting roles including Progression regulating apoptosis, as tumour suppressor functions as well as cell cycle well as metastasis promoting progression and functions; however, most proliferation (463). studies report that downregulation of this gene is linked to disease progression (464,465). It has been reported to interact with the PI3K/AKT pathway (463).

FZD7 Invasion & Encodes a Overactivation of the Metastasis transmembrane receptor WNT5A/FZD7 pathway is protein that enhances linked to enhanced cellular WNT signalling by proliferation, disease binding WNT5A progression and metastasis in protein (466). CRC patients (467,468).

180

GSC Epithelial- Encodes a transcription GSC has been reported to be an Mesenchymal factor that plays an inducer of epithelial to Transition important role in tissue mesenchymal transition, a fate determination process well known to be during organogenesis involved in cancer pathology (469). and poor patient outcomes (470,471). Is reported to be implicated in SMAD2/3 signalling and retinoblastoma protein signalling (469).

GSK3B Immune Encodes a serine- Amplification of this gene has Markers & threonine kinase been associated with CRC Inflammation involved in maintaining pathogenesis by enhancing glucose homeostasis, growth promoting cellular regulating inflammation signalling (473). This gene has and apoptosis (472). been implicated in the NF-B signalling pathway (474).

HDAC2 Cancer Encodes a histone HDAC2 is reported to have an Onset & deacetylase for the oncogenic function and Progression removal of acetyl upregulation of this gene is groups from lysine linked to poor patient outcomes residues on the N- while the inhibition of this terminal region of gene has been shown to hinder histones to regulate tumourigenesis by decreasing transcription (475). cellular proliferation and inducing apoptosis (476). This gene is reported to be involved in PI3K/AKT and P53 signalling (475).

HGF Invasion & Encodes a hepatocyte Deregulation of HGF and Metastasis growth factor which is a signalling through its receptor regulatory protein cell c-MET has been reported to growth, motility, potentiate CRC pathogenesis differentiation, through angiogenesis, cell angiogenesis, and tissue motility and proliferation regeneration (339). (341,343).

181

HIF1A Epithelial- Encodes a transcription Overexpression of this gene Mesenchymal factor that regulates the has been reported to correlate Transition cellular response to low significantly with poor clinical levels of oxygen (477). outcomes in CRC patients (478,479). This gene has been implicated in WNT signalling.

HNF4A Invasion & Encodes a transcription Aberrant HNF4A is linked to Metastasis factor that plays an colorectal tumourigenesis, important role in organ protection of cancer cells from development including oxygen radicals, as well as the intestines, liver and promoting signalling of the kidneys (314). oncogenic WNT/-catenin pathway and metastasis (480).

HPSE Invasion & Encodes an enzyme that Upregulation of this gene helps Metastasis facilitates extracellular facilitate tumour growth and matrix remodelling metastasis by altering the (481). tumour microenvironment to accommodate the growing tumour (482).

HSP90AA1 Cancer Encodes a chaperone This gene is recognized as an Onset & protein that targets and oncogene in several Progression regulates proteins malignancies and has been important for cell cycle shown to promote cell survival control as well as signal as well as therapy resistance to transduction (483). conventional chemotherapy (484).

IGF1 Cancer Encodes insulin-like Elevated levels of circulating Onset & growth factor 1, which IGF1 have been reported to Progression is widely recognized as increase the risk for the a hormone that development and progression stimulates growth of of CRC (486). cells and helps regulate metabolism (485).

IGFR Cancer Encodes insulin-like IGFR upregulation has been Onset & growth factor receptor, shown to enhance cellular Progression which is a tyrosine proliferation, invasion and kinase receptor that survival capacity leading to binds IGF to stimulate poor clinical outcomes in cell growth and survival cancer patients (488). (487).

182

IGF2 Cancer Encodes insulin-like Upregulation of this gene is Onset & growth factor 2, a linked to cancer pathology for Progression/ protein involved in fetal many different cancers Cell Stemness growth and including CRC (488). development (489). Overexpression is also associated with poor patient prognosis (490).

IGFBP3 Cancer Encodes insulin-like Serum levels of this protein Onset & growth factor binding may be a potential biomarker Progression protein 3, which binds for CRC; however, the clinical the growth factor and significance of this protein and stabilizes the protein to its function remains prolong its half-life, as controversial with results well as altering its varying depending on several interaction with its clinical parameters (492). receptor (491).

IGFBP4 Cancer Encodes insulin-like Some studies report that Progression growth factor binding increased expression of protein 4, which binds IGFBP4 resulted in reduced the growth factor and cellular proliferation and stabilizes the protein, as increased programmed cell well as altering its death, thereby regulating the interaction with its tumourigenic capacity of the receptor (493). growth factor (494).

IL17RC Immune Encodes the interleukin Some studies report that Markers & receptor C protein. This binding of this receptor to its Inflammation receptor binds ligand induces signalling via interleukins to promote NF-B to activate genes that inflammation (495). prevent apoptosis and stimulate angiogenesis in colon epithelial cells (496). IL2RB Immune Encodes the interleukin The current literature related to Markers & 2 receptor beta. This CRC risk and this gene is Inflammation/ receptor is mainly limited and controversial. Cancer expressed in Some suggest that Onset & hematopoietic cells and overexpression of this gene Progression is involved in the provides some protection stimulation of against cancer while others mitogenic processes suggest that some variants (497). enhance risk (498,499).

183

IL6 Immune Encodes the IL6 has been shown to be Markers & interleukin-6 cytokine highly elevated in CRC Inflammation which has roles in patients (501). IL6 has also inflammatory processes been shown to trigger and adaptive immunity; signalling cascades that this gene is implicated promote disease progression in inflammation that and poor clinical outcomes increases the risk of (502). T2D (500).

IL6ST Immune Encodes a protein that This gene is reported to be Markers & has a role in signal involved in JAK/STAT Inflammation/ transduction following signalling in CRC and has been Invasion & the binding of a associated with key Metastasis cytokine to its receptor clinicopathological features (503). including tumour size, cellular differentiation, cancer stage and invasiveness (504).

IL8 Immune Encodes interleukin 8, a The role of IL8 in CRC is Markers & proinflammatory reported to be multifaceted. Inflammation cytokine secreted by Overexpression is linked to macrophages (505). disease progression, prolonged survival of tumour cells, cellular proliferation, migration and invasion (506).

ILK Epithelial- Encodes the Upregulation of this kinase is Mesenchymal transmembrane protein linked to epithelial-to- Transition/ integrin linked kinase mesenchymal transition, a Invasion & which regulates common feature of cancer Metastasis integrin-related signal pathology; furthermore, it has transduction within the been reported to be linked with cell (507). disease progression and metastasis (508).

INA Cancer Encodes the internexin INA is reported to be a tumour Onset & neuronal intermediate suppressor and studies have Progression filament protein alpha shown that through an which has roles in epigenetic mechanism, the neuronal morphology inactivation of this gene occurs and intracellular early in the development of transport within CRC and facilitates neurons (509). microtubule polymerization at an enhanced rate (510).

184

IRS4 Cancer Encodes the insulin Elevated expression of IRS4 Onset & receptor substrate 4 and has been linked to CRC staging Progression retains tyrosine kinase and may be involved in CRC activity. This protein is pathogenesis, as well as patient involved in stimulation prognosis (512). of cellular proliferation and signal transduction (511).

ITGA5 Invasion & Encodes the integrin Overexpression of this gene is Metastasis subunit alpha 5, a linked to highly aggressive protein present on the features of CRC namely cell surface with roles tumour metastasis and invasive in cell signalling and capacity through loss of cell adhesion (513). adhesion (514).

ITGB1 Invasion & Encodes the protein Elevated expression of this Metastasis integrin subunit beta 1. protein has been linked to poor This protein forms CRC prognosis, enhanced heterodimers on the cell cellular invasion and metastasis surface and acts as a (516). membrane receptor for signalling and cell adhesion (515).

JAG1 Epithelial- Encodes the protein Amplified expression of JAG1 Mesenchymal jagged 1, which is a has been associated with poor Transition/ signalling molecule that clinical outcomes in CRC Cancer binds the notch 1 patients via promoting Onset & transmembrane receptor epithelial-to-mesenchymal Progression (334). transition, cell growth, division and survival (338).

KRAS Cancer Encodes the KRAS KRAS has been reported as a Onset & protein which is an gene that harbours oncogenic Progression integral component of driver mutations in CRC the RAS/MAPK pathogenesis and is the most signalling pathway frequently mutated gene in the which promotes cellular MAPK/ERK pathway (518). growth and proliferation (517).

185

LMNB1 Cancer Encodes the lamin B1 Aberrant activity of this protein Onset & protein, which has is linked to tumour metastasis Progression/ structural roles in cells (520). Studies report that this Invasion & and is classified as an protein is overexpressed in Metastasis intermediate filament CRC tissue and the knockdown (519). of this gene inhibited cellular proliferation, invasion and migration capacity and activated apoptosis (521).

MAL Invasion & Encodes a hydrophobic This gene is recognized as a Metastasis protein localized in the tumour suppressor, and has membrane. This protein been reported to be functions to produce significantly downregulated in and sustain membrane CRC tissue samples and microdomains enriched furthermore was linked to CRC for glycosphingolipid stage and metastatic processes (522). (523).

MGMT Invasion & Encodes a DNA Methylation of the promoter of Metastasis/ damage repair protein this gene is reported in a Cancer with important roles in substantial portion of CRC Onset & protecting cells against cases with metastatic disease Progression carcinogenic agents which results in suppression of (524). its function by reducing its transcription (525).

MITF Cancer Encodes a protein MITF is a widely documented Onset & called melanocyte oncogene in melanomas; Progression inducing transcription however, its link to CRC factor which regulates remains unclear (527). cell fate determination Moreover loss of function in and survival in retinal epithelial development melanocytes (526). is linked to the disruption of cell differentiation, cell cycle regulation and apoptosis (528). MLH1 Cancer The gene encodes a Loss of function of this gene is Onset & tumour suppressor linked to tumour heterogeneity, Progression/ protein, also known as a disease progression, metastasis Invasion & mismatch repair and poor clinical outcomes; Metastasis protein, that functions furthermore, lack of function of to repair damaged DNA MLH1 is used to predict (529). aggressiveness and guide treatment (530).

186

MMP3 Invasion & Encodes an enzyme Upregulation of this enzyme is Metastasis called matrix linked to enhanced cellular metalloprotease 3, invasive capacity and tumour which functions to metastasis in several cancers break down the including CRC (532). extracellular matrix. This occurs normally during physiological tissue remodelling and early development (531).

MMP9 Invasion & Encodes an enzyme This enzyme is reported to be Metastasis/ denoted matrix elevated in malignant Immune metalloprotease 9, pathologies including CRC; Markers & which has been furthermore, inhibiting this Inflammation implicated in tissue protein has been linked to remodelling during reduced metastatic potential pathological processes (534). such as inflammation and fibrosis (533).

MSH2 Cancer Encodes a tumour Pathogenic variants of this Onset & suppressor protein gene lead to loss of function Progression involved in DNA and thus contribute to the mismatch repair. The accumulation of errors in DNA function of this protein base pairs. This has been is important for DNA linked to heritable syndromes integrity (535). such as Lynch Syndrome and greatly enhances the risk of CRC (536).

MSH6 Cancer Encodes a tumour Mutations that cause this gene Onset & suppressor protein that to become non-functional Progression has an essential role in result in a greatly enhanced DNA mismatch repair. risk of cancer development in a This protein helps variety of malignancies, facilitate the correction including CRC (538). of errors made during DNA replication (537).

187

MST1R Epithelial- Encodes a protein MST1R has been reported to Mesenchymal receptor located on the display oncogenic functions in Transition/ surface of cells that several malignancies including Immune binds macrophage- CRC, where it has been Markers & stimulating protein and associated with epithelial-to- Inflammation induces signalling mesenchymal transition as well (539). as metastasis (540).

MUC1 Cancer Encodes a protein The clinical significance of Onset & called mucin 1 which is MUC1 expressing tumours Progression expressed on the apical remains controversial; however surface of epithelial some studies report that cells and functions to upregulation of this gene leads produce mucus in to poorer patient outcomes several body systems (542,543). including the lining of the digestive tract (541).

MYC Cancer Encodes a transcription MYC is a proto-oncogene and Onset & factor with a variety of upregulation of the gene via Progression regulatory roles related several mechanisms promotes to cell cycle control, tumourigenesis in several proliferation, apoptosis, malignancies including CRC and metabolism (544). (545).

NANOG Cell Encodes a transcription Enhanced expression of this Stemness factor that is involved embryonic stem cell marker is in self-renewal, a predicter of disease pluripotency, and progression and poor clinical proliferation in outcomes in CRC; moreover, embryonic stem cells expression in adult cells (546). suggests reprogramming to a stem cell phenotype (547).

NDRG1 Epithelial- Encodes a cytoplasmic This gene functions as a Mesenchymal protein with functional tumour suppressor and inhibits Transition/ roles in cellular tumour metastasis, epithelial- Invasion & differentiation and to-mesenchymal transition, and Metastasis growth, as well as the invasion in CRC cells (549). cellular stress response (548).

188

NOTCH1 Cell Encodes a membrane Some studies report that Stemness/ bound receptor that has NOTCH1 signalling is linked Epithelial- important functions in to enhanced cell stemness, as Mesenchymal cell fate determination, well as epithelial-to- Transition growth, proliferation, mesenchymal transition in differentiation and CRC pathology (551). programmed cell death (550).

OPN Immune Encodes a protein that This gene has been implicated Markers & attaches osteoclasts to in several malignant Inflammation/ the bone matrix by pathologies including CRC, Invasion & binding hydroxyapatite. and has been reported to Metastasis/ In addition, this protein correlate with tumour Cell also acts as a cytokine progression, cell migration, Stemness by increasing the invasion and metastasis, as expression of well as promoting a stem-cell interferon-gamma and phenotype (553). interleukin-12 (552).

PAK1 Cancer Encodes a protein This gene is upregulated in Onset & kinase that plays an CRC pathogenesis and Progression/ important role in cell promotes cellular growth, Invasion & morphology, nuclear division, migration and Metastasis signalling, and invasion, as well as survival cytoskeletal (555). reorganization (554).

PFK-2 Cancer Encodes an enzyme The enzyme encoded by this Onset & with a critical gene is involved in many Progression regulatory role in aspects of cancer pathology cellular metabolism by including tumourigenesis, cell controlling glycolysis. growth and division, as well as This enzyme is linked therapy resistance. This to the AMPK pathway, enzyme is critical in several an important cellular signalling pathways related to signalling pathway that cancer cell metabolism (557). helps maintain cellular energy homeostasis (556).

PIK3CA Cancer Encodes a protein with Mutations in the catalytic Onset & multiple subunits, one subunit of PIK3CA is linked to Progression of which requires ATP several malignant pathologies to phosphorylate including CRC; furthermore,

189

several these mutations are linked to glycerophospholipids to poor patient outcomes and enhance cell growth and response to therapy (559). survival (558).

PLEK2 Epithelial- Encodes a protein that This gene is linked to Mesenchymal interacts with the actin epithelial-to-mesenchymal Transition/ component of the transition in several Invasion & cytoskeleton, thereby malignancies, as well as Metastasis affecting cellular contributing to enhanced morphology (560). invasiveness and capacity for metastasis (561,562).

PMS2 Cancer Encodes a protein that Studies report that there is a Onset & is part of the DNA high frequency of mutations in Progression damage repair response, PMS2 in cases of CRC; facilitating the furthermore, patients with an correction of errors inherited mutation in this gene generated during DNA are at an increased risk for replication (563). developing CRC and other malignancies (564).

PPARA Cancer Encodes a protein Elevated expression of PPARA Onset & receptor, that when is linked to therapy resistance Progression/ bound to its ligand, in human colon cancer cell Immune triggers expression of lines; furthermore, studies Markers & genes that promote report that downregulation of Inflammation cellular proliferation, PPARA results in decreased maturation, cellular proliferation and inflammation and initiation of apoptosis (566). immune responses (565).

PPARB Cancer Encodes a protein in the While the functions of this Onset & PPAR family that is gene remain somewhat Progression believed to have a role controversial, some researchers in transcriptional report that upregulation of this repression and gene is involved in signalling in the nucleus tumourigenesis and disease (567). progression in CRC patients (568,569).

190

PPARG Cancer Encodes a protein in the CRC patients expressing this Onset & PPAR family, which gene have been reported to Progression/ encodes a nuclear have significantly better Immune receptor protein that clinical outcomes than those Markers & results in the who do not (571). Furthermore, Inflammation transcription of genes to treatment of diabetic patients regulate the with thiazolidinediones, which differentiation of work by binding PPARG to adipocytes, cell regulate adipogenesis, has been metabolism, apoptosis linked to a reduced incidence and inflammation (570). of several malignant pathologies including CRC (572).

PPL Cancer Encodes a protein This gene is classified as a Onset & subunit of desmosomes tumour suppressor as loss of Progression/ and interacts with both function mutations are linked Invasion & the plasma membrane to tumourigenesis and disease Metastasis and intermediate progression in CRC patients; filaments to regulate furthermore, expression of this cell signalling for gene was shown to reduce growth and survival metastatic capability and (573). induce cell cycle arrest (574).

PPP2R1A Cancer Encodes a subunit of Mutant forms of this gene that Onset & the protein phosphatase lead to loss of tumour Progression 2, an enzyme which suppressor function are removes phosphate documented in several types of groups from molecules malignancies, including CRC and has a role in (576). Moreover, elevated downregulating cellular levels of variant PPP2R1A proliferation (575). have been linked to higher grade tumours in some malignancies (577).

PPP2R5C Cancer Encodes a subunit of Loss of function of this protein Onset & protein phosphatase 2, can occur as a result of Progression/ an enzyme with mutation leading to protein Invasion & important cellular truncation. This has been Metastasis functions to regulate associated with aggressive growth and division; malignant phenotypes furthermore, this characterized by metastatic subunit is believed to potential (579–581). inhibit cell growth and

191

division following DNA-damage (578).

PRRG4 Invasion & Encodes a protein that Upregulation of this protein Metastasis limits the transcription causes a reduced functional of ROBO1 (a capacity of ROBO1, a tumour documented tumour suppressor, which has been suppressor) and linked to enhanced invasive prevents it from being and migratory capacity in transported to the cell tumours, leading to poor membrane, which may clinical outcomes (583,584). affect neural cell migration during development (582).

PTEN Cancer Encodes a phosphatase The loss of function of PTEN Onset & that has been found to is well documented in many Progression/ be highly mutated in malignancies including CRC. It Invasion & many malignancies. has been linked to therapy Metastasis This enzyme functions resistance and metastatic as a tumour suppressor disease (586,587). in a variety of mechanisms, including through its regulation of cell division and apoptosis (585).

PTK2 Invasion & Encodes a tyrosine Studies have reported elevated Metastasis kinase localized in the levels of this protein in cytoplasm in regions colorectal malignancies; adjacent to adhesion furthermore, expression of this points between cells. gene has been positively This protein has correlated with disease functions in cell growth progression, metastatic and signal transduction potential and poor clinical (588). outcomes (589).

RAB5 Cell Encodes a GTPase, Studies report that RAB5 has a Stemness which binds GTP in its role in the maintenance of active state to regulate a cancer stem cell populations variety of cellular and that inhibiting its processes, including expression increased the trafficking of cellular effectiveness of cancer materials within a cell therapeutics on CRC cell lines (590). (591).

192

RAC1 Invasion & Encodes a GTPase that The accumulation and Metastasis has roles in macrophage hyperactivity of RAC1 has NADPH oxidase been reported in CRC tissues function, as well as cell compared to that of normal adhesion and migration colonic mucosa; furthermore, (592). levels of this GTPase were correlated with tumour stage, metastasis and overall survival (593).

RET Cancer Encodes a Mutations that generate a Onset & transmembrane receptor fusion protein with constitutive Progression/ tyrosine kinase, and activation have been linked to Invasion & when bound to its several malignancies; Metastasis ligand, it initiates furthermore, overactivation of intracellular signalling this kinase is linked to disease to promote cell growth, progression, tumour metastasis maturation, survival and and poor patient (595,596). migratory capacity (594).

RGS2 Invasion & Encodes a regulatory Abnormal expression and/or Metastasis protein that aids in the activity of RGS2 has been differentiation of linked to several malignancies myeloid cells, and is thought to exert regulation of blood suppressive control over G- pressure, as well as protein signalling; nonetheless, vasoconstriction and the role of RGS2 in cancer dilation (597). pathology remains controversial as researchers report low levels during tumourigenesis but higher levels in later disease stages (598,599).

RUNX2 Epithelial- Encodes a transcription RUNX2 is reported to be an Mesenchymal factor with critical roles oncogene and elevated levels Transition/ in the differentiation of of this protein are linked with Invasion & osteoblasts and bone disease recurrence in CRC Metastasis development (600). patients, as well as being positively correlated with tumour stage and metastasis (601).

193

RhoA Invasion & Encodes a GTPase This protein is a tumour Metastasis protein that affects actin suppressor and the loss of its polymerization, function or downregulation of contraction via its activity is related to disease actin/myosin filament progression and metastatic interactions, cell potential in CRC patients adhesion and other (603). microtubule functions. Consequently, RhoA affects cell morphology and motility (602).

S6K Cancer Encodes a ribosomal S6 Amplified expression of this Onset & protein kinase that has protein has been reported in Progression/ functions in promoting several malignant pathologies Invasion & the synthesis of protein including CRC; furthermore Metastasis which has downstream expression has been shown to effects on cell cycle be positively correlated with progression (604). disease stage and metastasis resulting in poor overall survival (605).

SCNN1A Cancer Encodes a protein Studies have reported the Onset & subunit of a non-voltage downregulation of molecular Progression/ gated sodium channel pathways involved in Invasion & present in epithelial aldosterone-regulated sodium Metastasis cells, and has roles in reabsorption in CRC, which maintaining water and includes the downregulation of sodium ion homeostasis SCNN1A transcription (607); (244,606). alternatively, overexpression of this gene has been reported in several malignancies (608). Conversely, some studies report that downregulation of this gene is linked to aggressive tumours displaying extracellular matrix degradation profiles (609).

194

SERPINB5 Cancer This gene encodes a While the tumour suppressor Onset & protein that belongs to functions of SERPINB5 have Progression the serpin (serine been documented, several protease inhibitor) studies report oncogenic roles family of proteins of this protein indicating that (610). Has functions in amplified expression is linked extracellular matrix to disease progression, therapy organization, and resistance and poor clinical epithelial cell outcomes in CRC patients morphology (611). (612,613).

SIRT1 Invasion & Encodes a protein with The activity of this protein has Metastasis several regulatory roles been implicated in many including having an different cancers including effect on epigenetic CRC; some studies report that silencing of genes, fat expression of this protein metabolism, and the confers protection against maintenance of glucose tumour metastasis in CRC via homeostasis (614). the initiation of intracellular signalling cascades (615).

SMAD3 Cancer Encodes a protein that Loss of SMAD3 expression or Onset & functions in function has been associated Progression intracellular signalling with CRC progression, by facilitating the metastasis and poor overall transmission of survival (617). extracellular signals to the nucleus. These signals are important in the regulation of cellular proliferation and gene transcription (616).

SMAD4 Cancer Encodes a protein that Loss of function of SMAD4 Onset & functions in has been linked to CRC Progression intracellular signal pathogenesis by inhibiting transduction. This TGF- signalling, an important protein is involved in suppressor of intestinal bone developmental epithelial cell growth (619). signalling and acts as a tumour suppressor by inhibiting the proliferation of epithelial cells (618).

195

SNAI1 Cell Encodes a protein Overexpression of this gene is Stemness localized in the nucleus linked to increased cell that functions as a stemness, epithelial-to- repressor of mesenchymal transition, transcription; this therapy resistance, and poor protein is important clinical outcomes in CRC during embryonic patients (621). development (620).

SNAI2 Cell Encodes a protein that Overexpression of this gene is Stemness/ represses transcription linked to cancer pathology and Invasion & of certain genes during is shown to promote a stem cell Metastasis embryonic development phenotype; furthermore, it is (622). linked to the repression of the cell adhesion molecule E- cadherin which may promote tumour invasion and metastasis (623,624).

SNCA Cancer Encodes a protein that Promoter methylation of this Onset & is highly expressed in gene and subsequent Progression neuronal tissue with downregulation of its important roles in expression is reported in CRC neuronal vesicle tissues, and patient stool, transportation and suggesting that it may be a neural signalling (625). candidate biomarker for this disease (626).

SORL1 Cancer Encodes the sorting Elevated expression of this Onset & receptor for the insulin protein is linked to enhanced Progression receptor. Enables the cell survival in CRC through recycling of the insulin its intracellular trafficking receptor by inhibiting ability; however, it has also its degradation and shown proapoptotic functions subsequently increasing indicating that the relationship its expression on the of this gene and CRC is still cell surface (308). highly controversial (627).

SOX2 Cell Encodes a transcription Overexpression of this gene in Stemness/ factor that regulates the adult tissues is strongly linked Invasion & expression of many key to aggressive malignant Metastasis genes during embryonic phenotypes in several cancers development. It is including CRC (629–631). recognized as a marker of stem cells (628).

196

SOX9 Cell Encodes a transcription Overexpression of this gene is Stemness/ factor that is heavily linked to CRC pathogenesis, Invasion & involved in embryonic therapy resistance, tumour Metastasis development, especially metastasis and poor clinical for skeletal outcomes (633,634). development and sex determination (632).

SPARC Invasion & Encodes a protein that Elevated expression of this Metastasis is strongly associated gene is linked to with the extracellular tumourigenesis and poor matrix, cell prognosis in CRC patients morphology, and the (636). calcification of bone (635).

SPG20 Cancer Encodes a protein with Downregulation of this gene Onset & endosomal trafficking via promoter methylation has Progression function via its capacity been reported in CRC patient to interact with plasma and tumour biopsies microtubules; this (637). Furthermore, protein also has inactivation of this protein is functions in the reported to enhance cell degradation of EGFR proliferation through the (289). activation of EGFR signalling (638).

STAT3 Cancer Encodes a signal Overactivation of STAT3 is Onset & transducer protein that linked to malignant Progression has important pathogenesis via the regulatory roles in gene constitutive activation of transcription. It is cellular proliferation. STAT3 reported to have roles in inhibitors have been developed cell proliferation, as an antineoplastic agent and apoptosis and migration have been used to treat CRC (639). patients (640).

197

TAK1 Cancer Encodes a protein Upregulation of TAK1 is Onset & kinase that activates associated with malignant Progression/ pro-survival cell pathogenesis in several cancers Immune signalling pathways, including CRC; this gene is an Markers & namely NF-κB, whose oncoprotein and studies report Inflammation target genes inhibit that inhibition of its function apoptosis while enables apoptosis to occur stimulating cell division (642). and inflammation (641).

TCF7L2 Invasion & Encodes a transcription This gene is reported to be one Metastasis factor that is involved of the most frequently mutated in the WNT signalling genes in CRC and loss of pathway, the normal function is associated maintenance of blood with changes in cell glucose homeostasis morphology, as well as and studies report that increased invasive and mutations in this gene migration capacity to promote may predispose tumour metastasis (644). individuals to the development of T2D (643).

TFPI2 Invasion & Encodes a protein that Studies have reported that Metastasis acts to inhibit serine methylation and silencing of proteases and has been this gene is significantly higher classified as a tumour in CRC tissue samples suppressor by compared to normal intestinal preventing degradation mucosa, suggesting that loss of of the extracellular tumour suppressor function is matrix to limit tumour linked to oncogenesis (646). invasive capacity (645).

TGFA Cancer Encodes a protein that Overexpression of this gene Onset & binds the epidermal has been implicated in many Progression growth factor receptor malignant processes such as to promote cell uncontrolled proliferation and signalling to enhance angiogenesis; furthermore, cell growth, division elevated serum levels of this and maturation (647). protein have been reported in CRC patients and were found to decline following surgical resection of the tumour (648,649).

198

TGFB1 Cancer Encodes a signalling Higher levels of this protein Onset & molecule that binds its have been linked to invasive Progression receptor on the cell metastatic disease in CRC surface leading to patients; furthermore, increased transcriptional levels were more common regulation to alter amongst males compared to cellular proliferation, females suggesting there may growth and maturation be sex bias with respect to this (650). gene/protein (651).

TGFB2 Cancer Encodes transforming This gene is reported to be Onset & growth factor 2, which upregulated in cell signalling in Progression is produced throughout CRC tissue compared to the body to assist with healthy controls; furthermore, fetal development, as loss of its receptor and well as regulation of subsequent signalling has been cellular processes after shown to promote greater birth (652). overall survival times in CRC patients (653,654).

TGFB3 Cancer Encodes another protein Upregulation of this protein Onset & in the transforming has been implicated in CRC Progression growth factor-beta therapy resistance by inhibiting superfamily. Similarly cellular apoptosis induced by it is involved in radiotherapy (656). embryogenesis and cellular maturation (655).

TGFBR1 Cancer Encodes a receptor for Variants in this protein have Onset & TGFB1, and when been well-documented in CRC Progression bound, it facilitates and are thought to promote signal transduction to tumourigenesis; however, regulate cellular depending on the variant the processes including effect on CRC pathology varies proliferation, growth (657). and maturation (328).

199

TGFBR2 Cancer Encodes a receptor for Inactivating mutations of Onset & TGFB2 and facilitates TGFBR2 have been linked to Progression transduction of growth overall improved survival signals from outside the times in CRC patients (168); cell to the inside (658). however, the findings related to this receptor’s role in CRC pathogenesis remain controversial.

THBS1 Epithelial- Encodes a glycoprotein Studies have reported that this Mesenchymal with adhesive protein promotes epithelial-to- Transition/ properties to facilitate mesenchymal transition, Invasion & cell-cell adhesion and thereby contributing to Metastasis interactions, as well as metastasis from the primary involvement with the tumour to the liver in CRC extracellular matrix patients (660). (659).

TIMP1 Invasion & Encodes a protein that Studies report that this gene Metastasis has been reported to promotes tumourigenesis and have inhibitory effects metastasis in colorectal cancer on matrix patients through its regulation metalloproteases (661). of PI3K and MAPK signalling This protein also pathways (661). regulates cell proliferation and apoptosis (662).

TIMP3 Invasion & Encodes a secreted Expression of this protein has Metastasis protein that binds the been reported to associate with extracellular matrix and better clinical outcomes in functions to inhibit CRC patients compared to matrix metalloproteases those with lower expression (663). levels (664).

TMEM132A Cancer Encodes a There is minimal literature Onset & transmembrane protein focused on the role of Progression/ that has roles in brain TMEM132A in cancer Invasion & development in-utero pathology; however, its ability Metastasis and after birth. It is also to support cell survival in stress thought to help cells conditions suggests it may resist death in contribute to sustained survival unfavourable conditions and enhanced proliferation in (216). malignant tissues (665,666).

200

TNFAIP3 Immune Encodes a protein that Upregulation of this protein Markers & is induced by the has been linked to therapy Inflammation cytokine tumour resistance in CRC patients necrosis factor and has (668); however, these findings been shown to have remain controversial as some functions in the studies also report this gene to inhibition of apoptosis have tumour suppressor and inhibition of the functions (669). NF-B pathway (667).

TNFAIP6 Invasion & Encodes a secretory Expression of this molecular is Metastasis/ protein with a special can be augmented by T2D; Immune binding domain that however, the relationship Markers & functions to support the between this protein and Inflammation extracellular matrix and malignant pathogenesis remain contribute to cell poorly understood (671). motility. The production of this protein is also related to inflammatory processes (670).

TNF Immune Encodes the TNF has been highly studied Markers & proinflammatory and is well documented for its Inflammation cytokine tumour role in promoting inflammation necrosis factor alpha. to stimulate tumourigenesis, The roles of this epithelial-to-mesenchymal cytokine are wide transition and several other ranging as it interacts malignant processes; with many cellular heightened expression of this processes, both cytokine is linked to poor biological and patient outcomes (673). pathogenic (672).

TP53 Cancer Encodes the TP53 Loss of tumour suppressor Onset & protein, a transcription function of TP53 is well Progression factor, that is a known documented in the literature for tumour suppressor many different types of protein through its malignancies, including CRC ability to induce cell (675). cycle arrest, apoptosis, and DNA repair to name a few (674).

201

TWIST1 Cell Encodes a transcription This gene has been implicated Stemness/ factor with important in many malignant pathologies, Epithelial- regulatory roles in including CRC; it has been Mesenchymal embryonic reported to induce epithelial-to- Transition developmental mesenchymal transition and processes (676). promote chromosomal instability (677).

TWIST2 Cell Encodes a transcription Upregulation of this protein, or Stemness/ factor that helps loss of miRNA that function to Epithelial- regulate early silence this gene have been Mesenchymal developmental associated with CRC Transition processes and may pathology, metastatic disease inhibit the maturation of and poor patient outcomes osteoblasts (678). (679).

VDR Cancer Encodes the vitamin D Variants of the vitamin D Onset & receptor protein, which receptor have been reported as Progression is a hormone receptor a plausible mechanism for localized in the nucleus deficiency in this vitamin, (680). resulting in enhanced risk of CRC onset, progression and patient mortality (681).

VEGFA Invasion & Encodes vascular Higher levels of VEGFA are Metastasis endothelial growth associated with disease factor A and has a role progression, invasion and in the induction of metastatic potential in a variety vascular endothelial cell of malignancies including CRC growth, division and (683). migration (682).

VEGFC Invasion & Encodes a growth factor Activation and overexpression Metastasis that acts on endothelial of this gene has been reported cells in blood vessels to in CRC tissues and is thought facilitate angiogenesis to promote an aggressive and vascular endothelial disease phenotype with cell proliferation and metastatic capability, leading may also affect vessel to poor clinical outcomes permeability (684). (685).

202

VIM Invasion & Encodes an Molecular interactions with Metastasis intermediate filament VIM have been reported to be protein, a component of involved in CRC pathogenesis, the cytoskeleton. This particularly tumour metastasis protein has roles in leading to poor clinical cholesterol transport, outcomes (687). cell shape, cell adhesion, migration and signalling (686).

WNT5A Invasion & Encodes a secreted This protein has been Metastasis protein involved in cell implicated in CRC disease signalling, and is progression through its critical during interactions with tumour- embryonic development associated macrophages to alter (688). the tumour microenvironment to support oncogenesis (689).

WNT5B Invasion & Encodes a secreted Variants in this gene are linked Metastasis signalling protein to disease recurrence in CRC involved in cell fate patients and has been reported determination during to enhance cellular early development and proliferation and metastatic notably fetal gut potential in CRC cells (691). development (690).

ZEB1 Epithelial- Encodes a transcription Expression of this gene has Mesenchymal factor that represses the been linked to epithelial-to- Transition/ expression of mesenchymal transition, Immune interleukin 2, as well as aggressive colorectal cancer Markers & the expression of E- phenotypes and poor clinical Inflammation cadherin (692). outcomes (693,694).

ZEB2 Epithelial- Encodes a transcription Expression of ZEB2 has been Mesenchymal factor with important reported to promote metastatic Transition roles in organogenesis disease and poor overall during fetal survival and response to development (695). treatment in CRC patients (696).

203

mTOR Cancer Encodes a protein Aberrant activity of this protein Onset & kinase, mammalian has been shown to play a key Progression/ target of rapamycin, role in CRC pathogenesis by Invasion & which is involved in sustaining cell growth Metastasis cellular signalling signalling and promoting pathways related to tumour metastasis (699). immune responses, metabolic alterations in response to nutrient concentrations and cell migratory capacity (697,698). sFas Cancer The Fas gene encodes a Loss of function and or Onset & cell surface receptor, expression of the Fas protein is Progression that when bound to its linked with sustained tumour ligand, initiates growth and resistance to apoptosis (284). sFas is immunotherapies which target the soluble form of this this receptor to induce receptor that impairs programmed cell death in physiologic apoptosis malignant tissues (287); thus, by sequestering the Fas enhanced expression of the ligand and preventing it soluble form reduces the from binding its function of the cancer- receptor (286). protective cell surface form of the protein (286).

CLTC Housekeeping Encodes clathrin, which These genes are all constitutive is a major protein genes necessary to maintain expressed in cells to cellular function at the most coat intracellular basic level. These genes are organelles and function selected on the basis that they in intracellular should be expressed at a trafficking (700). relatively consistent level in organisms under variable GAPDH Housekeeping Encodes an enzyme conditions including pathologic denoted glyceraldehyde and physiologic conditions. 3-phosphate Consequently, these genes dehydrogenase, which provide data that can be used in functions to obtain the normalization of genes cellular energy through selected for the NanoString the breakdown of custom gene panel. glucose (701).

204

GUSB Housekeeping Encodes an enzyme denoted beta- glucuronidase which is localized in lysosomes to allow for recycling of cellular components (702).

HPRT1 Housekeeping Encodes a key enzyme, a protein transferase which functions to produce purine nucleotides (703).

PGK1 Housekeeping Encodes a key enzyme, phosphoglycerate kinase, which functions to produce cellular energy through glycolysis (704).

RPLP0 Housekeeping Encodes the large 60S ribosomal subunit protein (705).

TUBB Housekeeping Encodes the beta tubulin protein which is a key component of microtubules in cells (706).

NEG_A-H Negative N/A NanoString nCounter probes Controls are reported to be highly specific; however, low levels of non-specific probe binding is an intrinsic component of the assay. To account for the small number of false positive reads, 8 negative controls, denoted NEG_G to NEG_H, are included in the assay to monitor the level of non- specific binding.

205

POS_A-F Positive N/A The NanoString codeset Controls includes 6 synthetic DNA control targets, with known concentrations ranging from 128fM to 0.125fM and decreasing in a linear fashion. These concentrations correspond to POS_A to POS_F, respectively. Inclusion of these controls allow the probe hybridization efficiency of the assay to be measured at a range of concentrations.

206