Biomarkers for Early Detection of Cancer: A Systematic Review
Towards the identification of true cancer biomarkers for early detection

Louis-Philippe Fermon
Student number: 01304166

Promotor: Prof. Inge Huybrechts
Copromotor: Prof. Dr. Veronique Cocquyt

A dissertation submitted to Ghent University in partial fulfilment of the requirements for the degree of Master of Medicine in Medicine.

Academic year: 2017 – 2018


This page is not available because it contains personal information. Ghent University, Library, 2021.


PREFACE

“Sic parvis magna” “Greatness from small beginnings”

After two years of intensive labour, I’m happy to finally present this dissertation. I would not have been able to do this without the help of my family, friends and, of course, my promotor Prof. Dr. Inge Huybrechts. She was always eager to lend support, especially during tough moments, and was available whenever I had questions or needed help. Furthermore, I would like to thank a few special people who aided me throughout this process. Firstly, my parents, without whom I would not have been able to bring it home. Secondly, I would like to thank my girlfriend Kimberley and all-time best friend Thomas, whose advice and support were deeply appreciated. Lastly, I won’t forget all the others who were there when I needed them the most. Thank you.


Contents

Abstract
Abstract
1. Introduction
1.1 Cancer as major health issue
1.2 Early detection of cancer
1.3 Biomarkers in screening for cancer
1.3.1 Definitions
1.3.2 Classification of Biomarkers
1.3.3 The Five Phases of Development
1.4 Modern biomarkers obstacles
1.5 Searching reliable biomarkers
2. Methods
2.1 Data sources and search strategy
2.2 Eligibility criteria
2.3 Exclusion Criteria
2.4 Data extraction
2.5 Outcome Measures
2.6 Results
2.7 Flow diagram
Identification
Screening
Eligibility
Included
3. Results
3.1 Colorectal Cancer
3.2 Hepatocellular cancer
3.3 Lung cancer
3.4 Ovarian cancer
3.5 Cancer
3.6 Pancreatic cancer
3.7 Breast Cancer
3.8 Gastric Cancer
3.9 Renal Cancer
3.10 Gynecologic cancer
3.11 Bladder Cancer
3.12 Oral Cancer
3.13 Esophageal Cancer
3.14 Skin Cancer
3.15 Osteosarcoma
3.16 Thyroid Cancer
3.17 Leukemia
3.18 Various types of cancer
4. Discussion
4.1 Colorectal Cancer
4.2 Hepatocellular cancer
4.3 Lung cancer
4.4 Ovarian cancer
4.5
4.6 Pancreatic cancer
4.7 Breast Cancer
4.8 Gastric cancer
4.9 Renal Cancer
4.10 Gynecologic Cancer
4.11 Bladder Cancer
4.12 Oral Cancer
4.13 Esophageal Cancer
4.14 Skin Cancer
4.15 Osteosarcoma
4.16 Thyroid Cancer
4.17 Leukemia
4.18 Various types of cancer
5. Conclusion
References
Addenda
Additional Tables


ABSTRACT

Introduction: Cancer is the leading cause of death and morbidity in Western civilization and presents a major health issue and economic burden worldwide. Early diagnosis and screening have been the two principal instruments of early detection. At an early stage, cancer responds better to effective treatment and leads to an improved probability of survival, less morbidity and lower therapy costs. However, early diagnosis of cancer is often made through the use of invasive methods with certain complications. Moreover, the currently used screening tests often produce false positive or false negative results, which in turn results in over-diagnosis (e.g. PSA and prostate cancer) or under-diagnosis (e.g. iFOBT and CRC). A solution for these problems could be the use of non-invasive, cost-effective, diagnostic or screening biomarkers. The aim of this systematic review was to provide an overview of the recent progress and developments in terms of early detection of cancer through biomarkers, as well as a full selection of the potential biomarkers ready for clinical use.

Key question: ‘Which cancer biomarkers are ready for use in a broad cancer screening assay and in what phase of development are they now?’

Methods: In order to identify all original research articles on cancer biomarkers of the past seven years, an exhaustive literature search was performed in PubMed. The MeSH terms used in the search strategy were: ‘Biomarker’, ‘Cancer’, ‘Early Detection’ and ‘Diagnosis’. An objective was to include studies with at least 100 patients. The search strategy was restricted to papers written in English and to studies with human participants. Sensitivity, specificity (with their respective 95% confidence intervals), area under the curve (AUC) and positive predictive value of each biomarker were sought and evaluated.

Results: A total of 7303 articles were identified by the search strategy. Ultimately, 98 articles met the eligibility criteria and were included in this systematic review. The results can be consulted in the additional tables. The majority of the included studies consists of (multi-center) case-control studies, cohort studies, population screening studies and validation studies. An abundance of new biomarkers were identified and tested over the past seven years. In the near future, some of these diagnostic biomarkers that were evaluated in phase III prospective cohort studies could be introduced in the clinic or implemented in a screening program.

Conclusion: The development of one biomarker that can detect every type of cancer is probably impossible. Combining multiple cancer-specific biomarkers in one panel, however, has proven to be significantly more sensitive in the detection of cancer. The majority of the promising biomarkers, when tested in phase III prospective studies, display a decrease in sensitivity and specificity of approximately 20 to 30%. Furthermore, resources are widely dispersed and there is no consensus (or coherence) on the right strategy to discover and validate new biomarkers. A solution for these problems could be the centralization of resources and efforts in an institution that validates the most promising biomarkers (in prospective studies), without compromising the freedom of researchers to continue identifying new potential biomarkers in retrospective studies.


ABSTRACT (TRANSLATED FROM DUTCH)

Cancer is the leading cause of morbidity and mortality in the Western world and presents a major health challenge. Early detection of cancer results in a higher chance of curative therapy and longer survival. A drawback is that early diagnosis of cancer is often made by means of invasive procedures with possible side effects. A solution to this problem could be the use of non-invasive, cost-effective biomarkers to detect cancer. This systematic review gives an overview of the most promising biomarkers and discusses the biomarkers that are currently implemented in screening programs. The search terms used in the PubMed search strategy were ‘Biomarker’, ‘Cancer’, ‘Early Detection’ and ‘Diagnosis’. A total of 7303 articles were screened and 98 articles met the inclusion criteria. A large number of new biomarkers were identified in retrospective studies (phase two studies). When these were validated in prospective studies, however, a decrease in sensitivity and specificity was often observed. The conclusion is that resources and efforts are widely dispersed. In addition, there is no consensus on the most reliable strategy for identifying and validating new biomarkers. A possible solution to this problem would be a centralization of resources and efforts in an overarching institution.


1. INTRODUCTION

1.1 CANCER AS MAJOR HEALTH ISSUE

Cancer is a group of diseases characterized by the uncontrolled growth and spread of abnormal cells. “The term cancer defines over one hundred different diseases that can arise from virtually any tissue or organ in the body and, while sharing common properties of local invasion and distant spread, may have different causal factors, molecular composition, natural history of disease, methods for diagnosis and methods by which they are treated” (1). Cancer is the leading cause of death and morbidity in Western civilization and presents a major health issue and economic burden worldwide. In Europe and the US, cancer has overtaken cardiovascular disease as the number one cause of death in recent years. The economic cost of cancer in 2010 was calculated at about 1.16 trillion dollars. In 2012, there were approximately 14 million new cases of cancer and, according to the WHO (World Health Organization) and GLOBOCAN 2012 (the online database of the International Agency for Research on Cancer), cancer was related to 8.8 million deaths worldwide in 2015 (2-4). As a result of the continuous growth of the aging world population, a cancer-related lifestyle (use of tobacco, unhealthy diet, physical inactivity and obesity), air pollution and the generally late detection, GLOBOCAN 2012 estimates that the number of newly detected cases of cancer will increase to approximately 19.3 million per year by 2025. The lifetime risk of developing cancer is currently at nearly 40% in industrialized countries and is expected to rise by 70% in the next two decades. At present, over 70% of all cancer-related deaths occur in low- and middle-income countries due to a less healthy lifestyle and a higher risk of infections. Worldwide, the leading causes of cancer death for men and women are lung cancer and breast cancer, respectively. Other deadly forms of cancer are liver, colorectal and stomach cancer. The most commonly diagnosed cancer in industrialized countries is prostate cancer, as a result of the widespread screening of men with the PSA (prostate-specific antigen) test (2). Furthermore, it is estimated that, by avoiding or correcting specific risk factors, up to 50% of all cancer deaths could be prevented (5). These risk factors include a sedentary lifestyle, a high body mass index, a low fruit and vegetable intake, physical inactivity, alcohol consumption and sexually transmitted HPV infections. Tobacco use is the most important risk factor and accounts for 22% of all cancer-related deaths (3).


1.2 EARLY DETECTION OF CANCER

To decrease cancer morbidity and mortality, early detection of cancer is vital and a priority in cancer research. Until now, early diagnosis (or public education) and screening have been the two principal instruments of early detection. At an early stage, cancer responds better to effective treatment and leads to an improved probability of survival, less morbidity and lower therapy costs (3, 4). Early diagnosis of cancer can be accomplished by raising awareness, addressing barriers, providing accessible care, improving clinical evaluation and reducing the time to diagnosis of cancer. A good example of raising awareness is the breaking of the taboo concerning breast cancer. Naturally, informing the public plays a key role in this stage. The purpose of public education is to raise the number of people who take part in cancer screening programs and to motivate persons at risk to complete screening exams and follow-up.

Screening, on the other hand, is the systematic application of tests to identify individuals at risk of a specific disorder who can benefit from further investigation or immediate preventive action. These individuals have not yet sought medical attention due to the absence of noticeable symptoms (6). The initial aim of cancer screening was to detect precursor lesions before they evolve into malignancies. However, this has frequently proven to be impossible, as these lesions often evolve into malignancies before the screening test detects them (7). The WHO released a number of criteria regarding the conditions for a screening program (6):

• The disease should be a relevant health problem.
• There must be an accessible treatment available for this health problem.
• There must be a test or examination that can detect the disease at a latent stage. This implies that the natural history of the disease is understood.
• The screening method must be accessible and the level of invasiveness of the tests must be acceptable to the population.
• The cost of screening for and treating the disease should be economically balanced.
• The benefits of the treatment must outweigh the possible harm.

Nowadays, only a handful of cancer screening tests have been put into practice. The most widely used tests are mammography in screening for breast cancer, fecal occult blood testing (FOBT), the Papanicolaou (PAP) test in screening for cervical cancer, flexible sigmoidoscopy in screening for colorectal cancer (CRC) and low-dose computed tomography in screening persons at high risk for lung cancer (6, 7). However, even though these tests have proven to reduce the mortality rate caused by cancer, they often show limitations in terms of sensitivity and specificity. Currently used screening tests often produce false positive results, meaning that the test indicates a positive result in individuals who do not have the disease. This regularly results in over-diagnosis, anxiety and unnecessary further medical examinations. “Therefore it is recommended by expert panels to inform the individual about the benefits and hazards related to the screening test as well as the treatment that may follow after the screening procedure” (6). Moreover, screening may cause overtreatment of patients at low risk (8). The lack of sensitivity and specificity, false positive results and overtreatment of cancer can easily be demonstrated with the example of prostate cancer. Firstly, the PSA (prostate-specific antigen) test used in screening for prostate cancer is not exclusively an indicator of cancer. PSA levels can be elevated as a consequence of benign prostatic hypertrophy (not related to cancer), diet, medication, cycling and prostatitis. Secondly, the PSA test doesn’t differentiate between the stages of prostate cancer and metastatic forms of prostate cancer, reducing the sensitivity and specificity of the test. Therefore, it is not surprising that the PSA test has an average sensitivity of 47% and a specificity of 65% for the diagnosis of prostate cancer. Furthermore, the PSA test often indicates a lesion in the prostate with no risk of developing into a malignancy, or a lesion so slow-growing that the patient is more likely to die first of causes unrelated to prostate cancer (7). Since cancer screening and treatment is a costly affair, only screening tests of sufficient quality, with a high sensitivity and specificity in the detection of all precursor lesions, are recommended.
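To make the consequence of limited sensitivity and specificity concrete, the following minimal sketch applies the PSA figures quoted above (sensitivity 47%, specificity 65%) to one hypothetical screening round. The cohort size of 10,000 men and the prevalence of 5% are illustrative assumptions only, not values from the cited studies, and `screening_outcomes` is a hypothetical helper written for this illustration.

```python
def screening_outcomes(n_screened, prevalence, sensitivity, specificity):
    """Expected numbers of true/false positives and negatives in one screening round."""
    diseased = n_screened * prevalence
    healthy = n_screened - diseased
    true_pos = diseased * sensitivity    # cancers that the test detects
    false_neg = diseased - true_pos      # cancers that the test misses
    true_neg = healthy * specificity     # healthy persons correctly reassured
    false_pos = healthy - true_neg       # healthy persons with a positive test
    return {"TP": true_pos, "FN": false_neg, "TN": true_neg, "FP": false_pos}


# Sensitivity and specificity are the PSA values quoted in the text;
# cohort size and prevalence are ASSUMED for illustration only.
outcomes = screening_outcomes(n_screened=10_000, prevalence=0.05,
                              sensitivity=0.47, specificity=0.65)
share_false = outcomes["FP"] / (outcomes["FP"] + outcomes["TP"])

print({name: round(count) for name, count in outcomes.items()})
print(f"Share of positive tests that are false positives: {share_false:.1%}")
```

Under these assumed inputs the large majority of positive results come from men without cancer, which is the mechanism behind the over-diagnosis and unnecessary follow-up examinations described above. A different prevalence would change the counts but not the logic.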

1.3 BIOMARKERS IN SCREENING FOR CANCER

1.3.1 Definitions

According to the Biomarkers Definitions Working Group of the National Institutes of Health, a biomarker can be described as “a cellular, biochemical, and/or molecular characteristic that can be objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention” (9).


Another definition of a biomarker could be “a biological molecule found in blood, other body fluids, or tissues that is a sign of a normal or abnormal process, or of a condition or disease” (10). “A cancer biomarker is a molecule, produced by cancer cells or by human cells as a result of the presence of a tumor in the body, that can be measured and used as an indicator of a cancerous process” (4). Synonyms for the term ‘biomarker’ are “marker”, “molecular diagnostic” or “signature molecule” (4). There is a vast number of cancer biomarkers, including DNA, RNA, proteins, peptides, hormones and antibodies, among many others. Some can be found in the circulation (blood, serum or plasma), some in secretions like urine, stools, sputum or nipple discharge. Less common biomarkers can be detected in saliva, cerebrospinal fluid, ascites fluid and hair. They can be obtained through invasive procedures like biopsy, or non-invasive procedures like special imaging. The most commonly known cancer biomarkers that are currently being used or investigated are (6):

• AFP (alpha-fetoprotein) in screening for hepatocellular cancer
• PSA in screening for prostate cancer
• CA 125 (cancer antigen 125) in screening for ovarian cancer
• FOBT (fecal occult blood test) and FIT (fecal immunochemical test) in screening for CRC (colorectal cancer)
• hCG (human chorionic gonadotropin) in screening for GTN (gestational trophoblastic neoplasia: mole or choriocarcinoma)

Currently, a lot of attention is given to new “-omics” technologies, including genomics, and to CTCs (circulating tumor cells). The term ‘liquid biopsy’ is used to describe a non-invasive manner of detecting CTCs, miRNA and cell-free tumor DNA (cfDNA) in cancer patients. CTCs are cells originating from a primary or metastatic tumor that can be found in the circulation and used for non-invasive tumor sampling. Circulating free DNA is DNA released by tumor cells (or any other cell) into the circulation. The cfDNA originating from the tumor (ctDNA) carries multiple mutations, and researchers are studying it to reconstruct the genome of various tumors. MicroRNAs, on the other hand, are “fragments of single-stranded non-coding RNA that regulate a variety of genes” (4). In patients with cancer, these miRNAs are often overexpressed or downregulated, allowing the tumor to continue developing. These new potential biomarkers show great promise. Nevertheless, not many of them have been clinically validated and they are still under investigation.

In recent years, cancer therapy has been following a new direction towards more specialized and individualized methods, away from the ‘one size fits all’ mentality. Because cancer is a heterogeneous disease, each tumor is unique and can be identified by its own molecular characteristics. Future biomarker tests should be able to distinguish different groups of patients with the same type of cancer: e.g. to separate the patients with malignant forms from those with indolent cancers, or to predict whether certain cancers will respond to a specific treatment. Subsequently, the patient will receive a suitable dose of the right medicine at the appropriate time (4).

1.3.2 Classification of Biomarkers

Following the classification used by Alvaro Mordente, biomarkers are generally divided into three categories. Note, however, that a biomarker can belong to more than one category (4).

• Prognostic biomarkers predict the outcome of a disease in the absence of an intervention. They predict the natural progression of cancer and determine whether the cancer is potentially malignant or not. They are relevant at the time of diagnosis to make appropriate decisions regarding the aggressiveness of the treatment. Nonetheless, no prognostic biomarker can accurately provide a certain outcome for an individual; usually, a rough estimate of the probability of several potential outcomes is provided.
• Predictive biomarkers provide information about the responsiveness of a particular type of cancer to a specific treatment. These biomarkers estimate whether a more tumor-specific treatment could be of greater benefit to a patient than the standard treatment: e.g. the selection of patients with lung cancer for EGFR or ALK TKI therapy by biomarker testing (11). These biomarkers are the future of personalized cancer treatment.
• Pharmacodynamic markers “present information on the effects of a drug on the human body” (4).

The above biomarker categories can only be used after the diagnosis of cancer. This systematic review, however, will focus on the potential use of biomarkers as an early detection method for cancer (the screening biomarkers). Moreover, it will discuss the recent advancements in biomarker research.

Figure 1: Classification of biomarkers.
Mordente A, Meucci E, Martorana GE, Silvestrini A. Cancer Biomarkers Discovery and Validation: State of the Art, Problems and Future Perspectives. Adv Exp Med Biol. 2015;867:9-26.

1.3.3 The Five Phases of Biomarker Development

• Phase 1: Preclinical exploratory studies. In this phase, many promising biomarkers are discovered through preclinical semi-quantitative studies. The comparison between cancer and non-cancer tissue is included in this phase of biomarker development. Immunohistochemistry, micro-arrays, mass spectrometry and western blots are frequently used in these studies. The desirable approach to discovering new biomarkers is the “hypothesis-driven” method. In this phase, as in any phase of development for that matter, transparency of the methods used and of the results is essential (4, 12).

• Phase 2: Clinical Assay Development. The next phase is the development of a robust and specific test to measure the biomarker in patients. This is a very important step, as this test could be put into practice for screening or diagnosing cancer. Firstly, the test must be analytically validated. This is defined as “the reliability of the test to measure the biomarker in the clinical laboratory, and in specimen representative of the population of interest” (4). Secondly, the test must be clinically validated. This implies that the test distinguishes two or more specific groups with different characteristics; usually the persons with cancer from the persons without cancer. Finally, the test must prove its clinical utility. This includes the test’s potential to improve upon the currently used ‘gold standard’ of practice in terms of measurable clinical outcomes and patient management decisions. It is a vital step in proving whether a cancer biomarker test is ready for application in patient care. Clinical utility is generally determined using survival or progression-free survival as the end point. The gold standard for testing clinical utility is a prospective randomized controlled trial (4, 6, 12, 13).

• Phase 3: Retrospective Longitudinal Repository Studies. After the clinical assay has been developed, a retrospective assessment using large numbers of previously stored samples can determine the outcome and establish the cut-off point for a biomarker. This phase can also be useful to compare or combine multiple biomarkers in order to establish an algorithm (4, 12).

• Phase 4: Prospective Screening Studies. During this stage of biomarker development, the researchers aim to demonstrate that the biomarker test is suitable for the early detection of cancer. This includes calculating the PPV (positive predictive value) and the false positive rate.

• Phase 5: Cancer Control Studies. In this final phase, the investigators assess whether the biomarker test is suitable for use in the population. It is then determined whether or not the test reduces the overall mortality and morbidity caused by cancer in the population (4, 12).

In the continuing search for new biomarkers, only a few of the many potential candidates prove to be adequate. The most robust way to demonstrate a biomarker’s effectiveness is through the use of clinical endpoints (“a characteristic or variable that reflects how the patient feels, functions or survives”) in a clinical trial. However, this is a long, expensive process and large numbers of patients are needed to attain reliable results. Thus, many studies make use of intermediate endpoints (“a true clinical endpoint but not the endpoint of the disease”) and present inaccurate results (13).


1.4 MODERN BIOMARKERS OBSTACLES

Due to the importance of early detection of cancer for increasing survival chances, interest in cancer biomarkers has grown exponentially in the past years, along with the number of newly developed cancer biomarkers. The perfect biomarker should be generated by cancer cells only, indicate the stage of the cancer and offer an acceptable lead time to detect the disease. Furthermore, for a biomarker to be useful as a screening investigation, it must be quantifiable in the blood, stools or other body fluids of cancer patients at an early (preferably asymptomatic) stage and undetectable in healthy persons. As already stated above, a biomarker should be cost-effective and used in a robust test with high sensitivity and specificity. Until now, the perfect biomarker has not been found, and the research and development of new biomarkers has been fairly disappointing, with few biomarkers living up to the high requirements needed to be implemented in a screening test. Only a small number of biomarkers have been introduced into clinical use in the last 30 years and the number of biomarkers appears to have stagnated (4, 7, 13). However, combining different biomarkers in one panel, test or algorithm to increase the sensitivity and specificity of new biomarker tests shows promising results in more recent studies.
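One simple way to see why a panel can raise sensitivity is the rule "the panel is positive if any marker is positive". The sketch below is a minimal illustration of that rule only: it assumes that the markers are statistically independent within the diseased group and within the healthy group, the marker values are hypothetical, and `combine_any_positive` is an illustrative helper that is not taken from any of the cited studies.

```python
def combine_any_positive(markers):
    """Combine markers with an 'any marker positive' rule.

    markers: list of (sensitivity, specificity) tuples.
    ASSUMPTION: markers are conditionally independent given disease status.
    Returns (panel_sensitivity, panel_specificity).
    """
    miss_all = 1.0      # probability that every marker misses a true cancer
    all_negative = 1.0  # probability that every marker is negative in a healthy person
    for sensitivity, specificity in markers:
        miss_all *= 1.0 - sensitivity
        all_negative *= specificity
    return 1.0 - miss_all, all_negative


# Hypothetical markers, for illustration only.
panel_sens, panel_spec = combine_any_positive([(0.60, 0.90), (0.50, 0.95)])
print(f"panel sensitivity = {panel_sens:.2f}, panel specificity = {panel_spec:.3f}")
```

With this rule the panel is more sensitive than either marker alone, but its specificity is lower than that of each individual marker. A panel that is meant to improve both measures therefore needs a different combination rule, for example a fitted algorithm with its own cut-off.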

Why has the development of an ideal biomarker been so difficult? The problems in biomarker development have frequently been underestimated. During the first phases of biomarker development, preclinical or small clinical studies repeatedly show promising results, but time and again these biomarkers fail to perform in larger clinical trials. Moreover, there is the recurrent problem of clinically validating a biomarker’s effectiveness; often the sensitivity, specificity and PPV of the biomarker test decrease when the test is applied to a larger group of patients. According to Diamandis, there are three categories of biomarker failures (4):

• Fraudulent reports (e.g. the statistical errors in the article of Potti about genomic signatures (14)): this category represents a minor percentage of all biomarker failures.
• The discovery and validation of potential biomarkers by using solid techniques in observational studies, but with a poor clinical outcome as a result of low sensitivity, low specificity, or low prognostic or predictive value.
• The discovery of false cancer biomarkers: biomarkers that look very promising at first, but cause problems in the discovery or validation phase later on.


In order to minimize these failures in the future, several guidelines for today’s development of biomarkers have been presented. The BRISQ (Biospecimen Reporting for Improved Study Quality) and REMARK (Reporting Recommendations for Tumor Marker Prognostic Studies) guidelines are widely known and describe the statistical and pre-analytical problems that can be avoided by introducing a solid prognostic study design. The current gold standard for proving the clinical utility of a biomarker is a study with a PROBE (Prospective-Randomized-Open-Blinded-End-point) design, normally an RCT (randomized clinical trial). The implementation of these guidelines should lead to the discovery of new cancer biomarkers with reliable sensitivity and specificity (4, 7, 10).

1.5 SEARCHING RELIABLE BIOMARKERS

As a result of these biomarker failures, there is an abundance of literature available about cancer biomarkers that are not ready for practical use. Firstly, it has proven to be extremely difficult to discover new biomarker tests with clinical utility. Often these biomarker tests are deemed “revolutionary” or “ground-breaking”, with high sensitivity and specificity in the initial observational tests, but only a few of these tests reach a sufficient level of clinical validity during phase two. Secondly, in most of these observational studies, the discovery of new biomarkers is relatively uncharted territory. These studies often lack the advantage of well-designed experiments. Therefore, underreporting and invalid reporting occur frequently (7). The main objective of this systematic review is to give an overview of the recent progress and developments in terms of early detection of cancer through biomarkers, as well as a full selection of the potential biomarkers ready for clinical use. Previous reviews with a broad scope in this area of expertise have been sparse due to the vast number of studies currently in the field. For that reason, it is important to distinguish the promising biomarkers that lack sufficient data from the robust biomarkers that could be used in practice or considered as a screening test. As a result, an effort will be made to include as many reliable studies (such as the PROBE studies) as possible that meet the REMARK criteria.

Key question: Which cancer biomarkers are ready for use in a broad cancer screening assay and in what phase of development are they now?


2. METHODS

2.1 DATA SOURCES AND SEARCH STRATEGY

To identify all original research articles on cancer biomarkers of the past seven years, extensive literature searches were performed in PubMed. The search terms used in the search strategy were: ‘Biomarker’, ‘Cancer’, ‘Early Detection’ and ‘Diagnosis’. This systematic review was drawn up according to the method described in the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) checklist (http://www.prisma-statement.org/).

2.2 ELIGIBILITY CRITERIA

All studies published between January 1, 2010 and October 10, 2017 were included in the search to make sure all recently published material was found. All study designs were included in the primary search. Only studies performed on humans (hand-picked) and written in English regarding the early detection of cancer through the use of biomarkers were retained. Wherever possible, only studies with more than 100 patients were included. However, some articles that address rare types of cancer with fewer than 100 patients are included, considering the difficulty of finding more than 100 patients per study. The majority of the included studies consists of (multi-center) case-control studies, cohort studies, population screening studies and validation studies. Quality assessment of the included studies was performed using the guidelines of F. Fowkes and P. Fulton (15).

2.3 EXCLUSION CRITERIA

To exclude articles with possibly biased results due to a low number of participants, articles with fewer than 100 patients (save those discussing a rare variety of cancer with a limited number of patients) have not been included, as mentioned before. In addition, studies conducted solely on cell lines and not on human subjects were also excluded. Other reasons for exclusion included a lack of data transparency, a shortage of data analysis, citation titles without abstracts and abstracts of articles discussing panels without stating which biomarkers were studied.


2.4 DATA EXTRACTION

The studies that fulfilled the inclusion criteria were categorized according to the type of cancer the biomarker of the study identifies. Information about the number of patients and controls was retrieved from each article. The patients were generally divided into persons with cancer and persons with benign lesions.

2.5 OUTCOME MEASURES

To assess the robustness of each tumor marker’s performance, sensitivity and specificity (with their respective 95% confidence intervals), area under the curve (AUC) and positive predictive value (PPV) were sought and evaluated. The sensitivity of a test is defined as the probability that people with the disease test positive. The specificity of a test, on the other hand, is defined as the probability that people without the disease test negative. When estimating the probability of a patient suffering from a disease, the sensitivity and specificity are combined with the prevalence of the disease to obtain the PPV. The PPV is the probability that a person who tests positive actually has the disease. The area under the receiver operating characteristic curve is a tool to evaluate the accuracy of a test, in this case the ability to identify those with and those without cancer. An AUC of 1 represents a biomarker test that separates healthy persons from patients perfectly, whereas an AUC of 0.5 means the test discriminates no better than chance.
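The outcome measures defined above can be written out as a short sketch. The counts and marker levels below are hypothetical and serve only to show the calculations; the function names are illustrative, and the AUC function assumes that higher marker values indicate disease. It computes the AUC as the proportion of case-control pairs in which the case has the higher value (ties count as one half), which is equivalent to the area under the ROC curve.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Sensitivity, specificity and PPV from the four cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # P(test positive | disease)
        "specificity": tn / (tn + fp),  # P(test negative | no disease)
        "ppv": tp / (tp + fp),          # P(disease | test positive)
    }


def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """PPV from sensitivity, specificity and prevalence (Bayes' theorem)."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


def auc(case_values, control_values):
    """AUC as the fraction of case-control pairs ranked correctly (ties = 0.5)."""
    score = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                score += 1.0
            elif case == control:
                score += 0.5
    return score / (len(case_values) * len(control_values))


# Hypothetical study: 100 patients and 900 controls (prevalence 10%).
metrics = diagnostic_metrics(tp=80, fp=90, fn=20, tn=810)
print(metrics)
print(ppv_from_prevalence(metrics["sensitivity"], metrics["specificity"], 0.10))

# Hypothetical marker levels in patients and in controls.
print(auc([3, 5, 7, 9], [1, 2, 5, 6]))
```

In this example the PPV obtained from the counts and the PPV obtained from sensitivity, specificity and prevalence agree, because the prevalence passed to `ppv_from_prevalence` equals the proportion of patients in the hypothetical table. In a case-control study the proportion of cases is fixed by design, so the PPV for a screening population has to be calculated with the prevalence of that population instead.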

2.6 RESULTS

A PRISMA diagram of the studies included in this systematic review is summarized in figure 2.


2.7 FLOW DIAGRAM

Identification
Records identified through database searching: PubMed (n = 7303)
Records after duplicates removed (n = 7303)

Screening
Records screened on title (n = 7303); records excluded (n = 5519)
Records screened on abstract (n = 1751); records excluded (n = 1525)

Eligibility
Studies screened on full text (n = 226)

Included
Articles included in qualitative analysis (n = 98)

Figure 2. PRISMA diagram of studies searched and selected.


3. RESULTS

3.1 COLORECTAL CANCER As can be seen in additional table 1, 16 articles concerning detection biomarkers for colorectal cancer (CRC) were found eligible for this systematic review. The first article, by Nielsen and colleagues, describes a prospective population-based validation study of two serum biomarkers, CEA (carcinoembryonic antigen) and TIMP-1 (plasma tissue inhibitor of metalloproteinases-1), for colorectal cancer in a high-risk population. The expected incidence of CRC in this population is 4%. As a result, it was calculated that the sample size necessary to validate these biomarkers for the detection of CRC was 4500 individuals. All 4509 participants underwent a colonoscopy and/or sigmoidoscopy to confirm or exclude CRC or adenomas. People with colon cancer had higher TIMP-1 levels than patients with rectal cancer, and both TIMP-1 and CEA levels were increased in patients with adenomas. However, neither plasma TIMP-1 nor CEA was able to differentiate between patients with adenomas < 1cm and those with adenomas > 1cm. In a univariate analysis of TIMP-1 and CEA, plasma TIMP-1 proved to be a significant predictor and detector of CRC, with an odds ratio (OR) of 2.9, a 95% confidence interval (CI) of 2.4-3.5 and an AUC of 0.7 (P<0.0001). Plasma CEA had an OR of 2.4, 95% CI 1.9-3.0 and an AUC of 0.73 (P<0.0001) in patients with CEA levels over 5 ng/ml. A significant interaction between plasma TIMP-1 and comorbidities such as diabetes, chronic lung diseases and cardiovascular diseases was shown (P=0.02). Therefore, a multivariable model for plasma TIMP-1 and CEA was calculated, taking into account age, gender and comorbidities. In this model, TIMP-1 had an OR of 1.2, 95% CI 0.9-1.61 in individuals with no comorbidity, and CEA levels over 5 ng/ml had an OR of 2.4, 95% CI 2.0-3.0. The AUC of this multivariable model, combining CEA and TIMP-1, was 0.82. This model shows that the combination of these two biomarkers improves the detection of CRC compared to either biomarker on its own.
An analysis of a subgroup consisting solely of colon cancer cases was also made (not shown in additional table 1). It demonstrates that a multivariable model of plasma TIMP-1 and CEA reaches an AUC of 0.82 as well. However, the OR of plasma TIMP-1 for the detection of colon cancer (CC) is higher: 2.1, with a 95% CI of 1.4-3.1 (16).

The article concerning the evaluation of fecal tumor M2 pyruvate kinase (M2PK) as a diagnostic biomarker for CRC screening describes a prospective multi-center cohort study of a high-risk population. The 328 patients enrolled in this study were patients referred for colonoscopy. A stool sample was taken either before the preparation for the colonoscopy or after the colonoscopy. Tumor M2PK regulates the equilibrium between the production of ATP and the synthesis of fatty acids and nucleic acids in cancer cells. It is essential in tumor growth and glycolysis during tumorigenesis and is therefore raised in patients with CRC (17). Based on the receiver operating characteristic (ROC) curve, the optimal cut-off level of tumor M2PK was 4.00 U/ml; at this cut-off, the sensitivity, specificity, positive predictive value and negative predictive value were 71.4%, 71.0%, 73.5% and 94.4%, respectively. Using the M2PK test, 16 tumors among 67 cases of adenoma, 8 tumors among 19 cases of IBD, 36 tumors among 114 cases of infective colitis and two tumors among three cases of amoebic colitis were found. Only 12 tumors among the 42 cases of cancer had a concentration under 4 U/ml and were reported as false negative. A total of 83 stool samples from patients without CRC were false positive for M2PK. Additionally, a significant association between the presence of CRC and fecal tumor M2PK levels was found (P<0.0001) (17).

Next-generation stool DNA testing is the subject of a blinded, multicenter, case-control study performed by Ahlquist et al. on archived stool samples from 252 patients with colorectal cancer and 293 controls with normal colonoscopy. Patients and controls were randomly divided into a training set and a test set to evaluate the performance of the stool DNA test. A new second-generation assay technique, QuARTS (quantitative allele-specific real-time target and signal amplification), was developed for this study. The panel consisted of the methylation markers vimentin, NDRG-4, BMP-3 and TFPI-2, complemented by mutant KRAS, β-actin and hemoglobin. The training set reported, at a specificity cut-off of 90%, a sensitivity of 89%, 95% CI 89-93% for detecting CRC and a sensitivity of 62%, 95% CI 49-74% for detecting adenomas > 1cm (data concerning the adenomas are not described in additional table 1) (18). The AUC for the detection of CRC was 0.94, whereas the AUC for the detection of adenomas > 1cm was 0.84. The test set presented, at a specificity cut-off of 85%, a sensitivity of 78%, 95% CI 68-86% for the detection of CRC and a sensitivity of 64%, 95% CI 45-80% for the detection of adenomas > 1cm. The AUCs were 0.88 for detecting CRC and 0.81 for adenomas > 1cm. Furthermore, detection rates of adenomas and CRC increased with size (P<0.0001 for adenomas and P=0.71 for CRC) (18). The sensitivity was not affected by the site of the lesion, nor by the stage of CRC. The sensitivity for stages I-III was 87%; for stage IV it was 69%. The AUC for the individual biomarkers was 0.75 for NDRG-4, 0.73 for BMP-3, 0.69 for , 0.66 for hemoglobin and 0.61 for mutant KRAS (18).


The study of Khales et al. explores the prognostic and/or diagnostic use of mRNA levels of SALL-4 and CEA in peripheral blood and serum of 51 patients and 60 healthy controls. SALL-4 is a recently discovered self-renewal stem cell factor that is highly overexpressed in CRC and can be defined as an oncogene. The copy number of SALL-4 in blood and serum of the CRC patients was elevated in comparison to that of the healthy controls (P<0.0001). The same conclusion could be drawn for the copy number of CEA in CRC patients (P<0.05). Furthermore, it is interesting to note that the mRNA levels of CEA and SALL-4 were significantly correlated with each other (P=0.002). SALL-4 demonstrated a sensitivity of 96.1%, a specificity of 95% and an AUC of 0.981 (19).

In the search for new biomarkers as a replacement for the immunochemical Fecal Occult Blood Test (iFOBT), Calistri et al. conducted a prospective cohort study of a new fecal DNA assay. A total of 560 patients with positive iFOBT results were enrolled in this study and examined by colonoscopy. Of those 560 participants, 26 were diagnosed with adenocarcinomas, 264 with high-risk adenomas and 54 with low-risk adenomas, while 216 patients showed no lesions or premalignant symptoms. The DNA assay is based on the difference in fluorescence intensity of each PCR product obtained with fluorescent-labeled primers. In this way, a quantitative analysis of p53 exons and APC fragments 1-2 and 3-4 could be carried out. In order to compare the fluorescence intensity curves of the patients with CRC to those of normal patients, a standard curve of genomic DNA was used as a reference. Furthermore, the diagnostic accuracy of the iFOBT was measured and compared to that of the DNA assay. The lowest median iFOBT levels were observed in patients without lesions and in patients with low-risk adenomas, 222 ng/mL and 246 ng/mL respectively. The median value of 327 ng/mL observed in patients with high-risk adenomas was not statistically different. However, the median iFOBT value in these three groups was statistically different from the median iFOBT value in patients with CRC (1,511 ng/mL, P<0.0001). The DNA assay reported similar results: the three subgroups without cancer recorded a statistically lower median value than the group of CRC patients. Patients without lesions had a median value of 13 ng, patients with low-risk and high-risk adenomas had a median value of 12 ng and patients with CRC had a median value that was three times higher (38 ng, P<0.0001). The FL-DNA (fluorescence long DNA) levels were higher in females and significantly higher in patients with tumors located in the proximal and distal colon or rectum.
In order to further evaluate the combination of iFOBT and FL-DNA, a nomogram was provided to estimate the probability of having CRC based on the test results. If the iFOBT shows levels of over 1000 ng/ml and the FL-DNA value exceeds 30 ng/ml, a likelihood ratio of 2 could be achieved (20).
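To indicate what a likelihood ratio of 2 means in practice, the following minimal Python sketch converts a pre-test probability into a post-test probability. It assumes that the reported value is a positive likelihood ratio and that the pre-test probability equals the proportion of adenocarcinomas among the iFOBT-positive participants (26 of 560); both are assumptions made for illustration, and the function name is hypothetical.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a probability with a likelihood ratio via odds (Bayes' theorem)."""
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Assumption: pre-test probability = share of adenocarcinomas among the 560 participants.
pre_test = 26 / 560
post_test = post_test_probability(pre_test, likelihood_ratio=2.0)
print(f"pre-test probability:  {pre_test:.3f}")
print(f"post-test probability: {post_test:.3f}")
```

Under these assumptions, a likelihood ratio of 2 slightly less than doubles the probability of CRC, because the update acts on the odds rather than on the probability itself.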

Identifying new methylation markers in stool and comparing the sensitivity and specificity of these new markers to those of the iFOBT is the subject of a retrospective case-control study conducted by Bosch et al. The researchers’ first step was to measure the downregulated genes in cancer cell lines. These genes were further studied in cancer tissue samples and compared to normal tissue samples. Only phosphatase and actin regulator 3 (PHACTR3) passed further validation tests in tissue samples. In order to compare it to the iFOBT, the performance of PHACTR3 methylation in stool samples of both healthy individuals and patients with CRC was analyzed. The training set, consisting of 100 stool samples (22 CRC patients and 78 control samples or samples of patients with non-advanced adenomas), was used to calculate the optimal cut-off value of methylation levels for detecting CRC patients (21). The AUC of this training set was 0.77, 95% CI 0.64-0.90 with a maximum sensitivity of 55%, 95% CI 33-75% at a fixed specificity of 95%, 95% CI 87-98%. A cut-off value of 82.5 relative copies reached the highest sensitivity. The subsequent validation set was used to confirm the results of the training set and consisted of 93 stool samples (44 CRC patients, 19 patients with advanced adenomas and 30 controls). ROC analysis of CRC and advanced adenomas compared with controls yielded an AUC of 0.83, 95% CI 0.75-0.91, a sensitivity of 66%, 95% CI 50-79% for the detection of CRC and a sensitivity of 32%, 95% CI 14-57% for the detection of advanced adenomas, with an overall specificity of 100%, 95% CI 86-100% at a cut-off value of 82.5 relative copies. To determine whether the combination of the iFOBT and PHACTR3 could improve the detection of CRC, both tests were analyzed in an independent series of stool subsamples. Sensitivities for detecting advanced adenomas were 21%, 95% CI 9-40% for iFOBT and 21%, 95% CI 9-40% for PHACTR3.
Combining iFOBT and PHACTR3, with a positive result defined as a positive result on at least one of the two tests, increased the sensitivity to 33%, 95% CI 18-53%. The sensitivity for the detection of CRC was 50%, 95% CI 30-70% for PHACTR3 and 65%, 95% CI 32-82% for iFOBT. The combination of iFOBT and PHACTR3 increased the sensitivity to 95%, 95% CI 76-99%. The overall specificity for iFOBT and PHACTR3 combined was 94%, 95% CI 83-98% (96% for PHACTR3 and 98% for iFOBT). The AUC was calculated at 0.79, 95% CI 0.69-0.92 for CRC and advanced adenoma, and at 0.97, 95% CI 0.93-1.0 for CRC alone (21).
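The rule of counting a combined result as positive when at least one of two tests is positive can be sketched in Python as follows. The per-person results are hypothetical and are not data from the study; the second test is simply labelled "marker". The sketch only illustrates why such a combination tends to raise the sensitivity at the expense of the specificity.

```python
def rate(flags):
    """Fraction of True values in a list of booleans."""
    return sum(flags) / len(flags)


def either_positive(test_a, test_b):
    """Combined result per person: positive if at least one test is positive."""
    return [a or b for a, b in zip(test_a, test_b)]


# Hypothetical results (True = test positive) for 10 patients with CRC.
ifobt_cases = [True] * 6 + [False] * 4
marker_cases = [True, True, True, False, False, False, True, True, False, False]

# Hypothetical results for 10 controls without CRC.
ifobt_controls = [False] * 9 + [True]
marker_controls = [True] + [False] * 9

sens_ifobt = rate(ifobt_cases)
sens_marker = rate(marker_cases)
sens_combined = rate(either_positive(ifobt_cases, marker_cases))

spec_ifobt = 1 - rate(ifobt_controls)
spec_marker = 1 - rate(marker_controls)
spec_combined = 1 - rate(either_positive(ifobt_controls, marker_controls))

print(f"sensitivity: iFOBT={sens_ifobt:.2f} marker={sens_marker:.2f} combined={sens_combined:.2f}")
print(f"specificity: iFOBT={spec_ifobt:.2f} marker={spec_marker:.2f} combined={spec_combined:.2f}")
```

The combined sensitivity can never be lower than that of either test, and the combined specificity can never be higher. The size of the gain depends on how often the two tests miss the same patients.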

Analyzing new tumor antigen-associated antibodies as new biomarkers for cancer screening is the topic of the article of Chan et al. Previous studies have demonstrated that a humoral immune response is activated during tumorigenesis, even though the underlying mechanism has not been discovered yet. The purpose of this retrospective case-control study was to develop a panel of five selected recombinant fragments as coating antigens to measure the prevalence of antibodies in the serum of patients with colorectal cancer and to determine whether this panel could be used for the detection of CRC. Serum samples of 94 patients with CRC were acquired. For control purposes, serum samples of 54 healthy individuals were obtained during annual health examinations. Using ELISA, five recombinant colorectal tumor-associated antigens were selected for the panel, namely rCCCAP, rHDAC5, rP53, NY-CO-16 and rNMDA. As can be seen in additional table 1, the sensitivity and specificity were 35.1% and 96.3% for rCCCAP, 20.2% and 96.3% for rHDAC5, 24.5% and 98.1% for rP53, 18.1% and 100% for NY-CO-16 and 20.2% and 96.3% for rNMDA. When combining the five recombinant antigens, a sensitivity of 58.5% and a specificity of 92.6% were reached. The panel was considered positive if antibodies against at least one of the five antigens were observed. The detection of antibodies in the sera was not statistically associated with gender, age, tumor location or Dukes’ classification in CRC patients. Furthermore, CEA levels in the patients’ sera were analyzed and, in order to increase the sensitivity, the CEA assay and the panel of five antigens were combined in one test. CEA displayed a sensitivity of 21.9% in the early-stage group and 58.4% in the advanced-stage group. The combination of the CEA assay and the five-antigen panel demonstrated an overall increase in sensitivity to 77.6% (P<0.001), and a sensitivity of 65.9% for the detection of early-stage CRC (22).

The article of Ciarloni and colleagues describes the discovery and validation of a 29-gene panel (listed in additional table 1) in a multicenter, case-control study comprising three phases. The first phase consisted of 93 participants (31 controls, 31 CRC patients and 31 patients with adenomas > 1cm), the second phase consisted of 51 participants (19 controls, 17 CRC patients and 15 patients with adenomas > 1cm) and the third phase combined the results of the first two phases. These three phases were used to identify the 29 genes with the most potential for the detection of CRC in a panel. To evaluate the predictive accuracy of this 29-gene panel, univariate and multivariate analyses of the dataset were performed. With the multivariate approach, the sensitivity for detecting CRC and adenomas > 1cm was 75% and 59% respectively, with a specificity of 91%. ROC analysis showed an AUC of 0.88, 95% CI 0.83-0.92 for CRC detection and an AUC of 0.85, 95% CI 0.78-0.91 for the detection of adenomas > 1cm. The univariate approach showed a marked decrease in predictive accuracy. With the specificity set at 91%, the sensitivity dropped to 65% and 37% for the detection of CRC and adenomas > 1cm respectively, while the AUC was calculated at 0.86 for the former and 0.77 for the latter (23).


As a direct continuation of the former study, Ciarloni set out to prospectively validate the 29-gene panel in a new multicenter case-control study. The study included 149 controls (all with negative colonoscopies), 103 patients with adenomas > 1cm, 97 patients with CRC (all adenocarcinomas except one of the squamous and two of the mucinous type) and 245 patients with other diseases (IBD: 8 with Crohn, 6 with ulcerative colitis; adenomas < 1cm, hyperplastic polyps, inflammatory diseases and other cancers). Training and validation sets were created in the discovery phase to define the new predictive algorithms. In a second phase, the testing phase, these algorithms were validated in three test sets. Ultimately, two algorithms were tested: an algorithm based on the 29-gene panel identified in the former study, called the multigene multiclassifier algorithm or MGMC, and a second algorithm based on a combination of the MGMC algorithm with the proteins CEA and CYFRA21-1, called the MGMC-P algorithm. Both algorithms present a binary result: the presence or absence of CRC or adenoma > 1cm. For the MGMC algorithm, the combination of the three test sets showed a sensitivity of 55.4%, 95% CI 43-68% for the detection of adenomas > 1cm and a sensitivity of 79.5%, 95% CI 68-88% for the detection of CRC, with a specificity of 90.0%, 95% CI 82-95%. When only the first two test sets were combined, a sensitivity of 60.9% was attained for the detection of early-stage cancer (stages I-II) with a specificity of 89.2%. This figure rose to 75.6% when stage III colorectal cancer was included. Non-GI inflammatory and viral diseases had no effect on the positivity rate (18.2% and 20%). However, IBD and other GI diseases displayed a higher positivity rate of 42.9%.
The combination of the MGMC algorithm and the protein tumor markers CEA and CYFRA21-1 presented a sensitivity of 52.3%, 95% CI 40-65% for the detection of adenomas > 1cm and a sensitivity of 78.1%, 95% CI 67-87% for the detection of CRC, with an overall specificity of 92.2%, 95% CI 85-97%. Although the difference was not statistically significant, the specificity was higher than when only the MGMC algorithm was tested (24).

The retrospective cohort study of Koga and colleagues describes a new fecal microRNA test (FmiRT) for the detection of CRC. A total of 117 patients with colorectal cancer and 107 healthy volunteers participated in this study. Fecal miRNA was obtained from the residuum of iFOBT samples from CRC patients and healthy controls. Subsequently, this fecal miRNA was used in a test (FmiRT) for the detection of CRC. The aim of the study was to determine whether the combination of the FmiRT with the iFOBT could decrease the number of false-negative results that occur when using the iFOBT. The sensitivity of the FmiRT (using miR-106 expression) was 34.2%, 95% CI 25.6-43.6% with a specificity of 97.2%, 95% CI 92-99.4%. The iFOBT on its own and in combination with the FmiRT showed sensitivities of 60.7% with a 95% CI of 51.2-69.6% and 70.9% with a 95% CI of 61.8-79%, respectively. The specificity of the iFOBT alone amounted to 98.1%, 95% CI 93.4-99.8%, whereas the iFOBT in combination with the FmiRT presented a specificity of 96.3%, 95% CI 90.7-99%. In 25% of all CRC patients with false-negative iFOBT results, the results became positive when a combination of the iFOBT and the FmiRT was used (25).

In another study, conducted by Koga et al., a DNA chip assay using six genes is proposed as a new screening method for CRC. This retrospective cohort study included 41 patients with CRC and 54 healthy controls in the training set, and 12 patients with CRC and 7 healthy controls in the validation set. All healthy controls had a negative colonoscopy. The DNA chip contained 24,460 genes; the median number of detected genes in the 53 patients with CRC was 6,379, whereas the median number of genes found in the healthy controls was 3,047. A total of 43 genes had significantly higher expression levels among the CRC patients than in the healthy control group. Subsequently, the expression levels of these genes were further evaluated in the training and validation sets. The highest sensitivity obtained in the training set was 85.4%, combining the expression levels of six different genes. The specificity in this training set was 85.2%. The six genes were CEBPB, FCGR3A, PFKFB3, SOD2, RGS2 and IL-8. These were considered as new diagnostic biomarkers for CRC. In the validation set, a sensitivity of 83.3% was observed using five to seven genes. The specificity in this validation set was 85.7%. Additionally, the iFOBT had a sensitivity of 53.7% in the training set and 66.7% in the validation set for the detection of CRC. The specificity of the iFOBT was 98.1% in the training set and 100% in the validation set.
Previous studies have demonstrated that patients with tumors of less than 35 mm in diameter, located in the cecum, the ascending colon or the transverse colon and invading the muscularis propria, display false-negative results when using the iFOBT. In this subgroup, a higher sensitivity for the detection of CRC was observed with the DNA chip assay than with the iFOBT (26).

The article of Marshall and colleagues explores the use of a blood-based screening test that could potentially assess the risk of having CRC at present or in the future. In this retrospective multi-center cohort study, an analysis of a previously identified 7-gene biomarker panel for discriminating patients with CRC from healthy controls was conducted. The seven genes in the panel were ANXA3, CLEC4D, LMNB1, PRRG4, TNFAIP6, VNN1 and IL-2RB. The first six were overexpressed; the last was underexpressed. Blood samples were collected at 25 hospital centers. A training set with 112 CRC patients and 120 controls and a test set, using a blind independent average-risk cohort, with 202 CRC patients and 208 controls were used. A multivariate logistic regression analysis was performed, and the blood test had a sensitivity of 82%, a specificity of 64%, a positive predictive value of 68% and a negative predictive value of 79% in the training set. Moreover, ROC analysis determined an AUC of 0.80, 95% CI 0.74-0.85 in the training set. The test set displayed a sensitivity of 72%, a specificity of 70%, a PPV of 70%, an NPV of 72% and an AUC of 0.80, 95% CI 0.76-0.84. Using the logistic regression analysis, Bayes’ theorem was applied to calculate the current relative risk for CRC (CURR). The CURR is the ratio of the probability of having CRC, based on this new blood test, to the CRC prevalence in the general population. A CURR > 1 was observed in 71% of the CRC patients and a CURR < 1 was observed in 71% of the controls. At a CURR = 1, the PPV was 70% and the NPV was 72% (27).
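The definition of the CURR can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes that the logistic regression yields a probability within the case-control study, that this probability is re-expressed for the general population with Bayes' theorem, and that the result is divided by the population prevalence. The function name, the study prevalence of 50% and the population prevalence of 0.7% are hypothetical values chosen for illustration.

```python
def current_relative_risk(model_probability, study_prevalence, population_prevalence):
    """Ratio of the test-based probability of CRC to the population prevalence.

    Assumption: the model probability was estimated in a case-control study with
    the given study prevalence and is transferred to the general population by
    applying the implied likelihood ratio to the population odds (Bayes' theorem).
    """
    model_odds = model_probability / (1 - model_probability)
    study_odds = study_prevalence / (1 - study_prevalence)
    likelihood_ratio = model_odds / study_odds

    population_odds = population_prevalence / (1 - population_prevalence)
    posterior_odds = likelihood_ratio * population_odds
    posterior_probability = posterior_odds / (1 + posterior_odds)
    return posterior_probability / population_prevalence


# Hypothetical inputs: balanced study (50% cases), population prevalence of 0.7%.
for p in (0.2, 0.5, 0.8):
    curr = current_relative_risk(p, study_prevalence=0.5, population_prevalence=0.007)
    print(f"model probability {p:.1f} -> CURR {curr:.2f}")
```

In this sketch, a test result that carries no information (a likelihood ratio of 1) gives a CURR of exactly 1, which corresponds to the threshold used to separate increased from decreased risk.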

The screening study conducted by W. Meng and colleagues analyzes the value of serum M2-PK as a screening biomarker for CRC. The aim of the study was to compare the diagnostic value of M2-PK to that of CEA, which is currently the most frequently used diagnostic biomarker for CRC. Samples were taken from 93 CRC patients (55 patients with stage 0-II, 38 patients with stage III), 41 patients with advanced adenomas, 137 patients with adenomas, 47 patients with non-adenomatous polyps, 7 patients with IBD and 158 healthy individuals. All participants underwent colonoscopy to confirm or exclude the diagnosis of CRC. Serum M2-PK was detected using ELISA. The average serum M2-PK value was 14.75 U/mL among stage III and 13.10 U/mL among stage I and II CRC patients. This is about four times higher than in the healthy participants, who had an average of 2.96 U/mL. The average serum M2-PK value was 8.58 U/mL among patients with advanced adenomas and 6.70 U/mL among patients with adenomas. ROC analysis presented a higher AUC for serum M2-PK (0.89, 95% CI 0.84-0.94) than for CEA (0.70, 95% CI 0.62-0.79) for stage I and II CRC patients. An AUC of 0.89, 95% CI 0.84-0.94 for M2-PK and of 0.73, 95% CI 0.63-0.83 for CEA was observed for stage III CRC patients. For advanced adenomas, the AUC of serum M2-PK was 0.81, 95% CI 0.74-0.86 and the AUC of CEA was 0.63, 95% CI 0.53-0.73. In order to compare the sensitivity and specificity of serum M2-PK and CEA, the cut-off value of M2-PK was set at 2.00 U/mL and the cut-off value of CEA was set at 5.00 ng/mL. The sensitivity of serum M2-PK for detecting CRC reached 100% but the specificity was just over 40%, meaning that the sensitivity of M2-PK was higher than that of CEA but its specificity was lower. The cut-off value that provided the best balance between sensitivity and specificity for detecting CRC was 4.00 U/mL (sensitivity 81.72%, specificity 74.05%).
The PPV rose from 49.73% to 64.96% and the NPV fell from 100% to 87.31% when the cut-off value was changed from 2.00 U/mL to 4.00 U/mL (28).
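The trade-off between the two cut-off values can be retraced with a short Python sketch. The 2x2 counts below are not reported in the source: they are back-calculated here from the stated percentages under the assumption that the 93 CRC patients were compared with the 158 healthy individuals, and they should be read as an illustration only.

```python
def summarize(tp, fp, fn, tn):
    """Sensitivity, specificity, PPV and NPV from the counts of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Assumed counts (93 CRC patients, 158 healthy individuals), back-calculated
# from the reported percentages; they are not taken from the original article.
counts_by_cutoff = {
    2.00: dict(tp=93, fp=94, fn=0, tn=64),
    4.00: dict(tp=76, fp=41, fn=17, tn=117),
}

results = {cutoff: summarize(**counts) for cutoff, counts in counts_by_cutoff.items()}
for cutoff, r in results.items():
    print(
        f"cut-off {cutoff:.2f} U/mL: "
        f"Se={r['sensitivity']:.2%} Sp={r['specificity']:.2%} "
        f"PPV={r['ppv']:.2%} NPV={r['npv']:.2%}"
    )
```

With these assumed counts, raising the cut-off from 2.00 to 4.00 U/mL turns 17 true positives into false negatives and removes 53 false positives, which is why the PPV rises while the NPV falls.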


Since the implementation of the iFOBT in screening programs, there has been some discussion about the most accurate cut-off value of the test. For that reason, Terhaar sive Droste et al. assessed in a screening study whether a higher cut-off level of the iFOBT could improve the positivity and detection rates for early-stage CRC and advanced adenomas. iFOBT samples were taken from 2,525 individuals who underwent colonoscopy. CRC was diagnosed in 79 patients (38 exhibited early-stage CRC, stage I and II, and 36 exhibited late-stage CRC, stage III or IV), and 236 participants were diagnosed with at least one advanced adenoma. The sensitivity, specificity and AUCs of six different cut-off values from 50 ng/mL to > 299 ng/mL were assessed. The currently used iFOBT displayed an AUC of 0.93 (95% CI 0.89-0.96), 0.89 (95% CI 0.82-0.95) and 0.69 (95% CI 0.65-0.73) for the detection of CRC, early-stage CRC and advanced adenoma, respectively. The sensitivity of the iFOBT at a threshold of > 200 ng/mL was 81% (95% CI 70.6-89%) for the detection of CRC and 78.9% (95% CI 62.7-90.5%) for the detection of early-stage CRC. Furthermore, the specificity was calculated at 92.8% (95% CI 91.6-93.9%). Lastly, the sensitivity for the detection of advanced adenoma was calculated at 30.5% (95% CI 24.7-36.8%) with a specificity of 95.8% (95% CI 94.8-96.7%) (29).

The study of Jin et al. explored the potential of the second-generation methylated Septin 9 (SEPT9) assay as a new blood-based diagnostic biomarker for CRC. Furthermore, it compared the results of this new blood test to those of the iFOBT. A total of 135 patients with CRC, 169 patients with adenomatous polyps, 81 patients with hyperplastic polyps and 91 healthy controls took part in this retrospective case-control study. Previous studies have illustrated that methylated SEPT9 is differentially expressed in tumor tissue. It is released into the bloodstream by CRC cells and can therefore be identified and measured in blood plasma. All participants underwent a colonoscopy for the diagnosis or the exclusion of CRC. The sensitivity and specificity of SEPT9 for the detection of CRC were calculated at 74.8% with a 95% CI of 67.0-81.6% and at 87.4% with a 95% CI of 83.5-90.6%, respectively. Furthermore, a sensitivity of 70.1% (95% CI 62.6-76.8%) and a specificity of 89.3% (95% CI 85.6-92.4%) for the detection of both CRC and high-grade dysplasia were found. The false positive rate of SEPT9 was calculated at 4.7%. In 90 CRC patients, the stage of cancer of each individual patient was determined through the use of colonoscopy. The SEPT9 test was positive in 66.7% of all patients with stage I CRC, in 82.6% of the stage II patients, in 84.1% of all stage III patients and in 100% of the stage IV patients. Moreover, the sensitivity of SEPT9 for the detection of advanced adenomas was 27.4%. In addition, the iFOBT was also evaluated in this study. The sensitivity and specificity of the iFOBT for CRC detection were calculated at 58.0% (95% CI 46.1-69.2%) and 82.4% (95% CI 74.4-88.7%), respectively. Ultimately, the SEPT9 test detected 25 cancers that the iFOBT missed, while the iFOBT identified 12 cancers that the SEPT9 test missed. A total of 28 CRC patients were detected by both the SEPT9 assay and the iFOBT. On the one hand, there was a significant difference in false test results between the iFOBT and the SEPT9 test (P=0.033). On the other hand, no significant difference in sensitivity for the detection of advanced adenomas could be found (P=0.774) (30).

The last study concerning screening biomarkers for CRC in this systematic review was performed by Wilhelmsen et al. In this prospective cohort study, a panel of eight blood-based protein biomarkers was evaluated. The proteins in this panel, measured in plasma using the Abbott ARCHITECT automated immunoassay, are AFP, CA19-9, CEA, hs-CRP, CYFRA21-1, Ferritin, Galectin-3 and TIMP-1. The participants in this study were high-risk patients scheduled to undergo a colonoscopy due to symptoms of bowel neoplasia such as unwanted weight loss with bowel distress. A total of 4698 patients participated in this study: 319 with colon cancer, 193 with rectal cancer and 177 with cancer outside the colon and rectum; 123 patients with high-risk adenomas and 51 patients with low-risk adenomas; and 276 patients with high-risk adenomas and 239 patients with low-risk adenomas specifically in the colon. In 1978 patients, no anomaly was found. Diverticula were found in 1108 of the patients and 97 patients were diagnosed with IBD. A univariable analysis of all eight biomarkers was performed and can be consulted in additional table 1. All eight blood-based biomarkers displayed a significant performance in detecting CRC and high-risk adenomas. After a multivariable analysis of the eight-protein panel, an AUC of 0.76 for the detection of both CRC and high-risk adenomas and an AUC of 0.84 for the detection of CRC were observed. At a sensitivity set at 60%, a specificity of 70% for the detection of CRC and high-risk adenomas was attained. After restricting the analysis to early-stage cancer detection (stage I and II), an AUC of 0.76 was found (P<0.0001) (31).

3.2 HEPATOCELLULAR CANCER A total of 11 articles concerning diagnostic biomarkers for hepatocellular cancer were included in this systematic review. The results of these studies can be consulted in additional table 2. The first study concerning the application of biomarkers for the detection of hepatocellular cancer (HCC) was performed by Chen et al. in China. In this retrospective cohort study, SELDI-TOF-MS was applied to identify a serum profile. Using this serum profile, an algorithm that discriminates (early-stage) HCC from liver cirrhosis (LC) was developed and then compared to AFP with a cut-off at 20 ng/mL (AFP20) as a diagnostic marker. A total of 240 patients (120 HCC patients and 120 patients with liver cirrhosis) took part in this study. The preliminary diagnosis was made by using a combination of blood biochemistry, AFP assay, chest x-ray, ultrasonography, computed tomography and hepatic angiography. A training set and a validation set, each with 60 HCC patients and 60 LC patients, were created (the characteristics of the participants can be found in additional table 2). The study identified 80 protein peaks in the training set. These peaks displayed a significant difference between HCC and LC patients (P<0.05). Five protein peaks were chosen as classifiers to design an algorithm named CART. In the training set, the CART algorithm presented an accuracy rate of 96.7%, a sensitivity of 98.33% and a specificity of 95% for the detection of HCC. In the subsequent blinded validation set, a sensitivity of 83%, a specificity of 92% and an accuracy rate of 87.5% were observed. Furthermore, 87% of stage I, 89% of stage II and 97% of stage III-IV HCC patients were correctly detected by the CART algorithm. Three months after surgical removal of the tumors, a decrease in these five protein peaks was observed. AFP20 displayed a sensitivity of 54% for stage I, 72% for stage II and 84% for stage III-IV HCC patients. The diagnostic odds ratios (DOR) were calculated at 92.72 for the CART algorithm and at 9.11 for AFP20. When combining the CART algorithm and AFP20, a sensitivity of 95% and a specificity of 98% were reached; note that, in this combination of tests, a result was considered positive when at least one of the two tests was positive. A surprisingly high DOR of 931 was obtained when combining the two tests (32).
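The diagnostic odds ratio can be derived from the sensitivity and specificity with the standard definition DOR = [sensitivity / (1 - sensitivity)] / [(1 - specificity) / specificity]. The minimal Python sketch below (the function name is an assumption) applies this definition to the sensitivity of 95% and specificity of 98% reported for the combined test and reproduces the stated DOR of 931.

```python
def diagnostic_odds_ratio(sensitivity, specificity):
    """Odds of a positive test in diseased persons divided by the odds in non-diseased persons."""
    odds_positive_if_diseased = sensitivity / (1 - sensitivity)
    odds_positive_if_healthy = (1 - specificity) / specificity
    return odds_positive_if_diseased / odds_positive_if_healthy


# Sensitivity and specificity reported for the combination of the two tests.
dor_combined = diagnostic_odds_ratio(sensitivity=0.95, specificity=0.98)
print(f"DOR of the combined test: {dor_combined:.0f}")
```

Because both terms grow steeply as the sensitivity and specificity approach 100%, small gains in either measure near the upper end translate into very large increases in the DOR.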

In the prospective cohort study of Jirun and colleagues in China and South Korea, the performance of three biomarkers (AFP, DCP and HCCR-1) and their possible combinations was assessed. The aim of the study was to compare HCCR-1, as a possible diagnostic biomarker for detecting HCC, to AFP and DCP. The study included 2040 individuals: 612 patients with HCC, 608 patients with liver cirrhosis, 402 patients with chronic hepatitis and 402 healthy volunteers. The sensitivity, specificity, PPV, NPV and accuracy were calculated. Sensitivities of 65.4%, 44.3% and 41.6% were observed for AFP (cut-off level: 20 ng/mL), DCP (cut-off level: 40 mAU/mL) and HCCR-1 (cut-off level: 10 ng/mL), respectively. The specificity and PPV were 80.2% and 58.6% for AFP, 86.7% and 58.7% for DCP and 87.4% and 58.6% for HCCR-1. When combining the three biomarkers, a sensitivity of 75.4%, a specificity of 79.3% and a PPV of 60.8% for the detection of HCC were obtained. Furthermore, ROC analysis revealed an AUC of 0.791, 0.678 and 0.643 for AFP, DCP and HCCR-1, respectively, and an AUC of 0.891 for the combination of these three biomarkers (33).


The identification and validation of serum metabolite biomarkers for the detection of HCC is the subject of the article by Luo and colleagues. A total of 1448 individuals participated in this nested case-control study, of whom 290 were healthy controls, 310 were patients with liver cirrhosis and 516 were patients with HCC. Ninety-two patients were described as having early-stage HCC, meaning that they had a small solitary HCC or two nodules with a diameter of less than 3 cm. A test cohort with 684 participants, a first validation cohort with 572 participants and a second validation cohort with 84 participants from multiple centers were formed. A total of 108 serum samples were used in the discovery set to define biomarker candidates. The aim of the project was to validate a panel of metabolite biomarkers. In the discovery set, 57 metabolites were identified as possible candidates through LC-MS analysis. A univariate analysis of these 57 metabolites showed a significantly reduced concentration of 17 of them in patients with HCC, in comparison to the normal control group and the cirrhosis group (P<0.05). Subsequently, the test cohort was used to evaluate the reliability of these 17 biomarkers. After employing a binary logistic regression analysis and creating an optimized algorithm, a panel of two biomarkers (Phe-Trp and GCA) was found to be the most successful at differentiating HCC patients from patients without cancer. The AUC of this panel in the test set was 0.930 for the detection of HCC patients and 0.892 for the detection of HCC patients and patients with cirrhosis. The sensitivity and specificity of the biomarker panel were higher than the sensitivity and specificity of AFP 20 (80.2% and 78.4% for the panel and 45.3% and 78.4% for AFP 20). On top of that, the diagnostic accuracy for small HCC was 74.5%. In validation cohort 1, the AUC for detecting HCC and small HCC was, respectively, 0.807 and 0.753 for the metabolite panel, whereas for AFP, the AUC was 0.650 for HCC and 0.676 for small HCC. In both validation cohorts 1 and 2, the sensitivity of the biomarker panel was higher than the sensitivity of AFP. The specificity, however, was considerably lower (52.8% for the biomarker panel vs 73.2% for AFP). Combining the biomarker panel with AFP resulted in a sensitivity of 77.9%, a specificity of 76.4% and an AUC of 0.826 for detecting HCC in the first validation set. The rest of the results can be found in additional table 2 (34).

Recently, peroxiredoxin-3 (PRDX3) has been identified as a new blood-based biomarker for the evaluation of HCC progression. In the retrospective cohort study of Shi et al., the potential of PRDX3 as a diagnostic biomarker for HCC was examined. The study population comprised 96 HCC patients, 98 patients with liver cirrhosis and 103 healthy controls. Alpha-fetoprotein is currently the most widely accepted biomarker for detecting HCC. Therefore, the performance of PRDX3 was analyzed and compared to that of AFP. The AUC of PRDX3 for detecting HCC was calculated at 0.865 (95% CI 0.809-0.953), which was higher than the AUC of AFP (0.67, 95% CI 0.509-0.753). The sensitivity was 85.9% and the specificity was 75.3% at a cut-off level of 153.26 ng/mL. These results imply that alterations in serum PRDX3 could be used to distinguish HCC patients from patients with cirrhosis and from healthy controls. Furthermore, PRDX3 expression was significantly higher in HCC patients than in patients with liver cirrhosis (P<0.001). ROC analysis demonstrated an AUC of 0.717 (95% CI 0.641-0.734) and, at a cut-off value of 95.13 ng/mL, the sensitivity and specificity were calculated at 73.2% and 69.0%, respectively. The AUC of PRDX3 for distinguishing patients with cirrhosis from HCC patients was merely 0.577. Additionally, elevated serum PRDX3 expression was significantly associated with tumor diameter, TNM stage, AFP serum levels and portal vein invasion (P<0.01). When exploring the potential of PRDX3 as a prognostic biomarker, a significant association between high levels of serum PRDX3 in patients with HCC and a lower survival rate was observed (P<0.001). For instance, the survival rate of patients with tumors smaller than 3 cm in diameter was 44.7% when these tumors displayed low PRDX3 expression and only 22.3% when these tumors exhibited high PRDX3 expression. A survival rate of 50.1% was observed in early-stage HCC patients with low PRDX3 levels, while early-stage patients with high PRDX3 levels had a survival rate of 25.3% (35).

The fifth study concerning possible diagnostic biomarkers of HCC was conducted by Lin in China. The article describes a multicenter, retrospective biomarker identification study in combination with a nested case-control study to validate the results of the former. The objective of the study was to identify a combination of serum miRNAs with diagnostic potential for HCC. A total of 1416 serum samples taken from 910 subjects were included in this study. The group of subjects comprised 343 HCC patients, 118 patients with cirrhosis, 254 patients with chronic hepatitis B, 42 inactive HBsAg carriers (as the population at risk) and 159 healthy controls. These participants were subsequently divided among a discovery cohort, a training cohort and three separate validation cohorts. In the discovery stage, 31 serum miRNAs displayed high levels in HCC patients and were therefore considered as candidates for the biomarker panel. In the training cohort of 108 patients, 13 of these miRNAs showed elevated levels in HCC patients. A combination of the seven biomarkers with the highest levels was evaluated. ROC analysis of this 7-marker panel showed an AUC > 0.800. The miRNAs included in this panel were miRNA-29a, miRNA-29c, miRNA-133a, miRNA-143, miRNA-145, miRNA-192 and miRNA-505. This panel of miRNAs displayed an AUC of 0.826 for differentiating HCC patients from controls (healthy people, patients with chronic hepatitis B and patients with liver cirrhosis) when applying a logistic regression model in the training cohort. A sensitivity of 80.6% and a specificity of 84.6% for the detection of HCC were attained. When comparing the AUC of this panel to the AUC of AFP 20 (0.826), no significant difference was observed (P=0.72). However, the AUC of this miRNA combination was significantly higher than the AUC of AFP 400 (P<0.0001). The results of the validation cohorts can be found in additional table 2. In the nested case-control study, the panel displayed a much higher sensitivity and a larger AUC for distinguishing patients with HCC from controls at every timepoint when compared with AFP 20 and AFP 400. Twelve months before diagnosis, the sensitivity was 29.6%; between 9 and 6 months before diagnosis it remained at 48.1%; and 3 months prior to the diagnosis, the sensitivity was 55.6%. However, the specificity of the panel was lower than the specificity of AFP 20 and AFP 400 over the course of these 12 months. Furthermore, the miRNA panel displayed a higher sensitivity and a higher AUC for detecting patients with small and early-stage HCC in the training and validation cohorts but, again, the specificity of the panel was lower than that of AFP 20 and AFP 400. The AUC of the panel was 0.824 for discriminating early-stage HCC from controls, while AFP 20 displayed an AUC of 0.754 for the detection of early-stage HCC (36).

The article of Kumada describes a case-control study in which the performance of the Lens culinaris agglutinin-reactive fraction of alpha-fetoprotein (AFP-L3) as a biomarker is examined. A current problem with using AFP as a diagnostic biomarker for HCC is its low specificity, which is due to high levels of AFP being found not only in HCC patients, but also in patients with benign diseases. AFP-L3 has been reported to be more specific. Recently, however, accurate measurements of AFP-L3% have been difficult. A new high-sensitivity AFP-L3 (hs-AFP-L3) assay was designed, and it demonstrated remarkable improvements in clinical sensitivity and in predicting the prognosis of HCC patients with AFP < 20 ng/mL. The main objective of Kumada’s study was to determine the clinical utility of hs-AFP-L3 in the early prediction of HCC development. A total of 104 patients with HCC and 104 matched controls (from a high-risk population with HepB/C) took part in this study. The levels of hs-AFP-L3 12 months before diagnosis were significantly increased when compared to the levels of hs-AFP-L3 24 months before diagnosis (P=0.0001). Furthermore, the levels of hs-AFP-L3 at the time of diagnosis were significantly increased in comparison to the levels of hs-AFP-L3 12 months before diagnosis (P=0.0003). The levels of AFP and DCP, on the other hand, were only significantly raised one year before diagnosis (P=0.0315 and P<0.0001). The sensitivity of hs-AFP-L3 at the time of diagnosis was 39.4% at a cut-off level of 7%. At 36 months before diagnosis, the specificity of hs-AFP-L3 was 77% at a cut-off value of 7%. By combining hs-AFP-L3 with AFP and DCP in an assay, a sensitivity of 60.7% and a specificity of 76% for the detection of HCC were reached. In addition, the sensitivity and specificity of hs-AFP-L3 one year before diagnosis were 34.3% and 74.7%, respectively (37).


As in the previous article, the principal objective of the nested case-control study conducted by Lok and colleagues was to evaluate the accuracy of AFP and DCP for the early diagnosis of HCC (that is to say, 1 year before diagnosis of HCC). The study population consisted of 39 patients with HCC, 77 matched controls with HepC and 915 non-HCC patients. In the year preceding and the year following the diagnosis, serum samples were taken from these patients every three months and tested for AFP and DCP. Patients were prospectively followed up during the year after diagnosis. A total of twenty-four patients displayed early-stage HCC at the time of diagnosis. Mean AFP levels in HCC cases increased from 37.0 ng/mL at 1 year before diagnosis to 157.6 ng/mL at the time of diagnosis, while AFP levels of the control group remained unchanged in the same period. In one year’s time, the mean AFP levels decreased from 17.1 ng/mL at the time of diagnosis to 15.7 ng/mL in the treated patients, whereas the mean AFP levels in untreated patients rose to 19 ng/mL. The sensitivities of DCP and AFP for the detection of HCC at month 0 were 70% for the former and 30% for the latter, at a predefined specificity of 90% with cut-off values of 44.5 mAU/mL and 87.2 ng/mL, respectively. At a lower cut-off value of 40 mAU/mL, DCP displayed a sensitivity of 40% at month -12, which increased to 74% at month 0, while the sensitivity of AFP increased from 47% to 61% at a cut-off value of 20 ng/mL. When combining both biomarkers, a sensitivity of 91% at month 0 was observed. The AUC of DCP was higher than the AUC of AFP at all times except at month -3, when the AUCs of both biomarkers did not differ significantly. The combination of AFP and DCP had an AUC of 0.92 for the detection of HCC (38).
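Reporting sensitivity at a predefined specificity means that the cut-off is derived from the marker distribution in the controls and then applied to the cases. The sketch below illustrates this procedure in a minimal form. All marker values are hypothetical and do not come from the study, the function names are illustrative, and a result is assumed to be positive when the value lies above the cut-off.

```python
import math


def cutoff_for_specificity(control_values: list[float], target_specificity: float) -> float:
    """Smallest observed control value at which at least the target share of controls is negative."""
    ordered = sorted(control_values)
    # Number of controls that must have a value at or below the cut-off.
    needed = max(1, math.ceil(round(target_specificity * len(ordered), 9)))
    return ordered[needed - 1]


def sensitivity_at_cutoff(case_values: list[float], cutoff: float) -> float:
    """Share of cases with a value strictly above the cut-off."""
    return sum(value > cutoff for value in case_values) / len(case_values)


# Hypothetical marker values, for illustration only.
controls = [5, 8, 10, 12, 15, 18, 22, 30, 41, 95]
cases = [12, 35, 44, 50, 60, 80, 120, 150, 300, 900]

cutoff = cutoff_for_specificity(controls, 0.90)
sensitivity = sensitivity_at_cutoff(cases, cutoff)

print(f"Cut-off at 90% specificity: {cutoff}")
print(f"Sensitivity at this cut-off: {sensitivity:.0%}")
```

Because the cut-off is fixed by the controls, two markers evaluated this way can be compared on sensitivity alone, which is how the values of 70% for DCP and 30% for AFP should be read.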

The next study was performed by Miura et al. and reports the evaluation of serum human telomerase reverse transcriptase (hTERT) mRNA as a diagnostic biomarker for the detection of HCC. The results were compared with other well-studied biomarkers, i.e. AFP, DCP and AFP-L3. The population of this multi-center case-control study included 638 subjects: 303 patients with HCC, 89 with chronic hepatitis (CH), 45 with liver cirrhosis (LC) and 201 healthy individuals. The telomerase catalytic subunit (hTERT) is known to be involved in telomere homeostasis, genetic stability, cell survival and possibly in cell differentiation. The results of the study indicated that increased hTERT mRNA expression was correlated with disease progression and that hTERT mRNA levels were higher in HCC patients than in LC patients, CH patients and healthy individuals (P<0.001, P<0.02 and P<0.001, respectively). The optimal cut-off value for hTERT mRNA expression was predicted at 9.332 copies/0.2 mL. ROC curve analyses of hTERT mRNA displayed a sensitivity of 90.2% and a specificity of 85.4% for HCC detection. Furthermore, multivariate analysis showed that elevated levels of hTERT mRNA, AFP and AFP-L3 were associated with tumor size and differentiation degree (P<0.001). Elevated DCP levels were only associated with the number of tumors (P=0.029). The sensitivity and specificity of AFP, AFP-L3 and DCP were, respectively, 76.6% and 66.2%, 60.5% and 88.7%, and 83.4% and 80.3%. These results confirmed that hTERT mRNA was superior to these other markers, especially in terms of sensitivity. The PPV and NPV of hTERT mRNA were 83.0% and 85.9%. Again, these values were superior to the PPV and NPV of AFP, AFP-L3 and DCP. Moreover, the combination of hTERT mRNA with AFP improved the sensitivity and specificity, with values rising to 96% and 87.2%. Levels of hTERT mRNA were measured before and after embolization and/or chemotherapy. A significant decrease in hTERT mRNA levels was observed (P=0.018). A notable result was that, in one case, elevated hTERT mRNA levels even indicated the recurrence of HCC (39).

Wang et al. conducted a cross-sectional study using a total of 625 serum samples from 146 asymptomatic HBV carriers, 95 patients with HBV-induced chronic hepatitis, 97 patients with HBV-induced liver cirrhosis and 78 patients with HBV-induced HCC. In addition, 18 patients with cholangiocarcinoma, 13 patients with chronic HCV, 15 patients with drug-induced hepatitis, 10 patients with lung cancer, 30 patients with CRC and 108 healthy controls were included in this study. The aim of the study was to identify antibodies against upregulated genes (anti-URGs) in a high-risk population (patients with HBV, chronic liver disease or liver cirrhosis) or in patients with early-stage HCC. To identify high-risk patients with anti-URGs, a panel of five different HBx-induced URGs was evaluated. Among patients with cirrhosis or HCC, a significant association between the number of antibody markers in sera and the patients’ disease state was found (P<0.001). Using three anti-URGs to distinguish patients with HCC and patients with HBV-induced cirrhosis from others, the sensitivity was 58.3% and the specificity was 80%, with an AUC of 0.721. The results of these anti-URGs were compared to the results of known biomarkers such as AFP, AFP-L3, GPC3 and GP73. The sensitivities of AFP, AFP-L3, GPC3 and GP73 for the detection of patients with cirrhosis/HCC were 38.8%/32.8%, 42.3%/46.3%, 45.2%/36.1% and 41.1%/33.2%, respectively. When combining each of these biomarkers with the anti-URGs, the sensitivity for the detection of cirrhosis/HCC increased (40).

The next study by Wang et al. explores the use of a self-developed algorithm for the detection of early-stage HCC. This algorithm, which incorporated log-transformed AFP values, age, gender, alkaline phosphatase (ALK) and alanine aminotransferase (ALT) levels, was termed the “Doylestown algorithm”. The algorithm was developed and validated using a discovery cohort and three validation cohorts from various centers. A total of 1019 patients with HCC and 1930 patients with cirrhosis took part in this study. The AUC of the Doylestown algorithm in the discovery set was 0.9388 (95% CI 0.9103-0.9674) for the detection of HCC. More importantly, the AUC of the algorithm for the detection of early-stage HCC was 0.9491 (95% CI 0.9138-0.9843). In the three validation sets, ROC analysis showed AUCs of 0.8409, 0.8920 and 0.876 for the detection of HCC. For the detection of early-stage cancer, AUCs of 0.8104 and 0.7709 were observed in the first and third validation set for this algorithm. The sensitivity for the detection of early-stage HCC in the first validation cohort was 43% with the Doylestown algorithm. The rest of the results can be seen in additional table 2 (41).

The identification of a new highly sensitive and highly specific diagnostic panel containing a combination of various miRNAs for the early detection of HCC in HCV-positive patients is the subject of a retrospective case-control study by Zekri and colleagues. The population in this study consisted of 192 patients with HCC, 96 patients with LC and 96 patients with chronic hepatitis C, in addition to 96 healthy controls. A statistically significant difference among the different groups in terms of age, gender, albumin levels, diabetes mellitus, total bilirubin, ALT, AST, ascites, Child-Pugh score and AFP (P<0.001) was observed. Thirteen miRNAs were studied, three of which were significantly upregulated (miR-602, miR-125a-5p and miR-885-5p) and two of which were significantly downregulated (miR-29b and miR-375). Subsequently, ROC analysis of these miRNAs was performed. miR-29b demonstrated a higher AUC than miR-122 (0.766 vs. 0.617) and miR-885-5p (AUC=0.63) for detecting HCC. A panel combining these three biomarkers displayed an AUC of 0.898 for differentiating HCC patients from controls. Adding AFP to this diagnostic panel resulted in the highest diagnostic accuracy (AUC=1). A panel of miR-221, miR-885-5p, miR-181b and miR-122 showed an AUC of 0.845 for distinguishing patients with HCC from patients with LC. Adding AFP increased the diagnostic accuracy to an AUC of 0.982. Another combination of AFP and several other miRNA biomarkers (miR-22 + miR-199a-3p) resulted in an AUC of 0.988 for distinguishing HCC patients from patients with chronic hepatitis. Moreover, a panel of miR-375, miR-221 and miR-22 displayed an AUC of 0.882 (AUC=1 when combined with AFP) for distinguishing LC patients from healthy controls. Lastly, a panel consisting of miR-125a-5p, miR-29, miR-375, miR-602 and miR-885-5p presented an AUC of 0.964 for distinguishing patients with chronic hepatitis from healthy controls (42).
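The AUC values reported throughout this section can be read as the probability that a randomly chosen case has a higher marker score than a randomly chosen control, with ties counted as one half. The sketch below implements this pairwise interpretation directly; the scores are hypothetical, the function name is illustrative, and the studies themselves may have used other software or estimation methods.

```python
def auc_pairwise(case_scores: list[float], control_scores: list[float]) -> float:
    """AUC as the share of case-control pairs in which the case scores higher (ties count 0.5)."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical panel scores, for illustration only.
perfect_auc = auc_pairwise([0.7, 0.8, 0.9], [0.1, 0.2, 0.3])  # no overlap between groups
overlap_auc = auc_pairwise([3, 5, 7], [1, 4, 6])              # partial overlap

print(perfect_auc, round(overlap_auc, 3))
```

An AUC of 1, as reported for the panels combined with AFP, therefore corresponds to a data set in which every case scored higher than every control.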


3.3 LUNG CANCER

The first of eleven articles included in this systematic review that discuss possible diagnostic biomarkers for the detection of lung cancer describes a prospective cohort study performed by Lin et al. The results of all studies concerning lung cancer can be found in additional table 3. The performance of two biomarkers (survivin and livin mRNA in bronchial aspirates), both inhibitors of apoptosis, was analyzed. The study included 70 patients with a lung mass or nodules, visualized by CT, and 26 patients with benign lung disease as controls. After measuring the levels of survivin and livin mRNA in bronchial aspirates of all participants, it was established that these levels were significantly higher in patients with lung cancer than in patients with benign lung disease (P<0.001 and P=0.001). Furthermore, levels of survivin and livin mRNA were significantly higher in bronchial washing (BW) fluids from patients with centrally located tumors than in bronchoalveolar lavage (BAL) fluids from patients with peripherally located tumors (P=0.01 and P=0.02). However, levels of survivin and livin mRNA in patients with benign lung disease were similar to the levels observed in samples taken from the healthy bronchi of patients with lung cancer. The ROC analysis displayed an AUC of 0.826 (95% CI 0.718-0.924) for survivin and an AUC of 0.676 (95% CI 0.560-0.792) for livin at cut-offs of 0.35 and 0.3, respectively, for the detection of lung cancer. The respective sensitivity, specificity, PPV and NPV of survivin were 83%, 96%, 98% and 68%. The sensitivity, specificity, PPV and NPV of livin were, respectively, 63%, 92%, 96% and 48%. All patients with positive cytological and/or histological bronchoscopy findings tested positive for survivin. Moreover, one patient with pneumonia displayed a survivin level of over 3.35. Two patients displayed false positive results for livin, one suffering from tuberculosis and the other from pneumonia. Lastly, for the detection of stage I lung cancer, survivin reached a sensitivity of 70% (10 cases of stage I lung cancer) (43).

The Belgian retrospective case-control study by Louis et al. explored the analysis of the metabolic profile of blood plasma for the detection of lung cancer and compared it to the current screening tool, i.e. low-dose CT. The study population consisted of 357 patients and 347 controls with benign lung diseases. Both groups were subsequently divided into a training and a validation cohort. Multivariate OPLS-DA statistics were used to create a classification model, based on the interpretation of metabolic phenotypes, in order to differentiate patients with lung cancer from controls. The application of this model to the training set resulted in a sensitivity of 78% and a specificity of 92% for the detection of lung cancer. The sensitivity and specificity of OPLS-DA in the validation cohort were 71% and 81%. Furthermore, the data set was analyzed independently and resulted in a sensitivity of 75% with a specificity of 82%. The metabolites of which elevated levels were detected in lung cancer patients were glucose, N-acetylated glycoproteins, beta-hydroxybutyrate, leucine, tyrosine, threonine, glutamine, valine and aspartate, whereas the metabolites with reduced concentrations were alanine, lactate, sphingomyelin and phosphatidylcholine, citrate and other phospholipids. The model classified 81% of the adenocarcinomas correctly but was only able to detect 38% of the squamous cell carcinomas. Lastly, 79% of the patients with stage I lung cancer and 52% of the patients with stage IV were correctly identified. The rest of the results can be consulted in additional table 3 (44).

The article of Chen and colleagues describes a retrospective case-control study in which ten miRNAs were identified and analyzed as possible diagnostic biomarkers for the detection of non-small cell lung cancer (NSCLC). Both the training and the validation set consisted of 200 NSCLC patients and 110 healthy controls. In the discovery phase, 10 out of 63 miRNAs were differentially expressed (all elevated) in NSCLC patient blood samples, compared with samples from healthy controls. These miRNAs were miR-20a, miR-24, miR-25, miR-145, miR-152, miR-199A-5p, miR-221, miR-222, miR-223 and miR-320. Furthermore, ROC analysis was performed using a risk score formula that incorporated the risk scores of all 10 miRNAs, with a calculated optimal cut-off value of 5.006. At this cut-off, the sensitivity was 93% and the specificity was 90% for the detection of NSCLC. The same risk score formula was applied to the validation set and ROC analysis revealed a sensitivity of 92.5% and a specificity of 90%. The AUCs were 0.966 and 0.972 for the training and validation set, respectively. The best-performing individual miRNA had an AUC > 0.80 and, by individually adding each of the other nine miRNAs, the sensitivity and specificity for the differentiation of NSCLC patients from healthy controls gradually increased. In the training set, 2 out of 110 controls and 53 out of 200 NSCLC patients were incorrectly classified by this risk formula. In the validation set, only 22 NSCLC patients and 3 controls were incorrectly classified. It is also notable that the high-risk score rate increased from stage I to stage IV NSCLC (45).

Doseeva and colleagues conducted a case-control study which explores the potential of “Paula’s test”, a new immunoassay blood test that assesses the risk of having lung cancer in a high-risk population. Paula’s test consisted of a combination of tumor antigens and auto-antibodies (CEA, NY-ESO-1, CA-125 and CYFRA21-1). Two independent studies were executed: the first was a training set consisting of 115 early-stage lung cancer patients (stage I and II; 41 adenocarcinoma, 7 bronchioloalveolar carcinoma and 45 squamous carcinoma) and 115 healthy controls; the second was a validation set consisting of 75 lung cancer patients (32 adenocarcinoma, 21 bronchioloalveolar carcinoma and 15 squamous cell carcinoma, with 50 early-stage cases and 25 late-stage cases) and 75 controls. ROC analysis was performed to evaluate the potential of each biomarker for differentiating lung cancer patients from controls. In the training set, individual biomarkers exhibited AUC values from 0.60 to 0.79; the highest AUC was calculated at 0.79 for CEA. The 4-biomarker panel displayed an AUC of 0.83. At a predetermined specificity of 80%, CEA, CA-125, CYFRA21-1 and NY-ESO-1 displayed sensitivities of 63%, 42%, 45% and 47%, respectively. However, when combining these biomarkers, a sensitivity of 73% was reached. When this 4-biomarker panel was applied to the validation set, an AUC of 0.85 was observed with MoM (multiple of median) analysis (P<0.0001). Moreover, a sensitivity of 77% at a fixed specificity of 80% was found. When combining the two cohorts, the sensitivity for detecting early-stage (I and II) lung cancer was 71.2% with an AUC of 0.82 (P<0.0001), which was lower than the sensitivity for detecting late-stage lung cancer (stage III and IV), namely 76.6% (46).
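In a multiple of median (MoM) analysis, each measurement is divided by the median of a reference group, so that markers measured on different scales become comparable before they are combined in a panel. The sketch below shows this scaling in its simplest form. The concentrations are hypothetical, the function name is illustrative, and the use of the control group as the reference is an assumption, as the reference group used by the authors is not described here.

```python
from statistics import median


def to_multiples_of_median(values: list[float], reference_values: list[float]) -> list[float]:
    """Express each value as a multiple of the median of the reference group."""
    reference_median = median(reference_values)
    if reference_median <= 0:
        raise ValueError("The reference median must be positive.")
    return [value / reference_median for value in values]


# Hypothetical CEA concentrations in ng/mL, for illustration only.
control_cea = [1.0, 1.5, 2.0, 2.5, 3.0]
patient_cea = [2.0, 6.0, 10.0]

# Assumption: the control group serves as the reference population.
patient_mom = to_multiples_of_median(patient_cea, control_cea)

print(patient_mom)
```

A MoM value of 1 thus corresponds to the typical reference level, and a value of 3 to a concentration three times that level, regardless of the unit in which the marker was originally measured.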

Testing AKAP4 as a biomarker for detecting NSCLC is the subject of the retrospective case-control study conducted by Gumireddy. This study was divided into two phases, i.e. a discovery and a validation phase. The discovery set consisted of 12 NSCLC patients and 7 control patients with benign lung diseases and a history of smoking. During the discovery phase, the expression of 116 CTAs was analyzed. GAGE4 and AKAP4 were identified as potential biomarkers for the detection of patients with NSCLC. The utility of AKAP4 was further tested in a first validation cohort of 141 NSCLC patients and 35 control patients with benign lung diseases. Twenty-four of the 35 controls displayed pulmonary nodules, which were confirmed as being benign by histological analysis. ROC analysis showed that AKAP4 had a higher diagnostic accuracy than GAGE4 (AUC of 0.9735 for AKAP4 vs. 0.7149 for GAGE4). Combining AKAP4 and GAGE4 did not improve these results. After discarding GAGE4 in favor of AKAP4 as a potential diagnostic biomarker for NSCLC, further validation of AKAP4 was performed in a second validation cohort of 123 NSCLC patients and 100 controls. The AUC of AKAP4 for detecting NSCLC was 0.9805 in this cohort. Furthermore, it was shown that altered AKAP4 expression is significantly associated with cancer stage. This association was confirmed by ROC analysis, with an AUC of 0.9795 for the detection of stage I NSCLC patients. Moreover, it was suggested that AKAP4 expression might be a possible biomarker for discriminating patients with benign nodules from patients with malignant nodules (47).

In their 2017 case-control study, Hulbert and colleagues evaluated the diagnostic potential of hypermethylated genes in plasma and sputum samples of 150 patients with node-negative NSCLC (stage I and IIa cancer) and 60 controls (12 of whom suffered from COPD). Six hypermethylated genes, which had already been tested in previous studies, were chosen as biomarkers. These were SOX17, TAC1, HOXA7, CDO1, ZFP2 and HOXA9. Firstly, DNA methylation was detected in tumor tissue, plasma and sputum. In sputum samples, the sensitivity and specificity of the six biomarkers for the detection of NSCLC ranged from 63% to 93% and from 42% to 92%, respectively. In plasma samples, the sensitivity ranged from 33% to 91% and the specificity from 52% to 94%. ROC curves were calculated for each hypermethylated gene; the AUCs ranged from 0.56 to 0.89 in sputum samples and from 0.60 to 0.78 in plasma samples. When combining the three best-performing biomarkers (i.e. TAC1, HOX17 and SOX17), an AUC of 0.89 with a sensitivity of 98% and a specificity of 71% was observed in sputum samples. In plasma samples, this gene panel demonstrated a sensitivity and specificity of 93% and 62%, respectively, with an AUC of 0.77. Subsequently, a random forest model using these three biomarkers was applied to an independent test set of patients and it predicted lung cancer in 85% of the subjects. The AUC for detecting NSCLC in this last test cohort was calculated at 0.89 (48).

In a large, blinded multi-center case-control study by Ostroff et al., 813 proteins were tested as new potential biomarkers for the detection of NSCLC. Two different cohorts were created to select and validate the best-performing biomarkers. The selection or training cohort included 213 cases of NSCLC, of which 99 were stage I patients, 32 were stage II patients and 82 were stage III patients. It also included 772 randomly selected high-risk controls with a history of long-term tobacco use. Of these 772 controls in the training cohort, 420 were diagnosed with benign pulmonary nodules. The verification set consisted of 78 cases of NSCLC (38 stage I patients, 11 stage II patients and 27 stage III patients) and 263 controls with a history of smoking, including 145 patients with benign nodules. During the selection phase, a classifier with the 12 best-performing protein biomarkers was composed. The 12 biomarkers were Cadherin-1, CD30 ligand, Endostatin, HSP90-alpha, LRIG3, MIP-4, Pleiotrophin, PRKCI, RGM-C, SCF sR, sL-Selectin and YES. This classifier attained a sensitivity of 91%, a specificity of 84% and an AUC of 0.92 for the detection of NSCLC. The sensitivity for detecting stage I NSCLC was 90%. When testing this algorithm on the blinded independent verification set, a sensitivity of 89% and a specificity of 83% were reached, with an AUC of 0.90 (49).

One particular study in this systematic review focuses on a specific type of lung cancer, i.e. pleural mesothelioma. This multi-center case-control study, in which fibulin-3 in plasma and in pleural effusions was evaluated as a potential new biomarker for the detection of mesothelioma, was performed by Pass and colleagues. The serum samples of 92 patients with mesothelioma, 136 persons without cancer but with a history of asbestos exposure and 43 healthy controls were analyzed. Effusion samples were obtained from 74 patients with mesothelioma, 39 patients with benign effusions and 54 patients with malignant effusions that were not due to mesothelioma. Subsequently, a blinded validation study was executed. Mean plasma fibulin-3 levels were significantly elevated in patients with mesothelioma in comparison to the levels in asbestos-exposed patients in all cohorts. However, there was no significant difference in plasma fibulin-3 levels between patients with stage I or II mesothelioma and patients with stage III or IV mesothelioma. In the Detroit cohort, ROC analysis presented an AUC of 1.00 for the detection of mesothelioma. At a cut-off value of 32.9 ng/mL, fibulin-3 displayed the highest sensitivity and specificity: both were calculated at 100%. These results were similar to the results in the New York cohort. In this latter cohort, an AUC of 0.99, a sensitivity of 94.6% and a specificity of 95.7% for distinguishing mesothelioma patients from asbestos-exposed patients were observed at a cut-off of 52.8 ng/mL. After combining the two cohorts, an AUC of 0.99 was achieved. The AUC for the detection of stage I or II mesothelioma patients was 0.99 at a cut-off value of 46 ng/mL. The validation study did not immediately confirm these results, with an AUC calculated at 0.87, a specificity of 100% and a sensitivity of 33%. The effusion fibulin-3 levels were significantly more elevated in patients with mesothelioma than in patients with benign or malignant effusions that were not caused by mesothelioma. The AUCs for the detection of mesothelioma in the Detroit and New York cohorts were, respectively, 0.95 and 0.91 at cut-offs of 378 ng/mL and 346 ng/mL (50).

The aim of the prospective cohort study of Tammemagi and colleagues was to show that plasma pro-surfactant protein B (pro-SFTPB) is a potential predictive biomarker for lung cancer. This study consisted of two phases. In the first phase, which is referred to as the Pan-Can study, 2537 individuals with no prior history of lung cancer but with a 2% 3-year risk of lung cancer (predicted by lung cancer risk prediction models) were enrolled. All participants were followed up every six months for at least two years. The second phase was a validation study, in which sera collected from 61 NSCLC patients and 122 controls of the double-blind, case-control CARET study were analyzed. In the Pan-Can study, 113 participants developed lung cancer in the follow-up period of two years. Pro-SFTPB was measured in 2485 individuals. When applying a logistic model of log-pro-SFTPB to this first set of samples, an OR of 2.331 (95% CI 1.837-2.958) and an AUC of 0.690 (95% CI 0.643-0.735) for predicting and detecting lung cancer were achieved. Furthermore, a sensitivity of 80.4%, a specificity of 40.1%, a PPV of 6.4% and an NPV of 97.6% were observed. When this model was applied to a set of patients with stage I or stage II lung cancer (early-stage lung cancer), log-pro-SFTPB displayed an OR of 2.195 (95% CI 1.679-2.870) and an AUC of 0.735. In the CARET study, the levels of pro-SFTPB were significantly higher among NSCLC patients than among controls (P<0.01). ROC analysis of pro-SFTPB revealed an AUC of 0.683 (95% CI 0.604-0.761). Moreover, when analyzing subgroups, it was observed that pro-SFTPB levels had increased significantly among patients with adenocarcinoma, but not among patients with squamous cell carcinoma (51).

The next study by Sin et al. explored the use of a chromosomal aneusomy-FISH assay in sputum samples, as either a predictor or a diagnostic biomarker for lung cancer. The premise of this study was that current and former smokers are at higher risk of developing lung cancer and could therefore present a target population for prevention and early detection programs. The sputum samples were prospectively collected and included 100 incident lung cancer cases and 96 controls (matched on gender, age and date of collection) nested in an ongoing high-risk cohort. Each sputum sample was acquired within at least 18 months before diagnosis. Four individual DNA markers were incorporated in the FISH assay: EGFR, 5p15, MYC and CEP6. Each of these FISH markers displayed a significant association between the detection of chromosomal aneusomy and cancer status in cases or controls. The sensitivity of each individual marker, however, was rather low (19%-59%). If at least two of the four markers were positive, a sensitivity, specificity and odds ratio of, respectively, 56%, 85% and 8.3 (for the age-adjusted model) were calculated. Furthermore, the time of sample collection played a large role with regard to the sensitivity of this assay. Sensitivity decreased from 76% (for samples collected within 18 months prior to cancer diagnosis) to 31% (for samples collected more than 18 months before diagnosis). Among subgroups, a sensitivity of 93% was reached for the detection of localized tumors (stage I) and a sensitivity of 95% was attained for the detection of squamous cell cancer. ROC analysis of the FISH assay displayed an AUC of 0.84 for the detection of lung cancer. When the FISH assay was combined with cytological analysis, the AUC (0.95) did not increase significantly (52).
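The "at least two of four markers positive" rule described above can be illustrated with a minimal Python sketch. The function name, the dictionary layout and the example sample are hypothetical and serve only as an illustration; they are not taken from the study by Sin et al.

```python
# Illustrative sketch only: the four marker names come from the text,
# everything else (function name, data layout) is an assumption.
FISH_MARKERS = ("EGFR", "5p15", "MYC", "CEP6")


def assay_positive(marker_results, min_positive=2):
    """Return True when at least `min_positive` FISH markers are abnormal.

    `marker_results` maps each marker name to True (aneusomy detected)
    or False (no aneusomy detected).
    """
    positives = sum(1 for marker in FISH_MARKERS if marker_results.get(marker, False))
    return positives >= min_positive


# Hypothetical sputum sample with two abnormal markers.
sample = {"EGFR": True, "5p15": False, "MYC": True, "CEP6": False}
print(assay_positive(sample))
```

Raising `min_positive` would make the rule stricter, which in general trades sensitivity for specificity.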

The identification and verification of a biomarker panel with four proteins for the detection of NSCLC was the subject of a study conducted by Yao et al. Firstly, 35 in-frame expressed proteins found in tumor tissue were isolated. Then, based on the profile of gene expression, the four best-performing proteins (or immunogenic phage-displayed proteins) were selected for further evaluation in a panel. Subsequently, an ELISA assay was developed to measure these proteins in serum samples of NSCLC patients. These four protein biomarkers (SMOX, NOLC1, MALAT1 and HMMR) were identified and the combined predictive value of the antibodies against these proteins was assessed in a cohort of 40 NSCLC patients and 36 healthy matched controls. Logistic processing was performed for each biomarker and NOLC1 was significantly elevated in patients with NSCLC (P<0.006). Furthermore, NOLC1 displayed an AUC of 0.684 with a sensitivity of 45% and a specificity of 96.2% for the detection of NSCLC patients. Combining the four biomarkers resulted in an increase of the AUC to 0.767 with a sensitivity of 47.5% and a specificity of 97.3% (P=0.001). To confirm these results, leave-one-out cross-validation was performed. This cross-validation demonstrated a sensitivity of 66.7% and a specificity of 60% for the detection of NSCLC (53).

3.4 OVARIAN CANCER

A total of ten articles concerning biomarkers for the detection of ovarian cancer were included in this systematic review. The results can be consulted in additional table 4.

The first article describing a potential biomarker panel for the detection of ovarian cancer was written by Longoria and colleagues. The prospective cohort study consisted of 515 premenopausal women and 501 postmenopausal women. Of these 1016 participants, 255 were diagnosed with primary ovarian malignancy (86 patients with stage I or II primary ovarian malignancy). Epithelial ovarian cancer was present in 79 of these 86 patients. The goal of the study was to evaluate the diagnostic performance of a multivariate index assay (MIA) or OVA1, which was approved by the FDA in 2009, in combination with physician assessment. The OVA1 algorithm incorporates several biomarkers (CA125-II, transferrin, transthyretin, apolipoprotein A1 and beta-2-microglobulin) and generates an ovarian malignancy risk score. Women with a score > 5.00 (premenopausal) or > 4.4 (postmenopausal) were classified as high-risk patients. The diagnostic results of OVA1 combined with clinical assessment of patients were compared to other risk-assessment tools for ovarian cancer, such as the ACOG guidelines. Furthermore, the diagnostic performance of CA-125 was tested and compared with that of OVA1, using cut-off values of > 200 U/mL (premenopausal) or > 35 U/mL (postmenopausal) for CA-125. The results showed that OVA1 in combination with clinical assessment had a sensitivity of 95.3% for the detection of all stages of ovarian malignancy, whereas, individually, OVA1 and clinical assessment displayed a sensitivity of 92.2% and 74.5%, respectively. CA-125 had a sensitivity of merely 70.6%, while the modified ACOG guidelines presented a sensitivity of 80%. The specificity of OVA1 alone was low (i.e. 49.4%) but the specificity of OVA1 in combination with clinical assessment was even lower (i.e. 44.2%). Moreover, the highest NPV belonged to OVA1 in combination with clinical assessment, while similar NPVs for clinical assessment alone (91%), for the modified ACOG guidelines (91.9%) and for CA-125 (90.1%) were observed. For the detection of stage I or II primary ovarian malignancy, OVA1 in combination with clinical assessment maintained a sensitivity of 95.3%, which was statistically higher than the sensitivity of clinical assessment alone (68.8%, P<0.001), the sensitivity of CA-125 (62.8%, P<0.0001) and the sensitivity of the modified ACOG guidelines (76.7%, P<0.0001). Further analysis showed that 23 early-stage malignancies were missed by clinical assessment alone, 25 were missed by CA-125 II and 13 were missed by the modified ACOG guidelines; all of these were correctly classified by a combination of OVA1 and clinical assessment. The sensitivity for detecting early-stage disease among premenopausal women was 89.3% for OVA1 in combination with clinical assessment and was statistically higher than the sensitivity of clinical assessment and of CA-125. In the postmenopausal subgroup, the combination of OVA1 and clinical assessment correctly detected 98.3% of all early-stage ovarian cancer patients (54).
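The menopausal-status-specific thresholds of the OVA1 risk score mentioned above can be expressed as a simple decision rule. The following Python sketch uses only the two cut-offs quoted in the text (> 5.00 and > 4.4); the function name, the status labels and the example scores are hypothetical, and the sketch does not reproduce the proprietary OVA1 algorithm that generates the score itself.

```python
# Illustrative sketch only: cut-offs are those quoted in the text;
# names and example values are assumptions made for illustration.
OVA1_CUTOFFS = {
    "premenopausal": 5.0,
    "postmenopausal": 4.4,
}


def is_high_risk(score, menopausal_status):
    """Classify a risk score as high-risk using a strict '>' comparison."""
    try:
        cutoff = OVA1_CUTOFFS[menopausal_status]
    except KeyError:
        raise ValueError(f"unknown menopausal status: {menopausal_status!r}")
    return score > cutoff


# The same hypothetical score falls on different sides of the two thresholds.
print(is_high_risk(4.8, "premenopausal"))
print(is_high_risk(4.8, "postmenopausal"))
```

The example shows why the menopausal status must be known before a score can be interpreted: an identical score can be classified differently in the two groups.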

The possible role of the neutrophil/lymphocyte ratio and the platelet/lymphocyte ratio in the early diagnosis of malignant ovarian cancer is the focus of the study by Yildirim et al. In this retrospective study, 316 patients with benign adnexal masses and 253 patients with malignant adnexal masses, all of whom underwent surgical treatment, were enrolled. A complete blood count and measurement of CA-125 in serum were performed just before surgery. The benign cases were classified as endometrioma masses and non-endometrioma masses, whereas the malignant cases were divided into an epithelial and a non-epithelial group of ovarian cancer. For the detection of endometrioma amongst all benign cases, CA-125 had a sensitivity of 75% and a specificity of 82.2%. Although the sensitivity of the neutrophil/lymphocyte ratio (NLR) was higher than that of CA-125 (78%), its specificity was much lower (36.9%). Moreover, the sensitivity of the platelet/lymphocyte ratio (PLR) was similar to the sensitivity of CA-125, while the specificity of the PLR was higher than that of NLR. There was a significant difference in platelet and neutrophil levels between the epithelial cancer group and the non-epithelial cancer group (P<0.01). Furthermore, a significant difference in NLR, PLR, CA-125, platelet, neutrophil and lymphocyte levels was observed between the group with malignant ovarian pathology and the cases with benign pathology. The lymphocyte values displayed the highest sensitivity for detecting early-stage ovarian cancer patients, while the highest specificity was observed using platelet values (55).

In the search for new biomarkers, a combination of CA-125 and the receptor for the circulating fetal protein alpha-fetoprotein (RECAF) was tested in a retrospective case-control study conducted by Tcherkassova and colleagues. The aim of the study was to determine whether RECAF can be used as a diagnostic biomarker and whether combining RECAF and CA-125 can detect ovarian cancer in an early stage (I and II) using a high cut-off value, which eliminates false positives. Serum samples were collected from 80 ovarian cancer patients (31 patients in stage I/II) and from 105 healthy women. Firstly, an antibody against the AFP site of RECAF, which was to be used in a new chemiluminescence assay, was selected. It was demonstrated that Mab 1.4G11 was specific to RECAF and that this antibody could interact with the AFP-binding site of RECAF. ROC analysis of the RECAF and CA-125 levels in this study population showed that RECAF significantly distinguished patients with ovarian cancer from healthy controls. Furthermore, the AUC of RECAF was higher than the AUC of CA-125, especially in terms of detecting early-stage cancer (0.96 for RECAF vs 0.805 for CA-125). Another notable result was the high sensitivity of RECAF for the detection of all stages of ovarian cancer. The sensitivity of CA-125, on the other hand, was reduced in early stages but not in later stages. As no significant correlation was observed between CA-125 and RECAF, the combination of these two potential biomarkers was tested in a panel. With a specificity set at 100%, the cut-offs of CA-125 and RECAF were determined at 36 and 7500 units/ml, respectively. The overall sensitivity for all stages of ovarian cancer rose to 83% and was calculated at 88.2% for the detection of stage III and IV ovarian cancer. Furthermore, the sensitivity for detecting stage I and II ovarian cancer patients was 76%. These results showed that, by combining RECAF and CA-125, 75% of all women with early-stage ovarian cancer could be detected without the occurrence of false positives among women without cancer (56).

In a retrospective case-control study by Simmons, the levels of CA-125, HE-4, MMP-7, CA72-4, CA19-9, CA15-3, CEA and sVCAM in pretreatment sera from 142 patients with stage I ovarian cancer and in 5 annual samples from 217 healthy controls were measured and analyzed. The objective was to identify and validate a panel of the most sensitive biomarkers for the detection of early-stage ovarian malignancies. A repeated sub-sampling validation was performed, during which samples were randomly divided into a training set and a validation set. Of the individual markers, CA-125 displayed the highest sensitivity for detecting early-stage ovarian cancer, followed by HE-4. In the training set, the two panels with the highest sensitivity (85.5% and 85.1%) consisted of CA-125, MMP-7, sVCAM and HE-4 or CA72-4. However, in the validation sets, the two panels displaying the highest sensitivity at a predetermined specificity of 98% consisted of CA-125, CA72-4 and HE-4 with either sVCAM (Se: 83.7%) or MMP-7 (Se: 83.2%). In order to evaluate the longitudinal utility of these individual biomarkers, the 'baseline value' of each biomarker for each individual participant was calculated, based on 5 annual samples. Of all the biomarkers in the aforementioned panels, CA-125 had the lowest biological within-person variability over time, followed by CA72-4 and HE-4. Of the two remaining biomarkers (i.e. MMP-7 and sVCAM), MMP-7 showed the lower within-person variability over time. Based on these results, the panel consisting of CA-125, CA72-4, HE-4 and MMP-7 was selected as the most promising detection tool for ovarian cancer (57).

Lysophosphatidic acid (LPA) is produced by ovarian cancer cells and not by normal ovarian epithelial cells. Therefore, a prospective study led by Sedláková et al. evaluated the potential of plasma LPA as a diagnostic biomarker for ovarian cancer and correlated it with clinico-pathological characteristics. In this study, 81 patients with ovarian cancer (22 in stages I or II and 64 in stages III or IV), 51 patients with benign tumors and 27 controls were included. The results showed that LPA plasma levels were significantly higher in patients with ovarian cancer (median: 11.53 µmol/l) than in the healthy control group (median: 1.86 µmol/l) and in the patients with benign ovarian tumors (median: 6.17 µmol/l; P<0.001). Furthermore, plasma LPA levels were associated with the FIGO stage of the disease and rose from stage I to stage IV patients. More specifically, LPA was elevated in 90% of the patients with stage I disease. No correlation was found between LPA plasma level and histological subtype. Lastly, at a cut-off value of 8.30 µmol/l, a sensitivity of 79% and a specificity of 92.6% for the detection of ovarian cancer were observed (58).

In the search for the ideal biomarker for ovarian cancer, Leandersson et al. conducted a cohort study that assessed the performance of B7-family protein homolog 4 (B7-H4), intact and cleaved soluble urokinase plasminogen activator receptor (suPAR), human epididymis protein 4 (HE-4), the Risk of Ovarian Malignancy Algorithm (ROMA) and cancer antigen 125 (CA-125) in preoperative samples of patients with type I and II ovarian cancer. The study also analyzed whether these biomarker levels correlated with the classification system of epithelial ovarian cancer (EOC) type I and II. Furthermore, the estimated prognosis of the ovarian cancer patients, based on these biomarker levels, was evaluated through five-year follow-up. The study population consisted of 211 patients with benign tumors, 30 patients with borderline tumors, 35 patients with type I EOC and 74 patients with type II EOC. Firstly, plasma levels of B7-H4 were found to be higher in patients with EOC II tumors than in patients with benign tumors (P<0.001). The suPAR levels were lower in patients with benign tumors when compared with the suPAR levels in patients with borderline, EOC I and EOC II tumors. More interestingly, HE-4 and CA-125 levels varied significantly in all subgroups. B7-H4 levels were elevated in stage II-IV ovarian cancer patients and rose with tumor stage. Furthermore, B7-H4 levels were higher in patients with stage II-IV ovarian cancer when compared with B7-H4 levels in patients with benign tumors (P<0.001), but were not elevated in patients with type I ovarian cancer (P=0.07). ROC analysis was performed and logistic regression models were calculated. In premenopausal women, CA-125 displayed the highest AUC (0.864, 95% CI 0.783-0.946) of all individual biomarkers for detecting ovarian cancer. The second-best performing biomarker was suPAR, which presented an AUC of 0.822 (95% CI 0.708-0.936). The best model, combining the three biomarkers HE-4, CA-125 and suPAR with age, had the highest AUC (0.94, 95% CI 0.90-0.98). On top of that, with a specificity set at 95%, this panel showed a sensitivity of 74% with a positive likelihood ratio of 14.7 for the detection of early-stage ovarian cancer. In postmenopausal women, the ROMA score displayed the highest AUC (0.914, 95% CI 0.867-0.961) when compared to the other 4 biomarkers. In fact, none of the proposed models incorporating B7-H4, suPAR, CA-125, HE-4 or age could display a higher AUC than ROMA. In the 5-year survival analysis, high levels of uPAR (I), CA-125 and HE-4 were associated with shorter survival (59).
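The positive likelihood ratio quoted for this panel follows from the standard definition LR+ = sensitivity / (1 - specificity). The Python sketch below, with a hypothetical function name, applies this definition to the rounded sensitivity (74%) and the fixed specificity (95%) given above. It yields a value of about 14.8, which is in line with the reported figure; the small difference presumably reflects rounding of the sensitivity, although this cannot be verified from the summary given here.

```python
# Illustrative sketch only: standard definition of the positive
# likelihood ratio; the function name is an assumption.
def positive_likelihood_ratio(sensitivity, specificity):
    """Return LR+ = sensitivity / (1 - specificity), both given as fractions."""
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must lie between 0 and 1")
    if not 0.0 <= specificity < 1.0:
        raise ValueError("specificity must lie in [0, 1) for LR+ to be finite")
    return sensitivity / (1.0 - specificity)


# Rounded values from the text: sensitivity 74% at a specificity of 95%.
lr_plus = positive_likelihood_ratio(0.74, 0.95)
print(round(lr_plus, 1))
```

Because the result is a ratio of two proportions, a likelihood ratio is a dimensionless number and is not expressed as a percentage.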

A total of 35 biomarkers were evaluated in the phase III study of Cramer and colleagues. A phase III study evaluates the diagnostic performance of biomarkers in samples taken from asymptomatic individuals before clinical diagnosis is made, while in a phase II study, the potential diagnostic use of biomarkers is assessed in samples from symptomatic individuals at the time of diagnosis. The goal of the study was to validate the markers that were previously assessed in phase II sets. The study population of phase II consisted of 160 patients with ovarian cancer, approximately 50% of whom were early-stage cases. Furthermore, 160 patients with benign diseases and 480 healthy controls took part in this study. In phase III, 118 cases of ovarian cancer and 474 healthy controls were included. In phase II, the best performing biomarkers for detection of all stages of ovarian cancer, ranked by sensitivity, were CA-125 (73%), HE-4 (57%), transthyretin (47%), CA15-3 (46%), CA72-4 (40%) and IGFBP2 (38%). The sensitivity, specificity and AUC of these biomarkers can be found in additional table 4. In the phase III data, CA-125 was again the best performing biomarker. Sensitivity of CA-125 for detecting ovarian cancer was highest in samples drawn within 6 months before or after diagnosis (86%, 95% CI 76-97%). However, the sensitivity of CA-125 dropped in samples more remote from diagnosis. The second-best biomarker was HE-4, with a sensitivity of 73% (95% CI 60-86%) in samples taken 6 months or more before diagnosis and, similarly to CA-125, the sensitivity of HE-4 declined in samples that were taken from patients more than 6 months from diagnosis. Lastly, one similar observation could be made for all biomarkers. When comparing phase II and III, it became apparent that, as regards phase II samples, the further away the samples were taken from the time of diagnosis, the more the performance of all biomarkers declined (60).

The next case-control study explored whether modifying the cut-off values of HE-4, CA-125 and the ROMA algorithm could result in more sensitive and cost-effective screening for ovarian cancer. HE-4 and CA-125 levels were evaluated in the sera of 765 patients with adnexal masses, of whom 275 patients had malignant tumors, 53 patients had borderline tumors and 428 patients had benign diseases. A total of 265 healthy women were included as a control population. It has already been proven that HE-4 is not expressed in non-epithelial ovarian tumors. For that reason, patients with non-EOC were excluded. Women < 51 years old were classified as premenopausal in the healthy cohort. Approximately 67% of patients with ovarian cancer were at an advanced cancer stage. Similarly to results in previous studies in this systematic review, levels of HE-4 and CA-125 rose as the status of the disease progressed from benign to malignant. Furthermore, the HE-4 and ROMA levels were found to be higher in postmenopausal women, particularly in patients with ovarian cancer. ROC analysis was performed on all groups. In premenopausal women, ROMA displayed the highest diagnostic accuracy of all biomarkers with an AUC of 0.818 for the differentiation between ovarian cancer patients and patients with benign diseases, followed closely by HE-4 with an AUC of 0.817. The AUC of CA-125 for ovarian cancer was much lower at 0.683. When combining HE-4 and CA-125, the diagnostic accuracy increased (AUC=0.817) and was similar to that of HE-4 alone and ROMA. However, the combination of HE-4, CA-125 and ROMA displayed the highest AUC (0.886). In postmenopausal women, the AUCs of all biomarkers increased in comparison with their AUCs in premenopausal women. The two panels (HE-4 and CA-125 with or without ROMA) both displayed an AUC of 0.888. The sensitivities of these panels can be found in additional table 4. HE-4 and ROMA had higher sensitivities than CA-125 in both pre- and postmenopausal women.
Furthermore, when it came to differentiating early-stage ovarian cancer from benign diseases, ROMA displayed an AUC of 0.699 while HE-4 had an AUC of 0.714 in premenopausal women. In samples from postmenopausal women, the AUCs of ROMA and HE-4 increased to 0.902 and 0.880, respectively. The highest diagnostic performance in pre- and postmenopausal women was reached by the panel consisting of CA-125, HE-4 and ROMA with AUCs of 0.734 (P=0.05) and 0.976 (P=0.02). Lastly, new cut-off values of each biomarker, specifically for the population of South China, were evaluated and the resulting new values were 70 pmol/l for HE-4 and 60 U/ml for CA-125. The sensitivity of HE-4 increased from 39.7% to 73.8% with a minor decrease in specificity. Using the new cut-off value for CA-125, only an increase in specificity from 77.6% to 87% was observed. No significant differences were found when using these new values for the ROMA score (61).

The article by Gislefoss explores the use of the most studied biomarker for detecting ovarian cancer in recent years: human epididymis protein 4 or HE-4. A total of 120 EOC patients and 174 healthy controls were included in this retrospective multi-center case-control study. HE-4 levels, CA-125 levels and cotinine (a biomarker for tobacco use) levels were measured in all samples and the performance of the first two biomarkers was evaluated. A significant correlation between HE-4 and cotinine (P<0.001) was observed in patients with ovarian cancer and in controls, whereas no significant correlation between HE-4 and CA-125 was observed. Furthermore, HE-4 and CA-125 levels differed significantly between samples of patients with cancer collected two years before diagnosis and samples of controls (P=0.002 and P<0.002). However, when HE-4 and CA-125 levels in samples collected from patients 4 years before diagnosis were compared with the levels in samples of controls, a significant difference was observed only for CA-125 (P<0.001) (62).

Zhu and colleagues performed a validation study of 28 biomarker panels for early-stage ovarian cancer detection. Previous studies on biomarker panels have often reported promising results, demonstrating higher sensitivity and specificity than the current most widely used biomarker, CA-125. However, the samples in these studies were frequently collected at the time of diagnosis and included a high number of patients with advanced-stage disease as a target population. The purpose of a screening program, by contrast, is to detect cancer in asymptomatic individuals. Thus, for a biomarker to be implemented in a screening program, it must display high performance in diagnosing cancer while the patient is still in an asymptomatic phase, which cannot be analyzed in samples collected at the time of diagnosis. Furthermore, only a minority of these biomarker panels have been validated in reliable studies. For that reason, many of the biomarkers claimed to be more sensitive and specific than CA-125 were evaluated in this study in prediagnostic serum samples of 118 patients with ovarian cancer and 951 healthy controls. The study was divided into three steps. In the first step, previously established biomarker models were validated. In the second step, investigators sought to improve upon these models in a training set and to validate these existing or improved models. Finally, in the third step, the new biomarker models that were discovered in a training set were described. The results of this article are summed up in additional table 4. The five different biomarker panels were named panel A to E. For within-one-year cases, panel B1 had similar results to CA-125 in terms of diagnostic performance for ovarian cancer, with a sensitivity, specificity and AUC of 69.2%, 96.6% and 0.892, respectively. CA-125, on the other hand, displayed a sensitivity of 63.1%, a specificity of 98.5% and an AUC of 0.890. Panels A1, C1 and E1 all presented a considerably lower sensitivity and AUC than CA-125. One model, D1, had a very high sensitivity of 95.4% but was accompanied by a low specificity of 32.3% for the detection of ovarian cancer. In step 2, the same patterns in terms of sensitivity and specificity of these panels for the detection of ovarian cancer were seen as in step 1 with regard to the within-one-year cases. In step 3, a pan-site model, which incorporated CA-125, HE-4, CA72-4, SLPI (secretory leukocyte protease inhibitor) and B2M (beta-2-microglobulin), was evaluated. In the within-one-year cases, this biomarker panel showed an AUC of 0.911 and a sensitivity of 68.2% at a specificity of 98%. Its diagnostic performance, however, was not statistically better than that of any model or that of CA-125 alone in step 1 or 2 (63).

3.5 PROSTATE CANCER

Nine studies regarding the evaluation of potential biomarkers for the detection of prostate cancer were included in this systematic review. The results of these studies can be consulted in additional table 5.

The first study concerning biomarkers for prostate cancer was conducted by Cremers and colleagues. The goal of the study was to evaluate the use of the prostate cancer gene 3 (PCA3) urine test as an additional test to serum prostate-specific antigen (PSA) in prostate cancer screening among BRCA carriers with breast cancer. The study population consisted of 191 BRCA1 carriers, 75 BRCA2 carriers and 309 non-carriers. Moreover, these patients were also participants in the IMPACT study, which is an international study on PSA screening in BRCA mutation carriers. In the first screening round, PSA was elevated in 75 participants and 14 of these men were diagnosed with prostate cancer (PC). No PCs were diagnosed in men with a normal PSA level but an abnormal digital rectal examination (DRE). The detection rate in BRCA2 and BRCA1 carriers was 4% and 1%, respectively. In the second year of screening, 50 participants displayed elevated PSA levels and 4 PCs were detected. Three of these 4 patients with PC had a PSA < 3.0 ng/ml in the first round. Furthermore, results indicated that PSA performed best in the BRCA2 group with a sensitivity of 100%, a specificity of 85%, a PPV of 25% and an NPV of 100%. Only 3 of all 23 PCs had abnormal DRE results. The results of PCA3 in BRCA2 carriers consisted of a PPV of 13% and an NPV of 98% (rest of results not reported). When using a PCA3 score cut-off of 25 to withhold biopsies in men with elevated PSA levels, the number of prostate biopsies in the first screening round would have decreased from 68 to 31. When applying a cut-off level of >35 for the PCA3 score, only 25 biopsies would have been carried out, at the expense of missing 7 and 11 PCs in the first and second screening round (64).

The validation of a free prostate-specific antigen (PSA) isoform, [-2]proPSA, as a potential diagnostic tool for prostate cancer was the subject of a prospective multi-center cohort study by Sokoll et al. A total of 566 men who were referred to a doctor for , participated in this study. Of these 566 individuals, 321 patients showed no signs of cancer and 245 patients were diagnosed with prostate cancer. Serum PSA, free PSA and [-2]proPSA were measured in samples collected from these patients before prostate biopsy. Median PSA levels were significantly elevated in patients with prostate cancer when compared with the levels of PSA in the non-cancer group. The AUCs of PSA, %fPSA and %[-2]proPSA were 0.66, 0.70 and 0.67, respectively. A logistic regression model that incorporated PSA, %fPSA, %[-2]proPSA, age, race, DRE and prostate cancer family history achieved an AUC of 0.79, which was significantly higher than the AUCs of all individual biomarkers (P<0.0001). Adding %[-2]proPSA increased the AUC from 0.75 to 0.79. The sensitivities and specificities of each biomarker are displayed in additional table 5. In the samples of patients with early-stage prostate cancer (PSA 2 to 4 ng/mL range), %fPSA was significantly decreased (P=0.02) while %[-2]proPSA was significantly higher (P<0.0001). On top of that, ROC analysis was performed and %[-2]proPSA presented an AUC of 0.73 for the detection of early-stage cancer, which is a higher AUC than that of PSA (0.58) and %fPSA (0.61). As a result, a logistic regression model was developed. This model displayed an AUC of 0.76, which was similar to the AUC of %[-2]proPSA. Similar results were observed in the group with a PSA range of 4-10 ng/mL. Lastly, the relation between the aggressiveness of the disease and biomarker levels was investigated by comparing [-2]proPSA and %[-2]proPSA with the Gleason score of each patient. As expected, [-2]proPSA and %[-2]proPSA increased with increasing Gleason score (P<0.0001) (65).

In the study by Gordian et al., serum free-circulating DNA (fcDNA) was tested as a possible diagnostic biomarker for distinguishing patients with prostate cancer from patients with benign prostate disease. The main difference between this study and previous studies was that the entire study population consisted of men with elevated PSA levels (> 4 ng/mL) and/or an abnormal digital rectal exam, all of whom were referred to undergo prostate biopsy. Samples were collected in a prospective fashion from 89 patients with prostate cancer, 59 patients with prostatitis and 104 patients with benign prostate hypertrophy (BPH). When fcDNA was measured in all samples, a significant difference (P=0.0129) in fcDNA levels between the group with prostate cancer and the group with benign prostate diseases (prostatitis + BPH) was observed. This difference in fcDNA levels was also found in the subgroup of patients with PSA < 10 ng/mL (P=0.023), but not in the subgroup of patients with PSA > 10 ng/mL (P=0.485). Furthermore, no association was found between fcDNA levels and the Gleason score of patients with prostate cancer. A univariate analysis showed that the best cut-off point of fcDNA to distinguish patients with prostate cancer from patients with benign diseases was 180 ng/mL. Subsequently, multivariate analysis was performed and odds ratios of fcDNA and PSA for prostate cancer patients were calculated. Combining age, race, fcDNA (>180 ng/mL or <180 ng/mL) and PSA (>10 or <10 ng/mL) in a logistic model and applying it to the test data set resulted in an AUC of 0.742. With this AUC, the logistic model displayed a higher diagnostic performance than that of PSA, fcDNA or the combination of these two biomarkers. Moreover, the risk of prostate cancer was far higher for patients with PSA <10 ng/mL and fcDNA >180 ng/mL (OR=4.27) than for patients with fcDNA <180 ng/mL (66).

The first study concerning biomarkers for prostate cancer, by Cremers et al., used a subpopulation of the IMPACT study to evaluate the use of PCA3 in BRCA carriers. The study of Mitra and colleagues consisted of a preliminary analysis of the results of this IMPACT study, which is a multi-center observational study that focused on prostate cancer screening through PSA in men with mutations in BRCA1 and BRCA2. A total of 300 patients were recruited over a period of 33 months and consisted of 89 BRCA1 carriers, 116 BRCA2 carriers and 95 controls. Total PSA in serum of these patients was measured annually in order to determine referral for biopsy. The cut-off value of total PSA for referral was 3.0 ng/mL. In the first year of follow-up, 10 patients were diagnosed with prostate cancer out of 21 patients that were referred for biopsy based on an increase in PSA levels (> 3 ng/mL). The overall PPV of PSA was 48%, which means that 52% of the positive tests were false positives. The PPV of PSA was 50% in the control group and 47% in the group with BRCA carriers (95% CI 23-72%). More specifically, the PPV in BRCA1 mutation carriers was 66.7% while the PPV in the BRCA2 group was 36.4%. In the second year, one patient (a BRCA2 carrier) out of 4 patients that were referred for prostate biopsy due to an increase in PSA levels was diagnosed with prostate cancer. Consequently, although test numbers were low, the PPV of PSA was 25% (67).
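The PPV figures in the study by Mitra and colleagues can be reproduced from the counts given above, since the PPV is the number of confirmed cancers divided by the number of positive tests (here: referrals for biopsy based on elevated PSA). The Python sketch below is a minimal illustration with a hypothetical function name; it uses only the counts stated in the text (10 of 21 in the first year, 1 of 4 in the second year).

```python
# Illustrative sketch only: standard definition of the positive
# predictive value; the function name is an assumption.
def positive_predictive_value(true_positives, test_positives):
    """Return the fraction of positive tests that are true positives."""
    if test_positives <= 0:
        raise ValueError("at least one positive test is required")
    if not 0 <= true_positives <= test_positives:
        raise ValueError("true positives must lie between 0 and test positives")
    return true_positives / test_positives


# Year 1: 10 cancers among 21 men referred for biopsy.
year_one = positive_predictive_value(10, 21)
# Year 2: 1 cancer among 4 men referred for biopsy.
year_two = positive_predictive_value(1, 4)

print(f"Year 1 PPV: {year_one:.0%}")
print(f"Year 2 PPV: {year_two:.0%}")
```

With only four referrals in the second year, a single additional diagnosis would have doubled the estimate, which illustrates why the second-year PPV should be interpreted with caution.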

Hitherto, PSA has been used as a cancer biomarker for diagnosis, monitoring of response to treatment, and prediction of PC risk and of treatment outcome. However, its performance leaves a lot to be desired: it is in essence a prostate (disease-)specific marker rather than a prostate cancer-specific marker, and it falls short in terms of sensitivity and specificity to reliably detect PC. For PSA to become a more reliable biomarker, adjustments in terms of age and prostate
volume have to be made. In the last decade, there have been no new serum-based prostate cancer detection tests that showed sufficient sensitivity or specificity to reach the clinic. As for new urine-based biomarkers for the detection of PC, PCA3 displayed a higher specificity than PSA, but it is still not the ideal diagnostic biomarker for prostate cancer because it produces a great number of false positives. Based on that premise, Morgan and his colleagues set out to assess the diagnostic potential of an alternative biomarker: Engrailed-2 or En-2 and its protein EN2. First, En-2 was identified in prostatic cancer tissue and subsequently, secreted EN2 in urine samples was evaluated as a possible diagnostic biomarker for PC. A total of 82 patients with PC and 102 controls with hematuria took part in this study. The control group of men older than 40 years was divided into two groups: a low PSA group A with hematuria but no urothelial malignancy, and a low PSA group B with no symptoms or family history of PC. In prostate cancer tissue, EN2 protein was highly expressed, while in normal prostate tissue, no elevated En-2 expression was observed. The protein EN2 was measured in urine samples using ELISA. When using a cut-off of 42.5 µg/L, elevated EN2 protein levels were detected in 54 of 82 samples of patients with PC. Interestingly, in nine of these men, PSA levels were below the 2.5 ng/mL cut-off point. The sensitivity and specificity of EN2 for detecting PC were calculated at 66% and 90%, respectively. ROC analysis of these data showed that EN2 had an AUC of 0.8021 (P<0.001) for the detection of PC, suggesting EN2 could be a diagnostic biomarker for PC (68).

The study of Vickers et al. involves the prediction of prostate cancer in men using a blood-based four-kallikrein panel. The participants included in this screening study were 1501 men with elevated PSA levels (> 3 ng/mL) who underwent biopsy. Of these 1501 participants, 388 patients were diagnosed with prostate cancer after biopsy. The four-kallikrein model that was tested in the serum samples of these patients consisted of total PSA (tPSA), free PSA (fPSA), intact PSA (iPSA) and kallikrein-related peptidase 2 (hK2). Besides this model, a few other laboratory and clinical models were developed and tested in three rounds of follow-up. Only the results of the clinical models in all stages of prostate cancer are displayed in additional table 5. When assessing the results, a few observations could be made. The total PSA levels were similar in participants with and without prostate cancer. Both the laboratory and clinical base models (including DRE, age and tPSA analysis) showed poor diagnostic performance, with AUCs of 0.557 and 0.585, respectively. The full laboratory and clinical models included fPSA, iPSA and hK2 on top of DRE, age and tPSA analysis. As a result, discriminative accuracy increased to 0.713 and 0.711 (P<0.001). In order to put these results in a more clinical real-life context, a scenario was examined in which a cut-off point of 20% predicted probability for
positive biopsy served as the threshold for referral for biopsy. When applying the full laboratory model, the number of biopsies would be reduced by 30%, while only delaying the diagnosis of 32 low-grade tumors and 4 high-grade tumors per 1000 men that would be detected in later rounds of screening. Applying the full clinical model would lower the number of biopsies by 36% with a delayed diagnosis of 45 low-grade tumors and 4 high-grade tumors. In rounds 2 and 3, similar results for these models were attained. If the kallikrein model is used to determine biopsy, this would lead to superior clinical results (69).

The next study included in this systematic review is a direct follow-up cohort study of the last one. The aim of this study by Vickers and colleagues was to validate the positive test results of the kallikrein model in a larger, independent, representative population-based cohort study. A total of 2914 unscreened participants were included in this study. Prostate cancer was diagnosed in 807 men with elevated levels of serum tPSA (>3 ng/mL). The data set was divided into a training set with 728 participants (202 patients with PC) and a validation set with 2186 men (605 patients with PC). The results of the laboratory models are displayed in additional table 5. In round 1, the laboratory base model, which incorporated PSA and age, displayed an AUC of 0.637, which rose to 0.764 for the full laboratory model that included the four kallikreins. The clinical models included DRE on top of age and PSA, and showed similar results: the AUC of the base clinical model was 0.695 and that of the full clinical model was 0.776. The prediction accuracy for prostate cancer was significantly increased by adding the kallikreins (iPSA, fPSA, nicked PSA and hK2) to the models (P<0.001). Once again, to put these results in a clinical context, decision curves for prostate cancer diagnosis were assessed.
A 20% probability threshold for referral to prostate biopsy was chosen, as in the previous study. If only the men with elevated PSA levels and a predicted probability of > 20% from the full clinical model were biopsied, the total number of biopsies would be reduced by 513, while only 66 men with predominantly low-grade prostate cancers would be missed. During round 2, which occurred four years later, the same analyses were repeated. The laboratory base model presented an AUC of 0.615 and the full model an AUC of 0.757, while the clinical base and full models reached AUCs of 0.672 and 0.764, respectively. These results were similar to those of the first round (70).
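The decision curves mentioned above rest on the net benefit of a biopsy strategy at a chosen probability threshold. The sketch below uses the standard decision-curve formula, net benefit = TP/n - FP/n x pt/(1 - pt), in which pt is the threshold probability. The counts in the example are hypothetical and serve only to show how a "biopsy all" strategy can be compared with a model-guided strategy at the 20% threshold; they are not results from studies (69) or (70).

```python
def net_benefit(true_positives: int, false_positives: int,
                n_patients: int, threshold: float) -> float:
    """Net benefit of a biopsy strategy at a given threshold probability."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    if n_patients <= 0:
        raise ValueError("n_patients must be positive")
    # Each unnecessary biopsy is weighted by the odds of the threshold.
    weight = threshold / (1.0 - threshold)
    return true_positives / n_patients - (false_positives / n_patients) * weight


# Hypothetical cohort of 1000 men with elevated PSA.
biopsy_all = net_benefit(true_positives=260, false_positives=740,
                         n_patients=1000, threshold=0.20)
model_guided = net_benefit(true_positives=240, false_positives=400,
                           n_patients=1000, threshold=0.20)
print(f"biopsy all: {biopsy_all:.3f}, model-guided: {model_guided:.3f}")
```

A strategy with the higher net benefit at the threshold of interest is preferred, which is how a reduction in biopsies can be weighed against a small number of missed cancers.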

PSA is, despite its limited sensitivity and low specificity, currently the most widespread screening tool for prostate cancer in the world. However, the resulting over- and underdiagnosis of low-grade and high-grade prostate cancer has raised some concerns. In the prospective multi-center validation study by Wei et al., the performance of urinary prostate cancer antigen 3 (PCA3) as a diagnostic biomarker for the detection of prostate cancer was assessed. A total of 859 patients
were divided into two groups: a group of men presenting for initial biopsy and a group of men presenting for repeat biopsy. In the first group, the PPV was 80% (95% CI 72% to 86%). When using a PCA3 score of 60 as a cut-off point, sensitivity and specificity of PCA3 for the detection of prostate cancer were 42% and 91%, respectively. In the second group, the NPV was 88% (95% CI 81% to 93%). At a PCA3 score of less than 20 among men with prior biopsies, sensitivity and specificity were calculated at 76% and 52%, respectively. Furthermore, the two aforementioned groups were divided into subgroups on the basis of serum PSA levels. Subsequently, the risk of cancer in these subgroups was determined for men with PCA3 scores of less than 20, 20-60 and more than 60. Elevated PCA3 scores correlated with a higher probability of any stage of prostate cancer as well as of high-grade prostate cancer. In addition, the performance of PCA3 in combination with the Prostate Cancer Prevention Trial (PCPT) risk calculator (which takes age, race, PSA, abnormal DRE, prior negative biopsy and family history of prostate cancer into account) was analyzed. In the first group of men presenting for an initial prostate biopsy, the AUC increased from 0.68 for PCPT alone to 0.79 for PCPT and PCA3 (P<0.001). The same observation was made in the second group of men presenting for a repeat biopsy; the AUC increased from 0.64 for PCPT to 0.69 when adding PCA3. Lastly, if men with PCA3 < 20 and PSA < 4 ng/mL were classified as low risk and advised against undergoing a prostate biopsy, the number of biopsies among men presenting for repeat biopsy would be reduced by 23. The greatest pitfall of this measure would be underdiagnosis of high-grade cancer. However, by applying this strategy, only 2 patients with low-grade prostate cancer would have been missed in this study population.
Nevertheless, if the same thresholds were applied to the group of men presenting for initial biopsy, a much higher rate of aggressive cancer would have been missed (71).
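The PCA3 score bands and the low-risk rule described above can be expressed as a short sketch. The function names are illustrative, and the handling of scores exactly equal to 20 or 60 is an assumption, since the text does not specify to which band the boundary values belong.

```python
def pca3_band(pca3_score: float) -> str:
    """Assign a PCA3 score to one of the three bands used in the subgroup analysis."""
    if pca3_score < 20:
        return "<20"
    if pca3_score <= 60:  # assumption: boundary values fall in the middle band
        return "20-60"
    return ">60"


def low_risk_repeat_biopsy(pca3_score: float, psa_ng_ml: float) -> bool:
    """Low-risk rule from the text: PCA3 < 20 and PSA < 4 ng/mL."""
    return pca3_score < 20 and psa_ng_ml < 4


# Hypothetical patients presenting for repeat biopsy.
for pca3, psa in [(12, 3.1), (12, 6.0), (45, 3.5), (80, 9.2)]:
    print(pca3_band(pca3), low_risk_repeat_biopsy(pca3, psa))
```

As the text notes, this rule was only considered acceptable for men presenting for repeat biopsy; the same thresholds missed considerably more aggressive cancers in the initial biopsy group.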

3.6 PANCREATIC CANCER

Eight studies concerning potential biomarkers for the detection of pancreatic cancer were included in this systematic review. Although the incidence of pancreatic cancer is low, it has a five-year survival rate of only 5%. The reason behind this poor prognosis can be found in the absence of an effective early detection program and in rapid disease progression. CA19-9 is currently the only biomarker that is approved by the FDA to assist in pancreatic cancer diagnosis; however, it lacks the sensitivity and specificity to be a reliable diagnostic biomarker on its own. The first study that concerns the use of biomarkers for the detection of pancreatic cancer in this systematic review was conducted by Baine et al. They sought to analyze the expression of six genes (ANXA3, ARG1, CA5B, F5, SSBP2 and TBC1D8) in Peripheral Blood Mononuclear Cells
(PBMC) from 95 patients with pancreatic cancer (PC), 35 patients with chronic pancreatitis and 47 healthy persons. Of the patients with pancreatic cancer, 48 had early-stage resectable tumors and 47 had late-stage disease. It was found that expression levels of these six genes in PBMC were lower in patients with early-stage pancreatic cancer when compared with healthy controls. However, when comparing the levels of the six genes in early-stage PC with late-stage PC, gene expression levels increased again with pancreatic cancer progression, although this trend was not significant (P>0.05) for any biomarker except F5 (P=0.028). When analyzing the diagnostic performance of these genes for the differentiation of early-stage PC patients from healthy controls, none of the examined genes displayed promising results at a fixed specificity of 80%. The AUCs of these six genes ranged from 0.503 to 0.584. Nevertheless, when comparing early-stage PC patients to patients with chronic pancreatitis, the differentiation ability of all six genes increased, with AUCs ranging from 0.517 to 0.700. CA5B was the best performing biomarker for distinguishing early-stage PC from both healthy controls and chronic pancreatitis patients, with the highest AUCs of 0.584 and 0.700 and sensitivities of 20.8% and 47.9%, respectively. Furthermore, multivariate analyses were carried out to calculate the OR for PC using the univariate-derived cut-off values for the 4 best performing genes with regard to distinguishing early PC patients from chronic pancreatitis patients and healthy controls. For differentiating early PC patients from healthy controls, multivariate analysis showed that MIC1, ARG1, F5 and CA5B displayed the highest AUCs (0.574, 0.567, 0.561 and 0.581). The ORs for these genes were 1.253, 1.638, 1.949 and 0.914, respectively. Moreover, the performance of CA19-9 was analyzed and displayed an AUC of 0.719 with an OR of 8.565 at a cut-off value of 61.7 U/mL.
Based on these results, a new score model, which included MIC1, ARG1, F5, CA5B and CA19-9, was constructed and named “panel A”. This panel performed well and demonstrated an AUC of 0.772 (95% CI 0.68-0.87). However, this AUC was not significantly different from the AUC of CA19-9 alone. Furthermore, it was important to determine which biomarker could differentiate best between patients with PC and patients with chronic pancreatitis. Based on multivariate analysis, the AUCs of CA5B, F5, SSBP2 and MIC1 were, respectively, 0.700, 0.678, 0.685 and 0.640 for the differentiation of PC patients from chronic pancreatitis patients. The AUC of CA19-9 was 0.704 with an OR of 5.66 at a cut-off value of 74.0 U/mL. Based on the results of the multivariate analysis, a new panel was constructed and named “panel B”. The AUC of this panel was calculated at 0.820 (95% CI 0.73-0.91) and was a significant improvement over CA19-9 alone. Lastly, the sensitivity and specificity of panel B were found to be 67% and 83%, respectively. The rest of the results are displayed in additional table 6 (72).
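To make the comparison between a single marker and a multi-marker panel more concrete, the sketch below computes the AUC as the probability that a randomly chosen case scores higher than a randomly chosen control (ties counting one half), and applies it to a single marker and to a logistic panel score. All marker values, weights and the intercept are hypothetical; they are not the coefficients of panel A or panel B, which are not given in the text.

```python
import math


def panel_score(marker_values, weights, intercept):
    """Predicted probability from a logistic model (weights are hypothetical)."""
    z = intercept + sum(w * v for w, v in zip(weights, marker_values))
    return 1.0 / (1.0 + math.exp(-z))


def auc(case_scores, control_scores):
    """AUC via pairwise comparison of cases and controls (ties count 0.5)."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical subjects: (gene expression marker, CA19-9 in U/mL).
cases = [(2.1, 80.0), (1.4, 20.0), (2.8, 150.0), (0.9, 95.0)]
controls = [(1.0, 15.0), (1.6, 30.0), (0.7, 10.0), (1.2, 60.0)]
weights, intercept = (1.0, 0.02), -3.0  # illustrative values only

ca199_auc = auc([c[1] for c in cases], [h[1] for h in controls])
panel_auc = auc([panel_score(c, weights, intercept) for c in cases],
                [panel_score(h, weights, intercept) for h in controls])
print(f"CA19-9 alone: {ca199_auc:.3f}, panel: {panel_auc:.3f}")
```

Whether a panel improves on a single marker depends on the data; as with panel A above, a numerically higher AUC is not necessarily a significant improvement.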

The next study by Yip-Schneider and colleagues discusses the use of vascular endothelial growth factor (VEGF) as a biomarker for the detection of mucinous pancreatic cysts, which are premalignant lesions that could progress to invasive pancreatic adenocarcinoma. The pancreatic cystic lesions are divided into two groups, of which the first one is premalignant: mucinous cysts (intraductal papillary mucinous neoplasms (IPMN) and mucinous cystic neoplasms (MCN)) and serous cystic neoplasms (SCN). SCN is known to mimic IPMN and MCN on radiographic imaging and in symptom presentation. Therefore, it is of utmost importance to identify a biomarker that can differentiate benign serous cysts from premalignant mucinous cysts. Pancreatic fluid samples were prospectively collected from 87 patients during routine endoscopy and/or operation; 9 of these patients were diagnosed with pseudocysts, 17 with SCNs, 24 with MCNs, 10 with high-grade IPMNs, 16 with low-grade IPMNs and 11 with pancreatic ductal adenocarcinomas. The results demonstrated that the levels of VEGF-A were significantly increased in pancreatic cyst fluid from patients with SCN in comparison to the VEGF-A levels in all other groups (P<0.0001). At a cut-off value of 8500 pg/mL, VEGF-A displayed a sensitivity of 100% and a specificity of 97% as a biomarker for the detection of benign SCN lesions. Subsequently, ROC analysis resulted in an AUC > 0.99 for the detection of SCN lesions. The levels of VEGF-A were also measured in plasma, serum, bile and urine. However, no significant difference in serum VEGF-A levels between any of the groups was observed. For that reason, it can be concluded that increased levels of VEGF-A are specifically detected in pancreatic cyst fluid from SCN. In addition to the analysis of VEGF-A levels, VEGF-C levels were also evaluated in pancreatic cystic fluid.
Similarly to VEGF-A, levels of VEGF-C were significantly elevated in SCN cyst fluid when compared with the levels of VEGF-C of other pancreatic lesions (P<0.0001). At a cut-off value of 200 pg/mL, VEGF-C demonstrated a sensitivity of 100% and a specificity of 90% for the detection of SCN. Furthermore, immunohistochemistry was used to localize the expression of VEGF-A and vascular endothelial growth factor receptor 2 (VEGFR-2) in pancreatic cyst wall tissue. VEGF protein was localized in SCN tissue, indicating that increased VEGF protein production by cells in this SCN tissue can lead to elevated levels of secreted VEGF in SCN cyst fluid. Moreover, expression of VEGFR-2 was localized in SCN cyst tissue. However, low expression of VEGF or VEGFR-2 was found in normal pancreas tissue or in tissue of premalignant lesions (73).

In the validation study by Capello et al., the value of 17 plasma proteins as diagnostic biomarkers for pancreatic ductal adenocarcinoma (PDAC) was evaluated. These 17 proteins were identified in previous studies as possible biomarkers and were validated in plasma samples of 187 PDAC patients, 93 patients with benign pancreatic disease and 169 healthy controls. All participants were
subdivided into an identification set (which was used to discover the 7 best biomarkers), three validation sets and one additional independent plasma sample set for the evaluation of a combined biomarker panel. The 17 proteins tested in this study were ALCAM, CHI3L1, COL18A1, IGFBP2, LCN2, LRG1, LYZ, NPC2, PARK7, REG3A, SLPI, pro-CTSS, total CTSS, THBS1, TIMP1, TNFRSF1A and WFDC2. In the first set, the identification set, the levels of seven biomarkers were significantly elevated in PDAC cases when compared with the levels of these seven biomarkers in chronic pancreatitis cases (P<0.05). ROC analysis of these seven biomarkers resulted in AUCs all greater than 0.60. Thus, these seven proteins were selected and tested in the subsequent validation sets. AUC values of all seven biomarkers were > 0.6 in validation sets 1, 2 and 3. The results demonstrated that, in each validation set, the plasma levels of these seven proteins were elevated in PDAC patients in comparison to the plasma levels of these proteins in controls. The controls were healthy individuals and chronic pancreatitis patients in validation sets 1 and 2, whereas in validation set 3, the controls were patients with benign pancreatic cysts. Subsequently, a biomarker panel for the early-stage detection of PDAC was developed based on a logistic regression model. This panel consisted of three biomarkers, namely TIMP1, LRG1 and CA19-9. When it came to distinguishing patients with early-stage PDAC from healthy controls, the panel demonstrated an AUC of 0.949 (95% CI 0.917-0.981) with a sensitivity of 84.9% at a fixed specificity of 95%. The OR of this panel was 4.67. Furthermore, when comparing the results of CA19-9 alone (AUC=0.882 and Se=0.726) with those of the newly proposed biomarker panel, a statistically significant improvement was observed.
On top of that, another logistic regression model using the same three biomarkers was developed to differentiate patients with early-stage PDAC from benign pancreatic disease cases. The AUC of this model was 0.846 with a 95% CI of 0.781-0.911 and the OR was 2.98. Blinded validation of this biomarker panel was carried out in an independent set. Similar results were observed in this last validation set, as levels of all three biomarkers were significantly increased. Furthermore, the AUC of the logistic regression model for discriminating patients with PDAC from healthy controls was 0.887, which was again significantly higher than the AUC of CA19-9 alone. The model yielded a sensitivity of 66.7% at 95% specificity and an odds ratio of 3.19 (74).
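Several results above are reported as a sensitivity at a fixed specificity of 95%. A minimal sketch of that calculation is given below: the threshold is placed so that at most 5% of controls score above it, after which the share of cases above the same threshold is the sensitivity. The function name and the scores are hypothetical, and the exact procedure used in study (74) may differ.

```python
import math


def sensitivity_at_fixed_specificity(case_scores, control_scores, specificity=0.95):
    """Return (threshold, sensitivity, achieved specificity) for a target specificity."""
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    if not 0.0 < specificity <= 1.0:
        raise ValueError("specificity must be in (0, 1]")
    n_controls = len(control_scores)
    # Largest number of controls that may test positive (small epsilon guards rounding).
    max_false_pos = min(n_controls - 1,
                        math.floor((1.0 - specificity) * n_controls + 1e-9))
    ranked = sorted(control_scores, reverse=True)
    threshold = ranked[max_false_pos]  # scores strictly above this are positive
    sensitivity = sum(s > threshold for s in case_scores) / len(case_scores)
    achieved = sum(s <= threshold for s in control_scores) / n_controls
    return threshold, sensitivity, achieved


# Hypothetical panel scores.
controls = list(range(1, 21))
cases = [5, 12, 18, 19.5, 20.5, 21, 22, 25, 30, 40]
threshold, sens, spec = sensitivity_at_fixed_specificity(cases, controls, 0.95)
print(f"threshold={threshold}, sensitivity={sens:.1%}, specificity={spec:.1%}")
```

Fixing the specificity in this way allows a panel and CA19-9 alone to be compared at the same false-positive rate.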

The aim of the prospective observational cohort study by Henriksen was to evaluate cell-free DNA promoter hypermethylation of 28 genes in patients with pancreatic adenocarcinoma and to combine these genes in a diagnostic biomarker panel. A total of 95 patients with pancreatic adenocarcinoma, 27 controls (control group 1), 97 patients with chronic pancreatitis (control group 2) and 59 patients with acute pancreatitis (control group 3) were
included. Patients with pancreatic adenocarcinoma had significantly elevated levels of cell-free DNA compared with the levels of cell-free DNA in all control groups. The mean number of methylated genes of the whole panel of 28 genes was 8.41 in the cancer group. The number of methylated genes in the control groups was significantly lower (P<0.0001). Subsequently, a prediction model incorporating the most significant hypermethylated genes was developed. As a result, 20 of the 28 genes with a P-value below 0.20 were included in the multivariable logistic regression model. ROC analysis of the first model demonstrated an AUC of 0.87 for the detection of PC. When removing the 12 least significant genes from the model and employing the eight remaining genes in a new model, the AUC remained high at 0.86. Furthermore, the sensitivity and specificity of the second model were 76% and 83%, respectively. Lastly, the second model with 8 genes demonstrated a similar AUC of 0.86, a sensitivity of 73% and a specificity of 83% for the differentiation of stage I and II PC patients from healthy controls (75).

In the article by Matsubara and colleagues, a new combination of hollow fiber membrane-based low-molecular-weight (LMW) protein enrichment and LC-MS-based quantitative shotgun proteomics was used to compare the plasma proteome of 24 patients with pancreatic cancer with that of 21 healthy controls. From that training cohort, the best performing biomarker with the highest MS peak, chemokine ligand 7 (CXCL7), was selected and validated in a second cohort with 140 pancreatic cancer patients, 10 patients with chronic pancreatitis and 87 healthy controls. Firstly, the level of plasma CXCL7 was measured in 12 patients with pancreatic cancer and 12 healthy individuals of the training study. CXCL7 was found to be significantly lower in patients with pancreatic cancer when compared with CXCL7 levels in healthy controls (P=0.0003). Similar results were observed in the following validation study, where a reduction in CXCL7 levels was seen in pancreatic cancer patients as well. The most notable result, however, was that CXCL7 was also significantly decreased in stage I and II pancreatic cancer patients. This suggests that the alteration of CXCL7 levels is an early event in pancreatic carcinogenesis and may even predict the development of cancer. Additionally, it was confirmed that CXCL7 levels were significantly reduced in 10 patients with chronic pancreatitis (P=0.0002); they were slightly higher than the CXCL7 levels in pancreatic cancer patients, although the difference was not significant (P=0.095). Furthermore, the diagnostic performance of CXCL7 and CA19-9 was evaluated. CXCL7 was significantly reduced in pancreatic cancer patients with normal CA19-9 levels (37 U/mL was used as a cut-off value for CA19-9) and displayed an AUC of 0.853 in the training set and an AUC of 0.834 in the validation cohort. When CA19-9 and CXCL7 were combined, the ability to differentiate pancreatic cancer patients from healthy controls increased to an AUC of 0.965 in the training cohort
and to an AUC of 0.961 in the validation cohort. The sensitivity of this combination panel was 83% and 84% at a fixed specificity of 95% in the training and validation cohorts, respectively (76).

The next case-control study by Takayama et al. assessed the performance of regenerating islet-derived family member 4 (REG4) as a diagnostic biomarker for pancreatic cancer. Pre-therapeutic blood samples were collected from 92 patients with pancreatic cancer, of which 12 patients were diagnosed with stage I/II and 80 patients with stage III/IV pancreatic cancer, 16 patients with intraductal papillary mucinous adenoma (IPMA), 4 patients with endocrine tumor, 2 patients with acinar cell carcinoma, 2 patients with MCN, 3 patients with SCN, 1 patient with solid pseudopapillary tumor, 11 patients with chronic pancreatitis and 69 healthy controls. In the additional validation study, 39 patients with pancreatic cancer and 59 controls were included. By using ELISA, the serum levels of REG4 were first analyzed in 205 subjects. REG4 was significantly elevated in patients with pancreatic cancer when compared with the REG4 levels in healthy controls (P<0.001). REG4 displayed a diagnostic accuracy of 85.1% for the detection of pancreatic cancer at a cut-off value of 3.49 ng/mL. In the validation study, the sensitivity and specificity for the detection of pancreatic cancer were 99.4% and 64%, respectively, with an accuracy of 77.5%. To compare the diagnostic performance of REG4 with CA19-9, serum levels of CA19-9 were analyzed in the same population. Although the CA19-9 levels were significantly altered in patients with pancreatic cancer when compared with the CA19-9 levels in healthy controls, ROC analysis pointed out that REG4 outperformed CA19-9 with regard to discriminating patients with pancreatic cancer from healthy controls. REG4 displayed an AUC of 0.922 for the detection of pancreatic cancer while CA19-9 had an AUC of 0.884.
When serum levels of both REG4 and CA19-9 were analyzed for each stage of pancreatic cancer, significantly elevated levels of REG4 were seen in patients with early-stage pancreatic cancer (stages I and II) when compared with REG4 levels in healthy controls. In contrast with REG4, CA19-9 levels were significantly elevated in late-stage pancreatic cancer patients (stages III and IV), but not in early-stage patients. In addition, no correlation was found between serum REG4 and CA19-9 levels. Ultimately, combining CA19-9 and REG4 resulted in a sensitivity of 100%, a specificity of 60% and an accuracy of 77.5% for the detection of pancreatic cancer (77).
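The pattern seen when two markers are combined (higher sensitivity, lower specificity) is what one would expect if a subject is called positive when either marker is elevated. The sketch below illustrates this "either positive" rule on a small hypothetical cohort; the text does not state which combination rule was used in study (77), so the rule and all data here are assumptions.

```python
def sensitivity_specificity(test_positive, has_cancer):
    """Sensitivity and specificity from paired lists of booleans."""
    tp = sum(t and c for t, c in zip(test_positive, has_cancer))
    fn = sum((not t) and c for t, c in zip(test_positive, has_cancer))
    tn = sum((not t) and (not c) for t, c in zip(test_positive, has_cancer))
    fp = sum(t and (not c) for t, c in zip(test_positive, has_cancer))
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical cohort: four patients with cancer, four controls.
has_cancer = [True, True, True, True, False, False, False, False]
reg4_high = [True, False, True, False, False, True, False, False]
ca199_high = [False, True, True, False, False, False, False, True]

# Assumed combination rule: positive if either marker is elevated.
either_high = [a or b for a, b in zip(reg4_high, ca199_high)]

for name, result in [("REG4", reg4_high), ("CA19-9", ca199_high),
                     ("either", either_high)]:
    sens, spec = sensitivity_specificity(result, has_cancer)
    print(f"{name}: sensitivity={sens:.0%}, specificity={spec:.0%}")
```

Because every false positive of either marker becomes a false positive of the combination, the gain in sensitivity is paid for with specificity.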

In recent years, serum-based miRNAs have been extensively examined to find out whether they could be used as diagnostic biomarkers for cancer. In the case-control study by Wang et al., specific miRNAs in peripheral blood mononuclear cells (PBMCs) were identified and validated in three independent cohorts. The first phase of this study, the discovery phase, included 20 patients with pancreatic
cancer, 20 patients with benign pancreatic diseases (BPD) and 20 healthy participants. The following validation phase consisted of 129 patients with pancreatic cancer, 103 patients with BPD and 60 healthy individuals. In the discovery phase, five microRNAs were significantly upregulated in the pancreatic cancer group when compared with the expression levels of these five microRNAs in the BPD group and the control group: miR-27-a-3p, miR-16-5p, miR-15b-5p, miR-26a-5p and miR-342-5p. These five candidate biomarkers were further tested in the validation phase. During this second phase, three miRNAs were again significantly upregulated in patients with pancreatic cancer in comparison to the expression of these miRNAs in the healthy and BPD cohorts: miR-27-a-3p, miR-16-5p and miR-15b-5p. When these three miRNAs were further analyzed using a logistic regression model, only miR-27-a-3p could discriminate patients with pancreatic cancer from healthy participants and patients with BPD (P<0.001). ROC analysis was performed and miR-27-a-3p displayed an AUC of 0.857 with a sensitivity and specificity of 82.2% and 79.1%, respectively. Furthermore, miR-27-a-3p was able to differentiate patients with pancreatic cancer from the BPD subgroup (AUC=0.840, sensitivity=82.2%, specificity=76.7%) and from the healthy group (AUC=0.866, sensitivity=82.9%, specificity=83.3%). CA19-9 was also tested in the validation dataset and could distinguish patients with pancreatic cancer from the BPD group with an AUC of 0.788 (sensitivity=72.9% and specificity=75.7%). Multiple linear regression analyses showed that both pancreatic cancer and jaundice (or elevated levels of bilirubin) were predictors for serum CA19-9 and PBMC miR-27-a-3p expression in the validation set.
On top of that, significant differences in the distributions of gender, total bilirubin and fasting blood glucose were observed when comparing the pancreatic cancer group with the BPD group. Subsequently, these five variables (gender, total bilirubin, fasting blood glucose, miR-27-a-3p and CA19-9) were combined and analyzed in a multivariate logistic regression model. ROC analysis of this model resulted in an AUC of 0.886 with a sensitivity of 85.3% and a specificity of 81.6% for the detection of pancreatic cancer, an improvement in diagnostic performance over CA19-9 and miR-27-a-3p alone. Lastly, this panel displayed AUCs of 0.893 and 0.884 for the detection of stage I and II pancreatic cancer, respectively (78).

The last article regarding the identification and validation of biomarkers for pancreatic cancer in this systematic review describes a case-control study by Zhang et al. In this study, changes in the levels of unsaturated free fatty acids (FFAs) were measured using a special technique involving mass spectrometry, and these FFAs were evaluated as possible diagnostic biomarkers for the detection of pancreatic cancer. A total of 156 patients took part in this study, of which 95 patients were diagnosed with pancreatic cancer and 61 patients with pancreatitis. Furthermore, 205 healthy
controls were selected for this study. The patients were divided into a training set with 26 PC patients and 60 controls, and a validation set with 67 PC patients, 145 controls and 61 patients with pancreatitis. In the training set, alterations in levels of 6 FFAs between patients with pancreatic cancer and healthy controls were compared. It was found that the levels of C16:1, C18:3, C18:2, C20:4 and C22:6 were significantly decreased in the PC patients compared to the levels of these FFAs in the healthy controls. ROC analysis of these fatty acids in two panels was performed. Panel a consisted of C16:1, C18:3, C18:2, C20:4 and C22:6, and panel b incorporated only the ratios C18:2/C18:1 and C18:3/C18:1. The AUCs of C16:1, C18:2/C18:1, panel a and panel b were, respectively, 0.907, 0.907, 0.933 and 0.908 for the detection of pancreatic cancer. Furthermore, these four variables had a sensitivity of > 82% and a specificity of > 82%, except for C18:2/C18:1. In the validation set, significantly altered levels of C16:1 (P<0.001) or C18:1 (P<0.05) were found in the serum of normal controls when compared with the levels of C16:1 or C18:1 in patients suffering from pancreatitis. The levels of C16:1, C18:3, C18:2, C18:1, C20:4 and C22:6 in patients with pancreatic cancer were significantly lower than the levels of these FFAs in healthy controls (P<0.001) or in patients with pancreatitis (P<0.001), except for C16:1. The ratios C18:2/C18:1 and C18:3/C18:1 also displayed significant differences between patients with pancreatic cancer and patients with pancreatitis. Moreover, similar results were observed after ROC analysis was performed on the validation set. The AUCs of C16:1, C18:2/C18:1, panel a and panel b were higher than 0.84, and their sensitivities and specificities were greater than 80% and 74%, respectively. The single best performing FFA for the differentiation of patients with pancreatitis from healthy controls was C16:1 with an AUC of 0.814, a sensitivity of 88.3% and a specificity of 62.3%. Another notable result was that panel c or polyunsaturated fatty acids (PUFA), which is a combination of C18:3, C18:2, C18:1, C20:4 and C22:6, displayed the highest diagnostic ability to distinguish patients with pancreatic cancer from patients with pancreatitis, with an AUC of 0.900, a sensitivity of 73.1% and a specificity of 96.7%. In addition, ROC analysis demonstrated that C20:4, C18:2/C18:1, PUFA, panel a and panel b could differentiate patients with early-stage pancreatic cancer from healthy controls or patients with pancreatitis with AUCs of >0.80, sensitivities of >85% and specificities of >80%. Panel b demonstrated the highest diagnostic performance with an AUC of 0.912, a sensitivity of 86.7% and a specificity of 88.6%. The rest of the results can be consulted in additional table 6 (79).
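Panel b is built from two ratios of fatty acid levels rather than from the levels themselves. As a minimal sketch, the two ratios can be derived from a set of measured FFA levels as follows; the dictionary keys, the function name and the example concentrations are hypothetical and do not come from study (79).

```python
def panel_b_ratios(ffa_levels: dict) -> dict:
    """Compute the two ratios used in panel b from measured FFA levels."""
    c18_1 = ffa_levels["C18:1"]
    if c18_1 <= 0:
        raise ValueError("C18:1 level must be positive to form a ratio")
    return {
        "C18:2/C18:1": ffa_levels["C18:2"] / c18_1,
        "C18:3/C18:1": ffa_levels["C18:3"] / c18_1,
    }


# Hypothetical serum levels in arbitrary units.
sample = {"C16:1": 12.0, "C18:1": 100.0, "C18:2": 50.0, "C18:3": 5.0}
print(panel_b_ratios(sample))
```

Using ratios has the practical advantage that both values are normalized to C18:1 measured in the same sample, which may reduce the influence of variation in overall FFA concentration.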

3.7 BREAST CANCER

The first of seven studies concerning the evaluation of potential biomarkers for the detection of breast cancer (BC) was conducted by Atahan et al. The aim of this prospective cohort study was to assess
the diagnostic performance of three possible biomarkers for early-stage breast cancer: Bc1, Bc2 and Bc3. These biomarkers were identified in a previous study as three serum peaks that distinguished BC patients from healthy controls. A total of 27 breast cancer patients, 24 patients with benign breast diseases and 37 healthy controls were included in this study. Of the 27 patients with breast cancer, 18 patients had invasive ductal carcinoma, 6 patients had invasive lobular carcinoma and 3 patients had mixed type breast carcinoma. The benign breast disease group consisted of 7 patients with fibrocystic disease, 3 patients with lipoma, 3 patients with sclerosing adenosis, 9 patients with fibroadenoma, 1 patient with breast abscess and 1 patient with fat necrosis. In this study, the same three serum peaks were observed in the blood of these patients and the proteins were identified as Bc1, Bc2 and Bc3. Bc2 levels were found to be significantly elevated in patients with either malignant or benign lesions in comparison to Bc2 levels in the control group (P=0.002). Furthermore, the Bc2 levels were increased in the patients with BC when compared with the Bc2 levels in the healthy controls (P=0.003). However, ROC analysis revealed that the AUC of Bc2 for the detection of BC did not attain 0.70. Similarly to the results of Bc2, the levels of Bc1 were elevated in the sera of patients with BC when compared with levels of Bc1 in the patients with benign breast diseases and/or healthy controls (P=0.006 and P=0.015). Nevertheless, the AUC values of Bc1 were lower than 0.70 in all groups. Finally, Bc3 levels were elevated in the sera of patients with BC, but the difference in Bc3 levels between the BC group and the two control groups was not statistically significant (P=0.098 and P=0.134) (80).

In the study by Garczyk and colleagues, human anterior gradient 3 (AGR3) was evaluated as a possible prognostic or diagnostic biomarker for early-stage breast cancer. Serum samples were collected from 40 patients with BC and from 40 cancer-free women. In the first stage of this study, increased AGR3 expression was found in 63 breast tumor samples when compared with 13 normal breast tissues (P<0.05). Furthermore, AGR3 expression was significantly elevated in low-grade (G1) and intermediate-grade (G2) tumors, but it was not upregulated in high-grade tumors (G3). In the second stage, AGR3 expression was evaluated at the protein level using a tissue microarray consisting of 39 normal breast tissue samples and 190 BC tissue samples. The antibody used in the subsequent Western blot experiment showed excellent specificity for AGR3. Again, a significant over-expression of AGR3 (resulting in elevated levels of AGR3 protein) was found in tumor samples in comparison with normal breast samples (P<0.05). Interestingly, a highly significant over-expression of AGR3 protein in G1 and G2 breast tumor samples (P<0.05), compared with the expression of AGR3 in normal breast tissues, was observed (P<0.001). Furthermore, after dividing the cohort into multiple subgroups, the potential prognostic value of AGR3 was analyzed. AGR3 had a significant prognostic effect in the low-grade (G1) and intermediate-grade (G2) tumors, but not in the high-grade group (G3). The outcome of patients with high levels of AGR3 expression was negative (mean tumor-specific survival: 181.7 months) compared with the outcome of patients with low AGR3 levels (mean tumor-specific survival: 142.5 months). In the third and most important stage of this study, the diagnostic utility of AGR3 protein was evaluated in serum samples of patients with breast cancer. The majority of the cancer sera were collected from patients with low-grade breast cancer. Significantly elevated AGR3 serum levels were observed in breast cancer serum samples when compared with the control samples from healthy individuals (P<0.001). ROC analysis was performed to determine the diagnostic performance of AGR3 and revealed that differences in AGR3 levels can be used to discriminate between breast cancer patients and healthy women (AUC=0.718, 95% CI 0.606-0.830). The sensitivity and specificity of AGR3 for detecting breast cancer were calculated at 35% and 92.5%, respectively. In addition, the performance of AGR2 as a diagnostic biomarker for breast cancer was evaluated, as AGR2 is a paralogue of AGR3 and has been reported as a serum biomarker for ovarian, lung and prostate cancer. It was therefore not surprising that significantly increased levels of AGR2 were found in breast cancer sera when compared with the control samples (P<0.001). Subsequently, ROC analysis of AGR2 and of a combination of AGR2 and AGR3 revealed AUCs of 0.841 (95% CI 0.745-0.936) and 0.827 (95% CI 0.693-0.962), respectively. AGR2 achieved a sensitivity of 32.5% and a specificity of 90% for the detection of breast cancer, while the combination of AGR2 and AGR3 reached an increased sensitivity of 64.5% and a specificity of 89.5%.
Additional results can be consulted in additional table 7 (81).

In the case-control study by Gong et al., cell-free DNA in blood samples of patients with breast cancer was analyzed to investigate the diagnostic ability of glyceraldehyde 3-phosphate dehydrogenase (GAPDH) for breast cancer. A total of 200 patients with breast cancer, 100 patients with hyperplasia and 100 healthy women were included in this study. The malignant cases consisted of 8 patients in stage I, 86 patients in stage II, 93 patients in stage III and 13 patients in stage IV breast cancer. Among the patients with breast cancer, 122 women had invasive ductal carcinoma, 17 had invasive lobular carcinoma, 19 had medullary carcinoma, 9 had invasive carcinoma, 5 had intraductal carcinoma, 13 had accompanying metastasis of the lymph nodes and 15 suffered from other carcinoma. The study population was then randomly divided into a training cohort and a testing cohort. In the training stage, the optimal cut-off point for GAPDH was determined at 471 ng/mL. The overall accuracy, sensitivity, specificity, PPV and NPV of GAPDH for discriminating patients with breast cancer from healthy controls were 0.94, 0.95, 0.92, 0.96 and 0.90, respectively. The odds ratio was calculated at 218.5. In addition, the sensitivity, specificity, PPV and NPV of GAPDH for differentiating patients with breast cancer from patients with hyperplasia were 0.95, 0.90, 0.95 and 0.90, respectively, with an odds ratio of 171. The same cut-off point for GAPDH was used in the validation cohort. In the classification of breast cancer and healthy controls, GAPDH displayed an overall accuracy, sensitivity, specificity, PPV and NPV of 0.91, 0.89, 0.94 and 0.97, respectively. Furthermore, the odds ratio was calculated at 126.8 (95% CI 33.71-467.69). The overall accuracy, sensitivity, specificity, PPV and NPV of GAPDH for the differentiation of patients with breast cancer from healthy controls were 0.91, 0.89, 0.96, 0.98 and 0.81, respectively. The odds ratio was calculated at 194.1 (95% CI 41.34-911.70) (82).
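
The source article's method for deriving these odds ratios is not described here. As a plausibility check, the sketch below assumes the usual definition of the diagnostic odds ratio, namely the odds of a positive test among cases divided by the odds of a positive test among non-cases. Under that assumption, the reported sensitivities and specificities reproduce the reported odds ratios of 218.5 and 171, and a sensitivity of 0.89 with a specificity of 0.94 gives approximately 126.8. The function name is illustrative.

```python
def diagnostic_odds_ratio(sensitivity, specificity):
    """Odds of a positive test in cases divided by odds of a positive test in non-cases."""
    odds_positive_in_cases = sensitivity / (1 - sensitivity)
    odds_positive_in_non_cases = (1 - specificity) / specificity
    return odds_positive_in_cases / odds_positive_in_non_cases


# Sensitivity and specificity pairs as reported in the paragraph above.
training_vs_healthy = diagnostic_odds_ratio(0.95, 0.92)
training_vs_hyperplasia = diagnostic_odds_ratio(0.95, 0.90)
validation_vs_healthy = diagnostic_odds_ratio(0.89, 0.94)

print(f"training, BC vs healthy:     {training_vs_healthy:.1f}")
print(f"training, BC vs hyperplasia: {training_vs_hyperplasia:.1f}")
print(f"validation, BC vs healthy:   {validation_vs_healthy:.1f}")
```

Small deviations from the published values are to be expected, since the inputs are rounded to two decimals.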

The human cytosolic thioredoxin (Trx1) plays a role in regulating cell proliferation, differentiation and apoptosis. Previous reports found upregulated Trx1 levels in patients with breast cancer and reported that elevated Trx1 levels were associated with breast cancer stage. In the study by Park and colleagues, Trx1 levels were analyzed in sera of patients with breast cancer to evaluate the potential of Trx1 as a biomarker for early detection of breast cancer. In order to determine whether Trx1 is a breast cancer-specific biomarker, patients with various types of cancer were included in this study. A total of 302 patients took part, of whom 197 had breast cancer, 11 had NSCLC, 64 had colorectal carcinoma and 30 had kidney carcinoma. In addition, 100 healthy individuals were included as controls. Serum Trx1 levels were significantly higher in the breast cancer group than in other cancer patients or in the control group. Although the differences in Trx1 levels among the groups of patients with different types of cancer were not significant (P>0.05), the differences between BC and other cancers were (P<0.0001). ROC analysis was performed and Trx1 demonstrated an AUC of 0.911 for differentiating BC patients from healthy controls, with a sensitivity of 89.3% and a specificity of 78% at a cut-off value of 32.390 ng/mL. These results indicated that Trx1 could be used as a diagnostic breast cancer biomarker with a high sensitivity and specificity. In addition, Trx1 demonstrated AUC values of >0.75 for the differentiation of breast cancer patients from other cancer patients, with sensitivities and specificities of more than 75% and 55%, respectively. This result suggested that Trx1 is a breast cancer-specific biomarker.
In order to confirm the proportional correlation of Trx1 with the progression of breast cancer, the levels of Trx1 in breast cancer patients were compared with the levels of Trx1 in lung cancer patients. Serum Trx1 levels appeared to be associated with breast cancer progression in a pattern similar to that found in the progression of lung cancer. These results suggested that the increase in serum Trx1 levels with cancer progression is due to the increase in oxidative stress that is associated with cancer development. However, Trx1 levels were far higher in breast cancer patients than in lung cancer patients, which suggests that the marked increase in the breast cancer group is another sign of Trx1 being breast cancer-specific. To compare the diagnostic performance of Trx1 in breast cancer patients with that of other biomarkers, CEA and CA15-3 levels were also measured in samples from the same study population. Although both CEA and Trx1 levels were elevated in breast cancer patients when compared with healthy controls, the increase in Trx1 levels was significantly higher than that of CEA. ROC analysis of both biomarkers was performed, and Trx1 had an AUC that exceeded 0.83 for the detection of BC patients in stage I-III (sensitivity: 82% and specificity: 72%), while CEA displayed an AUC of 0.678 (sensitivity: 54.4% and specificity: 77.6%) at a cut-off value of 8.28 ng/mL. The superiority of Trx1 as a diagnostic biomarker was attributed to its higher sensitivity, although the specificity of Trx1 was lower than that of CEA. On top of that, the AUC of Trx1 for the detection of stage I breast cancer was 0.837, whereas the AUC of CEA reached a mere 0.594. With regard to the comparison of Trx1 and CA15-3, the serum levels of CA15-3 were not only increased in patients with breast cancer, but also showed a proportional correlation with the progression of breast cancer, similarly to Trx1. In addition, ROC curve analysis showed that CA15-3 had an AUC value of 0.719 with a sensitivity of 48.6% and a specificity of 89.8%. More specifically, the AUC of CA15-3 for the detection of stage I breast cancer was measured at 0.693 with a sensitivity of 50% and a specificity of 83.7%.
As a result, this study demonstrated that the AUC and sensitivity of Trx1 were higher than those of CA15-3 in patients with any stage of breast cancer whereas the specificity of Trx1 was lower than that of CA15-3 (83).

In the next study, by Zhang and colleagues, a five-marker panel based on support vector machine (SVM) analysis was evaluated as an early detection tool for breast cancer. The study population consisted of 67 patients with breast cancer and 63 healthy controls. These participants were divided into a training group and a testing group. Firstly, 42 possible biomarkers were tested in the training group. Subsequently, an SVM model with 5-fold cross-validation was developed based on these 42 biomarkers. Using this model, an AUC of 1.0 with a sensitivity of 100% and a specificity of 93.8% for the detection of breast cancer was reached in the training group. However, in the testing group, a lower performance was observed, with an AUC of 0.58, a sensitivity of 63.6% and a specificity of 51.6%. For that reason, a new SVM-based model with five biomarkers was created after comparing the AUCs of the best four 5-marker panels with the AUCs of four randomly chosen 5-marker panels. This new panel consisted of the five genes PCDHGA8, LEFTY2, CACNG6, BCAR3 and CYP21A2. ROC analysis of this new panel revealed an AUC of 0.9053 in the training group and an AUC of 0.7879 in the testing group. Although the accuracy of this panel was just 68.75%, the sensitivity markedly improved to 85.29% in the training group and 72.41% in the testing group. Lastly, the specificity of this panel was calculated at 81.25% in the training group and 74.19% in the testing group (84).
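
The 5-fold cross-validation mentioned above can be sketched as follows. The training samples are shuffled and split into five folds; each fold serves once as the validation set while the model is fitted on the other four. This is only a sketch of the splitting step under stated assumptions: the training-group size of 66 is hypothetical (the size is not reported here), the function name is illustrative, and the SVM fitting itself is left out.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, validation_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of (almost) equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, validation_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, validation_idx


n_training_samples = 66  # hypothetical size, not taken from the study
splits = list(k_fold_indices(n_training_samples, k=5))

for fold_number, (train_idx, validation_idx) in enumerate(splits, start=1):
    print(f"fold {fold_number}: {len(train_idx)} training, {len(validation_idx)} validation")
```

Cross-validation only reuses the training data, so a large gap between training and testing performance, such as the drop from an AUC of 1.0 to 0.58 reported for the 42-marker model, can still occur when the model is applied to an independent group.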

Zhang and colleagues set out to measure the levels of serum miRNA-205 in breast cancer and to explore its potential as a diagnostic biomarker for the early detection of breast cancer and other cancers through meta-analysis. The study was divided into two phases. In the first phase, the diagnostic performance of miR-205 for breast cancer was assessed. Blood samples were collected from 25 patients with stage I breast cancer, 33 patients with stage II breast cancer and 93 healthy women. In the second phase, a meta-analysis was performed to evaluate the diagnostic performance of miR-205 in serum samples from patients with various types of cancer. In the first phase, the normalized levels of miR-205 in serum were significantly higher in healthy controls than in breast cancer patients (P<0.0001), indicating that miR-205 was under-expressed in breast cancer patients. Furthermore, a significant difference was observed between the level of miR-205 expression in stage I breast cancer patients and that in healthy controls (P<0.001). ROC analysis was performed to evaluate the diagnostic performance of miR-205 and revealed an AUC of 0.84 (95% CI 0.77-0.91) for the detection of breast cancer. Additionally, the sensitivity and specificity were calculated at 86.2% and 82.8%, respectively. These results demonstrated that miR-205 was a moderately high-performing diagnostic biomarker for breast cancer. In the second phase, the meta-analysis, a forest plot was constructed for all studies to assess the diagnostic utility of miR-205 in various types of cancer. The corresponding parameters were determined using a random-effects model. The resulting sensitivity for all types of cancer was 75%, the specificity was 84% and the odds ratio was 16. The AUC of the overall SROC curve was calculated at 0.87 (95% CI 0.84-0.90).
These results confirmed the outcome of the first phase and demonstrated that miR-205 has a high accuracy in terms of detecting cancer (85).

3.8 GASTRIC CANCER

The first of five studies concerning biomarkers for gastric cancer in this systematic review was conducted by Jing et al. The aim of this study was to evaluate a panel of well-studied biomarkers (CEA, CA19-9, CA24-2, AFP, CA72-4, SCC and TPA) for the detection of upper gastrointestinal tract (GIT) cancer. A total of 573 patients with upper GIT cancer were enrolled in this study. Of these patients, 127 had esophageal cancer (EC), 264 had gastric cancer (GC) and 182 had cardia carcinoma. In the 573 cases, the sensitivities of CEA, CA19-9, CA24-2, AFP, SCC, CA72-4, TPA and TPS for the detection of upper GIT cancer were calculated at 26.80%, 36.15%, 42.89%, 2.84%, 25.39%, 34.59%, 34.15% and 30.89%, respectively. Combining CEA, CA19-9, CA24-2 and CA72-4 resulted in a higher sensitivity and specificity for GC and cardia carcinoma, but a panel that included CEA, CA19-9, CA24-2 and SCC displayed the best sensitivity and specificity for detecting EC. Furthermore, SCC was the most sensitive biomarker for the detection of squamous cell carcinomas. CEA and CA72-4 levels were significantly elevated in the male group when compared with the female group, while levels of CA19-9 and CA24-2 were significantly higher in the female group. On top of that, elevated serum levels of CEA were detected in the high-age group (P<0.05). The participants were followed up until three years after surgery. During follow-up, the levels of CEA, CA19-9, CA24-2, CA72-4 and SCC were significantly decreased three months after surgery (P<0.005). In case of metastasis or recurrence of the cancer, the levels of these biomarkers rose significantly again. During follow-up, 138 patients died. Preoperatively elevated levels of CA24-2, CA72-4 and SCC predicted poor survival. Multivariate analysis revealed that these three biomarkers were prognostic factors for cardia carcinoma, GC and EC (86).

The next article, by Tao and colleagues, describes a case-control study that explores the potential use of REG4 for the prognosis and diagnosis of gastric cancer. Currently, two cancer biomarkers are available for gastric cancer: CEA and CA19-9. However, neither of these biomarkers can serve as an early detection tool for gastric cancer, since preoperative positivity of these markers is correlated with an advanced tumor stage at the time of diagnosis. For that reason, new diagnostic biomarkers that are more specific than CA19-9 and CEA are urgently needed. Serum and tissue samples from 70 healthy donors and 102 patients with gastric cancer were obtained for this study. In the first phase of this study, REG4 mRNA expression was measured in tissue samples of gastric cancer. The results demonstrated that the concentration of REG4 was significantly higher in gastric cancer mucosa than in normal mucosa (P<0.001). In the second phase, REG4 levels were measured in serum samples from patients with and without gastric cancer. REG4 concentrations were elevated in GC patients when compared with healthy individuals. In addition, ROC analysis revealed that REG4 had an AUC of 0.798 for the detection of GC. In order to compare the diagnostic ability of REG4 for GC with that of the conventional biomarkers, the levels of CEA and CA19-9 were measured in the same study population. Patients with serum CEA > 5 ng/mL or CA19-9 > 37 U/mL were considered positive for gastric cancer. The optimal cut-off value for REG4 was calculated at 2.053 ng/mL. With regard to the detection of stage I gastric cancer, REG4 displayed a sensitivity of 44%, which was significantly higher than the sensitivity of CEA (16%) and CA19-9 (8%). These results indicated that REG4 could be a better serum biomarker for the early diagnosis of GC than CEA or CA19-9 (87).

The next study, by Chen, discusses the potential of multiple miRNAs as diagnostic biomarkers for early-stage gastric cancer. In the first phase of this case-control study, 29 differentially expressed miRNAs were selected and analyzed in tissue samples of 305 patients with GC and in 41 non-cancer gastric tissue samples. ROC analysis of these 29 miRNAs was performed to evaluate their predictive values. Nine miRNAs displayed an AUC > 0.90: miR-21 (AUC: 0.993, sensitivity: 96.8%, specificity: 95.10%), miR-196a-1 (AUC: 0.948, sensitivity: 94.30%, specificity: 82.90%), miR-146b (AUC: 0.935, sensitivity: 91.11%, specificity: 78.05%), miR-17 (AUC: 0.909, sensitivity: 77.46%, specificity: 90.24%), miR-181a-1 (AUC: 0.931, sensitivity: 82.26%, specificity: 87.80%), miR-1-2 (AUC: 0.903, sensitivity: 78%, specificity: 84.40%), miR-139 (AUC: 0.930, sensitivity: 87.80%, specificity: 84.80%), miR-133b (AUC: 0.909, sensitivity: 85.36%, specificity: 84.10%) and miR-133a-2 (AUC: 0.905, sensitivity: 76.75%, specificity: 92.68%). The first six of these miRNAs were upregulated; the rest were found to be downregulated. Furthermore, miR-21, miR-146b and miR-17 levels were significantly correlated with tumor stage (P<0.05). MiR-133a-2 was clearly associated with race, tumor stage, tumor pathology, metastasis and tumor grade (P<0.05). In addition, combinations of the best-performing miRNAs and their targeted mRNAs were incorporated in different panels to maximize the diagnostic accuracy for gastric cancer. These targeted mRNAs corresponded to three different genes: FOS, BCL-2 and KAT2B. The combination of miR-181a-1 with KAT2B resulted in an AUC of 0.960 with a sensitivity of 96.55% and a specificity of 86.21% at a cut-off value of -0.9323, whereas the panel that included miR-139 and BCL-2 demonstrated an AUC of 0.942 with a sensitivity of 86.20% and a specificity of 93.10% at a cut-off value of 0.3485.
In the second phase of this study, the best-performing miRNA panels from the first phase were validated in plasma samples of 25 patients with GC and 18 healthy controls. The panel that incorporated miR-181a-1 and KAT2B displayed a sensitivity of 95.84% and a specificity of 94.12% at a cut-off value of 3.2955, and the AUC value was higher than 0.95. The other biomarker panel disappointed, with an AUC of 0.45, a high sensitivity of 100% but a low specificity of 41.18% at a cut-off value of 0.0528. These results suggested that the panel consisting of miR-181a-1 and KAT2B could become an important early detection tool for gastric cancer in the future (88).

In Japan, where the prevalence of cancer is higher than in Europe, photofluorography and the pepsinogen test, which consists of analyzing the levels of PG I and PG II in serum, were introduced into the screening program for gastric cancer in order to reduce the number of deaths caused by gastric cancer. The pepsinogen (PG) test seems to be safe, has no side effects and is easy to perform at a very low cost. As this test had never been analyzed in a European population, Lomba-Viana et al. evaluated the PG test in Portugal, where gastric cancer has the highest incidence in Europe. The aim of the screening study was to explore the use of the pepsinogen test in the detection of early-stage gastric cancer. A positive pepsinogen test was followed by upper gastrointestinal endoscopy to confirm or exclude the diagnosis of gastric cancer. In the first phase, volunteers between 40 and 79 years old were recruited for a cross-sectional study. In the second phase, a five-year prospective cohort study that enrolled the participants of the cross-sectional study was carried out. In the cross-sectional study, a total of 13118 individuals were included. A PG test was positive if PGI < 70 ng/mL and PGI/PGII < 3, and individuals with positive PG tests were subjected to a gastroscopy with biopsies. Of these participants, 446 displayed a positive PG test (3.4%), of whom 274 agreed to undergo a gastroscopy annually. Moreover, all participants underwent two endoscopies with a median follow-up of three years. H. pylori was detected during gastroscopy in 96% of participants with a positive PG test. Six gastric cancers were diagnosed: three advanced lesions and three early lesions. Four gastric cancers were discovered within the first year and two between the 2nd and 5th year of follow-up. Three cases of adenocarcinoma were detected in the group with a negative PG test during follow-up. These three individuals were all taking proton-pump inhibitors. As a result, the PG test displayed a sensitivity of 67%, a specificity of 47%, a PPV of 2% and an NPV of 99% (89).
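
The reported sensitivity and PPV can be traced back to the counts given above. The sketch below assumes that all six diagnosed cancers were found among the 274 PG-positive participants who underwent gastroscopy, and that the three adenocarcinomas in the PG-negative group were the only missed cases; both are assumptions made for illustration and are not stated explicitly in the text. The variable names are illustrative.

```python
# Counts taken from the paragraph above.
cancers_pg_positive = 6          # cancers found after a positive PG test
cancers_pg_negative = 3          # cancers found after a negative PG test
gastroscopies_pg_positive = 274  # PG-positive participants who underwent gastroscopy

# Sensitivity: share of all cancers that had a positive PG test.
sensitivity = cancers_pg_positive / (cancers_pg_positive + cancers_pg_negative)

# PPV: share of examined PG-positive participants who had cancer.
ppv = cancers_pg_positive / gastroscopies_pg_positive

print(f"sensitivity = {sensitivity:.0%}")
print(f"PPV         = {ppv:.0%}")
```

The very low PPV reflects the low prevalence of gastric cancer in a screening population: even among participants with a positive test, only a small minority have the disease.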

In the study by Tong et al., the utility of ten serum biomarkers for the detection of gastric cancer was assessed. Serum samples were collected from 285 patients with newly diagnosed primary gastric adenocarcinoma and from 238 healthy controls. The biomarkers measured in these serum samples consisted of IgG to H. pylori, PGI, PGII, PGI/PGII ratio, beta-catenin, COX2, ADAM8, VEGF, EGFR, ICAM and HB. First, these biomarkers were analyzed in a training group of 228 gastric cancer patients. Subsequently, the five best-performing biomarkers were selected and a panel was constructed using logistic regression, random forest (RF) and support vector machine (SVM) analysis. Secondly, this five-biomarker panel was validated in a test group of 57 gastric cancer patients. In the first phase, the levels of eight biomarkers were found to be significantly elevated in gastric cancer patients (P<0.05); the exceptions were EGFR (P=0.1951) and ICAM (P=0.2627). Furthermore, significant correlations were found between beta-catenin, COX2 and VEGF. Forest plotting determined that IgG to H. pylori, ADAM8, PGI, PGII and VEGF were the best predictive biomarkers for gastric cancer. Three different algorithms were employed to combine these five biomarkers in different ways. When the RF-based and SVM-based algorithms were applied to the validation group, the sensitivity of this five-biomarker panel was calculated at 86% and 88.6%, with a specificity of 78.4% and 83.2%, respectively (90).

The last article discussing diagnostic biomarkers for the detection of gastric cancer in this systematic review was written by Zhang and colleagues. In this study, another miRNA, more specifically miRNA-421 in gastric juice, was tested as a potential screening biomarker for gastric cancer. Gastric juice samples were obtained from 42 patients with gastric cancer, 34 patients with gastric ulcers, 18 patients with atrophic gastritis and 47 healthy controls. Among the patients with gastric cancer, seven were classified as early-stage and 35 as advanced-stage gastric cancer patients. First, the levels of miR-421 were analyzed in endoscopic mucosa biopsies. A significant overexpression of miR-421 was found in gastric cancer tissue when compared with normal mucosa tissue (P<0.001). Secondly, the study demonstrated that gastric juice-based miRNAs remained stable during storage and had a highly reproducible detection rate. Thirdly, the levels of miR-421 were measured in the gastric juice of patients with gastric cancer and in the gastric juice of healthy controls. The expression levels of miR-421 were lower in the gastric juice of patients with GC than in that of patients with benign gastric diseases. Subsequently, ROC analysis was performed to evaluate the diagnostic ability of miR-421. The AUC of miR-421 was calculated at 0.767 (95% CI 0.684-0.850), with a sensitivity of 71.4% and a specificity of 71.7%. The optimal cut-off value of miR-421 in gastric juice was determined at 5.21. The sensitivity of miR-421 appeared to be higher than that of CEA in gastric juice, which displayed a sensitivity of 69.04%. Furthermore, no significant difference in gastric juice miR-421 levels was observed among patients with different stages of gastric cancer (P=0.807).
Lastly, miR-421 levels were evaluated in gastric juice samples from patients with early-stage gastric cancer. The sensitivity of gastric juice miR-421 was 71.4% for the detection of both early-stage and advanced-stage gastric cancer. Gastric juice CEA showed sensitivities of 42.8% and 74.3% for the detection of early-stage and advanced-stage gastric cancer, respectively. These results demonstrated that gastric juice miR-421 had a higher sensitivity than gastric juice CEA for the detection of early-stage gastric cancer, whereas the two markers performed similarly in advanced-stage gastric cancer. When gastric juice CEA and miR-421 were combined in a panel, the sensitivity for the detection of early-stage gastric cancer improved significantly from 71.4% to 85.7%, with a specificity of 88.5% (91).
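
The early-stage sensitivities in this study rest on only seven patients, which is worth keeping in mind when interpreting them. The short calculation below infers the number of detected patients behind each reported percentage; these counts are an assumption derived from the percentages and the group sizes (7 early-stage and 35 advanced-stage patients) and are not taken from the source article. The function and variable names are illustrative.

```python
def sensitivity_pct(detected, total):
    """Sensitivity as a percentage: detected cases divided by all cases."""
    return 100.0 * detected / total


EARLY, ADVANCED = 7, 35  # group sizes reported in the study

# Detected counts inferred from the reported percentages (assumption).
inferred_counts = {
    "miR-421, early-stage": (5, EARLY),
    "miR-421, advanced-stage": (25, ADVANCED),
    "CEA, early-stage": (3, EARLY),
    "CEA, advanced-stage": (26, ADVANCED),
    "miR-421 + CEA, early-stage": (6, EARLY),
}

for label, (detected, total) in inferred_counts.items():
    print(f"{label}: {detected}/{total} = {sensitivity_pct(detected, total):.1f}%")

# With seven patients, one additional detected case shifts the sensitivity by this much.
step_per_patient = sensitivity_pct(1, EARLY)
print(f"one early-stage patient = {step_per_patient:.1f} percentage points")
```

Under these assumptions, the improvement from 71.4% to 85.7% corresponds to a single additional early-stage patient being detected, and 3/7 is approximately 42.9%, which the study reports as 42.8%.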

3.9 RENAL CANCER

As screening with abdominal imaging (CT and MRI) for asymptomatic renal tumors is not cost-effective and does not permit the radiologist to differentiate benign from malignant kidney tumors, diagnostic biomarkers for early detection of renal cancer could offer a solution. In a prospective cohort study designed by Morrissey, urine aquaporin-1 (AQP1) and urine perilipin-2 (PLIN2) were tested as possible screening biomarkers for renal cell carcinoma (RCC). A total of 720 patients who were scheduled to undergo abdominal CT for a variety of diseases took part in this study. Furthermore, urine samples were obtained from 19 patients with a presumptive diagnosis of RCC based on a CT-imaged renal mass and from 80 healthy controls. The CT cohort of 720 participants was divided into a group with no history of cancer (N=334) and a group with a history of cancer (N=386). Urine AQP1 and PLIN2 were measured, and urine AQP1 levels were found to be significantly elevated in patients with RCC when compared with healthy controls, with CT screening patients without a history of cancer and with CT screening patients with a history of cancer (all P<0.001). Furthermore, the urine AQP1 concentrations in CT screening patients with a history of cancer were not significantly different from those in healthy controls (P=0.08), and the urine AQP1 concentrations in CT screening patients with and without a history of cancer were not significantly different from each other (P=0.43). Moreover, urine PLIN2 levels were significantly higher in patients with RCC than in the healthy controls and in the CT cohorts with or without a history of cancer (all P<0.001). Similarly to the results for urine AQP1, no differences in urine PLIN2 levels were found between CT screening patients with or without a history of cancer and the healthy controls, or between the two CT screening groups.
In addition, the concentrations of urine AQP1 and PLIN2 were correlated with tumor size in the 19 patients with RCC. When the levels of AQP1 and PLIN2 in the urine of RCC patients were compared with those in patients with various other types of cancer (lung, prostate, CRC, gastrointestinal, uterine, ovarian, pancreatic and breast cancer), AQP1 and PLIN2 were found to be significantly elevated in patients with RCC. ROC analysis revealed that urine AQP1 had an AUC of 1.00, with a sensitivity and specificity of 100% each, for differentiating RCC patients from healthy controls at an optimal cut-off value of 7 ng/mg urine creatinine. Urine PLIN2 displayed an AUC of 0.990 (95% CI 0.980-1.00) with a sensitivity of 100% and a specificity of 91% at a cut-off value of 14 absorbance units/mg creatinine. For distinguishing patients with RCC from the 720 screened patients, urine AQP1 demonstrated a sensitivity of 100%, a specificity of 96% and an AUC of 0.991 (95% CI 0.980-1.00) at a cut-off value of 96 ng/mg creatinine. PLIN2 displayed similar results, with a sensitivity of 100%, a specificity of 98% and an AUC of 0.996 (95% CI 0.992-1.00) at a cut-off value of 9.8 absorbance units/mg creatinine. Internal validation by resampling the population data 200 times resulted in an overall AUC of 0.990 for AQP1 and an AUC of 0.997 for PLIN2. A logistic regression model that combined AQP1 and PLIN2 resulted in a sensitivity of 95%, a specificity of 98% and an AUC of 0.990 (92).
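
The internal validation step can be illustrated with a short sketch. It assumes that "resampling the population data 200 times" refers to a bootstrap, that is, drawing samples of the original size with replacement and recomputing the AUC each time; the exact procedure of the source article is not described here. The function names and all marker values are hypothetical and invented for illustration only.

```python
import random


def auc(cases, controls):
    """Share of case-control pairs in which the case has the higher value (ties count half)."""
    pairs = [(c, h) for c in cases for h in controls]
    return sum(1.0 if c > h else 0.5 if c == h else 0.0 for c, h in pairs) / len(pairs)


def bootstrap_aucs(cases, controls, n_resamples=200, seed=0):
    """Recompute the AUC on n_resamples samples drawn with replacement."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        resampled_cases = rng.choices(cases, k=len(cases))
        resampled_controls = rng.choices(controls, k=len(controls))
        estimates.append(auc(resampled_cases, resampled_controls))
    return estimates


# Hypothetical urine marker values, invented for illustration only.
rcc_values = [35.0, 41.2, 28.4, 50.1, 33.3, 9.5]
control_values = [2.1, 3.4, 1.8, 10.2, 2.9, 4.0, 3.1, 2.5]

estimates = bootstrap_aucs(rcc_values, control_values)
mean_auc = sum(estimates) / len(estimates)
print(f"mean AUC over {len(estimates)} resamples: {mean_auc:.3f}")
```

The spread of the resampled AUC values gives an impression of how strongly the estimate depends on the particular patients included, which matters when the case group is as small as the 19 RCC patients in this study.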

All over the world, serum miRNAs are being tested as potential diagnostic biomarkers for the detection of various types of cancer. Fedorko et al. therefore explored the utility of two novel miRNAs, miR-210 and miR-378, as potential biomarkers for RCC in a case-control study. The study group consisted of 195 patients who were diagnosed with RCC and were undergoing radical nephrectomy, and 100 cancer-free controls. Furthermore, serum samples of 20 patients with RCC were collected at one week and at three months after nephrectomy. The expression of miR-378 and miR-210 was measured, and elevated expression of both miRNAs was observed in patients with RCC compared with the healthy controls (P<0.0001). ROC analysis revealed that both miR-378 and miR-210 could serve as potential biomarkers for differentiating RCC patients from healthy controls, with AUCs of 0.82 (95% CI 0.77-0.86) for miR-378 and 0.74 (95% CI 0.69-0.80) for miR-210. Moreover, significantly elevated levels of miR-378 and miR-210 were found in patients with early-stage RCC when compared with healthy controls (P<0.001). When miR-378 and miR-210 were combined in a panel, the diagnostic accuracy improved, with an AUC of 0.85 (95% CI 0.81-0.89), a sensitivity of 80% and a specificity of 78%. Another interesting result was that the expression levels of both miRNAs were significantly lower in patients three months after radical nephrectomy (P<0.0001). Furthermore, elevated serum miR-378 expression levels were correlated with progression of RCC in this study. Another significant correlation was found between elevated serum miR-378 expression level and disease-free survival. However, no difference in miRNA expression level was seen among the different RCC histological subtypes (93).

The last study discussing potential new biomarkers for early detection of RCC was conducted by Mustafa. In this case-control study, serum amino acid profiles were investigated as potential prognostic biomarkers for RCC. The study group included 189 patients with RCC and 104 age- and sex-matched controls. Amino acids were measured in each serum sample, and thirteen of the 26 amino acids (taurine, threonine, serine, asparagine, glutamate, glycine, alanine, citrulline, methionine, tyrosine, ornithine, phenylalanine, histidine and proline) were significantly decreased in RCC patients, whereas two of the 26 amino acids, arginine and cysteine, were elevated. Of these 15 amino acids, and based on correlations between different amino acids, nine amino acids showed the most significant differences between cases and controls: threonine, alanine, isoleucine, leucine, ornithine, histidine, arginine, cysteine and alpha-aminobutyrate. Subsequently, a logistic regression model was created to identify which of these amino acids had significant predictive value (P<0.05) for differentiating RCC patients from healthy controls. The final model consisted of eight different amino acids (cysteine, ornithine, histidine, tyrosine, proline, lysine, leucine and valine). ROC analysis of this model revealed an AUC of 0.81. Furthermore, 10-fold cross-validation was performed on the sample set and the mean AUC was not significantly different (0.79). In addition, the AUC of the panel for distinguishing stage I and II RCC patients from healthy controls was calculated at 0.76. When the relation between the model score and the time to recurrence was examined in patients whose tumor had been surgically removed, patients with lower regression scores appeared to have significantly fewer recurrences and increased overall survival compared with those with higher regression scores (P=0.003 and P=0.006). The rest of the results can be consulted in additional table 9 (94).

3.10 GYNECOLOGIC CANCER

Only one study concerning endometrial cancer was included in this systematic review, as it was the only study that met the inclusion criteria, despite the fact that the number of patients in this article was limited. The prospective cohort study of Kemik et al. examined the efficiency of YKL-40, HE-4 and DKK-3 as diagnostic and/or prognostic biomarkers for endometrial cancer. The study group consisted of 50 patients diagnosed with endometrial cancer after biopsy, whereas the control group included 50 women with gynecologic symptoms such as abnormal uterine bleeding, but with a normal endometrial histology. The serum levels of CA-125, HE-4, YKL-40 and DKK-3 were measured and compared between the two groups. All patients with endometrial cancer underwent surgery and the tumors were removed. The histological subtypes of these tumors were 72% endometrioid adenocarcinoma and 28% non-endometrioid carcinoma. Furthermore, 33 patients were diagnosed with stage I disease, 11 with stage II disease and 6 with stage III disease. The results indicated that CA-125, YKL-40 and HE-4 serum levels were significantly increased in the cancer group compared to the control group (P<0.0001). The difference in DKK-3 levels between the two groups was not statistically significant (P=0.454). Multivariate analysis revealed similar results, with significantly higher CA-125, YKL-40 and HE-4 serum levels in the patients with endometrial cancer. In addition, no differences in the levels of CA-125, YKL-40 and DKK-3 were found between the patients with different stages of endometrial cancer. However, there was a difference in HE-4 levels between patients with different stages of endometrial cancer; serum HE-4 was significantly increased in stage II and III patients. Moreover, there was a significant difference in the levels of YKL-40 between the endometrioid group and the non-endometrioid group (P=0.022). HE-4 levels were also significantly raised in patients with myometrial invasion deeper than 50% and in patients with positive lymphovascular space invasion (P=0.013 and P<0.0001). ROC analysis was performed for CA-125, YKL-40, HE-4 and DKK-3. The AUC of YKL-40 for the differentiation of patients with endometrial cancer from controls was 0.823, which meant YKL-40 has a high diagnostic value. At a cut-off value of 61.97 ng/mL, a sensitivity and specificity of 90% and 54% were observed. The diagnostic value of HE-4, on the other hand, was significantly higher with an AUC of 0.882. At a cut-off value of 38.95 pmol/L, a sensitivity and specificity of 90% and 42% were found (95).
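
Sensitivity and specificity at a given cut-off value, as reported for YKL-40 and HE-4, are obtained by counting how many cases and controls fall on either side of that cut-off. The sketch below illustrates this under the assumption that a value at or above the cut-off counts as a positive test. The serum levels are hypothetical and do not come from the study; only the cut-off of 61.97 ng/mL is taken from the text.

```python
def sens_spec_at_cutoff(case_values, control_values, cutoff):
    """Return (sensitivity, specificity) for a 'value >= cutoff is positive' rule."""
    true_positives = sum(1 for value in case_values if value >= cutoff)
    true_negatives = sum(1 for value in control_values if value < cutoff)
    sensitivity = true_positives / len(case_values)
    specificity = true_negatives / len(control_values)
    return sensitivity, specificity


# Hypothetical serum levels in ng/mL (illustrative only, not study data).
cases = [55.0, 70.0, 95.0, 120.0, 150.0]
controls = [30.0, 45.0, 60.0, 65.0, 80.0]
sens, spec = sens_spec_at_cutoff(cases, controls, cutoff=61.97)
print(sens, spec)
```

Lowering the cut-off in this sketch raises sensitivity at the expense of specificity, which is the trade-off that a ROC curve summarizes over all possible cut-off values.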

The aim of the prospective cohort study by Duvlis and colleagues was to compare the clinical and prognostic value of HPV E6/E7 mRNA as an early detection tool for cervical cancer with that of HPV DNA detection and cytology in the triage of women for cervical cancer. Presently, cytomorphological examination of cervical smears is widely used as a screening test for the detection of cervical cancer and of its precursors, cervical intraepithelial neoplasia (CIN). However, it is not the ideal screening method due to its low sensitivity of approximately 55% for the detection of high-grade CIN. Furthermore, the high-risk HPV DNA test has improved cervical cancer screening, but its specificity and PPV for cervical cancer are low among a young screening population with frequent transient HPV infections. For these reasons, it is reported that HPV E6/E7 mRNA testing could be a more specific diagnostic test for cervical cancer or its precursors. The study population consisted of 413 women, 138 of whom had positive HPV DNA tests for the five most common types (HPV 16, 18, 31, 33 and 45). All participants underwent cytological, HPV and colposcopic examination. Of these 413 women, 258 were negative for intraepithelial lesion or malignancy (NILM), 26 women had atypical squamous cells of undetermined significance (ASC-US, with two CINI patients), 81 women had low-grade intraepithelial lesions (LSIL, with 16 CINI and two CINII) and 41 women had high-grade intraepithelial lesions (HSIL, with five CINI, 17 CINII and 9 CINIII). Biopsies of 61 women were taken during colposcopy. It was determined that 10 women had a normal biopsy, 22 women had CINI, 20 women had CINII and 9 women had CINIII. The HPV E6/E7 mRNA test was positive in 74 participants. The most common HPV type was HPV 16. Subsequently, the diagnostic performance of the DNA test and that of the mRNA test for HPV were compared. The results indicated that the DNA-based HPV test had a higher sensitivity (90.2%) than the HPV E6/E7 mRNA test (73.2%) in cytologically determined findings. However, the HPV E6/E7 mRNA test demonstrated a higher specificity and a higher PPV: 88.2% and 41.1% for E6/E7 mRNA and 61.4% and 20.8% for HPV DNA, respectively. The results were similar in the histology-based analysis. The HPV E6/E7 mRNA test had a significantly higher PPV and specificity, 50% and 62.8%, than the HPV DNA test with 18.7% and 52.7%. Yet again, the sensitivity of the HPV DNA test was higher, calculated at 100%, whereas the HPV E6/E7 mRNA test demonstrated a sensitivity of 93% (96).
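
The higher PPV of the mRNA test follows largely from its higher specificity, because PPV depends on sensitivity, specificity and the prevalence of disease in the tested population. The sketch below applies Bayes' rule to the cytology-based figures. The prevalence of 41/413 (the women with HSIL cytology) is an assumption made for illustration; the reference standard used for these figures is not stated in the summary above, so the outputs can only approximate the reported PPVs.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from sensitivity, specificity and prevalence."""
    true_positive_rate = sensitivity * prevalence
    false_positive_rate = (1.0 - specificity) * (1.0 - prevalence)
    return true_positive_rate / (true_positive_rate + false_positive_rate)


# Assumption: HSIL cytology (41 of 413 women) is treated as the reference finding.
prevalence = 41 / 413

ppv_dna = ppv(0.902, 0.614, prevalence)   # HPV DNA test
ppv_mrna = ppv(0.732, 0.882, prevalence)  # HPV E6/E7 mRNA test
print(round(ppv_dna, 3), round(ppv_mrna, 3))
```

Under this assumption the calculation gives a PPV of roughly 20% for the DNA test and roughly 41% for the mRNA test, close to the reported 20.8% and 41.1%.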

The last article concerning potential biomarkers for the detection of cervical cancer describes a cross-sectional study conducted by Kan et al. In this study, the methylation levels of paired box gene 1 (PAX1), sex determining region Y-box 1 (SOX1) and NK6 transcription factor-related locus 1 (NKX6-1), three genes suspected of being hypermethylated in cervical cancer, were measured in 247 women with a normal uterine cervix and in 172 patients with abnormal Pap test results of different stages. Firstly, the optimal cut-off values for the methylation of the three genes were determined using data from the first 100 patients. As CINIII+ lesions have been shown to have a high risk of progressing to cervical cancer, the primary aim of this study was to confirm the utility of these methylated genes in the detection of CINIII lesions. ROC analysis was performed and the AUCs of PAX1, SOX1 and NKX6-1 for the detection of CINIII lesions were 0.97, 0.76 and 0.73, respectively. Furthermore, the sensitivities of PAX1, SOX1 and NKX6-1 were 92%, 92% and 54%, while their specificities were 83%, 30% and 82%, respectively. The overall accuracy values for PAX1, SOX1 and NKX6-1 were calculated at 84%, 41% and 78%, respectively. Although these results look promising, for the detection of CINII lesions, the sensitivities of PAX1, SOX1 and NKX6-1 decreased to 77.4%, 86% and 48.8%, with specificities of 84.6%, 39.6% and 79%, respectively (97).

3.11 BLADDER CANCER

As in the last article, detecting cancer using DNA methylation is also the subject of the first of four studies concerning bladder cancer in this systematic review. The gold standard for the detection of bladder cancer until now is cystoscopy in combination with a biopsy of suspicious lesions. However, this invasive procedure can miss 10 to 30% of all bladder malignancies. Furthermore, voided urine cytology is the most widely used non-invasive method for detecting bladder cancer in patients with symptoms. Previous studies have reported that the sensitivity and specificity of cytological examination for the detection of bladder cancer were approximately 35% and 94%, leaving room for improvement. In this case-control study by Chung, first-voided urine was obtained from 128 patients with bladder cancer (88 patients with primary cancer and 40 patients with recurrent cancer), 71 patients with benign urologic disorders and 39 healthy individuals. Firstly, 10 genes, selected from previous studies, were examined in 6 bladder cancer cell lines and in tissue samples of 26 primary bladder tumors. Eight genes were highly methylated in bladder cancer and had low methylation levels in normal cell lines: A2BP1, NTX2, SOX11, PENK, NKX6-2, DBC1, MYO3A and CA10. Secondly, the methylation levels of these genes were analyzed in the first-voided urine samples. All eight genes were indeed significantly more methylated in patients with bladder cancer than in the controls. ROC analysis was performed for all eight genes and the five best performing genes were MYO3A (AUC=0.841), CA10 (AUC=0.835), NKX6-2 (AUC=0.823), PENK (AUC=0.802) and SOX11 (AUC=0.797). When the genes were combined in two differently composed panels (panel 1: MYO3A, CA10, NKX6-2 and DBC1; and panel 2: MYO3A, CA10, NKX6-2 and SOX11), the resulting AUCs of these two panels were both 0.939 (95% CI 0.901-0.966). Moreover, combining MYO3A, CA10, NKX6-2 and DBC1 with SOX11 or PENK in one panel resulted in an AUC of 0.939 (95% CI 0.901-0.966). The sensitivities and specificities of these panels can be consulted in additional table 11. The sensitivity and specificity of each panel were higher than 80% and 90%, respectively, indicating that these biomarkers have high diagnostic value for bladder cancer. When a three-gene methylation panel that included MYO3A, CA10 and NKX6-2 was applied to this data set, the sensitivity for the detection of non-muscle invasive stage bladder cancer (Tis, pTa and T1 stages) was 81% and its specificity was calculated at 95%, with an AUC of 0.911 (95% CI 0.857-0.949). Furthermore, the sensitivity and specificity of this three-gene panel for the detection of invasive stage bladder cancer (T2, T3 and T4 stages) were calculated at 90% and 95%, with an AUC of 0.962 (95% CI 0.923-0.985). Lastly, the second panel with four biomarkers demonstrated a sensitivity and specificity of 76% and 97% for the detection of non-muscle invasive stage bladder cancer and a sensitivity and specificity of 86% and 97% for invasive stage bladder cancer (98).

The next study concerning potential biomarkers for the detection of bladder cancer was designed by Eissa. The goal of this case-control study was to assess the diagnostic performance of survivin RNA and matrix metalloproteinases 2 and 9 for detecting bladder cancer. Blood samples and single-voided urine samples were obtained from 56 patients with bladder cancer, 20 patients with benign urological lesions and 20 healthy volunteers. Matrix metalloproteinases 2 and 9 were measured by MMP zymography, and survivin RNA, also called baculoviral IAP repeat-containing 5 or BIRC5, was detected by RT-PCR. The performance of these biomarkers for the detection of bladder cancer was compared with that of cytology. The results demonstrated that survivin RNA and MMP zymography had a sensitivity of 76.1% and 67.4%, respectively. Urinary survivin had the highest sensitivity and specificity, whereas cytology displayed the lowest sensitivity. When survivin and the MMPs were combined in a panel, the sensitivity and specificity increased to 84.7% and 95%, respectively. However, the highest sensitivity (95.6%) was observed in a three-biomarker panel that incorporated cytology, survivin and MMP zymography, whereas cytology alone demonstrated the highest specificity with 100%. No ROC analysis was performed in this study. The other results of this study can be consulted in additional table 11 (99).

In the prospective cohort study of Lai et al., the utility of human uroplakin 3A or UPK3A in urine as a diagnostic biomarker for bladder cancer was evaluated. Furthermore, possible associations between urinary levels of UPK3A and bladder cancer were assessed and the diagnostic performance of urine UPK3A was compared with that of urine cytology and NMP22. Voided urine samples were collected from 32 healthy volunteers, 44 patients with benign urological disorders and 122 patients with bladder cancer. The results demonstrated that urine UPK3A levels were significantly increased in patients with bladder cancer when compared to the urine UPK3A levels in healthy volunteers and in patients with benign urological disorders (P<0.01). There was also a significant difference in UPK3A levels between the healthy volunteers and the patients with benign urological disorders (P<0.01), but not between patients with different grades of cancer (G1, G2 and G3). These results thereby indicated that raised UPK3A levels in urine can detect bladder cancer, but cannot predict tumor grade. ROC analysis was performed on the study data to determine sensitivity, specificity and AUC for urine UPK3A, cytology and NMP22. UPK3A presented an AUC of 0.907 (95% CI 0.876-0.974). The optimal cut-off value of UPK3A was determined at an OD (absorbance unit) of 0.0685, with a sensitivity and specificity of 83% each. The sensitivities of urine NMP22 and of urine cytology for the detection of bladder cancer were 58% and 64%, respectively, whereas their specificities were calculated at 75% and 82%, respectively. The PPV and NPV of UPK3A, NMP22 and urine cytology were 0.88 and 0.79, 0.85 and 0.75, 0.53 and 0.58, respectively. As a result, it can be concluded that the UPK3A assay was more sensitive than NMP22 and cytology for the detection of bladder cancer (P<0.05) (100).

The last study concerning new biomarkers for bladder cancer was designed by Renard. In this prospective cohort study, two methylation-based DNA markers, TWIST1 and NID2, were identified and tested as biomarkers for non-invasive urine-based early detection of bladder cancer. Voided urine samples were obtained from 157 patients with bladder cancer and from 339 control patients with benign urological diseases. The patients with bladder cancer had predominantly non-muscle-invasive or early-stage bladder cancer. The study was divided into three phases: an identification phase in cell lines, a second phase that included an analysis of the candidate methylated genes found in the first phase, and a final phase in which the selected methylated genes from phase two were evaluated in a validation set. In the second phase, the 10 best performing methylated gene markers and their respective sensitivities at a fixed specificity of 100% were NID2 with 73%, TWIST1 with 68%, TJP2 with 68%, TNFRSF25 with 66%, BMP7 and RUNX3 both with 59%, CCNA1 and APC both with 50%, LOXL1 with 38% and finally TUBB4 with 34%. The two genes with the highest sensitivity were NID2 and TWIST1. In the third phase, these 10 methylated genes were tested in a urine marker selection set with 62 bladder cancer patients and 143 controls. The five best performing markers were TWIST1, NID2, RUNX3, CCNA1 and BMP7, and these were selected to be further evaluated in a training set of 48 bladder cancer patients and 121 controls. In this training set, TWIST1 and NID2 were the two methylated genes with the best diagnostic performance in terms of sensitivity and specificity for distinguishing patients with bladder cancer from controls. These two genes were finally validated in an independent urine validation set that included 35 bladder cancer patients and 57 controls. The sensitivity and specificity of TWIST1 for detecting bladder cancer were 88% and 94%, whereas the sensitivity and specificity of NID2 were 94% and 91% in the training and validation set, respectively. The most notable result was the high sensitivity of 80-89% for the detection of early-stage Ta and low-grade bladder cancer. Subsequently, NID2 and TWIST1 were combined in a biomarker panel and this panel presented a sensitivity of 90% and a specificity of 93%. A logistic regression model incorporating the two biomarkers was constructed and ROC analysis revealed an AUC of 0.93 (95% CI 0.90-0.96) for differentiating bladder cancer patients from controls. The sensitivity of the two-marker panel (88-94%) was much higher than that of cytology (48-49%), although both tests displayed high specificity rates of 91-94% and 95-97%, respectively. Furthermore, when the methylated gene panel was combined with cytology, the sensitivity increased to 96-97%, but there was a slight decrease in specificity (86-93%) for the detection of bladder cancer. Ultimately, the PPV and NPV of the two-marker panel were 95% and 86%, respectively (101).
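
The gain in sensitivity and loss in specificity seen when the methylation panel is combined with cytology is what a "positive if either test is positive" rule would be expected to produce. The sketch below assumes that rule and also assumes that the two tests err independently of each other; neither assumption is confirmed by the study. The input values are taken from the ranges reported above for the two-marker panel and for cytology.

```python
def combine_either_positive(sens_a, spec_a, sens_b, spec_b):
    """Combine two tests under an 'either positive' rule.

    Assumes the two tests are conditionally independent given disease status.
    """
    # A case is missed only if both tests miss it.
    sensitivity = 1.0 - (1.0 - sens_a) * (1.0 - sens_b)
    # A control is called negative only if both tests are negative.
    specificity = spec_a * spec_b
    return sensitivity, specificity


# Two-marker panel (sensitivity 90%, specificity 93%) and cytology (48%, 95%).
sens, spec = combine_either_positive(0.90, 0.93, 0.48, 0.95)
print(round(sens, 3), round(spec, 3))
```

Under these assumptions the combined sensitivity is about 95% and the combined specificity about 88%, which agrees in direction with the reported 96-97% and 86-93% without reproducing them exactly.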

3.12 ORAL CANCER

Nasopharyngeal carcinoma is strongly associated with Epstein-Barr virus (EBV). The detection of characteristic antibodies against EBV and of a high viral load have been reported as screening biomarkers in previous reports. In the retrospective case-control study by Hutajulu, methylation analysis in the promoter of tumor suppressor genes (TSGs) was evaluated as a complementary test for risk assessment in EBV-infected individuals to improve early detection of nasopharyngeal carcinoma (NPC). The study population consisted of 53 NPC patients, 22 high-risk patients and 25 healthy EBV carriers. The high-risk group consisted of patients with chronic symptoms in the head and neck region, elevated EBV IgA ELISA seroreactivity and a detectable viral load in nasopharyngeal brushings. The seroreactivity of high-risk patients and NPC patients was higher than the seroreactivity of healthy individuals (P<0.001). After analyzing the methylation level of promoter regions of TSGs in tissue samples and in brushings from the NPC patients, high-risk patients and healthy controls, the 10 TSGs with the highest frequency of methylation were selected: CHFR, RIZ1, WIF1, p16, RASSF2A, RASSF1A, DAPK1, DLC1, CDH13 and CADM1. These were all further tested in 50 NPC cases. The five TSGs that were significantly more methylated in NPC and that could differentiate NPC patients from non-cancer patients were combined in a panel. These genes were CHFR, RIZ1, WIF1, p16 and RASSF1A. This five-gene panel displayed a sensitivity of 98% and a specificity of 96% (102).

Oral and oropharyngeal squamous cell carcinomas (OSCC) are a major cause of mortality, with approximately 200,000 deaths every year. The low survival rate is in part the result of a lack of sensitive early detection methods. In the observational study by Vokač and colleagues, amplifications of the human telomerase RNA component (hTERC) and SRY-related HMG-box 2 (SOX2) genes were measured in brush cytology slides of OSCC exfoliative tumor cells using the FISH technique. Subsequently, these two genes were evaluated as potential supportive biomarkers for early detection and diagnosis of oral cancer and OSCC. Brush biopsies were obtained from exophytic and exulcerated oral and oropharyngeal lesions of 71 patients with OSCC or oral cancer and from the oral cavity of 22 healthy controls. The majority of the tumors were locally advanced cancers, with 48 patients in stage IV disease. Furthermore, 44 patients presented with lymph node involvement. These 71 brush biopsies were analyzed by FISH using TERC, SOX2 and CEP-3 DNA probes. Polyploid and amplified cases were considered abnormal. As a result, a total of 49 patients displayed abnormal patterns in these genes; 27 of these had polyploid genes and 22 had amplified and polyploid genes. Three brush biopsies demonstrated only amplified genes. The rate of TERC/SOX2 polyploidy and amplification in tumor cells was significantly higher than that in mucosa cells of controls (P=0.01). In addition, significantly higher rates of polyploidy and amplification of these genes were seen in tumor cells from the oropharynx when compared with tumor cells from other locations in the oral cavity (P=0.0442). The odds ratios of patients with total abnormal FISH results (amplification or polyploidy), patients with polyploidy only and patients with amplification and polyploidy were analyzed. The odds ratio of total abnormal FISH results vs. controls was 1.29 (95% CI 0.39-4.23) (103).

CYFRA21-1 is a well-accepted biomarker with high sensitivity and specificity for the detection of NSCLC, more specifically squamous cell carcinoma. In the study of Rajkumar and colleagues, the potential of CYFRA21-1 as a diagnostic biomarker for oral cancer and OSCC instead of NSCLC was evaluated in saliva and blood samples from patients and healthy controls. The study population consisted of 100 patients with oral cancer or OSCC, 100 patients with premalignant lesions and 100 healthy controls. The premalignant lesions (leukoplakia and oral submucous fibrosis or OSMF) were from buccal mucosa, vestibule mucosa, alveolar mucosa or from palate and tongue. Of the 100 OSCC patients included, 36 had well-differentiated lesions, 31 had moderately differentiated lesions and 33 had poorly differentiated lesions. A significant difference was found in serum and saliva CYFRA21-1 levels between the patients with OSCC and the patients with premalignant lesions (P<0.01). Furthermore, no differences in serum CYFRA21-1 between the patients with premalignant lesions and the healthy controls were observed. Saliva CYFRA21-1 levels, however, were significantly elevated in the patients with premalignant lesions compared to the healthy controls (P<0.01). The salivary CYFRA21-1 levels were much higher than the serum CYFRA21-1 levels in patients with premalignant lesions and in patients with OSCC. ROC analysis of CYFRA21-1 in serum and saliva samples was performed. CYFRA21-1 in saliva displayed a specificity and sensitivity of 75% for differentiating OSCC patients from patients with premalignant lesions. However, the sensitivity and specificity of salivary CYFRA21-1 were 83.6% and 95%, respectively, for distinguishing OSCC patients from healthy controls. On the other hand, serum CYFRA21-1 demonstrated a sensitivity of 60% with a specificity of 90% for differentiating OSCC patients from patients with premalignant lesions. In addition, the AUCs of serum and salivary CYFRA21-1 were 0.865 and 0.899, respectively. Based on the ROC analysis of this data set, salivary CYFRA21-1 was superior to serum CYFRA21-1 in distinguishing patients with OSCC from patients with premalignant lesions. Moreover, there was a significant increase in salivary CYFRA21-1 levels in patients with poorly differentiated lesions (P<0.001), while no significant difference in serum CYFRA21-1 levels was observed between patients with different histological grades of OSCC. The last notable result was the significant increase in serum CYFRA21-1 levels in patients with stage III and IV OSCC (P<0.01) (104).

Previous studies have reported that serum antibodies to HPV oncoproteins E6 and E7 are strongly correlated with OSCC. Recently, it has been reported that, up to 10 years before diagnosis, elevated levels of HPV E6 antibodies can be found in 35% of European OSCC patients. Furthermore, associations between seropositivity to E1 and E2 proteins and the major capsid protein L1 and the presence of OSCC were demonstrated. In the study of Holzinger, the diagnostic performance of HPV antibody levels to oncoproteins E6 and E7, to regulatory proteins E1 and E2 and to major capsid protein L1 as biomarkers for HPV16-driven OSCC was evaluated. Tissue and serum samples were collected from 214 patients with head and neck squamous cell carcinomas or HNSCC (142 non-HPV-driven OSCC and 72 HPV-driven OSCC). Of these 214 patients, 120 cancer lesions were located in the oropharynx and 94 were located outside the oropharynx. Of the 72 patients with HPV-driven OSCC, 66 patients were E6 seropositive and 33 patients were seropositive for all four E antibodies. Antibody sensitivity for HPV-driven tumors of OSCC patients was 97% for any of the 4 E proteins and 53% for L1, whereas the specificity was calculated at 98% for the E6 protein and the E1 protein and 57% for the L1 protein. E6 seropositivity had the highest diagnostic accuracy with 97%. An algorithm named ‘HPV sero-pattern’ was constructed, which combined E6 seropositivity and/or seropositivity to at least three E proteins. This algorithm presented a sensitivity of 97% (95% CI 90-99%) and a specificity of 98% (95% CI 90-100%), with a diagnostic accuracy of 98% (95% CI 93-99%). Sensitivity for detecting HNSCC cases from outside the oropharynx was 83% for any E protein and 17% for E1 alone, while specificity was 100% for E6 protein and 74% for any of the other E proteins. These results showed that the sensitivity of E6 seropositivity and of the HPV sero-pattern algorithm was much lower in cases with cancer outside the oropharynx (50% vs > 96%), whereas specificity was high in all patients, with 98% in OSCC patients and 100% in HNSCC patients with cancer outside of the oropharynx. In a subset of the cohort, the performance of p16 immunostaining and of the combination of p16 immunostaining and HPV DNA was evaluated in OSCC patients and in HNSCC patients with cancer outside of the oropharynx, in order to compare the diagnostic performance of HPV16 serology with that of p16 immunostaining. The sensitivity of E6 seropositivity (and the sero-pattern algorithm) for the detection of HPV-driven OSCC patients was 95% in this subcohort, whereas the sensitivity of p16 (with or without HPV DNA analysis) was 97%. On the other hand, the specificity of p16 was just 65%, while p16 in combination with HPV DNA positivity displayed an 83% specificity rate. HPV16 E6 seropositivity in combination with HPV sero-pattern positivity, however, had a much higher specificity of 100%. Lastly, the sensitivity of a panel that included p16, with or without HPV DNA analysis, and E6 seropositivity (and the sero-pattern algorithm) was calculated at 100% for the detection of HNSCC patients with cancer outside of the oropharynx, but the specificity of p16 was merely 88%, while the combination of E6 protein and the sero-pattern positivity displayed a specificity of 100% (105).

3.13 ESOPHAGEAL CANCER

Two studies were found eligible to be included in this systematic review. Additional table 13 presents the results of these two studies. The majority of patients with esophageal cancer (EC) do not develop symptoms until the cancer is in an advanced stage. For that reason, long-term survival is disappointingly low. However, when patients are diagnosed with esophageal cancer in stage I, the 5-year survival exceeds 85% as a result of early surgical interventions. Thus, identifying a reliable diagnostic biomarker for the detection of patients with stage I esophageal cancer could dramatically improve long-term survival. In the case-control study by Guo et al., the profile of serum proteins in patients with EC was analyzed using MALDI-TOF-MS. Proteomic patterns associated with EC were identified and subsequently, a model of biomarkers was constructed and evaluated as an early detection tool for EC. Serum samples from 78 patients with EC and 95 healthy controls were collected for this study. In the discovery phase, the protein profile of 40 EC patients and 50 healthy controls was examined. Only a few significant differences in proteomic peaks were found between the two groups, but these were considered as potential biomarkers. Ultimately, 60 discriminating protein peaks in sera were observed. The proteins behind the four peaks with the highest discriminatory power were selected to be combined in a classification tree. Subsequently, this classification tree was used to differentiate patients with EC from healthy controls. The sensitivity and specificity of the classification tree were 93.5% and 88% in the first test set. In the second, blinded test set, the classification tree yielded a sensitivity of 89.5% and a specificity of 84.4% (106).

In the cohort study by Xu et al., the potential of auto-antibodies against L1CAM as diagnostic biomarkers for the detection of patients with esophageal squamous cell carcinoma (ESCC) was examined. These auto-antibodies were first identified in cohort 1, with 191 patients with ESCC and 94 normal controls. Subsequently, the selected auto-antibodies from cohort 1 were validated in cohort 2, with 47 patients with ESCC and 47 normal controls. Patients with stage 0, I and IIa tumors were classified as early-stage ESCC patients. The results of cohort 1 demonstrated that the serum levels of auto-antibodies against L1CAM were significantly elevated, not only in patients with ESCC (P=0.005), but also in early-stage ESCC patients (P=0.007). The same result was observed in cohort 2. ROC analysis was performed and in cohort 1 and 2, the optimal cut-off value of the auto-antibodies against L1CAM was calculated at 0.349, which offered an AUC of 0.603 (95% CI 0.535-0.672) for the detection of patients with ESCC, with a sensitivity of 26.2% and a specificity of 90.4%. For the detection of early-stage patients with ESCC, the AUC of auto-antibodies against L1CAM was 0.611 (95% CI 0.533-0.689), with a sensitivity of 25.2% and a specificity of 90.4%. The PPV and NPV of auto-antibodies against L1CAM for ESCC were 84.8% and 37.7%, respectively, while the positive likelihood ratio (PLR) and negative likelihood ratio (NLR) were 2.74 and 0.82. Furthermore, the levels of auto-antibodies against L1CAM showed no association with age, gender, size of tumor, tumor location, histological grade, lymph node status, TNM stage, or early-stage and late-stage groups. However, a significant correlation between the auto-antibodies against L1CAM and the depth of the tumor invasion was found in cohort 1 (P<0.05). In addition, no correlation was found between auto-antibodies against L1CAM and ESCC patients’ survival time (P>0.05) (107).
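
The likelihood ratios reported for the L1CAM auto-antibodies follow directly from the sensitivity and specificity, using the standard definitions PLR = sensitivity / (1 - specificity) and NLR = (1 - sensitivity) / specificity. The short sketch below recomputes them from the rounded values given in the text; the function name is chosen here for illustration only.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (positive likelihood ratio, negative likelihood ratio)."""
    plr = sensitivity / (1.0 - specificity)
    nlr = (1.0 - sensitivity) / specificity
    return plr, nlr


# Sensitivity 26.2% and specificity 90.4% for the detection of ESCC.
plr, nlr = likelihood_ratios(0.262, 0.904)
print(round(plr, 2), round(nlr, 2))
```

The recomputed values are approximately 2.73 and 0.82, matching the reported 2.74 and 0.82 up to rounding of the inputs. A PLR below 3 together with an NLR close to 1 is in line with the modest AUC of about 0.60 reported for this marker.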

3.14 SKIN CANCER

The occurrence of melanoma has been rising in recent years. What is more, disseminated forms of melanoma have a low 10-year survival rate of less than 50%. If the disease were to be detected and excised at an early, non-invasive stage, the survival rate of patients with melanoma could be drastically increased. At present, early detection of melanoma is done by means of biopsying pigmented lesions that are considered suspicious for melanoma. The finding of a new sensitive biomarker could reduce the number of these biopsies drastically. In the study of Wachsman and colleagues, an epidermal genetic information retrieval (EGIR)-based strategy was used to construct a 17-gene classifier. Subsequently, this classifier was evaluated as a detection tool that can distinguish melanoma from naevi. Suspicious lesions of patients included in this study were first tape-stripped and then biopsied. After histological examination, 29 melanomas, 68 naevi (benign and atypical) and 15 normal skin specimens were harvested, from which RNA was isolated, amplified and profiled. A group of 422 genes that were differentially expressed between melanomas and naevi of patients in this first group was then subjected to class prediction modelling in order to identify a multigene classifier that can distinguish melanoma lesions from naevi. The diagnostic performance of this 422-gene classifier was assessed in a training set with 37 melanomas and 37 naevi. It was found that the best performing classifier, with 168 of the 422 genes, could identify all 37 melanomas correctly. Furthermore, this classifier demonstrated a sensitivity of 100% and a specificity of 95%. In the following independent test dataset with 39 melanomas and 89 naevi, the sensitivity and specificity of this 168-gene classifier were 100% and 88%, respectively. By applying TreeNet modelling, it was revealed that a classifier of 17 genes could still maintain a sensitivity of 100%, at a slight cost in specificity (88%). ROC analysis of this 17-gene classifier revealed an AUC of 0.955, demonstrating that this classifier is highly accurate for the detection of melanoma. Among these 17 genes, nine were identified as critical regulators in melanocyte development, pigmentation signaling and melanoma progression. The rest of the genes played a role in cell death, cellular development, cancer, hair and skin development and neurological diseases. In conclusion, most of these newly identified biomarkers were involved in melanoma or cancer development. The 17 genes included in the classifier were ACTN4, BC020163, CMIP, CNN2, EDNRB, GPM6B, KIT, MGC40222, NAMPT, PRAME, RPL18, RPL21, RPS15, TMEM80, TRIB2, TTC3 and VDAC1. The results of this study can be consulted in additional table 14 (108).

3.15 OSTEOSARCOMA

The 5-year survival of patients with osteosarcoma has improved over the past decades to 60-70%. Unfortunately, a great number of osteosarcoma patients still do not respond well to chemotherapy. As a result, these patients are at high risk of local relapse or distant metastasis, even after curative resection of the primary tumor and chemotherapy. For that reason, identification and validation of new biomarkers that can detect osteosarcoma at an early stage is of the utmost importance. In the study of Ouyang, various miRNA levels in plasma of osteosarcoma patients were measured and tested as potential biomarkers for the detection of osteosarcoma. These biomarkers were miR-21, miR-199a-3p, miR-143, miR-34, miR-140 and miR-132. The study consisted of an initial cohort with 40 patients with osteosarcoma and 40 healthy controls, followed by a validation cohort with another 40 osteosarcoma patients and 40 controls. In the initial cohort, the levels of miR-21 were significantly elevated in the patients with osteosarcoma when compared to the levels of miR-21 in healthy controls (P=0.03), while the levels
of miR-199a-3p and miR-143 were significantly decreased in patients with osteosarcoma when compared to the levels of miR-199a-3p and miR-143 in healthy controls (P=0.02 and P=0.01). No significant difference in the levels of miR-34, miR-140 and miR-132 was observed between patients with osteosarcoma and healthy controls. In the validation cohort, the levels of miR-199a-3p, miR-143 and miR-21 were measured. Results similar to those of the initial cohort were found, with the levels of miR-21 being significantly elevated (P=0.02) and the levels of miR-199a-3p and miR-143 being significantly decreased (both P=0.01) in patients with osteosarcoma compared with the levels of these miRNAs in healthy controls. ROC analysis was conducted in this validation data set and the AUCs of miR-21, miR-199a-3p and miR-143 for differentiating osteosarcoma patients from healthy controls were 0.863 (95% CI 0.818-0.908), 0.918 (95% CI 0.882-0.954) and 0.902 (95% CI 0.864-0.940), respectively. Furthermore, the AUC of the panel combining the three miRNAs was calculated at 0.935 (95% CI 0.924-0.984), whereas the AUC of BALP or bone-specific alkaline phosphatase (a well-established biomarker for osteosarcoma) was lower at 0.922 (95% CI 0.88-0.964). On top of that, the sensitivity and the specificity of the combined three miRNAs were 90.5% and 93.8%, respectively. Lastly, significantly elevated levels of miR-21 were associated with metastasis and with the osteoblastic subtype, compared to the miR-21 levels in non-metastatic and non-osteoblastic patients, while significantly decreased levels of miR-199a-3p and miR-143 were associated with the osteoblastic subtype (109).
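To make the AUC values reported above easier to interpret, the following minimal sketch computes an AUC as the probability that a randomly chosen patient has a higher marker level than a randomly chosen control (ties count as one half). The function name `auc_from_scores` and all marker values are hypothetical and chosen for illustration only; they are not data from the study of Ouyang.

```python
def auc_from_scores(case_scores, control_scores):
    """Rank-based AUC: fraction of (case, control) pairs in which
    the case has the higher marker level; ties count as 0.5."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical relative marker levels (illustration only, not study data)
cases = [2.1, 3.4, 1.8, 2.9, 3.0]
controls = [1.0, 1.9, 1.2, 2.0, 0.8]

auc = auc_from_scores(cases, controls)
print(f"AUC = {auc:.2f}")  # 23 of 25 pairs are ordered correctly
```

For a marker that is decreased in patients, such as miR-199a-3p or miR-143, the same calculation can be applied after reversing the sign of the measured levels.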

3.16 THYROID CANCER

Medullary thyroid carcinomas (MTCs) are derived from the parafollicular C-cells that secrete calcitonin. Therefore, all MTCs express calcitonin. MTCs account for 5% of all thyroid carcinomas and often present with micrometastases. Consequently, curative non-surgical treatment is not possible, so a new method for early detection of these cancers, preferably before they metastasize, is urgently needed. Presently, hCT (serum calcitonin) is used in few hospital centers to diagnose primary or occult MTCs after pentagastrin stimulation. However, no consensus on the optimal cut-off value of basal hCT for the detection of MTC was found in recent reports. In the study of Hermann et al, hCT was measured in patients with nodular thyroid disease after pentagastrin stimulation to determine the optimal cut-off value of hCT for the detection of MTC. The study population consisted of 1007 patients with nodular thyroid disease, who were all diagnosed by sonography. The hCT levels were increased (> 10 pg/mL) in 17 patients after pentagastrin stimulation. The mean increase of hCT was 4.6-fold in patients 1-12, 8.8-fold in
patients 13-15 and 25-fold in patient 16, who was diagnosed with MTC. One more patient was diagnosed with MTC but presented basal hCT levels of 4400 pg/mL. Five patients displayed elevated basal hCT levels > 100 pg/mL; 2 were diagnosed with MTC and 3 with C-cell hyperplasia. Lastly, it was found that patient 16 displayed basal hCT levels of 21 pg/mL after surgical removal of the tumor (110).

3.17 LEUKEMIA

In the study of Papageorgiou and colleagues, the diagnostic performance of BCL2L12 as a biomarker for chronic lymphocytic leukemia (CLL) was evaluated. The mRNA expression of the apoptosis-related gene BCL2L12 was analyzed in a first phase and subsequently its prognostic and predictive value and potential clinical application were assessed. The study population consisted of 65 patients with CLL and 23 healthy donors. Of the 65 CLL patients, 40 patients were classified in stage A, 10 patients in stage B and 15 patients in stage C. All examinations were performed on peripheral blood mononuclear cells (PBMCs), isolated from peripheral blood samples of these patients. The results showed that BCL2L12 mRNA expression was significantly elevated in CLL patients when compared to BCL2L12 mRNA expression in healthy blood donors (P<0.001). Furthermore, BCL2 mRNA levels were also significantly increased in CLL patients, in comparison with BCL2 mRNA expression in healthy controls (P<0.001). ROC analysis and logistic regression analysis were performed and BCL2L12 mRNA expression was able to differentiate CLL patients from healthy donors with an AUC of 0.833 (95% CI 0.731-0.935), whereas the AUC of BCL2 mRNA expression was calculated at 0.776 (95% CI 0.656-0.896). The combination of BCL2L12 and BCL2 mRNA expression in an adjusted logistic regression model resulted in an AUC for differentiating CLL patients from healthy controls that was only slightly increased to 0.840 (95% CI 0.740-0.943) when compared to the discriminatory power of BCL2L12 alone. The subsequently performed univariate logistic regression analysis demonstrated that high BCL2L12 mRNA levels were a powerful predictor of CLL presence, displaying an OR of 4.52 (95% CI 2.11-9.65), whereas the OR for the presence of CLL in BCL2-positive individuals was 4.48 (95% CI 1.50-15.9).
Similarly, high levels of BCL2 mRNA expression were also associated with a high risk of CLL, displaying an OR of 4.79 (95% CI 2.06-11.2). The results of multivariate analysis indicated that increased BCL2L12 mRNA expression was a significant and independent predictive biomarker for CLL as a dichotomous variable with an OR of 5.51 (95% CI 1.82-16.6). The mRNA expression of BCL2, on the other hand, proved to be a statistically significant predictive biomarker for CLL as a dichotomous variable with an OR of 3.84 (95% CI 1.11-13.3). Furthermore, patients in advanced
stage CLL were associated with a BCL2L12-positive result, while BCL2 positivity was significantly associated with normal LDH and CD38 levels. Interestingly, it appeared that BCL2L12 and BCL2 mRNA expression levels were correlated. During follow-up, the observed median overall survival of all patients with CLL was 72 months. Kaplan-Meier analysis showed that BCL2L12-positive patients had a reduced overall survival when compared to BCL2L12-negative patients (P=0.043) (111).
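The odds ratios above were obtained by logistic regression. As a simplified illustration of what an OR with its 95% CI expresses for a dichotomous marker, the sketch below computes the OR directly from a 2x2 table with a Woolf (log-based) confidence interval. This is a swapped-in, simpler technique and not the method of Papageorgiou and colleagues; the function name `odds_ratio_ci` and all counts are hypothetical.

```python
import math


def odds_ratio_ci(pos_cases, neg_cases, pos_controls, neg_controls, z=1.96):
    """Odds ratio from a 2x2 table with a Woolf (log-based) confidence interval.
    z = 1.96 corresponds to a 95% interval."""
    odds_ratio = (pos_cases * neg_controls) / (neg_cases * pos_controls)
    # Standard error of log(OR): square root of the sum of reciprocal cell counts
    se_log_or = math.sqrt(
        1 / pos_cases + 1 / neg_cases + 1 / pos_controls + 1 / neg_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts (illustration only, not study data):
# 30 marker-positive and 20 marker-negative patients,
# 10 marker-positive and 40 marker-negative controls.
odds_ratio, lower, upper = odds_ratio_ci(30, 20, 10, 40)
print(f"OR = {odds_ratio:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

A confidence interval that excludes 1, as in all ORs reported above, indicates a statistically significant association at the 5% level.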

3.18 VARIOUS TYPES OF CANCER

In the study of Wen et al, a biomarker panel consisting of eight molecules (i.e. alpha-fetoprotein, carcinoembryonic antigen, prostate-specific antigen, CA19-9, CA125, CA15-3, squamous cell specific antigen and cytokeratin 19 fragment) was evaluated in Taiwanese patients with various types of cancer. In this study, patients who were examined at the Chang Gung Memorial hospital were given the opportunity to be screened for cancer by means of this multi-marker panel. All these patients were then followed up for one year. A total of 314 patients who were diagnosed with malignancy in the year following the assessment of the multi-analyte biomarker panel were re-screened by this panel. Of these 314 patients, 179 had at least one positive biomarker. The sensitivity, specificity, PPV and NPV of the panel were, respectively, 57%, 88.7%, 3.7% and 99.6%. The sensitivities of this panel for liver cancer, lung cancer, prostate cancer and colorectal cancer were, respectively, 90.9%, 75%, 100% and 76.9%. Nevertheless, the sensitivity of this panel was 17.6% for the detection of head and neck cancer, 37.5% for breast cancer and 44.4% for cervical cancer. A notable result was that the sensitivity of the biomarker screening panel for the detection of single malignancies was higher than the sensitivity of individual tumor-specific biomarkers. In the detection of HCC, for instance, the sensitivity of AFP individually was 63.3%, whereas the sensitivity of the panel was much higher (92.3%). Similar conclusions were drawn with regard to pancreatic cancer and colorectal cancer, where the sensitivity of CEA alone was 53.8%, while the panel displayed a sensitivity of 76.9% for the detection of pancreatic cancer and colorectal cancer. Another interesting result was that the accuracy of the biomarker panel increased with age. Nearly 80% of all cases that were successfully detected by this screening panel consisted of patients aged 70-79 years.
Most of the patients with lung or pancreatic cancer were in an advanced clinical stage. The results of this study can be consulted in additional table 18 (112).
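The low PPV reported for this panel, despite a specificity of 88.7%, follows from the low frequency of cancer in a screened population. The sketch below shows how PPV and NPV can be derived from sensitivity, specificity and prevalence. The prevalence of 0.76% is an assumption chosen for illustration because it yields values close to those reported; it is not a figure taken from the study of Wen et al. The function name `predictive_values` is likewise hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied in a population with the given prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity as reported for the panel;
# the prevalence is an assumed value (illustration only).
ppv, npv = predictive_values(sensitivity=0.57, specificity=0.887, prevalence=0.0076)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

Under this assumption, roughly 96 of every 100 positive results would be false positives, which illustrates why a high specificity is essential for any biomarker intended for population screening.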

The retrospective study of Wang and colleagues examined the utility of thymidine kinase 1 (TK1), CEA and AFP in serum for the screening of malignant tumors. These biomarkers were analyzed in the serum of 56286 volunteers. Out of the 56286 individuals who were screened, 89 patients were diagnosed with twenty-four different types of tumors. Cancer patients with an STK1 value of < 0.5 pM were considered as having a better prognosis than patients with cancer and an STK1 value of > 0.5 pM. Moreover, the cut-off values of CEA and AFP were set at 5.0 ng/mL and 10 ng/mL, respectively. The highly specific anti-TK1 IgY antibody against human TK1 or STK1 displayed a higher sensitivity for all types of cancer in all age groups, except in the age group of patients > 60 years. Combining STK1 with CEA and AFP increased the overall sensitivity by approximately 20% (from 54.2% for STK1 alone to 72.3%). The proportion of patients who displayed STK1 values above the cut-off value was well over 50% in six of the seven types of cancer in a subgroup of patients. In contrast, individually elevated levels of CEA only occurred in patients with two different types of cancer and, in the case of AFP, there were no tumor groups where more than 50% of the patients had levels of AFP that were above the cut-off value. Furthermore, it was found that AFP was most sensitive for stomach cancer (100%) and the least sensitive for patients with cervix cancer (47.4%). Fifty-two patients with malignant tumors were followed up for 10 to 40 months. It was found that the overall survival of patients with STK1 levels above the cut-off value was much lower than that of patients with STK1 levels below the cut-off value (113).

The last article included in this systematic review describes a screening study by Chen that evaluated the diagnostic performance of TK1 as a biomarker for all types of cancer. The screened population consisted of 35365 individuals (from four different centers), who were divided into two groups: one with participants who live in the city, the other with participants who work on a land-based oil field. Furthermore, 720 patients with 11 different types of cancer (liver, cervical, ovarian, lung, breast, esophageal, gastric, thyroid, head and neck cancer, lymphoma and other digestive cancers) were included. Additionally, 4103 patients with various illnesses were enrolled. All patients undergoing medical examination were offered an additional screening test that included STK1 as a biomarker for the detection of cancer. The threshold value of STK1 was set at 2.0 pM. A value under 2.0 pM was defined as normal, a higher value was defined as elevated. ROC analysis was performed to assess the sensitivity and specificity of STK1 as a diagnostic biomarker. The STK1 test demonstrated an AUC of 0.96 for differentiating patients with cancer from healthy controls. Furthermore, the sensitivity and specificity of STK1 were calculated at 0.798 and 0.997, respectively. The likelihood value of STK1 was 233.74. When the STK1 cut-off value was lowered from 2.0 pM to 1.5 pM, the sensitivity of STK1 increased to 0.86, at the cost of the specificity, which
decreased to 0.95. No malignancies were found in the individuals with STK1 < 2.0 pM in the city group. In city residents exhibiting elevated levels of STK1, the risk of having a malignancy or pre-malignancy was 85.4%, which was much higher than the risk for individuals with normal STK1 levels (52.4%). In the oil-field group, a significantly higher number of pre-malignancies of all types were found in patients with elevated levels of STK1 (25.6%). Furthermore, as in the city group, a significantly higher frequency of malignancies and pre-malignancies was observed in the group with elevated levels of STK1 (85.2%), compared to the frequency of (pre-)malignancies in the group with normal STK1 levels (57.8%). Finally, after follow-up, it was observed that the risk of developing cancer in patients with elevated STK1 levels had significantly increased by approximately three to five times, compared to the normal cancer incidence rate (114).
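The trade-off between the two STK1 cut-off values described above can be summarized with likelihood ratios. The sketch below, in which the function name `likelihood_ratios` is hypothetical, applies the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity to the rounded values reported for the study of Chen. With these rounded inputs the LR+ at 2.0 pM comes to about 266, which differs from the reported likelihood value of 233.74; this may be due to rounding of the published sensitivity and specificity, but that explanation is an assumption.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a dichotomous test."""
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return lr_positive, lr_negative


# Rounded sensitivity and specificity as reported for the two cut-off values
lr_pos_2_0, lr_neg_2_0 = likelihood_ratios(0.798, 0.997)  # cut-off 2.0 pM
lr_pos_1_5, lr_neg_1_5 = likelihood_ratios(0.86, 0.95)    # cut-off 1.5 pM

print(f"2.0 pM: LR+ = {lr_pos_2_0:.1f}, LR- = {lr_neg_2_0:.3f}")
print(f"1.5 pM: LR+ = {lr_pos_1_5:.1f}, LR- = {lr_neg_1_5:.3f}")
```

The comparison illustrates that lowering the cut-off value gains some sensitivity but sharply reduces the weight of a positive result, because the false-positive fraction rises from 0.3% to 5%.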

4. DISCUSSION

4.1 COLORECTAL CANCER

The study of Nielsen et al. shows that an increase in plasma TIMP-1 and CEA protein levels, both combined and independently from each other, is significantly associated with CRC. The multivariable model demonstrates that both markers could serve as a detection tool for CRC in the future (16). On top of that, the model shows that comorbidity influences the levels of CEA and, more specifically, those of plasma TIMP-1. This suggests that an individual with comorbid diabetes, cardiovascular diseases or lung and rheumatic diseases requires higher levels of these biomarkers to obtain the same OR for CRC than a person without comorbidity. An algorithm may incorporate the impact of other diseases on CEA and TIMP-1 levels in order to use these biomarkers as a detection tool for CRC in people with comorbidity. Both CEA and plasma TIMP-1 levels were significantly raised in patients with adenomas. This suggests the use of CEA and TIMP-1 as a means for detecting adenomas in patients. These days, an upper size limit of 1 cm is used; if the adenoma exceeds this size, the decision is made to proceed with therapeutic intervention. However, neither CEA nor TIMP-1 levels are able to separate adenomas of < 1 cm from adenomas > 1 cm (16). The OR for plasma TIMP-1 was slightly higher for colon cancer than for CRC, illustrating that TIMP-1 has a greater potential for detection of colon cancer. In a previous retrospective study, the AUCs of these two biomarkers were higher. This could be due to a more careful selection process with one well-defined group of CRC patients and another group of perfectly healthy blood donors without comorbidities. Other reasons for the higher AUCs in retrospective studies are a median age of under 60 years, archived samples of blood that were not as well preserved as in this study and the fact that blood donors are not comparable with high-risk patients.
Although these results are promising, it must be pointed out that CRC detection by plasma TIMP-1 and CEA cannot be used in a screening program just yet, since the data presented here are based on biomarker detection in a high-risk population. The next step before putting these biomarkers into practice should be a prospective RCT that finally validates the combination of CEA and plasma TIMP-1 as screening biomarkers for discovery of CRC (16).

The article of Abdullah and colleagues validated the recommended cut-off value for the CRC biomarker fecal tumor M2PK, namely 4 U/ml. Taking into account the fact that no AUC is calculated,
the relatively high sensitivity, specificity and low false-positive rate of M2PK (among IBD patients), compared to the rather low sensitivity and specificity and higher false-positive rate of the current method of screening through iFOBT, demonstrate that fecal tumor M2PK could be used as a more specific screening test for CRC in the near future. According to the authors, the false-negative results in this study are largely due to the dilution of the stool samples with toilet water. The false-positive results could have occurred as a result of bowel inflammation. It has already been demonstrated that patients with IBD colitis are at a higher risk of developing CRC. Therefore, it is suggested that the presence of positive M2PK levels in IBD patients is possibly caused by an ongoing premalignant process or ulcerative colitis-associated neoplasia. The iFOBT showed this pattern of high positivity rates in IBD as well. The fecal tumor M2PK biomarker has been studied for quite some time and this study indicates once more that it seems superior to the iFOBT. Consequently, fecal M2PK could be used as a complementary screening test for CRC (and possibly IBD). Before it can be put into practice, however, a prospective study should be performed not only on high-risk patients, as is the case here, but on a randomly selected group of healthy individuals (17).

The findings in the multi-center case-control study described by Ahlquist and colleagues show that next-generation sequencing can be a powerful tool in the detection of CRC and premalignant adenomas. Surprisingly, the detection rate of stage IV CRC dropped, compared to the detection rates of stages I to III. This could be due to hypomethylation in a later stage of the illness. A panel of biomarkers is promising as a screening program, considering the high sensitivity for not only CRC in the first few stages, but adenomas > 1 cm as well. In addition, this study is the first to reveal that stool DNA sensitivity increases with the size of adenomas of more than 1 cm. By way of illustration, the sensitivity rises from 78% to 85% for adenomas bigger than 3 cm. This study has a few strengths. The large number of CRC patients included results in a robust covariate analysis, and the blinded assays performed in different laboratories with a training and test set design provide objective data. Where this study falls short, however, is in the buffering and the failure to standardize the stool samples across the different medical centers, which could explain the dissimilarities in test performance between these centers. A frequently returning observation is that this population does not represent a screening population, seeing as it only concerns symptomatic patients referred to a gastroenterologist, often with a well-known history of colonoscopies and bowel diseases. Further validation of this panel in a screening population or by an RCT is required before it can be put into practice as a screening test (18).

SALL-4 is a stem-cell factor whose mRNA was studied by Khales et al. in a prospective cohort study. As a newly defined oncogene, SALL-4 has already been shown to be highly expressed in cancer cells and tissues and has been proposed as a possible new marker for metastatic CRC. In this study, SALL-4 mRNA displayed a higher sensitivity and specificity than CEA (96.1% vs. 80% and 95% vs. 82%), indicating that the former could indeed be regarded as a new diagnostic biomarker for CRC. Furthermore, the progression of CRC correlated with elevated levels of SALL-4 mRNA in the blood, which in turn correlated with clinical symptoms. This study was included in this systematic review because of the seemingly promising results and the prospective nature of the study, despite having fewer than 100 CRC patients participating. With an AUC of 0.981, SALL-4 mRNA could be a powerful diagnostic biomarker in the future. However, this prospective study was performed on a carefully chosen number of patients and healthy volunteers, thus selection bias could have interfered with the results. In order to validate SALL-4 mRNA as a reliable diagnostic biomarker that is ready for use, a prospective study on a randomly selected population of CRC patients and controls should be executed (19).

In a prospective cohort study, Calistri et al. suggested that a combined test using FL-DNA and iFOBT could improve the overall diagnostic accuracy for CRC. In previous studies, the FL-DNA from exfoliated cells that are excreted in stool showed a higher sensitivity and a comparable specificity for detecting CRC. The results of this study confirmed that iFOBT and fecal DNA levels are significantly higher in CRC patients. Furthermore, it was demonstrated that iFOBT and fecal DNA values were independent variables and that they were related to premalignant lesions but varied with patient characteristics (female vs male). A difference in FL-DNA values could not separate the group with low-risk adenomas from the group with high-risk adenomas. The iFOBT levels, however, were slightly higher in the high-risk group. In brief, the novelty in this study was that fecal DNA analysis could provide more detailed diagnostic information and that it could identify smaller subgroups with various probabilities of having CRC. On top of that, fecal DNA could, one day, be used not only in predicting neoplastic lesions but also in determining a patient’s risk of developing a (pre)malignant lesion. Therefore, it could be useful to implement the analysis of fecal DNA into a screening program alongside iFOBT in the future. However, for this to happen, a few issues must be addressed first. For instance, there is still no general agreement regarding the frequency with which the test should be performed and the number of stool samples needed to obtain the highest sensitivity and specificity. Like the iFOBT, fecal DNA analysis is not able to identify all high-risk adenomas, which is the prime objective of a screening program for CRC. Lastly, the participants in
this study all had iFOBT positive results. As a result, the findings in this study cannot be applied to the general population (20).

The detection of cancer through evaluating the levels of methylation in genes, also called methylation markers, is a more recent but expanding subgroup among the oncologic biomarkers. PHACTR3 was discovered as a protein associated with the nuclear scaffold in HL-60 leukemia cells, and although the exact function of PHACTR3 hypermethylation in cancer cells could not yet be demonstrated, Bosch et al. identified it as a potential detection marker in stool samples for CRC in a retrospective case-control study in 2011. This study did not include more than 100 patients with CRC, but considering the interesting results, it was included in this systematic review. The project began by examining advanced adenoma and CRC tissues. These tissues displayed high levels of PHACTR3 methylation, making it a promising marker for CRC screening. Upon further examination, a sensitivity of 55% to 66% for detecting CRC and a sensitivity of 32% for detecting advanced adenoma were reached at a cut-off value of 82.5 relative copies (21). When compromising the specificity (from 100% to 93%) by lowering the cut-off value, a sensitivity of 53% for the detection of advanced adenomas could be observed. Combining PHACTR3 and the already widely used iFOBT in an independent series of stool subsamples, the sensitivity for the detection of CRC or advanced adenomas rose from 21% to 33%. If the same high sensitivity and specificity can be obtained in future prospective studies and RCTs, PHACTR3 could become the single best performing methylation marker so far. Furthermore, PHACTR3 can be identified in a stool sample, which is an attractive approach for the screening of CRC. An interesting option for further improvement of the test performance would be to combine PHACTR3 with other (methylation) biomarkers such as the iFOBT. Additionally, some recommendations for future studies could be made.
The number of patients included in this study is rather small, and there is some discrepancy between the results of this case-control study, which only described PHACTR3, and those of the pilot study on subsamples of stool, which described the combined use of PHACTR3 and iFOBT. The sensitivity for detecting advanced adenomas is considerably lower in the pilot study than in the case-control study (21% vs 32%). The results of this study are promising, but in order to validate PHACTR3 as a reliable biomarker for the detection of advanced adenomas, prospective cohort studies with more CRC patients should be carried out before PHACTR3 can be put into use (21).

In the study of Chan et al, a five-biomarker panel, consisting of recombinant antigens for serum antibody screening, was tested in a retrospective case-control study. rCCCAP alone had the
highest sensitivity for the detection of CRC (35.1%), and the panel in its entirety had an overall sensitivity of 58.5%. In combination with CEA, the panel’s sensitivity for detecting CRC reached 77.6% without compromising the specificity. These results suggest that combining multiple biomarkers could be essential for accurate detection of cancer in the future. Previous studies suggest that there is a correlation between serum antibodies and clinicopathological characteristics, such as tumor location or patient survival rate. During this study, however, such a correlation could not be observed. The most interesting result of this study is the increase in sensitivity for the detection of early-stage CRC (from 21.9% to 65.9%) when combining CEA and the antigen panel. However, this study has a few deficiencies. The group of patients in Duke’s A stage (these were considered to be early-stage CRC patients) consisted of merely seven individuals. Furthermore, the authors pointed out that the use of partial recombinant proteins as screening antigens could decrease the seropositivity compared with the use of full-length or even more specific antigens. The aim of the study was to set up a new non-invasive tool for the detection of CRC through the use of antigen biomarkers, and Chan and colleagues provided a new biomarker panel that looks very promising. Nonetheless, before this panel can be put into practice, larger prospective cohort studies should validate these results (22).

The 29-gene panel, identified and studied by Ciarloni et al. in a multicenter case-control study, shows promise as a biomarker for discriminating healthy individuals from CRC patients or from patients with adenomas > 1 cm. The penalized logistic regression analysis shows a 75% and a 59% sensitivity for the detection of, respectively, CRC and adenomas > 1 cm, with a 91% specificity. The majority of the genes included in this panel are known mediators or regulators of inflammation, cell proliferation and cell survival. Ciarloni and colleagues acknowledged the fact that these genes (which are associated with inflammatory processes) might not be specific enough to safely exclude CRC. Therefore, the authors are planning a new prospective case-control study during which the gene panel will be evaluated in a test set of patients suffering from chronic inflammatory diseases such as Crohn’s disease or rheumatoid arthritis. The authors also suggested continuing the search for more discriminating biomarkers in the whole transcriptome, since the 29 genes identified in this study were discovered in a limited part of the transcriptome. An advantage of this gene panel is that it can be transferred and implemented into a routine lab test and therefore into a cost-effective screening assay in the future. The significance of the results in this study, however, is limited due to the small sample size and the low number of CRC patients included. Furthermore, the focus of this study lay on evaluating the prediction accuracy for adenomas. To further validate this gene panel, the above-mentioned new prospective study (that is currently being carried out) should include a larger
number of (early-stage) CRC patients, patients with adenomas > 1 cm as well as numerous patients with inflammatory diseases as a control group. This will allow Ciarloni et al. to fine-tune the algorithm and to be more specific and more sensitive in the detection of CRC (23). In the subsequent follow-up study, the MGMC and MGMC-P algorithms were applied to the aforementioned 29-gene panel. The MGMC-P algorithm can detect 78% of the patients with CRC and 52% of the patients with large adenomas. This last result is arguably the most interesting, given the fact that, these days, screening assays aim for the detection of premalignant lesions in order to remove them before they become malignant. Both algorithms displayed a similar sensitivity for CRC. Nonetheless, the specificity of the MGMC-P algorithm was higher (92.2% vs 90%). The positive rate of the algorithms in patients with IBD and other inflammatory GI diseases nearly reached 30%. This could potentially present a problem. However, most of these patients are already diagnosed by the age of 50 and are closely monitored by their respective doctors through a yearly colonoscopy. These promising results show that this gene panel could be used as a screening assay in the near future. A prospective cohort study in a real-life clinical practice setting is ongoing to validate these algorithms. If the outcomes of this cohort study are positive, an RCT should be performed in order to validate this panel and put it into practice as a screening tool for CRC (24).

Koga et al. set out to reduce the high rate of false negatives when using the iFOBT by adding a new biomarker test based on the expression of miRNA-106a. As already mentioned in the introduction, recent miRNA studies have demonstrated the role of miRNA-106a as a regulator in cancer, which implies that such miRNAs in plasma could become potential biomarkers for the detection of cancer. In this study, there was no significant difference between the true-positive subgroup and the false-negative subgroup as regards the expression of miRNA-106a. Nonetheless, a significant difference between the false-negative subgroup and the healthy volunteer group could be observed when using the FmiRT. Previous studies have already demonstrated that the expression of miRNA-106a is high in CRC tissue. It is known to suppress the expression of certain genes that are responsible for inhibiting cancer proliferation. Therefore, fecal miRNA-106a has shown itself to be a potential diagnostic biomarker for CRC. The sensitivity of the FmiRT that detects fecal miRNA-106a expression was rather low (34.2%), but in combination with the iFOBT, the sensitivity rose to 70.9% without compromising the specificity of these combined tests (96.3%). On top of that, a quarter of the CRC patients deemed false negative by the iFOBT were reported as true positive when combining iFOBT and FmiRT. This is a novelty and the use of miRNA-106a in combination
with the already broadly used screening tool iFOBT should be further investigated in prospective cohort studies and RCTs. Furthermore, these studies should include randomly chosen healthy controls with a negative colonoscopy instead of healthy volunteers. If the combination of the FmiRT and the iFOBT maintains a high sensitivity for the detection of CRC in these studies, it ought to be put into practice (25).

When comparing the efficiency of the iFOBT and of colonoscopy as a screening method for the detection of CRC, there are 3 important conclusions to be drawn. Firstly, approximately half of all patients with small, right-sided, non-metastatic CRC test negative when screened with the iFOBT. Secondly, large-scale studies have shown that the overall sensitivity of the iFOBT is limited. Thirdly, colonoscopy is the gold standard to confirm the diagnosis of CRC. However, it is challenging to reach the ascending colon or cecum in approximately 10% of the patients. Therefore, Koga and colleagues carried out this retrospective cohort study, in which they propose to use a DNA chip for detecting changes in genes with low expression levels in stool of patients with CRC. This new DNA chip assay with six genes displayed a higher sensitivity for detecting CRC in the subgroup of patients with right-sided CRC. More specifically, 90% of the patients with right-sided CRC were detected by the DNA chip assay. In addition, the overall sensitivity for detecting CRC was higher when using the DNA assay than when the iFOBT was used. Nevertheless, a few restrictions in this study must be addressed before this DNA assay can be put into practice. Only 53 patients with CRC took part in this study, and only 13 patients with CRC were included in the validation cohort. The subgroup tested in this study is even smaller. For that reason, large-scale prospective cohort studies and RCTs are needed. Further studies can demonstrate whether this DNA chip can be used as a reliable screening tool for CRC in the future. Although the number of patients with CRC in this study is far below the inclusion criterion of more than 100 patients with CRC, this study was included because of its interesting concept and promising results (26).

Despite multiple screening models already in place and awareness campaigns all around the world, patient compliance with screening tests for CRC remains low. This is partly due to patients’ fear of experiencing discomfort during a colonoscopy. In order to expand the diagnostic arsenal of medical practitioners, Marshall and colleagues set out to develop a blood-based risk stratification test for CRC. The current relative risk of each participant, which was calculated in this study, indicates that this blood test could be used for risk stratification in the future. The 7 genes included in the biomarker panel are probably not tumor-specific biomarkers. Instead, they display the alterations in gene expression due to a systemic response against the tumor. The general implementation of risk stratification in future CRC screening could be an ambitious effort to improve patients’ participation in screening tests. However, a few hurdles must be overcome before it can be put into practice. Firstly, this risk classification should be tested in other independent prospective studies or RCTs. Secondly, the implementation of this risk stratification in the screening program has yet to be worked out. Thirdly, although the sensitivity of this new blood test is acceptable and the discomfort of getting a blood test is minimal, the specificity reaches only 70%, which implies a large number of false positives. However, when colonoscopy capacity is limited, combining prescreening and colonoscopy could detect 2.1 to 4.7 times more cancers (27).

The screening study of Meng and colleagues evaluated the potential use of serum M2-PK as a diagnostic biomarker. Overall, this study demonstrated that serum M2-PK had a higher diagnostic value for CRC than serum CEA. Moreover, serum M2-PK had a higher AUC for detecting early- and late-stage CRC, advanced adenomas and regular adenomas. In particular, the sensitivity of M2-PK was superior to that of CEA for diagnosing colorectal lesions, although M2-PK displayed a slightly lower specificity than CEA. The authors stated that a new blood test with a higher sensitivity and the ability to detect CRC in early stages is preferable to the presently used fecal test (iFOBT), which has a higher specificity. On top of that, Meng et al. pointed out that patient compliance with a screening program using a blood test is predicted to be much higher than compliance with a screening program involving a fecal test and colonoscopy. Although the results of this study are favorable, a few issues need to be addressed. It is tempting to draw conclusions from a single screening study and project them onto the general population. Nevertheless, in order to verify these results, M2-PK should be studied in larger screening studies or RCTs (28).

In the article of Terhaar sive Droste et al., the performance of various cut-off values for the iFOBT was evaluated in a screening study. It was clearly demonstrated that increasing the cut-off level also increased the test’s specificity. Meanwhile, the decline in detection rates of early-stage CRCs was acceptable (5.3%). The Dutch Health Council currently advises starting screening at a cut-off of 75 ng/mL. In this study, however, changing the cut-off level to >200 ng/mL led to a substantial reduction in (falsely) positive test results, while only two early-stage CRCs were missed. The specificity rose from 86.4% at a cut-off level of 75 ng/mL to 92.8% at a cut-off level of 200 ng/mL. The sensitivity for the detection of relevant neoplasia was calculated at 47.1% and 37.2% for the cut-off levels of >50 ng/mL and >200 ng/mL, respectively. These results demonstrated that applying a higher cut-off level in the first rounds of a screening program (on a younger population) could drastically reduce referrals for colonoscopy at the price of an acceptable reduction in detection power. In later rounds of screening on an older population, when the prevalence of early-stage CRC has increased, the cut-off value could be lowered in order to attain a greater sensitivity for advanced adenomas or early-stage CRC. The theory behind this proposal relies on the fact that missing an advanced adenoma in the first rounds is acceptable, seeing as there are still multiple occasions to detect these adenomas or early-stage CRC in a later round. This study has a few advantages, two of which are that it provides data on multiple cut-off levels for individuals who all underwent colonoscopy (not solely those with a positive iFOBT) and that the referral population contained more CRC patients than the average-risk screening population does. However, for the results of this study to be generalized to a screening population, the risk of spectrum bias (this is a referral population) and the resulting overestimation of sensitivity must first be analyzed. This issue could be solved in a future study by using a fully colonoscopy-controlled screening population. The most interesting consequence of raising the cut-off value for the iFOBT is the decrease in unnecessary colonoscopies and, therefore, in costs. If the compliance of the screening population increases and the adenomas missed in the first round can be detected in future screening rounds, the cost-effectiveness of the iFOBT would improve considerably (29).
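The effect of the cut-off level on colonoscopy referrals can be made concrete with a short calculation. The Python sketch below is an illustration only and is not part of the reviewed study: it converts the two specificities reported above (86.4% at 75 ng/mL and 92.8% at 200 ng/mL) into the expected number of false-positive results. The function name and the cohort of 1,000 individuals without relevant neoplasia are assumptions chosen for illustration.

```python
def false_positives(specificity: float, disease_free: int) -> float:
    """Expected number of false-positive results among disease-free individuals."""
    if not 0.0 <= specificity <= 1.0:
        raise ValueError("specificity must lie between 0 and 1")
    return (1.0 - specificity) * disease_free


# Specificities reported in the study at the two cut-off levels.
reported_specificity = {"75 ng/mL": 0.864, "200 ng/mL": 0.928}

# Hypothetical cohort of people without relevant neoplasia (assumption).
COHORT = 1000

for cutoff, spec in reported_specificity.items():
    expected = false_positives(spec, COHORT)
    print(f"cut-off {cutoff}: about {expected:.0f} false positives per {COHORT}")
```

Under these assumptions, the higher cut-off lowers the expected false-positive results from about 136 to about 72 per 1,000 disease-free individuals, which is the reduction in unnecessary colonoscopies discussed above.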

In the study of Jin et al., the performance of the second-generation SEPT9 test was evaluated. In comparison to the first-generation test, three PCR replicates are now used instead of two. This study concluded that this new blood-based test had a sensitivity of 74.8% and a specificity of 87.4% for the detection of CRC. In the previous first-generation SEPT9 studies, sensitivities of 58% to 69% and specificities of 86% to 90% were observed. The results of this study indicate that the second-generation SEPT9 test is an improvement in sensitivity over the first-generation test, at a comparable specificity. Furthermore, no association with possible confounding factors such as age, gender and tumor localization was found. However, the sensitivity for the detection of advanced adenomas was mediocre (27.4%). A possible explanation for this result is that adenoma cells do not release as much SEPT9 into the bloodstream as CRC cells. Equally important was the fact that the iFOBT performed less well than the SEPT9 assay, with a sensitivity of 58% and a specificity of 82.4% for the detection of CRC. The authors pointed out some limitations of their study. This was a retrospective case-control study; in order to validate these results, a prospective cohort study in a screening setting should be carried out. Additionally, no attention was given to cost-effectiveness or patient compliance in this study. Nevertheless, some studies suggest that patients who refrain from participating in screening studies with fecal tests would be more willing to take a blood test (30).

In the article of Wilhelmsen and colleagues, the performance of an eight-protein panel was evaluated. The univariable analysis revealed significant associations between all eight protein biomarkers and the occurrence of CRC and high-risk adenoma. In the multivariable analysis, the panel displayed an AUC of 0.76 for the detection of CRC and high-risk adenomas, which is a significant improvement over the results of each biomarker separately. Comorbidity appeared to have only a slight influence on the detection rates, while benign lung disease had a significant association with increased biomarker levels. The authors pointed out that this study did not follow the REMARK guidelines in detail but suggested that, ideally, these results would be validated through prospective clinical trials in the near future; a blood-based screening test for CRC could then replace the currently used stool-based screening programs. As mentioned before, a blood-based test is expected to achieve a higher compliance rate than a stool-based test. It is also acknowledged that all participants in this study represented a high-risk population; therefore, it is not advised to project these results onto the general population. Furthermore, the results of this study demonstrated that this panel could detect early-stage CRC and high-risk adenomas. Screening studies should focus on detecting cancer at an earlier stage and on removing these premalignant lesions in order to improve survival rates (31).

When assessing all studies discussing CRC biomarkers in this systematic review, some conclusions can be drawn. Although most of the results are promising, all authors called for prospective cohort studies with a larger number of CRC patients, RCTs or broad screening studies to confirm their results. On top of that, the populations of the prospective studies in this review often consisted of high-risk individuals referred for colonoscopy after a positive iFOBT. As a result, the detection rates in these studies could be overestimated. Furthermore, carrying out a prospective study, RCT or screening study demands a lot of work, time and money, which is why good prospective studies are scarce. A recurring suggestion in some of the later articles was the possible replacement of a stool-based screening program by a blood-based test. The advantages of a blood-based test seem to include a higher patient compliance rate and a more cost-effective way of detecting CRC. Lastly, the field of potential screening biomarkers has expanded immensely over the last few years and has almost reached a point where it is no longer possible to keep a clear overview of all biomarkers under investigation. Therefore, a case could be made for centralizing resources and focusing on a few very promising biomarkers in a small number of costly prospective studies or RCTs.

4.2 HEPATOCELLULAR CANCER

The CART algorithm is the subject of the study performed by Chen et al. Using SELDI-TOF-MS, they created a model from five protein peaks observed in 120 patients with HCC. These proteins can be detected in serum and could potentially offer an alternative approach for detecting HCC. Four proteins were identified as RLN2, TAC1, ADCYAP and CPAMD1; the fifth had not yet been identified. In comparison to AFP20, the CART algorithm performed better in terms of sensitivity, specificity and DOR. More specifically, the DOR of CART was 10 times higher than the DOR of AFP20. Especially interesting was the high sensitivity of CART for the detection of stage I and II HCC (87% and 89%, respectively, in the validation set). These results suggest that the CART model could be an improvement over AFP20 as a diagnostic biomarker. In addition, previous studies have concluded that combining various biomarkers improves the performance of screening tests. This conclusion definitely applies to this study, seeing as the combination of AFP20 and CART displayed a considerably higher sensitivity, specificity and DOR. Consequently, the combination of CART and AFP20 is a potentially powerful biomarker for the future detection of HCC. The next step in validating this algorithm will be a clinical trial in multiple centers (32).
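Because the comparison above relies on the diagnostic odds ratio (DOR), a brief sketch of how the DOR follows from sensitivity and specificity may be helpful. The Python example below uses the standard definition of the DOR; the function name and the two pairs of input values are hypothetical and are not the figures reported by Chen et al.

```python
def diagnostic_odds_ratio(sensitivity: float, specificity: float) -> float:
    """DOR = (sens / (1 - sens)) / ((1 - spec) / spec)."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    # Odds of a positive test among diseased individuals.
    odds_in_diseased = sensitivity / (1.0 - sensitivity)
    # Odds of a positive test among non-diseased individuals.
    odds_in_healthy = (1.0 - specificity) / specificity
    return odds_in_diseased / odds_in_healthy


# Hypothetical values for illustration only (not taken from the study).
examples = {"test A": (0.90, 0.90), "test B": (0.60, 0.80)}

for name, (sens, spec) in examples.items():
    print(f"{name}: DOR = {diagnostic_odds_ratio(sens, spec):.1f}")
```

With these hypothetical inputs, test A has a DOR of about 81 and test B a DOR of about 6. This shows how moderate gains in both sensitivity and specificity can produce a DOR that is an order of magnitude higher.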

The prospective study of Jirun et al. (with a study population of 2040 patients) assessed the performance of three biomarkers: AFP, des-gamma-carboxy prothrombin (DCP) and HCCR-1. Furthermore, an assay combining these three biomarkers was developed and evaluated. The results of this study are very promising. The combination of these three biomarkers resulted in an AUC of 0.891 and a sensitivity of 75.4% for the detection of HCC. On top of that, each biomarker significantly differentiated healthy people from HCC patients. In addition, the sensitivities for detecting small HCC (<2 cm) reached 39.7%, 11.5% and 51.9% for AFP, DCP and HCCR-1, respectively. HCCR-1 displayed the highest specificity of all three biomarkers, and its inclusion in the three-marker assay boosted the overall specificity. For that reason, it is concluded that HCCR-1 could be a useful diagnostic biomarker for small HCC and for latent HCC in a cirrhotic liver, owing to its high specificity rather than its sensitivity, which is low (41.6%). This extensive prospective study has several advantages: the high number of participants, their prospective enrollment and the thorough evaluation of the test’s performance. However, the results for HCCR-1 are not entirely convincing. The low sensitivity of HCCR-1 for HCC means that, in the future, it can only be used in combination with other biomarkers that present a higher detection rate (33).

The aim of the multi-center nested case-control study of Luo et al. was to assess the performance of a newly identified metabolite panel. After univariate and multivariate analyses, two blood-based metabolite biomarkers were selected and combined in a panel. These metabolites were Phe-Trp and GCA, two metabolites of bile acids, which were significantly increased in patients with liver disease. During the study, it was established that the serum biomarker panel was better than AFP at distinguishing patients with HCC from a high-risk population with cirrhosis and from healthy controls (AUC of 0.930, 0.892 and 0.807 for the biomarker panel vs 0.657, 0.725 and 0.650 for AFP in the three cohorts). The diagnostic accuracy for detecting HCC and cirrhosis ranged from 86% to 92.5% and from 63.8% to 82.9%, respectively. The most remarkable and interesting result was the high sensitivity of the biomarker panel (92.1% in the validation set). Another notable observation was the high AUC of 0.866 for the detection of small HCC (which is considered to be early-stage HCC) in the test cohort, which was confirmed in the validation cohort. As a consequence, this biomarker panel could be of use for detecting early-stage HCC in a clinical setting. This study has a few advantages and some limitations. Luo and colleagues explored the use of a metabolite biomarker panel for the detection of early-stage HCC and analyzed the performance of this panel in combination with AFP. Subsequently, the results of this nested case-control study were validated. However, only a limited number of metabolites were examined for use as diagnostic biomarkers, due to the trade-off among coverage, throughput and cost. Moreover, merely 42 patients participated in the nested case-control study during the three-year follow-up period. In order to put this biomarker panel into use, a prospective study or RCT is required to confirm these results (34).

The study performed by Shi et al. explored the potential of PRDX3 as a diagnostic biomarker for HCC detection. At present, the number of clinically applicable biomarkers for early detection of HCC remains limited. Until now, only AFP has displayed acceptable results as a diagnostic and prognostic biomarker for HCC. PRDX3 is a c-Myc target gene that plays a role in mitochondrial homeostasis and neoplastic transformation. Earlier studies have already determined that PRDX3 is overexpressed in a multitude of cancers, such as prostate cancer, cervical cancer and CRC. In this study, PRDX3 showed significantly elevated levels in patients with HCC, compared to patients with cirrhosis or to healthy controls. The AUC of serum PRDX3 was higher than the AUC of AFP, with a sensitivity of 85.9% and a specificity of 75.3%. Furthermore, PRDX3 could be useful in differentiating HCC patients from patients with cirrhosis, as it displayed an AUC of 0.717 with a sensitivity of 73.2% and a specificity of 69.0%. The observed association between serum PRDX3 levels and AFP levels, tumor diameter, TNM stage and portal vein invasion suggested that PRDX3 plays a role in HCC development. The most notable result in this study, however, was that PRDX3 showed promise in detecting early-stage HCC, especially in comparison to AFP. In patients with early-stage HCC, higher levels of PRDX3 were observed than in healthy controls. Moreover, this retrospective cohort study demonstrated a significant association between a low survival rate and a high PRDX3 expression. Thus, these findings suggest that PRDX3 could be used as a diagnostic and/or prognostic biomarker for HCC. Nevertheless, additional studies in a clinical setting and prospective cohort studies with a larger number of early-stage HCC patients should be carried out to confirm these results before this biomarker can be put into practice. A particularly interesting study would be the evaluation of PRDX3 as a prognostic biomarker for the recurrence of HCC after surgical resection (35).

The study of Lin and colleagues focused on the identification and validation of a miRNA panel. The levels of the seven miRNAs were significantly elevated in patients with HCC, compared to the levels in patients with chronic hepatitis B or HBV-induced liver cirrhosis, in healthy individuals and in HBsAg carriers. The classifier (Cmi), a panel combining these seven biomarkers, had a higher sensitivity than AFP in distinguishing HCC patients from patients without cancer. In all cohorts, the classifier displayed a higher sensitivity than AFP but a similar specificity for the detection of HCC. What is more, the panel had a higher sensitivity and higher detection rates for small and early-stage liver cancer. AFP-negative HCC patients can also be detected by evaluating the levels of these seven miRNAs in serum. The most notable result, however, was the fact that this miRNA panel could identify preclinical hepatocellular carcinoma 12 months before diagnosis, albeit with a rather limited sensitivity. This study thus highlights the potential diagnostic value of this classifier as a non-invasive tool for the detection of hepatocellular carcinoma, even in preclinical stages. Two important strengths of this study were the high-throughput screening for serum miRNAs that showed differences in concentration between patients with HCC and at-risk controls, and the inclusion of both healthy and at-risk controls. As HBV is a major risk factor for HCC and liver cirrhosis, it was essential to analyze the effect of chronic HBV infection on the performance of this classifier. However, a possible shortcoming of the study was the limited number of patients (only 27) in the nested case-control study. Furthermore, practically all cases of HCC were probably caused by HBV infection, meaning that there was little diversity regarding the type of liver cancer. In summary, the findings of this study suggest that this miRNA panel could be a possible diagnostic and predictive biomarker for screening applications in the future. In order to use this classifier in a clinical setting, however, more prospective studies and RCTs will be needed to confirm these promising results (36).

Few studies, like the one performed by Kumada and colleagues, have focused on the early prediction of the development of HCC in high-risk patients. For that reason, hs-AFP-L3 was evaluated as a possible predictive biomarker for HCC in a retrospective case-control study. Previous studies pointed out that hs-AFP-L3 concentration correlates with AFP but AFP-L3% does not correlate with AFP. As a consequence, AFP-L3% was selected for further analysis. In the present study, hs-AFP-L3 levels (with a cut-off value of 7%) were significantly increased 12 months before HCC diagnosis in 34.4% of all HCC patients (including patients with small HCC). Furthermore, positivity rates for hs-AFP-L3 were high in patients with changes in echo pattern 12 months prior to diagnosis. The survival rate of patients with hs-AFP-L3 > 7% at -1 year was significantly lower than that of patients with hs-AFP-L3 < 7%. As regards tumor size and number, no significant difference was found between patients with hs-AFP-L3 > 7% and patients with hs-AFP-L3 < 7%. AFP and DCP levels were not significantly increased 12 months before HCC diagnosis. The three biomarkers combined had a sensitivity of 60.6% at diagnosis and should be used to complement changes in echo patterns. Ultimately, the results of this study indicate that elevated hs-AFP-L3 could be used as an early predictive biomarker of HCC, especially at low AFP levels and when no changes in echo patterns are observed. Diagnosing and treating HCC at an earlier stage could drastically improve overall survival, as patients with early-stage HCC have a greater chance of receiving curative treatment. For that reason, hs-AFP-L3 could become a vital biomarker for predicting HCC in the future (37).

The accuracy of DCP and AFP as predictive biomarkers was compared in the nested case-control study by Lok and colleagues. DCP displayed a higher accuracy than AFP between month -12 and the time of diagnosis. However, the difference was not statistically significant. At a cut-off value of 40 mAU/mL, DCP had a sensitivity of 74% and a specificity of 86% at the time of diagnosis, and a sensitivity of 43% and a specificity of 94% at -12 months, for distinguishing HCC cases from controls. At a cut-off value of 20 ng/mL, AFP displayed a sensitivity of 61% and a specificity of 81% at the time of diagnosis, and a sensitivity of 47% and a specificity of 75% at -12 months. Until now, the low sensitivity and specificity of AFP have led the authorities to recommend AFP only as an additional tool for diagnosing or surveilling HCC, complementary to ultrasound. The conclusion of this study was that neither DCP nor AFP alone is adequate for the detection of HCC, but that the combination of both biomarkers improved the sensitivity, indicating that these two biomarkers are complementary. The sensitivity of DCP and AFP together was 91% at month 0 and 73% at month -12. The strength of this study lay in the shared underlying cause of liver cancer in all patients, namely hepatitis C. On top of that, matched controls were selected and prospective follow-up of at least 1 year was carried out. The limitations of the study were the small number of HCC cases and the incomplete availability of samples at selected points in time from month -12 to month 0. Furthermore, these results cannot be generalized to patients with liver cancer that is not caused by HCV. To confirm the potential of combining AFP and DCP as a predictive and diagnostic biomarker, prospective studies should be conducted. Until then, ultrasonography remains the preferred tool for detecting early HCC and monitoring HCC development in high-risk populations (38).
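The gain in sensitivity obtained by combining two markers can be illustrated with a simple model. The Python sketch below assumes a parallel rule (a patient counts as positive when either marker is positive) and assumes that the two markers err independently. Both are simplifying assumptions made here for illustration; the text above does not state how Lok and colleagues combined the markers, so this is not a reproduction of their analysis. The function name is hypothetical.

```python
def parallel_combination(sens_a: float, spec_a: float,
                         sens_b: float, spec_b: float) -> tuple[float, float]:
    """Sensitivity and specificity of an 'either test positive' rule.

    Assumes the two tests are conditionally independent, which is a strong
    assumption that real biomarkers rarely satisfy exactly.
    """
    # A case is missed only if both tests miss it.
    combined_sens = 1.0 - (1.0 - sens_a) * (1.0 - sens_b)
    # A control is correctly negative only if both tests are negative.
    combined_spec = spec_a * spec_b
    return combined_sens, combined_spec


# Single-marker values reported above: DCP (40 mAU/mL) and AFP (20 ng/mL).
time_points = {
    "month 0": (0.74, 0.86, 0.61, 0.81),
    "month -12": (0.43, 0.94, 0.47, 0.75),
}

for label, values in time_points.items():
    sens, spec = parallel_combination(*values)
    print(f"{label}: sensitivity {sens:.1%}, specificity {spec:.1%}")
```

Under these assumptions, the combined sensitivity is roughly 90% at month 0 and 70% at month -12, which is in the same range as the reported 91% and 73%. The sketch also shows the price of a parallel rule: the combined specificity falls below that of either marker alone.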

The study of Miura and colleagues was based on the belief that conventional biomarkers like AFP and DCP do not perform well enough with regard to diagnosing HCC. Therefore, they set out to analyze the potential of hTERTmRNA as a diagnostic biomarker in a multi-center case-control study. In the past, it was thought that hTERTmRNA could not be detected in human serum because of its instability. However, it was shown that RNAs are generally stable within 24 hours after drawing blood. hTERTmRNA is of interest not only with regard to liver cancer: in a study concerning breast cancer, hTERTmRNA displayed a sensitivity of 40% and a specificity of 100%. For liver cancer, however, the sensitivity of hTERTmRNA in this study was much higher (89.7%). Furthermore, 24 patients with late-stage HCC showed hTERTmRNA levels under the cut-off value of 9.332. These patients suffered from decompensated liver cirrhosis and displayed elevated TGF-beta, which breaks down hTERTmRNA, suggesting that hTERTmRNA could be used to monitor disease progression. On top of that, serum expression of hTERTmRNA could be detected in HCC patients with tumors of less than 10 mm in diameter, which suggested that hTERTmRNA has potential as a diagnostic biomarker for the early detection of HCC. The number of HCC patients in this study is high and the results are promising. However, the cost of the hTERTmRNA assay and the limited period of time in which hTERTmRNA is stable in serum samples could lead to practical issues. For that reason, the authors are currently working out a way to improve the RNA stability and PCR conditions. Large-scale prospective cohort studies are required to confirm these results before hTERTmRNA can be used as a diagnostic or surveillance biomarker (39).

The cross-sectional study of Wang and colleagues explored the use of anti-URGs as possible diagnostic biomarkers for detecting HCC patients or high-risk patients with HBV, HCV or cirrhosis. Previous studies have already established that, with regard to the development of HCC, patients often exhibit no symptoms until the HCC reaches an advanced stage. At present, although advancements are being made, there is only a limited number of therapeutic options that can be offered to patients in later stages of liver cancer. This underscores the importance of creating a surveillance program for high-risk patients and identifying tumors at an earlier stage. Previous studies have suggested that HBV and HCV alter the expression of genes that play a role in hepatocellular growth, survival and tumorigenesis. After analyzing the levels of antibodies against these URGs, the results of this study confirmed that anti-URGs are increased in HBV carriers with cirrhosis and/or HCC. ROC analysis for differentiating patients with cirrhosis and HCC from healthy controls, HBV carriers and CRC patients showed an AUC of 0.721 with a sensitivity of 58.3% and a specificity of 80%. The other biomarkers (AFP, AFP-L3, GPC3 and GP73) all displayed a sensitivity under 50%. Combining anti-URGs with these biomarkers increased the sensitivity to 75%. Although the results of this cross-sectional study are promising, this is merely an observational study. Therefore, future prospective studies that analyze the performance of a panel that includes anti-URGs and one of the aforementioned biomarkers are strongly recommended (40).

The utility of combining biomarkers and clinical variables in an algorithm was evaluated in the study conducted by Wang et al. More specifically, both the performance of the biomarker AFP alone and its performance after the inclusion of age, gender and serum ALK and ALT levels were investigated. ROC analysis revealed that including these variables increased the AUC by 4% to 12% compared with AFP alone. Equally important was the observation that combining multiple factors did not compromise the overall specificity of the Doylestown algorithm. The study was both internally and externally validated in a discovery cohort and three validation cohorts with a total of more than 3000 participants. The authors concluded that this model could be a beneficial replacement for AFP as a tool for the detection of (early-stage) HCC. However, there was some concern about the algorithm’s performance due to the potential variation in test accuracy between different laboratories. These assay variations, observed while measuring concentrations of ALK, ALT and AFP, could amount to up to 15%. Furthermore, selection bias could have occurred in the validation cohorts, seeing as only patients displaying certain clinical features (HBV+, cirrhosis,…) were selected. In order to validate the results of this study, more longitudinal and prospective studies will have to be performed (41).

The aim of the study performed by Zekri was to analyze multiple miRNAs in the sera of HCC patients, LC patients, chronic hepatitis patients and healthy controls. Upregulated expression of miR-885-5p, miR-181b and miR-221 in HCC patients suggested that these miRNAs play an oncogenic role during hepatocarcinogenesis. On the other hand, the downregulation of miR-29b and miR-199a-3p (in early-stage HCC) suggested that these two miRNAs are potential tumor suppressors. When comparing the three studied groups (HCC, LC and CHC) with the control group, only miR-885-5p showed significant overexpression in HCC and LC groups and significant underexpression in the healthy control group, suggesting its utility as a marker of liver injury. ROC analysis showed that combining multiple miRNAs resulted in a higher diagnostic accuracy. The AUCs ranged from 0.9 to 1.0 for the detection of HCC patients. Similar results were observed when combining these panels with AFP, seeing as the AUC for distinguishing patients with HCC from patients with LC ranged from 0.9 to 1.0. The miRNA panel consisting of miR-122, miR-885-5p and miR-29b with AFP displayed the highest accuracy for the early detection of HCC among the group of healthy controls (AUC=1). The miRNA panel of miR-122, miR-885-5p, miR-221 and miR-22 with AFP provided the highest AUC for detecting HCC among LC patients (0.982). Furthermore, the panel of miR-22, miR-375, miR-885-5p and miR-221 with AFP was found to have the highest AUC for distinguishing LC patients from healthy controls. Lastly, the panel with miR-125a-5p, miR-375, miR-885-5p, miR-29b and miR-602 with AFP showed the highest accuracy in detecting patients with chronic hepatitis (AUC=1). Overall, this appears to be a solid case-control study. The number of patients involved was sufficient and the findings suggest that serum miRNAs could become the next gold standard for detecting early-stage HCC. However, as is the case for many of the articles included in this review, independent prospective cohort studies must be performed in the future to confirm these results and to safely use these panels as diagnostic tools (42).

4.3 LUNG CANCER

The prospective study conducted by Li and colleagues explored the potential of survivin and livin as diagnostic biomarkers for the detection of lung cancer. Up until now, the gold standard for diagnosing lung cancer has been bronchoscopy with subsequent histological analysis. However, in a large number of patients, bronchoscopy fails to provide a correct diagnosis, especially when peripherally located tumors beyond the visible segmental bronchi are involved. Therefore, alternative methods for the detection of lung cancer are being studied. One of those new methods consists of using diagnostic biomarkers in bronchial aspirates. Survivin and livin are two known inhibitor-of-apoptosis proteins, and their mRNA might prove invaluable in terms of successfully detecting lung cancer in the future. In this study, results showed that survivin and livin mRNA levels are significantly higher in bronchial aspirates of patients with lung cancer than in those of patients with benign lung disease. These findings are supported by the fact that significantly higher levels of survivin and livin mRNA were observed in bronchial aspirates obtained from cancerous bronchi than in those taken from the healthy mirror sides. Furthermore, ROC analysis demonstrated that the detection of survivin mRNA in bronchial aspirates displays a high sensitivity and specificity at a cut-off of 3.5, which means that it could be a useful marker for lung cancer diagnosis (AUC=0.826). Moreover, it must be mentioned that survivin presented positive results in 18 patients in whom histology and cytology analysis turned out negative. However, at a later stage, the pulmonary lesions found in these patients were shown to be malignant. Furthermore, levels of survivin and livin were significantly elevated in 10 patients with early-stage lung cancer, which suggests that these two biomarkers might be useful in early detection. The second biomarker, livin, displayed moderate results with respect to the detection of lung cancer. The AUC was 0.676, with a sensitivity of 63% and a specificity of 92% for diagnosing lung cancer. Although these results are promising, the data in this study should be interpreted with care. The relatively low number of patients included and the non-randomized manner of selection inevitably resulted in a selection bias. Prospective controlled studies with larger sample sizes are required before survivin and livin can be put into practice as diagnostic biomarkers (43).

The case-control study on metabolic changes in blood plasma as a detection tool for lung cancer was conducted by Louis in Belgium. This study demonstrated that a model based on the metabolic phenotype of a person could correctly classify 78% of all patients with lung cancer and 92% of all controls. ROC analysis of this model yielded an AUC of 0.88. What is more, in an independent cohort, it was shown that this model can differentiate between patients with lung cancer and controls with benign lung disease. The sensitivity, specificity and AUC calculated in this cohort were 71%, 81% and 0.84, respectively. Proton nuclear magnetic resonance-based metabolomics allowed a fast, non-invasive identification of alterations in the concentration of metabolites and required only minimal sample preparation. Low-dose CT (LD-CT) is currently the most common tool in the screening of high-risk patients for lung cancer and displays a sensitivity of 85%, a specificity of 99% and a disappointing PPV ranging from 4% to 40%. This means that many patients are referred for further examination on the basis of false-positive results. Therefore, the aim of this study was to find an alternative tool for detecting (early-stage) lung cancer or to look for an additional factor to be incorporated in today’s risk stratification models used for referral. The paper sought to validate H-NMR metabolic phenotyping of blood plasma as a complementary means of distinguishing patients with lung cancer from controls. However, this study demonstrated that the diagnostic accuracy of this metabolic profiling is not superior to that of LD-CT and therefore it cannot be used as a screening tool on its own (44).

The retrospective case-control study by Chen and colleagues evaluated a new risk score formula based on elevated levels of ten miRNAs in NSCLC patients. The use of this 10-miRNA biomarker panel as a detection tool for NSCLC could have several advantages. Firstly, it is a serum-based biomarker, which means that it does not require invasive techniques such as a biopsy. Secondly, the test has a low cost, and sample collection and processing are easy. Thirdly, this formula is not based on one individual miRNA but on a panel of multiple miRNAs, which improves the detection rate because a panel of multiple miRNAs reflects different aspects of tumorigenesis. The sensitivity, specificity and AUC for NSCLC detection were 93%, 90% and 0.97, respectively. These results were significantly higher than those of CYFRA21-1 (with an AUC of 0.84, a sensitivity of 50% and a specificity of 95%), TPS and CEA. Furthermore, as for the detection of early-stage NSCLC, six out of seven patients were correctly classified when using this formula. In conclusion, it was demonstrated that this formula, based on the expression profile of 10 miRNAs in serum, could be used as a highly sensitive non-invasive tool for the future detection of NSCLC. However, to examine the use of this panel in real-life conditions, more prospective studies and RCTs should be carried out (45).

The study of Doseeva et al. describes the performance of a 4-biomarker panel, also known as PAULA’s assay, as a potential detection tool for (early-stage) lung cancer. The four biomarkers in this panel were CYFRA21-1, CEA, CA-125 and NY-ESO-1 (the first three being antigens and the last an auto-antibody). A multiple of the median (MoM) algorithm was used to calculate a risk score and to determine the patient’s risk of having lung cancer. The 4-marker panel performed better in differentiating lung cancer patients from controls than each of the biomarkers individually. The sensitivities were 72.7% and 82.8% for early- and late-stage NSCLC detection, respectively, with a specificity of 80%. Previous studies demonstrated that a biomarker panel consisting exclusively of antigens provided inconsistent results regarding the detection of early-stage lung cancer. Adding NY-ESO-1 in this study resulted in an increased sensitivity for NSCLC detection (from 67.9% to 72.4%), largely due to a remarkable increase in early-stage lung cancer detection (from 65.2% to 72.7%). However, the finding that combining multiple antigens in a biomarker panel results in only a moderate increase in diagnostic accuracy suggests that tumors often overlap in the tumor protein markers they express. This is not the case for auto-antibodies, which suggests that these could be used as biomarkers to detect certain specific types of cancer.

Additionally, the performance of this biomarker panel was analyzed in a group of 81 patients with benign lung diseases. Eight false-positive test results were observed, five of which were found among the samples of COPD patients. This is not a surprising result, seeing as COPD was already established as an independent risk factor for lung cancer. This blood-based biomarker panel could be implemented in the screening program for high-risk patients in two ways: before diagnostic CT, to assess the risk of lung cancer, and after CT, to help interpret CT results when the diagnosis is unclear. A limitation of this study was that the study population did not accurately represent the expected distribution of cancer types, stages or frequency of benign diseases found in a high-risk smoking population. Therefore, prospective studies with samples from high-risk populations are required to further validate the utility of PAULA’s assay for clinical use. Lastly, the authors suggested that adding other biomarkers to this panel might further increase the sensitivity and specificity for the detection of early-stage lung cancer (46).
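
The multiple of the median (MoM) normalization mentioned above can be illustrated with a short Python sketch. A MoM value is simply an observed marker level divided by the median level in a reference group. The way the MoM values are combined into a score below (a sum of logarithms) is an assumption made purely for illustration and is not the published algorithm of Doseeva et al.; the marker levels and the function names are likewise hypothetical.

```python
import math
from statistics import median


def mom(value, reference_values):
    """Multiple of the median: observed level / median of a reference group."""
    return value / median(reference_values)


def illustrative_score(patient, reference):
    """Sum of log(MoM) over all markers.
    ASSUMPTION: this combination rule is illustrative only, not the
    rule used in the study discussed in the text."""
    return sum(math.log(mom(patient[m], reference[m])) for m in patient)


# Hypothetical reference (control) levels and one hypothetical patient.
reference = {"CEA": [1.0, 2.0, 3.0], "CYFRA21-1": [1.0, 1.5, 2.0]}
patient = {"CEA": 4.0, "CYFRA21-1": 3.0}

for marker in patient:
    print(marker, "MoM =", mom(patient[marker], reference[marker]))
print("Illustrative score:", round(illustrative_score(patient, reference), 3))
```

Expressing each marker as a MoM puts markers with different units and ranges on a common scale before they are combined, which is presumably why this kind of normalization is attractive for multi-marker panels.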

The aim of the retrospective study conducted by Gumireddy was to identify and validate AKAP4 as a potential diagnostic biomarker for the detection of NSCLC. The premise of this study was the need for new detection tools, complementary to LD-CT, for the diagnosis of NSCLC and for more sensitive screening of high-risk patients. Although the National Lung Screening Trial has demonstrated that a 20% reduction in lung cancer mortality is associated with diagnosing and treating NSCLC through routine LD-CT screening of older patients with a smoking history, previous studies have shown that 96% of all positive screening results for lung cancer are false positives. AKAP4 belongs to the family of cancer/testis antigens and is expressed in a multitude of cancers such as ovarian cancer, multiple myeloma and breast cancer. More specifically, AKAP4 is found on circulating tumor cells (CTCs) in the blood. Therefore, there is a lot of interest in these antigens as potential biomarkers. AKAP4 was identified in a training cohort as the most sensitive biomarker when compared to 116 other antigens. In the validation study, AKAP4 displayed an AUC of 0.97 with a sensitivity of 92.8% and a specificity of 98.6% for detecting NSCLC. In spite of these encouraging results, further refinement of this test and validation in prospective studies with more test subjects (in early stages) are needed and are currently under way. For instance, analyzing the performance of AKAP4 in combination with LD-CT as a detection tool would be a worthwhile subject for a future study (47).

In the study of Hulbert et al., six hypermethylated genes were tested as potential diagnostic biomarkers for early-stage NSCLC. All cases included were stage I and stage IIa NSCLC patients. The sensitivities, specificities and AUCs of the individual biomarkers and of a combination of the three best-performing genes were calculated in multiple cohorts. The results were interesting for a number of reasons. Firstly, the sensitivity and specificity of this panel in sputum and plasma comply with the diagnostic accuracy that is recommended by most clinical standards. Secondly, only small quantities of DNA from sputum and plasma are needed to perform this test adequately. Thirdly, this panel can be useful as a complementary test to LD-CT screening in order to differentiate patients with malignant nodules from patients with benign nodules. Furthermore, the detection of methylated genes had a higher positivity rate in sputum samples than in plasma samples, making sputum the preferred fluid for diagnosing NSCLC. However, the test was not accurate in some patients who had undetectable DNA methylation in their blood or sputum. In conclusion, this panel of methylated genes in plasma or sputum displayed a high sensitivity and specificity for the detection of early-stage NSCLC. Nevertheless, before it can be used as a diagnostic test or, more likely, as a complementary test for low-dose CT screening, more prospective studies should validate these results (48).

A twelve-protein marker panel was composed and tested in the blinded case-control study of Ostroff and colleagues. The primary result of this study was the identification of 44 potential protein biomarkers for differentiating NSCLC patients from high-risk controls with a smoking history. The twelve best-performing biomarkers were combined and verified in a classifier panel. The strengths of this study were the large sample size, the blinded control selection (in order to avoid selection bias) and the high sensitivity and specificity displayed by this new panel of biomarkers. The 12-biomarker classifier achieved a 91% sensitivity and 84% specificity in the training cohort and an 89% sensitivity and 83% specificity in the blinded validation cohort. Most of the biomarkers had already been associated with tumor biology in the past, but they had never before been identified as lung cancer biomarkers. Furthermore, half of the biomarkers included in the panel were downregulated (cadherin-1, LRIG3, sL-selectin, SCRsR, ERBB1 and RGM-C). The remaining protein biomarkers were upregulated (CD30-ligand, endostatin, HSP90, MIP-4, pleiotrophin, PRKCI and YES). Despite the several strengths of this study, there were also some limitations. No patients with premalignant lesions were included in this study. Moreover, the organ specificity of these biomarkers was not tested, while many of these proteins have already been described as potential biomarkers for other types of cancer. Thus, the biomarkers used in this study might not have been lung cancer-specific. A potential selection bias was minimized by blinding the validation cohort, but it was not eliminated. This biomarker panel could serve two purposes in the future: either as a detection tool for early-stage lung cancers (which, at this stage, are still curable by surgery) in long-term smokers or as a complementary test to low-dose CT screening. All in all, these results are promising and provide enough encouragement to embark on the next step in developing this panel: an independent prospective cohort study or RCT (49).

The use of fibulin-3 as a potential diagnostic biomarker for mesothelioma was the subject of the case-control study by Pass et al. The significantly elevated fibulin-3 concentrations in plasma and in the effusions of patients with mesothelioma indicated that fibulin-3 could be a diagnostic biomarker for mesothelioma. The sensitivity and specificity of fibulin-3 for discriminating patients with mesothelioma from asbestos-exposed persons and from patients with effusions not caused by mesothelioma were higher than those of previously identified biomarkers. Furthermore, the levels of fibulin-3 are not affected by the duration of asbestos exposure. Early detection is of paramount importance to improve the median survival of mesothelioma patients, which is currently 12 months. However, early detection is often impaired by the long latency period, the inability of imaging to detect the disease at an early stage and a lack of sensitive and specific biomarkers. Plasma fibulin-3 in particular could differentiate between stage I or II mesothelioma patients and people who had been exposed to asbestos with a sensitivity of 100% and a specificity of 94%. Unexpectedly, however, effusion fibulin-3 levels and plasma levels did not correlate. The sensitivity and specificity of fibulin-3 in pleural effusions were similar to those of plasma fibulin-3. Because non-invasive blood-based biomarkers are more cost-effective, it is suggested that fibulin-3 in pleural effusions could serve as a prognostic biomarker rather than as a diagnostic biomarker. The results of this study have not yet confirmed that fibulin-3 is a reliable diagnostic marker for the early detection of mesothelioma, due to a lack of prospectively collected plasma samples. For the clinical validation of fibulin-3 as a detection tool, prospective studies and later RCTs must first be performed. Moreover, the authors of this study advised further investigation of the prognostic performance of fibulin-3, seeing as plasma fibulin-3 levels dropped drastically after surgical reduction and rose with the progression of the disease. Fibulin-3 levels should also be analyzed in studies with more asbestos-exposed patients with benign effusion. Lastly, since the cut-offs varied between the two cohorts, the optimal cut-off value has yet to be determined in prospective trials (49, 50).

The prospective cohort study of Tammemagi et al. analyzed the potential of pro-surfactant protein B (pro-SFTPB) as a predictive (or diagnostic) biomarker for the early detection of lung cancer. The precise role of pro-SFTPB accumulation in the development of NSCLC is not yet known. It is presumed that pro-SFTPB could be a biomarker of lung conditions that precede lung cancer, such as COPD. The results of this study showed that plasma pro-SFTPB is significantly and independently associated with the occurrence of lung cancer in patients at risk. Subsequently, it was shown that pro-SFTPB is associated with early-stage lung cancer and with lung cancers diagnosed one year after the blood sample was taken. This result suggests that pro-SFTPB has considerable potential in predicting NSCLC tumors at a curable stage. A limitation of this study was the small number of patients with cancer types other than adenocarcinoma in the Pan-Can study. However, in the CARET study, which enrolled more patients with squamous cell carcinoma, pro-SFTPB was less predictive. Furthermore, pro-SFTPB was exclusively analyzed in high-risk patients, which might have produced a selection bias. Therefore, the results of this study cannot be generalized to low-risk individuals. On the other hand, the strengths of this study were that patients were enrolled prospectively and that samples of patients with no history of cancer were collected in order to evaluate the performance of pro-SFTPB as a predictive biomarker. In conclusion, for pro-SFTPB to be included in a risk-predicting model that is ready for clinical practice, additional prospective studies in different populations are needed (51).

Varella-Garcia and colleagues conducted a prospective study to analyze the predictive performance of a chromosomal aneusomy FISH assay (or CA assay) in sputum samples. Chromosomal missegregation, resulting in loss or gain, is a characteristic sign of cancer and, therefore, it was not surprising that chromosomal aneusomy in this study was significantly associated with lung cancer incidence in samples taken within 18 months before diagnosis. The sensitivity and specificity of this CA assay (76% and 85%, respectively) were better than those of cytology analysis (29% and 83%) and gene promoter methylation analysis (both 64%). As a result, the diagnostic accuracy of this FISH assay was sufficient for it to be considered as a diagnostic tool for the detection of lung cancer. Other notable results in this study were the favorable sensitivities of the CA assay for detecting squamous cell cancer and early-stage localized lung cancer. The specificity of the FISH assay was higher than that of any other sputum biomarker reported to date. However, this CA assay also showed some limitations. The sensitivity decreased as the time between sputum collection and cancer diagnosis increased. This suggests that the FISH assay predicts lung cancer itself, rather than the risk of lung cancer. On top of that, the sensitivity for the detection of adenocarcinomas was limited. In conclusion, the CA-FISH assay examined in this study displayed enough diagnostic potential to further validate its utility as a detection tool in future screening studies and RCTs (52).

The last study exploring potential biomarkers for the detection of lung cancer was conducted by Yao and colleagues. More specifically, a panel of four protein markers was identified and analyzed. Hitherto, the most extensively studied tumor markers were CYFRA 21-1, CEA, CA125, Neuron-specific Enolase (NSE), Tissue Polypeptide Antigen (TPA), Chromogranin A (CgA) and CA19-9. However, none of these showed sufficient sensitivity and specificity to be used individually as a robust diagnostic biomarker. In this study, ELISA was used to detect a set of serum auto-antibodies against lung cancer antigens. These four proteins were SMOX, NOLC1, MALAT1 and HMMR, and they play a role in cell proliferation, apoptosis, adhesion and migration, which are all associated with tumorigenesis. Statistical analysis of this panel in a cohort of 40 NSCLC patients demonstrated sensitivities of 63.6% and 62.5% for the detection of stage I and stage II cancer patients, respectively. This suggests that the four-biomarker panel has potential as a detection tool for early-stage NSCLC. Although these results are promising, further validation of this panel on additional lung cancer samples and on control samples of patients with benign lung diseases is needed because of the small sample size in this study (53).

4.4 OVARIAN CANCER

The prospective study conducted by Longoria evaluated the combination of OVA1 and clinical assessment as a diagnostic tool for the early detection of primary ovarian malignancies. In previous studies, the importance of staging ovarian cancer at the time of surgery was stressed. Approximately 21% of women with ovarian masses (which were early-stage ovarian cancer) did not undergo staging and 46.8% did not undergo nodal sampling before operation. Therefore, a diagnostic triage tool should be applied to high-risk patients before surgery for ovarian masses to estimate the risk of ovarian cancer. In this study, OVA1 combined with clinical assessment had the highest sensitivity (i.e. 96%) for the detection of ovarian malignancy when compared to the other tested risk assessment tools: CA-125, clinical assessment alone and the modified ACOG guidelines. Furthermore, OVA1 outperformed clinical assessment, CA-125 and the modified ACOG guidelines in detecting early-stage disease, which consisted of stage I and II ovarian cancer. On top of that, the combination of OVA1 and clinical assessment increased the sensitivity to 95.3% for the detection of early-stage ovarian cancer and confirmed its potential as a risk stratification instrument in this population. The superiority of OVA1 combined with clinical assessment was maintained in the subgroups of pre- and postmenopausal women. The strength of this study lay in the large (prospectively enrolled) patient cohort, which allowed the researchers to focus on specific subgroups such as premenopausal women with early-stage disease. A limitation of this study was that 27% of the patients were included after referral by gynecologic oncologists, which could result in selection bias, a false increase in the test’s PPV and a false decrease in its NPV. In spite of these remarks, Longoria et al. strongly suggested that OVA1 in combination with clinical assessment could become a sensitive risk stratification test for early-stage ovarian malignancy (54).

The retrospective study of Yildirim et al. discusses the possibility of using hematologic biomarkers (more specifically NLR and PLR) for the detection of early-stage ovarian cancer. The sensitivity of NLR for the detection of endometrioma among patients with benign adnexal masses was 79%, with a specificity of 36%. NLR had a better sensitivity for detecting endometrioma than CA-125. The sensitivity of PLR, however, was similar to that of CA-125 (75.8%). Although these results are promising in terms of sensitivity, PLR and NLR cannot be used as biomarkers for the detection of endometrioma in screening programs, due to their low specificity. Thus, CA-125 is presently still the most reliable biomarker for detecting endometrioma. As regards the differentiation between patients with malignant masses and patients with benign masses, CA-125 displayed the highest sensitivity with 78.6%. NLR had a sensitivity of 65.6% and neutrophils as biomarkers had a sensitivity of 73.1%. Interestingly, the specificity of NLR was higher than the specificity of CA-125. Furthermore, CA-125 showed a sensitivity of 55% for detecting early-stage ovarian cancer, which was lower than the sensitivity of lymphocytes, neutrophils, NLR and PLR. However, CA-125 displayed the highest specificity values. These last observations suggest that combining PLR and CA-125 could increase the overall sensitivity and specificity for detecting early-stage ovarian cancer. The limitations of this study were the incomplete reporting of test results in subgroups, the retrospective nature of the study and a possible selection bias. In conclusion, the use of hematological parameters as biomarkers (in combination with other markers) can aid in the detection of ovarian cancer. Nevertheless, the findings in this study should be supported by the results of more prospective cohort studies with larger sample sizes before these markers can be used in clinical practice (55).

The early detection of ovarian cancer is of great importance for the survival of the patient, as it improves the chances of successful treatment. For instance, the 5-year survival rate for early-stage ovarian cancer is 90%, while it is merely 20% when the cancer is discovered in stage III or IV. In the retrospective case-control study of Tcherkassova and colleagues, the use of the receptor for the circulating fetal protein alpha-fetoprotein (RECAF), with or without CA-125, as a potential diagnostic biomarker for ovarian cancer was tested. RECAF is normally involved in the internalization of AFP in immature cells and its expression is downregulated after differentiation, but previous studies have shown that RECAF is reactivated in cancer cells. Due to the low incidence of ovarian cancer in women, a successful screening program should ideally display a sensitivity of more than 75% and a specificity over 99.6% for the detection of stage I or II ovarian cancer. If these numbers cannot be attained and too many false positives are referred for staging through surgery, there is a risk that the complications of these operations would outweigh the benefit of early detection. At present, women at risk are screened with CA-125, which is only elevated in 50% of the patients with early-stage ovarian cancer. In this study, the sensitivity of CA-125 for the detection of early-stage ovarian cancer was a little higher, i.e. 56%. Elevated levels of RECAF were detected in ovarian cancer patients in comparison with normal individuals. Furthermore, RECAF expression was considerably higher in late-stage cancer than in stage I or II ovarian cancer. ROC analysis demonstrated that RECAF and CA-125 could discriminate controls from cancer patients, with RECAF showing a higher sensitivity, particularly in earlier stages of the disease. Combining CA-125 and RECAF in one panel resulted in a sensitivity of 83% for the detection of ovarian cancer and of 76% for the detection of stage I/II ovarian cancer, at a specificity set at 100%. Thus, the combination of these two biomarkers displayed sufficient results for this panel to be considered as a future screening tool for the early detection of ovarian cancer. However, a few questions are left unanswered. RECAF is expressed in a multitude of cancers and, for that reason, patients with elevated serum RECAF do not automatically have ovarian cancer but must be further examined. On top of that, this was a retrospective study that did not include patients with benign diseases. In order to confirm the high sensitivity of RECAF for ovarian cancer, prospective studies using patients with benign adnexal masses as controls should be carried out (56).
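
The reason why such a high specificity is demanded for a low-incidence disease can be made explicit with a small calculation. The Python sketch below uses the sensitivity (75%) and specificity (99.6%) thresholds named in the text. The prevalence of 40 cases per 100,000 screened women is an assumption chosen for illustration only and is not taken from the study discussed above; the function name is hypothetical.

```python
def screening_outcomes(sensitivity, specificity, prevalence, n_screened):
    """Expected true positives, false positives and PPV for one screening round."""
    diseased = prevalence * n_screened
    healthy = n_screened - diseased
    true_pos = sensitivity * diseased
    false_pos = (1.0 - specificity) * healthy
    ppv = true_pos / (true_pos + false_pos)
    return true_pos, false_pos, ppv


# ASSUMPTION: prevalence of 40 per 100,000 is illustrative, not from the source.
tp, fp, ppv = screening_outcomes(
    sensitivity=0.75, specificity=0.996, prevalence=0.0004, n_screened=100_000
)
print(f"True positives:  {tp:.1f}")
print(f"False positives: {fp:.1f}")
print(f"PPV:             {ppv:.3f}")
```

Under this assumed prevalence, false positives outnumber true positives by more than ten to one even at a specificity of 99.6%, which illustrates why a small loss in specificity quickly translates into many unnecessary surgical procedures.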

In a retrospective case-control study, Simmons et al. came to the conclusion that a biomarker panel consisting of CA-125, HE-4, CA72-4 and MMP-7 has the highest potential as a detection tool for diagnosing early-stage ovarian cancer. In the past, applying the ROCA (risk of ovarian cancer algorithm) to evaluate CA-125 over time as a diagnostic biomarker in multi-modal screening (MMS) demonstrated that this algorithm has a high sensitivity for ovarian cancer and resulted in a reduction in ovarian cancer mortality. Nevertheless, when using CA-125 only, 20% of all early-stage ovarian cancers were missed. The best-performing panel in this study presented a sensitivity of 83.2% at a specificity of 98% in the validation set. As a result, this panel outperformed CA-125 alone. Moreover, in order to further evaluate the longitudinal diagnostic performance of these four biomarkers, their levels were measured in samples of healthy volunteers, taken at 5 different points in time. An individual baseline of each biomarker was calculated for each participant, and CA-125 displayed the lowest within-person variability over time. The biomarker panel discovered in this study is currently under investigation in a large blinded retrospective study with longitudinal cases and controls, and the authors proposed that a prospective trial should be conducted to further validate this four-biomarker panel (57).

In the prospective study of Sedláková and colleagues, it was demonstrated that patients with ovarian cancer presented higher LPA plasma levels than patients with benign ovarian tumors and healthy controls. Previous studies have stated that LPA plays a role in cancer development and increases the expression of VEGF. More importantly, LPA levels were also significantly elevated in patients with early-stage ovarian malignancies (in 90% of the patients with stage I cancer). Furthermore, this study demonstrated that plasma LPA levels were slightly higher in patients with benign tumors than in healthy women. The differentiation of patients with malignant tumors from patients with benign tumors by analyzing the LPA levels was statistically significant (P<0.001). The most balanced cut-off, with the highest combined sensitivity and specificity in this study, was calculated at 8.30 µmol/l, but the optimal cut-off value should be determined in a study with a larger sample size. Because LPA levels were slightly elevated in patients with benign ovarian tumors and the mechanism behind LPA elevation in ovarian cancers has not been discovered yet, the authors theorized that ovarian carcinogenesis could originate from benign ovarian lesions. In conclusion, the data in this study suggest that LPA could be a diagnostic biomarker for ovarian cancer in the future (58).
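
One common way to formalize a "most balanced" cut-off is Youden's index, J = sensitivity + specificity - 1, maximized over all candidate cut-offs. Whether Sedláková and colleagues used this criterion is not stated here, so the Python sketch below should be read as an assumption about how such a cut-off could be derived. The LPA values are invented for illustration and the function name is hypothetical.

```python
def youden_cutoff(cases, controls):
    """Return (cutoff, J, sensitivity, specificity) maximizing Youden's J,
    where 'level >= cutoff' is called positive."""
    best = None
    for c in sorted(set(cases) | set(controls)):
        sens = sum(x >= c for x in cases) / len(cases)
        spec = sum(y < c for y in controls) / len(controls)
        j = sens + spec - 1.0
        if best is None or j > best[1]:
            best = (c, j, sens, spec)
    return best


# Hypothetical plasma LPA levels (µmol/l), invented for illustration only.
malignant = [6.0, 8.5, 9.0, 10.0, 12.0]
benign = [2.0, 3.0, 4.0, 5.0, 9.5]

cutoff, j, sens, spec = youden_cutoff(malignant, benign)
print(f"cut-off={cutoff}, J={j:.2f}, sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Because a cut-off selected in this way is tuned to the sample at hand, it tends to look better than it will in new patients, which is consistent with the recommendation above to determine the optimal value in a larger study.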

The sixth study exploring possible biomarkers for the detection of ovarian cancers was conducted by Leandersson et al. They studied the performance of various biomarkers (CA-125, HE-4, suPAR and B7-H4) in a multitude of panels and compared them to the performance of the Risk of Ovarian Malignancy Algorithm (ROMA). The results of this study demonstrated that a panel incorporating suPAR, HE-4, CA-125 and age displayed the highest diagnostic accuracy (with an AUC of 0.94) and outperformed ROMA with regard to the differentiation of patients with benign ovarian tumors from patients with malignant ovarian tumors in premenopausal women. In patients with EOC type II, the levels of suPAR, HE-4 and CA-125 were elevated in comparison to the levels of these three biomarkers in type I EOC patients. Furthermore, lower levels of these biomarkers were seen in patients with borderline and benign ovarian tumors. With regard to prognosis, it was concluded that elevated levels of HE-4, CA-125 and suPAR were associated with a decrease in 5-year survival in premenopausal patients. More specifically, the overall survival of ovarian cancer patients decreased drastically if high levels of suPAR were detected during the first year after diagnosis. Another notable result was the mediocre diagnostic performance of HE-4 and CA-125 and, consequently, of the ROMA score in EOC type I. The authors suggested that the best-performing panel in this study, consisting of suPAR, HE-4, CA-125 and age, should be further investigated in more prospective cohort studies to evaluate its utility as a detection tool or as a risk assessment test (59).

The phase II and phase III study of Cramer et al. evaluated the performance of 35 different biomarkers as potential detection tools for ovarian cancer. In general, biomarkers whose assays display high coefficients of variation (CV) tend to perform poorly as reliable biomarkers. This was also the case in this study, as none of the biomarkers with a CV > 30% displayed a sensitivity > 37% at a fixed specificity of 95% in phase II or phase III samples. With regard to the detection of early-stage ovarian cancer, CA19-9, apolipoprotein A1 and prolactin performed best in phase II samples from ovarian cancer patients. However, the performance of these biomarkers was lower in phase III samples (60). An important limitation of this study was the exclusion of women with a CA-125 value > 35 ng/mL, seeing as these women were referred after triaging for diagnostic workup for ovarian cancer. Therefore, the cut-off used in this study for CA-125 was set at 24 ng/mL to achieve a 95% specificity. Nevertheless, the conclusion of this study was that CA-125, despite having a rather low sensitivity for the detection of early ovarian cancer (56%), is still the single best-performing biomarker for the detection of ovarian cancer and that its concentration peaks within 6 months of diagnosis (AUC=0.96) (60).
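
For reference, the coefficient of variation of an assay is the standard deviation of repeated measurements divided by their mean, usually expressed as a percentage. The short Python sketch below computes it for a set of replicate measurements. The replicate values are invented, and only the 30% threshold is taken from the text; the function name is hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(replicates):
    """CV in percent: sample standard deviation / mean * 100."""
    return 100.0 * stdev(replicates) / mean(replicates)


# Hypothetical replicate measurements of one sample, for illustration only.
replicates = [10.0, 12.0, 14.0]
cv = coefficient_of_variation(replicates)
print(f"CV = {cv:.1f}%")
print("Above the 30% threshold mentioned in the text:", cv > 30.0)
```

A high CV means that repeated measurements of the same sample scatter widely, so real differences between cases and controls are more easily masked by measurement noise.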

Xu et al. evaluated the most-studied diagnostic biomarkers for ovarian cancer, CA-125 and HE-4, in a retrospective case-control study. Additionally, the performance of ROMA (which produces a risk score for ovarian cancer based on the levels of CA-125, HE-4 and a clinical assessment) as a diagnostic marker was assessed and alternative cut-off levels for CA-125 and HE-4 were proposed. The results of this study confirmed that ROMA or HE-4 could be of use in the triage of (premenopausal) women with adnexal masses. The diagnostic performance of CA-125 was clearly lower than that of ROMA and HE-4 (AUC of 0.817 and 0.816 for ROMA and HE-4 vs 0.683 for CA-125) in samples of premenopausal women. Furthermore, combining HE-4, CA-125 and ROMA in one panel improved diagnostic potential when compared to CA-125 alone. With regard to the patients with early-stage ovarian cancer, the combination of HE-4 and CA-125 displayed the highest performance for detecting stage I and II ovarian cancer. However, no significant difference in AUCs between HE-4 and this panel was observed. The mean levels of HE-4 and CA-125 in EOC patients (especially stage I patients) were lower than the recommended cut-off levels used in clinical centers, which suggested that the recommended cut-off levels are not sensitive enough for the detection of early-stage ovarian cancer. The optimal cut-offs proposed in this study (60 U/ml for CA-125 and 35 U/ml for HE-4) increased the sensitivity for differentiating EOC patients from patients with benign disease from 39.7% to 73.8%, accompanied by a slight loss of 5.4 percentage points in specificity (from 99.2% to 93.8%). Employing the new cut-off values for CA-125 and HE-4 in ROMA did not alter the resulting ROMA scores in this study. A limitation of this study was the lack of data (follow-up and 5-year survival rate) to evaluate the prognostic use of these biomarkers. Furthermore, all samples were obtained from a single hospital center. In order to confirm the results of this study, future studies are required to include a follow-up strategy for ovarian cancer patients and multi-center validation. In summary, the authors suggested that HE-4 should be used in a first-line test that incorporates multiple biomarker panels (61).

The aim of the nested case-control study by Gislefoss and colleagues was to evaluate serum HE-4 as a diagnostic biomarker for the early detection of ovarian cancer. The results demonstrated that serum HE-4 levels could already be elevated in samples collected two years before diagnosis and that CA-125 levels could be elevated in serum samples taken four years before diagnosis. In addition, cotinine levels were measured, as previous studies have reported that age and smoking can influence HE-4 levels in women. As expected, a significant correlation between cotinine and HE-4 levels was observed. A potential weakness of this study was the use of archived samples, because HE-4 and CA-125 levels in serum increase with time, which could have biased the results. Although the controls and cancer patients were matched for age at sampling to decrease bias from storage time, the measurement of these biomarker levels could have been biased by analytic batch, storage and freeze-thaw cycles. The conclusion of this study was that HE-4 and CA-125 levels seemed to increase 2 and 4 years, respectively, before symptoms arose and the diagnosis was made. Therefore, HE-4 in particular (in combination with other biomarkers in a panel) could be a potentially strong prognostic and diagnostic biomarker for ovarian cancer. However, when HE-4 is used as a diagnostic biomarker in the future, smoking status should be taken into account (62).

The aim of the study conducted by Zhu et al. was to evaluate already well-studied biomarkers in prediagnostic samples of patients with ovarian cancer. The currently used biomarkers for the detection of ovarian cancer were all identified and validated in samples that were collected from ovarian cancer patients at the time of diagnosis. Unfortunately, ovarian cancer is often diagnosed at a late stage (III or IV). In order to detect ovarian cancer at an earlier, non-symptomatic stage, biomarkers that can detect or predict ovarian cancer in prediagnostic samples are urgently needed. For that reason, CA-125 and HE-4 were evaluated in prediagnostic samples of ovarian cancer patients in this study. An important strength of this study is that it followed a PRoBE design. The “PRoBE approach” aims to prevent potential bias by performing a blinded case-control study with patients who were followed in a prospective cohort study. The results of this study indicated that CA-125 remains the single best diagnostic biomarker for ovarian cancer. However, combining CA-125 with other biomarkers like HE-4 or B7-H4 enhanced the overall diagnostic performance. Furthermore, the authors demonstrated that biomarkers which were initially identified and validated in diagnostic samples were only highly expressed in symptomatic patients with advanced-stage ovarian cancer. As a result, the performance of these biomarkers for early detection of cancer could be lower than anticipated. In conclusion, the authors suggested that, when it comes to validating new biomarkers for ovarian cancer, longitudinal marker values from serial samples should be evaluated in future studies (63).

4.5 PROSTATE CANCER

The study of Cremers et al. discussed the use of PCA3 levels as an additional diagnostic biomarker to PSA in BRCA carriers. Employing PCA3 as a supplementary indicator for prostate biopsy resulted in a large number of extra biopsies that diagnosed hardly any additional prostate cancers. Furthermore, using PCA3 as an indicator for prostate biopsy in a subgroup of men with elevated PSA levels reduced the number of biopsies drastically. However, many intermediate- and high-risk PCs would be missed if prostate biopsies were reduced based on normal PCA3 scores. A possible explanation for this disappointing result is that PC in BRCA families follows a different etiological pathway than PC in other patients, one which does not result in an increase in PCA3 expression. What is more, the use of PSA as an indicator for referral resulted in three PC patients being missed in the first screening round. However, these three low-risk patients displayed increased serum PSA levels during the second round of screening and were subsequently diagnosed with PC through biopsy. A possible limitation of this study is an overestimation of the results, seeing as no end-of-study biopsies were performed to identify the number of PC patients after two years of follow-up. However, end-of-study biopsies have also been shown to result in overdiagnosis of non-lethal PC. After analyzing the results of this study, the authors suggested that employing a lower PSA cut-off value in BRCA2 carriers for the detection of PC could lead to a more cost-effective screening tool for PC than a screening strategy that combines PCA3 and PSA in a panel. Furthermore, the screening of BRCA2 carriers proved more effective than the screening of BRCA1 carriers, considering that more PCs were detected in this group during the first screening round. In addition, PSA displayed a higher PPV for detecting PC in the BRCA2 group than in the BRCA1 group (64).

Previous studies have shown that [-2]proPSA is associated with prostate cancer, and it is therefore considered a possible diagnostic biomarker. The study of Sokoll and colleagues investigated the potential role of [-2]proPSA in a screening strategy for prostate cancer. In this study, %[-2]proPSA displayed similar results to PSA and %fPSA with regard to the differentiation of patients with prostate cancer from patients without cancer. Moreover, %[-2]proPSA could be complementary to these two other well-studied biomarkers. Adding %[-2]proPSA to a logistic regression model that already incorporated PSA, %fPSA, and other demographic and clinical parameters improved the overall diagnostic performance for (early-stage) prostate cancer to an AUC of 0.73, which was higher than the AUC of each individual marker. Furthermore, in the subgroup of patients with PSA levels ranging from 2 to 20 ng/mL, %[-2]proPSA was better than %fPSA at differentiating patients with prostate cancer from patients without prostate cancer. What is more, [-2]proPSA might be helpful in the identification of aggressive prostate cancer, seeing as [-2]proPSA and %[-2]proPSA correlated with the Gleason score (the most widely used score to classify the aggressiveness of prostate tumors). In conclusion, this study further validated the utility of %[-2]proPSA as a PC biomarker. The greatest drawback of this study, however, was the possible selection bias, as most of the study population was preselected. Nevertheless, the authors stated that no biases were introduced that could affect the test results, because all patients were prospectively enrolled (65).

In the prospective study by Gordian and colleagues, free circulating DNA (fcDNA) was tested as a diagnostic biomarker for prostate cancer. The patients included in this study all displayed elevated PSA levels (> 4 ng/mL) and/or abnormal digital rectal examinations. The controls consisted of patients with benign prostatic diseases (prostatitis and BPH). At a cut-off point of 180 ng/mL, fcDNA could distinguish prostate cancer from benign prostatic diseases in the subgroup of patients with PSA >10 ng/mL. After adjustment for race, the OR of patients with PSA levels < 10 ng/mL and fcDNA levels > 180 ng/mL was 4.27, implying that these patients were at increased risk for prostate cancer in comparison to patients with fcDNA levels < 180 ng/mL. Furthermore, a multivariate model which included fcDNA yielded a sensitivity of 95.5%, a specificity of 33.3% and an increased NPV of 93.3% (in comparison to the same model without fcDNA). The advantages of this study were its prospective nature and the inclusion of a control population that did not consist of healthy men, but of patients with benign prostatic disease. In conclusion, the results of this study indicated that fcDNA could be used as a diagnostic biomarker in patients with PSA < 10 ng/mL. Furthermore, the inclusion of fcDNA in a multivariate model resulted in a high negative predictive value of 93.1%. Applying this model to all patients with PSA < 10 ng/mL could prevent a third of all unnecessary prostate biopsies. Lastly, with regard to refining fcDNA as a biomarker, other studies have reported that just 1.9% of the elevated fcDNA originates from prostate cancer cells. Consequently, it was concluded that a large proportion of the fcDNA derives from non-cancerous cells that are in a state of apoptosis caused by the release of pro-apoptotic cytokines from prostate cancer cells. Identification of a cancer-specific gene is impossible because of the genetic heterogeneity among prostate cancers, but if a non-cancer-specific gene for the quantitation of fcDNA levels were identified, this could lead to the precise measurement of fcDNA and more reliable detection of prostate cancer in the future. For that reason, future prospective studies on isolating and quantifying fcDNA are required to evaluate the potential of fcDNA as a prostate cancer biomarker (66).
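
The negative predictive value reported above depends not only on sensitivity and specificity but also on how common prostate cancer is among the men tested. The following sketch illustrates that dependence using Bayes' rule. The sensitivity and specificity are the values quoted above for the multivariate model; the prevalences are hypothetical and are not taken from the study by Gordian and colleagues, and the function name is chosen for illustration only.

```python
def negative_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """NPV = TN / (TN + FN), expressed via sensitivity, specificity and prevalence."""
    true_negatives = specificity * (1.0 - prevalence)
    false_negatives = (1.0 - sensitivity) * prevalence
    return true_negatives / (true_negatives + false_negatives)


# Sensitivity and specificity of the multivariate model quoted in the text.
SENSITIVITY = 0.955
SPECIFICITY = 0.333

# Hypothetical prevalences, NOT reported in the study; for illustration only.
for prevalence in (0.10, 0.25, 0.40):
    npv = negative_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.0%}: NPV = {npv:.1%}")
```

For a fixed sensitivity and specificity, the NPV falls as the prevalence rises, which is one reason why an NPV observed in one referral population cannot be transferred directly to another.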

The preliminary results of the IMPACT study were evaluated in the article by Mitra and colleagues. The results indicated a higher incidence of prostate cancer in BRCA carriers, more specifically in the BRCA2 group. Furthermore, the preliminary data demonstrated a relatively low rate of biopsy when a PSA threshold of 3 ng/mL was used, but the PPV of PSA for the detection of PC was high (48%). A lower PPV was observed in the BRCA2 subcohort. The age of the cohort (40-69 years) affected the PPV of PSA positively. The biggest drawback of this study, however, is the small sample size. The number of patients included in this study is too small to draw meaningful conclusions. When recruitment in this study is complete, further analysis will demonstrate whether there are any differences in the development of prostate cancer between BRCA carriers and controls (67).

The study of Morgan et al. explored the potential use of En-2 and its protein EN2 as diagnostic biomarkers for prostate cancer. The results indicated that En-2 is highly expressed by PC cells but not by normal prostate cells. Furthermore, it was found that PC cells are able to produce and secrete EN2, which can be found in first-pass urine samples of PC patients. At a cut-off value of 42.5 ng/mL for EN2, a sensitivity of 66% and a specificity of 90% were observed. For the optimal diagnostic PSA values determined in previous studies, a sensitivity of 24% and a specificity of 93% for the detection of PC were reported. These results suggested that EN2 has a higher diagnostic performance for PC than PSA does. In addition, EN2 levels can be measured in just 110 µL of unprocessed urine without the need for DRE, whereas DRE before sampling is required to measure PCA3 adequately. Further studies should be performed to validate the utility of EN2 as a diagnostic biomarker for identifying patients with significant PC (requiring intervention) and to prevent the overdiagnosis of PC by PSA (68).
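
The comparison between EN2 and PSA can be made more concrete with likelihood ratios, which combine sensitivity and specificity into one measure per test result. The sketch below is an illustration only: likelihood ratios are not reported in the text above, and they are derived here purely from the quoted sensitivities and specificities. The function name and the dictionary labels are hypothetical.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) for a test with the given sensitivity and specificity."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# (sensitivity, specificity) pairs quoted in the text, written as fractions.
markers = {
    "EN2 (urine, cut-off 42.5 ng/mL)": (0.66, 0.90),
    "PSA (values from previous studies)": (0.24, 0.93),
}

for name, (sens, spec) in markers.items():
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    print(f"{name}: LR+ = {lr_pos:.2f}, LR- = {lr_neg:.2f}")
```

With these inputs, a positive EN2 result shifts the odds of PC more strongly than a positive PSA result, and a negative EN2 result lowers them more than a negative PSA result, which is consistent with the authors' conclusion. Because the PSA figures come from other studies, the comparison is indirect.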

The screening study of Vickers was developed to confirm the results of previous studies concerning a four-kallikrein test for the detection of prostate cancer. PSA proved to be a poor predictor of initial biopsy results in men with elevated PSA, and the addition of iPSA, fPSA and hK2 improved discrimination dramatically and could reduce the number of unnecessary biopsies. Not only did hK2 improve the diagnostic performance of this model, adding fPSA and iPSA also appeared to increase the prediction accuracy. The strengths of this study lay in the use of independent training and validation sets, the replication of a previously published model, and the confirmation of previous results. For example, the AUC for the detection of prostate cancer increased from 0.564 for PSA alone to 0.674 in combination with iPSA, fPSA and hK2. Furthermore, the AUCs of the PSA and the kallikrein model increased in a similar fashion from 0.557 to 0.713. Another advantage of using the kallikrein model as a predictive biomarker is that the test can be performed on a normal blood sample. However, a limitation of this study was that the tested blood samples had been stored for several years. What is more, previous studies reported that long-term storage and repeated freezing degrade kallikreins. This could have a negative effect on the predictive accuracy of the four-kallikrein panel. In summary, this panel could be of use in predicting the result of an initial biopsy in prescreened men with elevated PSA levels and could therefore help determine which of these men should undergo biopsy (69).

In the second study of Vickers et al., the findings of the previous study were replicated in a cohort study that included unscreened men with elevated PSA. The results of this study suggested that the panel of kallikrein markers could help in predicting the result of biopsy in men with elevated PSA. Moreover, applying the full clinical model to the data in this study led to superior results when compared with the results of the current screening strategy, in which men with an abnormal DRE and elevated PSA levels are referred for biopsy. When the full clinical model is used, a large number of unnecessary biopsies could be avoided while only a small number of patients with predominantly low-stage prostate cancer would be missed in the first round. The benefit of this study is that the results of the previous studies were replicated and confirmed in an independent set of patients. Another advantage is that the evaluated test is ready for use in a real-life clinical setting, as it requires only a serum sample. The assays for kallikreins have been studied and improved in the last decade, are ready to be implemented in a screening program and are expected to be cost-effective. The biggest limitation of this study was that the samples had been stored for years, and other studies have shown that kallikreins degrade over time. In conclusion, the models that were tested in this study could be used to determine which men should undergo a biopsy and which might be advised to continue screening. Consequently, a great deal of harm from unnecessary biopsies could be prevented (70).
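
The AUC values quoted for PSA alone and for the kallikrein panel can be read as a probability: the chance that a randomly chosen man with cancer receives a higher score than a randomly chosen man without cancer. The sketch below computes the AUC in exactly that way, by comparing all case-control pairs. It is a generic illustration, not the method used by Vickers et al.; the function name is hypothetical and the marker values are invented.

```python
from itertools import product


def auc_from_scores(case_scores, control_scores):
    """Probability that a random case scores higher than a random control.

    Ties between a case and a control count as half a 'win'.
    """
    wins = 0.0
    for case, control in product(case_scores, control_scores):
        if case > control:
            wins += 1.0
        elif case == control:
            wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Hypothetical marker values, invented for illustration only.
cases = [3.0, 4.0, 5.0]
controls = [1.0, 2.0, 3.0]

print(f"AUC = {auc_from_scores(cases, controls):.3f}")
```

An AUC of 0.5 corresponds to a marker that ranks cases and controls no better than chance, which puts the reported value of 0.564 for PSA alone into perspective.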

The prospective cohort study of Wei was designed to examine PCA3 as a potential diagnostic biomarker for prostate cancer. Furthermore, PCA3 was combined with the PCPT risk calculator to improve prediction of prostate cancer risk. In the repeat biopsy setting, a PCA3 score cut-off value of 20 yielded an NPV of 88% and could help avoid a great number of unwanted repeat biopsies. The test performance of PCA3 was higher than that of PSA, and for biopsy-naïve patients with a PCA3 score of more than 60, the probability of a first prostate biopsy detecting prostate cancer was greatly increased. Thus, PCA3 should be added to the clinical information on which the decision to recommend a biopsy is based. Furthermore, adding PCA3 to the PCPT risk calculator, which incorporates age, race, prior biopsy, PSA and DRE, improved the differentiation of patients at high risk of (high-grade) prostate cancer from patients with elevated PSA levels without cancer. A limitation of this study was that all samples were collected from prescreened men with elevated PSA levels. For that reason, the results of this study cannot be generalized to an unscreened population. Overall, the authors concluded that the overdiagnosis of low-grade prostate cancer during repeat biopsy and the underdiagnosis of high-grade cancer during initial biopsy could be improved by using PCA3 as a complementary biomarker to PSA-based screening or by adding PCA3 to a risk estimation model (71).

4.6 PANCREATIC CANCER

The study by Bain et al. evaluated the diagnostic and predictive potential of two biomarker panels for pancreatic cancer in a univariate and multivariate analysis. Several recent studies have already demonstrated that gene expression in PBMCs of patients with pancreatic cancer is altered in significant ways. Of the various genes examined in this study, the CA5B gene was able to differentiate early PC patients from both patients with chronic pancreatitis and healthy controls with the highest accuracy. Based on multivariate analysis, panel B, which incorporated multiple markers (CA5B, F5, SSBP2, MIC1 and CA19-9), displayed a better diagnostic performance for the detection of PC than CA19-9. Furthermore, the results of this study showed that five genes (ANXA3, ARG1, CA5B, SSBP2 and TBC1D8) were differentially expressed in patients with PC in comparison to healthy controls. More importantly, an increase in the expression of these five genes was found in late-stage PC patients in comparison with early-stage PC patients (72). As it is almost impossible to screen the entire population for PC due to its low prevalence, it would be more efficient to develop a screening test for a high-risk population that includes patients with chronic pancreatitis, type II diabetes mellitus or cystic fibrosis and chronic smokers. To this day, however, no biomarker has demonstrated sufficient results to serve as a diagnostic tool for the detection of PC among patients with chronic pancreatitis. In this study, panel B could distinguish resectable, early-stage PC from chronic pancreatitis with an AUC of 0.820, a sensitivity of 67% and a specificity of 83%. The cut-off value of CA19-9 in this panel was 37 U/mL, which was higher than its recommended cut-off value. In addition, the performance of these five genes in differentiating patients with early-stage PC from healthy controls was analyzed. However, diagnostic testing in healthy persons based on PBMC expression profiling appears to require further investigation, as the results were disappointingly lower than the diagnostic results of CA19-9 for pancreatic cancer. Nevertheless, based on the results of this study, further testing in high-risk groups like patients with chronic pancreatitis could confirm the potential of these five genes as PC biomarkers. In addition, the authors suggested that the response of the immune system to non-self-antigens is significantly decreased in patients with pancreatic cancer. In conclusion, the results of this study showed that a combination of genes could significantly improve the ability of CA19-9 to distinguish patients with resectable early-stage PC from patients with chronic pancreatitis (72).

More than 2% of the American population have pancreatic cysts, which could be considered premalignant lesions. Although pseudocysts and SCN are relatively safe and do not display malignant potential, mucinous cysts like IPMN and MCN have the ability to progress into invasive pancreatic adenocarcinoma. Therefore, a biomarker that can distinguish mucinous cysts from serous cysts could be useful to assess the risk of pancreatic cancer. In the prospective study of Yip-Schneider et al., VEGF-A was examined in pancreatic cyst fluid as a candidate biomarker for the detection of SCN cysts. The results of this study demonstrated that VEGF-A concentrations were significantly raised in pancreatic cyst fluid from patients with SCN. In addition, VEGF-A could differentiate SCN from other pancreatic cysts with a sensitivity of 100% and a specificity of 97%. Furthermore, high levels of VEGFR-2 expression were demonstrated in SCN cyst tissue. Therefore, the overexpression of VEGFR-2 in the SCN cyst tissues could explain the elevated VEGF concentrations in the pancreatic cyst fluid. The authors suggested investigating a panel that combines VEGF-A with other cyst fluid biomarkers, such as VEGFR-2, to facilitate the differential diagnosis of pancreatic cyst lesions (73).

The goal of the study by Capello and colleagues was to evaluate the diagnostic use of 17 protein biomarkers and to develop a protein biomarker panel for the detection of early-stage pancreatic ductal adenocarcinoma (PDAC) that would outperform CA19-9. In the first selection and validation cohort, seven plasma protein biomarkers were selected out of the aforementioned 17 biomarkers, based on their ability to distinguish patients with PDAC from healthy controls. These seven biomarkers were further validated in three validation cohorts and their diagnostic performance was compared with that of CA19-9. Ultimately, the three biomarkers with the most promising results were identified as TIMP1, LRG1 and CA19-9. Based on a newly constructed logistic regression model, these three biomarkers were combined in a panel, which performed significantly better than CA19-9 alone with regard to the differentiation of early-stage PDAC patients from healthy individuals or from benign pancreatic disease controls. As a result, this biomarker panel could become an important tool for the discovery of PDAC among patients at increased risk, like individuals with a family history, cystic lesions, chronic pancreatitis or type II diabetes. Although these results are promising, further validation of this panel in independent prediagnostic cohorts is needed to confirm its use as a screening tool for pancreatic cancer (74).

The prospective cohort study of Henriksen et al. examined the diagnostic performance of cell-free DNA promoter hypermethylation of 28 genes in plasma of patients with pancreatic adenocarcinoma. Of the 28 genes that were examined, 19 displayed statistically significant differences in hypermethylation between patients with pancreatic adenocarcinoma and controls or patients with chronic pancreatitis, in whom the methylation status of these 19 genes was lower. In addition, this study demonstrated that cell-free DNA hypermethylation can be detected in both malignant and benign pancreatic disorders and that plasma samples from patients with pancreatic adenocarcinoma contained higher levels of hypermethylated genes in cell-free DNA. As in the previous studies, none of the 28 genes individually demonstrated sufficient diagnostic performance to serve as a reliable diagnostic biomarker for pancreatic cancer. However, a panel that combines these genes could serve as a detection tool for pancreatic cancer. Therefore, a diagnostic prediction model based on age >65 and eight genes (BMP3, RASSF1A, BNC1, MESTv2, TFPI2, APC, SFRP1 and SFRP2) was developed and evaluated in this study. This panel was able to distinguish pancreatic cancer patients from a large control group that included patients with chronic pancreatitis or patients with symptoms of pancreatic cancer. The AUC and positive predictive value of the model were superior to those of CA19-9, which is currently the only FDA-approved blood-based biomarker for pancreatic cancer. Applying this model to stage I and II pancreatic cancer patients yielded an AUC of 0.86. Curative treatment of patients with stage I and II disease is possible and, for that reason, a diagnostic biomarker should be able to detect patients at these early stages (75). This study had a few limitations and several strengths. As it was an observational study with training data, an overestimation of test performance could have occurred. In order to validate the results of this study, further testing of this gene model in an independent cohort is required. In addition, comparing the performance of this prediction model to CA19-9 was impossible, seeing as CA19-9 was not measured in two-thirds of the patients. However, the great strength of this study was that cell-free DNA hypermethylation of a large gene panel was tested in plasma samples from an extensive group of patients with pancreatic adenocarcinoma. These plasma samples were included prospectively and consecutively before diagnostic work-up and before treatment. Furthermore, the control group consisted of patients with either benign pancreatic disease or symptoms of pancreatic cancer. In conclusion, this study created a panel of hypermethylated genes that was able to differentiate between patients with pancreatic adenocarcinoma and a highly relevant control group. However, external validation of this gene panel is required before it can be implemented in daily clinical practice (75).

In the study by Matsubara and colleagues, the possibility of early detection of pancreatic cancer by means of a newly identified biomarker, CXCL7, was investigated. Seeing as early-stage pancreatic cancer patients are often asymptomatic but can be curatively treated through surgery, the search for biomarkers that can detect these patients is expanding. In this study, the plasma LMW proteomes of patients with pancreatic cancer and healthy controls were compared. A significant decrease in plasma levels of CXCL7 was observed in pancreatic cancer patients and, most notably, in stage I/II pancreatic cancer patients. Furthermore, the results of CXCL7 were compared with the results of CA19-9. CXCL7 did not outperform CA19-9 in terms of sensitivity for the detection of pancreatic cancer, but the combination of CA19-9 and CXCL7 displayed a higher sensitivity and a higher overall diagnostic performance than CA19-9 alone. Although the precise function of CXCL7 has yet to be discovered, previous studies have reported that CXCL7 reduction may play a role in the suppression of angiogenesis in pancreatic cancer. In summary, this study identified and validated CXCL7 as a potential diagnostic biomarker for early-stage pancreatic cancer. However, before CXCL7 can be implemented in a clinical application or screening program with CA19-9, further validation of this biomarker is required (76).

In a previous study, Takayama and colleagues demonstrated that pancreatic cancer patients show overexpression of REG4. The subsequent case-control study, included in this systematic review, analyzed the performance of REG4 as a diagnostic biomarker for pancreatic cancer. Other serum biomarkers like carcinoembryonic antigen (CEA) and CA19-9 have been shown to be highly elevated in patients with advanced pancreatic cancer. However, these conventional biomarkers are not elevated in early-stage or small pancreatic cancer and are therefore not fit to be used as an early detection tool. The results of this study suggested that REG4 is as good as, if not better than, these conventional biomarkers at differentiating patients with pancreatic cancer from healthy controls, with an AUC of 0.922. Furthermore, REG4 and CA19-9 were not correlated, and combining these two biomarkers in a panel resulted in a sensitivity of 100%. Interestingly, two stage I pancreatic cancer patients displayed normal CA19-9 levels in sera but elevated levels of REG4. This observation suggested that REG4 might be a useful diagnostic biomarker for early-stage pancreatic cancer. However, for these results to be confirmed, further studies that include a larger number of early-stage pancreatic cancer patients are mandatory. Moreover, elevated levels of REG4 were seen in eight of the eleven patients with pancreatitis. Therefore, the authors theorized that REG4 in pancreatitis patients and in pancreatic cancer patients may derive from destroyed acinar cells. However promising these results were, it was not tested whether REG4 could distinguish pancreatic cancer patients from patients with benign pancreatic diseases like SCN, pancreatitis or SPT. It is reported that screening for pancreatic cancer in the general population is not practical because of the low prevalence of the disease.
However, screening in high-risk patients for pancreatic cancer is feasible and biomarkers that can differentiate between high-risk patients with benign pancreatic diseases and patients with actual pancreatic cancer are therefore needed. For REG4 to be included in such a screening program, additional validation studies in test sets with pancreatic cancer patients and patients with benign pancreatic diseases must first be performed (77).

The major limitation of CA19-9 as a screening biomarker for pancreatic cancer lies in its low specificity, which can be explained by the fact that elevated levels of CA19-9 are found not only in patients with pancreatic cancer, but also in patients with benign pancreatic diseases. Previous studies have already established that the median sensitivity and specificity of CA19-9 are 79% and 82%, respectively. In addition, CA19-9 is correlated with elevated bilirubin levels regardless of whether benign or malignant disease is present, further weakening the specificity of CA19-9. Consequently, numerous other biomarkers are being tested in the search for an alternative and more specific biomarker for the detection of pancreatic cancer. The study of Wang and colleagues investigated the role of miRNAs as possible diagnostic biomarkers for pancreatic cancer. In this case-control study, it was found that PBMC miR-27-a-3p was moderately effective in distinguishing patients with pancreatic cancer from healthy controls. Furthermore, miR-27-a-3p showed a similar diagnostic performance for patients in each stage of pancreatic cancer, including stage I and II cancer. Based on these results, the authors suggested that miR-27-a-3p (in combination with CA19-9) could serve as a diagnostic biomarker for early-stage pancreatic cancer. Although previous studies suggested that miR-27-a-3p is an oncogene, since it is upregulated in gastric cancer, this is the first study to report on the possible importance of miR-27-a-3p as a biomarker for pancreatic cancer. Further investigation is needed to determine the role of miR-27-a-3p in the development of cancer and in PBMCs. The greatest strength of this study was the inclusion of patients with benign pancreatic disease (BPD) as a control population, because in a clinical context these patients are far more likely to be screened than healthy persons. The greatest limitation of this study was the relatively small sample size. Therefore, the authors suggested that further validation of these results in a large independent cohort is required (78).

The analysis of changes in levels of fatty acids in the serum of pancreatic cancer patients is the subject of the case-control study by Zhang et al. In this study, it was shown that certain individual free fatty acids (FFAs), or a combination of these FFAs, could serve as diagnostic biomarkers for the detection of early-stage pancreatic cancer. In particular, C16:1, the ratio C18:2/C18:1, panel a (consisting of C16:1, C18:3, C18:2, C20:4 and C22:6) and panel b (consisting of the ratios C18:2/C18:1 and C18:3/C18:1) displayed high performance with regard to the detection of pancreatic cancer. Furthermore, the ratio C18:2/C18:1, PUFA, panel a and panel b showed the best performance in differentiating early-stage pancreatic cancer from non-cancer participants. On top of that, panel c or the polyunsaturated fatty acids (PUFA), a combination of C18:3, C18:2, C18:1, C20:4 and C22:6, could discriminate patients with pancreatic cancer from high-risk patients with pancreatitis. In conclusion, these results indicate that the examination of serum FFA profiles has great potential for the early diagnosis of pancreatic cancer and for monitoring the progression of the disease. The authors expect that studies in larger patient cohorts will confirm these results, since the sample size in this study was already fairly large. However, the mechanism behind the differences in FFA metabolism between patients with pancreatitis and pancreatic cancer patients should be investigated further, as pancreatitis might be a premalignant disease that progresses to pancreatic cancer (79).

4.7 BREAST CANCER

The first study exploring new biomarkers for the detection of early-stage breast cancer included in this systematic review is a prospective cohort study conducted by Atahan et al. Diagnosing breast cancer at an early stage, especially ductal carcinoma in situ, is highly important since 100% of patients with early-stage BC can be treated. However, data from the US showed that approximately 63% of early-stage BC patients remain undetected. Small premalignant lesions are often missed and may not even be visible on mammography in younger women, or may be difficult to differentiate from normal tissue in denser breasts. At present, no biomarker has been suggested as a suitable replacement for mammography for early-stage detection of breast cancer, and the tumor markers approved by the FDA, such as CA15-3 and CA27-29, are only recommended for monitoring advanced-stage or recurrent breast cancers. Using SELDI-TOF, Li et al. investigated three serum biomarkers, Bc1, Bc2 and Bc3, in a BC group and a non-cancer control group. In the present study, Bc1, Bc2 and Bc3 levels were further analyzed in women with malignant disease, in patients with benign breast disease and in healthy women. Based on the results of this study, Bc2 displayed the highest diagnostic performance for the detection of BC. Furthermore, statistically significant differences in Bc2 levels were found between patients with BC and patients with benign disease, as well as between patients with malignant disease and healthy women. Nevertheless, Bc2 presented disappointingly low AUC values, below 0.70, for the detection of BC. Bc3 levels, on the other hand, were elevated in patients with BC, but comparison with the Bc3 levels in patients with benign disease and in healthy women did not reveal significant differences. The most notable result of this study was that Bc1 levels were significantly higher in patients with BC than in women with benign disease or in healthy controls. However, ROC analysis of Bc1 revealed AUC values all below 0.70. In summary, this study did not demonstrate that proteomic profiling using SELDI-TOF is useful for the early detection of breast cancer. None of the three proposed biomarkers displayed sufficiently high AUC values for the discrimination of breast cancer. Notwithstanding these results, larger prospective studies and subgroup analyses of these proteomic markers should be conducted to determine their potential use in clinical practice (80).
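
Since AUC values are used throughout this discussion as the yardstick for diagnostic performance, a minimal sketch of what the number means may be helpful. The AUC equals the probability that a randomly chosen patient has a higher marker level than a randomly chosen control, with ties counted as one half. The function name and the marker levels below are hypothetical and are not taken from any of the reviewed studies.

```python
def auc_from_groups(case_values, control_values):
    """Estimate the AUC as the fraction of case/control pairs in which
    the case has the higher marker level (ties count as half a pair)."""
    if not case_values or not control_values:
        raise ValueError("both groups must contain at least one value")
    favourable = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                favourable += 1.0
            elif case == control:
                favourable += 0.5
    return favourable / (len(case_values) * len(control_values))


# Hypothetical marker levels for illustration only (not study data).
cases = [5.1, 6.3, 4.8, 7.0, 5.5]
controls = [4.9, 5.2, 4.1, 6.0, 3.8]
auc = auc_from_groups(cases, controls)
print(f"AUC = {auc:.2f}")  # 19 of 25 pairs favour the case, so AUC = 0.76
```

An AUC of 0.5 corresponds to a marker that performs no better than chance and 1.0 to perfect separation, which is why values below 0.70 are regarded as disappointing in the paragraph above.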

The aim of the study by Garczyk and colleagues was to identify novel serum biomarker candidates with sufficient clinical sensitivity and specificity to detect early-stage breast cancer. In recent studies, AGR3 expression has been described to play a role in breast carcinogenesis and was shown to be significantly upregulated in breast cancer tissues in comparison to normal breast tissues. These previous reports were confirmed in this study, as a clear overexpression of AGR3 was seen in human breast carcinomas compared to normal breast tissues at both the mRNA and the protein level. Furthermore, patients with a low- or intermediate-grade breast tumor and high levels of AGR3 protein showed a significantly lower tumor-specific survival than patients displaying low AGR3 expression. In the last stage of the study, the possibility of AGR3 as a serum-based biomarker for breast cancer was tested. Significantly increased levels of AGR3 protein were observed in the sera of patients with breast cancer in comparison to the AGR3 levels in samples from age-matched healthy women. The majority of the cancer patients included in this last stage suffered from early-stage breast cancer, supporting the utility of AGR3 as an early detection tool for breast cancer. The most notable result of the ROC analysis of AGR3 was its high specificity: 92.5%. In addition, a second member of the AGR family, AGR2, was tested as a potential diagnostic biomarker for BC. Similarly to AGR3, significantly elevated levels of AGR2 were observed in breast cancer sera in comparison to the levels of AGR2 in healthy controls. ROC analysis of AGR2 revealed a sensitivity of 90% and a specificity of 32.5% for the detection of breast cancer. More interesting, though, were the results of combining AGR2 and AGR3 in a diagnostic panel. This panel demonstrated an increased sensitivity of 64.5% and a specificity of 89.5% for the detection of BC. Presently, many studies focus on identifying new cancer biomarkers in circulating tumor cells (CTCs). However, the incidence and count of CTCs appear to depend too much on advanced tumor stage and metastasis. For that reason, CTCs are not fit to serve as early-stage diagnostic biomarkers. In contrast, free-circulating DNA or proteins show great potential to detect early cancer stages, as changes in mRNA expression in tumor tissue occur early in cancer development. This study showed the potential utility of AGR2 and AGR3 as biomarkers for non-invasive early detection of breast cancer. However, further prospective validation studies and RCTs using independent cohorts should be performed to confirm the results of this study (81).

In the study of Gong et al., the possible utility of glyceraldehyde 3-phosphate dehydrogenase (GAPDH) as a diagnostic biomarker for the detection of breast cancer was evaluated. GAPDH is a housekeeping gene and is characterized by low gene amplification in various types of cancer. The optimal GAPDH concentration for discriminating breast cancer patients from healthy women or patients with benign breast disease was determined in the training cohort and then further validated in the testing cohort. The greatest strength of this study was the comparison of cell-free DNA levels in patients with breast cancer to the cell-free DNA levels in patients with hyperplasia, which is a premalignant stage in the development of cancer. No significant difference in GAPDH concentrations was found when comparing the samples of patients with hyperplasia to the samples of healthy controls. This result suggests that cell-free DNA is derived from cancer cells, as it cannot be detected in the hyperplasia or healthy control samples. Furthermore, the results demonstrated that GAPDH levels were significantly elevated in BC patients in comparison to the GAPDH levels in healthy controls and in women with benign breast disease. On top of that, approximately 85% of patients with stage I or II breast cancer displayed elevated levels of GAPDH, which means GAPDH may have potential as an early detection tool for breast cancer. However, before GAPDH can be used in clinical practice, further validation in larger prospective studies is mandatory (82).

The study of Park and colleagues evaluated the potential use of thioredoxin-1 (Trx1) as a diagnostic biomarker for early-stage breast cancer. In this study, breast cancer patients were found to have significantly elevated serum Trx1 levels in comparison to the Trx1 levels in healthy controls. Furthermore, the increase in Trx1 levels in patients with breast cancer was significantly higher than the increase in Trx1 levels in patients with other cancer types such as NSCLC and CRC. In addition, Trx1 levels were positively correlated with the progression of breast cancer. Analysis of serum Trx1 levels in healthy women demonstrated that Trx1 levels seem to increase with age. Next, the diagnostic performance of Trx1 was compared with that of CEA and CA15-3, both well-known and well-studied biomarkers of various types of cancer. Although CEA levels were significantly elevated in breast cancer patients, CEA was shown to be unreliable for detecting breast cancer due to its low sensitivity (55.4% for all stages and 30.3% for stage I). However, when using the combined cut-off values of Trx1 and CEA, a sensitivity of 90.9% was reached. CA15-3 displayed a higher sensitivity (50% for stage I) but a lower specificity, making it a moderately performing biomarker for the detection of breast cancer. CA15-3 is currently used to monitor the response of breast cancer patients to their treatment and to detect breast cancer recurrence after curative surgery. However, when using the combined cut-off values of Trx1 and CA15-3, a sensitivity of 97% was reached. In summary, when compared with CA15-3 and CEA as diagnostic biomarkers for breast cancer, Trx1 had the higher diagnostic performance. Considering the sensitivities that were calculated for each biomarker, this study indicated that combining the two best-performing biomarkers, CA15-3 and Trx1, in one panel could be the most effective screening tool for the detection of early-stage breast cancer. However, further validation in prospective cohort studies and RCTs is needed before this panel can be put into practice (83).
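
The gain in sensitivity from combining two markers can be illustrated with a short sketch. It assumes an "either marker positive" rule and assumes that the two markers err independently of each other. Neither assumption is stated in the study, which does not describe how the combined cut-off values were applied, and the input values below are hypothetical.

```python
def combine_either_positive(sens_a, spec_a, sens_b, spec_b):
    """Sensitivity and specificity of a two-marker panel that is called
    positive when at least one marker is positive.

    Assumes the markers are conditionally independent given disease
    status, which is a simplification and rarely holds exactly.
    """
    # A case is missed only if both markers miss it.
    sensitivity = 1.0 - (1.0 - sens_a) * (1.0 - sens_b)
    # A control is correctly negative only if both markers are negative.
    specificity = spec_a * spec_b
    return sensitivity, specificity


# Hypothetical single-marker performance, not taken from the study.
panel_sens, panel_spec = combine_either_positive(0.80, 0.90, 0.50, 0.80)
print(f"panel sensitivity = {panel_sens:.2f}, specificity = {panel_spec:.2f}")
```

Under this rule the panel is always at least as sensitive as its best single marker, but its specificity can only stay equal or fall, which is why a combined panel should be judged on both figures.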

In the study of Zhang, five out of 42 candidate biomarkers were selected and incorporated in a panel based on support vector machine analysis. The five biomarkers included in this panel were PCDHGA8, LEFTY2, CACNG6, BCAR3 and CYP21A2. With an AUC of 0.788, a sensitivity of 72.41% and a specificity of 74.19%, this panel could be a valuable biomarker for the detection of early-stage breast cancer. The greatest strength of this study was that the model is closer to real-world application and that it was validated in a testing group that was completely blind to the training group. One limitation of this study was that dividing the 67 breast cancer patients into a training group and a testing group resulted in two groups with little data. Therefore, the analysis may have lacked statistical power. Furthermore, as ANOVA was used for the identification of differentially expressed genes between breast cancer patients and healthy controls, the possibility that a fixed group effect and a random sample effect were introduced in this study must be taken into account. The authors suggested that a computational approach, as demonstrated in this study, can aid in finding new biomarkers for early detection of cancer in blood. However, the results of this study should be further investigated in prospective independent studies (84).

In the case-control study of Zhang, miR-205 was tested as a potential diagnostic biomarker for breast cancer. The study results showed that the expression levels of miR-205 were significantly higher in healthy controls than in patients with breast cancer. Furthermore, no significant difference in expression levels of miR-205 was observed between patients with stage I breast cancer and stage II breast cancer. This suggested that miR-205 could be a diagnostic serum biomarker for early detection of breast cancer. ROC analysis revealed a sensitivity, specificity and AUC of 86.2%, 82.8% and 0.84, respectively, meaning miR-205 has a high diagnostic accuracy for breast cancer. Furthermore, miR-205 has been reported in other studies to be a tumor suppressor, which is consistent with its downregulation in this study. In the second part of this article, a meta-analysis of the performance of miR-205 as a diagnostic biomarker in other types of cancer was described. It appeared that miR-205 was upregulated in lung cancer, uterine cancer and ovarian cancer, whereas it was downregulated in breast cancer, prostate cancer and colorectal cancer. The pooled parameters of this meta-analysis were a sensitivity of 75% and a specificity of 84%. The AUC of the overall SROC curve was 0.87, further demonstrating that miR-205 is a promising biomarker for cancer detection with moderately high accuracy. Although these results were favorable, a limitation of this study was the small sample size. In addition, the tests in this study were only performed on serum samples. Other body fluids or tissues should be analyzed as well to determine the most sensitive specimen for the extraction of miR-205. This was the first study to establish a diagnostic application of miR-205 for breast cancer. Nevertheless, additional research with larger samples should be performed in the future (85).
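
To put the pooled sensitivity of 75% and specificity of 84% into perspective, they can be converted into likelihood ratios and a diagnostic odds ratio. The sketch below is simple arithmetic on the two pooled point estimates reported above. It is not the method used in the meta-analysis, which may have pooled these quantities differently, and the function name is hypothetical.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-, diagnostic odds ratio) for a test with the given
    sensitivity and specificity, both expressed as proportions."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    diagnostic_odds_ratio = lr_positive / lr_negative
    return lr_positive, lr_negative, diagnostic_odds_ratio


# Pooled point estimates reported for miR-205 in the meta-analysis.
lr_pos, lr_neg, dor = likelihood_ratios(0.75, 0.84)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}, DOR = {dor:.1f}")
```

With these inputs, a positive result is roughly 4.7 times more likely in a cancer patient than in a control, and the diagnostic odds ratio is about 16, which fits the description of moderately high accuracy.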

4.8 GASTRIC CANCER

In the study of Jing et al., a multitude of potential biomarkers for the detection of upper gastrointestinal tract cancer were tested in patients with EC, gastric cancer and cardiac cancer. Serum levels of CEA, CA19-9, CA24-2, AFP, CA72-4, SCC, TPA and TP’s were measured in 172 patients with EC, in 182 patients with cardiac cancer and in 264 patients with gastric cancer. The most sensitive combinations of tumor markers were found to be CEA, CA19-9, CA24-2 and SCC for EC, and CEA, CA19-9, CA24-2 and CA72-4 for cardiac cancer and gastric cancer. A significant correlation was found between CEA, CA19-9, CA24-2, SCC and CA72-4 and the different pathological types of cancer. SCC was the most sensitive marker for squamous cell carcinomas, while the other biomarkers were more sensitive for the detection of adenocarcinomas. Furthermore, survival analysis after three years of follow-up demonstrated that high levels of CA72-4, CA24-2 and SCC in preoperative sera of patients with cardiac cancer, GC or EC predicted poor survival. As a result, it was suggested that these three markers could be used as prognostic biomarkers in the future. Furthermore, significantly higher levels of CEA and CA72-4 were observed in male patients than in female patients, whereas levels of CA19-9 and CA24-2 were significantly higher in female patients. In addition, postoperative levels of CEA, CA19-9, CA24-2, SCC and CA72-4 were significantly decreased in comparison to the preoperative levels of these biomarkers. Subsequently, whenever metastasis or recurrence occurred, the levels of these biomarkers increased again. These results indicated that the levels of tumor markers increased during follow-up when metastasis or deterioration occurred and were possibly related to tumor advancement. In summary, the panel of CEA, CA19-9, CA24-2 and SCC displayed the highest diagnostic performance for the detection of esophageal cancer. The panel of CEA, CA19-9, CA24-2 and CA72-4 proved to be the best at detecting cardiac cancer and GC. Although the results are promising and the sample size in this study was sufficiently large, further prospective cohort studies and RCTs are required to confirm these results (86).

REG4 protein has already been extensively investigated as a potential diagnostic or prognostic biomarker for various cancer types. Other studies reported that REG4 protein was overexpressed in CRC and upregulated during the development of adenomas and adenocarcinomas. Furthermore, high-throughput analysis revealed that REG4 is upregulated in pancreatic cancer and prostate cancer, but not in lung or breast cancer, suggesting that REG4 is a specific marker for tumors of the digestive tract. In the case-control study of Tao and colleagues, the potential use of REG4 as a diagnostic biomarker for early-stage gastric cancer was explored. Firstly, REG4 mRNA expression was measured in tissue samples, and REG4 was found to be upregulated in gastric cancer tissue when compared to REG4 levels in normal gastric mucosa. Secondly, REG4 levels were analyzed in serum samples of both healthy donors and GC patients. REG4 has already been reported as a promising biomarker for the early detection of pancreatic ductal adenocarcinoma, and in this study the mean concentrations of serum REG4 were higher in GC patients than in healthy individuals (P<0.01). ROC analysis revealed an AUC of 0.798, meaning the test was moderately accurate in differentiating GC patients from healthy persons. In comparison with the better-known biomarkers CEA and CA19-9, the number of serum samples positive for serum REG4 overexpression was higher than the number of serum samples positive for elevated serum CEA and CA19-9 levels in TNM stage I. In conclusion, REG4 appears to be a better diagnostic biomarker for early-stage gastric cancer than CEA or CA19-9. However, to confirm the clinical utility of REG4, larger prospective studies should be performed (87).

As already mentioned in this systematic review, the possibility of detecting cancer by the use of blood tests based on miRNAs as diagnostic biomarkers has become a hot topic in recent years. This is illustrated by the next case-control study, in which Chen and his colleagues set out to discover a multi-miRNA panel with sufficient diagnostic accuracy to become a reliable detection tool for early-stage gastric cancer. Firstly, the nine miRNAs with the highest predictive power for distinguishing tissue samples of GC patients from tissue samples of healthy controls were identified. All nine of these biomarkers displayed AUCs > 0.90. In particular, miRNA-21 showed a high AUC of 0.993, with a sensitivity of 96.80% and a specificity of 95.10%. To increase the specificity and sensitivity of these biomarkers, two of these miRNAs were combined with their target mRNAs: miR-139 with FOS and miR-181a-1 with KAT2B. The two panels were then validated in a case-control study with 29 gastric cancer patients. The sensitivity of miR-181a-1 in combination with KAT2B increased to 96.55%. Analyzing the panel of miR-139 with FOS resulted in a disappointing specificity of 41.18% at 100% sensitivity. Furthermore, the levels of miR-133b, miR-133a-2 and miR-1-2 were significantly lower in tumor tissues than in normal tissues. However, in stage IV patients, these miRNAs were highly expressed in comparison to the expression levels of the same miRNAs in stage I, II and III patients. Therefore, these biomarkers could assist in staging patients with gastric cancer and help predict patient outcome. Although the results are promising, there were a few limitations in this study. The detection of tissue mRNA and miRNA requires an invasive biopsy and is therefore not suited as a detection tool for early-stage cancer. What is more, the validation study that followed included only a small sample. Thus, in order to confirm the positive results, prospective studies with larger sample sizes are mandatory. In summary, this study demonstrated that miR-17, miR-133b, miR-133a-2 and miR-1-2 could be novel prediction biomarkers for gastric cancer, whereas a panel that combined miR-181a-1 and KAT2B was found to have high diagnostic accuracy for gastric cancer (88).

In the screening study by Lomba-Viana et al., the potential use of the PG test in a European population was explored. As reported in previous studies, H. pylori infection starts a chain of events that begins with gastritis and evolves into atrophic gastritis, metaplasia, dysplasia and ultimately adenocarcinoma. Low serum PG levels have been reported to indicate the state of atrophic gastritis and could therefore be diagnostic biomarkers for early-stage gastric cancer. The PG test is based on measuring low PGI levels and assessing the PGI/PGII ratio. This test is currently being used in Japan for the identification of patients at high risk of gastric cancer. More than 13,000 patients participated in this cross-validation study and were followed up in a prospective cohort study. The sensitivity, specificity, PPV and NPV values of the PG test were similar to those in previous reports: 58-85%, 70-74%, 0.7-2.6% and 99.1-99.9%, respectively. False-negative PG tests were found in three participants, who all took proton-pump inhibitors. The results of this study supported the suspicion that proton-pump inhibitors change the gastric function of patients. Furthermore, the authors suggested that chronic atrophic gastritis is not required as a premalignant stage for a patient to develop gastric cancer. In summary, this study showed that the risk of gastric cancer increases when patients display a positive PG test, independently of H. pylori infection and in the absence of symptoms. However, the study did not establish whether the PG test would be useful as a detection tool in a European population. As it stands, the PG test does not seem sensitive or specific enough to be used as a diagnostic biomarker for early detection of gastric cancer, as some gastric cancers do not originate from atrophic gastritis. Nevertheless, this test could detect patients with atrophic gastritis who are at risk of progressing to gastric cancer (89).
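
The very low PPV combined with a very high NPV reported for the PG test is what Bayes' theorem predicts when a moderately accurate test is applied to a disease with low prevalence. The sketch below makes this explicit. The sensitivity and specificity are picked from within the reported ranges, while the prevalence of 1% is a hypothetical value chosen for illustration and is not reported in the study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied to a population with the
    given disease prevalence; all arguments are proportions."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity lie within the reported ranges;
# the 1% prevalence is an assumption made for this example only.
ppv, npv = predictive_values(sensitivity=0.70, specificity=0.72, prevalence=0.01)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

With these inputs the PPV is about 2.5% and the NPV about 99.6%, both within the ranges quoted above. In a screening setting, a negative PG test is therefore reassuring, whereas a positive test mainly selects patients for further examination.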

The case-control study designed by Tung and colleagues explored the diagnostic performance of ten serum biomarkers to differentiate gastric cancer patients from healthy controls. Eight of the ten biomarkers displayed significantly elevated levels in patients with gastric cancer when compared to the levels in healthy controls in the training group. Subsequently, the five best-performing biomarkers were selected among the ten proteins using a computational algorithm combined with a high-throughput method. Three algorithms (logistic regression, SVM and RF) were used to combine IgG to H. pylori, ADAM8, PGI, PGII and VEGF in a panel that was cross-validated in the training group. The SVM- and RF-based algorithms revealed higher sensitivity and specificity in comparison to the logistic regression method. Both PGI and PGII have been shown to be involved in atrophic gastritis, while COX2 is overproduced in gastric mucosa cells during H. pylori infection. An increase in beta-catenin was also observed as a consequence of H. pylori infection. In turn, COX2 and beta-catenin upregulation resulted in VEGF production. In summary, applying the SVM algorithm to the five-biomarker panel yielded the greatest diagnostic performance for gastric cancer in this study, with the highest sensitivity and specificity in comparison to RF and logistic regression analysis. Therefore, this SVM-based panel could supplement clinical gastroscopic diagnosis or could become a diagnostic tool for early-stage gastric cancer. However, before this panel can be put into practice, further validation in prospective cohort studies should be performed (90).

The evaluation of miR-421 in gastric juice as a potential detection tool for early-stage gastric cancer is the subject of the study by Zhang. Staging of gastric cancer is usually performed by analysis of a biopsy taken during gastroscopy. Gastric juice can easily be collected during gastroscopy examinations. Some studies have suggested that, because gastric juice is only produced in the upper digestive system, biomarkers in gastric juice should be more specific. In this study, it was found that miR-421 was overexpressed in gastric cancer biopsies in comparison to the miR-421 levels in normal tissue. Therefore, the authors suggested that miR-421 could play a role in stomach carcinogenesis and might serve as an early diagnostic biomarker of gastric cancer. Furthermore, using RT-qPCR, this study demonstrated that gastric juice miR-421 levels remained stable during storage, which is important for the reproducibility of the test. Subsequently, gastric juice levels of miR-421 were measured and analyzed in gastric cancer patients and healthy controls. It was found that gastric juice miR-421 could distinguish GC patients from healthy controls with an AUC of 0.767, a sensitivity of 71.4% and a specificity of 71.7%. These results indicated that gastric juice miR-421 could be a useful diagnostic biomarker in the future. However, further validation in prospective cohort studies and RCTs is required (91).

4.9 RENAL CANCER

The first of three articles discussing diagnostic biomarkers for renal cell cancer included in this systematic review was written by Morrissey and colleagues. This prospective phase 3 study explored the potential clinical utility of urine AQP1 and PLIN2 to diagnose renal cancer. The results of this study indicated that both biomarkers had favorable sensitivity and specificity for differentiating patients with RCC from healthy patients. The sensitivity and specificity of AQP1 were both 100%, whereas the sensitivity and specificity of PLIN2 were 100% and 91%, respectively. Furthermore, the AUCs of urine AQP1 and PLIN2 were 1.00 and 0.990, respectively. Similar results were found when comparing the levels of AQP1 and PLIN2 in patients with RCC to the levels of AQP1 and PLIN2 in patients with or without a cancer history, with AUCs of 0.991 and 0.996, respectively. Internal validation found AUCs of 0.990 and 0.997 for AQP1 and PLIN2, supporting the clinical utility of these two biomarkers. The biggest strength of this study was its prospective nature, thus meeting the criteria of a phase 3 study for cancer biomarker development. On top of that, this large-scale prospective investigation replicated the sensitivities and specificities that were observed in smaller, retrospective cohort studies. The most notable result, however, was the identification of three new patients with RCC among the 720 participants in the screening cohort through the use of these two biomarkers. They displayed elevated levels of urine AQP1 and PLIN2 and were subsequently found to have a renal mass on CT. Furthermore, the specificity of urine AQP1 and PLIN2 for RCC was found to be very high, indicating that these biomarkers can differentiate RCC patients from patients with other types of cancer with high confidence. These results suggested that urine AQP1 and PLIN2 might be suitable for population screening for renal cancer. However, as promising as these results are, the value of population screening to achieve greater detection, and theoretically better treatment outcomes, has not been proven yet, and renal cell cancer has a low prevalence. Moreover, imaging alone struggles to distinguish benign lesions from RCC. The treatment for an imaged renal mass is partial or total nephrectomy, which sometimes results in the removal of a fairly normal kidney with a benign lesion. Pathological examination of biopsies is conclusive in only 80% of cases, while the remaining 20% are non-diagnostic. In contrast, AQP1 and PLIN2 displayed great sensitivity and specificity for differentiating RCC from benign renal masses in this study and could be useful in the differential diagnosis of imaged renal masses. Lastly, it is important to mention that urine AQP1 and PLIN2 were able to detect clear cell and papillary RCC, but not the chromophobe subtype of RCC, although this subtype occurs in only 5% of all RCC patients. In summary, this prospective cohort study validated the clinical use of urine AQP1 and PLIN2 as biomarkers for early and non-invasive detection of RCC, accomplishing phase 3 of the cancer biomarker discovery program (92).

Various articles have stated over the years that miRNAs could serve as non-invasive diagnostic biomarkers for cancer. Recent studies have investigated the potential of miR-141 and miR-26a as biomarkers in prostate cancer, and of miR-29a and miR-92 in CRC. The case-control study of Fedorko focused on two other promising miRNAs as biomarkers for detecting RCC, more specifically miR-378 and miR-210. In this study, it was found that miR-378 and miR-210 expression levels were significantly elevated in RCC patients, and ROC analysis showed that these two biomarkers could be used to distinguish RCC patients from healthy controls. Combining miR-378 and miR-210 in one panel resulted in a higher diagnostic accuracy (AUC=0.8480), with a sensitivity of 80% and a specificity of 78%. In addition, the expression levels of both miRNAs were significantly decreased in serum samples from patients with RCC three months after radical nephrectomy. A correlation was found between elevated levels of miR-378 and both the clinical stage and the disease-free survival of RCC patients. In summary, serum miR-378 and miR-210 could serve as diagnostic or prognostic biomarkers in the management of RCC patients in the future (93).

In the case-control study of Mustafa, serum amino acid profiles were examined in a large group of RCC patients with matched controls. Statistically significant differences in the levels of 15 of the 26 amino acids were discovered between RCC patients and these healthy controls. Furthermore, a logistic regression model was constructed that incorporated the eight best-performing amino acids. ROC analysis of this model resulted in an AUC of 0.81, which meant the model’s capability to differentiate RCC patients from healthy controls was moderately high. Moreover, this model identified early-stage RCC patients with slightly less accuracy (AUC=0.76) than it detected late-stage tumors. Lastly, it was found that the logistic regression model displayed prognostic utility. Patients with low logistic regression scores had a significantly decreased likelihood of cancer recurrence and a longer overall survival than patients with higher scores. However, some of these results might be influenced by the fact that patients with higher-stage cancer tended to display higher model scores. Another limitation of this study was that detailed information on the control samples was missing. All in all, this study showed that alterations in serum amino acid profiles occur in early-stage and late-stage RCC patients. In addition, the newly created logistic regression model could be of use as a detection tool for RCC in the future (94).

4.10 GYNECOLOGIC CANCER

In a prospective study by Kemik and colleagues, the diagnostic and prognostic performances of YKL-40, HE-4 and DKK-3 for endometrial cancer were investigated to determine if these three biomarkers could be used as screening tools for endometrial cancer. Until now, although endometrial cancer occurs frequently in the general population, no effective screening method or highly sensitive and highly specific biomarker has been put forward for implementation in a screening program for endometrial cancer. For that reason, Kemik et al. set out to confirm the promising results that were reported in previous studies on these three biomarkers. In addition, CA125 serum levels were also measured and were found to be significantly elevated in patients with endometrial cancer when compared with the CA125 serum levels in healthy controls. The preoperative serum levels of HE-4 and YKL-40 were significantly elevated in endometrial cancer patients, especially in late-stage disease (stage II and III). However, there was no difference between preoperative serum levels of DKK-3 in patients with endometrial cancer and the serum levels of DKK-3 in healthy controls. In addition, it appeared that serum HE-4 levels have prognostic value, seeing as serum HE-4 concentrations were higher in patients with deep myometrial invasion and lymphovascular space involvement. ROC analysis indicated that HE-4 had more diagnostic value than YKL-40. Furthermore, HE-4 showed the ability to differentiate early-stage cancer patients from late-stage cancer patients, which makes it a superior diagnostic biomarker to YKL-40. The most important limitation of this study lay in its small sample size. If these results were to be confirmed in a prospective study with a larger number of cancer patients, HE-4 could be implemented in a screening program for endometrial cancer in the future (95).

Despite the reduction of the death rate due to cervical cancer in developed countries, more sensitive and specific screening methods are needed for the early detection of cervical cancer or its precursors, such as high-grade intraepithelial cervical lesions. In the prospective cohort study of Duvlis and colleagues, the HPV E6/E7 mRNA test and the HPV DNA test were compared to each other in cervical specimens. A higher rate of concordance in CIN2+ lesions was detected, which indicated that the HPV E6/E7 mRNA test could be a more specific test for CIN2 detection. As it has already been shown that E6 and E7 expression indicates cancer development, detecting elevated E6 and E7 expression in benign lesions could predict the potential for oncogenic transformation from a benign lesion to severe dysplasia. Furthermore, the HPV E6/E7 mRNA test displayed a higher specificity and PPV for the detection of cervical cancer and CIN2+ than the HPV DNA test (both in cases that were diagnosed by cytology and in cases diagnosed by histology). However, the sensitivity of the HPV DNA test was consistently higher. The specificity of the HPV E6/E7 mRNA test for histological CIN2+ was calculated at 50%, whereas the specificity of the HPV DNA test was merely 18.8%. The sensitivity of the HPV E6/E7 mRNA test was 93.1% for the detection of CIN2+ lesions, while the HPV DNA test demonstrated a sensitivity of 100% for the detection of histologically confirmed CIN2+. The conventional Pap smear test displayed a poor sensitivity for the detection of high-grade cervical lesions. Furthermore, there were negative mRNA results in HPV DNA positive patients, which meant that not all HPV infections express E6 and E7, as was expected from transient infections. In addition, only 34.6% of patients with CIN1 lesions had a positive HPV E6/E7 mRNA test. Low HPV E6 and E7 expression was further detected in normal and benign lesions. These results could be explained by the fact that HPV infection can regress or spontaneously resolve. In summary, the low sensitivity of the cytology examination and the low specificity of the HPV DNA test, as demonstrated in this study, confirmed that a complementary test with a higher specificity, such as the HPV E6/E7 mRNA test, could be useful when implemented in the current screening program for cervical cancer. The most important advantages of implementing the HPV E6/E7 mRNA test in a screening algorithm would be less frequent colposcopy referrals and lower costs (96).

In the cross-sectional study of Kan and colleagues, three methylated genes, PAX1, SOX1 and NKX6-1, were examined as diagnostic biomarkers for the detection of CINIII, which is a potential precursor of cervical cancer. DNA hypermethylation has been detected in multiple cancers and could play a role in carcinogenesis. In this study, PAX1 had the highest AUC of the three methylated genes for the detection of CINIII lesions and therefore the highest diagnostic value as a screening biomarker. Furthermore, the sensitivity and specificity of PAX1 were higher than 80% for the detection of CINIII lesions or worse. In CINII lesions, however, the levels of PAX1 were low. Since only 16% of CINII lesions have an increased risk of progressing to severe dysplasia, accurate detection of CINIII lesions is more important. The sensitivity of PAX1 for the detection of CINII lesions was 77.4%. These results suggested that elevated levels of methylated PAX1 could be related to carcinogenesis in cervical dysplasia. In addition, the sensitivity and specificity of PAX1 for the detection of ASCUS, LSIL and AGC were 73% and 87%, respectively. If PAX1 were used as part of the triage protocol, the number of referrals for colposcopy or biopsy would be reduced by 60%. Lastly, combining PAX1 testing with HPV 16/18 typing could result in a sensitivity of 90% and a specificity of 83%. In summary, this study confirmed the potential of PAX1 as a diagnostic biomarker for early detection of cervical cancer. However, future studies should test the utility of combining PAX1 with HPV 16/18 typing, and prospective population-based studies are needed to determine how PAX1 can be implemented in a screening program (97).

4.11 BLADDER CANCER

In the case-control study by Chung and colleagues, 10 hypermethylated genes were tested as potential biomarkers for the early detection of bladder cancer. Previous reports have confirmed that DNA hypermethylation is present in both non-muscle invasive and muscle invasive bladder cancers. In this study, various panels consisting of 2, 3, 4 or 5 methylated genes were measured in first-voided urine. Four panels, two with four biomarkers and two with five, displayed the highest diagnostic performance for bladder cancer. The best performing 5-gene panel, consisting of MYO3A, CA10, NKX6-2, DBC1 and either SOX11 or PENK, displayed a sensitivity of 85% and a specificity of 95%. Furthermore, the PPV of this 5-gene panel would be 52%, based on a bladder cancer prevalence of 5.9% in high-risk patients (history of smoking or hematuria and dysuria). In addition, this biomarker panel performed well in non-muscle invasive tumors (pTa, Tis and pT1 stages), which are considered to be early-stage tumors. Compared with the biomarkers from previous reports, this panel displayed the highest sensitivity (81%) and specificity (95%) for non-muscle invasive bladder cancer. Thus, this biomarker panel could become an early-detection tool for bladder cancer. Used as a complementary screening test in high-risk patients, it could also reduce the number of cystoscopy referrals. A limitation of this study was that it focused on genes from promoter CpG islands. Investigating other DNA sequences, such as exonic CpG islands, may lead to the discovery of more sensitive biomarkers for bladder cancer. Although the results are promising, these hypermethylated genes need to be tested in larger prospective studies that include a high-risk population with a history of smoking or symptoms such as hematuria, dysuria, urgent urination and recurrent urinary tract infections. Moreover, these genes still need to be validated as biomarkers for the detection of early-stage, low-grade tumors, as this study did not include enough G1-grade tumors (98).
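
The PPV of 52% quoted for the 5-gene panel can be reproduced from the reported sensitivity, specificity and assumed prevalence using Bayes' theorem. The following is a minimal sketch in Python; the function name `predictive_values` is chosen here for illustration and does not come from the study of Chung and colleagues.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Values quoted in the text for the 5-gene panel in high-risk patients.
ppv, npv = predictive_values(sensitivity=0.85, specificity=0.95, prevalence=0.059)
print(f"PPV = {ppv:.1%}, NPV = {npv:.1%}")
```

With these inputs the PPV comes out at roughly 52%, in line with the figure above. The NPV is not reported in the text and is shown only because it follows from the same calculation.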

In the study of Eissa, survivin RNA and matrix metalloproteinases 2 and 9 (MMP-2 and MMP-9) were tested as diagnostic biomarkers for bladder cancer. The diagnostic performance of these novel biomarkers was compared with that of conventional urine cytology. The results of this case-control study showed that the expression levels of urine survivin were significantly increased in patients with bladder cancer compared with patients with benign lesions (P<0.001). Survivin had a sensitivity of more than 75% and a specificity of 95%, implying that survivin has a moderately high diagnostic value for bladder cancer. Furthermore, urinary MMP-2 and MMP-9 were significantly increased in the malignant group compared with the benign and control groups. However, the sensitivity of MMP zymography was lower than the sensitivity of survivin (67.3% vs 76.1%). Previous studies already suggested that bladder cancer cells produce MMP-2, and this was further supported by this study. A panel consisting of survivin and MMP zymography displayed a sensitivity of 91.3% and a specificity of 85%. Lastly, combining zymography and survivin with cytology yielded the highest sensitivity (95.6%), at the cost of a lower specificity (85%). According to the results of this study, survivin, MMP-2 and MMP-9 were all superior to cytology as detection tools for bladder cancer (99). However, this study had a few pitfalls. No ROC analysis was performed. As a consequence, no AUC was calculated and the overall diagnostic performance of these biomarkers could not be fully assessed. Furthermore, no detailed information was given about the cancer stages of the 66 patients. Therefore, this study could not assess the diagnostic performance of these biomarkers in early-stage bladder cancer. In conclusion, combining survivin mRNA measured by nested RT-PCR and MMP zymography with cytology could improve overall diagnostic power for bladder cancer, but additional prospective and more detailed studies are needed to confirm these results (99).
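
The gain in sensitivity from combining survivin with MMP zymography can be illustrated with the standard "positive if either test is positive" rule. The sketch below assumes that the two tests are conditionally independent, which the study does not state, so it is an approximation and not the authors' calculation. The function names are hypothetical.

```python
def parallel_sensitivity(sensitivities):
    """Sensitivity of an 'any test positive' rule.

    Assumes the tests are conditionally independent given disease status.
    """
    miss_all = 1.0
    for s in sensitivities:
        miss_all *= (1.0 - s)  # probability that every test misses the case
    return 1.0 - miss_all


def parallel_specificity(specificities):
    """Specificity of an 'any test positive' rule under the same assumption."""
    all_negative = 1.0
    for s in specificities:
        all_negative *= s  # every test must be negative in a healthy subject
    return all_negative


# Single-marker sensitivities quoted in the text (survivin, MMP zymography).
combined = parallel_sensitivity([0.761, 0.673])
print(f"Expected combined sensitivity: {combined:.1%}")
```

Under this assumption the expected combined sensitivity is about 92%, close to the 91.3% reported for the panel. The same rule explains why specificity drops when tests are combined in parallel: every additional test adds its own false positives.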

In the study of Lai and colleagues, urine UPK3A was evaluated as a novel biomarker for the detection of bladder cancer. The urine-based NMP22 test has been widely used for early diagnosis and monitoring of recurrent bladder cancer, but is plagued by many false-positive results due to increased NMP22 levels in patients with benign urological diseases. In addition, cytology demonstrates high specificity but is not a very sensitive test. For these reasons, new biomarkers to identify early-stage bladder cancer are under investigation. In this prospective cohort study, the diagnostic performance of UPK3A as a biomarker for bladder cancer was compared with that of cytology and the NMP22 test. UPK3A levels were significantly raised in bladder cancer patients compared with normal individuals. Moreover, ROC analysis showed that UPK3A had a high AUC (0.907) for distinguishing bladder cancer patients from healthy controls, with a sensitivity and specificity of 83% each. Based on these results, UPK3A could be a powerful diagnostic biomarker for bladder cancer in the future. In addition, this urine-based UPK3A test outperformed NMP22 and cytology in terms of sensitivity and specificity for bladder cancer. However, UPK3A levels were also raised in 13 of the 44 patients with benign urological disorders (false positives), and in 21 patients with bladder cancer, UPK3A levels were below the cut-off of 0.0685 (false negatives). Urinary tract infection, urolithiasis and benign prostatic hyperplasia were associated with higher levels of UPK3A. The biggest limitation of this study was the relatively small sample size. As a consequence, a larger, multicenter study is needed to validate the results presented in this study (100).
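
The false-positive count in the benign group can be turned into a specificity for that subgroup. This is a small illustrative calculation based only on the counts quoted above (13 false positives among 44 patients with benign urological disorders). It is not a figure reported by Lai and colleagues.

```python
def specificity(true_negatives, false_positives):
    """Proportion of non-cancer subjects correctly classified as negative."""
    return true_negatives / (true_negatives + false_positives)


benign_total = 44            # patients with benign urological disorders
benign_false_positives = 13  # of whom UPK3A was above the cut-off

benign_specificity = specificity(
    benign_total - benign_false_positives, benign_false_positives
)
print(f"Specificity in the benign group: {benign_specificity:.1%}")
```

The result, roughly 70%, is lower than the 83% obtained against healthy controls, which illustrates why the composition of the control group matters when specificity is interpreted.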

Recent studies have focused on the discovery of a new screening method for the detection of early-stage bladder cancer without the need for cystoscopy. A large number of biomarkers and techniques have been investigated and were found to be more sensitive for the detection of bladder cancer than cytology. For most novel biomarkers, however, sensitivity is lowest for early-stage disease and, consequently, a negative result cannot accurately exclude bladder cancer. Furthermore, the cost and labor intensity of these assays are often high. In the prospective validation study by Renard, two methylated genes, NID2 and TWIST1, were identified and evaluated as urine-based diagnostic biomarkers for early-stage bladder cancer. The results indicated that increased levels of methylated TWIST1 and NID2 are associated with bladder cancer. TWIST1 overexpression has been reported in several other types of cancer, such as breast, lung, cervical and gastric cancer. NID2, on the other hand, is a basement membrane protein involved in maintaining basement membrane and tissue architecture. Both genes demonstrated sensitivities of > 90% for the detection of bladder cancer, which is much higher than the sensitivity of cytology. However, cytology is still an important part of the diagnostic work-up of bladder cancer because of its high specificity (96%). Therefore, the specificity of these two genes was tested in a dataset with a control group that did not consist solely of healthy individuals, but included patients with benign urological diseases as well. As it turned out, these two genes displayed specificities similar to that of cytology (94% and 91% for TWIST1 and NID2, respectively). Furthermore, although the sample size was small, the PPV of 86% and the NPV of 95% for the two methylated genes were favorable. Combining cytology with a panel of NID2 and TWIST1 led to an increase in NPV to 98%, but a decrease in PPV to 82%. In addition, combining NID2 and TWIST1 in one panel resulted in a higher sensitivity and specificity. ROC analysis revealed an AUC of 0.93 for the detection of early-stage bladder cancer, indicating that this panel has high diagnostic accuracy for the detection of bladder cancer. A limitation of this study was that only tissue samples from bladder cancer lesions (and no tissue samples from healthy bladder wall) were examined in the second phase. Therefore, this study did not investigate whether the normal-appearing bladder mucosa neighboring the bladder cancer lesions showed higher levels of TWIST1 and NID2 methylation. In summary, two novel biomarkers, TWIST1 and NID2, were discovered and validated in this study. These methylated genes could be highly sensitive and specific biomarkers for the detection of bladder cancer, particularly when they are combined in one panel. To further confirm these results and validate these biomarkers, prospective cohort studies with large samples are required (101).

4.12 ORAL CANCER

In the study of Hutajulu, methylation analysis of the promoter region of tumor suppressor genes (TSGs) was tested as a complementary test for screening of nasopharyngeal carcinoma (NPC). Previous reports have shown that methylation alterations occur early in carcinogenesis. In theory, a methylation marker panel should therefore be able to differentiate early-stage patients from non-cancer individuals. Altered methylation was detected in five TSGs. Subsequently, these five genes, CHFR, RIZ1, WIF1, p16 and RASSF1A, were identified as independent complementary biomarkers to EBV IgA serology and DNA load for the early diagnosis of NPC. The methylation frequency of single genes varied from 29.2% to 79.2%. These five TSGs demonstrated high methylation levels in NPC patients, but no more than 4% in healthy EBV-positive controls. As a consequence, all five TSGs were able to differentiate the NPC patients from the normal controls. When these five biomarkers are combined in one panel, the resulting qualitative methylation-specific PCR test would be less labor-intensive and more cost-effective than other assays such as Q-MSP (quantitative methylation-specific PCR) or bisulfite sequencing. Furthermore, this panel displayed a sensitivity of 98% for the detection of NPC patients. In this study, no significant differences in methylation frequency were observed between early-stage and late-stage NPC patients. However, a correlation was found between the methylation status and viral DNA load in NPC patients. As a result, the authors suggested that there is a link between EBV infection and elevated methylation levels. In four other genes, DAPK1, CADM1, CDH13 and DLC1, the methylation frequency was also elevated in the high-risk group and in normal EBV carriers. For that reason, their diagnostic accuracy was limited (102). In summary, DNA promoter methylation of TSGs could be used as a complementary test for early NPC detection among high-risk patients when combined with EBV-based biomarkers (102).

In the observational study by Vokač et al., two genes, SOX2 and hTERC, were examined as potential OSCC marker genes. Exfoliated cells were collected from OSCC lesions using the cytobrush technique. Subsequently, SOX2 and hTERC levels measured by FISH on cytobrush samples were evaluated as non-invasive screening tools for OSCC detection. A limitation of the cytobrush technique, however, was that malignant cells are not always detectable in the smear. High SOX2 expression has already been reported in other oral squamous cell carcinomas such as oral cavity and tongue carcinomas. The TERC gene is the most frequently amplified oncogene in cervical cancer precursors and is associated with high-grade cervical dysplasia and cancer. Previous studies have shown that TERC gene amplification can already be detected in precancerous laryngeal epithelial changes. In this study, significantly higher levels of SOX2 and TERC amplification were found in all patients with different stages of OSCC, independently of lymph node status and tumor classification. Furthermore, 70% of all OSCC patients were detected in brush smears using the FISH technique. As a consequence, precancerous lesions could be detected using this technique. In summary, amplifications of the SOX2 and TERC genes were found in all squamous cell carcinomas. Therefore, these two genes could be useful as diagnostic or prognostic biomarkers in the future. However, this non-invasive method of screening for OSCC should be further refined in case-control studies and prospective cohort studies (103).

The study of Rujkumar is the first cohort study to evaluate the diagnostic performance of serum and salivary CYFRA21-1 in patients with premalignant oral lesions and in OSCC patients. CYFRA21-1 is a soluble fragment of cytokeratin-19 that is measured with monoclonal antibodies. Cytokeratin is expressed in normal epithelia and in carcinomas. Previous reports have shown that CYFRA21-1 levels increase in carcinoma due to cytokeratin release during epithelial transition to malignancy, which in turn led to testing of CYFRA21-1 as a biomarker for different types of cancer. In this study, both serum and salivary concentrations of CYFRA21-1 were significantly elevated in OSCC patients compared with patients with premalignant lesions. Moreover, salivary CYFRA21-1 levels were threefold higher than serum CYFRA21-1 levels. ROC analysis revealed that salivary CYFRA21-1 had a high sensitivity for the detection of OSCC patients. Furthermore, elevated CYFRA21-1 levels were found in the saliva, but not in the sera, of patients with premalignant lesions. The authors therefore suggested that some alterations in CYFRA21-1 levels occur earlier in the saliva of patients with OSCC precursors than in the serum. On top of that, ROC curve analyses demonstrated that salivary CYFRA21-1 had a higher sensitivity and a higher specificity than serum CYFRA21-1 for the detection of OSCC. The sensitivity of salivary CYFRA21-1 in particular was much higher than that of serum CYFRA21-1, which is important if salivary CYFRA21-1 were to be implemented in a screening test for OSCC. Moreover, significantly elevated levels of salivary CYFRA21-1 were found in OSCC patients with poorly differentiated lesions, which pointed to the potential of salivary CYFRA21-1 as a prognostic biomarker for OSCC. Nevertheless, this result needs to be validated in other studies that focus on CYFRA21-1 levels in saliva and serum before and after treatment. A limitation of this study was that no patients with erythroplakia or oral lichen planus were included in the control group with premalignant lesions. In summary, the results of this study suggest that salivary CYFRA21-1 could become a detection tool to differentiate patients with OSCC from patients with premalignant lesions. If salivary CYFRA21-1 were turned into a screening biomarker, many unnecessary biopsies could be prevented. However, prospective studies with larger sample sizes are required to confirm the results of this study and to validate salivary CYFRA21-1 as a diagnostic biomarker for OSCC (104).

In the study of Holzinger and colleagues, the predictive accuracy of serum antibodies to the HPV16 oncoproteins E6 and E7, the regulatory proteins E1 and E2 and the major capsid protein L1 was calculated in HNSCC patients in order to validate these antibodies as diagnostic biomarkers for HPV16-driven OSCC and HNSCC outside the oropharynx. The results of this study demonstrated that HPV16 E6 seropositivity and a newly constructed algorithm named HPV sero-pattern had high sensitivity and specificity for the detection of HPV-driven OSCC patients (>96% and 98%, respectively). The sensitivity for diagnosing HNSCC outside the oropharynx was lower (50%) for both biomarkers, while their specificity was high (100%). The specificity of E6 seropositivity and HPV sero-pattern (100%) was much higher than that of p16 immunostaining, with or without HPV DNA positivity (<83%). HPV antibodies were elevated in almost all HPV-driven HNSCC patients, while antibody detection was scarce in non-HPV-driven HNSCC patients, in whom only antibodies to single E proteins were detected. In patients with OSCC, E6 antibodies were most frequently detected, followed by E2, E7 and E1 antibodies. Only one of the 142 non-HPV-driven HNSCC patients was E6 seropositive, which makes HPV serology an excellent diagnostic biomarker for HPV-driven HNSCC. One limitation of this study was, given the low prevalence of this type of cancer, the small number of patients with HPV-driven HNSCC located outside the oropharynx. Therefore, the sensitivity of HPV16 serology for this type of tumor could have been biased. In addition, the new HPV sero-pattern algorithm must be further validated in larger cohort studies in order to confirm its utility as a reliable biomarker for HPV-driven OSCC. In summary, this study showed that HPV serology can detect patients with HPV-driven OSCC with high sensitivity and specificity. For that reason, it could be a powerful biomarker or early-detection tool for OSCC. However, further research is needed to prove that this biomarker has diagnostic value in screening programs for OSCC (105).

4.13 ESOPHAGEAL CANCER

The prognosis of esophageal cancer (EC) is still poor due to the late development of symptoms in this type of cancer. However, early detection could significantly improve overall survival. Therefore, Guo and colleagues conducted a study that evaluated the diagnostic performance of a newly constructed classification tree based on the differences in protein profile between 78 patients with EC and 95 healthy controls. First, the protein profiles of these patients and controls were analyzed using MALDI-TOF-MS. Then, four proteomic peaks were identified as possible biomarkers for the detection of EC. Subsequently, these biomarkers were combined in a classification tree, which displayed a sensitivity of 82.5% and a specificity of 84.4% for the detection of EC patients. ROC analysis was not performed. In summary, the panel of four newly identified biomarkers could become a powerful early-detection tool for EC. Nevertheless, further prospective cohort studies and RCTs with larger sample sizes should validate these results before this panel can be used in a clinical setting (106).

In the cohort study by Xu, auto-antibodies against L1CAM were analyzed as a possible diagnostic biomarker for ESCC. L1CAM is involved in fasciculation, neurite outgrowth, cell adhesion and migration in the nervous system. However, altered expression of L1CAM has been reported in uterine and ovarian carcinomas, renal cell carcinomas, CRC, NSCLC and gastric cancer, where it is associated with cancer growth, invasion, metastasis and chemoresistance, and L1CAM could be a prognostic biomarker for ESCC. In this study, an ELISA was constructed to measure the levels of auto-antibodies against L1CAM in a first cohort, and its diagnostic performance was validated in a second, independent cohort. Elevated levels of auto-antibodies against L1CAM were found in patients with ESCC, which suggested that these auto-antibodies could serve as diagnostic biomarkers for early-stage ESCC. Furthermore, the data in this study indicated that auto-antibodies are produced early in the process of carcinogenesis and could potentially be used as screening biomarkers for cancer. Auto-antibodies against L1CAM, however, did not show any correlation with disease outcome. On top of that, although auto-antibodies against L1CAM displayed high specificity in this study, their sensitivity for early ESCC was disappointingly low (25.2%). As a result, the potential of auto-antibodies against L1CAM as a clinical application for screening of symptomless or early-stage patients is limited. This problem could be overcome by combining multiple auto-antibodies in a panel. The greatest limitation of this study was the relatively small number of early-stage ESCC patients in the second cohort. In summary, the results of this study demonstrated that auto-antibodies against L1CAM are a potential biomarker for the diagnosis of early-stage ESCC, but did not display prognostic value for ESCC. Future case-control and prospective cohort studies should focus on combining multiple auto-antibodies in a panel for the detection of ESCC (107).

4.14 SKIN CANCER

In the study by Wachsman, a 17-gene classifier was identified in specimens obtained by non-invasive tape stripping of the stratum corneum and subsequently tested as an early-detection tool for melanoma. Initially, 312 genes that were significantly differentially expressed in melanoma compared with naevi or normal skin were identified as candidate biomarkers. Subsequently, epidermal genetic information retrieval or EGIR (a method that samples material from the stratum corneum) was used to detect signals from cells in the basal layer of the epidermis. Melanoma could be differentiated from naevi using the EGIR method with 422 differentially expressed genes. The sensitivity and specificity of this model for the detection of melanoma were 100% and 95%, respectively. When the number of genes in this model was reduced to 17, the sensitivity remained 100%, while the specificity remained high (88%) for the detection of in situ and invasive superficial spreading melanoma, lentigo maligna and lentigo maligna melanoma. The AUC of this 17-gene model was 0.955, indicating a high diagnostic performance for melanoma. On top of that, this model could distinguish melanoma from basal cell carcinoma, solar lentigo and normal skin. However, the 17-gene classifier produced 13 false positives. In other words, the test classified 13 naevi as melanoma. This could be the consequence of a sampling error due to the small sample size, or a result of the false-positive naevi exhibiting pagetoid spread of melanocytes. The next step is to validate this 17-gene model in a prospective cohort study or in an RCT (108).

4.15 OSTEOSARCOMA

In the study of Ouyang, the diagnostic performance of five plasma miRNAs as novel biomarkers for osteosarcoma was assessed. Circulating miR-21 was upregulated and miR-199a-3p and miR-143 were downregulated in patients with osteosarcoma compared with healthy controls. These three miRNAs could be used to differentiate patients with osteosarcoma from healthy controls with high accuracy. Furthermore, the levels of these three miRNAs were associated with the metastatic status and with the histological subtype of osteosarcoma patients. The increase in miR-21 levels and the decrease in miR-143 levels in patients with metastatic osteosarcoma indicated that miR-143 and miR-21 are involved in the regulation of the migration of cancer cells, as was reported in previous studies. In addition, it was found that combining these three miRNAs resulted in a high AUC for distinguishing patients with osteosarcoma from healthy controls. The strength of this study was that the healthy volunteers were matched with the osteosarcoma cases by age and sex, whereas the biggest limitation was the small sample size. In conclusion, this study demonstrated that a three-miRNA panel could be a diagnostic biomarker for early detection of osteosarcoma in the future. However, before this panel can be put into practice, prospective cohort studies with larger sample sizes are required to confirm the promising results of this study and to reveal the mechanisms behind the dysregulation of these miRNAs (109).

4.16 THYROID CANCER

In the study of Herrmann, 500 patients with thyroid nodular disease were screened with hCT to determine the optimal cut-off value of hCT for the detection of medullary thyroid cancer (MTC). One patient with MTC was diagnosed based on elevated levels of hCT after pentagastrin stimulation. The cut-off level of basal hCT used to decide whether a pentagastrin stimulation test is needed remained uncertain, as basal hCT values between 10 and 100 pg/mL were a grey area where true positives and false positives overlap. In this study, pentagastrin testing was carried out in all patients with basal hCT levels of more than 10 pg/mL. Of 16 patients with hCT > 10 pg/mL, 4 displayed a hCT of more than 100 pg/mL after pentagastrin stimulation and were treated by thyroidectomy. Histological analysis revealed one MTC and two cases of C-cell hyperplasia (CCH). Previous reports suggested that the cut-off value for basal hCT should be increased to 15 pg/mL, replacing the currently used cut-off level of 10 pg/mL. This proposition was supported by this study, as none of the patients with MTC or CCH displayed basal hCT levels < 15 pg/mL. In addition, the results indicated that the risk of MTC is very low when basal hCT levels are between 10 and 15 pg/mL, and in patients with hCT < 100 pg/mL after pentagastrin stimulation. In summary, this study showed that basal hCT measurement should be the first step in patients with thyroid nodular disease. One patient with early-stage MTC was detected among 500 patients with thyroid nodular disease using hCT measurement. Screening these patients with a cut-off level of 15 pg/mL for basal hCT could reduce the number of thyroidectomies substantially (110).
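
The two-step screening logic described above (basal hCT first, pentagastrin stimulation only above a basal cut-off) can be sketched as a simple decision function. This is an illustration and not the protocol of Herrmann: the function name and the returned labels are hypothetical, and all values are assumed to be expressed in pg/mL.

```python
def triage_hct(basal, stimulated=None, basal_cutoff=10.0, stimulated_cutoff=100.0):
    """Return the next step for a patient with thyroid nodular disease.

    basal and stimulated are hCT concentrations, assumed to be in pg/mL.
    The returned labels are illustrative, not clinical recommendations.
    """
    if basal <= basal_cutoff:
        return "no further testing"
    if stimulated is None:
        return "pentagastrin stimulation test"
    if stimulated > stimulated_cutoff:
        return "thyroidectomy"
    return "low risk of MTC"


# A basal value of 12 triggers stimulation with the cut-off of 10,
# but not with the proposed cut-off of 15.
print(triage_hct(12))
print(triage_hct(12, basal_cutoff=15.0))
print(triage_hct(12, stimulated=150))
```

Raising `basal_cutoff` from 10 to 15 removes the patients in the 10 to 15 range from further testing, which is the mechanism by which the higher cut-off would reduce the number of stimulation tests and thyroidectomies.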

4.17 LEUKEMIA

Chronic lymphocytic leukemia (CLL) is the most prevalent form of leukemia in the Western Hemisphere. CLL patients often go into remission after treatment with chemotherapy, but almost all of these patients relapse sooner or later. Researchers are investigating various methods that could predict disease progression in CLL patients. As a result, the search for prognostic biomarkers that could stratify CLL patients into treatment groups has expanded. In the study by Papageorgiou, the predictive and diagnostic performance of mRNA expression of the apoptosis-related gene BCL2L12 as a biomarker for CLL was assessed and compared with that of BCL2 mRNA expression. The results of this study indicated that BCL2L12 mRNA was significantly overexpressed in CLL patients compared with healthy controls. Furthermore, the expression of BCL2 was also elevated in CLL patients compared with controls. The mRNA expression levels of BCL2 and BCL2L12 were correlated, and both were negatively correlated with the early apoptosis index. To assess the predictive performance of both biomarkers, ROC analysis was performed, which showed that BCL2L12 had a higher AUC than BCL2 (0.833 vs. 0.776). Combining the two biomarkers in a logistic regression model did not improve the AUC significantly (0.840). Furthermore, BCL2L12 mRNA expression was analyzed as a dichotomous and as a continuous variable, revealing a 5.5-fold higher risk for CLL. As a result, it can be suggested that BCL2L12 mRNA could serve as a biomarker for the triaging of CLL patients with potentially higher clonal lymphocytosis who need further follow-up. However, BCL2L12 mRNA expression was not correlated with other prognostic parameters such as LDH levels, CD38 expression or IGHV mutational status. What is more, BCL2L12-positive patients were significantly more often associated with advanced-stage CLL than BCL2L12-negative patients. This result may be related to the increasing tumor size. BCL2 mRNA expression status, on the other hand, was significantly related to LDH and CD38 expression status. Lastly, the prognostic value of BCL2L12 mRNA expression in patients with CLL was analyzed by examining the overall survival rate. Higher levels of BCL2L12 mRNA expression were correlated with significantly poorer survival rates. In contrast, higher levels of BCL2L12 mRNA expression in patients with breast, colon or gastric cancer were found to be correlated with higher overall survival. In summary, using BCL2L12 mRNA expression as a diagnostic biomarker, patients with CLL could be differentiated from healthy controls with high accuracy. Furthermore, BCL2L12 mRNA expression had high independent predictive value and was associated with advanced-stage CLL. Although the results were very promising, the mechanism behind BCL2L12 mRNA expression is still unknown, and validation of these results in prospective cohort studies is needed before this biomarker can be used in clinical practice (111).

4.18 VARIOUS TYPES OF CANCER

In the screening study of Wen and colleagues, the diagnostic performance of an eight-biomarker panel was evaluated in 314 patients with various types of cancer. The most notable result was that most malignancies were detected in patients aged between 70 and 79 years. Furthermore, elevated biomarkers were detected in 179 of the 314 participants in the year leading up to diagnosis. However, 4674 patients who were screened with this panel tested positive for one or more of these biomarkers and did not develop cancer in the following year. Fifteen of these patients did develop cancer later. The sensitivity of the panel for the detection of HCC was calculated at 92.3%, which was significantly higher than the sensitivity of the liver-specific biomarker AFP (63.3%). Furthermore, the sensitivity for the detection of CRC was 76.9%, which was higher than the sensitivity of iFOBT in a previous study (6.98%). With regard to the sensitivity of this panel for early-stage lung cancer, the result was unclear because 88.9% of the patients with lung cancer were in an advanced stage. Lastly, the multi-analyte biomarker screening panel displayed poor sensitivity for the detection of oral cancer (0%) and breast cancer (37.5%), especially in terms of detecting early-stage disease. In summary, the results of this study suggested that a multi-analyte biomarker panel could be clinically useful for the screening of multiple types of tumors during health check-ups. However, due to the low number of patients per tumor type, further prospective studies are required to evaluate the clinical utility of this panel (112).
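
To put the counts quoted above in perspective, the short Python sketch below derives an overall sensitivity and a positive predictive value from them. These two figures are not reported in the text. They rest on the simplifying assumptions that the 179 patients with elevated biomarkers count as true positives and that all 4674 marker-positive individuals without cancer in the following year count as false positives (the fifteen who developed cancer later are not treated separately). The function names are illustrative.

```python
def sensitivity(true_positives: int, total_cases: int) -> float:
    """Fraction of cancer patients flagged by the panel."""
    return true_positives / total_cases


def positive_predictive_value(true_positives: int, false_positives: int) -> float:
    """Fraction of panel-positive individuals who actually had cancer."""
    return true_positives / (true_positives + false_positives)


# Counts quoted in the paragraph above.
cancers_total = 314
cancers_with_elevated_marker = 179
# Assumption: every marker-positive individual without cancer within one
# year is counted as a false positive.
positives_without_cancer = 4674

overall_sensitivity = sensitivity(cancers_with_elevated_marker, cancers_total)
ppv = positive_predictive_value(cancers_with_elevated_marker, positives_without_cancer)

print(f"Overall sensitivity: {overall_sensitivity:.1%}")
print(f"Positive predictive value: {ppv:.1%}")
```

Under these assumptions, a little over half of the cancers were flagged, while only a small fraction of positive panel results corresponded to a cancer diagnosed within one year. This is in line with the author's caution about the clinical utility of the panel.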

The study by Wang et al. investigated the potential of serum TK1 (STK1) as a diagnostic biomarker for the detection of a multitude of cancers. Furthermore, the diagnostic performance of STK1 for seven different types of cancer was compared to that of serum CEA and serum AFP. STK1 is involved in the proliferation of cells and indicates a faster growth rate of both normal cells and different types of tumor cells. For instance, STK1 levels were higher in stomach cancer patients than in patients with cervical cancer, and it has been reported that gastric tumors grow faster than cervical tumors. In addition, STK1 had a higher sensitivity for the detection of cancer when compared to CEA and AFP. Combining these three markers resulted in an increase in sensitivity by 20%, but this result was not statistically significant. The lower sensitivity of CEA and AFP for all cancer types in general can be explained by the fact that these are tumor-specific biomarkers, whereas STK1 is a general proliferation marker, which occurs in most types of cancer. Furthermore, STK1 demonstrated a sensitivity for the detection of liver cancer similar to that of AFP, which is a specific liver biomarker. Therefore, STK1 could be used as a liver cancer biomarker. Another interesting result was the higher sensitivity of STK1 for cervical cancer than that of CEA. Combining CEA and STK1 demonstrated a better accuracy for the detection of CRC, lung and liver cancer patients, while combining AFP with STK1 increased the sensitivity for the detection of patients with liver cancer. In summary, STK1 was more sensitive than CEA and AFP for the screening of eight different types of cancer. However, a combination of more tumor-specific tumor markers should be used to detect cancer (113).

In the screening study by Chen and colleagues, the diagnostic performance of the STK1 test was assessed in 35365 individuals. Previous reports have already shown that TK1 has a role in DNA synthesis and that this biomarker is expressed in multiple types of proliferating cells (both in normal cells and in cancer cells). In the serum of patients with cancer, however, the concentration and activity of TK1 were elevated when compared to the TK1 levels in the serum of healthy individuals. Furthermore, TK1 was increased during acute illness such as infection or inflammation, during physiological changes such as menstruation, and during surgery. Therefore, it cannot be excluded that an increase in TK1 is in part due to stimulation of the immune system (114). In this study, the STK1 assay was able to distinguish patients with 11 different types of cancer from healthy individuals and from patients with various non-malignant diseases. The ROC value of STK1 was 0.96 and the positive likelihood ratio was 233.73, with a sensitivity and a specificity of 0.798 and 0.997, respectively. Thus, these results suggested that STK1 could be used as a screening test in the future. In addition, patients with STK1 values > 2 pM who do not show any symptoms of malignant disease should be followed up because they have an increased risk of cancer development. The authors suggested that patients with STK1 values > 2 pM should have their STK1 values and general health rechecked within 3-6 months. Furthermore, follow-up data were available for 24% of all patients, and it was found that patients with elevated STK1 levels had a 30-fold higher risk of developing new malignancies than patients with normal STK1 levels. A notable result was that 85% of the people with elevated STK1 levels were patients with premalignant diseases. However, not all of these premalignant lesions progressed to cancer. 
In summary, this study demonstrated the high diagnostic potential of the STK1 assay and its ability to differentiate patients with (pre)malignancies from healthy individuals or patients with benign lesions. As a result, STK1 could be used in screening programs as a tumor-proliferation biomarker, or in combination with annual health check-ups to assess an early risk score for malignancies (114).
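
The positive likelihood ratio quoted above is linked to sensitivity and specificity by LR+ = sensitivity / (1 - specificity). The Python sketch below applies this standard formula to the rounded values from the text and back-calculates the specificity implied by the reported LR+. It assumes, without confirmation from the source, that the reported LR+ of 233.73 was computed from unrounded values. The function names are illustrative.

```python
def positive_likelihood_ratio(sensitivity: float, specificity: float) -> float:
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


def implied_specificity(sensitivity: float, lr_positive: float) -> float:
    """Specificity that reproduces a given LR+ at a given sensitivity."""
    return 1.0 - sensitivity / lr_positive


# Rounded values as quoted in the text.
reported_sensitivity = 0.798
reported_specificity = 0.997
reported_lr_positive = 233.73

lr_from_rounded = positive_likelihood_ratio(reported_sensitivity, reported_specificity)
spec_implied = implied_specificity(reported_sensitivity, reported_lr_positive)

print(f"LR+ from rounded values: {lr_from_rounded:.0f}")
print(f"Specificity implied by the reported LR+: {spec_implied:.4f}")
```

The rounded inputs give an LR+ of about 266 rather than 233.73, and the implied specificity still rounds to 0.997. The gap is therefore compatible with rounding: at specificities this close to 1, small changes in the third decimal shift the likelihood ratio considerably.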

5. CONCLUSION

The aim of this systematic review was to present an overview of the most promising diagnostic and/or prognostic biomarkers that were developed in the last eight years. After an exhaustive selection process, 98 articles were considered eligible and were therefore included in this study. The results of these studies and the biomarkers that were examined can be consulted in the additional tables in the addenda. The first notable observation in this systematic review is the low number of prospective studies with more than 100 cancer patients that were found or were accessible in the PubMed database. Prospective studies are long and expensive, whereas retrospective case-control and cohort studies can be performed at a lower cost and in a shorter amount of time. For these reasons, only a limited number of good prospective studies concerning cancer biomarkers are carried out, and free open access to these studies is often not granted. On top of that, in the majority of the prospective studies included in this review, the study populations mostly consisted of high-risk patients who were referred for further examination, resulting in an overestimation of the detection rates. Nevertheless, some biomarkers, like the multi-marker stool-based DNA test for the detection of CRC or the biomarker panel of Jirun et al. that combines the diagnostic power of AFP, DCP and HCCR-1 for the detection of HCC, proved to be promising and are expected to be used as clinical tests in the near future. The development of one biomarker that can detect every type of cancer is probably impossible. Combining multiple cancer-specific biomarkers in one panel, however, has proven to be significantly more sensitive, for example in the detection of HCC. Consequently, the future of cancer diagnosis could lie in the development of cancer-specific biomarker panels. The majority of the studies included in this systematic review, however, consisted of retrospective case-control studies. 
These are phase II studies that are low in cost and that often analyze potential biomarkers in stored samples of patients who have already been diagnosed with cancer. Moreover, retrospective studies are easier to perform, and precise information on carefully selected (and not randomly selected) patients and healthy controls is often provided in these studies, since the cancer patients were already thoroughly examined and described in their patient files. Therefore, the difference between cancer patients and healthy controls and, more importantly, the cut-off value for a new biomarker test can be easily defined. Consequently, the results of newly identified biomarkers are frequently exceptional, with sensitivities and specificities of more than 90%, outperforming currently used biomarkers like CEA, AFP and CA-15-3. However, when the majority of these so-called ‘promising biomarkers’ are tested in a less tightly controlled population during subsequent phase III prospective studies, the sensitivities and specificities decrease by approximately 20 to 30%. As a result, it must be concluded that retrospective studies are more subject to selection bias, storage problems and statistical bias. Furthermore, it is apparent that all authors of the retrospective studies want to conduct a prospective study to validate their biomarkers in real-life circumstances. However, universities and institutions often do not have the means (or the time) to finance a prospective study that can take several years before the results are available. On top of that, the biomarker often does not replicate in prospective studies the excellent results seen in the retrospective studies, which makes such a study a risky undertaking. A careful selection of the biomarkers that are to be tested in prospective studies is therefore of the utmost importance. To further complicate the situation, the prevalence of each type of cancer differs across the world. For instance, most of the papers from China in this systematic review focus on lung and esophageal cancer, while the European papers focus more on CRC, prostate cancer and breast cancer. As a result, resources are widely dispersed and there is no consensus (or coherence) on the right strategy to discover and validate new biomarkers. In summary, many researchers try to find a new and better performing biomarker individually, but they often lack the means and coherence to succeed. For all these reasons, I suggest a centralization of means and efforts to validate the most promising biomarkers known to date in prospective studies, without compromising the researchers’ urge to develop new potential biomarkers in retrospective studies. 
If an overarching, well-financed institution could oversee all cancer biomarker developments and concentrate the results of these studies in one database, it could be determined (for each continent individually) which biomarkers are worth investigating further in prospective studies (which would be financed by this institution). Lastly, the development of one biomarker that can detect every type (or one specific type) of cancer is probably impossible. Half of the studies included in this systematic review propose to combine multiple cancer-specific biomarkers in one panel. This strategy has proven to increase the sensitivity and specificity significantly. As a consequence, the future of cancer biomarker development could lie in the combination of new and/or old biomarkers in cancer-specific panels.

REFERENCES

1. La Thangue NB, Kerr DJ. Predictive biomarkers: a paradigm shift towards personalized cancer medicine. Nat Rev Clin Oncol. 2011;8(10):587-96.
2. Torre LA, Bray F, Siegel RL, Ferlay J, Lortet-Tieulent J, Jemal A. Global cancer statistics, 2012. CA Cancer J Clin. 2015;65(2):87-108.
3. World Health Organization. Cancer Fact Sheet 2017. Available from: http://www.who.int/mediacentre/factsheets/fs297/en/.
4. Mordente A, Meucci E, Martorana GE, Silvestrini A. Cancer Biomarkers Discovery and Validation: State of the Art, Problems and Future Perspectives. Adv Exp Med Biol. 2015;867:9-26.
5. GBD 2015 Risk Factors Collaborators. Global, regional, and national comparative risk assessment of 79 behavioural, environmental and occupational, and metabolic risks or clusters of risks, 1990-2015: a systematic analysis for the Global Burden of Disease Study 2015. Lancet. 2016;388(10053):1659-724.
6. Duffy MJ. Use of Biomarkers in Screening for Cancer. Adv Exp Med Biol. 2015;867:27-39.
7. Khunger M, Kumar U, Roy HK, Tiwari AK. Dysplasia and cancer screening in 21st century. APMIS. 2014;122(8):674-82.
8. Velonas VM, Woo HH, dos Remedios CG, Assinder SJ. Current status of biomarkers for prostate cancer. Int J Mol Sci. 2013;14(6):11034-60.
9. Biomarkers Definitions Working Group. Biomarkers and surrogate endpoints: preferred definitions and conceptual framework. Clin Pharmacol Ther. 2001;69(3):89-95.
10. Henry NL, Hayes DF. Cancer biomarkers. Mol Oncol. 2012;6(2):140-6.
11. Cagle PT, Allen TC, Olsen RJ. Lung cancer biomarkers: present status and future developments. Arch Pathol Lab Med. 2013;137(9):1191-8.
12. Pepe MS, Etzioni R, Feng Z, Potter JD, Thompson ML, Thornquist M, et al. Phases of biomarker development for early detection of cancer. J Natl Cancer Inst. 2001;93(14):1054-61.
13. Puntmann VO. How-to guide on biomarkers: biomarker definitions, validation and applications with examples from cardiovascular disease. Postgrad Med J. 2009;85(1008):538-45.
14. Potti A, Dressman HK, Bild A, Riedel RF, Chan G, Sayer R, et al. Genomic signatures to guide the use of chemotherapeutics. Nat Med. 2006;12(11):1294-300.
15. Fowkes FG, Fulton PM. Critical appraisal of published research: introductory guidelines. BMJ. 1991;302(6785):1136-40.
16. Nielsen HJ, Brunner N, Jorgensen LN, Olsen J, Rahr HB, Thygesen K, et al. Plasma TIMP-1 and CEA in detection of primary colorectal cancer: a prospective, population based study of 4509 high-risk individuals. Scand J Gastroenterol. 2011;46(1):60-9.
17. Abdullah M, Rani AA, Simadibrata M, Fauzi A, Syam AF. The value of fecal tumor M2 pyruvate kinase as a diagnostic tool for colorectal cancer screening. Acta Med Indones. 2012;44(2):94-9.
18. Ahlquist DA, Zou H, Domanico M, Mahoney DW, Yab TC, Taylor WR, et al. Next-generation stool DNA test accurately detects colorectal cancer and large adenomas. Gastroenterology. 2012;142(2):248-56; quiz e25-6.
19. Ardalan Khales S, Abbaszadegan MR, Abdollahi A, Raeisossadati R, Tousi MF, Forghanifard MM. SALL4 as a new biomarker for early colorectal cancers. J Cancer Res Clin Oncol. 2015;141(2):229-35.
20. Calistri D, Rengucci C, Casadei Gardini A, Frassineti GL, Scarpi E, Zoli W, et al. Fecal DNA for noninvasive diagnosis of colorectal cancer in immunochemical fecal occult blood test-positive individuals. Cancer Epidemiol Biomarkers Prev. 2010;19(10):2647-54.
21. Bosch LJ, Oort FA, Neerincx M, Khalid-de Bakker CA, Terhaar sive Droste JS, Melotte V, et al. DNA methylation of phosphatase and actin regulator 3 detects colorectal cancer in stool and complements FIT. Cancer Prev Res (Phila). 2012;5(3):464-72.
22. Chan CC, Fan CW, Kuo YB, Chen YH, Chang PY, Chen KT, et al. Multiple serological biomarkers for colorectal cancer detection. Int J Cancer. 2010;126(7):1683-90.
23. Ciarloni L, Hosseinian S, Monnier-Benoit S, Imaizumi N, Dorta G, Ruegg C, et al. Discovery of a 29-gene panel in peripheral blood mononuclear cells for the detection of colorectal cancer and adenomas using high throughput real-time PCR. PLoS One. 2015;10(4):e0123904.
24. Ciarloni L, Ehrensberger SH, Imaizumi N, Monnier-Benoit S, Nichita C, Myung SJ, et al. Development and Clinical Validation of a Blood Test Based on 29-Gene Expression for Early Detection of Colorectal Cancer. Clin Cancer Res. 2016;22(18):4604-11.
25. Koga Y, Yamazaki N, Yamamoto Y, Yamamoto S, Saito N, Kakugawa Y, et al. Fecal miR-106a is a useful marker for colorectal cancer patients with false-negative results in immunochemical fecal occult blood test. Cancer Epidemiol Biomarkers Prev. 2013;22(10):1844-52.
26. Koga Y, Yamazaki N, Takizawa S, Kawauchi J, Nomura O, Yamamoto S, et al. Gene expression analysis using a highly sensitive DNA microarray for colorectal cancer screening. Anticancer Res. 2014;34(1):169-76.
27. Marshall KW, Mohr S, Khettabi FE, Nossova N, Chao S, Bao W, et al. A blood-based biomarker panel for stratifying current risk for colorectal cancer. Int J Cancer. 2010;126(5):1177-86.
28. Meng W, Zhu HH, Xu ZF, Cai SR, Dong Q, Pan QR, et al. Serum M2-pyruvate kinase: A promising non-invasive biomarker for colorectal cancer mass screening. World J Gastrointest Oncol. 2012;4(6):145-51.
29. Terhaar sive Droste JS, Oort FA, van der Hulst RW, van Heukelem HA, Loffeld RJ, van Turenhout ST, et al. Higher fecal immunochemical test cutoff levels: lower positivity rates but still acceptable detection rates for early-stage colorectal cancers. Cancer Epidemiol Biomarkers Prev. 2011;20(2):272-80.
30. Jin P, Kang Q, Wang X, Yang L, Yu Y, Li N, et al. Performance of a second-generation methylated SEPT9 test in detecting colorectal neoplasm. J Gastroenterol Hepatol. 2015;30(5):830-3.
31. Wilhelmsen M, Christensen IJ, Rasmussen L, Jorgensen LN, Madsen MR, Vilandt J, et al. Detection of colorectal neoplasia: Combination of eight blood-based, cancer-associated protein biomarkers. Int J Cancer. 2017;140(6):1436-46.
32. Chen L, Ho DW, Lee NP, Sun S, Lam B, Wong KF, et al. Enhanced detection of early hepatocellular carcinoma by serum SELDI-TOF proteomic signature combined with alpha-fetoprotein marker. Ann Surg Oncol. 2010;17(9):2518-25.
33. Jirun P, Zhang G, Ha SA, Kim HK, Yoo J, Kim S, et al. HCCR-1 for detecting small hepatocellular carcinoma latent in a cirrhotic liver: a prospective cohort study. Gut. 2012;61(10):1514-5.
34. Luo P, Yin P, Hua R, Tan Y, Li Z, Qiu G, et al. A large-scale, multi-center serum metabolite biomarkers identification study for the early detection of hepatocellular carcinoma. Hepatology. 2017.
35. Shi L, Wu LL, Yang JR, Chen XF, Zhang Y, Chen ZQ, et al. Serum peroxiredoxin3 is a useful biomarker for early diagnosis and assessemnt of prognosis of hepatocellular carcinoma in Chinese patients. Asian Pac J Cancer Prev. 2014;15(7):2979-86.
36. Lin XJ, Chong Y, Guo ZW, Xie C, Yang XJ, Zhang Q, et al. A serum microRNA classifier for early detection of hepatocellular carcinoma: a multicentre, retrospective, longitudinal biomarker identification study with a nested case-control study. Lancet Oncol. 2015;16(7):804-15.
37. Kumada T, Toyoda H, Tada T, Kiriyama S, Tanikawa M, Hisanaga Y, et al. High-sensitivity Lens culinaris agglutinin-reactive alpha-fetoprotein assay predicts early detection of hepatocellular carcinoma. J Gastroenterol. 2014;49(3):555-63.
38. Lok AS, Sterling RK, Everhart JE, Wright EC, Hoefs JC, Di Bisceglie AM, et al. Des-gamma-carboxy prothrombin and alpha-fetoprotein as biomarkers for the early detection of hepatocellular carcinoma. Gastroenterology. 2010;138(2):493-502.
39. Miura N, Osaki Y, Nagashima M, Kohno M, Yorozu K, Shomori K, et al. A novel biomarker TERTmRNA is applicable for early detection of hepatoma. BMC Gastroenterol. 2010;10:46.
40. Wang W, Zhao LJ, Wang Y, Tao QY, Feitelson MA, Zhao P, et al. Application of HBx-induced anti-URGs as early warning biomarker of cirrhosis and HCC. Cancer Biomark. 2011;11(1):29-39.
41. Wang M, Devarajan K, Singal AG, Marrero JA, Dai J, Feng Z, et al. The Doylestown Algorithm: A Test to Improve the Performance of AFP in the Detection of Hepatocellular Carcinoma. Cancer Prev Res (Phila). 2016;9(2):172-9.
42. Zekri AN, Youssef AS, El-Desouky ED, Ahmed OS, Lotfy MM, Nassar AA, et al. Serum microRNA panels as potential biomarkers for early detection of hepatocellular carcinoma on top of HCV infection. Tumour Biol. 2016;37(9):12273-86.
43. Li J, Chen P, Li XQ, Bao QL, Dai CH, Ge LP. Elevated levels of survivin and livin mRNA in bronchial aspirates as markers to support the diagnosis of lung cancer. Int J Cancer. 2013;132(5):1098-104.
44. Louis E, Adriaensens P, Guedens W, Bigirumurame T, Baeten K, Vanhove K, et al. Detection of Lung Cancer through Metabolic Changes Measured in Blood Plasma. J Thorac Oncol. 2016;11(4):516-23.
45. Chen X, Hu Z, Wang W, Ba Y, Ma L, Zhang C, et al. Identification of ten serum microRNAs from a genome-wide serum microRNA expression profile as novel noninvasive biomarkers for nonsmall cell lung cancer diagnosis. Int J Cancer. 2012;130(7):1620-8.
46. Doseeva V, Colpitts T, Gao G, Woodcock J, Knezevic V. Performance of a multiplexed dual analyte immunoassay for the early detection of non-small cell lung cancer. J Transl Med. 2015;13:55.
47. Gumireddy K, Li A, Chang DH, Liu Q, Kossenkov AV, Yan J, et al. AKAP4 is a circulating biomarker for non-small cell lung cancer. Oncotarget. 2015;6(19):17637-47.
48. Hulbert A, Jusue-Torres I, Stark A, Chen C, Rodgers K, Lee B, et al. Early Detection of Lung Cancer Using DNA Promoter Hypermethylation in Plasma and Sputum. Clin Cancer Res. 2017;23(8):1998-2005.
49. Ostroff RM, Bigbee WL, Franklin W, Gold L, Mehan M, Miller YE, et al. Unlocking biomarker discovery: large scale application of aptamer proteomic technology for early detection of lung cancer. PLoS One. 2010;5(12):e15003.
50. Pass HI, Levin SM, Harbut MR, Melamed J, Chiriboga L, Donington J, et al. Fibulin-3 as a blood and effusion biomarker for pleural mesothelioma. N Engl J Med. 2012;367(15):1417-27.
51. Sin DD, Tammemagi CM, Lam S, Barnett MJ, Duan X, Tam A, et al. Pro-surfactant protein B as a biomarker for lung cancer prediction. J Clin Oncol. 2013;31(36):4536-43.
52. Varella-Garcia M, Schulte AP, Wolf HJ, Feser WJ, Zeng C, Braudrick S, et al. The detection of chromosomal aneusomy by fluorescence in situ hybridization in sputum predicts lung cancer incidence. Cancer Prev Res (Phila). 2010;3(4):447-53.
53. Yao Y, Fan Y, Wu J, Wan H, Wang J, Lam S, et al. Potential application of non-small cell lung cancer-associated autoantibodies to early cancer diagnosis. Biochem Biophys Res Commun. 2012;423(3):613-9.
54. Longoria TC, Ueland FR, Zhang Z, Chan DW, Smith A, Fung ET, et al. Clinical performance of a multivariate index assay for detecting early-stage ovarian cancer. Am J Obstet Gynecol. 2014;210(1):78 e1-9.
55. Yildirim MA, Seckin KD, Togrul C, Baser E, Karsli MF, Gungor T, et al. Roles of neutrophil/lymphocyte and platelet/lymphocyte ratios in the early diagnosis of malignant ovarian masses. Asian Pac J Cancer Prev. 2014;15(16):6881-5.
56. Tcherkassova J, Abramovich C, Moro R, Chen C, Schmit R, Gerber A, et al. Combination of CA125 and RECAF biomarkers for early detection of ovarian cancer. Tumour Biol. 2011;32(4):831-8.
57. Simmons AR, Clarke CH, Badgwell DB, Lu Z, Sokoll LJ, Lu KH, et al. Validation of a Biomarker Panel and Longitudinal Biomarker Performance for Early Detection of Ovarian Cancer. Int J Gynecol Cancer. 2016;26(6):1070-7.
58. Sedlakova I, Vavrova J, Tosner J, Hanousek L. Lysophosphatidic acid (LPA)-a perspective marker in ovarian cancer. Tumour Biol. 2011;32(2):311-6.
59. Leandersson P, Kalapotharakos G, Henic E, Borgfeldt H, Petzold M, Hoyer-Hansen G, et al. A Biomarker Panel Increases the Diagnostic Performance for Epithelial Ovarian Cancer Type I and II in Young Women. Anticancer Res. 2016;36(3):957-65.
60. Cramer DW, Bast RC, Jr., Berg CD, Diamandis EP, Godwin AK, Hartge P, et al. Ovarian cancer biomarker performance in prostate, lung, colorectal, and ovarian cancer screening trial specimens. Cancer Prev Res (Phila). 2011;4(3):365-74.
61. Xu Y, Zhong R, He J, Ding R, Lin H, Deng Y, et al. Modification of cut-off values for HE4, CA125 and the ROMA algorithm for early-stage epithelial ovarian cancer detection: Results from 1021 cases in South China. Clin Biochem. 2016;49(1-2):32-40.
62. Gislefoss RE, Langseth H, Bolstad N, Nustad K, Morkrid L. HE4 as an Early Detection Biomarker of Epithelial Ovarian Cancer: Investigations in Prediagnostic Specimens From the Janus Serumbank. Int J Gynecol Cancer. 2015;25(9):1608-15.
63. Zhu CS, Pinsky PF, Cramer DW, Ransohoff DF, Hartge P, Pfeiffer RM, et al. A framework for evaluating biomarkers for early detection: validation of biomarker panels for ovarian cancer. Cancer Prev Res (Phila). 2011;4(3):375-83.
64. Cremers RG, Eeles RA, Bancroft EK, Ringelberg-Borsboom J, Vasen HF, Van Asperen CJ, et al. The role of the prostate cancer gene 3 urine test in addition to serum prostate-specific antigen level in prostate cancer screening among breast cancer, early-onset gene mutation carriers. Urol Oncol. 2015;33(5):202 e19-28.
65. Sokoll LJ, Sanda MG, Feng Z, Kagan J, Mizrahi IA, Broyles DL, et al. A prospective, multicenter, National Cancer Institute Early Detection Research Network study of [-2]proPSA: improving prostate cancer detection and correlating with cancer aggressiveness. Cancer Epidemiol Biomarkers Prev. 2010;19(5):1193-200.
66. Gordian E, Ramachandran K, Reis IM, Manoharan M, Soloway MS, Singal R. Serum free circulating DNA is a useful biomarker to distinguish benign versus malignant prostate disease. Cancer Epidemiol Biomarkers Prev. 2010;19(8):1984-91.
67. Mitra AV, Bancroft EK, Barbachano Y, Page EC, Foster CS, Jameson C, et al. Targeted prostate cancer screening in men with mutations in BRCA1 and BRCA2 detects aggressive prostate cancer: preliminary analysis of the results of the IMPACT study. BJU Int. 2011;107(1):28-39.
68. Morgan R, Boxall A, Bhatt A, Bailey M, Hindley R, Langley S, et al. Engrailed-2 (EN2): a tumor specific urinary biomarker for the early diagnosis of prostate cancer. Clin Cancer Res. 2011;17(5):1090-8.
69. Vickers AJ, Cronin AM, Roobol MJ, Savage CJ, Peltola M, Pettersson K, et al. A four-kallikrein panel predicts prostate cancer in men with recent screening: data from the European Randomized Study of Screening for Prostate Cancer, Rotterdam. Clin Cancer Res. 2010;16(12):3232-9.
70. Vickers A, Cronin A, Roobol M, Savage C, Peltola M, Pettersson K, et al. Reducing unnecessary biopsy during prostate cancer screening using a four-kallikrein panel: an independent replication. J Clin Oncol. 2010;28(15):2493-8.
71. Wei JT, Feng Z, Partin AW, Brown E, Thompson I, Sokoll L, et al. Can urinary PCA3 supplement PSA in the early detection of prostate cancer? J Clin Oncol. 2014;32(36):4066-72.
72. Baine MJ, Menning M, Smith LM, Mallya K, Kaur S, Rachagani S, et al. Differential gene expression analysis of peripheral blood mononuclear cells reveals novel test for early detection of pancreatic cancer. Cancer Biomark. 2011;11(1):1-14.
73. Yip-Schneider MT, Wu H, Dumas RP, Hancock BA, Agaram N, Radovich M, et al. Vascular endothelial growth factor, a novel and highly accurate pancreatic fluid biomarker for serous pancreatic cysts. J Am Coll Surg. 2014;218(4):608-17.
74. Capello M, Bantis LE, Scelo G, Zhao Y, Li P, Dhillon DS, et al. Sequential Validation of Blood-Based Protein Biomarker Candidates for Early-Stage Pancreatic Cancer. J Natl Cancer Inst. 2017;109(4).
75. Henriksen SD, Madsen PH, Larsen AC, Johansen MB, Drewes AM, Pedersen IS, et al. Cell-free DNA promoter hypermethylation in plasma as a diagnostic marker for pancreatic adenocarcinoma. Clin Epigenetics. 2016;8:117.
76. Matsubara J, Honda K, Ono M, Tanaka Y, Kobayashi M, Jung G, et al. Reduced plasma level of CXC chemokine ligand 7 in patients with pancreatic cancer. Cancer Epidemiol Biomarkers Prev. 2011;20(1):160-71.
77. Takayama R, Nakagawa H, Sawaki A, Mizuno N, Kawai H, Tajika M, et al. Serum tumor antigen REG4 as a diagnostic biomarker in pancreatic ductal adenocarcinoma. J Gastroenterol. 2010;45(1):52-9.
78. Wang WS, Liu LX, Li GP, Chen Y, Li CY, Jin DY, et al. Combined serum CA19-9 and miR-27a-3p in peripheral blood mononuclear cells to diagnose pancreatic cancer. Cancer Prev Res (Phila). 2013;6(4):331-8.
79. Zhang Y, Qiu L, Wang Y, Qin X, Li Z. High-throughput and high-sensitivity quantitative analysis of serum unsaturated fatty acids by chip-based nanoelectrospray ionization-Fourier transform ion cyclotron resonance mass spectrometry: early stage diagnostic biomarkers of pancreatic cancer. Analyst. 2014;139(7):1697-706.
80. Atahan K, Kupeli H, Gur S, Yigitbasi T, Baskin Y, Yigit S, et al. The value of serum biomarkers (Bc1, Bc2, Bc3) in the diagnosis of early breast cancer. Int J Med Sci. 2011;8(2):148-55.
81. Garczyk S, von Stillfried S, Antonopoulos W, Hartmann A, Schrauder MG, Fasching PA, et al. AGR3 in breast cancer: prognostic impact and suitable serum-based biomarker for early cancer detection. PLoS One. 2015;10(4):e0122106.
82. Gong B, Xue J, Yu J, Li H, Hu H, Yen H, et al. Cell-free DNA in blood is a potential diagnostic biomarker of breast cancer. Oncol Lett. 2012;3(4):897-900.
83. Park BJ, Cha MK, Kim IH. Thioredoxin 1 as a serum marker for breast cancer and its use in combination with CEA or CA15-3 for improving the sensitivity of breast cancer diagnoses. BMC Res Notes. 2014;7:7.
84. Zhang F, Deng Y, Drabier R. Multiple biomarker panels for early detection of breast cancer in peripheral blood. Biomed Res Int. 2013;2013:781618.
85. Zhang H, Li B, Zhao H, Chang J. The expression and clinical significance of serum miR-205 for breast cancer and its role in detection of human cancers. Int J Clin Exp Med. 2015;8(2):3034-43.
86. Jing JX, Wang Y, Xu XQ, Sun T, Tian BG, Du LL, et al. Tumor markers for diagnosis, monitoring of recurrence and prognosis in patients with upper gastrointestinal tract cancer. Asian Pac J Cancer Prev. 2014;15(23):10267-72.
87. Tao HQ, He XJ, Ma YY, Wang HJ, Xia YJ, Ye ZY, et al. Evaluation of REG4 for early diagnosis and prognosis of gastric cancer. Hum Pathol. 2011;42(10):1401-9.
88. Chen S, Zhu J, Yu F, Tian Y, Ma S, Liu X. Combination of miRNA and RNA functions as potential biomarkers for gastric cancer. Tumour Biol. 2015;36(12):9909-18.
89. Lomba-Viana R, Dinis-Ribeiro M, Fonseca F, Vieira AS, Bento MJ, Lomba-Viana H. Serum pepsinogen test for early detection of gastric cancer in a European country. Eur J Gastroenterol Hepatol. 2012;24(1):37-41.
90. Tong W, Ye F, He L, Cui L, Cui M, Hu Y, et al. Serum biomarker panels for diagnosis of gastric cancer. Onco Targets Ther. 2016;9:2455-63.
91. Zhang X, Cui L, Ye G, Zheng T, Song H, Xia T, et al. Gastric juice microRNA-421 is a new biomarker for screening gastric cancer. Tumour Biol. 2012;33(6):2349-55.
92. Morrissey JJ, Mellnick VM, Luo J, Siegel MJ, Figenshau RS, Bhayani S, et al. Evaluation of Urine Aquaporin-1 and Perilipin-2 Concentrations as Biomarkers to Screen for Renal Cell Carcinoma: A Prospective Cohort Study. JAMA Oncol. 2015;1(2):204-12.
93. Fedorko M, Stanik M, Iliev R, Redova-Lojova M, Machackova T, Svoboda M, et al. Combination of MiR-378 and MiR-210 Serum Levels Enables Sensitive Detection of Renal Cell Carcinoma. Int J Mol Sci. 2015;16(10):23382-9.
94. Mustafa A, Gupta S, Hudes GR, Egleston BL, Uzzo RG, Kruger WD. Serum amino acid levels as a biomarker for renal cell carcinoma. J Urol. 2011;186(4):1206-12.
95. Kemik P, Saatli B, Yildirim N, Kemik VD, Deveci B, Terek MC, et al. Diagnostic and prognostic values of preoperative serum levels of YKL-40, HE-4 and DKK-3 in endometrial cancer. Gynecol Oncol. 2016;140(1):64-9.
96. Duvlis S, Popovska-Jankovic K, Arsova ZS, Memeti S, Popeska Z, Plaseska-Karanfilska D. HPV E6/E7 mRNA versus HPV DNA biomarker in cervical cancer screening of a group of Macedonian women. J Med Virol. 2015;87(9):1578-86.
97. Kan YY, Liou YL, Wang HJ, Chen CY, Sung LC, Chang CF, et al. PAX1 methylation as a potential biomarker for cervical cancer screening. Int J Gynecol Cancer. 2014;24(5):928-34.
98. Chung W, Bondaruk J, Jelinek J, Lotan Y, Liang S, Czerniak B, et al. Detection of bladder cancer using novel DNA methylation biomarkers in urine sediments. Cancer Epidemiol Biomarkers Prev. 2011;20(7):1483-91.
99. Eissa S, Badr S, Elhamid SA, Helmy AS, Nour M, Esmat M. The value of combined use of survivin mRNA and matrix metalloproteinase 2 and 9 for bladder cancer detection in voided urine. Dis Markers. 2013;34(1):57-62.
100. Lai Y, Ye J, Chen J, Zhang L, Wasi L, He Z, et al. UPK3A: a promising novel urinary marker for the detection of bladder cancer. Urology. 2010;76(2):514 e6-11.
101. Renard I, Joniau S, van Cleynenbreugel B, Collette C, Naome C, Vlassenbroeck I, et al. Identification and validation of the methylated TWIST1 and NID2 genes through real-time methylation-specific polymerase chain reaction assays for the noninvasive detection of primary bladder cancer in urine samples. Eur Urol. 2010;58(1):96-104.
102. Hutajulu SH, Indrasari SR, Indrawati LP, Harijadi A, Duin S, Haryana SM, et al. Epigenetic markers for early detection of nasopharyngeal carcinoma in a high risk population. Mol Cancer. 2011;10:48.
103. Kokalj Vokac N, Cizmarevic B, Zagorac A, Zagradisnik B, Lanisnik B. An evaluation of SOX2 and hTERC gene amplifications as screening markers in oral and oropharyngeal squamous cell carcinomas. Mol Cytogenet. 2014;7(1):5.
104. Rajkumar K, Ramya R, Nandhini G, Rajashree P, Ramesh Kumar A, Nirmala Anandan S. Salivary and serum level of CYFRA 21-1 in oral precancer and oral squamous cell carcinoma. Oral Dis. 2015;21(1):90-6.
105. Holzinger D, Wichmann G, Baboci L, Michel A, Hofler D, Wiesenfarth M, et al. Sensitivity and specificity of antibodies against HPV16 E6 and other early proteins for the detection of HPV16-driven oropharyngeal squamous cell carcinoma. Int J Cancer. 2017;140(12):2748-57.
106. Guo R, Pan C, Shen J, Liu C. New serum biomarkers for detection of esophageal carcinoma using Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry. J Cancer Res Clin Oncol. 2011;137(3):513-9.
107. Xu YW, Peng YH, Ran LQ, Zhai TT, Guo HP, Qiu SQ, et al. Circulating levels of autoantibodies against L1-cell adhesion molecule as a potential diagnostic biomarker in esophageal squamous cell carcinoma. Clin Transl Oncol. 2017;19(7):898-906.
108. Wachsman W, Morhenn V, Palmer T, Walls L, Hata T, Zalla J, et al. Noninvasive genomic detection of melanoma. Br J Dermatol. 2011;164(4):797-806.
109. Ouyang L, Liu P, Yang S, Ye S, Xu W, Liu X. A three-plasma miRNA signature serves as novel biomarkers for osteosarcoma. Med Oncol. 2013;30(1):340.
110. Herrmann BL, Schmid KW, Goerges R, Kemen M, Mann K. Calcitonin screening and pentagastrin testing: predictive value for the diagnosis of medullary carcinoma in nodular thyroid disease. Eur J Endocrinol. 2010;162(6):1141-5.
111. Papageorgiou SG, Kontos CK, Pappa V, Thomadaki H, Kontsioti F, Dervenoulas J, et al. The novel member of the BCL2 gene family, BCL2L12, is substantially elevated in chronic lymphocytic leukemia patients, supporting its value as a significant biomarker. Oncologist. 2011;16(9):1280-91.
112. Wen YH, Chang PY, Hsu CM, Wang HY, Chiu CT, Lu JJ. Cancer screening through a multi-analyte serum biomarker panel during health check-up examinations: Results from a 12-year experience. Clin Chim Acta. 2015;450:273-6.
113. Wang Y, Jiang X, Dong S, Shen J, Yu H, Zhou J, et al. Serum TK1 is a more reliable marker than CEA and AFP for cancer screening in a study of 56,286 people. Cancer Biomark. 2016;16(4):529-36.
114. Chen ZH, Huang SQ, Wang Y, Yang AZ, Wen J, Xu XH, et al. Serological thymidine kinase 1 is a biomarker for early detection of tumours--a health screening study on 35,365 people, using a sensitive chemiluminescent dot blot assay. Sensors (Basel). 2011;11(12):11064-80.

ADDENDA

ADDITIONAL TABLES:

Table 1: Colorectal cancer
Table 2: Hepatocellular cancer
Table 3: Lung cancer
Table 4: Ovarian cancer
Table 5: Prostate cancer
Table 6: Pancreatic cancer
Table 7: Breast cancer
Table 8: Gastric cancer
Table 9: Renal cancer
Table 10: Gynecologic cancer
Table 11: Bladder cancer
Table 12: Oral cancer
Table 13: Esophageal cancer
Table 14: Skin cancer
Table 15: Osteosarcoma
Table 16: Thyroid cancer
Table 17: Leukemia
Table 18: Various types of cancer

Table 1: Colorectal Cancer

Author + Date Journal Location Study Design Study Population Specimen Method Biomarker

Healthy patients Persons with Symptoms Age Range Males/Females Other characteristics

Prospective population All participants are high- Scandinavian Journal of ARCHITECT Nielsen H. 2010 Denmark based study (on high-risk 0 4509 18-97 2070/2439 risk patients who Serum TIMP-1 Gastroenterology immunoassay patients) underwent colonoscopy

CEA

TIMP-1 + CEA

Prospective population Ethnicity: Javanese 42,6%, The Indonesian Journal Abdullah M. 2012 Indonesia based study (on high-risk 0 266 NK 197/131 Chinese 23,5%, Malay Stool ELISA M2PK of Internal Medicine patients) 17,4%, others 16,5% Vimentin + NDRG4 + Retrospective multi- BMP3 + TFPI2 + mutant Ahlquist D. 2012 Gastroenterology USA center case-control 293 385 39-92 339/339 81% White Stool QuARTS assay KRAS + ACTB + study hemoglobin Journal of Cancer Khales S. 2015 Research and Clinical Iran Prospective cohort study 60 51 Mean age: 59,6 70/41 NK Serum RT-PCR SALL4 Oncology American Association for All 560 participants Calistri D. 2010 Italy Prospective cohort study 216 NK 50-69 NK Stool PCR FL-DNA Cancer Research positive for iFOBT American Association for Retrospective case- Bosch L. 2012 The Netherlands 96 66 NK NK NK Stool RT-PCR PHACTR3 Cancer Research control study PHACTR3 + FIT International Journal of Retrospective case- Chan C. 2010 China 54 94 26-86 52/42 NK Serum ELISA rCCAP Cancer control study rHDAC5 rP53 rNY-CO-16 rNMDA rCCPA + rHDAC5 + rP53 + rNY-CO-16 + rNMDA Multivariate analysis of the 29-gene panel: BLCL3 + IL1-B + PTGS2 + MAP2K3 + PTGES + PPARG + MMP11 + CCR1 + EGR1 + + CACNB4 + Retrospective case- Ciarloni L. 2015 PLOS ONE Switzerland 50 48 55-68 56%/44% NK Serum RT-PCR CES1 + IL8 + S100A8 + control Study CXCL11 + ITGA2 + NME1 + JUN + TNFSF13B + CXCR3 + MAPK6 + CD63 + ITGB5 + GATA 2 + LTF + MMP9 + CXCL10 + MSL1 + RHOC + FXYD5 MGMC algorithm (multigene multiclassifier) on 29- gene panel: BLCL3 + IL1- B + PTGS2 + MAP2K3 + PTGES + PPARG + Controls: 46%/54% MMP11 + CCR1 + EGR1 + Prospective multi-center 38 (Psoriasis, Crohn, Ciarloni L.
2016 Clinical Cancer Research Switzerland 149 NK NK Adenoma: 64%/36% Serum RT-PCR + CACNB4 + CES1 + IL8 + cohort study colitis ulcerosa,…) CRC: 62%/38% S100A8 + CXCL11 + ITGA2 + NME1 + JUN + TNFSF13B + CXCR3 + MAPK6 + CD63 + ITGB5 + GATA 2 + LTF + MMP9 + CXCL10 + MSL1 + RHOC + FXYD5 MGMC algorithm (multigene multiclassifier) on 29- gene panel + protein tumor markers: CEA and CYFRA 21-2 Retrospective multi- American Association for Controls: 61,7%/38,3% Healthy volunteers as Koga Y. 2013 Japan center case-control 107 117 30-84 Stool RT-PCR miR-106A Cancer Research CRC: 59%/41% controls study Immunoassay iFOBt miR-106A + iFOBT DNA Chip 3D-gene: 6- Retrospective cohort gene panel: CEPB + Koga Y. 2014 Anticancer Research Japan 61 53 40-78 71/43 NK Stool cDNA Microarray study FCGRA3A + PFKFB3 + SOD2 + RGS2 + IL-8

iFOBT

7 gene-panel: ANXA3 + International Journal of Retrospective multi- CLEC4D + IL2RB + LMNB1 Marshall K. 2010 Canada 328 314 NK 399/243 NK Serum RT-PCR Cancer center cohort study + PRRG4 + TNFAIP6 + VNN1 World Journal of IBD: 7 Nonadenomatous Meng W. 2012 Gastrointestinal China Screening study 158 325 40-74 228/255 Serum ELISA M2-PK polyps: 47 Oncology

CEA

Terhaar sive Droste J. Cancer Epidemiology, All patients were iFOBT Automated OCE-sensor The Netherlands Screening study 1036 1109 18-86 43,1%/56,9% Stool iFOBT 2011 Biomarkers & Prevention positive Test

Journal of Retrospective case- Jin P. 2014 Gastroenterology and China 91 135 20-84 260/216 Hyperplastic polyps: 81 Plasma RT-PCR SEPT9 control study Hepatology

iFOBT International Journal of Wilhelmsen M. 2016 Denmark Prospective cohort study 1164 4698 NK NK Comorbidity: 814 Serum Immunoassay AFP Cancer CA19-9 CEA Cyfra21-1 Ferritin Galectin-3 hs-CRP TIMP-1

CEA + CyFra21-1 + Ferritin +hs-CRP + Gender + Age pr 10 years Mechanism Number of Patients included P-value Sensitivity(%) 95% CI Specificity (%) 95% CI AUC

Total number of Number of patients with Adenoma Controls Non-Neoplastic findings Other cancers participants colorectal cancer

Plasma tissue inhibitor 0,7 (univariate 4509 294 843 2173 1176 10 P<0,0001 NK NK NK NK of metalloproteinases-1 analysis)

Glycoprotein involved in 0,73 (univariate NK NK NK NK cell-adhesion analysis) 0,82 (multivariable Biomarker panel NK NK NK NK analysis) 19 IBD patients, 3 Isoenzyme expression in 328 42 67 83 amoebic colitis, 114 NK NK 71,4 NK 71 NK NK proliferating cells infective colitis

Training Set: 83-93 Test Training Set: 85-94 Training Set: 0,94 Test Multimarker methylated Training Set: 89 Test Set: Training Set: 197 Test Training Set: 89 Test Set: Training Set: 90% Test 678 Training Set: 170 Test Set: 82 NK NK NK Set: 68-86 Combined: Test Set: 77-92 set: 0,88 Combined: DNA stool test 44 Set: 96 78 Combined: 85 Set: 85 Combined: 89 80-89 Combined: 85-92 0,90

mRNA overexpression 111 51 0 60 NK NK P=0,0001 96,1 NK 95 NK 0,981

DNA integrity 560 26 (adenocarcinoma) 318 216 NK NK P<0,0001 NK NK NK NK NK Training Set: 3 Test Set: Training Set: 58 Test set: Training Set: 55 Test Set: Training Set: 33-75 Test Training Set: 95 Test Set: Training Set: 87-98 Training Set: 0,77 Test DNA Methylation 193 Training Set: 22 Test Set: 44 NK NK P<0,05 19 30 66 Set: 50-79 100 Test Set: 86-100 Set: 0,83 Biomarker panel 92 20 24 48 NK NK P<0,05 95 76-95 94 83-98 0,97 Tumor Antibodies 148 94 7 Duke's stage A 54 NK NK P<0,05 35,1 NK 96,3 NK NK Tumor Antibodies 20,2 NK 96,3 NK NK Tumor Antibodies 24,5 NK 98,1 NK NK Tumor Antibodies 18,1 NK 100 NK NK Tumor Antibodies 20,2 NK 96,3 NK NK Antibody panel 58,5 NK 92,6 NK NK

CRC: 0,88 Adenoma: Gene overexpression 144 48 46 50 NK NK P<0,05 CRC: 75 adenoma: 59 NK 91 NK 0,85

Test Set 1+2+3: Test Set 1+2+3: 2 mucinous cancer, 1 Gene overexpression 594 97 103 149 245 (Benign lesions) P<0,05 Adenoma: 55,4 CRC: Adenoma: 43-68 CRC: Test Set 1+2+3: 90 Test Set 1+2+3: 82-95 NK squamous carcinoma 79,5 68-88 Test Set 1+2+3: Test Set 1+2+3: P<0,05 Adenoma: 52,3 CRC: Adenoma: 40-65 CRC: Test Set 1+2+3: 92,2 Test Set 1+2+3: 85-97 NK 78,1 67-87

Tumour-associated RNA- 224 117 0 207 0 0 P=0,0001 34,2 25,6-43,6 97,2 92-99,4 NK expression DNA integrity 60,7 51,2-69,6 98,1 93,4-99,8 NK Biomarker panel 70,9 61,8-79 96,3 90,7-99,0 NK NK (Accuracy Training Training Set: 41 Validation Training Set: 54 Training Set: 85,4 Training Set: 85,2 DNA hypermethylation 114 0 0 0 P<0,05 NK NK Set: 85,3% Validation Set: 12 Validation Set: 7 Validation Set: 83,3 Validation Set: 85,7 Set: 84,2%) NK (Accuracy Training Training Set: 53,7 Training Set: 98,1 DNA integrity NK NK Set: 78,9% Validation Validation set: 66,7 Validation Set: 100 Set: 78,9%)

Training Set: 112 Validation Training Set: 120 Training Set: 6 Validation Training Set: 82 Training Set: 64 Training Set: 0,8 DNA hypermethylation 642 0 0 P<0,05 NK NK Set: 202 Validation Set: 208 Set: 4 Validation Set: 72 Validation Set: 70 Validation Set: 0,8

CRC: 100 Advanced CRC stage I & II: 0,89 Isoenzyme expression in 137 (advanced 438 93 158 47 0 P<0,05 Adenoma: 95,12 100-100 40,51 32,85-48,16 Advanced adenoma: proliferating cells adenoma: 41) Adenoma: 82,48 0,81 Adenoma: 0,69

CRC stage I & II: 0,7 Carcinoembryonic NK NK NK NK Advanced adenoma: antigen 0,63 Adenoma: 0,54 (Cut-off: >200ng/ml) CRC: 70,6-89 Early- (Cut-off: >200ng/ml) CRC CRC & Early-stage: CRC: 0,93 Early-stage: 236 (advanced CRC: 81 Early-stage: 78,9 stage: 62,7-90,5 & Early-stage: 92,8 DNA integrity 2145 79 1830 0 0 P<0,001 91,6-93,9 Advanced 0,89 Advanced adenoma) Advanced adenoma: Advanced adenoma: Advanced Adenoma: adenoma: 94,8-96,7 adenoma: 0,69 30,5 24,7-36,8 95,8 CRC: 83,5-90,6 CRC: 67-81,6 CRC+ CRC: 74,8 CRC+Advanced CRC:87,4 CRC+Advanced CRC+Advanced Advanced adenomas: 50- DNA methylation 476 135 169 91 0 0 P<0,05 adenomas: 56,6 adenomas: 92,2 CRC+ adenomas: 88,4-95,1 NK 63,1 CRC+Adenomas: CRC+Adenomas: 44,7 Adenomas: 95,3 CRC+ Adenomas: 91,4- 39,2-50,4 97,8 DNA integrity CRC: 58 46,1-69,2 CRC: 82,4 74,4-88,7 NK Colon Cancer: 319 Rectal Adenoma Colon: 515 Alpha-fetoprotein 4698 Clean colon: 1978 Comorbidity: 814 177 P=0,01 NK NK NK NK 0,52 Cancer: 193 Adenoma Rectum: 174 Cancer Antigen 19-9 P<0,0001 NK NK NK NK 0,58 Carcinoembryonic P<0,0001 NK NK NK NK 0,65 antigen Cytokeratin-19 fragment P<0,0001 NK NK NK NK 0,65 P<0,0001 NK NK NK NK 0,56 P<0,0001 NK NK NK NK 0,56 High sensitive CRP P<0,0001 NK NK NK NK 0,65 Tissue inhibitor of P<0,0001 NK NK NK NK 0,63 metalloproteinases

0,76 (CRC + high-risk Biomarker panel <0,0001 60 NK 75 NK adenomas) Quality Assessment 95% CI DOR (%) 95% CI PPV (%) NPV (%) Blinded Reference (Fowkes checklist)

2,9 (univariate analysis) 2,4-3,5 (univariate analysis) NK 1,2 (multivariable 0,9-1,61 (multivariable NK NK NA 0 [16] analysis) analysis) 2,4 (univariate analysis) 1,9-3,0 (univariate analysis) NK 2,4 (multivariable 2,0-3,0 (multivariable NK NK analysis) analysis) NK NK NK NK NK

NK NK NK 73,5 94,4 NA ++ [17]

NK NK NK NK NK Yes 0 [18]

NK NK NK NK NK No 0 [19]

NK NK NK NK NK NA ++ [20] Training Set: 0,64-0,90 NK NK NK NK NA 0 [21] Test Set: 0,75-0,91 0,93-1,0 NK NK NK NK NK NK NK NK NK NA + [22] NK NK NK NK NK NK NK NK NK NK NK NK NK NK NK NK NK NK NK NK NK NK NK NK NK

CRC: 0,83-0,92 NK NK NK NK NA 0 [23] Adenoma: 0,78-0,91

NK NK NK NK NK Yes 0 [24] NK NK NK NK NK

NK NK NK NK NK No 0 [25]

NK NK NK NK NK NK NK NK NK NK

NK NK NK NK NK NA + [26]

NK NK NK NK NK NA

Training Set: 0,74-0,85 Training Set: 68 Training Set: 79 NK NK Yes 0 [27] Test Set: 0,76-0,84 Validation Set: 70 Validation Set: 72

CRC stage I & II: 0,84- CRC: 49,73 Advanced CRC: 100 Advanced 0,94 Advanced NK NK adenoma: 29,32 adenoma: 96,97 NA 0 [28] adenoma: 0,74-0,86 Adenoma: 54,59 Adenoma: 72,73 Adenoma: 0,64-0,76 CRC stage I & II: 0,62- 0,79 Advanced NK NK NK NK adenoma: 0,53-0,73 Adenoma: 0,47-0,60 CRC: 0,89-0,96 Early- stage: 0,82-0,95 NK NK NK NK NA 0 [29] Advanced adenoma: 0,65-0,73

NK NK NK NK NK NA 0 [30]

NK NK NK NK NK NK 1,12 1,05-1,2 NK NK No 0 [31] NK 1,35 1,31-1,38 NK NK NK 1,72 1,62-1,83 NK NK NK 1,79 1,72-1,87 NK NK NK 0,87 0,85-0,9 NK NK NK 1,56 1,41-1,72 NK NK NK 1,3 1,26-1,42 NK NK NK 2,52 2,02-3,1 NK NK

NK NK NK 37 88

Table 2: Hepatocellular cancer

Author + Date Journal Location Study Design Study Population Specimen Method

Healthy patients Persons with Symptoms Age Range Males/Females Other characteristics Annals of Surgical Retrospective cohort Immunoassay (SELDI- Chen L. 2010 China 0 240 NK 184/56 120 cirrhotic patients Serum Oncology study TOF-MS)

Prospective cohort 402 patients with Jirun P. 2012 Gut China 418 1622 NK NK Serum Immunoassay study chronic hepatitis

Prospective nested Luo P. 2017 Hepatology China multi-center case- 290 1085 NK 1076/371 Cirrhosis: 516 Serum LC-MS control study

Asian Pacific Journal of Retrospective cohort Shi L. 2014 China 103 194 NK 195/102 NK Serum ELISA Cancer Prevention Study

Nested case-control Lin X. 2015 Lancet Oncology China 294 616 35-57 745/165 NK Serum RT-PCR study The Japanese Society of Retrospective case- Kumuda T. 2013 Japan 0 208 14-84 116/92 104 HepB or HepC Serum Microassay Gastroenterology control study

Nested case-control HCC: 82%/18% Controls: Lok A. 2010 Gastroenterology USA 0 116 NK NK Serum Immunoassay study 75%/25%

167 HepC and 97 HepB Multi-center case- Miura N. 2010 BMC Gastroenterology Japan 201 437 22-101 NK patients, 24 with HepC Serum RT-PCR control study and HepB markers

Wang W. 2012 Cancer Biomarkers China Cross-sectional study 108 416 28-68 Patients: 278/138 NK Serum ELISA

Cancer Prevention Retrospective case- Wang M. 2015 USA NK Discovery cohort: 360 NK NK NK AFP algorithm Algorithm Research control Study

HALT-C cohort: 151 NK NK NK AFP algorithm Algorithm

EDRN cohort: 870 NK NK NK AFP algorithm Algorithm

Jefferson university NK NK NK AFP algorithm Algorithm cohort: 699 University of Texas southwestern cohort: NK NK NK AFP algorithm Algorithm 1229 Retrospective case- Nabawy Zekri A-R. 2016 Tumor Biology Egypt 95 192 NK 370/108 NK Serum RT-PCR Control Biomarker Mechanism Number of Patients included P-value Sensitivity(%) Specificity (%) Total number of Number of patients with Cirrhosis Controls Non-Neoplastic findings Other cancers participants HCC CART-algorithm 5- Training Set: 60 Training Set: 60 Training Set: 98,33 Training Set: 95 Protein overexpression 240 0 NK NK P<0,001 protein panel Validation Set: 60 Validation Set: 60 Validation Set: 83 Validation Set: 92 AFP Alpha-fetoprotein P<0,05 Validation Set: 72 Validation Set: 78 Cart-algorithm 5-protein Biomarker panel P<0,05 Validation Set: 95 Validation Set: 98 panel + AFP AFP Alpha-fetoprotein 2040 612 608 418 402 chronic hepatitis NK P<0,05 65,4 80,2 Des-gamma-carboxy DCP P<0,05 44,3 86,7 prothrombin Human cervical cancer HCCR-1 P<0,05 41,6 87,4 proto-oncogene 1 AFP + DCP + HCCR-1 Biomarker panel P<0,05 75,4 79,3

HCC vs non-HCC HCC vs non-HCC Discovery set: 88,9 Test Discovery set: 88,9 Test Discovery Set: 36 Test Discovery Set: 41 Test Discovery Set: 31 Test Validation Set: 42, High- Gastric Cancer: 73, Set: 88,6 Validation Set: Set: 85,7 Validation Set: 2-Metabolites panel: Overexpressed 1448 set: 325 Validation Set: Set: 126 Validation Set: Set: 160 Validation Set: risk patients: 150 Intrahepatic P<0,05 91,6 HCC vs Cirrhosis 72,2 HCC vs Cirrhosis Phe-Trp + GCA metabolites 197 143 99 Chronic Hepatitis B cholangiocarcinoma: 25 Discovery Set: 88,9 Test Discovery Set: 82,9 Test Set: 86 Validation Set: Set: 78,4 Validation Set: 92,1 52,8

HCC vs Cirrhosis HCC vs Cirrhosis Discovery set: 61,8 Test Discovery set: 56,1 Test AFP Alpha-fetoprotein Set: 56,4 Validation Set: Set: 78,4 Validation Set: 50,4 73,2

HCC vs Cirrhosis HCC vs Cirrhosis 2-Metabolites panel + Discovery set: 100 Test Discovery set: 100 Test Biomarker panel AFP Set: 81,4 Validation Set: Set: 87,4 Validation Set: 77,9 76,4 Diabetes: 12, Overexpressed genes: HCC vs non-HCC: 85,9 HCC vs non-HCC: 75,3 PRDX3 297 96 98 103 Hypertension: 17, Heart 0 P<0,05 Peroxiredoxin-3 HCC vs Cirrhosis: 73,2 HCC vs Cirrhosis: 69 disease: 9 AFP Alpha-fetoprotein P<0,05 NK NK

7 mi-RNA-panel: miR- HCC vs non-HCC HCC vs non-HCC Inactive HBsAg Carriers: 29a, miR-29c, miR-133a, Training set: 80,6 Training set: 84,6 mi-RNA overexpression 910 343 (HepB induced) 118 159 42, Chronic hepatitis B: 0 P<0,05 miR-143, miR-145, miR- Validation Set 1: 74,5 Validation Set 1: 88,9 254 192, miR-505 Validation Set 2: 85,7 Validation Set 2: 91,1

HCC vs non-HCC HCC vs non-HCC Training set: 69,4 Training set: 93,3 AFP20 AFP cut-off 20 ng/mL P<0,05 Validation Set 1: 56,9 Validation Set 1: 84,9 Validation Set 2: 59,2 Validation Set 2: 100

HCC vs non-HCC HCC vs non-HCC Training set: 38,9 Training set: 98,7 AFP400 AFP cut-off 400 ng/mL P<0,05 Validation Set 1: 31,4 Validation Set 1: 100 Validation Set 2: 40.8 Validation Set 2: 100

HCC vs non-HCC HCC vs non-HCC Training set: 85,2 Training set: 87,2 7 mi-RNA-panel + AFP20 Biomarker panel P<0,05 Validation Set 1: 75,8 Validation Set 1: 92,5 Validation Set 2: 79,6 Validation Set 2: 96,7 Lens Culinaris agglutinin- 104 patients HepB or (Cut-off value: 7%) 39,4 77 (36 months before AFP-L3 reactive fraction of 208 104 0 104 0 P<0,0003 HepC (time of diagnosis) diagnosis) alpha-fetoprotein AFP Alpha-fetoprotein P<0,03 41,4 90,4 Des-gamma-carboxy DCP P<0,001 34,6 94 prothrombin AFP-L3 + AFP + DCP Biomarker panel 60,6 76 Des-gamma-carboxy DCP 116 39 57% of controls 77 NK NK P<0,0001 74 86 prothrombin AFP Alpha-fetoprotein P<0,0001 61 81 DCP or AFP P<0,05 91 74 DCP and AFP P<0,05 43 93 Human telomerase hTERTmRNA reverse transcriptase 638 303 45 201 89 chronic hepatitis NK P<0,001 90,2 85,4 mRNA overexpression AFP Alpha-fetoprotein P<0,008 76,6 66,2 Lens Culinaris agglutinin- AFP-L3 reactive fraction of P<0,003 60,5 88,7 alpha-fetoprotein Des-gamma-carboxy DCP P<0,029 83,4 80,3 prothrombin 13 HepC, 15 drug 18 cholangiocarcinoma, Anti-URG's: URG4, induced hepatitis, 95 10 lung cancer, 30 CRC HCC vs Cirrhosis: 58,33 URG7, URG11, S15A, Upregulated genes 614 78 97 209 P<0,05 HCC vs Cirrhosis: 80 chronic hepatitis, 146 patients, 15 gastric URG's or more Sui1 asymptomatic cariers cancer AFP: 55 Doylestown AFP Alpha-fetoprotein 360 165 295 NK NK NK P<0,001 95 algorithm: 75

AFP: 43 Doylestown AFP Alpha-fetoprotein 151 49 NK 102 HepC 102 HepC NK P<0,001 95 algorithm: 55

AFP: 42 Doylestown AFP Alpha-fetoprotein 870 432 438 NK NK NK P<0,001 95 algorithm: 53

AFP: 38 Doylestown AFP Alpha-fetoprotein 699 113 NK 586 (HepB) NK NK P<0,001 95 algorithm: 58

AFP: 61 Doylestown AFP Alpha-fetoprotein 1229 425 804 NK NK NK P=0,9328 95 algorithm: 63 mi-RNA Panel: miR-122 96 chronic HepC mi-RNA overexpression 384 192 96 95 NK P<0,05 NK NK + miR-885-5p + mi-R29b patienst AFP + mi-RNA Panel: miR-122 + miR-885-5p + P<0,05 NK NK mi-R29b mi-RNA Panel: miR-885- 5p + miR-221 + miR- P<0,05 NK NK 181b + miR-122 AFP + mi-RNA Panel: miR-885-5p + miR-221 + P<0,05 NK NK miR-181b + miR-122 mi-RNA Panel: miR-22 + P<0,05 NK NK miR-199a-3p AFP + mi-RNA Panel: P<0,05 NK NK miR-22 + miR-199a-3p Quality Assessment AUC 95% CI DOR PPV (%) NPV (%) Blinded Reference (Fowkes checklist)

NK NK 92,72 NK NK Yes 0 [32] NK NK 9,11 NK NK NK NK (Estimated) 931 NK NK

0,791 NK NK 58,6 94,4 NA + [33]

0,678 NK NK 58,7 78,4

0,643 NK NK 58,6 77,7 0,891 NK NK 60,8 88,3 HCC vs non-HCC HCC vs non-HCC Discovery set: 0,914- Discovery set: 0,951 0,989 Test Set: 0,917- Test Set: 0,936 0,955 Validation Set: Validation Set: 0,875 0,846-0,905 HCC vs NK NK NK NA 0 [34] HCC vs Cirrhosis Cirrhosis Discovery Set: Discovery Set: 0,930 0,871-0,988 Test Set: Test Set: 0,892 0,856-0,929 Validation Validation Set: 0,807 Set: 0,753-0,861 HCC vs Cirrhosis HCC vs Cirrhosis Discovery set: 0,528- Discovery set: 0,657 0,786 Test Set: 0,671- NK NK NK Test Set: 0,725 0,779 Validation Set: Validation Set: 0,650 0,582-0,718 HCC vs Cirrhosis HCC vs Cirrhosis Discovery set: 1-1 Test Discovery set: 1 Test Set: 0,872-0,938 NK NK NK Set: 0,905 Validation Validation Set: 0,774- Set: 0,826 0,877 HCC vs non-HCC:0,809- HCC vs non-HCC: 0,865 0,953 HCC vs Cirrhosis: NK NK NK NA 0 [35] HCC vs Cirrhosis: 0,717 0,641-0,734 HCC vs non-HCC: 0,509- HCC vs non-HCC: 0,67 NK NK NK 0,753 HCC vs non-HCC HCC vs non-HCC Training set: 0,771-0,880 Training set: 0,826 Validation Set 1: 0,769- NK NK NK NA 0 [36] Validation Set 1: 0,817 0,865 Validation Set 2: Validation Set 2: 0,884 0,818-951 HCC vs non-HCC HCC vs non-HCC Training set: 0,756-0,872 Training set: 0,814 Validation Set 1: 0,653- NK NK NK Validation Set 1: 0,709 0,765 Validation Set 2: Validation Set 2: 0,796 0,706-0,886 HCC vs non-HCC HCC vs non-HCC Training set: 0,619-0,757 Training set: 0,688 Validation Set 1: 0,597- NK NK NK Validation Set 1: 0,657 0,716 Validation Set 2: Validation Set 2: 0,704 0,605-0,804 HCC vs non-HCC HCC vs non-HCC Training set: 0,813-0,905 Training set: 0,862 Validation Set 1: 0,796- NK NK NK Validation Set 1: 0,841 0,887 Validation Set 2: Validation Set 2: 0,843 0,810-0,952 NK NK NK NK NK NA ++ [37]

NK NK NK NK NK NK NK NK NK NK NK NK NK NK NK 0,82 0,68-0,95 NK NK NK NA ++ [38] 0,79 0,68-0,90 NK NK NK 0,92 0,84-0,998 NK NK NK NK NK NK NK NK

NK NK 19 83 85,9 NA 0 [39]

NK NK 11,1 74,6 67,7

NK NK 2,2 59,6 92,2

NK NK 7,6 78,4 73,5

0,721 NK NK NK NK NA ++ [40]

AFP: 0,8398 AFP: 0,7870-0,8926 Doylestown algorithm: Doylestown algorithm: NK NK NK NA + [41] 0,9388 0,9103-0,9674 AFP: 0,8153 AFP: 0,7430-0,8875 Doylestown algorithm: Doylestown algorithm: NK NK NK 0,8533 0,7912-0,9153 AFP: 0,8109 Doylestown NK NK NK NK algorithm: 0,8409

AFP: 0,8257 Doylestown AFP: 0,7877-0,8637 NK NK NK algorithm: 0,8920 Doylestown algorithm:

AFP: 0,877 Doylestown NK NK NK NK algorithm: 0,876

HCC vs Controls: 0,898 0,824-0,971 NK NK NK NA + [42]

HCC vs Controls: 1 1 NK NK NK

HCC vs Cirrhosis: 0,845 0,757-0,934 NK NK NK

HCC vs Cirrhosis: 0,982 0,957-1 NK NK NK

HCC vs Chronic Hepatitis 0,532-0,804 NK NK NK C: 0,668 HCC vs Chronic Hepatitis 0,971-1 NK NK NK C: 0,988

Table 3: Lung cancer

Author + Date Journal Location Study Design Study Population Specimen Method

Healthy patients Persons with Symptoms Age Range Males/Females Other characteristics

International Journal of Prospective cohort Li J. 2013 China 0 96 NK 71/25 60 smoker Bronchial aspirate RT-PCR Cancer study

Journal of Thoracic H-NMR metabolomics + Louis E. 2016 Belgium Case-control study 224 357 36-89 389/257 223 smoker Plasma Oncology OPLS-DA statistics

International Journal of Chen X. 2011 China Case-control study 220 400 NK 444/176 313 smoker Serum qRT-PCR Cancer

Journal of Translational Retrospective case- 190 smoking history- Doseeva V. 2015 USA 190 190 50-97 206/174 Serum PAULA's Test Medicine control study matched controls

237 smoking history Retrospective cohort Gummireddy K. 2015 Oncotarget USA 135 264 26-89 179/220 cancers, 74 smoking Serum RT-PCR Study history controls

114 current or former American Association Retrospective case- Hulbert A. 2016 USA 60 150 55-75 96/114 smoker cancers, 41 in Plasma and Sputum RT-PCR and Methylation for Cancer Research Control Study controls

139 current or former Multi-center case- smoker cancers, 731 Ostroff R. 2010 PLOS ONE USA 470 291 NK 47,5%/52,5% Serum ELISA control study current or former smoker controls

The New England Multi-center case- 308 current or former Pass H. 2012 UK 43 185 NK 422/93 Serum ELISA Journal of Medicine control Study smoker

Pleural effusion

146 emphysema, 623 Journal of Clinical Prospective cohort pneumonia, 1439 Sin D. 2013 Canada 2237 113 50-… 1292/1058 Plasma ELISA Oncology study current smoker, 911 former smoker 97 current or former Cancer Prevention Prospective cohort smoker cancers, 94 Varella-Garcia M. 2010 USA 96 100 NK 146/50 Sputum FISH Research study current or former smoker controls Biochemical and Retrospective case- Yibing Y. 2012 Biophysical Research USA 36 40 NK NK NK Serum ELISA control study Communications Biomarker Mechanism Number of Patients included P-value Sensitivity(%) 95% CI Total number of Number of patients with Benign Lung disease Controls Non-Neoplastic findings Other cancers participants Lung Cancer 31 adenocarcinoma, 25 8 interstitial lung squamous cell disease, 7 pulmonary carcinoma, 6 tuberculosis, 4 Survivin Inhibition of apoptosis 96 bronchoalveolar cell pneumonia, 4 COPD, 3 0 NK NK P<0001 83 NK carcinoma, 4 large cell hamartoma and carcinoma, 4 small cell inflammatory carcinoma pseudotumor Livin Inhibition of apoptosis P<0001 63 NK Overexpressed Training study: LC vs C Metabolic phenotype 704 357 202 COPD 347 95 Diabetes NK P<0,05 NK metabolites 78 Stage I LC vs C: 74 Validation study: 71 NK 10 miRNA-panel: miR- 20a + miR-24 + miR-25 + miR-145 + miR-152 + Overexpressed miRNA 620 NSCLC: 400 0 220 0 NK P<0,05 Training study: 93 NK miR-199A-5p + miR-221 + miR-222 + miR-223 + miR-320 Validation study: 92,5 NK 190 Lung cancer: 41 adenocarcinoma, 7 Bronchioalveolar carcinoma, 45 NY-ESO-1 Cancer auto-antibody 380 squamous cell 0 190 0 3 P<0,05 NK NK carcinoma, 3 large cell carcinoma, 16 neuroendocrine tumours Carcinoembryonic CEA NK NK antigen CA125 Cancer Antigen NK NK

CYFRA21-1 Cytokine Fragment NK NK 4-biomarker panel: NY- Training + validation ESO-1 + CEA + CA125 + Biomarker panel NK study: 72 CYFRA21-1 264 NSCLC: 136 Stage 1, AKAP4 Cancer Testis Antigen 399 42 Stage 2, 74 Stage 3, 0 135 0 0 P<0,05 92,8 NK 12 Stage 4 GAGE4 Cancer Testis Antigen NK NK Methylated gene 150 NSCLC: 136 Stage 1, SOX17 210 0 60 0 0 P<0,001 Sputum: 84, Plasma: 73 NK fragment 14 Stage Methylated gene TAC1 Sputum: 86, Plasma: 76 NK fragment Methylated gene HOXA7 Sputum: 63, Plasma: 34 NK fragment Methylated gene CDO1 Sputum: 78, Plasma: 65 NK fragment Methylated gene HOXA9 Sputum: 93, Plasma: 86 NK fragment Methylated gene ZFP42 Sputum: 87, Plasma: 84 NK fragment TAC1 + HOXA7 + SOX17 Biomarker panel Sputum: 98, Plasma: 93 NK

12-biomarker classifiers: Cadherin-1, CD30 ligand, Endostatin, Biomarker panel; Training Stage: 90 Training Stage: 84-96 HSP90-alfa, LRIG3, MIP- 1326 291 565 1035 0 0 P<0,0001 Bayesian Classifiers Verification Stage: 89 Verification Stage: 81-96 4, Pleiotrophin, PRKCI, RGM-C, SCF sR, sL- Selectin, YES 20 Ovarian cancer 20 Mesotheline related Detroit cohort: 100 Detroit cohort: 90,5-100 43 benign prostatic Breast cancer 20 Fibulin-3 protein in pleural 531 190 Mesothelioma 93 189 (asbest exposed) P<0,0001 New York cohort: 94.6 New York cohort: 84.9- hypertrophy glioblastoma 31 effusion Validation cohort: 33 98,9 prostate cancer Mesotheline related Fibulin-3 protein in pleural NK NK effusion

NSCLC associated Pro-surfactant protein B 2485 113 769 2237 NK 0 P<0.001 80,4 NK Protein

EGFR + 5p15 + MYC + Biomarker Panel 196 100 0 96 NK NK P<0,05 56 NK CEP-6

NOLC1 + MALAT1 + RNA marker panel 76 40 NSCLC 0 36 NK NK P<0,05 45 NK HMMR + SMOX Quality Assessment Specificity (%) 95% CI AUC 95% CI DOR 95% CI PPV (%) NPV (%) Blinded Reference (Fowkes checklist)

96 NK 0,826 0,718-0,924 NK NK 98 68 NA + [43]

92 NK 0,676 0,560-0,792 NK NK 96 48 Training Study: LC vs C Training study: LC vs C: Training study: LC vs C Training study: LC vs C NK NK NK NK NA 0 [44] 92 Stage I LC vs C: 78 0,88 Stage I LC vs C: 0,79 91 Stage I LC vs C: 75 80 Stage I LC vs C: 77 Validation study: 81 NK Validation study: 0,84 NK NK NK Validation study: 80 Validation study:72

Training study: 90 NK 0,966 NK NK NK NK NK NA 0 [45]

Validation study: 90 NK 0,972 NK NK NK NK NK

Training study:0,60 Training study: 0,52- NK NK NK NK NK NK NA + [46] MoM: 0,60 0,67 MoM: 0,52-0,67

Training study: 0,79 Training study: 0,74- NK NK NK NK NK NK MoM: 0,79 0,85 MoM: 0,74-0,85 Training study: 0,70 Training study:0,63-0,77 NK NK NK NK NK NK MoM: 0,70 MoM: 0,63-0,77 Training study: 0,69 Training study: 0,62- NK NK NK NK NK NK MoM: 0,69 0,76 MoM: 0,62-0,76 Training + Validation Training study: 0,83 Training study: 0,78- NK NK NK 7.2 99.4 study: 83 MoM: 0,81 0,88 MoM: 0,75-0,86

92,6 NK 0,97 0,95-0,98 NK NK NK NK NA 0 [47]

NK NK 0,71 NK NK NK NK NK Sputum: 0,84 Plasma: Sputum: 0,75-0,94 Sputum: 96% Plasma: Sputum: 60% Plasma: Sputum: 88, Plasma: 84 NK NK NK NA 0 [48] 0,78 Plasma: 0,70-0,86 92% 55% Sputum: 0,84 Plasma: Sputum: 0,74-0,94 Sputum: 93% Plasma: Sputum: 58% Plasma: Sputum: 75, Plasma: 78 NK NK NK 0,78 Plasma: 0,70-0,86 90% 57% Sputum: 97% Plasma: Sputum: 40% Plasma: Sputum: 92, Plasma: 92 NK Sputum: 0,77 Sputum: 0,67-0,86 NK NK 91% 36% Sputum: 90% Plasma: Sputum: 44% Plasma: Sputum: 67, Plasma: 74 NK Plasma: 0,68 Plasma: 0,58-0,77 NK NK 86% 46% Sputum: 79% Plasma: Sputum: 25% Plasma: Sputum: 8, Plasma: 46 NK NK NK NK NK 80% 58% Sputum: 90% Plasma: Sputum: 56% Plasma: Sputum: 63, Plasma: 54 NK NK NK NK NK 82% 57% Sputum: 89 Plasma: Sputum: 0,8-0,9 Plasma: Sputum: 93% Plasma: Sputum: 89% Plasma: Sputum: 71, Plasma: 62 NK NK NK 0,77 0,68-0,86 86% 78%

Training Stage: 84 Training Stage: 81-96 Training Stage: 0,91 NK NK NK NK NK Yes 0 [49] Verification Stage: 83 Verification Stage: 79-88 Verification Stage: 0,90

Detroit cohort: 100 New Detroit cohort : 1,00 Detroit cohort: 91,4-100 York: 95,7 Validation New York cohort: 0,99 NK NK NK NK NK Yes 0 [50] New York: 89,6-98,8 cohort: 100 Validation cohort: 0,87

Detroit cohort : 0,95 NK NK NK NK NK NK NK New York cohort: 0,91

40,1 NK 0,69 0,642-0,735 2,331 1,837-2,958 6,4 97,6 NA 0 [51]

85 NK 0,84 NK 8,3 3.9-17.7 NK NK NA 0 [52]

96.2 NK NK NK NK NK NK NK NA ++ [53]

Table 4: Ovarian cancer

Author + Date Journal Location Study Design Study Population Specimen Method

Healthy patients Persons with Symptoms Age Range Other characteristics Pregnancies: no pregnancies 179, 1 American Journal of pregnancy 155, 2 Longoria T. 2013 USA Prospective cohort study 0 1016 18-92 Serum PCR Obstetrics & Gynecology pregnancies 251, 3 pregnancies 207, 4 or more: 222

Patients with pre- Asian Pacific Journal of Yildirim M. 2014 Turkey Cohort study 0 569 NK existing infection were Serum ELISA Cancer Prevention excluded

International Journal of Nested case-control Gislefoss R. 2015 Norway 174 120 NK NK Serum ELISA Gynecological Cancer study

Xu Y. 2016 Clinical Biochemistry China Case-control study 265 595 NK NK Serum ELISA

American Association for Cramer D. 2011 USA Case-control study 480 320 30-… NK Serum ELISA Cancer Research

Leandersson P. 2016 Anticancer Research Sweden Cohort study 0 350 16-90 NK Serum ELISA Sedlakova I. 2010 Tumor Biology Czech Republic Prospective cohort study 27 132 20-82 NK Plasma Capillary electrophoresis

International Journal of Retrospective validation Simmons A. 2016 USA 217 142 20-87 NK Serum ELISA Gynecological Cancer study

Retrospective case- Tcherkassova J. 2011 Tumor Biology Canada 105 80 NK NK Serum ELISA control study

American Association for Zhu C. 2011 USA Validation study 951 118 55-79 NK Serum ELISA Cancer Research Biomarker Mechanism Number of Patients included P-value Sensitivity(%) 95% CI Total number of Number of patients with Benign ovarian disease Controls Non-Neoplastic findings Other cancers participants Ovarian Cancer

OVA1 Multivariate index assay 1016 255 761 0 NK NK P<0,05 92,2 88,2-94,9

Clinical assessment 74,5 68,8-79,5 OVA1 + Clinical 95,3 92-97,3 assessment CA-125 Carcinoma antigen 70,6 64,7-75,8 American Congress of Modified ACOG Obstetricians and 80 74,7-84,4 Gynecologists guidelines

CA-125 Carcinoma antigen 569 253 316 0 NK NK P<0,01 75 NK

Neutrophil/Lymphocyte NLR 79 NK ratio Platelet/Lymphocyte PLR 75,8 NK ratio Neutrophil count 30,6 NK Lymphocyte count 50 NK Platelet count 74,2 NK Human epididymis HE-4 294 120 0 120 NK NK P=0.002 NK NK protein Human epididymis Pre-Meno: 72,6 Post- HE-4 860 239 311 265 NK 45 borderline cancer P<0,05 NK protein Meno: 76,8

Pre-Meno: 56,5 Post- CA-125 Carcinoma antigen NK Meno: 73,2

Risk of Ovarian Pre-Meno: 75,8 Post- ROMA NK Malignancy Algorithm Meno: 78,6

Pre-Meno: 72,6 Post- HE-4 + CA-125 Biomarker panel NK Meno: 80,4

Pre-Meno: 85,5 Post- HE-4 + CA-125 + ROMA Biomarker panel NK Meno: 80,4

CA-125 Carcinoma antigen 800 160 160 480 NK NK P<0,05 73 64-84 Human epididymis HE-4 57 50-70 protein Transthyretin Transport protein 47 38-56 IGF-2 Insulin-like growth factor 36 28-49 Prolactin Luteotropic hormone 34 15-49 B7-family protein B7-H4 350 109 211 0 NK 30 borderline cancer P<0,01 NK NK homolog 4 Soluble urokinase suPAR plasminogen activator NK NK receptor Human epididymis HE-4 NK NK protein

CA-125 Carcinoma antigen NK NK

Risk of Ovarian ROMA NK NK Malignancy Algorithm

lnHE-4 + lnCA-125 + Biomarker panel 74 NK insuPAR + age 45 serous ovarian cancer, 8 endometroid Lysophosphatidic acid ovarian cancer, 17 LPA 159 81 51 27 NK P<0,001 79 NK (cut-off: 8.30 µmol/l mucinous ovarian cancer, 5 clear cell cancer 55 endometroid ovarian cancer, 33 mucinous CA-125 + CA72-4 + HE-4 Biomarker panel 359 142 0 217 NK ovarian cancer, 19 clear P<0,05 83.7 NK + sVCAM cell cancer, 5 mixed, 2 undifferentiated CA-125 + MMP-7 + CA72- Biomarker panel 83.2 NK 4 + HE-4 Receptor for the RECAF circulating fetal alpha- 185 80 0 105 NK NK P<0,05 52,5 NK foetoprotein CA-125 Carcinoma antigen 70 NK RECAF + CA-125 Biomarker panel 83 NK A: CA-125 + IGF-2 + Step 1: 22-44 Step 2: 20- Leptin + MIF + OPN + Biomarker panel 1069 118 0 951 NK NK P<0,01 Step 1: 32,8 Step 2: 36,7 54 Prolactin B: CA-125 + B7-H4 + Biomarker panel Step 1: 64,6 Step 1: 53-76 CA15-3 + CA72-4 + HE-4 C: CA-125 + HE-4 + IGFBP- 2 + Mesothelin + MMP-7 Step 1: 15-36 Step 2: 29- + Secretory Leukocyte Biomarker panel Step 1: 25,4 Step 2: 46,7 64 Protease Inhibitor + Spondin-2 D: Apolipoprotein A-1 + Bèta)2-microglobulin + Step 1: 40-64 Step 2: 34- Biomarker panel Step 1: 52,3 Step 2: 51,7 CA-125 + CTAP-3 + 69 Transthyretin E: CA-125 + CA72-4 + EGFR + Eotaxin + HE-4 + Biomarker panel Step 2: 23,3 Step 2: 8-38 MMP-3 + Prolactin + sVCAM-1 Step 1: 53-76 Step 2: 56- CA-125 Carcinoma antigen Step 1: 64,6 Step 2: 72,4 89 Pan-site: CA-125 + HE-4 Biomarker panel Step 3: 68,2 Step 3: 57-80 + CA72-4 + SLPI + B2M Quality Assessment Specificity (%) 95% CI AUC 95% CI PPV (%) NPV (%) Blinded Reference (Fowkes checklist)

49,4 45,9-53 NK NK 37,9 94,9 NA 0 [54]

86,3 83,7-88,6 NK NK 64,6 91 44,2 40,7-47,7 NK NK 36,4 96,6 89,6 87,2-91,6 NK NK 69,5 90,1

76,5 73,6-79,4 NK NK 53,3 91,9

82,2 NK NK NK 73,2 83,6 NA 0 [55]

37 NK 0,593 NK 43,4 73,2

48,4 NK 0,621 NK 48,7 75,6 79,7 NK 0,701 NK 49,3 64,1 64 NK 0,593 NK 47,3 66,5 39,6 NK 0,56 NK 44,2 62,9 NK NK NK NK NK NK NA ++ [62]

Pre-Meno: 0,817 Post- Pre-Meno: 0,749-0,885 75 NK NK NK NA 0 [61] Meno: 0,862 Post-Meno: 0,784-0,939

Pre-Meno: 0,683 Post- Pre-Meno: 0,596-0,770 75 NK NK NK Meno: 0,849 Post-Meno: 0,772-0,927

Pre-Meno: 0,818 Post- Pre-Meno: 0,750-0,885 75 NK NK NK Meno: 0,882 Post-Meno: 0,813-0,950

Pre-Meno: 0,817 Post- Pre-Meno: 0,749-0,885 75 NK NK NK Meno: 0,888 Post-Meno: 0,821-0,954

Pre-Meno: 0,886 Post- Pre-Meno: 0,836-0,935 75 NK NK NK Meno: 0,888 Post-Meno: 0,819-0,935

95 NK 0,92 0,89-0,95 NK NK NA 0 [60]

95 NK 0,86 0,83-0,89 NK NK 95 NK 0,8 0,75-0,84 NK NK 95 NK 0,8 0,76-0,83 NK NK 95 NK 0,77 0,74-0,81 NK NK Pre-Meno: 0,682 Post- Pre-Meno: 0,532-0,832 NK NK NK NK NA 0 [59] Meno: 0,795 Post-Meno: 0,724-0,865 Pre-Meno: 0,822 Post- Pre-Meno: 0,708-0,936 NK NK NK NK Meno: 0,747 Post-Meno: 0,670-0,825

Pre-Meno: 0,761 Post- Pre-Meno: 0,620-0,901 NK NK NK NK Meno: 0,880 Post-Meno: 0,828-0,932

Pre-Meno: 0,864 Post- Pre-Meno: 0,783-0,946 NK NK NK NK Meno: 0,889 Post-Meno: 0,833-0,945

Pre-Meno: 0,773 Post- Pre-Meno: 0,633-0,912 NK NK NK NK Meno: 0,914 Post-Meno: 0,876-0,961

Pre-Meno: 0,940 Post- Pre-Meno: 0,902-0,980 95 NK NK NK Meno: 0,912 Post-Meno: 0,864-0,960

92,6 NK NK NK NK NK NA + [58]

98 NK NK NK NK NK Yes + [57]

98 NK NK NK NK NK

100 NK 0,96 NK NK NK NA + [56]

100 NK 0,889 NK NK NK 100 NK NK NK NK NK Step 1: 0,721 Step 2: Step 1: 0,64-0,80 Step 2: 98 NK NK NK Yes 0 [63] 0,852 0,77-0,94

98 NK Step 1: 0,892 Step 1: 0,84-0,95 NK NK

Step 1: 0,712 Step 2: Step 1: 0,63-0,79 Step 2: 98 NK NK NK 0,848 0,76-0,94

Step 1: 0,858 Step 2: Step 1: 0,80-0,92 Step 2: 98 NK NK NK 0,810 0,72-0,90

98 NK Step 2: 0,590 Step 2: 0,46-0,72 NK NK

Step 1: 0,890 Step 2: Step 1: 0,84-0,94 Step 2: 98 NK NK NK 0,898 0,82-0,98 98 NK Step 3: 0,911 Step 3: 0,86-0,96 NK NK Table 5: Prostate cancer

Author + Date Journal Location Study Design Study Population Specimen Method Biomarker Mechanism Number of Patients included Total number of Number of patients with Healthy patients Persons with Symptoms Age Range Males Other characteristics Benign Prostate disease Controls participants Prostate Cancer 191 BRCA1 mutation cariers, 75 BRCA2 Prostate Specific Cremers R. 2015 Urologic Oncology The Netherlands Screening study 498 634 45-69 2181 Serum Immunoassay PSA 2181 42 0 489 mutation cariers, 308 antigen non-carriers Prostate Cancer Antigen PCA3 3 American Association Prospective multi- Prostate Specific Sokol L. 2010 USA 321 245 40 - >90 566 NK Serum Immunoassay PSA 566 245 0 321 for Cancer Research center cohort study antigen %fPSA %[-2]proPSA Precursor form of PSA Base logistic regression Biomarker panel model + %[-2]proPSA American Association Prospective cohort Free-circulating DNA Gordian E. 2010 USA 0 172 NK 172 NK Serum PCR fcDNA 172 89 163 0 for Cancer Research study from tumor cells PSA (adjusted for race Prostate Specific and age) antigen fcDNA + PSA Biomarker panel fcDNA + PSA + Biomarker panel fcDNA*PSA British Journal of Prostate Specific Mitra A. 2010 UK Screening study 95 205 40-69 300 NK Serum PCR PSA 300 BRCA1: 89 BRCA2: 116 0 95 Urology International antigen American Association Prospective cohort Morgan R. 2011 UK 102 82 40-86 200 NK Urine ELISA EN-2 Engrailed-2 200 82 0 102 for Cancer Research study American Association Clinical base model: DRE Vickers A. 2010 USA Screening study 1113 388 NK 1501 NK Serum PSA assay Biomarker panel 1501 388 0 1113 for Cancer Research + age + tPSA Biomarker panel + DRE + age + tPSA + fPSA Kallikrein-related + iPSA + hK2 peptidase 2 Journal of Clinical Laboratory base model: Vickers A. 
2010 USA Cohort study 2107 807 55-75 2914 NK Serum PSA assay Biomarker panel 2914 807 0 2107 Oncology age + tPSA + fPSA age + tPSA + fPSA + hK2 Biomarker panel age + tPSA + fPSA + hk2 Biomarker panel + iPSA age + DRE + tPSA + fPSA Biomarker panel age + DRE + tPSA + hK2 Biomarker panel age + DRE + tPSA + iPSA Biomarker panel + hK2 Journal of Clinical Prospective multi- Prostate Cancer Antigen Wei J. 2014 USA 0 331 NK 859 NK Urine PCA3 assay PCA3 859 331 528 0 Oncology center cohort study 3 Prostate Cancer PCPT Prevention Trial PCPT + PCA3 Biomarker panel Quality Assessment P-value Sensitivity(%) 95% CI Specificity (%) 95% CI AUC 95% CI DOR 95% CI PPV (%) NPV (%) Blinded Reference (Fowkes checklist) Non-Neoplastic findings Other cancers

NK NK P<0,05 100 NK 85 NK NK NK NK NK 25 100 Yes + [64]

NK NK NK NK NK NK NK NK 13 98

NK NK P<0,001 80 NK 41,7 36,3-47,4 0,66 0,62-0,71 NK NK NK NK NA 0 [65] 81 NK 40,2 34,8-45,8 0,70 0,65-0,74 NK NK NK NK 82 NK 42,1 36,6-47,7 0,67 0,62-0,71 NK NK NK NK 83 NK 61,4 56-66,7 0,79 0,75-0,82 NK NK NK NK

59 prostatitis, 104 BPH NK P<0,01 NK NK NK NK 0,676 NK 2,15 1,22-3,79 NK NK NA 0 [66]

P<0,001 95,5 88,9-98,9 6,7 3,4-11,8 0,687 NK 2,67 1,50-4,77 35,9 73,3 p<0,005 NK NK NK NK 0,726 NK 2,78 1,54-5,02 NK NK P<0,001 95,5 88,9-98,8 33,1 26-40,9 0,742 NK 5,42 2,55-11,52 43,8 93,1 BRCA1: 66,7 BRCA2: NK NK P<0,05 NK NK NK NK NK NK NK NK NK NA + [67] 36,4 NK NK P<0,0001 66,3 NK 90 NK 0,8021 0,7293-0,8750 NK NK NK NK Yes + [68]

NK NK P<0,001 NK NK NK NK 0,585 0,551-0,619 NK NK NK NK NA 0 [69]

NK NK NK NK 0,711 0,681-0,741 NK NK NK NK

NK NK P<0,001 NK NK NK NK 0,727 0,701-0,752 NK NK NK NK NA 0 [70] NK NK NK NK 0,648 0,621-0,675 NK NK NK NK NK NK NK NK 0,764 0,739-0,788 NK NK NK NK NK NK NK NK 0,752 0,727-0,776 NK NK NK NK NK NK NK NK 0,702 0,676-0,728 NK NK NK NK NK NK NK NK 0,776 0,752-0,799 NK NK NK NK

NK NK P<0,05 42 36-48 91 87-94 NK NK NK NK 80 88 NA 0 [71]

NK NK NK NK 0,68 NK NK NK NK NK

NK NK NK NK 0,79 NK NK NK NK NK

Table 6: Pancreatic cancer

| Author + Date | Journal | Location | Study Design | Healthy patients | Persons with Symptoms | Age Range | Males/Females | Other characteristics | Specimen |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Baine M. 2011 | Cancer Biomarkers | USA | Case-control study | 47 | 130 | NK | 79/96 | NK | Serum |
| Yip-Schneider M. 2014 | Journal of the American College of Surgeons | USA | Prospective cohort study | 0 | 87 | NK | NK | NK | Pancreatic fluid |
| Capello M. 2017 | Journal of the National Cancer Institute | USA | Cohort study | 169 | 280 | 61-76 | NK | NK | Serum |
| Henriksen S. 2016 | Clinical Epigenetics | Denmark | Prospective cohort study | 27 | 251 | 22-87 | 168/110 | 128 current smokers, 75 former smokers | Serum |
| Matsubara J. 2010 | Cancer Epidemiology, Biomarkers & Prevention | Japan | Case-control study | 108 | 174 | NK | 179/103 | NK | Plasma |
| Takayama R. 2010 | Journal of Gastroenterology | Japan | Case-control study | 69 | 131 | NK | NK | NK | Serum |
| Wang W. 2013 | Cancer Prevention Research | China | Case-control study | 80 | 272 | NK | 157/135 | NK | Serum |
| Zhang Y. 2014 | The Royal Society of Chemistry | China | Cohort study | 205 | 156 | 34-79 | 194/167 | NK | Serum |

| Author + Date | Method | Biomarker | Mechanism | Total number of participants | Number of patients with Pancreatic Cancer | Benign Pancreatic disease | Controls | Non-Neoplastic findings | Other cancers | P-value | Sensitivity (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Baine M. 2011 | RT-PCR | ANXA3 | Annexin 3 gene | 177 | 95 | 35 chronic pancreatitis | 47 | NK | NK | P<0,05 | 12 |
|  |  | ARG1 | Arginase 1 gene |  |  |  |  |  |  |  | 30 |
|  |  | CA5b | Carbonic anhydrase 5b gene |  |  |  |  |  |  |  | 21 |
|  |  | F5 | Coagulation factor V gene |  |  |  |  |  |  |  | 32 |
|  |  | SSBP2 | Single stranded DNA binding protein gene |  |  |  |  |  |  |  | 29 |
|  |  | TBC1D8 | TBC 1 domain family member 8 gene |  |  |  |  |  |  |  | 29 |
|  |  | MUC1 | Mucin 1 gene |  |  |  |  |  |  |  | 15 |
|  |  | MUC16 | Mucin 16 gene |  |  |  |  |  |  |  | 20 |
|  |  | NGAL | Neutrophil gelatinase-associated lipocalin gene |  |  |  |  |  |  |  | 27 |
|  |  | MIC1 | Macrophage inhibitory cytokine 1 gene |  |  |  |  |  |  |  | 26 |
|  |  | CA 19-9 | Cancer antigen |  |  |  |  |  |  |  | 70 |
|  |  | CA 19-9 + CA5B + F5 + MIC1 + ARG1 | Biomarker panel |  |  |  |  |  |  |  | 67 |
| Yip-Schneider M. 2014 | ELISA | VEGF-A | Vascular endothelial growth factor | 87 | 11 | 9 (pseudocysts) | 0 | Mucinous cystic neoplasm: 24, low/moderate grade papillary mucinous neoplasm: 10, high grade/invasive intraductal papillary mucinous neoplasm | NK | P<0,0001 | 100 |
|  |  | VEGF-C | Vascular endothelial growth factor |  |  |  |  |  |  | P<0,0001 | 100 |
| Capello M. 2017 | ELISA | CA19-9 | Cancer antigen | 449 | 187 | 93 | 169 | NK | NK | P<0,001 | 72,6 |
|  |  | TIMP1 | Tissue inhibitor of metalloproteinases |  |  |  |  |  |  |  | 41,1 |
|  |  | LRG1 | Leucine-rich alpha-2-glycoprotein 1 |  |  |  |  |  |  |  | 42,5 |
|  |  | REG3A | Regenerating islet-derived family |  |  |  |  |  |  |  | 45,2 |
|  |  | IGFBP2 | Insulin-like growth factor binding protein 2 |  |  |  |  |  |  |  | 42,5 |
|  |  | COL18A1 | Collagen |  |  |  |  |  |  |  | 32,9 |
|  |  | TNFRSF1A | TNF-receptor superfamily member 1a |  |  |  |  |  |  |  | 20,6 |
|  |  | TIMP1 + LRG1 + CA19-9 | Biomarker panel |  |  |  |  |  |  |  | Combined validation set: 84,9; Blinded independent set: 66,7 |
| Henriksen S. 2016 | RT-PCR | 8-gene panel: BMP3 + RASSF1A + BNC1 + MESTv2 + TFPI2 + APC + SFRP1 + SFRP2 | Hypermethylated genes | 278 | 95 | 124 | 27 | 103 chronic pancreatitis, 62 acute pancreatitis | NK | P<0,001 | 76 |
| Matsubara J. 2010 | Mass spectrometry | CXCL-7 | Chemokine ligand | 282 | 146 | 10 | 108 | 10 chronic pancreatitis | NK | P<0,0001 | NK |
|  |  | CXCL-7 + CA19-9 | Biomarker panel |  |  |  |  |  |  | P<0,03 | 84 |
| Takayama R. 2010 | ELISA | REG-4 | Regenerating islet-derived family | 200 | 120 | 11 | 69 | NK | NK | P<0,001 | 94,9 |
|  |  | CA19-9 | Cancer antigen |  |  |  |  |  |  |  | NK |
|  |  | REG-4 + CA19-9 | Biomarker panel |  |  |  |  |  |  |  | 100 |
| Wang W. 2013 | RT-PCR | CA19-9 | Cancer antigen | 352 | 149 | 123 | 80 | 44 chronic pancreatitis, 20 pseudocyst, 4 auto-immune pancreatitis, 19 serous cystadenoma, 2 benign cyst, 33 biliary calculus disease, 1 lymphoepithelial cyst | NK | P<0,001 | 72,9 |
|  |  | miR-27a-3p | miRNA |  |  |  |  |  |  | P<0,001 | 82,2 |
|  |  | Gender + total bilirubin + fasting blood glucose + miR-27a-3p + CA19-9 | Biomarker panel |  |  |  |  |  |  |  | 85,3 |
| Zhang Y. 2014 | Mass spectrometry | C16:1 | Free fatty acid | 361 | 95 | 61 | 205 | 60 pancreatitis | NK | P<0,01 | Training set: 86,7; Validation set: 80,7 |
|  |  | C18:3 | Free fatty acid |  |  |  |  |  |  |  | Training set: 88,3; Validation set: 95,9 |
|  |  | C18:2 | Free fatty acid |  |  |  |  |  |  |  | Training set: 83,3; Validation set: 68,3 |
|  |  | C20:4 | Free fatty acid |  |  |  |  |  |  |  | Training set: 88,3; Validation set: 86,2 |
|  |  | C22:6 | Free fatty acid |  |  |  |  |  |  |  | Training set: 65; Validation set: 68,3 |
|  |  | C18:2 + C18:1 | Free fatty acids |  |  |  |  |  |  |  | Training set: 96,7; Validation set: 86,2 |
|  |  | C18:3 + C18:1 | Free fatty acids |  |  |  |  |  |  |  | Training set: 86,7; Validation set: 81,4 |
|  |  | PUFA | Polyunsaturated fatty acids |  |  |  |  |  |  |  | Training set: 96,4; Validation set: 85,1 |
|  |  | Panel A: C16:1 + C18:3 + C18:2 + C20:4 + C22:6 | Biomarker panel |  |  |  |  |  |  |  | Training set: 89,3; Validation set: 86,6 |
|  |  | Panel B: C18:2 + C18:1 + C18:3 | Biomarker panel |  |  |  |  |  |  |  | Training set: 82,1; Validation set: 83,6 |

| Biomarker | 95% CI (sensitivity) | Specificity (%) | 95% CI (specificity) | AUC | 95% CI (AUC) | DOR | 95% CI (DOR) | Blinded | Quality Assessment (Fowkes checklist) | Reference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ANXA3 | NK | 80 | NK | 0,526 | NK | NK | NK | NA | 0 | [72] |
| ARG1 | NK | 81 | NK | 0,567 | NK | NK | NK |  |  |  |
| CA5b | NK | 82 | NK | 0,584 | NK | NK | NK |  |  |  |
| F5 | NK | 83 | NK | 0,561 | NK | NK | NK |  |  |  |
| SSBP2 | NK | 84 | NK | 0,535 | NK | NK | NK |  |  |  |
| TBC1D8 | NK | 85 | NK | 0,501 | NK | NK | NK |  |  |  |
| MUC1 | NK | 86 | NK | 0,539 | NK | NK | NK |  |  |  |
| MUC16 | NK | 87 | NK | 0,503 | NK | NK | NK |  |  |  |
| NGAL | NK | 88 | NK | 0,504 | NK | NK | NK |  |  |  |
| MIC1 | NK | 89 | NK | 0,574 | NK | NK | NK |  |  |  |
| CA 19-9 | NK | 80 | NK | 0,719 | NK | NK | NK |  |  |  |
| CA 19-9 + CA5B + F5 + MIC1 + ARG1 | NK | 81 | NK | 0,772 | NK | NK | NK |  |  |  |
| VEGF-A | NK | 97 | NK | >0,99 | NK | NK | NK | NA | 0 | [73] |
| VEGF-C | NK | 90 | NK | NK | NK | NK | NK |  |  |  |
| CA19-9 | NK | 95 | NK | 0,882 | 0,809-0,956 | NK | NK | Yes | 0 | [74] |
| TIMP1 | NK | 95 | NK | 0,88 | 0,805-0,956 | NK | NK |  |  |  |
| LRG1 | NK | 95 | NK | 0,847 | 0,768-0,926 | NK | NK |  |  |  |
| REG3A | NK | 95 | NK | 0,819 | 0,735-0,903 | NK | NK |  |  |  |
| IGFBP2 | NK | 95 | NK | 0,8 | 0,715-0,885 | NK | NK |  |  |  |
| COL18A1 | NK | 95 | NK | 0,749 | 0,660-0,837 | NK | NK |  |  |  |
| TNFRSF1A | NK | 95 | NK | 0,692 | 0,597-0,788 | NK | NK |  |  |  |
| TIMP1 + LRG1 + CA19-9 | NK | Combined validation set and blinded independent set: 95 | NK | Combined validation set: 0,949; Blinded independent set: 0.887 | Combined validation set: 0,917-0,981; Blinded independent set: 0.817-0.957 | Combined validation set: 4,67; Blinded independent set: 3.19 | Combined validation set: 3,29-6,05; Blinded independent set: 2.11-4.26 |  |  |  |
| 8-gene panel | NK | 83 | NK | 0,86 | 0,81-0,91 | NK | NK | NA | 0 | [75] |
| CXCL-7 | NK | NK | NK | 0,85 | 0,792-0,895 | NK | NK | NA | 0 | [76] |
| CXCL-7 + CA19-9 | NK | 95 | NK | 0,961 | 0,932-0,994 | NK | NK |  |  |  |
| REG-4 | NK | 64 | NK | 0,922 | NK | NK | NK | NA | 0 | [77] |
| CA19-9 | NK | NK | NK | 0,884 | NK | NK | NK |  |  |  |
| REG-4 + CA19-9 | NK | 60 | NK | NK | NK | NK | NK |  |  |  |
| CA19-9 | NK | 75,7 | NK | 0,788 | 0,730-0,839 | NK | NK | NA | 0 | [78] |
| miR-27a-3p | NK | 79,1 | NK | 0,857 | 0,812-0,895 | NK | NK |  |  |  |
| Gender + total bilirubin + fasting blood glucose + miR-27a-3p + CA19-9 | NK | 81,6 | NK | 0,886 | 0,837-0,923 | NK | NK |  |  |  |
| C16:1 | NK | Training set: 82,1; Validation set: 76,1 | NK | Training set: 0,907; Validation set: 0,843 | Training set: 0,840-0,974; Validation set: 0,780-0,906 | NK | NK | NA | 0 | [79] |
| C18:3 | NK | Training set: 67,9; Validation set: 67,2 | NK | Training set: 0,795; Validation set: 0,885 | Training set: 0,692-0,901; Validation set: 0,829-0,940 | NK | NK |  |  |  |
| C18:2 | NK | Training set: 71,4; Validation set: 79,1 | NK | Training set: 0,782; Validation set: 0,835 | Training set: 0,661-0,902; Validation set: 0,772-0,898 | NK | NK |  |  |  |
| C20:4 | NK | Training set: 50; Validation set: 68,7 | NK | Training set: 0,717; Validation set: 0,842 | Training set: 0,590-0,845; Validation set: 0,778-0,907 | NK | NK |  |  |  |
| C22:6 | NK | Training set: 82,1; Validation set: 86,6 | NK | Training set: 0,790; Validation set: 0,873 | Training set: 0,685-0,895; Validation set: 0,818-0,929 | NK | NK |  |  |  |
| C18:2 + C18:1 | NK | Training set: 75; Validation set: 79,1 | NK | Training set: 0,907; Validation set: 0,860 | Training set: 0,827-0,987; Validation set: 0,800-0,921 | NK | NK |  |  |  |
| C18:3 + C18:1 | NK | Training set: 60,7; Validation set: 62,7 | NK | Training set: 0,788; Validation set: 0,738 | Training set: 0,679-0,896; Validation set: 0,656-0,821 | NK | NK |  |  |  |
| PUFA | NK | Training set: 55; Validation set: 73,1 | NK | Training set: 0,806; Validation set: 0,911 | Training set: 0,709-0,902; Validation set: 0,861-0,961 | NK | NK |  |  |  |
| Panel A | NK | Training set: 85; Validation set: 85,5 | NK | Training set: 0,933; Validation set: 0,935 | Training set: 0,870-0,986; Validation set: 0,896-0,973 | NK | NK |  |  |  |
| Panel B | NK | Training set: 88,3; Validation set: 81,4 | NK | Training set: 0,908; Validation set: 0,880 | Training set: 0,829-0,988; Validation set: 0,832-0,928 | NK | NK |  |  |  |

Table 7: Breast cancer

| Author + Date | Journal | Location | Study Design | Healthy patients | Persons with Symptoms | Age Range | Males/Females | Other characteristics | Specimen | Method |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Atahan K. 2011 | International Journal of Medical Sciences | Turkey | Prospective cohort study | 37 | 51 | 21-73 | 0/92 | 18 invasive lobular carcinoma, 6 invasive ductal carcinoma, 3 mixed type | Serum | SELDI-TOF-MS |
| Garczyk S. 2015 | PLOS ONE | China | Case-control study | 40 | 40 | NK | 0/80 | NK | Serum | ELISA |
| Gong B. 2012 | Oncology Letters | China | Case-control study | 100 | 300 | 22-74 | 0/400 | 122 invasive ductal carcinoma, 17 invasive lobular carcinoma, 19 medullary carcinoma, 9 invasive carcinoma, 5 intraductal carcinoma, 13 metastasis, 15 other carcinoma | Serum | PCR |
| Park B-J. 2014 | BioMed Central | South Korea | Cohort study | 100 | 402 | 20-91 | 150/352 | NK | Serum | ELISA |
| Zhang F. 2013 | BioMed Research International | USA | Case-control study | 63 | 67 | NK | 0/130 | NK | Serum | SVM |
| Zhang H. 2013 | International Journal of Clinical and Experimental Medicine | China | Case-control study and Meta-analysis | 93 | 58 | 21-78 | 0/151 | NK | Serum | RT-PCR |

| Author + Date | Biomarker | Mechanism | Total number of participants | Number of patients with Breast Cancer | Benign breast disease | Controls | Non-Neoplastic findings | Other cancers | P-value | Sensitivity (%) | 95% CI (sensitivity) | Specificity (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Atahan K. 2011 | BC1 | NK | 91 | 27 | 24 | 37 | NK | NK | P<0,005 | NK | NK | NK |
|  | BC2 | NK |  |  |  |  |  |  |  | NK | NK | NK |
|  | BC3 | NK |  |  |  |  |  |  |  | NK | NK | NK |
| Garczyk S. 2015 | AGR3 | Human Anterior Gradient family | 80 | 40 | 0 | 40 | NK | NK | P<0,01 | 35 | NK | 92,5 |
|  | AGR2 | Human Anterior Gradient family |  |  |  |  |  |  | P<0,001 | 32,5 | NK | 90 |
|  | AGR2 + AGR3 | Biomarker panel |  |  |  |  |  |  | P<0,001 | 64,5 | NK | 89,5 |
| Gong B. 2012 | GAPDH | Apoptotic Cell DNA | 400 | 200 | 100 | 100 | NK | NK | P<0,05 | Training cohort: 95; Validation cohort: 89 | NK | Training cohort: 92; Validation cohort: 94 |
| Park B-J. 2014 | Trx1 | Thioredoxin: NADPH-dependent | 502 | 197 | 0 | 100 | NK | 197 breast cancer, 111 NSCLC, 64 colorectal cancer, 30 kidney cancer | P<0,001 | 89,3 | NK | 78 |
|  | CEA | Carcinoembryonic antigen |  |  |  |  |  |  |  | 54.4 | NK | 77.6 |
|  | CA15-3 | Cancer antigen |  |  |  |  |  |  |  | 48.6 | NK | 89.8 |
| Zhang F. 2013 | PCDHGA-8 + LEFTY-2 + CACNG-6 + BCAR-3 + CYP21A2 | Biomarker panel | 130 | 67 | 0 | 63 | NK | NK | P<0,01 | Training cohort: 85,29; Validation cohort: 72,41 | NK | Training cohort: 81,25; Validation cohort: 74,19 |
| Zhang H. 2013 | miRNA-205 | miRNA | 151 | 58 | 0 | 93 | NK | NK | P<0,001 | Case-control: 86,2; Meta-analysis: 75 | Meta-analysis: 66-82 | Case-control: 82,8; Meta-analysis: 84 |

| Biomarker | 95% CI (specificity) | AUC | 95% CI (AUC) | DOR | 95% CI (DOR) | PLR | NLR | PPV (%) | NPV (%) | Blinded | Quality Assessment (Fowkes checklist) | Reference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BC1 | NK | <0,70 | NK | NK | NK | NK | NK | NK | NK | NA | ++ | [80] |
| BC2 | NK | <0,70 | NK | NK | NK | NK | NK | NK | NK |  |  |  |
| BC3 | NK | NK | NK | NK | NK | NK | NK | NK | NK |  |  |  |
| AGR3 | NK | 0,718 | 0,606-0,829 | NK | NK | NK | NK | NK | NK | NA | 0 | [81] |
| AGR2 | NK | 0,841 | 0,745-0,936 | NK | NK | NK | NK | NK | NK |  |  |  |
| AGR2 + AGR3 | NK | 0,827 | 0,693-0,962 | NK | NK | NK | NK | NK | NK |  |  |  |
| GAPDH | NK | Training cohort: 0,935; Validation cohort: 0,915 | NK | Training cohort: 218,5; Validation cohort: 126,8 | Training cohort: 56-852,2; Validation cohort: 33,71-467,69 | NK | NK | Training cohort: 96; Validation cohort: 97 | Training cohort: 90; Validation cohort: 81 | NA | 0 | [82] |
| Trx1 | NK | 0,911 | NK | NK | NK | NK | NK | NK | NK | NA | 0 | [83] |
| CEA | NK | 0.678 | NK | NK | NK | NK | NK | NK | NK |  |  |  |
| CA15-3 | NK | 0.719 | NK | NK | NK | NK | NK | NK | NK |  |  |  |
| PCDHGA-8 + LEFTY-2 + CACNG-6 + BCAR-3 + CYP21A2 | NK | Training cohort: 0,905; Validation cohort: 0,788 | NK | NK | NK | NK | NK | NK | NK | Yes | 0 | [84] |
| miRNA-205 | Meta-analysis: 80-88 | Case-control: 0,84; Meta-analysis: 0,87 | Case-control: 0,77-0,91; Meta-analysis: 0,84-0,90 | Meta-analysis: 16 | Meta-analysis: 11-24 | Meta-analysis: 4,8 | Meta-analysis: 0,30 | Meta-analysis: 62 | Meta-analysis: 7 | NA | 0 | [85] |

Table 8: Gastric cancer

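| Author + Date | Journal | Location | Study Design | Healthy patients | Persons with Symptoms | Age Range | Males/Females | Other characteristics | Specimen |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Jing J-X. 2014 | Asian Pacific Journal of Cancer Prevention | China | Prospective cohort study | 0 | 573 | 29-81 | 440/133 | 127 Esophageal cancer, 182 Cardiac carcinoma, 264 Gastric cancer | Serum |
| Tao H-Q. 2011 | Human Pathology | China | Cohort study | 70 | 92 | 22-86 | 112/50 | 92 Gastric cancer | Serum |
| Chen S. 2015 | Tumor Biology | China | Case-control study | 18 | 25 | NK | NK | NK | Serum |
| Lomba-Viana R. 2011 | European Journal of Gastroenterology and Hepatology | Portugal | Screening study | 13106 | NK | 40-79 | 5326/7792 | NK | Serum |
| Tong W. 2016 | OncoTargets and Therapy | China | Case-control study | 238 | 285 | NK | 361/162 | NK | Serum |
| Zhang X. 2012 | Tumor Biology | China | Case-control study | 47 | 94 | NK | 90/51 | 34 patients with gastric ulcers, 18 patients with atrophic gastritis | Gastric Juice |

| Author + Date | Method | Biomarker | Mechanism | Total number of participants | Number of patients with Upper GIT cancer | Controls | Non-Neoplastic findings | Other cancers | P-value | Sensitivity (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Jing J-X. 2014 | ELISA | CEA + CA19-9 + CA24-2 + SCC for Esophageal cancer | Biomarker panel | 573 | 573 | 0 | NK | NK | P<0,05 | 68,4 |
|  |  | CEA + CA199 + CA242 + CA724 for Cardiac and Gastric cancer | Biomarker panel |  |  |  |  |  | P<0,05 | 82,6 |
| Tao H-Q. 2011 | RT-PCR | REG-4 | Calcium dependent lectin gene | 162 | 92 | 70 | NK | NK | P<0,01 | NK |
| Chen S. 2015 | RT-PCR | miRNA-181 + KAT2B | Biomarker panel | 43 | 25 | 18 | NK | NK | P<0,01 | 95,8 |
|  |  | miRNA-181 + FOS | Biomarker panel |  |  |  |  |  |  | 100 |
| Lomba-Viana R. 2011 | ELISA | Serum Pepsinogen | Pepsinogen | 13118 | 12 | 13106 | NK | NK | P<0,05 | 67 |
| Tong W. 2016 | ELISA | IgG H. Pylori + ADAM-8 + PG-1 + PG-2 + VEGF | Biomarker panel | 523 | 285 | 238 | NK | NK | P<0,05 | Training group: 0,842; Validation group: 0,860 |
| Zhang X. 2012 | RT-PCR | Gastric juice miRNA-421 | miRNA | 141 | 42 | 99 | Atrophic gastritis: 18, Gastric Ulcer: 34 | NK | P<0,001 | 71,4 |
|  |  | Gastric juice CEA | Carcinoembryonic antigen |  |  |  |  |  |  | 42.8 |
|  |  | Serum CEA | Carcinoembryonic antigen |  |  |  |  |  |  | 14.3 |
|  |  | Gastric juice CEA + miRNA-421 | Biomarker panel |  |  |  |  |  |  | 85.7 |

| Biomarker | 95% CI (sensitivity) | Specificity (%) | 95% CI (specificity) | AUC | 95% CI (AUC) | PPV (%) | NPV (%) | Blinded | Quality Assessment (Fowkes checklist) | Reference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CEA + CA19-9 + CA24-2 + SCC for Esophageal cancer | NK | 71,5 | NK | NK | NK | NK | NK | NA | 0 | [86] |
| CEA + CA199 + CA242 + CA724 for Cardiac and Gastric cancer | NK | 83,3 | NK | NK | NK | NK | NK |  |  |  |
| REG-4 | NK | NK | NK | 0,798 | NK | NK | NK | NA | + | [87] |
| miRNA-181 + KAT2B | NK | 94,1 | NK | 0,960 | NK | NK | NK | NA | + | [88] |
| miRNA-181 + FOS | NK | 41,18 | NK | 0,45 | NK | NK | NK |  |  |  |
| Serum Pepsinogen | 63-71 | 47 | 42-51 | NK | NK | 2 | 99 | Yes | + | [89] |
| IgG H. Pylori + ADAM-8 + PG-1 + PG-2 + VEGF | NK | Training group: 0,792; Validation group: 0,832 | NK | 0,853 | 0,773-0,933 | NK | NK | NA | + | [90] |
| Gastric juice miRNA-421 | NK | 71,7 | NK | 0,767 | 0,684-0,850 | NK | NK | NA | 0 | [91] |
| Gastric juice CEA | NK | 74.3 | NK | NK | NK | NK | NK |  |  |  |
| Serum CEA | NK | 40 | NK | NK | NK | NK | NK |  |  |  |
| Gastric juice CEA + miRNA-421 | NK | 88.5 | NK | NK | NK | NK | NK |  |  |  |

Table 9: Renal cancer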

| Author + Date | Journal | Location | Study Design | Healthy patients | Persons with Symptoms | Age Range | Males/Females | Other characteristics |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Morrissey J. 2015 | JAMA Oncology | USA | Prospective cohort study | 80 | 720 | NK | NK | 334 with no history of cancer, 386 with history of cancer |
| Fedorko M. 2015 | International Journal of Molecular Sciences | Czech Republic | Retrospective case-control study | 100 | 195 | Mean age: 58 | 198/97 | NK |
| Mustafa A. 2011 | National Institute of Health | USA | Retrospective case-control study | 104 | 189 | 25-87 | 200/92 | NK |

| Author + Date | Specimen | Method | Biomarker | Mechanism | Total number of participants | Number of patients with RCC | Controls | Non-Neoplastic findings | Other cancers | P-value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Morrissey J. 2015 | Urine | ELISA | AQP1 | Aquaporin 1: water channel | 819 | 22 | 80 | 16 patients with renal lesions | History: 89 lung cancer, 12 prostate cancer, 25 colorectal cancer, 11 gastro-intestinal cancer, 16 uterine cancer, 25 ovarian cancer, 13 pancreatic cancer, 38 lymphoma, 44 breast cancer, 95 other tissue cancer | P<0,001 |
|  |  |  | PLIN2 | Perilipin-2 |  |  |  |  |  |  |
| Fedorko M. 2015 | Serum | qRT-PCR | miR-378 | miRNA | 295 | 195 | 100 | NK | NK | P<0,0001 |
|  |  |  | miR-210 | miRNA |  |  |  |  |  |  |
|  |  |  | miR-378 + miR-210 | Biomarker panel |  |  |  |  |  |  |
| Mustafa A. 2011 | Serum | Biochrom 30 Amino Acid Analyzer | Combination of 15 Amino Acids | Biomarker panel | 293 | 189 | 104 | NK | NK | P<0,05 |

| Biomarker | Sensitivity (%) | 95% CI (sensitivity) | Specificity (%) | 95% CI (specificity) | AUC | 95% CI (AUC) | Blinded | Quality Assessment (Fowkes checklist) | Reference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| AQP1 | 100 | NK | 96 | NK | 0,991 | 0,985-0,997 | NA | 0 | [92] |
| PLIN2 | 100 | NK | 98 | NK | 0.996 | 0.992-1.000 |  |  |  |
| miR-378 | NK | NK | NK | NK | 0,82 | 0,77-0,86 | NA | 0 | [93] |
| miR-210 | NK | NK | NK | NK | 0,74 | 0,69-0,80 |  |  |  |
| miR-378 + miR-210 | 80 | NK | 78 | NK | 0,85 | 0,81-0,89 |  |  |  |
| Combination of 15 Amino Acids | NK | NK | NK | NK | 0,81 | NK | NA | + | [94] |

Table 10: Gynecologic cancer

| Author + Date | Cancer Type | Journal | Location | Study Design | Healthy patients | Persons with Symptoms | Age Range | Other characteristics | Specimen |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Kemik P. 2016 | Endometrial Cancer | Gynecologic Oncology | Turkey | Prospective cohort study | 50 | 50 | 36-85 | 72 menopausal women | Serum |
| Duvlis S. 2015 | Cervical Cancer | Journal of Medical Virology | Macedonia | Prospective cohort study | 258 | 148 | 10-78 | Cytological: 26 ASC-US, 81 LSIL, 41 HSIL | Cervical specimen |
| Kan Y. 2014 | Cervical Cancer | International Journal of Gynecological Cancer | China | Cross-sectional study | 247 | 172 | 21-90 | Cytological: 31 HSIL, 38 LSIL, 19 AGC, 84 ASCUS | Cervical specimen |

| Author + Date | Method | Biomarker | Mechanism | Total number of participants | Number of patients with Cancer | Pre-lesion | Controls | Non-Neoplastic findings | Other cancers | P-value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Kemik P. 2016 | ELISA | YKL-40 | Human chitinase-3 like protein | 100 | 50 | NK | 50 | 20 hypertension, 13 diabetes, 4 with family history of endometrial cancer | NK | P<0,0001 |
|  |  | HE-4 | Human epididymis protein-4 |  |  |  |  |  |  | P<0,0001 |
| Duvlis S. 2015 | RT-PCR | HPV DNA Test | HPV Viral infection | 413 | 0 | Pathological: 22 CIN 1, 20 CIN 2, 9 CIN 3 | 258 | NK | NK | P<0,0001 |
|  |  | HPV E6/E7 mRNA Test |  |  |  |  |  |  |  | P<0,0001 |
| Kan Y. 2014 | RT-PCR | PAX1m | Hypermethylated genes | 419 | 4 | Pathological: 76 CIN 1, 7 CIN 2, 32 CIN 3 | 247 | NK | NK | P<0,0001 |
|  |  | SOX1m | Hypermethylated genes |  |  |  |  |  |  | P<0,0001 |
|  |  | NKX6-1 | Hypermethylated genes |  |  |  |  |  |  | P<0,0001 |

| Biomarker | Sensitivity (%) | 95% CI (sensitivity) | Specificity (%) | 95% CI (specificity) | AUC | PPV (%) | NPV (%) | Blinded | Quality Assessment (Fowkes checklist) | Reference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| YKL-40 | 90 | NK | 54 | NK | 0,823 | NK | NK | NA | + | [95] |
| HE-4 | 90 | NK | 42 | NK | 0,882 | NK | NK |  |  |  |
| HPV DNA Test | 100 | 85-100 | 18,7 | 7-37 | NK | 52,7 | 100 | NA | ++ | [96] |
| HPV E6/E7 mRNA Test | 93,1 | 76-98 | 50 | 32-67 | NK | 62,8 | 88,9 |  |  |  |
| PAX1m | CIN 3: 92 | NK | CIN 3: 83 | NK | CIN 3: 0,97 | NK | NK | NA | ++ | [97] |
| SOX1m | CIN 3: 92 | NK | CIN 3: 30 | NK | CIN 3: 0,76 | NK | NK |  |  |  |
| NKX6-1 | CIN 3: 54 | NK | CIN 3: 82 | NK | CIN 3: 0,73 | NK | NK |  |  |  |

Table 11: Bladder cancer

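| Author + Date | Journal | Location | Study Design | Healthy patients | Persons with Symptoms | Age Range | Males/Females | Other characteristics | Specimen |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Chung W. 2011 | Cancer Epidemiology, Biomarkers & Prevention | USA | Case-control study | 110 | 128 | 35-90 | 101/27 (bladder cancer) | NK | Urine sediment |
| Eissa S. 2013 | Disease Markers | Egypt | Cohort study | 20 | 66 | 25-83 | 35/11 | NK | Urine sediment |
| Lai Y. 2010 | Journal of Urology | China | Cohort study | 32 | 166 | NK | 129/69 | NK | Urine samples |
| Renard I. 2009 | European Urology | Belgium | Prospective cohort study | 339 | 157 | 30-87 | NK | BPH: 55, Urinary incontinence: 17, Urinary lithiasis: 15, Pelvic organ prolapse: 12, OAB: 5, Hydrocele: 4, Urethral stenosis: 2, Renal angiomyolipoma: 2 | Urine samples |

| Author + Date | Method | Biomarker | Mechanism | Total number of participants | Number of patients with Bladder Cancer | Benign Urological Lesion | Controls | Non-Neoplastic findings | Other cancers | P-value | Sensitivity (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Chung W. 2011 | RT-PCR | MYO3A | Myosin IIIa | 238 | 138 | 0 | 110 | NK | NK | P<0,0001 | 77,3 |
|  |  | CA-10 | Carbonic anhydrase-related gene |  |  |  |  |  |  |  | 85,2 |
|  |  | NKX-6-2 | NK6 Homeobox protein |  |  |  |  |  |  |  | 88,3 |
|  |  | PENK | Proenkephalin |  |  |  |  |  |  |  | 81,3 |
|  |  | SOX-11 | SRY-related HMG-box 11 |  |  |  |  |  |  |  | 70,3 |
|  |  | DBC-1 | Deleted in breast cancer 1 |  |  |  |  |  |  |  | 71,1 |
|  |  | NPTX-2 | Neuronal pentraxin 2 |  |  |  |  |  |  |  | 75,8 |
|  |  | A2BP-1 | Ataxin 2-binding protein 1 |  |  |  |  |  |  |  | 87,5 |
|  |  | MYO3A + CA-10 + NKX-6-2 + DBC-1 + SOX-11 | Biomarker panel |  |  |  |  |  |  |  | 85,2 |
| Eissa S. 2013 | RT-PCR | Cytology | Cytology analysis | 86 | 46 | 20 | 20 | NK | NK | P<0,001 | 50 |
|  |  | Survivin | Baculoviral IAP Repeat-containing 5 |  |  |  |  |  |  |  | 76,1 |
|  |  | MMP-Zymography | MMP-Zymography |  |  |  |  |  |  |  | 67,3 |
|  |  | Cytology + Survivin | Biomarker panel |  |  |  |  |  |  |  | 84,7 |
|  |  | Cytology + MMP-Zymography | Biomarker panel |  |  |  |  |  |  |  | 84,7 |
|  |  | Survivin + MMP-Zymography | Biomarker panel |  |  |  |  |  |  |  | 91,3 |
|  |  | Cytology + Survivin + MMP-Zymography | Biomarker panel |  |  |  |  |  |  |  | 95,6 |
| Lai Y. 2010 | ELISA | UKP3-A | Human Uroplakin-3 | 198 | 122 | 44 | 32 | NK | NK | P<0,01 | 83 |
|  |  | NMP-22 | Nuclear matrix protein 22 |  |  |  |  |  |  |  | 58 |
|  |  | Cytology | Cytology analysis |  |  |  |  |  |  |  | 64 |
| Renard I. 2009 | Methylation-specific PCR | TWIST-1 + NID-2 | Gene methylation | 496 | 157 | 0 | 339 | NK | NK | P<0,0001 | Training set: 88; Validation set: 94 |
|  |  | Cytology | Cytology analysis |  |  |  |  |  |  |  | Training set: 48; Validation set: 49 |
|  |  | TWIST-1 + NID-2 + Cytology | Biomarker panel |  |  |  |  |  |  |  | Training set: 96; Validation set: 97 |

| Biomarker | 95% CI (sensitivity) | Specificity (%) | 95% CI (specificity) | AUC | 95% CI (AUC) | PPV (%) | NPV (%) | Blinded | Quality Assessment (Fowkes checklist) | Reference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MYO3A | NK | 90,9 | NK | 0,841 | NK | NK | NK | NA | 0 | [98] |
| CA-10 | NK | 81,8 | NK | 0,835 | NK | NK | NK |  |  |  |
| NKX-6-2 | NK | 76,4 | NK | 0,823 | NK | NK | NK |  |  |  |
| PENK | NK | 79,1 | NK | 0,802 | NK | NK | NK |  |  |  |
| SOX-11 | NK | 89,1 | NK | 0,797 | NK | NK | NK |  |  |  |
| DBC-1 | NK | 83,6 | NK | 0,774 | NK | NK | NK |  |  |  |
| NPTX-2 | NK | 73,6 | NK | 0,747 | NK | NK | NK |  |  |  |
| A2BP-1 | NK | 54,5 | NK | 0,71 | NK | NK | NK |  |  |  |
| MYO3A + CA-10 + NKX-6-2 + DBC-1 + SOX-11 | NK | 94,5 | NK | 0,939 | NK | NK | NK |  |  |  |
| Cytology | NK | 100 | NK | NK | NK | 100 | 63,45 | NA | 0 | [99] |
| Survivin | NK | 95 | NK | NK | NK | 94,5 | 77,5 |  |  |  |
| MMP-Zymography | NK | 90 | NK | NK | NK | 88,5 | 70,5 |  |  |  |
| Cytology + Survivin | NK | 95 | NK | NK | NK | 95,1 | 84,4 |  |  |  |
| Cytology + MMP-Zymography | NK | 90 | NK | NK | NK | 90,6 | 83,7 |  |  |  |
| Survivin + MMP-Zymography | NK | 85 | NK | NK | NK | 87,5 | 89,4 |  |  |  |
| Cytology + Survivin + MMP-Zymography | NK | 85 | NK | NK | NK | 88 | 94,4 |  |  |  |
| UKP3-A | NK | 83 | NK | 0,907 | 0,867-0,974 | 88 | 75 | NA | 0 | [100] |
| NMP-22 | NK | 75 | NK | NK | NK | 79 | 53 |  |  |  |
| Cytology | NK | 82 | NK | NK | NK | 85 | 58 |  |  |  |
| TWIST-1 + NID-2 | Training set: 78-97; Validation set: 87-102 | Training set: 94; Validation set: 91 | Training set: 90-98; Validation set: 84-99 | 0,93 | 0,90-0,96 | 86 | 95 | NA | 0 | [101] |
| Cytology | Training set: 34-62; Validation set: 33-65 | Training set: 97; Validation set: 95 | Training set: 94-100; Validation set: 89-101 | NK | NK | 85 | 80 |  |  |  |
| TWIST-1 + NID-2 + Cytology | Training set: 90-101; Validation set: 92-103 | Training set: 93; Validation set: 86 | Training set: 88-97; Validation set: 77-95 | NK | NK | 83 | 98 |  |  |  |

Table 12: Oral cancer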

| Author + Date | Journal | Location | Study Design | Healthy patients | Persons with Symptoms | Age Range | Males/Females | Other characteristics | Specimen |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Hutajulu S. 2011 | Molecular Cancer | The Netherlands | Retrospective case-control study | 25 | 53 | NK | NK | 22 high-risk patients with symptoms in head and neck region and elevated EBV IgA seroreactivity | Nasopharyngeal brushing |
| Vokac N. 2014 | Molecular Cytogenetics | Slovenia | Observational study | 22 | 71 | 36-84 | 54/17 (with OSCC) | NK | Brush biopsies |
| Holzinger D. 2017 | International Journal of Cancer | Germany | Multi-center cohort study | 0 | 214 | 39-95 | 159/55 | 120 Oropharynx cancer, 94 outside the oropharynx | Serum |
| Rajkumar K. 2015 | Oral Diseases | India | Case-control study | 100 | 200 | 21-90 | 204/96 | NK | Serum |

| Author + Date | Method | Biomarker | Mechanism | Total number of participants | Number of patients with Nasopharyngeal/Oral cancer | High-Risk patients | Controls | Non-Neoplastic findings | Other cancers | P-value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Hutajulu S. 2011 | ELISA | 5 tumor suppressor genes-panel: CHFR + RIZ1 + WIF1 + p16 + RASSF1A | Promoter methylation of tumor suppressor genes | 100 | 53 | 22 | 25 | NK | NK | P<0,05 |
| Vokac N. 2014 | FISH | TERC + SOX2 + Centromere 3-control probe | Gene amplification | 93 | 71 | NK | 22 | NK | NK | P<0,05 |
| Holzinger D. 2017 | RT-PCR | HPV-16 E6 | Oncoprotein antibody | 214 | 214 | NK | NK | HPV-: 142, HPV+: 72 | NK | P<0,05 |
|  |  | p16 (in subcohort) | Tumor suppressor protein |  |  |  |  |  |  |  |
| Rajkumar K. 2015 | ELISA | CYFRA-21-1 | Cytokeratin fraction | 300 | 100 | 100: 50 with leukoplakia and 50 with OSMF | 100 | NK | NK | P<0,01 |

| Biomarker | Sensitivity (%) | 95% CI (sensitivity) | Specificity (%) | 95% CI (specificity) | AUC | DOR | 95% CI (DOR) | Blinded | Quality Assessment (Fowkes checklist) | Reference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 5 tumor suppressor genes-panel | 91 | NK | 96 | NK | NK | NK | NK | NA | ++ | [102] |
| TERC + SOX2 + Centromere 3-control probe | NK | NK | NK | NK | NK | 1,29 | 0,39-4,23 | NA | ++ | [103] |
| HPV-16 E6 | 97 | 90-99 | 98 | 90-100 | NK | NK | NK | NA | ++ | [105] |
| p16 (in subcohort) | 97 | NK | 65 | NK | NK | NK | NK |  |  |  |
| CYFRA-21-1 | 83,6 | NK | 95 | NK | 0,899 | NK | NK | NA | + | [104] |

Table 13: Esophageal cancer

| Author + Date | Journal | Location | Study Design | Healthy patients | Persons with Symptoms | Age Range | Males/Females | Other characteristics | Specimen | Method |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Guo R. 2011 | Journal of Cancer Research and Clinical Oncology | China | Cohort study | 95 | 78 | NK | NK | NK | Serum | MALDI-TOF-MS |
| Xu Y. 2017 | Clinical and Translational Oncology | China | Cohort study | 141 | 238 | 38-88 | 277/103 | NK | Serum | ELISA |

| Author + Date | Biomarker | Mechanism | Total number of participants | Number of patients with Esophageal Cancer | Controls | Non-Neoplastic findings | Other cancers | P-value | Sensitivity (%) | 95% CI (sensitivity) | Specificity (%) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Guo R. 2011 | Proteomic pattern analysis | Protein biomarker | 173 | 78 | 78 | NK | NK | P<0,01 | Training set: 92,5; Blind test set: 88 | NK | Training set: 89,5; Blind test set: 84,4 |
| Xu Y. 2017 | L1CAM Auto-antibodies | L1-cell adhesion molecule | 379 | 238 | 141 | NK | NK | P<0,05 | Cohort 1: 26,2; Cohort 2: 27,7 | NK | Cohort 1: 90,4; Cohort 2: 91,5 |

| Biomarker | 95% CI (specificity) | AUC | 95% CI (AUC) | PLR | NLR | PPV (%) | NPV (%) | Blinded | Quality Assessment (Fowkes checklist) | Reference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Proteomic pattern analysis | NK | NK | NK | NK | NK | NK | NK | Yes | 0 | [106] |
| L1CAM Auto-antibodies | NK | Cohort 1: 0,603; Cohort 2: 0,628 | Cohort 1: 0,535-0,672; Cohort 2: 0,516-0,741 | Cohort 1: 2,74; Cohort 2: 3,25 | Cohort 1: 0,82; Cohort 2: 0,79 | Cohort 1: 84,8; Cohort 2: 76,5 | Cohort 1: 37,7; Cohort 2: 55,9 | NA | 0 | [107] |

Table 14: Skin cancer

| Author + Date | Journal | Location | Study Design | Healthy patients | Persons with Symptoms | Age Range | Males/Females | Other characteristics |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Wachsman W. 2011 | British Journal of Dermatology | USA | Cohort study | 126 | 76 | 19-95 | 117/58 | NK |

| Author + Date | Specimen | Method | Biomarker | Mechanism | Total number of participants | Number of patients with Melanoma | Naevi | Controls | Non-Neoplastic findings | Other cancers |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Wachsman W. 2011 | Cells from stratum corneum using tape stripping | EGIR: epidermal genetic information retriever and RT-PCR | 17-gene classifier | Biomarker classifier | 206 | 76 | 126 | 0 | NK | NK |

| Biomarker | P-value | Sensitivity (%) | 95% CI (sensitivity) | Specificity (%) | 95% CI (specificity) | AUC | 95% CI (AUC) | Blinded | Quality Assessment (Fowkes checklist) | Reference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 17-gene classifier | P<0,001 | Training set: 100; Test set: 100 | NK | Training set: 95; Test set: 88 | NK | 0,955 | NK |  | 0 | [108] |

Table 15: Osteosarcoma

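| Author + Date | Journal | Location | Study Design | Healthy patients | Persons with Symptoms | Age Range | Males/Females | Other characteristics |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Ouyang L. 2013 | Medical Oncology | USA | Case-control study | 80 | 80 | NK | 127/33 | NK |

| Author + Date | Specimen | Method | Biomarker | Mechanism | Total number of participants | Number of patients with Osteosarcoma | Controls | Non-Neoplastic findings | Other cancers | P-value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Ouyang L. 2013 | Plasma | RT-PCR | miRNA-21 | miRNA | 160 | 80 | 80 | NK | NK | P<0,01 |
|  |  |  | miRNA-199a-3p | miRNA |  |  |  |  |  |  |
|  |  |  | miRNA-143 | miRNA |  |  |  |  |  |  |
|  |  |  | miRNA-21 + miRNA-199a-3p + miRNA-143 | Biomarker panel |  |  |  |  |  |  |
|  |  |  | BALP | Bone alkaline phosphatase |  |  |  |  |  |  |

| Biomarker | Sensitivity (%) | 95% CI (sensitivity) | Specificity (%) | 95% CI (specificity) | AUC | 95% CI (AUC) | Blinded | Quality Assessment (Fowkes checklist) | Reference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| miRNA-21 | NK | NK | NK | NK | 0,863 | 0,818-0,908 | NA | 0 | [109] |
| miRNA-199a-3p | NK | NK | NK | NK | 0,918 | 0,882-0,954 |  |  |  |
| miRNA-143 | NK | NK | NK | NK | 0,902 | 0,864-0,940 |  |  |  |
| miRNA-21 + miRNA-199a-3p + miRNA-143 | 90,5 | NK | 93,8 | NK | 0,953 | 0,924-0,984 |  |  |  |
| BALP | NK | NK | NK | NK | 0,922 | 0,88-0,964 |  |  |  |

Table 16: Thyroid cancer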

| Author + Date | Journal | Location | Study Design | Healthy patients | Persons with Symptoms | Age Range | Males/Females | Other characteristics |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Herrmann B. 2010 | European Journal of Endocrinology | Germany | Retrospective cohort study | 0 | 1007 | NK | 440/567 | NK |

| Author + Date | Specimen | Method | Biomarker | Mechanism | Total number of participants | Number of patients with MTC (medullary thyroid cancer) | Patients with thyroid nodule disease | Non-Neoplastic findings | Other cancers | P-value |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Herrmann B. 2010 | Serum | Chemiluminescent assay | hCT | Serum calcitonin | 1007 | 2 | 1000 | 5 C-Cell hyperplasia | NK | NK |

| Biomarker | Sensitivity (%) | 95% CI (sensitivity) | Specificity (%) | 95% CI (specificity) | AUC | 95% CI (AUC) | Blinded | Quality Assessment (Fowkes checklist) | Reference |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| hCT | NK | NK | NK | NK | NK | NK | NK | ++ | [110] |

Table 17: Leukemia

Author + Date: Papageorgiou S. 2011
Journal: The Oncologist
Location: Greece
Study Design: Case-control study
Study Population: Healthy patients: 23; Persons with Symptoms: 65; Age Range: 48-87; Males/Females: 55/33; Other characteristics: NK
Specimen: Serum (RNA)
Method: RT-PCR
Number of Patients included: Total number of participants: 88; Number of patients with CLL: 65; Controls: 23; CD38+ status: 5; LDH Abnormal: 14; IGHV Mutated: 23
P-value: P<0,001
Blinded: NA
Quality Assessment (Fowkes checklist): 0
Reference: [111]

Biomarker (Mechanism): Sensitivity (%) (95% CI); Specificity (%) (95% CI); AUC (95% CI); DOR (95% CI)
BCL2L12 mRNA expression (Anti-apoptosis): NK (NK); NK (NK); 0,833 (0,731-0,935); 4,52 (2,11-9,65)
BCL2 mRNA expression (Anti-apoptosis): NK (NK); NK (NK); 0,776 (0,656-0,896); 4,48 (2,06-11,2)

Table 18: Various types of cancer

Author + Date: Wen Y-H. 2015
Type of Cancer: Prostate, Hepatocellular, Pancreatic, Colorectal, Lung, Bladder, Cervical, Gastric, Breast and Ovarian Cancer; All types of cancer
Journal: Clinica Chimica Acta
Location: China
Study Design: Screening study
Study Population: Healthy patients: 41202; Persons with Symptoms: 314; Age Range: 20-93; Males/Females: 19998/21518; Other characteristics: NK
Specimen: Serum
Method: Microassay
Panel (Biomarker panel): PSA + AFP + CEA + CA 19-9 + CYFRA 21-1 + CA125 + SCC + CA 15-3
Biomarkers (Mechanism): PSA (Prostate-specific antigen); AFP (Alpha-fetoprotein); CEA (Carcinoembryonic Antigen); CA 19-9 (Cancer Antigen); CYFRA 21-1 (Cytokeratin fragment); CA125 (Cancer Antigen); SCC (Squamous cell-specific antigen); CA 15-3 (Cancer Antigen)
P-value: P<0,05 for every biomarker and panel listed
Specificity (%), AUC, PLR, PPV (%), NPV (%): NK, except for the panel in all types of cancer
Blinded: NA
Quality Assessment (Fowkes checklist): ++
Reference: [112]

Type of Cancer (Total number of participants; Number of patients with Cancer; Neoplastic findings; Controls; Non-Neoplastic findings): Sensitivity (%) per biomarker
Prostate Cancer (19998; 24; NK; 19974; NK): PSA 100; Panel 100
Hepatocellular Cancer (41516; 26; NK; 41202; NK): AFP 63,3; CA 19-9 31,6; Panel 92,3
Pancreatic Cancer (41516; 9; NK; 41202; NK): CEA 55,6; CA 19-9 62,5; CYFRA 21-1 33,3; CA125 66,7; Panel 88,9
Colorectal Cancer (41516; 26; NK; 41202; NK): CEA 53,8; CA 19-9 25; CYFRA 21-1 38,9; CA125 22,2; CA 15-3 12,5; Panel 76,9
Lung Cancer (41516; 36; NK; 41202; NK): CEA 72,2; CYFRA 21-1 40,9; CA125 20; CA 15-3 20; Panel 75
Bladder Cancer (41516; 14; NK; 41202; NK): PSA 25; CEA 33,3; CA 19-9 69,2; CYFRA 21-1 57,1; CA125 50; SCC 60; Panel 64,3
Cervical Cancer (21518; 27; NK; NK; NK): CEA 20,8; CA125 30,4; SCC 20,8; Panel 44,4
Gastric Cancer (41516; 18; NK; NK; NK): CEA 25; CYFRA 21-1 41,7; Panel 38,9
Breast Cancer (21518; 40; NK; NK; NK): CA125 20,5; Panel 37,5
Ovarian Cancer (21518; 3; NK; NK; NK): CA 19-9 50; Panel 33,3
All types of cancer (41516; 314; NK; NK; NK): Panel 57; Specificity (%) 88,7; AUC NK; PLR NK; PPV (%) 3,7; NPV (%) 99,6

Author + Date: Chen Z. 2011
Type of Cancer: All types of cancer
Journal: Sensors
Location: China
Study Design: Screening study
Study Population: Healthy patients: 34645; Persons with Symptoms: 720; Age Range: NK; Males/Females: NK; Other characteristics: NK
Specimen: Serum
Method: Enhanced chemiluminescent dot blot assay
Biomarker: TK-1
Mechanism: Thymidine-kinase 1
Number of Patients included: Total number of participants: 35365; Number of patients with Cancer: 720; Neoplastic findings: NK; Controls: 34645; Non-Neoplastic findings: NK
P-value: P<0,001
Sensitivity (%): 79,8
Specificity (%): 99,7
AUC: 0,96
PLR: 233,73
PPV (%): NK
NPV (%): NK
Blinded: NA
Quality Assessment (Fowkes checklist): +
Reference: [114]

Author + Date: Wang Y. 2016
Type of Cancer: All types of cancer
Journal: Cancer Biomarkers
Location: China
Study Design: Retrospective cohort study
Study Population: Healthy patients: 56234; Persons with Symptoms: 89; Age Range: 21-85; Males/Females: 38816/17362; Other characteristics: NK
Specimen: Serum
Method: Enhanced chemiluminescent dot blot assay
Number of Patients included: Total number of participants: 56285; Number of patients with Cancer: 89; Neoplastic findings: NK; Controls: 56234; Non-Neoplastic findings: NK
P-value: P<0,05 for every biomarker and panel listed
Specificity (%), AUC, PLR, PPV (%), NPV (%): NK
Blinded: NA
Quality Assessment (Fowkes checklist): ++
Reference: [113]

Biomarker (Mechanism): Sensitivity (%)
TK-1 (Thymidine-kinase 1): 54,2
TK-1 + AFP (Biomarker panel): 62,6
TK-1 + CEA (Biomarker panel): 67,3
TK-1 + AFP + CEA (Biomarker panel): 72,3
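
The predictive values reported for the eight-marker panel of Wen Y-H. 2015 across all types of cancer (sensitivity 57, specificity 88,7, PPV 3,7, NPV 99,6) can be related to each other with a short calculation. The sketch below is illustrative only. It assumes that PPV and NPV follow from Bayes' rule and that prevalence equals the share of cancer cases in the screened population (314 of 41516 participants); the source does not state how the published values were calculated, and the function name predictive_values is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) as fractions; all inputs are fractions between 0 and 1."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Table 18, panel for all types of cancer (decimal commas written as points).
# Assumption: prevalence = cancer cases / total participants.
prevalence = 314 / 41516
ppv, npv = predictive_values(0.57, 0.887, prevalence)
print(f"PPV = {ppv * 100:.1f} %, NPV = {npv * 100:.1f} %")
```

Under these assumptions the calculation gives a PPV of about 3.7 % and an NPV of about 99.6 %, in line with the tabulated values. It also shows why a panel with moderate sensitivity and specificity yields a low PPV in a screening population where fewer than 1 % of participants have cancer.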