Bioinformatics Analysis for Circulating Cell-Free DNA in Cancer

cancers Review Bioinformatics Analysis for Circulating Cell-Free DNA in Cancer Chiang-Ching Huang 1,*, Meijun Du 2 and Liang Wang 2,* 1 Zilber School of Public Health, University of Wisconsin, Milwaukee, WI 53205, USA 2 Department of Pathology and MCW Cancer Center, Medical College of Wisconsin, Milwaukee, WI 53226, USA; [email protected] * Correspondence: [email protected] (C.-C.H.); [email protected] (L.W.); Tel.: +1-414-227-5006 (C.-C.H.); +1-414-955-2574 (L.W.) Received: 22 April 2019; Accepted: 6 June 2019; Published: 11 June 2019 Abstract: Molecular analysis of cell-free DNA (cfDNA) that circulates in plasma and other body fluids represents a “liquid biopsy” approach for non-invasive cancer screening or monitoring. The rapid development of sequencing technologies has made cfDNA a promising source to study cancer development and progression. Specific genetic and epigenetic alterations have been found in plasma, serum, and urine cfDNA and could potentially be used as diagnostic or prognostic biomarkers in various cancer types. In this review, we will discuss the molecular characteristics of cancer cfDNA and major bioinformatics approaches involved in the analysis of cfDNA sequencing data for detecting genetic mutation, copy number alteration, methylation change, and nucleosome positioning variation. We highlight specific challenges in sensitivity to detect genetic aberrations and robustness of statistical analysis. Finally, we provide perspectives regarding the standard and continuing development of bioinformatics analysis to move this promising screening tool into clinical practice. Keywords: bioinformatics; copy number variation; cell-free DNA; methylation; mutation; next generation sequencing 1. Introduction To date, tissue biopsy samples are widely used to characterize tumors. Although tissues allow the histological definition of the disease and can reveal details of the genetic profile of the tumor, enabling prediction of disease progression and response to therapies, the applications are limited on tissue availability, sampling frequency, and their genetic heterogeneity [1]. Therefore, attention is turning to liquid biopsies, which enable the analysis of tumor components, including circulating tumor cells (CTC) [2] and circulating tumor nucleic acids from various biological fluids, mostly blood but also other easily accessible fluids such as urine [3]. Compared to conventional tissue biopsy from a single tumor site, the main advantages of liquid biopsies include their non-invasive characteristics, multiple sampling capability, and comprehensive coverage to address issues of tumor heterogeneity [4,5]. Circulating cell-free DNA (cfDNA) is defined as extracellular DNA occurring in blood or other body fluids. It is usually released as small fragments (150–200 bp in length [6]) from normal or tumor cells by apoptosis and necrosis [7], or shed from viable cells [8]. Levels of cfDNA are higher in diseased than healthy individuals [9]. cfDNA can track the evolutionary dynamics and heterogeneity of tumors and detect the early emergence of therapy resistance, residual disease, and recurrence [10–12]. Therefore, analysis of cfDNA has been considered as a potential screening approach for tumor diagnosis and prognosis by detecting tumor-associated aberrations in peripheral blood [13,14]. Next generation sequencing (NGS) has emerged as a powerful tool for cfDNA analysis, which allows the detection of cancer-related genetic and epigenetic alterations such as mutations, copy number Cancers 2019, 11, 805; doi:10.3390/cancers11060805 www.mdpi.com/journal/cancers Cancers 2019, 11, x 2 of 14 Cancers 2019, 11, 805 2 of 15 Next generation sequencing (NGS) has emerged as a powerful tool for cfDNA analysis, which allows the detection of cancer-related genetic and epigenetic alterations such as mutations, copy variationsnumber variations (CNVs), (CNVs), and DNA and methylation DNA methylation changes changes across wider across genomic wider genomic regions regions in many in cancer many typescancer [ 15types,16]. However,[15,16]. However, detection detection of cancer of with canc higher with specificity high andspecificity sensitivity and issensitivity still challenging, is still especiallychallenging, in especially early-stage in cancers, early-stage as there cancers, exist as many there barriersexist many to thebarriers utilization to the ofutilization cfDNA inof clinicalcfDNA applications,in clinical applications, including lackincluding of well-accepted lack of well sample-accepted collection sample protocolcollection and protocol sensitive and detectionsensitive approaches.detection approaches. Furthermore, Furthermore, analysis of cfDNA analysis sequencing of cfDNA data requiressequencing specialized data requires bioinformatics specialized tools tobioinformatics identify robust tools biomarkers to identify for robust clinical biomarkers practice. In for this clinical review, practice. we will In discuss this review, specific we challenges will discuss in sensitivityspecific challenges to detect geneticin sensitivity aberrations to detect and providegenetic informationaberrations onand cfDNA provide bioinformatics information approaches. on cfDNA Webioinformatics conclude with approaches. a perspective We co regardingnclude with future a perspective development regarding in this future rapidly development evolving area. in this A simplifiedrapidly evolving workflow area. of A blood-based simplified workflow liquid biopsy of blood-based is shown in liquid Figure biopsy1. is shown in Figure 1. Figure 1. WorkflowWorkflow of blood-based liquid biopsy.biopsy. 2. Characteristics of Circulating Tumor DNA (ctDNA) 2. Characteristics of Circulating Tumor DNA (ctDNA) The ctDNA is released from tumor cells only. The ctDNA can be derived from primary or metastaticThe ctDNA tumors is [ 17released]. Most from circulating tumor ctDNAcells only. are 160–180The ctDNA base can pair be fragments, derived from roughly primary the size or ofmetastatic a mononucleosomal tumors [17]. Most unit [circulating18,19]. However, ctDNA recentare 160–180 studies base have pair shown fragments, that ctDNA roughly tends the size to beof shortera mononucleosomal than cfDNA fromunit [18,19]. normal However, cells [20,21 recent]. Therefore, studies ctDNA have shown may be that enriched ctDNA by tends excising to be smallershorter DNAthan cfDNA fragments from from normal cfDNA cells on [20,21]. polyacrylamide Therefore, gels ctDNA [22]. may Currently, be enriched cfDNA by fragmentation excising smaller patterns DNA andfragments their applications from cfDNA in on liquid polyacrylamide biopsy are an gels emerging [22]. Currently, research field.cfDNA Although fragmentation ctDNA patterns can be used and totheir detect applications the presence in liquid of cancer-related biopsy are an genetic emergi andng research epigenetic field. changes, Although such ctDNA changes can usually be used vary to fromdetect case the topresence case, which of cancer-related makes the development genetic and ofepig sensitiveenetic changes, and generalizable such changes approaches usually extremelyvary from challenging.case to case, Onewhich major makes challenge the development is low ctDNA of fraction.sensitive In and most generalizable cases, ctDNA approaches accounts forextremely a small fractionchallenging. of total One cfDNA major sincechallenge most is cfDNA low ctDNA is derived fraction. from In non-cancer most cases, cells, ctDNA especially accounts blood for cells. a small In early-stagefraction of total cancer cfDNA patients, since ctDNA most fractioncfDNA is could derived be lower from than non-cancer 0.1%. To cells, detect especially such a rare blood event cells. with In highearly-stage specificity cancer and patients, sensitivity, ctDNA a variety fraction of approaches could be lower have beenthan developed,0.1%. To detect which such include a rare droplet event digitalwith high PCR specificity (ddPCR) andand molecularsensitivity, index-based a variety of next approaches generation have sequencing been developed, technologies which [23 include,24]. droplet digital PCR (ddPCR) and molecular index-based next generation sequencing technologies 3.[23,24]. Detection and Analysis of Somatic Mutations Somatic mutations are involved in cancer development and progression. The presence or absence 3. Detection and Analysis of Somatic Mutations of a single genetic alteration in tumor DNA is currently employed to guide clinical decision making for a numberSomatic of targeted mutations agents are [ 25involved–28]. Ever-increasing in cancer deve numberslopment of genomicand progression. alterations The are beingpresence tested or asabsence putative of a predictive single genetic biomarkers alteration in in clinical tumor trials DNA of is novel currently anticancer employed therapies to guide [29 clinical]. To detect decision the cancer-associatedmaking for a number alleles of targeted in the blood, agents real-time [25–28]. PCR Ever-increasing (RT-PCR) and nu ddPCRmbers of “targeted” genomic methodsalterations have are beenbeing extensively tested as putative adopted predictive in most clinical biomarkers trials [ 30in]. clinical Till now, trials clinical of novel utility anticancer has been therapies demonstrated [29]. forTo twodetect FDA-approved the cancer-associated cfDNA-based alleles tests: in the cobasblood, epidermal real-time growthPCR (RT-PCR) factor receptor and ddPCR (EGFR) “targeted” mutation testmethods V2 (Roche have Molecularbeen extensively Diagnostics), adopted which in most detects clinical

Bioinformatics Analysis for Circulating Cell-Free DNA in Cancer

IJC Sequencing Coverage and Quality Statistics

Detection and Characterization of Low and High Genome Coverage Regions Using an Efﬁcient Running Median and a Double Threshold Approach

Transcriptome Analysis Using Next-Generation Sequencing

An Introduction to Next-Generation Sequencing Technology

Information Theory of DNA Shotgun Sequencing Abolfazl S

1 G&T-Seq: Separation and Parallel Sequencing of the Genomes And

Comparison of Transcriptomics Technologies for Assessment of Staphylococcus Aureus Gene Expression

Exome Versus Transcriptome Sequencing in Identifying Coding Region Variants

Cfdna Sequencing: Technological Approaches and Bioinformatic Issues

Benchmarking Full-Length Transcript Single Cell Mrna Sequencing Protocols

Whole Exome and Whole Genome Sequencing

Guidelines for the Experimental Design of Single-Cell RNA Sequencing Studies