( 12 ) Patent Application Publication ( 10 ) Pub . No .: US 2021/0005284 A1 Nuzhdina Et Al
Total Page:16
File Type:pdf, Size:1020Kb
US 20210005284A1 IN ( 19 ) United States ( 12 ) Patent Application Publication ( 10 ) Pub . No .: US 2021/0005284 A1 Nuzhdina et al . ( 43 ) Pub . Date : Jan. 7 , 2021 ( 54 ) TECHNIQUES FOR NUCLEIC ACID DATA G16B 25/10 ( 2006.01 ) QUALITY CONTROL G16B 35/10 ( 2006.01 ) ( 52 ) U.S. CI . ( 71 ) Applicant: BostonGene Corporation , Waltham , CPC G16B 30/10 ( 2019.02 ) ; G16B 35/10 MA ( US ) ( 2019.02 ) ; G16B 25/10 ( 2019.02 ) ; G16B ( 72 ) Inventors : Ekaterina Nuzhdina , Moscow ( RU ); 20/20 ( 2019.02 ) Alexander Bagaev , Moscow ( RU ); Maksim Chelushkin , Moscow (RU ) ; Yaroslav Lozinsky, Moscow ( RU ); ( 57 ) ABSTRACT Natalia Miheecheva , Moscow ( RU ) ; Alexander Zaitsev , Leninskiy R - N (RU ) Described herein are various methods of collecting and processing of tumor and /or healthy tissue samples to extract ( 21 ) Appl . No .: 16 / 920,641 nucleic acid and perform nucleic acid sequencing . Also described herein are various methods of processing nucleic ( 22 ) Filed : Jul . 3 , 2020 acid sequencing data to remove bias from the nucleic acid sequencing data . Also described herein are various methods Related U.S. Application Data of evaluating the quality of nucleic acid sequence informa ( 60 ) Provisional application No. 62 / 991,570 , filed on Mar. tion . The identity and / or integrity of nucleic acid sequence 18 , 2020 , provisional application No. 62 /870,622 , data is evaluated prior to using the sequence information for filed on Jul. 3 , 2019 . subsequent analysis ( for example for diagnostic , prognostic , or clinical purposes ). The methods enable a subject, doctor, Publication Classification or user to characterize or classify various types of cancer ( 51 ) Int . Cl . precisely , and thereby determine a therapy or combination of G16B 30/10 ( 2006.01 ) therapies that may be effective to treat a cancer in a subject G16B 20/20 ( 2006.01 ) based on the precise characterization . 110 BEGIN Obtain a biological sample ( e.g. , a tumor sampte ) from a subject 111 having, suspected of having , or at risk of having cancer Extract nucleic acid from the biological sample 112 Perform one or more quality control assessments of extracted nucleic 113 acid Prepare nucleic acid library with extracted nucleic acid -114 that satisfies nucleic acid quality control Sequence DNA and / or RNA nucleic acid , -115 obtain RNA expression data Process RNA expression data to obtain bias - corrected gene 116 expression data Perform one or more quality control assessmients on gene -117 expression data Analyze gene expression data 118 Identify a cancer treatment for the subject based on 119 the gene expression data End Patent Application Publication Jan. 7 , 2021 Sheet 1 of 22 US 2021/0005284 A1 BEGIN 200100 Obtain a sample of a tumor from a subject having , suspected of 101 having , or at risk of having cancer 102 Quality control step 1 - Perform one or more quality control assessments on tumor sample Quality control step 2 - Perform one or more quality control assessments on 103 DNA , RNA , and / or libraries processes Quality control step 3 - Perform one or more quality control 104 assessments on nucleic acid sequence data End FIG . 1A Patent Application Publication Jan. 7 , 2021 Sheet 2 of 22 US 2021/0005284 A1 110 BEGIN Obtain a biological sample ( e.g., a tumor sample ) from a subject 111 having , suspected of having, or at risk of having cancer Extract nucleic acid from the biological sample 112 Perform one or more quality control assessments of extracted nucleic 113 acid Prepare nucleic acid library with extracted nucleic acid -114 that satisfies nucleic acid quality control Sequence DNA and / or RNA nucleic acid , 115 obtain RNA expression data Process RNA expression data to obtain bias - corrected gene -116 expression data Perform one or more quality control assessments on gene expression data ' 117 Analyze gene expression data -118 Identify a cancer treatment for the subject based on 119 the gene expression data End FIG . 1B Patent Application Publication Jan. 7 , 2021 Sheet 3 of 22 US 2021/0005284 A1 enrichmentpolyA FIG.2A rRNAdepletion Patent Application Publication Jan. 7 , 2021 Sheet 4 of 22 US 2021/0005284 A1 FIG.2B WX08 ) uossa Patent Application Publication Jan. 7 , 2021 Sheet 5 of 22 US 2021/0005284 A1 IILE Samples FIG . 3 Patent Application Publication Jan. 7 , 2021 Sheet 6 of 22 US 2021/0005284 A1 Histone length ( Hela ) Gene Median length , nt HIST1H4C 20,0 HISTIH1C 25,5 HIST1H2BK 71,0 HIST1H2AE 14,0 HIST1H3B 15,0 HIST3H2A 17,0 FIG . 4A Patent Application Publication Jan. 7 , 2021 Sheet 7 of 22 US 2021/0005284 A1 Polya Total FIG . 4B Patent Application Publication Jan. 7 , 2021 Sheet 8 of 22 US 2021/0005284 A1 SIKIO Polya Total FIG . 4C Patent Application Publication Jan. 7 , 2021 Sheet 9 of 22 US 2021/0005284 A1 b FIG . 5 Patent Application Publication Jan. 7. 2021 Sheet 10 of 22 US 2021/0005284 A1 BEGIN 200 Obtain a first sample of a first tumor from a -201 subject having , suspected of having , or at risk of having cancer 202 Extract RNA from the first sample of the first tumor Enrich the extracted RNA for coding RNA to obtain 203 enriched RNA Prepare a first library of CDNA fragments from the 204 enriched RNA for non - stranded RNA sequencing Perform non - stranded RNA sequencing on 205 the first library of cDNA fragments from the enriched RNA End FIG . 6A Patent Application Publication Jan. 7 , 2021 Sheet 11 of 22 US 2021/0005284 A1 BEGIN 210 Obtain RNA expression data for a subject having , suspected of 211 having , or at risk of having cancer Align the RNA expression data to a reference and annotate the RNA * 212 expression data Remove non - coding transcripts from the annotated RNA expression data to -213 obtain filtered RNA expression data Normalize the filtered RNA expression data to obtain gene expression 214 data in transcripts per million ( TPM ) Identify at least one gene that introduces bias in the gene expression data 215 Remove at least one gene from the gene expression data to obtain 216 bias - corrected gene expression data Identify a cancer treatment for the subject using the bias - corrected -217 gene expression data End FIG . 6B Patent Application Publication Jan. 7 , 2021 Sheet 12 of 22 US 2021/0005284 A1 220 BEGIN Enrich RNA for coding RNA in a sample of extracted RNA from a first 221 tumor sample from a subject having , suspected of having , or at risk of having cancer Perform non - stranded RNA sequencing on a first library of cDNA fragments prepared from the enriched RNA to obtain RNA 222 expression data Convert the RNA expression data to gene expression data 223 Identify at least one gene that introduces bias in the gene expression 224 data in transcripts per kilobase million ( TPM ) Remove at least one gene from the gene expression data to 225 obtain bias - corrected gene expression data Identify a cancer treatment for the subject using the 226 bias - corrected gene expression data End FIG . 6C Patent Application Publication Jan. 7. 2021 Sheet 13 of 22 US 2021/0005284 A1 300 308 301 309 302 . 310 303 311 304 . 312 305 313 306 314 307 315 FIG . 7 Patent Application Publication Jan. 7 , 2021 Sheet 14 of 22 US 2021/0005284 A1 800 Obtain nucleic acid data comprising : sequence data and asserted information indicating an asserted 801 source and / or an asserted integrity of the sequence data Process the nucleic acid data 802a 802b 802 Process the sequence data to Process the sequence data to obtain determined information obtain determined information indicating a determined source indicating a determined of the sequence data integrity of the sequence data 803 Match ? Yes No Process the sequence data to determine whether the sequence data is indicative of one or more disease 804 features and / or determine a therapy for the subject Perform remedial action ( s ) 805 FIG . 8 Patent Application Publication Jan. 7 , 2021 Sheet 15 of 22 US 2021/0005284 A1 500 Kiowaw 520 550 FIG9. 540 510 530 Patent Application Publication Jan. 7 , 2021 Sheet 16 of 22 US 2021/0005284 A1 640 660 600 620 670 610 FIG.10 SampleBokcal 089 680 *** 650 OD Patent Application Publication Jan. 7 , 2021 Sheet 17 of 22 US 2021/0005284 A1 301302303 Normal 60400701 8:00 C.06:02 83 WES 8,04:0103:03 0:0802,06:02 8:5301.550 FIG.11 FIG.12 RNA-Seq C.06:02,06:02 8370103 8633602 3703 103 Patent Application Publication Jan. 7. 2021 Sheet 18 of 22 US 2021/0005284 A1 FIG.13A Good *** Patent Application Publication Jan. 7. 2021 Sheet 19 of 22 US 2021/0005284 A1 FIG.13B Patent Application Publication Jan. 7. 2021 Sheet 20 of 22 US 2021/0005284 A1 Total FIG.14A Polya VAjodjejoi jo killigeqoid Patent Application Publication Jan. 7. 2021 Sheet 21 of 22 US 2021/0005284 A1 Total Polya FIG.14B 0.5 Vhod / e10l jo yliqeqoid Patent Application Publication Jan. 7 , 2021 Sheet 22 of 22 US 2021/0005284 A1 ...ilili *** 369211 260240, FIG.15 US 2021/0005284 A1 Jan. 7 , 2021 1 TECHNIQUES FOR NUCLEIC ACID DATA or a determined integrity of the sequence data ; and deter QUALITY CONTROL mining whether the determined information matches the asserted information . CROSS REFERENCE TO RELATED [ 0005 ] Some embodiments provide at least one non - tran APPLICATIONS sitory computer - readable storage medium storing processor executable instructions that, when executed by at least one [ 0001 ] This application claims the benefit under 35 U.S.C. computer hardware processor, cause the at least one com $ 119 ( e ) of U.S. Provisional Application No. 62 / 870,622 , puter hardware processor to perform a method . The method filed Jul . 3 , 2019 , entitled “ Compositions and Methods for comprising: obtaining nucleic acid data comprising : Sample Preparation and Characterization of Cancer There sequence data indicating a nucleotide sequence for at least 5 from " and U.S.