US010435714B2 (12 ) United States Patent ( 10 ) Patent No. : US 10 ,435 ,714 B2 Gill et al. (45 ) Date of Patent : * Oct. 8 , 2019

(54 ) NUCLEIC ACID -GUIDED NUCLEASES 8 ,569 ,041 32 10 / 2013 Church et al. 8 ,697 , 359 B1 4 / 2014 Zhang 8 , 906 ,616 B2 12 / 2014 Zhang et al. ( 71 ) Applicant: Inscripta , Inc. , Boulder, CO (US ) 9 , 458 ,439 B2 10 / 2016 Choulika et al . 9 ,512 ,446 B112 /2016 Joung et al. (72 ) Inventors : Ryan T. Gill, Denver , CO (US ); 9 ,752 , 132 B2 9/ 2017 Joung et al. Andrew Garst , Boulder , CO (US ); 9 , 790 ,490 B2 10 / 2017 Zhang et al. Tanya Elizabeth Warnecke Lipscomb, 9 ,926 ,546 B2 3 / 2018 Joung et al . 9 , 982 ,278 B2 5 /2018 Gill et al. Boulder, CO (US ) 9 , 982 ,279 B1 5 / 2018 Gill et al. 10 ,011 ,849 B1 7 / 2018 Gill et al . (73 ) Assignee : INSCRIPTA , INC ., Boulder, CO (US ) 10 ,017 , 760 B2 7 / 2018 Gill et al . 2008 /0287317 AL 11/ 2008 Boone ( * ) Notice : Subject to any disclaimer , the term of this 2009 /0176653 Al 7 / 2009 Kim et al. patent is extended or adjusted under 35 2010 /0034924 Al 2 / 2010 Fremaux et al . 2010 /0305001 A1 12 /2010 Kern et al. U .S . C . 154 (b ) by 48 days . 2014 / 0068797 A1 3 / 2014 Doudna et al. 2014 / 0089681 A1 3 /2014 Goto et al. This patent is subject to a terminal dis 2014 /0121118 A1 5 / 2014 Warner claimer . 2014 /0199767 A1 7 /2014 Barrangou et al. 2014 / 0273226 A1 9 / 2014 Wu (21 ) Appl. No. : 15 /896 ,433 2014 / 0273232 A1 9 /2014 Zhang et al. 2014 / 0295557 Al 10 / 2014 Joung et al . (22 ) Filed : Feb . 14 , 2018 2015 /0031133 AL 1 / 2015 Church et al. 2015 /0031134 Al 1 / 2015 Zhang et al . 2015 /0064138 A1 3 /2015 Lu et al. (65 ) Prior Publication Data 2015 /0079680 A1 3 /2015 Bradley et al . US 2018 /0371497 A1 Dec. 27 , 2018 2015 /0098954 A1 4 / 2015 Hyde et al. (Continued ) Related U . S . Application Data FOREIGN PATENT DOCUMENTS (63 ) Continuation of application No. 15 /631 , 989 , filed on EP 2764103 A2 8 / 2014 Jun . 23 , 2017 , now Pat . No . 10 ,011 , 849 . EP 2825654 A11 / 2015 (51 ) Int. Cl. (Continued ) C12N 15 / 90 ( 2006 .01 ) C12N 15 / 11 ( 2006 . 01 ) OTHER PUBLICATIONS C12N 15 / 10 ( 2006 .01 ) C12N 15 /82 ( 2006 .01 ) Abudayyeh , et al. C2c2 is a single -component programmable RNA C12N 15 / 85 ( 2006 .01 ) guided RNA - targeting CRISPR effector. Science . Jun . 2 , 2016 . C12N 9 /22 ( 2006 .01 ) DOI: 10 .1126 / science .aaf5573 . C12N 5 / 14 ( 2006 . 01 ) Agresti, et al. Ultrahigh -throughput screening in drop -based microfluid C12N 5 /22 ( 2006 .01 ) ics for directed evolution . Proc Natl Acad Sci US A . Mar. 2 , C12N 15 / 09 ( 2006 . 01 ) 2010 ; 107 ( 9 ) : 4004 - 9 . doi: 10 . 1073 /pnas .0910781107 . Epub Feb . 8 , C12N 15 /52 ( 2006 . 01 ) 2010 . CI2N 15 /63 ( 2006 .01 ) (Continued ) C12N 15 / 79 ( 2006 .01 ) C12N 15 / 70 ( 2006 .01 ) CI2N 15 / 81 ( 2006 . 01) Primary Examiner — Julie Wu C12N 15 / 113 ( 2010 .01 ) (74 ) Attorney , Agent, or Firm — Wilson Sonsini Goodrich (52 ) U .S . CI. & Rosati CPC . C12N 15 /902 (2013 . 01 ) ; C12N 9 /22 ( 2013 .01 ) ; C12N 15/ 11 (2013 .01 ) ; C12N 15 /111 (2013 . 01 ) ; C12N 15 /70 (2013 . 01 ) ; (57 ) ABSTRACT C12N 15 /81 ( 2013 .01 ) ; C12N 15 / 85 Disclosed herein are nucleic acid - guided nucleases , guide (2013 .01 ); C12N 15 / 905 (2013 . 01 ) ; C12N nucleic acids, and targetable nuclease systems, and methods 15 /907 (2013 . 01 ); C12N 2310 /20 (2017 .05 ) ; of use . Disclosed herein are engineered non -naturally occur C12N 2800 /22 (2013 . 01 ) ring nucleic acid -guided nucleases , guide nucleic acids, and (58 ) Field of Classification Search targetable nuclease systems, and methods of use. Targetable None nuclease systems can be used to edit genetic targets , includ See application file for complete search history. ing recursive genetic engineering and trackable genetic ( 56 ) References Cited engineering methods . U . S . PATENT DOCUMENTS 21 Claims, 21 Drawing Sheets 6 ,322 , 969 B1 11/ 2001 Stull et al . 6 , 562 ,594 B1 5 / 2003 Short ( 19 of 21 Drawing Sheet ( s ) Filed in Color ) 8 ,153 ,432 B2 4 /2012 Church et al. Specification includes a Sequence Listing . US 10 ,435 ,714 B2 Page 2

( 56 ) References Cited EP 2840140 B1 11 / 2016 EP 3144390 A1 3 /2017 U . S . PATENT DOCUMENTS EP 3009511 B1 5 / 2017 WO WO - 03106654 A2 12 /2003 2015 /0133315 A15 / 2015 Jacobson et al. WO WO - 2007144770 A2 12 /2007 2015 /0159174 A1 6 /2015 Frendewey et al. WO WO - 2012142591 A2 10 / 2012 2015 /0176013 Al 6 / 2015 Musunuru et al. Wo WO - 2013176772 AL 11 /2013 2015 /0201634 A1 7 / 2015 Fremaux et al. wo WO - 2013176915 AL 11 / 2013 2015 /0225773 AL 8 / 2015 Farmer et al. wo WO - 2014022702 A2 2 / 2014 2015 /0247150 A1 9 / 2015 Zhang et al. WO WO - 2014065596 Al 5 / 2014 2015 /0353905 Al 12 / 2015 Weiss et al . WO WO - 2014093595 Al 6 / 2014 2015 / 0353917 Al 12 / 2015 Miller WO WO - 2014093622 A2 6 / 2014 2015 / 0368639 Al 12 / 2015 Gill et al. WO WO - 2014093661 A2 6 / 2014 2016 / 0024523 Al 1 / 2016 Joung et al. WO WO - 2014093701 Al 6 / 2014 2016 /0024529 Al 1 / 2016 Carstens WO WO - 2014099744 Al 6 / 2014 2016 / 0053272 Al 2 / 2016 Wurtzel et al. WO WO - 2014110006 Al 7 /2014 2016 /0053304 A1 2 / 2016 Wurtzel et al. WO WO - 2014143381 Al 9 / 2014 2016 / 0060653 Al 3 / 2016 Doudna et al . WO WO - 2014150624 Al 9 / 2014 2016 / 0060654 Al 3 / 2016 Doudna et al . WO WO - 2014191128 Al 12 / 2014 2016 / 0068864 Al 3 / 2016 Doudna et al . WO WO - 2015006290 Al 1 / 2015 2016 / 0076093 A1 3 / 2016 Shendure et al. WO WO - 2015006747 A2 1 /2015 2016 /0102322 Al 4 / 2016 Ravinder et al . WO WO - 2015013583 A2 1 / 2015 2016 /0115488 Al 4 / 2016 Zhang et al. wo WO - 2015017866 A1 2 / 2015 2016 /0115489 A1 4 / 2016 Zhang et al. WO WO - 2015048577 A2 4 / 2015 2016 /0160210 Al 6 / 2016 Mali et al. WO WO - 2015048690 A1 4 / 2015 2016 /0168592 Al 6 / 2016 Church et al . WO WO - 2015068785 Al 5 / 2015 2016 / 0186168 A1 6 / 2016 Konieczka et al. WO WO - 2015069682 A2 5 / 2015 2016 /0208243 Al 7 / 2016 Zhang et al. WO WO - 2015070062 A1 5 / 2015 2016 / 0264995 AL 9 / 2016 Yamamoto et al. WO WO - 2015071474 A2 5 / 2015 2016 / 0289673 Al 10 / 2016 Huang et al. WO WO - 2015089354 A1 6 / 2015 2016 / 0289675 A1 10 / 2016 Ryan et al. WO WO - 2015123339 Al 8 / 2015 2016 /0298096 A1 10 / 2016 Charpentier et al . WO WO - 2015153889 A2 10 /2015 2016 /0298097 A1 10 / 2016 Chavez et al . WO WO - 2015159086 A1 10 / 2015 2016 /0298134 Al 10 / 2016 Chen et al. wo WO - 2015159087 Al 10 /2015 2016 /0298135 Al 10 / 2016 Chen et al. WO WO - 2015179540 AL 11 / 2015 2016 / 0298138 A1 10 / 2016 Chen et al. WO WO - 2015191693 A2 12 /2015 2016 / 0333389 AL 11/ 2016 Liu et al. WO WO - 2015195798 AL 12 / 2015 2016 /0367702 A112 / 2016 Hoge et al. WO WO - 2015198020 Al 12 / 2015 2017 /0002339 Al 1 / 2017 Barrangou et al . WO WO - 2015191693 A3 2 / 2016 2017 /0037434 A1 2 / 2017 Choulika et al . WO WO - 2016040594 Al 3 / 2016 2017 /0044569 A9 2 / 2017 Church et al. WO WO - 2016070037 A2 5 / 2016 2017 / 0051276 A1 2 / 2017 May et al. WO WO - 2016099887 A1 6 / 2016 2017 /0051310 A1 2 / 2017 Doudna et al . 2017 /0051311 A1 2 / 2017 Dalia et al. WO WO - 2016100955 A2 6 / 2016 2017 / 0058272 Al 3 / 2017 Carter et al. WO WO - 2016106239 A1 6 / 2016 2017 / 0067046 Al 3 / 2017 Gill et al. WO WO - 2016166340 A110 / 2016 2017 / 0073705 A1 3/ 2017 Chen et al. WO WO - 2016186946 AL 11 / 2016 2017 / 0080107 A1 3/ 2017 Chivukula et al. WO WO - 2016186953 AL 11 / 2016 2017 /0114334 A1 4 / 2017 May et al . wo WO - 2016196805 AL 12 / 2016 2017 /0114369 Al 4 / 2017 Donohoue et al . WO WO - 2016205554 Al 12 / 2016 2017 /0145425 Al 5 / 2017 Kim et al. WO WO - 2016205613 Al 12 / 2016 2017 /0159045 A1 6 / 2017 Serber et al. WO WO - 2016205749 Al 12 / 2016 2017 /0175143 Al 6 / 2017 Tolar et al. WO WO - 2016205764 AL 12 / 2016 2017 /0191123 A1 7 / 2017 Kim et al. wo WO - 2017004261 A1 1 / 2017 2017 / 0198302 A1 7 / 2017 Feng et al. WO WO - 2017015015 Al 1 / 2017 2017 / 0204407 A1 7 / 2017 Gilbert et al. WO WO - 2017019867 Al 2 / 2017 2017 /0211142 Al 7 / 2017 Smargon et al. WO WO - 2017031483 A1 2 / 2017 2017 / 0218349 A1 8 / 2017 Miller et al . WO WO - 2017053713 Al 3 / 2017 2017 / 0226533 Al 8 / 2017 Frisch et al . WO WO - 2017066588 A2 4 / 2017 2017/ 0233752 Al 8 / 2017 Shiboleth et al. WO WO - 2017068120 A1 4 / 2017 2017 /0233756 A1 8 / 2017 Begemann et al . WO - 2017070605 Al 4 / 2017 2017 / 0240922 A1 8 / 2017 Gill et al. WO 2017 / 0321226 A1 11/ 2017 Gill et al. WO WO - 2017089767 Al 6 / 2017 2017 /0369870 A1 12 / 2017 Gill et al. WO WO - 2017096041 A1 6 / 2017 2018 /0230460 A1 8 / 2018 Gill et al. WO WO - 2017099494 Al 6 / 2017 2018 / 0230461 AL 8 / 2018 Gill et al. Wo WO - 2017100343 Al 6 / 2017 2018 /0230492 AL 8 / 2018 Gill et al . wo WO - 2017100377 Al 6 / 2017 2018 /0230493 AL 8 / 2018 Gill et al . WO WO - 2017109167 A2 6 / 2017 2018 /0362590 A1 * 12 / 2018 Monds . . . . C12N 15 / 85 WO WO - 2017127807 A1 7 / 2017 2019 / 0010481 AL 1 / 2019 Joung et al. WO WO - 2017223538 Al 12 /2017 WO WO - 2018071672 AL 4 /2018 FOREIGN PATENT DOCUMENTS WO WO - 2018236548 A1 12 /2018 EP 2828386 A11 / 2015 EP 2840140 A1 2 /2015 OTHER PUBLICATIONS EP 2848690 Al 3 / 2015 EP 2898075 A1 7 / 2015 Alper, et al. Engineering yeast transcription machinery for improved EP 3009511 A2 4 / 2016 ethanol tolerance and production. Science. Dec . 8 , 2006 ;314 (5805 ) : 1565 EP 3064585 A1 9 /2016 8 . US 10 ,435 ,714 B2 Page 3

( 56 ) References Cited Edgar. Search and clustering orders ofmagnitude faster than BLAST. Bioinformatics. Oct . 1 , 2010 ; 26 ( 19 ) :2460 - 1 . doi: 10 . 1093 /bioinformatics / OTHER PUBLICATIONS btq461. Epub Aug . 12 , 2010 . Eklund , et al. Altered target site specificity variants of the I - Ppol Alper , et al. Global transcription machinery engineering: a new His -Cys box homing endonuclease . Nucleic Acids Res. approach for improving cellular phenotype . Metab Eng . May , 2007 ; 35 ( 17 ) : 5839 -50 . Epub Aug . 24 , 2007 . 2007 ; 9 ( 3 ) :258 -67 . Epub Jan . 8 , 2007 . European Search Report dated Jun . 26 , 2017 for EP Application No. Baba, et al. Construction of Escherichia coli K - 12 in - frame, single 15749644 . 9 . gene knockoutmutants : the Keio collection . Mol Syst Biol. 2006 ;2 : 2006 . Examination Report dated Jun . 27 , 2017 for GB Application No . 0008 . Epub Feb . 21 , 2006 . 1615434 . 6 . Bakan , et al. ProDy : protein dynamics inferred from theory and Farasat, et al . Efficient search , mapping , and optimization ofmulti experiments . Bioinformatics. Jun . 1 , 2011 ; 27 ( 11 ) : 1575 - 7 . doi: 10 .1093 / protein genetic systems in diverse . Mol Syst Biol. Jun . 21 , bioinformatics/ btr168 . Epub Apr. 5 , 2011. 2014 ; 10 :731 . doi: 10 . 15252 /msb .20134955 . Basak , et al . Enhancing E . coli tolerance towards oxidative stress Findlay, et al. Saturation editing of genomic regions by multiplex via engineering its global regulator CAMP receptor protein (CRP ) . homology -directed repair . Nature. Sep . 4 , 2014 ;513 ( 7516 ) : 120 - 3 . doi: 10 . 1038 / nature13695 . PLoS One. 2012 ; 7 ( 12 ) :e51179 . doi: 10 . 1371/ journal. pone . 0051179 . Firth , et al. GLUE - IT and PEDEL -AA : new programmes for Epub Dec . 14 , 2012 . analyzing protein diversity in randomized libraries. Nucleic Acids Bateman , et al . The Pfam protein families database . Nucleic Acids Res. Jul. 1 , 2008 ; 36 (Web Server issue ) :W281 - 5 . doi: 10 . 1093 /nar / Res . Jan . 1 , 2004 ;32 (Database issue ) :D138 -41 . gkn226 . Epub Apr. 28 , 2008 . Bhabha, et al. Divergent evolution of protein conformational dynam Fisher, et al . Enhancing tolerance to short - chain alcohols by engi ics in dihydrofolate reductase. Nat Struct Mol Biol. Nov . neering the Escherichia coli Acr? efflux pump to secrete the 2013 ; 20 ( 11) : 1243 - 9 . doi: 10 .1038 /nsmb . 2676 . Epub Sep . 29 , 2013 . non - native substrate n -butanol . ACS Synth Biol. Jan . 17 , 2014 ; 3 ( 1 ) :30 Bhaya et al. CRISPR - Cas systems in bacteria and archaea : versatile 40 . doi: 10 .1021 / sb400065q . Epub Sep . 13 , 2013 . small RNAs for adaptive defense and regulation . Annu Rev Genet Foo , et al . Directed evolution of an E . coli inner membrane 45 : 273 - 297 (2011 ) . transporter for improved efflux of biofuel molecules. Biotechnol Bikard et al. CRISPR interference can prevent natural transforma Biofuels . May 21, 2013 ;6 ( 1 ): 81 . doi: 10 . 1186 / 1754 -6834 -6 - 81 . tion and virulence acquisition during in vivo bacterial infection . Cell Gao , et al. DNA - guided genome editing using the Natronobacterium Host & Microbe 12 ( 2 ) : 177 - 186 ( 2012 ) . gregoryi Argonaute . Nat Biotechnol. May 2 , 2016 . doi: 10 . 1038 / Boehr, et al . The dynamic energy landscape of dihydrofolate reductase nbt. 3547 . catalysis . Science . Sep . 15 , 2006 ;313 (5793 ): 1638 -42 . Garst, et al. , Genome- wide mapping of mutations at single nucleotide resolution for protein , metabolic and genome engineer Brouns, et al. Small CRISPR RNAs guide antiviral defense in ing. Nature Biotechnology 35 , 48 - 55 (2017 ) doi: 10 . 1038 / nbt. 3718. prokaryotes. Science . Aug. 15 , 2008 ;321 ( 5891 ) : 960 - 4 . doi: 10 . 1126 / Garst, et al. , Strategies for the multiplex mapping of genes to traits. science . 1159689 . Microbial Cell Factories 2013 , 12 : 99 . Browning , et al. Modulation of CRP - dependent transcription at the Glebes , et al . Comparison of genome- wide selection strategies to Escherichia coli acsP2 promoter by nucleoprotein complexes : anti identify furfural tolerance genes in Escherichia coli . Biotechnol activation by the nucleoid proteins FIS and IHF. Mol Microbiol . Jan . Bioeng. Jan . 2015 ; 112 ( 1 ) : 129 -40 . doi : 10 .1002 / bit .25325 . Epub 2004 ; 51 ( 1 ) :241 - 54 . Sep . 2 , 2014 . Campbell , et al . Structuralmechanism for rifampicin inhibition of Gutierrez - Rios, et al. Regulatory network of Escherichia coli : bacterial rna polymerase . Cell. Mar. 23 , 2001 ; 104 (6 ) : 901 - 12 . consistency between literature knowledge and microarray profiles . Chang, et al. Structural systems biology evaluation of metabolic Genome Res . Nov . 2003 ; 13 ( 11 ) :2435 -43 . thermotolerance in Escherichia coli. Science . Jun . 7 , Hamady, et al. Error -correcting barcoded primers for pyrosequenc 2013 ; 340 (6137 ) : 1220 - 3 . doi: 10 . 1126 / science . 1234012 . ing hundreds of samples in multiplex . NatMethods . Mar. 2008 ; 5 ( 3 ) :235 Chen , et al. Genome- wide CRISPR screen in a mouse model of 7 . doi : 10 . 1038 /nmeth . 1184 . Epub Feb . 10 , 2008 . tumor growth and metastasis . Cell . Mar. 12 , 2015 ; 160 ( 6 ) : 1246 -60 . HHMI. The Fields lab homepage. Cloning Vectors . Available at doi : 10 . 1016 / j .cell .2015 .02 .038 . Epub Mar. 5 , 2015 . http : / /depts . washington .edu / sfields/ protocols /pOAD .html . Accessed Chiang , et al . Regulators of oxidative stress response genes in on Jan . 3 , 2017 . Escherichia coli and their functional conservation in bacteria . Arch Ho , et al . Efficient Reassignment of a Frequent Serine Codon in Biochem Biophys. Sep . 15 , 2012 ;525 ( 2 ) : 161 - 9 . doi: 10 . 1016 /j .abb . Wild - Type Escherichia coli .ACS Synth Biol. Feb . 19 , 2016 ; 5 ( 2 ) : 163 2012 .02 . 007 . Epub Feb . 20 , 2012 . 71 . doi: 10 . 1021 /acssynbio . 5b00197 . Epub Nov. 20 , 2015 . Cong , et al. Multiplex genome engineering using CRISPR / Cas Hsu , et al ., DNA targeting specificity of RNA - guided Cas9 nucleases. systems. Science . Feb . 15 , 2013 ; 339 (6121 ): 819 - 23 . doi: 10 . 1126 / Nature Biotechnology . Jul . 21 , 2013 ; 31 ( 9 ) : 827 - 834 . science .1231143 . Epub Jan . 3 , 2013 . Hung , et al. Crystal structure of AcrB complexed with linezolid at Co - pending U . S . Appl. No . 15 /632 ,001 , filed Jun . 23 , 2017 . 3 . 5 X resolution . J Struct Funct Genomics . Jun . 2013 ; 14 ( 2 ) : 71 - 5 . Costantino , et al . Enhanced levels of lambda Red -mediated recom doi: 10 .1007 /s10969 - 013 -9154 -X . Epub May 15 , 2013 . binants in mismatch repair mutants . Proc Natl Acad SciUSA . Dec . Hwang , et al. Efficient genome editing in zebrafish using a CRISPR 23 , 2003 ; 100 ( 26 ) : 15748 - 53 . Epub Dec . 12 , 2003 . Cas system . Nat Biotechnol . Mar. 2013 ;31 ( 3 ) : 227 - 9 . doi: 10 . 1038 / Datta, et al. A set of recombineering plasmids for gram -negative nbt. 2501 . Epub Jan . 29 , 2013 . bacteria. Gene . Sep . 1 , 2006 ; 379: 109 -15 . Epub May 4 , 2006 . Ibanez , et al. Mass spectrometry -based metabolomics of single Dicarlo , et al. Genome engineering in Saccharomyces cerevisiae yeast cells . Proc Natl Acad Sci USA . May 28 , 2013 ; 110 ( 22 ) : 8790 using CRISPR -Cas systems. Nucleic Acids Res . Apr. 2013 ; 41 ( 7 ) : 4336 4 . doi: 10 . 1073 /pnas . 1209302110 . Epub May 13 , 2013 . 43 . doi: 10 . 1093 /nar / gkt135 . Epub Mar . 4 , 2013 . International search report and written opinion dated Jul. 28 , 2015 “ Dickinson et al . Engineering the Caenorhabditis elegans Genome for PCT /US2015 /015476 . Using Cas9 - Triggered Homologous Recombination ; Nat Methods . International search report and written opinion dated Nov . 5 , 2012 Oct. 2013 ; 10 ( 10 ): 1028 - 1034; doi : 10 . 1038 /nmeth . 2641” . for PCT/ US2012 /033799 . Dwyer, et al . Role of reactive oxygen in antibiotic action and International Search Report dated Nov. 29 , 2017 for International resistance . Curr Opin Microbiol. Oct . 2009; 12 (5 ): 482 -9 . doi: 10 . 1016 / Patent Application No . PCT/ US2017 /039146 . j .mib . 2009 .06 .018 . Epub Jul. 31, 2009 . International Search Report dated Dec. 26 , 2017 for International Ebright, et al. Consensus DNA site for the Escherichia coli catabolite Patent Application No . PCT/ US2017 / 056344 . gene activator protein (CAP ) : CAP exhibits a 450 - fold higher Isaacs, et al. Precise manipulation of chromosomes in vivo enables affinity for the consensus DNA site than for the E . coli lac DNA site . genome- wide codon replacement . Science . Jul. 15 , 2011; 333 (6040 ) : 348 Nucleic Acids Res . Dec . 25 , 1989 ;17 (24 ): 10295 - 305 . 53 . doi: 10 . 1126 / science. 1205822 . US 10 ,435 ,714 B2 Page 4

(56 ) References Cited Nakashima, et al. Structures of the multidrug exporter AcrB reveal a proximal multisite drug -binding pocket . Nature . Nov . 27 , OTHER PUBLICATIONS 2011; 480 ( 7378 ) : 565 - 9 . doi: 10 . 1038/ nature10641 . NCBI( https : / / www .ncbi . nlm .nih .gov /protein /WP _ 055225123 ?report = Iwakura , et al. Evolutional design of a hyperactive cysteine - and genbank & log $ = protalign & blast _ rank = 1 & RID = 5FPJ4MN4014 , Type methionine - free mutant of Escherichia coli dihydrofolate reductase . V CRISPR -associated protein Cpfi Eubacterium rectale , 2016 ) . J Biol Chem . May 12 , 2006 ; 281 ( 19 ): 13234 -46 . Epub Mar. 1, 2006 . NCBI. Basic Local Alignment Search Tool. Available at https: / / Jiang, et al. Multigene editing in the Escherichia coli genome via the CRISPR - Cas9 system . Appl Environ Microbiol. Apr. blast. ncbi . nlm .nih . gov /Blast . cgi . Accessed on Jan . 3, 2017 . 2015 ;81 ( 7 ) : 2506 - 14 . doi: 10 . 1128 /AEM .04023 - 14 . Epub Jan . 30 , Neylon , Cameron ., Chemical and biochemical strategies for the 2015 . randomization of protein encoding DNA sequences: library con Jiang et al . RNA - guided editing ofbacterial genomes using CRISPR struction methods for directed evolution . Nucleic Acids Research , Cas systems. Nat Biotechnol 31: 233 - 239 (2013 ). 2004 , vol. 32 , No. 4 . 1448 - 1459 . Jinek , et al. A programmable dual - RNA - guided DNA endonuclease Office Action dated Jan . 19 , 2018 for U . S . Appl. No . 15 /632 , 001 . in adaptive bacterial immunity . Science . Aug . 17 , 2012 ; 337 (6096 ) :816 Office Action dated Jun . 16 , 2016 for U .S . Appl. No . 14 /110 ,072 . 21 . doi: 10 . 1126 / science . 1225829 . Epub Jun. 28 , 2012 . Office Action dated Jun . 21, 2017 for U . S . Appl. No . 14 / 110 ,072 . Kersten , et al. A mass spectrometry - guided genome mining approach Office Action dated Sep . 28 , 2017 for U . S . Appl. No . 15 /632 , 001 . for natural product peptidogenomics . Nat Chem Biol. Oct. 9 , Office Action dated Nov . 8 , 2017 for U . S . Appl . No. 15 /630 , 909. 2011; 7 ( 11) : 794 - 802 . doi: 10 . 1038 / nchembio .684 . Office Action dated Nov. 20 , 2017 for U . S . Appl . No. 15 /632 , 222. Kim , et al . A guide to genome engineering with programmable Office Action dated Dec . 9 , 2016 for U . S . Appl. No. 14 / 110 ,072 . nucleases. Nat Rev Genet. May 2014 ; 15 ( 5 ) :321 - 34 . doi: 10 . 1038 / Oh et al. CRISPR -Cas9 -assisted recombineering in lactobacillus nrg3686 . Epub Apr. 2 , 2014 . reuteri. Nucleic Acids Res 42 ( 17 ): e131 ( 2014 ). Kim , et al. Hybrid restriction enzymes : zinc finger fusions to Fok I Pines , et al . Codon compression algorithms for saturation mutagenesis . cleavage domain . Proc Natl Acad Sci USA . Feb . 6 , 1996 ; 93 ( 3 ) : 115 ACS Synth Biol. May 15 , 2015 ;4 ( 5 ): 604 - 14 . doi: 10 . 1021/ 60 . sb500282v. Epub Oct . 30 , 2014 . Kleinstiver , et al. , Engineered CRISPR - Cas9 nucleases with altered Plagens , et al . , DNA and RNA interference mechanismsby CRISPR PAM specificities. Nature . Jul. 23 , 2015 ; 523 (7561 ): 481 -5 . doi: Cas surveillance complexes, FEMS Microbiology Reviews, May 1 , 10 . 1038 / nature14592 . Kohanski , et al. A common mechanism of cellular death induced by 2015 ; 39 ( 3 ): 442 -463 . bactericidal antibiotics. Cell . Sep . 7 , 2007 ; 130 ( 5 ): 797 -810 . Prior , et al. Broad -host - range vectors for protein expression across Kosuri , et al. Scalable gene synthesis by selective amplification of gram negative hosts . Biotechnol Bioeng. Jun . 1 , 2010 ; 106 (2 ): 326 DNA pools from high - fidelity microchips . Nat Biotechnol. Dec 32 . doi: 10 . 1002/ bit . 22695 . 2010 ;28 ( 12 ) : 1295 - 9 . doi: 10 . 1038 /nbt . 1716 . Epub Nov . 28 , 2010 . Qi et al . Repurposing CRISPR as an RNA - guided platform for Kwon , et al. Crystal structure of the Escherichia coli Rob transcrip sequence -specific control of gene expression . Cell 152 : 1173 - 1183 tion factor in complex with DNA .Nat Struct Biol. May 2000 ;7 (5 ): 424 ( 2013 ) . 30 . Raman , et al . Evolution - guided optimization of biosynthetic path Lajoie , et al . Genomically recoded organisms expand biological ways . Proc Natl Acad Sci US A . Dec . 16 , 2014 ; 111 ( 50 ) : 17803 - 8 . functions. Science . Oct. 18 , 2013 ; 342 (6156 ): 357 -60 . doi: 10 . 1126 / doi: 10 . 1073 /pnas . 1409523111. Epub Dec . 1 , 2014 . science . 1241459 . Reynolds, et al . Quantifying Impact of Chromosome Copy Number Li, et al . Identification of factors influencing strand bias in oligo on Recombination in Escherichia coli . ACS Synth Biol. Jul. 17 , nucleotide -mediated recombination in Escherichia coli. Nucleic 2015 ; 4 ( 7 ) : 776 - 80 . doi: 10 . 1021 /sb500338g . Epub Mar. 19 , 2015 . Acids Res . Nov . 15 , 2003 ; 31 ( 22 ) :6674 - 87 . Rhee , et al. A novel DNA -binding motif in MarA : the first structure Li, et al . Metabolic engineering of Escherichia coli using CRISPR for an AraC family transcriptional activator. Proc Natl Acad Sci U Cas9 meditated genome editing . Metab Eng . Sep . 2015 ; 31 : 13 - 21 . S A . Sep . 1 , 1998 ; 95 ( 18 ) : 10413 - 8 . doi : 10 . 1016 /j .ymben . 2015. 06 .006 . Epub Jun . 30 , 2015 . Rice , et al. Crystal structure of an IHF - DNA complex : a protein Liu , et al . Efficient genome editing in filamentous fungus Trichoderma induced DNA U - turn . Cell . Dec . 27 , 1996 : 87 ( 7 ) : 1295 - 306 . reesei using the CRISPR /Cas9 system . Cell Discovery . 2015 ; 1 : 15007 . Rodriguez - Verdugo , et al . Evolution of Escherichia coli rifampicin doi: 10 . 1038 / celldisc .2015 . 7 . resistance in an antibiotic -free environment during thermal stress . Makarova et al. An updated evolutionary classification of CRISPR BMC Evol Biol. Feb . 22 , 2013 ; 13: 50 . doi: 10 . 1186 / 1471 - 2148 - 13 Cas systems. Nat Rev Microbiol 13 :722 - 736 ( 2015 ) . 50 . Makarova, et al. Evolution and classification of the CRISPR -Cas Ronda , et al. CRMAGE : CRISPR Optimized MAGE Recombineer systems. Nat Rev Microbiol. Jun . 2011; 9 (6 ): 467 - 77 . doi: 10 .1038 / ing. Sci Rep . Jan . 22 , 2016 ; 6 : 19452. doi : 10 . 1038 / srep19452 . nrmicro2577 . Epub May 9 , 2011 . Ross , et al. A third recognition element in bacterial promoters : DNA Mali, et al . RNA - guided human genome engineering via Cas9 . binding by the alpha subunit of RNA polymerase . Science . Nov . 26 , Science . Feb . 15 , 2013 ; 339 (6121 ) : 823 -6 . doi: 10 . 1126 / science . 1993 ; 262 (5138 ) : 1407 - 13 . 1232033 . Epub Jan . 3 , 2013 . Sandoval, et al. Strategy for directing combinatorial genome engi Maruyama, et al. Increasing the efficiency of precise genome editing neering in Escherichia coli . Proc Natl Acad Sci US A . Jun . 26 , with CRISPR - Cas9 by inhibition of nonhomologous end joining . 2012 ; 109 ( 26 ) : 10540 - 5 . doi: 10 . 1073 / pnas . 1206299109 . Epub Jun . Nat Biotechnol. May 2015 ;33 (5 ): 538 -42 . doi: 10 . 1038 /nbt . 3190 . 11 , 2012 . Epub Mar. 23 , 2015 Mar 23 . Sawitzke , et al. Probing cellular processes with oligo -mediated Mills, et al . Cellulosic hydrolysate toxicity and tolerance mecha recombination and using the knowledge gained to optimize recombineer nisms in Escherichia coli . Biotechnol Biofuels . Oct . 15 , 2009 ; 2 : 26 . ing. J Mol Biol. Mar. 18 , 2011; 407 ( 1 ) :45 -59 . doi: 10 . 1016 /j .jmb . doi: 10 .1186 / 1754 -6834 - 2 - 26 . 2011 .01 . 030 . Epub Jan . 19 , 2011 . Molodtsov , et al . X - ray crystal structures of the Escherichia coli Semenova et al . Interference by clustered regularly interspaced RNA polymerase in complex with benzoxazinorifamycins. J Med short palindromic repeat (CRISPR ) RNA is governed by a seed Chem . Jun . 13 , 2013 ; 56 ( 11) :4758 -63 . doi: 10 . 1021/ jm4004889 . sequence . PNAS USA 108 ( 25 ) : 10098 - 10103 (2011 ) . Epub May 31 , 2013 . Shalem , et al . Genome- scale CRISPR -Cas9 knockout screening in Murakami, et al. Structural basis of transcription initiation : RNA human cells . Science. Jan . 3, 2014 ;343 (6166 ): 84 - 7 . doi: 10 .1126 / polymerase holoenzyme at 4 A resolution . Science . May 17 , science .1247005 . Epub Dec . 12 , 2013 . 2002 ; 296 ( 5571) : 1280 - 4 . Shendure. Life after genetics. GenomeMed . Oct. 29 , 2014 ;6 ( 10 ): 86 . Nakashima , et al. Structural basis for the inhibition of bacterial doi: 10 .1186 / s13073 -014 - 0086 - 2 . eCollection 2014 . multidrug exporters. Nature . Aug . 1 , 2013 ; 500 ( 7460 ) : 102 - 6 . doi: Shmakov et al. Discovery and Functional Characterization of Diverse 10 . 1038 / nature12300 . Epub Jun. 30 , 2013 . Class 2 CRISPR -Cas Systems. Mol Cell 60 ( 3 ) :385 - 397 ( 2015 ) . US 10 ,435 ,714 B2 Page 5

( 56 ) References Cited Zhang, et al ., Efficient editing of malaria parasite genome using the CRISPR / Cas9 system . mBio . Jul. 2014 ; vol. 5 Art . e01414 -14 . OTHER PUBLICATIONS Zhao, et al. Activity and specificity of the bacterial PD - ( D / E )XK Smanski, et al. Functional optimization of gene clusters by combinato homing endonuclease I -Ssp68031 . J Mol Biol. Feb . 6 , 2009 ;385 ( 5 ): 1498 rial design and assembly . Nat Biotechnol. Dec . 2014 ; 32 ( 12 ) : 1241- 9 . 510 . doi: 10 . 1016 / j .jmb .2008 . 10 .096 . Epub Nov. 12 , 2008 . doi: 10 .1038 / nbt. 3063 . Epub Nov . 24 , 2014 . Zheng , et al . Metabolic engineering of Escherichia coli for high Stearns, et al. , Manipulating yeast genome using plasmid vectors . specificity production of isoprenol and prenol as next generation of Methods in Enzymology . 1990 , 185 : 280 -297 . biofuels. Biotechnol Biofuels . Apr. 24 , 2013 ; 6 :57 . doi: 10 . 1186 / Steinmetz, et al. Maximizing the potential of functional genomics . 1754 -6834 - 6 - 57 . eCollection 2013 . Nat Rev Genet . Mar. 2004 ; 5 ( 3 ) : 190 - 201. Co -pending U .S . Appl. No . 15 /631 , 989 , filed Jun . 23 , 2017 . Stoebel , et al. Compensatory evolution of gene regulation in response Co -pending U .S . Appl. No. 15 /896 ,444 , filed Feb . 14 , 2018 . to stress by Escherichia coli lacking RpoS . PLoS Genet . Oct . Office Action dated Jan . 19 , 2018 for U . S . Appl. No. 157631, 989 . 2009 ;5 ( 10 ): e1000671 . doi: 10 . 1371/ journal. pgen .1000671 . Epub Oct. Office Action dated Oct . 3 , 2017 for U . S . Appl. No . 15 /631 , 989 . 2 , 2009 . Co- pending U . S . Appl. No. 16 / 056 ,310 , filed Aug . 6 , 2018 . Swarts , et al . Argonaute of the archaeon Pyrococcus furiosus is a Fineran , et al. Degenerate target sites mediate rapid primed CRISPR DNA -guided nuclease that targets cognate DNA . Nucleic Acids adaptation . Proc Natl Acad Sci USA . Apr. 22, 2014 ;111 ( 16 ) :E1629 Res .May 26 , 2015 ;43 ( 10 ): 5120 -9 . doi: 10 . 1093 /nar / gkv415 . Epub 38 . doi: 10 . 1073 / pnas. 1400071111. Epub Apr. 7 , 2014 . Apr. 29 , 2015 . Li, et al. , Engineering CRISPR - Cpfl crRNAs and mRNAs to Swarts , et al . DNA - guided DNA interference by a prokaryotic maximize genome editing efficiency . Nat Biomed Eng. May 2017 ; 1 ( 5 ) . Argonaute . Nature .Mar . 13, 2014 ;507 (7491 ) : 258 -61 . doi: 10 .1038 / pii: 0066 . doi: 10 .1038 / s41551 -017 -0066 . Epub May 10 , 2017 . nature12971. Epub Feb . 16 , 2014 . PCT /US2018 /034779 International Search Report and Written Opin Tenaillon , et al. The molecular diversity of adaptive convergence . ion dated Nov . 26 , 2018 . Science . Jan . 27 , 2012 ;335 (6067 ) : 457 -61 . doi: 10 . 1126 / science . U . S . Appl. No. 15/ 630 , 909 Notice of Allowance dated Mar. 26 , 1212986 . 2018 . Toprak , et al. Evolutionary paths to antibiotic resistance under U . S . Appl. No. 15/ 632 ,222 Notice of Allowance dated Mar. 26 , dynamically sustained drug selection . Nat Genet . Dec . 18 , 2018 . 2011 ; 44 ( 1 ): 101 - 5 . doi : 10 .1038 / ng. 1034 . U . S . Appl. No. 15 /116 ,616 Office Action dated Jul. 30 , 2018 . Waaijers , et al. CRISPR / Cas9 -targeted mutagenesis in Caenorhabditis U . S . Appl. No. 15 / 116 ,616 Office Action dated Mar . 16 , 2018 . elegans . Genetics. Nov . 2013 ; 195 ( 3 ) : 1187 - 91 . doi: 10 . 1534 / genetics. U . S . Appl. No . 15 / 116 ,616 Office Action dated Nov . 14 , 2018 . 113 . 156299 . Epub Aug . 26 , 2013 . U . S . Appl. No. 15 /630 ,909 Office Action dated Mar. 2 , 2018 . Wang , et al. Engineering furfural tolerance in Escherichia coli U .S . Appl. No . 15/ 631 , 989 Notice of Allowance dated Mar. 7 , 2018 . improves the fermentation of lignocellulosic sugars into renewable U . S . Appl. No . 15 /632 ,001 Notice of Allowance dated Feb . 27 , chemicals. Proc Natl Acad Sci USA. Mar . 5 , 2013 ; 110 ( 10 ) : 4021 - 6 . 2018 . doi: 10 .1073 /pnas . 1217958110 . Epub Feb . 19 , 2013 . U . S . Appl . No. 15 /632 ,001 Notice of Allowance dated Mar. 12 , Wang , et al. Genome -scale promoter engineering by coselection 2018 . MAGE . Nat Methods. Jun . 2012 ; 9 ( 6 ) : 591 - 3 . doi: 10 . 1038 /nmeth . 1971 . Epub Apr. 8, 2012 . U . S . Appl. No . 15 /632 ,222 Office Action dated Mar . 2 , 2018 . Wang , et al . Multiplexed in vivo His - tagging of enzyme pathways U . S . Appl. No . 15 /896 ,444 Office Action dated Oct. 5 , 2018 . for in vitro single - pot multienzyme catalysis . ACS Synth Biol . Feb . U . S . Appl. No. 15 /948 ,785 Office Action dated Jun . 4 , 2018 . 17 , 2012 ; 1 ( 2 ) :43 - 52 . U . S . Appl. No. 15 /948 ,785 Office Action dated Nov . 2 , 2018 . Wang , et al. One -step generation of mice carrying mutations in U . S . Appl. No. 15 / 948 ,789 Office Action dated Jun . 22 , 2018 . multiple genes by CRISPR / Cas -mediated genome engineering. Cell . U . S . Appl. No. 15 / 948 ,789 Office Action dated Oct. 11 , 2018 . May 9 , 2013 ; 153 ( 4 ) :910 - 8 . doi: 10 . 1016 / j .ce11 .2013 . 04 .025 . Epub U . S . Appl. No. 15 /948 ,793 Office Action dated Sep . 5 , 2018 . May 2 , 2013 U . S . Appl. No . 15 / 948 ,798 Office Action dated Sep . 17 , 2018 Wang , et al. Programming cells by multiplex genome engineering U . S . Appl. No. 16 / 056 , 310 Office Action dated Nov . 9 , 2018 . and accelerated evolution . Nature . Aug. 13 , 2009 ; 460 ( 7257 ) :894 - 8 . Bao , et al. , Genome- scale engineering of Saccharomyces cerevisiae doi: 10 . 1038 /nature08187 . Epub Jul. 26 , 2009 . with single- nucleotide precision . Nature Biotechnology, May 7 , Warner , et al. Rapid profiling of a microbial genome using mixtures 2018 ; 1 - 8 . ofbarcoded oligonucleotides . Nat Biotechnol. Aug . 2010 ; 28 ( 8 ): 856 Beloglazova , et al . , Crispr RNA binding and DNA taget recognition 62. doi : 10 .1038 /nbt . 1653 . Epub Jul . 18 , 2010 . by purified cascade complexes from Escherichia coli. Nucleic Acids Watson , et al. Directed evolution of trimethoprim resistance in Research , 2015 . vol. 43 No . 1 : 530 -543 . Escherichia coli . FEBS J . May 2007 ;274 ( 10 ) : 2661- 71 . Epub Apr. Co -pending U . S . Appl. No . 16 / 275, 465 , filed Feb . 14 , 2019 . 19 , 2007 . Garst et al. , ( 2017 ) 14 Supplementary Figures . Nature Biotechnol Wetmore , et al . Rapid quantification of mutant fitness in diverse ogy : doi: 10 . 1038 /nbt . 3718 . bacteria by sequencing randomly bar -coded transposons. MBio . Kleinstiver, et al ., Engineered CRISPR -Cas12a variants with increased May 12 , 2015 ;6 ( 3 ): e00306 - 15 . doi: 10 .1128 /mBio .00306 - 15 . activities and improved targeting ranges for gene , epigenetic and White , et al. Role of the acrAB locus in organic solvent tolerance mediated by expression of marA , soxS, or robA in Escherichia coli . base editing . Nature Biotechnology , Feb . 11 , 2019 , 13 Pages . J Bacteriol. Oct . 1997; 179( 19 ): 6122 -6 . Richardson et al. Enhancing homology - directed genome editing by “ Withers, et al. Identification of isopentenol biosynthetic genes from catalytically active and inactive CRISPR - Cas9 using asymmetric Bacillus subtilis by a screening method based on isoprenoid pre donor DNA . Nat Biotechnol 2016 ; 34 ( 3 ) : 339 - 344 . cursor toxicity . Appl Environ Microbiol . Oct . 2007 ; 73 ( 19 ) :6277 -83 . Roy , et al. , Multiplexed precision genome editing with trackable Epub Aug . 10 , 2007. " genomic barcodes in yeast. Nature Biotechnology , Jun . 2018 ; 36 ( 6 ) :512 Wolfe . The acetate switch .Microbiol Mol Biol Rev. Mar . 2005 ;69 ( 1 ) : 12 524 . 50 . Sadhu , et al. , Highly parallel genome variant engineering with Zeitoun , et al . Multiplexed tracking of combinatorial genomic CRISPR - Cas9. Nature genetics , Apr . 9 , 2018 ; 1 - 11. mutations in engineered cell populations. Nat Biotechnol . Jun . Sapranauskas et al. The Streptococcus thermophilus CRISPR /Cas 2015 ;33 ( 6 ) :631 - 7 . doi: 10 . 1038 /nbt .3177 . Epub Mar. 23 , 2015 . system provides immunity in Escherichia coli . Nucleic Acid Res . Zetsche, et al . Cpfl is a single RNA - guided endonuclease of a class 39 : 9275 - 9282 ( 2011) . 2 CRISPR -Cas system . Cell. Oct . 22, 2015 ; 163 ( 3 ) :759 -71 . doi : U . S . Appl. No . 15 / 116 ,616 Notice of Allowance dated Feb . 1 , 2019 . 10 . 1016 /j . cell . 2015 .09 . 038 . Epub Sep . 25 , 2015 . U . S . Appl. No. 15 / 948 ,789 Notice of Allowance dated Jan . 10 , 2019 . US 10 ,435 ,714 B2 Page 6

( 56 ) References Cited OTHER PUBLICATIONS U . S . Appl. No. 15 /948 ,793 Notice of Allowance dated Feb . 1 , 2019 . U . S . Appl. No . 15 / 948 , 798 Notice of Allowance dated Feb . 6 , 2019 . Co -pending U . S . Appl. No. 16 / 275 ,439 , filed Feb . 14 , 2019 . * cited by examiner U . S . Patent Oct. 8 , 2019 Sheet 1 of 21 US 10 ,435 , 714 B2

MADTMADS MADSMAD6 Figure1B MAD3 MAD1MAD2

"

*ytetto MAD4

**** * * **

***** * * * * ** * *

*** * *** * * * * * SEQIDNO:1SEQIDNO:2 SEQIDNO:3 SEQIDNO:4 SEQIDNO:5 SEQIDNO:6SEOIDNO:7 SEQIDNO:10 SEQIDNO:12

Figure1A OME

O MAD101QM MAD1 MAD2MAD3 MAD4 MAD6S MAD8 MAD12T U . S . Patent Oct. 8 , 2019 Sheet 2 of 21 US 10 ,435 ,714 B2 Bsalsites

.

.: Figure2

.

3 . MADproteinexpression2 * 6XHistaggedMADvariant

.

.

promoter(proB) U . S . Patent Oct. 8 , 2019 SheetSheet 3 ofof 21 US 10 ,435 ,714 B2

.

.

.

.

.

. . ???? wwwwwwwwwwwwwwwwwwwwwww guidenucleicacid wwwwwwwwwwwwwwwwwwww

.

. Figure3 ??????????? AGCAGOMITATOAICIGCOGI

888 ????GITATAATAACT6cca U . S . Patent Oct. 8 , 2019 Sheet 4 of 21 US 10 ,435 ,714 B2 CompatibleMAD nucleaseandguide nucleicacid combinations identified.

* 2-DOGplates covo Figure4

. . . Editing ; Noediting galKOFFediting cassettes(heterologous)

MADvariantX U . S . Patent Oct. 8 , 2019 Sheet 5 of 21 US 10 ,435 ,714 B2

(whenMADnucleaseandguidenucleicacidare activeandgenewassuccessfullyedited) Editing Guidenucleicacid Screening/selectionforgeneedits MADnucleasevector promoter Editingcassette promöter

(whenMADnucleaseandguide nucleicacidareinactive) nucleaseandguidenucleic promoter CellDeath(whenMAD acidareactive) NoEditing Figure5A Figure5B 5CFigure U . S . Patent Oct. 8 , 2019 Sheet 6 of 21 US 10 ,435 ,714 B2

(noactivecleaving) target RandomPAMS Cellsurvival PAMselectionassay (selftargeting) MADnucleasevector promotër Editingcassette Guidenucleicacid promoter

CellDeath(whenMAD nucleaseandguidenucleic promoter acidareactive) Figure6A Figure6B Figure6C atent Oct. 8 , 2019 Sheet 7 of 21 US 10 ,435 , 714 B2

wwww . o wwwwwwwwwwwwwwww

*

* 8888 512

24

999999999999999 9999999999999999999999999999

22 MAD2

?? Figure7A 8388 12410111213 MAD1 FOOT 103 Nuclease: ) DNA ug / CFU( Scaffold TotalCFU EditedCFU Efficiency Transformation U . S . Patent Oct. 8 , 2019 Sheet 8 of 21 US 10 ,435 ,714 B2

TotalCFU EditedCFU

XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX 410111213 WY21 fd MAD7 iFigure7B 12

10111213 MAD4 24 ?????- ?? -?????????? - ?? - ??- ??- ??- ??- ?? - ?? - ??????? - ???????????????- ??????????? ????????????????????????????- ??- ?? ???????????? ??????- ??????? ??? -???? . ??????????????????? ???????????? ??????????????????????????????????????????????? Scaffold.1 Nuclease: ) DNA ug / CFU( Efficiency Transformation U . S . Patent Oct. 8 , 2019 Sheet 9 of 21 US 10 ,435 ,714 B2

Figure8 1234567891011121314

ww Nuclease:MAD2Cas9 Scaffold:1210511Sp24113

CFU/g(edited)CFU/g(total)

plasmid CFUlug atent Oct. 8 , 2019 Sheet 10 of 21 US 10 ,435 ,714 B2

WCFU/g(edited)

jual % 97 5 MAD7 11??

% 100 4 Figure9 MAD7 10

3 MAD7 4

2 MAD7 2

1

MAD7 1 Nuclease: Scaffold:Scaffold: U . S . Patent Oct. 8 , 2019 Sheet 11 of 21 US 10 ,435 , 714 B2

.

.

.

.

.

.

.

.

. Figure10B

TU efficency Transformation

8

932 Figure10A X

*

*

*

*

*

*.

*

*

.

* Figure10C ) Total Wyte efficiency Editing U . S . Patent Oct. 8 , 2019 Sheet 12 of 21 US 10 ,435 , 714 B2

:

24 .

GOGAIGOGOOGAT .

PAM Figure11funt ALCAGCICOACGAOITTOOCUCGAYGCGOCCAY

.

. CAGCCGATTATGAAAATCAGCTCGACGAGTTTICCCTCGATGCGCCCATG codonsTarget. SEQIDNO:187CCGTALAGCGCOGAGGGAGCOALGAONALICAGCCGACGAGIOCOCGAIGOGOCCAT SEQIDNO:188COMMAGITOCGTGAYGGCAGOCGALIAGARAATGAGICGACOAGITOCCICSAYGCGCCCAT SEQIDNO:187CCGTAAAGIICGCGIGAYGGCAGCOLLAAAAYCAGOTCGACGGMCCOYOGAYGCGCGCAY SEQIDNO:187CCOTMAAGIICGCGIKAIGGCAGCOTAGASAAYCAGOTCGACONMOCOTOGAGCGOCCAY SEQIDNO:187COGTAAAOMCOCOTCAYGOCAGCOAGADICAGCICGACGAMTOCOUCGAYCOCCCAT .YCAYUGAMCOGTAAAGIICGOGY187NO:IDSEO SEQIDNO:187COGIKAA61CGOGIGANGGUAGCONTAIGALAXICAGOTOGAGAG SEOIDNO:187CCOTAAAGIICCCOLGAIGOCACCO SAYYCOGTAAAGIICGCGUGAYGGCAGOOGATTAYA186NO:SEOID CCGTAAAGTTGNO:189SEQID U . S . Patent Oct. 8 , 2019 Sheet 13 of 21 US 10 ,435 ,714 B2 noselection selection www Figure12 >4,000fold enrichment

>11,000fold enrichment HIJtarget- non /target - on of Ratio U . S . Patent Oct. 8 , 2019 Sheet 14 of 21 US 10 ,435 ,714 B2

.

.

.

.

.

..

.

..

.

.

. WWW . . .

.

.

. Pseudoknotregion

GTGTTTCC-G.ACTATCCTCCTATTTGAC-ATGAG CAACTGAAAAGATTAC.TRADAT . COGTOTAAMWACMHTTTWKAATTTCTACT-RTTGTAGAT : ve 13AFigure SKINN/YXLFEPLU. CCTGCAGGCCGGCAMICTAEGSTIGAT

Los10 GGTTCCTT

C Consensus Frame1 Identity Scaffold-1 Scaffold-2 Scaffold-3 Scaffold-4 Scaffold-5 Scaffold-6 Scaffold-7 Scaffold-8 Scaffold-10 Scaffold-11 Scaffold-12 atent Oct. 8 , 2019 Sheet 15 of 21 US 10 ,435 ,714 B2

MAD2(green) MAD12(grey)

* Figure13B Pseudoknot

Guidenucleicacid(red) U . S . Patent Oct . 8 , 2019 Sheet 16 of 21 US 10 ,435 ,714 B2

.

.

.

. .: 11

. Figure14B MAD11templateqPCR

33 . . results *MAD11ormers .

tititiktinititititi * * * * * * * * * ** * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * it

......

Listedesestaratencionesde 2920020 sei

Concentrationofinpixvector(nguo Figure14A Cas9plasmidqPCR estasinstansiesienicienseinketarekinistracinisanie. NY *MAD11primers atent Oct. 8 , 2019 Sheet 17 of 21 US 10, 435 ,714 B2

Figure15 ( odwipolo xm wwwwwww atent Oct. 8 , 2019 Sheet 18 of 21 US 10 ,435 ,714 B2

.

.

. .: ...... 1 ...... : : : 100%

.

.

.

.

.

.

. . 80% .

.

.

.

.

.

.

. . . .

.

Figure16 .

. *** . 40%60 .

. PlateBasedEditingEffeciency

.

. 1.' 20%

Effeciency Editing Molecular U . S . Patent Oct. 8 , 2019 Sheet 19 of 21 US 10 ,435 ,714 B2

. 170Figure .

.

.

Y * OOQQÒ Figure17A

.0 Figure17B US Patentatent Octmakao. 8 , 2019 m Sheet 20 anusof 21 US 10 massen ,435 ,714 B2

.

.

.

.ons E.coligeltone Figure18 w . 000*** *timi **** *** * *** *

Round1

i

** * U . S . Patent Oct . 8 , 2019 Sheet 21 of 21 US 10 ,435 ,714 B2

**

. .

. Figure19 1.oo 9 hoo US 10 ,435 , 714 B2 NUCLEIC ACID -GUIDED NUCLEASES Disclosed herein are compositions for use in genome editing comprising a non -naturally occurring nuclease CROSS REFERENCE encoded by a nucleic acid having at least 75 % identity to SEQ ID NO : 22. In some aspects , the nucleic acid has at This application is a continuation application of U . S . 5 least 80 % identity to SEQ ID NO : 22 . In some aspects , the patent application Ser . No. 15 /631 , 989 , filed Jun . 23 , 2017 , nucleic acid has at least 90 % identity to SEQ ID NO : 22 . In now U . S . Pat . No. 10 ,011 , 849, which is incorporated herein some aspects , the nuclease is further codon optimized for by reference in its entirety. use in cells from a particular organism . In some aspects , the nuclease is codon optimized for E . Coli In some aspects, the BACKGROUND OF THE DISCLOSURE 10 nuclease is codon optimized for S . Cerevisiae . In some aspects , the nuclease is codon optimized for mammalian Nucleic acid - guided nucleases have become important cells . In some aspects , the nucleic acid -guided nuclease has tools for research and genome engineering. The applicability less than 40 % protein identity to SEQ ID NO : 12 . In some of these tools can be limited by the sequence specificity 15 aspects , the nucleic acid - guided nuclease has less than 40 % requirements , expression , or delivery issues. protein identity to SEQ ID NO : 108 . SEQUENCE LISTING INCORPORATION BY REFERENCE The instant application contains a Sequence Listing which has been submitted electronically in ASCII format and is 200 All publications and patent applications mentioned in this hereby incorporated by reference in its entirety . Said ASCII specification are herein incorporated by reference to the copy , created on Jul. 14 , 2017 , is named 49022 same extent as if each individual publication or patent 716 _ 201 _ SL . txt and is 809, 751 bytes in size. This applica application was specifically and individually indicated to be tion contains a partial sequence list in Table 6 . incorporated by reference . 25 SUMMARY OF THE DISCLOSURE BRIEF DESCRIPTION OF THE DRAWINGS Disclosed herein are methods ofmodifying a target region This patent or application file contains at least one draw in the genome of a cell , the method comprising : ( a ) con - ing executed in color. Copies of this patent or patent tacting a cell with : a non -naturally occurring nucleic -acid - 30 application publication with color drawing ( s ) will be pro guided nuclease encoded by a nucleic acid having at least vided by the Office upon request and payment of the 80 % identity to SEQ ID NO : 22 ; an engineered guide necessary fee . nucleic acid capable of complexing with the nucleic acid - FIG . 1A depicts a partial sequence alignment of MAD1- 8 guided nuclease ; and an editing sequence encoding a nucleic (SEQ ID NO : 1 -8 ) ,MAD10 , and MAD12 (SEQ ID NO : 10 , acid complementary to said target region having a change in 35 12 ). FIG . 1A discloses residues 703 -707 of SEQ ID NO : 1 , sequence relative to the target region ; and ( b ) allowing the residues 625 -629 of SEQ ID NO : 2, residues 587 - 591 of nuclease, guide nucleic acid , and editing sequence to create SEQ ID NO : 3 , residues 654 -658 of SEQ ID NO : 4 , residues a genome edit in a target region of the genome of the cell . 581 - 585 ofSEQ ID NO : 5 , residues 637 -641 of SEQ ID NO : In some aspects , the engineered guide nucleic acid and the 6 , residues 590 - 594 of SEQ ID NO : 7 , residues 645 -649 of editing sequence are provided as a single nucleic acid . In 40 SEQ ID NO : 8 , residues 619 -623 of SEQ ID NO : 10 and some aspects , the single nucleic acid further comprises a residues 603 -607 of SEQ ID NO : 12 , all disclosed respec mutation in a protospacer adjacent motif ( PAM ) site . In tively , in order of appearance . some aspects , the nucleic acid - guided nuclease is encoded FIG . 1B depicts a phylogenetic tree of nucleases includ by a nucleic acid with at least 85 % identity to SEQ ID NO : ing MAD1 - 8 . 42 . In some aspects , the nucleic acid - guided nuclease is 45 FIG . 2 depicts an example protein expression construct . encoded by a nucleic acid with at least 85 % identity to SEQ FIG . 2 discloses “ 6X - His ” as SEQ ID NO : 182 . ID NO : 128 . FIG . 3 depicts an example editing cassette . FIG . 3 dis Disclosed herein are nucleic acid -guided nuclease sys - closes SEQ ID NOS 183 - 185, respectively , in order of tems comprising : ( a ) a non - naturally occurring nuclease appearance . encoded by a nucleic acid having at least 80 % identity to 50 FIG . 4 depicts an example screening or selection experi SEQ ID NO : 22 ; ( b ) an engineered guide nucleic acid ment workflow . capable of complexing with the nucleic acid - guided nucle FIG . 5A depicts an example protein expression construct . ase , and ( c ) an editing sequence having a change in sequence FIG . 5B depicts an example editing cassette . relative to the sequence of a target region in a genome of a FIG . 5C depicts an example screening or selection experi cell ; wherein the system results in a genome edit in the target 55 ment workflow . region in the genome of the cell facilitated by the nuclease , FIG . 6A depicts an example protein expression construct . the engineered guide nucleic acid , and the editing sequence . FIG . 6B depicts an example editing cassette . In some aspects , nucleic acid - guided nuclease is encoded by FIG . 6C depicts an example screening or selection experi a nucleic acid with at least 85 % identity to SEQ ID NO : 42 . ment workflow . In some aspects , the nucleic acid - guided nuclease is encoded 60 FIG . 7A - 7B depicts example data from a functional by a nucleic acid with at least 85 % identity to SEQ ID NO : nuclease complex screening or selection experiment. 128 . In some aspects , the nucleic acid - guided nuclease is FIG . 8 depicts example data from a targetable nuclease codon optimized for the cell to be edited . In some aspects , complex -based editing experiment. the engineered guide nucleic acid and the editing sequence FIG . 9 depicts example data from a targetable nuclease are provided as a single nucleic acid . In some aspects, the 65 complex - based editing experiment. single nucleic acid further comprises a mutation in a proto - FIGS. 10A - 10C depict example data from a targetable spacer adjacent motif (PAM ) site . nuclease complex -based editing experiment. US 10 ,435 , 714 B2 FIG . 11 depicts an example sequence alignment of select that require PAM recognition are also limited in the sequences from an editing experiment. FIG . 11 discloses sequences they can target throughout a genetic sequence . SEQ ID NOS 187 , 188 , 187, 187, 187, 187 , 187 , 189 , 186 , Other challenges include processivity , target recognition 187 and 187 , respectively , in order of appearance . specificity and efficiency, and nuclease acidity efficiency , FIG . 12 depicts example data from a targetable nuclease 5 which often effect genetic editing efficiency . complex -based editing experiment . Non - naturally occurring targetable nucleases and non FIG . 13A depicts an example alignment of scaffold naturally occurring targetable nuclease systems can address sequences (SEQ ID NOS 190 - 202 , respectively , in order of many of these challenges and limitations . appearance ) . Disclosed herein are non - naturally targetable nuclease FIG . 13B depicts an example model of a nucleic acid - 10 systems. Such targetable nuclease systems are engineered to guided nuclease complexed with a guide nucleic acid and a address one or more of the challenges described above and target sequence . can be referred to as engineered nuclease systems. Engi FIG . 14A -14B depict example data from a primer vali neered nuclease systems can comprise one or more of an dation experiment. engineered nuclease, such as an engineered nucleic acid FIG . 15 depicts example data from a targetable nuclease 15 guided nuclease , an engineered guide nucleic acid , an engi complex -based editing experiment. neered polynucleotides encoding said nuclease , or an engi FIG . 16 depicts example validation data comparing neered polynucleotides encoding said guide nucleic acid . results from two different assays . Engineered nucleases , engineered guide nucleic acids, and FIG . 17A - 17C depict an example trackable genetic engi engineered polynucleotides encoding the engineered nucle neering workflow , including a plasmid comprising an edit - 20 ase or engineered guide nucleic acid are not naturally ing cassette and a recording cassette , and downstream occurring and are not found in nature . It follows that sequencing of barcodes in order to identify the incorporated engineered nuclease systems including one or more of these edit or mutation . FIG . 17B discloses SEO ID NOS 203 and elements are non -naturally occurring . 204 , respectively, in order of appearance . Non - limiting examples of types of engineering that can be FIG . 18 depicts an example trackable genetic engineering 25 done to obtain a non - naturally occurring nuclease system are workflow , including iterative rounds of engineering with a as follows. Engineering can include codon optimization to different editing cassette and recorder cassette with unique facilitate expression or improve expression in a host cell , barcode (BC ) at each round , which can be followed by such as a heterologous host cell . Engineering can reduce the selection and tracking to confirm the successful engineering size or molecular weight of the nuclease in order to facilitate step at each round . 30 expression or delivery . Engineering can alter PAM selection FIG . 19 depicts an example recursive engineering work - in order to change PAM specificity or to broaden the range flow . of recognized PAMs. Engineering can alter, increase, or decrease stability , processivity , specificity, or efficiency of a DETAILED DESCRIPTION OF THE targetable nuclease system . Engineering can alter , increase , DISCLOSURE 35 or decrease protein stability . Engineering can alter, increase , or decrease processivity of nucleic acid scanning . Engineer The present disclosure provides nucleic acid - guided ing can alter , increase , or decrease target sequence specific nucleases and methods of use . Often , the subject nucleic ity. Engineering can alter , increase , or decrease nuclease acid guided nucleases are part of a targetable nuclease activity . Engineering can alter , increase , or decrease editing system comprising a nucleic acid - guided nuclease and a 40 efficiency . Engineering can alter, increase , or decrease trans guide nucleic acid . A subject targetable nuclease system can formation efficiency . Engineering can alter , increase , or be used to cleave , modify, and/ or edit a target polynucleotide decrease nuclease or guide nucleic acid expression . sequence , often referred to as a target sequence . A subject Examples of non - naturally occurring nucleic acid targetable nuclease system refers collectively to transcripts sequences which are disclosed herein include sequences and other elements involved in the expression of or directing 45 codon optimized for expression in bacteria , such as E . coli the activity of genes, which may include sequences encod - ( e . g ., SEQ ID NO : 41 -60 ) , sequences codon optimized for ing a subject nucleic acid - guided nuclease protein and a expression in single cell eukaryotes, such as yeast ( e .g . , SEQ guide nucleic acid as disclosed herein . ID NO : 127 - 146 ) , sequences codon optimized for expres Methods , systems, vectors , polynucleotides, and compo - sion in multi cell eukaryotes , such as human cells ( e . g . , SEQ sitions described herein may be used in various applications 50 ID NO : 147 - 166 ) , polynucleotides used for cloning or including altering or modifying synthesis of a gene product, expression of any sequences disclosed herein ( e . g . , SEQ ID such as a protein , polynucleotide cleavage , polynucleotide NO : 61 - 80 ), plasmids comprising nucleic acid sequences editing , polynucleotide splicing; trafficking of target poly - ( e . g . , SEQ ID NO : 21 - 40 ) operably linked to a heterologous nucleotide , tracing of target polynucleotide , isolation of promoter or nuclear localization signal or other heterologous target polynucleotide , visualization of target polynucleotide , 55 element, proteins generated from engineered or codon opti etc . Aspects of the invention also encompass methods and mized nucleic acid sequences ( e . g . , SEQ ID NO : 1 - 20 ) , or uses of the compositions and systems described herein in engineered guide nucleic acids comprising any one of SEQ genome engineering , e . g . for altering or manipulating the ID NO : 84 - 107 . Such non -naturally occurring nucleic acid expression of one or more genes or the one or more gene sequences can be amplified , cloned , assembled , synthesized , products, in prokaryotic , archaeal, or eukaryotic cells , in 60 generated from synthesized oligonucleotides or dNTPs, or vitro , in vivo or ex vivo . otherwise obtained using methods known by those skilled in Nucleic Acid -Guided Nucleases the art. Bacterial and archaeal targetable nuclease systems have Disclosed herein are nucleic acid - guided nucleases . Sub emerged as powerful tools for precision genome editing . ject nucleases are functional in vitro , or in prokaryotic , However, naturally occurring nucleases have some limita - 65 archaeal , or eukaryotic cells for in vitro , in vivo , or ex vivo tions including expression and delivery challenges due to the applications. Suitable nucleic acid - guided nucleases can be nucleic acid sequence and protein size . Targetable nucleases from an organism from a genus which includes but is not US 10 , 435 , 714 B2 6 limited to Thiomicrospira , Succinivibrio , Candidatus, Por - salsuginis , N . tergarcus; S. auricularis , S . carnosus ; N . phyromonas, Acidaminococcus, Acidomonococcus, Pre meningitides, N . gonorrhoeae; L . monocytogenes , L . iva votella , Smithella ,Moraxella , Synergistes, Francisella , Lep novii ; C . botulinum , C . difficile , C . tetani, C . sordellii ; tospira, Catenibacterium , Kandleria , Clostridium , Dorea , Francisella tularensis 1 , Prevotella albensis , Lachno Coprococcus, Enterococcus, Fructobacillus, Weissella , 5 spiraceae bacterium MC2017 1 , Butyrivibrio proteoclasti Pediococcus, Corynebacter, Sutterella , Legionella , cus, Butyrivibrio proteoclasticus B316 , Peregrinibacteria Treponema, Roseburia , Filifactor, Eubacterium , Streptococ - bacterium GW2011 GWA2 3310 , Parcubacteria bacte cus, Lactobacillus, Mycoplasma, Bacteroides, Flaviivola , rium GW2011 _ GWC2_ 44 _ 17 , Smithella sp . SCADC , Flavobacterium , Sphaerochaeta , Azospirillum , Gluconac Acidaminococcus sp . BV3L6 , Lachnospiraceae bacterium etobacter, Neisseria , Roseburia , Parvibaculum , Staphylo - 10 MA2020 , Candidatus Methanoplasma termitum , Eubacte coccus , Nitratifractor, Mycoplasma , Alicyclobacillus, Brevi - rium eligens , Moraxella bovoculi 237 , Leptospira inadai, bacilus, Bacillus, Bacteroidetes , Brevibacilus, Lachnospiraceae bacterium ND2006 , Porphyromonas cre Carnobacterium , Clostridiaridium , Clostridium , Desul- vioricanis 3 , Prevotella disiens, Porphyromonas macacae, fonatronum , Desulfovibrio , Helcococcus, Leptotrichia , List - Catenibacterium sp . CAG : 290 , , eria , Methanomethyophilus, Methylobacterium , Opituta - 15 Clostridiales bacterium KA00274 , Lachnospiraceae bacte ceae , Paludibacter , Rhodobacter , Sphaerochaeta , rium 3 -2 , Dorea longicatena , Coprococcus catus GD /7 , Tuberibacillus, Oleiphilus, Omnitrophica , Parcubacteria , Enterococcus columbae DSM 7374 , Fructobacillus sp . and Campylobacter . Species of organism of such a genus EFB -N1 , Weissella halotolerans, Pediococcus acidilactici , can be as otherwise herein discussed . Suitable nucleic acid Lactobacillus curvatus, Streptococcus pyogenes, Lactoba guided nucleases can be from an organism from a genus or 20 cillus versmoldensis , Filifactor alocis ATCC 35896 , Alicy unclassified genus within a kingdom which includes but is clobacillus acidoterrestris , Alicyclobacillus acidoterrestris not limited to Firmicute , Actinobacteria , Bacteroidetes, Pro - ATCC 49025 , Desulfovibrio inopinatus, Desulfovibrio teobacteria , Spirochates , and Tenericutes . Suitable nucleic inopinatus DSM 10711, Oleiphilus sp . Oleiphilus sp . acid - guided nucleases can be from an organism from a genus H10009 , Candidtus kefeldibacteria , Parcubacteria Cas Y . 4 , or unclassified genus within a phylum which includes but is 25 Omnitrophica WOR 2 bacterium GWF2 , Bacillus sp . not limited to , Clostridia , Bacilli, Actino - NSP2. 1 , and Bacillus thermoamylovorans. bacteria , Bacteroidetes, Flavobacteria , Alphaproteobacte - In some instances, a nucleic acid - guided nuclease dis ria , Betaproteobacteria , Gammaproteobacteria , Deltapro - closed herein comprises an amino acid sequence comprising teobacteria , Epsilonproteobacteria , Spirochaetes, and at least 50 % amino acid identity to any one of SEQ ID NO : Mollicutes. Suitable nucleic acid - guided nucleases can be 30 1 - 20 . In some instances, a nuclease comprises an amino acid from an organism from a genus or unclassified genus within sequence comprising at least about 10 % , 20 % , 30 % , 40 % , an order which includes but is not limited to Clostridiales, 50 % , 60 % , 65 % , 70 % , 75 % , 80 % , 85 % , 90 % , 95 % , greater Lactobacillales, Actinomycetales, Bacteroidales, Flavobac than 95 % , or 100 % amino acid identity to any one of SEQ teriales, Rhizobiales, Rhodospirillales, Burkholderiales, ID NO : 1 - 20 . In some cases, the nucleic acid - guided nucle Neisseriales, Legionellales, Nautiliales, Campylobacter - 35 ase comprises at least about 50 % , 60 % , 65 % , 70 % , 75 % , ales , Spirochaetales, Mycoplasmatales, and Thiotrichales . 80 % , 85 % , 90 % , 95 % , greater than 95 % , amino acid identity Suitable nucleic acid - guided nucleases can be from an to any one of SEQ ID NO : 1 - 20 . In some cases, the nucleic organism from a genus or unclassified genus within a family acid - guided nuclease comprises at least about 50 % , 60 % , which includes but is not limited to Lachnospiraceae , 65 % , 70 % , 75 % , 80 % , 85 % , 90 % , 95 % , greater than 95 % , Enterococcaceae, Leuconostocaceae, Lactobacillaceae, 40 amino acid identity to any one of SEQ ID NO : 1 -8 or 10 - 12 . Streptococcaceae , Peptostreptococcaceae , Staphylococca - In some cases , the nucleic acid - guided nuclease comprises at ceae, Eubacteriaceae , Corynebacterineae , Bacteroidaceae, least about 50 % , 60 % , 65 % , 70 % , 75 % , 80 % , 85 % , 90 % , Flavobacterium , Cryomoorphaceae , Rhodobiaceae , Rho - 95 % , greater than 95 % , amino acid identity to any one of dospirillaceae, Acetobacteraceae , Sutterellaceae , Neisseri SEQ ID NO : 1 - 8 or 10 - 11. In some cases, the nucleic aceae, Legionellaceae , Nautiliaceae , Campylobacteraceae , 45 acid - guided nuclease comprises at least about 50 % , 60 % , Spirochaetaceae , Mycoplasmataceae, Pisciririckettsiaceae , 65 % , 70 % , 75 % , 80 % , 85 % , 90 % , 95 % , greater than 95 % , and Francisellaceae . Other nucleic acid - guided nucleases amino acid identity to SEQ ID NO : 2 . In some cases, the have been describe in US Patent Application Publication No . nucleic acid - guided nuclease comprises at least about 50 % , US20160208243 filed Dec . 18 , 2015 , US Application Pub - 60 % , 65 % , 70 % , 75 % , 80 % , 85 % , 90 % , 95 % , greater than lication No. US20140068797 filed Mar. 15 , 2013 , U . S . Pat. 50 95 % , amino acid identity to SEQ ID NO : 7 . No. 8 ,697 ,359 filed Oct. 15 , 2013 , and Zetsche et al. , Cell In some cases , the nucleic acid - guided nuclease com 2015 Oct . 22 ; 163 ( 3 ) :759 - 71, each of which are incorporated prises any one of SEQ ID NO : 1 - 20 . In some cases , the herein by reference in their entirety . nucleic acid - guided nuclease comprises any one of SEQ ID Some nucleic acid - guided nucleases suitable for use in the NO : 1 - 8 or 10 - 12 . In some cases , the nucleic acid - guided methods , systems, and compositions of the present disclo - 55 nuclease comprises any one of SEQ ID NO : 1 - 8 or 10 - 11 . In sure include those derived from an organism such as , but not some cases , the nucleic acid - guided nuclease comprises limited to , Thiomicrospira sp . XS5 , Eubacterium rectale , SEQ ID NO : 2 . In some cases , the nucleic acid - guided Succinivibrio dextrinosolvens, Candidatus Methanoplasma nuclease comprises SEQ ID NO : 7 . termitum , Candidatus Methanomethylophilus alvus, Por - In some instances, a nucleic acid - guided nuclease com phyromonas crevioricanis , Flavobacterium branchiophi- 60 prises an amino acid sequence comprising at most 10 % , lum , Acidaminococcus Sp . , Acidomonococcus sp ., Lachno - 15 % , 20 % , 25 % , 30 % , 35 % , 40 % , 45 % , 50 % , 55 % , 60 % , spiraceae bacterium COE1, Prevotella brevis ATCC 19188 , 65 % , 70 % , 75 % , 80 % , 85 % , 90 % , or 95 % amino acid Smithella sp . SCADC , Moraxella bovoculi , Synergistes identity to any one of SEQ ID NO : 12 or SEQ ID NO : jonesii , Bacteroidetes oral taxon 274 , Francisella tularensis , 108 - 110 . In some instances, a nucleic acid - guided nuclease Leptospira inadai serovar Lyme str . 10 , Acidomonococcus 65 comprises an amino acid sequence comprising at most 50 % sp . crystal structure (5B43 ) S . mutans, S . agalactiae , S . amino acid identity to any one of SEQ ID NO : 12 or SEO equisimilis, S. sanguinis, S. pneumonia ; C . jejuni, C . coli; N . ID NO : 108 -110 . In some instances , a nucleic acid - guided US 10 ,435 , 714 B2 nuclease comprises an amino acid sequence comprising at positive bacteria , e. g. , Bacillus subtilis, or gram negative most 45 % amino acid identity to any one of SEQ ID NO : 12 bacteria , e . g . , E . coli . In some instances, a nucleic acid or SEQ ID NO : 108 - 110 . In some instances , a nucleic guided nuclease disclosed herein is encoded by a nucleic acid - guided nuclease comprises an amino acid sequence acid sequence comprising at least 50 % sequence identity to comprising at most 40 % amino acid identity to any one of 5 any one of SEQ ID NO : 41 - 60 . In some instances , a nuclease SEQ ID NO : 12 or SEQ ID NO : 108 - 110 . In some instances , is encoded by a nucleic acid sequence comprising at least a nucleic acid - guided nuclease comprises an amino acid about 10 % , 20 % , 30 % , 40 % , 50 % , 60 % , 65 % , 70 % , 75 % , sequence comprising atmost 35 % amino acid identity to any 80 % , 85 % , 90 % , 95 % , greater than 95 % , or 100 % sequence one of SEQ ID NO : 12 or SEQ ID NO : 108 - 110 . In some identity to any one of SEQ ID NO : 41 -60 . In some cases , the instances , a nucleic acid - guided nuclease comprises an 10 nucleic acid - guided nuclease is encoded by the nucleic acid amino acid sequence comprising at most 30 % amino acid sequence comprising at least about 50 % , 60 % , 65 % , 70 % , identity to any one of SEQ ID NO : 12 or SEQ ID NO : 75 % , 80 % , 85 % , 90 % , 95 % , greater than 95 % sequence 108 - 110 . identity to any one of SEQ ID NO : 41 -60 . In some cases, the In some instances , a nucleic acid - guided nuclease dis - nucleic acid - guided nuclease is encoded by the nucleic acid closed herein is encoded by a nucleic acid sequence com - 15 sequence comprising at least about 50 % , 60 % , 65 % , 70 % , prising at least 50 % sequence identity to any one of SEQ ID 75 % , 80 % , 85 % , 90 % , 95 % , greater than 95 % sequence NO : 21 - 40 . In some instances, a nuclease is encoded by a identity to any one of SEQ ID NO : 41 - 48 or 50 - 52. In some nucleic acid sequence comprising at least about 10 % , 20 % , cases, the nucleic acid - guided nuclease is encoded by the 30 % , 40 % , 50 % , 60 % , 65 % , 70 % , 75 % , 80 % , 85 % , 90 % , nucleic acid sequence comprising at least about 50 % , 60 % , 95 % , greater than 95 % , or 100 % sequence identity to any 20 65 % , 70 % , 75 % , 80 % , 85 % , 90 % , 95 % , greater than 95 % one of SEQ ID NO : 21 - 40 . In some instances , a nuclease is sequence identity to any one of SEQ ID NO : 41 - 48 or 50 -51 . encoded by a nucleic acid sequence comprising at least In some cases , the nucleic acid - guided nuclease is encoded about 50 % , 60 % , 65 % , 70 % , 75 % , 80 % , 85 % , 90 % , 95 % , by the nucleic acid sequence comprising at least about 50 % , greater than 95 % , sequence identity to any one of SEQ ID 60 % , 65 % , 70 % , 75 % , 80 % , 85 % , 90 % , 95 % , greater than NO : 21 - 40 . In some cases , the nucleic acid - guided nuclease 25 95 % sequence identity to SEQ ID NO : 42 . In some cases , the is encoded by the nucleic acid sequence comprising at least nucleic acid - guided nuclease is encoded by the nucleic acid about 50 % , 60 % , 65 % , 70 % , 75 % , 80 % , 85 % , 90 % , 95 % , sequence comprising at least about 50 % , 60 % , 65 % , 70 % , greater than 95 % , sequence identity to any one of SEQ ID 75 % , 80 % , 85 % , 90 % , 95 % , greater than 95 % sequence NO : 21 - 40 . In some cases , the nucleic acid - guided nuclease identity to SEQ ID NO : 47 . is encoded by the nucleic acid sequence comprising at least 30 In some cases , the nucleic acid - guided nuclease is about 50 % , 60 % , 65 % , 70 % , 75 % , 80 % , 85 % , 90 % , 95 % , encoded by the nucleic acid sequence of any one of SEQ ID greater than 95 % , sequence identity to any one of SEQ ID NO : 41 -60 . In some cases, the nucleic acid - guided nuclease NO : 21- 28 or 30 - 32 . In some cases , the nucleic acid - guided is encoded by the nucleic acid sequence of any one of SEQ nuclease is encoded by the nucleic acid sequence comprising ID NO : 41 -48 or 50 -52 . In some cases, the nucleic acid at least about 50 % , 60 % , 65 % , 70 % , 75 % , 80 % , 85 % , 90 % , 35 guided nuclease is encoded by the nucleic acid sequence of 95 % , greater than 95 % , sequence identity to any one of SEQ any one of SEQ ID NO : 41 - 48 or 50 -51 . In some cases , the ID NO : 21 - 28 or 30 - 31. In some cases , the nucleic acid nucleic acid - guided nuclease is encoded by the nucleic acid guided nuclease is encoded by the nucleic acid sequence sequence of SEQ ID NO : 42 . In some cases, the nucleic comprising at least about 50 % , 60 % , 65 % , 70 % , 75 % , 80 % , acid - guided nuclease is encoded by the nucleic acid 85 % , 90 % , 95 % , greater than 95 % , sequence identity to 40 sequence of SEQ ID NO : 47 . SEQ ID NO : 22 . In some cases , the nucleic acid - guided A nucleic acid sequence encoding a nucleic acid - guided nuclease is encoded by the nucleic acid sequence comprising nuclease can be codon optimized for expression in a species at least about 50 % , 60 % , 65 % , 70 % , 75 % , 80 % , 85 % , 90 % , of yeast, e . g . , S . cerevisiae. In some instances , a nucleic 95 % , greater than 95 % , sequence identity to SEQ ID NO : acid - guided nuclease disclosed herein is encoded by a 27 . 45 nucleic acid sequence comprising at least 50 % sequence In some cases, the nucleic acid - guided nuclease is identity to any one of SEQ ID NO : 127 - 146 . In some encoded by the nucleic acid sequence of any one of SEQ ID instances, a nuclease is encoded by a nucleic acid sequence NO : 21 -40 . In some cases, the nucleic acid - guided nuclease comprising at least about 10 % , 20 % , 30 % , 40 % , 50 % , 60 % , is encoded by the nucleic acid sequence of any one of SEQ 6 5 % , 70 % , 75 % , 80 % , 85 % , 90 % , 95 % , greater than 95 % , ID NO : 21 - 28 or 30 - 32 . In some cases , the nucleic acid - 50 or 100 % sequence identity to any one of SEQ ID NO : guided nuclease is encoded by the nucleic acid sequence of 127 - 146 . In some instances , a nuclease is encoded by a any one of SEQ ID NO : 21 - 28 or 30 - 31 . In some cases , the nucleic acid sequence comprising at least about 50 % , 60 % , nucleic acid - guided nuclease is encoded by the nucleic acid 65 % , 70 % , 75 % , 80 % , 85 % , 90 % , 95 % , greater than 95 % sequence of SEQ ID NO : 22 . In some cases , the nucleic sequence identity to any one of SEQ ID NO : 127 - 146 . In acid - guided nuclease is encoded by the nucleic acid 55 some cases, the nucleic acid -guided nuclease is encoded by sequence of SEQ ID NO : 27 . the nucleic acid sequence comprising at least about 50 % , In some instances , a nucleic acid - guided nuclease dis - 60 % , 65 % , 70 % , 75 % , 80 % , 85 % , 90 % , 95 % , greater than closed herein is encoded on a nucleic acid sequence . Such a 95 % sequence identity to any one of SEQ ID NO : 127 - 146 . nucleic acid can be codon optimized for expression in a In some cases, the nucleic acid - guided nuclease is encoded desired host cell. Suitable host cells can include , as non - 60 by the nucleic acid sequence comprising at least about 50 % , limiting examples, prokaryotic cells such as E . coli , P . 60 % , 65 % , 70 % , 75 % , 80 % , 85 % , 90 % , 95 % , greater than aeruginosa , B . subtilus, and V . natriegens, and eukaryotic 95 % sequence identity to any one of SEQ ID NO : 127 - 134 cells such as S. cerevisiae , plant cells, insect cells , nematode or 136 - 138. In some cases , the nucleic acid - guided nuclease cells , amphibian cells , fish cells , or mammalian cells , includ - is encoded by the nucleic acid sequence comprising at least ing human cells. 65 about 50 % , 60 % , 65 % , 70 % , 75 % , 80 % , 85 % , 90 % , 95 % , A nucleic acid sequence encoding a nucleic acid - guided greater than 95 % sequence identity to any one of SEQ ID nuclease can be codon optimized for expression in gram NO : 127 - 134 or 136 - 137 . In some cases, the nucleic acid US 10 , 435 , 714 B2 10 guided nuclease is encoded by the nucleic acid sequence ase system , such as a guide nucleic acid , or an editing or comprising at least about 50 % , 60 % , 65 % , 70 % , 75 % , 80 % , recorder cassette as disclosed herein . These larger nucleic 85 % , 90 % , 95 % , greater than 95º A sequence identity to acid sequences can be recombinant expression vectors, as SEQ ID NO : 128 . In some cases , the nucleic acid - guided are described in more detail later . nuclease is encoded by the nucleic acid sequence comprising 5 Guide Nucleic Acid at least about 50 % , 60 % , 65 % , 70 % , 75 % , 80 % , 85 % , 90 % , In general, a guide nucleic acid can complex with a 95 % , greater than 95 % sequence identity to SEQ ID NO : compatible nucleic acid - guided nuclease and can hybridize 133 . with a target sequence , thereby directing the nuclease to the In some cases, the nucleic acid - guided nuclease is target sequence . A subject nucleic acid - guided nuclease encoded by the nucleic acid sequence of any one of SEQ ID 10 capable of complexing with a guide nucleic acid can be NO : 127 - 146 . In some cases , the nucleic acid - guided nucle - referred to as a nucleic acid - guided nuclease that is com ase is encoded by the nucleic acid sequence of any one of patible with the guide nucleic acid . Likewise , a guide SEQ ID NO : 127 - 134 or 136 - 138 . In some cases, the nucleic nucleic acid capable of complexing with a nucleic acid acid - guided nuclease is encoded by the nucleic acid guided nuclease can be referred to as a guide nucleic acid sequence of any one of SEQ ID NO : 127 - 134 or 136 - 137 . In 15 that is compatible with the nucleic acid - guided nucleases . some cases , the nucleic acid -guided nuclease is encoded by A guide nucleic acid can be DNA . A guide nucleic acid the nucleic acid sequence of SEQ ID NO : 128 . In some can be RNA . A guide nucleic acid can comprise both DNA cases, the nucleic acid - guided nuclease is encoded by the and RNA . A guide nucleic acid can comprise modified of nucleic acid sequence of SEQ ID NO : 133 . non -naturally occurring nucleotides . In cases where the A nucleic acid sequence encoding a nucleic acid -guided 20 guide nucleic acid comprises RNA , the RNA guide nucleic nuclease can be codon optimized for expression in mam acid can be encoded by a DNA sequence on a polynucleotide malian cells . In some instances, a nucleic acid - guided nucle molecule such as a plasmid , linear construct , or editing ase disclosed herein is encoded by a nucleic acid sequence cassette as disclosed herein . comprising at least 50 % sequence identity to any one ofSEQ A guide nucleic acid can comprise a guide sequence . A ID NO : 147 - 166 . In some instances, a nuclease is encoded 25 guide sequence is a polynucleotide sequence having suffi by a nucleic acid sequence comprising at least about 10 % , cient complementarity with a target polynucleotide 20 % , 30 % , 40 % , 50 % , 60 % , 65 % , 70 % , 75 % , 80 % , 85 % , sequence to hybridize with the target sequence and direct 90 % , 95 % , greater than 95 % , or 100 % sequence identity to sequence- specific binding of a complexed nucleic acid any one of SEQ ID NO : 147 - 166 . In some cases, the nucleic guided nuclease to the target sequence . The degree of acid -guided nuclease is encoded by the nucleic acid 30 complementarity between a guide sequence and its corre sequence comprising at least about 50 % , 60 % , 65 % , 70 % , sponding target sequence , when optimally aligned using a 75 % , 80 % , 85 % , 90 % , 95 % , greater than 95 % sequence suitable alignment algorithm , is about or more than about identity to any one of SEQ ID NO : 147 - 166 . In some cases , 50 % , 60 % , 75 % , 80 % , 85 % , 90 % , 95 % , 97 . 5 % , 99 % , or the nucleic acid - guided nuclease is encoded by the nucleic more . Optimal alignmentmay be determined with the use of acid sequence comprising at least about 50 % , 60 % , 65 % , 35 any suitable algorithm for aligning sequences . In some 70 % , 75 % , 80 % , 85 % , 90 % , 95 % , greater than 95 % embodiments , a guide sequence is about or more than about sequence identity to any one of SEQ ID NO : 147 - 154 or 5 , 10 , 11 , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21, 22 , 23 , 24 , 156 - 158 . In some cases, the nucleic acid - guided nuclease is 25 , 26 , 27 , 28 , 29 , 30 , 35 , 40 , 45 , 50 , 75 , ormore nucleotides encoded by the nucleic acid sequence comprising at least in length . In some embodiments , a guide sequence is less about 50 % , 60 % , 65 % , 70 % , 75 % , 80 % , 85 % , 90 % , 95 % , 40 than about 75 , 50 , 45 , 40 , 35 , 30 , 25 , 20 nucleotides in greater than 95 % sequence identity to any one of SEO ID length . Preferably the guide sequence is 10 - 30 nucleotides NO : 147 - 154 or 156 - 157. In some cases, the nucleic acid - long. The guide sequence can be 15 - 20 nucleotides in length . guided nuclease is encoded by the nucleic acid sequence The guide sequence can be 15 nucleotides in length . The comprising at least about 50 % , 60 % , 65 % , 70 % , 75 % , 80 % , guide sequence can be 16 nucleotides in length . The guide 85 % , 90 % , 95 % , greater than 95 % sequence identity to SEQ 45 sequence can be 17 nucleotides in length . The guide ID NO : 148 . In some cases , the nucleic acid -guided nuclease sequence can be 18 nucleotides in length . The guide is encoded by the nucleic acid sequence comprising at least sequence can be 19 nucleotides in length . The guide about 50 % , 60 % , 65 % , 70 % , 75 % , 80 % , 85 % , 90 % , 95 % , sequence can be 20 nucleotides in length . greater than 95 % sequence identity to SEQ ID NO : 153 . A guide nucleic acid can comprise a scaffold sequence . In In some cases , the nucleic acid - guided nuclease is 50 general, a " scaffold sequence ” includes any sequence that encoded by the nucleic acid sequence of any one of SEQ ID has sufficient sequence to promote formation of a targetable NO : 147 - 166 . In some cases , the nucleic acid - guided nucle nuclease complex , wherein the targetable nuclease complex ase is encoded by the nucleic acid sequence of any one of comprises a nucleic acid - guided nuclease and a guide SEQ ID NO : 147 - 154 or 156 - 158 . In some cases, the nucleic nucleic acid comprising a scaffold sequence and a guide acid - guided nuclease is encoded by the nucleic acid 55 sequence . Sufficient sequence within the scaffold sequence sequence of any one of SEQ ID NO : 147 - 154 or 156 - 157 . In to promote formation of a targetable nuclease complex may some cases, the nucleic acid - guided nuclease is encoded by include a degree of complementarity along the length of two the nucleic acid sequence of SEQ ID NO : 148 . In someme sequence regions within the scaffold sequence , such as one cases , the nucleic acid - guided nuclease is encoded by the or two sequence regions involved in forming a secondary nucleic acid sequence of SEQ ID NO : 153 . 60 structure . In some cases, the one or two sequence regions are A nucleic acid sequence encoding a nucleic acid - guided comprised or encoded on the same polynucleotide . In some nuclease can be operably linked to a promoter. Such nucleic cases , the one or two sequence regions are comprised or acid sequences can be linear or circular . The nucleic acid encoded on separate polynucleotides. Optimal alignment sequences can be comprised on a larger linear or circular may be determined by any suitable alignment algorithm , and nucleic acid sequences that comprises additional elements 65 may further account for secondary structures, such as self such as an origin of replication , selectable or screenable complementarity within either the one or two sequence marker, terminator, other components of a targetable nucle - regions. In some embodiments , the degree of complemen US 10 , 435 , 714 B2 12 tarity between the one or two sequence regions along the acid - guided nucleases for use in the present disclosure can length of the shorter of the two when optimally aligned is bind to a pseudoknot region having at least 75 % , 80 % , 85 % , about or more than about 25 % , 30 % , 40 % , 50 % , 60 % , 70 % , 90 % , 95 % , or 100 % sequence identity to the pseudoknot 80 % , 90 % , 95 % , 97 . 5 % , 99 % , or higher . In some embodi region of Scaffold - 11 ( SEQ ID NO : 180 ) . Certain nucleic ments , at least one of the two sequence regions is about or 5 acid - guided nucleases for use in the present disclosure can more than about 5 , 6 , 7 , 8 , 9 , 10 , 11 , 12, 13 , 14 , 15 , 16 , 17 , bind to a pseudoknot region having at least 75 % , 80 % , 85 % , 18 , 19 , 20 , 25 , 30 , 40 , 50 , or more nucleotides in length . 90 % , 95 % , or 100 % sequence identity to the pseudoknot A scaffold sequence of a subject guide nucleic acid can region of Scaffold - 12 ( SEQ ID NO : 181 ) . comprise a secondary structure . A secondary structure can guide nucleic acid can comprise the sequence of any comprise a pseudoknot region . In some cases, binding 10 one of SEQ ID NO : 84 - 107 . A guide nucleic acid can kinetics of a guide nucleic acid to a nucleic acid - guided comprise the sequence of any one of SEQ ID NO : 84 - 103 . nuclease is determined in part by secondary structures A guide nucleic acid can comprise the sequence of any one within the scaffold sequence . In some cases, binding kinetics of SEQ ID NO : 84 -91 or 93 - 95 . A guide nucleic acid can of a guide nucleic acid to a nucleic acid -guided nuclease is comprise the sequence of any one of SEQ ID NO : 88 , 93, 94 , determined in part by nucleic acid sequence with the scaffold 15 or 95. A guide nucleic acid can comprise the sequence of sequence . SEQ ID NO : 88 . A guide nucleic acid can comprise the A scaffold sequence can comprise the sequence of any one sequence of SEQ ID NO : 93 . A guide nucleic acid can of SEQ ID NO : 84 - 107 . A scaffold sequence can comprise comprise the sequence of SEQ ID NO : 94 . A guide nucleic the sequence of any one of SEQ ID NO : 84 - 103 . A scaffold acid can comprise the sequence of SEQ ID NO : 95 . sequence can comprise the sequence of any one of SEQ ID 20 In aspects of the invention the terms " guide nucleic acid ” NO : 84 - 91 or 93 - 95 . A scaffold sequence can comprise the refers to one or more polynucleotides comprising 1 ) a guide sequence of any one of SEQ ID NO : 88 , 93 , 94 , or 95 . A sequence capable of hybridizing to a target sequence and 2 ) scaffold sequence can comprise the sequence of SEQ ID a scaffold sequence capable of interacting with or complex NO : 88 . A scaffold sequence can comprise the sequence of ing with an nucleic acid - guided nuclease as described SEQ ID NO : 93 . A scaffold sequence can comprise the 25 herein . A guide nucleic acid may be provided as one or more sequence of SEQ ID NO : 94 . A scaffold sequence can nucleic acids . In specific embodiments , the guide sequence comprise the sequence of SEQ ID NO : 95 . and the scaffold sequence are provided as a single poly In some aspects , the invention provides a nuclease that nucleotide . binds to a guide nucleic acid comprising a conserved scaf- A guide nucleic acid can be compatible with a nucleic fold sequence . For example , the nucleic acid - guided nucle - 30 acid - guided nuclease when the two elements can form a ases for use in the present disclosure can bind to a conserved functional targetable nuclease complex capable of cleaving pseudoknot region as shown in FIG . 13A . Specifically , the a target sequence . Often , a compatible scaffold sequence for nucleic acid - guided nucleases for use in the present disclo - a compatible guide nucleic acid can be found by scanning sure can bind to a guide nucleic acid comprising a conserved sequences adjacent to a native nucleic acid -guided nuclease pseudoknot region as shown in FIG . 13A . Certain nucleic 35 loci. In other words, native nucleic acid - guided nucleases acid - guided nucleases for use in the present disclosure can can be encoded on a genome within proximity to a corre bind to a pseudoknot region having at least 75 % , 80 % , 85 % , sponding compatible guide nucleic acid or scaffold 90 % , 95 % , or 100 % sequence identity to the pseudoknot sequence . region of Scaffold - 1 (SEO ID NO : 172 ) . Other nucleic Nucleic acid - guided nucleases can be compatible with acid - guided nucleases for use in the present disclosure can 40 guide nucleic acids that are not found within the nucleases bind to a pseudoknot region having at least 75 % , 80 % , 85 % , endogenous host. Such orthogonal guide nucleic acids can 90 % , 95 % , or 100 % sequence identity to the pseudoknot be determined by empirical testing . Orthogonal guide region of Scaffold - 3 (SEQ ID NO : 173 ). Still other nucleic nucleic acids can come from different bacterial species or be acid - guided nucleases for use in the present disclosure can synthetic or otherwise engineered to be non - naturally occur bind to a pseudoknot region having at least 75 % , 80 % , 85 % , 45 ring . 90 % , 95 % , or 100 % sequence identity to the pseudoknot Orthogonal guide nucleic acids that are compatible with a region of Scaffold - 4 (SEQ ID NO : 174 ) . Other nucleic common nucleic acid - guided nuclease can comprise one or acid - guided nucleases for use in the present disclosure can more common features . Common features can include bind to a pseudoknot region having at least 75 % , 80 % , 85 % , sequence outside a pseudoknot region . Common features 90 % , 95 % , or 100 % sequence identity to the pseudoknot 50 can include a pseudoknot region . Common features can region of Scaffold - 5 (SEQ ID NO : 175 ) . Other nucleic include a primary sequence or secondary structure . acid - guided nucleases for use in the present disclosure can A guide nucleic acid can be engineered to target a desired bind to a pseudoknot region having at least 75 % , 80 % , 85 % , target sequence by altering the guide sequence such that the 90 % , 95 % , or 100 % sequence identity to the pseudoknot guide sequence is complementary to the target sequence , region of Scaffold - 6 (SEQ ID NO : 176 ) . Still other nucleic 55 thereby allowing hybridization between the guide sequence acid - guided nucleases for use in the present disclosure can and the target sequence . A guide nucleic acid with an bind to a pseudoknot region having at least 75 % , 80 % , 85 % , engineered guide sequence can be referred to as an engi 90 % , 95 % , or 100 % sequence identity to the pseudoknot neered guide nucleic acid . Engineered guide nucleic acids region of Scaffold - 7 ( SEQ ID NO : 177 ) . Other nucleic are often non -naturally occurring and are not found in acid - guided nucleases for use in the present disclosure can 60 nature. bind to a pseudoknot region having at least 75 % , 80 % , 85 % , Targetable Nuclease System 90 % , 95 % , or 100 % sequence identity to the pseudoknot Disclosed herein are targetable nuclease systems. A tar region of Scaffold - 8 (SEQ ID NO : 178 ) . Other nucleic getable nuclease system can comprise a nucleic acid - guided acid - guided nucleases for use in the present disclosure can nuclease and a compatible guide nucleic acid . A targetable bind to a pseudoknot region having at least 75 % , 80 % , 85 % , 65 nuclease system can comprise a nucleic acid - guided nucle 90 % , 95 % , or 100 % sequence identity to the pseudoknot ase or a polynucleotide sequence encoding the nucleic region of Scaffold -10 (SEQ ID NO : 179 ) . Still other nucleic acid -guided nuclease . A targetable nuclease system can US 10 ,435 , 714 B2 13 14 comprise a guide nucleic acid or a polynucleotide sequence nucleic acid - guided nuclease of SEQ ID NO : 2 and a encoding the guide nucleic acid . compatible guide nucleic acid comprising SEQ ID NO : 94 . In general, a targetable nuclease system as disclosed A targetable nuclease complex can comprise a nucleic herein is characterized by elements that promote the forma acid - guided nuclease of SEQ ID NO : 2 and a compatible tion of a targetable nuclease complex at the site of a target 5 guide nucleic acid comprising SEQ ID NO : 95 . In any of sequence , wherein the targetable nuclease complex com - these cases , the guide nucleic acid can further comprise a prises a nucleic acid - guided nuclease and a guide nucleic guide sequence . The guide sequence can be engineered to acid . target any desired target sequence . The guide sequence can A guide nucleic acid together with a nucleic acid -guided be engineered to be complementary to any desired target nuclease forms a targetable nuclease complex which is 10 sequence . The guide sequence can be engineered to hybrid capable of binding to a target sequence within a target ize to any desired target sequence . polynucleotide , as determined by the guide sequence of the A targetable nuclease complex can comprise a nucleic guide nucleic acid . acid -guided nuclease of SEQ ID NO : 7 and a compatible In general, to generate a double stranded break , in most guide nucleic acid . A targetable nuclease complex can cases a targetable nuclease complex binds to a target 15 comprise a nucleic acid - guided nuclease of SEQ ID NO : 7 sequence as determined by the guide nucleic acid , and the and a compatible guide nucleic acid comprising any one of nuclease has to recognize a protospacer adjacent motif SEQ ID NO : 88 , 93 , 94 , or 95 . A targetable nuclease ( PAM ) sequence adjacent to the target sequence . complex can comprise a nucleic acid - guided nuclease of A targetable nuclease complex can comprise a nucleic SEQ ID NO : 7 and a compatible guide nucleic acid com acid - guided nuclease of any one of SEQ ID NO : 1 - 20 and a 20 prising SEQ ID NO : 88 . A targetable nuclease complex can compatible guide nucleic acid . A targetable nuclease com comprise a nucleic acid -guided nuclease of SEQ ID NO : 7 plex can comprise a nucleic acid - guided nuclease of any one and a compatible guide nucleic acid comprising SEQ ID of SEQ ID NO : 1 - 8 or 10 - 12 and a compatible guide nucleic NO : 93 . A targetable nuclease complex can comprise a acid . A targetable nuclease complex can comprise a nucleic nucleic acid - guided nuclease of SEQ ID NO : 7 and a acid - guided nuclease of any one of SEQ ID NO : 1 - 8 or 25 compatible guide nucleic acid comprising SEQ ID NO : 94 . 10 - 11 and a compatible guide nucleic acid . A targetable A targetable nuclease complex can comprise a nucleic nuclease complex can comprise a nucleic acid - guided nucle - acid - guided nuclease of SEQ ID NO : 7 and a compatible ase of SEQ ID NO : 2 and a compatible guide nucleic acid guide nucleic acid comprising SEQ ID NO : 95 . In any of A targetable nuclease complex can comprise a nucleic these cases , the guide nucleic acid can further comprise a acid - guided nuclease of SEQ ID NO : 7 and a compatible 30 guide sequence . The guide sequence can be engineered to guide nucleic acid . In any of these cases , the guide nucleic target any desired target sequence . The guide sequence can acid can comprise a scaffold sequence compatible with the be engineered to be complementary to any desired target nucleic acid - guided nuclease . In any of these cases, the sequence . The guide sequence can be engineered to hybrid guide nucleic acid can further comprise a guide sequence . ize to any desired target sequence . The guide sequence can be engineered to target any desired 35 target sequence of a targetable nuclease complex can be target sequence . The guide sequence can be engineered to be any polynucleotide endogenous or exogenous to a prokary complementary to any desired target sequence . The guide otic or eukaryotic cell , or in vitro . For example , the target sequence can be engineered to hybridize to any desired sequence can be a polynucleotide residing in the nucleus of target sequence. the eukaryotic cell. A target sequence can be a sequence A targetable nuclease complex can comprise a nucleic 40 coding a gene product ( e . g . , a protein ) or a non - coding acid - guided nuclease of any one of SEQ ID NO : 1 - 20 and a sequence ( e . g ., a regulatory polynucleotide or a junk DNA ) . compatible guide nucleic acid comprising any one of SEQ Without wishing to be bound by theory , it is believed that the ID NO : 84 - 107 . A targetable nuclease complex can comprise target sequence should be associated with a PAM ; that is , a a nucleic acid - guided nuclease of any one of SEQ ID NO : short sequence recognized by a targetable nuclease complex . 1 - 8 or 10 - 12 and a compatible guide nucleic acid comprising 45 The precise sequence and length requirements for a PAM any one of SEQ ID NO : 84 - 95 . A targetable nuclease differ depending on the nucleic acid - guided nuclease used , complex can comprise a nucleic acid - guided nuclease of any but PAMs are typically 2 - 5 base pair sequences adjacent the one of SEQ ID NO : 1 - 8 or 10 - 11 and a compatible guide target sequence . Examples of PAM sequences are given in nucleic acid comprising any one of SEQ ID NO : 84 - 91 or the examples section below , and the skilled person will be 93 -95 . In any of these cases , the guide nucleic acid can 50 able to identify further PAM sequences for use with a given further comprise a guide sequence . The guide sequence can nucleic acid - guided nuclease . Further , engineering of the be engineered to target any desired target sequence . The PAM Interacting (PI ) domain may allow programming of guide sequence can be engineered to be complementary to PAM specificity , improve target site recognition fidelity , and any desired target sequence . The guide sequence can be increase the versatility of a nucleic acid - guided nuclease engineered to hybridize to any desired target sequence. 55 genome engineering platform . Nucleic acid - guided nucle A targetable nuclease complex can comprise a nucleic ases may be engineered to alter their PAM specificity , for acid - guided nuclease of SEO ID NO : 2 and a compatible example as described in Kleinstiver B P et al . Engineered guide nucleic acid . A targetable nuclease complex can CRISPR - Cas9 nucleases with altered PAM specificities. comprise a nucleic acid - guided nuclease of SEQ ID NO : 2 Nature . 2015 Jul. 23 ; 523 (7561 ) : 481- 5 . doi: 10 . 1038 / and a compatible guide nucleic acid comprising any one of 60 nature14592 . SEQ ID NO : 88 , 93 , 94 , or 95 . A targetable nuclease PAM site is a nucleotide sequence in proximity to a complex can comprise a nucleic acid - guided nuclease of target sequence . In most cases , a nucleic acid - guided nucle SEQ ID NO : 2 and a compatible guide nucleic acid com - ase can only cleave a target sequence if an appropriate PAM prising SEQ ID NO : 88 . A targetable nuclease complex can is present. PAMs are nucleic acid - guided nuclease -specific comprise a nucleic acid - guided nuclease of SEQ ID NO : 2 65 and can be different between two different nucleic acid and a compatible guide nucleic acid comprising SEQ ID guided nucleases . A PAM can be 5 ' or 3 ' of a target sequence . NO : 93 . A targetable nuclease complex can comprise a APAM can be upstream or downstream of a target sequence . US 10 ,435 , 714 B2 15 16 A PAM can be 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 ormore nucleotides be yeast , fungi, algae, plant, animal, or human cells . Eukary in length . Often , a PAM is between 2 -6 nucleotides in length . otic cells may be those of or derived from a particular In some examples , a PAM can be provided on a separate organism , such as a mammal, including but not limited to oligonucleotide . In such cases, providing PAM on a oligo human , mouse, rat , rabbit , dog , or non -human mammal nucleotide allows cleavage of a target sequence that other - 5 including non - human primate . wise would not be able to be cleave because no adjacent In general, codon optimization refers to a process of PAM is present on the same polynucleotide as the target modifying a nucleic acid sequence for enhanced expression sequence . in the host cells of interest by replacing at least one codon Polynucleotide sequences encoding a component of a ( e . g . about or more than about 1 , 2 , 3 , 4 , 5 , 10 , 15 , 20 , 25 , targetable nuclease system can comprise one or more vec - 10 50 , or more codons ) of the native sequence with codons that tors. In general, the term “ vector ” refers to a nucleic acid are more frequently or most frequently used in the genes of molecule capable of transporting another nucleic acid to that host cell while maintaining the native amino acid which it has been linked . Vectors include, but are not limited sequence . Various species exhibit particular bias for certain to , nucleic acid molecules that are single - stranded , double codons of a particular amino acid . Codon bias ( differences stranded , or partially double - stranded ; nucleic acid mol- 15 in codon usage between organisms) often correlates with the ecules that comprise one or more free ends, no free ends ( e . g . efficiency of translation of messenger RNA (mRNA ) , which circular ); nucleic acid molecules that comprise DNA , RNA , is in turn believed to be dependent on , among other things , or both ; and other varieties of polynucleotides known in the the properties of the codons being translated and the avail art. One type of vector is a " plasmid , ” which refers to a ability of particular transfer RNA (TRNA ) molecules . The circular double stranded DNA loop into which additional 20 predominance of selected tRNAs in a cell is generally a DNA segments can be inserted , such as by standard molecu reflection of the codons used most frequently in peptide lar cloning techniques . Another type of vector is a viral synthesis. Accordingly , genes can be tailored for optimal vector, wherein virally -derived DNA or RNA sequences are gene expression in a given organism based on codon opti present in the vector for packaging into a virus ( e . g . retro mization . Codon usage tables are readily available , for viruses , replication defective retroviruses, adenoviruses , 25 example , at the “ Codon Usage Database ” available at replication defective adenoviruses, and adeno - associated www .kazusa .orjp / codon / ( visited Jul. 9 , 2002 ) , and these viruses ) . Viral vectors also include polynucleotides carried tables can be adapted in a number of ways . See Nakamura , by a virus for transfection into a host cell . Certain vectors are Y . , et al. “ Codon usage tabulated from the international capable of autonomous replication in a host cell into which DNA sequence databases : status for the year 2000 ” Nucl. they are introduced ( e . g. bacterial vectors having a bacterial 30 Acids Res. 28 : 292 (2000 ) . Computer algorithms for codon origin of replication and episomal mammalian vectors ) . optimizing a particular sequence for expression in a particu Other vectors ( e . g ., non -episomal mammalian vectors ) are lar host cell are also available , such as Gene Forge ( Aptagen ; integrated into the genome of a host cell upon introduction Jacobus , Pa . ), are also available . In some embodiments, one into the host cell, and thereby are replicated along with the or more codons ( e . g . 1 , 2 , 3 , 4 , 5 , 10 , 15 , 20 , 25 , 50 , or more , host genome. Moreover , certain vectors are capable of 35 or all codons) in a sequence encoding an engineered nucle directing the expression of genes to which they are opera - ase correspond to the most frequently used codon for a tively - linked . Such vectors are referred to herein as " expres particular amino acid . sion vectors .” Common expression vectors of utility in In some embodiments , a vector encodes a nucleic acid recombinant DNA techniques are often in the form of guided nuclease comprising one ormore nuclear localization plasmids. Further discussion of vectors is provided herein . 40 sequences (NLSs ) , such as about or more than about 1 , 2 , 3 , Recombinant expression vectors can comprise a nucleic 4 , 5 , 6 , 7 , 8 , 9 , 10 , or more NLSs . In some embodiments, the acid of the invention in a form suitable for expression of the engineered nuclease comprises about or more than about 1 , nucleic acid in a host cell , which means that the recombinant 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , or more NLSs at or near the expression vectors include one or more regulatory elements , amino - terminus , about or more than about 1 , 2 , 3 , 4 , 5 , 6 , 7 , which may be selected on the basis of the host cells to be 45 8 , 9 , 10 , or more NLSs at or near the carboxy - terminus , or used for expression , that is operatively - linked to the nucleic a combination of these ( e . g . one or more NLS at the acid sequence to be expressed . Within a recombinant expres - amino -terminus and one or more NLS at the carboxy ter sion vector, “ operably linked " is intended to mean that the minus ) . When more than one NLS is present, each may be nucleotide sequence of interest is linked to the regulatory selected independently of the others , such that a single NLS element( s ) in a manner that allows for expression of the 50 may be present in more than one copy and / or in combination nucleotide sequence ( e . g . in an in vitro transcription / trans - with one or more other NLSs present in one or more copies. lation system or in a host cell when the vector is introduced In a preferred embodiment of the invention , the engineered into the host cell ) . With regards to recombination and nuclease comprises at most 6 NLSs . In some embodiments , cloningmethods , mention is made of U . S . patent application an NLS is considered near the N - or C - terminus when the Ser. No . 10 /815 , 730 , published Sep . 2 , 2004 as US 2004 - 55 nearest amino acid of the NLS is within about 1 , 2 , 3 , 4 , 5 , 0171156 A1, the contents of which are herein incorporated 10 , 15 , 20 , 25 , 30 , 40 , 50 , or more amino acids along the by reference in their entirety. polypeptide chain from the N - or C -terminus . Non - limiting In some embodiments , a regulatory element is operably exaexamples of NLSs include an NLS sequence derived from : linked to one or more elements of a targetable nuclease the NLS of the SV40 virus large T -antigen , having the amino system so as to drive expression of the one or more com - 60 acid sequence PKKKRKV (SEQ ID NO : 111) ; the NLS ponents of the targetable nuclease system . from nucleoplasmin ( e . g . the nucleoplasmin bipartite NLS In some embodiments , a vector comprises a regulatory with the sequence KRPAATKKAGOAKKKK (SEQ ID element operably linked to a polynucleotide sequence NO : 112 ) ) ; the c -myc NLS having the amino acid sequence encoding a nucleic acid - guided nuclease . The polynucle - PAAKRVKLD (SEQ ID NO : 113 ) or RQRRNELKRSP otide sequence encoding the nucleic acid - guided nuclease 65 (SEQ ID NO : 114 ) ; the hRNPAI M9 NLS having the can be codon optimized for expression in particular cells , sequence NQSSNFGPMKGGNFGGRSSGPYGGGGQY such as prokaryotic or eukaryotic cells . Eukaryotic cells can FAKPRNQGGY (SEQ ID NO : 115 ) ; the sequence US 10 ,435 , 714 B2 17 18 RMRIZFKNKGKDTAELRRRRVEVSVELRKAKKDEO - comprising an nucleic acid - guided nuclease encoded on a ILKRRNV ( SEQ ID NO : 1 116 ) of the IBB domain from vector or chromosome. The guide nucleic acid may be importin -alpha ; the sequences VSRKRPRP (SEQ ID provided in the cassette one or more polynucleotides , which NO : 117 ) and PPKKARED (SEQ ID NO : 118 ) of the myoma may be contiguous or non - contiguous in the cassette . In T protein ; the sequence PQPKKKPL (SEQ ID NO : 119 ) of 5 specific embodiments , the guide nucleic acid is provided in human p53 ; the sequence SALIKKKKKMAP (SEQ ID the cassette as a single contiguous polynucleotide . NO : 120 ) of mouse c -abl IV ; the sequences DRLRR (SEQ ID A variety of delivery systems can be used to introduce a NO : 121 ) and PKQKKRK (SEQ ID NO : 122 ) of the influenza nucleic acid - guided nuclease (DNA or RNA ) and guide virus NS1 ; the sequence RKLKKKIKKL ( SEQ ID NO : 123 ) nucleic acid (DNA or RNA ) into a host cell. These include of the Hepatitis virus delta antigen ; the sequence REKK - 10 the use of yeast systems, lipofection systems, microinjection KFLKRR ( SEQ ID NO : 124 ) of the mouse Mxl protein ; the systems, biolistic systems, virosomes , liposomes, immuno sequence KRKGDEVDGVDEVAKKKSKK (SEQ ID NO : liposomes , polycations, lipid :nucleic acid conjugates , viri 125 ) of the human poly ( ADP - ribose ) polymerase ; and the ons, artificial virions, viral vectors, electroporation , cell sequence RKCLQAGMNLEARKTKK (SEQ ID NO : 126 ) permeable peptides , nanoparticles , nanowires ( Shalek et al ., of the steroid hormone receptors ( human ) glucocorticoid . 15 Nano Letters , 2012 ) , exosomes . Molecular trojan horses In general, the one or more NLSs are of sufficient strength liposomes ( Pardridge et al. , Cold Spring Harb Protoc ; 2010 ; to drive accumulation of the nucleic acid - guided nuclease in doi: 10 . 1101/ pdb .prot5407 ) may be used to deliver an engi a detectable amount in the nucleus of a eukaryotic cell . In neered nuclease and guide nuclease across the blood brain general, strength of nuclear localization activity may derive barrier . from the number of NLSs , the particular NLS ( s ) used , or a 20 In some embodiments , a editing template is also provided . combination of these factors . Detection of accumulation in A editing template may be a component of a vector as the nucleus may be performed by any suitable technique . For described herein , contained in a separate vector, or provided example , a detectable marker may be fused to the nucleic as a separate polynucleotide , such as an oligonucleotide , acid - guided nuclease , such that location within a cell may be linear polynucleotide , or synthetic polynucleotide . In some visualized , such as in combination with a means for detect- 25 cases , a editing template is on the same polynucleotide as a ing the location of the nucleus ( e . g . a stain specific for the guide nucleic acid . In some embodiments , a editing template nucleus such as DAPI) . Cell nuclei may also be isolated is designed to serve as a template in homologous recombi from cells, the contents of which may then be analyzed by nation , such as within or near a target sequence nicked or any suitable process for detecting protein , such as immuno - cleaved by a nucleic acid - guided nuclease as a part of a histochemistry , Western blot, or enzyme activity assay . 30 complex as disclosed herein . A editing template polynucle Accumulation in the nucleus may also be determined indi- otide may be of any suitable length , such as about or more rectly , such as by an assay for the effect of the nucleic than about 10 , 15 , 20 , 25 , 50 , 75 , 100 , 150 , 200 , 500 , 1000 , acid - guided nuclease complex formation ( e . g . assay for or more nucleotides in length . In some embodiments , the DNA cleavage or mutation at the target sequence , or assay editing template polynucleotide is complementary to a por for altered gene expression activity affected by targetable 35 tion of a polynucleotide comprising the target sequence . nuclease complex formation and / or nucleic acid - guided When optimally aligned , a editing template polynucleotide nuclease activity ) , as compared to a control not exposed to might overlap with one or more nucleotides of a target the nucleic acid - guided nuclease or targetable nuclease sequences ( e . g . about or more than about 1 , 5 , 10 , 15 , 20 , 25 , complex , or exposed to a nucleic acid - guided nuclease 30 , 35 , 40 , or more nucleotides ) . In some embodiments , lacking the one or more NLSs . 40 when a editing template sequence and a polynucleotide A nucleic acid - guided nuclease and one or more guide comprising a target sequence are optimally aligned , the nucleic acids can be delivered either as DNA or RNA . nearest nucleotide of the template polynucleotide is within Delivery of an nucleic acid - guided nuclease and guide about 1, 5 , 10 , 15 , 20 , 25 , 50 , 75 , 100 , 200 , 300 , 400 , 500 , nucleic acid both as RNA ( unmodified or containing base or 1000 , 5000 , 10000 , or more nucleotides from the target backbone modifications ) molecules can be used to reduce 45 sequence . the amount of time that the nucleic acid - guided nuclease In many examples, an editing template comprises at least persist in the cell . This may reduce the level of off - target one mutation compared to the target sequence. An editing cleavage activity in the target cell . Since delivery of a template can comprise an insertion , deletion , modification , nucleic acid - guided nuclease as mRNA takes time to be or any combination thereof compared to the target sequence . translated into protein , it might be advantageous to deliver 50 Examples of some editing templates are described in more the guide nucleic acid several hours following the delivery detail in a later section . of the nucleic acid - guided nuclease mRNA, to maximize the In some aspects , the invention provides methods com level of guide nucleic acid available for interaction with the prising delivering one or more polynucleotides , such as or nucleic acid - guided nuclease protein . In other cases, the one or more vectors or linear polynucleotides as described nucleic acid - guided nuclease mRNA and guide nucleic acid 55 herein , one or more transcripts thereof, and / or one or pro are delivered concomitantly . In other examples , the guide teins transcribed therefrom , to a host cell . In some aspects , nucleic acid is delivered sequentially , such as 0 . 5 , 1 , 2 , 3 , 4 , the invention further provides cells produced by such meth or more hours after the nucleic acid - guided nuclease mRNA . ods, and organisms comprising or produced from such cells . In situations where guide nucleic acid amount is limiting , In some embodiments , an engineered nuclease in combina it may be desirable to introduce a nucleic acid - guided 60 tion with and optionally complexed with a guide nucleic nuclease as mRNA and guide nucleic acid in the form of a acid is delivered to a cell . DNA expression cassette with a promoter driving the Conventional viral and non - viral based gene transfer expression of the guide nucleic acid . This way the amount of methods can be used to introduce nucleic acids in cells , such guide nucleic acid available will be amplified via transcrip - as prokaryotic cells, eukaryotic cells , mammalian cells , or tion . 65 target tissues. Such methods can be used to administer Guide nucleic acid in the form of RNA or encoded on a nucleic acids encoding components of an engineered nucleic DNA expression cassette can be introduced into a host cell acid - guided nuclease system to cells in culture , or in a host US 10 ,435 , 714 B2 19 20 organism . Non - viral vector delivery systems include DNA retroviral vectors include those based upon murine leukemia plasmids , RNA ( e . g . a transcript of a vector described virus (MuLV ) , gibbon ape leukemia virus GLV( ) , Simian herein ), naked nucleic acid , and nucleic acid complexed Immuno deficiency virus (SIV ) , human immuno deficiency with a delivery vehicle , such as a liposome. Viral vector virus (HIV ) , and combinations thereof ( see , e . g . , Buchscher delivery systems include DNA and RNA viruses , which 5 et al. , J. Virol. 66 :2731 -2739 (1992 ) ; Johann et al ., J . Virol. have either episomal or integrated genomes after delivery to 66 : 1635 - 1640 ( 1992 ) ; Sommnerfelt et al ., Virol. 176 :58 - 59 the cell. For a review of gene therapy procedures , see Anderson , Science 256 :808 - 813 (1992 ) ; Nabel & Feigner, (1990 ) ; Wilson et al. , J. Virol. 63 :2374 - 2378 ( 1989 ) ; Miller TIBTECH 11 :211 -217 ( 1993 ) ; Mitani & Caskey , TIBTECH et al. , J . Virol. 65: 2220 - 2224 ( 1991 ) ; PCT /US94 / 05700 ). 11: 162 - 166 (1993 ) ; Dillon . TIBTECH 11: 167 - 175 ( 1993 ); 10 In applications where transient expression is preferred , Miller, Nature 357 : 455 -460 ( 1992 ) ; Van Brunt, Biotechnol adenoviral based systems may be used . Adenoviral based ogy 6 ( 10 ) : 1149 - 1154 ( 1988 ) ; Vigne , Restorative Neurology vectors are capable of very high transduction efficiency in and Neuroscience 8 :35 - 36 ( 1995 ) ; Kremer & Perricaudet, many cell types and do not require cell division . With such British Medical Bulletin 51 ( 1 ) : 31 -44 ( 1995 ) ; Haddada et al ., vectors , high titer and levels of expression have been in Current Topics in Microbiology and Immunology Doer - 15 obtained . This vector can be produced in large quantities in fler and Bohm (eds ) (1995 ); and Yu et al. , Gene Therapy a relatively simple systemwotom . 1 : 13 - 26 ( 1994 ) . Adeno - associated virus (“ AAV ” ) vectors may also be Methods of non -viral delivery of nucleic acids include used to transduce cells with target nucleic acids, e . g ., in the lipofection , microinjection , biolistics, virosomes, liposomes , in vitro production of nucleic acids and peptides, and for in immunoliposomes, polycation or lipid :nucleic acid conju - 20 vivo and ex vivo gene therapy procedures (see , e . g ., West et gates , naked DNA , artificial virions, and agent - enhanced al. , Virology 160 :38 - 47 ( 1987 ) ; U . S . Pat. No . 4 ,797 , 368 ; uptake of DNA . Lipofection is described in e . g . , U . S . Pat . WO 93 /24641 ; Kotin , Human Gene Therapy 5 :793 - 801 Nos. 5 ,049 ,386 , 4 , 946 ,787 ; and 4 ,897 , 355 ) and lipofection ( 1994 ) ; Muzyczka , J . Clin . Invest . 94 : 1351 (1994 ) . Con reagents are sold commercially ( e .g . , TransfectamTM and struction of recombinant AAV vectors are described in a LipofectinTM ) . Cationic and neutral lipids that are suitable 25 number of publications , including U . S . Pat. No. 5 , 173 , 414 ; for efficient receptor- recognition lipofection of polynucle - Tratschin et al. , Mol. Cell . Biol. 5 : 3251 - 3260 ( 1985 ) ; otides include those of Felgner, WO 91 / 17424 ; WO Tratschin , et al. , Mol. Cell. Biol. 4 : 2072 - 2081 ( 1984 ) ; Her 91/ 16024 . Delivery can be to cells ( e . g . in vitro or ex vivo monat & Muzyczka , PNAS 81 :6466 -6470 ( 1984 ) ; and Sam administration ) or target tissues ( e . g . in vivo administration ) . ulski et al . , J . Virol. 63 : 03822 - 3828 ( 1989 ) . The preparation of lipid :nucleic acid complexes , includ - 30 In some embodiments , a host cell is transiently or non ing targeted liposomes such as immunolipid complexes , is transiently transfected with one or more vectors, linear well known to one of skill in the art ( see , e . g . , Crystal, polynucleotides , polypeptides , nucleic acid -protein com Science 270 : 404 -410 ( 1995 ) ; Blaese et al. , Cancer Gene plexes, or any combination thereof as described herein . In Ther . 2 : 291 - 297 ( 1995 ) ; Behr et al. , Bioconjugate Chem . some embodiments , a cell in transfected in vitro , in culture , 5 :382 - 389 ( 1994 ); Remy et al ., Bioconjugate Chem . 5 :647 - 35 or ex vivo . In some embodiments , a cell is transfected as it 654 ( 1994 ) ; Gao et al . , Gene Therapy 2 : 710 - 722 ( 1995 ) ; naturally occurs in a subject. In some embodiments , a cell Ahmad et al. , Cancer Res. 52 :4817 - 4820 ( 1992 ) ; U . S . Pat . that is transfected is taken from a subject. In some embodi Nos. 4 , 186 , 183 , 4 , 217 , 344 , 4 , 235 ,871 , 4 , 261, 975 , 4 ,485 , ments, the cell is derived from cells taken from a subject, 054 , 4 , 501, 728 , 4 ,774 ,085 , 4 ,837 ,028 , and 4 , 946 ,787 ) . such as a cell line . The use of RNA or DNA viral based systems for the 40 In some embodiments, a cell transfected with one or more delivery of nucleic acids take advantage of highly evolved vectors , linear polynucleotides , polypeptides , nucleic acid processes for targeting a virus to specific cells in culture or protein complexes , or any combination thereof as described in the host and trafficking the viral payload to the nucleus or herein is used to establish a new cell line comprising one or host cell genome. Viral vectors can be administered directly more transfection -derived sequences. In some embodiments , to cells in culture , patients ( in vivo ) , or they can be used to 45 a cell transiently transfected with the components of an treat cells in vitro , and the modified cells may optionally be engineered nucleic acid - guided nuclease system as administered to patients ( ex vivo ) . Conventional viral based described herein (such as by transient transfection of one or systems could include retroviral, lentivirus , adenoviral, more vectors , or transfection with RNA ) , and modified adeno -associated and herpes simplex virus vectors for gene through the activity of an engineered nuclease complex , is transfer. Integration in the host genome is possible with the 50 used to establish a new cell line comprising cells containing retrovirus , lentivirus , and adeno -associated virus gene trans - the modification but lacking any other exogenous sequence . fer methods , often resulting in long term expression of the In some embodiments , one or more vectors described inserted transgene . Additionally , high transduction efficien - herein are used to produce a non - human transgenic cell, cies have been observed in many different cell types and organism , animal, or plant. In some embodiments , the target tissues . 55 transgenic animal is a mammal, such as a mouse , rat, or The tropism of a retrovirus can be altered by incorporat- rabbit . Methods for producing transgenic cells , organisms, ing foreign envelope proteins, expanding the potential target plants , and animals are known in the art , and generally begin population of target cells . Lentiviral vectors are retroviral with a method of cell transformation or transfection , such as vectors that are able to transduce or infect non - dividing cells described herein . and typically produce high viral titers . Selection of a retro - 60 Methods of Use viral gene transfer system would therefore depend on the In the context of formation of an engineered nuclease target tissue . Retroviral vectors are comprised of cis -acting complex , “ target sequence ” refers to a sequence to which a long terminal repeats with packaging capacity for up to 6 - 10 guide sequence is designed to have complementarity , where kb of foreign sequence . The minimum cis - acting LTRs are hybridization between a target sequence and a guide sufficient for replication and packaging of the vectors , which 65 sequence promotes the formation of a engineered nuclease are then used to integrate the therapeutic gene into the target complex . A target sequence may comprise any polynucle cell to provide permanent transgene expression . Widely used otide , such as DNA , RNA , or a DNA -RNA hybrid . A target US 10 , 435 , 714 B2 21 22 sequence can be located in the nucleus or cytoplasm of a sequences by different nucleic acid - guided nucleases . In cell . A target sequence can be located in vitro or in a cell- free some such cases, each nucleic acid - guided nuclease can environment. correspond to a distinct plurality of guide nucleic acids, Typically , formation of an engineered nuclease complex allowing two or more non overlapping , partially overlap comprising a guide nucleic acid hybridized to a target 5 ping , or completely overlapping multiplexing events . sequence and complexed with one or more engineered In some embodiments , the nucleic acid - guided nuclease nucleases as disclosed herein results in cleavage of one or has DNA cleavage activity or RNA cleavage activity . In both strands in or near ( e . g . within 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , some embodiments , the nucleic acid - guided nuclease directs 10 , 20 , 50 , or more base pairs from the target sequence . cleavage of one or both strands at the location of a target Cleavage can occur within a target sequence , 5 ' of the target 10 sequence , such as within the target sequence and /or within sequence, upstream of a target sequence, 3 ' of the target the complement of the target sequence. In some embodi sequence , or downstream of a target sequence . ments , the nucleic acid - guided nuclease directs cleavage of In some embodiments , one or more vectors driving one or both strands within about 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , expression of one or more components of a targetable 15 , 20 , 25 , 50 , 100 , 200 , 500 , or more base pairs from the nuclease system are introduced into a host cell or in vitro 15 first or last nucleotide of a target sequence . such formation of a targetable nuclease complex at one or In some embodiments , a nucleic acid - guided nuclease more target sites. For example, a nucleic acid - guided nucle - may form a component of an inducible system . The induc ase and a guide nucleic acid could each be operably linked ible nature of the system would allow for spatiotemporal to separate regulatory elements on separate vectors. Alter control of gene editing or gene expression using a form of natively , two or more of the elements expressed from the 20 energy . The form of energy may include but is not limited to same or different regulatory elements , may be combined in electromagnetic radiation , sound energy , chemical energy , a single vector , with one or more additional vectors provid - light energy, temperature , and thermal energy . Examples of ing any components of the targetable nuclease system not inducible system include tetracycline inducible promoters included in the first vector. Targetable nuclease system ( Tet- On or Tet- Off ) , small molecule two -hybrid transcription elements that are combined in a single vector may be 25 activations systems (FKBP , ABA , etc ) , or light inducible arranged in any suitable orientation , such as one element systems ( Phytochrome, LOV domains , or cryptochorome) . located 5 ' with respect to (" upstream " of ) or 3 ' with respect In one embodiment, the nucleic acid - guided nuclease may to (“ downstream ” of) a second element. The coding be a part of a Light Inducible Transcriptional Effector sequence of one element may be located on the same or (LITE ) to direct changes in transcriptional activity in a opposite strand of the coding sequence of a second element, 30 sequence - specific manner . The components of a light induc and oriented in the same or opposite direction . In some ible system may include a nucleic acid - guided nuclease , a embodiments , a single promoter drives expression of a light - responsive cytochrome heterodimer ( e . g . from Arabi transcript encoding a nucleic acid - guided nuclease and one dopsis thaliana ), and a transcriptional activation /repression or more guide nucleic acids . In some embodiments, a domain . Further examples of inducible DNA binding pro nucleic acid - guided nuclease and one or more guide nucleic 35 teins and methods for their use are provided in U . S . 61/ 736 , acids are operably linked to and expressed from the same 465 and U . S . 61 /721 , 283 , which is hereby incorporated by promoter. In other embodiments , one or more guide nucleic reference in its entirety . An inducible system can be tem acids or polynucleotides encoding the one or more guide p erature inducible such that the system is turned on or off by nucleic acids are introduced into a cell or in vitro environ increasing or decreasing the temperature . In some tempera ment already comprising a nucleic acid - guided nuclease or 40 ture inducible systems, increasing the temperature turns the polynucleotide sequence encoding the nucleic acid -guided system on . In some temperature inducible systems, increas nuclease . ing the temperature turns the system off . When multiple different guide sequences are used , a In some aspects , the invention provides for methods of single expression construct may be used to target nuclease modifying a target sequence in vitro , or in a prokaryotic or activity to multiple different , corresponding target sequences 45 eukaryotic cell , which may be in vivo , ex vivo , or in vitro . within a cell or in vitro . For example , a single vector may In some embodiments , the method comprises sampling a comprise about or more than about 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , cell or population of cells such as prokaryotic cells , or those 10 , 15 , 20 , or more guide sequences. In some embodiments, from a human or non - human animal or plant ( including about or more than about 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , or more micro - algae ) , and modifying the cell or cells . Culturing may such guide - sequence - containing vectors may be provided , 50 occur at any stage in vitro or ex vivo . The cell or cells may and optionally delivered to a cell or in vitro . even be re - introduced into the host, such as a non - human Methods and compositions disclosed herein may com animal or plant ( including micro - algae ) . For re - introduced prise more than one guide nucleic acid , wherein each guide cells it is particularly preferred that the cells are stem cells . nucleic acid has a different guide sequence , thereby targeting I n some embodiments , the method comprises allowing a a different target sequence . In such cases , multiple guide 55 targetable nuclease complex to bind to the target sequence to nucleic acids can be using in multiplexing , wherein multiple effect cleavage of said target sequence , thereby modifying targets are targeted simultaneously . Additionally or alterna - the target sequence , wherein the targetable nuclease com tively, the multiple guide nucleic acids are introduced into a plex comprises a nucleic acid - guided nuclease complexed population of cells , such that each cell in a population with a guide nucleic acid wherein the guide sequence of the received a different or random guide nucleic acid , thereby 60 guide nucleic acid is hybridized to a target sequence within targeting multiple different target sequences across a popu - a target polynucleotide . lation of cells . In such cases , the collection of subsequently In some aspects , the invention provides a method of altered cells can be referred to as a library . modifying expression of a target polynucleotide in in vitro Methods and compositions disclosed herein may com - or in a prokaryotic or eukaryotic cell . In some embodiments , prise multiple different nucleic acid -guided nucleases, each 65 the method comprises allowing an targetable nuclease com with one or more different corresponding guide nucleic plex to bind to a target sequence with the target polynucle acids, thereby allowing targeting of different target otide such that said binding results in increased or decreased

US 10 , 435 , 714 B2 25 26 easy to screen for targeted integrations. Examples of suitable fication may be carried out by natural or recombinant DNA markers include restriction sites, fluorescent proteins, or polymerases such as TaqGoldTM , T7 DNA polymerase , selectable markers . The exogenous polynucleotide template Klenow fragment of E . coli DNA polymerase , and reverse of the invention can be constructed using recombinant transcriptase . A preferred amplification method is PCR . In techniques (see , for example , Green and Sambrook et al. , 5 particular, the isolated RNA can be subjected to a reverse 2014 and Ausubel et al. , 2017 ) . transcription assay that is coupled with a quantitative poly In an exemplary method for modifying a target polynucle otide by integrating an editing template polynucleotide , a merase chain reaction (RT - PCR ) in order to quantify the double stranded break is introduced into the genome expression level of a sequence associated with a signaling sequence by an engineered nuclease complex , the break can 10 biochemical pathway . be repaired via homologous recombination using an editing Detection of the gene expression level can be conducted template such that the template is integrated into the target in real time in an amplification assay. In one aspect , the polynucleotide . The presence of a double - stranded break can amplified products can be directly visualized with fluores increase the efficiency of integration of the editing template . cent DNA -binding agents including but not limited to DNA Disclosed herein are methods formodifying expression of 15 intercalators and DNA groove binders . Because the amount a polynucleotide in a cell . Somemethods comprise increas of the intercalators incorporated into the double - stranded ing or decreasing expression of a target polynucleotide by DNA molecules is typically proportional to the amount of using a targetable nuclease complex that binds to the target the amplified DNA products , one can conveniently deter polynucleotide . mine the amount of the amplified products by quantifying In some methods, a target polynucleotide can be inacti - 20 the fluorescence of the intercalated dye using conventional vated to effect the modification of the expression in a cell. optical systems in the art. DNA - binding dye suitable for this For example , upon the binding of a targetable nuclease application include SYBR green , SYBR blue , DAPI, pro complex to a target sequence in a cell, the target polynucle - pidium iodine , Hoeste , SYBR gold , ethidium bromide , acri otide is inactivated such that the sequence is not transcribed , dines , proflavine , acridine orange , acriflavine , fluorcouma the coded protein is not produced , or the sequence does not 25 nin , ellipticine, daunomycin , chloroquine , distamycin D , function as the wild -type sequence does. For example , a chromomycin , homidium , mithramycin , ruthenium poly protein or microRNA coding sequence may be inactivated pyridyls, anthramycin , and the like . such that the protein is not produced . In another aspect, other fluorescent labels such as In some methods , a control sequence can be inactivated sequence specific probes can be employed in the amplifica such that it no longer functions as a regulatory sequence . As 30 tion reaction to facilitate the detection and quantification of used herein , “ regulatory sequence ” can refer to any nucleic the amplified products . Probe -based quantitative amplifica acid sequence that effects the transcription , translation , or t ion relies on the sequence -specific detection of a desired accessibility of a nucleic acid sequence . Examples of regu - amplified product. It utilizes fluorescent, target -specific latory sequences include, a promoter, a transcription termi probes (e . g ., TaqManTMprobes ) resulting in increased speci nator, and an enhancer . 35 ficity and sensitivity . Methods for performing probe -based An inactivated target sequence may include a deletion quantitative amplification are well established in the art and mutation ( i. e ., deletion of one or more nucleotides ), an are taught in U . S . Pat. No . 5 ,210 ,015 . insertion mutation (i . e . , insertion of one or more nucleo In yet another aspect, conventional hybridization assays tides ), or a nonsense mutation ( i . e . , substitution of a single using hybridization probes that share sequence homology nucleotide for another nucleotide such that a stop codon is 40 with sequences associated with a signaling biochemical introduced ) . In some methods , the inactivation of a target pathway can be performed . Typically, probes are allowed to sequence results in “ knockout” of the target sequence . form stable complexes with the sequences associated with a An altered expression of one or more target polynucle signaling biochemical pathway contained within the biologi otides associated with a signaling biochemical pathway can cal sample derived from the test subject in a hybridization be determined by assaying for a difference in the mRNA 45 reaction . It will be appreciated by one of skill in the art that levels of the corresponding genes between the test model where antisense is used as the probe nucleic acid , the target cell and a control cell , when they are contacted with a polynucleotides provided in the sample are chosen to be candidate agent. Alternatively , the differential expression of complementary to sequences of the antisense nucleic acids . the sequences associated with a signaling biochemical path - Conversely , where the nucleotide probe is a sense nucleic way is determined by detecting a difference in the level of 50 acid , the target polynucleotide is selected to be complemen the encoded polypeptide or gene product . tary to sequences of the sense nucleic acid . To assay for an agent- induced alteration in the level of Hybridization can be performed under conditions of vari mRNA transcripts or corresponding polynucleotides, ous stringency , for instance as described herein . Suitable nucleic acid contained in a sample is first extracted accord - hybridization conditions for the practice of the present ing to standard methods in the art. For instance , mRNA can 55 invention are such that the recognition interaction between be isolated using various lytic enzymes or chemical solu - the probe and sequences associated with a signaling bio tions according to the procedures set forth in Green and chemical pathway is both sufficiently specific and suffi Sambrook ( 2014 ) , or extracted by nucleic - acid -binding res ciently stable . Conditions that increase the stringency of a ins following the accompanying instructions provided by the hybridization reaction are widely known and published in manufacturers . The mRNA contained in the extracted 60 the art . See , for example , (Green and Sambrook , et al. , nucleic acid sample is then detected by amplification pro - ( 2014 ) ; Nonradioactive in Situ Hybridization Application cedures or conventional hybridization assays (e .g . Northern Manual, Boehringer Mannheim , second edition ) . The blot analysis ) according to methods widely known in the art hybridization assay can be formed using probes immobilized or based on the methods exemplified herein . on any solid support , including but are not limited to For purpose of this invention , amplification means any 65 nitrocellulose , glass , silicon , and a variety of gene arrays . A method employing a primer and a polymerase capable of preferred hybridization assay is conducted on high -density replicating a target sequence with reasonable fidelity . Ampli - gene chips as described in U . S . Pat. No. 5 , 445 , 934 . US 10 , 435 , 714 B2 27 28 For a convenient detection of the probe - target complexes analog for binding sites on the specific agent. In this formed during the hybridization assay , the nucleotide probes competitive assay, the amount of label captured is inversely are conjugated to a detectable label . Detectable labels suit proportional to the amount of protein sequences associated able for use in the present invention include any composition with a signaling biochemical pathway present in a test detectable by photochemical, biochemical, spectroscopic , 5 sample . immunochemical, electrical, optical or chemical meansA . A number of techniques for protein analysis based on the wide variety of appropriate detectable labels are known in general principles outlined above are available in the art . the art , which include fluorescent or chemiluminescent They include but are not limited to radioimmunoassays , labels , radioactive isotope labels , enzymatic or other ELISA (enzyme linked immunoradiometric assays ), “ sand ligands. In preferred embodiments , one will likely desire to 10 wich ” immunoassays, immunoradiometric assays, in situ employ a fluorescent label or an enzyme tag , such as immunoassays (using e . g ., colloidal gold , enzyme or radio digoxigenin , beta .- galactosidase , urease , alkaline phos - isotope labels ), western blot analysis , immunoprecipitation phatase or peroxidase , avidin /biotin complex . assays , immunofluorescent assays, and SDS -PAGE . Detection methods used to detect or quantify the hybrid - Antibodies that specifically recognize or bind to proteins ization intensity will typically depend upon the label 15 associated with a signaling biochemical pathway are pref selected above . For example , radiolabels may be detected erable for conducting the aforementioned protein analyses . using photographic film or a phosphoimager . Fluorescent Where desired , antibodies that recognize a specific type of markers may be detected and quantified using a photode - post- translational modifications ( e . g . , signaling biochemical tector to detect emitted light. Enzymatic labels are typically pathway inducible modifications ) can be used . Post - transla detected by providing the enzyme with a substrate and 20 tional modifications include but are not limited to glycosy measuring the reaction product produced by the action of the lation , lipidation , acetylation , and phosphorylation . These enzyme on the substrate ; and finally colorimetric labels are antibodies may be purchased from commercial vendors . For detected by simply visualizing the colored label. example , anti -phosphotyrosine antibodies that specifically An agent- induced change in expression of sequences recognize tyrosine- phosphorylated proteins are available associated with a signaling biochemical pathway can also be 25 from a number of vendors including Invitrogen and Perkin determined by examining the corresponding gene products . Elmer . Anti- phosphotyrosine antibodies are particularly use Determining the protein level typically involves a ) contact - ful in detecting proteins that are differentially phosphory ing the protein contained in a biological sample with an lated on their tyrosine residues in response to an ER stress . agent that specifically bind to a protein associated with a Such proteins include but are not limited to eukaryotic signaling biochemical pathway ; and ( b ) identifying any 30 translation initiation factor 2 alpha (eIF - 2 .alpha . ). Alterna agent: protein complex so formed . In one aspect of this tively , these antibodies can be generated using conventional embodiment, the agent that specifically binds a protein polyclonal or monoclonal antibody technologies by immu associated with a signaling biochemical pathway is an nizing a host animal or an antibody - producing cell with a antibody, preferably a monoclonal antibody . target protein that exhibits the desired post -translational The reaction can be performed by contacting the agent 35 modification . with a sample of the proteins associated with a signaling In practicing a subject method , it may be desirable to biochemical pathway derived from the test samples under discern the expression pattern of an protein associated with conditions that will allow a complex to form between the a signaling biochemical pathway in differentbodily tissue, in agent and the proteins associated with a signaling biochemi- different cell types, and /or in different subcellular structures. cal pathway . The formation of the complex can be detected 40 These studies can be performed with the use of tissue directly or indirectly according to standard procedures in the specific , cell -specific or subcellular structure specific anti art . In the direct detection method , the agents are supplied bodies capable of binding to protein markers that are pref with a detectable label and unreacted agents may be erentially expressed in certain tissues, cell types, or removed from the complex ; the amount of remaining label subcellular structures. thereby indicating the amount of complex formed . For such 45 An altered expression of a gene associated with a signal method , it is preferable to select labels that remain attached ing biochemical pathway can also be determined by exam to the agents even during stringent washing conditions . It is i ning a change in activity of the gene product relative to a preferable that the label does not interfere with the binding control cell. The assay for an agent- induced change in the reaction . In the alternative , an indirect detection procedure activity of a protein associated with a signaling biochemical may use an agent that contains a label introduced either 50 pathway will dependent on the biological activity and / or the chemically or enzymatically . A desirable label generally signal transduction pathway that is under investigation . For does not interfere with binding or the stability of the example , where the protein is a kinase , a change in its ability resulting agent: polypeptide complex . However , the label is to phosphorylate the downstream substrate ( s ) can be deter typically designed to be accessible to an antibody for an mined by a variety of assays known in the art. Representa effective binding and hence generating a detectable signal. 55 tive assays include but are not limited to immunoblotting A wide variety of labels suitable for detecting protein and immunoprecipitation with antibodies such as anti - phos levels are known in the art. Non - limiting examples include photyrosine antibodies that recognize phosphorylated pro radioisotopes, enzymes , colloidal metals , fluorescent com teins. In addition , kinase activity can be detected by high pounds, bioluminescent compounds , and chemiluminescent throughput chemiluminescent assays such as AlphaScreenTM compounds . 60 (available from Perkin Elmer ) and eTagTM assay (Chan -Hui , The amount of agent: polypeptide complexes formed dur - et al. ( 2003 ) Clinical Immunology 111: 162 - 174 ). ing the binding reaction can be quantified by standard Where the protein associated with a signaling biochemical quantitative assays . As illustrated above , the formation of pathway is part of a signaling cascade leading to a fluctua agent: polypeptide complex can be measured directly by the tion of intracellular pH condition , pH sensitive molecules amount of label remained at the site of binding . In an 65 such as fluorescent pH dyes can be used as the reporter alternative , the protein associated with a signaling biochemi- molecules. In another example where the protein associated cal pathway is tested for its ability to compete with a labeled with a signaling biochemical pathway is an ion channel , US 10 , 435 , 714 B2 29 30 fluctuations in membrane potential and / or intracellular ion t ion of a target sequence in a genomic locus of interest , this concentration can be monitored . A number of commercial may apply to the organism (or mammal) as a whole or just kits and high -throughput devices are particularly suited for a single cell or population of cells from that organism ( if the a rapid and robust screening for modulators of ion channels . organism is multicellular ). In the case of humans, for Representative instruments include FLIPRTM (Molecular 5 instance, Applicants envisage, inter alia, a single cell or a Devices, Inc . ) and VIPR ( Aurora Biosciences) . These instru population of cells and these may preferably be modified ex ments are capable of detecting reactions in over 1000 sample vivo and then re - introduced . In this case , a biopsy or other wells of a microplate simultaneously , and providing real tissue or biological fluid sample may be necessary . Stem time measurement and functional data within a second or cells are also particularly preferred in this regard . But, of even a minisecond . 10 In practicing any of the methods disclosed herein , a course , in vivo embodiments are also envisaged . And the suitable vector can be introduced to a cell, tissue , organism , invention is especially advantageous as to HSCs. or an embryo via one or more methods known in the art , The functionality of a targetable nuclease complex can be including without limitation , microinjection , electropora assessed by any suitable assay . For example , the components tion , sonoporation , biolistics , calcium phosphate -mediated 15 of a targetable nuclease system sufficient to form a targetable transfection , cationic transfection , liposome transfection , nuclease complex , including a guide nucleic acid and dendrimer transfection , heat shock transfection , nucleofec - nucleic acid - guided nuclease , can be provided to a host cell tion transfection , magnetofection , lipofection , impalefec having the corresponding target sequence , such as by trans tion , optical transfection , proprietary agent- enhanced uptake fection with vectors encoding the components of the engi of nucleic acids , and delivery via liposomes, immunolipo - 20 neered nuclease system , followed by an assessment of somes, virosomes , or artificial virions. In somemethods , the preferential cleavage within the target sequence . Similarly , vector is introduced into an embryo by microinjection . The cleavage of a target sequence may be evaluated in a test tube vector or vectors may be microinjected into the nucleus or by providing the target sequence and components of a the cytoplasm of the embryo . In somemethods , the vector or targetable nuclease complex . Other assays are possible , and vectors may be introduced into a cell by nucleofection . 25 will occur to those skilled in the art . A guide sequence can A target polynucleotide of a targetable nuclease complex be selected to target any target sequence . In some embodi can be any polynucleotide endogenous or exogenous to the ments, the target sequence is a sequence within a genome of host cell. For example , the target polynucleotide can be a a cell. Exemplary target sequences include those that are polynucleotide residing in the nucleus of the eukaryotic cell , unique in the target genome. the genome of a prokaryotic cell, or an extrachromosomal 30 Editing Cassette vector of a host cell . The target polynucleotide can be a Disclosed herein are compositions and methods for edit sequence coding a gene product ( e . g ., a protein ) or a ing a target polynucleotide sequence . Such compositions non - coding sequence ( e. g ., a regulatory polynucleotide or a include polynucleotides containing one or more components junk DNA ). of targetable nuclease system . Polynucleotide sequences for Examples of target polynucleotides include a sequence 35 use in these methods can be referred to as editing cassettes . associated with a signaling biochemical pathway , e . g ., a An editing cassette can comprise one or more primer sites . signaling biochemical pathway -associated gene or poly - Primer sites can be used to amplify an editing cassette by nucleotide . Examples of target polynucleotides include a using oligonucleotide primers comprising reverse comple disease associated gene or polynucleotide . A “ disease -asso - mentary sequences that can hybridize to the one or more ciated ” gene or polynucleotide refers to any gene or poly - 40 primer sites. An editing cassette can comprise two or more nucleotide which is yielding transcription or translation primer times . Sometimes , an editing cassette comprises a products at an abnormal level or in an abnormal form in cells primer site on each end of the editing cassette , said primer derived from a disease - affected tissues compared with tis - sites flanking one or more of the other components of the sues or cells of a non - disease control. It may be a gene that editing cassette . Primer sites can be approximately 10 , 15 , becomes expressed at an abnormally high level; it may be a 45 16 , 17 , 18 , 19 , 20 , 21, 22 , 23 , 24 , 25 , 26 or more nucleotides gene that becomes expressed at an abnormally low level, in length . where the altered expression correlates with the occurrence An editing cassette can comprise an editing template as and / or progression of the disease . A disease - associated gene disclosed herein . An editing cassette can comprise an editing also refers to a gene possessing mutation ( s ) or genetic sequence . An editing sequence can be homologous to a variation that is directly responsible or is in linkage disequi- 50 target sequence . An editing sequence can comprise at least librium with a gene ( s ) that is responsible for the etiology of one mutation relative to a target sequence . An editing a disease . The transcribed or translated products may be sequence often comprises homology region ( or homology known or unknown , and may be at a normal or abnormal arms) flanking at least one mutation relative to a target level. sequence , such that the flanking homology regions facilitate Embodiments of the invention also relate to methods and 55 homologous recombination of the editing sequence into a compositions related to knocking out genes, editing genes , target sequence . An editing sequence can comprise an edit altering genes , amplifying genes, and repairing particular ing template as disclosed herein . For example , the editing mutations. Altering genes may also mean the epigenetic sequence can comprise at least one mutation relative to a manipulation of a target sequence . This may be the chro - target sequence including one or more PAM mutations that matin state of a target sequence , such as by modification of 60 mutate or delete a PAM site . An editing sequence can the methylation state of the target sequence ( i . e . addition or comprise one or more mutations in a codon or non - coding removal of methylation or methylation patterns or CpG sequence relative to a non - editing target site . islands) , histone modification , increasing or reducing acces - PAM mutation can be a silent mutation . A silent sibility to the target sequence , or by promoting 3D folding . mutation can be a change to at least one nucleotide of a It will be appreciated that where reference is made to a 65 codon relative to the original codon that does not change the method ofmodifying a cell , organism , or mammal including amino acid encoded by the original codon . A silent mutation human or a non -human mammal or organism by manipula - can be a change to a nucleotide within a non -coding region , US 10 , 435 , 714 B2 31 32 such as an intron , 5 ' untranslated region , 3 ' untranslated repair. Homology arms can be added by synthesis , in vitro region , or other non - coding region . assembly , PCR , or other known methods in the art. For APAM mutation can be a non - silentmutation . Non - silent example , chemical synthesis , Gibson assembly , SLIC , mutations can include a missense mutation . A missense CPEC , PCA , ligation - free cloning , overlapping oligo exten mutation can be when a change to at least one nucleotide of 5 sion , in vitro assembly , in vitro oligo assembly, PCR , a codon relative to the original codon that changes the amino traditional ligation - based cloning , other known methods in acid encoded by the original codon .Missense mutations can the art , or any combination thereof. A homology arm can be occur within an exon , open reading frame, or other coding added to both ends of a barcode , recorder sequence , and / or region . editing sequence , thereby flanking the sequence with two An editing sequence can comprise at least one mutation 10 distinct homology arms, for example , a 5 ' homology arm and relative to a target sequence . A mutation can be a silent a 3 ' homology arm . mutation or non - silent mutation , such as a missense muta - A homology arm can comprise sequence homologous to tion . A mutation can include an insertion of one or more a target sequence . A homology arm can comprise sequence nucleotides or base pairs . A mutation can include a deletion homologous to sequence adjacent to a target sequence . A of one or more nucleotides or base pairs . A mutation can 15 homology arm can comprise sequence homologous to include a substitution of one or more nucleotides or base sequence upstream or downstream of a target sequence . A pairs for a different one or more nucleotides or base pairs . homology arm can comprise sequence homologous to Inserted or substituted sequences can include exogenous or sequence within the same gene or open reading frame as a heterologous sequences . target sequence . A homology arm can comprise sequence An editing cassette can comprise a polynucleotide encod - 20 homologous to sequence upstream or downstream of a gene ing a guide nucleic acid sequence . In some cases , the guide or open reading frame the target sequence is within . A nucleic acid sequence is optionally operably linked to a homology arm can comprise sequence homologous to a 5 ' promoter. A guide nucleic acid sequence can comprise a UTR or 3 ' UTR of a gene or open reading frame within scaffold sequence and a guide sequence as described herein . which is a target sequence . A homology arm can comprise An editing cassette can comprise a barcode . A barcode can 25 sequence homologous to a different gene , open reading be a unique DNA sequence that corresponds to the editing frame, promoter, terminator , or nucleic acid sequence than sequence such that the barcode can identify the one or more that which the target sequence is within . mutations of the corresponding editing sequence . In some The same 5 ' and 3 ' homology arms can be added to a examples, the barcode is 15 nucleotides . The barcode can plurality of distinct editing sequences , thereby generating a comprise less than 10 , 10 , 11 , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 30 library of unique editing sequences that each have the same 20 , 25 , 30 , 35 , 40 , 45 , 50 , 55 , 60 , 65 , 70 , 75 , 80 , 88 , 90 , 100 , targeted insertion site . The same 5 ' and 3 ' homology arms 110 , 120 , 130 , 140 , 150 , 160 , 170 , 180 , 190 , 200 , or more can be added to a plurality of distinct editing templates , than 200 nucleotides . A barcode can be a non - naturally thereby generating a library of unique editing templates that occurring sequence . An editing cassette comprising a bar each have the same targeted insertion site . In alternative code can be a non -naturally occurring sequence . 35 examples, different or a variety of 5 ' or 3 homology arms An editing cassette can comprise one or more of an can be added to a plurality of editing sequences or editing editing sequence and a polynucleotide encoding a guide templates . nucleic acid optionally operably linked to a promoter , Abarcode library or recorder sequence library comprising wherein the editing cassette and guide nucleic acid sequence flanking homology arms can be cloned into a vector back are flanked by primer sites . An editing cassette can further 40 bone . In some examples, the barcode comprising flanking comprise a barcode. homology arms are cloned into an editing cassette . Cloning An example of an editing cassette is depicted in FIG . 3 . can occur by chemical synthesis , Gibson assembly , SLIC , Each editing cassette can be designed to edit a site in a target CPEC , PCA , ligation - free cloning , overlapping oligo exten sequence Sites to be targeted can be coding regions, non sion , in vitro assembly , in vitro oligo assembly , PCR , coding regions , functionally neutral sites , or they can be a 45 traditional ligation -based cloning, other known methods in screenable or selectable marker gene . Homology regions the art , or any combination thereof. within the editing sequence flank the one or more mutations An editing sequence library comprising flanking homol of the editing cassette and can be inserted into the target ogy arms can be cloned into a vector backbone. In some sequence by recombination . Recombination can comprise examples, the editing sequence and homology arms are DNA cleavage , such as by an nucleic acid - guided nuclease , 50 cloned into an editing cassette . Editing cassettes can , in and repair via homologous recombination . some cases , further comprise a nucleic acid sequence encod Editing cassettes can be generated by chemical synthesis , ing a guide nucleic acid or gRNA engineered to target the Gibson assembly , SLIC , CPEC , PCA , ligation - free cloning , desired site of editing sequence insertion , e . g . the target overlapping oligo extension , in vitro assembly , in vitro oligo sequence . Editing cassettes can , in some cases , further assembly , PCR , traditional ligation - based cloning , other 55 comprise a barcode or recorder sequence . Cloning can occur known methods in the art , or any combination thereof. by chemical synthesis , Gibson assembly , SLIC , CPEC , Trackable sequences , such as barcodes or recorder PCA , ligation - free cloning, overlapping oligo extension , in sequences, can be designed in silico via standard code with vitro assembly , in vitro oligo assembly , PCR , traditional a degenerate mutation at the target codon . The degenerate ligation -based cloning , other known methods in the art , or mutation can comprise 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 60 any combination thereof . 14 , 15 , 16 , 17 , 18 , 19, 20 , 21, 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , Gene - wide or genome- wide editing libraries can be 30 , or more than 30 nucleic acid residues. In some examples , cloned into a vector backbone . A barcode or recorder the degenerate mutations can comprise 15 nucleic acid sequence library can be inserted or assembled into a second residues (N15 ) . site to generate competent trackable plasmids that can Homology arms can be added to an editing sequence to 65 embed the recording barcode at a fixed locus while integrat allow incorporation of the editing sequence into the desired ing the editing libraries at a wide variety of user defined location via homologous recombination or homology -driven sites . Cloning can occur by chemical synthesis , Gibson US 10 , 435 , 714 B2 33 34 assembly , SLIC , CPEC , PCA , ligation - free cloning, over - 70 , 80 , 90 , 100 , 200 , 300 , 400 , 500 , 600 , 700 , 800 , 1000 , lapping oligo extension , in vitro assembly , in vitro oligo 1500 , 2000 , 2500 , 3000 , 3500 , 4000 , 4500 , 5000 , 5500 , assembly , PCR , traditional ligation - based cloning , other 6000 , 6500 , 7000 , 7500 , 8000 , 8500 , 9000 , 9500 , 104, 10 %, known methods in the art , or any combination thereof. 10 % , 107, 108 , 109 , 1010 , or more nucleic acid molecules or A guide nucleic acid or sequence encoding the same can 5 plasmids of the present disclosure . It should be understood be assembled or inserted into a vector backbone firstist., that such a library can include any number of nucleic acid followed by insertion of an editing sequence and / or cassette . molecules or plasmids, even if the specific number is not In other cases, an editing sequence and /or cassette can be explicit listed above . inserted or assembled into a vector backbone first, followed by insertion of a guide nucleic acid or sequence encoding the 10 1 : Trackable plasmid libraries or nucleic acid molecule same. In other cases , guide nucleic acid or sequence encod libraries can be sequenced in order to determine the recorder ing the same and an editing sequence and / or cassette are sequence and editing sequence pair that is comprised on simultaneous inserted or assembled into a vector. A recorder each trackable plasmid . In other cases , a known recorder sequence or barcode can be inserted before or after any of sequence is paired with a known editing sequence during the these steps . In other words , it should be understood that there 15 library generation process . Other methods of determining are many possible permutations to the order in which the association between a recorder sequence and editing elements of the disclosure are assembled . The vector can be sequence comprised on a common nucleic acid molecule or linear or circular and can be generated by chemical synthe plasmid are envisioned such that the editing sequence can be sis . Gibson assembly , SLIC , CPEC , PCA , ligation - free clon - identified by identification or sequencing of the recorder ing , overlapping oligo extension , in vitro assembly, in vitro 20 sequence . oligo assembly , PCR , traditional ligation -based cloning , Methods and compositions for tracking edited episomal other known methods in the art, or any combination thereof . libraries that are shuttled between E . coli and other organ A nucleic acid molecule can be synthesized which com - isms/ cell lines are provided herein . The libraries can be prises one or more elements disclosed herein . A nucleic acid comprised on plasmids, Bacterial artificial chromosomes molecule can be synthesized that comprises an editing 25 (BACs ), Yeast artificial chromosomes ( YACs) , synthetic cassette . A nucleic acid molecule can be synthesized that chromosomes , or viral or phage genomes . These methods comprises a guide nucleic acid . A nucleic acid molecule can and compositions can be used to generate portable barcoded be synthesized that comprises a recorder cassette . A nucleic libraries in host organisms, such as E . coli . Library genera acid molecule can be synthesized that comprises a barcode . tion in such organisms can offer the advantage of established A nucleic acid molecule can be synthesized that comprises 30 techniques for performing homologous recombination . Bar a homology arm . A nucleic acid molecule can be synthesized coded plasmid libraries can be deep - sequenced at one site to that comprises an editing cassette and a guide nucleic acid . track mutational diversity targeted across the remaining A nucleic acid molecule can be synthesized that comprises portions of the plasmid allowing dramatic improvements in an editing cassette and a barcode . A nucleic acid molecule the depth of library coverage . can be synthesized that comprises an editing cassette , a 35 Any nucleic acid molecule disclosed herein can be an guide nucleic acid , and a recorder cassette . A nucleic acid isolated nucleic acid . Isolated nucleic acids may be made by molecule can be synthesized that comprises an editing any method known in the art , for example using standard cassette, a recorder cassette , and two guide nucleic acids. A recombinant methods, assembly methods , synthesis tech nucleic acid molecule can be synthesized that comprises a niques , or combinations thereof. In some embodiments , the recorder cassette and a guide nucleic acid . In any of these 40 nucleic acids may be cloned , amplified , assembled , or oth cases, the guide nucleic acid can optionally be operably erwise constructed . linked to a promoter . In any of these cases , the nucleic acid Isolated nucleic acids may be obtained from cellular , molecule can further include one or more barcodes . bacterial , or other sources using any number of cloning Synthesis can occur by any nucleic acid synthesis method methodologies known in the art. In some embodiments , known in the art . Synthesis can occur by enzymatic nucleic 45 oligonucleotide probes which selectively hybridize , under acid synthesis . Synthesis can occur by chemical synthesis . stringent conditions, to other oligonucleotides or to the Synthesis can occur by array -based synthesis . Synthesis can nucleic acids of an organism or cell can be used to isolate or occur by solid - phase synthesis or phosphoramidite methods. identify an isolated nucleic acid Synthesis can occur by column or multi -well methods. Cellular genomic DNA, RNA, or cDNA may be screened Synthesized nucleic acid molecules can be non - naturally 50 for the presence of an identified genetic element of interest occurring nucleic acid molecules . using a probe based upon one or more sequences. Various Software and automation methods can be used for mul - degrees of stringency of hybridization may be employed in tiplex synthesis and generation . For example , software and the assay . automation can be used to create 10 , 102 , 103, 104, 10 % , 106, High stringency conditions for nucleic acid hybridization or more synthesized polynucleotides, cassettes , or plasmids . 55 are well known in the art . For example , conditions may An automation method can generate desired sequences and comprise low salt and / or high temperature conditions , such libraries in rapid fashion that can be processed through a as provided by about 0 . 02 M to about 0 . 15 M NaCl at workflow with minimal steps to produce precisely defined temperatures of about 50° C . to about 70° C . It is understood libraries , such as gene -wide or genome -wide editing librar - that the temperature and ionic strength of a desired strin ies. 60 gency are determined in part by the length of the particular Polynucleotides or libraries can be generated which com - nucleic acid ( s) , the length and nucleotide content of the prise two or more nucleic acid molecules or plasmids target sequence ( s ) , the charge composition of the nucleic comprising any combination disclosed herein of recorder acid ( s ), and by the presence or concentration of formamide , sequence , editing sequence , guide nucleic acid , and optional tetramethylammonium chloride or other solvent( s ) in a barcode , including combinations of one or more of any of 65 hybridization mixture . Nucleic acids may be completely the previously mentioned elements . For example , such a complementary to a target sequence or may exhibit one or library can comprise at least 2 , 3 , 4 , 5 , 10 , 20 , 30 , 40 , 50 , 60 , more mismatches. US 10 , 435 , 714 B2 35 36 Nucleic acids of interest may also be amplified using a can be used to incorporate a desired edit into a target nucleic variety of known amplification techniques . For instance , acid sequence . The desired edit can include insertion , dele polymerase chain reaction (PCR ) technology may be used to tion , substitution , or alteration of the target nucleic acid amplify target sequences directly from DNA , RNA , or sequence. In some examples , the one or more recording cDNA . PCR and other in vitro amplification methods may 5 sequence and editing sequences are comprised on a single also be useful, for example , to clone nucleic acid sequences, cassette comprised within the plasmid such that they are to make nucleic acids to use as probes for detecting the incorporated into the target nucleic acid sequence by the presence of a target nucleic acid in samples , for nucleic acid sequencing, or for other purposes . same engineering event. In other examples , the recording Isolated nucleic acids may be prepared by direct chemical 10 and editing sequences are comprised on separate cassettes synthesis by methods such as the phosphotriestermethod , or within the plasmid such that they are each incorporated into using an automated synthesizer. Chemical synthesis gener the target nucleic acid by distinct engineering events . In ally produces a single stranded oligonucleotide. This may be some examples, the plasmid comprises two or more editing converted into double stranded DNA by hybridization with sequences . For example , one editing sequence can be used a complementary sequence or by polymerization with a 15 to alter or silence a PAM sequence while a second editing DNA polymerase using the single strand as a template . sequence can be used to incorporate a mutation into a Recorder distinct sequence . In some example , two editing cassettes can be used Recorder sequences can be inserted into a site separated together to track a genetic engineering step . For example , from the editing sequence insertion site . The inserted one editing cassette can comprise an editing template and an 20 recorder sequence can be separated from the editing encoded guide nucleic acid , and a second editing cassette , sequence by 1 bp to 1 Mbp . For example , the separation referred to as a recorder cassette , can comprise an editing distance can be about 1 bp , 10 bp , 50 bp , 100 bp , 500 bp , 1 template comprising a recorder sequence and an encoded kp, 2 kb , 5 kb , 10 kb , or greater . The separation distance can nucleic acid which has a distinct guide sequence compared be any discrete integer between 1 bp and 10 Mbp . In some to that of the first editing cassette . In such cases, the editing 25 examples , the maximum distance of separation depends on sequence and the recorder sequence can be inserted into the size of the target nucleic acid or genome. separate target sequences and determined by their corre - Recorder sequences can be inserted adjacent to editing sponding guide nucleic acids . A recorder sequence can sequences , or within proximity to the editing sequence . For comprise a barcode , trackable or traceable sequence , and /or example , the recorder sequence can be inserted outside of a regulatory element operable with a screenable or selectable 30 the open reading frame within which the editing sequence is marker. inserted . Recorder sequence can be inserted into an untrans Through a multiplexed cloning approach , the recorder lated region adjacent to an open reading frame within which cassette can be covalently coupled to at least one editing an editing sequence has been inserted . The recorder cassette in a plasmid ( e . g ., FIG . 17A , green cassette ) to sequence can be inserted into a functionally neutral or generate plasmid libraries that have a unique recorder and 35 non - functional site . The recorder sequence can be inserted editing cassette combination . This library can be sequenced into a screenable or selectable marker gene . to generate the recorder / edit mapping and used to track In some examples, the target nucleic acid sequence is editing libraries across large segments of the target DNA comprised within a genome, artificial chromosome, syn ( e . g . , FIG . 17C ) . Recorder and editing sequences can be thetic chromosome, or episomal plasmid . In various comprised on the same cassette , in which case they are both 40 examples , the target nucleic acid sequence can be in vitro or incorporated into the target nucleic acid sequence , such as a in vivo . When the target nucleic acid sequence is in vivo , the genome or plasmid , by the same recombination event. In plasmid can be introduced into the host organisms by other examples , the recorder and editing sequences can be transformation , transfection , conjugation , biolistics, nano comprised on separate cassettes within the same plasmid , in particles , cell - permeable technologies , or other known which case the recorder and editing sequences are incorpo - 45 methods for DNA delivery , or any combination thereof. In rated into the target nucleic acid sequence by separate such examples, the host organism can be a eukaryote , recombination events , either simultaneously or sequentially . prokaryote , bacterium , archaea , yeast , or other fungi. Methods are provided herein for combining multiplex The engineering event can comprise recombineering, oligonucleotide synthesis with recombineering , to create non -homologous end joining , homologous recombination , libraries of specifically designed and trackable mutations. 50 or homology - driven repair . In some examples , the engineer Screens and / or selections followed by high - throughput ing event is performed in vitro or in vivo . sequencing and /or barcode microarray methods can allow The methods described herein can be carried out in any for rapid mapping of mutations leading to a phenotype of type of cell in which a targetable nuclease system can interest . function ( e . g ., target and cleave DNA ) , including prokary Methods and compositions disclosed herein can be used 55 otic and eukaryotic cells . In some embodiments the cell is a to simultaneously engineer and track engineering events in bacterial cell , such as Escherichia spp . ( e . g . , E . coli) . In a target nucleic acid sequence . other embodiments , the cell is a fungal cell, such as a yeast Such plasmids can be generated using in vitro assembly or cell , e . g ., Saccharomyces spp . In other embodiments , the cell cloning techniques. For example , the plasmids can be gen - is an algal cell, a plant cell , an insect cell , or a mammalian erated using chemical synthesis , Gibson assembly , SLIC , 60 cell , including a human cell. CPEC , PCA , ligation - free cloning, other in vitro oligo In some examples , the cell is a recombinant organism . For assembly techniques , traditional ligation - based cloning , or example , the cell can comprise a non - native targetable any combination thereof. nuclease system . Additionally or alternatively , the cell can Such plasmids can comprise at least one recording comprise recombination system machinery . Such recombi sequence , such as a barcode, and at least one editing 65 nation systems can include lambda red recombination sys sequence . In most cases, the recording sequence is used to tem , Cre /Lox , attB / attp , or other integrase systems. Where record and track engineering events . Each editing sequence appropriate , the plasmid can have the complementary com US 10 , 435 , 714 B2 37 38 ponents or machinery required for the selected recombina - recorder sequence of the target DNA in at least one cell of tion system to work correctly and efficiently . the second population of cells to identify the mutation of at Method for genome editing can comprise : ( a) introducing least one codon . a vector that encodes at least one editing cassette and at least In some examples transformation efficiency is determined one guide nucleic acid into a first population of cells , thereby 5 by using a non - targeting control guide nucleic acid , which producing a second population of cells comprising the allows for validation of the recombineering procedure and vector ; (b ) maintaining the second population of cells under conditions in which a nucleic acid - guided nuclease is CFU /ng calculations. In some cases, absolute efficient is expressed or maintained , wherein the nucleic acid - guided obtained by counting the total number of colonies on each nuclease is encoded on the vector, a second vector, on the 10 transformation plate , for example, by counting both red and genome of cells of the second population of cells , or white colonies from a galK control. In some examples, otherwise introduced into the cell, resulting in DNA cleav relative efficiency is calculated by the total number of age and incorporation of the editing cassette ; ( c ) obtaining successful transformants ( for example , white colonies ) out viable cells ; and (d ) sequencing the target DNA molecule in of all colonies from a control ( for example , galK control) . at least one cell of the second population of cells to identify 15 The methods of the disclosure can provide , for example , the mutation of at least one codon . greater than 1000x improvements in the efficiency , scale , A method for genome editing can comprise : (a ) introduc - cost of generating a combinatorial library , and / or precision ing a vector that encodes at least one editing cassette 01of such library generation . comprising a PAM mutation as disclosed herein and at least The methods of the disclosure can provide, for example , one guide nucleic acid into a first population of cells, thereby 20 greater than : 10x , 50x , 100x , 200x , 300x , 400x, 500x , 600x , producing a second population of cells comprising the 700x , 800x , 900x , 1000x , 1100x , 1200x , 1300x , 1400x , vector ; (b ) maintaining the second population of cells under 1500x , 1600x , 1700X, 1800x , 1900x , 2000x , or greater conditions in which a nucleic acid - guided nuclease is improvements in the efficiency of generating genomic or expressed or maintained , wherein the nucleic acid - guided combinatorial libraries. nuclease is encoded on the vector, a second vector , on the 25 The methods of the disclosure can provide , for example , genome of cells of the second population of cells , or greater than : 10x , 50x , 100x , 200x , 300m , 400m , 500x , 600x , otherwise introduced into the cell, resulting in DNA cleav - 700x , 800x , 900x , 1000x , 1100x , 1200x , 1300x , 1400x , age , incorporation of the editing cassette , and death of cells 1500x , 1600x , 1700x , 1800x , 1900x , 2000x , or greater of the second population of cells that do not comprise the improvements in the scale of generating genomic or com PAM mutation , whereas cells of the second population of 30 binatorial libraries. cells that comprise the PAM mutation are viable ; ( c ) obtain The methods of the disclosure can provide, for example , ing viable cells ; and ( d ) sequencing the target DNA in at greater than : 10x , 50x , 100x , 200x , 300x , 400x, 500x , 600x , least one cell of the second population of cells to identify the 700x , 800x , 900x, 1000x, 1100x , 1200x , 1300x , 1400x , mutation of at least one codon . 1500x , 1600x , 1700x , 1800x , 1900x , 2000x , or greater Method for trackable genome editing can comprise : (a ) 35 decrease in the cost of generating genomic or combinatorial introducing a vector that encodes at least one editing cas - libraries . sette , at least one recorder cassette , and at least two guide The methods of the disclosure can provide , for example , nucleic acids into a first population of cells , thereby pro - greater than : 10 % , 50 % , 100x , 200x , 300x , 400x , 500x , 600x , ducing a second population of cells comprising the vector ; 700x, 800x , 900x, 1000x, 1100x , 1200x , 1300x , 1400x , ( b ) maintaining the second population of cells under con - 40 1500x , 1600x , 1700x , 1800x , 1900x , 2000x , or greater ditions in which a nucleic acid - guided nuclease is expressed improvements in the precision of genomic or combinatorial or maintained , wherein the nucleic acid -guided nuclease is library generation . encoded on the vector, a second vector, on the genome of Recursive Tracking for Combinatorial Engineering cells of the second population of cells , or otherwise intro - Disclosed herein are methods and compositions for itera duced into the cell, resulting in DNA cleavage and incor - 45 tive rounds of engineering . Disclosed herein are recursive poration of the editing and recorder cassettes ; (c ) obtaining engineering strategies that allow implementation of CRE viable cells ; and ( d ) sequencing the recorder sequence of the ATE recording at the single cell level through several serial target DNA molecule in at least one cell of the second engineering cycles ( e . g . , FIG . 18 and FIG . 19 ) . These population of cells to identify the mutation of at least one disclosed methods and compositions can enable search codon . 50 based technologies that can effectively construct and explore In some examples where the plasmid comprises a second complex genotypic space . The terms recursive and iterative editing sequence designed to silence a PAM , a method for can be used interchangeably . trackable genome editing can comprise : ( a ) introducing a Combinatorial engineering methods can comprise mul vector that encodes at least one editing cassette , a recorder tiple rounds of engineering . Methods disclosed herein can cassette , and at least two guide nucleic acids into a first 55 comprise 2 or more rounds of engineering. For example , a population of cells , thereby producing a second population method can comprise 2 , 3 , 4 , 5 , 6 , 7, 8 , 9 , 10 , 11, 12 , 13 , 14 , of cells comprising the vector ; (b ) maintaining the second 15 , 20 , 25 , 30 , or more than 30 rounds of engineering. population of cells under conditions in which a nucleic In some examples , during each round of engineering a acid - guided nuclease is expressed or maintained , wherein new recorder sequence , such as a barcode, is incorporated at the nucleic acid - guided nuclease is encoded on the vector , a 60 the same locus in nearby sites ( e . g . , FIG . 18 , green bars or second vector , on the genome of cells of the second popu - FIG . 19 , black bars ) such that following multiple engineer lation of cells, or otherwise introduced into the cell , resulting ing cycles to construct combinatorial diversity throughout in DNA cleavage , incorporation of the editing and recorder the genome (e . g ., FIG . 18 , green bars or FIG . 19 , grey bars ) cassettes , and death of cells of the second population of cells a simple PCR of the recording locus can be used to recon that do not comprise the PAM mutation , whereas cells of the 65 struct each combinatorial genotype or to confirm that the second population of cells that comprise the PAM mutation engineered edit from each round has been incorporated into are viable ; ( c ) obtaining viable cells ; and ( d ) sequencing the the target site . US 10 , 435 , 714 B2 39 40 Disclosed herein are methods for selecting for successive screenable or selectable marker on and off. Such short rounds of engineering . Selection can occur by a PAM sequences can easily fit within a synthesized cassette or mutation incorporated by an editing cassette . Selection can polynucleotide . occur by a PAM mutation incorporated by a recorder cas One or more rounds of engineering can be performed sette . Selection can occur using a screenable , selectable , or 5 using the methods and compositions disclosed herein . In some examples , each round of engineering is used to incor counter - selectable marker. Selection can occur by targeting porate an edit unique from that of previous rounds. Each a site for editing or recording that was incorporated by a round of engineering can incorporate a unique recording prior round of engineering , thereby selecting for variants sequence . Each round of engineering can result in removal that successfully incorporated edits and recorder sequences 10 or curing of the plasmid used in the previous round of from both rounds or all prior rounds of engineering . engineering . In some examples, successful incorporation of Quantitation of these genotypes can be used for under the recording sequence of each round of engineering results standing combinatorial mutational effects on large popula in a complete and functional screenable or selectable marker tions and investigation of important biological phenomena or unique sequence combination . such as epistasis . 15 Unique recorder cassettes comprising recording Serial editing and combinatorial tracking can be imple sequences such as barcodes or screenable or selectable mented using recursive vector systems as disclosed herein . markers can be inserted with each round of engineering , These recursive vector systems can be used to move rapidly thereby generating a recorder sequence that is indicative of through the transformation procedure . In some examples, the combination of edits or engineering steps performed . these systems consist of two or more plasmids containing 20 Successive recording sequences can be inserted adjacent to orthogonal replication origins, antibiotic markers , and an one another. Successive recording sequences can be inserted encoded guide nucleic acids. The encoded guide nucleic acid within proximity to one another. Successive sequences can in each vector can be designed to target one of the other be inserted at a distance from one another. resistance markers for destruction by nucleic acid - guided Successive sequences can be inserted at a distance from nuclease -mediated cleavage . These systems can be used , in 25 one another. For example , successive recorder sequences some examples, to perform transformations in which the can be inserted and separated by 0 , 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 . antibiotic selection pressure is switched to remove the 10 , 11 , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21 , 22 , 23 , 24 , 25 , previous plasmid and drive enrichment of the next round of 26 , 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , 35 , 36 , 37 , 38 , 39 , 40 , 41 , engineered genomes. Two or more passages through the 42, 43 , 44 , 45 , 46 , 47 , 48 , 49, 50 , 51, 52 , 53 , 54 , 55 , 56 , 57 , 30 58 , 59 , 60 , 61, 62 , 63 , 64, 65, 66 , 67, 68 , 69, 70 , 71 , 72 , 73 , transformation loop can be performed , or in other words , 74 , 75 , 76 , 78, 79 , 80 , 81, 82, 83, 84 , 85 , 86 , 87, 88 , 89 , 90 , multiple rounds of engineering can be performed . Introduc 91, 92 , 93 , 94 , 95 , 96 , 97 , 98 , 99 , 100 , or greater than 100 ing the requisite recording cassettes and editing cassettes bp . In some examples, successive recorder sequences are into recursive vectors as disclosed herein can be used for separated by about 10 , 50 , 100 , 150 , 200 , 250 , 300 , 350 , 400 , simultaneous genome editing and plasmid curing in each 35 450 . 500 . 550 . 600 . 650 . 700 . 750. 800 . 850 . 900 . 950 . 1000 . transformation step with high efficiencies. 1100 , 1200 , 1300 , 1400 , 1500 , or greater than 1500 bp . In some examples , the recursive vector system disclosed Successive recorder sequences can be separated by any herein comprises 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , or more than 10 desired number of base pairs and can be dependent and unique plasmids . In some examples , the recursive vector limited on the number of successive recorder sequences to system can use a particular plasmid more than once as long 40 be inserted , the size of the target nucleic acid or target as a distinct plasmid is used in the previous round and in the genomes , and / or the design of the desired final recorder subsequent round . sequence . For example , if the compiled recorder sequence is Recursive methods and compositions disclosed herein can a functional screenable or selectable marker, than the suc be used to restore function to a selectable or screenable cessive recording sequences can be inserted within proxim element in a targeted genome or plasmid . The selectable or 45 ity and within the same reading frame from one another . If screenable element can include an antibiotic resistance gene , the compiled recorder sequence is a unique set of barcodes a fluorescent gene , a unique DNA sequence or watermark , or to be identified by sequencing and have no coding sequence other known reporter , screenable , or selectable gene . In element, then the successive recorder sequences can be some examples , each successive round of engineering can inserted with any desired number of base pairs separating incorporate a fragment of the selectable or screenable ele - 50 them . In these cases , the separation distance can be depen ment, such that at the end of the engineering rounds, the dent on the sequencing technology to be used and the read entire selectable or screenable element has been incorpo - length limit . rated into the target genome or plasmid . In such examples , While preferred embodiments of the present invention only those genome or plasmids which have successfully have been shown and described herein , it will be obvious to incorporated all of the fragments , and therefore all of the 55 those skilled in the art that such embodiments are provided desired corresponding mutations , can be selected or by way of example only . Numerous variations , changes , and screened for. In this way , the selected or screened cells will substitutions will now occur to those skilled in the art be enriched for those that have incorporated the edits from without departing from the invention . It should be under each and every iterative round of engineering . stood that various alternatives to the embodiments of the Recursive methods can be used to switch a selectable or 60 invention described herein may be employed in practicing screenable marker between an on and an off position , or the invention . It is intended that the following claims define between an off and an on position , with each successive the scope of the invention and that methods and structures round of engineering . Using such a method allows conser within the scope of these claims and their equivalents be vation of available selectable or screenable markers by covered thereby . requiring , for example , the use of only one screenable or 65 Some Definitions selectable marker . Furthermore , short regulatory sequence As used herein the term “wild type ” is a term of the art or start codon or non -start codons can be used to turn the understood by skilled persons and means the typical form of US 10 ,435 , 714 B2 an organism , strain , gene or characteristic as it occurs in 40 , 45 , 50 , ormore nucleotides, or refers to two nucleic acids nature as distinguished from mutant or variant forms. that hybridize under stringent conditions As used herein the term “ variant” should be taken to mean As used herein , " stringent conditions” for hybridization the exhibition of qualities that have a pattern that deviates refer to conditions under which a nucleic acid having from what occurs in nature . complementarity to a target sequence predominantly hybrid The terms “ orthologue ” ( also referred to as “ ortholog ” izes with the target sequence , and substantially does not herein ) and “ homologue ” (also referred to as “ homolog " hybridize to non - target sequences . Stringent conditions are herein ) are well known in the art . By means of further generally sequence - dependent, and vary depending on a guidance , a “ homologue” of a protein as used herein is a number of factors . In general, the longer the sequence , the protein of the same species which performs the same or a 10 higher the temperature at which the sequence specifically similar function as the protein it is a homologue of. Homolo - hybridizes to its target sequence . Non - limiting examples of gous proteins may but need not be structurally related , or are stringent conditions are described in detail in Tijssen ( 1993 ). only partially structurally related . An “ orthologue ” of a Laboratory Techniques In Biochemistry And Molecular protein as used herein is a protein of a different species Biology -Hybridization With Nucleic Acid Probes Part I , which performs the same or a similar function as the protein 15 Second Chapter " Overview of principles of hybridization it is an orthologue of Orthologous proteins may but need not and the strategy of nucleic acid probe assay " , Elsevier, N . Y . be structurally related, or are only partially structurally Where reference is made to a polynucleotide sequence , then related . Homologs and orthologs may be identified by complementary or partially complementary sequences are homology modelling ( see , e . g . , Greer, Science vol. 228 also envisaged . These are preferably capable of hybridising ( 1985 ) 1055 , and Blundell et al. Eur J Biochem vol 172 20 to the reference sequence under highly stringent conditions . ( 1988 ) , 513 ) or “ structural BLAST ” ( Dey F , Cliff Zhang Q , Generally, in order to maximize the hybridization rate , Petrey D , Honig B . Toward a " structural BLAST ” : using relatively low -stringency hybridization conditions are structural relationships to infer function . Protein Sci . 2013 selected : about 20 to 25 degrees Celsius. lower than the April; 22 ( 4 ) :359 - 66 . doi : 10 . 1002/ pro .2225 . ) . thermal melting point ( Tm ) . The Tm is the temperature at The terms “ polynucleotide ” , “ nucleotide ” , “ nucleotide 25 which 50 % of specific target sequence hybridizes to a sequence ” , “ nucleic acid ” and “ oligonucleotide ” are used perfectly complementary probe in solution at a defined ionic interchangeably. They refer to a polymeric form of nucleo strength and pH . Generally, in order to require at least about tides of any length , either deoxyribonucleotides or ribo - 85 % nucleotide complementarity of hybridized sequences , nucleotides , or analogs thereof. Polynucleotides may have highly stringent washing conditions are selected to be about any three dimensional structure , and may perform any 30 5 to 15 degrees Celsius lower than the Tm . In order to function , known or unknown . The following are non - limit - require at least about 70 % nucleotide complementarity of ing examples of polynucleotides : coding or non - coding hybridized sequences, moderately -stringent washing condi regions of a gene or gene fragment, loci (locus ) defined from tions are selected to be about 15 to 30 degrees Celsius lower linkage analysis , exons , introns, messenger RNA (MRNA ) , than the Tm . Highly permissive ( very low stringency ) wash transfer RNA , ribosomal RNA , short interfering RNA 35 ing conditions may be as low as 50 degrees Celsius below ( siRNA ), short- hairpin RNA ( shRNA ) , micro -RNA the Tm , allowing a high level of mis -matching between (miRNA ) , ribozymes, cDNA , recombinant polynucleotides, hybridized sequences. Those skilled in the art will recognize branched polynucleotides , plasmids, vectors , isolated DNA that other physical and chemical parameters in the hybrid of any sequence, isolated RNA of any sequence , nucleic acid ization and wash stages can also be altered to affect the probes, and primers . The term also encompasses nucleic - 40 outcome of a detectable hybridization signal from a specific acid - like structures with synthetic backbones , see , e . g . , level of homology between target and probe sequences . Eckstein , 1991 ; Baserga et al. , 1992 ; Milligan , 1993 ; WO “Hybridization ” refers to a reaction in which one or more 97 /03211 ; WO 96 /39154 ; Mata , 1997 ; Strauss- Soukup , polynucleotides react to form a complex that is stabilized via 1997 ; and Samstag , 1996 . A polynucleotide may comprise hydrogen bonding between the bases of the nucleotide one or more modified nucleotides , such as methylated 45 residues . The hydrogen bonding may occur by Watson Crick nucleotides and nucleotide analogs . If present, modifications base pairing, Hoogstein binding , or in any other sequence to the nucleotide structure may be imparted before or after specific manner. The complex may comprise two strands assembly of the polymer. The sequence of nucleotides may forming a duplex structure , three or more strands forming a be interrupted by non - nucleotide components . A polynucle - multi stranded complex , a single self- hybridizing strand , or otide may be further modified after polymerization , such as 50 any combination of these . A hybridization reaction may by conjugation with a labeling component. constitute a step in a more extensive process, such as the “ Complementarity ” refers to the ability of a nucleic acid initiation of PCR , or the cleavage of a polynucleotide by an to form hydrogen bond ( s ) with another nucleic acid enzyme. A sequence capable of hybridizing with a given sequence by either traditional Watson - Crick base pairing or sequence is referred to as the “ complement” of the given other non -traditional types. A percent complementarity indi- 55 sequence . cates the percentage of residues in a nucleic acid molecule As used herein , the term " genomic locus ” or “ locus” which can form hydrogen bonds ( e . g ., Watson - Crick base ( plural loci) is the specific location of a gene or DNA pairing ) with a second nucleic acid sequence ( e . g . , 5 , 6 , 7 , 8 , sequence on a chromosome. A " gene” refers to stretches of 9 , 10 out of 10 being 50 % , 60 % , 70 % , 80 % , 90 % , and 100 % DNA or RNA that encode a polypeptide or an RNA chain complementary ) . “ Perfectly complementary ” means that all 60 that has functional role to play in an organism and hence is the contiguous residues of a nucleic acid sequence will the molecular unit of heredity in living organisms. For the hydrogen bond with the same number of contiguous residues purpose of this invention it may be considered that genes in a second nucleic acid sequence . “ Substantially comple include regions which regulate the production of the gene mentary ” as used herein refers to a degree of complemen - product, whether or not such regulatory sequences are tarity that is at least 60 % , 65 % , 70 % , 75 % , 80 % , 85 % , 90 % , 65 adjacent to coding and /or transcribed sequences . Accord 95 % , 97 % , 98 % , 99 % , or 100 % over a region of 8 , 9 , 10 , 11, ingly , a gene includes, but is not necessarily limited to , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19, 20 , 21 , 22 , 23 , 24 , 25 , 30 , 35 , promoter sequences, terminators, translational regulatory US 10 , 435 , 714 B2 43 sequences such as ribosome binding sites and internal ribo - nucleotide in the other sequence , one residue at a time . This some entry sites, enhancers , silencers , insulators , boundary is called an " ungapped ” alignment. Typically , such elements, replication origins, matrix attachment sites and ungapped alignments are performed only over a relatively locus control regions. short number of residues. As used herein , " expression of a genomic locus” or “ gene 5 Although this is a very simple and consistent method , it expression ” is the process by which information from a gene fails to take into consideration that , for example , in an is used in the synthesis of a functional gene product. The otherwise identical pair of sequences , one insertion or dele products of gene expression are often proteins, but in tion may cause the following amino acid residues to be put non -protein coding genes such as rRNA genes or tRNA out of alignment, thus potentially resulting in a large reduc genes , the product is functional RNA . The process of gene 10 tion in % homology when a global alignment is performed . expression is used by all known life - eukaryotes ( including Consequently, most sequence comparison methods are multicellular organisms) , prokaryotes (bacteria and archaea ) designed to produce optimal alignments that take into con and viruses to generate functional products to survive . As s ideration possible insertions and deletions without unduly used herein “ expression ” of a gene or nucleic acid encom penalizing the overall homology or identity score . This is passes not only cellular gene expression , but also the tran - 15 achieved by inserting " gaps ” in the sequence alignment to scription and translation of nucleic acid ( s ) in cloning sys - try to maximize local homology or identity . tems and in any other context. As used herein , " expression ” However, these more complex methods assign " gap pen also refers to the process by which a polynucleotide is alties” to each gap that occurs in the alignment so that, for transcribed from a DNA template (such as into and mRNA the same number of identical amino acids, a sequence or other RNA transcript ) and / or the process by which a 20 alignment with as few gaps as possible - reflecting higher transcribed mRNA is subsequently translated into peptides, relatedness between the two compared sequences — may polypeptides, or proteins . Transcripts and encoded polypep - achieve a higher score than one with many gaps . “ Affinity tidesmay be collectively referred to as “ gene product. ” If the gap costs ” are typically used that charge a relatively high polynucleotide is derived from genomic DNA, expression cost for the existence of a gap and a smaller penalty for each may include splicing of the mRNA in a eukaryotic cell. 25 subsequent residue in the gap . This is the most commonly The terms “ polypeptide ” , “ peptide ” and “ protein ” are used gap scoring system . High gap penalties may , of course , used interchangeably herein to refer to polymers of amino produce optimized alignments with fewer gaps . Most align acids of any length . The polymer may be linear or branched , ment programs allow the gap penalties to be modified . it may comprise modified amino acids , and it may be However, it is preferred to use the default values when using interrupted by non amino acids . The terms also encompass 30 such software for sequence comparisons . For example , when an amino acid polymer that has been modified ; for example , using the GCG Wisconsin Bestfit package the default gap disulfide bond formation , glycosylation , lipidation , acety - penalty for amino acid sequences is – 12 for a gap and - 4 for lation , phosphorylation , or any other manipulation , such as each extension . conjugation with a labeling component. As used herein the Calculation of maximum % homology therefore first term " amino acid ” includes natural and / or unnatural or 35 requires the production of an optimal alignment , taking into synthetic amino acids, including glycine and both the D or consideration gap penalties. A suitable computer program L optical isomers, and amino acid analogs and peptidomi- for carrying out such an alignment is the GCG Wisconsin metics . Bestfit package (Devereux et al. , 1984 Nuc . Acids Research As used herein , the term “ domain ” or “ protein domain ” 12 p387 ) . Examples of other software that may perform refers to a part of a protein sequence that may exist and 40 sequence comparisons include , but are not limited to , the function independently of the rest of the protein chain . BLAST package (see Ausubel et al. , 1999 Short Protocols in As described in aspects of the invention , sequence iden - Molecular Biology, 4th Ed . -Chapter 18 ), FASTA (Altschul tity is related to sequence homology . Homology compari- et al. , 1990 J. Mol. Biol. 403 -410 ) and the GENEWORKS sons may be conducted by eye, or more usually , with the aid suite of comparison tools . Both BLAST and FASTA are of readily available sequence comparison programs. These 45 available for offline and online searching ( see Ausubel et al. , commercially available computer programs may calculate 1999 , Short Protocols in Molecular Biology, pages 7 - 58 to percent ( % ) homology between two or more sequences and 7 -60 ) . However, for some applications, it is preferred to use may also calculate the sequence identity shared by two or the GCG Bestfit program . A new tool, called BLAST 2 more amino acid or nucleic acid sequences . Sequence Sequences is also available for comparing protein and homologies may be generated by any of a number of 50 nucleotide sequences ( see FEMS Microbiol Lett . 1999 174 computer programs known in the art, for example BLAST or (2 ): 247 -50 ; FEMS Microbiol Lett . 1999 177 ( 1 ): 187 - 8 and FASTA , etc . A suitable computer program for carrying out the website of the National Center for Biotechnology infor such an alignment is the GCG Wisconsin Bestfit package mation at the website of the National Institutes for Health ) . ( University of Wisconsin . U . S . A ; Devereux et al. , 1984 , Although the final % homology may be measured in terms Nucleic Acids Research 12 : 387) . Examples of other soft - 55 of identity , the alignment process itself is typically not based ware than may perform sequence comparisons include , but on an all -or -nothing pair comparison . Instead , a scaled are not limited to , the BLAST package ( see Ausubel et al. , similarity score matrix is generally used that assigns scores 1999 ibid -Chapter 18 ) , FASTA (Atschul et al. , 1990 , J . Mol. to each pair -wise comparison based on chemical similarity Biol. , 403 -410 ) and the GENEWORKS suite of comparison or evolutionary distance . An example of such a matrix tools . Both BLAST and FASTA are available for offline and 60 commonly used is the BLOSUM62 matrix — the default online searching ( see Ausubel et al ., 1999 ibid , pages 7 -58 matrix for the BLAST suite of programs. GCG Wisconsin to 7 -60 ) . However it is preferred to use the GCG Bestfit programs generally use either the public default values or a program . custom symbol comparison table , if supplied ( see user Percent homology may be calculated over contiguous manual for further details ) . For some applications , it is sequences , i . e . , one sequence is aligned with the other 65 preferred to use the public default values for the GCG sequence and each amino acid or nucleotide in one sequence package , or in the case of other software , the default matrix , is directly compared with the corresponding amino acid or such as BLOSUM62 . US 10 , 435 , 714 B2 45 46 Alternatively , percentage homologies may be calculated The practice of the present invention employs, unless using the multiple alignment feature in DNASISTM (Hitachi otherwise indicated , conventional techniques of immunol Software ), based on an algorithm , analogous to CLUSTAL ogy, biochemistry , chemistry , molecular biology , microbiol ( Higgins D G & Sharp PM ( 1988 ) , Gene 73 ( 1 ) , 237 - 244 ) . Once the software has produced an optimal alignment , it is a ogy , cell biology , genomics and recombinant DNA , which possible to calculate % homology , preferably % sequence 5 are within the skill of the art . See Green and Sambrook , identity . The software typically does this as part of the (Molecular Cloning : A Laboratory Manual. 4th , ed ., Cold sequence comparison and generates a numerical result . Spring Harbor Laboratory Press , Cold Spring Harbor, N . Y ., Sequences may also have deletions , insertions or substi 2014 ) ; CURRENT PROTOCOLS IN MOLECULAR tutions of amino acid residues which produce a silent change BIOLOGY ( F . M . Ausubel , et al. eds. , ( 2017 ) ) ; Short Pro and result in a functionally equivalent substance . Deliberate 10 tocols in Molecular Biology, (Ausubel et al . , 1999 ) ) ; the amino acid substitutions may be made on the basis of series METHODS IN ENZYMOLOGY (Academic Press , similarity in amino acid properties ( such as polarity , charge, Inc . ) : PCR 2 : A PRACTICAL APPROACH ( M . J . MacPher solubility , hydrophobicity , hydrophilicity, and /or the son , B . D . Hames and G . R . Taylor eds. ( 1995 ) ) , ANTI amphipathic nature of the residues ) and it is therefore useful BODIES , A LABORATORY MANUAL , SECOND EDI to group amino acids together in functional groups. Amino 15 TION (Harlow and Lane , eds. ( 2014 ) and CULTURE OF acids may be grouped together based on the properties of ANIMAL CELLS: A MANUAL OF BASIC TECHNIQUE , their side chains alone. However, it is more useful to include 7TH EDITION RIR . I. FreshneyFre , ed . (2016 ) ) . mutation data as well . The sets of amino acids thus derived are likely to be conserved for structural reasons. These sets may be described in the form of a Venn diagram (Living EXAMPLES stone C . D . and Barton G . J . ( 1993 ) “ Protein sequence 20 alignments : a strategy for the hierarchical analysis of residue The following examples are given for the purpose of conservation ” Comput. Appl. Biosci . 9 : 745 -756 ) ( Taylor W . illustrating various embodiments of the invention and are R . ( 1986 ) “ The classification of amino acid conservation ” J . not meant to limit the present invention in any fashion . The Theor. Biol. 119 ; 205 - 218 ) . Conservative substitutions may present examples , along with the methods described herein be made , for example according to the table below which 25 are presently representative of preferred embodiments , are describes a generally accepted Venn diagram grouping of exemplary , and are not intended as limitations on the scope amino acids . of the invention . Changes therein and other uses which are Embodiments of the invention include sequences (both encompassed within the spirit of the invention as defined by polynucleotide or polypeptide ) which may comprise the scope of the claims will occur to those skilled in the art . homologous substitution (substitution and replacement are 30 both used herein to mean the interchange of an existing Example 1 . Nucleic Acid -Guided Nucleases amino acid residue or nucleotide , with an alternative residue or nucleotide ) that may occur i . e . , like - for- like substitution Sequences for twenty nucleic acid guided nucleases , in the case of amino acids such as basic for basic , acidic for termed MAD1- MAD20 (SEQ ID NOs 1 - 20 ) , were aligned acidic , polar for polar , etc . Non -homologous substitution 35 and compared to other nucleic acid guided nucleases. A may also occur i . e ., from one class of residue to another or partial alignment and phylogenetic tree are depicted in FIG . alternatively involving the inclusion of unnatural amino 1A and FIG . 1B respectively . Key residues in that may be acids such as ornithine ( hereinafter referred to as Z ) , diamin - involved in the recognition of a PAM site are shown in FIG . obutyric acid ornithine ( hereinafter referred to as B ) , nor - 1A . These include amino acids at positions 167 , 539 , 548 , leucine ornithine (hereinafter referred to as O ), pyridylala - 40 599, 603 , 604 , 605 , 606 , and 607. nine, thienylalanine, naphthylalanine and phenylglycine . Sequence alignments were built using PSI- BLAST to Variant amino acid sequences may include suitable spacer search for MAD nuclease homologs in the NCBI non groups that may be inserted between any two amino acid redundant databases . Multiple sequence alignments were residues of the sequence including alkyl groups such as further refined using the MUSCLE alignment algorithm with methyl, ethyl or propyl groups in addition to amino acid 45 default settings as implemented in Geneious 10 . The percent spacers such as glycine or .beta . - alanine residues. A further identity of each homolog to SpCas9 and AsCpf1 reference form of variation , which involves the presence of one or sequences were computed based on the pairwise alignment more amino acid residues in peptoid form , may be well matching from these global alignments . understood by those skilled in the art . For the avoidance of Genomic source sequences were identified using Uniprot doubt, “ the peptoid form ” is used to refer to variant amino 50 linkage information or TBLASTN searches of NCBI using acid residues wherein the .alpha . -carbon substituent group is the default parameters and searching all possible frames for on the residue ' s nitrogen atom rather than the .alpha . - carbon . translational matches . Processes for preparing peptides in the peptoid form are Percent identities of MAD1- 8 and 10 - 12 to other various known in the art , for example Simon R J et al. , PNAS ( 1992 ) nuclease are summarized in Table 1 . These percent identities 89 (20 ) , 9367 - 9371 and Horwell D C , Trends Biotechnol. represent the shared amino acid sequence identity between ( 1995 ) 13 ( 4 ) , 132- 134 . the indicated proteins. TABLE 1 Protein identifier or accession number MAD1 MAD2 MAD MAD MAD MAD6 MAD MADS MAD10 MAD11 MAD12 gi| 10257348611 6 .4 32. 8 33. 2 29 .7 29 .4 31. 1 30. 3 31. 7 26 .7 27. 9 98. 8 pdb |5B43 | A US 10 , 435 , 714 B2 47 48 TABLE 1 - continued Protein identifier or accession number MADI MAD2 MAD3 MAD MAD MAD MAD MADS MAD10 MAD11 MAD12 gi| 10522451731 6 .4 32 . 7 33 . 1 29 . 7 29 . 3 31 30 . 3 31. 7 26 .7 27 . 8 98. 7 pdb |5KK5 | A gi/ 10862166831 6 .. 1 33 34 . 4 2929 . . 6 30 . 1 33. 5 32323 . 3 32 . 1 2626 . . 2 27 . 2 4646 . . 88 embl SDC16215 . 11 gi| 11201753331 5 .. 9 30 . . 9 37 . 2 3232 . 8 33 . 6 34. . 4 3535 . 7 35 . 1 2626 . 33 2828 . 3 34 . 9 ref] WP_ 073043853 . 11 Cpfl . Sjl 6 . 6 33 . 6 41. 7 37. 2 33 . 4 37 . 6 40 . 1 37 . 7 29 . 1 30 . 3 34 . 1 WP _ 081839471 Cpfl. Ss |KF067989 6 . 9 32 . 3 35 . 7 43 33 . 7 45 . 9 34 . 8 48 33 . 2 33 . 4 33 . 8 MAD3 5 . 8 31 100 32 . 9 35 . 9 35 . 6 34 . 3 28 27 . 6 33 . 1 gi/ 10824745761 31. 4 35 . 9 43 . 2 31 . 4 33 . 6 48 . 6 30 . 8 33 . 5 33 gb |OFY19591 . 1 || MAD2 6 . 1 100 31 30 . 7 30 . 2 31 31. 2 31 . 2 25 . 8 27 . 7 32 . 6 Cpfi . Lb51 7 . 8 32 . 8 36 . 5 38 . 2 34 . 2 45 . 5 35 . 8 43. 6 30 . 7 35 . 7 32 .5 WP_ 016301126 gi| 10882867361 6 . 7 30 . 6 35 . 3 42. 4 33 . 2 44 . 7 32 . 1 46 . 8 30 . 7 32 . 6 32 . 4 gb |OHB41002 . 11 gi| 10944233101 6 . 8 30 . 8 36 . 1 40 . 4 31 . 8 50 . 4 3535 . . 2 46 . 6 30 . 4 36 . 8 32 . 3 embl 46. 6 30 .4 SERO3894 . 1 | gi| 4933265311 6 . 8 30 . 8 36 . 1 40 . 3 31. 8 50 . 3 35 . 1 46 . 6 30 . 4 36 . 8 32 . 3 refWP _ 006283774 . 1 || MADS 7 .6 31. 2 34 . 3 40 . 4 32 41. 6 32 . 8 100 30 . 1 32 . 1 31 . 7 Cpfi. Bot 6 . 9 30 . 1 36 . 6 41. 5 32 . 5 50 . 2 35 . 4 45 . 5 29 . 8 34 . 1 31 . 6 WP 009217842 Cpfi. Li |WP _ 020988726 7 . 3 30 . 2 34. 6 39 . 3 30 . 3 40 . 7 31 . 8 39 . 4 32 . 1 31 . 3 31 . 5 Cpf1. Pb | 6 . 3 31 . 4 31. 8 36 .1 30 .8 45 . 7 30 . 4 39 . 4 27 . 7 33. 5 31. 5 WP _ 044110123 gi| 8179113721 7 . 3 29 . 8 35 40. 7 32 . 1 40. 3 32 . 6 41 . 7 29. 1 31. 4 gb |AKG08867 . 11 gi| 10528385331 6 . 6 30. 8 35 .5 32 31. 5 34 . 4 51 . 9 33 . 4 26 . 1 31 . 3 embl SCH45297 . 1 ] gi | 10537133321 7 . 2 29. 6 33 .2 39 .6 29 .8 49. 1 322 . 4 1. 4 30 .1 32. 4 31. 3 ref] WP _ 066040075 . 11 gi| 8179090021 7 . 3 29 . 8 35 40 . 7 32 40 . 3 32 . 5 41. 6 29 . 1 30 . 9 31. 3 gb | AKG06878 .11 gi| 10422014771 7 . 2 29 . 5 35 . 2 40 . 6 31 . 9 40 . 1 32 . 7 41. 6 29 30 . 8 31 . 2 ref WP_ 065256572 . 1 || MAD6 7 . 5 31 35 38 . 9 33 . 1 100 34 . 3 41. 6 30 . 5 33 . 6 31 gi14904687731 6 .8 31 . 8 31. 7 36 . 2 28 . 6 36 . 5 31 . 4 38 . 4 28 . 5 31 . 4 ref]WP _ 004339290 . 1 || gi/ 5658537041 7 . 5 30 . 8 34 . 9 38 . 9 33 . 1 99 . 7 34 . 1 41. 6 30 . 4 33 . 6 ref] WP _ 023936172 . 1 ] gi17390057071 7 . 5 309 . 35 38 . 9 33 99 . 9 34 . 2 41. 5 30 . 4 33 .5 31 ref] 33 99. 9 34. 2 415 30. 4 WP _ 036887416 . 11 gi| 7390085491 7 . 5 31 35 38. 8 33 99 . 8 34 . 2 41. 5 30 . 4 33 . 5 31 refWP _ 036890108 . 1 || 33 99. 8 34. 2 41. 5 30 4 Cpfl . Ft| WP _ 014550095 7 .1 31 . 9 33 . 8 40 . 3 29 . 7 39 . 4 34 . 1 41 29 . 8 32 . 5 30 . 8 gi| 5043629931 7 .2 32. 4 33 . 8 40 . 3 29 . 6 39 . 4 33 . 8 40 . 9 30 . 1 32 . 5 30 . 8 ref]WP _ 014550095 . 1 || gi| 6405574471 6 . 6 31 . 4 34 . 8 40 . 7 31 . 2 34 . 1 45 . 1 28 . 8 35 . 2 30 . 8 refWP _ 024988992 . 1 || 34. 8 40. 7 31. 2 48 gi| 10989441131 7 . 1 32 . 3 33. 5 40 . 3 29 . 6 39 . 2 33 . 8 40 . 9 30 . 1 32 . 5 30 . 6 refl WP _ 071304624 . 1 | | gi| 4891248481 7 . 1 32 . 3 33 . 9 40 . 9 29 . 9 39. 2 33 . 9 40 . 9 29. 9 32 . 2 30 . 6 refWP 003034647 . 1 | gi| 7389677761 29 . 4 33 . 1 35 . 5 28 . 9 40 . 3 30 . 7 35 . 9 28 . 7 31 . 3 30 . 5 ref]WP _ 036851563 . 1 || MAD7 31 . 2 35 . 6 30 . 8 33 . 9 34 . 3 100 32 . 8 24 . 2 28 . 9 30 . 5 Cpf1. Lb61 29 . 8 33. 7 36 . 6 30 . 9 43 34 39 . 8 29 . 1 32 . 1 30 . 4 WP _ 044910713 gi| 10529619771 5 . 5 30 . 5 35 . 8 32 . 3 34 35 53 . 8 33 . 4 26 . 2 27 . 4 30 . 4 embl SCH47915 . 11 gi| 8179183531 29 . 1 34. 4 39 . 8 31. 7 40 32. 4 41. 1 28 . 4 30 . 1 30 . 3 gb |AKG14689 .11 US 10 , 435 , 714 B2 49 50 TABLE 1 - continued Protein identifier or accession number MADI MAD2 MAD MAD MAD MAD MAD MADS MAD10 MAD11 MAD12 gi| 9170594161 6 .9 29 .9 31. 5 35 .7 31. 6 41. 8 32. 9 39. 1 30 .1 34 30. 2 ref] WP _ 051666128 . 1 || gi/ 10116492011 66 . 8 29 3434 . 77 4040 . 3 31. 4 40. 1 33. 1 41. 6 2828 . 5 30 . 4 30. 1 refl WP _ 062499108 . 1 | Cpf1 . Pml 29 . 2 27 . 4 38 .7 29 .4 35 27 . 2 30 . 1 30 WP _ 018359861 6 .3 29. 2 32. 3 34. 2 gi| 8179225371 6 . 8 29 . 1 34. 5 39 . 6 31 . 5 39 . 9 32 . 7 40 . 7 28. 3 29 . 8 30 gb | AKG18099 . 11 gi17691423221 6 . 7 31 34 . 6 37 . 8 31 . 5 41 . 4 33 . 3 39 . 2 31 . 9 29 . 9 refWP _ 044919442 . 1 || 34. 6 378 31. 5 41. 4 333. 39. 2 28 31, 9 29, 9 gi| 10231764411 6 . 7 29 . 7 31. 3 35 . 5 31. 3 41 32 .6 38 . 5 33 . 3 29. 8 pdb5ID6A 29. 7 gi| 4915409871 28 . 3 30 . 4 29. 7 28. 5 29 30. 7 29 . 8 25 . 8 27 . 8 29. 8 ref]WP _ 005398606 . 1 | gi16528206121 6 . 4 31 . 1 34 35. 3 31 . 7 40 . 3 33 . 4 37 . 5 28. 5 33 . 3 29. 8 refWP_ 027109509 . 1 | 37. 5 gi15022404461 5 . 9 31 . 6 36 . 1 31.2 33 35 . 4 49 . 4 34 26 . 6 29 . 4 29. 7 ref ]WP _ 012739647 . 1 || 34 gi15242780461 31 . 6 36 31 33 35 . 4 50 34 26 . 6 29 . 5 29 . 7 emb |CDA41776 . 1 | | 5. 8 31. 6 33 35. 4 50 34 gi| 7378315801 66 . 22 3131. 33 34 . 88 3838 . 11 31 . 5 4242. . 1 33 39 .6 28. 4 32 . 4 29 . 7 ref]WP _ 035798880 . 1 || gi19096525721 6 . 9 30 . 7 34 . 2 37 . 2 30 .8 41. 5 34 . 2 38 . 7 32 29 . 7 ref]WP _ 049895985 . 1 || 34. 2 38. 7 MAD4 6 . 7 30 . 7 32 . 9 100 30 . 7 38 . 9 30 . 8 40 . 4 28 . 8 29 . 4 29 . 7 gi19420730491 5 . 9 31. 6 36 . 1 31 . 1 32 . 7 35 49 . 7 33 . 9 27 . 1 29 . 5 29 . 6 ref |WP _ 055286279 . 1 || gi16547945051 7 . 4 30 . 5 35 . 9 37 . 4 31 . 3 42 . 8 34 . 2 40 . 2 27 . 9 33 . 5 29 . 5 refWP _ 028248456 .1 | gi| 9330147861 5 . 6 31. 3 34. 9 31. 2 31 . 5 32 . 4 46 . 7 30 . 6 25 . 4 27 . 7 29 . 4 emb |CU047728 . 1 || gi19418874501 5. 6 31. 4 35 31. 3 31 . 6 6 32 . 5 46 . 6 30 . 7 25 . 3 27 . 8 29 . 4 ref |WP _ 055224182 . 1 | gi| 9200716741 6 . 3 31 31 . 8 38 . 8 31 . 8 41. 3 33 . 8 42 . 6 29. 8 34 . 7 refWP _ 052943011. 1 | MAD5 5 . 1 30 . 2 35 . 9 30 .7 100 33. 1 33 . 9 32 24 . 3 28 . 7 29 gi| 10814626741 6 . 9 30 . 4 33 . 5 34 . 7 29 . 7 40 . 1 30 . 5 37 . 4 27 . 3 32 . 5 28 .9 embl SCZ76797. 1 gi| 9187225231 7 . 4 27. 5 30 . 5 35 . 7 28. 3 3335 . 2 28. 5 36 26 27. 1 28 .8 ref] WP _ 052585281. 1 || gi |524816323 | 6 . 2 3030 34 . 1 29. 3 31. 2 32 .7 47 .6 32 . 2 25. 5 25. 9 28 . 4 emb |CDF09621 . 1 || 6. 2 gi19417823281 6. 2 30 .. 2 33 . 1 28 . 9 30 . 9 32 46 . 9 32 . 1 26 27 . 1 28 . 4 refWP _ 055176369 . 1 || gi| 9421132961 6 . 4 29 . 8 33 . 8 29 . 7 31. 3 33 . 1 48 32 . 5 25 . 8 26 . 2 28 . 4 ref] WP _ 055306762. 1 || MAD11 6 .4 27 . 7 27 . 6 29 . 4 28 . 7 33 . 6 28 . 9 32 . 1 100 27. 8 gi16531585481 5 . 9 26 . 4 28 . 1 33 . 5 27 . 4 32 . 5 27 . 8 32 27 26 . 8 27 . 6 ref] WP _ 027407524 . 1 || gi16529630041 6 . 6 30 . 3 32 .5 33 . 2 30 . 4 38 . 2 29 . 6 34 . 6 25 . 9 30 . 5 27. 2 refWP _ 027216152 .11 gi| 10830696501 6 . 2 24 . 3 26 . 6 23 . 1 28. 1 23 . 2 26 . 4 45 24 . 9 27 . 1 gbIOGD68774. 11 gi/ 3024832751 5 .6 24 . 7 26. 8 30 . 3 24 . 9 34 . 8 26 30 . 4 24 . 4 27 . 5 27 . 1 gb |EFL46285 .11 34. 8 26 30. 4 24. 4 gi19154008551 5 . 6 24 . 7 26 . 8 30 . 3 24 . 9 34 . 8 26 30 . 4 24 . 4 27. 5 27 . 1 refWP_ 050786240 . 1 || MAD10 5 . 6 25 . 8 28 . 8 24 . 3 30 . 5 24 . 2 30 . 1 100 26 . 2 26 . 6 gi| 11011179671 6 . 1 26828 . 8 26 27 . 3 24 . 3 28 . 1 24 . 4 28 . 2 44 . 1 25 . 4 26 . 1 gb |O1075780 . 1 || 1256. , gi| 10882044581 6 . 5 25 . 2 23 . 5 25 . 8 22 . 9 27 22 26 . 1 36 . 5 24 . 2 24 . 7 gb |OHA63117 . 1 | | gi| 8091980711 4 . 9 25 . 6 26 . 5 22 . 2 23 . 9 23 . 8 25 . 8 23 . 9 20 .3 25 . 1 24 ref] WP _ 046328599 . 1 || gi| 10880799291 5 . 6 21 . 9 23 . 8 26 . 9 23 . 4 27 . 8 23 . 3 26 . 7 28 .8 24 . 7 23. 5 gbIOGZ45678 . 11 21. 9 23. 8 26. 9 23. 4 27. 8 gi| 11010534991 5 . 9 23 .1 26 .2 25 .2 23 26 .4 25 . 11 26 . 5 2929 . 22 23 .2 23 . 4 gb |O1015737 . 11 gi| 11010580581 5 . 4 21. 2 22 . 8 23 . 6 20 . 6 25 20 . 7 25 25 . 9 22 . 2 gb |OI019978 . 11 gi| 10880008481 5 . 7 25 . 2 23 . 9 27 25 . 1 25 . 6 31 . 6 23 . 6 22. 9 gb |OGY73485 . 11 5. 7 23. 5 25. 2 25. 5 239. 27 25. 1 25 .6 31. 6 US 10 , 435 , 714 B2 51 TABLE 1 - continued Protein identifier or accession number MADI MAD2 MAD MAD MAD MAD MAD MADS MAD10 MAD11 MAD12 gi| 4070144331 gb | EKE28449 . 11 5. 2 23. 5 25. 9 26 .7 24 .3 25. 8 23 27. 8 29. 9 25. 3 22. 9 gi| 8182498551 6 21 20. 7 23 .5 20 24 .2 21 24 24. 6 21. 8 22. 6 gb |KKP36646 . 11 gi 8187036471 5. 8 2323 . 3 25 25 .1 23. 5 26 .5 24. 7 25 .3 31. 2 23 .3 22. 6 gb |KKT48220 . 11 gi 8187057861 5 . 8 24 . 6 24 . 7 22 . 9 26 . 2 24 . 8 30 . 8 22 .9 22 . 2 gb |KKT50231 . 11 5. 8 23 .1 24. 6 24. 7 22. 9 26. 2 24. 2 24 .8 30 .8 22. 9 gi/ 10839506321 4 . 5 20 22 . 1 23 . 5 20 . 6 24 .6 20 24 23 . 5 20 . 7 22 . 1 gb | OGJ66851. 11 gi| 10839321991 20 . 4 22 . 6 19 . 3 23 . 3 23 . 2 23 . 9 21 21. 8 gb |OG | 49885 . 11 20. 4 20. 2 22. 6 19. 3 23. 3 20. 6 23. 2 23. 9 21 21. 8 gi| 10834107351 21 . 7 23 . 3 25 . 5 23 22. 7 25 . 9 27 . 2 22 . 4 21. 5 gb |OGF20863 . 11 gi| 10114809271 4 . 7 20 . 1 20 . 1 21. 4 19 . 3 23 . 3 21 . 4 22 20 . 2 19 . 7 20 . 9 ref7 WP _ 062376669 . 1 || gi8185395931 5 . 1 19 . 8 21 .6 22 . 1 20 . 5 22. 9 21 . 2 22 . 8 24 20 . 5 19 . 9 gb |KKR91555 . 11 gi15030480151 5 . 1 18. 8 20 . 7 15 . 3 19 . 7 18 . 9 19 . 3 17 . 7 15. 9 19 19 .2 ref] WP _ 013282991. 1 || gi| 10962327461 19 . 1 20 . 5 17 . 4 20 . 1 19 . 7 20 . 4 20 . 4 17 . 5 18 . 5 18 . 9 refl 5 19. 1 20. 1 19. 7 20. 4 20. 4 17. 5 18. 5 WP _ 071177645 . 1 || gi17691304041 4 . 6 19 . 4 16 . 1 18 . 1 17 . 1 18 . 7 17 . 9 14 . 5 16 . 8 17 . 5 refWP _ 044910712 . 1 || 4. 6 1919.. 4 18. 2 16. 1 18. 1 17. 1 18. 7 17. 9 14. 5 16. 8 17. 5 gi/ 10855695001 2 .6 11. 6 12. 1 12 . 7 10 . 2 12 . 1 12 . 7 11 . 6 10 . 9 11 . 1 10 . 5 gb |OGX23684 . 11 gi8183570621 3 . 3 10 11 . 1 10 . 6 11. 1 11. 8 12 . 1 11. 5 12 . 2 10 . 8 gb |KKQ38176 . 11 10 11. 1 10. 6 11. 1 11. 8 12. 1 11. 5 12. 2 10. 8 9. 8 gi| 745626763 3 . 7 9 .4 11 .7 11 . 1 11. 1 12. 5 11. 9 11 . 9 10 . 2 10 .6 8 . 8 gb | KIE18642 . 11 MAD1 100 6 . 1 5 .8 6 . 7 5 . 1 7 .5 59 7 .6 5 .6 6 . 4 6 .4 SpCas9 4 6 . 3 6 . 5 8 .3 5 . 6 8 . 1 6 . 9 7 .7 6 . 9 6 .3 6 . 3 MAD12 6 . 4 32. 6 33 . 1 297 . 2 9 31 30 . 5 31. 7 26 . 6 27 . 8 100

Example 2 . Expression of MAD Nucleases Example 3 . Testing Guide Nucleic Acid Sequences Compatible with MAD Nucleases Wild -type nucleic acid sequences for MAD1- MAD20 40 include SEQ ID NOs 21 - 40 , respectively . These MAD In order to have a functioning targetable nuclease com nucleases were codon optimized for expression in E . coli plex , a nucleic acid - guided nuclease and a compatible guide and the codon optimized sequences are listed as SEQ ID nucleic acid is needed . To determine the compatible guide NO : 41 -60 , respectively ( summarized in Table 2 ) . nucleic acid sequence , specifically the scaffold sequence Codon optimized MAD1 -MAD20 were cloned into an 4545 portion of the guide nucleic acid ,multiple approaches were expression construct comprising a constitutive or inducible promoter ( eg ., pro? promoter SEQ ID NO : 83 , or pBAD taken . First, scaffold sequences were looked for near the promoter SEQ ID NO : 81 or SEQ ID NO : 82 ) and an endogenous loci of each MAD nuclease . In some cases, such optional 6X -His tag (SEQ ID NO : 182 ) (eg . , FIG . 2 ) . The as with MAD2, no endogenous scaffold sequence was generated MAD1- MAD20 expression constructs are pro 50 found . Therefore , we tested the compatibility ofMAD2 with vided as SEQ ID NOs : 61 - 80 , respectively . The expression scaffold sequences found near the endogenous loci of the constructs as depicted in FIG . 2 were generated either by other MAD nucleases . A list of the MAD nucleases and restriction / ligation -based cloning or homology -based clon corresponding endogenous scaffold sequences that were ing . tested is listed in Table 2 . TABLE 2 Endogenous Codon optimized scaffold sequence MAD WT nucleic acid nucleic acid Amino acid for guide nucleic nuclease sequence sequence sequence acid MAD1 SEQ ID NO : 21 SEQ ID NO : 41 SEQ ID NO : 1 SEQ ID NO : 84 MAD2 SEQ ID NO : 22 SEQ ID NO : 42 SEQ ID NO : 2 None identified MAD3 SEQ ID NO : 23 SEQ ID NO : 43 SEQ ID NO : 3 SEQ ID NO : 86 MAD4 SEQ ID NO : 24 SEQ ID NO : 44 SEQ ID NO : 4 SEQ ID NO : 87 MAD5 SEQ ID NO : 25SEQ ID NO : 45 SEQ ID NO : 5 SEQ ID NO : 88 MAD6 SEQ ID NO : 26 SEQ ID NO : 46 SEQ ID NO : 6 SEQ ID NO : 89 US 10 , 435 , 714 B2 53 54 TABLE 2 -continued Endogenous Codon optimized scaffold sequence MAD WT nucleic acid nucleic acid Amino acid for guide nucleic nuclease sequence sequence sequence acid MAD7 SEQ ID NO : 27 SEQ ID NO : 47 SEQ ID NO : 7 SEQ ID NO : 90 MADS SEQ ID NO : 28 SEQ ID NO : 48 SEQ ID NO : 8 SEQ ID NO : 91 MAD9 SEQ ID NO : 29 SEQ ID NO : 49 SEQ ID NO : 9 SEQ ID NO : 92 ; SEQ ID NO : 103; SEQ ID NO : 106 MAD10 SEQ ID NO : 30 SEQ ID NO : 50 SEQ ID NO : 10 SEQ ID NO : 93 MAD11 SEQ ID NO : 31 SEQ ID NO : 51 SEQ ID NO : 11 SEQ ID NO : 94 MAD12 SEQ ID NO : 32 SEQ ID NO : 52 SEQ ID NO : 12 SEQ ID NO : 95 MAD13 SEQ ID NO : 33 SEQ ID NO : 53 SEQ ID NO : 13 SEQ ID NO : 96 ; SEQ ID NO : 105 ; SEQ ID NO : 107 MAD14 SEQ ID NO : 34 SEQ ID NO : 54 SEQ ID NO : 14 SEQ ID NO : 97 MAD15 SEQ ID NO : 35 SEQ ID NO : 55 SEQ ID NO : 15 SEQ ID NO : 98 MAD16 SEQ ID NO : 36 SEQ ID NO : 56 SEQ ID NO : 16 SEQ ID NO : 99 MAD17 SEQ ID NO : 37 SEQ ID NO : 57 SEQ ID NO : 17 SEQ ID NO : 100 MAD18 SEQ ID NO : 38 SEQ ID NO : 58 SEO ID NO : 18 SEO ID NO : 101 MAD19 SEQ ID NO : 39 SEQ ID NO : 59 SEQ ID NO : 19 SEQ ID NO : 102 MAD20 SEQ ID NO : 40 SEQ ID NO : 60 SEQ ID NO : 20 SEQ ID NO : 103

Editing cassettes as depicted in FIG . 3 were generated to the PAM mutation survive cleavage by the MAD nuclease, assess the functionality of the MAD nucleases and corre - 25 while wild -type non -edited cells die (FIG . 5C ) . The editing sponding guide nucleic acids . Each editing cassette com sequence further comprises a mutation in the galK gene that prises an editing sequence and a promoter operably linked to allows for screening of edited cells , while the MAD nuclease an encoded guide nucleic acid . The editing cassettes further expression vector and editing cassette contain drug selection comprises primer sites (P1 and P2 ) on flanking ends. The markers , allowing for selection of edited cells . guide nucleic acids comprised various scaffold sequences to 30 Using these methods, compatible guide nucleic acids for be tested , as well as a guide sequence to guide the MAD MAD1-MAD20 were tested . Twenty scaffold sequences nuclease to the target sequence for editing . The editing were tested . The guide nucleic acids used in the experiments sequences comprised a PAM mutation and /or codon muta - contained one of the twenty scaffold sequences , referred to tion relative to the target sequence . The mutations were as scaffold - 1 , scaffold -2 , etc ., and a guide sequence that flanked by regions of homology (homology arms or HA ) 35 targets the galK gene . Sequences for Scaffold - 1 through which would allow recombination into the cleaved target Scaffold - 20 are listed as SEQ ID NO : 84 - 103 , respectively . sequence . It should be understood that the guide sequence of the guide FIG . 4 depicts an experimental designed to test different nucleic acid is variable and can be engineered or designed to MAD nuclease and guide nucleic acid combinations. An target any desired target sequence . Since MAD2 does not expression cassette encoding the MAD nuclease or the 40 have an endogenous scaffold sequence to test, a scaffold MAD nuclease protein were added to host cells along with various editing cassettes as described above . In this sequence from a close homology (scaffold -2 , SEQ ID NO : example , the guide nucleic acids were engineered to target 85 ) was tested and found to be a non - functional pair , the galK gene in the host cell , and the editing sequence was meaning MAD2 and scaffold - 2 were not compatible . There designed to mutate the targeted galK gene in order to turn the 45 fore, MAD2 was tested with the other nineteen scaffold gene off , thereby allowing for screening of successfully sequences , despite the low sequence homology between edited cells. This design was used for identification of MAD2 and the other MAD nucleases . functional or compatible MAD nuclease and guide nucleic This workflow could also be used to identify or test PAM acid combinations . Editing efficiency was determined by sequences compatible with a given MAD nuclease . Another qPCR to measure the editing plasmid in the recovered cells 50 method for identifying a PAM site is described in the next in a high - throughput manner. Validation of MAD11 and example . Cas9 primers is shown in FIGS. 14A and 14B . These results In general, for the assays described , transformations were show that the selected primer pairs are orthogonal and allow carried out as follows. E . coli strains expressing the codon quantitative measurement of input plasmid DNA optimized MAD nucleases were grown overnight. Saturated FIGS . 5A -5B is a depiction of a similar experimental 55 cultures were diluted 1/ 100 and grown to an OD600 of 0 . 6 and design . In this case , the editing cassette (FIG . 5B ) further induced by adding arabinose at a filing concentration of comprises a selectable marker, in this case kanamycin 0 .4 % and ( if a temperature sensitive plasmid is used ) shift resistance (kan ) and the MAD nuclease expression vector ing the culture to 42 degrees Celsius in a shaking water bath . ( FIG . 5A ) further comprises a selectable marker, in this case Following induction , cells were chilled on ice for 15 min chloramphenicol resistance (Cm ), and the lambda RED 60 prior to washing thrice with 1 / 4 the initial culture volume recombination system to aid homologous recombination with 10 % glycerol ( for example , 50 mL washed for a 200 (HR ) of the editing sequence into the target sequence . A mL culture ) . Cells were resuspended in 1/ 100 the initial compatible MAD nuclease and guide nucleic acid combi - volume ( for example , 2 mL for a 200 mL culture ) and stores nation will cause a double strand break in the target at - 90 degrees Celsius until ready to use . To perform the sequence if a PAM sequence is present. Since the editing 65 compatibility and editing efficiency screens described here , sequence (eg . FIG . 3 ) contains a PAM mutation that is not 50 ng of editing cassette was transformed into cell aliquots recognized by the MAD nuclease , edited cells that contain by electroporation . Following electroporation , the cells were US 10 , 435 , 714 B2 55 56 recovered in LB for 3 hours and 100 uL of cells were plated for MAD13 -MAD15 was determined to be TTN . The con on Macconkey plates containing 1 % galactose . sensus PAM forMAD16 -MAD18 was determined to be TA . Editing efficiencies were determined by dividing the The consensus PAM for MAD19 -MAD20 was determined number of white colonies ( edited cells ) by the total number to be TTCN . of white and red colonies ( edited and non - edited cells ) . Example 5 . Testing Heterologous Guide Nucleic Example 4 . PAM Selection Assay Acids In order to generate a double strand break in a target Editing efficiencies were tested for MADI , MAD2 , sequence , a guide nucleic acid must hybridize to a target 10 MAD4. and MAD7 and are depicted in FIG . 7A and FIG . sequence , and the MAD nuclease must recognize a PAM sequence adjacent to the target sequence . If the guide nucleic 7B . Experiment details and editing efficiencies are summa acid hybridizes to the target sequence, but the MAD nucle rized in Table 3 . Editing efficiency was determined by ase does not recognize a PAM site , then cleavage does not dividing the number of edited cells by the total number of occur. 15 recovered cells . Various editing cassettes targeting the galK A PAM is MAD nuclease -specific and not all MAD gene were used to allow screening of editing cells . The guide nucleases necessarily recognize the same PAM . In order to nucleic acids encoded on the editing cassette contained a assess the PAM site requirements for the MAD nucleases , an guide sequence targeting the galK gene and one of various assay as depicted in FIGS. 6A -6C was performed . scaffold sequences in order to test the compatibility of the FIG . 6A depicts a MAD nuclease expression vector as 20 indicated MAD nuclease with the indicated scaffold described elsewhere , which also contains a chloramphenicol sequence , as summarized in Table 3 . resistance gene and the lambda RED recombination system . Editing efficiencies for compatible MAD nuclease and FIG . 6B depicts a self - targeting editing cassette . The guide nucleic acids ( comprising the indicated scaffold guided nucleic acid is designed to target the target sequence sequences ) were observed to have between 75 - 100 % editing which is contained on the same nucleic acid molecule . The 25 efficiency . MAD2 had between a 75 - 100 % editing efficiency target sequence is flanked by random nucleotides , depicted and MAD7 had between a 97 - 100 % editing efficiency. by N4, meaning four random nucleotides on either end of the MAD2 combined with scaffold - 1 , scaffold - 2 , scaffold - 4 , target sequence . It should be understood that any number of or scaffold - 13 in these experiments results in 0 % editing random nucleotides could also be used ( for example , 3 , 5 , 6 , efficiency. These data imply that MAD2 did not form a 7 , 8 , etc ) . The random nucleotides serve as a library of 30 functional complex with these tested guide nucleic acids and potential PAMs. thatMAD2 is not compatible with these scaffold sequences. FIG . 6C depicts the experimental design . Basically , the MAD7 combined with scaffold - 1 , scaffold - 2 , scaffold -4 , MAD nuclease expression vector and editing cassette com - or scaffold - 13 in these experiments results in 0 % editing prising the random PAM sites were transformed into a host efficiency. These data imply that MAD7 did not form a cell . If a functional targetable nuclease complex was formed 35 functional complex with these tested guide nucleic acids and and the MAD nuclease recognized a PAM site , then the that MAD7 is not compatible with these scaffold sequences . editing cassette vector was cleaved and which leads to cell For MAD1 and MAD4 , all tested guide nucleic acid death . If a functional targetable complex was not formed or combinations resulted in 0 % editing efficiency , implying if the MAD nuclease did not recognize the PAM , then the that MAD1 and MAD4 did not form a functional complex target sequence was not cleaved and the cell survived . Next 40 with any of the tested guide nucleic acids . These data also generation sequence (NGS ) was then used to sequence the imply that MAD1 and MAD4 are not compatible with the starting and final cell populations in order to determine what tested scaffold sequences . PAM sites were recognized by a given MAD nuclease . Combined , these data highlight the unpredictability of These recognized PAM sites were then used to determine a finding a compatible MAD nuclease and scaffold sequence consensus or non -consensus PAM for a given MAD nucle - 45 pair in order to form a functional targetable nuclease com ase . plex . Some tested MAD nucleases did not function with any The consensus PAM for MAD1- MAD8 , and MAD10 tested scaffold sequence . Some tested MAD nucleases only MAD12 was determined to be TTTN . The consensus PAM functioned with some tested scaffold sequences and not with for MAD9 was determined to be NNG . The consensus PAM others . TABLE 3 Editing Nucleic acid Guide nucleic acid scaffold sequence Target Editing # guided nuclease sequence mutation gene efficiency 1 MAD1 Scaffold - 1 ; SEQ ID NO : 84 L80 * * galK 0 % 2 MAD1 Scaffold - 2 ; SEQ ID NO : 85 Y145 * * galK 0 % 3 MAD1 Scaffold - 4 ; SEQ ID NO : 87 Y145 * * galK 0 % 4 MAD1 Scaffold - 10 ; SEQ ID NO : 93 Y145 * * galk 0 % 5 MAD1 Scaffold - 11 ; SEO ID NO : 94 L80 * * galK 0 % 6 MAD1 Scaffold - 12 ; SEQ ID NO : 95 L10KpnI galK 0 % 7 MAD1 Scaffold - 13 ; SEQ ID NO : 96 Y145 * * galK 0 % 8 MAD1 Scaffold - 12 ; SEQ ID NO : 95 L10KpnI galK 0 % 9 MAD2 Scaffold - 10 ; SEO ID NO : 93 L80 * * galK 0 % 10 MAD2 Scaffold - 10 ; SEQ ID NO : 93 Y145 * * galK 100 % 11 MAD2 Scaffold - 11 ; SEQ ID NO : 94 L80 * * galK 98 % 12 MAD2 Scaffold - 11 ; SEQ ID NO : 94 Y145 * * galK 99 % 13 MAD2 Scaffold - 12 ; SEQ ID NO : 95 Y145 * * galK 98 % 14 MAD2 Scaffold - 12 ; SEQ ID NO : 95 Y145 * * galK 0 % US 10 , 435 , 714 B2 57 58 TABLE 3 -continued Editing Nucleic acid Guide nucleic acid scaffold sequence Target Editing # guided nuclease sequence mutation gene efficiency 15 MAD2 Scaffold - 13 ; SEQ ID NO : 96 Y145 * * galK 0 % 16 MAD2 Scaffold - 1 ; SEQ ID NO : 84 L80 * * galK 0 % 17 MAD2 Scaffold - 2 ; SEQ ID NO : 85 Y145 * * galK 0 % 18 MAD2 Scaffold - 2 ; SEQ ID NO : 85 Y145 * * galK 0 % 19 MAD2 Scaffold - 4 ; SEQ ID NO : 87 Y145 * * galK 0 % 20 MAD2 Scaffold - 5 ; SEO ID NO : 88 L80 * * galK 99 % 21 MAD2 Scaffold - 12 ; SEQ ID NO : 95 89 * * galK 0 % 22 MAD2 Scaffold - 12 ; SEQ ID NO : 95 70 * * galK 75 % 3 MAD2 Scaffold - 12 ; SEQ ID NO : 95 L10KpnI galK 79 % 4 MAD4 Scaffold - 1 ; SEQ ID NO : 84 L80 * * galK 0 % 5 MAD4 Scaffold - 2 ; SEQ ID NO : 85 Y145 * * galK 0 % 26 MAD4 Scaffold - 4 ; SEQ ID NO : 87 Y145 * * galK 0 % 27 MAD4 Scaffold - 10 ; SEQ ID NO : 93 Y145 * * galK 0 % 28 MAD4 Scaffold - 11 ; SEQ ID NO : 94 L80 * * galK 0 % 29 MAD4 Scaffold - 12 ; SEQ ID NO : 95 L10KpnI galK 0 % 30 MAD4 Scaffold - 13 ; SEQ ID NO : 96 Y145 * * galK 0 % 31 MAD4 Scaffold - 12 ; SEQ ID NO : 95 L10KpnI galK 0 % 32 MAD7 Scaffold - 1 ; SEQ ID NO : 84 L80 * * galK 0 % 33 MAD7 Scaffold - 2 ; SEQ ID NO : 85 Y145 * * galK 0 % 34 MAD7 Scaffold - 4 ; SEQ ID NO : 87 Y145 * * galK 0 % 35 MAD7 Scaffold - 10 ; SEQ ID NO : 93 Y145 * * galK 100 % 36 MAD7 Scaffold - 11 ; SEQ ID NO : 94 L80 * * galK 97 % 7 MAD7 Scaffold - 12 ; SEQ ID NO : 95 L10KpnI galK 0 % 38 MAD7 Scaffold - 13 ; SEQ ID NO : 96 Y145 * * galK 0 % 39 MAD7 Scaffold - 12 ; SEQ ID NO : 95 L10KpnI galK 0 %

Example 6 . Assessment of MAD2 and MAD7 TABLE 5 30 Nucleic The ability ofMAD2 and MAD7 to function with heter acid Editing ologous guide nucleic acids were tested using a similar guided Guide nucleic acid scaffold sequence Target experimental design as described above . # nuclease sequence mutation gene The compatibility ofMAD2 with other scaffold sequences 1 MAD7 Scaffold - 1 ; SEQ ID NO : 84 L80 * * galK MAD7 Scaffold -2 ; SEQ ID NO : 85 Y145 * * galK was tested and the results of an experiment are depicted in 35 MAD7 Scaffold - 4 ; SEQ ID NO : 87 Y145 * * galK FIG . 8 . The MAD nucleases , guide nucleic acid scaffold MAD7 Scaffold - 10 ; SEQ ID NO : 93 Y145 * * galK sequences , and editing sequences used in this experiment are uAWNE5 MADT Scaffold - 11 ; SEQ ID NO : 95 L80 * * galk summarized in Table 4 . The compatibility of MAD7 with other scaffold sequences 40 . In another experiment, transformation efficiencies (FIG . was tested and the results of an experiment are depicted in 10B ) were determined by calculating the total number of FIG . 9 . The MAD nucleases , guide nucleic acid scaffold recovered cells compared to the start number of cells . An sequences , and editing sequences used in this experiment are example plate image is depicted in FIG . 10C . Editing summarized in Table 5 . efficiencies (FIG . 10A ) were determined by calculating the 45 ratio of editing colonies (white colonies , edited galK gene ) TABLE 4 versus total colonies . In this example (FIG . 10A - 10C ) , cells expressing galK Nucleic were transformed with expression constructs expressing acid Editing either MAD2 or MAD7 and a corresponding editing cassette guided Guide nucleic acid scaffold sequence Target 50 comprising a guide nucleic acid targeting the galK gene . The # nuclease sequence mutation gene guide nucleic acid was comprised of a guide sequence 1 MAD2 Scaffold -12 ; SEQ ID NO : 95 N89Kpnl galK targeting the galK gene and the scaffold -12 sequence (SEQ 2 MAD2 Scaffold - 10 ; SEQ ID NO : 93 180 * * galK ID NO : 95 ) . 3 MAD2 Scaffold - 5 ; SEQ ID NO : 88 L80 * * galK In the depicted example , MAD2 and MAD7 has a lower mTin4 MAD2 Scaffold - 12 ; SEQ ID NO : 95 D70KpnI galK 55 transformation efficiency compared to S . pyogenes Cas9 , 5 MAD2 Scaffold - 12 ; SEQ ID NO : 95 Y145 * * galK though the editing efficiency of MAD2 and MAD7 was 6 MAD2 Scaffold -11 ; SEQ ID NO : 94 Y145 * * galK slightly higher than S . pyogenes Cas9. 7 MAD2 Scaffold - 10 ; SEQ ID NO : 93 Y145* * galK FIG . 11 depicts the sequencing results from select colo 8 MAD2 Scaffold - 12 ; SEQ ID NO : 95 L10KpnI galK nies recovered from the assay described above . The target 9 MAD2 Scaffold - 11; SEQ ID NO : 94 L80 * * galK 60 sequence was in the galK coding sequence (CDS ) . The 10 SpCas9 S . pyogenese gRNA Y145 * * galK TTTN PAM is shown as the reverse complement (wild - type 11 MAD2 Scaffold - 2 ; SEQ ID NO : 85 Y145 * * galK NAAA , mutated NGAA ). The mutations targeted by the 12 MAD2 Scaffold - 4 ; SEQ ID NO : 87 Y145 * * galK editing sequence are labeled as target codons. Changes 13 MAD2 Scaffold - 1 ; SEQ ID NO : 84 L80 * * galK compared to the wild -type sequence are highlighted . In these 14 MAD2 Scaffold - 13 ; SEQ ID NO : 96 Y145 * * galK 65 experiments , the scaffold - 12 sequence (SEQ ID NO : 95 ) was used . The guide sequence of the guide nucleic acid targeted the galK gene . US 10 ,435 , 714 B2 59 60 Six of the seven depicted sequences from the MAD2 shown in FIG . 16 , the editing efficiencies as determined by experiment contained the designed PAM mutation and the two separate assays are consistent. designed mutations in the target codons of galK , which one sequences colony maintained the wild - type PAM and wild type target codons while also containing an unintended 5 Example 9 . Trackable Editing mutation upstream of the target site . Two of the four depicted sequences from the MAD7 Genetic edits can be tracked by the use of a barcode . A experiment contained the designed PAM mutation and barcode can be incorporated into or near the edit site as mutated target codons . One colony comprises a wildtype described in the present specification . When multiple rounds sequence , while another contained a deletion of eight 10 of engineering are being performed , with a different edit nucleotides upstream of the target sequence. being made in each round , it may be beneficial to insert a FIG . 12 depicts results from another experiment testing barcode in a common region during each round of engineer the ability to recover edited cells. In Experiment 0 , the ing , this way one could sequence a single site and get the MAD2 nuclease was used with a guide nucleic acid com sequences of all of the barcodes from each round without the prising scaffold - 11 sequence and a guide sequence targeting 15 need to sequence each edited site individually . FIGS . 17A galK . The editing cassette comprised an editing sequence designed to incorporate an L80 * * mutation into galK , 17C , 18 , and 19 depict examples of such trackable engi thereby allowing screening of the edited cells . In experiment neering workflows . 1 , the MAD2 nuclease was used with a guide nucleic acid As depicted in FIG . 17A , a cell expressing a MAD comprising scaffold - 12 sequence and a guide sequence 20 nuclease is transformed with a plasmid containing an editing targeting galK . The editing cassette comprised an editing cassette and a recording cassette . The editing cassette con sequence designed to incorporate an L10KpnI mutation into tains a PAM mutation and a gene edit . The recorder cassette galK . In both experiments , a negative control plasmid a comprises a barcode, in this case 15N . Both the editing guide nucleic acid that is not compatible with MAD2 was cassette and recording cassette each comprise a guide included in the transformations . Following transformation , 25 nucleic acid to a distinct target sequence. Within a library of the ratio of the compatible editing cassette ( those containing such plasmids , the recorder cassette for each round can scaffold - 11 or scaffold - 12 guide nucleic acids) to the non contain the same guide nucleic acid , such that the first round compatible editing cassette (negative control) was measure . barcode is inserted into the same location across all variants , The experiments were done in the presence or absence of regardless of what editing cassette and corresponding gene selection . The results show that more compatible editing 30 edit is used . The correlation between the barcode and editing cassette containing cells were recovered compared to the cassette is determined beforehand though such that the edit non -compatible editing cassette , and this result is magnified can be identified by sequencing the barcode. FIG . 17B when selection is used . shows an example of a recording cassette designed to delete 25 a PAM site while incorporating a 15N barcode. The deleted Example 7 . Guide Nucleic Acid Characterization 35 PAM is used to enrich for edited cells since mutated PAM The sequences of scaffolds 1 - 8 , and 10 - 12 (SEQ ID NO : cells escape cell death while cells containing a wild - type 84 -91 , and 93 -95 ) were aligned and are depicted in FIG . PAM sequence are killed . Fire 21 C depicts how sequencing 13A . Nucleotides that match the consensus sequence are the barcode region can be used to identify which edit is faded , while those diverging from the consensus sequence 40 comprised within each cell. are visible . The predicted pseudoknot region is indicated . A similar approach is depicted in FIG . 18 . In this case , the Without being bound by theory , the region 5 ' of the pseudo - recorder cassette from each round is designed to target a knot may be influence binding and/ or kinetics of the nucleic sequence adjacent to the previous round , and each time, a acid - guided nuclease . As is shown in FIG . 13A , in general, new PAM site is deleted by the recorder cassette . The result there appears to be less variability in the pseudoknot region 45 is a barcode array with the barcodes from each round that ( e. g ., SEQ ID NO : 172 - 181 ) as compared to the sequence can be sequenced to confirm each round of engineering took outside of the pseudoknot region . place and to determine which combination of mutations are FIG . 13B shows a preliminary model of MAD2 and contained in the cell , and in which order the mutations were MAD12 complexed with a guide nucleic acid in this made . Each successive recorder cassette can be designed to example , a guide RNA ) and target sequence (DNA ) . 50 be homologous on one end to the region comprising the Example 8 . Editing Efficiency of the MAD mutated PAM from the previous round , which could increase the efficiency of getting fully edited cells at the end Nucleases of the experiment. In other examples, the recorder cassette A plate -based editing efficiency assay and a molecular 55 is designed to target a unique landing site that was incor editing efficiency assay were used to test editing efficiency porated by the previous recorder cassette . This increases the of various MAD nuclease and guide nucleic acid combina efficiency of recovering cells containing all of the desired tions . mutations since the subsequent recorder cassette and bar FIG . 15 depicts quantification of the data obtained using code can only target a cell that has successfully completed the molecular editing efficiency assay using MAD2 nuclease 60 the previous round of engineering . with a guide nucleic acid comprising scaffold - 12 and a guide FIG . 19 depicts another approach that allows the recy sequencing targeting galK . The indicated mutations were cling of selectable markers or to otherwise cure the cell of incorporated into the galK using corresponding editing the plasmid form the previous round of engineering . In this cassettes containing the mutation . FIG . 16 shows the com - case , the transformed plasmid containing a guide nucleic parison of the editing efficiencies determined by the plate - 65 acid designed to target a selectable marker or other unique based assay using white and red colonies as described sequence in the plasmid form the previous round of engi previously , and the molecular editing efficiency assay. As neering . US 10 ,435 ,714 B2

TABLE 6 SEQUENCE LISTING SEQ ID NO : Sequence SEO MGKMYYLGLDIGTNSVGYAVTDPSYHLLKFKGEPMWGAHVFAAGNQSAERRSFRTSRRRLDRROQRVKLV ID OEIFAPVISPIDPRFFIRLHESALWRDDVAETDKHIFFNDPTYTDKEYYSDYPTIHHLIVDLMESSEKHDPRLVY NO : 1 LAVAWLVAHRGHFLNEVDKDNIGDVLSFDAFYPEFLAFLSDNGVSPWVCESKALOATLLSRNSVNDKYKAL KSLIFGSOKPEDNFDANISEDGLIOLLAGKKVKVNKLFPOESNDASFTLNDKEDAIEEILGTLTPDECEWIAHIR RLFDWAIMKHALKDGRTISESKVKLYEQHHHDLTOLKYFVKTYLAKEYDDIFRNVDSETTKNYVAYSYHVK EVKGTLPKNKATOEEFCKYVLGKVKNIECSEADKVDFDEMIORLTDNSFMPKOVSGENRVIPYOLYYYELKT ILNKAASYLPFLTOCGKDAISNODKLLSIMTFRIPYFVGPLRKDNSEHAWLERKAGKIYPWNFNDKVDLDKSE EAFIRRMTNTCTYYPGEDVLPLDSLIYEKFMILNEINNIRIDGYPISVDV KOOVFGLFEKKRRVTVKDIONLLLS LGALDKHGKLTGIDTTIHSNYNTYHHFKSLMERGVLTRDDVERIVERMTYSDDTKRVRLWLNNNYGTLTAD DVKHISRLRKHDFGRLSKMFLTGLKGVHKETGERASILDFMWNTNDNLMOLLSECYTFSDEITKLQEAYYA KAQLSLNDFLDSMYISNAVKRPIYRTLAVVNDIRKACGTAPKRIFIEMARDGESKKKRSVTRREQIKNLYRSIR KDFOOEVDFLEKILENKSDGOLOSDALYLYFAOLGRDMYTGDPIKLEHIKDOSFYNIDHIYPOSMVKDDSLD NKVLVOSEINGEKSSRYPLDAAIRNKMKPLWDAYYNHGLISLKKYORLTRSTPFTDDEKWDFINROLVETRO STKALAILLKRKFPDTEIVYSKAGLSSDFRHEFGLVKSRNIND LHHAKDAFLAIVTGNVYHERFNRRWFMVN OPYSVKTKTLFTHSIKNGNFVAWNGEEDLGRIVKMLKONKNTIHFTRFSFDRKEGLFDIOPLKASTGLVPRKA GLDVVKYGGYDKSTAAYYLLVRFTLEDKKTOHKLMMIPVEGLYKARIDHDKEFLTDYAOTTISEILOKDKO KVINIMFPMGTRHIKLNSMISIDGFYLSIGGKSSKGKSVLCHAMVPLIVPHKIECYIKAMESFARKFKENNKLRI VEKFDKITVEDNLNLYELFLQKLQHNPYNKFFSTOFDVL TNGRSTFTKLSPEEQVQTLLNILSIFKTCRSSGCD LKSINGSAQAARIMISADLTGLSKKYSDIRLVEQSASGLFVSKSQNLLEYL * SEQ MSSLTKFTNKYSKOLTIKNELIPVGKTLENIKENGLIDGDEQLNENYQKAKIIVDDFLRDFINKALNNTQIGNW ID RELADALNKEDEDNIEKLODKIRGIIVSKFETFDLFSSYSIKKDEKIIDDDNDVEEEELDLGKKTSSFKYIFKKN NO : 2 LFKLVLPSYLKTTNQDKLKIISSFDNFSTYFRGFFENRKNIFTKKPISTSIAYRIVHDNFPKFLDNIRCFNVWOTE CPQLIVKADNYLKSKNVIAKDKSLANYFTVGAYDYFLSONGIDFYNNIIGGLPAFAGHEKIQGLNEFINQECQ KDSELKSKLKNRHAF KMAVLFKOILSDREKSFVIDEFESDAOVIDAVKNFYAEOCKDNNVIFNLLNLIKNIAF LSDDELDGIFIEGKYLSSVSOKLYSDWSKLRNDIEDSANSKOGNKELAKKIKTNKGDVEKAISKYEFSLSELNS IVHDNTKFSDLLSCTLHKVASEKLVKVNEGDWPKHLKNNEEKOKI KEPLDALLEIYNTLLIFNCKSFNKNGNF YVDYDRCINELSSVVYLYNKTRNYCTKKPYNTDKFKLNFNSPOLGEGFSKSKENDCLTLLFKKDDNYYVGII RKGAKINFDDTQAIADNTDNCIFKMNYFLLKDAKKFIPKCSIQLKEVKAHFKKS EDDYILSDKEKFASPLVIKK STFLLATAHVKGKKGNIKKFQKEYSKENPTEYRNSLNEWIAFCKEFLKTYKAATIFDITTLKKAEEYADIVEF YKDVDNLCYKLEFCPIKTSFIENLIDNGDLYLFRINNKDFSSKSTGTKNLHTLYLOAIFDERNLNNPTIMLNGG AELFYRKESIEOKNRITHKAGSILVNKVCKDGTSLDDKIRNEI YOYENKFIDTLSDEAKKVLPNVIKKEATHDI TKDKRFTSDKFFFHCPLTINYKEGDTKOFNNEVLSFLRGNPDINIIGIDRGERNLIYVTVINOKGEILDSVSFNT VTNKSSKIEOTVDYEEKLAVREKERIEAKRSWDSISKIATLKEGYLSAIVHEICLLMIKHNAIVLENLNAGFK RIRGGLSEKSVYOKFEKMLINKLNYFVSKKESDWNKPSGLLNGLOLSDOFESFEKLGIOSGFIFYVPAAYTSKI DPTTGFANVLNLSKVRNVDAIKSFFSNFNEISYSKKEALFKFSFDLDSLSKKGFSSFVKFSKSKWNVYTFGERII KPKNKOGYREDKRINLTFEMKKLLNEYKVSFDLENNLIPNLTSANLKDTFWKELFFIFKTTLOLRNSVTNGKE DVLISPVKNAKGEFFVSGTHNKTLPODCDANGAYHIALKGLMILERNNLVREEKDTKKIMAISNVDWFEYVQ KRRGVL * SEQ YDEFTKLYPIOKTIRFELKPOGRTMEHLETFNFFEEDRDRAEKYKILKEAIDEYHKKFIDEHLTNMSLDW ID NSLKOISEKYYKSREEKDKKVFLSEQKRMRQEIVSEFKKDDRFKDLFSKKLFSELLKEEIYKKGNHQEIDALK NO : 3 SFDKFSGYFIGLHENRKNMYSDGDEITAISNRIVNENFPKFLDNLQKYQEARKKYPEWIIKAESALVAHNIKM DEVFSLEYFNKVLNOEGIORYNLALGGYVTKSGEKMMGLNDALNLAHOSEKSSKGRIHMTPLFKOILSEKES FSYIPDVFTEDSOLLPSIGGFFAQIENDKDGNIFDRALELISSYAEYDTERIYIROADINRVSNVIFGEWGTLGGL MREYKADSINDINLERTCKKVDKWLDSKEFALSDVLEAIKRTGNNDAFNEYISKMRTAREKIDAARKEMKFI SEKISGDEESIHIIKTLLDSVOOFLHFFNLFKARODIPLDGAFYAEFDEVHSKLFAIVPLYNKVRNYLTKNNLNT KKIKLNFKNPTLANGWDONKVYDYASLIFLRDGNYYLGIINPKRKKNIKFEOGSGNGPFYRKMVYKOIPGPN KNLPRVFLTSTKGKKEYKPSKEIIEGYEADKHIRGDKFDLDFCHKLIDFFKESIEKHKDWSKFNFYFSPTESYG DISEFYLDVEKOGYRMHFENISAETIDEYVEKGDLFLFOIYNKDFVKAATGKKDMHTIYWNAAFSPENLODV VKLNGEAELFYRDKSDIKEIVHREGEILVNRTYNGRTPVPDKIHKKLTDYHNGRTKDLGEAKEYLDKVRYF KAHYDI TKDRRYLNDKIYFHVPLTLNFKANGKKNLNKMVIEKFLSDEKAHIIGIDRGERNLLYYSIIDRSGKII DOOSLNVIDGFDYREKLNOREIEMKDAROSWNAIGKIKDLKEGYLSKAVHEITKMAIOYNAIVVMEELNYGF KRGRFKVEKOIYOKFENMLIDKMNYLVFKDAPDESPGGVLNAYOL TNPLESFAKLGKOTGILFYVPAAYTSK IDPTTGFVNLFNTSSKTNAQERKEFLQKFESISYSAKDGGI FAFAFDYRKFGTSKTDHKNVWTAYTNGERMR YIKEKKRNELFDPSKEIKEALTSSGIKYDGGQNILPDILRSNNNGLIYTMYSSFIAAIQMRVYDGKEDYIISPIKN SKGEFFRTDPKRRELPIDADANGAYNIALRGELTMRAIAEKFDPDSEKMAKLELKHKDWFEFMQTRGD * SEQ MTKTFDSEFFNLYSLOKTVRFELKPVGETASFVEDFKNEGLKRWVSEDERRAVDYOKVKEIIDDYHRDFIEES ID LNYFPEOVSKDALEOAFHLYOKLKAAKVEEREKALKEWEALOKKLREKVVKCFSDSNKARFSRIDKKELIK NO : 4 EDLINWLVAONREDDIPTVETFNNFTTYFTGFHENRKNIYSKDDHATAISFRLIHENLPKFFDNVISFNKL KEG FPELKFDKVKEDLEVDYDLKHAFEIEYFVNFVTQAGIDQYNYLLGGKTLEDGTKKQGMNEQINLFKQQQTR DKARQIPKLIPLFKQILSERTESQSFIPKQFESDQELFDSLQKLHNNCQDKFTVLQQAILGLAEADLKKVFIKTS DLNALSNTIFGNYSVFSDALNLYKESLKTKKAQEAFEKLPAHSIHDLIOYLEOFNSSLDAE KOOSTDTVLNYFI KTDELYSRFIKSTSEAFTOVOPLFELEALSSKRRPPESEDEGAKGOEGFEOIKRIKAYLDTLMEAVHFAKPLYL VKGRKMIEGLDKDOSFYEAFEMAYOELESLIIPIYNKARSYLSRKPFKADKFKINFDNNTLLSGWDANKETAN ASILFKKDGLYYLGIMPKGKTFLFDYFVSSEDSEKLKORROKTAEEALAQDGESYFEKIRYKLLPGASKMLPK VFFSNKNIGFYNPSDDILRIRNTASHTKNGTPQKGHSKVEFNLNDCHKMIDFFKSSIQKHPEWGSFGFTFSDTS DFEDMSAFYREVENOGYVISFDKIKETYIOSOVEOGNLYLFOIYNKDFSPYSKGKPNLHTLYWKALFEEANL NNVVAKLNGEAEIFFRRHSIKASDKVVHPANQAIDNKNPHTEKTOSTFEYDLVKDKRYTQDKFFFHVPISLNF KAQGVSKFNDKVNGFLKGNPDVNIIGIDRGERHLLYFTVVNOKGEILVQESLNTLMSDKGHVNDYQOKLDK KEOERDAARKSWTTVENIKELKEGYLSHVVHKLAHLIIKYNAIVCLEDLNFGFKRGRFKVEKOVYOKFEKAL IDKLNYLVFKEKELGEVGHYL TAYOLTAPFESFKKLGKOSGILFYVPADYTSKIDPTTGF AKOLLSDFNAIRFNSVQNYFEFEIDYKKLTPKRKVGTOSKWVICTYGDVRYQNRRNQKGHWETEEVNVTEK LKALFASDSKTTTVIDYANDDNLIDVILEODKASFFKELLWLLKLTMTLRHSKIKSEDDFILSPVKNEOGEFYD SRKAGEVWPKDADANGAYHIALKGLWNLOOINOWEKGKTLNLAIKNODWFSFIOEKPYOE * US 10 , 435 , 714 B2 63 64 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence

SEQ MHTGGLLSMDAKEFTGQYPLSKTLRFELRPIGRTWDNLEASGYLAEDRHRAECYPRAKELLDDNHRAFLNR ID VLPOIDMDWHPIAEAFCKVHKNPGNKELAODYNLOLSKRRKEI SAYLODADGYKGLFAKPALDEAMKIAKE NO : 5 NGNESDIEVLEAFNGFSVYFTGYHESRENIYSDEDMVSVAYRI TEDNFPRFVSNALIFDKLNESHPDIISEVSGN LGVDDIGKYFDVSNYNNFLSQAGIDDYNHIIGGHTTEDGLIQAFNVVLNLRHOKDPGFEKIQFKOLYKQILSV RTSKSYIPKOFDNSKEMVDCI CDYVSKI EKSETVERALKLVRNISSFDLRGIFVNKKNLRILSNKLIGDWDAIET ALMHSSSSENDKKSVYDSAEAFTLDDIFSSVKKFSDASAEDIGNRAEDICRVISETAPFINDLRAVDLDSLNDD GYEAAVSKIRESLEPYMDLFHELEIFSVGDEFPKCAAFYSELEEVSEOLIEIIPLFNKARSFCTRKRYSTDKIKVN LKFPTLADGWDLNKERDNKAAILRKDGKYYLAILDMKKDLSSIRTSDEDESSFEKMEYKLLPSPV KMLPKIF VKSKAAKEKYGLTDRMLECYDKGMHKSGSAFDLGFCHELIDYYKRCIAEYPGWDVFDFKFRETSDYGSMK EFNEDVAGAGYYMSLRKIPCSEVYRLLDEKSIYLFOIYNKDYS ENAHGNKNMHTMYWEGLFSPONLESPVF KLSGGAELFFRKSSIPNDAKTVHPKGSVLVPRNDVNGRRIPDSIYRELTRYFNRGDCRISDEAKSYLDKVKTK KADHDIVKDRRFTVDKMMFHVPIAMNFKAISKPNLNKKVIDGIIDDODLKIIGIDRGERNLIYVTMVDRKGNI LYODSLNILNGYDYRKALDVREYDNKEARRNWTKVEGIRKMKEGYLSLAVSKLADMIIENNAIIVMEDLNH GFKAGRSKIEKOVYOKFESMLINKLGYMVLKDKSIDOSGGALHGYOLANHVTTLASVGKOCGVIFYI PAAFT SKIDPTTGFADLFALSNVKNVASMREFFSKMKSVIYDKAEGKFAFTFDYLDYNVKS ECGRTLWTVYTVGERF TYSRVNREYVRKVPTDIIYDALQKAGISVEGDLRDRIAESDGDTLKSIFYAFKYALDMRVENREEDYIQSPVK NASGEFFCSKNAGKSLPODSDANGAYNIALKGILOLRMLSEOYDPNAESIRLPLITNKAWLTFMOSGMKTWK N * SEQ MDSLKDFTNLYPVSKTLRFELKPVGKTLENIEKAGILKEDEHRAESYRRVKKIIDTYHKVFIDSSLENMAKMG ID IENEIKAMLOSFCELYKKDHRTEGEDKALDKIRAVLRGLIVGAFTGVCGRRENTVONEKYESLFKEKLIKEILP NO : 6 DFVLSTEAESLPFSVEEATRSLKEFDSFTSYFAGFYENRKNIYSTKPOSTAIAYRLIHENLPKFIDNILVFOKIKE IHVLSOAGIEKYNALIGKIVTEGDGEMKGLNEHINLYNOO RGREDRLPLFRPLYKOILSDREOLSYLPESFEKDEELLRALKEFYDHIAEDILGRTOOLMTSISEYDLSRIYVRN DSOLTDISKKMLGDWNAIYMARERAYDHEOAPKRITAKYERDRIKALKGEESISLANLNSCIAFLDNVRDCR VDTYLSTLGQKEGPHGLSNLVENVFASYHEAEQLLSFPYPEENNLIQDKDNVVLIKNLLDNISDLQRFLKPLW GMGDEPDKDERFYGEYNYIRGALDOVIPLYNKVRNYL TRKPYSTRKVKLNFGNSOLLSGWDRNKEKDNSC VILRKGONFYLAIMNNRHKRSFENKVLPEYKEGEPYFEKMDYKFLPDPNKMLPKVFLSKKGIEIYKPSPKLLE QYGHGTHKKGDTFSMDDLHELIDFFKHSIEAHEDWKQFGFKFSDTATYENVSSFYREVEDQGYKLSFRKVSE SYVYSLIDOGKLYLFOIYNKDFSPCSKGTPNLHTLYWRMLFDERNLADVIYKLDGKAEIFFREKSLKNDHPTH PAGKPIKKKSROKKGEESLFEYDLVKDRHYTMDKFOFHVPI TMNFKCSAGSKVNDMVNAHIREAKDMHVIG IDRGERNLLYICVIDSRGTILDQISLNTINDIDYHDLLESRDKDRQQERRNWQTIEGIKELKQGYLSQAVHRIAE LMVAYKAVVALEDLNMGFKRGROKVESSVYOOFEKOLIDKLNYLVDKKKRPEDIGGLLRAYOFTAPFKSFK EMGKONGFLFYIPAWNTSNIDPTTGFVNLFHAOYENVDKAKSFFOKFDSISYNPKKDWFEFAFDYKNFTKKA EGSRSMWILCTHGSRIKNFRNSOKNGOWDSEEFALTEAFKSLFVRYEIDYTADLKTAIVDEKOKDFFVDLLKL FKLTVQMRNSWKEKDLDYLISPVAGADGRFFDTREGNKSLPKDADANGAYNIALKGLWALRQIRQTSEGGK LKLAISNKEWLOFVOERSYEKD * SEQ MNNGTNNFQNFIGISSLOKTLRNALIPTETTQQFIVKNGIIKEDELRGENRQILKDIMDDYYRGFISETLSSIDDI ID DWTSLFEKMEIQLKNGDNKDTLIKEQTEYRKAIHKKFANDDRF KNMFSAKLISDILPEFVIHNNNYSASEKEE NO : 7 KTQVIKLFSRFATSFKDYFKNRANCFSADDISSSSCHRIVNDNAEIFFSNALVYRRIVKSLSNDDINKISGDMKD SLKEMSLEEIYSYEKYGEFITOEGISFYNDICGKVNSFMNLYCOKNKENKNLYKLOKLHKOILCIADTSYEVP YKFESDEEVYOSVNGFLDNISSKHIVERLRKIGDNYNGYNLDKIYIVSKFYESVSOKTYRDWETINTALEIHYN NILPGNGKSKADKVKKAVKNDLQKSITEINELVSNYKLCSDDNIKAETYIHEISHILNNFEAQELKYNPEIHLV ESELKASELKNVLDVIMNAFHWCSVFMTEELVDKDNNFYAELEEIYDEIYPVISLYNLVRNYVTQKPYSTKKI KLNFGIPTLADGWSKSKEY SNNAIILMRDNLYYLGIFNAKNKPDKKIIEGNTSENKGDYKKMIYNLLPGPNK MIPKVFLSSKTGVETYKPSAYILEGYKONKHIKSSKDFDITFCHDLIDYFKNCIAIHPEWKNFGFDFSDTSTYED ISGFYREVELQGYKIDWTYISEKDIDLLQEKGQLYLFQIYNKD FSKKS TGNDNLHTMYLKNLFSEENLKDIVL KLNGEAEIFFRKSSIKNPIIHKKGSILVNRTYEAEEKDOFGNIQIVRKNIPENIYOELYKYFNDKSDKELSDEAA KLKNVVGHHEAATNIVKDYRYTYDKYFLHMPITINFKANKTGFINDRILQYIAKEKDLHVIGIDRGERNLIYV SVIDTCGNIVEOKSFNIVNGYDYOIKLKOOEGAROIARKEWKEIGKIKEI KEGYLSLVIHEISKMVIKYNAIIAM EDLSYGFKKGRFKVEROVYOKFETMLINKLNYLVFKDISITENGGLLKGYOL TYIPDKLKNVGHOCGCIFYVP AAYTSKIDPTTGFVNIFKFKDLTVDAKREFIKKFDSIRYDSEKNLFCFTFDYNNFITONTVMSKSSWSVYTYGV RIKRRFVNGRFSNESDTIDITKDMEKTLEMTDINWRDGHDLRODIIDYEIVOHIFEIFRLTVOMRNSLSELEDR DYDRLISPVLNENNIFYDSAKAGDALPKDADANGAYCIALKGLYEIKQITENWKEDGKFSRDKLKISNKDWF DFIONKRYL * SEQ MTNKFTNQYSLSKTLRFELIPGKTLEFIQEKGLLSQDKQRAESYQEMKKTIDKFHKYFIDLALSNAKL THLE ID TYLELYNKSAETKKEQKFKDDLKKVODNLRKEIVKSFSDGDAKSIFAILDKKELITVELEKWFENNEQKDIYF NO : 8 DEKFKTFTTYFTGFHONRKNMYSVEPNSTAIAYRLIHENLPKFLENAKAFEKIKOVESLOVNFRELMGEFGDE GLIFYNELEEMFOINYYNDVLSONGITIYNSIISGFTKNDIKYKGLNEYINNYNOTKDKKDRLPKLKOLYKOIL SDRISLSFLPDAFTDGKOVLKAIFDFYKINLLSYTIEGOEESONLLLLIROTIENLSSFDTOKIYLKNDTHLTTISO OVFGDFSVFSTALNYWYETKVNPKFETEYSKANEKKREILDKAKAVFTKODYFSIAFLOEVLSEYILTLDHTS DIVKKHSSNCIADYFKNHFVAKKENETDKTFDFIANITAKYQCIQGILENADQYEDEL KODOKLIDNLKFFLD AILELLHFIKPLHLKSESITEKDTAFYDVFENYYEALSLLTPLYNMVRNYVTQKPYSTEKIKLNFENAQLLNG KANKAFQKFPEGKENYEKMVYKLLPGVNKMLPKVFFSNKNIAY FNPSKELLENYKKETHKKGDTFNLEHCHTLIDFFKDSLNKHEDWKYFDFOFSETKSYODLSGFYREVEHOGY KINFKNIDSEYIDGLVNEGKLFLFQIYSKDFSPFSKGKPNMHTLYWKALFEEONLQNVIYKLNGQAEIFFRKAS IKPKNIILHKKKIKIAKKHFIDKKTKTSEIVPVQTIKNLNMYYQGKISEKEL TQDDLRYIDNFSIFNEKNKTIDIIK DKRFTVDKFQFHVPI TMNFKATGGSYINQTVLEYLONNPEVKI IGLDRGERHLVYLTLIDQQGNILKQESLNTI TDSKISTPYHKLLDNKENERDLARKNWGTVENIKELKEGYISQVVHKIATLMLEENAIVVMEDLNFGFKRGR FKVEKQIYQKLEKMLIDKLNYLVLKDKQPQELGGLYNALOL TNKFESFQKMGKQSGFLFYVPAWNTSKIDP TTGFVNYFYTKYENVDKAKAFFEKFEAIRFNAEKKYFEFEVKKYSDFNPKAEGTQQAWTICTYGERIETKRO US 10 , 435 , 714 B2 65 66 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence KDONNKFVSTPINLTEKIEDFLGKNQIVYGDGNCIKSQIASKDDKAFFETLLYWFKMTLQMRNSETRTDIDYL ISPVMNDNGTFYNSRDYEKLENPTLPKDADANGAYHIAKKGLMLLNKIDQADLTKKVDLSISNRDWLQFVQ KNK *

SEQ MEQEYYLGLDMGTGSVGWAVTDSEYHVLRKHGKALWGVRLFESASTAEERRMFRTSRRRLDRRNWRIEIL ID OEIFAEEISKKDPGFFLRMKESKYYPEDKRDINGNCPELPYAL FVDDDFTDKDYHKKFPTIYHLRKMLMNTEE NO : 9 TPDIRLVYLAIHHMMKHRGHFLLSGDINEIKEFGTTFSKLLENIKNEELDWNLELGKEEYAVESILKDNMLN RSTKKTRLIKALKAKSICEKAVLNLLAGGTVKLSDIFGLEELNETERPKISFADNGYDDYIGEVENELGEOFYII ETAKAVYDWAVLVEILGKYTSISEAKVATYEKHKSDLOFLKKIVRKYLTKEEYKDIFVSTSDKLKNYSAYIG MTKINGKKVDLOSKRCSKEEFYDFIKKNVLKKLEGOPEYEYLKEELERETFLPKOVNRDNGVIPYOIHLYELK KILGNLRDKIDLIKENEDKLVOLFEFRIPYYVGPLNKIDDGKEGKFTWAVRKSNEKIYPWNFENVVDIEASAE KFIRRMTNKCTYLMGEDVLPKDSLLYSKYMVLNELNNVKLDGEKLSVELKORLYTDVFCKYRKVTVKKIK NYLKCEGIISGNVEITGIDGDFKASLTAYHDFKEILTGTELAKKDKENIITNIVLFGDDKKLLKKRLNRLYPOIT PNOLKKICALSYTGWGRFSKKFLEEI TAPDPETGEVWNIITALWESNNNLMOLLSNEYRFMEEVETYNMGKO TKTLSYETVENMYVSPSVKROIWOTLKIVKELEKVMKESPKRVFIEMAREKOESKRTESRKKOLIDLYKACK GDQEEQKLRSDKLYLYYTOKGRCMYSGEVIELKDLWDNTKYDIDHIYPOSKTMDDSLNNR VLVKKKYNATKSDKYPLNENIRHERKGFWKSLLDGGFISKEKYERLIRNTELSPEELAGFIERQIVETROSTKA VAEILKOVFPESEIVYV KAGTVSRFRKDFELLKVREVNDLHHAKDAYLNIVVGNSYYVKFTKNASWFIKENP GRTYNLKKMFTSGWNIERNGEVAWEVGKKGTIVTVKQIMNKNNILVTROVHEAKGGLFDOQIMKKGKGQI AIKETDERLASIEKYGGYNKAAGAYFMLVESKDKKGKTIRTIEFIPLYLKNKIESDESIALNFLEKGRGLKEPKI LLKKIKIDTLFDVDGFKMWLSGRTGDRLLFKCANOLILDEKIIVTMKKIVKFIORROENRELKLSDKDGIDNE VLMEIYNTFVDKLENTVYRIRLSEOAKTLIDKOKEFERLSLEDKSSTLFEILHIFOCOSSAANLKMIGGPGKAGI LVMNNNISKCNKISIINOSPTGIFENEIDLLK

SEQ MNKFENFTGLYPISKTLRFELIPOGKTLEYIEKSEILENDNYRAEKYEEVKDIIDGYHKWFINETLHDLHINWSE ID LKVALENNRIEKSDASKKELORVOKI KREEIYNAFIEHEAFOYLFKENLLSDLLPIOI EOSEDLDAEKKKOAVE NO : TFNRFSTYFTGFHENRKNIYSKEGISTSVTYRIVHDNFPKFLENMKVFEILRNECPEVISDTANELAPFIDGVRIE 10 GGVTTETGEKYRGINEFTNLYROOHPEFGKSKKATKMVVLFKOILSDRDT LSFIPEMFGNDKOVONSIQLFYNREISQFENEGVKTDVCTALATLTSKIAEFDTEKIYIQQPELPNVSQRLFGSW DKWLKSDLFSFTELNKALEFSGKDERIENYFSETGI FAOLVKTGFD EAQSILETEYTSEVHLKDOQTDIEKIKTFLDALONLMHLLKSLCVSEEADRDAAFYNEFDMLYNQLKLVVPL YNKVRNYITOKLFRSDKIKIYFENKGOFLGGWVDSOTENSDNGTQAGGYIFRKENVINEYDYYLGICSDPKLF RRTTIVSENDRSSFERLDYYOLKTASVYGNSYCGKHPYTEDKNELVNSIDRFVHLSGNNILIEKIAKDKVKSNP TTNTPSGYLNFIHREAPNTYECLLODENFVSLNORWSALKATLATLVRVPKALVYAKKDYHLFSEIINDIDE LSYEKAFSYFPVSOTEFENSSNRTIKPLLLFKISNKDLSFAENFEKGNROKIGKKNLHTLYFEALMKGNODTIDI GTGMVFHRVKSLNYNEKTLKYGHHSTOLNEKFSYPIIKD KRFASDKFLFHLSTEINYKEKRKPLNNSIIEFLTN NPDINIIGLDRGERHLIYLTLINOKGEILROKTFNIVGNTNYHEKLNOREKERDNARKSWATIGKIKELKEGFLS LVIHEIAKIMVENNAIVVLEDLNFGFKRGRFKVEKOIYOKFEKMLIDKLNYLVFKDKKANEAGGVLKGYOLA EKFESFQKMGKQSGFLFYVPAAYTSKIDPTTGFVNMLNLNYTNMKDAQTLLSGMDKISFNADANYFEFELD TYNSATKETTTVNVTEDLKKLLDKFEV KYSNGDNIKDEICROTDAKF FEIILWLLKLTMQMRNSNTKTEEDFILSPVKNSNGEFFRSNDDANGIWPADADANGAYHIALKGLYLVKECF NKNEKSLKIEHKNWFKFAQTRFNGSLTKNG * SEQ MENFKNLYPINKTLRFELRPYGKTLENFKKSGLLEKDAFKANSRRSMOAIIDEKFKETIEERLKYTEFSECDLG ID NMTSKDKKITDKAATNLKKOVILSFDDEIFNNYLKPDKNIDALFKNDPSNPVISTFKGFTTYFVNFFEIRKHIFK NO : GESSGSMAYRIIDENLTTYLNNIEKI KKLPEELKSOLEGIDOIDKLNNYNEFITOSGI THYNEIIGGISKSENVKIO 11 GINEGINLYCOKNKVKLPRLTPLYKMILSDRVSNSFVLDTIENDTELIEMISDLINKTEISODVIMSDIONIFIKY KOLGNLPGISYSSIVNAICSDYDNNFGDGKRKKSYENDRKKHLETNVYSINYISELLTDTDVSSNIKMRYKEL EONYOVCKENFNATNWMNIKNIKOSEKTNLIKDLLDILKSIOR FYDLFDIVDEDKNPSAEFYTWLSKNAEKLD FEFNSVYNKSRNYLTRKOYSDKKIKLNFDSPTLAKGWDANKEIDNSTIIMRKFNNDRGDYDYFLGIWNKSTP ANEKIIPLEDNGLFEKMOYKLYPDPSKMLPKOFLSKIWKAKHPTTPEFDKKYKEGRHKKGPDFEKEFLHELID CFKHGLVNHDEKYODVFGFNLRNTEDYNSYTEFLEDVERCNYNLSFNKIADTSNLINDGKLYVFOIWSKDFSI DSKGTKNLNTIYFESLFSEENMIEKMFKLSGEAEI FYRPASLNYCEDIIKKGHHHAELKDKFDYPIIKDKRYSO DKFFFHVPMVINYKSEKLNSKSLNNRTNENLGQFTHIIGIDRGERHLIYLTVVDVSTGEIVEQKHLDEIINTDTK GVEHKTHYLNKLEEKSKTRDNERKSWEAIETIKELKEGYISHVINEIOKLOEKYNALIVMENLNYGFKNSRIK VEKOVYQKFETALIKKFNYIIDKKDPETYIHGYQL TNPITTLDKIGNQSGIVLYIPAWNTSKIDPVTGFVNLLYA DDLKYKNOEOAKSFIOKIDNIYFENGEFKFDIDFSKWNNRYSISKTKWTLTSYGTRIOTFRNPOKNNKWDSAE YDLTEEFKLILNIDGTLKSODVETYKKFMSLFKLMLQLRNSVTGTDIDYMISPVTDKTGTHFDSRENIKNLPA DADANGAYNIARKGIMAIENIMNGISDPLKISNEDYLKYIQNQQE SEO MTQFEGFTNLYQVSKTLRFELIPGKTLKHIQEQGFIEEDKARNDHYKELKPIIDRIYKTYADQCLQLVQLDW ID ENLSAAIDSYRKEKTEETRNALIEEQATYRNAIHDYFIGRTDNLTDAINKRHAEIYKGLFKAELFNGKVLKOL NO : GTVTTTEHENALLRSFDKFTTYFSGFYENRKNVFSAEDISTAI PHRIVODNFPKFKENCHIFTRLITAVPSLREH 12 FENVKKAIGIFVSTSIEEVFSFPFYNOLLTOTOIDLYNOLLGGISREAGTEKIKGLNEVLNLAIOKNDETAHIIAS KOILSDRNTLSFILEEFKSDEEVIOSFCKYKTLLRNENVLETAEALFNELNSIDLTHIFISHKKLETIS SALCDHWDTLRNALYERRISELTGKITKSAKEKVORSLKHEDINLQEIISAAGKELSEAFKQKTSEILSHAHAA LDOPLPTTLKKOEEKEILKSOLDSLLGLYHLLDWFAVDESNEVDPEFSARLTGWFAVDESNEVDPEFSARLTGIKLEMEPSLSFYNKARNYAT KKPYSVEKFKLNFQMPTLASGWDVNKEKNNGAILFVKNGLYYLGIMPKQKGRYKALSFEPTEKTSEGFDKM YYDYFPDAAKMIPKCSTOLKAVTAHFOTHTTPILLSNNFIEPLEITKEIYDLNNPEKEPKKFOTAYAKKTGDOK GYREALCKWIDFTRDFLSKYTKTTSIDLSSLRPSSOYKDLGEYYAELNPLLYHISFORIAEKEIMDAVETGKLY LFOIYNKDFAKGHHGKPNLHTLYWTGLFSPENLAKTSIKLNGOAELFYRPKSRMKRMAHRLGEKMLNKKLK DOKTPIPDTLYOELYDYVNHRLSHDLSDEARALLPNVITKEVSHEIIKDRRFTSDKFFFHVPITLNYOAANSPS KFNORVNAYLKEHPETPIIGIDRGERNLIYITVIDSTGKILEQRSLNTIQOFDYQKKLDNREKERVAAROAWSV VGTIKDLKQGYLSQVIHEIVDLMIHYQAWVLENLNFGFKS KRTGIAEKAVYQQFEKMLIDKLNCLVLKDYP AEKVGGVLNPYOL TDOFTSFAKMGTOSGFLFYVPAPYTSKIDPLTGFVDPFVWKTIKNHESRKHFLEGFDFL HYDVKTGDFILHFKMNRNLSFORGLPGFMPAWDIVFEKNETOFDAKGTPFIAGKRIVPVIENHRFTGRYRDLY US 10 , 435 , 714 B2 67 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence PANELIALLEEKGIVFRDGSNILPKLLENDDSHAIDTMVALIRSVLOMRNSNAATGEDYINSPVRDLNGVCFDS RFONPEWPMDADANGAYHIALKGOLLLNHLKESKDLKLONGISNOL SEQ MAVKSIKVKLRLDDMPEIRAGLWKLHKEVNAGVRYYTEWLSLLROENLYRRSPNGDGEQECDKTAEECKA ID ELLERLRAROVENGHRGPAGSDDELLOLAROLYELLVPOAIGAKGDAQOIARKFLSPLADKDAVGGLGIAKA NO : GNKPRWVRMREAGEPGWEEEKEKAETRKSADRTADVLRALADFGLKPLMRVYTDSEMSSVEWKPLRKGQ 13 AVRTWDRDMFOOAIERMMSWESWNORVGOEYAKLVEOKNRFEOKNFVGOEHLVHLVNOLOODMKEASP GLESKEOTAHYVTGRALRGSDKVFEKWGKLAPDAPFDLYDAEI KNVORRNTRRFGSHDLFAKLAEPEYOAL WREDASFLTRYAVYNSILRKLNHAKMFATFTLPDATAHPIWTRFDKLGGNLHOYTFLFNEFGERRHAIRFHK LLKVENGVAREVDDVTVPISMSEOLDNLLPRDPNEPIALYFRDYGAEOHFTGEFGGAKIOCRRDOLAHMHRR RGARDVYLNVSVRVQSQSEARGERRPPYAAVFRLVGDNHRAFVHFDKLSDYLAEHPDDGKLGSEGLLSGLR VMSVDLGLRTSASISVFRVARKDELKPNSKGRVPFFFPIKGNDNLVAVHERSOLLKLPGETESKDLRAIREER ORTLROLRTOLAYLRLLVRCGSEDVGRRERSWAKLIEOPVDAANHMTPDWREAFENELOKLKSLHGICSDK EWMDAVYESVRRVWRHMGKVRDWRKDVRSGERPKIRGYAKDVVGGNSIEQI EYLEROYKFLKSWSFFG KVSGOVIRAEKGSRFAITLREHIDHAKEDRLKKLADRIIMEALGYVYALDERGKGKWVAKYPPCOLILLEELS EYQFNNDRPPSENNOLMOWSHRGVFQELINQAQVHDLLVGTMYAAFSSRFDARTGAPGIRCRRVPARCTOE HNPEPFPWWLNKFVVEHTLDACPLRADDLIPTGEGEIFVSPFSAEEGDFHQIHADLNAAQNLQQRLWSDFDIS QIRLRCDWGEVDGELVLIPRLTGKRTADSYSNKVFYTNTGVTYYERERGKKRRKVFAQEKLSEEEAELLVEA DEAREKSVVLMRDPSGIINRGNWTROKEFWSMVNQRI EGYLVKQIRSRVPLQDSACENTGDI * SEQ MATRSFILKIEPNEEVKKGLWKTHEVLNHGIAYYMNILKLIROEAIYEHHEODPKNPKKVSKAEIOAELWDFV ID LKMQKCNSFTHEVDKDVVFNILRELYEELVPSSVEKKGEANQL SNKFLYPLVDPNSQSGKGTASSGRKPRWY NO : NLKIAGDPSWEEEKKKWEEDKKKDPLAKILGKLAEYGLIPLFIPFTDSNEPIVKEIKWMEKSRNOSVRRLDKD 14 MFIQALERFLSWESWNLKVKEEYEKVEKEHKTLEERIKEDIQAFKSLEQYEKERQEQLLRDTLNTNEYRLSK RGLRGWREIIOKWLKMDENEPSEKYLEVFKDYORKHPREAGDYSVYEFLSKKENHFIWRNHPEYPYLYATF CEIDKKKKDAKOOATFTLADPINHPLWVRFEERSGSNLNKYRILTEOLHTEKLKKKLTVOLDRLIYPTESGG WEEKGKVDIVLLPSRQFYNQIFLDIEEKGKHAFTYKDESIKFPLKGTLGGARVQFDRDHLRRYPHKVESGNV GRIYFNMTVNIEPTESPVSKSLKIHRDDFPKFVNFKPKELTEWIKDSKGKKLKSGIESLEIGLRVMSIDLGOROA AAASIFEVVDQKPDIEGKLFFPIKGTELYAVHRASFNIKLPGETLVKSREVLRKAREDNLKLMNOKLNFLRNV LHFQQFEDITEREKRVTKWISROENSDVPLVYQDELIQIRELMYKPYKDWVAFLKQLHKRLEVEIGKEVKHW RKSLSDGRKGLYGISLKNIDEIDRTRKFLLRWSLRPTEPGEVRRLEPGQRFAIDOLNHLNALKEDRLKKMANT IIMHALGYCYDVRKKKWOAKNPACOIILFEDLSNYNPYEERSRFENSKLMKWSRREIPROVALOGEIYGLOV GEVGAQFSSRFHAKTGSPGIRCSVVTKEKLQDNRFFKNLQREGRLTLDKIAVLKEGDLYPDKGGEKFISLSKD RKLVTTHADINAAONLOKRFWTRTHGFYKVYCKAYOVDGOTVYIPESKDOKOKIIEEFGEGYFILKDGVYE WGNAGKLKIKKGSSKOSSSELVDSDILKDSFDLASELKGEKLMLYRDPSGNVFPSDKWMAAGVFFGKLERIL ISKLTNQYSISTIEDDSSKOSM * SEQ MP TRTINLKLVLGKNPENATLRRALFSTHRLVNOATKRIEEFLLLCRGEAYRTVDNEGKEAEIPRHAVOEEAL ID AFAKAAQRHNGCISTYEDQEILDVLRQLYERLVPSVNENNEAGDAQAANAWVSPLMSAESEGGLSVYDKVL NO : DPPPVWMKLKEEKAPGWEAASOIWIOSDEGOSLLNKPGSPPRWIRKLRSGOPWODDFVSDOKKKODELTKG 15 NAPLIKOLKEMGLLPLVNPFFRHLLDPEGKGVSPWDRLAVRAAVAHFISWESWNHRTRAEYNSLKLRRDEFE AASDEFKDDFTLLRQYEAKRHSTLKSIALADDSNPYRIGVRSLRAWNRVREEWIDKGATEEQRVTILSKLQT QLRGKFGDPDLFNWLAQDRHVHLWSPRDSVTPLVRINAVDKVLRRRKPYALMTFAHPRFHPRWILYEAPGG SNLROYALDCTENALHITLPLLVDDAHGTWIEKKIRVPLAPSGOIODLTLEKLEKKKNRLYYRSGFOOFAGLA GGAEVLFHRPYMEHDERSEESLLERPGAVWFKLTLDVATOAPPNWLDGKGRVRTPPEVHHFKTALSNKS KH TRTLOPGLRVLSVDLGMRTFASCSVFELIEGKPETGRAFPVADERSMDSPNKLWAKHERSFKLTLPGETPSRK EEEERSIARAEIYALKRDIORLKSLLRLGEEDNDNRRDALLEOFFKGWGEEDVVPGOAFPRSLFOGLGAAPFR STPELWROHCOTYYDKAEACLAKHISDWRKRTRPRPTSREMWYKTRSYHGGKSIWMLEYLDAVRKLLLSW SLRGRTYGAINRODTARFGSLASRLLHHINSLKEDRIKTGADSIVOAARGYIPLPHGKGWEORYEPCOLILFED LARYRFRVDRPRRENSOLMOWNHRAIVAETTMOAELYGOIVENTAAGFSSRFHAATGAPGVRCRFLLERDF DNDLPKPYLLRELSWMLGNTKVESEEEKLRLLSEKIRPGSLVPWDGGEOFATLHPKROTLCVIHADMNAAO NLQRRFFGRCGEAFRLVCQPHGDDVLRLASTPGARLLGALQOL ENGQGAFELVRDMGSTSOMNRFVMKSL GKKKIKPLODNNGDDELEDVLSVLPEEDDTGRITVFRDSSGIFFPCNVWIPAKQFWPAVRAMIWKVMASHSL

SEQ MTKLRHROKKL THDWAGSKKREVLGSNGKLONPLLMPVKKGOVTEFRKAFSAYARATKGEMTDGRKNMF ID THSFEPFKTKPSLHQCELADKAYQSLHSYLPGSLAHFLLSAHALGFRIFSKSGEATAFQASSKIEAYESKLASE NO : LACVDLSIONLTISTLFNALTTSVRGKGEETSADPLIARFYTLLTGKPLSRDTOGPERDLAEVISRKIASSFGTW 16 KEMTANPLQSLOFFEEELHALDANVSLSPAFDVLIKMNDLQGDLKNRTIVFDPDAPVFEYNAEDPADIIIKLTA RYAKEAVIKNONVGNYVKNAITTTNANGLGWLLNKGLSLLPVS TDDELLEFIGVERSHPSCHALIELIAQLEA PELFEKNVFSDTRSEVQGMIDSAVSNHIARLSSSRNSLSMDSEELERLIKSFQIHTPHCSLFIGAQSLSQQLESLP EALQSGVNSADILLGSTQYMLTNSLVEESIATYQRTLNRINYLSGVAGQINGAIKRKAIDGEKIHLPAAWSELI SLPFIGOPVIDVESDLAHLKNOYOTLSNEFDTLISALOKNFDLNFNKALLNRTOHFEAMCRSTKKNALSKPEIV SYRDLLARLTSCLYRGSLVLRRAGIEVLKKHKIFESNSELREHVHERKHFVFVSPLDRKAKKLLRLTDSRPDL LHVIDEILOHDNLENKDRESLWLVRSGYLLAGLPDOLSSSFINLPIITOKGDRRLIDLIQYDOINRDAFVMLVTS AFKSNLSGLOYRANKOSFVVTRTLSPYLGSKLVYVPKDKDWLVPSOMFEGRFADILOSDYMVWKDAGRLC SNIKKSVFSSEEVLAFLRELPHRTFIQTEVRGLGVNVDGIAFNNGDIPSLKTFSNCVOVKVSRTNT SLVOTLNRWFEGGKVSPPSIQFERAYYKKDDQIHEDAAKRKIRFQMPATELVHASDDAGWTPSYLLGIDPGE YGMGLSLVSINNGEVLDSGFIHINSLINFASKKSNHQTKVVPROQYKS PYANYLEQSKDSAAGDIAHILDRLIY KLNALPVFEALSGNSQSAADQVWTKVLSFYTWGDNDAONSIRKQHWFGASHWDIKGMLROPPTEKKPKPYI AFPGSQVSSYGNSORCSCCGRNPIEQLREMAKDTSIKELKIRNSEIQLFDGTIKLFNPDPSTVIERRRHNLGPSRI PVADRTFKNISPSSLEFKELITIVSRSIRHSPEFIAKKRGIGSEYFCAYSDCNSSLNSEANAAANVAQKFOKOLFF EL * SEQ MKRILNSLKVAALRLLFRGKGSELVKTVKYPLVSPVOGAVEELAEAIRHDNLHLFGQKEIVDLMEKDEGTOV ID YSVVDFWLDTLRLGMFFSPSANALKITLGKFNSDOVSPFRKVLEQSPFFLAGRLKVEPAERILSVEIRKIGKRE US 10 , 435 , 714 B2 69 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence NO : NRVENYAADVETCFIGOLSSDEKOSICKLANDIWDSKDHEEQRMLKADFFAIPLIKDPKAVTEEDPENETAGK 17 OKPLELCVCLVPELYTRGFGSIADFLVORLTLLRDKMSTDTAEDCLEYVGIEEEKGNGMNSLLGTFLKNLOG DGFEQIFOFMLGSYVGWOGKEDVLRERLDLLAEKVKRLPKPKFAGEWSGHRMFLHGOLKSWSSNFFRLFNE TRELLESIKSDIOHATMLISYVEEKGGYHPOLLSOYRKLMEOL PALRTKVLDPEIEMTHMSEAVRSYIMIHKS VAGFLPDLLESLDRDKDREFLLSIFPRIPKIDKKTKEIVAWEL PGEPEEGYLFTANNLFRNFLENPKHVPRFMA ERIPEDWTRLRSAPVWFDGMV KOWOKVYNOLVESPGALYOFNESFLRORLOAMLTVYKRDLOTEKFLKLL ADVCRPLVDFFGLGGNDIIFKSCQDPRKQWQTVIPLSVPADVYTACEGLAIRLRETLGFEWKNLKGHEREDFL RLHOLLGNLLFWIRDAKLVVKLEDWMNNPCVOEYVEARKAIDLPLEIFGFEVPIFLNGYLFSELROLELLLRR KSVMTSYSVKTTGSPNRLFOLVYLPLNPSDPEKKNSNNFOERLDTPTGLSRRFLDLTLDAFAGKLLTDPVTOE LKTMAGFYDHLFGFKLPCKLAAMSNHPGSSSKMVVLAKPKKGVASNIGFEPIPDPAHPVFRVRSSWPELKYL EGLLYLPEDTPLTIELAETSVSCOSVSSVAFDLKNLTTILGRVGEFRVTADOPFKLTPIIPEKEESFIGKTYLGLD AGERSGVGFAIVTVDGDGYEVORLGVHEDTOLMALOOVASKSLKEPVFOPLRKGTFROOERIRKSLRGCYW NFYHALMI KYRAKVVHEESVGSSGLVGOWLRAFOKDLKKADVLPKKGGKNGVDKKKRESSAODTLWGGA FSKKEEQQIAFEVQAAGSSQFCLKCGWWFQLGMREVNRVQESGVVLDWNRSIVTFLIESSGEKVYGFSPOOL EKGFRPDIETFKKMVRDFMRPPMFDRKGRPAAAYERFULGRRHRRYRFDKVFEERFGRSALFICPRVGCGNF DHSSEQSAVULALIGYIADKEGMSGKKLVYVRLAELMAEWKLKKLERSRVEEQSSAQ * SEQ MAESKOMOCRKCGASMKYEVIGLGKKSCRYMCPDCGNHTSARKIONKKKRDKKYGSASKAOSORIAVAG ID ALYPDKKVQTIKTYKYPADLNGEVHDSGVAEKIAQAIQEDEIGLLGPSSEYACWIASQKQSEPYSVVDFWFD NO : AVCAGGVFAYSGARLLSTVLOLSGEESVLRAALASSPFVDDINLAOAEKFLAVSRRTGODKLGKRIGECFAE 18 GRLEALGI KDRMREFVOAIDVAOTAGORFAAKLKIFGISOMPEAKOWNNDSGLTVCILPDYYVPEENRADOL VVLLRRLREIAYCMGIEDEAGFEHLGIDPGALSNFSNGNPKRGFLGRLLNNDIIALANNMSAMTPYWEGRKG ELIERLAWLKHRAEGLYLKEPHFGNSWADHRSRIFSRIAGWLSGCAGKLKIAKDOISGVRTDLFLLKRLLDAV POSAPSPDFIASISALDRFLEAAESSODPAEOVRALYAFHLNAPAVRSIANKAVORSDSOEWLIKELDAVDHL EFNKAFPFFSDTGKKKKKGANSNGAPSEEEYTETESICOPEDAEQEVNGOEGNGAS KNOKKFORIPRFFGEGS RSEYRILTEAPOYFDMF CNNMRAIFMOLESOPRKAPRDFKCFLONRLOKLYKOTFLNARSNKCRALLESVLIS WGEFYTYGANEKKFRLRHEASERSSDPDYVVOOALEIARRLFLFGFEWRDCSAGERVDLVEIHKKAISFLLAI TQAEVSVGSYNWLGNSTVSRYLSVAGTDTLYGTQLEEFLNATVLSOMRGLAIRLSSQELKDGFDVQLESSCQ DNLQHLLVYRASRDLAACKRATCPAELDPKILVLPVGAFIASVMKMIERGDEPLAGAYLRHRPHSFWQIRV RGVAEVGMDOGTALAFQKPTESEPFKIKPFSAQYGPVLWLNSSSYSQSQYLDGFLSQPKNWSMRVLPQAGS VRVEQRVALIWNLQAGKMRLERSGARAFFMPVPFSFRPSGSGD EAVLAPNRYLGLFPHSGGIEYAVDVLDS AGFKILERGTIAVNGFSOKRGERQEEAHREKORRGISDIGRKKPVQAEVDAANELHRKYTDVATRLGCRIVV QWAPQPKPGTAPTAQTVYARAVRTEAPRSGNQEDHARMKSSWGYTWGTYWEKRKPEDILGISTQVYWTG GIGESCPAVAVALLGHIRATSTQTEWEKEEVVFGRLKKFFPS * SEQ MEKRINKIRKKLSADNATKPVSRSGPMKTLLVRVMTDDLKKRLEKRRKKPEVMPQVISNNAANNLRMLLDD ID YTKMKEAILOVYWOEFKDDHVGLMCKFAOPASKKIDONKLKPEMDEKGNLTTAGFACSOCGOPLFVYKLE NO : OVSEKGKAYTNYFGRCNVAEHEKLILLAOLKPEKDSDEAVTYSLGKFGORALDFYSIHVTKESTHPVKPLAO 19 IAGNRYASGPVGKALSDACMGTIASFLSKYODIIIEHOKVKGNOKRLESLRELAGKENLEYPSVTLPPOPHT KEGVDAYNEVIARVRMWVNLNLWOKLKLSRDDAKPLLRLKGFPSFPVVERRENEVDWWNTINEVKKLIDA KRDMGRVFWSGVTAEKRNTILEGYNYLPNENDHKKREGSLENPKKPAKROFGDLLLYLEKKYAGDWGKVF DEAWERIDKKIAGLTSHIEREEARNAEDAQSKAVLTDWLRAKASFVLERLKEMDEKEFYACEIQLQKWYGD LRGNPFAVEAENRVVDISGFSIGSDGHSIOYRNLLAW KYLENGKREFYLLMNYGKKGRIRFTDGTDIKKSGK WOGLLYGGGKAKVIDLTFDPDDEOLIILPLAFGTROGREFIWNDLLSLETGLIKLANGRVIEKTIYNKKIGRDE PALFVALTFERREWDPSNIKPVNLIGVDRGENIPAVIALTDPEGCPLPEFKDSSGGPTDILRIGEGYKEKORAI OAAKEVEORRAGGYSRKFASKSRNLADDMVRNSARDLFYHAVTHDAVLVFENLSRGFGROGKRTFMTERO YTKMEDWLTAKLAYEGLTSKTYLSKTLAOYTSKTCSNCGFTITTADYDGMLVRLKKTSDGWATTLNNKEL KAEGOITYYNRYKROTVEKELSAELDRLSEESGNNDISKWTKGRRDEALFLLKKRFSHRPVOEOFVCLDCGH EVHADEOAALNIARSWLFLNSNSTEFKSYKSGKOPFVGAWOAFYKRRLKEVWKPNA SEQ MKRINKIRRRLVKDSNTKKAGKTGPMKTLLVRVMTPDLRERLENLRKKPENIPQPISNTSRANLNKLLTDYTE ID MKKAILHVYWEEFOKDPVGLMSRVAOPAPKNIDORKLIPVKDGNERLTSSGFACSOCCOPLYVYKLEOVND NO : KGKPHTNYFGRCNVSEHERLILLSPHKPEANDELVTYSLGKFGORALDFYSIHVTRESNHPVKPLEOIGGNSC 20 ASGPVGKALSDACMGAVASFLTKYODIILEHOKVIKKNEKRLANLKDIASANGLAFPKITLPPOPHTKEGIEA YNNVVAQIVIWVNLNLWOKLKIGRDEAKPLORLKGFPSFPLVERQANEVDWWDMVCNVKKLINEKKEDGK VFWONLAGYKROEALLPYLSSEEDRKKGKKFARYOFGDLLLHLEKKHGEDWGKVYDEAWERIDKKVEGLS KHIKLEEERRSEDAOSKAALTDWLRAKASFVIEGLKEADKDEFCRCELKLOKWYGDLRGKPFAIEAENSILDI SGFSKOYNCAFIWOKDGVKKLNLYLIINYFKGGKLRFKKIKPEAFEANRFYTVINKKSGEIVPMEVNFNFDDP NLIILPLAFGKROGREFIWNDLLSLETGSLKLANGRVIEKTLYNRRTRODEPALFVALTFERREVLDSSNIKPM NLIGIDRGENIPAVIALTDPEGCPLSRFKDSLGNPTHILRIGESYKEKORTIOAAKEVEORRAGGYSRKYASKA KNLADDMVRNTARDLLYYAVTODAMLIFENLSRGFGRQGKRTFMAERQYTRMEDWLTAKLAYEGLPSKT YLSKTLAQYTSKTCSNCGFTITSADYDRVLEKLKKTATGWMTTINGKELKVEGQITYYNRYKRONVVKDLS VELDRLSEESVNNDISSWTKGRSGEALSLLKKRFSHRPVOEKFVCLNCGFETHADEQAALNIARSWLFLRSOE YKKYOTNKTTGNTDKRAFVETWOSFYRKKLKEVWKP

SEO atqGGAAAAATGTATTATCTTGGTCTGGATATAGGAACAAATTCTGTTGGATATGCCGTAACCGACCCATC ID GTACCATTTGCTCAAATTTAAAGGCGAACCGATGTGGGGTGCCCACGTGTTTGCTGCGGGGAATCAATC NO : AGCTGAACGGAGAAGCTTTCGTACGAGCCGCAGACGCCTTGACCGCAGGCAACAGCGTGTCAAACTGGT 21 TCAAGAAATCTTTGCTCCCGTGATTAGTCCCATTGATCCACGTTTTTTTATCAGACTTCATGAGAGCGCTT TATGGCGGGATGATGTGGCTGAAACGGATAAACATATTTTCTTTAATGACCCGACCTATACGGATAAGG AATATTATTCTGACTATCCAACCATCCATCATCTCATTGTGGACCTTATGGAAAGCAGTGAAAAGCATGA CCCGCGGCTTGTTTATTTGGCTGTTGCCTGGCTGGTTGCTCAT CGTGGTCATTTCCTCAATGAAGTGGATA AGGATAATATTGGGGATGTCCTGAGTTTTGACGCCTTTTAAccccTTTTATccTCAc????????????????cccATAAT GGGGTGTCACCTTGGGTATGTGAGTCAAAAGCACTCCAAGCGACCCTGCTTTCACGAAACTCCGTCAAC GATAAGTATAAAGCCTTGAAGTCTCTGATCTTTGGCAGCCAAAAGCCGGAGGATAATTTTGATGCCAAT ATCAGTGAAGATGGACTTATCCAACTTTTAGCAGGAAAAAAGGTCAAGGTCAATAAACTTTTTCCTCAA US 10 , 435 , 714 B2 71 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence GAAAGTAATGATGCTTCCTTTACACTCAATGATAAGGAAGATGCAATTGAGGAAATCTTAGGAACGCTT ACACCGGATGAGTGTGAATGGATTGCGCATATTAGGAGGCTGTTTGATTGGGCCATCATGAAACATGCT CTCAAAGATGGCAGAACAATCTCCGAATCGAAAGTAAAGCTCTATGAACAGCATCACCATGACTTGACA CAGCTCAAGTATTTTGTGAAGACCTATCTAGCAAAGGAATATGATGACATTTTTCGAAACGTAGATAGT GAAACAACCAAAAACTATGTCGCATATTCCTATCATGTAAAAAAGTCAAGGGTACATTGCCCAAAAAT AAGGCAACCCAAGAAGAATTTTGCAAGTATGTCCTTGGAAAGGTAAAGAACATCGAATGCAGTGAAGC TGATAAGGTTGATTTTGATGAAATGATTCAGCGTCTTACAGACAATTCCTTTATGCCGAAACAAGTATCA GGTGAAAACAGGGTTATCCCTTACCAGCTTTACTATTATGAACTAAAGACTATTTTGAATAAAGCCGCTT CTTATCTGCCTTTTTTGACCCAATGCGGAAAAGATGCCATCTCCAATCAAGATAAGCTCCTTTCCATCAT GACCTTTCGGATTCCGTATTTCGTTGGGCCCTTGCGCAAGGACAATTCAGAGCATGCCTGGCTGGAACGA AAAGCAGGGAAAATCTATCCGTGGAATTTTAACGACAAAGTTGACCTTGATAAAAGTGAAGAAGCGTTC ATTCGGAGAATGACGAATACCTGCACTTATTATCCCGGTGAAGATGTTTTGCCACTTGACTCCCTTATTT ATGAAAAATTCATGATCCTCAATGAAATCAATAATATCCGAAT TGATGGTTATCCTATTTCTGTAGATGT AAAACAGCAGGTTTTTGGCCTCTTTGAAAAGAAGAGAAGAGTGACCGTAAAGGATATCCAGAATCTCCT GCTTTCCTTGGGTGCCTTGGATAAGCATGGTAAATTGACGGGAATCGATACTACCATCCATAGCAATTAC AATACATACCATCATTTTAAATCGCTCATGGAGCGTGGCGTTCTTACTCGTGATGATGTGGAACGCATTG TGGAGCGTATGACCTATAGTGATGATACAAAACGCGTCCGTCTTTGGCTGAACAATAATTATGGAACGC TCACTGCTGACGACGTAAAGCATATTTCAAGGCTCCGAAAGCATGATTTTGGCCGGCTTTCCAAAATGTT CCTCACAGGCCTAAAGGGAGTTCATAAGGAAACGGGGGAACGAGCTTCCATTTTGGATTTTATGTGGAA TACCAATGATAACTTGATGCAGCTTTTATCTGAATGTTATACTTTTTCGGATGAAATTACCAAGCTGCAG GAAGCATACTATGCCAAGGCGCAGCTTTCCCTGAATGATTTTC TGGACTCCATGTATATTTCAAATGCTG TCAAACGTCCTATCTATCGAACTCTTGCCGTTGTAAATGACATACGCAAAGCCTGTGGGACGGCGCCAA AACGCATTTTTATCGAAATGGCAAGAGATGGGGAAAGCAAAAAGAAAAGGAGCGTAACAAGAAGAGA ACAAATCAAGAATCTTTATAGGTCCATCCGCAAGGATTTTCAGCAGGAGGTAGATTTCCTTGAAAAAAT CCTTGAAAACAAAAGCGATGGACAGCTGCAAAGCGATGCGCTCTATCTATACTTTGCGCAGCTTGGAAG GGATATGTATACCGGGGACCCTATCAAGTTGGAGCATATCAAGGACCAGTCCTTCTATAATATTGATCAT ATCTATCCCCAAAGCATGGTCAAGGACGATAGTCTTGATAACAAGGTGTTGGTTCAATCGGAAATTAAT GGAGAGAAGAGCAGTCGATATCCTCTTGATGCTGCTATCCGTAATAAAATGAAGCCTCTTTGGGATGCTT ATTATAACCATGGCCTGATTTCCCTCAAGAAGTATCAGCGTTTGACGCGGAGCACTCCCTTTACAGATGA TGAAAAGTGGGATTTCATCAATCGGCAGCTTGTTGAGACAAGACAATCCACGAAGGCCTTGGCAATCTT ACTAAAAAGGAAGTTCCCTGATACGGAGATTGTCTACTCCAAGGCAGGGCTTTCTTCTGATTTTCGGCAT GAGTTTGGTCTCGTAAAATCGAGGAATATCAATGACCTGCACCATGCAAAGGACGCATTTCTTGCGATT GTAACAGGAAATGTCTATCATGAACGCTTTAATCGCCGGTGGTTTATGGTGAACCAGCCCTATTCCGTCA AGACCAAGACGTTGTTTACGCATTCTATTAAAAATGGTAATTTTGTAGCTTGGAATGGAGAAGAGGATC TTGGCCGCATTGTTAAAATGTTAAAGCAAAATAAGAACACTATTCATTTCACGCGGTTCTCTTTTGATCG AAAGGAAGGCCTGTTTGATATTCAGCCACTAAAAGCGTCAACCGGTCTTGTACCAAGAAAAGCCGGACT AGACGTGGTAAAATATGGTGGCTATGACAAATCGACAGCAGCTTATTATCTCCTTGTTCGATTTACACTA GAAGATAAAAAGACTCAACATAAATTGATGATGATTCCTGTAGAAGGCTTGTATAAAGCTCGAATTGAC CATGATAAGGAATTCTTAACGGACTATGCACAAACTACAATCAGTGAAATCCTACAAAAAGATAAACAA AAGGTGATAAATATAATGTTTCCAATGGGAACAAGGCACATTAAACTGAATTCCATGATTTCAATCGAT GGTTTTTATCTTTCCATTGGAGGAAAGTCTAGTAAGGGAAAAT CGGTGTTGTGTCATGCTATGGTACCTC TTATTGTACCTCATAAGATAGAATGTTATATTAAGGCGATGGAGTCTTTTGCACGTAAATTTAAAGAAAA TAATAAATTAAGGATTGTGGAAAAGTTTGATAAGATTACGGTGGAAGATAACTTGAACCTATACGAACT ATTTTTACAAAAACTTCAACATAACCCATATAATAAGTTCTTCTCCACACAATTTGATGTGCTGACTAAT GGAAGAAGTACATTTACTAAATTATCTCCAGAGGAACAAGTTCAAACGTTATTGAATATCTTATCAATTT TTAAAACTTGTCGGAGCTCTGGCTGCGATTTAAAATCCATTAACGGTTCTGCTCAAGCTGCCAGAATTAT GATCAGCGCAGATTTAACTGGACTCTCAAAAAAATATTCCGATATTCGGCTTGTTGAGCAATCAGCATCT GGACTTTTTGTTAGTAAATCACAAAATCTTTTGGAGTATTTAtga SEQ atgtcttcattaacaaaatttacaaataaatacagtaagcagctaaccataaaaaatgaactcatcccagt ID aggaaagactctcgagaacattaaggaaaacggtctcatagatggagatgaacagctaaacgagaattatc NO : aaaaagcaaagataatcgttgatgattttctacgagatttcataaataaagctttaaataatacccaaata 22 ggaaattggagagaattagcagatgctttaaataaagaagatgaagataacatagaaaagctccaagacaa aatcagaggaataattgtaagtaaattcgagacatttgatttgttttcttcttactcgataaagaaagacg aaaagataatagatgatgataatgatgttgaagaagaggagct agatctaggaaaaaaaacttcctcattt aaatatatttttaagaaaaacctttttaaattagtacttccttcttatttaaagacaacaaatcaggataa actgaaaataatctcttcttttgataatttttctacctatttcagaggattctttgagaacagaaaaaata ttttcactaagaagcctatatctacgtcaattgcctacagaattgtccatgataactttccaaagtttcta gataacatcagatgttttaatgtgtggcaaacagaatgcccacagttaattgtaaaggctgataattattt aaaatcaaagaacgtcatagctaaagataaatctttagcaaactattttactgt aggagcatatgattact atcttatcccagaatggcattgatttctacaacaacattatcggcggtctaccagcatttgctggtcatga gaaatccaaggacttaatgaatttataaatcaagaatgccaaaaggacagcgaactaaaatctaaactgaa aaacagacatgctttcaaaatggctgttctatttaagcaaattctttcagatagagaaaaaagttttgtta tagacgagttcgaatctgatgctcaggtcatagatgcggttaagaacttctatgcagaacaatgtaaggat aataatgttatttttaaccttctaaatcttatcaagaatatagcgttcttatctgatgatgaattagatgg aatttttatagaaggcaagtatttaagctctgtttcccaaaagctatattcagattggtcgaagcttcgaa atgatattgaagatagtgcaaacagtaaacaaggaaataaagagttagcaaagaaaattaaaacaaataaa ggcgatgttgaaaaggccataagtaaatatgagttttctttatcagaacttaactcaattgtacatgataa tacaaaattcagtgaccttctttcttgtacgttacataaagtggctagcgaaaaactagtgaaagttaatg aaggggactggccaaaacacctgaaaaataatgaagaaaaacaaaagataaaagagcctttagatgcattg ttagaaatttataatacattgctgatattcaactgcaagtcatttaataagaacggtaatttctatgttga ttatgacagatgcataaatgagctttctagtgttgtttatttatataacaaaacaagaaattactgtacaa agaaaccttataacacagacaaattcaaattaaactttaacagtcctcaattaggagagggctttagtaag tcgaaagaaaatgactgtctgacattattatttaaaaaagacgacaattactatgttggaattatcagaaa aggggcaaaaattaactttgatgatacacaagccattgcagacaatacagataactgtatatttaagatga attatttcctattaaaagatgctaaaaagtttattcctaaatgttcaattcagttaaaagaagtaaaagca US 10 , 435 , 714 B2 73 74 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence cattttaaaaaatcagaggatgattatatcctgagtgacaaagaaaaatttgcctctccccttgttattaa gaaatcaacatttttattagcaacagcacatgtaaaaggaaagaaaggaaacataaaaaaattccaaaagg aatattctaaggaaaatccaacagaatatagaaattctctgaatgaatggattgcattttgtaaagaattt ctaaaaacatataaggeggcaacaatctttgacattacaacgttaaaaaaagctgaagaatatgctgatat tgttgagttttataaggatgtagataatctttgttataaactagagttttgccctattaaaacatctttca ttgagaatcttattgataatggggacttatatttattcagaatcaataataaagatttcagttcaaaatct actggtacaaagaatcttcatacgctctatcttcaggcaatctttgatgaaagaaacctcaataatcctac tattatgttaaatggcggagcagagttattttatcgaaaagaaagcattgaacagaaaaataggataacto ataaggcaggatcaattcttgtaaacaaggtttgtaaggatggaacaagtctagatgacaaaatcagaaac gaaatatatcaatatgaaaacaagtttattgatacattgtctgatgaagctaaaaaagttttacctaatgt aataaaaaaagaagcaactcacgacataacaaaagataagcgatttacatcagataagttctttttccatt gcccattaacaattaactataaggaaggagatacaaaacaatttaacaatgaggtaatttatctttcctta gaggtaatccagacattaatatcatcggaattgacagaggaga aagaccttatatacgtaactgttattaa tcaqaa aggcgaaatacttgacagcgtttcgtttaacacagtaacaaacaagtcgagcaaaattgaacaaa ctgttgattatgaggaaaagcttgctgttagggaaaaagaaagaatagaagcaaaaagatcctgggattca atatcaaagatagcaaccttaaaagaaggttatctatcagctattgttcatgagatatgcctactgatgat caaacacaacgcaatcgttgtacttgagaatetaaatgcaggatttaagagaattagaggaggattatcag aaaagtctgtttatcagaaattcgagaagatgcttattaacaaactaaattactttgtatctaaaaaagaa tcagactggaataaacctagtggacttttaaatggtttacaactttcagaccagttcgagtcatttgagaa attaggaattcaatctgggttcatcttctatgttcctgcagcatatacatctaagattgatcctacaacag gatttgcaaatgttcttaacttatccaaggtaagaaatgttgatgcaataaagagttttttcagtaattto aatgaaatttcatatagcaaaaaagaagctctctttaaattctcttttgatttagattccttatcaaagaa gggcttcagctcatttgtaaaattcagtaaatctaaatggaatgtatatacatttggagagagaataataa aaccaaagaataagcaagggtatcgtgaagataagagaattaatttaacatttgaaatgaaaaaacttctg aatgaatataaagtaagttttgatcttgaaaacaacttaattccaaatctaacctctgcaaatctgaaaga taccttctggaaagaactattctttatttttaaaacaactctgcagcttagaaacagtgtaacaaatggca aagaagatgtactgatttctccagtaaagaacgctaaaggagagttctttgtatcaggaactcataacaag acattacctcaagactgtgatgcaaatggagcatatcatatcgccctaaaaggtctgatgattcttgaacg taacaatcttgttagagaagaaaaagacacaaagaagataatggcaatttctaatgttgactggtttgagt atgttcaaaaaaggagaggtgtcctgtaa SEQ ATGAACAACTATGATGAGTTTACCAAACTGTACCCAATACAGAAAACGATAAGGTTCGAATTGAAGCCG ID CAGGGAAGAACGATGGAACACCTCGAAACATTCAACTTTTTCGAAGAGGACAGGGATAGAGCGGAGAA NO : ATATAAGATTTTAAAGGAAGCAATCGACGAGTATCATAAGAAGTTTATAGACGAACATCTAACAAATAT 23 GTCTCTTGACTGGAATTCTTTAAAACAGATTTCAGAGAAATACTATAAGAGTAGAGAGGAAAAAGACAA GAAAGTTTTTCTGTCAGAACAGAAACGCATGAGGCAAGAGATAGTTTCTGAGTTCAAAAAAGACGATCG GTTTAAAGATCTTTTTTCAAAAAAATTGTTTTCTGAACTTCTCAAGGAAGAGATTTACAAAAAAGGAAAC CATCAGGAAATTGACGCATTGAAAAGTTTTGATAAATTCTCAGGCTATTTTATTGGGTTGCATGAGAACC GAAAAAATATGTATTCTGACGGAGACGAGATCACGGCTATCTCTAACCGTATTGTAAATGAGAATTTCC CGAAGTTCCTCGACAACCTTCAGAAATATCAGGAAGCTCGTAAAAAATATCCAGAGTGGATCATTAAGG CAGAATCTGCTTTAGTTGCACATAATATCAAGATGGATGAAGTCTTTTCCTTAGAGTATTTCAACAAAGT CCTGAATCAAGAAGGAATACAGAGATACAATCTCGCCCTAGGTGGCTATGTGACCAAAAGTGGTGAGA AAATGATGGGGCTTAATGATGCACTTAATCTTGCCCATCAAAGTGAAAAAAGCAGCAAGGGAAGGATA CACATGACTCCACTCTTCAAACAGATTCTGAGTGAAAAAGAGTCCTTTTCTTATATACCAGATGTTTTTA CAGAAGACTCTCAACTTTTACCATCCATTGGTGGGTTCTTTGCACAAATAGAAAATGATAAGGACGGGA ATATTTTTGACAGAGCATTAGAATTGATATCTTCTTATGCAGAATACGATACAGAAAGGATATATATCAG GCAAGCGGACATAAACAGAGTTTCTAATGTTATTTTCGGGGAGTGGGGAACACTGGGGGGGTTAATGAG GGAATACAAAGCAGACTCTATCAACGACATCAATTTGGAGAGAACATGCAAGAAGGTAGACAAGTGGC TCGACTCAAAGGAGTTTGCGTTATCAGATGTATTAGAGGCAATAAAAAGAACCGGCAATAATGATGCTT TTAATGAATATATCTCAAAGATGCGCACTGCCAGGGAAAAGAT TGACGCTGCAAGAAAGGAAATGAAA TTCATTTCGGAAAAAATATCTGGAGACGAAGAATCGATCCATATTATCAAAACCTTATTGGACTCGGTGC AACAGTTTTTACATTTTTTCAATTTATTCAAAGCGCGTCAGGACATTCCTCTTGATGGAGCATTCTATGCG GAGTTCGATGAAGTCCATAGCAAACTGTTTGCTATTGTTCCGT TGTATAATAAGGTTAGGAACTATCTTA CGAAAAATAACCTTAACACGAAAAAGATAAAGCTAAACTTCAAGAATCCAACTCTGGCAAACGGATGG GATCAAAACAAGGTATATGACTACGCCTCCTTAATCTTTCTCCGCGATGGTAATTATTATCTCGGAATAA TAAATCCAAAAAGGAAAAAGAATATTAAATTCGAACAAGGGTCTGGAAATGGCCCATTCTACCGGAAG ATGGTGTACAAACAAATTCCAGGGCCGAACAAGAACTTACCAAGAGTCTTCCTCACATCTACGAAAGGC AAAAAAGAGTACAAGCCGTCAAAGGAGATAATAGAAGGATATGAAGCGGACAAACACATAAGAGGAG ATAAATTCGATCTGGATTTCTGTCATAAGCTGATAGACTTCTT CAAGGAATCCATCGAGAAGCACAAGG ACTGGAGTAAGTTCAACTTCTATTTCTCTCCAACTGAATCATATGGAGACAT CAGCGAATTCTATCTGGA TGTAGAAAAACAGGGATACCGGATGCATTTTGAGAATATTTCTGCCGAGACGATTGATGAGTATGTCGA AAAGGGGGACTTATTCCTCTTCCAGATATACAACAAAGACTTTGTGAAAGCGGCAACCGGAAAAAAAG ATATGCACACCATTTATTGGAACGCGGCATTCTCGCCCGAGAACCTTCAGGATGTGGTAGTGAAACTGA ACGGTGAAGCAGAACTTTTCTACAGAGACAAGAGCGACATCAAGGAGATAGTTCACAGGGAGGGAGAG ATACTGGTCAATCGTACCTACAACGGCAGGACACCTGTGCCTGACAAGATCCACAAAAAATTAACAGAT TATCATAATGGCCGTACCAAAGATCTCGGAGAAGCAAAAGAATACCTCGATAAGGTCAGATATTTCAAA GCGCACTACGACATCACAAAGGATCGCAGATACCTGAATGATAAAATATACTTCCATGTGCCTCTGACA TTGAATTTCAAAGCAAACGGGAAGAAGAATCTCAATAAGATGGTAATTGAAAAGTTCCTCTCGGACGAA AAAGCGCATATTATTGGGATTGATCGCGGGGAAAGGAATCTTCTTTACTATTCTATCATTGACAGGTCAG GTAAAATAATCGATCAACAGAGCCTCAACGTCATCGATGGATTCGATTACCGAGAGAAACTGAATCAGA GGGAGATCGAGATGAAGGATGCCAGACAAAGCTGGAATGCTATCGGGAAGATAAAGGACCTCAAGGAA GGGTATCTTTCAAAAGCGGTCCACGAAATTACCAAGATGGCGATACAATACAATGCCATTGTTGTCATG GAGGAACTCAATTATGGGTTCAAACGCGGACGTTTCAAAGTTGAGAAGCAGATATATCAGAAATTCGAG AATATGCTGATTGACAAGATGAATTATCTGGTATTCAAGGATGCTCCGGATGAAAGTCCGGGAGGAGTC CTCAATGCATATCAGCTTACTAATCCGCTTGAAAGTTTCGCTAAACTTGGGAAACAGACAGGAATTCTTT TCTATGTTCCGGCAGCCTATACTTCGAAGATAGATCCGACGACCGGGTTTGTCAATCTTTTCAATACTTC US 10 , 435 , 714 B2 75 76 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence AAGTAAAACGAACGCACAGGAAAGAAAAGAATTCTTGCAAAAATTCGAGTCGATCTCCTATTCCGCTAA AGACGGAGGAATATTCGCATTCGCGTTCGATTATCGGAAGTTCGGAACGTCAAAAACAGACCACAAAAA TGTATGGACCGCATACACGAACGGGGAAAGGATGAGGTACATAAAAGAGAAAAAACGCAACGAACTGT TCGACCCCTCGAAGGAGATCAAAGAGGCTCTCACTTCATCAGGAATCAAATATGACGGCGGACAGAACA TATTGCCAGATATCCTGAGGAGCAACAATAACGGTCTGATCTACACAATGTATTCCTCTTTCATAGCGGC CATTCAAATGAGGGTCTATGACGGGAAAGAAGACTATATCATCTCGCCGATAAAGAACAGCAAGGGAG AGTTCTTCAGGACCGATCCGAAAAGAAGGGAACTTCCGATAGACGCGGATGCGAACGGCGCGTATAAC ATTGCTCTCAGGGGCGAATTGACGATGCGTGCGATAGCGGAGAAGTTCGATCCGGACTCGGAAAAGATG GCGAAGCTAGAACTGAAACATAAGGACTGGTTCGAATTCATGCAGACAAGGGGGGATTGA SEQ ATGACAAAAACATTTGATTCAGAATTTTTTAATTTATATTCTCTTCAAAAAACAGTTCGTTTTGAACTCAA ID GCCGGTTGGTGAAACAGCCTCGTTTGTTGAAGATTTTAAAAACGAAGGTTTGAAACGAGTTGTTTCAGA NO : GGATGAACGGCGTGCGGTTGATTACCAAAAAGTGAAAGAAATTATTGATGACTACCACCGAGATTTTAT 24 TGAAGAATCGCTGAACTATTTTCCTGAGCAGGTCTCAAAAGACGCTTTGGAACAAGCTTTTCACCTTTAT CAAAAACTAAAAGCCGCTAAGGTTGAAGAGCGTGAAAAAGCATTGAAAGAATGGGAAGCCCTTCAGAA AAAACTGCGCGAAAAAGTTGTTAAATGTTTTTCAGATTCAAACAAAGCACGCTTTTCCCGCATTGATAAA AAAGAACTGATTAAAGAAGATTTAATTAACTGGTTGGTTGCACAAAATCGCGAAGATGACATTCCAACC GTTGAAACCTTTAACAACTTTACGACTTATTTTACGGGGTTTCATGAAAACCGAAAAAACATTTATTCAA AAGACGATCATGCCACAGCCATTTCATTTCGACTCATTCATGAAAACCTGCCTAAGTTTTTTGATAATGT GATCAGCTTTAATAAATTGAAGGAAGGATTTCCAGAGCTGAAATTTGATAAGGTTAAGGAAGATTTAGA AGTTGATTATGACTTGAAACATGCCTTTGAAATCGAATACTTTGTCAATTTTGTTACCCAAGCCGGAATT GACCAATATAACTATCTTTTGGGGGGTAAAACCTTAGAAGACGGCACCAAAAAGCAAGGCATGAATGA ACAAATCAATCTGTTCAAGCAACAGCAAACCCGAGACAAAGCCCGACAAATTCCCAAACTCATACCATT GTTTAAACAAATTCTAAGCGAACGAACGGAAAGCCAATCGTTTATTCCAAAACAATTTGAATCAGACCA AGAGCTATTTGACTCACTGCAAAAACTGCATAACAACTGCCAAGATAAATTTACCGTACTGCAACAAGC CATTTTAGGCTTAGCCGAAGCAGATCTGAAAAAAGTATTCATTAAAACATCTGATCTTAATGCGCTATCA AATACCATTTTTGGAAATTACAGTGTGTTTTCGGATGCGTTGAATTTATACAAAGAATCGCTCAAAACAA AAAAGGCGCAAGAAGCGTTTGAAAAACTACCCGCTCACAGCATTCATGACTTGATTCAATATTTGGAGC AATTTAATAGCTCTTTGGATGCAGAAAAACAGCAATCAACTGACACCGTACTGAATTACTTTATTAAAAC AGACGAGCTGTATTCTCGGTTCATAAAATCAACGAGCGAAGCCTTCACACAAGTACAACCACTCTTTGA ATTGGAAGCATTAAGCTCAAAACGTCGTCCACCGGAAAGTGAAGACGAAGGCGCAAAAGGTCAGGAAG GGTTTGAGCAAATTAAACGCATAAAAGCCTATTTGGATACCTTGATGGAGGCGGTGCATTTTGCAAAAC CACTTTATCTGGTGAAGGGGCGCAAAATGATTGAAGGTCTGGACAAAGACCAAAGTTTCTATGAAGCCT TTGAAATGGCTTACCAAGAACTAGAAAGTCTGATTATTCCAAT CTACAACAAAGCTCGTAGTTATTTAAG TCGTAAACCGTTTAAAGCGGACAAATTCAAAATTAATTTTGATAATAATACATTGCTTTCCGGTTGGGAT GCTAATAAAGAAACGGCTAACGCTTCAATTTTGTTTAAGAAGGATGGTTTGTATTATTTAGGAATCATGC CTAAAGGAAAAACGTTTTTGTTCGATTACTTCGTTTCATCGGAAGATTCTGAAAAGTTAAAACAAAGAA GACAAAAAACCGCCGAAGAAGCGCTTGCGCAAGATGGCGAAAGCTACTTTGAAAAAATTCGTTACAAG CTGTTACCTGGCGCCAGCAAAATGTTGCCGAAAGTATTTTTTTCCAACAAAAACATAGGGTTTTACAACC CAAGTGATGACATACTTCGTATCAGGAATACAGCCTCTCACACTAAAAACGGAACACCGCAAAAAGGGC ACTCTAAAGTAGAGTTTAATTTGAATGATTGTCATAAGATGATTGATTTCTTTAAATCAAGCATTCAAAA GCATCCAGAGTGGGGAAGTTTTGGATTCACCTTTTCAGATACATCAGATTTTGAAGATATGAGCGCCTTT TATCGAGAAGTCGAAAACCAAGGTTATGTCATTAGTTTCGATAAAATAAAAGAAACTTACATTCAGAGT CAAGTTGAACAGGGGAACCTATATTTATTCCAAATCTACAATAAAGACTTCTCGCCCTACAGCAAAGGC AAACCAAATTTACACACGCTTTACTGGAAAGCGTTGTTTGAGGAAGCCAACCTAAATAATGTGGTGGCA AAACTCAATGGTGAAGCTGAAATTTTCTTTAGGCGACACTCAATCAAAGCATCTGATAAAGTGGTGCAC CCAGCGAATCAAGCCATTGACAATAAAAACCCGCATACCGAAAAAACGCAAAGCACCTTTGAATATGAT CTTGTAAAAGACAAGCGCTATACCCAAGACAAATTCTTCTTCCATGTACCGATTTCATTGAACTTTAAGG CACAAGGTGTTTCAAAATTTAACGATAAAGTGAATGGATTTTTAAAGGGTAACCCAGATGTCAATATTA TTGGCATTGACCGAGGCGAACGACACCTTCTGTATTTCACTGTGGTGAATCAGAAAGGTGAAATTTTGGT TCAAGAGTCGCTTAATACCCTAATGAGTGATAAAGGGCATGTGAATGACTACCAGCAAAAACTCGACAA AAAAGAACAAGAACGCGATGCCGCTCGCAAAAGCTGGACGACGGTTGAAAATATCAAAGAATTAAAAG AAGGCTATTTATCTCATGTTGTTCATAAGTTGGCACACCTGAT TATTAAATACAATGCCATTGTTTGCTTG GAAGACCTGAATTTTGGTTTCAAACGCGGGCGTTTTAAAGTGGAAAAACAAGTTTATCAGAAATTTGAA AAAGCGCTTATTGATAAGCTTAACTACTTGGTATTTAAAGAAAAAGAGTTAGGCGAGGTGGGCCATTAT CTAACCGCCTATCAGTTGACCGCACCGTTTGAAAGTTTCAAGAAGTTAGGCAAGCAAAGTGGCATATTG TTTTATGTTCCGGCGGATTACACCTCCAAAATTGACCCAACCACCGGGTTTGTCAACTTTCTTGATCTGCG TTATCAGAGTGTCGAAAAAGCCAAACAGCTCTTAAGCGACTTTAATGCCATTCGTTTTAATTCAGTACAA AACTATTTTGAGTTCGAAATAGATTACAAAAAACTCACACCCAAACGTAAAGTTGGTACTCAGAGTAAA TGGGTGATTTGTACCTATGGAGATGTCCGCTATCAAAATCGGCGTAATCAAAAAGGTCACTGGGAAACG GAAGAAGTCAATGTGACTGAAAAACTAAAAGCCCTTTTCGCCAGTGATTCCAAAACTACAACCGTAATC GATTACGCCAATGACGACAACCTAATTGACGTCATTCTGGAACAGGACAAAGCCAGCTTCTTCAAAGAA CTGTTATGGTTATTAAAACTCACCATGACGCTCCGCCACAGCAAAATCAAAAGTGAAGACGACTTTATTC TTTCACCCGTTAAAAACGAACAAGGCGAGTTTTACGATAGTCGAAAAGCGGGCGAGGTGTGGCCTAAAG ATGCAGACGCCAATGGCGCTTATCACATAGCGTTGAAAGGCTTGTGGAATCTGCAACAGATCAATCAGT GGGAAAAGGGTAAAACACTTAATCTGGCGATTAAAAACCAGGATTGGTTCAGTTTTATTCAAGAAAAGC CCTATCAAGAATAA SEQ ATGCACACAGGCGGATTACTTAGCATGGATGCCAAGGAGTTTACCGGACAGTACCCCCTTTCGAAGACT ID CTGCGTTTTGAACTGAGACCGATAGGCAGAACGTGGGACAATCTCGAAGCATCGGGGTATCTTGCGGAG NO : GACAGACACCGTGCAGAATGCTATCCCAGGGCAAAAGAGCTCTTGGACGACAACCATCGTGCATTCCTC 25 AACCGTGTCCTGCCTCAGATCGATATGGATTGGCACCCGATCGCAGAGGCATTCTGCAAAGTCCACAAG AATCCGGGAAACAAGGAATTGGCTCAGGATTACAATCTTCAGC TGTCCAAACGCAGAAAGGAGATTTCG GCCTATCTGCAGGATGCGGACGGCTATAAAGGTCTGTTTGCCAAACCTGCATTGGATGAAGCAATGAAG ATCGCGAAAGAAAACGGAAATGAATCGGACATAGAGGTTCTTGAGGCATTCAACGGTTTCTCCGTATAC TTCACCGGATATCATGAGAGCAGGGAGAACATCTATTCGGACGAGGATATGGTGTCGGTAGCTTATCGC US 10 , 435 , 714 B2

TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence ATCACCGAAGACAATTTCCCGAGATTCGTTTCCAATGCGCTTATATTCGATAAGCTGAATGAGTCGCACC CCGATATAATCTCGGAAGTATCCGGAAATCTGGGCGTAGACGACATCGGAAAATATTTTGATGTGTCTA ACTACAATAATTTCCTGTCGCAGGCCGGTATAGATGACTACAATCACATCATCGGCGGCCATACGACGG AGGACGGTCTGATCCAGGCATTCAATGTTGTTCTGAATCTCAGGCATCAGAAAGACCCCGGATTCGAAA AAATCCAATTCAAACAGCTGTACAAACAGATACTCAGCGTCCGTACATCCAAATCCTATATCCCGAAAC AGTTCGATAATTCGAAGGAGATGGTGGACTGCATCTGCGACTATGTGTCCAAGATCGAAAAATCCGAAA CGGTCGAGAGAGCATTGAAGCTGGTAAGGAACATATCTTCTTTTGATTTGCGCGGAATATTCGTAAACA AGAAGAATCTCCGCATTCTTTCCAACAAACTGATTGGTGATTGGGACGCGATCGAAACCGCGCTGATGC ACTCCTCCTCTTCGGAAAATGATAAGAAATCCGTCTACGACAGCGCCGAGGCATTTACGCTGGATGATA TCTTTTCGTCCGTTAAAAAATTCTCAGATGCATCTGCAGAGGATATCGGAAACCGGGCGGAGGACATAT GCAGAGTCATATCTGAGACCGCTCCGTTCATAAACGATCTGAGGGCTGTCGATTTGGACAGTTTGAATG ACGACGGTTACGAGGCGGCGGTTTCCAAGATAAGGGAATCTCTGGAACCATATATGGATCTGTTTCATG AACTGGAGATATTCTCCGTAGGCGATGAATTCCCGAAATGTGCAGCTTTCTACAGTGAACTTGAAGAAG TCTCCGAACAGCTAATCGAGATTATACCGTTATTCAACAAGGCCCGTTCGTTCTGTACGCGCAAGAGATA CAGTACGGACAAGATAAAGGTCAATTTGAAATTCCCGACACTCGCCGACGGATGGGATCTCAACAAAGA ACGCGACAACAAAGCCGCAATACTCAGGAAAGACGGAAAGTACTACCTGGCCATACTGGATATGAAGA AAGATCTTTCTTCGATCAGAACTTCGGATGAAGACGAATCCAGTTTTGAGAAAATGGAGTACAAGCTTC TTCCGAGTCCGGTAAAGATGCTGCCAAAGATCTTCGTAAAATCGAAGGCGGCCAAGGAGAAGTACGGTC TGACCGACCGTATGCTGGAGTGCTACGATAAAGGGATGCACAAGAGCGGCAGTGCATTCGATCTCGGAT TTTGTCACGAATTGATCGATTACTACAAGAGGTGCATCGCAGAATATCCCGGCTGGGACGTCTTCGATTT CAAGTTCAGGGAAACATCGGATTATGGCAGCATGAAGGAGTTCAATGAGGATGTTGCAGGGGCCGGAT ACTATATGTCCCTCAGAAAGATCCCTTGTTCGGAGGTCTACAGGCTTCTTGATGAGAAATCGATATATCT TTTCCAGATCTACAACAAAGATTATTCGGAAAACGCTCATGGGAATAAGAACATGCATACCATGTATTG GGAAGGGCTCTTTTCCCCCCAGAATCTGGAATCCCCTGTGTTTAAACTCAGCGGCGGTGCGGAGCTTTTC TTCCGTAAATCCTCCATACCCAATGACGCCAAAACGGTCCATCCGAAGGGAAGCGTCCTGGTTCCGCGC AATGATGTAAACGGCCGCAGGATACCTGACAGCATATATCGGGAGCTCACCAGATATTTCAACCGCGGA GATTGCCGCATAAGCGACGAGGCAAAGAGTTATCTGGACAAGGTGAAAACCAAGAAAGCTGACCACGA TATCGTGAAAGACAGGAGGTTCACGGTGGACAAGATGATGTTCCACGTCCCTATCGCCATGAATTTCAA AGCGATTTCGAAGCCGAATCTCAATAAAAAGGTGATTGACGGCATAATCGACGACCAAGATCTGAAGAT CATCGGCATAGACCGCGGAGAGCGCAACCTCATCTACGTAACCATGGTGGATCGCAAAGGGAACATCCT CTATCAGGATAGCCTCAATATTCTGAACGGATACGATTACCGTAAGGCCCTCGACGTCCGCGAATATGA CAATAAAGAGGCTCGGAGGAACTGGACGAAGGTCGAAGGCATCCGTAAGATGAAAGAGGGGTATCTGT CGCTTGCAGTCAGCAAATTGGCAGATATGATCATAGAGAACAATGCGATTATCGTCATGGAGGATCTCA ATCACGGATTCAAGGCAGGGCGTTCGAAGATAGAGAAACAGGTCTATCAGAAGTTCGAATCCATGCTCA TAAACAAACTCGGTTACATGGTCCTCAAGGATAAGTCTATCGATCAGAGCGGCGGAGCTCTCCACGGAT ACCAGCTTGCCAACCATGTGACAACATTGGCATCTGTAGGTAAACAATGTGGAGTGATATTCTACATCCC TGCTGCATTTACATCCAAGATAGATCCGACAACAGGATTTGCAGATCTGTTCGCCCTCAGCAATGTTAAA AACGTGGCATCTATGAGAGAATTTTTCTCCAAGATGAAGTCTGTAATCTATGATAAGGCGGAGGGAAAA TTCGCATTTACCTTCGACTATCTTGATTATAATGTGAAATCCGAGTGCGGAAGGACCCTTTGGACCGTGT ATACGGTCGGAGAGAGATTCACATACAGCAGGGTCAATAGAGAATATGTCAGAAAAGTTCCGACAGAC ATAATCTACGACGCATTGCAAAAGGCAGGAATATCTGTTGAAGGGGATCTCAGGGACAGGATTGCTGAA TCGGATGGCGACACTCTGAAGAGCATATTCTATGCATTCAAGTATGCATTGGATATGAGAGTAGAGAAC CGCGAAGAGGATTACATACAGTCTCCTGTCAAAAATGCCTCCGGAGAATTCTTCTGTTCCAAGAACGCA GGCAAATCGCTCCCTCAGGATTCCGATGCGAACGGTGCATACAATATCGCACTCAAGGGGATCCTGCAG CTACGTATGCTTTCCGAGCAGTATGATCCGAATGCAGAGAGCATACGGTTGCCACTGATAACCAACAAG GCCTGGCTGACCTTTATGCATCCGGTATGAAGACATGGAAGAACTGA SEQ atgGATAGTTTGAAAGATTTCACCAATCTGTACCCTGTCAGTAAGACATTGAGATTTGAATTAAAGCCCGT ID AATATCGAGAAAGCAGGTATTTTGAAAGAGGATGAGCATCGTGCAGAAAGTT NO : ATCGGAGGGTGAAGAAAATAATTGATACTTATCATAAGGTATTTATCGATTCTTCTCTTGAAAATATGGC 26 TAAAATGGGTATTGAGAATGAAATAAAAGCAATGCTCCAAAGTTTCTGCGAATTGTATAAAAAAGATCA TCGCACTGAGGGTGAAGACAAGGCATTAGATAAAATTCGAGCAGTACTTCGTGGCCTGATTGTTGGGGC TTTCACTGGTGTTTGCGGAAGACGGGAAAATACAGTCCAAAACGAGAAGTACGAGAGTTTGTTCAAAGA AAAGTTGATAAAAGAAATTTTACCTGATTTTGTGCTCTCTACTGAGGCTGAAAGCTTGCCTTTCTCTGTTG AAGAAGCTACGAGGTCACTGAAGGAGTTTGATAGCTTTACATCCTACTTTGCTGGTTTTTACGAGAATAG AAAGAATATATACTCGACGAAACCTCAATCCACTGCCATTGCTTATCGTCTTATTCATGAGAACTTGCCG AAGTTCATTGATAATATTCTTGTTTTTCAGAAGATCAAAGAGCCTATAGCCAAAGAGCTGGAACATATTC GTGCGGACTTTTCTGCCGGGGGGTACATAAAAAAGGATGAGAGATTGGAGGATATTTTTTCGTTGAACT ATTATATCCACGTGTTATCTCAGGCTGGGATCGAAAAATATAACGCATTGATTGGGAAGATTGTGACAG AAGGAGATGGAGAGATGAAAGGGCTCAATGAACACATCAACCTTTACAACCAACAAAGAGGCAGAGAG GATCGGCTCCCTCTTTTTAGGCCTCTTTATAAACAGATATTGAGTGACAGAGAGCAATTATCATACTTGC CTGAGAGTTTTGAAAAAGATGAGGAGCTCCTCAGGGCTCTAAAAAGTTCTATGATCATATCGCAGAAG ACATTCTCGGACGTACTCAACAGTTGATGACTTCTATTTCAGAATATGATTTATCTCGGATATACGTAAG GAACGATAGCCAATTGACTGATATATCAAAAAAAATGTTGGGAGATTGGAATGCTATCTACATGGCTAG AGAACGAGCATATGACCACGAGCAGGCTCCCAAAAGAATCACGGCGAAATACGAGAGGGACAGGATTA AAGCTCTTAAAGGAGAAGAGAGTATAAGTCTGGCAAATCTTAATAGTTGTATTGCCTTTCTGGACAATGT TAGAGATTGCCGTGTAGATACTTATCTTTCCACACTGGGCCAGAAGGAAGGACCACATGGTCTATCTAAT CTCGTTGAGAACGTTTTTGCCTCATACCATGAAGCAGAGCAATTGTTGAGCTTTCCATACCCCGAAGAGA ATAATCTGATTCAGGACAAGGACAATGTGGTGTTAATTAAGAATCTTCTCGACAATATCAGTGATCTGCA GAGGTTCTTGAAACCTCTTTGGGGTATGGGAGACGAACCCGATAAAGATGAAAGATTTTATGGAGAGTA TAATTATATCCGAGGAGCTCTAGATCAGGTGATCCCTCTGTACAATAAGGTAAGGAACTACCTCACTCG GAAGCCTTATTCGACCAGAAAAGTAAAACTCAATTTTGGGAATTCTCAATTGCTTAGTGGTTGGGATAG AGGATAATAGCTGTGTGATTTTGCGTAAGGGGCAGAACTTCTATTTGGCTATTATGAA CAATAGGCACAAAAGAAGTTTCGAAAACAAGGTGTTGCCCGAGTATAAGGAGGGAGAACCTTACTTCG AAAAGATGGATTATAAATTTTTGCCTGATCCTAATAAAATGCTTCCTAAGGTTTTTCTTTCGAAAAAAGG AATAGAGATATACAAACCAAGTCCGAAGCTTTTAGAACAATATGGACATGGAACTCACAAAAAGGGAG US 10 , 435 , 714 B2 79 80 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence ATACCTTTAGTATGGATGATTTGCACGAACTGATCGATTTCTT CAAACACTCAATCGAGGCTCATGAAGA TTGGAAGCAATTCGGATTCAAATTTTCTGATACGGCTACTTATGAGAATGTATCTAGTTTCTATAGAGAA GTTGAGGATCAGGGGTATAAGCTCTCTTTCCGAAAAGTTTCGGAATCTTATGTCTATTCATTAATAGATC AAGGCAAGTTGTATTTATTTCAGATATACAACAAGGACTTTTCTCCCTGCAGCAAAGGGACACCTAATCT GCATACCTTGTATTGGAGAATGCTTTTTGACGAGCGCAATTTGGCAGATGTCATATACAAACTGGATGGG AAGGCTGAAATCTTTTTCCGAGAGAAGAGTTTGAAAAATGATCATCCCACGCATCCGGCTGGTAAGCCT ATCAAAAAGAAAAGTCGACAAAAAAAAGGAGAGGAGAGTCTGTTTGAGTATGATTTAGTCAAGGATAG GCACTATACGATGGATAAGTTCCAGTTTCATGTGCCTATTACTATGAATTTTAAATGTTCTGCAGGAAGC AAAGTCAATGATATGGTTAATGCTCATATTCGAGAGGCAAAGGATATGCATGTCATTGGAATTGATCGT GGAGAACGCAATCTGCTGTATATATGCGTGATAGATAGTCGAGGGACGATTTTGGATCAAATTTCTCTG AATACGATTAACGATATAGACTATCATGATTTATTGGAGAGTCGAGACAAAGACCGTCAGCAGGAGCGC CGAAACTGGCAAACTATCGAAGGGATCAAGGAGCTAAAACAAGGCTACCTTAGTCAGGCGGTTCATCG GATAGCCGAACTGATGGTGGCTTATAAGGCTGTAGTTGCTTTGGAGGATTTGAATATGGGGTTCAAACG TGGGCGGCAGAAAGTAGAAAGTTCTGTTTATCAGCAGTTTGAGAAACAGCTGATAGATAAGCTCAACTA TCTTGTGGACAAGAAGAAAAGGCCTGAAGATATTGGAGGATTGTTGAGAGCCTATCAATTTACGGCCCC ATTTAAGAGTTTTAAGGAAATGGGAAAGCAAAACGGCTTCTTGTTTTATATCCCGGCTTGGAACACGAG CAACATAGATCCGACTACTGGATTTGTTAATTTATTTCATGCCCAGTATGAAAATGTAGATAAAGCGAAG AGCTTCTTTCAAAAGTTTGATTCAATTAGTTACAACCCGAAGAAAGACTGGTTTGAGTTTGCATTCGATT ATAAAAACTTTACTAAAAAGGCTGAAGGAAGTCGTTCTATGTGGATATTATGCACACATGGTTCCCGAA TAAAGAATTTTAGAAATTCCCAGAAGAATGGTCAATGGGATTCCGAAGAATTCGCCTTGACGGAGGCTT TTAAGTCTCTTTTTGTGCGATATGAGATAGATTATACCGCTGATTTGAAAACAGCTATTGTGGACGAAAA GCAAAAAGACTTCTTCGTGGATCTTCTGAAGCTATTCAAATTGACAGTACAGATGCGCAACAGCTGGAA AGAGAAGGATTTGGATTATCTAATCTCTCCTGTAGCAGGGGCTGATGGCCGTTTCTTCGATACAAGAGA GGGAAATAAAAGTCTGCCTAAGGATGCAGATGCCAATGGAGCTTATAATATTGCCCTAAAAGGACTTTG GGCTCTACGCCAGATTCGGCAAACTTCAGAAGGCGGTAAACTCAAATTGGCGATTTCCAATAAGGAATG GCTACAGTTTGTGCAAGAGAGATCTTACGAGAAAGACtga

SEO atgaataatggaacaaataactttcagaattttatcggaatttcttctttgcagaagactcttaggaatgct ID ctcattocaaccgaaacaacacagcaatttattgttaaaaacggaataattaaagaagatgagctaagagga NO : gaaaatcgtcagatacttaaagatatcatggatgattattacagaggtttcatttcagaaactttatcgtca 27 attgatgatattgactggacttctttatttgagaaaatggaaattcagttaaaaaatggagataacaaagac actcttataaaagaacagactgaataccgtaaggcaattcata aaaaatttgcaaatgatgatagatttaaa aatatgttcagtgcaaaattaatctcagatattcttcctgaatttgtcattcataacaataattattctgca tcagaaaaggaagaaaaaacacaggtaattaaattattttccagatttgcaacgtcattcaaggactatttt aaaaacagggctaattgtttttcggctgatgatatatcttcatcttcttgtcatagaatagttaatgataat gcagagatattttttagtaatgcattggtgtataggagaattgtaaaaagtctttcaaatgatgatataaat aaaatatccggagatatgaaggattcattaaaggaaatgtctctggaagaaatttattcttatgaaaaatat ggggaatttattacacaggaaggtatatctttttataatgatatatgtggtaaagtaaattcatttatgaat ttatattaccagaaaaataaaqaaaacaaaaatctctataaqctqcaaaaqcttcataaacaqatactatge atagcagatacttcttatgaggtgccgtataaatttgaatcagatgaagaggtttatcaatcagtgaatgga tttttggacaatattagttcgaaacatatcgttgaaagattgcgtaagattggagacaactataacggctac aatcttgataagatttatattgttagtaaattctatgaatcagtttcacaaaagacatatagagattgggaa acaataaatactgcattagaaattcattacaacaatatattacccggaaatggtaaatctaaagctgacaag gtaaaaaaagcggtaaagaatgatctgcaaaaaagcattactgaaatcaatgagcttgttagcaattataaa ttatgttcggatgataatattaaagctgagacatatatacatgaaatatcacatattttgaataattttgaa gcacaggagcttaagtataatcctgaaattcatctggtggaaagtgaattgaaagcatctgaattaaaaaat gttctcgatgtaataatgaatgcttttcattggtgttcggttttcatgacagaggagctggtagataaagat aataatttttatgccgagttagaagagatatatgacgaaatatatccggtaatttcattgtataatcttgtg cgtaattatgtaacgcagaagccatatagtacaaaaaaaattaaattgaattttggtattcctacactagcg gatggatggagtaaaagtaaagaatatagtaataatgcaattattctcatgcgtgataatttgtactattta ggaatatttaatgcaaaaaataagcctgacaaaaagataattgaaggtaatacatcagaaaataaaggggat tataagaagatgatttataatcttctgccaggaccaaataaaatgatccccaaggtattcctctcttcaaaa accggagtggaaacatataagccgtctgcctatatattggagggctataaacaaaacaagcatattaaatcc totaaggattttgatataacattttgtcacgatttgattgattattttaagaactgtatagcaatacatcct gaatggaagaattttggctttgatttttctgacacctccacatatgaagatatcagoggattttacagagaa gtcgaattacaaggttataaaatcgactggacatatatcagcgaaaaggatattgatttgttgcaggaaaaa ggacagttatatttattccaaatatataacaaagatttttccaagaaaagtaccggaaatgataatcttcat actatgtatttgaagaatttgtttagtgaagagaatttaaaggatattgtactgaaattaaacggtgaggcg gaaatcttctttagaaaatcaagcataaagaatccaataattcataaaaaa acatatgaagcagaggaaaaagatcaatttggaaatatccagatagtcagaaaaaacataccggaaaatata tatcaggagctttataaatatttcaatgataaaagtgataaagaactttcggatgaagcagctaagcttaag aatgtagtaggtcatcatgaggctgctacaaacatagtaaaagattatagatatacatatgataaatatttt cttcatatgcctattacaatcaattttaaagccaataagacaggctttattaatgacagaatattacaatat attgctaaagaaaaggatttgcatgtaataggcattgatcgtggtgaaagaaacctgatatatgtttcagt . attgatacttgtggaaatattgttgaacaaaaatcgtttaacattgttaatggatatgattatcagattaag ctcaagcagcaggagggggcgcgacaaatcgcacgaaaagaatggaaagaaatcggcaaaataaaagaaatt aaagaaggctatttatctcttgtaattcatgaaatttcaaagatggttattaaatataatgccataattgca atggaggatttaagctacggatttaaaaaaggtcgtttcaaggttgagcgacaggtttaccagaagtttgag acaatgcttatcaacaaactcaactatctggtatttaaagatatatccataacggaaaacggtggtcttcta aagggataccagcttacatatattccagataaactgaaaaatgtgggtcatcaatgtggctgtatattttat gtacctgctgcctatacatcaaaaatagatcctacaaccggatttgtaaatatattcaaatttaaagattta acagttgatgcgaagagagaatttataaaaaaatttgacagtatcagatatgattcagaaaaaaatctgttt tgttttacattcgattataataactttattacgcaaaatactgttatgtcaaagtcaagctggagtgtatat acgtacggagttaggataaaaagaagatttgtcaatggcaggttctcaaatgaatcggatacaattgatata acaaaagatatggaaaaaacactcgaaatgacagatataaattggagagatggtcatgatctgaggcaggat attattgattatgaaatcgtacaacacatatttgagatttttagattgactgtacaaatgagaaacagttta US 10 , 435 , 714 B2 81 82 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence agtgaattagaagacagggattatgaccgtttgatttctccggtgctcaatgaaaataatatattttatgat tcagctaaagcaggagatgcgttacctaaagacgcagatgctaatggtgcatattgtatagctctaaaaggc ttgtatgaaatcaaacaaattacagagaattggaaagaagacggtaagttttcaagagataaacttaaaatt tocaataaggactggtttgactttattcaaaataaaaggtatttataa SEQ atgacaaacaaatttacaaaccagtactcgctttccaaaacacttcgatttgagttgattccacaaggaaaa ID acattggaatttattcaagaaaaaggattgctctctcaagataaacaacgagcggagagttatcaagaaatg NO : aaaaaaactattgataaatttcataaatactttatcgatttagctttaagcaatgctaaactaactcattta 28 gaaacttacttggaattatacaataaaagtgctgaaacaaaaaaagaacaaaaatttaaagacgatttaaag aaagtacaagacaatttacgaaaagaaatcgttaaatctttttcagatggtgatgcaaaatcaatttttgca attttggataaaaaagaactgattaccgtagaacttgaaaaatggtttgaaaacaacgaacaaaaagacatt tattttgacgaaaaattcaaaacgtttactacttattttactggttttcatcaaaacagaaaaaacatgtat toggttgaacccaattctacagcaattgcttatcgattgattcatgaaaatttacctaaatttttagaaaat gctaaagcatttgaaaaaataaaacaagtagaaagtttgcaagttaattttagagaattaatgggggaattt ggagatgaagggctaattttcgtaaatgaattagaagaaatgtttcaaatcaattattataatgatgtgctt tcacaaaatggaattacaatttataatagtataatttcaggatttaccaaaaatgatataaaatataaaggt ctaaatgaatacataaataattacaatcaaaccaaagacaaaaaagaccgtttgccaaaattaaaacaattg tataaacagattttgagtgataggatttcactttcgtttttgcccgatgcttttacggatgggaaacaagtt ttgaaagccatatttgacttttataaaatcaacttactttcttataccattgaaggacaggaagaaagccaa aatcttttactattaattcgtcagacaattgaaaacctttctagttttgatacccaaaaaatttatctaaaa aatgatacccatttaaccactatttcacaacaagtatttggcgatttttcggtgttttcaactgctttaaat tattggtatgaaactaaagtaaatccaaaatttgaaacggaatatagcaaagccaacgaaaaaaaacgagaa attttagataaagccaaagcggtatttacaaaacaagattatttttcaattgcttttttacaagaagtactt tcggaatacattcttaccttagatcacacttctgatattgtaaaaaagcattcctccaactgtattgcggat tattttaaaaatcattttgtagccaaaaaagaaaatgaaaccgacaaaacctttgattttattgctaatatt actgcaaaataccaatgtattcaaggtattttagaaaatgcagaccaatacgaagacgaactcaaacaagac caaaaattaattgataatttgaaattctttttaqatqctattttaqaattqttqcattttattaaaccttta catttaaaatcagaaagcattaccgaaaaagacactgctttttatgatgtgtttgaaaattattacgaagca ttgagtttgttgaccccattatataatatggtgcgaaactatgtaacgcaaaagccgtacagcaccgaaaaa ataaaattaaattttgaaaatgcacaattattgaatggttgggatgccaataaagaaggtgattacctaact accattttgaaaaaagacggtaattattttttagccataatggataaaaagcataacaaagcgtttcaaaag tttccagaaggaaaagaaaattatgaaaaaatggtgtataaactattgcctggagtaaataagatgttgcca aaagtatttttttccaataaaaatattgcttacttcaacccatcaaaagagttattagaaaactataaaaaa gagacgcacaaaaaaggagacacattcaatttagaacattgtcatacgttgatcgattttttcaaggactct ttaaacaaacatgaagactggaaatactttgattttcaattttctgaaacaaaatcgtatcaagatttgagt ggtttttatagagaagtagaacatcaaggctacaaaatcaattttaaaaatatcgattcagaatatattgat ggtttggtgaacgaaggtaaattgtttctatttcaaatttacagcaaagatttttcgcctttttccaaaggg aaaccgaacatgcacactttgtattggaaagccttatttgaagaacaaaatttgcaaaatgtaatctataaa ttgaatggacaagccgaaatattttttagaaaagcctctataaaacctaaaaatataatattgcacaaaaag aaaattaaaattgccaaaaagcattttattgataaaaaaacaaaaacatctgaaattgttcctgttcaaaca ataaaaaacctcaatatgtactaccaaggaaaaataagtgaaaaagaattaacacaagatgatttaaggtat attgataattttagcattttcaatgaaaaaaataaaacaattgatattataaaagacaaacgatttacggtt gataaatttcagtttcatgtgccgattaccatgaactttaaagcaacgggcggaagttatatcaatcaaacc gtattagaatatttgcaaaacaatcccgaagttaagattattggattggatagaggcgaacgccatttggta tatctgacactgatagaccagcaaggaaacatcttgaaacaagaaagtttgaatacaatcaccgattctaaa atctcgacaccttatcataagttgttggataacaaggaaaacgagcgtgacttggctcgaaaaaattgggga acggtggaaaacatcaaagaactcaaagaaggctacatcagtcaagtggtgcataaaattgctacgttgatg ctggaagaaaatgccattgtggtaatggaagatttgaattttggatttaaacgtggacgttttaaagtggaa aaacaaatttatcaaaagctggaaaaaatgttgattgacaaattgaattatttggttttaaaagacaaacaa cctcaggaattaggcggattgtacaacgcattacaactcaccaataaatttgaaagtttccaaaaaatgggt aaacaatcgggctttttgttttatgtacccgcttggaacacctccaaaatagacccaaccacagggtttgtc aattatttttataccaaatatgaaaatgttgacaaagccaaagccttttttgaaaaatttgaggcgattcgt ttcaatgcagaaaagaagtattttgaatttgaagtaaaaaaatatagcgattttaacccaaaagccgaaggc actcaacaagcctggaccatttgcacgtatggcgaacgaatagaaaccaaacgacaaaaagaccaaa aaatttgtaagcactccaattaatctaaccgaaaagatagaagactttttgggtaaaaaccaaattgtttat ggtgatggtaattgcatcaaatctcaaattgctagcaaagacgacaaggettttttgaaaccttattgtatt ggttcaaaatgactttacaaatgcgaaacagcgaaacaagaacagatatagattatctaatttcgcccgtga tgaatgacaacggaacattttacaacagccgagattatgaaaaattagaaaatccaactttgcccaaagatg ccgatgccaacggagcgtatcatattgccaaaaaaggattgatgcttttgaataaaatagaccaagccgact tgacaaaaaaagtggatttatctattagtaacagagattggttgcaatttgtacaaaaaaataaataa SEQ atggaacaggagtactatttaggactggatatgggaaccggatctgtaggatgggctgttacagattcggaa ID tatcatgtcttgcgtaaacatggaaaagcactatggggagtccgattatttgaaagtgcatcgacagcagaa NO : gaacgaagaatgttccgaacatcaagaagaagactagatcgaagaaactggagaattgaaattttacaggaa 29 atttttgcagaggaaataagtaagaaagatccaggatttttcttgcgaatgaaagaaagcaaatattatcca gaagataagcgagatatcaatggaaattgtccggaactgccatatgcattatttgttgatgacgattttaca gataaagattatcataaaaaatttccgacaatttatcatctcaggaaaatgttgatgaatacagaggagaca ccggatatccggttggtgtatctggcaattcatcatatgatga agcataggggccatttcttgttatctggt gacattaatgagattaaggagttcggaacgacattttcaaaattgttggagaatatcaaaaatgaggaattg gattggaatcttgaactgggaaaagaagaatatgctgttgtagaaagtattttaaaagataacatgttaaac cgatccacaaagaaaaccagattaataaaagcattaaaagcaaaatcaatatgtgaaaaggctgtactgaat ttattggctggtggaacggtgaaattgagtgatatatttggtcttgaagaattaaatgagacagaaagaccg aagatttcctttgctgataatggatacgatgattatatcggagaagttgaaaatgagctgggagaacaatto tatattatagagacggcaaaagcagtgtatgactgggcggtattagttgaaatattgggaaaatatacgtca atttcagaagcgaaagtagcaacgtatgaaaaacataaatcggatttacaatttttgaaaaagatagttegg aaatatctgacaaaggaggaatataaagatatttttgtaagtacgagtgacaaattgaaaaattactctgct US 10 , 435 , 714 B2 83 84 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence tatataggaatgacgaaaataaatggaaaaaaggttgatttgcagagcaaacggtgcagtaaagaagaatto tatgattttattaagaaaaacgtacttaaaaagctagaaggacaacctgaatatga atatttgaaagaagag ctagaaagagaaacatttctaccaaaacaggtgaacagggataatggtgtaataccgtatcagattcatttg tacgagttgaaaaagatattaggaaatttacgggataaaatagacctcattaaagagaacgaagataaactg gttcaattatttgaattcagaattccgtattatgttggtccgctgaataagatagatgacggaaaagaggga aaatttacatgggctgtacggaaaagtaatgaaaagatatatccatggaattttgaaaatgtagttgatata gaagcaagtgcagaaaaatttatccggagaatgacaaataagtgtacatatctgatgggcgaagatgtattg ccgaaggattcattgctttacagtaaatatatggttttaaatgaattaaataatgtaaagttggatggcgaa aaattatctgtagaattgaaacaacggttgtatacagatgtattttgtaagtatcggaaagtaactgtaaag aagataaaaaattacttgaaatgtgaaggtatcatatccggcaatgtcgaaataactggaattgatggtgat tttaaggcatcgttaacggcatatcatgattttaaagaaatcttgacaggaacagaattggctaaaaaggac aaagaaaatattattaccaatatagtattgtttggagatgataaaaagctgctgaaaaagagactgaatcga ttatatcctcagattacgccgaatcagttgaagaaaatatgtgcgctatcctatacaggctggggaagattt totaaaaagttcttagaagaaataacagctccagatccggaaacgggagaggtatggaatatcattacggca ttgtgggaatcgaataataatctgatgcaattattaagtaatgaatatcggtttatggaagaagtcgaaaca tacaatatgggaaaacagactaaaacattgtcgtacgaaacagtagagaatatgtatgtttctccatctgtg aaaagacagatatggcagacgctgaaaatcgtgaaagaattagaaaaagtaatgaaagaatctccgaaacgt gtatttattgagatggcgagagaaaagcaagaaagtaagagaaccgaatcgcgtaaaaaacaactaatagat ttgtataaggettgtaaaaatgaagaaaaagattgggtaaaagaactgggagatcaggaagaacagaaatta cgaagcgataagttgtacctatattatacgcaaaagggtcgttgtatgtattctggcgaggtaatagaactg aaagacttatgggataatacaaaatatgatattgatcatatatatccacaatctaaaacgatggatgacagt cttaataatcgcgtattggtaaaaaagaaatataatgcaacaaaatcagataagtatccattaaatgaaaat atacgacatgagagaaaaggettttggaagtcactgttagatggagggtttataagtaaagaaaaatatgaa cgcttaataagaaatacagaattgagtccggaagaattagcaggatttattgaaaggcagattgttgaaacg aggcagagtacaaaagctgtagcggaaatattaaagcaagtgtttccggaaagtgaaattgtatatgtcaaa gcaggtacggtttcaagattcagaaaagattttgaattactgaaagttcgagaagtgaatgatttgcatcac gcaaaggatgcgtatttaaatattgtagttggtaatagttattatgtgaaatttactaagaatgcatcatgg tttataaaagaaaatccqqqacatacttacaacttaaaaaaqatgtttacatcaqattggaatattgaacga aatggagaagttgcatgggaagtcgggaaaaaaggaacaattgtaacggtaaaacaaataatgaataaaaat aatatattggtgacaagacaggttcatgaagcgaaaggtgggctgtttgatcagcagattatgaaaaaagga aaaggtcagattgctataaaggaaactgatgaacgtcttgcatcaatagaaaagtatggaggctataataaa gctgccggggcatattttatgctggtagaatctaaagataaaaaaggaaaaacaattcgaacgatagaattt ataccattatatttaaagaataaaatcgagtcggatgaatcaatagcattgaactttttagaaaaaggcaga ggtttgaaagaaccaaagatactattgaaaaaaattaagattgatacattatttgatgtggacggattcaaa atgtggttgtctggaagaacaggggacagactactatttaaatgtgcaaatcaattgattttggatgagaaa ataattgtaacaatgaaaaaaattgtaaagtttattcaaaggagacaagaaaatagagaattaaaattatct gataaagatggaattgataatgaagtacttatggaaatatata acacttttgtggataagttagaaaacaca gtgtatagaatacgattatccgaacaggcaaaaacgcttatagataaacaaaaagaatttgaaaggttatca ctagaggataaaagtagtactttgtttgaaattttacatatttttcagtgtcaaagtagtgcggccaattta aaaatgataggcggacctggaaaagcaggaatattagttatgaataataatataagtaagtgtaacaaaatt tctattataaatcagtctccaacaggaattttcgaaaatgagattgatttgttaaagat SEQ ATGAAATCTTTCGATTCATTCACAAATCTTTATTCTCTTTCAAAAACCTTGAAATTTGAGATGAGACCTGT ID CGGAAATACCCAAAAAATGCTCGACAATGCAGGAGTATTTGAAAAAGACAAACTAATTCAAAAAAAGT NO : ACGGAAAAACAAAGCCGTATTTCGACAGACTCCACAGAGAATT TATAGAAGAAGCGCTCACGGGGGTA 30 GAGCTAATAGGACTAGATGAGAACTTTAGGACACTTGTTGACTGGCAAAAAGATAAGAAAAATAATGTC GCAATGAAAGCGTATGAAAATAGTTTGCAGCGGCTGAGAACGGAAATAGGTAAAATATTTAACCTAAA GGCTGAGGATTGGGTAAAGAACAAATATCCAATATTAGGGCTGAAAAATAAAAATACCGATATTTTATT CGAAGAGGCTGTATTCGGGATATTGAAAGCCCGATATGGAGGAAAGCCCGATATGGAGAAGAAAAAGATACTTTTATAGAAGTAG AGGAAATAGATAAAACCGGCAAATCAAAGATCAATCAAATATCAATTTTCGATAGTTGGAAAGGATTTA CAGGATATTTCAAAAAATTTTTTGAAACCAGAAAGAATTTTTACAAAAACGACGGAACTTCTACAGCAA TTGCTACAAGGATCATTGATCAAAATCTGAAAAGATTCATAGATAATCTGTCAATAGTTGAAAGTGTGA GACAAAAGGTTGATCTCGCCGAGACAGAAAAATCTTTCAGCATATCTCTATCGCAATTCTTCTCAATAGA CTTTTATAACAAGTGTCTCCTTCAAGATGGTATTGATTACTACAACAAGATAATCGGTGGAGAAACTCTC AAAAATGGCGAAAAACTAATAGGTCTCAATGAACTAATAAATCAATATAGGCAAATAATAAGGATCA GAAAATCCCATTTTTCAAACTTCTTGATAAACAAATTTTGAGTGAAAAGATATTATTTTTGGATGAAATA AAAAATGACACAGAACTGATCGAGGCGCTGAGTCAGTTCGCAAAAACAGCCGAAGAAAAAACAAAAAT TGTCAAAAAGCTTTTTGCCGATTTTGTAGAAAATAATTCCAAATACGATCTTGCACAGATTTATATTTCC CAAGAAGCATTCAATACTATATCAAACAAGTGGACAAGCGAAACTGAGACGTTCGCTAAATATCTATTC GAAGCAATGAAGAGTGGAAAACTTGCAAAGTATGAGAAAAAAGATAATAGCTATAAATTTCCTGATTTT ATTGCCCTTTCACAGATGAAGAGTGCTTTATTAAGTATCAGCCTTGAGGGACATTTTTGGAAAGAGAAAT ACTACAAAATTTCAAAATTCCAAGAGAAGACCAATTGGGAGCAGTTTCTTGCAATTTTTCTATACGAGTT TAACTCTCTTTTCAGCGACAAAATAAATACAAAAGATGGAGAAACAAAGCAAGTTGGATACTATCTATT TGCCAAAGACCTGCATAATCTTATCTTAAGTGAGCAGATTGATATTCCAAAAGATTCAAAAGTCACAAT AAAAGATTTTGCCGATTCTGTACTCACAATCTACCAAATGGCAAAATATTTTGCGGTAGAAAAAAAACG AGCGTGGCTTGCCGAGTATGAACTAGATTCATTTTATACCCAGCCAGACACAGGCTATTTACAGTTTTAT GATAACGCCTACGAGGATATTGTGCAGGTATACAACAAGCTTCGAAACTATCTGACCAAAAAGCCATAT AGCGAGGAGAAATGGAAGTTGAATTTTGAAAATTCTACGCTGGCAAATGGATGGGATAAGAACAAAGA ATCTGATAATTCAGCAGTTATTCTACAAAAAGGTGGAAAATAT TATTTGGGA?TGATTACTAAAGGACA CAACAAAATTTTTGATGACCGTTTTCAAGAAAAATTTATTGTGGGAATTGAAGGTGGAAAATATGAAAA AATAGTCTATAAATTTTTCCCCGACCAGGCAAAAATGTTTCCCAAAGTGTGCTTTTCTGCAAAAGGACTC GAATTTTTTAGACCGTCTGAAGAAATTTTAAGAATTTATAACAATGCAGAGTTTAAAAAAGGAGAAACT TATTCAATAGATAGTATGCAGAAGTTGATTGATTTTTATAAAGATTGCTTGACTAAATATGAAGGCTGGG CATGTTATACCTTTCGGCATCTAAAACCCACAGAAGAATACCAAAACAATATTGGAGAGTTTTTTCGAG ATGTTGCAGAGGACGGATACAGGATTGATTTTCAAGGCATTTCAGATCAATATATTCATGAAAAAAACG AGAAAGGCGAACTTCACCTTTTTGAAATCCACAATAAAGATTGGAATTTGGATAAGGCACGAGACGGAA US 10 , 435 , 714 B2 85 86 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence AGTCAAAAACAACACAAAAAAACCTTCATACACTCTATTTCGAATCGCTCTTTTCAAACGATAATGTTGT TCAAAACTTTCCAATAAAACTCAATGGTCAAGCTGAAATTTTTTATAGACCGAAAACGGAAAAAGACAA ATTAGAATCAAAAAAAGATAAGAAAGGGAATAAAGTGATTGACCATAAACGCTATAGTGAGAATAAGA TTTTTTTTCATGTTCCTCTCACACTAAACCGCACTAAAAATGACTCATATCGCTTTAATGCTCAAATCAAC AACTTTCTCGCAAATAATAAAGATATCAACATCATCGGTGTAGATAGGGGAGAAAAGCATTTAGTCTAT TATTCGGTGATTACACAAGCTAGTGACATCTTAGAAAGTGGCT CACTAAATGAGCTAAATGGCGTGAAT TATGCTGAAAAACTGGGAAAAAAGGCAGAAAATCGAGAACAAGCACGCAGAGACTGGCAAGACGTAC AAGGGATCAAAGACCTCAAGAAAGGATATATTTCACAGGTGGTGCGAAAGCTTGCTGATTTAGCAATTA AACACAATGCCATTATCATTCTTGAAGATTTGAATATGAGATT TAAACAAGTTCGGGGCGGTATCGAAA AATCCATTTATCAACAGTTAGAAAAAGCACTGATAGATAAATTAAGCTTTCTTGTAGACAAAGGTGAAA AAAATCCCGAGCAAGCAGGACATCTTCTGAAAGCATATCAGCTTTCGGCCCCATTTGAGACATTTCAAA AAATGGGCAAACAGACGGGTATAATCTTTTATACACAAGCTTCGTATACCTCAAAAAGTGACCCTGTAA CAGGTTGGCGACCACACCTGTATCTCAAATATTTCAGTGCCAAAAAAGCAAAAGACGATATTGCAAAGT TTACAAAAATAGAATTTGTAAACGATAGGTTTGAGCTTACCTATGATATAAAGGACTTTCAGCAAGCAA AAGAATATCCAAATAAAACTGTTTGGAAAGTTTGCTCAAATGTAGAGAGATTCAGGTGGGACAAAAACC TCAATCAAAACAAAGGCGGATATACTCACTACACAAATATAAC TGAGAATATCCAAGAGCTTTTTACAA AATATGGAATTGATATCACAAAAGATTTGCTCACACAGATTTCTACAATTGATGAAAAACAAAATACCT CATTTTTTAGAGATTTTATTTTTTATTTCAACCTTATTTGCCAAATCAGAAATACCGATGATTCTGAGATT GCTAAAAAGAATGGGAAAGATGATTTTATACTGTCACCTGTTGAGCCGTTTTTCGATAGCCGAAAAGAC AATGGAAATAAACTTCCTGAGAATGGAGATGATAACGGCGCGTATAACATAGCAAGAAAAGGGATTGT CATACTCAACAAAATCTCACAATATTCAGAGAAAAACGAAAATTGCGAGAAAATGAAATGGGGGGATT TGTATGTATCAAACATTGACTGGGACAATTTTGTAACCCAAGCTAATGCACGGCATTAA SEQ ATGATTATCTTATATATTAGTACCTCGAATATGAACATGGAAGGAGTATTTATGGAAAATTTTAAAAACT ID TGTATCCAATAAACAAAACACTTCGATTTGAATTAAGACCCTATGGAAAAACATTGGAAAATTTTAAAA NO : AATCCGGACTTTTAGAAAAAGATGCCTTTAAGGCAAATAGTAGACGAAGTATGCAAGCTATAATCGATG 31 AAAAATTCAAAGAGACTATCGAAGAACGCTTAAAGTACACTGAATTCAGTGAATGTGATCTTGGAAACA TGACATCAAAAGATAAAAAAATAACTGATAAAGCAGCTACAAATTTAAAAAAGCAAGTTATCTTATCTT TTGACGATGAAATATTTAATAATTACCTAAAACCTGATAAAAATATTGACGCATTATTTAAAAATGATCC TTCAAATCCTGTAATCTCTACATTTAAAGGTTTTACGACATATTTTGTGAATTTTTTTGAAATTCGAAAAC ATATTTTCAAGGGAGAATCATCAGGCTCAATGGCATACCGAATTATAGATGAAAACCTGACAACATACT TAAAAAAACTGCCAGAAGAATTAAAATCACAGCTAGAAGGCATTGATCAG ATTGATAAACTTAATAATTATAATGAGTTCATTACACAGTCAGGTATAACACACTATAATGAAATCATCG GCGGTATATCAAAATCAGAGAATGTCAAAATACAGGGAATTAATGAAGGAATTAATCTATACTGTCAGA AGAACAAAGTTAAACTTCCTCGACTGACTCCGCTATACAAAATGATATTATCAGACAGAGTTTCCAACTC TTTTGTATTAGACACTATTGAAAATGACACAGAATTAATTGAAATGATAAGTGATTTGATTAATAAGACT GAGATTTCGCAAGATGTTATAATGTCAGATATTCAAAATATTTTCATAAAATACAAACAACTTGGTAATT TGCCGGGTATCTCATATTCTTCAATAGTTAATGCTATTTGCTCGGATTATGACAACAATTTCGGAGATGG GAAGCGAAAAAAATCTTACGAAAATGATCGCAAAAAGCATTTGGAGACTAATGTATACTCCATAAATTA TATTTCTGAATTGCTTACAGATACCGATGTTTCATCAAATATCAAGATGAGATATAAAGAGCTTGAGCAA AATTATCAGGTTTGCAAAGAAAATTTTAATGCCACAAACTGGATGAATATTAAAAATATAAAACAATCT GAAAAAACAAACCTTATTAAAGATTTGTTAGATATACTTAAATCGATTCAACGTTTCTATGATTTGTTTG ATATTGTTGACGAAGATAAAAATCCAAGTGCTGAATTTTATACCTGGTTATCAAAAAATGCTGAAAAGC TTGACTTTGAATTCAATTCTGTATATAACAAGTCACGAAACTATCTCACCAGGAAACAATACTCTGATAA AAAAATCAAGCTGAATTTTGATTCTCCAACATTGGCCAAAGGGTGGGATGCTAACAAAGAAATAGATAA CTCCACGATTATAATGCGTAAATTTAATAATGACAGAGGCGATTATGATTACTTCCTTGGCATATGGAAT AAATCCACACCTGCAAATGAAAAAATAATCCCACTGGAGGATAATGGATTATTCGAAAAAATGCAATAT AAGCTGTATCCAGATCCTAGTAAGATGTTACCGAAACAATTTC CCTACGACACCTGAATTTGATAAAAAATATAAAGAGGGAAGACATAAAAAAGGTCCTGATTTCGAAAA AGAATTCCTGCATGAATTGATTGATTGCTTCAAACATGGTCTTGTTAATCACGATGAAAAATATCAGGAT GTTTTTGGCTTCAATCTCCGTAACACTGAAGATTATAATTCATATACAGAGTTTCTCGAAGATGTGGAAA GATGCAATTACAATCTTTCATTTAACAAAATTGCTGATACTTCAAACCTTATTAATGATGGGAAATTGTA TGTATTTCAGATATGGTCAAAAGACTTTTCTATTGATTCAAAAGGTACTAAAAACTTGAATACAATCTAT TTTGAATCACTATTTTCAGAAGAAAACATGATAGAAAAAATGT TCAAGCTTTCTGGAGAGGCTGAGATA TTCTATCGACCAGCATCGTTGAATTATTGTGAAGATATCATAAAAAAAGGTCATCACCATGCAGAATTA AAAGATAAGTTTGACTATCCTATAATAAAAGATAAGCGATATT CACAAGATAAGTTTTTCTTTCATGTGC CTGAGAAACTGAATTCCAAAAGCCTTAACAACCGAACAAATGAAAACC TGGGACAGTTTACACATATTATAGGTATAGACAGGGGCGAGCGGCACTTGATTTATTTAACTGTTGTTGA TGTTTCCACTGGTGAAATCGTTGAACAGAAACATCTGGACGAAATTATCAATACTGATACCAAGGGAGT TGAACACAAAACCCATTATTTGAATAAATTGGAAGAAAAATCTAAAACAAGAGATAACGAGCGTAAAT CATGGGAAGCTATTGAAACTATCAAAGAATTAAAAGAAGGCTATATTTCTCATGTAATTAATGAAATAC AAAAGCTGCAAGAAAAATATAATGCCTTAATCGTAATGGAAAATCTTAACTATGGGTTCAAAAACTCAC GAATCAAAGTTGAAAAACAGGTTTATCAAAAATTCGAGACAGCATTGATTAAAAAGTTCAATTATATTA TTGATAAAAAAGATCCAGAAACCTATATACATGGTTACCAGCTTACAAATCCTATTACCACTCTGGATAA GATTGGAAATCAATCTGGAATAGTGCTGTATATTCCTGCGTGGAATACTTCTAAGATAGATCCCGTCACA GGATTTGTAAACCTTCTGTACGCAGATGATTTGAAGTATAAAAATCAGGAGCAGGCCAAATCATTCATT CAGAAAATAGACAACATATATTTTGAAAATGGAGAGTTTAAATTTGATATTGATTTTTCCAAATGGAATA ATCGCTACTCAATAAGTAAAACTAAATGGACGTTAACAAGTTATGGGACTCGCATCCAGACATTTAGAA ATCCCCAGAAAAACAATAAGTGGGATTCTGCTGAATATGATTTGACAGAAGAGTTTAAATTAATTTTAA ATATAGACGGAACGTTAAAGTCACAGGACGTAGAAACATACAAAAAATTCATGTCTTTATTTAAACTAA TGCTACAGCTTCGAAACTCTGTTACAGGAACCGACATTGATTATATGATCTCTCCTGTCACTGATAAAAC AGGAACACATTTCGATTCAAGAGAAAATATTAAAAATCTTCCTGCCGATGCAGATGCCAATGGTGCCTA CAACATTGCGCGCAAAGGAATAATGGCTATTGAAAATATAATGAACGGTATAAGCGATCCACTAAAAAT AAGCAACGAAGACTATTTAAAGTATATTCAGAATCAACAGGAATAA US 10 , 435 , 714 B2 87 88 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence

SEQ ATGACCCAATTTGAAGGTTTTACCAATTTATACCAAGTTTCGAAGACCCTTCGTTTTGAACTGATTCCCC ID AAGGAAAAACACTCAAACATATCCAGGAGCAAGGGTTCATTGAGGAGGATAAAGCTCGCAATGACCAT NO : TACAAAGAGTTAAAACCAATCATTGACCGCATCTATAAGACTTATGCTGATCAATGTCTCCAACTGGTAC 32 AGCTTGACTGGGAGAATCTATCTGCAGCCATAGACTCCTATCGTAAGGAAAAAACCGAAGAAACACGA AATGCGCTGATTGAGGAGCAAGCAACATATAGAAATGCGATTCATGACTACTTTATAGGTCGGACGGAT AATCTGACAGATGCCATAAATAAGCGCCATGCTGAAATCTATAAAGGACTTTTTAAAGCTGAACTTTTCA ATGGAAAAGTTTTAAAGCAATTAGGGACCGTAACCACGACAGAACATGAAAATGCTCTACTCCGTTCGT TTGACAAATTTACGACCTATTTTTCCGGCTTTTATGAAAACCGAAAAAATGTCTTTAGCGCTGAAGATAT CAGCACGGCAATTCCCCATCGAATCGTCCAGGACAATTTCCCTAAATTTAAGGAAAACTGCCATATTTTT ACAAGATTGATAACCGCAGTTCCTTCTTTGCGGGAGCATTTTGAAAATGTCAAAAAGGCCA TTGTTAGTACGTCTATTGAAGAAGTCTTTTCCTTTCCCTTTTATAATCAACTTCTAACCCAAACGCAAATT GATCTTTATAATCAACTTCTCGGCGGCATATCTAGGGAAGCAGGCACAGAAAAAATCAAGGGACTTAAT GAAGTTCTCAATCTGGCTATCCAAAAAAATGATGAAACAGCCCATATAATCGCGTCCCTGCCGCATCGTT TTATTCCTCTTTTTAAACAAATTCTTTCCGATCGAAATACGTTATCCTTTATTTTGGAAGAATTCAAAAGC GATGAGGAAGTCATCCAATCCTTCTGCAAATATAAAACCCTCTTGAGAAACGAAAATGTACTGGAGACT GCAGAAGCCCTTTTCAATGAATTAAATTCCATTGATTTGACTCATATCTTTATTTCCCATAAAAAGTTAG AAACCATCTCTTCAGCGCTTTGTGACCATTGGGATACCTTGCGCAATGCACTTTACGAAAGACGGATTTC TGAACTCACTGGCAAAATAACAAAAAGTGCCAAAGAAAAAGTTCAAAGGTCATTAAAACATGAGGATA TAAATCTCCAAGAAATTATTTCTGCTGCAGGAAAAGAACTATCAGAAGCATTCAAACAAAAAACAAGTG AAATTCTTTCCCATGCCCATGCTGCACTTGACCAGCCTCTTCCCACAACATTAAAAAAACAGGAAGAAA AAGAAATCCTCAAATCACAGCTCGATTCGCTTTTAGGCCTTTATCATCTTCTTGATTGGTTTGCTGTCGAT GAAAGCAATGAAGTCGACCCAGAATTCTCAGCACGGCTGACAGGCATTAAACTAGAAATGGAACCAAG CCTTTCGTTTTATAATAAAGCAAGAAATTATGCGACAAAAAAGCCCTATTCGGTGGAAAAATTTAAATT GAATTTTCAAATGCCAACCCTTGCCTCTGGTTGGGATGTCAATAAAGAAAAAAATAATGGAGCTATTTTA TTCGTAAAAAATGGTCTCTATTACCTTGGTATCATGCCTAAACAGAAGGGGCGCTATAAAGCCCTGTCTT TTGAGCCGACAGAAAAAACATCAGAAGGATTCGATAAGATGTACTATGACTACTTCCCAGATGCCGCAA AAATGATTCCTAAGTGTTCCACTCAGCTAAAGGCTGTAACCGCTCATTTTCAAACTCATACCACCCCCAT TCTTCTCTCAAATAATTTCATTGAACCTCTTGAAATCACAAAAGAAATTTATGACCTGAACAATCCTGAA AAGGAGCCTAAAAAGTTTCAAACGGCTTATGCAAAGAAGACAGGCGATCAAAAAGGCTATAGAGAAGC GCTTTGCAAATGGATTGACTTTACGCGGGATTTTCTCTCTAAATATACGAAAACAACTTCAATCGATTTA TCTTCACTCCGCCCTTCTTCGCAATATAAAGATTTAGGGGAATATTACGCCGAACTGAATCCGCTTCTCT ATCATATCTCCTTCCAACGAATTGCTGAAAAGGAAATCATGGATGCTGTAGAAACGGGAAAATTGTATC TGTTCCAAATCTACAATAAGGATTTTGCGAAGGGCCATCACGGGAAACCAAATCTCCACACCCTGTATT GGACAGGTCTCTTCAGTCCTGAAAACCTTGCGAAAACCAGCAT CAAACTTAATGGTCAAGCAGAATTGT TCTATCGACCTAAAAGCCGCATGAAGCGGATGGCCCATCGTCTTGGGGAAAAAATGCTGAACAAAAAAC TAAAGGACCAGAAGACACCGATTCCAGATACCCTCTACCAAGAACTGTACGATTATGTCAACCACCGGC TAAGCCATGATCTTTCCGATGAAGCAAGGGCCCTGCTTCCAAATGTTATCACCAAAGAAGTCTCCCATGA AATTATAAAGGATCGGCGGTTTACTTCCGATAAATTTTTCTTCCATGTTCCCATTACACTGAATTATCAAG CAGCCAATAGTCCCAGTAAATTCAACCAGCGTGTCAATGCCTACCTTAAGGAGCATCCGGAAACGCCCA TCATTGGTATCGATCGTGGAGAACGCAATCTAATCTATATTACCGTCATTGACAGTACTGGGAAAATTTT GGAGCAGCGTTCCCTGAATACCATCCAGCAATTTGACTACCAAAAAAAATTGGACAACAGGGAAAAAG AGCGTGTTGCCGCCCGTCAAGCCTGGTCCGTCGTCGGAACGAT CAAAGACCTTAAACAAGGCTACTTGT CACAGGTCATCCATGAAATTGTAGACCTGATGATTCATTACCAAGCTGTTGTCGTCCTTGAAAACCTCAA CTTCGGATTTAAATCAAAACGGACAGGCATTGCCGAAAAAGCAGTCTACCAACAATTTGAAAAGATGCT AATAGATAAACTCAACTGTTTGGTTCTCAAAGATTATCCTGCTGAGAAAGTGGGAGGCGTCTTAAACCC GTATCAACTTACAGATCAGTTCACGAGCTTTGCAAAAATGGGCACGCAAAGCGGCTTCCTTTTCTATGTA CCGGCCCCTTATACCTCAAAGATTGATCCCCTGACTGGTTTTGTCGATCCCTTTGTATGGAAGACCATTA AAAATCATGAAAGTCGGAAGCATTTCCTAGAAGGATTTGATTTCCTGCATTATGATGTCAAAACAGGTG ATTTTATCCTCCATTTTAAAATGAATCGGAATCTCTCTTTCCAGAGAGGGCTTCCTGGCTTCATGCCAGCT TGGGATATTGTTTTCGAAAAGAATGAAACCCAATTTGATGCAAAAGGGACGCCCTTCATTGCAGGAAAA CGAATTGTTCCTGTAATCGAAAATCATCGTTTTACGGGTCGTTACAGAGACCTCTATCCCGCTAATGAAC TCATTGCCCTTCTGGAAGAAAAAGGCATTGTCTTTAGAGACGGAAGTAATATATTACCCAAACTTTTAGA AAATGATGATTCTCATGCAATTGATACGATGGTCGCCTTGATTCGCAGTGTACTCCAAATGAGAAACAG CAATGCCGCAACGGGGGAAGACTACATCAACTCTCCCGTTAGGGATCTGAACGGGGTGTGTTTCGACAG TCGATTCCAAAATCCAGAATGGCCAATGGATGCGGATGCCAACGGAGCTTATCATATTGCCTTAAAAGG GCAGCTTCTTCTGAACCACCTCAAAGAAACAAAGATCTGAAATTACAAAACGGCATCAGCAACCAAGA TTGGCTGGCCTACATTCAGGAACTGAGAAACTGA SEQ ATGGCCGTCAAATCCATCAAAGTGAAACTTCGTCTCGACGATATGCCGGAGATTCGGGCCGGTCTATGG ID AAACTTCATAAGGAAGTCAATGCGGGGGTTCGATATTACACGGAATGGCTCAGTCTTCTCCGTCAAGAG NO : AACTTGTATCGAAGAAGTCCGAATGGGGACGGAGAGCAAGAATGTGATAAGACTGCAAAGAATGCAA 33 AGCCGAATTGTTGGAGCGGCTGCGCGCGCGTCAAGTGGAGAATGGACACCGTGGTCCGGCGGGATCGG ACGATGAATTGCTGCAGTTGGCGCGTCAACTCTATGAGTTGTTGGTTCCGCAGGCGATAGGTGCGAAAG GCGACGCGCAGCAAATTGCCCGCAAATTTTTGAGCCCCTTGGCCGACAAGGACGCAGTTGGTGGGCTTG GAATCGCGAAGGCGGGGAACAAACCGCGGTGGGTTCGCATGCGCGAAGCGGGGGAACCAGGCTGGGAA GAGGAGAAGGAGAAGGCTGAGACGAGGAAATCTGCGGATCGGACTGCGGATGTTTTGCGCGCGCTCGC GGATTTTGGGTTAAAGCCACTGATGCGCGTATACACCGATTCTGAGATGTCATCGGTGGAGTGGAAACC GCTTCGGAAGGGACAAGCCGTTCGGACGTGGGATAGGGACATGTTCCAACAAGCTATCGAACGGATGAT GTCGTGGGAGTCGTGGAATCAGCGCGTTGGGCAAGAGTACGCGAAACTCGTAGAACAAAAAAATCGAT TTGAGCAGAAGAATTTCGTCGGCCAGGAACATCTGGTCCATCTCGTCAATCAGTTGCAACAAGATATGA AAGAAGCATCGCCCGGACTCGAATCGAAAGAGCAAACCGCGCACTATGTGACGGGACGGGCATTGCGC GGATCGGACAAGGTATTTGAGAAGTGGGGGAAACTCGCCCCCGATGCACCTTTCGATTTGTACGACGCC GAAATCAAGAATGTGCAGAGACGTAACACGAGACGATTCGGAT CACATGACTTGTTCGCAAAATTGGCA GAGCCAGAGTATCAGGCCCTGTGGCGCGAAGATGCTTCGTTTCTCACGCGTTACGCGGTGTACAACAGC ATCCTTCGCAAACTGAATCACGCCAAAATGTTCGCGACGTTTACTTTGCCGGATGCAACGGCGCACCCG US 10 , 435 , 714 B2 89 90 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence ATTTGGACTCGCTTCGATAAATTGGGTGGGAATTTGCACCAGTACACCTTTTTGTTCAACGAATTTGGAG AACGCAGGCACGCGATTCGTTTTCACAAGCTATTGAAAGTCGAGAATGGTGTCGCAAGAGAAGTTGATG ATGTCACCGTGCCCATTTCAATGTCAGAGCAATTGGATAATCTGCTTCCCAGAGATCCCAATGAACCGAT TGCGCTATATTTTCGAATTACGGAGCCGAACAGCATTTCACAGGTGAATTTGGTGGCGCGAAGATCCA GTGCCGCCGGGATCAGCTGGCTCATATGCACCGACGCAGAGGGGCGAGGGATGTTTATCTCAATGTCAG CGTACGTGTGCAGAGTCAGTCTGAGGCGCGGGGAGAACGTCGCCCGCCGTATGCGGCAGTATTTCGTCT GGTCGGGGACAACCATCGCGCGTTTGTCCATTTCGATAAACTATCGGATTATCTTGCGGAACATCCGGAT GATGGGAAGCTCGGGTCGGAGGGGTTGCTTTCCGGGCTGCGGGTGATGAGTGTCGATCTCGGCCTTCGC ACATCTGCATCGATTTCCGTTTTTCGCGTTGCCCGGAAGGACGAGTTGAAGCCGAACTCAAAAGGTCGTG TACCGTTTTTCTTTCCGATAAAAGGGAATGACAATCTCGTCGCGGTTCATGAGCGATCACAACTCTTGAA GCTGCCTGGCGAAACGGAGTCGAAGGACCTGCGTGCTATCCGAGAAGAACGCCAACGGACATTGCGGC AGTTGCGGACGCAACTGGCGTATTTGCGGCTGCTCGTGCGGTGTGGGTCGGAAGATGTGGGGCGGCGTG AACGGAGTTGGGCAAAGCTTATCGAGCAGCCGGTGGATGCGGCCAATCACATGACACCGGATTGGCGC GAGGCTTTTGAAAACGAACTTCAGAAGCTTAAGTCACTCCATGGTATCTGTAGCGACAAGGAATGGATG GATGCTGTCTACGAGAGCGTTCGCCGCGTGTGGCGTCACATGGGCAAACAGGTTCGCGATTGGCGAAAG GACGTACGAAGCGGAGAGCGGCCCAAGATTCGCGGCTATGCGAAAGACGTGGTCGGTGGAAACTCGAT TGAGCAAATCGAGTATCTGGAACGTCAGTACAAGTTCCTCAAGAGTTGGAGCTTCTTTGGTAAGGTGTC GGGACAAGTGATTCGTGCGGAGAAGGGATCTCGTTTTGCGATCACGCTGCGCGAACACATTGATCACGC GAAGGAAGATCGGCTGAAGAAATTGGCGGATCGCATCATTATGGAGGCTCTCGGCTATGTGTACGCGTT GGATGAGCGCGGCAAAGGAAAGTGGGTTGCGAAGTATCCGCCGTGCCAGCTCATCCTGCTGGAGGAATT GAGCGAGTACCAGTTCAATAACGACAGGCCTCCGAGCGAAAACAACCAGTTGATGCAATGGAGTCATC GCGGCGTGTTCCAGGAGTTGATAAATCAGGCCCAAGTCCATGATTTACTCGTTGGGACGATGTATGCAG CGTTCTCGTCGCGATTCGACGCGCGAACTGGGGCACCGGGTATCCGCTGTCGCCGGGTTCCGGCGCGTTG CACCCAGGAGCACAATCCAGAACCATTTCCTTGGTGGCTGAACAAGTTTGTGGTGGAACATACGTTGGA TGCTTGTCCCCTACGCGCAGACGACCTCATCCCAACGGGTGAAGGAGAGATTTTTGTCTCGCCGTTCAGC GCGGAGGAGGGGGACTTTCATCAGATTCACGCCGACCTGAATGCGGCGCAAAATCTGCAGCAGCGACTC TGGTCTGATTTTGATATCAGTCAAATTCGGTTGCGGTGTGATTGGGGTGAAGTGGACGGTGAACTCGTTC TGATCCCAAGGCTTACAGGAAAACGAACGGCGGATTCATATAGCAACAAGGTGTTTTATACCAATACAG GTGTCACCTATTATGAGCGAGAGCGGGGGAAGAAGCGGAGAAAGGTTTTCGCGCAAGAGAAATTGTCG GAGGAAGAGGCGGAGTTGCTCGTGGAAGCAGACGAGGCGAGGGAGAAATCGGTCGTTTTGATGCGTGA TCCGTCTGGCATCATCAATCGGGGAAATTGGACCAGGCAAAAGGAATTTTGGTCGATGGTGAACCAGCG GATCGAAGGATACTTGGTCAAGCAGATTCGCTCGCGCGTTCCATTACAAGATAGTGCGTGTGAAAACAC GGGGGATATTTAA SEQ ATGGCGACACGCAGTTTTATTTTAAAAATTGAACCAAATGAAGAAGTTAAAAAGGGATTATGGAAGACG ID CATGAGGTATTGAATCATGGAATTGCCTACTACATGAATATTC TGAAACTAATTAGACAGGAAGCTATTT NO : ATGAACATCATGAACAAGATCCTAAAAATCCGAAAAAAGTTTCAAAAGCAGAAATACAAGCCGAGTTA 34 TGGGATTTTGTTTTAAAAATGCAAAAATGTAATAGTTTTACACATGAAGTTGACAAAGATGTTGTTTTTA ACATCCTGCGTGAACTATATGAAGAGTTGGTCCCTAGTTCAGT CGAGAAAAAGGGTGAAGCCAATCAAT TATCGAATAAGTTTCTGTACCCGCTAGTTGATCCGAACAGTCAAAGTGGGAAAGGGACGGCATCATCCG GACGTAAACCTCGGTGGTATAATTTAAAAATAGCAGGCGACCCATCGTGGGAGGAAGAAAAGAAAAAA TGGGAAGAGGATAAAAAGAAAGATCCCCTTGCTAAAATCTTAGGTAAGTTAGCAGAATATGGGCTTATT CCGCTATTTATTCCATTTACTGACAGCAACGAACCAATTGTAAAAGAAATTAAATGGATGGAAAAAAGT CGTAATCAAAGTGTCCGGCGACTTGATAAGGATATGTTTATCCAAGCATTAGAGCGTTTTCTTTCATGGG AAAGCTGGAACCTTAAAGTAAAGGAAGAGTATGAAAAAGTTGAAAAGGAACACAAAACACTAGAGGA AAGGATAAAAGAGGACATTCAAGCATTTAAATCCCTTGAACAATATGAAAAAGAACGGCAGGAGCAAC TTCTTAGAGATACATTGAATACAAATGAATACCGATTAAGCAAAAGAGGATTACGTGGTTGGCGTGAAA TTATCCAAAAATGGCTAAAGATGGATGAAAATGAACCATCAGAAAAATATTTAGAAGTATTTAAAGATT ATCAACGGAAACATCCACGAGAAGCCGGGGACTATTCTGTCTATGAATTTTTAAGCAAGAAAGAAAATC ATTTTATTTGGCGAAATCATCCTGAATATCCTTATTTGTATGC TACATTTTGTGAAATTGACAAAAAAAA GAAAGACGCTAAGCAACAGGCAACTTTTACTTTGGCTGACCCGATTAACCATCCGTTATGGGTACGATTT GAAGAAAGAAGCGGTTCGAACTTAAACAAATATCGAATTTTAACAGAGCAATTACACACTGAAAAGTTA AAAAAGAAATTAACAGTTCAACTTGATCGTTTAATTTATCCAACTGAATCCGGCGGTTGGGAGGAAAAA GGTAAAGTAGATATCGTTTTGTTGCCGTCAAGACAATTTTATAATCAAATCTTCCTTGATATAGAAGAAA AGGGGAAACATGCTTTTACTTATAAGGATGAAAGTATTAAATTCCCCCTTAAAGGTACACTTGGTGGTGC AAGAGTGCAGTTTGACCGTGACCATTTGCGGAGATATCCGCATAAAGTAGAATCAGGAAATGTTGGACG GATTTATTTTAACATGACAGTAAATATTGAACCAACTGAGAGCCCTGTTAGTAAGTCTTTGAAAATACAT AGGGACGATTTCCCCAAGTTCGTTAATTTTAAACCGAAAGAGCTCACCGAATGGATAAAAGATAGTAAA GGGAAAAAATTAAAAAGTGGTATAGAATCCCTTGAAATTGGTCTACGGGTGATGAGTATCGACTTAGGT CAACGTCAAGCGGCTGCTGCATCGATTTTTGAAGTAGTTGATCAGAAACCGGATATTGAAGGGAAGTTA TTTTTTCCAATCAAAGGAACTGAGCTTTATGCTGTTCACCGGGCAAGTTTTAACATTAAATTACCGGGTG AAACATTAGTAAAATCACGGGAAGTATTGCGGAAAGCTCGGGAGGACAACTTAAAATTAATGAATCAA AAGTTAAACTTTCTAAGAAATGTTCTACATTTCCAACAGTTTGAAGATATCACAGAAAGAGAGAAGCGT GTAACTAAATGGATTTCTAGACAAGAAAATAGTGATGTTCCTCTTGTATATCAAGATGAGCTAATTCAAA TTCGTGAATTAATGTATAAACCCTATAAAGATTGGGTTGCCTTTTTAAAACAACTCCATAAACGGCTAGA AGTCGAGATTGGCAAAGAGGTTAAGCATTGGCGAAAATCATTAAGTGACGGGAGAAAAGGTCTTTACG GAATCTCCCTAAAAAATATTGATGAAATTGATCGAACAAGGAAATTCCTTTTAAGATGGAGCTTACGTC CAACAGAACCTGGGGAAGTAAGACGCTTGGAACCAGGACAGCGTTTTGCGATTGATCAATTAAACCACC TAAATGCATTAAAAGAAGATCGATTAAAAAAGATGGCAAATACGATTATCATGCATGCCTTAGGTTACT GTTATGATGTAAGAAAGAAAAAGTGGCAGGCAAAAAATCCAGCATGTCAAATTATTTTATTTGAAGATT TATCTAACTACAATCCTTACGAGGAAAGGTCCCGTTTTGAAAACTCAAAACTGATGAAGTGGTCACGGA GAGAAATTCCACGACAAGTCGCCTTACAAGGTGAAATTTACGGATTACAAGTTGGGGAAGTAGGTGCCC AATTCAGTTCAAATTCCATGCGAAAACCGGGTCGCCGGGAATTCGTTGCAGTGTTGTAACGAAAGAAA AATTGCAGGATAATCGCTTTTTTAAAAATTTACAAAGAGAAGGACGACTTACTCTTGATAAAATCGCAG TTTTAAAAGAAGGAGACTTATATCCAGATAAAGGTGGAGAAAAGTTTATTTCTTTATCAAAGGATCGAA AGTTGGTAACTACGCATGCTGATATTAACGCGGCCCAAAATTTACAGAAGCGTTTTTGGACAAGAACAC US 10 , 435 , 714 B2 91 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence ATGGATTTTATAAAGTTTACTGCAAAGCCTATCAGGTTGATGGACAAACTGTTTATATTCCGGAGAGCAA GGACCAAAAACAAAAAATAATTGAAGAATTTGGGGAAGGCTATTTTATTTTAAAAGATGGTGTATATGA ATGGGGTAATGCGGGGAAACTAAAAATTAAAAAAGGTTCCTCTAAACAATCATCGAGTGAATTAGTAGA TTCGGACATACTGAAAATTCATTTGATTTAGCAAGTGAACTTAAGGGAGAGAAACTCATGTTATATCG AGATCCGAGTGGAAACGTATTTCCTTCCGACAAGTGGATGGCAGCAGGAGTATTTTTTGGCAAATTAGA AAGAATATTGATTTCTAAGTTAACAAATCAATACTCAATATCAACAATAGAAGATGATTCTTCAAAACA ATCAATGTAA SEQ ATGCCCACCCGCACCATCAATCTGAAACTTGTTCTTGGGAAAAATCCTGAAAACGCAACATTGCGACGC ID GCCCTATTTTCGACACACCGTTTGGTTAACCAAGCGACGAAACGTATTGAGGAATTCTTGTTGCTGTGTC NO : GTGGAGAAGCCTACAGAACAGTGGATAATGAGGGGAAGGAAGCCGAGATTCCACGTCATGCAGTCCAA 35 GAAGAAGCTCTTGCCTTTGCCAAAGCTGCTCAACGCCACAACGGCTGTATATCCACCTATGAAGACCAA GAGATTCTTGATGTACTGCGGCAACTGTACGAACGTCTTGTTCCTTCGGTCAACGAAAACAACGAGGCA GGCGATGCTCAAGCTGCTAACGCCTGGGTCAGTCCGCTCATGT CGGCAGAAAGCGAAGGAGGCTTGTCG GTCTACGACAAGGTGCTTGATCCACCGCCGGTTTGGATGAAGCTTAAAGAAGAAAAGGCTCCAGGATGG GAAGCCGCTTCTCAAATTTGGATTCAGAGTGATGAGGGACAGT CGTTACTTAATAAGCCAGGTAGCCCT CCCCGCTGGATTCGAAAACTGCGATCTGGGCAACCGTGGCAAGATGATTTCGTCAGTGACCAAAAGAAA AAGCAAGATGAGCTGACCAAAGGGAACGCACCACTTATAAAACAACTCAAAGAAATGGGGTTGTTGCC TCTTGTTAACCCATTTTTTAGACATCTTCTTGACCCTGAAGGTAAAGGCGTGAGTCCATGGGACCGTCTT GCTGTACGCGCTGCAGTGGCTCACTTTATCTCCTGGGAAAGTTGGAATCATAGAACACGTGCAGAATAC AATTCCTTGAAACTACGGCGAGACGAGTTTGAGGCAGCATCCGACGAATTCAAAGACGATTTTACTTTG CTCCGACAATATGAAGCCAAACGCCATAGTACATTGAAAAGCATCGCGCTGGCCGACGATTCGAACCCT TACCGGATTGGAGTACGTTCTCTGCGTGCCTGGAACCGCGTTCGTGAAGAATGGATAGACAAGGGTGCA ACAGAAGAACAACGCGTGACCATATTGTCAAAGCTTCAAACACAACTTCGGGGAAAATTCGGCGATCCC GATCTGTTCAACTGGCTAGCTCAGGATAGGCATGTCCATTTGTGGTCTCCTCGGGACAGCGTGACACCAT TGGTTCGCATCAATGCGGTAGATAAAGTTCTGCGTCGACGAAAACCGTATGCATTGATGACCTTTGCCCA TCCCCGCTTCCACCCTCGATGGATACTGTACGAGGCTCCAGGAGGAAGCAATCTCCGTCAATATGCATTG GATTGTACAGAAAACGCTCTACACATCACGTTGCCTTTGCTTGTCGACGATGCGCACGGAACCTGGATTG AAAAAAAGATCAGGGTGCCGCTGGCACCATCCGGACAAATTCAAATTTAACTCTGGAAAAACTTGAGA AGAAAAAAAATCGTTTATACTACCGTTCCGGTTTTCAGCAGTTTGCCGGCTTGGCTGGCGGAGCTGAGGT TCTTTTCCACAGACCCTATATGGAACACGACGAACGCAGCGAGGAGTCTCTTTTGGAACGTCCGGGAGC CGTTTGGTTCAAATTGACCCTGGATGTGGCAACACAGGCTCCCCCGAACTGGCTTGATGGTAAGGGCCG TGTCCGTACACCGCCGGAGGTACATCATTTTAAAACCGCATTGTCGAATAAAAGCAAACATACACGTAC GCTGCAGCCGGGTCTCCGTGTCTTGTCAGTAGACTTGGGCATGCGAACATTCGCCTCCTGCTCAGTATTT GAACTCATCGAGGGAAAGCCTGAGACAGGCCGTGCCTTCCCTGTTGCCGATGAGAGATCAATGGACAGC CCGAATAAACTGTGGGCCAAGCATGAACGTAGTTTTAAACTGACGCTCCCCGGCGAAACCCCTTCTCGA AAGGAAGAGGAAGAGCGTAGCATAGCAAGAGCGGAAATTTATGCACTGAAACGCGACATACAACGCCT CAAAAGCCTACTCCGCTTAGGTGAAGAAGATAACGATAACCGT CGTGATGCATTGCTTGAACAGTTCTTT AAAGGATGGGGAGAAGAAGACGTTGTGCCTGGACAAGCGTTTCCACGCTCTCTTTTCCAAGGGTTGGGA GCTGCCCCGTTTCGCTCAACTCCAGAGTTATGGCGTCAGCATTGCCAAACATATTATGACAAAGCGGAA GCCTGTCTGGCTAAACATATCAGTGATTGGCGCAAGCGAACTCGTCCCCGTCCGACATCGCGGGAGATG TGGTACAAAACACGTTCCTATCATGGCGGCAAGTCCATTTGGATGTTGGAATATCTTGATGCCGTTCGAA AACTGCTTCTCAGTTGGAGCTTACGTGGTCGTACTTACGGTGCCATTAATCGCCAGGATACAGCCCGGTT TGGTTCTTTGGCATCACGGCTGCTCCACCATATCAATTCCCTAAAGGAAGACCGCATCAAAACAGGAGC CGACTCTATCGTTCAGGCTGCTCGCGGGTATATTCCTCTCCCT CATGGCAAGGGTTGGGAACAAAGATAT GAGCCTTGTCAGCTCATATTATTTGAAGACCTCGCACGATATCGCTTTCGCGTGGATCGACCTCGTCGAG AGAACAGCCAACTCATGCAGTGGAACCATCGAGCCATCGTGGCAGAAACAACGATGCAAGCCGAACTC TACGGACAAATTGTCGAAAATACTGCAGCGGGGTTCAGCAGTCGTTTTCACGCGGCGACAGGTGCCCCC GGTGTACGTTGTCGTTTTCTTCTAGAAAGAGACTTTGATAACGATTTGCCCAAACCGTACCTTCTCAGGG AACTTTCTTGGATGCTCGGCAATACAAAAGTCGAGTCTGAAGAAGAAAAGCTTCGATTGCTGTCTGAAA AAATCAGGCCAGGCAGTCTTGTTCCTTGGGATGGAGGCGAACAGTTCGCTACCCTGCATCCCAAAAGAC AAACACTTTGCGTCATTCATGCCGATATGAATGCTGCCCAAAATTTACAACGCCGGTTTTTCGGTCGATG CGGCGAGGCCTTTCGGCTTGTTTGTCAACCCCACGGTGACGACGTGTTACGACTCGCATCCACCCCAGGA GCTCGTCTTCTTGGAGCCCTGCAGCAGCTTGAAAATGGACAAGGAGCTTTCGAGTTGGTTCGAGACATG GGGTCAACAAGTCAAATGAACCGGTTCGTCATGAAGTCTTTGGGAAAAAAGAAAATAAAACCCCTTCAG GACAACAATGGAGACGACGAGCTTGAAGACGTGTTGTCCGTACTCCCGGAGGAAGACGACACAGGACG TATCACAGTCTTCCGCGATTCATCAGGAATCTTTTTTCCTTGCAACGTCTGGATACCGGCCAAACAGTTTT GGCCAGCAGTACGCGCCATGATTTGGAAGGTCATGGCTTCCCATTCTTTGGGGTGA SEQ ATGACAAAGTTAAGACACCGACAGAAAAAATTAACACACGACTGGGCTGGCTCCAAAAAGAGGGAAGT ID ATTAGGCTCAAATGGCAAGCTTCAGAATCCGTTGTTAATGCCGGTTAAAAAAGGTCAGGTTACTGAGTT NO : CCGGAAAGCGTTTTCTGCGTATGCTCGCGCAACGAAAGGAGAAATGACTGACGGCCGAAAGAATATGTT 36 TACGCATAGTTTCGAGCCATTTAAGACAAAGCCCTCGCTTCAT CAGTGTGAATTGGCAGATAAAGCATAT CAATCTTTACATTCGTATCTGCCTGGTTCTCTTGCTCATTTTCTATTATCTGCTCACGCATTAGGTTTTCGT ATTTTTTCAAAATCTGGTGAAGCAACTGCATTCCAGGCATCCTCTAAAATTGAAGCTTACGAATCAAAAT TGGCAAGCGAATTAGCTTGTGTAGATTTATCTATTCAAAACTTGACTATTTCAACGCTTTTTAATGCGCTT ACAACGTCTGTAAGAGGGAAGGGCGAAGAAACTAGCGCTGACCCCTTAATTGCACGATTTTACACCTTA CTTACTGGCAAGCCTCTGTCTCGAGACACTCAAGGGCCTGAACGTGATTTAGCAGAAGTTATCTCGCGTA AGATAGCTAGTTCTTTTGGCACATGGAAAGAAATGACGGCAAACCCTCTTCAGTCATTACAATTTTTTGA AGAGGAACTCCATGCGCTGGATGCCAATGTCTCGCTCTCACCCGCCTTCGACGTTTTAATTAAAATGAAT GATTTGCAGGGCGATTTAAAAAATCGAACCATTGTTTTTGATCCTGACGCCCCTGTTTTTGAATATAACG CAGAAGACCCTGCCGACATAATTATTAAACTTACAGCTCGTTACGCTAAAGAAGCTGTCATCAAAAATC CCTACCAAATTAccTTAAAAAcccTATTACTAccACAAATaccAATcc?c??ac??a???????? ACAAAGGTTTGTCGTTACTCCCTGTCTCGACCGATGACGAATTGCTAGAGTTTATTGGCGTTGAACGATC TCATCCCTCATGCCATGCCTTAATTGAATTGATTGCACAATTAGAAGCCCCCGAGCTCTTTGAGAAGAAC GTATTTTCAGATACTCGTTCTGAAGTTCAAGGTATGATTGATT CAGCTGTTTCTAATCATATTGCTCGTCT US 10 , 435 , 714 B2 93 94 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence TTCCAGCTCTAGAAATAGCTTGTCAATGGATAGTGAAGAATTAGAACGTTTAATCAAAAGCTTTCAGAT ACACACACCTCATTGCTCACTTTTTATTGGCGCCCAATCACTTTCACAGCATTAGAATCTTTGCCTGAA GCCCTTCAATCGGGCGTTAATTCAGCCGATATTTTACTAGGCTCTACTCAATATATGCTCACCAATTCTTT GGTTGAAGAGTCAATTGCAACTTATCAAAGAACACTTAATCGCATCAATTACTTGTCAGGTGTTGCAGGT CAGATTAACGGCGCAATAAAGCGAAAAGCGATAGATGGAGAAAAAATTCACTTGCCTGCAGCTTGGTC AGAGTTGATATCTTTACCATTTATAGGCCAGCCTGTTATAGATGTTGAAAGCGATTTAGCTCATCTAAAA AATCAATACCAAACACTTTCAAATGAGTTTGATACTCTTATATCTGCTTTGCAAAAGAATTTTGATTTGA ACTTTAATAAAGCGCTCCTTAATCGTACTCAGCATTTTGAAGCCATGTGTAGAAGCACTAAGAAAAACG CTTTATCCAAACCAGAGATCGTTTCCTATCGCGACCTGCTTGC TCGATTAACTTCTTGTTTGTATCGAGGC TGAAGTGTTAAAAAAACATAAAATATTTGAGTCAAACAGCGAAC TTCGTGAACATGTTCATGAAAGAAAGCATTTCGTGTTTGTTAGTCCTCTAGATCGCAAAGCCAAGAAACT CCTTCGATTAACTGATTCGCGTCCAGACTTGTTACATGTTATTGATGAAATATTGCAGCACGATAATCTT GAAAACAAAGACCGCGAGTCACTTTGGCTAGTTCGCTCTGGTTATTTGCTTGCAGGACTTCCAGATCAAC TTTCTTCATCTTTTATTAACTTGCCTATCATTACTCAAAAAGGAGATAGACGCCTTATAGACCTGATTCAG TATGATCAAATTAATCGTGATGCTTTTGTTATGTTAGTGACCTCTGCATTCAAGTCTAATTTGTCTGGTCT GCAGTATCGTGCCAATAAGCAATCGTTCGTTGTTACTCGCACGCTAAGCCCTTATCTCGGCTCAAAACTT GTCTACGTACCCAAGGATAAAGATTGGTTAGTTCCTTCTCAAATGTTTGAAGGACGATTTGCTGACATTC TTCAATCAGATTATATGGTCTGGAAAGATGCCGGTCGTCTTTGTGTTATTGATACTGCAAAACACCTTTC TAATATAAAGAAGTCTGTATTTTCATCCGAAGAAGTTCTCGCTTTTTTAAGAGAACTCCCTCACCGCACA TTTATCCAGACCGAAGTTCGCGGCCTTGGCGTTAATGTCGATGGAATTGCATTTAATAATGGTGATATTC CGTCATTAAAAACCTTTTCAAATTGCGTTCAGGTAAAAGTTTCTCGGACTAATACATCCCTAGTTCAAAC ACTTAATCGTTGGTTTGAAGGAGGAAAAGTTTCTCCTCCGAGCATTCAATTTGAACGGGCGTATTATAAA AAAGACGATCAAATTCATGAAGACGCAGCGAAAAGAAAGATACGATTCCAGATGCCCGCAACTGAGTT GGTTCATGCTTCTGACGATGCGGGGTGGACACCAAGTTATTTGCTCGGCATTGATCCTGGCGAGTATGGA ATGGGTCTTTCATTGGTTTCGATTAATAACGGAGAAGTCTTAGATTCAGGCTTTATTCATATTAATTCTCT GATCAATTTTGCCTCTAAAAAGAGCAACCATCAAACTAAGGTTGTTCCGCGTCAGCAGTACAAATCTCCT TATGCAAATTATTTAGAACAATCTAAAGATTCTGCTGCTGGTGATATTGCGCATATACTCGATCGACTTA TATACAAATTAAATGCGTTGCCTGTTTTTGAGGCTCTTTCAGGTAATTCTCAGAGTGCTGCTGATCAAGTT TGGACGAAAGTCTTATCGTTTTACACTTGGGGTGATAATGACGCTCAGAATTCTATTAGAAAGCAGCATT GGTTTGGAGCCAGTCATTGGGATATCAAAGGTATGTTAAGGCAACCCCCTACGGAGAAGAAGCCTAAAC CGTATATTGCTTTTCCTGGCTCTCAGGTTTCTTCGTATGGTAATTCCCAACGTTGCTCTTGCTGCGGTCGC AATCCTATTGAACAACTTCGAGAAATGGCAAAGGATACCTCTATTAAAGAGCTAAAAATTCGCAATTCT GAGATACAGCTTTTTGACGGAACCATTAAATTATTTAATCCAGACCCATCCACTGTGATAGAGAGAAGG CGACATAATCTTGGTCCATCAAGAATTCCTGTTGCTGACCGTACTTTCAAAAACATCAGTCCATCAAGTC TAGAATTTAAAGAATTGATTACTATCGTGTCTCGATCTATCCGTCATTCACCTGAGTTTATCGCTAA ACGCGGCATAGGGTCTGAGTATTTTTGCGCTTATTCCGATTGCAACTCATCCTTAAATTCTGAAGCTAAC GCAGCTGCTAACGTAGCGCAAAAATTTCAAAAACAGTTATTTTTTGAGTTATAA SEO ATGAAGAGAATTCTGAACAGTCTGAAAGTTGCTGCCTTGAGACTTCTGTTTCGAGGCAAAGGTTCTGAAT ID TAGTGAAGACAGTCAAATATCCATTGGTTTCCCCGGTTCAAGGCGCGGTTGAAGAACTTGCTGAAGCAA NO : TTCGGCACGACAACCTGCACCTTTTTGGGCAGAAGGAAATAGTGGATCTTATGGAGAAAGACGAAGGAA 37 CCCAGGTGTATTCGGTTGTGGATTTTTGGTTGGATACCCTGCGTTTAGGGATGTTTTTCTCACCATCAGCG AATGCGTTGAAAATCACGCTGGGAAAATTCAATTCTGATCAGGTTTCACCTTTTCGTAAGGTTTTGGAGC AGTCACCTTTTTTTCTTGCGGGTCGCTTGAAGGTTGAACCTGCGGAAAGGATACTTTCTGTTGAAATCAG AAAGATTGGTAAAAGAGAAAACAGAGTTGAGAACTATGCCGCCGATGTGGAGACATGCTTCATTGGTCA GCTTTCTTCAGATGAGAAACAGAGTATCCAGAAGCTGGCAAATGATATCTGGGATAGCAAGGATCATGA GGAACAGAGAATGTTGAAGGCGGATTTTTTTGCTATACCTCTTATAAAAGACCCCAAAGCTGTCACAGA AGAAGATCCTGAAAATGAAACGGCGGGAAAACAGAAACCGCTTGAATTATGTGTTTGTCTTGTTCCTGA GTTGTATACCCGAGGTTTCGGCTCCATTGCTGATTTTCTGGTT CAGCGACTTACCTTGCTGCGTGACAAA ATGAGTACCGACACGGCGGAAGATTGCCTCGAGTATGTTGGCATTGAGGAAGAAAAAGGCAATGGAAT GAATTCCTTGCTCGGCACTTTTTTGAAGAACCTGCAGGGTGATGGTTTTGAACAGATTTTTCAGTTTATGC TTGGGTCTTATGTTGGCTGGCAGGGGAAGGAAGATGTACTGCGCGAACGATTGGATTTGCTGGCCGAAA AAGTCAAAAGATTACCAAAGCCAAAATTTGCCGGAGAATGGAGTGGTCATCGTATGTTTCTCCATGGTC AGCTGAAAAGCTGGTCGTCGAATTTCTTCCGTCTTTTTAATGAGACGCGGGAACTTCTGGAAAGTATCAA GAGTGATATTCAACATGCCACCATGCTCATTAGCTATGTGGAAGAGAAAGGAGGCTATCATCCACAGCT GTTGAGTCAGTATCGGAAGTTAATGGAACAATTACCGGCGTTGCGGACTAAGGTTTTGGATCCTGAGAT TGAGATGACGCATATGTCCGAGGCTGTTCGAAGTTACATTATGATACACAAGTCTGTAGCGGGATTTCTG CCGGATTTACTCGAGTCTTTGGATCGAGATAAGGATAGGGAATTTTTGCTTTCCATCTTTCCTCGTATTCC AAAGATAGATAAGAAGACGAAAGAGATCGTTGCATGGGAGCTACCGGGCGAGCCAGAGGAAGGCTATT TGTTCACAGCAAACAACCTTTTCCGGAATTTTCTTGAGAATCCGAAACATGTGCCACGATTTATGGCAGA GAGGATTCCCGAGGATTGGACGCGTTTGCGCTCGGCCCCTGTGTGGTTTGATGGGATGGTGAAGCAATG GCAGAAGGTGGTGAATCAGTTGGTTGAATCTCCAGGCGCCCTTTATCAGTTCAATGAAAGTTTTTTGCGT CAAAGACTGCAAGCAATGCTTACGGTCTATAAGCGGGATCTCCAGACTGAGAAGTTTCTGAAGCTGCTG GCTGATGTCTGTCGTCCACTCGTTGATTTTTTCGGACTTGGAGGAAATGATATTATCTTCAAGTCATGTCA GGATCCAAGAAAGCAATGGCAGACTGTTATTCCACTCAGTGTCCCAGCGGATGTTTATACAGCATGTGA AGGCTTGGCTATTCGTCTCCGCGAAACTCTTGGATTCGAATGGAAAAATCTGAAAGGACACGAGCGGGA AGATTTTTTACGGCTGCATCAGTTGCTGGGAAATCTGCTGTTC TGGATCAGGGATGCGAAACTTGTCGTG AAGCTGGAAGACTGGATGAACAATCCTTGTGTTCAGGAGTATGTGGAAGCACGAAAAGCCATTGATCTT CCCTTGGAGATTTTCGGATTTGAGGTGCCGATTTTTCTCAATGGCTATCTCTTTTCGGAACTGCGCCAGCT GGAATTGTTGCTGAGGCGTAAGTCGGTGATGACGTCTTACAGCGTCAAAACGACAGGCTCGCCAAATAG GCTCTTCCAGTTGGTTTACCTACCTCTAAACCCTTCAGATCCGGAAAAGAAAAATTCCAACAACTTTCAG GTTTGTCGCGTCGTTTTCTGGATCTTACGCTGGATGCATTTGCTGGCA AACTCTTGACGGATCCGGTAACTCAGGAACTGAAGACGATGGCCGGTTTTTACGATCATCT CAAGTTGCCGTGTAAACTGGCGGCGATGAGTAACCATCCAGGATCCTCTTCCAAAATGGTGGTTCTGGC AAAACCAAAGAAGGGTGTTGCTAGTAACATCGGCTTTGAACCTATTCCCGATCCTGCTCATCCTGTGTTC CGGGTGAGAAGTTCCTGGCCGGAGTTGAAGTACCTGGAGGGGTTGTTGTATCTTCCCGAAGATACACCA US 10 , 435 , 714 B2 95 96 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence CTGACCATTGAACTGGCGGAAACGTCGGTCAGTTGTCAGTCTGTGAGTTCAGTCGCTTTCGATTTGAAGA ATCTGACGACTATCTTGGGTCGTGTTGGTGAATTCAGGGTGACGGCAGATCAACCTTTCAAGCTGACGCC CATTATTCCTGAGAAAGAGGAATCCTTCATCGGGAAGACCTACCTCGGTCTTGATGCTGGAGAGCGATC TGGCGTTGGTTTCGCGATTGTGACGGTTGACGGCGATGGGTATGAGGTGCAGAGGTTGGGTGTGCATGA AGATACTCAGCTTATGGCGCTTCAGCAAGTCGCCAGCAAGTCTCTTAAGGAGCCGGTTTTCCAGCCACTC CGTAAGGGCACATTTCGTCAGCAGGAGCGCATTCGCAAAAGCCTCCGCGGTTGCTACTGGAATTTCTATC ATGCATTGATGATCAAGTACCGAGCTAAAGTTGTGCATGAGGAATCGGTGGGTTCATCCGGTCTGGTGG GGCAGTGGCTGCGTGCATTTCAGAAGGATCTCAAAAAGGCTGATGTTCTGCCCAAGAAGGGTGGAAAAA ATGGTGTAGACAAAAAAAAGAGAGAAAGCAGCGCTCAGGATACCTTATGGGGAGGAGCTTTCTCGAAG AAGGAAGAGCAGCAGATAGCCTTTGAGGTTCAGGCAGCTGGAT CAAGCCAGTTTTGTCTGAAGTGTGGT TGGTGGTTTCAGTTGGGGATGCGGGAAGTAAATCGTGTGCAGGAGAGTGGCGTGGTGCTGGACTGGAAC CGGTCCATTGTAACCTTCCTCATCGAATCCTCAGGAGAAAAGGTATATGGTTTCAGTCCTCAGCAACTGG AAAAAGGCTTTCGTCCTGACATCGAAACGTTCAAAAAAATGGTAAGGGATTTTATGAGACCCCCCATGT TTGATCGCAAAGGTCGGCCGGCCGCGGCGTATGAAAGATTCGTACTGGGACGTCGTCACCGTCGTTATC GCTTTGATAAAGTTTTTGAAGAGAGATTTGGTCGCAGTGCTCTTTTCATCTGCCCGCGGGTCGGGTGTGG GAATTTCGATCACTCCAGTGAGCAGTCAGCCGTTGTCCTTGCCCTTATTGGTTACATTGCTGATAAGGAA GGGATGAGTGGTAAGAAGCTTGTTTATGTGAGGCTGGCTGAACTTATGGCTGAGTGGAAGCTGAAGAAA CTGGAGAGATCAAGGGTGGAAGAACAGAGCTCGGCACAATAA SEQ ATGGCAGAAAGCAAGCAGATGCAATGCCGCAAGTGCGGCGCAAGCATGAAGTATGAAGTAATTGGATT ID GGGCAAGAAGTCATGCAGATATATGTGCCCAGATTGCGGCAAT CACACCAGCGCGCGCAAGATTCAA NO : ACAAGAAAAAGCGCGACAAAAAGTATGGATCCGCAAGCAAAGCGCAGAGCCAGAGGATAGCTGTGGCT 38 GGCGCGCTTTATCCAGACAAAAAAGTGCAGACCATAAAGACCTACAAATACCCAGCGGATCTTAATGGC GAAGTTCATGACAGCGGCGTCGCAGAGAAGATTGCGCAGGCGATTCAGGAAGATGAGATCGGCCTGCTT GGCCCGTCCAGCGAATACGCTTGCTGGATTGCTTCACAAAAACAGAGCGAGCCGTATTCAGTTGTAGAT TTTTGGTTTGACGCGGTGTGCGCAGGCGGAGTATTCGCGTATTCTGGCGCGCGCCTGCTTTCCACAGTCC TCCAGTTGAGTGGCGAGGAAAGCGTTTTGCGCGCTGCTTTAGCATCTAGCCCGTTTGTAGATGACATTAA TTTGGCGCAAGCGGAAAAGTTCCTAGCCGTTAGCCGGCGCACAGGCCAAGATAAGCTAGGCAAGCGCAT TGGAGAATGTTTTGCGGAAGGCCGGCTTGAAGCGCTTGGCATCAAAGATCGCATGCGCGAATTCGTGCA AGCGATTGATGTGGCCCAAACCGCGGGCCAGCGGTTCGCGGCCAAGCTAAAGATATTCGGCATCAGTCA GATGCCTGAAGCCAAGCAATGGAACAATGATTCCGGGCTCACTGTATGTATTTTGCCGGATTATTATGTC CCGGAAGAAAACCGCGCGGACCAGCTGGTTGTTTTGCTTCGGCGCTTACGCGAGATCGCGTATTGCATG GGAATTGAGGATGAAGCAGGATTTGAGCATCTAGGCATTGACCCTGGTGCTCTTTCCAATTTTTCCAATG GCAATCCAAAGCGAGGATTTCTCGGCCGCCTGCTCAATAATGACATTATAGCGCTGGCAAACAACATGT CAGCCATGACGCCGTATTGGGAAGGCAGAAAAGGCGAGTTGAT TGAGCGCCTTGCATGGCTTAAACATC GCGCTGAAGGATTGTATTTGAAAGAGCCACATTTCGGCAACTCCTGGGCAGACCACCGCAGCAGGATTT TCAGTCGCATTGCGGGCTGGCTTTCCGGATGCGCGGGCAAGCT CAAGATTGCCAAGGATCAGATTTCAG GCGTGCGTACGGATTTGTTTCTGCTCAAGCGCCTTCTGGATGCGGTACCGCAAAGCGCGCCGTCGCCGGA CTTTATTGCTTCCATCAGCGCGCTGGATCGGTTTTTGGAAGCGGCAGAAAGCAGCCAGGATCCGGCAGA ACAGGTACGCGCTTTGTACGCGTTTCATCTGAACGCGCCTGCGGTCCGATCCATCGCCAACAAGGCGGT ACAGAGGTCTGATTCCCAGGAGTGGCTTATCAAGGAACTGGATGCTGTAGATCACCTTGAATTCAACAA AGCATTTCCGTTTTTTTCGGATACAGGAAAGAAAAAGAAGAAAGGAGCGAATAGCAACGGAGCGCCTTC TGAAGAAGAATACACGGAAACAGAATCCATTCAACAACCAGAAGATGCAGAGCAGGAAGTGAATGGTC AAGAAGGAAATGGCGCTTCAAAGAACCAGAAAAAGTTTCAGCGCATTCCTCGATTTTTCGGGGAAGGGT CAAGGAGTGAGTATCGAATTTTAACAGAAGCGCCGCAATATTTTGACATGTTCTGCAATAATATGCGCG CGATCTTTATGCAGCTAGAGAGTCAGCCGCGCAAGGCGCCTCGTGATTTCAAATGCTTTCTGCAGAATCG TTTGCAGAAGCTTTACAAGCAAACCTTTCTCAATGCTCGCAGTAATAAATGCCGCGCGCTTCTGGAATCC GTCCTTATTTCATGGGGAGAATTTTATACTTATGGCGCGAATGAAAAGAAGTTTCGTCTGCGCCATGAAG CGAGCGAGCGCAGCTCGGATCCGGACTATGTGGTTCAGCAGGCATTGGAAATCGCGCGCCGGCTTTTCT TGTTCGGATTTGAGTGGCGCGATTGCTCTGCTGGAGAGCGCGTGGATTTGGTTGAAATCCACAAAAAAG CAATCTCATTTTTGCTTGCAATCACTCAGGCCGAGGTTTCAGTTGGTTCCTATAACTGGCTTGGGAATAG CACCGTGAGCCGGTATCTTTCGGTTGCTGGCACAGACACATTGTACGGCACTCAACTGGAGGAGTTTTTG AACGCCACAGTGCTTTCACAGATGCGTGGGCTGGCGATTCGGCTTTCATCTCAGGAGTTAAAAGACGGA TTTGATGTTCAGTTGGAGAGTTCGTGCCAGGACAATCTCCAGCATCTGCTGGTGTATCGCGCTTCGCGCG ACTTGGCTGCGTGCAAACGCGCTACATGCCCGGCTGAATTGGATCCGAAAATTCTTGTTCTGCCGGTTGG TGCGTTTATCGCGAGCGTAATGAAAATGATTGAGCGTGGCGATGAACCATTAGCAGGCGCGTATTTGCG TCATCGGCCGCATTCATTCGGCTGGCAGATACGGGTTCGTGGAGTGGCGGAAGTAGGCATGGATCAGGG CACAGCGCTAGCATTCCAGAAGCCGACTGAATCAGAGCCGTTTAAAATAAAGCCGTTTTCCGCTCAATA CGGCCCAGTACTTTGGCTTAATTCTTCATCCTATAGCCAGAGCCAGTATCTGGATGGATTTTTAAGCCAG CCAAAGAATTGGTCTATGCGGGTGCTACCTCAAGCCGGATCAGTGCGCGTGGAACAGCGCGTTGCTCTG ATATGGAATTTGCAGGCAGGCAAGATGCGGCTGGAGCGCTCTGGAGCGCGCGCGTTTTTCATGCCAGTG CCATTCAGCTTCAGGCCGTCTGGTTCAGGAGATGAAGCAGTATTGGCGCCGAATCGGTACTTGGGACTTT TTCCGCATTCCGGAGGAATAGAATACGCGGTGGTGGATGTATTAGATTCCGCGGGTTTCAAAATTCTTGA GCGCGGTACGATTGCGGTAAATGGCTTTTCCCAGAAGCGCGGCGAACGCCAAGAGGAGGCACACAGAG AAAAACAGAGACGCGGAATTTCTGATATAGGCCGCAAGAAGCCGGTGCAAGCTGAAGTTGACGCAGCC AATGAATTGCACCGCAAATACACCGATGTTGCCACTCGTTTAGGGTGCAGAATTGTGGTTCAGTGGGCG CCCCAGCCAAAGCCGGGCACAGCGCCGACCGCGCAAACAGTATGCCGACCGCGCAAACAGTATACGCGCGCGCAGTGCGGACCGAAGC GCCGCGATCTGGAAATCAAGAGGATCATGCTCGTATGAAATCCTCTTGGGGATATACCTGGGGCACCTA TTGGGAGAAGCGCAAACCAGAGGATATTTTGGGCATCTCAACCCAAGTATACTGGACCGGCGGTATAGG CGAGTCATGTCCCGCAGTCGCGGTTGCGCTTTTGGGGCACATTAGGGCAACATCCACTCAAACTGAATG GGAAAAAGAGGAGGTTGTATTCGGTCGACTGAAGAAGTTCTTTCCAAGCTAG SEQ ATGGAAAAGAGAATAAACAAGATACGAAAGAAACTATCGGCCGATAATGCCACAAAGCCTGTGAGCAG ID GAGCGGCCCCATGAAAACACTCCTTGTCCGGGTCATGACGGACGACTTGAAAAAAAGACTGGAGAAGC NO : GTCGGAAAAAGCCGGAAGTTATGCCGCAGGTTATTTCAAATAACGCAGCAAACAATCTTAGAATGCTCC 39 TTGATGACTATACAAAGATGAAGGAGGCGATACTACAAGTTTACTGGCAGGAATTTAAGGACGACCATG US 10 , 435 , 714 B2 97 98 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence TGGGCTTGATGTGCAAATTTGCCCAGCCTGCTTCCAAAAAAATTGACCAGAACAAACTAAAACCGGAAA TGGATGAAAAAGGAAATCTAACAACTGCCGGTTTTGCATGTTCTCAATGCGGTCAGCCGCTATTTGTTTA TAAGCTTGAACAGGTGAGTGAAAAAGGCAAGGCTTATACAAATTACTTCGGCCGGTGTAATGTGGCCGA GCATGAGAAATTGATTCTTCTTGCTCAATTAAAACCTGAAAAAGACAGTGACGAAGCAGTGACATACTC CCTTGGCAAATTCGGCCAGAGGGCATTGGACTTTTATTCAATCCACGTAACAAAAGAATCCACCCATCC AGTAAAGCCCCTGGCACAGATTGCGGGCAACCGCTATGCAAGCGGACCTGTTGGCAAGGCCCTTTCCGA TGCCTGTATGGGCACTATAGCCAGTTTTCTTTCGAAATATCAAGACATCATCATAGAACATCAAAAGGTT GTGAAGGGTAATCAAAAGAGGTTAGAGAGTCTCAGGGAATTGGCAGGGAAAGAAAATCTTGAGTACCC ATCGGTTACACTGCCGCCGCAGCCGCATACGAAAGAAGGGGTTGACGCTTATAACGAAGTTATTGCAAG GGTACGTATGTGGGTTAATCTTAATCTGTGGCAAAAGCTGAAGCTCAGCCGTGATGACGCAAAACCGCT ACTGCGGCTAAAAGGATTCCCATCTTTCCCTGTTGTGGAGCGGCGTGAAAACGAAGTTGACTGGTGGAA TACGATTAATGAAGTAAAAAAACTGATTGACGCTAAACGAGATATGGGACGGGTATTCTGGAGCGGCGT TACCGCAGAAAAGAGAAATACCATCCTTGAAGGATACAACTATCTGCCAAATGAGAATGACCATAAAA AGAGAGAGGGCAGTTTGGAAAACCCTAAGAAGCCTGCCAAACGCCAGTTTGGAGACCTCTTGCTGTATC TTGAAAAGAAATATGCCGGAGACTGGGGAAAGGTCTTCGATGAGGCATGGGAGAGGATAGATAAGAAA ATAGCCGGACTCACAAGCCATATAGAGCGCGAAGAAGCAAGAAACGCGGAAGACGCTCAATCCAAAGC CGTACTTACAGACTGGCTAAGGGCAAAGGCATCATTTGTTCTTGAAAGACTGAAGGAAATGGATGAAAA GGAATTCTATGCGTGTGAAATCCAACTTCAAAAATGGTATGGCGATCTTCGAGGCAACCCGTTTGCCGTT GAAGCTGAGAATAGAGTTGTTGATATAAGCGGGTTTTCTATCGGAAGCGATGGCCATTCAATCCAATAC AGAAATCTCCTTGCCTGGAAATATCTGGAGAACGGCAAGCGTGAATTCTATCTGTTAATGAATTATGGC AAGAAAGGGCGCATCAGATTTACAGATGGAACAGATATTAAAAAGAGCGGCAAATGGCAGGGACTATT ATATGGCGGTGGCAAGGCAAAGGTTATTGATCTGACTTTCGACCCCGATGATGAACAGTTGATAATCCT GCCGCTGGCCTTTGGCACAAGGCAAGGCCGCGAGTTTATCTGGAACGATTTGCTGAGTCTTGAAACAGG CCTGATAAAGCTCGCAAACGGAAGAGTTATCGAAAAAACAATCTATAACAAAAAAATAGGGCGGGATG AACCGGCTCTATTCGTTGCCTTAACATTTGAGCGCCGGGAAGTTGTTGATCCATCAAATATAAAGCCTGT AAACCTTATAGGCGTTGACCGCGGCGAAAACATCCCGGCGGTTATTGCATTGACAGACCCTGAAGGTTG TCCTTTACCGGAATTCAAGGATTCATCAGGGGGCCCAACAGACATCCTGCGAATAGGAGAAGGATATAA GGAAAAGCAGAGGGCTATTCAGGCAGCAAAGGAGGTAGAGCAAAGGCGGGCTGGCGGTTATTCACGGA AGTTTGCATCCAAGTCGAGGAACCTGGCGGACGACATGGTGAAAATTCAGCGCGAGACCTTTTTTACC ATGCCGTTACCCACGATGCCGTCCTTGTCTTTGAAAACCTGAGCAGGGGTTTTGGAAGGCAGGGCAAAA GGACCTTCATGACGGAAAGACAATATACAAAATGGAAGACTGGCTGACAGCGAAGCTCGCATACGAA GGTCTTACGTCAAAAACCTACCTTTCAAAGACGCTGGCGCAATATACGTCAAAAACATGCTCCAACTGC GGGTTTACTATAACGACTGCCGATTATGACGGGATGTTGGTAAGGCTTAAAAAGACTTCTGATGGATGG GCAACTACCCTCAACAACAAAGAATTAAAAGCCGAAGGCCAGATAACGTATTATAACCGGTATAAAAG GCAAACCGTGGAAAAAGAACTCTCCGCAGAGCTTGACAGGCTTTCAGAAGAGTCGGGCAATAATGATAT TTCTAAGTGGACCAAGGGTCGCCGGGACGAGGCATTATTTTTGTTAAAGAAAAGATTCAGCCATCGGCC TGTTCAGGAACAGTTTGTTTGCCTCGATTGCGGCCATGAAGTCCACGCCGATGAACAGGCAGCCTTGAAT ATTGCAAGGTCATGGCTTTTTCTAAACTCAAATTCAACAGAATTCAAAAGTTATAAATCGGGTAAACAG CCCTTCGTTGGTGCTTGGCAGGCCTTTTACAAAAGGAGGCTTAAAGAGGTATGGAAGCCCAACGCC SEQ ATGAAAAGGATAAATAAAATACGAAGGAGATTGGTAAAGGATAGCAACACGAAAAAAGCCGGCAAAA ID CCGGCCCTATGAAAACCTTGCTCGTTCGGGTTATGACACCTGACCTGAGAGAAAGGTTAGAGAATCTTC NO : GCAAAAAGCCGGAAAACATTCCTCAGCCCATTTCAAATACTTCACGTGCAAATTTAAATAAACTCCTCA 40 CTGACTATACGGAAATGAAGAAAGCAATCCTGCATGTTTATTGGGAAGAGTTCCAAAAAGACCCTGTCG GATTGATGAGCAGGGTTGCACAACCAGCGCCCAAGAATATTGATCAGAGAAAATTGATTCCGGTGAAGG ACGGAAATGAGAGACTAACAAGTTCTGGATTTGCCTGTTCTCAGTGCTGTCAACCCCTCTATGTTTATAA GCTTGAACAAGTGAATGACAAGGGTAAGCCCCATACAAATTACTTTGGCCGTTGTAATGTCTCCGAGCA TGAACGTTTGATATTGCTCTCGCCGCATAAACCGGAGGCAAATGACGAGCTAGTAACGTATTCGTTGGG GAAGTTCGGTCAAAGGGCATTGGACTTTTATTCAATCCACGTAACAAGAGAATCGAACCATCCTGTAAA GCCGCTAGAACAGATCGGTGGCAATAGCTGCGCAAGTGGTCCCGTTGGTAAGGCTTTATCTGATGCCTG TATGGGAGCAGTAGCCAGTTTCCTTACAAAGTACCAGGACATCATCCTCGAACACCAAAAGGTTATAAA AAAAAACGAAAAGAGATTGGCAAATCTAAAGGATATAGCAAGTGCAAACGGGCTTGCATTTCCTAAAA TCACTCTTCCACCGCAACCGCATACAAAAGAAGGGATTGAAGCTTATAACAATGTTGTTGCTCAGATAG TGATCTGGGTAAACCTGAATCTTTGGCAGAAACTCAAAATTGGCAGGGATGAGGCAAAGCCCTTACAGC GGCTTAAGGGTTTTCCGTCCTTCCCTCTTGTTGAACGCCAGGCGAATGAGGTTGATTGGTGGGATATGGT CTGTAATGTCAAAAAGTTGATTAACGAAAAGAAAGAGGACGGGAAGGTCTTCTGGCAAAATCTTGCTGG ATATAAAAGGCAGGAAGCCTTGCTTCCATATCTTTCGTCTGAAGAAGACCGTAAAAAAGGAAAAAAGTT TGCGCGTTATCAGTTTGGTGACCTTTTGCTTCACCTTGAAAAGAAACACGGTGAAGATTGGGGCAAAGTT TATGATGAGGCATGGGAAAGAATAGATAAAAAAGTTGAAGGTC TGAGTAAGCACATAAAGTTGGAGGA AGAAAAAGGTCTGAAGATGCTCAATCAAAGGCTGCCCTCACTGATTGGCTCAGGGCAAAGGCCTCTTT TGTTATTGAAGGGCTCAAAGAAGCTGATAAGGATGAGTTTTGCAGGTGTGAGTTAAAGCTTCAAAAGTG GTATGGAGATTTGAGAGGAAAACCATTTGCTATAGAAGCAGAGAACAGCATTTTAGATATAAGCGGATT TTCTAAACAGTATAATTGTGCATTTATATGGCAGAAAGACGGCGTAAAGAAGTTAAATCTTTATTTAATA ATAAATTACTTCAAAGGTGGTAAGCTACGCTTCAAAAAAATCAAGCCAGAAGCTTTTGAAGCAAATAGG TTTTATACAGTAATTAATAAAAAAAGCGGTGAGATTGTGCCTATGGAGGTCAACTTCAATTTTGATGACC CGAATTTGATAATTCTGCCTTTGGCCTTTGGAAAAAGGCAGGGGAGGGAGTTTATCTGGAACGACCTATT AAACTCGCCAATGGCAGGGTTATTGAAAAAACGCTCTATAACAGAAG GACGAGACAGGATGAACCAGCACTTTTTGTTGCCCTGACATTTGAAAGAAGAGAGGTGCTTGACTCATC GAATATAAAACCGATGAATCTGATAGGAATAGACCGGGGAGAAAATATCCCGGCAGTCATAGCATTAA CAGACCCGGAAGGATGCCCCTTGTCAAGATTCAAAGATTCATTGGGCAATCCAACGCATATTTTGCGAA TAGGAGAAAGTTATAAGGAAAAACAACGGACTATTCAGGCTGCTAAAGAAGTTGAACAAAGGCGGGCA GGCGGATATTCGAGAAAATATGCATCAAAGGCGAAGAATCTGGCGGACGATATGGTAAGAAATACAGC TCGTGACCTCTTATATTATGCTGTTACTCAAGATGCAATGCTCATTTTTGAAAATCTTTCCCGCGGTTTTG GTAGACAAGGCAAGAGGACTTTTATGGCGGAAAGGCAGTACACGAGGATGGAAGACTGGCTGACTGCA AAGCTTGCCTATGAAGGTCTGCCATCAAAAACCTATCTTTCAAAGACTCTGGCACAGTATACCTCAAAG ACATGTTCTAATTGTGGTTTTACAATCACAAGTGCAGATTATGACAGGGTGCTCGAAAAGCTCAAGAAG US 10 , 435 , 714 B2 99 100 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence ACGGCTACTGGATGGATGACTACAATCAATGGAAAAGAGTTAAAAGTTGAAGGACAGATAACATACTA TAACCGGTATAAAAGGCAGAATGTGGTAAAAACCTCTCTGTAGAGCTGGATAGACTTTCGGAAGAGTC GGTAAATAATGATATTTCTAGTTGGACAAAAGGCCGCAGTGGTGAAGCTTTATCTCTGCTAAAAAAGAG ATTTAGTCACAGGCCGGTGCAGGAAAAGTTTGTTTGCCTGAAC TGTGGTTTTGAAACCCATGCAGACGA ACAAGCAGCACTGAATATTGCAAGGTCGTGGCTCTTTCTCCGTTCTCAAGAATATAAGAAGTATCAAAC CAATAAAACGACCGGAAATACTGACAAAAGGGCATTTGTTGAAACATGGCAATCCTTTTACAGAAAGAA GCTCAAAGAAGTATGGAAACCA SEQ ATGGGTAAAATGTATTACCTTGGTTTAGACATTGGCACGAATTCCGTGGGCTACGCGGTGACCGACCCCT ID CATACCACCTGCTGAAGTTTAAGGGGGAACCAATGTGGGGTGCGCACGTATTTGCCGCCGGTAATCAGA NO : GCGCGGAACGACGCTCGTTCCGCACATCGCGTCGT CGTTTGGACCGACGCCAACAGCGCGTTAAACTGG 41 TACAGGAGATTTTTGCCCCGGTGATTAGTCCGATCGACCCACGCTTCTTCATTCGTCTGCATGAATCCGC CCTGTGGCGCGATGACGTCGCGGAGACGGATAAACATATCTTTTTCAATGATCCTACCTATACCGATAAG GAATATTATAGCGATTACCCGACTATCCATCACCTGATCGTTGATCTGATGGAAAGCTCTGAGAAACAC GATCCGCGGCTGGTGTACCTTGCAGTGGCGTGGTTAGTGGCACACCGTGGTCATTTTCTGAACGAGGTGG ACAAGGATAATATTGGAGATGTGTTGTCGTTCGACGCATTTTATCCGGAGTTTCTCGCGTTCCTGTCGGA CAACGGTGTATCACCGTGGGTGTGCGAAAGCAAAGCGCTGCAGGCGACCTTGCTGAGCCGTAACTCAGT GAACGACAAATATAAAGCCCTTAAGTCTCTGATCTTCGGATCCCAGAAACCTGAAGATAACTTCGATGC CAATATTTCGGAAGATGGACTCATTCAACTGCTGGCCGGCAAAAAGGTAAAAGTTAACAAACTGTTCCC TCAGGAATCGAACGATGCATCCTTCACATTGAATGATAAAGAAGACGCGATAGAAGAAATCCTGGGTAC GCTTACACCAGATGAATGTGAATGGATTGCGCATATACGCCGCCTTTTTGACTGGGCTATCATGAAACAT GCTCTGAAAGATGGCAGGACTATTAGCGAGTCAAAAGTCAAAC TGTATGAGCAGCACCATCACGATCTG ACCCAACTTAAATACTTCGTGAAAACCTACCTTGCAAAAGAATACGACGATATTTTCCGCAACGTGGAT AGCGAAACAACGAAAAACTATGTAGCGTATTCCTATCATGTGAAAGAGGTGAAAGGCACTCTGCCTAAA AATAAGGCAACGCAAGAAGAGTTTTGTAAGTATGTCCTGGGCAAGGTTAAAAACATTGAATGCTCTGAA GCAGACAAGGTTGACTTTGATGAGATGATTCAGCGTCTTACCGACAACTCTTTTATGCCTAAGCAGGTTT CGGGCGAAAACCGCGTTATTCCTTATCAGTTATATTATTATGAACTGAAGACAATTCTGAATAAAGCAGC CTCGTACCTGCCTTTCCTGACGCAGTGTGGAAAAGATGCAATTTCGAACCAGGACAAACTACTGTCGATC ATGACGTTCCGTATTCCTTACTTCGTCGGACCCTTGCGAAAAATAATTCGGAACATGCATGGCTCGAAC GAAAGGCCGGTAAGATTTATCCGTGGAACTTTAACGACAAAGTGGACTTGGATAAATCAGAAGAAGCGT TCATTCGCCGAATGACCAATACCTGTACCTATTATCCCGGCGAAGATGTTTTACCGTTGGATTCGCTGAT CTATGAGAAATTTATGATTTTAAATGAAATCAATAATATTCGTATTGACGGCTACCCGATTAGTGTTGAC GTTAAACAGCAGGTTTTTGGCTTGTTCGAAAAAAAACGACGCGTAACCGTGAAAGATATTCAGAACCTG CTGCTGTCTCTCGGAGCTCTGGACAAACACGGGAAGCTGACAGGCATCGATACCACTATCCACTCAAAC TATAATACGTATCACCATTTTAAATCTCTCATGGAACGCGGCGTCCTGACCCGGGATGACGTGGAACGC ATCGTTGAAAGGATGACCTACAGCGACGATACTAAGCGTGTGCGTCTGTGGCTGAATAACAACTATGGT ACTTTAACCGCCGACGATGTGAAACACATTTCGCGTCTGCGCAAACACGATTTTGGCCGTTTATCCAAAA TGTTCTTAACAGGTCTGAAGGGTGTCCATAAGGAGACCGGTGAACGTGCCTCCATACTGGATTTCATGTG GAACACGAACGATAACCTGATGCAGCTCCTTTCCGAATGCTACACGTTCAGTGATGAAATCACAAAGCT GCAAGAGGCGTATTATGCAAAAGCCCAGTTGTCTTTAAACGATTTTTTAGACTCGATGTACATCTCTAAC GCGGTGAAACGTCCGATTTACAGAACTCTGGCAGTGGTGAACGATATTCGAAAAGCATGTGGGACGGCC CCTAAACGCATTTTCATCGAAATGGCTCGTGATGGTGAATCAAAAAAAAAGAGAAGTGTTACACGTCGC GAGCAGATCAAAAACCTGTACCGCTCGATTCGTAAAGATTTCCAGCAGGAAGTTGATTTTCTGGAAAAG ATCCTGGAAAATAAATCTGATGGTCAACTTCAGTCAGATGCTTTGTATCTTTACTTTGCACAATTAGGGC GCGATATGTACACGGGCGATCCAATAAAGCTGGAGCACATCAAAGATCAGAGTTTCTATAACATAGACC ATATTTACCCGCAGTCTATGGTGAAAGACGATTCCCTAGATAACAAAGTGCTGGTGCAAAGCGAAATTA ACGGCGAGAAAAGCTCGCGATACCCTTTGGACGCCGCGATCCGCAATAAAATGAAGCCCCTTTGGGACG CTTACTATAATCATGGCCTGATCTCCTTAAAGAAATACCAGCGTCTAACGCGCTCGACCCCGTTTACCGA TGATGAAAAATGGGACTTTATTAATCGCCAGTTAGTGGAAACCCGTCAATCTACCAAAGCGCTGGCCAT TTTGTTGAAGCGTAAGTTTCCAGACACCGAAATTGTGTATTCGAAGGCGGGGTTATCGTCCGACTTCAGA CATGAATTCGGCCTTGTAAAAAGTCGCAATATTAATGATTTGCACCACGCTAAAGACGCATTCTTGGCTA TCGTTACCGGCAATGTGTACCATGAAAGATTCAATCGCAGATGGTTTATGGTGAACCAGCCGTACTCAGT TAAAACTAAAACTCTTTTTACCCACAGCATAAAGAATGGCAACTTCGTTGCCTGGAACGGCGAAGAAGA TCTCGGTCGTATTGTAAAAATGCTGAAGCAAAACAAAAATACCATTCACTTCACGCGCTTCTCCTTCGAT CGCAAAGAAGGATTATTTGATATCCAACCTCTGAAAGCCAGCACCGGCTTAGTCCCACGAAAAGCCGGT CTGGATGTCGTTAAATACGGCGGATATGACAAATCTACCGCGGCCTATTACCTGCTGGTGAGGTTCACGC TCGAGGACAAGAAAACCCAGCACAAGCTGATGATGATTCCTGTAGAAGGCCTGTACAAGGCTCGCATTG ATCATGACAAGGAATTTCTTACCGATTATGCGCAAACGACTATAAGCGAAATCCTACAGAAAGATAAAC AGAAAGTGATCAATATTATGTTTCCAATGGGTACGAGGCATATAAAACTCAATTCAATGATTAGTATCG ATGGCTTCTATCTTAGTATCGGCGGAAAGTCCTCTAAAGTAAGTCAGTTCTATGTCACGCAATGGTTCC ACTGATCGTCCCTCACAAAATCGAATGTTACATTAAAGCAATGGAAAGCTTCGCCCGGAAGTTTAAAGA AAACAACAAGCTGCGCATCGTAGAAAAATTCGATAAAATCACCGTTGAAGACAACCTGAATCTCTACGA GCTCTTTCTCCAAAAACTGCAGCATAATCCCTATAATAAGTTTTTTTCGACACAGTTTGACGTACTGACG AACGGCCGTTCTACTTTCACAAAACTGTCGCCGGAGGAACAGGTACAGACGCTCTTGAACATTTTAAGT ATCTTTAAAACATGCCGCAGTTCGGGTTGCGACCTGAAATCCATCAACGGCAGTGCCCAGGCAGCGCGC ATCATGATTAGCGCTGACTTAACTGGACTGTCGAAAAAATATT CAGATATTAGGTTGGTTGAACAGTCA GCTTCTGGTTTGTTCGTATCCAAAAGTCAGAACTTACTGGAGTATCTCTAA SEQ ATGTCATCGCTCACGAAATTCACTAACAAATACTCTAAACAGCTCACCATTAAGAATGAACTCATCCCA ID GTTGGCAAAACACTGGAGAACATCAAAGAGAATGGTCTGATAGATGGCGACGAACAGCTGAATGAGAA NO : TTATCAGAAGGCGAAAATTATTGTGGATGATTTTCTGCGGGACTTCATTAATAAAGCACTGAATAATACG 42 CAGATCGGGAACTGGCGCGAACTGGCGGATGCCCTTAATAAAGAGGATGAAGATAACATCGAGAAATT GCAGGATAAAATTCGGGGAATCATTGTATCCAAATTTGAAACGTTTGATCTGTTTAGCAGCTATTCTATT AAGAAAGATGAAAAGATTATTGACGACGACAATGATGTTGAAGAAGAGGAACTGGATCTGGGCAAGAA GACCAGCTCATTTAAATACATATTTAAAAAAAACCTGTTTAAGTTAGTGTTGCCATCCTACCTGAAAACC ACAAACCAGGACAAGCTGAAGATTATTAGCTCGTTTGATAATTTTTCAACGTACTTCCGCGGGTTCTTTG US 10 , 435 , 714 B2 101 102 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence AAAACCGGAAAAACATTTTTACCAAGAAACCGATCTCCACAAGTATTGCGTATCGCATTGTTCATGATA ACTTCCCGAAATTCCTTGATAACATTCGTTGTTTTAATGTGTGGCAGACGGAATGCCCGCAACTAATCGT GAAAGCAGATAACTATCTGAAAAGCAAAAATGTTATAGCGAAAGATAAAAGTTTGGCAAACTATTTTAC CGTGGGCGCGTATGACTATTTCCTGTCTCAGAATGGTATAGATTTTTACAACAATATTATAGGTGGACTG CCAGCGTTCGCCGGCCATGAGAAAATCCAAGGTCTCAATGAATTCATCAATCAAGAGTGCCAAAAAGAC AGCGAGCTGAAAAGTAAGCTGAAAAACCGTCACGCGTTCAAAATGGCGGTACTGTTCAAACAGATACTC AGCGATCGTGAAAAAAGTTTTGTAATTGATGAGTTCGAGTCGGATGCTCAAGTTATTGACGCCGTTAAA AACTTTTACGCCGAACAGTGCAAAGATAACAATGTTATTTTTAACTTATTAAATCTTATCAAGAATATCG CTTTCTTAAGTGATGACGAACTGGACGGCATATTCATTGAAGGGAAATACCTGTCGAGCGTTAGTCAAA AACTCTATAGCGATTGGTCAAAATTACGTAACGACATTGAGGATTCGGCTAACTCTAAACAAGGCAATA AAGAGCTGGCCAAGAAGATCAAAACCAACAAAGGGGATGTAGAAAAAGCGATCTCGAAATATGAGTTC TCGCTGTCGGAACTGAACTCGATTGTACATGATAACACCAAGTTTTCTGACCTCCTTAGTTGTACACTGC ATAAGGTGGCTTCTGAGAAACTGGTGAAGGTCAATGAAGGCGACTGGCCGAAACATCTCAAGAATAAT GAAGAGAAACAAAAAATCAAAGAGCCGCTTGATGCTCTGCTGGAGATCTATAATACACTTCTGATTTTT AACTGCAAAAGCTTCAATAAAAACGGCAACTTCTATGTCGACTATGATCGTTGCATCAATGAACTGAGT TCGGTCGTGTATCTGTATAATAAAACACGTAACTATTGCACTAAAAAACCCTATAACACGGACAAGTTC AAACTCAATTTTAACAGTCCGCAGCTCGGTGAAGGCTTTTCCAAGTCGAAAGAAAATGACTGTCTGACT CTTTTGTTTAAAAAAGACGACAACTATTATGTAGGCATTATCCGCAAAGGTGCAAAAATCAATTTTGATG ATACACAAGCAATCGCCGATAACACCGACAATTGCATCTTTAAAATGAATTATTTCCTACTTAAAGACGC AAAAAAATTTATCCCGAAATGTAGCATTCAGCTGAAAGAAGTCAAGGCCCATTTTAAGAAATCTGAAGA TGATTACATTTTGTCTGATAAAGAGAAATTTGCTAGCCCGCTGGTCATTAAAAAGAGCACATTTTTGCTG GCAACTGCACATGTGAAAGGGAAAAAAGGCAATATCAAGAAATTTCAGAAAGAATATTCGAAAGAAAA CCCCACTGAGTATCGCAATTCTTTAAACGAATGGATTGCTTTTTGTAAAGAGTTCTTAAAAACTTATAAA GCGGCTACCATTTTTGATATAACCACATTGAAAAAGGCAGAGGAATATGCTGATATTGTAGAATTCTAC AAGGATGTCGATAATCTGTGCTACAAACTGGAGTTCTGCCCGATTAAAACCTCGTTTATAGAAAACCTG ATAGATAACGGCGACCTGTATCTGTTTCGCATCAATAACAAAGACTTCAGCAGTAAATCGACCGGCACC AAGAACCTTCATACGTTATATTTACAAGCTATATTCGATGAACGTAATCTGAACAATCCGACAATTATGC TGAATGGGGGAGCAGAACTGTTCTATCGTAAAGAAAGTATTGAGCAGAAAAACCGTATCACACACAAA GCCGGTTCAATTCTCGTGAATAAGGTGTGTAAAGACGGTACAAGCCTGGATGATAAGATACGTAATGAA ATTTATCAATATGAGAATAAATTTATTGATACCCTGTCTGATGAAGCTAAAAAGGTGTTACCGAATGTCA TTAAAAAGGAAGCTACCCATGACATTACAAAAGATAAACGTTTCACTAGTGACAAATTCTTCTTTCACTG CCCCCTGACAATTAATTATAAGGAAGGCGATACCAAGCAGTTCAATAACGAAGTGCTGAGTTTTCTGCG TGGAAATCCTGACATCAACATTATCGGCATTGACCGCGGAGAGCGTAATTTAATCTATGTAACGGTTATA AACCAGAAAGGCGAGATTCTGGATTCGGTTTCATTCAATACCGTGACCAACAAGAGTTCAAAAATCGAG CAGACAGTCGATTATGAAGAGAAATTGGCAGTCCGCGAGAAAGAGAGGATTGAAGCAAAACGTTCCTG GGACTCTATCTCAAAAATTGCGACACTAAAGGAAGGTTATCTGAGCGCAATAGTTCACGAGATCTGTCT GTTAATGATTAAACACAACGCGATCGTTGTCTTAGAGAATCTTAATGCAGGCTTTAAGCGTATTCGTGGC GGTTTATCAGAAAAAAGTGTTTATCAAAAATTCGAAAAAATGTTGATTAACAAACTGAACTATTTTGTCA GCAAGAAGGAATCCGACTGGAATAAACCGTCTGGTCTGCTGAATGGACTGCAGCTTTCGGATCAGTTTG AAAGCTTCGAAAAACTGGGTATTCAGTCTGGTTTTATTTTTTACGTGCCGGCTGCATATACCTCAAAGAT TGATCCGACCACGGGCTTCGCCAATGTTCTGAATCTGTCGAAGGTACGCAATGTTGATGCGATCAAAAG CTTTTTTTCTAACTTCAACGAAATTAGTTATAGCAAAAAGAAGCCCTTTTCAAATTCTCATTCGATCTGG ATTCACTGAGTAAGAAAGGCTTTAGTAGCTTTGTGAAATTTAGTAAGAGTAAATGGAACGTCTACACCTT TGGAGAACGTATCATAAAGCCAAAGAATAAGCAAGGTTATCGGGAGGACAAAAGAATCAACTTGACCT TCGAGATGAAGAAGTTACTTAACGAGTATAAGGTTTCTTTTGATCTTGAAAATAACTTGATTCCGAATCT CACGAGTGCCAACCTGAAGGATACTTTTTGGAAAGAGCTATTCTTTATCTTCAAGACTACGCTGCAGCTC CGTAACAGCGTTACTAACGGTAAAGAAGATGTGCTCATCTCTCCGGTCAAAAATGCGAAGGGTGAATTC TTCGTTTCGGGAACGCATAACAAGACTCTTCCGCAAGATTGCGATGCGAACGGTGCATACCATATTGCGT TGAAAGGTCTGATGATACTCGAACGTAACAACCTTGTACGTGAGGAGAAAGATACGAAAAAGATTATG GCGATTTCAAACGTGGATTGGTTCGAGTACGTGCAGAAACGTAGAGGCGTTCTGTAA SEQ ATGAACAACTACGACGAATTCACCAAACTGTACCCGATCCAGAAAACCATCCGTTTCGAACTGAAACCG ID CAGGGTCGTACCATGGAACACCTGGAAACCTTCAACTTCTTCGAAGAAGACCGTGACCGTGCGGAAAAA NO : TACAAAATCCTGAAAGAAGCGATCGACGAATACCACAAAAAATTCATCGACGAACACCTGACCAACAT 43 GTCTCTGGACTGGAACTCTCTGAAACAGATCTCTGAAAAATACTACAAATCTCGTGAAGAAAAAACAA AAAAGTTTTCCTGTCTGAACAGAAACGTATGCGTCAGGAAATCGTTTCTGAATTCAAAAAAGACGACCG TTTCAAAGACCTGTTCTCTAAAAAACTGTTCTCTGAACTGCTGAAAGAAGAAATCTACAAAAAAGGTAA CCACCAGGAAATCGACGCGCTGAAATCTTTCGACAAATTCTCTGGTTACTTCATCGGTCTGCACGAAAAC CGTAAAAACATGTACTCTGACGGTGACGAAATCACCGCGATCTCTAACCGTATCGTTAACGAAAACTTC CCGAAATTCCTGGACAACCTGCAGAAATACCAGGAAGCGCGTAAAAAATACCCGGAATGGATCATCAA AGCGGAATCTGCGCTGGTTGCGCACAACATCAAAATGGACGAAGTTTTCTCTCTGGAATACTTCAACAA AGTTCTGAACCAGGAAGGTATCCAGCGTTACAACCTGGCGCTGGGTGGTTACGTTACCAAATCTGGTGA AAAAATGATGGGTCTGAACGACGCGCTGAACCTGGCGCACCAGTCTGAAAAATCTTCTAAAGGTCGTAT CCACATGACCCCGCTGTTCAAACAGATCCTGTCTGAAAAAGAATCTTTCTCTTACATCCCGGACGTTTTC ACCGAAGACTCTCAGCTGCTGCCGTCTATCGGTGGTTTCTTCGCGCAGATCGAAAACGACAAAGACGGT AACATCTTCGACCGTGCGCTGGAACTGATCTCTTCTTACGCGGAATACGACACCGAACGTATCTACATCC GTCAGGCGGACATCAACCGTGTTTCTAACGTTATCTTCGGTGAATGGGGTACCCTGGGTGGTCTGATGCG TGAATACAAAGCGGACTCTATCAACGACATCAACCTGGAACGTACCTGCAAAAAAGTTGACAAATGGCT GGACTCTAAAGAATTCGCGCTGTCTGACGTTCTGGAAGCGATCAAACGTACCGGTAACAACGACGCGTT CAACGAATACATCTCTAAAATGCGTACCGCGCGTGAAAAAATCGACGCGGCGCGTAAAGAAATGAAAT TCATCTCTGAAAAAATCTCTGGTGACGAAGAATCTATCCACATCATCAAAACCCTGCTGGACTCTGTTCA GCAGTTCCTGCACTTCTTCAACCTGTTCAAAGCGCGTCAGGACATCCCGCTGGACGGTGCGTTCTACGCG GAATTCGACGAAGTTCACTCTAAACTGTTCGCGATCGTTCCGC TGTACAACAAAGTTCGTAACTACCTGA CCAAAAACAACCTGAACACCAAAAAAATCAAACTGAACTTCAAAAACCCGACCCTGGCGAACGGTTGG GACCAGAACAAAGTTTACGACTACGCGTCTCTGATCTTCCTGCGTGACGGTAACTACTACCTGGGTATCA TCAACCCGAAACGTAAAAAAAACATCAAATTCGAACAGGGTTCTGGTAACGGTCCGTTCTACCGTAAAA US 10 , 435 , 714 B2 103 104 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence TGGTTTACAAACAGATCCCGGGTCCGAACAAAAACCTGCCGCGTGTTTTCCTGACCTCTACCAAAGGTA AAAAAGAATACAAACCGTCTAAAGAAATCATCGAAGGTTACGAAGCGGACAAACACATCCGTGGTGAC AAATTCGACCTGGACTTCTGCCACAAACTGATCGACTTCTTCAAAGAATCTATCGAAAAACACAAAGAC TGGTCTAAATTCAACTTCTACTTCTCTCCGACCGAATCTTACGGTGACATCTCTGAATTCTACCTGGACGT TGAAAAACAGGGTTACCGTATGCACTTCGAAAACATCTCTGCGGAAACCATCGACGAATACGTTGAAAA AGGTGACCTGTTCCTGTTCCAGATCTACAACAAAGACTTCGTTAAAGCGGCGACCGGTAAAAAAGACAT GCACACCATCTACTGGAACGCGGCGTTCTCTCCGGAAAACCTGCAGGACGTTGTTGTTAAACTGAACGG TGAAGCGGAACTGTTCTACCGTGACAAATCTGACATCAAAGAAATCGTTCACCGTGAAGGTGAAATCCT GGTTAACCGTACCTACAACGGTCGTACCCCGGTTCCGGACAAAATCCACAAAAAACTGACCGACTACCA AAGCGAAAGAATACCTGGACAAAGTTCGTTACTTCAAAGCGCA CTACGACATCACCAAAGACCGTCGTTACCTGAACGACAAAATCTACTTCCACGTTCCGCTGACCCTGAAC TTCAAAGCGAACGGTAAAAAAAACCTGAACAAAATGGTTATCGAAAAATTCCTGTCTGACGAAAAAGC GCACATCATCGGTATCGACCGTGGTGAACGTAACCTGCTGTACTACTCTATCATCGACCGTTCTGGTAAA ATCATCGACCAGCAGTCTCTGAACGTTATCGACGGTTTCGACTACCGTGAAAAACTGAACCAGCGTGAA ATCGAAATGAAAGACGCGCGTCAGTCTTGGAACGCGATCGGTAAAATCAAAGACCTGAAAGAAGGTTA CCTGTCTAAAGCGGTTCACGAAATCACCAAAATGGCGATCCAGTACAACGCGATCGTTGTTATGGAAGA ACTGAACTACGGTTTCAAACGTGGTCGTTTCAAAGTTGAAAAACAGATCTACCAGAAATTCGAAAACAT GCTGATCGACAAAATGAACTACCTGGTTTTCAAAGACGCGCCGGACGAATCTCCGGGTGGTGTTCTGAA CGCGTACCAGCTGACCAACCCGCTGGAATCTTTCGCGAAACTGGGTAAACAGACCGGTATCCTGTTCTA CGTTCCGGCGGCGTACACCTCTAAAATCGACCCGACCACCGGTTTCGTTAACCTGTTCAACACCTCTTCT AAAACCAACGCGCAGGAACGTAAAGAATTCCTGCAGAAATTCGAATCTATCTCTTACTCTGCGAAAGAC GGTGGTATCTTCGCGTTCGCGTTCGACTACCGTAAATTCGGTACCTCTAAAACCGACCACAAAAACGTTT GGACCGCGTACACCAACGGTGAACGTATGCGTTACATCAAAGAAAAAAAACGTAACGAACTGTTCGAC CCGTCTAAAGAAATCAAAGAAGCGCTGACCTCTTCTGGTATCAAATACGACGGTGGTCAGAACATCCTG CCGGACATCCTGCGTTCTAACAACAACGGTCTGATCTACACCATGTACTCTTCTTTCATCGCGGCGATCC AGATGCGTGTTTACGACGGTAAAGAAGACTACATCATCTCTCCGATCAAAAACTCTAAAGGTGAATTCT TCCGTACCGACCCGAAACGTCGTGAACTGCCGATCGACGCGGACGCGAACGGTGCGTACAACATCGCGC TGCGTGGTGAACTGACCATGCGTGCGATCGCGGAAAAATTCGACCCGGACTCTGAAAAAATGGCGAAAC TGGAACTGAAACACAAAGACTGGTTCGAATTCATGCAGACCCGTGGTGACTAA SEQ ATGACTAAAACATTTGATTCAGAGTTTTTTAATTTGTACTCGCTGCAAAAAACGGTACGCTTTGAGTTAA ID AACCCGTGGGAGAAACCGCGTCATTTGTGGAAGACTTTAAAAACGAGGGCTTGAAACGTGTTGTGAGCG NO : AAGATGAAAGGCGAGCCGTCGATTACCAGAAAGTTAAGGAAATAATTGACGATTACCATCGGGATTTCA 44 TTGAAGAAAGTTTAAATTATTTTCCGGAACAGGTGAGTAAAGATGCTCTTGAGCAGGCGTTTCATCTTTA TCAGAAACTGAAGGCAGCAAAAGTTGAGGAAAGGGAAAAAGCGCTGAAAGAATGGGAAGCGCTGCAG AAAAAGCTACGTGAAAAAGTGGTGAAATGCTTCTCGGACTCGAATAAAGCCCGCTTCTCAAGGATTGAT AAAAAGGAACTGATTAAGGAAGACCTGATAAATTGGTTGGTCGCCCAGAATCGCGAGGATGATATCCCT ACGGTCGAAACGTTTAACAACTTCACCACATATTTTACCGGCTTCCATGAGAATCGTAAAAATATTTACT CCAAAGATGATCACGCCACCGCTATTAGCTTTCGCCTTATTCATGAAAATCTTCCAAAGTTTTTTGACAA CGTGATTAGCTTCAATAAGTTGAAAGAGGGTTTCCCTGAATTAAAATTTGATAAAGTGAAAGAGGATTT AGAAGTAGATTATGATCTGAAGCATGCGTTTGAAATAGAATAT TTCGTTAACTTCGTGACCCAAGCGGG CATAGATCAGTATAATTATCTGTTAGGAGGGAAAACCCTGGAGGACGGGACGAAAAAACAAGGGATGA ATGAGCAAATTAATCTGTTCAAACAACAGCAAACGCGAGATAAAGCGCGTCAGATTCCCAAACTGATCC CCCTGTTCAAACAGATTCTTAGCGAAAGGA?TGAAAGCCAGTCCTTTATTCCTAAACAATTTGAAAGTGA TCAGGAGTTGTTCGATTCACTGCAGAAGTTACATAATAACTGCCAGGATAAATTCACCGTGCTGCAACA AGCCATTCTCGGTCTGGCAGAGGCGGATCTTAAGAAGGTCTTCATCAAAACCTCTGATTTAAATGCCTTA TCTAACACCATTTTCGGGAATTACAGCGTCTTTTCCGATGCACTGAACCTGTATAAAGAAAGCCTGAAAA CGAAAAAAGCGCAGGAGGCTTTTGAGAAACTACCGGCCCATTCTATTCACGACCTCATTCAATACTTGG AACAGTTCAATTCCAGCCTGGACGCGGAAAAACAACAGAGCACCGACACCGTCCTGAACTACTTCATCA AGACCGATGAATTATATTCTCGCTTCATTAAATCCACTAGCGAGGCTTTCACTCAGGTGCAGCCTTTGTT CGAACTGGAAGCCCTGTCATCTAAGCGCCGCCCACCGGAATCGGAAGATGAAGGGGCAAAAGGGCAGG AAGGCTTCGAGCAGATCAAGCGTATTAAAGCTTACCTGGATACGCTTATGGAAGCGGTACACTTTGCAA AGCCGTTGTATCTTGTTAAGGGTCGTAAAATGATCGAAGGGCT CGATAAAGACCAGTCCTTTTATGAAG CGTTTGAAATGGCGTACCAAAACTTGAATCGTTAATCATTCCTATCTATAACAAAGCGCGGAGCTATCT GTCGCGGAAACCTTTCAAGGCCGATAAATTCAAGATTAATTTTGACAACAACACGCTACTGAGCGGATG GGATGCGAACAAGGAAACTGCTAACGCGTCCATTCTGTTTAAGAAAGACGGGTTATATTACCTTGGAAT TATGCCGAAAGGTAAGACCTTTCTCTTTGACTACTTTGTATCGAGCGAGGATTCAGAGAAACTGAAACA GCGTCGCCAGAAGACCGCCGAAGAAGCTCTGGCGCAGGATGGTGAAAGTTACTTCGAAAAAATTCGTTA TAAACTGTTACCAGGGGCTTCAAAGATGTTACCGAAAGTCTTTTTTAGCAACAAAAATATTGGCTTTTAC AACCCGTCGGATGACATTTTACGCATTCGCAACACAGCCTCTCACACCAAAAACGGGACCCCTCAGAAA GGCCACTCAAAAGTTGAGTTTAACCTGAATGATTGTCATAAGATGATTGATTTCTTCAAATCATCAATTC AGAAACACCCGGAATGGGGGTCTTTTGGCTTTACGTTTTCTGATACCAGTGATTTTGAAGACATGAGTGC CTTCTACCGGGAAGTAGAAAACCAGGGTTACGTAATTAGCTTTGACAAAATCAAAGAGACCTATATACA GAGCCAGGTGGAACAGGGTAATCTCTACTTATTCCAGATTTATAACAAGGATTTCTCGCCCTACAGCAA AGGCAAACCAAACCTGCATACTCTGTACTGGAAAGCCCTGTTTGAAGAAGCGAACCTGAATAACGTAGT GGCGAAGTTGAACGGTGAAGCGGAAATCTTCTTCCGTCGTCACTCCATTAAGGCCTCTGATAAAGTTGTC CATCCGGCAAATCAGGCCATTGATAATAAGAATCCACACACGGAAAAAACGCAGTCAACCTTTGAATAT GACCTCGTTAAAACAAACGCTACACGCAAGATAAGTTCTTTTTCCACGTCCCAATCAGCCTCAACTTTA AAGCACAAGGGGTTTCAAAGTTTAATGATAAAGTCAATGGGTTCCTCAAGGGCAACCCGGATGTCAACA TTATAGGTATAGACAGGGGCGAACGCCATCTGCTTTACTTTACCGTAGTGAATCAGAAAGGTGAAATAC TGGTTCAGGAATCATTAAATACCTTGATGTCGGACAAAGGGCACGTTAATGATTACCAGCAGAAACTGG ATAAAAAAGAACAGGAACGTGATGCTGCGCGTAAATCGTGGACCACGGTTGAGAACATTAAAGAGCTG AAAGAGGGGTATCTAAGCCATGTGGTACACAAACTGGCGCACCTCATCATTAAATATAACGCAATAGTC TGCCTAGAAGACTTGAATTTTGGCTTTAAACGCGGCCGCTTCAAAGTGGAAAAACAAGTTTATCAAAAA TTTGAAAAGGCGCTTATAGATAAACTGAATTATCTGGTTTTTAAAGAAAAGGAACTTGGTGAGGTAGGG CACTACTTGACAGCTTATCAACTGACGGCCCCGTTCGAATCATTCAAAAAACTGGGCAAACAGTCTGGC US 10 , 435 , 714 B2 105 106 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence ATTCTGTTTTACGTGCCGGCAGATTATACTTCAAAAATCGATCCAACAACTGGCTTTGTGAACTTCCTGG ACCTGAGATATCAGTCTGTAGAAAAAGCTAAACAACTTCTTAGCGATTTTAATGCCATTCGTTTTAACAG CGTTCAGAATTACTTTGAATTCGAAATTGACTATAAAAAACTTACTCCGAAACGTAAAGTCGGAACCCA AAGTAAATGGGTAATTTGTACGTATGGCGATGTCAGGTATCAGAACCGTCGGAATCAAAAAGGTCATTG GGAGACCGAAGAAGTGAACGTGACCGAAAAGCTGAAGGCTCTGTTCGCCAGCGATTCAAAAACTACAA CTGTGATCGATTACGCAAATGATGATAACCTGATAGATGTGATTTTAGAGCAGGATAAAGCCAGCTTTTT TAAAGAACTGTTGTGGCTCCTGAAACTTACGATGACCTTACGACATTCCAAGATCAAATCGGAAGATGA TTTTATTCTGTCACCGGTCAAGAATGAGCAGGGTGAATTCTATGATAGTAGGAAAGCCGGCGAAGTGTG GCCGAAAGACGCCGACGCCAATGGCGCCTATCATATCGCGCTCAAAGGGCTTTGGAATTTGCAGCAGAT TAACCAGTGGGAAAAAGGTAAAACCCTGAATCTGGCTATCAAAAACCAGGATTGGTTTAGCTTTATCCA AGAGAAACCGTATCAGGAATGA SEQ ATGCATACAGGCGGTCTTCTTAGTATGGACGCGAAAGAGTTCACAGGTCAGTATCCGTTGTCGAAAACA ID TTACGATTCGAACTTCGGCCCATCGGCCGCACGTGGGATAACC TGGAGGCCTCAGGCTACTTAGCGGAA NO : GACCGCCATCGTGCCGAATGTTATCCTCGTGCGAAAGAGTTAT TGGATGACAACCATCGTGCCTTCCTGA 45 ATCGTGTGTTGCCACAAATCGATATGGATTGGCACCCGATTGCGGAGGCCTTTTGTAAGGTACATAAAA ACCCTGGTAATAAAGAACTTGCCCAGGATTACAACCTTCAGTTGTCAAAGCGCCGTAAGGAGATCAGCG CATATCTTCAGGATGCAGATGGCTATAAAGGCCTGTTCGCGAAGCCCGCCTTAGACGAAGCTATGAAAA TTGCGAAAGAAAACGGGAACGAAAGTGATATTGAGGTTCTCGAAGCGTTTAACGGTTTTAGCGTATACT TCACCGGTTATCATGAGTCACGCGAGAACATTTATAGCGATGAGGATATGGTGAGCGTAGCCTACCGAA TTACTGAGGATAATTTCCCGCGCTTTGTCTCAAACGCTTTGATCTTTGATAAATTAAACGAAAGCCATCC GGATATTATCTCTGAAGTATCGGGCAATCTTGGAGTTGATGACATTGGTAAGTACTTTGACGTGTCGAAC TATAACAATTTTCTTTCCCAGGCCGGTATAGATGACTACAATCACATTATTGGCGGCCATACAACCGAAG ACGGA?TGATACAAGCGTTTAATGTCGTATTGAACTTACGTCACCAAAAAGACCCTGGCTTTGAAAAAA TTCAGTTCAAACAGCTCTACAAACAAATCCTGAGCGTGCGTACCAGCAAAAGCTACATCCCGAAACAGT TTGACAACTCTAAGGAGATGGTTGACTGCATTTGCGATTATGT CAGCAAAATAGAGAAATCCGAAACAG TAGAACGGGCCCTGAAACTAGTCCGTAATATCAGTTCTTTCGACTTGCGCGGGATCTTTGTCAATAAAAA GAACTTGCGCATACTGAGCAACAAACTGATAGGAGATTGGGACGCGATCGAAACCGCATTGATGCATAG TTCTTCATCAGAAAACGATAAGAAAAGCGTATATGATAGCGCGGAGGCTTTTACGTTGGATGACATCTTT TCAAGCGTGAAAAAATTTTCTGATGCCTCTGCCGAAGATATTGGCAACAGGGCGGAAGACATCTGTAGA GTGATAAGTGAGACGGCCCCTTTTATCAACGATCTGCGAGCGGTGGACCTGGATAGCCTGAACGACGAT GGTTATGAAGCGGCCGTCTCAAAAATTCGGGAGTCGCTGGAGCCTTATATGGATCTTTTCCATGAACTGG AAATTTTCTCGGTTGGCGATGAGTTCCCAAAATGCGCAGCATTTTACAGCGAACTGGAGGAAGTCAGCG AACAGCTGATCGAAATTATTCCGTTATTCAACAAGGCGCGTTCGTTCTGCACCCGGAAACGCTATAGCAC CGATAAGATTAAAGTGAACTTAAAATTCCCGACCTTGGCGGACGGGTGGGACCTGAACAAAGAGAGAG ACAACAAAGCCGCGATTCTGCGGAAAGACGGTAAGTATTATCTGGCAATTCTGGATATGAAGAAAGATC TGTCAAGCATTAGGACCAGCGACGAAGATGAATCCAGCTTCGAAAAGATGGAGTATAAACTGTTACCGA GTCCAGTAAAAATGCTGCCAAAGATATTCGTAAAATCGAAAGCCGCTAAGGAAAAATATGGCCTGACA GATCGTATGCTTGAATGCTACGATAAAGGTATGCATAAGTCGGGTAGTGCGTTTGATCTTGGCTTTTGCC ATGAACTCATTGATTATTACAAGCGTTGTATCGCGGAGTACCCAGGCTGGGATGTGTTCGATTTCAAGTT TCGCGAAACTTCCGATTATGGGTCCATGAAAGAGTTCAATGAAGATGTGGCCGGAGCCGGTTACTATAT GAGTCTGAGAAAAATTCCGTGCAGCGAAGTGTACCGTCTGTTAGACGAGAAATCGATTTATCTATTTCA AATTTATAACAAAATTACTCTGAAAATGCACATGGTAATAAGAACATGCATACCATGTACTGGGAGGG TCTCTTTTCCCCGCAAAACCTGGAGTCGCCCGTTTTCAAGTTGTCGGGTGGGGCAGAACTTTTCTTTCGA AAATCCTCAATCCCTAACGATGCCAAAACAGTACACCCGAAAGGCTCAGTGCTGGTTCCACGTAATGAT GTTAACGGTCGGCGTATTCCAGATTCAATCTACCGCGAACTGACACGCTATTTTAACCGTGGCGATTGCC GAATCAGTGACGAAGCCAAAAGTTATCTTGACAAGGTTAAGACTAAAAAAGCGGACCATGACATTGTG AAAGATCGCCGCTTTACCGTGGATAAAATGATGTTCCACGTCCCGATTGCGATGAACTTTAAGGCGATC AGTAAACCGAACTTAAACAAAAAAGTCATTGATGGCATCATTGATGATCAGGATCTGAAAATCATTGGT ATTGATCGTGGCGAGCGGAACTTAATTTACGTCACGATGGTTGACAGAAAAGGGAATATCTTATATCAG GATTCTCTTAACATCCTCAATGGCTACGACTATCGTAAAGCTCTGGATGTGCGCGAATATGACAACAAG GAAGCGCGTCGTAACTGGACTAAAGTGGAGGGCATTCGCAAAATGAAGGAAGGCTATCTGTCATTAGCG GTCTCGAAATTAGCGGATATGATTATCGAAAATAACGCCATCATCGTTATGGAGGACCTGAACCACGGA TTCAAAGCGGGCCGCTCAAAATTGAAAAACAAGTTTATCAGAAATTTGAGAGTATGCTGATTAACAAA CTGGGCTATATGGTGTTAAAAGACAAGTCAATTGACCAATCAGGTGGCGCGCTGCATGGATACCAGCTG GCGAACCATGTTACCACCTTAGCATCAGTTGGAAAGCAGTGTGGGGTTATCTTTTATATACCGGCAGCGT TCACTAGTAAAATAGATCCGACCACTGGTTTCGCCGATCTCTTTGCCCTGAGTAACGTTAAAAACGTAGC GAGCATGCGTGAATTCTTTTCCAAAATGAAATCTGTCATTTATGATAAAGCTGAAGGCAAATTCGCATTC ACCTTTGATTACTTGGATTACAACGTGAAGAGCGAATGTGGTCGTACGCTGTGGACCGTTTACACCGTTG GTGAGCGCTTCACCTATTCCCGTGTGAACCGCGAATATGTACGTAAAGTCCCCACCGATATTATCTATGA TGCCCTCCAGAAAGCAGGCATTAGCGTCGAAGGAGACTTAAGGGACAGAATTGCCGAAAGCGATGGCG ATACGCTGAAGTCTATTTTTTACGCATTCAAATACGCGCTAGATATGCGCGTTGAGAATCGCGAGGAAG ACTACATTCAATCACCTGTGAAAAATGCCTCTGGGGAATTTTTTTGTTCAAAAAATGCTGGTAAAAGCCT CCCACAAGATAGCGATGCAAACGGTGCATATAACATTGCCCTGAAAGGTATTCTTCAATTACGCATGCT GTCTGAGCAGTACGACCCCAACGCGGAATCTATTAGACTTCCGCTGATAACCAATAAAGCCTGGCTGAC ATTCATGCAGTCTGGCATGAAGACCTGGAAAAATTAG SEQ ATGGATAGTTTAAAAGATTTTACGAATCTATATCCCGTAAGCAAAACTCTTCGTTTTGAACTGAAACCTG ID TTGGAAAAACGTTGGAGAATATCGAGAAAGCGGGCATCCTGAAAGAAGACGAGCACCGTGCCGAAAGC NO : TACAGGCGTGTCAAAAAGATTATCGATACTTATCACAAAGTGT TCATTGATAGCAGTCTGGAGAACATG 46 GCAAAAATGGGCATAGAAAATGAAATCAAAGCAATGCTGCAGAGCTTTTGCGAGCTCTACAAGAAAGA TCACCGAACGGAAGGTGAAGATAAAGCACTGGACAAAATTCGCGCCGTTCTTCGCGGTCTGATTGTTGG CGCGTTCACCGGCGTGTGCGGCCGCCGTGAAAACACCGTGCAGAACG AGAAAAACTGATAAAAGAAATTTTGCCTGACTTTGTGCTTTCGACCGAAGCGGAATCCCTGCCATTTTCT GTCGAAGAAGCGACCCGCAGCCTGAAAGAATTTGACTCATTCACAAGTTACTTTGCAGGCTTCTACGAA AACCGTAAAAACATCTACAGCACGAAGCCACAGAGCACGGCTATTGCTTATCGCCTGATTCATGAGAAC US 10 , 435 , 714 B2 107 108 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence CTGCCGAAGTTCATCGATAACATCCTTGTTTTTCAAAAAATTAAAGAGCCGATTGCGAAAGAGTTAGAA CATATTCGAGCTGACTTTTCTGCGGGTGGGTACATTAAAAAAGATGAGGGGCTGGAAGACATCTTCAGT CTAAACTATTATATCCACGTTCTGTCGCAGGCAGGCATTGAGAAATATAATGCGCTGATTGGTAAGATTG TCACAGAAGGCGATGGTGAGATGAAAGGTCTTAATGAACATATCAATCTGTATAACCAGCAGCGTGGTC GCGAAGACCGTCTTCCACTGTTCCGCCCACTGTATAAACAGATCCTGTCTGACCGGGAACAGCTGTCCTA CCTGCCGGAAAGCTTTGAAAAGGATGAAGAGCTACTTCGCGCATTAAAGGAGTTTTACGACCATATTGC GGAAGACATTTTGGGTAGAACGCAGCAACTGATGACGTCAATTTCTGAATACGATCTGAGTAGAATCTA CGTTAGGAATGATAGCCAGCTGACCGATATTAGCAAAAAAATGCTGGGCGACTGGAACGCTATCTATAT GGCACGTGAACGTGCATATGATCATGAACAAGCACCGAAACGTATAACCGCGAAATATGAGCGTGATC GCATTAAGGCGCTAAAGGGAGAAGAAAGCATCTCACTCGCAAACC TGAACTCCTGTATCGCTTTCTTAG ATAACGTGCGCGATTGTCGCGTCGACACGTATCTGTCAACCCTTGGGCAGAAAGAGGGTCCACATGGTC TGTCTAACCTGGTGGAAAATGTCTTTGCGAGTTACCATGAAGCGGAACAACTGCTGTCTTTTCCATACCC CGAAGAAAACAATCTAATACAGGATAAAGATAACGTGGTGTTAATCAAAAACCTGCTGGACAACATCA GCGATCTGCAACGTTTCCTGAAACCTTTGTGGGGTATGGGTGACGAGCCAGACAAAGACGAACGTTTTT ATGGTGAGTATAATTATATACGTGGCGCCCTTGACCAAGTTATTCCGCTGTATAACAAAGTACGGAACTA TCTGACCCGTAAGCCATATTCTACCCGTAAAGTGAAACTGAACTTTGGCAACTCGCAACTGCTGTCGGGT TGGGATCGTAACAAAGAAAAAGATAATAGTTGTGTTATCCTGCGTAAGGGACAAAATTTTTACCTCGCG ATTATGAACAACAGACACAAGCGTTCATTTGAAAATAAGGTTCTGCCGGAGTATAAAGAGGGCGAACCG TACTTCGAGAAAATGGATTATAAGTTCTTACCAGACCCTAATAAGATGTTACCGAAAGTCTTTCTTTCGA AAAAAGGCATAGAAATCTATAAGCCGTCCCCGAAATTACTCGAACAGTATGGGCACGGGACCCACAAG AAAGGGGATACTTTTAGCATGGACGATCTGCACGAACTGATCGATTTTTTTAAACACTCCATCGAAGCCC ATGAAGACTGGAAACAGTTTGGGTTCAAGTTCTCTGATACAGCCACATACGAGAATGTGTCTAGTTTTTA TCGGGAAGTGGAGGATCAGGGCTACAAACTTAGTTTTCGTAAAGTTTCAGAGAGTTATGTTTATAGTTTA ATTGATCAGGGAAAACTTTACCTGTTCCAGATCTACAACAAAGATTTCTCGCCATGTAGTAAGGGTACCC CGAATCTGCATACACTCTATTGGAGAATGTTATTCGATGAGCGTAACTTAGCGGATGTCATTTATAAATT GGACGGGAAAGCAGAGATCTTTTTTCGTGAAAAATCACTGAAGAATGACCACCCGACTCATCCGGCCGG GAAACCGATCAAAAAAAAATCCCGCCAGAAAAAAGGAGAAGAGTCTCTGTTTGAATATGATCTGGTGA AAGACCGTCATTACACTATGGATAAATTTCAATTTCATGTTCCAATTACAATGAACTTCAAATGTTCGGC GGGTTCCAAAGTAAATGATATGGTAAACGCCCATATTCGCGAAGCGAAAGATATGCATGTTATTGGCAT CGATAGAGGCGAAAAAACCTGCTTTATATTTGCGTAATTGACAGCCGTGGTACCATTCTGGACCAGAT CTCTTTAAACACCATCAATGACATCGATTATCACGACCTGTTGGAGTCTCGGGACAAGGACCGCCAGCA GGAGCGCCGTAATTGGCAGACAATTGAAGGCATAAAAGAATTAAAACAGGGTTACCTTTCCCAGGCCGT ACACCGCATAGCGGAACTGATGGTGGCCTACAAAGCCGTAGTTGCCCTGGAAGACTTGAATATGGGGTT TAAACGTGGCCGTCAAAAAGTCGAGAGCAGCGTGTATCAGCAATTTGAAAAACAGTTGATTGACAAGTT GAATTATTTGGTTGATAAAAAGAAACGTCCAGAAGATATTGGTGGCTTACTGCGTGCATACCAGTTTAC GGCACCTTTTAAGTCCTTCAAAGAAATGGGTAAACAGAACGGGTTTCTGTTTTACATCCCGGCCTGGAAT ACATCCAACATCGATCCTACCACCGGGTTTGTCAACCTGTTTCATGCACAATATGAAAACGTGGATAAA GCGAAGAGTTTTTTCCAAAAATTCGATAGTATTTCGTATAACCCAAAAAAAGATTGGTTTGAGTTTGCGT TCGATTATAAAAATTTTACTAAAAAGGCTGAGGGATCCCGCAGTATGTGGATCCTCTGCACCCATGGCA GTCGTATTAAAAATTTTCGTAATTCGCAAAAGAATGGCCAGTGGGACTCGGAAGAGTTTGCCCTGACCG AAGCGTTCAAATCGCTGTTTGTACGCTACGAAATTGACTACACAGCAGATCTGAAAACAGCCATCGTCG ATGAAAAACAGAAAGATTTTTTTGTAGATCTCCTAAAACTGTTCAAACTGACTGTTCAGATGCGCAATTC CTGGAAAGAGAAAGACCTGGATTATCTGATTAGCCCGGTAGCCGGTGCTGATGGACGATTTTTCGATAC TCGTGAAGGTAACAAAAGTCTCCCGAAAGATGCTGATGCCAATGGTGCATACAATATTGCATTAAAGGG GCTATGGGCCTTGCGACAGATCCGCCAGACCAGCGAAGGCGGCAAGCTGAAATTGGCCATATCGAATAA GGAATGGTTACAATTTGTTCAGGAACGTAGCTATGAAAAAGATTGA SEQ ATGAACAACGGCACAAATAATTTTCAGAACTTCATCGGGATCTCAAGTTTGCAGAAAACGCTGCGCAAT ID GCTCTGATCCCCACGGAAACCACGCAACAGTTCATCGTCAAGAACGGAATAATTAAAGAAGATGAGTTA NO : CGTGGCGAGAACCGCCAGATTCTGAAAGATATCATGGATGACTACTACCGCGGATTCATCTCTGAGACT 47 CTGAGTTCTATTGATGACATAGATTGGACTAGCCTGTTCGAAAAAATGGAAATTCAGCTGAAAAATGGT GATAATAAAGATACCTTAATTAAGGAACAGACAGAGTATCGGAAAGCAATCCATAAAAAATTTGCGAA CGACGATCGGTTTAAGAACATGTTTAGCGCCAAACTGATTAGTGACATATTACCTGAATTTGTCATCCAC AACAATAATTATTCGGCATCAGAGAAAGAGGAAAAAACCCAGGTGATAAAATTGTTTTCGCGCTTTGCG ACTAGCTTTAAAATTACTTCAAGAACCGTGCAAATTGCTTTTCAGCGGACGATATTTCATCAAGCAGCT GCCATCGCATCGTCAACGACAATGCAGAGATATTCTTTTCAAATGCGCTGGTCTACCGCCGGATCGTAAA ATCGCTGAGCAATGACGATATCAACAAAATTTCGGGCGATATGAAAGATTCATTAAAAGAAATGAGTCT GGAAGAAATATATTCTTACGAGAAGTATGGGGAATTTATTACCCAGGAAGGCATTAGCTTCTATAATGA TATCTGTGGGAAAGTGAATTCTTTTATGAACCTGTATTGTCAGAAAAATAAAGAAAACAAAAATTTATA TCACAAACAGATTCTATGCATTGCGGACACTAGCTATGAGGTCCCGTATAAATTT GAAAGTGACGAGGAAGTGTACCAATCAGTTAACGGCTTCCTTGATAACATTAGCAGCAAACATATAGTC GAAAGATTACGCAAAATCGGCGATAACTATAACGGCTACAACC TGGATAAAATTTATATCGTGTCCAAA TTTTACGAGAGCGTTAGCCAAAAAACCTACCGCGACTGGGAAACAATTAATACCGCCCTCGAAATTCAT TACAATAATATCTTGCCGGGTAACGGTAAAAGTAAAGCCGACAAAGTAAAAAAAGCGGTTAAGAATGA TTTACAGAAATCCATCACCGAAATAAATGAACTAGTGTCAAAC TATAAGCTGTGCAGTGACGACAACAT CAAAGCGGAGACTTATATACATGAGATTAGCCATATCTTGAATAACTTTGAAGCACAGGAATTGAAATA CAATCCGGAAATTCACCTAGTTGAATCCGAGCTCAAAGCGAGTGAGCTTAAAAACGTGCTGGACGTGAT CATGAATGCGTTTCATTGGTGTTCGGTTTTTATGACTGAGGAACTTGTTGATAAAGACAACAATTTTTAT GCGGAACTGGAGGAGATTTACGATGAAATTTATCCAGTAATTAGTCTGTACAACCTGGTTCGTAACTAC GTTACCCAGAAACCGTACAGCACGAAAAAGATTAAATTGAACTTTGGAATACCGACGTTAGCAGACGGT TGGTCAAAGTCCAAAGAGTATTCTAATAACGCTATCATACTGATGCGCGACAATCTGTATTATCTGGGCA TCTTTAATGCGAAGAATAAACCGGACAAGAAGATTATCGAGGGTAATACGTCAGAAAATAAGGGTGAC TACAAAAAGATGATTTATAATTTGCTCCCGGGTCCCAACAAAATGATCCCGAAAGTTTTCTTGAGCAGCA AGACGGGGGTGGAAACGTATAAACCGAGCGCCTATATCCTAGAGGGGTATAAACAGAATAAACATATC AAGTCTTCAAAAGACTTTGATATCACTTTCTGTCATGATCTGATCGACTACTTCAAAAACTGTATTGCAA TTCATCCCGAGTGGAAAAACTTCGGTTTTGATTTTAGCGACACCAGTACTTATGAAGACATTTCCGGGTT US 10 , 435 , 714 B2 109 110 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence TTATCGTGAGGTAGAGTTACAAGGTTACAAGATTGATTGGACATACATTAGCGAAAAAGACATTGATCT GCTGCAGGAAAAAGGTCAACTGTATCTGTTCCAGATATATAACAAAGATTTTTCGAAAAAATCAACCGG GAATGACAACCTTCACACCATGTACCTGAAAAATCTTTTCTCAGAAGAAAATCTTAAGGATATCGTCCTG AAACTTAACGGCGAAGCGGAAATCTTCTTCAGGAAGAGCAGCATAAAGAACCCAATCATTCATAAAAA AGGCTCGATTTTAGTCAACCGTACCTACGAAGCAGAAGAAAAAGACCAGTTTGGCAACATTCAAATTGT GCGTAAAAATATTCCGGAAAACATTTATCAGGAGCTGTACAAATACTTCAACGATAAAAGCGACAAAGA GCTGTCTGATGAAGCAGCCAAACTGAAGAATGTAGTGGGACACCACGAGGCAGCGACGAATATAGTCA AGGACTATCGCTACACGTATGATAAATACTTCCTTCATATGCCTATTACGATCAATTTCAAAGCCAATAA AACGGGTTTTATTAATGATAGGATCTTACAGTATATCGCTAAAGAAAAAGACTTACATGTGATCGGCATT GATCGGGGCGAGCGTAACCTGATCTACGTGTCCGTGATTGATACTTGTGGTAATATAGTTGAACAGAAA AGCTTTAACATTGTAAACGGCTACGACTATCAGATAAAACTGAAACAACAGGAGGGCGCTAGACAGATT GCGCGGAAAGAATGGAAAGAAATTGGTAAAATTAAAGAGATCAAAGAGGGCTACCTGAGCTTAGTAAT CCACGAGATCTCTAAAATGGTAATCAAATACAATGCAATTATAGCGATGGAGGATTTGTCTTATGGTTTT AAAAAAGGGCGCTTTAAGGTCGAACGGCAAGTTTACCAGAAATTTGAAACCATGCTCATCAATAAACTC AACTATCTGGTATTTAAAGATATTTCGATTACCGAGAATGGCGGTCTCCTGAAAGGTTATCAGCTGACAT ACATTCCTGATAAACTTAAAAACGTGGGTCATCAGTGCGGCTGCATTTTTTATGTGCCTGCTGCATACAC GAGCAAAATTGATCCGACCACCGGCTTTGTGAATATCTTTAAATTTAAAGACCTGACAGTGGACGCAAA ACGTGAATTCATTAAAAAATTTGACTCAATTCGTTATGACAGTGAAAAAAATCTGTTCTGCTTTACATTT GACTACAATAACTTTATTACGCAAAACACGGTCATGAGCAAATCATCGTGGAGTGTGTATACATACGGC GTGCGCATCAAACGTCGCTTTGTGAACGGCCGCTTCTCAAACGAAAGTGATACCATTGACATAACCAAA GATATGGAGAAAACGTTGGAAATGACGGACATTAACTGGCGCGATGGCCACGATCTTCGTCAAGACATT ATAGATTATGAAATTGTTCAGCACATATTCGAAATTTTCCGTTTAACAGTGCAAATGCGTAACTCCTTGT CTGAACTGGAGGACCGTGATTACGATCGTCTCATTTCACCTGTACTGAACGAAAATAACATTTTTTATGA CAGCGCGAAAGCGGGGGATGCACTTCCTAAGGATGCCGATGCAAATGGTGCGTATTGTATTGCATTAAA AGGGTTATATGAAATTAAACAAATTACCGAAAATTGGAAAGAAGATGGTAAATTTTCGCGCGATAAACT CAAAATCAGCAATAAAGATTGGTTCGACTTTATCCAGAATAAGCGCTATCTCTAA SEO ATGACCAATAAATTCACTAACCAGTATTCTCTCTCTAAGACCCTGCGCTTTGAACTGATTCCGCAGGGGA ID AAACCTTGGAGTTCATTCAAAAAAAGGCCTCTTGTCTCAGGATAAACAGAGGGCTGAATCTTACCAAG NO : AAATGAAGAAAACTATTGATAAGTTTCATAAATATTTCATTGATTTAGCCTTGTCTAACGCCAAATTAAC 48 TCACTTGGAAACGTATCTGGAGTTATACAACAAATCTGCCGAAACTAAGAAAGAACAGAAATTTAAAGA CGATTTGAAAAAAGTACAGGACAATCTGCGTAAAGAAATTGTCAAATCCTTCAGTGACGGCGATGCTAA AAGCATTTTTGCCATTCTGGACAAAAAAGAGTTGATTACTGTGGAATTAGAAAAGTGGTTTGAAAACAA TGAGCAGAAAGACATCTACTTCGATGAGAAATTCAAAACTTTCACCACCTATTTTACAGGATTTCATCAA AACCGGAAGAACATGTACTCAGTAGAACCGAACTCCACGGCCATTGCGTATCGTTTGATCCATGAGAAT CTGCCTAAATTTCTGGAGAATGCGAAAGCCTTTGAAAAGATTAAGCAGGTCGAATCGCTGCAAGTGAAT TTTCGTGAACTCATGGGCGAATTTGGTGACGAAGGTCTAATCTTCGTTAACGAACTGGAAGAAATGTTTC AGATTAATTACTACAATGACGTGCTATCGCAGAACGGTATCACAATCTACAATAGTATTATCTCAGGGTT CACAAAAAACGATATAAAATACAAAGGCCTGAACGAGTATATCAATAACTACAACCAAACAAAGGACA AAAAGGATAGGCTTCCGAAACTGAAGCAGTTATACAAACAGATTTTATCTGACAGAATCTCCCTGAGCT TTCTGCCGGATGCTTTCACTGATGGGAAGCAGGTTCTGAAAGCGATTTTCGATTTTTATAAGATTAACTT ACTGAGCTACACGATTGAAGGTCAAGAAGAATCTCAAAACTTACTGCTCTTGATCCGTCAAACCATTGA AAATCTATCATCGTTCGATACGCAGAAAATCTACCTCAAAAACGATACTCACCTGACTACGATCTCTCAG CAGGTTTTCGGGGATTTTAGTGTATTTTCAACAGCTCTGAACTACTGGTATGAAACCAAAGTCAATCCGA AATTCGAGACGGAATATTCTAAGGCCAACGAAAAAAAACGTGAGATTCTTGATAAAGCTAAAGCCGTAT TTACTAAACAGGATTACTTTTCTATTGCTTTCCTGCAGGAAGTTTTATCGGAGTATATCCTGACCCTGGAT CATACATCTGATATCGTTAAAAAACACAGCAGCAATTGCATCGCTGACTATTTCAAAAACCACTTTGTCG CCAAAAAAGAAAACGAAACAGACAAGACTTTCGATTTCATTGCTAACATCACCGCAAAATACCAGTGTA TTCAGGGTATCTTGGAAAACGCCGACCAATACGAAGACGAACT AATTTAAAATTCTTCTTAGATGCAATCCTGGAGCTGCTGCACTTCATCAAACCGCTTCATTTAAAGAGCG AGTCCATTACCGAAAAGGACACCGCCTTCTATGACGTTTTTGAAAATTATTATGAAGCCCTCTCCTTGCT GACTCCGCTGTATAATATGGTACGCAATTACGTAACCCAGAAACCATATTCTACCGAAAAAATTAAACT GAACTTTGAAAACGCACAGCTGCTCAACGGTTGGGACGCGAATAAAGAAGGTGACTACCTCACCACCAT CCTGAAAAAAGATGGTAACTATTTTCTGGCAATTATGGATAAGAAACATAATAAAGCATTCCAGAAATT TCCTGAAGGGAAAGAAAATTACGAAAAGATGGTGTACAAACTCTTACCTGGAGTTAACAAAATGTTGCC GAAAGTATTTTTTAGTAATAAGAACATCGCGTACTTTAACCCGTCCAAAGAACTGCTGGAAAATTATAA AAAGGAGACGCATAAGAAAGGGGATACCTTTAACCTGGAACAT TGCCATACCTTAATAGACTTCTTCAA GGATTCCCTGAATAAACACGAGGATTGGAAATATTTCGATTTT CAGTTTAGTGAGACCAAGTCATACCA GGATCTTAGCGGCTTTTATCGCGAAGTAGAACACCAAGGCTATAAAATTAACTTCAAAAACATCGACAG CGAATACATCGACGGTTTAGTTAACGAGGGCAAACTGTTTCTGTTCCAGATCTATTCAAAGGATTTTAGC CCGTTCTCTAAAGGCAAACCAAATATGCATACGTTGTACTGGAAAGCACTGTTTGAAGAGCAAAACCTG CAGAATGTGATTTATAAACTGAACGGCCAAGCTGAGATTTTTTTCCGTAAAGCCTCGATTAAACCGAAA AATATCATCCTTCATAAGAAGAAAATAAAGATCGCTAAAAAACACTTCATAGATAAAAAAACCAAAAC CTCCGAAATAGTGCCTGTTCAAACAATTAAGAACTTGAATATGTACTACCAGGGCAAGATATCGGAAAA GGAGTTGACTCAAGACGATCTTCGCTATATCGATAACTTTTCGATTTTTAACGAAAAAAACAAGACGATC GACATCATCAAAGATAAACGCTTCACTGTAGATAAGTTCCAGT TTCATGTGCCGATTACTATGAACTTCA AAGCTACCGGGGGTAGCTATATCAACCAAACGGTGTTGGAATACCTCCAAACGGTGTTGGAATACCTGCAGAATAACCCGGAAGTCAAA ATCATTGGGCTGGACCGCGGAGAACGTCACCTTGTGTACTTGACCTTAATCGATCAGCAAGGCAACATC TTAAAACAAGAATCGCTGAATACCATTACGGATTCAAAGATTAGCACCCCGTATCATAAGCTGCTCGAT AACAAGGAGAATGAGCGCGACCTGGCCCGTAAAAACTGGGGCACGGTGGAAAACATTAAGGAGTTAAA GGAGGGTTATATTTCCCAGGTAGTGCATAAGATCGCCACTCTCATGCTCGAGGAAAATGCGATCGTTGTC ATGGAAGACTTAAACTTCGGATTTAAACGTGGGCGATTTAAAGTAGAGAAACAAATCTACCAGAAGTTA GAAAAAATGCTGATTGACAAATTAAATTACTTGGTCCTAAAAGA TTATACAACGCCCTCCAACTTACCAATAAATTCGAAAGTTTTCAGAAAATGGGTAAACAGTCAGGCTTTC TTTTTTATGTTCCTGCGTGGAACACATCCAAAATCGACCCTACAACCGGCTTCGTCAATTACTTCTATACT AAATATGAAAACGTCGACAAAGCAAAAGCATTCTTTGAAAAGT TCGAAGCAATACGTTTTAACGCTGAG US 10 , 435 , 714 B2 111 112 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence AAAAAATATTTCGAGTTCGAAGTCAAGAAATACTCAGACTTTAACCCCAAAGCTGAGGGCACACAGCAA GCGTGGACAATCTGCACCTACGGCGAGCGCATCGAAACGAAGCGTCAAAAAGATCAGAATAACAAATT TGTTTCAACACCTATCAACCTGACCGAGAAGATTGAAGACTTCTTAGGTAAAAATCAGATTGTTTATGGC GACGGTAACTGTATAAAATCTCAAATAGCCTCAAAGGATGATAAAGCATTTTTO GGTTCAAAATGACACTGCAGATGCGCAATAGTGAGACGCGTACAGATATTGATTATCTTATCAGCCCGG TCATGAACGACAACGGTACTTTTTACAACTCCAGAGACTATGAAAAACTTGAGAATCCAACTCTCCCCA AAGATGCTGATGCGAACGGTGCTTATCACATCGCGAAAAAAGGTCTGATGCTGCTGAACAAAATCGACC AAGCCGATCTGACTAAGAAAGTTGACCTAAGCATTTCAAATCGGGACTGGTTACAGTTTGTTCAAAAGA ACAAATGA SEQ ATGGAACAGGAATATTATCTGGGCTTGGACATGGGCACCGGTTCCGTCGGCTGGGCTGTTACTGACAGT ID GAATATCACGTTCTAAGAAAGCATGGTAAGGCATTGTGGGGTGTAAGACTTTTCGAATCTGCTTCCACTG NO : CTGAAGAGCGTAGAATGTTTAGAACGAGTCGACGTAGGCTAGACAGGCGCAATTGGAGAATCGAAATTT 49 TACAAGAAATTTTTGCGGAAGAGATATCTAAGAAAGACCCAGGCTTTTTCCTGAGAATGAAGGAATCTA AGTATTACCCTGAGGATAAAAGAGATATAAATGGTAACTGTCCCGAATTGCCTTACGCATTATTTGTGGA CGATGATTTTACCGATAAGGATTACCATAAAAAGTTCCCAACTATCTACCATTTACGCAAAATGTTAATG AATACAGAGGAAACCCCAGACATAAGACTAGTTTATCTGGCAATACACCATATGATGAAACATAGAGGC CATTTCTTACTTTCCGGGGATATCAACGAAATCAAAGAGTTTGGTACCACATTTAGTAAGTTACTGGAAA ACATAAAGAATGAAGAATTGGATTGGAACTTAGAACTCGGAAAAGAAGAATACGCGGTTGTCGAATCT ATCCTGAAGGATAATATGCTGAATAGGTCGACCAAAAAAACTAGGCTGATCAAAGCACTGAAAGCCAA ATCTATCTGCGAAAAAGCTGTTTTAAATTTACTTGCTGGTGGCACTGTTAAGTTATCAGACATTTTTGGTT TGGAAGAATTGAACGAAACCGAGCGTCCAAAAATTAGTTTCGCTGATAATGGCTACGATGATTACATTG GTGAGGTGGAAAACGAGTTGGGCGAACAATTTTATATTATAGAGACAGCTAAGGCAGTCTATGACTGGG CTGTTTTAGTAGAAATCCTTGGTAAATACACATCTATCTCCGAAGCGAAAGTTGCTACTTACGAAAAGCA CAAGTCCGATCTCCAGTTTTTGAAGAAAATTGTCAGGAAATATCTGACTAAGGAAGAATATAAAGATAT TTTCGTTAGTACCTCTGACAAACTGAAAAATTACTCCGCTTACATCGGGATGACCAAGATTAATGGCAAA AAAGTTGATCTGCAAAGCAAAAGGTGTTCGAAGGAAGAATTTTATGATTTCATTAAAAAGAATGTCTTA AAAAAATTAGAAGGTCAGCCAGAATACGAATATTTGAAAGAAGAACTGGAAAGAGAGACATTCTTACC AAAACAAGTCAACAGAGATAATGGGGTAATTCCATATCAAATTCACCTCTACGAATTAAAAAAAATTTT AGGCAATTTACGCGATAAAATTGACCTTATCAAAGAAAATGAGGATAAGCTGGTTCAACTCTTTGAATT CAGAATACCCTATTATGTGGGCCCACTGAACAAGATTGATGACGGCAAAGAAGGTAAATTCACATGGGC CGTCCGCAAATCCAATGAAAAAATTTACCCATGGAACTTTGAAAATGTAGTAGATATTGAAGCGTCTGC GGAGAAATTTATTCGAAGAATGACTAATAAATGCACTTACTTGATGGGAGAGGATGTTCTGCCTAAAGA CAGCTTATTATACAGCAAGTACATGGTTCTAAACGAACTTAACAACGTTAAGTTGGACGGTGAGAAATT AAGTGTAGAATTGAAACAAAGATTGTATACTGACGTCTTCTGCAAGTACAGAAAAGTGACAGTTAAAAA AATTAAGAATTACTTGAAGTGCGAAGGTATAATTTCTGGAAACGTAGAGATTACTGGTATTGATGGTGA TTTCAAAGCATCCCTAACAGCTTACCACGATTTCAAGGAAATCCTGACAGGAACTGAACTCGCAAAAAA AGATAAAGAAAACATTATTACTAATATTGTTCTTTTCGGTGATGACAAGAAATTGTTGAAGAAAAGACT GAATAGACTTTACCCCCAGATTACTCCCAATCAACTTAAGAAAATTTGTGCTTTGTCTTACACAGGATGG GGTCGTTTTTCAAAAAAGTTCTTAGAAGAGATTACCGCACCTGATCCAGAAACAGGCGAAGTATGGAAT ATAATTACCGCCTTATGGGAATCGAACAATAATCTTATGCAACTTCTGAGCAATGAATATCGTTTCATGG AAGAAGTTGAGACTTACAACATGGGCAAACAGACGAAGACTTTATCCTATGAAACTGTGGAAAATATGT ATGTATCACCTTCTGTCAAGAGACAAATTTGGCAAACCTTAAAAATTGTCAAAGAATTAGAAAAGGTAA TGAAGGAGTCTCCTAAACGTGTGTTTATTGAAATGGCTAGAGAAAAACAAGAGTCAAAAAGAACCGAG TCAAGAAAGAAGCAGTTAATCGATTTATATAAGGCTTGTAAAAACGAAGAGAAAGATTGGGTTAAAGA ATTGGGGGACCAAGAGGAACAAAAACTACGGTCGGATAAGTTGTATTTATACTATACGCAAAAGGGAC GATGTATGTATTCCGGCGAGGTAATAGAATTGAAGGATTTATGGGACAATACAAAATATGACATAGACC ATATATATCCCCAATCAAAAACGATGGACGATAGCTTGAACAATAGAGTACTCGTGAAAAAAAAATATA ATGCGACCAAATC TGATAAGTATCCTCTGAATGAAAATATCAGACATGAAAGAAAGGGGTTCTGGAAGT CCTTGTTAGATGGTGGGTTTATAAGCAAAGAAAAGTACGAGCGTCTAATAAGAAACACGGAGTTATCGC CAGAAGAACTCGCTGGTTTTATTGAGAGGCAAATCGTGGAAACGAGACAATCTACCAAAGCCGTTGCTG AGATCCTAAAGCAAGTTTTCCCAGAGTCGGAGATTGTCTATGT CAAAGCTGGCACAGTGAGCAGGTTTA GGAAAGACTTCGAACTATTAAAGGTAAGAGAAGTGAACGATTTACATCACGCAAAGGACGCTTACCTAA ATATCGTTGTAGGTAACTCATATTATGTTAAATTTACCAAGAACGCCTCTTGGTTTATAAAGGAGAACCC AGGTAGAACATATAACCTGAAAAAGATGTTCACCTCTGGTTGGAATATTGAGAGAAACGGAGAAGTCGC ATGGGAAGTTGGTAAGAAAGGGACTATAGTGACAGTAAAGCAAATTATGAACAAAAATAATATCCTCG TTACAAGGCAGGTTCATGAAGCAAAGGGCGGCCTTTTTGACCAACAAATTATGAAGAAAGGGAAAGGT CAAATTGCAATAAAAGAAACCGATGAGAGACTAGCGTCAATAGAAAAGTATGGTGGCTATAATAAAGC TGCGGGTGCATACTTTATGCTTGTTGAATCAAAAGACAAGAAAGGTAAGACTATTAGAACTATAGAATT TATACCCCTGTACCTTAAAAACAAAATTGAATCGGATGAGTCAATCGCGTTAAATTTTCTAGAGAAAGG AAGGGGTTTAAAAGAACCAAAGATCCTGTTAAAAAAGATTAAGATTGACACCTTGTTCGATGTAGATGG ATTTAAAATGTGGTTATCTGGCAGAACAGGCGATAGACTTTTGTTTAAGTGCGCTAATCAATTAATTTTG GATGAGAAAATCATTGTCACAATGAAAAAAATAGTTAAGTTTATTCAGAGAAGACAAGAAAACAGGGA GTTGAAATTATCTGATAAAGATGGTATCGACAATGAAGTTTTAATGGAAATCTACAATACATTCGTTGAT AAACTTGAAAATACCGTATATCGAATCAGGTTAAGTGAACAAGCCAAAACATTAATTGATAAACAAAAA ????cc?ccAcccTATTTGAAATTTTGCATATATTccAGTccc AATCTTCAGCAGCTAATTTAAAAATGATTGGCGGACCTGGGAAAGCCGGCATCCTAGTGATGAACAATA ATATCTCCAAGTGTAACAAAATATCAATTATTAACCAATCTCCGACAGGTATTTTTGAAAATGAAATAGA CTTGCTTAAGATATAA SEQ ATGTCTTTCGACTCTTTCACCAACCTGTACTCTCTGTCTAAAACCCTGAAATTCGAAATGCGTCCGGTTGG ID TAACACCCAGAAAATGCTGGACAACGCGGGTGTTTTCGAAAAAGACAAACTGATCCAGAAAAAATACG NO : GTAAAACCAAACCGTACTTCGACCGTCTGCACCGTGAATTCATCGAAGAAGCGCTGACCGGTGTTGAAC 50 TGATCGGTCTGGACGAAAACTTCCGTACCCTGGTTGACTGGCAGAAAGACAAAAAAAACAACGTTGCGA TGAAAGCGTACGAAAACTCTCTGCAGCGTCTGCGTACCGAAAT CGGTAAAATCTTCAACCTGAAAGCGG AAGACTGGGTTAAAAACAAATACCCGATCCTGGGTCTGAAAAACAAAAACACCGACATCCTGTTCGAAG US 10 , 435 , 714 B2 113 114 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence AAGCGGTTTTCGGTATCCTGAAAGCGCGTTACGGTGAAGAAAAAACACCTTCATCGAAGTTGAAGAAA TCGACAAAACCGGTAAATCTAAAATCAACCAGATCTCTATCTTCGACTCTTGGAAAGGTTTCACCGGTTA CTTCAAAAAATTCTTCGAAACCCGTAAAAACTTCTACAAAAACGACGGTACCTCTACCGCGATCGCGAC CCGTATCATCGACCAGAACCTGAAACGTTTCATCGACAACCTGTCTATCGTTGAATCTGTTCGTCAGAAA GTTGACCTGGCGGAAACCGAAAAATCTTTCTCTATCTCTCTGTCTCAGTTCTTCTCTATCGACTTCTACAA CAAATGCCTGCTGCAGGACGGTATCGACTACTACAACAAAATCATCGGTGGTGAAACCCTGAAAAACGG TGAAAAACTGATCGGTCTGAACGAACTGATCAACCAGTACCGT CAGAACAACAAAGACCAGAAAATCC CGTTCTTCAAACTGCTGGACAAACAGATCCTGTCTGAAAAAATCCTGTTCCTGGACGAAATCAAAAACG ACACCGAACTGATCGAAGCGCTGTCTCAGTTCGCGAAAACCGCGGAAGAAAAAACCAAAATCGTTAAA AAACTGTTCGCGGACTTCGTTGAAAACAACTCTAAATACGACC TGGCGCAGATCTACATCTCTCAGGAA GCGTTCAACACCATCTCTAACAAATGGACCTCTGAAACCGAAACCTTCGCGAAATACCTGTTCGAAGCG ATGAAATCTGGTAAACTGGCGAAATACGAAAAAAAAGACAACTCTTACAAATTCCCGGACTTCATCGCG CTGTCTCAGATGAAATCTGCGCTGCTGTCTATCTCTCTGGAAGGTCACTTCTGGAAAGAAAAATACTACA AAATCTCTAAATTCCAGGAAAAAACCAACTGGGAACAGTTCCTGGCGATCTTCCTGTACGAATTCAACT CTCTGTTCTCTGACAAAATCAACACCAAAGACGGTGAAACCAAACAGGTTGGTTACTACCTGTTCGCGA AAGACCTGCACAACCTGATCCTGTCTGAACAGATCGACATCCCGAAAGACTCTAAAGTTACCATCAAAG ACTTCGCGGACTCTGTTCTGACCATCTACCAGATGGCGAAATACTTCGCGGTTGAAAAAAAACGTGCGT GGCTGGCGGAATACGAACTGGACTCTTTCTACACCCAGCCGGACACCGGTTACCTGCAGTTCTACGACA ACGCGTACGAAGACATCGTTCAGGTTTACAACAAACTGCGTAACTACCTGACCAAAAAACCGTACTCTG AAGAAAAATGGAAACTGAACTTCGAAAACTCTACCCTGGCGAACGGTTGGGACAAAAACAAAGAATCT GACAACTCTGCGGTTATCCTGCAGAAAGGTGGTAAATACTACCTGGGTCTGATCACCAAAGGTCACAAC AAAATCTTCGACGACCGTTTCCAGGAAAAATTCATCGTTGGTATCGAAGGTGGTAAATACGAAAAAATC GTTTACAAATTCTTCCCGGACCAGGCGAAAATGTTCCCGAAAGTTTGCTTCTCTGCGAAAGGTCTGGAAT TCTTCCGTCCGTCTGAAGAAATCCTGCGTATCTACAACAACGCGGAATTCAAAAAAGGTGAAACCTACT CTATCGACTCTATGCAGAAACTGATCGACTTCTACAAAGACTGCCTGACCAAATACGAAGGTTGGGCGT GCTACACCTTCCGTCACCTGAAACCGACCGAAGAATACCAGAACAACATCGGTGAATTCTTCCGTGACG TTGCGGAAGACGGTTACCGTATCGACTTCCAGGGTATCTCTGACCAGTACATCCACGAAAAAAACGAAA AAGGTGAACTGCACCTGTTCGAAATCCACAACAAAGACTGGAACCTGGACAAAGCGCGTGACGGTAAA TCTAAAACCACCCAGAAAAACCTGCACACCCTGTACTTCGAATCTCTGTTCTCTAACGACAACGTTGTTC AGAACTTCCCGATCAAACTGAACGGTCAGGCGGAAATCTTCTACCGTCCGAAAACCGAAAAAGACAAA CTGGAATCTAAAAAAGACAAAAAAGGTAACAAAGTTATCGACCACAAACGTTACTCTGAAAACAAAAT CTTCTTCCACGTTCCGCTGACCCTGAACCGTACCAAAAACGACTCTTACCGTTTCAACGCGCAGATCAAC AACTTCCTGGCGAACAACAAAGACATCAACATCATCGGTGTTGACCGTGGTGAAAAACACCTGGTTTAC TACTCTGTTATCACCCAGGCGTCTGACATCCTGGAATCTGGTTCTCTGAACGAACTGAACGGTGTTAACT ACGCGGAAAAACTGGGTAAAAAAGCGGAAAACCGTGAACAGGCGCGTCGTGACTGGCAGGACGTTCAG GGTATCAAAGACCTGAAAAAAGGTTACATCTCTCAGGTTGTTCGTAAACTGGCGGACCTGGCGATCAAA CACAACGCGATCATCATCCTGGAAGACCTGAACATGCGTTTCAAACAGGTTCGTGGTGGTATCGAAAAA TCTATCTACCAGCAGCTGGAAAAAGCGCTGATCGACAAACTGTCTTTCCTGGTTGACAAAGGTGAAAAA AACCCGGAACAGGCGGGTCACCTGCTGAAAGCGTACCAGCTGTCTGCGCCGTTCGAAACCTTCCAGAAA ATGGGTAAACAGACCGGTATCATCTTCTACACCCAGGCGTCTTACACCTCTAAATCTGACCCGGTTACCG GTTGGCGTCCGCACCTGTACCTGAAATACTTCTCTGCGAAAAAAGCGAAAGACGACATCGCGAAATTCA CCAAAATCGAATTCGTTAACGACCGTTTCGAACTGACCTACGACATCAAAGACTTCCAGCAGGCGAAAG AATACCCGAACAAAACCGTTTGGAAAGTTTGCTCTAACGTTGAACGTTTCCGTTGGGACAAAAACCTGA ACCAGAACAAAGGTGGTTACACCCACTACACCAACATCACCGAAAACATCCAGGAACTGTTCACCAAAT ACGGTATCGACATCACCAAAGACCTGCTGACCCAGATCTCTACCATCGACGAAAAACAGAACACCTCTT TCTTCCGTGACTTCATCTTCTACTTCAACCTGATCTGCCAGATCCGTAACACCGACGACTCTGAAATCGC GAAAAAAAACGGTAAAGACGACTTCATCCTGTCTCCGGTTGAACCGTTCTTCGACTCTCGTAAAGACAA CGGTAACAAACTGCCGGAAAACGGTGACGACAACGGTGCGTACAACATCGCGCGTAAAGGTATCGTTAT CCTGAACAAAATCTCTCAGTACTCTGAAAAAAACGAAAACTGCGAAAAAATGAAATGGGGTGACCTGT ACGTTTCTAACATCGACTGGGACAACTTCGTT SEQ ATGGAAAACTTTAAAAACTTATACCCAATAAACAAAACGTTACGTTTTGAACTGCGTCCATATGGTAAA ID ACACTGGAAAACTTTAAAAAAAGCGGTTTGTTGGAGAAGGATGCATTTAAAGCGAACTCTCGCAGATCC NO : ATGCAGGCCATCATTGATGAAAAATTTAAAGAGACGATCGAAGAACGTCTGAAATACACGGAATTTAGT 51 GAGTGTGACTTAGGTAATATGACTTCTAAAGATAAGAAAATCACCGATAAGGCGGCGACCAACCTGAAG AAGCAAGTCATTTTATCTTTTGATGATGAAATCTTTAACAACTATTTGAAACCGGACAAAAACATCGATG CCTTATTTAAAAATGACCCTTCGAACCCGGTGATTAGCACATTTAAGGGCTTCACAACGTATTTTGTCAA TTTTTTTGAAATTCGTAAACATATCTTCAAAGGAGAATCAAGCGGCTCTATGGCTTATCGCATTATTGAT GAAAACCTGACGACCTATTTGAATAACATTGAAAAAATCAAAAAACTGCCAGAGGAATTAAAGTCTCAG TTAGAAGGCATCGACCAGATCGACAAACTCAACAACTATAACGAATTTATTACGCAGTCTGGTATCACC CACTATAATGAAATTATTGGAGGTATCAGTAAATCAGAAAATGTGAAAATCCAAGGGATTAATGAAGGC ATTAACCTCTATTGCCAGAAAAATAAAGTGAAACTGCCGAGGC TGACTCCACTCTACAAAATGATCCTG TCTGACCGCGTCTCGAATAGCTTTGTCCTGGACACAATTGAAAACGATACGGAATTGATTGAGATGATA AGCGATCTGATTAACAAAACCGAAATTTCACAGGATGTAATCATGAGTGATATACAAAACATCTTTATT AAATATAAACAGCTTGGTAATCTGCCTGGAATTAGCTATTCGT CAATAGTGAACGCAATCTGTTCTGATT ATGATAACAATTTTGGCGACGGTAAGCGTAAAAAGAGTTATGAAAACGATAGGAAAAAACACCTGGAA ACTAACGTGTATTCTATCAACTATATCAGCGAACTGCTTACGGACACCGATGTGAGTTCAAACATTAAGA TGCGGTATAAGGAGCTTGAACAGAACTACCAGGTCTGTAAGGAAAACTTCAACGCAACCAACTGGATGA ACATTAAAAATATCAAACAATCCGAGAAGACCAACTTAATCAAAGATCTGCTGGATATTTTGAAGAGCA TTCAACGTTTTTATGATCTGTTCGATATCGTTGATGAAGACAAGAATCCTAGTGCGGAATTTTATACATG GCTGTCTAAAAATGCGGAGAAATTGGATTTCGAATTCAATTCTGTTTATAATAAATCACGCAACTATTTG ACCCGCAAACAATACAGCGACAAAAAGATAAAACTAAACTTCGACAGTCCGACATTGGCAAAGGGCTG GGACGCAAATAAGGAAATCGATAACTCTACGATAATTATGCGTAAGTTCAATAATGATCGAGGTGATTA TGATTATTTCTTAGGCATTTGGAACAAAAGCACCCCGGCCAACGAAAAGATAATTCCACTGGAGGATAA CGGTCTGTTCGAAAAAATGCAGTACAAATTATATCCGGATCCAAGCAAGATGCTTCCAAAGCAGTTTCT GTCTAAAATTTGGAAAGCTAAGCATCCGACCACCCCAGAATTTGACAAGAAATATAAGGAAGGCCGCCA US 10 , 435 , 714 B2 115 116 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence TAAGAAAGGTCCCGATTTTGAAAAAGAATTCTTGCACGAACTGATTGATTGCTTTAAACATGGCTTAGTC AATCACGATGAAAAGTATCAAGATGTTTTTGGATTCAATTTGAGAAACACAGAAGACTACAATTCCTAC ACTGAGTTTCTCGAAGATGTGGAACGATGTAATTATAATCTGAGCTTTAACAAAATCGCGGACACCTCG AATCTGATTAACGATGGTAAACTTTATGTTTTCCAGATCTGGAGCAAGGATTTCTCTATTGACAGCAAAG GCACCAAAAACCTGAACACCATTTACTTTGAAAGTCTCTTCAGCGAAGAAAATATGATTGAGAAAATGT TTAAACTTAGCGGTGAAGCTGAAATATTCTATCGCCCGGCAAGCCTGAACTATTGCGAAGACATTATCA AAAAGGGTCATCACCACGCTGAACTGAAAGATAAATTTGATTATCCTATCATAAAAGATAAACGCTATA GCCAGGATAAATTTTTTTTTCATGTTCCTATGGTCATTAACTACAAATCAGAAAAACTGAACTCTAAAAG CCTCAATAATCGAACCAATGAAAACCTTGGGCAGTTTACCCATATAATTGGAATTGATCGCGGAGAGCG TCATTTAATCTACCTGACCGTAGTCGATGTATCGACCGGCGAGATCGTCGAGCAGAAGCACTTAGACGA GATTATCAACACTGATACCAAAGGTGTTGAGCATAAGACGCACTATCTAAACAAGCTGGAGGAAAAATC GAAAACCCGTGATAATGAACGTAAGAGTTGGGAGGCAATTGAAACGATTAAAGAACTGAAGGAGGGTT ATATCAGCCACGTAATCAATGAAATTCAAAAACTGCAGGAAAAATACAACGCCCTGATCGTTATGGAAA ATCTGAATTACGGTTTCAAAAATTCTCGCATCAAAGTGGAAAAACAGGTATATCAGAAGTTCGAGACGG CATTAATTAAAAAGTTTAATTACATCATTGACAAAAAAGATCCGGAAACTTATATTCATGGCTATCAGCT GACGAACCCGATCACCACACTGGATAAAATTGGTAACCAGTCTGGTATCGTGCTTTACATCCCTGCCTGG AATACCAGTAAAATCGATCCGGTAACGGGATTCGTCAACCTTCTATATGCAGATGACCTCAAATATAAG AATCAGGAACAGGCCAAGTCTTTTATTCAGAAAATCGATAACATTTACTTTGAGAATGGGGAATTCAAA TTTGATATTGATTTTTCTAAATGGAACAATCGTTATAGTATATCTAAGACGAAATGGACGCTCACCTCGT ACGGAACCCGAATCCAGACATTCCGCAATCCGCAGAAGAACAATAAATGGGACAGCGCCGAGTATGAT CTCACTGAAGAATTCAAATTGATTCTGAACATTGACGGTACCC TGAAAAGCCAGGATGTCGAAACCTAT AAAAAATTTATGTCTCTGTTCAAGCTGATGCTGCAACTTAGGAACTCTGTTACCGGCACTGATATCGATT ATATGATCTCCCCTGTCACTGATAAAACAGGTACGCATTTCGATTCGCGCGAAAATATCAAAAATCTGCC CGCAGATGCCGACGCCAATGGGGCGTACAATATTGCACGCAAGGGTATCATGGCGATCGAAAACATTAT GAATGGTATCAGCGACCCGCTGAAAATCTCAAACGAAGATTATTTGAAATATATCCAAAACCAGCAGGA ATAA

SEO ATGACCCAGTTCGAAGGTTTCACCAACCTGTACCAGGTTTCTAAAACCCTGCGTTTCGAACTGATCCCGC ID AGGGTAAAACCCTGAAACACATCCAGGAACAGGGTTTCATCGAAGAAGACAAAGCGCGTAACGACCAC NO : TACAAAGAACTGAAACCGATCATCGACCGTATCTACAAAACCTACGCGGACCAGTGCCTGCAGCTGGTT 52 CAGCTGGACTGGGAAAACCTGTCTGCGGCGATCGACTCTTACCGTAAAGAAAAAACCGAAGAAACCCGT AACGCGCTGATCGAAGAACAGGCGACCTACCGTAACGCGATCCACGACTACTTCATCGGTCGTACCGAC AACCTGACCGACGCGATCAACAAACGTCACGCGGAAATCTACAAAGGTCTGTTCAAAGCGGAACTGTTC AACGGTAAAGTTCTGAAACAGCTGGGTACCGTTACCACCACCGAACACGAAAACGCGCTGCTGCGTTCT TTCGACAAATTCACCACCTACTTCTCTGGTTTCTACGAAAACCGTAAAAACGTTTTCTCTGCGGAAGACA TCTCTACCGCGATCCCGCACCGTATCGTTCAGGACAACTTCCCGAAATTCAAAGAAAACTGCCACATCTT CACCCGTCTGATCACCGCGGTTCCGTCTCTGCGTGAACACTTCGAAAACGTTAAAAAAGCGATCGGTATC TTCGTTTCTACCTCTATCGAAGAAGTTTTCTCTTTCCCGTTCTACAACCAGCTGCTGACCCAGACCCAGAT CGACCTGTACAACCAGCTGCTGGGTGGTATCTCTCGTGAAGCGGGTACCGAAAAAATCAAAGGTCTGAA CGAAGTTCTGAACCTGGCGATCCAGAAAAACGACGAAACCGCGCACATCATCGCGTCTCTGCCGCACCG TTTCATCCCGCTGTTCAAACAGATCCTGTCTGACCGTAACACCCTGTCTTTCATCCTGGAAGAATTCAAA TCTGACGAAGAAGTTATCCAGTCTTTCTGCAAATACAAAACCCTGCTGCGTAACGAAAACGTTCTGGAA ACCGCGGAAGCGCTGTTCAACGAACTGAACTCTATCGACCTGACCCACATCTTCATCTCTCACAAAAAA CTGGAAACCATCTCTTCTGCGCTGTGCGACCACTGGGACACCC TGCGTAACGCGCTGTACGAACGTCGTA TCTCTGAACTGACCGGTAAAATCACCAAATCTGCGAAAGAAAAAGTTCAGCGTTCTCTGAAACACGAAG ACATCAACCTGCAGGAAATCATCTCTGCGGCGGGTAAAGAACTGTCTGAAGCGTTCAAACAGAAAACCT CTGAAATCCTGTCTCACGCGCACGCGGCGCTGGACCAGCCGCTGCCGACCACCCTGAAAAAACAGGAAG AAAAAGAAATCCTGAAATCTCAGCTGGACTCTCTGCTGGGTCTGTACCACCTGCTGGACTGGTTCGCGGT TGACGAATCTAACGAAGTTGACCCGGAATTCTCTGCGCGTCTGACCGGTATCAAACTGGAAATGGAACC GTCTCTGTCTTTCTACAACAAAGCGCGTAACTACGCGACCAAAAAACCGTACTCTGTTGAAAAATTCAA ACTGAACTTCCAGATGCCGACCCTGGCGTCTGGTTGGGACGTTAACAAAGAAAAAAACAACGGTGCGAT CTGTTCGTTAAAAACGGTCTGTACTACCTGGGTATCATGCCGAAACAGAAAGGTCGTTACAAAGCGCT GTCTTTCGAACCGACCGAAAAAACCTCTGAAGGTTTCGACAAAATGTACTACGACTACTTCCCGGACGC GGCGAAAATGATCCCGAAATGCTCTACCCAGCTGAAAGCGGTTACCGCGCACTTCCAGACCCACACCAC CCCGATCCTGCTGTCTAACAACTTCATCGAACCGCTGGAAATCACCAAAGAAATCTACGACCTGAACAA CCCGGAAAAAGAACCGAAAAAATTCCAGACCGCGTACGCGAAAAAAACCGGTGACCAGAAAGGTTACC GTGAAGCGCTGTGCAAATGGATCGACTTCACCCGTGACTTCCTGTCTAAATACACCAAAACCACCTCTAT CGACCTGTCTTCTCTGCGTCCGTCTTCTCAGTACAAAGACCTGGGTGAATACTACGCGGAACTGAACCCG CTGCTGTACCACATCTCTTTCCAGCGTATCGCGGAAAAAGAAATCATGGACGCGGTTGAAACCGGTAAA CTGTACCTGTTCCAGATCTACAACAAAGACTTCGCGAAAGGTCACCACGGTAAACCGAACCTGCACACC CTGTACTGGACCGGTCTGTTCTCTCCGGAAAACCTGGCGAAAACCTCTATCAAACTGAACGGTCAGGCG GAACTGTTCTACCGTCCGAAATCTCGTATGAAACGTATGGCGCACCGTCTGGGTGAAAAAATGCTGAAC AAAAAACTGAAAGACCAGAAAACCCCGATCCCGGACACCCTGTACCAGGAACTGTACGACTACGTTAA CCACCGTCTGTCTCACGACCTGTCTGACGAAGCGCGTGCGCTGCTGCCGAACGTTATCACCAAAGAAGTT TCTCACGAAATCATCAAAGACCGTCGTTTCACCTCTGACAAATTCTTCTTCCACGTTCCGATCACCCTGA ACTACCAGGCGGCGAACTCTCCGTCTAAATTCAACCAGCGTGTTAACGCGTACCTGAAAGAACACCCGG TCGACCGTGGTGAACGTAACCTGATCTACATCACCGTTATCGACTCTACCGG TAAAATCCTGGAACAGCGTTCTCTGAACACCATCCAGCAGTTCGACTACCAGAAAAAACTGGACAACCG TGAAAAAGAACGTGTTGCGGCGCGTCAGGCGTGGTCTGTTGTTGGTACCATCAAAGACCTGAAACAGGG TTACCTGTCTCAGGTTATCCACGAAATCGTTGACCTGATGATCCACTACCAGGCGGTTGTTGTTCTGGAA AACCTGAACTTCGGTTTCAAATCTAAACGTACCGGTATCGCGGAAAAAGCGGTTTACCAGCAGTTCGAA AAAATGCTGATCGACAAACTGAACTGCCTGGTTCTGAAAGACTACCCGGCGGAAAAAGTTGGTGGTGTT CTGAACCCGTACCAGCTGACCGACCAGTTCACCTCTTTCGCGAAAATGGGTACCCAGTCTGGTTTCCTGT TCTACGTTCCGGCGCCGTACACCTCTAAAATCGACCCGCTGACCGGTTTCGTTGACCCGTTCGTTTGGAA AACCATCAAAAACCACGAATCTCGTAAACACTTCCTGGAAGGT TTCGACTTCCTGCACTACGACGTTAA AACCGGTGACTTCATCCTGCACTTCAAAATGAACCGTAACCTGTCTTTCCAGCGTGGTCTGCCGGGTTTC US 10 , 435 , 714 B2 117 118 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence ATGCCGGCGTGGGACATCGTTTTCGAAAAAAACGAAACCCAGT TCGACGCGAAAGGTACCCCGTTCATC GCGGGTAAACGTATCGTTCCGGTTATCGAAAACCACCGTTTCACCGGTCGTTACCGTGACCTGTACCCGG CGAACGAACTGATCGCGCTGCTGGAAGAAAAAGGTATCGTTTTCCGTGACGGTTCTAACATCCTGCCGA AACTGCTGGAAAACGACGACTCTCACGCGATCGACACCATGGTTGCGCTGATCCGTTCTGTTCTGCAGAT GCGTAACTCTAACGCGGCGACCGGTGAAGACTACATCAACTCTCCGGTTCGTGACCTGAACGGTGTTTG CTTCGACTCTCGTTTCCAGAACCCGGAATGGCCGATGGACGCGGACGCGAACGGTGCGTACCACATCGC GCTGAAAGGTCAGCTGCTGCTGAACCACCTGAAAGAATCTAAAGACCTGAAACTGCAGAACGGTATCTC TAACCAGGACTGGCTGGCGTACATCCAGGAACTGCGTAACTA SEQ ATGGCGGTTAAATCTATCAAAGTTAAACTGCGTCTGGACGACATGCCGGAAATCCGTGCGGGTCTGTGG ID AAACTGCACAAAGAAGTTAACGCGGGTGTTCGTTACTACACCGAATGGCTGTCTCTGCTGCGTCAGGAA NO : AACCTGTACCGTCGTTCTCCGAACGGTGACGGTGAACAGGAATGCGACAAAACCGCGGAAGAATGCAA 53 AGCGGAACTGCTGGAACGTCTGCGTGCGCGTCAGGTTGAAAACGGTCACCGTGGTCCGGCGGGTTCTGA CGACGAACTGCTGCAGCTGGCGCGTCAGCTGTACGAACTGCTGGTTCCGCAGGCGATCGGTGCGAAAGG TGACGCGCAGCAGATCGCGCGTAAATTCCTGTCTCCGCTGGCGGACAAAGACGCGGTTGGTGGTCTGGG TATCGCGAAAGCGGGTAACAAACCGCGTTGGGTTCGTATGCGTGAAGCGGGTGAACCGGGTTGGGAAG AAGAAAAAGAAAAAGCGGAAACCCGTAAATCTGCGGACCGTACCGCGGACGTTCTGCGTGCGCTGGCG GACTTCGGTCTGAAACCGCTGATGCGTGTTTACACCGACTCTGAAATGTCTTCTGTTGAATGGAAACCGC TGCGTAAAGGTCAGGCGGTTCGTACCTGGGACCGTGACATGTTCCAGCAGGCGATCGAACGTATGATGT CTTGGGAATCTTGGAACCAGCGTGTTGGTCAGGAATACGCGAAACTGGTTGAACAGAAAAACCGTTTCG AACAGAAAAACTTCGTTGGTCAGGAACACCTGGTTCACCTGGTTAACCAGCTGCAGCAGGACATGAAAG AAGCGTCTCCGGGTCTGGAATCTAAAGAACAGACCGCGCACTACGTTACCGGTCGTGCGCTGCGTGGTT CTGACAAAGTTTTCGAAAAATGGGGTAAACTGGCGCCGGACGCGCCGTTCGACCTGTACGACGCGGAAA TCAAAAACGTTCAGCGTCGTAACACCCGTCGTTTCGGTTCTCACGACCTGTTCGCGAAACTGGCGGAACC GGAATACCAGGCGCTGTGGCGTGAAGACGCGTCTTTCCTGACCCGTTACGCGGTTTACAACTCTATCCTG CGTAAACTGAACCACGCGAAAATGTTCGCGACCTTCACCCTGCCGGACGCGACCGCGCACCCGATCTGG ACCCGTTTCGACAAACTGGGTGGTAACCTGCACCAGTACACCTTCCTGTTCAACGAATTCGGTGAACGTC GTCACGCGATCCGTTTCCACAAACTGCTGAAAGTTGAAAACGGTGTTGCGCGTGAAGTTGACGACGTTA CCGTTCCGATCTCTATGTCTGAACAGCTGGACAACCTGCTGCCGCGTGACCCGAACGAACCGATCGCGCT GTACTTCCGTGACTACGGTGCGGAACAGCACTTCACCGGTGAATTCGGTGGTGCGAAAATCCAGTGCCG TCGTGACCAGCTGGCGCACATGCACCGTCGTCGTGGTGCGCGTGACGTTTACCTGAACGTTTCTGTTCGT GTTCAGTCTCAGTCTGAAGCGCGTGGTGAACGTCGTCCGCCGTACGCGGCGGTTTTCCGTCTGGTTGGTG ACAACCACCGTGCGTTCGTTCACTTCGACAAACTGTCTGACTACCTGGCGGAACACCCGGACGACGGTA GTCTGCTGTCTGGTCTGCGTGTTATGTC TGTTGACCTGGGTCTGCGTACCTCTGCG TCTATCTCTGTTTTCCGTGTTGCGCGTAAAGACGAACTGAAACCGAACTCTAAAGGTCGTGTTCCGTTCT TCTTCCCGATCAAAGGTAACGACAACCTGGTTGCGGTTCACGAACGTTCTCAGCTGCTGAAACTGCCGG GTGAAACCGAATCTAAAGACCTGCGTGCGATCCGTGAAGAACGTCAGCGTACCCTGCGTCAGCTGCGTA CCCAGCTGGCGTACCTGCGTCTGCTGGTTCGTTGCGGTTCTGAAGACGTTGGTCGTCGTGAACGTTCTTG GGCGAAACTGATCGAACAGCCGGTTGACGCGGCGAACCACATGACCCCGGACTGGCGTGAAGCGTTCG AAAACGAACTGCAGAAACTGAAATCTCTGCACGGTATCTGCTC TGACAAAGAATGGATGGACGCGGTTT ACGAATCTGTTCGTCGTGTTTGGCGTCACATGGGTAAACAGGT TCGTGACTGGCGTAAAGACGTTCGTTC TGGTGAACGTCCGAAAATCCGTGGTTACGCGAAAGACGTTGTTGGTGGTAACTCTATCGAACAGATCGA ATACCTGGAACGTCAGTACAAATTCCTGAAATCTTGGTCTTTCTTCGGTAAAGTTTCTGGTCAGGTTATC CGTGCGGAAAAAGGTTCTCGTTTCGCGATCACCCTGCGTGAACACATCGACCACGCGAAAGAAGACCGT CTGAAAAAACTGGCGGACCGTATCATCATGGAAGCGCTGGGTTACGTTTACGCGCTGGACGAACGTGGT AAAGGTAAATGGGTTGCGAAATACCCGCCGTGCCAGCTGATCCTGCTGGAAGAACTGTCTGAATACCAG TTCAACAACGACCGTCCGCCGTCTGAAAACAACCAGCTGATGCAGTGGTCTCACCGTGGTGTTTTCCAGG AACTGATCAACCAGGCGCAGGTTCACGACCTGCTGGTTGGTACCATGTACGCGGCGTTCTCTTCTCGTTT CGACGCGCGTACCGGTGCGCCGGGTATCCGTTGCCGTCGTGTTCCGGCGCGTTGCACCCAGGAACACAA CCCGGAACCGTTCCCGTGGTGGCTGAACAAATTCGTTGTTGAACACACCCTGGACGCGTGCCCGCTGCGT GCGGACGACCTGATCCCGACCGGTGAAGGTGAAATCTTCGTTTCTCCGTTCTCTGCGGAAGAAGGTGAC TTCCACCAGATCCACGCGGACCTGAACGCGGCGCAGAACCTGCAGCAGCGTCTGTGGTCTGACTTCGAC ATCTCTCAGATCCGTCTGCGTTGCGACTGGGGTGAAGTTGACGGTGAACTGGTTCTGATCCCGCGTCTGA CCGGTAAACGTACCGCGGACTCTTACTCTAACAAAGTTTTCTACACCAACACCGGTGTTACCTACTACGA ACGTGAACGTGGTAAAAAACGTCGTAAAGTTTTCGCGCAGGAAAAACTGTCTGAAGAAGAAGCGGAAC TGCTGGTTGAAGCGGACGAAGCGCGTGAAAAATCTGTTGTTCTGATGCGTGACCCGTCTGGTATCATCA ACCGTGGTAACTGGACCCGTCAGAAAGAATTCTGGTCTATGGT TAACCAGCGTATCGAAGGTTACCTGG TTAAACAGATCCGTTCTCGTGTTCCGCTGCAGGACTCTGCGTGCGAAAACACCGGTGACATCTAA SEQ ATGGCGACCCGTTCTTTCATCCTGAAAATCGAACCGAACGAAGAAGTTAAAAAAGGTCTGTGGAAAACC ID CACGAAGTTCTGAACCACGGTATCGCGTACTACATGAACATCCTGAAACTGATCCGTCAGGAAGCGATC NO : TACGAACACCACGAACAGGACCCGAAAAACCCGAAAAAAGTTTCTAAAGCGGAAATCCAGGCGGAACT 54 GTGGGACTTCGTTCTGAAAATGCAGAAATGCAACTCTTTCACCCACGAAGTTGACAAAGACGTTGTTTTC AACATCCTGCGTGAACTGTACGAAGAACTGGTTCCGTCTTCTGTTGAAAAAAAAGGTGAAGCGAACCAG CTGTCTAACAAATTCCTGTACCCGCTGGTTGACCCGAACTCTCAGTCTGGTAAAGGTACCGCGTCTTCTG GTCGTAAACCGCGTTGGTACAACCTGAAAATCGCGGGTGACCCGTCTTGGGAAGAAGAAAAAAAAAAA AAAAGACCCGCTGGCGAAAATCCTGGGTAAACTGGCGGAATACGGTCTGAT CCCGCTGTTCATCCCGTTCACCGACTCTAACGAACCGATCGTTAAAGAAATCAAATGGATGGAAAAATC TCGTAACCAGTCTGTTCGTCGTCTGGACAAAGACATGTTCATCCAGGCGCTGGAACGTTTCCTGTCTTGG GAATCTTGGAACCTGAAAGTTAAAGAAGAATACGAAAAAGTTGAAAAAGAACACAAAACCCTGGAAGA ACGTATCAAAGAAGACATCCAGGCGTTCAAATCTCTGGAACAGTACGAAAAAGAACGTCAGGAACAGC TGCTGCGTGACACCCTGAACACCAACGAATACCGTCTGTCTAAACGTGGTCTGCGTGGTTGGCGTGAAA TCATCCAGAAATGGCTGAAAATGGACGAAAACGAACCGTCTGAAAAATACCTGGAAGTTTTCAAAGACT ACCAGCGTAAACACCCGCGTGAAGCGGGTGACTACTCTGTTTACGAATTCCTGTCTAAAAAAGAAAACC ACTTCATCTGGCGTAACCACCCGGAATACCCGTACCTGTACGCGACCTTCTGCGAAATCGACAAAAAAA AAAAAGACGCGAAACAGCAGGCGACCTTCACCCTGGCGGACCCGATCAACCACCCGCTGTGGGTTCGTT US 10 , 435 , 714 B2 119 120 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence TCGAAGAACGTTCTGGTTCTAACCTGAACAAATACCGTATCCTGACCGAACAGCTGCACACCGAAAAAC TGAAAAAAAAACTGACCGTTCAGCTGGACCGTCTGATCTACCCGACCGAATCTGGTGGTTGGGAAGAAA AAGGTAAAGTTGACATCGTTCTGCTGCCGTCTCGTCAGTTCTACHCGTCTCGTCAGTTCTACAACCAGATCTTCCTGGACATCGAAGA AAAAGGTAAACACGCGTTCACCTACAAAGACGAATCTATCAAATTCCCGCTGAAAGGTACCCTGGGTGG TGCGCGTGTTCAGTTCGACCGTGACCACCTGCGTCGTTACCCGCACAAAGTTGAATCTGGTAACGTTGGT CGTATCTACTTCAACATGACCGTTAACATCGAACCGACCGAATCTCCGGTTTCTAAATCTCTGAAAATCC ACCGTGACGACTTCCCGAAATTCGTTAACTTCAAACCGAAAGAACTGACCGAATGGATCAAAGACTCTA AAGGTAAAAAACTGAAATCTGGTATCGAATCTCTGGAAATCGGTCTGCGTGTTATGTCTATCGACCTGG GTCAGCGTCAGGCGGCGGCGGCGTCTATCTTCGAAGTTGTTGACCAGAAACCGGACATCGAAGGTAAAC TGTTCTTCCCGATCAAAGGTACCGAACTGTACGCGGTTCACCGTGCGTCTTTCAACATCAAACTGCCGGG TGAAACCCTGGTTAAATCTCGTGAAGTTCTGCGTAAAGCGCGTGAAGACAACCTGAAACTGATGAACCA GAAACTGAACTTCCTGCGTAACGTTCTGCACTTCCAGCAGTTCGAAGACATCACCGAACGTGAAAAACG TGTTACCAAATGGATCTCTCGTCAGGAAAACTCTGACGTTCCGCTGGTTTACCAGGACGAACTGATCCAG ATCCGTGAACTGATGTACAAACCGTACAAAGACTGGGTTGCGTTCCTGAAACAGCTGCACAAACGTCTG GAAGTTGAAATCGGTAAAGAAGTTAAACACTGGCGTAAATCTCTGTCTGACGGTCGTAAAGGTCTGTAC GGTATCTCTCTGAAAAACATCGACGAAATCGACCGTACCCGTAAATTCCTGCTGCGTTGGTCTCTGCGTC CGACCGAACCGGGTGAAGTTCGTCGTCTGGAACCGGGTCAGCGTTTCGCGATCGACCAGCTGAACCACC TGAACGCGCTGAAAGAAGACCGTCTGAAAAAAATGGCGAACACCATCATCATGCACGCGCTGGGTTACT GCTACGACGTTCGTAAAAAAAAATGGCAGGCGAAAAACCCGGCGTGCCAGATCATCCTGTTCGAAGACC TGTCTAACTACAACCCGTACGAAGAACGTTCTCGTTTCGAAAACTCTAAACTGATGAAATGGTCTCGTCG TGAAATCCCGCGTCAGGTTGCGCTGCAGGGTGAAATCTACGGTCTGCAGGTTGGTGAAGTTGGTGCGCA GTTCTCTTCTCGTTTCCACGCGAAAACCGGTTCTCCGGGTATCCGTTGCTCTGTTGTTACCAAAGAAAAA CTGCAGGACAACCGTTTCTTCAAAAACCTGCAGCGTGAAGGTCGTCTGACCCTGGACAAAATCGCGGTT CTGAAAGAAGGTGACCTGTACCCGGACAAAGGTGGTGAAAAATTCATCTCTCTGTCTAAAGACCGTAAA CTGGTTACCACCCACGCGGACATCAACGCGGCGCAGAACCTGCAGAAACGTTTCTGGACCCGTACCCAC GGTTTCTACAAAGTTTACTGCAAAGCGTACCAGGTTGACGGTCAGACCGTTTACATCCCGGAATCTAAA GACCAGAAACAGAAAATCATCGAAGAATTCGGTGAAGGTTACTTCATCCTGAAAGACGGTGTTTACGAA TGGGGTAACGCGGGTAAACTGAAAATCAAAAAAGGTTCTTCTAAACAGTCTTCTTCTGAACTGGTTGAC TCTGACATCCTGAAAGACTCTTTCGACCTGGCGTCTGAACTGAAAGGTGAAAAACTGATGCTGTACCGT GACCCGTCTGGTAACGTTTTCCCGTCTGACAAATGGATGGCGGCGGGTGTTTTCTTCGGTAAACTGGAAC GTATCCTGATCTCTAAACTGACCAACCAGTACTCTATCTCTACCATCGAAGACGACTCTTCTAAACAGTC TATGTAA SEQ ATGCCGACCCGTACCATCAACCTGAAACTGGTTCTGGGTAAAAACCCGGAAAACGCGACCCTGCGTCGT ID GCGCTGTTCTCTACCCACCGTCTGGTTAACCAGGCGACCAAACGTATCGAAGAATTCCTGCTGCTGTGCC NO : GTGGTGAAGCGTACCGTACCGTTGACAACGAAGGTAAAGAAGCGGAAATCCCGCGTCACGCGGTTCAG 55 GAAGAAGCGCTGGCGTTCGCGAAAGCGGCGCAGCGTCACAACGGTTGCATCTCTACCTACGAAGACCAG GAAATCCTGGACGTTCTGCGTCAGCTGTACGAACGTCTGGTTCCGTCTGTTAACGAAAACAACGAAGCG GGTGACGCGCAGGCGGCGAACGCGTGGGTTTCTCCGCTGATGT CTGCGGAATCTGAAGGTGGTCTGTCT GTTTACGACAAAGTTCTGGACCCGCCGCCGGTTTGGATGAAAC TGAAAGAAGAAAAAGCGCCGGGTTGG GAAGCGGCGTCTCAGATCTGGATCCAGTCTGACGAAGGTCAGTCTCTGCTGAACAAACCGGGTTCTCCG CCGCGTTGGATCCGTAAACTGCGTTCTGGTCAGCCGTGGCAGGACGACTTCGTTTCTGACCAGAAAAAA AAACAGGACGAACTGACCAAAGGTAACGCGCCGCTGATCAAACAGCTGAAAGAAATGGGTCTGCTGCC GCTGGTTAACCCGTTCTTCCGTCACCTGCTGGACCCGGAAGGTAAAGGTGTTTCTCCGTGGGACCGTCTG GCGGTTCGTGCGGCGGTTGCGCACTTCATCTCTTGGGAATCTTGGAACCACCGTACCCGTGCGGAATACA ACTCTCTGAAACTGCGTCGTGACGAATTCGAAGCGGCGTCTGACGAATTCAAAGACGACTTCACCCTGC TGCGTCAGTACGAAGCGAAACGTCACTCTACCCTGAAATCTATCGCGCTGGCGGACGACTCTAACCCGT ACCGTATCGGTGTTCGTTCTCTGCGTGCGTGGAACCGTGTTCGTGAAGAATGGATCGACAAAGGTGCGA CCGAAGAACAGCGTGTTACCATCCTGTCTAAACTGCAGACCCAGCTGCGTGGTAAATTCGGTGACCCGG ACCTGTTCAACTGGCTGGCGCAGGACCGTCACGTTCACCTGTGGTCTCCGCGTGACTCTGTTACCCCGCT GGTTCGTATCAACGCGGTTGACAAAGTTCTGCGTCGTCGTAAACCGTACGCGCTGATGACCTTCGCGCAC CCGCGTTTCCACCCGCGTTGGATCCTGTACGAAGCGCCGGGTGGTTCTAACCTGCGTCAGTACGCGCTGG ACTGCACCGAAAACGCGCTGCACATCACCCTGCCGCTGCTGGTTGACGACGCGCACGGTACCTGGATCG AAAAAAAAATCCGTGTTCCGCTGGCGCCGTCTGGTCAGATCCAGGACCTGACCCTGGAAAAACTGGAAA AAAAAAAAAACCGTCTGTACTACCGTTCTGGTTTCCAGCAGTTCGCGGGTCTGGCGGGTGGTGCGGAAG TTCTGTTCCACCGTCCGTACATGGAACACGACGAACGTTCTGAAGAATCTCTGCTGGAACGTCCGGGTGC GGTTTGGTTCAAACTGACCCTGGACGTTGCGACCCAGGCGCCGCCGAACTGGCTGGACGGTAAAGGTCG TGTTCGTACCCCGCCGGAAGTTCACCACTTCAAAACCGCGCTGTCTAACAAATCTAAACACACCCGTACC CTGCAGCCGGGTCTGCGTGTTCTGTCTGTTGACCTGGGTATGCGTACCTTCGCGTCTTGCTCTGTTTTCGA ACTGATCGAAGGTAAACCGGAAACCGGTCGTGCGTTCCCGGTTGCGGACGAACGTTCTATGGACTCTCC GAACAAACTGTGGGCGAAACACGAACGTTCTTTCAAACTGACCCTGCCGGGTGAAACCCCGTCTCGTAA AGAAGAAGAAGAACGTTCTATCGCGCGTGCGGAAATCTACGCGCTGAAACGTGACATCCAGCGTCTGAA ATCTCTGCTGCGTCTGGGTGAAGAAGACAACGACAACCGTCGTGACGCGCTGCTGGAACAGTTCTTCAA AGGTTGGGGTGAAGAAGACGTTGTTCCGGGTCAGGCGTTCCCGCGTTCTCTGTTCCAGGGTCTGGGTGCG GCGCCGTTCCGTTCTACCCCGGAACTGTGGCGTCAGCACTGCCAGACCTACTACGACAAAGCGGAAGCG TGCCTGGCGAAACACATCTCTGACTGGCGTAAACGTACCCGTCCGCGTCCGACCTCTCGTGAAATGTGGT ACAAAACCCGTTCTTACCACGGTGGTAAATCTATCTGGATGCTGGAATACCTGGACGCGGTTCGTAAACT GCTGCTGTCTTGGTCTCTGCGTGGTCGTACCTACGGTGCGATCAACCGTCAGGACACCGCGCGTTTCGGT TCTCTGGCGTCTCGTCTGCTGCACCACATCAACTCTCTGAAAGAAGACCGTATCAAAACCGGTGCGGACT CTATCGTTCAGGCGGCGCGTGGTTACATCCCGCTGCCGCACGGTAAAGGTTGGGAACAGCGTTACGAAC CGTGCCAGCTGATCCTGTTCGAAGACCTGGCGCGTTACCGTTTCCGTGTTGACCGTCCGCGTCGTGAAAA CTCTCAGCTGATGCAGTGGAACCACCGTGCGATCGTTGCGGAAACCACCATGCAGGCGGAACTGTACGG TCAGATCGTTGAAAACACCGCGGCGGGTTTCTCTTCTCGTTTCCACGCGGCGACCGGTGCGCCGGGTGTT CGTTGCCGTTTCCTGCTGGAACGTGACTTCGACAACGACCTGCCGAAACCGTACCTGCTGCGTGAACTGT CTTGGATGCTGGGTAACACCAAAGTTGAATCTGAAGAAGAAAAACTGCGTCTGCTGTCTGAAAAAATCC GTCCGGGTTCTCTGGTTCCGTGGGACGGTGGTGAACAGTTCGCGACCCTGCACCCGAAACGTCAGACCC US 10 , 435 , 714 B2 121 122 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence TGTGCGTTATCCACGCGGACATGAACGCGGCGCAGAACCTGCAGCGTCGTTTCTTCGGTCGTTGCGGTGA AGCGTTCCGTCTGGTTTGCCAGCCGCACGGTGACGACGTTCTGCGTCTGGCGTCTACCCCGGGTGCGCGT CTGCTGGGTGCGCTGCAGCAGCTGGAAAACGGTCAGGGTGCGTTCGAACTGGTTCGTGACATGGGTTCT ACCTCTCAGATGAACCGTTTCGTTATGAAATCTCTGGGTAAAAAAAAAATCAAACCGCTGCAGGACAAC AACGGTGACGACGAACTGGAAGACGTTCTGTCTGTTCTGCCGGAAGAAGACGACACCGGTCGTATCACC GTTTTCCGTGACTCTTCTGGTATCTTCTTCCCGTGCAACGTTTGGATCCCGGCGAAACAGTTCTGGCCGGC GGTTCGTGCGATGATCTGGAAAGTTATGGCGTCTCACTCTCTGGGTTAA SEQ ATGACCAAACTGCGTCACCGTCAGAAAAAACTGACCCACGACTGGGCGGGTTCTAAAAAACGTGAAGTT ID CTGGGTTCTAACGGTAAACTGCAGAACCCGCTGCTGATGCCGGTTAAAAAAGGTCAGGTTACCGAATTC NO : CGTAAAGCGTTCTCTGCGTACGCGCGTGCGACCAAAGGTGAAATGACCGACGGTCGTAAAAACATGTTC 56 ACCCACTCTTTCGAACCGTTCAAAACCAAACCGTCTCTGCACCAGTGCGAACTGGCGGACAAAGCGTAC CAGTCTCTGCACTCTTACCTGCCGGGTTCTCTGGCGCACTTCCTGCTGTCTGCGCACGCGCTGGGTTTCCG TATCTTCTCTAAATCTGGTGAAGCGACCGCGTTCCAGGCGTCTTCTAAAATCGAAGCGTACGAATCTAAA CTGGCGTCTGAACTGGCGTGCGTTGACCTGTCTATCCAGAACC TGACCATCTCTACCCTGTTCAACGCGC TGACCACCTCTGTTCGTGGTAAAGGTGAAGAAACCTCTGCGGACCCGCTGATCGCGCGTTTCTACACCCT GCTGACCGGTAAACCGCTGTCTCGTGACACCCAGGGTCCGGAACGTGACCTGGCGGAAGTTATCTCTCG TAAAATCGCGTCTTCTTTCGGTACCTGGAAAGAAATGACCGCGAACCCGCTGCAGTCTCTGCAGTTCTTC GAAGAAGAACTGCACGCGCTGGACGCGAACGTTTCTCTGTCTCCGGCGTTCGACGTTCTGATCAAAATG AACGACCTGCAGGGTGACCTGAAAAACCGTACCATCGTTTTCGACCCGGACGCGCCGGTTTTCGAATAC AACGCGGAAGACCCGGCGGACATCATCATCAAACTGACCGCGCGTTACGCGAAAGAAGCGGTTATCAA AAACCAGAACGTTGGTAACTACGTTAAAAACGCGATCACCACCACCAACGCGAACGGTCTGGGTTGGCT GCTGAACAAAGGTCTGTCTCTGCTGCCGGTTTCTACCGACGACGAACTGCTGGAATTCATCGGTGTTGAA CGTTCTCACCCGTCTTGCCACGCGCTGATCGAACTGATCGCGCAGCTGGAAGCGCCGGAACTGTTCGAA AAAAACGTTTTCTCTGACACCCGTTCTGAAGTTCAGGGTATGATCGACTCTGCGGTTTCTAACCACATCG CGCGTCTGTCTTCTTCTCGTAACTCTCTGTCTATGGACTCTGAAGAACTGGAACGTCTGATCAAATCTTTC CAGATCCACACCCCGCACTGCTCTCTGTTCATCGGTGCGCAGTCTCTGTCTCAGCAGCTGGAATCTCTGC CGGAAGCGCTGCAGTCTGGTGTTAACTCTGCGGACATCCTGCTGGGTTCTACCCAGTACATGCTGACCAA CTCTCTGGTTGAAGAATCTATCGCGACCTACCAGCGTACCCTGAACCGTATCAACTACCTGTCTGGTGTT GCGGGTCAGATCAACGGTGCGATCAAACGTAAAGCGATCGACGGTGAAAAAATCCACCTGCCGGCGGC GTGGTCTGAACTGATCTCTCTGCCGTTCATCGGTCAGCCGGTTATCGACGTTGAATCTGACCTGGCGCAC CTGAAAAACCAGTACCAGACCCTGTCTAACGAATTCGACACCC TGATCTCTGCGCTGCAGAAAAACTTC GACCTGAACTTCAACAAAGCGCTGCTGAACCGTACCCAGCACTTCGAAGCGATGTGCCGTTCTACCAAA AAAAACGCGCTGTCTAAACCGGAAATCGTTTCTTACCGTGACC TGCTGGCGCGTCTGACCTCTTGCCTGT ACCGTGGTTCTCTGGTTCTGCGTCGTGCGGGTATCGAAGTTCTGAAAAAACACAAAATCTTCGAATCTAA CTCTGAACTGCGTGAACACGTTCACGAACGTAAACACTTCGTTTTCGTTTCTCCGCTGGACCGTAAAGCG AAAAAACTGCTGCGTCTGACCGACTCTCGTCCGGACCTGCTGCACGTTATCGACGAAATCCTGCAGCAC GACAACCTGGAAAACAAAGACCGTGAATCTCTGTGGCTGGTTCGTTCTGGTTACCTGCTGGCGGGTCTGC CGGACCAGCTGTCTTCTTCTTTCATCAACCTGCCGATCATCACCCAGAAAGGTGACCGTCGTCTGATCGA CCTGATCCAGTACGACCAGATCAACCGTGACGCGTTCGTTATGCTGGTTACCTCTGCGTTCAAATCTAAC CTGTCTGGTCTGCAGTACCGTGCGAACAAACAGTCTTTCGTTGTTACCCGTACCCTGTCTCCGTACCTGG GTTCTAAACTGGTTTACGTTCCGAAAGACAAAGACTGGCTGGT TCCGTCTCAGATGTTCGAAGGTCGTTT CGCGGACATCCTGCAGTCTGACTACATGGTTTGGAAAGACGCGGGTCGTCTGTGCGTTATCGACACCGC GAAACACCTGTCTAACATCAAAAAATCTGTTTTCTCTTCTGAAGAAGTTCTGGCGTTCCTGCGTGAACTG CCGCACCGTACCTTCATCCAGACCGAAGTTCGTGGTCTGGGTGTTAACGTTGACGGTATCGCGTTCAACA ACGGTGACATCCCGTCTCTGAAAACCTTCTCTAACTGCGTTCAGGTTAAAGTTTCTCGTACCAACACCTC TCTGGTTCAGACCCTGAACCGTTGGTTCGAAGGTGGTAAAGTTTCTCCGCCGTCTATCCAGTTCGAACGT GCGTACTACAAAAAAGACGACCAGATCCACGAAGACGCGGCGAAACGTAAAATCCGTTTCCAGATGCC GGCGACCGAACTGGTTCACGCGTCTGACGACGCGGGTTGGACCCCGTCTTACCTGCTGGGTATCGACCC GGGTGAATACGGTATGGGTCTGTCTCTGGTTTCTATCAACAACGGTGAAGTTCTGGACTCTGGTTTCATC CACATCAACTCTCTGATCAACTTCGCGTCTAAAAAATCTAACCACCAGACCAAAGTTGTTCCGCGTCAGC AGTACAAATCTCCGTACGCGAACTACCTGGAACAGTCTAAAGACTCTGCGGCGGGTGACATCGCGCACA TCCTGGACCGTCTGATCTACAAACTGAACGCGCTGCCGGTTTTCGAAGCGCTGTCTGGTAACTCTCAGTC TGCGGCGGACCAGGTTTGGACCAAAGTTCTGTCTTTCTACACC TGGGGTGACAACGACGCGCAGAACTC TATCCGTAAACAGCACTGGTTCGGTGCGTCTCACTGGGACATCAAAGGTATGCTGCGTCAGCCGCCGAC CGAAAAAAAACCGAAACCGTACATCGCGTTCCCGGGTTCTCAGGTTTCTTCTTACGGTAACTCTCAGCGT TGCTCTTGCTGCGGTCGTAACCCGATCGAACAGCTGCGTGAAATGGCGAAAGACACCTCTATCAAAGAA CTGAAAATCCGTAACTCTGAAATCCAGCTGTTCGACGGTACCATCAAACTGTTCAACCCGGACCCGTCTA CCGTTATCGAACGTCGTCGTCACAACCTGGGTCCGTCTCGTATCCCGGTTGCGGACCGTACCTTCAAAAA CATCTCTCCGTCTTCTCTGGAATTCAAAGAACTGATCACCATCGTTTCTCGTTCTATCCGTCACTCTCCGG AATTCATCGCGAAAAAACGTGGTATCGGTTCTGAATACTTCTGCGCGTACTCTGACTGCAACTCTTCTCT GAACTCTGAAGCGAACGCGGCGGCGAACGTTGCGCAGAAATTCCAGAAACAGCTGTTCTTCGAACTGTAA SEQ ATGAAACGTATCCTGAACTCTCTGAAAGTTGCGGCGCTGCGTCTGCTGTTCCGTGGTAAAGGTTCTGAAC ID TGGTTAAAACCGTTAAATACCCGCTGGTTTCTCCGGTTCAGGGTGCGGTTGAAGAACTGGCGGAAGCGA NO : TCCGTCACGACAACCTGCACCTGTTCGGTCAGAAAGAAATCGTTGACCTGATGGAAAAAGACGAAGGTA 57 CCCAGGTTTACTCTGTTGTTGACTTCTGGCTGGACACCCTGCGTCTGGGTATGTTCTTCTCTCCGTCTGCG AACGCGCTGAAAATCACCCTGGGTAAATTCAACTCTGACCAGGTTTCTCCGTTCCGTAAAGTTCTGGAAC AGTCTCCGTTCTTCCTGGCGGGTCGTCTGAAAGTTGAACCGGCGGAACGTATCCTGTCTGTTGAAATCCG TAAAATCGGTAAACGTGAAAACCGTGTTGAAAACTACGCGGCGGACGTTGAAACCTGCTTCATCGGTCA GCTGTCTTCTGACGAAAAACAGTCTATCCAGAAACTGGCGAACGACATCTGGGACTCTAAAGACCACGA AGAACAGCGTATGCTGAAAGCGGACTTCTTCGCGATCCCGCTGATCAAAGACCCGAAAGCGGTTACCGA AGAAGACCCGGAAAACGAAACCGCGGGTAAACAGAAACCGCTGGAACTGTGCGTTTG AACTGTACACCCGTGGTTTCGGTTCTATCGCGGACTTCCTGGT TCAGCGTCTGACCCTGCTGCGTGACAA AATGTCTACCGACACCGCGGAAGACTGCCTGGAATACGTTGGTATCGAAGAAGAAAAAGGTAACGGTA TGAACTCTCTGCTGGGTACCTTCCTGAAAAACCTGCAGGGTGACGGTTTCGAACAGATCTTCCAGTTCAT US 10 , 435 , 714 B2 123 124 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence GCTGGGTTCTTACGTTGGTTGGCAGGGTAAAGAAGACGTTCTGCGTGAACGTCTGGACCTGCTGGCGGA AAAAGTTAAACGTCTGCCGAAACCGAAATTCGCGGGTGAATGGTCTGGTCACCGTATGTTCCTGCACGG TCAGCTGAAATCTTGGTCTTCTAACTTCTTCCGTCTGTTCAACGAAACCCGTGAACTGCTGGAATCTATC AAATCTGACATCCAGCACGCGACCATGCTGATCTCTTACGTTGAAGAAAAAGGTGGTTACCACCCGCAG CTGCTGTCTCAGTACCGTAAACTGATGGAACAGCTGCCGGCGC TGCGTACCAAAGTTCTGGACCCGGAA ATCGAAATGACCCACATGTCTGAAGCGGTTCGTTCTTACATCATGATCCACAAATCTGTTGCGGGTTTCC TGCCGGACCTGCTGGAATCTCTGGACCGTGACAAAGACCGTGAATTCCTGCTGTCTATCTTCCCGCGTAT CCCGAAAATCGACAAAAAAACCAAAGAAATCGTTGCGTGGGAACTGCCGGGTGAACCGGAAGAAGGTT ACCTGTTCACCGCGAACAACCTGTTCCGTAACTTCCTGGAAAACCCGAAACACGTTCCGCGTTTCATGGC GGAACGTATCCCGGAAGACTGGACCCGTCTGCGTTCTGCGCCGGTTTGGTTCGACGGTATGGTTAAACA GTGGCAGAAAGTTGTTAACCAGCTGGTTGAATCTCCGGGTGCGCTGTACCAGTTCAACGAATCTTTCCTG CGTCAGCGTCTGCAGGCGATGCTGACCGTTTACAAACGTGACCTGCAGACCGAAAAATTCCTGAAACTG CTGGCGGACGTTTGCCGTCCGCTGGTTGACTTCTTCGGTCTGGGTGGTAACGACATCATCTTCAAATCTT GCCAGGACCCGCGTAAACAGTGGCAGACCGTTATCCCGCTGTCTGTTCCGGCGGACGTTTACACCGCGT GCGAAGGTCTGGCGATCCGTCTGCGTGAAACCCTGGGTTTCGAATGGAAAAACCTGAAAGGTCACGAAC GTGAAGACTTCCTGCGTCTGCACCAGCTGCTGGGTAACCTGCTGTTCTGGATCCGTGACGCGAAACTGGT TGTTAAACTGGAAGACTGGATGAACAACCCGTGCGTTCAGGAATACGTTGAAGCGCGTAAAGCGATCGA CCTGCCGCTGGAAATCTTCGGTTTCGAAGTTCCGATCTTCCTGAACGGTTACCTGTTCTCTGAACTGCGTC AGCTGGAACTGCTGCTGCGTCGTAAATCTGTTATGACCTCTTACTCTGTTAAAACCACCGGTTCTCCGAA CCGTCTGTTCCAGCTGGTTTACCTGCCGCTGAACCCGTCTGACCCGGAAAAAAAAAACTCTAACAACTTC CAGGAACGTCTGGACACCCCGACCGGTCTGTCTCGTCGTTTCCTGGACCTGACCCTGGACGCGTTCGCGG GTAAACTGCTGACCGACCCGGTTACCCAGGAACTGAAAACCATGGCGGGTTTCTACGACCACCTGTTCG GTTTCAAACTGCCGTGCAAACTGGCGGCGATGTCTAACCACCCGGGTTCTTCTTCTAAAATGGTTGTTCT GGCGAAACCGAAAAAAGGTGTTGCGTCTAACATCGGTTTCGAACCGATCCCGGACCCGGCGCACCCGGT TTTCCGTGTTCGTTCTTCTTGGCCGGAACTGAAATACCTGGAAGGTCTGCTGTACCTGCCGGAAGACACC CCGCTGACCATCGAACTGGCGGAAACCTCTGTTTCTTGCCAGTCTGTTTCTTCTGTTGCGTTCGACCTGAA AAACCTGACCACCATCCTGGGTCGTGTTGGTGAATTCCGTGTTACCGCGGACCAGCCGTTCAAACTGACC CCGATCATCCCGGAAAAAGAAGAATCTTTCATCGGTAAAACCTACCTGGGTCTGGACGCGGGTGAACGT TCTGGTGTTGGTTTCGCGATCGTTACCGTTGACGGTGACGGTTACGAAGTTCAGCGTCTGGGTGTTCACG AAGACACCCAGCTGATGGCGCTGCAGCAGGTTGCGTCTAAATCTCTGAAAGAACCGGTTTTCCAGCCGC TGCGTAAAGGTACCTTCCGTCAGCAGGAACGTATCCGTAAATCTCTGCGTGGTTGCTACTGGAACTTCTA CCACGCGCTGATGATCAAATACCGTGCGAAAGTTGTTCACGAAGAATCTGTTGGTTCTTCTGGTCTGGTT GGTCAGTGGCTGCGTGCGTTCCAGAAAGACCTGAAAAAAGCGGACGTTCTGCCGAAAAAAGGTGGTAA AAACGGTGTTGACAAAAAAAAACGTGAATCTTCTGCGCAGGACACCCTGTGGGGTGGTGCGTTCTCTAA AAAAGAAGAACAGCAGATCGCGTTCGAAGTTCAGGCGGCGGGTTCTTCTCAGTTCTGCCTGAAATGCGG TTGGTGGTTCCAGCTGGGTATGCGTGAAGTTAACCGTGTTCAGGAATCTGGTGTTGTTCTGGACTGGAAC CGTTCTATCGTTACCTTCCTGATCGAATCTTCTGGTGAAAAAGTTTACGGTTTCTCTCCGCAGCAGCTGGA AAAAGGTTTCCGTCCGGACATCGAAACCTTCAAAAAAATGGTTCGTGACTTCATGCGTCCGCCGATGTTC GACCGTAAAGGTCGTCCGGCGGCGGCGTACGAACGTTTCGTTC TGGGTCGTCGTCACCGTCGTTACCGTT TCGACAAAGTTTTCGAAGAACGTTTCGGTCGTTCTGCGCTGTTCATCTGCCCGCGTGTTGGTTGCGGTAA CTTCGACCACTCTTCTGAACAGTCTGCGGTTGTTCTGGCGCTGATCGGTTACATCGCGGACAAAGAAGGT ATGTCTGGTAAAAAACTGGTTTACGTTCGTCTGGCGGAACTGATGGCGGAATGGAAACTGAAAAAACTG GAACGTTCTCGTGTTGAAGAACAGTCTTCTGCGCAGTAA SEQ ATGGCGGAATCTAAACAGATGCAGTGCCGTAAATGCGGTGCGTCTATGAAATACGAAGTTATCGGTCTG ID GGTAAAAAATCTTGCCGTTACATGTGCCCGGACTGCGGTAACCACACCTCTGCGCGTAAAATCCAGAAC NO : AAAAAAAAACGTGACAAAAAATACGGTTCTGCGTCTAAAGCGCAGTCTCAGCGTATCGCGGTTGCGGGT 58 GCGCTGTACCCGGACAAAAAAGTTCAGACCATCAAAACCTACAAATACCCGGCGGACCTGAACGGTGA AGTTCACGACTCTGGTGTTGCGGAAAAAATCGCGCAGGCGATCCAGGAAGACGAAATCGGTCTGCTGGG TCCGTCTTCTGAATACGCGTGCTGGATCGCGTCTCAGAAACAGTCTGAACCGTACTCTGTTGTTGACTTC TGGTTCGACGCGGTTTGCGCGGGTGGTGTTTTCGCGTACTCTGGTGCGCGTCTGCTGTCTACCGTTCTGCA GCTGTCTGGTGAAGAATCTGTTCTGCGTGCGGCGCTGGCGTCTTCTCCGTTCGTTGACGACATCAACCTG GCGCAGGCGGAAAAATTCCTGGCGGTTTCTCGTCGTACCGGTCAGGACAAACTGGGTAAACGTATCGGT GAATGCTTCGCGGAAGGTCGTCTGGAAGCGCTGGGTATCAAAACCGTATGCGTGAATTCGTTCAGGCG ATCGACGTTGCGCAGACCGCGGGTCAGCGTTTCGCGGCGAAAC TGAAAATCTTCGGTATCTCTCAGATG CCGGAAGCGAAACAGTGGAACAACGACTCTGGTCTGACCGTTTGCATCCTGCCGGACTACTACGTTCCG GAAGAAAACCGTGCGGACCAGCTGGTTGTTCTGCTGCGTCGTCTGCGTGAAATCGCGTACTGCATGGGT ATCGAAGACGAAGCGGGTTTCGAACACCTGGGTATCGACCCGGGTGCGCTGTCTAACTTCTCTAACGGT AACCCGAAACGTGGTTTCCTGGGTCGTCTGCTGAACAACGACATCATCGCGCTGGCGAACAACATGTCT GCGATGACCCCGTACTGGGAAGGTCGTAAAGGTGAACTGATCGAACGTCTGGCGTGGCTGAAACACCGT GCGGAAGGTCTGTACCTGAAAGAACCGCACTTCGGTAACTCTTGGGCGGACCACCGTTCTCGTATCTTCT CTCGTATCGCGGGTTGGCTGTCTGGTTGCGCGGGTAAACTGAAAATCGCGAAAGACCAGATCTCTGGTG TTCGTACCGACCTGTTCCTGCTGAAACGTCTGCTGGACGCGGTTCCGCAGTCTGCGCCGTCTCCGGACTT CATCGCGTCTATCTCTGCGCTGGACCGTTTCCTGGAAGCGGCGGAATCTTCTCAGGACCCGGCGGAACA GGTTCGTGCGCTGTACGCGTTCCACCTGAACGCGCCGGCGGTTCGTTCTATCGCGAACAAAGCGGTTCAG CGTTCTGACTCTCAGGAATGGCTGATCAAAGAACTGGACGCGGTTGACCACCTGGAATTCAACAAAGCG TTCCCGTTCTTCTCTGACACCGGTAAAAAAAAAAAAAAAGGTGCGAACTCTAACGGTGCGCCGTCTGAA GAAGAATACACCGAAACCGAATCTATCCAGCAGCCGGAAGACGCGGAACAGGAAGTTAACGGTCAGGA AGGTAACGGTGCGTCTAAAAACCAGAAAAAATTCCAGCGTATCCCGCGTTTCTTCGGTGAAGGTTCTCG TTCTGAATACCGTATCCTGACCGAAGCGCCGCAGTACTTCGACATGTTCTGCAACAACATGCGTGCGATC TTCATGCAGCTGGAATCTCAGCCGCGTAAAGCGCCGCGTGACTTCAAATGCTTCCTGCAGAACCGTCTGC AGAAACTGTACAAACAGACCTTCCTGAACGCGCGTTCTAACAAATGCCGTGCGCTGCTGGAATCTGTTCT GATCTCTTGGGGTGAATTCTACACCTACGGTGCGAACGAAAAAAAATTCCGTCTGCGTCACGAAGCGTC TGAACGTTCTTCTGACCCGGACTACGTTGTTCAGCAGGCGCTGGAAATCGCGCGTCGTCTGTTCCTGTTC GGTTTCGAATGGCGTGACTGCTCTGCGGGTGAACGTGTTGACCTGGTTGAAATCCACAAAAAAGCGATC TCTTTCCTGCTGGCGATCACCCAGGCGGAAGTTTCTGTTGGTTCTTACAACTGGCTGGGTAACTCTACCG US 10 , 435 , 714 B2 125 126 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence TTTCTCGTTACCTGTCTGTTGCGGGTACCGACACCCTGTACGGTACCCAGCTGGAAGAATTCCTGAACGC GACCGTTCTGTCTCAGATGCGTGGTCTGGCGATCCGTCTGTCTTCTCAGGAACTGAAAGACGGTTTCGAC GTTCAGCTGGAATCTTCTTGCCAGGACAACCTGCAGCACCTGC TGGTTTACCGTGCGTCTCGTGACCTGG CGGCGTGCAAACGTGCGACCTGCCCGGCGGAACTGGACCCGAAAATCCTGGTTCTGCCGGTTGGTGCGT TCATCGCGTCTGTTATGAAAATGATCGAACGTGGTGACGAACCGCTGGCGGGTGCGTACCTGCGTCACC GTCCGCACTCTTTCGGTTGGCAGATCCGTGTTCGTGGTGTTGCGGAAGTTGGTATGGACCAGGGTACCGC GCTGGCGTTCCAGAAACCGACCGAATCTGAACCGTTCAAAATCAAACCGTTCTCTGCGCAGTACGGTCC GGTTCTGTGGCTGAACTCTTCTTCTTACTCTCAGTCTCAGTACCTGGACGGTTTCCTGTCTCAGCCGAAAA ACTGGTCTATGCGTGTTCTGCCGCAGGCGGGTTCTGTTCGTGT TGAACAGCGTGTTGCGCTGATCTGGAA CCTGCAGGCGGGTAAAATGCGTCTGGAACGTTCTGGTGCGCGTGCGTTCTTCATGCCGGTTCCGTTCTCT TTCCGTCCGTCTGGTTCTGGTGACGAAGCGGTTCTGGCGCCGAACCGTTACCTGGGTCTGTTCCCGCACT CTGGTGGTATCGAATACGCGGTTGTTGACGTTCTGGACTCTGCGGGTTTCAAAATCCTGGAACGTGGTAC CATCGCGGTTAACGGTTTCTCTCAGAAACGTGGTGAACGTCAGGAAGAAGCGCACCGTGAAAAACAGCG TCGTGGTATCTCTGACATCGGTCGTAAAAAACCGGTTCAGGCGGAAGTTGACGCGGCGAACGAACTGCA CCGTAAATACACCGACGTTGCGACCCGTCTGGGTTGCCGTATCGTTGTTCAGTGGGCGCCGCAGCCGAA ACCGGGTACCGCGCCGACCGCGCAGACCGTTTACGCGCGTGCGGTTCGTACCGAAGCGCCGCGTTCTGG TAACCAGGAAGACCACGCGCGTATGAAATCTTCTTGGGGTTACACCTGGGGTACCTACTGGGAAAAACG TAAACCGGAAGACATCCTGGGTATCTCTACCCAGGTTTACTGGACCGGTGGTATCGGTGAATCTTGCCCG GCGGTTGCGGTTGCGCTGCTGGGTCACATCCGTGCGACCTCTACCCAGACCGAATGGGAAAAAGAAGAA GTTGTTTTCGGTCGTCTGAAAAAATTCTTCCCGTCTTAA SEQ ATGGAAAAACGTATCAACAAAATCCGTAAAAAACTGTCTGCGGACAACGCGACCAAACCGGTTTCTCGT ID TCTGGTCCGATGAAAACCCTGCTGGTTCGTGTTATGACCGACGACCTGAAAAAACGTCTGGAAAAACGT NO : CGTAAAAAACCGGAAGTTATGCCGCAGGTTATCTCTAACAACGCGGCGAACAACCTGCGTATGCTGCTG 59 GACGACTACACCAAAATGAAAGAAGCGATCCTGCAGGTTTACTGGCAGGAATTCAAAGACGACCACGTT GGTCTGATGTGCAAATTCGCGCAGCCGGCGTCTAAAAAAATCGACCAGAACAAACTGAAACCGGAAAT GGACGAAAAAGGTAACCTGACCACCGCGGGTTTCGCGTGCTCT CAGTGCGGTCAGCCGCTGTTCGTTTA CAAACTGGAACAGGTTTCTGAAAAAGGTAAAGCGTACACCAAC TACTTCGGTCGTTGCAACGTTGCGGA ACACGAAAAACTGATCCTGCTGGCGCAGCTGAAACCGGAAAAAGACTCTGACGAAGCGGTTACCTACTC TCTGGGTAAATTCGGTCAGCGTGCGCTGGACTTCTACTCTATCCACGTTACCAAAGAATCTACCCACCCG GTTAAACCGCTGGCGCAGATCGCGGGTAACCGTTACGCGTCTGGTCCGGTTGGTAAAGCGCTGTCTGAC GCGTGCATGGGTACCATCGCGTCTTTCCTGTCTAAATACCAGGACATCATCATCGAACACCAGAAAGTTG TTAAAGGTAACCAGAAACGTCTGGAATCTCTGCGTGAACTGGCGGGTAAAGAAAACCTGGAATACCCGT CTGTTACCCTGCCGCCGCAGCCGCACACCAAAGAAGGTGTTGACGCGTACAACGAAGTTATCGCGCGTG TTCGTATGTGGGTTAACCTGAACCTGTGGCAGAAACTGAAACTGTCTCGTGACGACGCGAAACCGCTGC TGCGTCTGAAAGGTTTCCCGTCTTTCCCGGTTGTTGAACGTCGTGAAAACGAAGTTGACTGGTGGAACAC CATCAACGAAGTTAAAAAACTGATCGACGCGAAACGTGACATGGGTCGTGTTTTCTGGTCTGGTGTTAC CGCGGAAAAACGTAACACCATCCTGGAAGGTTACAACTACCTGCCGAACGAAAACGACCACAAAAAAC GTGAAGGTTCTCTGGAAAACCCGAAAAAACCGGCGAAACGTCAGTTCGGTGACCTGCTGCTGTACCTGG AAAAAAAATACGCGGGTGACTGGGGTAAAGTTTTCGACGAAGCGTGGGAACGTATCGACAAAAAAATC GCGGGTCTGACCTCTCACATCGAACGTGAAGAAGCGCGTAACGCGGAAGACGCGCAGTCTAAAGCGGTT CTGACCGACTGGCTGCGTGCGAAAGCGTCTTTCGTTCTGGAACGTCTGAAAGAAATGGACGAAAAAGAA TTCTACGCGTGCGAAATCCAGCTGCAGAAATGGTACGGTGACCTGCGTGGTAACCCGTTCGCGGTTGAA GCGGAAAACCGTGTTGTTGACATCTCTGGTTTCTCTATCGGTTCTGACGGTCACTCTATCCAGTACCGTA ACCTGCTGGCGTGGAAATACCTGGAAAACGGTAAACGTGAATTCTACCTGCTGATGAACTACGGTAAAA AAGGTCGTATCCGTTTCACCGACGGTACCGACATCAAAAAATCTGGTAAATGGCAGGGTCTGCTGTACG GTGGTGGTAAAGCGAAAGTTATCGACCTGACCTTCGACCCGGACGACGAACAGCTGATCATCCTGCCGC TGGCGTTCGGTACCCGTCAGGGTCGTGAATTCATCTGGAACGACCTGCTGTCTCTGGAAACCGGTCTGAT CAAACTGGCGAACGGTCGTGTTATCGAAAAAACCATCTACAACAAAAAAATCGGTCGTGACGAACCGG CGCTGTTCGTTGCGCTGACCTTCGAACGTCGTGAAGTTGTTGACCCGTCTAACATCAAACCGGTTAACCT GATCGGTGTTGACCGTGGTGAAAACATCCCGGCGGTTATCGCGCTGACCGACCCGGAAGGTTGCCCGCT GCCGGAATTCAAAGACTCTTCTGGTGGTCCGACCGACATCCTGCGTATCGGTGAAGGTTACAAAGAAAA ACAGCGTGCGATCCAGGCGGCGAAAGAAGTTGAACAGCGTCGTGCGGGTGGTTACTCTCGTAAATTCGC GTCTAAATCTCGTAACCTGGCGGACGACATGGTTCGTAACTCTGCGCGTGACCTGTTCTACCACGCGGTT ACCCACGACGCGGTTCTGGTTTTCGAAAACCTGTCTCGTGGTTTCGGTCGTCAGGGTAAACGTACCTTCA TGACCGAACGTCAGTACACCAAAATGGAAGACTGGCTGACCGCGAAACTGGCGTACGAAGGTCTGACCT CTAAAACCTACCTGTCTAAAACCCTGGCGCAGTACACCTCTAAAACCTGCTCTAACTGCGGTTTCACCAT CACCACCGCGGACTACGACGGTATGCTGGTTCGTCTGAAAAAAACCTCTGACGGTTGGGCGACCACCCT GAACAACAAAGAACTGAAAGCGGAAGGTCAGATCACCTACTACAACCGTTACAAACGTCAGACCGTTG AAAAAGAACTGTCTGCGGAACTGGACCGTCTGTCTGAAGAATC TGGTAACAACGACATCTCTAAATGGA CCAAAGGTCGTCGTGACGAAGCGCTGTTCCTGCTGAAAAAACGTTTCTCTCACCGTCCGGTTCAGGAAC AGTTCGTTTGCCTGGACTGCGGTCACGAAGTTCACGCGGACGAACAGGCGGCGCTGAACATCGCGCGTT CTTGGCTGTTCCTGAACTCTAACTCTACCGAATTCAAATCTTACAAATCTGGTAAACAGCCGTTCGTTGG TGCGTGGCAGGCGTTCTACAAACGTCGTCTGAAAGAAGTTTGGAAACCGAACGCG SEO ATGAAACGTATCAACAAAATCCGTCGTCGTCTGGTTAAAGACT CTAACACCAAAAAAGCGGGTAAAACC ID GGTCCGATGAAAACCCTGCTGGTTCGTGTTATGACCCCGGACCTGCGTGAACGTCTGGAAAACCTGCGT NO : AAAAAACCGGAAAACATCCCGCAGCCGATCTCTAACACCTCTCGTGCGAACCTGAACAAACTGCTGACC 60 GACTACACCGAAATGAAAAAAGCGATCCTGCACGTTTACTGGGAAGAATTCCAGAAAGACCCGGTTGGT CTGATGTCTCGTGTTGCGCAGCCGGCGCCGAAAAACATCGACCAGCGTAAACTGATCCCGGTTAAAGAC GGTAACGAACGTCTGACCTCTTCTGGTTTCGCGTGCTCTCAGTGCTGCCAGCCGCTGTACGTTTACAAAC TGGAACAGGTTAACGACAAAGGTAAACCGCACACCAACTACTTCGGTCGTTGCAACGTTTCTGAACACG AAcc????????????????ccccACAAAccccAAcccAAccAccAACTccTTAcc????c????cc?? AATTCGGTCAGCGTGCGCTGGACTTCTACTCTATCCACGTTACCCGTGAATCTAACCACCCGGTTAAACC GCTGGAACAGATCGGTGGTAACTCTTGCGCGTCTGGTCCGGTTGGTAAAGCGCTGTCTGACGCGTGCAT GGGTGCGGTTGCGTCTTTCCTGACCAAATACCAGGACATCATCCTGGAACACCAGAAAGTTATCAAAAA US 10 , 435 , 714 B2 127 128 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence AAACGAAAAACGTCTGGCGAACCTGAAAGACATCGCGTCTGCGAACGGTCTGGCGTTCCCGAAAATCAC CCTGCCGCCGCAGCCGCACACCAAAGAAGGTATCGAAGCGTACAACAACGTTGTTGCGCAGATCGTTAT CTGGGTTAACCTGAACCTGTGGCAGAAACTGAAAATCGGTCGTGACGAAGCGAAACCGCTGCAGCGTCT GAAAGGTTTCCCGTCTTTCCCGCTGGTTGAACGTCAGGCGAACGAAGTTGACTGGTGGGACATGGTTTGC AACGTTAAAAAACTGATCAACGAAAAAAAAGAAGACGGTAAAGTTTTCTGGCAAACCTGGCGGGTTA CAAACGTCAGGAAGCGCTGCTGCCGTACCTGTCTTCTGAAGAAGACCGTAAAAAAGGTAAAAAATTCGC GCGTTACCAGTTCGGTGACCTGCTGCTGCACCTGGAAAAAAAACACGGTGAAGACTGGGGTAAAGTTTA CGACGAAGCGTGGGAACGTATCGACAAAAAAGTTGAAGGTCTGTCTAAACACATCAAACTGGAAGAAG AACGTCGTTCTGAAGACGCGCAGTCTAAAGCGGCGCTGACCGACTGGCTGCGTGCGAAAGCGTCTTTCG TTATCGAAGGTCTGAAAGAAGCGGACAAAGACGACGGACAAAGACGAATTCTGCCGTTGCGAACTGAAACTGCAGAAATGGT ACGGTGACCTGCGTGGTAAACCGTTCGCGATCGAAGCGGAAAACTCTATCCTGGACATCTCTGGTTTCTC TAAACAGTACAACTGCGCGTTCATCTGGCAGAAAGACGGTGTTAAAAAACTGAACCTGTACCTGATCAT CAACTACTTCAAAGGTGGTAAACTGCGTTTCAAAAAAATCAAACCGGAAGCGTTCGAAGCGAACCGTTT CTACACCGTTATCAACAAAAAATCTGGTGAAATCGTTCCGATGGAAGTTAACTTCAACTTCGACGACCC GAACCTGATCATCCTGCCGCTGGCGTTCGGTAAACGTCAGGGT CGTGAATTCATCTGGAACGACCTGCTG TCTCTGGAAACCGGTTCTCTGAAACTGGCGAACGGTCGTGTTATCGAAAAAACCCTGTACAACCGTCGT ACCCGTCAGGACGAACCGGCGCTGTTCGTTGCGCTGACCTTCGAACGTCGTGAAGTTCTGGACTCTTCTA ACATCAAACCGATGAACCTGATCGGTATCGACCGTGGTGAAAACATCCCGGCGGTTATCGCGCTGACCG ACCCGGAAGGTTGCCCGCTGTCTCGTTTCAAAGACTCTCTGGGTAACCCGACCCACATCCTGCGTATCGG TGAATCTTACAAAGAAAAACAGCGTACCATCCAGGCGGCGAAAGAAGTTGAACAGCGTCGTGCGGGTG GTTACTCTCGTAAATACGCGTCTAAAGCGAAAAACCTGGCGGACGACATGGTTCGTAACACCGCGCGTG ACCTGCTGTACTACGCGGTTACCCAGGACGCGATGCTGATCTTCGAAAACCTGTCTCGTGGTTTCGGTCG TCAGGGTAAACGTACCTTCATGGCGGAACGTCAGTACACCCGTATGGAAGACTGGCTGACCGCGAAACT GGCGTACGAAGGTCTGCCGTCTAAAACCTACCTGTCTAAAACCCTGGCGCAGTACACCTCTAAAACCTG CTCTAACTGCGGTTTCACCATCACCTCTGCGGACTACGACCGTGTTCTGGAAAAACTGAAAAAAACCGC GACCGGTTGGATGACCACCATCAACGGTAAAGAACTGAAAGTTGAAGGTCAGATCACCTACTACAACCG TTACAAACGTCAGAACGTTGTTAAAGACCTGTCTGTTGAACTGGACCGTCTGTCTGAAGAATCTGTTAAC AACGACATCTCTTCTTGGACCAAAGGTCGTTCTGGTGAAGCGC TGTCTCTGCTGAAAAAACGTTTCTCTC ACCGTCCGGTTCAGGAAAAATTCGTTTGCCTGAACTGCGGTTTCGAAACCCACGCGGACGAACAGGCGG CGCTGAACATCGCGCGTTCTTGGCTGTTCCTGCGTTCTCAGGAATACAAAAAATACCAGACCAACAAAA CCACCGGTAACACCGACAAACGTGCGTTCGTTGAAACCTGGCAGTCTTTCTACCGTAAAAAACTGAAAG AAGTTTGGAAACCG SEQ AAAATTCcatGCAAAATGCTCCGGTTTCATGTCATCAAAATGATGACGTAATTAAGCATTGATAATTGAGA ID TCCCTCTCCCTGACAGGATGATTACATAAATAATAGTGACAAAAATAAATTATTTATTTATCCAGAAAAT NO : GAATTGGAAAATCAGGAGAGCGTTTTCAATCCTACCTCTGGCGCAGTTGATATGTcaaaCAGGTtgccgto 61 actgcgtcttttactggctcttctcgctaaccaaaccggtaaccccgcttattaaaagcattctgtaacaa agcgggaccaaagccatgacaaaaacgcgtaacaaaagtgtctataatcacggcagaaaagtccacattga ttatttgcacggcgtcacactttgctatgccatagcatttttatccataagattagcggatcctacctgac gctttttatcgcaactctctactgtttctccatacccgtttttttgggctagcaccgcctatctcgtgtga gataggcggagatacgaactttaagAAGGAGatatacCATGGGTAAAATGTATTACCTTGGTTTAGACATT GGCACGAATTCCGTGGGCTACGCGGTGACCGACCCCTCATACCACCTGCTGAAGTTTAAGGGGGAACCAAT GTGGGGTGCGCACGTATTTGCCGCCGGTAATCAGAGCGCGGAACGACGCTCGTTCCGCACATCGCGTCGTC GTTTGGACCGACGCCAACAGCGCGTTAAACTGGTACAGGAGATTTTTGCCCCGGTGATTAGTCCGATCGAC CCACGCTTCTTCATTCGTCTGCATGAATCCGCCCTGTGGCGCGATGACGTCGCGGAGACGGAT AAACATATCTTTTTCAATGATCCTACCTATACCGATAAGGAATATTATAGCGATTA CCCGACTATCCATCACCTGATCGTTGATCTGATGGAAAGCTCTGAGAAACACGATCCGCGGCTGGTGTA CCTTGCAGTGGCGTGGTTAGTGGCACACCGTGGTCATTTTCTGAACGAGGTGGACAAGGATAATATTGG AGATGTGTTGTCGTTCGACGCATTTTATCCGGAGTTTCTCGCGTTCCTGTCGGACAACGGTGTATCACCG TGGGTGTGCGAAAGCAAAGCGCTGCAGGCGACCTTGCTGAGCCGTAACTCAGTGAACGACAAATATAA AGCCCTTAAGTCTCTGATCTTCGGATCCCAGAAACCTGAAGATAACTTCGATGCCAATATTTCGGAAGAT GGACTCATTCAACTGCTGGCCGGCAAAAAGGTAAAAGTTAACAAACTGTTCCCTCAGGAATCGAACGAT GCATCCTTCACATTGAATGATAAAGAAGACGCGATAGAAGAAATCCTGGGTACGCTTACACCAGATGAA TGTGAATGGATTGCGCATATACGCCGCCTTTTTGACTGGGCTATCATGAAACATGCTCTGAAAGATGGCA GGACTATTAGCGAGTCAAAAGTCAAACTGTATGAGCAGCACCATCACGATCTGACCCAACTTAAATACT TCGTGAAAACCTACCTTGCAAAAGAATACGACGATATTTTCCGCAACGTGGATAGCGAAACAACGAAAA ACTATGTAGCGTATTCCTATCATGTGAAAGAGGTGAAAGGCACTCTGCCTAAAAATAAGGCAACGCAAG AAGAGTTTTGTAAGTATGTCCTGGGCAAGGTTAAAAACATTGAATGCTCTGAAGCAGACAAGGTTGACT TTGATGAGATGATTCAGCGTCTTACCGACAACTCTTTTATGCCTAAGCAGGTTTCGGGCGAAAACCGCGT TATTCCTTATCAGTTATATTATTATGAACTGAAGACAATTCTGAATAAAGCAGCCTCGTACCTGCCTTTCC TGACGCAGTGTGGAAAAGATGCAATTTCGAACCAGGACAAACTACTGTCGATCATGACGTTCCGTATTC CTTACTTCGTCGGACCCTTGCGAAAAGATAATTCGGAACATGCATGGCTCGAACGAAAGGCCGGTAAGA TTTATCCGTGGAACTTTAACGACAAAGTGGACTTGGATAAATCAGAAGAAGCGTTCATTCGCCGAATGA CCAATACCTGTACCTATTATCCCGGCGAAGATGTTTTACCGTTGGATTCGCTGATCTATGAGAAATTTAT GATTTTAAATGAAATCAATAATATTCGTATTGACGGCTACCCGATTAGTGTTGACGTTAAACAGCAGGTT TTTGGCTTGTTCGAAAAAAAACGACGCGTAACCGTGAAAGATATTCAGAACCTGCTGCTGTCTCTCGGA GCTCTGGACAAACACGGGAAGCTGACAGGCATCGATACCACTATCCACTCAAACTATAATACGTATCAC CATTTTAAATCTCTCATGGAACGCGGCGTCCTGACCCGGGATGACGTGGAACGCATCGTTGAAAGGATG ACCTACAGCGACGATACTAAGCGTGTGCGTCTGTGGCTGAATAACAACTATGGTACTTTAACCGCCGAC GATGTGAAACACATTTCGCGTCTGCGCAAACACGATTTTGGCCGTTTATCCAAAATGTTCTTAACAGGTC TGAAGGGTGTCCATAAGGAGACCGGTGAACGTGCCTCCATACTGGATTTCATGTGGAACACGAACGATA ACCTGATGCAGCTCCTTTCCGAATGCTACACGTTCAGTGATGAAATCACAAAGCTGCAAGAGGCGTATT ATGCAAAAGCCCAGTTGTCTTTAAACGATTTTTTAGACTCGATGT GATTTACAGAACTCTGGCAGTGGTGAACGATATTCGAAAAGCATGTGGGACGGCCCCTAAACGCATTTT CATCGAAATGGCTCGTGATGGTGAATCAAAAAAAAAGAGAAGTGTTACACGTCGCGAGCAGATCAAAA ACCTGTACCGCTCGATTCGTAAAGATTTCCAGCAGGAAGTTGATTTTCTGGAAAAGATCCTGGAAAATA US 10 , 435 , 714 B2 129 130 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence AATCTGATGGTCAACTTCAGTCAGATGCTTTGTATCTTTACTTTGCACAATTAGGGCGCGATATGTACAC GGGCGATCCAATAAAGCTGGAGCACATCAAAGATCAGAGTTTCTATAACATAGACCATATTTACCCGCA GTCTATGGTGAAAGACGATTCCCTAGATAACAAAGTGCTGGTGCAAAGCGAAATTAACGGCGAGAAAA GCTCGCGATACCCTTTGGACGCCGCGATCCGCAATAAAATGAAGCCCCTTTGGGACGCTTACTATAATCA TGGCCTGATCTCCTTAAAGAAATACCAGCGTCTAACGCGCTCGACCCCGTTTACCGATGATGAAAAATG GGACTTTATTAATCGCCAGTTAGTGGAAACCCGTCAATCTACCAAAGCGCTGGCCATTTTGTTGAAGCGT AAGTTTCCAGACACCGAAATTGTGTATTCGAAGGCGGGGTTATCGTCCGACTTCAGACATGAATTCGGC CTTGTAAAAAGTCGCAATATTAATGATTTGCACCACGCTAAAGACGCATTCTTGGCTATCGTTACCGGCA ATGTGTACCATGAAAGATTCAATCGCAGATGGTTTATGGTGAACCAGCCGTACTCAGTTAAAACTAAAA CTCTTTTTACCCACAGCATAAAGAATGGCAACTTCGTTGCCTGGAACGGCGAAGAAGATCTCGGTCGTAT TGTAAAAATGCTGAAGCAAAACAAAAATACCATTCACTTCACGCGCTTCTCCTTCGATCGCAAAGAAGG ATTATTTGATATCCAACCTCTGAAAGCCAGCACCGGCTTAGTCCCACGAAAAGCCGGTCTGGATGTCGTT AAATACGGCGGATATGACAAATCTACCGCGGCCTATTACCTGC TGGTGAGGTTCACGCTCGAGGACAAG AAAACCCAGCACAAGCTGATGATGATTCCTGTAGAAGGCCTGTACAAGGCTCGCATTGATCATGACAAG GAATTTCTTACCGATTATGCGCAAACGACTATAAGCGAAATCCTACAGAAAGATAAACAGAAAGTGATC AATATTATGTTTCCAATGGGTACGAGGCATATAAAACTCAATTCAATGATTAGTATCGATGGCTTCTATC TTAGTATCGGCGGAAAGTCCTCTAAAGGTAAGTCAGTTCTATGTCACGCAATGGTTCCACTGATCGTCCC TCACAAAATCGAATGTTACATTAAAGCAATGGAAAGCTTCGCCCGGAAGTTTAAAGAAAACAACAAGCT GCGCATCGTAGAAAAATTCGATAAAATCACCGTTGAAGACAACCTGAATCTCTACGAGCTCTTTCTCCA AAAACTGCAGCATAATCCCTATAATAAGTTTTTTTCGACACAGTTTGACGTACTGACGAACGGCCGTTCT ACTTTCACAAAACTGTCGCCGGAGGAACAGGTACAGACGCTCTTGAACATTTTAAGTATCTTTAAAACAT GCCGCAGTTCGGGTTGCGACCTGAAATCCATCAACGGCAGTGCCCAGGCAGCGCGCATCATGATTAGCG CTGACTTAACTGGACTGTCGAAAAAATATTCAGATATTAGGTTGGTTGAACAGTCAGCTTCTGGTTTGTT CGTATCCAAAAGTCAGAACTTACTGGAGTATCTCTAAGAAATCATCCTTAGCGAAAGCTAAGGATTTTTT TTATCTGAAATTTATTATATCGCGTTGATTATTGATGCTGTTTTTAGTTTTAACGGCAATTAATATATGTG TTATTAATTGAATGAATTTTATCATTCATAATAAGTATGTGTAGGATCAAGCTCAGGTTAAATATTCACT CAGGAAGTTATTACTCAGGAAGCAAAGAGGATTACAGAATTATCTCATAACAAGTGTTAAGGGATGTTA TTTCC SEQ AAAATTCcatGCAAAATGCTCCGGTTTCATGTCATCAAAATGATGACGTAATTAAGCATTGATAATTGAGA ID TCCCTCTCCCTGACAGGATGATTACATAAATAATAGTGACAAAAATAAATTATTTATTTATCCAGAAAAT NO : GAATTGGAAAATCAGGAGAGCGTTTTCAATCCTACCTCTGGCGCAGTTGATATGTcaaaCAGGTtgccgtca 62 ctgcgtcttttactggctcttctcgctaaccaaaccggtaaccccgcttattaaaagcattctgtaacaaag cgggaccaaagccatgacaaaaacgcgtaacaaaagtgtctataatcacggcagaaaagtccacattgatta tttgcacggcgtcacactttgctatgccatagcatttttatccataagattagcggatcctacctgacgctt tttatcgcaactctctactgtttctccatacccgtttttttgggctagcaccgcctatctcgtgtgagatag qoqqaqatacgaactttaaqAAGGAGatatacCATGTCATCGCTCACGAAATTCACTAACAAATACTCTAAA CAGCTCACCATTAAGAATGAACTCATCCCAGTTGGCAAAACAC TGGAGAACATCAAAGAGAATGGTCTGATA GATGGCGACGAACAGCTGAATGAGAATTATCAGAAGGCGAAA ATTATTGTGGATGATTTTCTGCGGGACTTCATTAATAAAGCAC TGAATAATACGCAGATCGGGAACTGGC GCGAACTGGCGGATGCCCTTAATAAAGAGGATGAAGATAACATCGAGAAATTGCAGGATAAAATTCGG GGAATCATTGTATCCAAATTTGAAACGTTTGATCTGTTTAGCAGCTATTCTATTAAGAAAGATGAAAAGA TTATTGACGACGACAATGATGTTGAAGAAGAGGAACTGGATCTGGGCAAGAAGACCAGCTCATTTAAAT ACATATTTAAAAAAAACCTGTTTAAGTTAGTGTTGCCATCCTACCTGAAAACCACAAACCAGGACAAGC TGAAGATTATTAGCTCGTTTGATAATTTTTCAACGTACTTCCGCGGGTTCTTTGAAAACCGGAAAAACAT TTTTACCAAGAAACCGATCTCCACAAGTATTGCGTATCGCATTGTTCATGATAACTTCCCGAAATTCCTT GATAACATTCGTTGTTTTAATGTGTGGCAGACGGAATGCCCGCAACTAATCGTGAAAGCAGATAACTAT CTGAAAAGCAAAAATGTTATAGCGAAAGATAAAAGTTTGGCAAACTATTTTACCGTGGGCGCGTATGAC TATTTCCTGTCTCAGAATGGTATAGATTTTTACAACAATATTATAGGTGGACTGCCAGCGTTCGCCGGCC ATGAGAAAATCCAAGGTCTCAATGAATTCATCAATCAAGAGTGCCAAAAAGACAGCGAGCTGAAAAGT AAGCTGAAAAACCGTCACGCGTTCAAAATGGCGGTACTGTTCAAACAGATACTCAGCGATCGTGAAAAA AGTTTTGTAATTGATGAGTTCGAGTCGGATGCTCAAGTTATTGACGCCGTTAAAAACTTTTACGCCGAAC AGTGCAAAGATAACAATGTTATTTTTAACTTATTAAATCTTAT CAAGAATATCGCTTTCTTAAGTGATGA CGAACTGGACGGCATATTCATTGAAGGGAAATACCTGTCGAGCGTTAGTCAAAAACTCTATAGCGATTG GTCAAAATTACGTAACGACATTGAGGATTCGGCTAACTCTAAACAAGGCAATAAAGAGCTGGCCAAGA AGATCAAAACCAACAAAGGGGATGTAGAAAAAGCGATCTCGAAATATGAGTTCTCGCTGTCGGAACTG AACTCGATTGTACATGATAACACCAAGTTTTCTGACCTCCTTAGTTGTACACTGCATAAGGTGGCTTCTG AGAAACTGGTGAAGGTCAATGAAGGCGACTGGCCGAAACATCTCAAGAATAATGAAGAGAAACAAAAA ATCAAAGAGCCGCTTGATGCTCTGCTGGAGATCTATAATACACTTCTGATTTTTAACTGCAAAAGCTTCA ATAAAAACGGCAACTTCTATGTCGACTATGATCGTTGCATCAATGAACTGAGTTCGGTCGTGTATCTGTA TAATAAAACACGTAACTATTGCACTAAAAAACCCTATAACACGGACAAGTTCAAACTCAATTTTAACAG TCCGCAGCTCGGTGAAGGCTTTTCCAAGTCGAAAGAAAATGAC TGTCTGACTCTTTTGTTTAAAAAAGAC GACAACTATTATGTAGGCATTATCCGCAAAGGTGCAAAAATCAATTTTGATGATACACAAGCAATCGCC GATAACACCGACAATTGCATCTTTAAAATGAATTATTTCCTACTTAAAGACGCAAAAAAATTTATCCCGA AATGTAGCATTCAGCTGAAAGAAGTCAAGGCCCATTTTAAGAAATCTGAAGATGATTACATTTTGTCTG ATAAAGAGAAATTTGCTAGCCCGCTGGTCATTAAAAAGAGCACATTTTTGCTGGCAACTGCACATGTGA AAGGGAAAAAAGGCAATATCAAGAAATTTCAAAAGAATATTCGAAAGAAAACCCCACTGAGTATCGC AATTCTTTAAACGAATGGATTGCTTTTTGTAAAGAGTTCTTAAAAACTTATAAAGCGGCTACCATTTTTG ATATAACCACATTGAAAAAGGCAGAGGAATATGCTGATATTGTAGAATTCTACAAGGATGTCGATAATC TGTGCTACAAACTGGAGTTCTGCCCGATTAAAACCTCGTTTATAGAAAACCTGATAGATAACGGCGACC TGTATCTGTTTCGCATCAATAACAAAGACTTCAGCAGTAAATCGACCGGCACCAAGAACCTTCATACGTT ATATTTACAAGCTATATTCGATGAACGTAATCTGAACAATCCGACAATTATGCTGAATGGGGGAGCAGA GTTCTATCGTAAAGAAAGTATTGAGCAGAAAAACCGTATCACACACAAAGCCGGTTCAATTCTCGT GAATAAGGTGTGTAAAGACGGTACAAGCCTGGATGATAAGATACGTAATGAAATTTATCAATATGAGAA TAAATTTATTGATACCCTGTCTGATGAAGCTAAAAAGGTGTTACCGAATGTCATTAAAAAGGAAGCTAC CCATGACATTACAAAAGATAAACGTTTCACTAGTGACAAATTCTTCTTTCACTGCCCCCTGACAATTAAT US 10 , 435 , 714 B2 131 132 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence TATAAGGAAGGCGATACCAAGCAGTTCAATAACGAAGTGCTGAGTTTTCTGCGTGGAAATCCTGACATC AACATTATCGGCATTGACCGCGGAGAGCGTAATTTAATCTATGTAACGGTTATAAACCAGAAAGGCGAG ATTCTGGATTCGGTTTCATTCAATACCGTGACCAACAAGAGTT CAAAAATCGAGCAGACAGTCGATTAT GAAGAGAAATTGGCAGTCCGCGAGAAAGAGAGGATTGAAGCAAAACGTTCCTGGGACTCTATCTCAAA AATTGCGACACTAAAGGAAGGTTATCTGAGCGCAATAGTTCACGAGATCTGTCTGTTAATGATTAAACA CAACGCGATCGTTGTCTTAGAGAATCTTAATGCAGGCTTTAAGCGTATTCGTGGCGGTTTATCAGAAAAA AGTGTTTATCAAAAATTCGAAAAAATGTTGATTAACAAACTGAACTATTTTGTCAGCAAGAAGGAATCC GACTGGAATAAACCGTCTGGTCTGCTGAATGGACTGCAGCTTTCGGATCAGTTTGAAAGCTTCGAAAAA CTGGGTATTCAGTCTGGTTTTATTTTTTACGTGCCGGCTGCATATACCTCAAAGATTGATCCGACCACGG GCTTCGCCAATGTTCTGAATCTGTCGAAGGTACGCAATGTTGATGCGATCAAAAGCTTTTTTTCTAACTT CAACGAAATTAGTTATAGCAAGAAAGAAGCCCTTTTCAAATTCTCATTCGATCTGGATTCACTGAGTAAG AAAGGCTTTAGTAGCTTTGTGAAATTTAGTAAGAGTAAATGGAACGTCTACACCTTTGGAGAACGTATC ATAAAGCCAAAGAATAAGCAAGGTTATCGGGAGGACAAAAGAATCAACTTGACCTTCGAGATGAAGAA GTTACTTAACGAGTATAAGGTTTCTTTTGATCTTGAAAATAACTTGATTCCGAATCTCACGAGTGCCAAC CTGAAGGATACTTTTTGGAAAGAGCTATTCTTTATCTTCAAGACTACGCTGCAGCTCCGTAACAGCGTTA CTAACGGTAAAGAAGATGTGCTCATCTCTCCGGTCAAAAATGCGAAGGGTGAATTCTTCGTTTCGGGAA CGCATAACAAGACTCTTCCGCAAGATTGCGATGCGAACGGTGCATACCATATTGCGTTGAAAGGTCTGA TGATACTCGAACGTAACAACCTTGTACGTGAGGAGAAAGATACGAAAAAGATTATGGCGATTTCAAACG TGGATTGGTTCGAGTACGTGCAGAAACGTAGAGGCGTTCTGTAAGAAATCATCCTTAGCGAAAGCTAAG GATTTTTTTTATCTGAAATTTATTATATCGCGTTGATTATTGATGCTGTTTTTAGTTTTAACGGCAATTAAT ATATGTGTTATTAATTGAATGAATTTTATCATTCATAATAAGTATGTGTAGGATCAAGCTCAGGTTAAAT ATTCACTCAGGAAGTTATTACTCAGGAAGCAAAGAGGATTACAGAATTATCTCATAACAAGTGTTAAGG GATGTTATTTCC SEQ AATCAGGAGAGCGTTTTCAATCCTACCTCTGGCGCAGTTGATATGTCAAACAGGTTGCCGTCACTGCGTC ID TTTTACTGGCTCTTCTCGCTAACCAAACCGGTAACCCCGCTTATTAAAAGCATTCTGTAACAAAGCGGGA NO : CCAAAGCCATGACAAAAACGCGTAACAAAAGTGTCTATAATCACGGCAGAAAAGTCCACATTGATTATT 63 TGCACGGCGTCACACTTTGCTATGCCATAGCATTTTTATCCATAAGATTAGCGGATCCTACCTGACGCTT TTTATCGCAACTCTCTACTGTTTCTCCATACCCGTTTTTTTGGGCTAGCAGTAATACGACTCACTATAGGG GTCTCATCTCGTGTGAGATAGGCGGAGATACGAACTTTAAGAGGAGGATATACCATGCACCATCATCAT CACCATAACAACTACGACGAATTCACCAAACTGTACCCGATCCAGAAAACCATCCGTTTCGAACTGAAA CCGCAGGGTCGTACCATGGAACACCTGGAAACCTTCAACTTCTTCGAAGAAGACCGTGACCGTGCGGAA AAATACAAAATCCTGAAAGAAGCGATCGACGAATACCACAAAAAATTCATCGACGAACACCTGACCAA CATGTCTCTGGACTGGAACTCTCTGAAACAGATCTCTGAAAAATACTACAAATCTCGTGAAGAAAAAGA CAAAAAAGTTTTCCTGTCTGAACAGAAACGTATGCGTCAGGAAATCGTTTCTGAATTCAAAAAAGACGA CCGTTTCAAAGACCTGTTCTCTAAAAAACTGTTCTCTGAACTGCTGAAAGAAGAAATCTACAAAAAAGG TAACCACCAGGAAATCGACGCGCTGAAATCTTTCGACAAATTCTCTGGTTACTTCATCGGTCTGCACGAA AACCGTAAAAACATGTACTCTGACGGTGACGAAATCACCGCGATCTCTAACCGTATCGTTAACGAAAAC TTCCCGAAATTCCTGGACAACCTGCAGAAATACCAGGAAGCGCGTAAAAAATACCCGGAATGGATCATC AAAGCGGAATCTGCGCTGGTTGCGCACAACATCAAAATGGACGAAGTTTTCTCTCTGGAATACTTCAAC AAAGTTCTGAACCAGGAAGGTATCCAGCGTTACAACCTGGCGC TGGGTGGTTACGTTACCAAATCTGGT GAAAAAATGATGGGTCTGAACGACGCGCTGAACCTGGCGCACCAGTCTGAAAAATCTTCTAAAGGTCGT ATCCACATGACCCCGCTGTTCAAACAGATCCTGTCTGAAAAAGAATCTTTCTCTTACATCCCGGACGTTT TCACCGAAGACTCTCAGCTGCTGCCGTCTATCGGTGGTTTCTT CGCGCAGATCGAAAACGACAAAGACG GTAACATCTTCGACCGTGCGCTGGAACTGATCTCTTCTTACGCGGAATACGACACCGAACGTATCTACAT CCGTCAGGCGGACATCAACCGTGTTTCTAACGTTATCTTCGGTGAATGGGGTACCCTGGGTGGTCTGATG CGTGAATACAAAGCGGACTCTATCAACGACATCAACCTGGAACGTACCTGCAAAAAAGTTGACAAATGG CTGGACTCTAAAGAATTCGCGCTGTCTGACGTTCTGGAAGCGATCAAACGTACCGGTAACAACGACGCG TTCAACGAATACATCTCTAAAATGCGTACCGCGCGTGAAAAAATCGACGCGGCGCGTAAAGAAATGAA ATTCATCTCTGAAAAAATCTCTGGTGACGAAGAATCTATCCACATCATCAAAACCCTGCTGGACTCTGTT CAGCAGTTCCTGCACTTCTTCAACCTGTTCAAAGCGCGTCAGGACATCCCGCTGGACGGTGCGTTCTACG CGGAATTCGACGAAGTTCACTCTAAACTGTTCGCGATCGTTCCGCTGTACAACAAAGTTCGTAACTACCT GACCAAAAACAACCTGAACACCAAAAAAATCAAACTGAACTTCAAAAACCCGACCCTGGCGAACGGTT GGGACCAGAACAAAGTTTACGACTACGCGTCTCTGATCTTCCTGCGTGACGGTAACTACTACCTGGGTAT CATCAACCCGAAACGTAAAAAAAACATCAAATTCGAACAGGGTTCTGGTAACGGTCCGTTCTACCGTAA AATGGTTTACAAACAGATCCCGGGTCCGAACAAAAACCTGCCGCGTGTTTTCCTGACCTCTACCAAAGG TAAAAAAGAATACAAACCGTCTAAAGAAATCATCGAAGGTTACGAAGCGGACAAACACATCCGTGGTG ACAAATTCGACCTGGACTTCTGCCACAAACTGATCGACTTCTTCAAAGAATCTATCGAAAAACACAAAG ACTGGTCTAAATTCAACTTCTACTTCTCTCCGACCGAATCTTACGGTGACATCTCTGAATTCTACCTGGAC GTTGAAAAACAGGGTTACCGTATGCACTTCGAAAACATCTCTGCGGAAACCATCGACGAATACGTTGAA AAAGGTGACCTGTTCCTGTTCCAGATCTACAACAAAGACTTCGTTAAAGCGGCGACCGGTAAAAAAGAC ATGCACACCATCTACTGGAACGCGGCGTTCTCTCCGGAAAACC TGCAGGACGTTGTTGTTAAACTGAAC GGTGAAGCGGAACTGTTCTACCGTGACAAATCTGACATCAAAGAAATCGTTCACCGTGAAGGTGAAATC CTGGTTAACCGTACCTACAACGGTCGTACCCCGGTTCCGGACAAAATCCACAAAAAACTGACCGACTAC CACAACGGTCGTACCAAAGACCTGGGTGAAGCGAAAGAATACC TGGACAAAGTTCGTTACTTCAAAGCG CACTACGACATCACCAAAGACCGTCGTTACCTGAACGACAAAATCTACTTCCACGTTCCGCTGACCCTGA ACTTCAAAGCGAACGGTAAAAAAAACCTGAACAAAATGGTTAT CGAAAAATTCCTGTCTGACGAAAAA GCGCACATCATCGGTATCGACCGTGGTGAACGTAACCTGCTGTACTACTCTATCATCGACCGTTCTGGTA AAATCATCGACCAGCAGTCTCTGAACGTTATCGACGGTTTCGACTACCGTGAAAAACTGAACCAGCGTG AAATCGAAATGAAAGACGCGCGTCAGTCTTGGAACGCGATCGGTAAAATCAAAGACCTGAAAGAAGGT TACCTGTCTAAAGCGGTTCACGAAATCACCAAAATGGCGATCCAGTACAACGCGATCGTTGTTATGGAA GAACTGAACTACGGTTTCAAACGTGGTCGTTTCAAAGTTGAAAAACAGATCTACCAGAAATTCGAAAAC ATGCTGATCGACAAAATGAACTACCTGGTTTTCAAAGACGCGCCGGACGAATCTCCGGGTGGTGTTCTG AACGCGTACCAGCTGACCAACCCGCTGGAATCTTTCGCGAAACTGGGTAAACAGACCGGTATCCTGTTC TACGTTCCGGCGGCGTACACCTCTAAAATCGACCCGACCACCGGTTTCGTTAACCTGTTCAACACCTCTT CTAAAACCAACGCGCAGGAACGTAAAGAATTCCTGCAGAAATTCGAATCTATCTCTTACTCTGCGAAAG US 10 , 435 , 714 B2 133 134 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence ACGGTGGTATCTTCGCGTTCGCGTTCGACTACCGTAAATTCGGTACCTCTAAAACCGACCACAAAAACGT TTGGACCGCGTACACCAACGGTGAACGTATGCGTTACATCAAAGAAAAAAAACGTAACGAACTGTTCGA CCCGTCTAAAGAAATCAAAGAAGCGCTGACCTCTTCTGGTATCAAATACGACGGTGGTCAGAACATCCT GCCGGACATCCTGCGTTCTAACAACAACGGTCTGATCTACACCATGTACTCTTCTTTCATCGCGGCGATC CAGATGCGTGTTTACGACGGTAAAGAAGACTACATCATCTCTCCGATCAAAAACTCTAAAGGTGAATTC TTCCGTACCGACCCGAAACGTCGTGAACTGCCGATCGACGCGGACGCGAACGGTGCGTACAACATCGCG CTGCGTGGTGAACTGACCATGCGTGCGATCGCGGAAAAATTCGACCCGGACTCTGAAAAAATGGCGAAA CTGGAACTGAAACACAAAGACTGGTTCGAATTCATGCAGACCCGTGGTGACTAAGAAATCATCCTTAGC GAAAGCTAAGGATTTTTTTTATCTGAAATGTAGGGAGACCCTCAGGTTAAATATTCACTCAGGAAGTTAT TACTCAGGAAGCAAAGAGGATTACA SEQ AAAATTCcatGCAAAATGCTCCGGTTTCATGTCATCAAAATGATGACGTAATTAAGCATTGATAATTGAGA ID TCCCTCTCCCTGACAGGATGATTACATAAATAATAGTGACAAAAATAAATTATTTATTTATCCAGAAAAT NO : GAATTGGAAAATCAGGAGAGCGTTTTCAATCCTACCTCTGGCGCAGTTGATATGTcaaaCAGGTtgccgt 64 cactgcgtcttttactggctcttctcgctaaccaaaccggtaaccccgcttattaaaagcattctgtaac aaagcgggaccaaagccatgacaaaaacgcgtaacaaaagtgtctataatcacggcagaaaagtccacat tgattatttgcacggcgtcacactttgctatgccatagcatttttatccataagattagcggatcctacc tgacgctttttatcgcaactctctactgtttctccatacccgtttttttgggctagcaccgcctatctcg tgtgagataggcggagatacgaactttaagAAGGAGatataccATGACTAAAACATTTGATTCAGAGTTTT TTAATTTGTACTCGCTGCAAAAAACGGTACGCTTTGAGTTAAAACCCGTGGGAGAAACCGCGTCATTTGTG GAAGACTTTAAAAACGAGGGCTTGAAACGTGTTGTGAGCGAAGATGAAAGGCGAGC CGTCGATTACCAGAAAGTTAAGGAAATAATTGACGATTACCATCGGGATTTCATTGAAGAAAGTTTAAA TTATTTTCCGGAACAGGTGAGTAAAGATGCTCTTGAGCAGGCGTTTCATCTTTATCAGAAACTGAAGGCA GCAAAAGTTGAGGAAAGGGAAAAAGCGCTGAAAGAATGGGAAGCGCTGCAGAAAAAGCTACGTGAAA AAGTGGTGAAATGCTTCTCGGACTCGAATAAAGCCCGCTTCTCAAGGATTGATAAAAAGGAACTGATTA AGGAAGACCTGATAAATTGGTTGGTCGCCCAGAATCGCGAGGATGATATCCCTACGGTCGAAACGTTTA ACAACTTCACCACATATTTTACCGGCTTCCATGAGAATCGTAAAAATATTTACTCCAAAGATGATCACGC CACCGCTATTAGCTTTCGCCTTATTCATGAAAATCTTCCAAAGTTTTTTGACAACGTGATTAGCTTCAATA AGTTGAAAGAGGGTTTCCCTGAATTAAAATTTGATAAAGTGAAAAGGATTTAGAAGTAGATTATGATC TGAAGCATGCGTTTGAAATAGAATATTTCGTTAACTTCGTGACCCAAGCGGGCATAGATCAGTATAATTA TCTGTTAGGAGGGAAAACCCTGGAGGACGGGACGAAAAAACAAGGGATGAATGAGCAAATTAATCTGT TCAAACAACAGCAAACGCGAGATAAAGCGCGTCAGATTCCCAAACTGATCCCCCTGTTCAAACAGATTC TTAGCGAAAGGA?TGAAAGCCAGTCCTTTATTCCTAAACAATTTGAAAGTGATCAGGAGTTGTTCGATTC ACTGCAGAAGTTACATAATAACTGCCAGGATAAATTCACCGTGCTGCAACAAGCCATTCTCGGTCTGGC AGAGGCGGATCTTAAGAAGGTCTTCATCAAAACCTCTGATTTAAATGCCTTATCTAACACCATTTTCGGG AATTACAGCGTCTTTTCCGATGCACTGAACCTGTATAAAGAAAGCCTGAAAACGAAAAAAGCGCAGGAG GCTTTTGAGAAACTACCGGCCCATTCTATTCACGACCTCATTCAATACTTGGAACAGTTCAATTCCAGCC TGGACGCGGAAAAACAACAGAGCACCGACACCGTCCTGAACTACTTCATCAAGACCGATGAATTATATT CTCGCTTCATTAAATCCACTAGCGAGGCTTTCACTCAGGTGCAGCCTTTGTTCGAACTGGAAGCCCTGTC ATCTAAGCGCCGCCCACCGGAATCGGAAGATGAAGGGGCAAAAGGGCAGGAAGGCTTCGAGCAGATCA AGCGTATTAAAGCTTACCTGGATACGCTTATGGAAGCGGTACACTTTGCAAAGCCGTTGTATCTTGTTAA GGGTCGTAAAATGATCGAAGGGCTCGATAAAGACCAGTCCTTTTATGAAGCGTTTGAAATGGCGTACCA AGAACTTGAATCGTTAATCATTCCTATCTATAACAAAGCGCGGAGCTATCTGTCGCGGAAACCTTTCAAG GCCGATAAATTCAAGATTAATTTTGACAACAACACGCTACTGAGCGGATGGGATGCGAACAAGGAAACT GCTAACGCGTCCATTCTGTTTAAGAAAGACGGGTTATATTACCTTGGAATTATGCCGAAAGGTAAGACCT TTCTCTTTGACTACTTTGTATCGAGCGAGGATTCAGAGAAACTGAAACAGCGTCGCCAGAAGACCGCCG AAGAAGCTCTGGCGCAGGATGGTGAAAGTTACTTCGAAAAAATTCGTTATAAACTGTTACCAGGGGCTT CAAAGATGTTACCGAAAGTCTTTTTTAGCAACAAAAATATTGGCTTTTACAACCCGTCGGATGACATTTT ACGCATTCGCAACACAGCCTCTCACACCAAAAACGGGACCCCT CAGAAAGGCCACTCAAAAGTTGAGTT TAACCTGAATGATTGTCATAAGATGATTGATTTCTTCAAATCATCAATTCAGAAACACCCGGAATGGGG GTCTTTTGGCTTTACGTTTTCTGATACCAGTGATTTTGAAGACATGAGTGCCTTCTACCGGGAAGTAGAA AACCAGGGTTACGTAATTAGCTTTGACAAAATCAAAGAGACCTATATACAGAGCCAGGTGGAACAGGGT AATCTCTACTTATTCCAGATTTATAACAAGGATTTCTCGCCCTACAGCAAAGGCAAACCAAACCTGCATA CTCTGTACTGGAAAGCCCTGTTTGAAGAAGCGAACCTGAATAACGTAGTGGCGAAGTTGAACGGTGAAG CGGAAATCTTCTTCCGTCGTCACTCCATTAAGGCCTCTGATAAAGTTGTCCATCCGGCAAATCAGGCCAT TGATAATAAGAATCCACACACGGAAAAAACGCAGTCAACCTTTGAATATGACCTCGTTAAAGACAAACG CTACACGCAAGATAAGTTCTTTTTCCACGTCCCAATCAGCCTCAACTTTAAAGCACAAGGGGTTTCAAAG TTTAATGATAAAGTCAATGGGTTCCTCAAGGGCAACCCGGATGTCAACATTATAGGTATAGACAGGGGC GAACGCCATCTGCTTTACTTTACCGTAGTGAATCAGAAAGGTGAAATACTGGTTCAGGAATCATTAAAT ACCTTGATGTCGGACAAAGGGCACGTTAATGATTACCAGCAGAAACTGGATAAAAAAGAACAGGAACG TGATGCTGCGCGTAAATCGTGGACCACGGTTGAGAACATTAAAGAGCTGAAAGAGGGGTATCTAAGCCA TGTGGTACACAAACTGGCGCACCTCATCATTAAATATAACGCAATAGTCTGCCTAGAAGACTTGAATTTT GGCTTTAAACGCGGCCGCTTCAAAGTGGAAAAACAAGTTTATCAAAAATTTGAAAAGGCGCTTATAGAT AAACTGAATTATCTGGTTTTTAAAGAAAAGGAACTTGGTGAGGTAGGGCACTACTTGACAGCTTATCAA CTGACGGCCCCGTTCGAATCATTCAAAAAACTGGGCAAACAGTCTGGCATTCTGTTTTACGTGCCGGCAG ATTATACTTCAAAAATCGATCCAACAACTGGCTTTGTGAACTTCCTGGACCTGAGATATCAGTCTGTAGA AAAAGCTAAACAACTTCTTAGCGATTTTAATGCCATTCGTTTTAACAGCGTTCAGAATTACTTTGAATTC GAAATTGACTATAAAAAACTTACTCCGAAACGTAAAGTCGGAACCCAAAGTAAATGGGTAATTTGTACG TATGGCGATGTCAGGTATCAGAACCGTCGGAATCAAAAAGGTCATTGGGAGACCGAAGAAGTGAACGT GACCGAAAAGCTGAAGGCTCTGTTCGCCAGCGATTCAAAAACTACAACTGTGATCGATTACGCAAATGA TGATAACCTGATAGATGTGATTTTAGAGCAGGATAAAGCCAGCTTTTTTAAAGAACTGTTGTGGCTCCTG AAACTTACGATGACCTTACGACATTCCAAGATCAAATCGGAAGATGATTTTATTCTGTCACCGGTCAAGA ATGAGCAGGGTGAATTCTATGATAGTAGGAAAGCCGGCGAAGTGTGGCCGAAAGACGCCGACGCCAAT GGCGCCTATCATATCGCGCTCAAAGGGCTTTGGAATTTGCAGCAGATTAACCAGTGGGAAAAAGGTAAA ACCCTGAATCTGGCTATCAAAAACCAGGATTGGTTTAGCTTTATCCAAGAGAAACCGTATCAGGAATGA GAAATCATCCTTAGCGAAAGCTAAGGATTTTTTTTATCTGAAATTTATTATATCGCGTTGATTATTGATGC US 10 , 435 , 714 B2 135 136 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence TGTTTTTAGTTTTAACGGCAATTAATATATGTGTTATTAATTGAATGAATTTTATCATTCATAATAAGTAT GTGTAGGATCAAGCTCAGGTTAAATATTCACTCAGGAAGTTATTACTCAGGAAGCAAAGAGGATTACAG AATTATCTCATAACAAGTGTTAAGGGATGTTATTTCC SEQ AATCAGGAGAGCGTTTTCAATCCTACCTCTGGCGCAGTTGATATGTCAAACAGGTTGCCGTCACTGCGTC ID TTTTACTGGCTCTTCTCGCTAACCAAACCGGTAACCCCGCTTATTAAAAGCATTCTGTAACAAAGCGGGA NO : CCAAAGCCATGACAAAAACGCGTAACAAAAGTGTCTATAATCACGGCAGAAAAGTCCACATTGATTATT 65 TGCACGGCGTCACACTTTGCTATGCCATAGCATTTTTATCCATAAGATTAGCGGATCCTACCTGACGCTT TTTATCGCAACTCTCTACTGTTTCTCCATACCCGTTTTTTTGGGCTAGCAGTAATACGACTCACTATAGGG GTCTCATCTCGTGTGAGATAGGCGGAGATACGAACTTTAAGAGGAGGATATACCATGCACCATCATCAT CACCATCATACAGGCGGTCTTCTTAGTATGGACGCGAAAGAGTTCACAGGTCAGTATCCGTTGTCGAAA ACATTACGATTCGAACTTCGGCCCATCGGCCGCACGTGGGATAACCTGGAGGCCTCAGGCTACTTAGCG GAAGACCGCCATCGTGCCGAATGTTATCCTCGTGCGAAAGAGT TATTGGATGACAACCATCGTGCCTTCC TGAATCGTGTGTTGCCACAAATCGATATGGATTGGCACCCGAT TGCGGAGGCCTTTTGTAAGGTACATAA AAACCCTGGTAATAAAGAACTTGCCCAGGATTACAACCTTCAGTTGTCAAAGCGCCGTAAGGAGATCAG CGCATATCTTCAGGATGCAGATGGCTATAAAGGCCTGTTCGCGAAGCCCGCCTTAGACGAAGCTATGAA AATTGCGAAAGAAAACGGGAACGAAAGTGATATTGAGGTTCTCGAAGCGTTTAACGGTTTTAGCGTATA CTTCACCGGTTATCATGAGTCACGCGAGAACATTTATAGCGATGAGGATATGGTGAGCGTAGCCTACCG AATTACTGAGGATAATTTCCCGCGCTTTGTCTCAAACGCTTTGATCTTTGATAAATTAAACGAAAGCCAT CCGGATATTATCTCTGAAGTATCGGGCAATCTTGGAGTTGATGACATTGGTAAGTACTTTGACGTGTCGA ACTATAACAATTTTCTTTCCCAGGCCGGTATAGATGACTACAATCACATTATTGGCGGCCATACAACCGA AGACGGA?TGATACAAGCGTTTAATGTCGTATTGAACTTACGT CACCAAAAAGACCCTGGCTTTGAAAA AATTCAGTTCAAACAGCTCTACAAACAAATCCTGAGCGTGCGTACCAGCAAAAGCTACATCCCGAAACA GTTTGACAACTCTAAGGAGATGGTTGACTGCATTTGCGATTATGTCAGCAAAATAGAGAAATCCGAAAC AGTAGAACGGGCCCTGAAACTAGTCCGTAATATCAGTTCTTTCGACTTGCGCGGGATCTTTGTCAATAAA AAGAACTTGCGCATACTGAGCAACAAACTGATAGGAGATTGGGACGCGATCGAAACCGCATTGATGCAT AGTTCTTCATCAGAAAACGATAAGAAAAGCGTATATGATAGCGCGGAGGCTTTTACGTTGGATGACATC TTTTCAAGCGTGAAAAAATTTTCTGATGCCTCTGCCGAAGATATTGGCAACAGGGCGGAAGACATCTGT AGAGTGATAAGTGAGACGGCCCCTTTTATCAACGATCTGCGAGCGGTGGACCTGGATAGCCTGAACGAC GATGGTTATGAAGCGGCCGTCTCAAAAATTCGGGAGTCGCTGGAGCCTTATATGGATCTTTTCCATGAAC TGGAAATTTTCTCGGTTGGCGATGAGTTCCCAAAATGCGCAGCATTTTACAGCGAACTGGAGGAAGTCA GCGAACAGCTGATCGAAATTATTCCGTTATTCAACAAGGCGCGTTCGTTCTGCACCCGGAAACGCTATA GCACCGATAAGATTAAAGTGAACTTAAAATTCCCGACCTTGGCGGACGGGTGGGACCTGAACAAAGAG AGAGACAACAAAGCCGCGATTCTGCGGAAAGACGGTAAGTATTATCTGGCAATTCTGGATATGAAGAA AGATCTGTCAAGCATTAGGACCAGCGACGAAGATGAATCCAGCTTCGAAAAGATGGAGTATAAACTGTT ACCGAGTCCAGTAAAAATGCTGCCAAAGATATTCGTAAAATCGAAAGCCGCTAAGGAAAAATATGGCCT GACAGATCGTATGCTTGAATGCTACGATAAAGGTATGCATAAGTCGGGTAGTGCGTTTGATCTTGGCTTT TGCCATGAACTCATTGATTATTACAAGCGTTGTATCGCGGAGTACCCAGGCTGGGATGTGTTCGATTTCA AGTTTCGCGAAACTTCCGATTATGGGTCCATGAAAGAGTTCAATGAAGATGTGGCCGGAGCCGGTTACT ATATGAGTCTGAGAAAAATTCCGTGCAGCGAAGTGTACCGTCTGTTAGACGAGAAATCGATTTATCTATT TCAAATTTATAACAAAGATTACTCTGAAAATGCACATGGTAATAAGAACATGCATACCATGTACTGGGA GGGTCTCTTTTCCCCGCAAAACCTGGAGTCGCCCGTTTTCAAGTTGTCGGGTGGGGCAGAACTTTTCTTT CGAAAATCCTCAATCCCTAACGATGCCAAAACAGTACACCCGAAAGGCTCAGTGCTGGTTCCACGTAAT GATGTTAACGGTCGGCGTATTCCAGATTCAATCTACCGCGAAC TGACACGCTATTTTAACCGTGGCGATT GCCGAATCAGTGACGAAGCCAAAAGTTATCTTGACAAGGTTAAGACTAAAAAAGCGGACCATGACATT GTGAAAGATCGCCGCTTTACCGTGGATAAAATGATGTTCCACGTCCCGATTGCGATGAACTTTAAGGCG ATCAGTAAACCGAACTTAAACAAAAAAGTCATTGATGGCATCATTGATGATCAGGATCTGAAAATCATT GGTATTGATCGTGGCGAGCGGAACTTAATTTACGTCACGATGGTTGACAGAAAAGGGAATATCTTATAT CAGGATTCTCTTAACATCCTCAATGGCTACGACTATCGTAAAGCTCTGGATGTGCGCGAATATGACAACA AGGAAGCGCGTCGTAACTGGACTAAAGTGGAGGGCATTCGCAAAATGAAGGAAGGCTATCTGTCATTA GCGGTCTCGAAATTAGCGGATATGATTATCGAAAATAACGCCATCATCGTTATGGAGGACCTGAACCAC GGATTCAAAGCGGGCCGCTCAAAGATTGAAAAACAAGTTTATCAGAAATTTGAGAGTATGCTGATTAAC AAACTGGGCTATATGGTGTTAAAAGACAAGTCAATTGACCAAT CAGGTGGCGCGCTGCATGGATACCAG CTGGCGAACCATGTTACCACCTTAGCATCAGTTGGAAAGCAGTGTGGGGTTATCTTTTATATACCGGCAG CGTTCACTAGTAAAATAGATCCGACCACTGGTTTCGCCGATCTCTTTGCCCTGAGTAACGTTAAAAACGT AGCGAGCATGCGTGAATTCTTTTCCAAAATGAAATCTGTCATTTATGATAAAGCTGAAGGCAAATTCGC ATTCACCTTTGATTACTTGGATTACAACGTGAAGAGCGAATGTGGTCGTACGCTGTGGACCGTTTACACC GTTGGTGAGCGCTTCACCTATTCCCGTGTGAACCGCGAATATGTACGTAAAGTCCCCACCGATATTATCT ATGATGCCCTCCAGAAAGCAGGCATTAGCGTCGAAGGAGACTTAAGGGACAGAATTGCCGAAAGCGAT GGCGATACGCTGAAGTCTATTTTTTACGCATTCAAATACGCGC TAGATATGCGCGTTGAGAATCGCGAG GAAGACTACATTCAATCACCTGTGAAAAATGCCTCTGGGGAATTTTTTTGTTCAAAAAATGCTGGTAAAA GCCTCCCACAAGATAGCGATGCAAACGGTGCATATAACATTGCCCTGAAAGGTATTCTTCAATTACGCA TGCTGTCTGAGCAGTACGACCCCAACGCGGAATCTATTAGACTTCCGCTGATAACCAATAAAGCCTGGC TGACATTCATGCAGTCTGGCATGAAGACCTGGAAAAATTAGGAAATCATCCTTAGCGAAAGCTAAGGAT TTTTTTTATCTGAAATGTAGGGAGACCCTCAGGTTAAATATTCACTCAGGAAGTTATTACTCAGGAAGCA AAGAGGATTACA SEQ AAAATTCcatGCAAAATGCTCCGGTTTCATGTCATCAAAATGATGACGTAATTAAGCATTGATAATTGAGA ID TCCCTCTCCCTGACAGGATGATTACATAAATAATAGTGACAAAAATAAATTATTTATTTATCCAGAAAAT NO : GAATTGGAAAATCAGGAGAGCGTTTTCAATCCTACCTCTGGCGCAGTTGATATGTcaaaCAGGTtgccgtc 66 actgcgtcttttactggctcttctcgctaaccaaaccggtaaccccgcttattaaaagcattctgtaacaa agcgggaccaaagccatgacaaaaacgcgtaacaaaagtgtctataatcacggcagaaaagtccacattga ttatttgcacggcgtcacactttgctatgccatagcatttttatccataagattagcggatcctacctgac gctttttatcgcaactctctactgtttctccatacccgtttttttgggctagcaccgcctatctcgtgtga gataggcggagatacgaactttaagAAGGAGatataccatqGATAGTTTGAAAGATTTCACCAATCTGTAC CCTGTCAGTAAGACATTGAGATTTGAATTAAAGCCCGTTGGAAAGACTTTAGAAAATATCGAGAAAGCAGG US 10 , 435 , 714 B2 137 138 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence TATTTTGAAAGAGGATGAGCATCGTGCAGAAAGTTATCGGAGGGTGAAGAA AATAATTGATACTTATCATAAGGTATTTATCGATTCTTCTCTTGAAAATATGGCTAAAATGGGTATTGAG AATGAAATAAAAGCAATGCTCCAAAGTTTCTGCGAATTGTATAAAAAAGATCATCGCACTGAGGGTGAA GACAAGGCATTAGATAAAATTCGAGCAGTACTTCGTGGCCTGATTGTTGGGGCTTTCACTGGTGTTTGCG GAAGACGGGAAAATACAGTCCAAAACGAGAAGTACGAGAGTTTGTTCAAAGAAAAGTTGATAAAAGAA ATTTTACCTGATTTTGTGCTCTCTACTGAGGCTGAAAGCTTGCCTTTCTCTGTTGAAGAAGCTACGAGGTC ACTGAAGGAGTTTGATAGCTTTACATCCTACTTTGCTGGTTTTTACGAGAATAGAAAGAATATATACTCG ACGAAACCTCAATCCACTGCCATTGCTTATCGTCTTATTCATGAGAACTTGCCGAAGTTCATTGATAATA TTCTTGTTTTTCAGAAGATCAAAGAGCCTATAGCCAAAGAGCTGGAACATATTCGTGCGGACTTTTCTGC CGGGGGGTACATAAAAAAGGATGAGAGATTGGAGGATATTTTTTCGTTGAACTATTATATCCACGTGTT ATCTCAGGCTGGGATCGAAAAATATAACGCATTGATTGGGAAGATTGTGACAGAAGGAGATGGAGAGA TGAAAGGGCTCAATGAACACATCAACCTTTACAACCAACAAAGAGGCAGAGAGGATCGGCTCCCTCTTT TTAGGCCTCTTTATAAACAGATATTGAGTGACAGAGAGCAATTATCATACTTGCCTGAGAGTTTTGAAAA AGATGAGGAGCTCCTCAGGGCTCTAAAAGAGTTCTATGATCATATCGCAGAAGACATTCTCGGACGTAC TCAACAGTTGATGACTTCTATTTCAGAATATGATTTATCTCGGATATACGTAAGGAACGATAGCCAATTG ACTGATATATCAAAAAAAATGTTGGGAGATTGGAATGCTATCTACATGGCTAGAGAACGAGCATATGAC CACGAGCAGGCTCCCAAAAGAATCACGGCGAAATACGAGAGGGACAGGATTAAAGCTCTTAAAGGAGA AGAGAGTATAAGTCTGGCAAATCTTAATAGTTGTATTGCCTTTCTGGACAATGTTAGAGATTGCCGTGTA GATACTTATCTTTCCACACTGGGCCAGAAGGAAGGACCACATGGTCTATCTAATCTCGTTGAGAACGTTT TTGCCTCATACCATGAAGCAGAGCAATTGTTGAGCTTTCCATACCCCGAAGAGAATAATCTGATTCAGG ACAAGGACAATGTGGTGTTAATTAAGAATCTTCTCGACAATAT CAGTGATCTGCAGAGGTTCTTGAAAC CTCTTTGGGGTATGGGAGACGAACCCGATAAAGATGAAAGATTTTATGGAGAGTATAATTATATCCGAG GAGCTCTAGATCAGGTGATCCCTCTGTACAATAAGGTAAGGAACTACCTCACTCGGAAGCCTTATTCGA CCAGAAAAGTAAAACTCAATTTTGGGAATTCTCAATTGCTTAGTGGTTGGGATAGAAATAAGGAAAAGG ATAATAGCTGTGTGATTTTGCGTAAGGGGCAGAACTTCTATTTGGCTATTATGAACAATAGGCACAAAA GAAGTTTCGAAAACAAGGTGTTGCCCGAGTATAAGGAGGGAGAACCTTACTTCGAAAAGATGGATTATA AATTTTTGCCTGATCCTAATAAAATGCTTCCTAAGGTTTTTCTTTCGAAAAAAGGAATAGAGATATACAA ACCAAGTCCGAAGCTTTTAGAACAATATGGACATGGAACTCACAAAAAGGGAGATACCTTTAGTATGGA TGATTTGCACGAACTGATCGATTTCTTCAAACACTCAATCGAGGCTCATGAAGATTGGAAGCAATTCGG ATTCAAATTTTCTGATACGGCTACTTATGAGAATGTATCTAGTTTCTATAGAGAAGTTGAGGATCAGGGG TATAAGCTCTCTTTCCGAAAAGTTTCGGAATCTTATGTCTATTCATTAATAGATCAAGGCAAGTTGTATTT ATTTCAGATATACAACAAGGACTTTTCTCCCTGCAGCAAAGGGACACCTAATCTGCATACCTTGTATTGG AGAATGCTTTTTGACGAGCGCAATTTGGCAGATGTCATATACAAACTGGATGGGAAGGCTGAAATCTTT TTCCGAGAGAAGAGTTTGAAAAATGATCATCCCACGCATCCGGCTGGTAAGCCTATCAAAAAGAAAAGT CGACAAAAAAAAGGAGAGGAGAGTCTGTTTGAGTATGATTTAGTCAAGGATAGGCACTATACGATGGA TAAGTTCCAGTTTCATGTGCCTATTACTATGAATTTTAAATGT TCTGCAGGAAGCAAAGTCAATGATATG GTTAATGCTCATATTCGAGAGGCAAAGGATATGCATGTCATTGGAATTGATCGTGGAGAACGCAATCTG CTGTATATATGCGTGATAGATAGTCGAGGGACGATTTTGGATCAAATTTCTCTGAATACGATTAACGATA TAGACTATCATGATTTATTGGAGAGTCGAGACAAAGACCGTCAGCAGGAGCGCCGAAACTGGCAAACTA TCGAAGGGATCAAGGAGCTAAAACAAGGCTACCTTAGTCAGGCGGTTCATCGGATAGCCGAACTGATGG TGGCTTATAAGGCTGTAGTTGCTTTGGAGGATTTGAATATGGGGTTCAAACGTGGGCGGCAGAAAGTAG AAAGTTCTGTTTATCAGCAGTTTGAGAAACAGCTGATAGATAAGCTCAACTATCTTGTGGACAAGAAGA AAAGGCCTGAAGATATTGGAGGATTGTTGAGAGCCTATCAATTTACGGCCCCATTTAAGAGTTTTAAGG AAATGGGAAAGCAAAACGGCTTCTTGTTTTATATCCCGGCTTGGAACACGAGCAACATAGATCCGACTA CTGGATTTGTTAATTTATTTCATGCCCAGTATGAAAATGTAGATAAAGCGAAGAGCTTCTTTCAAAAGTT TGATTCAATTAGTTACAACCCGAAGAAAGACTGGTTTGAGTTTGCATTCGATTATAAAAACTTTACTAAA AAGGCTGAAGGAAGTCGTTCTATGTGGATATTATGCACACATGGTTCCCGAATAAAGAATTTTAGAAAT TCCCAGAAGAATGGTCAATGGGATTCCGAAGAATTCGCCTTGACGGAGGCTTTTAAGTCTCTTTTTGTGC GATATGAGATAGATTATACCGCTGATTTGAAAACAGCTATTGTGGACGAAAAGCAAAAAGACTTCTTCG TGGATCTTCTGAAGCTATTCAAATTGACAGTACAGATGCGCAACAGCTGGAAAGAGAAGGATTTGGATT ATCTAATCTCTCCTGTAGCAGGGGCTGATGGCCGTTTCTTCGATACAAGAGAGGGAAATAAAAGTCTGC CTAAGGATGCAGATGCCAATGGAGCTTATAATATTGCCCTAAAAGGACTTTGGGCTCTACGCCAGATTC GGCAAACTTCAGAAGGCGGTAAACTCAAATTGGCGATTTCC AGAGATCTTACGAGAAAGACtgaGAAATCATCCTTAGCGAAAGCTAAGGATTTTTTTTATCTGAAATTTAT TATATCGCGTTGATTATTGATGCTGTTTTTAGTTTTAACGGCAATTAATATATGTGTTATTAATTGAATGA ATTTTATCATTCATAATAAGTATGTGTAGGATCAAGCTCAGGTTAAATATTCACTCAGGAAGTTATTACT CAGGAAGCAAAGAGGATTACAGAATTATCTCATAACAAGTGTTAAGGGATGTTATTTCC SEQ AAAATTCcatGCAAAATGCTCCGGTTTCATGTCATCAAAATGATGACGTAATTAAGCATTGATAATTGAGA ID TCCCTCTCCCTGACAGGATGATTACATAAATAATAGTGACAAAAATAAATTATTTATTTATCCAGAAAAT NO : GAATTGGAAAATCAGGAGAGCGTTTTCAATCCTACCTCTGGCGCAGTTGATATGTcaaaCAGGTtgccgt 67 cactgcgtcttttactggctcttctcgctaaccaaaccggtaaccccgcttattaaaagcattctgtaac aaagcgggaccaaagccatgacaaaaacgcgtaacaaaagtgtctataatcacggcagaaaagtccacat tgattatttgcacggcgtcacactttgctatgccatagcatttttatccataagattagcggatcctacc tgacgctttttatcgcaactctctactgtttctccatacccgtttttttgggctagcaccgcctatctcg tgtgagataggcggagatacgaactttaagAAGGAGatataccATGAACAACGGCACAAATAATTTTCAG AACTTCATCGGGATCTCAAGTTTGCAGAAAACGCTGCGCAATGCTGCAGAAAACGCTGCGCAATGCTCTGATCCCCACGGAAACCACGCAAC AGTTCATCGTCAAGAACGGAATAATTAAAGAAGATGAGTTACGTGGCGAGAACCGCC AGATTCTGAAAGATATCATGGATGACTACTACCGCGGATTCATCTCTGAGACTCTGAGTTCTATTGATGA CATAGATTGGACTAGCCTGTTCGAAAAAATGGAAATTCAGCTGAAAAATGGTGATAATAAAGATACCTT AATTAAGGAACAGACAGAGTATCGGAAAGCAATCCATAAAAAATTTGCGAACGACGATCGGTTTAAGA ACATGTTTAGCGCCAAACTGATTAGTGACATATTACCTGAATTTGTCATCCACAACAATAATTATTCGGC ATCAGAGAAAGAGGAAAAAACCCAGGTGATAAAATTGTTTTCGCGCTTTGCGACTAGCTTTAAAGATTA CTTCAAGAACCGTGCAAATTGCTTTTCAGCGGACGATATTTCATCAAGCAGCTGCCATCGCATCGTCAAC GACAATGCAGAGATATTCTTTTCAAATGCGCTGGTCTACCGCCGGATCGTAAAATCGCTGAGCAATGAC GATATCAACAAAATTTCGGGCGATATGAAAGATTCATTAAAAGAAATGAGTCTGGAAGAAATATATTCT US 10 , 435 , 714 B2 139 140 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence TACGAGAAGTATGGGGAATTTATTACCCAGGAAGGCATTAGCTTCTATAATGATATCTGTGGGAAAGTG AATTCTTTTATGAACCTGTATTGTCAGAAAAATAAAGAAAACAAAAATTTATACAAACTTCAGAAACTT CACAAACAGATTCTATGCATTGCGGACACTAGCTATGAGGTCOCACTAGCTATGAGGTCCCGTATAAATTTGAAAGTGACGAGGAN GTGTACCAATCAGTTAACGGCTTCCTTGATAACATTAGCAGCAAACATATAGTCGAAAGATTACGCAAA ATCGGCGATAACTATAACGGCTACAACCTGGATAAAATTTATATCGTGTCCAAATTTTACGAGAGCGTTA GCCAAAAAACCTACCGCGACTGGGAAACAATTAATACCGCCCTCGAAATTCATTACAATAATATCTTGC CGGGTAACGGTAAAAGTAAAGCCGACAAAGTAAAAAAAGCGGT TAAGAATGATTTACAGAAATCCATC ACCGAAATAAATGAACTAGTGTCAAACTATAAGCTGTGCAGTGACGACAACATCAAAGCGGAGACTTAT ATACATGAGATTAGCCATATCTTGAATAACTTTGAAGCACAGGAATTGAAATACAATCCGGAAATTCAC CTAGTTGAATCCGAGCTCAAAGCGAGTGAGCTTAAAAACGTGCTGGACGTGATCATGAATGCGTTTCAT TGGTGTTCGGTTTTTATGACTGAGGAACTTGTTGATAAAGACAACAATTTTTATGCGGAACTGGAGGAGA TTTACGATGAAATTTATCCAGTAATTAGTCTGTACAACCTGGT TCGTAACTACGTTACCCAGAAACCGTA CAGCACGAAAAAGATTAAATTGAACTTTGGAATACCGACGTTAGCAGACGGTTGGTCAAAGTCCAAAGA GTATTCTAATAACGCTATCATACTGATGCGCGACAATCTGTATTATCTGGGCATCTTTAATGCGAAGAAT AAACCGGACAAGAAGATTATCGAGGGTAATACGTCAGAAAATAAGGGTGACTACAAAAAGATGATTTA TAATTTGCTCCCGGGTCCCAACAAAATGATCCCGAAAGTTTTCTTGAGCAGCAAGACGGGGGTGGAAAC GTATAAACCGAGCGCCTATATCCTAGAGGGGTATAAACAGAATAAACATATCAAGTCTTCAAAAGACTT TGATATCACTTTCTGTCATGATCTGATCGACTACTTCAAAAACTGTATTGCAATTCATCCCGAGTGGAAA AACTTCGGTTTTGATTTTAGCGACACCAGTACTTATGAAGACATTTCCGGGTTTTATCGTGAGGTAGAGT TACAAGGTTACAAGATTGATTGGACATACATTAGCGAAAAAGACATTGATCTGCTGCAGGAAAAAGGTC AACTGTATCTGTTCCAGATATATAACAAAGATTTTTCGAAAAAATCAACCGGGAATGACAACCTTCACA CCATGTACCTGAAAAATCTTTTCTCAGAAGAAAATCTTAAGGATATCGTCCTGAAACTTAACGGCGAAG CGGAAATCTTCTTCAGGAAGAGCAGCATAAAGAACCCAATCATTCATAAAAAAGGCTCGATTTTAGTCA ACCGTACCTACGAAGCAGAAGAAAAAGACCAGTTTGGCAACATTCAAATTGTGCGTAAAAATATTCCGG AAAACATTTATCAGGAGCTGTACAAATACTTCAACGATAAAAGCGACAAAGAGCTGTCTGATGAAGCAG CCAAACTGAAGAATGTAGTGGGACACCACGAGGCAGCGACGAATATAGTCAAGGACTATCGCTACACG TATGATAAATACTTCCTTCATATGCCTATTACGATCAATTTCAAAGCCAATAAAACGGGTTTTATTAATG ATAGGATCTTACAGTATATCGCTAAAGAAAAAGACTTACATGTGATCGGCATTGATCGGGGCGAGCGTA ACCTGATCTACGTGTCCGTGATTGATACTTGTGGTAATATAGTTGAACAGAAAAGCTTTAACATTGTAAA CGGCTACGACTATCAGATAAAACTGAAACAACAGGAGGGCGCTAGACAGATTGCGCGGAAAGAATGGA AAGAAATTGGTAAAATTAAAAGATCAAAGAGGGCTACCTGAGCTTAGTAATCCACGAGATCTCTAAAA TGGTAATCAAATACAATGCAATTATAGCGATGGAGGATTTGTCTTATGGTTTTAAAAAAGGGCGCTTTAA GGTCGAACGGCAAGTTTACCAGAAATTTGAAACCATGCTCATCAATAAACTCAACTATCTGGTATTTAA AGATATTTCGATTACCGAGAATGGCGGTCTCCTGAAAGGTTAT CAGCTGACATACATTCCTGATAAACTT AAAAACGTGGGTCATCAGTGCGGCTGCATTTTTTATGTGCCTGCTGCATACACGAGCAAAATTGATCCGA CCACCGGCTTTGTGAATATCTTTAAATTTAAAGACCTGACAGTGGACGCAAAACGTGAATTCATTAAAA AATTTGACTCAATTCGTTATGACAGTGAAAAAAATCTGTTCTGCTTTACATTTGACTACAATAACTTTATT ACGCAAAACACGGTCATGAGCAAATCATCGTGGAGTGTGTATACATACGGCGTGCGCATCAAACGTCGC TTTGTGAACGGCCGCTTCTCAAACGAAAGTGATACCATTGACATAACCAAAGATATGGAGAAAACGTTG GAAATGACGGACATTAACTGGCGCGATGGCCACGATCTTCGTCAAGACATTATAGATTATGAAATTGTT CAGCACATATTCGAAATTTTCCGTTTAACAGTGCAAATGCGTAACTCCTTGTCTGAACTGGAGGACCGTG ATTACGATCGTCTCATTTCACCTGTACTGAACGAAAATAACATTTTTTATGACAGCGCGAAAGCGGGGG ATGCACTTCCTAAGGATGCCGATGCAAATGGTGCGTATTGTATTGCATTAAAAGGGTTATATGAAATTAA ACAAATTACCGAAAATTGGAAAGAAGATGGTAAATTTTCGCGCGATAAACTCAAAATCAGCAATAAAG ATTGGTTCGACTTTATCCAGAATAAGCGCTATCTCTAAGAAAT CATCCTTAGCGAAAGCTAAGGATTTTT TTTATCTGAAATTTATTATATCGCGTTGATTATTGATGCTGTTTTTAGTTTTAACGGCAATTAATATATGT GTTATTAATTGAATGAATTTTATCATTCATAATAAGTATGTGTAGGATCAAGCTCAGGTTAAATATTCAC TCAGGAAGTTATTACTCAGGAAGCAAAGAGGATTACAGAATTATCTCATAACAAGTGTTAAGGGATGTT ????cc SEQ AATCAGGAGAGCGTTTTCAATCCTACCTCTGGCGCAGTTGATATGTCAAACAGGTTGCCGTCACTGCGTC ID TTTTACTGGCTCTTCTCGCTAACCAAACCGGTAACCCCGCTTATTAAAAGCATTCTGTAACAAAGCGGGA NO : CCAAAGCCATGACAAAAACGCGTAACAAAAGTGTCTATAATCACGGCAGAAAAGTCCACATTGATTATT 68 TGCACGGCGTCACACTTTGCTATGCCATAGCATTTTTATCCATAAGATTAGCGGATCCTACCTGACGCTT TTTATCGCAACTCTCTACTGTTTCTCCATACCCGTTTTTTTGGGCTAGCAGTAATACGACTCACTATAGGG GTCTCATCTCGTGTGAGATAGGCGGAGATACGAACTTTAAGAGGAGGATATACCATGCACCATCATCAT CACCATACCAATAAATTCACTAACCAGTATTCTCTCTCTAAGACCCTGCGCTTTGAACTGATTCCGCAGG GGAAAACCTTGGAGTTCATTCAAGAAAAAGGCCTCTTGTCTCAGGATAAACAGAGGGCTGAATCTTACC AAGAAATGAAGAAAACTATTGATAAGTTTCATAAATATTTCATTGATTTAGCCTTGTCTAACGCCAAATT AACTCACTTGGAAACGTATCTGGAGTTATACAACAAATCTGCCGAAACTAAGAAAGAACAGAAATTTAA AGACGATTTGAAAAAAGTACAGGACAATCTGCGTAAAGAAATTGTCAAATCCTTCAGTGACGGCGATGC TAAAAGCATTTTTGCCATTCTGGACAAAAAAGAGTTGATTACTGTGGAATTAGAAAAGTGGTTTGAAAA CAATGAGCAGAAAGACATCTACTTCGATGAGAAATTCAAAACTTTCACCACCTATTTTACAGGATTTCAT CAAAACCGGAAGAACATGTACTCAGTAGAACCGAACTCCACGGCCATTGCGTATCGTTTGATCCATGAG AATCTGCCTAAATTTCTGGAGAATGCGAAAGCCTTTGAAAAGATTAAGCAGGTCGAATCGCTGCAAGTG AATTTTCGTGAACTCATGGGCGAATTTGGTGACGAAGGTCTAATCTTCGTTAACGAACTGGAAGAAATG TTTCAGATTAATTACTACAATGACGTGCTATCGCAGAACGGTATCACAATCTACAATAGTATTATCTCAG GGTTCACAAAAAACGATATAAAATACAAAGGCCTGAACGAGTATATCAATAACTACAACCAAACAAAG GACAAAAAGGATAGGCTTCCGAAACTGAAGCAGTTATACAAACAGATTTTATCTGACAGAATCTCCCTG AGCTTTCTGCCGGATGCTTTCACTGATGGGAAGCAGGTTCTGAAAGCGATTTTCGATTTTTATAAGATTA ACTTACTGAGCTACACGATTGAAGGTCAAGAAGAATCTCAAAACTTACTGCTCTTGATCCGTCAAACCAT TGAAAATCTATCATCGTTCGATACGCAGAAAATCTACCTCAAAAACGATACTCACCTGACTACGATCTCT CAGCAGGTTTTCGGGGATTTTAGTGTATTTTCAACACTCTGAACTACTGGTATGAAACCAAAGTCAATC CGAAATTCGAGACGGAATATTCTAAGGCCAACGAAAAAAAACGTGAGATTCTTGATAAAGCTAAAGCC GTATTTACTAAACAGGATTACTTTTCTATTGCTTTCCTGCAGGAAGTTTTATCGGAGTATATCCTGACCCT GGATCATACATCTGATATCGTTAAAAAACACAGCAGCAATTGCATCGCTGACTATTTCAAAAACCACTTT US 10 , 435 , 714 B2 141 142 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence GTCGCCAAAAAAGAAAACGAAACAGACAAGACTTTCGATTTCATTGCTAACATCACCGCAAAATACCAG TGTATTCAGGGTATCTTGGAAAACGCCGACCAATACGAAGACGAACTGAAACAAGATCAGAAGCTGATC GATAATTTAAAATTCTTCTTAGATGCAATCCTGGAGCTGCTGCACTTCATCAAACCGCTTCATTTAAAGA GCGAGTCCATTACCGAAAAGGACACCGCCTTCTATGACGTTTTTGAAAATTATTATGAAGCCCTCTCCTT GCTGACTCCGCTGTATAATATGGTACGCAATTACGTAACCCAGAAACCATATTCTACCGAAAAAATTAA ACTGAACTTTGAAAACGCACAGCTGCTCAACGGTTGGGACGCGAATAAAGAAGGTGACTACCTCACCAC CATCCTGAAAAAAGATGGTAACTATTTTCTGGCAATTATGGATAAGAAACATAATAAAGCATTCCAGAA ATTTCCTGAAGGGAAAGAAAATTACGAAAAGATGGTGTACAAACTCTTACCTGGAGTTAACAAAATGTT GCCGAAAGTATTTTTTAGTAATAAGAACATCGCGTACTTTAACCCGTCCAAAGAACTGCTGGAAAATTAT AAAAAGGAGAcccATAACAAAccccATAccTTTAAcc?????????cccATAcc???? AAGGATTCCCTGAATAAACACGAGGATTGGAAATATTTCGATTTTCAGTTTAGTGAGACCAAGTCATAC CAGGATCTTAGCGGCTTTTATCGCGAAGTAGAACACCAAGGCTATAAAATTAACTTCAAAAACATCGAC AGCGAATACATCGACGGTTTAGTTAACGAGGGCAAACTGTTTC TGTTCCAGATCTATTCAAAGGATTTTA GCCCGTTCTCTAAAGGCAAACCAAATATGCATACGTTGTACTGGAAAGCACTGTTTGAAGAGCAAAACC TGCAGAATGTGATTTATAAACTGAACGGCCAAGCTGAGATTTTTTTCCGTAAAGCCTCGATTAAACCGAA AAATATCATCCTTCATAAGAAGAAAATAAAGATCGCTAAAAAACACTTCATAGATAAAAAAACCAAAA CCTCCGAAATAGTGCCTGTTCAAACAATTAAGAACTTGAATATGTACTACCAGGGCAAGATATCGGAAA AGGAGTTGACTCAAGACGATCTTCGCTATATCGATAACTTTTCGATTTTTAACGAAAAAAACAAGACGA TCGACATCATCAAAGATAAACGCTTCACTGTAGATAAGTTCCAGTTTCATGTGCCGATTACTATGAACTT CAAAGCTACCGGGGGTAGCTATATCAACCAAACGGTGTTGGAATACCTGCAGAATAACCCGGAAGTCAA AATCATTGGGCTGGACCGCGGAGAACGTCACCTTGTGTACTTGACCTTAATCGATCAGCAAGGCAACAT CTTAAAACAAGAATCGCTGAATACCATTACGGATTCAAAGATTAGCACCCCGTATCATAAGCTGCTCGA TAACAAGGAGAATGAGCGCGACCTGGCCCGTAAAAACTGGGGCACGGTGGAAAACATTAAGGAGTTAA AGGAGGGTTATATTTCCCAGGTAGTGCATAAGATCGCCACTCT CATGCTCGAGGAAAATGCGATCGTTG TCATGGAAGACTTAAACTTCGGATTTAAACGTGGGCGATTTAAAGTAGAGAAACAAATCTACCAGAAGT TAGAAAAAATGCTGATTGACAAATTAAATTACTTGGTCCTAAAAGACAAACAGCCGCAAGAATTGGGTG GATTATACAACGCCCTCCAACTTACCAATAAATTCGAAAGTTTTCAGAAAATGGGTAAACAGTCAGGCT TTCTTTTTTATGTTCCTGCGTGGAACACATCCAAAATCGACCCTACAACCGGCTTCGTCAATTACTTCTAT ACTAAATATGAAAACGTCGACAAAGCAAAAGCATTCTTTGAAAAGTTCGAAGCAATACGTTTTAACGCT GAGAAAAAATATTTCGAGTTCGAAGTCAAGAAATACTCAGACTTTAACCCCAAAGCTGAGGGCACACAG CAAGCGTGGACAATCTGCACCTACGGCGAGCGCATCGAAACGAAGCGTCAAAAAGATCAGAATAACAA ATTTGTTTCAACACCTATCAACCTGACCGAGAAGATTGAAGACTTCTTAGGTAAAAATCAGATTGTTTAT GGCGACGGTAACTGTATAAAATCTCAAATAGCCTCAAAGGATGATAAAGCATTTTTCGAAACATTATTA TATTGGTTCAAAATGACACTGCAGATGCGCAATAGTGAGACGCGTACAGATATTGATTATCTTATCAGCC CGGTCATGAACGACAACGGTACTTTTTACAACTCCAGAGACTATGAAAAACTTGAGAATCCAACTCTCC CCAAAGATGCTGATGCGAACGGTGCTTATCACATCGCGAAAAAAGGTCTGATGCTGCTGAACAAAATCG ACCAAGCCGATCTGACTAAGAAAGTTGACCTAAGCATTTCAAATCGGGACTGGTTACAGTTTGTTCAAA AGAACAAATGAGAAATCATCCTTAGCGAAAGCTAAGGATTTTTTTTATCTGAAATGTAGGGAGACCCTC AGGTTAAATATTCACTCAGGAAGTTATTACTCAGGAAGCAAAGAGGATTACA SEQ AAAATTCcatGCAAAATGCTCCGGTTTCATGTCATCAAAATGATGACGTAATTAAGCATTGATAATTGAGA ID TCCCTCTCCCTGACAGGATGATTACATAAATAATAGTGACAAAAATAAATTATTTATTTATCCAGAAAAT NO : GAATTGGAAAATCAGGAGAGCGTTTTCAATCCTACCTCTGGCGCAGTTGATATGTcaaaCAGGTtgccgtc 69 actgcgtcttttactggctcttctcgctaaccaaaccggtaaccccgcttattaaaagcattctgtaacaa agcgggaccaaagccatgacaaaaacgcgtaacaaaagtgtctataatcacggcagaaaagtccacattga ttatttgcacggcgtcacactttgctatgccatagcatttttatccataagattagcggatcctacctgac gctttttatcgcaactctctactgtttctccatacccgtttttttgggctagcaccgccta qataggcggagatacgaactttaagAAGGAGatatacCATGGAACAGGAATATTATCTGGGCTTGGACATG GGCACCGGTTCCGTCGGCTGGGCTGTTACTGACAGTGAATATCACGTTCTAAGAAAGCATGGTAAGGCATT GTGGGGTGTAAGACTTTTCGAATCTGCTTCCACTGCTGAAGAGCGTAGAAT GTTTAGAACGAGTCGACGTAGGCTAGACAGGCGCAATTGGAGAATCGAAATTTTACAAGAAATTTTTGC GGAAGAGATATCTAAGAAAGACCCAGGCTTTTTCCTGAGAATGAAGGAATCTAAGTATTACCCTGAGGA TAAAAGAGATATAAATGGTAACTGTCCCGAATTGCCTTACGCATTATTTGTGGACGATGATTTTACCGAT AAGGATTACCATAAAAAGTTCCCAACTATCTACCATTTACGCAAAATGTTAATGAATACAGAGGAAACC CCAGACATAAGACTAGTTTATCTGGCAATACACCATATGATGAAACATAGAGGCCATTTCTTACTTTCCG GGGATATCAACGAAATCAAAGAGTTTGGTACCACATTTAGTAAGTTACTGGAAAACATAAAGAATGAAG AATTGGATTGGAACTTAGAACTCGGAAAAGAAGAATACGCGGTTGTCGAATCTATCCTGAAGGATAATA TGCTGAATAGGTCGACCAAAAAAACTAGGCTGATCAAAGCACTGAAAGCCAAATCTATCTGCGAAAAA GCTGTTTTAAATTTACTTGCTGGTGGCACTGTTAAGTTATCAGACATTTTTGGTTTGGAAGAATTGAACG AAACCGAGCGTCCAAAAATTAGTTTCGCTGATAATGGCTACGATGATTACATTGGTGAGGTGGAAAACG AGTTGGGCGAACAATTTTATATTATAGAGACAGCTAAGGCAGTCTATGACTGGGCTGTTTTAGTAGAAA TCCTTGGTAAATACACATCTATCTCCGAAGCGAAAGTTGCTACTTACGAAAAGCACAAGTCCGATCTCCA GTTTTTGAAGAAAATTGTCAGGAAATATCTGACTAAGGAAGAATATAAAGATATTTTCGTTAGTACCTCT GACAAACTGAAAAATTACTCCGCTTACATCGGGATGACCAAGATTAATGGCAAAAAAGTTGATCTGCAA AGCAAAAGGTGTTCGAAGGAAGAATTTTATGATTTCATTAAAAAGAATGTCTTAAAAAAATTAGAAGGT CAGCCAGAATACGAATATTTGAAAGAAGAACTGGAAAGAGAGACATTCTTACCAAAACAAGTCAACAG AGATAATGGGGTAATTCCATATCAAATTCACCTCTACGAATTAAAAAAAATTTTAGGCAATTTACGCGAT AAAATTGACCTTATCAAAGAAAATGAGGATAAGCTGGTTCAACTCTTTGAATTCAGAATACCCTATTATG TGGGCCCACTGAACAAGATTGATGACGGCAAAGAAGGTAAATT CACATGGGCCGTCCGCAAATCCAATG AAAAAATTTACCCATGGAACTTTGAAAATGTAGTAGATATTGAAGCGTCTGCGGAGAAATTTATTCGAA GAATGACTAATAAATGCACTTACTTGATGGGAGAGGATGTTCTGCCTAAAGACAGCTTATTATACAGCA AGTACATGGTTCTAAACGAACTTAACAACGTTAAGTTGGACGGTGAGAAATTAAGTGTAGAATTGAAAC AAAGATTGTATACTGACGTCTTCTGCAAGTACAGAAAAGTGACAGTTAAAAAAATTAAGAATTACTTGA AGTGCGAAGGTATAATTTCTGGAAACGTAGAGATTACTGGTATTGATGGTGATTTCAAAGCATCCCTAA CAGCTTACCACGATTTCAAGGAAATCCTGACAGGAACTGAACT CGCAAAAAAAGATAAAGAAAACATT ATTACTAATATTGTTCTTTTCGGTGATGACAAGAAATTGTTGAAGAAAAGACTGAATAGACTTTACCCCC US 10 , 435 , 714 B2 143 144 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence AGATTACTCCCAATCAACTTAAGAAAATTTGTGCTTTGTCTTACACAGGATGGGGTCGTTTTTCAAAAAA GTTCTTAGAAGAGATTACCGCACCTGATCCAGAAACAGGCGAAGTATGGAATATAATTACCGCCTTATG GGAATCGAACAATAATCTTATGCAACTTCTGAGCAATGAATATCGTTTCATGGAAGAAGTTGAGACTTA CAACATGGGCAAACAGACGAAGACTTTATCCTATGAAACTGTGGAAAATATGTATGTATCACCTTCTGT CAAGAGACAAATTTGGCAAACCTTAAAAATTGTCAAAGAATTAGAAAAGGTAATGAAGGAGTCTCCTA AACGTGTGTTTATTGAAATGGCTAGAGAAAAACAAGAGTCAAAAAGAACCGAGTCAAGAAAGAAGCAG TTAATCGATTTATATAAGGCTTGTAAAAACGAAGAGAAAGATTGGGTTAAAGAATTGGGGGACCAAGA GGAACAAAAACTACGGTCGGATAAGTTGTATTTATACTATACGCAAAAGGGACGATGTATGTATTCCGG CGAGGTAATAGAATTGAAGGATTTATGGGACAATACAAAATATGACATAGACCATATATATCCCCAATC AAAAACGATGGACGATAGCTTGAACAATAGAGTACTCGTGAAAAAAAAATATAATGCGACCAAATCTG ATAAGTATCCTCTGAATGAAAATATCAGACATGAAAGAAAGGGGTTCTGGAAGTCCTTGTTAGATGGTG GGTTTATAAGCAAAGAAAAGTACGAGCGTCTAATAAGAAACACGGAGTTATCGCCAGAAGAACTCGCT GGTTTTATTGAGAGGCAAATCGTGGAAACGAGACAATCTACCAAAGCCGTTGCTGAGATCCTAAAGCAA GTTTTCCCAGAGTCGGAGATTGTCTATGTCAAAGCTGGCACAGTGAGCAGGTTTAGGAAAGACTTCGAA CTATTAAAGGTAAGAGAAGTGAACGATTTACATCACGCAAAGGACGCTTACCTAAATATCGTTGTAGGT AACTCATATTATGTTAAATTTACCAAGAACGCCTCTTGGTTTATAAAGGAGAACCCAGGTAGAACATAT AACCTGAAAAAGATGTTCACCTCTGGTTGGAATATTGAGAGAAACGGAGAAGTCGCATGGGAAGTTGGT AAGAAAGGGACTATAGTGACAGTAAAGCAAATTATGAACAAAAATAATATCCTCGTTACAAGGCAGGT TCATGAAGCAAAGGGCGGCCTTTTTGACCAACAAATTATGAAGAAAGGGAAAGGTCAAATTGCAATAA AAGAAACCGATGAGAGACTAGCGTCAATAGAAAAGTATGGTGGCTATAATAAAGCTGCGGGTGCATAC TTTATGCTTGTTGAATCAAAAGACAAGAAAGGTAAGACTATTAGAACTATAGAATTTATACCCCTGTACC TTAAAAACAAAATTGAATCGGATGAGTCAATCGCGTTAAATTTTCTAGAGAAAGGAAGGGGTTTAAAAG AACCAAAGATCCTGTTAAAAAAGATTAAGATTGACACCTTGTTCGATGTAGATGGATTTAAAATGTGGT TATCTGGCAGAACAGGCGATAGACTTTTGTTTAAGTGCGCTAATCAATTAATTTTGGATGAGAAAATCAT TGTCACAATGAAAAAAATAGTTAAGTTTATTCAGAGAAGACAAGAAAACAGGGAGTTGAAATTATCTGA TAAAGATGGTATCGACAATGAAGTTTTAATGGAAATCTACAATACATTCGTTGATAAACTTGAAAATAC CGTATATCGAATCAGGTTAAGTGAACAAGCCAAAACATTAATTGATAAACAAAAAGAATTTGAAAGGCT ATCACTGGAAGACAAATCCTCCACCCTATTTGAAATTTTGCATATATTCCAGTGCCAATCTTCAGCAGCT AATTTAAAAATGATTGGCGGACCTGGGAAAGCCGGCATCCTAGTGATGAACAATAATATCTCCAAGTGT AACAAAATATCAATTATTAACCAATCTCCGACAGGTATTTTTGAAAATGAAATAGACTTGCTTAAGATAT AAGAAATCATCCTTAGCGAAAGCTAAGGATTTTTTTTATCTGAAATTTATTATATCGCGTTGATTATTGAT GCTGTTTTTAGTTTTAACGGCAATTAATATATGTGTTATTAATTGAATGAATTTTATCATTCATAATAAGT ATGTGTAGGATCAAGCTCAGGTTAAATATTCACTCAGGAAGTTATTACTCAGGAAGCAAAGAGGATTAC AGAATTATCTCATAACAAGTGTTAAGGGATGTTATTTCC SEQ AATCAGGAGAGCGTTTTCAATCCTACCTCTGGCGCAGTTGATATGTCAAACAGGTTGCCGTCACTGCGTC ID TTTTACTGGCTCTTCTCGCTAACCAAACCGGTAACCCCGCTTATTAAAAGCATTCTGTAACAAAGCGGGA NO : CCAAAGCCATGACAAAAACGCGTAACAAAAGTGTCTATAATCACGGCAGAAAAGTCCACATTGATTATT 70 TGCACGGCGTCACACTTTGCTATGCCATAGCATTTTTATCCATAAGATTAGCGGATCCTACCTGACGCTT TTTATCGCAACTCTCTACTGTTTCTCCATACCCGTTTTTTTGGGCTAGCAGTAATACGACTCACTATAGGG GTCTCATCTCGTGTGAGATAGGCGGAGATACGAACTTTAAGAGGAGGATATACCATGCACCATCATCAT CACCATTCTTTCGACTCTTTCACCAACCTGTACTCTCTGTCTAAAACCCTGAAATTCGAAATGCGTCCGGT TGGTAACACCCAGAAAATGCTGGACAACGCGGGTGTTTTCGAAAAAGACAAACTGATCCAGAAAAAAT ACGGTAAAACCAAACCGTACTTCGACCGTCTGCACCGTGAATTCATCGAAGAAGCGCTGACCGGTGTTG AACTGATCGGTCTGGACGAAAACTTCCGTACCCTGGTTGACTGGCAGAAAGACAAAAAAAACAACGTTG CGATGAAAGCGTACGAAAACTCTCTGCAGCGTCTGCGTACCGAAATCGGTAAAATCTTCAACCTGAAAG CGGAAGACTGGGTTAAAAACAAATACCCGATCCTGGGTCTGAAAAACAAAAACACCGACATCCTGTTCG AAGAAGCGGTTTTCGGTATCCTGAAAGCGCGTTACGGTGAAGAAAAAGACACCTTCATCGAAGTTGAAG AAATCGACAAAACCGGTAAATCTAAAATCAACCAGATCTCTATCTTCGACTCTTGGAAAGGTTTCACCG GTTACTTCAAAAAATTCTTCGAAACCCGTAAAAACTTCTACAAAAACGACGGTACCTCTACCGCGATCG CGACCCGTATCATCGACCAGAACCTGAAACGTTTCATCGACAACCTGTCTATCGTTGAATCTGTTCGTCA GAAAGTTGACCTGGCGGAAACCGAAAAATCTTTCTCTATCTCTCTGTCTCAGTTCTTCTCTATCGACTTCT ACAACAAATGCCTGCTGCAGGACGGTATCGACTACTACAACAAAATCATCGGTGGTGAAACCCTGAAAA ACGGTGAAAAACTGATCGGTCTGAACGAACTGATCAACCAGTACCGTCAAACAACAAAGACCAGAAA ATCCCGTTCTTCAAACTGCTGGACAAACAGATCCTGTCTGAAAAAATCCTGTTCCTGGACGAAATCAAA AACGACACCGAACTGATCGAAGCGCTGTCTCAGTTCGCGAAAACCGCGGAAGAAAAAACCAAAATCGT TAAAAAACTGTTCGCGGACTTCGTTGAAAACAACTCTAAATACGACCTGGCGCAGATCTACATCTCTCA GGAAGCGTTCAACACCATCTCTAACAAATGGACCTCTGAAACCGAAACCTTCGCGAAATACCTGTTCGA AGCGATGAAATCTGGTAAACTGGCGAAATACGAAAAAAAAGACAACTCTTACAAATTCCCGGACTTCAT cccc?ac?c?c???c????????Acc??????c?ocAAACAAAAATAC TACAAAATCTCTAAATTCCAGGAAAAAACCAACTGGGAACAGTTCCTGGCGATCTTCCTGTACGAATTC AACTCTCTGTTCTCTGACAAAATCAACACCAAAGACGGTGAAACCAAACAGGTTGGTTACTACCTGTTC GCGAAAGACCTGCACAACCTGATCCTGTCTGAACAGATCGACATCCCGAAAGACTCTAAAGTTACCATC AAAGACTTCGCGGACTCTGTTCTGACCATCTACCAGATGGCGAAATACTTCGCGGTTGAAAAAAAACGT GCGTGGCTGGCGGAATACGAACTGGACTCTTTCTACACCCAGCCGGACACCGGTTACCTGCAGTTCTAC GACAACGCGTACGAAGACATCGTTCAGGTTTACAACAAACTGCGTAACTACCTGACCAAAAAACCGTAC TCTGAAGAAAAATGGAAACTGAACTTCGAAAACTCTACCCTGGCGAACGGTTGGGACAAAAACAAAGA ATCTGACAACTCTGCGGTTATCCTGCAGAAAGGTGGTAAATACTACCTGGGTCTGATCACCAAAGGTCA CAACAAAATCTTCGACGACCGTTTCCAGGAAAAATTCATCGTTGGTATCGAAGGTGGTAAATACGAAAA AATCGTTTACAAATTCTTCCCGGACCAGGCGAAAATGTTCCCGAAAGTTTGCTTCTCTGCGAAAGGTCTG GAATTCTTCCGTCCGTCTGAAGAAATCCTGCGTATCTACAACAACGCGGAATTCAAAAAAGGTGAAACC TACTCTATCGACTCTATGCAGAAACTGATCGACTTCTACAAAGACTGCCTGACCAAATACGAAGGTTGG GCGTGCTACACCTTCCGTCACCTGAAACCGACCGAAGAATACCAGAACAACATCGGTGAATTCTTCCGT GACGTTGCGGAAGACGGTTACCGTATCGACTTCCAGGGTATCTCTGACCAGTACATCCACGAAAAAAAC GAAAAAGGTGAACTGCACCTGTTCGAAATCCACAACAAAGACTGGAACCTGGACAAAGCGCGTGACGG TAAATCTAAAACCACCCAGAAAAACCTGCACACCCTGTACTTCGAATCTCTGTTCTCTAACGACAACGTT US 10 , 435 , 714 B2 145 146 TABLE 6 - continued SEQUENCE LISTING SEQ ID NO : Sequence GTTCAGAACTTCCCGATCAAACTGAACGGTCAGGCGGAAATCTTCTACCGTCCGAAAACCGAAAAAGAC AAACTGGAATCTAAAAAAGACAAAAAAGGTAACAAAGTTATCGACCACAAACGTTACTCTGAAAACAA AATCTTCTTCCACGTTCCGCTGACCCTGAACCGTACCAAAAACGACTCTTACCGTTTCAACGCGCAGATC AACAACTTCCTGGCGAACAACAAAGACATCAACATCATCGGTGTTGACCGTGGTGAAAAACACCTGGTT TACTACTCTGTTATCACCCAGGCGTCTGACATCCTGGAATCTGGTTCTCTGAACGAACTGAACGGTGTTA ACTACGCGGAAAAACTGGGTAAAAAAGCGGAAAACCGTGAACAGGCGCGTCGTGACTGGCAGGACGTT CAGGGTATCAAAGACCTGAAAAAAGGTTACATCTCTCAGGTTGTTCGTAAACTGGCGGACCTGGCGATC AAACACAACGCGATCATCATCCTGGAAGACCTGAACATGCGTTTCAAACAGGTTCGTGGTGGTATCGAA AAATCTATCTACCAGCAGCTGGAAAAAGCGCTGATCGACAAAC TGTCTTTCCTGGTTGACAAAGGTGAA AAAAACCCGGAACAGGCGGGTCACCTGCTGAAAGCGTACCAGCTGTCTGCGCCGTTCGAAACCTTCCAG AAAATGGGTAAACAGACCGGTATCATCTTCTACACCCAGGCGTCTTACACCTCTAAATCTGACCCGGTTA CCGGTTGGCGTCCGCACCTGTACCTGAAATACTTCTCTGCGAAAAAAGCGAAAGACGACATCGCGAAAT TCACCAAAATCGAATTCGTTAACGACCGTTTCGAACTGACCTACGACATCAAAGACTTCCAGCAGGCGA AAGAATACCCGAACAAAACCGTTTGGAAAGTTTGCTCTAACGT TGAACGTTTCCGTTGGGACAAAAACC TGAACCAGAACAAAGGTGGTTACACCCACTACACCAACATCACCGAAAACATCCAGGAACTGTTCACCA AATACGGTATCGACATCACCAAAGACCTGCTGACCCAGATCTCTACCATCGACGAAAAACAGAACACCT CTTTCTTCCGTGACTTCATCTTCTACTTCAACCTGATCTGCCAGATCCGTAACACCGACGACTCTGAAATC GCGAAAAAAAACGGTAAAACGACTTCATCCTGTCTCCGGTTGAACCGTTCTTCGACTCTCGTAAAGAC AACGGTAACAAACTGCCGGAAAACGGTGACGACAACGGTGCGTACAACATCGCGCGTAAAGGTATCGT TATCCTGAACAAAATCTCTCAGTACTCTGAAAAAAACGAAAAC TGCGAAAAAATGAAATGGGGTGACCT GTACGTTTCTAACATCGACTGGGACAACTTCGTTGAAATCATCCTTAGCGAAAGCTAAGGATTTTTTTTA TCTGAAATGTAGGGAGACCCTCAGGTTAAATATTCACTCAGGAAGTTATTACTCAGGAAGCAAAGAGGA ????? SEQ AATCAGGAGAGCGTTTTCAATCCTACCTCTGGCGCAGTTGATATGTCAAACAGGTTGCCGTCACTGCGTC ID TTTTACTGGCTCTTCTCGCTAACCAAACCGGTAACCCCGCTTATTAAAAGCATTCTGTAACAAAGCGGGA NO : CCAAAGCCATGACAAAAACGCGTAACAAAAGTGTCTATAATCACGGCAGAAAAGTCCACATTGATTATT 71 TGCACGGCGTCACACTTTGCTATGCCATAGCATTTTTATCCATAAGATTAGCGGATCCTACCTGACGCTT TTTATCGCAACTCTCTACTGTTTCTCCATACCCGTTTTTTTGGGCTAGCAGTAATACGACTCACTATAGGG GTCTCATCTCGTGTGAGATAGGCGGAGATACGAACTTTAAGAGGAGGATATACCATGCACCATCATCAT CACCATAACAAATTCGAAAACTTCACCGGTCTGTACCCGATCTCTAAAACCCTGCGTTTCGAACTGATCC CGCAGGGTAAAACCCTGGAATACATCGAAAAATCTGAAATCCTGGAAAACGACAACTACCGTGCGGAA AAATACGAAGAAGTTAAAGACATCATCGACGGTTACCACAAATGGTTCATCAACGAAACCCTGCACGAC CTGCACATCAACTGGTCTGAACTGAAAGTTGCGCTGGAAAACAACCGTATCGAAAAATCTGACGCGTCT AAAAAAGAACTGCAGCGTGTTCAGAAAATCAAACGTGAAGAAATCTACAACGCGTTCATCGAACACGA AGCGTTCCAGTACCTGTTCAAAGAAAACCTGCTGTCTGACCTGCTGCCGATCCAGATCGAACAGTCTGA AGACCTGGACGCGGAAAAAAAAAAACAGGCGGTTGAAACCTTCAACCGTTTCTCTACCTACTTCACCGG TTTCCACGAAAACCGTAAAAACATCTACTCTAAAGAAGGTATCTCTACCTCTGTTACCTACCGTATCGTT CACGACAACTTCCCGAAATTCCTGGAAAACATGAAAGTTTTCGAAATCCTGCGTAACGAATGCCCGGAA GTTATCTCTGACACCGCGAACGAACTGGCGCCGTTCATCGACGGTGTTCGTATCGAAGACATCTTCCTGA TCGACTTCTTCAACTCTACCTTCTCTCAGAACGGTATCGACTACTACAACCGTATCCTGGGTGGTGTTACC ACCGAAACCGGTGAAAAATACCGTGGTATCAACGAATTCACCAACCTGTACCGTCAGCAGCACCCGGAA TTCGGTAAATCTAAAAAAGCGACCAAAATGGTTGTTCTGTTCAAACAGATCCTGTCTGACCGTGACACCC TGTCTTTCATCCCGGAAATGTTCGGTAACGACAAACAGGTTCAGAACTCTATCCAGCTGTTCTACAACCG TGAAATCTCTCAGTTCGAAAACGAAGGTGTTAAAACCGACGTTTGCACCGCGCTGGCGACCCTGACCTC TAAAATCGCGGAATTCGACACCGAAAAAATCTACATCCAGCAGCCGGAACTGCCGAACGTTTCTCAGCG TCTGTTCGGTTCTTGGAACGAACTGAACGCGTGCCTGTTCAAATACGCGGAACTGAAATTCGGTACCGCG GAAAAAGTTGCGAACCGTAAAAAAATCGACAAATGGCTGAAAT CTGACCTGTTCTCTTTCACCGAACTG AACAAAGCGCTGGAATTCTCTGGTAAAGACGAACGTATCGAAAACTACTTCTCTGAAACCGGTATCTTC GCGCAGCTGGTTAAAACCGGTTTCGACGAAGCGCAGTCTATCCTGGAAACCGAATACACCTCTGAAGTT CACCTGAAAGACCAGCAGACCGACATCGAAAAAATCAAAACCTTCCTGGACGCGCTGCAGAACCTGAT GCACCTGCTGAAATCTCTGTGCGTTTCTGAAGAAGCGGACCGTGACGCGGCGTTCTACAACGAATTCGA CATGCTGTACAACCAGCTGAAACTGGTTGTTCCGCTGTACAACAAAGTTCGTAACTACATCACCCAGAA ACTGTTCCGTTCTGACAAAATCAAAATCTACTTCGAAAACAAAGGTCAGTTCCTGGGTGGTTGGGTTGAC TCTCAGACCGAAAACTCTGACAACGGTACCCAGGCGGGTGGTTACATCTTCCGTAAAGAAAACGTTATC AACGAATACGACTACTACCTGGGTATCTGCTCTGACCCGAAACTGTTCCGTCGTACCACCATCGTTTCTG AAAACGACCGTTCTTCTTTCGAACGTCTGGACTACTACCAGCTGAAAACCGCGTCTGTTTACGGTAACTC TTACTGCGGTAAACACCCGTACACCGAAGACAAAAACGAACTGGTTAACTCTATCGACCGTTTCGTTCA CCTGTCTGGTAACAACATCCTGATCGAAAAAATCGCGAAAGACAAAGTTAAATCTAACCCGACCACCAA CACCCCGTCTGGTTACCTGAACTTCATCCACCGTGAAGCGCCGAACACCTACGAATGCCTGCTGCAGGA CGAAAACTTCGTTTCTCTGAACCAGCGTGTTGTTTCTGCGCTGAAAGCGACCCTGGCGACCCTGGTTCGT GTTCCGAAAGCGCTGGTTTACGCGAAAAAAGACTACCACCTGTTCTCTGAAATCATCAACGACATCGAC GAACTGTCTTACGAAAAAGCGTTCTCTTACTTCCCGGTTTCTCAGACCGAATTCGAAAACTCTTCTAACC GTACCATCAAACCGCTGCTGCTGTTCAAAATCTCTAACAAAGACCTGTCTTTCGCGGAAAACTTCGAAAA AGGTAACCGTCAGAAAATCGGTAAAAAAAACCTGCACACCCTGTACTTCGAAGCGCTGATGAAAGGTA ACCAGGACACCATCGACATCGGTACCGGTATGGTTTTCCACCGTGTTAAATCTCTGAACTACAACGAAA AAACCCTGAAATACGGTCACCACTCTACCCAGCTGAACGAAAAATTCTCTTACCCGATCATCAAAGACA AACGTTTCGCGTCTGACAAATTCCTGTTCCACCTGTCTACCGAAATCAACTACAAAAAAAACGTAAAC CGCTGAACAACTCTATCATCGAATTCCTGACCAACAACCCGGACATCAACATCATCGGTCTGGACCGTG GTGAACGTCACCTGATCTACCTGACCCTGATCAACCAGAAAGGTGAAATCCTGCGTCAGAAAACCTTCA ACATCGTTGGTAACACCAACTACCACGAAAAACTGAACCAGCGTGAAAAAGAACGTGACAACGCGCGT AAATCTTGGGCGACCATCGGTAAAATCAAAGAACTGAAAGAAGGTTTCCTGTCTCTGGTTATCCACGAA ATCGCGAAAATCATGGTTGAAAACAACGCGATCGTTGTTCTGGAAGACCTGAACTTCGGTTTCAAACGT GGTCGTTTCAAAGTTGAAAAACAGATCTACCAGAAATTCGAAAAAATGCTGATCGACAAACTGAACTAC CTGGTTTTCAAAGACAAAAAAGCGAACGAAGCGGGTGGTGTTC TGAAAGGTTACCAGCTGGCGGAAAA ATTCGAATCTTTCCAGAAAATGGGTAAACAGTCTGGTTTCCTGTTCTACGTTCCGGCGGCGTACACCTCT