(12) United States Patent
Cai et al.
(10) Patent No.: US 10,510,435 B2
(45) Date of Patent: Dec. 17, 2019

(54) ERROR CORRECTION OF MULTIPLEX IMAGING ANALYSIS BY SEQUENTIAL HYBRIDIZATION

(71) Applicant: CALIFORNIA INSTITUTE OF TECHNOLOGY, Pasadena, CA (US)

(72) Inventors: Long Cai, Pasadena, CA (US); Sheel Shah, Pasadena, CA (US); Eric Lubeck, San Francisco, CA (US); Wen Zhou, Pasadena, CA (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 152 days.

(21) Appl. No.: 15/298,219

(22) Filed: Oct. 19, 2016

(65) Prior Publication Data
US 2017/0212983 A1, Jul. 27, 2017

Related U.S. Application Data

(63) Continuation-in-part of application No. 14/435,735, filed as application No. PCT/US2014/036258 on Apr. 30, 2014.

(60) Provisional application No. 61/971,974, filed on Mar. 28, 2014, provisional application No. 61/817,651, filed on Apr. 30, 2013.

(51) Int. Cl.
C12Q 1/68 (2018.01)
C07H 21/00 (2006.01)
G16B 25/00 (2019.01)
G01N 21/78 (2006.01)
G16B 40/00 (2019.01)
C12Q 1/6841 (2018.01)
G01N 21/64 (2006.01)
C12Q 1/6881 (2018.01)

(52) U.S. Cl.
CPC G16B 25/00 (2019.02); C12Q 1/6841 (2013.01); G01N 21/6458 (2013.01); G01N 21/78 (2013.01); G16B 40/00 (2019.02); C12Q 1/6881 (2013.01); C12Q 2600/158 (2013.01)

(58) Field of Classification Search
CPC C12Q 1/68; C07H 21/00
See application file for complete search history.

Primary Examiner: Ethan C Whisenant

(74) Attorney, Agent, or Firm: Squire Patton Boggs (US) LLP

(57) ABSTRACT

Disclosed herein are methods and systems for detecting and/or quantifying cellular targets such as nucleic acids in cells, tissues, organs or organisms. Through sequential barcoding, it is possible to perform high-throughput profiling of a large number of targets, such as transcripts and/or DNA loci. In some embodiments, error correction is implemented through use of barcodes that can tolerate mistakes and missing data during sequential hybridization of probes to selected targets.

20 Claims, 61 Drawing Sheets


[Drawing sheets 1-28 of 61: FIGS. 1-26. Images not reproduced; recoverable labels are listed below.]

FIG. 1: Cell containing mRNA1 through mRNA4; hybridization rounds 1 through 4, each followed by "wash and rehybridize"; marker alignment.
FIG. 2 (panels a-c): Hybridization 1, FISH probes with purple dye; hybridization 2, same probes with blue dye; hybridization 3, same probes with green dye; DNase I stripping and rehybridization of probe set 1; composite four-color FISH images; barcode number scales with the number of hybridizations.
FIG. 3 (panels a, b): Cell with mRNA1 through mRNA3; hyb 1 (FISH probes with purple dye), hyb 2 (same probes with blue dye) through hyb N (same probes with green dye); DNase I strip and rehybridize between rounds; barcode number scales with the number of hybridizations.
FIG. 4: Removal of smFISH probes with DNase I. Axis: intensity ratio (after DNase I / before DNase I).
FIG. 5: Removal of smFISH signal by DNase I and photobleaching. Axis: intensity ratio (after DNase and photobleaching / before DNase).
FIGS. 6, 7: No recoverable labels.
FIG. 8: Axes: displacement x (pixels), displacement y (pixels); 1 pixel = 130 nm; annotation 105.8 nm.
FIGS. 9, 10: Counts for dye combinations of Al532, Al594, Al647 and Cy7.
FIG. 11: Pre-DNase I and post-DNase I images; scale annotation 17.3 um.
FIG. 12: Hybridization #1 and hybridization #2; scale annotation 17.3 um.
FIG. 13: No recoverable labels.
FIG. 14: Actb smFISH (Cy3B); Actb HCR (Alexa 647).
FIG. 15: Hybridization 1 and hybridization 2.
FIG. 16: No recoverable labels.
FIG. 17: Light sheet microscope schematic: Olympus IX81, 20x objective, cylindrical lens, mirrors, light sheet, sample z scanning direction.
FIGS. 18, 19: No recoverable labels.
FIG. 20: Labels largely illegible.
FIG. 21 (panels a, b): ExoIII digestion; bridging strand, intermediate probe strand, fluorophore, target mRNA, new bridging strand.
FIG. 22 (panels a, b): Lambda-exo digestion; bridging strand, intermediate probe strand, fluorophore, target mRNA; wash in new bridging strand.
FIG. 23 (panels a, b): USER digestion; bridging strand, intermediate probe strand, fluorophore, target mRNA; wash in new bridging strand.
FIG. 24: No recoverable labels.
FIG. 25: Probe library preparation: single stranded full library (probe sequence, primer R1 (RC), primer R2 (RC)); sub-library; nick sites; nicking endonuclease; double stranded amplified product (primer F2, probe sequence); denaturing gel electrophoresis; extract and hybridize.
FIG. 26: No recoverable labels.

FIG. 27A (sheet 29 of 61): Workflow overview with three stages: before hybridization, hybridization, and post hybridization. Labeled boxes: probe design; barcode design; sample processing; hybridization; image collection; signal removal and re-hybridization; data analysis; error correction / resistant algorithm.
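The "error correction / resistant algorithm" box of FIG. 27A is not detailed in the figure itself. One possible reading, offered only as a sketch, is a lookup that still identifies a gene when the signal from one hybridization round is missing. The gene names, the three-round codebook, and the function name `decode` below are all hypothetical and chosen for illustration; the example codebook uses a third round equal to (round 1 + round 2) mod 3.

```python
from typing import Dict, Optional, Tuple

# Hypothetical codebook: gene -> barcode (one value per hybridization round).
# The third component is (first + second) mod 3, an assumed check round.
CODEBOOK: Dict[str, Tuple[int, ...]] = {
    "geneA": (0, 1, 1),
    "geneB": (1, 1, 2),
    "geneC": (2, 0, 2),
}


def decode(observed: Tuple[Optional[int], ...],
           codebook: Dict[str, Tuple[int, ...]]) -> Optional[str]:
    """Return the single gene whose barcode agrees with every observed round.

    A round whose signal was missed is passed as None and is ignored in the
    comparison. If no gene or more than one gene matches, None is returned.
    """
    matches = [
        gene for gene, code in codebook.items()
        if all(o is None or o == c for o, c in zip(observed, code))
    ]
    return matches[0] if len(matches) == 1 else None


if __name__ == "__main__":
    print(decode((None, 1, 2), CODEBOOK))  # round 1 dropped
    print(decode((2, None, 2), CODEBOOK))  # round 2 dropped
    print(decode((0, 1, 2), CODEBOOK))     # inconsistent read, no match
```

In this toy codebook every barcode remains distinguishable after any one round is ignored, which is why a single dropped round does not prevent identification; a read that conflicts with the check value is rejected rather than assigned.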

FIG. 27B (sheet 30 of 61): Flowchart of method 2700.

2710: Identify target genes that will be subject to analysis by hybridization experiments.
2720: Select F types of visual signals to represent binding between probes and target sequences in a target gene.
2730: Determine the number of rounds of hybridization (e.g., N ≥ 2).
2740: Create a library of drop-safe unique barcodes (e.g., by implementing one or more error correction mechanisms).
2750: Associate each target gene with a drop-safe unique barcode (e.g., through probe synthesis and N rounds of sequential hybridization on immobilized nucleic acid samples).
2760: Determine the identity and location of target genes in the sample based on the unique barcodes for each target gene.
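As a rough illustration of step 2740, the sketch below builds a small barcode library in which the last round is a check value computed from the other rounds as (a_1*j_1 + ... + C) mod F, following the form given in the Summary, and then tests whether every barcode stays unique when any single round is dropped. The function names, the use of values 0 to F-1 for the F signal types, and the default coefficients of 1 are assumptions made for this example and are not taken from the disclosure.

```python
from itertools import product
from typing import List, Optional, Sequence, Tuple

Barcode = Tuple[int, ...]


def make_barcodes(num_signals: int, data_rounds: int,
                  coeffs: Optional[Sequence[int]] = None,
                  offset: int = 0) -> List[Barcode]:
    """Enumerate barcodes with data_rounds free rounds plus one check round.

    The check value is (sum(a_i * j_i) + offset) mod num_signals.
    """
    if coeffs is None:
        coeffs = [1] * data_rounds  # assumed default, not from the source
    codes = []
    for data in product(range(num_signals), repeat=data_rounds):
        check = (sum(a * j for a, j in zip(coeffs, data)) + offset) % num_signals
        codes.append(data + (check,))
    return codes


def drop_round(code: Barcode, r: int) -> Barcode:
    """Return the barcode with round r removed."""
    return code[:r] + code[r + 1:]


def is_drop_safe(codes: Sequence[Barcode]) -> bool:
    """True if the codes stay pairwise distinct after dropping any one round."""
    n_rounds = len(codes[0])
    return all(
        len({drop_round(c, r) for c in codes}) == len(codes)
        for r in range(n_rounds)
    )


if __name__ == "__main__":
    library = make_barcodes(num_signals=4, data_rounds=3)
    print(len(library), "barcodes of length", len(library[0]))
    print("drop-safe:", is_drop_safe(library))
```

With four signal types and three data rounds this yields 64 barcodes of four rounds each. The uniqueness check passes here because each coefficient is invertible modulo F; a coefficient that shares a factor with F (for example 2 with F = 4) can make two barcodes collide once that round is dropped, so the choice of coefficients matters under this scheme.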

FIG. 28 (sheet 31 of 61): Block diagram of computer system 2800. Labeled components (reference numerals 2810 through 2862; the extraction does not preserve which numeral belongs to which component): power source; CPU; communications circuitry; operating system; file system; applications; data processing application; content management tools; design tools; network application; system administration and monitoring tools; controller; data; databases, comprising a sequence database, image database, probe database, barcode database and result database.

[Drawing sheets 32-61 of 61: FIGS. 29-45. Images not reproduced; recoverable labels are listed below.]

FIG. 29 (panels A-D): Barcoding hybridizations; serial hybridizations.
FIG. 30 (panels A, C, D): Hybridization 1, hybridization 2, hybridization 3.
FIGS. 31, 32: Panel letters only.
FIG. 33 (panels A-P): Cell class; CA1; CA3.
FIG. 34: Panel letters only.
FIG. 35 (panels A-D and further panels): CA1; CA3.
FIG. 36 (panels A-C): Labels largely illegible.
FIG. 37: Gene list with barcode, serial hyb and control hyb annotations (gene names garbled in extraction).
FIG. 38: Cy3B smFISH; 125 gene brain 1; 125 gene brain 2; 29 gene brain.
FIG. 39 (panels A-D): "Dropped".
FIG. 40A: Per-gene bar chart. Axis: counts per cell (gene names garbled in extraction).
FIG. 40B: Per-gene bar chart. Axis: Z score (gene names garbled in extraction).
FIGS. 40C1-40C13: Regional composition; regions include temporal cortex, parietal cortex, DG, CA3 and CA1.
FIG. 40D: Axes: principal components, principal component values.
FIG. 40E: Axes: gene expression correlation, spatial localization correlation.
FIG. 40F: Axes: PCA1, PCA2; points labeled by class number.
FIGS. 41A-41M: Class 1 through class 13. Axes: cells sampled (1000 to 6000), assignment accuracy.
FIG. 42 (panels A-C): No recoverable labels.
FIGS. 43A1-43A3: Per-gene bar chart. Axis: counts per cell (gene names garbled in extraction).
FIGS. 43B1-43B3: Per-gene bar chart. Axis: Z score (gene names garbled in extraction).
FIG. 43C: Regional composition; regions include CA1d, CA1v and CA3v; classes 1 through 18.
FIG. 43D: Axes: smHCR, seqFISH (40 to 160).
FIG. 43E: No recoverable labels.
FIGS. 43F1, 43F2: Regions CA1d, CA1, CA1v, CA3; 43F2 is labeled "Brain 2".
FIG. 43G: Regions CA1d, CA1, CA1v, DG; classes 1 through 18.
FIG. 44 (panels A, B): CA1.
FIG. 45 (panel C): Labels largely illegible.

ERROR CORRECTION OF MULTIPLEX IMAGING ANALYSIS BY SEQUENTIAL HYBRIDIZATION

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is a continuation-in-part of U.S. patent application Ser. No. 14/435,735, filed on Apr. 14, 2015 and entitled "MULTIPLEX LABELING OF MOLECULES BY SEQUENTIAL HYBRIDIZATION BARCODING," which is a National Stage Entry of International Application No. PCT/US2014/36258 filed Apr. 30, 2014, which in turn claims priority to U.S. Provisional Application Ser. No. 61/817,651, filed Apr. 30, 2013, and 61/971,974, filed Mar. 28, 2014, each of which is hereby incorporated by reference herein in its entirety.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH

This invention was made with government support under Grant No. HD075605 and under Grant No. OD008530 awarded by the National Institutes of Health. The government has certain rights in the invention.

FIELD OF THE INVENTION

The invention disclosed herein generally relates to sequential hybridization methods for identifying/quantitating cellular species such as nucleic acids. More specifically, disclosed herein are methods for efficient error reduction.

BACKGROUND OF THE INVENTION

Transcription profiling of cells is essential for many purposes. Microscopy imaging which can resolve multiple mRNAs in single cells can provide valuable information regarding transcript abundance and localization, which are important for understanding the molecular basis of cell identity and developing treatment for diseases. Therefore, there is a need for new and improved methods for profiling transcripts in cells by, for example, microscopy imaging.

SUMMARY OF THE INVENTION

In one aspect, disclosed herein is a sequential hybridization method that comprises the steps of: identifying a plurality of target genes; and associating, via sequential hybridization of binding probes to the plurality of target genes, a first plurality of unique codes with the plurality of target genes, where each target gene in the plurality of target genes is represented by a unique code in the first plurality of unique codes, where the sequential hybridization comprises n rounds of hybridization (where n ≥ 2). Here, each round of hybridization in n rounds of hybridization in turn comprises the steps of contacting the plurality of target genes with a plurality of binding probes, where each probe in the plurality of binding probes comprises: a binding sequence that specifically binds a target sequence in a gene in the plurality of target genes, where target genes from the plurality of target genes are spatially transfixed from each other, and where each probe is capable of emitting a detectable visual signal upon binding of the probe to a target sequence; detecting visual signals that reflect the binding between the plurality of binding probes and the plurality of target genes; and removing the visual signals, when applicable, prior to the next round of hybridization. In some embodiments, probes used in the n rounds of hybridization are capable of emitting at least F types of detectable visual signals (where F ≥ 2 and F^n is greater than the number of target genes in the plurality of target genes). In some embodiments, a unique code in the first plurality of unique codes for a target gene consists of n components. In some embodiments, each component is determined by visual signals that reflect the binding between binding probes and the target gene during one of the n rounds of hybridization. In some embodiments, the n rounds of hybridization include m error correction rounds (m ≥ 1). In some embodiments, a second plurality of unique codes for the plurality of target genes is generated after the m error correction rounds are removed from the n rounds of hybridization. In some embodiments, each unique code in the second plurality of unique codes consists of (n-m) components and uniquely represents a target gene in the plurality of target genes.

In some embodiments, the plurality of target genes are located on immobilized nucleic acids selected from the group consisting of mRNAs, chromosomal DNAs and combinations thereof. In some embodiments, n is 4 or greater, 5 or greater, or 10 or greater. In some embodiments, the m error correction round comprises one round of the n rounds of hybridization. In some embodiments, the one round of the n rounds of hybridization is a repeat of one of the remaining one or more (n-1) rounds of the n rounds of hybridization. In some embodiments, m ≤ 0.5n.

In some embodiments, the at least F types of detectable visual signals comprise one selected from the group consisting of a fluorescence signal, a color signal, a red signal, a green signal, a yellow signal, a combined color signal representing two or more colors, and combinations thereof.

In some embodiments, a probe in the plurality of binding probes further comprises a signal moiety that emits a detectable visual signal upon binding of the probe to a target sequence.

In some embodiments, the signal moiety is connected to the binding sequence of the probe via a cleavable linker.

In some embodiments, each component of an n-component unique code in the first plurality of unique codes is assigned a numerical value that corresponds to one of the at least F types of detectable visual signals; and where at least one component of the n-component unique code is determined based on the numerical values of all or some of the other n-1 components. In some embodiments, the n-component unique code is determined as:

$$\{ j_1, j_2, \ldots, (a_1 j_1 + a_2 j_2 + \cdots + a_n j_n + C) \bmod F, \ldots, j_n \},$$

where $j_1$ is a numerical value that corresponds to the detectable visual signals used in the first round of hybridization, $j_2$ is a numerical value that corresponds to the detectable visual signals used in the second round of hybridization, and $j_n$ is a numerical value that corresponds to the detectable visual signals used in the nth round of hybridization; and where $j_1, j_2, \ldots, j_n$, $a_1, a_2, \ldots, a_n$ and n are non-zero integers and C is an integer.

In some embodiments, m, n, F, i, j and k are all integers.

In one aspect, disclosed herein is a hybridization method that comprises the steps of: identifying a plurality of target genes; performing sequential hybridization of binding probes to the plurality of target genes, where the sequential hybridization comprises n rounds of hybridization (where n ≥ 2). Here, each round of hybridization in n rounds of hybridization in turn comprises: contacting the plurality of target genes with a plurality of binding probes, where each probe in the plurality of binding probes comprises: a binding sequence that specifically binds a target sequence in a gene in the plurality of target genes, where target genes from the plurality of target genes are spatially transfixed from each other, and where each probe is capable of emitting a detectable visual signal upon binding of the probe to a target sequence; detecting visual signals that reflect the binding between the plurality of binding probes and the plurality of target genes, where each target gene in the plurality of target genes is represented by visual signals that are unique for the target gene, and where probes used in the n rounds of hybridization are capable of emitting at least F types of detectable visual signals (where F ≥ 2, and F^n is greater than the number of target genes in the plurality of target genes); and removing the visual signals, when applicable, prior to the next round of hybridization; and performing serial hybridizations against one or more serial target genes, where the expression level of each serial target gene is above a predetermined threshold value, and where each serial hybridization in turn comprises: contacting the one or more serial target genes with a plurality of binding probes, where each probe in the plurality of binding probes comprises: a binding sequence that specifically binds a target sequence in a serial target gene in the one or more serial target genes, where one or more serial target genes are spatially transfixed from each other, where each probe is capable of emitting a detectable visual signal upon binding of the probe to the target sequence, and where probes binding to target sequences in the same serial target gene emit the same detectable visual signals; and detecting visual signals that reflect the binding between the plurality of binding probes and the one or more serial target genes.

In some embodiments, the n rounds of hybridization generate a first plurality of unique codes, where each target gene in the plurality of target genes is represented by a unique code in the first plurality of unique codes.

In some embodiments, a unique code in the first plurality of unique codes for a target gene consists of n components, and each component is determined by visual signals that reflect the binding between binding probes and the target gene during one of the n rounds of hybridization.

In some embodiments, the n rounds of hybridization include m error correction rounds (m ≥ 1), where a second plurality of unique codes for the plurality of target genes is generated after the m error correction rounds are removed from the n rounds of hybridization, and where each unique code in the second plurality of unique codes consists of (n-m) components and uniquely represents a target gene in the plurality of target genes.

In some embodiments, the method of hybridization further comprises the step of: identifying the one or more serial target genes based on expression levels of candidate target genes.

In some embodiments, the plurality of target genes are located on immobilized nucleic acids selected from the group consisting of mRNAs, chromosomal DNAs and combinations thereof.

[...]

values of all or some of the other n-1 components. In some embodiments, the n-component unique code is determined as:

$$\{ j_1, j_2, \ldots, (a_1 j_1 + a_2 j_2 + \cdots + a_n j_n + C) \bmod F, \ldots, j_n \},$$

where $j_1$ is a numerical value that corresponds to the detectable visual signals used in the first round of hybridization, $j_2$ is a numerical value that corresponds to the detectable visual signals used in the second round of hybridization, and $j_n$ is a numerical value that corresponds to the detectable visual signals used in the nth round of hybridization, and where $j_1, j_2, \ldots, j_n$, $a_1, a_2, \ldots, a_n$ are non-zero integers and C is an integer.

In one aspect, disclosed herein is a non-transitory computer-readable medium containing instructions that, when executed by a computer processor, cause the computer processor to: associate, via sequential hybridization of binding probes to a plurality of target genes, a first plurality of unique codes with the plurality of target genes, where each target gene in the plurality of target genes is represented by a unique code in the first plurality of unique codes, where the sequential hybridization comprises n rounds of hybridization (where n ≥ 2). Here, each round of hybridization in n rounds of hybridization in turn comprises: contacting the plurality of target genes with a plurality of binding probes, where each probe in the plurality of binding probes comprises: a binding sequence that specifically binds a target sequence in a gene in the plurality of target genes, where target genes from the plurality of target genes are spatially transfixed from each other, and where each probe is capable of emitting a detectable visual signal upon binding of the probe to a target sequence; detecting visual signals that reflect the binding between the plurality of binding probes and the plurality of target genes; and removing the visual signals, when applicable, prior to the next round of hybridization.

In some embodiments, probes used in the n rounds of hybridization are capable of emitting at least F types of detectable visual signals (where F ≥ 2 and F^n is greater than the number of target genes in the plurality of target genes). In some embodiments, a unique code in the first plurality of unique codes for a target gene consists of n components. In some embodiments, each component is determined by visual signals that reflect the binding between binding probes and the target gene during one of the n rounds of hybridization. In some embodiments, the n rounds of hybridization include m error correction rounds (m ≥ 1). In some embodiments, a second plurality of unique codes for the plurality of target genes is generated after the m error correction rounds are removed from the n rounds of hybridization. In some embodiments, each unique code in the second plurality of unique codes consists of (n-m) components and uniquely represents a target gene in the plurality of target genes.

In one aspect, disclosed herein is a non-transitory computer-readable medium containing instructions that, when executed by a computer processor, cause the computer
processor to : perform sequential hybridization of binding In some embodiments , the one or more serial target genes probes to a plurality of target genes, where the sequential are located on immobilized nucleic acids selected from the hybridization comprises n rounds of hybridization (where group consisting ofmRNAs , chromosomal DNAs and com n22 ). binations thereof. 60 Here , each round of hybridization in n rounds of hybrid In some embodiments , each unique code in the first ization comprises : contacting the plurality of target genes plurality of unique codes consists of n component, where with a plurality of binding probes , where each probe in the each component of a n - component unique code in the first plurality of binding probes comprises: a binding sequence plurality of unique codes is assigned a numerical value that that specifically binds a target sequence in a gene in the corresponds to one of the at least F types of detectable visual 65 plurality of target genes , where target genes from the plu signals ; and where at least one component of the n -compo rality of target genes are spatially transfixed from each other, nent unique code is determined based on the numerical and where each probe is capable of emitting a detectable US 10,510,435 B2 5 6 visual signal upon binding of the probe to a target sequence ; compared sequences . A sequence which is “ unrelated ” or detecting visual signals that reflect the binding between the " non - homologous ” shares less than 40 % identity , less than plurality of binding probes and the plurality of target genes , 35 % identity, less than 30 % identity , or less than 25 % where each target gene in the plurality of target genes is identity with a sequence described herein . 
In comparing two represented by visual signals that are unique for the target 5 sequences, the absence of residues (amino acids or nucleic gene, and where probes used in the n rounds ofhybridization acids ) or presence of extra residues also decreases the are capable of emitting at least F types of detectable visual identity and homology / similarity . signals (where F22, and F " is greater than the number of In some embodiments , the term “ homology ” describes a target genes in the plurality of target genes) ; and removing mathematically based comparison of sequence similarities the visual signals , when applicable , prior to the next round 10 which is used to identify genes with similar functions or of hybridization ; and perform serial hybridizations against motifs . The nucleic acid sequences described herein can be one or more serial target genes , where the expression level used as a “ query sequence ” to perform a search against of each serial target gene is above a predetermined threshold public databases, for example , to identify other family value, where each serial hybridization comprises : contacting members , related sequences or homologs. In some embodi the one or more serial target genes with a plurality of binding 15 ments , such searches can be performed using the NBLAST probes , where each probe in the plurality of binding probes and XBLAST programs (version 2.0 ) of Altschul , et al . comprises: a binding sequence that specifically binds a (1990 ) J. Mol. Biol. 215 :403-10 . In some embodiments , target sequence in a serial target gene in the one or more BLAST nucleotide searches can be performed with the serial target genes, where one or more serial target genes are NBLAST program , score = 100 , word length = 12 to obtain spatially transfixed from each other, where each probe is 20 nucleotide sequences homologous to nucleic acid molecules capable of emitting a detectable visual signal upon binding of the invention . 
In some embodiments , to obtain gapped of the probe to the target sequence, and where probes alignments for comparison purposes , Gapped BLAST can binding to target sequences in the same serial target gene be utilized as described in Altschul et al ., ( 1997 ) Nucleic emit the same detectable visual signals ; and detecting visual Acids Res. 25 ( 17 ) :3389-3402 . When utilizing BLAST and signals that reflect the binding between the plurality of 25 Gapped BLAST programs , the default parameters of the binding probes and the one or more serial target gene . respective programs ( e.g. , XBLAST and BLAST ) can be In any of the embodiments disclosed herein , m , n , F , i , j used (See www.ncbi.nlm.nih.gov ). and k are all integers . Embodiments disclosed herein can be Identity : As used herein , “ identity ” means the percentage applied individually or in combination in any aspect dis of identical nucleotide residues at corresponding positions in closed herein . 30 two or more sequences when the sequences are aligned to maximize sequence matching , i.e., taking into account gaps Definitions and insertions. Identity can be readily calculated by known m ds, including but not limited to those described in Animal : As used herein , the term “ animal” refers to any ( Computational Molecular Biology, Lesk , A. M., ed ., member of the animal kingdom . In some embodiments , 35 Oxford University Press , New York , 1988 ; Biocomputing : " animal” refers to humans, at any stage of development. In Informatics and Genome Projects , Smith , D. W., ed ., Aca some embodiments , “ animal” refers to non -human animals , demic Press , New York , 1993 ; Computer Analysis of at any stage of development. In certain embodiments , the Sequence Data , Part I, Griffin , A.M. , and Griffin , H.G. , eds. 
, non -human animal is a mammal ( e.g., a rodent , a mouse , a Humana Press , New Jersey , 1994 ; Sequence Analysis in rat, a rabbit, a monkey , a dog , a cat , a sheep , cattle , a primate , 40 Molecular Biology, von Heinje , G., Academic Press , 1987; and / or a pig ). In some embodiments , animals include, but and Sequence Analysis Primer , Gribskov , M. and Devereux , are not limited to , mammals , birds , reptiles, amphibians, J. , eds. , M Stockton Press, New York , 1991 ; and Carillo , H., fish , and /or worms. In some embodiments , an animalmay be and Lipman , D., SIAM J. Applied Math . , 48 : 1073 ( 1988 ) . a transgenic animal , a genetically - engineered animal , and /or Methods to determine identity are designed to give the a clone. 45 largest match between the sequences tested . Moreover, Approximately : As used herein , the terms " approxi methods to determine identity are codified in publicly avail mately ” or “ about” in reference to a number are generally able computer programs. Computer program methods to taken to include numbers that fall within a range of 5 % , determine identity between two sequences include, but are 10 % , 15 % , or 20 % in either direction (greater than or less not limited to , the GCG program package (Devereux , J. , et than ) of the number unless otherwise stated or otherwise 50 al. , Nucleic Acids Research 12 (1 ): 387 (1984 )) , BLASTP, evident from the context ( except where such number would BLASTN , and FASTA (Altschul , S. F. et al ., J. Molec . Biol. be less than 0 % or exceed 100 % of a possible value ) . In 215 : 403-410 ( 1990 ) and Altschul et al . Nuc . Acids Res . 25 : some embodiments , use of the term “ about” in reference to 3389-3402 (1997 ) ). The BLAST X program is publicly dosages means + 5 mg/ kg /day . available from NCBI and other sources (BLAST Manual, Homology : “ Homology" or " identity ” or “ similarity ” 55 Altschul, S., et al ., NCBI NLM NIH Bethesda , Md. 
20894 ; refers to sequence similarity between two nucleic acid Altschul, S., et al. , J. Mol. Biol. 215 : 403-410 ( 1990 ) . The molecules . Homology and identity can each be determined well - known Smith Waterman algorithm can also be used to by comparing a position in each sequence which can be determine identity . aligned for purposes of comparison . When an equivalent In vitro : As used herein , the term " in vitro ” refers to position in the compared sequences is occupied by the same 60 events that occur in an artificial environment, e.g., in a test base , then the molecules are identical at that position ; when tube or reaction vessel, in cell culture , etc. , rather than within the equivalent site occupied by the same or a similar nucleic an organism ( e.g. , animal , plant, and / or microbe ). acid residue ( e.g., similar in steric and/ or electronic nature ), In vivo : As used herein , the term “ in vivo ” refers to events then the molecules can be referred to as homologous (simi that occur within an organism ( e.g. , animal, plant, and /or lar ) at that position . Expression as a percentage of homol- 65 microbe ) . ogy / similarity or identity refers to a function of the number Oligonucleotide: the term “ oligonucleotide” refers to a of identical or similar nucleic acids at positions shared by the polymer or oligomer of nucleotide monomers , containing US 10,510,435 B2 7 8 any combination ofnucleobases , modified nucleobases, sug some embodiments , as will be clear from context, the term ars , modified sugars , phosphate bridges , or modified bridges . " sample ” refers to a preparation that is obtained by process Oligonucleotides of the present invention can be of vari ing ( e.g., by removing one or more components of and /or by ous lengths . In particular embodiments , oligonucleotides adding one or more agents to ) a primary sample. For can range from about 2 to about 200 nucleotides in length . 5 example , filtering using a semi- permeable membrane . 
Such In various related embodiments , oligonucleotides, single a “ processed sample ” may comprise , for example nucleic stranded , double - stranded , and triple -stranded , can range in acids or extracted from a sample or obtained by length from about 4 to about 10 nucleotides , from about 10 subjecting a primary sample to techniques such as amplifi to about 50 nucleotides, from about 20 to about 50 nucleo cation or reverse transcription of mRNA , isolation and / or tides , from about 15 to about 30 nucleotides , from about 20 10 purification of certain components , etc. to about 30 nucleotides in length . In some embodiments , the Subject : As used herein , the term “ subject” or “ test oligonucleotide is from about 9 to about 39 nucleotides in subject ” refers to any organism to which a provided com length . In some embodiments , the oligonucleotide is at least pound or composition is administered in accordance with the 4 nucleotides in length . In some embodiments , the oligo present invention e.g., for experimental, diagnostic , prophy nucleotide is at least 5 nucleotides in length. In some 15 lactic , and / or therapeutic purposes . Typical subjects include embodiments , the oligonucleotide is at least 6 nucleotides in animals ( e.g., mammals such as mice , rats , rabbits, non length . In some embodiments , the oligonucleotide is at least human primates , and humans ; insects ; worms; etc. ) and 7 nucleotides in length . In some embodiments , the oligo plants . In some embodiments , a subject may be suffering nucleotide is at least 8 nucleotides in length . In some from , and / or susceptible to a disease , disorder , and / or con embodiments , the oligonucleotide is at least 9 nucleotides in 20 dition . length . In some embodiments , the oligonucleotide is at least Substantially : As used herein , the term “ substantially ” 10 nucleotides in length . 
In some embodiments , the oligo refers to the qualitative condition of exhibiting total or nucleotide is at least 11 nucleotides in length . In some near- total extent or degree of a characteristic or property of embodiments , the oligonucleotide is at least 12 nucleotides interest. One of ordinary skill in the biological arts will in length . In some embodiments , the oligonucleotide is at 25 understand that biological and chemical phenomena rarely , least 15 nucleotides in length . In some embodiments , the if ever, go to completion and /or proceed to completeness or oligonucleotide is at least 20 nucleotides in length . In some achieve or avoid an absolute result . The term “ substantially ” embodiments , the oligonucleotide is at least 25 nucleotides is therefore used herein to capture the potential lack of in length . In some embodiments , the oligonucleotide is at completeness inherent in many biological and / or chemical least 30 nucleotides in length . In some embodiments , the 30 phenomena . oligonucleotide is a duplex of complementary strands of at Suffering from : An individual who is " suffering from ” a least 18 nucleotides in length . In some embodiments , the disease , disorder , and /or condition has been diagnosed with oligonucleotide is a duplex of complementary strands of at and /or displays one or more symptoms of a disease , disorder, least 21 nucleotides in length . and / or condition . Predetermined : By predetermined is meant deliberately 35 Susceptible to : An individual who is “ susceptible to ” a selected , for example as opposed to randomly occurring or disease , disorder , and /or condition is one who has a higher achieved . A composition that may contain certain individual risk of developing the disease, disorder, and/ or condition oligonucleotides because they happen to have been gener than does a member of the general public . 
In some embodi ated through a process that cannot be controlled to inten ments , an individual who is susceptible to a disease , disorder tionally generate the particular oligonucleotides is not a 40 and /or condition may not have been diagnosed with the “ predetermined ” composition . In some embodiments , a pre disease , disorder, and/ or condition . In some embodiments , determined composition is one that can be intentionally an individual who is susceptible to a disease , disorder, reproduced ( e.g. , through repetition of a controlled process ). and /or condition may exhibit symptoms of the disease , Sample : As used herein , the term “ sample ” refers to a disorder , and / or condition . In some embodiments , an indi biological sample obtained or derived from a source of 45 vidual who is susceptible to a disease , disorder, and /or interest, as described herein . In some embodiments , a source condition may not exhibit symptoms of the disease, disorder , of interest comprises an organism , such as an animal or and /or condition . In some embodiments , an individual who human . In some embodiments , a biological sample com is susceptible to a disease , disorder, and /or condition will prises biological tissue or fluid . In some embodiments , a develop the disease , disorder , and / or condition . In some biological sample is or comprises bone marrow ; blood ; 50 embodiments , an individual who is susceptible to a disease , blood cells ; ascites ; tissue or fine needle biopsy samples ; disorder, and / or condition will not develop the disease , cell - containing body fluids, free floating nucleic acids ; spu disorder, and /or condition . 
tum ; saliva ; urine ; cerebrospinal fluid , peritoneal fluid ; pleu Treat: As used herein , the term “ treat, ” “ treatment, ” or ral fluid ; feces; lymph ; gynecological fluids; skin swabs ; “ treating ” refers to any method used to partially or com vaginal swabs; oral swabs; nasal swabs; washings or lavages 55 pletely alleviate , ameliorate , relieve , inhibit , prevent, delay such as a ductal lavages or broncheoalveolar lavages ; aspi onset of, reduce severity of, and /or reduce incidence of one rates ; scrapings; bone marrow specimens; tissue biopsy or more symptoms or features of a disease , disorder, and /or specimens ; surgical specimens; feces , other body fluids , condition . Treatment may be administered to a subject who secretions, and /or excretions ; and /or cells therefrom , etc. In does not exhibit signs of a disease, disorder, and /or condi some embodiments , a biological sample is or comprises 60 tion . In some embodiments , treatment may be administered cells obtained from an individual . In some embodiments , a to a subject who exhibits only early signs of the disease, sample is a “ primary sample ” obtained directly from a disorder, and /or condition , for example for the purpose of source of interest by any appropriate means. For example , in decreasing the risk of developing pathology associated with some embodiments , a primary biological sample is obtained the disease , disorder, and /or condition . by methods selected from the group consisting of biopsy 65 Wild -type : As used herein , the term " wild - type” has its ( e.g. , fine needle aspiration or tissue biopsy ), surgery , col art -understood meaning that refers to an entity having a lection of body fluid ( e.g. , blood , lymph , feces etc.) , etc. In structure and /or activity as found in nature in a “ normal” (as US 10,510,435 B2 9 10 contrasted with mutant, diseased , altered , etc. ) state or co - localization through all three hybridizations . 77.9 : 5.6 % context. 
Those of ordinary skill in the art will appreciate that of barcodes reoccur. n = 37 cells . wild type genes and polypeptides often exist in multiple FIG . 8. Point -wise displacement between FISH points in different forms ( e.g., alleles) . Hybridizations 1 and 3. Point -wise displacement between 5 FISH points in Hybridizations 1 and 3. FISH dots in the Cy5 BRIEF DESCRIPTION OF THE DRAWINGS images in Hybridization 1 and 3 were extracted , fitted with 2D Gaussians. The point -wise displacements were shown in FIG . 1.Methodologies provided by the present disclosure the 3D histogram . The standard deviation was 105.8 nm , are represented in FIG . 1. indicating that mRNAs can be localized to 100 nm between FIG . 2. Exemplary sequential barcoding of provided 10 2 rounds of hybridizations. n = 1199 spots . methods . (a ) Schematic of sequential barcoding . In each FIG . 9. Barcodes identified between repeat hybridizations round of hybridization , multiple probes (e.g. , 24 ) were of the same probe set (hybridization 1 and 3 ) . Barcodes hybridized on each transcript, imaged and then stripped by identified between repeat hybridizations of the same probe DNase I treatment. The same probe sequences could be used set (hybridization 1 and 3) . Barcodes were identified by in different rounds of hybridization , but probes were coupled 15 co - localization between the hybridizations. Each column to different fluorophores . (b ) Composite four - color FISH corresponds to an individual cell. Each row corresponds to Data from 3 rounds of hybridizations on multiple yeast cells . a specific barcode identified between hybridization 1 and 3 . Twelve genes were encoded by 2 rounds of hybridization , Bolded row names correspond to repeated color barcodes with the third hybridization using the same probes as hybrid- 20 that should co - localize between hybridization 1 and 3 . ization 1. 
The boxed regions were magnified in the bottom Non - bolded row names correspond to false positive bar right corner of each image. The matching spots were shown codes . For example , a large number of barcodes were and barcodes were extracted . Spots without co - localization , detected for ( Alexa 532 , Alexa 532 ) , indicating co - localiza without the intention to be limited by theory , could be due tion of spots in the Alexa 532 channels . n = 37 cells . to nonspecific binding of probes in the cell as well as 25 A1532 = Alexa 532. A1594 = Alexa 594. A1647 = Alexa 647 . mis -hybridization . The number of each barcode were quan FIG . 10. Single cell mRNA levels from barcode extrac tified to provide the abundances of the corresponding tran tion . Single cell mRNA levels from barcode extraction . scripts in single cells. ( c ) Exemplary barcodes . mRNA 1 : Barcodes were identified by co - localization between hybrid Yellow - Blue - Yellow ; mRNA 2 : Green - Purple - Green ; izations 1 and 2. Each column corresponds to an individual mRNA 3 : Purple -Blue -Purple ; and mRNA 4 : Blue- Purple- 30 cell. n = 37 cells . A1532 = Alexa 532. A1594 = Alexa 594 . Blue . A1647 = Alexa 647. From top to bottom : YLR194c , CMK2, FIG . 3. Schematic of sequential hybridization and barcod GYP7, PMC1, NPT1 , SOK2, UIP3, RCN2, DOA1, HSP30 , ing . ( a ) Schematic of sequential hybridization and barcod PUN1 and YPS1 . ing . ( b ) Schematic of the FISH images of the cell. In each FIG . 11. DNase I stripping of Nanog Alexa 647 probes in round of hybridization , the same spots were detected , but the 35 mouse embryonic stem cells (mESCs ) . DNase I stripping of dye associated with the transcript changes. The identity of an Nanog Alexa 647 probes in mouse embryonic stem cells mRNA was encoded in the temporal sequence of dyes (mESCs ). Forty - eight probes targeting Nanog were hybrid hybridized . ized in mESCs. Probes were stripped off by 30 minutes of FIG . 
DNase I efficiently removes smFISH probes DNase I incubation at a concentration of 3 Units /uL . bound to mRNA . DNase I efficiently removes smFISH 40 FIG . 12. Re -Hybridization of Nanog mRNA in Mouse probes bound to mRNA . Spots were imaged before and after Embryonic Stem Cells (mESCs ) . Re -Hybridization of a 4 hour DNase I treatment in anti- bleaching buffer. The Nanog mRNA in Mouse Embryonic Stem Cells (mESCs ) . mean , median and STD of the intensity ratio after treatment Probes were stripped off by 30 minutes of DNase I incuba were 11.5 % , 8.3 % and 11 % . The ratio of the spot intensities tion at a concentration of 3 Units /uL . Nanog Alexa 647 after and before DNase I treatment was plotted for each spot. 45 probes were re - hybridized for 12 hours and imaged . Images n = 1084 spots. were 2D maximum projections created from z stacks of 11 FIG . 5. Photobleaching removes residual intensity fol images taken every 1.5 um . lowing DNase I treatment. Photobleaching removed residual FIG . 13. HCR detection of ß -actin (red ) in the cortex and intensity following DNase I treatment. Spots were bleached visualized with retrograde tracers ( fluorogold , green ) in a by 10 seconds of excitation following a 4 hour DNase I 50 100 um coronal section . An entire coronal section was treatment. The mean , median and STD of the intensity ratio imaged at both 10x and 60x (magnified inset ) . In the 60x after bleaching were 0.03 % , 0.01 % and 0.049 % . The ratio of image, individual red dots correspond to single ß - actin the spot intensities after and before DNase I treatment was mRNA molecules . B -actin expression can be quantified by plotted for each spot. n = 1286 spots . counting fluorescent foci while simultaneously detecting a FIG . 6. mRNAs are stable over multiple rounds of re- 55 distal sub -population of neurons tagged with the retrograde hybridization . mRNAs were stable over multiple rounds of tracer (green ). re- hybridization . 
The intensity distributions of smFISH FIG . 14. Detectably labeled oligonucleotides labeled with spots were plotted over 6 hybridizations. Two hybridizations HCR were as specific as smFISH probes directly labeled were repeated 3 times to make 6 total hybridizations . Spots with fluorophores in detecting single molecules of RNA in were identified by their co - localization with spots in the next 60 20 um brain slices . HCR probes ( left ) and smFISH probes identical hybridization . For each boxplot the number of ( right) targeted ß -actin simultaneously and co - localized . spots counted was between 191 and 1337 . Note the improved S /N of the HCR . FIG . 7. Fraction of barcodes identified from first two FIG . 15. Detectably labeled oligonucleotides labeled with rounds of hybridization that reoccur in following round of HCR rehybridized well in 20 um brain slices. HCR spots in hybridization per cell. Fraction of barcodes identified from 65 hybl and hyb2 colocalized , with DNase treatment in first two rounds of hybridization that reoccur in following between two hybridizations. This showed that HCR can be round of hybridization per cell. Barcodes were identified by fully integrated with the seqFISH protocol. US 10,510,435 B2 11 12 FIG . 16. CLARITY with Nissl : 1 mm - thick coronal FIG . 22. Hybridization Chain Reaction (HCR ) Re -hybrid section ( Bregma A P , 2.3-1.3 mm ) of a Thy - 1 -eYFP mouse ization Using Lambda Exonuclease (a -exo ) . ( a ) Schematic brain was cleared and stained with Neuro Trace fluorescent representation ofa -exo digestion of bridging strands. 9 -exo Nissl stain ( 1: 100 dilution , 48 hours , RT) . Left, 3d coronal selectively digests 5 ' phosphorylated bridging strands in the rendering of the motor cortex . Right, 100 um -thick section 5 5 ' to 3 ' direction releasing HCR polymers from intermediate of layer V motor cortex . Arrows indicate apical dendrites of probe strands bound to targets , e.g. , mRNAs. 
the pyramidal neurons (Red-Fluorescent Nissl, Green eYFP).

FIG. 17. A schematic display of an exemplary light sheet microscope.

FIG. 18. SPIM detects single mRNAs in 100 µm CLARITY brain slices. The slice was scanned over 100 µm. The images were then registered and stitched to a 3D reconstruction. Diffraction limited dots in the image corresponded to single β-Actin mRNAs detected by HCR. Scale bar is 15 µm.

FIG. 19. A connectivity map of the cortical somatic sensorimotor subnetworks on serial coronal levels of the Allen Reference Atlas. It shows that each of the four major components of somatic sensorimotor areas, SSp, SSs, MOp, and MOs, are divided into 4 functional domains. These functionally correlated domains are extensively interconnected with all others and form four major cortical somatic sensorimotor subnetworks: orofaciopharyngeal (orf, blue), upper limb (ul, green), lower limb and trunk (ll/tr, red), and whisker (bfd.cm & w, yellow). Numbers indicate position of sections relative to bregma (mm). Provided methods can characterize connectivity and molecular identities of projection neurons in each of these distinct domains within different subnetworks.

FIG. 20. Exemplary informatics workflow for automatically detecting and mapping retrogradely labeled neurons and gene barcoding information. A. Raw image with CTb labeling (pink) and Nissl staining (blue). The boxed area shows a close-up view of CTb labeled neurons. B. Multichannel raw images are converted into grayscale for segmentation. C. Individual tracer channel images are run through a segmentation pipeline that discretely separates the tissue background from labeled cells. White dots are reconstructions of labeled somata. D. Reintegrated multi picture tiffs are associated with a coronal section in the ARA for registration. E. Using provided developed registration software, multi picture tiffs are warped to align both the tissue's silhouette and cytoarchitectural delineations to its corresponding ARA level. F. Cells extracted via the segmentation process are spatially normalized and can be associated with a layer- or sub-nucleus-specific ROI within the ARA. G. Segmented and registered labeling reconstructions are made available to the public on the iConnectome FISH viewer, along with their accompanying seqFISH data. An analysis tab provides information about the injection site, tracer type, number of labeled cells by ROI, which can be further disambiguated into layer-specific cell counts, and gene expression by cell.

FIG. 21. Hybridization Chain Reaction (HCR) Re-hybridization Using Exonuclease III (ExoIII). (a) Schematic representation of ExoIII digestion of bridging strands and HCR polymers. ExoIII digests bridging strands and HCR polymers from the 3' to 5' direction into dNMPs, leaving behind intermediate probe strands bound to targets, e.g., mRNAs. A new bridging strand can then be hybridized to target bound probe with a different initiator sequence which initiates polymerization of a different hairpin set with a different fluorescent dye. (b) Raw data illustrating use of the schematic shown in (a) in T3T mouse fibroblast cell line using probes against beta-actin (Actb) transcripts.

Released polymers are washed out with wash buffer. A new bridging strand can then be hybridized to target bound probe with a different initiator sequence which initiates polymerization of a different hairpin set with a different fluorescent dye. (b) Raw data illustrating use of the schematic shown in (a) in T3T mouse fibroblast cell line using probes against beta-actin (Actb) transcripts.

FIG. 23. Hybridization Chain Reaction (HCR) Re-hybridization Using Uracil-Specific Excision Reagent (USER). (a) Schematic representation of USER digestion of bridging strands. USER selectively digests deoxyuridine nucleotides in bridging strands causing bridging strands to become fragmented. Fragments then dissociate from intermediate probe strands releasing HCR polymers from probes bound to targets, e.g., mRNAs. Released polymers are washed out with wash buffer. A new bridging strand can then be hybridized to target bound probe with a different initiator sequence which initiates polymerization of a different hairpin set with a different fluorescent dye. (b) Raw data illustrating use of the schematic shown in (a) in T3T mouse fibroblast cell line using probes against beta-actin (Actb) transcripts.

FIG. 24. Exemplary removal of detectably labeled oligonucleotides using complementary oligonucleotides (cTOE).

FIG. 25. Exemplary oligonucleotide preparation. The original oligonucleotide (as exemplified in this Figure, probe) library contains several probe sub-libraries. Each sub-library has a specific set of primers that can be used to amplify the sub-library using PCR. Once the desired sub-library is amplified, the product is incubated with a nicking enzyme. The enzyme cleaves the phosphodiester bond on the probe strand at its recognition site. Denaturing the resulting product and running it on a denaturing gel allows the desired probe sequence to be released. The probe band can then be cut out of the gel and extracted. The extracted product can be used for hybridization.

FIG. 26. Exemplary oligonucleotide preparation. Product was the third band on gel. The library has many different primers associated with it, one primer set for each subset of probes. Exemplified primers were random 20 nucleotide sequences with a GC content of 45-55% and a Tm of around 55° C. Nicking endonuclease sites were GTCTCNN; corresponding nicking endonuclease is Nt.BsmAI. Product probes were 20mer DNA sequences complementary to mRNA of interest with a GC content between 45-55%.

FIG. 27A illustrates exemplary aspects that may contribute to error correction during a sequential hybridization process.

FIG. 27B illustrates an exemplary process for error correction.

FIG. 28 illustrates an exemplary computer system for implementing the error correction methods disclosed herein.

FIG. 29 depicts an overview of the Sequential barcode FISH (seqFISH) in brain slices. A). A coronal section from a mouse brain was mounted on a slide and imaged in all boxed areas. Each image was taken at 60x magnification. B). Example of barcoding hybridizations from one cell in field from A. The same points are re-probed through a sequence of 4 hybridizations (numbered). The sequence of colors at a given location provides a barcode readout for that mRNA ("barcode composite"). These barcodes are identified through referencing a lookup table abbreviated in D and quantified to obtain single cell expression. In principle, the maximum number of transcripts that can be identified with this approach scales to F^N, where F is the number of fluorophores and N is the number of hybridizations. Error correction adds another round of hybridization. C). Serial smHCR is an alternative detection method where 5 genes are quantified in each hybridization and repeated N times. Serial hybridization scales as F*N. D). Schematic for multiplexing 125 genes in single cells. 100 genes are multiplexed in 4 hybridizations by seqFISH barcoding. This barcode scheme is tolerant to loss of any round of hybridization in the experiment. 25 genes are serially hybridized 5 genes at a time by 5 rounds of hybridization. Each number represents a color channel in single molecule HCR. As a control, 5 genes are measured both by double rounds of smHCR as well as barcoding in the same cell. E). smHCR amplifies signal from individual mRNAs. After imaging, DNAse strips the smHCR probes from the mRNA, enabling rehybridization on the same mRNA (step a). The "color" of an mRNA can be modulated by hybridizing probes that trigger HCR polymers labeled with different dyes (step b). mRNA are amplified following hybridization by adding the complementary hairpin pair (step c). The DNAse smHCR cycle is repeated on the same mRNAs to construct a predefined barcode over time.

FIG. 30 illustrates an example accurate in situ quantification of mRNA levels generated by seqFISH. A). Image of seqFISH barcoding 100 genes in the outer layer of the mouse cortex. RNA dots in the image are z projected over 15 µm. Individual mRNA points are shown across 4 hybridizations in the inset images. White squares correspond to identified barcodes, yellow squares correspond to missing transcripts in a particular hybridization, red squares correspond to spurious false positives and are not counted in any barcode measurements. Numbers in the squares correspond to barcode indices. B). seqFISH correlates with smHCR counts. After barcoding, 5 target mRNAs were measured twice by smHCR in the same cells, providing absolute counts of the transcripts. The two techniques correlate with an R=0.85 and a slope (m) of 0.84 (n=3851 measurements). The 2D histogram intensity shows the distribution of points around the regression line. A high density of points is seen along the regression line. The density falls off steeply around the regression line. C). Error correction results in a median gain of 373 (25%) counts per cell (n=3497). Red and blue curves correspond to the total barcode counts per cell before and after error correction. D). Dropped and off-target barcodes represent a small source of error in seqFISH. 100 on-target barcodes and 525 off-target barcodes are measured per cell. Dropped barcodes are due to at least two overlapping dots appearing within the same region. E). Off-target barcodes are rarely observed and contribute minimally to the expression profile in single cells. Each of the 100 on-target barcodes (blue) and 525 off-target barcodes (red) are quantified per cell. The mean is shown with shaded regions corresponding to 1 SD (N=41 imaged regions).

FIG. 31 depicts an example illustrating that distinct clusters of cells exhibit different regional localization in the brain. A). Gene expression of 14,908 cells presented as a Z-score normalized heatmap. B). Regional compositions of 13 cell clusters are visualized as stacked bar plots with the area corresponding to the number of cells in each region. Hippocampal regions are: CA3, CA1, Dentate Gyrus (DG). Cortical regions: parietal and temporal. Box plots of the Z scores of 21 representative genes are plotted for each cell class. The major tick marks correspond to Z score 0 while every minor tick is a Z score interval of 1. Cell type assignments are shown on the dendrogram. Abbreviations: Hippocampus pyramidal (Hipp), cortex (Cort), Dentate Gyrus (DG), Interneurons (Int), Astrocytes (Astro), Microglia (µGlia). C). Subclusters of cluster 6 cells and their regional localization and gene expression profile displayed under the dendrogram. Subcluster 6.1 is enriched in the CA3, while 6.7 is enriched in the DG. D). Subclusters of cluster 7 cells are shown. Almost all cells are localized in the GCL but have different combinatorial expression profiles. Note Calb1 expression, which marks out granule cell maturation, differs amongst subclusters. E). Any random subset of 25 genes can recapitulate approximately 50% of the information in the correlation amongst cells (red), but a larger number of genes are required to accurately assign cells to cluster using a random forest algorithm (blue) (n=10 bootstrap replicates; shading is 95% CI), indicating that fine structures in the data require quantitative measurements of combinatorial expression of many genes. F). Similar to E, while the first ten PCs explain the coarse structure, a larger number of principal components (PCs) are required to describe the full data. Expected variation (green) and accuracy in predicting cell identity using a random forest model (blue).

FIG. 32 depicts an example embodiment, illustrating spatial layering of cell classes in the Dentate Gyrus (DG). A-B). Suprapyramidal and infrapyramidal blades of DG. Cells of the subgranular zone (SGZ) and granule cell layer (GCL) are arranged in lamina layers in mirror symmetric patterns on the upper and lower blades. C). The SGZ stays on the inner layer of the DG fork. D). Cells are patterned in the crest. Numbered color key corresponds to cluster numbers in FIG. 31B. E). Letters in the cartoon of DG correspond to images. F). 3D image of the fork region shown in C).

FIGS. 33A through 33P depict an example embodiment, illustrating that subregions of the hippocampus are composed of distinct compositions of cell classes based on the first 125 gene experiment. Upper right panel. Cartoon of hippocampus with imaged regions labeled. Color key corresponds to the classes in FIG. 36b. FIGS. 33A-D). These images are regions from the CA1d. Astrocytes (Astro) are marked in image 33A) and a microglia cell (µGlia) is marked in image 33B). Moving along the hippocampus from CA1 dorsal to ventral, cell classes transition from a homogenous dorsal population (33C to 33D) to a mixed population in the CA1 intermediate (33E-33F) to regions of even larger cellular diversity in the CA1 ventral region (33G-I). The dotted line in 33D) marks the transition point of the CA1d to the CA1i. 33E) shows two laterally segregated cell classes (marked by a dotted line) in the CA1i along with cholinergic interneurons (Int) on the interior surface of the CA1i. The ventral (33J-33K) and intermediate CA3 (33L-33M) have similar cell class compositions to the CA1v and CA1i. The two last regions (33O-33P) of the dorsal CA3 show distinct cell class compositions that are relatively homogeneous within a field but are different than other fields of CA3. The cell class composition of field 33P is similar to that of the CA1d, but these cluster 6 cells are grouped into a distinct subcluster.

FIG. 34 depicts an example embodiment, showing mapping of cell types to a second brain slice with 125 genes. Upper right panel. Cartoon of hippocampus with imaged regions labeled. Color key corresponds to the classes in FIG. 31B. A-D). Similar to the cell class compositions shown for the hippocampus in FIG. 33, CA1d in this second coronal section from a second mouse is composed of mostly cluster 6 cells. (E) CA1i region and (F-G) the CA1 ventral regions are again composed of similar cell classes to that shown in FIG. 33 with increasing diversity of cell class compositions from the CA1d to the CA1i to finally the CA1v. (H-J) CA3 regions. (K-M) DG regions showing the same cell classes and layer pattern of the GCL and SGZ shown in FIG. 32.

FIG. 35 depicts an example embodiment, showing mapping of cell types to a third brain slice with 249 genes. Upper right panel. Cartoon of hippocampus with imaged regions labeled. Color key corresponds to the classes in FIG. 43C. A-C). Similar to the slice shown in FIGS. 33 and 34, CA1d is relatively homogenous in cell cluster composition. D-G). Images from the CA1i region show that the cell class composition is different from that of the CA1d. H-K). Again, similar to FIGS. 33 and 34, images from the CA1 ventral regions show a much more complicated cellular composition and a high degree of cellular heterogeneity. L-R). Images from the CA3 region show that the cellular compositions also create 3-4 subregions within the CA3. The cellular heterogeneity of the CA3 subregions mirrors that of the CA1, where the ventral region of the CA3 is very heterogenous while the dorsal region of the CA3 is relatively homogenous. S-T). The DG regions show the distinct SGZ versus GCL layering pattern seen in the previous two brains.

FIG. 36 depicts an example embodiment, showing correlations of the transcription profile across the pyramidal layer. A). mRNA counts in the cell bodies in the Stratum Pyramidale (SP) are grouped within each field of view. A single cell in the Stratum Radiatum (SR) is shown to illustrate individual mRNA localization. Stratum Oriens (SO) is labeled for orientation. B). mRNAs in different subregions of pyramidal layer show both long-distance spatial correlations as well as local correlations between neighboring fields. Both CA1 and Dentate Gyrus (DG) show high regional correlations. Correlation is calculated based on the 125 gene experiment. C). Illustration of regional and long distance correlation patterns observed in B. Correlated regions are colored and long distance correlations are shown as dotted lines with their median correlation coefficient written over the dotted line.

FIG. 37 depicts an example embodiment, showing barcode assignments for all genes in the combined hybridization experiment (FIG. 29). Barcode assignments in the 125-gene seqFISH and serial experiment (FIG. 29). 125 genes are profiled, 100 of which are barcoded and 25 are identified by serial smHCR hybridizations. Five control genes (Hdx, Vps13c, Zfp715, Fbll1, Slc4a8) were quantified by both techniques. The smHCR round of hybridization of control genes was performed twice to co-localize signal to obtain an absolute count.

FIG. 38 depicts an example embodiment, showing smHCR performance metrics as compared to smFISH (related to FIG. 29). A). Raw data of Pgk1 transcripts imaged in a brain slice. The transcript was targeted with 2 HCR probe sets and 1 smFISH probe set, each consisting of 24 oligonucleotide probes. The probe sets were hybridized together and were imaged in 3 different channels. Green circles are transcripts detected in all channels, yellow circles signify transcripts detected in 2 out of 3 channels, and red circles represent signal found in only 1 channel (false positives due to nonspecific binding). These images show that smHCR and smFISH have similar sensitivity, specificity, and spot size. B). Gain of smHCR vs smFISH. The mean gain of smHCR is 22.1 ± 11.55 vs smFISH (n=1338). C). True positive detection rate of smHCR and smFISH per channel. The percent of true positives (transcripts detected with at least 2 out of 3 probe sets) detected with each probe set (n=1338). D). False positive rate of smHCR and smFISH. Percent of total dots in a channel not detected in any other channel for 3 color Pgk1 (n=1338). E). All the regions imaged in the coronal section are boxed. Each box represents a field of 216 µm × 216 µm. The brain section used for FIGS. 32 and 33 is shown on the left. The middle section is used for FIG. 34 and the right section is used for FIG. 35.

FIG. 39 depicts an example embodiment, showing quantitation of seqFISH (related to FIG. 30). A). All control genes show high correlations between seqFISH and smHCR. B). Number of dropped hybridizations from the barcode. Blue bars represent measured probability and the red bars represent inferred values from binomial distribution fitting of measured probability. The ratio of the full barcodes (4 hybridizations) vs 3 hybridization barcodes indicates that transcripts that are mis-hybridized in 2 rounds are rare. Transcripts missed in 2 or more hybridizations (red bars) could not be recovered from the error-correction algorithm and would be dropped from our quantifications (N=2,115,477 total barcodes). C). Intensity of barcode hybridizations over time. All dots belonging to barcodes are quantified in each hybridization and their mean intensity is plotted over time normalized to the first hybridization. 99% CI ratio of mean is plotted as a bar over points, but is not visible due to its small size (n=60143 to 111284 points per channel). D). Barcoding confidence ratio. Barcode classes in D) are compared to a null model of barcode observations where random chance observation should give a ratio of 1. Off target barcodes are observed 0.005 times less than expected, suggesting that seqFISH has high accuracy in correctly counting barcoded transcripts (n=3493 cells). Dark bars on top of bar plots correspond to 99.999% confidence interval determined by bootstrap resampling. E). Comparison of average copy numbers per gene as measured by Zeisel et al. and seqFISH. Single cell RNA-seq underestimates copy numbers compared to seqFISH.

FIGS. 40A through 40F depict an example embodiment, showing gene expression patterns and clustering of the 125-gene dataset (related to FIG. 31). 40A). Overview of 125 gene expression. Plots show the distribution of each transcript in all 14,908 imaged cells. Note the last 25 genes have higher expression and were imaged with serial hybridization. 40B). Violin plots of Z-score distribution for 125 genes. 40C1-40C13). Subcluster hierarchy of each of the 13 clusters identified in FIG. 31B. 40D). PCA eigenvalue analysis of the cell-to-cell correlation matrix. First 125 PCs and their eigenvalues are shown. As observed in FIG. 31, the first 10 PCs explain 59.5% of the variation in the data, while the remaining 115 PCs are needed to explain remaining data. Reflecting this, the eigenvalues of the first 10 components are high, while the remaining eigenvalues are uniform. 40E). Correlation between gene expression and spatial localization. Each dot represents a pair of cell classes and their correlations in gene expression space (x) and spatial localization patterns (y) (N=153 pairwise correlations between classes, R=0.67). Classes that are similar in expression have similar localization patterns. 40F). PCA decomposition separates cells into coherent clusters corresponding to cell classes. Cells are colored according to the clusters displayed in the dendrogram.

FIGS. 41A through 41M depict an example embodiment, showing robustness of cell classes to downsampling of cells (related to FIG. 31). To measure how well cluster assignments perform with a limited number of cells, a random forest model was trained on the cell-to-cell correlation matrix of the 6872 cells in the center field of view. The robustness of the clusters was calculated by applying this model to classify the remaining cells and determining the percent accuracy of correct assignment to the clusters presented in FIG. 31B. While some classes can be assigned accurately even with a small number of cells as the initial training set, several classes require a large number of cells to accurately assign (n=10 bootstrap replicates, S.E.).

FIG. 42 depicts an example embodiment, showing cell-to-cell correlation analysis as a function of dropping genes (related to FIG. 31). A). Clustered gene to gene correlation map for all 125 genes. There are many blocks of highly correlated genes. A few genes do not fall into any blocks. B). The full cell-to-cell correlation map using all genes in the data set. C). Representative cell-to-cell correlation with the indicated number of genes used to construct the matrix indicated above each plot. Dropping genes from the data results in degradation of the fine structure of the correlation map.

FIGS. 43A1 through 43G depict an example embodiment, showing gene expression patterns and clustering of the 249-gene dataset (related to FIG. 35). FIGS. 43A1 through 43A3). Overview of 249-gene expression. Plots show the distribution of each transcript in all 2050 imaged cells in the hippocampus. Note the last 35 genes have higher expression and were imaged with serial hybridization. FIGS. 43B1 through 43B3). Violin plots of Z-score distribution for 249 genes. 43C). Dendrogram with regional localization of the 18 cell clusters for the 249-gene experiment. 43D). Correlation of seqFISH counts to smHCR counts for the 249-gene experiment. The 2D density histogram shows a high density of points around the regression line that fall off towards the edges of the distribution. 43E). Cell-to-cell correlation for all 2050 cells in the 249-gene dataset. 43F). Heatmap of the percentage of each cell class in each region of the hippocampus for both the 125-gene experiments. These heat maps show that in both 125-gene experiments the same cell classes are used in roughly the same proportions in each subregion. 43G). Heat map of the percentage of each cell class in each region of the hippocampus for the 249-gene experiment. The same patterns are seen as the 125 gene experiment (i.e., different regions use different cell classes in varying amounts).

FIG. 44 depicts an example embodiment, showing marker genes expression in the hippocampus (related to FIG. 35). A). The top panel outlines the region of the hippocampus being shown in a yellow box. The images show the raw gene expression patterns seen using smHCR in our data at the dorsal most tip of the CA3 for a representative set of cell identity markers used in the 249 gene experiment. The transcript expression profile is shown in red, Nissl staining is shown in green, and DAPI staining is shown in blue. Each image shown is the full field of view and a maximum intensity projection over 15 µm. B). Set of images showing the distinction between the GCL and SGZ. The GCL shows a high level of Nissl staining and expression of neuronal genes such as slc17a7 and camkII. The SGZ shows an absence of Nissl staining and terminal neuron marker genes. The transcript expression profile is shown in red, Nissl staining is shown in green, and DAPI staining is shown in blue. Each image shown is the full field of view (216 µm × 216 µm) and a maximum intensity projection over 15 µm.

FIG. 45 depicts an example embodiment, showing comparison of seqFISH expression data to Allen Brain Atlas expression data (related to FIG. 36). A). ISH data from the Allen Brain Atlas for genes seen to be enriched in the SGZ in the 125 and 249 gene seqFISH experiments. In the 125 gene experiment, mertk and mfge8 were found to be enriched in the SGZ. In the 249 gene experiment, nfia and sox11 were seen to be enriched in the SGZ. ABA ISH data shows similar patterns to those observed with seqFISH for the SGZ. B-C). Comparison of averaged z-score values per cell from seqFISH to ABA data across hippocampus. B). Amigo2 Z-score profile found across the different fields of the hippocampus using seqFISH is shown on top and the ABA ISH image for Amigo2 is shown on the bottom. C). Gpc4 Z-score profile found across the different fields of the hippocampus using seqFISH is shown on top and ABA ISH image for Gpc4 is shown on the bottom.

DETAILED DESCRIPTION

Among other things, the present invention provides new methods, compositions and/or kits for profiling nucleic acids (e.g., transcripts and/or DNA loci) in cells.

In some embodiments, the present invention provides methods for profiling nucleic acids (e.g., transcripts and/or DNA loci) in cells. In some embodiments, provided methods profile multiple targets in single cells. Provided methods can, among other things, profile a large number of targets (transcripts, DNA loci or combinations thereof), with a limited number of detectable labels through sequential barcoding.

FIG. 1 depicts methodologies in accordance with the present invention. As depicted, the present invention provides methodologies in which multiple rounds of hybridization (contacting steps) with labeled probes profile nucleic acids (e.g., mRNAs) present in cells. Specifically, as depicted in FIG. 1, sets of probes that hybridize with nucleic acid targets in cells are provided, wherein probes (i.e., detectably labeled oligonucleotides that hybridize with different targets) are labeled within a single set and, furthermore, at least one probe is differently labeled in different sets.

In some embodiments, the present invention (e.g., as represented in FIG. 1), provides methods comprising steps of:
(a) performing a first contacting step that involves contacting a cell comprising a plurality of transcripts and DNA loci with a first plurality of detectably labeled oligonucleotides, each of which targets a transcript or DNA locus and is labeled with a detectable moiety, so that the composition comprises at least:
(i) a first oligonucleotide targeting a first transcript or DNA locus and labeled with a first detectable moiety; and
(ii) a second oligonucleotide targeting a second transcript or DNA locus and labeled with a second detectable moiety;
(b) imaging the cell after the first contacting step so that hybridization by oligonucleotides of the first plurality with their targets is detected;
(c) performing a second contacting step that involves contacting the cell with a second plurality of detectably labeled oligonucleotides, which second plurality includes oligonucleotides targeting overlapping transcripts and/or DNA loci that are targeted by the first plurality, so that the second plurality comprises at least:
(i) a third oligonucleotide, optionally identical in sequence to the first oligonucleotide, targeting the first transcript or DNA locus; and
(ii) a fourth oligonucleotide, optionally identical in sequence to the second oligonucleotide, targeting the second transcript or DNA locus,
wherein the second plurality differs from the first plurality in that at least one of the oligonucleotides present in the second plurality is labeled with a different detectable moiety than the corresponding oligonucleotide targeting the same transcript or DNA locus in the first plurality, so that, in the second plurality:
(iii) the third oligonucleotide is labeled with the first detectable moiety, the second detectable moiety or a third detectable moiety; and
(iv) the fourth oligonucleotide is labeled with the first detectable moiety, the second detectable moiety, the third detectable moiety, or a fourth detectable moiety,
wherein either the third oligonucleotide is labeled with a different detectable moiety than was the first oligonucleotide, or the fourth oligonucleotide is labeled with a different detectable moiety than was the second oligonucleotide, or both;
(d) imaging the cell after the second contacting step so that hybridization by oligonucleotides of the second plurality with their targets is detected; and
(e) optionally repeating the contacting and imaging steps, each time with a new plurality of detectably labeled oligonucleotides comprising oligonucleotides that target overlapping transcripts or DNA loci targeted by the first and second pluralities, wherein each utilized plurality differs from each other utilized plurality, due to at least one difference in detectable moiety labeling of oligonucleotides targeting the same transcript or DNA locus.

As used herein, a detectably labeled oligonucleotide is labeled with a detectable moiety. In some embodiments, a detectably labeled oligonucleotide comprises one detectable moiety. In some embodiments, a detectably labeled oligonucleotide comprises two or more detectable moieties. In some embodiments, a detectably labeled oligonucleotide has one detectable moiety. In some embodiments, a detectably labeled oligonucleotide has two or more detectable moieties.

In some embodiments, a detectable moiety is or comprises a fluorophore. Exemplary detectably labeled oligonucleotides labeled with fluorophores include but are not limited to probes for fluorescence in situ hybridization (FISH). Widely known and practiced by persons having ordinary skill in the art, FISH is used to, among other things, detect and localize the presence or absence of specific DNA sequences or RNA targets. Methods for designing and preparing detectably labeled oligonucleotides are widely known in the art, including but not limited to those described in US patent application publication US 20120142014. Due to limitations such as fluorophore availability, FISH, however, can only be used to profile a limited number of targets in a given experiment. Through sequential barcoding to multiplex different targets, provided methods of the present invention can profile a large number of targets, up to F^N, wherein F is the number of types of detectable moieties (in the case of FISH, fluorophores) and N is the number of contacting steps (in the case of FISH, hybridization). For example, when F is four and N is 8, almost the entire transcriptome (4^8 = 65,536) can be profiled. In some embodiments, F is at least 2. In some embodiments, F is 3. In some embodiments, F is 4. In some embodiments, F is 5. In some embodiments, F is 6. In some embodiments, F is 7. In some embodiments, F is 8. In some embodiments, F is 9. In some embodiments, F is 10. In some embodiments, F is 11. In some embodiments, F is 12. In some embodiments, F is 13. In some embodiments, F is 14. In some embodiments, F is 15. In some embodiments, F is greater than 15. In some embodiments, N is 2. In some embodiments, N is greater than 2. In some embodiments, N is 3. In some embodiments, N is greater than 3. In some embodiments, N is 4. In some embodiments, N is greater than 4. In some embodiments, N is 5. In some embodiments, N is greater than 5. In some embodiments, N is 6. In some embodiments, N is greater than 6. In some embodiments, N is 7. In some embodiments, N is greater than 7. In some embodiments, N is 8. In some embodiments, N is greater than 8. In some embodiments, N is 9. In some embodiments, N is greater than 9. In some embodiments, N is 10. In some embodiments, N is greater than 10. In some embodiments, a plurality of detectably labeled oligonucleotides target at least 100 targets.

In a contacting step, a detectably labeled oligonucleotide can be labeled prior to, concurrent with or subsequent to its binding to its target. In some embodiments, a detectably labeled oligonucleotide, such as a fluorophore-labeled oligonucleotide, is labeled prior to its binding to its target. In some embodiments, a detectably labeled oligonucleotide is labeled concurrent with its binding to its target. In some embodiments, a detectably labeled oligonucleotide is labeled subsequent to its binding to its target. In some embodiments, a detectably labeled oligonucleotide is labeled subsequent to hybridization through orthogonal amplification with hybridization chain reactions (HCR) (Choi, H M., Nat Biotechnol. 2010 November; 28(11):1208-12). In some embodiments, a detectably labeled oligonucleotide comprises a moiety, e.g., a nucleic acid sequence, that one or more moieties that can provide signals in an imaging step can be directly or indirectly linked to the oligonucleotide.

In some embodiments, the same type of labels can be attached to different probes for different targets. In some embodiments, probes for the same target have the same label in a plurality of detectably labeled oligonucleotides used in a contacting step (a set of detectably labeled oligonucleotides). Each target, after rounds of contacting and imaging, has its own unique combination of labels (sequential barcoding), so that information, e.g., quantitative and/or spatial information, can be obtained for a target. For example, when fluorophores are used to label detectably labeled oligonucleotides, after N steps, a target would have a sequential barcode of F_1F_2...F_N, wherein F_n is the color of fluorophore used for the target in the n-th imaging. One target can be differentiated from another by a difference in their barcodes (e.g., RedRedBlueRed compared to RedRedRedBlue).

In some embodiments, labels of the present invention are or comprise one or more fluorescent dyes, including but not limited to fluorescein, rhodamine, Alexa Fluors, DyLight fluors, ATTO Dyes, or any analogs or derivatives thereof.

In some embodiments, labels of the present invention include but are not limited to fluorescein and chemical derivatives of fluorescein; Eosin; Carboxyfluorescein; Fluorescein isothiocyanate (FITC); Fluorescein amidite (FAM); Erythrosine; Rose Bengal; fluorescein secreted from the bacterium Pseudomonas aeruginosa; Methylene blue; Laser dyes; Rhodamine dyes (e.g., Rhodamine, Rhodamine 6G, Rhodamine B, Rhodamine 123, Auramine O, Sulforhodamine 101, Sulforhodamine B, and Texas Red).

In some embodiments, labels of the present invention include but are not limited to ATTO dyes; Acridine dyes (e.g., Acridine orange, Acridine yellow); Alexa Fluor; 7-Aminoactinomycin D; 8-Anilinonaphthalene-1-sulfonate; Auramine-rhodamine stain; Benzanthrone; 5,12-Bis(phenylethynyl)naphthacene; 9,10-Bis(phenylethynyl)anthracene; Blacklight paint; Brainbow; Calcein; Carboxyfluorescein; Carboxyfluorescein diacetate succinimidyl ester; Carboxyfluorescein succinimidyl ester; 1-Chloro-9,10-bis(phenylethynyl)anthracene; 2-Chloro-9,10-bis(phenylethynyl)anthracene; 2-Chloro-9,10-diphenylanthracene; Coumarin; Cyanine dyes (e.g., Cyanine such as Cy3 and Cy5, DiOC6, SYBR Green I); DAPI, Dark quencher, DyLight Fluor, Fluo-4, FluoProbes; Fluorone dyes (e.g., Calcein, Carboxyfluorescein, Carboxyfluorescein diacetate succinimidyl ester, Carboxyfluorescein succinimidyl ester, Eosin, Eosin B, Eosin Y, Erythrosine, Fluorescein, Fluorescein isothiocyanate, Fluorescein amidite, Indian yellow, Merbromin); Fluoro-Jade stain; Fura-2; Fura-2-acetoxymethyl ester; Green fluorescent, Hoechst stain, Indian yellow, Indo-1, Lucifer yellow, Luciferin, Merocyanine, Optical brightener, Oxazin dyes (e.g., Cresyl violet, Nile blue, Nile red); Perylene; Phenanthridine dyes (Ethidium bromide and Propidium iodide); Phloxine, Phycobilin, Phycoerythrin, Phycoerythrobilin, Pyranine, Rhodamine, Rhodamine 123, Rhodamine 6G, RiboGreen, RoGFP, Rubrene, SYBR Green I, (E)-Stilbene, (Z)-Stilbene, Sulforhodamine 101, Sulforhodamine B, Synapto-pHluorin, Tetraphenyl butadiene, Tetrasodium tris(bathophenanthroline disulfonate)ruthenium(II), Texas Red, TSQ, Umbelliferone, or Yellow fluorescent protein.

In some embodiments, labels of the present invention include but are not limited to Alexa Fluor family of fluorescent dyes (Molecular Probes, Oregon). Alexa Fluor dyes are widely used as cell and tissue labels in fluorescence microscopy and cell biology. The excitation and emission spectra of the Alexa Fluor series cover the visible spectrum and extend into the infrared. The individual members of the family are numbered according roughly to their excitation maxima (in nm).

can be done through incorporating aminoalkyl-modified nucleotides during synthesis reactions. In some embodiments, a label is used in every 60 bases to avoid quenching effects.

A detectably labeled oligonucleotide can hybridize with a target, e.g., a transcript or DNA locus. In some embodiments, a target is or comprises a transcript. In some embodiments, a target is a transcript. In some embodiments, a transcript is an RNA. In some embodiments, a transcript is an mRNA. In some embodiments, a transcript is tRNA. In some embodiments, a transcript is rRNA. In some embodiments, a transcript is snRNA. In some embodiments, an RNA is a non-coding RNA. Exemplary non-coding RNA types are widely known in the art, including but not limited to long non-coding RNA (lncRNA), microRNA (miRNA), short interfering RNA (siRNA), piwi-interacting RNA (piRNA), small nucleolar RNA (snoRNA) and other short RNAs. In some embodiments, an RNA is lncRNA. In some embodiments, an RNA is miRNA. In some embodiments, an RNA is piRNA. In some embodiments, an RNA is snoRNA.

In some embodiments, a target is or comprises a DNA locus. In some embodiments, when a target is a DNA locus, a detectably labeled oligonucleotide optionally comprises one or more RNA nucleotide or RNA segments. A detectably
Certain Alexa Fluor dyes are synthesized 25 labeled oligonucleotide comprises RNA sequences can be through sulfonation of coumarin , rhodamine , xanthene ( such selectively removed , for example , through RNA - specific as fluorescein ), and cyanine dyes . In some embodiments , enzymatic digestion , after imaging without degrading the sulfonation makes Alexa Fluor dyes negatively charged and DNA target . Exemplary enzymes that specifically degrade hydrophilic . In some embodiments , Alexa Fluor dyes are RNA but not DNA include but are not limited to various more stable , brighter , and less pH - sensitive than common 30 RNase , such as RNase A and RNase H. dyes ( e.g. fluorescein , rhodamine ) of comparable excitation In some embodiments , a detectably labeled oligonucle and emission , and to some extent the newer cyanine series . otide directly hybridizes to its target, e.g., a transcript or Exemplary Alexa Fluor dyes include but are not limited DNA locus . In some embod a detectably labeled Alexa - 350 , Alexa - 405 , Alexa - 430 , Alexa - 488 , Alexa - 500 , oligonucleotide specifically interacts with ( recognizes ) its Alexa- 514 , Alexa -532 , Alexa- 546 , Alexa -555 , Alexa - 568 , 35 target through binding or hybridization to one or more Alexa -594 , Alexa -610 , Alexa -633 , Alexa -647 , Alexa -660 , intermediate , e.g., an oligonucleotide, that is bound , hybrid Alexa -680 , Alexa -700 , or Alexa -750 . ized , or otherwise specifically linked to the target . In some In some embodiments , labels of the present invention embodiments , an intermediate oligonucleotide is hybridized comprise one or more of the DyLight Fluor family of against its target with an overhang such that a second fluorescent dyes (Dyomics and Thermo Fisher Scientific ). 40 oligonucleotide with complementary sequence (“ bridge oli Exemplary DyLight Fluor family dyes include but are not gonucleotide” or “ bridge probe ” ) can bind to it. 
In some limited to DyLight- 350 , DyLight- 405 , DyLight -488 , embodiments , an intermediate targets a nucleic acid and is DyLight- 549, DyLight- 594 , DyLight- 633 , DyLight- 649 , optionally labeled with a detectable moiety , and comprises DyLight- 680 , DyLight- 750 , or DyLight- 800 . an overhang sequence after hybridization with the target . In In some embodiments , a detectable moiety is or com- 45 some embodiments , an intermediate comprises a sequence prises a nanomaterial. In some embodiments , a detectable that hybridizes to a target , an overhang sequence , and moiety is or compresses a nanoparticle . In some embodi optionally a detectable moiety . In some embodiments , an ments , a detectable moiety is or comprises a quantum dot. In intermediate comprises a sequence that hybridizes to a target some embodiments , a detectable moiety is a quantum dot. In and an overhang sequence. In some embodiments , an inter some embodiments , a detectable moiety comprises a quan- 50 mediate does not have a detectable moiety . In some embodi tum dot. In some embodiments , a detectable moiety is or ments , a second oligonucleotide is a detectably labeled comprises a gold nanoparticle . In some embodiments , a oligonucleotide. In some embodiments , a second detectably detectable moiety is a gold nanoparticle . In some embodi labeled oligonucleotide is labeled with a dye . In some ments , a detectable moiety comprises a gold nanoparticle . embodiments , a detectably labeled oligonucleotide is One of skill in the art understands that , in some embodi- 55 labeled with an HCR polymer . 
In some embodiments , inter ments , selection of label for a particular probe in a particular mediate oligonucleotides bound to targets are preserved cycle may be determined based on a variety of factors , through multiple contacting , removing and /or imaging steps; including , for example , size , types of signals generated , sequential barcodes are provided through combinations of manners attached to or incorporated into a probe, properties detectable labels that are linked to intermediate oligonucle of the cellular constituents including their locations within 60 otides through bridge probes in the contacting and imaging the cell, properties of the cells , types of interactions being steps. For example, when detectably labeled oligonucle analyzed , and etc. otides are used as bridge probes , barcodes are provided by For example , in some embodiments , probes are labeled detectably labeled oligonucleotides that hybridize with inter with either Cy3 or Cy5 that has been synthesized to carry an mediate oligonucleotides through their overhang sequences . N -hydroxysuccinimidyl ester (NHS -ester ) reactive group . 65 After an imaging step , bridge oligonucleotides are option Since NHS- esters react readily with aliphatic amine groups, ally removed as described herein . In some embodiments , nucleotides can be modified with aminoalkyl groups. This one intermediate oligonucleotide is employed for a target . In US 10,510,435 B2 23 24 some embodiments , two or more intermediate oligonucle performed step ( a ). In some embodiments , a removing step otides are employed for a target . In some embodiments , preserves intermediate oligonucleotides . three or more intermediate oligonucleotides are employed In some embodiments , provided technologies are used to for a target . In some embodiments , four or more interme profile different transcripts formed as a result of splicing diate oligonucleotides are employed for a target . 
In some variation , RNA editing , oligonucleotide modification , or a embodiments , five or more intermediate oligonucleotides combination thereof. In some embodiments , a target is an are employed for a target . In some embodiments , six ormore RNA splicing variant. In some embodiments , provided tech intermediate oligonucleotides are employed for a target . In nologies profile one or more splicing variants of a gene , e.g. , some embodiments , seven or more intermediate oligonucle locations and quantities of one or more splicing variant of a otides are employed for a target . In some embodiments , 10 gene . In some embodiments , provided methods or compo eight or more intermediate oligonucleotides are employed sitions profile different splicing variants . In some embodi for a target . In some embodiments , nine or more interme ments , an exon that contains one or more variants is targeted diate oligonucleotides are employed for a target. In some and barcoded by sequential hybridization and barcoding . In embodiments , 10 or more intermediate oligonucleotides are some embodiments , a splicing variant contains one or more employed for a target. In some embodiments , 11 or more 15 distinguishable sequences resulted from splicing , and such intermediate oligonucleotides are employed for a target . In sequences are targeted . In some embodiments , by targeting some embodiments , 12 or more intermediate oligonucle exons and / or distinguishable sequences, provided technolo otides are employed for a target. In some embodiments , 13 gies can profile one or more specific splicing variants , or an or more intermediate oligonucleotides are employed for a entire splicing repertoire of an mRNA . As widely known in target . In some embodiments , 13 or more intermediate 20 the art ,mRNA splicing are important to numerous biological oligonucleotides are employed for a target . 
In some embodi processes and diseases, for example , neurological diseases ments , 15 or more intermediate oligonucleotides are like autism or Down syndrome . Molecules responsible for employed for a target . In some embodiments , 16 or more cell- to -cell adhesion and synpatogenesis are spliced and intermediate oligonucleotides are employed for a target . In their defects are known to generate miswiring in the brain some embodiments , 17 or more intermediate oligonucle- 25 and cause diseases . otides are employed for a target . In some embodiments , 18 In some embodiments , detectably labeled oligonucle or more intermediate oligonucleotides are employed for a otides target sequence modifications caused by sequence target . In some embodiments , 19 or more intermediate editing , chemical modifications and / or combinations oligonucleotides are employed for a target. In some embodi thereof. In some embodiments , a modified nucleic acid ments , 20 or more intermediate oligonucleotides are 30 target, optionally after a conversion process , hybridizes with employed for a target . In some embodiments , 21 or more one or more different complementary sequences compared intermediate oligonucleotides are employed for a target. In to an un -modified target , and is profiled using one or more some embodim nts , 22 or more intermediate oligonucle oligonucleotides that selectively hybridizes with the modi otides are employed for a target . In some embodiments , 23 fied nucleic acid . In some embodiments , a target is an RNA or more intermediate oligonucleotides are employed for a 35 through by RNA editing (Brennicke , A., A. Marchfelder, et target. In some embodiments , 24 or more intermediate al. ( 1999 ). "RNA editing " . FEMS Microbiol Rev 23 ( 3) : oligonucleotides are employed for a target. In some embodi 297-316 ) . In some embodiments , provided technologies ments , 25 or more intermediate oligonucleotides are profiles different RNA variants formed by RNA editing. 
In employed for a target . In some embodiments , 30 or more some embodiments , provided technologies profile modified intermediate oligonucleotides are employed for a target. In 40 oligonucleotide. In some embodiments, provided technolo some embodiments , 40 or more intermediate oligonucle gies profiles methylated RNA (Song C X , Yi C , He C. otides are employed for a target. In some embodiments , 50 Mapping recently identified nucleotide variants in the or more intermediate oligonucleotides are employed for a genome and transcriptome. Nat Biotechnol. 2012 Novem target . ber ; 30 ( 11 ) : 1107-16 ). In some embodiments , provided tech In some embodiments , each intermediate oligonucleotide 45 nologies profile methylated DNA . In some embodiments , a hybridizes with a different sequence of a target . In some target is single -nucleotide polymorphism ( SNP ) . embodiments , each intermediate oligonucleotide of a target In some embodiments , by profiling a target , provided comprises the same overhang sequence. In some embodi technologies provide, among other things, quantitative and / ments , each detectably labeled oligonucleotide for a target or positioning information of a target , in some cases , in comprises the same sequence complimentary to the same 50 single cells, a tissue , an organ , or an organism . In some overhang sequence shared by all intermediate oligonucle embodiments , profiling of transcripts can be used to quali otides of the target . In some embodiments , an intermediate tatively and / or quantitatively define the spatial - temporal oligonucleotide comprises a sequence complimentary to a patterns of gene expression within cells , tissues , organs or target , and a sequence complimentary to a detectably labeled organisms. oligonucleotide. 
55 In some embodiments , each detectably labeled oligo In some embodiments , provided methods further com nucleotide in a set has a different target, e.g., a transcript or prises steps of : ( f ) performing a contacting step that involves DNA locus. In some embodiments , two or more detectably contacting a cell comprising a plurality of nucleic acids with labeled oligonucleotides in a set have the same target. In a plurality of intermediate oligonucleotides , each of which : some embodiments , two or more detectably labeled oligo ( i) targets a nucleic acid and is optionally labeled with a 60 nucleotides target the same transcript. In some embodi detectable moiety ; and ments , two or more detectably labeled oligonucleotides (ii ) comprises an overhang sequence after hybridization target the same DNA locus. In some embodiments , about 2 , with the target ; and ( g ) optionally imaging the cell so that 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11, 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , interaction between the intermediate oligonucleotides with 21, 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 40 , 50 , 60 , 70 , 80 , 90 their targets is detected . 65 or 100 detectably labeled oligonucleotides the same target . In some embodiments , step ( f) and optionally step ( g ) are In some embodiments , two or more detectably labeled performed before step ( a ). In some embodiments , step ( f) is oligonucleotides target the same target. In some embodi US 10,510,435 B2 25 26 ments , five or more detectably labeled oligonucleotides region is about 60 bp in length . In some embodiments , a target the same target. In some embodiments , 10 or more targeted region is about 80 bp in length . In some embodi detectably labeled oligonucleotides target the same target . In ments , a targeted region is about 100 bp in length . In some some embodiments , 15 or more detectably labeled oligo embodiments , a targeted region is about 150 bp in length . In nucleotides target the same target. 
In some embodiments , 20 5 some embodiments , a targeted region is about 200 bp in or more detectably labeled oligonucleotides target the same length . In some embodiments , a targeted region is about 250 target . In some embodiments , 25 ormore detectably labeled bp in length . In some embodiments , a targeted region is oligonucleotides target the same target. In some embodi about 300 bp in length . In some embodiments , a targeted ments , 30 or more detectably labeled oligonucleotides target region is about 350 bp in length . In some embodiments , a the same target . In some embodiments , 35 or more detect- 10 targeted region is about 400 bp in length . In some embodi ably labeled oligonucleotides target the same target . In some ments , a targeted region is about 450 bp in length . In some embodiments , 40 or more detectably labeled oligonucle embodiments , a targeted region is about 500 bp in length . In otides target the same target. In some embodiments , 45 or some embodiments , a targeted region is about 600 bp in more detectably labeled oligonucleotides target the same length . In some embodiments , a targeted region is about 700 target . In some embodiments , 50 or more detectably labeled 15 bp in length . In some embodiments , a targeted region is oligonucleotides target the same target. In some embodi about 800 bp in length . In some embodiments , a targeted ments, 60 or more detectably labeled oligonucleotides target region is about 900 bp in length . In some embodiments , a the same target. In some embodiments , 70 or more detect targeted region is about 1,000 bp in length . In some embodi ably labeled oligonucleotides target the same target . In some ments , detectably labeled oligonucleotides for a target are embodiments , 80 or more detectably labeled oligonucle- 20 positioned in proximity to each other on the target. otides target the same target. 
In some embodiments, 90 or As understood by a person having ordinary skill in the art , more detectably labeled oligonucleotides target the same different technologies can be used for the imaging steps. target . In some embodiments , 100 or more detectably Exemplary methods include but are not limited to epi labeled oligonucleotides target the same target. In some fluorescence microscopy , confocal microscopy, the different embodiments , about 1-10 detectably labeled oligonucle- 25 types of super - resolution microscopy (PALM /STORM , otides target the same target . In some embodiments , about SSIM /GSD /STED ) , and light sheet microscopy (SPIM and 5-15 detectably labeled oligonucleotides target the same etc ) . target . In some embodiments, about 10-20 detectably Exemplary super resolution technologies include but are labeled oligonucleotides target the same target. In some not limited to 1PM and 4Pi- microscopy , Stimulated Emis embodiments , about 15-25 detectably labeled oligonucle- 30 sion Depletion microscopy ( STEDM ), Ground State Deple otides target the same target . In some embodiments , about tion microscopy (GSDM ) , Spatially Structured Illumination 20-30 detectably labeled oligonucleotides target the same microscopy (SSIM ), Photo - Activated Localization Micros target. In some embodiments , about 25-35 detectably copy (PALM ), Reversible Saturable Optically Linear Fluo labeled oligonucleotides target the same target . In some rescent Transition (RESOLFT ) , Total Internal Reflection embodiments , about 30-40 detectably labeled oligonucle- 35 Fluorescence Microscope ( TIRFM ), Fluorescence -PALM otides target the same target . In some embodiments , about (FPALM ), Stochastical Optical Reconstruction Microscopy 35-45 detectably labeled oligonucleotides target the same (STORM ), Fluorescence Imaging with One- Nanometer target . In some embodiments , about 40-50 detectably Accuracy (FIONA ), and combinations thereof. 
For labeled oligonucleotides target the same target . In some examples : Chi, 2009 “ Super- resolution microscopy : break embodiments , about 45-55 detectably labeled oligonucle- 40 ing the limits , Nature Methods 6 ( 1) :15-18 ; Blow 2008 , otides target the same target. In some embodiments , about “ New ways to see a smaller world ,” Nature 456 : 825-828 ; 50-70 detectably labeled oligonucleotides target the same Hell , et al. , 2007 , “ Far -Field Optical Nanoscopy, ” Science target. In some embodiments, about 60-80 detectably 316 : 1153 ; R. Heintzmann and G. Ficz , 2006 , “ Breaking the labeled oligonucleotides target the same target . In some resolution limit in lightmicroscopy , ” Briefings in Functional embodiments , about 70-90 detectably labeled oligonucle- 45 Genomics and Proteomics 5 ( 4 ): 289-301 ; Garini et al. , 2005 , otides target the same target . In some embodiments , about “ From micro to nano : recent advances in high - resolution 80-100 detectably labeled oligonucleotides target the same microscopy, " Current Opinion in Biotechnology 16 :3-12 ; target . and Bewersdorf et al. , 2006 , “ Comparison of PM and In some embodiments , among other things , using multiple 4Pi- microscopy , ” 222 ( 2 ) : 105-117 ; and Wells, 2004 , “Man detectably labeled oligonucleotides for the same target 50 the Nanoscopes ,” JCB 164 ( 3 ) : 337-340 . increases signal intensity . In some embodiments , each In some embodiments , electron microscopes ( EM ) are detectably labeled oligonucleotide in a set targeting the same used . target interacts with a different portion of a target. In some embodiments , an imaging step detects a target . In In some embodiments , all detectably labeled oligonucle some embodiments , an imaging step localizes a target. In otides for a target in a set have the same detectable moieties. 
55 some embodiments , an imaging step provides three - dimen In some embodiments , all detectably labeled oligonucle sional spatial information of a target. In some embodiments , otides are labeled in the same way . In some embodiments , all an imaging step quantifies a target . By using multiple the detectably labeled oligonucleotides for a target have the contacting and imaging steps, provided methods are capable same fluorophore . of providing spatial and / or quantitative information for a In some embodiments , detectably labeled oligonucle- 60 large number of targets in surprisingly high throughput . For otides for a target are positioned within a targeted region of example , when using F detectably different types of labels , a target. A targeted region can have various lengths. In some spatial and/ or quantitative information of up to F ^ targets embodiments , a targeted region is about 20 bp in length . In can be obtained after N contacting and imaging steps . some embodiments , a targeted region is about 30 bp in In some embodiments , provided methods comprise addi length . In some embodiments , a targeted region is about 40 65 tional steps before or after a contacting and / or an imaging bp in length . In some embodiments , a targeted region is step . In some embodiments , provided methods comprise a about 50 bp in length . In some embodiments , a targeted step of removing a plurality of detectably labeled oligo US 10,510,435 B2 27 28 nucleotides after each imaging step . In some embodiments , In some embodiments , the second plurality differs from a step of removing comprises degrading the detectably the first plurality in that at least one of the oligonucleotides labeled oligonucleotides . 
In some embodiments , a step of present in the second plurality is labeled with a different removing does not significantly degrade a target , so that a detectable moiety than the corresponding oligonucleotide target can be used for the next contacting and/ or imaging 5 targeting the same transcript or DNA locus in the first step (s ) if desired . In some embodiments , a step of removing plurality . In some embodiments , each plurality of detectably comprises contacting the plurality of detectably labeled labeled oligonucleotides is different from another, in that at oligonucleotides with an enzyme that digests a detectably least one of the oligonucleotides present in a plurality is labeled with a different detectable moiety than the corre labeled oligonucleotide . In some embodiments , a step of 10 sponding oligonucleotide targeting the same transcript or removing comprises contacting the plurality of detectably DNA locus in another plurality . labeled oligonucleotides with a DNase or RNase . For In some embodiments , a detectably labeled oligonucle example , in some embodiments , a detectably labeled oligo otide has the structure of [ S ] - [ L ] , wherein [ S ] is an oligo nucleotide comprises a DNA sequence , and a DNase is used nucleotide sequence, [ L ] is a detectable moiety or a com for its degradation ; in some other embodiments, a detectably 15 bination of detectable moieties . In some embodiments , [ L ] labeled oligonucleotide comprises an RNA sequence , and an comprises multiple units of detectable labels , e.g., fluoro RNase is used for its degradation . In some embodiments , a phores, each of which independently associates with one or step of removing comprises degrading a detectable moiety . more nucleotidic moieties of an oligonucleotide sequence , In some embodiments , a step of removing comprises pho e.g., [ S ]. In some embodiments , each detectable label tobleaching . 
20 attached to the same detectably labeled oligonucleotide In some embodiments , targets of one set of detectably provides the same detectable signal . In some embodiments , labeled oligonucleotides are also targets of another set . In all detectable labels attached to the same oligonucleotide some embodiments , targets of one set of detectably labeled sequence are the same . oligonucleotides overlap with those of another set . In some In some embodiments , oligonucleotides targeting the embodiments , the overlap is more than 10 % . In some 25 same target have the same set of sequences among two or embodiments , the overlap is more than 20 % . In some more sets of detectably labeled oligonucleotides , i.e. , the embodiments , the overlap is more than 30 % . In some differences, if any , among the detectably labeled oligonucle embodiments , the overlap is more than 40 % . In some otides are within the detectable moieties, not the sequences. embodiments , the overlap is more than 50 % . In some For example , in one set of detectably labeled oligonucle embodiments , the overlap is more than 60 % . In some 30 otides , the detectably labeled oligonucleotides targeting a embodiments , the overlap is more than 70 % . In some first target all have the same detectable moiety , or combi embodiments , the overlap is more than 80 % . In some nation of detect moieties [ L ]? : embodiments , the overlap is more than 90 % . In some [ S ] - [ L ] ] , [ S ] 2- [ L ] 1 , ... , [ S ],, - [ L ] ] , wherein n is the number embodiments , the overlap is more than 91 % . In some of detectably labeled oligonucleotides for a target , e.g. , an embodiments, the overlap is more than 92 % . In some 35 integer of 3-50 ; embodiments , the overlap is more than 93 % . In some In another set of detectably labeled oligonucleotides , embodiments , the overlap is more than 94 % . In some wherein oligonucleotides targeting the same target are dif embodiments , the overlap is more than 90 % . 
In some embodiments, the overlap is more than 95%. In some embodiments, the overlap is more than 96%. In some embodiments, the overlap is more than 97%. In some embodiments, the overlap is more than 98%. In some embodiments, the overlap is more than 99%. In some embodiments, the overlap is more than 99.5%. In some embodiments, the overlap is more than 99.6%. In some embodiments, the overlap is more than 99.7%. In some embodiments, the overlap is more than 99.8%. In some embodiments, the overlap is more than 99.9%. In some embodiments, the overlap is 100%. In some embodiments, targets of one set of detectably labeled oligonucleotides are the same as targets of another set. In some embodiments, each set of detectably labeled oligonucleotides targets the same targets.

In some embodiments, a third detectably labeled oligonucleotide in a second contacting step targeting the first transcript or DNA locus (the first target) optionally has an identical sequence to the first detectably labeled oligonucleotide targeting the first transcript or DNA locus. In some embodiments, the sequences are identical. In some embodiments, the sequences are different. Similarly, in some embodiments, a fourth detectably labeled oligonucleotide in a second contacting step targeting the second transcript or DNA locus (the first target) optionally has an identical sequence to the second detectably labeled oligonucleotide targeting the first transcript or DNA locus. In some embodiments, the sequences are identical. In some embodiments, the sequences are different.

ferently labeled, the oligonucleotides targeting the same target are having the same set of oligonucleotide sequences ([S]_1, [S]_2, ..., [S]_n) yet a different [L]_2: [S]_1-[L]_2, [S]_2-[L]_2, ..., [S]_n-[L]_2, wherein [L]_1 is detectably different than [L]_2.

To exemplify certain embodiments of the present invention, a two-step, two-label, 4-target (F^N = 2^2 = 4) process, wherein all detectably labeled oligonucleotides targeting the same target in each set independently have the same detectable moiety, is provided below:

Step 1: Contacting the targets with the first plurality (P1) of detectably labeled oligonucleotides:
Target T1: [S]_{P1-T1-1}[L]_1, [S]_{P1-T1-2}[L]_1, [S]_{P1-T1-3}[L]_1, ..., [S]_{P1-T1-P1T1}[L]_1, wherein P1T1 is the number of detectably labeled oligonucleotides targeting T1 in the first plurality, and [L]_1 is the first detectable label;
Target T2: [S]_{P1-T2-1}[L]_1, [S]_{P1-T2-2}[L]_1, [S]_{P1-T2-3}[L]_1, ..., [S]_{P1-T2-P1T2}[L]_1, wherein P1T2 is the number of detectably labeled oligonucleotides targeting T2 in the first plurality;
Target T3: [S]_{P1-T3-1}[L]_2, [S]_{P1-T3-2}[L]_2, [S]_{P1-T3-3}[L]_2, ..., [S]_{P1-T3-P1T3}[L]_2, wherein P1T3 is the number of detectably labeled oligonucleotides targeting T3 in the first plurality, and [L]_2 is a detectably different label than [L]_1;
Target T4: [S]_{P1-T4-1}[L]_2, [S]_{P1-T4-2}[L]_2, [S]_{P1-T4-3}[L]_2, ..., [S]_{P1-T4-P1T4}[L]_2, wherein P1T4 is the number of detectably labeled oligonucleotides targeting T4 in the first plurality.
Step 2: Imaging;
Step 3: Removing P1 from the targets;
Step 4: Contacting the targets with the second plurality (P2) of detectably labeled oligonucleotides:
Target T1: [S]_{P2-T1-1}[L]_1, [S]_{P2-T1-2}[L]_1, [S]_{P2-T1-3}[L]_1, ..., [S]_{P2-T1-P2T1}[L]_1, wherein P2T1 is the number of detectably labeled oligonucleotides targeting T1 in the second plurality;
Target T2: [S]_{P2-T2-1}[L]_2, [S]_{P2-T2-2}[L]_2, [S]_{P2-T2-3}[L]_2, ..., [S]_{P2-T2-P2T2}[L]_2, wherein P2T2 is the number of detectably labeled oligonucleotides targeting T2 in the second plurality;
Target T3: [S]_{P2-T3-1}[L]_1, [S]_{P2-T3-2}[L]_1, [S]_{P2-T3-3}[L]_1, ..., [S]_{P2-T3-P2T3}[L]_1, wherein P2T3 is the number of detectably labeled oligonucleotides targeting T3 in the second plurality;
Target T4: [S]_{P2-T4-1}[L]_2, [S]_{P2-T4-2}[L]_2, [S]_{P2-T4-3}[L]_2, ..., [S]_{P2-T4-P2T4}[L]_2, wherein P2T4 is the number of detectably labeled oligonucleotides targeting T4 in the second plurality.
Step 5: Imaging.

After the two imaging steps, each target has its own unique sequential barcode:
T1: [L]_1[L]_1;
T2: [L]_1[L]_2;
T3: [L]_2[L]_1; and
T4: [L]_2[L]_2.

In some embodiments, additional barcodes, T1--, T2--, --T1, --T2 can also be used, wherein -- indicates no signal for that step.

In the exemplified process above, each of P1T1, P1T2, P1T3, P1T4, P2T1, P2T2, P2T3 and P2T4 is independently a natural number (an integer greater than 0). In some embodiments, P1T1 = P2T1. In some embodiments, P1T2 = P2T2. In some embodiments, P1T3 = P2T3. In some embodiments, P1T4 = P2T4. In some embodiments, one detectably labeled oligonucleotide is used for a target. In some embodiments, two or more detectably labeled oligonucleotides are used for a target.

In some embodiments, detectably labeled oligonucleotides targeting the same target have the same set of sequences in each plurality. For example, for target T1 in the example above, each of [S]_{P1-T1-1} to [S]_{P1-T1-P1T1} independently has the same sequence as one of [S]_{P2-T1-1} to [S]_{P2-T1-P2T1}, and each of [S]_{P2-T1-1} to [S]_{P2-T1-P2T1} independently has the same sequence as one of [S]_{P1-T1-1} to [S]_{P1-T1-P1T1}. In some embodiments, detectably labeled oligonucleotides targeting the same target have different sets of sequences in each plurality.

In some embodiments, provided methods optionally comprise a step of removing a plurality of detectably labeled oligonucleotides after an imaging step. In some embodiments, provided methods comprise a removing step after an imaging step. In some embodiments, provided methods comprise a removing step after each imaging step but the last imaging step. In some embodiments, provided methods comprise a removing step after each imaging step.

A removing step in provided methods can serve one or more of a variety of purposes. In some embodiments, a removing step removes a plurality of detectably labeled oligonucleotides from targets so that targets are available for interacting with another plurality of detectably labeled oligonucleotides. In some embodiments, a removing step removes a plurality of detectably labeled oligonucleotides so that detectable moieties of one plurality of detectably labeled oligonucleotides do not interfere with detection of another plurality of detectably labeled oligonucleotides bound to targets. In some embodiments, a removing step removes at least 80% detectably labeled oligonucleotides. In some embodiments, a removing step removes at least 85% detectably labeled oligonucleotides. In some embodiments, a removing step removes at least 90% detectably labeled oligonucleotides. In some embodiments, a removing step removes at least 91% detectably labeled oligonucleotides. In some embodiments, a removing step removes at least 92% detectably labeled oligonucleotides. In some embodiments, a removing step removes at least 93% detectably labeled oligonucleotides. In some embodiments, a removing step removes at least 94% detectably labeled oligonucleotides. In some embodiments, a removing step removes at least 95% detectably labeled oligonucleotides. In some embodiments, a removing step removes at least 96% detectably labeled oligonucleotides. In some embodiments, a removing step removes at least 97% detectably labeled oligonucleotides. In some embodiments, a removing step removes at least 98% detectably labeled oligonucleotides. In some embodiments, a removing step removes at least 99% detectably labeled oligonucleotides. In some embodiments, a removing step removes at least 99.1% detectably labeled oligonucleotides. In some embodiments, a removing step removes at least 99.2% detectably labeled oligonucleotides. In some embodiments, a removing step removes at least 99.3% detectably labeled oligonucleotides. In some embodiments, a removing step removes at least 99.4% detectably labeled oligonucleotides. In some embodiments, a removing step removes at least 99.5% detectably labeled oligonucleotides. In some embodiments, a removing step removes at least 80% of the detectable signal. In some embodiments, a removing step removes at least 85% of the detectable signal. In some embodiments, a removing step removes at least 90% of the detectable signal. In some embodiments, a removing step removes at least 91% of the detectable signal. In some embodiments, a removing step removes at least 92% of the detectable signal. In some embodiments, a removing step removes at least 93% of the detectable signal. In some embodiments, a removing step removes at least 94% of the detectable signal. In some embodiments, a removing step removes at least 95% of the detectable signal. In some embodiments, a removing step removes at least 96% of the detectable signal. In some embodiments, a removing step removes at least 97% of the detectable signal. In some embodiments, a removing step removes at least 98% of the detectable signal. In some embodiments, a removing step removes at least 99% of the detectable signal. In some embodiments, a removing step removes at least 99.5% of the detectable signal. In some embodiments, a removing step removes 100% of the detectable signal. In some embodiments, after a removing step no signal can be detected by an imaging step.

A removing step optionally preserves targets (e.g., transcripts or DNA loci) for further use, for example, further detection or quantification by additional contacting and/or imaging steps. In some embodiments, a removing step preserves at least 80% targets. Percentage of preserved targets can be measured, for example, by comparing data collected before and after a removing step, optionally using the same contacting and imaging protocols. In some embodiments, a removing step preserves at least 85% targets. In some embodiments, a removing step preserves at least 90% targets. In some embodiments, a removing step preserves at least 91% targets. In some embodiments, a removing step preserves at least 92% targets. In some embodiments, a removing step preserves at least 93% targets. In some embodiments, a removing step preserves at least 94% targets. In some embodiments, a removing step preserves at least 95% targets. In some embodiments, a removing step preserves at least 96% targets. In some embodiments, a removing step preserves at least 97% targets. In some embodiments, a removing step preserves at least 98% targets. In some embodiments, a removing step preserves at least 99% targets.

Methods for removing detectably labeled oligonucleotides are widely known in the art. In some embodiments, a removing step comprising degrading a detectably labeled oligonucleotide. In some embodiments, a detectably labeled oligonucleotide is removed by enzymatic digestion. In some embodiments, a removing step comprising contacting a plurality of detectably labeled oligonucleotides with an enzyme that digests a detectably labeled oligonucleotide. Suitable enzymes are widely used in the art. For example, depending on the type(s) of detectably labeled oligonucleotides and/or targets, either DNase or RNase can be used. In some embodiments, a detectably labeled oligonucleotide comprising a DNA sequence for detecting/quantifying a RNA target is digested by a DNase, e.g., DNase I. In some embodiments, a detectably labeled oligonucleotide comprising an RNA sequence for detecting/quantifying a DNA target is digested by a RNase. In some embodiments, a detectably labeled RNA oligonucleotide is used to target a DNA locus.

In some embodiments, a detectably labeled oligonucleotide interacts with its target through binding or hybridization to one or more intermediate, such as an oligonucleotide, that is bound, hybridized, or otherwise linked to the target. In some embodiments, a detectably labeled oligonucleotide interacts with a target through hybridization with an intermediate oligonucleotide hybridized to a target, wherein the intermediate oligonucleotide comprises a sequence complementary to the target, and a sequence complementary to the detectably labeled oligonucleotide (overhang). In some embodiments, a removing step removes detectably labeled oligonucleotides, optionally keeping intermediate oligonucleotides intact. In some embodiments, a removing step removes detectably labeled oligonucleotides and keeps intermediate oligonucleotides intact. In some embodiments, detectably labeled oligonucleotides differ from intermediates in a chemical or enzymatic perspective, so that detectably labeled oligonucleotides can be selectively removed.

In some embodiments, intermediate DNA oligonucleotides are used to hybridize against DNA loci, with an overhang (e.g., 20 nt) such that a bridge oligonucleotide comprising an RNA sequence and with complementary sequence (e.g., RNA bridge probe) can bind. An RNA bridge probe can be labeled directly with a dye or a HCR polymer (which can also be DNA). After imaging, RNase can be used to digest away the RNA bridge probes, while leaving the DNA probe intact hybridized on the DNA loci. Such a method provides multiple advantages. For example, subsequent contacting steps only involve RNA bridge probes hybridizing against DNA oligonucleotides with overhangs, and avoid getting double stranded DNA to melt and hybridize with DNA oligonucleotides, which can be a difficult process. Further, the overhang can be made to be the same for all DNA oligonucleotides (e.g., 20-40) targeting the same gene, so that only one type of RNA bridge probe is needed per gene per round of hybridization. To switch colors on different hybridization (contacting steps), one can change RNA bridge probes with a different label or different HCR polymer. DNA bridge probes that can be specifically removed, e.g., with a specific enzyme restriction site like EcoRI on the bridge or the HCR hairpins, can also be used. Incubating the cells with the appropriate nuclease can digest away all detectable moieties without affecting the DNA loci and/or the probe hybridized on them.

In some embodiments, detectably labeled oligonucleotides comprises 5' phosphorylation and can be degraded by Lambda exonuclease, while intermediate oligonucleotides are not 5'-phosphorylated and cannot be degraded by Lambda exonuclease.

In some embodiments, a detectably labeled oligonucleotide comprises uracil. In some embodiments, detectably labeled oligonucleotides contain uracil, and can be degraded by USER™ enzyme (New England BioLabs, Ipswich, Mass., MA, US), while intermediate oligonucleotides contain no uracil and cannot be degraded by USER™ enzyme.

In some embodiments, an oligonucleotide hybridized against an overhang of an intermediate oligonucleotide has a recessed 3'-end when hybridized against the overhang. Detectably labeled oligonucleotides with recessed 3'-end when hybridized against intermediate oligonucleotides can be selectively digested by Exonuclease III. Intermediate oligonucleotides which do not have recessed 3'-ends, or whose 3'-ends are in RNA-DNA duplexes, can be kept intact due to the much weaker activities of exonuclease III toward them.

In some embodiments, when an enzyme is involved, a removing step is performed at a temperature that produces optimal results. In some embodiments, a removing step is performed at about 37° C. In some embodiments, a removing step is performed at room temperature. In some embodiments, digestion with Lambda exonuclease is conducted at about 37° C. In some embodiments, digestion with USER™ enzyme is conducted at about 37° C. In some embodiments, digestion with USER™ enzyme is conducted at room temperature. In some embodiments, digestion with Exonuclease III is conducted at about 37° C. In some embodiments, digestion with Exonuclease III is conducted at room temperature.

In some embodiments, use of an intermediate oligonucleotide and an overhang sequence for detectably labeled oligonucleotide binding provides a variety of advantages. In some embodiments, kinetics of hybridization between an overhang sequence and a detectably labeled oligonucleotide is faster than that between an intermediate oligonucleotide and a target. In some embodiments, all intermediate oligonucleotides for a target comprise the same overhang sequence, and all detectably labeled oligonucleotides for a target comprises the same complementary sequence for binding to the same overhang sequence. In some embodiments, hybridization between a set of detectably labeled oligonucleotides and a set of intermediate oligonucleotides is up to about 20-40 times faster than that between a set of intermediate oligonucleotides and a set of targets. In some embodiments, hybridization between detectably labeled oligonucleotides and intermediate oligonucleotides can be done in 30 minutes, compared to, in some cases, up to about 12 hours for hybridization between intermediate oligonucleotides and targets.

In some embodiments, strand displacement is used in a removing step to remove a detectably labeled oligonucleotide. In some embodiments, heat is used to dissociate a detectably labeled oligonucleotide in a removing step.

In some embodiments, a removing step comprises photobleaching. In some embodiments, photobleaching destroys a dye, such as a fluorophore, of a detectably labeled oligonucleotide.

In some embodiments, a first and a second sets of detectably labeled oligonucleotides target different sequences of each target, and a removing step after a first imaging step is optional. For example, one strategy is to target the same RNA with different DNA probes (detectably labeled DNA oligonucleotides), such that the first plurality of probes can target one set of sequences on the RNA, and the second plurality of probes target a different set of sequences on the same RNA. On the first hybridization (contacting), the first plurality of probes is used. They can then be imaged and optionally photobleached or digested by DNase, or other methods of destroying either the oligos or the dyes. The second set of probes can be hybridized and imaged without interferences from the first set of probes.

In some embodiments, provided methods optionally comprise HCR, light sheet microscopy, CLARITY, or combinations thereof. In some embodiments, provided methods allow direct profiling of targets in a tissue, an organ or an organism. In some embodiments, an organ is a brain. In some embodiments, provided methods allow direct imaging of transcripts in intact brains or tissues. In some embodiments, provided methods further comprise HCR. In some embodiments, provided methods further comprise light sheet microscopy. In some embodiments, provided methods further comprise CLARITY.

Provided methods offer many advantages over methods prior to the present invention. For example, in some embodiments, provided methods provide high-throughput at reasonable cost.

(iii) a third oligonucleotide, optionally identical in sequence to the first oligonucleotide, targeting the first transcript or DNA locus and labeled with the first, the second or a third detectable moiety; and
(iv) a fourth oligonucleotide, optionally identical in sequence to the second oligonucleotide, targeting the second transcript or DNA locus, and labeled with the first, the second, the third or a fourth detectable moiety,
wherein either the third oligonucleotide is labeled with a different detectable moiety than the first oligonucleotide, or the fourth oligonucleotide is labeled with a different detectable moiety than the second oligonucleotide, or both.

In some embodiments, detectably labeled oligonucleotides targeting the same target (transcript or DNA locus) in a composition are labeled with moieties providing the same detectable signal, or detectable signals that cannot be differentiated in an imaging step. In some embodiments, detectably labeled oligonucleotides targeting the same target in a composition are labeled with the same detectable moiety.

In some embodiments, a detectable moiety is or comprises a fluorophore. In some embodiments, a detectable moiety is a fluorophore. Exemplary fluorophores are widely known and used in the art, for example but not limited to fluorescein, rhodamine, Alexa Fluors, DyLight fluors, ATTO Dyes, or any analogs or derivatives thereof.
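
As an illustrative aside, the two-step, two-label, 4-target process exemplified earlier (Steps 1 to 5) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions and is not part of the disclosed method: the names `LABELS`, `N_ROUNDS`, `assign_barcodes` and `decode` are hypothetical, and the sketch assumes that each target shows exactly one label per imaging step. It enumerates the F^N label sequences for F = 2 labels and N = 2 contacting steps, assigns one sequence to each target, and looks up a target from an observed readout.

```python
from itertools import product

LABELS = ["L1", "L2"]   # F = 2 detectably different labels (hypothetical names)
N_ROUNDS = 2            # N = 2 contacting/imaging steps


def assign_barcodes(targets, labels=LABELS, n_rounds=N_ROUNDS):
    """Map each target to a unique label sequence, one label per round."""
    codes = list(product(labels, repeat=n_rounds))  # all F**N sequences
    if len(targets) > len(codes):
        raise ValueError("more targets than available barcodes (F**N)")
    return dict(zip(targets, codes))


def decode(observed, barcodes):
    """Return the target whose barcode equals the observed label sequence."""
    lookup = {code: target for target, code in barcodes.items()}
    return lookup.get(tuple(observed))  # None when no barcode matches


barcodes = assign_barcodes(["T1", "T2", "T3", "T4"])
for target, code in barcodes.items():
    # Print each barcode in the bracketed style used in the example.
    print(target, "".join(f"[{label}]" for label in code))
```

With this enumeration order the assignment matches the barcodes listed in the example (T1 reads [L1][L1] and T4 reads [L2][L2]). Adding labels or contacting steps increases the number of available barcodes as F^N, which is why the sketch raises an error when more targets are requested than sequences exist.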
In some embodiments, provided methods provide direct probing of target without transformation or amplification of a target. In some embodiments, provided methods enable quick scale up without the requirement of a large number of detectable labels. In some embodiments, provided methods can apply multiple labels to the same target and therefore increase signal intensity. In some embodiments, provided methods provide a combination of the advantages.

In some embodiments, the present invention provides compositions comprising a plurality of detectably labeled oligonucleotides, for, e.g., use in provided methods. Exemplary compositions include but are not limited to those described in exemplary method embodiments herein.

In some embodiments, the present invention provides compositions comprising a plurality of detectably labeled oligonucleotides, each of which targets a nucleic acid and is labeled with a detectable moiety, so that the composition comprises at least:
(i) a first oligonucleotide targeting a first nucleic acid and labeled with a first detectable moiety; and
(ii) a second oligonucleotide targeting a second nucleic acid and labeled with a second detectable moiety.

In some embodiments, the present invention provides compositions comprising a plurality of detectably labeled oligonucleotides, each of which targets a transcript or DNA locus and is labeled with a detectable moiety, so that the composition comprises at least:
(i) a first oligonucleotide targeting a first transcript or DNA locus and labeled with a first detectable moiety; and
(ii) a second oligonucleotide targeting a second transcript or DNA locus and labeled with a second detectable moiety.

In some embodiments, the present invention provides kits comprising a plurality of detectably labeled oligonucleotides, each of which targets a transcript or DNA locus and is labeled with a detectable moiety, so that the kit comprises at least:
(i) a first oligonucleotide targeting a first transcript or DNA locus and labeled with a first detectable moiety;
(ii) a second oligonucleotide targeting a second transcript or DNA locus and labeled with a second detectable moiety.

In some embodiments, a first and a second detectably labeled oligonucleotides target different target. In some embodiments, a first and a second detectably labeled oligonucleotides target the same target. In some embodiments, detectably labeled oligonucleotides in a composition or a kit targets two or more targets, e.g., transcripts and/or DNA loci. In some embodiments, detectably labeled oligonucleotides in a composition or a kit targets two or more transcripts. In some embodiments, detectably labeled oligonucleotides in a composition or a kit targets two or more DNA loci. In some embodiments, detectably labeled oligonucleotides in a composition or kit targets at least 4 targets. In some embodiments, detectably labeled oligonucleotides in a composition or kit targets at least 9 targets. In some embodiments, detectably labeled oligonucleotides in a composition or kit targets at least 16 targets. In some embodiments, detectably labeled oligonucleotides in a composition or kit targets at least 25 targets. In some embodiments, detectably labeled oligonucleotides in a composition or kit targets at least 36 targets. In some embodiments, detectably labeled oligonucleotides in a composition or kit targets at least 50 targets. In some embodiments, detectably labeled oligonucleotides in a composition or kit targets at least 100 targets. In some embodiments, detectably labeled oligonucleotides in a composition or kit targets at least 200 targets. In some embodiments, detectably labeled oligonucleotides in a composition or kit targets at least 500 targets. In some embodiments, detectably labeled oligonucleotides in a composition or kit targets at least 1,000 targets. In some embodiments, detectably labeled oligonucleotides in a composition or kit targets at least 5,000 targets. In some embodiments, detectably labeled oligonucleotides in a composition or kit targets at least 10,000 targets. In some embodiments, detectably labeled oligonucleotides in a composition or kit targets at least 50,000 targets. In some embodiments, detectably labeled oligonucleotides in a composition or kit targets at least 100,000 targets. In some embodiments, detectably labeled oligonucleotides in a composition or kit targets at least 1,000,000 targets.

In some embodiments, a first and a second oligonucleotides have different oligonucleotide sequences. In some embodiments, a first and a second detectable moieties are different. In some embodiments, a first and a second detectable moieties are the same.

In some embodiments, a first and a second oligonucleotides share less than 5% sequence identity. In some embodiments, a first and a second oligonucleotides share less than 10% sequence identity. In some embodiments, a first and a second oligonucleotides share less than 20% sequence identity. In some embodiments, a first and a second oligonucleotides share less than 30% sequence identity. In some embodiments, a first and a second oligonucleotides share less than 40% sequence identity. In some embodiments, a first and a second oligonucleotides share less than 50% sequence identity. In some embodiments, a first and a second oligonucleotides share less than 60% sequence identity. In some embodiments, a first and a second oligonucleotides share less than 65% sequence identity. In some embodiments, a first and a second oligonucleotides share less than 68% sequence identity. In some embodiments, a first and a second oligonucleotides share less than 70% sequence identity. In some embodiments, a first and a second oligonucleotides share less than 80% sequence identity. In some embodiments, a first and a second oligonucleotides share less than 90% sequence identity.

In some embodiments, each oligonucleotide shares less than 5% sequence identity with any other oligonucleotide. In some embodiments, each oligonucleotide shares less than 10% sequence identity with any other oligonucleotide. In some embodiments, each oligonucleotide shares less than 20% sequence identity with any other oligonucleotide. In some embodiments, each oligonucleotide shares less than 30% sequence identity with any other oligonucleotide. In some embodiments, each oligonucleotide shares less than 40% sequence identity with any other oligonucleotide. In some embodiments, each oligonucleotide shares less than 50% sequence identity with any other oligonucleotide. In some embodiments, each oligonucleotide shares less than 55% sequence identity with any other oligonucleotide. In some embodiments, each oligonucleotide shares less than 60% sequence identity with any other oligonucleotide. In some embodiments, each oligonucleotide shares less than 65% sequence identity with any other oligonucleotide. In some embodiments, each oligonucleotide shares less than 68% sequence identity with any other oligonucleotide. In some embodiments, each oligonucleotide shares less than 70% sequence identity with any other oligonucleotide. In some embodiments, each oligonucleotide shares less than 80% sequence identity with any other oligonucleotide. In some embodiments, each oligonucleotide shares less than 90% sequence identity with any other oligonucleotide.

In some embodiments, a composition or kit comprises two or more detectably labeled oligonucleotides targeting the same target. In some embodiments, 5, 10, 20, 30, 40, 50 or more detectably labeled oligonucleotides target the same target.

Detectably labeled oligonucleotides can be of various suitable lengths. In some embodiments, a detectably labeled oligonucleotide is 15 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is 16 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is 17 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is 18 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is 19 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is 20 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is 21 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is 22 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is 23 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is 24 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is 25 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is 26 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is 27 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is 28 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is 29 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is 30 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is at least 15 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is at least 16 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is at least 17 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is at least 18 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is at least 19 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is at least 20 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is at least 21 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is at least 22 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is at least 23 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is at least 24 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is at least 25 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is at least 26 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is at least 27 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is at least 28 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is at least 29 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is at least 30 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is at least 35 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is at least 40 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is at least 50 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is about 15-25 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is about 20-30 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is about 25-35 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is about 30-40 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is about 35-45 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is about 40-50 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is about 15-30 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is about 20-30 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is about 15-35 base pairs in length. In some embodiments, a detectably labeled oligonucleotide is about 20-35 base pairs in length.

In some embodiments, a plurality of detectably labeled oligonucleotides contains two detectable moieties. In some embodiments, a plurality of detectably labeled oligonucleotides contains three detectable moieties. In some embodiments, a plurality of detectably labeled oligonucleotides contains four detectable moieties. In some embodiments, a plurality of detectably labeled oligonucleotides contains five detectable moieties. In some embodiments, a plurality of detectably labeled oligonucleotides contains six detectable moieties. In some embodiments, a plurality of detectably labeled oligonucleotides contains seven detectable moieties. In some embodiments, a plurality of detectably labeled oligonucleotides contains eight detectable moieties. In some embodiments, a plurality of detectably labeled oligonucleotides contains nine detectable moieties. In some embodiments, a plurality of detectably labeled oligonucleotides contains ten detectable moieties.

In some embodiments, a plurality of detectably labeled oligonucleotides comprises at least two detectable moieties. In some embodiments, a plurality of detectably labeled oligonucleotides comprises at least three detectable moieties. In some embodiments, a plurality of detectably labeled oligonucleotides comprises at least four detectable moieties. In some embodiments, a plurality of detectably labeled oligonucleotides comprises at least five detectable moieties. In some embodiments, a plurality of detectably labeled oligonucleotides comprises at least six detectable moieties. In some embodiments, a plurality of detectably labeled oligonucleotides comprises at least seven detectable moieties. In some embodiments, a plurality of detectably labeled oligonucleotides comprises at least eight detectable moieties. In some embodiments, a plurality of detectably labeled oligonucleotides comprises at least nine detectable moieties. In some embodiments, a plurality of detectably labeled oligonucleotides comprises at least ten detectable moieties.

In some embodiments, a composition further comprises:
(iii) a third oligonucleotide, optionally identical in sequence to the first oligonucleotide, targeting the first transcript or DNA locus; and
(iv) a fourth oligonucleotide, optionally identical in sequence to the second oligonucleotide, targeting the second transcript or DNA locus
wherein either the third oligonucleotide is labeled with a different detectable moiety than the first oligonucleotide, or the fourth oligonucleotide is labeled with a different detectable moiety than the second oligonucleotide, or both.

In some embodiments, a third oligonucleotide is identical in sequence to a first oligonucleotide. In some embodiments, a third oligonucleotide comprises a sequence overlapping with a first oligonucleotide. In some embodiments, a third oligonucleotide has less than 50% sequence identity with a first oligonucleotide. In some embodiments, a third oligonucleotide has less than 40% sequence identity with a first oligonucleotide. In some embodiments, a third oligonucleotide has less than 30% sequence identity with a first oligonucleotide. In some embodiments, a third oligonucleotide has less than 20% sequence identity with a first oligonucleotide. In some embodiments, a third oligonucleotide has less than 10% sequence identity with a first oligonucleotide. In some embodiments, a third oligonucleotide has less than 5% sequence identity with a first oligonucleotide.

In some embodiments, a fourth oligonucleotide is identical in sequence to a second oligonucleotide.

20% sequence identity with a second oligonucleotide. In some embodiments, a fourth oligonucleotide has less than 10% sequence identity with a second oligonucleotide. In some embodiments, a fourth oligonucleotide has less than 5% sequence identity with a second oligonucleotide.

In some embodiments, a third oligonucleotide is labeled with a different detectable moiety than the first oligonucleotide. In some embodiments, a fourth oligonucleotide is labeled with a different detectable moiety than the second oligonucleotide.

In some embodiments, amount of a detectably labeled oligonucleotide in a plurality, composition, kit or method is pre-determined. In some embodiments, amounts of 5% detectably labeled oligonucleotides in a plurality, composition, kit or method are pre-determined. In some embodiments, amounts of 10% detectably labeled oligonucleotides in a plurality, composition, kit or method are pre-determined. In some embodiments, amounts of 20% detectably labeled oligonucleotides in a plurality, composition, kit or method are pre-determined. In some embodiments, amounts of 30% detectably labeled oligonucleotides in a plurality, composition, kit or method are pre-determined. In some embodiments, amounts of 40% detectably labeled oligonucleotides in a plurality, composition, kit or method are pre-determined. In some embodiments, amounts of 50% detectably labeled oligonucleotides in a plurality, composition, kit or method are pre-determined. In some embodiments, amounts of 60% detectably labeled oligonucleotides in a plurality, composition, kit or method are pre-determined. In some embodiments, amounts of 70% detectably labeled oligonucleotides in a plurality, composition, kit or method are pre-determined. In some embodiments, amounts of 80% detectably labeled oligonucleotides in a plurality, composition, kit or method are pre-determined. In some embodiments, amounts of 90% detectably labeled oligonucleotides in a plurality, composition, kit or method are pre-determined.

In some embodiments, amounts of at least 5 detectably labeled oligonucleotides in a plurality, composition, kit or method are pre-determined. In some embodiments, amounts of at least 10 detectably labeled oligonucleotides in a plurality, composition, kit or method are pre-determined. In some embodiments, amounts of at least 20 detectably labeled oligonucleotides in a plurality, composition, kit or method are pre-determined. In some embodiments, amounts of at least 30 detectably labeled oligonucleotides in a plurality, composition, kit or method are pre-determined. In some embodiments, amounts of at least 40 detectably labeled oligonucleotides in a plurality, composition, kit or method are pre-determined. In some embodiments, amounts of at least 50 detectably labeled oligonucleotides in a plurality, composition, kit or method are pre-determined. In some embodiments, amounts of at least 60 detectably labeled oligonucleotides in a plurality, composition, kit or method are pre-determined. In some embodiments, amounts of at least 70 detectably labeled oligonucleotides in a plurality, composition, kit or method are pre-determined. In
In some some embodiments , amounts of at least 80 detectably embodiments, a fourth oligonucleotide comprises a labeled oligonucleotides in a plurality, composition , kit or sequence overlapping with a second oligonucleotide . In 60 method are pre -determined . In some embodiments , amounts some embodiments , a fourth oligonucleotide has less than of at least 90 detectably labeled oligonucleotides in a 50 % sequence identity with a second oligonucleotide. In plurality , composition , kit or method are pre - determined. In some embodiments , a fourth oligonucleotide has less than some embodiments , amounts of at least each detectably 40 % sequence identity with a second oligonucleotide. In labeled oligonucleotides in a plurality , composition , kit or some embodiments , a fourth oligonucleotide has less than 65 method is pre -determined . 30 % sequence identity with a second oligonucleotide. In In some embodiments , two or more detectably labeled some embodiments , a fourth oligonucleotide has less than oligonucleotides are provided for one target. In some US 10,510,435 B2 39 40 embodiments , total amount of all detectably labeled oligo clease sites is used . In some embodiments , the two flanking nucleotides for a target is pre -determined . In some embodi endonuclease sites are different. In some embodiments , two ments , total amount of all detectably labeled oligonucle nicking endonucleases , each of which independently corre otides for a target is pre- determined , wherein the amount of sponds to a nicking endonuclease site , are used . each of the detectably labeled oligonucleotide for the target 5 In some embodiments , oligonucleotides of provided tech is independently and optionally pre- determined . In some nologies are generated from oligonucleotide pools . In some embodiments , total amount of all detectably labeled oligo embodiments , such pools are available commercially . 
An nucleotides for each of a plurality of targets is independently initial DNA oligonucleotide pool in some embodiments pre -determined . In some embodiments , a plurality of targets consists of up to 12,000 or more different single stranded has at least two targets . In some embodiments , a plurality of 10 sequences organized into subsets . Each sequence is designed targets has at least five targets . In some embodiments , a such that nicking endonuclease sites and a forward and plurality of targets has at least 10 targets . In some embodi reverse primer sequence flank a desired sequence ( e.g. , a ments , a plurality of targets has at least 50 targets . In some probe sequence ). The forward and reverse primer sequences embodiments , a plurality of targets has at least 100 targets . specify to which subset with the desired sequence belongs. In some embodiments , a plurality of targets has at least 500 15 The primer pair can be used to amplify the subset using targets . In some embodiments , a plurality of targets has at polymerase chain reaction (PCR ) . The product of the PCR least 1,000 targets . reaction is isolated and digested by the nicking endonu In some embodiments , a target of a plurality , composition , cleases . The incubation time with the nicking enzyme varies kit or method is pre -determined . In some embodiments , at based on the amount of enzyme used and the amount of least 10 targets of a plurality , composition, kit ormethod are 20 DNA recovered . In some embodiments, about 10 units of pre - determined . In some embodiments , at least 50 targets of enzyme digest about 1 ug of DNA in about 1 hour. The a plurality , composition , kit or method are pre -determined . sample is then purified and reconstituted in a buffer , e.g., 2x In some embodiments , at least 100 targets of a plurality , loading buffer ( 96 % formamide/ 20 mM EDTA ) and water to composition , kit or method are pre -determined . 
In some make a final loading buffer (48 % formamide/ 10 mm embodiments , at least 1,000 targets of a plurality , composi- 25 EDTA ) , and denatured , e.g., by heating to 95 ° C. to com tion , kit or method are pre - determined . In some embodi pletely denature the DNA . The denatured DNA is purified ments , up to FN targets of a plurality , composition , kit or and the desired product isolated . In some embodiments , method are pre -determined , wherein F is the number of purification and / or isolation comprise electrophoresis . An detectable moieties in a pluralities, and N is the number of exemplary process is illustrated in FIG . 25 . imaging steps . 30 In some embodiments , the present invention provides a Methods for synthesizing detectably labeled oligonucle method for preparing a target nucleic acid having first otides are widely known and practiced in the art, for sequence, comprising steps of: example, see Lubeck , E. & Cai, L. Nat. Methods 9 , 743-48 1 ) providing a first nucleic acid comprising the first (2012 ) . Oligonucleotides are also commercially available sequence or its complimentary sequence , wherein the first from various vendors. In some embodiments , the present 35 sequence or its complementary sequence is flanked by at invention provides methods for preparing detectably labeled least one restriction site ; oligonucleotides. In some embodiments , the present inven 2 ) amplifying the first nucleic acid or part of the first nucleic tion provides methods for preparing intermediate oligo acid to provide a second nucleic acid comprising the first nucleotides . In some embodiments , the present invention sequence and the at least one flanking restriction site ; and provides methods for preparing bridge oligonucleotides . 
40 3 ) contacting the second nucleic acid with a restriction In some embodiments , the present invention provides enzyme corresponding to the at least one flanking restriction methods for preparing a target nucleic acid having a first site to provide a third nucleic acid comprising a recessed sequence, comprising steps of: end ; 1 ) providing a first nucleic acid comprising the first 4 ) contacting the third nucleic acid with a nuclease to sequence, wherein the first sequence is flanked by nicking 45 selectively digest the strand comprising the complementary endonuclease sites at both ends ; sequence , if any , while keeping the strand comprising the 2 ) amplifying the first nucleic acid or part of the first nucleic first sequence . acid to provide a second nucleic acid comprising the first In some embodiments , the first sequence or its comple sequence and the flanking nicking endonuclease sites ; and mentary sequence is independently flanked by a restriction 3 ) contacting the second nucleic acid with one or more 50 site at each end . nicking endonuclease corresponding to the flanking nicking In some embodiments , the present invention provides a endonuclease sites . method for preparing a target nucleic acid having a first In some embodiments , a target nucleic acid having a first sequence , comprising steps of: sequence is single - stranded . In some embodiments , an 1 ) providing a first nucleic acid comprising the first amplifying step comprises polymerase chain reaction 55 sequence or its complimentary sequence , wherein the first (PCR ) . In some embodiments , provided methods further sequence or its complementary sequence is flanked by comprise a step of denaturing, wherein double - stranded restriction sites at both ends; second nucleic acid is denatured and the two strands become 2 ) amplifying the first nucleic acid or part of the first nucleic single - stranded . 
In some embodiments , provided methods acid to provide a second nucleic acid comprising the first further comprise isolating the nucleic acid having a first 60 sequence and the flanking restriction sites ; and sequence. In some embodiments , a second nucleic acid is 3 ) contacting the second nucleic acid with restriction optionally modified before contacting with nicking endonu enzymes corresponding to the flanking restriction sites to cleases. In some embodiments , provided methods further provide a third nucleic acid comprising a recessed end ; comprise labeling a nucleic acid having a first sequence . 4 ) contacting the third nucleic acid with a nuclease to In some embodiments , the two flanking endonuclease 65 selectively digest the strand comprising the complementary sites are the same. In some embodiments , one nicking sequence , if any, while keeping the strand comprising the endonuclease corresponding to the same nicking endonu first sequence . US 10,510,435 B2 41 42 In some embodiments , a target nucleic acid having a first methods provides significant advantages for diagnosis , treat sequence is single -stranded . In some embodiments , an ment monitoring and patient stratification . amplifying step comprises PCR . In some embodiments , In some embodiments , provided technologies optionally provided methods further comprise isolating the nucleic acid comprises profiling proteins , neural activities, and / or struc having a first sequence. In some embodiments , a second 5 tural arrangements . In some embodiments , provided meth nucleic acid is optionally modified before contacting with ods comprise profiling proteins in the same sample . In some restriction enzymes . In some embodiments , a third nucleic embodiments , provided methods comprise profiling neural acid is optionally modified before contacting with a nucle activities in the same sample . In some embodiments , pro ase . 
In some embodiments , a nuclease is exonuclease III , vided method comprise profiling structural arrangement. which preferentially degrade a strand with 3 '- recessed ends, 10 As exemplified herein , provided technologies work for a and can preserve a strand with a 5 ' recessed ends. In some wide variety of samples . For example , HCR -seqFISH embodiments , a restriction enzyme creates a 5 '- recessed end . worked in brain slices and that SPIMs can robustly detect In some embodiments , a restriction enzyme creates a 3 '- re single mRNAs in CLARITY brain slices. In some embodi cessed end . In some embodiments , the complementary ments , provided technologies are useful for profiling targets sequence has a 3' recessed end after restriction digestion . In 15 in mouse models of neurodegenerative diseases , or human some embodiments , the strand comprising the complemen brains. No other technology prior to the present invention tary sequence has a 3 ' recessed end after restriction diges can deliver the same quality and quantity of data . tion , and the strand comprising a first sequence has a 5 ' Overall Process recessed end after restriction digestion . In some embodi FIG . 27A illustrates general aspects of a sequential ments , provided methods further comprise labeling a nucleic 20 hybridization analysis that may contribute to quality of the acid having a first sequence . analysis . Sequential hybridization includes multiple rounds In some embodiments , single stranded oligonucleotides , of hybridization , where each round of hybridization is a e.g. , probes for seqFISH or intermediate oligonucleotides , multiple step process. Errors can be introduced at any step can be generated using nuclease digestion , such as exoIII during any round of hybridization . Such errors can lead to nuclease digestion . Instead of two nick sites on the ampli- 25 misidentification of target genes in a sample . 
fication ( e.g., PCR ) products , two restriction sites can be Prior to hybridization , samples that will be subject to used flanking the probe and /or adaptor sequence . In some analysis are processed . The main purpose of such processing embodiments , one restriction site leaves a 3 ' recessed end is to immobilize target molecules ; for example , mRNAs , while the other leaves a 5 ' recessed ends. For example , chromosomal DNAs, and proteins. It is essential that the EcoRI and BamHI leave 5 ' recessed ends, while Bmth and 30 target molecules remain spatially fixed through different Pacl leave 3 ' recessed ends. Such restriction enzymes are rounds of hybridization . widely known and used in the art . Exonuclease III degrades Probe design contributes to specificity of binding between the 3 ' recessed ends preferentially , and preserve the strand the probes and target sequences. It is possible to apply with the 5 ' recessed ends. This provides another mechanism hybridization chain reaction to allow multiple probes to bind to generate single stranded probes from oligonucleotide 35 at the same target sequence to amplify detectable signals . pools using PCR and restriction nucleases. Additionally , as illustrated in FIGS. 28 and 29 , it is possible In some embodiments , a provided target nucleic acid is to insert a cleavable linker between the binding sequence DNA . In some embodiments , a target nucleic acid has the ( that binds a target sequence ) and signal moiety ( that emits same sequence a first sequence . In some embodiments , a visible signals ) of a probe . Here , error can be reduced target nucleic acid is an intermediate oligonucleotide, com- 40 because no removal of probes is needed for the next round prising a first sequence that hybridizes to a target , e.g., a of hybridization . Instead, only visible signals are switched . transcript or a DNA locus, and a second sequence that Barcodes implemented during the analysis are unique . 
hybridizes to a second oligonucleotide, e.g., a detectably Nonspecific binding or other mistakes can render the results labeled oligonucleotide . In some embodiments , a target from one or more rounds of hybridization unreliable . A nucleic acid is an intermediate oligonucleotide, comprising 45 simple solution is to remove data that are unreliable . How a first sequence that hybridizes to a target , and a second ever , if data from one or more rounds of hybridization are sequence that hybridizes with a detectably labeled oligo eliminated from analysis , some of the barcodes would nucleotide labeled by HCR . In some embodiments , a target become indistinguishable from each other. nucleic acid is a bridge probe . During and after hybridization of probes to target In some embodiments, provided methods are used for 50 sequences, there are also aspects that are important for diagnosis of a disease , wherein the disease is related to an improving the quality of the sequential analysis , including abnormal number of a transcript or a DNA locus. In some hybridization , image collection , signal removal and re embodiments , provided methods are used for selecting sub hybridization and data analysis . jects for a treatment. In some embodiments , provided meth Barcodes and Error Correction ods are used for monitoring a treatment regimen. In some 55 In one aspect , disclosed herein are methods for designing embodiments , a cell in providemethods is from a subject . In barcodes with built - in error correction mechanisms such that some embodiments , a cell in provide methods is a mamma the multi - component barcodes can withstand the loss of the lian cell. In some embodiments , a cell in provide methods is data from one or more rounds of hybridization ( i.e., drop a human cell. In some embodiments , a cell in provide safe ) . As disclosed herein the terms “ barcode ” and “ code " methods is from a subject. In some embodiments, a cell in 60 are used interchangeably . 
provide methods is from an animal . In some embodiments , As disclosed herein , by using probes that are associated a cell in provide methods is from a human subject. In some with F detectable visual signals (F22 ) , a sequential hybrid embodiments , a cell in provide methods is isolated from a ization of N rounds (N22 ) can generate a total of FN human subject . In some embodiments , a cell in provide combinations of visual signals . In some embodiments , these methods is from a diseased tissue, or a tissue that is 65 combinations of visual signals can be used as barcodes to susceptible to a disease . Being capable of detecting and uniquely identify cellular targets such as mRNA , DNA , or quantifying a number of targets at the same time, provided even protein . US 10,510,435 B2 43 44 FIG . 27B illustrates an exemplary process 2700 for gen mRNA1. Similarly , binding signals can be missing or erating drop safe barcodes . ambiguous for a particular location during hybridization At step 2710 , the total number of genes that will be round 2 , which can produce an incomplete three letter analyzed during the hybridization experiments is deter barcode R - * -R for the particular location , where * remains mined . This number sets the threshold values for the number 5 undetermined . Once again , the identity of * is not needed to of detectable visual signals ( F ) and the total number of decipher that the code is for mRNA3 . rounds in the sequential hybridization ( N ) . Additionally , data from repeat rounds can validate each Once the total number of genes is determined , steps 2720 other. For example , in FIG . 2C , a circle highlights a cyan and 2730 are performed simultaneously . The number of data point in the image corresponding to hybridization round genes being analyzed must be smaller than the total number 10 ization2. In the round same 3 location reveals , athe yellow image data corresponding point. 
Based to onhybrid only of possible combinations of visual signals (FM ) . Practical information from hybridization rounds 2 and 3 , this location aspects of the hybridization analysis need to be considered would be identified as part of mRNA1. However, no signals when selecting values for F and N.One would tend to reduce are identified at the location during hybridization round 1 , the number of rounds of hybridization to as few as possible. 15 which suggests that the highlighted data points may be due Theoretically , this can be achieved by using a high number to non - specific binding . of detectable visual signals ( F ) . In practice, however, too In some embodiments , a sophisticated barcode generating many different types of visual signals may interfere with algorithm is used such that the resulting barcodes can each other. For example , overlapping of visual signals can withstand the loss of any round or even multiple rounds of lead to barcode misidentification . 20 hybridization data . In some embodiments , a barcode gen At step 2740, a library of drop - safe unique barcodes are erator is used to generate the drop - safe barcodes . For generated by implementing one or more error correction example , FIG . 29 illustrates an example , where probes with mechanisms. 5 different visual signals (blue , green , red , purple and In some embodiments , a repeat round can be performed yellow ) are used in 4 rounds of hybridization . One of the for any round during a sequential hybridization of N rounds, 25 hybridization round is an error correction round where rendering a new sequential hybridization of (N + 1) rounds. barcodes are generated based on barcodes from the previous The extra repeat round can be an error correction round . The 3 rounds. The following is an example that illustrates how repeat round can be a duplicate of any round of the n rounds barcodes are generated . sequential hybridization . 
The repeat round can take place as Designing an error correction code to correct for m any round during the sequential hybridization ( N + 1 ) rounds . 30 number of errors in a message of n length is analogous to After the repeat , there are two rounds ofhybridization that packing as many spheres of radius m in a n dimensional should be identical to each other. Consequently , the com cube . There are examples of “ perfect codes” such as Golay plete loss of one of the repeat rounds does not affect the and Hamming codes that can be as efficient as possible in outcome of the sequential hybridization . As such , either of this packing design . These perfect codes are important in the repeat rounds is a drop - safe round . 35 digital communication because the word lengths are long, up FIG . 2 illustrates an experiment where 3 rounds of hybrid to billions of letters for gigabytes of data, and many forms ization using probes with 4 types of detectable visual signals of errors can occur, including deletion and insertions . How ( red: R , yellow : Y , green : G , and cyan : C ) are used to create ever, in the seqFISH experiments , as the code lengths are barcodes for 4 different mRNA molecules. Hybridization short , a perfect code correction system is not necessary , round 3 is a repeat of hybridization round 1, as summarized 40 especially as the " correct ” codes are already defined . One of in Table 1 below . the major source of error is deletions due to loss of a hybridization . Thus, it is possible to design simple correc TABLE 1 tion schemes that are not completely efficient (i.e. obtain the tightest packing density for the n - spheres) but can achieve Illustration of the effect of repeat hybridization rounds. 45 good error correction with just a few extra rounds of Color barcodes Color barcodes Color barcodes hybridization. 
mRNA molecules (3 rounds ) ( dropping round 1 ) (dropping round 3 ) To design a barcode scheme that can tolerate loss of a mRNA1 Y - C - Y C - Y Y - C single round ofhybridization is akin to a problem where any mRNA2 G - R - G R - G G - R n - dimensional hypercube is collapsed by 1 dimension to a mRNA3 R - C - R C - R R - C 50 n - 1 dimensional hypercube without having any two points mRNA4 C - R - C R - C C - R on the n - dimensional hypercube mapping to the same point. In order for this to be true , no two barcodes can be connected As shown in the table above , data from one of the repeat by a 1D line running parallel to any of the axes. There are rounds can be dropped completely in case ofmajor experi many solutions to generate this 1 round loss tolerant code . ment error, barcodes derived from the remaining rounds of 55 In this example , 4 rounds of hybridization is used . Here, hybridization still uniquely represent the mRNA molecules. 5 different visual signals (blue , green , red , purple and In some embodiments, even in a questionable hybridiza yellow ) are assigned numerical values. In some embodi tion round , most of the information is still reliable . Only ments , the numerical values are integers . For example , some of the bindings between probes and target sequences blue = l ; green = 2 ; red = 3 ; purple = 4 ; and yellow = 5 . It would include inaccurate information . In some embodiments, par- 60 be understood that these are mere sample values. Any tial data from a questionable round of hybridization are used . non - redundant numerical values can be assigned to represent For example , in the illustration above , binding signals can be the different types of visual signals . In some embodiments , missing or ambiguous for a particular location during a barcode generator is used to generate the barcodes used in hybridization round 1, which can produce an incomplete the experiment. 
In the exemplary embodiment, a drop -safe three letter barcode * -C - Y for the particular location ,where 65 barcode for a particular target gene can be defined as a * remains undetermined . In the scheme illustrated , the four- component linear array : { i, ( i + j + k )mod 5 , j, k ) . Here , identity of * is not needed to decipher that the code is for mod (modulo operation or modulus ) finds the remainder US 10,510,435 B2 45 46 after division of one number by another . For example , 8 mod As disclosed herein , array ( 1 ) consists of n - component, 5 is 3. 5 mod 5 is 0 , which is equivalent to 5 . each corresponding to the visual signals from a particular In this example , i represents the numerical values corre round of hybridization . In some embodiments , probes bind sponding to the visual signals observed for the particular ing to a particular gene are all associated with the same target gene during the first round of hybridization . (i + j + k ) 5 detectable visual signal, for example , red , green or blue. In some embodiments , probes binding to a particular gene are mod 5 represents the numerical values corresponding to the all associated with multiple types of detectable visual signal , visual signals observed for the particular target gene during for example , green + yellow or blue + red. Through combina the second round of hybridization . j represents the numerical tions of visual signals , the total number of different types of values corresponding to the visual signals observed for the detectable visual signals can be further expanded . particular target gene during the third round of hybridiza- 10 In some embodiments , barcodes can be designed such that tion . k represents the numerical values corresponding to the drop or loss of data from two rounds ofhybridization can be visual signals observed for the particular target gene during tolerated . Using 2 additional rounds of hybridization does the found round of hybridization . 
not correct for all possible 2 drops, but it does correct for a large fraction of the 2 drops.

In this example, i, j, and k each can be 1, 2, 3, 4 or 5, or any one of the numerical values that have been assigned to the five types of visual signals used in the experiment.

In this example, (i + j + k) mod 5 is determined as the error correction round. However, once complete barcodes are generated, any of round 1 through round 4 can be dropped to yield unique 3-component barcodes. As such, the barcodes determined by this method can be used to correct errors in any round.

The following table illustrates how the 1 drop tolerant barcodes can be generated using the equation (i + j + k) mod 5.

TABLE 2
Illustration of the effect of repeat hybridization rounds.

Genes      1st round    2nd round    3rd round    4th round of hyb
           of hyb*      of hyb       of hyb       (i + j + k) mod 5
mRNA1      1            2            4            2
mRNA2      ???          3            1            2
mRNA3      5            1            2            3
mRNA4      2            3            5
...
mRNA125    5            2            1            3

* The term "hyb" stands for hybridization. Numerical values are assigned to color signals as follows: blue = 1; green = 2; red = 3; purple = 4; and yellow = 5.

As illustrated above, although the 4th round of hybridization is generated using an error correction algorithm, any one round of the four rounds of hybridization in Table 2 can be dropped and still yield a unique set of barcodes for 125 genes.

More generally, a barcode that can resist the elimination of one round of hybridization can be defined as:

    {j_1, j_2, ..., (a_1*j_1 + a_2*j_2 + ... + a_n*j_n + C) mod F, ..., j_n}    (1)

where j_1 is a numerical value that corresponds to the detectable visual signals used in the first round of hybridization, j_2 is a numerical value that corresponds to the detectable visual signals used in the second round of hybridization, and j_n is a numerical value that corresponds to the detectable visual signals used in the nth round of hybridization. In some embodiments, j_1, j_2, ..., j_n are non-redundant integers. In some embodiments, a_1, a_2, ..., a_n can be any non-zero integers. In some embodiments, C is a constant integer. In some embodiments, C is zero. The remainder of F divided by F is 0 (F mod F = 0), so F and 0 are equivalent. There is no limitation on the number of hybridizations. One such example is shown in FIG. 37.

Array (1) is a general representation of a barcode that is safe against the drop or loss of one round of hybridization. Although (a_1*j_1 + a_2*j_2 + ... + a_n*j_n + C) mod F is the designated error correction round, in some embodiments, the barcode is safe against the loss or drop of any round of hybridization.

For example, for detecting 100 genes with F = 5 dyes, 3 rounds of hybridization are needed for basic barcoding of these genes. When adding two rounds of hybridization, the error correction code is:

    {i, j, k, (i + j + k) mod F, (i - j) mod F}    (2)

Such codes can correct for all 2 drops except dropping hybridization round 3 and round 4 together. Here, each component in the 5-member array represents one round of hybridization.

Similarly, an error correction code such as

    {i, j, k, (i + j + k) mod F, (i - k) mod F}    (3)

can correct for all 2 drops except dropping hybridization round 2 and hybridization round 4 together. Again, each component in the 5-member array represents one round of hybridization.

For example, to code for most of the transcriptome, only 6 rounds of hybridization are needed when F = 5 (5^6 = 15,625). When adding two rounds of hybridization, the following error correction code is generated:

    {i, j, k, l, m, n, (i + k + l + m + n) mod F, (i - j - k - l + n) mod F}    (4)

There are a total of 28 combinations of how 2 rounds of hybridization can be lost or dropped. This type of code can correct for 24 out of the total 28 combinations. Here, each component in the 8-member array represents one round of hybridization. Similarly, the 1st error correction round can be any linear combination of 5 out of 6 rounds of hybridization (e.g., without j) and the 2nd error correction round can be a subset of the linear combination of 5 out of 6 rounds of hybridization (e.g., without m). In these embodiments, the indices of the 2nd error correction round can include different coefficients, as long as they are not exactly the same 5 indices used in the 1st error correction round.

To correct fully for all combinations of drop or loss of 2 rounds of hybridization (2 drops), 3 additional hybridizations are needed. Again for 6 rounds of hybridization with 5 types of detectable signals (F = 5), three extra rounds of hybridization are added to create the full 9-member error correction code:

    {i, j, k, l, m, n, (i + j + k + l + m + n) mod F, (i - j - k - l) mod F, (m - n - j + k) mod F}    (5)

In some embodiments, there are many equivalent codes that can correct for 2 drops with 3 additional rounds of hybridization. They can all be empirically determined. Any reasonable number of hybridizations can be simulated to determine the complete correcting barcode.

In some embodiments, three additional hybridizations can correct for the majority of the errors due to drop or loss of three rounds of hybridization. For example, for 6 rounds of hybridization with 5 types of detectable signals (F = 5), three extra rounds of hybridization are added to create the full 9-member error correction code:

    {i, j, k, l, m, n, (k + i - l + m - n) mod F, (i - l + j - k + m) mod F, (l - n - j - k + i) mod F}    (6)

Similar to the previous example, 3 additional rounds of hybridization can correct for a majority of the loss or drop of 3 rounds of hybridization. There are a total of 84 combinations of how 3 rounds of hybridization can be lost or dropped. A 9-component code as illustrated in (6) can correct for 72 out of the 84 combinations.

In some embodiments, 4 additional rounds of hybridization can correct for the drop or loss of all and any three rounds of hybridization. An example 10-component code is as follows:

    {i, j, k, l, m, n, (k + i + l + m + n) mod F, (i - l + j + k + m) mod F, (l - n - j - k + i) mod F, (n - k - i - j + m) mod F}    (7)

It will be understood that there are many other solutions that can be determined empirically. For higher numbers of drops, similar correction schemes can be determined empirically.

For 16,000 species, this scheme allows 10 hybs with the ability to correct 3 drops. In comparison, in MERFISH, 16 hybs are needed to target 140 species, with only 2 round correction ability. Because the more rounds of hybridization one implements, the more mistakes can be made, keeping the number of hybs low is crucial. Thus, this error correction scheme is very powerful compared to the Hamming distance scheme used in MERFISH. This is because Hamming distance correction is used in telecommunications with binary numbers, which uses much longer strings of 0 and 1.

As described above, the design disclosed above can correct for loss of 1 hybridization for an arbitrarily long barcode sequence with minimal extra effort. In this example, only one round of error correction is needed in a total of 4 rounds of hybridization that analyzes 100 genes, which is below the capacity of 5^4 (625).

For example, 7 rounds of hybridization with 5 colors can cover 5^7 = 78,125 transcripts, more than the transcriptome; with 8 hybridizations, the entire transcriptome can be coded with error correction using the barcoding system disclosed herein.

Another consideration in designing error-tolerant barcodes is that the mechanism of re-hybridization should guide the robustness of error correction. In the merFISH implementation of seqFISH (Chen 2015), null signal, or "0", along with "1", which is cy5 fluorescence, is used to form a binary barcode. However, it is difficult to determine whether no signal is due to mis-hybridization or actual null signal. In the seqFISH implementation, using positive signals as readouts during each round of hybridization reduces the need for error correction because a false positive signal is unlikely to re-occur in the same position during another hybridization due to DNAse stripping between hybridizations. Thus, implementation of seqFISH with 5 colors and 1 extra round of hybridization for error correction is both efficient and accurate, and allows imaging of large tissue sections, since imaging time is ultimately limiting in multiplexing experiments.

At step 2750, sequential hybridization is carried out to associate or assign barcodes from step 2740 to target genes in a sample. As disclosed herein, the sample can be immobilized mRNAs, DNAs, chromosomal DNAs, and combinations thereof. For example, in the 100-gene sequential hybridization example (see FIG. 29 and FIG. 37), 4 rounds of hybridization are carried out using probes associated with 5 different types of visual signals. Barcodes are assigned through selection of probes during the 4 rounds of hybridization experiment on immobilized nucleic acid samples.

At step 2760, after hybridization, visual signals are collected and used in further analysis. For example, images collected from different hybridization rounds are used to read out the barcodes for specific locations on the immobilized nucleic acid samples. Such barcodes can then be used to decipher the identity of the nucleic acid targets (see, for example, FIGS. 2, 29, 30, 37 and 38).

In one aspect, sequential hybridization and serial hybridization are combined for gene identification. In serial hybridization, only one round of hybridization is used to identify target genes. The method is particularly helpful when analyzing genes whose expression level is too high. In some embodiments, genes that are highly expressed, if included in hybridization analysis with genes that are not so highly expressed, would overpower the signals for the genes that are not so highly expressed. In some embodiments, the method can also be applied to genes whose expression level is too low.

In some embodiments, expression levels of genes are pre-determined. For example, gene expression levels (e.g., measured by mRNA transcription level) can be already available for certain species. It is possible to identify highly expressed genes by mining publicly available data, thus obviating the need to conduct additional experiments to measure expression level.

In some embodiments, initial experiments are performed to determine the relative expression level of candidate genes. In some embodiments, genes are grouped according to their expression levels. For example, genes with moderate or low expression levels can be grouped together and subject to sequential hybridization analysis. Genes that are highly expressed can be subject to serial hybridization analysis. In some embodiments, expression levels of different genes are compared to the same control gene to derive a relative expression level. For example, the expression level of actin can be used as a control. It will be understood that gene expression level may vary by organism and can change with respect to different internal and environmental controls. In some embodiments, data from existing expression analysis can be used in identifying highly expressed genes. In some embodiments, preliminary expression analysis is carried out before sequential and/or serial hybridization analysis.

In some embodiments, a threshold value is set for high expression. Any genes having an expression level above the threshold will be excluded from sequential hybridization.

Depending on the types of detectable visual signals that are available, a serial hybridization experiment can detect as many target genes as the number of types of detectable visual signals. For example, in the experiment illustrated in FIGS. 29 and 37, 5 genes are analyzed at the same time during one serial hybridization experiment.

In some embodiments, when multiple target genes are present in one serial hybridization round, the number of probes that recognize each target gene is selected such that overlapping of signals is minimized or avoided. In some embodiments, the concentration of probes is selected to avoid or minimize overlapping of detectable signals.

Computer System

In some embodiments, a computer system 2800, local or accessible via remote access, may comprise a central processing unit 2810, a power source 2812, a user interface 2820, communications circuitry 2816, a bus 2814, a controller 2826, an optional non-volatile storage 2828, and at least one memory 2830. In some embodiments, computer 2800 is a local computer device. In some embodiments, computer 2800 is a remote server.

Memory 2830 may comprise volatile and non-volatile storage units, for example random-access memory (RAM), read-only memory (ROM), flash memory and the like.
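As one hedged illustration of a computation that such a computer system could run, the drop tolerance stated above for arrays (1) to (4) can be checked by brute force. The following is only a minimal sketch under stated assumptions, not the method of the disclosure: the function names `encode` and `recoverable_drops` and the constants `CODE_1` to `CODE_4` are hypothetical, signal values are taken as 0 to F - 1 (with 0 standing in for 5, since F mod F = 0), and a set of dropped rounds is counted as correctable when every barcode remains unique after those rounds are removed.

```python
from itertools import combinations, product

F = 5  # number of signal types; the value 0 stands in for 5, since 5 mod 5 = 0


def encode(data, checks, F=F):
    """Append one error correction round per coefficient tuple in `checks`."""
    extra = tuple(sum(c * d for c, d in zip(coeffs, data)) % F for coeffs in checks)
    return tuple(data) + extra


def recoverable_drops(n_data, checks, n_drops, F=F):
    """List the sets of dropped rounds (0-based) that leave every barcode unique."""
    words = [encode(d, checks, F) for d in product(range(F), repeat=n_data)]
    n_rounds = n_data + len(checks)
    good = []
    for dropped in combinations(range(n_rounds), n_drops):
        kept = [r for r in range(n_rounds) if r not in dropped]
        remaining = {tuple(w[r] for r in kept) for w in words}
        if len(remaining) == len(words):  # no two barcodes collide
            good.append(dropped)
    return good


# Coefficients of the data rounds in each error correction round (hypothetical names).
CODE_1 = [(1, 1, 1)]                                  # {i, j, k, (i+j+k) mod F}
CODE_2 = [(1, 1, 1), (1, -1, 0)]                      # array (2)
CODE_3 = [(1, 1, 1), (1, 0, -1)]                      # array (3)
CODE_4 = [(1, 0, 1, 1, 1, 1), (1, -1, -1, -1, 0, 1)]  # array (4)

if __name__ == "__main__":
    print(len(recoverable_drops(3, CODE_1, 1)), "of 4 single drops")
    print(len(recoverable_drops(3, CODE_2, 2)), "of 10 double drops")
    print(len(recoverable_drops(6, CODE_4, 2)), "of 28 double drops")
```

Under these assumptions the sketch reports 4 of 4 single drops for the code of Table 2, 9 of 10 double drops for array (2) (only rounds 3 and 4 together fail), and 24 of 28 double drops for array (4), in line with the figures given above.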
In preferred embodiments, memory 2830 comprises high speed RAM for storing system control programs, data, and application programs, e.g., programs and data loaded from non-volatile storage 2828. It will be appreciated that at any given time, all or a portion of any of the modules or data structures in memory 2830 can, in fact, be stored in memory 2828.

User interface 2820 may comprise one or more input devices 2824, e.g., keyboard, key pad, mouse, scroll wheel, touch screen, and the like, and a display 2822 or other output device. A network interface card or other communication circuitry 2816 may provide for connection to any wired or wireless communications network, which may include the Internet and/or any other wide area network, and in particular embodiments comprises a telephone network such as a mobile telephone network. Internal bus 2814 provides for interconnection of the aforementioned elements of centralized data server 2800.

In some embodiments, operation of computer 2800 is controlled primarily by operating system 2832, which is executed by central processing unit 2810. Operating system 2832 can be stored in system memory 2830. In addition to operating system 2832, a typical implementation of system memory 2830 may include a file system 2834 for controlling access to the various files and data structures used by the present invention, one or more application modules 2836, and one or more databases or data modules 2850.

In some embodiments in accordance with the present invention, application modules 2836 may comprise one or more of the following modules described below and illustrated in FIG. 28.

Data Processing Application 2838.

In some embodiments, a data processing application 2838 receives and processes data collected during hybridization experiments (for either sequential or serial hybridization). For example, detectable signals are collected as images and stored on computer 2800. Standard image processing algorithms can be applied to enhance signal detection. In some embodiments, coordinates are assigned to data locations where signals are detected to precisely define the binding between probes and target sequences. The positions of such target sequences do not change between different rounds of hybridization because the target sequences are part of the immobilized nucleic acid samples. Thus, by comparing coordinates of data locations between different images, it is possible to identify the same target sequence in each image and characterize the detectable signals associated with the same target sequence between different images.

In some embodiments, the detectable signals for the same location (target sequence) change from one color to another between different images. In some embodiments, the detectable signals for the same location (target sequence) remain the same color between different images. The characteristics of these detectable signals are compiled between images from all hybridization rounds to derive a barcode that uniquely represents the binding interaction at the particular location.

In some embodiments, data processing application 2838 detects and corrects minor shifts between different images. In some embodiments, data processing application 2838 detects major changes between different images that cannot be corrected.

For image data collection during serial hybridization, data processing application 2838 identifies and characterizes detectable signals by their type. In this case, the same detectable signal represents binding sequences in the same target gene. Data processing application 2838 identifies and characterizes each type of detectable signal.

The methods and systems are provided by way of illustration only. They should in no way limit the scope of the present invention.

Content Management Tools 2840.

In some embodiments, content management tools 2840 are used to organize different forms of data 2850 into multiple databases 2852, e.g., a sequence database 2854, an image database 2856, a probe library database 2858, a barcode library database 2860, and result database 2862. In some embodiments in accordance with the present invention, content management tools 2840 are used to search and compare any of the databases hosted on the computer system 2800. Contents in accordance with the invention may be an image, a simple text file (e.g., ASCII), a formatted text file, a sequence file, a two-dimension map, or a video file.

The databases stored on computer system 2800 comprise any form of data storage system including, but not limited to, a flat file, a relational database (SQL), and an on-line analytical processing (OLAP) database (MDX and/or variants thereof). In some specific embodiments, the databases are hierarchical OLAP cubes. In some embodiments, the databases each have a star schema that is not stored as a cube but has dimension tables that define hierarchy. Still further, in some embodiments, the databases have hierarchy that is not explicitly broken out in the underlying database or database schema (e.g., dimension tables are not hierarchically arranged). In some embodiments, the databases are not hosted on computer system 2800 but are in fact accessed by the centralized data server through a secure network interface. In such embodiments, security measures such as encryption are taken to secure the sensitive information stored in such databases.

Design Tools 2842.

In some embodiments, design tools 2842 are used to design probes for specific binding of target sequences. For example, for nucleic acid probe design, design tools 2842 can utilize sequence information from sequence database 2854 to create probes that will likely bind to a specific target sequence. In some embodiments, design tools 2842 can utilize secondary and tertiary structure information from sequence database 2854 to design probes that will avoid regions containing hairpins or other structures that may interfere with binding between the probes and their respective target sequences.

In some embodiments, design tools 2842 are used to create barcodes. For example, design tools 2842 can utilize a barcode generator with built-in error correction mechanisms. In some embodiments, error correction mechanisms are saved as additional data 2862. In some embodiments, a user can define the number of hybridization rounds that the final barcode can tolerate losing. For example, depending on the total number of rounds of hybridization, a user can set the barcode to be one-drop safe, two-drop safe or three-drop safe.

Network Application 2846.

In some embodiments, network applications 2846 connect computer system 2800 with multiple network services. Computer system 2800 can be connected to multiple types of client devices, which requires that the remote data server be adapted to communications based on different types of network interfaces, for example, a router-based computer network interface, a switch-based phone-like network interface, and a cell tower-based cell phone wireless network interface, for example, an 802.11 network or a Bluetooth network. In some embodiments in accordance with the present invention, upon recognition, a network application 2846 receives data from intermediary gateway servers before it transfers the data to other application modules such as data processing application 2838, content management tools 2840, and system administration and monitoring tools 2848.

In some embodiments, information concerning previously designed barcodes includes types of detectable signals forming the barcodes. In some embodiments, barcode database 2860 further includes information on whether any ambiguities or errors are associated with certain barcodes.
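The barcode readout described for data processing application 2838, combined with a one-drop-safe barcode of the kind a barcode generator in design tools 2842 might produce, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation of the disclosure: the function names `build_library`, `read_barcodes` and `decode`, the gene names, and the spot coordinates are all hypothetical; spots are assumed to be already registered so that the same target has the same (x, y) coordinate in every round; and signal values run from 0 to F - 1, with 0 standing in for 5.

```python
from itertools import product

F = 5  # number of signal types; the value 0 stands in for 5


def build_library(genes, F=F):
    """Give each gene a one-drop-safe barcode {i, j, k, (i + j + k) mod F}."""
    data_rounds = product(range(F), repeat=3)
    return {c + (sum(c) % F,): g for g, c in zip(genes, data_rounds)}


def read_barcodes(rounds):
    """Compile the signal seen at each coordinate across all rounds.

    rounds -- one dict per hybridization round, mapping (x, y) to a signal value.
    A round in which no spot was seen at a coordinate is recorded as None.
    """
    coords = set().union(*rounds)
    return {xy: tuple(r.get(xy) for r in rounds) for xy in coords}


def decode(observed, library, F=F):
    """Look up a gene, filling in at most one dropped round."""
    missing = [n for n, v in enumerate(observed) if v is None]
    if len(missing) > 1:
        return None  # more drops than this barcode tolerates
    if not missing:
        return library.get(observed)
    n = missing[0]
    trials = (observed[:n] + (v,) + observed[n + 1:] for v in range(F))
    hits = [library[t] for t in trials if t in library]
    return hits[0] if len(hits) == 1 else None


library = build_library(["geneA", "geneB", "geneC"])
rounds = [
    {(10, 12): 0, (40, 7): 0},  # round 1
    {(10, 12): 0},              # round 2: the spot at (40, 7) was lost
    {(10, 12): 1, (40, 7): 2},  # round 3
    {(10, 12): 1, (40, 7): 2},  # round 4, the error correction round
]
calls = {xy: decode(code, library) for xy, code in read_barcodes(rounds).items()}
```

With this toy input, the spot at (10, 12) is read in all four rounds and the spot at (40, 7) is still identified although round 2 was dropped, because only one value for the missing round yields a barcode that exists in the library.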
System Administration and Monitoring Tools 2848.

In some embodiments, system administration and monitoring tools 2848 administer and monitor all applications and data files of computer system 2800. Because some of the information stored on remote data server 2800 can relate to a person's privacy (e.g., personal data associated with certain biological samples and analytical results of these samples), it is important that access to those files is strictly controlled and monitored. System administration and monitoring tools 2848 determine which users or devices have access, locally or remotely, to computer system 2800. In some embodiments, security administration and monitoring is achieved by restricting data download access from computer system 2800 such that the data are protected against malicious Internet traffic. In some embodiments, system administration and monitoring tools 2848 use more than one security measure to protect the data stored on computer system 2800. In some embodiments, a random rotational security system may be applied to safeguard the data stored on computer system 2800.

Sequence Database 2854.

Sequence database 2854 stores information relating to potential targets for hybridization analysis, such as sequence, secondary and tertiary structure information. For example, secondary and tertiary structure in nucleic acids may prevent probes from binding to such regions. In some embodiments, sequence database 2854 includes a subset database including regions that would likely be good probe binding targets. In some embodiments, sequence database 2854 includes a subset database including regions that would likely be poor probe binding targets. Such information is provided to design tools 2842 to facilitate probe design.

In some embodiments, sequence database 2854 further includes gene expression information. For example, sequence database 2854 can include a subset of genes whose expression levels may be too high for sequential hybridization analysis. In some embodiments, a user may receive a warning message if one of the genes in the subset is identified as a target gene.

Image Database 2856.

In some embodiments, computer system 2800 hosts an image database 2856. Raw data collected from the detectable signals are organized and stored in image database 2856.

Probe Database 2858.

In some embodiments, probes that have been designed are stored in designated probe database 2858 on computer system 2800. In some embodiments, information concerning previously designed probes includes a binding sequence and a signal moiety that can emit detectable signals. In some embodiments, a linker for connecting the signal moiety to the binding sequence is also included in probe database 2858.

In some embodiments, certain probe designs are ranked based on existing data showing the efficacy of binding of these probes. The existing data can be publicly available information or information generated by the user. In some embodiments, a user is given the option to edit entries in probe database 2858.

Barcode Database 2860.

In some embodiments, barcodes that have been designed are stored in designated barcode database 2860 on computer system 2800. In some embodiments, barcodes used in the past are ranked based on their efficiency and accuracy in identifying cellular targets.

Additional Data 2862.

In some embodiments, additional data 2862, including for example, results and conclusions from sequential hybridization and serial hybridization analysis, are also stored on computer system 2800. In some embodiments, error correction mechanisms are saved as additional data 2862. In some embodiments, data needed for image processing are also saved as additional data 2862.

The methods and systems are provided by way of illustration only. They should in no way limit the scope of the present invention.

Computer System and Program Product

The present invention can be implemented as a computer system and/or a computer program product that comprises a computer program mechanism embedded in a computer readable storage medium. Further, any of the methods of the present invention can be implemented in one or more computers or computer systems. Further still, any of the methods of the present invention can be implemented in one or more computer program products. Some embodiments of the present invention provide a computer system or a computer program product that encodes or has instructions for performing any or all of the methods disclosed herein. Such methods/instructions can be stored on a CD-ROM, DVD, magnetic disk storage product, or any other computer readable data or program storage product. Such methods can also be embedded in permanent storage, such as ROM, one or more programmable chips, or one or more application specific integrated circuits (ASICs). Such permanent storage can be localized in a server, 802.11 access point, 802.11 wireless bridge/station, repeater, router, mobile phone, or other electronic devices. Such methods encoded in the computer program product can also be distributed electronically, via the Internet or otherwise, by transmission of a computer data signal (in which the software modules are embedded) either digitally or on a carrier wave.

Some embodiments of the present invention provide a computer system or a computer program product that contains any or all of the program modules as disclosed herein. These program modules can be stored on a CD-ROM, DVD, magnetic disk storage product, or any other computer readable data or program storage product. The program modules can also be embedded in permanent storage, such as ROM, one or more programmable chips, or one or more application specific integrated circuits (ASICs). Such permanent storage can be localized in a server, 802.11 access point, 802.11 wireless bridge/station, repeater, router, mobile phone, or other electronic devices. The software modules in the computer program product can also be distributed electronically, via the Internet or otherwise, by transmission of a computer data signal (in which the software modules are embedded) either digitally or on a carrier wave.

Having described the invention in detail, it will be apparent that modifications, variations, and equivalent embodiments are possible without departing from the scope of the invention defined in the appended claims. Furthermore, it should be appreciated that all examples in the present disclosure are provided as non-limiting examples.

EXAMPLES

The following non-limiting examples are provided to further illustrate embodiments of the invention disclosed herein. It should be appreciated by those of skill in the art that the techniques disclosed in the examples that follow represent approaches that have been found to function well in the practice of the invention, and thus can be considered to constitute examples of modes for its practice. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments that are disclosed and still obtain a like or similar result without departing from the spirit and scope of the invention.

Further, the foregoing has been a description of certain non-limiting embodiments of the invention. Accordingly, it is to be understood that the embodiments of the invention herein described are merely illustrative of the application of the principles of the invention. Reference herein to details of the illustrated embodiments is not intended to limit the scope of the claims.

Example 1

Sequential Hybridization and Barcoding

In Situ Profiling of Nucleic Acids by Sequential Hybridization and Barcoding

As described in the non-limiting examples herein, nucleic acids in cells, for example, mRNAs, were profiled by provided methods through sequential rounds of contacting, imaging and removing steps (FIGS. 2(a) and 3). As the transcripts are fixed in cells, the corresponding fluorescent spots remain in place during multiple rounds of hybridization, and can be aligned to read out a fluorophore sequence. This sequential barcode is designed to uniquely identify an mRNA.

During each round of hybridization, each transcript was targeted by a set of detectably labeled oligonucleotides, in this case, FISH probes labeled with a single type of fluorophore. The sample was imaged and then treated with DNase I to remove the FISH probes. In a subsequent round the mRNA was hybridized with FISH probes with the same set of oligonucleotide sequences, but now labeled with a different dye. The number of barcodes available scales as F^N, where F is the number of fluorophores and N is the number of hybridization rounds. For example, with 4 dyes, 8 rounds of hybridization can cover almost the entire transcriptome (4^8 = 65,536).

As a demonstration, 12 genes were barcoded in single yeast cells with 4 dyes and 2 rounds of hybridization (4^2 = 16, with 4 barcodes left out; each hybridization was conducted for 3 cycles). Cells were immobilized on glass surfaces. The DNA probes were hybridized, imaged, and then removed by DNase I treatment (88.5 ± 11.0% (SE) efficiency, FIG. 4). The remaining signal was photobleached (FIG. 5). Even after 6 hybridizations, mRNAs were observed at 70.9 ± 21.8% (SE) of the original intensity (FIG. 6). It was observed that 77.9 ± 5.6% (SE) of the spots that co-localized in the first two hybridizations also co-localize with the third hybridization (FIGS. 7 and 8). The mRNA abundances were quantified by counting the occurrence of corresponding barcodes in the cell (FIGS. 9 and 10, n = 37 cells). It was shown that mRNAs can be stripped and re-hybridized efficiently in mammalian cells (FIGS. 11 and 12). As demonstrated here, provided methods have many advantages over methods known prior to the present invention. For example, provided methods scale up quickly; with even two dyes the coding capacity is in principle unlimited (2^N). During each contacting step, all available detectably labeled oligonucleotides, in this example, FISH probes, against a target can be used, increasing the brightness of the signals. Readouts of provided methods are also robust and enable full Z-stacks on native samples. Provided methods can take advantage of the high hybridization efficiency of detectably labeled oligonucleotides, such as FISH probes (>95% of the mRNAs are detected; Lubeck, E. & Cai, L. Nat. Methods 9, 743-48 (2012)). Applicant notes that detectably labeled oligonucleotides, for example FISH probes, can also be designed to resolve a large number of splice-isoforms, SNPs, as well as chromosome loci (Levesque, M. J. & Raj, A. Nat Meth 10, 246-248 (2013)) in single cells. In combination with super-resolution methods (Lubeck, E. & Cai, L. Nat. Methods 9, 743-48 (2012)), provided methods enable a large number of targets, for example the transcriptome, to be directly imaged at single cell resolution in complex samples, such as the brain.

Methods and Procedures

Sample Preparation:

MDN1-GFP yeast cells were grown in YPD supplemented with 50 mM CaCl2 to OD 0.3. Cells were fixed in 1% Formaldehyde 5% Acetic Acid for 5 minutes, rinsed 3x in Buffer B and spheroplasted for 1 hour at 30° C. Cells were stored in 70% EtOH at -20° C. for up to two weeks.

Coverslips were prepared by sonicating 3x with alternating solutions of 1M NaOH and 100% EtOH followed by a final round of sonication in acetone. A 2% solution of (3-Aminopropyl) triethoxysilane (Sigma 440140) was prepared in acetone and the cleaned coverslips were immediately submerged in it for two minutes. Amine-modified coverslips were rinsed and stored in ultra-pure water at room temperature.

Fixed yeast cells were pre-treated with a 0.5 U/uL solution of DNase I (Roche 04716728001) for 30 minutes at 23° C. Following treatment, yeast cells were adhered to coated coverslips by physically compressing a dilute solution of yeast between two amine-modified coverslips. The coverslips were then carefully peeled apart and immediately submerged in a 1% formaldehyde solution for 2.5 minutes. Following fixation, coverslips were dried and a flow cell was constructed by adhering an adhesive coated flow cell to the coverslip (GraceBio Labs SA84-0.5-SecureSeal). FluoSphere 365 nm fluorescent beads were added to the coverslip to measure drift over multiple hybridizations (Life F8805). Flow cells were stored at 4° C. covered with parafilm.

Preparation of Detectably Labeled Oligonucleotides:

Probes were prepared according to the method in Lubeck, E. & Cai, L. Nat. Methods 9, 743-48 (2012). For each target, 24 probes were used. All 24 probes for each set of genes were coupled to one of the four dyes used, Alexa 532, 594, Cy5 and Cy7.

Hybridization:

Flow cells were hybridized at a concentration of 2 nM/probe overnight in a hybridization buffer of 10% Dextran Sulfate (Sigma D8906), 10% formamide, and 2xSSC. Following hybridization, samples were washed in a 30% formamide, 0.1% Triton-X 100 buffer pre-heated to 37° C. before adding to room temperature samples for 10 minutes. Samples were washed several times with 2xSSC to remove diffusing probes.

Imaging:

Samples were immersed in an anti-bleaching buffer (Swoboda, M. ACS Nano 6, 6364-69 (2012)): 20 mM Tris-HCl, 50 mM NaCl, 0.8% glucose, saturated Trolox (Sigma: 53188-07-1), pyranose oxidase (Sigma P4234) at an OD405nm of 0.05, and catalase at a dilution of 1/1000 (Sigma: 9001-05-2).

Probe Displacement:

Following imaging, cells were washed in DNase I buffer (Roche) and allowed to sit in 0.5 U/uL DNase I (Roche) for 4 hours. To inhibit DNase, cells were washed 2x with 30% formamide, 0.1% Triton-X 100, 2xSSC. Following DNase treatment, cells were imaged once more in anti-bleaching buffer to determine DNase I probe stripping rates. To remove remaining probe signal, samples were bleached with 10 seconds of excitation in all imaging channels and imaged once more with standard excitation times to record residual signal.

Re-Hybridization:

Samples were re-hybridized on the microscope according to the previously outlined conditions. Samples were covered with parafilm during hybridization on the scope to prevent evaporation.

At least six rounds of hybridization were carried out on the same sample. Each round of hybridization took place overnight on the microscope, with DNase treatment and imaging occurring during the day. In the iterative hybridization scheme applied in this correspondence, two rounds of hybridization were used to barcode the mRNAs. The barcode scheme was then repeated, such that hyb1 and hyb3 were performed using the same probes, while hyb2 and hyb4 were done with another set of probes. The co-localization between hyb1 and hyb3 gave a calibration for transcripts that were detected, while hyb1 and hyb2 yielded the barcoding data.

Data Analysis:

Data analysis was carried out with ImageJ, Python and Matlab. Since the sample drifted during the experiments, the raw images were aligned using a cross-correlation based registration method that was determined from the DAPI channel of each imaging position. The drift correction was then propagated to the other 4 color channels corresponding to the same position. The images were then deconvolved to decrease the overlap between adjacent FISH spots. Spot overlaps in individual channels were rarely observed, but spots in different channels could overlap in their point spread functions (PSFs) when the images were overlaid. The raw data were processed based on an iterative Lucy-Richardson algorithm (Lucy, L. B. The Astronomical Journal.

ized with the same probe sets. The maximum intensity pixel for each PSF was used as a proxy for the location of that mRNA molecule.

The barcodes were extracted automatically from the dots corresponding to mRNAs in hyb1 and hyb2. The algorithm calculated the pairwise distances between each point identified in hyb1 and all the points identified in hyb2. For each point in hyb1, the closest neighbor in hyb2 was identified. If that distance was 0 or 1 pixel and the closest neighbor of the point in hyb2 was also the original point in hyb1, then the barcode pair was confirmed. The symmetrical nearest neighbor requirements decreased the false assignment of barcodes. To reduce false positives in cy7, points detected in hyb1 cy7 were required to reappear in hyb3 in cy7.

In this non-limiting example, Applicant removed probes with DNase I due to its low cost and rapid activity. Applicant notes that any method that removes probes from mRNA and leaves it intact could be used in provided barcoding approaches, for example but not limited to, strand-displacement (Duose, D. Y. Nucleic Acids Research, 40, 3289-3298 (2012)) and high temperature or formamide washes. Applicant notes that DNase I does not require probe redesigns from standard smFISH probes, and does not perturb the sample with harsh washes.

In some embodiments, a rapid loss of DAPI signal from dsDNA within seconds was observed, while smFISH probes took a substantially longer period of time (10s of minutes) to be degraded. Without the intention to be limited by theory, the efficiency of DNase I probe removal could be low relative to the dsDNA cleavage rate. The removal process was still observed in a short amount of time.

In certain experiments, 11.5% of the fluorescent signal remained on mRNA after DNase I treatment. The remaining signal was reduced almost to zero by bleaching. Applicant notes that more complete removal of signal and/or probes can be achieved prior to photobleaching, so that more mRNAs are available for the following rounds of hybridization. Applicant notes that photobleaching is not necessary for barcoding, but in some embodiments, it does simplify the process by removing residual signal that might give false positives in further rounds of barcoding. Some of the 11.5% of residual probes bound to mRNA may inhibit further rounds of hybridization. Applicant notes that residual probes were not significantly inhibiting progressive rounds of hybridization, as presented data showed a minor drop in hybridization efficiency for 5 hybridizations.

Profiling of Nucleic Acids in Brain Tissues

Transcription profiling of cells in intact brain slices is essential for understanding the molecular basis of cell identity. However, prior to the present invention it was technically difficult to quantitatively profile transcript abundance
79 , 745 and localization in single cells in the anatomical context of ( 1974 ) and Richardson , W. H. J. Opt. Soc. Am . 62 , 55-59 native neural networks. The cortical somatic sensory sub (1972 )) . The PSF of the microscope was estimated by networks are used as an example to demonstrate the feasi averaging the measured bead images ( ~ 200 nm diameter ) in 55 bility and utility of exemplary provided technologies , for the DAPI channel of the microscope. Using this measured example , using in situ sequencing by FISH ( seqFISH ) and point spread function with the Lucy -Richardson algorithm , connectomics to profile multiple genes in distinct neuronal we performed maximum - likelihood estimation of fluores populations within different functional domains, such as cent emitter distribution in the FISH images after computing those in the primary somatic sensory (SSp ), primary somato this process over ~ 20 iterations. The output of this decon- 60 motor (MOP ) , secondary somatomotor (MOs ) , and supple volution method provides resolved FISH data and increases mentary somatosensory (SSs ) cortical areas. the barcode assignment fidelity. As described extensively herein , in some embodiments , Dots corresponding to FISH signals in the images were the present invention provides technologies to profile gene identified using a local maximum function . Dots below a expression in single cells via in situ “ sequencing ” by FISH , threshold were discarded for further analysis . The value of 65 e.g. , as illustrated by FIGS. 1 and 2. To detect individual the threshold was determined by optimizing the co -local mRNAs , single molecule fluorescence in situ hybridization ization between hybl and hyb3 images, which were hybrid ( smFISH ) was used with 20mer oligonucleotide probes US 10,510,435 B2 57 58 complementary to the mRNA sequence (Fan , Y. , Braut, SA , fluorescent channels . It was shown that the same spots Lin , Q., Singer, R H , Skoultchi, A I. 
Determination of realign to 100 nm between different rounds of hybridization transgenic loci by expression FISH . Genomics. 2001 Oct. 2 ; ( FIG . 8 ) . 71 ( 1 ) : 66-9 ; Raj A , Peskin C S , Tranchina D , Vargas D Y , In some embodiments , a 100 genes multiplex can be Tyagi S. Stochastic mRNA synthesis in mammalian cells . 5 performed quickly with 3 rounds of hybridization . In some PLOS Biol. 2006 October; 4 ( 10 ) : e309 ). By putting 24 such embodiments , each hybridization cycle involves about 4 fluorophore labeled probes on an mRNA , single transcripts hours of hybridization , about 1 hour of imaging and about 1 in cells become readily detectable in situ . It was shown that hour of DNase treatment and washing , the time length of almost all mRNAs that can be detected are observed by each can be optionally and independently varied . In some smFISH (Lubeck , E. L. Cai. Single cell systems biology by 10 embodiments , 3 rounds of hybridization take approximately super- resolution imaging and combinatorial labeling . Nature 18 hours. In some embodiments , imaging time is the rate Methods 9 , 743-48 (2012 )) . Provided methods are highly limiting step , rather than the hybridization time, because one quantitative and preserve the spatial information within a brain slice can be imaged while another slice on the same tissue sample without physically isolating single cells or microscope is hybridizing. In some embodiments , a single using homogenates . 15 microscope can multiplex up to 8 slices simultaneously and In some embodiments , to distinguish different mRNA obtain 100 gene data on all 8 slices at the end of the 3 cycles species , mRNAs are barcoded with detectably labeled oli of hybridization in 18 hours . gonucleotides , such as FISH probes using sequential rounds In some embodiments , a 10 mmx5 mmx10 um brain slice of hybridization . 
During a round of hybridization , each containing 106 cells can be imaged and analyzed in 35 transcript is targeted by a set of multiple , for example , 24 20 minutes on microscopes . In some embodiments , a single FISH probes, labeled with a single type of fluorophore . The field of view (FOV ) on a microscope is 0.5 mmx0.5 mmx2 sample is imaged and the FISH probes are removed by um with a 20x air objective and 13 mmx13 mm camera chip . enzymatic digestion . Then the mRNA is hybridized in a In some embodiments , each FOV is exposed and read out in subsequent round with the same FISH probes , but now 100 msec . In some embodiments , scanning the sample in labeled with , in some cases , a different dye . As the tran- 25 xyz and in the different color channels introduces a time scripts are fixed in cells , the fluorescent spots corresponding delay of 200 msec between snapshots . In some embodi to single mRNAs remain in place during multiple rounds of ments, an entire brain slice can be imaged in 2000 sec or 35 hybridization , and can be aligned to read out a color minutes . With 3 rounds of hybridization needed for the 100 sequence. Each mRNA species is therefore assigned a gene multiple, the total imaging time is 105 minutes. In unique barcode . The number of each transcript in a given 30 some embodiments, an entire mouse brain can be imaged in cell can be determined by counting the number of the 30 days on one microscope . When multiple microscope is corresponding barcode . An exemplary process is illustrated used , the time frame can be further shortened . In some in FIG . , 2 , or 3 . embodiments , provided methods can image an entire mouse Provided technologies can take advantage of the high brain with 500 slices with a cost less than $ 25,000 . hybridization efficiency of FISH (> 95 % of the mRNAs are 35 Compared with other methods known prior to the present detected ). 
In some embodiments , resolution is not invention , provided technologies provide a variety of advan needed to identify a transcript, although can be achieved if tages. Among other things, provided technologies is quan desired . The number of barcodes available with provided titative, preserve spatial information and inexpensively methods scales as FN , where F is the number of distinct scales up to a whole tissue , organ and / or organism . fluorophores and N is the number of hybridization rounds. 40 Comparison with Single Cell RNA - Seq Prior to the With 5 distinct dyes and 3 rounds of hybridization , 125 Present Invention . unique nucleic acids can be profiled . Almost the entire Unlike single cell RNA -seq or qPCR , which require transcriptome can be covered by 6 rounds of hybridization single cells to be isolated and put into a 96 well format, ( 58 = 15,625 ), for example, using super - resolution micros provided methods , such as seqFISH , can scan a large num copy which resolves all of the transcripts in a cell . In some 45 ber of cells in their native anatomical context with auto embodiments , conventional microscopy, such as conven mated microscopy at little additional cost . To achieve the tional epi- fluorescence microscopy which is simple and same level of throughput with a microfluidics device would robust to implement, is used to detect fewer but still large be economically impossible and labor intensive . In some number of targets , for example , at 100 genes multiplex . embodiments , major cost of provided technologies is the Probes can be stripped and rehybridized to the same 50 initial cost of probe synthesis , which is offset by the fact that mRNA in multiple cycles of hybridization ( FIG . 2) . Many once probes are synthesized , they can be used in many, e.g., commercially available fluorophores work robustly , such as 1000 to 10,000 or even more reactions . 
Alexafluor 488 , 532 , 594 , 647 , 700 , 750 , and 790 , giving at Provided methods such as seqFISH are based on single least 7 colors for barcoding . Even at the end of 6 rounds of molecule FISH and can measure low copy number tran hybridizations, probes can be re -hybridized to the stripped 55 scripts with absolute quantitation . The data obtained with mRNA with 70.9 + 21.8 % (FIG . 6 ) of the original intensity . this method is highly quantitative and enables high quality As a demonstration , barcoded 12 genes were barcoded in statistical analysis . In comparison , current single cell qPCR single yeast cells with 4 dyes and 2 rounds of hybridization and RNA -seq experiments are limited in quantitative powers ( 42 = 16 , FIG . 3, c ). with biases from reverse transcription (RT ) and PCR ampli There is sufficient optical space in cells to perform mul- 60 fication errors . tiple, e.g., 100 gene multiplex , as 12 genes multiplex images Comparison with Other In Situ Sequencing Method Prior only occupied 5 % of the optical space in each fluorescent to the Present Invention . channel. Although the composite image of all 4 fluorescent One major advantage of the smFISH approach is that channels in FIG . 3 appears dense , spots in each fluorescent almost all mRNAs that are targeted can be observed . It was channel are sparsely distributed . Each spot can be fitted with 65 determined that the efficiency of each FISH probes binding a 2 dimensional Gaussian profile to extract its centroid on a mRNA is 50-60 % (Lubeck , E. & Cai, L. Nat. Methods positions and further reduce the overlaps with spots in other 9 , 743-48 (2012 ) ; Levesque , M. J. & Raj, A. Nat Meth 10 , US 10,510,435 B2 59 60 246-248 (2013 ) ). Targeting multiple , e.g. 
, 24-48 probes per mRNA ensures that at least 10 probes are hybridized on almost every mRNA, providing good signals over the non-specific background. Directly probing the mRNA with FISH probes is highly specific and ensures that all transcripts are detected.

In contrast, many other in situ sequencing methods, instead of targeting the mRNA directly, use enzymatic reactions to convert the mRNA into a DNA template first, before detecting the DNA template by sequencing reactions. However, the mRNA to DNA conversion process is highly inefficient, and only a small fraction of the RNAs are converted and detected. One exemplary major downside of low efficiency, which is estimated at 1% for reverse transcription (RT) and 10% for padlock ligation (PLA), is that it can introduce significant noise and bias in the gene expression measurements.

Given the typical cell size of (10-20 µm^3), there are approximately 25,000 diffraction limited spots in the cell. In some embodiments, this is the available real estate for transcript detection in single cells. In seqFISH, a chosen set of genes, such as transcription factors (TFs) and cell adhesion molecules (CAMs), can be imaged and quantitated with high accuracy. If target genes with average copy numbers of 100 transcripts per gene are chosen, a highly quantitative 100-200 gene profiling experiment can be performed. In contrast, with many other in situ sequencing methods, most of that real estate is used to sequence ribosomal RNAs as well as house-keeping genes; genes of interest, such as those specific for neuronal cell identity, are severely under-represented and poorly detected.

In some embodiments, provided methods use hybridization chain reaction (HCR) (Choi, et al., Programmable in situ amplification for multiplexed imaging of mRNA expression. Nature Biotechnol, 28, 1208-1212, (2010)) to amplify FISH signal that allows large FOV imaging with 20× air objectives, but at the same time preserves the high detection efficiency of smFISH.

Comparison with Super-Resolution Barcoding Method of Multiplexing RNA Prior to the Present Invention.

In some embodiments, provided methods have many advantages compared to spectral barcoding of mRNAs by smFISH prior to the present invention (Femino et al., Visualization of single RNA transcripts in situ. Science. 1998 Apr. 24; 280(5363):585-90; Kosman et al., Multiplex detection of RNA expression in Drosophila embryos. Science. 2004 Aug. 6; 305(5685):846; Levsky et al., Single-cell gene expression profiling. Science. 2002 Aug. 2; 297(5582):836-40; Lubeck et al., Single cell systems biology by super-resolution imaging and combinatorial labeling. Nature Methods 9, 743-48 (2012); and Levesque et al., Nat Meth 10, 246-248 (2013)), in which the probes against a particular mRNA are split up into subsets which are labeled with different dyes. Among other things, provided technologies do not require many distinct fluorophores to scale up; with even two dyes, the coding capacity is huge, and repeated barcodes can be used (e.g., Red-Red-Red). In comparison, spectral barcoding of RNA prior to the present invention is limited in the number of barcodes that can be generated (~30). In provided methods, during each round of hybridization, all the detectably labeled oligonucleotides such as FISH probes against a transcript can be used at once instead of splitting probes into subsets. Among other things, provided technologies provide improved robustness of barcode readout, as the signal on each mRNA is stronger. Compared to methods prior to the present invention, density of objects in the image is lower as each mRNA can have fewer colors, in some embodiments, a single color, during each round of hybridization instead of at least 3 colors in the spectral barcoding schemes prior to the present invention. If desirable, the lower image density can greatly simplify data analysis and allow more genes to be multiplexed before super-resolution microscopy is necessary. Applicant notes that certain spectral barcoding methods, probes, and/or super-resolution microscopy can be used, and can be useful, in provided embodiments. To profile the transcriptome with provided technologies such as seqFISH, in some embodiments, super-resolution microscopy is used to resolve the millions of transcripts in the cells.

Besides transcriptional profiling, provided technologies can resolve multiple alternative splicing choices and RNA editing on the same mRNA molecule. Alternative spliced isoforms are difficult to probe by sequencing methods as the sequencing reads are usually too short to correlate the exon choices within the same transcript. Provided methods such as seqFISH allow direct visualization of the entire repertoire of splice isoforms within individual cells in brain slices. Similarly, smFISH methods of detecting single nucleotide polymorphisms (SNPs) can be adapted to seqFISH to image edited transcripts in neurons or other cell types.

In some embodiments, provided technologies provide efficient and cost effective pipelines for gene profiling in situ by sequential FISH (seqFISH), and integrate seqFISH and connectomics to profile somatic motor neurons in the cortex to identify combinatorial molecular markers that correspond to cell identity.

Quantitative In Situ Gene Expression Mapping in Brain

Light sheet microscopy is applied to directly image CLARITY cleared brain slices. In some embodiments, a mouse brain is mapped in 1 month per machine. In some embodiments, a mouse brain is mapped in one week with 4-5 machines.

Amplification: Amplification of FISH signals allows large FOV imaging of brain slices with 20× low NA objectives. In some embodiments, provided methods use detectably labeled oligonucleotides labeled with hybridization chain reaction (HCR) (Choi et al., 2010) to increase the signal-to-background and/or preserve the specificity and multiplexing capabilities of FISH methods. With this approach, nucleic acid probes complementary to mRNA targets trigger chain reactions in which metastable fluorophore-labeled nucleic acid hairpins self-assemble into tethered fluorescent amplification polymers. Using orthogonal HCR amplifiers carrying spectrally distinct fluorophores, in situ amplification can be performed simultaneously for all channels.

In some embodiments, detectably labeled oligonucleotides with HCR (HCR probes) contain a 20-nt domain complementary to the target mRNA plus a 40-nt HCR initiator. The hybridization of probes is performed under stringency of 10% formamide followed by the amplification step at a permissive condition. Conditions like concentration of hairpins can be optimized to achieve optimal results; Applicant notes that, in certain cases, higher concentrations of hairpins increase reaction rate. Every HCR probe can be amplified to a diffraction-limited spot. In some embodiments, FISH signal is amplified by approximately 10-20 times within a diffraction-limited spot size. Spot brightness can be further enhanced while maintaining a diffraction-limited spot size by, for example, incorporating multiple HCR initiators within each probe and/or labeling each HCR amplification hairpin with multiple fluorophores.

HCR amplified signals were observed from mRNAs directly in brain slices. When targeted to the same mRNA, HCR probes colocalize with smFISH dots with 90% rate, but are 10-20 times brighter (FIG. 14).
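A colocalization rate such as the 90% figure above can be computed by matching spots between two images with the symmetric nearest-neighbor rule described under Data Analysis. The sketch below is a minimal pure-Python illustration and is not the Applicant's code: the function names, the toy coordinates, and the reading of "0 or 1 pixel" as a Euclidean distance of at most 1 pixel are all assumptions made for the example.

```python
from math import dist


def mutual_nearest_pairs(points_a, points_b, max_dist=1.0):
    """Return (i, j) pairs where points_a[i] and points_b[j] are each
    other's nearest neighbor and lie within max_dist pixels."""
    if not points_a or not points_b:
        return []
    # Index of the nearest point in B for every point in A, and vice versa.
    nearest_b = [min(range(len(points_b)), key=lambda j: dist(a, points_b[j]))
                 for a in points_a]
    nearest_a = [min(range(len(points_a)), key=lambda i: dist(points_a[i], b))
                 for b in points_b]
    pairs = []
    for i, j in enumerate(nearest_b):
        # Keep the pair only if the match is symmetric and close enough.
        if nearest_a[j] == i and dist(points_a[i], points_b[j]) <= max_dist:
            pairs.append((i, j))
    return pairs


def colocalization_rate(points_a, points_b, max_dist=1.0):
    """Fraction of points in A that have a confirmed partner in B."""
    if not points_a:
        return 0.0
    return len(mutual_nearest_pairs(points_a, points_b, max_dist)) / len(points_a)


# Hypothetical spot coordinates (pixels) from two images of the same field.
spots_first = [(10, 10), (20, 35), (40, 12), (55, 60)]
spots_second = [(10, 11), (20, 35), (47, 12), (90, 90)]

pairs = mutual_nearest_pairs(spots_first, spots_second)
rate = colocalization_rate(spots_first, spots_second)
print(pairs, rate)
```

In this toy input, two of the four spots in the first image find a symmetric partner within one pixel; the third has a mutual nearest neighbor that is too far away, and the fourth is rejected because its nearest neighbor prefers a different spot. The brute-force distance search is quadratic in the number of spots, so a spatial index would be the natural replacement for real images.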
This brightness allows HCR probes to be readily detected above the autofluorescence of the brain (FIG. 13). The high colocalization rate proves that HCR is as specific as smFISH and most transcripts are detected.

HCR probes are readily stripped and rehybridized, and can be fully integrated with the seqFISH protocol described herein. FIG. 15 showed the same genes targeted by HCR in brain slices in two different rounds of hybridization. The good colocalization between the two hybridizations shows that HCR-seqFISH works robustly to barcode mRNAs in brain.

HCR protocols work on the same time scale as smFISH hybridization and do not increase the cycle time of the assay. The initial hybridization step in HCR is similar to smFISH in time, while the second amplification step occurs in 30 minutes to 1 hour. Alternative methods of hybridizing RNA probes to the transcripts, and optionally using alternative types of hairpins to amplify the signal, can further reduce cycle time. In some embodiments, HCR removes the need to purchase amine labeled oligo probes. Among other things, HCR can potentially decrease the cost of the reagents by approximately one half, to e.g., $10,000 per brain.

Automation

Automation of both hardware and software can be applied to efficiently scale up, for example, to map 100 genes and/or reduce human labor and/or errors. In some embodiments, key pieces of technology are integrated to generate a pipeline and/or optimize workflow for tissue and/or organ imaging, such as imaging of brain slices. Among other things, automated fluidics, image acquisition and/or integrated analysis can be independently and optionally combined with fast hybridization cycle time and imaging time.

Hardware.

In some embodiments, an automated system requires minimum intervention from users and can perform the image acquisition automatically once the user has set up the experiments. In some embodiments, each sequencer consists of or comprises an automated epi-fluorescence microscope to perform the imaging and an automated fluidics system to perform the sequential hybridizations. In some embodiments, compressed air is used to push reagents into a 1 cm×1 cm well with cells and tissues fixed on the bottom coverslip. Without the intention to be limited by theory, Applicant notes that, in some embodiments, because of the high viscosity of the hybridization buffer, a compressed air driven system eliminates dead volume and also can be precisely controlled to deliver defined volumes of reagents. In some embodiments, a separate vacuum line is used to purge the chamber. In some embodiments, work flow of a provided protocol is similar to existing DNA sequencers at the time of the present invention, which is well known in the art.

In some embodiments, during each cycle of hybridization, a machine automatically hybridizes samples with probes, washes with buffer to remove excess probes, and/or scans the brain slices for imaging. In some embodiments, wherein DNase is used in a removing step, after imaging DNase is flowed in to remove the probes. After extensive wash, another round of hybridization can proceed. During hybridization time, a microscope moves to a different location on the stage to image another brain slice that has been hybridized and washed already. In such a way, a camera is acquiring images most of the time, while other samples on the stage are being hybridized.

Software.

In some embodiments, software is used, for example, to automate the control process and analysis of data. In some embodiments, codes are written in Micromanager, free software supported by the National Institutes of Health, to control a microscope as well as fluidics elements. In some embodiments, valves, stages, light sources, cameras and/or microscopes are controlled through Micromanager.

In some embodiments, compressed sensing is used for dense images (Zhu et al., Faster STORM using compressed sensing. Nat. Methods. 2012, 9(7):721-3) and deconvolution methods are used to separate out the spots in dense clusters. In some embodiments, improvement in image analysis increases multiplex capacity of provided methods, e.g., seqFISH (for example, by about 4-5 folds beyond the 100 gene multiplex). In some embodiments, efficiency is improved in a similar fashion to improvement from the Illumina GAII sequencer to the HiSeq machines, wherein using image processing methods to analyze densely packed clusters on the sequencing chip increased the throughput. In some embodiments, data acquisition and analysis are integrated in a user-friendly package.

In some embodiments, provided technology provides software packages for data analysis. In some embodiments, provided technologies provide software packages for data analysis in Python and Matlab. Images of provided technologies can be a variety of sizes, and can be optionally optimized if desirable. In some embodiments, each FOV is 6 Megapixels at 14 bits depth, corresponding to 1.5 MB of data per image. In some embodiments, about 100 GB of data are generated per run. In some embodiments, provided technologies provide methods for data processing and/or mitigating data log jam. In some embodiments, data log jam is mitigated by segmenting out the spots from each image, fitting them with 2 dimensional Gaussian distributions and recording the center position of the fits. In some embodiments, provided technologies save computer space by discarding raw images and saving processed data.

Light Sheet Microscopy with CLARITY Cleared Brain Slices.

In some embodiments, provided technologies provide methods for imaging a tissue, an organ and/or an organism. In some embodiments, provided technologies provide methods for measuring thick tissues or organs. In some embodiments, a thick tissue or organ has a thickness of about or more than 100 µm. In some embodiments, provided technologies preserve long range projections and morphology beyond that within single cells. In some embodiments, light sheet microscopy is used for measuring thick tissues or organs. In some embodiments, a tissue, an organ, and/or an organism is cleared by CLARITY (Chung et al., Structural and molecular interrogation of intact biological systems, Nature, 2013, doi: 10.1038/nature12107).

In some embodiments, to image thicker brain slices (>100 µm), which better preserve long range projections and morphology, light sheet microscopy, a.k.a. selective plane illumination microscopy (SPIM), is applied on CLARITY cleared brain tissues. In some embodiments, the CLARITY methodology renders the brain transparent for visualization and identification of neuronal components and their molecular identities. In some embodiments, CLARITY turns brain tissue optically transparent and macromolecule-permeable by removing light scattering lipids, which are replaced by a porous hydrogel to preserve the morphology of brain tissue, so that studies can be conducted without thinly sectioning the brain, which enables visualization of neurons of interest as well as their long-range synaptic connectivity. Without the intention to be limited by theory, Applicant notes that compared to FISH that was previously performed in culture or thin slices prior to the present invention, provided technologies can use thicker tissues and allow for more accurate reconstructions of individual neurons or 3D neuronal networks transcriptome. FIG. 16 illustrates a successful, validated Clarity-based protocol to prepare optically clear thick slices compatible with FISH staining: (1) 100 micron coronal brain slices in 2 mL Eppendorf tubes were incubated in 1.5 mL of 4% Acrylamide, 2% formaldehyde, 0.25% thermo-initiator, 1×PBS at 4 degrees overnight; (2) nitrogen gas was bubbled through the hydrogel solution for 10 seconds; (3) degassed samples were incubated for 2 hours at 42 degrees to polymerize; (4) samples were washed 3 times in PBS and incubated in 10% SDS, 1×PBS at 37 degrees for 4 hours to clear; and (5) samples were washed 3 times in PBS and ready to be used for seqFISH.

In some embodiments, provided technologies provide methods for minimizing or preventing out-of-focus background. In some embodiments, provided technologies utilize imaging technologies that minimize or prevent out-of-focus background. In some embodiments, SPIM is used for thicker slices that have higher out-of-focus background. In some embodiments, while a confocal microscope can reject this background, it scans slowly and photobleaches the upper layers of sample while imaging the lower layers. In some embodiments, in SPIM only the layer that is being imaged is illuminated by excitation light. In some embodiments, in a SPIM setup useful for provided technologies, two objectives placed perpendicular to each other are suspended over the sample at approximately 55°. In some embodiments, a light sheet is generated using a cylindrical lens to focus one axis of the beam into an about 10 µm height and an effective width or FOV of about 200 µm.

In some embodiments, the present invention provides microscope setups for provided methods. In some embodiments, the present invention provides a light sheet microscope, wherein the sample is illuminated from the side. In some embodiments, a light sheet is parallel to a sample stage. In some embodiments, a light sheet is perpendicular to the detection objective. An exemplary setup of a light sheet microscope is illustrated in FIG. 17. By adapting two mirrors and a cylindrical lens, a plane of light sheet is created and illuminates the sample from the side, and is perpendicular to the detection objective (middle). The bottom mirror is connected to the cylindrical lens and mounted directly onto the same base as the objective. With this configuration, the objective moves synchronously with the illumination sheet, allowing scanning of the sample along the Z-axis (right, top). The right (bottom) figure also shows that the sample is mounted inside the hybridization chamber, and imaged by an air objective below.

As illustrated in FIG. 18, SPIM images were acquired with a 100 µm brain slice that was CLARITY cleared and hybridized with HCR probes against β-actin. 200 optical sections with 0.5 µm spacing were taken to generate the reconstruction. Clear HCR signals were observed with a 20× water immersion objective. The β-actin mRNA is highly expressed, accounting for the large number of dots in the cell bodies. However, clear diffraction limited spots were also observed in axons.

In some embodiments, the HCR-seqFISH protocol can be adapted to CLARITY cleared brains and SPIM microscopy. 100 µm slices were efficiently hybridized in 4-5 hours, indicating that detectably labeled oligonucleotide probes can diffuse readily in 100 µm thick but cleared slices. In addition, DNase enzyme diffused readily as well to strip HCR signal from the slice (FIG. 15). In some embodiments, provided technologies provide detectably labeled oligo

than antibodies that persons having ordinary skill in the art routinely diffuse into 1 mm thick coronal slices, and provide profiling of targets for whole tissues or organs, e.g., performance of seqFISH on CLARITY cleared whole brains.

In some embodiments, provided technologies provide geometries of microscopy. In some embodiments, provided technologies provide alternative geometry of SPIM such that thick brain slices (>100 µm) and potentially entire CLARITY cleared brain can be mounted on an epifluorescence microscope with a long working distance objective. In some embodiments, a light sheet is generated perpendicular to the imaging axis, and sections the sample, mounted at an angle on the microscope, with 10 µm width over 200-300 µm. In some embodiments, a provided geometry allows direct carry-over of a developed flow chamber and automation design. In some embodiments, fiducial markers in brain slices are used to register successive slices. In some embodiments, nanoscopic rods are injected into the brain prior to sectioning, allowing good registration between different sections.

Speed.

In some embodiments, imaging speed limits the ultimate throughput. In some embodiments, provided HCR amplification provides more than sufficient number of photons for imaging, and less expensive cameras can be used to image the sample. In some embodiments, light from the collection objective can be split into multiple, e.g., 6 distinct paths (e.g., 5 fluorescence and 1 DAPI) with imaging flat dichroics and filters. This dramatically increases the throughput of in situ "sequencers," such that an entire brain can be completed in 1 week on a single microscope.

Target Genes Selection.

In some embodiments, the present invention provides technologies for selecting and imaging a set of targets, such as a set of transcripts and/or DNA loci (e.g., a set of 100 targets as exemplified). In some embodiments, target genes are chosen from the in situ database from the Allen brain atlas (ABA). Multiple criteria can be used to select genes of interest, e.g., those likely to represent the cellular identity in the cortex region. Computational selection of an optimal set of genes from overlapping criteria is well-known (Alon, N; Moshkovitz, Dana; Safra, Shmuel (2006), "Algorithmic construction of sets for k-restrictions", ACM Trans. Algorithms (ACM) 2(2): 153-177, ISSN 1549-6325; Cormen, T H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001), Introduction to Algorithms, Cambridge, Mass.: MIT Press and McGraw-Hill, pp. 1033-1038, ISBN 0-262-03293-7; Feige, U (1998), "A threshold of ln n for approximating set cover", Journal of the ACM (ACM) 45(4): 634-652, ISSN 0004-5411). In some embodiments, set-cover-heuristics (Pe'er, 2002) are used to select genes that: 1. are known to define sub cell types; 2. exhibit "salt and pepper" expression patterns in the ABA; 3. belong to a family of genes such as transcription factors, ion channels, GPCRs, and neurotrophins; and 4. are culled from RNA-seq experiments from cortex samples. For instance, SLC1A3 marks glia cells while SLC6A1 marks inhibitory neurons, and SLC17A7 marks excitatory neurons. In some embodiments, genes with heterogeneous expression pattern such as PVALB, SST and CALB2 mark out subsets of inhibitory neurons.
An exemplary set of 100 genes is shown nucleotides, such as FISH and HCR probes , that are smaller below : US 10,510,435 B2 65 66 -continued Gene Name Expression Profile Gene Name Expression Profile SLC6A1 all inhibitory ( I ) SLC17A7 all excitatory ( E ) SSTR2 Secondary Somatosensory L5 SLC1A3 glia 5 ZFP395 Secondary Somatosensory L5 PVALB subset I CCDC36 Secondary Somatosensory Lba SST subset I ST14 Secondary Somatosensory Lóa CALB2 subset I MYL12b Secondary Somatosensory L6b LER5 Isocortex RSPO2 Secondary Somatosensory L6b TNNC1 Isocortex NDNF L1 ( 1) MYL4 Isocortex 10 RASGRF2 L2 / 3 (1 ) SATB2 Isocortex CUX2 L2 / 3 / 4 CCL27a Isocortex RORB L4 BOC Primary Motor L1 SCNN1A L4 DACT2 Primary Motor L1 ETV1 L5 LHX1 Primary Motor L1 FEZF2 L5 PVRL3 Primary Motor L1 15 BCL6 L5 SLC44a3 Primary Motor L2 /L3 TRIB2 L5a KLK5 Primary Motor L2 /L3 FOXP2 L6 TNNC1 Primary Motor L2 /L3 TLE4 L6 / L6b WNT6 Primary Motor L2/ L3 CTGF L6b ZMAT4 Primary Motor L5 CYLD L2 / 3 STARDS Primary Motor L5 20 CMTM3 L2 / 3 TCF21 Primary Motor L5 ANKRD6 L2 / 3 MYL4 Primary Motor L5 KRT80 Primary Motor Lóa OLFR19 Primary Motor Lóa TBC1d30 Primary Motor Lóa Integration of seqFISH with Protein Detection , Organelle OLF16 Primary Motor L6b Markers and Activity Measurements . EAR6 Primary Motor Løb 25 In some embodiments , provided technologies , e.g., CHIT1 Primary Motor Løb seqFISH , allow multiplex analysis of RNA , as well as SLN Secondary Motor Li ADAMTS8 Secondary Motor Li proteins, neural activities , and structural arrangements in the EPYC Secondary Motor L1 same sample in situ with single cell resolution . Antibodies KCNV1 Secondary Motor L1 for specific targets can be hybridized in one additional round pcdh7 Secondary Motor L2 / L3 30 of hybridization to the sample . In some embodiments , GLT8d2 Secondary Motor L2 / L3 HKDC1 Secondary Motor L2 /L3 provided methods optionally comprise a step of immunos SRPX Secondary Motor L3 taining. 
In some embodiments , multiple antibodies are used ZFP458 Secondary Motor L3 to detect many protein targets in sequential rounds of SLC30a8 Secondary Motor L3 hybridization ( Schubert W et al . Analyzing proteome topol GK5 Secondary Motor L5 TEX28 Secondary Motor L5 35 ogy and function by automated multidimensional fluores MS4a10 Secondary Motor L5 cence microscopy . Nat Biotechnol (2006 ) 24 ( 10 ): 1270 KRT16 Secondary Motor L6a 1278 ). Applicant notes that there are up to about 100-1000 KRT42 Secondary Motor Lóa or more fold higher abundance of proteins over mRNAs in DOC2a Secondary Motor L6a KRT33b Secondary Motor Løb cells . Targeted proteins can mark cellular organelles such as YBX Secondary Motor L6b 40 mitochondria , ER , transport vesicles, cytoskeleton , as well PNPLA5 Secondary Motor L6b as synaptic junctions. For example ,MAP2 antibodies can be TMEM215 Primary Somatosensory Li used to mark out cell boundaries to help segmentation of SDC1 Primary Somatosensory L1 axons and dendrites . PREX1 Primary Somatosensory L1 DIEXF Primary Somatosensory Li Live observation of brain slices can be imaged on the DHRS7c Primary Somatosensory L2 /L3 45 epi- fluorescence and light sheet microscope prior to tran DDIT41 Primary Somatosensory L2 /L3 scription profiling by provided methods ( e.g., seqFISH ) . TDG Primary Somatosensory L2 /L3 EPSTI1 Primary Somatosensory L2 / L3 Calcium (Nakai J , Ohkura M , Imoto K (February 2001 ). “ A RORO Primary Somatosensory L4 high signal- to -noise Ca( 2+) probe composed of a single GSC2 Primary Somatosensory L4 green fluorescent protein ” . Nat. Biotechnol . 19 (2 ) : 137-41; KRT10 Primary Somatosensory L4 50 Akerboom et al. , “ Optimization of a GCAMP calcium indi GCA Primary Somatosensory L4 DCBLD2 Primary Somatosensory L5 cator for neural activity imaging. ” J Neurosci . 2012 Oct. 3 ; ABCD2 Primary Somatosensory L5 32 ( 40 ) : 13819-40 ; Stosiek et al. 
, “ In vivo two- photon cal GTDC1 Primary Somatosensory L5 cium imaging of neuronal networks. ” Proceedings of the IL17RA Primary Somatosensory Lóa National Academy of Sciences 100 (12 ): 7319 ) and voltage TBR1 Primary Somatosensory L6a PPID Primary Somatosensory Lba 55 sensor ( Cohen , et al. , “ Optical Measurement of Membrane IGHM Primary Somatosensory L6b Potential” in Reviews of Physiology. ” Biochemistry and MMGT1 Primary Somatosensory Løb Pharmacalogy , vol. 83 , pp . 35-88 , 1978 ( June ); Mutoh et al. , CPLX3 Primary Somatosensory L6b Genetically Engineered Fluorescent Voltage Reporters ACS ART2b Secondary Somatosensory L1 GNB4 Secondary Somatosensory L1 Chem Neurosci. 2012 August 15 ; 3 ( 8 ): 585-592 ; Peterka et BUGAT2 Secondary Somatosensory L1 60 al. , Imaging voltage in neurons. Neuron . 2011 Jan. 13 ; PDC Secondary Somatosensory L2 /L3 69 ( 1 ): 9-21) can be imaged in the brain slices . SPIM allows ADIG Secondary Somatosensory L2/ L3 efficient and fast imaging of these sensors in brain slices. FPR1 Secondary Somatosensory L2 /L3 INHBC Secondary Somatosensory L4 Brain slices are fixed on the microscope and provided RUFY4 Secondary Somatosensory L4 protocols such as seqFISH protocols can be performed with HGFAC Secondary Somatosensory L4 65 automated fluidics. In some embodiments , in addition to the EFCAB4b Secondary Somatosensory L5 live measurements , mRNAs of activity dependent immedi ate early genes ( IEGs) are detected as a measure of the US 10,510,435 B2 67 68 integrated neural activities in the neurons. For example , tracers are injected into five of the main targets of one of the CamKII and cFos were readily detected in neurons with main nodes of each sensorimotor subnetwork in the same heterogeneous expression levels ; they can be incorporated in animal (tracer information below ) . For example , in one a set of genes, e.g. 
, an exemplary 100 gene multiplex or animal , circuit tracers are injected into two of the major FISHed separately in additional cycles depending on abun- 5 cortical nodes (SSP-bfd.al and SSs- r & cv ) and three of the dance . subcortical nodes ( caudoputamen ventrolateral domain , CP Integrating Connectomics and seqFISH to Identify vl; ventral posteromedial thalamic nucleus, VPM ; and ven Molecular Identities of Distinct Neurons within Different tral spinal trigeminal nucleus, SPV ) of the orofaciopharyn Somatic Sensorimotor Neural Networks. geal subnetwork . This simultaneously back labels five To Systematically Characterize the Molecular Identities 10 different neuronal populations in all of the other nodes of the of Distinct Neuronal Populations within the Somatic Sen orofaciopharyngeal subnetwork . In this example , labeled sorimotor Neural Networks Using Provided Technologies . neurons are in the SSp - m / n domain and in the MOp -oro . In some embodiments , neuronal populations within the On the other hand , tracer can be injected into four same functional subnetworks can share common sets of different SSp body subfield domains ( i.e. SSp - m / n , SSp - ul, marker genes , but also have heterogeneous expression of 15 SSp - 11/ tr , and SSp-bfd.cm ) , each of which belongs to a other genes that defines identity at a cellular level. In some distinct somatic subnetwork . This simultaneously labels embodiments , cells in different subnetworks differ more in distinct neuronal populations in cortical areas associated their expression patterns . Exemplary cortico -cortical with the different subnetworks. In this case for example , somatic subnetworks each of which controls a basic class of back labeled neurons are observed in the MOp domains sensorimotor function are : ( 1 ) orofaciopharyngeal for eating 20 associated with each subnetwork , i.e MOp -orf , MOp -ul , and drinking, (2 ) upper limbs for reaching and grabbing, (3 ) MOp - 11 /tr , and MOp - w . 
This injection strategy applied to all lower limbs for locomotion , and (4 ) whisker for rhythmic the main nodes and subcortical targets of each of the four whisker movements . In some embodiments , provided tech somatic sensorimotor subnetworks labels distinct neuronal nologies provide a novel and rigorous approach for charac populations of each of the subnetworks . terizing molecular identities of cortical neurons with distinct 25 After injection of the tracers (e.g. , one week following the neural networks and provide invaluable information for injection of the tracers ), animals are sacrificed , and their understanding genetic circuits underlying the wiring dia brains are harvested and coronally sectioned at 20 um or 100 gram of the mammalian brain . um thickness for seqFISH analysis of back labeled neurons. Using a collection of neuronal pathways, digital cortical Genes , such as the exemplified approximately 100 genes connectivity atlas can be generated to display raw images of 30 that are richly expressed in the somatic sensorimotor cortical tract tracing studies . Pathways can be graphically recon areas (SSp , MOP ,MOs , SSs ) can be preselected for profiling structed to create cortico -cortical connectivity map to help using , for example , the online digital gene expression data analysis of large - scale data . Based on intracortical connec base of the Allen Brain Atlas project ( www.Brian-Map.org ) tivity , four distinct cortico -cortical somatic subnetworks can (Lein et al. , 2007 Genome- wide atlas of gene expression in be established each of which controls a basic class of 35 the adult mouse brain . Nature, 11 ; 445 (7124 ) :168-76 ) . sensorimotor function . Each of these subnetworks comprises Injection Strategy and Post Injection Processing . 4-5 distinct functional domains in the primary somatic Three hundred 4 -week - old male C57B1/ 6 mice are used . 
sensory (SSp ) , primary somatomotor (MOp ), secondary In one animal, five fluorescent retrograde tracers are injected somatomotor (MOs ), and supplementary somatosensory into either different nodes within the same somatic senso ( SSS ) cortical areas, which were further subdivided accord- 40 rimotor subnetworks, or different nodes in different somatic ing to their strength of connectivity with other somatic subnetworks as described above (FIG . 19 ) . The tracers are sensorimotor areas corresponding to a specific body sub Fluorogold (FG , yellow ) , cholera toxin b conjugated with field . In some embodiments , the orofaciopharyngeal subnet 488 or 647 (CTb -488 [ green ], CT6-647 ( pink ]) , red retro work comprises five major nodes: (1 ) the SSp mouth and beads (RR , red ), and wheat germ agglutinin conjugated with nose domain (SSp - m / n ); ( 2 ) the MOp orofacial domain 45 655 (WGA - Qdot655 , white ) . Since Qdot655 has an excita (MOp -orf ); (3 ) the MOs rostrodorsolateral domain (MOS tion wavelength that differs from CTb 647 , it can be cap rdl) ; (4 ) the SSp barrel field anterolateral domain (SSP tured into a different channel and pseudocolored with a bfd.al ); and (5 ) the SSs rostral and caudoventral domain unique hue. The tracers are injected ( either iontophoretically (SSs - r & cv ). In some embodiments , the fourmajor nodes of or with pressure injection ) via stereotaxic surgeries . Details the upper limb subnetwork comprise (1 ) the SSp upper limb 50 on surgeries and perfusions are described , e.g., in Hintiryan ( SSp -ul ) ; ( 2 ) MOp- ul; ( 3 ) rostrodorsal MOS (MOS - rd ) ; and et al ., Comprehensive connectivity of the mouse main (4 ) caudodorsal SSs (SSs - cd ) . In some embodiments, the olfactory bulb :analysis and online digital atlas . Front Neu lower limb/ trunk subnetwork comprise the SSp lower limb/ roanat. 2012 Aug. 7 ; 6:30 . eCollection 2012. 
In some trunk region (SSP - 11/ tr ) , the MOP - 11/ tr , and the rostrodorso embodiments , two paired mice are injected with the same medial MOS (MOs - rdm ) ( FIG . 10 B - D ) . In some embodi- 55 tracers used in the exact same coordinates . One of the ments , the whisker subnetwork comprises the caudomedial animals is used to validate locations of labeled cells and SSp -bfd (SSp-bfd.cm ), MOP- W , which corresponds to the injection sites , while the other is subjected to provided , e.g., vibrissal primary motor cortex (vM1 ) and the caudodorsal seqFISH methods. One is perfused following tracer trans SSs (SSs - cd ; FIG . 19 ) . Exemplary data are described by the port , and brains are coronally sectioned into 50 um thickness Mouse Connectome Project (www.mouseconnectome.org ) . 60 sections and collected in four series. One in four series of To determine molecular identities of distinct neuronal sections is counterstained with a fluorescent Nissl stain populations in each of these somatic sensorimotor subnet solution (NeuroTrace Blue [NTB ] ) , mounted onto glass works , multi - fluorescent retrograde tracers are used to label slides, and imaged using an Olympus VS120 virtualmicros neurons, and provided technologies such as seqFISH can be copy system . In some embodiments , the Nissl stain provides applied to determine the gene expression profile of retro- 65 cytoarchitectonic background for visualizing precise ana gradely labeled population at single cell resolution . To label tomical location of back labeled cells . These images are the neuronal populations, multiple ( e.g., five ) retrograde processed through an informatics pipeline so that every US 10,510,435 B2 69 70 individual image is faithfully registered onto its correspond and corresponding ARA atlas level. In some embodiments , ing level of the Allen Reference Atlas (ARA ) . 
This Nissl , to faithfully associate seqFISH information with its corre along with provided informatics tools , enables automatical sponding retrogradely labeled somata , labeled cell bodies and precise counting of the approximate number of each are discretely segmented from tissue background and from tracer labeled neuronal population in each ROI (in this case , 5 images of the same section , but acquired at different rounds the different domains of somatic sensorimotor areas ). The (e.g. , first for image retrograde tracers, then for different distribution patterns are automatically plotted onto the cor mRNA in seqFISH ), and spatially indexed by their coordi responding atlas level to create their connectivity map . nates relative to a fixed reference point on either the slide The paired mice are sacrificed at the same time and their and /or an anatomical landmark in the tissue . In some brains are sectioned at either 20 um or 100 um thickness for 10 embodiments , to associate data with a stereotaxic coordinate seqFISH analysis . These sections are first imaged under 20x defined in the ARA , the present invention provides a novel ( or 10x ) objective to reveal back labeled neurons with registration pipeline that dramatically increases registration different tracers . Brain sections through all coronal levels accuracy (i.e. warping each scanned microscopy image to containing the somatic sensorimotor areas are used to per the shape of the corresponding level of the ARA ) , and image form seqFISH for an exemplified set of 100 genes. This 15 segmentation that automatically and accurately enumerates method reveals the gene expression in every tracer labeled fluorescently labeled neurons in a given ROI (e.g. , SSp - m , neuron . All images are analyzed first for gene expression MOP - 11) . In some embodiments , provided technologies col profiles of each individual tracer labeled neurons . 
Each lectively allow for labeling and seqFISH data from multiple section is registered back to the closest matched section of tracers within a brain , and across multiple brains, to be its paired brain so that sections at approximately the same 20 collated into a single anatomical framework for the purposes coronal level from the paired brains can be displayed of visualization and annotation . alongside in a connectome viewer. As such , molecular In some embodiments , images are registered at corre profiles of different neuronal populations are displayed sponding atlas levels of the Allen Reference Atlas (Dong , H. within its closest matched anatomic background . In some W. ( 2008 ). The Allen Reference Atlas: A Digital Color Brain embodiments , gene expression profiles are correlated in 25 Atlas of C57BL /6J Male Mouse , John Wiley & Sons ). The each retrogradely labeled neuronal population . deformation matrix resulting from the registration process is Results . applied on the original resolution images to get the high In some embodiments , distinct retrogradely labeled neu resolution warped images . Following registration and reg ronal populations within different somatic sensorimotor istration refinement, the NeuroTrace® fluorescent Nissl areas display different transcriptome profiles; even neuronal 30 stain is converted to a bright- field image. Next, each channel populations in the same domain ( e.g., SSp - m ) that are for every image is adjusted for brightness and contrast to labeled with different tracers display distinct gene expres maximize labeling visibility and quality in tools, e.g., iCon sion profile from its neighboring neurons that have different nectome. After modifications ( i.e. skewness , angles ) and connectivity profiles . In some embodiments , different neu JPEG2000 file format conversions , images can be published ronal populations within different somatic sensorimotor 35 to iConnectome view (www.MouseConnectome.org ). 
nodes within the same subnetwork ( e.g., SSp - m , MOp -orf , FISH Visualization Tool. SSp-bfd.al, and SSs- r) share common network -specific In some embodiments , all connectivity data produced in genes, while neuronal populations within different neural are processed through the MCP informatics pipeline and networks ( e.g., the orofaciopharyngeal and lower limb/ trunk presented online through a new iConnectome FISH Viewer subnetworks) display very distinct transcriptome profiles . In 40 (www.MouseConnectome.org ) . Different from available some embodiments , regional ( e.g. , SSp or MOp ) or laminar iConnectome viewer , which displays two anterograde (different layers) specific genes are identified for those (PHAL and BDA ) and two retrograde (FG and CTb ) label neurons in different cortical areas and different layers . ing, iConnectome FISH can display up to five different As exemplified , provided technologies provide a unique neuronal populations with retrograde fluorescent dyes . As combination of fluorescent tract tracing with seqFISH tech- 45 mentioned above , each set of injections can be given to a nology to characterize molecular identities of connectivity pair ofmice . One can be processed following a regular MCP based neuronal populations (cell types) within distinct pipeline and be presented in iConnectome FISH viewer to somatic sensorimotor networks with subneuronal resolution display multiple fluorescent labeled neuronal populations and faithful anatomic background. See , e.g. , FIGS. 3 and 9 within their own Nissl- stained cytoarchitetural background for exemplary results . In some embodiments , provided tech- 50 and within their corresponding ARA level. These can pro nologies comprise measuring other parameters in parallel vide precise anatomic information for each of the fluorescent ( i.e. antibody and organelle stains , as well as IEG expression labeled neuronal populations across the entire brain . 
Brain levels ) , and can be applied to the entire neocortex or brain . sections from its paired partner following seqFISH are High Throughput Pipelines and Informatics Tools for registered onto the closest ARA level that its paired partner Analyzing and Presenting Data Online. 55 was registered to and can be displayed side by side in In some embodiments , provided technologies provide different window . A list of genes that were expressed in the high throughput pipelines and informatics tools for analyz neurons can be listed on the side panel. Upon clicking on the ing and presenting data online through , e.g. , a publicly gene, the fluorescently labeled neurons that expressed this accessible database , such as www.MouseConnectome.org ) . gene can light up to indicate its expression locations . This In some embodiments , provided technologies provide inte- 60 provides a practical way to display the molecular identities gration with Mouse Connectome Project , whose broad scope of neuronal populations within the global context of con of study and use of multi - fluorescent imaging make it a nectivity and anatomical background . valuable tool among the connectomic community and well In some embodiments , a corresponding database is devel suited for studying long- range connectivity in the mouse oped that allows users to analyze these data and to correlate brain . For example , it offers online visualization tools that 65 neural connectivity with their molecular identities. This allow users to visualize multiple fluorescent labeled path informatics tool is built on top of a database that stores ways on the top of their own cytoarchitectural background information associated with each retrograde labeled neu US 10,510,435 B2 71 72 ronal population (e.g. cell numbers , anatomic location ) with Actb transcripts after washing with wash buffer and addition gene barcoding . 
This database can help users to identify of only the new bridging strand along with the correspond corresponding gene barcodes for neurons within the same ing hairpins tagged with Alexa 647 dye . The contrast ratio of neural networks or distinct neural networks . the images was adjusted to illustrate certain features of the Mapping the Whole Brain . 5 method . In some embodiments , provided technologies have suffi In some embodiments , detectably labeled oligonucle cient sensitivity , selectivity , automation , and / or spatiotem otides are removed by displacement using complementary poral resolution at single neuron level for high - throughput oligonucleotides (cTOE ) . In some embodiments , displace analysis of gene expression in retrogradely labeled neurons ment comprises use of a dextran or a derivative thereof, a for whole brains . 10 salt , and /or an organic solvent. In some embodiments , Additional Exemplary Methods for Removing Steps displacement comprises use of a dextran or a derivative In some embodiments , the present invention provides a thereof. In some embodiments , displacement comprises use varieties of methods for removing detectably labeled oligo of dextran sulfate . In some embodiments , displacement nucleotides from targets . In some embodiments , exonu comprises use of a salt . In some embodiments , a salt is clease III (ExoIII ) is used to remove detectably labeled 15 MgCl2 . In some embodiments , displacement comprises use oligonucleotides . FIG . 21 illustrates an exemplary process of an organic solvent. In some embodiments , an organic for HCR re- hybridization using Exo III . In FIG . 21, Exo III solvent is formamide. A variety of factors , for example but digests bridging strands and HCR polymers , keeping inter not limited to TOE concentration , incubation time, buffer mediate oligonucleotides intact for hybridization with new composition and type and /or concentration of organic sol bridging strands . 
Exemplary data were presented in FIG . 21 20 vent can be optimized individually or in combination . FIG . ( b ) using detectably labeled oligonucleotides targeting beta 24 showed exemplary data of displacement of smFISH actin ( Actb ) transcripts in T3T mouse fibroblast cells. The probes using TOE. The mean ratio of fluorescence intensity left image showed the initial hybridization and amplification between the smFISH probe to be displaced ( Alexa 647 ) and of Actb transcripts using Alexa 488 dye. The middle image a colocalized smFISH probe ( Alexa 532 ) is shown . Various showed complete loss of signal in the Alexa 488 channel 25 treatments were performed in which the concentration of after a 1 hour incubation in exolII at room temperature. The TOE , hybridization buffer composition and displacement right image showed re - amplification of Actb transcripts after time were compared . All displacement probe conditions addition of only the new bridging strand and the correspond resulted in significantly more displacement than the control ing hairpins tagged with Alexa 647 dye . The contrast ratio of in which cells were placed in 10 % DS and no cTOE was the images was adjusted to illustrate certain features of the 30 added . Without the intention to be limited by theory, Appli method . cant notes that , among other things , increasing the concen In some embodiments , Lambda Exonuclease (a - exo ) is tration of CTOE , increasing the amount of time that cTOE used remove detectably labeled oligonucleotides . FIG . 22 probes hybridized , adjusting buffers to 10 mM MgCl, or illustrates an exemplary process for HCR re -hybridization 10 % formamide all resulted in increased displacement . using à -exo . In FIG . 
22 , à -exo digests 5 ' phosphorylated 35 TOE at 2.5 uM for 2 hours in 10 % Dextran sulfate (DS ) bridging strands and releases HCR polymers from interme results in minimal residual Alexa 647 smFISH signal but a diate oligonucleotides bound to targets , e.g., mRNAs and minor increase over the baseline signal determined by keeps intermediate oligonucleotides intact for hybridization hybridizing Alexa 594 (A594 ) in place of Alexa 647 and not with new bridging strands after washing out released poly adding TOE . mers. Exemplary data were presented in FIG . 22 ( b ) using 40 Additional Examples for Oligonucleotide Preparation detectably labeled oligonucleotides targeting beta - actin A set of sequences were amplified by PCR (FIG . 25 ) . The ( Actb ) transcripts in T3T mouse fibroblast cells . The left product was isolated, e.g. , precipitated using 5 volumes of image showed the initial hybridization and amplification of precipitation buffer (30 :1 EtOH : 1M NaOAc ) at -20 OC for Actb transcripts using Alexa 488 dye. The middle image at least 10 minutes . The precipitation mixture was centri showed loss of signal in the Alexa 488 channel after a 1 hour 45 fuged for 10 minutes. The supernatant was discarded and the incubation in à - exo at 37 ° C. The right image showed oligonucleotide pellet was reconstituted in nicking enzyme re - amplification of Actb transcripts after washing with wash buffer with the appropriate units of enzyme, based on that buffer and addition of only the new bridging strand along about 10 units of enzyme digest about 1 ug of DNA in 1 with the corresponding hairpins tagged with Alexa 647 dye . hour. Once the incubation time had elapsed , the sample was The contrast ratio of the images was adjusted to illustrate 50 again precipitated and reconstituted in 2x loading buffer certain features of the method . 
( 96 % formamide /20 mM EDTA ) and water to make a final In some embodiments , Uracil -Specific Excision Reagent loading buffer (48 % formamide/ 10 mM EDTA ) . The sample (USER ) is used to remove detectably labeled oligonucle was heated to 95 ° C. to completely denature the DNA . The otides. FIG . 23 illustrates an exemplary process for HCR denatured DNA was then loaded into a denaturing acrylam re- hybridization using USER . In FIG . 23 , USER digests at 55 ide gel (8M urea 10-12 % acrylamide) . The gel was run at deoxyuridine nucleotides in bridging strands and releases 250V for 1 hour, or optimized as desired. After electropho HCR polymers from intermediate oligonucleotides bound to resis , the gel was stained using 1x sybr gold for 15 minutes targets , e.g. , mRNAs and keeps intermediate oligonucle and then visualized . The appropriate band was cut out, otides intact for hybridization with new bridging strands crushed , and incubated in DI water for 2 hours . After after washing out fragments and released polymers. Exem- 60 incubation , the sample was precipitated again and then plary data were presented in FIG . 23 (6 ) using detectably purified using a vacuum column . The column was eluted labeled oligonucleotides targeting beta -actin (Actb ) tran with 30 L of RNase free water to yield the final product, as scripts in T3T mouse fibroblast cells . The left image showed shown in FIG . 26 . the initial hybridization and amplification of Actb transcripts In some embodiments , provided methods use restriction using Alexa 488 dye. The middle image showed loss of 65 sites instead of nicking endonuclease sites. Similar to the signal in the Alexa 488 channel after a 1 hour incubation in amplification step in FIG . 25 , a set of sequences are ampli USER at 37 ° C. The right image showed re- amplification of fied by PCR , with a BamHI site flanking the 5 '- end , and an US 10,510,435 B2 73 74 AatIl site flanking the 3 ' - end . 
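The flanking-site layout just described (a BamHI site at the 5' end and an AatII site at the 3' end of each amplified sequence) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure of the specification: the function names and the 20-nt probe are hypothetical, primer-binding regions are omitted, and the enzymes' actual cut positions inside their sites (and the resulting overhangs) are ignored. The only facts relied on are the standard recognition sequences, GGATCC for BamHI and GACGTC for AatII.

```python
# Recognition sequences of the two restriction enzymes named in the text.
# Both are palindromic, so scanning one strand is sufficient.
BAMHI_SITE = "GGATCC"
AATII_SITE = "GACGTC"


def flank_with_sites(probe: str) -> str:
    """Build a hypothetical template: BamHI site + probe + AatII site."""
    return BAMHI_SITE + probe.upper() + AATII_SITE


def has_internal_site(probe: str) -> bool:
    """Return True if the probe itself contains either recognition sequence.

    Such a probe would also be cut internally during the digestion step.
    """
    seq = probe.upper()
    return BAMHI_SITE in seq or AATII_SITE in seq


def excise_probe(template: str) -> str:
    """Return the sequence between the 5' BamHI site and the 3' AatII site.

    Simplification (assumption): both sites are dropped whole instead of
    modelling where each enzyme cuts within its site.
    """
    seq = template.upper()
    start = seq.find(BAMHI_SITE)
    end = seq.rfind(AATII_SITE)
    if start == -1 or end == -1 or end < start + len(BAMHI_SITE):
        raise ValueError("template is not flanked by BamHI (5') and AatII (3') sites")
    return seq[start + len(BAMHI_SITE):end]


if __name__ == "__main__":
    probe = "ACGTTGCAAGCTTACGATCG"  # made-up 20-nt probe, for illustration only
    template = flank_with_sites(probe)
    print(template)
    print(excise_probe(template) == probe)
```

Screening candidate probes with `has_internal_site` before adding the flanks is one possible design check, since an internal GGATCC or GACGTC would be cleaved along with the flanking sites.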
The PCR product is precipitated with 5 volumes of precipitation buffer (30:1 EtOH:1M NaOAc) at -20 °C for at least 10 minutes and isolated, followed by digestion with BamHI and AatII. The product is again purified, and subjected to exo III digestion. Removal of the digested nucleic acids provides the product oligonucleotides.

Example 2

Brain Slice Analysis

As an illustration, barcodes generated using the error correction mechanisms disclosed herein are used for in situ transcription profiling of single cells, which reveals the spatial organization of cells in the mouse hippocampus.

Identifying the spatial organization of tissues at cellular resolution from single cell gene expression profiles is essential to understanding many biological systems. In particular, there exists conflicting evidence on whether the hippocampus is organized into transcriptionally distinct subregions. Here, a generalizable in situ 3D multiplexed imaging method was applied to quantify hundreds of genes with single cell resolution via Sequential barcoded Fluorescence in situ hybridization (seqFISH) (Lubeck et al., 2014). seqFISH was used to identify unique transcriptional states by quantifying and clustering up to 249 genes in 16,958 cells. By visualizing these clustered cells in situ, we identified distinct layers in the dentate gyrus corresponding to the granule cell layer, composed of predominantly a single cell class, and the subgranular zone, which contains cells involved in adult neurogenesis. Furthermore, it was discovered that distinct subregions within the CA1 and CA3 are composed of unique combinations of cells in different transcriptional states, instead of a single state in each sub-region as previously proposed. In addition, it was revealed that while the dorsal region of the CA1 is relatively homogenous at the single cell level, the ventral part of the CA1 has a high degree of cellular heterogeneity. These structures and patterns are observed in sections from different mice, as well as in seqFISH experiments with different sets of genes. Together, these results demonstrate the power of seqFISH in transcriptional profiling of complex tissues.

The mouse brain contains about 10^8 cells arranged into distinct anatomical structures. While cells in these complex structures have been traditionally classified by morphology and electrophysiology, their characterization has been recently aided by gene expression studies. In particular, the Allen Brain Atlas (ABA) provides a systematic gene expression database using in situ hybridization (ISH) of the entire mouse brain one gene at a time (Dong et al., 2009; Fanselow and Dong, 2010; Thompson et al., 2008). This comprehensive reference provides regional gene expression information, but lacks the ability to correlate the expression of different genes in the same cell. More recently, single cell RNA sequencing (RNA-seq) has identified many cell types based on gene expression profiles (Darmanis et al., 2015; Tasic et al., 2016; Zeisel et al., 2015). However, while single cell RNA-seq provides useful information on multiple genes in individual cells, it has relatively low detection efficiencies and requires cells to be removed from their native environment, resulting in the loss of spatial information. These different approaches can lead to contradictory descriptions of cellular organization in the brain and other biological systems.

In the hippocampus, recent RNA-seq data suggests that CA1 is composed of cells with a continuum of expression states (Cembrowski et al., 2016; Zeisel et al., 2015), while ABA analysis indicates that sub-regions within the CA1 have distinct expression profiles (Thompson et al., 2008). To resolve the two conflicting descriptions of hippocampal organization, a method to profile transcription in situ in the hippocampus with single cell resolution is needed. Here, we demonstrate a general method that enables the mapping of cells and their transcription profiles with single molecule resolution in tissue, allowing an unprecedented resolution of cellular transcription states for molecular neuroscience (FIG. 29A).

A great deal of progress has been made recently in developing highly quantitative methods to profile the transcriptome of single cells. Building upon single molecule fluorescence in situ hybridization (smFISH) (Femino et al., 1998; Raj et al., 2006), Lubeck et al. devised a general method to highly multiplex single molecule in situ mRNA imaging irrespective of transcript density using super-resolution microscopy (Betzig et al., 2006; Rust et al., 2006; Lubeck and Cai, 2012). However, the spectral barcoding methods used in these previous works are difficult to scale up beyond 20-30 genes because of the limited number of fluorophores (Fan et al., 2001; Lubeck and Cai, 2012).

To overcome the scalability problem, a temporal barcoding scheme was developed that uses a limited set of fluorophores and scales exponentially with time (Lubeck et al., 2014). Specifically, by using sequential rounds of probe hybridizations on the mRNAs in fixed cells to impart a unique pre-defined temporal sequence of colors, different mRNAs can be uniquely identified in situ. The multiplex capacity scales as F^N, where F is the number of fluorophores and N is the number of rounds of hybridization. Thus, one can increase the multiplex capacity by increasing the number of rounds of hybridization with a limited pool of fluorophores. This approach is called Sequential barcoded Fluorescence in situ Hybridization (seqFISH) (Lubeck et al., 2014). In parallel, in situ sequencing methods were developed to directly sequence transcripts in tissue sections, but these methods suffer from low detection efficiency (<1%) (Ke et al., 2013; Lee et al., 2014). Recently, Chen et al. expanded the error correction method in the original seqFISH demonstration by using a Hamming distance 2 based error correcting barcode system, called merFISH. However, this implementation requires larger transcripts (>6 kb) and many more rounds of hybridization than the method described here (Chen et al., 2015b). Furthermore, seqFISH and its variants have only been applied in cell culture systems due to the difficulty of smFISH detection in tissue. Here, an improved version of seqFISH was demonstrated in complex tissues by including signal amplification and a time-efficient error correction scheme (FIGS. 29A-D, FIG. 37) to resolve the structural organization of the hippocampus with single cell resolution.

Example 3

Brain Slice Analysis with Error Correction

Signal Amplification and Error Correction Enable Robust Detection of mRNAs in Tissues.

To overcome the autofluorescence and scattering inherent to brain tissues, we used an amplified version of smFISH, called single molecule Hybridization Chain Reaction (smHCR) (FIG. 29E) (Shah et al., 2016). Single molecule HCR amplified signal 22.1 ± 11.5 (mean ± s.d., n = 1288, FIG. 38B) fold compared to smFISH, enabling robust and rapid detection of individual mRNA molecules in tissues and facile