Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine 829

Detection and Sequencing of Amplified Single Molecules

RONGQIN KE

ACTA UNIVERSITATIS UPSALIENSIS ISSN 1651-6206 ISBN 978-91-554-8508-5 UPPSALA urn:nbn:se:uu:diva-183141 2012 Dissertation presented at Uppsala University to be publicly examined in Rudbecksalen, Dag Hammarskjölds v 20, Rudbecklaboratoriet, Uppsala, Thursday, December 6, 2012 at 09:15 for the degree of Doctor of Philosophy (Faculty of Medicine). The examination will be conducted in English.

Abstract Ke, R. 2012. Detection and Sequencing of Amplified Single Molecules. Acta Universitatis Upsaliensis. Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine 829. 48 pp. Uppsala. ISBN 978-91-554-8508-5.

Improved analytical methods provide new opportunities for both biological research and medical applications. This thesis describes several novel molecular techniques for nucleic acid and analysis based on detection or sequencing of amplified single molecules (ASMs). ASMs were generated from padlock probe assay and proximity ligation assay (PLA) through a series of molecular processes. In Paper I, a simple colorimetric readout strategy for detection of ASMs generated from padlock probe assay was used for highly sensitive detection of RNA virus, showing the potential of using padlock probes in the point-of-care diagnostics. In Paper II, digital quantification of ASMs, which were generated from padlock probe assay and PLA through circle-to- circle amplification (C2CA), was used for rapid and sensitive detection of nucleic acids and , aiming for applications in biodefense. In Paper III, digital quantification of ASMs that were generated from PLA without C2CA was shown to be able to improve the precision and sensitivity of PLA when compared to the conventional real-time PCR readout. In Paper IV, a non-optical approach for detection of ASMs generated from PLA was used for sensitive detection of bacterial spores. ASMs were detected through sensing oligonucleotide- functionalized magnetic nanobeads that were trapped within them. Finally, based on in situ sequencing of ASMs generated via padlock probe assay, a novel method that enabled sequencing of individual mRNA molecules in their natural context was established and presented in Paper V. Highly multiplex detection of mRNA molecules was also achieved based on in situ sequencing. In situ sequencing allows studies of mRNA molecules from different aspects that cannot be accessed by current in situ hybridization techniques, providing possibilities for discovery of new information from the complexity of transcriptome. Therefore, it has a great potential to become a useful tool for gene expression research and disease diagnostics.

Keywords: padlock probes, proximity ligation assay, rolling circle amplification, circle-to- circle amplification, single molecule detection, amplified single molecules, sequencing, in situ, single cell

Rongqin Ke, Uppsala University, Department of Immunology, Genetics and Pathology, Molecular tools, Rudbecklaboratoriet, SE-751 85 Uppsala, Sweden.

© Rongqin Ke 2012

ISSN 1651-6206 ISBN 978-91-554-8508-5 urn:nbn:se:uu:diva-183141 (http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-183141)

The mechanic, who wishes to do his work well, must first sharpen his tools.

Confucius

List of Papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I Ke R, Zorzet A, Göransson J, Lindegren G, Sharifi-Mood B, Chinikar S, Mardani M, Mirazimi A, Nilsson M. (2011) Col- orimetric nucleic acid testing assay for RNA virus detection based on circle-to-circle amplification of padlock probes. Jour- nal of Clinical Microbiology, 49(12):4279–85.

II Göransson J, Ke R, Nong RY, Howell WM, Karman A, Grawé J, Stenberg J, Granberg M, Elgh M, Herthnek D, Wikström P, Jarvius J, Nilsson M. (2012) Rapid identification of bio- molecules applied for detection of biosecurity agents using roll- ing circle amplification. PLoS One, 7(2):e31068.

III Ke R*, Nong RY*, Fredriksson S, Landegren U, Nilsson M. Digital quantification of proximity ligation products for highly sensitive and precise protein measurement. Manuscript. *Equal contribution

IV Torre TZ*, Ke R*, Mezger A, Svedlindh P, Strømme M, Nils- son M. (2012) Sensitive detection of spores using volume- amplified magnetic nanobeads. Small, 8(14):2174-7. *Equal contribution

V Ke R*, Mignardi M*, Pacureanu A, Botling J, Wählby C, and Nilsson M. In situ sequencing of mRNA in intact cells and tis- sues. Manuscript. *Equal contribution

Reprints were made with permission from the respective publishers.

Related work by the author

I Sun S, Ke R, Hughes D, Nilsson M, Andersson DI. (2012) Ge- nome-wide detection of spontaneous chromosomal rearrange- ments in bacteria. PLoS One, 7(8):e42639.

Contents

Introduction ...... 11 Biomolecular analysis ...... 12 Methods for nucleic acid detection ...... 12 Hybridization based detection ...... 12 Polymerization based detection ...... 13 Ligation based detection ...... 14 Sequencing based detection ...... 15 Padlock probes for nucleic acid detection ...... 17 The padlock probes ...... 17 Rolling circle amplification ...... 18 Circle-to-circle amplification (C2CA) ...... 19 Methods for protein detection ...... 20 Mass spectrometry ...... 20 based detection ...... 21 Proximity ligation assay for protein detection ...... 21 Single molecule detection ...... 23 Amplified single molecule detection ...... 25 Present Investigations ...... 26 Paper I. Colorimetric nucleic acid testing assay for RNA virus detection based on circle-to-circle amplification of padlock probes ...... 26 Background ...... 26 Summary ...... 26 Discussion ...... 27 Paper II. Rapid identification of bio-molecules applied for detection of biosecurity agents using rolling circle amplification ...... 28 Background ...... 28 Summary ...... 28 Discussion ...... 29 Paper III. Digital quantification of proximity ligation products for highly sensitive and precise protein measurement ...... 30 Background ...... 30 Summary ...... 30 Discussion ...... 31 Paper IV. Sensitive detection of spores using volume-amplified magnetic nanobeads ...... 32

Background ...... 32 Summary ...... 33 Discussion ...... 33 Paper V: In situ sequencing of mRNA in intact cells and tissues ...... 33 Background ...... 33 Summary ...... 34 Discussion ...... 35 Acknowledgements ...... 37 References ...... 41

Abbreviations

A adenine AC alternating current AP apurinic ASMs amplified single molecules ASMD amplified single molecule detection bDNA branched DNA C cytosine C2CA circle-to-circle amplification CCHFV Crimea Congo hemorrhagic virus cDNA complementary DNA CV coefficient of variance DNA deoxyribonucleic acid dNTP deoxyribonucleotide triphosphate ELISA enzyme linked immunesorbent assay FISH fluorescent in situ hybridization G guanine HCR hybridization chain reaction HRP horseradish peroxidase IL2 interleukin 2 IL6 interleukin 6 ISH in situ hybridization LAR ligation amplification reaction LCR ligation chain reaction LNA locked nucleic acid MIP molecular inversion probe mRNA messenger RNA MS mass spectrometry NGS next generation sequencing OLA oligonucleotide ligation assay PCR polymerase chain reaction PLA proximity ligation assay PSA prostate-specific qPCR quantitative PCR RCA rolling circle amplification RNA ribonucleic acid RIA radioimmunoassay

SBL sequencing-by-ligation SMAP2 smart amplification process version 2 SMD single molecule detection SMRT single molecule real time SNP single nucleotide polymorphism SQUID superconducting quantum interference device T thymine U uracil VAM-ND volume-amplified magnetic nanobead detection VEGF vascular endothelial growth factor ZMW zero-mode wave-guides

Introduction

The accomplishment of the Human Genome Project in 2003 dramatically changed the field of genomic research, leading biology into a new era of large-scale sciences. In the past decade, tremendous progress has been made to the development of new molecular analysis technologies. One of the most game-changing examples is the next generation sequencing technologies, which offer unprecedented speed of data generation. These new technologies have greatly facilitated the biological research, enabling exploration of the intricacies of life in a faster and larger-scale manner. Attributed to these technological advances, more and more new findings are emerging, how- ever, accompanied by the spawning of new questions. Many of these new questions are difficult to answer with the existing technologies. Thus, tech- nologies must continuously evolve to keep their pace to satisfy the changing needs of modern researches. The new discoveries in biological research and development of new analytical methods therefore exhibit a relationship of mutual promotion. Furthermore, the analysis of large-scale data sets reveals more and more differences and similarities among individuals. In the clinical practices, these two seemed to be contradictory aspects have led to the emerging of personal- ized medicine, calling for novel companied diagnostic methods. This kind of methods must be able to deliver reliable and informative results in the de- sired time span at an affordable cost. From the analytical point of view, such methods must be sensitive, specific, precise and rapid, as well as inexpen- sive. This thesis is focused on the further development and applications of two DNA ligation based molecular techniques, namely the padlock probe assay and the proximity ligation assay (PLA). For both basic biological research and biomedical applications, these two methods provide means that enable studying of nucleic acid and protein molecules in a highly sensitive way. In the first half of this thesis, I will review some of the currently available ana- lytical methods and their applications in the related research fields. The ad- vantages and limitations of these methods will be discussed. In the second half, papers that constitute this thesis will be reviewed and discussed.

11 Biomolecular analysis Nucleic acids and proteins are probably the most widely studied bio- molecules due to their fundamental roles in keeping the living organisms functional. The hereditary information of living organisms is stored in nu- cleic acids, which include DNA and RNA. This information is conveyed through gene expression. Gene expression begins with transcription of DNA into RNA, followed by translation of RNA into proteins. The former can result in some functional RNA that can already execute biological functions. Any small changes during this procedure are likely to cause difference in the whole living organism. In order to understand the hidden mechanisms, we need powerful analytical methods. To date, various analytical methods have been developed to study biomolecules from different aspects at different levels.

Methods for nucleic acid detection Currently, there are numerous methods available for nucleic acid detection. The basic goal of these methods is to identify the presence of specific oli- gonucleotide sequences in a complex mixture. Herein, I will only focus on describing some of these detection methods. By using the critical steps that are involved in these methods, I divide them into four main categories: hy- bridization based detection, polymerization based detection, ligation based detection and sequencing based detection. However, it is very common that a detection method is constituted by the combination of these four methods.

Hybridization based detection The basic principle of hybridization based detection is the complementary base pairing. There are four nucleobases in the DNA stand: adenine (A), cytosine (C), guanine (G), and thymine (T). In the case of RNA molecules, the base uracil (U) replaces the base T. Under normal conditions, A binds to T (U), while C binds to G. The bases in two complementary nucleotide strands follow this rule, making these two strands hybridize to each other in opposite directions. Based on this basic principle, detection probes can be used to hybridize and identify their complementary sequences from the complex background. The probes are often labeled with signal reporters, such as radioactive isotopes, fluorophores or enzymes. A routinely used technique, the Southern blot, which is used for identification of specific DNA fragments from the genomic DNA, is one of the pioneering technolo- gies in this field (1). The in situ hybridization (ISH) is a representative tech- nique in this category. Since first described by Pardue and Gall in 1969 for ribosomal DNA detection in oocytes of the toad Xenopus (2), ISH has been

12 developed into different variants and is still one of the most important mo- lecular techniques (3). One of these variants, the in situ hy- bridization (FISH) has become more and more widely used due to its wide range of applications, high performance and relatively ease-of-use (4-5). FISH can be used for both DNA and RNA detection. Using different fluo- rescent dyes to label the probes enables concurrent detection of multiple targets within a single sample. Detection of single transcripts has also been demonstrated with different FISH based methods. One adaption used single detection probes labeled with five fluorophores to visualize single RNA transcripts (6). In another adaption, multiple single label short probes (48 probes or more) were used to hybridize with the same target mRNA mole- cule, enabling visualization of single mRNA molecules in situ (7). Signal amplification strategies based on multiple hybridization steps have been used to increase the sensitivity of hybridization based detections. The branched DNA (bDNA) assay, for example, depends on several sequential hybridization events to create a large hybridization complex that bears mul- tiple reporters to achieve signal amplification (8-9). In the in vitro assay, this method was shown to be able to detect less than 100 copies of target nucleic acids per milliliter sample (8). In the in situ detection that based on bDNA, detection of single copies of nucleic acid has been achieved (10). The hy- bridization chain reaction (HCR) uses a different strategy to create hybridi- zation complexes. In HCR, an initiator DNA strand triggers two species of DNA hairpins to hybridize on each other sequentially in a cascade way to generate a nicked double-stranded 'polymer' (11). Recently, an in situ detec- tion assay that was based on the HCR principle had been shown to be able to detect five mRNA targets simultaneously in fixed whole-mount and sec- tioned zebrafish embryos by using detection probes labeled with spectrally distinct fluorophores (12).

Polymerization based detection To detect a target DNA sequence, enzymatic polymerization can be per- formed for specific amplification of the target. This is followed by detection and quantification of the amplification products to infer the presence of the target sequence. Polymerase chain reaction (PCR) is one of the core tech- nologies that are being used in the modern molecular biology research. In the PCR, after repeated cycles of annealing and extension, a pair of primers re- sults in exponential amplification of target sequence (13-14). The develop- ment of quantitative PCR (qPCR), also called real-time PCR, allows accu- rate quantification of the copy number of the target DNA molecules through simultaneous amplification and detection (15). For even more accurate quan- tification of the copy number of target DNA, digital PCR was introduced (16-17). The underlying principle of this technique is that the templates are diluted into large number of separated reaction compartments, ideally down

13 to a concentration where each reaction compartment contains at most one template. At these concentrations, many compartments will be empty and thus contain no amplifiable templates. The individual reactions are usually amplified on mass, following by binary classification of individual com- partment reactions as positive or negative. Knowing the input concentration and the number of positive outcomes enables accurate quantification of the amplifiable molecules in the original sample. To guarantee that each com- partment contains at most one template, larger numbers of compartments are needed. A recent article reported on the so called megapixel digital PCR that enabled amplification of single DNA molecules in 1,000,000 picoliter vol- ume reactors, making it possible to detect below one copy of single- nucleotide-variant among 100,000 wild-type sequences and discriminate 1% difference in chromosome copy number (18). The PCR technique involves thermo cycling steps, which can bear limita- tions in certain situations, such as in field-based point-of-care diagnostic assays. The truth is under natural conditions, the polymerization does not require thermo cycling steps, indicating the possibility of carrying out ampli- fication of target DNA under isothermal condition. Currently, there is a wide range of isothermal nucleic acid amplification methods available (19). Re- cently, an isothermal amplification approach called SMAP 2 (SMart Ampli- fication Process version 2) that enabled single nucleotide polymorphism (SNP) detection has been reported. In SMAP2, an accessory protein that can tightly bind to the mismatched nucleotides was used to stop the polymeriza- tion, enabling SNP discrimination (20).

Ligation based detection DNA ligation is a molecular process of covalent linking the ends of two DNA strands together. The most common DNA ligation used in molecular biology is the enzymatic ligation that catalyzed by the DNA ligase. DNA ligase has two major biological functions in vivo: to ligate nicks formed in the lagging strand during the DNA replication and to seal nicks generated during DNA damage repair or DNA recombination. In vitro, DNA ligation has been used as a popular technique for nucleic acid detection. The fidelity of the DNA ligase is a key feature that builds up ligation based detection methods (21). Oligonucleotide ligation assay (OLA), the pioneering inven- tion in the ligation based detection, utilizes the fidelity of T4 ligase to dis- criminate single nucleotide substitutions (22). The ligation amplification reaction (LAR), was the first conceptual idea described for using the ligase- mediated detection to distinguish alleles and to amplify the specific target at the same time (23). However, this technique was not very convenient due to the high temperature involved in the thermo cycling step could kill the li- gase, which was not thermostable. In order to continue the reaction in LAR, addition of new ligases was needed. To overcome this problem, ligase chain

14 reaction (LCR) used a thermostable DNA ligase, enabling the ligation reac- tion continue under high temperature conditions that were involved in the denaturing and annealing cycles, making it possible to detect mutations as well as to exponentially amplify the specific target (24-25).

Sequencing based detection DNA sequencing is a process that reads out the bases on DNA strand. It can therefore be used to identify target sequences as well as to discover novel sequences. The recent advances in sequencing technologies have brought revolutionary impact to life sciences (26-28). To date, there are a number of different ways to perform sequencing, including the classical Sanger se- quencing and the so called next generation sequencing (NGS). Even though new technologies are emerging on the horizon and the next generation se- quencing technologies have been developed and used in real applications for a while, NGS are still referred to “next generation” rather than “current gen- eration” technologies. In essence, by using the sequencing throughput and the differences among the sequencing substrates, sequencing technologies can be categorized into three generations (29). First generation, which is low in throughput, includes the chain-termination method (the Sanger sequenc- ing) and the chemical sequencing method (29). Of these two methods, the former is more popular and is still being used for various applications. The read length of Sanger sequencing is still one of the longest among all the methods currently available. Second generation sequencing began in the year 2005 when 454 pub- lished a landmark paper describing the 454 sequencing technology (30). The 454 sequencing platform is based on the pyrosequencing principle, whose detected signal comes from the released pyrophosphates during nucleotide incorporation (31). Also in 2005, George Church’s group published the mul- tiplex polony sequencing which performed the sequencing-by-ligation (SBL) chemistry on the bead surface that harbored the amplification products gen- erated from single DNA molecules by emulsion polymerase chain reaction (32). This technology later became the basis of SOLiD sequencing system, which was launched in 2007. Illumina’s sequencing platform is based on the sequencing-by-synthesis chemistry, where fluorescently labeled dNTPs are used to generate signal on the DNA clusters produced by Bridge PCR (33). Several characteristics are shared amongst these sequencing platforms. The first shared characteristic is that individual templates are amplified to gener- ate independent groups of amplification products which are clones of the original templates. However, the platforms vary on molecular amplification approaches, i.e. Emulsion PCR (34), Polony Amplification (35) and Bridge PCR. The second is that base information of the templates is then collected based on measuring specific signal from individual groups of amplification products. Third, the amplification products are sequenced in a massively

15 parallel way so that sequence information from a large number of templates is obtained at the same time. Fourth, upon completion of sequencing, the sequences are assembled through data analysis processes to derive the full- length sequences. The newly launched Ion Torrent sequencing platform is also a massively parallel sequencing technology that is based on performing sequencing chemistry on amplification products derived from single templates. How- ever, unlike the other second generation sequencing platforms, the signal being measured is the change in pH, rather than light. Combined with the semiconductor reader, the pH change signal is converted into digital infor- mation and recorded directly (36). The sequencing speed has been greatly improved in this method. But lower throughput and higher error rates than the other commercially available second generation sequencing platforms are limiting factors for this technology at the moment. The third generation sequencing technologies are attempting to sequence real single molecules without amplification so to avoid errors arise during the amplification steps. In nanopore sequencing, single stranded DNA or individual nucleotides released from DNA strands pass through the nanopore, and cause a current change (37-38). When different bases pass through the nanopore, the patterns of electric current change are different. By using this property, the four bases can be identified and registered. In addition, the 5-methylcytosine can be distinguished directly from the other four standard DNA bases without pretreatments to the sample, simplifying the procedure of sequencing based methylation analysis (39). The true single molecule sequencing technology from Helicos also uses the sequencing-by- synthesis chemistry and fluorescently labeled dNTPs, similar to that of Illu- mina’s technology, but on single DNA molecules. Sequencing primers are first immobilized on a solid surface to capture the single DNA molecules, followed by performing sequencing cycles on these captured single mole- cules (40). Instead of immobilizing primers like in many other methods, SMRT sequencing technology from Pacific Biosciences immobilizes DNA polymerases. Single DNA polymerases are attached to the bottom of nano- meter-sized holes called ZMW, which can increase the ability to distinguish signal from background (41-42). The base signal is recorded when phos- pholinked nucleotides are incorporated during polymerization. After the incorporation, the labeled dye is freed from the phospholinked nucleotides due to the polymerase cleaves the phosphate chain. Only one single DNA base is sequenced at a time in a ZMW. These third generation sequencing technologies not only simplifies the sequencing protocol, but also eliminates the potential base incorporation errors that may arise due to the enzymatic amplification processes. However, sequencing technologies based on single molecules will obviously require high performance instruments in order to distinguish signal from background. This is probably why none of them is widely used until now.

16 Padlock probes for nucleic acid detection The padlock probes A padlock probe (43) is a linear oligonucleotide that contains two target- complementary segments at its 5’ and 3’ ends, and a target-non- complementary segment in between the two ends. Upon hybridization, the two target-complementary ends will be brought head-to-tail together on the target strand, forming a nick between the two ends. The nick can then be sealed by DNA ligase. As a consequence, the probe becomes circularized, and thereby “locked” on the target strand (Figure 1). This topological struc- ture makes locked padlock probes able to withstand highly stringent wash conditions. The DNA ligase has a strong preference for a perfect base pair match between the probe and the target sequence at the ligation junction, ensuring the high specificity of padlock probe assays (44). This property makes padlock probe very suitable for SNP discrimination (45-47).

Figure 1. Illustration of a padlock probe hybridizes to and being ligated on the cor- rect target. (A) The two target-complementary regions of padlock probe (black) bind to target sequence (grey), resulting in the two ends of the probe become head-to-tail in juxtaposition. A nick is therefore formed in between the two ends. (B) DNA li- gase is then used to seal the nick, making the padlock probe become a complete DNA circle.

The target-non-complementary sequence of padlock probe can be used for different purposes. It can harbor a detection probe hybridization site, an am- plification primer hybridization site, a capture probe hybridization site or the combinations of these sites (48-52). Besides the applications in SNP dis- crimination, padlock probes have also been used in many other applications. For instance, copy number variation analysis (53), gene expression profiling (47, 54-55), alternative splicing analysis (56), as well as detection and identi- fication of pathogens (57-58) have been successfully examined by using padlock probes. The detection of padlock probes can be achieved either with or without amplification. To detect padlock probes without amplification, the padlock probes can be modified with radioactive or fluorescent labels (43, 45). The reacted padlock probes become circularized and immobilized on the target strand, giving signal for detection. The unreacted probes are washed away under stringent conditions to reduce the background. Detection after amplifi-

17 cation can be achieved by molecular procedures such as PCR or rolling cir- cle amplification (RCA). The latter will be discussed in the following sec- tion: rolling circle amplification. The molecular inversion probe (MIP) (59) which is a further development of the padlock probe, is designed such that after the target-complementary ends hybridize to the corresponding target strand, a gap is formed in between the two ends. The resultant gap can be either only one base or much longer. Thus, in order to form a complete DNA circle, it requires performing DNA polymerization from the 3’ end of the probe to fill up the gap and then liga- tion by DNA ligase. The specificity of MIP can be guaranteed by desired selection of dNTPs for the polymerization (60). Molecular inversion probes have been used for highly multiplex genotyping (59, 61) and highly multi- plex gene copy alteration analysis (62-64). MIPs with wide gap sizes ranging from 60 to 191 nt have been used for target amplification of 10, 000 human exons in a single multiplex reaction prior to targeted sequencing (65).

Rolling circle amplification After forming a complete DNA circle on the target, the padlock probe can also be further amplified to achieve more sensitive and specific detection. DNA circles formed from ligation of padlock probes, are ideal substrate templates for the so called rolling circle amplification (RCA) (66). RCA is a type of isothermal nucleic acid amplification process which uses circular DNA or RNA molecules as templates, generating amplification products that contain tandem repeated sequences that are complementary to the original template (Figure 2). In our lab, the enzyme, Φ29 DNA polymerase is often used to perform RCA on padlock probes. Several properties of the Φ29 DNA polymerase make this enzyme very suitable for performing RCA on padlock probes. First of all, it is highly possessive in the absence of any other acces- sory proteins. Second, the strand displacement ability of the polymerase makes it able to carry out the polymerization continuously at constant tem- perature by displacing the newly synthesized strand from the template (67). Third, the 3’ to 5’ exonuclease activity makes the enzyme possible to use target DNA strand as primer by digesting away the non-complementary tar- get extensions for initiating the RCA (48). When the free 3’ end is too far away from the target hybridization site, however, the RCA will be hindered. Therefore, a strategy combining glycosylases and AP-cleaving enzymes has been developed and used to remove the long extension of target beyond the matched region (68). Ligated padlock probes form complete DNA circles that can be amplified by the RCA to generate RCA products, while unligated padlock probes can- not be amplified (60). The RCA products, which are single-stranded DNA harboring tandem repeated sequences that are complementary to the padlock probes, will naturally collapse into micrometer-sized DNA balls in solution.

18 Fluorescence labeled detection probes can then be used to hybridize with the RCA products, making these RCA products visible as bright individual mi- crometer-sized spots when examined by using a (Figure 2). These spots are regarded as signals, which can be quantified one- by-one digitally.

Figure 2. Detection of circularized padlock probes by RCA. The reacted padlock probes are amplified by RCA. When fluorescently labeled detection probes hybrid- ize to the RCA products, the RCA products are visualized as bright spots when ex- amined by using fluorescence microscope.

Circle-to-circle amplification (C2CA) Circle-to-circle amplification is a molecular procedure that involves a series of enzymatic reactions for amplification of circular DNA molecules (50). The principle of C2CA is shown in Figure 3. The process starts with circular DNA templates being amplified by RCA, generating first generation RCA products. These DNA circles are designed such that they contain segments harboring specific restriction sites. The RCA products are concatemers that contain tandem repeated complementary copies of the original DNA circles, thus the restriction sites will be distributed at regular intervals in the RCA products. After RCA, restriction oligonucleotides that are complementary to the segments with restriction sites are then used to hybridize with the RCA products, creating double stranded regions. Thereby, the hybridization gen- erates recognition sequences for the corresponding restriction enzymes. Un- der the right conditions, the restriction enzymes cut the RCA products through their recognition sites and fragment the RCA products into mono- mers. This is followed by inactivation of the restriction enzymes to eliminate

19 their interference on the following steps. The excess amount of restriction oligonucleotides will then serve as templates for the recircularization of monomers. Monomers will bind head-to-tail on the excess restriction oli- gonucleotides, forming nicks in the juxtaposition of the two ends. When the nicks are sealed by DNA ligase, new circular DNA molecules are formed. These new DNA circles can be replicated through RCA again, generating the second generation RCA products. After this step, one full cycle of C2CA is completed. However, the process of monomerization, recircularization and new generation RCA can be repeated if further amplification is necessary. This property of C2CA is used to detect targets with different concentra- tions, allowing quantification in different dynamic range windows to fulfill different sensitivity requirements (50-51).

Figure 3. A cycle of circle-to-circle amplification (C2CA). (A) A DNA circle is first replicated by RCA, generating long tandem repeated single stranded RCA product. (B) Oligonucleotides containing restriction digestion sites hybridize to their com- plementary motifs on the RCA products, resulting in double stranded regions. (C) The corresponding restriction endonucleoase can then cut their recognition sites inside the double stranded regions, generating monomers from the RCA products. (D) The resultant monomers can bind head-to-tail to the excess restriction oligonu- cleotides and be ligated by DNA ligase to form new DNA circles again. (E) The new DNA circles are amplified again through RCA.

Methods for protein detection Mass spectrometry Mass spectrometry (MS) is a crucial technique for protein research (69-70). The primary application of MS for protein detection is to identify target pro- teins from complex mixtures. In order to avoid bias against the less abundant proteins, two-dimensional gel electrophoresis or high performance liquid chromatography are often used to fractionate the samples before performing MS. Quantification of proteins can also be achieved by MS. Typically, iso-

20 tope labeled peptides or proteins with known concentrations are used as in- ternal standards for their corresponding candidate peptides or proteins and the level of the latter can then be calculated from the ratio to the former (71). Due to its prominence multiplexing ability, MS has become an important tool for protein biomarker discovery (72-73).

Immunoassay based detection Immunoassays are still the most widely used method for quantitative meas- urement of proteins. The foundation of this technology is the Nobel Prize awarded work, the radioimmunoassay (RIA). As early as 1960, the first RIA was developed as a method for measuring insulin in plasma (74). In that study, radioactively labeled insulin was used to compete with the insulin in plasma for binding to capture . The level of plasma insulin can therefore be deduced from the detected radioactively labeled insulin. Based on this methodological concept, hundreds of immunoassay formats have been developed afterwards. In 1971, the enzyme linked immunesorbent as- say (ELISA), which was a milestone in the development of immunoassays, was first described by Engvall and Permann (75). In ELISA, enzymes are used to label the affinity reagents, eliminating the potential health risks that can cause by radioactive labels in RIA. It didn’t take long before various enzyme-linked immunoassays became widely spread (76). And very soon, a lot of alternative labels were also available (77). To achieve sensitive detection, nucleic acids themselves can be amplified through different in vitro amplification methods. Unfortunately, such kind of techniques are still lacking for protein detection. Moreover, it is unlikely that this problem can be overcome within a short period. Therefore, the immuno- PCR, which coverts detection of protein into detection of reporter DNA molecules, was introduced as an alternative strategy (78). In a typical im- muno-PCR assay, DNA molecules are conjugated to the that are incubated with . After washing, bound DNA molecules can be am- plified by PCR to achieve signal amplification, resulting in dramatically improved sensitivity over traditional immunoassays. The bio-barcode assay for detection of protein, which also uses DNA molecules as label, however, can achieve ultra sensitivity without the need of enzymatic amplification of the labeled DNA molecules (79).

Proximity ligation assay for protein detection The first proximity ligation assay (PLA) for detection of protein was pub- lished in 2002 (80). In a typical PLA assay, a pair of PLA probes, which are single-stranded DNA tagged affinity binders, is brought into close proximity upon binding to the same target. DNA ligation is then performed to generate

21 either a new longer linear DNA strand or a DNA circle. Depending on the assay format and the application environment, there are three different basic type of PLAs: solution phase PLA (81-82), solid phase PLA (83) and in situ PLA (84-85). Figure 4 shows the principles of these three types of PLA. Signal amplification in PLA can be achieved through DNA amplification, resulting in high detection sensitivity. This is similar to that of immuno- PCR. However, unlike immuno-PCR, where only one single binding event is needed to form an amplifiable template, PLA requires two or more binding events followed by a specific DNA ligase mediated ligation to generate am- plifiable templates. The multiple binding and ligation increases the specific- ity of the PLA.

Figure 4. Three types of proximity ligation assays. (A) Solution phase PLA: A pair of PLA probes binds their target protein, upon the antibody-antigen reaction. The binding event brings two tagged DNA strands equipped on PLA probes into close proximity. These two DNA strands are then joined through DNA ligation, forming a new linear DNA strand. The newly formed DNA strand can then be quantified by real-time PCR. Finally the concentration of the target can be inferred from the real- time PCR results. (B) Solid phase PLA: The target is first captured by a capture antibody that immobilized on a solid support, followed by washing away other un- bound materials. Then the PLA probes are added, followed by DNA ligation and real-time PCR quantification as described in (A). (C) In situ PLA: As in the other two types of PLA, the PLA probes first recognize and bind to their target. However, the labeled DNA strands are designed as ligation template to guide two single stranded DNA to form a DNA circle. The DNA circle can then be amplified by RCA. RCA products are detected by fluorescence labeled detection probes, visualiz- ing as micrometer-sized fluorescent spots when examined by fluorescence micro- scope.

22 Solution phase PLA (Figure 4A) and solid phase PLA (Figure 4B) are used for in vitro detection of proteins. The former is a homogeneous assay that enables protein detection without additional wash steps, yet achieves high sensitivity. The latter involves a solid phase mediated target capture step, which increases the assay specificity and enables to wash away un- bound materials, eliminating the potential inhibiting factors presented in the complex sample (83). Further increased specificity of solution phase PLA was reported in a triple-binder PLA that required three PLA probes binding on the same target to create a ligation event (86). A solid phase PLA assay that required five binding events (four PLA probes bind to the same target to generate ligation products and one capture antibody to capture the target) was reported for specific detection of prostasomes, whose presence is diffi- cult to validate by conventional methods, in prostate cancer (87). The in situ PLA (Figure 4C) is used for in situ protein detection in cells and tissue sections, providing the localization information of the target pro- teins. Different from the other two types of PLA, the DNA stands equipped on the PLA probes themselves do not form a new linear DNA strand in the in situ PLA, but serve as ligation templates for guiding two connector oli- gonucleotides to form a DNA circle, which is quite similar to the circle that generated in the padlock probe assay. The DNA circle is then amplified by RCA, rather than PCR. Apart from detection of proteins, in situ PLA has also been used for detection of post-transcriptional modifications of proteins (88-89), protein-protein interactions (85), and protein-DNA interactions (6), etc.

Single molecule detection To date, most of the biomolecular detection methods are based on measuring the ensemble average signal from many molecules. Examples of this kind of detections are numerous, such as ELISA, real-time PCR and so on. This type of detections is sufficient to answer most of the biological research ques- tions, such as the presence or the concentration of target molecules. How- ever, average measurements do not provide information on individual mole- cules in the ensemble. Therefore, individual differences will be masked by this type of detection, resulting in ignoring rare events from individual mole- cules which can be of significant importance in the biological functions. The first single molecule research paper, in which the activity of droplet- trapped single β-galactosidase molecules was investigated, dated back as early as 1961 (90). In the modern biological research, attributed to the spec- tacular advances in instrumentations and technologies, various methods that allow single molecule detection (SMD) have been developed. SMD has been used for a variety of basic biological researches, such as molecular motors, single enzyme activities, interactions between biomolecules, and so on (91-

23 94). These studies reveal characteristics of individual molecule that are not possible to observe by bulky measurements, helping to understand the mechanism of these biological processes in single molecule level. From the analytical point of view, the most attractive property of SMD is its potential to provide ultimate sensitivity. Therefore, a number of diagnos- tics oriented technologies have been built based on SMD. Optical microcavi- ties, which enable single molecule detection in a label-free way, were re- ported to be able to detect as low as 5 aM of IL2 (95). A company called Singulex has developed a single-molecule counting based immunoassay system named Erenna, which was shown to be able to detect proteins with limit of detections ranging from 10-100 pg/L (96). In this technique, antigens were first captured by antibodies immobilized on the magnetic beads, fol- lowed by detection with fluorescently labeled antibodies. The labeled anti- bodies were then released into solution and counted one-by-one using a dedicated fluorescence detector (97). Recently, an approach called single- molecule enzyme-linked immunosorbent assay (or digital ELISA) also claimed to be able to detect single proteins (98). In digital ELISA, immuno- complexes formed from antigens reacted with enzymatic reporter labeled antibodies, were first captured by the antibodies immobilized on the mag- netic beads, allowing maximal only one single immunocomplex per bead. And then the beads with or without the immunocomplex were loaded into individual femtoliter-volume wells on an array for isolation and detection of single molecules (98). SMD has also been applied in the field of DNA sequencing. As early as 1989, there was such an technique reported (99). In that technique, fluores- cently labeled nucleotides were first used to generate a complementary strand from the template through DNA polymerization. Then the labeled nucleotides were sequentially released by exonuclease digestion into a flow stream and detected individually by laser induced fluorescence. This method was expected to be able to reach a sequencing length up to 40 kb at a speed of 100 to 1000 bp per second (100-101). As mentioned previously, the third generation sequencing technologies are also based on SMD, however, in a massively parallel manner. Before I finished this section, it is necessary to point out that SMD doesn’t necessarily mean to detect every single molecule in the sample. However, if as many molecules as possible could be detected in the sample of interest, the assays based on SMD are therefore possible to reach its ulti- mate sensitivity. To do so, new sampling strategies to increase the detection efficiency, such as miniaturization of sample volume, may be necessary. An example of this is the previously mentioned Erenna system, where the fluo- rescently labeled antibodies were eluted in a small volume to concentrate the signal prior to detection (96).

24 Amplified single molecule detection Amplified single molecules (ASMs) are often referred to individual groups of amplified DNA molecules that are generated through independent ampli- fications of single DNA molecules. Such a group of DNA molecules is often viewed as an intact entity which harbors the characteristics of the original single DNA molecule. Therefore, this group of molecules is detected and quantified by measuring the overall signal generated from all molecules in the group, instead of looking into individual molecules inside the group. Due to this reason, the detection of individual ASM is a binary event that the result is either positive or negative. Several different approaches have been used for the generation of ASMs. The first one is the separation based ap- proach, where samples are fractionated into independent reaction chambers that contain one single DNA molecule and amplified separately. Digital PCR and emulsion PCR (34) are examples of this approach. The second approach, which can be exemplified by Polony amplification (35) and Bridge PCR (102-103), depends on localized amplification of single DNA molecules. ASMs are used as sequencing substrates for the second generation sequenc- ing platforms (104). An ASM that is generated from these approaches is virtually a group of closely related molecules that are clones of the original single DNA template, rather than real single molecule. Digital PCR together with second generation sequencing technologies provide means for digital quantification of ASMs that provide precise measuring results, therefore are being investigated as potential diagnosis tools for noninvasive prenatal diag- nosis of fetal chromosomal aneuploidy where high precision is needed (105- 107). The amplification products generated from RCA are another type of ASMs that contain tandem repeated copies of the original circular DNA molecules (51, 53). These ASMs are true single molecules, because all the amplified products from an original DNA template are presented on the same DNA strand. The digital quantification of ASMs from RCA has been used for copy number variation analysis (53). RCA has also been used to generate sequencing substrates in two second generation sequencing tech- nologies (108-109). In these tow technologies, genomic DNA was frag- mented and circularized to construct a library made up by DNA circles. The library was then amplified by RCA to generate ASMs that were subject to sequencing reaction in the following steps. The first one utilized the se- quencing-by-hybridization method, while the second one used sequencing- by-ligation chemistry to interrogate the bases.

25 Present Investigations

Paper I. Colorimetric nucleic acid testing assay for RNA virus detection based on circle-to-circle amplification of padlock probes Background Lack of proofreading enzymes that assure the fidelity of RNA replication is an important reason for the high mutation rates in RNA viruses (110-111). High mutation rates generate great sequence diversity across genomes of different viral strains and makes detection of these viruses difficult with conventional PCR based methods. In a typical PCR assay, at least two short conserved genome fragments must be used for binding the extension prim- ers. Moreover, a third fragment is sometimes needed to provide a binding site for detection probes. However, the highly variable RNA virus genome sequence makes it very difficult to find several conserved fragments. In ad- dition, epidemic of infectious diseases caused by RNA viruses often happen in rural areas of developing countries where equipped laboratories are often far away. Therefore, testing assays that can be performed under basic condi- tion are needed. In this study, we aimed for using padlock probes to establish an RNA virus diagnostic assay that is highly sensitive and specific, but re- quires simple readout strategy.

Summary We used the Crimea Congo hemorrhagic virus (CCHFV), which is a single stranded RNA virus that causes high mortality rate (112-113), as a model to establish our approach. We started with detection of a single CCHF viral strain. The viral RNA was first converted into cDNA by using biotinylated random hexamers. Padlock probes were designed to target the most con- served region in the L-segment of the CCHFV genome. Upon hybridization to their targets, padlock probes were circularized and ligated through enzy- matic ligation by DNA ligase. The circular DNA molecules were then ampli- fied by two cycles of C2CA. Finally the RCA products were detected by horseradish peroxidase (HRP) labeled detection oligonucleotides. The sensi- tivity of the method reached 103 copies/ml, which was comparable to that of detection by real-time PCR. The method was further validated by testing

26 patient samples collected from an outbreak in Iran. All results agreed with that of the detection using conventional real-time PCR. To address the prob- lem of high mutation rates in the target sequence, we designed seven addi- tional padlock probes that harbored the corresponding variant bases at dif- ferent positions in the target sequence. The new padlock probes can then reduce the number of mismatches when hybridizing with a mutated template. By using a cocktail of all probes, we were able to detect all the synthetic templates that represented the variants of different viral strains, which were not able to be detected by the single padlock probe approach.

Discussion We have established a highly sensitive and specific RNA virus detection assay based on padlock probes and C2CA. The RCA products were detected by using HRP labeled probes that catalyze the colorimetric reaction, without the need of using sophisticated instruments. The simple readout approach indicates the potential of using padlock probe in point-of-care applications, where the readout often requires simple and preferably visualizable ap- proaches. The ligation of padlock probes on complementary template is a highly specific enzymatic reaction. When mismatches exist at the ligation junction, the probe cannot be circularized (43, 45). This property has been used for detection of single nucleotide variations, however, may hinder the detection of highly variable target sequences by using padlock probes. We demon- strated in this paper that by using a cocktail of padlock probes that helped to reduce the mismatches in the hybridization region can overcome this prob- lem. This was due to some of the additional probes in the cocktail can hy- bridized better on the mutated templates when competing with those with more mismatches. Because each template can circulate only one padlock probe, the total signal remained to be determined by the concentration of templates even when the total amount of probes has increased. This indi- cated that the efficiency of the ligation reaction was not limited by competi- tion among the probes in the cocktail. The reason for this could be explained by there was a higher off-rate for the mismatched probes under the condi- tions that still allowed efficient ligation of probes with less mismatches. In this study, the additional padlock probes were designed to target the com- mon variants, so that limited the number of probes necessary to complete the multiplex probe cocktail. However, this made some templates still did not have any perfect matched padlock probes to hybridize with them. These templates can still be detected due to the reduction of mismatches that were provided when some of the probes with minimal mismatches bind and ligate on them. It would be interesting to investigate the performance of a cocktail of perfect matched probes or probes with degenerated bases in the future studies.

27 Paper II. Rapid identification of bio-molecules applied for detection of biosecurity agents using rolling circle amplification Background Detection of security-sensitive, highly pathogenic biological agents in the early stage is a great challenge in the biodefense and biosecurity surveil- lance. Rapid response to detected threats can help to minimize the potential damages that can cause to the public. Moreover, detection of biosecurity agents must be able to give accurate results, since giving false positive alarms can cause unnecessary losses. Another great challenge in the detec- tion of biosecurity threats is the great variety of biothreat agents, each of which may require different dedicated methods or instruments for detection. Such needs highlight the importance of developing detection systems that are versatile, rapid, sensitive, specific and cost effective. To meet these re- quirements, various approaches have been developed based on the detection of nucleic acids or proteins that originated from the biothreat agents (114- 116). In this study, we aimed for establishing a rapid, sensitive and specific system that could streamline the detection of nucleic acids and proteins, as well as providing easier solutions for data interpretations.

Summary A system that was capable of rapid detection of pathogens from the complex environmental background had been developed. The system consisted of two ligation based molecular approaches and a dedicated device for digital quan- tification of RCA products (ASMs) generated by the molecular approaches. First, we showed that the sensitivity of the dedicated device for detection of RCA products was improved by an order of magnitude over the old con- focal microscope based detection system. This device was then used in the following experiments. We then focused on establishment of the two molecular approaches, the padlock probe assay and the PLA, for nucleic acid and protein detection respectively. In order to find a good restriction endonuclease that can cut rapidly and is easy to inactivate for the C2CA protocol, we evaluated a num- ber of selected restriction endonucleases and found that AluI meet our re- quirements. Detection of E. coli DNA and bacterial spores (protein) were used as model systems. For DNA detection, padlock probes were used to specifically ligate on the target DNA, forming DNA circles. In order to speed up the ligation, high concentration of padlock probes was used. A biotinylated probe was used to capture the targets harboring the DNA cir- cles. For detection of bacterial spores, a modified solid phase PLA protocol was used. The spores were first captured by antibodies immobilized on mag-

28 netic beads, followed by specifically binding of two PLA probes. The oli- gonucleotides on the PLA probes were designed as templates for guiding the other two connector oligonucleotides to form a DNA circle, which is a simi- lar approach to that in the in situ PLA. High concentrations of magnetic beads with immobilized capture antibodies and PLA probes were used to speed up the sandwich immunoassay, and the ligation was accelerated by using high concentration of connector oligonucleotides. Next, DNA circles formed by padlock probe assay and PLA were amplified via C2CA to gener- ate RCA products that hybridized with fluorescently labeled detection probes. In order to further shorten the overall reaction time, the RCA time was reduced. However, this could result in the reduction of the size of RCA products. To overcome the potential loss in detection sensitivity, two detec- tion probe hybridization sites were introduced in the original DNA circle, doubling the number of probes hybridized to the RCA products. The fluores- cent RCA products were then detected by the dedicated ASM digital quanti- fication device. The reaction times for nucleic acid and protein assays were both 30 min. The limit of detection was approximately 30 bacteria for pad- lock probe based genetic detection assay, and was approximately 5 spores for PLA based protein detection assay. The specificity of these two assays was demonstrated by detection of targets that were released and collected in the forest environment, which contained a very complex background. The specificity of the padlock assay was further proved by successful discrimina- tion of two bacterial species based on a single-nucleotide difference.

Discussion The biosecurity agent detection approach presented in this paper was based on digital quantification of ASMs. The underlying principle of the presented approach was demonstrated in the work by Jarvius and colleagues (51). The current paper presents a dramatically improved protocol over the original design. The most significant change was that we shifted from solution based ligation assay to solid phase based ligation assay. By introducing the solid phase, it provided the possibility to use high concentration probes in the reaction. This resulted in acceleration of the reaction, making the reaction time shorter, which is crucial for biosecurity agents detection assays. Com- bination of the ligation with the second RCA in the protocol also shortened the overall assay time and simplified the protocol. This shows that the C2CA is a robust molecular reaction process, which still has great potential for further improvements. The oligonucleotide systems in padlock probes and PLA probes were de- signed such that they harbored the same restriction sites and detection sites. Therefore, after forming DNA circles, the same amplification procedures can be performed in the downstream steps. This made it possible to detect pro-

29 tein and nucleic acid in the same detection system. The flexibility in design- ing oligonucleotide system is a great advantage for both assays. In this paper, we for the first time demonstrated the possibility of detect- ing real protein targets by using PLA combined with digital quantification of ASMs in vitro. The paper also demonstrated the versatility of padlock probe assay and PLA. Before practical applications, there are still several issues need to be addressed in the presented system, such as combination of the sampling processes with the molecular approaches and the automation of the procedures that constitute the system.

Paper III. Digital quantification of proximity ligation products for highly sensitive and precise protein measurement Background Proteins are fundamental players in the biological activities in all living cells. The status of proteins can reflect the physiological or pathological status of health and disease. Therefore, many proteins are being used as biomarkers. Great efforts are still being made to find more protein bio- markers and different techniques are involved in this process (117). PLA has been demonstrated to be a highly sensitive and specific assay for protein biomarker detection (81, 87, 118-119). However, one major drawback of PLA is the relatively poor precision when using real-time PCR as the read- out strategy. The coefficient of variance (CV) was shown to vary from lower than 10% to more than 30% in previous studies (80, 83). This is probably due to the intrinsic variation caused by real-time PCR (80, 120). Digital quantification of ASMs provides a means to absolutely quantify the number of detected single molecules, which makes it more precise than the bulky signal measurement using real-time PCR (51). Theoretically, the precision of digital quantification is governed by the Poisson statistics. Therefore, count- ing objects above 1000, the CV% can be less than 3%, where the real varia- tion will be mainly caused by experimental factors (51).

Summary The protein detection method presented in this paper was established based on PLA and digital quantification of ASMs generated from PLA. The first part of the experimental protocol was the same as that of the solid phase PLA. Briefly, the target proteins were captured by antibodies immobilized on magnetic beads, followed by recognition and binding of two PLA probes. The ligation of the two proximal DNA strands on the PLA probes generated

30 a new linear DNA fragment. To this step, the protocol was the same as that in the regular solid phase PLA. In order to evaluate the new protocol, we then divided the samples into two groups: one group was used for the regular real-time PCR readout, another for the new ASMD based readout. In the ASMD approach, the DNA fragments formed after proximity ligation were released from the immunocomplex by endonuclease digestion. The released fragments were then subjected for circularization and RCA amplification. Finally, RCA products generated through this approached were labeled with fluorescent detection probes and counted by using the dedicated ASMs de- tection instrument as described in Paper II. Two model protein targets, IL6 and VEGF were tested in this study. The result showed that the sensitivity of the ASMD approach was comparable to that of real-time PCR based ap- proach. The ASMD approach showed an improvement on assay precision by at least a factor of two.

Discussion The method for protein detection presented in this paper is a truly digital quantification approach for PLA, in terms of detection of protein in the liq- uid phase. We should note that the protein detection in the in situ PLA, where single ligation events also generate single signals, is also a digital approach. Herein, we only focus on detection of proteins present in the liq- uid phase samples. The previous approach described in Paper II for detection of bacterial spores included one C2CA step for signal amplification, making the final readout not directly reflect the ligation events in the PLA. The new approach in this paper, however, did not involve any signal amplification steps for the ligation events. Each ligation event only generated one ASM and could only be measured once by ASMD. This approach ensures the readout to be considered truly digitalized. Digital PCR, which can also provide digital quantification of nucleic ac- ids, holds the same idea. However, in digital PCR, the step for compartmen- talization samples requires special instruments, makes the sampling more complicated. In our RCA based approach, ASMs were generated in a homo- geneous way, without the need of fractionating the samples into independent reaction chambers. Previously, I also described two SMD based approaches for single protein detection, the Erenna system (96) and digital ELISA (98). The former measures single fluorescently labeled antibodies released from the immunocomplex, which may result in less specificity than our approach that based on solid phase PLA. This is due to the solid phase PLA requires three binding events to generate specific signal but the Erenna system only needs two binding events. The digital ELISA was based on compartmentali- zation of reactions to achieve single protein detection. The concept was very similar to that of digital PCR and the approaches to generate ASMs in many

31 of the next generation sequencing platforms. This makes the digital ELISA depends on the special instrumentation for the sample preparation.

Paper IV. Sensitive detection of spores using volume- amplified magnetic nanobeads Background Compared to many other biosensors that are based on optical or electro- chemical detections, biosensor based on magnetic sensing is a relatively new field. However, more and more attentions have been drawn to develop bio- sensors based on sensing magnetic labels (121). One important reason to use magnetic sensing for detection of biomolecules is the nearly absence of magnetic background in biological samples. Therefore the detection sensitiv- ity can potentially be increased due to negligible interference from back- ground. Different approaches for detection of magnetic labels in the biosen- sors have been used (122). Utilizing the Brownian relaxation principle for biomolecule detection was first proposed by Connolly and Pierre in 2001 (123). In 2004, another pio- neering work demonstrated detection of prostate specific antigen (PSA) by using this principle (124). However, the detection sensitivity was not good enough for real applications in that study. Briefly, the principle of the Brownian relaxation based biomolecule detection is described as following. When biomolecules, e.g. antigens are captured by the antibodies that are immobilized on the surface of magnetic beads, the hydrodynamic diameter of the beads slightly increases, resulting in the decrease of Brownian relaxa- tion frequency. The volume-amplified magnetic nanobead detection (VAM- ND) approach demonstrated that sensitive and specific detection of DNA sequences can be achieved by the same principle (125). In that study, ASMs, which were micrometer-sized single-stranded DNA concatemers generated by RCA, were hybridized to probes immobilized on magnetic beads. Upon the hybridization, the Brownian frequency of the beads decreased dramati- cally. This method was soon used for sensitive detection of bacterial DNA (126). At the beginning, the measurements were all performed by using the superconducting quantum interference device (SQUID), which is a bulky and very expensive instrument. Later on, a portable and less expensive alter- nating current (AC) susceptometer was used to replace the bulky SQUID for the measurements, which made the VAM-DN more suitable for clinical ap- plications (127). In this study, we used PLA combined with the VAM-ND method for de- tection of proteins.

32 Summary We used detection of bacterial spore as a model system to establish the as- say. The PLA was performed by using the same design as that in Paper II, except for longer reaction time of RCA and specific connector oligonucleo- tides for magnetic readout assay were used. Either two or three RCAs were performed in order to reach different levels of sensitivity. After the final RCA, the ASMs were then hybridized with detection probe functionalized magnetic nanobeads. Finally, the samples were detected by an AC suscep- tometer. The detection sensitivity was 500 spores after two RCAs and 50 spores after three RCAs, with molecular reaction time of 103 min and 135 min, respectively.

Discussion We demonstrated in this paper a sensitive protein detection method based on PLA and magnetic readout. PLA is a highly sensitive and specific DNA assisted protein detection method. It converts the detection of protein into detection of ligated DNA signals. In this paper, PLA was introduced to gen- erate DNA substrates for the following VAM-ND. The assay for the first time showed that VAM-ND can be used for sensitive detection of proteins, expanding the use of VAM-ND to encompass both nucleic acids and pro- teins. The assay holds several advantages over traditional methods. First, the PLA steps made the assay very sensitive and specific. Second, the magnetic beads used in the assay were environmentally safe, easy to prepare and mod- ify, stable and inexpensive. Third, the detection was performed in a homo- geneous way, not requiring wash steps to get rid of the unbound probes. The protein detection assay using VAM-ND further indicates great potential of using VAM-ND as a versatile detection system in the point-of-care diagnos- tics. This paper again showed the versatility of RCA generated ASMs.

Paper V: In situ sequencing of mRNA in intact cells and tissues Background Studies have revealed that gene expression is a fundamentally stochastic process, with substantial cell-to-cell variations in both mRNA and protein level that originate from randomness in transcription and translation (128). A lot of our knowledge about gene expression comes from the probable state of an “average cell” that is inferred from cell population studies, without know- ing the variations among the members of the population (129). Thus, in or-

33 der to better understand the process of gene expression, it is necessary to carry out the studies at single cell level with single molecule resolution. Due to there are various methods available for nucleic acid analysis, stud- ies based on analyzing RNA expression are very popular for single cell gene expression profiling. To do so, one straightforward way is to isolate single cells from cell population, followed by performing gene expression analysis using different methods, such as multiplex quantitative PCR (130), microar- ray (131) or single cell RNA-seq (132-133). These methods allow analysis of up to hundreds even thousands of genes at the same time, therefore are powerful in terms of the ability of multiplexing. However, isolation of single cells and performing analysis on individual cells is a very time-consuming process and will be very costly if large number of cells needs to be analyzed. Moreover, when cells are isolated from their natural context, the information from their neighboring cells is lost, especially when analyzing cells derived from complex cell population such as tumor tissues. Different approaches for in situ RNA detection provide opportunities to solve this problem. In situ detection of single RNA has been demonstrated either by using probes la- beled with multiple fluorophores (134) or multiple probes labeled with sin- gle fluorophores (7), however limited in the ability of multiplexing due to the limited number of spectrally distinct fluorophores available. A method was reported for simultaneously detection of 11 genes by using combina- tions of probes labeled different colors (135). But detection of single RNA was not possible by this method. Unfortunately, none of these in situ detec- tion approaches described above is able to distinguish highly similar se- quences. Padlock probe based in situ mRNA detection that enables detection of single mRNA with single nucleotide resolution has been demonstrated (136). In this paper, we further developed the padlock probe based approach by combining with next generation sequencing chemistry, enabling sequenc- ing of single mRNA molecules in the unprecedented matrix and to perform highly multiplex in situ mRNA detection.

Summary In this study, we established an approach that allows performing sequencing of individual mRNA in situ in their natural context. In brief, the RNA was first converted into cDNA in situ by using specific LNA modified primers or random decamers through reverse transcription. Padlock probes were then hybridized to their cDNA targets, such that a gap, which defined the region of interest for sequencing, was formed. The gap was then filled by polymeri- zation, followed by ligation to create a complete DNA circle. Upon the po- lymerization and ligation, the region of interest for sequencing was then copied and integrated into the DNA circle. RCA was then performed to clonally amplify this DNA circle, generating a micrometer-sized amplified single molecule (ASM) that contained up to more than 1000 repeated se-

34 quences that were complementary to the original DNA circle. Many targets have gone through the same process, resulting in generation of a population of ASMs in their preserved natural environment. The ASMs were then sub- jected to sequencing by using the SBL approach, where the anchor primers right next to the sequences that were complementary to the original se- quences in the gap were used. Through the assistance of CellProfiler based image analysis and MatLab data processing, the sequence of the gaps can be deduced. This approach was successfully used for sequencing of a four-base se- quence stretch in the ACTB mRNA in human cells. The authenticity of the results was further demonstrated by discrimination of a single nucleotide variant in the human and mouse ACTB mRNA in the second base of another four-base stretch. The image analysis suggested that there was virtually no loss of signal during the sequencing steps. The ability of sequencing multiple transcripts at the same time was demonstrated by correct sequencing of five bases in KRAS genes codon 12 and 13, by carrying out the experiment on a slide that contained mixed co-culture of six different cell lines bearing dif- ferent point mutations in these two codons. The feasibility of using this ap- proach for sequencing mRNA directly in tissue sections was validated by successful sequencing of two species of four-base sequences in ACTB and HER2 in a HER2 positive fresh frozen breast cancer tissue section. Finally, we also used the in situ sequencing to perform highly multiplex single mRNA detection in situ on three HER2 positive (two ER positive from different tumor parts of the same patient and one ER negative from the tumor of another patient) fresh frozen breast cancer tissue sections. We used 39 regular non-gap padlock probes, which were transcript-specific and de- signed with a unique four-base barcode. The barcodes were sequenced and decoded into individual transcripts. In total, a number of 24 probes passed our sequencing quality control. Subsequently, localization-related gene ex- pression analysis was performed based on the data generated from these 24 genes. The result revealed that expression patterns of VIM and HER2 counter stained each other. Cancer related genes, such as CTSL2, EPCAM and MUC1 were expressed within the HER2 positive regions of the tumor. Some other interesting expression patterns were also observed, such as Ki-67, CCNB1, CCND1 and BIRC5 exhibited uneven expression levels in different parts of the cancer compartments.

Discussion A novel method that allows in situ sequencing of individual mRNA and performing highly multiplex in situ mRNA detection is presented in this paper. The most important property of this method is the feasibility of pro- viding mRNA sequences with their spatial information, which can in turn help to understand their biological functions. This method not only allows to

35 study single nucleotide variations in single cells, but can also be used to study the variations involving longer sequences, e.g. alternative splicing. The multiplex ability of this method can be used for studying a great number of targets simultaneously in a single sample, which can help to better under- stand the complexity of transcriptome. Furthermore, multiplex detection was performed on the same tissue section, rather than consecutive sections. This not only makes better use of the precious samples, but also provides real spatial relation information of the studied transcripts. Therefore, we believe that this method will provide new opportunities for both basic research and clinical applications. The current version of this technology does not require sophisticated in- strumentation to perform. And the open source software, CellProfiler can be used for image analysis. Therefore it should be possible to be used by most of the molecular biology research laboratories. However, this absolutely does not mean that there is no need to further improve this technology. The next step for developing this technology is to automate the current protocols, which will allow performing large-scale studies. This could be done by sev- eral ways such as integration with the existing next generation sequencing platforms or automated pipetting robots, even with dedicated instrument when necessary. Further optimization of the molecular steps to make it be- come more robust and versatile is also another important issue to be ad- dressed.

36 Acknowledgements

Many people have made my PhD study fruitful, exciting and interesting. My most sincere gratitude goes to:

Mats Nilsson, my main supervisor. Thank you for giving me the oppor- tunity to pursue my PhD in your group and believing in me throughout these years. Thank you for always being so encouraging and supportive in all ways, for your optimistic and enthusiastic attitude towards science and for always appreciating new ideas and new results. I feel so lucky to be a PhD student in your group.

Ulf Landegren, my co-supervisor. Thank you for your great advices and all kinds of supports. Thank you for all the inspiring and interesting discus- sions, for sharing your brilliant ideas, valuable knowledge and experience in science and for showing me how to be a professional scientist.

The administrative at IGP, especially Christina Andersson, Ulla Steimer, and Gunilla Åberg for making my PhD study so much easier with your help.

My collaborators from the other labs: Carolina and Alexandra, for the wonderful image and data analysis in the in situ sequencing project, for in- teresting discussions and valuable suggestions; Johan Botling, for sharing your knowledge on tissue pathology and contributions to the in situ sequenc- ing; Janne, Jonas and Johan S from Qlinea, for making blob counting eas- ier; Simon, Anna Z and Song Sun for our collaborations; Maria Strømme, Teresa, Mattias, Rebecca and Peter at Ångström Lab for introducing me the world of magnetic biosensor and sharing your knowledge.

Former and present members of the Molecular Tools Group: thank you for creating this inspiring, creative and friendly working environment. A special thank to Mathias, for helping me both in science and life dur- ing these years. Thank you for going to China to join my wedding celebra- tion and for helping me with the proofreading of this thesis. You are a great friend. Ola, thank you for spending your time and being very patient to teach me how to swim. Also thank you for discussions and suggestions in my projects;

37 Masood, thank you for a lot of interesting discussions, and for organizing the journal club; Erik, for chairing my defense and for interesting conversa- tions; Joakim, for sharing your knowledge on biobanking and for interesting discussions; Elin E, Delal and Christina for keeping the lab running smoothly, and helping me to solve all kinds of problems; Johan O, for help- ing me with computers and using the lab wiki. David, for initiating a lot of interesting discussions on blob counting and quality-controlling our C2CA, also for a lot of interesting discussion on various topics; Anna E, for interesting discussion on PhD study, scientific research and those beyond; Anja, for carrying on the trisomy project, for interesting conversations and for the time taking course, commuting and out for conference together; Annika, for your friendly concerns and for always willing to listen and being very understanding to me; Monica, for sharing your knowledge on microfluidics and trying to help me to find a position in Lausanne; Camilla, for interesting conversations and those days when we first started our study in this group; Malte, for taking over the gene expres- sion profiling project and for organizing great party in Stockholm; Jie, for sharing your knowledge on biosensors. Marco, for sharing the desk with me in our former office, for forgiving me disturbing you at nights when we shared rooms during conferences and meetings and most importantly, for being such a good collaborator in our in situ sequencing project and great times together; Elin L, for making the move to Stockholm so smoothly and organized and for those interesting conversations we had now and then; Tomasz, for commuting together and Thomas, for showing me your serious attitude of addressing research prob- lems, and both of you for making the in situ padlocking group having more interesting discussions. Elin FS and Lotte, thank you for being so helpful and nice office mates, for trying to help me to improve my Swedish and for all our interesting of- fice conversations, and for the days in Barcelona. Carl-Magnus, for those days when we were in the student room, for sharing interesting news from the Swedish websites as well as your ideas about research with me; Spyros, for your humor and those interesting dis- cussion in after-lab sections; Karin, for lending me your swimming board, for putting efforts in making our working environment more joyful and be- ing a talented director in those short movies; Linda, for often coming to our office to remind me to have FIKA even I often didn’t show up in the end; Tonge, for the after lab activities and a lot of fun discussions; Anne-Li, for all the funs in the parties, for interesting discussions and for the food sharing party; Björn, for sharing your sense humor, for a lot of interesting lunch conversations; Anderis, for sharing your knowledge and suggesting interest- ing projects that I still don’t have time to do yet; Rasel, for interesting dis- cussions and always asking me “how is it going”; Caroline, for valuable comments during meetings and other interesting conversations; Agata, for

38 fun discussions and for showing me how to be so professional in our own research field; Maria, for doing great job when we were teaching together, and for sharing experience on the PhD study; Ellen, for nice and friendly discussions in the lab and the lunch room; Mike, for the nice cookies and chocolates you brought for the former office mates. Rachel, for teaching me PLA and great collaboration on counting PLA blobs, for the interesting conversations and for nice dinners at your place; Gucci, for willing to help all the time, for sharing your dissertation experi- ence with me, and for all the great party times; Wu Di, for sharing your crea- tive research ideas, for the numerous interesting discussions on all aspects and organizing a lot of activities; Chen Lei, for always willing to help and for a lot of great fun in the parties; Junhong, for a lot of interesting and meaningful discussions, for nice meals and great party times; Huang Rui, for giving me a chance to learn a lot from being a supervisor and for you excellent work in the solution PLA blob counting. Jenny, for picking me up the first day when I arrived in Uppsala, for teaching me padlocking and C2CA, and for helping me around now and then; Anna K, for sharing your experience on blob counting and always having time to count the blobs for me; Lean L and Carla, for helping me with many things in the lab; Ida, Irene and Sara H, for interesting discus- sions both inside and outside the microscope room; Chaterina for teaching me the in situ padlocking and a lot of fun discussions; Pier, Olle, Kalle, Mikaela, Magnus, Henrik, Katerina and Yuki, for activities, discussions and company and for helping me in different ways. Tim and Liu Lei, for your hospitality and the nice dinners, for inviting me to join different activities; Liu Yangling, for useful advices and for shar- ing your research experience with me; Johan V, for your company and for inviting me to you and Sofia’s parties.

Sara K, for interesting discussions both during the in situ meetings and during others times, for sharing dissertation information with me; Jiao Xiang, for also sharing information on dissertation, and for other nice and interesting discussion; Anja N, Jimmy, Pan Gang, Anqi, Soumin, Ammar Maria L, Snehangshu, Mashara, Lean Å, Hamid and Zhao Hongxing for your company and help in different ways at Rudbeck Laboratory.

Cai Yangling and Huang Hua, for your hospitality, for nice dinners and great party times; Yu Ziquan, for helping me with the moving; Su-Chen and Shang Feng, for your hospitality and for treating me with nice Taiwan- ese food as well as interesting discussions; Cui Tao, Wang Chuan and Xiaoze for the time when we took course together; Many other friends in or have already moved out from Uppsala and Wu Chenglin, Wu Lei, Li Li, Zhang Lei, Guo Xiaohu, Jiang Wangshu, Sun Yu, Fu Xi, Ding Zhoujie, Xu Hui, Yu Jin, Yu Di, Chuan Jin, Xie Yuan, VJ, Wei Kun, Mateusz,

39 Hu Xiao, Gao Shanjun and Prof. Zhou Weiguang for your company and help in different ways.

I also would like to take this opportunity to thank Prof. Qingge Li, my former supervisor of my master study, for leading me into the field of mo- lecular diagnostics, for your valuable advices, suggestions and help both during and after my master study; All my former lab fellows at the Molecu- lar Diagnostics Lab at Xiamen University, for the time when we worked and had fun together.

My parents, for everything; My brother, Ronghui, for being such a nice brother; My parents-in-law, for believing in me, for being so supportive and understanding, and for a lot of other things; My beloved wife, Chen, for everything.

40 References

1. E. M. Southern, Detection of Specific Sequences among DNA Fragments Separated by Gel-Electrophoresis. Journal of Molecular Biology 98, 503 (1975). 2. J. G. Gall, M. L. Pardue, Formation and detection of RNA-DNA hybrid molecules in cytological preparations. Proc Natl Acad Sci U S A 63, 378 (1969). 3. L. Jin, R. V. Lloyd, In situ hybridization: methods and applications. J Clin Lab Anal 11, 2 (1997). 4. J. G. Bauman, J. Wiegant, P. Borst, P. van Duijn, A new method for fluorescence microscopical localization of specific DNA sequences by in situ hybridization of fluorochromelabelled RNA. Exp Cell Res 128, 485 (1980). 5. J. M. Levsky, R. H. Singer, Fluorescence in situ hybridization: past, present and future. Journal of Cell Science 116, 2833 (2003). 6. I. Weibrecht et al., Visualising individual sequence-specific protein- DNA interactions in situ. N Biotechnol 29, 589 (2012). 7. A. Raj, P. van den Bogaard, S. A. Rifkin, A. van Oudenaarden, S. Tyagi, Imaging individual mRNA molecules using multiple singly labeled probes. Nature Methods 5, 877 (2008). 8. M. L. Collins et al., A branched DNA signal amplification assay for quantification of nucleic acid targets below 100 molecules/ml. Nucleic Acids Res 25, 2979 (1997). 9. G. J. Tsongalis, Branched DNA technology in molecular diagnostics. Am J Clin Pathol 126, 448 (2006). 10. A. N. Player, L. P. Shen, D. Kenny, V. P. Antao, J. A. Kolberg, Single-copy gene detection using branched DNA (bDNA) in situ hybridization. Journal of Histochemistry & Cytochemistry 49, 603 (2001). 11. R. M. Dirks, N. A. Pierce, Triggered amplification by hybridization chain reaction. Proceedings of the National Academy of Sciences of the United States of America 101, 15275 (2004). 12. H. M. T. Choi et al., Programmable in situ amplification for multiplexed imaging of mRNA expression. Nature Biotechnology 28, 1208 (2010). 13. R. K. Saiki et al., Enzymatic amplification of beta-globin genomic sequences and restriction site analysis for diagnosis of sickle cell anemia. Science 230, 1350 (1985). 14. R. K. Saiki et al., Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239, 487 (1988).

41 15. C. A. Heid, J. Stevens, K. J. Livak, P. M. Williams, Real time quantitative PCR. Genome Research 6, 986 (1996). 16. G. Pohl, L. M. Shih, Principle and applications of digital PCR. Expert Review of Molecular Diagnostics 4, 41 (2004). 17. B. Vogelstein, K. W. Kinzler, Digital PCR. Proceedings of the National Academy of Sciences of the United States of America 96, 9236 (1999). 18. K. A. Heyries et al., Megapixel digital PCR. Nat Methods 8, 649 (2011). 19. P. Gill, A. Ghaemi, Nucleic acid isothermal amplification technologies: a review. Nucleosides Nucleotides Nucleic Acids 27, 224 (2008). 20. Y. Mitani et al., Rapid SNP diagnostics using asymmetric isothermal amplification and a new mismatch-suppression technology. Nat Methods 4, 257 (2007). 21. J. N. Housby, E. M. Southern, Fidelity of DNA ligation: a novel experimental approach based on the polymerisation of libraries of oligonucleotides. Nucleic Acids Res 26, 4259 (1998). 22. U. Landegren, R. Kaiser, J. Sanders, L. Hood, A ligase-mediated gene detection technique. Science 241, 1077 (1988). 23. D. Y. Wu, R. B. Wallace, The ligation amplification reaction (LAR)--amplification of specific DNA sequences using sequential rounds of template-dependent ligation. Genomics 4, 560 (1989). 24. F. Barany, Genetic disease detection and DNA amplification using cloned thermostable ligase. Proc Natl Acad Sci U S A 88, 189 (1991). 25. F. Barany, The ligase chain reaction in a PCR world. PCR Methods Appl 1, 5 (1991). 26. M. L. Metzker, Sequencing technologies - the next generation. Nat Rev Genet 11, 31 (2010). 27. E. R. Mardis, The impact of next-generation sequencing technology on genetics. Trends in Genetics 24, 133 (2008). 28. S. C. Schuster, Next-generation sequencing transforms today's biology. Nature Methods 5, 16 (2008). 29. E. E. Schadt, S. Turner, A. Kasarskis, A window into third- generation sequencing. Human Molecular Genetics 19, R227 (2010). 30. M. Margulies et al., Genome sequencing in microfabricated high- density picolitre reactors. Nature 437, 376 (2005). 31. M. Ronaghi, M. Uhlen, P. Nyren, A sequencing method based on real-time pyrophosphate. Science 281, 363 (1998). 32. J. Shendure et al., Accurate multiplex polony sequencing of an evolved bacterial genome. Science 309, 1728 (2005). 33. E. R. Mardis, Next-generation DNA sequencing methods. Annual Review of Genomics and Human Genetics 9, 387 (2008).

42 34. D. Dressman, H. Yan, G. Traverso, K. W. Kinzler, B. Vogelstein, Transforming single DNA molecules into fluorescent magnetic particles for detection and enumeration of genetic variations. Proceedings of the National Academy of Sciences of the United States of America 100, 8817 (2003). 35. R. D. Mitra, G. M. Church, In situ localized amplification and contact replication of many individual DNA molecules. Nucleic Acids Research 27, (1999). 36. J. M. Rothberg et al., An integrated semiconductor device enabling non-optical genome sequencing. Nature 475, 348 (2011). 37. B. Hornblower et al., Single-molecule analysis of DNA-protein complexes using nanopores. Nature Methods 4, 315 (2007). 38. D. Branton et al., The potential and challenges of nanopore sequencing. Nature Biotechnology 26, 1146 (2008). 39. J. Clarke et al., Continuous base identification for single-molecule nanopore DNA sequencing. Nature Nanotechnology 4, 265 (2009). 40. T. D. Harris et al., Single-molecule DNA sequencing of a viral genome. Science 320, 106 (2008). 41. M. J. Levene et al., Zero-mode waveguides for single-molecule analysis at high concentrations. Science 299, 682 (2003). 42. J. Eid et al., Real-Time DNA Sequencing from Single Polymerase Molecules. Science 323, 133 (2009). 43. M. Nilsson et al., Padlock probes: circularizing oligonucleotides for localized DNA detection. Science 265, 2085 (1994). 44. J. Luo, D. E. Bergstrom, F. Barany, Improving the fidelity of Thermus thermophilus DNA ligase. Nucleic Acids Res 24, 3071 (1996). 45. M. Nilsson et al., Padlock probes reveal single-nucleotide differences, parent of origin and in situ distribution of centromeric sequences in human chromosomes 13 and 21. Nat Genet 16, 252 (1997). 46. A. C. Syvanen, Accessing genetic variation: genotyping single nucleotide polymorphisms. Nat Rev Genet 2, 930 (2001). 47. J. Baner et al., Parallel gene analysis with allele-specific padlock probes and tag microarrays. Nucleic Acids Res 31, e103 (2003). 48. C. Larsson et al., In situ genotyping individual DNA molecules by target-primed rolling-circle amplification of padlock probes. Nat Methods 1, 227 (2004). 49. S. Shaposhnikov, C. Larsson, S. Henriksson, A. Collins, M. Nilsson, Detection of Alu sequences and mtDNA in comets using padlock probes. Mutagenesis 21, 243 (2006). 50. F. Dahl et al., Circle-to-circle amplification for precise and sensitive DNA analysis. Proc Natl Acad Sci U S A 101, 4548 (2004). 51. J. Jarvius et al., Digital quantification using amplified single- molecule detection. Nature Methods 3, 725 (2006). 52. K. Sato et al., Microbead-based rolling circle amplification in a microchip for sensitive DNA detection. Lab Chip 10, 1262 (2010).

43 53. J. Goransson et al., A single molecule array for digital targeted molecular analyses. Nucleic Acids Res 37, e7 (2009). 54. J. Baner, P. Marits, M. Nilsson, O. Winqvist, U. Landegren, Analysis of T-cell receptor V beta gene repertoires after immune stimulation and in malignancy by use of padlock probes and microarrays. Clinical Chemistry 51, 768 (2005). 55. K. Zhang et al., Digital RNA allelotyping reveals tissue-specific and allele-specific gene expression in human. Nature Methods 6, 613 (2009). 56. S. R. Lin, W. Y. Wang, C. Palm, R. W. Davis, K. Juneau, A molecular inversion probe assay for detecting alternative splicing. Bmc Genomics 11, (2010). 57. S. Henriksson et al., Development of an in situ assay for simultaneous detection of the genomic and replicative form of PCV2 using padlock probes and rolling circle amplification. Virology Journal 8, (2011). 58. C. K. M. Tsui et al., Rapid identification and detection of pine pathogenic fungi associated with mountain pine beetles by padlock probes. Journal of Microbiological Methods 83, 26 (2010). 59. P. Hardenbol et al., Multiplexed genotyping with sequence-tagged molecular inversion probes. Nature Biotechnology 21, 673 (2003). 60. P. M. Lizardi et al., Mutation detection and single-molecule counting using isothermal rolling-circle amplification. Nature Genetics 19, 225 (1998). 61. P. Hardenbol et al., Highly multiplexed molecular inversion probe genotyping: Over 10,000 targeted SNPs genotyped in a single tube assay. Genome Research 15, 269 (2005). 62. Y. K. Wang et al., Allele quantification using molecular inversion probes (MIP). Nucleic Acids Research 33, (2005). 63. H. L. Ji et al., Molecular inversion probe analysis of gene copy alterations reveals distinct categories of colorectal carcinoma. Cancer Research 66, 7910 (2006). 64. Y. Wang et al., Analysis of Molecular Inversion Probe performance for allele copy number determination. Genome Biology 8 , (2007). 65. G. J. Porreca et al., Multiplex amplification of large sets of human exons. Nat Methods 4, 931 (2007). 66. J. Baner, M. Nilsson, M. Mendel-Hartvig, U. Landegren, Signal amplification of padlock probes by rolling circle replication. Nucleic Acids Research 26, 5073 (1998). 67. L. Blanco et al., Highly Efficient DNA-Synthesis by the Phage Phi- 29 DNA-Polymerase - Symmetrical Mode of DNA-Replication. Journal of Biological Chemistry 264, 8935 (1989). 68. W. M. Howell, I. Grundberg, M. Faryna, U. Landegren, M. Nilsson, Glycosylases and AP-cleaving enzymes as a general tool for probe- directed cleavage of ssDNA targets. Nucleic Acids Research 38, (2010).

44 69. B. Domon, R. Aebersold, Mass spectrometry and protein analysis. Science 312, 212 (2006). 70. B. Domon, K. Alving, T. He, T. E. Ryan, S. D. Patterson, Enabling parallel protein analysis through mass spectrometry. Curr Opin Mol Ther 4, 577 (2002). 71. S. Pan et al., Mass spectrometry based targeted protein quantification: methods and applications. J Proteome Res 8, 787 (2009). 72. R. Huttenhain, J. Malmstrom, P. Picotti, R. Aebersold, Perspectives of targeted mass spectrometry for protein biomarker verification. Curr Opin Chem Biol 13, 518 (2009). 73. S. A. Carr, L. Anderson, Protein quantitation through targeted mass spectrometry: the way out of biomarker purgatory? Clin Chem 54, 1749 (2008). 74. R. S. Yalow, S. A. Berson, Immunoassay of endogenous plasma insulin in man. J Clin Invest 39, 1157 (1960). 75. E. Engvall, P. Perlmann, Enzyme-linked immunosorbent assay (ELISA). Quantitative assay of immunoglobulin G. Immunochemistry 8, 871 (1971). 76. G. B. Wisdom, Enzyme-immunoassay. Clin Chem 22, 1243 (1976). 77. R. F. Schall, Jr., H. J. Tenoso, Alternatives to radioimmunoassay: labels and methods. Clin Chem 27, 1157 (1981). 78. T. Sano, C. L. Smith, C. R. Cantor, Immuno-PCR: very sensitive antigen detection by means of specific antibody-DNA conjugates. Science 258, 120 (1992). 79. J. M. Nam, C. S. Thaxton, C. A. Mirkin, Nanoparticle-based bio-bar codes for the ultrasensitive detection of proteins. Science 301, 1884 (2003). 80. S. Fredriksson et al., Protein detection using proximity-dependent DNA ligation assays. Nat Biotechnol 20, 473 (2002). 81. S. Fredriksson et al., Multiplexed protein detection by proximity ligation for cancer biomarker validation. Nature Methods 4, 327 (2007). 82. M. Gullberg et al., Cytokine detection by antibody-based proximity ligation. Proceedings of the National Academy of Sciences of the United States of America 101, 8420 (2004). 83. S. Darmanis et al., Sensitive Plasma Protein Analysis by Microparticle-based Proximity Ligation Assays. Molecular & Cellular Proteomics 9, 327 (2010). 84. O. Soderberg et al., Direct observation of individual endogenous protein complexes in situ by proximity ligation. Nat Methods 3, 995 (2006). 85. O. Soderberg et al., Characterizing proteins and their interactions in cells and tissues using the in situ proximity ligation assay. Methods 45, 227 (2008). 86. E. Schallmeiner et al., Sensitive protein detection via triple-binder proximity ligation assays. Nat Methods 4, 135 (2007).

45 87. G. Tavoosidana et al., Multiple recognition assay reveals prostasomes as promising plasma biomarkers for prostate cancer. Proc Natl Acad Sci U S A 108, 8809 (2011). 88. T. Conze et al., MUC2 mucin is a major carrier of the cancer- associated sialyl-Tn antigen in intestinal metaplasia and gastric carcinomas. Glycobiology 20, 199 (2010). 89. M. Jarvius et al., In situ detection of phosphorylated platelet-derived growth factor receptor beta using a generalized proximity ligation method. Molecular & Cellular Proteomics 6, 1500 (2007). 90. B. Rotman, Measurement of activity of single molecules of beta-D- galactosidase. Proc Natl Acad Sci U S A 47, 1981 (1961). 91. S. Nie, R. N. Zare, Optical detection of single molecules. Annu Rev Biophys Biomol Struct 26, 567 (1997). 92. A. Ishijima, T. Yanagida, Single molecule nanobioscience. Trends Biochem Sci 26, 438 (2001). 93. S. Weiss, Measuring conformational dynamics of biomolecules by single molecule fluorescence spectroscopy. Nat Struct Biol 7, 724 (2000). 94. S. Weiss, Fluorescence spectroscopy of single biomolecules. Science 283, 1676 (1999). 95. A. M. Armani, R. P. Kulkarni, S. E. Fraser, R. C. Flagan, K. J. Vahala, Label-free, single-molecule detection with optical microcavities. Science 317, 783 (2007). 96. J. Todd et al., Ultrasensitive flow-based immunoassays using single- molecule counting. Clinical Chemistry 53, 1990 (2007). 97. A. H. Wu, N. Fukushima, R. Puskas, J. Todd, P. Goix, Development and preliminary clinical validation of a high sensitivity assay for cardiac troponin using a capillary flow (single molecule) fluorescence detector. Clin Chem 52, 2157 (2006). 98. D. M. Rissin et al., Single-molecule enzyme-linked immunosorbent assay detects serum proteins at subfemtomolar concentrations. Nat Biotechnol 28, 595 (2010). 99. J. H. Jett et al., High-speed DNA sequencing: an approach based upon fluorescence detection of single molecules. J Biomol Struct Dyn 7, 301 (1989). 100. J. D. Harding, R. A. Keller, Single-molecule detection as an approach to rapid DNA sequencing. Trends Biotechnol 10, 55 (1992). 101. L. M. Davis et al., Rapid DNA sequencing based upon single molecule detection. Genet Anal Tech Appl 8, 1 (1991). 102. C. Adessi et al., Solid phase DNA amplification: characterisation of primer attachment and amplification mechanisms. Nucleic Acids Res 28, E87 (2000). 103. M. Fedurco, A. Romieu, S. Williams, I. Lawrence, G. Turcatti, BTA, a novel reagent for DNA attachment on glass and efficient generation of solid-phase amplified DNA colonies. Nucleic Acids Res 34, e22 (2006).

46 104. J. Shendure, H. L. Ji, Next-generation DNA sequencing. Nature Biotechnology 26, 1135 (2008). 105. B. G. Zimmermann et al., Digital PCR: a powerful new tool for noninvasive prenatal diagnosis? Prenat Diagn 28, 1087 (2008). 106. H. C. Fan, Y. J. Blumenfeld, U. Chitkara, L. Hudgins, S. R. Quake, Noninvasive diagnosis of fetal aneuploidy by shotgun sequencing DNA from maternal blood. Proc Natl Acad Sci U S A 105, 16266 (2008). 107. R. W. Chiu et al., Noninvasive prenatal diagnosis of fetal chromosomal aneuploidy by massively parallel genomic sequencing of DNA in maternal plasma. Proc Natl Acad Sci U S A 105, 20458 (2008). 108. R. Drmanac et al., Human Genome Sequencing Using Unchained Base Reads on Self-Assembling DNA Nanoarrays. Science 327, 78 (2010). 109. A. Pihlak et al., Rapid genome sequencing with short universal tiling probes. Nature Biotechnology 26, 676 (2008). 110. E. Domingo, Rapid evolution of viral RNA genomes. Journal of Nutrition 127, S958 (1997). 111. J. Holland et al., Rapid Evolution of Rna Genomes. Science 215, 1577 (1982). 112. O. Ergonul, Emerging viral infections - Crimean-Congo haemorrhagic fever in Turkey. Journal of Clinical Virology 46, S2 (2009). 113. O. Ergonul, Crimean-Congo haemorrhagic fever. Lancet Infectious Diseases 6, 203 (2006). 114. L. M. Eubanks, T. J. Dickerson, K. D. Janda, Technological advancements for the detection of and protection against biological and chemical warfare agents. Chemical Society Reviews 36, 458 (2007). 115. K. E. Sapsford, C. Bradburne, J. B. Detehanty, I. L. Medintz, Sensors for detecting biological agents. Materials Today 11, 38 (2008). 116. N. M. Cirino, K. A. Musser, C. Egan, Multiplex diagnostic platforms for detection of biothreat agents. Expert Review of Molecular Diagnostics 4, 841 (2004). 117. N. Rifai, M. A. Gillette, S. A. Carr, Protein biomarker discovery and validation: the long and uncertain path to clinical utility. Nat Biotechnol 24, 971 (2006). 118. S. Fredriksson et al., Multiplexed proximity ligation assays to profile putative plasma biomarkers relevant to pancreatic and ovarian cancer. Clin Chem 54, 582 (2008). 119. M. Lundberg et al., Multiplexed homogeneous proximity ligation assays for high-throughput protein biomarker research in serological material. Molecular & Cellular Proteomics 10, M110 004978 (2011).

47 120. R. G. Rutledge, C. Cote, Mathematics of quantitative kinetic PCR and the application of standard curves. Nucleic Acids Res 31, e93 (2003). 121. J. B. Haun, T. J. Yoon, H. Lee, R. Weissleder, Magnetic nanoparticle biosensors. Wiley Interdiscip Rev Nanomed Nanobiotechnol 2, 291 (2010). 122. J. Llandro, J. J. Palfreyman, A. Ionescu, C. H. Barnes, Magnetic biosensor technologies for medical applications: a review. Med Biol Eng Comput 48, 977 (2010). 123. J. Connolly, T. G. St Pierre, Proposed biosensors based on time- dependent properties of magnetic fluids. Journal of Magnetism and Magnetic Materials 225, 156 (2001). 124. A. P. Astalan, F. Ahrentorp, C. Johansson, K. Larsson, A. Krozer, Biomolecular reactions studied using changes in Brownian rotation dynamics of magnetic particles. Biosensors & Bioelectronics 19, 945 (2004). 125. M. Stromberg et al., Sensitive molecular diagnostics using volume- amplified magnetic nanobeads. Nano Letters 8, 816 (2008). 126. J. Goransson et al., Sensitive detection of bacterial DNA by magnetic nanoparticles. Anal Chem 82, 9138 (2010). 127. T. Z. G. de la Torre et al., Detection of rolling circle amplified DNA molecules using probe-tagged magnetic nanobeads in a portable AC susceptometer. Biosensors & Bioelectronics 29, 195 (2011). 128. A. Raj, A. van Oudenaarden, Nature, nurture, or chance: stochastic gene expression and its consequences. Cell 135, 216 (2008). 129. J. M. Levsky, R. H. Singer, Gene expression and the myth of the average cell. Trends Cell Biol 13, 4 (2003). 130. P. Dalerba et al., Single-cell dissection of transcriptional heterogeneity in human colon tumors. Nat Biotechnol 29, 1120 (2011). 131. M. K. Chiang, D. A. Melton, Single-cell transcript analysis of pancreas development. Dev Cell 4, 383 (2003). 132. F. Tang et al., mRNA-Seq whole-transcriptome analysis of a single cell. Nat Methods 6, 377 (2009). 133. S. Islam et al., Characterization of the single-cell transcriptional landscape by highly multiplex RNA-seq. Genome Research 21, 1160 (2011). 134. A. Femino, F. S. Fay, K. Fogarty, R. H. Singer, Visualization of single RNA transcripts in situ. Science 280, 585 (1998). 135. J. M. Levsky, S. M. Shenoy, R. C. Pezo, R. H. Singer, Single-cell gene expression profiling. Science 297, 836 (2002). 136. C. Larsson, I. Grundberg, O. Soderberg, M. Nilsson, In situ detection and genotyping of individual mRNA molecules. Nature Methods 7, 395 (2010).

48

Acta Universitatis Upsaliensis Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine 829 Editor: The Dean of the Faculty of Medicine

A doctoral dissertation from the Faculty of Medicine, Uppsala University, is usually a summary of a number of papers. A few copies of the complete dissertation are kept at major Swedish research libraries, while the summary alone is distributed internationally through the series Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Medicine.

ACTA UNIVERSITATIS UPSALIENSIS Distribution: publications.uu.se UPPSALA urn:nbn:se:uu:diva-183141 2012