Identification of Adenovirus New Splice Sites
Uzair Tauheed
Degree project in infection biology, Master of Medical Science 2012
Department of Medical Biochemistry and Microbiology (IMBIM), Biomedical Centre (BMC), Uppsala University

Abstract

RNA splicing is a process in which introns are removed and exons are joined together. Human adenovirus type 2 pre-mRNAs undergo extensive alternative splicing and produce more than 40 differently spliced mRNAs. This thesis focuses on the identification of new splice sites in adenovirus. Using Illumina mRNA sequencing, we identified 255 splice sites. Analysis of the introns revealed three types of splice sites: GT-AG (61.2%), GC-AG (25.9%) and AT-AC (12.9%). Among the 255 splice sites, 224 were new. Notably, more than 50% of the new splice sites were located in the major late transcription unit on the positive strand of adenovirus DNA. Three new splice sites were further confirmed by PCR analysis: 17452-29489 (GC-AG), located on the negative strand of adenovirus DNA in the E2 region, and 9668-20346 (AT-AC) and 9699-30505 (GC-AG), both located on the positive strand of adenovirus DNA in the major late transcription unit.

Keywords: Alternative splicing, human adenovirus type 2, RNA sequencing, splice sites, major late transcription unit, adenovirus DNA, PCR analysis.

I would like to dedicate this thesis to my parents and teachers.

Table of Contents

1. Introduction
1.1. Adenoviruses
1.1.1. Genome organization
1.1.2. Life cycle
1.1.3. Adenovirus gene expression
1.1.3.1. Early genes
1.1.3.2. Late genes
1.2. RNA splicing
1.2.1. Adenovirus alternative splicing
1.2.2. Consensus sequences in pre-mRNA
1.2.3. Splicing mechanism and spliceosome
1.3. Aim of the project
2. Materials and methods
2.1. RNA samples
2.2. cDNA synthesis
2.3. Primers
2.4. Polymerase Chain Reaction (PCR)
3. Results and discussion
3.1. Identification of adenovirus new splice sites by RNA sequencing
3.2. PCR analysis for new splice sites
4. Conclusions
5. Acknowledgment
6. References

1. Introduction

1.1. Adenoviruses

Adenoviruses belong to the Adenoviridae family and infect a broad range of vertebrate hosts. Human adenoviruses are responsible for respiratory infections, gastroenteritis and conjunctivitis. Children are affected by adenovirus infections more often than adults, but adenoviruses are usually not considered highly pathogenic. Adenoviruses were discovered accidentally when two different groups of scientists were looking for the causative agents of acute respiratory infections (Hilleman and Werner, 1954; Rowe et al., 1953). These agents were named "adenoviruses" after their source, adenoid tissue (Enders et al., 1956). Based on oncogenicity, agglutination, homology and immunological properties, adenoviruses have been classified into seven subgroups (A to G) and further into serotypes (Jones et al., 2007). Currently, more than 50 human adenovirus serotypes have been discovered. Human adenovirus type 2 (Ad2) and type 5 (Ad5) of subgroup C are among the most widely studied human adenoviruses. The Ad12 serotype of subgroup A has been found to be oncogenic in rodents.
Fortunately, at present no adenovirus serotype has been reported to be oncogenic in humans (Zheng, 2010).

1.1.1. Genome organization

Adenovirus is a non-enveloped virus with an icosahedral capsid 70-100 nm in diameter. The capsid comprises 252 capsomers: 240 hexons and 12 vertex capsomers, each containing a penton base and a fiber. The genome of human adenovirus is a single linear double-stranded DNA molecule with inverted terminal repeats (ITRs) of about 100 bp at both ends, which play an important role in DNA replication (Figure 1). Generally, the human adenovirus genome is about 30-36 kb long. Ad2 was the first adenovirus genome to be completely sequenced, with a total length of 35,937 bp (Roberts et al., 1984). The genome is divided into transcriptional units that generate multiple mRNAs by alternative splicing and polyadenylation. Depending on the serotype, these mRNAs encode 30-40 proteins. Adenovirus genes fall into three temporal classes: six early genes (E1A, E1B, E2A, E2B, E3 and E4), two intermediate genes (IX and IVa2) and five late genes (L1-L5). All of these genes are transcribed by cellular RNA polymerase II (RNAP II), whereas the two virus-associated RNA genes (VA RNA I and VA RNA II) are transcribed by RNA polymerase III (RNAP III) (Pettersson and Roberts, 1986).

Figure 1. A genomic map of human adenovirus type 2. Early genes are represented by red bars, intermediate genes by black bars and late genes by yellow bars. The major late transcription unit (MLTU) encodes five gene families, L1-L5. The tripartite leaders are indicated as 1, 2, 3 and the (i) leader. (Reprinted with permission from Goran Akusjarvi.)

1.1.2. Life cycle

The virus attaches to the host cell through its fiber, which binds either the coxsackie and adenovirus receptor (CAR) or CD46, depending on the adenovirus serotype (Bergelson et al., 1997; Gaggar et al., 2003). The penton base then binds secondary host cell receptors of the integrin family, and the virus enters the cell through receptor-mediated endocytosis (Wickham et al., 1993).
Inside the endosome, the virus particle partially dissociates due to the low pH of the endosome and the cleavage activity of the viral L3 protease (Cotten and Weber, 1995; Greber et al., 1993). Subsequently, the endosome is disrupted and the virus particle is released into the cytoplasm. The capsid travels along microtubules to the nuclear pore and releases the viral DNA into the nucleus (Greber et al., 1997). The lytic replication cycle is divided into early and late phases, separated by the onset of viral DNA replication. In HeLa cells, the early phase of infection lasts about 5-8 hours and viral DNA replication starts about 6-10 hours post infection. The virus completes its infectious cycle in about 36 hours, releasing approximately 10^4 progeny virus particles per cell (Green and Daesch, 1961).

1.1.3. Adenovirus gene expression

Adenovirus gene expression is temporally regulated. Some genes are predominantly expressed early in infection and others dominate at late stages. The majority of the early gene products are regulatory proteins whose prime function is to push the host cell into S phase and to block the antiviral response, whereas late gene products are mostly structural proteins.

1.1.3.1. Early genes

E1A is the first gene expressed after the viral genome enters the nucleus. The E1A region encodes proteins that regulate both viral and host cell gene expression (Nevins, 1981). E1A proteins push host cells from the resting phase into active S phase. This is achieved by interacting with the tumor suppressor retinoblastoma protein (pRB), thereby releasing the cellular transcription factor E2F, which results in the activation of several cellular genes responsible for S phase induction (Johnson et al., 1993). E1B