Chapter 3 Materials and Methods
3.1 Computational Methods

3.1.1 Public Databases

I searched several public biological databases to detect homologues of PRNP and their genomic context: Ensembl (http://www.ensembl.org/); National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov/); DNA Data Bank of Japan (DDBJ; http://www.ddbj.nig.ac.jp/); The Institute for Genomic Research (TIGR; http://www.tigr.org/); the Sanger Institute (http://www.sanger.ac.uk/) and a local BLAST server for the Sanger Institute zebrafish database (http://danio.mgh.harvard.edu/blast/blast_grp.html); Functional Annotation of Mouse (FANTOM; http://www.gsc.riken.go.jp/e/FANTOM/); The Fugu Genomics group at the Rosalind Franklin Centre for Genomics Research (http://Fugu.hgmp.mrc.ac.uk/); A Database of the Drosophila Genome (FlyBase; http://www.flybase.org/); Genoscope (http://www.genoscope.cns.fr/); Medaka Genome Database (M Base; http://mbase.bioweb.ne.jp/~dclust/medaka_top.html); and The Biology and Genome of C. elegans (WormBase; http://www.wormbase.org/).

3.1.2 Analysis of Nucleic Acid Sequences

3.1.2.1 Basic Analysis and Handling of Sequences

The Lasergene (DNASTAR, Madison WI, USA) and Vector NTI (InforMax, Frederick MD, USA) software packages were used for basic handling, storage and analysis of the nucleotide sequences. I used these programs to align cDNAs with genomic sequences (local alignment) and to define gene features. I also used them to align homologous sequences (global alignment) and to determine basic gene features such as the lengths of exons and introns and their nucleotide context.

Exon-intron structures of SPRN genes were determined using the Local Pair Alignment tool (Huang and Miller, 1991) on the ANGIS interactive web interface Biomanager (http://www.angis.org.au).
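Once a cDNA has been aligned to its genomic sequence, exon and intron lengths follow directly from the exon boundaries. The following is a minimal Python sketch of that arithmetic, not the procedure implemented in Lasergene or Vector NTI. The function name and the example coordinates are hypothetical, and 1-based inclusive coordinates on a single strand are assumed.

```python
def exon_intron_lengths(exons):
    """Return (exon_lengths, intron_lengths) from exon coordinates.

    `exons` is a list of (start, end) pairs in 1-based, inclusive genomic
    coordinates on one strand (an assumption of this sketch).
    """
    ordered = sorted(exons)
    # Inclusive coordinates: an exon spanning 101..200 is 100 bp long.
    exon_lengths = [end - start + 1 for start, end in ordered]
    # An intron is the gap between the end of one exon and the start of the next.
    intron_lengths = [
        next_start - prev_end - 1
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]
    return exon_lengths, intron_lengths


# Hypothetical two-exon gene used only to illustrate the calculation.
example_exons = [(101, 200), (1001, 1500)]
exon_lengths, intron_lengths = exon_intron_lengths(example_exons)
```

For the hypothetical coordinates above, the single intron spans positions 201 to 1000.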
3.1.2.2 Design of PCR Primers

I used the MacVector 7.0 program (Oxford Molecular Group 2000) to design primers for the PCR experiments. The design criteria were set as stringently as possible and adjusted individually for each primer.

3.1.2.3 Analysis of Transposable Element Content

I used the slow speed option of the RepeatMasker program, available as a free web service (http://ftp.genome.Washington.edu/RM/RepeatMasker.html), to determine the relative content of interspersed repeats, small RNA, satellites, simple repeats and low complexity sequence in my nucleotide sequences.

3.1.2.4 Prediction of CpG Islands

CpG islands in the human, mouse, and rat genomic sequences were detected using the free, open source EMBOSS software package (Rice et al. 2000) run locally on a Linux server. The cpgplot program (Larsen et al. 1992) identifies a CpG island as a sequence region in which, averaged over 10 windows, the GC percentage is over 50% and the observed/expected (Obs/Exp) CpG ratio is over 0.6 across a stretch of more than 100 bp.

3.1.2.5 Analysis of Genomic Sequences using NIX Interactive Tool

I routinely used the NIX interactive web tool (Williams et al. 1998; http://www.hgmp.mrc.ac.uk/NIX/) to annotate genomic sequences. This service enables analysis and annotation of genomic sequences using several programs simultaneously. The analysis includes CpG island, promoter, poly-A signal site and exon predictions, analysis of transposable element content, filtering of vector sequences, and BLASTN and BLASTP searches (Figure 3.1).

3.1.3 Analysis of Protein Sequences

3.1.3.1 Translation of Protein Sequences in silico

I used the Lasergene (DNASTAR, Madison WI, USA) and Vector NTI (InforMax, Frederick MD, USA) program packages to translate nucleotide sequences into amino acid sequences, and also for basic handling, storage and analysis of the protein sequences.
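The two quantities behind the CpG island criterion in Section 3.1.2.4 (GC percentage and Obs/Exp ratio) can be illustrated with a short Python sketch. This is a simplified illustration, not the EMBOSS cpgplot implementation: it evaluates each window on its own and omits the averaging over 10 windows. The function names, the 100 bp window and the step size are assumptions made for the example. The expected CpG count is taken as (number of C × number of G) / window length.

```python
def cpg_stats(window):
    """Return (GC percentage, observed/expected CpG ratio) for one window."""
    seq = window.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    observed = seq.count("CG")  # CpG dinucleotides cannot overlap each other
    gc_percent = 100.0 * (c + g) / n if n else 0.0
    expected = (c * g) / n if n else 0.0
    obs_exp = observed / expected if expected else 0.0
    return gc_percent, obs_exp


def passes_thresholds(window, min_gc=50.0, min_obs_exp=0.6):
    """True if the window exceeds both thresholds quoted in Section 3.1.2.4."""
    gc_percent, obs_exp = cpg_stats(window)
    return gc_percent > min_gc and obs_exp > min_obs_exp


def candidate_windows(sequence, window=100, step=1):
    """Start positions (0-based) of windows that exceed both thresholds.

    Unlike cpgplot, no averaging over neighbouring windows is performed.
    """
    return [
        start
        for start in range(0, len(sequence) - window + 1, step)
        if passes_thresholds(sequence[start:start + window])
    ]
```

Runs of consecutive start positions returned by `candidate_windows` mark candidate regions. A production analysis would still rely on cpgplot itself for the averaging and minimum-length rules.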
3.1.3.2 Alignment of Protein Sequences

I aligned protein sequences using the program Cameleon v3.14 (Oxford Molecular 1995, now owned by Accelrys), which implements the algorithm of Taylor (1990). Multiple sequence alignments were edited manually. Final alignment figures were prepared using the CHROMA alignment editor (Goodstadt and Ponting, 2001), available free on the web (http://www.lg.ndirect.co.uk/chroma/index.htm).

3.1.3.3 Prediction of Signal Peptides and GPI-anchor Addition Sites

I used the SignalP program (Nielsen et al., 1997), available as a free web service (http://www.cbs.dtu.dk/services/SignalP/), to predict signal peptide cleavage sites. GPI-anchor addition sites were predicted using the bigPI-predictor (Eisenhaber et al., 1999), also available as a free web service (http://mendel.imp.univie.ac.at/sat/gpi/gpi_server.html).

3.1.3.4 Computational Prediction of Tammar Wallaby PrP Structure

The homology model of the tammar wallaby PrP structure was built by submitting the primary amino acid sequence to SWISS-MODEL, an automated comparative modelling server available as a free web service (http://swissmodel.expasy.org/).

Figure 3.1 An example of the NIX output. The NIX analysis enables annotation of genomic sequences using a number of programs simultaneously. CpG island prediction: GRAIL/cpg. Promoter prediction: GRAIL/polIIprom, TSSW/Promotor, GENSCAN/Prom, FGenes/Prom. Exon prediction: Fex, Hexon, MZEF, Genemark, GRAIL/exons, GRAIL/gap2, Genefinder, Fgene, GENSCAN, FGenes. BLASTP searches: BLAST/trembl, BLAST/swissprot. BLASTN searches: BLAST/est, BLAST/embl-, BLAST/gss, BLAST/sts. Poly-A site prediction: Polyah, GENSCAN/polya, FGenes/polya, GRAIL/polya. Vector and contamination check: BLAST/vector, BLAST/ecoli. Transposable element content: RepeatMasker. tRNAs: tRNAscan-SE. Sequence, nucleotide sequence in bp. This figure was downloaded from http://www.hgmp.mrc.ac.uk/NIX/.
I visualized and analysed protein structures using the program Swiss-PdbViewer (http://au.expasy.org/spdbv/) run under Mac OS X. I aligned the model of the tammar wallaby PrP structure with the experimental mouse (1AG2) and bovine (1DWY) PrP structures deposited in the Protein Data Bank (http://www.rcsb.org/pdb/index.html/).

3.1.3.5 Analysis of Evolutionary Distances

I calculated evolutionary distances in a set of aligned PrPs using the free MEGA2 program (Kumar et al., 2001; http://www.megasoftware.net) run locally under Windows. Evolutionary distance was defined as the number of amino acid substitutions between a pair of sequences. Using the Complete-Delete option to remove alignment gaps, I determined the number of valid common sites and the number of sites that differed between two sequences. I then calculated the percent identity between each pair of sequences manually.

3.1.3.6 Protein Amino Acid Pattern Search

I used the ScanProsite program, available as a free web service (http://au.expasy.org/tools/scanprosite/), to search the SWISS-PROT database with the defined marsupial PrP pattern [T-T-T-T-T-T-K].

3.1.4 Cross-Species Comparisons

3.1.4.1 Global Alignments of Long Genomic Sequences

I used the VISTA free web server (Mayor et al. 2000; http://www-gsd.lbl.gov/VISTA/) to produce global alignments of long genomic sequences.
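The percent identity calculation described in Section 3.1.3.5 can be written out as a short Python sketch. It assumes that complete deletion means discarding every alignment column in which any sequence has a gap, and that percent identity is the share of the remaining sites that do not differ. The function names and the toy alignment are hypothetical and are not PrP sequences. This is an illustration of the arithmetic, not MEGA2 code.

```python
def complete_deletion_columns(alignment, gap="-"):
    """Indices of alignment columns that contain no gap in any sequence."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [i for i in range(length) if all(seq[i] != gap for seq in alignment)]


def percent_identity(seq_a, seq_b, columns):
    """Return (valid sites, differing sites, percent identity) for one pair."""
    valid = len(columns)
    differences = sum(1 for i in columns if seq_a[i] != seq_b[i])
    identity = 100.0 * (valid - differences) / valid if valid else 0.0
    return valid, differences, identity


# Toy alignment used only to illustrate the calculation.
toy_alignment = ["MKT-LLA", "MKTALLG", "MRT-LLA"]
columns = complete_deletion_columns(toy_alignment)
valid, differences, identity = percent_identity(
    toy_alignment[0], toy_alignment[1], columns
)
```

In the toy alignment the fourth column is removed for all pairs because two sequences carry a gap there, which leaves six valid sites, one of which differs between the first two sequences.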
3.1.4.1.1 Alignment of PRNP Genomic Context between Mammals and Fish

I aligned human (chr20: 4558866-4938939 bp; Ensembl human v12.31.1), mouse (chr2: 132862228-133103751 bp; Ensembl mouse v12.3.1), rat (chr3: 112821949-113040637 bp; Ensembl rat v11.2.1), Fugu (chr_scaffold_155: 247572-271811 bp; Ensembl Fugu v12.2.1), and Tetraodon (Tetraodon virtual contig 1) genomic sequences containing either the PRNP or stPrP-2 gene, its whole upstream intergenic sequence, and adjacent genes (gene for PrP-like protein in fishes, PRND in mammals, PRNT in human, and RASSF2 and SLC23A1 in both mammals and fishes). Prior to submission to VISTA, transposable elements in the mammalian sequences were masked using RepeatMasker as described in Section 3.1.2.3. The human sequence and its annotation were used as the base sequence. Pairwise sequence comparisons were calculated with a threshold of 50% identity in a 50 bp window. The minimum identity shown in the VISTA plots is 30%.

3.1.4.1.2 Alignment of SPRN Genomic Context between Mammals and Fish

I aligned human (chr10: 134322741-134374165 bp; Ensembl human v12.31.1), mouse (chr7: 130335510-130371625 bp; Ensembl mouse v12.3.1), rat (chr1: 199669758-199704156 bp; Ensembl rat v11.2.1), zebrafish (assembly_203: 4494411-4519077 bp; Ensembl zebrafish v14.2.1), Fugu (chr_scaffold_28: 384496-394338 bp; Ensembl Fugu v12.2.1), and Tetraodon (Tetraodon virtual contig 2) genomic sequences containing the SPRN gene, its whole upstream intergenic region, and adjacent genes (gene encoding GTP-binding protein in all species, amine-oxidase-coding gene in mammals and pufferfish, and long-chain fatty-acyl elongase-coding gene in zebrafish). Prior to submission to VISTA, transposable elements in the mammalian sequences were masked using RepeatMasker as described in Section 3.1.2.3. The human sequence and its annotation were used as the base sequence. Pairwise sequence comparisons were calculated with a threshold of 50% identity in a 50 bp window.
The minimum identity shown in the VISTA plots is 30%.

3.1.4.1.3 Alignment of Tammar Wallaby, Human, Mouse, Bovine and Ovine PRNPs

I aligned, in silico, genomic sequences harbouring the PRNP gene. The 66.5 kb tammar wallaby BAC sequence contained the PRNP gene (30055-50234 bp) together with flanking sequences. The human (chr20: 4558866-4650555 bp; Ensembl human v16.33.1) and mouse (chr2: 132052342-132080812 bp; Ensembl mouse v22.32b.1) genomic DNA sequences contained complete proximal and distal intergenic regions. PRNP in the bovine genomic sequence (AJ298878; NCBI) encompassed 49430-69659 bp. I merged two overlapping ovine genomic sequences (U67922 and AY184242; NCBI) into a single 46955 bp contig, with PRNP lying between 5666 and 26295 bp.
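The conservation criterion quoted in Sections 3.1.4.1.1 and 3.1.4.1.2 (50% identity in a 50 bp window) can be illustrated with a sliding-window calculation over a pairwise alignment. The Python sketch below is a simplified illustration and does not reproduce the VISTA implementation. The function names are hypothetical, gapped columns are counted as mismatches, and the threshold is treated as inclusive; all three are assumptions of this example.

```python
def window_identity(aligned_a, aligned_b, window=50, gap="-"):
    """Percent identity of every full window along a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # 1 for an identical, ungapped column; 0 for a mismatch or a gap.
    matches = [
        1 if a == b and a != gap else 0
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
    ]
    return [
        100.0 * sum(matches[start:start + window]) / window
        for start in range(len(matches) - window + 1)
    ]


def conserved_windows(aligned_a, aligned_b, window=50, threshold=50.0):
    """Start positions (0-based) of windows at or above the identity threshold."""
    identities = window_identity(aligned_a, aligned_b, window)
    return [start for start, value in enumerate(identities) if value >= threshold]
```

Given two aligned strings, `conserved_windows` returns the window start positions that would be flagged as conserved under these assumptions. Plotting the values from `window_identity` against position gives a curve of the same general kind as a VISTA plot.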