Lab Manual Bioperl Programming Language

LAB MANUAL
BIOPERL PROGRAMMING LANGUAGE

DBT STAR STATUS
Department of BCA

Published by Coimbatore Institute of Information Technology
#156, 3rd Floor, Kalidas Road, Ramnagar, Coimbatore – 641009, Tamil Nadu, India.
Website: www.ciitresearch.org

Complimentary Copy

All Rights Reserved.
Original English Language Edition 2020 © Copyright by Coimbatore Institute of Information Technology.

This book may not be duplicated in any way without the express written consent of the publisher, except in the form of brief excerpts or quotations for the purpose of review. The information contained herein is for the personal use of the reader and may not be incorporated in any commercial programs, other books, databases, or any kind of software without the written consent of the publisher. Making copies of this book or any portion thereof for any purpose other than your own is a violation of copyright laws.

This edition has been published by Coimbatore Institute of Information Technology, Coimbatore.

Limits of Liability/Disclaimer of Warranty: The author and publisher have used their best efforts in preparing this Bioperl Programming Language book. The author makes no representations or warranties with respect to the accuracy or completeness of the contents of this book, and specifically disclaims any implied warranties of merchantability or fitness for any particular purpose. There are no warranties which extend beyond the descriptions contained in this paragraph. No warranty may be created or extended by sales representatives or written sales materials. Neither CiiT nor the author shall be liable for any loss of profit or any other commercial damage, including but not limited to special, incidental, consequential, or other damages.

Trademarks: All brand names and product names used in this book are trademarks, registered trademarks, or trade names of their respective holders.

ISBN 978-93-891053-7-7

This book is printed on 70 gsm paper.
Printed in India by Mahasagar Technologies.
Coimbatore Institute of Information Technology, #156, 3rd Floor, Kalidas Road, Ramnagar, Coimbatore – 641009, Tamil Nadu, India.
Phone: 0422-4377821
www.ciitresearch.org

Authors
Dr. M. S Vijaya
K. Geethalakshmi
S. Mohanapriya
M. Selvanayaki
T. S. Anushya Devi
T. Saranya
A. Kavitha
L. Sheeba
G. Sangeetha
S. Kavitha

Dr. R. SUBASHKUMAR, M.Sc., Ph.D., DMLT., PGDNBT.,
Associate Professor, Department of Biotechnology,
Sri Ramakrishna Arts & Science College (Autonomous, Affiliated to Bharathiar University),
Coimbatore, Tamil Nadu, India.

The Lab Manual of Bioperl Programming Language is a collection of fourteen titled modules that facilitate the development of perl scripts and their use in the analysis of biological data, exploring approaches, emerging methodologies, and tools that can give biological meaning to the data generated. The manual has been prepared with a prioritised methodology, viz., operation of tools using suitable methods, annotated algorithms, self-descriptive results, etc. I hope that the practical manual will also be of interest to computational biologists who want to learn and practise a little more about the biological questions related to the field of bioinformatics.

This book is intended to ease the learning curve for new users of bioperl and to serve as a user guide for a specific set of DNA and protein sequence analysis programs. It enables the development of scripts that can analyze large quantities of sequence data in ways that are typically difficult or impossible with web based systems. The book has been carefully reviewed by the team of experts and is highly appreciated as fulfilling the requirement of the DBT STAR College Scheme.

Mr. Vivek Chandramohan,
Assistant Professor, Department of Biotechnology,
Siddaganga Institute of Technology, Tumkur-572103, Karnataka, India.
I have gone through the entire manual of the “Bioperl Programming Language” lab. The manual is very well written and the concepts are very clear. Bioperl is used for many applications; this manual deals with BLAST, multiple sequence alignment, gene prediction algorithms, and some mathematical models. Finding mutations in given sequences is most important in clinical genomics.

Introduction to Bioperl:

Bioperl is a collection of perl modules that facilitate the development of perl scripts for bioinformatics applications. It provides reusable perl modules that facilitate writing perl scripts for sequence manipulation, accessing databases using a range of data formats, and executing and parsing the results of various molecular biology programs including BLAST, ClustalW, T-Coffee, Genscan, ESTScan and HMMER. Bioperl enables developing scripts that can analyse large quantities of sequence data in ways that are typically difficult or impossible with web based systems.

Using Bioperl:

Bioperl provides software modules for many of the typical tasks of bioinformatics programming. These include:

- Accessing sequence data from local and remote databases
- Transforming formats of database/file records
- Manipulating individual sequences
- Searching for "similar" sequences
- Creating and manipulating sequence alignments
- Searching for genes and other structures on genomic DNA
- Developing machine readable sequence annotation

Installation:

The actual installation of the various system components is accomplished in the standard manner:

- Locate the package on the network
- Download
- Decompress (with gunzip or a similar utility)
- Extract the files from the archive (e.g. with tar -xvf)
- Create a "makefile" (with "perl Makefile.PL" for perl modules, or a supplied "install" or "configure" program for non-perl programs)
- Run "make", "make test" and "make install"

This procedure must be repeated for every CPAN
module, Bioperl-extension and external module to be installed. A helper module, CPAN.pm, is available from CPAN; it automates the process of installing the perl modules.

BIOPERL PROGRAMMING LANGUAGE

Syllabus:

1. To align the sequence using Local Blast.
2. Write a script to search for genes from Genscan.
3. Protein Sequence Generation.
4. To count start and stop codons in a sequence.
5. To calculate the reverse complement of a DNA sequence.
6. To concatenate DNA fragments; transcription: DNA to RNA.
7. Program to simulate DNA Mutation.
8. Write a code to check file extensions using Perl.
9. Write a code to search for a given string in an entered sequence.
10. To compute the total and average length of the proteins in these files.
11. Reading a protein sequence from a file.
12. Write a sequence to a file.

List of Experiments

S. No   Programs
01      Align the Sequence using Local Blast
02      Gene from Genscan
03      Protein Sequence Generation
04      To Count Start and Stop Codons in a Sequence
05      Reverse Complement of DNA
06      DNA Fragments Transcription
07      DNA Mutation
08      To Check File Extensions
09      To Search String in Sequence
10      Total and Average Length of Proteins in Files
11      Reading Protein Sequence From Files
12      Sequence into a File

1. ALIGN THE SEQUENCE USING LOCAL BLAST

Prerequisites: NCBI, BLAST Tools.

Aim: To write a program to align the sequence using local blast.

Steps Using Tools:
1) Open the NCBI website.
2) Choose BLAST under Popular Resources.
3) According to the chosen sequence, choose Protein BLAST or Nucleotide BLAST.
4) Enter the sequence.
5) Choose Others in the Database option.
6) Choose Nucleotide in the following option.
7) Choose the BLAST option to view the output.
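The steps above use the NCBI web page. As a rough companion, the sketch below shows one way the results of a locally run BLAST search might be post-processed in plain Perl. It assumes the search was run with the BLAST+ tabular option (-outfmt 6), whose twelve default columns are qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue and bitscore. The subroutine names and the sample rows are hypothetical and made up for illustration only; they are not taken from this manual or from any real search.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse BLAST tabular text (the 12 default columns of -outfmt 6) and
# return one hash reference per hit line.
sub parse_blast_tabular {
    my ($text) = @_;
    my @columns = qw(qseqid sseqid pident length mismatch gapopen
                     qstart qend sstart send evalue bitscore);
    my @hits;
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*(#|$)/;       # skip comments and blank lines
        my @fields = split /\t/, $line;
        next unless @fields == @columns;    # ignore malformed rows
        my %hit;
        @hit{@columns} = @fields;
        push @hits, \%hit;
    }
    return @hits;
}

# Keep only hits at or below an E-value cutoff, best bit score first.
sub best_hits {
    my ($max_evalue, @hits) = @_;
    my @kept = grep { $_->{evalue} <= $max_evalue } @hits;
    return sort { $b->{bitscore} <=> $a->{bitscore} } @kept;
}

# Made-up rows for illustration only; real input would be read from the
# file written by the local BLAST run.
my $sample = join "\n",
    join("\t", qw(query1 subjA 98.50 200 3 0 1 200 51 250 1e-100 360)),
    join("\t", qw(query1 subjB 85.00 120 18 0 10 129 5 124 2e-30 130)),
    join("\t", qw(query1 subjC 70.00 40 12 0 1 40 300 339 0.5 30.1));

# Print query, subject, percent identity and E-value for the kept hits.
for my $hit (best_hits(1e-5, parse_blast_tabular($sample))) {
    printf "%s\t%s\t%.2f%%\t%s\n", @{$hit}{qw(qseqid sseqid pident evalue)};
}
```

The E-value cutoff of 1e-5 is an arbitrary choice for the example. With Bioperl installed, the Bio::SearchIO module is the usual route for parsing full BLAST reports.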
2. GENE FROM GENSCAN

Prerequisites: Perl Software, Notepad.

Aim: To write a script to search for genes in the Genscan.

Algorithm:
Step 1: Start the process.
Step 2: Open the programs folder in the Perl folder.
Step 3: Open Notepad and type the program.
Step 4: Save the program as sel.pl in the folder which is created in perl.
Step 5: Run the program in Command Prompt as perl sel.pl.
Step 6: Check the output in the perl folder.
Step 7: Stop the program.

Program:

    use strict;
    use warnings;
    use Bio::Seq;
    use Bio::SeqIO;

    # Build a sequence object from a literal DNA string.
    my $seq_obj = Bio::Seq->new(
        -seq        => "aaaatgggggggggggccccgtt",
        -display_id => "#12345",
        -desc       => "example 1",
        -alphabet   => "dna",
    );

    # Open perlseq1.fasta for writing (leading '>') in FASTA format.
    my $seqio_obj = Bio::SeqIO->new(
        -file   => '>perlseq1.fasta',
        -format => 'fasta',
    );
    $seqio_obj->write_seq($seq_obj);

Output:

    C:\>cd perl64
    C:\Perl64>cd programs
    C:\Perl64\Programs>perl sel.pl
    C:\Perl64\Programs>

The script prints nothing to the console; the sequence is written to perlseq1.fasta in the Programs folder.

3. PROTEIN SEQUENCE GENERATION

Prerequisites: Expasy and UniProtKB Tools

Aim: To write a program for generating protein sequence.

Steps Using Tools:
1. Search for the Expasy tool in Google.
2. Click the Proteomics column.
3. Under the databases, click the UniProtKB link.
4. Copy the protein sequence into a text file.
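Once a protein sequence has been copied into a text file as in step 4, a short script can check what was saved. The following is a minimal sketch in plain Perl, assuming the sequence was saved in FASTA format (a header line starting with ">" followed by sequence lines). The subroutine names and the sample record are hypothetical; the record is not a real UniProtKB entry.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse FASTA-formatted text into a list of [header, sequence] pairs.
sub parse_fasta {
    my ($text) = @_;
    my @records;
    for my $line (split /\r?\n/, $text) {
        $line =~ s/\s+$//;                    # trim trailing whitespace
        next if $line eq '';
        if ($line =~ /^>(.*)/) {
            push @records, [$1, ''];          # header starts a new record
        }
        elsif (@records) {
            $records[-1][1] .= uc($line);     # append sequence, upper-cased
        }
    }
    return @records;
}

# Return 1 if the sequence uses only the 20 standard amino-acid letters.
sub is_standard_protein {
    my ($seq) = @_;
    return $seq =~ /^[ACDEFGHIKLMNPQRSTVWY]+$/ ? 1 : 0;
}

# Hypothetical record for illustration; not a real UniProtKB entry.
my $sample = <<'FASTA';
>example_protein hypothetical entry
MKTAYIAKQR
QISFVKSHFS
FASTA

# Report the header, length and a simple validity check for each record.
for my $record (parse_fasta($sample)) {
    my ($header, $seq) = @$record;
    printf "%s: %d residues, standard letters only: %s\n",
        $header, length($seq), is_standard_protein($seq) ? 'yes' : 'no';
}
```

To use a saved file instead of the inline sample, read its contents into a string and pass that to parse_fasta. With Bioperl installed, Bio::SeqIO with -format => 'fasta' can read such a file directly.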
