Perl Tutorial

Working with DNA Sequences

#!/usr/bin/perl -w
# Storing DNA in a variable, and printing it out

# First we store the DNA in a variable called $DNA
$DNA = 'ACGGGAGGACGGGAAAATTACTACGGCATTAGC';

# Next, we print the DNA onto the screen
print $DNA;

# Finally, we'll specifically tell the program to exit.
exit;

Concatenating the DNA sequences

#!/usr/bin/perl -w
# Concatenating DNA

# Store two DNA fragments into variables called $DNA1 and $DNA2
$DNA1 = 'ACGGGAGGACGGGAAAATTACTACGGCATTAGC';
$DNA2 = 'ATAGTGCCGTGAGAGTGATGTAGTA';

# Print the DNA onto the screen
print "Here are the original two DNA fragments:\n\n";
print $DNA1, "\n";
print $DNA2, "\n\n";

# Concatenate the DNA fragments into a third variable and
# print them using "string interpolation"
$DNA3 = "$DNA1$DNA2";
print "Here is the concatenation of the first two fragments (version 1):\n\n";
print "$DNA3\n\n";

# An alternative way using the "dot operator":
# Concatenate the DNA fragments into a third variable and
# print them
$DNA3 = $DNA1 . $DNA2;
print "Here is the concatenation of the first two fragments (version 2):\n\n";
print "$DNA3\n\n";

# Print the same thing without using the variable $DNA3
print "Here is the concatenation of the first two fragments (version 3):\n\n";
print $DNA1, $DNA2, "\n";

exit;

TRANSCRIPTION: DNA -> RNA

#!/usr/bin/perl -w
# Transcribing DNA into RNA

# The DNA
$DNA = 'ACGGGAGGACGGGAAAATTACTACGGCATTAGC';

# Print the DNA onto the screen
print "Here is the starting DNA:\n\n";
print "$DNA\n\n";

# Transcribe the DNA to RNA by substituting all T's with U's.
$RNA = $DNA;
$RNA =~ s/T/U/g;

# Print the RNA onto the screen
print "Here is the result of transcribing the DNA to RNA:\n\n";
print "$RNA\n";

# Exit the program.
exit;

Reverse Complement

#!/usr/bin/perl -w
# Calculating the reverse complement of a strand of DNA

# The DNA
$DNA = 'ACGGGAGGACGGGAAAATTACTACGGCATTAGC';

# Print the DNA onto the screen
print "Here is the starting DNA:\n\n";
print "$DNA\n\n";

# Calculate the reverse complement
# First, copy the DNA into new variable $revcom
# (short for REVerse COMplement)
#
# It doesn't matter if we first reverse the string and then
# do the complementation; or if we first do the complementation
# and then reverse the string. Same result each time.
# So when we make the copy we'll do the reverse in the same statement.
$revcom = reverse $DNA;

# The DNA is now reversed. We need to complement the bases in $revcom:
# substitute all bases by their complements.
# A->T, T->A, G->C, C->G

# Attempt 1:
$revcom =~ s/A/T/g;
$revcom =~ s/T/A/g;
$revcom =~ s/G/C/g;
$revcom =~ s/C/G/g;

# Print the reverse complement DNA onto the screen
print "Here is the reverse complement DNA:\n\n";
print "$revcom\n";

# Does this work? Why?
# It does not. Each substitution runs over the whole string, so the T's
# produced by s/A/T/g are turned back into A's by s/T/A/g, and the C's
# produced by s/G/C/g are turned back into G's by s/C/G/g. The result
# contains only A's and G's.

# Attempt 2:
# Make a fresh reversed copy (Attempt 1 has overwritten $revcom), then
# translate every base in a single pass with tr///.
# See the text for a discussion of tr///
$revcom = reverse $DNA;
$revcom =~ tr/ACGTacgt/TGCAtgca/;

# Print the reverse complement DNA onto the screen
print "Here is the reverse complement DNA:\n\n";
print "$revcom\n";
print "\nThis time it worked!\n\n";

exit;

Reading Proteins in Files

#!/usr/bin/perl -w
# Reading protein sequence data from a file

# The filename of the file containing the protein sequence data
$proteinfilename = 'Name_Of_your_sequence_file.txt';

# First we have to "open" the file, and associate
# a "filehandle" with it. We choose the filehandle
# PROTEINFILE for readability.
open(PROTEINFILE, '<', $proteinfilename) || die("Cannot open file \"$proteinfilename\": $!\n");

# Now we do the actual reading of the protein sequence data from the
# file, by using the angle brackets < and > to get the input from the
# filehandle. We store the data, one line per element, into our
# array variable @protein.
@protein = <PROTEINFILE>;

# Now that we've got our data, we can close the file.
close PROTEINFILE;

# Print the protein onto the screen
print "Here is the protein:\n\n";
print @protein;

exit;

Pattern matching: Motifs and Loops

Proceed ONLY if the condition is true. Code layout:

if (condition) {
    do something
}

Finding Motifs

#!/usr/bin/perl -w
# if-elsif-else

$word = 'MNIDDKL';

# if-elsif-else conditionals
if ($word eq 'QSTVSGE') {
    print "QSTVSGE\n";
} elsif ($word eq 'MRQQDMISHDEL') {
    print "MRQQDMISHDEL\n";
}

GC CONTENT

In PCR experiments, the GC content of primers is used to predict their annealing temperature to the template DNA. A higher GC content indicates a higher melting temperature.

GC % = (G + C) / (A + G + C + T) x 100

Logic:

for each base in the DNA
    if base is A
        count_of_A = count_of_A + 1
    if base is C
        count_of_C = count_of_C + 1
    if base is G
        count_of_G = count_of_G + 1
    if base is T
        count_of_T = count_of_T + 1
done
print count_of_A, count_of_C, count_of_G, count_of_T

The script:

#!/usr/bin/perl -w
# Determining frequency of nucleotides

# Get the name of the file with the DNA sequence data
$dna_filename = 'File_name.txt';

# Remove the newline from the DNA filename
# (only needed when the name is read from input; harmless here)
chomp $dna_filename;

# Open the file, or exit
open(DNAFILE, '<', $dna_filename) || die("Cannot open file \"$dna_filename\": $!\n");

# Read the DNA sequence data from the file, and store it
# into the array variable @DNA
@DNA = <DNAFILE>;

# Close the file
close DNAFILE;

# From the lines of the DNA file,
# put the DNA sequence data into a single string.
$DNA = join( '', @DNA);

# Remove whitespace
$DNA =~ s/\s//g;

# Now explode the DNA into an array where each letter of
# the original string is now an element in the array.
# This will make it easy to look at each position.
# Notice that we're reusing the variable @DNA for this purpose.
@DNA = split( '', $DNA );

# Initialize the counts.
# Notice that we can use scalar variables to hold numbers.
$count_of_A = 0;
$count_of_C = 0;
$count_of_G = 0;
$count_of_T = 0;
$errors = 0;

# In a loop, look at each base in turn, determine which of
# the four types of nucleotides it is, and increment the
# appropriate count.
foreach $base (@DNA) {
    if ( $base eq 'A' ) {
        ++$count_of_A;
    } elsif ( $base eq 'C' ) {
        ++$count_of_C;
    } elsif ( $base eq 'G' ) {
        ++$count_of_G;
    } elsif ( $base eq 'T' ) {
        ++$count_of_T;
    } else {
        print "!!!!!!!! Error - I don\'t recognize this base: $base\n";
        ++$errors;
    }
}

# print the results
print "A = $count_of_A\n";
print "C = $count_of_C\n";
print "G = $count_of_G\n";
print "T = $count_of_T\n";
print "errors = $errors\n";

# exit the program
exit;

Using regex:

# Start every counter at zero so that a base with no matches prints 0
$a = $c = $g = $t = $e = 0;

# Each while loop runs once per match; /i makes the match case-insensitive
while($DNA =~ /a/ig){$a++}
while($DNA =~ /c/ig){$c++}
while($DNA =~ /g/ig){$g++}
while($DNA =~ /t/ig){$t++}
while($DNA =~ /[^acgt]/ig){$e++}

print "A=$a C=$c G=$g T=$t errors=$e\n";

The first version of the script uses a new kind of loop, the foreach loop. This loop works over the elements of an array. The line:

foreach $base (@DNA)

sets $base to each element of @DNA in turn and runs the loop body once per element.

Writing to Files

# Also write the results to a file called "countbase"
$outputfile = "countbase";

open(COUNTBASE, '>', $outputfile) || die("Cannot open file \"$outputfile\" to write to!!\n\n");

print COUNTBASE "A=$a C=$c G=$g T=$t errors=$e\n";

close(COUNTBASE);
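
The GC CONTENT section gives the GC % formula, but the scripts only print the four counts. As a minimal sketch of how the two fit together, the helper below computes the percentage directly from a DNA string. The subroutine name gc_content is a hypothetical name chosen for this illustration and does not come from the tutorial. The sketch also uses "use strict" and "my" variables, which the tutorial's scripts do not.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original tutorial):
# returns the GC percentage of a DNA string, or undef when the
# string contains no A, C, G or T at all.
sub gc_content {
    my ($dna) = @_;

    # tr/// with an empty replacement list leaves the string
    # unchanged and returns the number of characters it matched.
    my $gc    = ($dna =~ tr/GCgc//);
    my $total = ($dna =~ tr/ACGTacgt//);

    return undef if $total == 0;    # avoid dividing by zero

    # GC % = (G + C) / (A + G + C + T) x 100
    return 100 * $gc / $total;
}

my $DNA = 'ACGGGAGGACGGGAAAATTACTACGGCATTAGC';
printf "GC content: %.1f%%\n", gc_content($DNA);
```

Characters other than A, C, G and T (in either case) are left out of both the numerator and the denominator, which mirrors how the counting script tallies them separately as errors.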