Introduction to Unix

Introduction to Unix

10/19/2017 Outline • Unix overview Introduction to Unix – Logging in to tak4 – Directory structure – Accessing Whitehead drives/finding your lab share • Basic commands • Exercises BaRC Hot Topics • BaRC and Whitehead resources Bioinformatics and Research Computing Whitehead Institute • LSF October 19th 2017 http://barc.wi.mit.edu/hot_topics/ 1 2 What is unix? and Why unix? Where can unix be used? • Unix is a family of operating systems that are • Mac computers related to the original UNIX operating system Come with unix • It is powerful, so large datasets can be analyzed • Windows computers need Cygwin or PuTTY: • Many repetitive analyses or tasks can be easily \\grindhouse\Software\Cygwin\Wibr‐Cygwin‐Installer‐1010.exe automated https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html • Some computer programs only run on the unix • Dedicated unix server operation system “tak4”, the Whitehead Scientific Linux • TAK4 (our unix server): lots of software and databases already installed or downloaded 3 4 1 10/19/2017 Connecting to tak for Windows Logging in to tak4 host: tak4.wi.mit.edu • Requesting a tak4 account http://iona.wi.mit.edu/bio/software/unix/bioinfoaccount.php 3) Click “Open” • Windows 1) Open putty 2.1) Write the address of the Host 2.2) Click on SSH‐>X11‐> Enable X11 PuTTY or Cygwin forwarding When you write the Xming: setup X‐windows for graphical display password you won’t see any characters being typed. • Macs Access through Terminal Command Prompt 5 user@tak4 ~$ 6 Log in to tak4 for Mac Unix Directory Structure root / ssh –Y [email protected] home usr bin nfs lab . jdoe BaRC_Public BaRC_training solexa_public solexa_lodish page /home/jdoe /lab/page See http://www.thegeekstuff.com/2010/09/linux‐file‐system‐structure/ for more detailed info. nfs and lab are directories specific to Whitehead 7 8 2 10/19/2017 Accessing Shared Resources at Whitehead Get ready for the exercises • Unix /nfs/BaRC_Public /nfs/BaRC_training • Use this link to copy paste the commands for the exercises /lab/solexa_public http://barc.wi.mit.edu/education/hot_topics/hot_topics/unix_es • Windows (access using Start Menu Search) sentials_2017/UnixEssentials_HandsOn.txt \\wi‐files1\BaRC_Public \\wi‐files1\BaRC_training \\wi‐htdata\solexa_public • These folders contain today’s slides and the material for the • Macs (access using Go Connect to Server…) cifs://wi‐files1/BaRC_Public exercises cifs://wi‐files1/BaRC_training \\wi‐files1\BaRC_Public\Hot_Topics\unixEssentials_2017 cifs://wi‐htdata/solexa_public cifs://wi‐files1/BaRC_Public/Hot_Topics/unixEssentials_2017 Where’s my lab’s share? • http://it.wi.mit.edu/systems/file‐storage/lab‐share‐paths 9 10 Unix Tips Basic commands listing and organizing files/folders • Use to reuse previous commands • list the contents of a directory: ls [only show names] ls –l [long listing: show other information too] ls –h [human readable] • Ctrl‐c: stop a process that is running • make a directory: mkdir dirname • change directory: cd dirname • The directories . and .. • Tab‐completion: The current directory (.) cd . The parent directory (..) – Complete commands/file names cd .. (go to the parent directory) ls .. (list to the parent directory) • print working directory: pwd • Go to your home directory: cd or cd ~ • Unix is case‐sensitive • Go to the previous directory: cd - 11 12 3 10/19/2017 Paths Unix Directory Structure root • “Absolute path” example / /nfs/BaRC_training/userName/myWork /nfs/BaRC_training/userName/myFig home usr bin cd /nfs/BaRC_training/userName/myWork nfs lab . cd /nfs/BaRC_training/userName/myFig • “Relative path” example jdoe BaRC_Public BaRC_training solexa_public solexa_lodish ../dirName if I am in “/nfs/BaRC_training/userName/myWork” I can do: page userName cd ../userName/myFig myWork myFig cd /nfs/BaRC_training/userName/myFig 13 cd ../userName/myFig 14 Basic commands Permissions copying, moving files, getting help drwxrwxr-x • copy: cp r read w write Type: directory (d) User Group Others • move: mv symbolic link(l) x execute • remove: rm • Use chmod to change permissions user(u), group(g), others(o), all(a) • remove directory: rmdir chmod u+x foo.pl (user can execute) chmod g-w foo.pl (group can’t write) • get help on a command: man {command} thiruvil@tak /nfs/BaRC_Public$ ls -l myFile.txt -rw-r--r-- 1 thiruvil barc 0 2012-10-10 13:32 myFile.txt thiruvil@tak /nfs/BaRC_Public$ chmod g+w myFile.txt thiruvil@tak /nfs/BaRC_Public$ ll myFile.txt -rw-rw-r-- 1 thiruvil barc 0 2012-10-10 13:32 myFile.txt Do Exercises 1, 2 and 3 15 16 4 10/19/2017 Displaying the contents of a file on the Editing a File screen • cat filename: Dump a file to the screen • Command‐line editors pico • more filename: Progressively dump a file to the screen: nano ENTER = one line down SPACEBAR = page down q=quit emacs (emacs –nw) • less filename: like more but with extended capabilities vi • Graphical editors (Windows users need an X‐windows emulator) • head filename: Show the first few lines of a file Note: may not be part of standard installation • head ‐n filename: Show the first n lines of a file nedit gedit • tail filename: Show the last few lines of a file xemacs • tail ‐n filename: Show the last n lines of a file • Put an & at the end of command line to run it in the background when using a graphical editor so that you can continue to use the • clear: clear screen terminal window eg. gedit myFile.txt& 17 18 Parsing a File: cut Output Redirection and Piping Word count: wc • Write output of a command to file • Select columns of interest Write to output file cut –f 9,12-15 myGeneValues.txt > col_9.12to15.txt • sort myFile.txt > myFile_sorted.txt Options: Over‐write output file (if it exists) ‐f output only these fields • sort myFile.txt >| myFile_sorted.txt ‐d field delimiter Append to output file • sort myFile.txt >> myFile_sorted.txt • Count number of lines/words/characters in file wc myFile • Piping “|”: use output of one command as wc ‐w (count words only) wc ‐l (count lines only) input for another command sort myFile.txt | more Do Exercises 4, 5, 6 and 7 19 20 5 10/19/2017 Sorting and removing redundancy Searching the contents of a file sort and uniq • Sort on column(s) • grep (global regular expression print) sort -k 3,3 myGeneExpression.txt | more Find words, or patterns, occurring in lines of a file Options: grep TMEM geneList.txt TMEM131 ‐n, ‐‐numeric‐sort compare according to string numerical value TMEM9B ‐g, ‐‐general‐numeric‐sort compare according to general numerical value TMEM14C ‐r reverse TMEM66 ‐k pos1,pos2 start sorting at pos1, end it at pos2 TMEM49 • Get only unique entries Options: uniq mySortedGenes.txt > myUniqGenes.txt ‐v select non‐matching lines Options: ‐iignore case ‐n print line number ‐c count entries Example: get TMEM but exclude TMEM14C ‐d duplicate counts grep TMEM geneList.txt | grep -v "TMEM14C" | more make sure that the file is sorted before running uniq Do Exercise 10 Do Exercises 8 and 9 21 22 Getting Files (Un)Compressing Files • .gz file • Getting files or directories Compress: gzip file.txt (file.txt.gz will be created) Uncompress: gunzip file.txt.gz (file.txt will be created) Files wget http://data.broadinstitute.org/igv/projects/downloads/2.4/IGVSource_2.4.2.zip • .tar.gz file Compress: tar –czvf myFiles.tar.gz myFiles Directories from (outside) servers Uncompress: tar –xzvf myFiles.tar.gz scp –r origin destination Options scp -r userName@serverToCopyFrom:/pathToFolderTocopyFrom . ‐c create an archive ‐x extract an archive scp -r [email protected]:/broad/lab/works . ‐f FILE name of archive ‐v be verbose, list all files being archived/extracted ‐z create/extract archive with gzip/gunzip • View compressed files using: zmore,zgrep 23 24 6 10/19/2017 BaRC SOPs BaRC Resources • barc.wi.mit.edu http://barcwiki.wi.mit.edu/wiki/SOPs 25 26 Unix Commands and BaRC Scripts Running Scripts on Unix • Perl bed2gff.pl or ./bed2gff.pl or path/bed2gff.pl • R run_rma_customCDF.R Perl UNIX R • Python myScript.py or ./myScript.py • Matlab matlab -nodesktop -nosplash myScript.m • Java Archive (JAR) java -Xmx1000m -jar /usr/local/share/IGVTools/igv.jar 27 28 7 10/19/2017 BaRC code Running Programs/Tools on Unix • bedtools Script type Location bedtools intersect -a myGenes1.bed –b myGenes2.bed Perl /nfs/BaRC_Public/BaRC_code/Perl Other utilities: http://code.google.com/p/bedtools/wiki/Usage R /nfs/BaRC_Public/BaRC_code/R • samtools Python /nfs/BaRC_Public/BaRC_code/Python samtools view myFile.bam Other utilities: http://samtools.sourceforge.net/samtools.shtml • Fastx toolkit fastx_quality_stats -i mySeq.fastq -o fastxStats_mySeq • FastQC fastqc mySeq.fastq • BLAST blastp –task blastp -db myProtDB.fa –q myProt.fa –out out.txt 29 30 Commonly Used Data Locations at Whitehead LSF Commands • bsub to submit jobs Location Description bsub wc –l reads.fq bsub “sort foo.txt > sorted.txt” /nfs/genomes Genome data: gff, gtf, fasta, bowtie Options: indexed files, blat indexed file, etc. ‐e error_file for several organisms ‐o standard_out_file ‐m machine (send the job to that machine) /nfs/seq/Data Sequence data, including blast ‐n number (use that many processors in the cluster) databases, for several organisms bsub ‐n 4 ‐R "span[hosts=1]“(use 4 processors all in the same machine in the cluster) /nfs/BaRC_datasets Large (array/NGS) datasets: HBI, • bjobs to view your jobs HBM 2.0 bjobs • bkill to kill a job bkill 237878 31 32 8 10/19/2017 LSF cluster jobs Further Reading http://tak4.wi.mit.edu/ • BaRC: Unix Info http://iona.wi.mit.edu/bio/education/unix_intro.php • LSF Cluster (incl. examples) http://iona.wi.mit.edu/bio/bioinfo/docs/LSF_help.php • Whitehead IT Computing Tutorials http://it.wi.mit.edu/ 33 34 Upcoming Hot Topics Regular Expressions • Unix advanced (Oct 25th) • Pattern matching • An Introduction to R ‐ November • Easier to search • An Introduction to R Graphics ‐ November • Commonly used regular expressions Regular Expression Matches .All characters *Zero or more; wildcard ^ Beginning of a line $Endof a line example: grep chr.

View Full Text

Details

  • File Type
    pdf
  • Upload Time
    -
  • Content Languages
    English
  • Upload User
    Anonymous/Not logged-in
  • File Pages
    9 Page
  • File Size
    -

Download

Channel Download Status
Express Download Enable

Copyright

We respect the copyrights and intellectual property rights of all users. All uploaded documents are either original works of the uploader or authorized works of the rightful owners.

  • Not to be reproduced or distributed without explicit permission.
  • Not used for commercial purposes outside of approved use cases.
  • Not used to infringe on the rights of the original creators.
  • If you believe any content infringes your copyright, please contact us immediately.

Support

For help with questions, suggestions, or problems, please contact us