Unix Essentials: Hands-On Parsing HBI Array Data


Goal: Process a gene expression file to get information such as genes of interest, sort by expression values, and subset the data for further investigation.

0) In your browser, open the page:
     http://jura.wi.mit.edu/bio/education/hot_topics/unix_essentials_2016/UnixEssentials_HandsOn.txt
   You can copy and paste the commands from that page as we need them. In this file all commands are in red.

1) Log into tak. See handouts.

2) Go to BaRC’s training folder:
     cd /nfs/BaRC_training
   Create a folder with your login name using the mkdir command:
     mkdir your_login_name
   Then go to the directory that you just created:
     cd your_login_name
   [Note: Replace your_login_name with your tak login name]
   Check where you are:
     pwd

3) Copy the HBI data we will be working with:
     cp ../HBI.partial.txt .
   # If you are following these instructions after the Hot Topics session is over, use this command instead:
   #cp /nfs/BaRC_Public/Hot_Topics/Unix_Essentials_Oct2016/HBI.partial.txt .

4) View the file in your favorite editor. What is the first field?
   Command:
     gedit HBI.partial.txt &
   Without an X window, you can run:
     more HBI.partial.txt
   or
     head -1 HBI.partial.txt | cut -f1
   Answer: Gene

5) How many genes are in the HBI data? [Note: the file has a header line]
   Command:
     wc -l HBI.partial.txt
   Answer: 999

6) Get the first column and columns 20-22, and output them to a file called HBI.partial.new.txt. Use this new file for the rest of the questions.
   Command:
     cut -f 1,20-22 HBI.partial.txt > HBI.partial.new.txt

7) What tissues are included in the new file?
   Command:
     head -1 HBI.partial.new.txt
   Answer: Brain (Ganglia)

8) Are there any duplicate genes? [Hint: uniq needs a sorted list]
   Command:
     cut -f 1 HBI.partial.new.txt | sort | uniq -d
   Answer: No

9) Sort the expression values based on the second column. Which gene has the highest expression level?
   Command:
     sort -k 2,2gr HBI.partial.new.txt | head
   Answer: LOC100507311 1.7393
   [Note the difference between the sort options -g (general numeric sort) and -n (numeric sort)]
     sort -k 2,2gr HBI.partial.new.txt | cut -f1,2 | head
     sort -k 2,2nr HBI.partial.new.txt | cut -f1,2 | head

10) Get all the genes that begin only with "ZNF" from the original file, and output to a new file. Make sure to include the header line by writing just the header to the new file first. [Hint: use grep]
   Command:
     head -1 HBI.partial.new.txt > ZNF_genes.txt
     grep "^ZNF" HBI.partial.new.txt >> ZNF_genes.txt

11) At the end of the class, make a folder with your name in your lab folder and copy all the material to it. Refer to the handout for the location of your lab folder on tak. These are example commands:
     mkdir /lab/PI_name_lab/username
     mkdir /lab/PI_name_lab/username/unix_essentials_class
     cp -r * /lab/PI_name_lab/username/unix_essentials_class
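
The hints in steps 8 to 10 (uniq needs sorted input, -g versus -n, and the ^ anchor in grep) can be tried on a tiny made-up file before running them on the real data. The sketch below is only an illustration: the gene names and values are invented and are not taken from HBI.partial.txt, the temporary file comes from mktemp, and it assumes a sort that supports -g (GNU coreutils does).

```shell
#!/usr/bin/env bash
# Toy data only: these gene names and values are invented for illustration
# and are not taken from HBI.partial.txt.
set -eu
export LC_ALL=C   # fixed locale so "." is the decimal point and sort order is predictable

toy=$(mktemp)
printf 'Gene\tBrain\n'   >  "$toy"   # header line
printf 'ZNF1\t0.5\n'     >> "$toy"
printf 'ABC1\t2e-1\n'    >> "$toy"   # scientific notation, equals 0.2
printf 'ZNF2\t1.2e1\n'   >> "$toy"   # scientific notation, equals 12
printf 'AZNF3\t-0.3\n'   >> "$toy"   # contains "ZNF" but does not begin with it
printf 'ZNF1\t0.5\n'     >> "$toy"   # deliberate duplicate of ZNF1

# Skip the header (tail -n +2), then compare -g and -n on column 2.
top_g=$(tail -n +2 "$toy" | sort -k 2,2gr | head -1 | cut -f1)
top_n=$(tail -n +2 "$toy" | sort -k 2,2nr | head -1 | cut -f1)

# Duplicate gene names: uniq -d only sees adjacent repeats, hence the sort.
dups=$(cut -f 1 "$toy" | sort | uniq -d)

# Lines whose gene name begins with ZNF (the ^ anchors the match to line start).
znf_count=$(grep -c "^ZNF" "$toy")

echo "top gene with -g: $top_g"
echo "top gene with -n: $top_n"
echo "duplicated genes: $dups"
echo "ZNF lines: $znf_count"
rm -f "$toy"
```

On this toy file, -g reads 1.2e1 as 12 and ranks ZNF2 first, while -n stops reading at the "e" and ranks ABC1 (read as 2) first. This is why -g is the safer choice when expression values may be written in scientific notation.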
