Unix: Beyond the Basics

George W Bell, Ph.D.
BaRC Hot Topics – October, 2016
Bioinformatics and Research Computing, Whitehead Institute
http://barc.wi.mit.edu/hot_topics/

Logging in to our Unix server
• Our main server is called tak
• Request a tak account: http://iona.wi.mit.edu/bio/software/unix/bioinfoaccount.php
• Logging in from Windows
  – PuTTY for ssh
  – Xming for graphical display [optional]
• Logging in from Mac
  – Access the Terminal: Go → Utilities → Terminal
  – XQuartz needed for X-windows for newer OS X.

Log in using secure shell
    ssh -Y user@tak
• PuTTY on Windows
• Terminal on Macs
• Command prompt: user@tak ~$

Hot Topics website: http://jura.wi.mit.edu/bio/education/hot_topics/
• Create a directory for the exercises and use it as your working directory
    $ cd /nfs/BaRC_training
    $ mkdir john_doe
    $ cd john_doe
• Copy all files into your working directory
    $ cp -r /nfs/BaRC_training/UnixII/* .
• You should have the files below in your working directory:
  – foo.txt, sample1.txt, exercise.txt, datasets folder
  – You can check they're there with the 'ls' command

Unix Review: Commands
• command [arg1 arg2 … ] [input1 input2 … ]
    $ sort -k2,3nr foo.tab
      -k2,3: the sort key starts at field 2 and ends at field 3
      -n or -g: numeric sort; -n is recommended, except for scientific notation or a leading '+'
      -r: reverse order
    $ cut -f1,5 foo.tab
    $ cut -f1-5 foo.tab
      -f: select only these fields
      -f1,5: select 1st and 5th fields
      -f1-5: select 1st, 2nd, 3rd, 4th, and 5th fields
    $ wc -l foo.txt
      How many lines are in this file?

Unix Review: Common Mistakes
• Case sensitive
    cd /nfs/Barc_Public vs cd /nfs/BaRC_Public
    -bash: cd: /nfs/Barc_Public: No such file or directory
• Spaces may matter!
    rm -f myFiles* vs rm -f myFiles *
• Office applications can convert text to special characters that Unix won't understand
  – Ex: smart quotes, dashes

Unix Review: Pipes
• Stream output of one command/program as input for another
  – Avoid intermediate file(s)
    $ cut -f 1 myFile.txt | sort | uniq -c > uniqCounts.txt
      The '|' character is the pipe.

What we will discuss today
• Aliases (to reduce typing)
• sed (for file manipulation)
• awk/bioawk (to filter by column)
• groupBy (bedtools; not typical Unix)
• join (merge files)
• loops (one-line and with shell scripts)
• scripting (to streamline commands)

Aliases
• Add a one-word link to a longer command
• To get current aliases (from ~/.bashrc)
    alias
• Create a new alias (two examples)
    alias sp='cd /lab/solexa_public/Reddien'
    alias CollectRnaSeqMetrics='java -jar /usr/local/share/picard-tools/CollectRnaSeqMetrics.jar'
• Make an alias permanent
  – Paste command(s) in ~/.bashrc

sed: stream editor for filtering and transforming text
• Print lines 10 - 15:
    $ sed -n '10,15p' bigFile > selectedLines.txt
• Delete 5 header lines at the beginning of a file:
    $ sed '1,5d' file > fileNoHeader
• Remove all version numbers (eg: '.1') from the end of a list of sequence accessions: eg. NM_000035.2
    $ sed 's/\.[0-9]\+//g' accsWithVersion > accsOnly
      s: substitute
      g: global modifier (change all)

Regular Expressions
• Pattern matching makes searching easier
• Commonly used regular expressions
    Pattern  Matches
    .        Any single character
    *        Zero or more of the preceding item (as a shell wildcard: any string)
    +        One or more of the preceding item
    ?        Zero or one of the preceding item (as a shell wildcard: exactly one character)
    ^        Beginning of a line
    $        End of a line
    [ab]     Any character in brackets
• Examples
  List all txt files:
    ls *.txt
  Replace CHR with Chr at the beginning of each line:
    $ sed 's/^CHR/Chr/' myFile.txt
  Delete a dot followed by one or more numbers:
    $ sed 's/\.[0-9]\+//g' myFile.txt
• Note: regular expression syntax may slightly differ between sed, awk, Unix shell, and Perl
  – Ex: \+ in sed is equivalent to + in Perl

awk
• Name comes from the original authors: Alfred V. Aho, Peter J. Weinberger, Brian W. Kernighan
• A simple programming language
• Good for filtering/manipulating multiple-column files

awk
• By default, awk splits each line on whitespace (spaces and tabs)
• Print the 2nd and 1st fields of the file:
    $ awk '{ print $2"\t"$1 }' foo.tab
• Convert sequences from tab delimited format to fasta format:
    $ head -1 foo.tab
    Seq1    ACTGCATCAC
    $ awk '{ print ">" $1 "\n" $2 }' foo.tab > foo.fa
    $ head -2 foo.fa
    >Seq1
    ACTGCATCAC

awk: field separator
• Issues with default separator (white space)
  – one field is gene description with multiple words
  – consecutive empty cells
• To use tab as the separator:
    $ awk -F "\t" '{ print NF }' foo.txt
  or
    $ awk 'BEGIN {FS="\t"} { print NF }' foo.txt
      BEGIN: action before reading input
      NF: number of fields in the current record
      FS: input field separator
      OFS: output field separator
      END: action after reading input
    Character  Description
    \n         newline
    \r         carriage return
    \t         horizontal tab

awk: arithmetic operations
Add average values of 4th and 5th fields to the file:
    $ awk '{ print $0 "\t" ($4+$5)/2 }' foo.tab
      $0: all fields (the whole line)
    Operator  Description
    +         Addition
    -         Subtraction
    *         Multiplication
    /         Division
    %         Modulo
    ^         Exponentiation
    **        Exponentiation

awk: making comparisons
Print out records if values in 4th or 5th field are above 4:
    $ awk '{ if( $4>4 || $5>4 ) print $0 }' foo.tab
    Operator  Description
    >         Greater than
    <         Less than
    <=        Less than or equal to
    >=        Greater than or equal to
    ==        Equal to
    !=        Not equal to
    ~         Matches
    !~        Does not match
    ||        Logical OR
    &&        Logical AND

awk
• Conditional statements:
  Display expression levels for the gene NANOG:
    $ awk '{ if(/NANOG/) print $0 }' foo.txt
  or
    $ awk '/NANOG/ { print $0 }' foo.txt
  or
    $ awk '/NANOG/' foo.txt
  Add line number to the above output:
    $ awk '/NANOG/ { print NR"\t"$0 }' foo.txt
      NR: line number of the current row
• Looping:
  Calculate the average expression (4th, 5th and 6th fields in this case) for each transcript
    $ awk '{ total= $4 + $5 + $6; avg=total/3; print $0"\t"avg }' foo.txt
  or
    $ awk '{ total=0; for (i=4; i<=6; i++) total=total+$i; avg=total/3; print $0"\t"avg }' foo.txt

bioawk*
• Extension of awk for commonly used file formats in bioinformatics
    $ bioawk -c help
    bed:
      1:chrom 2:start 3:end 4:name 5:score 6:strand 7:thickstart 8:thickend 9:rgb 10:blockcount 11:blocksizes 12:blockstarts
    sam:
      1:qname 2:flag 3:rname 4:pos 5:mapq 6:cigar 7:rnext 8:pnext 9:tlen 10:seq 11:qual
    vcf:
      1:chrom 2:pos 3:id 4:ref 5:alt 6:qual 7:filter 8:info
    gff:
      1:seqname 2:source 3:feature 4:start 5:end 6:score 7:filter 8:strand 9:group 10:attribute
    fastx:
      1:name 2:seq 3:qual 4:comment
*https://github.com/lh3/bioawk

bioawk: Examples
• Print transcript info and chr from a gff/gtf file (2 ways)
    bioawk -c gff '{print $group "\t" $seqname}' Homo_sapiens.GRCh37.75.canonical.gtf
  or
    bioawk -c gff '{print $9 "\t" $1}' Homo_sapiens.GRCh37.75.canonical.gtf
  Sample output:
    gene_id "ENSG00000223972"; transcript_id "ENST00000518655";    chr1
    gene_id "ENSG00000223972"; transcript_id "ENST00000515242";    chr1
• Convert a fastq file into fasta (2 ways)
    bioawk -c fastx '{print ">" $name "\n" $seq}' sequences.fastq
  or
    bioawk -c fastx '{print ">" $1 "\n" $2}' sequences.fastq

Summarize by Columns: groupBy (from bedtools)
Input file must be pre-sorted by grouping column(s)!
    -g grpCols   column(s) for grouping
    -c opCols    column(s) to be summarized
    -o           operation(s) applied to opCol:
                 sum, count, min, max, mean, median, stdev,
                 collapse (comma-sep list)
                 distinct (non-redundant comma-sep list)
Input file:
    !Ensembl Gene ID   !Ensembl Transcript ID   !Symbol
    ENSG00000281518    ENST00000627423          FOXO6
    ENSG00000281518    ENST00000630406          FOXO6
    ENSG00000280680    ENST00000625523          HHAT
    ENSG00000280680    ENST00000627903          HHAT
    ENSG00000280680    ENST00000626327          HHAT
    ENSG00000281614    ENST00000629761          INPP5D
    ENSG00000281614    ENST00000630338          INPP5D
Print the gene ID (1st column), the gene symbol (3rd column), and a list of transcript IDs (2nd column)
    $ sort -k1,1 Ensembl_info.txt | groupBy -g 1 -c 3,2 -o distinct,collapse
Partial output:
    !Ensembl Gene ID   !Symbol   !Ensembl Transcript ID
    ENSG00000281518    FOXO6     ENST00000627423,ENST00000630406
    ENSG00000280680    HHAT      ENST00000625523,ENST00000626327,ENST00000627903

Join files together
With Unix join
    $ join -1 1 -2 2 -t $'\t' FILE1 FILE2
  Join files on the 1st field of FILE1 with the 2nd field of FILE2, only showing the common lines.
  FILE1 and FILE2 must be sorted on the join fields before running join
With BaRC scripts (sorting not required)
  Code in /nfs/BaRC_Public/BaRC_code/Perl/
    $ join2filesByFirstColumn.pl file1 file2
Sample tables to join:
    !Symbol   Heart   Skeletal Muscle   Skin    Smooth Muscle   Spinal cord
    HHAT      8.15    7.7               5       6.55            6.4
    INPP5D    19.65   5.95              4.55    5.25            14.5
    NDUFA10   441.8   160.2             24.9    188.85          158.75
    RPS6KA1   85.2    47.75             46.45   35.85           44.55
    RYBP      20.45   13.05             11.95   20.7            17.75
    SLC16A1   15.45   20.45             12.2    248.35          27.15

    Ensembl Gene ID   !Symbol
    ENSG00000252303   RNU6-280P
    ENSG00000280584   OBP2B
    ENSG00000280680   HHAT
    ENSG00000280775   RNA5SP136
    ENSG00000280820   LCN1P1
    ENSG00000280963   SERTAD4-AS1

Shell Flavors
• Syntax (for scripting) depends on the shell
    echo $SHELL    # /bin/bash (on tak)
• bash is common and the default on tak.
• Some Unix shells (incomplete listing):
    Shell   Name
    sh      Bourne
    bash    Bourne-Again
    ksh     Korn shell
    csh     C shell

Shell script advantages
• Automation: avoid having to retype the same commands many times
• Ease of use and more efficient
• Outline of a script:
    #!/bin/bash     shebang: tells the system which interpreter runs the script
    commands…       set of commands used in the script
    #comments       write comments using "#"
• Commonly used extension for script is .sh (eg. foo.sh), file must have executable permission

Bash Shell: 'for' loop
• Process multiple files with one command
• Reduce computational time with many cluster nodes
    for mySam in `/bin/ls *.sam`
    do
        bsub wc -l $mySam
    done
  When referring to a variable, $ is needed before the variable name ($mySam), but $ is not needed when defining it (mySam).
  Identical one-line command:
    for samFile in `/bin/ls *.sam`; do bsub wc -l $samFile; done

Shell script example
    #!/bin/sh
    # 1.
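
The 'for' loop and the script outline above can be combined into a small standalone script. This is only a minimal sketch and not the truncated example from the slides: the function name count_sam_lines and the directory argument are hypothetical, and it runs wc locally instead of submitting each job with bsub, on the assumption that no cluster scheduler is available. It also loops over a shell wildcard rather than the output of ls, which keeps file names containing spaces intact.

```shell
#!/bin/bash
# Sketch only: report the number of lines in every .sam file in a directory.
# count_sam_lines is a hypothetical name chosen for this example.
count_sam_lines() {
    local dir="${1:-.}"   # directory to scan; defaults to the current one
    local samFile
    for samFile in "$dir"/*.sam
    do
        # If nothing matches, the pattern stays literal; skip it.
        [ -e "$samFile" ] || continue
        # Print file name and line count, separated by a tab.
        printf '%s\t%s\n' "$(basename "$samFile")" "$(wc -l < "$samFile" | tr -d ' ')"
    done
}

# Run on the directory given as the first argument (or the current directory).
count_sam_lines "${1:-.}"
```

Saved as, for example, countSam.sh and made executable with chmod +x countSam.sh, it could be run as ./countSam.sh myDirectory. On a cluster, the wc call could presumably be prefixed with bsub as in the slide, with the counts then arriving through the scheduler's job output rather than on the terminal.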