HMMER User's Guide
Biological sequence analysis using profile hidden Markov models
http://hmmer.org/
Version 3.1b2; February 2015
Sean R. Eddy, Travis J. Wheeler and the HMMER development team

Copyright (C) 2015 Howard Hughes Medical Institute. Permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and this permission notice are retained on all copies.

HMMER is licensed and freely distributed under the GNU General Public License version 3 (GPLv3). For a copy of the License, see http://www.gnu.org/licenses/.

HMMER is a trademark of the Howard Hughes Medical Institute.

Contents

1 Introduction
    How to avoid reading this manual
    How to avoid using this software (links to similar software)
    What profile HMMs are
    Applications of profile HMMs
    Design goals of HMMER3
    What’s new in HMMER3.1
    What’s still missing in HMMER3.1
    How to learn more about profile HMMs

2 Installation
    Quick installation instructions
    System requirements
    Multithreaded parallelization for multicores is the default
    MPI parallelization for clusters is optional
    Using build directories
    Makefile targets
    Why is the output of 'make' so clean?
    What gets installed by 'make install', and where?
    Staged installations in a buildroot, for a packaging system
    Workarounds for some unusual configure/compilation problems

3 Tutorial
    The programs in HMMER
    Supported formats
    Files used in the tutorial
    Searching a protein sequence database with a single protein profile HMM
        Step 1: build a profile HMM with hmmbuild
        Step 2: search the sequence database with hmmsearch
    Single sequence protein queries using phmmer
    Iterative protein searches using jackhmmer
    Searching a DNA sequence database
        Step 1: Optionally build a profile HMM with hmmbuild
        Step 2: search the DNA sequence database with nhmmer
    Searching a profile HMM database with a query sequence
        Step 1: create an HMM database flatfile
        Step 2: compress and index the flatfile with hmmpress
        Step 3: search the HMM database with hmmscan
    Creating multiple alignments with hmmalign

4 The HMMER profile/sequence comparison pipeline
    Null model
    MSV filter
    Biased composition filter
    Viterbi filter
    Forward filter/parser
    Domain definition
    Modifications to the pipeline as used for DNA search
        SSV, not MSV
        There are no domains, but there are envelopes
        Biased composition

5 Tabular output formats
    The target hits table
    The domain hits table (protein search only)

6 Some other topics
    How do I cite HMMER?
    How do I report a bug?
    Input files
        Reading from a stdin pipe using - (dash) as a filename argument

7 Manual pages
    alimask - Add mask line to a multiple sequence alignment
        Synopsis
        Description
        Options
        Options for Specifying Mask Range
        Options for Specifying the Alphabet
        Options Controlling Profile Construction
        Options Controlling Relative Weights
        Other Options
    hmmalign - align sequences to a profile HMM
        Synopsis
        Description
        Options
    hmmbuild - construct profile HMM(s) from multiple sequence alignment(s)
        Synopsis
        Description
        Options
        Options for Specifying the Alphabet
        Options Controlling Profile Construction
        Options Controlling Relative Weights
        Options Controlling Effective Sequence Number
        Options Controlling Priors
        Options Controlling E-value Calibration
        Other Options
    hmmconvert - convert profile file to a HMMER format
        Synopsis
        Description
        Options
    hmmemit - sample sequences from a profile HMM
        Synopsis
        Description
        Common Options
        Options Controlling What to Emit
        Options Controlling Emission from Profiles