Treefinder Manual
Total Page:16
File Type:pdf, Size:1020Kb
TREEFINDER MANUAL - Version of October 2008 - Gangolf Jobb Email: [email protected] TREEFINDER computes phylogenetic trees from molecular sequences. The program infers even large trees by maximum likelihood under a variety of models of sequence evolution. It accepts both nucleotide and amino acid data and takes rate heterogeneity into account. Separate models can be assumed for user-defined data partitions, separate rates, separate edge lengths, separate character compositions. All parameters can be estimated from the data. Tree search can be guided by user-supplied topological constraints and start trees. There is a model proposer to propose appropriate models of sequence evolution. Confidence of inferred trees can be assessed by parametric and non-parametric bootstrapping, various paired-sites tests (ELW, BP, KH, SH, WSH, AU), also in combination with information criteria, edge support by LR-ELW. There are tools to manipulate trees and sequence data, visualize base frequencies, compute consensus and distance trees, count topologies. Confidence limits and other statistics can be obtained for all numerical results. The program can estimate divergence times by rate smoothing. A graphical user interface makes the use of TREEFINDER very intuitive. Data files and reconstruction parameters can be chosen interactively, results will be displayed on the screen. TREEFINDER is programmable in the functional and system independent programming language TL, which offers a wide range of functionality and enables the researcher to create problem specific applications. Parts of the software are written in TL and can be modified by the user. CONTENTS CONTENTS........................................................................................................................................... 2 SYSTEM REQUIREMENTS............................................................................................................... 5 DOWNLOAD ........................................................................................................................................ 5 INSTALLATION .................................................................................................................................. 5 WINDOWS......................................................................................................................................... 5 LINUX ................................................................................................................................................ 6 MAC OS X ......................................................................................................................................... 7 HEAT PROBLEMS .............................................................................................................................. 9 DATA FORMATS............................................................................................................................... 10 SEQUENCES ................................................................................................................................... 10 TREEFINDER SEQUENCE FORMAT ........................................................................................ 10 PHYLIP SEQUENCE FORMAT................................................................................................... 11 NEXUS SEQUENCE FORMAT................................................................................................... 12 FASTA SEQUENCE FORMAT..................................................................................................... 13 DATA PARTITIONS ..................................................................................................................... 13 DISTANCES..................................................................................................................................... 14 TREES .............................................................................................................................................. 14 REPORTS......................................................................................................................................... 16 THE GRAPHICAL USER INTERFACE......................................................................................... 17 THE COMMAND SHELL ...............................................................................................................17 THE TEXT EDITOR........................................................................................................................ 19 THE TREE VIEWER ....................................................................................................................... 19 SAVING TREES IN VARIOUS FORMATS................................................................................... 21 SAVING TIP-TO-TIP DISTANCES .............................................................................................. 22 EXTRACTING DATA FROM REPORTS...................................................................................... 22 REDRAWING, REROOTING AND MANIPULATING TREES.................................................... 23 CHANGING TOPOLOGY ...................................................................................................................................... 23 CHANGING HYPOTHESIS ORDER ..................................................................................................................... 24 CHANGING PAPER SIZE, ADDING DESCRIPTION AND HEADER ............................................................... 25 THE ADVANCED SEQUENCE SELECTION DIALOG............................................................... 25 ABORTING AND QUITTING ........................................................................................................ 26 TREE RECONSTRUCTION.............................................................................................................27 THE TREE RECONSTRUCTION DIALOG................................................................................... 27 MODEL EXPRESSIONS................................................................................................................. 28 SUBSTITUTION MODEL EXPRESSIONS .................................................................................. 28 2 PARTITION MODEL EXPRESSIONS ......................................................................................... 30 AVAILABLE SUBSTITUTION MODELS ..................................................................................... 32 AVAILABLE HETEROGENEITY MODELS................................................................................. 33 THE MAXIMUM LIKELIHOOD METHOD.................................................................................. 34 NUCLEOTIDE MODELS............................................................................................................. 34 EMPIRICAL PROTEIN MODELS ............................................................................................... 36 MIXED PROTEIN MODELS........................................................................................................ 36 REDUCED CHARACTER STATE MODELS............................................................................... 36 THE LIKELIHOOD SCORE......................................................................................................... 37 MISSING CHARACTERS ............................................................................................................. 37 AMONG-SITE RATE HETEROGENEITY.................................................................................... 38 DATA PARTITIONING ................................................................................................................ 39 PROPORTIONAL MODEL - PARTITION RATES............................................................................................... 40 GENERALIZED MODEL - PARTITION GROUPS............................................................................................... 40 MODEL SELECTION CRITERIA .................................................................................................. 41 AVAILABLE INFORMATION CRITERIA.................................................................................... 41 A SIMULATED INFORMATION CRITERION (SimIC) .............................................................. 42 SITEWISE INFORMATION CRITERIA ....................................................................................... 43 TREE SEARCH................................................................................................................................ 44 TREE SEARCH ALGORITHM ..................................................................................................... 44 START TREES .............................................................................................................................. 44 GLOBAL TREE SEARCH............................................................................................................. 44 EDGE SUPPORT................................................................................................................................ 45 BOOTSTRAP ANALYSIS............................................................................................................... 45 LR-ELW EDGE SUPPORT – APPROXIMATE BOOTSTRAPS................................................... 47 HYPOTHESIS TESTING .................................................................................................................