PGDSpider version 2.0.1.9 (August 2012)
An automated data conversion tool for connecting population genetics and genomics programs

Author: Heidi Lischer
Computational and Molecular Population Genetics lab (CMPG)
Institute of Ecology and Evolution (IEE)
University of Berne
3012 Bern
Switzerland
Member of the Swiss Institute of Bioinformatics (SIB)
e-mail: [email protected]
Download: http://cmpg.unibe.ch/software/PGDSpider/

Manual PGDSpider ver 2.0.1.9 28.09.2012

Contents

1 Introduction
2 Formats supported by PGDSpider
3 How to cite PGDSpider and License
4 System requirements
5 Installing PGDSpider
  5.1 Installation Instructions
  5.2 Java Web Start
6 Execute PGDSpider GUI
  6.1 Increase memory
  6.2 How to use the PGDSpider GUI
  6.3 SPID Editor
  6.4 Menus
  6.5 Shortcuts
  6.6 Log Output
7 Execute PGDSpider-cli
  7.1 Examples
8 Conversion examples
9 Reporting bugs and comments
10 File format descriptions
  10.1 PGD
  10.2 ARLEQUIN
  10.3 BAM
  10.4 BAPS
  10.5 BATWING
  10.6 BCF
  10.7 CONVERT
  10.8 FASTA
  10.9 FASTQ
  10.10 FDist2
  10.11 FSTAT
  10.12 GDA
  10.13 GENELAND
  10.14 GENEPOP
  10.15 GENETIX
  10.16 GESTE / BayeScan
  10.17 HGDP-CEPH
  10.18 Immanc and BayesAss
  10.19 IM/IMa
  10.20 IMa2
  10.21 KML
  10.22 MEGA
  10.23 MIGRATE
  10.24 MSA
  10.25 MSVar
  10.26 NewHybrids
  10.27 NEXUS
  10.28 PED
  10.29 PHYLIP
  10.30 SAM
  10.31 STRUCTURE
  10.32 VCF
11 PGDSpider Screenshots
12 References (Literature)

1 Introduction

PGDSpider is a powerful automated data conversion tool for population genetics and genomics programs. It facilitates data exchange between programs (Fig. 1) for a wide range of data types (e.g. DNA, RNA, NGS, microsatellite, SNP, RFLP, AFLP, multi-allelic data, allele frequency or genetic distances).

Besides the conventional population genetics formats, PGDSpider integrates population genomics data formats commonly used to store and handle next-generation sequencing (NGS) data. PGDSpider is currently not meant to convert very large NGS files, because it loads the whole input file into memory and the file size may exceed the available RAM. However, PGDSpider can convert specific subsets of these NGS files into any other format, so this feature can be used to calculate parameters or statistics for specific regions and thus to perform sliding-window analyses over large genomic regions.

PGDSpider uses a newly developed PGD (Population Genetics Data) format as an intermediate step in the conversion process. PGD is a file format designed to store various kinds of population genetics data, including different data types (e.g. DNA sequences, microsatellites, AFLP or SNPs) and ploidy levels. PGD is based on the XML format and is therefore independent of any particular computer system and extensible for future needs. PGDSpider uses PGD to connect population genetics and genomics programs like a spider knits a web.

PGDSpider is written in Java and is therefore platform independent. It is user