Inadvertent Paralog Inclusion Drives Artifactual Topologies and Timetree Estimates in Phylogenomics

Karen Siu-Ting,*,1,2,3 María Torres-Sánchez,‡,4 Diego San Mauro,4 David Wilcockson,5 Mark Wilkinson,6 Davide Pisani,7 Mary J. O’Connell,8,9 and Christopher J. Creevey*,1

1Institute for Global Food Security, School of Biological Sciences, Queen’s University Belfast, Belfast, United Kingdom
2School of Biotechnology, Dublin City University, Glasnevin, Dublin, Ireland
3Dpto. de Herpetología, Museo de Historia Natural, Universidad Nacional Mayor de San Marcos, Lima, Peru
4Department of Biodiversity, Ecology, and Evolution, Complutense University of Madrid, Madrid, Spain
5Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, United Kingdom
6Department of Life Sciences, Natural History Museum, London, United Kingdom
7Life Sciences Building, University of Bristol, Bristol, United Kingdom
8School of Biology, Faculty of Biological Sciences, University of Leeds, Leeds, United Kingdom
9School of Life Sciences, University of Nottingham, University Park, United Kingdom
‡Present address: Department of Neuroscience, Spinal Cord and Brain Injury Research Center and Ambystoma Genetic Stock Center, University of Kentucky, Lexington, KY

Downloaded from https://academic.oup.com/mbe/article-abstract/36/6/1344/5418531 by University of Leeds Medical & user on 10 June 2019

*Corresponding authors: E-mails:
[email protected];
[email protected].
Associate editor: Fabia Ursula Battistuzzi

Abstract

Increasingly, large phylogenomic data sets include transcriptomic data from nonmodel organisms. This has not only allowed controversial and unexplored evolutionary relationships in the tree of life to be addressed but also increases the risk of inadvertent inclusion of paralogs in the analysis. Although this may be expected to result in decreased phylogenetic support, it is not clear if it could also drive highly supported artifactual relationships.