From Barcoding to Metagenomics: Molecular Identification Techniques for Ecological Studies of Endangered Primates Amrita Srivath
FROM BARCODING TO METAGENOMICS: MOLECULAR IDENTIFICATION TECHNIQUES FOR ECOLOGICAL STUDIES OF ENDANGERED PRIMATES

AMRITA SRIVATHSAN
B. Sc. (Hons.)

A THESIS SUBMITTED FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

DEPARTMENT OF BIOLOGICAL SCIENCES
NATIONAL UNIVERSITY OF SINGAPORE
AND
DEPARTMENT OF LIFE SCIENCES
IMPERIAL COLLEGE LONDON

2014

DECLARATION

I hereby declare that this thesis is my original work. I have duly acknowledged all the sources of information which have been used in the thesis.

The copyright of this thesis rests with the author and is made available under a Creative Commons Attribution Non-Commercial No Derivatives licence. Researchers are free to copy, distribute or transmit the thesis on the condition that they attribute it, that they do not use it for commercial purposes and that they do not alter, transform or build upon it. For any reuse or redistribution, researchers must make clear to others the licence terms of this work.

___________________
Amrita Srivathsan
31 July 2014

“I asked the question for the best reason possible, for the only reason, indeed that excuses anyone for asking any question - simple curiosity.” – Oscar Wilde

ACKNOWLEDGEMENTS

My most sincere gratitude goes to a number of people whose contributions were invaluable during my PhD studies:

Prof. Rudolf Meier, it seems only fitting to begin this with Wilde, who, through you, had a heavy hand in the first part of the thesis. Thank you. For nurturing scientific thinking in me, for being the most supportive supervisor one can imagine, and for a lot of reasons that would be difficult to list. Your immense knowledge and ideas in many different aspects of this field lead to discussions that give me perspective on the questions we are asking and always leave me instigated about research. I am fortunate to have an opportunity to be supervised by you.

Prof. Alfried Vogler, for hosting me in the Natural History Museum, which was my first time in such an environment, and it was a great experience. Thanks a lot for your extremely valuable suggestions, for giving me access to facilities that were critical for this thesis, for teaching me how scientific writing is done and most importantly showing me how to think big and work towards it. It is truly inspiring, and I learnt a lot about genomics and molecular systematics during my stay in NHM.

To both my supervisors I am really grateful for your support during the sudden change in timeline, so that the thesis could be put together.

Andie Ang, without whom this project would not be possible. For collecting the samples, for plant collections, for molecular work, for data validation and for monkey-talk.

John Sha for providing the data for retention time for douc langurs.

Members of Singapore Zoological Gardens, for all their help with the feeding trials for douc langurs.

Members of Nee Soon Swamp Forest survey team and Mirza Rifqi Ismail for collecting plant samples, and vouchering them.

Teo Li Young and Tay Ywee Chieh for their help with sequencing plant barcodes.

Chong Kwek Yan for his insights into plant community in Nee Soon.

Simon Burbidge and Peter Foster, the people behind the scenes managing the servers where this work was done. I am fairly sure that I have spent more time with these servers than people in the last two years and they have never left me frustrated.

AITBiotech members who generated the Next Generation Sequencing data for this project.

NUS for funding my studies through President’s Graduate Fellowship and for funding travel to and from London, Illumina for funding the MiSeq runs, Ministry of Education and National Parks Board for funding the project.

Members of two labs: From NUS, Jayanthi, for tea breaks and giving amazing company here, you will be dearly missed. Sujatha, who taught me how to run my first PCR and who was there and back again.
Lei, my batchmate who is writing this along with me, your support during writing helped me a lot. Kathy, for all the coffee breaks and bearing with random statements from my corner and brainstorming with me. Yuchen, for never failing to entertain, and helping with formatting this. Darren and Youguang for helping me with crosschecking and forms. Denise, Wing Hing, Shiyang, Mindy, Youguang, Ywee Chieh, Diego, Gwynne, Gowri, Jinfa, Li Young, Bilge, Amy, you all made lab a really great place. From NHM, Chris, for sharing your pipeline that made the databases happen, Alex and Martijn, for sharing your ideas and valuable discussions, Ben, Kirsten, Nicole, Debora, Martin, Samia, Carmelo, Conrad, Paula, for your great company.

A number of friends, in particular, Shweta (whose gift of white tea was a constant company during writing), Akshat, Manali, Shefali, Aishwarya, Shy, Anupama, Janani, Eli, Seetha and Souvik, thanks for putting up with me especially through this last year of strange level of communication.

Anjali Ma’am and Hindustani greats for keeping me sane, even if momentarily.

Amma, Appa, and Atreya, for being great role models in life and academia and for being my backbone throughout.

TABLE OF CONTENTS

SUMMARY ... xi
List of Figures ... xiii
List of tables and appendices ... xv
List of publications ... xvii

CHAPTER 1 General Introduction ... 1

CHAPTER 2 On the inappropriate use of Kimura-2-parameter (K2P) divergences in the DNA barcoding literature ... 13
  2.1 Abstract ... 13
  2.2 Introduction ... 14
  2.3 Materials and Methods ... 17
  2.4 Results and Discussion ... 19
  2.5 Conclusions ... 25

CHAPTER 3 An update on DNA Barcoding: Low species coverage and an increasing number of unidentified barcodes ... 26
  3.1 Abstract ... 26
  3.2 Introduction ... 28
  3.2 Materials and Methods ... 31
  3.3 Results and Discussion ... 33
    3.3.1 Species coverage: Metazoa ... 33
    3.3.2 Species coverage: BOLD campaign taxa ... 36
    3.3.3 Unidentified vs. Identified sequences ... 38
  3.4 Conclusions ... 41
  3.5 An update ... 42

CHAPTER 4 The databases for diet and parasite analyses: barcoding the Nee Soon Swamp forest and the bioinformatic retrieval of barcode sequences from GenBank ... 43
  4.1 Abstract ... 43
  4.2 Introduction ... 45
  4.3 Methods ... 49
    4.3.1 Local databases ... 49
    4.3.2 Data mining from GenBank ... 50
    4.3.3 rDNA databases ... 52
  4.4 Results ... 53

CHAPTER 5 Comparing the effectiveness of metagenomics and metabarcoding for diet analysis of a leaf-feeding monkey (Pygathrix nemaeus) ... 60
  5.1 Abstract ... 60
  5.2 Introduction ... 62
  5.3 Materials and Methods ... 65
    5.3.1 Diet composition ... 65
    5.3.2 Sample preparation and Next Generation Sequencing ... 65
    5.3.3 Diet database ... 67
    5.3.4 Plant database ...