Page 1 of 62
DNA barcoding the Lepidoptera inventory of a large complex tropical conserved
wildland, Area de Conservacion Guanacaste (ACG), northwestern Costa Rica.
Daniel H. Janzen, Department of Biology, University of Pennsylvania, Philadelphia,
PA 19104 [email protected]
Winnie Hallwachs, Department of Biology, University of Pennsylvania,
Philadelphia, PA 19104 [email protected]
Genome Downloaded from www.nrcresearchpress.com by UNIVERSITY OF PITTSBURGH on 05/13/16 For personal use only. This Just-IN manuscript is the accepted prior to copy editing and page composition. It may differ from final official version of record. 1 Page 2 of 62
Abstract.
The 37 year ongoing inventory of the estimated 15,000 species of Lepidoptera living in
the 125,000 terrestrial hectares of Area de Conservacion Guanacaste, northwestern Costa
Rica, has DNA barcode documented 11,000+ species, and the simultaneous inventory of
at least 6,000+ species of wild caught caterpillars, plus 2,700+ species of parasitoids.
The inventory began with Victorian methodologies and species level perceptions, but it
was transformed in 2004 by the full application of DNA barcoding for specimen
identification and species discovery. This tropical inventory of an extraordinarily
species rich and complex multidimensional trophic web has relied upon the sequencing
services provided by the Canadian Centre for DNA Barcoding, and the informatics
support from BOLD, the Barcode of Life Data Systems, major tools developed by the
Centre for Biodiversity Genomics at the Biodiversity Institute of Ontario, and available to
all through couriers and the internet. As biodiversity information flows from these many
thousands of undescribed and often look alike species through their transformations to
usable product, we see that DNA barcoding, firmly married to our centuries old
morphology , ecology , microgeography and behavior based ways of taxonomizing the
wild world, has made possible what was impossible before 2004. We can now work with
all the species that we find, as recognizable species level units of biology. In this essay,
we touch on some of the details of the mechanics of actually using DNA barcoding in an
Genome Downloaded from www.nrcresearchpress.com by UNIVERSITY OF PITTSBURGH on 05/13/16 inventory.
Key words: conservation, taxasphere, barcode, parataxonomists, tropics.
For personal use only. This Just-IN manuscript is the accepted prior to copy editing and page composition. It may differ from final official version of record. 2 Page 3 of 62
Introduction.
Why conduct a biodiversity inventory of a large complex tropical conserved
wildland? This inventory began in 1978 as a purely academic question about who are the
caterpillars behind the hundreds of species of moths that came to a 25 watt light bulb in
1978 in the dry forest of the then Parque Nacional Santa Rosa, today’s Sector Santa Rosa
of Area de Conservacion Guanacaste (ACG) in northwestern Costa Rica (Janzen and
Hallwachs 2016; Janzen et al. 2009) (Fig. 1). While our focus (by personal interest and
ability) is on caterpillars and their adults, such an inventory necessarily includes the
caterpillars’ parasitoids and food plants, the fellow travelers in the ACG trophic web of at
least 375,000+ eukaryotic species that extends from Pacific coast dry forest to high
elevation cloud forest and down into Caribbean rain forest. This ACG inventory is only a
small piece of tropical biodiversity, but nonetheless includes tens of thousands of species.
This overview is offered by just its two authors, rather than co authored by the
literally hundreds of members of the taxasphere, DNA barcoding guild, biodiversity
conservation and management guild, and many others, who have been major and minor
contributors to the ACG inventory. This is quite simply because we do not have the
months of time required to circulate this general description and incorporate the multitude
Genome Downloaded from www.nrcresearchpress.com by UNIVERSITY OF PITTSBURGH on 05/13/16 of comments that such circulation will create. All of these persons have been and will be
coauthors and acknowledged collaborators of portions of the inventory over time.
FIG. 1 GOES HERE For personal use only. This Just-IN manuscript is the accepted prior to copy editing and page composition. It may differ from final official version of record. 3 Page 4 of 62
During the inventory’s 37 year ongoing travelogue, its technical purpose has
remained the same: to know what they are, where they are, what they do, how to find
them when you want them, and simultaneously, to get all of that information into the
public internet domain for all sectors of society to reference and use. There are tens of
thousands of ACG Lepidoptera, parasitoid and plant species to find and make some sense
of. But it has also necessarily evolved into addressing the broader social need of
conservation of wild areas as well. This inventory is meant to be a living public library
with open stacks. And we do this because such understanding and social integration is a
necessary component of a conserved tropical wildland that is to have some chance of
surviving indefinitely, albeit tattered and bruised, at relative peace with its society – local,
national and international (Janzen and Hallwachs 1994; Janzen 1988, 1991, 1998, 1999,
2000a,b; Geddes 2015).
The ACG inventory of caterpillars, adults, parasitoids and food plants is carried out by
a large team of on site Costa Rican parataxonomists (Janzen et al. 1993, 2009; Janzen
and Hallwachs 2011), who currently constitute a large portion of the “Programa de
Parataxónomos” of ACG (ACG 2016), which is jointly funded and in kind supported by
ACG and the Guanacaste Dry Forest Conservation Fund (GDFCF 2016). These two
Genome Downloaded from www.nrcresearchpress.com by UNIVERSITY OF PITTSBURGH on 05/13/16 conservation entities are in turn supported by a huge and institutionally diverse array of
financial and sweat equity supporters (see Acknowledgments). And this entire network
is directly supported, guided, encouraged, and stimulated by several hundred volunteers,
administrators, taxonomists, the taxasphere, institutions and others – neighboring, For personal use only. This Just-IN manuscript is the accepted prior to copy editing and page composition. It may differ from final official version of record. 4 Page 5 of 62
national and international. To detail this network and its activities will require a 500
page book that attempts description of a constantly shifting and diverse sociology,
economics, science and politics – you can never photograph the same river twice.
Here, in reaction to the 6th International Barcode of Life Conference, global biennial
symposium of the International Barcode of Life initiative held at the University of
Guelph in August 2015, represented by this special issue and others, we sketch a few of
the salient features and results associated with the insertion of DNA barcoding (Hebert et
al 2003; Janzen et al. 2005, 2009; Janzen and Hallwachs 2011) into the long running
ACG Lepidoptera inventory. We make no attempt to review and compare the results
with other on going tropical biodiversity inventories (e.g., Bassett et al. 2004, 2015;
Telfer et al. 2015; Miller 2015; Novotny and Miller 2014; Glover et al 2016), as that
would require an entire book length treatment. The most salient overarching result of
inserting DNA barcoding into the ACG inventory is to realize that it would be a seriously
debilitating oversight to conduct a species level arthropod biodiversity inventory of a
large complex tropical biological system without having DNA barcoding and its
anticipated offspring in the toolbox. Since Next Generation Sequencing (NGS) is still
being explored for this ACG inventory (e.g., Shokralla et al 2014; Gibson et al 2014,
2015), this entire commentary relates only to DNA barcodes obtained by Sanger
Genome Downloaded from www.nrcresearchpress.com by UNIVERSITY OF PITTSBURGH on 05/13/16 sequencing.
History.
For personal use only. This Just-IN manuscript is the accepted prior to copy editing and page composition. It may differ from final official version of record. 5 Page 6 of 62
In March and September 2003, at Cold Spring Harbor Laboratory, Long Island, New
York, we listened to Paul Hebert (Biodiversity Institute of Ontario, University of Guelph,
Canada) present and defend the concept of adding DNA barcoding for specimen
identification and species discovery to the taxasphere’s tool box (Hebert et al 2003;
Devan 2015), explicitly not for phylogenies or pushing more specimens into the museum
based taxasphere (defined in Janzen 1993). Through facilitation by the Smithsonian
Institution and John Burns, the ACG’s core skipper butterfly (Hesperiidae) taxonomist,
the inventory offered Hebert a leg from each of the 25 year accumulation of 484 oven
dried and museum spread reared vouchers of “ Astraptes fulgerator ”, a common, showy,
and well known roadside Central American butterfly (Fig. 2), to test drive the barcoding
concept with a horribly complex fauna, but yet one of the few areas of ACG butterfly
taxonomy assumed to be well understood. This ACG species was chosen because of the
both morphology and ecology based suspicion that it was well more than one species,
yet neither set of traits could consistently delineate these species among the ACG
specimens.
FIG. 2 HERE
This first example revealed ten sympatric species (now 11 through the addition of one
Genome Downloaded from www.nrcresearchpress.com by UNIVERSITY OF PITTSBURGH on 05/13/16 newly discovered by the inventory) of nearly identical but subtly variable adult butterflies
(Fig. 2) hiding inside one scientific name that had been happily used for 225 years; this
complex then sorted cleanly into barcode clusters that matched caterpillar color patterns,
food plant clusters, and subtle adult colors on both the upper side and underside of their For personal use only. This Just-IN manuscript is the accepted prior to copy editing and page composition. It may differ from final official version of record. 6 Page 7 of 62
wings, although the adults cannot be distinguished by the usual taxonomist’s key
character, their genitalia (Hebert et al 2004). This result was followed by finding 32
(now 39) ACG species of highly host specific tiny parasitic wasps (Braconidae,
Microgastrinae), previously thought to be a “generalist” parasitoid of Hesperiidae
caterpillars, that had been hiding under one morphology based name, Apanteles
leucostigmus (Smith et al 2008) for more than a century; these wasps have subsequently
been found to be distinguishable by overlooked morphological differences discovered by
a taxonomist minutely scrutinizing the wasps’ shape and relative sizes of body and leg
sculpture, color, and any other distinctive trait under the microscope, assisted by the
barcodes and host records (Fernandez Triana et al 2014). Each morphologically
distinguished wasp species has its own species of hesperiid caterpillar hosts. Equally, 16
species of “generalist” tachinid parasitoid flies were found by barcoding to be actually 72
species (mostly specialists), only nine of which were truly generalists (Smith et al 2006,
2007).
These results underlined the imperative of barcoding practically all of the inventory’s
voucher specimens that had been saved since 1978. A major accumulated result is that
the initial estimate of 9,500 species of ACG Lepidoptera has been revised to 15,000, and
the Canadian government, Biodiversity Institute of Ontario (BIO), Wege Foundation, and
Genome Downloaded from www.nrcresearchpress.com by UNIVERSITY OF PITTSBURGH on 05/13/16 other donors have now funded the DNA barcoding (Sanger sequencing) of tens of
thousands of 1 20 year old oven dried or ethanol preserved insect legs. The results have
driven much greater amounts of collecting, vouchering, and data gathering than For personal use only. This Just-IN manuscript is the accepted prior to copy editing and page composition. It may differ from final official version of record. 7 Page 8 of 62
originally anticipated by the inventory; simultaneously, the inventory is now far more
accurate and useful than it would have been.
In practical terms, an ACG insect leg is couriered to BIO, just as we used to ship our
rolls of K25 film to Kodak. And we get back a highly usable result – the barcode and its
analysis/use in BOLD (BOLD 2016; Ratnasingham and Hebert 2007) without really
understanding the guts of Kodak’s magic, nor having to sustain the administrative and
dollar costs and headaches of maintaining such a biodiversity processing center. It is
noteworthy that BIO and BOLD’s home University of Guelph is noted for its agricultural
research and practical crop breeding. DNA barcoding is, indeed, a kind of Crop Science.
The inventory has long found that with numerous very useful exceptions, some of which
are outlined here, the DNA barcodes match extremely well the morphology based
species level identifications that have been obtained through tens of thousands of hours
of taxasphere time already invested in the ACG inventory since 1978, and for centuries
before. But simultaneously the barcodes have revealed many hundreds of overlooked
species and undescribed species, and allowed the inventory itself to operate a “home
made” level of taxonomic operational accuracy that is generally ready to use decades
earlier than if the taxasphere were to be able to examine the specimens more formally,
describe them, and otherwise make use of them, if indeed it ever would or could given
Genome Downloaded from www.nrcresearchpress.com by UNIVERSITY OF PITTSBURGH on 05/13/16 the immense taxonomic challenge of tropical diversity.
An enormous array of inventory and ecological questions can be addressed with only
minimal and mutually iterative reinforcement of the initial input from the taxasphere. Is For personal use only. This Just-IN manuscript is the accepted prior to copy editing and page composition. It may differ from final official version of record. 8 Page 9 of 62
this wasp Braconidae or is it Ichneumonidae? Is this caterpillar Apateledidae or
Bombycidae, Erebidae or Noctuidae? Is this biology tagged with the family name
“Pyralidae” in 1900 the same entity as is tagged with “Pyralidae” today? With such
taxasphere guideposts in place, the inventory can, for example, majorly grasp the species
level community traits and social uses of 500+ species of formally unidentified crambid
moths, facilitate any crambid taxonomist, and simultaneously offer a massive amount of
collateral information to join with those same species that are today represented largely
by decades to centuries old museum specimens blessed with only its morphology, a date
of collection, place of collection and who caught it. The potential for inventory
taxasphere mutualisms is enormous.
Within 1 6 months of the leg’s submission to BIO there was (and still is) a DNA
barcode available to the inventory in BOLD, with its within BOLD massive out bound
and in house comparability, for us to join with what the bioinventory is telling us about
food plant, parasitoids, appearance, and micro within ACG location, and simultaneously
to gift this information outwards to other biodiversity users. As a tiny, but illustrative
example of the latter, the species behind the newly designated 20 patronyms for the Costa
Rican school children who are winners of an ACG Area de Conservacion Tempisque
conservation drawing contest (Kazmier 2015a) were all located by DNA barcoding of
Genome Downloaded from www.nrcresearchpress.com by UNIVERSITY OF PITTSBURGH on 05/13/16 ACG inventory Malaise trapped wasps and their simultaneous morphology based generic
placement and corroboration; the wasps are minute and hardly noticeable braconid
microgastrine ACG parasitic Promicrogaster (Fernandez Triana et al. 2016); all 150+
members of the ACG staff have an Apanteles patronym that is most easily recognized by For personal use only. This Just-IN manuscript is the accepted prior to copy editing and page composition. It may differ from final official version of record. 9 Page 10 of 62
its barcode and corroborated by its morphology, placement in by classical key, host
caterpillar, and microgeographic habitat (Fernandez Triana 2014a); even the President of
Costa Rica has his very own unique Pseudapanteles luisguillermosolisi (Fernandez
Triana et al 2014b; Kazmier 2015b).
While the ACG barcoding began with barcoding adults reared from wild caught
caterpillars, and their reared parasitoids (Janzen et al 2009), BIO’s insatiable craving for
larger and more complex (largely tropical and non Canadian) barcoding projects led to
Hebert’s 2006 offer to also barcode the ACG wild caught adult Lepidoptera as a way to
speed a “total” inventory and demonstrate that DNA barcoding “works” for a large
taxonomically complex fauna. (This speeds the inventory because only about 30% of
ACG Lepidoptera has been found as caterpillars, despite 30 years of work by
parataxonomists.) At that time, the US National Science Foundation was unwilling to
support barcoding costs and Hebert’s funds were critically needed. Two very experienced
caterpillar focused parataxonomists were then repurposed to mentor apprentice
parataxonomists to become a 4 member team to blanket light trap ACG, expressly for
adults to be barcoded and museum deposited as vouchers. This team set out to re capture
the ACG moth fauna that had been light trapped from 1978 1990. In 2006, those
previous voucher specimens, now residing in INBio’s national insect survey collection
Genome Downloaded from www.nrcresearchpress.com by UNIVERSITY OF PITTSBURGH on 05/13/16 (Janzen 1971), were deemed to be too old for easy barcoding. This INBio collection is
now donated to, and melded into, the Museo Nacional de Costa Rica, and reinforces the
current ACG inventory as well as that of the entire country.
For personal use only. This Just-IN manuscript is the accepted prior to copy editing and page composition. It may differ from final official version of record. 10 Page 11 of 62
Today, the barcoding of rearing and light trapping combined has already tallied
11,000+ species of ACG Lepidoptera; the species area curve rises abruptly each time a
new locality and/or habitat or ecosystem subset is added to the inventory, and gradually
creeps upwards as more seemingly “known” specimens from well inventoried portions of
ACG are barcoded. More than 215,000 adult ACG Lepidoptera legs (and 55,000+
parasitoid legs) have been sent to BIO for attempted Sanger sequencing by the end of
2015.
Routine ACG barcoding, routine results.
There are major DNA barcoding facilitated inventory processes in the tortured and
multi path travelogue for the biodiversity in the forest to become biodiversity
understanding and information in someone’s brain. When a parataxonomist finds and
rears a caterpillar on its food plant (Janzen 2004; Janzen et al 2009, 2011), or selects an
adult from a light trap, the specimen’s on site database record (in a FMP (FileMaker Pro)
flat file) receives all the collateral available at that ACG biological station, including
best guess interim (or actual) species and higher taxon names. The same applies to food
plants and parasitoids when they are first perceived.
Genome Downloaded from www.nrcresearchpress.com by UNIVERSITY OF PITTSBURGH on 05/13/16 Along its early journey to BIO/BOLD (or elsewhere), and to the final institutional
adult voucher specimen residence, that specimen’s name may be upgraded one to many
times by a subsequent handler. After several months the BOLD generated and inventory
screened name is first applied to the record (by downloading from BOLD). It is often For personal use only. This Just-IN manuscript is the accepted prior to copy editing and page composition. It may differ from final official version of record. 11 Page 12 of 62
upgraded in later months, years and decades as it is worked over by the taxasphere (the
aggregate of taxonomists, museums, and their knowledge, Janzen 1993) and as it receives
input from the inventory itself. No taxonomist has necessarily had to directly bother with
this process or specimen until the time that the first BOLD results are obtained, though
the processing parataxonomists and taxonomists along this trip all occasionally logic
check the interim ID and may upgrade it, as based on morphology or food plants. In this
essay “morphology” includes color patterns as well as more strictly morphological traits.
At any time, the addition of a “better” name creates a pulse of confirmations and
explorations by the Principle Investigator (PI) within the bioinventory and/or the subset
of relevant records in BOLD, and often, new queries back to the taxasphere. This is
especially the case if there is discordance between the inventory database collateral and
the collateral carried by other specimens with that name. For example, when the
inventory says that Enyo ocypete (Sphingidae) caterpillars are invariably found feeding
on plants in the Dilleniaceae, Onagraceae, Vitaceae, Actinidiaceae (n = 4,577), and a
2015 record in BOLD lists Rubiaceae as the food plant for this moth, the inventory PI
knows that there is a biological or bookkeeping problem to be resolved. 98%+, but not
100%, of the time, the problem is found to be in bookkeeping. However, it is also the
case that mother moths can be sloppy in oviposition or simply have oviposition
Genome Downloaded from www.nrcresearchpress.com by UNIVERSITY OF PITTSBURGH on 05/13/16 “preferences” a bit broader than suggested by traditional expectations, and caterpillars
may even “wander”, especially if they have defoliated their food plant and have
“generalist” abilities.
For personal use only. This Just-IN manuscript is the accepted prior to copy editing and page composition. It may differ from final official version of record. 12 Page 13 of 62
For a simple illustrative example of the many possible barcode mediated interactions
between the inventory and the taxasphere, tachinid flies DHJPAR0058284 and
DHJPAR0058285, reared in 1986 and 1989 respectively, were both initially named
tachJanzen01 Janzen01 (general genus