Expanding Larval Fish DNA Metabarcoding to All the Great Lakes

Erik Pilgrim, Sara Okum, John Martinson, Joel Hoffman, Greg Peterson, Julie Lietz & Chelsea Hatzenbuhler
U.S. EPA-Cincinnati/U.S. EPA-Duluth

Office of Research and Development
National Exposure Research Laboratory

Genetic Monitoring for Invasives

• Two main investigative pathways:
  – 1) Targeting particular invaders with developed biomarkers (most eDNA work)
    • Advantage: sensitivity
    • Disadvantage: works only for targeted species
  – 2) Community profiles based on genetic data (DNA metabarcoding)
    • Advantage: ability to detect ‘foreign’ DNA
    • Disadvantage: IDs are only as good as the reference database

DNA Barcoding / DNA Metabarcoding

[Figure: Mixed environmental sample -> bulk DNA isolation -> PCR -> next-generation sequencing (500K to 12 million sequences) -> full community profile of the sample]

2012 Larval Fish Study Design

I. Probabilistic design
II. Field sampling (larval beach seine, light trap, Tucker trawl, neuston net)
III. Morphological
Reconstitute sample
IV. Molecular taxonomy

3 cycles
- ~50 stations/cycle
- 1 sample (gear)/station
- early April (1)
- mid-May (2)
- late June (3)

Duluth Harbor Larval Fishes

• Fish species (31) detected from larval samples through DNA metabarcoding in 2012
• (~40 species occur)
• Including 8 non-native
• Only failed to find 1 species
• Uncovered 4 other species

Common Name              Scientific Name
Brook Silverside*        Labidesthes sicculus
Longnose Sucker          Catostomus catostomus
White Sucker             Catostomus commersonii
Redhorse                 Moxostoma breviceps/macrolepidum
Rock Bass                Ambloplites rupestris
Pumpkinseed              Lepomis gibbosus
Smallmouth Bass          Micropterus dolomieu
Black Crappie            Pomoxis nigromaculatus
Common Carp*             Cyprinus carpio
Common Shiner            Luxilus cornutus
Golden Shiner            Notemigonus crysoleucas
Emerald Shiner           Notropis atherinoides
Spottail Shiner          Notropis hudsonius
Mimic Shiner             Notropis volucellus
Fathead Minnow           Pimephales promelas
Longnose Dace            Rhinichthys cataractae
Creek Chub               Semotilus atromaculatus
Brook Stickleback        Culaea inconstans
Threespine Stickleback   Gasterosteus aculeatus
Round Goby*              Neogobius melanostomus
Tubenose Goby*           Proterorhinus semilunaris
White Perch*             Morone americana
Rainbow Smelt*           Osmerus mordax
Johnny Darter            Etheostoma nigrum
Eurasian Ruffe*          Gymnocephalus cernua
Yellow Perch             Perca flavescens
Logperch                 Percina caprodes
Walleye                  Sander vitreus
Trout-perch              Percopsis omiscomaycus
Cisco                    Coregonus artedi
Freshwater Drum*         Aplodinotus grunniens

*species not native to Lake Superior

Morphological- vs. NGS-based ID

                 Morph vs R1   Morph vs R2   R1 vs R2
Richness         32 vs 32*     32 vs 31**    32 vs 31
Common Species   28            27            31
% Agree          78%           75%           86%
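The % Agree row is consistent with dividing the number of common species by a pooled total of 36 species. The slides do not state the denominator, so 36 is an inference from the reported figures (for the two morphological comparisons it also equals the union of the two lists, 32 + 32 - 28 and 32 + 31 - 27). A minimal Python check under that assumption:

```python
# Assumption: % Agree = common species / pooled species total.
# The pooled total of 36 is inferred from the reported percentages;
# it is not stated anywhere in the slides.
POOLED_SPECIES = 36

# "Common Species" row from the comparison table.
common_species = {
    "Morph vs R1": 28,
    "Morph vs R2": 27,
    "R1 vs R2": 31,
}


def percent_agree(common, pooled=POOLED_SPECIES):
    """Return agreement as a whole-number percentage."""
    return round(100 * common / pooled)


agreement = {pair: percent_agree(n) for pair, n in common_species.items()}

for pair, pct in agreement.items():
    print(f"{pair}: {pct}%")
```

Under this assumption the computed values reproduce the 78%, 75%, and 86% shown in the table.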

• NGS found species morph ID did not!
  – Freshwater Drum (likely as eggs: 2/2 R1 w/eggs; 7/9 R2 w/eggs)
  – Brook Stickleback (one hit; morph confused with non-native Threespine Stickleback)
  – Walleye (likely as eggs: 2/2 w/eggs and no unmatched percids)
  – Creek Chub (morph ID one as hornyhead chub, one blacknose dace)
• NGS missed Bluegill and Pumpkinseed – why?

The ruffe conundrum

57 sites (DNA) vs. 3 sites (morph)

Eurasian ruffe (Gymnocephalus cernua)

Incomplete larval key:
• Myomere number in ruffe mimics centrarchids
• Overestimated sunfish and black crappie; underestimated ruffe

[Figure: ruffe larva and yellow perch larva]

[Figure: larvae at total length = 5.0 mm; panels: Ruffe, Black Crappie]

Take Home Message (in 2013)

• DNA metabarcoding applied to larval fish communities provides a new measure of invasive fish propagules.
• Next step: Expand sampling to more of the Great Lakes (2014- ).

Expanded Sampling

Updated Design/Workflow

• Most samples collected as tow
• Pick/sort larvae from tow sample
• Approximately 25 samples per site undergoing morphological ID
• Enter molecular workflow (DNA extraction -> PCR -> DNA Sequencing -> Analysis)

So, it’s 2016. Where the (bleep) is the data?

A word about technology...

Previous instrument (instrument discontinued; support ended early 2016):
• Purchased in 2010 for $750K
• Provides long reads (700+ bp)
• Costs about $10K/run
• Generates 300-500K sequences/run
• Maximum multiplex of 160 samples

New instrument:
• Purchased in 2015 for $100K
• Provides shorter reads (400-500 bp)
• Costs $1-1.5K/run
• Generates 12-20 million sequences/run
• Currently multiplexing up to 384 samples
• (can expand to approx. 1000 samples)
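The run costs, yields, and multiplex limits quoted for the two instruments can be turned into rough per-sample figures. The sketch below is illustrative only: it assumes every run is filled to the stated multiplex (160 and 384 samples) and that reads are spread evenly across samples, neither of which the slides claim. The `Instrument` class and its method names are hypothetical helpers, not part of any sequencing software.

```python
from dataclasses import dataclass


@dataclass
class Instrument:
    """Per-run figures as quoted on the slides; ranges are (low, high)."""
    name: str
    cost_per_run: tuple   # USD
    reads_per_run: tuple  # sequences
    max_samples: int      # samples multiplexed per run

    def cost_per_sample(self):
        # Assumes a full run, so cost is shared by max_samples samples.
        low, high = self.cost_per_run
        return low / self.max_samples, high / self.max_samples

    def reads_per_sample(self):
        # Assumes reads are distributed evenly across samples.
        low, high = self.reads_per_run
        return low // self.max_samples, high // self.max_samples


previous = Instrument("previous", (10_000, 10_000), (300_000, 500_000), 160)
new = Instrument("new", (1_000, 1_500), (12_000_000, 20_000_000), 384)

for inst in (previous, new):
    c_low, c_high = inst.cost_per_sample()
    r_low, r_high = inst.reads_per_sample()
    print(f"{inst.name}: ${c_low:.2f}-${c_high:.2f}/sample, "
          f"{r_low:,}-{r_high:,} reads/sample")
```

Under these assumptions the previous instrument works out to about $62.50 and 1,875 to 3,125 reads per sample, and the new one to roughly $2.60 to $3.91 and 31,250 to 52,083 reads per sample.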

Technological “Advancement”

• Positives:
  – New instrument costs less to purchase and run
  – Generates at least 40X the amount of sequence data as the previous instrument
  – Very responsive tech support
• Challenges:
  – Create and optimize new markers for the new instrument

Where are we now?

• Designed, optimized, and processed all samples for new 16S marker (300 bp)
  – Lots of bioinformatics and analyses to complete in the coming weeks
• Designed and optimized new COI marker (about half the standard barcode)
  – Starting to process these samples

Preliminary 16S Results

192 Sites
• St. Louis River/Duluth
• Chequamegon Bay
• Green Bay
• Maumee Bay
• Sandusky Bay

Alewife                  Logperch
Black Crappie            Longnose Sucker
Bluegill                 Mimic Shiner
Bluntnose Minnow         Muskellunge
Brook Silverside         Pumpkinseed Sunfish
Buffalo*                 Rainbow Smelt
Burbot                   Rock Bass
Carpsucker/Quillback*    Silver Redhorse
Common Carp              Smallmouth Bass
Common Shiner            Spot-tail Shiner
Coregonus spp.*          Trout-perch
Emerald Shiner           Tubenose Goby
Eurasian Ruffe           Walleye
Freshwater Drum          White Bass
Gizzard Shad             White Perch
Golden Shiner            White Sucker
Johnny Darter            Yellow Perch

Upcoming Work

• Comparisons, comparisons, and more comparisons
  – 16S against COI
    • Possible comparison against old instrument
  – Molecular results against morphological IDs
  – Diversity across and between lakes
  – Sampling period
• Share data and results with FWS and USGS collaborators

Acknowledgements

EPA Cincinnati: Sara Okum, John Martinson
EPA Duluth: Joel Hoffman, Anett Trebitz, Chelsea Hatzenbuhler, Julie Lietz, Greg Peterson, Christy Meredith
EPA GLNPO/Chicago: Jamie Schardt
USFWS: Stephen Hensler, Anjie Bowen, Tim Strakosh, Henry Quinlan
Dynamac: Barry Wiechman, Ana Braam, Xiao Song
