Discovering global biodiversity and biogeography using mega-databases

Mark J. Costello, University of Auckland, New Zealand

Why set up online biodiversity databases?

Provide quality-assured
• Information, e.g. GISD, WoRMS
• Data to other scientists, students and public

Compile content to facilitate new research (this workshop aims to promote this)

Opportunities

Databases drive standardisation
1. Enabling production of statistics
2. Comparison of patterns and trends
3. Unexpected discoveries
4. Testing of hypotheses
5. Calling for further database development

Use of the Global Biodiversity Information Facility

[Figure: number of publications using GBIF data, and % of publications using data of those that reference GBIF, by year, 2001 to 2010]

Issues

• Takes years for resource to be cited
• Years for people to start using resources
• More years for data to be used to do research not previously practical
• Quality control
• “Fit for (what) purpose?”

Species name problems over-estimate richness

Labrus bimaculatus Linnaeus, 1758

Labrus mixtus Linnaeus, 1758

Photos by Bernard Picton

Later discovered that these are male and female cuckoo wrasse. Both Latin names widely used.

Multiple descriptions!

Distinctive sperm whale Physeter macrocephalus Linnaeus, 1758 described as 19 different species!
• 3 times by Linnaeus 1758
• 2 x Borowski 1780
• 3 x Bonnaterre 1789
• 3 x Lacépède 1804
• 3 x Gray 1846, 1850, & 1856
• 1 x each by 5 authors: G. Cuvier, Kerr, Desmoulins, Fleming, Risso
+ re-described by more authors

Image from stamp collection Georges Declercq; accessed www.marinespecies.org

World Register of Marine Species (WoRMS)

www.marinespecies.org

o > 200 world experts directly validate taxonomy
o Each page and group has a standard citation; archived monthly
o Primary source of authoritative content for Species 2000’s Catalogue of Life (compiler of species names), Encyclopaedia of Life (content aggregator), OBIS and GBIF (publish species distribution data)
o 90% complete: > 210,000 of 230,000 species names
o Readers are welcome to check content for omissions and errors

ICZN ZooBank

Findings from WoRMS

• How many taxonomists are describing species?
• What is progress in rate of description of species?
• Is describing species irrelevant to conservation because of imminent mass extinction crisis?

Authors describing species

WoRMS

CoL non-marine

From: Costello, Wilson & Houlding 2011. Predicting total global species richness using rates of species description and estimates of taxonomic effort. Systematic Biology (online August )

WoRMS – marine species

CoL – terrestrial species

The average number of authors per species per year, indicating more taxonomic effort since the 1960s

‘et al.’ effect does not alter trend

WoRMS – marine species

CoL – terrestrial species

The number of species per author is decreasing, more so for terrestrial than marine species.

Maybe it is getting harder to discover new species?

Similar trend globally for terrestrial and marine species

CoL – land species

WoRMS – marine species

Continuing discovery of new species – is there no end in sight?

Discovery of biodiversity

Number of marine (blue) and terrestrial species described per year

Description vs extinction

Species description rate: 16,000 species p.a. or 160,000 per decade

Species extinction rate
• if 1% per decade*: 20,000 species extinct per decade if 2 million species; 80,000 if 8 million species
• if 0.1% per decade: 2,000 if 2 million species; 8,000 if 8 million species
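The comparison on this slide is simple arithmetic, and a short sketch makes it easy to try other assumptions. The rates and species totals below are the ones quoted on the slide; the function and constant names are illustrative only.

```python
# Illustrative arithmetic only: rates and totals are those quoted on the slide.
DESCRIBED_PER_YEAR = 16_000


def described_per_decade(per_year=DESCRIBED_PER_YEAR):
    """Species newly described in ten years at a constant annual rate."""
    return per_year * 10


def extinct_per_decade(total_species, percent_per_decade):
    """Species lost in ten years at a given percentage loss per decade."""
    return round(total_species * percent_per_decade / 100)


# Print the four scenarios from the slide next to the description rate.
for total in (2_000_000, 8_000_000):
    for pct in (1.0, 0.1):
        lost = extinct_per_decade(total, pct)
        print(
            f"{total:,} species at {pct}% per decade: "
            f"{lost:,} extinct vs {described_per_decade():,} described"
        )
```

Under these figures, the number of species described per decade exceeds the number lost in all four scenarios.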

* See Stork 2010. Biodiversity and Conservation 19, 357.

Marine species distribution database: Ocean Biogeographic Information System (OBIS)

Databases centred on
• Taxonomic group (literature sources)
• Field surveys (benthos, plankton, observations)
• Fishery surveys
• Museum collections

Datasets:
• Global
• Regional
• National
• Local

Habitats
• Seabed, seashores to deep: 39% of datasets
• Plankton: 17%
• Several habitats: 44%

A community effort in online publication of primary data

Global surveys

SAHFOS CPR zooplankton and phytoplankton

NODC plankton

BioOcean (deep-sea)

Global collections

Atlantic Reference Centre (HMSC)

Canadian Museum of Nature

Southampton Oceanography Centre mid-water collections

ZooGene

Global syntheses (1)

CephBase

FishBase

Hexacorallia (anemones +)

MicroBIS

Regional: NW Atlantic

ECNSAP

SE USA invertebrate Collection

ACCDC DFO DFO Atlantic fisheries

EAISSNA E Canada benthic macroinvertebrates

Regional: Pacific

Bishop Museum, Hawaii

NIWA, New Zealand fisheries

Mammals, birds, reptiles

Birds

Mammals

Reptiles

Molluscs

Gastropods (nudibranchs, snails)

Cephalopods (squids, octopuses, cuttlefish)

Bivalves (clams+)

Data by depth

< 100 m depth

100 – 1,000 m depth

> 1,000 m depth

Number of distribution records in OBIS (5-degree cells)

Species richness

65,000 species in all
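The richness maps here carry the label ES50, which the slides do not define. Assuming it has its usual meaning, the expected number of species in a random sample of 50 records (Hurlbert's rarefaction index), it could be computed per cell roughly as follows. The function name and the example counts are hypothetical.

```python
from math import comb


def expected_species(counts, n=50):
    """Hurlbert's expected number of species in a random subsample.

    counts: number of records per species in one cell.
    n: subsample size (50 for ES50).
    ES(n) = sum over species i of 1 - C(N - N_i, n) / C(N, n),
    where N is the total number of records and N_i the records of species i.
    """
    total = sum(counts)
    if total < n:
        raise ValueError("fewer records than the subsample size")
    denom = comb(total, n)
    # comb(a, n) is 0 when a < n, so a species with many records contributes 1.
    return sum(1 - comb(total - c, n) / denom for c in counts)


# Hypothetical cell: 3 common species and 20 species with 2 records each.
print(expected_species([100, 60, 40] + [2] * 20))
```

Because every cell is standardised to the same sample size, such an index allows cells with very different sampling effort to be compared.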

ES50

Most species are geographically rare (endemic)

• Marine species
. 90% of species in < 3 seas
. 48% in only 1 sea

• Most widespread
. Of the 100 most widespread species 93% are pelagic = 25% microscopic plankton + 70% mega-fauna (fish, birds, mammals, turtles)
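Figures such as "90% of species in fewer than 3 seas" can be derived from occurrence records by counting the distinct seas in which each species was recorded. A minimal sketch, assuming records are (species, sea) pairs; the record layout, function names and example data are hypothetical and are not the OBIS schema.

```python
from collections import defaultdict


def seas_per_species(records):
    """Map each species to the set of seas it was recorded in."""
    seas = defaultdict(set)
    for species, sea in records:
        seas[species].add(sea)
    return seas


def percent_in_fewer_than(records, max_seas):
    """Percentage of species recorded in fewer than max_seas seas."""
    seas = seas_per_species(records)
    if not seas:
        return 0.0
    narrow = sum(1 for found_in in seas.values() if len(found_in) < max_seas)
    return 100 * narrow / len(seas)


# Hypothetical records; repeated sightings in one sea count once.
records = [
    ("sp_a", "Red Sea"), ("sp_a", "Red Sea"),
    ("sp_b", "Red Sea"), ("sp_b", "Bass Strait"),
    ("sp_c", "Red Sea"), ("sp_c", "Bass Strait"), ("sp_c", "Kattegat"),
    ("sp_d", "Kattegat"),
]
print(percent_in_fewer_than(records, 3))  # species in fewer than 3 seas
print(percent_in_fewer_than(records, 2))  # species in only 1 sea
```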

Cluster analysis

[Figure: dendrogram clustering seas by species similarity. Leaf labels are named seas (e.g. Solomon Sea, Bass Strait, Kattegat, Gulf of Riga); cluster labels include Indian & Pacific oceans and Atlantic & Polar.]

Note log 10 scale

Red lines indicate no significant difference in species similarity between seas (i.e. same biogeographic )

Cluster analysis by 5° cells
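Working by 5-degree cells, as in the record-count map and this cluster analysis, means first assigning each occurrence record to a grid cell. A minimal sketch, assuming records are (latitude, longitude) pairs in decimal degrees; the indexing scheme and function names are illustrative and not taken from OBIS.

```python
from collections import Counter

CELL_DEG = 5  # cell size in degrees, as on the slides


def cell_index(lat, lon, size=CELL_DEG):
    """Return the (row, col) of the grid cell containing a point.

    Rows count northwards from 90 S and columns eastwards from 180 W.
    Points exactly on the northern or eastern edge fall in the last cell.
    """
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError("latitude or longitude out of range")
    row = min(int((lat + 90) // size), 180 // size - 1)
    col = min(int((lon + 180) // size), 360 // size - 1)
    return row, col


def records_per_cell(records):
    """Count occurrence records in each grid cell."""
    return Counter(cell_index(lat, lon) for lat, lon in records)


# Hypothetical records: two fall in one cell, one in the cell to the south.
print(records_per_cell([(1.0, 1.0), (2.0, 3.0), (-1.0, 1.0)]))
```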

Data set too large for statistical analysis of clusters. Groups of cells must be contiguous (coherent) to form a ‘region’. Distinguished 30 realms (28 fully marine).

Opportunity to compare species distributions to environment, e.g. bathymetry

Seabed slope

Seamount locations

Scale with slopes exaggerated

From: Costello MJ, Cheung A, De Hauwere N. 2010. Envir Sci Technol 44, 8821-8828.

Annual average chlorophyll production

Predicted catch change

Cheung et al. 2009. Large-scale redistribution of maximum fisheries catch potential in the global ocean under climate change. Global Change Biology.

Abundance at locations * temperature, salinity, ice cover, depth, coral reef, estuary, seamount, upwelling, ocean advection, larval duration

Further reading

Costello MJ, Wilson SP, Houlding B. 2011. Predicting total global species richness using rates of species description and estimates of taxonomic effort. Systematic Biology (online).

Costello MJ, Tsai P, Wong PS, Cheung A. Global biogeography of marine species richness, rarity, and endemicity in relation to geographic area and topographic variation. Submitted for publication.

Costello MJ, Wilson SP. 2011. Predicting the number of known and unknown species in European seas using rates of description. Global Ecology and Biogeography 20, 319-330. [with a review of methods to estimate global species richness]

Costello MJ, Cheung A, De Hauwere N. 2010. Topography statistics for the surface and seabed area, volume, depth and slope, of the world’s seas, oceans and countries. Environmental Science and Technology 44, 8821-8828.

Costello MJ, Coll M, Danovaro R, Halpin P, Ojaveer H, Miloslavich P. 2010. A census of marine biodiversity knowledge, resources and future challenges. PLoS ONE 5(8): e12110.

Wilson SP, Costello MJ. 2005. Predicting future discoveries of European marine species by using a non-homogeneous renewal process. Applied Statistics 54 (5), 897-918.

Costello MJ, Emblow CS, Picton BE. 1996. Long term trends in the discovery of marine species new to science which occur in Britain and Ireland. Journal of the Marine Biological Association of the United Kingdom 76, 255-257.

Species described p.a. vs all described

Species are being described at a rate roughly proportional to the number of species already known

Excluding insects and flowering plants

To avoid ‘et al.’ effect, can look at distinct first authors only

Still clear increase in number of taxonomists over years

Species/distinct first author

Still decreasing species/taxonomist

Discovery of higher taxa in WoRMS

Phyla

Classes

Orders

Families

Speciation requires
- time
- water
- space

Ocean is where life started, is the largest habitat on Earth, and has the most phyla and classes of life.

75% of ocean area and 90% of ocean volume is at 3,000-6,000 m depth

So we expect more species in the oceans than on land?