Stephen D. Larson
Implementing 'Big Open Data' in government through Open Collaboration - case examples and possibilities
Joint Data.Gov – Ontolog “Big Open Data” Session, 05/17/12

A multi-scale data problem

Whole brain data (20 µm microscopic MRI)

Mosaic LM images (1 GB+)

Conventional LM images

Individual cell morphologies

Scale EM volumes & reconstructions

Solved molecular structures

Framework of parts

Multi-scale synthesis in biology

Watson & Crick, 1953

Organizing frameworks for knowledge

Knowledge in words, terminologies and logical relationships (the “what”)

Past

Shared building blocks

[Figure: part-whole graph connecting Calbindin, Purkinje neuron, Cerebellum Purkinje cell (with its axon, dendrite and soma), Cerebellum granule cell layer, Cerebellum molecular layer, Cerebellum Purkinje cell layer and Cerebellar cortex through “Has part” and “Is part of” relations]
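The “Has part” and “Is part of” relations among the cerebellar terms above can be read as a small directed graph. The following is a minimal sketch, not the representation used by the author: the term names come from the slide, but the specific edges, the relation identifiers `has_part` and `is_part_of`, and the helper names are assumptions made for illustration, since the figure's layout did not survive.

```python
from collections import defaultdict

# Illustrative (subject, relation, object) triples. Term names are taken from
# the slide; the edges themselves are ASSUMED, not recovered from the figure.
TRIPLES = [
    ("Cerebellum Purkinje cell", "has_part", "Cerebellum Purkinje cell axon"),
    ("Cerebellum Purkinje cell", "has_part", "Cerebellum Purkinje cell dendrite"),
    ("Cerebellum Purkinje cell", "has_part", "Cerebellum Purkinje cell soma"),
    ("Cerebellum Purkinje cell soma", "is_part_of", "Cerebellum Purkinje cell layer"),
    ("Cerebellum Purkinje cell dendrite", "is_part_of", "Cerebellum molecular layer"),
    ("Cerebellar cortex", "has_part", "Cerebellum Purkinje cell layer"),
    ("Cerebellar cortex", "has_part", "Cerebellum molecular layer"),
    ("Cerebellar cortex", "has_part", "Cerebellum granule cell layer"),
]


def build_whole_index(triples):
    """Map each part to the set of wholes that directly contain it."""
    wholes = defaultdict(set)
    for subject, relation, obj in triples:
        if relation == "is_part_of":
            wholes[subject].add(obj)
        elif relation == "has_part":
            # "A has_part B" is treated as the inverse of "B is_part_of A".
            wholes[obj].add(subject)
    return wholes


def all_wholes(term, wholes):
    """Return every structure that contains `term`, directly or transitively."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for whole in wholes.get(current, ()):
            if whole not in seen:
                seen.add(whole)
                stack.append(whole)
    return seen


index = build_whole_index(TRIPLES)
for name in sorted(all_wholes("Cerebellum Purkinje cell soma", index)):
    print(name)
```

Storing both relation directions as plain triples and normalizing them at index time keeps the data close to how the terms read on the slide, while still allowing a transitive "what contains this?" query.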

Larson et al., 2007

Present

Organizing knowledge online

Top 25 accessed databases in NIF
1. Grants.gov / Opportunity
2. SumsDB / Activation Foci
3. CCDB / All Information
4. Antibody Registry / Abs
5. GENSAT / GENSAT
6. Research Crossroads / Grants
7. BrainInfo / Brain Region
8. Drug Related Gene Database
9. NIF Integrated Nervous System / Connectivity
10. RePORTER / CurrentNIHGrants
11. AllenInstitute / MouseBrainAtlas
12. OneMind / BioBanks
13. ClinicalTrials / ClinTr
14. OMIM / Genes
15. DrugBank / Drugs
16. ModelDB / Models
17. NIF Integrated Animals / Available
18. Gemma / Microarray
19. NIF Integrated Brain Gene / Expression
20. EntrezGene / NCBIGene
21. NIF Integrated Software / Info
22. NIF Integrated Video / Videos
23. GeneNetwork / Info
24. NeuroMorpho / NeuronInfo
25. Human Brain Atlas / Michigan

Need for an online parts list

“We need a parts catalog for the brain” – Society for Keynote lecture, 2009

Larry Swanson, University of Southern California

Organizing knowledge online
• Built on Semantic MediaWiki
• Uses Semantic Forms to structure data
• Originally populated from 8-10 community ontologies
• Allows anyone to edit or create new concepts
• ~18,000 concepts

NeuroLex Entities

• Behavioral Activity, 27
• Brain Regions, 1004
• Cell Types, 364
• Organisms, 878
• Subcellular parts, 151
• Qualities, 1382
• Diseases, 377
• Molecules, 569
• Nervous system function, 38
• Proteins, 1366
• Resources, 5120

Organizing knowledge online

Visitors, December 2008 to October 2011

Impact: NeuroLex being found spontaneously online

Master table of neurons?

Every row represents a wiki page that is publicly available for review and edit
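Because NeuroLex is built on Semantic MediaWiki, a table whose rows are wiki pages could in principle be requested through a semantic query. The following is a rough sketch under stated assumptions: it uses Semantic MediaWiki's `ask` API module, but the endpoint URL, the category name `Neuron`, and the property names `Located in` and `Neurotransmitter` are hypothetical placeholders, not values taken from NeuroLex. It only builds the request URL and does not contact any server.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the real wiki's API path is not given in the slides.
API_URL = "https://example.org/w/api.php"


def build_ask_url(category, printouts, limit=50, api_url=API_URL):
    """Build a Semantic MediaWiki `ask` URL listing the pages in a category.

    Each page matching the condition becomes one result row; each printout
    (a property name, ASSUMED here) becomes one column.
    """
    parts = [f"[[Category:{category}]]"]          # query condition
    parts += [f"?{name}" for name in printouts]   # requested properties
    parts.append(f"limit={limit}")                # maximum number of rows
    params = {"action": "ask", "query": "|".join(parts), "format": "json"}
    return f"{api_url}?{urlencode(params)}"


# Category and property names below are illustrative assumptions.
url = build_ask_url("Neuron", ["Located in", "Neurotransmitter"])
print(url)
```

Keeping the query as a list of parts joined with `|` mirrors how the condition and printouts are written in the wiki itself, so the same pieces can be reused when editing a page by hand.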

Partnerships & Community

Curators with the NIH-funded Neuroscience Information Framework and volunteers with the International Coordinating Facility

Community

Contributions
• Described work to represent neuroscience using ontologies
• Provided an introduction to NIF and NeuroLex.org and their use for creating structured data in neuroscience
• Outlined some examples of use of structured data within NeuroLex.org

Acknowledgements (NIF, Vulcan)
Maryann Martone (Director), Mark Greaves, Wil Smith, Jeff Grethe, Jesse Wang, Anita Bandrowski, Amarnath Gupta, Fahim Imam, Vadim Astahkov, Andrea Arnaud, Jonathan Cachat, Larry Lui, Vicky Rowley, Chris Condit, Xufei Qian, Jennifer Lawrence, Willy Wong, Cliff Lee, Lee Hornbrook