<<

© 2018 Nature America, Inc., part of Springer Nature. All rights reserved. to to J.R.S. ( 9 5 3 1 measurements. quantitative into data image the or and the outputs processing perturbations analytic that convert and systems biological protocols, imaging define that metadata conventional lyzed. Furthermore, publications do not include the reana be easily cannot and data image original the in contained dimensions multiple and measurements the convey cannot that illustrations: they are presented in processed, compressed formats mere often are data image of versions published time, same the data At or resolutions. showing tissues at organisms whole of subcellular gigapixels contain that images individual generate gies technolo imaging tissue sheet’ ‘light and slide’ ‘virtual new and images, million 1 than more contain may (HCS) screen content’ and publication challenging. An image-based genome-wide ‘high- size of biological sheer image data The sets makes data submission, handling structures. and processes organismal and tissue cell, of measures quantitative provide to signal detected of acteristics char on and time the spectral that 3D sample data sets space, image based is sciences life the in research published the of Much scientificimage data. that promotes and extends publication and reanalysis of I also an open-source platform for publishing imaging data. notebooks that allows remote access to the entire I also established a resourcecomputational based on Jupyter inaccessible to individual studies. of networks and reveals functional interactions that are ontologies. Using this integration, I and cell and tissue phenotypes expressed using controlled digital pathology, with public genetic or chemical databases high-content screening, microscopymulti-dimensional and I repositories, we built a prototype Image of science. A WilliamsEleanor and publication platform Image Received Polytechnique, CNRS, INSERM, Université INSERM, Polytechnique, CNRS, Paris-Saclay, France. Palaiseau, Rafael E Rafael Salas Carazo Anatole Chessel School of Cell and Molecular Medicine, UniversityMedicine, and of Molecular of UK. Bristol, Cell Bristol, School University UK. of Cambridge, Cambridge, Department, Genetics Laboratory, Institute, European Biology Hinxton, UK. European Molecular Centre for Gene Regulation and Expression, University of Dundee, Dundee, UK. UK. University and Expression, Dundee, of for Regulation Centre Dundee, Gene D D ccess ccess to primary research data is vital for the advancement R R provides an online resource and a software infrastructure R links data from several imaging modalities, including [email protected]

24

Novembe T o o extend the data types supported by community D r r 4 2016; – ata ata Resource: a bioimage data integration 1 7 –

3

accepted , 10

, , Simone Leo 4 – , , Josh Moore ). 6 , 9 & Jason R Swedlow

26 A T p o o enable reanalysis, we r D il R R facilitates the analysis

2017; 1 , D 1 2 , , ata ata Resource (I 2 8

published , , , Bálint Antal 10 , , Simon W Li D 6

o Cambridge Systems Biology Centre, UniversityCentre, UK. of Cambridge, Systems Cambridge, Biology Cambridge R. R. I 1 n , 2 li

D n e D R R is

10 19 8 T R). R). Center for Advanced Studies, Research, and Development in Sardinia (CRS4), Pula, Italy. Pula, (CRS4), and in Sardinia Development for Center Research, Advanced Studies, 4 These authors contributed equally to this work. Correspondence should be addressed addressed be should work. to Correspondence this authors equally These contributed hus hus –

Ju 1 6 , , , Richard K Ferguson n 2 - - - 2 e , 10 Division of Biology,Division Computational UK. University Dundee, of Dundee,

2017; , Gabriella , RusticiGabriella to studies or publications or studies to linked files data image downloading and browsing for support projects, each tailored for specific aims and target communities imaging for specific databases of many examples dedicated other nity-submitted images, some of which are linked to publications tions. The CELL Image Library includes several thousand commu which publishes image data sets associated with its online publica similar to Expression Atlas, for data an to provide ‘added-value’ sets data platform imaging biological independent links resource existing no However, (BioStudies). websites journal external at publications related to links provide and sets data archive others whereas or (MitoCheck), experiment EMDataBank) example, (for domain imaging specific a for value platform that combines data from multiple independent independent multiple from data combines that platform value for sequence and function data and function for sequence protein resource EMPIAR the tomograms, 3D for repository prototype a released provide links to download image data. The EMDataBank recently and can sets data from image derived stores Figshare 2D pictures copy micros electron by recorded reconstructions molecular for tory reposi community definitive the EMDataBank, the is archetype One query. and search and browsing, for functionalities sets rich provide data submitted to access long-term for identifiers ent store and collect, provide for they tories persist image data—i.e., A number of public resources serve as scientific, structured reposi concentration into a spatial and biological context biological and spatial a into concentration and localization that biomolecular systems place coordinate with metadata analytic other and/or localization protein expression, gene of measurements synthesize all Atlas Mouse Edinburgh the and Atlas Protein Human the Atlas, Brain Allen The metadata. experimental include cases, some in and, visualization and ing brows enable data, image to access online provide These years. CO Inspired by these added-value resources, we built IDR, an added- Several public image have Several databases appeared over the past few 4 RR Pharmacology Department, University UK. of Cambridge, Cambridge, Department, Pharmacology 9 . The The . ECTED ECTED O 1 1 . Finally, the BioStudies and Dryad archives include include archives Dryad and BioStudies the Finally, . Journal of Cell Biology Cell of Journal N LI nature metho ds N E E 4 OCTOBE 1 1 , , 2 2 , , Ugis Sarkans , Aleksandra , TarkowskaAleksandra R

2018; 1 2 . Some of these provide a resource resource a provide these of Some .

| doi O.4 NO.8 VOL.14 has built the JCB DataViewer, JCB the built has :10.1038/ 1 3 4

. n meth.432

, , Alvis Brazma | AUGUST 2017 7 RESOURCE LOB, Ecole 6 13 1– 1 , , or UniProt, 2 3 , . There are There .

OPEN |

3

,

775 4–

1 0 8 ------. .

© 2018 Nature America, Inc., part of Springer Nature. All rights reserved. ing 35 screens or biological imaging experiments, most of which of which most experiments, compris imaging or biological screens 35 ing studies, imaging 24 with populated currently is IDR I Current RESUL perturbations. and experiments different between links IDR enables users to query across studies and identify phenotypic monizing data from multiple imaging studies into a single system, By har classifiers. of and phenotypic the development reanalysis for others by used be can that features image of sets prehensive com calculated have we studies, several For ontology. common by data authors set to determined a have the phenotypes mapped we Where possible, (API). interface program application a single and analysis and a web interface through and and query for search available made acquisition data design, experimental the to For each study, image data are stored along related with metadata across data sets acquired from a wide variety of and imaging domains. within processing computational and visualization search, form. IDR provides a prototyped resource that supports browsing, integrates and modalities them into a single resource for reanalysis in a imaging convenient, scalable and experiments imaging 776 idr0008-rohn-actinome idr0010-doil-dnadamage idr0009-simpson-secretion idr0005-toret-adhesion idr0004-thorpe-rad52 idr0018-neff-histopathology idr0017-breinig-drugscreen a

Study identifier T idr0001-graml-sysgro (in nonscreen data sets).NA,not applicable (unpublished data). idr0011-ledesmafernandez-dad4 idr0007-srikumar-sumo Average idr0003-breker-plasticity idr0002-heriche-condensation idr0032-yang-meristem idr0028-pascualvargas-rhogtpases idr0027-dickerson- chromatin idr0023-szymborska- idr0021-lawo- idr0020-barr-chtog idr0019-sero-nfkappab idr0015-UNKNOWN-taraoceans idr0013-neumann-mitocheck idr0012-fuchs-cellmorph Sum idr0006-fong-nuclearbodies idr0016-wawer- The number of submitted phenotypes. able RESOURCE nuclearpore pericentriolarmaterial bioactivecompoundprofiling

| O.4 NO.8 VOL.14

1 T | S Data sets in IDR D R

| AUGUST 2017 b The number of , compounds orproteins identified astargets for analysis. human D. melanogaster Human Human Drosophila Drosophila S. cerevisiae

Human Human Mus Mus musculus Human Human

S. S. pombe S. S. cerevisiae S. S. cerevisiae Saccharomyces Saccharomyces Human Arabidopsis Arabidopsis Human S. S. cerevisiae Human Human Multi-species Human Human Human thaliana melanogaster cerevisiae Species | nature metho ds

, RNAi RNAi screen RNAi RNAi screen RNAi screen In In situ

RNAi RNAi screen Protein screen Gene deletion screen Protein localization Protein localization Histopathology Histopathology of Small molecule screen Small molecule screen Gene deletion screen Gene deletion screen Protein localization RNAi RNAi screen RNAi RNAi screen 3D-tracking of tagged RNAi screen HCS image analysis Geographic screen RNAi screen RNAi screen Protein localization chromatin loci using dSTORM using 3D-SIM gene knockouts screen screen hybridization T ype

experiments - - - Screens or 35 2 1 1 1 1 1 1 2 1 1 2 1 1 1 1 1 5 1 4 1 1 1 2 1 are linked to published works ( works published to linked are annotations describing experimental perturbations (such as as (such perturbations , expressed and reagents, siRNA mutants, targets genetic experimental describing included have we annotations IDR, in sets data across querying enable To I in annotation functional and chemical Genetic, to millimeter-scale structures of animal tissues ( tissues animal of structures millimeter-scale to proteins cellular of localization nanometer-scale from adhesion, ples biomedically relevant features such as cell shape, division and other and marine organisms, are sam The also included. current collection plankton of survey global a Oceans, Tara from data Imaging mice. intact and cells fungal and of human imaging live and imaging histopathology whole-slide screening, siRNA and chemical high-content dSTORM, and 3DSIM super-resolution include supported approaches experimental and modalities ing human, mouse, fly, plant and in cells are fungal The included. imag studies from sets Data annotations. and functional and regions features) image (submitter-calculated annotations analytic location), geographic chemistry, RNAi, genes, as (such tations anno experimental associated all includes and experiments ual individ million ~1 and planes image million ~36 in data image 2,538,777 images 397,056 240,848 105,782 147,456 869,820 109,728 155,332 200,995 56,832 55,944 97,920 45,792 36,960 25,872 45,692 32,776 5D 3,456 3,765 8,957 1,152 229 524 414 899 458

c The number of individual wells(inHCS studies) or imaging experiments 10.06 14.54 0.02 0.08 0.12 3.25 1.40 1.73 0.20 0.17 2.48 3.19 0.03 0.0005 0.0003 0.27 0.4 0.003 0.14 2.10 0.18 0.03 0.03 0.38 2.49 (TB) Size 42 Pheno- types 224 23 46 14 48 19 18 18 2 3 8 9 1 0 2 0 1 1 1 5 1 2 9 2 0 0 a

161,119 T argets Table 12,826 18,675 17,960 12,743 29,542 16,701 13,035 18,393 6,713 6,234 4,195 1,281 5,209 3,005 377 115 102 170 241 198 84 9 8 7 9 b 1 Experiments ). IDR holds ~42 TB of of TB ~42 holds IDR ). 1,002,328 397,056 144,000 206,592 26,496 41,764 32,640 36,864 56,832 16,224 18,432 26,112 15,264 1,152 4,512 8,736 1,152 5,544 1,232 2,156 248 112 359 414 115 84 Table c D Reference R 2 ). NA NA 4 4 2 4 4 4 4 4 4 2 5 5 5 5 5 5 3 5 4 4 5 4 5 0 7 1 2 9 8 6 4 6 4 5 1 0 3 2 9 6 3 7 - - - -

© 2018 Nature America, Inc., part of Springer Nature. All rights reserved. idr0021-lawo-pericentriolarmaterial idr0020-barr-chtog idr0019-sero-nfkappab idr0018-neff-histopathology idr0017-breinig-drugscreen idr0016-wawer-bioactivecompoundprofiling idr0015-UNKNOWN-taraoceans idr0013-neumann-mitocheck idr0012-fuchs-cellmorph idr0011-ledesmafernandez-dad4 idr0010-doil-dnadamage idr0009-simpson-secretion idr0008-rohn-actinome idr0007-srikumar-sumo idr0006-fong-nuclearbodies idr0005-toret-adhesion idr0004-thorpe-rad52 idr0003-breker-plasticity idr0032-yang-meristem idr0002-heriche-condensation idr0001-graml-sysgro Study identifier T idr0028-pascualvargas-rhogtpases idr0023-szymborska-nuclearpore idr0027-dickerson-chromatin in the Cellular Phenotype Ontology (CMPO) Ontology Phenotype Microscopy Cellular the in terms to defined converted were annotations Functional to IDR. organisms. and assays several in sampled are genes known of majority the in perturbations of phenotypes the incarnation, early this in Even studies. or more three in sampled are 90.3% and times 20 than more sampled are 84.1% these, of sampled; are orthologs gene 19,601 Overall, orthologs. gene of el ( cell’ ‘round (e.g., assays orthogonal multiple in observed were types we used we Ensembl’sused resource BioMart To of orthologs, gene sampling the or siRNA depletion. calculate NCBI link Ensembl, as IDR (such in PubChem). or resources metadata external authoritative experimental to possible, Where image data. of inspection visual or analysis quantitative from either authors study by declared phenotypes and drugs) and lines cell ( experiments individual million >1 includes IDR total, In ments. are and included, phenotypes many are cases of sampled in of thousands experi classes Several sets. data IDR current the across observed phenotypes. observed 000014 in specific assays, for example, ‘protein localized in cytosol cytosol in localized ( ‘protein example, phenotype’ for assays, specific in or over-represented common are very phenotypes some because other imaging data sets, and a median of 144. This skewing occurs and HCSs across phenotype sampled, per samples 698 of well average an with were phenotypes these Nonetheless, study. one only in reported are 136 and arrested’), ‘mitosis and filaments’ number of actin (for example, ‘increased phenotypes normalized ontology- 158 includes IDR vocabularies. controlled published submitters. data the with to defined, have links annotations of 88% functional the Overall, collaboration in ontologies, other Table able We normalized the phenotypes included in studies submitted submitted studies in included phenotypes the Wenormalized Many of the studies in IDR perturb gene function by mutation function gene in Many IDR perturb of studies the

2 CMPO_000011 | 0 1 Example URLs and views of IDR data sets )). )). ), ~9% of which are annotated with experimentally experimentally with annotated are which of ~9% ), Figure Figure CMPO_00003 1 summarizes the sampling of phenotypes phenotypes of sampling the summarizes 8 ad icesd ula sz’ ( size’ nuclear ‘increased and ) 9 3 . oehls, eea pheno several Nonetheless, ). 1 5 to access a normalized list to a access normalized CMPO_

https://idr.open https://idr.open https://idr.open https://idr.ope https://idr.ope https://idr.ope https://idr.ope https://idr.open https://idr.openm https://idr.openmi https://idr.openmic https://id https:// https: https: https http h https://idr.openmi https://idr.openmicr https://idr.openmic https://idr.openmic https://idr.openmi https://idr.openmicr ttps://idr.openmicro 1 6 s://idr.openmicrosco or or ://idr.openmicroscop //idr.openmicroscopy //idr.openmicroscopy - - idr.openmicroscopy.o

r.openmicroscopy.org other communities that have had broad acceptance broad had have that from communities other methods lightweight adopt to sought by therefore We served IDR. potentially domains imaging the all crosses that ard stand metadata a yet not is there subdomains, imaging specific an open-source application an open-source OMERO.web,on based isinterface (WUI)user web currentIDR integratesIDR imagemetadatadataand from several studies. The D into OMERO into identifiers ontology and gene as such metadata semistructured WUI WUI and web-based API in JSON format (Supplementary Note the IDR through are and thumbnails metadata accessible images, IDR WUI. IDR the through available made possible, where and, linked and included are data image the with submitted (ROIs) used in are histopathology also supported. Any regions of interest sional images can be viewed and browsed. Tiled whole-slide images study,for each and multidimen as thumbnails are viewable data ( ways several in reuse and access This architecture makes the integrated data resource available for ( organisms and compounds antibodies, allowing data sets to be viewed by study, genes, phenotypes, siRNAs, TAB specifications TAB a consistent tabular format, inspired by the MAGE-TAB and ISA- Word documents—into Microsoft and databases MySQL formats—spreadsheets, PDFs, custom in submitted metadata verted ( They also can be embedded into other pages (e.g., Euro-BioImaging, NeuroVault ( MIACA as such efforts While arching standards for experimental, imaging or analytic metadata. over of absence the in modalities, imaging various by acquired were data These from data many studies. imaging IDR integrates metadata imaging for interfaces Standardized htm https:// ata visualization in I in visualization ata nmicroscopy.org/webc nmicroscopy.org/webc nmicroscopy.org/webc nmicroscopy.org/webc microscopy.org/webcl microscopy.org/webcl microscopy.org/webcl microscopy.org/webcl l icroscopy.org/webcli ) using the OMERO.web gateway. OMERO.web the using ) croscopy.org/webclie croscopy.org/webclie croscopy.org/webclie roscopy.org/webclien roscopy.org/webclien roscopy.org/webclien oscopy.org/webclient oscopy.org/webclient scopy.org/webclient/ www.eurobioimaging-i py.org/webclient/?sh IDR URL y.org/webclient/?sho 1 .org/webclient/?show .org/webclient/?show 8 , MULTIMOT , 2 rg/webclient/?show=w nature metho ds 3 /webclient/?show=image-64895 . We also used the Bio-Formats software library library software Bio-Formats the used Wealso . 21 lient/?show=dataset- lient/?show=well-104 lient/?show=well-102 lient/?show=well-105 ient/?show=dataset-5 ient/?show=well-1030 ient/?show=well-1024 ient/?show=image-163 , 2 ent/?show=image-1821 2 nt/?show=image-28498 nt/?show=image-312577 nt/?show=dataset-6 , that could then be used for importing importing for used be then could that , t/?show=image-306366 t/?show=well-59068 t/?show=image-289505 D /?show=well-11909 /?show=image-285826 ?show=well-485 R 1 ow=well-46926 7 1 w=well-54760 , and is supplemented with , a with and plugin is supplemented 9 have proposed data standards in in standards data proposed have =well-3747 =image-82068 | O.4 NO.8 VOL.14 nterim.eu/image-data ell-4540 http: Supplementary Note Supplementary //miaca.sourceforge.net 36 633 940 657 2 1 1 57 67 7 2 654 9 9 7 3 6 Supplementary Note Supplementary 81 9 1 6 1 8 6 0 4 3 8 6 7 6 1 6 | AUGUST 2017 RESOURCE 2 0 -resource. and con and ). Image Image ). |

777 / ), ), ). ). ). ). - - - -

© 2018 Nature America, Inc., part of Springer Nature. All rights reserved. and male fertility results in a range of defects in homeostasis in the brain, rib growth open specification. open common, a into domains imaging of several across range types a metadata reading and recognizing thus for and framework a Methods) comprise (Online available publicly are scripts resource. The single a into sets data integrate to scripts translation protein secretion protein ( phenotype secretion Knockout of mutants. mouse of series a in phenotypes tissue of study thology groupings of phenotype terms. Point size represents the number of studies each phenotype is linked to (1, 2, 3 or 4 studies). 3 or 2, (1, to linked is phenotype each of studies number the represents Point size terms. of phenotype groupings higher-level represent here. Colors shown terms CMPO the to mapped were terms phenotype User-submitted included. not were controls as annotated ( gene the for Searching search. text-based simple through combine genes and phenotypes across multiple studies are possible IDR Because links gene names and phenotypes, resultsquery that A Model OME Data the in specified (for example, metadata image pixel imaging as size) that describe to and identify elements convert typed semantically well-defined, 1 Figure 778 CM types types in several tissues from 191894 ( liver nal ( o mitotic defects (for example, (for defects mitotic with associated studies four from phenotypes of range a returns is involved in certain forms of retinitis pigmentosa retinitis of forms certain in involved is htt rg/webclient/?show=i RESOURCE dded value of I value dded Number of samples 10,000 PO_000021 ps://idr.openmicrosc https:// 1,000 | O.4 NO.8 VOL.14 10 https://i 0 | ) and male reproductive tract ( Sampling of phenotypes in the IDR. Each sample represents a well from a microwell plate in a screen or an image from a data set. Wells set. a data from image an or a screen in plate a microwell from a well represents sample Each IDR. the in of phenotypes Sampling Increased cell numbers number Altered Decreased cell numbers idr.openmicroscopy.o Fewer cells with projections cell More lamellipodia cells

Car4 Proliferating cells More cells in m phase More multinucleate cells dr.openmicroscopy.or Increased cell size in population 2

component Fewer aggregated cells in population 28– and Increased cell component number 2 number , , which encodes carbonic anhydrase 4 in mouse,

D Decreased number of filopodia 7 | Decreased number of microtubules Cell AUGUST 2017 . A second example is provided in a histopa a in provided is example second A .

R Increased number of filopodia 3 Increased number of microtubules 0 Increased amount of stress fibers . Data in IDR show abnormal nuclear pheno

CMPO_000034 Increased amount of stress fibers located in the cell cortex CMPO_000024

mage-191895 Increased amount of transverse stress fibers Increased amount of zig−zag stress fibers Increased number of microtubule bundle Mitosis delayed

phase Abnormal cell cycle cycle

Cell Absence of mitotic decondensation opy.org/mapr/gene/?v Mitotic chromosome condensation

Car4 Increased duration of mitotic prophase Mitosis arrested

CMPO_000 Prometaphase delayed Mitotic metaphase plate congression S phase mitotic Elongated cell

cycle M phase arrested Cell rg/webclient/?show=d M phase mitotic −/− | Metaphase arrested

24 Metaphase delayed nature metho ds Decreased duration of mitotic prophase mice, including gastrointesti 3 Prometaphase arrested ,

2 Abnormal mitotic cell cycle phase Round cell ). The human ortholog, 4 5 Cell morphology g/webclient/?show=im 6 https://

, and we used the resulting resulting the , we and used Apoptotic cell shape )

) in a screen for defects in in defects for screen a in ) Cell with projections morphology 4 Elongated cell ,

2 Abnormal cell shape Round cell 6 011

Cell Triangular shaped cell but also an accelerated Increased cell size Decreased cell size Geometric cell idr.openmicroscopy.

8 Star shaped cell S−shaped cell , , Curved cell

CMPO_000030 Pear−shaped cell Stubby cell Fan−shaped cell morphology

population Increased variability of cell size in population alue=SGOL Increased variability of cell shape in population Layered cells in population Cell 31 Increased variability of nuclear shape in population More cells with metaphase microtubule spindles ataset-15

, More cells with interphase microtubule arrays

3 Fewer cells with metaphase microtubule spindles

2 Fewer cells with interphase microtubule arrays

. Data in in Data . Fewer cells with G1 phase microtubule arrays More cells with G1 phase microtubule arrays More cells with S phase microtubule arrays SGOL1

project- Increased cilium length Decreased axon thickness Cell CA4 ion Cilium morphology component

age- Asymmetric lamellipodia Decreased lamellipodia width Cellular

3 Fan−shaped lamellipodia 1

5 Cell projection ), Cell component size - - - ) , , Cell component structure Cellular component Increased lamellipodia width

Cell component morphology Increased nucleussize Actin filament Increased number of actin filament asd n lnae cl peoye ( ries to String DB phenotype cell pombe Schizosaccharomyces that elongated knockdowns an or knockouts caused when created network gene els (idr0001-A, idr0008-B and idr0012-A) ( idr0012-A) and idr0008-B (idr0001-A, els mod biological different using gated cell’ studies hits in separate ‘elon as identified were but complex the of part all are MED20 and MED18 MED30, HELZ2, example, For phenotype. complexes to the cell elongated macromolecular specific connect three studies form nonoverlapping, complementary networks that Note novel representations of gene networks. networks. gene of representations novel consistent with a defect in some ofaspect the cell division cycle. ( tion of CA4 in HeLa cells IDR from the MitoCheck study show that siRNA-mediated deple data also provides an opportunity to include new functionalities functionalities new include to opportunity an provides also data complexes. macromolecular large of functions specific conserved, through arise but models biological of individual or characteristics effects due to not off-target probably are hits individual the that show examples These in to idr0001-A) form the Set1 histone (HMT). methyltransferase phenotype cell (elongated SETD1B and SETD1A with associates pathway. Finally, ASH2L (elongated cell phenotype in idr0012-A), transcription II polymerase RNA the in of complex elongation part the all are and studies these in hits cell elongated as scored were (idr0008-B) and SUPT16H PAF1 (idr0001-A) (idr0012-A), Decreased number of actin filament https://idr.openm Cytoskeletal Disorganized microtubules Phenotypes across distinct studies can also be used to build build to used be also can studies distinct across Phenotypes The integration of experimental, imaging and analytic meta analytic and imaging experimental, of integration The Increased cortical actin cytoskeleton mass Decreased cortical actin cytoskeleton mass Disorganised cortical actin cytoskeleton Actin cytoskeleton

and and Aggregated microtubules Microtubules nuclear bracket Microtubules nuclear ring Increased amount of punctate actin foci Increased actin localised to the nucleus Increased actin localised to the cytoplasm

Supplementary Table 1 Supplementary Actin nuclear ring Elongated cytoplasmic microtubules Shortened cytoplasmic microtubules Microtubule spindle morphology Abnormal microtubule cytoskeleton morphology during mitotic interphase Apoptopic nucleus

Bilobed nucleus Protein localizedincytosol

Nuclear Round nucleus

3 Increased nucleus size Decreased nucleus size 3 Bright nuclei and visualized in Cytoscape

icroscopy.org/webcli Graped micronucleus Abnormal nucleus shape Nuclear morphology Bright nuclear body Pyknotic nuclear Polylobed nuclear Increased level of polypetide in cell nucleus Decreased level of polypetide in cell nucleus 4 Protein localized in bud neck

also results in abnormally shaped nuclei Protein localized in cell periphery Protein localized in cytosol Protein localized in endoplasmic reticulum Protein localized in mitochondrion and human cells are linked by que by linked are cells human and Protein localized in nuclear periphery Protein localized in nucleolus

localization Protein localized in nucleus Protein localized in punctate foci

Protein Protein localized in vacuole protein localized in vacuolar membrane Protein localized in cajal body Protein localized in nuclear speckle ). The genes discovered in the the in discovered genes The ). Protein localized in paraspeckle Protein localized in PML body Protein localized in polycomb body Protein localized in Sam68 nuclear body Protein localized in centrosome Protein localized in nuclear pore

ent/?show=well-82841 Absence of protein localized in bud neck Absence of mitotic process Cell adhesion

Cell−matrix adhesion in rateofproteinsecretion Mild andstrongdecrease

CMPO_000007 Cell spreading iue 2 Figure Absence of cell spreading Increased thickness of dendritic branches Cell death Fig. 2 Fig. 3 Cell apoptosis

4 Cell DNA Misshapen DNA ( Apoptotic DNA

Supplementary Supplementary Abnormal cell growth Increased cell movement speed Other Increased cell movement distance Binuclear cell b Polyploid cell a Loss of cell monolayer ). POLR2G POLR2G ). shows the the shows Aggregated cells in population No cells Cell response to DNA damage Abnormal chromosome segregation Increased microtubule−based processes Metabolic process Kinetochore

7 Regulation of metabolic process Negative regulation of protein import into nucleus ) in in ) Positive regulation of protein import into nucleus Cellular response to chemical stimulus

9 Increased rate of protein secretion Mild decrease in rate of protein secretion ), ), - - - - - Strong decrease in rate of protein secretion © 2018 Nature America, Inc., part of Springer Nature. All rights reserved. displayed with linkages (gray) from StringDB from (gray) linkages with displayed morphology (blue, idr0012-A) (blue, morphology three IDR studies. Genes from from Genes studies. IDR three in (CMPO_0000077) phenotype cell elongated the to linked genes the ( IDR. the in phenotype original studies and data sets. We have added the data analyt Mineotaur tool data ics the added have We sets. data and studies the original to value further adding analysis, and visualization data for 2 Figure network in in network identified in idr0001-A ( idr0001-A in identified of orthologs human the Cytoscape, microscopy.org/mineo human cell data (idr0012-A) but not not but (idr0012-A) data cell human as such cell in morphology controlling in function HMT Set1 the of components that shown having instance, For data. feature quantitative of analysis b a | ASH2L Network analysis of genes linked to the elongated cell cell elongated the to linked of genes analysis Network a . Genes are listed in in listed are . Genes S. pombe S. were in the ‘elongated cell’ network based on on based network cell’ ‘elongated the in were 3 5 a to one of IDR’s data sets ( sets IDR’s of data one to Supplementary Supplementary ) Protein–protein interaction network based on based network interaction ) Protein–protein taur and human cells, we noticed that genes genes that noticed we cells, human and 3 S. pombe S. 9 and HeLa Actinome (red, idr0008-B) (red, Actinome HeLa and / ). This allows visual querying and and querying visual allows This ). Supplementary Supplementary S. pombe S. (green, idr0001-A) (green, N ote genes are used for the genes the for used are genes 3 S. pombe S. 3 ). ( ). . To enable comparisons in . Tocomparisons enable b N ) Close-up view of view ) Close-up ote . https://idr.open data. We noted noted We data. 5 , HeLa cell cell , HeLa 4 0 are - - data are recorded in a consistent format. Rather than attempting attempting than Rather format. a consistent in are recorded data ent studies. Experimental, imaging phenotypic and analytic meta and discoveries. hypotheses of generation biological the new more data are sets to potentiate added will and they IDR, catalyze As analysis. and querying Data Big biological of modes new ing data from several independent studies into a single resource, allow meta analytic and phenotypic imaging, experimental, integrates range of imaging modalities and scales in a consistent format. IDR that integrates and publishes image data andtechnology metadata data from a next-generation wide a IDR, built have we sets, data available through the OMERO API ( API OMERO the through available and store data HDF5-based OMERO’s using IDR in stored Features are idr0013-B. and and idr0013-A of parts idr0009-B for and idr0012-A idr0009-A, in idr0008-B, images for idr0005-A, calculated idr0002-A, been have features WND-CHARM enterprise of scientific the part and public is data a available Making critical D linked to study data in BioStudies (for example, BioStudies BioStudies example, (for been BioStudies have S-EPMC470449 in IDR data in on study data IDR to image of linked example, use For make sites. to own parties their third encourages This avail JSON API. web-based data a as well makes as interface user IDR web a through resources, able online modern most Like D data using the open-source tool WND-CHARM tool open-source the using data image IDR of vectors feature of sets comprehensive calculating ?term=&hostName=Imag ( repository notypic resource ( resource we sets, data computational TB-scale to IDR a Jupyter notebook-based IDR’s have connected to access the ease To computational reanalysis. for candidate attractive an IDR makes features Supplementary Note Supplementary see are data not image included; (original stack IDR software the of deployment automate that scripts Ansible built have and load down for available thumbnails and metadata databases, IDR all IDR data sets via an via API ( IDR sets data notebook ( GitHub in stored notebooks using 1 WND- CHARM features for images and and individual recreation of networks gene of calculations phenotypes, CMPO tion of image features using PCA, access to images annotated with api.htm that that openmicroscop from from HeLa cells ( ( shape cell for hits phenotypic define to study this in used nally that found and (idr0001-A) screen Sysgro the in hits shape cell for used criteria the queried cytoskeleton in unicellular and multicellular organisms. multicellular and the unicellular in and cytoskeleton shape cell controlling in role conserved strongly a has Supplementary Note Supplementary ISCUSSION ata integration and access and integration ata and In IDR, we have linked image metadata from independ several To further extend the possibilities for of reuse To IDR we data, the possibilities are extend further The integration of image-based phenotypes and calculated calculated and phenotypes image-based of integration The ash2

2 l from IDR data. Users can also run their own analyses analyses own their run also can Users data. IDR from ). ). We that visualiza provide notebooks example include has a microtubule cytoskeleton phenotype ( phenotype cytoskeleton microtubule a has s 3 https://idr.open ). ). To allow reuse of IDR metadata locally, we have made 8 . To help facilitate the reuse and meta-analysis of image nature metho ds y.org/webclient/?show=well-59237 4 Fig. Fig. 2 ad o PhenoImageShare to and ) ). http://www.phenoim ). When combined with results on results with combined When ). b ) these ) results that suggest these the HMT Set1 https://idr.openm e+Data+Repository+(IDR microscopy.org/jupyt ash2 | O.4 NO.8 VOL.14 fell just below the cutoff origi cutoff the below just fell https: Supplementary Note Supplementary //github.com/IDR/idr ageshare.org/search/ icroscopy.org/about/ | AUGUST 2017 3 6 a oln phe online an , RESOURCE e r 3 7 ) that exposes exposes that ) . To date, full full To date, . 1 ) ). ), then we we then ), https: Figures Figures ASH2L ). | //idr.

779 ------

© 2018 Nature America, Inc., part of Springer Nature. All rights reserved. to the genomic resources that are the foundation of much of of much of foundation biology. modern the are that resources genomic the to analogous to form resources may amalgamate and services bases data In those future, the and services. databases image online scientific supports that platform technology a as and publication data image for resource a as both functions therefore IDR nity. commu scientific the for data bioimaging publish and integrate that installations and technology of building the supports This publication. for data image systems into and other built accessed resource. computational IDR the using reanalyzed and queried IDR’s or WUI through viewed and browsed be can sets data the of the integrated data sets. B.A. helped with the IDR–Mineotaur integration. G.R., S.W.L. and E.W. sourced and received the data sets. A.C. analyzed features calculation software; E.W. performed all data curation and annotation. the incoming data sets and, along with S.W.L, developed and ran the feature application, and updated the IDR web stack; S.L. updated Bio-Formats to read cloud and ran all the IDR systems; A.T. built Mapr, the metadata querying development team; S.W.L. built all the tools for deploying IDR in the OpenStack by J.R.S. J.M. designed the software architecture and managed the software J.R.S., A.B. and R.E.C.S. conceived and funded the project, which was overseen A Grant (SYSGRO) and the University of Bristol. a (BB/K006320/1), European Research Council Starting Researcher Investigator 634107 (MULTIMOT). R.E.C.S. was funded by a BBSRC Responsive Mode grant 2020 Framework Programme of the European Union under grant agreement by awards to J.R.S. from the Wellcome Trust and(095931/Z/11/Z) Horizon II to J.R.S. and A.B.). Updates to OMERO and Bio-Formats were supported European Union under grant agreement 688945 (Euro-BioImaging Prep Phase to J.R.S., A.B. and R.E.C.S.) and Horizon 2020 Framework Programme of the new ontology terms. The IDR project was funded by the BBSRC (BB/M018423/1 Cloud. We are particularly grateful to S. Jupp for help with adding and defining C. Short and A. Cafferkey for their support of the project’s use of the Embassy thank the systems support team at EMBL-EBI, in particular R. Boyce, D. Ocana, IDR for their contributions and help in incorporating their data sets. We also We thank all the study authors who submitted data sets for inclusion in the Ac online version of the pape Note: Any Information Supplementary and Source Data files are available in the of accession codes and references, are available in the Methods, including statements of data availability Me and any associated openmic at available are process submission the about (details in publication for we have built and formats specifications metadata sets the using IDR data image submit can many Authors publish more. and receive to capacity the has Cloud, Embassy ( studies 24 into grouped sets generates. community the data that amounts of bioimaging on increasing the and capitalize harness to helping thereby studies, imaging across analytics and visualization integration, systematic enable IDR—to to link that repositories similar IDR—or in extended be could capabilities other and these that imagine can we future, the In applications. structure to into IDR or read can be that other formats in shareable metadata others for incentives provide will framework this of availability the that Wehope reuse. data facilitates frame that a work as these releases and formats community supporting for tools provides IDR standard, data imaging strict a enforce to 780 U RESOURCE kn IDR software and technology is open source, so it can be be can it so source, open is technology and software IDR As of this writing, IDR has published 35 reference image data data image reference 35 published has IDR writing, this of As TH the pape the th OR OR CO o | O.4 NO.8 VOL.14 o w d le s roscopy.org/about/submission.htm d NT g r m RIBU . e nt T s IO | N AUGUST 2017 S r . Table | nature metho ds 1 ) and, using EMBL-EBI’s EMBL-EBI’s using and, ) l ). Once published, published, Once ). online online version https://idr.

- - -

credit to the original author(s) and the source, provide a link to the Creative Commons Creative the to link a provide source, the and author(s) original the to credit 3. 2. 1. view a copy of this license, visit obtainpermissioncopyrighttodirectlyneedthe holder.willfrompermitted you use, the exceeds or regulation statutory by permitted not is use intended your and license in a credit line to the material. If material is not included in thearticle are included inarticle’s the article’s Creative Commons license,Creative unless indicated otherwise Commons license, and indicate if changes were made. institutional and maps published affiliations. in claims jurisdictional to regard with c Reprints and permissions information is available online at online version of the pape The authors declare competing financial interests: details are available in the CO the integration of IDR data sets into BioStudies. R.K.F. designed the updates to the OMERO user interface. U.S. helped with 23. 22. 21. 20. 19. 18. 17. 16. 15. 14. 13. 12. 11. 10. 9. 8. 7. 6. 5. 4. m om/reprints/index.ht M PE for microarray data: MAGE-TAB.data: microarray for the in Cell Dev. progression. cell-cycle and organization, microtubule shape, cell link and transcriptome. Science experimental biology.experimental in patterns localization subcellular and abundance protein of analysis automated Methods Biol. Cell Trends Semantics genes. division cell reveals microscopy time-lapse 121–126 (2012). 121–126 Mol. Syst. Biol. Syst. Mol. biology. Methods Nat. data. image microscopy electron raw for archive public a EMPIAR: Res. Armit, C. Armit, M.J. Hawrylycz, M. Uhlén, Li, S. Li, Rayner,T.F. S.A. Sansone, systems for Standards U. Sarkans, & M. Krestyaninova, A., Brazma, P.Masuzzo, K.J. Gorgolewski, C. Allan, S. Jupp, A. Yates, information. protein for hub a UniProt: Consortium. UniProt R. Petryszak, database.BioStudies The A. Brazma, & U. Sarkans, McEntyre,J., A. Patwardhan, & G.J. Kleywegt, Korir,P.K.,Salavert-Torres,J., A., Iudin, C.M. Kane, & M.H. Ellisman, M.E., Martone, J.H., Iwasa, D.N., Orloff, C.L. Lawson, C.C. Fowlkes, Gönczy,P. J.L. Koh, V.Graml, B. Neumann, Nucleic Acids Res. Acids Nucleic data. microscopy of repository curated a library-CCDB: image an cell: The human brain transcriptome.brain human Front. Neuroinform. Front. brain. human the of maps statisticalunthresholded sharing and collecting Res. plants. and animalshumans, in expression protein and gene Nucleic Acids Res. Acids Nucleic using RNAi of genes on chromosome III. chromosome on genes of RNAi using (2010). T I N , D204–D212 (2015). D204–D212 43, , D746–D752 (2016). D746–D752 44, G G FI Metadata management for high content screening in OMERO. in screening content high managementfor Metadata al. et blastoderm. Drosophila , 1260419 (2015). 1260419 347, Nat. Rev. Genet. Rev. Nat. , 27–32 (2016). 27–32 96, , 227–239 (2014). 227–239 31, The cellular microscopy phenotype ontology.phenotype microscopy cellular The al. et OMERO: flexible, model-driven data management for data model-driven flexible, OMERO: al. et Ensembl 2016. Ensembl al. et A genomic Multiprocess survey of machineries that control that machineries of survey Multiprocessgenomic A al. et eMouseAtlas, EMAGE, and the spatial dimension of the of dimension spatial the and EMAGE, eMouseAtlas, al. et NAN CYCLoPs: a comprehensive database constructed from constructed database comprehensive a CYCLoPs: al. et T License, which permits use, sharing, adaptation, distribution and and distribution adaptation, appropriate give you sharing, as long as format, or medium any in reproduction use, permits which License, , 28 (2016). 28 7, Proteomics. Tissue-based map of the human proteome.human the of map Tissue-based Proteomics. al. et Functional genomic analysis of cell division in division cell of analysis genomicFunctional al. et his work is licensed under a Creative Commons Saccharomyces A simple spreadsheet-based, MIAME-supportive format MIAME-supportive spreadsheet-based, simple A al. et An open data ecosystem for cell migration research.migration cell for ecosystem data open An al. et EMDataBank.org: unified data resource for CryoEM. for resource data unified EMDataBank.org: al. et Phenotypic profiling of the by genome human the of profiling Phenotypic al. et Expression Atlas update—an integrated database of database integrated update—an Atlas Expression al. et A quantitative spatiotemporal atlas of gene expressiongene of atlas spatiotemporal quantitative A al. et , 387–388 (2016). 387–388 13, CI Toward interoperable bioscience data. bioscienceinteroperableToward al. et

Mamm. Genome Mamm.

, 847 (2015). 847 11, An anatomically comprehensive atlas of the adult the of atlas comprehensiveanatomically An al. et A , 55–58 (2015). 55–58 25,

L L I NeuroVault.org: a web-based repository for repository web-based NeuroVault.org:a al. et , D1241–D1250 (2013). D1241–D1250 41, , D456–D464 (2011). D456–D464 39, , 8 (2015). 8 9, l NT r . Publisher’s note: Springer Springer note: Publisher’s . . Nat. Methods Nat. ERES

http://creativec , 593–605 (2006). 593–605 7, T Nature S Cell , 514–524 (2012). 514–524 23, Nucleic Acids Res. Acids Nucleic BMC Bioinformatics BMC T , 364–374 (2008). 364–374 133, he images or other third party material in this , 391–399 (2012). 391–399 489, , 245–253 (2012). 245–253 9, . G3 (Bethesda) G3 ommons.org/licenses/by/4.0 Nature

, 331–336 (2000). 331–336 408, , D710–D716 (2016). D710–D716 44, Nature A N , 489 (2006). 489 7, ttribution 4.0 International ature remains neutral neutral remains ature , 1223–1232 (2015). 1223–1232 5, http://www.nature. , 721–727 464, Nat. Genet. Nat. J. Biomed. J. Nucleic Acids Nucleic Nucleic Acids Nucleic / C. elegans C. .

44, T o © 2018 Nature America, Inc., part of Springer Nature. All rights reserved. 39. 38. 37. 36. 35. 34. 33. 32. 31. 30. 29. 28. 27. 26. 25. 24.

283 (2015). 283 pigmentosa. retinitis of form RP17 the with patients in identified IV anhydrase Biol. Cell Nat. pathway.secretory early the in function regulatory a with proteins , C402–C412 (2008). C402–C412 294, muscle. skeletal mouse in role functional and brain. in space extracellular the buffering in anhydrases carbonic respective the condensation. chromosome mitotic in involved genes new of prediction allows nodes data using Cytoscape. using data infrastructure. and multiparametric imaging.multiparametric and content microscopy screen sharing and visual analytics. visual and sharing screen microscopy content Fuchs, F.Fuchs, as Science M. Walport, Vallance,P.& M., Rawlins, G., Boulton, S. Adebayo, high- for tool a Mineotaur: R.E. Salas, Carazo & A. Chessel, B., Antal, M.S. Cline, D. Szklarczyk, Z. Yang, G. Rebello, Wandernoth,P.M. R.J. Scheibe, G.N. Shah, J.C. Simpson, Hériché,J.K. I.G. Goldberg, M. Linkert, Orlov, N. N. Orlov, (2011). data. open for case the enterprise: public a D447–D452 (2015). D447–D452 life. of tree the over integrated networks, biological imaging. biological in analysis quantitative and informatics for tools open file: XML and world. (2008). transforms. image compound e15061 (2010). e15061 sperm. human and murine of activationbicarbonate-mediated 255–265 (2005). 255–265 degeneration.photoreceptor retinal causes and Proc. Natl. Acad. Sci. USA Sci. Acad. Natl. Proc. J. Cell Biol. Cell J. Mutant carbonic anhydrase 4 impairs pH regulation pH impairs 4 anhydrase carbonic Mutant al. et Clustering phenotype populations by genome-wideRNAi by populations phenotype Clustering al. et et al. et Integration of biological networks and gene expressiongene and networks biological of Integration al. et Carbonic anhydrase IV and XIV knockout mice: roles of roles mice: knockout XIV and IV anhydrase Carbonic al. et Metadata matters: access to image data in the real the in data image to access matters: Metadata al. et Apoptosis-inducing signal sequence mutation in carbonic in mutation sequence signalApoptosis-inducing al. et PhenoImageShare: an image annotation and query and annotation image PhenoImageShare:an al. et Proc. Natl. Acad. Sci. USA Sci. Acad. Natl. Proc. Integration of biological data by kernels on graph on kernels by data biological of Integration al. et

Carbonic anhydrases IV and IX: subcellular localization subcellular IX: and IV anhydrases Carbonic al. et Genome-wide RNAi screening identifies humanidentifies screening RNAi Genome-wide al. et STRING v10: protein-protein interactionprotein-protein v10: STRING al. et , 764–774 (2012). 764–774 14, Mol. Biol. Cell Biol. Mol. J. Biomed. Semantics Biomed. J. The Open Microscopy Environment (OME) Data Model Data (OME) EnvironmentMicroscopy Open The al. et WND-CHARM: Multi-purpose image classification using classification image Multi-purpose WND-CHARM: Role of carbonic anhydrase IV in the in IV anhydrase carbonic of Role al. et

, 777–782 (2010). 777–782 189, Genome Biol. Genome Nat. Protoc. Nat.

, 2522–2536 (2014). 2522–2536 25, Pattern Recognit. Lett. Recognit. Pattern Mol. Syst. Biol. Syst. Mol.

, 16771–16776 (2005). 16771–16776 102, , R47 (2005). R47 6, , 2366–2382 (2007). 2366–2382 2,

, 35 (2016). 35 7,

, 6617–6622 (2004). 6617–6622 101, Nucleic Acids Res. Acids Nucleic Lancet

, 370 (2010). 370 6, Am. J. Physiol. Cell Physiol. Cell Physiol. J. Am. Hum. Mol. Genet. Mol. Hum. , 1633–1635 377,

29 , 1684–1693 1684–1693 , Genome Biol. Genome

43,

PLoS One PLoS

14,

16,

5,

51. 50. 49. 48. 47. 46. 45. 44. 43. 42. 41. 40. 56. 55. 54. 53. 52.

Mol. Syst. Biol. Syst. Mol. super-resolution microscopy and particle averaging. particle and microscopy super-resolution metazoan actinome by phenotype. by actinome metazoan nuclear translocation of NF- of translocationnuclear , 170018 (2017). 170018 4, cancer.breast negative triple in YAP/TAZlocalisationand shape BMC Cell Biol. Cell BMC division. cell through inherited not is that cells between states chromatin , 435–446 (2009). 435–446 136, proteins. repair of accumulation allow to chromosomesdamaged on distinct nuclear bodies. nuclear distinct synthases in synthases cancer cells. cancer in imaging high-throughput using molecules small of map interaction focus. into Biol. Cell J. responses. stress yeast during plasticity proteome reveals platform pericentriolar material.pericentriolar of features organizationalhigher-order reveals centrosomes of interactionnetworkrequired spindlefor assembly. Barr, A.R. & Bakal, C. A sensitisedAscreench-TOG Bakal,RNAirevealsBarr,C.genetica & A.R. Sero, J.E. Sero, chemical-geneticA M. Boutros, Huber,W.& F.A., Klein, M., Breinig, Wawer,M.J. E. Karsenti, C. Doil, Srikumar,T. K.W.Fong, W.J.Nelson, & M.A. Simon, C.P.,D’Ambrosio,R.D., Toret,M.V.,Vale, foci Rad52 Bringing R. Rothstein, & M. Lisby, D., P.H.,Alvaro, Thorpe, screening single-cell novel A Schuldiner,M. & M. Gymrek, Breker,M., J.L. Rohn, Yang, W.Yang, Pascual-Vargas,P. D. Dickerson, A. Szymborska, imaging Subdiffraction Pelletier,L. & G.D. Gupta, M., Hasegan, S., Lawo, cadherin-mediated cell–cell adhesion. cell–cell cadherin-mediated for required hubs protein conserved identifies genome-widescreen A 655–658 (2013). 655–658 profiling. high-dimensionalmultiplexed using screeningphenotypic cell-based regulation. chromatin in roles PLoS Biol. PLoS RNF168 binds and amplifies ubiquitin conjugates ubiquitin amplifies and binds RNF168 al. et Regulation of meristem morphogenesis by cell wall cell by morphogenesismeristem of Regulation al. et Proc. Natl. Acad. Sci. USA Sci. Acad. Natl. Proc. Cell shape and the microenvironmentregulate the and shape Cell al. et , e1001177 (2011). e1001177 9, Comparative RNAi screening identifies a conserved core conserved a identifies screening RNAi Comparative al. et

Whole-genome screening identifies proteins localized to localized proteins identifies screeningWhole-genome al. et J. Cell Biol. Cell J. , 839–850 (2013). 839–850 200, A holistic approach to marine eco-systems biology.eco-systems marine to approach holistic A al. et Global analysis of SUMO chain function reveals multiple reveals function chain SUMO of analysis Global al. et Toward performance-diverse small-molecule libraries for libraries small-moleculeperformance-diverse Toward al. et Mol. Syst. Biol. Syst. Mol. nature metho ds Arabidopsis.

High resolution imaging reveals heterogeneity in heterogeneityreveals imaging resolution High al. et , 33 (2016). 33 17,

, 790 (2015). 790 11, Nuclear pore scaffold structure analyzed by analyzed structure scaffold pore Nuclear al. et RNAi screens for Rho GTPase regulators of cell of regulators GTPase Rho for screens RNAi al. et

Nat. Cell Biol. Cell Nat. , 665–667 (2011). 665–667 194, J. Cell Biol. Cell J. Curr. Biol. Curr. B in breast epithelial and tumor cells. tumor and epithelial breast in κB

, 846 (2015). 846 11, J. Cell Biol. Cell J. | O.4 NO.8 VOL.14 J. Cell Biol. Cell J.

, 10911–10916 (2014). 10911–10916 111, , 1404–1415 (2016). 1404–1415 26, , 149–164 (2013). 149–164 203,

, 1148–1158 (2012). 1148–1158 14, J. Cell Biol. Cell J.

, 145–163 (2013). 145–163 201,

, 789–805 (2011). 789–805 194, Sci. Rep.Sci.

, 265–279 (2014). 265–279 204, | AUGUST 2017 Science RESOURCE

5 , 10564 (2015).10564, 341,

Sci. Data Sci.

| Cell

781 © 2018 Nature America, Inc., part of Springer Nature. All rights reserved. f eeec iae ( images reference of concept Data Strategy by the Euro-BioImaging/Elixir defined ria by Aspera. Included data sets were according selected to the - crite ISA-TAB specifications formats to a format consistent tabular by inspired the MAGE and custom these Weconverted format. custom own its using each database, MySQL a or format HDF5 or PDF (CSV, XLS), sheets studies. other with integration and/or reanalysis reuse, for dates and candi resources to other as much as possible linked studies, that image data sets for publication should be related to published sets ( sets Data resource. Embassy EMBL-EBI the OpenStack-based within an contained cloud on roles re-usable with along playbooks Architecture and population of IDR. of population and Architecture O nature metho ds file using a custom script and imported into OMERO. Imaging Imaging OMERO. into imported and script custom a using file Formats n c r e N o Experimental and were metadata analytic submitted Experimental in spread w LI s s c Table / N o e p E u y 2 r M . 4 o o as a are Deployments foundation. managed by Ansible E - r 1 TH bi g ) were collected by USB ) or drive shipped were transferred collected ) was built using open-source OMERO open-source using built was ) oi O m D S a g i n g 21 - h e t , l 2 t i p 2 x and combined them into a single CSV : i / r / - w i m w a w g . e e - u d r a IDR ( IDR o t b a - i o s t i r m h a t a t e t g p g i y s n : ), which states states ), which g / / . i e d 1 u 7 r and Bio- and / . o c o p n e n t e m n t i - - - - Map Annotation facility Annotation Map OMERO’skey-value-based to copied were search and querying for valuable were that metadata set, data each For OMERO. by used store data tabular HDF5-backed an OMERO.tables, using stored were metadata analytic and using Experimental OMERO into Bio-Formats. imported were data binary and metadata queries, see see queries, of construction the on information more For require. they ties capabili querying and search the on depending API, OMERO the of parts different using accessed be can elements and types Code availability. availability. Code aa availability. Data I at available are files CSV single a into data open is com/openmicroscop sets data IDR the of source and available at metadata reading and IDR the available at available DR/idr-metadat ht Supplementary Note Supplementary tps://idr.openmicroscopy.or a . All data sets described in this paper are are paper this in described sets data All y All software for building and running running and building for software All . . The custom scripts used to combine meta https://github.co 2 3 . This means that different metadata metadata different that means This . . m/ID g . doi: R https://github.com/ and 10.1038/nmeth.4326 https://github - - .

AmenDments https://doi.org/10.1038/s41592-018-0169-x

Publisher Correction: Image Data Resource: a bioimage data integration and publication platform Eleanor Williams, Josh Moore, Simon W Li, Gabriella Rustici, Aleksandra Tarkowska, Anatole Chessel , Simone Leo, Bálint Antal, Richard K Ferguson, Ugis Sarkans , Alvis Brazma, Rafael E Carazo Salas and Jason R Swedlow

Correction to: Nature Methods https://doi.org/10.1038/nmeth.4326, published online 19 June 2017 This paper was originally published under standard Nature America Inc. copyright. As of the date of this correction, the Resource is available online as an open-access paper with a CC-BY license. No other part of the paper has been changed.

Published: xx xx xxxx https://doi.org/10.1038/s41592-018-0169-x

Nature Methods | www.nature.com/naturemethods