© 2014 Nature America, Inc. All rights reserved. stereotypy in cell behavior cell in stereotypy visualization and editing tools for efficient data curation. adjusting only two parameters across all data sets and providing on a single computer workstation and (iii) ease of use by 1 biology of goal developmental fundamental a is embryos, mouse and fly, zebrafish fruit as such organisms, complex in cells of behavior dynamic the Following neuroblast dynamics throughout an entire embryo. Drosophila melanogaster we performed the first cell lineage reconstruction of early across all species and imaging modalities. O with up to 20,000 cells per time point at 26,000 cells min (ii) scalability by analyzing advanced stages of development acquired with three types of fluorescencemicroscopes, sized image data sets of fruit fly,zebrafish and mouse embryos reconstructing cell lineages in four-dimensional, terabyte- accuracy and speed. We demonstrate its (i) generality by for the segmentation and tracking of cell nuclei with high biology. We present an open-source frameworkcomputational organisms multicellular is a central goal of developmental T Fernando Amat large-scale fluorescencemicroscopy data Fast, accurate reconstruction of cell lineages from Received role of differential gene expression in directing cell-fate deci cell-fate sions directing in expression gene differential of role development at the cellular level for several days several for level cellular the at development of recording are capable microscopes, fluorescence and confocal Germany. Correspondence should be addressed to F.A. ( dynamics across cell populations or even entire embryos entire or even populations cell across dynamics arelabels fluorescent employed to nucleus-specific cellular reveal of such complexity and size and complexity such of semiautomated or manual approaches to reconstructing cell lineages do not existing scale to data sets However, points. time of up comprising to tens of for of imaged cells thousands thousands Such imaging experiments easily generate terabytes of image data, function cell to history developmental linking of development of for understanding the morphogenesis of tissues and organs and tissues of morphogenesis the understanding for crucial is cells of divisions and movements positions, the struct Eugene W Myers Howard Hughes Medical Institute, Janelia Farm Research Campus, Ashburn, Virginia, USA. he he comprehensive reconstruction of cell lineages in complex ur approach achieves on average 97.0% linkage accuracy State-of-the-art live imaging technologies, such as light-sheet light-sheet as such technologies, imaging live State-of-the-art 8 , 9

, quantitatively analyzing mutant defects mutant analyzing quantitatively , 8

Janua r y; 12

accepted , 1 1 3 , , William Lemon . 2 & Philipp J Keller nervous system development, revealing

16 1

June; 1 1 8 and experimentally testing models models testing experimentally and . 1 , 2

. The ability to accurately . recon to The ability accurately published 1 , , Daniel P Mossing

online 1 U [email protected] sing our system,

20 7 1 , dissecting the the dissecting ,

0 1 July , determining determining , 4 . Frequently, Frequently, .

2014;

3 , 4 − CO , 15– 1

RR 3– 1 1 ) ) or P.J.K. ( 6 ECTED ECTED ONLINE 7 - - , , Katie McDole . , tula stages of more complex organisms such as the early zebrafish blas higher accuracy and orders-of-magnitude faster computation computation faster orders-of-magnitude and accuracy higher data local association rules to improvespatiotemporal results. use This divisionand of labor failed provides have might evolution contour where areas we flag Third, cases. in easy results and tracking segmentation good provides which evolution, contour ric paramet use we Second, supervoxels. generate to segmentation by low-level using and complexity size data we reduce First, ples. princi three on operates recon method Our lineages. accurate cell of and struction fast facilitates which nuclei, cell tracking system. the about knowledge spatiotemporal with density.object increasing However, they can model complex State-space models and data association methods do not scale well division. cell during even outlines, cell detailed provide they but slow to compute for thousands of three-dimensional (3D) objects, be can methods evolution contour example, For above. outlined challenges the of all solves none and weaknesses, and strengths tion (for example, level sets) level example, (for tion tracking can be divided into three main categories: contour evolu results. lineage alter fundamentally can errors few a because required, are robustness and accuracy high indis Fourth, is approach. computational scalability general any for thus pensable and large, are sets data Third, scope. micro the of depth penetration physical limited the of because specimen the across markedly varies quality image Second, iors. - and shapes complex behav different with cells of packed densely number large a comprises often specimen the i.e., complex, are image First, data stages. developmental in and advanced tracking exist. currently not does ment develop of stages later in lineaging cell automated accurate, for Caenorhabditis elegans as Caenorhabditis such organisms model small for data image such analyze to graph matching) graph atce filters) particle Here we present a new hybrid approach to segmenting and and segmenting to approach hybrid new a present we Here In general, In computational approaches general, to and cell segmentation Four major complicate challenges automated cell segmentation Automated computational approaches have been developed developed been have approaches computational Automated [email protected] 19 2 Max Planck Institute of Molecular andBiology Cell Dresden, Genetics, , 20 2 0 and the the and AU G UST 1 , , Yinan Wan 2014 na 2 2 Drosophila 20 and data association methods (for example, example, (for methods association data and ture methods

, ( 2 DETAILS ONLINE 3 . All of these approaches have individual individual have approaches these of All . ). embryos blastoderm 1 2 1 , , Kristin Branson , state-space models (for example, example, (for models state-space ,

| ); 1 ADVANCE ONLINE PUBLICATION

5 doi and for early developmental developmental early for and :10.1038/nmeth.303 16 , 2 0 . However, a method However,. method a Articles 1 ,

6 |

 ------

© 2014 Nature America, Inc. All rights reserved. for data visualization, editing and annotation. Scale bars, 5 bars, Scale and annotation. editing visualization, for data a into framework are output and segmentation tracking of the automated accuracy. ( improve to are applied rules heuristic erroneous; model might be GMM the sequential which in windows spatiotemporal local ( the in GMM. of objects the number control to used is ellipsoids) gray to compared green 3, (step detection division cell for A module 2). 1 and (steps time in forward propagated sequentially is GMM the tracking, cell For nucleus. same the to belong that supervoxels Methods and and Methods distance threshold, threshold, distance PBC The PBC. and watershed using data image the from calculated is tion agglomera (PBC) clustering persistence-based and watershed techniques using By threefold. least at by further size that data the threshold reduced intensity global conservative a applied ( merged. are regions more image as segmentation, the coarser the level, level, the and threshold intensity background global the provided: be ( supervoxels into a hierarchical representation of all possible partitions of the image selected to generate supervoxels. The higher the value of value the higher The supervoxels. generate to selected is model hierarchical the in segmentation which defines thereby and Figure Figure ( of data image terabytes time-lapse with recordings microscopy 3D fluorescence in nuclei cell track and segment accurately and We an developed automated computational pipeline to efficiently nuclei of cell tracking and segmentation Automated RESULTS auto our Weaugment also methods. existing to compared time  shown. ( shown. are data of 3D the slices 2D only simplicity, For data. image example on pipeline reconstruction lineage of cell the representation Schematic ric contour evolution ( evolution contour ric paramet using simultaneously tasks both perform to (GMMs) models mixture Gaussian with approach Bayesian sequential a developed We lineages. cell full recover to (tracking) time and parallelized. trivially is step processing this independently, processed be can point time each ate ( set of the initial supervoxels methods to efficiently curate the results and obtain error-free error-free obtain reconstructions. and results the curate editing efficiently and to methods visualization with pipeline computational mated tral nerve cord. nerve tral of behavior progenitors neural in the entire early cell and lineages the analyzed systematically study,we case a As different microscopy methods and for different model organisms. structs cell lineages from large-scale image data sets obtained with hl rtiig opooia information morphological retaining while magnitude of orders three approximately by complexity reduces voxels) of (instead unit image smallest the as supervoxels of use and each nucleus can nucleus, be represented by single multiple supervoxels. a The to belong all that space in voxels of set nected ( supervoxels into point time generality. and speed scalability, included goals design racy, our b Fig.

Articles ) A GMM (green ellipsoids) is fit to the gray-scale image data to group group to data image gray-scale the to fit is ellipsoids) (green ) A GMM | Second, we connected supervoxels in space (segmentation) (segmentation) space in supervoxels connected we Second, We show that our framework rapidly and accurately recon accurately and rapidly framework our that show We First, we partitioned the 3D image volume recorded at each each at recorded volume image 3D the partitioned we First, ADVANCE ONLINE PUBLICATION 25 1 , τ 2 1 and , at which the hierarchical representation is cut to gener to cut is representation hierarchical the which at , 6 a | ( ) A hierarchy of possible segmentations (from coarse to fine) fine) to coarse (from segmentations of possible ) A hierarchy Computational framework for nuclei segmentation and tracking. tracking. and segmentation nuclei for framework Computational Supplementary Supplementary Fig. 1 Supplementary Supplementary Software 1 upeetr Nt 1 Note Supplementary τ Fig. 1 Fig. , controls the merging of neighboring image regions regions image of neighboring merging the , controls a Fig. 1 Fig. ). Thus, only two parameters needed to to needed parameters two only Thus, ). b Fig. 1 Fig. and Online Methods), we created , , |

Supplementary Software 1 Software Supplementary Supplementary Fig. 2 Fig. Supplementary na ture methods a . sproe i a con a is supervoxel A ). ). ). In addition to high accu ). The intensity profile profile intensity The ). c ) The algorithm determines determines ) The algorithm 2 4 . In addition, we we addition, In . Drosophila τ ( d µ τ ) The results ) The results 1 m. > , Online Online , τ

2 ), ), ven

). ). As ------to each supervoxel using Otsu’s method using supervoxel each to estimation background local Weapplied GMM. a as described of each nucleus can be modeled as a Gaussian distribution in in distribution ( Gaussian 3D a as modeled be can nucleus each of the solution obtained for time point point time for obtained solution the propagated we independently, point time each at GMM a fitting of Moreover, instead voxels. of thousands of instead supervoxels memory efficiency memory and speed processing convergence, global improves it step: ond sec this in advantageous also is partition image supervoxel The nected, the cluster is flagged as a cell division candidate ( candidate division cell a as flagged is cluster the nected, con spatially not are but Gaussian same the into clustered supervoxels are two if partition: supervoxel the of advantage took sions in the image data. To maximize detection accuracy, we again - divi cell of occurrence sparse the by complicated is task this for of reconstruction cell However,lineages. high accuracy achieving change points. time cannot between abruptly nuclei of intensity and shape location, the that the imposed we points, t tional solutions tional computa efficient with problem estimation a is well-established a and data identity. to parent GMM image Fitting (shape) matrix to finding ten parameters per nucleus: its 3D center, 3D covariance parametric model reduces the segmentation and tracking problem The Methods). (Online fitting improve and mask segmentation + 1. Because of the temporal coherence between consecutive time d c b a Robust detection of cell divisions is a key requirement for the the for requirement key a is divisions cell of detection Robust upeetr Fg 3 Fig. Supplementary 1 Lineage 1 Data validation t 0 t −1 t Image data Fit Gaussianmixturemodelanddetectcelldivisions 2 8 Hierarchical supervoxeloversegmentation (Online Methods and Methods (Online 1 Manual datacurationandannotation 2 9 , as each Gaussian only has to cluster a few few a cluster to has only Gaussian each as , Incorporate temporalcontext 2 t 0 a priori a Cell lineageediting Lineage 2 1 Supervoxels (� ), and thus each image volume is is volume image each thus and ), 2 2 knowledge (Bayesian approach) approach) (Bayesian knowledge 3 2 1 1 Supervoxels(� ) t to the next time point, point, time next the to Supplementary Note 1 Supplementary 3 2 Initialize twith−1 Detect divisions Fit model 7 to tighten the nuclear nuclear the tighten to 3 3 Data annotation Cell type2 Cell type1 2

)

Fig. 1 Fig.

Automated steps Automated Manual steps Manual ). ). b - - -

© 2014 Nature America, Inc. All rights reserved. ratio. Scale bars, 50 50 bars, Scale ratio. signal-to-noise SNR, results). segmentation automated (green, challenge analysis image of type a different representing a nucleus on ( (green). a reference as shown is distance n 9, (stage 120 point time at average on occurs precursors of neural internalization and 6), 0 (stage point time at starts Gastrulation from time point 250 (stage 10, 10, (stage 250 point time from cells backtracking by gray, mean; measured s.d.) (black, tracks cell uncurated corresponding and ( 40. point time at starts elongation band germ Rapid 25th and 75th and 25th gray, median; (black, lineages cell of neuroblast hypotheses. The hierarchical segmentation structure (Online (Online and structure Methods segmentation hierarchical The hypotheses. linkage and segmentation multiple consider to tracking cell to approaches association data used we window, local each Within informa access. have not does model GMM sequential incorporate the which to tion to essential are ( occurred had division cell true a whether determine to object flagged each around window spatiotemporal local a analyzed 235 neuroblasts. ( neuroblasts. 235 for (green) centroids nuclei detected manually and automatically between distance Euclidean normalized average and (black) accuracy ( extension. band germ during movements cell show and long points time 270–450 are tracks The 10). (stage AEL h 4.9 at images with superimposed neuroblasts S1 (right) dorsal and (left) ventral of eight ( posterior. p, anterior; a, 6–11). stages h AEL, (3–7 points time 501 comprises ( origin lineage blastoderm indicates that code a color using 5 min) last (lines, movements and (spheres) positions of cell reconstruction Drosophila 9 stage (His2Av-mRFP1) of a nuclei-labeled recording of a SiMView projections intensity Drosophila and and 2 Figure corrected false track termination and wrong linkages in the the in linkages wrong ( GMM and sequential termination track false and detections corrected background identified that rules heuristic tion associa data incorporated we computa in particular, In complexity. increase an tional avoiding while volume image the of to regions us challenging in allowed dynamics cellular reconstruct strategy accurately This solutions. possible all enumerate cence microscopes (a custom SiMView light-sheet microscope (a custom SiMView light-sheet microscopes cence zebrafish and mouse embryos) and three different types of fluores across algorithm three tracking different model fly,systems (fruit We of the performance assessed the automated and segmentation pipeline lineaging cell automated of the Performance Note 2 Supplementary f S Fig. 1 Fig. = 295). The average nuclei neighbor (NN) (NN) neighbor nuclei average The = 295). ) Orthogonal image slices, each centered centered each slices, image ) Orthogonal upplementary Video 2 Video upplementary upeetr Tbe 1 Table Supplementary c | ). These spatiotemporal windows windows spatiotemporal These ). Automated cell lineaging in lineaging cell Automated embryo. Right, automated automated Right, embryo. ( embryos. e ) Average distance between curated curated between distance ) Average percentiles; percentiles; upeetr Fg 1 Fig. Supplementary d ) Average error-free length length error-free ) Average µ a Supplementary Fig. 4 Fig. Supplementary m m ( ) Left, maximum- ) Left, c ) Average linkage linkage ) Average a ). The reconstruction reconstruction The ). , ). n b = 262 on average). average). on = 262 ); 10 10 ); n = 241 cells). cells). = 241 b µ ) ) Tracks ). We then then We ). m m ( f ).

) is crucial to efficiently efficiently to crucial is ) -

, Online Methods and and Methods Online , b a c f yz xy SiMView data Ventral S1neuroblasttracks t Accuracy and distance 0.2 0.4 0.6 0.8 1.0 0 Superficial 0 0 0 400 300 200 100 0 Time pointindex Normalized distance Linkage accuracy 1 xz 6 - - - ,

t +1 the background intensity threshold and the PBC threshold threshold PBC the and threshold intensity background the in we Moreover, curated 116,820 data points across different developmental stages microscopes. and systems model all for metrics accuracy generate to points data truth ground 42,947 of total a annotated users Seven Methods). (Online microscope) confocal and LSM a 710 Zeiss Carl Z.1 microscope Lightsheet a Zeiss Carl complementary metrics to evaluate different types of errors errors of types different ( evaluate to metrics complementary Drosophila section this in ( presented results the obtain to sets data across and and ( challenges computational associated the and data image 1 Videos Supplementary normalized this distance by the nearest neighbor distance to to distance neighbor nearest the by distance we this Second, normalized accuracy. segmentation of nuclei measure a marked as centroids automatically and manually between distance Supplementary Software 1 Software Supplementary upeetr Nt 3 Note Supplementary Cell division Elongated First, we analyzed performance on the SiMView data set of of set data SiMView the on performance analyzed we First, Drosophila Supplementary Videos 5 5 Videos Supplementary Dorsolateral view d

mroi dvlpet n eal ( detail in development embryonic Error-free length Ventral view 100 200 300 Dorsal view . We adjusted only two parameters of the pipeline, pipeline, the of parameters two only We . adjusted 0 t +1 0 0 300 200 100 0 na ture methods Low SNR Time pointindex Reconstruction Dorsal S1neuroblasttracks ). First, we reported the Euclidean Euclidean the reported we First, ). – 6 and and ). Because of the complexity of the the of complexity the of Because ). t | ADVANCE ONLINE PUBLICATION Supplementary Fig. 5 Fig. Supplementary and Crowded Change inshapeandtexture e 6 Distance (µm) 10 20 30 ) 0 3 0 200 100 0 0 , we used several several used we , Mean NNdistance Ground truthdistance Time pointindex t +1 Articles i. 2a Fig. Dorsolateral view Uneven texture a p a oslVentral Dorsal ). – Fig. 2 Fig. e and and |

τ  f ,

© 2014 Nature America, Inc. All rights reserved. linearly with cell counts ( counts cell with linearly scales time time. Computation computation ( ( in as bars (error distance (NN-norm.) nuclei nearest-neighbor local the to distances ( intervals). confidence percentile 95th and 5th bars, error 1,202; 1,219; 1,206; 1,246; 1,369; 1,210; 1,469; 2,416; 2,655; 1,375; 841; 740; 715; 1,594; 1,327; 1,998; 1,744; 1,365; 1,330; n right: to left (from centroids nuclei detected manually and automatically between distance center/periphery. ( center/periphery. blastoderm c/p, origin. lineage blastoderm the indicates code color gradient radial The stage). sphere the from points time (801 gastrulation late during embryo zebrafish (H2B-eGFP) of a nuclei-labeled recording microscopy 100 100 (E) 6.25). ( 6.25). (E) day embryonic from points time (26 embryo mouse (eGFP)) GFP (H2B–enhanced labeled of a nuclei- recording microscopy SiMView from 50% epiboly). ( epiboly). 50% from points time (101 gastrulation early during embryo zebrafish (H2B-mCherry) labeled of a nuclei- recording microscopy confocal of internalized cells and fast cell movements movements cell fast and cells internalized of crowded included populations scenarios challenging Particularly tors influencing tracking accuracy (Supplementary Figs. 7 6 Fig. Supplementary ( embryo the across varied generally movements cell 1 Tables ( points time for all 95% above was accuracy age link pairwise average the and radius, nucleus the of 50% below points. The average Euclidean distance was time consecutive in correct assignments of linkage fraction the accuracy determining by tracking evaluated we Third, 26,000 cells min cells 26,000 is speed processing average The regression). account for bias toward oversegmentation. Figure  reconstructed. ( reconstructed. were 5–11) (stages points time 501 colors. random using points) time ten last (lines, movements and (circles) positions cell Drosophila 6 stage (His2Av-mRFP1) a nuclei-labeled of half of ventral the recording microscopy Z.1 Lightsheet of a 4-h projection intensity ( microscopes. various using embryos mouse and fly fruit zebrafish, Fig. 6 6 Fig. ages ( ages mistakes do not propagate indefinitely, as the pipeline is capable capable is pipeline the as indefinitely, propagate not do mistakes ( points time two next the to aver age on propagated that mistake segmentation a by separated often were points, time 122 of length average an with segments, which we encountered no or segmentation errors. tracking These g h f = 1,211; 1,718; 2,408; 2,008; 2,041; 1,436; 1,436; 2,041; 2,008; 2,408; 1,718; = 1,211; Articles ) As in ) As in ) Average linkage accuracy (error bars as in bars ) (error Average accuracy linkage ) Scatter plot of cell count versus versus count of cell plot ) Scatter | We also measured the average error-free length of cell line cell of length error-free average the measured also We ADVANCE ONLINE PUBLICATION µ m m ( Fig. 2 Fig. 3 and and | e b Automated cell lineaging in lineaging cell Automated – embryo. Right, reconstruction of reconstruction Right, embryo. but normalizing individual individual normalizing but , d d 3 7 ). ) As in ) As in ). Image quality, cell density and the magnitude of of magnitude the and density cell quality, Image ). d ). ), i.e., the number of consecutive time points over over points time consecutive of number the i.e., ), b −1 ) As in ) As in . Scale bars, 50 50 bars, . Scale e a ) Average Euclidean Euclidean ) Average c but for a 13.4-h SiMView SiMView a 13.4-h for but ) As in ) As in a ) Left, maximum- ) Left, a R but for a 3.4-h a 3.4-h for but ) and constituted the most important fac important most the constituted and ) 2 = 0.90, linear linear = 0.90, a but for a 2-h a 2-h for but µ m m ( |

na a Supplementary Fig. 9 Fig. Supplementary

,

c ture methods ); );

e

e ). ). ). ).

d g e a Supplementary Supplementary

Supplementary Supplementary Linkage accuracy 0.5

Euclidean distance ( m) view Animal

µ Fruit fly(LightsheetZ.1) 1 3 0 2 4 0 1 Zebrafish (SiMView)

Fig. 2 Fig. 30 30 73 64 00 90 95.4% 99.0% 90.0% 96.4% 97.3% 80 80 Zebrafish LSM710 Zebrafish SiMView Mouse SiMView Fruit flyLightsheetZ.1 Fruit flySiMView 130 130 Microscopy ). Such Such ). Fruit fly SiMView and 180 180 e 230 230 stage 6 and and 280 280 8 90 90 ). - - - - 140 140 Time pointindex Time pointindex

190 190 240 Fruit fly Lightsheet Z.1 240 290 290 2,458 cells of self-correction ( self-correction of over time. over rotation specimen and contrast image low sampling, temporal low cells, crowded the of because case challenging exceptionally ( set data Drosophila during example, (for points time of hundreds over micrometers tance of the cell identity,true even when moved cells hundreds of dis neighbor nearest median the within stayed errors struction 3 Note age accuracy was between 90% and 99% ( 99% and 90% between was accuracy age centroids ( determined radius nuclear the below automatically was and manually between 1 Videos systems and ( microscopes 340 340 Onset ofsegmentationperiod view Vegetal A comparative analysis of the data sets for all biological model model biological for all sets of data the analysis A comparative 0 0 12 Mouse SiMView 12 Reconstruction 24 24

). Moreover, the majority of cell tracks affected by recon by affected tracks cell of majority Moreover, the ). 100 100 200 200 300 300 – i. 3 Fig.

germ band extension; extension; band germ 400 Zebrafish SiMView 400 23 500 500 600 600 Microscopy ) showed that the average Euclidian distance distance Euclidian average the that showed ) 700 700 b c

c 30 30

and and 60 Zebrafish LSM 710 60

Supplementary Fig. 6f Fig. Supplementary 90 90 h f Zebrafish (LSM710) 19,941 cells Processing time (s) NN-norm. distance view Animal Mouse (SiMView) 0.6 0.2 0.4 upeetr Vdo 16 Videos Supplementary 50 60 10 20 30 40 0 0 0.5 0 Figs. Figs. 2a 30 Zebrafish LSM710 Zebrafish SiMView Mouse SiMView Fruit flyLightsheetZ.1 Fruit flySiMView

80 Zebrafish LSM710 Zebrafish SiMView Mouse SiMView Fruit flyLightsheetZ.1 Fruit flySiMView

130 shield stage Microscopy Microscopy Fig. 3e Fig. 180

Fig. 2b Fig. 230 and

Cell nucleicount( 280 E6.25 90 p c 140 3a Time pointindex ,

f 190

), and the average link average the and ), 240 , 2,900 cells 874 cells – e

1.0 290 d ). and and view Vegetal Fig. 3 Fig. 340 and 0 12

×10 24 Supplementary Supplementary Supplementary Supplementary 100 g

4 200 ). The mouse mouse The ). ) 1.5 – Reconstruction Reconstruction

300 Reconstruction 18 400 500 ) was an an was ) 600 700 30

2.0 60 90 - - -

© 2014 Nature America, Inc. All rights reserved. full data range; range; data full data points for the cell lineage data set analyzed in in analyzed set data lineage cell the for points data ( set. data the in errors of all of 97% aware user the makes it inspection, manual for points data of all 15% only flags system guidance error the Although system. guidance error the by flagged nodes validating manually when corrected are that set data uncurated the in errors and errors reconstruction by affected potentially nodes as system guidance error the by flagged points data set, data the curate fully to user the by edited points data set, data curated the in points of data number the right) to left (from showing plot ( using the manual curation framework. The user curated a data set with 116,820 data points within 100 h, including the initial training phase. ( phase. training initial the including h, 100 within points data 116,820 with set a data curated user The framework. curation manual the using cells min cells scaling of computation time with the number of cells tracked tracked cells of ( number the with time computation linear of a in scaling and resulted platforms CPU (GPU) unit multicore processing on graphics parallelization and approach hybrid Figure of data points that need to be validated. Our computational computational Our validated. be framework to automatically outputs a confidence score for need each data that points data of in approximately 100 h, including the training initial phase. formed validation the andfull curation of the data 116,820 points curation rate of 1,395 data points per hour ( sustained a achieved and efficiency maximum reached user the ( metrics performance curation actions were recorded in the database and used to estimate ( key datasection next the in early of tion ( dle multi-terabyte data sets with more than 10 million data points in study,this performed its to ability thereby demonstrating han data and from curate reconstructions all lineage the annotate cell annotation data during ( constraints logic biological temporal enforcing and for slicing image multiple orthogonal with channels, images color time-lapse 3D of handling data adding stantially shorter than the image acquisition time ( time acquisition image the than shorter stantially - sub thus was time computation the scenarios, presented all In A Croa Jnla am proa communication) personal Farm, Janelia Cardona, (A. sets data microscopy electron large in tracing neuron for form plat Data) Image of Amounts Massive (Collaborative for Toolkit CATMAID Annotation the extended we purpose, this For ( pipeline reconstruction our in editing and inspection data for manual a module we integrated reconstructions, lineage To error-free that of demand applications requirements the meet results of lineaging curation manual and Visualization 10 Fig. Supplementary a n Supplementary Fig. 11 Fig. Supplementary Fig. 3 Fig. upeetr Sfwr 2 Software Supplementary = 40,310 operations). Confirmation actions are not shown. ( shown. not are actions Confirmation operations). = 40,310 We then evaluated the processing speed of our pipeline. The The pipeline. our of speed processing the evaluated Wethen A novice user was able to curate the entire cell lineage reconstruc We also developed an approach for reducing the overall number Deleted nodes Frequency 0.2 0.4 0.6 0

4 Added nodes h | ). We measured an average processing speed of 26,000 26,000 of speed processing average an measured We ). Manual data curation and annotation. ( annotation. and curation data Manual −1 Moved nodes on a single computer workstation (Online Methods). Methods). (Online workstation computer on a single

Drosophila Added links n Confirm Node Confirm Deleted links Supplementary Videos 24 Videos Supplementary = 16,330, = 16,330, ). nervous system development presented presented development system nervous b ). Time (s) 100

Confirm node 20 40 60 80 0 ). We used this module to store, store, to module this used We ).

Fig. 4a Fig. Move node n Move Node Move Add node , b

). Within a 24-h period, period, 24-h a Within ). Add link = 2,647, = 2,647,

Fig. Fig. 4 Delete link a ) Frequency of editing actions when correcting cell lineages with the manual curation framework framework curation manual the with lineages cell correcting when actions of editing ) Frequency d when using the error guidance system to sort nodes on the basis of curation priority. of curation basis the on nodes sort to system guidance error the using when – c 28 n ). ). The user per Add Node c

). All editing editing All ). Number of curated cells Fig. 3 Fig. (×104) 12 b 0 4 8 ) Average time per editing action (red, median; blue, 25th and 75th percentiles; black, black, percentiles; 75th and 25th blue, median; (red, action editing per time ) Average Fig. 1 Fig. 0 = 25,498, = 25,498, 31 h , 3 and and 20 2 Training phase by by d Editing time(h) e ). ). - - - - ) Total fraction of corrected errors versus the percentage of manually curated curated of manually percentage the versus errors of corrected ) Total fraction 40

n 1,395 data points/h Add Link - ( neu 295 of roblasts divisions and tracks reconstructed, cell the we analyzed and framework, curated our Using impossible. nically cell. glial a and neuron a or neurons two generate to symmetrically divides turn in which (GMC), cell mother ganglion a produce and self-regenerate to that are the most likely to be affected by errors ( point (Online Methods) and thereby guides the user to data points we performed a cell lineage reconstruction of early early of reconstruction lineage cell a performed we framework, lineaging cell our of capabilities the To demonstrate C sion for S2 neuroblasts ( neuroblasts S2 for sion divi one and neuroblasts S1 for divisions two through time in forward and blastoderm the in location original their to time in resolution blastoderm fate map for almost all S1 neuroblasts neuroblasts S1 all almost for map fate blastoderm resolution and prises 116,820 fully validated data points ( Methods). Online ing blastoderm cells by lateral inhibition and delaminate from from sheet cell the surrounding delaminate and inhibition lateral by cells blastoderm ing neighbor from differentiate precursors, neural the Neuroblasts, hours after egg laying (AEL; (AEL; laying egg after hours h 4.4 at array stereotypic a in arrangement their of basis the on neuroblasts identify to module visualization data our used We neuroblasts). (S1 wave delamination first the in neuroblasts all system reconstruction ( reconstruction system automated the in errors all of 97% correct to required was points data all of 15% only of curation and tion inspec system, guidance error by this flagged points on ing data nervous system development ( development system nervous from earlier studies earlier from ( array neuroblast the on subtypes the of within basis neuroblast relative positions their ( ment in data curation speed when using the error guidance system 60 n ell lineage reconstruction of the early nervous system nervous early of the reconstruction lineage ell Comprehensive tracking of neuroblasts has so far been tech been far so has neuroblasts of tracking Comprehensive To com which our our reconstruction, cell knowledge, lineage = 2,332 data points, two annotators) (Online Methods). (Online annotators) two points, data 2,332 = = 2,044, = 2,044, 0100 80 upeetr Dt 1 Data Supplementary upeetr Vdo 25 Videos Supplementary d

n Ground truth nodes Delete Link Delete Cell count (×104) 12 na 0 4 8 ture methods 116,820

3 Total errors 6 = 4,806). ( = 4,806). . We followed these neuroblasts backward backward neuroblasts these Wefollowed . Supplementary Fig. 12 Fig. Supplementary Flagged nodes 4,982 Fig. 4d Fig. Corrected errors Fig. 5a Fig. 34 17,470 , 3 5 Fig. 5 Fig. 4,819 . These cells divide asymmetrically asymmetrically . divide cells These , rvds h frt single-cell– first the provides ), Supplementary Videos 24 Videos Supplementary , c e ) Curation speed of a novice user user of a novice speed ) Curation | , ). ). We a improve 28% measured ADVANCE ONLINE PUBLICATION b , , e a Supplementary Fig. 13 Fig. Supplementary

). We further annotated ten ten annotated We further ). Fraction of corrected

– errors 1.0 0.2 0.4 0.6 0.8 27 Supplementary Video 28 0 0 ), including 92.4% of of 92.4% including ), Percentage ofcuratednodes Drosophila 3 ) using definitions definitions using ) Fig. Fig. 4 Articles 6 d ). ). By focus Drosophila 9 nervous nervous d ) Bar ) Bar – 215 12 28

and and ) | 3

3  ------.

© 2014 Nature America, Inc. All rights reserved. smoothing ( smoothing Gaussian through achieved is Continuity posterior). to anterior (from band germ the along position neuroblast The offset. midline The h AEL. 4.4 at array neuroblast the in code a color using precursors neural all for measured behavior ( blastoderm. the in position (AP) anteroposterior for code a color using learning classifier using only morphodynamic features ( features morphodynamic only using classifier learning y the and axis, a-p the along neurectoderm The line, midline). dashed (right; h AEL 4.4 at array neuroblast the to (left) stage blastoderm the from time in forward propagated is code color The blastoderm. the in position (DV) dorsoventral for code a color using precursors of neuroblast maps fate cell–resolution Depth is shown separately for the left ( left the for separately shown is Depth second neuroblast division ( division neuroblast second band. Scale bars, 50 50 bars, Scale band. (purple to yellow, 2.9–5.4 h AEL). ( h AEL). yellow, 2.9–5.4 to (purple n right: to left from range; data full black, percentiles; 75th and 25th blue, median; (red, development system nervous of early reconstruction lineage cell the in annotated types neuroblast different ten for division cell second 180° and 270° indicate anterior, lateral, posterior and medial orientations, respectively. ( respectively. orientations, medial and posterior anterior, lateral, indicate 270° and 180° feature importance indicates a strong effect on the prediction accuracy of the machine learning classifier. Int., internalization; Div., division. internalization; Int., classifier. learning of machine the accuracy prediction the on effect a strong indicates importance feature of level A high movement). cell magenta, orientation; division cell blue, timing; division cell (green, types cell ( diagonal. the along of 1 level a probability by indicated is accuracy prediction Perfect rows. in shown are types neuroblast predicted and analyzed cell division orientation cell division analyzed Figure 6 Figure a in Figure  ( reconstruction lineage cell the in included tracks cell precursor neural of all views ventral ( neuroblast. NB, AEL). (h points Drosophila early early posterior end. posterior neuroblast array ( array neuroblast neuro the and blastoderm ventral the in regions the genic between pres ervation neighborhood perfect almost an two-dimensional (2D) ectoderm (2D) two-dimensional the from structure 3D of creation the in stage early an are sions ( and oe 4 Note ae i cl lyr formation layer cell in cated a n a origin marks its ventral end. ( ventral its marks origin = 20, 22, 24, 18, 21, 21, 20, 20, 10). ( 10). 20, 20, 21, 21, 18, 24, 22, = 20,

Articles ) Location of neural precursors in the in precursors of neural ) Location = 235). Tracks are color coded for time time for coded color are Tracks = 235). | Division interval (min) Wealso analyzed neuroblast movements and divisions ( ADVANCE ONLINE PUBLICATION 30 35 40 45 x 6a Drosophila origin marks the center of the ventral of ventral the center the marks origin Drosophila 5 , . oto oe te el iiin nl hs en impli been has angle division cell the over Control ). | | 7−1 b Cell behavior of neural precursors in the the in precursors of neural behavior Cell of the reconstruction lineage Cell embryo at the indicated time time indicated the at embryo , , Neuroblast type 5−6 Supplementary Figs. 14 14 Figs. Supplementary σ

= 0.1 rad). ( rad). = 0.1 5−2 θ z ˆ embryonic nervous system. system. nervous embryonic points to the end of end germ the to points 5−3 average the marks origin

3−5 reveals map This embryo. θ Fig. 5c Fig. origin marks the embryo embryo the marks origin 7−4 µ f ) Feature importance extracted from the machine learning classifier predicting neuroblast neuroblast predicting classifier learning machine the from extracted importance ) Feature 2−5 m (

1−1 a c ); 20 20 );

) Histogram of cell division angle, angle, division of cell ) Histogram 3−2 n , e d = 184). ( = 184). ) Features of cell of cell ) Features ). d µ b ) As in ) As in b m m ( Depth at division (µm) and ) Lateral 3 c 10 15 20 7 ) ) Single- | b ad h frt erbat divi neuroblast first the and ,

39 ). 3 na 2− 1 0 −1 −2 e n 8 c , ) Confusion matrix for neuroblast type prediction based on a machine a machine on based prediction type neuroblast for matrix ) Confusion = 147) and right ( right and = 147) . We therefore systematically systematically Wetherefore .

b 4 but but ture methods and 0

) Average depth of neuroblasts in the embryo during their first division. division. first their during embryo the in of neuroblasts depth ) Average relative to embryo the local

Left Right

15 � - - (rad)

and and Drosophila a e c −150 −150 Fig. Fig. Supplementary Supplementary −50 z (µm) y (µm) 50 NB array y Blastoderm 0 0 0 nenlzto ie( E)Pathtodivision(µm) Internalization time(hAEL)

3− 1012 1 0 −1 −2 −3 left right n 100150 0 −150 V D V D 5 = 148) sides of the ventral nerve cord as a function of a function as cord nerve of ventral the sides = 148) 3.5 x 2.9 h Blastoderm e embryo. ( embryo. Blastoderm DVcode ψ ). Annotated neuroblast types are arranged in columns, columns, in arranged are types neuroblast Annotated ). etright left , for the first neuroblast division ( division neuroblast first the , for 180 c Figs.5e 4.0 � (rad) x (µm) � Division angle� 4.5 a ) Time interval between the first and first the between interval ) Time 270 (medial) - -

90 Lateral blasts (polar angle angle (polar blasts neuro self-renewed the cells, sister their to relative lateral more the newly formed GMCs to come to rest not only deeper but to GMCs come to also formed rest not only deeper newly the and surface Our the results midline. embryo show a preference of mean mean nervous system exhibit distinct signatures in their dynamic cell cell dynamic their in signatures distinct exhibit system nervous 4.4 h 5.4 h 2.9 h −50 −50 We then investigated whether different cell types in the early early the in types cell different whether investigated Wethen 50 z (µm) 50 z (µm) 0 0 3− 1012 1 0 −1 −2 −3 3− 1012 1 0 −1 −2 −3 1 (°) z 0 0 700 500 300 ± s.d., s.d., x 0 d � (rad) � (rad) ) Same as as ) Same n 180 d NB array = 415; 415; = 4.4 h n = 231). 0°, 90°, 90°, 0°, = 231). Division angle� η c 270 (medial) but for the for but = 28 28 = Fig. 6c Fig.

y (µm) Ventral −150 −150 z (µm) d −50 90 50 Drosophila –180 0 0 0 5

3− 1012 1 0 −1 −2 −3 left right 100 −150 ± A

Division angle�

2.9 h Blastoderm

Blastoderm APcode 15°, azimuthal angle angle azimuthal 15°, ,

d 9 900 –90 2 b (°) and and

� (rad) x (µm)

0

Supplementary Fig. 14 Fig. Supplementary 2 f e Predicted NB type B (°) 180 150 MP2 3−2 1−1 2−5 7−4 3−5 5−3 5−2 5−6 7−1

Feature importance (a.u.) P 0 −50 −50 z (µm) z (µm) 50 Int. time 50 Probability of B given A Probability ofBgiven 0 7−1 0 3− 1012 1 0 −1 −2 −3 3− 1012 1 0 −1 −2 −3 Div. 1 time Annotated NBtypeA 5−6 Div. 2 time Resting depth ( 02 040 30 20 10

Div. interval 5−2 ψ

Div. angle �1 5−3 271 = 0.5 � (rad) � (rad)

Div. angle 3−5 Ventral view �1 Lateral view Div. angle �2 7−4 Div. angle �2 2−5 NB array ± µ Depth at div. ).

1−1 4.4 h m) Resting depth 3−2 65°, Path to div. MP2 1 ­ © 2014 Nature America, Inc. All rights reserved. tion used for manual neuroblast annotation. neuroblast manual for used tion informa position spatial the to access without predictions these made Notably, models the (10%). at random identity cell correct the of assigning probability the than higher to tenfold are sixfold online online version of the pape Note: Any Information Supplementary and Source Data files are available in the v and are Methods references any in available the associated M ( computation time that scales nonlinearly with the number of cells a require and/or stages developmental complex more in regions populated densely segment correctly to fail methods these that Wefound density. cell low and cells thousand few a to up with stages, developmental early for designed been have algorithms ( microscopy fluorescence in lineaging cell for methods state-of-the-art faster than and accurate more is pipeline our as speed, for accuracy nelia.org/lab/keller http://www.ja at download for available freely and source open is software All and editing. data visualization we for tion, tools provide efficient addi In non-experts. by use to easy thus is and only parameters of two adjustment requires framework the Third, counts. cell with linearly and is computation workstation needed, scales time computer single a only Second, microscopes. fluorescence and systems model biological tracking different across robust is and performance segmentation its First, strengths. main three has study this in presented framework lineaging cell automated The DISCUSSION for 3–2, 79% for 1–1 and 67% for 2–5 neuroblasts ( (Fig. 6 divisions cell of orientation and timing the about primarily tion predicted with high from accuracy their behavior: using informa found that four of the ten annotated neuroblast cell types could be we Methods), (Online models learning behavior. Using machine opment of computer models of embryonic development. embryonic of models computer of opment devel the for data crucial provide and species across and within reconstructions lineage cell of comparisons quantitative enable information, tracking cell requiring investigations of accuracy and the speed enhance here will presented framework putational of vertebrates and higher invertebrates. We envision that the com plans building developmental the reconstructing of goal mental funda the to closer us brings organisms multicellular complex behavior. cell morphodynamic reconstructed automatically of basis the on identity type cell of predictions real-time for as well as level, single-cell the at lation manipu optical for used be could embryo developing the in ing For experiments. cell example, interactive track design real-time scopes’, in which cell real-time lineage are reconstructions used to in real time. This opens up the of prospect ‘smartbuilding micro tracking cell and segmentation image for suitable principle in is exceeds the speed of image acquisition means that our framework Supplementary Note 5 Supplementary ersion of the pape the of ersion ethods Our performance results suggest that we did not sacrifice sacrifice not did we that suggest results performance Our The ability to efficiently perform system-level cell lineaging in lineaging cell system-level perform to efficiently ability The pipeline presented the of speed processing the that fact The f), we obtained prediction accuracies of 100% for MP2, 90% Supplementary Tables 4 Tables Supplementary r . r . ). -lab / . – 6 ). In general, previous previous general, In ). Fig. Fig. 6 e ), ), which online online ------This This work was supported by the Howard Hughes Medical Institute. 16. 15. 14. 13. 12. 11. 10. 9. 8. 7. 6. 5. 4. 3. 1. COM all authors. data with input from K.B. F.A. and P.J.K. wrote the manuscript with input from W.L. performed the zebrafish imaging experiments. F.A. and P.J.K. analyzed the system development. K.M. performed the mouse imaging experiments. Y.W. and D.P.M. curated and analyzed the reconstruction of early input from P.J.K. and K.B. W.L. performed the the cell lineaging framework and performed the cell lineage reconstructions with P.J.K. and F.A. conceived of the research with input from E.W.M. F.A. developed A Tg( (Institute of Science and Technology Austria) for kindly providing the (Carl Zeiss) for supporting the Lightsheet Z.1 experiments; and C.-P. Heisenberg help executing the Lightsheet Z.1 experiments; S. Olenych and O. Selchow developing SiMView live imaging assays; C. Akitake (Carl Zeiss) for her generous the CAG-TAG1 transgenic mouse strain; A. Denisin for her outstanding help development, helpful discussions about mouse embryo culturing and providing for their generous help in exploring imaging assays for mouse embryonic Drosophila M. Schroeder, H. Lacin, J. Truman and T. Lee for helpful discussion about the the microscopy data sets; the development team for help using Ilastik; A. Pavlopoulos for invaluable contributions to ground truth annotations of for help with modifying the open source code of CATMAID; R. Chhetri and We thank A. Cardona and the participants of the Janelia CATMAID hackathon Ackno 2. R The authors declare no competing financial interests. com/reprints/index.htm eprints and permissions information is available online at UTHOR β P -actin:H2B-mCherry) -actin:H2B-mCherry) and Tg( PLoS Biol. PLoS cells. endoderm visceral anterior of migrationordered the facilitate in pathway. polarity cell planar Four-jointed Proc. Natl. Acad. Sci. USA Sci. Acad. Natl. Proc. (2010). resolution. cell at lineaging andreconstruction after noisy Shh signaling. Shh noisy after imaging. cell live high-throughput 563–578 (2013). 563–578 light-sheet microscopy.light-sheet multiview simultaneous with embryos developing entire of imaging insights. microscopy.light-sheet using organisms complex in (2014). doi:10.1002/mrd.2225 8 microscopy.light-sheet using function and migration. of analyses microscopy. sheet light scanned by development embryonic early zebrafish of 49 Tomer, R., Khairy, K., Amat, F. & Keller, P.J. Quantitative high-speed Keller,P.J.Quantitative & F.Amat, K., Khairy,Tomer, R., Z. Bao, biological and advances technologicalmorphogenesis: Keller,P.J.Imaging F.Xiong, G. Trichas, reconstructionslineage cell comprehensiveKeller,P.J.Towards & F.Amat, M. Held, X. Liu, Murray,J.I. development system nervous of imaging Keller,P.J.Live W.C.& Lemon, F.Bosveld, R. Fernandez, Dynamic A. Stathopoulos, & W.,Fraser,S.E. Supatto, A., McMahon, Reconstruction E.H.K. Stelzer, & J. Wittbrodt, A.D., Keller,P.J.,Schmidt, biology.systems in Imaging Fraser,S.E. & S.G. Megason, Khairy, K. & Keller, P.J. Reconstructing embryonic development. embryonic Keller,P.J.Reconstructing & K. Khairy, cellular resolution in resolution cellular 784–795 (2007). 784–795 ETING w , 488–513 (2011). 488–513 , C. elegans. C.

ledgments CONTRI nervous system; S. Srinivas and T. Watanabe (University of Oxford) F Analysis of cell fate from single-cell gene expression profilesexpression gene single-cell from fate cell of Analysis al. et Automated cell lineage tracing in tracing lineage cell Automated al. et IN CellCognition: time-resolved phenotype annotation in annotation phenotype time-resolved CellCognition: al. et Science Specified neural progenitors sort to form sharp domains sharp form to sort progenitors neural Specified al. et 10 Science Multi-cellular rosettes in the mouse visceral endoderm visceral mouse the in rosettes Multi-cellular al. et A Mechanical control of morphogenesis by Fat/Dachsous/ by morphogenesisof control Mechanical al. et B gastrulation provide insights into collective cell collective into insightsprovide gastrulation Drosophila Automated analysis of embryonic gene expression with expressiongene embryonic of analysis Automated al. et Science NCI UTIONS , e1001256 (2012). e1001256 , Imaging plant growth in 4D: robust tissue robust 4D: in growth plant Imaging al. et Cell na A 340 ture methods L 139 322

INTERESTS 322 l . , 1234168 (2013). 1234168 , C. elegans. C. , 623–633 (2009). 623–633 , , 1546–1550 (2008). 1546–1550 , Nat. Methods Nat. , 1065–1069 (2008). 1065–1069 ,

103 Cell β -actin:H2B-eGFP) zebrafish -actin:H2B-eGFP) lines. , 2707–2712 (2006). 2707–2712 , 153 Nat. Methods Nat. , 550–561 (2013). 550–561 , | Nat. Methods Nat. ADVANCE ONLINE PUBLICATION , 755–763 (2012). 755–763 9, Drosophila Science Mol. Reprod. Dev. Reprod. Mol. 5 Nat. Methods Nat. , 703–709 (2008). 703–709 , Caenorhabditis elegans . Caenorhabditis , 747–754 (2010). 747–754 7, Drosophila 336 imaging experiments. Dev. Growth Differ. Growth Dev. , 724–727 (2012). 724–727 , http://www.nature Articles Cell nervous

, 547–553 7, 130

Genesis ,

55 |

 , . © 2014 Nature America, Inc. All rights reserved. 29. 28. 27. 26. 25. 24. 23. 22. 21. 20. 19. 18. 17.  Articles | ADVANCE ONLINE PUBLICATION with high misdetection robustness.misdetection high with microscopy.nonlinear label-free using embryos sequences. spatiotemporal context. spatiotemporal Trans. Syst. Man Cybern. Man Syst. Trans. (2013). super-voxels. using microscopy time-lapse mixtures. 13 simulations.immersion on based shapes. deformable of segmentation studies. organogenesiszebrafish for SPIM using tracing lineage retrospective 764–777 (2008). 764–777 Dasgupta, S. & Schulman, L.J. A two-round variant of EM for Gaussian for EM of variant two-round A L.J. Schulman, & S. Dasgupta, C.M. Bishop, histograms. gray-level from method selection threshold A N. Otsu, algorithm efficient an spaces: digital in P.WatershedsSoille, & L. Vincent, Persistence-based L. Guibas, & F. Chazal, P.,Ovsjanikov,M., Skraba, for flow optical robust and Keller,P.J.Fast E.W.& Myers,F., Amat, K. Jaqaman, I. Smal, K. Li, Kausler,B.X. Olivier,N. C.A. Giurumescu, 4D J. Sharpe, & Lopez-Schier,H. M., Muzzopappa, Swoger,J., (2010). Development embryos. complex in resolution single-cell with morphogenesis Rao-Blackwellized marginal particle filtering. particle marginal Rao-Blackwellized , 583–598 (1991). 583–598 , Cell population tracking and lineage construction with constructionlineage and tracking population Cell al. et J. Biophotonics J. Multiple object tracking in molecular bioimaging by bioimaging molecular in tracking object Multiple al. et 152–159 (2000). 152–159 UCAI Proc. Cell lineage reconstruction of early zebrafish early ofreconstruction lineage Cell al. et Nat. Methods Nat. Robust single-particle tracking in live-cell time-lapse live-cell in tracking single-particle Robust al. et 139 A discrete chain graph model for 3D+t cell tracking cell 3D+t for model graph chain discrete A al. et Pattern Recognition and Machine Learning Machine and Recognition Pattern , 4271–4279 (2012). 4271–4279 , Quantitative semi-automated analysis of analysis semi-automated Quantitative al. et

4 Med. Image Anal. Image Med. , 62–66 (1979). 62–66 9, 5 , 122–134 (2011). 122–134 , , 695–702 (2008). 695–702 , IEEE Trans. Pattern Anal. Mach. Intell. Mach. Anal. Pattern Trans. IEEE |

144–157 (2012). 144–157 ECCV na 45–52 (2010). 45–52 CVPR Proc. ture methods

Bioinformatics 12 , 546–566 (2008). 546–566 , Med. Image Anal. Image Med. Science (Springer,2007). 329 , 373–380 29, , 967–971 ,

12 , IEEE

34. 33. 39. 38. 37. 36. 35. 32. 31. 30. 40. neurons in the in neurons (1995). (1992). embryonic neuroblasts. embryonic Drosophila Dyn. Dev. of system nervous central embryonic the in pattern segmental embryo. Drosophila the in neurogenesis early and formation layer germ cellularization, during genes. polarity segment andneurogenic by control common anddynamics, cytoskeletal the in division the in cells melanogaster. Drosophila (2014). data. Bioinformatics data. image of amounts massive for toolkit annotation collaborative tracking.multi-target of asymmetric cell division. cell asymmetric The Doe, C.Q. Molecular markers for identified neuroblasts and ganglion mother ganglion and neuroblasts identified for markers Molecular C.Q. Doe, wild-type in neurogenesis Early J.A. Camposortega, V.& Hartenstein, Siegrist, S.E. & Doe, C.Q. Extrinsic cues orient the cell division axis in axis division cell the orient cues Extrinsic C.Q. Doe, & S.E. Siegrist, and diversity cell of Generation R. Urbach, & Berger,C. G.M., Technau, DE-cadherin of role V.The Hartenstein, & T.Haag, K., Dumstrei,F., Wang, J., Broadus, and Delamination A. Lekven, & V.,Younossi-Hartenstein,A. Hartenstein, image of amounts massive for toolkit annotation Collaborative A. Cardona, Tomancˇák,V.& Hartenstein, A., Cardona,P.CATMAID: S., Saalfeld, evaluation truth ground of Challenges S. Roth, & Schindler,K. A., Milan, Bowman, S.K., Neumuller, R.A., Novatchkova, M., Du, Q. & Knoblich, J.A. Knoblich, & Q. Du, M., Novatchkova, Neumuller,R.A., S.K., Bowman, NuMA homolog Mud regulates spindle orientation in orientation spindle regulates Mud homolog NuMA Drosophila CATMAID GitHub Repository GitHub CATMAID 235 central nervous system. nervous central Drosophila New neuroblast markers and the origin of the aCC/pCC the of origin the and markers neuroblast New al. et , 861–869 (2006). 861–869 , 25 central nervous system. nervous central Drosophila neurectoderm: spatiotemporal pattern, spatiotemporalneurectoderm: Drosophila Dev. Biol. Dev. , 1984–1986 (2009). 1984–1986 , Dev. Biol. Dev. 735–742 (2013). 735–742 CVPR Proc. Rouxs Arch. Dev. Biol. Dev. Arch. Rouxs Dev. Cell Dev. 165 , 480–499 (1994). 480–499 , , 350–363 (2004). 350–363 270,

https://github.com/acardo na/CATMAID 10 Development , 731–742 (2006). 731–742 ,

Development 193 133 Mech. Dev. Mech. , 308–325 (1984). 308–325 , , 529–536 (2006). 529–536 , 116 53 Drosophila. , 855–863 , , 393–402 , © 2014 Nature America, Inc. All rights reserved. Drosophila fused without fusion artifacts fusion without fused readily are stacks image ventral and dorsal and simultaneously, where two capture cameras images of the and dorsal sides ventral above), described (as microscopy SiMView to contrast in is This ( embryo the of half ventral the for reconstructed only fused in the subsequent image processing, and cell dynamics were not were stacks image ventral and dorsal the imaging, multiview Note that in order intervals. to avoid 30-s fusion arising artifacts from at sequential acquired were dorsal, and ventral views, Both image stack contained 124 planes with an axial step size of 2.03 was rotated by 180° between successive volume acquisitions. Each In order to image the ventral and dorsal hemispheres, the embryo The embryo was oriented so that the ventral side faced the camera. sCMOS camera (lateral pixel size in the acquired images, 333 nm). a PCO.edge and filter detection long-pass a 585-nm Zeiss), (Carl objective immersion NAwater 20×/1.0 Apochromat Plan a with imaged and sheet light 561-nm a with excited was label nuclear 2A-mRFP histone the from above. In microscope, this supported was cylinder agarose the that so microscope the in mounted was capillary the and embryo, the expose to extruded was gel erized polym The capillary. glass ID 1.2-mm a in agarose temperature SiMView microscope. Embryos were embedded in 1% low–melting had the same genotype as the the Lightsheet Z.1 (Carl Zeiss) commercial light-sheet microscope microscope. Z.1 Lightsheet a of and imaging preparation Sample Gb. 515 of size set data total a in Mb, resulting 183 is stack image multifused Each hatching. larval to AEL h 2.9 from Drosophila Fluorescently labeled labeled Fluorescently Bloomington the from 23560 number stock +, His2Av-mRFP1}; = P{w[+mC] (w-; 2A-mRFP histone label nuclear for the homozygous embryos with performed were microscopy. SiMView of and imaging preparation Sample ONLINE doi: excited with scanned light sheets light scanned with excited dorsal and ventral sides of the embryo the facing with the below cameras. from RFP wassupported was agarose the that so °C) 21.5 illumination and bidirectional detection bidirectional and illumination bidirectional Using nm). 406 images, acquired the in size pixel (Semrock) and Hamamatsu Orcafilters Flash detection 4.0 sCMOSlong-pass cameras594-nm (lateral objectives, immersion water (NA) aperture numerical 16×/0.8 Nikon with imaged was light size of 2.03 2.03 of size the entire volume of an with step encompassing the embryo axial planes 154 of stacks Image recorded. were embryo the of views e o te iVe lgtset microscope light-sheet SiMView the of ber cham recording water-filled the within vertically mounted was embryo the holding capillary The capillary. glass the of outside embryo the the to just expose enough was extruded cylinder agarose polymerization, After GmbH). (Hilgenberg capillary glass 20-mm × (ID) diameter (Sigma-Aldrich, inner 1.5-mm a in SeaPlaque) solution (Lonza, hypochlorite agarose temperature low–melting 1% in embedded and 425044) sodium 50% with The experiment shown in in shown experiment The The experiment shown in in shown experiment The 10.1038/nmeth.3036

METHODS development in 1,304 time points for an 11-h period period 11-h an for points time 1,304 in development development in 2,881 time points for a 24-h period period 24-h a for points time 2,881 in development µ m were acquired at 30-s intervals for all four views. four all for intervals at 30-s acquired were m Drosophila

Drosophila Drosophila 1 Supplementary Video 7 Video Supplementary Supplementary Video 1 Video Supplementary 6

Drosophila . 3 embryos were dechorionated dechorionated were embryos using a 594-nm laser. Emitted Emitted laser. a 594-nm using live-imaging experiments experiments live-imaging Drosophila Drosophila Drosophila embryos imaged with the 1 6 embryos imaged with with imaged embryos , four complementary complementary four , 1 6 embryos using embryos embryos using embryos (temperature, (temperature, Stock Center). Center). Stock captured captured captured captured Fig. 3 Fig. µ a m. ). ). - -

stock solution, 34.8 g NaCl, 1.6 g KCl, 5.8 g CaCl 60× (for buffer E3 in preparedtemperaturelow–meltingagarose 0.5% with filled capillary glass ID 2.0-mm a in embedded were GFP expressed under the control of the SiMView microscope were heterozygous for the nuclear label H2B- Institute).Fluorescently zebrafishthe labeledimagedonembryos Care and Use Committee of Janelia Farm (Howard Hughes Medical experiments were performed according to the Institutional Animal microscopy.SiMView using embryos zebrafish of imaging and preparation Sample Mb, 228 is Gb. 595 of size stack set data total a in image resulting single-view Each AEL. h 13 to 2 from the entire volume of the embryo with an axial step size of 3.25 encompassingplanes 262 ofrecorded. wereImagestacks embryo stack is 96 Mb, resulting in a total data set size of 9.4 Gb. 9.4 of size set data total a in Mb, resulting 96 is stack image Each stage. 50%-epiboly the at starting °C) 21.5 perature, zebrafish development in 101 time points for a 3.4-h (tem period 3.25 of size step an axial with planes 35 contained stack image Each nm. 590 was each) pixels 1,024 × (1,024 images acquired the in size pixel lateral The Zeiss). (Carl objective air NA 20×/0.8 Apochromat was mCherry slip. cover excited with a 561-nm the laser, and images nearest were acquired with a was Plan pole animal the that so oriented were embryos The coverslip. #1 buffer a with E3 covered and in prepared agarose in temperature embedded low–melting slide, 0.4% microscope deep-well a in mounted were the of control the under expressed were heterozygous for the fluorescent nuclear label H2B-mCherry microscope confocal LSM 710 the with Zeiss laser-scanning Carl imaged embryos Zebrafish Institute). Medical Hughes (Howard Farm Janelia of Committee Use and Care the Animal to Institutional according performed were experiments microscopy cal microscopy. confocal using embryos zebrafish of imaging and preparation Sample Tb. 1.7 of size set data total a in resulting Gb, 1.5 is stack (hpf). image multifused post-fertilization Each hours 6 at stage sphere the at starting period 18-h an for points time 1,170 in development zebrafish g MgSO and bidirectional detection bidirectional and intheacquired images, 406 nm). Using bidirectional illumination and Hamamatsu Orca Flash 4.0 sCMOS cameras (lateral pixel size (Semrock)filters band-passdetectionobjectives,525/50-nm sion immer water NA 16×/0.8 Nikon with acquired were images the and laser, 488-nm a using sheets light scanned with excited was GFP cameras. the facing embryo the of poles vegetal and animal the with below from supported was agarose the that so chamber 21.5 °C. The capillary was then mounted vertically in the recording scope was filled with E3 buffer and equilibrated to a temperature of embryo outside of the glass. The recording chamber of the micro ized agarose cylinder was extruded from the capillary to expose the 7.2 with NaOH, and finally the solution is autoclaved). The polymer were acquired at 60-s intervals for all four views. stacks were acquired at 2-min intervals. at 2-min acquired were stacks image and 2 s, approximately in imaged was plane Each embryo. The experiment shown in in shown experiment The The experiment shown in in shown experiment The 4 × 6H µ m, which covered about 15% of the volume of the the of volume the of 15% about covered which m, 2 O are dissolved in 2 l H Zebrafish-line maintenance and SiMViewmaintenanceZebrafish-lineand Zebrafish-line maintenance and confo and maintenance Zebrafish-line 1 6 Supplementary Video 19 Video Supplementary Supplementary Video 12 Video Supplementary , four complementary views of the the of views complementary four , β 2 O, then the pH is adjusted to -actin promoter. Embryos Embryos promoter. -actin β -actin promoter. Embryos 2 × 2H na ture methods 2 O and 9.78 captured captured captured captured µ m - - - - -

© 2014 Nature America, Inc. All rights reserved. scopy data sets. data scopy micro of SiMView and handling data fusion image Multiview Gb. 8.4 of size set data total a in resulting Mb, 332 is stack image multifused Each period. 2-h a for points time 26 in E6.25 at development mouse planes with a step size of 2.03 2.03 of size step a with planes 356 containing stacks image providing excitation, fluorescence for laser 488-nm a using imaged were transgene CAG-TAG the from 2B-eGFP histone Nuclei expressing were recorded. embryo an Intel RMS25CB080 RAID module and an Intel X520-SR1 X520-SR1 Intel array, an disk and RAID-6 module a RAID in RMS25CB080 Intel combined an SSDs Intel GB six 480 memory, Series DDR3 GB 520 128 CPUs, E5-2690 Xeon Intel two used: was components hardware following the with server speed. processing on impact minor a have only will ponents com these as suffice, generally will disks hard CPUs and slower build, cost-efficient particularly a For performance. on impact little with GPU Titan GTX GeForce lower-cost a by replaced be sufficient memory are of primary importance. The Tesla GPU can processing speed in the cell lineaging framework, a good GPU and RAID-6 a For disk array module. and RAID optimal an Intel RMS25CB080 in combined disks hard ST9900805SS 10K.5 Seagate Savvio six GPU, K20 Kepler Tesla Nvidia an memory, DDR3 CPUs, GB 192 Intel two E5-2687W Xeon components: hardware following the with workstation computer single a on performed reconstructions. eage of Components the workstation processing for used all cell lin sets. data image SiMView multi-terabyte of storage and acquisition routine efficient enabling data, image the order of 40:1 to compared to 400:1 the raw, SiMView unfused pr fusion described previously and cessed with fused our SiMView image pipeline, processing as CAG-TAG1 mice male with mice CD-1 female crossing by matings natural from obtained were embryos Mouse Institute). Medical Institutional the to according Animal Care and Use Committee performed of Janelia Farm were (Howard Hughes experiments using embryos microscopy. mouse SiMView of imaging and preparation Sample na 37 °C, 5% CO 5% °C, 37 at maintained and Harlan) DMEM (WEC, serum in rat 50% and imaged F-12 and and 10082-147) (Invitrogen, FBS 10% 21041-025) and (Invitrogen, F-12 and DMEM in E5.5 at dissected Together with the fourfold data size reduction in the multiview multiview the in reduction size data fourfold the with Together set. data image volumetric raw the of region foreground any in sampling or data quality image affecting without tor of to 100 10 effect of these two post-processing steps combined is data The reduction by a fac standard. JPEG2000 the following compression wavelet 3D lossless using compressed subsequently were regions foreground The volume. image the in regions foreground all ing retain and setting threshold conservative a using volume image mask by generated of adaptive thresholding the Gauss-convolved ground regions in the were image stacks removed using an image iietoa detection bidirectional and illumination bidirectional Using above. described as scope The experiment shown in in shown experiment The For data visualization, editing and annotation, a CATMAID CATMAID a annotation, and editing visualization, data For ture methods 4 1 maintained on a C57BL/6J background. Embryos were were Embryos background. C57BL/6J a on maintained o cess, total lossless data compression ratios are thus on thus are ratios compression data lossless total cess, 2 and 5% O 5% and SiMView four-view image data sets were pro were sets data image four-view SiMView 1 6 . . For efficient long-term data storage, back All computational reconstructions were were reconstructions computational All Mouse line maintenance and SiMView SiMView and maintenance line Mouse 1 6 , four complementary views of the the of views complementary four , 2 on the SiMView light-sheet micro light-sheet SiMView the on Supplementary Video 16 Video Supplementary µ m every 5 min. 5 every m captured captured - - - - ­ - - -

each time point point time each ( curve bell-shaped Gaussian 3D a as nucleus each of profile sity model. mixture Gaussian Sequential user ( user GMM with one Gaussian per supervoxel. the The variationalinitialized we inferencepoint, time first the at information prior have Initialization for tracking and segmentation. were set to to set were t prior at point time each to given is weight more the hyperparameters, these of ues (covariance), shape and time points. The larger consecutive respectively, the val between location mean intensity, in changes Supervoxel formation with watershed persistence-based persistence-based watershed with agglomeration. formation Supervoxel impact performance. on minor a have only generally will CPUs hardware slower storage and well, as case this In adapter. network fiber 10Gb all possible segmentations for all possible values of of values possible all for segmentations possible all sible merges of neighboring regions ( regions pos neighboring of merges sible other affects merge each as sequentially, performed be to technique based on PBC on based technique agglomeration an use we importantly, most and Third, minima. of neighborhood 2 × 2 × 1 (74 anisotropic elements in total) to avoid an an overabundance of local considered we Second, filter. median 5 × 5 a with stack image 3D the in image 2D each essed preproc we First, images. microscopy light the in noise Poisson to due oversegmentation excessive avoid to order in algorithm watershed classical the to modifications three Weincorporated group voxels into coherent regions belonging to the same nucleus. a a parameter, watershed approach. The number of merged regions is bydefined during run time. Initially, Initially, time. run during supervoxels merge and split to used was segmentation archical where where with the new value of of value new the with again (1) equation solved we Finally, Gaussians. two into them split and nuclei two contain might Gaussians which determined if they occupy an image volume with low sum intensity). Then, we nates nates for the ( approach Bayesian full a used we correlated, is points time tive time at mixture is below below is a region and the intensity at the contact point with another region equation (1) with a fixed a (1) with fixed equation a of in Gaussian In time. order we to solved cell divisions, first handle evolution the following by obtained automatically is tion) negligible negligible computational cost ( hyperparameters three particular, In (1). equation in parameter + 1. For all data sets presented in this work, these parameters parameters these work, this in presented sets data all For 1. + Supplementary Fig. 3 Fig. Supplementary Supplementary Note 1 Note Supplementary α The linkage of objects between time points (tracking informa (tracking points time between of objects linkage The , , β and and Supplementary Software 1 Software Supplementary K t τ is the number of nuclei at time time at nuclei of number the is , then these two regions are merged. The merging has has merging The merged. are regions two these then , ν α ) define the distributions that model the expected expected the model that distributions the define ) τ n = 0.8 × 10 × 0.8 = . . If the difference between the minimum intensity in th voxel, and We used a modified watershed algorithm watershed modified a used We t n I t was then modeled as a GMM: a as modeled then was . Because information between two consecu two between information Because . t ] [ t in order at to parameters estimate point time ∝ K ). The complete volumetric image data at data image volumetric complete The ). −5 t ∑ obtained by estimating cell divisions. cell by estimating obtained ) in order to incorporate priors for each each for priors incorporate to order in ) K 2 , , 5 β t µ π k K τ to merge regions extracted from the the from extracted regions merge to (although Gaussians can be removed Gaussians (although = 0.1 and and 0.1 = should be set conservatively by the the by conservatively set be should t k t = 2 , 5 1 , , we constructed a binary tree with µ π t k t k and ) to avoid undersegmentation. avoid to ) N ) , ; ( x Supplementary Fig. 1 Fig. Supplementary Σ k n ν We modeled the inten the modeled We t k = 1.0. = define the define t t , , x doi: Σ n are the 3D coordi 3D the are t k Because we did not 10.1038/nmeth.3036 k τ th . This hier This . Gaussian Gaussian 2 ). At). 6 (1)(1) to to ------

© 2014 Nature America, Inc. All rights reserved. porates first-order Markovian information. However, when when However, information. we cell tracking, have performing further Markovian first-order porates context rules. Spatiotemporal cells per time point and were locally isolated in space. in isolated locally were and point time per cells all of 3% than less for occurred typically events These divisions. cell and deaths cell mainly events, lineaging cell critical in rules combinatorial applied only we Thus, resources. computational of amounts large consume also would rules these of application Broad point. time given any at behaviors dynamic challenging unnecessary, because usually only a small fraction of cells exhibits ten. size of window a temporal we used work, in this presented experiments last the over window time sliding a using applied are that rules logic temporal three defined we Thus, points. time a of number for certain again divide not should cells daughter the division, cell a after example, For exploit. can we that behavior cell on mation ( file configuration software can the of section parameters parameters advanced the in modified be these However, 0.4. = maxPercentileTrimSV and 3000 = maxNucleiSize 50, = minNucleiSize at constant kept were parameters three these work, in this presented experiments all For supervoxel. current the in voxels of number total the to respect with scale relative a on limit this - maxPer defines and centileTrimSV foreground, as considered be to allowed are that scale) absolute an (on voxels of number maximum the defines maxNucleiSize background. considered is supervoxel entire the foreground, are minNucleiSize than voxels fewer If foreground. minNucleiSize defines the minimum number of parameters. voxels that should be considered advanced three defined we versa), vice or foreground to belonging voxels all example, (for process volume. image the of regions large ing is from cover Gaussians to prevent individual important generally supervoxels unconnected multiple contain that never Ensuring Gaussians point. time earlier an at occurred that mistakes grouping as to as correct well divisions cell biological true detect to Gaussian each subgroup. is This split to important mechanism new one assigned and components connected to corresponding into subgroups of group we the supervoxels split existed, regions reconstructions in this work, work, this in reconstructions within ended We any that track such a eliminated after therefore misdetection. shortly However,die will tracks of one and the are temporary, event. effects these division cell a resembles that scenario a duce pro can elements autofluorescent or effects scattering intensity, For in time. changes fluorescence across example, of supervoxels coherence the affect that points time consecutive between ance appear data image in changes to spurious due are artifacts sions Detection of cell divisions. cell of Detection successfully oversegmentation. from resulting Gaussians rules excess disregard logic temporal the and (1) equation solving doi:10.1038/nmeth.3036 old thresh Otsu’s applying after region connected single Gaussian a same formed the to assigned supervoxels all whether mined We refer to the first rule as ‘short-lived daughter’. Many cell divi is however, cells, all to rules combinatorial these Applying thresholding Otsu’s the in cases degenerate avoid to order In 2 T 7 to each supervoxel individually. If multiple unconnected unconnected multiple If individually. supervoxel each to time points processed in the sequential approach. For all all For approach. sequential the in processed points time L time points after the last cell division. For all all For division. cell last the after points time Supplementary Software 1 Software Supplementary To detect cell divisions, we deter we divisions, cell detect To L The sequential GMM only incor The sequential was set to 5. to set was a a priori ). temporal infor ------

segmentation’. Because we set a global parameter, Background Background detector. subsection. next the in explained study. this in reconstructions all for 3.2 to set was threshold the set, training small a analyzing After old. sion with a distance of mother to division plane above this thresh divi any cell for daughter distant more the and mother between ( threshold a defined we positives, false and divisions cell true between nate discrimi to Thus, values. high relatively yields typically metric this undersegmentation, from recovery a as such positives, false for whereas low values, relatively yields metric this divisions, cell true For plane. this to cell mother the of distance the computed and daughters two the of calculated centroids Wethe to cells. equidistant plane the daughter two the between in usu located is ally cell mother dividing the that considered we (false positives), divisions cell detected incorrectly determine to order In configuration file. configuration software the of section parameters advanced the in changed be can and negatives false of number and positives false of number The indicated. value of represents the thrHigh a between tradeoff was set to thrHigh 0.7 and was set to thrLow otherwise 0.2 unless ( work this in presented reconstructions post-processed all For event. division cell a ters encoun it until or thrLow below decreases probability the until in time forward points data connected deleting it keeps thrHigh, above probability background a with object an detects classifier the When nuclei. cell true of instead structures to autofluorescent corresponding tracks remove to probability classifier the of outcome the on thrLow) and (thrHigh approach hysteresis old Wedetector. a two-thresh background used the applying before Note that all pipeline. quality metrics presented lineaging in this studycell were the evaluated in module post-processing optional an as provided is differences, of these advantage tor,takes which detec background The nuclei. cell actual of those to contrast in time, of periods short over coherent not are objects background ( rules logic the evaluate to used windows temporal same the within features several uses that objects background detecting for classifier ing be solved efficiently using the Hungarian algorithm Hungarian the using efficiently solved be distances between supervoxels, and Jaccard the graph-matching to problem could thus corresponded edges between time. weights across The segmentation problem of levels graph-matching possible different local between a formulate to segmentation track, point could be split into smaller regions. ending We used our hierarchical each for Thus, we supervoxels whether neighboring in determined the next time. time in propagated be cannot Gaussian corresponding the as reconstruction, the in death cell a in results point time at given a Undersegmentation 1%). below can occur in low-contrast or crowded image regions (at a rate well undersegmentation regions, watershed of clustering ence-based not correspond to nuclei. Therefore, we built a machine learn machine a do built we that Therefore, nuclei. to parts correspond not autofluorescent many contain specimens some for. accounted to be However,need levels intensity nonzero with of tion volume as image each a parts image that all GMM implies The second heuristic The is second heuristic ‘track death extension with hierarchical The fourth rule is a background detector, which will be be will which detector, background a is rule fourth The plane’. division to cell mother of ‘distance is rule third The Supplementary Note 2 Supplementary Supplementary Software 1 Software Supplementary Modeling the gray-scale intensity informa Supplementary Videos 2 Videos Supplementary ). In general, the properties of ). In the properties general, ) and broke the linkage linkage the broke and ) τ na , for the persist , 8 ture methods 4 , 13 13 2 . and 20 ), ), ------

© 2014 Nature America, Inc. All rights reserved. Confidence score for guiding users to possible lineaging mis lineaging possible to users guiding for score Confidence na mistakes ( mistakes efficiently directed tobe local spatiotemporal windows can that are likely user to contain the interface, curation data CATMAID the and information this Using accordingly. data lineage output the we labeled issue, the to fix rule context a spatiotemporal find where cases those we a estimated high error probability but could in Thus, lineages. correct other affecting without mistakes such even struction, if it is not to straightforward automatically correct recon lineage cell the in mistakes possible detect to used be can takes. development ( development line of early neuroblast ages from our reconstruction selected randomly we system, guidance error the using obtained time curation in improvement the estimate system. error guidance the with time curation Data behavior. cell abnormal indicating s.d.), 4 (above ments displace large unusually with trajectories flag to statistics these We used point. time each at displacements cell of s.d. and mean the we estimated Second, reconstruction. lineage cell an accurate for importance high with are rare events as these deaths, and cell divisions cell all flagged we First, revisited. manually be should axes in the anterior-posterior, medial-lateral and ventral-dorsal ventral-dorsal and medial-lateral anterior-posterior, the in axes the 3D structed convex hulls of all nuclei con at each we time point. Body features, dynamic several of definitions for required is annotated locations. nucleus from calculated were angles, division and divisions cell of point time point, resting the to length migra path velocity, maximal tion nucleus, the of depth resting point, time tion internaliza including cells, these by taken paths the of Features ( identities cell putative assign to used were literature on based neuroblasts of positions Relative ously. sions were found wherever the cells could be followed unambigu Nervous system reconstruction. system Nervous error the using system. mistakes guidance 32 and mode curation full the using a of total 34 mistakes corrected The annotators system. guidance error the with speed curation data in improvement 28% a shows number This system. guidance error the using s 1,169 to pared com s 1,625 was mode curation full in required time total The methods. curation both with lineages 12 of total a reconstructed annotators different two methodology, this Using lineage. tive for respec the of curation data completion the until elapsed that an annotator started curating a new lineage, we measured the time dicted low-confidence region pre for manual next data curation. the Each time to user the directs that (‘c’) key hot by a CATMAID introducing in system guidance error the implemented We CATMAID. with curation full a (ii) by followed system, guided error the using curation computer-guided a (i) presented text: main the in methods curation both using 0 point time reaching in backwards until time lineages to curate the selected proceeded tracked paths were proofread forward and backward in time time in backward and ( forward proofread were paths tracked tion for each wave ( delamina after point time a at data image the in identified cally Supplementary Videos Supplementary 25 In particular, we used two simple rules to flag data points that that points data flag to rules simple two used we particular, In In order to describe the location of the embryo surface, which which surface, embryo the of location the describe to In order ture methods Some of the prior information on temporal cell behavior behavior cell temporal on information prior the of Some Fig. 4d Fig. Fig. 5 Fig. , e Supplementary Video Supplementary 24 ). b ) at time point 200. The annotators then then annotators The 200. point time at ) – 27 ) ) so that the first asymmetric divi Neuroblasts were morphologi Neuroblasts Drosophila Supplementary Fig. 12 Fig. Supplementary ). ). The automatically nervous system system nervous In order to ). ). ------

on the hull. On this structure, we found the the found we structure, this On hull. the on and the convex hull, we a used first 3D k-d tree to store the points side. dorsal from the ventral side, around the posterior tip at at tip posterior the around side, ventral the from calculated as calculated then was angle polar The surface. embryo the the from as further one identified cell, coordinate daughter GMC angle the in division placed was each system for origin The tangent surface. locally the plane to the onto projected axis body transverse the as chosen was axis other The normal. surface local the along defined was system coordinate division of one cell axis the sheet, cell outer the Because to orthogonally angles. approximately divide neuroblasts spherical two calculated we orientation, sion the of segregation the before sheet. cell outer the from basally neuroblast moves nucleus the as nucleus, the of internalization only but canonically ‘delamination’ discuss defined to us allow not does label nuclear the that for a nucleus at nucleus a for where where normal surface unit the for both delamination depth ( and than a defines criterion simple threshold performance better considerably shows fit Notably, this delamination. of point time a to of center identify the descent the geometric identify robustly to trajectory the of behavior global the incorporate to used was z phase and the delaminated phase. A logistic fit of the form the of fit logistic A phase. delaminated the and phase surface embryo the between phase internalization an reflecting through the course of delamination follows a characteristic shape, tionality of the germ band. Because the neuroblasts are located located are neuroblasts the Because band. germ the of direc tionality the to relative as well as directions medial and lateral to band. Thus, the relative coordinate system is defined consistently bilateral symmetry as well as the change in orientation of the germ for account to egg of the tip posterior the and midline the across daughter cells cells daughter drical coordinate drical system ( distance from the point to the plane containing each triangle as triangle each containing plane the to point the from distance We found the containing triangles these points and the calculated of the germ band, we defined cylindrical coordinates ( coordinates cylindrical defined we band, germ the of specimen. the of drift gradual for to correct algorithm point closest iterative the using adjusted and point time initial the for aligned manually were directions , the transverse axis coordinate, and and coordinate, axis transverse the , In order the to calculate each distance between point efficiently In order to fully parameterize the asymmetric neuroblast divi neuroblast asymmetric the parameterize In to order fully The distance from the neuroblast to the convex hull over time time over hull convex the to neuroblast the from distance The For the visualization of neuroblast patterns along the length length the along patterns neuroblast of visualization the For f ˆ and a ˆ p d ⋅ = ψ s ) ( z ˆ  are the angular and radial unit vectors of the cylin r ˆ 0 . The second, azimuthal angle was calculated as calculated was angle azimuthal second, The . g z r sgn p  0 = and triangle vertices at vertices triangle and d t d     | )|| ( ) ||( fit ( ) ( ) ( ˆ ˆ ⊥ p p p p p p p p ) (         a ˆ 3 3 1 1 2 2 1 1 n ˆ

⋅ = η − − × × − − + = Fig. Fig. 5 d )arccos  ) ) and internalization time ( 0 arccos( n a ˆ 1 ). ). The angle and the unit vector between between vector unit the and +     || r n e ˆ ˆ r r − − ˆ ˆ ⊥ ⊥ φ µ d n n ) , the polar angle running running angle polar the , ˆ ˆ ( ∞ t t || ⋅ p  || doi:10.1038/nmeth.3036 1 ⋅ 2 1 k ψ , , f f ( / ˆ ˆ nearest neighbors. neighbors. nearest p p   ⊥ ⊥ p  was then reflected 1 ) 2 n n ˆ ˆ and and − || φ     = 0 up to the the to up 0 =     00 ) p  t 3 z 1/2 . , , φ ). ). Note ) with with ) (2)(2) (3)(3) (5)(5) (4)(4) - - -

© 2014 Nature America, Inc. All rights reserved. direction (2 (2 direction dorsoventral the in resolution spatial limited the that note We robust when averaging one to three time points (data not shown). remained The calculation points. two time the first between cells daughter between vector the by averaging were calculated angles Final robust. remain the will projection within the thus and component plane, tangent large a contain will axis transverse the embryo, the of surfaces lateral the from away band, germ the on doi:10.1038/nmeth.3036 tions using machine learning. machine using tions annota factor transcription and type neuroblast Predicting calculations. 295 samples (total number of neuroblasts in the cell lineage lineage cell the in neuroblasts of number (total samples 295 only contained set data our Because reconstruction. lineage cell our from obtained features morphodynamic the of basis the on annotations factor transcription or type neuroblast predict can that models learn to trees classification with boosting of tation µ m) created an up to 10% uncertainty in the angle angle the in uncertainty 10% to up an created m) We used the Matlab implemen Matlab the We used - - 42. 41. feature and dividing the sum by the number of branch nodes. branch of number by the sum the dividing and feature on every splits to due risk the in changes by summing calculated was importance the tree, each For function. boosting in same Matlab the by returned was model each for importance feature The division. cell second up to the not tracked were that for cells values missing accommodate to Matlab in implemented niques each leaf had at least five samples, and we used the surrogate tech in model was the classifier grown boosting and then until pruned weak Each results. accuracy prediction the to estimate classifiers weak 500 and crossvalidation fivefold used we reconstruction), Nav. Res. Log. Quart. Log. Res.Nav. bicistronic expression in transgenic mice.transgenic in expressionbicistronic Kuhn, H.W. The Hungarian method for the assignment problem. assignment the for methodHungarian H.W.The Kuhn, for peptide 2A viral the of Use S. Srinivas, & J. Begbie, G., Trichas,

, 83–97 (1955). 83–97 2, BMC Biol. BMC , 40 (2008). 40 6, na ture methods

-