Development of a lab bioimage informatics system for fluorescence microscopy data,

with application to experimental studies of

RhoGAP regulation of calcium signaling and actomyosin contractility.

A Dissertation Presented

By

Jeffrey Alan Bouffard

to

The Department of Bioengineering

in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

in the field of

Bioengineering

Northeastern University Boston, Massachusetts

December 2019

ABSTRACT

Microscopic imaging is a powerful tool to advance our understanding of biological systems. Image-based biological investigations can be large and complex, requiring bioimage informatics solutions to manage and examine large amounts of information. This dissertation describes the development of a lab bioimage informatics system to organize and analyze fluorescence microscopy movies. This system is applied here to investigate the regulation of calcium signaling and actomyosin contractility in the C. elegans spermatheca, a multicellular contractile tube with stereotyped tissue function and conserved genes and regulatory networks.

The lab bioimage informatics system standardizes the organization, processing, and analysis of calcium sensor movies by using computer programs to automate processes. The motivations, design goals, and implementation for this lab bioimage informatics system are presented.

The experimental investigation revealed a new role for a RhoGAP known to regulate actomyosin contractility. For this work, the lab bioimage informatics system supported analysis of almost 500 fluorescence microscopy movies, acquired by 4 different people over 4 years. These data show that the RhoGAP SPV-1 is a key regulator of calcium signaling and tissue function in the C. elegans spermatheca. Experiments to establish mechanism suggest SPV-1 coordinates the activity of multiple GTPases to control tissue contractility.

An additional prototype image processing and analysis pipeline is presented, which automatically segments the spermathecal tissue from calcium sensor movies. This pipeline will advance spatial analysis of the tissue, enabling quantitative analysis of tissue shape changes and mechanochemical signaling in the spermatheca.

A major product of this dissertation is an operational lab bioimage informatics system, used by multiple researchers for more than two and a half years to organize movies and support publications and collaborations. This dissertation work also produced a biological investigation, using standardized analysis of movies, which was published in a Molecular Biology of the Cell special issue on Quantitative Cell Biology.

A final product is an image processing pipeline that extracts new spatial measurements from movies. The tools and insights presented here continue to be used, and the framework presented may be useful for other researchers dealing with large amounts of complex imaging data.

DEDICATION

I dedicate this work to those with the audacity to follow their own path to the mysteries.

ACKNOWLEDGEMENTS

I have many people to thank for making this achievement possible.

First and foremost, none of this would have been possible without the support and guidance of my primary advisor, Professor Erin Cram. She is a paragon of great research and general humanity, and it has been a privilege and honor to do my Ph.D. work in her lab. I am the scientist, researcher, writer, and presenter I am today because of her insightful questioning and seemingly endless patience and feedback. I became a programmer and synthetic biologist only because she allowed me to develop in the ways that were best for me. I would not be the mentor and research community member I am today without her masterful example of how to support our fellow human beings in our research endeavors. Thank You, Erin!

I thank Professor Jeffrey Ruberti and the Northeastern University Bioengineering Program for accepting and supporting a very non-traditional candidate, and for giving me the opportunity to radically alter the trajectory of my life. Before grad school I spent years as a waiter in a restaurant; now I command the skills and mindsets to lead in my chosen dream career, engineering biology. This experience has shown me that I am capable of so much more than I ever thought possible!

I thank Professor Anand Asthagiri for helping me develop as an engineer, and for welcoming me into his group lab meetings. Insights from our conversations helped me become a better researcher, and helped me plot my career course.

I thank Professor Harikrishnan Parameswaran for helping me develop as an engineer, and for providing me space and resources in his lab office. Our meetings advanced my image analysis skills, and helped me plot my career course.

These years would have been a lot harder if I had not worked alongside such amazing people in the Cram Lab. I must give particular shout-outs to Ismar Kovacevic and Jose Orozco for showing me the ropes; Alyssa Cecchetelli and Coleman Clifford for being great collaborators; Alison Wirshing, Charlotte Kelley, and Perla Castaneda for great discussions and keeping the lab running and upbeat; and Pedro Falcon and Doug Pagani for reminding me how much I enjoy teaching others. Many more current and former members of the Cram lab are not mentioned here, but they all contributed to making it the magical place it is. Thank You, Cram Lab, I will very much miss the great science and camaraderie, the lab meetings and birthday parties, and the stimulating and supportive environment.

I thank the members of the Asthagiri lab for welcoming me at their lab meetings, and the members of the Parameswaran lab for welcoming me in their office area. I thank Professor Javier Apfeld and the members of the Apfeld lab for helpful discussions at joint lab meetings. I also thank Professor Ronen Zaidel-Bar, Pei Yi Tan, and the members of the Zaidel-Bar lab for hosting me in Singapore for two months.

Northeastern Bioengineering was a brand new program when I started, so I thank the people who put in time and energy to build a graduate student community from scratch, particularly Robert Natividad, Michelle Stolzoff, Jessica Fitzgerald, Paige Baldwin, Shravani Kakarla, Judith Piet, and Ian Harding. The path certainly would have been lonelier without this BioE camaraderie. I also thank the Northeastern Biology department for having and supporting a strong community that was a pleasure to be a part of.

I thank two great housemates that I shared living space with over these years, Greg Spiers and Don Medor. A low-stress and drama-free living situation was great to come home to after a long day in the lab, and they were great sounding boards for new ideas, listened to and shared rants about life’s highs and lows, and provided valuable reminders of life outside the lab.

I thank my good friends Nate Kujawski and Kelly Allen-Kujawski for regular visits to the woods and exposure to small children in my nephew, Ethan, and niece, Nora. These visits never failed to provide much welcome perspective and rejuvenation.

Finally, I thank my parents, Judith and Joseph, for giving me the genes and upbringing that make me experience and operate in the world the way I do, and for giving me the freedom to develop according to my own design.

TABLE OF CONTENTS

List of Figures and Tables

List of Abbreviations

Chapter 1 – Introduction: The spermatheca as a model contractile tube

1.1 – Abstract

1.2 – Introduction

1.3 – Contractile tubes are essential components of animal body plans

1.4 – C. elegans is a powerful model organism

1.5 – The C. elegans spermatheca is a powerful model contractile tube

1.6 – Calcium signaling regulates cellular contractility

1.7 – Genetically encoded fluorescent calcium sensors and fluorescence microscopy enable in vivo study of calcium signaling in the spermatheca

1.8 – Conclusions and future directions

Chapter 2 – Development of a lab bioimage informatics solution

2.1 – Abstract

2.2 – Introduction

2.3 – Motivation

2.4 – Design goals

2.4.1 – Organization and research efficiency

2.4.1.1 – Unique identifier

2.4.1.2 – Human readability

2.4.1.3 – Machine readability

2.4.1.4 – Standardized directory structure

2.4.2 – Standardization, research reproducibility and data provenance

2.4.2.1 – Raw files

2.4.2.2 – Processing

2.4.2.3 – Analysis

2.4.2.4 – Presentation

2.4.3 – Usability

2.4.3.1 – Minimize barriers to use

2.4.3.2 – Maximize benefits of use

2.4.3.3 – Extensibility and future-proofing

2.5 – First generation lab bioimage informatics solution

2.5.1 – Unique identifier

2.5.2 – Local and lab movie archives

2.5.3 – GitHub

2.5.3.1 – Local and remote repositories

2.5.3.2 – GitHub as a distributed database management system

2.5.4 – Matlab

2.5.4.1 – Matlab_’X’_A_FileUpload.m; ‘X’=(GCaMP|RG

2.5.4.2 – Matlab_ArchiveTransferToDiskVader.m

2.6 – Conclusions and future directions

2.6.1 – Alternative bioimage informatics solutions

Chapter 3: Application of the first generation lab bioimage informatics solution to processing and analysis of widefield microscopy fluorescent calcium sensor movies

3.1 – Abstract

3.2 – Introduction

3.3 – Fiji / ImageJ processing of calcium sensor movies

3.3.1 – Fiji_RawToGCaMP_A.ijm

3.3.2 – Complete and partial processing

3.3.3 – Temporal cropping

3.3.4 – Registration

3.3.5 – Rotation

3.3.6 – Spatial cropping

3.3.7 – Annotation of time points

3.3.8 – Generation of time series and kymograms

3.3.9 – Multichannel movies

3.4 – Matlab processing and analysis of calcium sensor time series

3.4.1 – Matlab_’AnalysisTitle’.m

3.4.2 – Analysis title

3.4.3 – Time series processing

3.4.4 – Dwell time

3.4.5 – Rising time

3.4.6 – Fraction over half max

3.4.7 – Multichannel movies

3.4.8 – ‘AnalysisTitle’_TimeseriesMetricsTable.txt

3.4.9 – ‘AnalysisTitle’_TimeseriesHeatmapMatrix.txt

3.5 – Fiji / ImageJ processing and analysis of calcium sensor kymograms

3.5.1 – Fiji_GenerateKymogramMetrics_A_’AnalysisTitle’.ijm

3.5.2 – Kymogram annotations

3.5.3 – sp-ut quiet period

3.5.4 – Bag intensity

3.5.5 – Multichannel movies

3.6 – Matlab consolidation and exploration of calcium sensor data

3.6.1 – ‘AnalysisTitle’_CalciumAnalysisApp.m & ‘AnalysisTitle’_ExploreTimeseriesApp.m

3.6.2 – ‘AnalysisTitle’_combinedKymogramAndTimeseriesMetrics.txt

3.7 – Conclusions and future directions

Chapter 4: The RhoGAP SPV-1 regulates calcium signaling to control the contractility of the Caenorhabditis elegans spermatheca during embryo transits

4.1 – Introduction

4.2 – Abstract

4.3 – Introduction

4.4 – Results

4.4.1 – SPV-1 regulates calcium signaling in the spermatheca during embryo transits

4.4.2 – SPV-1 overexpression results in low calcium signaling and embryo trapping

4.4.3 – Spermathecal tissue function and calcium signaling exhibit a threshold response to SPV-1::mApple levels

4.4.4 – SPV-1 regulates spatiotemporal aspects of calcium signaling

4.4.5 – Increasing RHO-1 activity recapitulates transit timing of the spv-1 mutant, but not calcium signaling

4.4.6 – SPV-1 regulates calcium signaling via its GAP domain

4.4.7 – SPV-1 has GAP activity toward Cdc42 and partially co-localizes with CDC-42

4.4.8 – Increasing CDC-42 activity recapitulates many aspects of spv-1 mutant calcium signaling

4.4.9 – Decreasing CDC-42 using RNAi does not alter calcium signaling

4.5 – Discussion

4.6 – Acknowledgements

4.7 – Methods

4.8 – Supplemental Methods

4.9 – Acknowledgement of contributions

Chapter 5: Segmentation, polar coordinate representation, and refined spatial analysis of widefield calcium sensor movies

5.1 – Abstract

5.2 – Introduction

5.3 – Fiji / ImageJ segmentation of calcium sensor data and conversion to polar coordinates

5.3.1 – Fiji_GradientSegmentationOfSpermatheca_Calcium.ijm

5.3.1.1 – Edge image

5.3.1.2 – Extrema of line profiles

5.3.1.3 – Segmentation boundaries

5.3.1.4 – Calculation of centroid

5.3.1.5 – Calcium sensor movies in polar coordinates

5.3.2 – Fiji / ImageJ future directions

5.4 – Matlab analysis of segmentation data

5.4.1 – Matlab_spatialAnalysis_thetaR_processing.m

5.4.2 – Matlab_spatialAnalysis_boundaryAnalysis.mlx

5.4.2.1 – Deformation maps

5.4.2.2 – Calcium sensor maps

5.4.3 – Matlab future directions

5.5 – Conclusions and future directions

Chapter 6: Conclusions and future directions

References

LIST OF FIGURES AND TABLES

Figure 1.1. C. elegans spermatheca overview

Figure 1.2. C. elegans spermathecal calcium activity is linked to embryo transits

Figure 2.1. Informal online self-evaluation of scientists’ expertise relevant for bioimage informatics. Reprinted with permission

Figure 2.2. Many pieces of information are associated with each calcium sensor movie

Figure 2.3. Lab bioimage informatics system design goals

Figure 2.4. Standardized directory structure overview

Figure 3.1. Fiji_RawToGCaMP_A.ijm overview

Figure 3.2. Fiji_RawToGCaMP_A.ijm processing selection screen

Figure 3.3. Matlab_’AnalysisTitle’.m overview

Figure 3.4. Screenshot of ‘AnalysisTitle’_CalciumAnalysisApp.m

Figure 3.5. Screenshot of ‘AnalysisTitle’_ExploreTimeseriesApp.m

Figure 3.6. Analysis interface

Figure 4.1. Loss of SPV-1 alters calcium signaling in the spermatheca

Figure 4.2. Spermathecal tissue function and calcium signaling exhibit a threshold response to SPV-1::mApple

Figure 4.3. SPV-1 regulates spatiotemporal aspects of calcium signaling

Supplemental Figure 4.1. mApple alone does not induce embryo trapping or low calcium signaling

Supplemental Figure 4.2. Tissue function and calcium signaling metrics all show thresholds covering SPV-1::mApple intensity values from ~10 to ~15

Figure 4.4. Increasing RHO-1 activity alters spermathecal contractility but does not recapitulate spv-1(ok1498) mutant calcium signaling

Figure 4.5. SPV-1 regulates calcium signaling through its GAP domain

Figure 4.6. SPV-1 exhibits GAP activity toward Cdc42 and partially co-localizes with CDC-42 at spermathecal cell membranes

Figure 4.7. Increasing CDC-42 activity alters spermathecal calcium signaling

Supplemental Figure 4.3. Heat shock treatment does not alter wildtype tissue function or calcium signaling in any of the metrics

Supplemental Figure 4.4. Heat shock treatment and activation of constitutively active GTPases does not alter actin bundle alignment

Figure 4.8. cdc-42(RNAi) does not alter calcium signaling

Figure 4.9. SPV-1 regulates spermathecal contractility via calcium and Rho-ROCK signaling

Table 4.1. List of C. elegans strains used in this study

Figure 5.1: Still frames from a calcium sensor movie with corresponding edge images

Figure 5.2: Line profile over edge image

Figure 5.3: Line profile sweeps and extrema images

Figure 5.4: Combining line profile extrema images

Figure 5.5: Refining rough segmentation boundaries

Figure 5.6: Overview of Fiji / ImageJ segmentation program

Figure 5.7: The centroid of the segmentation boundaries appears accurate and stable over embryo transit movies

Figure 5.8: Calcium sensor image converted into polar coordinates

Figure 5.9: Calcium sensor movies can be converted into polar coordinates

Figure 5.10: Classification and annotation of segmentation boundaries

Figure 5.11: Classified segmentation boundaries can be interpolated

Figure 5.12: Deformation and calcium sensor maps

LIST OF ABBREVIATIONS

ANOVA – Analysis of variance

DIC – Differential interference contrast

dsRNA – Double-stranded RNA

GFP – Green fluorescent protein

I/O – Input/output

IP3 – Inositol 1,4,5-trisphosphate

KNIME – Konstanz Information Miner

LET – Lethal

NGM – Nematode growth media

OME – Open microscopy environment

PIP2 – Phosphatidyl inositol bisphosphate

PLC-1 – Phospholipase C

RAID – Redundant array of independent disks

RhoGAP – GTPase-activating protein toward Rho family small GTPases

RNAi – RNA interference

ROCK – Rho-associated protein kinase

ROI – Region of interest

sp-ut – Spermatheca uterine

SPV-1 – Spermatheca physiology variant 1

Chapter 1 – Introduction: The Caenorhabditis elegans spermatheca as a model contractile tube

1.1 – Abstract

Contractile tubes are essential elements of animal body plans, and the C. elegans spermatheca is a powerful experimental model system to study the regulation of contractile biological tubes in living, intact animals. This chapter introduces the spermatheca and highlights some benefits of the model, including the ease of working with C. elegans and the stereotyped functioning of the spermathecal tissue. Calcium sensor movies are a primary data source throughout this dissertation, and they are introduced here with some motivation for their central role.

1.2 – Introduction

In this chapter I provide a brief overview of the experimental biology system studied in this work. Much of this dissertation focuses on a computational system I built to organize and standardize the processing and analysis of data collected from this experimental system. The power of this experimental system motivated me to build tools to help us study it; hopefully this chapter conveys some of that motivation and the possibilities that make me consider my time and energy well spent.

1.3 – Contractile tubes are essential components of animal body plans

Tubular structures are found across the tree of life, defining the morphologies of organisms from slime molds [1] to vertebrates [2], and supporting vital functions such as gas exchange [3]. As we advance through evolutionary time, simple tubular structures give rise to more complex and specialized systems of tubes, including digestive systems that break down food and absorb nutrients [4], and circulatory systems that ensure every cell of an organism is within diffusion range of vessels for nutrient delivery and waste removal [5]. Biological tubes are constructed using a handful of morphogenetic processes [6], and are composed of multiple layers [2] that help define their functional capabilities. As the size and complexity of animal body plans increase, pumps, valves, and regulated contractility arise to control the pacing and directionality of material flows through these tubes [7].

Contractile biological tubes gain active contractility from the cells that compose them [8], [9], enabling rapid responses over short time scales. Contractile biological tubes also gain passive contractility from materials that cells produce, assemble, and remodel [10], [11], enabling adaptations over long time scales. Cellular activity governs both sets of processes, and biomolecules and biomolecular pathways in turn regulate, coordinate, and actuate this cellular activity. The necessity and function of contractile tubes are conserved over animal body plans through evolution [3]–[5], [12], and so are many of the biomolecules and biomolecular pathways that control contractile tubes, for example actin filaments [13] and myosin motors [14]. The basic blueprints for constructing and controlling biological tubes are very old.

1.4 – C. elegans is a powerful model organism

The nematode Caenorhabditis elegans is a recipient of ancient blueprints for constructing and controlling biological tubes, as is Homo sapiens. With a last common evolutionary ancestor living ~ 600 million years ago [15], many of these blueprints, and the biomolecules and biomolecular pathways that provide the material and logic to build what is described, are still closely shared. C. elegans is an established and widely used model organism, and today we build on ~ 55 years of effort from researchers who developed and studied this remarkable living laboratory. The full virtues of the worm are better extolled in dedicated works [16], but a couple of highlights include its status as the first multicellular organism with a fully sequenced genome, and a developmental map tracing every one of its 959 adult somatic cells back to the one-cell embryo. The C. elegans research community must also be highlighted here, as the worm would not be the powerful model it is without the organized, open, and collaborative culture of many C. elegans researchers. C. elegans characteristics relevant for this dissertation include a transparent body and an ease of engineering transgenic strains (Figure 1.1).

1.5 – The C. elegans spermatheca is a model contractile tube

The spermatheca is a 28-cell contractile tube in the nematode’s reproductive system [17], which consists of two U-shaped tubes holding germ line stem cells and developing oocytes, two spermathecae, and a shared uterus. The spermatheca is the site of fertilization, and stretches considerably to accommodate an entering oocyte, remains distended for a regulated time coincident with eggshell formation, and finally initiates coordinated cellular contractions to expel the egg [18]. The 28 cells of the spermatheca are arranged in distinct zones, with 8 cells arranged in pairs to form a distal neck that acts as a valve, 16 cells arranged in a bag that accommodates the embryo, and 4 cells fused into a syncytial sp-ut valve between the spermatheca and uterus (Figure 1.1).

Figure 1.1. C. elegans spermatheca overview. (Top left) A transgenic nematode expressing the fluorescent protein calcium sensor GCaMP in its spermathecae, with one occupied and one empty spermatheca. Scale bars are 100 μm in the large image, and 10 μm in the insets. (Top right) A cartoon schematic depicting the tissue level organization of the spermatheca. (Bottom) Still images from an embryo transit movie, visualized with the fluorescent calcium sensor GCaMP. Scale bar is 10 μm, all frames are on the same color scale as the first frame. Images reproduced from reference [19].

Spermathecal tissue function is tightly regulated and robust; each spermatheca processes ~150 eggs [20], [21], with low variances in size and shape [22], in wildtype animals. Embryos take ~ 10 minutes to completely transit through the spermatheca, and the next embryo transit through that spermatheca occurs ~ 30 minutes later, providing a regular, stereotyped tissue activity that can be recorded and studied in great detail [18].

Many highly conserved, fundamental biological processes act in the spermathecal cells, including, but not limited to: calcium signaling [19], [23], lipid signaling [23], [24], signaling by small GTPases such as RhoA/RHO-1 and Cdc42/CDC-42 [19], [22], [23], actomyosin fiber formation and alignment [21], [25], and regulation of the actin cytoskeleton by actin binding proteins such as filamin [20], [23] and spectrin [26]. Even with these efforts, there is still much we don’t understand about how these 28 cells integrate and coordinate numerous biological processes to form a robust, functioning tissue.

1.6 – Calcium signaling regulates cellular contractility

Calcium is almost ubiquitous in biological systems, and calcium ions frequently act as second messengers in cells [27]. Calcium concentrations in the cytoplasm are tightly regulated, and generally maintained at low levels [28]. Calcium signaling regulates many biological processes, including proliferation, differentiation, and transcription [29]. Calcium signaling is also a major regulator of cellular contractility [30]. Previous work, as well as work in this dissertation, shows that calcium is a key regulator of spermathecal tissue function. Low calcium makes the spermatheca less contractile, slowing embryo transits or causing the embryo to get stuck in the spermatheca, whereas high calcium makes the tissue more contractile, with faster embryo transits and deformed embryos [19], [23].

1.7 – Genetically encoded fluorescent calcium sensors and fluorescence microscopy enable in vivo study of calcium signaling in the spermatheca

Since calcium is recognized as a central biological signaling molecule [29], calcium sensors have been developed to monitor calcium during biological phenomena [31]–[33]. Easy transgenesis of C. elegans allows genetically encoded, fluorescent protein based calcium sensors to be introduced into the nematode, and tissue specific promoters can drive expression only in the spermathecal cells. Using a fluorescence microscope, we can record spermathecal calcium sensors during embryo transits (see reference [19] Movie 1). Calcium sensor movies of embryo transits indicate that calcium signaling in the wildtype spermatheca is very tightly regulated; there is almost no calcium activity until an oocyte enters, then a stereotyped program of calcium signaling runs to expel the egg, and then the empty spermatheca goes back to no calcium activity (Figure 1.2). This tight coupling of calcium signaling with embryo transits is exciting, because the spermatheca is not innervated, and no chemical traveling from the embryo to the spermathecal cells has been identified, suggesting the robust control of calcium signaling and tissue function in the spermatheca arises locally from the cells themselves in response to the stretch from the embryo [34].

Figure 1.2. C. elegans spermathecal calcium activity is linked to embryo transits. (Top) A time series of the average pixel intensity from a calcium sensor embryo transit movie. The embryo transit in this time series occurs between 1500 and 2000 seconds, and the calcium sensor shows a flat line before and after embryo transit. (Bottom) Still images from an embryo transit movie, visualized with the fluorescent calcium sensor GCaMP. Scale bar is 10 μm, all frames are on the same color scale as the first frame. Images reproduced from reference [19].
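The time series in Figure 1.2 reduces each movie frame to a single number, the average pixel intensity. The following is a minimal Python sketch of that reduction, not the program used in this work. It assumes a movie is held as a list of frames, each frame a list of rows of pixel values, and the function name and toy values are hypothetical.

```python
from statistics import mean


def mean_intensity_time_series(movie):
    """Return one average pixel intensity per frame.

    `movie` is a list of frames; each frame is a list of rows of pixel values.
    (This in-memory layout is an assumption made for illustration.)
    """
    return [mean(pixel for row in frame for pixel in row) for frame in movie]


# Toy 3-frame, 2x2-pixel "movie": quiet, active, quiet (made-up values).
toy_movie = [
    [[10, 10], [10, 10]],
    [[10, 50], [90, 10]],
    [[10, 10], [10, 10]],
]
series = mean_intensity_time_series(toy_movie)
```

In this toy example the middle frame has the highest average, mirroring the flat signal before and after an embryo transit with activity in between.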

1.8 – Conclusions and future directions

In this chapter I described the experimental system of the C. elegans spermatheca as a model contractile tube. The spermatheca features a stereotyped tissue function, embryo transit, which can be observed and recorded in living, intact animals. Fluorescent protein sensors can be introduced into the spermatheca to monitor a variety of proteins, as well as biological signaling molecules such as calcium. Calcium plays a central role in cellular signaling, and in the spermatheca calcium is tightly coupled to embryo transits. Calcium signaling in the spermatheca can be observed and recorded using calcium sensor movies, which are the focus of this dissertation. There are still aspects of the timing and coordination of embryo transit calcium signaling that we don’t understand, making calcium sensor movies a rich source of data that can be further mined for biological understanding. The numerous biological processes acting in the spermathecal cells provide us an opportunity to develop an integrated understanding of these processes and their interactions at a molecular level, likely generating insights about how other biological tubes function.

Chapter 2 – Development of a lab bioimage informatics solution

2.1 – Abstract

Biological data are inherently noisy, so the exploration and analysis of large amounts of data are required to draw strong biological conclusions with confidence. This chapter describes the motivation, design goals, and implementation of a lab bioimage informatics system, which organizes fluorescence microscopy movies in a standardized way, and associates multiple pieces of information with each movie. This standardized organization is accomplished using Matlab programs, leveraging GitHub. At the time of writing, this lab bioimage informatics system has been operational for two and a half years, houses over one thousand movies, and is used by multiple researchers in the lab.

2.2 – Introduction

Fluorescence microscopy is a widely used technique in cell and molecular biology.

Ongoing advances in microscopes and digital imaging, sample preparation and labeling, fluorescent molecules and proteins, and genome editing are generating datasets of increasing size and complexity. The emerging field of bioimage informatics aims to turn bioimaging data into biological insights [35]. As a nascent field, the definition and boundaries of bioimage informatics are not yet fully determined; it is viewed by different people as an intersection of , image analysis, and [36], as a subfield of [37], or as an engineering discipline [38]. Despite these disparate paths to the bioimage informatics field, there seems to be consensus that standardization and organization of the storage, processing, analysis, and presentation of imaging data is essential if we are to maximize the biological understanding that we can confidently extract, share, and build on.

Most modern imaging is digital, meaning imaging devices convert light into electrical signals, which in turn define discrete values for each discrete picture element (pixel) that comprises an image. In the case of fluorescence microscopy, cameras digitize fields of photons to pixel values to generate an image, or photomultiplier tubes digitize photons at a single point that is scanned over an area to generate an image. Fluorescence microscopy and digital imaging are so entangled that modern image-based biological investigation requires a transformation from the biological domain to the computational domain [39]. The inputs of fluorescence microscopy are biological (samples on a slide), and the outputs are computational (files in a computer).

The biological and computational domains use very different languages, and there is little overlap in vocabulary or concepts. This is good for avoiding ambiguity, but makes things difficult for people who want to understand the entire process of modern image-based biological investigations. Even bioimage informatics papers explicitly intended for experimental biology quickly enter the computational domain and use language and concepts that can be unfamiliar and intimidating to people without a computational background [40], [41]. This observation is not intended as an indictment of individual authors or of the field in general; it is rather a simple statement of reality. Domain-specific language arises, at least in part, because it enables people working in that domain to communicate more efficiently. The gulf between the biological and computational domains is a known problem [42], with a view of the landscape circa 2012 discussed in reference [43] (Figure 2.1). The proper way to address this problem seems to be to train in both domains, and courses and training programs across these domains are increasing in popularity to meet this need. A less than ideal way to deal with this problem is presented in this chapter. My image-based biological investigations reached a point where a bioimage informatics solution was needed, but the language barrier obscured the utility of the available bioimage informatics solutions. I could see a path to a solution using software packages that were popular in the lab, so I acquired the necessary computer programming and software development skills, and I built a lab bioimage informatics system to support our image-based biological investigations. This chapter describes the development of this lab bioimage informatics system.


Figure 2.1. Reprinted by permission from Springer Nature Customer Service Centre GmbH: Springer Nature, Nature Methods, Current challenges in open-source bioimage informatics, Albert Cardona et al., 2012. Reference [43].

2.3 – Motivation

In the spermatheca during embryo transits, the 28 cells integrate changing biochemical and biophysical information, and respond with coordinated cellular activity that produces stereotyped and robust tissue function. There is still much we don’t know about how multicellular systems like this are controlled, and the spermatheca provides a powerful model system in a living, intact animal. We do know that calcium signaling is a strong readout of the control system that regulates spermathecal contractility and tissue function, making calcium sensor movies an important data source and a common technique in the lab. Most undergraduate students can learn to acquire calcium sensor movies independently over the course of a single semester, although it takes dedication to acquire the numbers of movies required for solid scientific conclusions. Raw calcium sensor movies reach file sizes of multiple gigabytes, which can introduce difficulties transferring, visualizing, processing, and analyzing these files. Gigabytes of data are rendered useless if the genetic background, growth conditions, experimental perturbations, and other relevant information about that sample cannot be determined with a high degree of confidence, so experimental metadata must also be associated with each movie.

Raw movies are collected from live animals, and the animals are prepared for imaging with minimal manipulation to preserve as much normal biology as possible. This results in raw movies with animals in random orientations, and animals that can still move on the slide despite optimized immobilization protocols. To minimize this variability and increase the power of downstream analyses, we process the raw movies to a standard form by aligning movie frames to decrease movement artifacts, rotating to produce a standard orientation, and spatially cropping to standardize the frame size. The processing reduces the size of the movies from gigabytes to hundreds of megabytes, making the files easier to work with. A difference of a few degrees in the rotation or a differently placed spatial cropping box could have drastic consequences in downstream analyses, so processing metadata must also be associated with each movie.

In embryo transit movies, the opening and closing of the distal and sp-ut valves are important events that characterize spermathecal tissue function. These valve events can be used for time segmentation and normalization of spermathecal activity, facilitating detailed analyses, and for comparing movies from different sensors, facilitating expanded analyses. These valve events are currently discerned by human eyes, producing annotations that must be associated with each movie.

Movies are difficult to explore and analyze in large volumes, making smaller representations of the movie data useful. Calcium sensor movies are visualized and analyzed as time series that provide a snapshot of the movie while condensing hundreds of megabytes of movie information down to kilobytes of time series information. Calcium sensor movies are also visualized and analyzed as kymograms, single images that preserve some spatial information while condensing hundreds of megabytes of movie information down to hundreds of kilobytes of kymogram information. Time series and kymograms can vary depending on how they were generated, so processing metadata must be associated with each file. Information about the movies is useful for analyzing the time series and kymograms, so experimental metadata, time point annotations, time series, kymograms, and movies must all be associated with clear connections (Figure 2.2).
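As a rough illustration of how much a movie condenses, the following Python sketch reduces a tiny movie to a time series and a kymogram. It assumes the time series is the mean intensity of each frame and the kymogram is the mean over the rows of each frame, stacked over time. Both reductions are illustrative assumptions, not a description of how the lab's files are generated, and the function names are hypothetical.

```python
def time_series(movie):
    """Mean pixel intensity of each frame: one number per time point."""
    return [
        sum(sum(row) for row in frame) / (len(frame) * len(frame[0]))
        for frame in movie
    ]


def kymogram(movie):
    """Average each frame over its rows, keeping one value per column.

    The result is a single 2-D image with one row per frame, so one
    spatial axis survives while the other is collapsed.
    """
    return [
        [sum(column) / len(column) for column in zip(*frame)]
        for frame in movie
    ]


# A toy movie: 2 frames, each 2 rows by 2 columns.
movie = [
    [[0, 2], [4, 6]],
    [[1, 1], [1, 1]],
]
trace = time_series(movie)   # one value per frame
image = kymogram(movie)      # one row per frame, one column per x position
```

For a movie of T frames with H rows and W columns, the time series holds T values and the kymogram holds T × W values, compared with T × H × W values in the movie itself.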

The work in this chapter was motivated initially by a personal desire for a framework to automate the organization, processing, and analysis of calcium sensor movies. Standardized processes, programs, directory structures and locations for input and output files were defined with input from lab members, leading to a motivating vision of future researchers being able to access and use any calcium sensor movies that were taken by previous researchers.

Figure 2.2. Many pieces of information are associated with each calcium sensor movie. For every calcium sensor movie, experimental metadata, processing parameters, and annotations must be collected and stored. Selected examples are provided for each category. Processed outputs for each movie include time series and kymograms. All of this information must be stored in a way that facilitates downstream data exploration and analysis.

2.4 – Design goals

In order to develop an effective bioimage informatics system, I carefully considered what the system should accomplish, how it should operate, and how to make it so other people would want to use it. I started with a system for my own imaging data and analysis, and realizing efficiency gains I considered extending my computer programs to make them usable by other researchers in the lab. I spoke with potential users to gather information about their preferences and what such a lab bioimage informatics system would look like to them. I combined this information with insights from my experience to identify the essential requirements and characteristics for a lab bioimage informatics system, generating the design goals for the system (Figure 2.3). These design goals can be broadly classified into three categories: organization, standardization, and usability. The rest of this section will describe the design goals for the first generation lab bioimage informatics solution, providing background about how each goal was arrived at and rationale for why they are considered essential.

Figure 2.3. Lab bioimage informatics system design goals. 11 design goals were used to guide the development of a lab bioimage informatics system. These goals can be classified into 3 categories, represented by the circles in the figure. An ideal lab bioimage informatics system, meeting all the design goals, is represented by the black star at the overlap of all 3 circles.

2.4.1 – Organization and research efficiency

The first category of design goals relates to organization. This category is motivated by research efficiency. If files are well organized with known locations they are more easily located by humans, increasing access speed and decreasing frustration and cognitive load, resulting in faster results and increased research efficiency. Well organized files in known locations are also easily located by computer programs, enabling semi- and fully- automated processing and analyses to advance research in the background while human researchers do other tasks. Building and analyzing large datasets requires schemes to organize the data, and the variability of biological data requires large datasets to support strong scientific conclusions, ergo advancing biological understanding requires organization.

2.4.1.1 – Unique identifier

A core requirement for a lab bioimage informatics system is that each image file should be assigned an identifier that is guaranteed to be lab unique. Lab uniqueness provides a safeguard against files being accidentally moved to different folders, and facilitates exploration of data across multiple researchers and experiments, where user defined filenames are often relatively unique for that researcher or experiment, but not guaranteed to be unique over the entire lab.

An ideal unique identifier could also have a powerful secondary function by serving as a core string for downstream processing and analysis filenames, and as a primary key for databases of image metadata and processing and analysis parameters and output. There is a serious constraint on this design goal, with file path lengths limited to 255 characters in the Windows and macOS operating systems used by most biological researchers. The file path length limit means that the entire file location specification, from the root directory through all folders and subfolders to the filename and extension, must total less than 255 characters. This length constraint generates a selective pressure to make the identifier as small as possible.

Design goal: A unique identifier, small enough to be included in the name of every data file, processing step, and analysis output, is necessary for the system because it ensures confidence in the foundational data and clarity about the processed data that will support advanced analyses and understanding.

2.4.1.2 – Human readability

This design goal came from experimental biology experience analyzing movies manually and with my personal bioimage informatics system, and from conversations with potential users. A popular solution for unique identifiers involves generating alphanumeric strings of sufficient length that combinatorics guarantees uniqueness over the domain, with the gold standard being a Universally (or Globally) Unique Identifier (UUID or GUID), a 128-bit number, 36 characters in length, coded in 32 alphanumeric hexadecimal digits with 4 hyphens [44]. While this may provide the most efficient guaranteed unique identifier, it obliterates all information that would give a human context about the file. An above average human won’t be able to hold more than maybe a dozen of these context-less alphanumeric strings in mind at any given time, limiting capabilities for manual human analyses and exploration of the data.
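To make the contrast concrete, the short Python sketch below generates a random UUID with the standard library and checks the structure described above. It is included only to show what a context-less identifier looks like; the variable names are illustrative.

```python
import uuid

# A random (version 4) UUID from the Python standard library.
opaque_id = str(uuid.uuid4())

# The canonical text form is 32 hexadecimal digits plus 4 hyphens.
hex_digits = opaque_id.replace("-", "")
hyphen_count = opaque_id.count("-")

print(opaque_id)
print(len(opaque_id), len(hex_digits), hyphen_count)  # 36 32 4
```

The printed string changes on every run and carries no information about who acquired a movie or when, which is the limitation this design goal is meant to avoid.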

Calcium sensor movies can be hard won, with some representing the culmination of extended months of genetic engineering and / or genome editing efforts. The movies are acquired in free-living organisms, immobilized for the imaging session; sometimes the animals do not want to cooperate, and hours and days can be spent at the microscope with nothing to show for it. Experimental biologists can become heavily invested in and connected to these precious movies, and analyzing the data rigorously and with confidence requires that we be able to track how the movies are processed and to be able to identify the resulting data points in the final pooled statistical plots. Feeding these hard won movies to a program that would strip them of context and force future interactions to use context-less alphanumeric strings would be demoralizing and present a barrier to wide usage. Conversations with potential users and other experimental biologists reinforced this view.

Design goal: The unique identifier must be human readable because a context-less identifier would present a barrier to researchers using the system, and would limit manual human exploration and analysis of the data.

2.4.1.3 – Machine readability

This design goal came from a desire to make it easy to write computer programs that can parse the unique identifier to locate data inputs, processing and analysis outputs, and metadata and processing information. With computer programs able to recognize the unique identifiers, files can be automatically processed and analyzed, and output files can be saved without human input. With a unique machine readable identifier for each file, processing and analysis results from different steps and / or analyses can be automatically integrated to get a fuller picture of the data. Machine readability also helps ensure that output files can be opened and used by a variety of other computer programs.

Design goal: The unique identifier must be machine readable to enable automated location, loading, processing, analysis, and saving of files, as well as enabling files to be loaded and manipulated in other computer programs.

2.4.1.4 – Standardized directory structure

This design goal came from a desire to enable different subsets of the image archive to be distributed to multiple researchers on their local computers while simultaneously allowing a centralized archive containing all the movies to be housed in the lab. In order for this to happen smoothly, files must have specific folders that they are placed in, and these folders must be consistent across the local and central archives.

Standardized directory structures expand on the other organization design goals, providing a unique address for each file, facilitating human readability by providing the same folders and files on personal computers and the lab archive, and facilitating machine readability with the same folders and file paths accessible to computer programs on different systems.

Design goal: A standardized directory structure is useful to extend the organization goal and provide each file a unique address that is human and machine readable.

2.4.2 – Standardization, research reproducibility and data provenance

The second category of design goals relates to standardization. This category is motivated by research reproducibility and data provenance. Standardizing processing and analysis by using computer programs removes sources of variability and enables rigorous comparison. Standardized processing and analysis requires standardized steps, which can be tracked and recorded, providing clear chains of data provenance that allow any data point to be traced back to its raw movie, and increasing reproducibility by enabling processing and analysis programs to be re-run with the exact same parameters. Building and analyzing large datasets with confidence requires standardized, reproducible, and auditable methods, and the variability of biological data requires large datasets to support strong scientific conclusions, ergo advancing biological understanding requires standardization.

2.4.2.1 – Raw files

This design goal came from a ‘garbage in, garbage out’ philosophy, and a desire to have entry into the lab image archive be a strong initial filter that ensures only quality data is sent to downstream processing and analysis to support scientific conclusions. Details about what the movies are of, i.e. experiment metadata, and how the movies were acquired, i.e. acquisition metadata, are important for processing and analysis. Standardizing this metadata and setting requirements for addition to the image archive ensures that data will be able to be processed and analyzed fully, and that consistent information is available for more advanced analysis and exploration.

Design goal: Raw files must be accompanied by standardized acquisition and experiment metadata when they are added to the lab image archive.


2.4.2.2 – Processing

This design goal came from a desire to make sure movies are processed the same way, regardless of researcher, to be able to check the work of the processing programs, and to be able to reproduce a processed movie from a raw movie at any point. To accomplish this, processing needs to be broken down into discrete steps, and the information required to reproduce the transformations between processing steps needs to be recorded. An added benefit of having the full chain of transformation from raw to processed movie is that data storage space can be conserved by only storing the raw movies and regenerating the processed movies when needed.

Design goal: Processing of files must be accompanied by a record of the processing steps and transformations from raw to processed files.
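One way to picture such a record is as an ordered list of named steps with their parameters, stored alongside the movie identifier. The Python sketch below is a minimal illustration under that assumption; the class names, step names, parameter names, and the choice of JSON are all hypothetical and are not taken from the lab system.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ProcessingStep:
    name: str          # hypothetical step label, e.g. "rotate" or "crop"
    parameters: dict   # the values needed to repeat the step exactly


@dataclass
class ProcessingRecord:
    identifier: str                            # unique identifier of the movie
    steps: list = field(default_factory=list)  # steps in the order applied

    def add(self, name, **parameters):
        """Append one step, preserving the order of processing."""
        self.steps.append(ProcessingStep(name, parameters))

    def to_json(self):
        """Serialize the record so it can be stored next to the movie."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text):
        """Rebuild a record from its stored form."""
        data = json.loads(text)
        record = cls(data["identifier"])
        for step in data["steps"]:
            record.steps.append(ProcessingStep(step["name"], step["parameters"]))
        return record


record = ProcessingRecord("JAB_18Sep02_02")
record.add("rotate", angle_degrees=37)
record.add("crop", x=120, y=80, width=800, height=400)
restored = ProcessingRecord.from_json(record.to_json())
```

Because the stored record restores to an identical object, a program could in principle replay the same steps on the raw movie to regenerate the processed movie.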

2.4.2.3 – Analysis

This design goal came from the scientific principles that things being compared should be analyzed the same way, and that we should have a firm understanding of how they were analyzed. Analysis can be used to make raw data support almost any conclusion, especially in computational analysis where many automated procedures with adjustable parameters can be used. By using computer programs for analysis, standardized analysis procedures can be used and parameters can be recorded and stored, enabling auditable trails from processed movies through data points to scientific conclusions.

Design goal: Analysis of data must be accompanied by a record of the procedures used, parameters for those procedures, and documentation and rationale for any curation of the dataset.

2.4.2.4 – Presentation

This design goal came from the benefits of generating figures using computer code. The dataset going in to a plot can be recorded and stored for future interrogation, spacing and layout can be optimized and the dataset explored for representative images, and user choices can be recorded and stored for reproducibility.

Design goal: Figures and other presentation output should be generated via code as much as possible to provide a record of the procedures and parameters used.

2.4.3 – Usability

The final category of design goals relates to usability. This category is motivated by a desire to build a system that other researchers will actually use. A system that beautifully organizes data and standardizes processing and analysis might as well not exist if no one uses it. It seems reasonable that people will use the system if it is easy to use and some benefits of use are clear. Another facet of usability is making sure a system that comes to be relied on can be expected to incorporate future advances and not break with an update. Building and analyzing large datasets requires programs for organization and standardization, and programs need to have high usability for people to actually use them, ergo advancing biological understanding requires programs with high usability.

2.4.3.1 – Minimize barriers to use

This design goal came from a human preference for things that are simple and easy to use. A great deal of information is available from the raw image metadata, and standardized processing and analysis procedures and parameters enable user input to be minimized. Researchers can be efficiently guided through processing steps in an interactive manner, without having to think about what to do next or where to save the processing parameters and output files.

Some people experience anxiety interacting with computer code, so the ideal system should enable use with minimal code interaction, i.e. loading the programs and running them. Some people enjoy interacting with computer code, so the ideal system should enable use with maximal code interaction, i.e. analyzing the programs and modifying them.

Design goal: The ideal system will make processing and analysis easier and faster by leading the researcher through processing steps and collecting processing parameters. The ideal system will support a spectrum of code interaction, from loading established programs and running them to writing new programs.

2.4.3.2 – Maximize benefits of use

This design goal came from a human preference for things that can increase capability or understanding with minimal effort. Mediating organization and standardization with computer programs can generate consistent file conventions and locations without the user having to think about it; just click through the program, and consistent outputs are found in expected places without having to manually name or move them around. These outputs can then be fed to other programs, producing analysis outputs with the push of a button. Being able to trace any data point back to its raw movie increases confidence in the data.

Design goal: The ideal system will have benefits of organizing data with minimal user input, automating as much as possible, and generating an auditable trail that can increase confidence in analysis output. The ideal system will automatically generate output files and save them to standardized file locations without further human input.

2.4.3.3 – Extensibility and future-proofing

This design goal came from a desire to ensure that the system can be used into the future, accommodating software and hardware changes, new microscopy equipment and experiments, and new fluorescent sensors. The best way to accomplish this is to make sure the computer programs that comprise the system are easy to understand and well documented, with any dependencies on other packages made clear.

Design goal: The ideal system will be easy to understand and well documented.


2.5 – First generation lab bioimage informatics solution

With these design goals as guides, I extended my computer programs to develop my personal bioimage informatics system into an accessible and useful lab bioimage informatics system. The lab system went through a few iterations of feedback and refinement to produce the current version of the system, which has been up and running for two and a half years and is used by multiple people in the lab. The next sections will describe this first generation lab bioimage informatics solution.

2.5.1 – Unique identifier

The unique identifier devised for the system contains small pieces of context information about the movie, facilitating human readability, separated by underscores, facilitating machine readability. The identifier is of the form ABC_##Mon##_##. The first three letters are the initials or three letter code of the person who acquired the movie, and the middle segment is the date the movie was acquired, with the last two digits of the year followed by the three letter month and the two digit day. After the second underscore is an append number that guarantees uniqueness of the identifier: when the movie is entered into the lab movie archive, the first two segments of the identifier are assembled, and the archive is checked for other movies containing this partial identifier. If no other movies contain the partial identifier, the append number is set to 01; if other movies do contain the partial identifier, the append number is increased until uniqueness is achieved. This process is carried out programmatically in Matlab, leveraging GitHub as will be explained below.
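The assignment rule can be sketched in a few lines. The lab programs are written in Matlab; the Python below is only an illustration of the logic described above, with hypothetical function names and with the archive represented as a plain set of identifiers already in use.

```python
from datetime import date

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def partial_identifier(code, acquired):
    """Assemble the first two segments, e.g. 'JAB_18Sep02'."""
    return f"{code}_{acquired:%y}{MONTHS[acquired.month - 1]}{acquired:%d}"


def next_identifier(code, acquired, existing):
    """Return the first identifier for this person and date not yet in use."""
    partial = partial_identifier(code, acquired)
    append = 1
    # Increase the append number until the full identifier is unique.
    while f"{partial}_{append:02d}" in existing:
        append += 1
    return f"{partial}_{append:02d}"


archive = {"JAB_18Sep02_01"}  # identifiers already in the archive
new_id = next_identifier("JAB", date(2018, 9, 2), archive)
print(new_id)  # JAB_18Sep02_02
```

The month abbreviations are listed explicitly so the result does not depend on the locale of the machine running the sketch.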

An example identifier is JAB_18Sep02_02. This 14 character string displays at a glance that this is the second movie that I acquired on September 2nd, 2018. It is also small enough to be used as an organizing column in spreadsheets of information, it can accommodate further small string tags to help organize file names, and it leaves plenty of file path real estate before the 255 character limit is reached. The append number is two digits by default, because it seems unlikely that a single person will be able to acquire more than 99 movies in a single day, but it could easily be expanded to three digits if it becomes an issue. The three letter initials could become problematic if there are multiple people in the lab with the same initials, or for people who have more or fewer than three initials; in either case the easiest solution would likely be to view the letters as a personal code rather than initials, and select three that aren’t currently in use. The letter codes could be duplicates, meaning two people could have the same code if they are separated in time. However, having two different people adding to the movie archive simultaneously with the same letter code should be avoided.
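Because the identifier has a fixed shape, a program can recover its parts with a single pattern. The Python sketch below illustrates this for identifiers at the start of a file name. The pattern, the function name, the `_Kymo` tag in the example, and the assumption that two digit years fall in the 2000s are all illustrative choices rather than features of the lab programs.

```python
import re
from datetime import date

MONTHS = {"Jan": 1, "Feb": 2, "Mar": 3, "Apr": 4, "May": 5, "Jun": 6,
          "Jul": 7, "Aug": 8, "Sep": 9, "Oct": 10, "Nov": 11, "Dec": 12}

# ABC_##Mon##_## : letter code, date, then a two or three digit append number.
ID_PATTERN = re.compile(
    r"(?P<code>[A-Z]{3})_"
    r"(?P<yy>\d{2})(?P<mon>[A-Z][a-z]{2})(?P<dd>\d{2})_"
    r"(?P<append>\d{2,3})"
)


def parse_identifier(name):
    """Split the identifier at the start of a file name into its parts."""
    match = ID_PATTERN.match(name)
    if match is None or match["mon"] not in MONTHS:
        raise ValueError(f"no identifier at the start of {name!r}")
    # Assumption: two digit years refer to 20##.
    acquired = date(2000 + int(match["yy"]), MONTHS[match["mon"]], int(match["dd"]))
    return {
        "identifier": match.group(0),
        "code": match["code"],
        "acquired": acquired,
        "append": int(match["append"]),
    }


parts = parse_identifier("JAB_18Sep02_02_Kymo.tif")
print(parts["identifier"], parts["acquired"], parts["append"])
```

The same function works on the bare identifier and on file names that add a tag after it, which is what allows outputs from different steps to be matched back to one movie.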

2.5.2 – Local and lab movie archives

In this first generation system, standardized organization and directory structures are achieved in an automated way by computer programs, guaranteeing machine and human readability with consistent folders, and increasing usability by removing things for the user to think about and perform manually (Figure 2.4). Users only need to define the Archive folder the first time the system is run on a new machine; after that, the programs handle making and naming folders, and moving files between folders. Processing programs know what to name output files and where to place them, analysis programs know where to find files, and these files can be expected to have the same name and be in the same place across machines, whether connected by a wired connection to the lab RAID (Redundant Array of Independent Disks, the ~ 40 terabyte data storage unit in the lab) or exploring the latest data on a flight without internet.

Human readability of the organization scheme is paramount, as machines can be programmed to operate on almost any organization system we can imagine, but if humans are using new systems that are hard to navigate or too different from their natural organization tendencies they will use the system begrudgingly, if at all, and will be tempted to revert to their personal ways or other alternatives, undercutting the goal of common standardized organization.

The specific directory structure used for the first generation system was arrived at after much thought and feedback about how it should be arranged. The user decides on the naming and placement of an Archive folder for a new machine, ideally naming it something small and placing it close to the system root to reserve as much space as possible for file paths to avoid the 255 character limit. The naming and placement of the Archive folder is under user control, enabling some flexibility and customization for different researchers to increase usability. Two GitHub repositories are in the Archive folder; one holding the code of the computer programs for the system, and the other holding text files of metadata for the movie archive. These folders will be discussed more in the next section. Within the Archive folder is also an Upload folder, which holds files that are ready to be entered into the archive, with processes discussed in a later section. Within the Archive Upload folder is a Processing folder which holds movies in various stages of processing, covered in the next chapter.

The bulk of the Archive folder is filled with movies and their processing and analysis outputs. The most important organizing principle for calcium sensor movies in the lab is the C. elegans strain, as this small, usually 5 - 7 character string describes the genetic background, transgene content, and genome editing record of the animals according to specified rules that are accepted and enforced in the C. elegans community [45]. Thus, the first level of directories in the Archive folder are the strain folders. The most common perturbations used for calcium sensor movies are RNAi or heat shock treatments, so these perturbations form an optional second level of directories that nest in the strain folders, with the more common RNAi taking precedence over a third level of heat shock directories if both are present. The combination of strain and any RNAi or heat shock treatments for a given movie is often referred to as the condition.

Within the strain folder, or RNAi or heat shock subfolder, the processed movies are located, as these are the files people are generally most interested in if they are manually exploring the movie archive. Within the strain folder, or RNAi or heat shock subfolder, separate folders hold the unprocessed raw movie files, kymograms generated from the processed movies, and time series files generated from the processed movies. Processing of the movie files is described in the next chapter.
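The nesting described above can be expressed as a small path builder. The Python sketch below is illustrative only: the subfolder names (`Raw`, `Kymograms`, `TimeSeries`), the file name suffixes, and the strain and RNAi labels are hypothetical placeholders, and the 255 character check simply applies the limit cited earlier in this chapter.

```python
from pathlib import PurePosixPath

MAX_PATH_CHARS = 255  # file path limit cited in the text


def condition_folder(archive, strain, rnai=None, heat_shock=None):
    """Strain folder first, then optional RNAi and heat shock levels."""
    folder = PurePosixPath(archive) / strain
    for level in (rnai, heat_shock):  # RNAi nests above heat shock
        if level:
            folder = folder / level
    return folder


def archive_paths(archive, identifier, strain, rnai=None, heat_shock=None):
    """Expected locations of one movie and its derived files."""
    folder = condition_folder(archive, strain, rnai, heat_shock)
    paths = {
        # Processed movies sit directly in the condition folder.
        "processed": folder / f"{identifier}.tif",
        # Hypothetical subfolder and suffix names for the other files.
        "raw": folder / "Raw" / f"{identifier}_Raw.tif",
        "kymogram": folder / "Kymograms" / f"{identifier}_Kymo.tif",
        "time_series": folder / "TimeSeries" / f"{identifier}_TimeSeries.txt",
    }
    for path in paths.values():
        if len(str(path)) >= MAX_PATH_CHARS:
            raise ValueError(f"path too long: {path}")
    return paths


paths = archive_paths("/Archive", "JAB_18Sep02_02", "STRAIN1", rnai="geneX_RNAi")
print(paths["processed"])  # /Archive/STRAIN1/geneX_RNAi/JAB_18Sep02_02.tif
```

Because every path is derived from the Archive folder, the condition, and the unique identifier, the same call gives the same relative layout on a local machine and on the central archive.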

A vision that guided the creation of this system featured a central lab movie archive, holding all the movies that are available, and distributed local movie archives on personal machines where researchers are only interested in subsets of the data and don’t have the storage capacity to hold all the movies. This first generation system realizes this vision, with common file names and placements across the central lab and distributed local archives, and programs enabling transfer of files between central and distributed archives.

2.5.3 – GitHub

Coding projects, particularly those intended to be worked on or used by other people, greatly benefit from version control. Version control enables changes to programs to be tracked, with records of who made the changes and what they were. Version control also enables working versions of programs to be isolated, providing an ability to work on extending or improving programs without having to worry about breaking something that works. This ability was invaluable to an amateur software development effort advancing through trial and error tinkering. The most popular version control system is the free and open source Git, which underlies the popular code sharing site GitHub. Git was originally created to enable collaboration between many programmers as they were building Linux, and this ability to synchronize information across multiple distributed locations was very useful for this first generation system.

2.5.3.1 – Local and remote repositories

Git and GitHub are utilized for two purposes in this first generation system. First, all the code used for the system is stored in a lab GitHub repository at https://github.com/CramWormLab. A GitHub repository is essentially a folder holding files and information about their previous versions. This GitHub repository is located on the local machines at a folder in the Archive directory, and provides a single folder, with the same organization across all machines, that holds all the programs for the system. This facilitates code distribution and ensures that every user of the system is working with the same programs, and aids usability of the system by providing a quick and easy way to get all the needed files and to make sure they are up to date.

Figure 2.4. Standardized directory structure overview. The Archive folder is defined by the user when the system is first set up on a new machine. Windows PC and macOS file systems are shown, to highlight that the lab bioimage informatics system can function across operating systems. Within the Archive folder, C. elegans strain names provide the primary organization. Metadata for the movies is stored in the ArchiveSheets folder, and programs are stored in the GCaMP_ArchiveProcessing folder; both are Git repositories hosted on the lab GitHub, indicated by the GitHub logo. The BackupSheets folder holds safeguard copies of ArchiveSheets files when they are being manipulated by programs. An Upload folder, not shown, holds movie files and an Excel sheet containing information for movies that are waiting to be entered into the archive. The Strain organization level also contains subfolders for RNAi or heat shock conditions, not shown in this figure. Within the Strain, RNAi, and/or heat shock folders, data are organized with the processed movies most immediately accessible, and kymograms, raw files, and time series files housed in subfolders. Standardized filenames based on the unique identifier can be seen at the Data level. The specifics of the files at the Data level are covered in the next chapter.

2.5.3.2 – GitHub as a distributed database management system

The second use for GitHub in this first generation system is to serve as a distributed database management system. The experiment and acquisition metadata for the movies in the lab Archive are stored in tab delimited text files, allowing them to be opened and used by any program. By storing these metadata files in a GitHub repository, they can be brought under version control so changes to the files can be examined at any future point. Another benefit of storing the metadata files in a GitHub repository is that all local archives have nearly instantaneous access to information about new movies loaded into the lab archive. Finally, having the metadata on GitHub is essential to generating unique identifiers, because the names of all the movies in the lab archive can be determined at the moment of file upload. Git was built to enable multiple people in disparate locations to make changes to a file simultaneously, and for these changes to be merged into a single file, resolving conflicts, with a record of who made what changes when. This functionality is leveraged in this first generation system to ensure unique identifiers and to track changes over a distributed lab image archive.
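Tab delimited text has the practical advantage that almost any language can read and write it with standard tools. The Python sketch below shows one round trip using the standard `csv` module; the column names and the example row are hypothetical and do not reflect the actual layout of the lab metadata files.

```python
import csv
import io

FIELDS = ["identifier", "strain", "rnai"]  # hypothetical column names


def read_metadata(text):
    """Parse tab delimited text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def write_metadata(rows):
    """Format row dictionaries as tab delimited text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


existing_text = "identifier\tstrain\trnai\nJAB_18Sep02_01\tSTRAIN1\tnone\n"
rows = read_metadata(existing_text)

# Add a new movie and list the identifiers now recorded in the sheet.
rows.append({"identifier": "JAB_18Sep02_02", "strain": "STRAIN1", "rnai": "geneX"})
updated_text = write_metadata(rows)
known_identifiers = {row["identifier"] for row in read_metadata(updated_text)}
```

Because each movie occupies one line of plain text, a version control system can show exactly which rows were added or changed between two versions of the file.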

2.5.4 – Matlab

The programs to upload files to the lab movie archive, and to move files between the central lab archive and distributed local archives, are written in Matlab. To increase usability, a goal was to limit the number of programming languages used. Matlab was chosen because the ImageJ macro language could not be made to programmatically interface with GitHub, and because an Excel file was used for uploading metadata and the ImageJ macro language on macOS could not be made to read Excel files. The Matlab programs handle the uploading and processing of metadata, the assignment of unique identifiers, the renaming of files to incorporate the unique identifiers, the creation of the organized directory structure, and the movement of files between local and central archives.

2.5.4.1 – Matlab_’X’_A_FileUpload.m; ‘X’= (GCaMP|RG)

The code for this computer program, and a template Excel sheet for the metadata, can be found on the lab GitHub. This program requires an internet connection and access to the CramWormLab GitHub remote repository. The program searches the Archive Upload folder for files that are available for upload, which are marked with a _Raw append before the file extension. Raw files at this stage can be named by any convention that the entering researcher prefers, so long as the name ends in _Raw.tif. If files ready for upload are found, an Excel sheet of metadata is examined to look for an entry for each file. A template and an example Excel sheet for the metadata are included in the GitHub repository, so the standardized form is conveniently available.
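The search for upload candidates can be sketched as follows. This is a minimal illustration written in Python rather than the Matlab of the actual program; the function name `find_raw_movies` is hypothetical, and only the `_Raw.tif` naming convention is taken from the description above.

```python
from pathlib import Path


def find_raw_movies(upload_folder):
    """Return the sorted names of files in upload_folder that end in _Raw.tif.

    Hypothetical helper: the real program is written in Matlab and may
    select files differently.
    """
    folder = Path(upload_folder)
    return sorted(
        path.name
        for path in folder.iterdir()
        if path.is_file() and path.name.endswith("_Raw.tif")
    )
```

Files without the `_Raw` append are simply ignored, so partially prepared files can sit in the same folder without being picked up.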

As a gatekeeper for entry into the lab movie archive, this Matlab program searches the Excel sheet, and if it does not find an entry for the movie to be uploaded, or if it finds the metadata entries incomplete, the program exits and tells the user to fix the problems and run the program again. Multiple movies can be uploaded simultaneously with a single Excel sheet, increasing usability.
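A minimal sketch of this gatekeeping check is given below, in Python for illustration only. The column names in `REQUIRED_FIELDS` and the `file_name` key are assumptions, since the actual fields of the Excel template are not listed here, and the function name is hypothetical.

```python
# Hypothetical column names; the real template defines its own fields.
REQUIRED_FIELDS = ("initials", "acquisition_date", "strain")


def find_metadata_problems(movie_names, metadata_rows):
    """Return a list of problems; an empty list means the upload may proceed."""
    rows_by_name = {row["file_name"]: row for row in metadata_rows}
    problems = []
    for name in movie_names:
        row = rows_by_name.get(name)
        if row is None:
            # No entry at all for this movie.
            problems.append(f"{name}: no metadata entry")
            continue
        # An entry exists; check that every required field is filled in.
        missing = [
            field for field in REQUIRED_FIELDS
            if not str(row.get(field, "")).strip()
        ]
        if missing:
            problems.append(f"{name}: missing {', '.join(missing)}")
    return problems
```

Collecting every problem before exiting, rather than stopping at the first one, lets the user fix the whole sheet in a single pass.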

With metadata entries confirmed for each file to be uploaded, the unique identifiers for the movies are generated. Fields in the Excel sheet for researcher initials and acquisition date are examined, and the partial unique identifier is assembled from the information in these fields. With this accomplished, the most recent metadata text files in the lab GitHub are pulled down. These files are checked for the partial unique identifiers for each movie, and the append numbers to complete the unique identifiers are determined. The metadata from the Excel sheet is divided into experiment and acquisition metadata and formatted for addition to the tab delimited text files holding the rest of the metadata for the archive. These updated metadata text files are then pushed back to the lab GitHub to update the lab archive metadata. If an error is encountered in this push back to the lab GitHub, the process is restarted by pulling down the most recent metadata and determining the unique identifiers for these movies anew. Since most other checks for the GitHub repository are completed at the beginning of the program, an error on this push is assumed to arise from another user uploading files to the archive and taking the unique identifiers in the time between the start of the program and the push, so the unique identifiers are determined anew to ensure that there are no conflicts.
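The assignment of append numbers can be sketched as follows. This is an illustration in Python, not the Matlab code of the program. The identifier layout `<initials>_<date>_<number>`, the two-digit number, and the function name are all assumptions made for the example; the text above specifies only that a partial identifier is built from the initials and acquisition date and completed with an append number.

```python
from collections import defaultdict


def assign_identifiers(existing_ids, partial_ids):
    """Complete each partial identifier with the next unused append number.

    existing_ids: identifiers already present in the archive metadata.
    partial_ids:  one partial identifier per movie being uploaded.
    The '<partial>_<NN>' layout is an assumption for illustration.
    """
    # Highest append number already used for each partial identifier.
    highest = defaultdict(int)
    for identifier in existing_ids:
        partial, _, number = identifier.rpartition("_")
        if number.isdigit():
            highest[partial] = max(highest[partial], int(number))

    assigned = []
    for partial in partial_ids:
        highest[partial] += 1
        assigned.append(f"{partial}_{highest[partial]:02d}")
    return assigned


existing = ["AB_20190412_01", "AB_20190412_02", "CD_20190412_01"]
new_ids = assign_identifiers(existing, ["AB_20190412", "AB_20190412", "CD_20190413"])
# new_ids == ["AB_20190412_03", "AB_20190412_04", "CD_20190413_01"]
```

If the push to the lab GitHub fails, the same function would be called again on the freshly pulled metadata, which is the restart described above.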

With unique identifiers verified, the movie files are then renamed. The old name of the movie file is saved in the metadata sheets, in case researchers made notes during acquisition or have some other reason to refer back to the original name. From here, the strain and RNAi and/or heat shock conditions are read from the metadata text files. From this information the standardized directory location for the file can be determined, and the file moved to this location. This is carried out with no further user input, and the sizes of the file in the new and old locations are compared to ensure no error has occurred in the renaming or movement. This program is designed with user flexibility in mind, so uploading can be carried out before or after the processing described in the next chapter. If the movies are processed before uploading, all the processed output files are found, renamed, and moved to their relevant locations with no further user input.

Normal usage of this program entails moving the movie file to the Archive Upload folder and adding ‘_Raw’ to the movie file name, filling out a templated Excel spreadsheet with metadata about the movie, and running the program. Checkboxes allow control over which movies in the upload folder will be added to the archive, for cases where multiple people are using the same computer, such as the computer that is hardwired to the RAID in the lab. When the program finishes, files have been renamed and moved to standard locations, and information about the new files is available to anyone with access to the lab GitHub, all without the user having to think about it.

2.5.4.2 – Matlab_ArchiveTransferToDiskVader.m

The code for this program is hosted on the lab GitHub. This program requires internet access, access to the lab GitHub, and connection to the lab RAID. The system was designed to enable subsets of the movie archive to be worked with on multiple local machines, and because the standardized directory structure features some nesting of folders that can make moving a lot of files manually tedious, programs were written to facilitate moving files between different machines and the lab RAID. The lab RAID is nicknamed DiskVader, hence the program title.

This program checks to make sure the computer has a connection to the lab RAID, and then identifies movies that are on the local machine but not on the lab RAID. There is also a column in the experiment metadata sheet that indicates whether the movie has been uploaded to the lab RAID. This column tells users seeing new movie information whether they should go to the lab RAID or to the uploading researcher to get the files, and it serves as a reminder to upload large batches of movies processed on local machines. Files that are found on the local machine but not on the lab RAID populate a list of checkboxes, giving the user control over which files are to be moved during that transfer session. The program then copies all the selected files onto the lab RAID, arranged in the standardized directory structure. Required directories are created if they do not already exist, and the file sizes of the original and copied files are compared to provide a check that the file was copied correctly. Some of these files are large and can take time to move, and large batches of files can be selected for a transfer session, so a progress message updates after each file is moved to show the user that the program is still running and has not frozen. When the program finishes transferring all the selected files, it updates the metadata text files to indicate the files are now on the lab RAID and pushes these files back to the lab GitHub. Another checklist then appears for the user to select files to delete from the local machine. This option is provided because sometimes the files need to be deleted from the local machine as soon as they are loaded onto the lab RAID, such as on the common computer in the lab, which quickly fills up with movie files, but sometimes they need to stay on the local machine, for example if the person is going to analyze them at home later.
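The copy step, with directory creation and the file size comparison, can be sketched as below. This is a Python illustration under stated assumptions: the function name, its arguments, and the example directory names are hypothetical, and the actual Matlab program may structure the transfer differently.

```python
import shutil
from pathlib import Path


def copy_with_size_check(source, archive_root, relative_dir):
    """Copy source into archive_root/relative_dir and verify the file size.

    relative_dir stands in for the standardized directory location;
    its layout here is an assumption for illustration.
    """
    source = Path(source)
    target_dir = Path(archive_root) / relative_dir
    # Create any required directories that do not already exist.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / source.name
    shutil.copy2(source, target)
    # Compare sizes of the original and the copy as a basic integrity check.
    if target.stat().st_size != source.stat().st_size:
        raise OSError(f"size mismatch after copying {source.name}")
    return target
```

The original file is left in place, which matches the design above in which deletion from the local machine is a separate step chosen by the user.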

2.6 – Conclusions and future directions

The system described in this chapter is operational and has been in use by multiple people in the lab for two and a half years. It has been used to support work I have published as well as external collaborations. The current movie archive contains over one thousand movies, acquired by eight different people over a period of ten years.

The only necessary improvement that I can see is that more efficient ways to deal with the metadata sheets will eventually need to be found. There are a number of steps in the programs where the metadata sheets are searched, for example when the unique identifier is being assigned. As more movies have been entered into the archive, these searches have been taking longer, and with multiple searches the programs as a whole get slower. The programs are not prohibitively slow at ~1000 movies, but I expect that as the number of movies continues to rise the programs will eventually slow to a point where they are no longer usable. This slowing of searches as a database increases in size is a common computing problem; there are almost certainly more efficient ways to deal with the sheets and searches than are currently coded in this first version, and at some point in the future the programs will need to be upgraded in this way.

There are many ways this system could be extended; two logical next steps are discussed here. First, the system was built for widefield calcium sensor movies. Over the course of use, it was extended to accommodate two channel widefield fluorescence movies, with one channel for a calcium sensor and a second channel for another fluorescently labeled protein. While it was possible to extend the system to new sensors, it is an area that could use more thought and improvement. No attempt was made to extend the system to other fluorescence microscopy modalities, such as confocal or light sheet data, and this extension would likely take a great deal of thought and work. The other focus for extending the system would be to make it usable beyond a single lab. Currently the GitHub repositories holding the computer code and metadata sheets are hardcoded into the programs; some thought would have to go into flexibility in loading repositories and into how best to share and synchronize programs across labs. Another hurdle here is that the unique identifier is guaranteed to be unique within the lab, but not beyond it. These extensions would likely not be hard to accomplish, but they were beyond the scope of this endeavor.

2.6.1 – Alternative bioimage informatics solutions

Efforts to standardize the organization, processing, and analysis of fluorescence microscopy data have been advancing in recent years, led primarily by people with formal training in computer science or software development. I attempted to explore these solutions when I first noticed that the organization and analysis of my microscopy data was cumbersome and laden with inefficiencies, but the language barrier between the biological and computational worlds obscured the utility of these established bioimage informatics systems. I saw a clear path to a solution using software packages that were already popular in the lab, and so I acquired programming and software development skills and built a system that met the widefield calcium sensor bioimage informatics needs of the lab. Of course, anyone working with my system will be seeing my idiosyncrasies in approaching the computational world, so it would be more responsible to leave the system in a form that is better integrated with the larger computational world. To this end, four more established bioimage informatics solutions have been identified that look to be able to meet the goals that the current system accomplishes, specifically CellProfiler [46], [47] and CellProfiler Analyst [48], [49], Icy [50], BisQue [51], and OME [41], [52].

These four established solutions are commonly used and widely cited, with active user communities and developer support. The next steps for analyzing these solutions would be to take a small test dataset of movies and import, process and analyze them in each of the four systems to determine which would be best for the lab going forward. An exciting potential solution is a combination of OME and KNIME. OME is the Open Microscopy Environment [53], an open source effort to deal with the management of microscopy data. KNIME [54] is the Konstanz Information Miner, an open source data science platform that uses a graphical programming interface and has been utilized for image processing [55], analysis of sequencing data [56], [57], and organization of a workflow for biofilm assays [58], among other things. This combination seems most likely to allow for microscopy data from any sensor and imaging modality to be entered into the archive, accessible remotely and across labs, while simultaneously providing an intuitive graphical interface for image processing and analysis pipelines and data visualization and exploration. Gaining experience and familiarity with an open source data science platform that can enable analysis of sequencing data and organization of biological workflows could also open doors to further integration between the experimental biology and computational worlds [59]. Of course, the other bioimage informatics solutions should be thoroughly investigated before a final conclusion is drawn.


Chapter 3: Application of the first generation lab bioimage informatics solution to processing and analysis of widefield microscopy fluorescent calcium sensor movies

3.1 – Abstract

Modern image-based biological investigations rely on computer programs to process and analyze imaging data, but the results can be hard to trust if it is impossible to reproduce or interrogate the computational methods. This chapter describes the standardization and tracking of the processing and analysis of calcium sensor movies, utilizing the lab bioimage informatics system developed in the previous chapter. Fiji / ImageJ programs standardize the processing of movies. Matlab programs enable automated extraction of metrics and detailed exploration of the data. Products from this chapter include time series, kymograms, and text files of analysis results that can be read into other analysis or presentation programs.

3.2 – Introduction

In the last chapter, I developed a standardized organization scheme, providing a firm foundation of microscopy data to support advanced processing and analyses and strong scientific conclusions. In this chapter, I build on that foundation, standardizing the processing and analysis of widefield calcium sensor movies. The motivations and design goals from the previous chapter (Figure 2.3) were again used as guides for developing the computer programs in this chapter. The processing is broken down into discrete standardized steps, and the processing parameters for each step are recorded and stored in a text file. The analysis is automated where possible, and semi-automated in instances where human input is required, minimizing sources of bias and increasing research efficiency and reproducibility. This chapter was motivated by a vision of an auditable data trail, where data points in figures can be traced back through their analysis and processing to the raw movie. The desire to visualize, manipulate, and explore the large datasets needed for strong biological conclusions also motivated this chapter.

3.3 – Fiji / ImageJ processing of calcium sensor movies

The Fiji / ImageJ programs process archived movies, generate time series and kymograms, and annotate time points. To limit the number of programming languages used, and so increase usability, Fiji / ImageJ was chosen because it is a free, open source program [60] that is commonly used in the lab for image visualization, processing, analysis, and presentation. A single program processes the calcium sensor movies, increasing usability.

3.3.1 – Fiji_RawToGCaMP_A.ijm

The code for this program is available on the lab GitHub. The program features a simple option to generate time series, which allows users to get quick preliminary data from a movie without having to process it. For all other processing options, this program uses the standardized Archive directory structure outlined in the last chapter, operating on files in the Archive Upload folder. The location of the Archive folder can be hardcoded into the local version of the program, or it can be read in automatically using a function written by Doug Pagani, an undergraduate researcher who worked with me. For registration steps this program requires the ImageJ plugin StackReg [61]. This plugin is called directly from the Fiji program, and so the BIG-EPFL update site must be included in the Fiji update sites in order for registration to work.

This program takes as input raw movie files, indicated with _Raw, located in the Archive Upload folder. The program requires user input to select parameters as it guides the user through the processing steps, and it automatically records these processing parameters without further thought from the user. Outputs of the program include: a registered movie, in a standard spatial orientation and cropped to a standard spatial frame; a tab delimited entry to a text file of processing metadata; a tab delimited entry to a text file of time point annotations; a kymogram displaying the calcium sensor variation from the distal neck to the sp-ut valve over time; and two text files containing time series from the calcium sensor movie.

Keeping with the organization and standardization goals, and recognizing there are infinite ways a movie could be processed, this program is assigned an identifier of _GCaMP_A. _GCaMP comes from the most popular calcium sensor in the lab, and _A indicates that this is the first established processing program for the calcium sensor movies. This _GCaMP_A tag is appended to the unique identifier to indicate the movie has been processed, and is part of the file names for output files as well as entries for metadata files (Figure 2.4).

Figure 3.1. Fiji_RawToGCaMP_A.ijm overview. This Fiji / ImageJ program leads users through standardized processing steps, automatically collecting and recording processing parameters. Processing parameters and time point annotations are added to tab delimited text files under Git version control, hosted on the lab GitHub (GitHub logo). Processed _GCaMP_A.tif movie files are automatically named and saved, kymogram .tif image files and time series text files are automatically generated, named and saved.

3.3.2 – Complete and partial processing

Processing of calcium sensor movies is broken down into five discrete steps requiring user input: temporal cropping, registration, rotation, spatial cropping, and annotation of time points (Figure 3.1). To increase the usability of the program, as much flexibility as possible was built into the processing. The start point of processing is a raw movie file with a _Raw append located in the Archive Upload folder. The end point of processing is a processed movie file with a _GCaMP_A append in the Archive Upload folder. Files that are partially processed, meaning not all five of the steps have been completed, are given a _GCaMP_PartialA append and stored in a Processing folder in the Archive Upload folder. Text files track the progress of the processing and collect metadata; entries for partially processed files are in a text file in the Archive Upload Processing folder, whereas entries for fully processed files are in a text file in the Archive Upload folder while they wait to be entered into the metadata text file on the lab GitHub.

Figure 3.2. Fiji_RawToGCaMP_A.ijm processing selection screen. The opening screen of the Fiji_RawToGCaMP_A.ijm program. This screen collects user initials to associate with processing metadata, and provides the user with options for the specific processing steps to be completed in this run of the program. A workaround to select the camera used for the imaging is also presented on this screen.

Multiple movies can be held in the Upload and Processing folders, and they can be held at different processing stages. The processing stage of each movie is tracked by the program, using a text file that can be examined by a human if necessary. When the user runs the program, they see a dialog box asking for their initials and a selection of the processing steps they would like to perform (Figure 3.2). The user can select any single processing step, or they can select a complete processing option which will lead them through all the processing steps in sequence. At this point the program searches the Upload and Processing folders, and the processing text files, to determine which movies need the processing step selected. The user is then presented with a list of checkboxes displaying all the movies that are eligible for processing. These checkboxes increase the flexibility and usability of the system, allowing users to process multiple files in a single processing run, and also allowing multiple people to use the system on a single computer without worrying about movies getting mixed up. Once the user selects the movies they want to process, the program leads them through their desired processing step for all the movies in sequence, and the program exits when the last selected movie is processed for that step. In addition to the complete processing option, which will lead the user through the full processing of a raw movie, there is also a complete partial processing option, which will lead the user through whatever steps are needed to complete the processing of a partially processed movie. The following sections will cover the processing steps in detail.

3.3.3 – Temporal cropping

Raw movies ideally contain some frames of white light DIC (Differential Interference Contrast) at the beginning and end to provide greater context for the movie. Also, raw movies can have long segments without activity preceding or following the embryo transit. These frames do not need to be processed for the analyses presented here, but a record of where the processing started and ended is necessary if the processed movie is to be traceable back to the raw movie. The first processing step in the program is temporal cropping, where the frames to start and end processing are selected by the user. Frame selections in all the processing steps involve prompts that allow the user to scroll through the movie in Fiji to select the desired frame. Once the user is happy with the frame, they press OK, which causes the program to get the number of the selected frame and present it in a confirmation window, which blocks any other interaction with the program until the user presses OK. The purpose of the confirmation window is to provide a chance for the user to review their selection and change it if necessary.

Once the user has selected the processing start and end time points, the movie is cropped according to those time points, and the user is presented with the temporally cropped movie and a window prompting them to scroll through the movie to make sure they are happy with this processing step. Upon scrolling through the movie and pressing OK, a confirmation window appears, indicating that the processing step will be finalized for the movie. If the movie is not properly cropped, the user has an opportunity to say no and repeat the processing step. If the user is happy with the temporal cropping, the initials and the temporal cropping frames used are recorded in the processing metadata text file. This temporally cropped file is then given the _GCaMP_PartialA append and saved in the Processing folder, and the metadata text file is updated. Temporal cropping can be carried out at any point before the annotation of the time points, as the other three processing steps do not depend on the temporal cropping. Temporal cropping is recommended as the first step, and is presented first in the complete option, because it makes the movie file smaller, which speeds up all the other processing steps.
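Recording the temporal cropping parameters as a tab delimited entry can be sketched as follows. This is a Python illustration rather than the ImageJ macro code of the program; the column order, the `temporal_crop` label, the 1-based frame numbering, and the function name are assumptions made for the example.

```python
import csv


def record_temporal_crop(metadata_path, movie_id, initials, start_frame, end_frame):
    """Append one tab delimited line describing a temporal crop.

    The column layout is hypothetical; the actual processing metadata
    file defines its own columns.
    """
    if not 1 <= start_frame <= end_frame:
        raise ValueError("frames must satisfy 1 <= start_frame <= end_frame")
    # Append mode keeps earlier entries, so the file accumulates a record
    # of every processing step applied to every movie.
    with open(metadata_path, "a", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow([movie_id, "temporal_crop", initials, start_frame, end_frame])
```

Because the start and end frames refer to the raw movie, an entry of this kind is what allows a frame in the processed movie to be traced back to its frame in the raw movie.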

3.3.4 – Registration

Calcium movies are acquired in living, intact animals, which can move during imaging. Optimized immobilization protocols can remove most of that movement, but sometimes movies are acquired in nematodes that were not so well immobilized. Additionally, the embryo transits inherently involve motion as the spermathecae move away from the vulva during early ovulations when the uterus fills with embryos. Correcting for these motion artifacts is a major step in the processing, and helps remove some of the variability in the data before it is analyzed.

Registration in this program uses the ImageJ plugin StackReg [61], using a rigid body transformation that allows translation and rotation but not resizing. StackReg anchors the registration of the movie to whatever frame is current when StackReg is run, and moves frame by frame through the movie, aligning the frames as it goes. The StackReg plugin can take some time, upwards of hours per movie, depending on the frame size. This was a prime motivation to set up a batch registration option, where the anchor frames for multiple movies can be defined by the user, and then the user can set the program to automatically register multiple movies overnight or while they are doing other things. The registration step is faster on smaller frames, because there are fewer pixels to compare, so options are provided to rotate and/or spatially crop the movie to make the frame smaller prior to registration. Since the registration performance can vary with the anchor frame selected and any pre-registration rotation or spatial cropping, these parameters are collected as the user selects them. Once all movies selected for registration are processed, each registered movie is presented to the user with a prompt to scroll through the movie to examine the registration result. The user then indicates whether they are happy with the registration. If they are not, the registered movie is deleted and the next movie is presented, or, if all registered movies have been reviewed, the program exits after telling the user to re-run the registration. If the user is happy with the registration, the movie is given the _GCaMP_PartialA append if it does not already have it, it is saved in the Processing folder, and the metadata text file is updated. Registration must occur before the rotation and spatial cropping steps, but can occur before or after the temporal cropping or time point annotation. Registration is presented as the second processing step in the complete option.

3.3.5 – Rotation

Calcium sensor movies are acquired in living, intact animals, immobilized on slides with minimal manipulation. The animals are not oriented when the slides are made, and each animal has two spermathecae that are oriented roughly 180 degrees apart, so raw movies have the spermathecae at a variety of angles. In order to analyze the movies as kymograms a standard orientation is necessary, and a standard orientation has other benefits, so it was decided in the lab that _GCaMP_A processed calcium sensor movies should have the spermatheca horizontal, with the distal neck on the left and the sp-ut valve on the right. Rotation is the most labor intensive processing step for the user. First, the user is prompted to select a frame to determine the rotation angle for the movie. This frame is then presented to the user, with a rotation preview tool, so the user can find the proper angle of rotation for that frame. Once the user is happy with the rotation angle they must remember it, and enter the value into another box that pops up. From here, the entire movie is rotated by the entered angle, and this rotated movie is presented to the user. The user then has to scroll through the movie to verify they are happy with the rotation. If unhappy, they will be led through the rotation process again; if happy, the _GCaMP_PartialA file and metadata are updated. Processing parameters recorded for rotation are the frame used to determine rotation, and the angle of rotation. Rotation must occur after registration and before spatial cropping, and is presented as the third processing step in the complete option.

3.3.6 – Spatial cropping

Calcium sensor movies are quantified to time series in the _GCaMP_A processing by calculating the average pixel intensity over the entire movie frame. Comparison of time series and kymograms across movies will be more rigorous if they are generated from movies with a standard frame size. Previous work in the lab showed that a frame size of 800 x 400 pixels, roughly 100 x 50 μm, can accommodate the spermatheca over the whole embryo transit [23], so this was agreed upon as the standard frame size. With a camera change altering the pixel size of our images, or for images from collaborators with different pixel sizes, the pixel dimensions are changed to achieve a ~100 x 50 μm frame. For this processing step, a ~100 x 50 μm box is drawn on the movie, and a prompt instructs the user to move the box to a region where the spermatheca is roughly centered in the box, and to scroll over the movie to position the box so it always contains the spermatheca. The pixel dimensions of the box are included in the prompt window, so that if the user accidentally clicks off the original box they can manually redraw a new one. Once the user is happy with the placement of the spatial cropping box, they press OK in the prompt window and they are presented with the spatially cropped movie. They are then asked to scroll through and confirm they are happy with the spatial cropping. If unhappy, they will be led through the spatial cropping again; if happy, the _GCaMP_PartialA file and metadata are updated. Processing parameters recorded for the spatial cropping are the four numbers that define a rectangle in ImageJ: the image coordinates x and y of the upper left corner of the box, and the width and height of the box. Spatial cropping must occur after registration and rotation, and is presented as the fourth processing step in the complete option.
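The adjustment of the box dimensions to the pixel size is simple arithmetic, sketched here in Python for illustration. The stated equivalence of 800 x 400 pixels to roughly 100 x 50 μm implies a pixel size of about 0.125 μm; the second pixel size in the usage lines below is a hypothetical value, as is the function name.

```python
def crop_box_pixels(pixel_size_um, width_um=100.0, height_um=50.0):
    """Return the (width, height) in pixels of a ~100 x 50 um crop box."""
    if pixel_size_um <= 0:
        raise ValueError("pixel size must be positive")
    # Round to whole pixels, since a crop box cannot have fractional size.
    return round(width_um / pixel_size_um), round(height_um / pixel_size_um)


standard_box = crop_box_pixels(0.125)  # pixel size implied by the standard frame
coarser_box = crop_box_pixels(0.25)    # hypothetical camera with larger pixels
```

With these inputs the first call gives the standard 800 x 400 pixel frame, and the second gives a 400 x 200 pixel box covering the same physical area.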

3.3.7 – Annotation of time points

There are four important time points in embryo transit movies: the distal valve open, when the oocyte starts to enter the spermatheca; the distal valve close, when the embryo is completely enclosed within the spermatheca; the sp-ut valve open, when the embryo starts to exit the spermatheca; and the sp-ut valve close, when the embryo has completely exited and movement and calcium signaling in the spermatheca have ceased. These time points provide information for downstream processing and analysis, and collecting them at this point, when the user is familiar with the movie and has scrolled through it a few times in the previous processing steps, increases usability. For this processing step, the processed movie is presented to the user, with a prompt to scroll through the movie and select the frame that they see as the distal valve open. The current frame at selection is sent to a confirmation window, and once the user verifies they are happy with the time point, the prompt for the next time point is presented. When all four time points are annotated, a window displaying all four time points is presented, and the user is prompted to scroll through the movie to verify they are happy with all the time points. If unhappy, they are asked to repeat the time point annotations; if happy, the time point annotations are saved to the time points text file. Annotation of time points must occur after the temporal cropping, and can occur before or after the registration, rotation, or spatial cropping. Annotation of time points is presented as the fifth processing step in the complete option.
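The ordering constraints stated for the five user-driven steps can be summarized as a small prerequisite table. The sketch below is written in Python for illustration and is not taken from the ImageJ macro; the step labels and function names are hypothetical, while the prerequisites themselves follow the rules given in the preceding sections.

```python
# Prerequisites for each step, as stated in the text:
# rotation follows registration, spatial cropping follows registration and
# rotation, and annotation follows temporal cropping.
PREREQUISITES = {
    "temporal_crop": set(),
    "registration": set(),
    "rotation": {"registration"},
    "spatial_crop": {"registration", "rotation"},
    "annotation": {"temporal_crop"},
}


def eligible_steps(completed):
    """Return the steps a movie may undergo next, given its completed steps."""
    done = set(completed)
    return [
        step for step, needs in PREREQUISITES.items()
        if step not in done and needs <= done
    ]


def is_fully_processed(completed):
    """True when all five steps are done (the _GCaMP_A end point)."""
    return set(PREREQUISITES) <= set(completed)
```

For a raw movie, `eligible_steps([])` offers temporal cropping and registration; once both are done, rotation and annotation become available, which mirrors the flexibility described above.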

3.3.8 – Generation of time series and kymograms

Time series and kymograms from the processed movies are two major outputs of the processing program. Time series are generated by calculating the average pixel intensity over the entire movie frame for each frame of the movie. This condenses the information in the movie to a one dimensional representation of the calcium sensor activity in the whole tissue over time. Two average pixel intensity time series are calculated for each movie, one that sums the pixel intensity over all pixels and then divides by the number of pixels, and one that collects the line profile of a wide line over the kymogram. The time series are nearly identical, with the line profile from the kymogram smoothing the data slightly. Time series file names are constructed using the unique identifier, _GCaMP_A, and time series tags to indicate they are time series and which method was used (Figure 2.4). The time series are automatically saved as text files, in time series folders within the strain and condition archive folders.
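The first of the two time series calculations, summing over all pixels and dividing by the number of pixels, can be sketched as follows. This is a Python illustration using nested lists so that it is self-contained; the actual calculation is carried out in Fiji / ImageJ, and the function name and the toy movie are hypothetical.

```python
def mean_intensity_time_series(movie):
    """Return the average pixel intensity of each frame.

    movie is a list of frames; each frame is a list of rows of pixel values.
    """
    series = []
    for frame in movie:
        total = sum(sum(row) for row in frame)
        count = sum(len(row) for row in frame)
        series.append(total / count)
    return series


# Toy movie: two frames of 2 x 2 pixels.
toy_movie = [
    [[1, 2], [3, 4]],
    [[10, 10], [10, 10]],
]
toy_series = mean_intensity_time_series(toy_movie)  # [2.5, 10.0]
```

Each frame collapses to a single number, which is the one dimensional representation of whole-tissue calcium sensor activity described above.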

The kymograms used to analyze calcium sensor movies display the variation in calcium signaling from the distal valve to the sp-ut valve, and condense the information in the movie to a two dimensional representation showing the spatial variation in calcium sensor activity over time. Kymograms are generated by averaging over each column of a movie frame to condense it to a single pixel line, and then carrying this out for all the frames in the movie, stacking the single pixel lines to display the spatial coordinate horizontally and the time coordinate vertically. This is implemented in Fiji / ImageJ using the commands Image > Stacks > Reslice followed by Image > Stacks > Z Project (Average Intensity). Kymogram file names are constructed using the unique identifier, _GCaMP_A, and a kymogram tag. There are many kymograms that can be generated from a movie, so the kymogram tag in this case is _KymogramAvgX to indicate that it displays the average over each x column. Kymograms are automatically saved as .tif image files in kymogram folders within the strain and condition archive folders. Generation of time series and kymograms must occur after spatial cropping, does not require any user input beyond running the program and selecting movies to process, and is the final step in the complete option.
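The column averaging that produces a kymogram can be sketched in a few lines of Python. This is an illustration of the idea rather than the Reslice and Z Project implementation used in the lab program; the function name and example values are assumptions.

```python
def average_x_kymogram(movie):
    """Collapse each frame to one row by averaging every pixel column.

    Returns a list of rows: one row per frame (time runs down the image),
    one value per x position (space runs across the image).
    """
    kymogram = []
    for frame in movie:
        n_rows = len(frame)
        # zip(*frame) iterates over the columns of the frame.
        kymogram.append([sum(column) / n_rows for column in zip(*frame)])
    return kymogram


# Synthetic movie: 2 frames, each 2 rows x 3 columns (illustrative values).
movie = [
    [[0, 2, 4], [2, 4, 6]],
    [[1, 1, 1], [3, 3, 3]],
]
kymogram = average_x_kymogram(movie)
```

In this sketch the first kymogram row keeps the left-to-right gradient of the first frame, while the second frame, which varies only from top to bottom, collapses to a flat row.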

3.3.9 – Multichannel movies

In addition to calcium sensors, there are many other fluorescent sensors that can be used in the spermatheca, and the lab widefield microscope is equipped to acquire fluorescence in multiple spectral bands common to fluorescent sensors. The processing pipeline described here was extended to accommodate two channel movies from sensors based on green and red fluorescent proteins. An _RG_Raw suffix is used to distinguish two channel movies from single channel _Raw movies, where RG stands for red green.

_RG_Raw movies are uploaded into the archive as single movies with two color channels. A new processing program was created, modifying the _GCaMP_A to _RG_A in file names and metadata entries. For registration of red green movies, color channels were split, each color channel was separately registered using a common anchor frame, and the two registered channels were then merged to end with a single movie with two color channels. All other processing steps were carried out on the movie with the color channels merged. Processed movies were saved as single movies with two color channels, and kymograms were saved as single images with two color channels. Time series were calculated from separated color channels, and saved with tags to indicate if they came from the red or green channel. In archive folders for red green movies, an extra set of time series folders is present, to keep the red and green time series separated.

3.4 – Matlab processing and analysis of calcium sensor time series

A Matlab program processes and analyzes the time series from calcium sensor movies.

Matlab was chosen because it presented many options for processing and analyzing time series, and allowed rapid exploration of the raw data as well as processing and analysis results.


3.4.1 – Matlab_’AnalysisTitle’.m

The code for this program is available on the lab GitHub. The program takes as input an Excel file holding the names of the movies the user wants to process, and automatically reads in the required time series text files from their archive locations. The program requires the standardized directory structure developed in the previous chapter, and expects to find time series with the naming conventions and in the locations outlined earlier in this chapter. The program also loads in the metadata text files and time points file for more information about the time series. Outputs of this program are a tab delimited text file holding calculated metrics and supporting information for each time series processed, and a text file holding all the processed time series as a matrix (Figure 3.3). The program was built to run on local movie archives on local machines, but if a machine is hardwired into the lab RAID the program recognizes this and gives users the option to access the files directly from the lab RAID. The following sections describe the standardized time series processing, and the analysis metrics and how they are calculated.

3.4.2 – Analysis title

I wrote this program with flexibility in mind, to accommodate different scientific questions that people might ask of the archived movies. I also attempted to make it easy to connect the processing and analysis outputs of this program to other downstream processing and analysis programs. To facilitate organization, each analysis project is given a title that is used across processing and analysis programs and data files. For example, the final data underlying the next chapter has an analysis title of SPV1Paper_Complete. This makes the corresponding Matlab file name of this section Matlab_SPV1Paper_Complete.m.


Figure 3.3. Matlab_’AnalysisTitle’.m overview. This Matlab program is named with a user defined title for the analysis. The program reads in an Excel sheet that defines the time series to process, and automatically finds and loads the necessary data. The program then automatically processes the time series, calculates metrics, and assembles and saves a text file with the analysis output and a text file with a matrix of all the time series analyzed.

The analysis title is used in the name of an Excel file that defines which movies to include in processing and analysis, referred to here as a dataset filter. Ideally, all movies that could be analyzed for the project are included in the Excel file, and there is a column to indicate if each movie should be excluded from analysis, and another column to provide an explanation for why that movie should be excluded. The Matlab program reads this Excel file and only examines movies that are designated to be included in the analysis.
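The logic of a dataset filter can be sketched as follows. The lab program reads an Excel file from Matlab; this Python illustration substitutes tab delimited text, and the column names (Movie, Exclude, Reason) and movie names are hypothetical placeholders, not the actual headers or identifiers used in the lab files.

```python
import csv
import io


def movies_to_analyze(filter_text):
    """Return the movies that are not flagged for exclusion.

    Column names here are hypothetical stand-ins for the real sheet headers.
    """
    reader = csv.DictReader(io.StringIO(filter_text), delimiter="\t")
    return [
        row["Movie"]
        for row in reader
        if row["Exclude"].strip().lower() != "yes"
    ]


# Hypothetical filter: every candidate movie is listed, with exclusions explained.
filter_text = (
    "Movie\tExclude\tReason\n"
    "movie_001\t\t\n"
    "movie_002\tyes\tanimal moved out of frame\n"
    "movie_003\t\t\n"
)
included = movies_to_analyze(filter_text)
```

Keeping excluded movies in the sheet, with a stated reason, documents the exclusion decision alongside the analysis instead of silently dropping data.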

3.4.3 – Time series processing

Each time series to be analyzed is loaded from its archive location into the Matlab program. The time points text file is loaded into the program, providing access to the annotated time points for each time series. The first time point, distal valve open, is used to automatically determine the baseline for each time series by calculating the mean of the 30 time points prior to the distal valve open. If fewer than 30 time points are available, the mean of the available time points is used. The entire time series is then normalized by this baseline value, and smoothed using a moving average filter with a window size of 5. This smoothed normalized time series is used for further analysis and calculation of metrics.
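The baseline normalization and smoothing can be sketched in Python as follows. This is not the Matlab code from the lab program. In particular, the handling of the ends of the series in the moving average (a centered window that shrinks at the edges) is an assumption, since the edge behavior is not specified here.

```python
def normalize_to_baseline(series, distal_open_index, n_baseline=30):
    """Divide the series by the mean of the points before distal valve open."""
    start = max(0, distal_open_index - n_baseline)
    baseline_points = series[start:distal_open_index]
    if not baseline_points:
        raise ValueError("no time points before distal valve open")
    baseline = sum(baseline_points) / len(baseline_points)
    return [value / baseline for value in series], baseline


def moving_average(series, window=5):
    """Centered moving average; the window shrinks at the ends (an assumption)."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


# Synthetic series with only 4 points before distal valve open, so the
# baseline falls back to the mean of the available points.
raw = [2.0, 2.0, 2.0, 2.0, 4.0, 6.0, 8.0, 6.0, 4.0]
normalized, baseline = normalize_to_baseline(raw, distal_open_index=4)
smoothed = moving_average(normalized, window=5)
```

After normalization the baseline sits at 1, which is why the later metrics treat 1 as the reference level.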

3.4.4 – Dwell time

One metric that characterizes spermathecal tissue function is the dwell time [22], defined as the time from the distal valve close to the sp-ut valve open. It measures how long the embryo stays enclosed within the spermatheca, and is influenced by changes in the contractility, signaling, and coordination of spermathecal cells. Dwell time here is calculated from the time point values in the time points text file.

3.4.5 – Rising time

Calcium signaling time series can be characterized by how long they take to rise in amplitude after a stimulus [62], and in embryo transits this amplitude rise is influenced by signaling and coordination of the spermathecal cells. To standardize the amplitude compared across time series, the program calculates the half maximum of the normalized time series. This is done by finding the maximum of the normalized time series, subtracting 1 from the maximum to get the difference to the normalized baseline, halving this difference, and then adding 1 to the halved difference to account for the normalized baseline. The normalized time series is then searched to find the first time point above the half maximum. To generate a metric that is relevant for embryo transits, the distal valve open, i.e. the start of embryo entry, provides the first time point for the rising time. The rising time is calculated by subtracting the distal valve open time point from the first time point above the half maximum, resulting in a measure of how rapidly the calcium sensor time series rises.
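The half maximum and rising time calculations can be sketched in Python. This is an illustration, not the Matlab code from the lab program; the function names are assumptions, and the sketch searches the whole series for the first point above the half maximum, which is also an assumption.

```python
def half_maximum(normalized):
    """Half of the rise above the normalized baseline of 1, added back to 1."""
    return 1.0 + (max(normalized) - 1.0) / 2.0


def rising_time(normalized, times, distal_open_time):
    """Time from distal valve open to the first point above the half maximum."""
    threshold = half_maximum(normalized)
    for t, value in zip(times, normalized):
        if value > threshold:
            return t - distal_open_time
    return None  # flat series: nothing exceeds the half maximum


# Synthetic normalized series sampled once per time unit (illustrative only).
times = [0, 1, 2, 3, 4, 5, 6, 7]
normalized = [1.0, 1.0, 1.2, 1.8, 2.6, 3.0, 2.4, 1.5]
half_max = half_maximum(normalized)
rise = rising_time(normalized, times, distal_open_time=2)
```

Here the maximum is 3.0, so the half maximum is 2.0. The first point above it falls at time 4, giving a rising time of 2 time units after the distal valve opens at time 2.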

3.4.6 – Fraction over half max

Calcium signaling time series can also be characterized by how much time they spend at an elevated level after a stimulus [62], and in embryo transits elevated calcium influences contractility and coordination of the spermathecal cells. Using the previously calculated half maximum, and the dwell time to standardize comparisons across embryo transits, the fraction of the dwell time spent over the half maximum is calculated. Specifically, the number of time points above the half maximum is divided by the total number of time points in the dwell time, resulting in a measure of how much of the dwell time is spent with high calcium sensor signal.
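The dwell time and fraction over half max can be sketched together in Python. This is not the Matlab code from the lab program; the function names are assumptions, as is the choice to include both valve time points when counting the points in the dwell time.

```python
def dwell_time(distal_close_time, sput_open_time):
    """Time from distal valve close to sp-ut valve open."""
    return sput_open_time - distal_close_time


def fraction_over_half_max(normalized, times, distal_close_time, sput_open_time):
    """Fraction of the dwell-time points that lie above the half maximum."""
    # Half maximum of the whole normalized series, relative to a baseline of 1.
    threshold = 1.0 + (max(normalized) - 1.0) / 2.0
    in_dwell = [
        value
        for t, value in zip(times, normalized)
        if distal_close_time <= t <= sput_open_time  # inclusive ends (assumption)
    ]
    if not in_dwell:
        raise ValueError("no time points fall within the dwell time")
    above = sum(1 for value in in_dwell if value > threshold)
    return above / len(in_dwell)


# Synthetic normalized series sampled once per time unit (illustrative only).
times = list(range(10))
normalized = [1.0, 1.0, 1.5, 2.5, 3.0, 2.8, 2.2, 1.6, 1.2, 1.0]
dwell = dwell_time(distal_close_time=2, sput_open_time=7)
fraction = fraction_over_half_max(normalized, times, 2, 7)
```

In this example six points fall within the dwell time and four of them exceed the half maximum of 2.0, so the fraction is two thirds.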

3.4.7 – Multichannel movies

Time series from two channel movies are used to determine the level of a labeled protein in conjunction with the calcium sensor. In this program the non-calcium sensor time series have their baselines calculated in the same way as the calcium sensor time series, and these non-calcium sensor baselines are recorded for further analyses.

3.4.8 – ‘AnalysisTitle’_TimeseriesMetricsTable.txt

This is one of the outputs of the program, providing a tab delimited text file holding the analysis output for all time series analyzed. The file name is constructed using the analysis title described above, to facilitate organization and tracking of analyses. Additional metadata from each movie is included in this file to increase confidence and convenience when plotting and analyzing the analysis outputs in other programs. This file also provides the movie identities for the rows of the time series matrix in the next section.

3.4.9 – ‘AnalysisTitle’_TimeseriesHeatmapMatrix.txt

This is a second output of the program, providing a text file holding all the normalized time series that were analyzed. The file name is constructed using the analysis title to facilitate organization and tracking of analyses. Time series in this file are aligned so that the distal valve open time point is at 50 seconds. This alignment provides a view of the baseline of each time series. If a time series does not have 50 seconds prior to the start of entry, the time series is padded with zeros until it does. The time series in this file are also cropped or extended to a uniform length, to create a matrix that can be manipulated and visualized as a heat map image. This file contains only numbers to enable working with it as an image or a numerical matrix, so the identities of each time series must be referenced from the time series metrics table in the previous section. This is not ideal from a data integrity standpoint, because row sorts could alter the ordering of the metrics table. However, the table and the matrix always come out of the program aligned, so the program can be re-run to restore the correspondence, and the identity of a time series can always be verified by replotting it from the text file in the archive. With the metrics table and time series matrix combined, time series data and analysis outputs from hundreds of movies can be easily moved and explored.
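The alignment and padding that turn individual time series into rows of a uniform matrix can be sketched in Python. This is an illustration, not the Matlab code from the lab program. The sketch works in time points rather than seconds, and extends short series with trailing zeros; both choices are assumptions, as are the function and parameter names.

```python
def align_series(normalized, distal_open_index, pre_points=50, total_points=200):
    """Place distal valve open at index `pre_points` in a fixed-length row."""
    if distal_open_index >= pre_points:
        # Enough baseline: drop the points that fall before the window.
        row = list(normalized[distal_open_index - pre_points:])
    else:
        # Too little baseline: pad the front with zeros.
        row = [0.0] * (pre_points - distal_open_index) + list(normalized)
    row = row[:total_points]                       # crop long series
    row += [0.0] * (total_points - len(row))       # extend short series
    return row


# Small window sizes keep the example readable (illustrative values only).
series_a = [1.0, 1.0, 1.0, 1.0, 2.0, 3.0]                       # opens at index 4
series_b = [1.0, 2.0, 3.0, 2.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]   # opens at index 1
matrix = [
    align_series(series_a, 4, pre_points=3, total_points=8),
    align_series(series_b, 1, pre_points=3, total_points=8),
]
```

Both rows have the same length and carry their distal valve open value at the same column, so the matrix can be displayed directly as a heat map.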

3.5 – Fiji / ImageJ processing and analysis of calcium sensor kymograms

A Fiji / ImageJ program processes and analyzes the kymograms from calcium sensor movies. Fiji / ImageJ was chosen because it is a free, open source program that is heavily used in the lab for image exploration, processing, analysis, and presentation.

3.5.1 – Fiji_GenerateKymogramMetrics_A_’AnalysisTitle’.ijm

The code for this program is hosted on the lab GitHub. The program takes as input a text file holding the names of the movies the user wants to process, and automatically loads the required kymograms from their archive locations. The program requires the standardized directory structure developed in the previous chapter, and expects to find kymograms with the naming conventions and in the locations outlined earlier in this chapter. The program also loads in the metadata text files, and the time series metrics table from the previous section, to gather information about the kymograms. Keeping with the organization and standardization goals, this program uses an _A to indicate that it is one of many possible ways to generate metrics from the kymograms, and features the analysis title described above. An output of this program is a tab delimited text file holding calculated metrics and supporting information for each kymogram processed. Output of this program also includes a new folder created in the Archive directory structure to hold ImageJ ROI manager files of user annotations. The program was built to run on local movie archives on local machines, but if a machine is hardwired into the lab RAID the program recognizes this and gives users the option to access the files directly from the lab RAID. The following sections describe the standardized kymogram analysis, and the analysis metrics and how they are calculated.

3.5.2 – Kymogram annotations

This program has a step that is time consuming, so it was written to enable batching of multiple kymograms by collecting user annotations and then running the time consuming step while the user is doing something else. This program loads the kymograms and presents them to the user with prompts for three annotations. The time points for the movie are loaded from the time points file, and displayed on the kymogram to provide the user with some context. The first annotation is a line drawn on the kymogram to mark the sp-ut quiet period, described below. The second and third annotations mark the start and end points for the bag intensity sweep, also described below. User annotations are presented similarly to the movie processing steps, with prompts enabling interaction with the image followed by confirmation windows that will not let the user do anything else until the annotation is confirmed. These annotations are collected in the ImageJ ROI manager, along with the time points, and saved as files in the Archive directory structure. These ROI manager files ensure data provenance and confidence in the analysis by allowing the kymogram analysis to be checked manually or re-run with the same annotations.

3.5.3 – sp-ut quiet period

One metric that characterizes the calcium sensor kymograms is the sp-ut quiet period. Upon oocyte entry, in wildtype spermathecae, there is a brief pulse of calcium sensor activity in the sp-ut valve that then dies down and stays low until later in the embryo transit. This metric captures that quiet period by measuring the difference in the time components (the y values) of the user annotation, and sending this value to the output text file.

3.5.4 – Bag intensity

Another metric that characterizes the calcium sensor kymograms is the bag intensity. During the dwell time in wildtype animals, calcium sweeps across the central region of the spermatheca, and builds over time to expel the embryo. To get a measure of this activity, the central region of the spermatheca is analyzed in kymograms by calculating the average pixel intensity of a 25 μm wide analysis box over the dwell time.

To provide a standardized measure for comparison across kymograms, the kymogram is first normalized to the time series baseline by recalling that information from the time series metrics table loaded at the start of the program. To minimize bias in the placement of the analysis box, it is swept from the user-annotated left bound to the user-annotated right bound, in one pixel steps, and the average normalized pixel intensity in the box is calculated at each step. When the sweep is complete, the box with the lowest average normalized pixel intensity is determined, the intensity value is recorded in the output text file, and the coordinates of this box are included in the ROI manager annotations for this kymogram. This sweep is the time consuming step of the kymogram analysis, and can take tens of minutes per kymogram, so the program has a batch option for these sweeps.
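The box sweep can be sketched in Python as follows. This is an illustration, not the Fiji / ImageJ macro from the lab program. The box width is given in pixels because the conversion from 25 μm depends on the camera pixel size, the box is assumed to stay entirely within the annotated bounds, and the function and parameter names are assumptions.

```python
def sweep_analysis_box(kymogram, left, right, box_width, baseline):
    """Slide a box across [left, right) in one-pixel steps.

    `kymogram` holds only the rows (time points) within the dwell time.
    Returns (lowest mean normalized intensity, left edge of that box).
    """
    n_rows = len(kymogram)
    n_cols = len(kymogram[0])
    # Sum each column once so that each box position is a cheap slice sum.
    column_sums = [sum(row[x] for row in kymogram) for x in range(n_cols)]
    best = None
    for x0 in range(left, right - box_width + 1):
        box_total = sum(column_sums[x0:x0 + box_width])
        # Normalize to the time series baseline while averaging over the box.
        box_mean = box_total / (n_rows * box_width * baseline)
        if best is None or box_mean < best[0]:
            best = (box_mean, x0)
    if best is None:
        raise ValueError("annotated bounds are narrower than the box")
    return best


# Synthetic dwell-time kymogram: 2 time points x 6 positions (illustrative).
kymogram = [
    [2, 4, 6, 8, 6, 4],
    [2, 4, 6, 8, 6, 4],
]
lowest, box_left = sweep_analysis_box(kymogram, left=0, right=6, box_width=2, baseline=2.0)
```

Precomputing the column sums means each box position costs only a sum over the box width, which is one way such a sweep could be made less time consuming than recomputing every pixel at every step.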

3.5.5 – Multichannel movies

The non-calcium sensor channel of these kymograms was not analyzed, so two channel movies were dealt with solely by splitting the color channels to gain access to the calcium sensor channel. Output of the kymogram analysis is marked with _RG for two channel movies, as described above.

3.6 – Matlab consolidation and exploration of calcium sensor data

With datasets that can reach into the hundreds of movies, and five analysis metrics for each movie, it is helpful to have these data consolidated in a way that enables movement of the data and facilitates exploration. A pair of Matlab applications enable detailed, interactive exploration of the analysis metrics and time series. Matlab was chosen for this because the presentation of time series with metrics was worked out in Matlab when the time series processing program was developed, and because Matlab has a helpful application building interface for amateurs.

3.6.1 – ‘AnalysisTitle’_CalciumAnalysisApp.m & ‘AnalysisTitle’_ExploreTimeseriesApp.m

The code for these two programs is available on the lab GitHub. Following organization and standardization goals, both programs have included in the file name the analysis title described above. These programs require the standardized Archive directory structure developed in the previous chapter, and the matching analysis titled time series metrics table, heat map matrix, and kymogram metrics text files described in the previous sections. Outputs of these programs include two interactive windows where the full dataset can be explored, and a text file that combines the time series and kymogram metrics into a single file.

The calcium analysis app program displays all five analysis metrics simultaneously, in plots that are stacked vertically, making the data of interest consumable at a glance. The data presented can be selected from any strain, RNAi, or heat shock conditions that are present in the dataset, with these selection choices populated automatically from the metadata sheets. As few or as many conditions as the user desires can be visualized, from a single strain to the entire dataset. The mean and standard deviation of each analysis metric for each condition is displayed, along with the individual data points. To enable more detailed exploration, any subset of the data presented can be highlighted, and any information that is present in the metadata can be used for this highlighting. For example, all the movies acquired by a particular researcher or over a particular time window, or all the movies that used a particular immobilization method, etc., can be highlighted to explore trends or biases in the data (Figure 3.4). This highlighting can be taken down to the single movie level with the unique identifier, enabling outlier movies to be identified. This single movie level presents the unique identifier preceded by the strain and RNAi or heat shock conditions, enabling rapid location in the Archive directory of the movie, time series, or kymogram of interest if the user wants to explore any part of the underlying data.

Figure 3.4. Screenshot of ‘AnalysisTitle’_CalciumAnalysisApp.m. This Matlab program presents all five analysis metrics for any set of conditions chosen by the user. In this example two conditions are selected from the box in the upper right, and heat shock conditions can be seen among the available conditions. The plotted data points can be highlighted by any metadata information in the archive. In this example the data from all the movies acquired by CGC are selected in the middle right, and are highlighted in orange on the metrics plots.

The explore time series app is opened using a button in the main calcium analysis app, and only the subset of the data the user was exploring in the main app, meaning the strains, RNAi, and heat shock conditions selected, is available. This window is arranged in two columns, allowing two time series to be directly compared. The user selects two time series from pull down menus at the top, selecting the time series as the unique identifiers preceded by the condition. With two time series selected, the normalized time series are plotted with the dwell time, rising time, and fraction over half max for that time series annotated on the plot. Below these two time series are two heat maps, showing all the time series that share the same strain, RNAi, and heat shock condition as the selected time series. In this way, time series can rapidly be explored to see how representative they are of the time series for that condition, and all the time series for two conditions can be compared at a glance (Figure 3.5). This window can be located alongside the main calcium analysis app, and the main calcium app retains interactivity when the explore time series app is open, allowing seamless exploration of the statistical plots and time series with their metrics annotated (Figure 3.6).

Figure 3.5. Screenshot of ‘AnalysisTitle’_ExploreTimeseriesApp.m. This Matlab program presents two selected time series with analysis metrics annotated, and all other time series for the selected condition below as a heat map.

3.6.2 – ‘AnalysisTitle’_combinedKymogramAndTimeseriesMetrics.txt

An output of the calcium analysis app is a text file combining the time series and kymogram metrics. This file is saved as a tab delimited text file, with a file name featuring the analysis title as described above. This file provides a single source that can be easily loaded into other analysis programs, including Excel and GraphPad Prism, for plotting and further exploration and analysis.

Figure 3.6. Analysis interface. The CalciumAnalysisApp and ExploreTimeseriesApp programs can run simultaneously, enabling detailed, interactive data exploration.


3.7 – Conclusions and future directions

In this chapter, the bioimage informatics solution developed in the previous chapter was applied to organize and standardize the processing and analysis of calcium sensor movies. The Fiji / ImageJ movie processing program went live with the lab bioimage informatics system, and has been used by multiple people to process nearly one thousand movies over the last two and a half years. The Matlab time series processing and analysis, Fiji / ImageJ kymogram analysis, and Matlab calcium analysis and explore time series programs are later additions, and are not as widely used as the core organization functionality.

Many improvements could be made to the programs in this chapter. For the Fiji / ImageJ movie processing program, a camera change on the lab microscope changed the pixel size of the images and necessitated a workaround that could be better incorporated into the program. With movies acquired on a single microscope, the pixel size and dimensions of the spatial cropping box were hardcoded into the program. To deal with the different pixel sizes, the user is presented with a camera selection option that updates the hardcoded values. This could be handled in a better way, and would need to be if this program is used for data from multiple microscopes and / or other labs. A possible solution would be to enter raw movies into the archive with pixel size metadata. The programs would have to be modified to make the pixel size read from the metadata sheets, because the system was built to enable processing before or after the movie is loaded into the archive, and if processing is done before archiving then the metadata is not available in a form that Fiji / ImageJ can access. This issue might be sidestepped by porting this system to one of the more established solutions outlined at the end of the previous chapter. The new camera also generates larger files, causing Fiji to run out of memory when processing long movies, especially during the registration step. One possible way to deal with this is to split movies above a given size in two, duplicating a frame at the split to ensure a common frame between the split movies, and then to run the registration using this common frame as the anchor. There are potential drawbacks to this approach, and again the whole issue may be sidestepped by porting the system to a more established solution.

As for the rest of the programs in this chapter, they all have rough edges that decrease their usability, and this is an area that can be improved. The Fiji / ImageJ kymogram analysis program in particular was only developed to the basic functionality level, and would benefit from some polishing to increase usability and to better interface with the other programs. All of the analysis title programs require the user to go into the code and hardcode the analysis title file names, which can be error prone and frustrating even with a guide, decreasing usability. Were I to continue this, my next step would be to develop a more user friendly framework to organize the analyses and streamline the information flows between programs. All the programs in this chapter use the same functions to search over the metadata files, and so they all suffer from the same decreasing performance with increasing archive size mentioned in the previous chapter. None of the programs are prohibitively slow from this defect at this point, but projecting forward, an archive size will be reached that makes the programs too slow to be used effectively. Thankfully this problem only needs to be solved once for all the programs.

The most pressing future direction for this chapter would be to examine the more established bioimage informatics solutions outlined at the end of the last chapter. It may be possible that porting this system to one of these more established solutions obviates the need to fix these issues, and having a larger user community and dedicated programmers and developers could likely head off any future issues.


Chapter 4: The RhoGAP SPV-1 regulates calcium signaling to control the contractility of the Caenorhabditis elegans spermatheca during embryo transits

4.1 – Introduction

This chapter presents my first first-author publication, an experimental work that was published in a Molecular Biology of the Cell special issue on Quantitative Cell Biology [19]. This work used the first generation lab bioimage informatics system described in the previous two chapters to organize almost 500 fluorescence microscopy movies, >1 TB of raw image data, acquired by 4 different people over ~4 years. Using this system, our understanding of the molecular biology that regulates contractile tubes was advanced.

4.2 – Abstract

Contractility of the non-muscle and smooth muscle cells that comprise biological tubing is regulated by the Rho-ROCK and calcium signaling pathways. Although many molecular details about these signaling pathways are known, less is known about how they are coordinated spatiotemporally in biological tubes. The spermatheca of the C. elegans reproductive system enables study of the signaling pathways regulating actomyosin contractility in live adult animals. The RhoGAP SPV-1 was previously identified as a negative regulator of RHO-1/Rho and spermathecal contractility. Here, we uncover a role for SPV-1 as a key regulator of calcium signaling. spv-1 mutants expressing the calcium indicator GCaMP in the spermatheca exhibit premature calcium release, elevated calcium levels, and disrupted spatial regulation of calcium signaling during spermathecal contraction. Although RHO-1 is required for spermathecal contractility, RHO-1 does not play a significant role in regulating calcium. In contrast, activation of CDC-42 recapitulates many aspects of spv-1 mutant calcium signaling. Depletion of cdc-42 by RNAi does not suppress the premature or elevated calcium signal seen in spv-1 mutants, suggesting other targets remain to be identified. Our results suggest SPV-1 works through both the Rho-ROCK and calcium signaling pathways to coordinate cellular contractility.

4.3 – Introduction

Animals are full of biological tubing, including blood and lymphatic vessels, lung airways, salivary glands, digestive canals, and urinary and reproductive tracts. Actomyosin contractility plays a central role in the functioning of these biological tubes, which must dilate and contract with the proper timing and magnitude to generate appropriate responses to changing biological states (Reviewed in Sethi et al., 2017). Consequences of misregulated biological tubes can be seen in conditions such as heart disease, hypertension, and asthma [63]–[66]. Previous research has focused on how biochemical factors such as acetylcholine, serotonin, and nitric oxide regulate tissue function in biological tubes, and elucidated many of the downstream signaling mechanisms that regulate cellular and actomyosin contractility. However, mechanical factors such as flow and pressure are also known to regulate tissue function in biological tubes [67]–[69], and the mechanisms by which these mechanical signals regulate cellular contractility are not as well understood. In vivo studies suggest that mechanical cues play a critical role in regulating and coordinating actomyosin contractility [70].

The C. elegans spermatheca (Figure 4.1A) is a powerful in vivo model system for the study of how cells regulate actomyosin contractility and produce coordinated tissue-level responses to mechanical input. The spermatheca is part of the hermaphrodite gonad, a simple, tubular organ consisting of the oviduct (containing germ cells and oocytes and enclosed by contractile sheath cells), the spermatheca, and the uterus [17]. The spermatheca, consisting of a single layer of 24 myoepithelial cells (Figure 4.1B), is the site of sperm storage and fertilization. Sheath cell contractions propel the oocyte into the spermatheca, dramatically stretching the cells of the distal neck and spermathecal bag and initiating a process that culminates in spermathecal cell contraction, spermatheca uterine (sp-ut) valve dilation, and expulsion of the fertilized egg into the uterus. This ovulation cycle, including oocyte entry, transit through the spermatheca, and expulsion into the uterus, repeats roughly every 20 minutes until ~150 eggs have been produced by each gonad arm. Because the nematode is transparent, and the cells of the spermatheca are clearly visible, the entire ovulatory process can be visualized in live animals using time-lapse microscopy.

Contraction of the spermatheca is driven by the well conserved Rho-ROCK and calcium signaling pathways, similar to those observed in non-muscle and smooth muscle cells [8], [9], [71]. These two pathways act in concert to regulate the levels of phosphorylated myosin [30]. The calcium signaling pathway requires activated phospholipase C (PLC-1) [23], [24], which cleaves the membrane lipid phosphatidylinositol bisphosphate (PIP2) to generate inositol 1,4,5-trisphosphate (IP3) and diacylglycerol (DAG). IP3 stimulates the release of calcium from the endoplasmic reticulum (ER) by binding to the IP3 receptor (ITR-1) [72]–[74], and this elevated cytosolic calcium activates myosin light chain kinase (MLCK-1), which phosphorylates the regulatory light chain of myosin (MLC-4) [21]. The Rho-ROCK pathway acts via RhoA (RHO-1), with active, GTP-bound Rho activating Rho kinase/ROCK (LET-502) [75], which phosphorylates and activates myosin and also inhibits myosin phosphatase (MEL-11) [76], resulting in increased phosphorylation and activation of non-muscle myosin II (NMY-1) [77], and contraction of actomyosin fibers. Activation and coordination of both the Rho-ROCK and calcium signaling pathways is required for successful transits of embryos through the spermatheca [22], [23]. Despite our detailed molecular understanding of these two central pathways regulating contractility, little is known about how these two pathways are integrated to regulate actomyosin contractility at the cellular level.

We have developed the C. elegans spermatheca as a model for understanding how cells within a tissue coordinately regulate actomyosin contractility [21]–[23], [25]. Using the genetically encoded calcium sensor, GCaMP, we have shown that the stretch of incoming oocytes triggers a flash of calcium in the sp-ut valve, followed by dynamic oscillations across the central bag which culminate in contraction and expulsion of the fertilized embryo through the sp-ut valve and into the uterus [23]. Phospholipid signaling, communication through gap junctions, and the mechanosensor FLN-1/filamin [20], [23] are all required for proper spatiotemporal regulation of calcium signaling.

Spatiotemporal control of Rho-ROCK signaling is also required for coordinated regulation of contractility in the spermatheca. We have shown that the RhoGAP SPV-1 controls the spatiotemporal activation of RHO-1, and regulates tissue contractility through LET-502/ROCK [22]. Disruption of these spatiotemporal regulators often results in damage to embryos, reflux of embryos back into the oviduct, or embryos that are trapped in the spermatheca and unable to exit.

In this study, we demonstrate that the RhoGAP SPV-1 is necessary for proper calcium signaling in the C. elegans spermatheca. We further find that SPV-1 is a major regulator of spatiotemporal aspects of spermathecal calcium signaling, controlling the timing and magnitude of calcium signaling in the spermathecal bag and sp-ut cells.

Expression of constitutively active alleles of RHO-1 and CDC-42 recapitulates the contractility and calcium release phenotypes, respectively, of loss of spv-1. This places SPV-1 as a key regulator of both major signaling pathways that modulate actomyosin contractility.

4.4 – Results

4.4.1 – SPV-1 regulates calcium signaling in the spermatheca during embryo transits

Figure 4.1. Loss of SPV-1 alters calcium signaling in the spermatheca. (A) An adult nematode with labeled spermathecae, showing one with an embryo inside (left), and one without an embryo inside (right). The scale bar in the large image is 100 μm, scale bars in the insets are 10 μm. (B) Tissue-level schematic cartoon of the spermatheca. The spermatheca consists of 3 distinct regions: the cells closest to the sheath form a neck, or distal valve, that constricts to enclose the newly entered embryo, the central cells form a bag that accommodates the embryo during fertilization and egg shell deposition, and the spermatheca and uterus are connected by the sp-ut valve. (C) Still frames from GCaMP movies of embryo transits in wild-type (top) and spv-1(ok1498) (bottom) animals. Movies were temporally aligned to the start of oocyte entry at 50 seconds. (D) GCaMP time series generated from GCaMP movies (Movie 1), with metrics highlighted. Dwell time is a tissue function metric calculated as the time from the closing of the distal valve to the opening of the sp-ut valve. Rising time is a calcium signaling metric measuring the time from the opening of the distal valve to the first time point where the time series reaches half its maximum, and fraction over half max is a calcium signaling metric measuring how much of the dwell time is spent over the half maximum. Lower panels show heat maps of 5 time series for each condition. The top line of the heat map is the time series in the upper panel. (E) Quantification of metrics from time series. Error bars display standard deviation, p-values were calculated using Welch's t-test: **p < 0.01, ****p < 0.0001.

To determine if SPV-1 plays a role in spermathecal calcium signaling we used a mutant allele of spv-1, ok1498, which causes a 577-bp frameshift deletion in the RhoGAP domain, resulting in loss of function of the SPV-1 protein [22]. We imaged embryo transits in wildtype and spv-1(ok1498) nematodes expressing the genetically encoded calcium sensor GCaMP3 in the spermatheca (Figure 4.1C; Movie 1). To characterize the spv-1 mutant calcium signaling phenotype, we generated GCaMP time series from the embryo transit movies by calculating the average pixel intensity of each frame (Figure 4.1, C and D; Movie 1). Plotting time series from many animals as heat maps reveals that spv-1(ok1498) embryo transits consistently show a rapid onset of calcium activity at the start of the transit and elevated calcium levels throughout (Figure 4.1D).
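The time series and heat map construction described in this section can be sketched in a few lines of Python. This is a minimal illustration rather than the lab's actual program: it assumes the movie is already loaded as a NumPy array of shape (frames, rows, columns), and the function names `gcamp_time_series` and `heat_map` are hypothetical.

```python
import numpy as np

def gcamp_time_series(movie):
    """Return the average pixel intensity of each frame.

    movie: array of shape (frames, rows, cols); this layout is an assumption.
    """
    movie = np.asarray(movie, dtype=float)
    return movie.mean(axis=(1, 2))

def heat_map(series_list):
    """Stack time series as rows of a 2D array for display as a heat map.

    Shorter series are padded with NaN so that all rows have equal length.
    """
    n_points = max(len(s) for s in series_list)
    stacked = np.full((len(series_list), n_points), np.nan)
    for row, series in enumerate(series_list):
        stacked[row, :len(series)] = series
    return stacked

# Synthetic example: two small random "movies" of different lengths.
rng = np.random.default_rng(0)
movies = [rng.random((30, 8, 8)), rng.random((25, 8, 8))]
rows = heat_map([gcamp_time_series(m) for m in movies])
```

Each row of `rows` corresponds to one embryo transit, so the array can be passed directly to an image display routine to produce a heat map like those in Figure 4.1D.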

To quantify these differences and enable statistical analysis over many embryo transits, we identified one tissue function metric from the movies, dwell time, and two calcium signaling metrics extracted from the GCaMP time series, rising time and fraction over half max (Figure 4.1D; Movie 1). The tissue function metric, dwell time, is defined as the time from distal neck closure to sp-ut valve opening. Dwell time measures the amount of time the embryo remains enclosed by the spermatheca. In spv-1 mutants, faster transit of the embryo through the spermatheca results in a reduced dwell time (Figure 4.1E). This result is in agreement with a previous observation that embryo transits are faster when SPV-1 is lost [22]. The first calcium signaling metric, rising time, is defined as the time from the start of oocyte entry to the first time point where the GCaMP time series crosses the half maximum. Rising time captures the rate of increase in calcium signal at the start of embryo transit. In spv-1 mutants, a rapid rise in calcium is observed immediately upon entry (Figure 4.1, D and E). The second calcium signaling metric, fraction over half max, is defined as the portion of the dwell time during which the GCaMP time series is over its half maximum value, divided by the total dwell time. This metric captures the level of calcium throughout embryo transits. In spv-1 mutants, fraction over half max is higher than in wild type animals (Figure 4.1, D and E). Taken together, these metrics suggest SPV-1 regulates calcium activity in the spermatheca by dampening calcium signaling, keeping cytosolic calcium at low levels until late in embryo transits.
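The three metrics defined above can be expressed compactly in code. The sketch below is illustrative only and rests on several assumptions that are not stated in the text: the half maximum is taken as half of the largest value in the time series (which presumes a baseline-subtracted series), frames are evenly spaced so that a fraction of frames equals a fraction of time, and the event times (oocyte entry, distal neck closure, sp-ut valve opening) are supplied by the user. The function name `transit_metrics` is hypothetical.

```python
import numpy as np

def transit_metrics(series, times, entry_time, neck_close_time, valve_open_time):
    """Return (dwell_time, rising_time, fraction_over_half_max).

    Assumes evenly sampled data and a half maximum equal to max(series) / 2.
    """
    series = np.asarray(series, dtype=float)
    times = np.asarray(times, dtype=float)
    half_max = series.max() / 2.0

    # Dwell time: distal neck closure to sp-ut valve opening.
    dwell_time = valve_open_time - neck_close_time

    # Rising time: oocyte entry to the first time point at or over half max.
    crossed = (times >= entry_time) & (series >= half_max)
    rising_time = times[crossed][0] - entry_time if crossed.any() else float("nan")

    # Fraction over half max: share of dwell-time frames at or over half max.
    in_dwell = (times >= neck_close_time) & (times < valve_open_time)
    fraction = float(np.mean(series[in_dwell] >= half_max))

    return dwell_time, rising_time, fraction

# Synthetic example with one frame per second.
example = transit_metrics(
    series=[0, 0, 1, 2, 4, 4, 1, 4, 0, 0],
    times=np.arange(10),
    entry_time=1, neck_close_time=2, valve_open_time=8,
)
```

A different half-maximum convention (for example, halfway between baseline and peak) would change only the `half_max` line.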

4.4.2 – SPV-1 overexpression results in low calcium signaling and embryo trapping

To further understand how SPV-1 regulates calcium signaling, we expressed an SPV-1::mApple translational fusion in the spv-1 mutant background. Animals expressing both the mApple fluorophore and GCaMP were imaged during embryo transits (Figure 4.2, A and B). In animals expressing high levels of SPV-1::mApple, embryos failed to exit the spermatheca (referred to here as embryo trapping) and spermathecae exhibited low levels of calcium activity (Figure 4.2, C and D), suggesting that overexpression of SPV-1 results in disrupted tissue function and calcium signaling phenotypes. We next crossed spv-1::mApple into the wildtype background, and observed embryo trapping and low calcium signaling in 100% of movies recorded (n=5, data not shown). In control experiments, overexpression of mApple alone did not result in embryo trapping or low calcium levels (Figure 4.2, E and F; Supplemental Figure 4.1, A-E). These data suggest that high levels of SPV-1 inhibit calcium release and spermathecal tissue contractility.

4.4.3 – Spermathecal tissue function and calcium signaling exhibit a threshold response to SPV-1::mApple levels

Because high levels of SPV-1::mApple resulted in depressed calcium signaling and embryo trapping, we hypothesized that lowering the levels of SPV-1::mApple would result in a more wildtype phenotype in the spv-1 mutant background. To test this in a way that allowed a full sampling of SPV-1::mApple fluorescence intensities, we used spv-1 RNAi diluted with empty RNAi to four different strengths. These treatments generated spv-1(ok1498); spv-1::mApple animals with a range of SPV-1::mApple levels, measured by mApple fluorescence intensity. Examining the data across all spv-1(ok1498); spv-1::mApple animals, with and without RNAi treatment, shows that spermathecal calcium signaling and embryo transit timing exhibit a threshold response to SPV-1::mApple levels, in which low levels of SPV-1::mApple result in short dwell times and spv-1 mutant-like calcium signaling with a rapid onset of elevated calcium activity, whereas high levels of SPV-1::mApple induce the overexpression phenotypes with embryo trapping and low calcium activity (Figure 4.2, D and E). Plotting dwell times as a function of SPV-1::mApple intensity clearly displays this threshold behavior (Figure 4.2F), which can be modeled using a Hill function [78] to determine where, and how rapidly, a system switches from one state to another [79]. To explore this, we fit a Hill equation to these dwell time data points (Figure 4.2F), and extracted a threshold value of 12.2 for SPV-1::mApple intensity, with a 95% confidence interval from 10.1 to 14.8.

Figure 4.2. Spermathecal tissue function and calcium signaling exhibit a threshold response to SPV-1::mApple. (A) Still images from a dual-labeled spermatheca expressing GCaMP and SPV-1::mApple. Scale bars are 10 μm, brightness is enhanced for presentation. (B) GCaMP time series of normalized average pixel intensity, F/F0 (top), and SPV-1::mApple time series of raw average pixel intensity (bottom), from a single embryo transit movie over the same spatial frame and time. (C) A representative GCaMP time series from an SPV-1::mApple expressing spermatheca containing a trapped embryo, corresponding to the top row of the heat map in panel D. (D) A heat map showing GCaMP time series from 53 embryo transit movies of varying SPV-1::mApple intensity, ranked with the highest SPV-1::mApple intensity at the top and decreasing with each row. Rows labeled with 'T' indicate trapped. (E) Dwell times plotted as a function of condition. (F) Dwell times plotted as a function of SPV-1::mApple intensity. The threshold value from the fitted Hill function is 12.2, with the 95% confidence interval from 10.1 to 14.8. (G) Rising times plotted as a function of condition. (H) Fractions over half max plotted as a function of condition. In panels E, G, and H error bars display standard deviation, p-values were calculated using Welch's ANOVA with Games-Howell multiple comparison: ns, p ≥ 0.05, *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001.

This window of SPV-1::mApple intensity values coincides with the switch from mutant to overexpression behavior in both tissue function (Figure 4.2, E and F; Supplemental Figure 4.2A) and calcium signaling (Figure 4.2, D, G and H; Supplemental Figure 4.2, B and C). The wildtype phenotype occurs within this window, as well as the spv-1 mutant and overexpression phenotypes.
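To make the threshold analysis in this section concrete, the sketch below fits a four-parameter Hill function to synthetic dwell-time-like data. Several things here are assumptions rather than details from the text: the parameterization (baseline, amplitude, threshold, exponent), the function names `hill` and `fit_hill`, and the fitting strategy, which is a simple grid search over threshold and exponent with the two linear parameters solved by least squares. A standard nonlinear least-squares routine would be the more usual choice, and estimating the confidence interval is not covered. The data are invented for illustration and are not the measurements reported above.

```python
import numpy as np

def hill(x, base, amplitude, threshold, n):
    """Hill function: base + amplitude * x^n / (threshold^n + x^n)."""
    x = np.asarray(x, dtype=float)
    return base + amplitude * x**n / (threshold**n + x**n)

def fit_hill(x, y, thresholds, exponents):
    """Fit by grid search over (threshold, n).

    For each candidate pair, base and amplitude enter linearly, so they are
    obtained by ordinary least squares. The pair with the smallest sum of
    squared errors is returned as (base, amplitude, threshold, n).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best = None
    for k in thresholds:
        for n in exponents:
            saturating = x**n / (k**n + x**n)
            design = np.column_stack([np.ones_like(saturating), saturating])
            coef, *_ = np.linalg.lstsq(design, y, rcond=None)
            sse = float(np.sum((design @ coef - y) ** 2))
            if best is None or sse < best[0]:
                best = (sse, coef[0], coef[1], k, n)
    _, base, amplitude, threshold, n = best
    return base, amplitude, threshold, n

# Synthetic, noise-free example (values are made up for illustration).
intensity = np.linspace(1, 40, 40)
dwell = hill(intensity, base=100, amplitude=400, threshold=12, n=6)
fit = fit_hill(intensity, dwell,
               thresholds=np.arange(5, 20.5, 0.5),
               exponents=np.arange(1, 11))
```

In this parameterization the threshold is the intensity at which the response is halfway between its low and high states, and the exponent controls how abruptly the switch occurs.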

4.4.4 – SPV-1 regulates spatiotemporal aspects of calcium signaling

In wildtype embryo transits, calcium signaling starts with a single pulse in the sp-ut valve, followed by a quiet period after the oocyte enters, then increasing pulses are initiated in the distal neck and travel across the spermathecal bag, culminating in intense calcium pulses that activate contractions that expel the embryo into the uterus [23]. In contrast, in spv-1(ok1498) embryo transits calcium rises immediately upon oocyte entry, lacks spatial organization, and is elevated compared to wildtype signal, sporadically crashing to a low level and rising again multiple times before the embryo is expelled (Figure 4.1, B and C; Figure 4.3A; Movie 1).

To visualize and analyze these spatiotemporal changes we generated kymograms in which each frame of the movie is condensed into a horizontal line, along the distal to proximal axis of the spermatheca, by averaging over the columns of that frame. Lines representing the individual frames are then stacked vertically, generating kymograms that display the spatial variation of the GCaMP signal from the distal neck to the sp-ut valve over the entire movie (Figure 4.3, A and B; Movie 2). Wildtype kymograms indicate calcium activity is restricted to the distal (neck) and proximal (sp-ut) ends of the spermatheca during the dwell time. In contrast, spv-1(ok1498) kymograms depict increased activity in the middle of the spermathecal bag as well as in the sp-ut valve (Figure 4.3B).
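The kymogram construction described above reduces to a single averaging step. The sketch below is a minimal illustration that assumes the movie is a NumPy array of shape (frames, rows, columns) with the distal to proximal axis running along the columns, and it interprets "averaging over the columns" as taking the mean of each pixel column. The function name `kymogram` is hypothetical.

```python
import numpy as np

def kymogram(movie):
    """Condense each frame into one horizontal line and stack the lines.

    movie: array of shape (frames, rows, cols), distal neck at column 0
    (an assumed orientation). Returns an array of shape (frames, cols) in
    which row t is the column-wise mean of frame t, so time runs downward.
    """
    movie = np.asarray(movie, dtype=float)
    return movie.mean(axis=1)

# Synthetic example: 100 frames of 16 x 64 pixels.
rng = np.random.default_rng(1)
kymo = kymogram(rng.random((100, 16, 64)))
```

The result has one row per frame and one column per position along the spermatheca, matching the layout of the kymograms in Figure 4.3B.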

To quantify these differences, we identified two prominent spatiotemporal calcium signaling metrics that are consistently altered in spv-1 mutant kymograms (Figure 4.3, B and C). The first is the sp-ut quiet period. In wildtype kymograms, the sp-ut valve displays a single calcium transient upon oocyte entry, followed by an extended period of low calcium. In spv-1(ok1498) kymograms, the sp-ut valve displays an immediate rise in calcium followed by sustained calcium. The sp-ut quiet period metric quantifies this difference by measuring the time from the end of the initial calcium transient to the next rise in sp-ut calcium. The second spatiotemporal calcium signaling metric is bag intensity. In wildtype kymograms, the central spermathecal cells exhibit low calcium activity that increases and pulses as the tissue expels the egg. In spv-1(ok1498) kymograms these central cells exhibit rapid high calcium levels that stay elevated while the spermatheca is occupied. The bag intensity metric quantifies this difference by measuring the average pixel intensity of a 25 µm region of the spermathecal bag over the dwell time. Both spatiotemporal metrics differ significantly between wildtype and spv-1(ok1498) animals (Figure 4.3C).
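As an illustration of the bag intensity metric, the sketch below averages a kymogram over a 25 µm wide band of columns and over the frames that fall within the dwell time. The pixel size, the position of the bag region, and the function name `bag_intensity` are assumptions introduced for this example; the text does not specify how the region is located. The sp-ut quiet period is not sketched because it depends on how calcium transients are detected, which is also not specified here.

```python
import numpy as np

def bag_intensity(kymo, frame_times, neck_close_time, valve_open_time,
                  bag_start_um, um_per_pixel, region_um=25.0):
    """Mean kymogram intensity in the bag region during the dwell time.

    kymo: array of shape (frames, cols) with columns along the distal to
    proximal axis. bag_start_um and um_per_pixel are user-supplied and
    hypothetical; region_um defaults to the 25 um width used in the text.
    """
    kymo = np.asarray(kymo, dtype=float)
    frame_times = np.asarray(frame_times, dtype=float)

    # Convert the physical region to a range of kymogram columns.
    first_col = int(round(bag_start_um / um_per_pixel))
    n_cols = int(round(region_um / um_per_pixel))

    # Select frames between distal neck closure and sp-ut valve opening.
    in_dwell = (frame_times >= neck_close_time) & (frame_times < valve_open_time)

    return float(kymo[in_dwell, first_col:first_col + n_cols].mean())

# Synthetic example: a bright band in the middle of a small kymogram.
demo = np.zeros((4, 10))
demo[1:3, 2:7] = 2.0
value = bag_intensity(demo, frame_times=[0, 1, 2, 3],
                      neck_close_time=1, valve_open_time=3,
                      bag_start_um=10, um_per_pixel=5)
```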

We next examined kymograms from the movies with varying SPV-1::mApple intensities (Figure 4.3D). As expected, low levels of SPV-1::mApple result in short sp-ut quiet periods and high bag intensities, whereas high levels of SPV-1::mApple result in long sp-ut quiet periods and low bag intensities (Figure 4.3, E and F), with the switch occurring between mApple intensity values of 10.1 and 14.8 (Supplemental Figure 4.2, D and E). This spatiotemporal analysis indicates that SPV-1 spatially regulates calcium activity in the spermatheca by keeping calcium low in the bag cells and sp-ut valve for a period of time after oocyte entry.

Figure 4.3. SPV-1 regulates spatiotemporal aspects of calcium signaling. (A) Individual frames from wildtype (left) and spv-1(ok1498) (right) embryo transit movies, the same movies as Figure 4.1C. All frames follow the color scale indicated in the top frame, scale bars are 10 μm. (B) Kymograms of the movies in panel A, generated by averaging over the columns of each movie frame, display the variation in average calcium signaling from the distal valve on the left to the sp-ut valve on the right, with time progressing down (Movie 2). Horizontal scale bars are 5 μm, vertical scale bars are 50 seconds. Colored lines on the left side of the kymograms correspond to the individual frames in panel A, annotations show the two spatiotemporal calcium signaling metrics used for analysis. The sp-ut quiet period measures the low calcium signaling of the sp-ut valve after oocyte entry, which is lost in spv-1 mutants. Bag intensity measures the average normalized fluorescence intensity of a 25 μm wide region in the bag section of the spermatheca during the dwell time. (C) Quantification of metrics. Error bars display standard deviation, p-values were calculated with Welch's t-test: ****p < 0.0001. (D) Kymograms from embryo transit movies with SPV-1::mApple intensities of 2.5, 12.6, 16.9, and 78.5, from left to right. Horizontal scale bars are 5 μm, vertical scale bars are 50 seconds. (E) sp-ut quiet periods plotted as a function of condition. (F) Bag intensities plotted as a function of condition. In panels E and F error bars display standard deviation, p-values were calculated using Welch's ANOVA with Games-Howell multiple comparison: ns, p ≥ 0.05, *p < 0.05, ***p < 0.001, ****p < 0.0001.

4.4.5 – Increasing RHO-1 activity recapitulates transit timing of the spv-1 mutant, but not calcium signaling

Because SPV-1 acts through RHO-1 to regulate contractility of the spermatheca [22], we speculated that increasing RHO-1 activity might alter calcium signaling in a manner similar to the loss of SPV-1. To test this idea, we obtained nematodes expressing constitutively active RHO-1(G14V) under the control of a heat shock promoter [80], crossed them with GCaMP expressing animals, and optimized the heat shock protocol to enable the capture of embryo transit movies (Figure 4.4, A and B). As expected, increasing RHO-1 activity results in shorter dwell times (Figure 4.4, A and C). However, increasing RHO-1 activity did not phenocopy spv-1 mutant calcium signaling, showing slow rising times and low fractions over half max (Figure 4.4, A, B and C), and long sp-ut quiet periods and low bag intensities (Figure 4.4, D and E). In all calcium signaling metrics, increased RHO-1 activity did not differ significantly from the non-heat shocked controls. These results suggest that SPV-1 is likely not working through RHO-1 to regulate calcium signaling in the spermatheca.


Supplemental Figure 4.1. mApple alone does not induce embryo trapping or low calcium signaling. (A) Dwell times plotted as a function of mApple intensity. (B) Rising times plotted as a function of mApple intensity. (C) Fractions over half max plotted as a function of mApple intensity. (D) sp-ut quiet periods plotted as a function of mApple intensity. (E) Bag intensities plotted as a function of mApple intensity. In all panels, spv-1(ok1498);spv-1::mApple data are shown as slightly transparent to emphasize the mApple controls. Hill function fits to the mApple control data are plotted solely as a guide for the eye.

Supplemental Figure 4.2. Tissue function and calcium signaling metrics all show thresholds covering SPV-1::mApple intensity values from ~10 to ~15. (A) Dwell times plotted as a function of mApple intensity. Annotations display the R² value of the Hill function fit, and the Threshold value from that Hill function. The 95 percent confidence interval for the Threshold value is displayed below the data points, with numbers indicating the upper and lower values. (B) Rising times plotted as in panel A. (C) Fractions over half max plotted as in panel A. (D) sp-ut quiet periods plotted as in panel A. (E) Bag intensities plotted as in panel A. The n. d. on the confidence interval indicates that the program could not determine the confidence interval for the Threshold value of the bag intensity.

4.4.6 – SPV-1 regulates calcium signaling via its GAP domain

A functional RhoGAP domain is required for SPV-1 to regulate spermathecal contractility and embryo transit timing [22]. To explore if this activity is also needed for SPV-1 to regulate calcium signaling, we generated transgenic nematodes expressing labeled SPV-1 with a nonfunctional RhoGAP domain, SPV-1(R635K)::mApple, and acquired two-color movies of embryo transits. spv-1(R635K)::mApple was expressed at average mApple fluorescence intensities from 11 to 24, coinciding with mApple fluorescence levels at and above the threshold that induces trapping with RhoGAP functional SPV-1::mApple (Supplemental Figure 4.1). GCaMP time series, heat maps, and kymograms show that overall calcium levels in spv-1(ok1498); spv-1(R635K)::mApple are higher than in the spv-1 mutant controls, and well above the wildtype controls (Figure 4.5, A-C). All of the spv-1(ok1498); spv-1(R635K)::mApple calcium signaling metrics are significantly different from the wildtype controls, and the fractions over half max and bag intensities are significantly increased over the spv-1 mutant controls (Figure 4.5D). These data suggest that the GAP activity of SPV-1 is required to modulate calcium signaling, and indicate that the downstream target of SPV-1 is likely to be a GTPase.

4.4.7 – SPV-1 has GAP activity toward Cdc42 and partially co-localizes with CDC-42

Figure 4.4. Increasing RHO-1 activity alters spermathecal contractility but does not recapitulate spv-1(ok1498) mutant calcium signaling. (A) Representative time series from embryo transits with metrics annotated. (B) Heat maps showing time series from multiple embryo transits. The time series in panel A corresponds to the first row of the heat map. (C) Quantification of time series metrics. (D) Representative kymograms with metrics annotated. Horizontal scale bars are 10 μm, vertical scale bars are 100 seconds. (E) Quantification of kymogram metrics. In panels C and E error bars display standard deviation, p-values were calculated using Welch's ANOVA with Games-Howell multiple comparison: ns, p ≥ 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001. WT and spv-1(ok1498) data are duplicated from figures 4.1 and 4.3.

Figure 4.5. SPV-1 regulates calcium signaling through its GAP domain. (A) Representative time series from embryo transits with metrics annotated. (B) Heat maps showing time series from multiple embryo transits. Time series in panel A correspond to the first rows of the heat maps. (C) Representative kymograms with metrics annotated. Horizontal scale bars are 10 μm, vertical scale bars are 100 seconds. (D) Quantification of metrics. Error bars display standard deviation, p-values were calculated using Welch's ANOVA with Games-Howell multiple comparison: ns, p ≥ 0.05, *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001. WT; mApple and spv-1(ok1498); mApple data are duplicated from figures 4.2 and 4.3.

In addition to regulating RHO-1, in vitro RhoGAP activity assays indicate SPV-1 has significant GAP activity toward another Rho family GTPase, Cdc42/CDC-42 (Figure 4.6A; Ouellette et al., 2015). To determine if CDC-42 is present in the cells of the spermatheca, we obtained and imaged nematodes expressing a GFP::CDC-42 fusion [82] and found CDC-42 expressed throughout the spermatheca at the apical and basal membranes (Figure 4.6, B and C). To investigate the spatial relationship between CDC-42 and SPV-1, we crossed spv-1::mApple into the gfp::cdc-42 animals and acquired two-color movies of embryo transits. We found that SPV-1::mApple partially co-localizes with GFP::CDC-42 at the cell membranes (Figure 4.6, D and E). To quantify the co-localization of SPV-1::mApple and GFP::CDC-42 in the spermatheca we calculated the Manders’ co-localization coefficient [83], [84] for each frame of these embryo transit movies (Figure 4.6F). Manders’ coefficient gives the fraction of SPV-1::mApple pixel intensity in pixels that are also positive for GFP::CDC-42. This analysis shows high co-localization during oocyte entry, with ~70-90% of the SPV-1::mApple pixel intensity found in GFP::CDC-42 positive pixels. Co-localization decreases during the dwell time and exit, rebounding to high co-localization by the end of embryo exit. These data suggest that SPV-1 may regulate CDC-42 in spermathecal cells.
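The Manders' coefficient as defined in this section can be sketched as follows. This is a minimal illustration, not the analysis code used for Figure 4.6F: it assumes the two channels are background-corrected arrays of equal shape, and that a pixel counts as positive for GFP::CDC-42 when its green intensity exceeds a user-chosen threshold. The function name `manders_coefficient` and the threshold parameter are hypothetical.

```python
import numpy as np

def manders_coefficient(red, green, green_threshold):
    """Fraction of total red intensity located in green-positive pixels.

    red: SPV-1::mApple channel; green: GFP::CDC-42 channel (same shape).
    A pixel is green-positive if green > green_threshold (an assumption).
    """
    red = np.asarray(red, dtype=float)
    green = np.asarray(green, dtype=float)
    total = red.sum()
    if total == 0:
        return float("nan")  # undefined when there is no red signal
    return float(red[green > green_threshold].sum() / total)

def manders_time_series(red_movie, green_movie, green_threshold):
    """Apply the coefficient to each frame of a two-color movie."""
    return [manders_coefficient(r, g, green_threshold)
            for r, g in zip(red_movie, green_movie)]

# Synthetic single-frame example.
red_frame = [[1, 3], [0, 4]]
green_frame = [[0, 5], [5, 0]]
m = manders_coefficient(red_frame, green_frame, green_threshold=1)
```

In the example, 3 of the 8 units of red intensity lie in green-positive pixels, so the coefficient is 3/8. The choice of threshold strongly affects the result, so it should be set consistently across movies.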

4.4.8 – Increasing CDC-42 activity recapitulates many aspects of spv-1 mutant calcium signaling

To determine if increasing CDC-42 activity alters calcium signaling similarly to the loss of SPV-1, we generated nematodes expressing a constitutively active CDC-42(Q61L) [85], [86] under the control of a heat shock promoter, crossed this line with our GCaMP sensor line, optimized the heat shock protocol for CDC-42, and acquired embryo transit movies. Increasing CDC-42 activity does not significantly alter dwell times. Comparing animals expressing constitutively active CDC-42(Q61L) to non-heat shocked controls demonstrates that activation of CDC-42 results in faster rising times, increased fractions over half max, shorter sp-ut quiet periods, and higher bag intensities (Figure 4.7, A-E). Heat shock alone does not significantly alter calcium signaling metrics in control animals (Supplemental Figure 4.3).

Because both Cdc42 and Rho are known regulators of the actin cytoskeleton [87], and because rearrangements of the actin cytoskeleton can influence the spatiotemporal regulation of the endoplasmic reticulum and thereby directly affect calcium levels [88], we visualized F-actin in dissected spermathecae from animals treated with heat shock conditions. No major alterations in actin cytoskeletal arrangements were observed in response to heat shock (Supplemental Figure 4.4), suggesting disruption of the actin cytoskeleton is unlikely to underlie the defects in calcium signaling seen in animals expressing constitutively active CDC-42. These data suggest SPV-1 may act through CDC-42 to regulate calcium signaling in the spermatheca during embryo transits, perhaps through effectors that do not directly regulate the actin cytoskeleton.

Figure 4.6. SPV-1 exhibits GAP activity toward Cdc42 and partially co-localizes with CDC-42 at spermathecal cell membranes. (A) In vitro assay measuring activity of the SPV-1 RhoGAP domain toward recombinant mammalian Cdc42. Error bars display S.E.M., p-values were calculated using Welch's ANOVA with Games-Howell multiple comparison: **p < 0.01, ***p < 0.001, ****p < 0.0001. (B, C) The first and last frames of the dwell time from a representative movie, with the quantified region, capturing the same cell, annotated. Scale bars are 10 μm. (D, E Left) Digitally zoomed view of the quantified region, with the line scans annotated. (D, E Right) Average fluorescence intensity along the line scans, with GFP::CDC-42 in green and SPV-1::mApple in magenta. (F) Time series of Manders' co-localization coefficients for embryo transit movies from 3 different animals. Manders' coefficient gives the fraction of SPV-1::mApple pixel intensity in pixels that are also positive for GFP::CDC-42. The black trace is from the same movie displayed in panels B-E.

Figure 4.7. Increasing CDC-42 activity alters spermathecal calcium signaling. (A) Representative time series from embryo transits with metrics annotated. (B) Heat maps showing time series from multiple embryo transits. Time series in panel A corresponds to the first row of the heat map. (C) Quantification of time series metrics. (D) Representative kymograms with metrics annotated. Horizontal scale bars are 10 μm, vertical scale bars are 100 seconds. (E) Quantification of kymogram metrics. In panels C and E error bars display standard deviation, p-values were calculated using Welch's ANOVA with Games-Howell multiple comparison: ns, p ≥ 0.05, *p < 0.05, **p < 0.01, ***p <0.001, ****p < 0.0001. WT and spv-1(ok1498) data are duplicated from figures 4.1 and 4.3.


Supplemental Figure 4.3. Heat shock treatment does not alter wildtype tissue function or calcium signaling in any of the metrics. (A) Dwell times plotted by condition. (B) Rising times plotted by condition. (C) Fractions over half max plotted by condition. (D) sp-ut quiet periods plotted by condition. (E) Bag intensities plotted by condition. In all panels, error bars display standard deviation, p-values were calculated using Welch's t-test: ns, p ≥ 0.05.

Supplemental Figure 4.4. Heat shock treatment and activation of constitutively active GTPases do not alter actin bundle alignment. (A) Max intensity projections of representative fixed, dissected spermathecae, stained with phalloidin-conjugated Texas Red. All spermathecae are oriented with the sp-ut valve on the right. Scale bars are 10 μm, brightness is enhanced for presentation. Any apparent differences in actin filament size or structure are solely due to the selection of images for presentation; visual inspection of all the data collected revealed no consistent differences between the conditions. (B) Anisotropy values provide a measure of actin bundle alignment. Each data point is a single cell; 3 cells were quantified in each of 4 different spermathecae for each condition. Error bars display standard deviation, p-values were calculated using Welch's t-test: ns, p ≥ 0.05.


4.4.9 – Decreasing CDC-42 using RNAi does not alter calcium signaling

To further explore CDC-42’s role in calcium signaling we generated nematodes co-expressing a red calcium sensor, R-GECO, and CDC-42 labeled with GFP. We used RNAi against cdc-42 to deplete CDC-42 levels and recorded embryo transit movies with R-GECO to monitor calcium signaling. Decreasing CDC-42 levels via RNAi does not alter calcium time series in the spv-1 mutant background (Figure 4.8, A and B), or in the wildtype background (data not shown). Dwell times and global calcium signaling metrics are unchanged by cdc-42(RNAi) in both wildtype and mutant backgrounds (Figure 4.8C). Spatiotemporal patterns of calcium signaling are also unchanged by cdc-42(RNAi), as indicated by kymograms (Figure 4.8D) and spatiotemporal calcium signaling metrics (Figure 4.8E). Measurements of GFP::CDC-42 fluorescence intensity suggest that cdc-42(RNAi) treatments significantly deplete CDC-42 levels (Figure 4.8F); however, residual levels of CDC-42 could be sufficient to support the observed calcium dynamics. These results suggest that SPV-1 may regulate calcium signaling through another as yet unidentified effector.

4.5 – Discussion

In this work, we show that the RhoGAP SPV-1 is a major regulator of calcium signaling in the C. elegans spermatheca during embryo transits. SPV-1 controls spatiotemporal aspects of calcium signaling by keeping calcium low in the spermathecal bag cells and sp-ut valve for a period of time after oocyte entry. This controlled calcium signaling likely contributes to the spatiotemporal patterns of contractility present in wildtype function of the tissue, where contractility must be high in the distal neck to keep the embryo in the spermatheca, and yet must also stay low in the central bag cells so that the egg forms with the correct shape. Misregulated calcium signaling, in addition to increased active RHO-1 levels, likely contributes to the misshapen embryos and decreased brood sizes seen in the spv-1(ok1498) mutant [22].


Figure 4.8. cdc-42(RNAi) does not alter calcium signaling. (A) Representative time series from embryo transits with metrics annotated. (B) Heat maps showing time series from multiple embryo transits. The time series in panel A corresponds to the first row of the heat map. (C) Quantification of time series metrics. (D) Representative kymograms with metrics annotated. Horizontal scale bars are 5 μm, vertical scale bars are 50 seconds. (E) Quantification of kymogram metrics. (F) Quantification of GFP::CDC-42 intensity. In panels C, E and F error bars display standard deviation, p-values were calculated using Welch's t-test: ns, p ≥ 0.05, ****p < 0.0001.


Spermathecal tissue function and calcium signaling exhibit a threshold response to SPV-1::mApple level, suggesting levels of SPV-1 protein must be maintained within a narrow range for wildtype embryo transits to occur. Increases in SPV-1 above wildtype levels lead to over-dampening of the signaling pathways regulating contractility, resulting in a tissue that cannot produce productive contractions. Decreases in SPV-1 below wildtype levels lead to insufficient inhibition of the signaling pathways that stimulate contractility, making the tissue hypercontractile (Figure 4.9, A and B). A threshold response also suggests that feedback mechanisms are involved in controlling spermathecal tissue function and calcium signaling. The molecular mechanisms tightly regulating the expression and activity levels of SPV-1, and characterization of the feedback control system that governs the spermatheca, are exciting areas for future investigation.

Given previous results suggesting SPV-1 acts through RHO-1 to regulate contractility, it was a surprise to find that constitutively active RHO-1 did not recapitulate the spv-1(ok1498) calcium signaling phenotypes. Although animals expressing constitutively active RHO-1 did exhibit shorter dwell times, suggesting sufficient activation of RHO-1 to stimulate actomyosin contractility, they produced fairly wildtype calcium signaling. RHO-1 is therefore probably working predominantly through LET-502/ROCK and myosin activation, and not in the regulation of calcium signaling. Increased phosphorylation of myosin through the Rho-ROCK pathway would lead to calcium sensitization of myosin, a common phenomenon in smooth muscle and non-muscle systems in which contraction is initiated at lower than normal calcium levels [30].

Because altering RHO-1 activity did not affect the calcium signal, we asked if SPV-1 GAP activity was necessary to regulate calcium during embryo transits. Expression of SPV-1 with a non-functional GAP domain (spv-1(R635K)::mApple) resulted in high calcium levels, exceeding those observed in the spv-1 mutant. This suggests that a GTPase acts downstream of SPV-1 to modulate calcium signaling and that SPV-1(R635K) may be able to bind and sequester GTPases, preventing them from being turned off by other GAPs.

Several observations suggest the small GTPase CDC-42 may play a role in the regulation of calcium signaling in the spermatheca. Expressing constitutively active CDC-42 in the spermatheca recapitulates key aspects of the spv-1 mutant calcium signaling phenotypes. In particular, the spv-1 mutant-like increase in calcium signal evident in the increased fractions over half max and the reduced sp-ut quiet periods suggests increased CDC-42 activity may contribute to the stimulation of calcium release downstream of SPV-1. Although calcium levels increase in animals expressing constitutively active CDC-42, the dwell times are unchanged. Therefore, premature and elevated calcium signaling alone does not generate the hyperconstriction of the spermatheca and rapid embryo transits observed in spv-1(ok1498) animals. The hyperconstriction may therefore be driven primarily by RHO-1. We attempted to directly address the question of whether the spv-1 phenotype is the collective result of misregulation of both RHO-1 and CDC-42 by generating animals expressing both of the constitutively active constructs. Unfortunately, developmental and ovulation defects in these animals prevented us from collecting the transit data needed to answer this question.

Despite the phenotypic similarities between loss of spv-1 and increased CDC-42, the calcium intensities in the spermathecal bag and the rising times in animals expressing constitutively active CDC-42 are significantly different from those seen in the spv-1 mutant population. This may be because the levels of CDC-42 activity in a constitutively active mutant activated by heat shock differ from wildtype CDC-42 activity levels in intensity, duration and/or spatial organization. The uneven recapitulation of the spv-1 mutant calcium signaling phenotype may also suggest that another unidentified protein acts downstream of SPV-1 to regulate calcium signaling.


Figure 4.9. SPV-1 regulates spermathecal contractility via calcium and Rho-ROCK signaling. (A) Summary table of findings. SPV-1 regulates both the Rho-ROCK and calcium signaling pathways which together are required for activation of myosin and tissue contractility. Increasing RHO-1 activity leads to faster transits (decreased dwell times) similar to decreased SPV-1, but does not alter calcium signaling. Increasing active CDC-42 does not alter dwell times but does alter calcium signaling similar to decreased SPV-1. Increasing SPV-1 increases dwell times, often resulting in transit failures and embryo trapping, and decreases calcium signaling activity and magnitude. Decreasing CDC-42 using RNAi does not alter dwell time or calcium signaling. (B) Proposed model of the network regulating actomyosin contractility in the spermatheca. Grey lines along the outside display previously known interactions. Orange lines display interactions investigated in this study. Solid lines indicate direct interactions, dotted lines indicate resultant interactions with known intermediates not shown, and dashed lines indicate unknown intermediates. SPV-1, CDC-42, and/or an unknown effector symbolized by the question mark could interact with any of the upstream members of the calcium signaling pathway; those arrows were omitted for clarity.


If SPV-1 works through the inactivation of CDC-42 to regulate calcium release, depletion of CDC-42 might be expected to suppress the spv-1 mutant calcium dynamics. However, depleting CDC-42 using RNAi did not result in altered calcium signaling in the spv-1 mutant or wildtype backgrounds, despite evidence that the treatment is effective at depleting CDC-42. CDC-42 is important for larval development, including proper formation of the gonad [89], and depleting it completely results in morphogenesis defects that preclude ovulation. Because of this, our RNAi treatments were started at later larval stages and were not complete knockdowns. Perhaps residual CDC-42 is sufficient to properly regulate calcium signaling. Another possibility is that SPV-1 regulates calcium through another unknown effector. Our results demonstrate the GAP activity of SPV-1 is required for proper spatiotemporal regulation of calcium signaling, suggesting that a GTPase is a target. Of the known Rho GTPases in C. elegans (CED-10, RAC-2, MIG-2, CDC-42, RHO-1, and CRP-1), SPV-1 has been shown to have significant GAP activity only toward CDC-42 and RHO-1 [81]. Identification of additional SPV-1 effectors will be an important avenue for future studies.

In other systems, CDC-42/Cdc42 is known to regulate actin cytoskeletal rearrangements and cell migration as well as protein kinase cascades that can alter transcription of downstream targets [90], [91]. However, few studies link Cdc42 to the activation of calcium signaling. Perhaps the best characterized example is found in mast cells, critical cells of the immune system during inflammatory reactions. Calcium signaling, triggered by antigen binding to IgE and FcεRI receptors, controls mast cell exocytosis and degranulation, which releases histamines and other substances. In RBL-2H3 mast cells, dominant active mutants of Cdc42 lead to elevated levels of antigen stimulated IP3 and cytosolic calcium [92]. Further studies have shown that Cdc42 can participate in the regulation of stimulated synthesis of PIP2, the substrate of PLC isozymes, in these cells [93]. This suggests CDC-42 activity in the spermatheca may be controlling calcium activity through the regulation of PIP2 turnover and/or the production of IP3. In addition, in vitro studies have shown that human Cdc42 can stimulate purified PLC-β [94] and interact with PLC-γ1 [92]. It is possible that CDC-42 is affecting the activation of PLC-1/ε or the availability of substrate in the spermathecal cells to control IP3 production and calcium release during embryo transits.

SPV-1 has three human orthologs [22]: ARHGAP29/PARG1 [95], [96], HMHA1/ARHGAP45 [97], [98] and GMIP/ARHGAP46 [99]. Recently, loss of HMHA1/ARHGAP45 in endothelial cells was found to increase wound healing and cell migration rates, suggesting altered actomyosin contractility, as well as increasing the cellular response to shear stress, suggesting altered mechanotransduction (Amado-Azevedo et al., 2018). Similar patterns of altered mechanotransduction and contractility are present in the spermatheca when SPV-1 is lost. In mice, ARHGAP29/PARG1 and a binding partner Rasip1 were found to regulate nonmuscle myosin II via RhoA and Cdc42 during blood vessel development, with Rasip1 first activating Cdc42 to break cell-cell adhesions during lumen formation and Rasip1 then activating ARHGAP29/PARG1, inhibiting RhoA and keeping contractility low so the lumen stays open and the blood vessel can expand as the animal grows [100]. This behavior of ARHGAP29/PARG1, regulating cellular contractility directly through RhoA and also through Cdc42 in conjunction with a binding partner, further suggests other proteins may interact with SPV-1 to modulate calcium signaling in the spermatheca.

Our results indicate SPV-1 regulates calcium signaling in addition to Rho-ROCK signaling, providing insights into the regulation of calcium signaling, actomyosin contractility, and tissue function in the C. elegans spermatheca. From this evidence it appears that not only is there interaction between the two central pathways that regulate actomyosin contractility, but a single protein can fine-tune both pathways to generate consistent and robust tissue function. Given the conservation of sequence and function, it will be exciting to determine whether SPV-1 orthologs play similar roles in other biological contexts.

4.6 – Acknowledgements

We thank Ismar Kovacevic, Pei Yi Tan, Anand Asthagiri, Javier Apfeld, and members of the Cram and Apfeld labs for helpful feedback and discussions. This work was supported by a grant from the National Institutes of Health/National Institute of General Medical Sciences (GM110268) to E.J.C., a grant from the Israel Science Foundation (grant No. 1293/17) to R.Z-B., a National Science Foundation/Molecular and Cellular Biosciences - U.S. Israel Binational Science Foundation award (1816640) to E.J.C. and R.Z-B., and a National Science Foundation East Asia and Pacific Summer Institutes fellowship (1414889) to J.B. Some C. elegans strains were provided by the Caenorhabditis Genetics Center, which is funded by the National Institutes of Health Office of Research Infrastructure Programs (P40 OD010440).

4.7 – Methods

C. elegans Strains and Culture

Nematodes were grown on nematode growth media (NGM) (0.107 M NaCl, 0.25% wt/vol Peptone (Fisher), 1.7% wt/vol BD Bacto-Agar (Fisher), 0.5% Nystatin (Sigma), 0.1 mM CaCl2, 0.1 mM MgSO4, 0.5% wt/vol cholesterol, 2.5 mM KPO4), and seeded with Escherichia coli OP50 using standard techniques [101]. Nematodes were cultured at 20°C unless specified otherwise. All lines expressing GCaMP3 were crossed into UN1108 as described in Kovacevic et al. (2013) or UN1417, a separate integration event of the same construct, fln-1p::GCaMP3. GCaMP signal in UN1108 and UN1417 does not differ significantly in any of the described metrics. spv-1(ok1498) animals were crossed with UN1417 animals to generate the line UN1416 (spv-1(ok1498); fln-1p::GCaMP3). Table 4.1 provides a full list of C. elegans strains used in this study.

Construction of spv-1::mApple and SPV-1(R635K)::mApple transgenic animals

GFP and the spv-1 3’UTR were removed from plasmid pPY1 (spv-1p::spv-1::GFP::spv-1 3’UTR, Tan and Zaidel-Bar 2015) and replaced with tdTomato and the fln-1 3’UTR from pUN284 (fln-1p::inx-12::tdTomato::fln-1 3’UTR) using restriction digest cloning with the enzymes KpnI and EagI. tdTomato was removed from this plasmid using the enzymes KpnI and NotI, and replaced with mApple amplified from Addgene plasmid 27698 using primers that introduced KpnI and NotI restriction sites to the ends of mApple, generating pUN362. spv-1p::spv-1::mApple::fln-1 3’UTR was then transferred from pUN362 into pUN359, a plasmid modified from pCFJ151 [102] to facilitate CRISPR insertion at the ttTi5605 transposon site, using restriction digest cloning with enzymes PmeI and SpeI, resulting in pUN427. pUN595 (spv-1p::SPV-1(R635K)::mApple::fln-1 3’UTR) was generated using around the horn PCR site-directed mutagenesis on pUN427 with primers designed to introduce the two nucleotide R635K amino acid mutation as well as a single nucleotide silent mutation 30 bp downstream to generate a BglII restriction site to facilitate tracking the R635K mutation. Transgenic animals were created by microinjecting N2 nematodes with a DNA solution containing 50 ng/µL of Cas9 plasmid [103], [104], 30 ng/µL each of 2 different CRISPR guide plasmids, pUN357 and pUN358, and 45 ng/µL of pUN427 or pUN595. Progeny displaying red fluorescence in the spermatheca were isolated and screened for CRISPR integration. After multiple failed attempts at CRISPR integration, lines expressing the transgenes as extrachromosomal arrays were established and used for experiments. Three spv-1::mApple lines and two SPV-1(R635K)::mApple lines were isolated from independent microinjections. These lines were crossed with UN1416 nematodes to generate animals with dual-labeled spermathecae in the wildtype and spv-1 mutant backgrounds.

Construction of fkh-6p::mApple transgenic animals

GFP was removed from pUN106 (fkh-6p::GFP::unc-54 3’UTR) and replaced with mApple from pUN427 using restriction digest cloning with enzymes KpnI and EcoRI to create pUN822. Transgenic animals were created by microinjecting N2 nematodes with a DNA solution containing 45 ng/µL of pUN822 and 100 ng/µL of pRF4 (rol-6 injection marker). Progeny displaying red fluorescence in the spermatheca were isolated and lines expressing the transgene as an extrachromosomal array were established. These lines were crossed with UN1416 nematodes to generate animals with dual-labeled spermathecae in the wildtype and spv-1 mutant backgrounds.


Construction of constitutively active CDC-42 (hsp16.2p::CDC-42(Q61L)::unc-54 3’UTR) animals

CDC-42(Q61L) was amplified from pJK6 (a gift from Ahna Skop, made originally by John White) using primers engineered with EcoRI 5’ extensions and ligated into pUN597 (hsp16.2::unc-54 3’UTR) between the promoter and 3’UTR to create pUN624. Transgenic animals were created by microinjecting a DNA solution containing 20 ng/μl pUN624 and 25 ng/μl myo-3::mCherry as a co-injection marker into N2 animals. Animals exhibiting mCherry expression in the body wall muscle were segregated to establish the transgenic line UN1720. UN1720 was crossed into UN1108 (fln-1p::GCaMP3) creating the transgenic line UN1728 for calcium studies.

Construction of gfp::cdc-42; spv-1(ok1498); spv-1::mApple animals

The strain WS5018 (cdc-42(gk388);opIs295 II (gfp::cdc-42)) [82] was obtained from the Caenorhabditis Genetics Center (CGC) and crossed with a spv-1(ok1498); spv-1::mApple line generated above, creating strain RZB300.

Construction of gfp::cdc-42; R-GECO animals

WS5018 animals were injected with a DNA solution containing 50 ng/μl of pUN526 (fln-1p::R-GECO::fln-1 3’UTR) and 40 ng/μl of pRF4 (rol-6 injection marker). Progeny displaying red fluorescence in the spermatheca were isolated and a line, UN18124, expressing the transgene as an extrachromosomal array was established. RZB300 animals lacking red fluorescence were crossed with UN18124 animals to generate the strain UN18127 with gfp::cdc-42 and R-GECO in the spv-1(ok1498) background.

Heat shock protocols for constitutively active rho-1(G14V) and cdc-42(Q61L)

UN1516 (RHO-1(G14V); fln-1p::GCaMP3) animals were synchronized using an ‘egg prep’ where embryos are released from young gravid adults using an alkaline hypochlorite solution (Hope, 1999). Clean embryos were plated on OP50 seeded NGM and cultured at 20°C until young adulthood (70-72 hours post ‘egg prep’). To induce expression of constitutively active RHO-1(G14V), animals were moved from 20°C to 33°C for 30 minutes, and then left to recover at 20°C for 1 hour before imaging.

UN1728 (CDC-42(Q61L); fln-1p::GCaMP3) animals were synchronized as described above. Young adults expressing myo-3::mCherry were segregated and placed at 33°C for 2 hours to induce the expression of constitutively active CDC-42(Q61L). Animals were left to recover at 20°C for 1 hour before imaging.

RNA interference

The RNAi protocol was performed essentially as described in Timmons and Fire (1998). HT115(DE3) bacteria transformed with the dsRNA construct of interest were grown overnight in Luria Broth (LB) supplemented with 40 mg/ml ampicillin. The following day 300 μl of the cultured bacteria was seeded on NGM/RNAi plates supplemented with 25 μg/ml carbenicillin and isopropylthio-β-galactoside (IPTG) and left for 24-72 hours to induce dsRNA expression. Empty RNAi, i.e. HT115(DE3) bacteria transformed with L4440 without a gene-specific insert, was used as a control in RNAi experiments to account for the different bacterial food source as well as non-specific activation of the RNAi machinery. Diluted spv-1 RNAi was conducted by mixing overnight cultures of spv-1 and empty RNAi bacteria in volume ratios of 1:1, 1:3, 1:9, or 1:19 for seeding NGM/RNAi plates. Embryos were collected using an ‘egg prep’ as described above. Clean embryos were transferred to the seeded NGM/RNAi plates and incubated at 20°C for 65-70 hours before imaging. For cdc-42 RNAi, clean embryos were deposited on OP50 NGM plates and incubated at 20°C for 36-48 hours, followed by transfer to cdc-42 or empty RNAi plates and incubation at 20°C for another 24-36 hours before imaging.
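As a small worked example of the dilution series, the sketch below converts each volume ratio into the fraction of spv-1 RNAi culture in the seeding mixture. It assumes the ratios are written as spv-1 : empty, which the text implies but does not state; the function name is hypothetical.

```python
def spv1_fraction(empty_parts, spv1_parts=1):
    """Fraction of the seeding mixture that is spv-1 RNAi culture.

    Assumes the ratio is written spv-1 : empty (an assumption, see lead-in).
    """
    return spv1_parts / (spv1_parts + empty_parts)


# Ratios 1:1, 1:3, 1:9 and 1:19 from the protocol above.
fractions = [spv1_fraction(n) for n in (1, 3, 9, 19)]
```

Under that assumption the four mixtures contain 50%, 25%, 10%, and 5% spv-1 RNAi culture by volume.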

Image Acquisition

For all GCaMP and R-GECO studies, animals were immobilized with 0.05-micron polystyrene beads (Polysciences Inc., Warrington, PA, USA), mounted on slides with 5-10% agarose pads [107], and imaged using a 60x, 1.40 NA oil-immersion objective on a Nikon Eclipse 80i microscope equipped with a SPOT RT3 CCD camera (Diagnostic Instruments; Sterling Heights, MI, USA) controlled by SPOT Advanced imaging software, version 5.0, with Peripheral Devices and Quantitative Imaging modules. Images were acquired at 1600x1200 pixels, using the full camera chip, and saved as 8-bit tiff files. Fluorescence excitation was provided by a Nikon Intensilight C-HGFI 130W mercury lamp and shuttered with a Lambda 10-B SmartShutter (Sutter Instruments, Novato CA, USA), also controlled through the SPOT software. Single channel GCaMP time lapse movies were acquired using a GFP filter set (470/40x 495bs 525/50m) (Chroma Technologies, Bellows Falls VT, USA) at 1 frame per second, with an exposure time of 75 ms, gain of 8, and neutral density of 16. Two channel GCaMP/mApple and GFP/R-GECO time lapse movies were acquired using an EGFP/mCherry filter set, with excitation filters 470/40x and 572/35x housed in the SmartShutter filterwheel to enable rapid switching, and emissions passing through a single filter cube containing beamsplitter 59022bs and emission filter 59022m (520/20m and 640/40m) (Chroma Technologies). For GCaMP/mApple movies one GCaMP frame and one mApple frame were acquired every 2 seconds, with exposure time of 40 ms and gain of 16 for the GCaMP channel, exposure time of 75 ms and gain of 32 for the mApple channel, and neutral density of 32 for both channels. For GFP/R-GECO movies one GFP frame and one R-GECO frame were acquired every 2 seconds, with exposure time of 50 ms, gain of 16, and neutral density of 16 for both channels.

For GFP::CDC-42/SPV-1::mApple studies, animals were mounted with 2 µl of M9 buffer on slides with 10% agarose pads, and imaged using a 100x 1.4 NA oil-immersion objective on a Nikon Ti confocal microscope equipped with a spinning-disk head (CSU-X1; Yokogawa, Tokyo, Japan) and Prime95b camera (Photometrics, Tucson, AZ) controlled using Metamorph software (Molecular Devices, Sunnyvale, CA), and saved as 16-bit tiff files. Z-stacks were acquired with 1 µm spacing between slices, with Z-stacks acquired every 10 seconds. GFP was excited using the 481 nm laser and mApple was excited with the 561 nm laser, both at 35% laser power and 200 ms exposure time.

Image Processing

Only successful embryo transits, meaning the embryo exited through the sp-ut valve, or whole embryos trapped in the spermatheca, were analyzed in this work.

GCaMP and R-GECO movies were processed using a custom Fiji script. First, the movie was registered using the Fiji plugin StackReg [61] to correct for any movement of the animal during image acquisition. The movie was then rotated to a standard orientation with the occupied spermatheca horizontal and the sp-ut valve on the right side of the movie. Finally, the movie was cropped to 800x400 pixels, annotated as processed, and saved as an 8-bit tiff file. For GCaMP/mApple and GFP/R-GECO movies the 2 channels were registered separately and then recombined before rotation and cropping. The angle of rotation and positioning of the cropping box were determined by manual user input.

GFP::CDC-42/SPV-1::mApple movies were not processed beyond selecting a single Z-plane for analysis. Fiji was used to manually track and annotate the cells, and to manually draw and quantify line scans.

Generation of GCaMP and R-GECO time series

GCaMP and R-GECO time series were generated by calculating the average pixel intensity for each frame of the processed movie. Normalized average pixel intensity time series were generated by normalizing the entire time series to the baseline value, F0, calculated as the average of the 30 frames prior to the start of oocyte entry [23].
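The two steps above can be sketched in a few lines of Python. This is an illustration rather than the custom Fiji and Matlab code used in this work: frames are represented as plain nested lists, the function names are hypothetical, and normalization is assumed to mean division by F0 (consistent with the kymogram normalization described later in this section).

```python
from statistics import mean


def mean_intensity_series(frames):
    """Average pixel intensity of each frame (a frame is a list of pixel rows)."""
    return [mean(p for row in frame for p in row) for frame in frames]


def normalize_to_baseline(series, entry_frame, n_baseline=30):
    """Divide the whole series by F0, the mean of the frames before oocyte entry.

    entry_frame is the index of the first frame of oocyte entry; the baseline
    uses up to n_baseline frames immediately preceding it.
    """
    start = max(0, entry_frame - n_baseline)
    baseline = series[start:entry_frame]
    if not baseline:
        raise ValueError("no frames available before oocyte entry")
    f0 = mean(baseline)
    return [value / f0 for value in series], f0
```

Returning F0 alongside the normalized series mirrors the way the baseline is reused in later steps (half max calculation and kymogram normalization).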

Extraction of metrics from time series

Custom Fiji and Matlab code was used to document and archive manually annotated time points, computationally identified metrics, and additional data for each movie. The global calcium signaling metrics rising time and fraction over half max were adapted from [62].

Manual annotation of time points, calculation of dwell time

Time points were determined by visual inspection. Four significant time points were annotated for each movie: (1) distal neck open, when the oocyte starts to enter the spermatheca, (2) distal neck close, when the embryo is completely enclosed, (3) sp-ut valve open, when the embryo starts to exit, and (4) sp-ut valve close, when the embryo completely exits the spermatheca. In ambiguous cases, three individuals scored each time point, and the average of the three values was used. Dwell time is calculated as the frame when the sp-ut valve opens minus the frame when the distal neck closes.

Calculation of max, half max and rising time

Each GCaMP and R-GECO time series was smoothed using a moving average filter, Matlab function ‘filter’, with a window size of 5. The max value was determined using the built-in Matlab function, and the baseline value, previously used to normalize the time series, was recalled. The half max value was calculated by taking half the difference between the max and the baseline, and adding the baseline again. The first time point where the time series is above the half max value was recorded as the end of the rising time, with distal neck open as the start of the rising time. Rising time is calculated as first time point above half max minus distal neck open.
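A minimal Python sketch of this calculation follows. It assumes the Matlab call was of the form filter(ones(1, 5) / 5, 1, x), i.e. a trailing (causal) average with zero initial conditions, which is the default behavior of 'filter' but is not spelled out in the text. Function names are hypothetical and the result is expressed in frames.

```python
def causal_moving_average(series, window=5):
    """Trailing moving average, zero-padded before the first sample.

    Mirrors Matlab's filter(ones(1, w) / w, 1, x) under the assumption
    stated in the lead-in: each output is the sum of the current and the
    previous (w - 1) samples divided by w.
    """
    out = []
    for i in range(len(series)):
        lo = max(0, i - window + 1)
        out.append(sum(series[lo:i + 1]) / window)
    return out


def half_max(smoothed, baseline):
    """Baseline plus half the difference between the max and the baseline."""
    return baseline + (max(smoothed) - baseline) / 2


def rising_time(smoothed, baseline, distal_neck_open):
    """Frames from distal neck open to the first sample above half max."""
    if max(smoothed) <= baseline:
        raise ValueError("signal never rises above baseline")
    threshold = half_max(smoothed, baseline)
    first_above = next(i for i, v in enumerate(smoothed) if v > threshold)
    return first_above - distal_neck_open
```

Because the filter is causal, the smoothed trace lags the raw trace by roughly half the window, which slightly delays the first crossing of the half max value.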

Calculation of fraction over half max

For the dwell time of each time series, the number of time points above the half max value was counted, and this number was divided by the total dwell time to give the fraction over half max.
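The dwell time and fraction over half max can be sketched together as below. The sketch assumes the dwell window is the half-open frame range from distal neck close up to, but not including, sp-ut valve open, so that the number of samples equals the dwell time in frames; the function name is hypothetical.

```python
def fraction_over_half_max(series, half_max_value, neck_close, valve_open):
    """Fraction of dwell-time samples that exceed the half max value.

    neck_close and valve_open are frame indices; the window
    [neck_close, valve_open) is an assumption (see lead-in).
    """
    dwell = valve_open - neck_close  # dwell time in frames
    if dwell <= 0:
        raise ValueError("sp-ut valve must open after the distal neck closes")
    window = series[neck_close:valve_open]
    above = sum(1 for value in window if value > half_max_value)
    return above / dwell
```

For example, a 5-frame dwell window with 3 samples above the half max value gives a fraction of 0.6.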

Calculation of mApple and GFP::CDC-42 average pixel intensity

For GCaMP/mApple and GFP::CDC-42/R-GECO movies, mApple and GFP::CDC-42 time series were generated by calculating the average pixel intensity of the mApple or GFP channel for each frame of the processed movie. The average of the 30 frames prior to oocyte entry, analogous to the GCaMP/R-GECO baseline, was recorded as the value for the mApple or GFP::CDC-42 average pixel intensity. For presentation in Figure 5.2 the mApple time series was smoothed using a moving average filter with a window size of 50, and a constant 2 was added to every value of the time series to compensate for the decreased intensity caused by the heavier averaging. GCaMP/mApple movies were acquired in three batches over fifteen months, with a microscope deep cleaning and fluorescent lamp replacement between batches. SPV-1(R635K)::mApple average pixel intensity values were found to be consistent within batches, so the average SPV-1(R635K)::mApple average pixel intensity value for each batch was used to establish ratios to bring all the mApple values from the first two batches into agreement with the third batch.
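One way to read the batch adjustment is as a multiplicative scale factor per batch, equal to the reference mean of the third batch divided by the reference mean of the batch being adjusted. The sketch below follows that reading; the multiplicative form, the function names, and the data layout are assumptions, and the numbers in the usage note are purely illustrative.

```python
from statistics import mean


def batch_scale_factors(reference_by_batch, target_batch):
    """Scale factor per batch from reference-strain intensities.

    reference_by_batch maps a batch label to the list of
    SPV-1(R635K)::mApple average pixel intensities measured in that batch.
    Multiplying a batch's values by its factor brings its reference mean
    into agreement with the target batch.
    """
    target_mean = mean(reference_by_batch[target_batch])
    return {batch: target_mean / mean(values)
            for batch, values in reference_by_batch.items()}


def rescale(values, factor):
    """Apply a batch scale factor to a list of mApple intensity values."""
    return [value * factor for value in values]
```

With illustrative reference values of {1: [10, 14], 2: [6, 6], 3: [24]} and batch 3 as the target, the factors are 2.0, 4.0, and 1.0, so the third batch is left unchanged.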

Adjustment of metrics for trapped movies

Trapped dwell times were assigned a value of 575, slightly above the highest value in the dataset, for plotting purposes and fitting of the Hill function. These trapped dwell times were excluded from statistical analysis. Trapped fractions over half max were assigned a value of -0.10 for plotting purposes, and excluded from statistical analysis.

Fit to Hill function and determination of SPV-1::mApple threshold

Using GraphPad Prism 7.04, dwell times were arranged in an XY table with mApple values as X and dwell times as Y, with all spv-1(ok1498); spv-1::mApple movies, i.e. trapped, untreated and RNAi, in a single column. Nonlinear regression curve fit was used, with [Agonist] vs. response – Variable slope (four parameters). The EC50 in the resulting table is presented here as the threshold.
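For reference, the four-parameter variable slope model has the form Y = Bottom + (Top - Bottom) * X^h / (EC50^h + X^h), where h is the Hill slope. The sketch below only evaluates this function to show why the EC50 is a natural threshold: at X = EC50 the response is exactly midway between Bottom and Top. It does not perform the regression, and the parameter values in the usage note are illustrative, not fitted results from this study.

```python
def four_parameter_hill(x, bottom, top, ec50, hill_slope):
    """Four-parameter variable slope (Hill) response at concentration x.

    Y = bottom + (top - bottom) * x**h / (ec50**h + x**h)
    """
    if x <= 0 or ec50 <= 0:
        raise ValueError("x and ec50 must be positive")
    xh = x ** hill_slope
    return bottom + (top - bottom) * xh / (ec50 ** hill_slope + xh)
```

For instance, with bottom = 100, top = 500, EC50 = 8, and Hill slope = 4, the response at x = 8 is 300, the midpoint, and a larger Hill slope makes the transition around the EC50 sharper and more threshold-like.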

Generation of kymograms

For every frame of the processed movie the pixel intensities of each column were averaged to generate a single pixel representing that column. This operation was performed for each column of the frame, condensing the frame into a single pixel line. This operation was carried out for every frame of the movie, with the single pixel lines stacking to generate a 2D image that represents the spatiotemporal dynamics of the entire movie (Fig. 4.3 A, B, Movie 2). This was implemented in custom Fiji code using the commands Image>Stacks>Reslice followed by Image>Stacks>Z Project (Average Intensity).
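The column-averaging operation can be illustrated outside Fiji with a short Python sketch. Frames are plain nested lists here and the function names are hypothetical; the actual implementation used the Fiji commands named above.

```python
def kymogram_row(frame):
    """Condense one frame (a list of pixel rows) into a single pixel line
    by averaging each column."""
    n_rows = len(frame)
    return [sum(column) / n_rows for column in zip(*frame)]


def kymogram(frames):
    """Stack the single pixel lines: one row per frame (time runs down,
    position along the spermatheca runs across)."""
    return [kymogram_row(frame) for frame in frames]
```

A movie of T frames, each H x W pixels, therefore becomes a T x W image.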


Extraction of spatiotemporal calcium signaling metrics from kymograms

The sp-ut quiet period was visually determined and indicated as a line drawn from the end of the first calcium transient to the start of the next rise in calcium level. The sp-ut quiet period was then calculated as the vertical distance between the endpoints of the line, measured in seconds.

Before calculation of the bag intensity the kymogram was normalized by dividing the 32-bit kymogram image by the GCaMP or R-GECO time series baseline value, calculated above as the average of the 30 frames prior to the start of oocyte entry. A left bound at the distal neck and a right bound at the sp-ut valve were then identified. The bag intensity was calculated within a rectangle computationally selected to represent the lowest average calcium signal in a 25 μm region of the spermathecal bag during the dwell time. For trapped SPV-1::mApple kymograms, the average dwell time of the WT;mApple controls was used.
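One plausible reading of the rectangle selection is a sliding window search: a window of fixed width (25 μm converted to kymogram columns) spanning the dwell-time rows is moved between the left and right bounds, and the position with the lowest mean is kept. The sketch below implements that reading only; the exact selection rule, the function name, and the conversion from micrometers to columns are assumptions not given in the text.

```python
def lowest_mean_window(kymo, row_start, row_stop, left, right, width):
    """Find the fixed-width column window with the lowest mean intensity.

    kymo is a normalized kymogram (list of rows, one per frame);
    rows [row_start, row_stop) cover the dwell time, columns [left, right)
    lie between the distal neck and sp-ut valve bounds, and width is the
    window size in columns. Returns (first column of window, window mean).
    """
    rows = kymo[row_start:row_stop]
    if not rows or right - left < width:
        raise ValueError("window does not fit inside the bounds")
    best_col, best_mean = None, None
    for col in range(left, right - width + 1):
        total = sum(sum(row[col:col + width]) for row in rows)
        window_mean = total / (len(rows) * width)
        if best_mean is None or window_mean < best_mean:
            best_col, best_mean = col, window_mean
    return best_col, best_mean
```

Ties are resolved in favor of the window closest to the distal neck, a choice made here only for determinism.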

In vitro RhoGAP activity assay

The RhoGAP activity assay was conducted as previously described [22].

Calculation of Manders’ co-localization coefficient

Manders’ co-localization coefficients [83] were calculated using the Costes algorithm [84], implemented in the built-in Fiji function ‘Colocalization Threshold’. Custom Fiji code was written to allow the ‘Colocalization Threshold’ function to analyze movies frame by frame. Background subtraction was applied to the GFP::CDC-42 SPV-1::mApple movies prior to running the ‘Colocalization Threshold’ function, using the built-in Fiji function ‘Subtract Background’. Values for Manders’ coefficients were found to vary with the degree of background subtraction. To determine the proper level of background subtraction in an unbiased manner, custom Fiji code was written to vary the ‘rolling ball radius’ from 10 to 500 pixels, at 10 pixel increments. Manders’ coefficients and the ‘R below threshold’ value from the Costes algorithm were calculated and collected for each frame of the movie at each background subtraction level. The ‘R below threshold’ values for each frame of the movie were summed, resulting in a single ‘R below threshold’ value for the entire movie for each background subtraction level. Given that the Costes algorithm ideally returns an ‘R below threshold’ value of 0, the background subtraction level resulting in the movie ‘R below threshold’ value closest to 0 was used as the proper level of background subtraction. The thresholded Manders’ coefficient, giving the amount of mApple intensity in pixels above the mApple intensity threshold that are also above the GFP intensity threshold, divided by the total mApple intensity in pixels above the mApple intensity threshold, is reported here.
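The two calculations described above can be sketched as follows. This is not the Fiji ‘Colocalization Threshold’ implementation: the thresholds are taken as given inputs rather than computed by the Costes algorithm, images are flattened to lists of pixel values, and the function names are hypothetical.

```python
def thresholded_manders(mapple, gfp, mapple_threshold, gfp_threshold):
    """Thresholded Manders' coefficient for the mApple channel.

    Numerator: mApple intensity in pixels above both thresholds.
    Denominator: mApple intensity in pixels above the mApple threshold.
    mapple and gfp are equal-length lists of pixel intensities.
    """
    total = 0.0
    colocalized = 0.0
    for m, g in zip(mapple, gfp):
        if m > mapple_threshold:
            total += m
            if g > gfp_threshold:
                colocalized += m
    return colocalized / total if total else 0.0


def pick_rolling_ball_radius(r_below_by_radius):
    """Choose the radius whose summed 'R below threshold' is closest to 0.

    r_below_by_radius maps each rolling ball radius to the list of
    per-frame 'R below threshold' values for the movie.
    """
    summed = {radius: sum(values) for radius, values in r_below_by_radius.items()}
    return min(summed, key=lambda radius: abs(summed[radius]))
```

In the protocol above the candidate radii would run from 10 to 500 pixels in steps of 10, and the coefficient reported for a movie is the one obtained at the selected radius.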

To facilitate comparison, the time series of Manders’ coefficients over the embryo transit movies were segmented into entry time, dwell time, and exit time through the identification of time points as noted above. The segments for each movie were then normalized to the average entry time, dwell time, or exit time for all GFP::CDC-42 SPV-1::mApple movies, using the Matlab function ‘interp1’ to resample each segment to the same length.
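The resampling step can be illustrated with a small linear interpolation routine. It assumes the default (linear) method of ‘interp1’ evaluated at evenly spaced query points spanning the segment, which the text does not specify; the function name is hypothetical.

```python
def resample_linear(values, n_out):
    """Resample a segment to n_out evenly spaced points by linear
    interpolation, keeping the first and last samples fixed."""
    n_in = len(values)
    if n_in < 2 or n_out < 2:
        raise ValueError("need at least two input and two output points")
    out = []
    for k in range(n_out):
        position = k * (n_in - 1) / (n_out - 1)  # fractional input index
        i = min(int(position), n_in - 2)
        frac = position - i
        out.append(values[i] * (1 - frac) + values[i + 1] * frac)
    return out
```

Resampling every movie's entry, dwell, and exit segments to common lengths lets the coefficient traces be averaged point by point across movies with different transit durations.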

Statistical analysis

Statistical tests were conducted at a significance level of 0.05. When 2 groups were compared, p-values were calculated using Welch’s t-test in GraphPad Prism. When more than 2 groups were compared, p-values were calculated using Welch’s ANOVA with Games-Howell multiple comparisons, in R [108], using the ‘oneway’ function in the package ‘userfriendlyscience’ [109].
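For readers unfamiliar with Welch's t-test, the sketch below computes its test statistic and the Welch-Satterthwaite degrees of freedom, which together do not assume equal variances between groups. It is an illustration only: the analyses in this work were run in GraphPad Prism and R, the function name is hypothetical, and the p-value (which requires the t distribution) is not computed here.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(group_a) / na
    se2_b = variance(group_b) / nb
    t = (mean(group_a) - mean(group_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df
```

The degrees of freedom are generally non-integer and fall below na + nb - 2 when the group variances or sizes differ.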

4.8 – Supplemental Methods

Supplemental Imaging

Data in Supplemental Figure 5.3 were acquired on a Nikon Eclipse 80i microscope equipped with a SPOT RT39M5 sCMOS camera (Diagnostic Instruments; Sterling Heights, MI, USA) and a 0.63x wide field adapter. Images were recorded using the full camera chip, at 2448x2048 pixels, with a neutral density of 32 and an exposure time of 60 ms. During image processing these movies were cropped to a frame size of 1078x539 pixels. All other image acquisition and processing information for these movies is as described in the main text.

F-actin staining

F-actin staining was adapted from [25]. Animals were treated with heat shock according to the protocols described above, with wildtype and spv-1 mutant controls both exposed to the longer (2 hour) heat shock. After the recovery period, 150-200 animals were transferred to 100 µL of phosphate buffered saline (PBS) in a watch glass, and dissected using a 23-gauge hypodermic needle. This PBS containing the dissected animals was transferred to a 15 mL glass conical using a Pasteur pipette. Another 100 µL of PBS was used to rinse the watch glass and the Pasteur pipette and transferred to the glass conical. Immediately after this 200 µL of 3.7% formaldehyde in PBS was added to the glass conical, and this conical was incubated at room temperature for 25 minutes.

After fixation, dissected animals and gonads were washed twice in PBS, and then permeabilized for 15 minutes in PBST (PBS + 0.1% Triton X-100). Following permeabilization, dissected animals and gonads were washed once more with PBS, and then transferred to 1.5 mL centrifuge tubes, where they were incubated in 400 µL of 0.4 U/mL Texas Red-X-phalloidin in PBS (Invitrogen, Carlsbad, CA), overnight at 4°C, in the dark, with constant mixing. After this incubation period, dissected animals and gonads were kept in the dark as much as possible. They were washed twice more with PBS, and mounted on pads of 2% agarose in water, on microscope slides, with the coverslip sealed with nail polish, for imaging.

F-actin imaging

Fixed and stained spermathecae were imaged on a Zeiss LSM710 confocal microscope (Carl Zeiss, Jena, Germany) controlled by ZEN software (Carl Zeiss, Jena, Germany). Imaging was performed using a Plan-Apochromat 63x/1.40 NA oil immersion DIC M27 objective, with excitation from a 594-nm HeNe laser attenuated to 2% laser power, emissions collected from 600-690 nm with a detector gain of 675, pixel dwell time of 1.24 µs and averaging of 4. Images were acquired at 8-bit, with a frame size of 1304x750 pixels and a pixel size of 0.10 µm/pixel, and saved as .lsm files. For each spermatheca a z-stack capturing the full spermatheca was acquired, with 0.46 µm between each z-slice.

F-actin image analysis

Fiji was used for analysis of F-actin images. Maximum intensity projections of the entire z-stack for each spermatheca were generated and examined to select 4 representative spermathecae for each condition that looked roughly matched. One of these images for each condition was rotated and spatially cropped to a standard orientation and size to form Supplemental Figure 5.4A. For the selected spermathecae, the full z-stack was examined, and the single z-plane with the clearest fibers was selected for each spermatheca. These single z-plane images were then analyzed using the Fiji plugin FibrilTool [110], by outlining 3 cells in each spermatheca. Output from FibrilTool was collected in an Excel spreadsheet and transferred to GraphPad Prism for plotting and statistical analysis.

Table 4.1. List of C. elegans strains used in this study.

| Strain | Genotype |
|---|---|
| N2 | Wildtype Bristol |
| RZB300 | cdc-42(gk388) spv-1(ok1498) opIs295 II; xbEx1406[spv-1p::spv-1::mApple+pRF4(rol-6(su1006))] |
| UN1108 | xbIs1101[fln-1p::GCaMP3+pRF4(rol-6(su1006))] II |
| UN1416 | spv-1(ok1498) II; xbIs1408[fln-1p::GCaMP3] |
| UN1417 | xbIs1408[fln-1p::GCaMP3] |
| UN1516 | xbIs1408[fln-1p::GCaMP3]; nzIs1[hsp-16.2p::rho-1(G14V)] |
| UN1649 | spv-1(ok1498) II; xbIs1408[fln-1p::GCaMP3]; xbEx1649[spv-1p::spv-1(R635K)::mApple] |
| UN1661 | spv-1(ok1498) II; xbIs1408[fln-1p::GCaMP3]; xbEx1661[spv-1p::spv-1::mApple] |
| UN1662 | spv-1(ok1498) II; xbIs1408[fln-1p::GCaMP3]; xbEx1662[spv-1p::spv-1::mApple] |
| UN1663 | xbIs1408[fln-1p::GCaMP3]; xbEx1661[spv-1p::spv-1::mApple] |
| UN1664 | xbIs1408[fln-1p::GCaMP3]; xbEx1662[spv-1p::spv-1::mApple] |
| UN1665 | spv-1(ok1498) II; xbIs1408[fln-1p::GCaMP3]; xbEx1665[spv-1p::spv-1::mApple] |
| UN1728 | xbIs1101[fln-1p::GCaMP3+pRF4(rol-6(su1006))] II; xbEx1720[hsp-16.2p::cdc-42(Q61L)+myo-3p::mCherry] |
| UN1771 | xbIs1408[fln-1p::GCaMP3]; xbEx1771[fkh-6p::mApple] |
| UN1772 | spv-1(ok1498) II; xbIs1408[fln-1p::GCaMP3]; xbEx1771[fkh-6p::mApple] |
| UN18124 | cdc-42(gk388) opIs295 II; xbEx18124[fln-1p::R-GECO+pRF4(rol-6(su1006))] |
| UN18127 | cdc-42(gk388) spv-1(ok1498) opIs295 II; xbEx18124[fln-1p::R-GECO+pRF4(rol-6(su1006))] |

4.9 – Acknowledgement of contributions

This work was a collaborative effort, and I must acknowledge the experimentalists who contributed. Coleman Clifford optimized and performed the RHO-1 heat shock experiments, and acquired some of the wildtype and spv-1 mutant movies. Alyssa Cecchetelli generated the CDC-42 heat shock animals, and optimized and performed the CDC-42 heat shock experiments. Kriti Sethi generated the GFP::CDC-42, SPV-1::mApple animals, and acquired movies of them in the Zaidel-Bar lab in Singapore. All other data were acquired by me, and I performed all the data processing and analysis.

Chapter 5: Segmentation, polar coordinate representation, and refined spatial analysis of widefield calcium sensor movies

5.1 – Abstract

The spermatheca undergoes substantial shape changes and spatial movements during embryo transits. These deformations and displacements are recorded in the calcium sensor movies, but they are difficult to rigorously measure and analyze. This chapter describes a prototype image processing and analysis pipeline to explore this spatial information. A Fiji / ImageJ program identifies fluorescence boundaries in calcium sensor movies, and automatically segments the spermathecae. The Fiji / ImageJ program also automatically calculates an unbiased tissue centroid for each movie frame, and recasts the fluorescence movies and segmentation boundaries into polar coordinates, a representation that enables new processing and analysis of embryo transit movies. A Matlab program automatically categorizes and annotates the segmentation boundaries, and a Matlab live script enables interactive analysis and exploration of the data. A deformation map displaying the spatial changes and a calcium sensor map displaying the localized calcium signaling over time provide a proof of concept for this pipeline. With further refinement, this unbiased and automatic pipeline will quantify relationships between geometry changes and calcium signaling, enabling quantitative analysis of tissue shape changes and mechanochemical signaling in the spermatheca.

5.2 – Introduction

Fluorescence microscopy image datasets can be very information dense, and the widefield calcium sensor movies analyzed in the previous chapters contain a wealth of spatial information that remains untapped. In this chapter I describe a prototype image processing and analysis pipeline of Fiji / ImageJ and Matlab programs that aims to extract some of this spatial information, to further advance our understanding of spermathecal tissue function.

For this pipeline, I developed a Fiji / ImageJ program for automated image processing and analysis, and Matlab programs for interactive analysis and exploration.

The Fiji / ImageJ program roughly segments the calcium sensor fluorescence data, outlining the spermatheca, and these segmentation boundaries are used to calculate spermathecal centroids and recast the calcium sensor data and segmentation boundaries into polar coordinates. A Matlab program automatically classifies and annotates the segmentation boundaries, and a Matlab live script integrates the analysis output and generates deformation and calcium sensor maps and data visualization in an interactive manner. The scientific goal of this pipeline is to glean some information about mechanotransduction from the calcium sensor movies. Deformation maps aid our understanding of geometry changes in the movies, representing stretch and contraction and mechanical changes in the spermatheca. Calcium sensor maps aid our understanding of localized calcium signaling in the movies, representing biological responses in the spermathecal cells. An unbiased, automated image processing and analysis pipeline could provide a new measurable mechanotransduction phenotype to help us understand how biophysics and biochemistry interact to control tissue function in the spermatheca.

A prominent feature of this spatial analysis is conversion to, and visualization in, polar coordinates. The movie frames are x,y pixel matrices, leading us to think about embryo transit displacements as position changes in a rectangular grid. However, each x,y pixel location in the movie frame can also be represented as a distance, r, from a center, and an angle, theta, around a circle centered at r = 0. This r,theta representation is called a polar coordinate system, which arose to facilitate mathematical thinking about systems that travel around, or vary with distance from, a central point. The principal displacements of the spermatheca during embryo entry are to extend away from the tissue center, increasing the lumen radius and the distance between the distal and sp-ut valves to accommodate an entering oocyte. The principal displacements of the spermatheca during embryo exit are to contract back toward the tissue center, decreasing the lumen radius and the distance between the distal and sp-ut valves. Given that polar coordinate systems are designed to facilitate mathematical thinking about these kinds of situations, it seems likely that analyzing the embryo transit movies in polar coordinates centered at the tissue centroid could be fruitful. The latter parts of this chapter demonstrate benefits of utilizing multiple coordinate systems.

5.3 – Fiji / ImageJ segmentation of calcium sensor data and conversion to polar coordinates

I first developed a Fiji / ImageJ program for segmenting calcium sensor data in widefield movies, which requires no user input beyond the selection of the movie and the time points to process for a given run. There are a few adjustable parameters; these are mentioned in the relevant sections below, with their default values. Visual inspection of the output suggests the segmentation is accurate for many frames. From this automated and unbiased segmentation, a centroid for the tissue is calculated for each frame, and this centroid is used as the origin for a polar coordinate system that advances spatial analysis of the tissue.

5.3.1 – Fiji_GradientSegmentationOfSpermatheca_Calcium.ijm

The ImageJ macro language code for this program is available on the lab GitHub.

The program takes as input a processed calcium sensor movie. Final outputs of the program include: a text file containing the centroids for each frame, a movie of the segmentation boundaries in rectangular coordinates for comparison to the calcium sensor movie, and movies of the segmentation boundaries and calcium sensor fluorescence in polar coordinates. The following sections describe in detail how the program determines the segmentation boundaries and produces movies in polar coordinates.

5.3.1.1 – Edge image

The first step of segmentation processing uses the Edges function from the FeatureJ package in Fiji [111] to generate an edge image from the fluorescence image (Figure 5.1). This function is called directly from the Fiji segmentation program, so the ImageScience update site must be included in the Fiji update sites prior to running the segmentation program. The Edges function is passed a smoothing scale parameter of 10, and the ‘Suppress non-maximum gradients’ checkbox is unchecked, with no threshold values set.

5.3.1.2 – Extrema of line profiles

Line profiles over edge images show pronounced peaks and valleys (Figure 5.2).

An adjustable parameter at this step is the line profile width, here set to 20 pixels to provide some spatial averaging. When the spermatheca is occupied with an embryo we expect to find four peaks as we traverse a line over the whole image: one at the transition from the outside to the spermathecal cell, one at the transition from the spermathecal cell to the spermathecal lumen, and another pair as we move from the spermathecal lumen back to the outside. Figure 5.2 indicates that the sp-ut valve may generate an additional pair of peaks.

Peaks and valleys from the line profile are extracted using the ImageJ macro language functions Array.findMaxima and Array.findMinima, respectively [112]. Since we expect to find at most 6 peaks over a line, the 6 highest maxima are extracted. The 6 lowest minima are also extracted, to provide further information for later processing.

There is an adjustable parameter for the extrema functions called tolerance, which is the minimum amplitude difference needed to identify a peak or valley. Here this tolerance parameter is set to 0.005, and if fewer than 6 extrema are found over the line profile then the tolerance is reduced by 0.001 and the extrema are identified again. If the tolerance parameter reaches 0 and there are still fewer than 6 extrema, the extrema that are identified are used, and the results are padded with -1 to give a consistent 6 results.
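
The tolerance fallback described above can be illustrated with a minimal Python sketch. It is not the macro code from the program: find_maxima is a simplified, hypothetical stand-in for Array.findMaxima that keeps a local maximum only if it rises more than the tolerance above the surrounding valleys, and top_extrema is a hypothetical helper name. Only the starting tolerance of 0.005, the step of 0.001, the count of 6, and the padding with -1 come from the text.

```python
def find_maxima(profile, tolerance):
    """Return indices of local maxima, highest first.

    Simplified stand-in for a tolerance-based peak finder (an assumption,
    not ImageJ's implementation): a local maximum is kept only if it rises
    more than `tolerance` above the higher of the two valleys separating
    it from the nearest taller points.
    """
    n = len(profile)
    peaks = []
    for i in range(1, n - 1):
        if not (profile[i] > profile[i - 1] and profile[i] >= profile[i + 1]):
            continue
        left_min = right_min = profile[i]
        j = i - 1
        while j >= 0 and profile[j] <= profile[i]:
            left_min = min(left_min, profile[j])
            j -= 1
        j = i + 1
        while j < n and profile[j] <= profile[i]:
            right_min = min(right_min, profile[j])
            j += 1
        if profile[i] - max(left_min, right_min) > tolerance:
            peaks.append(i)
    return sorted(peaks, key=lambda i: profile[i], reverse=True)


def top_extrema(profile, n_wanted=6, start_steps=5, step=0.001):
    """Lower the tolerance from 0.005 to 0 in steps of 0.001 until enough
    maxima are found, then pad with -1 to a fixed length."""
    found = []
    for k in range(start_steps, -1, -1):  # integer steps avoid float drift
        found = find_maxima(profile, k * step)
        if len(found) >= n_wanted:
            break
    found = found[:n_wanted]
    return found + [-1] * (n_wanted - len(found))


profile = [0, 0.5, 0, 1.0, 0.2, 0.8, 0, 0.003, 0]
print(top_extrema(profile))
```

In this toy profile the small bump at index 7 is ignored at the starting tolerance and is only picked up once the tolerance has been lowered; the remaining two slots are filled with -1. Minima could be handled the same way by negating the profile.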

Line profile extrema are identified for the entire image, and visualized by assigning pixel values to the extrema and plotting them on a new image. The maxima are assigned values of 255, 253, 212, 210, 170, and 168. The minima are assigned values of 128, 126, 85, 83, 42, and 40. Differently oriented line profiles identify segmentation boundaries in different regions, so three easily automated line profile sweeps are used: a horizontal line is swept vertically over the image, a vertical line is swept horizontally over the image, and a radial line from the image center is swept in a circle over the image (Figure 5.3).

Figure 5.1: Still frames from a calcium sensor movie with corresponding edge images. (Top row) R-GECO fluorescence intensity. Brightness was adjusted for presentation, all frames follow the color map in the first frame. (Bottom row) Edge images generated from the images in the top row, using the FeatureJ Edges function with a smoothing scale of 10. Scale bars = 10 μm.

Figure 5.2: Line profile over edge image. (Top) An edge image annotated with a line profile. Yellow indicates 20 pixel width, scale bar is 10 μm. (Bottom) The quantified line profile from the top panel, displaying the pixel intensity of the edge image along the line.

Figure 5.3: Line profile sweeps and extrema images. (Top row left) Edge image annotated with a 20 pixel wide horizontal line. (Top row right) Image displaying the line profile extrema from a horizontal line swept vertically. The 6 highest maxima hold values from 168 to 255, i.e. bright oranges, yellows, and whites. The 6 lowest minima hold values from 40 to 128, i.e. blues, magentas, and dark oranges. (Middle row) Line profile sweep and extrema image for a vertical line swept horizontally. (Bottom row) Line profile sweep and extrema image for a radial line centered at the image center, swept in a circle. All line sweeps used a line 20 pixels wide, all extrema images follow the color map in the top right panel.

5.3.1.3 – Segmentation boundaries

Each of the line profile sweeps provides different segmentation boundary information, so the same boundary found by two sweeps can be considered a high confidence boundary. We can visualize these high confidence boundaries by merging the three line profile extrema images as different color channels in an RGB image, so that compound colors such as yellows, magentas, and whites display high confidence segmentation boundaries (Figure 5.4). The program finds these high confidence boundaries using the Fiji / ImageJ Image Calculator function, performing AND operations on each pair of line profile extrema images to get the overlaps, and then ADD operations on the three resulting overlap images to produce a single rough boundary image.
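
The pairwise AND followed by ADD can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the program's macro code: images are represented as nested lists of 8-bit values, AND is taken to be a bitwise operation on pixel values, ADD is taken to saturate at 255, and all function names are hypothetical.

```python
def bitwise_and(a, b):
    """Pixel-wise bitwise AND of two equally sized 8-bit images."""
    return [[pa & pb for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def saturating_add(images, max_value=255):
    """Pixel-wise sum of several images, clipped at max_value (assumed 8-bit)."""
    height, width = len(images[0]), len(images[0][0])
    return [[min(max_value, sum(img[y][x] for img in images))
             for x in range(width)]
            for y in range(height)]


def rough_boundaries(horizontal, vertical, radial):
    """Overlap each pair of extrema images, then add the three overlaps."""
    overlaps = [
        bitwise_and(horizontal, vertical),
        bitwise_and(horizontal, radial),
        bitwise_and(vertical, radial),
    ]
    return saturating_add(overlaps)


horizontal = [[255, 0], [0, 170]]
vertical = [[255, 0], [128, 0]]
radial = [[255, 0], [128, 170]]
print(rough_boundaries(horizontal, vertical, radial))
```

A pixel marked by only one sweep contributes nothing to the result, while a pixel marked by at least two sweeps survives, which matches the idea of a high confidence boundary.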

These rough segmentation boundaries are refined (Figure 5.5), first by thresholding the image to remove any pixels with an intensity below 200. The resulting thresholded pixels are then binarized, creating an image that can be analyzed using the Fiji / ImageJ Analyze Particles function to produce ROIs for the parts of the boundaries. The Analyze Particles function has gating parameters that are set to not exclude any ROIs, i.e. Size (pixel^2) = 0-Infinity, Circularity = 0.00-1.00, ‘Exclude on edges’ is checked, and ‘Include holes’ is not checked. The resulting ROIs are added to the ROI Manager. The edge image is recalled, and the mean intensity of each ROI in the edge image is calculated. By excluding ROIs that measure low intensity in the edge image, in this case < 10, refined segmentation boundaries are obtained (Figures 5.5 and 5.6).
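
The final refinement step, discarding ROIs with a low mean intensity in the edge image, can be sketched as follows. This is a hypothetical Python illustration rather than the program's macro code: each ROI is represented simply as a list of (x, y) pixel coordinates, and the function names are assumptions. The cutoff of 10 comes from the text.

```python
def roi_mean(roi, edge_image):
    """Mean edge-image intensity over the pixels of one ROI."""
    return sum(edge_image[y][x] for x, y in roi) / len(roi)


def keep_strong_rois(rois, edge_image, min_mean=10):
    """Keep only ROIs whose mean edge intensity is at least min_mean."""
    return [roi for roi in rois if roi_mean(roi, edge_image) >= min_mean]


edge_image = [[0, 50],
              [5, 5]]
rois = [[(1, 0)], [(0, 1), (1, 1)]]
print(keep_strong_rois(rois, edge_image))
```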

Figure 5.4: Combining line profile extrema images. (Left column) Images from different line profile sweeps are assigned to different color channels to combine into an RGB image. From the bottom, the radial line swept in a circle is assigned to the red channel, the vertical line swept horizontally is assigned to the green channel, and the horizontal line swept vertically is assigned to the blue channel. (Right column) The resulting RGB image from merging the line profile extrema images. Brightness indicates extrema rank, i.e. bright colors correspond to the whites and yellows in the left column, while dim colors correspond to the magentas. Overlaps from the line profile sweeps are visible in the RGB image as yellows, magentas, and whites.

Figure 5.5: Refining rough segmentation boundaries. (Left) Rough segmentation boundaries resulting from adding the overlaps of the line profile extrema images. (Middle) The edge image annotated with ROIs identified by the Analyze Particles function after thresholding the left image to discard pixels below 200. (Right) The refined segmentation boundaries produced by excluding ROIs from the middle image with a mean intensity less than 10.

Figure 5.6: Overview of Fiji / ImageJ segmentation program. Still frames from a calcium sensor movie with corresponding edge images, combined line profile extrema images, and final refined segmentation boundaries. (Top row) R-GECO fluorescence intensity. Brightness was adjusted for presentation, all frames follow the color map in the first frame. (Second row) Edge images generated from the images in the top row. (Third row) Combined line profile extrema images, with horizontal line swept vertically in blue, vertical line swept horizontally in green, and radial line swept through a circle in red. (Bottom row) Refined segmentation boundary images, a final output of the Fiji / ImageJ segmentation program. Scale bars = 10 μm.

5.3.1.4 – Calculation of centroid

From the refined boundaries image, the Analyze Particles function is again used to identify ROIs for each part of the boundaries. The Analyze Particles function can calculate and return many properties for each particle it finds; here the x and y coordinates of the particle centroid and the area of the particle are collected for each boundary particle. Using these, the area-weighted centroid of all the boundary particles in the image is calculated by multiplying the x and y components of each particle centroid by the area of that particle, summing the resulting area-weighted x and y components for all particles, and dividing these resulting values by the total area of all particles to obtain the final x and y coordinates of the segmentation boundary centroid.

The segmentation boundary centroid generated in this way is stable and stays within the spermatheca throughout the embryo transit movie (Figure 5.7).
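
The area-weighted centroid calculation can be written compactly. The sketch below is in Python for illustration only, not the program's macro code; the function name and the representation of each particle as an (x, y, area) tuple are assumptions.

```python
def area_weighted_centroid(particles):
    """Centroid of a set of particles, each given as (x, y, area).

    Each particle centroid is weighted by its area, and the weighted sums
    are divided by the total area.
    """
    total_area = sum(area for _, _, area in particles)
    cx = sum(x * area for x, _, area in particles) / total_area
    cy = sum(y * area for _, y, area in particles) / total_area
    return cx, cy


# Two particles: a small one at x = 0 and a larger one at x = 10.
print(area_weighted_centroid([(0, 0, 1), (10, 0, 3)]))
```

In this toy case the centroid falls three quarters of the way toward the larger particle, showing how large boundary fragments dominate the result.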

Figure 5.7: The centroid of the segmentation boundaries appears accurate and stable over embryo transit movies. (Left) The first frame of an embryo transit movie, annotated with the segmentation boundary centroid as a white star. (Middle) A frame in the middle of the movie, annotated with the centroids over the entire movie. Centroid colors indicate time, going from dark blue at the first frame through light blue to white for the current frame, then light red to dark red at the last frame. (Right) The last frame of the movie, annotated with the centroid as a white star.

5.3.1.5 – Calcium sensor movies in polar coordinates

With a centroid automatically established for each frame, the calcium sensor movie and segmentation boundaries can now be recast in polar coordinates with the centroid as the pole. First, a line length guaranteed to capture the entire frame is defined, in this case 800 pixels. Then, with the start point at the centroid, the cardinal end points of the line are calculated, and the image padded with zeros to ensure the end point of the line never goes outside the image. Trigonometry is then used to calculate the end points of the line as it is swept through 365°. Following image coordinate conventions where y = 0 at the top of the image and increases going down, the polar axis of the coordinate system points right and the angle increases going clockwise. At each angle, a 1 pixel wide line is drawn from the centroid to the calculated end point, and the line profile is obtained. The values of this line profile are copied into a table to hold all 365 line profiles. The final 800x365 matrix is saved as a table, reopened as a text image, and then saved as an 8-bit tiff image for further processing and analysis. These steps are carried out on both the fluorescence and segmentation boundary images, resulting in calcium sensor movies with segmentation boundaries in polar coordinates (Figures 5.8 and 5.9).
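
The radial sweep can be illustrated with a short Python sketch. It is a simplified stand-in, not the program's macro code: the function and parameter names are hypothetical, pixels are sampled by nearest-neighbor lookup (an assumption, the program reads line profiles in Fiji), and positions outside the frame return zero, which plays the role of the zero padding. Passing max_radius=800 and n_profiles=365 with a 1° step would correspond to the values given in the text.

```python
import math


def to_polar(image, cx, cy, max_radius, n_profiles, step_deg=1.0):
    """Sample an image along rays from (cx, cy).

    Returns one list per angle, each holding max_radius samples.
    With y increasing downward, increasing angle moves clockwise on screen
    and angle 0 points to the right.
    """
    height, width = len(image), len(image[0])
    polar = []
    for a in range(n_profiles):
        theta = math.radians(a * step_deg)
        profile = []
        for r in range(max_radius):
            x = int(round(cx + r * math.cos(theta)))
            y = int(round(cy + r * math.sin(theta)))
            inside = 0 <= x < width and 0 <= y < height
            profile.append(image[y][x] if inside else 0)  # zero outside frame
        polar.append(profile)
    return polar


# 5 x 5 test image: 9 at the center, with 1, 2, 3, 4 to the
# right, below, left, and above the center.
image = [[0] * 5 for _ in range(5)]
image[2][2] = 9
image[2][3], image[3][2], image[2][1], image[1][2] = 1, 2, 3, 4
for profile in to_polar(image, 2, 2, max_radius=4, n_profiles=4, step_deg=90):
    print(profile)
```

The four profiles visit the neighbors in the order right, down, left, up, which is the clockwise direction described above.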

Figure 5.8: Calcium sensor image converted into polar coordinates. (Top) A frame from a calcium sensor movie showing the refined segmentation boundaries. The centroid of the segmentation boundaries is used as the origin for a polar coordinate system, with the polar axis pointing to the right and increasing angle in a clockwise direction. r values are given in μm, theta values in degrees, scale bar is 10 μm. (Bottom) Polar representation of the frame in the top panel. The bottom of the image is the centroid, with radius increasing vertically, and degrees are increasing from left to right. r values are given in μm, theta values in degrees, horizontal scale bar is 45 degrees, vertical scale bar is 10 μm.

Figure 5.9: Calcium sensor movies can be converted into polar coordinates. (Top) Multiple frames from a calcium sensor movie, annotated with segmentation boundaries. Scale bar is 10 μm. (Bottom) Corresponding frames displayed in polar coordinates. Horizontal scale bar is 45 degrees, vertical scale bar is 10 μm.

5.3.2 – Fiji / ImageJ future directions

There are many ways this program could be improved. First, the program is very slow, with processing of a full movie taking multiple days. The program currently displays the images at each step of the processing, so figuring out how to run the program without displaying the images, if possible, would likely increase the speed. The program sometimes crashes in the middle of long stretches processing a movie, for reasons I’ve not been able to isolate. The program provides spurious segmentation for some frames, especially ones after the embryo has exited, and incomplete segmentation boundaries for many frames. The current version of the program processes each movie frame independently, so using information from additional movie frames would likely increase the segmentation accuracy for these less accurate regions.

5.4 – Matlab analysis of segmentation data

Polar representations of calcium sensor fluorescence and segmentation boundaries are loaded into Matlab for further data exploration, processing, and analysis.

For this part of the pipeline I developed a Matlab program that automatically categorizes and annotates the segmentation boundaries, and interpolates the incomplete segmentation boundaries to fill in gaps. I also developed a Matlab live script notebook that automatically calculates deformation and calcium sensor maps and facilitates interactive exploration of the data.

5.4.1 – Matlab_spatialAnalysis_thetaR_processing.m

The Matlab code for this program is hosted on the lab GitHub. The program takes as input the segmentation boundaries images in polar coordinates, one of the outputs of the Fiji / ImageJ segmentation program. The output of this Matlab program is a cell array holding the segmentation boundaries annotated as inner boundary, outer boundary, single boundary, or unknown. The program annotates the boundaries by examining the segmentation boundaries image; for each theta column, the number of boundaries is counted. The segmentation boundaries image only has pixel intensity at the boundary segments, with zeros everywhere else, so the boundaries are counted by examining the first difference of the theta column and using the ‘islocalmax’ function to find the points of greatest change, i.e. where the zero background transitions to a high intensity boundary segment. Adjustable parameters used for the ‘islocalmax’ function are ‘FlatSelection’, which is set to ‘center’, meaning wide boundary segments are counted once at their center, and ‘MinSeparation’, the minimum number of pixels needed between two peaks to count them as separate, here set to 15 pixels. With the boundaries for the theta column thus counted, if only one boundary is found the coordinates are annotated as single boundary, if exactly two boundaries are found the boundary with the lower r value is annotated as inner boundary and the boundary with the higher r value is annotated as outer boundary, and if three or more boundaries are found they are all annotated as unknown boundary (Figure 5.10). In this way the segmentation boundaries are automatically classified, with no user input required beyond running the program and selecting the folder to process.
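
The counting and labeling rule for a single theta column can be sketched as below. This is a Python illustration under stated assumptions, not the Matlab program: instead of the first difference and ‘islocalmax’, the sketch locates runs of nonzero pixels, takes the center of each run, and ignores a center that lies closer than the minimum separation to the previous one. The function names are hypothetical, and returning an empty result for a column with no boundary is an assumption, since the text does not describe that case.

```python
def boundary_centers(column, min_separation=15):
    """Radial positions of boundary segments in one theta column.

    A segment is a run of nonzero pixels and is counted once at its
    center; a center closer than min_separation pixels to the previously
    accepted one is ignored.
    """
    centers = []
    start = None
    for r, value in enumerate(list(column) + [0]):  # trailing 0 closes a final run
        if value and start is None:
            start = r
        elif not value and start is not None:
            center = (start + r - 1) // 2
            if not centers or center - centers[-1] >= min_separation:
                centers.append(center)
            start = None
    return centers


def classify_column(column, min_separation=15):
    """Label boundaries by how many are found along the column."""
    centers = boundary_centers(column, min_separation)
    if not centers:
        return {}  # assumption: nothing to annotate
    if len(centers) == 1:
        return {"single": centers}
    if len(centers) == 2:
        return {"inner": [centers[0]], "outer": [centers[1]]}
    return {"unknown": centers}


column = [0] * 100
for r in (20, 21, 22, 60):
    column[r] = 255
print(classify_column(column))
```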

Figure 5.10: Classification and annotation of segmentation boundaries. (Top) Multiple frames from a calcium sensor movie, annotated with segmentation boundaries. Scale bar is 10 μm. (Middle) Corresponding frames displayed in polar coordinates. Horizontal scale bar is 45 degrees, vertical scale bar is 10 μm. (Bottom) Segmentation boundaries annotated with colors to indicate classification by how many boundaries are encountered at each angle. Green indicates a single boundary is found, blue indicates 3 or more boundaries are found, and if exactly 2 boundaries are found orange indicates the smaller inner boundary and magenta the larger outer boundary.

5.4.2 – Matlab_spatialAnalysis_boundaryAnalysis.mlx

The Matlab code for this live script notebook is available on the lab GitHub. The notebook accesses the calcium sensor movie, the centroid data file, and the annotated segmentation boundaries to calculate deformation maps and calcium sensor maps. The live script notebook format enables interactive examination of the data; users can select the frame to focus on and the time window to be considered. The user can also focus the analysis on particular regions of the spermatheca defined by angles in the polar coordinates.

Figure 5.11: Classified segmentation boundaries can be interpolated. (Left) Segmentation boundaries from the Fiji / ImageJ segmentation program. (Middle) Classified boundary regions from the left frame annotated on the calcium sensor fluorescence image. Green indicates inner boundary, blue indicates outer boundary. (Right) The inner (green) and outer (blue) boundaries interpolated over the full circle of the image in polar coordinates.

5.4.2.1 – Deformation maps

The first step of generating deformation maps from the movies is to get consistent boundaries that can be compared over time. The segmentation boundaries from the Fiji / ImageJ program have some gaps, and after classification the double boundaries that we are most interested in are even further reduced. A first attempt to deal with this missing information is to linearly interpolate the boundaries, using the Matlab function ‘interp1’ (Figure 5.11). Interpolation is carried out on the theta and r vectors in polar coordinates, with users able to select windows of theta to focus analysis on specific regions, and these interpolated vectors are converted to x,y image coordinates using trigonometry.
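
Gap filling by linear interpolation, followed by conversion back to image coordinates, can be sketched as follows. This is a Python illustration, not the live script itself: interp_linear is a hypothetical helper that mimics linear interpolation inside the known range and returns NaN outside it, and polar_to_xy assumes the same convention as the polar conversion above (polar axis pointing right, angle in degrees, y increasing downward).

```python
import math


def interp_linear(known_theta, known_r, query_theta):
    """Linearly interpolate r at the query angles.

    known_theta must be strictly increasing. Query angles outside the
    known range yield NaN rather than an extrapolated value.
    """
    out = []
    for q in query_theta:
        for i in range(len(known_theta) - 1):
            t0, t1 = known_theta[i], known_theta[i + 1]
            if t0 <= q <= t1:
                f = (q - t0) / (t1 - t0)
                out.append(known_r[i] + f * (known_r[i + 1] - known_r[i]))
                break
        else:
            out.append(float("nan"))
    return out


def polar_to_xy(theta_deg, r, cx, cy):
    """Convert a polar point around the centroid (cx, cy) to image x, y."""
    theta = math.radians(theta_deg)
    return cx + r * math.cos(theta), cy + r * math.sin(theta)


# Boundary known at 0 and 10 degrees; fill in 5 degrees.
r_filled = interp_linear([0, 10], [5.0, 15.0], [0, 5, 10])
print(r_filled)
print(polar_to_xy(0, r_filled[0], 100, 50))
```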

With the boundaries interpolated, they are now binned to discrete segments to generate points for tracking the boundaries through the movie. The first step to discretizing the boundaries is to define the number of segments to use; this number of segments will be used for all desired frames, and is an adjustable parameter that can be varied by the user in the live script, enabling rapid exploration of how the analysis output varies with the number of segments. The next step is determining the length of the boundary for each time point; the program automatically calculates the distance between each of the interpolated boundary points using the Pythagorean theorem, and sums all these distances to get the boundary path length. The boundary segment length for each time point is determined by dividing the boundary path length by the number of segments desired to cover it. A length coordinate normalization vector is defined from 0 to the boundary path length, with points at the integer multiples of the segment length. The x and y component vectors of the interpolated boundary are interpolated by this length coordinate normalization vector, providing the boundary coordinates at the desired segment end points. A consistent number of boundary segments over time enables the calculation of deformation vectors to track how these boundary segments are changing over time. These segment end points are sent to a cell array holding the coordinates for the deformation maps for all desired frames. The deformation vectors can be time resolved, showing the path of the boundary over time, or resultant, showing the cumulative displacement from the start to the end of the analysis time window (Figure 5.12).
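
The discretization by path length can be sketched in Python. This is an illustration of the steps described above rather than the live script's code; the function names are hypothetical, and the boundary is assumed to be given as at least two ordered points.

```python
import math


def resample_boundary(xs, ys, n_segments):
    """Return n_segments + 1 points equally spaced along the boundary path."""
    # Cumulative path length at each boundary point (Pythagorean distances).
    cum = [0.0]
    for i in range(1, len(xs)):
        cum.append(cum[-1] + math.hypot(xs[i] - xs[i - 1], ys[i] - ys[i - 1]))
    total = cum[-1]
    segment_length = total / n_segments

    points = []
    j = 0
    for k in range(n_segments + 1):
        target = min(k * segment_length, total)
        # Advance to the pair of boundary points that brackets the target.
        while j < len(cum) - 2 and cum[j + 1] < target:
            j += 1
        span = cum[j + 1] - cum[j]
        t = (target - cum[j]) / span if span > 0 else 0.0
        points.append((xs[j] + t * (xs[j + 1] - xs[j]),
                       ys[j] + t * (ys[j + 1] - ys[j])))
    return points


def deformation_vectors(points_start, points_end):
    """Displacement of each segment end point between two time points."""
    return [(bx - ax, by - ay)
            for (ax, ay), (bx, by) in zip(points_start, points_end)]


# An L-shaped boundary of total length 7, split into 7 unit segments.
print(resample_boundary([0, 3, 3], [0, 0, 4], 7))
```

Because every frame is resampled to the same number of points, the k-th point in one frame can be paired with the k-th point in another, which is what makes the displacement calculation possible.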

5.4.2.2 – Calcium sensor maps

With spatial analysis of the displacements and deformations advanced, the next step is to examine localized calcium signaling to look for a relation between deformation and cellular response. A first attempt at this is to calculate the average calcium sensor fluorescence in a 10 x 10 pixel square centered at the deformation vector end point for each time point. Representative time series showing the cumulative deformation and instantaneous calcium sensor fluorescence, and kymograms showing the same data over the entire boundary, can be calculated (Figure 5.12). These results indicate that this prototype processing and analysis pipeline is functional, and will enable more detailed measurement and analysis of the geometry changes and biological responses of the spermatheca during embryo transits.
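
The local fluorescence measurement can be sketched as a mean over a small square window. This Python illustration is not the live script's code: the function name is hypothetical, the end point is assumed to be rounded to integer pixel coordinates beforehand, the window is clipped at the image border (an assumption), and because a 10 pixel window has no single center pixel the sketch arbitrarily takes 5 pixels before and 4 pixels after the given coordinate.

```python
def mean_in_square(image, cx, cy, size=10):
    """Mean intensity in a size x size window around integer pixel (cx, cy)."""
    half = size // 2
    height, width = len(image), len(image[0])
    x0, x1 = max(0, cx - half), min(width, cx - half + size)
    y0, y1 = max(0, cy - half), min(height, cy - half + size)
    values = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(values) / len(values)


# Test image whose intensity equals the x coordinate.
image = [[x for x in range(20)] for _ in range(20)]
print(mean_in_square(image, 10, 10))
```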

5.4.3 – Matlab future directions

There are many ways to improve these Matlab programs. First, the combined analysis of the inner and outer boundaries could be expanded, as this prototype pipeline only looked at deformation of the outer boundary. The transition from single boundaries to inner and outer boundaries with oocyte entry, and the transition from inner and outer boundaries to single boundaries with embryo exit, are additional areas where results can be improved. The boundaries at the distal valve and sp-ut valve need some refinement to increase confidence in the results. As with the Fiji / ImageJ program, each frame is processed and analyzed independently, so incorporating information from other frames would likely improve analysis results.

Figure 5.12: Deformation and calcium sensor maps. (Top row) 3 still frames from a calcium sensor embryo transit movie. The left frame is the first time point analyzed and the right frame is the last time point analyzed. (Second row) The same movie frames from the top row, with annotations from analysis overlaid. The annotations in the left frame depict the computationally determined outer boundary of the spermatheca, colors indicate time going from blue to red. The middle frame depicts the time resolved boundaries binned into 52 deformation vectors at single time point resolution, following the same colors as the left frame. The right frame depicts the resultant deformation vectors from the start to the end of the analysis time window. Axis labels are pixel coordinates of the frames. (Bottom left) Deformation and average calcium sensor fluorescence time series, each normalized to their own maximum. Fluorescence was calculated as the average pixel intensity of a 10x10 pixel ROI centered at the deformation vector end point. Roman numeral annotations indicate time series is representative of regions annotated in the second row right frame and bottom right column. X-axis is time in seconds, y-axis is normalized value. (Bottom right) The top panel is a kymogram displaying all 52 deformation vectors over time, the bottom panel is a kymogram displaying the corresponding average calcium sensor fluorescence. X-axis is time in seconds, y-axis is deformation vector number.

5.5 – Conclusions and future directions

In this chapter I have described a prototype image processing and analysis pipeline of Fiji / ImageJ and Matlab programs that segment calcium sensor fluorescence from widefield movies, convert movies and segmentation to polar coordinates, and perform detailed spatial analyses, all in an automated manner. The scientific goal of this effort is to generate data about spermathecal mechanotransduction by exploring spermathecal tissue deformation and calcium signaling to determine if any relation between them can be established.

As a prototype pipeline, the work in this chapter needs further refinement and validation before biological conclusions can be drawn from the results. Had I time, my first priority would be to speed up the processing of the Fiji / ImageJ segmentation program, as a throughput of more than a week per movie is prohibitively slow for datasets of even tens of movies. There are currently over one thousand movies in the lab movie archive that could be processed by this pipeline, and more are being added on a regular basis. To speed up the segmentation program I would first look at getting Fiji to run the program with as few display and file I/O operations as possible. In this chapter I also learned how to load and manipulate movies in Matlab almost as easily as in Fiji, so I might investigate porting the program to Matlab to see if it can be made faster.

Once throughput of the segmentation program is increased, the programs will need to be modified to integrate with the standardized directory structure for automated accessing of input and saving of output files. This prototype pipeline currently uses a relative directory structure with folders the user either hardcodes or points to when prompted by the programs. I deem it best to leave this as an exercise for the person(s) who take(s) this up after me, so they can gain an understanding of the system. Once throughput is increased and output is well organized, machines can be set to work processing the movies while humans work out the next steps.

The next priority would be advancing the Matlab analysis code to consider the inner and outer boundaries simultaneously. One goal would be measuring the thickness of the inter-boundary region, by subtracting the inner from the outer boundaries. Visualizing this thickness over time as a function of theta may be useful for determining when and where the cells are contracting and being stretched. The average calcium sensor fluorescence value over the thickness could also be visualized on the same plot, likely providing further insights about mechanotransduction relationships between tissue deformation and cellular response.

Chapter 6: Conclusions and future directions

For centuries, microscopes have been a powerful tool to study biology. In recent decades, advances in microscopy technologies have led to imaging datasets of increasing size and complexity. Modern image-based biological investigations can get so large and complex they require computational tools to manage and examine the data. The nascent field of bioimage informatics extracts biological insight from these large, complex datasets. In this dissertation I presented the development of a lab bioimage informatics system for fluorescence microscopy calcium sensor movies.

At the start of my Ph.D. I did not intend to build software, but the requirements of my experimental biology studies pulled me into the computational world. I encountered a significant language barrier between the biological and computational domains, which obscured the utility of available bioimage informatics solutions and led me to develop my own system. As biological investigations come to rely on computational tools, this language barrier may start to impede progress. The ideal solution is extensive training in both domains; dedicated training programs, and double majors in Biology and Computer Science, are increasing in popularity to meet this need.

For those who missed those opportunities, I present a less-than-ideal solution in this dissertation. I hope it may serve as an example and inspiration that it is still possible to teach yourself and build your own computational tools.

I was motivated to build this lab bioimage informatics system by an experimental investigation, and I used the system to demonstrate that the RhoGAP SPV-1 is a key regulator of calcium signaling and actomyosin contractility in the C. elegans spermatheca. Experiments exploring mechanism indicate that SPV-1 regulates calcium signaling through its GAP domain, and that it acts through the GTPase RHO-1 to control actomyosin contractility but not calcium signaling. Further experiments suggest that SPV-1 acts through the GTPase CDC-42 to regulate calcium signaling. These conclusions rely on data from almost 500 fluorescence microscopy movies, acquired by 4 different people over 4 years. The lab bioimage informatics system standardized the organization, processing, and analysis of these movies, enabling us to rigorously compare movies from different researchers and to state strong biological conclusions with confidence.

Calcium sensor movies contain spatial as well as temporal information, so I developed a prototype image processing and analysis pipeline that segments the spermatheca from calcium sensor movies and makes detailed spatial measurements.

After some additional refinement, this pipeline will enable quantitative investigation of the geometry changes in the spermatheca during embryo transits. Rigorously quantified tissue geometry changes combined with rigorously quantified calcium signaling will generate new insights about the interplay between biophysics and biochemistry in contractile tubes, and will advance the C. elegans spermatheca as a powerful model system for mechanotransduction.

Calcium sensor movies are not the only fluorescence microscopy data generated in the lab. This work could be extended by bringing all the fluorescence microscopy data generated in the lab into a single bioimage informatics system. The first step in this future direction is exploration of established bioimage informatics solutions to determine if they can accomplish what my system does. Established solutions have many benefits, including the use of common programs that facilitate collaboration across labs, active user communities to ask for tips and troubleshooting, and active developer support for troubleshooting, upgrades, and extension. Despite my best efforts, it is unlikely that the time and energy needed to understand and extend my system are less than the time and energy needed to understand and use an established solution, and the advantages of an established solution likely overshadow those of my system. If my system turns out to have some advantages, the code is available on the lab GitHub.

The RhoGAP SPV-1 that I studied experimentally has an F-BAR domain which can bind to curved membranes, linking the biophysics of spermathecal cell stretch to the biochemistry of Rho GTPases and calcium signaling. The role of the F-BAR domain in calcium signaling has not yet been explored. I constructed a plasmid containing SPV-1 without an F-BAR domain, and a plasmid containing SPV-1 with the F-BAR domain replaced by a PH domain that should force it to stay at the membrane, both labeled with mApple to enable co-imaging with GCaMP. Refining the spatial analysis pipeline, analyzing the spatial aspects of SPV-1 calcium signaling regulation in existing data, and generating new experimental data exploring the role of the F-BAR domain in calcium signaling could be a ready-made project for some enterprising researcher.

The biophysics of cell stretch and biochemistry of Rho GTPases and calcium signaling participate in tightly choreographed interactions with the biochemistry of actomyosin contractility and biophysics of cellular contraction in the spermatheca during embryo transits. The complexity of this system, combined with the robust, stereotyped calcium signaling and tissue function we can observe, suggests that modeling may advance understanding of the spermatheca and comparison to other systems of contractile cells. Hundreds of calcium sensor movies and a prototype pipeline that can isolate the spermatheca are available. With refinement of the prototype pipeline, geometric changes in the movies could be quantified and used to inform mechanochemical models of the spermatheca. With so much imaging data, and established standard methods for analyzing that imaging data, it would likely be fruitful to model experimentally relevant geometries, and to generate simulated movie data that can be fed to the same analysis pipelines as the experimental movies. A tight integration between modeling and experiment could make both more efficient, driving exploration of the spermatheca as a model contractile tube in ways that generate universal insights for contractile systems.

REFERENCES

[1] K. Alim, G. Amselem, F. Peaudecerf, M. P. Brenner, and A. Pringle, “Random network peristalsis in Physarum polycephalum organizes fl uid fl ows across an individual,” Proc. Natl. Acad. Sci., vol. 110, no. 33, pp. 13306–13311, 2013.

[2] J. S. Foote, “Outline of the Tube Plan of Structure of the Animal Body,” in Transactions of the American Microscopical Society, Twenty-Sixth Annual Meeting (Sep. 1904), 1904, vol. 25, pp. 63–87.

[3] C. C. W. Hsia, A. Schmitz, M. Lambertz, S. F. Perry, and J. N. Maina, “Evolution of Air Breathing: Oxygen Homeostasis and the Transitions from Water to Land to Sky,” Compr. Physiol., vol. 3, no. 2, pp. 849–915, 2013.

[4] R. Annunziata, C. Andrikou, M. Perillo, C. Cuomo, and M. I. Arnone, “Development and evolution of gut structures : from molecules to function,” pp. 445–458, 2019.

[5] R. Monahan-Earley, A. M. Dvorak, and W. C. Aird, “Evolutionary origins of the blood vascular system and endothelium,” J. Thromb. Haemost., vol. 11, no. Suppl. 1, pp. 46–66, 2013.

[6] B. Lubarsky and M. A. Krasnow, “Tube Morphogenesis : Making and Shaping Biological Tubes,” Cell, vol. 112, pp. 19–28, 2003.

[7] M. S. Simoes-Costa et al., “The evolutionary origin of cardiac chambers,” Dev. Biol., vol. 277, pp. 1–15, 2005.

[8] F. V Brozovich, C. J. Nicholson, C. V Degen, Y. Z. Gao, M. Aggarwal, and K. G. Morgan, “Mechanisms of Vascular Smooth Muscle Contraction and the Basis for Pharmacologic Treatment of Smooth Muscle Disorders,” Pharmacol. Rev. Pharmacol Rev, vol. 68, pp. 476–532, 2016.

[9] G. Pelaia et al., “Molecular mechanisms underlying airway smooth muscle contraction and proliferation: implications for asthma.,” Respir. Med., vol. 102, no. 8, pp. 1173–81, Aug. 2008.

[10] G. Bou-Gharios, M. Ponticos, V. Rajkumar, and D. Abraham, “Extra-cellular matrix in vascular networks,” Cell Prolif., vol. 37, pp. 207–220, 2004.

[11] Y. Zhou et al., “Extracellular matrix in lung development, homeostasis and disease,” Matrix Biol., vol. 73, pp. 77–104, 2018.

[12] B. Jensen, T. Wang, V. M. Christoffels, and A. F. M. Moorman, “Evolution and development of the building plan of the vertebrate heart,” Biochim. Biophys. Acta, vol. 1833, pp. 783–794, 2013.

119

[13] P. W. Gunning, U. Ghoshdastider, S. Whitaker, D. Popp, and R. C. Robinson, “The evolution of compositionally and functionally distinct actin filaments,” J. Cell Sci., vol. 128, pp. 2009–2019, 2015.

[14] M. J. T. V Cope, J. Whisstock, I. Rayment, and J. Kendrick-jones, “Conservation within the myosin motor domain : implications for structure and function,” Structure, vol. 4, no. 8, pp. 969–987, 1996.

[15] E. Lundquist, “What is C. elegans and why do scientists use it to study human development and disease?” [Online]. Available: https://www.people.ku.edu/~erikl/Lundquist_Lab/Why_study_C._elegans.html. [Accessed: 23-Oct-2019].

[16] A. K. Corsi, B. Wightman, and M. Chalfie, “A Transparent Window into Biology: A Primer on Caenorhabditis elegans,” Genetics, vol. 200, no. 2, pp. 387–407, 2015.

[17] J. Kimble and D. Hirsh, “The postembryonic Cell Lineages of the hermaphrodite and Male Gonad in Caenorhabditis elegans,” Dev. Biol., vol. 70, pp. 396–417, 1979.

[18] J. McCarter, B. Bartlett, T. Dang, and T. Schedl, “On the control of oocyte meiotic maturation and ovulation in Caenorhabditis elegans.,” Dev. Biol., vol. 205, no. 1, pp. 111–28, Jan. 1999.

[19] J. Bouffard, A. D. Cecchetelli, C. Clifford, K. Sethi, R. Zaidel-Bar, and E. J. Cram, “The RhoGAP SPV-1 regulates calcium signaling to control the contractility of the C. elegans spermatheca during embryo transits.,” Mol. Biol. Cell, vol. 30, no. 7, pp. 907–922, 2019.

[20] I. Kovacevic and E. J. Cram, “FLN-1/filamin is required for maintenance of actin and exit of fertilized oocytes from the spermatheca in C. elegans.,” Dev. Biol., vol. 347, no. 2, pp. 247–57, Nov. 2010.

[21] C. A. Kelley, A. C. E. Wirshing, R. Zaidel-Bar, and E. J. Cram, “The myosin light- chain kinase MLCK-1 relocalizes during Caenorhabditis elegans ovulation to promote actomyosin bundle assembly and drive contraction,” Mol. Biol. Cell, vol. 29, no. 16, pp. 1975–1991, 2018.

[22] P. Y. Tan and R. Zaidel-Bar, “Transient Membrane Localization of SPV-1 Drives Cyclical Actomyosin Contractions in the C. elegans Spermatheca,” Curr. Biol., vol. 25, no. 2, pp. 141–151, 2015.

[23] I. Kovacevic, J. M. Orozco, and E. J. Cram, “Filamin and phospholipase C-ε are required for calcium signaling in the Caenorhabditis elegans spermatheca.,” PLoS Genet., vol. 9, no. 5, p. e1003510, May 2013.

[24] K.-I. Kariya, Y. K. Bui, X. Gao, P. W. Sternberg, and T. Kataoka, “Phospholipase C epsilon regulates ovulation in Caenorhabditis elegans.,” Dev. Biol., vol. 274, no. 1, pp. 201–10, Oct. 2004. 120

[25] A. C. E. Wirshing and E. J. Cram, “Myosin activity drives actomyosin bundle formation and organization in contractile cells of the Caenorhabditis elegans spermatheca,” Mol. Biol. Cell, vol. 28, no. 14, pp. 1937–1949, 2017.

[26] A. C. E. Wirshing and E. J. Cram, “Spectrin regulates cell contractility through production and maintenance of actin bundles in the Caenorhabditis elegans spermatheca,” Mol. Biol. Cell, vol. 29, no. 20, pp. 2433–2449, 2018.

[27] S. Forsen and J. Kordel, “Calcium in Biological Systems,” in Bioinorganic Chemistry, I. Bertini, H. B. Gray, S. Lippard, and J. Valentine, Eds. University Science Books, 1994, pp. 107–166.

[28] F. C. Mooren and R. K. H. Kinne, “Cellular calcium in health and disease,” Biochim. Biophys. Acta, vol. 1406, pp. 127–151, 1998.

[29] M. S. Islam, Ed., Calcium Signaling. Springer, Dordrecht, 2012.

[30] A. P. Somlyo and A. V Somlyo, “Ca2+ Sensitivity of Smooth Muscle and Nonmuscle Myosin II: Modulated by G Proteins, Kinases, and Myosin Phosphatase,” Physiol Rev, vol. 83, pp. 1325–1358, 2003.

[31] L. Tian et al., “Imaging neural activity in worms, flies and mice with improved GCaMP calcium indicators.,” Nat. Methods, vol. 6, no. 12, pp. 875–81, Dec. 2009.

[32] J. Akerboom et al., “Optimization of a GCaMP calcium indicator for neural activity imaging.,” J. Neurosci., vol. 32, no. 40, pp. 13819–40, Oct. 2012.

[33] Y. Zhao et al., “An Expanded Palette of Genetically Encoded Ca2+ Indicators,” Science (80-. )., vol. 333, no. 2011, pp. 1888–1891, 2011.

[34] K. Sethi, E. J. Cram, and R. Zaidel-Bar, “Stretch-induced actomyosin contraction in epithelial tubes: Mechanotransduction pathways for tubular homeostasis,” Semin. Cell Dev. Biol., vol. 71, pp. 146–152, Nov. 2017.

[35] W. H. De Vos, S. Munck, and J.-P. Timmermans, Eds., Focus on Bio-Image Informatics, vol. 219. Springer International Publishing, 2016.

[36] G. Myers, “Why bioimage informatics matters,” Nat. Methods, vol. 9, no. 7, pp. 659–660, 2012.

[37] J. R. Swedlow and K. W. Eliceiri, “Open source bioimage informatics for cell biology,” Trends Cell Biol., vol. 19, no. 11, pp. 656–660, 2009.

[38] H. Peng, “Bioimage informatics: A new area of engineering biology,” Bioinformatics, vol. 24, no. 17, pp. 1827–1836, 2008.

[39] K. W. Eliceiri et al., “Biological imaging software tools,” Nat. Methods, vol. 9, no. 7, pp. 697–710, 2012.

121

[40] J. R. Swedlow, I. G. Goldberg, and K. W. Eliceiri, “Bioimage Informatics for Experimental Biology,” Annu. Rev. Biophys., vol. 38, no. 1, pp. 327–346, 2009.

[41] C. Allan et al., “OMERO: Flexible, model-driven data management for experimental biology,” Nat. Methods, vol. 9, no. 3, pp. 245–253, 2012.

[42] A. E. Carpenter, L. Kamentsky, and K. W. Eliceiri, “A call for bioimaging software usability,” Nat. Methods, vol. 9, no. 7, pp. 666–670, 2012.

[43] A. Cardona and P. Tomancak, “Current challenges in open-source bioimage informatics,” Nat. Methods, vol. 9, no. 7, pp. 661–665, 2012.

[44] “Universally Unique Identifier.” [Online]. Available: https://en.wikipedia.org/wiki/Universally_unique_identifier. [Accessed: 06-Nov- 2019].

[45] M. A. Tuli, “Caenorhabditis nomenclature,” WormBook, pp. 1–14, 2018.

[46] A. E. Carpenter et al., “CellProfiler: image analysis software for identifying and quantifying cell phenotypes,” Genome Biol., vol. 7, no. 10, p. R100, 2006.

[47] C. McQuin et al., “CellProfiler 3.0: Next-generation image processing for biology,” PLoS Biol., vol. 16, no. 7, pp. 1–17, 2018.

[48] T. R. Jones et al., “CellProfiler Analyst: Data exploration and analysis software for complex image-based screens,” BMC Bioinformatics, vol. 9, pp. 1–16, 2008.

[49] D. Dao, A. N. Fraser, J. Hung, V. Ljosa, S. Singh, and A. E. Carpenter, “CellProfiler Analyst: Interactive data exploration, analysis and classification of large biological image sets,” Bioinformatics, vol. 32, no. 20, pp. 3210–3212, 2016.

[50] F. De Chaumont et al., “Icy: An open bioimage informatics platform for extended reproducible research,” Nat. Methods, vol. 9, no. 7, pp. 690–696, 2012.

[51] K. Kvilekval, D. Fedorov, B. Obara, A. Singh, and B. S. Manjunath, “Bisque: A platform for bioimage analysis and management,” Bioinformatics, vol. 26, no. 4, pp. 544–552, 2009.

[52] I. G. Goldberg et al., “The Open Microscopy Environment (OME) Data Model and XML file: open tools for informatics and quantitative analysis in biological imaging.,” Genome Biol., vol. 6, no. 5, 2005.

[53] “Open Microscopy Environment.” [Online]. Available: https://www.openmicroscopy.org/. [Accessed: 20-Sep-2019].

[54] “KNIME.” [Online]. Available: https://www.knime.com/. [Accessed: 20-Sep-2019].

[55] C. Dietz and M. R. Berthold, “KNIME for Open-Source Bioimage Analysis: A Tutorial,” in Focus on Bio-Image Informatics, W. H. De Vos, S. Munck, and J.-P. Timmermans, Eds. Springer International Publishing, 2016, pp. 179–197. 122

[56] B. Jagla, B. Wiswedel, and J. Y. Coppée, “Extending KNIME for next-generation sequencing data analysis,” Bioinformatics, vol. 27, no. 20, pp. 2907–2909, 2011.

[57] P. Lindenbaum, S. Le scouarnec, V. Portero, and R. Redon, “Knime4Bio: A set of custom nodes for the interpretation of next-generation sequencing data with KNIME,” Bioinformatics, vol. 27, no. 22, pp. 3200–3201, 2011.

[58] K. Leinweber, S. Müller, and P. G Kroth, “A semi-automated, KNIME-based workflow for biofilm assays,” BMC Microbiol., vol. 16, no. 1, pp. 1–11, 2016.

[59] A. Fillbrunn, C. Dietz, J. Pfeuffer, R. Rahn, G. A. Landrum, and M. R. Berthold, “KNIME for reproducible cross-domain analysis of life science data,” J. Biotechnol., vol. 261, no. February, pp. 149–156, 2017.

[60] J. Schindelin et al., “Fiji: An open-source platform for biological-image analysis,” Nat. Methods, vol. 9, no. 7, pp. 676–682, 2012.

[61] P. Thévenaz, U. E. Ruttimann, and M. Unser, “A pyramid approach to subpixel registration based on intensity.,” IEEE Trans. Image Process., vol. 7, no. 1, pp. 27– 41, Jan. 1998.

[62] S. N. Christo, K. R. Diener, and J. D. Hayball, “The functional contribution of calcium ion flux heterogeneity in T cells,” Immunol. Cell Biol., vol. 93, no. 8, pp. 694–704, Sep. 2015.

[63] M. Uehata et al., “Calcium sensitization of smooth muscle mediated by a Rho- associated protein kinase in hypertension.,” Nature, vol. 389, no. 6654, pp. 990–4, Oct. 1997.

[64] O. Seguchi et al., “A cardiac myosin light chain kinase regulates sarcomere assembly in the vertebrate heart,” J. Clin. Invest., vol. 117, no. 10, pp. 2812–2824, 2007.

[65] T. L. Lavoie et al., “Disrupting actin-myosin-actin connectivity in airway smooth muscle as a treatment for asthma?,” Proc. Am. Thorac. Soc., vol. 6, no. 3, pp. 295– 300, May 2009.

[66] N. Wettschureck and S. Offermanns, “Rho/Rho-kinase mediated signaling in physiology and pathophysiology,” J. Mol. Med., vol. 80, no. 10, pp. 629–638, 2002.

[67] S. J. Gunst, D. D. Tang, and A. Opazo Saez, “Cytoskeletal remodeling of the airway smooth muscle cell: a mechanism for adaptation to mechanical forces in the lung,” Respir. Physiol. Neurobiol., vol. 137, no. 2–3, pp. 151–168, Sep. 2003.

[68] P. G. Smith, C. Roy, Y. N. Zhang, and S. Chauduri, “Mechanical stress increases RhoA activation in airway smooth muscle cells,” Am. J. Respir. Cell Mol. Biol., vol. 28, no. 4, pp. 436–442, 2003.

123

[69] V. Smiesko and P. C. Johnson, “The Arterial Lumen Is Controlled by Flow- Related Shear Stress,” NIPS, vol. 8, no. February, 1993.

[70] A. Munjal and T. Lecuit, “Actomyosin networks and tissue morphogenesis,” Development, vol. 141, no. 9, pp. 1789–1793, May 2014.

[71] K. Sethi, E. J. Cram, and R. Zaidel-Bar, “Stretch-induced actomyosin contraction in epithelial tubes: Mechanotransduction pathways for tubular homeostasis,” Semin. Cell Dev. Biol., vol. 71, pp. 146–152, 2017.

[72] X. Yin, N. J. D. Gower, H. A. Baylis, and K. Strange, “Inositol 1,4,5,-Trisphosphate Signaling Regulates Rhythmic Contractile Activity of Myoepithelial Sheath Cells in Caenorhabditis elegans.,” Mol. Biol. Cell, vol. 15, no. August, pp. 3938–3949, 2004.

[73] Y. K. Bui and P. W. Sternberg, “Caenorhabditis elegans Inositol 5-Phosphatase Homolog Negatively Regulates Inositol 1,4,5-Triphosphate Signaling in Ovulation,” Mol. Biol. Cell, vol. 13, no. May, pp. 1641–1651, 2002.

[74] T. R. Clandinin, J. A. DeModena, and P. W. Sternberg, “Inositol Trisphosphate Mediates a RAS-Independent Response to LET-23 Receptor Tyrosine Kinase Activation in C. elegans,” Cell, vol. 92, pp. 523–533, 1998.

[75] A. Wissmann, J. Ingles, J. D. Mcghee, and P. E. Mains, “Caenorhabditis elegans LET- 502 is related to Rho-binding kinases and human myotonic dystrophy kinase and interacts genetically with a homolog of the regulatory subunit of smooth muscle myosin phosphatase to affect cell shape,” Genes Dev., vol. 11, pp. 409–422, 1997.

[76] A. Wissmann, J. Ingles, and P. E. Mains, “The Caenorhabditis elegans mel-11 Myosin Phosphatase Regulatory Subunit Affects Tissue Contraction in the Somatic Gonad and the Embryonic Epidermis and Genetically Interacts with the Rac Signaling Pathway,” Dev. Biol., vol. 209, pp. 111–127, 1999.

[77] A. J. Piekny, J.-L. F. Johnson, G. D. Cham, and P. E. Mains, “The Caenorhabditis elegans nonmuscle myosin genes nmy-1 and nmy-2 function as redundant components of the let-502/Rho- binding kinase and mel-11/myosin phosphatase pathway during embryonic morphogenesis,” Development, vol. 130, pp. 5695– 5704, 2003.

[78] A.V. Hill, “The possible effects of the aggregation of the molecules of hæmoglobin on its dissociation curves,” J. Physiol., vol. 40, no. Suppl, pp. iv–vii, 1910.

[79] J. Monod, J. Wyman, and J. Changeux, “On the Nature of Allosteric Transitions: A Plausible Model,” J. Mol. Biol., vol. 12, no. 1, pp. 88–118, 1965.

[80] R. McMullan and S. J. Nurrish, “The RHO-1 RhoGTPase modulates fertility and multiple behaviors in adult C. elegans.,” PLoS One, vol. 6, no. 2, p. e17265, Jan. 2011.

124

[81] M. H. Ouellette, E. Martin, G. Lacoste-Caron, K. Hamiche, and S. Jenna, “Spatial control of active CDC-42 during collective migration of hypodermal cells in Caenorhabditis elegans,” J. Mol. Cell Biol., vol. 8, no. 4, pp. 313–327, 2016.

[82] L. J. Neukomm, S. Zeng, A. P. Frei, P. A. Huegli, and M. O. Hengartner, “Small GTPase CDC-42 promotes apoptotic cell corpse clearance in response to PAT-2 and CED-1 in C. elegans,” Cell Death Differ., vol. 21, no. 6, pp. 845–853, 2014.

[83] E. M. M. Manders, F. J. Verbeek, and J. A. Aten, “Measurement of co-localization of objects in dual-colour confocal images,” J. Microsc., vol. 169, pp. 375–382, 1993.

[84] S. V. Costes, D. Daelemans, E. H. Cho, Z. Dobbin, G. Pavlakis, and S. Lockett, “Automatic and quantitative measurement of protein-protein colocalization in live cells,” Biophys. J., vol. 86, no. 6, pp. 3993–4003, 2004.

[85] M. Ziman, J. M. O. Brien, L. A. Ouellette, W. R. Church, and D. I. Johnson, “Mutational Analysis of CDC42Sc , a Saccharomyces cerevisiae Gene That Encodes a Putative GTP-Binding Protein Involved in the Control of Cell Polarity,” Mol. Cell. Biol., vol. 11, no. 7, pp. 3537–3544, 1991.

[86] D. Aceto, M. Beers, and K. J. Kemphues, “Interaction of PAR-6 with CDC-42 is required for maintenance but not establishment of PAR asymmetry in C. elegans,” Dev. Biol., vol. 299, no. 12, pp. 386–97, 2006.

[87] S. Sit and E. Manser, “Rho GTPases and their role in organizing the actin cytoskeleton,” J. Cell Sci., vol. 124, no. 5, pp. 679–683, 2011.

[88] Y. Wang, M. P. Mattson, and K. Furukawa, “Endoplasmic reticulum calcium release is modulated by actin polymerization,” J. Neurochem., vol. 82, pp. 945–952, 2002.

[89] K. R. Norman et al., “The Rho/Rac-Family guanine nucleotide exchange factor VAV-1 regulates rhythmic behaviors in C. elegans,” Cell, vol. 123, no. 1, pp. 119– 132, 2005.

[90] Y. Takai, T. Sasaki, and T. Matozaki, “Small GTP-binding proteins,” Physiol. Rev., vol. 81, no. 1, pp. 153–208, 2001.

[91] D. I. Johnson, “Cdc42: An essential Rho-type GTPase controlling eukaryotic cell polarity.,” Microbiol. Mol. Biol. Rev., vol. 63, no. 1, pp. 54–105, 1999.

[92] E. Hong-Geller and R. A. Cerione, “Cdc42 and Rac Stimulate Exocytosis of Secretory Granules by Activating the IP 3 /Calcium Pathway in RBL-2H3 Mast Cells,” J. Cell Biol., vol. 148, no. 3, pp. 481–493, 2000.

[93] M. M. Wilkes, J. D. Wilson, B. Baird, and D. Holowka, “Activation of Cdc42 is necessary for sustained oscillations of Ca2+ and PIP2 stimulated by antigen in RBL mast cells,” Biol. Open, vol. 3, no. 8, pp. 700–710, 2014. 125

[94] D. Illenberger et al., “Stimulation of phospholipase C-β 2 by the Rho GTPases Cdc42Hs and Rac1,” EMBO J., vol. 17, no. 21, pp. 6241–6249, 1998.

[95] J. Saras, P. Franzén, P. Aspenström, U. Hellman, L. J. Gonez, and C. H. Heldin, “A novel GTPase-activating protein for Rho interacts with a PDZ domain of the protein-tyrosine phosphatase PTPL1.,” J. Biol. Chem., vol. 272, no. 39, pp. 24333–8, Sep. 1997.

[96] B.-E. Myagmar et al., “PARG1, a protein-tyrosine phosphatase-associated RhoGAP, as a putative Rap2 effector.,” Biochem. Biophys. Res. Commun., vol. 329, no. 3, pp. 1046–52, Apr. 2005.

[97] B.-J. de Kreuk et al., “The Human Minor Histocompatibility Antigen1 Is a RhoGAP,” PLoS One, vol. 8, no. 9, p. e73962, Sep. 2013.

[98] J. Amado-Azevedo, N. R. Reinhard, J. van Bezu, G. P. van Nieuw Amerongen, V. W. M. van Hinsbergh, and P. L. Hordijk, “The minor histocompatibility antigen 1 (HMHA1)/ArhGAP45 is a RacGAP and a novel regulator of endothelial integrity,” Vascul. Pharmacol., vol. 101, pp. 38–47, Feb. 2018.

[99] S. Aresta, M.-F. de Tand-Heim, F. Béranger, and J. de Gunzburg, “A novel Rho GTPase-activating-protein interacts with Gem, a member of the Ras superfamily of GTPases.,” Biochem. J., vol. 367, no. Pt 1, pp. 57–65, Oct. 2002.

[100] D. M. Barry et al., “Rasip1-Mediated Rho GTPase Signaling Regulates Blood Vessel Tubulogenesis via Non-Muscle Myosin II,” Circ. Res., no. June 2016, 2016.

[101] C. D. Myers, P.-Y. Goh, T. S. Allen, E. A. Bucher, and T. Bogaert, “Developmental Genetic Analysis of Troponin T Mutations in Striated and Non-Striated Muscle Cells of Caenorhabditis elegans,” J. Cell Biol., vol. 132, no. 6, pp. 1061–77, 1996.

[102] C. Frøkjær-Jensen et al., “Single copy insertion of transgenes in C. elegans,” vol. 40, no. 11, pp. 1375–1383, 2008.

[103] X. Chen, V. B. Vega, and H.-H. Ng, “Transcriptional regulatory networks in embryonic stem cells.,” Cold Spring Harb. Symp. Quant. Biol., vol. 73, pp. 203–9, Jan. 2008.

[104] Y. B. Tzur, A. E. Friedland, S. Nadarajan, G. M. Church, J. A. Calarco, and M. P. Colaiácovo, “Heritable custom genomic modifications in Caenorhabditis elegans via a CRISPR-Cas9 system.,” Genetics, vol. 195, no. 3, pp. 1181–5, Nov. 2013.

[105] I. Hope, C. elegans: A Practical Approach. New York: Oxford University Press, 1999.

[106] L. Timmons and A. Fire, “Specific interference by ingested dsRNA,” Nature, vol. 395, no. 6705, p. 854, Oct. 1998.

126

[107] E. Kim, L. Sun, C. V. Gabel, and C. Fang-Yen, “Long-Term Imaging of Caenorhabditis elegans Using Nanoparticle-Mediated Immobilization,” PLoS One, vol. 8, no. 1, pp. 1–6, 2013.

[108] R Core Team, “R: A Language and Environment for Statistical Computing.” R Foundation for Statistical Computing, Vienna, Austria, 2017.

[109] G.-J. Y. Peters, “userfriendlyscience: Quantitative analysis made accessible.” 2017.

[110] A. Boudaoud et al., “FibrilTool, an ImageJ plug-in to quantify fibrillar structures in raw microscopy images,” Nat. Protoc., vol. 9, no. 2, pp. 457–463, 2014.

[111] E. Meijering, “FeatureJ: Edges.” [Online]. Available: https://imagescience.org/meijering/software/featurej/edges/. [Accessed: 09-Sep- 2019].

[112] “Built-in Macro Functions.” [Online]. Available: https://imagej.nih.gov/ij/developer/macro/functions.html. [Accessed: 09-Sep- 2019].