
Deep Learning with Microfluidics for Biotechnology

Jason Riordon, Dušan Sovilj, Scott Sanner¹, David Sinton² & Edmond W. K. Young³*

Department of Mechanical and Industrial Engineering, University of Toronto, 5 King’s College Road, Toronto, ON, Canada, M5S 3G8

*Correspondence: [email protected] (E.W.K. Young)

¹http://d3m.mie.utoronto.ca/ ²http://www.sintonlab.com/ ³http://ibmt.mie.utoronto.ca/

Keywords: Deep Learning, Machine Learning, Microfluidics, Lab-on-a-Chip

Abstract: Advances in high-throughput and multiplexed microfluidics have rewarded biotechnology researchers with vast amounts of data. However, our ability to analyze complex data effectively has lagged. Over the last few years, deep artificial neural networks leveraging modern graphics processing units have enabled the rapid analysis of structured input data - sequences, images, videos - to predict complex outputs with unprecedented accuracy. While there have been early successes in flow cytometry, for example, the extensive potential of pairing microfluidics (to acquire data) and deep learning (to analyze data) to tackle biotechnology challenges remains largely untapped. Here we provide a roadmap to integrating deep learning and microfluidics in biotechnology labs that matches computational architectures to problem types, and provide an outlook on emerging opportunities.


The Challenge of Processing Data from Microfluidics

Over the past two decades, microfluidics (see Glossary) has shown great promise in enhancing biotechnology applications by leveraging small sample volumes, short reaction times, parallelization and manipulation at sample-relevant scales [1]. Major milestones in the field are indicated in Figure 1. By demonstrating an ability to efficiently perform a wide range of functions within biotechnology laboratories, including DNA and RNA sequencing [2,3], single-cell omics [4,5], antimicrobial resistance screening [6] and drug discovery [7], microfluidic technologies have revolutionized the way we approach experimental biology and biomedical research. Microfluidic technologies have also demonstrated an ability to capture, align, and manipulate single cells for cell sorting and flow-based cytometry [8–10], mass [11] and volume [12] sensing, phenotyping [13], cell fusion [14], cell capture (e.g., circulating tumor cells [15,16]), and cell movement (e.g., sperm motility [17–20]). Microfluidic high-throughput screening using droplets benefits from rapid reaction times, high sensitivity and low reagent cost [21]. Microfluidics has also enabled the capture and monitoring of zebrafish embryos [22,23], the high-throughput and rapid imaging of the nematode worm Caenorhabditis elegans [24], and the ordering and orientation of Drosophila embryos [25].

While microfluidic applications in biotechnology vary widely, the real product in all cases is data. For example, a typical time-lapse experiment can readily generate >100 GB of data (e.g., 100 cells × 4 images/cell (1 brightfield + 3 fluorescence bands) × 6 time points/hr × 48 hrs × 1 MB/image). However, the generation of data using high-throughput, highly parallelized microfluidic systems has far outpaced researchers’ abilities to effectively process this data – an analysis bottleneck.
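As a back-of-envelope check of this estimate, the multiplication can be scripted directly (a minimal Python sketch; the counts and per-image size are simply the example values quoted above):

# Dataset-size estimate for the time-lapse example above (illustrative values).
cells = 100
images_per_cell = 4           # 1 brightfield + 3 fluorescence bands
timepoints_per_hour = 6
hours = 48
mb_per_image = 1

total_mb = cells * images_per_cell * timepoints_per_hour * hours * mb_per_image
print(total_mb / 1024, "GB")  # ~112.5 GB, i.e., >100 GB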


Machine learning (Box 1) is a class of artificial intelligence-based methods that enable algorithms to learn without direct programming. While traditional machine learning has long offered data processing capabilities, the advent of deep learning methods represents a step increase in the ability to handle structured data such as images or sequences. Deep learning architectures can now exploit structured data applicable to a variety of research fields [26,27], and leverage inherent data structures. Deep learning has achieved impressive recent gains in analyzing a variety of data types: images [28–31], natural language translation [32–34], speech data [35,36], text documents [37,38] and computational biology [39,40]. These major gains have been powered by increased computational power from GPUs, open-source frameworks (e.g., TensorFlow), and distributed computing.

Traditional machine learning has already been paired with microfluidics for biotechnology applications, for example in disease detection within liquid biopsies, as comprehensively reviewed by Ko and colleagues [41]. This marriage between traditional machine learning and microfluidics has yielded (i) improvements in analysis methods, including single-cell lipid screening [42], cancer screening [43,44] and cell counting [45], and (ii) advances in microfluidic design and flow modeling, including predicting water-in-oil emulsion sizes [46]. Applications that combine deep learning with microfluidics, however, are only beginning to emerge, with label-free cell classification as a prime example. Cells have been identified using deep learning architectures that either process pre-extracted features [47] or use raw images as inputs and fully leverage a deep network’s ability to extract relevant features for improved prediction [48–50]. Deep learning methods have also been used to produce well-defined flows by tuning channel geometry [51]. While these deep learning demonstrations indicate potential, we see far broader biotechnology applicability in the near future.


In this perspective we outline the many opportunities presented by the marriage of deep learning (to analyze data) and microfluidics (to generate data) for biotechnology applications. We provide a roadmap to integrating deep learning strategies and microfluidics applications, carefully pairing problem types to artificial neural network architectures (Figure 2, Key Figure). We begin with advances in processing simple unstructured data, for which there is much precedent in biotechnology, and transition to complex sequential and image data, for which there are abundant opportunities. We provide practical implementation guidelines for biotechnology researchers and research leaders new to deep learning, including a summary of how neural networks function and tips on getting started (Box 2). We end with an outlook on emerging opportunities that will have a profound impact on biotechnology research in the near and medium term.

Deep Learning Architectures for Microfluidic Challenges

Deep neural architectures have been developed and applied to a wide range of challenges, including single-molecule science [52], computational biology [40], and biomedicine [53,54]. Here we highlight strategies that have successfully been applied within the microfluidics community, and where such approaches could be applied in the near future. The most relevant architecture categories (input-to-output data types) are listed alongside existing microfluidic applications in Figure 2. We progress from the simple unstructured-to-unstructured case to the more complex image-to-image combination, and highlight microfluidic achievements and opportunities.

Unstructured-to-Unstructured Neural Networks – e.g., for Classifying Cells Based on Manually-Extracted Cell Traits

Unstructured-to-unstructured neural networks represent the simplest neural network case – a type of architecture that typically falls into the traditional machine learning category, but where deep learning can often be beneficial. In a typical biotechnology application, a neural network designed to handle unstructured inputs could be used to classify flowing cells within a microfluidic channel, where the input x would be a vector of cell traits (e.g., circularity, perimeter and major axis length) and the output y would be a class (e.g., white blood cell or colon cancer cell), as in the label-free cell segmentation and classification example in Figure 3A by Chen and colleagues (simplified – only three traits are shown for clarity). “Label-free” here refers to physical labels such as fluorophores, rather than feature labels as used in machine learning terminology. The authors used time-stretch quantitative phase microscopy [47] to create a rich hyperdimensional feature space of cell traits, trained their network using supervised learning, and achieved an accuracy and consistency in cell classification that surpassed traditional machine learning approaches, including logistic regression, naive Bayes and support vector machines.
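For readers seeking a concrete starting point, the sketch below shows what such an unstructured-to-unstructured classifier might look like in Keras (TensorFlow). The feature matrix X and label vector y are hypothetical placeholders standing in for traits extracted upstream, not data from the cited study:

# A minimal sketch of an unstructured-to-unstructured classifier in Keras.
import numpy as np
from tensorflow import keras

X = np.random.rand(1000, 3)              # circularity, perimeter, major axis length
y = np.random.randint(0, 2, size=1000)   # 0 = one cell class, 1 = the other

model = keras.Sequential([
    keras.Input(shape=(3,)),
    keras.layers.Dense(32, activation="relu"),    # hidden layers learn nonlinear
    keras.layers.Dense(32, activation="relu"),    # combinations of the traits
    keras.layers.Dense(1, activation="sigmoid"),  # probability of the positive class
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=20, batch_size=32, validation_split=0.2)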

Whereas the unstructured-to-unstructured example above utilized a deep network, most microfluidic work using simple unstructured-to-unstructured networks has not necessitated deep learning. Various imaging modalities (e.g., light microscopy [55], digital holographic microscopy [44]) have been used to image individual cells, and various cell features (e.g., cell size, perimeter, eccentricity, image intensities [56]) have been used as inputs in supervised learning algorithms to achieve, for example, label-free cell classification [44]. Traditional machine learning algorithms have also been used to classify exosomes and identify disease [57]. An excellent example of unstructured input (i.e., cell features extracted from images) to unstructured output is from the Lu group at Georgia Tech, who used microfluidic capture arrays to position and orient C. elegans for imaging synaptic puncta patterns within the organism [58]. The authors discovered highly subtle but important phenotypic differences between worms that revealed genetic differences that were previously hidden. For additional biotechnology examples and applications of machine learning, we refer the reader to an excellent review by Vasilevich and colleagues [59].

Sequence-to-Unstructured Neural Networks – e.g., for Microfluidic Soft Sensor Characterization

Sequential data is prevalent in microfluidics – from measuring an electrical signal to characterizing a droplet generator, microfluidic measurements are often produced as part of a time series. When data has a sequential structure, there are alternative deep network architectures that can better reflect and exploit the sequences. These deep neural networks are generally referred to as recurrent neural networks (RNNs) and are shown in Figure 3B-C. The first case of interest is sequence-to-unstructured, whereby an input sequence is assigned a single output value (Figure 3B). In this architecture, recurrent weights connect the hidden layer to itself and permit training through a gradient descent technique known as backpropagation-through-time [60]. The example shown in Figure 3B is reproduced (with simplification) from Han and colleagues [61], where an RNN was used to characterize a microfluidic soft sensor. An analog voltage was measured over time at the ends of a microchannel filled with a liquid metal, and the signal was monitored as a stimulus was applied at various locations along the channel. The RNN was trained not only to identify the pressure applied, but also to discern the location of the stimulus along the channel.
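A minimal sequence-to-unstructured sketch in Keras, loosely patterned on this soft-sensor example, might look as follows; the sequence length, the single voltage channel, and the two regression targets (pressure and location) are illustrative assumptions rather than the architecture of ref. [61]:

# A minimal sketch of a sequence-to-unstructured RNN: a time series in, one prediction out.
import numpy as np
from tensorflow import keras

X = np.random.rand(500, 200, 1)   # 500 recordings, 200 time steps, 1 voltage channel
y = np.random.rand(500, 2)        # targets: [applied pressure, stimulus location]

model = keras.Sequential([
    keras.Input(shape=(200, 1)),
    keras.layers.LSTM(64),        # recurrent layer; returns only the final hidden state
    keras.layers.Dense(2),        # one regression head for both targets
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=10, batch_size=32)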

Sequence-to-Sequence Neural Networks – e.g., for DNA Base Calling


Another case of interest is sequence-to-sequence (Figure 3C). Here, rather than classify an entire sequence of events as corresponding to a given class, the output itself is a sequence – for example in DNA base calling applications, where current events are identified in order [62]. Boža and colleagues demonstrated an open-source DNA base caller with an RNN structure to segment current change events from MinION nanopore data into DNA base pairs [62]. Such an architecture also has great potential in applications where the growth of a cell (via either volume [12] or mass [63]) is monitored over extended periods, and any individual size measurement would likely benefit from taking previous measurements into consideration. Given the strong correlation between pulses (i.e., a cell that is growing), using an RNN would likely substantially increase measurement accuracy. In tagging problems such as this example, each element of the input sequence is annotated, i.e., every pulse amplitude corresponds to the passage of a cell, or each current event corresponds to a certain base pair. Architectures have also been developed for cases where the output sequence is not the same length as the input sequence, for example in language translation (where an output sentence need not have the same number of words as an input text). To tackle this challenge, a more powerful architecture is employed [64], namely the encoder-decoder model. In this approach, there are two separate RNNs for two distinct phases. The first network, the encoder, is tasked with reading the input sequence and producing an encoded state that is a concise lower-dimensional representation of the sequence. The decoder RNN is then modified to have outputs passed further down the sequence. This encoder-decoder architecture is now the backbone of most sequence-to-sequence learning models in language translation and speech-to-text transcription, and has begun to impact biomedical research, as evidenced by recent work on predicting DNA-binding proteins from primary sequences [65]. Since training RNNs is computationally expensive, a significant effort has gone into developing more efficient structures and adapting them to the temporal domain. Recently, several works have explored using CNN-based sequence-to-sequence learning [66] and reported comparable or improved results over RNN-based models with a substantial decrease in training times.
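To make the tagging variant concrete, the following Keras sketch assigns one class per element of an input sequence, in the spirit of base calling; the shapes and the four-class (A/C/G/T) output are illustrative assumptions, not the DeepNano architecture:

# A minimal sketch of sequence-to-sequence tagging: one label per input element.
import numpy as np
from tensorflow import keras

X = np.random.rand(100, 300, 1)               # 100 reads, 300 current events each
y = np.random.randint(0, 4, size=(100, 300))  # a base label for every event

model = keras.Sequential([
    keras.Input(shape=(300, 1)),
    keras.layers.Bidirectional(keras.layers.LSTM(64, return_sequences=True)),
    keras.layers.TimeDistributed(keras.layers.Dense(4, activation="softmax")),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(X, y, epochs=5, batch_size=16)

Because return_sequences=True emits a hidden state at every time step, input and output lengths match here; handling unequal lengths is precisely where the encoder-decoder model described above becomes necessary.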

As identified in a recent review by Angerer and colleagues, single-cell transcriptomics has thus far relied on simpler models, but stands to benefit significantly from deep learning approaches [67]. Deng and colleagues recently demonstrated scScope, a deep learning architecture capable of identifying cell-type composition from single-cell RNA sequence gene-expression profiles, leveraging a recurrent network structure [68]. Angermueller and colleagues developed DeepCpG, a deep learning architecture capable of predicting single-cell methylation states, here using a bidirectional gated recurrent network, a variant of the RNN [40]. Not all networks dealing with sequence data rely on RNNs: DeepVariant, a network that first translates genomic data into images, was developed by researchers at Verily to analyse genetic variation within genomes using CNNs [69]. For further reading on how deep learning stands to transform genetics and biological networks, we refer the reader to a thoughtful review by Camacho and colleagues [70]. We also refer the reader to an excellent review on the multi-omics of single cells by Bock and colleagues [71].

Image-to-Unstructured Neural Networks – e.g., for Classifying Cells Based on Images Directly

Deep networks have also been developed to handle spatially-distributed data, or images – useful, for example, in classifying cells directly, without requiring prior manual extraction of traits. Image analysis is key to most microfluidic experiments – multiplexing, rapid throughput and the planar geometry of many microfluidic networks lead to the production of vast quantities of images. Image-to-unstructured neural networks offer the promise of handling such data directly, without requiring time-consuming pre-extraction of relevant features. Further, by embedding feature extraction layers within a deep network, the selection process is no longer subject to human biases. Thus, it is not only a deep network’s ability to accelerate classification tasks (by incorporating the feature-extraction step), but also its ability to predict what features are relevant, that makes deep learning approaches so powerful.

Convolutional neural networks (CNNs) are the backbone of all image-based deep learning (Figure 4). Their architecture is such that nodes are restricted to connect to a portion of the input image. Convolutional blocks (or filters, shown as light green layers in Figure 4) operate on all parts of an image by sliding a small window region along the image, outputting the weighted sum of pixel values for that filter within the region, and applying a nonlinear transformation. These convolutional layers can be combined with pooling (subsampling) layers, which extract the most dominant values in the feature maps and reduce their resolution (dark green layers, Figure 4). The process is repeated several times until a specific filter map resolution is achieved. In summary, the initial stages of the pipeline are designed to tackle spatial data, and convolutions act as special feature detectors (such as edges or lines) [72], thereby teaching the network to associate proximate pixels in space. The output of these layers is then a low-dimensional embedded representation of an image that constitutes a far better representation of the image content than other feature-extraction methods [72,73]. For image-to-unstructured applications specifically, for example in classification tasks, a fully connected stack of layers is used at the end of the network (Figure 4A). Such an architecture has been applied in several label-free microfluidic cell cytometry applications [48–50]. In the flow cytometry example by Heo and colleagues (Figure 4A), a CNN is used to classify a binary population of lymphocytes and red blood cells at high throughput [48]. In the second example, Stoecklein and colleagues show how precise flow profiles can be generated using clever pillar distributions along the channel. Interestingly, the “what flow shape will result from a given geometry?” problem is easily solved by a computational flow model, whereas deep learning can be used to solve the much more difficult inverse problem: “what geometry is required to produce a desired flow shape?” [51]. Recently, an image-to-unstructured architecture has also been used to quantify bacterial growth within microfluidic culture systems [74]. Kim and colleagues used a CNN to calculate the concentration of Pseudomonas aeruginosa within on-chip microfluidic cultures using only culture images as inputs.
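A minimal image-to-unstructured CNN sketch in Keras is shown below: convolution and pooling layers embed the image, and a fully connected stack produces the class prediction. Image dimensions and the binary output (e.g., lymphocyte versus red blood cell) are illustrative assumptions:

# A minimal sketch of an image-to-unstructured CNN classifier.
import numpy as np
from tensorflow import keras

X = np.random.rand(1000, 64, 64, 1)      # 1,000 single-cell grayscale images
y = np.random.randint(0, 2, size=1000)   # binary class labels

model = keras.Sequential([
    keras.Input(shape=(64, 64, 1)),
    keras.layers.Conv2D(16, 3, activation="relu"),  # convolutional filters (light green)
    keras.layers.MaxPooling2D(),                    # pooling/subsampling (dark green)
    keras.layers.Conv2D(32, 3, activation="relu"),
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(64, activation="relu"),      # fully connected stack
    keras.layers.Dense(1, activation="sigmoid"),    # class prediction
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32)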

Image-to-Image Neural Networks – e.g., for Cell Segmentation

Image-to-image applications include predicting the next frame in a video, predicting the depth of an image, or, most notably, segmenting images – where cell contours, for example, can be learned and applied to produce fully segmented images. In the nerve cell segmentation application in Figure 4B by Zaimi and colleagues, nerve cell images are segmented into regions marking axon (blue), myelin (red) and background (black) [75] using a CNN tailored for this purpose. In a semantic image segmentation problem, the goal is to map each pixel within the input image to one of many classes present in the corpus (e.g., membrane, nucleus, and cytoplasm) [76]. Here, many classifications are required (one classification per pixel, rather than one classification per image). Two broad approaches have been proposed using fully convolutional networks. The first approach is to use an encoder-decoder, whereby a reverse series of operations (upsampling and deconvolutions) is performed to reconstruct a segmented image from its embedding in the initial convolutional and pooling layers [73,77,78] (Figure 4B). The dashed arrows indicate a popular “U-Net” modification, where a connected network can be applied at different scales and subsequently combined [79,80]. In the microscopy field, the U-Net has been successfully applied to segment and count cells [77], and was employed by Zaimi and colleagues in Figure 4B (simplified here) [75]. Alternatively, a spatial pyramidal pooling approach can be used [81–83]. Spatial pyramid pooling focuses on capturing context at several scales from the image-level features. That is, given the low-level representations, several filters at different resolutions are applied in order to capture objects at multiple scales. These filtered representations are joined together and passed to the final convolutional layer. Spatial pyramidal methods produce a representation that explicitly tackles different scales with specific filters placed in the pyramidal component. Although both encoder-decoder and pyramidal approaches are applicable, a pyramidal architecture separates the tasks of image downscaling and convolution, which often leads to better segmentation at multiple image scales.
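The encoder-decoder idea can be sketched compactly in Keras: convolutions and pooling build a low-resolution embedding, and upsampling layers reconstruct a per-pixel classification. This minimal sketch omits the skip connections of a true U-Net, and all sizes and the three classes (e.g., axon, myelin, background) are illustrative assumptions:

# A minimal sketch of an image-to-image encoder-decoder for semantic segmentation.
from tensorflow import keras

inputs = keras.Input(shape=(128, 128, 1))
x = keras.layers.Conv2D(16, 3, padding="same", activation="relu")(inputs)
x = keras.layers.MaxPooling2D()(x)                  # encoder: downsample to an embedding
x = keras.layers.Conv2D(32, 3, padding="same", activation="relu")(x)
x = keras.layers.MaxPooling2D()(x)
x = keras.layers.Conv2D(32, 3, padding="same", activation="relu")(x)
x = keras.layers.UpSampling2D()(x)                  # decoder: upsample back to full size
x = keras.layers.Conv2D(16, 3, padding="same", activation="relu")(x)
x = keras.layers.UpSampling2D()(x)
outputs = keras.layers.Conv2D(3, 1, activation="softmax")(x)  # one class per pixel

model = keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
# model.fit(images, masks), where masks hold an integer class label per pixel

A U-Net adds connections from each encoder scale to the matching decoder scale, which is what allows fine spatial detail to survive the downsampling.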

Hybrid Approaches: Video

The above examples represent common microfluidic problem types and their associated well-developed deep learning architectures. This list should not be construed as comprehensive, but rather as an overview of the most common deep learning approaches for biotechnology applications. Notably, combinations of the above architectures can also be used in conjunction, for example to analyze videos (i.e., image sequences). One recent example is the identification of hematopoietic lineage by Buggenthin and colleagues, where a combination of CNN and RNN architectures was used to predict single-cell lineage [84]. In this case, a CNN was applied to brightfield images of stem cells to extract local image features, and these feature vectors were then fed into an RNN to take temporal information into account (i.e., analyze the next frame in a video by considering the previous frame). In this way, 5,922 cells were imaged over 400 sequential frames, resulting in ~2.5 million image patches. The authors demonstrated that their algorithm could predict cell type after differentiation, up to three generations before conventional molecular markers were observed. An alternative approach to processing video involves compression into static images in a manner that preserves key features. For example, Yu and colleagues recently used such a video compression approach to analyze phenotypic antimicrobial susceptibility [85]. Videos of bacteria within microfluidic channels were compressed into two sets of static images, each capturing either cell morphology or motion trace. These compressed images were then fed into a CNN, and bacterial cells inhibited by an antibiotic were successfully differentiated from uninhibited cells.
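A minimal hybrid sketch in Keras is shown below: a small CNN is applied to every frame via a TimeDistributed wrapper, and an LSTM integrates the per-frame embeddings over time. The shapes and the binary output are illustrative assumptions, not the architecture of ref. [84]:

# A minimal sketch of a hybrid CNN + RNN architecture for video (image sequences).
from tensorflow import keras

frame_encoder = keras.Sequential([                   # applied to every frame
    keras.layers.Conv2D(16, 3, activation="relu"),
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(64, activation="relu"),       # per-frame feature vector
])

model = keras.Sequential([
    keras.Input(shape=(30, 64, 64, 1)),              # 30-frame grayscale image sequence
    keras.layers.TimeDistributed(frame_encoder),     # CNN features, one per frame
    keras.layers.LSTM(32),                           # temporal integration across frames
    keras.layers.Dense(1, activation="sigmoid"),     # e.g., predicted lineage class
])
model.compile(optimizer="adam", loss="binary_crossentropy")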

Emerging Opportunities

So far, we have mapped the most promising pairings of deep learning architectures and microfluidics for biotechnology applications. Below we provide an outlook on intriguing opportunities that stem from the marriage of deep learning and microfluidics.

Organ-on-a-Chip and AI-Autonomous Living Systems

Organ-on-a-chip (OOC) systems are engineered devices that rely on the integration of advanced materials, microfluidics, living cells, and biomaterials to create human tissue constructs that accurately mimic the structure, function, and microenvironment of real human tissue. OOC systems have already been developed for many organs of the body, and have demonstrated immense potential as test platforms for drug development and towards personalized medicine [86,87]. Based on the current pace of progress, we anticipate that OOC systems will mimic real tissues with increasing accuracy and complexity, and any deep learning application that can be proposed for real tissues can conceivably be proposed for OOC systems. First, as OOC systems continue to evolve, microfluidic tissue-level structures will be cultured, maintained, and monitored on-chip, and could be studied over large sample regions of the OOC tissue to detect spatial heterogeneity in a manner similar to modern histopathology analysis [88]. Thus, we anticipate the generation of enormous datasets of images and videos of living cells, tissues, and organs residing on-chip within their respective in vitro microenvironments and undergoing changes in behaviour and function, and we believe this data will provide a rich dataset for deep learning architectures. Indeed, the application of deep learning to histopathology has recently been shown using CNNs to organize and classify histomorphologic information [89], and can thus conceivably be extended to OOC-derived tissue constructs, given that the most common data output structure to date for OOCs is fluorescence microscopy images.

Second, considering the emerging vision of the person-on-a-chip (or body-on-a-chip [90]) for clinical applications and personalized medicine [91], there is a potential role for deep networks and artificial intelligence in control of the entire multi-organ system. Such multi-organ microphysiological systems have recently been reported [90], demonstrating control of flow partitioning to the various organs to mimic human physiology. One can imagine deep networks monitoring individual organs, metering communication, providing real-time control of multiple organ-on-chip systems, and working towards a fully “trained” and AI-guided multi-organ microphysiological system that essentially regulates itself. Ironically, a learning AI-driven multi-organ system may in fact be more fundamentally natural, or self-guided, than human-controlled organ microsystems – a rather intriguing proposition. Thus, we envision a clear and immediate role for deep learning in accelerating data analysis from OOC systems, and an intriguing medium- to long-term role in the integrated control and self-regulation of multiple organs or tissues making up a larger living whole.

Deep Learning-Powered Experiment Design and Control

While the majority of near-term deep learning applications will focus on a post-experiment data analysis role, there is growing potential for deep learning in designing microfluidic systems and controlling systems during experiments. The company Zymergen, for example, controls thousands of parallel microwell-based microbial culture experiments in which the fluidic decisions (e.g., what to inject and when to inject it) are made autonomously via AI. In that case, the machine learning analytics and fluidic experimental system work together autonomously toward the goal of genetically engineering microbes to produce useful chemicals [92]. Neural networks have also been applied to predict the outcome of organic reactions [93]. In addition to cultivation optimization, we see near-term opportunity in applying deep learning to fields with deeply complex and convoluted factors. For example, an increasing awareness of climate change and pollution is motivating a growing emphasis on elucidating the role of environmental stressors on microbiota [94,95]. Quantifying the response of complex systems to multiple stressors – a biotechnology grand challenge – will require high numbers of parallel experiments and roles for deep learning in both experiment planning and post-experiment analysis. Early work in this area by Nguyen and colleagues showed how an aerogel-based gas platform could be used to evaluate the role of various stressors (temperature, light, food supply, various pollutants) on microalgal growth [94]. Lambert and colleagues developed a microfluidics-based in situ chemotaxis assay to study bacterial communities, enabling the study of complex chemical interactions at an organism-relevant microfluidic scale [95]. Sifting through the resulting data is currently a bottleneck, and despite the large numbers of tests possible in microsystems, the number of variables is still too high for brute force. These recent approaches point to the opportunity for small-scale environmental toxicology testing, as well as the need for deep learning guidance both in experiment planning (increasing efficiency) and post-experiment analysis.

Globally Distributed Microfluidics with Cloud-Based Deep Learning

Inexpensive microfluidic tests such as paper-based assays [96] can provide rapid results in high numbers and at high frequency worldwide. Particularly when paired with the imaging and data transmission capabilities of now-ubiquitous smartphones [97], paper-based tests can uniquely provide real-time, globally distributed analytical data ideally suited to deep learning algorithms. We see this combination being particularly powerful in the medium term in microfluidic point-of-care diagnostics [98] and food safety [99]. In diagnostics, data generated from low-cost virus-detecting paper microfluidic devices by millions of globally distributed users could be paired with deep learning algorithms to track, predict and ultimately contain outbreaks. In addition to the detection and prediction of a rapidly evolving outbreak, there may be an additional role for microfluidics in a targeted distributed response. For example, the local production of diagnostics, therapeutics and vaccines is possible via hydration of freeze-dried cell-free transcription machinery [100]. This vision leverages deep learning-powered analysis and prediction to turn distributed local microfluidics-based detection into a coordinated and effective local response. Analogously, in food safety, microfluidic systems could be applied to test and monitor food quality and safety throughout the food production chain, feeding data-hungry deep learning strategies to contain and ultimately prevent contamination. Particularly in supply chains, microfluidics (data acquisition) and deep learning (analysis) will likely be further combined with cloud-based distributed ledger systems known as blockchain. In both diagnostics and food safety, cloud-based deep learning algorithms are an ideal partner to low-cost distributed testing.


Concluding Remarks

Deep learning and microfluidics represent an ideal marriage of experimental and analytical throughput. The pairing will only strengthen as technologies in both fields advance. In essence, the massive amount of data recovered from highly parallelized microfluidic systems represents the ideal biotechnology application for today’s modern deep learning algorithms. It is also likely that the integration of these approaches within the biotechnology research workflow will synergistically accelerate research in powerful ways (see Outstanding Questions). For example, the training of high-performance generic image classifiers (e.g., human, chair, car) has enabled retraining for medical image classification tasks with substantially reduced data requirements [101]. Similar architecture reuse may accelerate progress and reduce data requirements for deep learning in biotechnology. While the adoption of any new technology within a lab presents challenges and costs, the opportunity for a union of microfluidics and deep learning is clear, and for many biotechnology applications the barriers to entry are now relatively low. We hope this roadmap demystifies deep learning, highlights its tremendous potential, and encourages rapid implementation to the benefit of biotechnology research.

Acknowledgements

The authors gratefully acknowledge support from the Natural Sciences and Engineering Research Council of Canada, the Discovery Grants program, an E.W.R. Steacie Memorial Fellowship and the Canada Research Chairs program. The authors also thankfully acknowledge support from the Canadian Institutes of Health Research Collaborative Health Research Projects program.


Box 1: Deep Learning Basics

What is a neural network?

A neural network is a type of machine learning architecture where a structured nonlinear function y = f(x) is learned to map input x into output y. The essence of neural networks is in the multiple layers of simple algebraic operations used to compute the function f [26,102]. Each node within a hidden layer is computed as a weighted linear combination of all nodes within the previous layer, followed by a nonlinear transform (e.g., a sigmoidal function or a rectified linear unit). Next, the output of this layer is computed as a weighted linear combination of all nodes within the hidden layer in a similar fashion. Each layer’s outputs are fed into the next layer, until a series of network outputs is generated. The power and versatility of such a network comes from simply combining many of these simple operations together. In a process called supervised learning, a labeled dataset with paired inputs x and outputs y can be used to train a neural network by optimizing the weights. At each iteration, the predicted outputs y (e.g., predicted cell classes) are compared to known values (e.g., known cell classes), and the error is calculated (e.g., squared error or cross-entropy). The error is then back-propagated through the network, and new weights are assigned using a gradient descent algorithm to minimize the error [60,103]. After multiple iterations, the neural network is trained and capable of making predictions on new test datasets.
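The training loop described above can be written out explicitly. The following NumPy sketch trains a one-hidden-layer network by gradient descent on toy data; it is a minimal illustration of forward passes, error calculation, and backpropagation, not production code:

# A minimal sketch of supervised training: forward pass, error, backpropagation.
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((200, 3))                                  # 200 samples, 3 features
y = (X.sum(axis=1) > 1.5).astype(float).reshape(-1, 1)    # toy binary labels

W1, b1 = rng.normal(size=(3, 8)) * 0.5, np.zeros(8)       # input -> hidden weights
W2, b2 = rng.normal(size=(8, 1)) * 0.5, np.zeros(1)       # hidden -> output weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for step in range(2000):
    h = sigmoid(X @ W1 + b1)        # hidden layer: weighted sum + nonlinearity
    p = sigmoid(h @ W2 + b2)        # network output (predicted class probability)
    d2 = (p - y) * p * (1 - p)      # squared-error gradient through the output sigmoid
    d1 = (d2 @ W2.T) * h * (1 - h)  # backpropagate the error into the hidden layer
    W2 -= lr * h.T @ d2 / len(X); b2 -= lr * d2.mean(axis=0)   # gradient descent
    W1 -= lr * X.T @ d1 / len(X); b1 -= lr * d1.mean(axis=0)   # weight updates

print("training accuracy:", ((p > 0.5) == y).mean())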

Deep Learning vs. Traditional Machine Learning

While traditional machine learning methods, including neural networks, have been active areas of research for decades, only in the past 10 years have deep neural networks (i.e., usually considered to be neural networks with three or more hidden layers) started to significantly outperform other methods. While deep neural networks were traditionally very hard and time-consuming to train, the advent of large-scale data and storage, fast GPU-based computation, and advances in training methods have combined to make possible deep learning performance breakthroughs on a variety of tasks, including image recognition, speech recognition, and language translation. In recent years, deep convolutional neural networks as deep as 152 layers have offered state-of-the-art performance on image recognition tasks [104]. Critically, unlike many previous machine learning methods, deep learning methods naturally and efficiently exploit the sequential and spatial structure of data, which is the key innovation of deep learning.

Box 2: Getting Started with Deep Learning

When to go Deep?

Certainly, many biotechnology applications could benefit from traditional machine learning methods, since many common problems require simple classification or regression tasks with simple unstructured inputs (e.g., cell parameters, rather than sequences or images). This simplified approach, however, fails to leverage the most compelling attribute of deep learning: the ability to learn complex features, or correlations of features, not known a priori. A deep learning algorithm can discover features relevant for prediction that describe the image far better than a limited set of established features, or features that a human observer deems important. Deep learning can thus represent a significant improvement in efficacy, particularly if there are complex nonlinear relationships, and is essential in the case of sequential or image inputs.

Is Big Data Required?

How much data is required varies based on how many classes are being trained, how different these classes are, and whether tricks can be used to augment the data. Extensive training can in some cases be avoided by using a pre-trained network (termed transfer learning) [101]. Recently, Gopakumar and colleagues showed how a CNN pre-trained on the ImageNet database [105] (>10⁶ labelled everyday images such as dogs, trees, and cars) could be re-trained to predict cell class at reasonable accuracy with as few as ~30 cell images [50]. Another popular approach is to enlarge a small dataset by augmenting the original set with modified images (e.g., flipping, rotating, defocusing, and noise addition). In short, even a small amount of data is sufficient to get started with deep learning implementation.
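A minimal transfer-learning sketch in Keras illustrates both ideas: an ImageNet-pretrained base is frozen, a small classification head is retrained, and simple augmentation layers enlarge the effective dataset. The choice of MobileNetV2 and all sizes are illustrative assumptions, not the setup of ref. [50]:

# A minimal sketch of transfer learning with data augmentation.
from tensorflow import keras

base = keras.applications.MobileNetV2(
    input_shape=(96, 96, 3), include_top=False, weights="imagenet")
base.trainable = False                                   # freeze pretrained features

model = keras.Sequential([
    keras.Input(shape=(96, 96, 3)),
    keras.layers.Rescaling(1.0 / 127.5, offset=-1),      # scale pixels to [-1, 1]
    keras.layers.RandomFlip("horizontal_and_vertical"),  # augmentation: flips...
    keras.layers.RandomRotation(0.1),                    # ...and small rotations
    base,                                                # pretrained ImageNet features
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(1, activation="sigmoid"),         # new head for the cell classes
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(cell_images, labels, epochs=20)  # feasible with tens of images per class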

How do I Get Started?

Computer engineering is producing hardware solutions well-adapted to deep learning’s computing requirements, yielding graphics processing units (GPUs) optimized for parallel data processing. Training networks still generally requires days for datasets with ~10⁸ samples, for example, while more typical biotechnology applications with hundreds or thousands of samples can be processed in under an hour on a single commercial GPU. Biotechnology labs that widely embrace deep learning may additionally consider hardware clusters or external computing services. A student with basic programming skills can quickly get up to speed via a deep learning introductory course or an equivalent online course (e.g., Andrew Ng’s course: www.coursera.org/learn/machine-learning). In summary, the barriers to entry into deep learning are quickly fading, and we encourage biotechnology researchers to leverage the potent combination of microfluidics and deep learning.


References

1 Sackmann, E.K. et al. (2014) The present and future role of microfluidics in biomedical research. Nature 507, 181–189
2 Lan, F. et al. (2017) Single-cell sequencing at ultra-high-throughput with microfluidic droplet barcoding. Nat. Biotechnol. 35, 640–646
3 Zilionis, R. et al. (2016) Single-cell barcoding and sequencing using droplet microfluidics. Nat. Protoc. 12, 44–73
4 Su, Y. et al. (2017) Single cell proteomics in biomedicine: High-dimensional data acquisition, visualization, and analysis. Proteomics 17, 1600267
5 Caen, O. et al. (2017) Microfluidics as a Strategic Player to Decipher Single-Cell Omics? Trends Biotechnol. 35, 713–727
6 Liu, Z. et al. (2017) Microfluidics for Combating Antimicrobial Resistance. Trends Biotechnol. 35, 1129–1139
7 Dittrich, P.S. and Manz, A. (2006) Lab-on-a-chip: microfluidics in drug discovery. Nat. Rev. Drug Discov. 5, 210–218
8 Wolff, A. et al. (2003) Integrating advanced functionality in a microfabricated high-throughput fluorescent-activated cell sorter. Lab. Chip 3, 22–27
9 Gossett, D.R. et al. (2012) Hydrodynamic stretching of single cells for large population mechanical phenotyping. Proc. Natl. Acad. Sci. U. S. A. 109, 7630–7635
10 Mazutis, L. et al. (2013) Single-cell analysis and sorting using droplet-based microfluidics. Nat. Protoc. 8, 870–891
11 Cermak, N. et al. (2016) High-throughput measurement of single-cell growth rates using serial microfluidic mass sensor arrays. Nat. Biotechnol. 34, 1052–1059
12 Riordon, J. et al. (2014) Quantifying the volume of single cells continuously using a microfluidic pressure-driven trap with media exchange. Biomicrofluidics 8, 011101
13 Wang, B.L. et al. (2014) Microfluidic high-throughput culturing of single cells for selection based on extracellular metabolite production or consumption. Nat. Biotechnol. 32, 473–478
14 Skelley, A. et al. (2009) Microfluidic control of cell pairing and fusion. Nat. Methods 6, 147–152
15 Nagrath, S. et al. (2007) Isolation of rare circulating tumour cells in cancer patients by microchip technology. Nature 450, 1235–1239
16 Sarioglu, A.F. et al. (2015) A microfluidic device for label-free, physical capture of circulating tumor cell clusters. Nat. Methods 12, 685–691
17 Cho, B.S. et al. (2003) Passively driven integrated microfluidic system for separation of motile sperm. Anal. Chem. 75, 1671–1675
18 Nosrati, R. et al. (2014) Rapid selection of sperm with high DNA integrity. Lab. Chip 14, 1142–1150
19 Knowlton, S.M. et al. (2015) Microfluidics for sperm research. Trends Biotechnol. 33, 221–229
20 Nosrati, R. et al. (2017) Microfluidics for sperm analysis and selection. Nat. Rev. Urol. advance online publication
21 Sesen, M. et al. (2017) Droplet control technologies for microfluidic high throughput screening (μHTS). Lab. Chip 17, 2372–2394


22 Wielhouwer, E.M. et al. (2011) Zebrafish embryo development in a microfluidic flow-through system. Lab. Chip 11, 1815–1824
23 Choudhury, D. et al. (2012) Fish and Chips: a microfluidic perfusion platform for monitoring zebrafish development. Lab. Chip 12, 892–900
24 Chung, K. et al. (2008) Automated on-chip rapid microscopy, phenotyping and sorting of C. elegans. Nat. Methods 5, 637–643
25 Chung, K. et al. (2011) A microfluidic array for large-scale ordering and orientation of embryos. Nat. Methods 8, 171–176
26 LeCun, Y. et al. (2015) Deep learning. Nature 521, 436–444
27 Bengio, Y. et al. (2013) Representation Learning: A Review and New Perspectives. IEEE Trans. Pattern Anal. Mach. Intell. 35, 1798–1828
28 Krizhevsky, A. et al. (2012) ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pp. 1097–1105
29 Simonyan, K. and Zisserman, A. (2014) Very deep convolutional networks for large-scale image recognition. ArXiv Prepr. ArXiv14091556
30 Szegedy, C. et al. (2016) Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2818–2826
31 Redmon, J. et al. (2016) You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779–788
32 Luong, M.-T. et al. (2015) Effective approaches to attention-based neural machine translation. ArXiv Prepr. ArXiv150804025
33 Sutskever, I. et al. (2014) Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems, pp. 3104–3112
34 Cho, K. et al. (2014) Learning phrase representations using RNN encoder-decoder for statistical machine translation. ArXiv Prepr. ArXiv14061078
35 Bahdanau, D. et al. (2016) End-to-end attention-based large vocabulary speech recognition. In Acoustics, Speech and Signal Processing (ICASSP), 2016 IEEE International Conference on, pp. 4945–4949
36 Graves, A. et al. (2013) Speech recognition with deep recurrent neural networks. In Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on, pp. 6645–6649
37 Joulin, A. et al. (2016) Bag of tricks for efficient text classification. ArXiv Prepr. ArXiv160701759
38 Jaderberg, M. et al. (2016) Reading Text in the Wild with Convolutional Neural Networks. Int. J. Comput. Vis. 116, 1–20
39 Zhou, J. and Troyanskaya, O.G. (2015) Predicting effects of noncoding variants with deep learning–based sequence model. Nat. Methods 12, 931–934
40 Angermueller, C. et al. (2016) Deep learning for computational biology. Mol. Syst. Biol. 12, 878
41 Ko, J. et al. (2018) Machine learning to detect signatures of disease in liquid biopsies – a user’s guide. Lab. Chip 18, 395–405
42 Guo, B. et al. (2017) High-throughput, label-free, single-cell, microalgal lipid screening by machine-learning-equipped optofluidic time-stretch quantitative phase microscopy. Cytometry A 91, 494–502


43 Ko, J. et al. (2017) Combining Machine Learning and Nanofluidic Technology To Diagnose Pancreatic Cancer Using Exosomes. ACS Nano 11, 11182–11193
44 Singh, D.K. et al. (2017) Label-free, high-throughput holographic screening and enumeration of tumor cells in blood. Lab. Chip 17, 2920–2932
45 Huang, X. et al. (2016) Machine Learning Based Single-Frame Super-Resolution Processing for Lensless Blood Cell Counting. Sensors 16,
46 Mahdi, Y. and Daoud, K. (2017) Microdroplet size prediction in microfluidic systems via artificial neural network modeling for water-in-oil emulsion formulation. J. Dispers. Sci. Technol. 38, 1501–1508
47 Chen, C.L. et al. (2016) Deep Learning in Label-free Cell Classification. Sci. Rep. 6, 21471
48 Heo, Y.J. et al. (2017) Real-time Image Processing for Microscopy-based Label-free Imaging Flow Cytometry in a Microfluidic Chip. Sci. Rep. 7,
49 Van Valen, D.A. et al. (2016) Deep Learning Automates the Quantitative Analysis of Individual Cells in Live-Cell Imaging Experiments. PLOS Comput. Biol. 12, e1005177
50 Gopakumar, G. et al. (2017) Cytopathological image analysis using deep-learning networks in microfluidic microscopy. J. Opt. Soc. Am. A 34, 111
51 Stoecklein, D. et al. (2017) Deep Learning for Flow Sculpting: Insights into Efficient Learning using Scientific Simulation Data. Sci. Rep. 7, 46368
52 Albrecht, T. et al. (2017) Deep learning for single-molecule science. Nanotechnology 28, 423001
53 Ching, T. et al. (2017) Opportunities And Obstacles For Deep Learning In Biology And Medicine. bioRxiv
54 Mamoshina, P. et al. (2016) Applications of Deep Learning in Biomedicine. Mol. Pharm. 13, 1445–1454
55 Das, D.K. et al. (2013) Machine learning approach for automated screening of malaria parasite using light microscopic images. Micron 45, 97–106
56 Mirsky, S.K. et al. (2017) Automated Analysis of Individual Sperm Cells Using Stain-Free Interferometric Phase Microscopy and Machine Learning. Cytometry A 91A, 893–900
57 Ko, J. et al. (2017) Combining Machine Learning and Nanofluidic Technology To Diagnose Pancreatic Cancer Using Exosomes. ACS Nano 11, 11182–11193
58 San-Miguel, A. et al. (2016) Deep phenotyping unveils hidden traits and genetic relations in subtle mutants. Nat. Commun. 7,
59 Vasilevich, A.S. et al. (2017) How Not To Drown in Data: A Guide for Biomaterial Engineers. Trends Biotechnol. 35, 743–755
60 Werbos, P.J. (1990) Backpropagation through time: what it does and how to do it. Proc. IEEE 78, 1550–1560
61 Han, S. et al. (2018) Use of Deep Learning for Characterization of Microfluidic Soft Sensors. IEEE Robot. Autom. Lett.
62 Boža, V. et al. (2017) DeepNano: Deep recurrent neural networks for base calling in MinION nanopore reads. PLoS One 12, e0178751
63 Godin, M. et al. (2010) Using buoyant mass to measure the growth of single cells. Nat. Methods 7, 387–390
64 Jozefowicz, R. et al. (2016) Exploring the limits of language modeling. ArXiv Prepr. ArXiv160202410

65 Qu, Y.-H. et al. (2017) On the prediction of DNA-binding proteins only from primary sequences: A deep learning approach. PLOS ONE 12, e0188129
66 Vaswani, A. et al. (2017) Attention is all you need. In Advances in Neural Information Processing Systems, pp. 6000–6010
67 Angerer, P. et al. (2017) Single cells make big data: New challenges and opportunities in transcriptomics. Curr. Opin. Syst. Biol. 4, 85–91
68 Deng, Y. et al. (2018) Massive single-cell RNA-seq analysis and imputation via deep learning. bioRxiv DOI: 10.1101/315556
69 Webb, S. (2018) Deep learning for biology. Nature 554, 555–557
70 Camacho, D.M. et al. (2018) Next-Generation Machine Learning for Biological Networks. Cell 173, 1581–1592
71 Bock, C. et al. (2016) Multi-Omics of Single Cells: Strategies and Applications. Trends Biotechnol. 34, 605–608
72 LeCun, Y. and Bengio, Y. (1995) Convolutional networks for images, speech, and time series. Handb. Brain Theory Neural Netw. 3361, 1995
73 Zeiler, M.D. et al. (2011) Adaptive deconvolutional networks for mid and high level feature learning. In Computer Vision (ICCV), 2011 IEEE International Conference on, pp. 2018–2025
74 Kim, K. et al. (2018) Visual Estimation of Bacterial Growth Level in Microfluidic Culture Systems. Sensors 18,
75 Zaimi, A. et al. (2018) AxonDeepSeg: automatic axon and myelin segmentation from microscopy data using convolutional neural networks. Sci. Rep. 8, 3816
76 Long, J. et al. (2015) Fully convolutional networks for semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440
77 Ronneberger, O. et al. (2015) U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234–241
78 Badrinarayanan, V. et al. (2017) SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. IEEE Trans. Pattern Anal. Mach. Intell. 39, 2481–2495
79 Farabet, C. (2013) Towards real-time image understanding with convolutional networks. PhD Thesis, Université Paris-Est
80 Lin, G. et al. (2016) Efficient Piecewise Training of Deep Structured Models for Semantic Segmentation. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3194–3203
81 Zhao, H. et al. (2017) Pyramid Scene Parsing Network. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 6230–6239
82 Liu, W. et al. (2015) ParseNet: Looking wider to see better. ArXiv Prepr. ArXiv150604579
83 Chen, L.-C. et al. (2016) DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. ArXiv Prepr. ArXiv160600915
84 Buggenthin, F. et al. (2017) Prospective identification of hematopoietic lineage choice by deep learning. Nat. Methods 14, 403–406


85 Yu, H. et al. (2018) Phenotypic Antimicrobial Susceptibility Testing with Deep Learning Video Microscopy. Anal. Chem. 90, 6314–6322
86 Bhatia, S.N. and Ingber, D.E. (2014) Microfluidic organs-on-chips. Nat. Biotechnol. 32, 760–772
87 Ahadian, S. et al. (2018) Organ-On-A-Chip Platforms: A Convergence of Advanced Materials, Cells, and Microscale Technologies. Adv. Healthc. Mater. 7, 1700506
88 Lu, C. et al. (2017) An oral cavity squamous cell carcinoma quantitative histomorphometric-based image classifier of nuclear morphology can risk stratify patients for disease-specific survival. Mod. Pathol. 30, 1655–1665
89 Faust, K. et al. (2018) Visualizing histopathologic deep learning classification and anomaly detection using nonlinear feature space dimensionality reduction. BMC Bioinformatics 19,
90 Edington, C.D. et al. (2018) Interconnected Microphysiological Systems for Quantitative Biology and Pharmacology Studies. Sci. Rep. 8, 4530
91 Zhang, B. et al. (2016) Biodegradable scaffold with built-in vasculature for organ-on-a-chip engineering and direct surgical anastomosis. Nat. Mater. 15, 669–678
92 Check Hayden, E. (2015) Synthetic biology lures Silicon Valley investors. Nat. News 527, 19
93 Wei, J.N. et al. (2016) Neural Networks for the Prediction of Organic Chemistry Reactions. ACS Cent. Sci. 2, 725–732
94 Nguyen, B. et al. (2018) A Platform for High-Throughput Assessments of Environmental Multistressors. Adv. Sci.
95 Lambert, B.S. et al. (2017) A microfluidics-based in situ chemotaxis assay to study the behaviour of aquatic microbial communities. Nat. Microbiol. 2, 1344–1349
96 Gong, M.M. and Sinton, D. (2017) Turning the Page: Advancing Paper-Based Microfluidics for Broad Diagnostic Application. Chem. Rev. 117, 8447–8480
97 Erickson, D. et al. (2014) Smartphone technology can be transformative to the deployment of lab-on-chip diagnostics. Lab. Chip 14, 3159–3164
98 Pandey, C.M. et al. (2017) Microfluidics Based Point-of-Care Diagnostics. Biotechnol. J. 13, 1700047
99 Weng, X. and Neethirajan, S. (2017) Ensuring food safety: Quality monitoring using microfluidics. Trends Food Sci. Technol. 65, 10–22
100 Pardee, K. et al. (2016) Rapid, Low-Cost Detection of Zika Virus Using Programmable Biomolecular Components. Cell 165, 1255–1266
101 Tajbakhsh, N. et al. (2016) Convolutional Neural Networks for Medical Image Analysis: Full Training or Fine Tuning? IEEE Trans. Med. Imaging 35, 1299–1312
102 Krogh, A. (2008) What are artificial neural networks? Nat. Biotechnol. 26, 195–197
103 Rumelhart, D.E. et al. (1986) Learning representations by back-propagating errors. Nature 323, 533–536
104 He, K. et al. (2016) Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, pp. 770–778
105 Deng, J. et al. (2009) ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pp. 248–255
106 Kilby, J.S. (23-Jun-1964) Miniaturized electronic circuits. US Patent US3138743 A


107 Aspray, W. (1997) The Intel 4004 microprocessor: What constituted invention? IEEE Ann. Hist. Comput. 19, 4–15
108 Terry, S.C. et al. (1979) A gas chromatographic air analyzer fabricated on a silicon wafer. IEEE Trans. Electron Devices 26, 1880–1886
109 Duffy, D.C. et al. (1998) Rapid Prototyping of Microfluidic Systems in Poly(dimethylsiloxane). Anal. Chem. 70, 4974–4984
110 Rosenblatt, F. (1958) The perceptron: a probabilistic model for information storage and organization in the brain. Psychol. Rev. 65, 386
111 Jordan, M.I. (1997) Serial order: A parallel distributed processing approach. In Advances in Psychology 121, pp. 471–495, Elsevier
112 Abadi, M. et al. (2016) TensorFlow: Large-scale machine learning on heterogeneous distributed systems. ArXiv Prepr. ArXiv160304467
113 Rampasek, L. and Goldenberg, A. (2016) TensorFlow: Biology’s Gateway to Deep Learning? Cell Syst. 2, 12–14


Figure Legends

Text Box 1 Figure. Neural network basics. (A) Neural network weights adjusted via back-propagation. (B) Neural network with three hidden layers.


Figure 1. Deep learning with microfluidics. (A) A brief history of deep learning and microfluidics. Microfluidics emerged from MEMS technologies in the 1990s. (Top): 1958 – first integrated circuit by Kilby [106]; 1971 – first commercially available microprocessor (Intel) [107]; 1979 – first lab on a chip [108]; 1998 – demonstration of rapid prototyping with PDMS [109]. Concurrently, artificial intelligence and machine learning algorithms have been progressing over a similar time period. (Bottom): 1957 – first perceptron [110]; 1974 – introduction of backpropagation within neural networks by Werbos [60], popularized in 1986 by Rumelhart, Hinton & Williams [103]; 1986 – introduction of the recurrent neural network (RNN) by Jordan [111]; 2012 – demonstration of a foundational convolutional neural network (CNN), AlexNet, developed by Krizhevsky, Hinton and Sutskever [28]. RNNs and CNNs were at first limited by data and computational power, but in the last few years have gained massive popularity by leveraging fast GPUs, frameworks such as TensorFlow [112,113], and distributed computing.


Key Figure: Figure 2. Mapping microfluidic applications to deep learning architectures. Example applications are paired with deep learning architectures, from the simplest unstructured-to-unstructured case to the more complex image-to-image case. Each example is described in detail in the corresponding sections.


Figure 3. Deep learning architectures: unstructured data and structured sequences as inputs. (A) Unstructured-to-unstructured application and neural network architecture. Flow cytometry images (left) are reproduced from “Deep Learning in Label-free Cell Classification” by Chen and colleagues [47], licensed under CC BY 4.0; images were modified for clarity. Grey circles represent nodes, and arrows depict connections between nodes (light grey dashed arrows) or between layers (solid black arrows). Layers are color-coded, with the input layer in blue, hidden layers in green and output layers in red. (B) Sequence-to-unstructured application and recurrent neural network (RNN) architecture. Microfluidic soft sensor characterization example (left); images modified for clarity from ref. [61]. A single recurrent neural network layer is shown with progressive shading to show progression through time – nodes are not only fed new current values as inputs, but also their previous values. (C) Sequence-to-sequence application and recurrent neural network architecture. DNA base calling example image from “DeepNano: Deep recurrent neural networks for base calling in MinION nanopore reads” by Boža and colleagues [62], licensed under CC BY 4.0; image modified and inset schematic added for clarity. The deep learning architecture schematics (right) are simplified from the cited references to denote the networks’ principles of operation, and are not exact representations.


Figure 4. Deep learning architectures: images as inputs. (A) Image-to-unstructured application and convolutional neural network architecture. Flow cytometry cell classification application and images (top left) are reproduced from “Real-time Image Processing for Microscopy-based Label-free Imaging Flow Cytometry in a Microfluidic Chip” by Heo and colleagues [48], licensed under CC BY 4.0; images modified for clarity. Convolutional layers are light green, pooling layers are dark green, and different filters at the same scale (i.e., channels) are shown as vertical planar slices. Flow sculpting images are reproduced from “Deep Learning for Flow Sculpting: Insights into Efficient Learning using Scientific Simulation Data” by Stoecklein and colleagues [51], licensed under CC BY 4.0; images were modified for clarity. (B) Image-to-image cell segmentation application and CNN architecture. An SEM image of a rat spinal cord is segmented into one of three classes: axon (blue), myelin (red) and background (black), using a modified U-Net architecture, here simplified for clarity [75]. Images reproduced from “AxonDeepSeg: automatic axon and myelin segmentation from microscopy data using convolutional neural networks” by Zaimi and colleagues [75], licensed under CC BY 4.0; images were modified for clarity. The deep learning architecture schematics (right) are simplified from the above references to denote the networks’ principles of operation, and are not exact representations.
