
Uncertainty quantification and estimation in differential dynamic microscopy

Mengyang Gu,1,∗ Yimin Luo,2,3,∗ Yue He,1 Matthew E. Helgeson,2 and Megan T. Valentine3,†
1 Department of Statistics and Applied Probability, University of California, Santa Barbara CA 93106, USA
2 Department of Chemical Engineering, University of California, Santa Barbara CA 93106, USA
3 Department of Mechanical Engineering, University of California, Santa Barbara CA 93106, USA
(Dated: September 23, 2021)

Differential dynamic microscopy (DDM) is a form of video image analysis that combines the sensitivity of scattering and the direct visualization benefits of microscopy. DDM is broadly useful in determining dynamical properties including the intermediate scattering function for many spatiotemporally correlated systems. Despite its straightforward analysis, DDM has not been fully adopted as a routine characterization tool, largely due to computational cost and lack of algorithmic robustness. We present statistical analysis that quantifies the noise, reduces the computational order and enhances the robustness of DDM analysis. We propagate the image noise through the Fourier analysis, which allows us to comprehensively study the bias in different estimators of model parameters, and we derive a new way to detect whether the bias is negligible. Furthermore, through use of Gaussian process regression (GPR), we find that predictive samples of the image structure function require only around 0.5%–5% of the Fourier transforms of the observed quantities. This vastly reduces computational cost, while preserving information of the quantities of interest, such as quantiles of the image scattering function, for subsequent analysis. The approach, which we call DDM with uncertainty quantification (DDM-UQ), is validated using both simulations and experiments with respect to accuracy and computational efficiency, as compared with conventional DDM and multiple particle tracking. Overall, we propose that DDM-UQ lays the foundation for important new applications of DDM, as well as for high-throughput characterization. We implement the fast computation tool in a new, publicly available MATLAB software package.

∗ Equal contribution
† Corresponding authors: [email protected]; helge-[email protected]; [email protected]

arXiv:2105.01200v2 [cond-mat.soft] 22 Sep 2021

I. INTRODUCTION

Microscopy has become an essential tool for probing dynamical processes in complex materials and systems, but typically requires sophisticated video image analysis to obtain quantitative information. Although real-space analysis methods retain information regarding individualistic processes within an image, feature tracking algorithms such as multiple particle tracking (MPT) are often computationally expensive and require user interactivity to determine algorithmic parameters to isolate the dynamical process(es) of interest [1, 2]. By contrast, Fourier transform-based analysis retains the statistical information encoded within the entire image, and is therefore more sensitive to low-signal processes as well as more robust to non-ideal imaging conditions and optically dense systems [3, 4]. In this way, Fourier microscopy combines the advantages of real-space imaging in feature identification and segmentation with ensemble-level statistical precision of Fourier-space analysis.

Of the various Fourier-space based approaches available, differential dynamic microscopy (DDM) [5] has emerged as a powerful and versatile analysis method to quantify spatiotemporally correlated dynamics from video microscopy data. This versatility stems from its compatibility with a broad range of microscopy imaging modes, easy setup with instrumentation available in most research laboratories and straightforward analysis routines. DDM has been applied to study an ever-broadening range of phenomena in soft and biological matter systems [3, 4, 6], including analysis of the dynamics of concentrated particle suspensions [7], motions of swimming bacteria [8], binary mixtures of molecular fluids [9], and the coarsening dynamics of phase separating colloidal gels [10]. While it is common to assume the material to be isotropic, anisotropic properties such as the viscoelasticity of nematic liquid crystals can also be extracted [11].

For a more comprehensive overview of DDM and its various applications, the interested reader is referred to various reviews on the topic [3, 4, 12]. This work is concerned with the development of a comprehensive statistical framework that aims at quantifying errors, reducing computational cost and enhancing the robustness of the analysis of differential dynamic microscopy (DDM) data. Similar developments have been made previously for MPT analysis [2, 13], and have greatly improved the robustness and algorithmic development of MPT in various applications. We therefore anticipate similar benefits from a more thorough investigation of uncertainty for DDM. To better motivate these developments, we first summarize the analysis procedure of DDM and the estimators employed to extract physical parameters, and highlight the features and limitations of DDM that inspired this study.

In DDM, a time sequence of image stacks represented by the intensity matrix $I(\mathbf{x}, t)$ is processed using a Fourier-based technique, where $\mathbf{x} = (x_1, x_2)$ denotes two spatial coordinates and $t \in [t_{\min}, t_{\max}]$. One first calculates the difference in intensity at each pixel location between two frames separated by a lag time $\Delta t$:

\[ \Delta I(\mathbf{x}, t, \Delta t) = I(\mathbf{x}, t + \Delta t) - I(\mathbf{x}, t). \tag{1} \]

The intensity differences are then Fourier transformed and the absolute values squared to obtain the normalized squared intensity function in Fourier space:

\[ |\Delta \hat{I}(\mathbf{q}, t, \Delta t)|^2 = |\mathcal{F}(\Delta I(\mathbf{x}, t, \Delta t))|^2, \tag{2} \]

where $\mathcal{F}(\cdot)$ denotes the operator of the 2D discrete Fourier transformation (DFT) and $\mathbf{q} = (q_1, q_2)$ is the wave vector in reciprocal space.

The ensemble average of Eq. (2) is computed to obtain the dynamic image structure function $D(q, \Delta t)$:

\[ D(q, \Delta t) = \langle |\Delta \hat{I}(\mathbf{q}, t, \Delta t)|^2 \rangle = \frac{1}{n_{\Delta t} n_q} \sum_{t \in S_{\Delta t}} \sum_{(q_1, q_2) \in S_q} |\Delta \hat{I}(\mathbf{q}, t, \Delta t)|^2, \tag{3} \]

where $\langle \cdot \rangle$ denotes averaging across all instances of $t \in S_{\Delta t}$ and $(q_1, q_2) \in S_q$, with $S_{\Delta t} = \{t : t_{\min} \le t \le t_{\max} - \Delta t\}$. The sizes of the two sets are denoted by $n_{\Delta t} = \# S_{\Delta t}$ and $n_q = \# S_q$, respectively. We only consider isotropic materials in this work, in which case $D(\mathbf{q}, \Delta t) = D(q, \Delta t)$, where $S_q = \{(q_1, q_2) : q_1^2 + q_2^2 = q^2\}$. However, the approach can be generalized to retain a multi-dimensional $\mathbf{q}$-dependence as desired [14].

The following representation is routinely used to relate observables commonly associated with scattering analysis to the observations of $D(q, \Delta t)$ [3, 5, 6]:

\[ D_o(q, \Delta t) = A(q)\,(1 - f(q, \Delta t)) + B(q, \Delta t), \tag{4} \]

where $A(q)$ is determined by the properties of the imaged material and imaging optics, $B(q, \Delta t)$ is determined by the noise of the detection chain, and the subscript 'o' denotes the observed value. As will be discussed in Sec. II A, the mean of $B(q, \Delta t)$ is a constant value shared across all $q$ and $\Delta t$ values, whereas the variance of $B(q, \Delta t)$ depends on the values of $q$ and $\Delta t$. The intermediate scattering function (ISF), $f(q, \Delta t)$, is in principle the same as that measured in conventional scattering measurements such as dynamic light scattering (DLS). The ISF quantifies how the dynamic structure decorrelates over the observed length scale $1/q$ in Fourier space and timescale $\Delta t$ in real space, which encodes the physical dynamics of the observed system. In general, for randomly-fluctuating, ergodic systems, $f(q, \Delta t \to 0) = 1$ and $f(q, \Delta t \to \infty) = 0$, and thus $D_o(q, \Delta t \to 0) = B(q, \Delta t)$ and $D_o(q, \Delta t \to \infty) = A(q) + B(q, \Delta t)$.

DDM's ultimate integration into the characterization workflows of a diverse range of systems is not without challenges. First, we note that, because DDM operates on a series of finite-exposure images taken of time-fluctuating processes, the measured DDM signal will contain inherent error related to both static and dynamic effects, much in the same way MPT incurs static and dynamic errors associated with the imaging process [13]. Because of this, the observed value $D_o(q, \Delta t)$ will in general not be equal to the "true" image structure function $D(q, \Delta t)$ that would be obtained from an ideal imaging system. Separating the signal from the noise in estimating the image structure function requires properly quantifying the uncertainty of the background noise that propagates through the analysis [5]. The observed intensity $D_o(q, \Delta t)$ typically overestimates $D(q, \Delta t)$, as the mean of the noise term $B(q, \Delta t)$ is positive, and is twice as large as the variance of the noise in the original images [3]. In this study, we show that it is critically important to obtain an accurate estimate of the mean of the noise term $B(q, \Delta t)$, which we denote as $B_{\mathrm{est}}$, to extract dynamic information from systems, such as the mean squared displacement, using DDM.

Several distinct methods to obtain the noise estimator $B_{\mathrm{est}}$ in DDM have been proposed in prior studies. However, to our knowledge, there has yet to be a detailed study of how the choice of estimator affects the estimation of dynamic properties. For instance, $B_{\mathrm{est}}$ has been assumed to be 0 [15], estimated by the minimum value of $D(q, \Delta t)$ at the temporal resolution $\Delta t_{\min}$, denoted as $D_{\min}(\Delta t_{\min})$ [6], as the average of the high-$q$ limit of the observed image structure function $\langle D_o(q_{\max}, \Delta t) \rangle_{\Delta t}$ [16, 17], or as the ensemble average of the static power spectrum $2\langle |\hat{I}_o(q_{\max}, t)|^2 \rangle_t$ [18]. Although a particular estimator of the noise term may work well under certain experimental conditions, we will show that all estimators are biased in general (see Section II A), and ultimately introduce a way to discern whether the bias is negligible, or if additional measurements are needed to estimate the noise. Indeed, in both simulated scenarios and real experiments, we found that the bias of noise estimation can substantially impact the estimation accuracy of system dynamics. Finally, we note that $B_{\mathrm{est}}$ is sometimes treated as a fitting parameter and estimated along with other parameters in the model of $f(q, \Delta t)$ [3, 19, 20]. This approach is applied to analyzing the active actin dynamics from [20] in Sec. IV C. In general, we find that estimation of the noise parameter is the most challenging among all parameters, and could lead to a poor fit to the observed values. The first contribution of this work is a formal analysis of error propagation and comparison of different estimators in representative experimental contexts.

After obtaining the estimate of the mean of the noise $B_{\mathrm{est}}$, the amplitude parameter $A(q)$ may be estimated by the plateau of intensity through $D_o(q, \Delta t \to \infty) = A(q) + B_{\mathrm{est}}$ [6, 16], or through the connection $A(q) + B_{\mathrm{est}} = \langle |\hat{I}_o(q, t)|^2 \rangle_t$ [5, 18, 21]. In practice, it may be difficult to accurately obtain $D_o(q, \Delta t \to \infty)$ as there are very few observations available for intensities at large $\Delta t$. Thus, we find that estimation of $A(q)$ through $A(q) + B_{\mathrm{est}} = \langle |\hat{I}_o(q, t)|^2 \rangle_t$ is often more reliable, as it does not require the intensity to reach a plateau at large $\Delta t$. These approaches will be compared using simulated and experimental observations.

A second challenge of DDM analysis is the computational cost: the most computationally expensive step for DDM is performing a 2D fast Fourier transformation of the pixel-wise difference for each of the image pairs ($T \times (T - 1)/2$ pairs, where $T \sim 10^3$–$10^4$ is the total number of time points). For $T$ images with size $N \times N$ pixels ($N \sim 10^2$–$10^3$), this requires $O(T^2 N^2 \log_2(N))$ computational operations. Although there has been recent progress in accelerated computation that takes advantage of contemporary computational efficiency for Fourier-based image analysis [7, 22], these approaches still require resolving the DDM signal over the full sampled space of $q$ and $\Delta t$. To overcome the computational challenge, we use a probabilistic approach to downsample the image stacks and reconstruct all image structure functions based on a fraction of observations, which dramatically reduces the computational cost of the required fast Fourier transform (FFT) algorithm.

The third, and often overlooked aspect, is the robustness of the algorithm(s) for decomposing $D_o(q, \Delta t)$ into its more physically meaningful components through Eq. (4). To ensure applicability to a wide range of materials, it is desirable to allow for an arbitrary form of $f(q, \Delta t)$, such that the method does not require prior knowledge of the system's dynamical properties. Moreover, the estimators used for $A(q)$ and $B(q)$ may contain bias, which can cause the algorithm to be less robust for both small and large $\Delta t$ in estimating $f(q, \Delta t)$ and quantities derived from it [6]. Here we generate the predictive samples based on the observed image structure function $D_o(q, \Delta t)$ and use the predictive median to derive physical parameters within the systems; this approach is more robust than a simple ensemble based on $D_o(q, \Delta t)$ at selected wave vectors.

An exemplifying context for the potential advantages gained by overcoming these limitations is the recent application of DDM to passive probe microrheology as an alternative to conventional approaches such as MPT [6, 23]. MPT-based passive probe microrheology involves imaging the Brownian fluctuations of embedded colloidal probes in order to resolve their mean square displacements (MSD) $\langle \Delta r^2(\Delta t) \rangle$ [1], which in the limit of homogeneous, uniform materials can be related to their linear viscoelastic moduli [24, 25]. DDM offers an alternative approach to MPT in estimating the MSD through the estimation of $f(q, \Delta t)$ using Eq. (4). DDM presents a number of advantages in this regard, including applicability and better statistical precision to low-signal or optically dense probes and materials [6, 17, 26, 27]. Importantly, since DDM requires no a priori user-input parameters associated with the probes or imaging system, it has the potential to provide automated, user-free analysis that could enable high-throughput characterization [12].

In this work, we make three contributions towards overcoming these challenges of DDM: (1) We relate the mean and variance of the error in the observed image structure function to the variance of the error in the original image intensities. By propagating the error, we show that there exists potential bias in different estimators of the noise and amplitude parameter in DDM, and we propose a new way to detect and reduce such bias. (2) We speed up the computation by using Gaussian process regression (GPR) [28] to overcome the computational bottleneck introduced by the Fourier transformations by subsampling the data at selected $\Delta t$ (Figure 1). (3) Finally, we use the median of the predictive samples from GPR to robustly estimate the ISF, MSD and other quantities of interest. Furthermore, we illustrate through a broad range of examples, both simulated and experimental, how the choice of the estimators impacts the accuracy of the resultant MSD and other quantities of interest. We demonstrate that accurate estimation of the noise in image intensities is critical for obtaining an accurate estimation of dynamical information in DDM. We make available a user-friendly MATLAB software package [29] that implements fast computational techniques with uncertainty quantification developed in this work.

II. METHODOLOGY

We name our algorithm differential dynamic microscopy with uncertainty quantification (DDM-UQ). The analysis routine is described schematically in Figure 1. First, image stacks $I(\mathbf{x}, t)$ are acquired with a microscope or are produced using a particle dynamics simulation algorithm. The variance of the background noise intensity, $\sigma_0^2$, is either assessed independently or estimated from the image stack. Then, a small subsample of a few percent of the image differences are Fourier transformed and squared to construct a set of observed quantities $D_o(q, \Delta t)$. Thereafter, GPR is fit to $D_o(q, \Delta t)$ to obtain a predictive distribution of $D(q, \Delta t)$. Finally, the predictive samples for each $q$ are used to obtain predictive samples for quantities of interest, such as $f(q, \Delta t)$ and $\langle \Delta r^2(\Delta t) \rangle$. Through analysis of a number of simulated and experimental data sets, we demonstrate that our approach not only reduces the computational time, but in many cases also improves the accuracy and robustness of estimation, as compared to previous DDM approaches and MPT.

[Figure 1: schematic pipeline. Measured quantity $I(\mathbf{x}, t)$, $\sigma_0^2$; processed quantity $\Delta \hat{I}_o(\mathbf{q}, t)$ (0.5–5% FT of $\Delta I(\mathbf{x}, \Delta t)$); observed quantity $D_o(q, \Delta t)$ (averaging); predictive sample $D(q^*, \Delta t^*) \mid D_o$ (estimating). Panels: image $I(\mathbf{x}, t)$ with axes $x_1$, $x_2$; Fourier-space image with axes $q_1$, $q_2$; $D(q, \Delta t)$ and $f(q, \Delta t)$ versus $\Delta t$ [s] for increasing $q$, used to quantify plateau and noise. Example models: MSD $\langle \Delta r^2(\Delta t) \rangle$, exponential time constant $\tau(q)$, $\gamma(q)$, other models. Applications: microrheology, active matter.]

FIG. 1. Schematic representation of DDM-UQ-based data reduction, sampling, and fitting procedure used to determine material constants. From the stack of images acquired by microscopy, around 0.5%–5% of the Fourier transforms are performed to obtain the observed image structure function $D_o(q, \Delta t)$, from which a predictive sample is generated to estimate $D(q^*, \Delta t^*)$ at unobserved $q^*$ and $\Delta t^*$, given $D_o(q, \Delta t)$. In the graphs of $D_o(q, \Delta t)$ versus $\Delta t$ and $f(q, \Delta t)$ versus $\Delta t$, at select $q$'s, the asterisks denote data selected for fitting, and the lines denote values at all $\Delta t$'s for a particular $q$. The shadow denotes the 95% predictive interval, which is small compared to the range of the change in $D(q, \Delta t)$ over the range of $\Delta t$. The predictive samples preserve the quantiles of the distribution after transformation and are then used to find material quantities of interest.

A. Error quantification

To develop a statistical approach to error quantification and analysis, we write the observed intensity as

\[ I_o(\mathbf{x}, t) = I(\mathbf{x}, t) + \epsilon(\mathbf{x}, t), \tag{5} \]

where $I(\mathbf{x}, t)$ is an (unknown) deterministic function of the observed sample and imaging system; $\epsilon(\mathbf{x}, t)$ is an independent random noise with mean zero and variance $\sigma_0^2$; the subscript "o" denotes the observed value. Several artifacts are known to impact the accuracy of DDM and are expected to contribute to $\epsilon(\mathbf{x}, t)$, including camera detection noise, edge effects arising from the finite field of view [30], and effects of finite exposure time [15]. Others are known to affect MPT, such as the depth of field [2] and finite pixel size [13], and are expected to impact DDM as well. Here we consider $\epsilon(\mathbf{x}, t)$ to be the difference between the measured signal and the "true" intensity $I(\mathbf{x}, t)$ at each pixel without regard to the actual physical origin of the error. We note that there are other known spatially or temporally correlated artifacts, such as illumination fluctuations, that do not satisfy the criteria assumed for $\epsilon(\mathbf{x}, t)$. These will not be considered in the present analysis.

We illustrate how the error in Eq. (5) propagates in the analysis of DDM. The derivation of Eqs. (6)-(10) is given in Appendix A. Assuming that Eq. (5) holds, we can express the observed squared intensity function in reciprocal space as

\[ |\Delta \hat{I}_o(\mathbf{q}, t, \Delta t)|^2 = |\Delta \hat{I}(\mathbf{q}, t, \Delta t)|^2 + 2\Delta \hat{I}(\mathbf{q}, t, \Delta t)\,\Delta \hat{\epsilon}(\mathbf{q}, t, \Delta t) + |\Delta \hat{\epsilon}(\mathbf{q}, t, \Delta t)|^2, \tag{6} \]

where the closed form expressions of $|\Delta \hat{I}(\mathbf{q}, t, \Delta t)|^2$, $\Delta \hat{I}(\mathbf{q}, t, \Delta t)\,\Delta \hat{\epsilon}(\mathbf{q}, t, \Delta t)$, and $|\Delta \hat{\epsilon}(\mathbf{q}, t, \Delta t)|^2$ are given in Eqs. (30)-(32) in Appendix A, respectively.

The expected value (i.e., mean) of $|\Delta \hat{I}_o(\mathbf{q}, t, \Delta t)|^2$ is given by

\[ E[|\Delta \hat{I}_o(\mathbf{q}, t, \Delta t)|^2] = |\Delta \hat{I}(\mathbf{q}, t, \Delta t)|^2 + 2\sigma_0^2. \tag{7} \]

Note that the mean of the cross-product term, $\langle \Delta \hat{I}(\mathbf{q}, t, \Delta t)\,\Delta \hat{\epsilon}(\mathbf{q}, t, \Delta t) \rangle$, is zero under the assumptions made for $\epsilon(\mathbf{x}, t)$.

By combining Eqs. (3) and (6), we can express the observations of the dynamic image structure function as follows:

\[ D_o(q, \Delta t) = \langle |\Delta \hat{I}(\mathbf{q}, t, \Delta t)|^2 \rangle + 2\langle \Delta \hat{I}(\mathbf{q}, t, \Delta t)\,\Delta \hat{\epsilon}(\mathbf{q}, t, \Delta t) \rangle + \langle |\Delta \hat{\epsilon}(\mathbf{q}, t, \Delta t)|^2 \rangle, \tag{8} \]

where $\langle \cdot \rangle$ denotes the ensemble average with respect to $\mathbf{q} \in S_q$ and $t \in S_{\Delta t}$. The mean of $D_o(q, \Delta t)$ follows:

\[ E[D_o(q, \Delta t)] = D(q, \Delta t) + 2\sigma_0^2. \tag{9} \]

Further assuming $\epsilon(\mathbf{x}, t) \sim \mathcal{N}(0, \sigma_0^2)$ independently, we can calculate the variance

\[ V[D_o(q, \Delta t)] = \frac{2\sigma_0^2}{n_q n_{\Delta t}} \left( 2\sigma_0^2 + 2D(q, \Delta t) + \max(0, (T - 2\Delta t)) \left( \frac{\sigma_0^2}{n_{\Delta t}} - \frac{2 S_{q, \Delta t}}{(T - 2\Delta t)\, n_{\Delta t} n_q} \right) \right), \tag{10} \]

where the expression of $S_{q, \Delta t}$ is given in Eq. (34) in Appendix A.

The result in Eq. (10) is intuitive: the ratio $\frac{1}{n_q n_{\Delta t}}$ arises from the fact that $D_o(q, \Delta t)$ is averaged from $n_q$ and $n_{\Delta t}$ observations of $|\Delta \hat{I}_o(\mathbf{q}, t, \Delta t)|^2$, as shown in Eq. (3), which decreases the variance. The other terms arise from the covariance between the sin and cos terms from the Fourier transform and from the repeated sampling of the same image in different $\Delta t$.
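The bias stated in Eq. (9) can be checked numerically on a toy image stack. The sketch below is illustrative only and is not the DDM-UQ MATLAB package: it assumes an orthonormal DFT scaling (so that each Fourier coefficient of the noise keeps variance $\sigma_0^2$), a hypothetical drifting-cosine "true" image, and made-up values for `N`, `T`, `LAG` and `SIGMA0`. All function and variable names are chosen here for illustration.

```python
import cmath
import math
import random


def dft_coefficient(image, q1, q2):
    """Orthonormal 2D DFT of a square image at the integer wave vector (q1, q2)."""
    n = len(image)
    total = 0j
    for x1 in range(n):
        for x2 in range(n):
            phase = -2j * math.pi * (q1 * x1 + q2 * x2) / n
            total += image[x1][x2] * cmath.exp(phase)
    return total / n  # 1/sqrt(n*n): a noise coefficient keeps variance sigma0^2


def structure_function(frames, lag, wave_vectors):
    """Eqs. (1)-(3): average |DFT of I(t+lag) - I(t)|^2 over t and the ring S_q."""
    n = len(frames[0])
    total, count = 0.0, 0
    for t in range(len(frames) - lag):
        diff = [[frames[t + lag][i][j] - frames[t][i][j] for j in range(n)]
                for i in range(n)]
        for q1, q2 in wave_vectors:
            total += abs(dft_coefficient(diff, q1, q2)) ** 2
            count += 1
    return total / count


random.seed(0)
N, T, LAG, SIGMA0 = 8, 300, 2, 1.0  # hypothetical toy values
# Ring S_q for q = 1: indices N-1 stand for -1 because the DFT is periodic.
RING = [(1, 0), (0, 1), (N - 1, 0), (0, N - 1)]

# Hypothetical deterministic signal: a cosine pattern drifting along x1.
true_frames = [[[math.cos(2 * math.pi * x1 / N + 0.3 * t) for x2 in range(N)]
                for x1 in range(N)] for t in range(T)]
# Observed frames follow Eq. (5): signal plus independent Gaussian noise.
noisy_frames = [[[value + random.gauss(0.0, SIGMA0) for value in row]
                 for row in frame] for frame in true_frames]

d_true = structure_function(true_frames, LAG, RING)
d_obs = structure_function(noisy_frames, LAG, RING)
print("D      =", d_true)
print("D_o    =", d_obs)
print("D_o - D =", d_obs - d_true, "(Eq. (9) predicts about", 2 * SIGMA0 ** 2, ")")
```

With a different DFT normalization the offset would scale accordingly, so the value $2\sigma_0^2$ should be read in the units implied by the transform convention in use.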

Note that by Eq. (9), an unbiased estimator of of the noise may be used. One goal of this exercise is to 2 D(q, ∆t) is Do(q, ∆t) − 2σ0, while using the observations illustrate the importance of noise quantification, and to Do(q, ∆t) alone typically overestimates the image struc- provide a way to detect potential bias in a wide range of 2 ture function by 2σ0 on average. Potential practical pro- scenarios. 2 cedures for estimating 2σ0 will be discussed later. For applications of DDM to microrheology, and as- We have shown that Do(q, ∆t) can be separated into a suming dilute probes with diffusive particle dynamics in- deterministic term of the signal, D(q, ∆t), and a random volving Gaussian displacements, we can relate the MSD term containing the noise and cross product of the noise h∆r2(∆t)i at each q to the image structure function and and signal. This representation can be related to Eq. (4) related quantities as follows [6, 32]: by letting f(q, ∆t) B(q, ∆t) = 2h∆Iˆ(q, t, ∆t)∆ˆ(q, t, ∆t)i+h|∆ˆ(q, t, ∆t)|2i,  q2h∆r2(∆t)i  α q4h∆r2(∆t)i2  (11) = exp − 1 + 2 + ··· , where the mean and variance 4 32 (14) 2 E[B(q, ∆t)] = 2σ0, (12) where the first order non-Gaussian parameter α = [B(q, ∆t)] = [Do(q, ∆t)], (13) 2 V V dhr4(∆t)i (d+2)h∆r2(∆t)i − 1 (d=dimensionality) is a measure of the where V[Do(q, ∆t)] is given in Eq. (10). We observe heterogeneity or non-diffusive dynamics of the sample that specifying B as 0 typically underestimates the mean [33]. As is common in microrheology, we assume the of the noise term. On the other hand, specifying B contribution of the non-Gaussian parameter is negligible as the average of the high-q limit of observed image [24], and thus ∆r2(q, ∆t) at each q may be estimated structure function hDo(qmax, ∆t)i∆t, or as the ensem- from D(q, ∆t) through the following approach: 2 ble average of static power spectrum 2h|Iˆo(qmax, t)| it [18], tends to overestimate the mean of the noise by an 4  A(q)  ∆r2 (q, ∆t) = ln . 
small amount hA(qmax)[1 − f(qmax, ∆t)]i∆t. Note that est 2 q A(q) − Do(q, ∆t) + B(q, ∆t) [−f(q , ∆t)] is close to 1 for large ∆t. Thus the bias max (15) is non-negligible when A(qmax) is large. Also note that A(q ) typically increases when the number of objects max It is common to assume sample ergodicity, and use the in the image or their peak intensity increases, when the ensemble average of ∆r2(q, ∆t) to estimate the mean image pixel size increases, or when the object size de- squared displacement h∆r2(∆t)i. However, we note that creases; the relationship of A(q) to some of these quan- the variance of D (q, ∆t) given in Eq. (10) is not the tities was considered in [31]. Indeed, we found that for o same across different values of q and ∆t. A simple en- a system with a large number of small objects, the bias semble average over wave vectors without addressing the can be large (Fig. A1, Appendix C). When the pixel weights due to different variances could introduce sub- size is large, such as in the case of the actively driven stantial bias in the estimation, when D(q, ∆t) at differ- system considered in Sec. IV C, one may also tend to ent wave vectors is not the same. We found that using overestimate the noise. Similarly, using D (q, ∆t ), min min the median, instead of the mean, can be more robust in i.e., the minimum value of intermediate scattering func- such estimations. Numerical comparison between these tion at the smallest ∆t, may also overestimate B in estimation approaches will be discussed in the context of these scenarios. We found that the overestimation by the simulated and experimental studies. B = D (q, ∆t ) may be smaller than the ones by est min min To illustrate the importance of proper error esti- ˆ 2 Best = hDo(qmax, ∆t)i∆t or Best = 2h|Io(qmax, t)| it, as mation in estimating MSD, we consider the outcomes the term (1 − f(q, ∆tmin)) can be close to zero. 
Further- using four different ways of estimating B, graphed more, both estimators may slightly underestimate the in Figure 2. If there exists ∆t , such that noise due to its stochastic nature. For instance, when q,min Do(q, ∆tq,min) = Dmin(∆tmin) for a given q, and B is the signal contained in Do(q, ∆tmin) is close to zero, the 0 estimated by Dmin(∆tmin) (red line), then from ensemble average across q s serves as a good estimator, Eq. (15), the argument of the natural log approaches whereas the minimum tends to underestimate the noise. 2 unity, and ∆r (q, ∆tq,min) → 0, which in turn drives In these cases, Do(q, ∆t) tends to underestimate the noise log (∆r2(q, ∆t )) → −∞, as shown. Choosing much less frequently than overestimating the noise. 10 q,min B = hDo(qmax, t)it (purple solid line) will similarly over- We summarize the limitations for each of the available estimate the noise at small ∆t, when A(q) does not suf- approaches in estimating the mean of the noise in DDM ficiently approach 0 at qmax. in Table I and offer a simple test to detect the bias in On the other hand, if B is estimated to be 0 (green Appendix C. In such a scenario, one may change exper- 2 solid line), then approximately log10(∆r (q, ∆t → 0)) → imental conditions, such as by reducing the number of h 4  A(q) i log10 2 ln 2 (an asymptotic value denoted by objects, or increasing the object size, reducing the pixel q A(q)−2σ0 size, etc. to reduce the bias; alternatively, the noise can the green dotted line), as the expected value of Do(q, 0), 2 be measured independently, or a more accurate estimator E(Do(q, 0)) = 2σ0. When we use the correct estimator 6

TABLE I. Potential bias in estimation approaches of noise Best in DDM.

Best Scenarios for non-negligible bias 0 underestimation in all scenarios hDo(qmax, ∆t)i∆t overestimation when particle number, peak intensity, or pixel size is large, or if is small 2 2h|Iˆo(qmax, t)| it overestimation in similar scenarios as the above Dmin(∆tmin) overestimation when (1 − f(q, ∆tmin)) is not close to zero in similar scenarios as the above

2 2 B = 2σ0 (blue solid line), the estimated ∆r (q, ∆t) → 0 monly quantified [13] and frequently subtracted from all when ∆t → 0. data to give a more realistic estimate of the MSD [35– 37]. We show that while many estimators have negligible 2 log10(Δtq,min) bias in approximating 2σ0 in many cases, other times it may be necessary to evoke an independently measured )) log (Δt ) t 10 q,max 2 Δ 2σ0, and we propose that noise characterization should

(q, be routine for DDM as well. 2 r Δ ( 10 B = 0 log B. Gaussian Process Regression

B = o max Δt The second challenge of DDM is the computational bottleneck that arises when performing a massive number of Fourier transformations. We overcome this problem by log10(Δt) representing the logarithm of image structure function by B= 2σ 2 0 a Gaussian process regression (GPR) approach on a frac- 4 A(q) log ln tion (0.5%–5%) of data. GPR is a widely used machine 10 q! A(q)"!σ 2 0 learning tool for estimating nonlinear, smooth response B = Dmin(Δtmin) surfaces and the predictions from GPR is the equivalence to the kernel ridge regression (KRR) estimator, [38–46]. FIG. 2. Illustration of how four different approaches to esti- We apply GPR to the logarithm of D (q, ∆), denoted mating B(q) influence the calculated MSD. For given q, the o D˜ o(θ) = ln(Do(θ)), with two input parameters: the nat- quantity ∆tq,min satisfies Do(q, ∆tq,min) = Dmin(∆tmin). ural logarithm of the wave vector (in reciprocal space) and time are denoted by (˜q, ∆t˜) = (ln(q), ln(∆t)) = θ. These asymptotic limits demonstrate the crucial im- After obtaining the predictive samples of the logarithm 2 portance of properly estimating the mean of B by σ0 of the image structure function, we transform it back to when extracting dynamical properties from DDM such obtain predictive samples for the image structure func- as the ISF or MSD, particularly at small ∆t, a result tion. Because the logarithm of the ISF is smoother, the that is further validated in simulated and experimental GPR approach works much better using logarithm of examples (see e.g. Fig. A1 and Fig. 4). A similar point Do(q, ∆) as observations. 
about $B$ was noted previously by other researchers: while uncertainty in $A(q)$ is considered to dominate the analysis because it pertains to the signal, overestimating $B(q)$ such as in [6] can lead to spurious results when computing $\langle \Delta r^2(\Delta t) \rangle$ [23], and hence other authors proposed an iterative scheme to solve for $\langle \Delta r^2(\Delta t) \rangle$. Treating $B$ as a fitting parameter was also commonly used in prior studies [5, 20, 34], although the optimized value of $B$ can be unstable in some scenarios, particularly if the weights of $D_o(q, \Delta t)$ are not properly accounted for (see e.g. Fig. 9). Through various simulations and real experiments in this study, we advocate that accurately estimating the noise parameter is crucial in DDM. In experiments, the noise term, $\sigma_0^2$, may be measured, typically through independent experiments using immobilized probes under identical imaging conditions, if a large bias of the estimator is detected. The variance of the image difference is then computed to give $\sigma_0^2$. In MPT, a “noise floor” is com-

Assuming that $n$ observations $\tilde{D}_o(\theta_i)$, $i = 1, 2, ..., n$, are used, predictions by GPR can be represented through the following optimization, which simultaneously penalizes mean squared errors of the estimation with respect to the observations, as well as the complexity of the estimation:

$$\tilde{D}_* = \underset{\tilde{D} \in \mathcal{H}}{\arg\min} \left\{ \frac{1}{n} \sum_{i=1}^{n} \left[\tilde{D}_o(\theta_i) - \tilde{D}(\theta_i)\right]^2 + \lambda \|\tilde{D}\|_{\mathcal{H}}^2 \right\}, \tag{16}$$

where $\|\cdot\|_{\mathcal{H}}$ denotes the reproducing kernel Hilbert space (RKHS) norm (or native norm) [28] that penalizes the complexity of the estimation to avoid overfitting and $\lambda$ is a regularization parameter. For any $\theta_*$, the solution of (16), known as KRR, is a weighted average of observations

$$\tilde{D}_*(\theta_*) = \mathbf{w}^T \tilde{\mathbf{D}}_o = \sum_{i=1}^{n} w_i \tilde{D}_o(\theta_i), \tag{17}$$

where $\mathbf{w} = (w_1, \ldots, w_n) = \mathbf{r}_{\theta_*}^T \tilde{\mathbf{R}}^{-1}$ is a row vector of weights, where $\tilde{\mathbf{R}} = \mathbf{R} + n\lambda \mathbf{I}_n$ with $\mathbf{I}_n$ being an identity matrix of size $n$ and $\mathbf{R}$ is an $n \times n$ correlation matrix with $(i, j)$th entry parameterized by a kernel function $K(\theta_i, \theta_j)$, and $\mathbf{r}_{\theta_*} = (K(\theta_1, \theta_*), ..., K(\theta_n, \theta_*))^T$ is the correlation between predictive output and observations.

Note that Eq. (17) is only a point estimator without giving assessment of uncertainty. One advantage of the GPR approach is that the uncertainty of estimation can be quantified in a probabilistic framework. We model the latent function $\tilde{D}_o(\cdot)$ by a Gaussian process with noises, meaning that any marginal distribution $\tilde{\mathbf{D}}_o = (\tilde{D}_o(\theta_1), ..., \tilde{D}_o(\theta_n))^T$ at $n$ inputs $\{\theta_1, ..., \theta_n\}$ follows a multivariate normal distribution:

$$\left( (\tilde{D}(\theta_1), ..., \tilde{D}(\theta_n))^T \mid \mathbf{m}, \tilde{\mathbf{R}}, \sigma^2 \right) \sim \mathcal{MN}\left(\mathbf{m}, \sigma^2 \tilde{\mathbf{R}}\right), \tag{18}$$

where $\mathbf{m} = (m(\theta_1), ..., m(\theta_n))^T$ is a vector of the mean [assumed to be a constant in this work, i.e. $\mathbf{m} = (m, ..., m)^T$] and $\sigma^2$ is a variance parameter.

The power (stretched) exponential covariance function and Matérn covariance function are often used for GPR [28]. For any two inputs $\theta_a = (\tilde{q}_a, \Delta\tilde{t}_a)$ and $\theta_b = (\tilde{q}_b, \Delta\tilde{t}_b)$, we use a product covariance function $\sigma^2 K(\theta_a, \theta_b) = \sigma^2 K_1(\tilde{q}_a, \tilde{q}_b) K_2(\Delta\tilde{t}_a, \Delta\tilde{t}_b)$, with $K_l(\cdot, \cdot)$, $l = 1, 2$, following a Matérn correlation with roughness parameter 5/2 such that

$$K_l(x_a, x_b) = \left(1 + \sqrt{5}\beta_l d + \frac{5\beta_l^2 d^2}{3}\right) \exp\left(-\sqrt{5}\beta_l d\right), \tag{19}$$

where $d = |x_a - x_b|$ for any real valued inputs $x_a$ and $x_b$ with inverse range parameter $\beta_l \in \mathbb{R}^+$, $l = 1, 2$. The sample path of the Gaussian process with Matérn correlation in (19) is twice differentiable and is often used as a default correlation in GPR [47].

The parameters in GPR $(m, \sigma^2, \boldsymbol{\beta}, \lambda)$ can be estimated by the maximum likelihood approach discussed in Appendix B. Plugging in the estimated parameters $(m_{\rm est}, \sigma^2_{\rm est}, \boldsymbol{\beta}_{\rm est}, \lambda_{\rm est})$, the predictive distribution of $\tilde{D}(\theta_*)$ at any $\theta_*$ follows a normal distribution [28]:

$$\left(\tilde{D}(\theta_*) \mid \tilde{\mathbf{D}}_o\right) \sim \mathcal{N}\left(\tilde{D}_{\rm est}(\theta_*), \hat{\sigma}^2 K_*(\theta_*, \theta_*) + \tilde{\sigma}_*^2\right), \tag{20}$$

where $\tilde{\sigma}_*^2$ is the variance of the noise at $\theta_*$, with

$$\tilde{D}_{\rm est}(\theta_*) = m_{\rm est} + \mathbf{r}_{\theta_*}^T \tilde{\mathbf{R}}^{-1} (\tilde{\mathbf{D}}_o - m_{\rm est} \mathbf{1}_n), \tag{21}$$

$$K_*(\theta_*, \theta_*) = \sigma^2_{\rm est} \left( K(\theta_*, \theta_*) - \mathbf{r}_{\theta_*}^T \tilde{\mathbf{R}}^{-1} \mathbf{r}_{\theta_*} \right). \tag{22}$$

We use the predictive median $\tilde{D}_{\rm est}(\theta_*)$ to predict the logarithm of the ISF at the unsampled $\theta_*$, and the predictive median of $D_o(\theta_*)$ can be obtained by transforming the predictive median of $\tilde{D}_o(\theta_*)$ through the exponential function. The predictive median in Eq. (21) is equivalent to KRR in Eq. (17) when the mean parameter is zero, $m_{\rm est} = 0$. Here the uncertainty of predictions and predictive samples can be obtained by the predictive distribution in Eq. (20).

C. Predictive sampling by a downsampled data set

After obtaining the image intensities, DDM-UQ analysis starts by selecting a fraction of the $\Delta t$’s equally spaced logarithmically along the input space coordinates to compute $D_o(q, \Delta t)$ as shown in Fig. 3(a). Here, we choose not to downsample observations at wave vectors ($q$), but the approach can be extended to $q$ as well, if one wishes to further reduce the computational cost. The Fourier transformation is only performed on this reduced set of image differences, allowing for fast, high-throughput analysis. Since an estimate of the plateau of the image structure function at long times is often required by the analysis, we ensure that at least five design points lie on the interval $[0.7\Delta t_{\max}, 0.9\Delta t_{\max}]$ on the $\Delta\tilde{t}$-axis for each chosen $\tilde{q}$ [Fig. 3(b)]. For all numerical results analyzed in this study, we only use $D_o(q, \Delta t)$ at 25 $\Delta t$ points in DDM-UQ, which represents around 0.5% to 5% of total observations. The predictive median of $D_o(q, \Delta t)$ is smoother than the observed $D_o(q, \Delta t)$ since spurious noise is effectively filtered out [see e.g. the black curves and colored curves in Fig. 3(b)].

Note that our goal is to relate $D_o(q, \Delta t)$ to other quantities of interest, such as the intermediate scattering function $f(q, \Delta t)$ or mean squared displacement $\langle \Delta r^2(\Delta t) \rangle$, both of which are nonlinear transformations of $D_o(q, \Delta t)$. To achieve this, we sample image structure functions from the predictive distribution in Eq. (20) and transform the samples to obtain other quantities of interest. The transformed predictive samples can be used to estimate the predictive interval of quantities of interest at a given set of inputs $(q, \Delta t)$. In this work, we sample 1000 observations, denoted as $D_s(q, \Delta t)$ for $s = 1, 2, ..., 1000$ at any $(q, \Delta t)$, and transform $D_s(q, \Delta t)$ to obtain other quantities of interest, such as the intermediate scattering function $f_s(q, \Delta t)$ via Eq. (4) after estimating $B_{\rm est}$ and $A_{\rm est}(q)$. The determination of these two estimators is discussed in Sec. III A. For each $(q, \Delta t)$, the 95% predictive interval of $f(q, \Delta t)$ can be estimated by the lower 2.5% quantile and the upper 2.5% quantile of the transformed predictive samples $f_s(q, \Delta t)$ for $s = 1, 2, ..., 300$.

Since the predictive samples of $D(q, \Delta t)$ preserve information such as quantiles of distribution for any transformation, transforming these samples can be used to estimate the ISF $f(q, \Delta t)$ and MSD $\langle \Delta r^2(\Delta t) \rangle$ as shown in Figs. 3(c) and 3(d). The predictive median is used for estimating MSD at each $\Delta t$ as it is typically more robust than the mean. Conditional on the observed values $D_o(q, \Delta t)$, the predictive samples of $f(q, \Delta t)$ and $\langle \Delta r^2(\Delta t) \rangle$ can be used to assess estimation uncertainty as well.
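The predictive equations (19), (21) and (22), together with the sample-then-transform step described above, can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the authors' MATLAB implementation: the function names (`matern52`, `product_corr`, `gpr_predict`, `predictive_interval`) are hypothetical, the hyperparameters are passed in by hand rather than estimated by maximum likelihood, the noise variance is supplied as a plain number, and sampling is done pointwise at each input.

```python
import numpy as np

def matern52(xa, xb, beta):
    """Matern correlation with roughness 5/2 between 1-D inputs, Eq. (19)."""
    d = np.abs(np.asarray(xa, float)[:, None] - np.asarray(xb, float)[None, :])
    s = np.sqrt(5.0) * beta * d
    return (1.0 + s + s**2 / 3.0) * np.exp(-s)

def product_corr(theta_a, theta_b, beta_q, beta_t):
    """Product correlation K1(q_a, q_b) * K2(dt_a, dt_b) for inputs theta = (q, dt)."""
    return (matern52(theta_a[:, 0], theta_b[:, 0], beta_q)
            * matern52(theta_a[:, 1], theta_b[:, 1], beta_t))

def gpr_predict(theta, d_obs, theta_star, m, sigma2, beta_q, beta_t, lam):
    """Predictive mean, Eq. (21), and K_*(theta_*, theta_*), Eq. (22)."""
    n = len(d_obs)
    R_tilde = product_corr(theta, theta, beta_q, beta_t) + n * lam * np.eye(n)
    r = product_corr(theta, theta_star, beta_q, beta_t)  # shape (n, n_star)
    mean = m + r.T @ np.linalg.solve(R_tilde, d_obs - m)
    # K(theta_*, theta_*) = 1 because K is a correlation function.
    k_star = sigma2 * (1.0 - np.sum(r * np.linalg.solve(R_tilde, r), axis=0))
    return mean, np.maximum(k_star, 0.0)

def predictive_interval(mean, var, n_samples=1000, transform=np.exp, seed=0):
    """Sample the normal predictive distribution pointwise, transform the
    samples, and return the 2.5%, 50% and 97.5% quantiles."""
    rng = np.random.default_rng(seed)
    samples = mean + np.sqrt(var) * rng.standard_normal((n_samples, mean.size))
    return np.quantile(transform(samples), [0.025, 0.5, 0.975], axis=0)

# Toy usage on synthetic log-scale data (all values are hypothetical).
q, dt = np.meshgrid(np.linspace(0, 1, 4), np.linspace(0, 1, 5), indexing="ij")
theta = np.column_stack([q.ravel(), dt.ravel()])
d_obs = np.sin(2 * theta[:, 0]) + theta[:, 1] ** 2
mean, k_star = gpr_predict(theta, d_obs, theta, m=d_obs.mean(),
                           sigma2=1.0, beta_q=5.0, beta_t=5.0, lam=1e-10)
lo, med, hi = predictive_interval(mean, k_star + 1e-4)  # 1e-4: assumed noise variance
```

With a nearly zero regularization parameter the predictive mean interpolates the observations, which is a convenient sanity check before moving to real data. Replacing `transform` with the map from the image structure function to the ISF or MSD would give predictive intervals for those quantities in the same way.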

FIG. 3. (a) The design points (blue circles) are selected along the logarithmically scaled coordinates $(q, \Delta t)$, requiring only a fraction of the Fourier transformations to compute $D_o(q, \Delta t)$. (b) Predictive median and predictive samples by GPR for representative values of $q$. The variance of the noise is proportional to the inverse of the number of pixels in the ensemble. 300 predictive samples for each $q$ were generated to obtain the 95% credible interval (gray shadow). The observations used for GPR are plotted as black dots. The full observations (black lines) overlap with the predictions (colored lines). (c) The predictive samples for the intermediate scattering function are plotted against $\Delta t$. (d) Predictive mean squared displacement is plotted against $\Delta t$ using all three methods. The 95% predictive interval is shown with the error bar for DDM-UQ. The inset shows a snapshot of the experiment with MPT trajectories overlaid. Plots (b)–(d) are derived from a movie with 1 µm probe particles diffusing in a viscous fluid.
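The design-point selection shown in panel (a) can be illustrated with a short sketch. The text only states that the lags are logarithmically spaced and that at least five of them fall in $[0.7\Delta t_{\max}, 0.9\Delta t_{\max}]$; how the two requirements are combined is not specified, so the strategy below (reserve five evenly spaced lags in that interval, then fill the remainder from a logarithmic grid) and the name `select_design_lags` are assumptions made for illustration.

```python
import numpy as np

def select_design_lags(dt_max, n_design=25, n_plateau=5):
    """Pick integer lag times: log-spaced overall, with n_plateau lags
    forced into the interval [0.7 * dt_max, 0.9 * dt_max]."""
    lo, hi = int(np.ceil(0.7 * dt_max)), int(np.floor(0.9 * dt_max))
    plateau = np.unique(np.round(np.linspace(lo, hi, n_plateau)).astype(int))
    n_log = n_design - plateau.size
    k = n_log
    while True:
        # Rounding to integers creates duplicates at small lags, so the
        # log grid is refined until enough distinct lags are available.
        grid = np.unique(np.round(np.logspace(0, np.log10(dt_max), k)).astype(int))
        grid = np.setdiff1d(grid, plateau)
        if grid.size >= n_log or k >= 50 * n_design:
            break
        k += 1
    if grid.size > n_log:
        keep = np.round(np.linspace(0, grid.size - 1, n_log)).astype(int)
        grid = grid[keep]
    return np.union1d(grid, plateau)

# Hypothetical usage: a 1000-frame movie has lags 1..999.
lags = select_design_lags(999)
```

For a 1000-frame movie, 25 of 999 lags is about 2.5% of the lag times, which sits inside the 0.5% to 5% range quoted in the text.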

III. VALIDATION THROUGH SIMULATION

A. Image formation and analysis approach

To validate our methodology, we first employ simulations of a time series of images demonstrating particle motion. This approach has the significant advantage that the true particle motion is known a priori and the true MSD has a closed form expression, thus allowing quantitative comparison with results obtained using DDM, DDM-UQ, and MPT, the latter obtained using an open source tracking algorithm [1, 48]. Moreover, it is possible to investigate systematically how different sources of noise influence algorithm performance, and whether different functional forms of $\langle \Delta r^2(\Delta t) \rangle$ perform differently.

We first examine simulated video images of particles in motion, each with a Gaussian intensity profile with peak intensity $I_c = 255$ and standard deviation of $\sigma_p = 2$ pixels. In principle, the intensity recorded at a single pixel $I_p(\mathbf{x})$ could arise from intensity contributions of multiple particles in the vicinity. The contribution of the $j$-th particle located in $\mathbf{x}_j(t)$ is given by:

$$I_p(\mathbf{x}, \mathbf{x}_j(t)) = I_c \exp\left(-\frac{(\mathbf{x} - \mathbf{x}_j(t))^2}{2\sigma_p^2}\right). \tag{23}$$

To account for noise in the background intensity signal, a time-varying, random uniform noise $I_b(\mathbf{x}, t)$, centered around zero, in the range $[-10, 10]$ is added. Thus, we may compute $\sigma_0^2$ as the variance of the background noise $I_b(\mathbf{x}, t)$: $\sigma_0^2 = \frac{20^2}{12} \approx 33.3$. The signal at a time $t$ is then the sum of signals attributed to all particles as well as the background:

$$I(\mathbf{x}, t) = I_b(\mathbf{x}, t) + \sum_{j=1}^{n_p} I_p(\mathbf{x} - \mathbf{x}_j(t)). \tag{24}$$

Note that the pixel intensities in simulation are not subjected to a cut-off ceiling value as are those obtained in imaging (e.g. 0-255 for an 8-bit image). Moreover, unlike MPT, where the relative brightness of particle and background significantly affects tracking precision [13], in DDM signal quality depends sensitively on the magnitude of the image difference. Keeping this context in mind, we simulate particles with brightness $I_p(\mathbf{x} - \mathbf{x}_j(t))$ that does not vary with time, and we vary the step size by which the particles move in each time step instead. It is well-known that DDM performance deteriorates, and can completely break down, in the limit of small probe displacements [6]. For diffusive particles taking a step with a variance $\sigma_s^2$, we compare the performance of DDM, DDM-UQ, and MPT in calculating the MSD values.

The DDM-UQ analysis represents data obtained using our proposed approach based on the downsampled Fourier transformation of $\Delta I(\mathbf{x}, \Delta t)$’s with GPR and predictive samples. For all numerical results that extract the MSD using DDM-UQ, we estimate $B_{\rm est}$ to be the minimum of $D_{\min}(\Delta t_{\min})$ and $\langle D_o(q_{\max}, \Delta t) \rangle_{\Delta t}$, to reduce the overestimation bias, as summarized in Table I. The estimate of the amplitude parameter $A_{\rm est}(q)$ is obtained by $A_{\rm est}(q) + B_{\rm est} = 2\langle |\hat{I}_o(q, t)|^2 \rangle_t$, as it applies to all wave vectors, regardless of whether or not a plateau is reached at the given $\Delta t$. Furthermore, extracting the MSD from the ISF at extremal wave vectors could lead to large errors. This is because the intensities of only a few pixels are used to calculate the ISF at small wave vectors, leading to large uncertainty in estimation. On the other hand, $A(q)$ can approach zero at large wave vectors, leading to an unstable estimate of the MSD in this regime. To avoid these extremes, DDM-UQ uses $D_o(q, \Delta t)$ based on 80 intermediate $q$ values, typically ranging from the 4th up to the 83rd largest $q$’s.

Additionally, we record the MSD estimates only for $\Delta t$ values that are no larger than 90% of the estimated plateau values according to Eq. (15). In this limit, the denominator is sufficiently different from zero to ensure a small variance of the estimation. Finally, we truncate the estimation of MSD at those $\Delta t$ where fewer than 10 wave vectors are available, to avoid selection bias in estimation when sampling is limited. Other approaches to truncating the wave vector range and lag time based on the standard deviation of the data were explored in prior DDM analyses [6]. Weighting $D(q, \Delta t)$ based on the variance of the data without truncation of wave vectors and time points could be an efficient way to reduce selection bias while allowing robust estimation of the extracted MSD at longer $\Delta t$. This is a potential future topic for research.

For DDM analysis in practice, we perform the Fourier transformation of all values of $\Delta I(\mathbf{x}, \Delta t)$, and estimate $B(q)$ using the average of $D(q_{\max}, \Delta t)$ over all $\Delta t$’s at $q_{\max}$, $\langle D(q_{\max}, \Delta t) \rangle$. Our first simulated case contains 800 small particles (Fig. 4), whereas we include 50 moderately large particles in the latter 6 simulations. As shown in Figure A1, all estimators of the mean of the noise $B$ for the six latter simulated examples are very similar, as expected. We then test how the DDM analysis is affected by estimating $A(q)$ in two different ways. We estimate $A(q)$ either (I) from the plateau, following the procedure in [6], or (II) using the static power spectrum (i.e. $A_{\rm est}(q) + B_{\rm est} = 2\langle |\hat{I}_o(q, t)|^2 \rangle_t$) per [16, 17]. We largely follow the procedures defined by [6, 31] to select the wave vectors and $\Delta t$ values for estimation and data selection for both DDM analysis approaches.

Also included is the MPT approach, which was performed by locating particles in each frame, and searching in the vicinity to link trajectories of individual particles. The localization error is not characterized, given the high particle intensity compared to the background intensity in all cases investigated. Note that for certain cases, such as optically dense cases of simple diffusive processes compared in Sec. III B, MPT cannot identify the large number of probes, so MPT results are not compared in these two studies.

In detail, in all cases, we generated videos of the motion of $n_p = 50$ particles, except for the case described in Sec. III B, where $n_p = 800$. The simulation box is a 2D square with sides $L = 480$. The movie spans time steps $t = 1, 2, ..., 1000$. We further imposed displacements in each successive time step $\Delta x_{i,j}(t) = x_{i,j}(t + 1) - x_{i,j}(t)$ (where $i = 1, 2$ stands for the $x_1$ or $x_2$ direction in the Cartesian coordinates in 2D).

We construct three scenarios that represent the general features observed in a broad range of experiments: simple diffusion, diffusion with drift, and constrained diffusion within a harmonic potential well (i.e., an Ornstein–Uhlenbeck process). These scenarios will result in distinct shapes of $\langle \Delta r^2(\Delta t) \rangle$ and highlight various challenges to each analytical approach. The derivations of the expected values of the MSDs for all simulated scenarios are given in Appendix D.

As a metric of the accuracy of a given analysis method, we compare the normalized root mean squared error (N-RMSE) for the estimated MSD relative to the known true MSD:

$$\text{N-RMSE} = \frac{\sqrt{\frac{1}{n_{\Delta t}} \sum_{\Delta t \in \Delta T} \left[\langle \Delta \tilde{r}^2(\Delta t) \rangle - \langle \Delta \tilde{r}^2_{\rm est}(\Delta t) \rangle\right]^2}}{\tilde{\sigma}_r}, \tag{25}$$

where $\langle \Delta \tilde{r}^2(\Delta t) \rangle$ is the logarithm of the true MSD with base 10, $\langle \Delta \tilde{r}^2_{\rm est}(\Delta t) \rangle$ is the corresponding estimate using DDM with two different $A(q)$ estimators, DDM-UQ or MPT, and $\tilde{\sigma}_r$ is the sample standard deviation of the logarithm of the true MSD with base 10. In practice, not all $\Delta t$ values are available for every method, due to large fluctuations at large $\Delta t$ values, and this provides a limit to the total $\Delta t$ range captured in each case. To ensure that the four methods are evaluated on the same test set $\Delta T$ and to ease quantitative comparisons, we determine the usable range of $\Delta t$’s by the smallest maximum $\Delta t$ available among these methods. The N-RMSE of different simulated cases is summarized in Table II.

B. Validating the use of $\sigma_0^2$ as an estimator for $B$

We have shown that the mean of $B$ is $2\sigma_0^2$, and that the estimation of $\sigma_0^2$ is critically important to the analysis of DDM data. To illustrate this point, we first show the results of a simulation of $n_p = 800$ particles with $\sigma_p = 0.5$ moving in a purely viscous fluid. We generate movies demonstrating simple Brownian motion: at each time step, $\Delta x_{i,j}(t) \sim \mathcal{N}(0, \sigma_s^2)$ independently, where $\sigma_s$ represents the step size with units of pixels. The expected MSD of the Brownian motion is $E(\langle \Delta r^2(\Delta t) \rangle) = 2\sigma_s^2 \Delta t$. For 2D diffusive motion, we thus expect $\langle r^2(\Delta t) \rangle = 4 D_m \Delta t$, where $D_m$ is the diffusion coefficient. $D_m$ can be associated with the step size by $D_m = \frac{\sigma_s^2}{2}$. For $\sigma_s^2 = 4$ and $D_m = 2$, the truth is $\langle r^2(\Delta t) \rangle = 4 D_m \Delta t = 8\Delta t$.

This simulation represents an optically dense case with a large number of small particles, which can induce a large bias in estimating the noise (Table I). We demonstrate this by comparing the estimation of noise by $B = D_{\min}(\Delta t_{\min})$, $B = \langle D_o(q_{\max}, \Delta t) \rangle_{\Delta t}$ and $B = 2\langle |\hat{I}_o(q_{\max}, t)|^2 \rangle_t$ to the truth $B = 2\sigma_0^2$ (horizontal line) as shown in Fig. A1. We found that all three estimators overestimate the mean of $B$ in the analysis. The first approach, $B = D_{\min}(\Delta t_{\min})$, has the smallest bias among the three, as the term $1 - f(q, \Delta t_{\min})$ in Eq. (4) is relatively small at the smallest $\Delta t$.

Next, we calculated the MSDs by DDM-UQ using each of the following ways for estimating the noise, as shown in Figure 4: $B = D_{\min}(\Delta t_{\min})$ (red diamonds), $B = \langle D_o(q_{\max}, \Delta t) \rangle_{\Delta t}$ (purple squares), $B = 2\sigma_0^2$ (blue circles) and $B = 0$ (green triangles). The truth is plotted as the thick black line. Our results reveal that estimating the noise accurately is necessary to obtain an accurate estimate of MSD, whereas overestimating or underestimating the noise leads to an inaccurate estimate, particularly for sufficiently small $\Delta t$, where the displacements are smaller and therefore more strongly impacted by a poorly estimated noise term.

A simple way to detect the estimation bias of $B$ is to plot $D_o(q, \Delta t)$ as a function of $q$ for a given $\Delta t$ (here selected to be $\Delta t_{\min}$) across all $q$ values, as shown in Fig. A2. If the signal $A(q)(1 - f(q, \Delta t))$ does not approach zero at the highest $q$ values, then $D_o(q, \Delta t)$ continues to decrease as $q$ increases. We found this to be the case for $D_o(q, \Delta t)$ for the optically dense case (with $n_p = 800$ probes, shown as filled circles, Fig. A2). Hence, estimating $B$ using $D_o$ at $q_{\max}$ introduces non-negligible bias in this example. In contrast, when there are only $n_p = 50$ particles (shown as open circles), the $D_o(q, \Delta t)$ approaches zero at high $q$ and the bias for $B$ is negligible across all methods of estimation (Fig. A1). To overcome the bias induced by the estimator, one may adjust the experimental conditions to avoid the scenarios summarized in Table I, or may attempt to measure the noise using a separate sample with probes immobilized in a solid matrix under similar imaging conditions. Deriving a more accurate estimator of the noise in these experimentally challenging scenarios will be an interesting future direction.

FIG. 4. The ensemble-averaged MSD calculated from the simulated 2D Brownian motion of 800 particles. Estimation of MSD by DDM-UQ with four different ways of estimating the mean of the noise. The error bars denote the 95% predictive interval of DDM-UQ with $B = 2\sigma_0^2$. The truth of $\langle \Delta r^2(\Delta t) \rangle = 8\Delta t$ is denoted as the black line. The inset shows the initial position of particles, where particles are enlarged for better visualization. Note that it is not possible to perform MPT due to the large number of particles moving within the frame.

C. Simple diffusion with different step size $\sigma_s$

To explore the effects of varying step size (which is a proxy for varying diffusivity) on our analysis, three additional scenarios with diffusive dynamics are explored, using simulations of particles taking different step sizes $\sigma_s$, but for which all other settings and conditions were held constant. When $\sigma_s = 2$, corresponding to an intermediate step size, all four methods (DDM with two different estimators, DDM-UQ and MPT) provide results that reasonably approximate the true values that are directly calculated from the inputted particle positions [Fig. 5(a)]. When the N-RMSE values are calculated and compared, we find that the results from MPT provide the best approximation of the true values for this case, whereas DDM-UQ provides nearly the same level of accuracy as MPT (Table II).

We next calculated the MSDs for simple diffusion with lower ($\sigma_s = 0.5$) and higher ($\sigma_s = 4$) step sizes [Figs. 5(b) and 5(c)]. The largest differences are observed at high step sizes [Fig. 5(c)], where particle displacements are large ($\sigma_s = 4$). In this limit, DDM-UQ outperforms MPT by a large margin (Table II). The reason is intuitive for MPT: as particle displacement becomes large, the likelihood of two or more particles exchanging positions within the search radius increases significantly. This can lead to the algorithm misidentifying the particle, thereby resulting in erroneous linking of the trajectories.

D. Diffusion with drift

We next consider particles subjected to diffusive-convective motion (i.e. “drift”). At each time step, a particle moves by $\Delta x_{i,j} \sim \mathcal{N}(\mu_D, \sigma_s^2)$, resulting in a random walk with step size $\sigma_s = 0.56$ superimposed on a deterministic drift with mean velocity $\mu_D = 0.1$ in units of pixels/time step in the same direction for all particles. The expected value of the MSD in this case is $E[\langle \Delta r^2(\Delta t) \rangle] = 2\sigma_s^2 \Delta t + 2\mu_D^2 \Delta t^2$. At short $\Delta t$, the motion is primarily diffusive [Fig. 5(d)], $\langle \Delta r^2(\Delta t) \rangle \sim \Delta t$, with a transition to convective motion, $\langle \Delta r^2(\Delta t) \rangle \sim \Delta t^2$, at large $\Delta t$. This results in an increasing value of $\frac{d\langle \Delta r^2(\Delta t) \rangle}{d\Delta t}$ with $\Delta t$, and permits a measure of the relative strength of the two dynamic processes through measure of the local slope on a log-log scale.

TABLE II. N-RMSE of simulated cases. The analysis mode with the lowest N-RMSE in each scenario is marked with an asterisk.

Scenario                                 DDM ($A(q)$ from plateau)   DDM ($A(q)$ from $\langle |\hat{I}_o(q, t)|^2 \rangle_t$)   DDM-UQ   MPT
Simple diffusion, $\sigma_s = 2$         0.109                       0.047                                                      0.025    0.024*
Simple diffusion, $\sigma_s = 0.5$       0.396                       0.034*                                                     0.057    0.129
Simple diffusion, $\sigma_s = 4$         0.130                       0.095                                                      0.014*   0.492
Diffusion with drift                     0.061                       0.052                                                      0.074    0.026*
Diffusion with drift, optically dense    0.041                       0.049                                                      0.029*   0.424
O-U process with drift                   0.243                       0.063*                                                     0.183    0.213

We also compare the performance of the different analytical routines under differing initial conditions. In particular, we vary the initial positions where particles were released at $t = 1$, holding all other settings equal, to compare an optically dilute scenario [Fig. 5(d)], where particles were uniformly distributed throughout the simulation box, and an optically dense scenario [Fig. 5(e)], where particles were released from a small $\frac{L}{20} \times \frac{L}{20}$ square in the middle of the frame, causing them to be near each other and even overlap for some frames. When particles are evenly distributed in the simulation box, all methods closely track the truth, with MPT having the lowest N-RMSE in the optically sparse scenario, whereas DDM-UQ performs the best in the optically dense scenario. The optically dense scenario mimics situations where a high concentration of particles is present, which is known to lead to tracking issues in MPT since it is often difficult to distinguish particles in close proximity.

By contrast, DDM and DDM-UQ perform strongly in this limit (Table II). The range of $\Delta t$ that can be resolved by any method decreases as compared to the dilute case, due to the large variability of MSD at different wave vectors in this scenario.

E. Ornstein–Uhlenbeck process with drift

Finally, we simulated particles from an Ornstein–Uhlenbeck (O-U) process with drift. Such a process mimics thermally-driven particle motion in an effective elastic medium with drifts distinct to each particle. The convective term is constant in magnitude $\mu_D$, and fixed in direction $\theta$ for an individual particle, but randomized for all particles. A pure O-U process without a convective term can lead to a $\langle \Delta r^2(\Delta t) \rangle$ that plateaus at a certain $\Delta t$, and thus $f(q, \Delta t)$ does not decorrelate for some finite values of $\Delta t$. Adding a convective term leads to complete decorrelation and allows the dynamics at these $\Delta t$’s to be captured. Moving the sample to achieve ensemble averaging for samples that manifest constrained heterogeneity has previously been applied to light scattering on polymer gel samples [49].

Here, just like the original O-U process, successive steps have a weak correlation with previous steps:

$$X_{i,j}(t + 1) = \rho X_{i,j}(t) + \epsilon_{i,j,t}, \tag{26}$$

where $\epsilon_{i,j,t} \sim \mathcal{N}(0, \sigma_s^2(1 - \rho^2))$, with $\rho = 0.95$, $\sigma_s = 5$ and $X_{i,j}(t) = x_{i,j}(t) - (t - 1)\mu_{ij} - x_{i,j}(1)$ for any $t$; the indices represent the $j$-th particle and $i$-th direction ($i = 1, 2$) and $\mu_{1j} = \mu_D \cos\theta_j$, $\mu_{2j} = \mu_D \sin\theta_j$, respectively, with $\mu_D = 0.02$. Each particle’s initial position was generated by a normal distribution centered around $x_{ij}(t_0)$ with variance $\sigma_s^2$, i.e. $x_{ij}(t_1) \sim \mathcal{N}(x_{ij}(t_0), \sigma_s^2)$, where $x_{ij}(t_0)$ are randomly distributed within a square $\frac{3L}{4} \times \frac{3L}{4}$ in the middle of the simulation box to reduce the likelihood that a particle moves out of the frame during the simulation. The motion is subject to an attractive potential towards $x_{ij}(t_0) + (t - 1)\mu_{ij}$ with drift $(\mu_D \cos(\theta_j), \mu_D \sin(\theta_j))^T$ at the $t$th time point for the $j$th particle. The expected value of the MSD for the O-U process with drift is $E[\langle \Delta r^2(\Delta t) \rangle] = 4\sigma_s^2(1 - \rho^{\Delta t}) + \mu_D^2 \Delta t^2$. There is a diffusive contribution that dominates at sufficiently small $\Delta t$ from the first term, $1 - \rho^{\Delta t} \approx 1 - (1 + \ln(\rho)\Delta t) = \ln(\rho^{-1})\Delta t$, by Taylor expansion at small $\Delta t$. The second term captures the convective flow, and can represent the type of dynamics observed in some actively driven systems [20, 50].

The process results in a trace of MSD versus $\Delta t$ with multiple inflection points: $\langle \Delta r^2(\Delta t) \rangle$ first grows with decreasing slope until the plateau value is reached ($\frac{d\langle \Delta r^2(\Delta t) \rangle}{d\Delta t} \approx 0$); at much longer $\Delta t$’s, the MSD will grow with an increasing slope until eventually the convective motion dominates ($\frac{d\langle \Delta r^2(\Delta t) \rangle}{d\Delta t} = 2$). In contrast to all previous scenarios, we find that $D(q, \Delta t)$ exhibits non-monotonic behavior at some wave vectors, and that the MSD shows apparent $q$-dependence over $\Delta t$ for many intermediate $q$ values. Specifically, a local maximum in $D(q, \Delta t)$ appears before $D(q, \Delta t)$ reaches the plateau at long $\Delta t$. Such non-monotonic behavior in the resulting $f(q, \Delta t)$ has previously been observed in a system consisting of self-catalytic Janus particles [15].

We find that all approaches are able to capture the general trends and magnitude of motion [Fig. 5(f)]. At short $\Delta t$, the response of all four methods is similar. We note that this scenario constitutes an “ideal” case for MPT: the particle locations are sparse, and each particle moves toward its respective attractive center, $x_{ij}(t_0)$, never to cross paths with another. Yet DDM-UQ still outperforms MPT in this case, even though it can be shown that some marginal benefit can be gained by using the full data set. Still, with limited sampling, DDM-UQ has performance on par with DDM and MPT, evaluated by the N-RMSE in Table II.

FIG. 5. Comparisons of MSDs calculated using DDM-UQ, DDM, and MPT for various simulated scenarios: (a)–(c) Simple diffusion with (a) $\sigma_s = 2$, (b) $\sigma_s = 0.5$ and (c) $\sigma_s = 4$, corresponding to intermediate, low, and high step sizes. (d)–(e) Diffusion with drift in (d) optically dilute and (e) optically dense samples, and (f) diffusion of a particle within a harmonic potential well (Ornstein–Uhlenbeck process) subjected to drift. The pink diamonds and the cyan triangles indicate mean values obtained by estimating $A(q)$ from the plateau in $D(q, \Delta t)$, or from $\langle |\hat{I}_o(q, t)|^2 \rangle_t$, respectively, using the DDM algorithm based on all values of $D(q, \Delta t)$. The blue circles and error bars depict the mean and 95% predictive interval estimated by DDM-UQ based on 1% of the $D(q, \Delta t)$, respectively. The golden squares and black solid line represent the output of MPT analysis and the true values as directly calculated from the known particle positions, respectively. The generated particle trajectories are shown in the insets, in a simulation box of 480 × 480 pixels.

Overall, we find that DDM-UQ accurately determines $\langle \Delta r^2(\Delta t) \rangle$ for a range of experimentally-relevant scenarios based on a fraction of the observations, which substantially reduces the computational cost. Note that DDM-based algorithms are more automated compared to MPT, as DDM does not require manually chosen inputs of model parameters such as particle sizes and search radius.
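The closed-form MSDs quoted for the simulated scenarios and the N-RMSE metric of Eq. (25) can be checked against simulated trajectories with a short script. The sketch below is an illustration only: it works on particle positions directly rather than on rendered images, so it tests the expected MSD formulas and the metric, not DDM, DDM-UQ or MPT themselves. All function names are hypothetical.

```python
import numpy as np

def simulate_steps(n_particles, n_frames, sigma_s, mu_d=0.0, seed=0):
    """2-D positions built from i.i.d. steps N(mu_d, sigma_s^2) per coordinate."""
    rng = np.random.default_rng(seed)
    steps = rng.normal(mu_d, sigma_s, size=(n_frames - 1, n_particles, 2))
    start = np.zeros((1, n_particles, 2))
    return np.concatenate([start, np.cumsum(steps, axis=0)])

def ensemble_msd(x, lags):
    """Time- and ensemble-averaged MSD at the given integer lags."""
    return np.array([np.mean(np.sum((x[dt:] - x[:-dt]) ** 2, axis=-1))
                     for dt in lags])

def n_rmse(msd_true, msd_est):
    """Eq. (25): RMSE of log10(MSD), normalized by the sample standard
    deviation of log10 of the true MSD."""
    log_true, log_est = np.log10(msd_true), np.log10(msd_est)
    return np.sqrt(np.mean((log_true - log_est) ** 2)) / np.std(log_true, ddof=1)

def msd_diffusion_drift(dt, sigma_s, mu_d=0.0):
    """Expected MSD for diffusion with drift: 2 sigma_s^2 dt + 2 mu_D^2 dt^2."""
    return 2 * sigma_s**2 * dt + 2 * mu_d**2 * dt**2

def msd_ou_drift(dt, sigma_s, rho, mu_d):
    """Expected MSD for the O-U process with drift:
    4 sigma_s^2 (1 - rho^dt) + mu_D^2 dt^2."""
    return 4 * sigma_s**2 * (1 - rho**dt) + mu_d**2 * dt**2

# Simple diffusion with sigma_s = 2, n_p = 50 particles and 1000 frames.
lags = np.array([1, 2, 5, 10, 20, 50])
x = simulate_steps(n_particles=50, n_frames=1000, sigma_s=2.0)
est = ensemble_msd(x, lags)
truth = msd_diffusion_drift(lags, sigma_s=2.0)   # equals 8 * dt
score = n_rmse(truth, est)
```

Because the metric is computed on a logarithmic scale, a fixed relative error contributes equally at short and long lag times, which is why it suits MSD curves spanning several decades.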
On the other hand, MPT can provide more reliable dynamic information at larger $\Delta t$ than the current model-free DDM-based algorithms. The DDM-UQ algorithm developed in this study enables a model-free automated DDM-based analysis with results that are comparable to MPT with optimized settings, with less computational cost and fewer tuning parameters. Indeed, DDM-UQ even outperforms MPT for certain challenging experimental scenarios (e.g. high concentration, fast moving objects, etc). Furthermore, many other scenarios abound where the ISF $f(q, \Delta t)$ rather than the MSD can provide physical insight into the system, and the DDM-UQ algorithm provides an automated, model-free estimation of the ISF. These findings affirm the sensitivity of this ensemble-based method as well as the need for an unbiased estimator of noise and other model parameters, and validate our data reduction approach.

IV. ANALYSIS OF EXPERIMENTAL DATA

A. Newtonian fluid

We first measured the properties of a Newtonian fluid in which we expect simple diffusive particle dynamics. Experimentally, we suspended fluorescent polystyrene microspheres of diameter $2a = 100$ nm (yellow-green with excitation maxima of $\lambda_{\rm ex} = 441$ nm and emission maxima at $\lambda_{\rm em} = 485$ nm, Polysciences, Warrington, PA) in a 30 wt% sucrose solution (Sigma-Aldrich, St. Louis, MO) at a particle volume fraction $\phi \approx 3 \times 10^{-5}$. This composition was previously studied by dark-field DDM [31], and the exact viscosity value of the sucrose solution is well documented [51]. The particle suspension is introduced into a home-made sample cell formed using a glass slide and glass cover slip separated by 100 µm spacers. The sample is imaged in epifluorescence using an Olympus IX73 inverted microscope, outfitted with a halogen lamp with green fluorescent protein (GFP) filter set ($\lambda_{\rm ex}$ = 457–487 nm, $\lambda_{\rm em}$ = 502–538 nm), using a 40× objective (NA = 0.6), which provides a spatial resolution of 97 nm/pixel. Images are collected using an 8-bit Point Grey Chameleon USB camera using a 100 ms exposure time, 10 Hz frame rate, and 960 pixel × 960 pixel frame size.

In this experiment, the limited resolution of fluorescence microscopy relative to the particle size precludes identification of individual particles by MPT [see Fig. 6(a), inset], and thus prevents MPT analysis. Despite the lack of particle-level information, DDM is nevertheless capable of detecting the minute differences in image intensities due to particle motion, and recovers the correct diffusive dynamics [Fig. 6(a)]. Thus, DDM shows extraordinary sensitivity even when the particle size is below the diffraction limit of the microscope.

The reference value of the MSD determined by the Stokes–Einstein relation [31] is reported by the solid black line. We generally find quantitative agreement by both DDM and DDM-UQ over most of the measured $\Delta t$’s, with noticeable deviations from the linear trend and the expected values at the smallest $\Delta t$ (Fig. 6). Here the effects of using the different estimators of noise are very small, as $D_o(q, \Delta t)$ levels off at high $q$ values for any given $\Delta t$. When the approaches are compared, DDM-UQ (blue circles), which stabilizes large fluctuations through the use of the predictive median, leads to more accurate estimates of the MSD. Furthermore, we found that estimating $A(q)$ from $A_{\rm est}(q) + B_{\rm est} = 2\langle |\hat{I}_o(q, t)|^2 \rangle_t$ (blue circles and cyan triangles) improves the estimate of the MSD at large $\Delta t$ over cases where $A$ is estimated from the plateau values only (pink diamonds). We attribute this improvement to averaging more $D_o(q, \Delta t)$’s from more $q$’s in determining the ISF, regardless of whether there is a plateau or not at the finite $\Delta t$ that are accessible in experiment.

FIG. 6. Mean squared displacements estimated from the motions of 100 nm diameter probe particles in a 30 wt% sucrose solution, which serves as a model Newtonian fluid. The pink diamonds and the cyan triangles indicate MSD obtained by using $A(q)$ determined from the plateau in $D_o(q, \Delta t)$, or from $\langle |\hat{I}_o(q, t)|^2 \rangle_t$, respectively, based on all values of $D(q, \Delta t)$. Blue circles denote results obtained using the DDM-UQ analysis with a small fraction of the data; in this case $A(q)$ was only estimated using $\langle |\hat{I}_o(q, t)|^2 \rangle_t$. The black solid line denotes reference values and is calculated using the known viscosity (at 20 °C) and the Stokes–Einstein equation. The inset shows a single experimental image of the movie, where particles appear to be grainy and cannot be individually resolved.

B. Viscoelastic fluid

We next investigate the performance of DDM in probing the dynamics of a non-Newtonian fluid; namely, a viscoelastic worm-like micelle solution of 12.5 mM sodium salicylate (NaSal; Sigma-Aldrich, St. Louis, MO) and 15 mM cetylpyridinium chloride (CPyCl; Sigma-Aldrich, St. Louis, MO) that forms an entangled network. To this solution, fluorescent polystyrene microspheres of diameter $2a = 1500$ nm (carboxylated yellow-green ex/em = 505/515, Life Technologies, Carlsbad, CA) are added at a volume fraction $\phi \approx 2 \times 10^{-4}$. The sample is mixed and allowed to relax overnight prior to loading into a capillary tube, which is sealed on both sides with optical glue (Norland Products, Inc.) and cured under a UV lamp. We note that this sample is similar in composition, but not identical, to a solution characterized by DDM microrheology in previous work [6]. The sample is imaged using a Zeiss Axio Observer 7 microscope in fluorescence mode using a Colibri 7 light source, standard GFP filter sets and a 40× water-immersion objective lens (NA = 1.2), which provides a magnification of 150 nm/pixel. Images were recorded with an Axiocam 702 monochromatic camera using 15 ms exposure time, 10 Hz frame rate, and 512 pixels × 512 pixels frame size. In this case, the reference data set is obtained by a bulk rheology measurement of an identical sample without tracer particles using an AR-G2 stress-controlled rheometer (TA Instruments, New Castle, DE) to perform a frequency sweep in the linear viscoelastic limit using a 40-mm diameter cone-and-plate fixture, with a 2° cone angle and a 55 µm truncation, at 2% shear strain over a frequency range of 0.01–10 rad/s. The instrument is outfitted with a solvent trap to minimize evaporation during testing.

FIG. 7. Results of microrheology and bulk rheology measurements of solutions of worm-like micelles, which form viscoelastic fluids. (a) Mean squared displacements obtained from full $D_o(q, \Delta t)$ with $A(q)$ estimated from the plateau in $D(q, \Delta t)$ (pink diamonds), from $\langle |\hat{I}_o(q, t)|^2 \rangle_t$ (cyan triangles), DDM-UQ (blue circles) and MPT (golden squares). The inset shows an experimental snapshot of the microrheology experiment. (b)–(d) Comparison of the frequency-dependent linear viscoelastic moduli obtained either from bulk rheology experiments (black symbols) or calculated from the MSDs obtained by either using (b) DDM with $A(q)$ estimated from $\langle |\hat{I}_o(q_{\max}, t)|^2 \rangle$ (pink diamonds), (c) DDM-UQ (blue circles), or (d) MPT (golden squares). Solid symbols denote the storage [elastic, $G'(\omega)$] modulus while open symbols denote the loss [viscous, $G''(\omega)$] modulus. MSDs estimated by MPT and by DDM-UQ nearly overlap.

Wormlike micelles (WLMs) manifest complex frequency-dependent viscoelasticity that nonetheless follows simple scaling laws [6, 52]. Such a system is challenging to characterize: small probe displacements at low $\Delta t$ make it difficult to determine if the flattening of the MSD at low $\Delta t$ is characteristic of system behavior or a result of “pixel biasing” due to particle localization error [13]. The slow dynamics also contribute to the sensitivity of $q$-selection. In this case, solid-like behavior at high frequency (low $\Delta t$) is confirmed by bulk rheometry measurements [Figs. 7(b)–7(d)] and thus a “flattening” of the MSD trace is expected at small $\Delta t$. With this information, we evaluate the four methods.

In this experiment, the noise estimators $B_{\rm est} = \langle D_o(q_{\max}, \Delta t) \rangle_{\Delta t}$ and $B_{\rm est} = D_{\min}(\Delta t_{\min})$ produce similar results by DDM, although neither method performs well because the estimation error for $A(q)$ at high values of wave vector is relatively large due to the small displacements, which approach the resolution limit at small lag times.

By contrast, the DDM-UQ algorithm using the median from the predictive sampling based on $D_o(q, \Delta t)$ on moderately large wave vectors is more robust than using a simple ensemble of $D_o(q, \Delta t)$ in DDM. The MSD trace for DDM-UQ is very close to the MPT result, and is more consistent with the Maxwell fluid-like behavior [Fig. 7(a)]. Note that we only compute the Fourier transformation of the intensity difference at 25 selected $\Delta t$’s in DDM-UQ analysis, instead of 6000 $\Delta t$ points in DDM, which reduces the computational cost by more than 100 times (shown in Figure 8) while providing more accurate results.

Next we convert the measured MSD data into measures of the frequency-dependent viscoelastic moduli using the generalized Stokes–Einstein relation (GSER) [53]:

$$|G^*(\omega)| \approx \frac{k_B T}{\pi a \langle \Delta r^2(1/\omega) \rangle \Gamma[1 + \alpha(\omega)]}, \tag{27}$$

where $T$ is the temperature, $k_B$ is the Boltzmann constant, and $\alpha(\omega)$ is the power-law slope [on a log-log plot of $\langle \Delta r^2(\Delta t) \rangle$].

The procedure of determining the power law slope $\alpha(\omega)$ usually involves first taking the numerical derivative of $\langle \Delta r^2(\Delta t) \rangle$ with respect to $\Delta t$, and then fitting a polynomial of the data around a particular $\Delta t$, which is then Laplace transformed to frequency space [53]. As shown in Figs. 7(b)–7(d), the moduli we measured directly by bulk rheology generally agree with those computed from $\langle \Delta r^2(\Delta t) \rangle$ using microrheology approaches, even at high frequency, in both magnitude and in the frequency of the crossover. The result from DDM with $A(q)$ estimated from the plateau is shown in pink diamonds [Fig. 7(b)], the result using DDM-UQ is shown in blue circles [Fig. 7(c)], and MPT is shown in golden squares [Fig. 7(d)]. The MSD by DDM with $A(q)$ estimated from the relation

3 DDM 10 DDM-UQ

102

101 Computation time (minutes)

100 Typical simulation Sucrose Worm-like micelles 480x480x1000 960x960x900 512x512x1000

FIG. 8. Pair bar graphs demonstrating the computation ef- ficiency of DDM-UQ scheme (blue) over current DDM ap- proaches (pink). The size of each image stack is labeled un- derneath the data set, in terms of frame size × time points (Lx × Ly × T ).

2 Aest(q) + Best = h|Iˆo(q, t)| it (cyan triangles) contains very large error, and thus it fails to estimate moduli, so the result is not shown here. Both the frequency dependence and magnitude of the moduli in the high-frequency regime are extremely sen- sitive to the MSD at low ∆t. Nevertheless, MPT and DDM-UQ are in an approximate agreement with the val- ues obtained with macroscale rheology at higher frequen- cies. By contrast, in a slow moving system, or at large q’s (which correspond to small displacements), when the movement is less than a pixel on average over ∆t, it is not possible to capture the system dynamics by calcu- FIG. 9. Example of analysis of an actively driven system. Ex- lating the image difference. As a result, estimates of the perimental data extracted from a fluorescently-labeled actin- MSD from the Do(q, ∆t) tend to provide underestimates myosin-microtubule composite network system reported in as compared to the true values at small ∆t. [20], processed using DDM (red symbols) or DDM-UQ (blue We note that the numerical differentiation of symbols), and fit to a stretched exponential model to describe h∆r2(∆t)i introduces a high degree of uncertainty into the system dynamics [Eq. (28)]. (a) Comparing DDM, and DDM DQ fits to the original Do(q, ∆t) matrix, where the un- the moduli that is not represented on Fig. 7. A more certainty is denoted by the gray shaded area, which is very robust estimation method for the moduli is an area of small as the uncertainty is low. The solid line denotes full future research. Do(q, ∆t) data, while the solid dots denotes Do(q, ∆t) se- lected to obtain the predictive distribution.(b) Fits to f(q, ∆t) shown at several different q’s. (c) Parameter τ, which de- C. Model fitting for actively-driven systems scribes the system relaxation time, plotted as a function of q. Different symbols represent τ extracted from six different data sets. From this, the system velocity is obtained (Fig. 
Beyond thermally-driven dynamics and microrheology, 10). DDM can be applied in the realm of non-Brownian, active systems where models of f(q, ∆t) can provide insight into the physics associated with the system dynamics. In such cases, the time-variation of the image structure function sufficiently decorrelates over the experimental observa- does not arise from probe motion but from structural evo- tion time, thus greatly reducing computational time and lution of actively-powered components within the system increasing the robustness of model fits. such as migrating bacteria [8] or advection due to inter- As an example, we consider the dynamics of actively- nal stresses that arise from phase separation and aging driven composite cytoskeletal networks of actin and mi- [10], among other examples. Here we demonstrate that crotubules, by (re)analyzing the experimental data re- DDM-UQ analysis can be applied to such anomalous dy- cently reported by Lee et al. [20]. Actin and micro- namics as well. Specifically, the statistical approach we tubules are ubiquitous and essential in eukaryotic cells, have developed should be applicable to any system that and in vitro networks of the purified filamentous pro- 16 teins are widely studied for their potential to self-organize and form model non-equilibrium materials when acted TABLE III. N-RMSE of active actin networks. upon by ATP-driven molecular motors such as myosin. <70% Plateau Full Data In recent work, Lee et al. [20] investigated a composite Sample ID DDM DDM-UQ DDM DDM-UQ network comprising actin and microtubules, which were 1 0.0576 0.0417 0.0700 0.0507 each labeled with distinct fluorophores, and acted upon 2 0.0641 0.0689 0.0882 0.0948 by myosin. 
DDM was then used as a means of disentan- 3 0.0406 0.0265 0.0573 0.0373 gling the motions recorded through the individual fluo- 4 0.0759 0.0699 0.0835 0.0769 rescence channels, allowing investigation of the mecha- 5 0.0462 0.0285 0.0576 0.0355 nisms of cross-correlation of actin and microtubule dy- 6 0.0588 0.0317 0.0724 0.0390 namics within the entangled network. By contrast, MPT can provide only information on the bulk network itself. Here we follow [20] to model the dynamics of such an The difference between the estimated quantity active system by a stretched exponential model [20]: De(q, ∆t) from fitting stretched exponential model, and observed quantity Do(q, ∆t) is evaluated by the N- !  ∆t γ(q) RMSE: f(q, ∆t) = exp − , (28) τ(q) N-RMSE = q where γ(q) > 1 means the system shows contractile dy- 1 P P ˜ ˜ 2 n n ∆t∈∆T q∈Q(hDo(q, ∆t)i − hDe(q, ∆t)i) namics, while γ(q) < 1 when the system shows stretching q ∆t , 1 σ˜ dynamics. A relaxation time scaling where τ(q) = vq de- D scribes a system exhibiting ballistic motion with velocity (29) 1 v, whereas τ(q) = 2 describes a system exhibiting Dmq ˜ ˜ diffusive motion where Dm represents the diffusion coef- where Do(q, t) and De(q, ∆t) are the logarithm of the ficient. observed and estimated image structure function by dif- Given the intrinsically heterogeneous nature of such ferent approaches, andσ ˜D is the logarithm of sample systems, multiple replicates of the same composition are standard deviation; ∆T and Q are the sets of ∆t and q often examined, increasing the already heavy computa- available for comparison, respectively. tional costs that are typical of DDM analysis. 
To demon- Figure 9(a) shows reconstructed D(q, ∆t) (dashed strate this, we re-analyze six replicates of the active actin lines) from fitting the full observation (solid line) using data set from [20] to show that despite computing the DDM as well as a reconstructed D(q, ∆t) obtained by Fourier transformation of the intensity difference at only resampling with only a fraction of design points (black 25 selected ∆t’s, which significantly reduces the compu- dots) using DDM-UQ. There is general agreement be- tational cost, DDM-UQ extracts the information that is tween the two approaches at all q values, with some differ- as accurate as the DDM approach that was originally em- ences observed at short times because DDM-UQ weighs ployed. In detail, the full Do(q, ∆t) is computed directly more heavily data at small ∆t, as the predictive variance from the image stacks by DDM, from which a subset of is small at these regions. values, Do(q, ∆t) are pre-selected to obtain the predictive Fitting via a weighted least squares minimization ap- distribution by DDM-UQ. proach is more robust for estimating the noise term B. This example specifically compares the DDM and One B(q) is estimated to be very close to 0 (2.13 × 10−8) DDM-UQ analysis when a parametric model with four in DDM, whereas it is estimated to be around 3.3×104 in fitting parameters are specified. When a stretched ex- DDM-UQ. The significant underestimation of the noise ponential model [Eq. (28)] for f(q, ∆t) with coefficients by DDM explains the large deviation of the fit at this τ(q) and γ(q), along with A(q) and B(q), is fit to either wave vector shown in Fig. 9(a). This example illustrates the full matrix Do(q, ∆t) (DDM) or the predictive dis- the importance of estimating the noise parameters accu- tribution D(q, ∆t) (DDM-UQ) for each q, the estimated rately, and that fitting a parametric model of f(q, ∆t) image structure function De(q, ∆t) is obtained. 
D(q, ∆t) without addressing the uncertainty by the least squared contains the same number of entries as Do(q, ∆t); it is estimator can be unreliable in estimating the noise pa- reconstructed using the mean of values sampled from the rameter. predictive distribution D(q, ∆t) [300 instances at every In Fig. 9(b) we show the same fits of the stretched (q, ∆t)]. The purpose of this step is to extract coefficients exponential model used to reconstruct f(q, ∆t) at differ- τ(q) and γ(q) relevant to the underlying physical process. ent q’s using the full observed image structure function Here for DDM, we use the conventional approach by min- Do(q, ∆t) obtained by the DDM (red lines) and DDM- imizing the square error loss between the model and the UQ analysis (blue lines). Note that DDM-UQ only uses D(q, ∆t), which is used as the loss function, whereas for observations Do(q, ∆t) at selected q and ∆t, but it per- DDM-UQ, we minimize the weighted square loss where forms equally well even for unobserved q values for which the weights are calculated from the inverse variance of there is no observation for any ∆t [see Fig. 9(a), bottom the predictive samples from GPR. curve]. 17
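The model fit and the N-RMSE comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' MATLAB implementation. It evaluates the stretched exponential of Eq. (28), recovers $\tau$ and $\gamma$ from noise-free synthetic data by a double-logarithm linearization (a simplification standing in for the weighted least squares fit used by DDM-UQ), and computes an N-RMSE in the spirit of Eq. (29) at a single $q$. The function names, the synthetic values of $A$, $B$, $\tau$, $\gamma$, and the choice of the sample standard deviation of $\log D_o$ as the normalizer are assumptions made for illustration.

```python
import math

def stretched_exp(dt, tau, gamma):
    """Intermediate scattering function of Eq. (28): exp(-(dt/tau)**gamma)."""
    return math.exp(-((dt / tau) ** gamma))

def image_structure(dt, A, B, tau, gamma):
    """D(q, dt) = A * (1 - f(q, dt)) + B at one fixed q."""
    return A * (1.0 - stretched_exp(dt, tau, gamma)) + B

def fit_tau_gamma(dts, f_vals):
    """Least-squares line through log(-log f) versus log(dt).

    Since log(-log f) = gamma*log(dt) - gamma*log(tau), the slope is gamma
    and the intercept gives tau. This linearization is an illustrative
    stand-in, not the weighted fit used by DDM-UQ.
    """
    xs = [math.log(dt) for dt in dts]
    ys = [math.log(-math.log(f)) for f in f_vals]
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - xbar) ** 2 for x in xs)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    gamma = sxy / sxx
    intercept = ybar - gamma * xbar
    tau = math.exp(-intercept / gamma)
    return tau, gamma

def n_rmse(d_obs, d_est):
    """RMSE of log D_est against log D_obs, divided by the std of log D_obs.

    Using the sample standard deviation of log D_obs as the normalizer is an
    assumption about how sigma-tilde_D in Eq. (29) is defined.
    """
    lo = [math.log(d) for d in d_obs]
    le = [math.log(d) for d in d_est]
    n = len(lo)
    rmse = math.sqrt(sum((a - b) ** 2 for a, b in zip(lo, le)) / n)
    mean = sum(lo) / n
    std = math.sqrt(sum((a - mean) ** 2 for a in lo) / (n - 1))
    return rmse / std

# Synthetic example at a single q (all values hypothetical)
A, B, TAU, GAMMA = 1.0e5, 3.0e4, 20.0, 1.3
dts = [1, 2, 4, 8, 16, 32]
f_vals = [stretched_exp(dt, TAU, GAMMA) for dt in dts]
tau_hat, gamma_hat = fit_tau_gamma(dts, f_vals)
d_obs = [image_structure(dt, A, B, TAU, GAMMA) for dt in dts]
d_est = [image_structure(dt, A, B, tau_hat, gamma_hat) for dt in dts]
score = n_rmse(d_obs, d_est)
```

With real data, $f(q, \Delta t)$ is not observed directly and $A(q)$ and $B(q)$ must be estimated jointly with $\tau(q)$ and $\gamma(q)$, so a nonlinear (weighted) least squares routine would replace `fit_tau_gamma`.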

FIG. 10. Velocity estimates and confidence intervals extracted from fitting the stretched exponential model to either the full observations (DDM) or to reconstructed $D(q, \Delta t)$ from selected design points (DDM-UQ). The error bars denote 95% confidence intervals.

The differences between the estimated and observed values of $D_o(q, \Delta t)$ for DDM and DDM-UQ are quantified by the N-RMSE in Eq. (29), as shown in Table III. Since $D(q, \Delta t)$ at large $\Delta t$ contains fewer independent samples and shows large fluctuations, it is informative to compare the accuracy of the fit only up to a threshold value, chosen here to be 70% of the $\Delta t$ values before the plateau is reached. In five of the six samples (all but sample 2), DDM-UQ outperforms DDM in more closely approximating $D_o(q, \Delta t)$ when comparing the truncated data set, and in almost all cases, the fit of DDM-UQ, which has much lower computational cost, is comparable to that of DDM, as shown in Table III. Thus, DDM-UQ significantly accelerates the analysis without any observable sacrifice in accuracy with respect to post-processing of the data such as model fitting.

As described in Eq. (28), $f(q, \Delta t)$ contains the fit parameters $\gamma(q)$ and $\tau(q)$. Following [20], we fit the data to a linear model $\tau(q) = \frac{1}{vq}$ and extract the characteristic velocities of the active actin mixture [Fig. 9(c)]. The maximum likelihood estimator (MLE) and the confidence interval (CI) for the velocities from different replicates are shown in Fig. 10. Importantly, DDM-UQ with its limited observations largely recovers similar characteristic velocities and confidence intervals as those obtained using the full matrix, paving the way for high-throughput analysis of the dynamic properties of complex biomaterial systems.

V. CONCLUDING REMARKS

DDM can be applied to a structurally evolving image stack to obtain the image structure function and intermediate scattering function. It provides an aggregated measure of dynamics, potentially offering higher accuracy in extracting physical quantities than using real-space data alone. While the theoretical framework of DDM is well established, to our knowledge, this work represents an exploration into propagating the uncertainty associated with measurement noise, and analyzing the effects of noise in parameter estimation through mathematical and numerical analysis. Based on error propagation in estimating the image structure function, we derived the mean and variance of the noise term $B$, leading to more accurate estimation of the ISF and MSD at small $\Delta t$. Moreover, we showed that only a small subset of $D_o(q, \Delta t)$ (around 0.5%–5%) at selected $q$ and $\Delta t$ needs to be computed, and when this subset is used in a GPR model, it is possible to obtain the predictive median and samples of the image structure function at unobserved inputs, so that subsequent quantities of interest can be robustly predicted. Both simulations and experiments were presented to demonstrate that our method has virtually no loss of information, while reducing the computational time by 25–120 times. The combined improvements offered by the error propagation and predictive median from GPR result in more robust estimation of the intermediate scattering function and mean squared displacements. Through the comparisons made here between DDM-UQ, other formulations of DDM, and MPT, we highlight the need for accurate noise estimation in the analysis and interpretation of DDM experiments.

We anticipate that these results will enable many new applications of DDM to complex biomaterial and soft material systems [54, 55]. With the potential to carry out real-time analysis via down-sampling, the proposed method can be extended to map out an entire phase space of material composition or physicochemical conditions in a high-throughput manner. This increased performance also places new demands on the general applicability of the algorithm, for instance, to provide meaningful analysis of stiffer materials that do not decorrelate as quickly as more fluid-like samples, as well as heterogeneous samples, through analysis of sub-populations that demonstrate distinct features. Future extensions of DDM-UQ analysis should include reducing the selection bias by properly weighting the ISF by the inverse variance of the noise, which should provide a more reliable and fully automated estimation of physical quantities at larger lag times. Another potential direction is to derive a more robust estimator of the imaging noise $\sigma_0^2$ from the image stack that is less biased in some challenging experimental scenarios (summarized in Table I). These directions will be pursued in future work.

ACKNOWLEDGEMENTS

This work was supported by the BioPACIFIC Materials Innovation Platform of the National Science Foundation under Award No. DMR-1933487 (NSF BioPACIFIC MIP), with partial support by the Materials Research Science and Engineering Center (MRSEC) Program of the National Science Foundation under Award No. DMR-1720256 (IRG-3). MG acknowledges partial support from the National Science Foundation under Award No. DMS-2053423. MEH acknowledges partial support from the National Science Foundation under Award No. CBET-1729108. This work used computational facilities purchased with funds from the National Science Foundation (Grant No. CNS-1725797) and administered by the Center for Scientific Computing (CSC). The CSC is supported by the California NanoSystems Institute and MRSEC (NSF Grant No. DMR-1720256) at UC Santa Barbara. We thank G. Lee and R. Robertson-Anderson from University of San Diego for providing the active actin data set.

[1] J. C. Crocker and D. G. Grier, Methods of digital video microscopy for colloidal studies, J. Colloid Interface Sci. 179, 298 (1996).
[2] T. Savin and P. S. Doyle, Statistical and sampling issues when using multiple particle tracking, Phys. Rev. E 76, 021501 (2007).
[3] F. Giavazzi, D. Brogioli, V. Trappe, T. Bellini, and R. Cerbino, Scattering information obtained by optical microscopy: differential dynamic microscopy and beyond, Phys. Rev. E 80, 031403 (2009).
[4] F. Giavazzi and R. Cerbino, Digital Fourier microscopy for soft matter dynamics, J. Opt. 16, 083001 (2014).
[5] R. Cerbino and V. Trappe, Differential dynamic microscopy: probing wave vector dependent dynamics with a microscope, Phys. Rev. Lett. 100, 188102 (2008).
[6] A. V. Bayles, T. M. Squires, and M. E. Helgeson, Probe microrheology without particle tracking by differential dynamic microscopy, Rheol. Acta 56, 863 (2017).
[7] P. J. Lu, F. Giavazzi, T. E. Angelini, E. Zaccarelli, F. Jargstorff, A. B. Schofield, J. N. Wilking, M. B. Romanowsky, D. A. Weitz, and R. Cerbino, Characterizing concentrated, multiply scattering, and actively driven fluorescent systems with confocal differential dynamic microscopy, Phys. Rev. Lett. 108, 218103 (2012).
[8] V. A. Martinez, R. Besseling, O. A. Croze, J. Tailleur, M. Reufer, J. Schwarz-Linek, L. G. Wilson, M. A. Bees, and W. C. Poon, Differential dynamic microscopy: A high-throughput method for characterizing the motility of microorganisms, Biophys. J. 103, 1637 (2012).
[9] F. Giavazzi, A. Fornasieri, A. Vailati, and R. Cerbino, Equilibrium and non-equilibrium concentration fluctuations in a critical binary mixture, Eur. Phys. J. E 39, 1 (2016).
[10] Y. Gao, J. Kim, and M. E. Helgeson, Microdynamics and arrest of coarsening during spinodal decomposition in thermoreversible colloidal gels, Soft Matter 11, 6360 (2015).
[11] F. Giavazzi, S. Crotti, A. Speciale, F. Serra, G. Zanchetta, V. Trappe, M. Buscaglia, T. Bellini, and R. Cerbino, Viscoelasticity of nematic liquid crystals at a glance, Soft Matter 10, 3938 (2014).
[12] R. Cerbino and P. Cicuta, Perspective: Differential dynamic microscopy extracts multi-scale activity in complex fluids and biological systems, J. Chem. Phys. 147, 110901 (2017).
[13] T. Savin and P. S. Doyle, Static and dynamic errors in particle tracking microrheology, Biophys. J. 88, 623 (2005).
[14] M. Reufer, V. A. Martinez, P. Schurtenberger, and W. C. Poon, Differential dynamic microscopy for anisotropic colloidal dynamics, Langmuir 28, 4618 (2012).
[15] C. Kurzthaler, C. Devailly, J. Arlt, T. Franosch, W. C. Poon, V. A. Martinez, and A. T. Brown, Probing the spatiotemporal dynamics of catalytic Janus particles with single-particle tracking and differential dynamic microscopy, Phys. Rev. Lett. 121, 078001 (2018).
[16] M. Escobedo-Sánchez, J. Segovia-Gutiérrez, A. Zuccolotto-Bernez, J. Hansen, C. Marciniak, K. Sachowsky, F. Platten, and S. Egelhaaf, Microliter viscometry using a bright-field microscope: η-DDM, Soft Matter 14, 7016 (2018).
[17] R. Cerbino, D. Piotti, M. Buscaglia, and F. Giavazzi, Dark field differential dynamic microscopy enables accurate characterization of the roto-translational dynamics of bacteria and colloidal clusters, J. Phys. Condens. Matter 30, 025901 (2017).
[18] F. Giavazzi, C. Malinverno, G. Scita, and R. Cerbino, Tracking-free determination of single-cell displacements and division rates in confluent monolayers, Front. Phys. 6, 120 (2018).
[19] H. Moon, A. M. Dean, and T. J. Santner, Two-stage sensitivity-based group screening in computer experiments, Technometrics 54, 376 (2012).
[20] G. Lee, G. Leech, M. J. Rust, M. Das, R. J. McGorty, J. L. Ross, and R. M. Robertson-Anderson, Myosin-driven actin-microtubule networks exhibit self-organized contractile dynamics, Sci. Adv. 7, eabe4334 (2021).
[21] F. Giavazzi, V. Trappe, and R. Cerbino, Multiple dynamic regimes in a coarsening foam, J. Phys. Condens. Matter 33, 024002 (2020).
[22] M. Norouzisadeh, G. Cerchiari, and F. Croccolo, Increased performance in DDM analysis by calculating structure functions through Fourier transform in time, arXiv preprint arXiv:2012.05695 (2020).
[23] P. Edera, D. Bergamini, V. Trappe, F. Giavazzi, and R. Cerbino, Differential dynamic microscopy microrheology of soft materials: A tracking-free determination of the frequency-dependent loss and storage moduli, Phys. Rev. Materials 1, 073804 (2017).
[24] T. G. Mason, K. Ganesan, J. H. van Zanten, D. Wirtz, and S. C. Kuo, Particle tracking microrheology of complex fluids, Phys. Rev. Lett. 79, 3282 (1997).
[25] T. G. Mason and D. A. Weitz, Optical measurements of frequency-dependent linear viscoelastic moduli of complex fluids, Phys. Rev. Lett. 74, 1250 (1995).
[26] M. S. Safari, M. A. Vorontsova, R. Poling-Skutvik, P. G. Vekilov, and J. C. Conrad, Differential dynamic microscopy of weakly scattering and polydisperse protein-rich clusters, Phys. Rev. E 92, 042712 (2015).
[27] T. Sentjabrskaja, E. Zaccarelli, C. De Michele, F. Sciortino, P. Tartaglia, T. Voigtmann, S. U. Egelhaaf, and M. Laurati, Anomalous dynamics of intruders in a crowded environment of mobile obstacles, Nat. Comm. 7, 1 (2016).
[28] C. E. Rasmussen, Gaussian processes for machine learning (MIT Press, 2006).
[29] The MATLAB package is available at: https://github.com/UncertaintyQuantification/DDM-UQ.
[30] F. Giavazzi, P. Edera, P. J. Lu, and R. Cerbino, Image windowing mitigates edge effects in differential dynamic microscopy, Eur. Phys. J. E 40, 97 (2017).
[31] A. V. Bayles, T. M. Squires, and M. E. Helgeson, Dark-field differential dynamic microscopy, Soft Matter 12, 2440 (2016).
[32] B. Nijboer and A. Rahman, Time expansion of correlation functions and the theory of slow neutron scattering, Physica 32, 415 (1966).
[33] E. R. Weeks, J. C. Crocker, A. C. Levitt, A. Schofield, and D. A. Weitz, Three-dimensional direct imaging of structural relaxation near the colloidal glass transition, Science 287, 627 (2000).
[34] L. G. Wilson, V. A. Martinez, J. Schwarz-Linek, J. Tailleur, G. Bryant, P. Pusey, and W. C. Poon, Differential dynamic microscopy of bacterial motility, Phys. Rev. Lett. 106, 018101 (2011).
[35] L. Jawerth, E. Fischer-Friedrich, S. Saha, J. Wang, T. Franzmann, X. Zhang, J. Sachweh, M. Ruer, M. Ijavi, S. Saha, et al., Protein condensates as aging Maxwell fluids, Science 370, 1317 (2020).
[36] E. M. Furst and T. M. Squires, Microrheology (Oxford University Press, 2017).
[37] J. A. McGlynn, N. Wu, and K. M. Schultz, Multiple particle tracking microrheological characterization: Fundamentals, emerging techniques and applications, J. Appl. Phys. 127, 201101 (2020).
[38] M. Rupp, A. Tkatchenko, K.-R. Müller, and O. A. Von Lilienfeld, Fast and accurate modeling of molecular atomization energies with machine learning, Phys. Rev. Lett. 108, 058301 (2012).
[39] A. P. Bartók, R. Kondor, and G. Csányi, On representing chemical environments, Phys. Rev. B 87, 184115 (2013).
[40] S. Chmiela, A. Tkatchenko, H. E. Sauceda, I. Poltavsky, K. T. Schütt, and K.-R. Müller, Machine learning of accurate energy-conserving molecular force fields, Sci. Adv. 3, e1603015 (2017).
[41] F. Brockherde, L. Vogt, L. Li, M. E. Tuckerman, K. Burke, and K.-R. Müller, Bypassing the Kohn-Sham equations with machine learning, Nat. Comm. 8, 1 (2017).
[42] S. Chmiela, H. E. Sauceda, K.-R. Müller, and A. Tkatchenko, Towards exact molecular dynamics simulations with machine-learned force fields, Nat. Comm. 9, 1 (2018).
[43] A. P. Bartók, J. Kermode, N. Bernstein, and G. Csányi, Machine learning a general-purpose interatomic potential for silicon, Phys. Rev. X 8, 041048 (2018).
[44] D. M. Wilkins, A. Grisafi, Y. Yang, K. U. Lao, R. A. DiStasio, and M. Ceriotti, Accurate molecular polarizabilities with coupled cluster theory and machine learning, Proc. Natl. Acad. Sci. U.S.A. 116, 3401 (2019).
[45] K. R. Anderson, I. A. Johanson, M. R. Patrick, M. Gu, P. Segall, M. P. Poland, E. K. Montgomery-Brown, and A. Miklius, Magma reservoir failure and the onset of caldera collapse at Kīlauea volcano in 2018, Science 366 (2019).
[46] J. Wu and M. Gu, Emulating the first principles of matter: a probabilistic roadmap, arXiv preprint arXiv:2010.05942 (2020).
[47] M. Gu, J. Palomo, and J. O. Berger, RobustGaSP: Robust Gaussian Stochastic Process Emulation in R, The R Journal 11, 112 (2019).
[48] Y. Gao and M. L. Kilfoil, Accurate detection and complete tracking of large populations of features in three dimensions, Opt. Exp. 17, 4685 (2009).
[49] J.-Z. Xue, D. Pine, S. T. Milner, X.-L. Wu, and P. Chaikin, Nonergodicity and light scattering from polymer gels, Phys. Rev. A 46, 6550 (1992).
[50] N. Monnier, S.-M. Guo, M. Mori, J. He, P. Lénárt, and M. Bathe, Bayesian approach to MSD-based analysis of particle motion in live cells, Biophys. J. 103, 616 (2012).
[51] J. F. Swindells, Viscosities of sucrose solutions at various temperatures: Tables of recalculated values, Vol. 440 (US Government Printing Office, 1958).
[52] H. Rehage and H. Hoffmann, Viscoelastic surfactant solutions: model systems for rheological research, Mol. Phys. 74, 933 (1991).
[53] T. G. Mason, Estimating the viscoelastic moduli of complex fluids using the generalized Stokes–Einstein equation, Rheol. Acta 39, 371 (2000).
[54] J. Fricks, L. Yao, T. C. Elston, and M. G. Forest, Time-domain methods for diffusive transport in soft matter, SIAM J. Appl. Math. 69, 1277 (2009).
[55] M. Lysy, N. S. Pillai, D. B. Hill, M. G. Forest, J. W. Mellnik, P. A. Vasquez, and S. A. McKinley, Model comparison and assessment for single particle tracking in biological fluids, J. Am. Stat. Assoc. 111, 1413 (2016).
[56] M. Gu, X. Wang, and J. O. Berger, Robust Gaussian stochastic process emulation, Ann. Stat. 46, 3038 (2018).

APPENDICES

APPENDIX A

Proof of Eq. (6) and Eq. (7). The observed difference of the intensity at two time points $(t + \Delta t)$ and $t$ can be described as

\[
\Delta I_o(\mathbf{x}, t, \Delta t) = \Delta I(\mathbf{x}, t, \Delta t) + \Delta \epsilon(\mathbf{x}, t, \Delta t),
\]

where $\Delta I(\mathbf{x}, t, \Delta t) = I(\mathbf{x}, t + \Delta t) - I(\mathbf{x}, t)$ and $\Delta \epsilon(\mathbf{x}, t, \Delta t) = \epsilon(\mathbf{x}, t + \Delta t) - \epsilon(\mathbf{x}, t)$. We denote the minimum time interval by $\Delta t_{min} = 1$ and $\Delta t = l \Delta t_{min} = l$, where $l$ is a positive integer smaller than $T$.

We apply 2D discrete Fourier transformations on $\Delta I_o(\mathbf{x}, t, \Delta t)$ and obtain

Proof of Eq. (8) - Eq. (10). The observations of image structure function can be obtained through computing 2 |∆Iˆo(q, t, ∆t)| ensemble average of the observed intensity N−1 N−1  T  2 1 X X i2πx q = ∆Io(x, t, ∆t) exp − ˆ 2 N 2 N Do(q, ∆t) = h|∆Io(q, t, ∆t)| i x1=0 x2=0 ˆ 2 2 N−1 N−1 2 = h|∆I(q, t, ∆t)| i + h|ˆ(q, t, ∆t)| i 1 X X  2πxT q = ∆I (x, t, ∆t) cos ˆ N 2 o N + 2h∆I(q, t, ∆t)∆ˆ(q, t, ∆t)i x1=0 x2=0 N−1 N−1 2 1 X X  2πxT q + ∆I (x, t, ∆t) sin with expected value and variance: N 2 o N x1=0 x2=0 ˆ2 ˆ2 :=∆Io,1(q, t, ∆t) + ∆Io,2(q, t, ∆t) E[Do(q, ∆t)] 2 2 = E[h|∆Iˆ(q, t, ∆t)| i] + E[h|ˆ(q, t, ∆t)| i] + 2 E[h∆Iˆ(q, t, ∆t)∆ˆ(q, t, ∆t)i] where Iˆo,1(q, t, ∆t) and Iˆo,2(q, t, ∆t) are independent from each other by the orthogonality of the Fourier basis ˆ 2 1 X X 2 2 = h|∆I(q, t, ∆t)| i + E[∆ˆ1 + ∆ˆ2] with n∆tnq t∈S∆t (q1,q2)∈Sq 2 X X N−1 N−1  T  + ∆Iˆ(q, t, ∆t) [∆ˆ(q, t, ∆t)] ˆ 1 X X 2πx q E E[Io,1(q, t, ∆t)] = ∆I(x, t, ∆t) cos , n∆tnq N N t∈S∆t (q1,q2)∈Sq x1=0 x2=0 N−1 N−1 = D(q, ∆t) + V(∆ˆ1) + V(∆ˆ2) 1 X X 2πxT q [Iˆ (q, t, ∆t)] = ∆I(x, t, ∆t) sin , = D(q, ∆t) + 2σ2 E o,2 N N 0 x1=0 x2=0 ˆ ˆ 2 V[Io,1(q, t, ∆t)] = V[Io,2(q, t, ∆t)] = σ0. where h·i denotes averaging over available time points for 2 2 each ∆t, and (q1, q2) ∈ Sq with Sq := {(q1, q2): q1 +q2 = 2 Furthermore q }, nq := Sq, n∆t := tmax − ∆t, and

2 |∆Iˆo(q, t, ∆t)| V[Do(q, ∆t)] =|∆Iˆ(q, t, ∆t)|2 + 2∆Iˆ(q, t, ∆t)∆ˆ(q, t, ∆t) + |∆ˆ(q, t, ∆t)|2, ˆ 2 = V[h|∆Io(q, t, ∆t)| i] where h 1 X X ˆ 2i = V |∆Io(q, t, ∆t)| nqn∆t ˆ 2 ˆ2 ˆ2 (q1,q2)∈Sq t∈S∆t |∆I(q, t, ∆t)| = ∆I1 + ∆I2 , (30)

∆Iˆ(q, t, ∆t)∆ˆ(q, t, ∆t) = ∆Iˆ11 + ∆Iˆ22, (31) The variance of the average of intensity over available |∆ˆ(q, t, ∆t)|2 = ∆ˆ2 + ∆ˆ2, (32) 1 2 time points for each ∆t and tmax = T is with 1 X  |∆Iˆ (q, t, ∆t)|2 N−1 N−1 T V o   n∆t 1 X X 2πx q t∈S ∆Iˆ1 = ∆I(x, t, ∆t) cos , ∆t N N h 1 i x1=0 x2=0 X ˆ2 ˆ2  = V ∆Io,1(q, t, ∆t) + ∆Io,2(q, t, ∆t) N−1 N−1  T  n∆t 1 X X 2πx q t∈S∆t ∆ˆ1 = ∆(x, t, ∆t) cos , N N 1 X 2 x =0 x =0 = [∆Iˆ (q, t, ∆t)] 1 2 n2 V o,1 N−1 N−1 ∆t t∈S 1 X X 2πxT q ∆t ∆Iˆ2 = ∆I(x, t, ∆t) sin , 1 X N N + [∆Iˆ2 (q, t, ∆t)] x =0 x =0 2 V o,2 1 2 n∆t t∈S∆t 1 N−1 N−1 2πxT q X X 1 X 2 2  ∆ˆ2 = ∆(x, t, ∆t) sin . + Cov ∆Iˆ (q, t , ∆t), ∆Iˆ (q, t , ∆t) N N 2 o,1 1 o,1 2 x =0 x =0 n∆t 1 2 t16=t2

1 X 2 2  ˆ 2 + Cov ∆Iˆ (q, t , ∆t), ∆Iˆ (q, t , ∆t) The expected value of E[|Io(q, t, ∆t)| ] can be verified 2 o,2 1 o,2 2 n∆t using properties of the Fourier basis. t16=t2 21 where the first two terms can be computed as Similarly, for ∆t = 1, 2,..., b(tmax − 1)/2c and tmax = T

X ˆ2 ˆ2  1 X   Cov ∆I (q, t1, ∆t), ∆I (q, t2, ∆t) [∆Iˆ2 (q, t, ∆t)] + [∆Iˆ2 (q, t, ∆t)] o,2 o,2 2 V o,1 V o,2 t 6=t n∆t 1 2 t∈S∆t T −2∆t 4 n∆t σ  1   X 0 2 ˆ ˆ X ˆ 2 ˆ 2 = 2 − 4σ0∆I2(q, t, ∆t)∆I2(q, t + ∆t, ∆t) = V[(∆I1 + ∆ˆ1) ] + V[(∆I2 + ∆ˆ2) ] 2 n2 t=1 ∆t t=1 n∆t 1 X  ˆ 2 ˆ 2  For general ∆t > b(T − 1)/2c: = 2 V[2∆I1∆ˆ1 + ∆ˆ1] + V[2∆I2∆ˆ1 + ∆ˆ2] n∆t t=1 X ˆ2 ˆ2  n∆t Cov ∆Io,1(q, t1, ∆t), ∆Io,1(q, t2, ∆t) 1 X  2 2 2 2 2  = 4σ (∆Iˆ + ∆Iˆ ) + (∆ˆ + ∆ˆ ) t16=t2 n2 0 1 2 V 1 2 ∆t t=1 X ˆ2 ˆ2  = Cov ∆Io,2(q, t1, ∆t), ∆Io,2(q, t2, ∆t) n∆t 1 X   t16=t2 = 4σ2|∆Iˆ(q, t, ∆t)|2 + 4σ4 , n2 0 0 = 0 ∆t t=1 Combining the variance and covariance expressions de- and for ∆t = 1, 2,..., b(tmax − 1)/2c, the last two terms veloped above, the variance of the average of intensity is follows: 1 X  |∆Iˆ (q, t, ∆t)|2 X ˆ2 ˆ2  V o Cov ∆Io,1(q, t1, ∆t), ∆Io,1(q, t2, ∆t) n∆t t∈S∆t t16=t2 2  Pn∆t ˆ 2 T −2∆t 2σ0 2 2 t=1 |∆I(q, t, ∆t)| X = 2σ0 + ˆ2 ˆ2  n∆t n∆t = 2 Cov ∆Io,1(q, t, ∆t), ∆Io,1(q, t + ∆t, ∆t) 2 t=1 σ 2S  + max(0,T − 2∆t) 0 − q1,q2,∆t  T −2∆t n∆t n∆t(T − 2∆t) X 2 = 2 Cov (∆Iˆ1(q, t, ∆t) + ˆ1,t+∆t − ˆ1,t) , t=1 with ˆ 2 (∆I1(q, t + ∆t, ∆t) + ˆ1,t+2∆t − ˆ1,t+∆t) T −2∆t X  T −2∆t ˆ ˆ Sq1,q2,∆t = ∆I1(q, t, ∆t)∆I1(q, t + ∆t, ∆t) X ˆ 2 = 2 Cov E[(∆I1(q, t, ∆t) + ˆ1,t+∆t − ˆ1,t,∆t) ], t=1 t=1  + ∆Iˆ2(q, t, ∆t)∆Iˆ2(q, t + ∆t, ∆t) (33) ˆ 2  E[(∆I1(q, t + ∆t, ∆t) + ˆ1,t+2∆t − ˆ1,t+∆t) ]|ˆ1,t+∆t T −2∆t Finally, we have X  ˆ 2 + 2 E Cov((∆I1(q, t, ∆t) + ˆ1,t+∆t − ˆ1,t) , t=1 V[Do(q, ∆t)] ˆ 2  (∆I1(q, t + ∆t, ∆t) + ˆ1,t+2∆t − ˆ1,t+∆t) |ˆ1,t+∆t) h 1 X 1 X ˆ 2i = V |∆Io(q, t, ∆t)| T −2∆t nq n∆t X  2 2 (q1,q2)∈Sq t∈S∆t = 2 Cov(ˆ1,t+∆t, ˆ1,t+∆t) − 4× 2 Pn∆t ˆ 2 t=1 1 X 2σ  2 |∆I(q, t, ∆t)| = 0 2σ2 + t=1  2 0 ˆ ˆ nq n∆t n∆t ∆I1(q, t, ∆t)∆I1(q, t + ∆t, ∆t) Cov(ˆ1,t+∆t, ˆ1,t+∆t) (q1,q2)∈Sq 2 T −2∆t 4 σ0 2Sq1,q2,∆t  X σ0 2 ˆ ˆ  + max(0,T − 2∆t) − = 2 − 4σ ∆I1(q, t, ∆t)∆I1(q, t + ∆t, ∆t) , n n (T − 2∆t) 2 0 ∆t ∆t t=1 2 2σ0  2 = 2σ0 + 2D(q, ∆t) nqn∆t 2 σ 2Sq,∆t  with + max(0,T − 2∆t) 0 −  n∆t (T − 
2∆t)n∆tnq

N−1 N−1 1 X X 2πxT q with ˆ = (x, t) cos 1,t N N x1=1 x2=1 X Sq,∆t = Sq1,q2,∆t (34) N−1 N−1  T  (q ,q ):q2+q2=q2 1 X X 2πx q 1 2 1 2 ˆ = (x, t) sin . 2,t N N x1=1 x2=1 22
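The noise offset $2\sigma_0^2$ in the expected value of $D_o(q, \Delta t)$ can be checked numerically. The following Python sketch is an illustrative Monte Carlo check under assumptions that are ours rather than the source's: independent Gaussian pixel noise with variance $\sigma_0^2$, a small $N \times N$ frame, the $1/N$ normalization of the discrete Fourier transform used in this appendix, and a single wave vector. It estimates $\mathbb{E}[|\Delta \hat{\epsilon}(\mathbf{q}, t, \Delta t)|^2]$, which the derivation gives as $2\sigma_0^2$. The names `fourier_coeff`, `noise_frame` and `mean_noise_power` are hypothetical.

```python
import cmath
import math
import random

def fourier_coeff(img, q1, q2):
    """One 2D DFT coefficient with the 1/N normalization used in Appendix A."""
    n = len(img)
    total = 0j
    for x1 in range(n):
        for x2 in range(n):
            phase = -2j * math.pi * (x1 * q1 + x2 * q2) / n
            total += img[x1][x2] * cmath.exp(phase)
    return total / n

def noise_frame(n, sigma0, rng):
    """N x N frame of independent Gaussian noise with standard deviation sigma0."""
    return [[rng.gauss(0.0, sigma0) for _ in range(n)] for _ in range(n)]

def mean_noise_power(n=8, sigma0=1.5, q=(1, 2), trials=4000, seed=0):
    """Average |FT(eps(t + dt) - eps(t))|^2 over independent frame pairs."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(trials):
        e1 = noise_frame(n, sigma0, rng)
        e2 = noise_frame(n, sigma0, rng)
        # Noise difference between two frames, as in Delta-epsilon above
        diff = [[e2[i][j] - e1[i][j] for j in range(n)] for i in range(n)]
        acc += abs(fourier_coeff(diff, *q)) ** 2
    return acc / trials

estimate = mean_noise_power()
expected = 2 * 1.5 ** 2  # 2 * sigma0^2 for the default sigma0 = 1.5
```

The Monte Carlo estimate should agree with `expected` to within sampling error, which shrinks as `trials` grows.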

APPENDIX B PARAMETER ESTIMATION IN GAUSSIAN PROCESS REGRESSION

Let $\tilde{\mathbf{D}}_o = (\tilde{D}_o(\theta_1), \ldots, \tilde{D}_o(\theta_n))^T$ denote the $n$ observations. The parameters in the Gaussian process contain the mean parameter $m$, variance parameter $\sigma^2$ and inverse range parameters $\beta = (\beta_1, \beta_2)$ in the kernel function. Conditional on $\beta$ and the regularization parameter $\lambda$, the maximum likelihood estimator of the mean parameter is $m_{est} = (\mathbf{1}_n^T \tilde{\mathbf{R}}^{-1} \mathbf{1}_n)^{-1} \mathbf{1}_n^T \tilde{\mathbf{R}}^{-1} \tilde{\mathbf{D}}_o$, with $\tilde{\mathbf{R}} = \mathbf{R} + n\lambda \mathbf{I}_n$, and $\sigma^2_{est} = S^2/n$ with $S^2 = (\tilde{\mathbf{D}}_o - \mathbf{1}_n m_{est})^T \tilde{\mathbf{R}}^{-1} (\tilde{\mathbf{D}}_o - \mathbf{1}_n m_{est})$. Plugging in $(m_{est}, \sigma^2_{est})$, the profile likelihood of parameters $(\beta, \lambda)$ in the covariance function follows $L(\beta, \lambda) \propto |\mathbf{K}|^{-\frac{1}{2}} (S^2)^{-\frac{n}{2}}$. Since no closed-form expression of the maximum likelihood estimator for the range and regularization parameters is available, one often numerically maximizes the profile likelihood to obtain the estimates of $(\beta, \lambda)$. When the sample size is small, the MLE can be unstable and marginal posterior mode estimation with robust parametrization is often used [56]. We implemented the parameter estimation and predictions of GPR by the "RobustGaSP" package available in R and MATLAB [47].

FIG. A1. Comparison of different estimators (denoted by different color bars) for all simulation scenarios. Note that the true noise $2\sigma_0^2$ is kept constant in all simulations and denoted by the thick black line.
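The estimators above can be written out as a short Python sketch. It is not the RobustGaSP implementation: it assumes a product exponential kernel for $\mathbf{R}$ (the kernel is not specified in this appendix), reads $|\mathbf{K}|$ in the profile likelihood as the determinant of $\tilde{\mathbf{R}}$, and replaces numerical optimization with a coarse grid search. The function names, design points and observation values are hypothetical.

```python
import math

def kernel_matrix(thetas, beta):
    """Correlation matrix R from a product exponential kernel (an assumed choice)."""
    n = len(thetas)
    return [[math.exp(-sum(b * abs(a - c)
                           for b, a, c in zip(beta, thetas[i], thetas[j])))
             for j in range(n)] for i in range(n)]

def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def chol_solve(low, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x

def profile_log_likelihood(thetas, d_tilde, beta, lam):
    """Return (log L(beta, lam) up to a constant, m_est, sigma2_est)."""
    n = len(d_tilde)
    r = kernel_matrix(thetas, beta)
    # R-tilde = R + n * lambda * I_n
    r_tilde = [[r[i][j] + (n * lam if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    low = cholesky(r_tilde)
    ri_one = chol_solve(low, [1.0] * n)   # R-tilde^{-1} 1_n
    ri_d = chol_solve(low, d_tilde)       # R-tilde^{-1} D-tilde_o
    m_est = sum(ri_d) / sum(ri_one)
    resid = [d - m_est for d in d_tilde]
    s2 = sum(a * b for a, b in zip(resid, chol_solve(low, resid)))
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))
    return -0.5 * log_det - 0.5 * n * math.log(s2), m_est, s2 / n

# Hypothetical design points theta = (q, dt) and log-transformed observations
thetas = [(0.5, 1.0), (0.5, 4.0), (1.0, 1.0), (1.0, 4.0), (2.0, 2.0)]
d_tilde = [2.1, 3.0, 2.9, 3.6, 3.4]
grid = [(b1, b2, lam) for b1 in (0.1, 1.0, 10.0)
        for b2 in (0.1, 1.0, 10.0) for lam in (1e-6, 1e-3)]
best = max(grid, key=lambda g: profile_log_likelihood(
    thetas, d_tilde, (g[0], g[1]), g[2])[0])
```

A grid search is used only to keep the sketch dependency-free; in practice a numerical optimizer over $(\beta, \lambda)$, or the marginal posterior mode estimation mentioned above, would take its place.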

APPENDIX C DETECTION OF ESTIMATOR BIAS

The overestimation by a series of previously used meth- ods can be detected by plotting Do(q, ∆t) over all q values at one ∆t (see Fig. A2). If at high q, Do(q, ∆t)’s rate of change slows, then this implies that A(q)(1 − f(q, ∆t)) is close to zero, and the bias in estimating the noise term by the second to the fourth estimator in Table I is negli- gible. However, if Do(q, ∆t) decreases (even slightly) as the value q increases, then the bias of the estimator is FIG. A2. A close look at the estimators of noise Best in the non-negligible. first and second scenarios of simple diffusion with identical underlying dynamics σs =2, above. The difference lies in that the second case includes 16× as many particles while APPENDIX D DERIVATION OF MEAN particle radius decreases by 4× (filled circles). For the range SQUARED DISPLACEMENT of q probed, Do(q, ∆t) does not decay to zero at qmax.

Here we derive the MSD for each of the different scenarios explored in simulation. First, the simulated particles in Sections IIIB and IIIC all undergo Brownian motion. Without loss of generality, we may assume the variance of $\Delta x_{i,j}(t)$ is $\sigma_s^2$. For any $\mathbf{x}(t), \mathbf{x}(t + \Delta t) \in \mathbb{R}^2$, the MSD can be simply computed by

$$\mathbb{E}[(x_{ij}(t + \Delta t) - x_{ij}(t))^2] = \mathbb{V}[x_{ij}(t + \Delta t) - x_{ij}(t)] + \{\mathbb{E}[x_{ij}(t + \Delta t) - x_{ij}(t)]\}^2 = \mathbb{V}\left[\sum_{k=0}^{\Delta t - 1} \Delta x_{ij}(t + k)\right] + 0 = \sigma_s^2 \Delta t,$$

for any $j = 1, ..., p$ and $i = 1, 2$. Since particles move isotropically in a 2D space, the MSD is $2\sigma_s^2 \Delta t$.

In the simulated scenarios presented in Section IIID, we can similarly split the MSD into two terms. Noting that $\mathbb{E}[x_{ij}(t + \Delta t) - x_{ij}(t)] = \mu_D \Delta t$ and that the process is isotropic, the MSD is $2\sigma_s^2 \Delta t + 2\mu_D^2 \Delta t^2$.

In the simulated scenarios presented in Section IIIE, note that when $t = t_1$, $x_{ij}(t_1) \sim N(x_{ij}(t_0), \sigma_D^2)$. For any $t > t_1$, it is not hard to show

$$\mathbb{E}[x_{1j}(t)] = (t - 1)\mu_D \cos(\theta_j) + x_{1j}(t_0),$$
$$\mathbb{E}[x_{2j}(t)] = (t - 1)\mu_D \sin(\theta_j) + x_{2j}(t_0),$$

and consequently

$$\sum_{i=1}^{2} \{\mathbb{E}[x_{ij}(t + \Delta t) - x_{ij}(t)]\}^2 = \mu_D^2 \Delta t^2. \quad (35)$$
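The closed-form MSD expressions for the Brownian and drift scenarios (Sections IIIB to IIID) can be checked against a small Monte Carlo simulation. The sketch below is illustrative only and is not the simulation code used in the paper: the function names `simulate_msd` and `msd_theory` are hypothetical, time is measured in unit steps, and each coordinate increment is assumed to be drawn independently from $N(\mu_D, \sigma_s^2)$, which matches $\mathbb{E}[x_{ij}(t + \Delta t) - x_{ij}(t)] = \mu_D \Delta t$ above.

```python
import numpy as np

def simulate_msd(sigma_s, mu_d, lag, n_steps=200, n_particles=2000, seed=0):
    """Empirical 2D MSD at a given integer lag for a random walk with drift.

    ASSUMPTION: every coordinate increment is i.i.d. N(mu_d, sigma_s^2).
    """
    rng = np.random.default_rng(seed)
    steps = rng.normal(loc=mu_d, scale=sigma_s, size=(n_steps, n_particles, 2))
    pos = np.cumsum(steps, axis=0)                 # trajectories x_ij(t)
    disp = pos[lag:] - pos[:-lag]                  # x(t + lag) - x(t) for all t
    return float(np.mean(np.sum(disp ** 2, axis=-1)))

def msd_theory(sigma_s, mu_d, lag):
    """Closed form 2*sigma_s^2*lag + 2*mu_d^2*lag^2 (mu_d = 0 gives pure diffusion)."""
    return 2.0 * sigma_s ** 2 * lag + 2.0 * mu_d ** 2 * lag ** 2

# Compare simulation and theory for pure diffusion and for diffusion with drift.
for mu in (0.0, 0.5):
    empirical = simulate_msd(sigma_s=2.0, mu_d=mu, lag=5)
    print(f"mu_D={mu}: simulated MSD={empirical:.2f}, theory={msd_theory(2.0, mu, 5):.2f}")
```

Averaging over both particles and start times reduces the Monte Carlo error, so the simulated values should agree with the closed forms to within a few percent for the sample sizes used here.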

It is easy to verify $\mathbb{V}[x_{ij}(t)] = \sigma_s^2$. Thus we have

$$\mathbb{V}[x_{ij}(t + \Delta t) - x_{ij}(t)] = \mathbb{V}[x_{ij}(t + \Delta t)] + \mathbb{V}[x_{ij}(t)] - 2\,\mathrm{Cov}(x_{ij}(t + \Delta t), x_{ij}(t)) = 2\sigma_s^2 - 2\,\mathrm{Cov}(\rho^{\Delta t} x_{ij}(t), x_{ij}(t)) = 2\sigma_s^2 - 2\sigma_s^2 \rho^{\Delta t}. \quad (36)$$

Since the process is on a two-dimensional space, combining (35) and (36), the MSD is $4\sigma_s^2 - 4\sigma_s^2 \rho^{\Delta t} + \mu_D^2 \Delta t^2$.

APPENDIX E NOTATIONS

The notations used in this paper are listed in Table IV.

TABLE IV. Table of notations.

Brackets, overheads & superscripts
  $\tilde{\cdots}$                      logarithmically transformed variables
  $\hat{\cdots}$                        Fourier transformed functions
  $\langle \cdots \rangle_i$            ensemble average with respect to variable $i$
  $| \cdots |$                          modulus of complex numbers
  $\cdots_o$                            observed quantities
  $\cdots^*$                            unknown variables
  $\cdots_e$                            estimated quantities
  $\Delta$                              difference of functions or variables
  $\cdots^T$                            transpose

Variables
  $\mathbf{x}$                          coordinates in real space
  $\mathbf{q}$                          coordinates in Fourier transformed (reciprocal) space
  $q$                                   radius of coordinates in Fourier transformed space
  $t$                                   real time in experiments
  $t_{min}$                             starting time in experiments
  $t_{max}$                             ending time in experiments
  $T$                                   the total number of time points in experiments
  $\Delta t$                            lag time
  $\epsilon$                            noise in the image intensity
  $\sigma_o^2$                          variance of the noise in the image intensity
  $n_p$                                 number of particles
  $n_{\Delta t}$                        number of $\Delta t$'s when lag time is $\Delta t$
  $n_q$                                 number of coordinates with radius $q$
  $\tau$                                range parameter in a stretched exponential model
  $\gamma$                              roughness parameter in a stretched exponential model
  $S$                                   sets of variables
  $\Delta T$                            sets of available $\Delta t$'s
  $Q$                                   sets of available $q$'s

Particle Dynamics Simulations
  $\sigma_s$                            diffusive step size
  $\mu_D$                               drift velocity
  $D_m$                                 diffusion coefficient
  $I_p(\mathbf{x}, t)$                  particle intensity function
  $I_c$                                 particle center pixel intensity used in $I_p(\mathbf{x}, t)$
  $I_b(\mathbf{x}, t)$                  background intensity function

Gaussian Process Regression
  $m$                                   mean parameter
  $\sigma^2$                            variance parameter
  $\beta$                               inverse range parameters
  $\lambda$                             regularization parameter
  $\theta$                              noise in regression
  $\mathbf{D}_o$                        vector of observed image structure function

Constants
  $k_B$                                 Boltzmann constant
  $T_a$                                 absolute temperature

Functions & Operators
  $I(\mathbf{x}, t)$                    image intensity function
  $D(q, \Delta t)$                      image structure function
  $D_o(q, \Delta t)$                    observed image structure function (which contains noise)
  $D_e(q, \Delta t)$                    estimated image structure function fit by parameterized model
  $f(q, \Delta t)$                      intermediate scattering function
  $\langle \Delta r^2(\Delta t) \rangle$  mean squared displacement
  $\mathcal{F}(\cdots)$                 operator of the 2D discrete Fourier transform
  $L(\cdots)$                           likelihood function
  $N$                                   normal distribution
  $\mathbb{V}(\cdots)$                  variance of variables
  $\mathbb{E}(\cdots)$                  expected value of variables