
Optimal Multivariate Gaussian Fitting with Applications to PSF Modeling in Two-Photon Microscopy Imaging

Emilie Chouzenoux, Tim Tsz-Kit Lau, Claire Lefort, Jean-Christophe Pesquet

To cite this version:

Emilie Chouzenoux, Tim Tsz-Kit Lau, Claire Lefort, Jean-Christophe Pesquet. Optimal Multivariate Gaussian Fitting with Applications to PSF Modeling in Two-Photon Microscopy Imaging. Journal of Mathematical Imaging and Vision, Springer Verlag, 2019, 61 (7), pp. 1037-1050. doi:10.1007/s10851-019-00884-1. hal-01985663

HAL Id: hal-01985663 https://hal.archives-ouvertes.fr/hal-01985663 Submitted on 18 Jan 2019



Emilie Chouzenoux$^{1,2}$ · Tim Tsz-Kit Lau$^{3}$ · Claire Lefort$^{4}$ · Jean-Christophe Pesquet$^{1}$


Abstract Fitting Gaussian functions to empirical data is a crucial task in a variety of scientific applications, especially in image processing. However, most of the existing approaches for performing such fitting are restricted to two dimensions and cannot be easily extended to higher dimensions. Moreover, they are usually based on alternating minimization schemes which benefit from few theoretical guarantees in the underlying nonconvex setting. In this paper, we provide a novel variational formulation of the multivariate Gaussian fitting problem, which is applicable to any dimension and accounts for possible non-zero background and noise in the input data. The block multiconvexity of our objective function leads us to propose a proximal alternating method to minimize it in order to estimate the Gaussian shape parameters. The resulting FIGARO algorithm is shown to converge to a critical point under mild assumptions. The algorithm shows a good robustness when tested on synthetic datasets. To demonstrate the versatility of FIGARO, we also illustrate its excellent performance in the fitting of the Point Spread Functions of experimental raw data from a two-photon fluorescence microscope.

Keywords Gaussian fitting · Kullback-Leibler divergence · Alternating minimization · Proximal methods · PSF identification · Two-photon fluorescence microscopy

Emilie Chouzenoux: [email protected]
$^{1}$ Center for Visual Computing, CentraleSupélec, INRIA Saclay, Université Paris-Saclay, 91190 Gif-sur-Yvette, France
$^{2}$ Laboratoire d'Informatique Gaspard Monge, UMR CNRS 8049, Université Paris-Est Marne-la-Vallée, 77454 Marne-la-Vallée Cedex 2, France
$^{3}$ Department of Statistics, Northwestern University, Evanston, IL 60208, United States of America
$^{4}$ XLIM Research Institute, UMR CNRS 7252, Université de Limoges, 87032 Limoges, France

1 Introduction

Fitting Gaussian shapes from noisy observed data points is an essential task in various science and engineering applications. In the one-dimensional (1D) case, it lies for instance at the core of spectroscopy signal analysis techniques in physical science [21,31]. In the two-dimensional (2D) case, where Gaussian profile parameters are estimated from images, notable applications include Gaussian beam characterization, particle tracking, and sensor calibration [28,37,15]. In the domain of image recovery, a particularly important application of Gaussian shape fitting is the modeling of Point Spread Functions (PSF) from raw data of optical systems (e.g., microscopes, telescopes). The success of image restoration strategies strongly depends on the accuracy of the PSF estimation [13]. This estimation is often performed through a preliminary step of image acquisition of normalized and calibrated objects, associated with a model fitting strategy. The PSF model is chosen as a trade-off between accuracy and simplicity. Gaussian models often lead to both tractable and good quality approximations [35,32,1,42,41].

Let $L^1(\mathbb{R}^Q)$ denote the space of real-valued summable functions defined on $\mathbb{R}^Q$. In this paper, we address the problem of fitting a Gaussian model to an observed function $y \in L^1(\mathbb{R}^Q)$. We assume that the observed function $y$ can be modeled as

\[ (\forall u \in \mathbb{R}^Q) \quad y(u) = a + b\,p(u) + v(u), \tag{1.1} \]

where $a \in \mathbb{R}$ is a background term, $b \in (0,+\infty)$ is a scaling parameter, $p \in L^1(\mathbb{R}^Q)$ represents a noiseless version of the observed field, and $v$ is a function accounting for acquisition errors. The main assumption is that $p$ is close, in a sense to be made precise, to the probability density function $u \mapsto g(u,\mu,C)$ of a $Q$-dimensional normal distribution with mean $\mu \in \mathbb{R}^Q$ and precision (i.e., inverse covariance) matrix $C \in \mathcal{S}_Q^{++}$ $^{1}$. This distribution is expressed as

\[ (\forall u \in \mathbb{R}^Q)(\forall \mu \in \mathbb{R}^Q)(\forall C \in \mathcal{S}_Q^{++}) \quad g(u,\mu,C) = \sqrt{\frac{|C|}{(2\pi)^Q}}\, \exp\Big(-\frac{1}{2}(u-\mu)^\top C (u-\mu)\Big), \tag{1.2} \]

where $|C|$ denotes the determinant of matrix $C$. The fitting problem thus consists of finding an estimate $(\hat{a},\hat{b},\hat{p},\hat{\mu},\hat{C})$ of $(a,b,p,\mu,C)$ in accordance with model (1.1).

$^{1}$ Throughout the paper, $\mathcal{S}_Q^{++}$ will denote the set of symmetric positive definite matrices of $\mathbb{R}^{Q\times Q}$, $\mathcal{S}_Q^{+}$ the set of symmetric positive semidefinite matrices of $\mathbb{R}^{Q\times Q}$, and $\mathcal{S}_Q$ the set of symmetric matrices of $\mathbb{R}^{Q\times Q}$.

Because of its prominent importance in applications, there has been a significant amount of work on this subject [12,25,24,23,34,42]. To the best of our knowledge, all existing works consider that $p = g(\cdot,\mu,C)$ and they are focused on fitting parameters $(\hat{a},\hat{b},\hat{\mu},\hat{C})$ from $y$. Two main classes of methods can be distinguished. The first set of approaches [25,24,34] is based on the search for the best fitting parameters minimizing a least-squares cost between the observations and the sought model. The minimization process is based on the famous Levenberg-Marquardt alternating minimization strategy. However, it is worth mentioning that few established convergence guarantees are available for this method, which may be detrimental to its reliable use in practice. The second class of methods uses the so-called Caruana's formulation [12]. The idea here is to assume that the background term $a$ is zero and to search for $(\hat{b},\hat{\mu},\hat{C})$ which minimize the difference of logarithms between the data and the model [23,1]. The advantage of such a strategy is that it gives rise to a convex formulation, for which efficient and reliable optimization techniques can be applied. It is however worth emphasizing that all the aforementioned works are focused on the resolution of the fitting problem in low dimensions, that is when $Q = 1$ [12,25,23,34] or $Q = 2$ [24,1,42]. Moreover, except in [34] where a polynomial background is accounted for, the background term $a$ is considered as zero. These assumptions however usually do not correspond to constraints inherent to an experimental setup or environment.

The aim of this paper is to propose a new multivariate Gaussian fitting strategy which avoids the aforementioned limitations. Our method relies on the minimization of a hybrid cost function combining a least-squares data fidelity term, a Kullback-Leibler divergence regularizer for improved robustness, and range constraints on the parameters. This original variational formulation results in a nonconvex minimization problem for which we propose a theoretically sound and efficient proximal alternating iterative resolution scheme. When applied to the analysis of 3D raw data acquired with a two-photon fluorescence microscope, our new computational strategy shows an unprecedented accuracy and reliability.

In Section 2, the data fitting problem is formulated in a variational manner. A proximal alternating optimization method called FIGARO is then proposed in Section 3 for finding a minimizer of the proposed nonconvex cost function. The implementation of the algorithm steps is discussed. The convergence of the sequence of iterates resulting from FIGARO is established in Section 4. Section 5 illustrates the high robustness of our approach to a model mismatch, when compared to a standard nonlinear least squares fitting strategy on 3D synthetic data. In Section 6, the scope of our approach is demonstrated through the analysis of the Point Spread Function of a 3D two-photon fluorescence microscope. Finally, Section 7 concludes the paper.

2 Proposed Variational Formulation

The key ingredient of our method relies on measuring the closeness of $p$ to the Gaussian probability density functions by using the Kullback-Leibler (KL) divergence [5]. Let us first recall the definition of KL divergence. Let $\mathcal{P}$ denote the set of probability density functions supported on $\mathbb{R}^Q$:

\[ \mathcal{P} = \Big\{ q \in L^1(\mathbb{R}^Q) \;\Big|\; (\forall u \in \mathbb{R}^Q)\ q(u) \ge 0,\ \int_{\mathbb{R}^Q} q(u)\,\mathrm{d}u = 1 \Big\}. \tag{2.1} \]

Suppose that $(p,q) \in \mathcal{P}^2$ and that $q$ takes (strictly) positive values. The KL divergence from $q$ to $p$ then reads

\[ \mathrm{KL}(p \,\|\, q) = \int_{\mathbb{R}^Q} p(u) \log\Big(\frac{p(u)}{q(u)}\Big)\,\mathrm{d}u, \tag{2.2} \]

with the convention $0 \log 0 = 0$.

In order to avoid singularity issues, we will assume that the Gaussian variances in each direction are bounded above by some maximal values. The spectrum of the precision matrix $C$ is thus bounded from below, in the sense that there exists some $\varepsilon > 0$ such that $C = D + \varepsilon I_Q$ where $D$ belongs to $\mathcal{S}_Q^{+}$ and $I_Q \in \mathbb{R}^{Q\times Q}$ denotes the identity matrix of $\mathbb{R}^Q$. We then propose to define $(\hat{a},\hat{b},\hat{p},\hat{\mu},\hat{D})$ as a minimizer of a hybrid cost function, gathering information regarding the observation model (1.1) and the Gaussian shape prior (1.2). The minimization problem reads

\[ \underset{a \in \mathcal{A},\, b \in \mathcal{B},\, \mu \in \mathbb{R}^Q,\, p \in \mathcal{P},\, D \in \mathcal{S}_Q^{+}}{\text{minimize}} \quad \frac{1}{2} \int_{\mathbb{R}^Q} \big(y(u) - a - b\,p(u)\big)^2\,\mathrm{d}u + \lambda\, \mathrm{KL}\big(p \,\|\, g(\cdot,\mu,D+\varepsilon I_Q)\big). \tag{2.3} \]
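To make the two ingredients of (2.3) concrete, the following minimal sketch evaluates the density (1.2) and a Riemann-sum approximation of the KL divergence (2.2). It is only an illustration under simplifying assumptions that are not part of the paper: it is restricted to $Q = 1$, where the precision matrix reduces to a scalar `c`, and the function names `gaussian_pdf` and `kl_riemann`, the grid, and the step `delta` are hypothetical choices.

```python
import math

def gaussian_pdf(u, mu, c):
    """1D instance of (1.2): Gaussian density with mean mu and precision c > 0."""
    return math.sqrt(c / (2.0 * math.pi)) * math.exp(-0.5 * c * (u - mu) ** 2)

def kl_riemann(p, q, grid, delta):
    """Riemann-sum approximation of KL(p || q) in (2.2) on a regular grid.

    p and q are callables; q is assumed strictly positive on the grid.
    The convention 0 log 0 = 0 is applied by skipping points where p vanishes.
    """
    total = 0.0
    for u in grid:
        pu = p(u)
        if pu > 0.0:
            total += pu * math.log(pu / q(u)) * delta
    return total

# Illustrative setting (assumed): two unit-precision Gaussians with shifted means,
# sampled on [-10, 10] with step 0.01.
delta = 0.01
grid = [-10.0 + delta * k for k in range(2001)]
p = lambda u: gaussian_pdf(u, 0.0, 1.0)
q = lambda u: gaussian_pdf(u, 1.0, 1.0)
kl_pq = kl_riemann(p, q, grid, delta)
```

For two Gaussians with equal precision $c$ and means $\mu_p$, $\mu_q$, the divergence has the closed form $c(\mu_p-\mu_q)^2/2$, which gives a convenient sanity check for the numerical value `kl_pq`. The same quantities enter the cost (2.3).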

In (2.3), $\mathcal{A}$ and $\mathcal{B}$ are some nonempty closed bounded real intervals corresponding to known bounds on $a$ and $b$ respectively, and $\lambda > 0$ is a regularization parameter weighting the KL penalty term favoring the proximity between $p$ and the Gaussian model (1.2) parametrized by $(\mu,D)$.

In practice, however, one generally has access only to a sampling of $y$, which is performed on a bounded Borel set $\Omega$ of $\mathbb{R}^Q$. The set $\Omega$ is assumed to be chosen large enough so that it captures most of the probability mass of the sought Gaussian distribution. More precisely, we will assume that $\Omega$ is paved into $N \in \mathbb{N}$ voxels of volume $\Delta \in (0,+\infty)$ and mass centers $(x_n)_{1\le n\le N}$. The available vector of observations is then $\mathbf{y} = (y_n)_{1\le n\le N}$ where, for every $n \in \{1,\ldots,N\}$, $y_n = y(x_n)$. After this discretization, by assuming that $y$ and $p$ are continuous functions in (2.3) and that $\Delta$ is small enough, the following more tractable optimization problem is substituted for the original variational formulation:

\[ \underset{a \in \mathcal{A},\, b \in \mathcal{B},\, \mu \in \mathbb{R}^Q,\, \mathbf{p} \in \mathcal{P}_d,\, D \in \mathcal{S}_Q^{+}}{\text{minimize}} \quad \frac{1}{2}\|\mathbf{y} - a 1_N - b\mathbf{p}\|^2 + \lambda \sum_{n=1}^{N} p_n \log\Big(\frac{p_n}{g(x_n,\mu,D+\varepsilon I_Q)}\Big), \tag{2.4} \]

where $\|\cdot\|$ denotes the standard Euclidean norm. The probability density function $p$ has been replaced by the vector $\mathbf{p} = (p_n)_{1\le n\le N}$ which belongs to $\mathcal{P}_d = [0,+\infty)^N \cap \mathcal{C}$, where $\mathcal{C}$ is the affine hyperplane

\[ \mathcal{C} = \Big\{ \mathbf{p} \in \mathbb{R}^N \;\Big|\; \sum_{n=1}^{N} p_n = \Delta^{-1} \Big\}. \tag{2.5} \]

The discrete KL term in (2.4) can be rewritten as

\[ \sum_{n=1}^{N} p_n \log\Big(\frac{p_n}{g(x_n,\mu,D+\varepsilon I_Q)}\Big) = \sum_{n=1}^{N} \Big( \mathrm{ent}(p_n) + p_n \Big[ \frac{Q}{2}\log(2\pi) - \frac{1}{2}\log(|D+\varepsilon I_Q|) + \frac{1}{2}(x_n-\mu)^\top (D+\varepsilon I_Q)(x_n-\mu) \Big] \Big), \tag{2.6} \]

where

\[ (\forall \upsilon \in \mathbb{R}) \quad \mathrm{ent}(\upsilon) = \begin{cases} \upsilon \log \upsilon, & \upsilon > 0, \\ 0, & \upsilon = 0, \\ +\infty, & \text{otherwise.} \end{cases} \tag{2.7} \]

Note that the above definition of the function $\mathrm{ent}$ allows us to impose directly the nonnegativity of the components of $\mathbf{p}$. For technical reasons which will appear later, we will also need to perform a twice continuously differentiable extension of the function $D \mapsto -\log(|D+\varepsilon I_Q|)$ on the whole domain $\mathcal{S}_Q$. This extension $\varphi$ is defined as follows. For every $D \in \mathcal{S}_Q$ decomposed as $U \operatorname{Diag}(\sigma) U^\top$ with $U \in \mathbb{R}^{Q\times Q}$ an orthogonal matrix and $\sigma = (\sigma_q)_{1\le q\le Q}$ the associated vector of eigenvalues of $D$,

\[ \varphi(D) = \tilde{\varphi}(\sigma) = \begin{cases} -\log(|D+\varepsilon I_Q|) = -\sum_{q=1}^{Q} \log(\sigma_q + \varepsilon), & \text{if } D \in \mathcal{S}_Q^{+}, \\ \tilde{\varphi}(0_Q) + \sigma^\top \nabla\tilde{\varphi}(0_Q) + \frac{1}{2}\sigma^\top \nabla^2\tilde{\varphi}(0_Q)\,\sigma, & \text{otherwise,} \end{cases} \tag{2.8} \]

where $0_Q$ is the $Q$-dimensional null vector, $1_Q$ the $Q$-dimensional vector of all ones, and

\[ \tilde{\varphi}(0_Q) = -Q\log\varepsilon, \qquad \nabla\tilde{\varphi}(0_Q) = -\varepsilon^{-1} 1_Q, \qquad \nabla^2\tilde{\varphi}(0_Q) = \varepsilon^{-2} I_Q. \tag{2.9} \]

Let us denote by $\iota_{\mathcal{S}}$ the indicator function of a set $\mathcal{S}$, which is equal to 0 on this set and $+\infty$ otherwise. We are now ready to define the cost function which is minimized in our Gaussian fitting approach:

\[ (\forall a \in \mathbb{R})(\forall b \in \mathbb{R})(\forall \mathbf{p} \in \mathbb{R}^N)(\forall \mu \in \mathbb{R}^Q)(\forall D \in \mathcal{S}_Q) \quad F(a,b,\mathbf{p},\mu,D) = \frac{1}{2}\|\mathbf{y} - a 1_N - b\mathbf{p}\|^2 + \iota_{\mathcal{A}}(a) + \iota_{\mathcal{B}}(b) + \lambda \Psi(\mathbf{p},\mu,D), \tag{2.10} \]

where

\[ (\forall \mathbf{p} \in \mathbb{R}^N)(\forall \mu \in \mathbb{R}^Q)(\forall D \in \mathcal{S}_Q) \quad \Psi(\mathbf{p},\mu,D) = \sum_{n=1}^{N} \Big( \mathrm{ent}(p_n) + \frac{p_n}{2}\big[ Q\log(2\pi) + \varphi(D) + (x_n-\mu)^\top (D+\varepsilon I_Q)(x_n-\mu) \big] \Big) + \iota_{\mathcal{C}}(\mathbf{p}) + \iota_{\mathcal{S}_Q^{+}}(D). \tag{2.11} \]

Remark 1 The proposed formulation deals with a regular grid but it can be easily extended to the case of irregular sampling by changing the definition of $\mathcal{C}$ into

\[ \mathcal{C} = \Big\{ \mathbf{p} \in \mathbb{R}^N \;\Big|\; \sum_{n=1}^{N} \Delta_n p_n = 1 \Big\}, \tag{2.12} \]

where, for every $n \in \{1,\ldots,N\}$, $\Delta_n \in (0,+\infty)$ is the volume of the $n$-th voxel.

3 FIGARO Minimization Algorithm

3.1 Proposed Algorithm

The objective function (2.10) is nonconvex, yet convex with respect to each variable. A standard resolution approach is thus to adopt an alternating minimization strategy, where, at each iteration, $F$ is minimized with respect to one variable while the others remain fixed. This approach, sometimes referred to as Block Coordinate Descent or nonlinear Gauss-Seidel method, has been widely used in the context of PSF model fitting [42,30,32]. However, its convergence is only guaranteed under restrictive assumptions [38]. In order to get sounder convergence results, we propose to use an alternative strategy based on proximal tools, which consists of replacing, at each iteration, the direct minimization step by a proximal one ([33, Def. 1.22], [6, Def. 12.23], [18, Def. 10.1], [11]).

Definition 1 (Domain) Let $f$ be a function from $\mathbb{R}^n$ to $(-\infty,+\infty]$. The domain of $f$ is defined by

\[ \operatorname{dom} f := \{ x \in \mathbb{R}^n : f(x) < +\infty \}. \]

The function $f$ is proper if and only if $\operatorname{dom} f$ is nonempty.

Definition 2 (Proximity operator) Let $f : \mathbb{R}^n \to (-\infty,+\infty]$ be a convex, proper, lower semi-continuous function. The proximity operator of $f$ at $x \in \mathbb{R}^n$ is defined as

\[ \operatorname{prox}_f(x) = \underset{y \in \mathbb{R}^n}{\operatorname{argmin}}\; f(y) + \frac{1}{2}\|y - x\|^2. \]

Let $\mathcal{S}$ be a nonempty closed convex subset of $\mathbb{R}^n$. Then $\operatorname{prox}_{\iota_{\mathcal{S}}}$ is equal to the projection $P_{\mathcal{S}}$ onto $\mathcal{S}$.

The application of the proximal alternating method [4,2,8] to the minimization of (2.10) yields Algorithm 1, called FIGARO (Fitting Gaussians with Proximal Optimization).

Algorithm 1 FIGARO method
  $a^{(0)} \in \mathcal{A}$, $b^{(0)} \in \mathcal{B}$, $\mathbf{p}^{(0)} \in \mathcal{C}$, $\mu^{(0)} \in \mathbb{R}^Q$, $D^{(0)} \in \mathcal{S}_Q^{+}$,
  $(\gamma_a,\gamma_b,\gamma_p,\gamma_\mu,\gamma_D) \in (0,+\infty)^5$.
  for $i = 0,1,\ldots$ do
    $a^{(i+1)} = \operatorname{prox}_{\gamma_a F(\cdot,\,b^{(i)},\,\mathbf{p}^{(i)},\,\mu^{(i)},\,D^{(i)})}(a^{(i)})$
    $b^{(i+1)} = \operatorname{prox}_{\gamma_b F(a^{(i+1)},\,\cdot,\,\mathbf{p}^{(i)},\,\mu^{(i)},\,D^{(i)})}(b^{(i)})$
    $\mathbf{p}^{(i+1)} = \operatorname{prox}_{\gamma_p F(a^{(i+1)},\,b^{(i+1)},\,\cdot,\,\mu^{(i)},\,D^{(i)})}(\mathbf{p}^{(i)})$
    $\mu^{(i+1)} = \operatorname{prox}_{\gamma_\mu F(a^{(i+1)},\,b^{(i+1)},\,\mathbf{p}^{(i+1)},\,\cdot,\,D^{(i)})}(\mu^{(i)})$
    $D^{(i+1)} = \operatorname{prox}_{\gamma_D F(a^{(i+1)},\,b^{(i+1)},\,\mathbf{p}^{(i+1)},\,\mu^{(i+1)},\,\cdot)}(D^{(i)})$
  end for

Note that other methods such as those proposed in [40,17,10] are also applicable to our problem, but the considered alternating proximal point algorithm may appear preferable because of its simplicity.

3.2 Expressions of the Proximity Operators

In this part, we show that the proximity operators required in Algorithm 1 have closed form expressions.

Proposition 1 Let $(a,b,\mathbf{p},\mu,D) \in \mathbb{R}\times\mathbb{R}\times\mathbb{R}^N\times\mathbb{R}^Q\times\mathcal{S}_Q$ and $(\gamma_a,\gamma_b) \in (0,+\infty)^2$. The proximity operator of $\gamma_a F(\cdot,b,\mathbf{p},\mu,D)$ at $a$ is given by

\[ \operatorname{prox}_{\gamma_a F(\cdot,b,\mathbf{p},\mu,D)}(a) = P_{\mathcal{A}}\Big( \frac{a + \gamma_a 1_N^\top(\mathbf{y} - b\mathbf{p})}{1 + \gamma_a N} \Big) \tag{3.1} \]

and the proximity operator of $\gamma_b F(a,\cdot,\mathbf{p},\mu,D)$ at $b$ is given by

\[ \operatorname{prox}_{\gamma_b F(a,\cdot,\mathbf{p},\mu,D)}(b) = P_{\mathcal{B}}\Big( \frac{b + \gamma_b (\mathbf{y} - a 1_N)^\top \mathbf{p}}{1 + \gamma_b \|\mathbf{p}\|^2} \Big). \tag{3.2} \]

Proof Calculating the proximity operator of $\gamma_a F(\cdot,b,\mathbf{p},\mu,D)$ is equivalent to calculating the proximity operator of the one-variable function $\vartheta + \iota_{\mathcal{A}}$ where

\[ (\forall a \in \mathbb{R}) \quad \vartheta(a) = \frac{\gamma_a}{2} \sum_{n=1}^{N} (y_n - a - b p_n)^2. \tag{3.3} \]

It follows from [14] that

\[ \operatorname{prox}_{\gamma_a F(\cdot,b,\mathbf{p},\mu,D)} = P_{\mathcal{A}} \circ \operatorname{prox}_{\vartheta}. \tag{3.4} \]

On the other hand, it follows from [18] that

\[ \operatorname{prox}_{\vartheta}(a) = \frac{a + \gamma_a 1_N^\top(\mathbf{y} - b\mathbf{p})}{1 + \gamma_a N}. \tag{3.5} \]

Expression (3.2) is obtained by similar arguments. $\square$

Proposition 2 Let $(a,b,\mathbf{p},\mu,D) \in \mathbb{R}\times\mathbb{R}\times\mathbb{R}^N\times\mathbb{R}^Q\times\mathcal{S}_Q$ and $\gamma_p > 0$. The proximity operator of $\gamma_p F(a,b,\cdot,\mu,D)$ at $\mathbf{p}$ is given by

\[ \operatorname{prox}_{\gamma_p F(a,b,\cdot,\mu,D)}(\mathbf{p}) = \Big( \rho^{-1} W\big(\rho \exp(w_n(\hat{\nu}))\big) \Big)_{1\le n\le N}, \tag{3.6} \]

where $W$ denotes the Lambert-W function [19],

\[ \rho = \frac{\gamma_p b^2 + 1}{\gamma_p \lambda}, \tag{3.7} \]

and, for every $n \in \{1,\ldots,N\}$, $w_n$ is the function defined as

\[ (\forall \nu \in \mathbb{R}) \quad w_n(\nu) = -1 - c_n + (\gamma_p\lambda)^{-1}\big(p_n + \gamma_p b (y_n - a) - \nu\big), \tag{3.8} \]

with

\[ c_n = \frac{Q}{2}\log(2\pi) + \frac{1}{2}\varphi(D) + \frac{1}{2}(x_n-\mu)^\top (D+\varepsilon I_Q)(x_n-\mu). \tag{3.9} \]

Moreover, $\hat{\nu} \in \mathbb{R}$ is the unique zero of the function

\[ (\forall \nu \in \mathbb{R}) \quad \Phi(\nu) = \rho^{-1} \sum_{n=1}^{N} W\big(\rho \exp(w_n(\nu))\big) - \Delta^{-1}. \tag{3.10} \]
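As an illustration of Proposition 2, the sketch below evaluates (3.6)-(3.10) with the standard library only. It is a minimal sketch under stated assumptions, not the authors' implementation: the helper `lambert_w_exp` is a hypothetical routine that computes $W(e^t)$ by solving $x + \log x = t$ (so that $\rho \exp(w_n)$ is never formed explicitly), the zero $\hat{\nu}$ of $\Phi$ is sought with the Newton iteration of Lemma 1 using the derivative (3.15), and the starting point $\nu = 0$, tolerances, and numerical values are arbitrary. The vector `c` is assumed to be precomputed from (3.9).

```python
import math

def lambert_w_exp(t, iters=100, tol=1e-14):
    """Return x = W(exp(t)), i.e. the solution x > 0 of x + log(x) = t.

    Newton's method on the concave increasing map x -> x + log(x) - t,
    started to the left of the root so that the iterates increase monotonically.
    """
    if t < -700.0:
        return math.exp(t)  # W(z) ~ z for tiny z; avoids log(0) after underflow
    x = 1.0 if t >= 1.0 else math.exp(t - 1.0)
    for _ in range(iters):
        x_new = x * (1.0 + t - math.log(x)) / (1.0 + x)
        if abs(x_new - x) <= tol * max(1.0, abs(x_new)):
            return x_new
        x = x_new
    return x

def prox_p(p_tilde, y, a, b, c, gamma_p, lam, delta, newton_iters=100, tol=1e-12):
    """Sketch of (3.6): proximity operator with respect to p, evaluated at p_tilde."""
    rho = (gamma_p * b * b + 1.0) / (gamma_p * lam)            # (3.7)
    log_rho = math.log(rho)
    n_vox = len(y)

    def w(n, nu):                                              # (3.8)
        return -1.0 - c[n] + (p_tilde[n] + gamma_p * b * (y[n] - a) - nu) / (gamma_p * lam)

    nu = 0.0                                                   # arbitrary starting point
    for _ in range(newton_iters):
        big_w = [lambert_w_exp(w(n, nu) + log_rho) for n in range(n_vox)]
        phi = sum(big_w) / rho - 1.0 / delta                   # (3.10)
        dphi = -sum(v / (v + 1.0) for v in big_w) / (gamma_p * lam * rho)  # (3.15)
        step = phi / dphi
        nu -= step                                             # Newton update (3.16)
        if abs(step) <= tol * max(1.0, abs(nu)):
            break
    p_hat = [lambert_w_exp(w(n, nu) + log_rho) / rho for n in range(n_vox)]
    return p_hat, nu

# Small illustrative problem (all values are assumptions chosen for the example).
p_tilde = [0.4, 0.6, 0.5, 0.5]
y = [1.0, 2.0, 1.5, 0.5]
c = [0.3, 0.1, 0.2, 0.5]
a, b, gamma_p, lam, delta = 0.2, 1.5, 1.0, 0.5, 0.5
p_hat, nu_hat = prox_p(p_tilde, y, a, b, c, gamma_p, lam, delta)
```

By construction, the returned vector has positive entries that sum to $\Delta^{-1}$, which is the constraint defining $\mathcal{C}$ in (2.5). A library routine for the Lambert-W function could replace `lambert_w_exp`, at the price of handling possible overflow of $\rho\exp(w_n(\nu))$ separately.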

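Proof (of Proposition 2) Let $\tilde{\mathbf{p}} \in \mathbb{R}^N$. Then,

\[ \begin{aligned} \hat{\mathbf{p}} &= \operatorname{prox}_{\gamma_p F(a,b,\cdot,\mu,D)}(\tilde{\mathbf{p}}) \\ &= \underset{\mathbf{p} \in \mathbb{R}^N}{\operatorname{argmin}}\; \gamma_p F(a,b,\mathbf{p},\mu,D) + \frac{1}{2}\|\mathbf{p} - \tilde{\mathbf{p}}\|^2 \\ &= \underset{\mathbf{p} \in \mathcal{C}}{\operatorname{argmin}}\; \gamma_p \sum_{n=1}^{N} \frac{1}{2}(y_n - a - b p_n)^2 + \gamma_p \lambda \sum_{n=1}^{N} (p_n \log p_n + p_n c_n) + \frac{1}{2}\sum_{n=1}^{N} (p_n - \tilde{p}_n)^2. \end{aligned} \tag{3.11} \]

The Lagrangian function associated with the above constrained problem reads

\[ (\forall \mathbf{p} \in [0,+\infty)^N)(\forall \nu \in \mathbb{R}) \quad \mathcal{L}(\mathbf{p},\nu) = \gamma_p \sum_{n=1}^{N} \frac{1}{2}(y_n - a - b p_n)^2 + \sum_{n=1}^{N} \Big( \gamma_p\lambda (p_n \log p_n + p_n c_n) + \frac{1}{2}(p_n - \tilde{p}_n)^2 \Big) + \nu \Big( \sum_{n=1}^{N} p_n - \Delta^{-1} \Big). \tag{3.12} \]

Since Slater's condition obviously holds, there exists $\hat{\nu} \in \mathbb{R}$ such that $(\hat{\mathbf{p}},\hat{\nu})$ is a saddle point of the Lagrangian $\mathcal{L}$ [7]. By Fermat's rule [6], $\hat{\mathbf{p}} = (\hat{p}_n)_{1\le n\le N}$ is thus obtained by finding a zero of the partial subdifferential of $\mathcal{L}$ with respect to variable $\mathbf{p}$. By using (3.7), this yields, for every $n \in \{1,\ldots,N\}$,

\[ \begin{aligned} & \gamma_p (b^2 \hat{p}_n - b y_n + ab) + \gamma_p\lambda (1 + \log \hat{p}_n + c_n) + \hat{p}_n - \tilde{p}_n + \hat{\nu} = 0 \\ \Leftrightarrow\quad & \rho\, \hat{p}_n + \log \hat{p}_n = w_n(\hat{\nu}) \\ \Leftrightarrow\quad & \rho\, \hat{p}_n \exp(\rho\, \hat{p}_n) = \rho \exp(w_n(\hat{\nu})). \end{aligned} \tag{3.13} \]

By recalling that the Lambert-W function is such that $(\forall z \in \mathbb{R}_+)$ $W(z)\exp(W(z)) = z$, we deduce (3.6).

In addition, canceling the derivative of $\mathcal{L}$ with respect to $\nu$ amounts to finding a zero of function $\Phi$ defined in (3.10). The existence of a zero is guaranteed by the existence of $\hat{\mathbf{p}}$. Let us now establish its uniqueness by evaluating the derivative $\Phi'$ using the following property of the Lambert-W function:

\[ (\forall z \in \mathbb{R}_+) \quad W'(z) = \frac{1}{(W(z)+1)\,e^{W(z)}} = \frac{W(z)}{(W(z)+1)\,z}. \tag{3.14} \]

We have then

\[ (\forall \nu \in \mathbb{R}) \quad \Phi'(\nu) = \sum_{n=1}^{N} W'\big(\rho\exp(w_n(\nu))\big) \exp\big(w_n(\nu)\big)\, w_n'(\nu) = -\frac{1}{\gamma_p\lambda\rho} \sum_{n=1}^{N} \Big( 1 - \frac{1}{W\big(\rho\exp(w_n(\nu))\big) + 1} \Big). \tag{3.15} \]

Therefore, since $W$ takes positive values on $(0,+\infty)$, $\Phi'(\nu) < 0$ for every $\nu \in \mathbb{R}$, i.e., $\Phi$ is strictly decreasing. We thus conclude that it has a unique zero $\hat{\nu}$. $\square$

The computation of the above proximity operator requires determining the zero of the scalar function $\Phi$. The following lemma shows that this can be achieved with high precision using the Newton algorithm, the convergence of which is guaranteed for any initialization.

Lemma 1 The Newton iteration

\[ (\forall t \in \mathbb{N}) \quad \nu^{(t+1)} = \nu^{(t)} - \frac{\Phi(\nu^{(t)})}{\Phi'(\nu^{(t)})} \tag{3.16} \]

converges to the unique zero of $\Phi$ from any starting point $\nu^{(0)} \in \mathbb{R}$.

Proof We have already shown that $\Phi$ is strictly decreasing on $\mathbb{R}$ and has a unique zero. Let us now establish the convexity of $\Phi$ by calculating its second-order derivative:

\[ (\forall \nu \in \mathbb{R}) \quad \Phi''(\nu) = -\frac{1}{\gamma_p\lambda} \sum_{n=1}^{N} \frac{W'\big(\rho\exp(w_n(\nu))\big)\exp(w_n(\nu))\, w_n'(\nu)}{\big(W(\rho\exp(w_n(\nu))) + 1\big)^2} = \frac{1}{\gamma_p^2\lambda^2\rho} \sum_{n=1}^{N} \frac{W\big(\rho\exp(w_n(\nu))\big)}{\big(W(\rho\exp(w_n(\nu))) + 1\big)^3}. \tag{3.17} \]

Since $W(\rho\exp(w_n(\nu))) > 0$ for every $n \in \{1,\ldots,N\}$ and $\nu \in \mathbb{R}$, we have $\Phi''(\nu) > 0$ for all $\nu \in \mathbb{R}$, i.e., $\Phi$ is strictly convex. Now, let us ascertain the convergence of Newton's method for finding the unique root of $\Phi$. The remainder of our proof follows similar arguments as the one of [29, Chapter 3, Theorem 2]. For every $t \in \mathbb{N}$, let $e^{(t)}$ be the error defined as

\[ (\forall t \in \mathbb{N}) \quad e^{(t)} = \nu^{(t)} - \hat{\nu}, \]

where $\hat{\nu}$ is the zero of $\Phi$. From the definition of the Newton iteration, we have

\[ (\forall t \in \mathbb{N}) \quad e^{(t+1)} = \nu^{(t+1)} - \hat{\nu} = \nu^{(t)} - \frac{\Phi(\nu^{(t)})}{\Phi'(\nu^{(t)})} - \hat{\nu} = e^{(t)} - \frac{\Phi(\nu^{(t)})}{\Phi'(\nu^{(t)})} = \frac{e^{(t)}\Phi'(\nu^{(t)}) - \Phi(\nu^{(t)})}{\Phi'(\nu^{(t)})}. \tag{3.18} \]

By performing a second-order Taylor expansion, we get

\[ (\forall t \in \mathbb{N}) \quad 0 = \Phi(\hat{\nu}) = \Phi(\nu^{(t)} - e^{(t)}) = \Phi(\nu^{(t)}) - e^{(t)}\Phi'(\nu^{(t)}) + \frac{1}{2}(e^{(t)})^2\, \Phi''(\xi^{(t)}), \tag{3.19} \]

where, for all $t \in \mathbb{N}$, $\xi^{(t)} \in [\min(\hat{\nu},\nu^{(t)}), \max(\hat{\nu},\nu^{(t)})]$. Combining the latter equality with (3.18) yields

\[ (\forall t \in \mathbb{N}) \quad e^{(t+1)} = \frac{1}{2}\, \frac{\Phi''(\xi^{(t)})}{\Phi'(\nu^{(t)})}\, (e^{(t)})^2. \tag{3.20} \]

Recall that $\Phi'(\nu) < 0$ and $\Phi''(\nu) > 0$ for all $\nu \in \mathbb{R}$. According to (3.20), for every $t \in \mathbb{N}$, $e^{(t+1)} < 0$, which implies that $\nu^{(t)} < \hat{\nu}$ for all $t \ge 1$. Thus, since $\Phi$ is strictly decreasing, $(\forall t \ge 1)$ $\Phi(\nu^{(t)}) > \Phi(\hat{\nu}) = 0$. By (3.18), $(\forall t \ge 1)$ $e^{(t+1)} > e^{(t)}$, and thus $(e^{(t)})_{t\ge 1}$ is increasing and upper bounded by 0. Hence, $(\nu^{(t)})_{t\ge 1}$ is also increasing and upper bounded by $\hat{\nu}$. Therefore, the limits $e^* = \lim_{t\to+\infty} e^{(t)}$ and $\nu^* = \lim_{t\to+\infty} \nu^{(t)}$ exist. We deduce from (3.18) that $e^* = e^* - \Phi(\nu^*)/\Phi'(\nu^*)$, which implies that $\Phi(\nu^*) = 0$ and $\nu^* = \hat{\nu}$. $\square$

Proposition 3 Let $(a,b,\mathbf{p},\mu,D) \in \mathbb{R}\times\mathbb{R}\times\mathbb{R}^N\times\mathbb{R}^Q\times\mathcal{S}_Q$ and $\gamma_\mu > 0$. The proximity operator of $\gamma_\mu F(a,b,\mathbf{p},\cdot,D)$ at $\mu$ is given by

\[ \operatorname{prox}_{\gamma_\mu F(a,b,\mathbf{p},\cdot,D)}(\mu) = \Big( I_Q + \gamma_\mu\lambda\,(1_N^\top \mathbf{p})(D+\varepsilon I_Q) \Big)^{-1} \Big( \mu + \gamma_\mu\lambda \sum_{n=1}^{N} p_n (D+\varepsilon I_Q)\, x_n \Big). \tag{3.21} \]

Proof Calculating the proximity operator of $\gamma_\mu F(a,b,\mathbf{p},\cdot,D)$ is equivalent to calculating the proximity operator of the quadratic function

\[ \mu \mapsto \gamma_\mu\lambda \sum_{n=1}^{N} \frac{p_n}{2}(x_n-\mu)^\top (D+\varepsilon I_Q)(x_n-\mu). \tag{3.22} \]

The result then follows from [18]. $\square$

Proposition 4 Let $(a,b,\mathbf{p},\mu,D) \in \mathbb{R}\times\mathbb{R}\times\mathbb{R}^N\times\mathbb{R}^Q\times\mathcal{S}_Q$ and $\gamma_D > 0$. The proximity operator of $\gamma_D F(a,b,\mathbf{p},\mu,\cdot)$ at $D$ is given by

\[ \operatorname{prox}_{\gamma_D F(a,b,\mathbf{p},\mu,\cdot)}(D) = V \operatorname{Diag}\Big( \Big( \frac{1}{2}\max\Big( \omega_q - \varepsilon + \sqrt{(\omega_q+\varepsilon)^2 + 4m},\, 0 \Big) \Big)_{1\le q\le Q} \Big) V^\top, \tag{3.23} \]

where $\omega = (\omega_q)_{1\le q\le Q}$ is a vector of eigenvalues of $D - S$ and $V$ is a $Q\times Q$ orthogonal matrix such that $D - S = V\operatorname{Diag}(\omega)V^\top$, with $S = \frac{1}{2}\gamma_D\lambda \sum_{n=1}^{N} p_n (x_n-\mu)(x_n-\mu)^\top$ and $m = \frac{1}{2}\gamma_D\lambda\,(1_N^\top \mathbf{p})$.

Proof Let $\|\cdot\|_F$ denote the Frobenius norm and let $\tilde{D} \in \mathcal{S}_Q$. We have

\[ \begin{aligned} \operatorname{prox}_{\gamma_D F(a,b,\mathbf{p},\mu,\cdot)}(\tilde{D}) &= \underset{D \in \mathcal{S}_Q}{\operatorname{argmin}}\; \gamma_D F(a,b,\mathbf{p},\mu,D) + \frac{1}{2}\|D - \tilde{D}\|_F^2 \\ &= \underset{D \in \mathcal{S}_Q}{\operatorname{argmin}}\; \frac{1}{2}\|D - \tilde{D}\|_F^2 + \operatorname{tr}(DS) + m\varphi(D) + \iota_{\mathcal{S}_Q^{+}}(D) \\ &= \operatorname{prox}_{m\varphi + \iota_{\mathcal{S}_Q^{+}}}(\tilde{D} - S). \end{aligned} \tag{3.24} \]

Since $\varphi$ and $\iota_{\mathcal{S}_Q^{+}}$ are spectral functions on $\mathcal{S}_Q$ associated with the functions $\tilde{\varphi}$ and $\iota_{[0,+\infty)^Q}$, respectively, it follows from [6, Corollary 24.65] that

\[ \operatorname{prox}_{\gamma_D F(a,b,\mathbf{p},\mu,\cdot)}(\tilde{D}) = V \operatorname{Diag}\Big( \operatorname{prox}_{m\tilde{\varphi} + \iota_{[0,+\infty)^Q}}(\omega) \Big) V^\top, \tag{3.25} \]

where $\omega = (\omega_q)_{1\le q\le Q}$ is a vector of eigenvalues of $\tilde{D} - S$ and $V$ is a $Q\times Q$ orthogonal matrix such that $\tilde{D} - S = V\operatorname{Diag}(\omega)V^\top$. Since $m\tilde{\varphi} + \iota_{[0,+\infty)^Q}$ is a separable function,

\[ \operatorname{prox}_{m\tilde{\varphi} + \iota_{[0,+\infty)^Q}}(\omega) = (\hat{\sigma}_q)_{1\le q\le Q}, \tag{3.26} \]

where, for every $q \in \{1,\ldots,Q\}$,

\[ \hat{\sigma}_q = \underset{\sigma_q \in [0,+\infty)}{\operatorname{argmin}}\; -m\log(\sigma_q + \varepsilon) + \frac{1}{2}(\sigma_q - \omega_q)^2 = \frac{1}{2}\max\Big( \omega_q - \varepsilon + \sqrt{(\omega_q+\varepsilon)^2 + 4m},\, 0 \Big). \tag{3.27} \]

$\square$

4 Convergence Analysis

Let us now establish the convergence of the iterates generated by Algorithm 1. Our analysis will rely on the observation that FIGARO can be viewed as a special instance of the regularized Gauss-Seidel method from [4].

4.1 Preliminaries

Let us first recall some useful definitions concerning variational analysis and the fundamental Kurdyka-Łojasiewicz property that will be at the core of the convergence analysis of our algorithm.

Definition 3 (Subdifferential) [33, Def. 8.3] Let $f : \mathbb{R}^n \to (-\infty,+\infty]$ be a proper function.

(a) For a given $x \in \operatorname{dom} f$, the Fréchet subdifferential of $f$ at $x$, written $\hat{\partial} f(x)$, is the set of all vectors $u \in \mathbb{R}^n$ which satisfy

\[ \liminf_{y \ne x,\; y \to x} \frac{f(y) - f(x) - \langle u, y - x \rangle}{\|y - x\|} \ge 0. \]

When $x \notin \operatorname{dom} f$, we set $\hat{\partial} f(x) = \varnothing$.

(b) The limiting-subdifferential, or simply the subdifferential, of $f$ at $x \in \operatorname{dom} f$, written $\partial f(x)$, is defined as

\[ \partial f(x) = \big\{ v \in \mathbb{R}^n \;\big|\; \exists\, x^{(t)} \to x,\ f(x^{(t)}) \to f(x),\ v^{(t)} \in \hat{\partial} f(x^{(t)}) \to v \big\}. \]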

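Definition 4 (Kurdyka-Łojasiewicz property) [10] The function $f : \mathbb{R}^n \to (-\infty,+\infty]$ is said to satisfy the Kurdyka-Łojasiewicz (KL) property at $x^* \in \operatorname{dom} \partial f$ if there exist $\eta \in (0,+\infty]$, a neighbourhood $U$ of $x^*$, and a continuous concave function $\phi : [0,\eta) \to \mathbb{R}_+$ such that

(a) $\phi(0) = 0$,
(b) $\phi$ is $C^1$ on $(0,\eta)$,
(c) for all $s \in (0,\eta)$, $\phi'(s) > 0$,
(d) for all $x \in U \cap [f(x^*) < f < f(x^*) + \eta]$, the Kurdyka-Łojasiewicz inequality holds:

\[ \phi'\big(f(x) - f(x^*)\big)\, \operatorname{dist}\big(0, \partial f(x)\big) \ge 1. \]

Moreover, $f$ is called a KL function if it satisfies the Kurdyka-Łojasiewicz inequality at every point in $\operatorname{dom} \partial f$.

4.2 Convergence Theorem

In order to establish convergence results, we will show that the objective function is KL, and that it can be split into the sum of a locally Lipschitz differentiable part involving all the variables, and non differentiable separable terms.

Lemma 2 Function $F$ defined in (2.10) is a KL function.

Proof Let us recall that there exists an o-minimal structure, denoted by $S(\mathbb{R}_{\mathrm{an},\exp})$ with $\mathbb{R}_{\mathrm{an},\exp} := (\mathbb{R},+,\cdot,(f),\exp)$, that contains the exponential function and every restricted analytic function (see [20, Example (6), pp. 505]). Note that $S(\mathbb{R}_{\mathrm{an},\exp})$ also contains the function $\log : (0,+\infty) \to \mathbb{R}$ and $(\cdot)^r : \mathbb{R} \to \mathbb{R}$ defined by

\[ a \mapsto \begin{cases} a^r, & a > 0, \\ 0, & a \le 0, \end{cases} \]

where $r \in \mathbb{R}$. Then, by using [20, Section 5], we conclude that $F$ is definable in an o-minimal structure. As a consequence, the results of [9] and Theorem 4.1 of [3] apply and hence $F$ is a KL function. $\square$

Lemma 3 Function $F$ defined in (2.10) can be rewritten as

\[ (\forall a \in \mathbb{R})(\forall b \in \mathbb{R})(\forall \mathbf{p} \in \mathbb{R}^N)(\forall \mu \in \mathbb{R}^Q)(\forall D \in \mathcal{S}_Q) \quad F(a,b,\mathbf{p},\mu,D) = G(a,b,\mathbf{p},\mu,D) + f_1(a) + f_2(b) + f_3(\mathbf{p}) + f_4(D), \tag{4.1} \]

where

\[ G(a,b,\mathbf{p},\mu,D) = \frac{1}{2}\|\mathbf{y} - a 1_N - b\mathbf{p}\|^2 + \lambda \sum_{n=1}^{N} \frac{p_n}{2}\Big( (x_n-\mu)^\top (D+\varepsilon I_Q)(x_n-\mu) + \varphi(D) \Big), \tag{4.2} \]

\[ (\forall a \in \mathbb{R}) \quad f_1(a) = \iota_{\mathcal{A}}(a), \tag{4.3} \]

\[ (\forall b \in \mathbb{R}) \quad f_2(b) = \iota_{\mathcal{B}}(b), \tag{4.4} \]

\[ (\forall \mathbf{p} \in \mathbb{R}^N) \quad f_3(\mathbf{p}) = \iota_{\mathcal{C}}(\mathbf{p}) + \lambda \sum_{n=1}^{N} \Big( \mathrm{ent}(p_n) + \frac{Q}{2}\, p_n \log(2\pi) \Big), \tag{4.5} \]

\[ (\forall D \in \mathcal{S}_Q) \quad f_4(D) = \iota_{\mathcal{S}_Q^{+}}(D). \tag{4.6} \]

Moreover, $G$ is $C^2$ on $\mathbb{R}\times\mathbb{R}\times\mathbb{R}^N\times\mathbb{R}^Q\times\mathcal{S}_Q$.

Proof We first calculate the gradients $\nabla_a G$, $\nabla_b G$, $\nabla_p G$, $\nabla_\mu G$ and $\nabla_D G$ of $G$ with respect to the different variables. Let us denote by $(e_{n,N})_{1\le n\le N}$ the canonical basis of $\mathbb{R}^N$. For every $(a,b,\mathbf{p},\mu,D) \in \mathbb{R}\times\mathbb{R}\times\mathbb{R}^N\times\mathbb{R}^Q\times\mathcal{S}_Q$,

\[ \begin{aligned} \nabla_a G(a,b,\mathbf{p},\mu,D) &= b\,1_N^\top \mathbf{p} + Na - 1_N^\top \mathbf{y}, \\ \nabla_b G(a,b,\mathbf{p},\mu,D) &= b\|\mathbf{p}\|^2 - \mathbf{y}^\top \mathbf{p} + a\,1_N^\top \mathbf{p} = (b\mathbf{p} - \mathbf{y} + a 1_N)^\top \mathbf{p}, \\ \nabla_p G(a,b,\mathbf{p},\mu,D) &= b^2\mathbf{p} - b\mathbf{y} + ab\,1_N + \frac{\lambda}{2}\sum_{n=1}^{N} \Big( (x_n-\mu)^\top (D+\varepsilon I_Q)(x_n-\mu) + \varphi(D) \Big) e_{n,N}, \\ \nabla_\mu G(a,b,\mathbf{p},\mu,D) &= \lambda \sum_{n=1}^{N} p_n (D+\varepsilon I_Q)(\mu - x_n), \\ \nabla_D G(a,b,\mathbf{p},\mu,D) &= \begin{cases} \dfrac{\lambda}{2}\displaystyle\sum_{n=1}^{N} p_n \Big( (x_n-\mu)(x_n-\mu)^\top - (D+\varepsilon I_Q)^{-1} \Big) & \text{if } D \in \mathcal{S}_Q^{+}, \\[2ex] \dfrac{\lambda}{2}\displaystyle\sum_{n=1}^{N} p_n \Big( (x_n-\mu)(x_n-\mu)^\top - \varepsilon^{-1} I_Q + \varepsilon^{-2} D \Big) & \text{otherwise.} \end{cases} \end{aligned} \tag{4.7} \]

Let us now calculate the partial second-order derivatives of $G$. In the following, $\otimes$ denotes the matrix Kronecker product and $\operatorname{vec}(M)$ the columnwise ordering of a matrix $M$. For every $(a,b,\mathbf{p},\mu,D) \in \mathbb{R}\times\mathbb{R}\times\mathbb{R}^N\times\mathbb{R}^Q\times\mathcal{S}_Q$, by setting $d = \operatorname{vec}(D)$, we have

\[ \begin{aligned} \nabla_a^2 G(a,b,\mathbf{p},\mu,D) &= N, \\ \nabla_b^2 G(a,b,\mathbf{p},\mu,D) &= \|\mathbf{p}\|^2, \\ \nabla_p^2 G(a,b,\mathbf{p},\mu,D) &= b^2 I_N, \\ \nabla_\mu^2 G(a,b,\mathbf{p},\mu,D) &= \lambda \sum_{n=1}^{N} p_n (D+\varepsilon I_Q), \\ \nabla_{a,b}^2 G(a,b,\mathbf{p},\mu,D) &= 1_N^\top \mathbf{p}, \\ \nabla_{p,a}^2 G(a,b,\mathbf{p},\mu,D) &= b\,1_N, \\ \nabla_{\mu,a}^2 G(a,b,\mathbf{p},\mu,D) &= 0_Q, \\ \nabla_{d,a}^2 G(a,b,\mathbf{p},\mu,D) &= 0_{Q^2}, \\ \nabla_{p,b}^2 G(a,b,\mathbf{p},\mu,D) &= 2b\mathbf{p} - \mathbf{y} + a 1_N, \\ \nabla_{\mu,b}^2 G(a,b,\mathbf{p},\mu,D) &= 0_Q, \\ \nabla_{d,b}^2 G(a,b,\mathbf{p},\mu,D) &= 0_{Q^2}, \\ \nabla_{\mu,p}^2 G(a,b,\mathbf{p},\mu,D) &= \lambda (D+\varepsilon I_Q) \sum_{n=1}^{N} (\mu - x_n)\, e_{n,N}^\top, \\ \nabla_{p,d}^2 G(a,b,\mathbf{p},\mu,D) &= \begin{cases} \dfrac{\lambda}{2}\displaystyle\sum_{n=1}^{N} \Big( (x_n-\mu)^\top \otimes e_{n,N}(x_n-\mu)^\top - e_{n,N} \operatorname{vec}\big((D+\varepsilon I_Q)^{-1}\big)^\top \Big) & \text{if } D \in \mathcal{S}_Q^{+}, \\[2ex] \dfrac{\lambda}{2}\displaystyle\sum_{n=1}^{N} \Big( (x_n-\mu)^\top \otimes e_{n,N}(x_n-\mu)^\top - \varepsilon^{-1} e_{n,N}\big(1_{Q^2} - \varepsilon^{-1} d\big)^\top \Big) & \text{otherwise,} \end{cases} \\ \nabla_{\mu,d}^2 G(a,b,\mathbf{p},\mu,D) &= \lambda \sum_{n=1}^{N} p_n (\mu - x_n)^\top \otimes I_Q, \\ \nabla_d^2 G(a,b,\mathbf{p},\mu,D) &= \begin{cases} \dfrac{\lambda}{2}(1_N^\top \mathbf{p})\,(D+\varepsilon I_Q)^{-1} \otimes (D+\varepsilon I_Q)^{-1} & \text{if } D \in \mathcal{S}_Q^{+}, \\[2ex] \dfrac{\lambda}{2}(1_N^\top \mathbf{p})\,\varepsilon^{-2} I_{Q^2} & \text{otherwise.} \end{cases} \end{aligned} \tag{4.8} \]

Thanks to the definition of $\varphi$, the Hessian of $G$ is thus defined and continuous on $\mathbb{R}\times\mathbb{R}\times\mathbb{R}^N\times\mathbb{R}^Q\times\mathcal{S}_Q$. Hence the result. $\square$

We are now ready to prove the convergence of FIGARO.

Theorem 4.1 Let $(t^{(i)})_{i\in\mathbb{N}} = (a^{(i)},b^{(i)},\mathbf{p}^{(i)},\mu^{(i)},D^{(i)})_{i\in\mathbb{N}}$ be a sequence generated by Algorithm 1. If $(D^{(i)})_{i\in\mathbb{N}}$ is upper bounded, then $(t^{(i)})_{i\in\mathbb{N}}$ converges to $\hat{t} = (\hat{a},\hat{b},\hat{\mathbf{p}},\hat{\mu},\hat{D})$ satisfying the following equilibrium:

\[ (\forall a \in \mathbb{R})(\forall b \in \mathbb{R})(\forall \mathbf{p} \in \mathbb{R}^N)(\forall \mu \in \mathbb{R}^Q)(\forall D \in \mathcal{S}_Q) \quad \begin{aligned} F(a,\hat{b},\hat{\mathbf{p}},\hat{\mu},\hat{D}) &\ge F(\hat{a},\hat{b},\hat{\mathbf{p}},\hat{\mu},\hat{D}), \\ F(\hat{a},b,\hat{\mathbf{p}},\hat{\mu},\hat{D}) &\ge F(\hat{a},\hat{b},\hat{\mathbf{p}},\hat{\mu},\hat{D}), \\ F(\hat{a},\hat{b},\mathbf{p},\hat{\mu},\hat{D}) &\ge F(\hat{a},\hat{b},\hat{\mathbf{p}},\hat{\mu},\hat{D}), \\ F(\hat{a},\hat{b},\hat{\mathbf{p}},\mu,\hat{D}) &\ge F(\hat{a},\hat{b},\hat{\mathbf{p}},\hat{\mu},\hat{D}), \\ F(\hat{a},\hat{b},\hat{\mathbf{p}},\hat{\mu},D) &\ge F(\hat{a},\hat{b},\hat{\mathbf{p}},\hat{\mu},\hat{D}). \end{aligned} \tag{4.9} \]

Moreover, the sequence $(t^{(i)})_{i\in\mathbb{N}}$ has a finite length.

Proof In (4.1), it appears that, if $\mathbf{p} \notin [0,+\infty)^N \cap \mathcal{C}$ or $D \notin \mathcal{S}_Q^{+}$, then $F(a,b,\mathbf{p},\mu,D) = +\infty$, whereas, if $\mathbf{p} \in [0,+\infty)^N \cap \mathcal{C}$ and $D \in \mathcal{S}_Q^{+}$,

\[ G(a,b,\mathbf{p},\mu,D) \ge \lambda \sum_{n=1}^{N} \frac{p_n}{2}\Big( (x_n-\mu)^\top (D+\varepsilon I_Q)(x_n-\mu) + \varphi(D) \Big) \ge \frac{\lambda\Delta^{-1}}{2}\Big( \varepsilon \inf_{n\in\{1,\ldots,N\}} \|x_n - \mu\|^2 - Q\log\varepsilon \Big), \tag{4.10} \]

\[ f_1(a) \ge 0, \qquad f_2(b) \ge 0, \qquad f_4(D) = 0, \tag{4.11} \]

\[ f_3(\mathbf{p}) \ge \lambda \sum_{n=1}^{N} \mathrm{ent}(p_n) \ge -N e^{-1}\lambda. \tag{4.12} \]

Hence

\[ F(a,b,\mathbf{p},\mu,D) \ge \frac{\lambda\Delta^{-1}}{2}\Big( \varepsilon \inf_{n\in\{1,\ldots,N\}} \|x_n - \mu\|^2 - Q\log\varepsilon \Big) - N e^{-1}\lambda. \tag{4.13} \]

This shows that $F$ is bounded from below. Moreover, since FIGARO alternates proximal steps, $(F(t^{(i)}))_{i\in\mathbb{N}}$ is a decaying convergent sequence. It then follows from (4.13) that $(\mu^{(i)})_{i\in\mathbb{N}}$ is bounded (otherwise the function value sequence would be divergent). Since $(a^{(i)})_{i\in\mathbb{N}}$, $(b^{(i)})_{i\in\mathbb{N}}$ and $(D^{(i)})_{i\in\mathbb{N}}$ are bounded sequences, $(t^{(i)})_{i\in\mathbb{N}}$ is bounded. Moreover, according to Lemma 3, $G$ is $C^2$ on $\mathbb{R}\times\mathbb{R}\times\mathbb{R}^N\times\mathbb{R}^Q\times\mathcal{S}_Q$, which implies that $G$ is $C^1$ with locally Lipschitz gradient on $\mathbb{R}\times\mathbb{R}\times\mathbb{R}^N\times\mathbb{R}^Q\times\mathcal{S}_Q$. Consequently, all the conditions in [4, Theorem 6.2] are met to guarantee that $(t^{(i)})_{i\in\mathbb{N}}$ is a finite length sequence converging to a critical point of $F$. We then deduce (4.9) from the fact that $F$ is convex with respect to each of its arguments. $\square$

Remark 2 Note that the assumption on the boundedness of $(D^{(i)})_{i\in\mathbb{N}}$ becomes unnecessary if an upper bound on $D$ is introduced in the formulation of the optimization problem. This however was not observed to influence the practical behaviour of the algorithm.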
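To complement the analysis above, the closed-form steps of Algorithm 1 given in (3.1), (3.2) and (3.27) can be sketched in a few lines. This is a minimal sketch under stated assumptions rather than a reference implementation: the function names, the plain-list data layout and the numerical values are hypothetical, and the eigendecomposition $D - S = V\operatorname{Diag}(\omega)V^\top$ required by (3.23) is assumed to be computed elsewhere, so that only the scalar update applied to each eigenvalue is shown.

```python
import math

def project(x, lo, hi):
    """Projection of a scalar onto the closed interval [lo, hi]."""
    return min(max(x, lo), hi)

def prox_a(a, b, p, y, gamma_a, a_min, a_max):
    """Background update (3.1), with A = [a_min, a_max]."""
    s = sum(yn - b * pn for yn, pn in zip(y, p))       # 1_N^T (y - b p)
    return project((a + gamma_a * s) / (1.0 + gamma_a * len(y)), a_min, a_max)

def prox_b(a, b, p, y, gamma_b, b_min, b_max):
    """Scaling update (3.2), with B = [b_min, b_max]."""
    s = sum((yn - a) * pn for yn, pn in zip(y, p))     # (y - a 1_N)^T p
    sq_norm = sum(pn * pn for pn in p)                 # ||p||^2
    return project((b + gamma_b * s) / (1.0 + gamma_b * sq_norm), b_min, b_max)

def prox_eigenvalue(omega, eps, m):
    """Scalar update (3.27) applied to one eigenvalue omega of D - S."""
    return 0.5 * max(omega - eps + math.sqrt((omega + eps) ** 2 + 4.0 * m), 0.0)

# Illustrative values (assumptions chosen for the example).
y = [1.0, 2.0, 3.0]
p = [0.5, 1.0, 1.5]
a_new = prox_a(0.0, 1.0, p, y, 1.0, 0.0, 1e5)
b_new = prox_b(a_new, 1.0, p, y, 1.0, 0.0, 1e5)
sigma_new = prox_eigenvalue(2.0, 0.1, 0.3)
```

Following the order of Algorithm 1, the updated value `a_new` is used when computing `b_new`. When the square-root expression in `prox_eigenvalue` is positive, the result satisfies the stationarity condition $(\sigma - \omega)(\sigma + \varepsilon) = m$ of the scalar problem in (3.27); otherwise the update is clipped to zero.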

5 Experiments on Synthetic Data

To validate the performance of FIGARO (Algorithm 1), we generate 3D synthetic data $y = (y(x_n))_{1 \le n \le N}$, where $(x_n)_{1 \le n \le N}$ are coordinates in $\mathbb{R}^3$ regularly spaced on a grid of size $N = 15 \times 15 \times 50$ with voxel dimension $0.05 \times 0.05 \times 0.1$ µm$^3$. For every $n \in \{1,\ldots,N\}$, $y_n = a + b\,\tilde{p}(x_n) + v_n$. In order to illustrate the robustness of our formulation, we define $\tilde{p}$ from the multivariate generalized Gaussian probability density function:

$$(\forall n \in \{1,\ldots,N\}) \quad \tilde{p}(x_n) = \frac{\rho\,\Gamma\!\left(\frac{3}{2}\right)\sqrt{|C|}}{\pi^{\frac{3}{2}}\,\Gamma\!\left(\frac{3}{2\rho}\right) 2^{\frac{3}{2\rho}}\,\varsigma^{\frac{3}{2}}} \exp\!\left(-\frac{1}{2\varsigma^{\rho}} \big((x_n - \mu)^\top C (x_n - \mu)\big)^{\rho}\right),$$

with scale and shape parameters $(\varsigma, \rho) \in (0,+\infty)^2$. Various values will be tested for $\rho$, and for each of them, the scale parameter $\varsigma$ is adjusted such that most of the probability mass lies in the observation grid. When $\rho = 1$, we recover the standard multivariate Gaussian distribution. We set $a = b = 1$, $\mu = [0.3, 0.4, 2]^\top$, and

$$C = \left(R\,\mathrm{Diag}([0.1, 0.05, 0.5]^\top)\,R^\top\right)^{-1}$$

with $R \in \mathbb{R}^{3 \times 3}$ the rotation matrix associated with angles equal to $(0.2, 10^{-4}, 0.1)$ radians. Note that these values for the distribution parameters $(b, C, \mu)$ have been chosen in our tests in order to correspond to typical values encountered in our target application to PSF estimation in microscopy. Finally, $v = (v_n)_{1 \le n \le N}$ is the realization of a zero-mean Gaussian noise, with standard deviation $\sigma$ chosen so as to obtain a given input signal-to-noise ratio (SNR).

The regularization parameter $\lambda > 0$ in FIGARO is set automatically thanks to a golden bisection search, so as to satisfy the $\chi^2$ criterion $\|y - \hat{a} - \hat{b}\hat{p}\| = \sigma\sqrt{N}$ [22]. We set $a_{\min} = b_{\min} = 0$, $a_{\max} = b_{\max} = 10^5$, $\varepsilon = 10^{-8}$. The initialization of the algorithm matters particularly, as the cost function is nonconvex. Here, we observed that a good initialization strategy is to take $a^{(0)} = \min_{n \in \{1,\ldots,N\}} y_n$, $b^{(0)} = 1$, $p^{(0)} = y$, $\mu^{(0)}$ as the position of the maximum intensity in $y$, and $C^{(0)}$ a diagonal matrix with entries equal to the voxel size in each direction. The algorithm iterations are stopped as soon as the relative residual between two consecutive iterates on the fitting model $(a + b\,g(x_n, \mu, D + \varepsilon I))_{1 \le n \le N}$ is below $10^{-5}$.

We provide in Fig. 1 the performance of our approach, in terms of the Percent Difference (PRD) between the estimated vector $(\hat{a} + \hat{b}\hat{p}(x_n))_{1 \le n \le N}$ and the true vector $(a + b\,\tilde{p}(x_n))_{1 \le n \le N}$, averaged on 50 noise realizations. The range of values for the standard deviations (std) is indicated in the figure caption. We also provide the averaged PRD and associated std range obtained when solving the problem with the nonlinear least squares approach based on the Levenberg-Marquardt (LM) algorithm. We use the lsqcurvefit function available in Matlab with the same initialization as FIGARO$^2$. It is important to emphasize that, even in the case when $\rho \neq 1$, we still assume a Gaussian model in both fitting approaches in order to assess their robustness to an imperfect model.

$^2$ Our implementation relies on the extension to the 3D case of the 2D Gaussian fitting software publicly available at: https://fr.mathworks.com/matlabcentral/fileexchange/41938-fit-2d-gaussian-with-optimization-toolbox.

The plots show that FIGARO outperforms LM in all scenarios in terms of averaged PRD. FIGARO is, in addition, very stable under model mismatch (i.e., $\rho \neq 1$), while LM performance degrades strongly as soon as the data are not generated by using the Gaussian model. This clearly highlights the advantage of our formulation, relying on the extra variable $p$ whose shape is controlled by the KL divergence penalty term. Finally, it is noticeable that FIGARO is much more stable to noise fluctuations, as confirmed by the low values of std on the PRD. In contrast, the PRD values for LM are highly dispersed, which questions its reliability for the systematic treatment of real datasets.

[Figure 1: two panels, SNR = 10 and SNR = 15; curves FIGARO and LM; vertical axis PRD; horizontal axis ticks from 0.5 to 2.]
Fig. 1 Quality of 3D fitting results in terms of PRD, using FIGARO and LM strategies, for different shape parameters $\rho$ and SNR values (in dB). Averaged values over 50 noise realizations. For FIGARO, the std varies between 0.05 and 1.72, while for LM, it lies between 7.91 and 20.1.

6 Application of FIGARO to Two-photon Microscopy

The objective of this part is to illustrate experimentally the good performance of our fitting strategy in the context of computational imaging. Multiphoton microscopy (MPM) is a popular method for biomedical imaging at the micron scale, able to generate 3D images in vivo and in depth, starting from a superposition of 2D image stacks. However, the instrumental PSF in MPM has a particularly negative impact on the resulting images, especially when a sub-micrometer resolution is sought (less than about 0.5 µm) or when the sample emits a low level multiphoton signal. These situations represent most of the cases encountered in MPM, where the PSF is responsible for the resolution and contrast deteriorations, with an increase of the image blur and noise. We propose to apply our multivariate Gaussian fitting strategy FIGARO to experimental MPM 3D images of fluorescent microbeads, with the aim to better analyze the instrumental PSF of this modality and to get high quality restoration results.

This section is organized as follows. First, the experimental and algorithmic setup is described in Subsection 6.1. Numerical results obtained with FIGARO are presented in Subsection 6.2, and Subsection 6.3 shows a comparison with the state-of-the-art MetroloJ plugin based on 1D Gaussian fitting on marginalized data, which is widely used in many platforms as a routine tool for analysis of microscope resolution power. Finally, Subsection 6.4 illustrates restoration results obtained by using our estimated PSF model.

6.1 Presentation of the Experimental Setup

The experimental dataset has been recorded from a commercial multiphoton microscope (Olympus, BX61WI) employed in a routine protocol for two-photon fluorescence imaging. A standard femtosecond titanium-sapphire source (Chameleon Ultra II, Coherent Inc., 800 nm, 150 fs, 10 nm, 82 MHz, 4 W) is coupled to the working station, which ends with a 25× water immersion microscope objective (Olympus, XLPLN 25× WMP, 1.05 numerical aperture). In order to characterize experimentally the optical performance of the microscope and especially its response function, images of fluorescent spherical latex microbeads, having a known diameter smaller than the resolution spot, are generated. The retained microbeads have been provided by Molecular Probes, and have a diameter of 0.2 µm. Such a small diameter of the beads allows us to consider each observed one as the (space-variant) instrument PSF at the bead center coordinates. Microbeads are diluted into liquid gelatin and, after a short period of refrigeration, the gelatin is solidified. The imaged sample thus consists of microbeads homogeneously distributed and immobilized in a bulk and solid volume. Their fluorescence emission at 515 nm is detected with a photomultiplier tube coupled with an optical filter between 495 and 540 nm. A dichroic mirror at 690 nm splits the excitation beam from the laser source and the back-fluorescence from the volume of microbeads, which is the exclusive one directed to the detection module.

2D image slices of 1600 × 1600 square pixels are generated. 230 slices with a pixel size of 0.053 µm are acquired in depth and spaced 0.1 µm apart; the superposition of the 230 slices consequently results in a 3D image having the following dimensions in XYZ: 85 × 85 × 23 µm$^3$. From this 3D image, forty volumes of interest (VOIs) are selected, each of them corresponding to the noisy and blurry observation of a single bead. For each selected VOI, the FIGARO algorithm is run, using the same settings as those provided in Section 5, giving rise to a set of estimated parameters $(\hat{a}, \hat{b}, \hat{p}, \hat{\mu}, \hat{C})$ (with $\hat{C} = \hat{D} + \varepsilon I_Q$) directly related to the position, size and orientation of the PSF.

6.2 3D Estimation Results

Figure 2 shows an illustration of the 3D fitting results for four VOIs. Dots represent the raw data acquired experimentally while red spheres with their axes represent the reconstructed 3D image of each microbead inside its VOI, resulting from our multivariate Gaussian fitting strategy. Here, the contour plots delimit the full-width at half maximum (FWHM) region, i.e., where $x_n$ is such that $\hat{a} + \hat{b}\,g(x_n, \hat{\mu}, \hat{C}) = 0.5 \times \max(\hat{a} + \hat{b}\,g(x_n, \hat{\mu}, \hat{C}))_{1 \le n \le N}$.

[Figure 2: four 3D panels; axes X (µm), Y (µm), Z (µm).]
Fig. 2 Example of 3D fitting results using FIGARO on two-photon microscopy data.

In biomedical MPM, the carrying medium often has scattering and absorbing properties that are not well known or well characterized. The more the imaged medium scatters or absorbs the light (laser excitation or fluorescence emission), the more the image will be deteriorated. This phenomenon often increases with the imaging depth. FIGARO fitting results allow us to quantify this PSF variation along the depth of the sample. To this aim, we compute the FWHM along the 3 main axes of the Gaussian shapes for each VOI, defined as $(2\sqrt{2 \log 2\, s_i})_{1 \le i \le 3}$ where $(s_i)_{1 \le i \le 3}$ are the eigenvalues of $\hat{C}^{-1}$.

An analysis of these results for the whole set of VOIs shows that, for this dataset and this range of depths, the planar width, related to the FWHM associated to the second and third eigenvalues of $\hat{C}^{-1}$, does not vary much with respect to the beads location. Here, the averaged FWHM of the estimated Gaussian shapes is (0.21, 0.27) µm, which appears to be consistent with the theoretical optical planar resolution of 0.2 µm for this emission wavelength and numerical aperture. The axial PSF width values, related to the maximum eigenvalue of $\hat{C}^{-1}$, are displayed in Figure 3 as a function of the depth of bead centers. The origin of the abscissa axis corresponds to the surface of the sample; it is not represented here as the beads employed for these measurements are only present at depths between 3 µm and 20 µm under the surface of the sample. One can observe that the axial PSF width slightly increases when the depth of the bead center increases, as expected from optical theory [26]. The averaged axial resolution is 1.49 µm, which fits well the theoretical resolution limit of 1.5 µm reported in the literature [36]. Consequently, FIGARO appears as a solution very well adapted for estimating the 3D variability of the PSF of a system.

[Figure 3: vertical axis FWHM along 1st axis (µm), from 0.5 to 2.5; horizontal axis Z (µm), from 5 to 20.]
Fig. 3 Evolution of the estimated FWHM along the axial axis of the fitted 3D Gaussian shapes, with respect to the bead center depth.

In addition to the relevant and reliable measurement of PSF widths, our computational strategy also gives access to the orientation of each PSF inside its corresponding VOI. Of particular interest is the computation of the Euler angles $(\Phi_1, \Phi_2)$ characterizing the slope of the main direction of the PSF, i.e., the eigenvector of $C$ associated with its largest eigenvalue. We show a 3D representation of the PSF main axes with respect to their center positions in Figure 4. Due to the presence of optical aberrations, the PSF orientations measured with FIGARO change according to the beads location. In particular, the tilt angle quantifying the angle between the Z axis and the main PSF direction (i.e., $\Phi_2$) varies for this dataset between 0.6° and 7.7°.

[Figure 4: 3D plot; axes x (µm), y (µm), z (µm).]
Fig. 4 3D representation of the estimated PSF main axis.

6.3 Comparison with A Standard Procedure

Let us now present the comparison of our results with those obtained from the MetroloJ plugin of Fiji$^3$. MetroloJ has several advantages in microscopy: it is a free plugin of an open source software, it gives a precise idea of the PSF of the microscope, and it is now a routine tool for tracking microscope performance. Unfortunately, like other available Fiji plugins for PSF analysis in fluorescence microscopy (e.g., QuickPALM [27] and rapidSTORM [39]), it only performs 1D shape fitting, and thus can only treat marginalized versions of the datasets. Thus, one may expect that such dimension reduction comes at the price of a loss in modeling accuracy and thus restoration quality.

$^3$ Available at: http://imagejdocu.tudor.lu/doku.php?id=plugin:analysis:metroloj:start

For the sake of our comparisons, we have selected four samples from the VOI set. The experimental results are gathered in Table 1. For each VOI, we provide the estimation of the center coordinates of the fitted Gaussian shapes, the FWHM, and the orientation (Euler angles) resulting from FIGARO and MetroloJ approaches. Since the latter is based on 1D Gaussian fitting on the 3 marginals, only center position and FWHM along the XYZ axes of the image are available as outputs. In contrast, for FIGARO, the FWHM is estimated along the actual bead axes, accounting for its inclination angles.

Table 1 Example of fitting results on 4 VOIs for our approach, and the MetroloJ plugin from Fiji.

Method     Quantity      VOI n°1                  VOI n°2                  VOI n°3                  VOI n°4
MetroloJ   Center (µm)   (62.77, 18.59, 5.46)     (41.62, 65.69, 5.50)     (66.01, 0.35, 13.82)     (10.24, 66.96, 10.46)
MetroloJ   FWHM (µm)     (0.32, 0.03, 0.05)       (0.29, 0.03, 0.001)      (0.028, 0.19, 0.1)       (0.05, 0.04, 0.57)
FIGARO     Center (µm)   (62.78, 19.19, 7.57)     (41.71, 66.27, 6.10)     (66.22, 1.03, 14.61)     (10.29, 67.59, 11.72)
FIGARO     FWHM (µm)     (0.192, 0.247, 1.275)    (0.201, 0.307, 1.282)    (0.198, 0.252, 1.539)    (0.205, 0.259, 1.601)
FIGARO     Angles (°)    (73.1, 2.38)             (67.3, 5.63)             (87.2, 1.54)             (105.6, 2.24)

As already observed in the previous section, the PSF orientations measured with FIGARO change according to the beads location. The 1D-based analysis of MetroloJ does not have access to such a precise estimation of the tilt angle, which is nevertheless of main importance for an efficient computational processing of the microscope images. The estimated center positions are quite similar for both methods, mainly because of the small size of the VOIs. But the results from Table 1 highlight substantial differences in the FWHM estimations of the PSF between the two approaches. With the MetroloJ method, several estimations of FWHM are not significant since the computed sizes are far below the true bead dimension. The high variability of the FWHM estimated by MetroloJ probably results from (i) ignoring the 3D inclination of the PSF shape, and (ii) a high sensitivity to noise and model mismatch, both of which make a correct estimation of the PSF width impossible. This emphasizes the importance of robustly and directly dealing with 3D models, for which FIGARO is able to give reliable and relevant results.
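The geometric quantities compared in this section can be computed directly from a fitted precision matrix. The following Python sketch is illustrative only and is not the authors' implementation: the function name `psf_geometry`, the example widths and the 5° tilt are hypothetical, and it assumes a noise-free Gaussian whose main direction is the eigenvector of the covariance $C^{-1}$ with the largest eigenvalue. It contrasts the FWHM along the Gaussian's own axes with the FWHM that an ideal 1D fit of the X, Y, Z marginals would return, using the fact that the marginal of a Gaussian along a coordinate axis has variance equal to the corresponding diagonal entry of the covariance.

```python
import numpy as np

# Conversion factor between standard deviation and FWHM of a Gaussian.
FWHM_FACTOR = 2.0 * np.sqrt(2.0 * np.log(2.0))


def psf_geometry(C):
    """Return (FWHM along principal axes, FWHM along X/Y/Z marginals, tilt in degrees).

    C is assumed to be a symmetric positive definite 3x3 precision matrix,
    so that C^{-1} plays the role of the covariance of the Gaussian shape.
    """
    cov = np.linalg.inv(C)
    cov = 0.5 * (cov + cov.T)          # remove round-off asymmetry
    s, V = np.linalg.eigh(cov)         # eigenvalues in ascending order
    fwhm_main = FWHM_FACTOR * np.sqrt(s)
    # Ideal marginals along the image axes only see the diagonal of cov.
    fwhm_xyz = FWHM_FACTOR * np.sqrt(np.diag(cov))
    main_axis = V[:, -1]               # direction of largest variance
    # Eigenvector sign is arbitrary, hence the absolute value.
    cos_tilt = np.clip(abs(main_axis[2]), 0.0, 1.0)
    tilt = np.degrees(np.arccos(cos_tilt))
    return fwhm_main, fwhm_xyz, tilt


# Hypothetical PSF: FWHM of (0.2, 0.25, 1.5) µm, tilted by 5 degrees about X.
sigmas = np.array([0.2, 0.25, 1.5]) / FWHM_FACTOR
theta = np.radians(5.0)
c, si = np.cos(theta), np.sin(theta)
R = np.array([[1.0, 0.0, 0.0],
              [0.0, c, -si],
              [0.0, si, c]])
cov_true = R @ np.diag(sigmas**2) @ R.T
C = np.linalg.inv(cov_true)

fwhm_main, fwhm_xyz, tilt = psf_geometry(C)
print("FWHM along principal axes (µm):", fwhm_main)
print("FWHM seen by X/Y/Z marginals (µm):", fwhm_xyz)
print("Tilt with respect to Z (degrees):", tilt)
```

In this toy case the marginal view slightly underestimates the axial width, overestimates one planar width, and carries no information about the tilt; marginal fits on real data are additionally affected by noise and background.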

6.4 Increasing the Resolving Power

We conclude this experimental section by presenting restoration results for a section of the same acquired dataset with size 200 × 200 × 50 voxels, corresponding to a field of view of 10 × 10 × 5 µm$^3$. A constant 3D Gaussian PSF shape is considered in this region, whose width and orientation are deduced from our previously described fitting results by interpolation. The deblurring step is performed using the OPTIMISM toolbox from Fiji$^4$ [16]. Figure 5 illustrates one 2D slice extracted from the input dataset (top) and the corresponding restored image (bottom). In Figure 5(top), approximately seven microbeads appear to be present in this 2D image. For the biggest and brightest one, the diameter is about 1 µm on the raw image, far exceeding the expected 0.2 µm. No conclusion can be drawn from such poor observation quality. When applying OPTIMISM with the FIGARO fitted PSF, this halo of light appears in fact as a bunch of microbeads, as is visible in Figure 5(bottom). These microbeads were too small and too close to each other to be individually identified with the multiphoton microscope alone, and the help of a suitable 3D PSF model, such as the one resulting from FIGARO, is thus mandatory for numerically increasing the resolving power of the MPM device.

$^4$ Available at: http://sites.imagej.net/Dbenielli/

[Figure 5: top panel, slice 32 of input data; bottom panel, restored volume.]
Fig. 5 Deblurring results.

7 Conclusion

In this paper, a new algorithm has been proposed for multivariate Gaussian fitting of observed data corrupted by additive Gaussian noise. Our approach relies on the proposal of an original hybrid cost function combining a Kullback-Leibler divergence regularizer, a least-squares data fidelity term and range constraints on the parameters. An efficient proximal alternating iterative resolution scheme, grounded on solid mathematical foundations, has been proposed for the resolution of the underlying nonconvex minimization problem. The interest of this strategy, named FIGARO, has been illustrated by means of experiments in fitting synthetic data when a model mismatch is present. We have also presented experimental results in the context of computational fluorescence microscopy. The objective was to characterize the instrumental space-varying 3D PSF of a two-photon fluorescence microscope from raw observations of microbeads. Our numerical tests have shown the efficiency of our method for PSF model determination. Future work will address the cases of more general multivariate models and noise statistics.

Acknowledgements

This work was supported by the CNRS under grant MI-AAP Interne2018-SupRéMA.

References

1. Anthony, S.M., Granick, S.: Image analysis with rapid and accurate two-dimensional Gaussian fitting. Langmuir 25(14), 8152–8160 (2009). DOI 10.1021/la900393v
2. Attouch, H., Bolte, J., Redont, P., Soubeyran, A.: Proximal alternating minimization and projection methods for nonconvex problems: an approach based on the Kurdyka-Łojasiewicz inequality. Math. Oper. Res. 35(2), 438–457 (2010). DOI 10.1287/moor.1100.0449
3. Attouch, H., Bolte, J., Redont, P., Soubeyran, A.: Proximal alternating minimization and projection methods for nonconvex problems: an approach based on the Kurdyka-Łojasiewicz inequality. Mathematics of Operations Research 35(2), 438–457 (2010). DOI 10.1287/moor.1100.0449
4. Attouch, H., Bolte, J., Svaiter, B.F.: Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods. Math. Prog. 137(1), 91–129 (2013). DOI 10.1007/s10107-011-0484-9
5. Basseville, M., Cardoso, J.F.: On entropies, divergences, and mean values. In: Proceedings of 1995 IEEE International Symposium on Information Theory, pp. 330– (1995). DOI 10.1109/ISIT.1995.550317
6. Bauschke, H.H., Combettes, P.L.: Convex Analysis and Monotone Operator Theory in Hilbert Spaces, 2nd edn. Springer International Publishing (2017)
7. Bertsekas, D.P.: Nonlinear Programming, 2nd edn. Athena Scientific, Belmont, MA (1999)
8. Bolte, J., Combettes, P.L., Pesquet, J.C.: Alternating proximal algorithm for blind image recovery. In: Proc. IEEE Int. Conf. Image Process. (ICIP 2010), pp. 1673–1676. Hong-Kong, China (2010)
9. Bolte, J., Daniilidis, A., Lewis, A., Shiota, M.: Clarke subgradients of stratifiable functions. SIAM Journal on Optimization 18(2), 556–572 (2007). DOI 10.1137/060670080
10. Bolte, J., Sabach, S., Teboulle, M.: Proximal alternating linearized minimization for nonconvex and nonsmooth problems. Mathematical Programming 146(1), 459–494 (2014). DOI 10.1007/s10107-013-0701-9
11. Burger, M., Sawatzky, A., Steidl, G.: First Order Algorithms in Variational Image Processing, pp. 345–407. Springer International Publishing, Cham (2016)
12. Caruana, R., Searle, R., Heller, T., Shupack, S.: Fast algorithm for the resolution of spectra. Anal. Chem. 58(6), 1162–1167 (1986)
13. Chan, R.H., Chan, T.F., Shen, L., Shen, Z.: Wavelet deblurring algorithms for spatially varying blur from high-resolution image reconstruction. Linear Algebra and its Applications 366, 139–155 (2003). Special issue on Structured Matrices: Analysis, Algorithms and Applications
14. Chaux, C., Pesquet, J., Pustelnik, N.: Nested iterative algorithms for convex constrained image recovery problems. SIAM Journal on Imaging Sciences 2(2), 730–762 (2009). DOI 10.1137/080727749. URL https://doi.org/10.1137/080727749
15. Chen, Y.C., Furenlid, L.R., Wilson, D.W., Barrett, H.H.: Calibration of scintillation cameras and pinhole SPECT imaging systems, pp. 195–202. 12. Springer (2005)
16. Chouzenoux, E., Lamassé, L., Chaux, C., Jaouen, A., Vanzetta, I., Debarbieux, F.: Approche variationnelle pour la déconvolution rapide de données 3D en microscopie biphotonique. In: Actes du 25e colloque GRETSI (2015)
17. Chouzenoux, E., Pesquet, J.C., Repetti, A.: A block coordinate variable metric forward–backward algorithm. Journal of Global Optimization 66(3), 457–485 (2016)
18. Combettes, P.L., Pesquet, J.C.: Proximal splitting methods in signal processing. In: H.H. Bauschke, R.S. Burachik, P.L. Combettes, V. Elser, D.R. Luke, H. Wolkowicz (eds.) Fixed-Point Algorithms for Inverse Problems in Science and Engineering, pp. 185–212. Springer New York (2011)
19. Corless, R.M., Gonnet, G.H., Hare, D.E.G., Jeffrey, D.J., Knuth, D.E.: On the Lambert W function. Advances in Computational Mathematics 5(1), 329–359 (1996)
20. van den Dries, L., Miller, C.: Geometric categories and o-minimal structures. Duke Math. J. 84(2), 497–540 (1996)
21. Friesen, W.I., Michaelian, K.H.: Deconvolution and curve-fitting in the analysis of complex spectra: The CH stretching region in infrared spectra of coal. Appl. Spectrosc. 45(1), 50–56 (1991)
22. Galatsanos, N.P., Katsaggelos, A.K.: Methods for choosing the regularization parameter and estimating the noise variance in image restoration and their relation. IEEE Trans. Image Process. 1(3), 322–336 (1992)
23. Guo, H.: A simple algorithm for fitting a Gaussian function [DSP tips and tricks]. IEEE Signal Proc. Mag. 28(5), 134–137 (2011). DOI 10.1109/MSP.2011.941846
24. Hagen, N., Dereniak, E.L.: Gaussian profile estimation in two dimensions. Appl. Opt. 47(36), 6842–6851 (2008). DOI 10.1364/AO.47.006842
25. Hagen, N., Kupinski, M., Dereniak, E.L.: Gaussian profile estimation in one dimension. Appl. Opt. 46(22), 5374–5383 (2007). DOI 10.1364/AO.46.005374
26. Helmchen, F., Denk, W.: Deep tissue two-photon microscopy. Nat. Methods 2(12) (2005)
27. Henriques, R., Lelek, M., Fornasiero, E.F., Valtorta, F., Zimmer, C., Mhlanga, M.M.: QuickPALM: 3D real-time photoactivation nanoscopy image processing in ImageJ. Nat. Methods 7(5), 339–340 (2010)
28. Kazovsky, L.G.: Beam position estimation by means of detector arrays. Opt. Quantum Electron. 13, 201–208 (1981)
29. Kincaid, D., Cheney, E.: Numerical Analysis: Mathematics of Scientific Computing, 3rd edn. Pure and applied undergraduate texts. American Mathematical Society (2002)
30. Kirshner, H., Aguet, F., Sage, D., Unser, M.: 3-D PSF fitting for fluorescence microscopy: implementation and localization application. J. Microsc. 249(1), 13–25 (2013). DOI 10.1111/j.1365-2818.2012.03675.x
31. Landman, D.A., Roussel-Dupré, R., Tanigawa, G.: On the statistical uncertainties associated with line profile fitting. Astrophys. J. 261, 732–735 (1982)
32. Marim, M., Zhang, B., Olivo-Marin, J.C., Zimmer, C.: Improving single particle localization with an empirically calibrated Gaussian kernel. In: 5th IEEE Int. Symp. Biomed. Imag.: From Nano to Macro (ISBI 2008), pp. 1003–1006. Paris, France (2008). DOI 10.1109/ISBI.2008.4541168
33. Rockafellar, R., Wets, R.J.B.: Variational Analysis. Springer Verlag (1998). DOI 10.1007/978-3-642-02431-3
34. Roonizi, E.K.: A new algorithm for fitting a Gaussian function riding on the polynomial background. IEEE Signal Process. Lett. 20(11), 1062–1065 (2013). DOI 10.1109/LSP.2013.2280577
35. Sarder, P., Nehorai, A.: Estimating locations of quantum-dot-encoded microparticles from ultra-high density 3-d microarrays. IEEE Trans. NanoBioscience 7(4), 284–297 (2008)
36. Tal, E., Oron, D., Silberberg, Y.: Improved depth resolution in video-rate line-scanning multiphoton microscopy using temporal focusing. Opt. Lett. 30, 1686–1688 (2005)
37. Thompson, R.E., Larson, D.R., Webb, W.W.: Precise nanometer localization analysis for individual fluorescent probes. Biophys. J. 82, 2775–2783 (2002)
38. Tseng, P.: Convergence of a block coordinate descent method for nondifferentiable minimization. Journal of Optimization Theory and Applications 109(3), 475–494 (2001)
39. Wolter, S., Löschberger, A., Holm, T., Aufmkolk, S., Dabauvalle, M.C., van de Linde, S., Sauer, M.: rapidSTORM: accurate, fast open-source software for localization microscopy. Nat. Methods 9, 1040–1041 (2012)
40. Xu, Y., Yin, W.: A block coordinate descent method for regularized multiconvex optimization with applications to nonnegative tensor factorization and completion. SIAM Journal on Imaging Sciences 6(3), 1758–1789 (2013). DOI 10.1137/120887795
41. Zhang, B., Zerubia, J., Olivo-Marin, J.C.: Gaussian approximations of fluorescence microscope point-spread function models. Appl. Opt. 46(10), 1819–1829 (2007). DOI 10.1364/AO.46.001819
42. Zhu, X., Zhang, D.: Efficient parallel Levenberg-Marquardt model fitting towards real-time automated parametric imaging microscopy. PLOS ONE 8(10), 1–9 (2013). DOI 10.1371/journal.pone.0076665