Coarse-to-Fine Hamiltonian Dynamics of Hierarchical Flows in Computational Anatomy

Michael I. Miller
Johns Hopkins University
Baltimore, Maryland, USA
[email protected]

Daniel J. Tward
Johns Hopkins University
3400 N. Charles St., Baltimore, MD 21218
[email protected]

Alain Trouvé
Ecole Normale Superieure
45 Rue d'Ulm, 75005 Paris, France
[email protected]

Abstract

We present the Hamiltonian control equations for hierarchical diffeomorphic flows of particles. We define the controls to be a series of multi-scale vector fields, each with its own reproducing kernel Hilbert space norm. The hierarchical control is connected across scales through successive refinements that ascend the hierarchy with commensurately higher bandwidth Green's kernels. Interestingly, the geodesic equations do not separate: fine scale motions are determined by all of the particle information simultaneously, from coarse to fine. Additionally, the hierarchical conservation law is derived, defining the geodesics and demonstrating the constancy of the Hamiltonian. We show results on one simulated example and one example from histological images of an Alzheimer's disease brain. We introduce the varifold action to transport the weights of micro-scale particles for mapping to sub-millimeter scale cortical folds.

1. Introduction

This paper focuses on a hierarchical representation of diffeomorphic flows for computational anatomy. Our motivation is to accommodate the multiple scales represented, from cellular scale to meso-scale [5, 14] to millimeter [11] anatomical scales, in the numerous brain mapping projects in which we are involved. Most current methods in computational anatomy focus on deformable brain atlases that are built from imaging data at a single scale (typically millimeter resolution clinical MRI). These atlases are mapped to new datasets using deformations regularized with smooth kernels at a single scale (typically a few millimeters to centimeters). In contrast, our experimental work in Alzheimer's disease maps the micro scales of molecular Tau pathology measured via histology to the sub-millimeter scales of anatomy obtained via moderate and high field MRI [13]. Towards this end we derive a set of Hamiltonian control equations that allow us to directly model information across scales resulting from molecular and anatomical imaging.

There were many successful approaches in the 1980s and 1990s for representing multi-scale data, as well as many methods for the construction of scale-space. Our representation is abstract enough that it accommodates linear and non-linear transformations including wavelets [3, 6], scattering functions [7], and multi-resolution pyramids for signal decomposition [1, 2]. We model the controls in our control system as being successively refined as the representation descends from the coarsest continuum scales of gross anatomy to the discrete, atom-like scales of cellular descriptions.

We have been motivated by other work on multi-scale kernels [4, 9, 10, 12, 15], although our representation differs in that these multiple scale flows are not composed; rather, they exist simultaneously with one another. The fact that the tangent space at the identity is successively refined via a sequence of vector fields corresponding to the higher and higher dimension of the finer scales implies that the flow itself supports multiple information flows simultaneously. These scales are not decoupled.

2. Shapespace as an Orbit of Diffeomorphisms

Our model is built via a series of generalized functions or measures, spatially localized $\delta_{x_i}$, $x_i \in \mathbb{R}^3$, $i \in I$, each with an associated function value in the parameter space, $f_i \in F$. For us the function represents, for example, the amount of protein or Tau tangle at micro-scale in Alzheimer's disease. The mathematical model of Shapespace that we use is that of varifold measures, representing mixtures of particles given by

$$\mu = \sum_{i \in I} w_i \, \delta_{x_i} \otimes \delta_{f_i} . \tag{1}$$

Shapespace is modeled as an infinite dimensional space, generated from one or more exemplars of varifolds $\mu = \sum_i w_i \, \delta_{x_i} \otimes \delta_{f_i}$ acted upon via the group of one-to-one, onto maps $\varphi = (\varphi_1, \varphi_2, \varphi_3) \in \mathrm{Diff}$ (e.g. 3 components for 3 dimensional images), with diffeomorphic action given by

$$\varphi \cdot \mu := \sum_{i \in I} w_i \, |d\varphi(x_i)| \, \delta_{\varphi(x_i)} \otimes \delta_{f_i} . \tag{2}$$

We use varifolds and their action because they are an efficient way to transport a functional feature unchanged.

The diffeomorphisms are generated from vector fields indexed by time, $t \to u_t = (u_{t1}, u_{t2}, u_{t3})$, and integrated to generate the flows

$$\dot{\varphi}_t = u_t \circ \varphi_t , \quad \varphi_0 = \mathrm{id} . \tag{3}$$

They extend the singular molecular varifolds of Shapespace, to the dense macro-scale continuum of cellular or particle descriptions, to global tissue scales (for example).

3. Shapespace as a Hierarchical Dynamical System with Metric

Our model is a hierarchical one in which we define a hierarchy of varifolds $\mu = (\mu^{(\ell)})_{\ell \in L}$ and diffeomorphisms $\varphi = (\varphi^{(\ell)})_{\ell \in L}$, each with its own flow and its own action. The group becomes $G = \prod_{\ell \in L} \mathrm{Diff}_0^k(\mathbb{R}^d)$ with composition of each component and action componentwise on the varifolds, where for any $L$-tuples $\varphi = (\varphi^{(\ell)})_{\ell \in L}$ and $\psi = (\psi^{(\ell)})_{\ell \in L}$

$$\varphi \cdot \psi = (\varphi^{(\ell)} \circ \psi^{(\ell)})_{\ell \in L} , \tag{4a}$$

with the extended action of (2) becoming

$$\varphi \cdot \mu = (\varphi^{(\ell)} \cdot \mu^{(\ell)})_{\ell \in L} . \tag{4b}$$

The entire space of varifolds we call Shapespace, which is an orbit $\mathcal{M} = \{\varphi \cdot \mu : \varphi \in G\}$ generated from the varifold particle models across all scales.

We represent our model as a hierarchical dynamical system in which the vector fields $u$ are the controls on the flows satisfying Eqn. (3). This is depicted in Figure 1. We define the states of our dynamical system to encode the flows of the varifolds at each level. We define the associated state process of our dynamical system $t \mapsto q_t = (q_t^\ell)_{\ell \in L}$ to encode the varifold action of (2), $q_{ti} = (w_{ti}, x_{ti}, f_i)_{i \in I}$ with $w_{ti} = |d\varphi_t(x_i)| \, w_i$ and $x_{ti} = \varphi_t(x_i)$, with state velocity linear in the control $t \mapsto u_t$:

$$\dot{q}_{ti} = \left( w_{ti} \, \mathrm{div}\, u_t(x_{ti}), \; u_t(x_{ti}), \; 0 \right) . \tag{5}$$

Figure 1: Hierarchical dynamical system with states $q^\ell$, controls successively refined $u^\ell = \sum_{i \le \ell} v^i$ and flows $\varphi^\ell$.

The layers in the hierarchy are coupled since the controlling vector fields $u^\ell$ are determined by successive refinements $v^\ell$, for $\ell = 0, 1, \dots$:

$$u^\ell = u^{\ell-1} + v^\ell , \quad u^0 = v^0 , \quad \ell = 1, \dots . \tag{6}$$

We make the Shapespace into a metric space by examining the measures using a measure norm. However, in order to accommodate arbitrary scale into the dense continuum limit, we build the underlying space of shapes into a metric space by measuring within the dense space of diffeomorphic motions on the background space in which the particles are embedded. This way, we always have a metric independent of the scale at which we look at the problem.

To measure the length of the mappings of one varifold to another we use the distance on the diffeomorphism group

$$d(\mu, \mu') = \inf_{\varphi : \varphi \cdot \mu = \mu'} d_{\mathrm{Diff}}(\mathrm{id}, \varphi) , \tag{7a}$$

with the length metric on diffeomorphisms

$$d_{\mathrm{Diff}}(\mathrm{id}, \psi)^2 = \inf_{\substack{u : \dot{\varphi}_t = u_t \circ \varphi_t \\ \varphi_0 = \mathrm{id}, \, \varphi_1 = \psi}} \int_0^1 \sum_{\ell} \| u_t^\ell - u_t^{\ell-1} \|_{V_\ell}^2 \, dt , \tag{7b}$$

where, consistent with (6), the $\ell = 0$ term is $\| u_t^0 \|_{V_0}^2 = \| v_t^0 \|_{V_0}^2$.

4. Hamiltonian Control

For building correspondences from one varifold to another we measure closeness between varifolds $\mu = (\mu^\ell)_{\ell \in L}$, $\mu' = (\mu'^\ell)_{\ell \in L} \in \mathcal{M}$ via the measure norm $\| \mu - \mu' \|_M$. The measure norm is defined by calculating the integrals of the measures against test functions, $h \to \mu(h) = \int h(x) \, \mu(dx)$, and is given by

$$\| \mu \|_M := \sup_{h : \|h\| = 1} |\mu(h)| . \tag{8}$$

At every scale of the hierarchy the norm has two kernels, one for space and one for function, $K_S^\ell$ and $K_F^\ell$ (resp. on $\mathbb{R}^d$ and $F$), induced by the dot product

$$\langle \delta_x \otimes \delta_f , \delta_{x'} \otimes \delta_{f'} \rangle = K_S^\ell(x, x') \, K_F^\ell(f, f') . \tag{9}$$

Take the index union $K^\ell = J^\ell \cup J'^\ell$ of points and features; then the difference of measures becomes

$$\mu^\ell - \mu'^\ell = \sum_{j \in J^\ell} w_j \, \delta_{x_j} \otimes \delta_{f_j} - \sum_{j \in J'^\ell} w'_j \, \delta_{x'_j} \otimes \delta_{f'_j} = \sum_{k \in K^\ell} \alpha_k \, \delta_{\tilde{x}_k} \otimes \delta_{\tilde{f}_k} \tag{10}$$

with norm-square

$$\| \mu^\ell - \mu'^\ell \|_M^2 = \sum_{k, k' \in K^\ell} \alpha_k \alpha_{k'} \, K_S^\ell(\tilde{x}_k, \tilde{x}_{k'}) \, K_F^\ell(\tilde{f}_k, \tilde{f}_{k'}) . \tag{11}$$

The geodesic connection between shapes is computed by defining the controlling vector fields to minimize the norm between the flowed shape and the target shape. The optimal control generating the geodesic connection between two shapes minimizes the running cost with initial condition $q_0 = q_{\mathrm{init}}$ and target endpoint condition, and is given by the following:

Variational Problem

$$\min_{\substack{u : \dot{\varphi}_t = u_t \circ \varphi_t \\ \varphi_0 = \mathrm{id}, \, \mu_t = \varphi_t \cdot \mu}} \int_0^1 \sum_{\ell} \| u_t^\ell - u_t^{\ell-1} \|_{V_\ell}^2 \, dt + \| \mu' - \mu_1 \|^2 . \tag{12}$$

The Pontryagin maximum principle provides necessary conditions for optimal solutions [8]. Hamiltonian control [8] introduces a reparameterization in what are termed momentum variables, which act as Lagrange multipliers on the velocities to control the energy of the system. Define momenta $p_i^w$ and $p_i^x = (p_{i1}^x, p_{i2}^x, p_{i3}^x)$, which act on the velocities $\dot{w}_i = w_i \, \mathrm{div}\, u(x_i)$, $\dot{x}_i = u(x_i)$.

The geodesics for Hamiltonian control for arbitrary kernels $G_\ell$, $\ell = 0, 1, \dots$, of the reproducing kernel Hilbert spaces (RKHS) $V_\ell$ are now derived. Define the vector notation $u \cdot q = (u^\ell \cdot q^\ell)_{\ell \in L}$ and, for $p = (p^\ell)_{\ell \in L}$, $p^\ell = (p_i^{\ell,w}, p_i^{\ell,x}, p_i^{\ell,f})_{i \in I^\ell}$, with

$$u^\ell \cdot q^\ell = \left( \mathrm{div}(u^\ell)(x_i^\ell) \, w_i^\ell , \; u^\ell(x_i^\ell) , \; 0 \right)_{i \in I^\ell} . \tag{13}$$

The Hamiltonian becomes

$$H(q, p, u) = (p \,|\, u \cdot q) - \frac{1}{2} (L u \,|\, u) , \tag{14}$$

with $(L u \,|\, u) = \sum_{\ell \in L} (L^\ell v^\ell \,|\, v^\ell)$ and $(L^\ell v^\ell \,|\, v^\ell) = \| u^\ell - u^{\ell-1} \|_{V_\ell}^2$. The boundary term for matching in terms of the state, $U(q_1) := \| \mu' - \mu(q_1) \|^2$, gives

$$p_{1i}^{\ell,w} = -\frac{1}{2} \frac{\partial}{\partial w_i^\ell} U(q_1) , \tag{15a}$$

$$p_{1i}^{\ell,x} = -\frac{1}{2} \nabla_{x_i^\ell} U(q_1) . \tag{15b}$$

Here, to compute the optimal value of $u$ given $q$ and $p$, it may be convenient to introduce, for $l \ge 0$, the mapping $u \mapsto (\delta_x^A \,|\, u)$ from $C_0^l(\mathbb{R}^d, \mathbb{R}^d)$ (the space of $C^l$ vector fields vanishing at infinity) to $\mathbb{R}$, defined for $A \in L_l(\mathbb{R}^d, \mathbb{R}^d)'$ (the dual of the space of $l$-multilinear mappings on $\mathbb{R}^d$ with values in $\mathbb{R}^d$) and $x \in \mathbb{R}^d$ as

$$(\delta_x^A \,|\, u) = (A \,|\, d^l u(x)) . \tag{16}$$

We get $p_i^{\ell,w} \, \mathrm{div}(u^\ell)(x_i^\ell) \, w_i^\ell = p_i^{\ell,w} \, (\delta_{x_i^\ell}^{I_2} \,|\, u^\ell) \, w_i^\ell$ and $(p_i^{\ell,x} \,|\, u^\ell(x_i^\ell)) = (\delta_{x_i^\ell}^{p_i^{\ell,x}} \,|\, u^\ell)$, so that we deduce

$$\begin{aligned}
v^\ell &= K^\ell \sum_{m \ge \ell} \sum_{i \in I^m} \left( p_i^{m,w} w_i^m \, \delta_{x_i^m}^{I_2} + \delta_{x_i^m}^{p_i^{m,x}} \right) , \\
u^\ell &= \sum_{m} \Big( \sum_{k \le \ell \wedge m} K^k \Big) \sum_{i \in I^m} \left( p_i^{m,w} w_i^m \, \delta_{x_i^m}^{I_2} + \delta_{x_i^m}^{p_i^{m,x}} \right) .
\end{aligned} \tag{17}$$

The geodesic equations for the state and the momentum can then be derived from the Hamiltonian:

$$\dot{q} = \frac{\partial}{\partial p} H(q, p, u) = u \cdot q , \qquad \dot{p} = -\frac{\partial}{\partial q} H(q, p, u) . \tag{18}$$

For the evolution of the momentum, this gives

$$\begin{aligned}
\dot{p}_i^{\ell,w} &= -\left( p_i^{\ell,w} \, \delta_{x_i^\ell}^{I_2} \,\Big|\, u^\ell \right) , \\
\dot{p}_i^{\ell,x} &= -\nabla_{x_i^\ell} \left( p_i^{\ell,w} w_i^\ell \, \delta_{x_i^\ell}^{I_2} + \delta_{x_i^\ell}^{p_i^{\ell,x}} \,\Big|\, u^\ell \right) .
\end{aligned} \tag{19}$$

The core equality needed to get an explicit formula for the Hamiltonian evolution is the following: for $k, l \ge 0$, $A \in L_k(\mathbb{R}^d, \mathbb{R}^d)'$, $B \in L_l(\mathbb{R}^d, \mathbb{R}^d)'$ we have

$$(\delta_x^A \,|\, K \delta_y^B) = (A \otimes B \,|\, \partial_x^k \partial_y^l K(x, y)) \tag{20}$$

since for $\alpha \in (\mathbb{R}^d)'$, $(\alpha \,|\, (K \delta_y^B)(x)) = (\delta_x^\alpha \,|\, K \delta_y^B) = (\delta_y^B \,|\, K \delta_x^\alpha) = (B \otimes \alpha \,|\, \partial_y^l K(y, x))$, so that $(\delta_x^A \,|\, K \delta_y^B) = (B \otimes A \,|\, \partial_x^k \partial_y^l K(y, x)) = (A \otimes B \,|\, \partial_x^k \partial_y^l K(x, y))$. In particular we have (using the Einstein summation convention on repeated indices):

$$\begin{aligned}
\partial_{x_a} (\delta_x^{I_2} \,|\, K \delta_y^{I_2}) &= \partial^3_{x_a, x_b, y_c} K(x, y)^{b,c} , \\
\partial_{x_a} (\delta_x^{I_2} \,|\, K \delta_y^{\beta}) &= \partial_{x_a} \partial_{x_b} K(x, y)^{b,c} \, \beta_c , \\
\partial_{x_a} (\delta_x^{\alpha} \,|\, K \delta_y^{\beta}) &= \partial_{x_a} K(x, y)^{b,c} \, \alpha_b \beta_c , \\
\partial_{x_a} (\delta_x^{\alpha} \,|\, K \delta_y^{I_2}) &= \partial_{x_a} \partial_{y_c} K(x, y)^{b,c} \, \alpha_b ,
\end{aligned} \tag{21}$$

from which we can implement the Hamiltonian flow given the kernel.

5. Conservation laws

We consider $A : \prod_{\ell \in L} V^{(\ell)} \to \prod_{\ell \in L} C_0^{k+2}(\mathbb{R}^d, \mathbb{R}^d)$ a continuous linear mapping. If we denote $|v|_V^2 = \sum_{\ell \in L} |v^{(\ell)}|_{V^{(\ell)}}^2$, our construction leads to the Hamiltonian $H$ on $\prod_{\ell \in L} C_0^k(\mathbb{R}^d, \mathbb{R}^d)^* \times \prod_{\ell \in L} C_{\mathrm{Id}}^k(\mathbb{R}^d, \mathbb{R}^d) \times V$:

$$\begin{aligned}
H(p, \varphi, v) &= (p \,|\, u.\varphi) - \frac{1}{2} (L v \,|\, v) \\
&= \sum_{\ell \in L} (p^{(\ell)} \,|\, u^{(\ell)} \circ \varphi^{(\ell)}) - \frac{1}{2} \sum_{\ell \in L} |v^{(\ell)}|_{V^{(\ell)}}^2 ,
\end{aligned} \tag{22}$$

where $u = A v$ with $u^{(0)} = v^{(0)}$ and $u^{(\ell)} = v^{(\ell)} + u^{(\ell-1)}$ for $\ell \ge 1$. Since $H$ is $C^2$ with bounded derivatives, we have a global solution of the Hamiltonian equations for any initial conditions $(\varphi_0, p_0)$, and since $H$ is right invariant, we get a conservation equation from Noether's theorem: for any $\ell \in L$ and any $w \in C_0^k$,

$$\frac{d}{dt} \left( m_t^{(\ell)} \,\Big|\, (d\varphi_t^{(\ell)} w) \circ (\varphi_t^{(\ell)})^{-1} \right) = 0 , \tag{23a}$$

where $m^{(\ell)}$ is given by the Hamiltonian momentum

$$(m_t^{(\ell)} \,|\, w) = (p_t^{(\ell)} \,|\, w \circ \varphi_t^{(\ell)}) . \tag{23b}$$

For particles, the $m$ are Dirac delta generalized functions, defined in terms of their action against test functions $w$. For a classical density function, $p^\ell = m^\ell \circ \varphi^\ell \, |d\varphi^\ell|$ with

$$\begin{aligned}
m^\ell &= (L^\ell + L^{\ell+1}) u^\ell - L^{\ell+1} u^{\ell+1} - L^\ell u^{\ell-1} \\
&= L^\ell v^\ell - L^{\ell+1} v^{\ell+1} .
\end{aligned} \tag{24}$$

For the more general case,

$$\mathrm{Ad}^*_{\varphi_t^{(\ell)}} (m_t^{(\ell)}) = \text{constant} \tag{25a}$$

or

$$\dot{m}_t^{(\ell)} + \mathrm{ad}^*_{u_t^{(\ell)}} (m_t^{(\ell)}) = 0 , \tag{25b}$$

with $(\mathrm{ad}^*_u(m) \,|\, w) = (m \,|\, Du \, w - Dw \, u)$, $u = A v$ and $v = K A^* m$, and $u^{(\ell)} = \sum_{k \le m, \, k \le \ell} K^{(k)} m_t^{(m)}$.

6. Results

Figure 2 depicts the micro scales of histological measurement that we are obtaining for studying Alzheimer's disease. These data are being mapped to the millimeter anatomical scales at moderate and high field MRI [13]. Each tau tangle (brown structure) is represented as a point particle that carries a location (here a position in 2 dimensions) and a functional signal (here its cross sectional area).

Figure 2: Micron scale histology image of a slice of tissue from the medial temporal lobe of the brain, stained for tau tangles (brown).

Our implementation is based on a two layer hierarchy. At the fine scale, a varifold represents individual particles (tau tangles) with a one dimensional functional signal describing their size, and weights $w$ set to 1. At the coarse scale, the varifold includes particles sampled on a regular grid, with a two dimensional functional signal that describes the local mean and local standard deviation of the fine scale particle size, and weights set to the local density of fine scale particles. Optimal flows to match a template to a target varifold were computed by minimizing our Variational Problem (12) over velocity fields sampled on a regular grid. Minimization was performed by gradient descent, using automatic gradient calculations in PyTorch.

Figure 4: Initial velocity fields at fine and coarse scales for simulated example. Magnitude is shown in grayscale and red arrows indicate direction.

We first show one simulated example, illustrated in Fig. 3. The first row shows individual particles at the finest scale. Marker size is proportional to their weights, and marker color shows their functional signal. This example shows a template and target that are simple squares with "large" signal on the left, and "small" signal on the right. The template particles (left column) are transformed (center column) to match the target particles (right column) under the flow $v^0 + v^1$. In addition to changing position, they also change weight (illustrated by size) as described in (2). The second and third rows show a coarse scale varifold representation where features are local mean and local standard deviation (respectively). From left to right the local mean
changes from red to blue, and the standard deviation is high in the transition zone between them.

Figure 3: Varifold representation of simulated data, where color represents signal and size represents weight. The template (left) is transformed (center) to match the target (right). Particles are shown at fine scale (top), and coarse scale with local mean (middle) and local standard deviation (bottom).

The initial velocity fields corresponding to this optimal flow are shown in Fig. 4. Here the velocity field's magnitude is shown as a grayscale image, and its direction is shown by arrows at several sampled points. The coarse scale controls an average change in position, whereas the fine scale controls placement of individual particles as well as changes in weight through its relatively large divergence.

Second, we show an example using tau tangles that were detected in the histology image of Fig. 2, where the functional signal carried by each particle is its size. We analyze the collateral sulcus, a region which begins to accumulate tau tangles in the earliest stages of Alzheimer's disease. Corresponding varifolds are illustrated in Fig. 5, where the local mean and local standard deviation are computed using a Gaussian neighborhood of 20 pixels. Each tau tangle, visible as a brown particle at a zoomed-in scale in Fig. 2, corresponds to a single marker in the top row of Fig. 5. We note that by analyzing the local mean, a gradient from small tangles in superficial layers to large tangles in deep layers is observed on the lateral bank of the collateral sulcus.

Figure 5: Varifold representation of tau tangle data, where marker color represents tangle size, and marker size represents weight. The template (left) is transformed (center) to match the target (right). Particles are shown at fine scale (top), and coarse scale with local mean (middle) and local standard deviation (bottom).

The initial velocity fields corresponding to the optimal flow for the histology example are shown in Fig. 6. Here the velocity field's magnitude is shown as a grayscale image, and its direction is shown by arrows at several sampled points. Velocity fields are smoothed by Green's kernels of the operator $(\mathrm{id} - a^2 \Delta)^4$ with $\Delta$ the Laplacian and $a = 20$ (for coarse scale) and $a = 5$ (for fine scale) pixels. Here we see that the fine scale encodes flows of much smaller magnitude (by a factor of 10) than the coarse scale, as expected for its interpretation as a refinement. The fine scale includes a large divergence that modifies particle weight in addition to a refinement in position.

Figure 6: Initial velocity fields at fine and coarse scales for histology example. Magnitude is shown in grayscale and red arrows indicate direction.

7. Conclusion

In this work we presented a new approach to modeling anatomical form at multiple spatial scales. We used varifolds to model individual particles such as cells expressing functional signals, as well as to model coarser features such as local means and standard deviations. We modeled shape differences through the action of diffeomorphisms, generated from flows at different scales, with an associated metric. In this setting, we derived an optimal Hamiltonian control which describes geodesics in our Shapespace, and showed one simulated example and one example from digital pathology of shape matching.

Biomedical images, such as those seen in digital pathology, are increasing in size and resolution. Models that can describe cells or particles at fine scales, and continuous data at coarse scales, are becoming necessary. The approach we have presented here extends the techniques of Computational Anatomy into this important setting.

References

[1] A. N. Akansu and R. A. Haddad. Multiresolution signal decomposition: transforms, subbands, and wavelets. Academic Press, 1992.
[2] Peter J. Burt and Edward H. Adelson. The Laplacian pyramid as a compact image code. IEEE Transactions on Communications, 31:532–540, 1983.
[3] Ingrid Daubechies. Ten Lectures on Wavelets. Society for Industrial and Applied Mathematics, USA, 1992.
[4] Barbara Gris, Stanley Durrleman, and Alain Trouvé. A sub-Riemannian modular framework for diffeomorphism-based analysis of shape ensembles. SIAM Journal on Imaging Sciences, 11(1):802–833, 2018.
[5] Brian C. Lee, Daniel J. Tward, Partha P. Mitra, and Michael I. Miller. On variational solutions for whole brain serial-section histology using a Sobolev prior in the computational anatomy random orbit model. PLOS Computational Biology, 14(12):1–20, 2018.
[6] Stéphane Mallat. A wavelet tour of signal processing (2. ed.). Academic Press, 1999.
[7] Stéphane Mallat. Group invariant scattering. Communications on Pure and Applied Mathematics, 65(10):1331–1398, 2012.
[8] Michael I. Miller, Alain Trouvé, and Laurent Younes. Hamiltonian systems and optimal control in computational anatomy: 100 years since D'Arcy Thompson. Annual Review of Biomedical Engineering, 17:447–509, 2015.
[9] Marc Niethammer, Roland Kwitt, and François-Xavier Vialard. Metric learning for image registration. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 8463–8472, 2019.
[10] Laurent Risser, François-Xavier Vialard, Robin Wolz, Maria Murgasova, Darryl D. Holm, and Daniel Rueckert. Simultaneous multi-scale registration using large deformation diffeomorphic metric mapping. IEEE Transactions on Medical Imaging, 30(10):1746–59, 2011.
[11] N. Sacktor, A. Soldan, M. Grega, L. Farrington, Q. Cai, M. C. Wang, R. F. Gottesman, R. S. Turner, M. Albert, and BIOCARD Research Team. The BIOCARD index: A summary measure to predict onset of mild cognitive impairment. Alzheimer Disease and Associated Disorders, 31(2):114–119, Apr-Jun 2017.
[12] Stefan Sommer, Mads Nielsen, François Lauze, and Xavier Pennec. A multi-scale kernel bundle for LDDMM: towards sparse deformation description across space and scales. Information Processing in Medical Imaging: Proceedings of the Conference, 22(17):624–635, 2011.
[13] Daniel Tward, Timothy Brown, Yusuke Kageyama, Jaymin Patel, Zhipeng Hou, Susumu Mori, Marilyn Albert, Juan Troncoso, and Michael Miller. Diffeomorphic registration with intensity transformation and missing data: Application to 3D digital pathology of Alzheimer's disease. Frontiers.
[14] Daniel Tward, Xu Li, Bingxing Huo, Brian Lee, Partha Mitra, and Michael Miller. 3D mapping of serial histology sections with anomalies using a novel robust deformable registration algorithm. In Multimodal Brain Image Analysis and Mathematical Foundations of Computational Anatomy, volume 11846 of LNCS, October 2019.
[15] F.-X. Vialard, L. Risser, D. Rueckert, and C. J. Cotter. 3D image registration via geodesic shooting using an efficient adjoint calculation. International Journal of Computer Vision, 97(2):229–241, April 2012.
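
Supplementary sketch. As an illustration of the varifold action of Eq. (2) and the kernel norm of Eqs. (9)-(11), which supplies the matching term in the Variational Problem (12), the following is a minimal sketch in plain Python. It is not the implementation used in Section 6 (which relied on PyTorch). It assumes Gaussian kernels for both $K_S$ and $K_F$, with hypothetical bandwidths `sigma_x` and `sigma_f`, and it restricts the diffeomorphism to a 2D affine map so that $|d\varphi|$ reduces to $|\det A|$. All function names are illustrative.

```python
import math

def gaussian_kernel(a, b, sigma):
    """Gaussian kernel between two points given as tuples (an assumed kernel choice)."""
    d2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-d2 / (2.0 * sigma ** 2))

def varifold_norm_sq(mu, mu_prime, sigma_x=1.0, sigma_f=1.0):
    """Squared norm of mu - mu_prime, following Eq. (11).

    Each varifold is a list of particles (w, x, f):
    weight, position tuple, feature tuple.
    """
    # Eq. (10): merge both particle sets, giving the target negative weights.
    stacked = list(mu) + [(-w, x, f) for (w, x, f) in mu_prime]
    total = 0.0
    for a_k, x_k, f_k in stacked:
        for a_l, x_l, f_l in stacked:
            total += (a_k * a_l
                      * gaussian_kernel(x_k, x_l, sigma_x)
                      * gaussian_kernel(f_k, f_l, sigma_f))
    return total

def act_affine(mu, A, b):
    """Eq. (2) for the 2D affine map phi(x) = A x + b (A assumed invertible).

    Weights are scaled by |det A|, positions are moved, features are unchanged.
    """
    (a11, a12), (a21, a22) = A
    jac = abs(a11 * a22 - a12 * a21)
    moved = []
    for w, (x1, x2), f in mu:
        y = (a11 * x1 + a12 * x2 + b[0], a21 * x1 + a22 * x2 + b[1])
        moved.append((w * jac, y, f))
    return moved

if __name__ == "__main__":
    # Two particles with unit weight and a scalar feature (e.g. tangle size).
    template = [(1.0, (0.0, 0.0), (1.0,)), (1.0, (1.0, 0.0), (2.0,))]
    deformed = act_affine(template, ((2.0, 0.0), (0.0, 2.0)), (0.0, 0.0))
    print(varifold_norm_sq(template, deformed))
```

The double loop costs a number of kernel evaluations quadratic in the total particle count, which is fine for a small illustration; a practical implementation would vectorize it and differentiate it automatically with respect to the flow parameters.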