
Introduction to medical image registration

Pietro Gori

Lecturer-researcher, IMAGES team - Télécom ParisTech

October 10, 2018

P. Gori BIOMED 10/10/2018 1 / 57 Summary

1 Introduction

2 Geometric global transformations

3 Image warping and interpolations

4 Intensity based registration

5 Landmark based registration
   Affine registration
   Procrustes superimposition
   Non-linear registration (small displacement)
   Non-linear registration (diffeomorphism)

Medical image geometric registration

Definition
- Geometric: find the optimal parameters of a geometric transformation that spatially aligns two different images of the same object. It establishes a spatial correspondence between the pixels of the source (or moving) image and those of the target image.
- Photometric: modify the intensity of the pixels, not their position.

Applications
- Compare two (or more) images of the same modality (e.g. T1-w MRI of the brain of two different subjects)
- Combine information from multiple modalities (e.g. PET, DWI, T1-w MRI of the brain of the same subject)
- Longitudinal studies (e.g. monitor anatomical or functional changes of the brain over time)
- Relate preoperative and postoperative images after surgery


Image registration components

- Dimensionality: 2D/2D, 3D/3D, 2D/3D
- Transformation (linear / non-linear)
- Similarity metric (e.g. intensities, landmarks, edges, surfaces)
- Optimization procedure
- Interaction (automatic / semi-automatic / interactive)
- Modalities (mono-modal / multi-modal)
- Subjects (intra-subject / inter-subject / atlas construction)

Introduction

- Let $I$ and $J$ be the source and target images. They show the same anatomical object, most of the time with a different field of view and resolution (sampling).
- $I(x, y)$ and $J(u, v)$ are the intensity values of the pixels located on two regular grids: $(x, y) \in \Omega_I$ and $(u, v) \in \Omega_J$.
- For the same subject, the same anatomical point $z$ can be at position $(x_z, y_z)$ in $I$ and at $(u_z, v_z)$ in $J$.

Mathematical definition

Both I and J are functions:

$$ I : \Omega_I \subseteq \mathbb{R}^2 \to \mathbb{R}, \qquad (x, y) \mapsto I(x, y) $$

We look for a geometric transformation T, which is a 2D warping parametric function that belongs to a certain family Γ:

$$ T_\phi : \mathbb{R}^2 \to \mathbb{R}^2, \qquad (x, y; \phi) \mapsto T_\phi(x, y) $$

$\phi$ is the vector of parameters of $T$. We look for a transformation that maps $(x_z, y_z)$ to $(u_z, v_z)$.


- Most of the time, $I$ and $J$ are simply seen as matrices, so their coordinates $(x, y)$ and $(u, v)$ are integer-valued (row and column indices).
- The pixel intensities can be real numbers (better for computations) or integers, usually in the range [0, 255] for a gray-scale image.
- There are several kinds of transformations:
  - global model
  - piecewise model
  - local (regional) model

2 Geometric global transformations

Global transformations

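Global means that the transformation is the same for every point $p$.

Scaling - multiply each coordinate by a scalar:

$$ \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \qquad (1) $$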


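Rotation - with respect to the origin. Let $p = [x \; y]^T$ and $p' = [u \; v]^T$. In polar coordinates:

$$ \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} r\cos(\alpha) \\ r\sin(\alpha) \end{bmatrix} $$

$$ \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} r\cos(\alpha + \theta) \\ r\sin(\alpha + \theta) \end{bmatrix} = \begin{bmatrix} r(\cos(\alpha)\cos(\theta) - \sin(\alpha)\sin(\theta)) \\ r(\sin(\alpha)\cos(\theta) + \cos(\alpha)\sin(\theta)) \end{bmatrix} = \begin{bmatrix} x\cos(\theta) - y\sin(\theta) \\ y\cos(\theta) + x\sin(\theta) \end{bmatrix} = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} $$

$$ R = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix}, \qquad \det(R) = 1, \qquad R^{-1} = R^T $$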


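Reflection

- Horizontal (about the Y-axis):

$$ \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} r\cos(\pi - \alpha) \\ r\sin(\pi - \alpha) \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} $$

- Vertical (about the X-axis):

$$ \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} r\cos(2\pi - \alpha) \\ r\sin(2\pi - \alpha) \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} $$

In both cases $\det(R) = -1$ and $R^{-1} = R^T$.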


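Shear (called transvection in French):

$$ \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} 1 & \lambda_x \\ \lambda_y & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} $$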

2D Linear transformations

$$ \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} $$

Linear transformations are combinations of:
- scaling
- rotation
- reflection
- shear

Properties of linear transformations:
- the origin is always transformed to the origin
- parallel lines remain parallel
- ratios of lengths along a line are preserved
- lines remain lines

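Translation - a non-zero translation does not have a fixed point, so it is not a linear transformation and cannot be written as a 2×2 matrix product:

$$ \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} t_x \\ t_y \end{bmatrix} = \begin{bmatrix} x + t_x \\ y + t_y \end{bmatrix} \qquad (2) $$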

Homogeneous coordinates - 2D affine transformation

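Instead of 2×2 matrices we use 3×3 matrices.

Affine transformation: combination of linear transformations and translations. Let $p = [x \; y \; 1]^T$ and $p' = [u \; v \; 1]^T$ be the points in homogeneous coordinates; we obtain $p' = Tp$:

$$ \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \underbrace{\begin{bmatrix} a & b & t_x \\ c & d & t_y \\ 0 & 0 & 1 \end{bmatrix}}_{T} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} $$

Properties of affine transformations:
- the origin is not always transformed to the origin
- parallel lines remain parallel
- ratios of lengths along a line are preserved
- lines remain lines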
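The homogeneous form makes it easy to chain the global transformations above by matrix multiplication. The following is a minimal NumPy sketch; the helper names (`scaling`, `rotation`, `translation`, `apply_affine`) are illustrative and not part of the slides.

```python
import numpy as np

def scaling(sx, sy):
    """Scaling by (sx, sy), as in equation (1), in homogeneous form."""
    return np.array([[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 1.0]])

def rotation(theta):
    """Rotation by theta (radians) about the origin."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def translation(tx, ty):
    """Translation by (tx, ty), as in equation (2), in homogeneous form."""
    return np.array([[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]])

def apply_affine(T, points):
    """Apply the 3x3 matrix T to an (N, 2) array of points."""
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    return (homogeneous @ T.T)[:, :2]

# Rotate by 90 degrees about the origin, then translate by (2, 1).
# The rightmost matrix is applied first.
T = translation(2.0, 1.0) @ rotation(np.pi / 2)
points = np.array([[1.0, 0.0], [0.0, 0.0]])
warped = apply_affine(T, points)
```

Here the point (1, 0) is rotated to (0, 1) and then shifted to (2, 2), while the origin is moved to (2, 1), which illustrates that an affine transformation does not keep the origin fixed.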

2D Projective transformations (homographies)

u abg  x       v = cdh  y 1 efi 1

Properties of projective transformations: origin is not always transformed to origin parallel lines do not necessarily remain parallel ratios are not preserved lines remain lines
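As a small illustration of the projective case, the sketch below applies a 3×3 homography to 2D points and normalizes by the third homogeneous coordinate. The function name `apply_homography` and the matrix `H` are made up for this example.

```python
import numpy as np

def apply_homography(H, points):
    """Apply a 3x3 homography H to an (N, 2) array of points."""
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    mapped = homogeneous @ H.T
    # Divide by the third coordinate to come back to (u, v, 1).
    return mapped[:, :2] / mapped[:, 2:3]

# Hypothetical homography: identity except for one projective term (e = 0.5).
H = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.5, 0.0, 1.0]])
points = np.array([[0.0, 0.0], [2.0, 0.0], [2.0, 2.0]])
mapped = apply_homography(H, points)
```

With this matrix the points at x = 2 are divided by 0.5 * 2 + 1 = 2, so the vertical segment from (2, 0) to (2, 2) is mapped to the shorter segment from (1, 0) to (1, 1): lines remain lines, but lengths are not scaled uniformly.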

3 Image warping and interpolations

Forward warping

$$ T_\phi : \mathbb{R}^2 \to \mathbb{R}^2, \qquad (x, y; \phi) \mapsto T_\phi(x, y) $$

- Ideally, if $\Omega_I$ and $\Omega_J$ were continuous domains, we would simply map $(x, y)$ to $(u, v) = T_\phi(x, y)$ and then compare $I(x, y)$ with $J(u, v)$.
- However, $\Omega_I$ and $\Omega_J$ are regular grids!
- What if $T_\phi(x, y)$ is not on the grid $\Omega_J$? → Splatting: add the contribution to the neighboring pixels.

Inverse warping

- Find the pixel intensities of the deformed image $I_T$ starting from $\Omega_J$: $(x, y) = T_\phi^{-1}(u, v)$.
- Assign to $I_T(u, v)$ the pixel intensity $I(x, y)$.
- What if $T_\phi^{-1}(u, v)$ is not on $\Omega_I$? → Interpolation!

Interpolation

Goal: estimate the intensity value at points that are not located on the regular grid.
- nearest neighbor
- bilinear
- cubic
- Lanczos
- ...


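Nearest neighbor: $I_T(u, v) = I(\mathrm{round}(x), \mathrm{round}(y))$, where $(x, y) = T_\phi^{-1}(u, v)$.

Bilinear: let $(p, q)$ be the point to interpolate, $(x, y)$ the grid point such that $x \le p < x + 1$ and $y \le q < y + 1$, $\Delta x = p - x$, $\Delta y = q - y$, and $f$ the image. Linear interpolation along $y$ gives:

$$ \frac{f(x, q) - f(x, y)}{\Delta y} = \frac{f(x, y + 1) - f(x, q)}{1 - \Delta y} \;\rightarrow\; f(x, q) = (1 - \Delta y) f(x, y) + \Delta y \, f(x, y + 1) $$

Similarly, $f(x + 1, q) = (1 - \Delta y) f(x + 1, y) + \Delta y \, f(x + 1, y + 1)$. Then, interpolating along $x$:

$$ f(p, q) = (1 - \Delta x) f(x, q) + \Delta x \, f(x + 1, q) $$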
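The two interpolation rules can be written in a few lines. This is only a sketch: it assumes the image is stored as a NumPy array indexed as `img[x, y]`, that the query point lies inside the grid, and the function names are illustrative.

```python
import numpy as np

def nearest(img, p, q):
    """Nearest-neighbor interpolation at the real-valued location (p, q)."""
    # floor(. + 0.5) avoids Python's round-half-to-even behaviour.
    return img[int(np.floor(p + 0.5)), int(np.floor(q + 0.5))]

def bilinear(img, p, q):
    """Bilinear interpolation of img (indexed img[x, y]) at (p, q)."""
    # Clamp so that (x + 1, y + 1) stays on the grid at the last row/column.
    x = min(int(np.floor(p)), img.shape[0] - 2)
    y = min(int(np.floor(q)), img.shape[1] - 2)
    dx, dy = p - x, q - y
    f_x_q = (1 - dy) * img[x, y] + dy * img[x, y + 1]           # f(x, q)
    f_x1_q = (1 - dy) * img[x + 1, y] + dy * img[x + 1, y + 1]  # f(x + 1, q)
    return (1 - dx) * f_x_q + dx * f_x1_q                        # f(p, q)

img = np.array([[0.0, 10.0],
                [20.0, 30.0]])
center_value = bilinear(img, 0.5, 0.5)   # average of the four pixels
nearest_value = nearest(img, 0.6, 0.2)   # closest grid point is (1, 0)
```

At the center of the four pixels the bilinear value is their average, whereas nearest neighbor simply copies the closest pixel, which is why repeated nearest-neighbor warps produce blocky results.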

Interpolation examples

Every time we deform an image, we need to interpolate it. For instance, this is the result on Lena after 10 rotations of 36 degrees:

Figure 1: Original image - Nearest Neighbour - Bilinear

Source : http://bigwww.epfl.ch/demo/jaffine/index.html (Michael Unser)

Deformation algorithm

Recipe:
1. Given a source image $I$ and a global affine transformation $T$, compute the forward warping of the corners of $I$. This gives the bounding box of the deformed image $I_T$.
2. Given the bounding box of $I_T$, create a new grid within it with, for instance, the same characteristics as $\Omega_J$, to allow comparison between $I_T$ and $J$.
3. Use inverse warping and interpolation to compute the intensity values at the grid points of the warped image $I_T$ (this avoids holes).

Be careful! During the inverse warping, points that are mapped outside $\Omega_I$ are rejected.
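The recipe can be sketched as follows for an affine transformation given as a 3×3 homogeneous matrix. This is a simplified illustration, not a reference implementation: the image is assumed to be indexed as `img[x, y]`, the output grid is an integer grid of a chosen shape starting at the origin, rejected points receive a `fill` value, and all function names are hypothetical.

```python
import numpy as np

def bounding_box(img_shape, T):
    """Forward-warp the four corners of the image and return (min, max)."""
    nx, ny = img_shape
    corners = np.array([[0, 0, 1], [nx - 1, 0, 1],
                        [0, ny - 1, 1], [nx - 1, ny - 1, 1]], dtype=float).T
    warped = (T @ corners)[:2]
    return warped.min(axis=1), warped.max(axis=1)

def warp_affine(img, T, out_shape, fill=0.0):
    """Inverse warping with bilinear interpolation on an integer (u, v) grid."""
    T_inv = np.linalg.inv(T)
    u, v = np.meshgrid(np.arange(out_shape[0]), np.arange(out_shape[1]),
                       indexing="ij")
    uv1 = np.stack([u.ravel(), v.ravel(), np.ones(u.size)])
    x, y = (T_inv @ uv1)[:2]               # (x, y) = T^{-1}(u, v)
    # Points mapped outside the source grid are rejected.
    inside = ((x >= 0) & (x <= img.shape[0] - 1) &
              (y >= 0) & (y <= img.shape[1] - 1))
    out = np.full(u.size, fill, dtype=float)
    xi, yi = x[inside], y[inside]
    x0 = np.clip(np.floor(xi).astype(int), 0, img.shape[0] - 2)
    y0 = np.clip(np.floor(yi).astype(int), 0, img.shape[1] - 2)
    dx, dy = xi - x0, yi - y0
    out[inside] = ((1 - dx) * (1 - dy) * img[x0, y0]
                   + (1 - dx) * dy * img[x0, y0 + 1]
                   + dx * (1 - dy) * img[x0 + 1, y0]
                   + dx * dy * img[x0 + 1, y0 + 1])
    return out.reshape(out_shape)

img = np.arange(16, dtype=float).reshape(4, 4)
T = np.array([[1.0, 0.0, 1.0],   # translation by +1 along x
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
box_min, box_max = bounding_box(img.shape, T)
warped = warp_affine(img, T, out_shape=(4, 4), fill=-1.0)
```

In this toy case the first row of `warped` has no pre-image in the source grid and keeps the fill value, while the remaining rows are the source rows shifted by one.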

Transformation parameters

Given a transformation $T$ defined by a set of parameters $\theta$ and two images $I$ and $J$, how do we estimate $\theta$? By minimizing a cost function:

$$ \theta^* = \arg\min_\theta d(I_T, J) \qquad (3) $$

The similarity measure $d$ might be based on the pixel intensities and/or on corresponding geometric objects such as control points (i.e. landmarks), curves or surfaces.

4 Intensity based registration

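Same modality

- Sum of squared intensity differences (SSD). Best measure when $I$ and $J$ only differ by Gaussian noise. Very sensitive to outlier pixels, namely pixels whose intensity difference is very large compared to the others.

$$ d(I_T, J) = \sum_u \sum_v \left( J(u, v) - I(T_\phi^{-1}(u, v)) \right)^2 = \sum_u \sum_v \left( J(u, v) - I_T(u, v) \right)^2 \qquad (4) $$

- Correlation coefficient. The assumption is that there is a linear relationship between the intensities of the two images. Unlike SSD, a larger value (closer to 1) indicates a better match, so it is maximized, or its negative is minimized.

$$ d(I_T, J) = \frac{\sum_u \sum_v (J(u, v) - \bar{J})(I_T(u, v) - \bar{I}_T)}{\sqrt{\sum_u \sum_v (J(u, v) - \bar{J})^2 \; \sum_u \sum_v (I_T(u, v) - \bar{I}_T)^2}} \qquad (5) $$

where $\bar{J}$ and $\bar{I}_T$ are the mean intensities of $J$ and $I_T$.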
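Both mono-modal measures are one-liners on arrays. The sketch below assumes that the warped image $I_T$ and the target $J$ are already sampled on the same grid; the function names and the synthetic test images are illustrative.

```python
import numpy as np

def ssd(warped, target):
    """Sum of squared intensity differences, equation (4)."""
    return np.sum((target - warped) ** 2)

def correlation_coefficient(warped, target):
    """Correlation coefficient, equation (5)."""
    a = warped - warped.mean()
    b = target - target.mean()
    return np.sum(a * b) / np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))

rng = np.random.default_rng(0)
J = rng.random((32, 32))
I_T = 2.0 * J + 5.0          # same image with a linear intensity change

ssd_value = ssd(I_T, J)
cc_value = correlation_coefficient(I_T, J)
```

The linear intensity change makes the SSD large even though the images are perfectly aligned, while the correlation coefficient stays at 1, which is the reason for preferring it when intensities are only linearly related.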


Multi modality

Mutual information. We first need to define the joint histogram between $I_T$ and $J$. The value at location $(a, b)$ is the number of pixel locations $(u, v)$ that have intensity $a$ in $I_T(u, v)$ and intensity $b$ in $J(u, v)$. For example, a joint histogram with the value 2 at position (7, 3) means that we have found two locations $(u, v)$ where the intensity of the first image is 7 ($I_T(u, v) = 7$) and the intensity of the second is 3 ($J(u, v) = 3$). By dividing by the total number of pixels, we obtain a joint probability density function (pdf) $p_{I_T,J}$. The definition of joint entropy is:

$$ H(I_T, J) = -\sum_a \sum_b p_{I_T,J}(a, b) \log\left( p_{I_T,J}(a, b) \right) \qquad (6) $$

where $a$ and $b$ range over the intensities of $I_T$ and $J$ respectively.


Figure 2: Joint histogram of a) same modality (MRI), b) different modalities (MR-CT), c) different modalities (MR-PET). First column: the images are aligned. Second and third columns: the images are translated. Taken from [2].

The definition of mutual information is:

$$ M(I_T, J) = H(I_T) + H(J) - H(I_T, J) \qquad (7) $$

where $H(I_T) = -\sum_a p_{I_T}(a) \log(p_{I_T}(a))$ and $H(J) = -\sum_b p_J(b) \log(p_J(b))$. The two marginal pdfs $p_{I_T}$ and $p_J$ are simply the accumulations over the columns and rows of $p_{I_T,J}$ respectively. It results:

$$ M(I_T, J) = \sum_a \sum_b p_{I_T,J}(a, b) \log \frac{p_{I_T,J}(a, b)}{p_{I_T}(a) \, p_J(b)} \qquad (8) $$


$$ M(I_T, J) = H(I_T) + H(J) - H(I_T, J) = H(J) - H(J | I_T) \qquad (9) $$

where the conditional entropy is $H(J | I_T) = -\sum_a \sum_b p_{I_T,J}(a, b) \log p_{J|I_T}(b | a)$. Mutual information measures the amount of uncertainty about $J$ minus the uncertainty about $J$ when $I_T$ is known, that is to say, how much we reduce the uncertainty about $J$ after observing $I_T$. It is maximized when the two images are aligned [8].

Maximizing $M$ means finding a transformation $T$ that makes $I_T$ the best predictor for $J$. In other words, knowing the intensity $I_T(u, v)$ allows us to predict $J(u, v)$ as well as possible.
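Equation (8) can be evaluated directly from a joint histogram. The sketch below uses `numpy.histogram2d` with a fixed number of bins, which is an arbitrary choice made for this example, and a random image with inverted contrast as a crude stand-in for a second modality.

```python
import numpy as np

def mutual_information(warped, target, bins=32):
    """Mutual information from the normalized joint histogram, equation (8)."""
    joint_hist, _, _ = np.histogram2d(warped.ravel(), target.ravel(), bins=bins)
    p_joint = joint_hist / joint_hist.sum()     # joint pdf p_{I_T,J}(a, b)
    p_warped = p_joint.sum(axis=1)              # marginal p_{I_T}(a)
    p_target = p_joint.sum(axis=0)              # marginal p_J(b)
    product = np.outer(p_warped, p_target)
    nonzero = p_joint > 0                       # skip empty bins (0 log 0 = 0)
    return np.sum(p_joint[nonzero] * np.log(p_joint[nonzero] / product[nonzero]))

rng = np.random.default_rng(0)
J = rng.integers(0, 256, size=(64, 64)).astype(float)
aligned = 255.0 - J                       # inverted contrast, same geometry
shifted = np.roll(aligned, 5, axis=0)     # same image, misaligned by 5 rows

mi_aligned = mutual_information(aligned, J)
mi_shifted = mutual_information(shifted, J)
```

Even though the intensities of `aligned` and `J` are inverted, knowing one predicts the other, so the mutual information is high; after the shift the two images become almost independent and the value drops.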

Optimization procedure

- Pixel-based similarity measures need an iterative approach, where an initial estimate of the transformation is gradually refined using the gradient and, depending on the method, also the Hessian of the similarity measure with respect to the parameters $\theta$.
- Possible algorithms: gradient descent, Newton-Raphson, Levenberg-Marquardt
- Problem of local minima → stochastic optimization, line search, trust region, multi-resolution (first low resolution, then higher resolution)
- Validation → visual inspection, alignment of manually segmented objects, value of the similarity measure
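As a toy illustration of the iterative approach, the sketch below estimates a 2D translation by plain gradient descent on the SSD. Several simplifications are assumed: the images are analytic Gaussian blobs (so no interpolation is needed), the gradient is approximated by central finite differences, and the step size and number of iterations are hand-picked for this example.

```python
import numpy as np

def blob(grid_x, grid_y, center, sigma=8.0):
    """Synthetic image: a Gaussian blob centered at `center`."""
    return np.exp(-((grid_x - center[0]) ** 2 + (grid_y - center[1]) ** 2)
                  / (2 * sigma ** 2))

gx, gy = np.meshgrid(np.arange(64.0), np.arange(64.0), indexing="ij")
source_center = np.array([32.0, 32.0])
true_shift = np.array([3.0, -2.0])
J = blob(gx, gy, source_center + true_shift)      # target image

def cost(theta):
    """SSD between the target and the source translated by theta."""
    I_T = blob(gx, gy, source_center + theta)
    return np.sum((J - I_T) ** 2)

def numerical_gradient(f, theta, h=1e-4):
    """Central finite-difference approximation of the gradient of f."""
    grad = np.zeros_like(theta)
    for k in range(theta.size):
        step = np.zeros_like(theta)
        step[k] = h
        grad[k] = (f(theta + step) - f(theta - step)) / (2 * h)
    return grad

theta = np.zeros(2)                                # initial estimate
for _ in range(200):
    theta = theta - 0.1 * numerical_gradient(cost, theta)
```

With this smooth, single-blob image the descent converges to the true shift; on real images the cost usually has local minima, which is what the multi-resolution and stochastic strategies listed above are meant to mitigate.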

5 Landmark based registration

Anatomical landmarks

Definition: anatomical landmark
An anatomical landmark is a point precisely defined on an anatomical structure, which establishes a correspondence across the population of homologous anatomical objects.

Figure 3: Example of manually labeled landmarks


Given a set of $N$ landmarks $\{i_p\}$ defined on $I$ and $N$ corresponding landmarks $\{j_p\}$ defined on $J$, we seek to minimize:

$$ \theta^* = \arg\min_\theta \sum_{p=1}^N \| T(i_p) - j_p \|^2 \qquad (10) $$

- $T(i_p)$ means that we apply the deformation $T$ to the $p$-th landmark $i_p$.
- We suppose that all landmarks belong to $\mathbb{R}^2$ (it would be similar for $\mathbb{R}^3$).
- The metric is the Euclidean norm (or the Frobenius norm when using matrices).

Landmark based affine registration

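We define:

$$ T(i_p) = A i_p + t \quad \forall p \in [1, N] \qquad (11) $$

Thus, $\theta = \{A, t\}$ and

$$ A^*, t^* = \arg\min_{A,t} f(A, t) = \sum_{p=1}^N \| A i_p + t - j_p \|^2 \qquad (12) $$

Differentiating with respect to $t$, it results:

$$ \frac{\partial f}{\partial t} = 2 \sum_{p=1}^N (A i_p + t - j_p) = 0 \;\rightarrow\; t^* = \bar{j} - A \bar{i} \qquad (13) $$

where $\bar{i}$ and $\bar{j}$ are the means of the landmarks. We notice that if we center the data (i.e. $\tilde{i}_p = i_p - \bar{i}$ and $\tilde{j}_p = j_p - \bar{j}$) then $t^* = 0$. The criterion thus becomes $f = \sum_{p=1}^N \| A \tilde{i}_p - \tilde{j}_p \|^2$.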


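Now we differentiate wrt $A$:

$$ \frac{\partial f}{\partial A} = \sum_{p=1}^N \left( \frac{\partial \| A \tilde{i}_p \|^2}{\partial A} - 2 \frac{\partial \langle A \tilde{i}_p, \tilde{j}_p \rangle}{\partial A} \right) = 2 \sum_{p=1}^N \left( A \tilde{i}_p \tilde{i}_p^T - \tilde{j}_p \tilde{i}_p^T \right) = 2 \sum_{p=1}^N (A \tilde{i}_p - \tilde{j}_p) \tilde{i}_p^T = 0 \qquad (14) $$

It results:

$$ A^* = \left( \sum_{p=1}^N \tilde{j}_p \tilde{i}_p^T \right) \left( \sum_{p=1}^N \tilde{i}_p \tilde{i}_p^T \right)^{-1} \qquad (15) $$

The matrix $\sum_{p=1}^N \tilde{i}_p \tilde{i}_p^T$ is invertible if the landmarks are not all aligned on a straight line.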


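From a computational point of view, it is easier to use homogeneous coordinates:

$$ T^* = \arg\min_T \| xT - y \|_F^2 \qquad (16) $$

where, writing $(x_p, y_p)$ for the coordinates of $i_p$ and $(u_p, v_p)$ for those of $j_p$, we define

$$ x = \begin{bmatrix} x_1 & y_1 & 1 & 0 & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ x_N & y_N & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & x_1 & y_1 & 1 \\ \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\ 0 & 0 & 0 & x_N & y_N & 1 \end{bmatrix}, \qquad T = \begin{bmatrix} a_1 \\ a_2 \\ t_1 \\ a_3 \\ a_4 \\ t_2 \end{bmatrix}, \qquad y = \begin{bmatrix} u_1 \\ \vdots \\ u_N \\ v_1 \\ \vdots \\ v_N \end{bmatrix} \qquad (17) $$

$$ T^* = (x^T x)^{-1} x^T y \qquad (18) $$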
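The two routes to the affine estimate, the centered closed form of equations (13) and (15) and the homogeneous least-squares system of equations (16)-(18), can be checked against each other. In this sketch the landmarks are stored as the rows of `(N, 2)` arrays, `numpy.linalg.lstsq` is used instead of forming $(x^T x)^{-1}$ explicitly (a choice made here for numerical stability), and the function names and test data are illustrative.

```python
import numpy as np

def affine_homogeneous(src, dst):
    """Least-squares affine fit through the system of equation (17)."""
    n = len(src)
    ones, zeros = np.ones((n, 1)), np.zeros((n, 3))
    top = np.hstack([src, ones, zeros])         # rows [x_p, y_p, 1, 0, 0, 0]
    bottom = np.hstack([zeros, src, ones])      # rows [0, 0, 0, x_p, y_p, 1]
    x = np.vstack([top, bottom])                # (2N, 6)
    y = np.concatenate([dst[:, 0], dst[:, 1]])  # [u_1..u_N, v_1..v_N]
    params, *_ = np.linalg.lstsq(x, y, rcond=None)
    a1, a2, t1, a3, a4, t2 = params
    return np.array([[a1, a2], [a3, a4]]), np.array([t1, t2])

def affine_centered(src, dst):
    """Same fit through the centered closed form, equations (13) and (15)."""
    i_mean, j_mean = src.mean(axis=0), dst.mean(axis=0)
    i_c, j_c = src - i_mean, dst - j_mean
    A = (j_c.T @ i_c) @ np.linalg.inv(i_c.T @ i_c)
    t = j_mean - A @ i_mean
    return A, t

rng = np.random.default_rng(0)
A_true = np.array([[1.2, 0.3], [-0.1, 0.9]])
t_true = np.array([5.0, -2.0])
src = rng.random((10, 2)) * 10.0
dst = src @ A_true.T + t_true                   # noise-free correspondences

A_h, t_h = affine_homogeneous(src, dst)
A_c, t_c = affine_centered(src, dst)
```

On noise-free correspondences both functions recover the same matrix and translation; with noisy landmarks they still agree with each other, since they minimize the same criterion.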

Procrustes superimposition (similarity transformation)

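We seek to minimize:

$$ (s, R, t)^* = \arg\min_{s,R,t} \sum_{p=1}^N \| s R i_p + t - j_p \|_2^2 \qquad (19) $$

where $s$ is a uniform scaling factor (scalar) and $R$ is a rotation matrix. The translation vector is, as before, equal to $t^* = \bar{j} - \frac{1}{N} \sum_{p=1}^N s R i_p$. Thus, by centering the data, we obtain:

$$ (s, R)^* = \arg\min_{s,R} f(s, R) = \sum_{p=1}^N \| s R \tilde{i}_p - \tilde{j}_p \|_2^2 = \| s X R^T - Y \|_F^2 \qquad (20) $$

Remember that $\sum_{p=1}^N \| \tilde{i}_p - \tilde{j}_p \|_2^2 = \| X - Y \|_F^2$, where $X = [\tilde{i}_1^T; \tilde{i}_2^T; ...; \tilde{i}_N^T]$, $Y = [\tilde{j}_1^T; \tilde{j}_2^T; ...; \tilde{j}_N^T]$ and $\| X \|_F^2 = \mathrm{Tr}(X^T X)$. Since the landmarks are stored as the rows of $X$ and $Y$, applying $R$ to every landmark corresponds to the product $X R^T$.

We minimize wrt $s$:

$$ f(s, R) = s^2 \| X R^T \|_F^2 - 2 s \langle X R^T, Y \rangle_F + \| Y \|_F^2 $$

$$ \frac{\partial f(s, R)}{\partial s} = 2 s \| X \|_F^2 - 2 \langle X R^T, Y \rangle_F = 0 \;\rightarrow\; s^* = \frac{\langle X R^T, Y \rangle_F}{\| X \|_F^2} \qquad (21) $$

where we use the fact that $R^T R = R R^T = I$, so that $\| X R^T \|_F = \| X \|_F$. Substituting $s^*$ into $f$ and dropping the constant $\| Y \|_F^2$:

$$ R^* = \arg\min_R \left( -\frac{(\langle X R^T, Y \rangle_F)^2}{\| X \|_F^2} \right) = \arg\max_R | \langle X R^T, Y \rangle_F | = \arg\max_R | \langle R, Y^T X \rangle_F | $$

$$ = \arg\max_R | \langle R, U \Sigma V^T \rangle_F | = \arg\max_R | \langle U^T R V, \Sigma \rangle_F | = \arg\max_R | \langle Z, \Sigma \rangle_F | \qquad (22) $$

where we use the SVD decomposition $Y^T X = U \Sigma V^T$ and the definition of the trace: $\langle X R^T, Y \rangle_F = \mathrm{Tr}(R X^T Y) = \mathrm{Tr}(R^T Y^T X) = \langle R, Y^T X \rangle_F$.

Notice that $Z = U^T R V$ is an orthogonal matrix since it is the product of orthogonal matrices. Thus $Z^T Z = I$ and $z_j^T z_j = 1$ for every column $z_j$. It follows that $|z_{ij}| \le 1$.

$$ | \langle Z, \Sigma \rangle_F | = | \mathrm{Tr}(\Sigma^T Z) | = \left| \mathrm{Tr}\left( \begin{bmatrix} \sigma_1 & 0 \\ 0 & \sigma_2 \end{bmatrix} \begin{bmatrix} z_{11} & z_{12} \\ z_{21} & z_{22} \end{bmatrix} \right) \right| = \left| \sum_{d=1}^2 \sigma_d z_{dd} \right| \le \sum_{d=1}^2 \sigma_d \qquad (23) $$

The maximum is obtained when $z_{dd} = 1 \; \forall d$, which means when:

$$ Z = U^T R V = I \;\rightarrow\; R^* = U V^T \qquad (24) $$

In order to be sure that $R$ is a rotation matrix ($\det(R) = 1$), we compute [5]:

$$ R^* = U \begin{bmatrix} 1 & 0 \\ 0 & \det(U V^T) \end{bmatrix} V^T = U S V^T \qquad (25) $$

To recap [5], with $Y^T X = U \Sigma V^T$ and $S = \mathrm{diag}(1, \det(U V^T))$:

$$ R^* = U S V^T, \qquad s^* = \frac{\langle R^*, Y^T X \rangle_F}{\| X \|_F^2} = \frac{\mathrm{Tr}(S \Sigma)}{\| X \|_F^2}, \qquad t^* = \bar{j} - \frac{1}{N} \sum_{p=1}^N s^* R^* i_p \qquad (26) $$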
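The recap translates almost line by line into NumPy. The sketch below assumes the landmarks are stored as the rows of `(N, 2)` arrays, takes the SVD of the 2×2 cross-covariance of the centered target and source landmarks (target transposed times source), and returns the rotation that acts on column vectors as in equation (19). The function name and the test data are illustrative.

```python
import numpy as np

def procrustes(src, dst):
    """Similarity fit dst_p ~ s * R @ src_p + t for (N, 2) landmark arrays."""
    i_mean, j_mean = src.mean(axis=0), dst.mean(axis=0)
    X, Y = src - i_mean, dst - j_mean           # centered landmarks as rows
    U, sigma, Vt = np.linalg.svd(Y.T @ X)       # cross-covariance, 2x2
    S = np.diag([1.0, np.linalg.det(U @ Vt)])   # forces det(R) = +1
    R = U @ S @ Vt
    s = np.sum(sigma * np.diag(S)) / np.sum(X ** 2)   # Tr(S Sigma) / ||X||_F^2
    t = j_mean - s * R @ i_mean
    return s, R, t

angle, s_true, t_true = 0.7, 1.5, np.array([2.0, -1.0])
R_true = np.array([[np.cos(angle), -np.sin(angle)],
                   [np.sin(angle), np.cos(angle)]])
rng = np.random.default_rng(0)
src = rng.random((8, 2)) * 10.0
dst = s_true * src @ R_true.T + t_true          # noise-free correspondences

s_est, R_est, t_est = procrustes(src, dst)
```

On this noise-free example the scale, rotation and translation are recovered exactly (up to rounding); the determinant correction only matters when the best orthogonal fit would otherwise be a reflection.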

Non-linear registration (small displacement)

We define the deformation of a pixel at location $z = (x, y)$ as:

$$ T(z) = z + v(z) \quad \text{with} \quad v(z) = \sum_{p=1}^N K(z, i_p) \alpha_p \qquad (27) $$

where $K(z, i_p)$ is a kernel, for instance $K(z, i_p) = \exp\left( -\frac{\| z - i_p \|_2^2}{\lambda^2} \right)$, and $\alpha_p$ is a 2D vector which needs to be estimated. The displacement at any point $z$ depends on the displacement of the neighboring landmarks.


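We minimize

$$ \alpha^* = \arg\min_\alpha f(\alpha; \lambda) = \sum_{p=1}^N \| i_p + v(i_p) - j_p \|_2^2 = \sum_{p=1}^N \left\| i_p + \left( \sum_{d=1}^N K(i_p, i_d) \alpha_d \right) - j_p \right\|_2^2 = \| i + K\alpha - j \|_F^2 \qquad (28) $$

where

$$ K = \begin{bmatrix} 1 & K(i_1, i_2) & \dots & K(i_1, i_N) \\ K(i_2, i_1) & 1 & \dots & K(i_2, i_N) \\ \vdots & \vdots & \ddots & \vdots \\ K(i_N, i_1) & K(i_N, i_2) & \dots & 1 \end{bmatrix} \quad \text{and} \quad \alpha = [\alpha_1^T; ...; \alpha_N^T] $$

Here $i$ and $j$ denote the $N \times 2$ matrices whose rows are the landmarks $i_p^T$ and $j_p^T$.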


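By differentiating wrt $\alpha$:

$$ \frac{\partial \| i + K\alpha - j \|_F^2}{\partial \alpha} = 2 K^T (i + K\alpha - j) = 0 \;\rightarrow\; \alpha^* = K^{-1}(j - i) \qquad (29) $$

The matrix $K$ might not always be invertible (for instance if $\lambda$ is too big). We need to regularize the problem. A possible solution is to add a Tikhonov-type regularization term such as $\mathrm{Tr}(\alpha^T K \alpha)$, thus obtaining:

$$ \alpha^* = \arg\min_\alpha f(\alpha; \lambda, \gamma) = \| i + K\alpha - j \|_F^2 + \gamma \, \mathrm{Tr}(\alpha^T K \alpha) \qquad (30) $$

$$ \frac{\partial f(\alpha; \lambda, \gamma)}{\partial \alpha} = 2 K^T (i + K\alpha - j) + 2 \gamma K \alpha = 0 \;\rightarrow\; \alpha^* = (K + \gamma I)^{-1}(j - i) \qquad (31) $$

where the last step uses the symmetry of $K$ ($K^T = K$), so that $K$ can be factored out of both terms.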
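The regularized solution of equation (31) and the resulting deformation of equation (27) can be sketched as follows. The landmarks are assumed to be stored as the rows of `(N, 2)` arrays, the Gaussian kernel of the text is used, and the function names, landmark positions and parameter values are made up for the example.

```python
import numpy as np

def gaussian_kernel(a, b, lam):
    """K[p, d] = exp(-||a_p - b_d||^2 / lam^2) for (N, 2) and (M, 2) arrays."""
    squared_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-squared_dist / lam ** 2)

def fit_alpha(src, dst, lam, gamma):
    """Regularized weights alpha = (K + gamma I)^{-1} (j - i), equation (31)."""
    K = gaussian_kernel(src, src, lam)
    return np.linalg.solve(K + gamma * np.eye(len(src)), dst - src)

def deform(z, src, alpha, lam):
    """T(z) = z + sum_p K(z, i_p) alpha_p, equation (27), for (M, 2) points z."""
    return z + gaussian_kernel(z, src, lam) @ alpha

src = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0], [4.0, 4.0]])
dst = src + np.array([[0.2, 0.0], [0.0, 0.1], [-0.1, 0.0], [0.0, -0.2]])

alpha = fit_alpha(src, dst, lam=2.0, gamma=1e-8)
moved = deform(src, src, alpha, lam=2.0)          # should land on dst

alpha_smooth = fit_alpha(src, dst, lam=2.0, gamma=1.0)
moved_smooth = deform(src, src, alpha_smooth, lam=2.0)
```

With a very small $\gamma$ the source landmarks are mapped almost exactly onto the target landmarks; a larger $\gamma$ trades matching accuracy for a smoother, smaller displacement field.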


Why only small displacement? → We could approximate the inverse of $T(z)$ as $T^{-1}(z') = z' - v(z')$, obtaining:

$$ T(T^{-1}(z')) = z' - v(z') + v(z' - v(z')) \neq z' \qquad (32) $$

The error is small only if $v(z' - v(z')) - v(z')$ is small, which is the case only when the displacement is small! We might have intersections, holes or tearing in areas where the displacement is large.
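A one-dimensional toy example makes the argument concrete. The displacement field below, a single Gaussian bump of amplitude `a`, is an assumption chosen for illustration; the code measures the round-trip error of equation (32) and checks whether $T(z) = z + v(z)$ stays monotonic.

```python
import numpy as np

def v(z, a, lam=1.0):
    """Toy 1D displacement field: a Gaussian bump of amplitude a."""
    return a * np.exp(-z ** 2 / lam ** 2)

def round_trip_error(a, z_prime):
    """Max of |T(T^{-1}(z')) - z'| with the approximation T^{-1}(z') = z' - v(z')."""
    approx_inverse = z_prime - v(z_prime, a)
    back = approx_inverse + v(approx_inverse, a)
    return np.max(np.abs(back - z_prime))

z = np.linspace(-3.0, 3.0, 601)
err_small = round_trip_error(0.05, z)    # small displacement
err_large = round_trip_error(1.5, z)     # large displacement

# T is one-to-one in 1D only if it is monotonic; a decrease means folding.
folds_small = np.any(np.diff(z + v(z, 0.05)) < 0)
folds_large = np.any(np.diff(z + v(z, 1.5)) < 0)
```

For the small amplitude the approximate inverse is accurate and the mapping stays monotonic; for the large amplitude the round-trip error is no longer negligible and the mapping folds over itself.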

Small-displacement registration

Figure 4: First row: forward and inverse transformation. The first one is a one-to-one mapping whereas the second one presents intersections. Second row: composition of transformations. They should be identity transforms. Image taken from [7].

Large-displacement registration: diffeomorphism

- Instead of using small-displacement transforms, we should use diffeomorphisms.
- A diffeomorphism is a differentiable (smooth and continuous) bijective (one-to-one) transformation with a differentiable inverse (i.e. nonzero Jacobian determinant).
- Using diffeomorphic transformations we can preserve the spatial organization, namely no intersection, folding or tearing may occur.

Diffeomorphism

Figure 5: First row: forward and inverse diffeomorphic transformation (both are one-to-one). Second row: composition of forward and inverse transformations. The result is the identity transform (i.e. no deformation). Image taken from [7].


One of the most used algorithms in medical imaging to create diffeomorphic deformations is called LDDMM: Large Deformation Diffeomorphic Metric Mapping.

Deformations are built by integrating a time-varying vector field $v_t(x)$ over $t \in [0, 1]$, where $v_t(x)$ represents the instantaneous velocity of any point $x$ at time $t$ (and no longer a displacement vector!).


Calling $\phi_t(x)$ the position at time $t$ of a point that was located at $x$ at time $t = 0$, its evolution is given by:

$$ \frac{\partial \phi_t(x)}{\partial t} = v_t(\phi_t(x)) \quad \text{with} \quad \phi_0(x) = x $$

Integrating this equation over $t \in [0, 1]$ produces a flow of diffeomorphisms (if $v$ is square integrable). The last diffeomorphism, $\phi_1$, is the one we are interested in.
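The flow equation can be integrated numerically. The sketch below is a deliberately simplified 1D illustration, not LDDMM itself: it assumes a stationary velocity field (the same Gaussian bump at every time $t$), uses a plain forward Euler scheme, and approximates the inverse by integrating the opposite velocity from the end points.

```python
import numpy as np

def velocity(x, a=1.5, lam=1.0):
    """Toy stationary 1D velocity field (assumed identical for all t)."""
    return a * np.exp(-x ** 2 / lam ** 2)

def integrate_flow(x0, sign=1.0, n_steps=2000):
    """Euler integration of d(phi_t)/dt = sign * v(phi_t) over t in [0, 1]."""
    x = np.array(x0, dtype=float)
    dt = 1.0 / n_steps
    for _ in range(n_steps):
        x = x + sign * dt * velocity(x)
    return x

x0 = np.linspace(-3.0, 3.0, 61)
phi_1 = integrate_flow(x0)                  # final positions phi_1(x)
back = integrate_flow(phi_1, sign=-1.0)     # integrate -v to go back

# For comparison: using the same field as a single displacement folds.
one_step = x0 + velocity(x0)
```

The integrated map `phi_1` keeps the points in order and can be undone by flowing backwards, whereas adding the same field once as a displacement (`one_step`) reverses the order of some points.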


Figure 6: Image taken from T. Mansi - MICCAI - 2009


http://www.deformetrica.org/


The flow of diffeomorphisms produces a dense deformation of the entire 3D space. We know how to deform every point in the space.

The last diffeomorphism is parametrised by the initial velocity $v_0$.

From a mathematical point of view, to register a source image or mesh $I$ to a target image or mesh $J$ we minimize:

$$ \arg\min_{v_0} D(\phi_1(I), J) + \gamma \, \mathrm{Reg}(v_0) \qquad (33) $$

where $D$ is a data term, $\mathrm{Reg}$ is a regularization term and $\gamma$ is their trade-off. We use an optimization scheme (e.g. gradient descent) to estimate the optimal deformation parameters $v_0$.

References

[1] J. Modersitzki (2004). Numerical methods for image registration. Oxford University Press.
[2] J. V. Hajnal, D. L. G. Hill, D. J. Hawkes (2001). Medical image registration. CRC Press.
[3] G. Wolberg (1990). Digital Image Warping. IEEE.
[4] The Matrix Cookbook.
[5] S. Umeyama (1991). Least-squares estimation of transformation parameters.... IEEE TPAMI.
[6] S. Durrleman et al. (2014). Morphometry of anatomical shape complexes ... . NeuroImage.
[7] J. Ashburner (2007). A fast diffeomorphic image registration algorithm. NeuroImage.
[8] J. P. W. Pluim (2003). Mutual information based registration of medical images: a survey. IEEE TMI.
