<<

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Welcome to Copenhagen! Schedule:

Monday Tuesday Wednesday Thursday Friday 8 Registration and welcome 9 Crash course on Crash course on Introduction to Differential and Differential and Information & Information Geometry & Information Geometry Riemannian Riemannian Stochastic Optimization Stochastic Optimization 3.1 Geometry 1.1 Geometry 3 1.1 in Discrete Domains 1.1 (Amari) (Feragen) (Lauze) (Hansen) (Màlago)

10 Crash course on Tutorial on numerics Introduction to Differential and Information Geometry & Information Geometry & for Riemannian Information Geometry Riemannian Stochastic Optimization Stochastic Optimization geometry 1.1 3.2 Geometry 1.2 1.2 in Discrete Domains 1.2 (Sommer) (Amari) (Feragen) (Hansen) (Màlago)

11 Crash course on Tutorial on numerics Introduction to Differential and Information Geometry & Information Geometry & for Riemannian Information Geometry Riemannian Stochastic Optimization Stochastic Optimization geometry 1.2 3.3 Geometry 1.3 1.3 in Discrete Domains 1.3 (Sommer) (Amari) (Feragen) (Hansen) (Màlago)

12 Lunch Lunch Lunch Lunch Lunch 13 Crash course on Introduction to Differential and Information Geometry & Information Geometry & Information Geometry Information Geometry & Riemannian Reinforcement Learning Stochastic Optimization 2.1 Cognitive Systems 1.1 Geometry 2.1 1.1 1.4 (Amari) (Ay) (Lauze) (Peters) (Hansen) 14 Crash course on Introduction to Differential and Information Geometry & Information Geometry & Information Geometry Information Geometry & Riemannian Reinforcement Learning Stochastic Optimization 2.2 Cognitive Systems 1.2 Geometry 2.2 1.2 1.5 (Amari) (Ay) (Lauze) (Peters) (Hansen) 15 Introduction to Introduction to Information Geometry & Stochastic Optimization Information Geometry Information Geometry Information Geometry & Reinforcement Learning in Practice 1.1 1.1 2.3 Cognitive Systems 1.3 1.3 (Hansen) (Amari) (Amari) (Ay) (Peters) 16 Introduction to Introduction to Stochastic Optimization Information Geometry Information Geometry Social activity/ Information Geometry & in Practice 1.2 1.2 2.4 Networking event Cognitive Systems 1.3 (Hansen) (Amari) (Amari) (Ay)

Coffee breaks at 10:00 and 14:45 (no afternoon break on Wednesday) Slide 1/57— Aasa Feragen and François Lauze— — September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Welcome to Copenhagen!

Social Programme! • Today: Pizza and walking tour! • 17:15 Pizza dinner in lecture hall • 18:00 Departure from lecture hall (with Metro – we have tickets) • 19:00 Walking tour of old university

• Wednesday: Boat tour, Danish beer and dinner • 15:20 Bus from KUA to Nyhavn • 16:00-17:00 Boat tour • 17:20 Bus from Nyhavn to Nørrebro bryghus (NB, brewery) • 18:00 Guided tour of NB • 19:00 Dinner at NB

Slide 1/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Welcome to Copenhagen!

• Lunch on your own – canteens and coffee on campus • Internet • Eduroam • Alternative will be set up ASAP • Emergency? Call Aasa: +4526220498 • Questions?

Slide 1/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN

Faculty of Science

A Very Brief Introduction to Differential and

Aasa Feragen and François Lauze Department of Computer Science University of Copenhagen

PhD course on Information geometry, Copenhagen 2014 Slide 2/57 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Outline

1 Motivation Nonlinearity n Recall: Calculus in R

2 Differential Geometry Smooth Building Manifolds Tangent Vector fields Differential of smooth map

3 Riemannian metrics Introduction to Riemannian metrics Recall: Inner Products Riemannian metrics Invariance of the Fisher information A first take on the distance metric A first take on

Slide 3/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Outline

1 Motivation Nonlinearity n Recall: Calculus in R

2 Differential Geometry Smooth manifolds Building Manifolds Vector fields Differential of smooth map

3 Riemannian metrics Introduction to Riemannian metrics Recall: Inner Products Riemannian metrics Invariance of the Fisher information metric A first take on the geodesic distance metric A first take on curvature

Slide 4/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Why do we care about nonlinearity?

• Nonlinear relations between data objects • True distances not reflected by linear representation

”Topographic map example”. Licensed under Public domain via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Topographic_map_example.png#mediaviewer/File: Topographic_map_example.png

Slide 5/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Mildly nonlinear: Nonlinear transformations between different linear representations

• Kernels! • Feature map = nonlinear transformation of (linear?) data space X into linear feature space H • Learning problem is (usually) linear in H, not in X.

Slide 6/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • the of the data in feature space is nonlinear • the recovered intrinsic distance structure is linear

• Searches for the folded-up that best fits the data

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Mildly nonlinear: Nonlinearly embedded subspaces whose is linear • learning! • Find intrinsic dataset distances d • Find an R embedding that minimally distorts those distances

Slide 7/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Mildly nonlinear: Nonlinearly embedded subspaces whose intrinsic metric is linear • Manifold learning! • Find intrinsic dataset distances d • Find an R embedding that minimally distorts those distances • Searches for the folded-up Euclidean space that best fits the data • the embedding of the data in feature space is nonlinear • the recovered intrinsic distance structure is linear

Slide 7/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN More nonlinear: Data spaces which are intrinsically nonlinear

• Distances distorted in nonlinear way, varying spatially • We shall see: the distances cannot always be linearized

”Topographic map example”. Licensed under Public domain via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Topographic_map_example.png#mediaviewer/File: Topographic_map_example.png

Slide 8/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Open set U ⊂ Rm = set that does not contain its boundary • Manifold M gets its topology (= definition of open sets) from U via ϕ • What are the implications of getting the topology from U?

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Intrinsically nonlinear data spaces: Smooth manifolds

Definition A manifold is a set M with an associated one-to-one map ϕ: U → M from an open subset U ⊂ Rm called a global chart or a global for M.

Slide 9/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • What are the implications of getting the topology from U?

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Intrinsically nonlinear data spaces: Smooth manifolds

Definition A manifold is a set M with an associated one-to-one map ϕ: U → M from an open subset U ⊂ Rm called a global chart or a global coordinate system for M.

• Open set U ⊂ Rm = set that does not contain its boundary • Manifold M gets its topology (= definition of open sets) from U via ϕ

Slide 9/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Intrinsically nonlinear data spaces: Smooth manifolds

Definition A manifold is a set M with an associated one-to-one map ϕ: U → M from an open subset U ⊂ Rm called a global chart or a global coordinate system for M.

• Open set U ⊂ Rm = set that does not contain its boundary • Manifold M gets its topology (= definition of open sets) from U via ϕ • What are the implications of getting the topology from U?

Slide 9/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Intrinsically nonlinear data spaces: Smooth manifolds Definition A smooth manifold is a pair (M, A) where • M is a set • A is a family of one-to-one global charts ϕ: U → M from some m open subset U = Uϕ ⊂ R for M, • for any two charts ϕ: U → Rm and ψ : V → Rm in A, their corresponding change of variables is a smooth ψ−1 ◦ ϕ: U → V ⊂ Rm.

Slide 9/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Outline

1 Motivation Nonlinearity n Recall: Calculus in R

2 Differential Geometry Smooth manifolds Building Manifolds Tangent Space Vector fields Differential of smooth map

3 Riemannian metrics Introduction to Riemannian metrics Recall: Inner Products Riemannian metrics Invariance of the Fisher information metric A first take on the geodesic distance metric A first take on curvature

Slide 10/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • f is of class Cr if f has continuous partial derivatives

r +···+rn ∂ 1 yk r1 rn ∂x1 . . . ∂xn

k = 1 ... q, r1 + ... rn ≤ r. • When r = ∞, f is smooth. Our focus.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Differentiable and smooth functions

• f : U open ⊂ Rn → Rq continuous: write

(y1,..., yq) = f (x1,..., xn)

Slide 11/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • When r = ∞, f is smooth. Our focus.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Differentiable and smooth functions

• f : U open ⊂ Rn → Rq continuous: write

(y1,..., yq) = f (x1,..., xn)

• f is of class Cr if f has continuous partial derivatives

r +···+rn ∂ 1 yk r1 rn ∂x1 . . . ∂xn

k = 1 ... q, r1 + ... rn ≤ r.

Slide 11/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Differentiable and smooth functions

• f : U open ⊂ Rn → Rq continuous: write

(y1,..., yq) = f (x1,..., xn)

• f is of class Cr if f has continuous partial derivatives

r +···+rn ∂ 1 yk r1 rn ∂x1 . . . ∂xn

k = 1 ... q, r1 + ... rn ≤ r. • When r = ∞, f is smooth. Our focus.

Slide 11/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Jacobian of f : matrix q × n of partial derivatives of f :

 ∂y ∂y  1 (x) ... 1 (x) ∂x1 ∂xn  . .  Jxf =  . .   . .  ∂yq ∂yq (x) ... (x) ∂x1 ∂xn

• What is the meaning of the Jacobian? The differential? How do they differ?

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Differential, Jacobian Matrix

n q • Differential of f in x: unique (if exists) dx f : R → R s.t. f (x + h) = f (x) + dx f (h) + o(h).

Slide 12/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • What is the meaning of the Jacobian? The differential? How do they differ?

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Differential, Jacobian Matrix

n q • Differential of f in x: unique linear map (if exists) dx f : R → R s.t. f (x + h) = f (x) + dx f (h) + o(h).

• Jacobian matrix of f : matrix q × n of partial derivatives of f :

 ∂y ∂y  1 (x) ... 1 (x) ∂x1 ∂xn  . .  Jxf =  . .   . .  ∂yq ∂yq (x) ... (x) ∂x1 ∂xn

Slide 12/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Differential, Jacobian Matrix

n q • Differential of f in x: unique linear map (if exists) dx f : R → R s.t. f (x + h) = f (x) + dx f (h) + o(h).

• Jacobian matrix of f : matrix q × n of partial derivatives of f :

 ∂y ∂y  1 (x) ... 1 (x) ∂x1 ∂xn  . .  Jxf =  . .   . .  ∂yq ∂yq (x) ... (x) ∂x1 ∂xn

• What is the meaning of the Jacobian? The differential? How do they differ?

Slide 12/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Inverse Function Theorem:

• f diffeomorphism ⇒ det(Jxf ) 6= 0. • det(Jxf ) 6= 0 ⇒ f local diffeomorphism in a neighborhood of x.

• What is the meaning of Jxf ? Of det(Jxf ) 6= 0?

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Diffeomorphism

• When n = q: • If f is 1-1, f and f −1 both Cr r • f is a C -diffeomorphism. • Smooth are simply referred to as a diffeomorphisms.

Slide 13/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • What is the meaning of Jxf ? Of det(Jxf ) 6= 0?

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Diffeomorphism

• When n = q: • If f is 1-1, f and f −1 both Cr r • f is a C -diffeomorphism. • Smooth diffeomorphisms are simply referred to as a diffeomorphisms. • Inverse Function Theorem:

• f diffeomorphism ⇒ det(Jxf ) 6= 0. • det(Jxf ) 6= 0 ⇒ f local diffeomorphism in a neighborhood of x.

Slide 13/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Diffeomorphism

• When n = q: • If f is 1-1, f and f −1 both Cr r • f is a C -diffeomorphism. • Smooth diffeomorphisms are simply referred to as a diffeomorphisms. • Inverse Function Theorem:

• f diffeomorphism ⇒ det(Jxf ) 6= 0. • det(Jxf ) 6= 0 ⇒ f local diffeomorphism in a neighborhood of x.

• What is the meaning of Jxf ? Of det(Jxf ) 6= 0?

Slide 13/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • If f is 1-1 and a local diffeomorphism everywhere, it is a global diffeomorphism. • What is the intuitive meaning of a diffeomorphism?

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Diffeomorphism

• f may be a local diffeomorphism everywhere but fail to be a global diffeomorphism. Examples: • Complex exponential:

2 2 x x f : R \0 → R , (x, y) → (e cos(y), e sin(y)). Recall its inverse (the complex log) has infinitely many branches.

”Complex log” by Jan Homann; Color encoding image comment author Hal Lane, September 28, 2009 - Own . This mathematical image was created with Mathematica. Licensed under Public domain via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Complex_log.jpg#mediaviewer/File:Complex_log.jpg

Slide 14/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • What is the intuitive meaning of a diffeomorphism?

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Diffeomorphism

• f may be a local diffeomorphism everywhere but fail to be a global diffeomorphism. Examples: • Complex exponential:

2 2 x x f : R \0 → R , (x, y) → (e cos(y), e sin(y)). Recall its inverse (the complex log) has infinitely many branches.

• If f is 1-1 and a local diffeomorphism everywhere, it is a global diffeomorphism.

”Complex log” by Jan Homann; Color encoding image comment author Hal Lane, September 28, 2009 - Own work. This mathematical image was created with Mathematica. Licensed under Public domain via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Complex_log.jpg#mediaviewer/File:Complex_log.jpg

Slide 14/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Diffeomorphism

• f may be a local diffeomorphism everywhere but fail to be a global diffeomorphism. Examples: • Complex exponential:

2 2 x x f : R \0 → R , (x, y) → (e cos(y), e sin(y)). Recall its inverse (the complex log) has infinitely many branches.

• If f is 1-1 and a local diffeomorphism everywhere, it is a global diffeomorphism. • What is the intuitive meaning of a diffeomorphism?

”Complex log” by Jan Homann; Color encoding image comment author Hal Lane, September 28, 2009 - Own work. This mathematical image was created with Mathematica. Licensed under Public domain via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Complex_log.jpg#mediaviewer/File:Complex_log.jpg

Slide 14/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Outline

1 Motivation Nonlinearity n Recall: Calculus in R

2 Differential Geometry Smooth manifolds Building Manifolds Tangent Space Vector fields Differential of smooth map

3 Riemannian metrics Introduction to Riemannian metrics Recall: Inner Products Riemannian metrics Invariance of the Fisher information metric A first take on the geodesic distance metric A first take on curvature

Slide 15/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • What are the implications of inheriting structure through A?

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Back to smooth manifolds Definition A smooth manifold is a pair (M, A) where • M is a set • A is a family of one-to-one global charts ϕ: U → M from some m open subset U = Uϕ ⊂ R for M, • for any two charts ϕ: U → Rm and ψ : V → Rm, their corresponding change of variables is a smooth diffeomorphism ψ−1 ◦ ϕ: U → V ⊂ Rm.

Slide 16/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Back to smooth manifolds Definition A smooth manifold is a pair (M, A) where • M is a set • A is a family of one-to-one global charts ϕ: U → M from some m open subset U = Uϕ ⊂ R for M, • for any two charts ϕ: U → Rm and ψ : V → Rm, their corresponding change of variables is a smooth diffeomorphism ψ−1 ◦ ϕ: U → V ⊂ Rm.

• What are the implications of inheriting structure through A?

Slide 16/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 1 n • Set ϕj (P) = (y (P),..., y (P)), then

−1 ϕj ◦ ϕi (x1,..., xm) = (y1,..., ym)

 ∂y k  and the m × m Jacobian matrices h are invertible. ∂x k,h

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Back to smooth manifolds

• ϕ and ψ are parametrizations of M

Slide 16/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Back to smooth manifolds

• ϕ and ψ are parametrizations of M 1 n • Set ϕj (P) = (y (P),..., y (P)), then

−1 ϕj ◦ ϕi (x1,..., xm) = (y1,..., ym)

 ∂y k  and the m × m Jacobian matrices h are invertible. ∂x k,h

Slide 16/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Example: Euclidean space

• The Euclidean space Rn is a manifold: take ϕ = Id as global coordinate system!

Slide 17/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Example: Smooth surfaces

• Smooth surfaces in Rn that are the image of a smooth map f : R2 → Rn. • A global coordinate system given by f

Slide 18/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Example: Symmetric Positive Definite Matrices

• P(n) ⊂ GLn consists of all symmetric n × n matrices A that satisfy T n xAx > 0 for any x ∈ R , (positive definite – PD – matrices) n • P(n) = the set of covariance matrices on R 3 • P(3) = the set of (diffusion) on R 2 • Global chart: P(n) is an open, convex subset of R(n +n)/2 • A, B ∈ P(n) → aA + bB ∈ P(n) for all a, b > 0 so P(n) is a convex (n2+n)/2 in R .

Middle figure from Fillard et al., A Riemannian Framework for the Processing of -Valued Images, LNCS 3753, 2005, pp 112-123. Rightmost figure from Fletcher, Joshi, Principal Geodesic Analysis on Symmetric Spaces: Statistics of Diffusion Tensors, CVAMIA04

Slide 19/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Example: Space of Gaussian distributions

• The space of n-dimensional Gaussian distributions is a smooth manifold • Global chart: (µ, Σ) ∈ Rn × P(n).

Slide 20/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Example: Space of 1-dimensional Gaussian distributions • The space of 1-dimensional Gaussian distributions is parametrized by (µ, σ) ∈ R × R+, mean µ, standard deviation σ 2 2 • Also parametrized by (µ, σ ) ∈ R × R+, mean µ, variance σ • Smooth reparametrization ψ−1 ◦ ϕ

Slide 21/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 In these cases, we also require the charts to overlap ”nicely”

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN In general: Manifolds requiring multiple charts

The S2 = {(x, y, z), x 2 + y 2 + z2 = 1}

For instance the projection from North Pole, given, for a P = (x, y, z) 6= N of the sphere, by

 x y  ϕ (P) = , N 1 − z 1 − z

is a (large) local coordinate system (around the south pole).

Slide 22/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN In general: Manifolds requiring multiple charts

The sphere S2 = {(x, y, z), x 2 + y 2 + z2 = 1}

For instance the projection from North Pole, given, for a point P = (x, y, z) 6= N of the sphere, by

 x y  ϕ (P) = , N 1 − z 1 − z

is a (large) local coordinate system (around the south pole). In these cases, we also require the charts to overlap ”nicely”

Slide 22/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN In general: Manifolds requiring multiple charts

The Moebius strip The 2D-

1 1 u ∈ [0, 2π], v ∈ [ , ] 2 2 (u, v) ∈ [0, 2π]2, R  r > 0  1 u    cos(u) 1 + 2 v cos( 2 ) cos(u)(R + r cos(v)) 1 u   sin(u) 1 + 2 v cos( 2 )  sin(u)(R + r cos(v)) 1 u   2 v sin( 2 ) r sin(v)

Slide 22/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 ϕ−1 ◦ f ◦ ψ smooth.

• ϕ global coordinates for M, ψ global coordinates for N

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Smooth maps between manifolds

• f : M → N is smooth if its expression in any global coordinates for M and N is.

Slide 23/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 ϕ−1 ◦ f ◦ ψ smooth.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Smooth maps between manifolds

• f : M → N is smooth if its expression in any global coordinates for M and N is. • ϕ global coordinates for M, ψ global coordinates for N

Slide 23/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Smooth maps between manifolds

• f : M → N is smooth if its expression in any global coordinates for M and N is. • ϕ global coordinates for M, ψ global coordinates for N

ϕ−1 ◦ f ◦ ψ smooth.

Slide 23/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 ϕ−1 ◦ f ◦ ψ smooth diffeomorphism .

• ϕ global coordinates for M, ψ global coordinates for N

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Smooth diffeomorphism between manifolds

• f : M → N is a smooth diffeomorphism if its expression in any global coordinates for M and N is.

Slide 24/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 ϕ−1 ◦ f ◦ ψ smooth diffeomorphism .

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Smooth diffeomorphism between manifolds

• f : M → N is a smooth diffeomorphism if its expression in any global coordinates for M and N is. • ϕ global coordinates for M, ψ global coordinates for N

Slide 24/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Smooth diffeomorphism between manifolds

• f : M → N is a smooth diffeomorphism if its expression in any global coordinates for M and N is. • ϕ global coordinates for M, ψ global coordinates for N

ϕ−1 ◦ f ◦ ψ smooth diffeomorphism .

Slide 24/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Outline

1 Motivation Nonlinearity n Recall: Calculus in R

2 Differential Geometry Smooth manifolds Building Manifolds Tangent Space Vector fields Differential of smooth map

3 Riemannian metrics Introduction to Riemannian metrics Recall: Inner Products Riemannian metrics Invariance of the Fisher information metric A first take on the geodesic distance metric A first take on curvature

Slide 25/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 ( m n n • Γ = F 0) for F : R × R → R , F(x, y) = y − f (x).

• Set M = f −1(0).

• If for all x ∈ M, f is a submersion at x (dxf has full ), M is a manifold of m − n. • Example: m X 2 f (x1,..., xm) = 1 − xi : i=1 f −1(0) is the (m − 1)-dimensional unit sphere Sm−1. • The graph Γ = (x, f (x)) ∈ Rm × Rn is smooth for any smooth map f : Rm → Rn.

• Many common examples of manifolds in practice are of that type.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN of RN

• Take f : U ∈ Rm → Rn, n ≤ m smooth.

Slide 26/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 ( m n n • Γ = F 0) for F : R × R → R , F(x, y) = y − f (x).

• If for all x ∈ M, f is a submersion at x (dxf has full rank), M is a manifold of dimension m − n. • Example: m X 2 f (x1,..., xm) = 1 − xi : i=1 f −1(0) is the (m − 1)-dimensional unit sphere Sm−1. • The graph Γ = (x, f (x)) ∈ Rm × Rn is smooth for any smooth map f : Rm → Rn.

• Many common examples of manifolds in practice are of that type.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Submanifolds of RN

• Take f : U ∈ Rm → Rn, n ≤ m smooth. • Set M = f −1(0).

Slide 26/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 ( m n n • Γ = F 0) for F : R × R → R , F(x, y) = y − f (x).

• Example: m X 2 f (x1,..., xm) = 1 − xi : i=1 f −1(0) is the (m − 1)-dimensional unit sphere Sm−1. • The graph Γ = (x, f (x)) ∈ Rm × Rn is smooth for any smooth map f : Rm → Rn.

• Many common examples of manifolds in practice are of that type.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Submanifolds of RN

• Take f : U ∈ Rm → Rn, n ≤ m smooth. • Set M = f −1(0).

• If for all x ∈ M, f is a submersion at x (dxf has full rank), M is a manifold of dimension m − n.

Slide 26/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 ( m n n • Γ = F 0) for F : R × R → R , F(x, y) = y − f (x).

• The graph Γ = (x, f (x)) ∈ Rm × Rn is smooth for any smooth map f : Rm → Rn.

• Many common examples of manifolds in practice are of that type.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Submanifolds of RN

• Take f : U ∈ Rm → Rn, n ≤ m smooth. • Set M = f −1(0).

• If for all x ∈ M, f is a submersion at x (dxf has full rank), M is a manifold of dimension m − n. • Example: m X 2 f (x1,..., xm) = 1 − xi : i=1 f −1(0) is the (m − 1)-dimensional unit sphere Sm−1.

Slide 26/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 ( m n n • Γ = F 0) for F : R × R → R , F(x, y) = y − f (x). • Many common examples of manifolds in practice are of that type.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Submanifolds of RN

• Take f : U ∈ Rm → Rn, n ≤ m smooth. • Set M = f −1(0).

• If for all x ∈ M, f is a submersion at x (dxf has full rank), M is a manifold of dimension m − n. • Example: m X 2 f (x1,..., xm) = 1 − xi : i=1 f −1(0) is the (m − 1)-dimensional unit sphere Sm−1. • The graph Γ = (x, f (x)) ∈ Rm × Rn is smooth for any smooth map f : Rm → Rn.

Slide 26/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Many common examples of manifolds in practice are of that type.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Submanifolds of RN

• Take f : U ∈ Rm → Rn, n ≤ m smooth. • Set M = f −1(0).

• If for all x ∈ M, f is a submersion at x (dxf has full rank), M is a manifold of dimension m − n. • Example: m X 2 f (x1,..., xm) = 1 − xi : i=1 f −1(0) is the (m − 1)-dimensional unit sphere Sm−1. • The graph Γ = (x, f (x)) ∈ Rm × Rn is smooth for any smooth map f : Rm → Rn. ( m n n • Γ = F 0) for F : R × R → R , F(x, y) = y − f (x).

Slide 26/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Submanifolds of RN

• Take f : U ∈ Rm → Rn, n ≤ m smooth. • Set M = f −1(0).

• If for all x ∈ M, f is a submersion at x (dxf has full rank), M is a manifold of dimension m − n. • Example: m X 2 f (x1,..., xm) = 1 − xi : i=1 f −1(0) is the (m − 1)-dimensional unit sphere Sm−1. • The graph Γ = (x, f (x)) ∈ Rm × Rn is smooth for any smooth map f : Rm → Rn. ( m n n • Γ = F 0) for F : R × R → R , F(x, y) = y − f (x). • Many common examples of manifolds in practice are of that type.

Slide 26/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Just consider the products of charts of M and N! • Example: M = S1, N = R: the

• Example: M = N = S1: the torus

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Manifolds

• M and N manifolds, so is M × N.

Slide 27/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Example: M = S1, N = R: the cylinder

• Example: M = N = S1: the torus

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Product Manifolds

• M and N manifolds, so is M × N. • Just consider the products of charts of M and N!

Slide 27/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Example: M = N = S1: the torus

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Product Manifolds

• M and N manifolds, so is M × N. • Just consider the products of charts of M and N! • Example: M = S1, N = R: the cylinder

Slide 27/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Product Manifolds

• M and N manifolds, so is M × N. • Just consider the products of charts of M and N! • Example: M = S1, N = R: the cylinder

• Example: M = N = S1: the torus

Slide 27/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Outline

1 Motivation Nonlinearity n Recall: Calculus in R

2 Differential Geometry Smooth manifolds Building Manifolds Tangent Space Vector fields Differential of smooth map

3 Riemannian metrics Introduction to Riemannian metrics Recall: Inner Products Riemannian metrics Invariance of the Fisher information metric A first take on the geodesic distance metric A first take on curvature

Slide 28/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Informally: a tangent vector at P ∈ M: draw a c :(−ε, ε) → M, c(0) = P, then c˙ (0) is a tangent vector.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Tangent vectors informally

• How can we quantify tangent vectors to a manifold?

Slide 29/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Informally: a tangent vector at P ∈ M: draw a curve c :(−ε, ε) → M, c(0) = P, then c˙ (0) is a tangent vector.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Tangent vectors informally

• How can we quantify tangent vectors to a manifold?

Slide 29/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Tangent vectors informally

• How can we quantify tangent vectors to a manifold? • Informally: a tangent vector at P ∈ M: draw a curve c :(−ε, ε) → M, c(0) = P, then c˙ (0) is a tangent vector.

Slide 29/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • c :(−ε, ε) → M, c(0) = P. In chart ϕ, the map t 7→ ϕ ◦ c(t) is a curve in Euclidean space, and so is t 7→ ψ ◦ c(t). • d d set v = dt (ϕ ◦ c)|0, w = dt (ψ ◦ c)|0 then −1  w = J0 ϕ ◦ ψ v.

(J0f = Jacobian of f at 0) • Use this relation to identify vectors in different coordinate systems!

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN A bit more formally

.

Slide 30/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • d d set v = dt (ϕ ◦ c)|0, w = dt (ψ ◦ c)|0 then −1  w = J0 ϕ ◦ ψ v.

(J0f = Jacobian of f at 0) • Use this relation to identify vectors in different coordinate systems!

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN A bit more formally

.

• c :(−ε, ε) → M, c(0) = P. In chart ϕ, the map t 7→ ϕ ◦ c(t) is a curve in Euclidean space, and so is t 7→ ψ ◦ c(t).

Slide 30/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Use this relation to identify vectors in different coordinate systems!

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN A bit more formally

.

• c :(−ε, ε) → M, c(0) = P. In chart ϕ, the map t 7→ ϕ ◦ c(t) is a curve in Euclidean space, and so is t 7→ ψ ◦ c(t). • d d set v = dt (ϕ ◦ c)|0, w = dt (ψ ◦ c)|0 then −1  w = J0 ϕ ◦ ψ v.

(J0f = Jacobian of f at 0)

Slide 30/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN A bit more formally

.

• c :(−ε, ε) → M, c(0) = P. In chart ϕ, the map t 7→ ϕ ◦ c(t) is a curve in Euclidean space, and so is t 7→ ψ ◦ c(t). • d d set v = dt (ϕ ◦ c)|0, w = dt (ψ ◦ c)|0 then −1  w = J0 ϕ ◦ ψ v.

(J0f = Jacobian of f at 0) • Use this relation to identify vectors in different coordinate systems!

Slide 30/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Let ϕ be a global chart for M • ϕ(x1,..., xm) ∈ M with ϕ(0) = P. • Define ci : t 7→ ϕ(0,..., 0, t, 0, 0). • They go through P when t = 0 and follow the axes. ∂ • Their derivative at 0 are denoted ∂x , or sometimes . i ∂xi • The ∂xi form a of TP M.

• It is a of dimension m:

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Tangent space • The set of tangent vectors to the m-dimensional manifold M at point P is the tangent space of M at P denoted TP M.

Slide 31/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Let ϕ be a global chart for M • ϕ(x1,..., xm) ∈ M with ϕ(0) = P. • Define curves ci : t 7→ ϕ(0,..., 0, t, 0, 0). • They go through P when t = 0 and follow the axes. ∂ • Their derivative at 0 are denoted ∂x , or sometimes . i ∂xi • The ∂xi form a basis of TP M.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Tangent space • The set of tangent vectors to the m-dimensional manifold M at point P is the tangent space of M at P denoted TP M. • It is a vector space of dimension m:

Slide 31/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • ϕ(x1,..., xm) ∈ M with ϕ(0) = P. • Define curves ci : t 7→ ϕ(0,..., 0, t, 0, 0). • They go through P when t = 0 and follow the axes. ∂ • Their derivative at 0 are denoted ∂x , or sometimes . i ∂xi • The ∂xi form a basis of TP M.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Tangent space • The set of tangent vectors to the m-dimensional manifold M at point P is the tangent space of M at P denoted TP M. • It is a vector space of dimension m: • Let ϕ be a global chart for M

Slide 31/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Define curves ci : t 7→ ϕ(0,..., 0, t, 0, 0). • They go through P when t = 0 and follow the axes. ∂ • Their derivative at 0 are denoted ∂x , or sometimes . i ∂xi • The ∂xi form a basis of TP M.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Tangent space • The set of tangent vectors to the m-dimensional manifold M at point P is the tangent space of M at P denoted TP M. • It is a vector space of dimension m: • Let ϕ be a global chart for M • ϕ(x1,..., xm) ∈ M with ϕ(0) = P.

Slide 31/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • They go through P when t = 0 and follow the axes. ∂ • Their derivative at 0 are denoted ∂x , or sometimes . i ∂xi • The ∂xi form a basis of TP M.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Tangent space • The set of tangent vectors to the m-dimensional manifold M at point P is the tangent space of M at P denoted TP M. • It is a vector space of dimension m: • Let ϕ be a global chart for M • ϕ(x1,..., xm) ∈ M with ϕ(0) = P. • Define curves ci : t 7→ ϕ(0,..., 0, t, 0, 0).

Slide 31/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 ∂ • Their derivative at 0 are denoted ∂x , or sometimes . i ∂xi • The ∂xi form a basis of TP M.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Tangent space • The set of tangent vectors to the m-dimensional manifold M at point P is the tangent space of M at P denoted TP M. • It is a vector space of dimension m: • Let ϕ be a global chart for M • ϕ(x1,..., xm) ∈ M with ϕ(0) = P. • Define curves ci : t 7→ ϕ(0,..., 0, t, 0, 0). • They go through P when t = 0 and follow the axes.

Slide 31/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • The ∂xi form a basis of TP M.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Tangent space • The set of tangent vectors to the m-dimensional manifold M at point P is the tangent space of M at P denoted TP M. • It is a vector space of dimension m: • Let ϕ be a global chart for M • ϕ(x1,..., xm) ∈ M with ϕ(0) = P. • Define curves ci : t 7→ ϕ(0,..., 0, t, 0, 0). • They go through P when t = 0 and follow the axes. ∂ • Their derivative at 0 are denoted ∂x , or sometimes . i ∂xi

Slide 31/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Tangent space • The set of tangent vectors to the m-dimensional manifold M at point P is the tangent space of M at P denoted TP M. • It is a vector space of dimension m: • Let ϕ be a global chart for M • ϕ(x1,..., xm) ∈ M with ϕ(0) = P. • Define curves ci : t 7→ ϕ(0,..., 0, t, 0, 0). • They go through P when t = 0 and follow the axes. ∂ • Their derivative at 0 are denoted ∂x , or sometimes . i ∂xi • The ∂xi form a basis of TP M.

Slide 31/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Outline

1 Motivation Nonlinearity n Recall: Calculus in R

2 Differential Geometry Smooth manifolds Building Manifolds Tangent Space Vector fields Differential of smooth map

3 Riemannian metrics Introduction to Riemannian metrics Recall: Inner Products Riemannian metrics Invariance of the Fisher information metric A first take on the geodesic distance metric A first take on curvature

Slide 32/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Vector fields

• A vector field is a smooth map that sends P ∈ M to a vector v(P) ∈ TP M.

Slide 33/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Outline

1 Motivation Nonlinearity n Recall: Calculus in R

2 Differential Geometry Smooth manifolds Building Manifolds Tangent Space Vector fields Differential of smooth map

3 Riemannian metrics Introduction to Riemannian metrics Recall: Inner Products Riemannian metrics Invariance of the Fisher information metric A first take on the geodesic distance metric A first take on curvature

Slide 34/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • dP f : TP M → Tf (P)N linear map corresponding to the Jacobian matrix of f in local coordinates.

• When N = R, dP f is a TP M → R.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Differential of a smooth map

• f : M → N smooth, P ∈ M, f (P) ∈ N

Slide 35/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • When N = R, dP f is a linear form TP M → R.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Differential of a smooth map

• f : M → N smooth, P ∈ M, f (P) ∈ N

• dP f : TP M → Tf (P)N linear map corresponding to the Jacobian matrix of f in local coordinates.

Slide 35/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Differential of a smooth map

• f : M → N smooth, P ∈ M, f (P) ∈ N

• dP f : TP M → Tf (P)N linear map corresponding to the Jacobian matrix of f in local coordinates.

• When N = R, dP f is a linear form TP M → R.

Slide 35/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Outline

1 Motivation Nonlinearity n Recall: Calculus in R

2 Differential Geometry Smooth manifolds Building Manifolds Tangent Space Vector fields Differential of smooth map

3 Riemannian metrics Introduction to Riemannian metrics Recall: Inner Products Riemannian metrics Invariance of the Fisher information metric A first take on the geodesic distance metric A first take on curvature

Slide 36/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • !

• Distance metric? ? • Varying local inner product = Riemannian metric! • Optimization over such spaces?

• Riemannian geometry

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Tools needed in intrinsically nonlinear spaces?

• Comparison of objects in a nonlinear space?

”Topographic map example”. Licensed under Public domain via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Topographic_map_example.png#mediaviewer/File: Topographic_map_example.png

Slide 37/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Gradients!

• Optimization over such spaces?

• Riemannian geometry

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Tools needed in intrinsically nonlinear spaces?

• Comparison of objects in a nonlinear space? • Distance metric? Kernel? • Varying local inner product = Riemannian metric!

”Topographic map example”. Licensed under Public domain via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Topographic_map_example.png#mediaviewer/File: Topographic_map_example.png

Slide 37/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Gradients! • Riemannian geometry

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Tools needed in intrinsically nonlinear spaces?

• Comparison of objects in a nonlinear space? • Distance metric? Kernel? • Varying local inner product = Riemannian metric! • Optimization over such spaces?

”Topographic map example”. Licensed under Public domain via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Topographic_map_example.png#mediaviewer/File: Topographic_map_example.png

Slide 37/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Riemannian geometry

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Tools needed in intrinsically nonlinear spaces?

• Comparison of objects in a nonlinear space? • Distance metric? Kernel? • Varying local inner product = Riemannian metric! • Optimization over such spaces? • Gradients!

”Topographic map example”. Licensed under Public domain via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Topographic_map_example.png#mediaviewer/File: Topographic_map_example.png

Slide 37/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Tools needed in intrinsically nonlinear spaces?

• Comparison of objects in a nonlinear space? • Distance metric? Kernel? • Varying local inner product = Riemannian metric! • Optimization over such spaces? • Gradients! • Riemannian geometry ”Topographic map example”. Licensed under Public domain via Wikimedia Commons - http://commons.wikimedia.org/wiki/File:Topographic_map_example.png#mediaviewer/File: Topographic_map_example.png

Slide 37/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Outline

1 Motivation Nonlinearity n Recall: Calculus in R

2 Differential Geometry Smooth manifolds Building Manifolds Tangent Space Vector fields Differential of smooth map

3 Riemannian metrics Introduction to Riemannian metrics Recall: Inner Products Riemannian metrics Invariance of the Fisher information metric A first take on the geodesic distance metric A first take on curvature

Slide 38/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 n t • Simplest example: usual dot-product on R : x = (x1,..., xn) , t y = (y1,..., yn) ,

n X t x · y = hx, yi = xi yi = x Id y i=1 Id n × n identity matrix. T • x Ay, A symmetric, positive definite: inner product, hx, yiA. • Without subscript h−, −i will denote standard Euclidean dot-product (i.e. A = Id).

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Recall: Inner Products

• Euclidean/Hilbertian Inner Product on vector space E: bilinear, symmetric, positive definite mapping hx, yi ∈ R.

Slide 39/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 T • x Ay, A symmetric, positive definite: inner product, hx, yiA. • Without subscript h−, −i will denote standard Euclidean dot-product (i.e. A = Id).

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Recall: Inner Products

• Euclidean/Hilbertian Inner Product on vector space E: bilinear, symmetric, positive definite mapping hx, yi ∈ R. n t • Simplest example: usual dot-product on R : x = (x1,..., xn) , t y = (y1,..., yn) ,

n X t x · y = hx, yi = xi yi = x Id y i=1 Id n × n identity matrix.

Slide 39/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Without subscript h−, −i will denote standard Euclidean dot-product (i.e. A = Id).

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Recall: Inner Products

• Euclidean/Hilbertian Inner Product on vector space E: bilinear, symmetric, positive definite mapping hx, yi ∈ R. n t • Simplest example: usual dot-product on R : x = (x1,..., xn) , t y = (y1,..., yn) ,

n X t x · y = hx, yi = xi yi = x Id y i=1 Id n × n identity matrix. T • x Ay, A symmetric, positive definite: inner product, hx, yiA.

Slide 39/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Recall: Inner Products

• Euclidean/Hilbertian Inner Product on vector space E: bilinear, symmetric, positive definite mapping hx, yi ∈ R. n t • Simplest example: usual dot-product on R : x = (x1,..., xn) , t y = (y1,..., yn) ,

n X t x · y = hx, yi = xi yi = x Id y i=1 Id n × n identity matrix. T • x Ay, A symmetric, positive definite: inner product, hx, yiA. • Without subscript h−, −i will denote standard Euclidean dot-product (i.e. A = Id).

Slide 39/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 •  There are norms and distances not from an inner product.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN – Distance

• Orthogonality, vector norm, distance from inner products.

2 x⊥Ay ⇐⇒ hx, yiA = 0, kxkA = hx, xiA , dA(x, y) = kx − ykA.

Slide 40/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Orthogonality – Norm – Distance

• Orthogonality, vector norm, distance from inner products.

2 x⊥Ay ⇐⇒ hx, yiA = 0, kxkA = hx, xiA , dA(x, y) = kx − ykA.

•  There are norms and distances not from an inner product.

Slide 40/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • for standard dot product:   h1  .  T h =  .  = h ! hn

• for general inner product h−, −iA

−1 −1 T hA = A h = A h .

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Inner Products and

n Pn Linear form h = (h1,..., hn): R → R: h(x) = i=1 hi xi . n • inner product h−, −iA on R : h represented by a unique vector hA s.t h(x) = hhA, xiA

hA is the dual of h (w.r.t h−, −iA).

Slide 41/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • for general inner product h−, −iA

−1 −1 T hA = A h = A h .

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Inner Products and Duality

n Pn Linear form h = (h1,..., hn): R → R: h(x) = i=1 hi xi . n • inner product h−, −iA on R : h represented by a unique vector hA s.t h(x) = hhA, xiA

hA is the dual of h (w.r.t h−, −iA). • for standard dot product:   h1  .  T h =  .  = h ! hn

Slide 41/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Inner Products and Duality

n Pn Linear form h = (h1,..., hn): R → R: h(x) = i=1 hi xi . n • inner product h−, −iA on R : h represented by a unique vector hA s.t h(x) = hhA, xiA

hA is the dual of h (w.r.t h−, −iA). • for standard dot product:   h1  .  T h =  .  = h ! hn

• for general inner product h−, −iA

−1 −1 T hA = A h = A h .

Slide 41/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Outline

1 Motivation Nonlinearity n Recall: Calculus in R

2 Differential Geometry Smooth manifolds Building Manifolds Tangent Space Vector fields Differential of smooth map

3 Riemannian metrics Introduction to Riemannian metrics Recall: Inner Products Riemannian metrics Invariance of the Fisher information metric A first take on the geodesic distance metric A first take on curvature

Slide 42/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Given a global parametrization ϕ:(x) = (x1,..., xn) 7→ ϕ(x) ∈ M, it corresponds to a smooth family of symmetric positive definite matrices:   gx11 ... gx1n  . .  gx =  . .  gxn1 ... gxnn

• Pn Pn u = i=1 ui ∂xi , v = i=1 vi ∂xi t hu, vix = (u1,..., un)gx(v1,..., vn) • A smooth manifold with a Riemannian metric is a .

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Riemannian Metric

• Riemannian metric on an m−dimensional manifold = smooth family gP of inner products on the tangent spaces TP M of M

• u, v ∈ TP M 7→ gp(u, v) := hu, viP ∈ R • With it, one can compute of vectors in tangent spaces, check orthogonality, etc...

Slide 43/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Pn Pn u = i=1 ui ∂xi , v = i=1 vi ∂xi t hu, vix = (u1,..., un)gx(v1,..., vn) • A smooth manifold with a Riemannian metric is a Riemannian manifold.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Riemannian Metric

• Riemannian metric on an m−dimensional manifold = smooth family gP of inner products on the tangent spaces TP M of M

• u, v ∈ TP M 7→ gp(u, v) := hu, viP ∈ R • With it, one can compute length of vectors in tangent spaces, check orthogonality, etc... • Given a global parametrization ϕ:(x) = (x1,..., xn) 7→ ϕ(x) ∈ M, it corresponds to a smooth family of symmetric positive definite matrices:   gx11 ... gx1n  . .  gx =  . .  gxn1 ... gxnn

Slide 43/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • A smooth manifold with a Riemannian metric is a Riemannian manifold.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Riemannian Metric

• Riemannian metric on an m−dimensional manifold = smooth family gP of inner products on the tangent spaces TP M of M

• u, v ∈ TP M 7→ gp(u, v) := hu, viP ∈ R • With it, one can compute length of vectors in tangent spaces, check orthogonality, etc... • Given a global parametrization ϕ:(x) = (x1,..., xn) 7→ ϕ(x) ∈ M, it corresponds to a smooth family of symmetric positive definite matrices:   gx11 ... gx1n  . .  gx =  . .  gxn1 ... gxnn

• Pn Pn u = i=1 ui ∂xi , v = i=1 vi ∂xi t hu, vix = (u1,..., un)gx(v1,..., vn)

Slide 43/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Riemannian Metric

• Riemannian metric on an m−dimensional manifold = smooth family gP of inner products on the tangent spaces TP M of M

• u, v ∈ TP M 7→ gp(u, v) := hu, viP ∈ R • With it, one can compute length of vectors in tangent spaces, check orthogonality, etc... • Given a global parametrization ϕ:(x) = (x1,..., xn) 7→ ϕ(x) ∈ M, it corresponds to a smooth family of symmetric positive definite matrices:   gx11 ... gx1n  . .  gx =  . .  gxn1 ... gxnn

• Pn Pn u = i=1 ui ∂xi , v = i=1 vi ∂xi t hu, vix = (u1,..., un)gx(v1,..., vn) • A smooth manifold with a Riemannian metric is a Riemannian manifold.

Slide 43/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Riemannian Metric

Slide 43/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Example: Induced Riemannian metric on submanifolds of Rn

• Inner product from Rn restricts to inner product on M ⊂ Rn

• Frobenius metric on P(n) (n2+n)/2 • P(n) is a convex subset of R • The Euclidean inner product defines a Riemannian metric on P(n)

Rightmost figure from Fletcher, Joshi, Principal Geodesic Analysis on Symmetric Spaces: Statistics of Diffusion Tensors, CVAMIA04

Slide 44/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Example: Fisher information metric • Smooth manifold M = ϕ(U) represents a family of probability distributions (M is a statistical model), U ⊂ Rm • Each point P = ϕ(x) ∈ M is a probability distribution P : Z → R>0

• The Fisher information metric of M at Px = ϕ(x) in coordinates ϕ defined by: Z ∂ log Px (z) ∂ log Px (z) gij (x) = Px (z)dz Z ∂xi ∂xj

Slide 45/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Outline

1 Motivation Nonlinearity n Recall: Calculus in R

2 Differential Geometry Smooth manifolds Building Manifolds Tangent Space Vector fields Differential of smooth map

3 Riemannian metrics Introduction to Riemannian metrics Recall: Inner Products Riemannian metrics Invariance of the Fisher information metric A first take on the geodesic distance metric A first take on curvature

Slide 46/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Invariance of the Fisher information metric • Obtaining a Riemannian metric g on the left chart by pulling g˜ back from the right chart:   ∂ ∂ −1 ∂ −1 ∂ gij = g( , ) = g˜ d(ψ ◦ ϕ)( ), d(ψ ◦ ϕ)( ) . ∂xi ∂xi ∂xi ∂xj

Pm Pm ∂vk ∂vl • Claim: gij = g˜kl k=1 l=1 ∂xi ∂xj

Slide 47/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Invariance of the Fisher information metric

Pm Pm ∂vk ∂vl • Claim: gij = g˜kl k=1 l=1 ∂xi ∂xj • Proof:  ∂ ∂  gij = g , ∂xi ∂xj  m m  = g˜ P ∂vk ∂ , P ∂vl ∂ k=1 ∂xi ∂vk l=1 ∂xj ∂vl m m   = P P ∂vk ∂vl g˜ ∂ , ∂ k=1 l=1 ∂xi ∂xj ∂vk ∂vl Pm Pm ∂vk ∂vl = g˜kl k=1 l=1 ∂xi ∂xj

Slide 47/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Invariance of the Fisher information metric

Pm Pm ∂vk ∂vl • Fact: gij = g˜kl k=1 l=1 ∂xi ∂xj • Pulling the Fisher information metric g˜ from right to left:

Pm Pm ∂vk ∂vl gij = g˜kl k=1 l=1 ∂xi ∂xj Pm Pm R ∂ log Pv (z) ∂ log Pv (z) ∂vk ∂vl = k=1 l=1 ∂v ∂v Pv (z)dz · ∂x ∂x  k   l  i j R Pm ∂ log Pv (z) ∂vk Pm ∂ log Pv (z) ∂vl = Pv (z)dz k=1 ∂vk ∂xi l=1 ∂vl ∂xj R ∂ log Px (z) ∂ log Px (z) = Px (z)dz ∂xi ∂xj • Result: Formula of parametrization

Slide 47/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Outline

1 Motivation Nonlinearity n Recall: Calculus in R

2 Differential Geometry Smooth manifolds Building Manifolds Tangent Space Vector fields Differential of smooth map

3 Riemannian metrics Introduction to Riemannian metrics Recall: Inner Products Riemannian metrics Invariance of the Fisher information metric A first take on the geodesic distance metric A first take on curvature

Slide 48/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Approach supremum through segments c(ti , ti+1) of length → 0 Intuitive path length on Riemannian manifolds:

• Riemannian metric g on M defines norm in TP M • Locally a good approximation for use with (3.1) • This will be made precise in Francois’ lecture!

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Riemannian metrics and distances Path length in metric spaces: • Let (X, d) be a . The length of a curve c :[a, b] → X is n−1 X l(c) = supa=t0≤t1≤...≤tn=b d(c(ti , ti+1)). (3.1) i=0

Slide 49/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 Intuitive path length on Riemannian manifolds:

• Riemannian metric g on M defines norm in TP M • Locally a good approximation for use with (3.1) • This will be made precise in Francois’ lecture!

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Riemannian metrics and distances Path length in metric spaces: • Let (X, d) be a metric space. The length of a curve c :[a, b] → X is n−1 X l(c) = supa=t0≤t1≤...≤tn=b d(c(ti , ti+1)). (3.1) i=0

• Approach supremum through segments c(ti , ti+1) of length → 0

Slide 49/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Locally a good approximation for use with (3.1) • This will be made precise in Francois’ lecture!

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Riemannian metrics and distances Path length in metric spaces: • Let (X, d) be a metric space. The length of a curve c :[a, b] → X is n−1 X l(c) = supa=t0≤t1≤...≤tn=b d(c(ti , ti+1)). (3.1) i=0

• Approach supremum through segments c(ti , ti+1) of length → 0 Intuitive path length on Riemannian manifolds:

• Riemannian metric g on M defines norm in TP M

Slide 49/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • This will be made precise in Francois’ lecture!

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Riemannian metrics and distances Path length in metric spaces: • Let (X, d) be a metric space. The length of a curve c :[a, b] → X is n−1 X l(c) = supa=t0≤t1≤...≤tn=b d(c(ti , ti+1)). (3.1) i=0

• Approach supremum through segments c(ti , ti+1) of length → 0 Intuitive path length on Riemannian manifolds:

• Riemannian metric g on M defines norm in TP M • Locally a good approximation for use with (3.1)

Slide 49/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Riemannian metrics and distances Path length in metric spaces: • Let (X, d) be a metric space. The length of a curve c :[a, b] → X is n−1 X l(c) = supa=t0≤t1≤...≤tn=b d(c(ti , ti+1)). (3.1) i=0

• Approach supremum through segments c(ti , ti+1) of length → 0 Intuitive path length on Riemannian manifolds:

• Riemannian metric g on M defines norm in TP M • Locally a good approximation for use with (3.1) • This will be made precise in Francois’ lecture!

Slide 49/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 Can you see why?

• A geodesic from P to Q in M is a path c :[a, b] → X such that

c(a) = P, c(b) = Q and l(c) = infcP→Q l(cP→Q). • The distance function d(P, Q) = infcP→Q l(cP→Q) is a distance metric on the Riemannian manifold (M, g). • Do always exist?

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Geodesics as length-minimizing curves

• We have a concept of path length l(c) for paths c :[a, b] → M

Slide 50/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 Can you see why?

• The distance function d(P, Q) = infcP→Q l(cP→Q) is a distance metric on the Riemannian manifold (M, g). • Do geodesics always exist?

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Geodesics as length-minimizing curves

• We have a concept of path length l(c) for paths c :[a, b] → M • A geodesic from P to Q in M is a path c :[a, b] → X such that

c(a) = P, c(b) = Q and l(c) = infcP→Q l(cP→Q).

Slide 50/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 Can you see why? • Do geodesics always exist?

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Geodesics as length-minimizing curves

• We have a concept of path length l(c) for paths c :[a, b] → M • A geodesic from P to Q in M is a path c :[a, b] → X such that

c(a) = P, c(b) = Q and l(c) = infcP→Q l(cP→Q). • The distance function d(P, Q) = infcP→Q l(cP→Q) is a distance metric on the Riemannian manifold (M, g).

Slide 50/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Do geodesics always exist?

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Geodesics as length-minimizing curves

• We have a concept of path length l(c) for paths c :[a, b] → M • A geodesic from P to Q in M is a path c :[a, b] → X such that

c(a) = P, c(b) = Q and l(c) = infcP→Q l(cP→Q). • The distance function d(P, Q) = infcP→Q l(cP→Q) is a distance metric on the Riemannian manifold (M, g). Can you see why?

Slide 50/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Geodesics as length-minimizing curves

• We have a concept of path length l(c) for paths c :[a, b] → M • A geodesic from P to Q in M is a path c :[a, b] → X such that

c(a) = P, c(b) = Q and l(c) = infcP→Q l(cP→Q). • The distance function d(P, Q) = infcP→Q l(cP→Q) is a distance metric on the Riemannian manifold (M, g). Can you see why? • Do geodesics always exist?

Slide 50/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Example: Riemannian geodesics between 1-dimensional Gaussian distributions • Space parametrized by (µ, σ) ∈ R × R+ • Metric 1: Euclidean inner product ⇒ Euclidean geodesics

• Metric 2: Fisher information metric

• View in :

2

Middle figure from Costa et al, Fisher information distance: a geometrical reading Slide 51/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Outline

1 Motivation Nonlinearity n Recall: Calculus in R

2 Differential Geometry Smooth manifolds Building Manifolds Tangent Space Vector fields Differential of smooth map

3 Riemannian metrics Introduction to Riemannian metrics Recall: Inner Products Riemannian metrics Invariance of the Fisher information metric A first take on the geodesic distance metric A first take on curvature

Slide 52/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Positive curvature model spaces. of curvature κ > 0: • Flat model space: Euclidean plane • Negatively curved model spaces: of curvature κ > 0

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN A first take on curvature

• Curvature in metric spaces defined by comparison with model spaces of known curvature.

Slide 53/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN A first take on curvature

• Curvature in metric spaces defined by comparison with model spaces of known curvature. • Positive curvature model spaces. Spheres of curvature κ > 0: • Flat model space: Euclidean plane • Negatively curved model spaces: Hyperbolic space of curvature κ > 0

Slide 53/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN A first take on curvature

Figure : Left: Geodesic in a negatively . Right: Comparison triangle in the plane.

• A CAT (κ) space is a metric space in which geodesic are ”thinner” than for their comparison triangles in the model space Mκ; that is, d(x, a) ≤ d(x¯, a¯). • A locally CAT (κ) space has curvature bounded from above by κ. • Geodesic triangles are useful for intuition and proofs!

Slide 53/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • You will see these again with Stefan!

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Example: The two metrics on 1-dimensional Gaussian distributions • Metric 1: Euclidean inner product: FLAT

• Metric 2: Fisher information metric: HYPERBOLIC

(OBS: Not hyperbolic for any family of distributions) • View in plane:

2

Slide 54/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 Middle figure from Costa et al, Fisher information distance: a geometrical reading • You will see these again with Stefan!

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Example: The two metrics on 1-dimensional Gaussian distributions • Metric 1: Euclidean inner product: FLAT

• Metric 2: Fisher information metric: HYPERBOLIC

(OBS: Not hyperbolic for any family of distributions) • View in plane:

2

Slide 54/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 Middle figure from Costa et al, Fisher information distance: a geometrical reading UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Example: The two metrics on 1-dimensional Gaussian distributions • Metric 1: Euclidean inner product: FLAT

• Metric 2: Fisher information metric: HYPERBOLIC

(OBS: Not hyperbolic for any family of distributions) • View in plane:

2

• You will see these again with Stefan! Slide 54/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 Middle figure from Costa et al, Fisher information distance: a geometrical reading UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Relation to

• CAT (κ) is a weak notion of curvature • Stronger notion of sectional curvature (requires a little more Riemannian geometry)

Theorem A smooth Riemannian manifold M is (locally) CAT (κ) if and only if the sectional curvature of M is ≤ κ.

Slide 55/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 d • Assume that Z = {z1,..., zn} ⊂ R is an embedding of X obtained through MDS or manifold learning. • Common belief: If d large, then Z is a good (perfect?) representation of X.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Example of insight with CAT (κ): MDS and manifold learning lie to you

• Given a distance matrix Dij = d(xi , xj ) for a dataset X = {x1,..., xn} residing on a manifold M, where d is a geodesic metric.

Slide 56/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • Common belief: If d large, then Z is a good (perfect?) representation of X.

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Example of insight with CAT (κ): MDS and manifold learning lie to you

• Given a distance matrix Dij = d(xi , xj ) for a dataset X = {x1,..., xn} residing on a manifold M, where d is a geodesic metric. d • Assume that Z = {z1,..., zn} ⊂ R is an embedding of X obtained through MDS or manifold learning.

Slide 56/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Example of insight with CAT (κ): MDS and manifold learning lie to you

• Given a distance matrix Dij = d(xi , xj ) for a dataset X = {x1,..., xn} residing on a manifold M, where d is a geodesic metric. d • Assume that Z = {z1,..., zn} ⊂ R is an embedding of X obtained through MDS or manifold learning. • Common belief: If d large, then Z is a good (perfect?) representation of X.

Slide 56/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • f maps geodesics to straight lines • M is CAT (0) • M is not CAT (κ) for any κ < 0 • ⇒ M is flat • That is, if M is not flat, MDS and manifold learning lie to you • (but sometimes lies are useful)

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Example of insight with CAT (κ): MDS and manifold learning lie to you

• Truth: If there exists a map f : M → Rd such that kf (a) − f (b)k = d(a, b) for all a, b ∈ M, then

Slide 56/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • ⇒ M is flat • That is, if M is not flat, MDS and manifold learning lie to you • (but sometimes lies are useful)

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Example of insight with CAT (κ): MDS and manifold learning lie to you

• Truth: If there exists a map f : M → Rd such that kf (a) − f (b)k = d(a, b) for all a, b ∈ M, then • f maps geodesics to straight lines • M is CAT (0) • M is not CAT (κ) for any κ < 0

Slide 56/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • That is, if M is not flat, MDS and manifold learning lie to you • (but sometimes lies are useful)

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Example of insight with CAT (κ): MDS and manifold learning lie to you

• Truth: If there exists a map f : M → Rd such that kf (a) − f (b)k = d(a, b) for all a, b ∈ M, then • f maps geodesics to straight lines • M is CAT (0) • M is not CAT (κ) for any κ < 0 • ⇒ M is flat

Slide 56/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 • (but sometimes lies are useful)

UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Example of insight with CAT (κ): MDS and manifold learning lie to you

• Truth: If there exists a map f : M → Rd such that kf (a) − f (b)k = d(a, b) for all a, b ∈ M, then • f maps geodesics to straight lines • M is CAT (0) • M is not CAT (κ) for any κ < 0 • ⇒ M is flat • That is, if M is not flat, MDS and manifold learning lie to you

Slide 56/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN Example of insight with CAT (κ): MDS and manifold learning lie to you

• Truth: If there exists a map f : M → Rd such that kf (a) − f (b)k = d(a, b) for all a, b ∈ M, then • f maps geodesics to straight lines • M is CAT (0) • M is not CAT (κ) for any κ < 0 • ⇒ M is flat • That is, if M is not flat, MDS and manifold learning lie to you • (but sometimes lies are useful)

Slide 56/57— Aasa Feragen and François Lauze— Differential Geometry— September 22 UNIVERSITYOFCOPENHAGENUNIVERSITYOFCOPENHAGEN A message from Stefan for tomorrow’s practical

Check out course webpage for installation instructions! http://image.diku.dk/MLLab/IG4.php

Slide 57/57— Aasa Feragen and François Lauze— Differential Geometry— September 22