Deviations and Fluctuations for Mean-Field Games

Kavita Ramanan, Brown University

AMS Short Course JMM, Denver, Colorado January 14, 2020

Outline

I. Interacting Diffusions: Fluctuations
II. From Interacting Diffusions to MFG: Fluctuations
III. Large Deviations for Interacting Diffusions and MFG
IV. Open Problems

I. Interacting Diffusions: Fluctuations

Interacting Diffusions and the McKean-Vlasov limit

Consider $n$ diffusions interacting through their empirical measure:

\[ dX_t^{n,i} = b(X_t^{n,i}, \mu_t^n)\,dt + dW_t^i, \qquad \mu_t^n = \frac{1}{n}\sum_{k=1}^n \delta_{X_t^{n,k}}, \]

where $\{X_0^{n,i} = \xi^i\}_{i \in \mathbb{N}}$ are iid with common law $\mu_0$ and $(W^i)_{i=1}^n$ are iid $d$-dimensional Brownian motions.

Under suitable regularity conditions, $X^{n,i} \Rightarrow X$ and $\mu_t^n \to \mu_t$, where $X$ is a nonlinear Markov (or McKean-Vlasov) process:

\[ dX_t = b(X_t, \mu_t)\,dt + dB_t, \qquad \mu_t = \mathrm{Law}(X_t), \]

with $X_0 \sim \mu_0$ independent of $B$, a $d$-dimensional Brownian motion.

Alternatively, $\mu$ solves the (nonlinear) Fokker-Planck equation

\[ \frac{d}{dt}\langle \mu_t, \varphi\rangle = \Big\langle \mu_t,\; b(\cdot,\mu_t)\cdot\nabla\varphi + \frac{1}{2}\Delta\varphi \Big\rangle, \qquad \forall \varphi \in C_c^\infty. \]
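The particle system above can be explored numerically. The following is a minimal sketch, not part of the original slides: it assumes the hypothetical drift $b(x, m) = -(x - \int y\, m(dy))$ in dimension $d = 1$, the initial law $\mu_0 = N(1, 1)$, and an Euler-Maruyama time discretization. Under these assumptions the McKean-Vlasov limit has constant mean $1$ and variance $\tfrac12 + \tfrac12 e^{-2t}$, which the empirical measure should approach for large $n$.

```python
import math
import random


def simulate_particles(n, T=1.0, steps=200, seed=0):
    """Euler-Maruyama for dX^i = -(X^i - mean(X)) dt + dW^i, with X_0^i ~ N(1, 1).

    The drift is an illustrative assumption: it depends on the empirical
    measure only through its mean. Returns the n particle positions at time T.
    """
    rng = random.Random(seed)
    dt = T / steps
    x = [rng.gauss(1.0, 1.0) for _ in range(n)]
    for _ in range(steps):
        m = sum(x) / n  # mean of the current empirical measure
        x = [xi - (xi - m) * dt + math.sqrt(dt) * rng.gauss(0.0, 1.0) for xi in x]
    return x


def mean_and_variance(xs):
    """Mean and (biased) variance of the empirical measure of xs."""
    m = sum(xs) / len(xs)
    v = sum((xi - m) ** 2 for xi in xs) / len(xs)
    return m, v
```

For example, `mean_and_variance(simulate_particles(2000))` can be compared with the limiting values $1$ and $\tfrac12 + \tfrac12 e^{-2}$ at $T = 1$; the discrepancy reflects both the finite-$n$ fluctuations studied in these slides and the time-discretization error.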

Rate of Convergence to the McKean-Vlasov limit

\[ dX_t^{n,i} = b(X_t^{n,i}, \mu_t^n)\,dt + dW_t^i, \qquad X_0^{n,i} = \xi^i, \]

\[ \mu_t^n = \frac{1}{n}\sum_{k=1}^n \delta_{X_t^{n,k}} \quad\text{and}\quad \mu^n = \frac{1}{n}\sum_{k=1}^n \delta_{X^{n,k}(\cdot)}. \]

1. How can one measure the distance between $\mu_t^n$ and $\mu_t$, and between $\mu^n$ and $\mu$?
2. Some notation:
   - Define $\mathcal{C}^d = C([0,T] : \mathbb{R}^d)$, equipped with the uniform topology.
   - Given a separable Banach space $(E, \|\cdot\|)$, let $\mathcal{P}(E)$ denote the space of Borel probability measures on $E$.
   - Note that $\mu_t^n, \mu_t \in \mathcal{P}(\mathbb{R}^d)$ and $\mu^n, \mu \in \mathcal{P}(\mathcal{C}^d)$.
   - Given $p \in [1, \infty)$, let $\mathcal{P}^p(E) = \{\nu \in \mathcal{P}(E) : \int_E \|x\|^p\, \nu(dx) < \infty\}$, equipped with the $p$-Wasserstein metric:

\[ W_{p,E}(\nu, \nu') = \inf_{\pi} \left( \int_{E \times E} \|x - y\|^p\, \pi(dx, dy) \right)^{1/p}, \]

     where the infimum is over all probability measures $\pi$ on $E \times E$ with first and second marginals $\nu$ and $\nu'$ respectively.
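To make the Wasserstein metric concrete, here is a minimal sketch for the special case $E = \mathbb{R}$ and two empirical measures with the same number of atoms. It relies on the standard fact that on the real line, for $p \ge 1$, an optimal coupling pairs the sorted samples in order; the function name and interface are illustrative choices, not notation from the slides.

```python
def wasserstein_1d(xs, ys, p=1.0):
    """p-Wasserstein distance between the empirical measures of xs and ys on R.

    Assumes equal sample sizes and p >= 1, so that matching the sorted
    samples in order gives an optimal coupling.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    if p < 1:
        raise ValueError("p must be at least 1")
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    cost = sum(abs(x - y) ** p for x, y in zip(xs_sorted, ys_sorted)) / len(xs)
    return cost ** (1.0 / p)
```

Shifting every atom by the same amount $c$ gives distance $|c|$ for every $p$, which is a quick sanity check. In dimension $d > 1$, or on path space $\mathcal{C}^d$, no such sorting shortcut is available and the infimum over couplings has to be computed by other means.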

Fluctuations around the McKean-Vlasov limit

\[ dX_t^{n,i} = b(X_t^{n,i}, \mu_t^n)\,dt + dW_t^i, \qquad \mu_t^n = \frac{1}{n}\sum_{k=1}^n \delta_{X_t^{n,k}}, \]

and

\[ \frac{d}{dt}\langle \mu_t, \varphi\rangle = \Big\langle \mu_t,\; b(\cdot,\mu_t)\cdot\nabla\varphi + \frac{1}{2}\Delta\varphi \Big\rangle, \qquad \forall \varphi \in C_c^\infty. \]

Recall $\mu = (\mu_t)_{t \ge 0}$ is the McKean-Vlasov limit. We are interested in the limit of the signed measures capturing fluctuations:

\[ S_t^n := \sqrt{n}\,(\mu_t^n - \mu_t), \qquad t \ge 0. \]

Do you expect this sequence to converge to another signed measure?

A Simple Example

How bad can the weak limit of a signed measure be?

Let $\nu \in \mathcal{P}(\mathbb{R})$ be a probability measure whose cdf

\[ F(x) := \nu((-\infty, x]), \qquad x \in \mathbb{R}, \]

is not differentiable at some point $x$.

For $n \in \mathbb{N}$, define $\nu_n$ to be the measure with cdf $F_n(x) := F(x + n^{-1/2})$:

\[ \nu_n((-\infty, x]) = \nu((-\infty, x + n^{-1/2}]) = F(x + n^{-1/2}), \]

and define the signed measure

\[ \hat{\nu}^n := \sqrt{n}\,[\nu_n - \nu]. \]

Then, if $F$ is not right-differentiable at $x$, the limit

\[ \lim_{n \to \infty} \hat{\nu}^n((-\infty, x]) = \lim_{n \to \infty} \sqrt{n}\left[ F\Big(x + \frac{1}{\sqrt{n}}\Big) - F(x) \right] \]

need not exist!

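A Simple Example (contd.)

For any measure $\nu$ and integrable function $\varphi$, define

\[ \langle \nu, \varphi \rangle = \int \varphi(x)\, \nu(dx). \]

Recall that $\nu_n((-\infty, x]) = \nu((-\infty, x + n^{-1/2}])$, and so

\[ \langle \nu_n, \varphi \rangle = \int \varphi(x)\, \nu_n(dx) = \int \varphi\Big(x - \frac{1}{\sqrt{n}}\Big)\, \nu(dx) = \Big\langle \nu,\; \varphi\Big(\cdot - \frac{1}{\sqrt{n}}\Big) \Big\rangle. \]

Recall that $\hat{\nu}^n := \sqrt{n}\,[\nu_n - \nu]$. Thus, for any infinitely differentiable function $\varphi : \mathbb{R} \to \mathbb{R}$ with compact support,

\[ \lim_{n \to \infty} \langle \hat{\nu}^n, \varphi \rangle = \lim_{n \to \infty} \sqrt{n}\,[\langle \nu_n, \varphi \rangle - \langle \nu, \varphi \rangle] = \lim_{n \to \infty} \sqrt{n}\, \Big\langle \nu,\; \varphi\Big(\cdot - \frac{1}{\sqrt{n}}\Big) - \varphi(\cdot) \Big\rangle = -\langle \nu, \varphi' \rangle. \]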
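The limit $-\langle \nu, \varphi' \rangle$ can be checked numerically. The sketch below makes two illustrative assumptions that are not in the slides: $\nu$ is the uniform measure on three atoms (so its cdf has jumps and is not differentiable there), and $\varphi$ is the standard bump function $\varphi(x) = \exp(-1/(1 - x^2))$ on $(-1, 1)$, extended by zero.

```python
import math

# Hypothetical choice of nu: uniform probability measure on three atoms.
ATOMS = [-0.5, 0.0, 0.3]


def bump(x):
    """Smooth test function supported on [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0


def bump_prime(x):
    """Derivative of bump, computed by the chain rule."""
    if abs(x) >= 1.0:
        return 0.0
    return bump(x) * (-2.0 * x) / (1.0 - x * x) ** 2


def pair(f, atoms=ATOMS):
    """<nu, f> for the uniform measure on the given atoms."""
    return sum(f(a) for a in atoms) / len(atoms)


def scaled_difference(n):
    """<hat nu^n, phi> = sqrt(n) * <nu, phi(. - n^{-1/2}) - phi>."""
    h = 1.0 / math.sqrt(n)
    return math.sqrt(n) * pair(lambda x: bump(x - h) - bump(x))


limit = -pair(bump_prime)  # the claimed limit -<nu, phi'>
```

Evaluating `scaled_difference(n)` for increasing `n` should approach `limit`, even though the cdf-based limit on the previous slide is not well behaved at the atoms. This is the sense in which pairing with test functions is the right notion of convergence here.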

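A Primer on the Theory of Distributions

\[ \nu \;\leftrightarrow\; \big(\varphi \mapsto \langle \nu, \varphi \rangle\big) \]

Let $\mathcal{D}$ be a space of test functions, e.g.,

\[ \mathcal{D} = \{ f \in C^\infty : \operatorname{supp} f \text{ is compact} \}, \]

equipped with a suitable topology: $\varphi_n \to \varphi$ in $\mathcal{D}$ if there exists a compact set $K$ such that each $\varphi_n$ and $\varphi$ have support in $K$ and $\partial^\alpha \varphi_n \to \partial^\alpha \varphi$ uniformly on $K$ for all $\alpha$.

Definition. A mapping $\nu : \mathcal{D} \to \mathbb{R}$ is said to be a linear functional if

\[ \nu(\alpha_1 \varphi_1 + \alpha_2 \varphi_2) = \alpha_1 \nu(\varphi_1) + \alpha_2 \nu(\varphi_2), \qquad \forall \alpha_i \in \mathbb{R},\ \varphi_i \in \mathcal{D},\ i = 1, 2. \]

Let $\mathcal{D}'$ denote the corresponding space of distributions, where a distribution is defined to be a linear functional $\nu : \mathcal{D} \to \mathbb{R}$ that also satisfies the following continuity property:

\[ \varphi_n \to \varphi \text{ in } \mathcal{D} \;\Rightarrow\; \nu(\varphi_n) \to \nu(\varphi) \text{ in } \mathbb{R}, \qquad \forall \varphi \in \mathcal{D}. \]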

A Primer on the Theory of Distributions (contd.)

Examples of Distributions

1. The space of distributions is clearly a vector space over $\mathbb{R}$.
2. Given a $\sigma$-finite measure $\nu$ on $\mathbb{R}$ that is finite on compact sets, the linear functional $\varphi \mapsto \langle \nu, \varphi \rangle = \int_{\mathbb{R}} \varphi(x)\, \nu(dx)$ defines a distribution.
   Exercise: Prove this by verifying the continuity property.
   Note: For general $\nu \in \mathcal{D}'$, $\nu(\varphi)$ is often written as $\langle \nu, \varphi \rangle$.
3. Given a distribution $\nu \in \mathcal{D}'$ and a $C^\infty$ function $\psi$, $\psi\nu$ denotes the distribution

\[ (\psi\nu)(\varphi) = \langle \nu, \psi\varphi \rangle, \qquad \forall \varphi \in \mathcal{D}. \]

   This is clearly a linear functional.
   Exercise: Verify the continuity property to show $\psi\nu$ is a distribution.

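A Primer on the Theory of Distributions (contd.)

Exercise: Which of the following maps $w : \mathcal{D}(\mathbb{R}) \to \mathbb{R}$ are distributions? Here, $\varphi^{(k)}$ is the $k$th derivative of $\varphi$.

1. $\langle w, \varphi \rangle := \int_{\mathbb{R}} f(x) \varphi(x)\, dx$ for a locally integrable function $f$.
2. $\langle w, \varphi \rangle := \sum_{k=0}^{\infty} \varphi(k)$.
3. $\langle w, \varphi \rangle := \sum_{k=0}^{\infty} \varphi^{(k)}(\sqrt{2})$.
4. $\langle w, \varphi \rangle := \int_{\mathbb{R}} \varphi^2(x)\, dx$.
5. $\langle w, \varphi \rangle := \int_0^{\infty} \frac{\varphi(x) - \varphi(-x)}{x}\, dx$.

Exercise: Show that for any distribution $w$ and any $k \in \mathbb{N}$, $\langle w^{(k)}, \varphi \rangle := (-1)^k \langle w, \varphi^{(k)} \rangle$ is a distribution.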

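Convergence of Distributions

Let $(\nu_\ell)$ be a sequence in $\mathcal{D}'$ and $\nu \in \mathcal{D}'$.

1. Then $(\nu_\ell)$ is said to converge to $\nu$ in $\mathcal{D}'$, denoted $\nu_\ell \to \nu$, if for every $\varphi \in \mathcal{D}$,

\[ \lim_{\ell \to \infty} \langle \nu_\ell, \varphi \rangle = \langle \nu, \varphi \rangle. \]

2. Moreover, $(\nu_\ell)$ is Cauchy in $\mathcal{D}'$ if, for every $\varphi \in \mathcal{D}$, $(\langle \nu_\ell, \varphi \rangle)$ is Cauchy in $\mathbb{R}$.

Returning to the Example:

\[ \nu_n((-\infty, x]) = \nu((-\infty, x + n^{-1/2}]), \qquad \hat{\nu}^n := \sqrt{n}\,[\nu_n - \nu]. \]

We showed that for all $\varphi \in \mathcal{D}$,

\[ \lim_{n \to \infty} \langle \hat{\nu}^n, \varphi \rangle = -\langle \nu, \varphi' \rangle. \]

The linear functional $\varphi \mapsto -\langle \nu, \varphi' \rangle$ lies in $\mathcal{D}'$ (by the last exercise). It is in fact denoted by $\partial\nu$ and is said to be the derivative of $\nu$. Thus, we showed that

\[ \hat{\nu}^n \to \partial\nu \quad \text{in } \mathcal{D}'. \]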

A Stochastic Example

Let $(B^k)_{k \in \mathbb{N}}$ be a sequence of independent 1-dimensional BMs with initial distribution $\pi_0$. For $t > 0$, $A \in \mathcal{B}(\mathbb{R})$, define

\[ N_t^n(A) := \frac{1}{n}\sum_{k=1}^n \mathbb{I}_{\{B_t^k \in A\}}. \]

Exercise: Calculate $\lim_{n \to \infty} N_t^n(A)$.

Soln. Since $(\mathbb{I}_{\{B_t^k \in A\}})_{k \in \mathbb{N}}$ are iid Bernoulli random variables, the SLLN tells us that $N_t^n(A) \to \gamma_t(A)$ almost surely, where $\gamma_t$ is a centered Gaussian distribution with variance $t$ (this is the case $\pi_0 = \delta_0$; for general $\pi_0$ the limit is $\pi_t = \pi_0 * \gamma_t$), because

\[ \gamma_t(A) = \mathbb{E}\big[\mathbb{I}_{\{B_t^1 \in A\}}\big] = \mathbb{P}(B_t^1 \in A) = \mathbb{P}(B_t^k \in A). \]

Note: In fact, one can prove convergence in $\mathcal{P}(\mathbb{R})$: a.s., $N_t^n(\cdot) \to \gamma_t$ as $n \to \infty$.
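A quick Monte Carlo check of this law of large numbers is sketched below. It assumes $\pi_0 = \delta_0$ (all Brownian motions start at the origin) and takes $A = (-\infty, a]$, so that $\gamma_t(A) = \Phi(a/\sqrt{t})$ can be written with the error function; the function names are illustrative.

```python
import math
import random


def empirical_mass(n, t, a, seed=0):
    """N_t^n(A) for A = (-inf, a], with n independent BMs started at 0.

    Only the time-t marginal is needed, so each B_t^k is sampled
    directly as sqrt(t) times a standard Gaussian.
    """
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n) if math.sqrt(t) * rng.gauss(0.0, 1.0) <= a)
    return hits / n


def gaussian_mass(t, a):
    """gamma_t((-inf, a]) for the centered Gaussian with variance t."""
    return 0.5 * (1.0 + math.erf(a / math.sqrt(2.0 * t)))
```

The difference `empirical_mass(n, t, a) - gaussian_mass(t, a)` is typically of order $n^{-1/2}$, which is exactly the scale at which the fluctuation process $S_t^n$ is defined on the next slide.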

Stochastic Example (contd.)

\[ N_t^n(A) = \frac{1}{n}\sum_{k=1}^n \mathbb{I}_{\{B_t^k \in A\}}, \qquad \mathbb{E}[N_t^n(\cdot)] = \gamma_t, \qquad A \in \mathcal{B}(\mathbb{R}),\ t \ge 0. \]

Exercise*: Find the limit of the (random) signed measure-valued process:

\[ S_t^n(A) := \sqrt{n}\,\big(N_t^n(A) - \gamma_t(A)\big), \qquad t \ge 0,\ A \in \mathcal{B}(\mathbb{R}). \]

Soln. View $S^n$ as a distribution-valued process: for $t > 0$, $\varphi \in \mathcal{D}$,

\[ S_t^n(\varphi) := \int_{\mathbb{R}} \varphi(x)\, S_t^n(dx) = \sqrt{n}\left( \int_{\mathbb{R}} \varphi(x)\, N_t^n(dx) - \int_{\mathbb{R}} \varphi(x)\, \gamma_t(dx) \right) \]

is a random variable; in fact it is equal to

\[ S_t^n(\varphi) = n^{-1/2}\sum_{k=1}^n \big[ \varphi(B_t^k) - \langle \gamma_t, \varphi \rangle \big] = n^{-1/2}\sum_{k=1}^n \big[ \varphi(B_t^k) - \mathbb{E}[\varphi(B_t^k)] \big]. \]

• One can show that, for all $\varphi \in \mathcal{D}$, $t \mapsto S_t^n(\varphi)$ is a.s. continuous and that $t \mapsto S_t^n(\cdot)$ is a continuous $\mathcal{D}'$-valued process.

Stochastic Example (contd.)

\[ S_t^n(\cdot) := \sqrt{n}\,\big(N_t^n(\cdot) - \gamma_t(\cdot)\big). \]

Use the multidimensional CLT to show there exists a centered Gaussian $\mathcal{D}'$-valued process $(S_t(\varphi))_{t,\varphi}$ such that every finite-dim. distribution of $S^n = (S_t^n(\varphi))_{t,\varphi}$ converges to the corresponding finite-dim. distribution of the solution $S = (S_t(\varphi))_{t,\varphi}$ to the S(P)DE

\[ dS_t = (\partial \circ \sqrt{\pi_t})\, db_t + \frac{1}{2}\partial^2 S_t\, dt, \]

\[ dS_t(\varphi) = (\partial \circ \sqrt{\pi_t})\, db_t(\varphi) + \frac{1}{2}\partial^2 S_t(\varphi)\, dt, \qquad \varphi \in \mathcal{D}, \]

where
- $\{b_t\} = \{b_t(\varphi), \varphi \in H_2\}$ is a standard Wiener $H_2'$-valued process;
- $\pi_t = \pi_0 * \gamma_t$;
- $\sqrt{\pi_t} \in C^\infty(\mathbb{R})$ is viewed as a multiplication operator in $\mathcal{D}'$;
- $\partial$ is differentiation in $\mathcal{D}'$;
- $\partial \circ \sqrt{\pi_t}$ denotes a composition of these operators.
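The form of this S(P)DE can be motivated by a short heuristic computation, sketched here under the assumption that $\varphi \in \mathcal{D}$ (so $\varphi'$ is bounded) and with $\pi_s$ denoting the law of $B_s^k$. It is a sketch of where the two terms come from, not a proof of convergence. Applying Itô's formula to each $\varphi(B_t^k)$, subtracting the expectation, summing over $k$ and scaling by $n^{-1/2}$ gives

```latex
\begin{align*}
\varphi(B_t^k) &= \varphi(B_0^k) + \int_0^t \varphi'(B_s^k)\, dB_s^k
                 + \frac{1}{2}\int_0^t \varphi''(B_s^k)\, ds, \\
S_t^n(\varphi) &= S_0^n(\varphi) + \frac{1}{2}\int_0^t S_s^n(\varphi'')\, ds + M_t^n(\varphi),
  \qquad
  M_t^n(\varphi) := n^{-1/2}\sum_{k=1}^n \int_0^t \varphi'(B_s^k)\, dB_s^k, \\
\langle M^n(\varphi) \rangle_t &= \frac{1}{n}\sum_{k=1}^n \int_0^t \varphi'(B_s^k)^2\, ds
  = \int_0^t \langle N_s^n, (\varphi')^2 \rangle\, ds
  \;\xrightarrow[n \to \infty]{}\; \int_0^t \langle \pi_s, (\varphi')^2 \rangle\, ds .
\end{align*}
```

Since $(\partial^2 S)(\varphi) = S(\varphi'')$ by the definition of the distributional derivative, the drift term matches $\tfrac12 \partial^2 S_t\, dt$. The limiting quadratic variation $\int_0^t \langle \pi_s, (\varphi')^2 \rangle\, ds$ is the variance of a noise of the form $-\,db_s(\sqrt{\pi_s}\,\varphi')$, which is what $(\partial \circ \sqrt{\pi_s})\, db_s$ produces when tested against $\varphi$.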

Recap so far

We started with interacting diffusions:

\[ dX_t^{n,i} = b(X_t^{n,i}, \mu_t^n)\,dt + dW_t^i, \qquad \mu_t^n = \frac{1}{n}\sum_{k=1}^n \delta_{X_t^{n,k}}, \]

and recalled the McKean-Vlasov limit $\mu$ that satisfies:

\[ \frac{d}{dt}\langle \mu_t, \varphi\rangle = \Big\langle \mu_t,\; b(\cdot,\mu_t)\cdot\nabla\varphi + \frac{1}{2}\Delta\varphi \Big\rangle, \qquad \forall \varphi \in C_c^\infty. \]

To capture the rate of convergence, we wanted to understand the limit of

\[ S_t^n := \sqrt{n}\,(\mu_t^n - \mu_t), \qquad t \ge 0. \]

Recap so far (contd.)

\[ dX_t^{n,i} = b(X_t^{n,i}, \mu_t^n)\,dt + dW_t^i, \qquad \mu_t^n = \frac{1}{n}\sum_{k=1}^n \delta_{X_t^{n,k}}, \]

\[ \frac{d}{dt}\langle \mu_t, \varphi\rangle = \Big\langle \mu_t,\; b(\cdot,\mu_t)\cdot\nabla\varphi + \frac{1}{2}\Delta\varphi \Big\rangle, \qquad \forall \varphi \in C_c^\infty, \]

\[ S_t^n := \sqrt{n}\,(\mu_t^n - \mu_t), \qquad t \ge 0. \]

To understand the form of potential limits of processes such as $S_t^n$, we

1. considered sequences of scaled centered deterministic signed measures of a similar form, and showed that their limits are often distributions, not (signed) measures;
2. provided a brief introduction to the theory of distributions;

Back to: Fluctuations around the McKean-Vlasov limit

\[ dX_t^{n,i} = b(X_t^{n,i}, \mu_t^n)\,dt + dW_t^i, \qquad \mu_t^n = \frac{1}{n}\sum_{k=1}^n \delta_{X_t^{n,k}}, \]

\[ \frac{d}{dt}\langle \mu_t, \varphi\rangle = \Big\langle \mu_t,\; b(\cdot,\mu_t)\cdot\nabla\varphi + \frac{1}{2}\Delta\varphi \Big\rangle, \qquad \forall \varphi \in C_c^\infty, \]

\[ S_t^n := \sqrt{n}\,(\mu_t^n - \mu_t), \qquad t \ge 0. \]

To understand the form of potential limits of processes such as $S_t^n$, we

3. considered the simplest stochastic case, namely the limit of fluctuations (a CLT, or central limit theorem) for non-interacting diffusions, that is, where $b \equiv 0$, and characterized the limit as a distribution-valued process governed by an "SPDE".
4. We now consider the interacting case, $b \neq 0$, and, in analogy with the non-interacting case, will view $S^n$ as a suitable distribution-valued process.

Some CLT Results for Interacting Particle Systems

References assuming affine dependence of drift on the empirical measure

1. H. Tanaka and M. Hitsuda. Central limit theorem for a simple diffusion model of interacting particles. Hiroshima Mathematical Journal 11 (1981), no. 2, 415–423.
2. A.S. Sznitman. A fluctuation result for nonlinear diffusions. Infinite-dimensional Analysis and Stochastic Processes (1985), 145–160.
3. S. Méléard. Asymptotic behaviour of some interacting particle systems: McKean-Vlasov and Boltzmann models. Probabilistic Models for Nonlinear Partial Differential Equations, Lecture Notes in Math., vol. 1627, Springer, 1996, pp. 42–95.

Some CLT Results for Interacting Particle Systems

References that consider more general dependence on the empirical measure

1. T.G. Kurtz and J. Xiong. A stochastic evolution equation arising from fluctuations of a class of interacting particle systems. Communications in Mathematical Sciences 2 (2004), no. 3, 325–358.
   Comment: Only in the one-dimensional case, where each particle takes values in $\mathbb{R}$.
2. F. Delarue, D. Lacker and K.R. From the master equation to mean field game limit theory: a central limit theorem. Electron. J. Probab. 24 (2019), paper no. 51, 54 pp.
   Comment 1: This covers more general dependence and particles taking values in $\mathbb{R}^d$ for general $d \in \mathbb{N}$.
   Comment 2: The precise space in which the limit process lies ends up depending on the dimension $d$.

General CLT for Interacting Particle Systems

\[ dX_t^{n,i} = b(X_t^{n,i}, \mu_t^n)\,dt + dW_t^i, \qquad \mu_t^n = \frac{1}{n}\sum_{k=1}^n \delta_{X_t^{n,k}}, \]

and, with $\mu$ the McKean-Vlasov limit,

\[ S_t^n := \sqrt{n}\,(\mu_t^n - \mu_t), \qquad t \ge 0. \]

Theorem: Under suitable regularity assumptions on $b$, $S^n$ converges weakly to $S = (S_t(\varphi))_{t,\varphi}$ in $H_d'$, where $H_d'$ is a suitable distribution space with test function space $H_d$, and $S$ solves the SPDE

\[ d\langle S_t, \varphi \rangle = \langle S_t, A_{t,\mu_t}\varphi \rangle\, dt + dW_t(\varphi), \qquad \varphi \in H_d, \]

where $W$ is a continuous centered $H_d'$-valued Gaussian process with covariance functional

\[ \mathbb{E}[W_t(\varphi_1) W_s(\varphi_2)] = \int_0^{s \wedge t} \langle \mu_r, D_x\varphi_1 \cdot D_x\varphi_2 \rangle\, dr, \qquad \varphi_1, \varphi_2 \in H_d, \]

and where $A_{t,\mu_t}$ is some suitable (nonlocal) operator.
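As a consistency check, the theorem can be compared with the non-interacting Brownian example from earlier in these slides. The following is a sketch under the assumptions $b \equiv 0$ and $d = 1$, reading off the operator from that example rather than from the general theorem, whose operator $A_{t,\mu_t}$ is not specified here.

```latex
\begin{align*}
b \equiv 0,\ d = 1: \qquad
  \mu_r &= \pi_r = \pi_0 * \gamma_r, \\
  \mathbb{E}[W_t(\varphi_1) W_s(\varphi_2)]
    &= \int_0^{s \wedge t} \langle \pi_r, \varphi_1' \varphi_2' \rangle\, dr, \\
  d\langle S_t, \varphi \rangle
    &= \tfrac{1}{2} \langle S_t, \varphi'' \rangle\, dt + dW_t(\varphi).
\end{align*}
```

Setting $\varphi_1 = \varphi_2 = \varphi$ and $s = t$ gives the variance $\int_0^t \langle \pi_r, (\varphi')^2 \rangle\, dr$, which agrees with the noise term $(\partial \circ \sqrt{\pi_t})\, db_t$ of the earlier S(P)DE, and the drift $\tfrac12 \langle S_t, \varphi'' \rangle$ agrees with $\tfrac12 \partial^2 S_t$ there. In this special case the operator therefore acts as $\tfrac12 \Delta$; the nonlocal part of $A_{t,\mu_t}$ only appears when $b$ depends on the measure.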

II. From Interacting Diffusions to MFG: Fluctuations

Motivation and Context

Multi-agent or Many-player Systems

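Symmetric n-player Differential Games

- $W^1, \ldots, W^n$ ind. $d$-dim BMs
- Polish action space $A$
- drift functional $b : \mathbb{R}^d \times \mathcal{P}^p(\mathbb{R}^d) \times A \to \mathbb{R}^d$

State Dynamics

\[ dX_t^{n,i} = b\big(X_t^{n,i}, \mu_t^n, \alpha^i(t, \boldsymbol{X}_t)\big)\,dt + dW_t^i, \qquad \mu_t^n = \frac{1}{n}\sum_{k=1}^n \delta_{X_t^{n,k}}, \]

where $\boldsymbol{X}_t = (X_t^{n,1}, \ldots, X_t^{n,n})$ and $\alpha^i : [0,T] \times (\mathbb{R}^d)^n \to A$ is a Markovian control that is chosen to minimize the $i$th objective function

\[ J_i^n(\alpha^1, \ldots, \alpha^n) = \mathbb{E}\left[ \int_0^T f\big(X_t^{n,i}, \mu_t^n, \alpha^i(t, \boldsymbol{X}_t)\big)\,dt + g\big(X_T^{n,i}, m^n_{\boldsymbol{X}_T}\big) \right], \]

for suitable cost functionals $f$ and $g$ (here $m^n_{\boldsymbol{X}_T} = \mu_T^n$ is the empirical measure of $\boldsymbol{X}_T$).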

Nash Equilibria

Definition. A (closed-loop) Nash equilibrium is defined in the usual way as a vector of feedback functions or controls $(\alpha^1, \ldots, \alpha^n)$, where $\alpha^i : [0,T] \times (\mathbb{R}^d)^n \to A$ are such that the solution of the SDE

\[ dX_t^{n,i} = b\big(X_t^{n,i}, \mu_t^n, \alpha^i(t, \boldsymbol{X}_t)\big)\,dt + dW_t^i, \qquad \mu_t^n = \frac{1}{n}\sum_{k=1}^n \delta_{X_t^{n,k}}, \]

is unique in law, and, for each $i = 1, \ldots, n$,

\[ J_i^n(\alpha^1, \ldots, \alpha^n) \le J_i^n(\alpha^1, \ldots, \alpha^{i-1}, \tilde{\alpha}, \alpha^{i+1}, \ldots, \alpha^n) \]

for any alternative choice of feedback control $\tilde{\alpha}$.

Kavita Ramanan, Brown University AMS Short Course JMM,Mean-Field Denver, Colorado Games January 14, 2020 26 / 45 Characterization of n-player Nash equilibria Main Point (Cardialaguet et al ’15) A verification theorem tells us that if we have a unique solution n,i {v }i=1,...,n to a coupled system of n PDEs called the Nash system such that v n,i lies in C1,2 for each i = 1,..., n, then the controls d n n,i  n n,i  (0, T ] × (R ) 3 (t, x) 7 α^ x, mX , Dxi v (t, x) form a closed-loop Nash equilibrium, where mn = 1 n δ → x n i=1 xi The corresponding Nash equilibrium dynamics, given by P n n,i ^ n,i n n,i n i n 1 dXt = b(Xt ,µ t , Dxi v (t, X t ))dt + dWt ,µ t = δX n,k n t Xk=1 defines a collection of interacting diffusions, with

b(x, m, y) = b(x, m, αb(x, m, y)), being the Nash equilibrium drift, where αb takes the explicit form: α(x, m, y) ∈ arg min[b(x, m, a) · y + f (x, m, a)] . b a∈A Kavita Ramanan, Brown University AMS Short Course JMM,Mean-Field Denver, Colorado Games January 14, 2020 26 / 45 But the drift is n-dependent, so this is not of the form we considered earlier. Instead, replace the n-dependent control v n,i by a quantity coming from the master equation.
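To make the minimization that defines $\hat\alpha$ concrete, here is a minimal numerical sketch in dimension one. The model is an illustrative assumption and does not come from the slides: $b(x,m,a) = a$, $f(x,m,a) = a^2/2$ and $A = [-5,5]$. For this choice the minimizer is $\hat\alpha(x,m,y) = -y$ whenever $|y| \le 5$, so that $\hat b(x,m,y) = -y$; the function names are hypothetical.

```python
def alpha_hat(y, a_min=-5.0, a_max=5.0, steps=10001):
    """Grid minimizer of a -> b(x,m,a)*y + f(x,m,a) for the ASSUMED model
    b(x,m,a) = a, f(x,m,a) = a**2 / 2, A = [a_min, a_max]."""
    best_a, best_val = a_min, float("inf")
    for j in range(steps):
        a = a_min + (a_max - a_min) * j / (steps - 1)
        val = a * y + 0.5 * a * a  # the bracket minimized in the definition of alpha_hat
        if val < best_val:
            best_a, best_val = a, val
    return best_a


def b_hat(y):
    """Equilibrium drift b(x, m, alpha_hat(x, m, y)); equals alpha_hat(y) since b = a here."""
    return alpha_hat(y)


if __name__ == "__main__":
    for y in (-2.0, 0.5, 1.2, 7.0):
        print(y, b_hat(y))
```

In this assumed model the control set only matters when $|y| > 5$, in which case the minimizer sits on the boundary of $A$.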

Nash Equilibrium n-player dynamics

Recall that the corresponding Nash equilibrium dynamics has the form

$$dX_t^{n,i} = \hat b\big(X_t^{n,i}, \mu_t^n, D_{x_i} v^{n,i}(t, X_t^n)\big)\,dt + dW_t^i, \qquad \mu_t^n = \frac{1}{n}\sum_{k=1}^n \delta_{X_t^{n,k}}.$$

In other words, it is a system of weakly interacting diffusions:

$$dX_t^i = \tilde b_n(t, X_t^i, \mu_t^n)\,dt + \sigma\,dB_t^i, \qquad i = 1,\dots,n,$$

where

$$\tilde b_n(t,x,m) = \hat b\big(x, m, D_{x_i} v^{n,i}(t,x)\big).$$

But the drift is n-dependent, so this is not of the form we considered earlier. Instead, replace the n-dependent control $v^{n,i}$ by a quantity coming from the master equation.

An Approximating System

Recall: Interacting diffusions describing Nash equilibrium dynamics:

$$dX_t^{n,i} = \hat b\big(X_t^{n,i}, \mu_t^n, D_{x_i} v^{n,i}(t, X_t^n)\big)\,dt + dW_t^i, \qquad \mu_t^n = \frac{1}{n}\sum_{k=1}^n \delta_{X_t^{n,k}}.$$

Instead: Consider the modified system coming from the limit system. "Replace" $v^{n,i}$ by $u^{n,i}$, where

$$u^{n,i}(t, x_1,\dots,x_n) = U(t, x_i, m^n_x), \qquad m^n_x = \frac{1}{n}\sum_{k=1}^n \delta_{x_k}.$$

The dependence of $u^{n,i}$ on $n$ is only through the empirical measure.

That is, consider the sequence of IPS:

$$d\tilde X_t^{n,i} = \tilde b\big(t, \tilde X_t^{n,i}, m^n_{\tilde X_t}\big)\,dt + dW_t^i, \qquad i = 1,\dots,n,$$

where

$$\tilde b(t,x,m) = \hat b\big(x, m, D_x U(t,x,m)\big).$$

Overall Philosophy: Transferring LLN/CLT Results

Interacting diffusions describing Nash equilibrium dynamics:

$$dX_t^{n,i} = \hat b\big(X_t^{n,i}, \mu_t^n, D_{x_i} v^{n,i}(t, X_t^n)\big)\,dt + dW_t^i, \qquad \mu_t^n = \frac{1}{n}\sum_{k=1}^n \delta_{X_t^{n,k}}.$$

Approximating diffusions in the form of an IPS:

$$d\tilde X_t^{n,i} = \tilde b\big(t, \tilde X_t^{n,i}, \tilde\mu_t^n\big)\,dt + dW_t^i, \qquad \tilde\mu_t^n = \frac{1}{n}\sum_{k=1}^n \delta_{\tilde X_t^{n,k}},$$

$$\tilde b(t,x,m) = \hat b\big(x, m, D_x U(t,x,m)\big).$$

1. Analyze master equation + Nash PDE to prove $\mathbb{E}\big[W_{2,C^d}(\mu^n, \tilde\mu^n)\big] = O(n^{-2})$.
2. Invoke IPS results to deduce LLN/CLT for $\{\tilde\mu^n\}$.
3. Use the estimate in 1. to deduce LLN/CLT for $\{\mu^n\}$.

III. Large Deviations for Interacting Particle Systems and MFG

Large Deviations Theory

Large deviations (LD) is an asymptotic theory that characterizes the asymptotic probability of rare events.

When a sequence of probabilities of events decays to 0, large deviations characterizes the asymptotic exponential decay rate: suppose $\nu^n$, $n \in \mathbb{N}$, take values in (some topological space) $E$ and satisfy, for all "nice" $A \subset E$,

$$P(\nu^n \in A) \sim e^{-s_n I(A)},$$

where $I : E \to [0,\infty]$ is lower semicontinuous with compact level sets, and $I(A) := \inf_{s \in A} I(s)$.

One says in this case that $\{\nu^n\}$ satisfies a large deviation principle (LDP) on $E$ with speed $\{s_n\}$ and good rate function (GRF) $I$.

Thus the rate of decay of the probabilities is expressed in terms of a variational problem. Often $I(a)$ itself is also expressed in terms of a variational problem.

A Rigorous Statement of the Large Deviation Principle

• $E$ - topological space
• $\{\nu^n\}$ - sequence of $E$-valued random elements

Definition (Large Deviation Principle (LDP))

$\{\nu^n\}$ is said to satisfy a large deviation principle (LDP) with speed $\{s_n\}$ and rate function $I : E \to [0,\infty]$ if for all measurable $A$,

$$-\inf_{w \in A^\circ} I(w) \;\le\; \liminf_{n\to\infty} \frac{1}{s_n}\log P(\nu^n \in A) \;\le\; \limsup_{n\to\infty} \frac{1}{s_n}\log P(\nu^n \in A) \;\le\; -\inf_{w \in \bar A} I(w),$$

where $I$ is lower semicontinuous and has compact level sets.

In short, the LDP says that for all "nice" sets $A \subset E$,

$$P(\nu^n \in A) \approx e^{-s_n \inf_{w \in A} I(w)}.$$
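As a sanity check of the definition, consider an example that is not in the slides: let $\nu^n$ be the mean of $n$ iid standard normal variables, for which Cramér's theorem gives an LDP with speed $s_n = n$ and rate function $I(w) = w^2/2$. For $A = [a,\infty)$ with $a > 0$ the probability is available in closed form through the complementary error function, so $\frac{1}{n}\log P(\nu^n \in A)$ can be compared directly with $-\inf_{w\in A} I(w) = -a^2/2$. The function names below are hypothetical.

```python
import math


def log_prob_mean_exceeds(n, a):
    """log P(mean of n iid N(0,1) >= a); the mean has law N(0, 1/n)."""
    return math.log(0.5 * math.erfc(a * math.sqrt(n / 2.0)))


def rate(a):
    """inf of I(w) = w**2 / 2 over [a, infinity), valid for a >= 0."""
    return 0.5 * a * a


if __name__ == "__main__":
    a = 1.0
    for n in (100, 300, 900):
        # normalized log-probability versus the LDP prediction -I(A)
        print(n, log_prob_mean_exceeds(n, a) / n, -rate(a))
```

The gap between the two printed columns shrinks as $n$ grows, which is all the LDP asserts: it captures the exponential rate and ignores subexponential prefactors.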

Large Deviations Theory

The Contraction Principle

Theorem. Let $\mathcal{Y}$ and $\mathcal{Y}'$ be topological spaces and let $F : \mathcal{Y} \to \mathcal{Y}'$ be a continuous mapping. Suppose a sequence $\{Y_n\}$ of $\mathcal{Y}$-valued random variables satisfies an LDP with rate function $I : \mathcal{Y} \to [0,\infty]$. Then the sequence $\{Y'_n := F(Y_n)\}$ satisfies an LDP with rate function $J : \mathcal{Y}' \to [0,\infty]$, given by

$$J(y') = \inf\{I(y) : y \in \mathcal{Y},\ F(y) = y'\}.$$

Exercise 3: Prove the contraction principle.
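The formula for $J$ can be evaluated by brute force once the space is discretized. The sketch below uses an assumed example that is not from the slides: $I(x,y) = (x^2+y^2)/2$ on $\mathbb{R}^2$ and the continuous map $F(x,y) = x+y$, for which the infimum can be computed by hand as $J(z) = z^2/4$. Grid points are stored as integer indices so that the constraint $F(y) = y'$ is matched exactly; the grid size and all names are hypothetical.

```python
def contracted_rate(I, F, points):
    """J(y') = inf{ I(y) : F(y) = y' }, with the infimum taken over a finite set of points."""
    J = {}
    for y in points:
        key, val = F(y), I(y)
        if key not in J or val < J[key]:
            J[key] = val
    return J


N = 200   # grid indices run from -N to N (assumed truncation)
H = 100   # grid points per unit length (assumed resolution)
points = [(i, j) for i in range(-N, N + 1) for j in range(-N, N + 1)]


def I(p):
    """Assumed rate function I(x, y) = (x**2 + y**2) / 2 at the grid point p."""
    return 0.5 * ((p[0] / H) ** 2 + (p[1] / H) ** 2)


def F(p):
    """Assumed continuous map F(x, y) = x + y, expressed in grid indices."""
    return p[0] + p[1]


J = contracted_rate(I, F, points)

if __name__ == "__main__":
    for k in (0, 50, 100, 200):
        z = k / H
        print(z, J[k], z * z / 4)
```

For even index sums the grid contains the exact minimizer $x = y = z/2$, so the computed value agrees with $z^2/4$ up to rounding; for odd sums it is slightly larger because the minimizer falls between grid points.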

Theory of Large Deviations (Sanov's Theorem)

Suppose $Y_i$, $i = 1, 2, \dots$, are iid on some Polish space $\mathcal{Y}$ with common distribution $\theta \in \mathcal{P}(\mathcal{Y})$, and define

$$\nu_n := \frac{1}{n}\sum_{i=1}^n \delta_{Y_i}.$$

Also, define relative entropy: given $\nu, \theta \in \mathcal{P}(\mathcal{Y})$,

$$H(\nu|\theta) := \int_{\mathcal{Y}} \ln\Big(\frac{d\nu}{d\theta}(x)\Big)\,\nu(dx)$$

if $\nu \ll \theta$ and $H(\nu|\theta) = \infty$ otherwise.

Sanov's Theorem

Then $\{\nu_n\}$ satisfies an LDP in $\mathcal{P}^1(\mathcal{Y})$ with good rate function $H(\cdot|\theta)$.

Exercise 4: Prove Sanov's theorem when $Y_i$ take values in a finite state space.
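In the finite-state setting of Exercise 4 the statement can be checked numerically. The following sketch uses an assumed example that is not from the slides: the $Y_i$ are fair coin flips, $\theta = (1/2, 1/2)$, and $A$ is the set of laws that put mass at least $0.7$ on the outcome 1. The infimum of $H(\cdot|\theta)$ over $A$ is attained at $\nu^* = (0.3, 0.7)$, and $P(\nu_n \in A)$ is an exact binomial tail, evaluated here in log scale to avoid underflow. All function names are hypothetical.

```python
import math


def relative_entropy(nu, theta):
    """H(nu|theta) for probability vectors on a finite state space."""
    h = 0.0
    for p, q in zip(nu, theta):
        if p > 0.0:
            if q == 0.0:
                return math.inf  # nu is not absolutely continuous w.r.t. theta
            h += p * math.log(p / q)
    return h


def log_binom_pmf(n, k, q):
    """log P(S = k) for S ~ Binomial(n, q) with 0 < q < 1."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(q) + (n - k) * math.log(1.0 - q))


def log_tail_prob(n, k0, q):
    """log P(S >= k0) for S ~ Binomial(n, q), computed with a log-sum-exp."""
    logs = [log_binom_pmf(n, k, q) for k in range(k0, n + 1)]
    m = max(logs)
    return m + math.log(sum(math.exp(v - m) for v in logs))


if __name__ == "__main__":
    target = relative_entropy([0.3, 0.7], [0.5, 0.5])
    for n in (200, 2000):
        k0 = (7 * n + 9) // 10  # smallest integer k with k / n >= 0.7
        print(n, -log_tail_prob(n, k0, 0.5) / n, target)
```

The empirical decay rate approaches the relative entropy from above, with a correction of order $(\log n)/n$ coming from the polynomial prefactor.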

Large Deviations in the non-interacting Case (b = 0)

$$X_t^{n,i} = X_0^{n,i} + W_t^i, \qquad Q^n = \frac{1}{n}\sum_{k=1}^n \delta_{(X_0^{n,k},\, W^k)},$$

$\{X_0^i\}_{i\in\mathbb{N}}$ iid with common distribution $\mu_0$; $\{W^i\}_{i\in\mathbb{N}}$ iid Brownian motions.

Theorem: As an immediate consequence of Sanov's theorem, $\{Q^n\}$ satisfies an LDP on $\mathcal{P}^1(\mathbb{R}^d \times C_0^d)$ with rate function

$$R(\nu|\mu_0 \times \mathcal{W}),$$

where $R$ denotes relative entropy (written $H$ in the statement of Sanov's theorem), $\mu_0$ is the initial distribution of $X_0^i$, and $\mathcal{W}$ is the $d$-dimensional Wiener measure on $C_0^d$.

References (LDPs for Interacting Particle Systems)

$$dX_t^{n,i} = b(t, X_t^{n,i}, \mu_t^n)\,dt + dW_t^i, \qquad \mu_t^n = \frac{1}{n}\sum_{k=1}^n \delta_{X_t^{n,k}},$$

$\{X_0^i\}_{i\in\mathbb{N}}$ iid with common distribution $\mu_0$; $\{W^i\}_{i\in\mathbb{N}}$ iid Brownian motions.

• D. Dawson and J. Gärtner, "Large deviations from the McKean-Vlasov limit for weakly interacting diffusions", Stochastics: An International Journal of Probability and Stochastic Processes 20 (1987), 247-308.
• A. Budhiraja, P. Dupuis, and M. Fischer, "Large deviation properties of weakly interacting processes via weak convergence methods", Annals of Probability (2012), 74-102.
• M. Fischer, "On the form of the large deviation rate function for the empirical measures of weakly interacting systems", Bernoulli 20 (2014), 1765-1801.

References (LDPs for Interacting Particle Systems)

$$dX_t^{n,i} = b(t, X_t^{n,i}, \mu_t^n)\,dt + dW_t^i, \qquad \mu_t^n = \frac{1}{n}\sum_{k=1}^n \delta_{X_t^{n,k}},$$

$\{X_0^i\}_{i\in\mathbb{N}}$ iid with common distribution $\mu_0$; $\{W^i\}_{i\in\mathbb{N}}$ iid Brownian motions.

For the ultimate application to MFG, we need an extension beyond those references that includes random initial conditions, time-dependent drift and a weaker continuity condition on the drift $b$, namely, continuity as a map from $[0,T] \times \mathbb{R}^d \times \mathcal{P}^1(\mathbb{R}^d)$ to $\mathbb{R}^d$; in particular, $b$ need not be continuous in the third variable with respect to the weak topology.

The extension also allows for common noise, which we do not include here.

Large Deviations in the Interacting Case

$$dX_t^{n,i} = b(t, X_t^{n,i}, \mu_t^n)\,dt + dW_t^i, \qquad \mu_t^n = \frac{1}{n}\sum_{k=1}^n \delta_{X_t^{n,k}},$$

$(X_0^{n,i})_{i\in\mathbb{N}}$ iid with common distribution $\mu_0$.

Idea: To use the contraction principle. Need to express the law of $X^n$ as a continuous functional of the law $\{Q^n\}$ of the non-interacting particle system and a Brownian motion.

Key Result: Girsanov's Theorem

Relates the law $\mu$ of the solution $X$ to the SDE

$$dX_t = dW_t$$

with the law $\mu^b$ of the solution $X^b$ to the SDE with an adapted (suitably regular) drift $(r_t)_{t\ge 0}$:

$$dX_t^b = r_t\,dt + dW_t.$$

Large Deviations in the Interacting Case

Exercise 5. Prove the LDP for $\{\mu^n\}$ via the following steps:

1. Canonical setup: Define the mappings

$$e : (y, f) \in \mathbb{R}^d \times C_0^d \mapsto y,$$
$$w_t : (y, f) \in \mathbb{R}^d \times C_0^d \mapsto f_t, \qquad t \ge 0.$$

2. For $Q \in \mathcal{P}^1(\mathbb{R}^d \times C_0^d)$, define the McKean-Vlasov equation map:

$$x_t = e + \int_0^t b(s, x_s, Q \circ x_s^{-1})\,ds + w_t, \qquad (1)$$

where $e$ and $w_t$ denote the canonical maps on $\mathbb{R}^d \times C_0^d$, as defined above. Here,

• $Q$ represents the joint law of the initial condition $e$ and driving noise $w$
• $Q \circ x_s^{-1} \in \mathcal{P}^1(\mathbb{R}^d)$ represents the marginal law at time $s$ of the solution $x$ to equation (1), under $Q$.

Large Deviations in the Interacting Case

Recall: $X_0^{n,i}$ iid, and

$$dX_t^{n,i} = b(t, X_t^{n,i}, \mu_t^n)\,dt + dW_t^i, \qquad \mu_t^n = \frac{1}{n}\sum_{k=1}^n \delta_{X_t^{n,k}}, \qquad Q^n = \frac{1}{n}\sum_{k=1}^n \delta_{(X_0^{n,k},\, W^k)},$$

and for $Q \in \mathcal{P}^1(\mathbb{R}^d \times C_0^d)$, the McKean-Vlasov equation map is:

$$x_t = e + \int_0^t b(s, x_s, Q \circ x_s^{-1})\,ds + w_t,$$

where $e$ and $w_t$ denote canonical variables on $\mathbb{R}^d \times C_0^d$.

3. Let $\Phi : \mathcal{P}^1(\mathbb{R}^d \times C_0^d) \to C([0,T] : \mathcal{P}^1(\mathbb{R}^d))$ be the mapping that takes $Q$ to the flow of marginal measures $(Q \circ x_t^{-1})_{t \in [0,T]}$, and observe

$$\mu^n = \Phi(Q^n).$$

4. Prove $\Phi : \mathcal{P}^1(\mathbb{R}^d \times C_0^d) \to C([0,T] : \mathcal{P}^1(\mathbb{R}^d))$ is uniformly continuous.
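Step 3 can be illustrated numerically. When $Q$ is the empirical measure $Q^n$ of $n$ pairs (initial condition, noise path), the McKean-Vlasov equation under $Q^n$ is the $n$-particle system itself, so $\Phi(Q^n)$ is the flow of empirical measures of the particles. The sketch below discretizes this with an Euler scheme in dimension $d = 1$. The drift $b(t,x,m) = \int \tanh(y - x)\,m(dy)$, the step size, and all function names are illustrative assumptions and are not taken from the slides; this particular drift is bounded and Lipschitz in both the state and the measure argument.

```python
import math
import random


def drift(x, atoms):
    """ASSUMED drift b(t, x, m) = integral of tanh(y - x) m(dy), with m uniform on `atoms`."""
    return sum(math.tanh(y - x) for y in atoms) / len(atoms)


def phi_euler(initials, increments, dt):
    """Euler approximation of Phi(Q^n), where Q^n is the empirical measure of the
    pairs (initials[k], noise path k) and increments[k][j] is the j-th increment of path k.

    Returns a list over time steps; entry j holds the atoms of the empirical
    measure of the particles at time j * dt.
    """
    xs = list(initials)
    flow = [list(xs)]
    for j in range(len(increments[0])):
        # every particle sees the empirical measure of the current positions xs
        xs = [x + drift(x, xs) * dt + increments[k][j] for k, x in enumerate(xs)]
        flow.append(list(xs))
    return flow


def sample_increments(n, steps, dt, rng):
    """n independent discretized Brownian paths, stored as lists of N(0, dt) increments."""
    return [[rng.gauss(0.0, math.sqrt(dt)) for _ in range(steps)] for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    n, steps, dt = 50, 100, 0.01
    x0 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    flow = phi_euler(x0, sample_increments(n, steps, dt, rng), dt)
    print(sum(flow[-1]) / n)  # mean of the empirical measure at the final time
```

For this discretization a perturbation of size $\varepsilon$ in one initial condition moves every particle by at most $\varepsilon(1 + 2\,dt)^{\text{steps}} \le \varepsilon e^{2T}$, a discrete analogue of the continuity estimate asked for in step 4.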

Large Deviations in the Interacting Case: Summary

Given the IPS

$$dX_t^{n,i} = b(t, X_t^{n,i}, \mu_t^n)\,dt + dW_t^i, \qquad \mu_t^n = \frac{1}{n}\sum_{k=1}^n \delta_{X_t^{n,k}}, \qquad Q^n = \frac{1}{n}\sum_{k=1}^n \delta_{(X_0^{n,k},\, W^k)}.$$

We have shown

• $\{Q^n\}$ satisfies an LDP with rate function $R(Q|\mu_0 \times \mathcal{W})$
• $\mu^n = \Phi(Q^n)$
• $\Phi : \mathcal{P}^1(\mathbb{R}^d \times C_0^d) \to C([0,T] : \mathcal{P}^1(\mathbb{R}^d))$ is uniformly continuous.

5. Apply the contraction principle to conclude that if $b$ is bounded and continuous and Lipschitz continuous in the second and third arguments (uniformly in time), then $\{\mu^n\}$ satisfies an LDP with rate function

$$J(\nu) = \inf\{R(Q|\mu_0 \times \mathcal{W}) : \Phi(Q) = \nu\}.$$

This concludes Exercise 5 and the proof of the LDP for IPS.

Large Deviations for Mean-Field Games

Recall the form of interacting diffusions describing Nash equilibrium dynamics:

$$dX_t^{n,i} = \hat b\big(X_t^{n,i}, \mu_t^n, D_{x_i} v^{n,i}(t, X_t^n)\big)\,dt + dW_t^i, \qquad \mu_t^n = \frac{1}{n}\sum_{k=1}^n \delta_{X_t^{n,k}}.$$

Aim: To prove an LDP for the sequence $(\mu^n)_{n\in\mathbb{N}}$ of empirical measures of the Nash equilibrium state processes.

Principle: Transfer LDP results from IPS to MFG

Interacting diffusions describing Nash equilibrium dynamics:

$$dX_t^{n,i} = \hat b\big(X_t^{n,i}, \mu_t^n, D_{x_i} v^{n,i}(t, X_t^n)\big)\,dt + dW_t^i, \qquad \mu_t^n = \frac{1}{n}\sum_{k=1}^n \delta_{X_t^{n,k}}.$$

In view of the previous results on LDPs for IPS, recall the approximating diffusions we considered earlier that were in the form of an IPS:

$$d\tilde X_t^{n,i} = \tilde b\big(t, \tilde X_t^{n,i}, \tilde\mu_t^n\big)\,dt + dW_t^i, \qquad \tilde\mu_t^n = \frac{1}{n}\sum_{k=1}^n \delta_{\tilde X_t^{n,k}},$$

$$\tilde b(t,x,m) = \hat b\big(x, m, D_x U(t,x,m)\big).$$

1. Originally, had only $\mathbb{E}\big[W_{2,C^d}(\mu^n, \tilde\mu^n)\big] = O(n^{-2})$.
2. Invoke IPS results to get LDP for $\{\tilde\mu^n\}$.
3. Use a sharper exponential estimate in 1. to deduce LDP for $\{\mu^n\}$.

Additional References on LDP

• D. Lacker and K. Ramanan, "Rare Nash equilibria and the price of anarchy in large static games", Mathematics of Operations Research 44 (2019), no. 2, 400-422.
• P. Cardaliaguet, F. Delarue, J.-M. Lasry and P.-L. Lions, "The master equation and the convergence problem in mean-field games" (2019).
• F. Delarue, D. Lacker and K. Ramanan, "From the master equation to mean field game limit theory: Large deviations and concentration of measure" (2018), to appear in Annals of Probability.

IV. Open Problems

• Study refined convergence theorems for open-loop Nash equilibria.
• Investigate whether there are cases in which the MFG LDP exists but differs from the interacting particle system LDP obtained from the master equation.
• Establish large deviation principles for stochastic differential games in the presence of non-uniqueness (as has been done in the static case).
• Use LDPs to find interesting conditional limit laws in both the static and stochastic differential settings.
• ...
