Lecture 9: Percolation and stochastic topology NSF/CBMS Conference

Sayan Mukherjee

Departments of Statistical Science, Computer Science, Duke University www.stat.duke.edu/ sayan ⇠ May 31, 2016 Stochastic topology

“I predict a new subject of statistical topology. Rather than count the number of holes, Betti numbers, etc., one will be more interested in the distribution of such objects on noncompact as one goes out to infinity,” Isadore Singer. Random graphs Erd˝os-R´enyi G G(n, p) is a draw of a graph G from the graph distribution. ⇠ Consider p(n)andn and ask questions about thresholds of !1 graph properties.

Erd˝os-R´enyi random graph model

G(n, p) is the probability space of graphs on vertex set [n]= 1,...n with the probability of each edge p independently. { } Consider p(n)andn and ask questions about thresholds of !1 graph properties.

Erd˝os-R´enyi random graph model

G(n, p) is the probability space of graphs on vertex set [n]= 1,...n with the probability of each edge p independently. { } G G(n, p) is a draw of a graph G from the graph distribution. ⇠ Erd˝os-R´enyi random graph model

G(n, p) is the probability space of graphs on vertex set [n]= 1,...n with the probability of each edge p independently. { } G G(n, p) is a draw of a graph G from the graph distribution. ⇠ Consider p(n)andn and ask questions about thresholds of !1 graph properties. Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Theorem (Erd˝os-R´enyi) log n+c Let c IR be fixed and G G(n, p).Ifp = n , then 0(G) is 2 ⇠ c asymptotically Poisson distributed with mean e and

e c lim IP [G is connected ]=e . n !1

Erd˝os-R´enyi theorem

Theorem (Erd˝os-R´enyi) Let ✏ > 0 be fixed and G G(n, p).Then ⇠ 1:p (1 + ✏) log n/n IP [G is connected ] ! 0:p (1 + ✏) log n/n. ⇢  Erd˝os-R´enyi theorem

Theorem (Erd˝os-R´enyi) Let ✏ > 0 be fixed and G G(n, p).Then ⇠ 1:p (1 + ✏) log n/n IP [G is connected ] ! 0:p (1 + ✏) log n/n. ⇢ 

Theorem (Erd˝os-R´enyi) log n+c Let c IR be fixed and G G(n, p).Ifp = n , then 0(G) is 2 ⇠ c asymptotically Poisson distributed with mean e and

e c lim IP [G is connected ]=e . n !1 Random simplicial complexes Y Y (n, p) is a draw of a random 2-simplicial complex Y from ⇠ 2 the distribution Y2(n, p)

Lineal Meshulam 2-face model

Y2(n, p) is the probability space of 2-dimensional simplicial [n] complexes with vertex set [n] and edge set 2 and every 2-dimensional face is included with probability p,independently. Lineal Meshulam 2-face model

Y2(n, p) is the probability space of 2-dimensional simplicial [n] complexes with vertex set [n] and edge set 2 and every 2-dimensional face is included with probability p,independently. Y Y (n, p) is a draw of a random 2-simplicial complex Y from ⇠ 2 the distribution Y2(n, p) Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Matthew Kahle (Ohio State University) Random topology and geometry AMS Short Course Y Y (n, p) is a draw of a random d-simplicial complex Y from ⇠ d the distribution Yd (n, p)

Lineal Meshulam model

Yd (n, p) is the probability space of d-dimensional simplicial complexes over all simplicial complexes on n vertices with complete d 1 skeleton and every d-dimensional face is included with probability p,independently. Lineal Meshulam model

Yd (n, p) is the probability space of d-dimensional simplicial complexes over all simplicial complexes on n vertices with complete d 1 skeleton and every d-dimensional face is included with probability p,independently.

Y Y (n, p) is a draw of a random d-simplicial complex Y from ⇠ d the distribution Yd (n, p) Theorem (Meshulam, Wallach) Let d 2, ` 2,and✏ > 0 be fixed and Y Y (n, p).Then ⇠ d 1:p (d + ✏) log n/n IP [Hd 1(Y , Z/`) = 0] ! 0:p (d ✏) log n/n. ⇢ 

Percolation on complexes

Theorem (Linial,Meshulam) Let ✏ > 0 be fixed and Y Y (n, p).Then ⇠ 2 1:p (2 + ✏) log n/n IP [H1(Y , Z/2) = 0] ! 0:p (2 ✏) log n/n. ⇢  Percolation on complexes

Theorem (Linial,Meshulam) Let ✏ > 0 be fixed and Y Y (n, p).Then ⇠ 2 1:p (2 + ✏) log n/n IP [H1(Y , Z/2) = 0] ! 0:p (2 ✏) log n/n. ⇢ 

Theorem (Meshulam, Wallach) Let d 2, ` 2,and✏ > 0 be fixed and Y Y (n, p).Then ⇠ d 1:p (d + ✏) log n/n IP [Hd 1(Y , Z/`) = 0] ! 0:p (d ✏) log n/n. ⇢  Simple connectivity

Theorem (Babson, Ho↵man, Kahle) Let ✏ > 0 be fixed and Y Y (n, p).Then ⇠ 2 n✏ 1:p pn IP [⇡1(Y ) = 0] ✏ ! 0:p n (  pn Theorem (Wagner)

There exist constants c1, c2 > 0 such that for Y Yd (n, p) ⇠ 2d I if p < c1/n then with high probability Y is embeddable IR ,

I if p > c2/n then with high probability Y is not embeddable IR 2d .

Embedding

Every d-dimensional simplicial complex is embeddable in IR 2d+1 but not necessarily in IR 2d . Embedding

Every d-dimensional simplicial complex is embeddable in IR 2d+1 but not necessarily in IR 2d .

Theorem (Wagner)

There exist constants c1, c2 > 0 such that for Y Yd (n, p) ⇠ 2d I if p < c1/n then with high probability Y is embeddable IR ,

I if p > c2/n then with high probability Y is not embeddable IR 2d . X X (n, p) is a draw of a random d- complex X from the ⇠ d distribution Xd (n, p).

Random clique complex

Xd (n, p) is the probability space of d-dimensional random clique complexes. Given a random graph G G(n, p), the clique ⇠ complex X is generated by including as a faces of the clique complex X (G) complete subgraphs of G. Random clique complex

Xd (n, p) is the probability space of d-dimensional random clique complexes. Given a random graph G G(n, p), the clique ⇠ complex X is generated by including as a faces of the clique complex X (G) complete subgraphs of G.

X X (n, p) is a draw of a random d-clique complex X from the ⇠ d distribution Xd (n, p). Random clique complex Homology vanishing results

Theorem (Kahle) Fix k 1 and X X (n, p) and !(1) a function that tends to ⇠ d infinity arbitrarily slowly. Then with high probability Hk (X , IR )=0 if 1/(k+1) (k/2 + 1) log n + k log log n + !(1) p 2 n ! and H (X , IR ) =0if k 6 1/(k+1) (k/2 + 1) log n + k log log n !(1) p 1/nk , 2 . 2 " n ! # Random flag complex homology

# edges

Kahle and Nanda: E[(X)] in blue for X(25,p)with 1 in green, 2 in red, 3 in cyan, and 5 in purple. Percolation and manifolds Stochastic topology

Robert Adler The empirical object is

( , r)= B (p). U P r(n) p [2P

Problem statement

iid Given points = X , ..., X ⇢ where the support of ⇢ is a P { i n} ⇠ . M Problem statement

iid Given points = X , ..., X ⇢ where the support of ⇢ is a P { i n} ⇠ manifold . M The empirical object is

( , r)= B (p). U P r(n) p [2P Topology of noise

Random point cloud

Objects of study:

1. Union of Balls - bicycle 2. Distance Function - cat

Homology (Betti numbers) Critical points Critical points

increasing

decreasing Critical points

The Distance Function:

For a finite set

Example:

Goal: Study critical points of , for a random Critical points

0.5 Index 0 (minimum) 0.45

Generated by 1 point 0.4

0.35 Index 1 (saddle) 0.3 Generated by 2 points 0.25 0.2 Index 2 (maximum) 0.15 0.1 increasing Generated by 3 points 0.05 decreasing

Index k critical points are “generated” by subsets of k+1 points Distributions on compact manifolds

Union of Balls - bicycle Distance Function - cat

Morse Theory Nerve lemma

Theorem (Borsuk, 1948)

The Cechˇ complex Cˇ ( , r) is homotopy equivalent to p Br (p). P 2P In particular, S B (p) = (Cˇ ( , r)). k r k P 0p 1 B[2P C B C @B AC Problem statement

As n and as r 0 where are limiting distributions of !1 ! (1) Betti numbers of ( , r). U P (2) The number of critical points of

d (x):=min x p , x IR d. P p k k 2 2P I Gaussian fields on manifolds —-Adler and Taylor, 2003. I Random triangulated surfaces — Pippenger and Schleich, 2006. I Random 3-manifolds — Dunfield and Thurston, 2006. I Random planar linkages— Farber and Kappeller, 2007. I Random simplicial complexes — Linial and Meshulam, 2006 I Recovery of homology — Niyogi, Smale, and Weinberger, 2008 I Random geometric graphs — Penrose, 2003 I Thresholds for cohomology — Kahle, 2014.

Earlier work

I Early mention by Milnor—1964 I Random triangulated surfaces — Pippenger and Schleich, 2006. I Random 3-manifolds — Dunfield and Thurston, 2006. I Random planar linkages— Farber and Kappeller, 2007. I Random simplicial complexes — Linial and Meshulam, 2006 I Recovery of homology — Niyogi, Smale, and Weinberger, 2008 I Random geometric graphs — Penrose, 2003 I Thresholds for cohomology — Kahle, 2014.

Earlier work

I Early mention by Milnor—1964 I Gaussian fields on manifolds —-Adler and Taylor, 2003. I Random 3-manifolds — Dunfield and Thurston, 2006. I Random planar linkages— Farber and Kappeller, 2007. I Random simplicial complexes — Linial and Meshulam, 2006 I Recovery of homology — Niyogi, Smale, and Weinberger, 2008 I Random geometric graphs — Penrose, 2003 I Thresholds for cohomology — Kahle, 2014.

Earlier work

I Early mention by Milnor—1964 I Gaussian fields on manifolds —-Adler and Taylor, 2003. I Random triangulated surfaces — Pippenger and Schleich, 2006. I Random planar linkages— Farber and Kappeller, 2007. I Random simplicial complexes — Linial and Meshulam, 2006 I Recovery of homology — Niyogi, Smale, and Weinberger, 2008 I Random geometric graphs — Penrose, 2003 I Thresholds for cohomology — Kahle, 2014.

Earlier work

I Early mention by Milnor—1964 I Gaussian fields on manifolds —-Adler and Taylor, 2003. I Random triangulated surfaces — Pippenger and Schleich, 2006. I Random 3-manifolds — Dunfield and Thurston, 2006. I Random simplicial complexes — Linial and Meshulam, 2006 I Recovery of homology — Niyogi, Smale, and Weinberger, 2008 I Random geometric graphs — Penrose, 2003 I Thresholds for cohomology — Kahle, 2014.

Earlier work

I Early mention by Milnor—1964 I Gaussian fields on manifolds —-Adler and Taylor, 2003. I Random triangulated surfaces — Pippenger and Schleich, 2006. I Random 3-manifolds — Dunfield and Thurston, 2006. I Random planar linkages— Farber and Kappeller, 2007. I Recovery of homology — Niyogi, Smale, and Weinberger, 2008 I Random geometric graphs — Penrose, 2003 I Thresholds for cohomology — Kahle, 2014.

Earlier work

I Early mention by Milnor—1964 I Gaussian fields on manifolds —-Adler and Taylor, 2003. I Random triangulated surfaces — Pippenger and Schleich, 2006. I Random 3-manifolds — Dunfield and Thurston, 2006. I Random planar linkages— Farber and Kappeller, 2007. I Random simplicial complexes — Linial and Meshulam, 2006 I Random geometric graphs — Penrose, 2003 I Thresholds for cohomology — Kahle, 2014.

Earlier work

I Early mention by Milnor—1964 I Gaussian fields on manifolds —-Adler and Taylor, 2003. I Random triangulated surfaces — Pippenger and Schleich, 2006. I Random 3-manifolds — Dunfield and Thurston, 2006. I Random planar linkages— Farber and Kappeller, 2007. I Random simplicial complexes — Linial and Meshulam, 2006 I Recovery of homology — Niyogi, Smale, and Weinberger, 2008 I Thresholds for cohomology — Kahle, 2014.

Earlier work

I Early mention by Milnor—1964 I Gaussian fields on manifolds —-Adler and Taylor, 2003. I Random triangulated surfaces — Pippenger and Schleich, 2006. I Random 3-manifolds — Dunfield and Thurston, 2006. I Random planar linkages— Farber and Kappeller, 2007. I Random simplicial complexes — Linial and Meshulam, 2006 I Recovery of homology — Niyogi, Smale, and Weinberger, 2008 I Random geometric graphs — Penrose, 2003 Earlier work

I Early mention by Milnor—1964 I Gaussian fields on manifolds —-Adler and Taylor, 2003. I Random triangulated surfaces — Pippenger and Schleich, 2006. I Random 3-manifolds — Dunfield and Thurston, 2006. I Random planar linkages— Farber and Kappeller, 2007. I Random simplicial complexes — Linial and Meshulam, 2006 I Recovery of homology — Niyogi, Smale, and Weinberger, 2008 I Random geometric graphs — Penrose, 2003 I Thresholds for cohomology — Kahle, Meckes 2010, Kahle, 2014. Three regimes

Subcritical Critical Supercritical

nrd E[number of points in a ball of radius r] / Subcritical regime

Subcritical Subcritical regime

Theorem: Bobrowski, M

Expected number of k-cycles:

- subsets with k+2 vertices all points are within a ball of radius r Subcritical regime

Theorem: Bobrowski, M

Similar results for k,n [Kahle-Meckes]. Subcritical regime

Exact limit values are known, as well as limit distributions

Phase transitions:

Critical radius: Subcritical regime Critical regime

Critical Critical regime

Theorem

Theorem Critical regime

Theorem: Bobrowski, M

Theorem: Bobrowski, M If nrd , and 1 k d 1 !   E[Nk,n] limn = k(), • !1 n

Var[Nk,n] 2 limn = k(), • !1 n

Nk,n E[Nk,n] D 2 limn (0,k()), • !1 pn !N Critical regime

By Morse theory

d 1 d (r)= ( 1)k (r)= ( 1)kN (r), n k,n k,n kX=0 kX=0 implies d 1 k n E[n(r)] 1+ ( 1) k(). ! kX=1 Supercritical regime

Supercritical Supercritical regime

Theorem: Bobrowski, M

Assume fmin := infx f(x) > 0 and nrd then for 1 2Mk d !1   E[Nk,n] limn = k( ), • !1 n 1

Var[Nk,n] 2 limn = k( ), • !1 n 1

Nk,n E[Nk,n] D 2 limn (0,k( )). • !1 pn !N 1 Supercritical regime

Coverage theorem: Bobrowski, M

k If nrn >Clog n 1 1. If C>(!kfmin) ,then ˆ lim P Nk,n = Nk,n, 1 k d =1. n 8   !1 ⇣ ⌘ 1 2. If C>2(!kfmin) ,thenthereexistsM>0 such that for n>M

N a.s.= Nˆ , 1 k d. k,n k,n 8   Supercritical regime

Convergance of Betti numbers: Bobrowski, M

If r 0 and nrk >Clog n,then n ! n 1 1. If C>(!kfmin) ,then

lim P (k,n = k( ), 0 k d)=1. n !1 M 8  

1 2. If C>2(!kfmin) ,thenthereexistsM>0 such that for n>M

a.s.= ( ), 0 k d. k,n k M 8   Supercritical regime

Vacancy probability:

Coverage threshold:

Theorem [Bobrowski,M] Supercritical regime

No more H1 death

No more H1 birth Review of scaling limits

sub-critical critical super-critical

(connectivity) (coverage)

Bobrowski & Kahle –Topology of Random Geometric Complexes: a survey (2) Curved spaces. (3) Random walks and spectral simiplicial theory. (4) Higher-order expanders. (5) Stochastic Hodge theory.

Open questions and problems

(1) Betti numbers outside the critical regime. (3) Random walks and spectral simiplicial theory. (4) Higher-order expanders. (5) Stochastic Hodge theory.

Open questions and problems

(1) Betti numbers outside the critical regime. (2) Curved spaces. (4) Higher-order expanders. (5) Stochastic Hodge theory.

Open questions and problems

(1) Betti numbers outside the critical regime. (2) Curved spaces. (3) Random walks and spectral simiplicial theory. (5) Stochastic Hodge theory.

Open questions and problems

(1) Betti numbers outside the critical regime. (2) Curved spaces. (3) Random walks and spectral simiplicial theory. (4) Higher-order expanders. Open questions and problems

(1) Betti numbers outside the critical regime. (2) Curved spaces. (3) Random walks and spectral simiplicial theory. (4) Higher-order expanders. (5) Stochastic Hodge theory. Homology of level sets Homology of level sets

Density f : IR d IR + we consider D = x : f(x) > L . ! L { }

If we know f(xi) at sample points and pick f(xi) > L to construct homology.

ˆ 1 n We can use f(x)=n i=1 K(x, xi) to pick points. P

ideal real Instead look at DL+✏ , DL ✏ !

Bobrowski-M-Taylor, 2015: Construct an estimator, prove consistency, apply to noisy topological inference.

Homology of level sets

Recovering Hk (DL ) is hard, noise and homology can be brittle. Homology of level sets

Recovering Hk (DL ) is hard, noise and homology can be brittle.

Instead look at DL+✏ , DL ✏ !

Bobrowski-M-Taylor, 2015: Construct an estimator, prove consistency, apply to noisy topological inference. (2) Consider the homology map

◆ : H (DˆL+✏(n, r)) DˆL ✏(n, r). ⇤ ⇤ !

(3) Define Hˆ (L, ✏; n):=Im(◆ ). ⇤ ⇤

Estimating homology of level sets

Define the following as procedure P1:

(1) Given (X1,...,Xn) compute

DˆL ✏(n, r):= Xi : fˆn(Xi ) L ✏;1 i n {   } Dˆ (n, r):= X : fˆ (X ) L + ✏;1 i n L+✏ { i n i   } (3) Define Hˆ (L, ✏; n):=Im(◆ ). ⇤ ⇤

Estimating homology of level sets

Define the following as procedure P1:

(1) Given (X1,...,Xn) compute

DˆL ✏(n, r):= Xi : fˆn(Xi ) L ✏;1 i n {   } Dˆ (n, r):= X : fˆ (X ) L + ✏;1 i n L+✏ { i n i   }

(2) Consider the homology map

◆ : H (DˆL+✏(n, r)) DˆL ✏(n, r). ⇤ ⇤ ! Estimating homology of level sets

Define the following as procedure P1:

(1) Given (X1,...,Xn) compute

DˆL ✏(n, r):= Xi : fˆn(Xi ) L ✏;1 i n {   } Dˆ (n, r):= X : fˆ (X ) L + ✏;1 i n L+✏ { i n i   }

(2) Consider the homology map

◆ : H (DˆL+✏(n, r)) DˆL ✏(n, r). ⇤ ⇤ !

(3) Define Hˆ (L, ✏; n):=Im(◆ ). ⇤ ⇤ In particular, if nr d D log n with D > (C ) 1 then ✏⇤/2

lim IP Hˆ (L, ✏; n) = H (DL) =1. n ⇤ ⇠ ⇤ !1 ⇣ ⌘

Estimating homology of level sets

Theorem (Bobrowski,M,Taylor) Let L > 0 and ✏ (0, L/2) be such that the density f (x) has no 2 critical values in [L 2✏, L +2✏].Ifr 0 and nr d then for n ! 1 large enough

d C ⇤ nr IP Hˆ (L, ✏; n) = H (DL) 1 6ne ✏/2 . ⇤ ⇠ ⇤ ⇣ ⌘ Estimating homology of level sets

Theorem (Bobrowski,M,Taylor) Let L > 0 and ✏ (0, L/2) be such that the density f (x) has no 2 critical values in [L 2✏, L +2✏].Ifr 0 and nr d then for n ! 1 large enough

d C ⇤ nr IP Hˆ (L, ✏; n) = H (DL) 1 6ne ✏/2 . ⇤ ⇠ ⇤ ⇣ ⌘ In particular, if nr d D log n with D > (C ) 1 then ✏⇤/2

lim IP Hˆ (L, ✏; n) = H (DL) =1. n ⇤ ⇠ ⇤ !1 ⇣ ⌘ The longest bar How does the maximal persistence over all k-cycles scale

⇧k (n)= max ⇡(). PH (n) 2 k

Theorem (Bobrowski,Kahle,Skraba) Let be a Poisson process on [0, 1]d and let PH (n) be the k-th Pn k persistent homology. Then ⇧k (n) scales as

log n 1/k (n):= . k log log n ✓ ◆

Random persistence Define =[0, 1]d and M ⇡():= death . birth Theorem (Bobrowski,Kahle,Skraba) Let be a Poisson process on [0, 1]d and let PH (n) be the k-th Pn k persistent homology. Then ⇧k (n) scales as

log n 1/k (n):= . k log log n ✓ ◆

Random persistence Define =[0, 1]d and M ⇡():= death . birth

How does the maximal persistence over all k-cycles scale

⇧k (n)= max ⇡(). PH (n) 2 k Random persistence Define =[0, 1]d and M ⇡():= death . birth

How does the maximal persistence over all k-cycles scale

⇧k (n)= max ⇡(). PH (n) 2 k

Theorem (Bobrowski,Kahle,Skraba) Let be a Poisson process on [0, 1]d and let PH (n) be the k-th Pn k persistent homology. Then ⇧k (n) scales as

log n 1/k (n):= . k log log n ✓ ◆ Acknowledgements

Many people. Funding:

I Center for Systems Biology at Duke

I NSF DMS and CCF

I AFOSR, DARPA

I NIH

204, New journal

195,