THE SINGULAR STRUCTURE AND REGULARITY OF STATIONARY AND MINIMIZING VARIFOLDS

AARON NABER AND DANIELE VALTORTA

ABSTRACT. If one considers an integral varifold $I^m \subseteq M$ with bounded mean curvature, and if $S^k(I) \equiv \{x \in M : \text{no tangent cone at } x \text{ is } (k+1)\text{-symmetric}\}$ is the standard stratification of the singular set, then it is well known that $\dim S^k \le k$. In complete generality nothing else is known about the singular sets $S^k(I)$. In this paper we prove for a general integral varifold with bounded mean curvature, in particular a stationary varifold, that every stratum $S^k(I)$ is $k$-rectifiable. In fact, we prove for $k$-a.e. point $x \in S^k$ that there exists a unique $k$-plane $V^k$ such that every tangent cone at $x$ is of the form $V \times C$ for some cone $C$. In the case of minimizing hypersurfaces $I^{n-1} \subseteq M^n$ we can go further. Indeed, we can show that the singular set $S(I)$, which is known to satisfy $\dim S(I) \le n-8$, is in fact $(n-8)$-rectifiable with uniformly finite $(n-8)$-dimensional measure. An effective version of this allows us to prove that the second fundamental form $A$ has a priori estimates in $L^7_{weak}$ on $I$, an estimate which is sharp as $|A|$ is not in $L^7$ for the Simons cone. In fact, we prove the much stronger estimate that the regularity scale $r_I$ has $L^7_{weak}$-estimates. The above results are in fact just applications of a new class of estimates we prove on the quantitative stratifications $S^k_{\epsilon,r}$ and $S^k_\epsilon \equiv S^k_{\epsilon,0}$. Roughly, $x \in S^k_\epsilon \subseteq I$ if no ball $B_r(x)$ is $\epsilon$-close to being $(k+1)$-symmetric. We show that $S^k_\epsilon$ is $k$-rectifiable and satisfies the Minkowski estimate $\mathrm{Vol}(B_r(S^k_\epsilon)) \le C_\epsilon r^{n-k}$. The proof requires a new $L^2$-subspace approximation theorem for integral varifolds with bounded mean curvature, and a $W^{1,p}$-Reifenberg type theorem proved by the authors in [NVa].

CONTENTS
1. Introduction
1.1. Results for Varifolds with Bounded Mean Curvature
1.2. Results for Minimizing Hypersurfaces
1.3. Outline of Proofs and Techniques
2. Preliminaries
2.1. Integral Varifolds and Mean Curvature
2.2. Bounded Mean Curvature and Monotonicity
2.3. Quantitative 0-Symmetry and Cone Splitting
2.4. $\epsilon$-regularity for Minimizing Hypersurfaces
2.5. Hausdorff, Minkowski, and Packing Content
2.6. Examples
2.7. The Classical Reifenberg Theorem
3. The $W^{1,p}$-Reifenberg Theorem
3.1. Statement of Main $W^{1,p}$-Reifenberg and rectifiable-Reifenberg Results

Date: April 27, 2015. The first author has been supported by NSF grant DMS-1406259, the second author has been supported by SNSF grant 149539.

3.2. Technical Constructions
3.3. Proof of Theorem 3.3: The Discrete-Reifenberg
4. $L^2$-Best Approximation Theorems for Integral Currents
4.1. Symmetry and Gradient Bounds
4.2. Best $L^2$-Subspace Equations
4.3. Proof of Theorem 4.1
5. The Inductive Covering Lemma
5.1. Estimating $U_r$ in Lemma 5.1 for $r > 0$
5.2. Estimating $U_0$ in Lemma 5.1
5.3. Technical Constructions for Estimating $U_+$
5.4. Estimating $U_+$ in Lemma 5.1
6. Proof of Main Theorems for Integral Varifolds with Bounded Mean Curvature
6.1. Proof of Theorem 1.3
6.2. Proof of Theorem 1.4
6.3. Proof of Theorem 1.5
7. Proof of Main Theorems for Minimizing Hypersurfaces
7.1. Proof of Theorem 1.6
7.2. Proof of Theorem 1.8
References

1. INTRODUCTION

In this paper we will study integral $m$-varifolds $I^m$ with bounded mean curvature, and in particular stationary and area minimizing varifolds; see Section 2 for an introduction to the basics. In the case of a varifold $I$ with bounded mean curvature we can consider the stratification of $I$ given by
\[ S^0(I) \subseteq \cdots \subseteq S^k(I) \subseteq \cdots \subseteq S(I) \subseteq I , \tag{1.1} \]
where the singular sets are defined as
\[ S^k(I) \equiv \{x \in M : \text{no tangent cone at } x \text{ is } (k+1)\text{-symmetric}\} , \tag{1.2} \]
see Definition 1.2 for a precise definition and more detailed discussion. An important result of Federer tells us that we have the estimate
\[ \dim S^k(I) \le k . \tag{1.3} \]
Unfortunately, essentially nothing else is understood about the structure of the singular sets in any generality. In the case where $I$ is area minimizing more is understood, at least for the top stratum of the singular set. Namely, it has been shown in [Sim93] that $S^{m-2}(I)$ is rectifiable, and if $I^{n-1}$ is further assumed to be an area minimizing hypersurface it has been shown in [Sim95] that $S^{n-8}(I)$ is rectifiable.

The first goal of this paper is to give improved regularity results for the stratification and the associated quantitative stratification for varifolds under only a bounded mean curvature assumption. Indeed, we will show for such a varifold that the strata $S^k(I)$ are $k$-rectifiable for every $k$. In fact, we will show that for $k$-a.e. $x \in S^k(I)$ there exists a unique $k$-dimensional subspace $V^k \subseteq T_xM$ such that every tangent cone at $x$ is of the form $V^k \times C$ for some cone $C$. Note that we are not claiming that the cone factor $C$ is necessarily unique, just that the Euclidean factor $V^k$ of the cone is. It is a good question as to whether tangent cones need to be unique in general under only the assumption of bounded mean curvature.

For a varifold $I^m = I^{n-1}$ which is an area minimizing hypersurface these results can be improved. To begin with, we have that the top stratum of the singular set $S(I) = S^{n-8}(I)$ is $(n-8)$-rectifiable, and that there are a priori bounds on its $(n-8)$-dimensional Hausdorff measure. That is, if the mass $I(B_2) \le \Lambda$ is bounded, then we have the estimate $H^{n-8}(S(I) \cap B_1) \le C(\Lambda, B_2)$. In fact, we have the stronger Minkowski estimate
\[ \mathrm{Vol}\big(B_r(S(I)) \cap B_1\big),\ \ r \cdot I\big(B_r(S(I)) \cap B_1\big) \le C r^8 , \tag{1.4} \]
where $B_r(S(I))$ is the tube of radius $r$ around the singular set. Indeed, we can prove much more effective versions of these estimates. That is, in Theorem 1.8 we show that the second fundamental form $A$ of $I$, and in fact the regularity scale $r_I$, have a priori bounds in $L^7_{weak}$. More precisely, we have that
\[ I\big(\{|A| > r^{-1}\} \cap B_1\big) \le I\big(B_r(\{|A| > r^{-1}\}) \cap B_1\big) \le C(\Lambda, B_2)\, r^7 . \tag{1.5} \]

Let us observe that these estimates are sharp, in that the Simons cone satisfies $|A| \notin L^7_{loc}(I)$. Let us also point out that this sharpens the estimates of [CN13b], where it was proven for minimizing hypersurfaces that $|A| \in L^p$ for all $p < 7$. We refer the reader to Section 1.2 for the precise and most general statements.
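To see why the exponent 7 is the borderline, the following is a short worked computation on the Simons cone $C \subseteq \mathbb{R}^8$. It is only a sketch: it assumes the standard facts that $C$ is a 7-dimensional cone and that $|A|$ is homogeneous of degree $-1$ on it, so that $|A| = c/\rho$ with $\rho$ the distance to the vertex and $c > 0$ a constant. The symbols $c$, $\rho$ and $\sigma$ are notation introduced for this illustration.

```latex
% Polar coordinates on the 7-dimensional cone C: dI = \rho^{6}\, d\rho\, d\sigma,
% where \sigma is the area measure of the link C \cap S^7, of total mass |\sigma|.
\begin{align*}
\int_{C \cap B_1} |A|^{p}\, dI
  &= c^{p}\, |\sigma| \int_0^1 \rho^{6-p}\, d\rho
   \;=\;
   \begin{cases}
     \dfrac{c^{p}\,|\sigma|}{7-p} & p < 7, \\[1ex]
     +\infty & p \ge 7,
   \end{cases}
\\
% |A| > r^{-1} is the same as \rho < c r, so the superlevel set is a smaller ball of the cone.
I\big(\{|A| > r^{-1}\} \cap B_1\big)
  &= I\big(C \cap B_{\min(cr,1)}\big)
   \;=\; \frac{|\sigma|}{7}\,\min(cr,1)^{7}
   \;\le\; \frac{c^{7}\,|\sigma|}{7}\; r^{7} .
\end{align*}
```

In this model case the $L^7$ norm diverges only logarithmically, while the superlevel sets obey exactly the $r^7$ decay of (1.5).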

Now the techniques of this paper all center around the notion of the quantitative stratification. In fact, it is for the quantitative stratification that the most important results of the paper hold; everything else can be seen to be a corollary of these statements. The quantitative stratification was first introduced in [CN13a], and later used in [CN13b] with the goal of giving effective and $L^p$ estimates on stationary and minimizing varifolds. It has since been used in [CHN13b], [CHN13a], [CNV15], [BL14] to prove similar results in the areas of mean curvature flow, critical sets of elliptic equations, and harmonic map flow. More recently, in [NVa] the authors have used ideas similar to those in this paper to prove structural theorems for the singular sets of stationary harmonic maps. Before describing the results in this paper on the quantitative stratification, let us give more precise definitions of everything. To begin with, to describe the stratification and quantitative stratification we need to discuss the notion of symmetry associated to an integral varifold with bounded mean curvature. Specifically:

Definition 1.1. We define the following:
(1) An integral varifold $I^m \subseteq \mathbb{R}^n$ is called $k$-symmetric if $I(\lambda x) = I(x)$ $\forall\, \lambda > 0$, and if there exists a $k$-plane $V^k \subseteq \mathbb{R}^n$ such that for each $x \in I$ and $y \in V^k$ we have that $I(x + y) = I(x)$.
(2) Given an integral varifold $I^m \subseteq M$ and $\epsilon > 0$, we say a ball $B_r(x) \subseteq M$ with $r$

Remark 1.1. The distance $d$ may be taken to be the flat distance or the weak distance induced by the Fréchet structure of varifold convergence.

Thus, an integral varifold $I^m$ is $k$-symmetric if $I = V^k \times C$ for some cone $C$. A varifold is $(k, \epsilon)$-symmetric on a ball $B_r(x)$ if it is weakly close to a $k$-symmetric varifold on this ball.

With the notion of symmetry in hand, we can define precisely the quantitative stratification associated to a solution. The idea is to group points together based on the amount of symmetry that balls centered at those points contain. In fact, there are several variants which will play a role for us. Let us introduce them all and briefly discuss them:

Definition 1.2. For an integral varifold $I^m$ we make the following definitions:
(1) For $\epsilon, r > 0$ we define the $k^{th}$ $(\epsilon, r)$-stratification $S^k_{\epsilon,r}(I)$ by
\[ S^k_{\epsilon,r}(I) \equiv \{x \in I \cap B_1 : \text{for no } r \le s < 1 \text{ is } B_s(x) \text{ a } (k+1, \epsilon)\text{-symmetric ball}\} . \tag{1.6} \]
(2) For $\epsilon > 0$ we define the $k^{th}$ $\epsilon$-stratification $S^k_\epsilon(I)$ by
\[ S^k_\epsilon(I) = \bigcap_{r>0} S^k_{\epsilon,r}(I) \equiv \{x \in I \cap B_1 : \text{for no } 0 < r < 1 \text{ is } B_r(x) \text{ a } (k+1, \epsilon)\text{-symmetric ball}\} . \tag{1.7} \]
(3) We define the $k^{th}$-stratification $S^k(I)$ by
\[ S^k(I) = \bigcup_{\epsilon>0} S^k_\epsilon = \{x \in I \cap B_1 : \text{no tangent cone at } x \text{ is } (k+1)\text{-symmetric}\} . \tag{1.8} \]

Remark 1.2. It is a small but important exercise to check that the standard stratification $S^k(I)$ as defined in (1.1) agrees with the set $\bigcup_{\epsilon>0} S^k_\epsilon$.

Let us discuss in words the meaning of the quantitative stratification, and how it relates to the standard stratification. As discussed at the beginning of the section, the stratification $S^k(I)$ of $I$ is built by separating points of $I$ based on the infinitesimal symmetries of $I$ at those points. The quantitative stratifications $S^k_\epsilon(I)$ and $S^k_{\epsilon,r}(I)$ are, on the other hand, built by separating points of $I$ based on how many symmetries exist on balls of definite size around the points. In practice, the quantitative stratification has two advantages over the standard stratification. First, on minimizing varifolds the quantitative stratification allows one to prove effective estimates. In particular, in [CN13b] the $L^p$ estimates
\[ \fint_{B_1 \cap I} |A|^{7-\delta} < C_\delta \quad \forall\, \delta > 0 , \tag{1.9} \]
on minimizing hypersurfaces were proved by exploiting this fact. The second advantage is that the estimates on the quantitative stratification are much stronger than those on the standard stratification. Namely, in [CN13b] the Hausdorff dimension estimate (1.3) on $S^k(I)$ was improved to the estimate
\[ \mathrm{Vol}\big(B_r(S^k_{\epsilon,r})\big) \le C_{\epsilon,\delta}\, r^{n-k-\delta} \quad \forall\, \delta > 0 . \tag{1.10} \]
One of the key technical estimates of this paper is that in Theorem 1.3 we drop the $\delta$ from this estimate and obtain an estimate of the form
\[ \mathrm{Vol}\big(B_r(S^k_{\epsilon,r})\big) \le C_\epsilon\, r^{n-k} . \tag{1.11} \]

From this we are able to conclude in Theorem 1.4 an estimate on $S^k_\epsilon$ of the form
\[ \mathrm{Vol}\big(B_r(S^k_\epsilon)\big) \le C_\epsilon\, r^{n-k} . \tag{1.12} \]
In particular, this estimate allows us to conclude that $S^k_\epsilon$ has uniformly finite $k$-dimensional measure. In fact, the techniques will prove much more for us. They will show us that $S^k_\epsilon$ is $k$-rectifiable, and that for $k$-a.e. point $x \in S^k_\epsilon$ there is a unique $k$-plane $V^k \subseteq T_xM$ such that every tangent map at $x$ is $k$-symmetric with respect to $V$. By observing that $S^k(I) = \bigcup_\epsilon S^k_\epsilon(I)$, this is what allows us to prove in Theorem 1.5 our main results on the classical stratification. This decomposition of $S^k$ into the pieces $S^k_\epsilon$ is crucial for the proof. On the other hand, (1.12), combined with the $\epsilon$-regularity theorems of [Fed69], [CN13b], allows us to conclude in the minimizing hypersurface case both the weak $L^7$ estimate on $|A|$ and the $(m-7)$-finiteness of the singular set of $I$. Thus we will see that Theorems 1.6 and 1.8 are fairly quick consequences of (1.12).

Thus we have seen that (1.11) and (1.12), and more generally Theorem 1.3 and Theorem 1.4, are the main challenges of the paper. We will give a more complete outline of the proof in Section 1.3; however, let us mention for the moment that two of the new ingredients to the proof are a new $L^2$-subspace approximation theorem for integral varifolds with bounded mean curvature, proved in Section 4, and a $W^{1,p}$-Reifenberg theorem described in Section 3. The $L^2$-approximation result roughly tells us that the $L^2$-distance of a measure from being contained in a $k$-dimensional subspace may be estimated by integrating the volume drop of the integral varifold over the measure. To exploit the estimate we prove a new $W^{1,p}$-Reifenberg type theorem. The classical Reifenberg theorem states that if we have a set $S$ which is $L^\infty$-approximated by an affine $k$-dimensional subspace at every point and scale, then $S$ is bi-Hölder to a $k$-dimensional ball, see Theorem 2.13 for a precise statement. It is important for us to improve on this bi-Hölder estimate, at least enough that we are able to control the $k$-dimensional measure of the set and prove rectifiability. In particular, we want to improve the $C^\alpha$-maps to $W^{1,p}$-maps for $p > k$, and we will want to do it using a condition which is integral in nature. More precisely, we will only require a form of summable $L^2$-closeness of the subset $S$ to the approximating subspaces. We will see in Theorem 4.1 that, by using the $L^2$-subspace approximation theorem, the conditions of this new $W^{1,p}$-Reifenberg are in fact controllable for the quantitative stratifications $S^k_\epsilon$.

1.1. Results for Varifolds with Bounded Mean Curvature. We now turn our attention to giving precise statements of the main results of this paper. In this subsection we focus on those concerning the singular structure of integral varifolds with bounded mean curvature. Precisely, let $(M^n, g, p)$ be a Riemannian manifold satisfying
\[ |\sec_{B_2(p)}| \le K , \qquad \mathrm{inj}(B_2(p)) \ge K^{-1} , \tag{1.13} \]
and let $I^m$ be an integral varifold on $M$ with mean curvature bounded by $H$ on $B_2(p)$. That is, for every smooth vector field $X$ on $M$ with compact support in $B_2(p)$ we have the estimate
\[ \delta I(X) \le H \int_I |X|\, dI , \tag{1.14} \]
see Section 2.1 for more on this. Let us begin by discussing our main theorem for the quantitative stratifications $S^k_{\epsilon,r}(I)$:

Theorem 1.3 ($(\epsilon, r)$-Stratification of Integral Varifolds with BMC). Let $I^m$ be an integral varifold on $M^n$ satisfying the curvature bound (1.13), the mean curvature bound (1.14), and the mass bound $I(B_2(p)) \le \Lambda$. Then for each $\epsilon > 0$ there exists $C_\epsilon(n, K, H, \Lambda, \epsilon)$ such that
\[ \mathrm{Vol}\big(B_r(S^k_{\epsilon,r}(I)) \cap B_1(p)\big) \le C_\epsilon\, r^{n-k} . \tag{1.15} \]

When we study the quantitative stratification $S^k_\epsilon(I)$ we can refine the above to prove structure theorems on the set itself:

Theorem 1.4 ($\epsilon$-Stratification of Integral Varifolds with BMC). Let $I^m$ be an integral varifold on $M^n$ satisfying the curvature bound (1.13), the mean curvature bound (1.14), and the mass bound $I(B_2(p)) \le \Lambda$. Then for each $\epsilon > 0$ there exists $C_\epsilon(n, K, H, \Lambda, \epsilon)$ such that
\[ \mathrm{Vol}\big(B_r(S^k_\epsilon(I)) \cap B_1(p)\big) \le C_\epsilon\, r^{n-k} . \tag{1.16} \]
In particular, we have the $k$-dimensional Hausdorff measure estimate $H^k(S^k_\epsilon(I)) \le C_\epsilon$. Further, $S^k_\epsilon(I)$ is $k$-rectifiable, and for $k$-a.e. $x \in S^k_\epsilon$ there exists a unique $k$-plane $V^k \subseteq T_xM$ such that every tangent cone at $x$ is $k$-symmetric with respect to $V^k$.
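The step from the Minkowski estimate (1.16) to the Hausdorff measure bound can be sketched in the model case $M = \mathbb{R}^n$. This is the standard packing argument, given here as an illustration and not as the authors' proof. The points $\{x_i\}$, the unit ball volumes $\omega_n, \omega_k$ and the shorthand $S = S^k_\epsilon(I) \cap B_1(p)$ are notation assumed for the sketch, and the estimate is taken in the simplified form $\mathrm{Vol}(B_r(S)) \le C_\epsilon r^{n-k}$.

```latex
% Let \{x_i\}_{i=1}^{N} \subseteq S be a maximal r-separated subset, with 0 < r \le 1.
% The balls B_{r/2}(x_i) are pairwise disjoint and contained in B_r(S), so
\begin{align*}
N\, \omega_n \Big(\frac{r}{2}\Big)^{n}
  \;\le\; \mathrm{Vol}\big(B_r(S)\big)
  \;\le\; C_\epsilon\, r^{n-k}
\quad\Longrightarrow\quad
N \;\le\; \frac{2^{n} C_\epsilon}{\omega_n}\; r^{-k} .
\end{align*}
% By maximality S \subseteq \bigcup_i B_r(x_i), a cover by sets of diameter at most 2r, hence
\begin{align*}
H^{k}_{2r}(S) \;\le\; N\, \omega_k\, r^{k} \;\le\; \frac{2^{n}\,\omega_k}{\omega_n}\; C_\epsilon ,
\qquad \text{and letting } r \to 0 : \quad
H^{k}(S) \;\le\; \frac{2^{n}\,\omega_k}{\omega_n}\; C_\epsilon .
\end{align*}
```

The same count shows that the Minkowski estimate is strictly stronger than the Hausdorff bound: it controls the number of balls needed at every scale $r$, not only in the limit.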

Finally, we end this subsection by stating our main results when it comes to the classical stratification $S^k(I)$. The following may be proved from the previous theorem in only a few lines given the formula $S^k(I) = \bigcup_\epsilon S^k_\epsilon(I)$:

Theorem 1.5 (Stratification of Integral Varifolds with BMC). Let $I^m$ be an integral varifold on $M^n$ satisfying the curvature bound (1.13), the mean curvature bound (1.14), and the mass bound $I(B_2(p)) < \infty$. Then for each $k$ we have that $S^k(I)$ is $k$-rectifiable. Further, for $k$-a.e. $x \in S^k(I)$ there exists a unique $k$-plane $V^k \subseteq T_xM$ such that every tangent cone at $x$ is $k$-symmetric with respect to $V^k$.
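The reduction of this theorem to Theorem 1.4 rests on writing the stratum as a countable union. A minimal sketch follows. It assumes Definition 1.2, that $(k+1,\epsilon)$-symmetry is a closeness condition of the form $d < \epsilon$ (so that it weakens as $\epsilon$ grows), and the general fact that a countable union of $k$-rectifiable sets is $k$-rectifiable. The index $j$ is notation introduced here.

```latex
% A (k+1,\epsilon')-symmetric ball is (k+1,\epsilon)-symmetric whenever \epsilon' \le \epsilon,
% so the sets S^k_\epsilon increase as \epsilon decreases:
\begin{align*}
\epsilon' \le \epsilon \;\Longrightarrow\; S^k_{\epsilon}(I) \subseteq S^k_{\epsilon'}(I),
\qquad \text{hence} \qquad
S^k(I) \;=\; \bigcup_{\epsilon > 0} S^k_\epsilon(I) \;=\; \bigcup_{j \ge 1} S^k_{1/j}(I) .
\end{align*}
% Each S^k_{1/j}(I) is k-rectifiable by Theorem 1.4, applied with \Lambda = I(B_2(p)) < \infty,
% so S^k(I) is a countable union of k-rectifiable sets.  In the same way, the set of points
% of S^k(I) without a unique plane V^k is a countable union of H^k-null sets, hence H^k-null.
```

Note that the constants $C_{1/j}$ from Theorem 1.4 may blow up as $j \to \infty$, which is consistent with Theorem 1.5 asserting rectifiability but no measure bound.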

1.2. Results for Minimizing Hypersurfaces. In this section we restrict ourselves to studying codimension one integral varifolds $I^{n-1}$ which minimize the area functional on $B_2(p)$ with respect to compact variations. That is, if $I'$ is another integral varifold with $I = I'$ in a neighborhood of $\partial B_2(p)$, then $I(B_2) \le I'(B_2)$. One could easily restrict to local minimizers to obtain similar results. Most of the results of this section follow quickly by combining the quantitative stratification results of Section 1.1 with the $\epsilon$-regularity of [Fed69], [CN13b], see Section 2.4 for a review of these points.

Our first estimate is on the singular set $\mathrm{Sing}(I)$ of a minimizing hypersurface. Recall that $\mathrm{Sing}(I)$ is the set of points where $I$ is not smooth.

Theorem 1.6 (Structure of Singular Set). Let $I^m = I^{n-1}$ be a minimizing integral varifold on $M^n$ satisfying the curvature bound (1.13) and the mass bound $I(B_2(p)) \le \Lambda$. Then $\mathrm{Sing}(I)$ is $(m-7)$-rectifiable and there exists $C(n, K, \Lambda)$ such that
\[ \mathrm{Vol}\big(B_r(\mathrm{Sing}(I)) \cap B_1(p)\big) \le C r^8 , \qquad I\big(B_r(\mathrm{Sing}(I)) \cap B_1(p)\big) \le C r^7 . \tag{1.17} \]
In particular, $\lambda^{m-7}(\mathrm{Sing}(I)) \le C$.

The above can be extended to effective regularity estimates on I. To state the results in full strength let us recall the notion of the regularity scale associated to a function. Namely:

Definition 1.7. Let $I^m$ be an integral varifold. For $x \in I \cap B_1(p)$ we define the regularity scale $r_I(x)$ by
\[ r_I(x) \equiv \max\Big\{ 0 \le r \le 1 : \sup_{B_r(x)} |A| \le r^{-1} \Big\} . \tag{1.18} \]
By definition, $r_I(x) \equiv 0$ if $I$ is not $C^2$ in a neighbourhood of $x$.

Remark 1.3. The regularity scale is scale invariant. That is, if $r \equiv r_I(x)$ and we rescale $B_r(x) \to B_1(x)$, then on the rescaled ball we will have $|A| \le 1$ on $B_1(x)$.

Remark 1.4. We have the easy estimate $|A|(x) \le r_I(x)^{-1}$. However, a lower bound on $r_I(x)$ at a point is in principle much stronger than an upper bound on $|A|(x)$.

Remark 1.5. Notice that the regularity scale is a Lipschitz function with $|\nabla r_I| \le 1$.

Remark 1.6. If $I$ satisfies an elliptic equation, e.g. is a stationary varifold, then we have estimates of the form
\[ \sup_{B_{r_I/2}(x)} |\nabla^k A| \le C_k\, r_I(x)^{-(k+1)} . \tag{1.19} \]
In particular, control on $r_I$ gives control on all higher order derivatives.

Now let us state our main estimates for minimizing hypersurfaces:

Theorem 1.8 (Estimates on Minimizing Hypersurfaces). Let $I^m = I^{n-1}$ be a minimizing integral varifold on $M^n$ satisfying the curvature bound (1.13) and the mass bound $I(B_2(p)) \le \Lambda$. Then there exists $C(n, K, \Lambda)$ such that
\begin{align*}
\mathrm{Vol}\big(\{x \in B_1(p) : |A| > r^{-1}\} \cap B_1(p)\big) &\le C r^8 , \\
\mathrm{Vol}\big(\{x \in B_1(p) : r_I(x) < r\} \cap B_1(p)\big) &\le C r^8 , \\
I\big(\{x \in B_1(p) : r_I(x) < r\} \cap B_1(p)\big) &\le C r^7 . \tag{1.20}
\end{align*}
In particular, both $|A|$ and $r_I^{-1}$ have bounds in $L^7_{weak}\big(I \cap B_1(p)\big)$, the space of weakly $L^7$ functions on $I$.
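As a consistency check, the weak $L^7$ bound recovers $L^p$ estimates for every $p < 7$, in the spirit of (1.9). The following sketch assumes only the pointwise bound $|A| \le r_I^{-1}$ of Remark 1.4, the last line of (1.20) for $0 < r \le 1$, and the mass bound $I(B_1(p)) \le \Lambda$. The variable $t = r^{-1}$ and the explicit constants are specific to this illustration.

```latex
% |A|(x) \le r_I(x)^{-1} gives \{|A| > r^{-1}\} \subseteq \{r_I < r\}, so for 0 < r \le 1
\begin{align*}
I\big(\{|A| > r^{-1}\} \cap B_1(p)\big)
  \;\le\; I\big(\{r_I < r\} \cap B_1(p)\big)
  \;\le\; C r^{7} .
\end{align*}
% Layer-cake formula in t = r^{-1}, split at t = 1; the mass bound handles 0 < t < 1.
\begin{align*}
\int_{I \cap B_1(p)} |A|^{p}\, dI
  &= p \int_0^\infty t^{p-1}\, I\big(\{|A| > t\} \cap B_1(p)\big)\, dt \\
  &\le p \int_0^1 t^{p-1}\, \Lambda\, dt \;+\; p \int_1^\infty t^{p-1}\, C\, t^{-7}\, dt
   \;=\; \Lambda + \frac{p\, C}{7-p}
   \qquad (0 < p < 7) .
\end{align*}
```

The bound degenerates like $(7-p)^{-1}$ as $p \to 7$, which matches the failure of the $L^7$ bound on the Simons cone.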

1.3. Outline of Proofs and Techniques. In this subsection we give a brief outline of the proof of the main theorems. To describe the new ingredients involved it will be helpful to give a comparison to the proofs of previous results in the area, in particular the dimension estimate (1.3) of Federer and the Minkowski and $L^p$ estimates (1.9) of [CN13b]. Indeed, the starting point for the study of singular sets for solutions of geometric equations typically looks the same, that is, one needs a monotone quantity. In the case of integral varifolds $I^m$ with bounded mean curvature, we consider the volume density
\[ \theta_r(x) \equiv r^{-m} I(B_r(x)) . \tag{1.21} \]

For simplicity's sake let us take $M \equiv \mathbb{R}^n$ and $I^m$ to be stationary in this discussion, which is really of no loss except for some small technical work. Then $\frac{d}{dr}\theta_r \ge 0$, so $\theta_r(x)$ is precisely a monotone quantity, and it is independent of $r$ if and only if $I$ is 0-symmetric, see Section 2.2 for more on this. Interestingly, this is the only information one requires to prove the dimension estimate (1.3). Namely, since $\theta_r(x)$ is monotone and bounded, it must converge as $r$ tends to zero. Therefore, if we consider the sequence of scales $r_\alpha = 2^{-\alpha}$ then we have for each $x$ that
\[ \lim_{\alpha \to \infty} \big| \theta_{r_\alpha}(x) - \theta_{r_{\alpha+1}}(x) \big| = 0 . \tag{1.22} \]
From this one can conclude that every tangent cone of $I$ is 0-symmetric. This fact, combined with some very general dimension reduction arguments originating with Federer from [Sim83], yields the dimension estimate (1.3).

The improvement in [CN13b] of the Hausdorff dimension estimate (1.3) to the Minkowski content estimate (1.10), and from there the $L^p$ estimate of (1.9), requires exploiting more about the monotone quantity $\theta_r(x)$ than just that it has a limit as $r$ tends to zero. Indeed, an effective version of (1.22) says that for each $\delta > 0$ there exists $N(\Lambda, \delta) > 0$ such that
\[ \big| \theta_{r_\alpha}(x) - \theta_{r_{\alpha+1}}(x) \big| < \delta \tag{1.23} \]
holds for all except for at most $N$ scales $\alpha \in \{\alpha_1, \ldots, \alpha_N\} \subseteq \mathbb{N}$. These bad scales where (1.23) fails may differ from point to point, but the number of such scales is uniformly bounded. This allows one to conclude that, for all but at most $N(\Lambda, \epsilon)$ scales, $B_{r_\alpha}(x)$ is $(0, \epsilon)$-symmetric, see Section 2.3. To exploit this information, a new technique other than dimension reduction was required in [CN13b]. Indeed, in [CN13b] the quantitative 0-symmetry of (1.23) was instead combined with the notion of cone splitting and an energy decomposition in order to conclude the estimates (1.9), (1.10). Since we will use them in this paper, the quantitative 0-symmetry and cone splitting will be reviewed further in Section 2.3.
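The uniform bound on the number of bad scales in (1.23) is a pigeonhole argument. A minimal sketch in the stationary Euclidean case follows, assuming $x \in B_1$ so that $\theta_{r_0}(x) = I(B_1(x)) \le I(B_2) \le \Lambda$. The explicit value $\Lambda/\delta$ is simply what this sketch yields and is not claimed to be the constant used in [CN13b].

```latex
% With r_\alpha = 2^{-\alpha}, monotonicity makes every term nonnegative and the sum telescopes:
\begin{align*}
\sum_{\alpha \ge 0} \big( \theta_{r_\alpha}(x) - \theta_{r_{\alpha+1}}(x) \big)
  \;=\; \theta_{1}(x) - \lim_{r \to 0} \theta_r(x)
  \;\le\; \theta_1(x)
  \;\le\; \Lambda .
\end{align*}
% Each scale at which (1.23) fails contributes at least \delta to the left-hand side, hence
\begin{align*}
\#\big\{ \alpha \ge 0 : \theta_{r_\alpha}(x) - \theta_{r_{\alpha+1}}(x) \ge \delta \big\}
  \;\le\; \frac{\Lambda}{\delta} .
\end{align*}
```

The set of bad scales depends on $x$, but the bound on its cardinality does not, which is the uniformity used in the text.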

Now let us begin to discuss the results of this paper. The most challenging aspect of this paper is the proof of the estimates on the quantitative stratifications of Theorems 1.3 and 1.4, and so we will focus on these in our outline. Let us first observe that it might be advantageous to replace (1.23) with a version that forces an actual rate of convergence, see for instance [Sim93], [Sim95]. More generally, if one is in a context where an effective version of tangent cone uniqueness can be proved then this may be exploited. In fact, in the context of critical sets of elliptic equations one can follow exactly this approach, see the authors' work [NVb] where versions of Theorems 1.3 and 1.4 were first proved in this context. However, in the general context of this paper such an approach fails, as tangent cone uniqueness is not available, and potentially not correct.

Instead, we will first replace (1.23) with the following relatively simple observation. Namely, for each $x$ there exists $N(\Lambda, \delta)$ and a finite number of scales $\{\alpha_1, \ldots, \alpha_N\} \subseteq \mathbb{N}$ such that
\[ \sum_{\alpha_j < \alpha < \alpha_{j+1}} \big| \theta_{r_\alpha}(x) - \theta_{r_{\alpha+1}}(x) \big| < \delta . \tag{1.24} \]
That is, not only does the mass density drop by less than $\delta$ between these scales, but the sum of all the mass density drops is less than $\delta$ between these scales.

Unfortunately, exploiting (1.24) in order to prove estimates on the singular set turns out to be substantially harder than exploiting either (1.22) or even (1.23). In essence, this is because it is not a local assumption in terms of scale, and one needs estimates which can see many scales simultaneously, but which do not require any form of tangent cone uniqueness statements. Accomplishing this requires four new ideas, two of which have been introduced in the last few years in [CN13b], [NVb], and two of which are new to this paper. The first point is to replace the study of the singular set with the study of the quantitative singular set, as introduced in [CN13b] for integral varifolds. It will be clear from the proofs that there is no direct way to apply the ideas of this paper to the singular set itself without decomposing it into these quantitative pieces. The sharp estimates themselves also depend on a new covering argument, first introduced by the authors in [NVb]. We will only briefly discuss the covering in the outline, but this covering argument has the advantage of giving very sharp packing estimates on sets and exploits well the condition (1.24), see Section 5 for more on this in the context of this paper. The disadvantage of the strategy of this covering is that it requires comparing balls of arbitrarily different sizes. In [NVb] this was accomplished by proving an effective tangent cone uniqueness statement in the context of critical sets of elliptic equations. Unfortunately, in the context of this paper it is not clear such a statement exists.

Thus, we arrive at discussing the new ideas supplied in this paper. As discussed, in order to apply the strategy of the covering argument of [NVb] we need to be able to estimate collections of balls of potentially arbitrarily different radii. Accomplishing this requires two ingredients: a rectifiable-Reifenberg type theorem, and a new $L^2$-best subspace approximation theorem for integral varifolds with bounded mean curvature, which will allow us to apply the rectifiable-Reifenberg. Let us discuss these two ingredients separately.

We begin by briefly discussing the $W^{1,p}$-Reifenberg and rectifiable-Reifenberg theorems, which are stated in Section 3. Recall that the classical Reifenberg theorem, reviewed in Section 2.7, gives criteria under which a set becomes $C^\alpha$-Hölder equivalent to a ball $B_1(0^k)$ in $\mathbb{R}^k$. In the context of this paper, it is important to improve on this result so that we have gradient and volume control of our set. Let us remark that there have been many generalizations of the classical Reifenberg theorem in the literature, see for instance [Tor95], [DT12]; however, those results have hypotheses which are much too strong for the purposes of this paper. Instead, we will follow the approach introduced by the authors in [NVa] to prove rectifiability and Minkowski-type bounds on the singular sets of harmonic maps. In particular, we need to improve the $C^\alpha$-equivalence to a $W^{1,p}$-equivalence. This is strictly stronger by Sobolev embedding, and if $p > k$ then this results in volume estimates and a rectifiable structure for the set. More generally, we will require a version of the theorem which allows for more degenerate structural behavior, namely a rectifiable-Reifenberg theorem. In this case, the theorem will conclude that a set $S$ is rectifiable with volume estimates. Of course, what is key about this result is that the criteria will be checkable for our quantitative stratifications, thus let us discuss these criteria briefly. Roughly, if $S \subseteq B_1(0^n)$ is a subset equipped with the $k$-dimensional Hausdorff measure $\lambda^k$, then let us define the $k$-dimensional distortion of $S$ by
\[ D^k_S(x, s) \equiv s^{-2} \inf_{L^k} s^{-k} \int_{S \cap B_s(x)} d^2(y, L^k)\, d\lambda^k(y) , \tag{1.25} \]
where the inf is over all $k$-dimensional affine subspaces of $\mathbb{R}^n$. That is, $D^k$ measures how far $S$ is from being contained in a $k$-dimensional subspace. Our rectifiable-Reifenberg then requires this to be small on $S$ in an integral sense, more precisely that
\[ r^{-k} \int_{S \cap B_r(x)} \sum_{r_\alpha \le r} D^k_S(y, r_\alpha)\, d\lambda^k(y) < \delta^2 . \tag{1.26} \]
For $\delta$ sufficiently small, the conclusions of the rectifiable-Reifenberg Theorem 3.4 are that the set $S$ is rectifiable with effective bounds on the $k$-dimensional measure. Let us remark that one cannot possibly conclude better than rectifiable under this assumption, see for instance the examples of Section 2.6.

Thus, in order to prove the quantitative stratification estimates of Theorems 1.3 and 1.4, we will need to verify that the integral conditions (1.26) hold for the quantitative stratifications $S^k_\epsilon(I)$, $S^k_{\epsilon,r}(I)$ on all balls $B_r(x)$. In actuality the proof is more complicated. We will need to apply a discrete version of the rectifiable-Reifenberg, which will allow us to build an iterative covering in Section 5 of the quantitative stratifications, and each of these will satisfy (1.26). This will allow us to keep effective track of all the estimates involved. However, let us for the moment just focus on the main estimate which allows us to turn (1.26) into information about our integral varifolds, without worrying about such details.

Namely, in Section 4 we prove a new and very general approximation theorem for integral varifolds with bounded mean curvature. As always in this outline, let us assume $M \equiv \mathbb{R}^n$ and that $I^m$ is stationary; the general case is no harder. Thus we consider an arbitrary measure $\mu$ which is supported on $B_1(0^n)$. We would like to study how closely the support of $\mu$ can be approximated by a $k$-dimensional affine subspace $L^k \subseteq \mathbb{R}^n$, in the appropriate sense, and we would like to estimate this distance by using properties of $I$. Indeed, if we assume that $B_8(0^n)$ is not $(k+1, \epsilon)$-symmetric with respect to $I$, then for an arbitrary $\mu$ we will prove in Theorem 4.1 that
\[ \inf_{L^k \subseteq \mathbb{R}^n} \int d^2(x, L^k)\, d\mu \le C_\epsilon \int |\theta_8(x) - \theta_1(x)|\, d\mu , \tag{1.27} \]
where $C_\epsilon$ will depend on $\epsilon$, the mass of $I$, and the geometry of $M$. That is, if $I$ does not have $k+1$ degrees of symmetry, then how closely the support of an arbitrary measure $\mu$ can be approximated by a $k$-dimensional subspace can be estimated by looking at the mass drop of $I$ along $\mu$. In applications, $\mu$ will be the restriction to $B_1$ of some discrete approximation of the $k$-dimensional Hausdorff measure on $S^k_\epsilon$, and thus the symmetry assumption on $I$ will hold for all balls centered on the support of $\mu$.
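For a fixed measure, the infimum on the left of (1.27) can be computed by linear algebra, which is a useful way to read the estimate. The following is the standard least-squares (principal component) identity, stated as a sketch for a finite measure $\mu$ on $B_1(0^n)$ with $\mu(B_1) > 0$. The symbols $x_{cm}$, $Q$, $\lambda_i$ and $v_i$ are notation introduced here and need not match the notation of Section 4.

```latex
% Center of mass and second-moment form of \mu:
\begin{align*}
x_{cm} \equiv \frac{1}{\mu(B_1)} \int x \, d\mu(x),
\qquad
Q(v,w) \equiv \int \langle x - x_{cm}, v\rangle\, \langle x - x_{cm}, w\rangle \, d\mu(x) .
\end{align*}
% Q is a symmetric nonnegative bilinear form on R^n.  Let \lambda_1 \ge \dots \ge \lambda_n \ge 0
% be its eigenvalues, with orthonormal eigenvectors v_1, \dots, v_n.  Then
\begin{align*}
\inf_{L^k \subseteq \mathbb{R}^n} \int d^2(x, L^k)\, d\mu(x)
  \;=\; \lambda_{k+1} + \dots + \lambda_n ,
\qquad \text{attained at} \quad
L^k = x_{cm} + \mathrm{span}\{v_1, \dots, v_k\} .
\end{align*}
```

Read this way, (1.27) is a bound on the $n-k$ smallest eigenvalues of the second-moment form of $\mu$ in terms of the density drop of $I$ integrated against $\mu$.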

In practice, applying (1.27) to (1.26) is subtle and must be done inductively on scale. Additionally, in order to prove the effective Hausdorff estimates $\lambda^k(S^k_\epsilon \cap B_r) \le C r^k$ we will need to use the Covering Lemma 5.1 to break up the quantitative stratification into appropriate pieces, and we will apply the estimates to these. This decomposition is based on a covering scheme first introduced by the authors in [NVb]. Thus for the purposes of our outline, let us assume we have already proved the Hausdorff estimate $\lambda^k(S^k_\epsilon \cap B_r) \le C r^k$, and use this to be able to apply the rectifiable-Reifenberg in order to conclude the rectifiability of the singular set. This may feel like a cheat; however, it turns out that the Hausdorff estimate will itself be proved by a very similar tactic, though it will also require an inductive argument on scale and use the discrete rectifiable-Reifenberg of Theorem 3.3 in place of the rectifiable-Reifenberg of Theorem 3.4.

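Thus let us choose a ball $B_r$ and let $E \equiv \sup_{B_r} \theta_r(y)$. Let us consider the subset $\tilde S^k_\epsilon \subseteq S^k_\epsilon \cap B_r$ defined by
\[ \tilde S^k_\epsilon \equiv \{ y \in S^k_\epsilon \cap B_r : \theta_0(y) > E - \eta \} , \tag{1.28} \]
where $\eta = \eta(n, K, H, \Lambda, \epsilon)$ will be chosen appropriately later. We will show now that $\tilde S^k_\epsilon$ is rectifiable. Since $\eta$ is fixed and the ball $B_r$ is arbitrary, the rectifiability of all of $S^k_\epsilon$ will follow quickly by an easy covering argument. Thus, let us estimate (1.26) by plugging in (1.27) and the Hausdorff estimate to conclude:
\begin{align*}
r^{-k} \int_{\tilde S^k_\epsilon} \sum_{r_\alpha \le r} D^k(x, r_\alpha)\, d\lambda^k(x)
 &= r^{-k} \int_{\tilde S^k_\epsilon} \sum_{r_\alpha \le r} \Big( \inf_{L^k} r_\alpha^{-2-k} \int_{\tilde S^k_\epsilon \cap B_{r_\alpha}(x)} d^2(y, L^k)\, d\lambda^k(y) \Big)\, d\lambda^k(x) \\
 &\le C_\epsilon r^{-k} \int_{\tilde S^k_\epsilon} \sum_{r_\alpha \le r} \Big( r_\alpha^{-k} \int_{\tilde S^k_\epsilon \cap B_{r_\alpha}(x)} |\theta_{8r_\alpha}(y) - \theta_{r_\alpha}(y)|\, d\lambda^k(y) \Big)\, d\lambda^k(x) \\
 &= C_\epsilon r^{-k} \sum_{r_\alpha \le r} r_\alpha^{-k} \int_{\tilde S^k_\epsilon} \lambda^k\big(\tilde S^k_\epsilon \cap B_{r_\alpha}(y)\big)\, |\theta_{8r_\alpha}(y) - \theta_{r_\alpha}(y)|\, d\lambda^k(y) \\
 &\le C_\epsilon r^{-k} \int_{S^k_\epsilon \cap B_r(x)} \sum_{r_\alpha \le r} |\theta_{8r_\alpha}(y) - \theta_{r_\alpha}(y)|\, d\lambda^k(y) \\
 &\le C_\epsilon r^{-k} \int_{\tilde S^k_\epsilon} |\theta_{8r}(y) - \theta_0(y)|\, d\lambda^k(y) \\
 &\le C_\epsilon r^{-k} \lambda^k(\tilde S^k_\epsilon) \cdot \eta < \delta^2 , \tag{1.29}
\end{align*}
where in the last line we have chosen $\eta = \eta(n, K, H, \Lambda, \epsilon)$ so that the estimate is less than the required $\delta$ from the rectifiable-Reifenberg. Thus we can apply the rectifiable-Reifenberg of Theorem 3.4 in order to conclude the rectifiability of the set $\tilde S^k_\epsilon$, which in particular proves that $S^k_\epsilon$ is itself rectifiable, as claimed.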
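The passage in (1.29) from the sum over scales to a single density drop is a telescoping argument. A sketch follows, assuming the stationary Euclidean case (so that $\theta$ is monotone in $r$) and that $r = r_{\alpha_0}$ is itself one of the dyadic scales $r_\alpha = 2^{-\alpha}$. The index $\beta$ and the explicit factor $3$ are specific to this illustration and are absorbed into the constant in the text.

```latex
% Since 8 r_\alpha = r_{\alpha-3}, each summand splits into three consecutive dyadic drops:
\begin{align*}
\theta_{8r_\alpha}(y) - \theta_{r_\alpha}(y)
  \;=\; \sum_{\beta=\alpha-3}^{\alpha-1} \big( \theta_{r_\beta}(y) - \theta_{r_{\beta+1}}(y) \big) .
\end{align*}
% A fixed drop \theta_{r_\beta} - \theta_{r_{\beta+1}} occurs only for
% \alpha \in \{\beta+1, \beta+2, \beta+3\}, so summing over \alpha \ge \alpha_0 counts each
% nonnegative drop at most three times:
\begin{align*}
\sum_{r_\alpha \le r} \big| \theta_{8r_\alpha}(y) - \theta_{r_\alpha}(y) \big|
  \;\le\; 3 \sum_{\beta \ge \alpha_0 - 3} \big( \theta_{r_\beta}(y) - \theta_{r_{\beta+1}}(y) \big)
  \;=\; 3\, \big( \theta_{8r}(y) - \theta_0(y) \big) .
\end{align*}
```

This is the point at which the summability over scales in (1.24) enters: a sum over infinitely many scales is controlled by one total density drop.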

2. PRELIMINARIES

2.1. Integral Varifolds and Mean Curvature. Let us begin by giving a very brief introduction to integral varifolds, the first variation, and their relationship to mean curvature. See [DL12] for a more complete introduction. We begin with the definition of an integral varifold:

Definition 2.1. An integral varifold $I^m$ in a Riemannian manifold $(M^n, g)$ is a pair $(I_S, \theta)$, where $I_S$ is an $m$-dimensional rectifiable set and $\theta : I_S \to \mathbb{N}_+$ is a measurable mapping into the positive integers.

Remark 2.1. Though this definition is not the same as Allard's original, because of his regularity theorem [All72] it is equivalent in the context of this paper.

We associate to $I$ the measure $\mu_I \equiv \theta\, d\lambda^m_S$, where $\lambda^m_S$ is the $m$-dimensional Hausdorff measure restricted to $I_S$. Then the total mass of $I$ can be denoted
\[ |I| = I(M) = \int_M d\mu_I . \tag{2.1} \]
To define the mean curvature of an integral varifold we begin by recalling the notion of the first variation. Given a smooth vector field $X$ on $M$ with compact support, let $\phi^X_t$ be the family of diffeomorphisms generated by $X$; then we can define the first variation of $I$ as the distribution
\[ \delta I(X) \equiv \frac{d}{dt}\Big|_{t=0} |\phi_t^* I| = \int_M \mathrm{div}_I X \, d\mu_I , \tag{2.2} \]
where $\mathrm{div}_I(X)$ is the divergence of $X$ along the tangent space of $I_S$, which is well defined a.e. We can now say that $I$ has mean curvature bounded by $H$ on an open set $U \subseteq M$ if for every smooth vector field whose support is in $U$ we have that
\[ \delta I(X) \le H \int |X|\, d\mu_I . \tag{2.3} \]

Notice that if $I$ is an integral varifold with bounded mean curvature in $U$, then necessarily $I$ has no boundary in $U$, see [DL12]. We say that $I$ is a stationary integral varifold on $U$ if the mean curvature is bounded by 0, and in particular we have that $I$ is then a critical point of the area functional with respect to variations in $U$. We say that $I$ is a minimizer if $I$ minimizes the area functional with respect to variations in $U$.

2.2. Bounded Mean Curvature and Monotonicity. For the purposes of this paper, the most important property of an integral varifold with bounded mean curvature is the existence of a monotone quantity at each point. For simplicity let us first consider the case of a stationary integral varifold $I^m$ in $\mathbb{R}^n$. Then for $x \in I$ and $r > 0$ we can consider the density function
\[ \theta_r(x) = r^{-m} \mu_I\big(B_r(x)\big) . \tag{2.4} \]

Then we have that $\theta_r(x)$ is monotone increasing with
\[ \theta_r(x) - \theta_s(x) = \int_{B_r(x) \setminus B_s(x)} d_x^{-m} \langle N_y I, n_x \rangle^2 \, d\mu_I(y) , \tag{2.5} \]
where $N_y I$ is the orthogonal complement of the tangent space of $I$ at $y$, which is defined $\mu_I$-a.e., $d_x(y) = |x - y|$ is the distance function to $x$, and $n_x(y) = \frac{y - x}{|y - x|}$ is the normal vector field from $x$. From this it is possible to check that $\theta_r(x)$ is independent of $r$ iff $I$ is 0-symmetric, see [Alm66]. More generally, if $\theta_r(x) = \theta_s(x)$ then $I$ is 0-symmetric on the annulus $A_{s,r}(x)$. In Section 2.3 we will recall a quantitative version of this statement introduced in [CN13b]. Motivated by this, we see that what we are really interested in is the amount the density drops from one scale to the next, and thus we define

\[ W_{s,r}(x) \equiv \theta_r(x) - \theta_s(x) \ge 0 . \tag{2.6} \]

Oftentimes we will want to enumerate our choice of scale, so let us define the scales $r_\alpha \equiv 2^{-\alpha}$ for $\alpha \ge 0$, and the corresponding mass density drop:

\[ W_\alpha(x) \equiv W_{r_\alpha, r_{\alpha-3}}(x) \equiv \theta_{r_{\alpha-3}}(x) - \theta_{r_\alpha}(x) \ge 0 . \tag{2.7} \]

In the general case when $M \neq \mathbb{R}^n$ and the mean curvature is only bounded, essentially the same statements may be made; however, $\theta_r(x)$ is now only almost monotone, meaning that $e^{Cr}\theta_r(x)$ is monotone for some constant $C = C(n, K, H)$ which depends only on the geometry of $M$ and the mean curvature bound $H$, see [DL12]. From this one can prove that at every point, every tangent cone is 0-symmetric, which is the starting point for the dimension estimate (1.3) of Federer. In Section 2.3 we discuss quantitative versions of this point, first introduced in [CN13b] and used in this paper as well, and also generalizations which involve higher degrees of symmetry. These points were first used in [CN13b] to prove Minkowski estimates on the quantitative stratification of a stationary varifold. They will also play a role in our arguments, though in a different manner.
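To make the density drop concrete, here is a minimal numerical sketch for a toy example that is not taken from the paper: the stationary 1-varifold in $\mathbb{R}^2$ given by the union of the two coordinate axes with multiplicity one. The function names `density`, `scale` and `density_drop` are illustrative only. At the crossing point the density is constant in $r$, while at a nearby point $x = (a, 0)$ it is monotone and drops only across the few scales comparable to $a$.

```python
import math


def density(a: float, r: float) -> float:
    """theta_r(x) = r^{-1} * mu(B_r(x)) at x = (a, 0), for the union of the two axes in R^2."""
    length_on_x_axis = 2.0 * r  # the x-axis passes through the centre of the ball
    # The y-axis lies at distance |a| from x, so it meets B_r(x) in a chord or not at all.
    length_on_y_axis = 2.0 * math.sqrt(r * r - a * a) if r > abs(a) else 0.0
    return (length_on_x_axis + length_on_y_axis) / r


def scale(alpha: int) -> float:
    """r_alpha = 2^{-alpha}."""
    return 2.0 ** (-alpha)


def density_drop(a: float, alpha: int) -> float:
    """W_alpha(x) = theta_{r_{alpha-3}}(x) - theta_{r_alpha}(x), as in (2.7)."""
    return density(a, scale(alpha - 3)) - density(a, scale(alpha))


a = 2.0 ** (-10)  # distance of x from the crossing point
alphas = range(3, 40)
drops = [density_drop(a, alpha) for alpha in alphas]
total_drop = sum(drops)
# Scales at which the drop exceeds a fixed threshold; there are only finitely many.
large_drops = [alpha for alpha, w in zip(alphas, drops) if w > 0.1]
print(total_drop, large_drops)
```

Since each $W_\alpha$ is a sum of three consecutive single-scale drops, the sum over $\alpha$ telescopes and is bounded by three times the total drop $\theta_{r_0}(x) - \theta_0(x)$. This is the elementary mechanism behind the finiteness of the set of bad scales in the quantitative 0-symmetry statement recalled in Section 2.3.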

2.3. Quantitative 0-Symmetry and Cone Splitting. In this subsection we review some of the quantitative symmetry and splitting results of [CN13b], in particular those which will play a role in this paper. The first result we will discuss acts as an effective formulation of the fact that every tangent cone is 0-symmetric. Namely, the quantitative 0-symmetry of [CN13b] says that for each $\epsilon > 0$ and each point, away from a finite number of scales every ball looks $(0, \epsilon)$-symmetric. To be precise, let us consider the radii $r_\alpha \equiv 2^{-\alpha}$, and then the statement is the following:

Theorem 2.2 (Quantitative 0-Symmetry [CN13b]). Let $I^m$ be an integral varifold on $M^n$ satisfying the curvature bound (1.13), the mean curvature bound (1.14), and the mass bound $I(B_2(p)) \le \Lambda$. Then for each $\epsilon > 0$ the following hold:

(1) There exists $\delta(n, K, \Lambda, H, \epsilon) > 0$ such that for each $x \in B_1(p)$ and $0 < r \le r(n, K, \Lambda, \epsilon)$, if we have $|\theta_r(x) - \theta_{\delta r}(x)| < \delta$, then $B_r(x)$ is $(0, \epsilon)$-symmetric.
(2) For each $x \in B_1(p)$ there exists a finite number of scales $\{\alpha_1, \ldots, \alpha_N\} \subseteq \mathbb{N}$ with $N \le N(n, K, \Lambda, H, \epsilon)$ such that for $r \notin (r_{\alpha_j}/2, 2r_{\alpha_j})$ we have that $B_r(x)$ is $(0, \epsilon)$-symmetric.

Remark 2.2. In [CN13b] the result is stated for a stationary varifold; however, the verbatim (quick) proof holds just as well for integral varifolds with bounded mean curvature.

Another technical tool that played an important role in [CN13b] was that of cone splitting. This will be used in this paper when proving the existence of unique tangent planes of symmetry for the singular set, so we will discuss it here. In short, cone splitting is the idea that multiple 0-symmetries add up to give rise to a k-symmetry. To state it precisely let us give a careful definition of the notion of independence of a collection of points:

Definition 2.3. We say a collection of points $\{x_1, \ldots, x_\ell\} \subseteq \mathbb{R}^n$ is independent if they are linearly independent. We say the collection is $\tau$-independent if $d(x_{k+1}, \operatorname{span}\{x_1, \ldots, x_k\}) > \tau$ for each $k$. If $\{x_1, \ldots, x_\ell\} \subseteq M$ then we say the collection is $\tau$-independent with respect to $x$ if $d(x, x_j) < \operatorname{inj}(x)$ and the collection is $\tau$-independent when written in exponential coordinates at $x$.
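The Euclidean part of this definition can be checked mechanically. The following is a minimal sketch, assuming $\tau \ge 0$ and points given as coordinate lists; the helper name `is_tau_independent` is illustrative and not from the paper. It removes from each new point its components along an orthonormal basis of the span of the previous points and compares the length of what remains with $\tau$.

```python
import math
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def is_tau_independent(points: Sequence[Sequence[float]], tau: float) -> bool:
    """Check d(x_{k+1}, span{x_1, ..., x_k}) > tau for each k (linear span, tau >= 0)."""
    basis: List[List[float]] = []  # orthonormal basis of the span of the points seen so far
    for x in points:
        residual = list(x)
        for e in basis:
            c = dot(residual, e)
            residual = [r - c * b for r, b in zip(residual, e)]
        distance = math.sqrt(dot(residual, residual))  # distance from x to the current span
        if distance <= tau:
            return False
        basis.append([r / distance for r in residual])
    return True


print(is_tau_independent([(1.0, 0.0, 0.0), (1.0, 0.1, 0.0)], 0.05))
print(is_tau_independent([(1.0, 0.0, 0.0), (1.0, 0.1, 0.0)], 0.5))
```

The two calls use the same pair of points: they are linearly independent, but the second point lies at distance only 0.1 from the line spanned by the first, so the pair is $\tau$-independent only for $\tau < 0.1$.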

Now we are in a position to state the effective cone splitting of [CN13b]:

Theorem 2.4 (Cone Splitting [CN13b]). Let $I^m$ be an integral varifold on $M^n$ satisfying the curvature bound (1.13), the mean curvature bound (1.14), the mass bound $I(B_2(p)) \le \Lambda$, and let $\epsilon, \tau > 0$ be fixed. Then there exists $\delta(n, \Lambda, \epsilon, \tau) > 0$ such that if $K + H < \delta$ and $x_1, \ldots, x_k \in B_1(p)$ are such that

(1) $B_2(x_j)$ are $(0, \delta)$-symmetric,
(2) $\{x_1, \ldots, x_k\}$ are $\tau$-independent at $p$,
then $B_1(p)$ is $(k, \epsilon)$-symmetric.

Remark 2.3. The assumption $K + H < \delta$ is of little consequence, since this just means focusing the estimates on balls of sufficiently small radius after rescaling.

We end with the following, which one can view as a quantitative form of dimension reduction. The proof is standard and can be accomplished by a contradiction argument:

Theorem 2.5 (Quantitative Dimension Reduction). Let $I^m$ be an integral varifold on $M^n$ satisfying the curvature bound (1.13), the mean curvature bound (1.14), and the mass bound $I(B_2(p)) \le \Lambda$. Then for each $\epsilon > 0$ there exists $\delta(n, \Lambda, \epsilon), r(n, \Lambda, \epsilon) > 0$ such that if $K + H < \delta$ and $B_2(p)$ is $(k, \delta)$-symmetric with respect to some k-plane $V^k$, then for each $x \in B_1(p) \setminus B_\epsilon(V^k)$ we have that $B_r(x)$ is $(k + 1, \epsilon)$-symmetric.

2.4. $\epsilon$-regularity for Minimizing Hypersurfaces. In this subsection we quickly review the $\epsilon$-regularity theorem of [CN13b], which itself follows quickly from the difficult work of [Sim68],[Fed69] after an easy contradiction argument. This will be our primary technical tool in upgrading the structural results on varifolds with bounded mean curvature to the regularity results for area minimizing hypersurfaces. The main theorem of this subsection is the following:

Theorem 2.6 ($\epsilon$-Regularity [Sim68],[Fed69],[CN13b]). Let $I^{n-1} \subseteq B_2$ be a minimizing hypersurface satisfying the bounds (1.13), and the mass bound $I(B_2(p)) \le \Lambda$. Then there exists $\epsilon(n, \Lambda) > 0$ such that if $K, H < \epsilon$ and $B_2(p)$ is $(n - 7, \epsilon)$-symmetric, then

\[ r_I(p) \ge 1 . \]

Remark 2.4. The assumption $K, H < \epsilon$ is of little consequence, since this just means focusing the estimates on balls of sufficiently small radius after rescaling.

2.5. Hausdorff, Minkowski, and packing Content. In this subsection we give a brief review of the notions of Hausdorff, Minkowski, and packing content. We will also use this to recall the Hausdorff measure. The results of this subsection are completely standard, but this gives us an opportunity to introduce some notation for the paper. Let us begin with the notions of content:

Definition 2.7. Given a set $S \subseteq \mathbb{R}^n$ and $r > 0$ we define the following:
(1) The k-dimensional Hausdorff r-content of $S$ is given by
\[ \lambda^k_r(S) \equiv \inf\Big\{ \sum \omega_k r_i^k : S \subseteq \bigcup B_{r_i}(x_i) \text{ and } r_i \le r \Big\} . \tag{2.8} \]
(2) The k-dimensional Minkowski r-content of $S$ is given by
\[ m^k_r(S) \equiv \inf\Big\{ \sum \omega_k r^k : S \subseteq \bigcup B_r(x_i) \Big\} . \tag{2.9} \]
(3) The k-dimensional packing r-content of $S$ is given by
\[ p^k_r(S) \equiv \sup\Big\{ \sum \omega_k r_i^k : x_i \in S ,\ \{B_{r_i}(x_i)\} \text{ are disjoint, and } r_i \le r \Big\} . \tag{2.10} \]

These definitions make sense for any $k \in [0, \infty)$, though in this paper we will be particularly interested in integer valued $k$. Notice that if $S$ is a compact set then $\lambda^k_r(S), m^k_r(S) < \infty$ for any $r > 0$, and that we always have the relations
\[ \lambda^k_r(S) \lesssim m^k_r(S) \lesssim p^k_r(S) . \tag{2.11} \]
In particular, bounding the Hausdorff content is less powerful than bounding the Minkowski content, which is itself less powerful than bounding the packing content. Notice that $\lambda^k_r, m^k_r, p^k_r$ are honest contents in that they are finitely additive measures. In particular, integration with respect to bounded functions is well defined. Let us also note that $m^k_r(S)$ is uniformly equivalent to $r^{k-n} \mathrm{Vol}(B_r(S))$ in the sense that there exists a dimensional constant $C(n) > 0$ such that
\[ C^{-1} r^{k-n}\, \mathrm{Vol}(B_r(S)) \le m^k_r(S) \le C r^{k-n}\, \mathrm{Vol}(B_r(S)) . \tag{2.12} \]

In this paper we will be mostly interested in content estimates, because these are the most effective estimates. However, since it is classical, let us go ahead and use the Hausdorff content to define a measure. To accomplish this, let us more generally observe that if $r \le r'$ then we have the monotone relations:
\[ \lambda^k_r(S) \ge \lambda^k_{r'}(S) , \qquad m^k_r(S) \ge m^k_{r'}(S) , \qquad p^k_r(S) \le p^k_{r'}(S) . \tag{2.13} \]
In particular, in each case there is a well defined limit
\[ \lambda^k_0(S) \equiv \lim_{r \to 0} \lambda^k_r(S) , \qquad m^k_0(S) \equiv \lim_{r \to 0} m^k_r(S) , \qquad p^k_0(S) \equiv \lim_{r \to 0} p^k_r(S) . \tag{2.14} \]

The case of the Hausdorff content $\lambda^k_0$ is particularly special, as it defines a genuine measure. In the case of the Minkowski and packing contents, these only define pre-measures, though we can use the standard procedure to define a measure out of them. Since we will have no need for it we will only consider the Hausdorff measure:

Definition 2.8. Given a set $S \subseteq \mathbb{R}^n$ we define its k-dimensional Hausdorff measure by $\lambda^k(S) \equiv \lambda^k_0(S)$.

Finally, let us use the contents to define the notion of dimension. To accomplish this notice that if $k \le k'$ then we have the relations
\[ \lambda^k_r(S) \ge \lambda^{k'}_r(S) , \qquad m^k_r(S) \ge m^{k'}_r(S) , \tag{2.15} \]
for any $r \ge 0$. In particular, as a function of $k$ we have that the contents of a set $S$ are monotone decreasing functions. Also observe that if $S \subseteq \mathbb{R}^n$ and if $k > n$, then
\[ \lambda^k_0(S) = m^k_0(S) = 0 . \tag{2.16} \]
Now we define dimension:

Definition 2.9. Given a set $S \subseteq \mathbb{R}^n$ we define its Hausdorff and Minkowski dimension (or box-dimension) by
\[ \dim_H S \equiv \inf\big\{ k \ge 0 : \lambda^k_0(S) = 0 \big\} , \qquad \dim_M S \equiv \inf\big\{ k \ge 0 : m^k_0(S) = 0 \big\} . \tag{2.17} \]
As an easy example consider the rationals $\mathbb{Q}^n \subseteq \mathbb{R}^n$. Then it is a worthwhile exercise to check that $\dim_H \mathbb{Q}^n = 0$, while $\dim_M \mathbb{Q}^n = n$.
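The Minkowski half of this exercise can be illustrated numerically in dimension $n = 1$. The sketch below is an illustration only, and the function name `cells_hit_by_rationals` is hypothetical. It counts how many of the intervals $[j/N, (j+1)/N)$ contain a rational $p/q \in [0, 1)$ with $q$ below a given bound. Once the bound reaches $N$, every interval is hit, so a cover of $\mathbb{Q} \cap [0,1)$ by balls of fixed radius $r \approx 1/N$ needs on the order of $r^{-1}$ balls, which is why $m^k_r$ cannot tend to zero for $k < 1$.

```python
def cells_hit_by_rationals(max_denominator: int, cells: int) -> int:
    """Number of intervals [j/cells, (j+1)/cells) in [0, 1) containing some p/q with q <= max_denominator."""
    hit = set()
    for q in range(1, max_denominator + 1):
        for p in range(q):
            hit.add((p * cells) // q)  # index of the interval containing p/q, in exact integer arithmetic
    return len(hit)


for cells in (16, 64, 128):
    print(cells, cells_hit_by_rationals(cells, cells))
```

By contrast, the Hausdorff content allows the radii to vary from ball to ball, so a countable set can be covered by balls whose radii shrink geometrically along an enumeration; this is the reason the Hausdorff dimension of $\mathbb{Q}^n$ is zero.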

A very important notion related to measures is the density at a point. Although this is standard, for completeness we briefly recall the definition of Hausdorff density, and refer the reader to [Mat95, chapter 6] for more on this subject.

Definition 2.10. Given a set $S \subset \mathbb{R}^n$ which is $\lambda^k$-measurable, and $x \in \mathbb{R}^n$, we define the k-dimensional upper and lower density of $S$ at $x$ by
\[ \theta^{\star k}(S, x) = \limsup_{r \to 0} \frac{\lambda^k(S \cap B_r(x))}{\omega_k r^k} , \qquad \theta^k_\star(S, x) = \liminf_{r \to 0} \frac{\lambda^k(S \cap B_r(x))}{\omega_k r^k} . \tag{2.18} \]
In the following, we will use the fact that for almost any point in a set with finite $\lambda^k$-measure, the density is bounded from above and below.

Proposition 2.11 ([Mat95]). Let $S \subset \mathbb{R}^n$ be a set with $\lambda^k(S) < \infty$. Then for k-a.e. $x \in S$:
\[ 2^{-k} \le \theta^{\star k}(S, x) \le 1 , \tag{2.19} \]
while for k-a.e. $x \in \mathbb{R}^n \setminus S$
\[ \theta^{\star k}(S, x) = 0 . \tag{2.20} \]

2.6. Examples. In this subsection we present a few examples which motivate the sharpness of our results.

2.6.1. The Simons Cone and Sharp Estimates. In order to study the sharpness of the estimates of Theorem 1.8 we study some examples. One may use either the Simons cone or the Lawson cone for this analysis. The Simons cone $C \subseteq \mathbb{R}^8$ is a cone over the surface
\[ S^3\Big(\tfrac{1}{\sqrt{2}}\Big) \times S^3\Big(\tfrac{1}{\sqrt{2}}\Big) \subseteq S^7 . \tag{2.21} \]
It has been shown [BDGG69] that $C$ is an area minimizing cone. It is easy to check that for $x \in C$ we have that $|A|(x) = c|x|^{-1}$, where $A$ is the second fundamental form and $c$ is some universal constant. In particular, we get that $|A| \in L^7_{weak}$, but $|A| \notin L^7_{loc}$, showing that the estimates of Theorem 1.8 are sharp.

2.6.2. Rectifiable-Reifenberg Example I. Let us begin with an easy example, which shows that the rectifiability conclusion of Theorem 3.4 is sharp. That is, one cannot hope for better structural results under the hypothesis. Indeed, consider any k-dimensional subspace $V^k \subseteq \mathbb{R}^n$, and let $S \subseteq V^k \cap B_2(0^n)$ be an arbitrary measurable subset. Then clearly $D(x, r) \equiv 0$ for each $x$ and $r > 0$, and thus the hypotheses of Theorem 3.4 are satisfied; however, $S$ clearly need not be better than rectifiable. In the next example we shall see that $S$ need not even come from a single rectifiable chart, as it does in this example.

2.6.3. Rectifiable-Reifenberg Example II. With respect to the conclusions of Theorem 3.4 there are two natural questions regarding how sharp they are. First, is it possible to obtain more structure from the set $S$ than rectifiable? In particular, Theorem 3.2 contains topological conclusions about the set; is it possible to make such conclusions in the context of Theorem 3.4? In the last example we saw this is not the case. A second question is then whether we can at least find a single rectifiable chart which covers the whole set $S$. This example, taken from [DT12, counterexample 12.4], shows that the answer to this question is negative as well.

To build our example let us first consider a unit circle $S^1 \subseteq \mathbb{R}^3$. Let $M^2 \supset S^1$ be a smooth Möbius strip around this circle, and let $S_\epsilon \subseteq M^2 \cap B_\epsilon(S^1) \equiv M^2_\epsilon$ be an arbitrary $\lambda^2$-measurable subset of the Möbius strip, contained in a small neighborhood of the $S^1$. In particular, $\mathrm{Area}(S_\epsilon) \le C\epsilon \to 0$ as $\epsilon \to 0$. It is not hard, though potentially a little tedious, to check that the assumptions of Theorem 3.4 hold for $\delta \to 0$ as $\epsilon \to 0$.

However, we have learned two points from this example. First, since $S_\epsilon$ was an arbitrary measurable subset of a two dimensional manifold, we have that it is 2-rectifiable; however, that is the most which may be said of $S_\epsilon$. That is, structurally speaking we cannot hope to say better than 2-rectifiable about the set $S_\epsilon$. More than that, since $S_\epsilon$ is a subset of the Möbius strip, we see that even though $S_\epsilon$ is rectifiable, we cannot even cover $S_\epsilon$ by a single chart from $B_1(0^2)$, as there are topological obstructions; see [DT12] for more on this.

2.7. The Classical Reifenberg Theorem. In this section we recall the classical Reifenberg Theorem, as well as some more recent generalizations. The Reifenberg theorem gives criteria on a closed subset $S \subseteq B_2 \subseteq \mathbb{R}^n$ which determine when $S \cap B_1$ is bi-Hölder to a ball $B_1(0^k)$ in a smaller dimensional Euclidean space. The criterion itself is based on the existence of good best approximating subspaces at each scale. We start by recalling the Hausdorff distance.

Definition 2.12. Given two sets $A, B \subseteq \mathbb{R}^n$, we define the Hausdorff distance between these two by

dH(A, B) = inf {r ≥ 0 s.t. A ⊂ Br (B) and B ⊂ Br (A)} . (2.22)

Recall that dH is a distance on closed sets, meaning that dH(A, B) = 0 implies A = B.
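For finite point sets the infimum in (2.22) reduces to the larger of the two one-sided distances, which makes the definition easy to experiment with. The following is a minimal sketch under that assumption (finite, nonempty sets given as lists of coordinate tuples); the function names are illustrative, and `math.dist` requires Python 3.8 or later.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, ...]


def one_sided_distance(A: Sequence[Point], B: Sequence[Point]) -> float:
    """Smallest r >= 0 such that A lies in the closed r-neighbourhood of B."""
    return max(min(math.dist(a, b) for b in B) for a in A)


def hausdorff_distance(A: Sequence[Point], B: Sequence[Point]) -> float:
    """d_H(A, B) for finite nonempty sets: the larger of the two one-sided distances."""
    return max(one_sided_distance(A, B), one_sided_distance(B, A))


A = [(0.0, 0.0), (1.0, 0.0)]
B = [(0.0, 0.0)]
print(one_sided_distance(B, A), one_sided_distance(A, B), hausdorff_distance(A, B))
```

The example shows the asymmetry of the one-sided condition: $B$ is contained in $A$, so $B \subseteq B_r(A)$ for every $r > 0$, while $A \subseteq B_r(B)$ only for $r > 1$. Only the two-sided condition defines a distance.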

The classical Reifenberg theorem says the following:

Theorem 2.13 (Reifenberg Theorem [Rei60, Sim]). For each $0 < \alpha < 1$ and $\epsilon > 0$ there exists $\delta(n, \alpha, \epsilon) > 0$ such that the following holds. Assume $0^n \in S \subseteq B_2 \subseteq \mathbb{R}^n$ is a closed subset, and that for each $B_r(x) \subseteq B_1$ with $x \in S$ we have
\[ \inf_{L^k} d_H\big(S \cap B_r(x), L^k \cap B_r(x)\big) < \delta r , \tag{2.23} \]
where the inf is taken over all k-dimensional affine subspaces $L^k \subseteq \mathbb{R}^n$. Then there exists $\phi : B_1(0^k) \to S$ which is a $C^\alpha$ bi-Hölder homeomorphism onto its image with $[\phi]_{C^\alpha}, [\phi^{-1}]_{C^\alpha} < \epsilon$ and $S \cap B_1 \subseteq \phi(B_1(0^k))$.

Remark 2.5. In fact, one can prove a little more. In particular, under the hypothesis of the previous theorem, there exists a closed subset $S' \subset \mathbb{R}^n$ such that $S' \cap B_1(0) = S \cap B_1(0)$ and which is homeomorphic to a k-dimensional subspace $0^n \in T_0 \subseteq \mathbb{R}^n$ via the $C^\alpha$ bi-Hölder homeomorphism $\phi : T_0 \to S'$. Moreover, $|\phi(x) - x| \le C(n)\delta$ for all $x \in T_0$ and $\phi(x) = x$ for all $x \in T_0 \setminus B_2(0)$.

One can paraphrase the above to say that if $S$ can be well approximated on every ball by a subspace in the $L^\infty$-sense, then $S$ must be bi-Hölder to a ball in Euclidean space.

Let us also mention that there are several more recent generalizations of the classical Reifenberg theorem. In [Tor95], the author shows that a strengthened version of (2.23) allows one to improve bi-Hölder to bi-Lipschitz. Unfortunately, for the applications of this paper the hypotheses of [Tor95] are much too restrictive. We will require a weaker condition than in [Tor95], which is more integral in nature, see Theorem 3.2. In exchange, we will improve the bi-Hölder of the classical Reifenberg to $W^{1,p}$.

We will also need a version of the classical Reifenberg which only assumes that the subset $S$ is contained near a subspace, not conversely that the subspace is also contained near $S$. In exchange, we will only conclude the set is rectifiable. A result in this direction was first proved in [DT12], but again the hypotheses are too restrictive for the applications of this paper, and additionally there is a topological assumption necessary for the results of [DT12], which is not reasonable in the context of this paper. We will see how to appropriately drop this assumption in Theorem 3.4.

3. THE $W^{1,p}$-REIFENBERG THEOREM

In this section we recall some Reifenberg-type theorems first introduced in [NVa]. In [NVa] we focused our attention on proving the rectifiable-Reifenberg of Theorem 3.4, and in this section we will focus our attention on proving the discrete Reifenberg of Theorem 3.3. The proofs of the two results are very similar.

3.1. Statement of Main $W^{1,p}$-Reifenberg and rectifiable-Reifenberg Results. Before turning to the statements of the theorems, let us introduce some definitions in order to keep the statements as clean and intuitive as possible.

Definition 3.1. Let $\mu$ be a measure on $B_2$ with $r > 0$ and $k \in \mathbb{N}$. Then we define the k-dimensional displacement by
\[ D^k_\mu(x, r) \equiv \inf_{L^k} r^{-(k+2)} \int_{B_r(x)} d^2(y, L^k) \, d\mu(y) , \tag{3.1} \]
if $\mu(B_r(x)) \ge \epsilon_k r^k \equiv \omega_k 20^{-k} r^k$, and $D^k_\mu(x, r) \equiv 0$ otherwise, where the inf is taken over all k-dimensional affine subspaces $L^k \subseteq \mathbb{R}^n$. If $S \subseteq B_2$, then we can define its k-displacement $D^k_S(x, r)$ by associating to $S$ the k-dimensional Hausdorff measure $\lambda^k_S$ restricted to $S$.

Remark 3.1. One can replace $\epsilon_k$ by any smaller lower bound, and the proofs and statements will all continue to hold.
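For a finite sum of weighted Dirac masses the infimum in (3.1) can be computed in closed form: the best affine subspace passes through the center of mass, and the minimal value is the sum of the smallest $n - k$ eigenvalues of the weighted scatter matrix. The sketch below is a toy illustration for $k = 1$, $n = 2$ only, where the single relevant eigenvalue of the $2 \times 2$ scatter matrix has an explicit formula. The names `Atom`, `EPS_1` and `displacement` are hypothetical, and positive weights and $r > 0$ are assumed.

```python
import math
from typing import List, Tuple

Atom = Tuple[float, float, float]  # (weight, x-coordinate, y-coordinate)

OMEGA_1 = 2.0           # volume of the unit ball in R^1
EPS_1 = OMEGA_1 / 20.0  # epsilon_k = omega_k * 20^{-k} with k = 1


def displacement(atoms: List[Atom], center: Tuple[float, float], r: float) -> float:
    """D^1_mu(center, r) for mu = sum of weight * delta_(x, y), with the mass threshold of (3.1)."""
    inside = [(w, x, y) for (w, x, y) in atoms
              if math.hypot(x - center[0], y - center[1]) < r]
    mass = sum(w for w, _, _ in inside)
    if mass < EPS_1 * r:
        return 0.0  # balls carrying too little measure are ignored by definition
    cx = sum(w * x for w, x, _ in inside) / mass
    cy = sum(w * y for w, _, y in inside) / mass
    sxx = sum(w * (x - cx) ** 2 for w, x, _ in inside)
    syy = sum(w * (y - cy) ** 2 for w, _, y in inside)
    sxy = sum(w * (x - cx) * (y - cy) for w, x, y in inside)
    # Smaller eigenvalue of the scatter matrix [[sxx, sxy], [sxy, syy]].
    smallest_eigenvalue = 0.5 * (sxx + syy) - math.hypot(0.5 * (sxx - syy), sxy)
    return max(smallest_eigenvalue, 0.0) / r ** 3


flat = [(0.5, -0.5, -0.5), (0.5, 0.0, 0.0), (0.5, 0.5, 0.5)]
spread = [(1.0, 0.5, 0.0), (1.0, -0.5, 0.0), (1.0, 0.0, 0.1), (1.0, 0.0, -0.1)]
print(displacement(flat, (0.0, 0.0), 1.0), displacement(spread, (0.0, 0.0), 1.0))
```

The first configuration is supported on a line, so its displacement vanishes up to rounding; the second is spread in both directions and has a strictly positive displacement. A ball carrying less mass than $\epsilon_1 r$ returns zero regardless of how its mass is arranged, as in the definition.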

Remark 3.2. Notice that the definitions are scale invariant. In particular, if we rescale $B_r \to B_1$ and let $\tilde S$ be the induced set, then $D^k_S(x, r) \to D^k_{\tilde S}(x, 1)$.

Remark 3.3. Notice the monotonicity given by the following: If $\mu' \le \mu$, then $D^k_{\mu'}(x, r) \le D^k_\mu(x, r)$.

Remark 3.4. It is immediate to see from the definition that, up to dimensional constants, $D^k_\mu(x, r)$ is controlled on both sides by $D^k_\mu(x, r/2)$ and $D^k_\mu(x, 2r)$. In particular, if $\mu(B_r(x)) \ge 2^k \epsilon_k r^k$, then for all $y \in S \cap B_r(x)$, $D^k_\mu(x, r) \le 2^{k+2} D^k_\mu(y, 2r)$. As a corollary we have the estimate
\[ D^k_\mu(x, r) \le 2^{k+2} \fint_{S \cap B_r(x)} D^k_\mu(y, 2r) \, d\mu(y) . \tag{3.2} \]

Before introducing the results which are really needed for the paper, it is worth mentioning the $W^{1,p}$-Reifenberg theorem obtained in [NVa]. This is a natural generalization of the Reifenberg theorem and gives intuition and motivation for the rest of the statements, which are essentially more complicated versions of it. We do not prove the theorem in this paper; instead we refer the reader to [NVa, theorem 3.2].

Theorem 3.2 ($W^{1,p}$-Reifenberg). For each $\epsilon > 0$ and $p \in [1, \infty)$ there exists $\delta(n, \epsilon, p) > 0$ such that the following holds. Let $S \subseteq B_2 \subseteq \mathbb{R}^n$ be a closed subset with $0^n \in S$, and assume for each $x \in S \cap B_1$ and $B_r(x) \subseteq B_2$ that
\[ \inf_{L^k} d_H\big(S \cap B_r(x), L^k \cap B_r(x)\big) < \delta r , \tag{3.3} \]
\[ \int_{S \cap B_r(x)} \left( \int_0^r D^k_S(x, s) \, \frac{ds}{s} \right) d\lambda^k(x) < \delta^2 r^k . \tag{3.4} \]
Then the following hold:
(1) There exists a mapping $\phi : \mathbb{R}^k \to \mathbb{R}^n$ which is a $1 + \epsilon$ bi-$W^{1,p}$ map onto its image and such that $S \cap B_1(0^n) = \phi(B_1(0^k)) \cap B_1(0^n)$.
(2) $S \cap B_1(0^n)$ is rectifiable.
(3) For each ball $B_r(x) \subseteq B_1$ with $x \in S$ we have
\[ (1 - \epsilon)\, \omega_k r^k \le \lambda^k(S \cap B_r(x)) \le (1 + \epsilon)\, \omega_k r^k . \tag{3.5} \]

Remark 3.5. The result (2) follows from (1). We get (3) by applying the result of (1) to all smaller balls $B_r(x) \subseteq B_1$, since the assumptions of the theorem hold on these balls as well.

Remark 3.6. Note that a bi-$W^{1,p}$ map is a bi-$C^\alpha$ map; in particular we see that $\phi(B_1(0^k))$ is homeomorphic to the ball $B_1(0^k)$.

Remark 3.7. As is easily seen, the requirement that $S$ is closed is essential for this theorem, and in particular for the lower bound on the Hausdorff measure. As an example, consider any set $S \subseteq \mathbb{R}^k$ which is dense but has zero Hausdorff measure. In the following theorems, we will not be concerned with lower bounds on the measure, and we will be able to drop the closed assumption.

We are going to state another generalization of Reifenberg’s theorem, more discrete in nature, which will be particularly important in the proof of the main theorems of this paper:

Theorem 3.3 (Discrete Reifenberg). [NVa, theorem 3.4] There exists $\delta(n) > 0$ such that the following holds. Let $\{B_{r_s}(x_s)\}_{s \in \mathcal{S}} \subseteq B_2$ be a collection of disjoint balls, and let $\mu \equiv \sum_{s \in \mathcal{S}} \omega_k r_s^k \delta_{x_s}$ be the associated measure. Assume that for each $B_r(x) \subseteq B_2$ with $\mu(B_r(x)) \ge \epsilon_k r^k = \omega_k (r/20)^k$ we have
\[ \int_{B_r(x)} \left( \int_0^r D^k_\mu(x, t) \, \frac{dt}{t} \right) d\mu(x) < \delta^2 r^k . \tag{3.6} \]
Then we have the estimate
\[ \sum_{s \in \mathcal{S}} r_s^k < D(n) . \tag{3.7} \]

Remark 3.8. Instead of (3.6) we may assume the estimate
\[ \int_{B_r(x)} \sum_{r_\alpha \le 2r} D^k_\mu(x, r_\alpha) \, d\mu(x) < \delta^2 r^k . \tag{3.8} \]
In the applications, this will be the more convenient phrasing.

In order to prove rectifiability of the strata, we will also need the following version of Reifenberg’s theorem. The proof of this theorem relies on the same ideas as the discrete-Reifenberg; for this reason we do not report it here and we refer the interested reader to [NVa, theorem 3.3].

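Theorem 3.4 (Rectifiable-Reifenberg). For every $\epsilon > 0$, there exists $\delta(n, \epsilon) > 0$ such that the following holds. Let $S \subseteq B_2 \subseteq \mathbb{R}^n$ be a $\lambda^k$-measurable subset, and assume for each $B_r(x) \subseteq B_2$ with $\lambda^k(S \cap B_r(x)) \ge \epsilon_k r^k$ that
\[ \int_{S \cap B_r(x)} \left( \int_0^r D^k_S(x, s) \, \frac{ds}{s} \right) d\lambda^k(x) < \delta^2 r^k . \tag{3.9} \]
Then the following hold:

(1) For each ball $B_r(x) \subseteq B_1$ with $x \in S$ we have
\[ \lambda^k(S \cap B_r(x)) \le (1 + \epsilon)\, \omega_k r^k . \tag{3.10} \]

(2) $S \cap B_1(0^n)$ is k-rectifiable.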

Remark 3.9. Notice that for the statement of the theorem we do not need control over balls which already have small measure. This will be quite convenient for the applications.

Remark 3.10. Instead of (3.9) we may assume the essentially equivalent estimate
\[ \int_{S \cap B_r(x)} \sum_{r_\alpha \le 2r} D^k_S(x, r_\alpha) \, d\lambda^k(x) < \delta^2 r^k . \tag{3.11} \]
In the applications, this will be the more convenient phrasing.

3.2. Technical Constructions. In this section, we recall for the reader’s convenience some technical lemmas needed for dealing with the relation between best $L^2$ subspaces. We start by recalling some standard facts about affine subspaces in $\mathbb{R}^n$ and Hausdorff distance.

Definition 3.5. Given two linear subspaces $L, V \subseteq \mathbb{R}^n$, we define the Grassmannian distance between these two as

\[ d_G(L, V) = d_H\big(L \cap B_1(0), V \cap B_1(0)\big) . \tag{3.12} \]

Note that if $\dim(L) \neq \dim(V)$, then $d_G(L, V) = 1$.

Remark 3.11. In practice we will want to consider the Grassmannian distance of affine subspaces, not just linear. In this case, we are by definition considering the Grassmannian distance of the induced linear subspaces.

For general subsets of $\mathbb{R}^n$, it is evident that $A \subseteq B_\delta(B)$ does not imply $B \subseteq B_{c\delta}(A)$. However, if $A$ and $B$ are affine spaces, then it is not difficult to see that this property holds. More precisely:

Lemma 3.6. Let $V, W$ be two k-dimensional affine subspaces in $\mathbb{R}^n$, and suppose that $V \cap B_{1/2}(0) \neq \emptyset$. There exists a constant $c(k, n)$ such that if $V \cap B_1(0) \subseteq B_\delta(W \cap B_1(0))$, then $W \cap B_1(0) \subseteq B_{c\delta}(V \cap B_1(0))$. Thus in particular $d_H(V \cap B_1(0), W \cap B_1(0)) \le c\delta$.

Proof. The proof relies on the fact that $V$ and $W$ have the same dimension. Let $x_0 \in V$ be the point of minimal distance from the origin. By assumption, we have that $\|x_0\| \le 1/2$. Let $x_1, \cdots, x_k \in V$ be a sequence of points such that
\[ \|x_i - x_0\| \ge 1/2 \quad \text{and for } i \neq j , \quad \langle x_i - x_0, x_j - x_0 \rangle = 0 . \tag{3.13} \]

In other words, $\{x_i - x_0\}_{i=1}^k$ is an affine base for $V$. Let $\{y_i\}_{i=0}^k \subseteq W \cap B_1(0)$ be such that $d(x_i, y_i) \le \delta$. Then

\[ \|y_i - y_0\| \ge 1/2 - 2\delta \quad \text{and for } i \neq j , \quad \big|\langle y_i - y_0, y_j - y_0 \rangle\big| \le 2\delta + 4\delta^2 . \tag{3.14} \]
This implies that for $\delta \le \delta_0(n)$, $\{y_i - y_0\}_{i=1}^k$ is an affine base for $W$ and for all $y \in W$
\[ y = y_0 + \sum_{i=1}^k \alpha_i (y_i - y_0) , \qquad |\alpha_i| \le 10 \|y - y_0\| . \tag{3.15} \]
Now let $y \in W \cap B_1(0)$ be the point of maximum distance from $V$, and let $\pi$ be the projection onto $V$ and $\pi^\perp$ the projection onto $V^\perp$. Then
\[ d(y, V) = d(y, \pi(y)) = \big\|\pi^\perp(y)\big\| \le \sum_{i=1}^k |\alpha_i| \, \big\|\pi^\perp(y_i - y_0)\big\| \le c'(n, k)\delta . \tag{3.16} \]
Since $y \in B_1(0)$, then $\pi(y) \in V \cap B_{1 + c'\delta}(0)$, and thus $d(y, V \cap B_1(0)) \le 2c'\delta \equiv c\delta$, which proves the claim. □

Next we will see that the Grassmannian distance between two subspaces is enough to control the projections with respect to these planes.

Lemma 3.7. There exists $c(n) > 0$ such that if $V, W$ are k-dimensional linear subspaces with $d_G(V, W) \le \epsilon$, then for every $x \in \mathbb{R}^n$,

\[ \|\pi_V(x) - \pi_W(x)\| \le c(n)\, \epsilon\, \|x\| . \tag{3.17} \]

Proof. By choosing $c(n) > \epsilon_0^{-1}$, we can assume without loss that $\epsilon \le \epsilon_0(n)$. Let $\{e_i\}_{i=1}^k$ be an orthonormal basis for $V$. Then evidently
\[ \pi_V(x) = \sum_{i=1}^k \langle x, e_i \rangle e_i . \tag{3.18} \]

Let also $w_1, \cdots, w_k$ be points in $W$ with $d(e_i, w_i) \le \epsilon$. It is easy to see that $|1 - \|w_i\|| \le \epsilon$. Now let $f_i$ be the result of the Gram-Schmidt orthonormalization process on $w_i$. In particular
\[
\begin{aligned}
f_1 &= \frac{w_1}{\|w_1\|} , \\
f_2' &= w_2 - \langle f_1, w_2 \rangle f_1 , \qquad f_2 = \frac{f_2'}{\|f_2'\|} , \\
&\cdots \\
f_k' &= w_k - \sum_{i=1}^{k-1} \langle f_i, w_k \rangle f_i , \qquad f_k = \frac{f_k'}{\|f_k'\|} .
\end{aligned}
\tag{3.19}
\]

It is clear that if we choose $\epsilon_0$ small enough, then this process is well-defined and there exists a constant $c$ such that for all $i$:

\[ \|e_i - f_i\| \le c\epsilon . \tag{3.20} \]

As an immediate corollary we obtain that for all $i$

\[ \|\langle x, e_i \rangle e_i - \langle x, f_i \rangle f_i\| \le 2 \|x\| \, \|e_i - f_i\| \le 2c \|x\| \epsilon . \tag{3.21} \]

The result then follows. □
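The steps of this proof can be followed numerically. The sketch below is an illustration with an assumed example (a coordinate plane in $\mathbb{R}^3$ and a slightly tilted copy of it), and the helper names are hypothetical. It runs the classical Gram-Schmidt process of (3.19) on the perturbed vectors and compares the two projections against the bound obtained by summing (3.21) over $i$.

```python
import math
from typing import List, Sequence

Vector = List[float]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def norm(u: Sequence[float]) -> float:
    return math.sqrt(dot(u, u))


def gram_schmidt(ws: Sequence[Sequence[float]]) -> List[Vector]:
    """Classical Gram-Schmidt as in (3.19); assumes the inputs are linearly independent."""
    fs: List[Vector] = []
    for w in ws:
        f_prime = list(w)
        for f in fs:
            c = dot(f, w)
            f_prime = [a - c * b for a, b in zip(f_prime, f)]
        length = norm(f_prime)
        fs.append([a / length for a in f_prime])
    return fs


def project(x: Sequence[float], basis: Sequence[Sequence[float]]) -> Vector:
    """Orthogonal projection of x onto the span of an orthonormal basis, as in (3.18)."""
    out = [0.0] * len(x)
    for e in basis:
        c = dot(x, e)
        out = [o + c * b for o, b in zip(out, e)]
    return out


eps = 0.01
es = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]   # orthonormal basis of V
ws = [[1.0, 0.0, eps], [0.0, 1.0, eps]]   # points of W with d(e_i, w_i) = eps
fs = gram_schmidt(ws)                     # orthonormal basis of W
x = [1.0, 1.0, 1.0]
gap = norm([a - b for a, b in zip(project(x, es), project(x, fs))])
bound = 2.0 * norm(x) * sum(norm([a - b for a, b in zip(e, f)]) for e, f in zip(es, fs))
print(gap, bound)
```

In this example both `gap` and `bound` are of order `eps` times $\|x\|$, in line with (3.17). The sketch only checks one instance and says nothing about the optimal constant $c(n)$.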

Distance between $L^2$ best planes. Let us begin by fixing our notation for this subsection. Throughout, $\rho = \rho(n) > 0$ will be fixed according to Lemma 3.8. We will consider a positive Radon measure $\mu$ supported on $S \subseteq B_1(0)$, and use $D(x, r) \equiv D^k_\mu(x, r)$ to bound the distances between best $L^2$ planes at different points and scales. By definition let us denote by $V(x, r)$ a best k-dimensional plane on $B_r(x)$, i.e., a k-dimensional affine subspace minimizing $\int_{B_r(x)} d(y, V)^2 \, d\mu(y)$. We want to prove that, under reasonable hypotheses, the distance between $V(x, r)$ and $V(y, r')$ is small if $d(x, y) \sim r$ and $r' \sim r$.

In order to achieve this, we will need to understand some minimal properties of µ. First, we need to understand how concentrated µ is on any given ball. For this reason, for some ρ > 0 and all x ∈ B1 (0) we will want to consider the upper mass bound

\[ \mu(B_\rho(x)) \le M \rho^k \qquad \forall x \in B_1(0) . \tag{3.22} \]

However, an upper bound on the measure is not enough to guarantee best L2-planes are close, as the following example shows.

Example 3.1. Let $V, V'$ be k-dimensional subspaces, $0 \in V \cap V'$, and set $S = \big(V \cap B_1(0) \setminus B_{1/10}(0)\big) \cup S'$, where $S' \subseteq V' \cap B_{1/10}(0)$ with $\lambda^k(S') < \epsilon_k 10^{-k}$, and $\mu = \lambda^k|_S$. Then evidently $D(0, 1) \le \lambda^k(S')$ and $D(0, 1/10) = 0$, independently of $V$ and $V'$. However, $V(0, 1)$ will be close to $V$, while $V(0, 1/10) = V'$. Thus, in general, we cannot expect $V(0, 1)$ and $V(0, 1/10)$ to be close if $\mu(B_{1/10}(0))$ is too small.

Thus, in order to prove that the best planes are close, we need to have some definite amount of measure on the set, in such a way that S “effectively spans” a k-dimensional subspace. Precisely, we have the following:

Lemma 3.8. There exists a $\rho_0(n, M)$ such that if (3.22) holds for some $\rho \le \rho_0$ and if $\mu(B_1(0)) \ge \epsilon_k$, then for every affine subspace $V \le \mathbb{R}^n$ of dimension $\le k - 1$, there exists an $x \in S$ such that $B_\rho(x) \subseteq B_1(0)$, $B_{10\rho}(x) \cap V = \emptyset$ and $\mu(B_\rho(x)) \ge c(n, \rho) = c(n)\rho^n > 0$.

Proof. Let $V$ be any $(k - 1)$-dimensional subspace, and consider the set $B_{11\rho}(V)$. Let $B_i = B_\rho(x_i)$ be a sequence of balls that cover the set $B_{11\rho}(V) \cap B_1(0)$ and such that $B_i/2$ are disjoint. If $N$ is the number of these balls, then a standard covering argument gives
\[ N \omega_n \rho^n / 2^n \le \omega_{k-1} (1 + \rho)^{k-1} \omega_{n-k+1} (12\rho)^{n-k+1} \le 24^n \omega_{k-1} \omega_{n-k+1} \rho^{n-k+1} \implies N \le 48^n \frac{\omega_{k-1} \omega_{n-k+1}}{\omega_n} \rho^{1-k} . \tag{3.23} \]
By (3.22), the measure of the set $B_{11\rho}(V)$ is bounded by
\[ \mu(B_{11\rho}(V)) \le \sum_i \mu(B_i) \le M N \omega_k \rho^k \le c(n) M \rho . \tag{3.24} \]

Thus if $\rho < c(n)\epsilon_k/(4M)$, then $\mu(B_{11\rho}(V)) \le \epsilon_k/4$. In particular, we get that there must be some point of $S$ not in $B_{11\rho}(V)$. More effectively, let us consider the set $B_1(0) \setminus B_{11\rho}(V)$. This set can be covered by at most $c(n, \rho) = 2^n \rho^{-n}$ balls of radius $\rho$, and we also see that
\[ \mu\big(B_1(0) \setminus B_{11\rho}(V)\big) \ge \frac{3}{4} \epsilon_k . \tag{3.25} \]
Thus, there must exist at least one ball of radius $\rho$ which is disjoint from $B_{10\rho}(V)$ and such that
\[ \mu\big(B_\rho(x)\big) \ge \frac{3}{4} \epsilon_k 2^{-n} \rho^n \ge c(n)\rho^n . \tag{3.26} \]
□

Now if at two consecutive scales there are some balls on which the measure $\mu$ effectively spans k-dimensional subspaces, we show that these subspaces have to be close together.

Lemma 3.9. Let $\mu$ be a positive Radon measure and assume $\mu(B_1(0)) \ge \epsilon_k$ and that for each $B_r(y) \subseteq B_1(0)$ with $r \in \{\rho, \rho^2\}$ we have $\mu(B_r(y)) \le M r^k$, where $\rho \le \rho_0$. Additionally, let $B_\rho(x)$ be a ball such that $\mu(B_\rho(x)) \ge \epsilon_k \rho^k$. Then if $A = V(0, 1) \cap B_\rho(x)$ and $B = V(x, \rho) \cap B_\rho(x)$ are $L^2$-best subspace approximations of $\mu$ with $d(x, A) < \rho/2$, then
\[ d_H(A, B)^2 \le c(n, \rho, M) \big( D^k_\mu(x, \rho) + D^k_\mu(0, 1) \big) . \tag{3.27} \]

Proof. Let us begin by observing that if $c(n, \rho, M) > \delta^{-1}(n, \rho, M)$, which will be chosen later, then we may assume without loss that
\[ D^k_\mu(x, \rho) + D^k_\mu(0, 1) \le \delta = \delta(n, \rho, M) , \tag{3.28} \]
since otherwise (3.27) is trivially satisfied.

We will estimate the distance $d_H(A, B)$ by finding $k + 1$ balls $B_{\rho^2}(y_i)$ which have enough mass and effectively span in the appropriate sense $V(x, \rho)$. Given the upper bounds on $D^k_\mu$, we will then be in a position to prove our estimate.

Consider any $B_{\rho^2}(y) \subseteq B_1(0)$ and let $p(y) \in B_{\rho^2}(y)$ be the center of mass of $\mu$ restricted to this ball. Let also $\pi(p)$ be the orthogonal projection of $p$ onto $V(x, \rho)$. By Jensen’s inequality:
\[ d(p(y), V(x, \rho))^2 = d(p, \pi(p))^2 = d\left( \frac{1}{\mu(B_{\rho^2}(y))} \int_{B_{\rho^2}(y)} z \, d\mu(z) , V(x, \rho) \right)^2 \le \fint_{B_{\rho^2}(y)} d(z, V(x, \rho))^2 \, d\mu(z) . \tag{3.29} \]
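The Jensen step is simply the convexity of $z \mapsto d(z, V)^2$ applied to the normalized measure. The following is a minimal sketch for a finite sum of weighted point masses in $\mathbb{R}^2$, with $V$ taken to be a line through the origin for simplicity; the data and the helper names are assumptions made for the illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Atom = Tuple[float, Point]  # (weight, point)


def distance_to_line(z: Point, u: Point) -> float:
    """Distance from z to the line through the origin with unit direction u."""
    t = z[0] * u[0] + z[1] * u[1]
    return math.hypot(z[0] - t * u[0], z[1] - t * u[1])


def center_of_mass(atoms: List[Atom]) -> Point:
    mass = sum(w for w, _ in atoms)
    return (sum(w * z[0] for w, z in atoms) / mass,
            sum(w * z[1] for w, z in atoms) / mass)


def mean_squared_distance(atoms: List[Atom], u: Point) -> float:
    """Average of d(z, V)^2 with respect to the normalized measure."""
    mass = sum(w for w, _ in atoms)
    return sum(w * distance_to_line(z, u) ** 2 for w, z in atoms) / mass


atoms = [(1.0, (0.0, 1.0)), (1.0, (2.0, -0.5)), (2.0, (1.0, 0.25))]
u = (1.0, 0.0)  # V is the x-axis
lhs = distance_to_line(center_of_mass(atoms), u) ** 2
rhs = mean_squared_distance(atoms, u)
print(lhs, rhs)
```

The left-hand side is the squared distance of the center of mass to $V$ and the right-hand side is the averaged squared distance, as in (3.29). Averaging first can only decrease the distance, because points on opposite sides of $V$ partially cancel.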

Using this estimate and Lemma 3.8, we want to prove that there exists a sequence of $k+1$ balls $B_{\rho^2}(y_i) \subseteq B_\rho(x)$ such that

(i) $\mu\left(B_{\rho^2}(y_i)\right) \geq c(n,\rho,M) > 0$,
(ii) $\{\pi(p(y_i))\}_{i=0}^k \equiv \{\pi_i\}_{i=0}^k$ effectively spans $V(x,\rho)$. In other words, for all $i = 1, \dots, k$, $\pi_i \in V(x,\rho)$ and
\[ \pi_i \notin B_{5\rho^2}\left( \mathrm{span}(\pi_0, \dots, \pi_{i-1}) \right) . \tag{3.30} \]

We prove this statement by induction on $i = 0, \dots, k$. For $i = 0$, the statement is trivially true since $\mu(B_\rho(x)) > \gamma_k \rho^k$. In order to find $y_{i+1}$, consider the $i$-dimensional affine subspace $V^{(i)} = \mathrm{span}(\pi_0, \dots, \pi_{i-1}) \leq V(x,\rho)$. By Lemma 3.8 applied to the ball $B_\rho(x)$, there exists some $B_{\rho^2}(y_{i+1})$ such that $\mu\left(B_{\rho^2}(y_{i+1})\right) \geq c(n,\rho,M) > 0$ and
\[ y_{i+1} \notin B_{10\rho^2}\left( \mathrm{span}(\pi_0, \dots, \pi_{i-1}) \right) . \tag{3.31} \]
By definition of center of mass it is clear that $d(y_i, p(y_i)) \leq \rho^2$. Moreover, by item (i) and equation (3.29), we get
\[ d(p(y_{i+1}), V(x,\rho))^2 \leq c \int_{B_{\rho^2}(y)} d(z, V(x,\rho))^2 \, d\mu(z) \leq c D^k_\mu(x,\rho) \leq c\delta . \tag{3.32} \]
Thus by the triangle inequality we have $d(y_i, \pi_i) \leq 2\rho^2$ if $\delta \leq \delta_0(n,\rho,M)$ is small enough. This implies (3.30). Using similar estimates, we also prove $d(p(y_i), V(0,1))^2 \leq c' D^k_\mu(0,1)$ for all $i = 0, \dots, k$. Thus by the triangle inequality
\[ d(\pi_i, V(0,1)) \leq d(\pi_i, p(y_i)) + d(p(y_i), V(0,1)) \leq c(n,\rho,M) \left( D^k_\mu(x,\rho) + D^k_\mu(0,1) \right)^{1/2} . \tag{3.33} \]

Now consider any $y \in V(x,\rho)$. By item (ii), there exists a unique set $\{\beta_i\}_{i=1}^k$ such that
\[ y = \pi_0 + \sum_{i=1}^k \beta_i (\pi_i - \pi_0) , \qquad |\beta_i| \leq c(n,\rho) \|y - \pi_0\| . \tag{3.34} \]

Hence for all $y \in V(x,\rho) \cap B_\rho(x)$, we have
\[ d(y, V(0,1)) \leq d(\pi_0, V(0,1)) + \sum_i |\beta_i| \, d(\pi_i - \pi_0, V(0,1)) \leq c(n,\rho,M) \left( D^k_\mu(x,\rho) + D^k_\mu(0,1) \right)^{1/2} . \tag{3.35} \]
By Lemma 3.6, this completes the proof of (3.27).
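The Jensen step (3.29) can be checked numerically on a discrete measure. The following is only an illustrative sketch: the atoms, weights, and the random plane are hypothetical test data, not objects from the paper, and the helper `dist_to_affine` is a name introduced here.

```python
import numpy as np

def dist_to_affine(points, base, basis):
    """Distance of each row of `points` to the affine plane base + span(rows of basis).

    `basis` is a (k, n) array with orthonormal rows.
    """
    diff = points - base
    tangential = diff @ basis.T @ basis      # orthogonal projection onto the plane directions
    return np.linalg.norm(diff - tangential, axis=-1)

rng = np.random.default_rng(0)
n, k, num_atoms = 5, 2, 400

# Hypothetical discrete measure: atoms z_j carrying positive weights w_j.
atoms = rng.normal(size=(num_atoms, n))
weights = rng.uniform(0.1, 1.0, size=num_atoms)

# A random affine k-plane standing in for V(x, rho).
q_factor, _ = np.linalg.qr(rng.normal(size=(n, k)))
basis = q_factor.T                           # (k, n), orthonormal rows
base = rng.normal(size=n)

# Left side of (3.29): squared distance of the center of mass to the plane.
center_of_mass = (weights[:, None] * atoms).sum(axis=0) / weights.sum()
lhs = dist_to_affine(center_of_mass[None, :], base, basis)[0] ** 2

# Right side of (3.29): average squared distance of the atoms to the plane.
rhs = (weights * dist_to_affine(atoms, base, basis) ** 2).sum() / weights.sum()
```

The inequality `lhs <= rhs` holds because the squared distance to an affine subspace is a convex function, which is exactly what Jensen's inequality uses.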

Comparison between $L^2$ and $L^\infty$ planes. Given $B_r(x)$, we denote as before by $V(x,r)$ the $k$-dimensional subspace minimizing $\int_{B_r(x)} d(y,V)^2 \, d\mu$. Suppose that the support of $\mu$ satisfies a uniform Reifenberg condition, i.e. suppose that there exists a $k$-dimensional plane $L(x,r)$ such that $x \in L(x,r)$ and $\mathrm{supp}(\mu) \cap B_r(x) \subseteq B_{\delta r}(L(x,r))$. Then, by the same technique used in Lemma 3.9, we can prove that

Lemma 3.10. Let $\mu$ be a positive Radon measure with $\mu(B_1(0)) \geq \gamma_k$ and such that for all $B_\rho(y) \subseteq B_1(0)$ we have $\mu(B_\rho(y)) \leq M\rho^k$. Then
\[ d_H\left( L(0,1) \cap B_1(0) , V(0,1) \cap B_1(0) \right)^2 \leq c(n,\rho,M) \left( \delta^2 + D^k_\mu(0,1) \right) . \tag{3.36} \]

bi-Lipschitz equivalences. In this subsection, we study a particular class of maps with nice local properties. These maps are slightly modified versions of the maps which are usually exploited to prove Reifenberg's theorem, see for example [Tor95, DT12], [Mor66, section 10.5] or [Sim].

We start by defining the functions $\sigma$. For some $r > 0$, let $\{x_i\}$ be an $r$-separated subset of $\mathbb{R}^n$, i.e., let $d(x_i, x_j) \geq r$. Let $p_i \in B_{10r}(x_i)$ and let $V_i$ be a sequence of $k$-dimensional linear subspaces. By standard theory, it is easy to find a locally finite smooth partition of unity $\lambda_i : \mathbb{R}^n \to [0,1]$ such that

(i) $\mathrm{supp}(\lambda_i) \subseteq B_{3r}(x_i)$ for all $i$,
(ii) for all $x \in \bigcup_i B_{2r}(x_i)$, $\sum_i \lambda_i(x) = 1$,
(iii) $\sup_i \|\nabla \lambda_i\|_\infty \leq c(n)/r$,
(iv) if we set $1 - \psi(x) = \sum_i \lambda_i(x)$, then $\psi$ is a smooth function with $\|\nabla \psi\|_\infty \leq c(n)/r$.

Note that by (i), and since xi is r-separated, there exists a constant c(n) such that for all x, λi(x) > 0 for at most c(n) different indexes.

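Definition 3.11. Given $x_i, \lambda_i, p_i, V_i$ as above, define a smooth function $\sigma : \mathbb{R}^n \to \mathbb{R}^n$ by
\[ \sigma(x) = x + \sum_i \lambda_i(x) \, \pi_{V_i^\perp}(p_i - x) = \psi(x) x + \sum_i \lambda_i(x) \left( p_i + \pi_{V_i}(x - p_i) \right) , \tag{3.37} \]
where $\pi_V(x)$ is the orthogonal projection of the vector $x$ onto $V$.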

By local finiteness, it is evident that $\sigma$ is smooth. Moreover, if $\psi(x) = 1$, then $\sigma(x) = x$. Philosophically, $\sigma$ is a form of “smooth interpolation” between the identity and the projections onto the subspaces $V_i$. It stands to reason that if the $V_i$ are all close together, then this map $\sigma$ is close to being an orthogonal projection in the region $\bigcup_i B_{2r}(x_i)$.
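To make the interpolation concrete, here is a minimal numerical sketch of a map of the form (3.37). The particular partition of unity used below (a product construction from a standard smooth cutoff) is an assumption chosen for convenience, as are the function names and the sample centers; the paper only requires properties (i)-(iv).

```python
import numpy as np

def bump(t):
    """Smooth cutoff in t = |x - x_i| / r: equal to 1 for t <= 2 and to 0 for t >= 3."""
    def f(s):
        return np.where(s > 0, np.exp(-1.0 / np.maximum(s, 1e-300)), 0.0)
    a, b = f(3.0 - t), f(t - 2.0)
    return a / (a + b)

def partition_of_unity(x, centers, r):
    """Return (lambda_i(x))_i and psi(x) = 1 - sum_i lambda_i(x)."""
    phi = np.array([bump(np.linalg.norm(x - c) / r) for c in centers])
    lam = np.empty_like(phi)
    remaining = 1.0
    for i, value in enumerate(phi):
        lam[i] = value * remaining       # lambda_i = phi_i * prod_{j<i} (1 - phi_j)
        remaining *= 1.0 - value
    return lam, remaining

def sigma(x, centers, points, planes, r):
    """First expression in (3.37): x + sum_i lambda_i(x) * proj_{V_i^perp}(p_i - x)."""
    lam, _ = partition_of_unity(x, centers, r)
    out = x.copy()
    for weight, p, basis in zip(lam, points, planes):
        v = p - x
        out += weight * (v - basis.T @ (basis @ v))
    return out

def sigma_alt(x, centers, points, planes, r):
    """Second expression in (3.37): psi(x) x + sum_i lambda_i(x) (p_i + proj_{V_i}(x - p_i))."""
    lam, psi = partition_of_unity(x, centers, r)
    out = psi * x
    for weight, p, basis in zip(lam, points, planes):
        out = out + weight * (p + basis.T @ (basis @ (x - p)))
    return out

# Hypothetical data: three r-separated centers in R^3, all planes equal to the xy-plane.
r = 1.0
centers = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
plane_basis = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])   # orthonormal rows spanning V
points = centers + np.array([0.3, -0.2, 0.0])                 # p_i in V, within 10r of x_i
planes = [plane_basis] * len(centers)

x_near = np.array([0.5, 0.4, 0.7])      # inside B_{2r}(x_0): sigma projects onto V
x_mid = np.array([0.0, -2.5, 0.3])      # transition zone, 0 < psi < 1
x_far = np.array([10.0, 10.0, 10.0])    # psi = 1: sigma is the identity
```

When all $V_i$ coincide with a single plane $V$ and all $p_i$ lie in $V$, this sketch reproduces the heuristic in the text: `sigma` is the orthogonal projection onto $V$ where $\psi = 0$ and the identity where $\psi = 1$.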

Lemma 3.12. Suppose that there exists a $k$-dimensional affine subspace $V \leq \mathbb{R}^n$ such that for all $i$
\[ d_G(V_i, V) \leq \delta , \qquad d(p_i, V) \leq \delta . \tag{3.38} \]
Then the map $\sigma$ restricted to the set $U = \psi^{-1}(0) = \left( \sum_i \lambda_i \right)^{-1}(1)$ can be written as
\[ \sigma(x) = q + \pi_V(x - q) + e(x) , \tag{3.39} \]
where $q$ is any point in $V$, and $e(x)$ is a smooth function with $\|e\|_\infty + \|\nabla e\|_\infty \leq c(n)\delta/r = c(n,r)\delta$.

Remark 3.12. Thus, on U we have that σ is the affine projection onto V plus an error which is small in C1.

Proof. On the set $U$, we can define
\[ e(x) = \sigma(x) - q - \pi_V(x - q) = -q - \pi_V(x - q) + \sum_i \lambda_i(x) \left( p_i + \pi_{V_i}(x - p_i) \right) = \sum_i \lambda_i(x) \left( p_i - q - \pi_V(p_i - q) + \pi_V(p_i) - \pi_{V_i}(p_i) + \pi_{V_i}(x) - \pi_V(x) \right) . \tag{3.40} \]
By (3.38) and Lemma 3.7, we have the estimates
\[ \|p_i - q - \pi_V(p_i - q)\| < \delta , \qquad \|\pi_V(x - p_i) - \pi_{V_i}(x - p_i)\| \leq c(n)\delta \|x - p_i\| \leq 13 c(n) r \delta . \tag{3.41} \]

This implies
\[ \|e\|_{L^\infty(U)} \leq c(n)(1 + 13r)\delta \leq c(n)\delta . \tag{3.42} \]
As for $\nabla e$, we have
\[ \nabla e = \sum_i \nabla\lambda_i(x) \cdot \left( p_i - q - \pi_V(p_i - q) + \pi_V(p_i) - \pi_{V_i}(p_i) + \pi_{V_i}(x) - \pi_V(x) \right) + \sum_i \lambda_i(x) \nabla \left( \pi_{V_i}(x) - \pi_V(x) \right) . \tag{3.43} \]

The first sum is easily estimated, and since $\langle \nabla(\pi_W)|_x , w \rangle = \pi_W(w)$, we can still apply Lemma 3.7 and conclude:
\[ \|\nabla e\|_{L^\infty(U)} \leq \frac{c(n)}{r} \delta . \tag{3.44} \]

As we have seen, σ is in some sense close to the affine projection to V. In the next lemma, which is similar in spirit to the [Sim, squash lemma], we prove that the image through σ of a graph over V is again a graph over V with nice bounds.

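Lemma 3.13 (squash lemma). Fix some $B_{r/\rho}(y) \subseteq \mathbb{R}^n$, and let $I = \{x_i\} \cap B_{5r/\rho}(y)$. Suppose that there exists a $k$-dimensional affine subspace $V$ such that $d(y,V) \leq \delta r$ and for all $x_i \in I$:
\[ d(p_i, V) \leq \delta r \quad \text{and} \quad d_G(V_i, V) \leq \delta . \tag{3.45} \]
Suppose also that there exists a $C^1$ function $g : V \to V^\perp$ such that $G \subseteq \mathbb{R}^n$ is the graph $G = \{x + g(x) , \ x \in V\} \cap B_{r/\rho}(y)$, and $r^{-1}\|g\|_\infty + \|\nabla g\|_\infty \leq \delta'$. Then

(i) for all $z \in G$, $r^{-1}|\sigma(z) - z| \leq c(n)\delta\rho^{-1}$, and $\sigma$ is a $C^1$ diffeomorphism from $G$ to its image,
(ii) the set $\sigma(G)$ is contained in a $C^1$ graph $\{x + \tilde g(x) , \ x \in V\}$ with
\[ r^{-1}\|\tilde g\|_\infty + \|\nabla \tilde g\|_\infty \leq c(n)(\delta + \delta')\rho^{-1} , \tag{3.46} \]
(iii) moreover, for $x \in U' = \bigcup_i B_r(x_i)$, the previous bound is independent of $\delta'$, in the sense that
\[ r^{-1}\|\tilde g\|_{L^\infty(U' \cap V)} + \|\nabla \tilde g\|_{L^\infty(U' \cap V)} \leq c(n)\delta\rho^{-1} . \tag{3.47} \]
The same bound holds for all $U' \subseteq U = \psi^{-1}(0) \subseteq \mathbb{R}^n$ such that $B_{c\delta\rho^{-1}}(U') \subseteq U$,
(iv) the map $\sigma$ is a bi-Lipschitz equivalence between $G$ and $\sigma(G)$ with bi-Lipschitz constant $\leq \sqrt{1 + c(n)(\delta + \delta')^2\rho^{-2}}$.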

Proof. Since we are only interested in the local behaviour of $\sigma$ inside $B_{r/\rho}(y)$, we can assume without loss of generality that
\[ \sigma(x + g(x)) = \psi(z)(x + g(x)) + \sum_{x_i \in I} \lambda_i(z) \left( p_i + \pi_{V_i}(x + g(x) - p_i) \right) , \qquad 1 - \psi(x) = \sum_{x_i \in I} \lambda_i(x) , \tag{3.48} \]
where we have set for convenience $z = z(x) = x + g(x)$. Set by definition
\[ (1 - \psi(z)) [x + h(x)] \equiv \sum_i \lambda_i(z) \left( p_i + \pi_{V_i}(x + g(x) - p_i) \right) . \tag{3.49} \]

By projecting the function $\sigma(x + g(x))$ onto $V$ and its orthogonal complement we obtain
\[ \sigma(x + g(x)) = \sigma^T(x) + \sigma^\perp(x) , \quad \sigma^T(x) = x + (1 - \psi(z)) h^T(x) , \quad \sigma^\perp(x) = \psi(z) g(x) + (1 - \psi(z)) h^\perp(x) . \tag{3.50} \]

By adapting the proof of Lemma 3.12, we obtain
\[ r^{-1} \left| (1 - \psi(z)) h^T(x) \right| + \left| \nabla \left[ (1 - \psi(z)) h^T(x) \right] \right| \leq \frac{c\delta}{\rho} . \tag{3.51} \]
Note that this bound is independent of $\delta'$, as long as $\delta' \leq 1$. Thus we can apply the inverse function theorem to the function $\sigma^T(x) : V \to V$ and obtain a $C^1$ inverse $Q$ such that for all $x \in V$, $r^{-1}|Q(x) - x| + |\nabla Q - \mathrm{id}| \leq c(n)\delta\rho^{-1}$, and if $\psi(x + g(x)) = 1$, then $Q(x) = x$. So we can write that for all $x \in V$
\[ \sigma(x + g(x)) = \sigma^T(x) + \tilde g(\sigma^T(x)) \quad \text{where} \quad \tilde g(\cdot) = \sigma^\perp(Q(\cdot)) . \tag{3.52} \]

Arguing as before, we see that $(1 - \psi(z)) h^\perp(x)$ is a $C^1$ function with
\[ r^{-1} \left| (1 - \psi(z)) h^\perp(x) \right| + \left| \nabla \left[ (1 - \psi(z)) h^\perp(x) \right] \right| \leq \frac{c\delta}{\rho} , \tag{3.53} \]
and this bound is independent of $\delta'$ (as long as $\delta' \leq 1$). Thus the function $\tilde g : V \to V^\perp$ satisfies for all $x$ in its domain

r−1 |g˜(x)| + |∇g˜(x)| ≤ c(n)(δ + δ0)ρ−1 , (3.54)

Moreover, for those x such that ψ(σT (x) + g(σT (x))) = 0, the estimates on g˜ are independent of δ0, in the sense that r−1 |g˜(x)| + |∇g˜(x)| ≤ c(n)δρ−1 . This proves items (ii), (iii). As for item (i), it is an easy consequence of the estimates in (3.51), (3.53). Now since both G and σ(G) are Lipschitz graphs over V, we can estimate the bi-Lipschitz constant of σ by decomposing it as

σ : x + g(x) −→ x −→ σT (x) −→ σT (x) + g˜(σT (x)) . (3.55)



Pointwise Estimates on D. We wish to see in this subsection how (3.4) implies pointwise estimates on D, which will be convenient in the proof of the generalized Reifenberg results. Indeed, the following is an almost immediate consequence of Remark 3.4:

Lemma 3.14. Assume $B_r(x) \subseteq B_2(0)$ is such that $\mu(B_r(x)) \geq 2^k \gamma_k r^k$ and $\int_{B_{2r}(x)} D^k_\mu(y, 2r) \, d\mu(y) < \delta^2 (2r)^k$. Then there exists $c(n)$ such that $D(x,r) < c\delta^2$. In particular, if (3.4) holds then for every $B_r(x) \subseteq B_2(0)$ such that $\mu(B_r(x)) \geq 2^k \gamma_k r^k$ we have that $D(x,r) < c\delta^2$.

3.3. Proof of Theorem 3.3: The Discrete-Reifenberg. First of all, note that, by definition of µ, the state- ment of this theorem is equivalent to

µ(B1 (0)) ≤ D(n)/ωk . (3.56)

In the proof, we will fix the constant $C_1(n) \leq 20^n$ and therefore a positive scale $\rho(n, C_1(n)) = \rho(n) < 1$ according to Lemma 3.8. For convenience, we will assume that $\rho = 2^q$, $q \in \mathbb{N}$, so that we will be able to use the sum bounds (3.11) more easily. The constant $C_1(n)$ will be defined by the end of the proof; however, it is enough to know that it is itself bounded by $20^n$. Let us also define the scales $r_j = \rho^j$.

Bottom scale In the proof, it will be convenient to assume that rs ≥ r¯ > 0. It is clear that, by means of a simple limiting argument, this assumption is not restrictive. In particular, fix any positive radius r¯ = rA for some A ∈ N, and consider the measure µr¯ ≤ µ defined by

\[ \mu_{\bar r} = \sum_{s \ \text{s.t.} \ r_s \geq \bar r} \omega_k r_s^k \delta_{x_s} . \tag{3.57} \]

By Remark 3.3, we see that $\mu_{\bar r}$ satisfies all the hypotheses of this theorem, and since $\mu_{\bar r} \nearrow \mu$, if we prove uniform bounds on $\mu_{\bar r}(B_1(0))$ which are independent of $\bar r$, we can conclude the theorem. For this reason, in the rest of the proof we will assume for simplicity that $r_s \geq \bar r = \rho^A > 0$ for all $s$.

First induction: upwards. We are going to prove inductively on $j = A, \dots, 0$ that for all $x \in B_1(0) \subset \mathbb{R}^n$ and $r_j = \rho^j \leq 1$, either $B_{r_j}(x)$ is contained in one of the balls $\{B_{r_s}(x_s)\}_{s \in S}$, or we have the bound
\[ \mu\left( B_{r_j}(x) \right) \leq C_1(n) r_j^k . \tag{3.58} \]

Note that, for j = A, this bound follows from the definition of the measure µ and the assumption that ri ≥ r¯.

Rough estimate. Fix some $j$, and suppose that (3.58) holds for all $\bar r \leq r \leq r_j$. Let us first observe that we can easily obtain a bad upper bound on $\mu\left( B_{\chi r_j}(x) \right)$ for some fixed $\chi > 1$.

Consider the points in {xs}s∈S ∩ Bχr j (x), and divide them into two groups: the ones with rs ≤ r j and the ones with rs > r j.

For the first group, cover them by balls $B_{r_j}(z_i)$ such that the $B_{r_j/2}(z_i)$ are disjoint. Since there can be at most $c(n,\chi)$ balls of this form, and for all of these balls the upper bound (3.58) holds, we have an induced upper bound on the measure of this set. As for the points with $r_s > r_j$, by construction there can be only $c(n,\chi)$ many of them, and we also have the bound $r_s \leq \chi r_j$. Summing up the two contributions, we get the very rough estimate
\[ \mu\left( B_{\chi r_j}(x) \right) \leq C_2(n,\chi) r_j^k , \tag{3.59} \]
where $C_2 \gg C_1$. Note that, as long as the inductive hypothesis holds, $C_1$ is independent of $j$. However, it is clear that successive repetitions of the above estimate will not lead to (3.58).
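The counting behind the constant $c(n,\chi)$ can be illustrated with a greedy selection of ball centers. This is a sketch under stated assumptions: the point cloud is hypothetical sample data, and the greedy rule below (a maximal $r$-separated subset) is one standard way to obtain such a covering, not necessarily the construction the authors have in mind.

```python
import numpy as np

def greedy_centers(points, r):
    """Pick a maximal r-separated subset of `points`.

    The balls B_r(z_i) around the chosen centers cover all points, and the
    balls B_{r/2}(z_i) are pairwise disjoint.
    """
    centers = []
    for p in points:
        if all(np.linalg.norm(p - c) >= r for c in centers):
            centers.append(p)
    return np.array(centers)

rng = np.random.default_rng(1)
n, chi, r = 2, 3.0, 0.1

# Hypothetical points of the support inside B_{chi r}(0).
cloud = rng.uniform(-chi * r, chi * r, size=(2000, n))
cloud = cloud[np.linalg.norm(cloud, axis=1) <= chi * r]

centers = greedy_centers(cloud, r)

# Every point lies within distance < r of some chosen center.
dists = np.linalg.norm(cloud[:, None, :] - centers[None, :, :], axis=2)
covered = bool((dists.min(axis=1) < r).all())

# Chosen centers are pairwise at distance >= r.
pairwise = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=2)
separated = bool((pairwise[np.triu_indices(len(centers), k=1)] >= r).all())

# Volume comparison: disjoint balls B_{r/2}(z_i) sit inside B_{(chi + 1/2) r}(0).
count_bound = (2 * chi + 1) ** n
```

The bound `count_bound` depends only on $n$ and $\chi$, which is the sense in which the number of balls is at most $c(n,\chi)$ independently of the scale $r$.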

Second induction: downwards. Suppose that (3.58) is true for all $x \in B_1(0)$ and $i = j+1, \dots, A$. Fix $x \in \mathbb{R}^n$, and consider the set $B = B_{r_j}(x)$. Recall that we always assume that $B$ is not contained in one of the balls $B_{r_i}(x_i)$, otherwise the bound (3.58) might fail for trivial reasons. We are going to build by induction on $i \geq j$ a sequence of smooth maps $\sigma_i : \mathbb{R}^n \to \mathbb{R}^n$ and smooth $k$-dimensional submanifolds $T_{j,i} = T_i$ which satisfy nine properties, which we will use to eventually conclude the proof of (3.58) for $B$. Let us outline the inductive procedure now, and introduce all the relevant terminology. Everything described in the remainder of this subsection will be discussed more precisely over the coming pages. To begin with, we will have at the first step that
\[ \sigma_j = \mathrm{id} , \qquad T_{j,j} = T_j = V(x, r_j) \leq \mathbb{R}^n , \tag{3.60} \]
where $V(x, r_j)$ is one of the $k$-dimensional affine subspaces which minimizes $\int_{B_{r_j}(x)} d^2(y, V) \, d\mu$. Thus, the first manifold $T_j$ is a $k$-dimensional affine subspace which best approximates $B_{r_j}(x)$. At future steps we can recover $T_i$ from $T_{i-1}$ and $\sigma_i$ from the simple relation

Ti = σi(Ti−1) . (3.61)

We will see that $\sigma_i$ is a diffeomorphism when restricted to $T_{i-1}$, and thus each additional submanifold $T_i$ is also diffeomorphic to $\mathbb{R}^k$. As part of our inductive construction we will build at each stage a Vitali covering of $T_i$ given by
\[ B_{r_j}(T_i) \cap B_{r_j}(x) \sim \bigcup_{t=j}^{i} \left( \bigcup_{y \in I_b^t} B_{r_t}(y) \cup \bigcup_{x_s \in I_f^t} B_{r_s}(x_s) \right) \cup \bigcup_{y \in I_g^i} B_{r_i}(y) , \tag{3.62} \]
where $I_g$, $I_b$, and $I_f$ represent the good, bad, and final balls in the covering. Final balls are balls belonging to the original covering $B_{r_s}(x_s)$ such that $r_s \in [r_i, r_{i-1})$, and the other balls in the covering are characterized as good or bad according to how much measure they carry. Good balls are those with large measure, bad balls the ones with small measure. More precisely, we have
\[ \lambda^k\left( B_{r_i}(y) \right) \geq 2^k \gamma_k r_i^k \ \text{ if } y \in I_g^i , \qquad \lambda^k\left( B_{r_i}(y) \right) < 2^k \gamma_k r_i^k \ \text{ if } y \in I_b^i . \tag{3.63} \]

We will see that, over each ball Bri (y) in this covering, Ti can be written as a graph over the best approx- imating subspace V(y, ri) with good estimates.

Our goal in these constructions is the proof of (3.58) for the ball $B = B_{r_j}(x)$, and thus we will need to relate the submanifolds $T_i$, and more importantly the covering (3.62), to the set $B$. Indeed, this covering of $T_i$ almost covers the set $B$, at least up to an excess set $E_{i-1}$. That is,
\[ \mathrm{supp}(\mu) \cap B \subseteq E_i \cup \bigcup_{s=j}^{i} \left( \bigcup_{y \in I_b^s} B_{r_s}(y) \cup \bigcup_{y \in I_f^s} B_{r_s}(y) \right) \cup \bigcup_{y \in I_g^i} B_{r_i}(y) . \tag{3.64} \]

We will see that the set Ei consists of those points of B which do not satisfy a uniform Reifenberg condition. Thus in order to prove (3.58) we will need to estimate the covering (3.62), as well as the excess set Ei.

Let us now outline the main properties used in the inductive construction of the mapping $\sigma_{j+1} : \mathbb{R}^n \to \mathbb{R}^n$, and hence $T_{j+1} = \sigma_{j+1}(T_j)$. As is suggested in (3.62), it is the good balls and not the bad and final balls which are subdivided at further steps of the induction procedure. In order to better understand this construction let us begin by analyzing the good balls $B_{r_i}(y)$ more carefully. On each such ball we may consider the best approximating $k$-dimensional subspace $V(y, r_i)$. Since $B_{r_i}(y)$ is a good ball, one can check that most of $B \cap B_{r_i}(y)$ must satisfy a uniform Reifenberg condition and reside in a small neighborhood of $V(y, r_i)$. We denote those points which don't by $E(y, r_i)$, see (3.76) for the precise definition. Then we can define the next step of the excess set by
\[ E_i = E_{i-1} \cup \bigcup_{y \in I_g^i} E(y, r_i) . \tag{3.65} \]
Thus our excess set represents all those points which do not lie in an appropriately small neighborhood of the submanifolds $T_i$. With this in hand we can then find a submanifold $T_i' \subseteq T_i$, which is roughly defined by
\[ T_i' \approx T_i \cap \bigcup_{y \in I_g^i} B_{r_i}(y) , \tag{3.66} \]
see (3.86) for the precise inductive definition, such that
\[ B \subseteq E_i \cup \bigcup_{s=j}^{i} \left( \bigcup_{y \in I_b^s} B_{r_s}(y) \cup \bigcup_{y \in I_f^s} B_{r_s}(y) \right) \cup B_{r_{i+1}/4}\left( T_i' \right) \equiv R_i \cup B_{r_{i+1}/4}\left( T_i' \right) , \tag{3.67} \]
where $R_i$ represents our remainder term, and consists of those balls and sets which will not be further subdivided at the next stage of the induction. Now in order to finish the inductive step of the construction, we can cover $B_{r_{i+1}/4}\left( T_i' \right)$ by some Vitali set
\[ B_{r_{i+1}/4}\left( T_i' \right) \subseteq \bigcup_{y \in I} B_{r_{i+1}}(y) , \tag{3.68} \]
where $y \in I \subseteq T_i'$. We may then decompose the ball centers
\[ I^{i+1} = I_g^{i+1} \cup I_b^{i+1} \cup I_f^{i+1} , \tag{3.69} \]
based on (3.63). Now we will use Definition 3.11 and the best approximating subspaces $V(y, r_{i+1})$ to build $\sigma_{i+1} : \mathbb{R}^n \to \mathbb{R}^n$ such that
\[ \mathrm{supp}\{\sigma_{i+1} - \mathrm{Id}\} \subseteq \bigcup_{y \in I_g^{i+1}} B_{3r_{i+1}}(y) . \tag{3.70} \]
This completes the outline of the inductive construction.
Let us now record the precise properties of all of these sets which will be proved during the inductive process, and will eventually lead to the proof of (3.58):

(i) $\sigma_j = \mathrm{id}$, $T_{j,j} = T_j = V(x, r_j) \leq \mathbb{R}^n$,
(ii) $T_i = \sigma_i(T_{i-1})$,
(iii) for every $i \geq j+1$ and $y \in I_g^i$, $d(y, V(y, r_i)) \leq c\delta r_i$, and the set $T_i \cap B_{2r_i}(y)$ is the graph over $V(y, r_i)$ of a smooth function $f$ with
\[ \frac{\|f\|_\infty}{r_i} + \|\nabla f\|_\infty \leq c\delta , \tag{3.71} \]
(iv) for $y \in T_i$, $d(\sigma_{i+1}(y), y) \leq c\delta r_{i+1}$, and $\sigma_{i+1}|_{T_i}$ is a diffeomorphism,
(v) for every $y \in T_i$, $T_i \cap B_{2r_i}(y)$ is the graph over some $k$-dimensional affine subspace of a smooth function $f$ satisfying (3.71),
(vi) for all $i$, we have the inclusion
\[ \mathrm{supp}(\mu) \cap B \subseteq B_{r_{i+1}/10}\left( T_i' \right) \cup R^i , \tag{3.72} \]
(vii) for every $i \geq j+1$ and all $y \in I_g^i$, the set $T_i' \cap B_{2r_i}(y)$ is a Lipschitz graph over the plane $V(y, r_i)$ with $d(y, V(y, r_i)) \leq c\delta r_i$,
(viii) we can estimate
\[ \lambda^k\left( \sigma_i^{-1}(T_i') \right) + \#\left( I_b^i \right) \omega_k (r_i/10)^k + \omega_k \sum_{x_s \in I_f^i} (r_s/10)^k \leq \lambda^k\left( T_{i-1}' \right) , \tag{3.73} \]
(ix) we can estimate the excess set by
\[ \lambda^k\left( E(y, r_i) \right) r_{i+1}^2 \leq C(n) D^k_S(y, 2r_i) . \tag{3.74} \]

Excess set. Let us begin by describing the construction of the excess set. We will only be interested here in a ball $B_{r_i}(x)$ which is a good ball, in the sense that $\lambda^k(B_{r_i}(x)) \geq 2^k \gamma_k r_i^k$. Thus define $V(x,r)$ to be (one of) the $k$-dimensional planes minimizing $\int_{B_r(x)} d(y,V)^2 \, d\lambda^k$, and define also the excess set to be the set of points which are some definite amount away from the best plane $V$. Precisely,
\[ E(x, r_i) = \left( B_{r_i}(x) \setminus B_{r_{i+1}/11}(V) \right) \cap S . \tag{3.75} \]
The points in $\mathrm{supp}(\mu) \cap E$ are in some sense what prevents the set $S$ from satisfying a uniform one-sided Reifenberg condition at this scale. By construction, all points in $E$ have a uniform lower bound on the distance from $V$, so that if we assume $\lambda^k(B_{r_i}(x)) \geq 2^k \gamma_k r_i^k$, i.e. $B_{r_i}(x)$ is a good ball, then we can estimate
\[ \int_{B_{r_i}(x) \setminus E(x, r_i)} d(y, V(x, r_i))^2 \, d\mu(y) + \mu\left( E(x, r_i) \right) (r_{i+1}/11)^2 \leq \int_{B_{r_i}(x)} d(y, V(x, r_i))^2 \, d\mu(y) \leq r_i^{k+2} D^k_\mu(x, r_i) . \tag{3.76} \]
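The first inequality in (3.76) is a Chebyshev-type bound: points at distance at least $t$ from the plane each contribute at least $t^2$ to the $L^2$ energy. A small numerical check on hypothetical data (the atoms, weights, coordinate plane, and threshold `t` below are illustrative choices, with `t` playing the role of $r_{i+1}/11$):

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, num_atoms = 4, 2, 500

# Hypothetical discrete measure in the cube [-1, 1]^n.
atoms = rng.uniform(-1.0, 1.0, size=(num_atoms, n))
weights = rng.uniform(0.0, 1.0, size=num_atoms)

# Take V = span(e_1, ..., e_k); the distance to V is the norm of the last n - k coordinates.
dist = np.linalg.norm(atoms[:, k:], axis=1)

t = 0.5                                   # stands in for r_{i+1} / 11
in_excess = dist >= t                     # atoms in the excess set E

total_energy = float((weights * dist ** 2).sum())
near_energy = float((weights[~in_excess] * dist[~in_excess] ** 2).sum())
excess_mass = float(weights[in_excess].sum())
```

Here `near_energy + excess_mass * t**2 <= total_energy`, so in particular the mass of the excess set is at most `total_energy / t**2`.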

Good, bad and final balls. At our base step let us set $I_g^j = \{x\}$, $I_b^j = I_f^j = \emptyset$. Inductively, let us define the remainder set to be the union of all the previous bad balls, final balls, and the excess sets:
\[ R^i = \bigcup_{t=j}^{i} \left( \bigcup_{y \in I_b^t} B_{r_t}(y) \cup \bigcup_{x_s \in I_f^t} B_{r_s}(x_s) \cup \bigcup_{y \in I_g^t} E(y, r_t) \right) . \tag{3.77} \]

The set $R^i$ represents everything we want to throw out at the inductive stage of the proof. We will see later in the proof how to estimate this remainder set itself. Note that at the first step $R^j = E(x, r_j)$.

Now consider the points $x_s \in S$ outside the remainder set, and separate the balls $B_{r_s}(x_s)$ with radius $r_s \sim r_{i+1}$ from the others by defining for $y \in I_g^i$ the sets
\[ I_f^{i+1}(y) = \left\{ z \in \left( S \setminus R^i \right) \cap B_{r_i}(y) \ \text{s.t.} \ r_s \in [r_{i+1}, r_i) \right\} , \tag{3.78} \]
and
\[ J^{i+1}(y) = \left\{ z \in \left( S \setminus R^i \right) \cap B_{r_i}(y) \ \text{s.t.} \ r_s < r_{i+1} \right\} . \tag{3.79} \]
From this we can construct the sets
\[ I_f^{i+1} = \bigcup_{y \in I_g^i} I_f^{i+1}(y) \quad \text{and} \quad J^{i+1} = \bigcup_{y \in I_g^i} J^{i+1}(y) . \tag{3.80} \]
Note that by construction and inductive item (vii), we have
\[ S \setminus R^i = I_f^{i+1} \cup J^{i+1} \subset B_{r_{i+1}/10}\left( T_i' \right) . \tag{3.81} \]
Let us now consider a covering of (3.81) given by
\[ S \setminus R^i \subseteq I_f^{i+1} \cup \bigcup_{z \in I} B_{r_{i+1}}(z) , \tag{3.82} \]
where $I \subseteq T_i'$, and for any $p \neq q \in I_f \cup I$, $B_{r_{i+1}/5}(p) \cap B_{r_{i+1}/5}(q) = \emptyset$. Note that this second property is true by definition for $p, q \in I_f$; we only need to complete this partial Vitali covering with other balls of the same size. Since the set we need to cover is contained in $\bigcup_{y \in I_g^i} B_{r_{i+1}/10}(V(y, r_i))$ and since $T_i'$ is locally a Lipschitz graph over $V$ with (3.71), we can also choose $I \subseteq T_i'$. We split the balls with centers in $I$ into two subsets, according to how much measure they carry. In particular, let
\[ I_g^{i+1} = \left\{ y \in I \ \text{s.t.} \ \lambda^k\left( B_{r_{i+1}}(y) \right) \geq 2^k \gamma_k r^k \right\} , \qquad I_b^{i+1} = \left\{ y \in I \ \text{s.t.} \ \lambda^k\left( B_{r_{i+1}}(y) \right) < 2^k \gamma_k r^k \right\} . \tag{3.83} \]

Map and manifold structure. Let $\{\lambda_s^{i+1}\} = \{\lambda_s\}$ be a partition of unity such that for each $y_s \in I_g^{i+1}$

• $\mathrm{supp}(\lambda_s) \subseteq B_{3r_{i+1}}(y_s)$,
• for all $z \in \bigcup_{y_s \in I_g^{i+1}} B_{2r_{i+1}}(y_s)$, $\sum_s \lambda_s(z) = 1$,
• $\max_s \|\nabla \lambda_s\|_\infty \leq C(n)/r_{i+1}$.

For every $y_s \in I_g^{i+1}$, let $V(y_s, r_{i+1})$ be (one of) the $k$-dimensional subspaces that minimizes $\int_{B_{r_{i+1}}(y_s)} d(z, V)^2 \, d\mu$. By Remark 3.3 we can estimate
\[ r_{i+1}^{-k-2} \int_{B_{r_{i+1}}(y_s)} d(z, V(y_s, r_{i+1}))^2 \, d\lambda^k(z) \leq (1 + \rho)^{k+2} D(y_s, (1 + \rho) r_{i+1}) . \tag{3.84} \]
Let $p_s \in B_{r_{i+1}}(y_s)$ be the center of mass of $\mu|_{B_{r_{i+1}}(y_s)}$. It is worth observing that $p_s \in V(y_s, r_{i+1})$. Define the smooth function $\sigma_{i+1} : \mathbb{R}^n \to \mathbb{R}^n$ as in Definition 3.11, i.e.,
\[ \sigma_{i+1}(x) = x + \sum_s \lambda_s^{i+1}(x) \, \pi_{V(y_s, r_{i+1})^\perp}(p_s - x) . \tag{3.85} \]

With this function, we can define the sets
\[ T_{i+1} = \sigma_{i+1}(T_i) , \qquad T_{i+1}' = \sigma_{i+1}\left( T_i' \setminus \left( \bigcup_{y \in I_b^{i+1}} B_{r_{i+1}/6}(y) \cup \bigcup_{x_s \in I_f^{i+1}} B_{r_s/6}(x_s) \right) \right) . \tag{3.86} \]

Fix any $y \in I_g^{i+1}$, and let $z \in I_g^i$ be such that $y \in B_{r_i}(z)$. By induction, $T_i \cap B_{10r_{i+1}}(y) \subseteq T_i \cap B_{2r_i}(z)$ is the graph of a $C^1$ function over $V(z, r_i)$. Consider the points $\{y_s\} = I_g^{i+1} \cap B_{5r_{i+1}}(y)$. By construction it is easy to see that $d(y_s, V(z, r_i)) \leq r_{i+1}/9$, and so we can apply the estimates in Lemma 3.9 with $M = C_1$ by the first induction. Using condition (3.9), we obtain that for all $y_s$:
\[ r_{i+1}^{-1} d_H\left( V(z, r_i) \cap B_{r_{i+1}}(y_s) , V(y_s, r_{i+1}) \cap B_{r_{i+1}}(y_s) \right) \leq c \left( D^k_S(y_s, r_{i+1}) + D^k_S(z, r_i) \right) \leq c(n, \rho, C_1)\delta . \tag{3.87} \]

This implies that, if δ(n, ρ, C1) is small enough, Ti ∩ B10ri+1 (y) is a graph also over V(y, ri+1) satisfying the same estimates as in (3.71), up to a worse constant c. That is, if δ is sufficiently small, we can apply Lemma 3.13 and prove induction point (iii).

Points (iv) and (v) Points (iv) and (v) are proved with similar methods. We briefly sketch the proofs of these two points.

Let $y \in T_i$, and recall the function $\psi_{i+1} \equiv 1 - \sum_s \lambda_s^{i+1}$. If $\psi_{i+1}|_{B_{2r_{i+1}}(y)}$ is identically 1, then $\sigma_{i+1}|_{B_{2r_{i+1}}(y)} = \mathrm{id}$, and there is nothing to prove.

Otherwise, there must exist some $z' \in I_g^{i+1} \cap B_{5r_{i+1}}(y)$, and thus there exists a $z \in I_g^i$ such that $B_{3r_{i+1}}(y) \subseteq B_{r_i}(z)$. By our induction hypothesis, $T_i \cap B_{r_i}(z)$ is a Lipschitz graph over $V(z, r_i)$. Proceeding as before, by the estimates in Lemma 3.9 and Lemma 3.13, we obtain that $T_{i+1} \cap B_{2r_{i+1}}(y)$ is also a Lipschitz graph over $V(z, r_i)$ with small Lipschitz constant, and that $|\sigma_{i+1}(p) - p| \leq c\delta r_{i+1}$ for all $p \in T_i$. Moreover, $\sigma_{i+1}|_{T_i}$ is locally a diffeomorphism at scale $r_{i+1}$. From this we see that $\sigma_{i+1}$ is a diffeomorphism on the whole of $T_i$. Now we turn our attention to the items (vi), (vii), (viii).

Properties of the manifolds $T_i'$. Here we want to prove the measure estimate in (3.73). The basic idea is that bad and final balls correspond to holes in the manifold $T_i$, and each of these holes carries a $k$-dimensional measure which is proportionate to the measure inside the balls. In particular, let $B_r(y)$ be a bad or a final ball. In the first case, $r = r_{i+1}$, while in the second $r \in [r_{i+1}, r_i)$. In either case, we will see that $y$ must be $\sim r_{i+1}$-close to $T_i$, which is a Lipschitz graph at scale $r_i$. This implies that $\mu(B_r(y) \cap T_i) \sim r^k$, and thus we can bound the measure of a bad or final ball with the measure of the hole we have created on $T_i$.

In detail, point (vi) is an immediate consequence of the definition of $R^i$. As for point (vii), if $y \in I_g^i$, then by definition there exists $z \in I_g^{i-1}$ such that $y \in B_{5r_{i-1}/6}(z)$, and $y \notin R^{i-1}$. This implies that $y$ is far away from the balls we discard while building $T_i'$, in particular
\[ B_{3r_i}(y) \cap \left( \bigcup_{y' \in I_b^{i+1}} B_{r_{i+1}/6}(y') \cup \bigcup_{x_s \in I_f^{i+1}} B_{r_s/6}(x_s) \right) = \emptyset . \tag{3.88} \]

This proves that $T_i' \cap B_{2r_i}(y) = T_i \cap B_{2r_i}(y)$, and in turn point (vii). In order to prove the volume measure estimate, consider that
\[ T_i' \setminus \sigma_{i+1}^{-1}\left( T_{i+1}' \right) \subseteq \bigcup_{y \in I_b^{i+1}} B_{r_{i+1}/6}(y) \cup \bigcup_{x_s \in I_f^{i+1}} B_{r_s/6}(x_s) . \tag{3.89} \]

Note that the balls in the collection $\left\{ B_{r_{i+1}/5}(y) \right\}_{y \in I_f^{i+1} \cup I_b^{i+1}}$ are pairwise disjoint. Pick any $y \in I_b^{i+1}$, and let $z \in I_g^i$ be such that $y \in B_{r_i}(z)$. By definition, $y \in T_i'$ and $\mu(B_{r_{i+1}}(y)) < 2^k \gamma_k r_{i+1}^k < 10^{-k} \omega_k r_{i+1}^k$. Since $T_i' \cap B_{2r_i}(z)$ is a graph over $V(z, r_i)$ with $y \in T_i'$, then
\[ \lambda^k\left( T_i' \cap B_{r_{i+1}/6}(y) \right) \geq \omega_k 7^{-k} r_{i+1}^k . \tag{3.90} \]

A similar estimate holds for the final balls. The only difference is that if $x_s \in I_f^{i+1}$, then it is not true in general that $x_s \in T_i$. However, by (3.81), we still have that $d(x_s, V(z, r_i)) \leq r_{i+1}/10$. Given (3.71), we can conclude
\[ \lambda^k\left( T_i' \cap B_{r_s/7}(x_s) \right) \geq \omega_k 10^{-k} r_s^k . \tag{3.91} \]

Now it is evident from the definition of $T_i'$ that
\[ \lambda^k\left( \sigma_i^{-1}(T_i') \right) + \#\left( I_b^i \right) \omega_k (r_i/10)^k + \omega_k \sum_{x_s \in I_f^i} (r_s/10)^k \leq \lambda^k\left( T_{i-1}' \right) . \tag{3.92} \]

Volume estimates on the manifold part. Here we want to prove that for every measurable $\Omega \subseteq T_i$
\[ \lambda^k(\sigma_{i+1}(\Omega)) \leq \lambda^k(\Omega) + c(n, \rho, C_1) \int_{S \cap B_{r_j}(x)} D(p, 2r_{i+1}) \, d\mu(p) . \tag{3.93} \]

The main applications will be with $\Omega = T_i$ and $\Omega = T_i'$. In order to do that, we need to analyze in a quantitative way the bi-Lipschitz correspondence between $T_i$ and $T_{i+1}$ given by $\sigma_{i+1}$.

As we already know, $\sigma_{i+1} = \mathrm{id}$ on the complement of the set $G = \bigcup_{y \in I_g^{i+1}} B_{5r_{i+1}}(y)$, so we can concentrate only on this set. Using the same techniques as before, and in particular by Lemmas 3.9 and 3.13, we can prove that for each $y \in I_g^{i+1}$, the set $T_i \cap B_{5r_{i+1}}(y)$ is a Lipschitz graph over $V(y, r_{i+1})$ with Lipschitz constant bounded by
\[ c(n, \rho, C_1) \left( D(y, r_{i+1}) + \sum_{z \in I_g^i \cap B_{4r_i}(y)} D(z, r_i) \right)^{1/2} . \tag{3.94} \]

In a similar manner, we also have that $T_{i+1} \cap B_{5r_{i+1}}(y)$ is a Lipschitz graph over $V(y, r_{i+1})$ with Lipschitz constant bounded by
\[ c(n, \rho, C_1) \left( \sum_{z \in I_g^{i+1} \cap B_{10r_{i+1}}(y)} D(z, r_{i+1}) + \sum_{z \in I_g^i \cap B_{4r_i}(y)} D(z, r_i) \right)^{1/2} . \tag{3.95} \]

Summing up, we obtain that $\sigma_{i+1}$ restricted to $T_i \cap B_{5r_i}(y)$ is a bi-Lipschitz equivalence with bi-Lipschitz constant bounded by
\[ L(y, 5r_{i+1}) \leq 1 + c \left( \sum_{z \in I_g^{i+1} \cap B_{10r_{i+1}}(y)} D(z, r_{i+1}) + \sum_{z \in I_g^i \cap B_{4r_i}(y)} D(z, r_i) \right) . \tag{3.96} \]
In order to estimate this upper bound, we use (3.2) and the definition of good balls to write
\[ D(z, r) \leq c \fint_{B_r(z)} D(p, 2r) \, d\mu(p) \leq c(n, \rho, C_1) r^{-k} \int_{B_r(z)} D(p, 2r) \, d\mu(p) . \tag{3.97} \]
Since by construction any point $x \in \mathbb{R}^n$ can be covered by at most $c(n)$ different good balls at different scales, we can bound
\[ L(y, 5r_{i+1}) \leq 1 + \frac{c(n, \rho, C_1)}{r_{i+1}^k} \int_{B_{5r_i}(y)} \left( D(p, 2r_{i+1}) + D(p, 2r_i) \right) d\mu(p) . \tag{3.98} \]
We can also badly estimate
\[ D(p, 2r_{i+1}) + D(p, 2r_i) \leq c(n, \rho) D(p, 2r_i) . \tag{3.99} \]

Now let $P_s$ be a measurable partition of $\Omega \cap G$ such that for each $s$, $P_s \subseteq B_{5r_{i+1}}(y_s)$. By summing up the volume contributions of $P_s$, and since evidently $\lambda^k(P_s) \leq (1 + c\delta) r_{i+1}^k$, we get
\[ \lambda^k(\sigma_{i+1}(\Omega)) = \sum_s \lambda^k(\sigma_{i+1}(\Omega \cap P_s)) \leq \sum_s \lambda^k(P_s) \left( 1 + \frac{c}{r_{i+1}^k} \int_{B_{5r_i}(y_s)} D(p, 2r_i) \, d\mu(p) \right) \leq \lambda^k(\Omega) + c \int_{\bigcup_{y_s \in I_g^{i+1}} B_{5r_i}(y_s)} D(p, 2r_i) \, d\mu(p) \leq \lambda^k(\Omega) + c(n, \rho, C_1) \int_{B_{r_j}(x) \cap S} D(p, 2r_i) \, d\mu(p) . \tag{3.100} \]

Estimates on the excess set. In this paragraph, we estimate the total measure of the excess set, which is defined by
\[ E_T = \bigcup_{i=j}^{A} \bigcup_{y \in I_g^i} E(y, r_i) . \tag{3.101} \]
At each $y$ and at each scale, we have by (3.76) and (3.2)
\[ \mu(E(y, r_i)) \leq c(n, \rho) r_i^k D^k_S(y, 2r_i) \leq c(n, \rho) r_i^k \fint_{B_{2r_i}(y)} D^k_S(p, 4r_i) \, d\mu(p) . \tag{3.102} \]

Since by definition of the excess set $B_{r_i}(y)$ must be a good ball, then
\[ \mu(E(y, r_i)) \leq c(n, \rho) \int_{B_{2r_i}(y)} D^k_S(p, 4r_i) \, d\mu(p) . \tag{3.103} \]

Now by construction of the good balls, there exists a constant $c(n)$ such that at each step $i$, each $x \in \mathbb{R}^n$ belongs to at most $c(n)$ many balls. Thus for each $i \geq j$, we have
\[ \sum_{y \in I_g^i} \mu(E(y, r_i)) \leq c(n, \rho) \int_{\bigcup_{y \in I_g^i} B_{2r_i}(y)} D^k_S(p, 4r_i) \, d\mu(p) \leq c(n, \rho) \int_{B_{r_j}(x)} D^k_S(p, 4r_i) \, d\mu(p) . \tag{3.104} \]
If we sum over all scales, we get
\[ \mu(E_T) \leq c(n, \rho) \sum_{i=j}^{A} \int_{B_{r_j}(x)} D^k_S(p, 4r_i) \, d\mu(p) . \tag{3.105} \]
Since $\rho = 2^q$, it is clear that
\[ \mu(E_T) \leq c(n, \rho) \sum_{i=j}^{A} \int_{B_{r_j}(x)} D^k_S\left( p, 2^{qi-2} \right) d\mu(p) \leq c(n, \rho) \delta r_j^k , \tag{3.106} \]
since the sum in the middle is clearly bounded by (3.11).

This estimate is exactly what we want from the excess set.

Volume estimates. By adding (3.93), with $\Omega \equiv \sigma_{i+1}^{-1}(T_{i+1}')$, and (3.73), we prove that for all $i = j, \dots, A+1$,
\[ \lambda^k\left( T_{i+1}' \right) + \#\left( I_b^i \right) \omega_k (r_i/10)^k + \omega_k \sum_{x_s \in I_f^i} (r_s/10)^k \leq \lambda^k\left( T_i' \right) + c(n, \rho, C_1) \int_{B_{5r_j}(y) \cap B_1(0)} D(p, 2r_i) \, d\mu(p) . \tag{3.107} \]
Adding the contributions from all scales, by (3.6) we get
\[ \lambda^k\left( T_{i+1}' \right) + \sum_{t=j}^{i} \#\left( I_b^t \right) \omega_k (r_t/10)^k + \sum_{t=j}^{i} \sum_{x_s \in I_f^t} \omega_k (r_s/10)^k \leq \lambda^k\left( T_j \cap B_{2r_j}(x) \right) + c(n, \rho, C_1) \sum_{s=j}^{i} \int_{B_{5r_j}(y) \cap B_1(0)} D(p, 2r_s) \, d\mu(p) \leq \lambda^k\left( T_j \cap B_{2r_j}(x) \right) \left( 1 + c(n, \rho, C_1)\delta \right) , \tag{3.108} \]
where in the last line we estimated $\lambda^k\left( T_j \cap B_{2r_j}(x) \right) \sim r_j^k$, since $T_j$ is a $k$-dimensional subspace, and we bounded the sum using (3.11). In the same way, we can also bound the measure of $T_{i+1}$ by
\[ \lambda^k(T_{i+1}) \leq \lambda^k\left( T_j \cap B_{2r_j}(x) \right) \left( 1 + c(n, \rho, C_1)\delta \right) . \tag{3.109} \]

Upper estimates for $\mu$. Since we have assumed $r_s \geq \bar r = r_A$ for all $x_s \in S$, we know that for $j = A+1$ the whole support $S$ is contained in final and bad balls and excess sets. In particular, by (3.72)
\[ B = B_{r_j}(x) \subseteq R^A . \tag{3.110} \]
This fact and the estimates in (3.106) and (3.108) imply
\[ \mu(B) \leq \sum_{t=j}^{A} \#\left( I_b^t \right) 2^k \gamma_k \omega_k r_t^k + \sum_{t=j}^{i} \sum_{x_s \in I_f^t} \omega_k r_s^k + \mu(E_T) \leq C_3(n) \left( \sum_{t=j}^{A} \#\left( I_b^t \right) \omega_k (r_t/10)^k + \sum_{t=j}^{i} \sum_{x_s \in I_f^t} \omega_k (r_s/10)^k \right) + \mu(E_T) \leq C_3(n) \left( 1 + c(n, \rho, C_1)\delta \right) r_j^k . \tag{3.111} \]
In this last estimate, we can fix $C_1(n) = 2C_3(n) \leq 20^n$, and $\rho(n, C_1)$ according to Lemma 3.8. Now, it is easy to see that if $\delta(n, \rho, C_1)$ is sufficiently small, then
\[ \mu(B) \leq C_1(n) r_j^k , \tag{3.112} \]
which finishes the proof of the downward induction, and hence the actual ball estimate (3.58).

4. $L^2$-BEST APPROXIMATION THEOREMS FOR INTEGRAL CURRENTS

In this Section we prove the main estimate necessary for us to be able to apply the rectifiable-Reifenberg of Theorems 3.3 and 3.4 to the singular sets $S^k_\epsilon(I)$ of the stratification induced by integral varifolds with bounded mean curvature.

Namely, we need to understand how to estimate on a ball $B_r(x)$ the $L^2$-distance of $S^k_\epsilon$ from the best approximating $k$-dimensional subspace. When $r \ll \mathrm{inj}(M)$ this means we would like to consider subspaces $L^k \subseteq T_xM$ and estimate $d^2(x, L^k)$ for $x \in S^k_\epsilon \cap B_r(x)$, where the distance is taken in the normal coordinate charts. After rescaling this is equivalent to looking at a ball of definite size, but assuming $K \ll 1$. The main Theorem of this Section is stated in some generality as we will need to apply it with some care when proving Theorems 1.3 and 1.4:

Theorem 4.1 ($L^2$-Best Approximation Theorem). Let $I^m \subseteq B_8$ be an integral varifold satisfying (1.13), the mean curvature bound (1.14), the mass bound $I(B_8(p)) \leq \Lambda$, and let $\epsilon > 0$. Then there exist $\delta(n, \Lambda, \epsilon), C(n, \Lambda, \epsilon) > 0$ such that if $K, H < \delta$, and $B_8(p)$ is $(0, \delta)$-symmetric but not $(k+1, \epsilon)$-symmetric, then for any finite measure $\mu$ on $I^m \cap B_1(p)$ we have that
\[ D_\mu(p, 1) = \inf_{L^k} \int_{B_1(p)} d^2(x, L^k) \, d\mu(x) \leq C \int_{B_1(p)} W_0(x) \, d\mu(x) , \tag{4.1} \]
where the inf is taken over all $k$-dimensional affine subspaces $L^k \subseteq T_pM$.

Remark 4.1. The assumption $K, H < \delta$ is of little consequence, since given any $K$ this just means focusing the estimates on balls of sufficiently small radius after rescaling.

4.1. Symmetry and Gradient Bounds. In this subsection we study integral varifolds $I^m$ with bounded mean curvature which are not $(k+1, \epsilon)$-symmetric on some ball. In particular, we show that this forces for each $k+1$-subspace $V^{k+1}$ that $|\pi_I^\perp[V](x)| = |\langle N_xI, V \rangle|$ has some definite size in $L^2$, where $\pi_I^\perp[V](x)$ is the projection of $V$ to $N_xI$, the orthogonal complement of the tangent space $T_xI$ of $I$ at $x$. More precisely:

Lemma 4.2. Let $I^m \subseteq B_8$ be an integral varifold satisfying (1.13), the mean curvature bound (1.14), and the mass bound $I(B_8(p)) \leq \Lambda$. Then for each $\epsilon > 0$ there exists $\delta(n, \Lambda, \epsilon) > 0$ such that if $K, H < \delta$, and $B_4(p)$ is $(0, \delta)$-symmetric but is not $(k+1, \epsilon)$-symmetric, then for every $k+1$-subspace $V^{k+1} \subseteq T_pM$ we have
\[ \fint_{A_{3,4}(p)} |\pi_I^\perp[V]|^2 \, d\mu_I \geq \delta , \tag{4.2} \]
where $\pi_I^\perp[V](x)$ is the projection of $V$ to $N_xI$, the orthogonal complement to the tangent space of $I$ at $x$.

Proof. The proof is by contradiction. So with $n, \Lambda, \epsilon > 0$ fixed let us assume the result fails. Then there exists a sequence $I_i \subseteq B_8(p_i)$ for which $K_i, H_i < \delta_i$, with $B_4(p_i)$ being $(0, \delta_i)$-symmetric but not $(k+1, \epsilon)$-symmetric, and such that for some subspaces $V_i^{k+1}$ we have that
\[ \fint_{A_{3,4}(p_i)} |\pi_{I_i}^\perp[V_i]|^2 \, d\mu_{I_i} \leq \delta_i \to 0 . \tag{4.3} \]
After rotation we can assume $V_i^{k+1} = V^{k+1}$, and then after passing to a subsequence we have that
\[ I_i \longrightarrow I \subseteq B_8(0^n) , \tag{4.4} \]
in the flat convergence. In particular, we have that $I$ is a stationary varifold and
\[ \fint_{A_{3,4}(0^n)} |\pi_I^\perp[V]|^2 \, d\mu_I = 0 . \tag{4.5} \]
On the other hand, the $(0, \delta_i)$-symmetry of the $I_i$ tells us that $I$ is 0-symmetric. Combining these tells us that
\[ \fint_{B_4(0^n)} |\pi_I^\perp[V]|^2 \, d\mu_I = 0 , \tag{4.6} \]
and hence we have that $I$ is $k+1$-symmetric. Because the convergence $I_i \to I$ is in the flat norm, this contradicts that the $I_i$ are not $(k+1, \epsilon)$-symmetric for $i$ sufficiently large, which proves the Lemma.

4.2. Best $L^2$-Subspace Equations. In order to prove Theorem 4.1 we need to identify which subspace minimizes the $L^2$-energy, and the properties of this subspace that allow us to estimate the distance. We begin in Section 4.2.1 by studying some very general properties of the second directional moments of a general probability measure $\mu \subseteq B_1(p)$. We will then study in Section 4.2.2 a quantitative notion of almost-symmetry for stationary varifolds.

4.2.1. Second Directional Moments of a Measure. Let us consider a probability measure $\mu \subseteq B_1(0^n)$, and let

\[ x_{cm}^i = x_{cm}^i(\mu) \equiv \int x^i \, d\mu(x) \tag{4.7} \]

be the center of mass. Let us inductively consider the maximum of the second directional moments of $\mu$. More precisely:

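Definition 4.3. Let $\lambda_1 = \lambda_1(\mu) \equiv \max_{|v|^2=1} \int |\langle x - x_{cm}, v\rangle|^2 \, d\mu(x)$ and let $v_1 = v_1(\mu)$ with $|v_1| = 1$ be (any of) the vectors attaining this maximum. Now let us define inductively the pair $(\lambda_{k+1}, v_{k+1})$ from $v_1,\dots,v_k$ by

\[ \lambda_{k+1} = \lambda_{k+1}(\mu) \equiv \max_{|v|^2=1,\ \langle v, v_i\rangle = 0} \int |\langle x - x_{cm}, v\rangle|^2 \, d\mu(x) , \tag{4.8} \]

where the constraint $\langle v, v_i\rangle = 0$ is imposed for each $i \le k$, and $v_{k+1}$ is (any of) the vectors attaining this maximum.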

Thus $v_1,\dots,v_n$ defines an orthonormal basis of $\mathbb{R}^n$, ordered so that they maximize the second directional moments of $\mu$. Let us define the subspaces

\[ V^k = V^k(\mu) \equiv x_{cm} + \mathrm{span}\{v_1,\dots,v_k\} . \tag{4.9} \]

The following is a simple but important exercise:

Lemma 4.4. If $\mu$ is a probability measure in $B_1(0^n)$, then for each $k$ the functional

\[ \min_{L^k \subseteq \mathbb{R}^n} \int d^2(x, L^k) \, d\mu(x) , \tag{4.10} \]

where the min is taken over all $k$-dimensional affine subspaces, attains its minimum at $V^k$. Further, we have that

\[ \min_{L^k \subseteq \mathbb{R}^n} \int d^2(x, L^k) \, d\mu(x) = \int d^2(x, V^k) \, d\mu(x) = \lambda_{k+1}(\mu) + \cdots + \lambda_n(\mu) . \tag{4.11} \]

Note that the best affine subspace $V^k$ will necessarily pass through the center of mass $x_{cm}$.
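The reason behind Lemma 4.4 can be sketched in a few lines. The derivation below is a standard argument supplied here for illustration and is not taken from the text; the symbols $q$, $e_j$ and $Q$ are notation introduced only for this sketch.

```latex
% Sketch of Lemma 4.4 under assumed notation (not in the original):
%   L = q + span{e_1,...,e_k} with e_1,...,e_k orthonormal,
%   Q(v) = \int <x - x_cm, v>^2 d\mu(x).
\begin{align*}
d^2(x,L) &= |x-q|^2 - \sum_{j=1}^{k} \langle x-q, e_j\rangle^2 , \\
\int d^2(x,L)\,d\mu(x)
  &= \int |x-x_{cm}|^2\,d\mu(x) \;-\; \sum_{j=1}^{k} Q(e_j)
     \;+\; \Big( |x_{cm}-q|^2 - \sum_{j=1}^{k}\langle x_{cm}-q, e_j\rangle^2 \Big) .
\end{align*}
% The cross terms vanish because \int (x - x_cm) d\mu(x) = 0.
% The bracket equals d^2(x_cm, L) >= 0 and vanishes exactly when x_cm lies in L.
% Since Q is a quadratic form whose eigenvalues are \lambda_1 >= ... >= \lambda_n,
\begin{align*}
\int |x-x_{cm}|^2\,d\mu(x) = \sum_{j=1}^{n} \lambda_j ,
\qquad
\sum_{j=1}^{k} Q(e_j) \le \sum_{j=1}^{k} \lambda_j ,
\end{align*}
% with equality in the second relation for e_j = v_j, which gives (4.11).
```

Both minimizing choices are visible at once: the plane must contain $x_{cm}$, and its directions must capture the $k$ largest second moments.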

Now let us record the following Euler-Lagrange formula, which is also an easy computation:

Lemma 4.5. If $\mu$ is a probability measure in $B_1(0^n)$, then we have that $v_1(\mu),\dots,v_n(\mu)$ satisfy the Euler-Lagrange equations:

\[ \int \langle x - x_{cm}, v_k\rangle (x - x_{cm})^i \, d\mu(x) = \lambda_k v_k^i + \sum_{\ell} \lambda_{k,\ell} v_\ell^i , \tag{4.12} \]

4.2.2. Projections on the tangent spaces. Our goal now is to study an integral varifold $I^m$ in the directions spanned by $v_1(\mu),\dots,v_n(\mu)$, associated to a probability measure. The main result of this subsection is the following, which holds for a general integral varifold with bounded mean curvature. We recall that by definition

\[ W_\alpha(x) \equiv W_{r_\alpha, r_{\alpha-3}}(x) \equiv \theta_{r_{\alpha-3}}(x) - \theta_{r_\alpha}(x) \ge 0 , \tag{4.14} \]

where $r_\alpha = 2^{-\alpha}$.

Proposition 4.6. Let $I^m \subseteq B_8$ be an integral varifold satisfying (1.13), the mean curvature bound (1.14), and the mass bound $I(B_8(p)) \le \Lambda$. Let $\mu$ be a probability measure on $B_1(p)$ with $\lambda_k(\mu)$, $v_k(\mu)$ defined as in Definition 4.3. Then there exists $\delta(n) > 0$ such that if $K, H < \delta$, then

\[ \lambda_k \int_{A_{3,4}(p)} |\pi_I^\perp[v_k]|^2 \, d\mu_I(z) \le \delta^{-1} \int W_0(x) \, d\mu(x) , \tag{4.15} \]

where $\pi^\perp[v](x)$ is the projection of $v$ to $N_xI$, the orthogonal complement of the tangent space $T_xI$, which exists a.e.

Proof. Note first that there is no harm in assuming that $x_{cm} \equiv 0$. If not we can easily translate to make this so, in which case we still have that $\mathrm{supp}(\mu) \subseteq B_2$. Additionally, we will simplify the technical aspect of the proof by assuming that $M \equiv \mathbb{R}^n$. By working in normal coordinates the proof of the general case is no different except for some mild technical work.

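Now let us fix any $z \in A_{3,4}$, and observe that

\[ \int \langle x, z\rangle \, d\mu(x) = \langle x_{cm}, z\rangle = 0 . \tag{4.16} \]

Then combined with Lemma 4.5 we can take the inner product of both sides of (4.12) with $N_zI$, the normal to the tangent space of $I$ at $z$, to obtain for each $k$ and $z \in A_{3,4}$:

\[ \lambda_k \langle N_zI, v_k\rangle = \int \langle x, v_k\rangle \langle N_zI, x\rangle \, d\mu(x) - \sum_{\ell} \lambda_{k,\ell} \langle N_zI, v_\ell\rangle \]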

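Set for convenience $n_x(z) = (z - x)/|z - x|$, i.e., $n_x(z)$ is the radial vector from $x$ to $z$. Now for $x \in \mathrm{supp}(\mu)$ we can estimate

\begin{align*}
\int_{A_{3,4}} |\langle N_zI, x - z\rangle|^2 \, d\mu_I(z)
 &= \int_{A_{3,4}} |\langle N_zI, n_x(z)\rangle|^2 \, |x - z|^{-m} \, |x - z|^{2+m} \, d\mu_I(z) \\
 &\le C(n) \int_{A_{3,4}} |\langle N_zI, n_x(z)\rangle|^2 \, |x - z|^{-m} \, d\mu_I(z) \\
 &\le C(n) \int_{A_{1,8}(x)} |\langle N_zI, n_x(z)\rangle|^2 \, |x - z|^{-m} \, d\mu_I(z) \\
 &= C(n) W_0(x) . \tag{4.20}
\end{align*}

Applying this to (4.19) we get the estimate

\[ \lambda_k \int_{A_{3,4}} |\langle N_zI, v_k\rangle|^2 \, d\mu_I(z) \le C(n) \int W_0(x) \, d\mu(x) + 2 \sum_{\ell} \lambda_\ell \int_{A_{3,4}} |\langle N_zI, v_\ell\rangle|^2 \, d\mu_I(z) . \tag{4.21} \]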

4.3. Proof of Theorem 4.1. Let us now combine the results of this Section in order to prove Theorem 4.1. Indeed, let $\mu$ be a measure in $B_1(p) \subseteq T_pM$. We can assume that $\mu$ is a probability measure without any loss of generality, since both sides of our estimate scale. Let $(\lambda_1(\mu), v_1(\mu)),\dots,(\lambda_n(\mu), v_n(\mu))$ be the directional second moments as defined in Definition 4.3, with $V^k$ the induced subspaces defined as in (4.9). Using Lemma 4.4 we have that

\[ \min_{L^k \subseteq \mathbb{R}^n} \int d^2(x, L^k) \, d\mu(x) = \int d^2(x, V^k) \, d\mu(x) = \lambda_{k+1}(\mu) + \cdots + \lambda_n(\mu) \le (n-k)\lambda_{k+1}(\mu) , \tag{4.24} \]

where we have used that $\lambda_j \le \lambda_i$ for $j \ge i$. Therefore our goal is to estimate $\lambda_{k+1}$. To begin with, Proposition 4.6 tells us that for each $j$

\[ \lambda_j \int_{A_{3,4}(p)} |\langle N_zI, v_j\rangle|^2 \, d\mu_I(z) \le C \int W_0(x) \, d\mu(x) . \tag{4.25} \]

Let us sum the above for all $j \le k+1$ in order to obtain

\[ \sum_{j=1}^{k+1} \lambda_j \int_{A_{3,4}(p)} |\langle N_zI, v_j\rangle|^2 \, d\mu_I(z) \le (k+1) C \int W_0(x) \, d\mu(x) , \tag{4.26} \]

or by using that $\lambda_{k+1} \le \lambda_j$ for $k+1 \ge j$ we get

\[ \lambda_{k+1} \int_{A_{3,4}(p)} |\langle N_zI, V^{k+1}\rangle|^2 \, d\mu_I(z) = \lambda_{k+1} \sum_{j=1}^{k+1} \int_{A_{3,4}(p)} |\langle N_zI, v_j\rangle|^2 \, d\mu_I(z) \le C \int W_0(x) \, d\mu(x) . \tag{4.27} \]

Now we use that $B_8(p)$ is $(0,\delta)$-symmetric, but not $(k+1,\epsilon)$-symmetric, in order to apply Lemma 4.2 and conclude that

\[ \int_{A_{3,4}(p)} |\langle N_zI, V^{k+1}\rangle|^2 \, d\mu_I(z) \ge \delta . \tag{4.28} \]

Combining this with (4.27) we obtain

\[ \delta \lambda_{k+1} \le \lambda_{k+1} \int_{A_{3,4}(p)} |\langle N_zI, V^{k+1}\rangle|^2 \, d\mu_I(z) \le \delta^{-1} \int W_0(x) \, d\mu(x) , \tag{4.29} \]

or that

\[ \lambda_{k+1} \le C(n,\Lambda,\epsilon) \int W_0(x) \, d\mu(x) . \tag{4.30} \]

Combining this with (4.24) we have therefore proved the Theorem. □
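For bookkeeping, the constants in the last step can be tracked explicitly. The chain below is only a sketch: it assumes that the same $\delta$ serves both Lemma 4.2 and Proposition 4.6, as the display (4.29) suggests, and it makes no claim that the resulting constant is optimal.

```latex
% Constant tracking for the end of the proof of Theorem 4.1 (illustrative only).
\begin{align*}
\min_{L^k}\int d^2(x,L^k)\,d\mu(x)
   &\le (n-k)\,\lambda_{k+1}(\mu)
   && \text{by (4.24)},\\
\delta\,\lambda_{k+1}(\mu)
   &\le \delta^{-1}\int W_0(x)\,d\mu(x)
   && \text{by (4.29)},\\
\min_{L^k}\int d^2(x,L^k)\,d\mu(x)
   &\le (n-k)\,\delta^{-2}\int W_0(x)\,d\mu(x) .
\end{align*}
% Hence one admissible choice in (4.30) is C(n,\Lambda,\epsilon) = \delta(n,\Lambda,\epsilon)^{-2}.
```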

5. THE INDUCTIVE COVERING LEMMA

This Section is dedicated to the basic covering lemma needed for the proof of the main theorems of the paper. The covering scheme is similar in nature to the one introduced by the authors in [NVb, NVa] in order to prove structural theorems on critical and singular sets. Specifically, let us consider an integral varifold $I^m$ with bounded mean curvature; then we will build a covering of the quantitative stratification

\[ S^k_{\epsilon,r}(I) \cap B_1(p) \subseteq U_r \cup U_+ = U_r \cup \bigcup B_{r_i}(x_i) , \tag{5.1} \]

which satisfies several basic properties. To begin with, the set $U_+$ is a union of balls satisfying $r_i > r \ge 0$, and should satisfy the packing estimate $\omega_k \sum r_i^k \le C$. Each ball $B_{r_i}(x_i)$ should have the additional property that there is a definite mass drop of $I$ when compared to $B_2(p)$. To describe the set $U_r$ we should distinguish between the case $r > 0$ and $r \equiv 0$. In the case $r > 0$ we will have that $U_r = \bigcup B_r(x_i^r)$ is a union of $r$-balls and satisfies the Minkowski estimate $\mathrm{Vol}(U_r) \le C r^{n-k}$. In the case when $r \equiv 0$ we will have that $U_0$ is $k$-rectifiable with the Hausdorff estimate $\lambda^k(U_0) \le C$. Let us be more precise:

Lemma 5.1 (Covering Lemma). Let $I^m \subseteq B_2$ be an integral varifold satisfying the bounds (1.13), the mean curvature bound (1.14), the mass bound $I(B_2(p)) \le \Lambda$, and assume $K + H < \delta(n,\Lambda,\epsilon)$. Let $E = \sup_{x \in B_1(p)} \theta_1(x)$ with $\epsilon > 0$, $r \ge 0$, and $k \in \mathbb{N}$. Then for all $\eta \le \eta(n,\Lambda,\epsilon)$, there exists a covering $S^k_{\epsilon,r}(I) \cap B_1(p) \subseteq U = U_r \cup U_+$ such that

(1) $U_+ = \bigcup B_{r_i}(x_i)$ with $r_i > r$ and $\sum r_i^k \le C(n, H, \Lambda, \epsilon)$.
(2) $\sup_{y \in B_{r_i}(x_i)} \theta_{r_i}(y) \le E - \eta$.
(3) If $r > 0$ then $U_r = \bigcup_1^N B_r(x_i^r)$ with $N \le C(n) r^{-k}$.
(4) If $r = 0$ then $U_0$ is $k$-rectifiable and satisfies $\mathrm{Vol}(B_s(U_0)) \le C(n) s^{n-k}$ for each $s > 0$. In particular, $\lambda^k(U_0) \le C(n)$.

Remark 5.1. The assumption $K + H < \delta$ is of little consequence, since given any $K$ this just means focusing the estimates on balls of sufficiently small radius after rescaling.

To prove the result let us begin by outlining the construction of the covering. We will then spend the rest of this section proving that the constructed cover has all the desired properties.

Thus let us consider some $\eta > 0$ fixed, and then define the mass scale for $x \in B_1(p)$ by

\[ s_x = s_x^{E,\eta} \equiv \inf\Big\{ r \le s \le 1 : \sup_{B_s(x)} \theta_s(y) \ge E - \eta \Big\} . \tag{5.2} \]

Note that the mass scale implicitly depends on many constants. If $r = 0$ let us define the set $U_0$ by

\[ U_0 \equiv \big\{ x \in S^k_{\epsilon,0}(I) \cap B_1(p) : s_x = 0 \big\} , \tag{5.3} \]

while if $r > 0$ let us define

\[ U_r = \bigcup B_r(x_i^r) , \tag{5.4} \]

where

\[ \{x_i^r\} \subseteq \big\{ x \in S^k_{\epsilon,r}(I) \cap B_1(p) : s_x = r \big\} \tag{5.5} \]

is any minimal $r/2$-dense set. In particular, note that the balls in the collection $\{B_{r/8}(x_i^r)\}$ are disjoint.

In order to define the covering $U_+ = \{B_{r_i}(x_i)\}$ let us first consider the covering

\[ \big\{ x \in S^k_{\epsilon,r}(I) \cap B_1(p) : s_x > r \big\} \subseteq \bigcup_{s_x > r} B_{s_x/5}(x) , \tag{5.6} \]

and choose from it a Vitali subcovering and set

\[ U_+ \equiv \bigcup_{i \in J} B_{r_i}(x_i) , \tag{5.7} \]

where $r_i \equiv s_{x_i}$. In particular, we have that the balls in the collection $\{B_{r_i/5}(x_i)\}$ are all disjoint. It is clear from the construction that we have built a covering

\[ S^k_{\epsilon,r}(I) \cap B_1(p) \subseteq U_r \cup U_+ . \tag{5.8} \]

We now need to study the properties of $U_r$ and $U_+$.

The outline of this Section is as follows. In Sections 5.1 and 5.2 we will deal with estimating the set $U_r$. In fact, from a technical standpoint $U_r$ is much easier than $U_+$ to deal with, and for the rectifiability and local $\lambda^k$-finiteness of $S^k_{\epsilon,r}$ this is the set which is most important. Indeed, for $U_r$ we will be able to almost directly use the rectifiable-Reifenberg of Theorems 3.4 and 3.3, at least when combined with an additional induction argument. However, for the global Minkowski estimates and $k$-dimensional Hausdorff measure bounds on $S^k_{\epsilon,r}$ it is crucial to deal with the set $U_+$ as well. To do this we first prove a variety of technical lemmas in Section 5.3; these will allow us to exchange $U_+$ for a more manageable collection of balls without losing much content. Then in Section 5.4 we will be able to argue as in the $U_r$ case by applying the rectifiable-Reifenberg of Theorem 3.3. However, we will apply it not to $U_+$ but to a more carefully chosen covering obtained by exploiting the results of Section 5.3.

5.1. Estimating $U_r$ in Lemma 5.1 for $r > 0$. Let us begin by altering the collection $U_r = \bigcup B_r(x_i^r)$ slightly. Indeed, by definition of $U_r$, we have that for each ball $B_r(x_i^r)$ there exists a point $x_i' \in B_r(x_i^r)$ such that $\theta_r(x_i') \ge E - \eta$. Then we have the covering

\[ U_r \subseteq \bigcup_1^N B_{2r}(x_i') , \tag{5.9} \]

and let $U_r' \equiv \bigcup_1^{N'} B_{2r}(x_i')$ be a maximal disjoint subset of these balls. Let us first observe that by a standard covering argument we have $N \le C(n) N'$, and thus to estimate $U_r$ it is enough to estimate $N'$. Indeed, by the maximality of $U_r'$ we have that for every ball center $x_i^r \in U_r$ there exists a ball center $x_j' \in U_r'$ such that $x_i^r \in B_{4r}(x_j')$. Since the balls $\{B_{r/8}(x_i^r)\}$ are disjoint, we have that

\[ N \cdot \omega_n \Big(\frac{r}{8}\Big)^n \le \mathrm{Vol}(U_r) \le \sum_1^{N'} \mathrm{Vol}(B_{4r}(x_j')) \le N' \cdot \omega_n (4r)^n , \tag{5.10} \]

which in particular gives us the desired estimate $N \le 32^n N'$.

Now let us proceed to estimate $N'$. Thus consider the measure associated to $U_r'$ given by

\[ \mu = \sum \omega_k r^k \delta_{x_i'} . \tag{5.11} \]

Let us consider the sequence of radii $r_\alpha = 2^{-\alpha}$. We now wish to prove that for each $x \in B_1$ and all $r \le r_\alpha \le 1$ that

\[ \mu(B_{r_\alpha}(x)) \le D(n) r_\alpha^k , \tag{5.12} \]

where $D(n)$ is from Theorem 3.3. Let us first observe that once (5.12) has been proved then we have the desired estimate on $N'$. Indeed, by applying the estimate to $x = 0$ and $r_0 = 1$ we have that

\[ N' \omega_k r^k = \sum_1^{N'} \omega_k r^k = \mu(B_1) \le D(n) , \tag{5.13} \]

which gives the claimed estimate $N' \le C(n) r^{-k}$.
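The two counting steps combine into an explicit bound on $N$. This is only a sketch, and it assumes the normalization $\mu = \sum \omega_k r^k \delta_{x_i'}$ for the measure in (5.11); the constant obtained is one admissible choice for item (3) of Lemma 5.1.

```latex
% Explicit form of the estimate N <= C(n) r^{-k}; N' is the number of balls in U_r'.
\begin{align*}
N' &\le \frac{D(n)}{\omega_k}\, r^{-k}
   && \text{from (5.13)},\\
N  &\le 32^{n} N'
   && \text{from (5.10), since } (4r)^n / (r/8)^n = 32^n,\\
N  &\le \frac{32^{n} D(n)}{\omega_k}\, r^{-k} .
\end{align*}
```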

Thus let us now concentrate on proving (5.12). We will prove it by induction on $\alpha$. Let $\alpha_0$ be such that $r \le r_{\alpha_0} < 2r$. Let us begin by observing that the result clearly holds for $r_{\alpha_0}$. Therefore let us now assume we have proved (5.12) for some $r_{\alpha+1}$, and then proceed to prove it for $r_\alpha$.

Let us first observe a rough estimate in this direction through a covering argument. Namely, let us consider the radii $r_{\alpha+1} \le s \le 4r_\alpha$. We can cover $B_s(x)$ by a collection of at most $C(n)$ balls $\{B_{r_{\alpha+1}}(y_i)\}$, and thus by the inductive assumption we have for all $x \in B_1$ and $s \le 4r_\alpha$ that

\[ \mu(B_s(x)) \le \sum \mu(B_{r_{\alpha+1}}(y_i)) \le D(n) C(n) r_{\alpha+1}^k \le D'(n) s^k , \tag{5.14} \]

where of course $D'(n) \gg D(n)$.

Now let us consider a ball $B_s(y) \subseteq B_1$ with $s \le 2r_\alpha$. If $\mu(B_s(y)) \le \epsilon_k s^k$ then $D_\mu(y,s) \equiv 0$ by definition, while if $s \le r$ then we also have that $D_\mu(y,s) \equiv 0$, since the support of $\mu$ in $B_r(x)$ contains at most one point and thus is precisely contained in a $k$-dimensional subspace. Thus let us consider the case when $s > r$ and $\mu(B_s(y)) > \epsilon_k s^k$. In this case notice by Theorem 2.2 that there is certainly at least one point $z \in \mathrm{supp}(\mu) \cap B_s(x)$ which is $(0,\delta)$-symmetric, where $\delta = \delta(\eta \,|\, n, \Lambda) \to 0$ as $\eta \to 0$. Thus for $\eta \le \eta(n,\Lambda)$ we can apply Theorem 4.1 to see that

\[ D_\mu(y,s) \le C(n,\Lambda,\epsilon) s^{-k} \int_{B_s(y)} W_s(z) \, d\mu(z) . \tag{5.15} \]

By applying this to all $r \le t \le s$ we have

\begin{align*}
s^{-k} \int_{B_s(x)} D_\mu(y,t) \, d\mu(y)
 &\le C s^{-k} t^{-k} \int_{B_s(x)} \int_{B_t(y)} W_t(z) \, d\mu(z) \, d\mu(y) \\
 &= C s^{-k} t^{-k} \int_{B_s(x)} \mu(B_t(y)) W_t(y) \, d\mu(y) \\
 &\le C s^{-k} \int_{B_s(x)} W_t(y) \, d\mu(y) , \tag{5.16}
\end{align*}

where we have used our rough estimate (5.14) in the last line. Let us now consider the case when $t = r_\beta \le s \le 2r_\alpha$. Then we can sum to obtain:

\begin{align*}
\sum_{r_\beta \le s} s^{-k} \int_{B_s(x)} D_\mu(y, r_\beta) \, d\mu(y)
 &\le C \sum_{r \le r_\beta \le s} s^{-k} \int_{B_s(x)} W_{r_\beta}(y) \, d\mu(y) \\
 &= C s^{-k} \int_{B_s(x)} \sum_{r \le r_\beta \le s} W_{r_\beta}(y) \, d\mu(y) \\
 &\le C s^{-k} \int_{B_s(x)} \big| \theta_s(y) - \theta_r(y) \big| \, d\mu(y) \le C(n,\Lambda,\epsilon) \eta , \tag{5.17}
\end{align*}

where in the last line we have used both the rough estimate (5.14) on $\mu(B_s(x))$ and that in the support of $\mu$ we have that $\theta_s(y) - \theta_r(y) \le \eta$ by the construction of $U_r'$. Now let us choose $\eta \le \eta(n,\Lambda,\epsilon)$ such that we have

\[ \sum_{r_\beta \le s} s^{-k} \int_{B_s(x)} D_\mu(y, r_\beta) \, d\mu(y) \le \delta^2 , \tag{5.18} \]

where $\delta$ is taken from Theorem 3.3. Since the estimate (5.18) holds for all $B_s \subseteq B_{2r_\alpha}(x)$, we can therefore apply Theorem 3.3 to conclude the estimate

\[ \mu(B_{r_\alpha}(x)) \le D(n) r_\alpha^k . \tag{5.19} \]

This finishes the proof of (5.12), and hence the proof of the estimate on $U_r$ for $r > 0$ in Lemma 5.1. □

5.2. Estimating $U_0$ in Lemma 5.1. Let us begin by proving the Minkowski estimates on $U_0$. Indeed, observe that for any $r > 0$ we have $U_0 \subseteq U_r$, and thus we have the estimate

\[ \mathrm{Vol}(B_r(U_0)) \le \mathrm{Vol}(B_r(U_r)) \le \omega_n r^n \cdot N \le C(n,\Lambda,\epsilon) r^{n-k} , \tag{5.20} \]

which proves the Minkowski claim. In particular, we have as a consequence the $k$-dimensional Hausdorff measure estimate

\[ \lambda^k(U_0) \le C(n,\Lambda,\epsilon) . \tag{5.21} \]

In fact, let us conclude a slightly stronger estimate, since it will be a convenient technical tool in the remainder of the proof. If $B_s(x)$ is any ball with $x \in B_1$ and $s < \frac{1}{2}$, then by applying the same proof to the rescaled ball $B_s(x) \to B_1(0)$, we can obtain the Hausdorff measure estimate

\[ \lambda^k(U_0 \cap B_s(x)) \le C s^k . \tag{5.22} \]
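The passage from the Minkowski estimate (5.20) to the Hausdorff estimate (5.21) is a standard packing argument. A sketch is recorded below for convenience; the maximal separated set $\{x_i\}$, its cardinality $N_r$, and the premeasure notation $\lambda^k_{2r}$ are introduced only for this illustration.

```latex
% Minkowski content bound => Hausdorff measure bound (standard argument, sketched).
% Assumed notation: {x_1,...,x_{N_r}} is a maximal subset of U_0 with |x_i - x_j| >= r,
% and \lambda^k_{2r} is the Hausdorff premeasure using covers by sets of diameter <= 2r.
\begin{align*}
N_r\,\omega_n \Big(\frac{r}{2}\Big)^{n}
   &\le \mathrm{Vol}\big(B_r(U_0)\big) \le C\,r^{n-k}
   && \text{(the balls } B_{r/2}(x_i) \text{ are disjoint and lie in } B_r(U_0)\text{)},\\
N_r &\le \frac{2^{n} C}{\omega_n}\, r^{-k},\\
\lambda^k_{2r}(U_0) &\le N_r\,\omega_k\, r^{k} \le \frac{2^{n}\,\omega_k}{\omega_n}\, C
   && \text{(by maximality, } U_0 \subseteq \textstyle\bigcup_i B_r(x_i)\text{)}.
\end{align*}
% Letting r -> 0 gives \lambda^k(U_0) <= 2^n \omega_k \omega_n^{-1} C.
```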

To finish the construction we need to see that $U_0$ is rectifiable. We will in fact apply Theorem 3.4 in order to conclude this. To begin with, let $\mu \equiv \lambda^k|_{U_0}$ be the $k$-dimensional Hausdorff measure, restricted to $U_0$. Let $B_s(y)$ be a ball with $y \in B_1$ and $s < \frac{1}{2}$; we will now argue in a manner similar to Section 5.1. Thus, if $\mu(B_s(y)) \le \epsilon_k s^k$ then $D_\mu(y,s) \equiv 0$, and otherwise we then have by Theorem 4.1 that

\[ D_\mu(y,s) \le C(n,\Lambda,\epsilon) s^{-k} \int_{B_s(y)} W_s(z) \, d\mu(z) . \tag{5.23} \]

By applying this to all $t \le s$ we have

\begin{align*}
s^{-k} \int_{B_s(x)} D_\mu(y,t) \, d\mu(y)
 &\le C s^{-k} t^{-k} \int_{B_s(x)} \int_{B_t(y)} W_t(z) \, d\mu(z) \, d\mu(y) \\
 &= C s^{-k} t^{-k} \int_{B_s(x)} \mu(B_t(y)) W_t(y) \, d\mu(y) \\
 &\le C s^{-k} \int_{B_s(x)} W_t(y) \, d\mu(y) , \tag{5.24}
\end{align*}

where we have used our estimate (5.22) in the last line. Let us now consider the case when $t = r_\beta = 2^{-\beta} \le s$. Then we can sum to obtain:

\begin{align*}
\sum_{r_\beta \le s} s^{-k} \int_{B_s(x)} D_\mu(y, r_\beta) \, d\mu(y)
 &\le C \sum_{r_\beta \le s} s^{-k} \int_{B_s(x)} W_{r_\beta}(y) \, d\mu(y) \\
 &= C s^{-k} \int_{B_s(x)} \sum_{r_\beta \le s} W_{r_\beta}(y) \, d\mu(y) \\
 &\le C s^{-k} \int_{B_s(x)} \big| \theta_s(y) - \theta_0(y) \big| \, d\mu(y) \le C(n,\Lambda,\epsilon) \eta , \tag{5.25}
\end{align*}

where we have used two points in the last line. First, we have used our estimate $\mu(B_s(x)) \le C s^k$. Second, we have used that by the definition of $U_0$, for each point in the support of $\mu$ we have that $\theta_s(y) - \theta_0(y) \le \eta$. Now let us choose $\eta \le \eta(n,\Lambda,\epsilon)$ such that we have

\[ \sum_{r_\beta \le s} s^{-k} \int_{B_s(x)} D_\mu(y, r_\beta) \, d\mu(y) \le \delta^2 , \tag{5.26} \]

where $\delta$ is chosen from Theorem 3.4. Thus, by applying Theorem 3.4 we see that $U_0$ is rectifiable, which finishes the proof of Lemma 5.1 in the context of $U_0$. □

5.3. Technical Constructions for Estimating $U_+$. Estimating the set $U_+$ is in spirit similar to the estimates obtained in the last subsections on $U_r$ and $U_0$. However, the estimate on $U_+$ itself is a bit more delicate, and we cannot directly apply the discrete Reifenberg of Theorem 3.3 to this set. Instead, we will need to replace $U_+$ with a different covering at each stage, which will be more adaptable to Theorem 3.3. This subsection is dedicated to proving a handful of technical results which are important in the construction of this new covering.

Throughout this subsection we are always working under the assumptions of Lemma 5.1. Let us begin with the following point, which is essentially a consequence of the continuity of the mass:

Lemma 5.2. For each $\eta' > 0$ there exists $R(n,\Lambda,\eta') > 1$ such that with $\eta \le \eta(n,\Lambda,\eta')$ we have for each $z \in B_{r_i}(x_i)$ the estimate

\[ \theta_{R r_i}(z) > E - \eta' . \tag{5.27} \]

Proof. The proof relies on a straightforward comparison using the definition and monotonicity of the mass. Namely, let $x, y \in B_{1/2}$ with $s < 1$ and let us denote $d \equiv d(x,y)$. Then we have the estimate

\[ \theta_s(y) = s^{-k} \int_{B_s(y)} d\mu_I \le s^{-k} \int_{B_{s+d}(x)} d\mu_I = \Big( \frac{s}{s+d} \Big)^{-k} \theta_{s+d}(x) . \tag{5.28} \]

To apply this in our context, let us note for each $x_i$ in our covering, that by our construction of $U_+$ there must exist $y_i \in B_{r_i}(x_i)$ such that $\theta_{(R-1)r_i}(y_i) \ge \theta_{r_i}(y_i) = E - \eta$. Let us now apply (5.28) to obtain

\[ \theta_{R r_i}(z) \ge \Big( \frac{R r_i}{(R-1) r_i} \Big)^{-k} \theta_{(R-1)r_i}(y_i) \ge \Big( \frac{R}{R-1} \Big)^{-k} \big( E - \eta \big) . \tag{5.29} \]

If $R = R(n,\Lambda,\eta') > 0$ and $\eta \le \eta(n,\Lambda,\eta')$, then we obtain from this the claimed estimate. □

In words, the above Lemma is telling us that even though we have no reasonable control over the size of $\theta_{r_i}(x_i)$, after we go up a controlled number of scales we can again assume that the mass is close to $E$.
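One explicit way to choose the constants in Lemma 5.2 can be read off from (5.29). The sketch below uses Bernoulli's inequality and assumes that $E$ is bounded in terms of $n$ and $\Lambda$ through the mass bound, so that the resulting $R$ depends only on $n$, $\Lambda$ and $\eta'$; the specific thresholds are illustrative choices, not the ones made in the text.

```latex
% A possible choice of R and \eta in Lemma 5.2 (illustrative).
\begin{align*}
\theta_{R r_i}(z)
  &\ge \Big(1 - \frac{1}{R}\Big)^{k} (E - \eta)
  && \text{by (5.29)},\\
  &\ge \Big(1 - \frac{k}{R}\Big) (E - \eta)
  && \text{by Bernoulli's inequality, for } R \ge 1,\\
  &\ge E - \eta - \frac{kE}{R} .
\end{align*}
% If \eta <= \eta'/2 and R > max{1, 2kE/\eta'}, then kE/R < \eta'/2, and therefore
% \theta_{R r_i}(z) > E - \eta'/2 - \eta'/2 = E - \eta'.
```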

In the last Lemma the proof was based on continuity estimates on the mass θ. In the next Lemma we wish to show an improved version of this continuity under appearance of symmetry. Precisely:

Lemma 5.3 (Improved Continuity of $\theta$). Let $I^m \subseteq B_2$ be an integral varifold satisfying the bounds (1.13), the mean curvature bound (1.14), and the mass bound $I(B_2(p)) \le \Lambda$. Then for $0 < \tau, \eta < 1$ there exists $\delta(n,\Lambda,\eta,\tau) > 0$ such that if

(1) we have $K, H < \delta$,
(2) there exist $x_0,\dots,x_k \in I \cap B_1(p)$ which are $\tau$-independent with $|\theta_{\delta r}(x_j) - \theta_3(x_j)| < \delta$,

then if $V^k$ is the $k$-dimensional subspace spanned by $x_0,\dots,x_k$, then for all $x, y \in I \cap B_1(p) \cap B_\delta(V^k)$ and $10^{-4} \le s \le 1$ we have that $|\theta_s(x) - \theta_s(y)| < \eta$.

Proof. The proof is by contradiction. Thus, imagine no such $\delta$ exists. Then there exists a sequence of integral varifolds $I_i \subseteq B_4(p_i)$ satisfying $I_i(B_2(p)) \le \Lambda$ and such that

(1) $K, H < \delta_i \to 0$,
(2) there exist $x_{i,0},\dots,x_{i,k} \in B_1(p)$ which are $\tau$-independent with $\big| \theta^{I_i}_{\delta_i r}(x_{i,j}) - \theta^{I_i}_3(x_{i,j}) \big| < \delta_i \to 0$,

however we have that there exist $x_i, y_i \in I_i \cap B_1(p) \cap B_{\delta_i}(V_i^k)$ and $10^{-4} \le s_i \le 1$ such that $\big| \theta^{I_i}_{s_i}(x_i) - \theta^{I_i}_{s_i}(y_i) \big| \ge \eta$. Since $\partial I \cap B_2(p) = \emptyset$ and the masses are uniformly bounded, we may pass to a subsequence to obtain an integral varifold $I_i \to I$ as well as the collection

\begin{align*}
& x_{i,\beta} \to x_\beta \in B_1(0^n) , \\
& x_i \to x, \quad y_i \to y \in V = \mathrm{span}\{x_0,\dots,x_k\} , \\
& s_i \to s . \tag{5.30}
\end{align*}

In particular, we have in this limit that

\begin{align*}
& \big| \theta^I_0(x_\beta) - \theta^I_3(x_\beta) \big| = 0 , \\
& \big| \theta^I_s(x) - \theta^I_s(y) \big| \ge \eta . \tag{5.31}
\end{align*}

However, by Theorem 2.2 and the standard cone splitting of Theorem 2.4 we have that $I$ is $k$-symmetric with respect to the $k$-plane $V$ on $B_1$. In particular, $I$ is invariant under translation by elements of $V$. However, this is a contradiction to (5.31), and therefore we have proved the result. □

Now the first goal is to partition $U_+$ into a finite collection, each of which will have a few more manageable properties than $U_+$ itself. More precisely:

Lemma 5.4. For each $R > 1$ there exists $N(n,R) > 1$ such that we can break up $U_+$ as a union

\[ U_+ = \bigcup_{a=1}^{N} U_+^a = \bigcup_{a=1}^{N} \bigcup_{i \in J^a} B_{r_i}(x_i) , \tag{5.32} \]

such that each $U_+^a$ has the following property: if $i \in J^a$, then for any other $j \in J^a$ we have that if $x_j \in B_{R r_i}(x_i)$, then $r_j < R^{-2} r_i$.

Proof. Let us recall that the balls in the collection $\{B_{r_i/5}(x_i)\}$ are pairwise disjoint. In particular, given $R > 1$, if we fix a ball $B_{r_i}(x_i)$ then by the usual covering arguments there can be at most $N(n,R)$ ball centers $\{x_j\}_1^N \in U_+ \cap B_{R^3 r_i}(x_i)$ with the property that $r_j \ge R^{-6} r_i$. Indeed, if $\{x_j\}_1^N$ is such a collection of balls then we get

\[ \omega_n (6R^3 r_i)^n = \mathrm{Vol}(B_{6R^3 r_i}(x_i)) \ge \sum_1^N \mathrm{Vol}(B_{r_j/5}(x_j)) \ge N \omega_n (R^{-6} r_i/5)^n , \tag{5.33} \]

which by rearranging gives the estimate $N \le N(n,R)$ as claimed.
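Rearranging (5.33) gives an explicit value for the counting constant. The computation below is only the arithmetic behind the sentence above and takes (5.33) as given.

```latex
% Explicit counting constant obtained from (5.33).
\begin{align*}
N\,\omega_n \Big(\frac{R^{-6} r_i}{5}\Big)^{n} \le \omega_n \big(6R^{3} r_i\big)^{n}
\quad\Longrightarrow\quad
N \le \big(6R^{3}\cdot 5R^{6}\big)^{n} = \big(30\,R^{9}\big)^{n} .
\end{align*}
% So N(n,R) = (30 R^9)^n is one admissible choice.
```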

Now we wish to build our decomposition $U_+ = \bigcup_1^N U_+^a$, where $N$ is from the first paragraph. We shall do this inductively, with the property that at each step of the inductive construction the sets $U_+^a$ will satisfy the desired property. In particular, for every $a$ and $i \in J^a$, if $j \in J^a$ is such that $x_j \in B_{R r_i}(x_i)$, then $r_j < R^{-2} r_i$.

Begin by letting each $J^a$ be empty. We are going to sort the points $\{x_i\}_{i \in J}$ into the sets $J^a$ one at a time. At each step let $i \in J \setminus \bigcup_{a=1}^N J^a$ be an index such that $r_i = \max r_j$, where the max is taken over all indexes in $J \setminus \bigcup_{a=1}^N J^a$, i.e., over all indexes which have not been sorted yet. Now let us consider the collection of ball centers $\{y_j\}_{j \in J'}$ such that $J' \subset J$, $x_i \in B_{R r_j}(y_j)$ and $r_i \le r_j \le R^2 r_i$. Note that, by construction, either $y_j$ has already been sorted into some $J^a$, or $r_j = r_i$. Now evidently $y_j \in B_{R^3 r_i}(x_i)$ for all $j \in J'$ and so by the first paragraph the cardinality of $J'$ is at most $N(n,R)$. In particular, there must be some $U_+^a$ such that $U_+^a \cap \{y_j\} = \emptyset$. Let us assign $x_i \in U_+^a$ to this piece of the decomposition. Clearly the decomposition $\bigcup U_+^a$ still satisfies the inductive hypothesis after the addition of this point, and so this finishes the inductive step of the construction. Since at each stage we have chosen $x_i$ to have the maximum radius, this process will continue indefinitely to give the desired decomposition of $U_+$. □

Now with a decomposition fixed, let us consider for each $1 \le a \le N$ the measures

\[ \mu^a \equiv \sum_{x_i \in U_+^a} \omega_k r_i^k \delta_{x_i} . \tag{5.34} \]

The following is a crucial point in our construction. It tells us that each ball $B_{10 r_i}(x_i)$ either has small $\mu^a$-volume, or the point $x_i$ must have large mass at scale $r_i$. Precisely:

Lemma 5.5. Let $\eta', D > 0$ be fixed. There exists $R = R(n,\Lambda,\eta',D,\epsilon) > 0$ such that if we consider the decomposition (5.32), and if

(1) $K, H < \delta(n,\Lambda,\eta',\epsilon)$ and $\eta \le \eta(n,\Lambda,\eta',\epsilon)$, $r_i < 10^{-4}$,
(2) we have $\mu^a(B_{10 r_i}(x_i)) \ge 2\omega_k r_i^k$,
(3) for all ball centers $y_j \in A_{r_i/5,\,10 r_i}(x_i) \cap U_+^a$ and for $s = 10^{-2n} D^{-1} r_i$, we have that $\mu^a(B_s(y_j)) \le D s^k$,

then we have that $\theta_{r_i/10}(x_i) \ge E - \eta'$.

Proof. Let us begin by choosing $\eta'' \ll \eta' < \eta'(n,\Lambda,\epsilon)$, which will be fixed later in the proof, and let us also define $\tau \equiv 10^{-2n} D^{-1}$. Let $\delta(n,\Lambda,\eta')$ be from Lemma 5.3 so that the conclusions hold with $10^{-1}\eta'$, and let $\delta'(n,\Lambda,\eta'',\epsilon)$ be chosen so that the conclusions of Lemma 5.2 are satisfied with $\eta''$ if $\eta \le \eta(n,\Lambda,\eta'',\epsilon)$. Now throughout we will assume $\eta'' < \delta$ and $K, H < \min\{\delta, \delta'\}$. We will also choose $R = R(n,\Lambda,\eta'',D,\epsilon) > \max\{\tau^{-1}, \delta^{-1}, \delta'^{-1}\}$ so that Lemma 5.2 is satisfied with $\eta''$.

Since it will be useful later, let us first observe that $r_i \ge R r$. Indeed, if not then for each ball center $y_j \in A_{r_i/5,\,10 r_i}(x_i) \cap U_+^a$ we would have $r \le r_j < R^{-2} r_i < r$. This tells us that there can be no ball centers in $A_{r_i/5,\,10 r_i}(x_i) \cap U_+^a$. However, by our volume assumption we have that

\[ \mu^a\big(A_{r_i/5,\,10 r_i}(x_i)\big) = \mu^a\big(B_{10 r_i}(x_i)\big) - \mu^a\big(B_{r_i/5}(x_i)\big) \ge 2\omega_k r_i^k - \omega_k r_i^k \ge \omega_k r_i^k , \tag{5.35} \]

which contradicts this. Therefore we must have that $r_i \ge R r$.

Now our first real claim is that under the assumptions of the Lemma there exist ball centers $y_0,\dots,y_k \in U_+^a \cap A_{r_i/5,\,10 r_i}(x_i)$ which are $\tau r_i$-independent in the sense of Definition 2.3. Indeed, assume this is not the case; then we can find a $(k-1)$-plane $V^{k-1}$ such that

\[ \{y_i\}_{i \in J^a} \cap A_{r_i/5,\,10 r_i}(x_i) \subseteq B_{\tau r_i}\big(V^{k-1}\big) . \tag{5.36} \]

In particular, by covering $B_{\tau r_i}(V) \cap B_{10 r_i}(x_i)$ by $C(n)\tau^{1-k} \le 10^n \tau^{1-k}$ balls of radius $\tau r_i$, and using our assumption that $\mu^a(B_{\tau r_i}(y)) \le D \tau^k r_i^k$, we are then able to conclude the estimate

\[ \mu^a\big(A_{r_i/5,\,10 r_i}(x_i)\big) \le \mu^a\big(B_{\tau r_i}(V) \cap B_{10 r_i}(x_i)\big) \le 10^n D \tau r_i^k < \omega_k r_i^k . \tag{5.37} \]

On the other hand, our volume assumption guarantees that

\[ \mu^a\big(A_{r_i/5,\,10 r_i}(x_i)\big) = \mu^a\big(B_{10 r_i}(x_i)\big) - \mu^a\big(B_{r_i/5}(x_i)\big) \ge 2\omega_k r_i^k - \omega_k r_i^k \ge \omega_k r_i^k , \tag{5.38} \]

which leads to a contradiction. Therefore there must exist $k+1$ ball centers $y_0,\dots,y_k \in A_{r_i/5,\,10 r_i}(x_i)$ which are $\tau r_i$-independent points, as claimed.

Let us now remark on the main consequences of the existence of these $k+1$ points. Note first that for each $y_j$ we have that $\theta_{R^{-1} r_i}(y_j) > E - \eta''$, since by the construction of $U_+^a$ we have that $r_j \le R^{-2} r_i$, and therefore we can apply Lemma 5.2. Thus we have $k+1$ points in $B_{10 r_i}(x_i)$ which are $\tau r_i$-independent, and whose mass densities are $\eta''$-pinched. To exploit this, let us first apply Lemma 5.3 in order to conclude that for each $x \in B_\delta(V)$ we have

\[ \theta_{r_i/10}(x) \ge \theta_{r_i/10}(y_j) - |\theta_{r_i/10}(x) - \theta_{r_i/10}(y_j)| \ge E - \eta'' - 10^{-1}\eta' > E - \eta' . \tag{5.39} \]

In particular, if we assume that $x_i$ is such that $\theta_{r_i/10}(x_i) < E - \eta'$, then we must have that $r_i^{-1} d(x_i, V) \ge \delta = \delta(n,\Lambda,\eta')$.

Therefore, let us now assume $\theta_{r_i/10}(x_i) < E - \eta'$, and thus to prove the Lemma we wish to find a contradiction. To accomplish this notice that we have our $k+1$ points $y_0,\dots,y_k \in B_{10 r_i}$ which are $\tau r_i$-independent and for which $|\theta_{20 r_i}(y_j) - \theta_{R^{-1} r_i}(y_j)| < \eta''$. Therefore by applying the cone splitting of Theorem 2.4 we have for each $\epsilon' > 0$ that if $\eta'' \le \eta''(n,\Lambda,\epsilon')$ then $B_{10 r_i}(x_i)$ is $(k,\epsilon')$-symmetric with respect to the $k$-plane $V^k$. However, since $d(x_i, V) > \delta r_i$, we have by Theorem 2.5 that if $\epsilon' \le \epsilon'(n,\Lambda,\epsilon)$ then there exists some $\tau' = \tau'(n,\Lambda,\epsilon)$ such that $B_{\tau' r_i}(x_i)$ is $(k+1,\epsilon)$-symmetric. However, we can assume after a further increase that $R = R(n,\Lambda,D,\epsilon) > 4\tau'^{-1}$, and thus we have that $\tau' r_i > 4R^{-1} r_i > 4r$. This contradicts that $x_i \in S^k_{\epsilon,r}$, and thus we have contradicted that $\theta_{r_i/10}(x_i) < E - \eta'$, which proves the Lemma. □

5.4. Estimating $U_+$ in Lemma 5.1. Now we proceed to finish the proof of Lemma 5.1 by estimating the set $U_+$. First let us pick $D' = D'(n) \equiv 2^{16n} D(n)$, where $D(n)$ is from Theorem 3.3. Then for some $\eta'$ fixed we can choose $R$ as in Lemma 5.5. It is then enough to estimate each of the sets $U_+^a$, as there are at most $N = N(n,\Lambda,\epsilon,\eta,\eta')$ pieces to the decomposition. Thus we will fix a set $U_+^a$ and focus on estimating the content of this set. Let us begin by observing that if $r > 0$ then we have the lower bound $r_i \ge r$. Otherwise, let us fix any $r > 0$ and restrict ourselves to the collection of balls in $U_+^a$ with $r_i \ge r$. The estimates we will prove in the end will be independent of $r$, and thus by letting $r \to 0$ we will obtain estimates on all of $U_+^a$.

Now let us make the precise statement we will prove in this subsection. Namely, consider any of the ball centers $\{x_i\}_{i \in J^a}$ and any radius $2^{-3} r_i \le r_\alpha \le 2$, where $r_\alpha = 2^{-\alpha}$. Then we will show that

\[ \mu^a\big(B_{r_\alpha}(x_i)\big) \le 2^{4n} D(n) r_\alpha^k . \tag{5.40} \]

Let us observe that once we have proved (5.40) then we have finished the proof of the Covering Lemma, as we will then have the estimate

\[ \omega_k \sum r_i^k = \mu^a\big(B_2(x_i)\big) \le C(n) . \tag{5.41} \]

We prove (5.40) inductively on $\alpha$. To begin notice that for each $x_i$, if $\beta$ is the largest integer such that $2^{-3} r_i \le r_\beta$, then the statement clearly holds, by the definition of the measure $\mu^a$. In fact, we can go further than this. For each $x_i$, let $r_i' \in [r_i/5, 10 r_i]$ be the largest radius such that for all $r_i \le s \le r_i'$ we have

\[ \mu^a\big(B_s(x_i)\big) \le 2\omega_k s^k . \tag{5.42} \]

In particular, we certainly have the much weaker estimate $\mu^a\big(B_s(x_i)\big) \le 2^{4n} D(n) s^k$, and hence (5.40) is also satisfied for all $2^{-3} r_i \le r_\beta \le r_i'$. Notice that we then also have the estimate

\[ \omega_k r_i^k \le \mu^a\big(B_{r_i/8}(x_i)\big) \le \mu^a\big(B_{r_i'}(x_i)\big) \le 2\omega_k r_i'^{\,k} . \tag{5.43} \]

Now let us focus on proving the inductive step of (5.40). Namely, assume $\alpha$ is such that for all $x_i$ with $2^{-3} r_i \le r_{\alpha+1} < 2$ we have that (5.40) holds. Then we want to prove that the same estimate holds for $r_\alpha$. Let us begin by seeing that a weak version of (5.40) holds. Namely, for any index $i \in J^a$ and any radius $r_\alpha \le s \le 8 r_\alpha$, by covering $B_s(x_i)$ by at most $2^{8n}$ balls $\{B_{r_{\alpha+1}}(y_j)\}$ of radius $r_{\alpha+1}$ we have the weak estimate

\[ \mu^a\big(B_s(x_i)\big) \le \sum \mu^a\big(B_{r_{\alpha+1}}(y_j)\big) \le D'(n) s^k , \tag{5.44} \]

where of course $D'(n) \gg 2^{4n} D(n)$.

To improve on this, let us fix an $i \in J^a$ and the relative ball center $x_i \in U_+^a$ with $2^{-3} r_i \le r_\alpha$. Now let $\{x_j\}_{j \in J} = \{x_i\}_{i \in I} \cap B_{r_\alpha}(x_i)$ be the collection of ball centers in $B_{r_\alpha}(x_i)$. Notice first that if $r_j' > 2r_\alpha$ for any of the ball centers $\{x_j\}$, then we can estimate

\[ \mu^a\big(B_{r_\alpha}(x_i)\big) \le \mu^a\big(B_{2r_\alpha}(x_j)\big) \le 2\omega_k (2r_\alpha)^k \le 2^{4n} D(n) r_\alpha^k , \tag{5.45} \]

so that we may fairly assume $r_j' \le 2r_\alpha$ for every $j \in J$. Now for each ball $B_{r_j'}(x_j)$ let us define a new ball $B_{\bar r_j}(y_j)$ which is roughly equivalent, but will have some additional useful properties needed to apply the discrete Reifenberg. Namely, for a given ball $B_{r_j'}(x_j)$, let us consider the two options $r_j' < 10 r_j$ or $r_j' = 10 r_j$. If $r_j' < 10 r_j$, then we let $y_j \equiv x_j$ with $\bar r_j \equiv r_j$. In this case we must have that $\mu^a(B_{10 r_j}(x_j)) > 2\omega_k r_j^k$, and thus we can apply Lemma 5.5 in order to conclude that $\theta_{\bar r_j/10}(y_j) \ge E - \eta'$. In the case when $r_j' = 10 r_j$ is maximal, let $y_j \in B_{r_j}(x_j)$ be a point such that $\theta_{r_j}(y_j) = E - \eta$ (such a point exists by the definition of $r_j$), and let $\bar r_j \equiv 9 r_j$. In either case we then have the estimates

\begin{align*}
& \theta_{\bar r_j/8}(y_j) \ge E - \eta' , \\
& \omega_k 8^{-k} \bar r_j^{\,k} \le \mu^a\big(B_{\bar r_j/8}(y_j)\big) \le \mu^a\big(B_{\bar r_j}(y_j)\big) \le 4\omega_k \bar r_j^{\,k} , \\
& B_{r_j}(x_j) \subseteq B_{\bar r_j}(y_j) . \tag{5.46}
\end{align*}

Now let us consider the covering $U_+^a \cap B_{r_\alpha}(x_i) \subseteq \bigcup B_{\bar r_j}(y_j)$, and choose from it a Vitali subcovering

\[ U_+^a \cap B_{r_\alpha}(x_i) \subseteq \bigcup B_{\bar r_j}(y_j) , \tag{5.47} \]

such that $\{B_{\bar r_j/5}(y_j)\}$ are disjoint, where we are now being loose on notation and referring to $\{y_j\}_{j \in \bar J}$ as the ball centers from this subcovering. Let us now consider the measure

\[ \mu' \equiv \sum_{j \in \bar J} \omega_k \Big( \frac{\bar r_j}{8} \Big)^k \delta_{y_j} . \tag{5.48} \]

That is, we have associated to the disjoint collection $\{B_{\bar r_j/8}(y_j)\}$ the natural measure. Our goal is to prove that

\[ \mu'\big(B_{r_\alpha}(x_i)\big) \le D(n) r_\alpha^k . \tag{5.49} \]

Let us observe that if we prove (5.49) then we are done. Indeed, using (5.46) we can estimate

\[ \mu^a\big(B_{r_\alpha}(x_i)\big) \le \sum \mu^a\big(B_{\bar r_j}(y_j)\big) \le 4\omega_k \sum \bar r_j^{\,k} = 4 \cdot 8^k \mu'\big(B_{r_\alpha}(x_i)\big) \le 2^{4n} D(n) r_\alpha^k , \tag{5.50} \]

which would finish the proof of (5.40) and therefore the Lemma.
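The numerical factor in the last inequality of (5.50) can be checked directly. The short computation below is included only as a sanity check and assumes (5.49) has been established.

```latex
% Why 4 * 8^k * D(n) <= 2^{4n} D(n) in (5.50).
\begin{align*}
4\cdot 8^{k}\,\mu'\big(B_{r_\alpha}(x_i)\big)
  \le 4\cdot 8^{k}\, D(n)\, r_\alpha^{k}
  = 2^{3k+2}\, D(n)\, r_\alpha^{k}
  \le 2^{4n}\, D(n)\, r_\alpha^{k}
  \qquad \text{whenever } 3k+2 \le 4n .
\end{align*}
% The condition 3k + 2 <= 4n holds for every k <= n - 1, and also for k = n once n >= 2.
```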

Thus let us concentrate on proving (5.49). We will want to apply the discrete Reifenberg in this case to the measure $\mu'$. Let us begin by proving a weak version of (5.49). Namely, for any ball center $y_j$ from our subcovering and radius $\bar r_j < s \le 4r_\alpha$ let us consider the set $\{z_\ell\} = \{y_s\}_{s \in \bar J} \cap B_s(y_j)$ of ball centers inside $B_s(y_j)$. Since the balls $\{B_{\bar r_\ell/4}(z_\ell)\}$ are disjoint we have that $\bar r_\ell \le 8s$. Using this, (5.43), and (5.46) we can estimate

\[ \mu'\big(B_s(y_j)\big) = \sum_{z_\ell \in B_s(y_j)} \omega_k 8^{-k} \bar r_\ell^{\,k} \le C(n) \sum_{z_\ell \in B_s(y_j)} \mu^a\big(B_{\bar r_\ell/8}(z_\ell)\big) \le C(n) \mu^a\big(B_{2s}(y_j)\big) \le C(n) s^k , \tag{5.51} \]

where of course $C(n) \gg 2^{4n} D(n)$.

Now let us finish the proof of (5.49). Thus let us pick a ball center $y_j \in B_{r_\alpha}(x_i)$ and a radius $s < 4r_\alpha$. If $\mu'(B_s(y)) \le \epsilon_k s^k$ then $D_{\mu'}(y_j, s) \equiv 0$ by definition, and if $s \le \bar r_j/4$ then $D_{\mu'}(y, s) \equiv 0$, since the support of $\mu'$ in $B_{\bar r_j/4}(y_j)$ contains at most one point and thus is precisely contained in a $k$-dimensional subspace. In the case when $s > \bar r_j/4$ and $\mu'(B_s(y)) > \epsilon_k s^k$ we then have by Theorem 4.1 that

\[ D_{\mu'}(y_j, s) \le C(n,\Lambda,\epsilon) s^{-k} \int_{B_s(y)} W_s(z) \, d\mu'(z) . \tag{5.52} \]

By applying this to all $r \le t \le s$ we can estimate

\begin{align*}
s^{-k} \int_{B_s(x)} D_{\mu'}(y,t) \, d\mu'(y)
 &\le C s^{-k} t^{-k} \int_{B_s(x)} \int_{B_t(y)} W_t(z) \, d\mu'(z) \, d\mu'(y) \\
 &= C s^{-k} t^{-k} \int_{B_s(x)} \mu'(B_t(y)) W_t(y) \, d\mu'(y) \\
 &\le C s^{-k} \int_{B_s(x)} W_t(y) \, d\mu'(y) , \tag{5.53}
\end{align*}

where we have used our estimate on $\mu'(B_t(y))$ from (5.51) in the last line. Let us now consider the case when $t = r_\beta \le s \le 2r_\alpha$. Then we can sum to obtain:

\begin{align*}
\sum_{r_\beta \le s} s^{-k} \int_{B_s(x)} D_{\mu'}(y, r_\beta) \, d\mu'(y)
 &\le C \sum_{\bar r_y \le r_\beta \le s} s^{-k} \int_{B_s(x)} W_{r_\beta}(y) \, d\mu'(y) \\
 &= C s^{-k} \int_{B_s(x)} \sum_{\bar r_y \le r_\beta \le s} W_{r_\beta}(y) \, d\mu'(y) \\
 &\le C s^{-k} \int_{B_s(x)} \big| \theta_{4s}(y) - \theta_{\bar r_y}(y) \big| \, d\mu'(y) \le C(n,\Lambda,\epsilon) \eta' , \tag{5.54}
\end{align*}

where we are using (5.46) in the last line in order to see that $\theta_s(y) - \theta_{\bar r_y}(y) \le \eta'$. Now let us choose $\eta' \le \eta'(n,\Lambda,\epsilon)$ such that

\[ \sum_{r_\beta \le s} s^{-k} \int_{B_s(x)} D_{\mu'}(y, r_\beta) \, d\mu'(y) \le \delta^2 , \tag{5.55} \]

where $\delta$ is chosen from the discrete rectifiable-Reifenberg of Theorem 3.3. Since the estimate (5.55) holds for all $B_s \subseteq B_{2r_\alpha}(x)$, we can therefore apply Theorem 3.3 to conclude the estimate

\[ \mu'\big(B_{r_\alpha}(x_i)\big) \le D(n) r_\alpha^k . \tag{5.56} \]

This finishes the proof of (5.49), and hence the proof of Lemma 5.1. □

6. PROOF OF MAIN THEOREMS FOR INTEGRAL VARIFOLDS WITH BOUNDED MEAN CURVATURE

In this section we prove the main theorems of the paper concerning integral varifolds with bounded mean curvature. With the tools of Sections 3, 4, and 5 developed, we will at this stage mainly be applying the covering of Lemma 5.1 iteratively to arrive at the estimates. When this is done carefully, we can combine the covering lemma with the cone splitting in order to check that for $k$-a.e. $x \in S^k_\epsilon$ there exists a unique $k$-dimensional subspace $V^k \subseteq T_xM$ such that every tangent cone of $I$ at $x$ is $k$-symmetric with respect to $V$.

For the proofs of the Theorems of this section let us make the following remark. For any $\delta > 0$ we can cover $B_1(p)$ by a collection of balls
\[
B_1(p) \subseteq \bigcup_{1}^{N} B_{M^{-1}\delta}(p_i) , \tag{6.1}
\]
where $N \le C(n) M^n \delta^{-n}$. Thus if $\delta = \delta(n, \Lambda, \epsilon)$, $M = K + H$, and we can analyze each such ball, we can conclude from this estimates on all of $B_1(p)$. In particular, by rescaling $B_{M^{-1}\delta}(p_i) \to B_1(p_i)$, we see that we can assume in our analysis that $K + H < \delta$ without any loss of generality. We shall do this throughout this section.
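The bound on $N$ in (6.1) is a standard volume-packing count. A minimal sketch in the Euclidean model case, assuming $\rho \equiv M^{-1}\delta \le 1$ (the name $\rho$ and the explicit constant $3^n$ are ours; on a manifold the constant would also depend on the geometry bounds):

```latex
% Model case: B_1(p) \subset \mathbb{R}^n and \rho \equiv M^{-1}\delta \le 1 (assumption).
% Let \{p_i\}_{i=1}^N \subseteq B_1(p) be a maximal \rho-separated set.
% Maximality gives the covering (6.1); separation makes the half-radius balls disjoint.
\begin{align*}
B_{\rho/2}(p_i) \cap B_{\rho/2}(p_j) &= \emptyset \quad (i \ne j), &
B_{\rho/2}(p_i) &\subseteq B_{3/2}(p), \\
N\, \omega_n (\rho/2)^n &\le \omega_n (3/2)^n
&&\Longrightarrow\quad N \le 3^n \rho^{-n} = 3^n M^n \delta^{-n} .
\end{align*}
```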

6.1. Proof of Theorem 1.3. Let $I^m \subseteq B_2$ be an integral varifold satisfying the bounds (1.13), the mean curvature bound (1.14), and the mass bound $I(B_2(p)) \le \Lambda$. With $\epsilon, r > 0$ fixed, let us choose $\eta(n, \Lambda, \epsilon) > 0$ and $\delta(n, \Lambda, \epsilon) > 0$ as in Lemma 5.1. By the remarks around (6.1) we see that we can assume that $K + H < \delta$, which we will do for the remainder of the proof.

Now let us begin by first considering an arbitrary ball $B_s(x)$ with $x \in B_1(p)$ and $r < s \le 1$, potentially quite small. We will use Lemma 5.1 in order to build a special covering of $B_s(x)$. Let us define
\[
E_{x,s} \equiv \sup_{y \in B_s(x)} \theta_s(y) , \tag{6.2}
\]
and thus if we apply Lemma 5.1 with $\eta(n, K_N, \Lambda, \epsilon)$ fixed to $B_s(x)$, then we can build a covering
\[
S^k_{\epsilon,r} \cap B_s(x) \subseteq U_r \cup U_+ = \bigcup B_r(x_i^r) \cup \bigcup B_{r_i}(x_i) , \tag{6.3}
\]
with $r_i > r$. Let us recall that this covering satisfies the following:
(a) $r^{k-n}\, \mathrm{Vol}\big(B_r(U_r)\big) + \omega_k \sum r_i^k \le C(n)\, s^k$.
(b) $\sup_{y \in B_{r_i}(x_i)} \theta_{r_i}(y) \le E_{x,s} - \eta$.
Now that we have built our required covering on an arbitrary ball $B_s(x)$, let us use this iteratively to build our final covering of $S^k_{\epsilon,r}(I)$. First, let us apply it to $B_1(p)$ in order to construct a covering
\[
S^k_{\epsilon,r}(I) \subseteq U^1_r \cup U^1_+ = \bigcup B_r\big(x_i^{r,1}\big) \cup \bigcup B_{r_i^1}\big(x_i^1\big) , \tag{6.4}
\]
such that
\[
r^{k-n}\, \mathrm{Vol}\big(B_r(U^1_r)\big) + \omega_k \sum \big(r_i^1\big)^k \le C(n, K_N, \Lambda, \epsilon) , \tag{6.5}
\]
and with
\[
\sup_{y \in B_{r_i^1}(x_i^1)} \theta_{r_i^1}(y) \le \Lambda - \eta . \tag{6.6}
\]

Now let us tackle the following claim, which is our main iterative step in the proof:

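Claim: For each $\ell$ there exists a constant $C_\ell(\ell, n, K, H, \Lambda, \epsilon)$ and a covering
\[
S^k_{\epsilon,r}(I) \subseteq U^\ell_r \cup U^\ell_+ = \bigcup B_r\big(x_i^{r,\ell}\big) \cup \bigcup B_{r_i^\ell}\big(x_i^\ell\big) , \tag{6.7}
\]
with $r_i^\ell > r$, such that the following two properties hold:
\[
\begin{aligned}
r^{k-n}\, \mathrm{Vol}\big(B_r(U^\ell_r)\big) + \omega_k \sum \big(r_i^\ell\big)^k &\le C_\ell(\ell, n, K, H, \Lambda, \epsilon) , \\
\sup_{y \in B_{r_i^\ell}(x_i^\ell)} \theta_{r_i^\ell}(y) &\le \Lambda - \ell \cdot \eta .
\end{aligned} \tag{6.8}
\]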

To prove the claim let us first observe that we have shown this holds for $\ell = 1$. Thus let us assume we have proved the claim for some $\ell$, and determine from this how to build the covering for $\ell + 1$ with some constant $C_{\ell+1}(\ell + 1, n, K, H, \Lambda, \epsilon)$, which we will estimate explicitly.

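Thus with our covering determined at stage $\ell$, let us apply the covering of (6.3) to each ball $\big\{B_{r_i^\ell}\big(x_i^\ell\big)\big\}$ in order to obtain a covering
\[
S^k_{\epsilon,r} \cap B_{r_i^\ell}\big(x_i^\ell\big) \subseteq U_{i,r} \cup U_{i,+} = \bigcup_j B_r\big(x_{i,j}^r\big) \cup \bigcup_j B_{r_{i,j}}\big(x_{i,j}\big) , \tag{6.9}
\]
such that
\[
\begin{aligned}
r^{k-n}\, \mathrm{Vol}\big(B_r(U_{i,r})\big) + \omega_k \sum_j (r_{i,j})^k &\le C(n, K, H, \Lambda, \epsilon)\, \big(r_i^\ell\big)^k , \\
\sup_{y \in B_{r_{i,j}}(x_{i,j})} \theta_{r_{i,j}}(y) &\le \Lambda - (\ell + 1)\eta .
\end{aligned} \tag{6.10}
\]
Let us consider the sets
\[
\begin{aligned}
U^{\ell+1}_r &\equiv U^\ell_r \cup \bigcup_i U_{i,r} , \\
U^{\ell+1}_+ &\equiv \bigcup_{i,j} B_{r_{i,j}}(x_{i,j}) .
\end{aligned} \tag{6.11}
\]
Notice that the second property of (6.8) holds for $\ell + 1$ by the construction, hence we are left analyzing the volume estimate of the first property. Indeed, for this we combine our inductive hypothesis (6.8) for $U^\ell$ and (6.10) in order to estimate
\[
\begin{aligned}
r^{k-n}\, \mathrm{Vol}\big(B_r(U^{\ell+1}_r)\big) + \omega_k \sum_{i,j} (r_{i,j})^k &\le r^{k-n}\, \mathrm{Vol}\big(B_r(U^\ell_r)\big) + \sum_i \Big( r^{k-n}\, \mathrm{Vol}\big(B_r(U_{i,r})\big) + \omega_k \sum_j (r_{i,j})^k \Big) \\
&\le C_\ell + C \sum_i \big(r_i^\ell\big)^k \\
&\le C(n, K, H, \Lambda, \epsilon) \cdot C_\ell(\ell, n, K, H, \Lambda, \epsilon) \\
&\equiv C_{\ell+1} .
\end{aligned} \tag{6.12}
\]
Hence, we have proved that if the claim holds for some $\ell$ then the claim holds for $\ell + 1$. Since we have already shown the claim holds for $\ell = 1$, we have therefore proved the claim for all $\ell$.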
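The passage from the second to the third line of (6.12) is not spelled out. A minimal sketch of the missing step, using only the inductive bound (6.8) and writing $C$ for the constant $C(n, K, H, \Lambda, \epsilon)$ of (6.10); the explicit closed form below is our own bookkeeping, not a statement from the text:

```latex
% The inductive hypothesis (6.8) controls the radii at stage \ell; inserting this
% into the second line of (6.12) gives one admissible choice of C_{\ell+1}.
\begin{align*}
\omega_k \sum_i \big(r_i^\ell\big)^k \le C_\ell
  \quad&\Longrightarrow\quad \sum_i \big(r_i^\ell\big)^k \le \omega_k^{-1} C_\ell , \\
C_\ell + C \sum_i \big(r_i^\ell\big)^k &\le \big(1 + \omega_k^{-1} C\big)\, C_\ell \equiv C_{\ell+1} , \\
C_\ell &\le \big(1 + \omega_k^{-1} C\big)^{\ell - 1} C_1 .
\end{align*}
```

Since the induction is only run for finitely many steps, with the number of steps controlled by $\eta$ and $\Lambda$, the resulting constant still depends only on $n, K, H, \Lambda, \epsilon$.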

Now we can finish the proof. Indeed, let us take $\ell = \lceil \eta^{-1}\Lambda \rceil = \ell(\eta, \Lambda)$. Then if we apply the Claim to such an $\ell$, we must have by the second property of (6.8) that
\[
U^\ell_+ \equiv \emptyset , \tag{6.13}
\]
and therefore we have a covering
\[
S^k_{\epsilon,r} \subseteq U^\ell_r = \bigcup_i B_r(x_i) . \tag{6.14}
\]
But in this case we have by (6.8) that
\[
\mathrm{Vol}\big(B_r\big(S^k_{\epsilon,r}(I)\big)\big) \le \mathrm{Vol}\big(B_r\big(U^\ell_r\big)\big) \le C(n, K, H, \Lambda, \epsilon)\, r^{n-k} , \tag{6.15}
\]
which proves the Theorem. □
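The reason this choice of $\ell$ forces (6.13) can be written out as follows. This is a sketch under one assumption not restated in this section: every ball of $U^\ell_+$ contains a point $y$ of the support of $I$, where the density ratio is strictly positive.

```latex
% Choice of \ell, then the second property of (6.8) applied to a hypothetical
% ball B_{r_i^\ell}(x_i^\ell) of U^\ell_+ containing a support point y.
\begin{align*}
\ell = \lceil \eta^{-1}\Lambda \rceil
  \quad&\Longrightarrow\quad \ell\,\eta \ge \Lambda
  \quad\Longrightarrow\quad \Lambda - \ell\cdot\eta \le 0 , \\
0 < \theta_{r_i^\ell}(y)
  &\le \sup_{z \in B_{r_i^\ell}(x_i^\ell)} \theta_{r_i^\ell}(z)
  \le \Lambda - \ell\cdot\eta \le 0 ,
\end{align*}
% which is a contradiction, so U^\ell_+ contains no balls.
```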

6.2. Proof of Theorem 1.4. There are several pieces to Theorem 1.4. To begin with, the volume estimate follows easily now that Theorem 1.3 has been proved. That is, for each r > 0 we have that

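\[
S^k_\epsilon(I) \subseteq S^k_{\epsilon,r}(I) , \tag{6.16}
\]
and therefore we have the volume estimate
\[
\mathrm{Vol}\big(B_r\big(S^k_\epsilon(I)\big) \cap B_1(p)\big) \le \mathrm{Vol}\big(B_r\big(S^k_{\epsilon,r}(I)\big) \cap B_1(p)\big) \le C(n, K, H, \Lambda, \epsilon)\, r^{n-k} . \tag{6.17}
\]
In particular, this implies the much weaker Hausdorff measure estimate
\[
\lambda^k\big(S^k_\epsilon(I) \cap B_1(p)\big) \le C(n, K, H, \Lambda, \epsilon) , \tag{6.18}
\]
which proves the first part of the Theorem.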
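The step from the Minkowski-type bound (6.17) to the Hausdorff bound (6.18) is a covering count. A minimal sketch in the Euclidean model case, where $S \equiv S^k_\epsilon(I) \cap B_1(p)$, $0 < r \le 1$, $\lambda^k_{2r}$ denotes the Hausdorff premeasure at scale $2r$, and $c(n) > 0$ is a hypothetical name for a lower bound on $r^{-n}\,\mathrm{Vol}(B_{r/2}(x) \cap B_1(p))$ over $x \in B_1(p)$:

```latex
% \{x_i\}_{i=1}^N \subseteq S is a maximal r-separated set: the balls B_{r/2}(x_i)
% are disjoint and lie in B_r(S^k_\epsilon(I)), while the balls B_r(x_i) cover S.
\begin{align*}
N\, c(n)\, r^n
  &\le \sum_{i=1}^N \mathrm{Vol}\big(B_{r/2}(x_i) \cap B_1(p)\big)
   \le \mathrm{Vol}\big(B_r\big(S^k_\epsilon(I)\big) \cap B_1(p)\big)
   \le C\, r^{n-k} , \\
N &\le c(n)^{-1} C\, r^{-k} , \\
\lambda^k_{2r}(S) &\le \sum_{i=1}^N \omega_k r^k = N\, \omega_k r^k \le \omega_k\, c(n)^{-1} C .
\end{align*}
% Letting r \to 0 gives \lambda^k(S) \le \omega_k\, c(n)^{-1} C.
```

The converse fails in general: a countable dense set has zero Hausdorff measure but a large tubular neighborhood, which is why (6.18) is the weaker statement.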

Let us now focus on the rectifiability of $S^k_\epsilon$. We consider the following claim, which is the $r = 0$ version of the main Claim of Theorem 1.3. We will be applying Lemma 5.1, which requires $K + H < \delta$. As in the proof of Theorem 1.3 we can just assume this without any loss, as we can cover $B_1(p)$ by a controlled number of balls of radius $M^{-1}\delta$, so that after rescaling we can analyze each of these balls with the desired curvature assumption. Thus let us consider the following:

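Claim: If $K + H < \delta$, then for each $\ell$ there exists a covering $S^k_\epsilon(I) \subseteq U^\ell_0 \cup U^\ell_+ = U^\ell_0 \cup \bigcup B_{r_i^\ell}\big(x_i^\ell\big)$ such that
(1) $\lambda^k\big(U^\ell_0\big) + \omega_k \sum \big(r_i^\ell\big)^k \le C_\ell(\ell, n, K, H, \Lambda, \epsilon)$.
(2) $U^\ell_0$ is $k$-rectifiable.
(3) $\sup_{y \in B_{r_i^\ell}(x_i^\ell)} \theta_{r_i^\ell}(y) \le \Lambda - \ell \cdot \eta$.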

The proof of the Claim follows essentially the same steps as those for the main Claim of Theorem 1.3. For the base step $\ell = 0$, we consider the decomposition $S^k_\epsilon \subseteq U^0_0 \cup U^0_+$ where $U^0_0 = \emptyset$ and $U^0_+ = B_1(p)$. Now let us assume we have proved the claim for some $\ell$, then we wish to prove the claim for $\ell + 1$. Thus, let us consider the set $U^\ell_+$ from the previous covering step given by
\[
U^\ell_+ = \bigcup B_{r_i^\ell}\big(x_i^\ell\big) . \tag{6.19}
\]
Now let us apply Lemma 5.1 to each of the balls $B_{r_i^\ell}\big(x_i^\ell\big)$ in order to write
\[
S^k_\epsilon \cap B_{r_i^\ell}\big(x_i^\ell\big) \subseteq U_{i,0} \cup U_{i,+} = U_{i,0} \cup \bigcup_j B_{r_{i,j}}(x_{i,j}) , \tag{6.20}
\]
with the following properties:
(a) $\lambda^k(U_{i,0}) + \omega_k \sum_j r_{i,j}^k \le C(n, K, H, \Lambda, \epsilon, p)\, \big(r_i^\ell\big)^k$,
(b) $\sup_{y \in B_{r_{i,j}}(x_{i,j})} \theta_{r_{i,j}}(y) \le \Lambda - (\ell + 1)\eta$,
(c) $U_{i,0}$ is $k$-rectifiable.

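Now let us define the sets
\[
\begin{aligned}
U^{\ell+1}_0 &= \bigcup_i U_{i,0} \cup U^\ell_0 , \\
U^{\ell+1}_+ &= \bigcup_{i,j} B_{r_{i,j}}(x_{i,j}) .
\end{aligned} \tag{6.21}
\]
Conditions (2) and (3) from the Claim are clearly satisfied. We need only check condition (1). Using (a) and the inductive hypothesis we can estimate that
\[
\begin{aligned}
\lambda^k\big(U^{\ell+1}_0\big) + \omega_k \sum_{i,j} r_{i,j}^k &\le \lambda^k\big(U^\ell_0\big) + \sum_i \Big( \lambda^k(U_{i,0}) + \omega_k \sum_j r_{i,j}^k \Big) \\
&\le C_\ell + C(n, K, H, \Lambda, \epsilon) \sum_i \big(r_i^\ell\big)^k \\
&\le C(n, K, H, \Lambda, \epsilon) \cdot C_\ell \\
&\equiv C_{\ell+1} .
\end{aligned} \tag{6.22}
\]
Thus, we have proved the inductive part of the claim, and thus the claim itself.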

Let us now finish the proof that $S^k_\epsilon(I)$ is rectifiable. So let us take $\ell = \lceil \eta^{-1}\Lambda \rceil = \ell(\eta, \Lambda)$. Then if we apply the above Claim to $\ell$, by the third property of the Claim we must have that
\[
U^\ell_+ \equiv \emptyset , \tag{6.23}
\]
and therefore we have the covering
\[
S^k_\epsilon \subseteq U^\ell_0 , \tag{6.24}
\]
where $U^\ell_0$ is $k$-rectifiable with the volume estimate $\lambda^k\big(U^\ell_0\big) \le C$, which proves that $S^k_\epsilon$ is itself rectifiable.

Finally, we prove that for $k$-a.e. $x \in S^k_\epsilon$ there exists a $k$-dimensional subspace $V_x \subseteq T_xM$ such that every tangent map at $x$ is $k$-symmetric with respect to $V_x$. To see this we proceed as follows. For each $\eta > 0$ let us consider the finite decomposition
\[
S^k_\epsilon = \bigcup_{\alpha=0}^{\lceil \eta^{-1}\Lambda \rceil} W^{k,\alpha}_{\epsilon,\eta} , \tag{6.25}
\]
where by definition we have
\[
W^{k,\alpha}_{\epsilon,\eta} \equiv \big\{ x \in S^k_\epsilon : \theta_0(x) \in \big[\alpha\eta, (\alpha + 1)\eta\big) \big\} . \tag{6.26}
\]
Note then that each $W^{k,\alpha}_{\epsilon,\eta}$ is $k$-rectifiable, and thus there exists a full measure subset $\tilde W^{k,\alpha}_{\epsilon,\eta} \subseteq W^{k,\alpha}_{\epsilon,\eta}$ such that for each $x \in \tilde W^{k,\alpha}_{\epsilon,\eta}$ the tangent cone of $W^{k,\alpha}_{\epsilon,\eta}$ exists and is a subspace $V_x \subseteq T_xM$.

Now let us consider such an $x \in \tilde W^{k,\alpha}_{\epsilon,\eta}$, and let $V^k_x$ be the tangent cone of $W^{k,\alpha}_{\epsilon,\eta}$ at $x$. For all $r \ll 1$ sufficiently small we of course have $|\theta_r(x) - \theta_0(x)| < \eta$. Thus, by the monotonicity and continuity of $\theta$ we have for all $r \ll 1$ sufficiently small and all $y \in W^{k,\alpha}_{\epsilon,\eta} \cap B_r(x)$ that $|\theta_r(y) - \theta_0(y)| < 2\eta$. In particular, by Theorem 2.2 we have for each $y \in W^{k,\alpha}_{\epsilon,\eta} \cap B_r(x)$ that $B_r(y)$ is $(0, \delta_\eta)$-symmetric, with $\delta_\eta \to 0$ as $\eta \to 0$. Now let us recall the cone splitting of Theorem 2.4. Since the tangent cone at $x$ is $V^k_x$, for all $r$ sufficiently small we can find $k + 1$ points $x_0, \ldots, x_k \in B_r(x) \cap W^{k,\alpha}_{\epsilon,\eta}$ which are $10^{-1}r$-independent, see Definition 2.3, and for which the balls $B_{2r}(x_j)$ are $(0, \delta_\eta)$-symmetric. Thus, by the cone splitting of Theorem 2.4 we have that $B_r(x)$ is $(k, \delta_\eta)$-symmetric with respect to $V^k_x$ for all $r$ sufficiently small, where $\delta_\eta \to 0$ as $\eta \to 0$. In particular, every tangent map at $x$ is $(k, \delta_\eta)$-symmetric with respect to $V_x$, where $\delta_\eta \to 0$ as $\eta \to 0$.

Now let us consider the sets
\[
\tilde W^k_{\epsilon,\eta} \equiv \bigcup_\alpha \tilde W^{k,\alpha}_{\epsilon,\eta} . \tag{6.27}
\]
So $\tilde W^k_{\epsilon,\eta} \subseteq S^k_\epsilon$ is a subset of full $k$-dimensional measure, and for every point $x \in \tilde W^k_{\epsilon,\eta}$ we have seen that every tangent map at $x$ is $(k, \delta_\eta)$-symmetric with respect to some $V_x \subseteq T_xM$, where $\delta_\eta \to 0$ as $\eta \to 0$. Finally let us define the set
\[
\tilde S^k_\epsilon \equiv \bigcap_j \tilde W^k_{\epsilon, j^{-1}} . \tag{6.28}
\]
This is a countable intersection of full measure sets, and thus $\tilde S^k_\epsilon \subseteq S^k_\epsilon$ is a full measure subset. Further, we have for each $x \in \tilde S^k_\epsilon$ that every tangent map must be $(k, \delta)$-symmetric with respect to some $V_x$, for all $\delta > 0$. In particular, every tangent map at $x$ must be $(k, 0) = k$-symmetric with respect to some $V_x$. This finishes the proof of the Theorem. □
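That a countable intersection of full measure subsets again has full measure is just countable subadditivity of $\lambda^k$. For the sets of (6.28) this reads as follows (a short sketch, using only that each $\tilde W^k_{\epsilon, j^{-1}}$ has full $\lambda^k$-measure in $S^k_\epsilon$):

```latex
% De Morgan for the complement of the intersection, then countable subadditivity.
\begin{align*}
S^k_\epsilon \setminus \tilde S^k_\epsilon
  &= \bigcup_{j \ge 1} \big( S^k_\epsilon \setminus \tilde W^k_{\epsilon, j^{-1}} \big) , \\
\lambda^k\big( S^k_\epsilon \setminus \tilde S^k_\epsilon \big)
  &\le \sum_{j \ge 1} \lambda^k\big( S^k_\epsilon \setminus \tilde W^k_{\epsilon, j^{-1}} \big) = 0 .
\end{align*}
```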

6.3. Proof of Theorem 1.5. Let us begin by observing the equality

\[
S^k(I) = \bigcup_{\epsilon > 0} S^k_\epsilon(I) = \bigcup_{\beta \in \mathbb{N}} S^k_{2^{-\beta}}(I) . \tag{6.29}
\]

Indeed, if $x \in S^k_\epsilon(I)$, then no tangent map at $x$ can be $(k+1, \epsilon/2)$-symmetric, and in particular $(k+1)$-symmetric, and thus $x \in S^k(I)$. This shows that $S^k_\epsilon(I) \subseteq S^k(I)$. On the other hand, if $x \in S^k(I)$ then we claim there is some $\epsilon > 0$ for which $x \in S^k_\epsilon(I)$. Indeed, if this is not the case, then there exist $\epsilon_i \to 0$ and $r_i > 0$ such that $B_{r_i}(x)$ is $(k+1, \epsilon_i)$-symmetric. If $r_i \to 0$ then we can pass to a subsequence to find a tangent map which is $(k+1)$-symmetric, which is a contradiction. On the other hand, if $r_i > r > 0$ then we see that $B_r(x)$ is itself $(k+1)$-symmetric, and in particular every tangent map at $x$ is $(k+1)$-symmetric. In either case we obtain a contradiction, and thus $x \in S^k_\epsilon(I)$ for some $\epsilon > 0$. Therefore we have proved (6.29).
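The second equality in (6.29), which replaces the union over all $\epsilon > 0$ by a countable union, rests on the monotonicity of the quantitative strata in $\epsilon$. A short sketch, assuming (as is standard for this notion) that a ball which is $(k+1, \epsilon')$-symmetric is also $(k+1, \epsilon)$-symmetric whenever $\epsilon' \le \epsilon$:

```latex
% Monotonicity in \epsilon, then comparison with a dyadic value 2^{-\beta} \le \epsilon.
\begin{align*}
\epsilon' \le \epsilon \quad&\Longrightarrow\quad S^k_\epsilon(I) \subseteq S^k_{\epsilon'}(I) , \\
2^{-\beta} \le \epsilon \quad&\Longrightarrow\quad S^k_\epsilon(I) \subseteq S^k_{2^{-\beta}}(I) , \\
\bigcup_{\epsilon > 0} S^k_\epsilon(I) &= \bigcup_{\beta \in \mathbb{N}} S^k_{2^{-\beta}}(I) .
\end{align*}
```

The inclusion of the dyadic union in the full union is immediate, since the dyadic values form a subfamily of all $\epsilon > 0$.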

As a consequence, $S^k(I)$ is a countable union of $k$-rectifiable sets, and therefore is itself $k$-rectifiable. On the other hand, Theorem 1.4 tells us that for each $\beta \in \mathbb{N}$ there exists a set $\tilde S^k_{2^{-\beta}}(I) \subseteq S^k_{2^{-\beta}}(I)$ of full measure such that
\[
\tilde S^k_{2^{-\beta}} \subseteq \big\{ x : \exists\, V^k \subseteq T_xM \text{ s.t. every tangent map at } x \text{ is } k\text{-symmetric wrt } V \big\} . \tag{6.30}
\]
Hence, let us define
\[
\tilde S^k(I) \equiv \bigcup_\beta \tilde S^k_{2^{-\beta}}(I) . \tag{6.31}
\]
Then we still have that $\tilde S^k(I)$ has $k$-full measure in $S^k(I)$, and if $x \in \tilde S^k(I)$ then for some $\beta$ we have that $x \in \tilde S^k_{2^{-\beta}}$, which proves that there exists a subspace $V \subseteq T_xM$ such that every tangent map at $x$ is $k$-symmetric with respect to $V$. We have finished the proof of the theorem. □

7. PROOF OF MAIN THEOREMS FOR MINIMIZING HYPERSURFACES

In this section we prove the main theorems of the paper concerning minimizing hypersurfaces. That is, we finish the proofs of Theorem 1.6 and Theorem 1.8. In fact, the proofs of these two results are almost identical, though the first relies on Theorem 1.4 and the latter on Theorem 1.3. However, for the sake of completeness we will include the details of both.

7.1. Proof of Theorem 1.6. We wish to understand better the size of the singular set $\mathrm{Sing}\big(I^{n-1}\big)$, where $I^{n-1} \subset M^n$, of a minimizing hypersurface. Let us recall that the $\epsilon$-regularity of Theorem 2.6 tells us that if $I$ is minimizing, then there exists $\epsilon(n, K, H, \Lambda) > 0$ with the property that if $x \in B_1(p)$ and $0 < r < r(n, K, H, \Lambda)$ is such that $B_{2r}(x)$ is $(n - 7, \epsilon)$-symmetric, then $r_I(x) \ge r$. In particular, $x$ is a smooth point, and we have for $\epsilon(n, K, H, \Lambda) > 0$ that
\[
\mathrm{Sing}(I) \cap B_1(p) \subseteq S^{n-8}_\epsilon(I) . \tag{7.1}
\]
Thus by Theorem 1.4 there exists $C(n, K, H, \Lambda) > 0$ such that for each $0 < r < 1$ we have
\[
\mathrm{Vol}\big(B_r\big(\mathrm{Sing}(I)\big) \cap B_1(p)\big) \le \mathrm{Vol}\big(B_r\big(S^{n-8}_\epsilon(I)\big) \cap B_1(p)\big) \le C r^8 . \tag{7.2}
\]
This immediately implies, though it is much stronger than, the Hausdorff measure estimate
\[
\lambda^{n-8}\big(\mathrm{Sing}(I) \cap B_1(p)\big) \le C , \tag{7.3}
\]
which finishes the proof of the first estimate in (1.17). As a simple corollary of this and the uniform bound $\theta(x, r) \le c\Lambda$ for all $x \in B_1(p)$ and $r < 1$, we obtain also the second estimate in (1.17). □

7.2. Proof of Theorem 1.8. We begin again by considering the $\epsilon$-regularity of Theorem 2.6. This tells us that if $I$ is a minimizing hypersurface, then there exists $\epsilon(n, K, H, \Lambda) > 0$ with the property that if $x \in B_1(p)$ and $0 < r < r(n, K, H, \Lambda)$ are such that $B_{2r}(x)$ is $(n - 7, \epsilon)$-symmetric, then $r_I(x) \ge r$. In particular, we have for such $\epsilon, r$ that
\[
\{x \in B_1(p) : r_I(x) < r\} \subseteq S^{n-8}_{\epsilon,r}(I) . \tag{7.4}
\]
Thus by Theorem 1.3 there exists $C(n, K, H, \Lambda) > 0$ such that for each $0 < r < 1$ we have
\[
\mathrm{Vol}\big(B_r\{x \in B_1(p) : r_I(x) < r\}\big) \le \mathrm{Vol}\big(B_r\big(S^{n-8}_{\epsilon,r}(I)\big) \cap B_1(p)\big) \le C r^8 , \tag{7.5}
\]
which proves the second estimate of (1.20). To prove the first we observe that $|A|(x) \le r_I(x)^{-1}$, while the third is a corollary of the bound $\theta(x, r) \le c\Lambda$ for all $x \in B_1(p)$ and $r < 1$. This concludes the proof of the Theorem. □
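To see how a weak $L^7$ bound of the kind announced in the abstract can come out of (7.5), here is a hedged sketch in the Euclidean model case. The names $E_r$, $c(n)$ and $C'$ are ours, the density is assumed to be normalized so that $\theta(x, r) \le c\Lambda$ gives $I(B_r(x)) \le c\Lambda\, r^{n-1}$, and the display is our reading of the argument rather than the statement of (1.20):

```latex
% E_r \equiv \{x \in B_1(p) : r_I(x) < r\}, \{x_i\}_{i=1}^N \subseteq E_r maximal r-separated,
% and c(n) r^n is a lower bound for Vol(B_{r/2}(x_i)) (model-case assumption).
\begin{align*}
N\, c(n)\, r^n &\le \mathrm{Vol}\big(B_r(E_r)\big) \le C r^8
  \quad\Longrightarrow\quad N \le c(n)^{-1} C\, r^{8-n} , \\
I(E_r) &\le \sum_{i=1}^N I\big(B_r(x_i)\big) \le N\, c\Lambda\, r^{n-1} \le C' r^7 , \\
\{ |A| > r^{-1} \} \cap B_1(p) &\subseteq E_r
  \quad\Longrightarrow\quad I\big( \{ |A| > r^{-1} \} \cap B_1(p) \big) \le C' r^7 .
\end{align*}
```

Writing $t = r^{-1}$, the last line says that the mass of $\{|A| > t\}$ decays like $t^{-7}$, which is the definition of a weak $L^7$ estimate. The inclusion in the last line uses only $|A|(x) \le r_I(x)^{-1}$.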
