arXiv:1811.05959v3 [math.DS] 11 Aug 2020 o-onigdimension. box-counting asaa Poland Warszawa, Poland offadpcigmaue) 88 (). 28A80 measures), packing and dorff aia system namical usto h ulda space Euclidean the of subset r nw for known are fa culsqec of sequence actual an of ausof values of uhsqecso esrmnsadwa stemnmlnumbe s minimal original the the is extent what and what measurements to of ask, sequences such to natural is it Therefore, spsil o eti observables certain for possible is h hsclltrtr seeg PF8] n nprdan a inspired and [PCFS80]) as e.g. known (see literature physical the -aladdresses E-mail 2 2000 1 e od n phrases. and words Key osdra xeietls bevn hsclsse mo system physical a observing experimentalist an Consider delay-coordinates nttt fMteais oihAaeyo cecs l Śniad ul. Sciences, of Academy Polish Mathematics, of Institute nttt fMteais nvriyo asw l aah ,0 2, Banacha ul. Warsaw, of University Mathematics, of Institute ahmtc ujc Classification. Subject Mathematics Abstract. oal isht n netv a.Fix map. injective and Lipschitz locally a inof sion sijcieo e ffull of set a on injective is et oprdt h o-rbblsi oneprsaet are from counterparts coordinates non-probabilistic required the Riegl to and Koliander compared Lellis, ments type De Bölcskei, Casdagli similar Alberti, and of by Yorke result theorem Sauer, embedding by probabilistic proven dynamical as theorem embedding lay o-onigoe epeeteape hwn o h s of use the results. how obtained showing previously examples the present improves We one. box-counting isht function Lipschitz p 1 = k aestp ea medn theorems embedding delay Takens-type measurements RYZO BARAŃSKI KRZYSZTOF k , . . . , X all n sueta h e of set the that assume and ( : ,T X, − Let x aasimmweup,[email protected] a.spiewak@m [email protected], [email protected], eurdfrarlal eosrcin hs usin h questions These reconstruction. reliable a for required , 1 ∈ RBBLSI AESTHEOREM TAKENS PROBABILISTIC A epoeta o yia oyoilperturbation polynomial typical a for that prove We . X ) X where , h aesdlyebdigterm rbblsi embedding, probabilistic theorem, embedding delay Takens ⊂ : n ag enough large and X R k h N ( → states x eaBrlset, Borel a be ) dim 2 R h , R T µ the , N maue hsi rbblsi eso fteTkn de- Takens the of version probabilistic a is This -measure. ( : x T totnhpesta,fragvnpoint given a for that, happens often It . h X 1 X ,Tx T , . . . x, T x, sln stemeasurements the as long as , OAA GUTMAN YONATAN , k 1. ) → to dlycodnt map coordinate -delay h , . . . , 74 Dmninter fdnmclsses,2A8(Haus 28A78 systems), dynamical of theory (Dimension 37C45 p proi onsof points -periodic Introduction dim X µ k k steeouinrl n the and rule evolution the is X oe rbblt esr on measure probability Borel a ( ∈ . T 1 n sn asoffdmninisedo the of instead dimension Hausdorff using and N k k − titygetrta h Hudr)dimen- (Hausdorff) the than greater strictly − 1 ttn httercntuto of reconstruction the that stating , 1 x x ) h bevrsacs slmtdt the to limited is access observer’s the , o real-valued a for , T r nbt ae,tekyimprove- key the cases, both In er. 2 x N DMŚPIEWAK ADAM AND , a ieso mle than smaller dimension has hc teghn previous a strengthens which , erdcino h ubrof number the of reduction he 7→ se a ercntutdfrom reconstructed be can ystem me fmteaia results, mathematical of umber ( h ˜ eas rvd non- a provide also We . ( r ee yadsrt time discrete a by deled x h asoffdimension Hausdorff the k ) , eerdt stenumber the as to referred , h h ˜ ( ( h x ˜ x T X ) observable fagvnlocally given a of h , ) , . . . , and imuw.edu.pl hs space phase ( x T asoffdimension, Hausdorff cih8 00-656 8, eckich -9 Warszawa, 2-097 T h ˜ ) x v mre in emerged ave ( : h , . . . , T X ∈ k 1 − → h p X 1 x : for X instead , )) . X ( T X ( ,T X, → k − sa is dy- 1 R x ) ) - . The possibility of performing measurements at every point of the is clearly unrealistic. However, such an assumption enables one to obtain theoretical results which justify the validity of actual procedures used by experimentalists (see e.g. [HGLS05, KY90, SGM90, SM90]). Note that one cannot expect a reliable reconstruction of the system based on the measurements of a given observable h, as it may fail to distinguish the states of the system (e.g. if h is a constant function). It is therefore necessary (and rather realistic) to assume that the experimentalists are able to perturb the given observable. The first result obtained in this area is the celebrated Takens delay embedding theorem for smooth systems on manifolds [Tak81, Theorem 1]. Due to its strong connections with actual reconstruction procedures used in the natural sciences, Takens theorem has been met with great interest among mathematical physicists (see e.g. [HBS15, SYC91, Vos03]). Let us recall its extension due to Sauer, Yorke and Casdagli [SYC91]. In this setting, the number k of the delay- coordinates should be two times larger than the upper box-counting dimension of the phase space X (denoted by dimB X; see Section 2 for the definition), and the perturbation is a polynomial of degree 2k. The formulation of the result given here follows [Rob11]. Theorem 1.1 ([Rob11, Theorem 14.5]). Let X RN be a compact set and let T : X X ⊂ → be Lipschitz and injective. Let k N be such that k > 2dim X and assume 2dim ( x ∈ B B { ∈ X : T px = x ) < p for p = 1,...,k 1. Let h: RN R be a Lipschitz function and } − → h ,...,h : RN R a basis of the space of polynomials of degree at most 2k. For α = 1 m → (α ,...,α ) Rm denote by h : RN R the map 1 m ∈ α → m

hα(x)= h(x)+ αjhj(x). j=1 X Then for Lebesgue almost every α =(α ,...,α ) Rm, the transformation 1 m ∈ φT : X Rk, φT (x)=(h (x), h (T x),...,h (T k−1x)) α → α α α α is injective on X.

T The map φα is called the delay-coordinate map. Note that Theorem 1.1 applies to any compact set X RN , not necessarily a manifold. This is a useful feature, as it allows ⊂ one to consider sets with a complicated geometrical structure, such as sets arising as in chaotic dynamical systems, see e.g. [ER85]. Moreover, the upper box-counting dimension of X can be smaller than the dimension of any smooth manifold containing X, so Theorem 1.1 may require fewer delay-coordinates than its smooth counterpart in [Tak81]. As it was noted above, usually an experimentalist may perform only a finite number of observations h(x ),...,h(T k−1x ) for some points x X, j = 1,...,l. We believe it is j j j ∈ realistic to assume there is an (explicit or implicit) random process determining which initial states xj are accessible to the experimentalist. In this paper we are interested in the question of reconstruction of the system in presence of such process. Mathematically speaking, this corresponds to fixing a probability measure µ on X and asking whether the delay-coordinate T map φα is injective almost surely with respect to µ. Since in this setting we are allowed to neglect sets of probability zero, it is reasonable to ask whether the minimal number of delay-coordinates sufficient for the reconstruction of the system can be smaller than 2 dim X. 2 Our main result states that this is indeed the case, and the number of delay-coordinates can be reduced by half for any (Borel) probability measure. The problem of determining the minimal number of delay-coordinates required for recon- struction has been already considered in the physical literature. In [PCFS80], the authors analyzed an algorithm which may by interpreted as an attempt to determine this number in a probabilistic setting. Our work provides rigorous results in this direction. The following theorem is a simplified version of our result.

Theorem 1.2 (Probabilistic Takens delay embedding theorem). Let X RN be a ⊂ Borel set, µ a Borel probability measure on X and T : X X an injective, locally Lipschitz → map. Take k N such that k > dim X and assume that for p = 1,...,k 1 we have ∈ − dim( x X : T px = x ) < p or µ( x X : T px = x )=0. Let h: X R be a locally { ∈ } { ∈ } → Lipschitz function and h ,...,h : RN R a basis of the space of real polynomials of N 1 m → variables of degree at most 2k 1. For α = (α ,...,α ) Rm denote by h : RN R the − 1 m ∈ α → map m

hα(x)= h(x)+ αjhj(x). j=1 X Then for Lebesgue almost every α =(α ,...,α ) Rm, there exists a Borel set X X of 1 m ∈ α ⊂ full µ-measure, such that the delay-coordinate map φT : X Rk, φT (x)=(h (x), h (T x),...,h (T k−1x)) α → α α α α is injective on Xα.

In the above theorem, the dimension dim can be chosen to be any of dimH , dimB , dimB (Hausdorff, lower and upper box-counting dimension; for the definitions see Section 2). Recall that for any Borel set X one has (1.1) dim X dim X dim X H ≤ B ≤ B (see e.g. [Fal14, Proposition 3.4]). Since the inequalities in (1.1) may be strict, using the Hausdorff dimension instead of the box-counting one(s) may reduce the required number of delay-coordinates. In particular, there are compact sets X RN with dim X = 0 and ⊂ H dimB X = N, hence Theorem 1.2 can reduce significantly the number of required delay- coordinates compared to Theorem 1.1 (in a probabilistic setting). Notice that in Theorem 1.2 we do not assume that the measure µ is T -invariant. However, the invariance of µ provides some additional benefits, as shown in the following remark.

Remark 1.3 (Invariant measure case). Suppose that the measure µ in Theorem 1.2 is additionally T -invariant, i.e. µ(Y ) = µ(T −1(Y )) for every Borel set Y X. Then the ⊂ set Xα can be chosen to satisfy T (Xα) = Xα. Moreover, if µ is T -invariant and ergodic (i.e. T −1(Y ) = Y can occur only for sets Y of 0 or full µ-measure), then the assumption on the periodic points of T in Theorem 1.2 can be omitted.

Note that in the case when the measure µ is T -invariant, Theorem 1.2 and Remark 1.3 T show that for a suitable choice of Xα, the map φα is injective on the invariant set Xα, which 3 implies that the (X,ˆ Tˆ) for Xˆ = φT (X ), Tˆ = φT T (φT )−1, is a model α α α ◦ ◦ α of the system (X, T ) embedded in Rk. An extended versions of Theorem 1.2 and Remark 1.3 are presented and proved in Section 4 as Theorem 4.3 and Remark 4.4, respectively. Theorem 4.3 shows that the assumption k > dim X can be slightly weakened, and in addition to locally Lipschitz functions h, one can consider locally β-Hölder functions for suitable β (0, 1]. Moreover, one can replace the ∈ probabilistic measure µ by any Borel σ-finite measure on X. For details, see Section 4. Notice that to eliminate the assumption on the periodic points of T in Theorem 1.2, one can also consider systems with ‘few’ or no periodic points. For instance, as proved in [Yor69], a flow on a subset of Euclidean space given by an autonomous differential equation x˙ = F (x), 2π where F is Lipschitz with a constant L, has no periodic orbits of period smaller than L . It 2π follows that if T is a t-time map for such a flow with t < L dim X , then it has no periodic points of periods smaller than dim X and therefore the assumption on periodic points in Theorem 1.2 can be omitted (compare also [Gut16, Remark 1.2]). The same holds if the number of periodic points of a given period is finite, which by the Kupka–Smale theorem is a generic condition in the space of Cr-diffeomorphisms (r 1) of a compact manifold equipped 1 ≥ with the uniform Cr-topology . As has been mentioned already, Takens theorems are used in order to justify actual (approx- imate) delay map procedures based on real experimental data (see e.g. [MGNS18, HGLS05, SGM90, SM90]). Note, however, that in the cited papers the dimension of the phase space X is deduced a posteriori from the properties of the (orbits of the delay coordinate map for a given observable). It would be very interesting to know whether in the literature it has been observed for some experimental data originating from a space X with known dimension that it is sufficient to have k dim X (instead of k 2 dim X) delay-coordinates ≈ ≈ (in other words, time series of length k) in the framework of such procedures. In this paper we focus our attention to the case when the space X is a subset of a finite- dimensional Euclidean space. Takens-type delay embedding theorems have also been ex- tended to finite-dimensional subsets of Banach spaces (see e.g. [Rob05]). It is a natural question, whether our probabilistic embedding theorems can also be transferred into the infinite-dimensional setup. This problem will be considered in a subsequent work. Takens-type delay embedding theorems can be seen as dynamical versions of embedding theorems which specify when a finite-dimensional set can be embedded into a Euclidean T space. Indeed, under the assumptions of Theorem 1.1, the delay-coordinate map φα is an embedding of X into Rk for typical α. Embedding theorems in various categories have been extensively studied in a number of papers (see Section 3 for a more detailed discussion). Recently, Alberti, Bölcskei, De Lellis, Koliander and Riegler [ABD+19] proved a probabilistic embedding theorem involving the modified lower box-counting dimension of the set (see Theorem 3.6). We are able to improve this result by considering the Hausdorff dimension.

1 According to the Kupka–Smale theorem ([PdM82, Chapter 3, Theorem 3.6]) it is generic that the periodic points are hyperbolic and thus periodic points of a given period are isolated by the Hartman–Grobman theorem ([PdM82, Chapter 2, Theorem 4.1]). 4 Below we present a simplified version of our theorem, which can be seen as a non-dynamical counterpart of Theorem 1.2. Theorem 1.4 (Probabilistic embedding theorem). Let X RN be a Borel set and let ⊂ µ be a Borel probability measure on X. Take k N such that the k-th Hausdorff measure of ∈ X is zero (it suffices to take k > dim X) and let φ: X Rk be a locally Lipschitz map. H → Then for Lebesgue almost every linear transformation L: RN Rk there exists a Borel set → X X of full µ-measure, such that φ = φ + L is injective on X . L ⊂ L L The extended version of the theorem is formulated and proved in Section 3 as Theorem 3.1. In particular, we obtain the following geometric corollary (see Section 3 for details). Corollary 1.5 (Probabilistic injective projection theorem). Let X RN be a Borel set ⊂ and let µ be a Borel probability measure on X. Then for every k > dimH X and almost every k-dimensional linear subspace S RN , the orthogonal projection of X into S is injective on ⊂ a full µ-measure subset of X. Notice that by the Marstrand–Mattila projection theorem (see [Mar54, Mat75]), if X RN ⊂ is Borel and k dim X, then for almost all k-dimensional linear subspaces S RN , the ≥ H ⊂ image of X under the orthogonal projection into S has Hausdorff dimension equal to dimH X. Note also that Sauer and Yorke proved in [SY97] that the dimension2 of a bounded Borel subset X of RN is preserved under typical smooth maps and typical delay-coordinate maps into Rk as long as k dim X. ≥ In this paper we also provide several examples. Example 3.5 shows that in general the condition k > dim X in Theorem 1.4 cannot be replaced by k dim X. Example 4.6 shows H ≥ H that linear perturbations of the observable are not sufficient for Takens theorem. Section 5 contains a pair of examples. The first one is based on Kan’s example from the Appendix to [SYC91], showing that condition k > 2 dimH X is not sufficient for existence of a linear transformation into Rk which is injective on X. As in the probabilistic setting one can work with the Hausdorff dimension, we consider a set X R2 similar to the one provided by Kan, ⊂ which cannot be embedded linearly into R, but when endowed with a natural probability measure, almost every linear transformation L: R2 R is injective on a set of full measure. → The second example provides a probability measure with dimH µ < dim MB µ, showing that Theorem 1.4 strengthens a previous result from [ABD+19]. Organization of the paper. The paper is organized as follows. In Section 2 we introduce notation, definitions and preliminary results. Section 3 contains the formulation and proof of the extended version of the probabilistic embedding theorem (Theorem 3.1), while Section 4 is devoted to the proof of the extended version of the probabilistic Takens delay embedding theorem (Theorem 4.3). In Section 5 we present examples showing how the use of the Hausdorff dimension improves the previously obtained results. Acknowledgements. We are grateful to Erwin Riegler for helpful discussions and to the anonymous referees for helpful comments. Y. G. and A. Ś. were partially supported by the National Science Centre (Poland) grant 2016/22/E/ST1/00448. 2 Any one of the dimensions mentioned above and denoted by dim. 5 2. Preliminaries Consider the Euclidean space RN for N N, with the standard inner product , and ∈ h· ·i the norm . The open δ-ball around a point x RN is denoted by B (x, δ). By X we k·k ∈ N | | denote the diameter of a set X RN . We say that function φ: X Rk, X RN is locally ⊂ → ⊂ β-Hölder for β > 0 if for every x X there exists an open set U RN containing x such ∈ ⊂ that φ is β-Hölder on U X, i.e. there exists C > 0 such that ∩ φ(x) φ(y) C x y β k − k ≤ k − k for every x, y U X. We say that φ is locally Lipschitz if it is locally 1-Hölder. ∈ ∩ For k N we write Gr(k, N) for the (k, N)-Grassmannian, i.e. the space of all k- ≤ dimensional linear subspaces of RN , equipped with the standard rotation-invariant (Haar) measure, see [Mat95, Section 3.9] (and [FR02] for another construction of a rotation-invariant measure on the Grassmannian). By ηN we denote the normalized Lebesgue measure on the unit ball BN (0, 1), i.e. 1 ηN = Leb BN (0,1), Leb(BN (0, 1)) | where Leb is the Lebesgue measure on RN . For s> 0, the s-dimensional (outer) Hausdorff measure of a set X RN is defined as ⊂ ∞ ∞ s s (X) = lim inf Ui : X Ui, Ui δ . H δ→0 | | ⊂ | | ≤ i=1 i=1 n X [ o The Hausdorff dimension of X is given as dim X = inf s> 0 : s(X)=0 = sup s> 0 : s(X)= . H { H } { H ∞} For a bounded set X RN and δ > 0, let N(X, δ) denote the minimal number of balls ⊂ of diameter at most δ required to cover X. The lower and upper box-counting (Minkowski) dimensions of X are defined as log N(X, δ) log N(X, δ) dimB X = lim inf and dimB X = lim sup . δ→0 log δ → log δ − δ 0 − The lower (resp. upper) box-counting dimension of an unbounded set is defined as the supre- mum of the lower (resp. upper) box-counting dimensions of its bounded subsets. The lower and upper modified box-counting dimensions of X RN are defined as ⊂ ∞

dimMB X = inf sup dimB Ki : X Ki, Ki compact , ∈N ⊂ i i=1 n [∞ o

dimMB X = inf sup dimB Ki : X Ki, Ki compact . i∈N ⊂ i=1 n [ o With this notation, the following inequalities hold:

dimH X dimMB X dimMB X dimB X, (2.1) ≤ ≤ ≤ dimH X dimMB X dimB X dimB X. ≤ ≤6 ≤ We define dimension of a finite Borel measure µ in RN as dim µ = inf dim X : X RN is a Borel set of full µ-measure . { ⊂ } Here dim may denote any one of the dimensions defined above. Recall that for a measure µ on a set X and a measurable Y X we say that Y is of full µ-measure, if µ(X Y )=0. ⊂ \ For more information on dimension theory in Euclidean space see [Fal14, Mat95, Rob11]. For N,k N let Lin(RN ; Rk) be the space of all linear transformations L: RN Rk. Such ∈ → transformations are given by (2.2) Lx = l , x ,..., l , x , h 1 i h k i N N k N k where l1,...,lk R . Thus, the space Lin(R ; R ) can be identified with (R ) , and ∈ k the Lebesgue measure on Lin(RN ; Rk) is understood as Leb, where Leb is the Lebesgue j=1 RN RN Rk N measure in . Within the space Lin( ; ) we considerN the space Ek consisting of all linear transformations L: RN Rk of the form (2.2), for which l ,...,l B (0, 1). Note → 1 k ∈ N that by the Cauchy-Schwarz inequality, (2.3) Lx √N x k k ≤ k k N RN for every L Ek and x . ∈ ∈ N By ηN,k we denote the normalized Lebesgue measure on Ek , i.e. the probability measure N on Ek given by k 1 η = Leb . N,k Leb(B (0, 1)) |BN (0,1) j=1 N O The following geometrical inequality, used in [HK99] (see also [Rob11, Lemma 4.1]) is the key ingredient of the proof of Theorem 3.1. Lemma 2.1. Let L: RN Rk be a linear transformation. Then for every x RN 0 , → ∈ \{ } z Rk and ε> 0, ∈ εk η ( L EN : Lx + z ε ) CN k/2 , N,k { ∈ k k k ≤ } ≤ x k k k where C > 0 is an absolute constant. For L Lin(Rm; Rk), where m, k N, denote by σ (L), p 1,...,k , the p-th largest ∈ ∈ p ∈ { } singular value of the matrix L, i.e. the p-th largest square root of an eigenvalue of the matrix L∗L. In the proof of Theorem 4.3, instead of Lemma 2.1 we will use the following lemma, proved as [SYC91, Lemma 4.2] (see also [Rob11, Lemma 14.3]). Lemma 2.2. Let L: Rm Rk be a linear transformation. Assume that σ (L) > 0 for some → p p 1,...,k . Then for every z Rk and ρ,ε > 0, ∈{ } ∈ p Leb( α Bm(0, ρ): Lα + z ε ) ε { ∈ k k ≤ } Cm,k , Leb(Bm(0, ρ)) ≤ σp(L) ρ   where Cm,k > 0 is a constant depending only on m, k and Leb is the Lebesgue measure on Rm. 7 To verify the measurability of the sets occurring in subsequent proofs, we will use the two following elementary lemmas. A measure µ on a set X is called σ-finite if there exists a countable collection of measurable sets An, n N such that µ(An) < for each n N and ∞ ∈ ∞ ∈ An = X. Recall that a σ-compact set is a countable union of compact sets. n=1 LemmaS 2.3. Let X RN be a Borel set and let µ be a Borel σ-finite measure on X. Then ⊂ there exists a σ-compact set K X of full µ-measure. ⊂ Proof. Follows directly from the fact that a σ-finite Borel measure in a Euclidean space is regular (see e.g. [Bil99, Theorem 1.1]).  Lemma 2.4. Let , be metric spaces. Then the following hold. X Z (a) If K is σ-compact, then so is πX (K), where πX : is the projection ⊂X×Z X×Z →X given by πX (x, z)= x. In particular, πX (K) is Borel. (b) If is σ-compact, F : is continuous and K is σ-compact, then F −1(K) X X → Z ⊂ Z is σ-compact, hence Borel. (c) If , are σ-compact, F : Rk, k N, is continuous and K is X Z X×Z → ∈ ⊂ X σ-compact, then the set (x, z) : F (x, z)= F (y, z) for some y K x { ∈X×Z ∈ \{ }} is σ-compact and hence Borel.

Proof. The statement (a) follows from the fact that πX is continuous, and a continuous image of a compact set is also compact. To show (b), it is enough to notice that F −1(K) is a countable union of closed subsets of a σ-compact space. To check (c), let πX×Z : K X × ×Z → be the projection πX×Z (x, y, z)=(x, z). Then X × Z (x, z) : F (x, z)= F (y, z) for some y K x { ∈X×Z ∈ \{ }} = πX×Z (x, y, z) K : F (x, z)= F (y, z), d(x, y) =0 { ∈ X × × Z 6 } ∞  1 = πX×Z (x, y, z) K : F (x, z)= F (y, z), d(x, y) , { ∈ X × × Z ≥ n} n=1 [  where d is the metric in . Since d is continuous, we can use (a) and (b) to end the proof.  X 3. Probabilistic embedding theorem In this section we prove an extended version of the probabilistic embedding theorem, formulated below. Obviously, Theorem 1.4 follows from Theorem 3.1 Theorem 3.1 (Probabilistic embedding theorem – extended version). Let X RN ⊂ be a Borel set and µ be a Borel σ-finite measure on X. Take k N and β (0, 1] such that ∈ ∈ βk(X)=0 and let φ: X Rk be a locally β-Hölder map. Then for Lebesgue almost every H → linear transformation L: RN Rk there exists a Borel set X X of full µ-measure, such → L ⊂ that the map φL = φ + L is injective on XL.

Remark 3.2. It is straightforward to notice that if dimH X =0, then φ can be taken to be an arbitrary Hölder map. 8 Proof of Theorem 3.1. Note first that it sufficient to prove that the set XL exists for ηN,k- N almost every L Ek . Indeed, if this is shown, then for a given locally β-Hölder map Rk ∈ N N ˜ φ: X we can take sets j Ek , j , such that ηN,k( j)=1 and for every L j → L ⊂ ∈ (j) L ∈L the map (φ/j)˜ = φ/j + L˜ is injective on a Borel set X X of full µ-measure. Then the L L˜ ⊂ set = jL˜ : L˜ Lin(RN ; Rk) has full Lebesgue measure and for every L L j∈N{ ∈ Lj} ⊂ ∈ L there exists j such that L/j j, so (φ/j)L/j = (φ + L)/j (and hence φL) is injective on S (j) ∈ L XL = j∈N XL/j, which has full µ-measure. By Lemma 2.3, we can assume that X is σ-compact. Take k N, β (0, 1] with βk(X)= T ∈ ∈ H 0 and a locally β-Hölder map φ: X Rk. Set → A = (x, L) X EN : φ (x)= φ (y) for some y X x . { ∈ × k L L ∈ \{ }} By Lemma 2.4, A is Borel. For x X and L EN , denote by A and AL, respectively, the ∈ ∈ k x sections A = L EN :(x, L) A , AL = x X :(x, L) A . x { ∈ k ∈ } { ∈ ∈ } L The sets Ax and A are Borel as sections of a Borel set. Observe first that in order to prove the theorem it is enough to show η (A )=0 for every x X, since then by Fubini’s N,k x ∈ theorem ([Rud87, Thm. 8.8]), (η µ)(A)=0 and, consequently, µ(AL)=0 for η -almost N,k ⊗ N,k every L EN . Since φ is injective on X AL, the assertion of the theorem is true. ∈ k L \ Take a point x X. Since φ is locally β-Hölder and X is separable, there exists a countable ∈ covering of X by open sets U RN , j N, such that j ⊂ ∈ (3.1) φ(y) φ(y′) C y y′ β for every y,y′ U X k − k ≤ jk − k ∈ j ∩ for some Cj > 0. Let 1 K = y X : x y . n ∈ n ≤k − k To show η (A )=0, it suffices to proven η (A )=0 foro every j, n N, where N,k x N,k x,j,n ∈ A = L EN : φ (x)= φ (y) for some y U K . x,j,n { ∈ k L L ∈ j ∩ n} Note that by Lemma 2.4, the set Ax,j,n is Borel. Take j, n N and fix a small ε > 0. Since βk(U K ) βk(X)=0, there exists a ∈ H j ∩ n ≤ H collection of balls B (y ,ε ), i N, for some y U K and ε > 0, such that N i i ∈ i ∈ j ∩ n i ∞ (3.2) U K B (y ,ε ) and εβk ε. j ∩ n ⊂ N i i i ≤ ∈N i=1 i[ X Take L A and y U K such that φ (x) = φ (y). Then y B (y ,ε ) for some ∈ x,j,n ∈ j ∩ n L L ∈ N i i i N and ∈ L(y x)+ φ(y ) φ(x) = φ (y ) φ (x) k i − i − k k L i − L k = φ (y ) φ (y) k L i − L k φ(y ) φ(y) + L(y y) ≤k i − k k i − k C y y β + √N y y ≤ jk i − k k i − k M εβ ≤ j i 9 for some Mj > 0, by (2.3) and (3.1). This shows that

N β Ax,j,n L Ek : L(yi x)+ φ(yi) φ(x) Mjεi . ⊂ ∈N{ ∈ k − − k ≤ } i[ By Lemma 2.1, (3.2) and the fact y K , we have i ∈ n ∞ η (A ) η ( L EN : L(y x)+ φ(y ) φ(x) M εβ ) N,k x,j,n ≤ N,k { ∈ k k i − i − k ≤ j i } i=1 X CN k/2M k ∞ j εβk CN k/2M knkε. ≤ 1/nk i ≤ j i=1 X Since ε> 0 was arbitrary, we obtain ηN,k(Ax,j,n)=0, which ends the proof.  Remark 3.3. Note that the assumption βk(X)=0 is fulfilled if dim X <βk, so Theo- H H rem 3.1 is indeed a Hausdorff dimension embedding theorem. Moreover, it may happen that βk(X)=0 and dim X = βk. H H As a simple consequence of Theorem 3.1, we obtain the following corollary, formulated in a slightly simplified version in Section 1 as Corollary 1.5. Corollary 3.4 (Probabilistic injective projection theorem – extended version). Let X RN be a Borel set and let µ be a Borel σ-finite measure on X. Then for every ⊂ k N, k N such that k(X)=0 and almost every k-dimensional linear subspace S ∈ ≤ H ⊂ RN (with respect to the standard measure on the Grassmannian Gr(k, N)), the orthogonal projection of X into S is injective on a full µ-measure subset of X (depending on S). Proof of Corollary 3.4. Apply Theorem 3.1 for the map φ 0. Then we know that a linear ≡ map L Lin(RN ; Rk) of the form (2.2) is injective on a set X X of full µ-measure ∈ L ⊂ for Lebesgue almost every (l ,...,l ) (RN )k. We can assume that l ,...,l are linearly 1 k ∈ 1 k independent for all such L, which also implies that the same holds for Ll1,...,Llk. Setting

SL = Span(l1,...,lk) and taking V Lin(Rk; RN ) defined by V (Ll )= l for j =1,...,k, we have L ∈ L j j V L = Π , L ◦ SL RN where ΠSL is the orthogonal projection from onto SL and VL is injective. It follows that

ΠSL is injective on XL for almost every (l1,...,lk), so ΠS is injective on a full µ-measure subset of X for almost every k-dimensional linear subspace S RN .  ⊂ Let us note that in general, the requirement βk(X)=0 in Theorem 3.1 cannot be replaced H by the weaker condition dim (X) βk. H ≤ Example 3.5. Let k = β =1, X = S1 R2 be the unit circle and let µ be the normalized ⊂ Lebesgue measure on S1. We shall prove that there is no Lipschitz transformation φ: S1 R → which is injective on a set of full µ-measure. Let φ be such a transformation. Then φ(S1)= [a, b] for some compact interval. As φ is injective on a set of full measure, the interval [a, b] is non-degenerate, i.e. a < b. Fix points x, y S1 with φ(x)= a, φ(y)= b. As x = y, there are ∈ 10 6 exactly two open arcs I,J S1 of positive measure joining x and y such that I J = x, y ⊂ ∩ { } and I J = S1. Clearly φ(I) = φ(J) = [a, b]. Let A S1 be a Borel set such that φ is ∪ ⊂ injective on A and µ(A)=1. As Lipschitz maps transform sets of zero Lebesgue measure to sets of zero Lebesgue measure, we conclude that φ(I A) and φ(J A) are disjoint Lebesgue ∩ ∩ measurable subsets of [a, b] with Lebesgue measure equal to b a. This contradiction shows − that no Lipschitz transformation φ: S1 R is injecitive on a full measure set. → Theorem 3.1 strengthens the following embedding theorem, proved recently by Alberti, Bölcskei, De Lellis, Koliander and Riegler in [ABD+19]. Theorem 3.6 ([ABD+19, Theorem II.1]). Let µ be a Borel probability measure in RN and let k N be such that k > dimMB µ. Then for Lebesgue almost every linear transformation ∈ L: RN Rk there exists a Borel set X RN such that µ(X )=1 and L is injective on → L ⊂ L XL.

+ In fact, in [ABD 19] the authors introduced the notion of dimMB µ, denoting it by K(µ) and calling it the description of the measure. In particular, Theorem 3.6 holds for measures µ supported on a Borel set X RN with dim X 2dim X ⊂ ∈ B (it suffices to take k > dim (X X)). Then Lebesgue almost every linear transformation H − L: RN Rk is injective on X. → Remark 3.8. As noticed by Mañé and communicated in [ER85, p. 627], his original state- ment in [Mn81] is incorrect. Namely, he assumed k > 2 dimH X + 1 instead of k > dim (X X). However, this is known to be insufficient for the existence of a linear em- H − bedding of X into Rk. In fact, in [SYC91, Appendix A], Kan presented an example of a m m m−1 set X R with dimH X = 0, such that any linear transformation L: R R fails ⊂ 11 → to be injective on X. It turns out that the assumption k > 2 dimH X is insufficient, while k > 2dimB X is sufficient. This stems from the fact that the proof of Theorem 3.7 actually requires the property k > dim (X X), and the upper box-counting dimension satisfies H − (3.3) dim (A B) dim (A)+ dim (B), B × ≤ B B for A, B RN , hence ⊂ dim (X X) dim (X X) dim (X X) 2dim X H − ≤ H × ≤ B × ≤ B (note that this calculation shows that k > 2dimB X is a stronger assumption than k > dim (X X)). On the other hand, (3.3) does not hold for the Hausdorff dimension (nor for H − the lower box-counting dimension), and dim X does not control dim (X X). The fact H H − that in Theorem 3.1 we can work with the Hausdorff dimension comes from the application of Fubini’s theorem, which enables us to consider covers of the set X itself, instead of X X. − In Section 5 we analyze Kan’s example from the point of view of Theorem 3.1. Theorem 3.7 is also true for subsets of an arbitrary Banach space B for a prevalent set of linear transformations L: B Rk (see [Rob11, Chapter 6] for details). → Note that the linear embedding from Theorem 3.1 need not preserve the dimension of X. Indeed, the Hausdorff and box-counting dimensions are invariants for bi-Lipschitz transforma- tions, yet inverse of a linear map on a compact set does not have to be Lipschitz. Therefore, we only know that dim φ (X) dim X (see [Rob11, Proposition 2.8.iv and Lemma 3.3.iv]) L ≤ and the inequality can be strict. For example, let φ 0 and X = (x, f(x)) : x [0, 1] be a ≡ { ∈ } graph of a (Hölder continuous) function f : [0, 1] R with dim X > 1, e.g. the Weierstrass → H non-differentiable function. Then the linear projection L: R2 R given by L(x, y)= x sat- → isfies 1 = dim L(X) < dimH X. The following theorem shows that in the non-probabilistic setting, one can obtain β-Hölder continuity of the inverse map for small enough β (0, 1) ∈ (see [BAEFN93, EFNT94, HK99] and [Rob11, Chapter 4]). Theorem 3.9. Let X RN be a compact set. Let k N be such that k > 2dim X and let ⊂ ∈ B β be such that 0 <β< 1 2dim X/k. Then Lebesgue almost every linear transformation − B L: RN Rk is injective on X with β-Hölder continuous inverse. → However, this is not true in the case of Theorem 3.1. Remark 3.10. In general, we cannot claim that the injective map φ from Theorem 3.1 L|XL has a Hölder continuous inverse. Indeed, it is well-known that for n N there are examples ∈ of compact sets X RN of Hausdorff and topological dimension equal to n, which do not ⊂ embed topologically into Rk for k 2n (showing the optimality of the bounds in the Menger– ≤ Nöbeling embedding theorem, see [HW41, Example V.3]). Consider a probability measure µ on X with supp µ = X, where supp denotes the topological support of the measure (the intersection of all closed sets of full measure). It is known that such measure exists for any compact set. If the map φL XL from Theorem 3.1 for k = n+1 had a Hölder continuous inverse −1 | Rn+1 f = φL , then we could extend f from φL(XL) to preserving the Hölder continuity ([Ban51, Theorem IV.7.5], see also [Min70]). Then Y = x X : f φ (x)= x would be a { ∈ ◦ L } closed subset of X with µ(Y )=1, hence Y = X, so φL would be homeomorphism between n+1 X and φL(X) R , which would give a contradiction. ⊂ 12 4. Probabilistic Takens delay embedding theorem In this section we present the proof of the extended probabilistic Takens delay embedding theorem. It turns out that linear perturbations are insufficient for Takens-type theorems, see Example 4.6. As observed in [SYC91], it is enough to take perturbations over the space of polynomials of degree 2k. This can be easily extended to more general families of functions. Definition 4.1. Let X be a subset of RN . A family of transformations h ,...,h : X R is 1 m → called a k-interpolating family on set X, if for every collection of distinct points x ,...,x X 1 k ∈ and every ξ = (ξ ,...,ξ ) Rk there exists (α ,...,α ) Rm such that α h (x )+ + 1 k ∈ 1 m ∈ 1 1 i ··· αmhm(xi)= ξi for each i =1,...,k. In other words, the matrix

h1(x1) ... hm(x1) . .. .  . . .  h (x ) ... h (x )  1 k m k  has full row rank as a transformation from Rm to Rk. Note that the same is true for any collection of l distinct points with l k. ≤ Remark 4.2. It is known that any linear basis h1,...,hm of the space of real polynomials of N variables of degree at most k 1 is a k-interpolating family (see e.g. [GS00, Section 1.2, − eq. (1.9)]). For a transformation T : X X and p N denote by Per (T ) the set of periodic points → ∈ p of minimal period p, i.e. Per (T )= x X : T px = x and T jx = x for j =1,...,p 1 . p { ∈ 6 − } Let µ and ν be measures on a measurable space ( , ). The measure µ is called singular X F with respect to ν, if there exists a measurable set Y such that µ( Y ) = ν(Y )=0. ⊂ X X \ In this case we write µ ν. By µ we denote the restriction of µ to a set A . ⊥ |A ∈F Theorem 4.3 (Probabilistic Takens delay embedding theorem – extended version). Let X RN be a Borel set, µ be a Borel σ-finite measure on X and T : X X an injective, ⊂ → locally Lipschitz map. Take k N and β (0, 1] such that βk(X)=0 and assume ∈ ∈ H µ βp for every p =1,...,k 1. Let h: X R be a locally β-Hölder function and |Perp(T ) ⊥ H − → h ,...,h : X R a 2k-interpolating family on X consisting of locally β-Hölder functions. 1 m → For α =(α ,...,α ) Rm denote by h : X R the transformation 1 m ∈ α → m

hα(x)= h(x)+ αjhj(x). j=1 X Then for Lebesgue almost every α =(α ,...,α ) Rm, there exists a Borel set X X of 1 m ∈ α ⊂ full µ-measure, such that the delay-coordinate map φT : X Rk, φT (x)=(h (x), h (T x),...,h (T k−1x)) α → α α α α is injective on Xα. Notice that Theorem 1.2 follows from Theorem 4.3 by Remark 4.2. 13 Remark 4.4 (Invariant measure case – extended version). Under the assumptions of Theorem 4.3, the following hold. (a) If the measure µ is T -invariant, then the set X can be chosen to satisfy T (X ) X . α α ⊂ α (b) If the measure µ is finite and T -invariant, then the set Xα can be chosen to satisfy T (Xα)= Xα. (c) If the measure µ is T -invariant and ergodic, then the assumption on the periodic points of T in Theorem 4.3 can be omitted.

Under the notation of Theorem 4.3, we first show a preliminary lemma. For x X define ∈ its full Orb(x) as

Orb(x)= T nx : n 0 y X : T ny = x for some n N . { ≥ }∪{ ∈ ∈ } Note that since T is injective, all full orbits are at most countable, and any two full orbits Orb(x) and Orb(y) are either equal or disjoint. For x, y X let D be the k m matrix ∈ x,y × defined by

h (x) h (y) ... h (x) h (y) 1 − 1 m − m h (T x) h (Ty) ... h (T x) h (Ty)  1 1 m m  Dx,y = −. . −......   h (T k−1x) h (T k−1y) ... h (T k−1x) h (T k−1y)  1 − 1 m − m    Lemma 4.5. For x, y X, the following statements hold. ∈ (i) If y = x, then rank D 1. 6 x,y ≥ (ii) If y / Orb(x) and y Perp(T ) for some p 1,...,k 1 , then rank Dx,y p. ∈ ∈ k−1 ∈{ − } ≥ (iii) If y / Orb(x) and y / Perp(T ), then rank Dx,y = k. ∈ ∈ p=1 S Proof. For (i), it suffices to observe that the first row of D is non-zero as long as x = y and x,y 6 therefore rank(D ) 1. Indeed, otherwise we would have h (x) = h (y) for j = 1,...,m x,y ≥ j j which contradicts the fact that h1,...,hm is an interpolating family. Assume now y / Orb(x), which implies Orb(y) Orb(x)= . Let q (resp. r) be a maximal ∈ ∩ ∅ number from 1,...,k such that the points x,Tx,...,T q−1x (resp. y,Ty,...,T r−1y) are { } distinct. Notice that if y Perp(T ) for some p 1,...,k 1 , then r = p, and if y / k−1 ∈ ∈ { − } ∈ Perp(T ), then r = k. Thus, the assertions (ii)–(iii) of the lemma can be written simply p=1 asS one condition

(4.1) rank D r. x,y ≥ To show that (4.1) holds, denote the points x,Tx,...,T q−1x,y,Ty,...,T r−1y, preserving the order, by z ,...,z , for l = q + r. By the definition of q,r, we have 1 l 2k and the points 1 l ≤ ≤ z1,...,zl are distinct. Thus, the matrix Dx,y can be written as the product

Dx,y = Jx,yVx,y, 14 where h1(z1) ... hm(z1) . .. . Vx,y =  . . .  h (z ) ... h (z )  1 l m l  and J is a k l matrix with entries in 1, 0, 1 and block structure of the form x,y × {− } Idr×r Jx,y = ∗ − ,  ∗ ∗  where Id × is the r r identity matrix. It follows that rank J r. Moreover, since r r × x,y ≥ z1,...,zl are distinct and h1,...,hm is a 2k-interpolating family, the matrix Vx,y is of full rank, hence rank D = rank J r, which ends the proof.  x,y x,y ≥ Proof of Theorem 4.3. We proceed similarly as in the proof of Theorem 3.1, using Lemma 2.2 instead of Lemma 2.1, together with the suitable rank estimates coming from Lemma 4.5. In the same way as in the proof of Theorem 3.1, we show that it is enough to check that the suitable set X exists for η -almost every α B (0, 1). α m ∈ m Applying Lemma 2.3 to the sets Per (T ), p = 1,...,k 1 and (possibly zero) measures p − µ , we find (possibly empty) disjoint σ-compact sets X ,...,X − X such that |Perp(T ) 1 k 1 ⊂ X Per (T ), µ(Per (T ) X )=0, βp(X )=0 for p =1,...,k 1. p ⊂ p p \ p H p − k−1 Similarly, there exists a σ-compact set Xk X Perp(T ) such that ⊂ \ p=1

k−1 S µ X Per (T ) X =0 and βk(X )=0. \ p \ k H k p=1  [   Note that Xk contains both aperiodic and periodic points (with period at least k). Let

k

X˜ = Xp. p=1 [ Then X˜ X is a σ-compact set of full µ-measure. Define ⊂ A = (x, α) X˜ B (0, 1) : φT (x)= φT (y) for some y X˜ x . { ∈ × m α α ∈ \{ }} The set A is Borel by Lemma 2.4. For x X˜ and α B (0, 1), denote, respectively, by A ∈ ∈ m x and Aα, the Borel sections A = α B (0, 1):(x, α) A , Aα = x X˜ :(x, α) A . x { ∈ m ∈ } { ∈ ∈ } T Observe that to show the injectivity of φα on a set of full µ-measure, it is enough to prove η (A )=0 for every x X˜, since then by Fubini’s theorem ([Rud87, Thm. 8.8]), (η m x ∈ m ⊗ µ)(A)=0 and, consequently, µ(Aα)=0 for η -almost every α B (0, 1). As φT is injective m ∈ m α on X˜ Aα and X˜ has full µ-measure, the proof of the claim is finished. \ Fix x X˜. To show η (A )=0, note that for y X˜, ∈ m x ∈ T T (4.2) φα (x) φα (y)= Dx,yα + wx,y − 15 for h(x) h(y) − h(T x) h(Ty) wx,y =  −.  . .   h(T k−1x) h(T k−1y)  −  Write Ax as   k A = Aorb Ap, x x ∪ x p=1 [ where Aorb = α B (0, 1) : φT (x)= φT (y) for some y X˜ Orb(x) x , x { ∈ m α α ∈ ∩ \{ }} Ap = α B (0, 1) : φT (x)= φT (y) for some y X x , p =1, . . . , k. x { ∈ m α α ∈ p \{ }} orb The set Ax is Borel as a countable union of closed sets of the form (4.3) α B (0, 1) : φT (x)= φT (y) , y X˜ Orb(x) x , { ∈ m α α } ∈ ∩ \{ } p while each set Ax is Borel as a section of the set (x, α) X˜ B (0, 1) : φT (x)= φT (y) for some y X x , { ∈ × m α α ∈ p \{ }} orb which is Borel by Lemma 2.4. To end the proof, it is enough to show that the sets Ax and p Ax, p =1,...,k, have ηm measure zero. orb To prove ηm(Ax )=0 it suffices to check that the sets of the form (4.3) have ηm measure zero. By (4.2), we have α B (0, 1) : φT (x)= φT (y) = α B (0, 1) : D α = w { ∈ m α α } { ∈ m x,y − x,y} and Lemma 4.5 gives rank Dx,y 1 whenever y = x, so each set of the form (4.3) is contained m ≥ 6 in an affine subspace of R of codimension at least 1. Consequently, it has ηm measure zero. Since T is locally Lipschitz, h, h1,...,hm are locally β-Hölder and X is separable, there exists a countable covering of X by open sets in RN , such that for every V , the map V ∈V T is Lipschitz on V and h, h1,...,hm are β-Hölder on V . Let be the collection of all sets −1 −(k−1) U of the form U = V T (V ) ... T (V − ), where V ,...,V − . Then is a 0 ∩ 1 ∩ ∩ k 1 0 k 1 ∈ V U countable covering of X by open sets, and we can write = U ∈N. By definition, for every U { j}j j N there exists C > 0 such that ∈ j T s+1(y) T s+1(y′) C T s(y) T s(y′) , k − k ≤ jk − k h(T s(y)) h(T s(y′)) C T s(y) T s(y′) β, k − k ≤ jk − k h (T s(y)) h (T s(y′)) C T s(y) T s(y′) β k r − r k ≤ jk − k for every y,y′ U X, s 0,...,k 1 , r 1,...,m . By induction, it follows that ∈ j ∩ ∈{ − } ∈{ } T s(y) T s(y′) Cs y y′ , k − k ≤ j k − k ′ ′ (4.4) h(T s(y)) h(T s(y )) Cβs+1 y y β, k − k ≤ j k − k h (T s(y)) h (T s(y′)) Cβs+1 y y′ β k r − r k ≤ j k − k ′ for y,y Uj X, s 0,...,k 1 , r 1,...,m . ∈ ∩ ∈{ − } ∈{ 16 } p To prove ηm(Ax)=0 for p = 1,...,k, we follow the strategy used in [SYC91] (see also [Rob11]). Fix n N and for j N define ∈ ∈ 1 Xp,n = y X : σ (D ) , x ∈ p p x,y ≥ n Ap,j,n = nα B (0, 1) : φT (x)=oφT (y) for some y U Xp,n x , x { ∈ m α α ∈ j ∩ x \{ }} where σp(Dx,y) is the p-th largest singular value. Note that singular values of given order depend continuously on the coefficients of the matrix, see e.g. [GVL13, Corollary 8.6.2]. p,n p,j,n Hence, the set Xx is σ-compact as a closed subset of Xp and by Lemma 2.4, the set Ax is Borel. By Lemma 4.5, for every y X Orb(x) we have rank D p. This implies σ (D ) > 0 ∈ p \ x,y ≥ p x,y (see e.g. [Rob11, Lemma 14.2]). Hence, ∞ ∞ Ap Aorb = Ap,j,n Aorb. x \ x x \ x j=1 n=1 [ [ Consequently, it is enough to prove η (Ap,j,n Aorb)=0 for every n N. m x \ x ∈ Fix ε> 0. Since βp(U Xp,n Orb(x)) βp(X )=0, there exists a collection of balls H j ∩ x \ ≤ H p B (y ,ε ), for y U Xp,n Orb(x) and 0 <ε <ε, i N, such that N i i i ∈ j ∩ x \ i ∈ ∞ (4.5) U Xp,n Orb(x) B (y ,ε ) and εβp ε. j ∩ x \ ⊂ N i i i ≤ ∈N i=1 i[ X Take α Ap,j,n Aorb and let y U Xp,n Orb(x) be such that φT (x)= φT (y). Then for ∈ x \ x ∈ j ∩ x \ α α y with y B(y ,ε ) we have i ∈ i i D α + w = φT (x) φT (y ) = φT (y) φT (y ) k x,yi x,yi k k α − α i k k α − α i k k−1 m 2 (4.6) h(T sy) h(T sy ) + α h (T sy) h (T sy ) ≤ v k − i k rk r − r i k u s=0 r=1 uX  X  Mt y y′ β M εβ ≤ jk − k ≤ j i for k−1 M = (1+ √m) C2(βs+1), j v j u s=0 uX by (4.4) and the fact α B (0, 1). By (4.6), t ∈ m Ap,j,n Aorb α B (0, 1) : D α + w M εβ . x \ x ⊂ { ∈ m k x,yi x,yi k ≤ j i } i∈N [ Since for every i N we have σ (D ) 1/n, we can apply Lemma 2.2 and (4.5) to obtain ∈ p x,yi ≥ ∞ M pεβp η (Ap,j,n Aorb) C j i C M pnpε. m x \ x ≤ m,k 1/np ≤ m,k j i=1 X Since ε > 0 was arbitrary, we conclude that η (Ap,j,n Aorb)=0, so in fact η (Ap,j,n)=0. m x \ x m x This ends the proof of Theorem 4.3.  17 Proof of Remark 4.4. Suppose that the measure µ is T -invariant. Then it is easy to check that the set ∞ −n X˜α = T (Xα) n=0 \ is a Borel subset of X of full µ-measure satisfying T (X˜ ) X˜ . Hence, to show (a), it α α ⊂ α suffices to replace the set Xα by X˜α. In the case when µ is additionally finite, we first remark that the measure µ is also forward invariant, i.e. µ(T (Y )) = µ(Y ) for Borel sets Y X. Note that if Y is Borel, then so is ⊂ T (Y ) as the image of a Borel set under a continuous and injective mapping (see e.g. [Kec95, Theorem 15.1]). Using this together with the invariance of µ and the injectivity of T , we check that −n X˜α = T (Xα) ∈Z n\ is a Borel subset of Xα of full µ-measure satisfying T (X˜α) = X˜α. This gives (b). Notice that the finiteness of µ is indeed necessary, as for X = N, T (x)= x +1 and µ the counting measure, there does not exist a set Y X of full µ-measure satisfying T (Y )= Y . ⊂ To show (c), suppose that µ is T -invariant and ergodic. Obviously, we can assume that the µ-measure of the set of all periodic points of T is positive (including + ). Then there exists ∞ p N such that the measure of the set P of all p-periodic points of T is positive (including ∈ + ). ∞ Suppose first that µ restricted to P is non-atomic. Then there exists a Borel set Y P ⊂ with 0 <µ(Y ) <µ(X)/p. Let Z = Y T −1(Y ) ... T −(p−1)(Y ). Then 0 <µ(Z) <µ(X) ∪ ∪ ∪ and, by the injectivity of T , we have T −1(Z)= Z, which contradicts the of µ. Suppose now that µ has an atom in P . Since µ is a Borel σ-finite measure in a Euclidean space, this is equivalent to the fact that µ( x ) > 0 for some x P . Let be the periodic { } ∈ O orbit of x. Again by the injectivity of T , we have T −1( )= , so by the ergodicity of µ, the O O set has full µ-measure. This means that µ is supported on a set of Hausdorff dimension 0, O which obviously gives (c).  The original Takens delay embedding theorem states that for given finite dimensional C2 manifold M and generic pair of C2-diffeomorphism T : M M and C2-function h: M R, → → the corresponding delay-coordinate map φ: M Rk, φ(x)=(h(x), h(T x),...,h(T k−1x)) is → a C2-embedding (an injective immersion) as long as k> 2 dim M. It was followed by the box- counting dimension version of Sauer, Yorke and Casdagli (Theorem 1.1) and subsequently by the infinite-dimensional result of [Rob05] (see also [Rob11, Section 14.3]). Refer to [NV18] for a version of Takens’ theorem with a fixed observable and perturbation performed on the dynamics. Takens’ theorem involving Lebesgue covering dimension on compact metric spaces and a continuous observable was proved in [Gut16] (see [GQS18] for a detailed proof). See also [Sta99, Cab00] for Takens theorem for deterministically driven smooth systems and [SBDH97, SBDH03] for stochastically driven smooth systems.

Example 4.6. It turns out that linear perturbations are not sufficient for Theorems 1.1 k−1 k−1 and 4.3, i.e. it may happen that φL = (φ(x)+ Lx,...,φ(T x)+ LT x) is not (almost 18 surely) injective for a generic linear map L: RN R. As an example, let X = B (0, 1), fix → 2 a (0, 1) and define T : X X as ∈ → T (x)= ax. Then T is a Lipschitz injective transformation on the unit disc X R2 with zero being the ⊂ unique . Fix φ 0. We claim that there is no linear observable L: R2 R ≡ → which makes the delay map injective, i.e. for every k N and every v R2 the transformation ∈ ∈ x φT (x)=( x, v , T x, v ,..., T k−1x, v ) Rk is not injective on X. This follows from 7→ v h i h i h i ∈ the fact that for each 1-dimensional linear subspace W R2 the set W X is T -invariant, ⊂ ∩ hence φT = 0 on an infinite set Ker( , v ) X. We have seen that φT is not injective for v h· i ∩ v any v R2. No we will see that it also not almost surely injecitve for µ being the Lebesgue ∈ measure on X. Note that for v R2 and c R, the segment W = z X : z, v = ∈ ∈ c { ∈ h i c satisfies T (Wc) Wac, hence all points on Wc will have the same observation vector } ⊂ k−1 2 k−1 ( x, v , T x, v ,..., T x, v )=(c, ac, a c,...,a c). Therefore, a set Xv X on which hT i h i h i ⊂ φv is injective can only have one point on each of the parallel segments Wc contained in X. However, such a set Xv cannot be of full Lebesgue measure. Note that the above example can be easily modified to make T a homeomorphism.

5. Examples In this section we present two examples which illustrate the usage of Theorem 3.1. Let us begin with fixing some notation. For x [0, 2) we will write ∈

x = x0.x1x2 ..., where x0.x1x2 ... is the binary expansion of x, i.e. ∞ x x = j , x , x , x ,... 0, 1 . 2j 0 1 2 ∈{ } j=0 X For a dyadic rational we agree to choose its eventually terminating expansion, i.e. the one with x =0 for j large enough. Let π : 0, 1 N [0, 1] be the coding map j { } → ∞ x π(x , x ,...)= j . 1 2 2j j=1 X 5.1. A modified Kan example. In the Appendix to [SYC91], Kan presented an example of a compact set K RN with dim K = 0 and such that every linear transformation ⊂ H L: RN RN−1 fails to be injective on K (see also Remark 3.8). It follows from Theorem 3.1, → that whenever we are given a Borel σ-finite measure µ on such a set, then almost every linear transformation L: RN R is injective on a set of full µ-measure. To illustrate this, we → construct a σ-compact set X R2 with dim X =0, which is a slight modification of Kan’s ⊂ H example, equipped with a natural Borel σ-finite measure µ, such that no linear transformation L: R2 R is injective on X, while for almost every L we explicitly show a set X X of → L ⊂ full µ-measure, such that L is injective on XL. 19 Following [SYC91, Appendix], we begin with constructing compact sets A, B [0, 1] such ⊂ that (5.1) dim A = dim A = dim B = dim B =0 (hence dim (A B)=0), H B H B H ∪ and (5.2) dim A = dim B =1, dim (A B)= dim (A B)=1. B B B ∪ B ∪ To this aim, let Mk, k 0, be an increasing sequence of positive integers such that M0 =1 ≥Mk+1 and Mk with lim = . Define ր∞ k→∞ Mk ∞ A = (x , x ,...) 0, 1 N : for every even k, x =0 for all j [M , M ) 1 2 ∈{ } j ∈ k k+1  or xj =1 for all j [Mk, Mk+1) , e ∈ N B = (x1, x2,...) 0, 1 : for every odd k, xj =0 for all j [Mk, Mk+1) ∈{ } ∈  or xj =1 for all j [Mk, Mk+1) , e ∈ and set A = π(A), B = π(B). It is a straightforward calculation to check that A and B satisfy (5.1) and (5.2) (see [SYC91, Appendix], [Fal14, Example 7.8] or [Rob11,e Section 6.1]).e Define X R2 as ⊂ X = 0 (A + n) 1 (B + n) . { } × ∈Z ∪ { } × ∈Z  n[   n[  By (5.1), we have dimH X = 0. The following two propositions describe the embedding properties of the set X. Proposition 5.1. No linear transformation L: R2 R is injective on X. → Proof. The map L has the form L(x, y)= αx + βy for α, β R. Obviously, we can assume ∈ β =0. Note that the points 6 u = (0, a + n), v = (1, b + m), for a A, b B, n, m Z ∈ ∈ ∈ are in X and (5.3) L(u)= L(v) if and only if b a = z, − where α z = + n m. −β − For given α and β, choose n, m Z such that z [0, 1). Consider the binary expansion ∈ ∈ z =0.z1z2 ... and define a =0.a a ... A, b =0.b b ... B 1 2 ∈ 1 2 ∈ setting

aj =0, bj = zj for j [Mk, Mk+1), if k is even, (5.4) ∈ aj =1 zj, bj =1 for j [Mk, Mk+1), if k is odd − 20 ∈ (if all b are equal to 1, we set b =1). Then z = b a and (5.3) implies that L is not injective j − on X.  Let us now define a natural Borel σ-finite measure µ on X, starting from a pair of proba- bility measures ν1, ν2 on A and B, respectively. Let ∞ ∞

e eν1 = pk, ν2 = qk, Ok=0 Ok=0 where p and q are probability measures on 0, 1 Mk+1−Mk given as k k { } ⊗ − 1 δ + 1 δ if k is even 1 δ + 1 δ (Mk+1 Mk) if k is even p = 2 (0,...,0) 2 (1,...,1) , q = 2 0 2 1 k 1 1 ⊗(Mk+1−Mk) k 1 1 ( 2 δ0 + 2 δ1 if k is odd (2 δ(0,...,0) + 2 δ(1,...,1) if k is odd and the symbol δa denotes the Dirac measure at a. Then supp ν1 = A, supp ν2 = B, hence defining

µ1 = π∗(ν1), µ2 = π∗(ν2), e e we obtain probability measures on A, B, respectively, with supp µ1 = A, supp µ2 = B. Finally, let

µ = δ0 (τn)∗µ1 + δ1 (τn)∗µ2, ∈Z ⊗ ∈Z ⊗ Xn Xn where τ : R R, τ (x) = x + n, n Z. Clearly, µ is a Borel σ-finite measure with n → n ∈ supp µ = X. For a A, b B let ∈ ∈ A = x A 1 : x + a = z .z z ... such that the sequence (z , z ,...) a ∈ \{ } 0 1 2 0 1 N  is constant on [Mk, Mk+1 1) for every odd k , − ∩ B = x B 1 : x + b = z .z z ... such that the sequence (z , z ,...) b ∈ \{ } 0 1 2 0 1 is constant on [M , M 1) N for every even k .  k k+1 − ∩ Lemma 5.2. For every a A, b B, we have µ (A )= µ (B )=0. ∈ ∈ 1 a 2 b Proof. Fix b = b .b b ... B. We will show µ (B ) = 0 (the fact µ (A ) = 0 can 0 1 2 ∈ 2 b 1 a be proved analogously). The proof proceeds by showing that for each even k, the vector (x ,...,x − ), where x = x .x x ... B , can assume at most four values. This will Mk Mk+1 2 0 1 2 ∈ b imply µ (B ) 8 2−(Mk+1−Mk) for each even k and, consequently, µ (B )=0. To show the 2 b ≤ · 2 b assertion, fix an even k and let ∞ x + b ξ = j j . 2j j=M −1 Xk+1 Note that ξ < 2−(Mk+1−3) (as ξ < 2 and we exclude expansions with digits eventually equal to 1). Hence, ξ = ξ .ξ ξ ... with ξ =0 for j M 3. Note that, since b is fixed, the values 0 1 2 j ≤ k+1 − of ξ − 0, 1 and (x +b ,...,x − +b − ) (0,..., 0), (1,..., 1) determine Mk+1 2 ∈{ } Mk Mk Mk+1 2 Mk+1 2 ∈{ } uniquely the value of (xMk ,...,xMk+1−2). Therefore, (xMk ,...,xMk+1−2) can assume at most four values.  21 Now for Lebesgue almost every linear transformation L: R2 R we will construct a set → X X of full µ-measure, such that L is injective on X . As previously, write L(x, y) = L ⊂ L αx + βy for α, β R. Neglecting a set of zero Lebesgue measure, we can assume β =0. Let ∈ 6 l Z be such that ∈ α (5.5) z = + l belongs to [0, 1). −β Similarly as in (5.4), we can write

(5.6) z = a′ b′, z 1= a′′ b′′ for some a′, a′′ A, b′, b′′ B. − − − ∈ ∈ Let

X = 0 (A + n) 1 (B (B ′ B ′′ 1 ))+ n . L { } × ∪ { } × \ b ∪ b ∪{ } n∈Z n∈Z  [   [  Then X X and Lemma 5.2 implies that X has full µ-measure. L ⊂ L Proposition 5.3. For every α R, β R 0 , the linear transformation L: R2 R, ∈ ∈ \{ } → L(x, y)= αx + βy, is injective on XL.

For the proof of the proposition we will need the following simple lemma. The proof is left to the reader.

Lemma 5.4. Let x = x .x x ... [0, 1], y = y .y y ... [0, 1], M, N N, M < N 1, 0 1 2 ∈ 0 1 2 ∈ ∈ − be such that x + y < 2 and sequences (xM ,...,xN ) and (yM ,...,yN ) are constant. Then x + y = z0.z1z2 ..., where the sequence (zM ,...,zN−1) is constant.

Proof of Proposition 5.3. Assume, on the contrary, that there exist points u, v X such ∈ L that L(u) = L(v). As β = 0, we cannot have u, v 0 R or u, v 1 R. Hence, 6 ∈ { } × ∈ { } × we can assume u 0 R, v 1 R. Then, following the previous notation, we have ∈ { } × ∈ { } × u = (0, a + n), v = (1, b + m) for a A, b B (B ′ B ′′ 1 ), n, m Z. Note that ∈ ∈ \ b ∪ b ∪{ } ∈ b a [ 1, 1), so by (5.3), we have − ∈ − b a = z or b a = z 1, − − − for z from (5.5), and (5.6) implies

b a = a′ b′ or b a = a′′ b′′. − − − − Hence, a + a′ = b + b′ or a + a′′ = b + b′′.

This is a contradiction, as Lemma 5.4 implies that the binary expansion sequences of a + a′ ′′ and a + a are constant on [Mk, Mk+1 1) N for every even k, while by the condition − ∩ ′ ′′ b B (B ′ B ′′ 1 ), the binary expansion sequences of b + b and b + b are not constant ∈ \ b ∪ b ∪{ } on [Mk, Mk+1 1) N for some even k.  − ∩ 22 5.2. Measure with dimH µ < dim MB µ. To show that Theorem 3.1 is an actual strength- ening of Theorem 3.6, we present an example of a measure µ, for which dimH µ< dimMB µ. More precisely, we show the following.

2 Theorem 5.5. There exists a Borel probability measure µ on [0, 1] , such that dimH µ = 1 and dimMB µ =2.

To begin the construction of µ, fix an increasing sequence of positive integers Nk, k k ∈ Sk 1 N, such that Nk with , where Sk = Nj. Consider the probability Sk+1 k+1 ր ∞ ≤ j=1 distributions p , p on 0, 1 given by P 0 1 { } 1 p ( 0 )=0, p ( 1 )=1, p ( 0 )= p ( 1 )= . 0 { } 0 { } 1 { } 1 { } 2

For y = 0.y1y2 ... [0, 1] (in this subsection we assume that the binary expansion of 1 is ∈ N 0.111 ...), define the probability measure ν on 0, 1 as the infinite product y { } ∞ Nj

νy = pyj . j=1 i=1 O O Further, let µy be the Borel probability measure on [0, 1] given by

µy = π∗νy. Finally, let µ be the Borel probability measure on [0, 1]2 defined as

µ = µ d Leb(y), i.e. µ(A)= µ (Ay)d Leb(y) for a Borel set A [0, 1]2, y y ⊂ [0Z,1] [0Z,1] where Ay = x [0, 1]: (x, y) A . It is easy to see that µ is well-defined, as the function { ∈ ∈ } y µ (Ay) is measurable for every Borel set A [0, 1]2. 7→ y ⊂ The proof of Theorem 5.5 is based on the analysis of the local dimension of µ, defined in terms of dyadic squares (rather then balls). For n N and x ,...,x 0, 1 let [x ,...,x ] ∈ 1 n ∈{ } 1 n denote the dyadic interval corresponding to the sequence (x1,...,xn), i.e. n n n xj xj 1 xj 1 2j , 2j + 2n if 2j + 2n < 1 [x ,...,x ]= j=1 j=1 j=1 1 n h  1P 1 ,P1 otherwise.P  − 2n N 2 Under this notation, for n and (x, y) [0, 1] let Dn(x, y) be the dyadic square of ∈ ∈ sidelength 2−n containing (x, y), i.e. D (x, y) = [x ,...,x ] [y ,...,y ], where x =0.x x ... and y =0.y y .... n 1 n × 1 n 1 2 1 2 Recall that the box-dimensions can be defined equivalently in terms of dyadic squares. Pre- ′ −n −n cisely, let N (X, 2 ) be the number of dyadic squares Dn(x, y) of sidelength 2 intersecting X. Then (see e.g. [Fal14, Section 2.1]) log N ′(X, 2−n) log N ′(X, 2−n) (5.7) dimB (X) = lim inf and dimB (X) = lim sup . n→∞ n log 2 n→∞ n log 2 23 For a Borel finite measure µ on [0, 1]2 and (x, y) [0, 1]2 define the lower and upper local ∈ dimension of µ at (x, y) as log µ(D (x, y)) log µ(D (x, y)) d(µ, (x, y)) = lim inf − n , d(µ, (x, y)) = lim sup − n . n→∞ n log 2 n→∞ n log 2 It is well-known (see e.g. [Hoc14, Propositions 3.10 and 3.20]) that

(5.8) dimH µ = ess sup d(µ, (x, y)). (x,y)∼µ The following lemma gives estimates on the measure of dyadic squares at suitable scales.

Lemma 5.6. Let x =0.x x ..., [0, 1], y = 0.y y ... [0, 1], n N and D = D (x, y)= 1 2 ∈ 1 2 ∈ ∈ n [x ,...,x ] [y ,...,y ]. Let k N be such that S < n S . Then the following hold. 1 n × 1 n ∈ k ≤ k+1 −(2− 1 )n (a) If yk = yk+1 =1, then µ(D) 2 k . ≤ −(1+ 1 )n (b) If n = S and y =0, then either µ(D)=0 or µ(D) 2 k+1 . k+1 k+1 ≥ Proof. Note that for y′ =0.y′ y′ ... [0, 1] such that (y′ ,...,y′ )=(y ,...,y ) we have 1 2 ∈ 1 n 1 n y′ ′ ′ ′ ′ ′ ′ µy (D )= µy ([x1,...,xn]) = py ( x1 ) py ( xS1 )py ( xS1+1 ) py ( xS2 ) (5.9) 1 { } ··· 1 { } 2 { } ··· 2 { } p ′ ( xS +1 ) p ′ ( xn ). ··· yk+1 { k } ··· yk+1 { } y′ Moreover, as k < n, the value of µy′ (D ) depends only on (y1,...,yn) and (x1,...,xn). Using (5.9), we can prove both assertions of the lemma, as follows. Ad (a). 1 If y = y =1, then for j S − +1,...,n we have p (x )= , where l k,k +1 k k+1 ∈{ k 1 } yl j 2 ∈{ } is such that Sl−1 < j Sl. Therefore, in the product (5.9) there is at least n Sk−1 terms 1 ≤ − equal to 2 . Consequently, S ′ Sk−1 k−1 1 y −(n−S − ) −(1− )n −(1− )n −(1− )n µ ′ (D ) 2 k 1 =2 n 2 Sk 2 k , y ≤ ≤ ≤ hence y′ ′ −n(1− 1 ) −n(2− 1 ) µ(D)= µ ′ (D )d Leb(y ) Leb([y ,...,y ])2 k =2 k . y ≤ 1 n [y1,...,yZ n] Ad (b). Assume that µ(D) =0. Then all the terms in (5.9) have to be non-zero, so every term is 1 6 equal to either 2 or 1. Moreover, as yk+1 =0 and n = Sk+1, we have p ( x ) p ( x )=1 yk+1 { Sk+1} ··· yk+1 { n} and, consequently, µ(D)=2−np ( x ) p ( x )p ( x ) p ( x ) y1 { 1} ··· y1 { S1 } y2 { S1+1} ··· y2 { S2 } Sk − − −(1+ )n −(1+ 1 )n p ( x ) p ( x ) 2 n Sk =2 Sk+1 2 k+1 . ··· yk { Sk−1+1} ··· yk { Sk } ≥ ≥ 

Now we are ready to give the proof of Theorem 5.5. 24 Proof of Theorem 5.5. We begin by proving dim µ =1. Note that dim µ 1, as µ projects H H ≥ under [0, 1]2 (x, y) y [0, 1] to the Lebesgue measure, so it is sufficient to show dim µ ∋ 7→ ∈ H ≤ 1. By (5.8), it is enough to prove that d(µ, (x, y)) 1 for µ-almost every (x, y) [0, 1]. Note ≤ ∈ that for Lebesgue almost every y = 0.y y ... [0, 1], the sequence (y ,y ,...) contains 1 2 ∈ 1 2 infinitely many zeros. Hence, it is sufficient to show d(µ, (x, y)) 1 for µ -almost every ≤ y x [0, 1], assuming that y [0, 1] has this property. Moreover, for µ -almost every x [0, 1], ∈ ∈ y ∈ we have µ(D (x, y)) > 0 for all n N (see (5.9)). For such x, by Lemma 5.6(b), we have n ∈ 1 log µ(DS (x, y)) (1 + )Sn d(µ, (x, y)) lim inf − nk lim nk k =1. k→∞ k→∞ ≤ Snk log 2 ≤ Snk

Therefore, dimH µ 1, so in fact dimH µ =1. ≤ 2 Let us prove now dimMB µ =2. Since µ is supported on [0, 1] , it suffices to show dimMB µ ≥ 2. Let A [0, 1]2 be a Borel set with µ(A) > 0. We show dim A 2. Note that there ⊂ B ≥ exists c> 0 such that the set (5.10) B = y [0, 1] : µ (Ay) c { ∈ y ≥ } satisfies Leb(B) > 0. Fix ε (0, 1 ). By the Lebesgue density theorem (see e.g. [Hoc14, ∈ 4 Corollary 3.16]), there exists a dyadic interval I [0, 1] such that ⊂ Leb(B I) (5.11) ∩ 1 ε, I ≥ − | | where I =2−N is the length of I. Fix k N +2 and n S +1,...,S . Consider the | | ≥ ∈{ k k+1} collection of dyadic intervals of length 2−n defined as Cn = [y ,...,y ]: y = y =1 and [y ,...,y ] B I = . Cn { 1 n k k+1 1 n ∩ ∩ 6 ∅} By (5.11), we have 1 (5.12) Leb B ε 2−N . ∩ Cn ≥ 4 − Let  [    A = A [0, 1] B . n ∩ × ∩ Cn Then A A and (5.10) together with (5.12) imply [  n ⊂ 1 (5.13) µ(A )= µ (Ay)d Leb(y) c ε 2−N . n y ≥ 4 − B∩ZS Cn   ′ −n Note that the above lower bound does not depend on k and n. Let N (An, 2 ) be the number of dyadic squares of sidelength 2−n intersecting A . If D = I I is a dyadic square n 1 × 2 of sidelength 2−n intersecting A , then I , hence by Lemma 5.6(a) we have n 2 ∈ Cn −(2− 1 )n µ(D) 2 k . ≤ As any two dyadic squares of the same sidelength are either equal or disjoint, (5.13) gives

′ −n ′ −n 1 −N+(2− 1 )n N (A, 2 ) N (A , 2 ) c ε 2 k . ≥ n ≥ 4 − 25   Since k and n can be taken arbitrary large, invoking (5.7) gives dim A 2. Hence, B ≥ dimMB µ 2, so in fact dimMB µ =2.  ≥ Remark 5.7. Note that as

ess sup d(µ, z) = dimH µ dimMB µ dimMB µ = dimP µ = ess sup d(µ, z) z∼µ ≤ ≤ z∼µ

(dimP denotes the packing dimension, see e.g. [Fal14, Proposition 3.9] and [Fal97, Proposition 10.3]), the equality dimH µ = dimMB µ holds for all exact dimensional measures µ, i.e. the measures µ with d(µ, z)= d(µ, z) = const for µ-almost every z.

References

[ABD+19] G. Alberti, H. Bölcskei, C. De Lellis, G. Koliander, and E. Riegler. Lossless analog compression. IEEE Transactions on Information Theory, 65(11):7480–7513, 2019. [BAEFN93] Asher Ben-Artzi, Alp Eden, Ciprian Foias, and Basil Nicolaenko. Hölder continuity for the inverse of Mañé’s projection. J. Math. Anal. Appl., 178(1):22–29, 1993. [Ban51] Stefan Banach. Wstęp do teorii funkcji rzeczywistych (Polish) [Introduction to the theory of real functions]. Monografie Matematyczne. Tom XVII. Polskie Towarzystwo Matematyczne, Warszawa-Wrocław, 1951. [Bil99] Patrick Billingsley. Convergence of probability measures. Wiley Series in Probability and Statis- tics: Probability and Statistics. John Wiley & Sons, Inc., New York, second edition, 1999. A Wiley-Interscience Publication. [Cab00] Victoria Caballero. On an embedding theorem. Acta Math. Hungar., 88(4):269–278, 2000. [EFNT94] Alp Eden, Ciprian Foias, Basil Nicolaenko, and Roger M. Temam. Exponential attractors for dissipative evolution equations, volume 37 of RAM: Research in Applied Mathematics. Masson, Paris; John Wiley & Sons, Ltd., Chichester, 1994. [ER85] Jean-Pierre Eckmann and . of chaos and strange attractors. Rev. Modern Phys., 57(3, part 1):617–656, 1985. [Fal97] Kenneth Falconer. Techniques in fractal geometry. John Wiley & Sons, Ltd., Chichester, 1997. [Fal14] Kenneth Falconer. Fractal geometry. John Wiley & Sons, Ltd., Chichester, third edition, 2014. Mathematical foundations and applications. [FR02] Peter K. Friz and James C. Robinson. Constructing an elementary measure on a space of pro- jections. J. Math. Anal. Appl., 267(2):714–725, 2002. [GQS18] Yonatan Gutman, Yixiao Qiao, and Gábor Szabó. The embedding problem in topological dy- namics and Takens’ theorem. Nonlinearity, 31(2):597–620, 2018. [GS00] Mariano Gasca and Thomas Sauer. Polynomial interpolation in several variables. Adv. Comput. Math., 12(4):377–410, 2000. Multivariate polynomial interpolation. [Gut16] Yonatan Gutman. Taken’s embedding theorem with a continuous observable. In Ergodic theory, pages 134–141. De Gruyter, Berlin, 2016. [GVL13] Gene H. Golub and Charles F. Van Loan. Matrix computations. Johns Hopkins Studies in the Mathematical Sciences. Johns Hopkins University Press, Baltimore, MD, fourth edition, 2013. [HBS15] Franz Hamilton, Tyrus Berry, and Timothy Sauer. Predicting chaotic time series with a partial model. Phys. Rev. E, 92:010902, Jul 2015. [HGLS05] Chih-Hao Hsieh, Sarah M. Glaser, Andrew J. Lucas, and George Sugihara. Distinguishing ran- dom environmental fluctuations from ecological catastrophes for the North Pacific Ocean. Nature, 435(7040):336–340, 2005. [HK99] Brian R. Hunt and Vadim Yu. Kaloshin. Regularity of embeddings of infinite-dimensional fractal sets into finite-dimensional spaces. Nonlinearity, 12(5):1263–1275, 1999. 26 [Hoc14] Michael Hochman. Lectures on dynamics, fractal geometry, and metric number theory. J. Mod. Dyn., 8(3-4):437–497, 2014. [HW41] Witold Hurewicz and Henry Wallman. Dimension Theory. Princeton Mathematical Series, v. 4. Princeton University Press, Princeton, N. J., 1941. [Kec95] Alexander S. Kechris. Classical descriptive set theory, volume 156 of Graduate Texts in Mathe- matics. Springer-Verlag, New York, 1995. [KY90] Eric J. Kostelich and James A. Yorke. Noise reduction: finding the simplest dynamical system consistent with the data. Phys. D, 41(2):183–196, 1990. [Mar54] John M. Marstrand. Some fundamental geometrical properties of plane sets of fractional dimen- sions. Proc. London Math. Soc. (3), 4:257–302, 1954. [Mat75] Pertti Mattila. Hausdorff dimension, orthogonal projections and intersections with planes. Ann. Acad. Sci. Fenn. Ser. A I Math., 1(2):227–244, 1975. [Mat95] Pertti Mattila. Geometry of sets and measures in Euclidean spaces, volume 44 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 1995. Fractals and rectifiability. [MGNS18] Stephan B. Munch, Alfredo Giron-Nava, and George Sugihara. Nonlinear dynamics and noise in fisheries recruitment: A global meta-analysis. Fish and Fisheries, 19(6):964–973, 2018. [Min70] George J. Minty. On the extension of Lipschitz, Lipschitz-Hölder continuous, and monotone functions. Bulletin of the American Mathematical Society, 76(2):334–339, 1970. [Mn81] Ricardo Mañé. On the dimension of the compact invariant sets of certain nonlinear maps. In Dynamical systems and turbulence, Warwick 1980 (Coventry, 1979/1980), volume 898 of Lecture Notes in Math., pages 230–242. Springer, Berlin-New York, 1981. [NV18] Raymundo Navarrete and Divakar Viswanath. Prevalence of delay embeddings with a fixed observation function. Preprint. https://arxiv.org/abs/1806.07529, 2018. [PCFS80] Norman H. Packard, James P. Crutchfield, J. Doyne Farmer, and Robert S. Shaw. Geometry from a time series. Phys. Rev. Lett., 45:712–716, Sep 1980. [PdM82] , Jr. and Welington de Melo. Geometric theory of dynamical systems. Springer-Verlag, New York-Berlin, 1982. An introduction, Translated from the Portuguese by A. K. Manning. [Rob05] James C. Robinson. A topological delay embedding theorem for infinite-dimensional dynamical systems. Nonlinearity, 18(5):2135–2143, 2005. [Rob11] James C. Robinson. Dimensions, embeddings, and attractors, volume 186 of Cambridge Tracts in Mathematics. Cambridge University Press, Cambridge, 2011. [Rud87] Walter Rudin. Real and complex analysis. McGraw-Hill Book Co., New York, third edition, 1987. [SBDH97] Jaroslav Stark, David S. Broomhead, Michael Evan Davies, and Jeremy P. Huke. Takens embed- ding theorems for forced and stochastic systems. In Proceedings of the Second World Congress of Nonlinear Analysts, Part 8 (Athens, 1996), volume 30, pages 5303–5314, 1997. [SBDH03] Jaroslav Stark, David S. Broomhead, Michael Evan Davies, and Jeremy P. Huke. Delay embed- dings for forced systems. II. Stochastic forcing. J. Nonlinear Sci., 13(6):519–577, 2003. [SGM90] George Sugihara, Bryan Grenfell, and Robert May. Distinguishing error from chaos in eco- logical time-series. Philosophical Transactions of the Royal Society B-Biological Sciences, 330(1257):235–251, 1990. [SM90] George Sugihara and Robert May. Nonlinear forecasting as a way of distinguishing chaos from measurement error in time series. Nature, 344(6268):734–741, 1990. [Sta99] Jaroslav Stark. Delay embeddings for forced systems. I. Deterministic forcing. J. Nonlinear Sci., 9(3):255–332, 1999. [SY97] Timothy D. Sauer and James A. Yorke. Are the dimensions of a set and its image equal under typical smooth functions? Ergodic Theory Dynam. Systems, 17(4):941–956, 1997. [SYC91] Timothy D. Sauer, James A. Yorke, and Martin Casdagli. Embedology. J. Statist. Phys., 65(3- 4):579–616, 1991.

27 [Tak81] Floris Takens. Detecting strange attractors in turbulence. In Dynamical systems and turbulence, Warwick 1980, volume 898 of Lecture Notes in Math., pages 366–381. Springer, Berlin-New York, 1981. [Vos03] Henning U. Voss. Synchronization of reconstructed dynamical systems. Chaos, 13(1):327–334, 2003. [Whi36] Hassler Whitney. Differentiable manifolds. Ann. of Math. (2), 37(3):645–680, 1936. [Yor69] James A. Yorke. Periods of periodic solutions and the Lipschitz constant. Proc. Amer. Math. Soc., 22:509–512, 1969.

28