Copulas with continuous, strictly increasing singular conditional distribution functions

Wolfgang Trutschniga,∗, Juan Fern´andezS´anchezb

aDepartment of , University of Salzburg, Hellbrunner Strasse 34, 5020 Salzburg, Austria, Tel.: +43 662 8044-5312, Fax: +43 662 8044-137 bGrupo de Investigaci´onde An´alisisMatem´atico, Universidad de Almer´ıa,La Ca˜nadade San Urbano, Almer´ıa,Spain

Abstract Using Iterated Function Systems induced by so-called modifiable transfor- mation matrices T and tools from Symbolic Dynamical Systems we first con- ⋆ struct mutually singular copulas AT with identical (possibly fractal or full) support that are at the same time singular with respect to the Lebesgue mea- 2 sure λ2 on [0, 1] . Afterwards the established results are utilized for a simple ⋆ proof of the existence of singular copulas AT with full support for which ⋆ AT all conditional distribution functions y 7→ Fx (y) are continuous, strictly increasing and have zero λ-almost everywhere. This result under- lines the fact that conditional distribution functions of copulas may exhibit surprisingly irregular analytic behavior. Finally, we extend the notion of em- pirical copula to the case of non i.i.d. data and prove uniform convergence ′ of the empirical copula En corresponding to almost all orbits of a Markov ⋆ process usually referred to as chaos game to the singular copula AT . Several examples and graphics illustrate both the chosen approach and the main results. XXX Keywords: Copula, Doubly stochastic , Singular function, Markov kernel, Symbolic

2010 MSC: 62H20, 60E05, 28A80, 26A30

∗Corresponding author Email addresses: [email protected] (Wolfgang Trutschnig), [email protected] (Juan Fern´andezS´anchez)

Preprint submitted to Journal of and ApplicationsSeptember 2, 2013 1 1. Introduction

2 The construction of copulas with fractal support via Iterated Function

3 Systems (IFSs) induced by so-called transformation matrices goes back to

4 Fredricks et al. in [17]. Among other things the authors proved the existence 5 of families (Ar)r∈(0,1/2) of two-dimensional copulas fulfilling that for every 6 s ∈ (1, 2) there exists rs ∈ (0, 1/2) such that the Hausdorff dimension of the

7 support Zrs of Ars is s. Using the fact that the same IFS-construction also 8 converges with respect to the strong metric D1 (a metrization of the strong 9 operator topology of the corresponding Markov operators, see [30]) on the

10 space C of two-dimensional copulas Trutschnig and Fern´andez-S´anchez [31]

11 showed that the same result holds for the subclass of idempotent copulas.

12 Thereby idempotent means idempotent with respect to the star-product in-

13 troduced by Darsow et al. in [9], i.e. A ∈ C is idempotent if A ∗ A = A. 14 Families (Ar)r∈(0,1/2) of copulas with fractal support were also studied by 15 de Amo et al. in [1] and in [2]. In the latter paper, using techniques from

16 and Ergodic Theory, the authors discussed properties of subsets

17 of the corresponding fractal supports and constructed mutually singular cop-

18 ulas having the same fractal set as support. Moments of these copulas were

19 calculated in [4]; some surprising properties of homeomorphisms between

20 fractal supports of copulas were studied in [5].

21 In the current paper we first generalize some results concerning the construc-

22 tion of mutually singular copulas with identical (fractal or full) support by

23 a different method of proof than the one chosen in [2]. In particular we

24 show that for each so-called modifiable transformation matrix T with cor- ⋆ ⋆ 25 responding invariant copula AT and attractor ZT we can find (uncountable ⋆ ⋆ 26 many) copulas B having the same support ZT but being singular w.r.t. AT . 27 Afterwards in Section 4 we focus on transformation matrices T having non-

28 zero entries (hence being modifiable) and the corresponding singular copulas ⋆ 2 29 AT with full support [0, 1] and study singularity properties of their con- ⋆ AT ⋆ 30 ditional distribution functions y 7→ F (y) = K ⋆ (x, [0, y]) of A . Using x AT T 31 the one-to-one correspondence between copulas and Markov kernels hav-

32 ing the Lebesgue measure λ on [0, 1] as fixed point and the fact that the

33 IFS construction can easily be expressed as operation on the corresponding

34 Markov kernels we prove that (λ-almost) all conditional distribution func- ⋆ AT ⋆ 35 tions y 7→ F (y) = K ⋆ (x, [0, y]) of A are continuous, strictly increasing, x AT T 36 and have derivative zero λ-almost everywhere. In other words, we prove the ⋆ 37 existence of copulas AT for which all conditional distribution functions are

2 38 continuous, strictly increasing and singular in the sense of [16, 26] as well as

39 [19] (pp. 278-282). Note that this complements some results in [12] and [13]

40 since the singular copulas with full support considered therein have discrete

41 conditional distributions. For a general study of the interrelation between

42 2-increasingness and differential properties of copulas we refer to [18].

43 Finally, in Section 5 we first extend the notion of empirical copulas to non ˆ 44 i.i.d. data and then consider sequences of empirical copulas (En(k))n∈N in- 45 duced by orbits (Yn(k))n∈N of the so-called chaos game (a Markov process 46 induced by transformation matrices T , see [15, 24]). We prove that, with ˆ ⋆ 47 probability one, (En(k))n∈N converges uniformly to the copula AT . Several 48 examples and graphics illustrate the main results.

49 2. Notation and preliminaries

50 For every metric space (Ω, ρ) the family of all non-empty compact sets is

51 denoted by K(Ω), the Borel σ-field by B(Ω) and the family of all probability 52 measures on B(Ω) by P(Ω). We will call two probability measures µ1, µ2 on 53 B(Ω) singular with respect to each other (and will write µ1 ⊥ µ2) if there 54 exist disjoint Borel sets E,F ∈ B(Ω) with µ1(E) = 1 = µ2(F ). λ and 2 55 λ2 will denote the Lebesgue measure on B([0, 1]) and B([0, 1] ) respectively. 56 For every set E the cardinality of E will be denoted by #E. C will denote

57 the family of all two-dimensional copulas, see [11, 25, 29], Π will denote

58 the product copula. d∞ will denote the uniform distance on C; it is well 59 known that (C, d∞) is a compact metric space. For every A ∈ C µA will 60 denote the corresponding doubly stochastic measure defined by µA([0, x] × 61 [0, y]) := A(x, y) for all x, y ∈ [0, 1], PC the class of all these doubly stochastic

62 measures. A Markov kernel from R to B(R) is a mapping K : R × B(R) →

63 [0, 1] such that x 7→ K(x, B) is measurable for every fixed B ∈ B(R) and

64 B 7→ K(x, B) is a probability measure for every fixed x ∈ R. Suppose

65 that X,Y are real-valued random variables on a probability space (Ω, A, P),

66 then a Markov kernel K : R × B(R) → [0, 1] is called a regular conditional

67 distribution of Y given X if for every B ∈ B(R)

K(X(ω),B) = E(1B ◦ Y |X)(ω) (1)

68 holds P-a.e. It is well known that for each pair (X,Y ) of real-valued random

69 variables a regular conditional distribution K(·, ·) of Y given X exists, that X X 70 K(·, ·) is unique P -a.s. (i.e. unique for P -almost all x ∈ R) and that

3 X⊗Y 71 K(·, ·) only depends on P . Hence, given A ∈ C we will denote (a version 72 of) the regular conditional distribution of Y given X by KA(·, ·) and refer to 73 KA(·, ·) simply as regular conditional distribution of A or as Markov kernel 74 of A. Note that for every A ∈ C, its conditional regular distribution KA(·, ·), 2 75 and every Borel set G ∈ B([0, 1] ) we have (Gx := {y ∈ [0, 1] : (x, y) ∈ G} 76 denoting the x-section of G for every x ∈ [0, 1]) ∫

KA(x, Gx) dλ(x) = µA(G), (2) [0,1]

77 so in particular ∫

KA(x, F ) dλ(x) = λ(F ) (3) [0,1]

78 for every F ∈ B([0, 1]). On the other hand, every Markov kernel K : [0, 1] × 2 79 B([0, 1]) → [0, 1] fulfilling (3) induces a unique element µ ∈ PC([0, 1] ) via ∈ C ∈ 7→ A 80 (2). For every A and x [0, 1] the function y Fx (y) := KA(x, [0, y]) 81 will be called conditional distribution function of A at x. For more details

82 and properties of conditional expectation, regular conditional distributions,

83 and disintegration see [21, 22].

84 Expressing copulas in terms of their corresponding regular conditional dis- 85 tributions a metric D1 on C can be defined as follows: ∫ ∫

D1(A, B) := KA(x, [0, y]) − KB(x, [0, y]) dλ(x) dλ(y) (4) [0,1] [0,1]

86 It can be shown that (C,D1) is a complete and separable metric space and 87 that the topology induced by D1 is strictly finer than the one induced by d∞ 88 (for an interpretation and various properties of D1 see [30]). 89 Before sketching the construction of copulas with fractal support via so-

90 called transformation matrices we recall the definition of an Iterated Function

91 System (IFS for short) and some main results about IFSs (for more details

92 see [6, 14, 24]). Suppose for the following that (Ω, ρ) is a compact metric 93 space and let δH denote the Hausdorff metric on K(Ω). A mapping w : 94 Ω → Ω is called a contraction if there exists a constant L < 1 such that ≤ ∈ N ≥ 95 ρ(w(x), w(y)) Lρ(x, y) holds for all x, y Ω. A family (wl)l=1 of N 2 96 contractions on Ω is called Iterated Function System and will be denoted { N } N ∈ N 97 ∑by Ω, (wl)l=1 . An IFS together with a vector (pl)l=1 (0, 1] fulfilling N 98 l=1 pl = 1 is called Iterated Function System with (IFSP for

4 { N N } 99 short). We will denote IFSPs by Ω, (wl)l=1, (pl)l=1 . Every IFSP induces 100 the so-called Hutchinson operator H : K(Ω) → K(Ω), defined by

∪N H(Z) := wl(Z). (5) l=1 It can be shown (see [6, 24]) that H is a contraction on the compact metric space (K(Ω), δH ), so Banach’s Fixed Point theorem implies the existence of a unique, globally attractive fixed point Z⋆ ∈ K(Ω) of H. Hence, for every R ∈ K(Ω), we have ( ) n ⋆ lim δH H (R),Z = 0. n→∞ ⋆ 101 The attractor Z will be called self-similar if all contractions in the IFS are { N } 102 similarities. An IFS Ω, (wl)l=1 is called totally disconnected (or disjoint) if ⋆ ⋆ ⋆ { N } 103 the sets w1(Z ), w2(Z ), . . . , wN (Z ) are pairwise disjoint. Ω, (wl)l=1 will 104 be called just touching if it is not totally disconnected but there exists a 105 non-empty open set U ⊆ Ω such that w1(U), w2(U), . . . , wN (U) are pairwise 106 disjoint. Additionally to the operator H every IFSP also induces a (Markov)

107 operator V : P(Ω) → P(Ω), defined by

∑N wi V(µ) := pi µ . (6) i=1

108 The so-called Hutchinson metric h (sometimes also called Kantorovich or

109 Wasserstein metric) on P(Ω) is defined by { ∫ ∫ }

h(µ, ν) := sup f dµ − f dν : f ∈ Lip1(Ω, R) . (7) Ω Ω

Hereby Lip1(Ω, R) is the class of all non-expanding functions f :Ω → R, i.e. functions fulfilling |f(x) − f(y)| ≤ ρ(x, y) for all x, y ∈ Ω. It is not difficult to show that V is a contraction on (P(Ω), h), that h is a metrization of the topology of weak convergence on P(Ω) and that (P(Ω), h) is a compact metric space (see [6, 10]). Consequently, again by Banach’s Fixed Point theorem, it follows that there is a unique, globally attractive fixed point µ⋆ ∈ P(Ω) of V, i.e. for every ν ∈ P(Ω) we have ( ) lim h Vn(ν), µ⋆ = 0. n→∞

5 ⋆ ⋆ 110 µ will be called invariant measure - it is well known that the support of µ ⋆ ⋆ ⋆ 111 is exactly the attractor Z . The measure µ will be called self-similar if Z

112 is self-similar, i.e. if all contractions in the IFSP are similarities. Attractors of IFSs are strongly interrelated with symbolic dynamics via the so-called address map (see [6, 24]): For every N ∈ N the code space of N symbols will be denoted by ΣN , i.e. { } N ΣN := {1, 2,...,N} = (ki)i∈N : 1 ≤ ki ≤ N ∀i ∈ N .

Bold symbols will denote elements of ΣN . σ will denote the (left-) shift operator on ΣN , i.e. σ((k1, k2,...)) = (k2, k3,...). Define a metric ρ on ΣN by setting { 0 if k = l ρ(k, l) := − { ̸ } 21 min i:ki=li if k ≠ l,

113 then it is straightforward to verify that (ΣN , ρ) is a compact ultrametric 114 space and that ρ is a metrization of the product topology. Suppose now that { N } ⋆ ∈ 115 Ω, (wl)l=1 is an IFS with attractor Z , fix an arbitrary x Ω and define 116 the address map G :ΣN → Ω by

G(k) := lim wk ◦ wk ◦ · · · ◦ wk (x), (8) m→∞ 1 2 m

117 then (see [24]) G(k) is independent of x, G :ΣN → Ω is Lipschitz continuous ⋆ 118 and G(ΣN ) = Z . Furthermore G is injective (and hence a homeomorphism) ⋆ 119 if and only if the IFS is totally disconnected. Given z ∈ Z every element −1 120 of the preimage G ({z}) will be called address of z. Considering an IFSP { N N } ⋆ ⋆ 121 Ω, (wl)l=1, (pl)l=1 with attractor Z and invariant measure µ we can define 122 a probability measure P on B(ΣN ) by setting ({ }) ∏m ∈ P k ΣN : k1 = i1, k2 = i2, . . . , km = im = pij (9) j=1

⋆ 123 and extending in the standard way to full B(ΣN ). According to [24] µ is the G −1 ⋆ 124 push-forward of P via the address map, i.e. P (B) := P (G (B)) = µ (B) ⋆ 125 holds for each B ∈ B(Z ).

126 Throughout the rest of the paper we will consider IFSP induced by so-

127 called transformation matrices - for the original definition see [17], for the

128 generalization to the multivariate setting we refer to [31].

6 129 Definition 1 ([17]). A n×m-matrix T = (tij)i=1...n, j=1...m is called transfor- ≥ 130 mation matrix if it fulfills the following∑ four conditions: (i) max(n, m) 2, 131 (ii) all entries are non-negative, (iii) i,j tij = 1, and (iv) no row or column 132 has all entries 0. T will denote the family of all transformations matrices. ∈ T m n 133 Given T we define the vectors (aj)j=0, (bi)i=0 of cumulative column and 134 row sums by a0 = b0 = 0 and ∑ ∑n ∈ { } aj = tij0 j 1, . . . , m

j0≤j i=1 ∑ ∑m ∈ { } bi = ti0j i 1, . . . , n .

i0≤i j=1 m n 135 Since T is a transformation matrix both (aj)j=0 and (bi)i=0 are strictly in- 136 creasing and Rji := [aj−1, aj] × [bi−1, bi] is a non-empty compact rectangle e 137 for all j ∈ {1, . . . , m} and i ∈ {1, . . . , n}. Set I := {(i, j): tij > 0} and con- { 2 } 138 sider the IFSP [0, 1] , (fji)(i,j)∈Ie, (tij)(i,j)∈Ie , whereby the affine contraction 2 139 f : [0, 1] → R is given by ji ji ( ) fji(x, y) = aj−1 + x(aj − aj−1) , bi−1 + y(bi − bi−1) . (10) ⋆ ∈ K 2 140 ZT ([0, 1] ) will denote the attractor of the IFSP. The induced operator 2 141 VT on P([0, 1] ) is defined by ∑m ∑n ∑ fji fji VT (µ) := tij µ = tij µ . (11) j=1 i=1 (i,j)∈Ie

It is straightforward to see that VT maps PC into itself so we may view VT also as operator on C (see [17]). According to the before-mentioned facts ⋆ ∈ C there is exactly one copula AT , to which we will refer to as invariant ⋆ copula, such that V (µ ⋆ ) = µ ⋆ holds. In the sequel we will also write µ T AT AT T instead of µ ⋆ . Considering that V is a contraction on the complete metric AT T space (C,D1) (see [30]) it follows that Vn ⋆ lim D1( T B,AT ) = 0 n→∞ ∈ C ⋆ 142 for every copula B , i.e. the IFSP construction also converges to AT 143 w.r.t. D1 for every starting copula B. If T contains at least one zero then ⋆ ⋆ ⊥ 144 λ2(ZT ) = 0 so µT λ2. It has already been mentioned in [2, 8] that T 145 containing at least one zero is a sufficient but not necessary condition for ⋆ ⊥ 146 µT λ2.

7 147 3. Mutually singular copulas with identical (fractal) support

148 In this section we extend some ideas from [2] to the setting of arbitrary

149 transformation matrices T ∈ T not necessarily inducing IFSPs that only con-

150 tain similarities. Doing so we do not follow the approach in [2] but directly 151 prove and utilize the fact that the dynamical systems (ΣN , B(ΣN ),PT , σ) and ⋆ B ⋆ ⋆ 152 (ZT , (ZT ), µT , ΦT ) are isomorphic (ΦT to be defined subsequently in equa- 153 tion (13)) for every T ∈ T . ∈ T { 2 } 154 Fix T , consider the IFSP [0, 1] , (fji)(i,j)∈Ie, (tij)(i,j)∈Ie induced by T , ⋆ ∈ K 2 155 and let ZT ([0, 1] ) denote the corresponding attractor. To simplify 156 notation sort the functions (fji)(i,j)∈Ie lexicographically and rename them e 157 as w1, . . . , wN with N := #I; analogously rename the rectangles Rji by 158 Q1,...,QN and the probabilities tij by p1, . . . pN . Using the fact that the { 2 N N } 159 IFSP [0, 1] , (wi)i=1, (pi)i=1 is either totally disconnected or just touching −1 { } ∈ ⋆ 160 we can show that #G ( (x, y) ) = 1 for µA-almost every (x, y) ZT and 161 arbitrary A ∈ C. In fact, setting ∪ ( ) ( ) ◦ ◦ ⋆ ∩ ◦ ◦ ⋆ Dm = wi1 ... wim (ZT ) wj1 ... wjm (ZT ) (12)

i,j∈ΣN :(i1,...,im)=(̸ j1,...,jm) ∪ ∈ N ∞ 162 for every m as well as D := m=1 Dm the following result holds: ∈ T ∈ ⋆ 163 Lemma 2. Suppose that T . Then (x, y) ZT has only one address if c c 164 and only if (x, y) ∈ D . Furthermore for every A ∈ C we have µA(D ) = 1. ∈ ◦ ◦ ⋆ ∩ ◦ ◦ ⋆ ∈ 165 Proof: If (x, y) wi1 ... wim (ZT ) wj1 ... wjm (ZT ) for some m 166 N and (i1, . . . , im) ≠ (j1, . . . , jm) then there are k, l ∈ ΣN with (x, y) = ◦ ◦ ◦ ◦ 167 wi1 ... wim (G(k)) = wj1 ... wjm (G(l)). Hence (x, y) has at least 168 the two addresses (i1, . . . , im, k1, k2,...), (j1, . . . , jm, l1, l2,...) ∈ ΣN . Sup- 169 pose now that G(k) = G(l) = (x, y) for k ≠ l and let m denote the 170 smallest i ∈ N with ki ≠ li. Then it follows immediately that (x, y) ∈ ◦ ◦ ⋆ ∩ ◦ ◦ ⋆ ⊆ 171 wk1 ... wkm (ZT ) wl1 ... wlm (ZT ) Dm completing the proof that D 172 is exactly the set of all points with at least two addresses. 173 To( prove the second assertion note that for every) copula A ∈ C we even have ◦ ◦ 2 ∩ ◦ ◦ 2 ̸ 174 µA wi1 ... wim ([0, 1] ) wj1 ... wjm ([0, 1] ) = 0 whenever (i1, . . . , im) = 175 (j1, . . . , jm) since µA({x} × [0, 1]) = µA([0, 1] × {y}) = 0 for all x, y ∈ [0, 1]. 176 This implies µA(Dm) = 0 from which µA(D) = 0 follows immediately. 

177

8 ⋆ → ⋆ 178 Define a measurable function ΦT : ZT ZT by

∑N −1 ∪ ΦT (x, y) = w (x, y)1 ⋆ ∩ \ i−1 (x, y). (13) i ZT (Qi j=1 Qj ) i=1 ⋆ ∈ P 179 and, as before, let µT C denote the invariant measure of the IFSP. The 180 following result holds (PT as in equation (9)):

181 Theorem 3. For every T ∈ T the dynamical systems (ΣN , B(ΣN ),PT , σ) ⋆ B ⋆ ⋆ 182 and (ZT , (ZT ), µT , ΦT ) are isomorphic.

183 Proof: We prove the result in three steps. 184 (S1) Suppose that k ∈ ΣN and that there exists l ≠ σ(k) with G(l) =

185 G(σ(k)). Applying wk1 to both sides yields G((k1, l1, l2,...)) = G(k) = (x, y), −1 c 186 so (x, y) has at least two different addresses. Having this, σ(G (D )) ⊆ −1 c c 187 G (D ) follows immediately. Furthermore for every (x, y) ∈ D with address −1 c 188 k ∈ G (D ) we obviously have

ΦT ◦ G(k) = G ◦ σ(k). (14)

−1 c −1 c c Hence, using σ(G (D )) ⊆ G (D ), ΦT (x, y) = ΦT ◦ G(k) ∈ D follows, c c which shows ΦT (D ) ⊆ D . σ (S2) Obviously the shift-operator σ is PT -preserving, i.e. we have PT = PT . ⋆ Showing that ΦT preserves µT can be done as follows: For every Borel set ∈ B 2 ∈ { } ⋆ V B ([0, 1] ) and every j 1,...,N considering that µT is T -invariant ⋆ and that µT (D) = 0 implies (measurability of wj(B) follows from the fact that wj is an affine bijection by construction)

∑N ( ) ⋆ ⋆ −1 ⋆ µT (wj(B)) = pi µT wi (wj(B)) = pj µT (B). i=1 ∈ B ⋆ 189 Hence, for every B (ZT ) we get ∑N ( ) ⋆ −1 ⋆ c ∩ −1 ⋆ c ∩ −1 ∩ ⋆ µT (ΦT (B)) = µT (D ΦT (B)) = µT D ΦT (B) wi(ZT ) i=1 ∑N ( ) ∑N ( ) ∑N ⋆ c ∩ ⋆ ⋆ = µT D wi(B) = µT wi(B) = pi µT (B) i=1 i=1 i=1 ⋆ = µT (B),

9 ⋆ implying that ΦT is µT -preserving. (S3) As last step we show that the inverse G−1 : Dc → G−1(Dc) of the −1 c → c B ⋆ ∩ c bijection G : G (D ) D is measurable (w.r.t. the σ-fields (ZT ) D −1 c and B(ΣN ) ∩ G (D )) and measure-preserving. Suppose that m ∈ N, that l1, . . . , lm ∈ {1,...N} and set Y := {k ∈ ΣN : k1 = l1, . . . , km = lm} ∩ G−1(Dc). Then we have { } −1 −1 ∈ c −1 ∈ c ∩ ◦ ◦ · · · ◦ ⋆ (G ) (Y ) = (x, y) D : G (x, y) Y = D wl1 wl2 wlm (ZT )

190 from which measurability follows since m and l1, . . . , lm ∈ {1,...N} were −1 c −1 c 191 arbitrary. The fact that G : D → G (D ) is measure-preserving now G ⋆  192 follows easily from PT = µT . 193

194 The subsequent diagram depicts the measure-preserving maps studied in the

195 previous Theorem (vertical two-headed arrows symbolize invertible measure-

196 preserving transformations outside sets of measure zero):

σ / (ΣN ,PT ) (ΣN ,PT ) (15)

G G   ⋆ ⋆ / ⋆ ⋆ (ZT , µT ) (ZT , µT ) ΦT

197 As direct consequence of Theorem 3 we get the following corollary (see [32]). ⋆ B ⋆ ⋆ 198 Corollary 4. The dynamical system (ZT , (Z∑T ), µT , ΦT ) is strongly mixing − 199 and its entropy h(ΦT ) is given by h(ΦT ) = (i,j)∈Ie tij log(tij).

200 Theorem 3 offers a simple way for the construction of mutually singular co-

201 pulas A, B ∈ C having the same (fractal) support whenever, loosely speaking,

202 it is possible to modify T ∈ T without changing the induced IFS. This

203 motivates the following definition:

204 Definition 5. A transformation matrix T ∈ T will be called modifiable if ′ ′ ′ 205 and only if we can find T ∈ T with T ≠ T such that T and T induce the

206 same IFS (but not the same IFSP). For every T ∈ T the family of all such ′ 207 T will be denoted by MT .

208 Obviously MT is either empty or contains uncountably many elements. A 209 simple sufficient (but not necessary) condition for MT ≠ ∅ is the existence 210 of i1 < i2 and j1 < j2 such that tij > 0 for all (i, j) ∈ {i1, i2} × {j1, j2}.

10 Example 6. Consider the following three modifiable transformation matri- ces T1,T2,T3, defined by ( ) ( ) ( ) 1 1 3 5 2 6 3 6 16 16 16 16 T1 = 1 1 ,T2 = 3 5 ,T3 = 4 4 . 6 3 16 16 16 16 ∈ M 211 Then obviously T2 T3 and vice versa. Figure 1 and Figure 2 depict n n 212 V V ∈ { } T1 (Π) and T2 (Π) for n 1, 2, 4, 6 .

Step: 1 Step: 2 1 1

ln(dens) ln(dens) 3/4 3/4 −0.4 −0.8

−0.3 −0.6

−0.2 −0.4 2/4 2/4 −0.1 −0.2

0.0 0.0

1/4 0.1 1/4 0.2

0.2 0.4

0 0

0 1/4 2/4 3/4 1 0 1/4 2/4 3/4 1

Step: 4 Step: 6 1 1

3/4 ln(dens) 3/4 −1.5 ln(dens)

−1.0 −2

2/4 −0.5 2/4 −1

0.0 0

0.5 1 1/4 1/4 1.0

0 0

0 1/4 2/4 3/4 1 0 1/4 2/4 3/4 1

Figure 1: Image plot of the (natural) logarithm of the density of Vn (Π) for n ∈ {1, 2, 4, 6}, T1 T1 according to Example 6

213 Theorem 7. Suppose that T is a modifiable transformation matrix. Then ⋆ 214 there exist uncountably many copulas with support ZT that are pairwise mu- 215 tually singular with respect to each other.

11 ′ ∈ M ′ ′ Proof: Choose T T , let p1, . . . , pN > 0 denote the corresponding proba- bilities and PT ′ the corresponding probability measure on ΣN induced by ′ ′ p1, . . . , pN according to equation (9). Then, using ergodicity of σ both w.r.t. PT ′ and with respect to PT (see [32]), we have PT ⊥ PT ′ , so we can find ′ ′ ′ Λ, Λ ∈ B(ΣN ) fulfilling Λ ∩ Λ = ∅ as well as PT (Λ) = PT ′ (Λ ) = 1. Since, −1 c ⋆ c ⋆ c according to Lemma 2, we have PT (G (D )) = µT (D ) = 1 = µT ′ (D ) = −1 c −1 c ′ −1 c PT ′ (G (D )) this implies PT (Λ ∩ G (D )) = PT ′ (Λ ∩ G (D )) = 1. Set Γ = G(Λ ∩ G−1(Dc)) and Γ′ = G(Λ′ ∩ G−1(Dc)), then using Theorem 3, ′ ∈ B ⋆ ∩ ′ ∅ Γ, Γ (ZT ) and Γ Γ = follows, and we have

⋆ −1 −1 ′ ⋆ ′ µT (Γ) = PT (G (Γ)) = 1 = PT ′ (G (Γ )) = µT ′ (Γ ) which completes the proof. 

Step: 1 Step: 2 1 1

ln(dens) ln(dens) 3/4 3/4 −0.4 −0.8

−0.3 −0.6

−0.2 −0.4 2/4 2/4 −0.1 −0.2

0.0 0.0

1/4 0.1 1/4 0.2

0.2 0.4

0 0

0 1/4 2/4 3/4 1 0 1/4 2/4 3/4 1

Step: 4 Step: 6 1 1

3/4 ln(dens) 3/4 −1.5 ln(dens)

−1.0 −2

2/4 −0.5 2/4 −1

0.0 0

0.5 1 1/4 1/4 1.0

0 0

0 1/4 2/4 3/4 1 0 1/4 2/4 3/4 1

Figure 2: Image plot of the (natural) logarithm of the density of Vn (Π) for n ∈ {1, 2, 4, 6}, T2 T2 according to Example 6

216

12 217 Remark 8. Theorem 7 also holds if, for some m ∈ N, the Kronecker product ∗m 218 T := T ∗ T ∗ · · · ∗ T of T with itself fulfills MT ∗m ≠ ∅.

219 Remark 9. If T is a transformation matrix containing no zeros then obvi- 220 ously T is modifiable. If T additionally fulfills tij ≠ (aj+1 − aj)(bi+1 − bi) 221 for at least one (i, j) then MT contains the transformation matrix E with − − ⋆ 222 eij = (aj+1 aj)(bi+1 bi) for which obviously µE = µΠ holds. Hence for all ⋆ ⋆ ⊥ 223 these T we have that AT has full support although at the same time µT λ2, 224 generalizing some results contained in [8] and [2].

225 We close this section with the following result:

∈ T ⋆ ̸ 226 Corollary 10. Suppose that T fulfills µT = λ2. Then for λ-almost 227 every x ∈ [0, 1] the probability measure K ⋆ (x, ·) is singular w.r.t. λ and the AT ⋆ AT 228 conditional distribution function y 7→ F (y) := K ⋆ (x, [0, y]) has derivative x AT 229 zero λ-almost everywhere.

⋆ ̸ ⋆ ⊥ 230 Proof: Using the afore-mentioned results µT = λ2 implies µT λ2, so ∈ B 2 ⋆ 231 there exists Γ ([0, 1] ) with µT (Γ) = 1 and λ2(Γ) = 0. Disintegration 232 immediately yields λ(Γ ) = 0 as well as K ⋆ (x, Γ ) = 1 for λ-almost all x AT x 233 x ∈ [0, 1]. Having this the remaining assertion follows directly from the fact

234 that the derivative of a singular measure w.r.t. λ is zero almost everywhere

235 (see [27]). 

236 4. Singular copulas with full support whose conditional distribu-

237 tion functions are continuous, strictly increasing and singular

238 Singular copulas A with full support have already been studied in the

239 literature, see [12, 13]. In both constructions the conditional distributions 240 KA(x, ·) are concentrated on countable sets, i.e. they are discrete probability 241 measures. We will use the results of the previous section now in order to prove

242 the existence of copulas A for which all conditional distribution functions A A 7→ dFx (y) 243 Fx : y KA(x, [0, y]) are continuous, strictly increasing, and fulfill dy = 244 0 for λ-almost every y ∈ [0, 1]. To simplify notation we will only consider

245 elements of the class Tˆ of all transformation matrices T (i) containing no 246 zeros, (ii) fulfilling that the row sums and column sums through every tij are ⋆ ̸ Tˆ 247 identical and (iii) µT = λ2. It is straightforward to verify that coincides 1 248 with the class of all transformation matrices of the form T = m S, whereby

13 249 m ∈ {2, 3,...} and S is a m-dimensional doubly stochastic matrix containing

250 no zeros and having at least two different entries. The reason for considering Tˆ ⋆ ⋆ ⊥ 251 is that, firstly, in this case µT has full support and fulfills µT λ2 and i ∈ { } 252 that, secondly, ai = bi = m for every i 0, . . . , m . Hence writing i − 1 1 g (x) = a − + (a − a − )x = + x (16) i i 1 i i 1 m m −1 ∈ { } and setting hi := gi for every i 1, . . . , m the contractions (fij) in equation (10) can be expressed as ( ) fij(x, y) = gi(x), gj(y) . { 2 N N } 2 253 Obviously the corresponding IFSP [0, 1] , (wi)i=1, (pi)i=1 with N = m only ⋆ 254 contains similarities, so µT is self-similar. 255 Before studying the general case we have a look at the transformation matrix

256 T ∈ Tˆ from Example 6. For every A ∈ C the kernel KV can directly be 1 T1 A 257 calculated from KA - in fact, extending the definition of KA to [0, 1] × B(R) 258 by setting KA(·,E) := 0 if E ∩ [0, 1] = ∅, and fixing z ∈ [0, 1],E ∈ B([0, 1]) 259 we have ( ( ) ) ( ) ( ( ) ) 2 1 KV g (z),E K z, h (E) T1 A( 1 ) 3 3 A( 1 ) = 1 2 . (17) KV A g2(z),E KA z, h2(E) T1 | 3 {z 3 } t =2 T1 7→ A 260 If the conditional distribution function y Fz (y) := KA(x, [0, y]) is contin- V T1 A 261 uous then equation (17) implies continuity of F . Let F denote the family gi(z) 262 of all continuous non-decreasing functions F on [0, 1] fulfilling F (0) = 0 and

263 F (1) = 1 and ρ∞ the uniform distance. Then (F, ρ∞) is a complete metric 264 space (see [19, 27]). Define two functions φ1, φ2 : F → F by { [ ] 2 F (2y) if y ∈ 0, 1 ◦ 3 ( 2 ] (φ1 F )(y) = 2 1 − ∈ 1 (18) 3 + 3 F (2y 1) if y 2 , 1

265 and { [ ] 1 F (2y) if y ∈ 0, 1 ◦ 3 ( 2 ] (φ2 F )(y) = 1 2 − ∈ 1 (19) 3 + 3 F (2y 1) if y 2 , 1 , 266 then we have ( ) V F T A(y) = φ ◦ F A (y) (20) gi(z) i z

267 for every y ∈ [0, 1] and z ∈ [0, 1]. The next lemma gathers some properties 268 of φ1, φ2 : F → F:

14 Lemma 11. Let φ1, φ2 be defined according to equation (18) and (19). Then setting Ψ(k) = lim φk ◦ φk ◦ · · · ◦ φk (F ) n→∞ 1 2 n

269 for k ∈ Σ2 and arbitrary F ∈ F defines a continuous mapping Ψ:Σ2 → F. 270 Moreover the function Ψ(k) ∈ F is strictly increasing for every k ∈ Σ2. F Proof: The functions φ1, φ2 are contractions on ( ,( ρ∞) with Lipschitz) con- 2 ∈ ◦ · · · ◦ stant L = 3 . Fix k Σ2. As Cauchy sequence φk1 φkn (F ) n∈N has a limit Ψ(k,F ) ∈ F. Using the fact that φ1, φ2 are contractions it fol- lows that Ψ(k,F ) ∈ F is independent of the function F , i.e. Ψ : Σ2 → F is well-defined and, without loss of generality, we may choose F = id[0,1]. Since continuity of Ψ directly follows from the construction it remains to ∈ ∈ N show that{ Ψ(k) is strictly} increasing for every k Σ2. For every l set 1 2 2l−1 Sl := 0, 2l , 2l ,..., 2l , 1 . Considering that the construction of Ψ implies ◦ ◦ · · · ◦ ◦ ◦ · · · ◦ ∈ φk1 φk2 φkn (id)(z) = φk1 φk2 φkl (id)(z) for every z Sl and n ≥ l we have ◦ ◦ · · · ◦ Ψ(k)(z) = φk1 φk2 φkl (id)(z)

271 ∈ ◦ ◦ · · · ◦ for every z Sl. Since φk1 φk2 φkl (id) is strictly increasing∪ on Sl for ∈ N ∞ 272 every l it follows that Ψ(k) is strictly increasing on S := l=1 Sl which 273 completes the proof. 

274

275 Let Λ denote the set of all x ∈ [0, 1] for which there is a unique k ∈ Σ2 276 (which we will also refer to as ‘address’ of x) such that

x = lim gk ◦ gk ◦ · · · ◦ gk (1), (21) n→∞ 1 2 n c Π ∈ 277 then Λ is countable. Considering Fx (y) = y for every x [0, 1] Lemma 11 278 and equation (20) imply that for every x ∈ Λ we have

Vn Π T1 n K(x, [0, y]) := lim KV Π(x, [0, y]) = lim Fg ◦···◦g (1)(y) = Ψ(k)(y). (22) n→∞ T1 n→∞ k1 kn Applying Lebesgue’s theorem on dominated convergence shows ∫ ∫ ⋆ A (a, y) = lim KVn Π(x, [0, y]) dλ(x) = K(x, [0, y]) dλ(x) T1 →∞ T n [0,a] 1 [0,a]

279 for every a ∈ [0, 1] and y ∈ [0, 1]. Hence K(·, ·) is (a version of) the ⋆ 280 Markov kernel KA⋆ (·, ·) of A . Using Corollary 10, we have altogether T1 T1

15 281 shown that for λ-almost every x ∈ [0, 1] the conditional distribution func- A⋆ T1 282 tion y 7→ Fx (y) is continuous and strictly increasing with derivative zero 283 λ-almost everywhere. Since kernels are only unique λ-almost everywhere A⋆ T1 284 we may, without loss of generality, assume that y 7→ Fx (y) is continuous 285 and strictly increasing with derivative zero λ-almost everywhere for every ∈ ⋆ 286 x [0, 1]. In other words AT fulfills all singularity properties stated at the Vi Π T1 287 beginning of this section. Figure 3 depicts Fx for some values of i and V8 Π T1 three different x, Figure 4 is an image- and 3d-plot of (x, y) 7→ Fx (y).

1

3/4

2/4

1/4

0

0 1/4 2/4 3/4 1

Vi (Π) T1 Figure 3: y 7→ Fx for i ∈ {1, 2,..., 12} and x having address k of the form (1, 1,..., 1, ∗, ∗) (green), of the form (2, 2,..., 2, ∗, ∗) (blue), and of the form (1, 2, 1, 2,..., 1, 2, ∗, ∗) (red); the ∗-entries start at coordinate 13.

288

289 Remark 12. Using the fact that φ1(F)∩φ2(F) = ∅ together with injectivity 290 of φ1, φ2 : F → F it can be shown that the mapping Ψ in Lemma 11 is A⋆ T1 291 injective, implying that the conditional distribution functions (Fx )x∈[0,1] 292 are pairwise different for all x outside a set of λ-measure zero.

293 We now state and prove the general result for arbitrary elements in Tˆ: ∈ Tˆ ⋆ 294 Theorem 13. Suppose that T and let AT denote the corresponding 295 singular copula with full support. Then for λ-almost every x ∈ [0, 1] the ⋆ AT 296 conditional distribution function y 7→ F (y) = K ⋆ (x, [0, y]) is continuous, x AT 297 strictly increasing and has derivative zero λ-almost everywhere.

16 1

0.9 0.8

0.5 1.0

0.8

0.7 0.4 0.6

0.6 0.3 0.4 0.2 0.2

0.0 0.4 0.0

0.2 0.6

0.4

0.6 0.8

0.2 0.1 0.8

0.0 0.2 0.4 0.6 0.8 1.0 1.0 0.0 0.2 0.4 0.6 0.8 1.0

V8 Π T1 Figure 4: Image- and 3d-plot of the function (x, y) 7→ Fx (y)

298 Proof: We proceed in four main steps. 299 (S1) For every A ∈ C extend the definition of the kernel KA to [0, 1] × B(R) 300 by setting KA(·,E) := 0 if E ∩ [0, 1] = ∅. Then for z ∈ [0, 1],E ∈ B([0, 1]) 301 we have  ( )     ( )  KV g (z),E t t . . . t K z, h (E)  T A( 1 )   11 21 m1   A( 1 )   KV g (z),E   t t . . . t   K z, h (E)   T A 2   12 22 m2   A 2   .  = m  . . .. .   .  . ( . ) . . . . ( . ) KV g (z),E t t . . . t K z, h (E) T A m | 1m 2m {z mm } A m =m T t (23) ∈ 302 For y (bi0−1, bi0 ] and E = [0, y], we get (empty sums are zero by definition) ∑ ( [ − ]) y bi0−1 KV (g (z), [0, y]) = m t + m t K z, . (24) T A j ij i0j A − bi0 bi0−1 i

303 (S2) For every j ∈ {1, . . . , m} define a function φj : F → F by ∑ ( ) y − b − (φ ◦ F )(y) = m t + m t F i0 1 . (25) j ij i0j − bi0 bi0−1 i

17 ∈ whenever y (bi0−1, bi0 ]. Then each φj fulfills ( ) ρ∞(φj ◦ F, φj ◦ G) ≤ m max tij ρ∞(F,G) | {zi,j } <1

and the function Ψ : Σm → F given by

Ψ(k) = lim φk ◦ φk ◦ · · · ◦ φk (F ) n→∞ 1 2 n

is well-defined, independent of the{ concrete choice of }F , and continuous. ∈ N 1 2 ml−1 (S3) For every l set Sl := 0, ml , ml ,..., ml , 1 . The construction of ◦ ◦ · · · ◦ ◦ ◦ · · · ◦ Ψ implies φk1 φk2 φkn (id)(z) = φk1 φk2 φkl (id)(z) for every z ∈ Sl and n ≥ l we have ◦ ◦ · · · ◦ Ψ(k)(z) = φk1 φk2 φkl (id)(z)

304 ∈ ◦ ◦ · · · ◦ for every z Sl. Since φk1 φk2 φkl (id) is strictly increasing∪ on Sl ∈ N ∞ 305 for every l it follows that Ψ(k) is strictly increasing on S := l=1 Sl 306 implying that Ψ(k) is strictly increasing on [0, 1] since S is dense in [0, 1]. 307 (S4) Let Λm denote the set of all x ∈ [0, 1] for which there is a unique k ∈ Σm 308 such that x = lim gk ◦ gk ◦ · · · ◦ gk (1). (26) n→∞ 1 2 n c ∈ 309 Then Λm is countable. Considering that for every x Λm with address 310 k ∈ Λm we have

Vn T Π n K(x, [0, y]) := lim KV Π(x, [0, y]) = lim Fg ◦···◦g (1)(y) = Ψ(k)(y), (27) n→∞ T n→∞ k1 kn

311 it follows in the same way as before that K(·, ·) is (a version of) the Markov

312 kernel K ⋆ (·, ·). Using Proposition 10 therefore completes the proof.  AT ˆ ′ 313 Corollary 14. Suppose that T ∈ T and that T ∈ MT has at least two ⋆ ⋆ ⊥ ⋆ ⊥ ⋆ 314 different entries. Then µT , µT ′ λ2 and µT µT ′ . Furthermore we can 315 find a set Λ ∈ B([0, 1]) with λ(Λ) = 1 such that for every x ∈ Λ we have ⋆ A⋆ AT T ′ 316 KA⋆ (x, ·) ⊥ KA⋆ (x, ·) and the conditional distribution functions Fx ,Fx T T ′ 317 are continuous, strictly increasing with derivative zero λ-almost everywhere.

18 318 5. Uniform convergence of empirical copulas induced by orbits of

319 a special Markov process to singular copulas

′ n It is well known that the empirical copula En for i.i.d. data ((xi, yi))i=1 from A ∈ C is a strongly consistent estimator of A w.r.t. d∞. In fact, according to [20, 28], we even have (√ ) ′ log log n d∞(E ,A) = O n n almost surely for n → ∞. The purpose of this section is to show that the empirical copula based on (non i.i.d.) samples of the so-called chaos game (a Markov process defined subsequently) induced by IFSP coming from ∈ T ⋆ transformation matrices T is a strongly consistent estimator of AT w.r.t. d∞ too. We start with an extension of the notion of empirical copula to n the case of arbitrary samples ((xi, yi))i=1 for which not necessarily all xi or n ∈ R2 all yi are different. Consider a finite sample ((xi, yi))i=1 and define the empirical distribution function (ecdf, for short) Hn and the marginal empirical distribution functions Fn,Gn in the usual way, i.e.

1 ∑n H (x, y) = 1 −∞ × −∞ (x , y ) n n ( ,x] ( ,y] i i i=1 as well as 1 ∑n 1 ∑n F (x) = 1 −∞ (x ) ,G (y) = 1 −∞ (y ). n n ( ,x] i n n ( ,y] i i=1 i=1

320 Then the function En : Rg(Fn) × Rg(Gn) → [0, 1], defined on the range of 321 (Fn,Gn) by En(Fn(x),Gn(y)) = Hn(x, y) (28) ′ 322 is easily seen to be a subcopula that can be extended to a copula En via 323 bilinear interpolation (see [25] as well as [3, 7] for other possible extensions).

324 Note that bilinear interpolation yields a checkerboard copula (see [23]), i.e. ′ 325 the corresponding doubly stochastic measure µn is absolutely continuous and 326 its density is constant on each of the rectangles induced by the grid Rg(Fn)× ′ 327 Rg(Gn). In the sequel we will refer to En as empirical copula (ecop for short) n ′ 328 of the sample ((xi, yi))i=1. Figure 5 shows the empirical copula E8 and the ′ 329 mass distribution of µ8 for a (non i.i.d.) sample of size eight.

19 ecop ecop mass distribution

1.0 1.0

0.8 0.8

mass z 0.00 0.0 0.6 0.6 0.02 0.2 0.04 0.4 0.06 0.6 0.4 0.4 0.08 0.8 0.10 1.0 0.12 0.2 0.2

0.0 0.0

0.0 0.2 0.4 0.6 0.8 1.0 0.0 0.2 0.4 0.6 0.8 1.0

Figure 5: ecop and mass distribution for a sample of size eight (the black lines in the left plot depict the 0.1, 0.2,..., 0.9-contour lines, the white asterisks depict the sample).

330 We now recall the definition of the chaos game for the specific case of IFSPs

331 coming from transformation matrices and then prove the afore-mentioned ∈ T { 2 N N } 332 consistency result. Suppose that T and let [0, 1] , (wi)i=1, (pi)i=1 333 denote the induced IFSP and PT the probability measure on B(ΣN ) defined 2 334 according to equation (9). Fix a point (x0, y0) ∈ [0, 1] (not necessarily an ⋆ 2 (x0,y0) 335 element of ZT ) and define a [0, 1] -valued Markov process (Yn )n∈N on 336 (ΣN , B(ΣN ),PT ) by setting

(x0,y0) ◦ ◦ · · · ◦ Yn (k) := wkn wkn−1 wk1 (x0, y0) (29) for every n ∈ N. In other words, the (one-step) transition probabilities H(·, ·) of the process are given by (see [15])

( ) ∑N H (x, y),B = pj 1B(wj(x, y)). j=1

2 2 (x0,y0) 337 for all (x, y) ∈ [0, 1] and B ∈ B([0, 1] ). The Markov process (Yn )n∈N 338 will be called chaos game starting at (x0, y0) (see [6, 15, 24]). According to 339 Elton’s ergodic theorem (see [15, 24]) it follows that for PT -almost all k ∈ ΣN ⋆ 340 the empirical measure ϑn converges weakly to the invariant measure µT , i.e. ∑n 1 ⋆ ϑ := δ (x ,y ) −→ µ weakly for n → ∞. (30) n Y 0 0 (k) T n i i=1

20 ecop maximum of d on grid: 0.023

0.9

0.8

0.7 0.05

0.6 0.04

0.03 0.5

0.02 1.0 0.4 0.01 0.8

0.00 0.6 0.3 0.0

0.2 0.4 0.2 0.4

0.6 0.2 0.1 0.8

0.0 0.2 0.4 0.6 0.8 1.0 0.0 1.0 0.0 0.2 0.4 0.6 0.8 1.0

′ Figure 6: E500 and d(x, y) according to Example 18 (a)

341 Remark 15. As pointed out in [24] the chaos game provides a very and

342 simple and efficient way (random iteration of functions) to approximate the

343 invariant measure of an IFSP also in the general setting.

344 As direct consequence of (30) we get that the empirical distribution function (x0,y0) (x0,y0) (x0,y0) 345 Hn of the sample Y1 (k), Y2 (k),...,Yn (k) converges pointwise ⋆ ∈ 346 to the copula AT for PT -almost all k ΣN . Having this we can prove the 347 following result:

∈ T { 2 N N } Theorem 16. Suppose that T , let [0, 1] , (wi)i=1, (pi)i=1 denote the 2 corresponding IFSP and fix (x0, y0) ∈ [0, 1] . Then there exists a set Λ ⊆ ΣN ∈ ′ with PT (Λ) = 1 such that for every k Λ the sequence (En)n∈N of empirical (x0,y0) (x0,y0) (x0,y0) copulas of the sample Y1 (k),Y2 (k),...,Yn (k) fulfills

′ ⋆ lim d∞(En,AT ) = 0. n→∞

348 In other words: with probability one the empirical copula converges uniformly ⋆ 349 to the copula AT .

Proof: Define Λ ⊆ ΣN as the set of all k fulfilling Equation (30), then Elton’s ergodic theorem (see [15]) implies PT (Λ) = 1. As at the begin- ning of this section let Hn,Fn,Gn denote the empirical distribution func- (x0,y0) (x0,y0) (x0,y0) ∈ tions of the sample Y1 (k),Y2 (k),...,Yn (k) for every n N.

21 ⋆ ∈ Equation (30) implies limn→∞ Hn(x, y) = AT (x, y) for all x, y [0, 1]. Fix ∈ 2 ′ (x, y) [0, 1] . Then, using H n(x, y) = En(Fn(x),Gn(y)) and setting Rn := ′ − ⋆ En(Fn(x),Gn(y)) AT (x, y) we have limn→∞ Rn = 0 and it follows that

′ − ′ − ⋆ − ′ ≤ −→ En(Fn(x),Gn(y)) En(x, y) AT (x, y) En(x, y) Rn 0

350 for n → ∞. Taking into account that (30) also implies ( ) ′ − ′ ≤ | − | | − | lim sup En(Fn(x),Gn(y)) En(x, y) lim sup Fn(x) x + Gn(y) y n→∞ n→∞ = 0

′ ⋆ ∈ 2 351 altogether we get limn→∞ En(x, y) = AT (x, y). Since (x, y) [0, 1] was 352 arbitrary and in C pointwise and uniform convergence are equivalent the

353 proof is complete. 

354 Remark 17. It is straightforward to verify that we do not have convergence ′ ⋆ 355 of En to AT w.r.t. D1 if T has at least one column with two non-zero entries.

1 1

3/4 3/4 ln(dens) z 0 0.0 1 0.2 2 2/4 2/4 0.4 3 0.6 4 0.8 5 1.0 6 1/4 1/4

0 0

0 1/4 2/4 3/4 1 0 1/4 2/4 3/4 1

Figure 7: Image plot of the (natural) logarithm of the density of V5 (Π) (left) as well as T4 image plot of the copula V5 (Π) (right), T from Example 18 (b). T4 4

356 We conclude the paper with the following example:

Example 18. (a) Consider again the transformation matrix T1 from Exam- ⋆ ple 6. In this case the invariant copula AT is self-similar, singular and has ′ (1,1) full support. Figure 6 depicts E500 for an Orbit (Yi (k))i∈N as well as the

22 ecop maximum of d on grid: 0.037

0.9

0.8

0.05 0.7

0.04 0.6 0.03

0.5 0.02 1.0

0.01 0.8 0.4

0.00 0.6 0.0 0.3 0.2 0.4

0.2 0.4

0.6 0.2 0.1 0.8

0.0 0.2 0.4 0.6 0.8 1.0 0.0 1.0 0.0 0.2 0.4 0.6 0.8 1.0

′ Figure 8: E500 and d(x, y) according to Example 18 (b)

| ′ − ⋆ | × function d(x, y) := E500(x, y) AT1 (x, y) on an equidistant grid of 101 101 points. (b) Consider the transformation matrix T4, defined by   1/4 0 0   T4 = 0 1/4 0 . 1/4 0 1/4

⋆ 357 In this case the invariant copula AT4 is not self-similar. Figure 7 depicts 5 5 358 V V the (natural) logarithm of the density of T4 (Π) as well as T4 (Π). Figure ′ (1,1) 359 8 shows E500 for an Orbit (Yi (k))i∈N as well as the function d(x, y) := ′ ⋆ 360 | − | × E500(x, y) AT4 (x, y) on an equidistant grid of 101 101 points.

361 Acknowledgments

362 The second author gratefully acknowledges the support of the grant MTM2011-

363 22394 from the Spanish Ministry of Science and Innovation.

364 References

365 [1] E. de Amo, M. D´ıazCarrillo, J. Fern´andez-S´anchez, Measure-Preserving

366 Functions and the Independence Copula, Mediterr. J. Math. 8 (2011)

367 431–450.

23 368 [2] E. de Amo, M. D´ıazCarrillo, J. Fern´andez-S´anchez, Copulas and asso-

369 ciated fractal sets, J. Math. Anal. Appl. 386 (2012) 528–541.

370 [3] E. de Amo, M. D´ıazCarrillo, J. Fern´andez-S´anchez, Characterization of

371 all copulas associated with non-continuous random variables, Fuzzy Set

372 Syst. 191 (2012) 103–112.

373 [4] E. de Amo, M. D´ıazCarrillo, J. Fern´andez-S´anchez, A. Salmer´on,Mo-

374 ments and associated measures of copulas with fractal support, Appl.

375 Math. Comput. 218 (2012) 8634-8644.

376 [5] E. de Amo, M. D´ıaz Carrillo, J. Fern´andez-S´anchez, W. Trutschnig,

377 Some Results on Homeomorphisms Between Fractal Supports of Copu-

378 las, Nonlinear Anal-Theor. 85 (2013) 132–144.

379 [6] M.F. Barnsley, Fractals everywhere, Academic Press, Cambridge, 1993.

380 [7] H. Carley, Maximum and minimum extensions of finite subcopulas,

381 Commun. Stat. A-Theor. 31 (2002) 2151-2166.

382 [8] I. Cuculescu, R. Theodorescu, Copulas: diagonals, tracks, Rev. Roum.

383 Math Pure A. 46 (2001) 731-742.

384 [9] W.F. Darsow, B. Nguyen, E.T. Olsen, Copulas and Markov processes,

385 Illinois J. Math. 36 (1992) 600-642.

386 [10] R.M. Dudley, and Probability, Cambridge University

387 Press, 2002.

388 [11] F. Durante, C. Sempi, Copula theory: an introduction, In: Jaworski,

389 P., Durante, F., H¨ardle,W., Rychlik, T. (eds.) Copula theory and its

390 applications, Lecture Notes in Statistics–Proceedings vol 198, pp. 1–31,

391 Springer, Berlin, 2010.

392 [12] F. Durante, J. Fern´andez-S´anchez, C. Sempi, A note on the notion of

393 singular copula, Fuzzy Set Syst. 211 (2013) 120-122.

394 [13] F. Durante, P. Jaworski, R. Mesiar, Invariant dependence structures and

395 Archimedean copulas, Stat. Probabil. Lett. 81 (2011) 1995-2003.

396 [14] G.A. Edgar, Measure, Topology and Fractal Geometry, Undergrad. Texts

397 in Math., Springer, Berlin Heidelberg New York, 1990.

24 398 [15] J.F. Elton, An ergodic theorem for iterated maps, Ergod. Theor. Dyn.

399 Syst. 7 (1987) 481-488.

400 [16] J. Fern´andezS´anchez, P. Viader, J. Paradis, M. D´ıazCarrillo, A singular

401 function with a non-zero finite derivative, Nonlinear Anal-Theor. 15

402 (2012) 5010-5014.

403 [17] G. Fredricks, R. Nelsen, J.A. Rodr´ıguez-Lallena,Copulas with fractal

404 supports, Insur. Math. Econ 37 (2005) 42–48.

405 [18] R. Ghiselli Ricci, On differential properties of copulas, Fuzzy Set Syst.

406 220 (2013) 78-87.

407 [19] E. Hewitt, K. Stromberg, Real and Abstract Analysis, A Modern Treat-

408 ment of the Theory of Functions of a Real Variable, Springer-Verlag,

409 New York, 1965.

410 [20] P. Janssen, J. Swanepoel, N. Veraverbeke, Large sample behavior of the

411 Bernstein copula estimator, J. Stat. Plan. Infer. 142 (2012) 1189–1197.

412 [21] O. Kallenberg, Foundations of modern probability, Springer Verlag, New

413 York Berlin Heidelberg, 1997.

414 [22] A. Klenke, Probability Theory - A Comprehensive Course, Springer Ver-

415 lag, Berlin Heidelberg, 2007.

416 [23] T. Kulpa, On Approximation of Copulas, Internat. J. Math. & Math.

417 Sci. 22 (1999) 259-269.

418 [24] H. Kunze, D. La Torre, F. Mendivil, E.R Vrscay, Fractal Based Methods

419 in Analysis, Springer, New York Dordrecht Heidelberg London, 2012.

420 [25] R.B. Nelsen, An Introduction to Copulas (2nd edn.), Springer Series in

421 Statistics, New York, 2006.

422 [26] J. Parad´ıs,P. Viader, L. Bibiloni, Riesz-N´agysingular functions revis-

423 ited, J. Math. Anal. Appl. 329 (2007) 592–602.

424 [27] W. Rudin, Real and , McGraw-Hill, New York, 1966.

425 [28] J. Swanepoel, J. Allison, Some new results on the empirical copula es-

426 timator with applications, Stat. Probabil. Lett. 83 (2013) 1731-1739.

25 427 [29] C. Sempi, Copulæ: Some Mathematical Aspects, Appl. Stoch. Model

428 Bus. 27 (2010) 37–50.

429 [30] W. Trutschnig, On a strong metric on the space of copulas and its in-

430 duced dependence measure, J. Math. Anal. Appl. 384 (2011) 690-705.

431 [31] W. Trutschnig, J. Fern´andez-S´anchez, Idempotent and multivariate co-

432 pulas with fractal support, J. Stat. Plan. Infer. 142 (2012) 3086-3096.

433 [32] P. Walters, An Introduction to Ergodic Theory, Springer, New York,

434 1982.

26