Algorithmic Definitions of Singular Functions

by Daniel Bernstein

A thesis submitted in partial fulfillment of the requirements for Departmental Honors in Davidson College May 13, 2013

Advisor: Richard D. Neidinger
Second Reader: Irl C. Bivens

Contents

1 Sets of Discontinuity and Increasing Functions
  1.1 Dirichlet-Like Functions
  1.2 Sets of Discontinuity
  1.3 Increasing Functions

2 Derivative of an Increasing Function
  2.1 Existence
  2.2 The Cantor Function

3 DeRham’s Function
  3.1 A Natural Probability Question
  3.2 Fractal Properties
  3.3 Algorithmic Definition
  3.4 Analytic Properties
  3.5 Points at Which $f_a'(x)$ Exists

4 Minkowski’s Question-Mark Function
  4.1 Algorithmic Definition of ?(x)
  4.2 Notes and Remarks
  4.3 Mediants, Endpoints and an Algebraic Property
  4.4 Analytic Properties
  4.5 Salem’s Definition
    4.5.1 Continued Fractions
    4.5.2 Connection Between Convergents and Mediants
    4.5.3 Agreement with Salem’s Extension

5 Conclusion and Ideas For Further Study

Abstract

A function f on an interval [a, b] is singular if f(a) < f(b), f is increasing (non-decreasing), f is continuous everywhere and f′(x) = 0 almost everywhere. In this thesis, we focus on the strictly increasing singular functions of DeRham and Minkowski. We propose algorithmic definitions of each of these functions, and use these new definitions to provide alternative proofs of known properties about the values and derivatives of these functions. Before that, we give some background on sets of discontinuity for general functions and for increasing functions, and some background on the derivatives of increasing functions.

1 Sets of Discontinuity and Increasing Functions

In early calculus courses, most of the functions we encounter are continuous at every point in their domain. We sometimes encounter functions that have at most a finite number of discontinuities on any finite interval but this is usually as discontinuous as it gets. One might wonder if it is possible to define a function on a finite interval that is discontinuous at every point in its domain. More generally, one might wonder about what restrictions exist on the set of points at which a function is discontinuous.

1.1 Dirichlet-Like Functions

Consider the function D : (0, 1) → (0, 1) defined as follows:

\[
D(x) = \begin{cases} 1 & \text{if } x \in (0, 1) \cap \mathbb{Q}, \\ 0 & \text{if } x \in (0, 1) \setminus \mathbb{Q}. \end{cases}
\]

Dirichlet proposed this function in 1829 and it was groundbreaking because hitherto, people generally thought about defining functions only using analytic expressions. It gives us an example of a function defined on an interval that is discontinuous at every point in its domain. We can extend the main idea behind Dirichlet’s function to create other functions that are discontinuous on unusual sets.

Now, we give an example of a function E : (0, 1) → (0, 1) that is continuous at exactly one point:

\[
E(x) = \begin{cases} x - \frac{1}{2} & \text{if } x \in (0, 1) \cap \mathbb{Q}, \\ \frac{1}{2} - x & \text{if } x \in (0, 1) \setminus \mathbb{Q}. \end{cases}
\]

Figure 1 shows an approximate graph of E.

This function is continuous at the point x = 1/2 and discontinuous everywhere else.

Figure 1: Graph of E

Thomae used the idea of Dirichlet to define the function T : (0, 1) → (0, 1), which is continuous at every irrational in (0, 1) and discontinuous at every rational in (0, 1):

\[
T(x) = \begin{cases} \frac{1}{d} & \text{if } x \in \mathbb{Q} \text{ and has denominator } d \text{ in reduced form}, \\ 0 & \text{if } x \text{ is irrational}. \end{cases}
\]

Figure 2 shows the graph of T.

Graph of T. Source: http://en.wikipedia.org/wiki/File:Thomae function (0,1).svg

From looking at its graph, it isn’t too surprising that T is discontinuous at each rational. It is less obvious that T is continuous at each irrational, so we will prove that formally. Let x ∈ (0, 1) \ Q. Let ε > 0. Let q ∈ N be such that 1/q < ε. For each i ∈ {2, 3, ..., q} let

\[
\delta_i = \min_{p \in \{1, 2, \dots, i-1\}} \left| x - \frac{p}{i} \right|.
\]

Let $\delta = \min\{\delta_i\}$. Since x is irrational, each $\delta_i$ is positive, so δ > 0. Let c ∈ (x − δ, x + δ). We will now show that |T(x) − T(c)| < ε, completing the proof that T is continuous at x. Since x is irrational, T(x) = 0 and since T is nonnegative, we have |T(x) − T(c)| = |T(c)| = T(c). So if we show T(c) < ε then we are done. If c is irrational, T(c) = 0 so we are done. So assume c is rational.

Assume c = n/d in reduced form. Note that d > q because we defined δ such that the interval (x − δ, x + δ) contains no fractions with denominator less than or equal to q. So

\[
T(c) = \frac{1}{d} < \frac{1}{q} < \varepsilon.
\]

One might wonder, can a function be discontinuous on any arbitrary set? As we will see, the answer is no.
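The choice of δ in this proof can be checked numerically. The following is a minimal sketch, not part of the original argument: the names `thomae` and `delta_for` are hypothetical, and a floating-point value stands in for the irrational point x, so it only illustrates the construction.

```python
from fractions import Fraction
from math import sqrt

def thomae(c: Fraction) -> Fraction:
    """T(c) for a rational c in (0, 1): 1/d, with d the reduced denominator."""
    return Fraction(1, c.denominator)  # Fraction always stores reduced form

def delta_for(x: float, eps: float) -> float:
    """Distance from x to the nearest p/i with 2 <= i <= q, where 1/q < eps."""
    q = max(2, int(1 / eps) + 1)  # int(1/eps) + 1 > 1/eps, hence 1/q < eps
    return min(abs(x - p / i) for i in range(2, q + 1) for p in range(1, i))
```

For example, with `x = sqrt(2) / 2` and `eps = 0.1`, every rational c within `delta_for(x, eps)` of x has reduced denominator greater than q, so `thomae(c) < eps`.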

1.2 Sets of Discontinuity

Definition. Let f be some function defined on R. We define the set of discontinuity of f, Df , to be the set of all points at which f is discontinuous.

Definition. We say that a set S is an Fσ set if S can be expressed as a countable union of closed sets.

The goal of this section is to show that for every function f, the set Df is an Fσ set. The structure of this section is taken from some of the exercises in [1]. First we need some more definitions.

Definition. Let f be a function defined on R and let α > 0. We say that f is α-continuous at x if there exists some δ > 0 such that for all y, z ∈ (x − δ, x + δ), |f(y) − f(z)| < α.

It follows directly from the definition of continuity that a function f is continuous at x if and only if f is α-continuous at x for each α > 0.

Definition. Let f be some function defined on R. Let α > 0. We define the set of α-discontinuity of f, $D_f(\alpha)$, to be the set of all points at which f is not α-continuous.

Lemma 1. Let f be a function defined on R, let α > 0. Then Df (α) is closed.

Proof. Let x be a limit point of $D_f(\alpha)$. We will show that f is not α-continuous at x.

Let $\{x_n\}$ be a sequence of points in $D_f(\alpha)$ such that $x_n \to x$ as n → ∞. Let δ > 0. We want to find y, z ∈ (x − δ, x + δ) such that |f(y) − f(z)| ≥ α. Let n be such that $|x - x_n| < \delta$ and let $\hat{\delta} = \delta - |x - x_n|$. Since f is not α-continuous at $x_n$, there exist $y, z \in (x_n - \hat{\delta}, x_n + \hat{\delta})$ such that |f(y) − f(z)| ≥ α. Note that $(x_n - \hat{\delta}, x_n + \hat{\delta}) \subset (x - \delta, x + \delta)$ and thus y, z ∈ (x − δ, x + δ), so f is not α-continuous at x.

Lemma 2. Let f be a function defined on R. Let 0 < α1 < α2. Then Df (α2) ⊆ Df (α1).

Proof. Let x ∈ Df (α2). So f is not α2-continuous at x. So for all δ > 0, there exist y, z ∈ (x − δ, x + δ) such that |f(y) − f(z)| ≥ α2 > α1. So f is also not α1-continuous at x.

Theorem 3. Let f be a function defined on R. Then Df is an Fσ set.

Proof. Since each set $D_f(\alpha)$ is closed by Lemma 1, we are done if we show

\[
D_f = \bigcup_{q \in \mathbb{Q}^+} D_f(q).
\]

Let $x \in D_f$. Then there exists some α > 0 such that f is not α-continuous at x, because otherwise f would be continuous at x. So $x \in D_f(\alpha)$. Let $q \in \mathbb{Q}^+$ such that q < α. By Lemma 2, $D_f(\alpha) \subseteq D_f(q)$ and thus $x \in D_f(q)$. Now let $x \in D_f(q)$ for some $q \in \mathbb{Q}^+$. Since f is not q-continuous at x, we have that f is not continuous at x and thus $x \in D_f$.

An example of a set that is not Fσ is the set of all irrational numbers in an interval. So there cannot exist a function that is discontinuous only at each irrational in a given interval. The proof of this is outside the scope of this paper.

1.3 Increasing Functions

We say that a function f is increasing if x < y implies f(x) ≤ f(y). We say that a function f is strictly increasing if x < y implies f(x) < f(y). Note that all strictly increasing functions are increasing. As we will see, the criteria for the set on which an increasing function is discontinuous are more restrictive than those for a general function.

Theorem 4. Let f : (0, 1) → R be increasing. Then for each c ∈ (0, 1), $\lim_{x \to c^+} f(x)$ and $\lim_{x \to c^-} f(x)$ exist.

Proof. Let c ∈ (0, 1). Note that since f is increasing, f is bounded below on (c, 1) by f(c). Therefore, by the completeness of the reals, f has a greatest lower bound on (c, 1) so we can define

\[
L = \inf_{x \in (c, 1)} f(x).
\]

Now let ε > 0. Since L is a greatest lower bound, there exists some y ∈ (c, 1) such that f(y) < L + ε. Let δ = y − c; note that y > c so δ > 0 and y = c + δ. This gives us

f(c + δ) < L + ε.

Since f is increasing, for any x ∈ (c, c + δ), we have

f(x) ≤ f(c + δ) < L + ε and thus f(x) − L < ε.

Since L is a lower bound for f on (c, 1), L ≤ f(x) and thus

|f(x) − L| = f(x) − L < ε,

thus showing that $\lim_{x \to c^+} f(x)$ exists and is equal to L. Showing the existence of $\lim_{x \to c^-} f(x)$ is similar.

Definition. Given a function f : D → R where D ⊆ R, we say that f has a jump discontinuity at c if $\lim_{x \to c^+} f(x)$ and $\lim_{x \to c^-} f(x)$ both exist but are not equal.

Theorem 5. Let f : (0, 1) → R be increasing. Then for all c ∈ (0, 1), either f is continuous at c or f has a jump discontinuity at c.

Proof. Let c ∈ (0, 1). The limit limx→c f(x) either exists or doesn’t exist.

Assume that $\lim_{x \to c} f(x)$ exists. We will now show that $f(c) = \lim_{x \to c} f(x)$. Assume for the sake of contradiction that $f(c) \neq \lim_{x \to c} f(x)$. Then $f(c) < \lim_{x \to c} f(x)$ or $f(c) > \lim_{x \to c} f(x)$. Assume $f(c) < \lim_{x \to c} f(x)$. Then $f(c) < \lim_{x \to c^-} f(x)$. Then there exists some x < c such that f(x) > f(c), which is a contradiction because f is increasing. Assume $f(c) > \lim_{x \to c} f(x)$. Then $f(c) > \lim_{x \to c^+} f(x)$. Then there exists some x > c such that f(x) < f(c), which is a contradiction because f is increasing. Thus if $\lim_{x \to c} f(x)$ exists, it must equal f(c), so f is continuous at c.

Now assume that $\lim_{x \to c} f(x)$ does not exist. By Theorem 4, $\lim_{x \to c^+} f(x)$ and $\lim_{x \to c^-} f(x)$ both exist, so we must have $\lim_{x \to c^+} f(x) \neq \lim_{x \to c^-} f(x)$ for $\lim_{x \to c} f(x)$ to not exist. So f has a jump discontinuity at c.

Theorem 6. Let f : (0, 1) → R be increasing. Let Df denote the set of points at which f is discontinuous. Then Df is countable.

Proof. Informally, we prove this by choosing a rational number in the interval defined by each jump discontinuity. Since our function is increasing, no two such intervals will overlap, so each rational number chosen in this way will be unique. More formally, we will construct a function g : Df → Q that is one-to-one.

Let c ∈ Df . Then limx→c+ f(x) and limx→c− f(x) both exist with limx→c− f(x) < limx→c+ f(x). Let qc ∈ Q ∩ (limx→c− f(x), limx→c+ f(x)) and define g(c) = qc.

To see that g is one-to-one, let $c, d \in D_f$ such that g(c) = g(d). Assume for the sake of contradiction that c ≠ d and without loss of generality, assume c < d. Since $g(c) \in (\lim_{x \to c^-} f(x), \lim_{x \to c^+} f(x))$ and $g(d) \in (\lim_{x \to d^-} f(x), \lim_{x \to d^+} f(x))$, we must have

\[
\lim_{x \to c^+} f(x) > g(c) = g(d) > \lim_{x \to d^-} f(x). \tag{1}
\]

Let $\{c_n\}$ be a decreasing sequence such that $c_n \to c$. Let $\{d_n\}$ be an increasing sequence such that $d_n \to d$. Since c < d and $\{c_n\}$ is decreasing and $\{d_n\}$ is increasing, there must exist some N such that for all n ≥ N, $c_n < d_n$. Since f is monotone, for each n ≥ N we have $f(c_n) \le f(d_n)$, implying that $\lim_{x \to c^+} f(x) \le \lim_{x \to d^-} f(x)$, which contradicts (1).

Can any countable set be the set of discontinuity for a monotone function? As it turns out, yes. We address this in the theorem below.

Theorem 7. Let D = {x1, x2, ...} ⊂ (0, 1). Then there exists a monotone function f : (0, 1) → (0, 1) such that f is discontinuous precisely at the points in D.

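Proof. We define f as follows:

\[
f(x) = \sum_{n : x_n < x} \frac{1}{2^n}.
\]

To see that f is increasing, let x < y. Then

\[
f(y) = \sum_{n : x_n < y} \frac{1}{2^n} = \sum_{n : x_n < x} \frac{1}{2^n} + \sum_{n : x \le x_n < y} \frac{1}{2^n} = f(x) + \sum_{n : x \le x_n < y} \frac{1}{2^n} \ge f(x).
\]

Now we show that f is discontinuous at each $x_i$ by showing that the right and left hand limits of f at $x_i$ are unequal for any i.

\[
\lim_{x \to x_i^+} f(x) = \lim_{x \to x_i^+} \sum_{n : x_n < x} \frac{1}{2^n} \ge \sum_{n : x_n \le x_i} \frac{1}{2^n} > \sum_{n : x_n < x_i} \frac{1}{2^n} = f(x_i).
\]

\[
\lim_{x \to x_i^-} f(x) = \lim_{x \to x_i^-} \sum_{n : x_n < x} \frac{1}{2^n} \le \sum_{n : x_n < x_i} \frac{1}{2^n} = f(x_i).
\]

So $\lim_{x \to x_i^+} f(x)$ and $\lim_{x \to x_i^-} f(x)$ are unequal and thus f is discontinuous at each $x_i$.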
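The jump construction can be illustrated with a finite list of points. This is only a sketch under that assumption (the theorem uses a countably infinite set D), and the helper name `jump_function` is hypothetical.

```python
from fractions import Fraction

def jump_function(points):
    """Return f with f(x) = sum of 1/2^n over all n such that x_n < x.

    `points` is the finite list [x_1, x_2, ...]; indices start at 1.
    """
    def f(x):
        return sum(
            (Fraction(1, 2 ** n) for n, xn in enumerate(points, start=1) if xn < x),
            Fraction(0),
        )
    return f

# Example: D = {1/2, 1/4, 3/4}; f jumps by 1/2^n just to the right of x_n.
f = jump_function([Fraction(1, 2), Fraction(1, 4), Fraction(3, 4)])
```

Because the defining inequality is strict, f(x_n) equals the left-hand limit at x_n, and the jump of size 1/2^n appears in the right-hand limit, matching the two limit computations in the proof.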

We now give a theorem about strictly increasing functions which will be useful later.

Theorem 8. Let f : (0, 1) → (0, 1) be surjective and strictly increasing. Then f is continuous.

Proof. Let x ∈ (0, 1). Let ε > 0. Let $z_1 \in (f(x) - \varepsilon, f(x)) \cap (0, 1)$ and let $z_2 \in (f(x), f(x) + \varepsilon) \cap (0, 1)$. Since f is surjective, there exist $y_1, y_2 \in (0, 1)$ such that $f(y_1) = z_1$ and $f(y_2) = z_2$. Since f is strictly increasing, $y_1 < x < y_2$. Let $\delta = \min\{x - y_1, y_2 - x\}$; note δ > 0. Let c ∈ (x − δ, x + δ). Assume c < x. Since f is increasing and since $y_1 \le x - \delta$, we have $z_1 \le f(x - \delta) \le f(c) \le f(x)$ and thus f(x) − ε < f(c) < f(x) + ε. The case where c > x is similar.

We now consider properties of the derivatives of increasing functions.

2 Derivative of an Increasing Function

In this section we will prove a major theorem about existence properties of the derivative of an increasing function. Then, we will briefly introduce The Cantor Function (for a more complete discussion of The Cantor Function, see [5]).

2.1 Existence

Earlier we saw that it is possible to construct a function that is discontinuous everywhere, but if we want the function to be monotone, it can be discontinuous only on a countable set. Since differentiability implies continuity, it follows that we can construct a function that is not differentiable anywhere. If we restrict our attention to continuous functions, this is still possible (see [1, p.146] for an example of a continuous, nowhere differentiable function). What if we restrict our attention to monotone functions? Can we construct a monotone function that is not differentiable anywhere? As we will see, such a function cannot exist: we will prove that the derivative of a monotone function must exist almost everywhere. But first, we need some preliminary results.

Theorem 9 (Fatou’s Lemma). If $\{f_n\}_{n=1}^{\infty}$ is a sequence of nonnegative measurable functions and $f_n(x) \to f(x)$ for x in some set E, then

\[
\int_E f \le \liminf_{n \to \infty} \int_E f_n.
\]

For a proof of Fatou’s Lemma, see [9].

Definition. Let I be an interval and let E be a set. Then l(I) denotes the length of I and m(E) denotes the Lebesgue measure of E.

Definition. Let E ⊆ R. Let $\mathcal{I}$ be a collection of intervals of positive length that covers E. Then $\mathcal{I}$ covers E in the sense of Vitali if for each ε > 0 and for each x ∈ E, there exists an interval $I \in \mathcal{I}$ such that x ∈ I and l(I) < ε. We may also say that $\mathcal{I}$ is a Vitali covering of E.

Lemma 10. Let E be a set of finite measure and let $\mathcal{I}$ be a set of closed intervals of positive length that covers E in the sense of Vitali. Then, given ε > 0, there exists a finite disjoint collection of intervals $\{I_1, \dots, I_N\}$ in $\mathcal{I}$ such that

\[
m\left( E \setminus \bigcup_{n=1}^{N} I_n \right) < \varepsilon.
\]

Proof. Let O be an open set of finite measure that contains E. We first define the set $\mathcal{I}^* \subseteq \mathcal{I}$ of all intervals $I \in \mathcal{I}$ that are contained in O, then we note that $\mathcal{I}^*$ is also a Vitali covering of E. To see this, we let ε > 0 and let x ∈ E. Since O is open, there exists a δ > 0 such that (x − δ, x + δ) ⊂ O. Let $\hat{\varepsilon} = \min\{\varepsilon, \delta\}$. Since $\mathcal{I}$ is a Vitali covering of E, there exists some $I \in \mathcal{I}$ such that x ∈ I and $l(I) < \hat{\varepsilon}$. Then I is contained entirely within O, so $I \in \mathcal{I}^*$. Since l(I) < ε, $\mathcal{I}^*$ is a Vitali covering of E. So, now we will assume without loss of generality that each $I \in \mathcal{I}$ is contained entirely within O.

Now, we inductively create a sequence $\{I_n\}$ of disjoint intervals in $\mathcal{I}$. Let $I_1$ be any interval in $\mathcal{I}$. Suppose that $I_1, \dots, I_n$ have all been chosen. If $E \subseteq \bigcup_{k=1}^{n} I_k$, then for any ε > 0 we have $m\left(E \setminus \bigcup_{k=1}^{n} I_k\right) = 0 < \varepsilon$, so we would be done. So, assume this is not the case. Now we show how to choose $I_{n+1}$. Let $k_n$ be the supremum of the lengths of intervals in $\mathcal{I}$ that do not intersect any of the intervals $I_1, \dots, I_n$. Since each element of $\mathcal{I}$ is contained in O, we have $k_n \le m(O) < \infty$. Then, we choose an interval $I_{n+1} \in \mathcal{I}$ such that $I_{n+1}$ is disjoint from all of $I_1, \dots, I_n$ and $l(I_{n+1}) > \frac{1}{2} k_n$. We can choose $I_{n+1}$ to be disjoint from the finite collection $I_1, \dots, I_n$ because $\mathcal{I}$ is a Vitali covering, so its intervals can be arbitrarily small. We can choose $I_{n+1}$ to have length greater than $\frac{1}{2} k_n$ because $k_n$ is the supremum of the lengths of intervals that are disjoint from $I_1, \dots, I_n$.

Thus, we have an infinite sequence $\{I_n\}$ of disjoint intervals of $\mathcal{I}$. Since $\bigcup_{n=1}^{\infty} I_n \subseteq O$, we have $\sum_{n=1}^{\infty} l(I_n) \le m(O) < \infty$. This implies two things. First, it implies $\lim_{n \to \infty} l(I_n) = 0$. Otherwise there would exist some B > 0 such that $l(I_n) \ge B$ for infinitely many n, which would give $\sum_{n=1}^{\infty} l(I_n) = \infty$. Second, this implies that for any ε > 0, we can choose a positive integer N such that

\[
\sum_{n=N+1}^{\infty} l(I_n) < \varepsilon / 5.
\]

Now, let

\[
R = E \setminus \bigcup_{n=1}^{N} I_n.
\]

If we show that m(R) < ε, then we will be done.

Let x ∈ R. Note that R does not intersect $\bigcup_{n=1}^{N} I_n$. Then, since $\bigcup_{n=1}^{N} I_n$ is the union of finitely many closed intervals and thus itself closed, we can find a δ > 0 such that (x − δ, x + δ) does not intersect $\bigcup_{n=1}^{N} I_n$. Then, since $\mathcal{I}$ is a Vitali covering, there exists an $I \in \mathcal{I}$ such that x ∈ I ⊂ (x − δ, x + δ). So I also does not intersect $\bigcup_{n=1}^{N} I_n$.

Now let n be a positive integer such that for all i ≤ n, I does not intersect $I_i$. Then, since $k_n$ is the supremum of the lengths of the intervals that do not intersect any of $I_1, \dots, I_n$, we have $l(I) \le k_n$. By construction, $l(I_{n+1}) > \frac{1}{2} k_n$, which gives $l(I) \le k_n < 2\,l(I_{n+1})$.

Now, $I_{n+1}$ may intersect I, and for n large enough it must, because $\lim_{i \to \infty} l(I_i) = 0$ and I has positive length. So fix n to be the smallest integer such that I intersects $I_{n+1}$. Note that we must have n ≥ N and $l(I) \le k_n < 2\,l(I_{n+1})$. Since x ∈ I and since I has a point in common with $I_{n+1}$, the distance from x to the midpoint of $I_{n+1}$ is at most $l(I) + \frac{1}{2} l(I_{n+1}) < \frac{5}{2} l(I_{n+1})$. For each i, define $J_i$ to be the interval that shares a midpoint with $I_i$ and has five times its length. Then x must be contained in the interval $J_{n+1}$. Since x is an arbitrary point in R, we have that

\[
R \subseteq \bigcup_{i=N+1}^{\infty} J_i.
\]

Hence

\[
m(R) \le \sum_{i=N+1}^{\infty} l(J_i) = 5 \sum_{i=N+1}^{\infty} l(I_i) < \varepsilon.
\]
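The selection step and the five-fold enlargement can be tried out on a finite family of closed intervals. The sketch below is an illustration only, under the assumption that the family is finite: then the supremum $k_n$ is attained, so always taking the longest remaining disjoint interval satisfies the requirement $l(I_{n+1}) > \frac{1}{2} k_n$. The names `greedy_disjoint` and `enlarge` are hypothetical.

```python
def greedy_disjoint(intervals):
    """Greedily pick pairwise disjoint closed intervals, longest first."""
    chosen = []
    for a, b in sorted(intervals, key=lambda iv: iv[1] - iv[0], reverse=True):
        # Closed intervals [a, b] and [c, d] are disjoint iff b < c or d < a.
        if all(b < c or d < a for c, d in chosen):
            chosen.append((a, b))
    return chosen

def enlarge(interval, factor=5):
    """Interval with the same midpoint and `factor` times the length."""
    a, b = interval
    mid, half = (a + b) / 2, (b - a) / 2
    return (mid - factor * half, mid + factor * half)

family = [(0, 4), (3, 5), (5, 6), (6, 10), (9, 11), (10, 12)]
chosen = greedy_disjoint(family)
blown_up = [enlarge(iv) for iv in chosen]
```

In this example every interval of the family lies inside one of the enlarged intervals, which is the finite analogue of the inclusion of R in the union of the $J_i$.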

Definition. Let f be a function on the real numbers. The following four quantities are called the derivates of f at x:

\[
D^+ f(x) = \limsup_{h \to 0^+} \frac{f(x + h) - f(x)}{h}, \qquad D^- f(x) = \limsup_{h \to 0^-} \frac{f(x + h) - f(x)}{h},
\]

\[
D_+ f(x) = \liminf_{h \to 0^+} \frac{f(x + h) - f(x)}{h}, \qquad D_- f(x) = \liminf_{h \to 0^-} \frac{f(x + h) - f(x)}{h}.
\]

By definition of limit superior and limit inferior, we have $D^+ f(x) \ge D_+ f(x)$ and $D^- f(x) \ge D_- f(x)$. The function f is differentiable at x if and only if $D^+ f(x) = D_+ f(x) = D^- f(x) = D_- f(x) \neq \pm\infty$, and the derivative of f at x is the common value.

Theorem 11 (Lebesgue’s Theorem). Let f :[a, b] → R be an increasing function. Then f is differentiable almost everywhere.

Proof. We prove this theorem by showing that the subset of [a, b] whose elements have unequal derivates is of measure zero. We consider only the set E where for all x ∈ E, $D^+ f(x) > D_- f(x)$. The cases where other derivates are unequal are handled similarly. Now, let u, v be rational numbers with u > v and define

\[
E_{u,v} = \{x : D^+ f(x) > u > v > D_- f(x)\}.
\]

Then we have the following identity:

\[
E = \bigcup_{u, v \in \mathbb{Q},\ u > v} E_{u,v}.
\]

This union is countable, so if we show that an arbitrary $E_{u,v}$ is of measure zero, then it follows that E is of measure zero and we are done.

If $E_{u,v}$ is empty, then we are done, so assume that $E_{u,v}$ is nonempty. Let $s = m(E_{u,v})$ and let ε > 0. Let O be an open set that contains $E_{u,v}$ such that m(O) < s + ε. For an $x \in E_{u,v}$, we can choose an arbitrarily small h > 0 such that [x − h, x] is contained in O and so that

\[
\frac{f(x) - f(x - h)}{h} < v. \tag{2}
\]

For each $x \in E_{u,v}$, define $\mathcal{I}_x$ to be the collection of all intervals [x − h, x] that are contained in O and satisfy (2), and let $\mathcal{I} = \bigcup_{x \in E_{u,v}} \mathcal{I}_x$. Then $\mathcal{I}$ is a Vitali covering of $E_{u,v}$, so by Lemma 10 we can choose a finite, disjoint collection $\{I_1, \dots, I_N\}$ of intervals in $\mathcal{I}$ such that

\[
m\left( E_{u,v} \setminus \bigcup_{n=1}^{N} I_n \right) < \varepsilon.
\]

For each n = 1, ..., N let $x_n$ and $h_n$ be the x and h that were used to create $I_n$. Then for each n = 1, ..., N, (2) gives

f(xn) − f(xn − hn) < vhn.

We can sum all of these to get

\[
\sum_{n=1}^{N} \left[ f(x_n) - f(x_n - h_n) \right] < v \sum_{n=1}^{N} h_n. \tag{3}
\]

Note that $h_n$ is the width of the interval $I_n$ and that the $I_n$'s are disjoint and contained within O, which gives us $\sum_{n=1}^{N} h_n \le m(O) < s + \varepsilon$. Plugging this into (3) gives

\[
\sum_{n=1}^{N} \left[ f(x_n) - f(x_n - h_n) \right] < v(s + \varepsilon). \tag{4}
\]

Let $A_n = \mathrm{Int}(I_n) \cap E_{u,v}$ and let $A = \bigcup_{n=1}^{N} A_n$. Then m(A) > s − ε. Since each point in A is in the interior of some $I_n$, for each y ∈ A we can choose an arbitrarily small k such that [y, y + k] is contained in some $I_n$; and since each y in A is also in $E_{u,v}$, we have $D^+ f(y) > u$, so we can choose an arbitrarily small k such that $\frac{f(y + k) - f(y)}{k} > u$.

Now, for each y ∈ A, let $\mathcal{J}_y$ be the collection of intervals [y, y + k] with k small enough to meet both of the above conditions. Let $\mathcal{J} = \bigcup_{y \in A} \mathcal{J}_y$. Then $\mathcal{J}$ is a Vitali covering of A, so by Lemma 10 we can choose a finite disjoint collection $\{J_1, \dots, J_M\}$ of intervals in $\mathcal{J}$, where $J_i = [y_i, y_i + k_i]$, such that

\[
m\left( A \setminus \bigcup_{i=1}^{M} J_i \right) < \varepsilon.
\]

Then we also have

\[
m\left( \bigcup_{i=1}^{M} J_i \right) = \sum_{i=1}^{M} k_i > m(A) - \varepsilon > s - 2\varepsilon. \tag{5}
\]

Recall that by construction of the $k_i$'s, we have $\frac{f(y_i + k_i) - f(y_i)}{k_i} > u$ and thus

\[
\sum_{i=1}^{M} \left[ f(y_i + k_i) - f(y_i) \right] > u \sum_{i=1}^{M} k_i. \tag{6}
\]

Then (5) and (6) give

\[
\sum_{i=1}^{M} \left[ f(y_i + k_i) - f(y_i) \right] > (s - 2\varepsilon) u. \tag{7}
\]

By construction, each interval Ji is contained within some In. Then for each i, let n be such that Ji ⊂ In. Since f is increasing, the following inequality holds:

f(xn − hn) ≤ f(yi) ≤ f(yi + ki) ≤ f(xn).

We can use this to get

f(xn) − f(xn − hn) ≥ f(yi + ki) − f(yi).

Note that each $J_i$ is contained inside exactly one $I_n$. Now fix an n and consider all i such that $J_i \subset I_n$. Since f is increasing and since the $J_i$'s are disjoint, we have

\[
f(x_n) - f(x_n - h_n) \ge \sum_{J_i \subset I_n} \left[ f(y_i + k_i) - f(y_i) \right].
\]

Thus

\[
\sum_{n=1}^{N} \left[ f(x_n) - f(x_n - h_n) \right] \ge \sum_{i=1}^{M} \left[ f(y_i + k_i) - f(y_i) \right].
\]

Applying (4) and (7) gives

\[
v(s + \varepsilon) > u(s - 2\varepsilon).
\]

Since this is true for each ε > 0, we must have vs ≥ us. But since u > v, then vs ≥ us can only be true if s = 0.

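This shows that the function

\[
g(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}
\]

is defined almost everywhere and that f is differentiable wherever g is finite. We now show that g must be finite almost everywhere.

For each n ∈ N and each x ∈ [a, b], we define

\[
g_n(x) = n \left[ f\left(x + \frac{1}{n}\right) - f(x) \right],
\]

where for each x > b we set f(x) = f(b).

Then $g_n(x) \to g(x)$ as n → ∞ for almost all x, so g is measurable. Since f is increasing, $g_n(x) \ge 0$ wherever $g_n$ exists. Then, by Theorem 9,

\begin{align*}
\int_{[a,b]} g \le \liminf_{n \to \infty} \int_{[a,b]} g_n
&= \liminf_{n \to \infty} \, n \int_a^b \left[ f\left(x + \frac{1}{n}\right) - f(x) \right] dx \\
&= \liminf_{n \to \infty} \left[ n \int_{[b,\, b + \frac{1}{n}]} f - n \int_{[a,\, a + \frac{1}{n}]} f \right] \\
&= \liminf_{n \to \infty} \left[ f(b) - n \int_{[a,\, a + \frac{1}{n}]} f \right] \\
&\le f(b) - f(a).
\end{align*}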

So g is integrable and thus finite almost everywhere. So f is differentiable almost everywhere.

2.2 The Cantor Function

One of the first facts we learn about the derivative of a real valued function f is that for any point x in the domain of f, if f′(x) exists, then the value of f′(x) is the slope of the line that is tangent to the graph of f at the point x. Thus it seems natural to assume that if a function f : [a, b] → R is increasing and f(a) < f(b), then f′(x) must be positive somewhere between a and b. However, this is not true. The Cantor Function, whose graph is shown in Figure 2, is a counterexample to this intuition.

Figure 2: The Cantor Function

This function is interesting because it is continuous and its derivative is zero almost everywhere, yet its graph climbs from 0 to 1. It satisfies the following definition:

Definition 12. A continuous, increasing function f : [a, b] → R that satisfies f(a) < f(b) and f′(x) = 0 for almost all x ∈ [a, b] is said to be singular.

From looking at the graph of The Cantor Function, it isn’t that surprising that it is singular; the graph is horizontal in almost all places. This function is almost always constant. One might wonder, are there any singular functions that are never constant? Equivalently, are there any strictly increasing singular functions? The answer to this question is yes. We will see two such examples.

3 DeRham’s Function

Our first example of a strictly increasing singular function is DeRham’s function. There are multiple ways that this function is usually defined. We discuss two of them and then provide our own definition. We then use our new definition to provide alternative proofs of some known results.

Figure 3: Graph of $\hat{f}_{1/3}(x)$

3.1 A Natural Probability Question

Say we have a fair coin that we flip an infinite number of times. We will write down a 0 each time the coin lands on tails and a 1 each time it lands on heads. After generating this infinite string of zeros and ones, we can place a binary point in front of it and interpret the result as a number between 0 and 1 expressed in binary. Call this number R. What does the cumulative distribution function for R look like? We will denote this f(x). So f(x) is the probability that R < x. If $R < \frac{1}{2^k}$ for some positive integer k, then our first k flips must be tails, which occurs with probability $\frac{1}{2^k}$. Thus, $f\left(\frac{1}{2^k}\right) = \frac{1}{2^k}$. In fact, we can show that for any x ∈ [0, 1], we have f(x) = x. So in this case, the cumulative distribution function of R is not particularly interesting. However, if we assume that our coin is unfair, landing on tails with probability $a \neq \frac{1}{2}$ and heads with probability 1 − a, then the cumulative distribution function of R, which we will denote $\hat{f}_a(x)$, takes on some more interesting characteristics. Figures 3 and 4 show the graphs of $\hat{f}_{1/3}(x)$ and $\hat{f}_{2/3}(x)$. Note that these functions are not inverses of one another, although they may appear to be at first glance.

Figure 4: Graph of $\hat{f}_{2/3}(x)$

3.2 Fractal Properties

Note that each part of the curve on either side of the line x = 1/2 is a shrunken version of the whole curve. More specifically, if we shrink the whole curve to half its size in the x direction and a times its size in the y direction, we get the left half of the curve, and if we shrink the whole curve to half its size in the x direction and 1 − a times its size in the y direction, we get the right half of the curve. We state this property more rigorously with the following theorem.

Theorem 13. DeRham’s function $\hat{f}_a(x)$ satisfies the following:

\[
\hat{f}_a(x) = \begin{cases} a \hat{f}_a(2x) & \text{if } 0 \le x \le \frac{1}{2}, \\ (1 - a) \hat{f}_a(2x - 1) + a & \text{if } \frac{1}{2} < x \le 1. \end{cases}
\]

Proof. Let $x = 0.b_1 b_2 b_3 \dots$ be a binary representation of x that does not terminate. Recall that $\hat{f}_a(x) = P(R < x)$ and let $R = 0.r_1 r_2 r_3 \dots$ be the binary representation of R.

Now assume $0 \le x \le \frac{1}{2}$. Then $b_1 = 0$ (if $x = \frac{1}{2}$, then it is expressed as 0.0111...). Then, R < x if and only if $r_1 = 0$ and $0.r_2 r_3 r_4 \dots < 0.b_2 b_3 b_4 \dots$. We have $r_1 = 0$ with probability a. Since $b_1 = 0$, $2x = 0.b_2 b_3 b_4 \dots$. Thus $0.r_2 r_3 r_4 \dots < 0.b_2 b_3 b_4 \dots$ with probability $\hat{f}_a(2x)$. So if $0 \le x \le \frac{1}{2}$, then $\hat{f}_a(x) = a \hat{f}_a(2x)$.

Now assume $\frac{1}{2} < x \le 1$. Then $b_1 = 1$. So if $r_1 = 1$, then R < x if and only if $0.r_2 r_3 r_4 \dots < 0.b_2 b_3 b_4 \dots$. Also, $2x - 1 = 0.b_2 b_3 b_4 \dots$. So $0.r_2 r_3 r_4 \dots < 0.b_2 b_3 b_4 \dots$ is true with probability $\hat{f}_a(2x - 1)$. If $r_1 = 0$, then R < x holds and the values of $r_2, r_3, r_4, \dots$ are unrestricted. Since $r_1 = 1$ with probability 1 − a and $r_1 = 0$ with probability a, if $\frac{1}{2} < x \le 1$, then $\hat{f}_a(x) = (1 - a) \hat{f}_a(2x - 1) + a$.

Not only does DeRham’s function satisfy the formula given in Theorem 13, it is the unique continuous solution to that functional equation. See [3] for more details.
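The functional equation of Theorem 13 can be checked exactly at dyadic points. The sketch below is an illustration under stated assumptions: the helper `cdf_dyadic` is hypothetical, it computes P(R < j/2^k) by enumerating the first k flips, and it ignores the probability-zero event that R equals the endpoint.

```python
from fractions import Fraction
from itertools import product

def cdf_dyadic(j, k, a):
    """P(R < j / 2**k) when each binary digit of R is 0 with probability a."""
    total = Fraction(0)
    for bits in product((0, 1), repeat=k):
        value = 0
        for b in bits:            # integer encoded by the first k digits of R
            value = 2 * value + b
        if value < j:             # R < j/2^k iff this prefix is below j
            ones = sum(bits)
            total += a ** (k - ones) * (1 - a) ** ones
    return total

a = Fraction(1, 3)
left = cdf_dyadic(3, 3, a)            # x = 3/8 <= 1/2
right = a * cdf_dyadic(3, 2, a)       # a * f(2x) with 2x = 3/4
```

Here `left == right`, as the first case of the theorem predicts; with a = 1/2 the same helper returns j/2^k, matching f(x) = x for the fair coin.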

DeRham uses this definition to prove that $\hat{f}_a$ is singular for all $a \neq \frac{1}{2}$. In [6], Kawamura provides an alternate definition and uses it to explicitly identify sets of real numbers on which $\hat{f}_a'(x) = 0$ and the sets on which $\hat{f}_a'(x)$ does not exist.

3.3 Algorithmic Definition

We now define a function $f_a(x)$ with an algorithm. We will later show that $f_a(x) = \hat{f}_a(x)$ for all x. Then we use this new definition to get alternative proofs for the aforementioned results of DeRham and Kawamura.

Algorithm 1 Given input x ∈ [0, 1], generate $f_a(x)$

Suppose x has binary representation $0.d_1 d_2 d_3 \dots$
L_1 ← 0
U_1 ← 1
M_0 ← 1
for n = 1, 2, 3, ... do
    M_n ← (1 − a) L_n + a U_n
    if d_n = 1 then
        L_{n+1} ← M_n
        U_{n+1} ← U_n
    else
        U_{n+1} ← M_n
        L_{n+1} ← L_n
    end if
end for

All three of the sequences defined converge to a common value. This value is $f_a(x)$.
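Algorithm 1 can be sketched in Python by running it on a finite prefix of the binary digits, which yields an interval that contains $f_a(x)$. The function name `de_rham_interval` is hypothetical, and exact rational arithmetic is an assumption made for the illustration.

```python
from fractions import Fraction

def de_rham_interval(digits, a):
    """Run Algorithm 1 on a finite list of binary digits.

    Returns the endpoints (L, U) reached after processing every digit;
    f_a(x) lies in [L, U] for any x whose expansion starts with `digits`.
    """
    lower, upper = Fraction(0), Fraction(1)
    for d in digits:
        mid = (1 - a) * lower + a * upper   # M_n
        if d == 1:
            lower = mid                      # digit 1: raise the lower endpoint
        else:
            upper = mid                      # digit 0: lower the upper endpoint
    return lower, upper

a = Fraction(1, 3)
lo_term, _ = de_rham_interval([1] + [0] * 20, a)   # x = 1/2 as 0.1000...
_, hi_rep = de_rham_interval([0] + [1] * 40, a)    # x = 1/2 as 0.0111...
```

Both representations of x = 1/2 pin down the same value a = 1/3: the first fixes the lower endpoint there, the second fixes the upper endpoint.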

We can think of this algorithm as defining a sequence of closed intervals $\{[L_n, U_n]\}_{n=1}^{\infty}$.

Note that if $d_k = 0$, then the length of $[L_k, U_k]$ is a times the size of $[L_{k-1}, U_{k-1}]$, and if $d_k = 1$, then the length of $[L_k, U_k]$ is 1 − a times the size of $[L_{k-1}, U_{k-1}]$. From this, we can see that the lengths of the intervals tend to 0 as n → ∞, and since the intervals are closed and nested, there is exactly one point contained in all of them. This justifies our claim at the end of the algorithm. It is also worth noting that after k iterations of the algorithm, the largest value that $f_a(x)$ could take is $U_k$, which occurs when all the digits after the kth are 1. Similarly, the smallest value that $f_a(x)$ could take is $L_k$, which occurs when all the digits after the kth are 0.

Each dyadic rational has two distinct binary representations. One such representation ends in an infinite string of zeros, while the other ends in an infinite string of ones. For any dyadic x, we can assume either binary representation and our algorithm will generate the same value of $f_a(x)$. In the case of infinite zeros, $L_k$ will eventually be fixed at $f_a(x)$, while $U_k$ converges down to $f_a(x)$. In the case of infinite ones, $U_k$ will become fixed at $f_a(x)$, while $L_k$ converges up to $f_a(x)$. In the next section, we prove that $f_a(x) = \hat{f}_a(x)$, that is, that our algorithmic definition indeed defines DeRham’s function.

Before we get to that, we prove some other results about $f_a(x)$.

3.4 Analytic Properties

Now we will use our algorithmic definition of fa(x) to show that it is strictly increasing and singular. First we show that it is strictly increasing.

Theorem 14. For a ∈ (0, 1), the function fa(x) is strictly increasing.

Proof. Let x, y ∈ [0, 1] such that x < y. Consider the binary expansions of x and y and if either is dyadic, assume it is expressed in the terminating way. Since x < y, there exists some k such that x has a 0 in the kth digit and y has a 1. Let i be the smallest such k; this is the first digit at which the two expansions differ. Then, if $L_{i,x}, U_{i,x}$ and $L_{i,y}, U_{i,y}$ are the ith endpoints for x and y respectively, then $U_{i,x} = L_{i,y}$. We already know that $f_a(y) \ge L_{i,y}$. Also, $f_a(x) \le U_{i,x}$. However, if $f_a(x) = U_{i,x}$, then that would require x to have only ones after the ith digit. This would make x a dyadic rational, but expressed in the non-terminating way, and we assumed that x would be terminating if dyadic. So $f_a(x) < U_{i,x} = L_{i,y} \le f_a(y)$.

Now we show that fa(x) is singular. We start by showing continuity.

Theorem 15. For a ∈ (0, 1), the function fa(x) is continuous.

Proof. We claim that $f_a(x)$ is surjective. Let y ∈ [0, 1]. We now generate an infinite string $b_1 b_2 b_3 \dots$ of zeros and ones. We define sequences $\{w_n\}_{n=1}^{\infty}$, $\{u_n\}_{n=1}^{\infty}$ and $\{m_n\}_{n=1}^{\infty}$ as follows. Let $w_1 = 0$ and $u_1 = 1$. For each n ∈ N, let $m_n = (1 - a) w_n + a u_n$, as in Algorithm 1.

If $m_n \ge y$, then let $u_{n+1} = m_n$, let $w_{n+1} = w_n$ and let $b_n = 0$. Otherwise, if $m_n < y$, then let $u_{n+1} = u_n$, let $w_{n+1} = m_n$ and let $b_n = 1$. Now we can interpret $0.b_1 b_2 b_3 \dots$ as a number in [0, 1] expressed in binary. Let x take this value. By construction, $w_n \le y \le u_n$ for all n, and $w_n$, $u_n$ are exactly the endpoints that Algorithm 1 produces on input x, so $f_a(x) = y$. Since $f_a(x)$ is surjective and strictly increasing, Theorem 8 implies that $f_a(x)$ is continuous.
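The digit-by-digit search in this proof can be sketched as follows. The sketch assumes the midpoint rule of Algorithm 1, $m_n = (1 - a) w_n + a u_n$, and the convention that a digit 1 raises the lower endpoint while a digit 0 lowers the upper endpoint; the function name `digits_for` is hypothetical.

```python
from fractions import Fraction

def digits_for(y, a, n):
    """First n binary digits of a point x with f_a(x) = y, plus the
    endpoints (w, u) that bracket y after those n steps."""
    lower, upper, bits = Fraction(0), Fraction(1), []
    for _ in range(n):
        mid = (1 - a) * lower + a * upper
        if mid >= y:
            upper, bit = mid, 0   # y is in the lower part: digit 0
        else:
            lower, bit = mid, 1   # y is in the upper part: digit 1
        bits.append(bit)
    return bits, lower, upper

a = Fraction(1, 3)
bits, w, u = digits_for(Fraction(1, 3), a, 60)
```

For y = 1/3 and a = 1/3 the digits begin 0, 1, 1, 1, ..., which is x = 1/2 written in its non-terminating form, and the bracket [w, u] shrinks around y at every step.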

Before we prove that $f_a'(x) = 0$ when it exists, we need to show that for any function f, we can approximate f′(x) with secant lines. We do this with two lemmas.

Lemma 16. Let $\{x_n\}_{n=1}^{\infty}$ and $\{y_n\}_{n=1}^{\infty}$ and x be such that $x_n \to x$ and $y_n \to x$ as n → ∞. Let $\{r_n\}_{n=1}^{\infty}$ be a sequence such that $0 \le r_n \le 1$ for all n. Then $r_n x_n + (1 - r_n) y_n \to x$ as n → ∞.

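Proof. Let ε > 0. Let N be such that for all n ≥ N, $|x_n - x| < \varepsilon$ and $|y_n - x| < \varepsilon$. Then, for all n ≥ N,

\begin{align*}
|r_n x_n + (1 - r_n) y_n - x| &= |r_n (x_n - x) + (1 - r_n)(y_n - x)| \\
&\le r_n |x_n - x| + (1 - r_n) |y_n - x| \\
&< r_n \varepsilon + (1 - r_n) \varepsilon \\
&= \varepsilon.
\end{align*}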

Lemma 17. Let f be a function that is differentiable at x. Let $\{a_n\}_{n=1}^{\infty}$ and $\{b_n\}_{n=1}^{\infty}$ be such that $a_n \to x$ and $b_n \to x$ as n → ∞, where $a_n \le x \le b_n$ with $a_n \neq x$ or $b_n \neq x$ for all n. Then

\[
\frac{f(b_n) - f(a_n)}{b_n - a_n} \to f'(x)
\]

as n → ∞.

Proof. We start with the following algebraic manipulation:

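\begin{align*}
\frac{f(b_n) - f(a_n)}{b_n - a_n}
&= \frac{f(b_n) - f(x) + f(x) - f(a_n)}{b_n - a_n} \\
&= \frac{b_n - x}{b_n - a_n} \cdot \frac{f(b_n) - f(x)}{b_n - x} + \frac{x - a_n}{b_n - a_n} \cdot \frac{f(x) - f(a_n)}{x - a_n}.
\end{align*}

Then note that $\frac{f(b_n) - f(x)}{b_n - x} \to f'(x)$ and $\frac{f(x) - f(a_n)}{x - a_n} \to f'(x)$ as n → ∞. Also note that since $a_n \le x \le b_n$ for all n, $0 \le \frac{b_n - x}{b_n - a_n} \le 1$ for all n. Since $\frac{x - a_n}{b_n - a_n} = 1 - \frac{b_n - x}{b_n - a_n}$, we can apply Lemma 16 to get

\[
\frac{b_n - x}{b_n - a_n} \cdot \frac{f(b_n) - f(x)}{b_n - x} + \frac{x - a_n}{b_n - a_n} \cdot \frac{f(x) - f(a_n)}{x - a_n} \to f'(x)
\]

as n → ∞.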

Theorem 18. For $a \neq \frac{1}{2}$ and 0 ≤ x ≤ 1, we have $f_a'(x) = 0$ if it exists.

Proof. Let x ∈ [0, 1] and let 0.b1b2b3 ... be the binary expansion of x. Define

− xn = 0.b1b2 . . . bn000 ...,

+ xn = 0.b1b2 . . . bn111 ....

Now, note that for any n we have the following:

− Ln = fa(xn ) (8)

+ Un = fa(xn ) (9)

− + xn ≤ x ≤ xn (10) 1n x+ − x− = . (11) n n 2

We can use the above to approximate the derivative of $f_a$ at $x$ with a secant line. We define
\[
d_n(x) = \frac{f_a(x_n^+) - f_a(x_n^-)}{x_n^+ - x_n^-}.
\]

The value $d_n(x)$ is the slope of a secant line through the graph of $f_a$ near $x$. So by Lemma 17, if $f_a'(x)$ exists, then $d_n(x) \to f_a'(x)$ as $n \to \infty$. Applying (8), (9) and (11) to our definition of $d_n(x)$ gives
\[
d_n(x) = \frac{U_n - L_n}{\left(\frac{1}{2}\right)^n}.
\]
Now, for each $n$, define $k_n(x) = k_n$ to be the number of 1s in the first $n$ binary digits of $x$.

As we noted earlier, if the $k$th binary digit $b_k$ of $x$ is 0, then the length of $[L_k, U_k]$ is $a$ times the length of $[L_{k-1}, U_{k-1}]$, and if $b_k = 1$, then the length of $[L_k, U_k]$ is $1 - a$ times the length of $[L_{k-1}, U_{k-1}]$. Therefore, since the length of $[L_0, U_0] = [0, 1]$ is 1, the length of $[L_n, U_n]$ must be $(1 - a)^{k_n} a^{n - k_n}$. So

\[
f_a(x_n^+) - f_a(x_n^-) = U_n - L_n = (1 - a)^{k_n} a^{n - k_n}, \tag{12}
\]
which gives

\[
d_n(x) = \frac{(1 - a)^{k_n} a^{n - k_n}}{\left(\frac{1}{2}\right)^n} = \frac{(1 - a)^{k_n} a^{n - k_n}}{\left(\frac{1}{2}\right)^{k_n} \left(\frac{1}{2}\right)^{n - k_n}} = (2 - 2a)^{k_n} (2a)^{n - k_n}. \tag{13}
\]

Now that we have a concrete expression for the value of each $d_n(x)$, we can start to figure out the limit of the sequence $\{d_n(x)\}_{n=1}^{\infty}$. Recall that this is of interest because if $f_a'(x)$ exists, then its value must equal the limit of the sequence in (13).

Now assume that $d_n(x) \to c$ as $n \to \infty$ for some real constant $c$. If $c \neq 0$, then we must have $\frac{d_{n+1}(x)}{d_n(x)} \to \frac{c}{c} = 1$. We will now show that the sequence $\left\{\frac{d_{n+1}(x)}{d_n(x)}\right\}_{n=1}^{\infty}$ cannot converge to 1. From this, it will follow that the sequence $\{d_n(x)\}_{n=1}^{\infty}$ either does not converge, or that it converges to 0. Thus the derivative of $f_a$ must equal 0 wherever it exists. So now we will show that $\frac{d_{n+1}(x)}{d_n(x)}$ does not tend to 1. Note that

\[
\frac{d_{n+1}(x)}{d_n(x)} = \frac{(2 - 2a)^{k_{n+1}} (2a)^{n + 1 - k_{n+1}}}{(2 - 2a)^{k_n} (2a)^{n - k_n}} = (2 - 2a)^{k_{n+1} - k_n} (2a)^{1 + k_n - k_{n+1}}.
\]

Since $k_i$ denotes the number of 1s in the first $i$ digits of $x$, we must have either $k_{n+1} = k_n$ or $k_{n+1} = k_n + 1$. The first case gives $\frac{d_{n+1}(x)}{d_n(x)} = 2a$ and the second case gives $\frac{d_{n+1}(x)}{d_n(x)} = 2(1 - a)$. So each term in our sequence of ratios is either $2a$ or $2(1 - a)$. So if this sequence is to converge, it must converge to $2a$ or to $2(1 - a)$. Neither one of these values is 1 since $a \neq \frac{1}{2}$. So if the derivative of $f_a$ exists at a point $x$, it must be 0.
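The behaviour of the secant slopes in (13) can be checked numerically. The sketch below is illustrative only: `secant_slopes` is a hypothetical name, the digit list is an arbitrary finite prefix of a binary expansion, and floating-point products stand in for the exact values $(2 - 2a)^{k_n}(2a)^{n - k_n}$.

```python
def secant_slopes(bits, a):
    """Return [d_1(x), ..., d_n(x)] where d_n = (2 - 2a)^{k_n} (2a)^{n - k_n}."""
    slopes = []
    d = 1.0
    for b in bits:
        # Each new digit multiplies the slope by 2(1 - a) for a 1 and by 2a for a 0.
        d *= (2 - 2 * a) if b == 1 else (2 * a)
        slopes.append(d)
    return slopes
```

For the alternating digits `[0, 1] * 20` and `a = 0.3`, consecutive ratios take only the two values $2a = 0.6$ and $2(1 - a) = 1.4$, never 1, while the slopes themselves shrink like $0.84^{n/2}$.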

Corollary 19. For $a \neq \frac{1}{2}$, $f_a'(x) = 0$ almost everywhere.

Proof. We get this from the previous theorem and Lebesgue’s theorem (Theorems 18 and 11).

Now we show that our definition of DeRham’s function is equivalent to the probability definition.

Theorem 20. For all x, fa(x) = P (R < x).

Proof. First, we show by induction that for all $x$ and for all $n$, $f_a(x_n^+) = P(R < x_n^+)$ and $f_a(x_n^-) = P(R < x_n^-)$. For the base case, note that for all $x$, we have $x_0^+ = 1$ and $x_0^- = 0$, and $P(R < 1) = 1 = f_a(1)$ and $P(R < 0) = 0 = f_a(0)$. For the inductive step, assume that $f_a(x_n^+) = P(R < x_n^+)$ and $f_a(x_n^-) = P(R < x_n^-)$. If $b_{n+1} = 0$ (the $(n+1)$th digit of $x$), then $x_n^- = x_{n+1}^-$ so we're done by the induction hypothesis. If $b_{n+1} = 1$, then

\[
f_a(x_{n+1}^-) = (1 - a)L_n + aU_n = L_n + a(U_n - L_n) = f_a(x_n^-) + a\left(f_a(x_n^+) - f_a(x_n^-)\right) = P(R < x_n^-) + aP(x_n^- \leq R < x_n^+).
\]

This represents the probability that $R < x_n^-$, or that $x_n^- \leq R < x_n^+$ and the $(n+1)$th digit of $R$ is 0. Then we are done because $R < x_{n+1}^-$ if and only if these conditions are met. Showing the inductive step for $f_a(x_{n+1}^+)$ is similar.

If $x$ is a dyadic rational, then $x = x_n^-$ for some $n$ so we are done. Otherwise, if $x$ is not a dyadic rational, then for all $n$, we have $x_n^- < x < x_n^+$. Since $f_a$ is increasing, it follows that $f_a(x_n^-) < f_a(x) < f_a(x_n^+)$. Applying the above gives $P(R < x_n^-) < f_a(x) < P(R < x_n^+)$. Since $x_n^- < x < x_n^+$, we also have $P(R < x_n^-) < P(R < x) < P(R < x_n^+)$. As stated earlier, the sequences defined by $P(R < x_n^-)$ and $P(R < x_n^+)$ both converge to the same real number. Since the above inequalities are true for all $n$, this means that $P(R < x) = f_a(x)$.
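For a dyadic rational $x$ the identity of Theorem 20 can be checked by brute force. The sketch below rests on an assumption not restated in this section: that the binary digits of $R$ are independent, each equal to 0 with probability $a$ and to 1 with probability $1 - a$ (consistent with the factor $a$ attached to "the $(n+1)$th digit of $R$ is 0" in the proof). The names `f_a` and `prob_less_than` are hypothetical.

```python
from itertools import product


def f_a(bits, a):
    """Evaluate f_a at the dyadic point x = 0.b1 b2 ... bn (binary)."""
    lo, hi = 0.0, 1.0
    for b in bits:
        m = (1 - a) * lo + a * hi
        if b == 0:
            hi = m
        else:
            lo = m
    return lo


def prob_less_than(bits, a):
    """P(R < x) for x = 0.b1 ... bn, summing over the first n digits of R."""
    n = len(bits)
    target = tuple(bits)
    total = 0.0
    for r in product((0, 1), repeat=n):
        # For n-digit prefixes, lexicographic order agrees with numeric order,
        # and a prefix equal to the digits of x forces R >= x.
        if r < target:
            zeros = r.count(0)
            total += a ** zeros * (1 - a) ** (n - zeros)
    return total
```

The enumeration is exponential in the number of digits, so it is only meant for short digit strings.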

It is natural to wonder where $f_a'(x)$ exists and where it does not exist. We start off by showing that this function is not differentiable at the dyadic rationals.

Theorem 21. Let $x \in [0, 1]$ be a dyadic rational. Then $f_a'(x)$ does not exist.

Proof. Since $x$ is a dyadic rational, it has a binary representation that ends with an infinite string of 0s. Let $0.b_1 b_2 \ldots b_i 000\ldots$ be this representation. Recall the definition of $d_n(x)$ from the proof of Theorem 18 and recall that if $f_a'(x)$ exists, then $d_n(x) \to f_a'(x)$. Also

\[
d_n(x) = (2 - 2a)^{k_n} (2a)^{n - k_n}.
\]

Now, assume that $n \geq i$. Recall that we are working with the binary representation of $x$ such that each binary digit of $x$ beyond the $i$th digit is a 0. This means that $k_n = k_i$, which gives

\[
d_n(x) = (2 - 2a)^{k_n} (2a)^{n - k_n} = (2 - 2a)^{k_i} (2a)^{n - k_i} = (2 - 2a)^{k_i} (2a)^{i - k_i} (2a)^{n - i} = d_i(x)(2a)^{n - i}.
\]

So if $a > \frac{1}{2}$, we have $d_n(x) \to \infty$ as $n \to \infty$, implying that the derivative doesn’t exist. Note that if $a < \frac{1}{2}$ then $d_n(x) \to 0$ as $n \to \infty$. However, we can define a sequence analogous to $d_n(x)$ that is based on the other binary representation of $x$ (the representation that ends in an infinite string of 1s) that will tend to infinity when $a < \frac{1}{2}$.

Let $0.e_1 e_2 \ldots e_j 111\ldots$ be the binary representation of $x$ such that each binary digit of $x$ beyond the $j$th digit is a 1. We now define $d_n(x)$ exactly as before. Assume $n \geq j$. This means that $j - k_j = n - k_n$, which gives

\[
d_n(x) = (2 - 2a)^{k_n} (2a)^{n - k_n} = (2 - 2a)^{k_n - k_j} (2 - 2a)^{k_j} (2a)^{j - k_j} = (2 - 2a)^{n - j} d_j(x).
\]

So if $2 - 2a > 1$, or equivalently, if $a < \frac{1}{2}$, then $d_n(x) \to \infty$ as $n \to \infty$. It is also interesting to note that for this way of defining $d_n(x)$, if $a > \frac{1}{2}$ then $d_n(x) \to 0$ as $n \to \infty$.

3.5 Points at Which $f_a'(x)$ Exists

We will now identify a subset of [0, 1] where fa is differentiable. This will have to do with what we will call the density of zeros and ones in x [6]. Heuristically, the density of zeros in x is the probability that a digit chosen at random from the binary representation of x is a zero. The density of ones in x is similarly defined. Since we already know that fa is not differentiable at any dyadic rational, we will not bother with the dyadic rationals. This means that in defining the density of zeros and ones in x, we will not need to worry about non-unique binary representations. We introduce the following notation

Definition 22. Given an $x \in [0, 1]$ such that $x$ is not a dyadic rational, we define $p_n$ to be the position of the $n$th zero in the binary representation of $x$. In addition, we define the two functions

\[
D_0(x) = \lim_{n \to \infty} \frac{n}{p_n}, \qquad D_1(x) = 1 - D_0(x).
\]

There do exist numbers for which the densities of 0s and 1s do not exist.

Theorem 23. Let $a \in \left(0, \frac{1}{2}\right) \cup \left(\frac{1}{2}, 1\right)$. Let $x \in [0, 1]$ such that $x$ is not a dyadic rational and $D_0(x)$ and $D_1(x)$ both exist with $0 < D_0(x), D_1(x) < 1$ and $a^{D_0(x)} (1 - a)^{D_1(x)} < \frac{1}{2}$. Then $f_a'(x) = 0$.

Before we prove this, we will explore what it means for some $x \in [0, 1]$ to satisfy $a^{D_0(x)} (1 - a)^{D_1(x)} < \frac{1}{2}$. Almost all numbers have equal density of zeros and ones in their binary representation (see [7]), so $D_0(x) = D_1(x) = \frac{1}{2}$ is true for almost all $x \in [0, 1]$. Note that if $D_0(x) = D_1(x) = \frac{1}{2}$, then $a^{D_0(x)} (1 - a)^{D_1(x)} = \sqrt{a}\sqrt{1 - a}$, which is always less than $\frac{1}{2}$. To see this, note that $a(1 - a) = \frac{1}{4} - \left(a - \frac{1}{2}\right)^2 < \frac{1}{4}$ because $a \neq \frac{1}{2}$. So, from this it follows that almost all $x \in [0, 1]$ satisfy $a^{D_0(x)} (1 - a)^{D_1(x)} < \frac{1}{2}$. So by Theorem 23, $f_a'(x)$ exists almost everywhere. Note that we already proved this result a different way (see Corollary 19).

Figure 5: Values of $a$ and $D_1(x)$ that satisfy the hypothesis of Theorem 23

Now, we look at how a given value of $a$ affects the densities of 0s and 1s of those $x$ where the derivative of $f_a$ exists. By replacing $D_0(x)$ with $1 - D_1(x)$ in $a^{D_0(x)} (1 - a)^{D_1(x)} < \frac{1}{2}$, we get $a^{1 - D_1(x)} (1 - a)^{D_1(x)} < \frac{1}{2}$. Using exponent rules and taking the natural logarithm of both sides gives
\[
D_1(x) \ln \frac{1 - a}{a} < -\ln 2a.
\]

Now, if $a < \frac{1}{2}$, then $\ln \frac{1 - a}{a} > 0$, which gives

\[
D_1(x) < \frac{-\ln 2a}{\ln \frac{1 - a}{a}}.
\]

Otherwise, if $a > \frac{1}{2}$, then $\ln \frac{1 - a}{a} < 0$, which gives

\[
D_1(x) > \frac{-\ln 2a}{\ln \frac{1 - a}{a}}.
\]

Figure 5 shows the graph of this function of $a$, bounding densities $D_1$ where the hypothesis of Theorem 23 holds. Points in the shaded area represent combinations of $a$ and $D_1(x)$ such that $f_a'(x)$ exists and equals zero. No points along the borders are shaded. All points with $a \neq \frac{1}{2}$ and $D_1(x) = \frac{1}{2}$ are shaded, as expected by the earlier discussion.
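The boundary curve and the hypothesis of Theorem 23 are easy to evaluate numerically. The sketch below is only an illustration; the names `d1_bound` and `hypothesis_holds` are hypothetical, and floating-point logarithms are used in place of exact values.

```python
import math


def d1_bound(a):
    """Boundary value -ln(2a) / ln((1 - a) / a) for the density of ones (a != 1/2)."""
    return -math.log(2 * a) / math.log((1 - a) / a)


def hypothesis_holds(a, d1):
    """Check a^{D0} (1 - a)^{D1} < 1/2 with D0 = 1 - D1."""
    return a ** (1 - d1) * (1 - a) ** d1 < 0.5
```

For `a = 0.3` the bound is roughly 0.603, so a density of ones of 0.5 satisfies the hypothesis while 0.7 does not; for `a = 0.8` the inequality flips and densities above roughly 0.339 satisfy it.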

We could also show that if $a^{D_0(x)} (1 - a)^{D_1(x)} > \frac{1}{2}$, then $f_a'(x)$ does not exist since the slope approaches $\infty$. This and some borderline cases where $a^{D_0(x)} (1 - a)^{D_1(x)} = \frac{1}{2}$ are treated in [6]. We now present an original proof of Theorem 23.

Proof. We only show that
\[
\lim_{h \to 0^+} \frac{f_a(x + h) - f_a(x)}{h} = 0.
\]
The proof for the limit coming from the negative side is similar. Let $\varepsilon > 0$. We will now show that we can choose an $h$ such that $\frac{f_a(x + h) - f_a(x)}{h} < \varepsilon$. Without loss of generality, assume that
\[
\varepsilon < \frac{1}{2a^{D_0(x)} (1 - a)^{D_1(x)}} - 1.
\]

Now, note that the sequence $\frac{p_{n+1}}{p_n}$ converges to 1. This is true because
\[
\frac{p_{n+1}}{p_n} = \frac{p_{n+1}}{n + 1} \cdot \frac{n + 1}{p_n} = \frac{p_{n+1}}{n + 1} \cdot \left(\frac{n}{p_n} + \frac{1}{p_n}\right),
\]
which converges to $\frac{1}{D_0(x)} \cdot D_0(x) = 1$. Note that for any $n$, $p_{n+1} > p_n$, which means that $\frac{p_{n+1}}{p_n} > 1$. This means that we can pick a positive integer $N_1$ such that for all $n \geq N_1$, we have $\frac{p_{n+1}}{p_n} < 1 + \log_2 \sqrt[3]{\varepsilon + 1}$.

Note that the sequence $\left\{1 - \frac{n}{p_n}\right\}_{n=1}^{\infty}$ converges to $D_1(x)$, which means that we can pick a positive integer $N_2$ such that for all $n \geq N_2$, we have $1 - \frac{n}{p_n} > D_1(x) + \log_{1 - a} \sqrt[3]{\varepsilon + 1}$ (since $0 < 1 - a < 1$, the value of $\log_{1 - a} \sqrt[3]{\varepsilon + 1}$ is negative).

Note that the sequence $\left\{\frac{n - 1}{p_n}\right\}_{n=1}^{\infty}$ converges to $D_0(x)$, which means that we can pick a positive integer $N_3$ such that for all $n \geq N_3$, $\frac{n - 1}{p_n} > D_0(x) + \log_a \sqrt[3]{\varepsilon + 1}$ (as before, since $0 < a < 1$, the value of $\log_a \sqrt[3]{\varepsilon + 1}$ is negative).

Since $x$ is not a dyadic rational, the binary representation of $x$ must have an infinite number of zeros. Thus, $p_n$ is unbounded. Also, since $\varepsilon < \frac{1}{2a^{D_0(x)} (1 - a)^{D_1(x)}} - 1$, we have $2a^{D_0(x)} (1 - a)^{D_1(x)} (1 + \varepsilon) < 1$. This means that we can choose a positive integer $N_4$ such that $\left(2a^{D_0(x)} (1 - a)^{D_1(x)} (1 + \varepsilon)\right)^{p_n} < \varepsilon$ for all $n \geq N_4$.

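Now, let $N = \max\{N_1, N_2, N_3, N_4\}$ and pick an $h$ satisfying

\[
0 < h < 2^{-p_N}. \tag{14}
\]

Fix an $n$ such that $x_{p_n}^+ < x + h \leq x_{p_n - 1}^+$. Then, we have

\[
0 < x_{p_n}^+ - x < h. \tag{15}
\]

Note that $x$ and $x_{p_n}^+$ agree out to $p_{n+1} - 1$ binary places, and that when they differ at the $p_{n+1}$th place, $x$ has digit 0 and $x_{p_n}^+$ has digit 1. So $x_{p_n}^+ - x > 2^{-p_{n+1}}$. Then, by (15), $h > 2^{-p_{n+1}}$. Putting this together with (14) gives $p_{n+1} > p_N$. This inequality says that the $(n+1)$th 0 lies further out than the $N$th 0, which implies $n \geq N$. Since $f_a$ is strictly increasing, (10) gives $f_a(x + h) - f_a(x) < f_a(x_{p_n - 1}^+) - f_a(x_{p_n - 1}^-)$. Using this and applying (12), we have

\[
\frac{f_a(x + h) - f_a(x)}{h} < \frac{f_a(x_{p_n - 1}^+) - f_a(x_{p_n - 1}^-)}{2^{-p_{n+1}}} = \frac{(1 - a)^{p_n - n} a^{n - 1}}{2^{-p_{n+1}}} = \left(\frac{\left((1 - a)^{p_n - n} a^{n - 1}\right)^{1/p_n}}{2^{-p_{n+1}/p_n}}\right)^{p_n} = \left(2^{\frac{p_{n+1}}{p_n}} (1 - a)^{1 - \frac{n}{p_n}} a^{\frac{n - 1}{p_n}}\right)^{p_n}.
\]

Now, since $n \geq N$, we can apply the inequalities from the definitions of $N_1$, $N_2$, $N_3$ and $N_4$:

\[
\begin{aligned}
\left(2^{\frac{p_{n+1}}{p_n}} (1 - a)^{1 - \frac{n}{p_n}} a^{\frac{n - 1}{p_n}}\right)^{p_n}
&< \left(2^{1 + \log_2 \sqrt[3]{\varepsilon + 1}} \cdot (1 - a)^{D_1(x) + \log_{1 - a} \sqrt[3]{\varepsilon + 1}} \cdot a^{D_0(x) + \log_a \sqrt[3]{\varepsilon + 1}}\right)^{p_n} \\
&= \left(2 (1 - a)^{D_1(x)} a^{D_0(x)} (\varepsilon + 1)\right)^{p_n} < \varepsilon.
\end{aligned}
\]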

4 Minkowski’s Question-Mark Function

We now discuss Minkowski’s Question-Mark function, denoted ?(x). Like DeRham’s function, this is strictly increasing yet it has derivative zero almost everywhere. There are two common ways to define ?(x) (both produce the same result) and they are detailed in [2]. We propose an alternative way to define ?(x).

4.1 Algorithmic Definition of ?(x)

For any rational number q, we define num(q) and denom(q) to be the numerator and denominator of q when it is expressed in lowest terms. We define denom(0) to be 1.

Algorithm 2 Given input x ∈ [0, 1], generate the base 2 representation of ?(x)
  Binary Representation ← 0.
  w ← 0
  u ← 1
  for n = 1, 2, 3, ... do
    m ← (num(w) + num(u)) / (denom(w) + denom(u))
    if x ≥ m then
      append a 1 to the right of Binary Representation
      w ← m
    else
      append a 0 to the right of Binary Representation
      u ← m
    end if
  end for
  return Binary Representation
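Algorithm 2 can be sketched in Python with exact rational arithmetic. This is a minimal illustration, not a definitive implementation: the name `question_mark_bits` and the cutoff parameter `n` (the loop in Algorithm 2 itself never terminates) are assumptions introduced here. Python's `Fraction` keeps values in lowest terms and stores 0 as 0/1, which matches the conventions for num and denom above.

```python
from fractions import Fraction


def question_mark_bits(x, n):
    """First n binary digits of ?(x), following the steps of Algorithm 2."""
    w, u = Fraction(0), Fraction(1)
    bits = []
    for _ in range(n):
        # Mediant of the current endpoints.
        m = Fraction(w.numerator + u.numerator, w.denominator + u.denominator)
        if x >= m:
            bits.append(1)
            w = m
        else:
            bits.append(0)
            u = m
    return bits
```

For example, `question_mark_bits(Fraction(2, 5), 6)` produces the digits 0, 1, 1, 0, 0, 0, that is, the binary expansion 0.011000 of 3/8.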

We now show the steps this algorithm would perform in order to compute $?\left(\frac{2}{5}\right)$.

Table 1: Computation of $?\left(\frac{2}{5}\right) = \frac{3}{8}$

[w, u]          m       test        ans
[0/1, 1/1]      1/2     x < 1/2     0.0
[0/1, 1/2]      1/3     x ≥ 1/3     0.01
[1/3, 1/2]      2/5     x ≥ 2/5     0.011
[2/5, 1/2]      3/7     x < 3/7     0.0110
[2/5, 3/7]      5/12    x < 5/12    0.01100
[2/5, 5/12]     7/17    x < 7/17    0.011000
...

As you can see, the value in the rightmost column will continue to be $0.011000\ldots0_2 = \frac{3}{8}$, since $x < m$ is always true from the fourth query onward, once $x$ has become the lower endpoint. Figure 6 shows the graph of ?(x).

Figure 6: Graph of ?(x)

As stated earlier, ?(x) is a strictly increasing singular function. While we will not prove this in this thesis, we will show how this algorithmic definition is equivalent to the well-established definition. In order to do this, we rephrase some of the details of the algorithm.

Algorithm 3 For some x, generate ?(x)
  ans ← 0
  w_1 ← 0
  u_1 ← 1
  m_0 ← 1
  for n = 1, 2, 3, ... do
    m_n ← (num(w_n) + num(u_n)) / (denom(w_n) + denom(u_n))
    if x ≥ m_n then
      ans ← ans + 1/2^n
      w_{n+1} ← m_n
      u_{n+1} ← u_n
    else
      u_{n+1} ← m_n
      w_{n+1} ← w_n
    end if
  end for
  return ans
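A truncated version of Algorithm 3 might look as follows. This is a sketch under stated assumptions: the name `question_mark` and the `iterations` cutoff are hypothetical, since the algorithm as written runs forever. For rational $x$ the digits are eventually all 0 (see Corollary 34 below), so a modest cutoff already returns the exact value in the cases tested here.

```python
from fractions import Fraction


def question_mark(x, iterations=64):
    """Partial sum of Algorithm 3 after the given number of loop iterations."""
    ans = Fraction(0)
    w, u = Fraction(0), Fraction(1)
    for n in range(1, iterations + 1):
        m = Fraction(w.numerator + u.numerator, w.denominator + u.denominator)
        if x >= m:
            ans += Fraction(1, 2 ** n)  # the n-th binary digit of ?(x) is 1
            w = m
        else:
            u = m
    return ans
```

With this sketch, `question_mark(Fraction(2, 5))` evaluates to `Fraction(3, 8)`, in agreement with Table 1.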

4.2 Notes and Remarks

An mn created in this algorithm will be referred to as a mediant and any wn, un will be called endpoints. The subscripts have no algorithmic importance, but they enable us to analyze the algorithm more easily. Note that in all cases, m0 = 1. In Corollary 29 which follows, we show that all fractions that define mn (and hence wn and un) are in reduced form so numerators and denominators simply add as shown and cancellation is not relevant.

Remark 24. At every iteration of the loop, wn, un and mn are rational. Furthermore, the values of denom(wn), denom(un) are monotone increasing and denom(mn) is strictly increasing with each iteration.

Remark 25. For every n ≥ 1, denom(mn) > denom(wn) and denom(mn) > denom(un) and wn < mn < un.

Remark 26. The inequality wn ≤ x ≤ un is always true.

4.3 Mediants, Endpoints and an Algebraic Property

We show that ? maps every rational number to a dyadic rational number. To do this, we show that when using the algorithm to calculate ?(x) for some rational x, then for some n, $m_n = x$. In order to get to this point, we need to prove a few lemmas. In this context, we will assume that $a$ and $a'$ are nonnegative integers and $b$ and $b'$ are positive integers.

Lemma 27. If $\frac{a}{b} < \frac{a'}{b'}$ then $\frac{a}{b} < \frac{a + a'}{b + b'} < \frac{a'}{b'}$.

Proof. Showing that $\frac{a}{b} < \frac{a + a'}{b + b'}$:

\[
\frac{a}{b} = \frac{\frac{b + b'}{b} a}{b + b'} = \frac{a + \frac{a}{b} b'}{b + b'} < \frac{a + \frac{a'}{b'} b'}{b + b'} = \frac{a + a'}{b + b'}.
\]

Showing that $\frac{a'}{b'} > \frac{a + a'}{b + b'}$:

\[
\frac{a'}{b'} = \frac{\frac{b' + b}{b'} a'}{b + b'} = \frac{a' + \frac{a'}{b'} b}{b + b'} > \frac{a' + \frac{a}{b} b}{b + b'} = \frac{a + a'}{b + b'}.
\]

Lemma 28. If $\frac{a}{b}$ and $\frac{a'}{b'}$ are the endpoints of an interval in an iteration of the algorithm to find ?(x), then $a'b - b'a = 1$.

Proof. We induct on the iteration through the loop. For the base case we look at the first iteration through the loop. Here, our endpoints are $\frac{a}{b} = \frac{0}{1}$ and $\frac{a'}{b'} = \frac{1}{1}$. Then $a'b - b'a = 1 \cdot 1 - 1 \cdot 0 = 1$. For the inductive step, let $\frac{a}{b}$, $\frac{a'}{b'}$ be the endpoints at iteration $n - 1$ through the loop and assume $a'b - b'a = 1$ holds. Then, at iteration $n$, the endpoints are either $\frac{a}{b}$, $\frac{a + a'}{b + b'}$ or $\frac{a + a'}{b + b'}$, $\frac{a'}{b'}$. In the first case, we have

\[
(a' + a)b - a(b' + b) = a'b + ab - ab' - ab = a'b - b'a = 1
\]
and in the second we find

\[
a'(b + b') - b'(a + a') = a'b + a'b' - b'a - b'a' = a'b - b'a = 1.
\]

Corollary 29. As generated by the algorithm, each endpoint is in reduced form.

Proof. Let $\frac{a}{b}$ and $\frac{a'}{b'}$ be endpoints. Let $d, m, n$ be nonnegative integers such that $a = nd$ and $b = md$. So $d$ is a divisor of $a$ and $b$. By Lemma 28 we have

\[
a'b - b'a = 1, \qquad a'md - b'nd = 1, \qquad a'm - b'n = \frac{1}{d}.
\]

If $d > 1$, then $\frac{1}{d}$ would not be an integer. Since $a', m, b', n$ are all integers, $\frac{1}{d}$ must be an integer, so $d = 1$ and the greatest common divisor of $a$ and $b$ is 1. The proof that $\frac{a'}{b'}$ is in reduced form is similar.
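Lemma 28 and Corollary 29 can be checked along a run of the algorithm by tracking numerators and denominators as raw integers, so that no automatic reduction hides a common factor. The generator name `endpoint_pairs` and the iteration cutoff are hypothetical choices made for this sketch.

```python
from fractions import Fraction
from math import gcd


def endpoint_pairs(x, iterations):
    """Yield ((a, b), (a2, b2)): lower and upper endpoints as unreduced integer pairs."""
    a, b, a2, b2 = 0, 1, 1, 1
    for _ in range(iterations):
        yield (a, b), (a2, b2)
        ma, mb = a + a2, b + b2  # numerators and denominators simply add
        if x >= Fraction(ma, mb):
            a, b = ma, mb
        else:
            a2, b2 = ma, mb
```

Every pair produced this way should satisfy $a'b - b'a = 1$, and each endpoint should already be in lowest terms.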

Lemma 30. If $\frac{a}{b}$ and $\frac{a'}{b'}$ are the endpoints of an interval in an iteration of the algorithm to calculate ?(x) for some x, then $\frac{a + a'}{b + b'}$ is the unique fraction between $\frac{a}{b}$ and $\frac{a'}{b'}$ with smallest denominator.

Proof. From Lemma 27 we have $\frac{a}{b} < \frac{a + a'}{b + b'} < \frac{a'}{b'}$. Consider any fraction $\frac{x}{y}$ ($x, y$ integers) such that $\frac{a}{b} < \frac{x}{y} < \frac{a'}{b'}$. Then $a'y - b'x > 0$, and since $a', b', x, y$ are integers, $a'y - b'x \geq 1$. Similarly, $bx - ay \geq 1$. Then

\[
\frac{a'}{b'} - \frac{a}{b} = \frac{a'}{b'} - \frac{x}{y} + \frac{x}{y} - \frac{a}{b} = \frac{a'y - b'x}{b'y} + \frac{bx - ay}{by} \geq \frac{1}{b'y} + \frac{1}{by} = \frac{b + b'}{bb'y}. \tag{16}
\]

From this and from Lemma 28 it follows that $\frac{b + b'}{bb'y} \leq \frac{a'b - b'a}{bb'} = \frac{1}{bb'}$, and thus $y \geq b + b'$. If $y > b + b'$ then $\frac{x}{y}$ does not have the smallest denominator among fractions between $\frac{a}{b}$ and $\frac{a'}{b'}$, so $y = b + b'$. We now show that $\frac{a + a'}{b + b'}$ is the unique value for $\frac{x}{y}$. With $y = b + b'$,

\[
\frac{a'}{b'} - \frac{a}{b} = \frac{a'b - b'a}{bb'} = \frac{1}{bb'} = \frac{y}{bb'y} = \frac{b + b'}{bb'y},
\]
and thus the inequality in (16) becomes an equality. This gives us $a'y - b'x = 1$ and $bx - ay = 1$. Solving for $x$ and $y$ gives $x = a + a'$ and $y = b + b'$, and these solutions are unique.
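Lemma 30 can be illustrated by brute force on a pair of endpoints taken from Table 1. The function name `smallest_denominator_between` is hypothetical, and the search assumes the endpoints lie in $[0, 1]$.

```python
from fractions import Fraction


def smallest_denominator_between(lo, hi):
    """All fractions strictly between lo and hi whose denominator is as small as possible."""
    q = 1
    while True:
        # Any hit at denominator q is already reduced: a reduced form with a
        # smaller denominator would have been found in an earlier pass.
        found = [Fraction(p, q) for p in range(q + 1) if lo < Fraction(p, q) < hi]
        if found:
            return found
        q += 1
```

For the endpoints 2/5 and 3/7 the search returns the single fraction 5/12, which is exactly the mediant $\frac{2 + 3}{5 + 7}$.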

Lemma 31. Let x ∈ [0, 1]. If {wn} and {un} are respectively the sequence of lower and upper endpoints generated in using the algorithm to calculate ?(x), then un − wn → 0 as n → ∞.

Proof. Letting $u_n = \frac{a'}{b'}$ and $w_n = \frac{a}{b}$, we have $u_n - w_n = \frac{a'b - b'a}{bb'}$. By Lemma 28, this equals $\frac{1}{bb'} = \frac{1}{\mathrm{denom}(w_n)\,\mathrm{denom}(u_n)}$, which converges to 0 by Remark 24.

Theorem 32. Let x ∈ [0, 1]. If {mn} is the sequence of mediants generated when we use the algorithm to calculate ?(x), then mn → x as n → ∞.

Proof. Let $\{w_n\}$ and $\{u_n\}$ be the sequences of lower and upper endpoints respectively. Let $\varepsilon > 0$. By Lemma 31 we can choose $N$ such that for all $n \geq N$, $u_n - w_n < \varepsilon$. By Remark 26 and Lemma 27, both $x$ and $m_n$ lie in $[w_n, u_n]$, so $|x - m_n| \leq u_n - w_n < \varepsilon$.

Theorem 33. Let $\frac{x}{y} \in (0, 1)$ be a fraction in lowest terms. Then $\frac{x}{y}$ becomes a mediant at some point in the calculation of $?\left(\frac{x}{y}\right)$.

Proof. Let $m_1, m_2, \ldots$ be the sequence of mediants used to calculate $?\left(\frac{x}{y}\right)$. Let $m_1, m_2, \ldots, m_n$ be the finite subsequence of mediants with denominator strictly less than $y$. We know such a subsequence exists by Remark 24. Consider the iteration of the loop where $m_n$ becomes one of the endpoints; without loss of generality, assume $m_n$ is the greater endpoint and denote the smaller endpoint $w$. Then $m_{n+1}$ is the mediant of $w$ and $m_n$ and $\mathrm{denom}(m_{n+1}) \geq y$. By Lemma 30, $m_{n+1}$ is also the unique fraction between $w$ and $m_n$ with smallest denominator, and by Remark 26, $w \leq \frac{x}{y} \leq m_n$, but equality is impossible because all mediants up to $m_n$ (including $w$) have smaller denominator than $y$. So $\frac{x}{y}$ must be the mediant of $w$ and $m_n$.

Corollary 34. If x ∈ [0, 1] is rational then ?(x) is a dyadic rational.

4.4 Analytic Properties

Theorem 35. ? is strictly increasing.

Proof. Let $a, b \in [0, 1]$ such that $a < b$. I will show that this implies $?(a) < ?(b)$.

Remember that we calculate ?(a) through an iterative process where for each $n \in \mathbb{N}$ we have some $a_n$, and then we check if $a \geq a_n$, then generate $a_{n+1}$ accordingly and repeat. We do the same process to calculate ?(b). This gives us two sequences $\{a_n\}$, $\{b_n\}$. Note that $a_n \to a$ and $b_n \to b$. This means that if $a_n = b_n$ for every $n$, then $a = b$, which contradicts our assumption that $a < b$. So we know that there exists some point at which the sequences $\{a_n\}$ and $\{b_n\}$ begin to differ. Let $N$ be the smallest natural number such that $a_{N+1} \neq b_{N+1}$. So for all $n$ such that $1 \leq n \leq N$, $a_n = b_n$. Since the values of $a_{n+1}$ and $b_{n+1}$ are completely determined by whether $a \geq a_n$ and whether $b \geq b_n$ respectively, we can conclude that for any $n$ such that $1 \leq n \leq N - 1$, we have $a \geq a_n = b_n$ if and only if $b \geq a_n = b_n$. Since we generate the binary representations for ?(a) and ?(b) based only on whether or not $a \geq a_n$ and $b \geq b_n$ respectively, we know that the binary representations for ?(a) and ?(b) are the same for the first $N - 1$ digits right of the binary point. Since $a_{N+1} \neq b_{N+1}$ we know that exactly one of $a \geq a_N = b_N$ and $b \geq a_N = b_N$ is false. Since $b > a$, $a \geq a_N = b_N$ must be false and thus the $N$th binary digit of ?(a) is 0 and the $N$th binary digit of ?(b) is 1. Since all the preceding digits were the same, this implies $?(b) > ?(a)$.

In [10], Salem proves that $?'(x)$ is 0 almost everywhere. He defines ?(x) using the continued fraction representation of x. The remainder of this thesis will show the relationship between Salem’s continued fraction approach and our algorithmic approach.

4.5 Salem’s Definition

Salem provides another definition of ? which can be found in [10]. We will prove that our definition is equivalent to Salem’s, but first we introduce continued fractions.

4.5.1 Continued Fractions

This section is adapted from [8]. Consider the fraction $\frac{9}{7}$. We can manipulate it in the following way:

\[
\frac{9}{7} = 1 + \frac{2}{7} = 1 + \frac{1}{\frac{7}{2}} = 1 + \cfrac{1}{3 + \cfrac{1}{2}}.
\]

This is the simple continued fraction representation of $\frac{9}{7}$ and we can denote it $[1; 3, 2]$.

More generally, the notation $[a_0; a_1, a_2, \ldots, a_n]$ denotes the following continued fraction:

\[
a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_{n-1} + \cfrac{1}{a_n}}}}}.
\]

We may also have continued fractions that do not terminate. The notation $[a_0; a_1, a_2, \ldots]$ denotes the following infinite continued fraction:

\[
a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \ddots}}.
\]

Every terminating continued fraction represents a unique rational number. However, every rational number can be expressed as a continued fraction in one of exactly two distinct ways. We can see this by looking at the first example:

\[
\frac{9}{7} = 1 + \cfrac{1}{3 + \cfrac{1}{2}} = 1 + \cfrac{1}{3 + \cfrac{1}{1 + 1}} = 1 + \cfrac{1}{3 + \cfrac{1}{1 + \cfrac{1}{1}}}
\]

which means $\frac{9}{7} = [1; 3, 2] = [1; 3, 1, 1]$. More generally, if some rational number $x$ has the continued fraction representation $[a_0; a_1, a_2, \ldots, a_n]$ where $a_n \neq 1$ then $x$ can also be expressed as $[a_0; a_1, a_2, \ldots, a_n - 1, 1]$.

Throughout the rest of this paper we will assume that any continued fraction representation of a rational number $x$ will not end with a 1 unless $x = 1$. We will always assume that $a_0$ will be an integer and that all of $a_1, a_2, \ldots, a_n, \ldots$ are positive unless we state otherwise.

Every infinite continued fraction represents a unique irrational number. Every irrational number can be expressed as a unique infinite continued fraction.

If $x = [a_0; a_1, a_2, \ldots, a_n]$ or $[a_0; a_1, a_2, \ldots]$, we let $x_i$ denote $[a_0; a_1, a_2, \ldots, a_i]$. The $x_i$’s are called the convergents of $x$. Note that if $x$ is rational, then $x$ equals its last convergent. As we will see in the next few lemmas, the convergents are useful because they serve as good approximations. Before exploring this we must develop the framework necessary to relate convergents to each other.

If we let $x = [a_0; a_1, \ldots]$ (we could also allow this to terminate), we define two sequences which will wind up being the numerators and denominators of the convergents of $x$:

h−2 = 0, h−1 = 1, hi = aihi−1 + hi−2 for i ≥ 0,

d−2 = 1, d−1 = 0, di = aidi−1 + di−2 for i ≥ 0.

Theorem 36. Let $x = [a_0; a_1, a_2, \ldots]$ or $[a_0; a_1, a_2, \ldots, a_n]$. Then for any real number $y \neq 0$ and $i \leq n$,

\[
[a_0; a_1, \ldots, a_{i-1}, y] = \frac{y h_{i-1} + h_{i-2}}{y d_{i-1} + d_{i-2}}.
\]

Proof. We proceed by induction on $i$. If $i = 0$, then we have

\[
y = \frac{1 \cdot y + 0}{0 \cdot y + 1} = \frac{y h_{-1} + h_{-2}}{y d_{-1} + d_{-2}}.
\]

Assume that the result holds for $i$; we show that it holds for $i + 1$. We note the following consequence of our notation:
\[
[a_0; a_1, a_2, \ldots, a_i, y] = \left[a_0; a_1, a_2, \ldots, a_{i-1}, a_i + \frac{1}{y}\right].
\]

By the induction hypothesis, we then have

\[
\left[a_0; a_1, a_2, \ldots, a_{i-1}, a_i + \frac{1}{y}\right] = \frac{\left(a_i + \frac{1}{y}\right) h_{i-1} + h_{i-2}}{\left(a_i + \frac{1}{y}\right) d_{i-1} + d_{i-2}} = \frac{y(a_i h_{i-1} + h_{i-2}) + h_{i-1}}{y(a_i d_{i-1} + d_{i-2}) + d_{i-1}} = \frac{y h_i + h_{i-1}}{y d_i + d_{i-1}},
\]
thus completing the induction.

Corollary 37. Let $x = [a_0; a_1, a_2, \ldots]$ or $[a_0; a_1, a_2, \ldots, a_n]$. Then $x_i = \frac{h_i}{d_i}$.
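The recurrences for $h_i$ and $d_i$ translate directly into code. The following sketch uses hypothetical function names (`continued_fraction`, `convergents`) and handles rational inputs only; it produces the representation that does not end in 1 (except for integer inputs).

```python
from fractions import Fraction


def continued_fraction(x):
    """Partial quotients [a0, a1, ..., an] of a rational number x."""
    x = Fraction(x)
    terms = []
    while True:
        a = x.numerator // x.denominator  # integer part
        terms.append(a)
        x -= a
        if x == 0:
            return terms
        x = 1 / x


def convergents(terms):
    """Convergents x_0, x_1, ... computed as h_i / d_i from the recurrences."""
    h_prev, h = 0, 1  # h_{-2}, h_{-1}
    d_prev, d = 1, 0  # d_{-2}, d_{-1}
    result = []
    for a in terms:
        h_prev, h = h, a * h + h_prev
        d_prev, d = d, a * d + d_prev
        result.append(Fraction(h, d))
    return result
```

For instance, `continued_fraction(Fraction(9, 7))` gives `[1, 3, 2]`, and the last convergent of any rational input equals the input itself.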

Remark 38. Note that if a0 ≥ 0, then {hi} and {di} are strictly increasing.

Throughout the rest of this paper, all continued fractions mentioned will have a0 = 0 so we will abbreviate [a0; a1, a2, . . . , an] by [a1, a2, . . . , an] with the understanding that a0 = 0.

Lemma 39. If x = [a1, a2,...] or x = [a1, a2, . . . , an], then each convergent xi is the rational number closest to x with denominator at most denom(xi). More rigorously, if n/d is a rational number such that |x − n/d| < |x − xi| for some i ≥ 1, then d > denom(xi).

Lemma 40. For any irrational x, the following infinite inequality holds:

x2 < x4 < x6 <...

This inequality also holds for rational x if we include only the first i − 1 convergents if xi = x.

Proofs of Lemmas 39 and 40 can be found in [8].

As we will see, Salem’s definition of ? turns out to be the same as the algorithmic definition of ?, but for the purposes of proving this, we will denote the function as defined by Salem as $\hat{?}$. For some $x \in [0, 1]$ with continued fraction representation $[a_1, a_2, \ldots, a_n]$ or $[a_1, a_2, \ldots]$, Salem defines $\hat{?}(x)$ by:

\[
\hat{?}(x) = \frac{1}{2^{a_1 - 1}} - \frac{1}{2^{a_1 + a_2 - 1}} + \ldots + \frac{(-1)^{n+1}}{2^{a_1 + \ldots + a_n - 1}} + \ldots,
\]
where the above sum terminates if and only if the continued fraction terminates.
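Salem's alternating sum is simple to evaluate for a terminating continued fraction, and it can be compared against a truncated run of the mediant algorithm. In this sketch the names `salem` and `question_mark` and the `iterations` cutoff are hypothetical; the partial quotients are passed as a list $[a_1, \ldots, a_n]$ with $a_0 = 0$ understood.

```python
from fractions import Fraction


def salem(quotients):
    """Salem's sum for x = [a1, a2, ..., an]: terms (-1)^(i+1) / 2^(a1 + ... + ai - 1)."""
    total = Fraction(0)
    exponent = -1
    for i, a in enumerate(quotients, start=1):
        exponent += a  # now equals a1 + ... + ai - 1
        total += Fraction((-1) ** (i + 1), 2 ** exponent)
    return total


def question_mark(x, iterations=64):
    """Truncated mediant algorithm, used here only for comparison."""
    ans, w, u = Fraction(0), Fraction(0), Fraction(1)
    for n in range(1, iterations + 1):
        m = Fraction(w.numerator + u.numerator, w.denominator + u.denominator)
        if x >= m:
            ans += Fraction(1, 2 ** n)
            w = m
        else:
            u = m
    return ans
```

For $\frac{2}{5} = [2, 2]$ the sum is $\frac{1}{2} - \frac{1}{8} = \frac{3}{8}$, and for $\frac{26}{59} = [2, 3, 1, 2, 2]$ both routes give $\frac{237}{512}$.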

4.5.2 Connection Between Convergents and Mediants

For a given x ∈ [0, 1], the convergents of x have significance in the algorithmic calculation of ?(x). As the following theorem states, each convergent of x will become a mediant at some point in the calculation of ?(x).

Theorem 41. Let x ∈ (0, 1). If {mn} is the sequence of mediants generated when we use Algorithm 3 to calculate ?(x) then {xi}, the sequence of convergents to x, is a subsequence of {mn}.

Proof. We first consider the case x1 = 1 and note that since m0 = 1 we are done. Now we consider the cases where x1 < 1. Let {an} and {bn} be the sequences of lower and upper endpoints respectively. We claim that for each convergent xi there exists some

N such that aN < xi < bN , and aN+1 ≥ xi or bN+1 ≤ xi. To see this we first let xi be some convergent of x and we note that 0 < xi < 1, a1 = 0, and b1 = 1. Then we note that if for all n, an < xi < bn, then xi must never be the fraction with smallest denominator between an and bn because otherwise it would be chosen to be an endpoint and thus equal to some an or some bn. But there are only finitely many fractions with denominator less than denom(xi) and the algorithm exhausts them by turning them into endpoints or excluding them from the interval an, bn, so this is a contradiction and thus the claim is proven. Without loss of generality assume aN+1 ≥ xi. Then aN+1 = mN so mN ≥ xi. By Lemma 30, denom(mN ) ≤ denom(xi) so by the contrapositive of Lemma

39, |x − mN | ≥ |x − xi|. Since mN = aN+1, then by Remark 26, mN ≤ x. Since x ≥ mN , then |x − mN | ≥ |x − xi| implies xi ≥ mN and since we are assuming mN ≥ xi, we have mN = xi.

Now we will investigate how the sequence of convergents is distributed throughout the sequence of mediants. To explain the pattern that emerges, we will use an example.

Table 2 keeps track of the values used in the calculation of $?\left(\frac{26}{59}\right)$. As a continued fraction, $\frac{26}{59} = [2, 3, 1, 2, 2]$ and so its convergents are $[2] = \frac{1}{2}$, $[2, 3] = \frac{3}{7}$, $[2, 3, 1] = \frac{4}{9}$, $[2, 3, 1, 2] = \frac{11}{25}$ and $[2, 3, 1, 2, 2] = \frac{26}{59}$. For each value of $n$ from 1 to 12, we report $w_n$, $m_n$ and $u_n$. We also write $\frac{26}{59}$ between $w_n$ and $m_n$ if $w_n < \frac{26}{59} < m_n$, and we write $\frac{26}{59}$ between $m_n$ and $u_n$ if $m_n < \frac{26}{59} < u_n$. When the convergents appear as mediants, they are marked with an asterisk. Note that after every convergent, $\frac{26}{59}$ switches to the other side of the mediant. After each convergent, an additional number is introduced into the continued fraction representation of the mediant. This number starts at 1, and at each mediant, it increases by 1 until the mediant reaches the next convergent. Then another number is introduced and the process continues.

Table 2: Computation of $?\left(\frac{26}{59}\right)$

n     w_n      x        m_n                              x        u_n
1     0/1      26/59    1/2 = [2] *                               1/1
2     0/1               1/3 = [2, 1]                     26/59    1/2
3     1/3               2/5 = [2, 2]                     26/59    1/2
4     2/5               3/7 = [2, 3] *                   26/59    1/2
5     3/7      26/59    4/9 = [2, 3, 1] *                         1/2
6     3/7               7/16 = [2, 3, 1, 1]              26/59    4/9
7     7/16              11/25 = [2, 3, 1, 2] *           26/59    4/9
8     11/25    26/59    15/34 = [2, 3, 1, 2, 1]                   4/9
9     11/25             26/59 = [2, 3, 1, 2, 2] *                 15/34
10    26/59             41/93 = [2, 3, 1, 2, 2, 1]                15/34
11    26/59             67/152 = [2, 3, 1, 2, 2, 2]               41/93
12    26/59             93/211 = [2, 3, 1, 2, 2, 3]               67/152
...

Now we will state and prove the above patterns in their general form. We first prove that for any x, the mediants in the computation of ?(x) swap between being greater than and less than x after a convergent appears as a mediant.

Lemma 42. Let $x = [a_1, a_2, \ldots] \in [0, 1]$ and assume that the convergents $x_i$ and $x_{i+1}$ exist.

Then $x \neq x_i$.

Let $n, k$ be such that $m_n = x_i$ and $m_{n+k} = x_{i+1}$. Let $j \in \{1, 2, \ldots, k\}$.

If $x_i > x$ then $m_{n+j} \leq x$ with equality possible only if $j = k$.

If $x_i < x$ then $m_{n+j} \geq x$ with equality possible only if $j = k$.

Proof. Throughout the proof, we make implicit use of Lemma 40 which states that $x_l \leq x$ for all even $l$ and $x_l \geq x$ for all odd $l$. We proceed by induction on $i$.

For the base case $i = 1$, we have $x_2 \leq x < x_1 = m_n$, and we will show $m_{n+j} \leq x$ for $j \in \{1, 2, \ldots, k\}$. Specifically,

\[
x_1 = \frac{1}{a_1} \quad \text{and} \quad x_2 = \cfrac{1}{a_1 + \cfrac{1}{a_2}}.
\]

Since 1 is odd, $x_1 > x$ by Lemma 40. Let $n, k$ be such that $m_n = x_1$ and $m_{n+k} = x_2$ ($n$ and $k$ exist by Theorem 41). Let $j \in \{1, 2, \ldots, k\}$. Note that $w_n = 0$ and $u_{n+1} = \frac{1}{a_1}$ because for each positive integer $l \leq a_1$, we have $\frac{1}{l} \geq \frac{1}{a_1} > x$ and $m_l = \frac{1}{l+1}$.

We now show by induction on $j$ that $u_{n+j} = \frac{1}{a_1}$ and $m_{n+j} = \frac{j}{j a_1 + 1}$. Note that $w_{n+1} = 0$ and $u_{n+1} = \frac{1}{a_1}$. To see this, note that for each $l < a_1$, we have $\frac{1}{l} > \frac{1}{a_1}$ and $m_l = \frac{1}{l+1}$. Also, $m_n = \frac{1}{a_1} > x$ so $u_{n+1} = m_n$. For the base case $j = 1$, we get our mediant

\[
m_{n+1} = \frac{1 + 0}{a_1 + 1} = \frac{1}{a_1 + 1} \leq \cfrac{1}{a_1 + \cfrac{1}{a_2}} \leq x.
\]

Now let $2 \leq j \leq k$. Our induction hypothesis is that $u_{n+j-1} = \frac{1}{a_1}$ and $m_{n+j-1} = \frac{j-1}{(j-1)a_1 + 1}$. First note that the sequence defined by $\frac{l}{a_1 l + 1} = \frac{1}{a_1 + \frac{1}{l}}$ is strictly increasing. Also, $l = a_2$ is the unique value such that $\frac{1}{a_1 + \frac{1}{l}} = x_2$. So $k = a_2$. Therefore,

\[
m_{n+j-1} = \frac{j - 1}{(j - 1)a_1 + 1} \leq \cfrac{1}{a_1 + \cfrac{1}{k}} = \cfrac{1}{a_1 + \cfrac{1}{a_2}} = x_2 \leq x.
\]

So $w_{n+j} = m_{n+j-1} = \frac{1}{a_1 + \frac{1}{j-1}} = \frac{j-1}{(j-1)a_1 + 1}$ and $u_{n+j} = u_{n+j-1} = \frac{1}{a_1}$. Both these fractions are expressed in reduced terms, so our mediant is

\[
m_{n+j} = \frac{j - 1 + 1}{(j - 1)a_1 + 1 + a_1} = \frac{j}{j a_1 + 1} = \cfrac{1}{a_1 + \cfrac{1}{j}}.
\]

So for all $1 \leq j \leq k$, $m_{n+j} = \frac{1}{a_1 + \frac{1}{j}} \leq \frac{1}{a_1 + \frac{1}{a_2}} = x_2 \leq x$.

We now do the inductive step for $i \geq 2$. Assume the lemma is true for $i - 1$. We assume $i$ is odd and thus $x_i > x$. The proof for the even case is similar. Let $n, k$ be such that $x_i = m_n$ and $x_{i+1} = m_{n+k}$. We now show by induction on $j$ that $m_{n+j} \leq x$ for all $j \in \{1, 2, \ldots, k\}$.

We first show that $w_{n+1} = w_n = x_{i-1}$. By Theorem 41, there exists an $l$ such that $x_{i-1} = m_l$. Then since $i - 1$ is even, $x_{i-1} < x$ so $w_{l+1} = x_{i-1}$. Now we apply our induction hypothesis that the main lemma is true for $i - 1$. This gives us that for all $p$ such that $l + p \leq n$, we have $m_{l+p} \geq x$ so $w_{l+p+1} = w_{l+p} = x_{i-1}$. For $l + p = n$, $w_{n+1} = w_n = x_{i-1}$.

Now we do the base case $j = 1$. Since $i + 1$ is even, $x_{i+1} \leq x$. So $m_{n+1} \leq x$ follows from $m_{n+1} \leq x_{i+1}$. We will show $m_{n+1} \leq x_{i+1}$ by showing $m_{n+1} - w_{n+1} \leq x_{i+1} - w_{n+1}$.

Let $a, a', b, b'$ be positive integers such that $x_{i-1} = w_{n+1} = \frac{a}{b}$ and $x_i = m_n = \frac{a'}{b'}$ where these fractions are in reduced form. Since $m_n = x_i > x$, we have $u_{n+1} = m_n = \frac{a'}{b'}$, and so $m_{n+1} = \frac{a + a'}{b + b'}$. By Theorem 36, $x_{i+1} = \frac{a_{i+1} a' + a}{a_{i+1} b' + b}$, so

\[
x_{i+1} - w_{n+1} = \frac{a_{i+1} a' + a}{a_{i+1} b' + b} - \frac{a}{b} = \frac{a_{i+1}(a'b - b'a)}{a_{i+1} bb' + b^2}.
\]

Also, $m_{n+1} = \frac{a + a'}{b + b'}$ so, since $a_{i+1} \geq 1$,

\[
m_{n+1} - w_{n+1} = \frac{a + a'}{b + b'} - \frac{a}{b} = \frac{a'b - b'a}{bb' + b^2} \leq \frac{a_{i+1}(a'b - b'a)}{a_{i+1} bb' + b^2} = x_{i+1} - w_{n+1}.
\]

Now we do the inductive step for $2 \leq j \leq k$. We use strong induction. Our induction hypothesis will be that $m_{n+l} < x_{i+1}$ for all $l < j$. Then $w_{n+l+1} = m_{n+l}$ for all $l < j$, and $u_{n+l+1} = u_{n+1} = m_n = x_i$. In particular, $w_{n+j} = m_{n+j-1}$ and $u_{n+j} = x_i$. Just as in the $j = 1$ case, let $a, a', b, b'$ be positive integers such that $x_{i-1} = w_{n+1} = \frac{a}{b}$ and $x_i = m_n = \frac{a'}{b'}$ where these fractions are in reduced form. Then $w_{n+l} = \frac{a + (l-1)a'}{b + (l-1)b'}$. So

\[
m_{n+j} = \frac{a + (j - 1)a' + a'}{b + (j - 1)b' + b'} = \frac{a + ja'}{b + jb'} = \frac{\frac{1}{j} a + a'}{\frac{1}{j} b + b'}.
\]

Note that this sequence is strictly increasing and that $j = a_{i+1}$ is the unique value such that $\frac{\frac{1}{j} a + a'}{\frac{1}{j} b + b'} = x_{i+1}$. So $k = a_{i+1}$ and thus

\[
m_{n+j} = \frac{a + ja'}{b + jb'} \leq \frac{a_{i+1} a' + a}{a_{i+1} b' + b} = x_{i+1} \leq x.
\]

Corollary 43. Let $x \in [0, 1]$. Fix $i \ge 2$ such that $x \ne x_i$. Let $n$ be such that $m_n = x_i$ (such an $n$ exists by Theorem 41).

If $i$ is odd, then $w_n = x_{i-1}$.

If $i$ is even, then $u_n = x_{i-1}$.

Lemma 44. If $m_n = x_{i-1}$ then $m_{n+a_i} = x_i$.

Proof. Let $n$ be such that $m_n = x_{i-1}$. We will prove the case where $i - 1$ is odd. The even case is similar. Since $i - 1$ is odd, $w_n = x_{i-2}$ by Corollary 43. Let $k$ be such that $m_{n+k} = x_i$. We will show that $k = a_i$ and then we will be done. By Lemma 42, for all $j \in \{1, 2, \ldots, k\}$, $m_{n+j} \le x$ and thus $u_{n+j} = x_{i-1} = \frac{h_{i-1}}{d_{i-1}}$. Also note that for $j < k$, $m_{n+j} = w_{n+j+1}$. Since $w_{n+1} = x_{i-2}$,

\begin{align*}
m_{n+1} &= \frac{h_{i-1} + h_{i-2}}{d_{i-1} + d_{i-2}}, \\
m_{n+2} &= \frac{(h_{i-1} + h_{i-2}) + h_{i-1}}{(d_{i-1} + d_{i-2}) + d_{i-1}} = \frac{2h_{i-1} + h_{i-2}}{2d_{i-1} + d_{i-2}}, \\
&\;\;\vdots \\
m_{n+k} &= \frac{kh_{i-1} + h_{i-2}}{kd_{i-1} + d_{i-2}}.
\end{align*}

Since $k$ is defined such that $m_{n+k} = x_i$, we have

\[ \frac{kh_{i-1} + h_{i-2}}{kd_{i-1} + d_{i-2}} = x_i = \frac{h_i}{d_i} = \frac{a_ih_{i-1} + h_{i-2}}{a_id_{i-1} + d_{i-2}}, \]

and thus $k = a_i$ as desired.

Lemma 45. If $x = [a_1, a_2, \ldots]$ or $[a_1, a_2, \ldots, a_k]$ then $m_n = x_i$ if and only if $n = a_1 + a_2 + \cdots + a_i - 1$.

Proof. We proceed by induction on $i$. We already have the base case $i = 1$ in the proof of Lemma 42. For the inductive step, let $k = a_1 + a_2 + \cdots + a_{i-1} - 1$ and assume that $m_k = x_{i-1}$. Then, by Lemma 44, $m_{k+a_i} = x_i$. So letting $n = k + a_i$ we have $m_n = x_i$ where $n = a_1 + a_2 + \cdots + a_i - 1$.
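Lemmas 44 and 45 can be checked numerically on a small example. The following is a minimal Python sketch, not the thesis's own code: the helper names `convergents` and `mediants` are hypothetical, and the mediant loop is an assumed reading of the iteration used in the proofs above ($w$ moves up to $m_n$ when $m_n \le x$, otherwise $u$ moves down to $m_n$, starting from $[0, 1]$).

```python
from fractions import Fraction
from itertools import accumulate

def convergents(terms):
    """Convergents x_1, x_2, ... of x = 1/(a_1 + 1/(a_2 + ...))."""
    h_prev, h = 1, 0  # h_{-1}, h_0
    d_prev, d = 0, 1  # d_{-1}, d_0
    out = []
    for a in terms:
        # h_i = a_i h_{i-1} + h_{i-2}, and likewise for d_i
        h_prev, h = h, a * h + h_prev
        d_prev, d = d, a * d + d_prev
        out.append(Fraction(h, d))
    return out

def mediants(x, steps):
    """Mediants m_1, ..., m_steps of the assumed iteration on [0, 1]."""
    w, u = Fraction(0, 1), Fraction(1, 1)
    out = []
    for _ in range(steps):
        m = Fraction(w.numerator + u.numerator, w.denominator + u.denominator)
        out.append(m)
        if m <= x:
            w = m  # x lies in [m, u]
        else:
            u = m  # x lies in [w, m)
    return out

terms = [2, 3, 1, 2]                 # x = [2, 3, 1, 2] = 11/25
xs = convergents(terms)
x = xs[-1]
ms = mediants(x, sum(terms) - 1)

# Iteration numbers n at which the mediant equals each convergent x_i.
hits = [ms.index(xi) + 1 for xi in xs]
# Lemma 45 predicts n = a_1 + ... + a_i - 1.
predicted = [s - 1 for s in accumulate(terms)]
```

For this choice of terms both `hits` and `predicted` are `[1, 4, 5, 7]`, and consecutive entries differ by $a_i$, as Lemma 44 states.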

4.5.3 Agreement with Salem’s Extension

Lemma 46. Let $a, k$ be nonnegative integers. Then

\[ \frac{1}{2^a} - \frac{1}{2^{a+k}} = \frac{1}{2^{a+1}} + \frac{1}{2^{a+2}} + \cdots + \frac{1}{2^{a+k}}. \]

Proof.

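\begin{align*}
\frac{1}{2^a} - \frac{1}{2^{a+k}} &= \frac{1}{2^{a+1}} + \frac{1}{2^{a+1}} - \frac{1}{2^{a+k}} \\
&= \frac{1}{2^{a+1}} + \frac{1}{2^{a+2}} + \frac{1}{2^{a+2}} - \frac{1}{2^{a+k}} \\
&= \cdots \\
&= \frac{1}{2^{a+1}} + \frac{1}{2^{a+2}} + \cdots + \frac{1}{2^{a+k}} + \frac{1}{2^{a+k}} - \frac{1}{2^{a+k}} \\
&= \frac{1}{2^{a+1}} + \frac{1}{2^{a+2}} + \cdots + \frac{1}{2^{a+k}}.
\end{align*}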
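As a quick sanity check of Lemma 46, the identity can be evaluated in exact arithmetic for small $a$ and $k$. The function names below are illustrative only; this is a spot check, not a substitute for the proof.

```python
from fractions import Fraction

def lemma46_lhs(a, k):
    """1/2^a - 1/2^(a+k), computed exactly."""
    return Fraction(1, 2**a) - Fraction(1, 2**(a + k))

def lemma46_rhs(a, k):
    """1/2^(a+1) + 1/2^(a+2) + ... + 1/2^(a+k); the empty sum is 0 when k = 0."""
    return sum((Fraction(1, 2**(a + j)) for j in range(1, k + 1)), Fraction(0))

# Compare both sides on a small grid of nonnegative integers.
agree = all(lemma46_lhs(a, k) == lemma46_rhs(a, k)
            for a in range(8) for k in range(8))
```

For instance, with $a = 1$ and $k = 3$ both sides equal $\frac{7}{16}$.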

Theorem 47. The function $?$ defined by Algorithm 3 is the same as the function $\hat{?}$ defined by Salem.

Proof. Assume that $x$ is irrational, so the continued fraction does not terminate. Using Lemma 46 we can rewrite Salem’s extension $\hat{?}$ as follows

\[ \hat{?}(x) = \sum_{k=1}^{\infty} \; \sum_{j = a_1 + \cdots + a_{2k-1}}^{a_1 + \cdots + a_{2k} - 1} \frac{1}{2^j}. \]

We can see that this is exactly what the algorithm does. For each $n$, Algorithm 3 adds $\frac{1}{2^n}$ to the running sum whenever $x \ge m_n$. By Lemma 42, this happens every iteration that follows the iteration where the mediant was an odd convergent of $x$ until, but not including, the first iteration after the next even convergent is a mediant. By Lemma 45, these are the iterations that are numbered $a_1 + \cdots + a_k, \; a_1 + \cdots + a_k + 1, \; \ldots, \; a_1 + \cdots + a_{k+1} - 1$ for all odd $k$. For rational $x$, the proof is similar.
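The agreement claimed in Theorem 47 can be illustrated on rational inputs. The sketch below rests on two assumptions that are not restated in this section: that Algorithm 3 starts from $[0, 1]$ and, at iteration $n$, adds $\frac{1}{2^n}$ and moves $w$ to $m_n$ when $x \ge m_n$ (otherwise moves $u$ to $m_n$), and that Salem's formula is the alternating sum $\frac{1}{2^{a_1 - 1}} - \frac{1}{2^{a_1 + a_2 - 1}} + \cdots$. All function names are hypothetical.

```python
from fractions import Fraction
from itertools import accumulate

def from_terms(terms):
    """Evaluate 1/(a_1 + 1/(a_2 + ... + 1/a_k)) exactly."""
    x = Fraction(0)
    for a in reversed(terms):
        x = 1 / (a + x)
    return x

def question_mark_by_mediants(x, steps):
    """Assumed form of Algorithm 3, truncated after `steps` iterations."""
    w, u = Fraction(0), Fraction(1)
    total = Fraction(0)
    for n in range(1, steps + 1):
        m = Fraction(w.numerator + u.numerator, w.denominator + u.denominator)
        if x >= m:
            total += Fraction(1, 2**n)
            w = m
        else:
            u = m
    return total

def question_mark_by_salem(terms):
    """Assumed alternating-sum form of Salem's formula for a finite expansion."""
    total = Fraction(0)
    for i, s in enumerate(accumulate(terms)):
        total += (-1)**i * Fraction(1, 2**(s - 1))
    return total

examples = [[2], [3], [1, 1], [2, 3, 1, 2], [1, 4, 2], [5, 1, 1, 3]]
# For rational x no term is added after iteration a_1 + ... + a_k - 1,
# so sum(terms) iterations are enough for an exact comparison.
results = [(question_mark_by_mediants(from_terms(t), sum(t)),
            question_mark_by_salem(t)) for t in examples]
```

On each of these examples the two values coincide; for $x = [2, 3, 1, 2] = \frac{11}{25}$ both give $\frac{59}{128}$.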

5 Conclusion and Ideas For Further Study

We were able to use our algorithm to find a reasonably elegant proof for the singularity of $f_a(x)$ and explicit sets on which $f_a'(x)$ did and did not exist. Maybe we can find something similar for $?(x)$. In [4], it is shown that $?'(x)$ does not exist for any $x$ with continued fraction representation where the terms do not exceed 4. The proof of this is very messy. As of now, we are able to use the algorithmic definition to find a much neater proof of the much weaker claim that $?'(x)$ does not exist when $x$ has a continued fraction expansion that is eventually periodic of period length at most two, where the numbers in the period do not exceed 4. Hopefully, we will be able to extend this and investigate more of the known properties about $?(x)$ and $f_a(x)$ using our algorithms.

Another idea is to use these algorithmic definitions to explore the connections between Minkowski’s Question-Mark Function, DeRham’s Function, their inverses and other variations. We already know of one connection. At each step of the algorithm for $?(x)$, we have an interval $[w, u]$ and we choose the mediant by adding the numerators and denominators of $w$ and $u$. If instead, for some $a \in (0, 1)$, we choose the point that lies a fraction $a$ of the way between $w$ and $u$, our algorithm will generate $f_a^{-1}(x)$. Perhaps there is an algorithmic template that can describe all strictly increasing singular functions.
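The variation described above can be sketched by replacing the mediant with the split point $w + a(u - w)$. This is only an illustration under assumptions: the loop structure is borrowed from the assumed form of the algorithm for $?(x)$, the function name is hypothetical, and the comparison with $f_a^{-1}$ presumes the usual normalization $f_a\left(\frac{1}{2}\right) = a$.

```python
from fractions import Fraction

def split_point_algorithm(y, a, steps):
    """Mediant-style loop with split point w + a*(u - w) instead of the mediant."""
    w, u = Fraction(0), Fraction(1)
    total = Fraction(0)
    for n in range(1, steps + 1):
        p = w + a * (u - w)  # the point a fraction `a` of the way from w to u
        if y >= p:
            total += Fraction(1, 2**n)
            w = p
        else:
            u = p
    return total

# With a = 1/2 the split point is the midpoint, so the loop reads off binary digits.
identity_check = split_point_algorithm(Fraction(3, 8), Fraction(1, 2), 12)
# With y = a the first split point is hit exactly, then nothing more is added.
half_check = split_point_algorithm(Fraction(1, 3), Fraction(1, 3), 12)
```

Here `identity_check` is $\frac{3}{8}$ and `half_check` is $\frac{1}{2}$, which is what one would expect from an inverse of a function sending $\frac{1}{2}$ to $a$.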

References

[1] Stephen Abbott. Understanding Analysis. Springer, New York, 2001.

[2] Randolph M. Conley. A survey of the Minkowski ?(x) function. Master’s thesis, West Virginia University, Morgantown, 2003.

[3] Georges DeRham. On some curves defined by functional equations. Rendiconti del Seminario Matematico dell’Università e del Politecnico di Torino, pages 101–113, 1957. Translated by Ilan Vardi.

[4] Anna A. Dushistova and Nikolai G. Moshchevitin. On the derivative of the Minkowski question mark function ?(x). Fundamentalnaya i Prikladnaya Matematika, 2010.

[5] Bernard R. Gelbaum and John M. H. Olmsted. Counterexamples in Analysis. Holden-Day, 1964.

[6] Kiko Kawamura. On the set of points where Lebesgue’s singular function has the derivative zero. Proceedings of the Japan Academy, 87(A), October 2011.

[7] Davar Khoshnevisan. Normal numbers are normal. CMI Annual Report, 2006.

[8] I. Niven, H. Zuckerman, and H. Montgomery. An Introduction to the Theory of Numbers. Wiley, 5th edition, 1991.

[9] H. L. Royden. Real Analysis. Macmillan, New York, 1963.

[10] R. Salem. On some singular monotonic functions which are strictly increasing. Transactions of the American Mathematical Society, 53(3):427–439, May 1943.
