Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Article Combinatorial Models of the Distribution of Prime Numbers
Vito Barbarani 1,† 0000-0002-4953-7162,
1 European Physical Society; [email protected] † Current address: via Cancherini 85, 51039 Quarrata, Italy
1 Abstract: This work is divided into two parts. In the first one the combinatorics of a new class of 2 randomly generated objects, exhibiting the same properties as the distribution of prime numbers, 3 is solved and the probability distribution of the combinatorial counterpart of the n-th prime 4 number is derived, together with an estimate of the prime-counting function π(x). A proposition 5 equivalent to the Prime Number Theorem (PNT) is proved to hold, while the equivalent of the 6 Riemann Hypothesis (RH) is proved to be false with probability 1 (w.p. 1) for this model. Many 7 identities involving Stirling numbers of the second kind and harmonic numbers are found, some 8 of which appear to be new. The second part is dedicated to generalizing the model to investigate 9 the conditions enabling both PNT and RH. A model representing a general class of random integer 10 sequences is found, for which RH holds w.p. 1. The prediction of the number of consecutive 11 prime pairs, as a function of the gap d, is derived from this class of models and the results are in 12 agreement with empirical data for large gaps. A heuristic version of the model, directly related to 13 the sequence of primes, is discussed and new integral lower and upper bounds of π(x) are found.
14 Keywords: set partitions; Stirling numbers of the second kind; harmonic numbers; prime number 15 distribution; Riemann hypothesis; Gumbel distribution
16 1. Introduction. The bingo bag of primes.
17 This work aims to investigate two main objectives: what random means in the case 18 of the distribution of prime numbers and the general conditions ensuring that what we 19 know as Prime Number Theorem and Riemann Hypothesis also hold in the case of a generic Citation: Babarani, V. 20 random sequence of natural numbers. To achieve these goals, we’ll make use of models Combinatorial Models of the 21 based on combinatorics and probability theory. Of course, this is not the first time that Distribution of Prime Numbers. 22 models are used to derive new conjectures and theorems about the prime sequence. In Mathematics 2021, 1, 0. 23 the mid-1930s, Cramér [14]-[15] proposed the first stochastic model of the sequence of https://doi.org/ 24 primes, as a way to formulate conjectures about their distribution. This model can be 25 briefly summarized as follows. Received: 26 Denote by {Xn, n = 1, 2, 3 ... } a sequence of independent random variables Xn, Accepted: 27 taking values 0 and 1 according to Published: 1 Prob{X = 1} = , n > 2 Publisher’s Note: MDPI stays neu- n ln(n) tral with regard to jurisdictional 28 = claims in published maps and insti- (the probability may be arbitrarily chosen for n 1, 2). Then the random sequence tutional affiliations. 29 of natural numbers
S = {n : Xn = 1, n > 1} Copyright: © 2021 by the author. Submitted to Mathematics for possi- 30 is a stochastic model of the sequence of primes. This means that if Ω = {ω} is ble open access publication under 31 the set of all possible sequences ω realizations of S, then the prime sequence, say ωP, the terms and conditions of the 32 is obviously an element of Ω. We can therefore study the properties of the particular Creative Commons Attribution (CC 33 sequence ωP, through the methods of probability theory applied to the sample space BY) license (https://creativecom- 34 Ω, and conjecture with Cramér that if a certain property holds in Ω with probability 1 mons.org/licenses/by/ 4.0/). 35 (w.p. 1 in that follows), that is for almost all the sequences, the same property holds for
Version March 31, 2021 submitted to Mathematics https://www.mdpi.com/journal/mathematics © 2021 by the author(s). Distributed under a Creative Commons CC BY license. Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 2 of 52
36 the prime sequence ωP. Obviously the results of these conjectures lay heavily on the 37 assumption about the probability of the event {Xn = 1}, which connects the model with 38 the distribution of prime numbers as stated by the Prime Number Theorem (PNT)
x Z x dx π(x) ∼ ∼ Li(x) = (1) ln(x) 2 ln(x) 1 39 ( ) π x being the prime-counting function. We see from (1) that ln(n) can be inter- 40 preted as the expected density of primes around n, or the probability that a given integer 41 n should be a prime. 42 As reported by Granville [25] in his review article dedicated to the work of Cramér 43 in this area, he imagined building the set of random sequences S through a game of 44 chance consisting of a series of repeated draws (independent trials) from an ordered 45 sequence of urns (U1, U2, U3, ... ), such that the n-th urn contains ln(n) black balls and 1 46 white ball, starting from n = 3 forward. In this game the event {Xn = 1} is then realized 47 by drawing the white ball from the urn number n. 48 The heuristic method based on Cramér’s model has been playing an essential 49 role since the mid-1930s to the present day when formulating conjectures concerning 50 primes. The first serious difference, between the prediction of the model and the actual 51 behavior of the distribution of primes, was found in 1985 [32]. As pointed out in [25], 52 this difference arises because of the assumptions made by heuristic procedures based on 53 the probabilistic model: the hypothesis,√ when applying the ’Sieve of Eratosthenes’, that 54 sieving by the different primes ≤ n are independent events and Gauss’s conjecture 55 about the density of primes around n. 56 The problem was known well before Cramér’s work. In their 1923’s paper [27] 57 Hardy and Littlewood, while discussing some asymptotic formulas of Sylvester and −γ 58 Brun, containing erroneous constant factors, functions of the number e (γ is the 59 Euler-Mascheroni constant), observe that "any formula in the theory of primes, deduced from 60 cosiderations of probability, is likely to be erroneous just in this way". They explain this error 61 arises because of the different answers to the question about the chance that a large 62 number n should be prime. Following the PNT, and Gauss’s conjecture, this chance is 63 approximately√ 1/ ln(n), while if we consider the chance n should not be divisible by any 64 prime ≤ n, assuming independent events for any prime (Hardy and Littlewood don’t 65 mention this condition, but it is implicit), this chance is asymptotically equivalent to
1 1 (1 − ) ∼ 2e−γ , p prime. ∏√ p ln(n) p≤ a
66 Hence, they conclude, any inference based on the previous formula is "incorrect −γ 67 to the extent of a factor 2e = 1.123...". More recently new contradictions have been 68 found, between the predictions of the model and the actual distribution of primes, that 69 no Cramér-type model, assuming the hypothesis of independent random variables, can 70 overcome [35]. 71 Cramer’s model founds its connection with the distribution of prime numbers on 72 Gauss’s conjecture; hence it assumes the PNT as a priori hypothesis. In this work, a 73 new model is proposed, based on a class of combinatorial objects I call First Occurrence 74 Sequences (FOS), reproducing the stochastic structure of the prime sequence as a result of 75 the straightforward random structure of the model, based on a sequence of independent 76 equally probable trials. The PNT emerges however, without any a priori ad hoc assump- 77 tion about the probability of events. Perhaps the best way to summarize the aim of this 78 work is that it tries to answer the question Granville cites at the beginning of his paper 79 [25]: "It is evident that the primes are randomly distributed but, unfortunately, we don’t know 1 80 what ’random’ means."
1 The quote is of Prof. R. C. Vaughan. Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 3 of 52
81 Following Cramér’s method, we can describe the pinpoints of our model as a game 82 of chance. Suppose we have an urn with n = 2 balls of different colors, say black (B) and 83 white (W), and let’s consider the following sampling with replacement game: choose 84 a ball, say B and replace it; repeat drawing until the white ball W occurs for the first 85 time. In general, the game ends when you draw the ball with a color other than the first 86 ball drawn. The sequences of B and W balls we get from this game are FOS of different 87 lengths k. Some examples: 88 BBW, k = 3; 89 BW, k = 2; 90 WWWB, k = 4; 91 WB, k = 2. 92 Assuming each ball has the same probability of being drawn, it is easy to see the 93 average numbers of steps we need to get both colors for the first time, or the average 94 length of FOS when n = 2, is equal to 3. If we repeat this simple game with different 95 values of n (number of colors), and the rule that the game ends when all colors appear in 96 the sequence for the first time and compare the average length of FOS of order n, say 0 97 Ln/n, with the prime number pn of the same order, we would find the results reported in 98 Table1 for n = 2, 3, 4, 5 and 6.
99 Table 1. FOS average length and primes pn.
0 n Ln/n pn 2 3 3 3 5.5 5 4 8.333 7 5 11.416 11 6 14.7 13
100 The correlation between FOS average lengths and the sequence of primes is a gen- 101 eral property of the model. Indeed assuming each colored ball has the same probability 102 of being drawn, and every sampling step with replacement is independent of each other, 103 we can easily find
n 0 1 Ln/n = n ∑ ∼ n ln n. (2) k=1 k
104 The above equation shows the close connection existing between the combinatorial 105 objects I have called FOS and the primes. In particular the average length of the n order 106 FOS, obtained through the sampling with replacement game from an urn with n colored 107 balls, is asymptotic to the expression of the n-th prime obtained as a consequence of the 108 PNT. In this sense, the n-th order FOS can be defined as the combinatorial counterpart of 109 the n-th prime number. 110 The following sections are devoted to developing of the theory of FOS and their 111 analogies with the distribution of primes. In Section2 a formal definition of this kind 112 of objects is given, together with the probability spaces we need to define probability 113 relations and probability recursive formulas. In Section3 the combinatorics of FOS is 114 completely solved, demonstrating its close connections with set partitions and Stirling 115 numbers of the second kind. In Section4, the probability equations derived from 116 the combinatorial analysis lead to new identities for Stirling numbers of the second 117 kind and for harmonic numbers. The combinatorial model of primes based on FOS 118 is defined together with a discrete estimate πˆ (x) of the prime-counting function π(x), 119 which is studied and numerically tested. We prove the PNT holds for this model while 120 the Riemann Hypothesis doesn’t. In Section5, the model is generalized to investigate 121 the conditions enabling both the PNT and the Riemann Hypothesis. We find a general Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 4 of 52
122 model of random sequences for which the Riemann Hypothesis holds w.p. 1, and apply 123 this model to the problem of counting consecutive prime pairs, as function of the gap 124 between successive primes. The theoretical results obtained from the model agree with 125 the empirical data for large gaps and show the importance of the correlation between 126 primes in the case of small gaps. Finally this model is directly related to the sequence of 127 primes, leading to new integral lower and upper bounds of the prime-counting function 128 π(x). The last Section6 summarizes this work’s general significance and hints at its 129 links with other research areas, particularly with research on physical models.
130 2. First Occurence Sequences. A new class of combinatorial objects.
131 In this section, we will be concerned about defining probability spaces and First 132 Occurring Sequences as events, and deriving some fundamental relations among the 133 probability of certain events, while the evaluation of these probabilities will be treated it 134 Section 3.
135 2.1. Definitions and probability spaces.
136 Each FOS can be seen as a particular result of a sampling with replacement random 137 experiment, consisting of a repeated sequence of independent trials with more than one 138 outcome. The proper probability space for such an experiment can be defined as follows 139 (see [41, Lect. 1, 2] and [29, Chap. 10, 13]). Let An be a collection of n distinct symbols
An = {a1, a2,..., an}, n ≥ 2
140 and suppose our experiment of random draws from An is repeated k times. Then 141 an outcome of the compound experiment is a k-length sequence of elements of An
ωn(k) = {(x1, x2,..., xk) : xj ∈ An, j = 1, 2, . . . , k}, k ≥ 1. k k k 142 The probability space of the experiment is the triplet (Ωn, Fn, Pn), where
k Ωn = {ωn(k)} k 143 is the sample space of all possible n outcome sequences given by the k-fold Carte- 144 sian product of An with itself
k Ωn = An × An × · · · × An, k times; k k 145 Fn is the σ-algebra of subsets of Ωn generated by finite-dimensional cylinder sets 146 Cn of the form
Cn(B1, B2,..., Bk) = {ωn(k) : x1 ∈ B1, x2 ∈ B2,..., xk ∈ Bk} (3)
147 with Bi ∈ B i=1,2,. . . ,k, B a σ-algebra on An of subsets of An. k k 148 To define the probability measure Pn : Fn → [0, 1] let’s consider a collection of 149 numbers {pi, i = 1, 2, . . . , n} such that
pi ≥ 0, i = 1, 2, . . . , n
150 and
n ∑ pi = 1. i=1
151 The collection of numbers pi defines the probability P of any outcome ai of the 152 single trial experiment and of any subset B ∈ B through the equations
P[ai] = pi (4) Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 5 of 52
P[B] = ∑ pi. (5) ai∈B
153 Assuming each trial is independent, the probability of an outcome ωn(k) of the 154 compound experiment is given by
k P[ωn(k)] = ∏ P[xi]. (6) i=1 k 155 The above equation defines a probability distribution on the sample space Ωn, since 156 it is [41, par. 2.3]
k P[Ωn] = ∑ P[ωn(k)] = 1 ωn(k) k k 157 and the probability measure Pn on the σ-algebra Fn can be generated by P through 158 the formula
k k Pn[E] = ∑ P[ωn(k)], for every event E ∈ Fn. (7) ωn(k)∈E k 159 I recall here two properties of Pn, which will be used throughout this paper.
k 160 Remark 2.1. [σ-additivity] Given Ci ∈ Fn, i = 1, 2, . . . , and Ci ∩ Cj = ∅ for i 6= j, then
∞ ∞ k [ k Pn Ci = ∑ Pn(Ci). i=1 i=1
k 161 Remark 2.2. Given any finite-dimensional cylinder set (3) the probability measure Pn has the 162 property [41, Corollary 2.1]
k k Pn[Cn(B1, B2,..., Bk)] = ∏ P[Bi]. (8) i=1
163 where P is the probability distribution defined by (4)-(6) (obviously An ∈ B anf if Bi = An 164 for some i, it is P[An] = 1).
165 = { } ⊂ A Definition 2.1. A collection Oi aj1 , aj2 , ... , aji n is an ordered choice of i distinct 166 A ( ≤ ≤ ) 6= 6= = = symbols from n, 1 i n , if ajl ajm for l m, l 1, 2, . . . , i, m 1, 2, . . . , i.
k 167 Among the events of the σ-algebra Fn we are interested in the following ones.
168 Definition 2.2. We denote with Si/n(k), k-length sequences with i/n distinct symbols (1 ≤ 169 i ≤ n, k ≥ i), the following events
Si/n(k) = {ωn(k) = (x1, x2,..., xk) : ωn(k) ⊂ Oi and Oi ⊂ ωn(k)}
170 for some i-distinct ordered choice Oi.
0 171 Definition 2.3. We denote with Si/n(k), k-length First Occurrence Sequences (FOS) with 172 i/n distinct symbols (1 ≤ i ≤ n, k ≥ i), the events Si/n(k) such that xj 6= xk, j = 173 1, 2, . . . , (k − 1).
k 174 Hence if we choose among the elementary outcomes in Ωn, those with only i 175 distinct symbols and denote them with ωi/n(k), the event Si/n(k) is the subset Si/n(k) = 0 176 {ωi/n(k)}. If we denote further with ωi/n(k) those outcomes in the above subset, where Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 6 of 52
177 the i-th distinct symbol is unique (occurs once only) and occupies the last k-th position 0 0 178 in the sequence, then the event FOS is the subset Si/n(k) = {ωi/n(k)}. The following 179 remarks follow obviously from the above definitions.
0 180 Remark 2.3. For every elementary event ωi/n(k) and ωi/n(k), there is one and only one ordered 181 choice Oi replicating the order of the first occurrence of the i distinct symbols in the elementary 182 sequence.
0 183 Remark 2.4. In general it is Si/n(k) ⊂ Si/n(k), the two events coincide when i = k = 1 and 184 i = k = n. In these cases:
0 S1/n(1) = S1/n(1) = {ai, i = 1, 2, . . . , n}
0 Sn/n(n) = Sn/n(n) = {ωn/n(n)}
185 where {ωn/n(n)} is the group of permutations of the set An.
Remark 2.5. 0 S1/n(k) = ∅, if k ≥ 2.
0 186 Let’s now derive the general probability relation between events Si/n(k), Si/n(k) 187 and elementary outcome sequences.
188 Definition 2.4. Given a single k-length outcome with i/n distinct symbols ωi/n(k)(1 ≤ i ≤ n, 189 k ≥ i), let’s denote with Y(ωi/n(k)) the subset of i symbols occurring in ωi/n(k)
Y(ωi/n(k)) = {aj ∈ An : aj ∈ ωi/n(k)} c 190 and with Y (ωi/n(k)) the complementary set of the remaining (n − i) symbols
c Y (ωi/n(k)) = An − Y(ωi/n(k)). 0 191 An analogous definition applies to the symbol set Y(ωi/n(k)) and the complementary c 0 0 c 192 symbol set Y (ωi/n(k)), of an elementary outcome ωi/n(k). Obviously both Y and Y are 193 subsets of a σ-algebra B on An of subsets of An.
0 194 Theorem 2.1. The probability of the events Si/n(k) and Si/n(k) as a function of elementary 195 outcomes of the compound experiment and their symbol sets and complementary symbol sets, are 196 given by
k 0 c Pn[Si/n(k)] = ∑ P[ωi−1/n(k − 1)]P[Y (ωi−1/n(k − 1))] (9) ωi−1/n(k−1)
k k 0 0 k−j Pn[Si/n(k)] = ∑ ∑ P[ωi/n(j)] P[Y(ωi/n(j))] . (10) j=i 0 ( ) ωi/n j
197 Proof. For every sequence ωi−1/n(k − 1), the cylinder set
c 0 Cn[ωi−1/n(k − 1), Y (ωi−1/n)] ⊂ Si/n(k) 0 0 198 contains (n − i + 1) FOS ωi/n(k) ∈ Si/n(k), and we get the whole event as the union 199 of the disjoint cylinder sets
0 [ c Si/n(k) = Cn[ωi−1/n(k − 1), Y (ωi−1/n)]. ωi−1/n(k−1) Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 7 of 52
200 From equation (8) the probability of the cylinder set is given by
k−1 c c Pn [Cn(ωi−1/n(k − 1), Y (ωi−1/n))] = P[ωi−1/n(k − 1)]P[Y (ωi−1/n)]
201 hence equation (9) follows from the σ-additivity property of the probability mea- 202 sure. 0 203 For every sequence ωi/n(j)(i ≤ j ≤ k), the cylinder set
0 0 0 0 Cn[ωi/n(j), Y(ωi/n(j)), Y(ωi/n(j)),..., Y(ωi/n(j))] ⊂ Si/n(k) | {z } (k − j) times
0 k−j 204 (when j = k the cylinder coincides with ωi/n(k)), contains i sequences ωi/n(k) ∈ 205 Si/n(k) and the whole event is the union of the disjoint cylinder sets
k [ [ 0 0 0 0 Si/n(k) = Cn[ωi/n(j), Y(ωi/n(j)), Y(ωi/n(j)),..., Y(ωi/n(j))]. 0 j=i ω (j) | {z } i/n (k − j) times
206 From equation (8) the probability of the cylinder set is
k 0 0 0 0 0 0 k−j Pn Cn[ωi/n(j), Y(ωi/n(j)), Y(ωi/n(j)),..., Y(ωi/n(j))] = P[ωi/n(j)] P[Y(ωi/n(j))] . | {z } (k − j) times
k 207 Equation (10) follows from the σ-additivity of the probability measure Pn. 208
209 Corollary 2.1.1. Given i = n (the number of distinct symbols is equal to the total number of 210 symbols in An) and k ≥ n then
k k j 0 Pn[Sn/n(k)] = ∑ Pn Sn/n(j) . j=n
0 211 Proof. When i = n the symbol set of FOS ωn/n(j), j ≥ i, becomes
0 Y(ωn/n(j)) = An
212 and remembering P[An] = 1, the thesis follows immediately through simple ma- k 213 nipulations of (10), after noting that the measure Pn of cylinder sets
0 Cn[Sn/n(j), An, An,..., An] | {z } (k − j) times
j 0 214 is equal to Pn measure of Sn/n(j).
215 Dealing with FOS in general and problems like the average length of sequences, 216 will require considering an infinite number of independent trials or repetitions of the k k k 217 random sampling with replacement experiment. The probability space (Ωn, Fn, Pn) 218 we have defined above for finite values of k, can be generalized to consider sampling 219 sequences of infinite length from the set An, so that the set of all possible outcomes is ∞ 220 Ωn , the Cartesian product of countable many copies of An. The product measure [29, ∞ ∞ ∞ 221 Theorem 6.3, pag.141] allows us to define an extended probability space (Ωn , Fn , Pn ) ∞ 222 where sequences Si/n(k), for a fixed k, are subsets of Ωn defined as Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 8 of 52
∞ Si/n(k) × An × An × · · · ∈ Fn
223 and the probability measure and its properties remain unchanged [29, Par. 13.1]
∞ k Pn Si/n(k) × An × An × ... = Pn Si/n(k) = ∑ P[ωi/n(k)] ωi/n(k) 0 224 where P[ωi/n(k)] is defined in (6). The same obviously applies to Si/n(k) events. 225 If we consider only i/n FOS as outcomes of our experiment, We may define an even 0 0 0 226 more straightforward probability space [29, Example 2, pag.270], say (Ωi/n, Fi/n, Pi/n), 227 where the sample space is the countable set
0 0 Ωi/n = {ωi/n(k), k = i, i + 1, i + 2, . . . } 0 228 and the probability of each elementary sample ωi/n(k) is defined as in (6), while for 0 229 every event E ∈ Fi/n it is
0 0 ∞ 0 Pi/n[E] = ∑ P[ωi/n(k)] = ∑ ∑ P[ωi/n(k)]. 0 ( ≥ )∈ k=i 0 ( )∈ ωi/n k i E ωi/n k E
0 0 0 k 0 230 Hence the probability of the event Si/n(k) is still given by Pi/n[Si/n(k)] = Pn[Si/n(k)]. 231 Remembering the definition of partition of a sample space [41, p.4], the following 232 remarks follow immediately from the definitions above.
233 Remark 2.6. Given k ≥ n, the collection of events {Si/n(k) i = 1, 2, ... , n} is a partition of k 234 Ωn. Hence
Si/n(k) ∩ Sj/n(k) = ∅ for i 6= j, k ≥ n n k [ Ωn = Si/n(k). i=1
0 0 235 Remark 2.7. The collection of events {Si/n(k) k = i, i + 1, i + 2, ... } is a partition of Ωi/n. 236 Hence
0 0 Si/n(k) ∩ Si/n(m) = ∅ for k 6= m, k ≥ i, m ≥ i
∞ 0 [ 0 Ωi/n = Si/n(k). k=i
237 2.2. Probability relations.
238 In order to simplify our expressions, let’s adopt the following symbols for the 0 239 probability of events Si/n(k) and Si/n(k), 1 ≤ i ≤ n, k ≥ i
Notation. k Pn[Si/n(k)] = φi/n(k)
k 0 0 Pn[Si/n(k)] = φi/n(k).
240 This notation will be used throughout the paper. 241 In the following we will derive some relations which are valid under the assumption 242 of the general probability distribution (4)-(6). 243 Note that from Remark 2.4 and 2.5 it follows Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 9 of 52
0 φ1/n(1) = φ1/n(1) = 1 (11)
0 φn/n(n) = φn/n(n) (12)
0 φ1/n(k) = 0, k ≥ 2, (13)
244 while from Remark 2.6 and 2.7
n ∑ φi/n(k) = 1, k ≥ n (14) i=1
∞ 0 ∑ φi/n(k) = 1. (15) k=i
245 Assuming k ≥ n, from (14) and Corollary 2.1.1, which can be written using the new 246 notation
k 0 φn/n(k) = ∑ φn/n(j), (16) j=n
247 it follows
n− k 1 0 ∑ φi/n(k) + ∑ φn/n(j) = 1. (17) i=1 j=n
n−1 248 Since of course ∑i=1 φi/n(k) → 0 as k → ∞, it is interesting to note that the above ∞ 0 249 equation is a heuristic proof of ∑k=n φn/n(k) = 1. 250 From (14) with k = n, (12) and (15) with i = n, we can write the following equality
n− 1 ∞ 0 ∑ φi/n(n) = ∑ φn/n(k). (18) i=1 k=n+1
251 From here on throughout this paper we’ll assume that the the symbols in the set An 252 are equally probable that is, with reference to the probability distribution (4)-(6), we’ll 253 make the following assumption.
254 Assumption 2.1. (Equally probable symbols)
1 P[a ] = , i = 1, 2, . . . , n i n
k 255 Under the above assumption, the probability (6) of any sequence ωn(k) ∈ Ωn is 256 given by
1 P[ωn(k)] = nk
257 while for the probability of FOS and related events it holds
#S (k) φ (k) = i/n (19) i/n nk
0 0 #S (k) φ (k) = i/n , (20) i/n nk
258 where the symbol "#S" denotes the cardinality of the set S. Hence equation (12) can 259 be completed in this case as Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 10 of 52
0 n! φ (n) = φ (n) = . (21) n/n n/n nn
260 while, since #S1/n(k) = n, when k ≥ 1, it is 1 φ (k) = . (22) 1/n nk−1
261 The trivial case i = 1 is completely solved by equations (11), (13) and (22), so in the 262 following we shall assume i ≥ 2. 263 The following lemma follows directly from the definitions.
264 Lemma 2.1. Under Assumption 2.1, the probability of the symbol set and its complementary 265 (see Definition 2.4) of any outcome sequence ωn(k) is independent of the sequence and is given 266 by
i P[Y(ω (k))] = i/n n
n − i P[Yc(ω (k))] = i/n n
267 Corollary 2.1.2. Under Assumption 2.1, from Theorem 2.1 it follows (2 ≤ i ≤ n, k ≥ i) 0 n − i + 1 φ (k) = φ (k − 1) (23) i/n i−1/n n
k k−j 0 i φi/n(k) = ∑ φi/n(j) (24) j=i n
k−1 k−j−1 0 0 i − 1 n − i + 1 φi/n(k) = ∑ φi−1/n(j) (25) j=i−1 n n
268 Proof. From Lemma 2.1 we get
n − i + 1 P[Yc(ω (k − 1))] = i−1/n n
0 i P[Y(ω (j))] = . i/n n
269 By substituting these in (9) and (10) respectively, and remembering the definition 270 (7) of the probability measure, equations (23) and (24) follow. 271 Equation (25) is obtained simply by putting together the previous results.
272 Note that the last equation (25) provides us with a method to calculate the sequence 0 273 (φi/n(k), i = 2, 3, ... , n), starting from (11) and (13), recursively. In particular, for i = 2 274 we get
0 n − 1 φ (k) = , k ≥ 2. (26) 2/n nk−1
275 3. Combinatorics of FOS
276 In the previous section, we have obtained the probability of FOS through the general 277 recursive formula (25) and through closed-form solutions only in a few special cases.The 278 aim of this section is the development of a combinatorial theory of FOS, in order to 279 investigate their connections with other topics of combinatorics, such as set partitions 280 and Stirling numbers of the second kind [37, Ch. 9], then to derive the closed-form 281 probability functions in the next section. Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 11 of 52
0 282 The following simple equation relates the cardinality of the sets Si/n(k) and Si/n(k) 283 defined in Definition 2.2 and 2.3
0 #Si+1/n(k + 1) = #Si/n(k)(n − i). (27)
284 The combinatorics of objects like sequences ωi/n(k) ∈ Si/n(k) is well known and 285 treated within the more general problem of counting functions between two finite sets 286 in enumerative combinatorics [42, 1.9 p. 71], with applications to set partitions, words 287 and random allocation [20, II.3 p. 106]. The problem of finding #Si/n(k) is the same as 288 finding the number of words (or sequences) of length k, over an alphabet of cardinality k 289 n, containing i letters (or symbols). As reported in [36, equation (5)], the total number n 290 of possible sequences can be obtained through the following sum, involving the Stirling 2 291 numbers of the second kind S(k, i) (see [42, equation (1.96) p.75 for details])
k n nk = i!S(k, i). (28) ∑ i i=1
292 The above equation decomposes the total number of sequences, as the sum over i of 293 the number of such sequences that contain exactly i distinct symbols. The same identity, 294 obtained through classical algebraic methods, is reported in [6, equation (29)]). On the 295 other hand, remembering Remark 2.6, we can also write
n k n = ∑ #Si/n(k), (29) i=1
296 hence by comparing (28) and (29), it follows
n #S (k) = i!S(k, i). (30) i/n i
297 The same solution is found in [20, II.6 p. 113], under the hypothesis of random 298 allocation of symbols, which is the same as Assumption 2.1. From (30) and (27) we can 299 finally derive an explicit expression for the cardinality of FOS too. 300 In the rest of this section, we prefer to present an alternative procedure that allows 301 us to derive the number of FOS sequences directly, in order both to find new identities 302 involving the Stirling numbers of the second kind and to show a new combinatorial 303 meaning of these numbers, strictly related to the sequence of primes, which is the main 304 focus of this paper.
305 3.1. Ordered FOS
306 Remembering Definition 2.1 and Remark 2.3, let’s introduce a new kind of combi- 307 natorial objects.
308 Definition 3.1. Let Oi be an ordered subset of i distinct elements from An, we denote with 0 309 Si/n(k/Oi), k-length ordered First Occurrence Sequences (oFOS) with i/n distinct symbols 310 on the ordered choice Oi, (1 ≤ i ≤ n, k ≥ i), the events
0 0 Si/n(k/Oi) = {ωi/n(k/Oi)} 0 311 where ωi/n(k/Oi) is an elementary FOS outcome having Oi as i-distinct ordered choice, 312 replicating the order of the first occurrence of its distinct i symbols.
2 I keep here the same symbol with two different spellings, S and S, both to denote the set of events (sequences) and the Stirling numbers of the second kind. The meaning is always clear from the context, in any case when S denotes a set, it is always followed by a subscript. The adoption of the same symbol is justified by the close connection existing between the cardinality of the set of sequences and the Stirling numbers of the second kind. Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 12 of 52
313 The following example can help to clarify the above definition.
314 Example 3.1. Suppose n = 4, i = 3 and k = 4, with A4 = {1, 2, 3, 4}, O3 = {3, 1, 4}; then 0 315 the event S3/4(4/O3) is made up of the following elementary outcomes:
0 0 S3/4(4/O3) = {ω3/4(4/O3)} = {(3, 3, 1, 4), (3, 1, 3, 4), (3, 1, 1, 4)}.
316 Remark 3.1. The cardinality of oFOS events depends on parameters i and k only, not on the 317 particular choice of symbols in the ordered set Oi.
318 The following theorem establishes the relation between FOS and oFOS.
0 319 Theorem 3.1. The cardinality of the set of FOS Si/n(k) is given by (1 ≤ i ≤ n, k ≥ i):
0 n! #S (k) = g(k, i) (31) i/n (n − i)!
320 where g(k, i) is the number of k-length oFOS on an ordered subset of i symbols Oi
0 #Si/n(k/Oi) = g(k, i). (32)
321 = { } Proof. Given the ordered choice of symbols Oi aj1 , aj2 , ... , aji , any finite subset of i 322 indices
Mi = {m1, m2,..., mi} ⊂ Ik = {1, 2, . . . , k}
323 subject to the constraints
1 = m1 < m2 < m3 < ··· < mi = k,
324 defines a finite-dimensional cylinder set on Oi
( ) = { ( ) = ( ) ∈ = } CMi B1, B2,..., Bk/Oi ωn k x1, x2,..., xk : xm Bm,, m 1, 2, . . . , k
325 through the following assignments:
= { } = ≤ < Bm aj1 for 1 m1 m m2
= { } = Bm aj2 for m m2
= { } < < Bm aj1 , aj2 for m2 m m3
= { } = Bm aj3 for m m3
= { } < < Bm aj1 , aj2 , aj3 for m3 m m4
......
= { } = Bm aji−1 for m mi−1
= { } < < Bm aj1 , aj2 ,..., aji−1 for mi−1 m mi Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 13 of 52
= { } = = Bm aji for m mi k.
326 Therefore the oFOS event is equal to
0 S ( ) = [ ( ) i/n k/Oi CMi B1, B2,..., Bk/Oi Mi
327 and the FOS event
0 0 S ( ) = [ ( ) = [ [ ( ) i/n k Si/n k/Oi CMi B1, B2,..., Bk/Oi . Oi Oi Mi
328 As far as the cardinality of the sets from the above equation we get
0 #Si/n(k) = #{Oi}g(k, i).
329 Since
n #{O } = C(n, i)P(i) = i! i i
330 with C(n, i) i-combinations of n elements, and P(i) number of permutations of i 331 elements, equation (31) follows.
332 3.2. Counting ordered FOS
333 The problem of finding the number of FOS is thus reduced to that of finding the 334 numbers g(k, i). When i = 1, k ≥ i we get immediately
g(1, 1) = 1, g(k, 1) = 0, k > 1; (33)
335 when i = 2, it is
g(k, 2) = 1, k ≥ 2. (34)
336 Indeed for i = 2, there is only one oFOS, whatever the value of k ≥ 2. Note that (33) 337 is a particular case of the general property
g(i, i) = 1, i ≥ 1. (35)
338 Example 3.2. i = 2,O2 = {1, 2}
g(2, 2) = 1, the only oFOS is (1, 2)
g(3, 2) = 1, the only oFOS is (1, 1, 2)
......
339 The calculations are more complex when i ≥ 3.
340 Example 3.3. i = 3,O3 = {1, 2, 3}, through elementary enumerative combinatorics we find
g(3, 3) = 1, the only oFOS is
(1, 2, 3)
g(4, 3) = 3, the oFOS are Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 14 of 52
(1, 1, 2, 3), (1, 2, 1, 3), (1, 2, 2, 3).
g(5, 3) = 7, the oFOS are
(1, 1, 1, 2, 3), (1, 1, 2, 1, 3), (1, 1, 2, 2, 3), (1, 2, 1, 1, 3),
(1, 2, 1, 2, 3), (1, 2, 2, 1, 3), (1, 2, 2, 2, 3).
...
341 Example 3.4. i = 4, O4 = {1, 2, 3, 4}, through elementary enumerative combinatorics we find
g(4, 4) = 1, the only oFOS is
(1, 2, 3, 4)
g(5, 4) = 6, the oFOS are
(1, 1, 2, 3, 4), (1, 2, 1, 3, 4), (1, 2, 3, 1, 4),
(1, 2, 2, 3, 4), (1, 2, 3, 2, 4), (1, 2, 3, 3, 4).
g(6, 4) = 25, the oFOS are
(1, 1, 1, 2, 3, 4), (1, 1, 2, 1, 3, 4), (1, 1, 2, 2, 3, 4), (1, 1, 2, 3, 1, 4), (1, 1, 2, 3, 2, 4),
(1, 1, 2, 3, 3, 4), (1, 2, 1, 1, 3, 4), (1, 2, 1, 2, 3, 4), (1, 2, 2, 1, 3, 4), (1, 2, 2, 2, 3, 4),
(1, 2, 1, 3, 1, 4), (1, 2, 1, 3, 2, 4), (1, 2, 2, 3, 1, 4), (1, 2, 2, 3, 2, 4), (1, 2, 1, 3, 3, 4),
(1, 2, 2, 3, 3, 4), (1, 2, 3, 1, 1, 4), (1, 2, 3, 1, 2, 4), (1, 2, 3, 2, 1, 4), (1, 2, 3, 2, 2, 4),
(1, 2, 3, 1, 3, 4), (1, 2, 3, 3, 1, 4), (1, 2, 3, 3, 3, 4), (1, 2, 3, 2, 3, 4), (1, 2, 3, 3, 2, 4).
...
342 I’ll give in the following some results about the calculations of g(k, i) numbers, 343 starting with the next lemma that states a recursive formula. Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 15 of 52
344 Lemma 3.1. The number of k-length oFOS on an ordered subset Oi, as function of the cardinal- 345 ity on an ordered subset Oi−1, is
k−1 g(k, i) = ∑ g(j, i − 1)(i − 1)k−j−1 , for 2 ≤ i ≤ k, (36) j=i−1
346 starting with g(k, 1), k ≥ 1, given by (33).
347 Proof. Equation (36) follows directly from definition 3.1 and remark 3.1, considering 348 that from any j-length oFOS on an ordered set Oi−1 (i − 1 ≤ j ≤ k − 1), one can obtain k−j−1 349 (i − 1) k-length oFOS on an ordered set Oi having the i-th symbol in the last position 350 k of the sequence. 351 Note that this lemma holds independently from Assumption 2.1 of equally probable 352 symbols.
353 The result reported below is simply the sum of the first k terms of a geometric series 354 and will be referred to by the next theorem.
355 Lemma 3.2. Given q real, |q| < 1, then
k 1 − qk+1 qj = , k ≥ 0 integer. ∑ − j=0 1 q
356 Theorem 3.2. Let g(k, i) be the number of k-length oFOS on an ordered set of i distinct elements. 357 Then the following statements hold:
i−2 k−i+1 k−i+1 g(k, i) = ∑ ai,j (i − 1) − j , i ≥ 3, k ≥ i (37) j=1
358 with coefficients ai,j defined recursively as
j a = − a , i ≥ 3, j = 1, 2, . . . , (i − 2) (38) i+1,j i − j i,j
i−2 ai+1,i−1 = (i − 1) ∑ ai,j (39) j=1
359 starting with
a3,1 = 1 .
360 Proof. Let’s proceed by induction. Equation (37) holds for i = 3, indeed from Lemma 361 3.1 remembering (34) we get
k−1 k−3 1 ( ) = k−j−1 = k−3 ≥ g k, 3 ∑ 2 2 ∑ j , k 3, j=2 j=0 2
362 hence from Lemma 3.2
g(k, 3) = 2k−2 − 1.
363 Let’s now prove that if (37) is true for i, i ≥ 3, then it is true for (i + 1). Substituting 364 (37) in (36) for g(k, i + 1) we get
k−1 i−2 s−i+1 s−i+1 k−s−1 g(k, i + 1) = ∑ ∑ ai,j (i − 1) − j i s=i j=1 Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 16 of 52
365 from which, after some manipulations, one gets
i−2 k−1 s+1 s+1 ai,j i − 1 ai,j j g(k, i + 1) = ik − . ∑ ∑ ( − )i i j=1 s=i i 1 i j i
366 The single sum
k−1 s+1 s+1 ai j i − 1 ai j j k , − , = i ∑ i i s=i (i − 1) i j i
i+1 k−i−1 s i+1 k−i−1 s ai j i − 1 i − 1 ai j j j = k , − , i i ∑ i ∑ (i − 1) i s=0 i j i s=0 i
367 after the application of Lemma 3.2
k−i 1 − i−1 k−i−1 i − 1 s i ∑ = s=0 i i−1 1 − i
k−i 1 − j k−i−1 j s i ∑ = s=0 i j 1 − i
368 and some manipulations, can be written as
k−1 s+1 s+1 ai j i − 1 ai j j k , − , = i ∑ i i s=i (i − 1) i j i j = a (i − 1)(ik−i − (i − 1)k−i) − a (ik−i − jk−i) . i,j i,j i − j
369 Finally we can write
i−2 j g(k, i + 1) = a (i − 1)(ik−i − (i − 1)k−i) − a (ik−i − jk−i) ∑ i,j i,j − j=1 i j
i−2 i−2 j g(k, i + 1) = (i − 1) ik−i − (i − 1)k−i a − a (ik−i − jk−i) ∑ i,j ∑ i,j − j=1 j=1 i j
370 hence
i−1 k−i k−i g(k, i + 1) = ∑ ai+1,j i − j j=1
371 with ai+1,j given by (38), j = 1, 2, . . . , i − 2, ai+1,i−1 given by (39).
372 The last theorem gives us a method to calculate each coefficient of the row i + 1 373 from the coefficients of the previous row i. The following corollary state an alternative 374 way to calculate the last term of each row, as function of the other coefficients of the 375 same row.
376 Corollary 3.2.1. It is Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 17 of 52
i−2 ai+1,i−1 = 1 − ∑(i − j)ai+1,j , i ≥ 3 . (40) j=1
377 Proof. The result follows directly from (35) and (37):
i−2 ∑ ai,j(i − j − 1) = 1 , i ≥ 3, j=1
i−2 i−2 (i − 1) ∑ ai,j = 1 + ∑ jai,j . j=1 j=1
378 Hence, remembering (38) and (39), after some manipulations, one gets (40).
379 The first rows of coefficients ai,j are reported in Table2 .
Table 2. Coefficients ai,j, 3 ≤ i ≤ 8.
j = 1 j = 2 j = 3 j = 4 j = 5 j = 6 i = 3 1 1 i = 4 − 2 2 1 9 i = 5 6 −2 2 1 4 27 32 i = 6 − 24 3 − 4 3 1 2 27 64 625 i = 7 120 − 3 4 − 3 24 1 4 81 256 3125 324 i = 8 − 720 15 − 16 9 − 48 5
380
381 Theorem 3.2 and the subsequent corollary state a method of calculating coefficients 382 ai,j by rows. The next proposition establishes another way to find them, proceeding by 383 columns.
384 Corollary 3.2.2. Given the coefficient aj+2,j, head of column j, the successive coefficients in the 385 same column can be calculated as
(−j)i−j−2 a = a + , j ≥ 1, i ≥ j + 3. (41) i,j (i − j − 1)! j 2,j
386 The column head coefficients can be calculated as a function of the previous ones only, 387 through the following formula
− j 1 (−l)j−l−1 a = j a , j ≥ 2, (42) j+2,j ∑ ( − ) l+2,l l=1 j l !
388 starting with
a3,1 = 1.
389 Proof. Equation (41) is derived simply through a recursive application of equation (38), 390 from the row index j + 3 to the required row index i. 391 By substituting (41) into (39), we obtain the second equation (42).
392 The next theorem states a general non-recursive formula to get the coefficients ai,j 393 directly as a function of the indices i, j. Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 18 of 52
394 Theorem 3.3. The following equation holds for each coefficient ai,j
ji−3 a = (−1)i−j−2 , i ≥ 3, j = 1, 2, . . . , (i − 2). (43) i,j (i − j − 1)!(j − 1)!
395 Proof. First of all, let’s prove by induction, the formula holds in the case of column head 396 coefficients. In this case, using the notation of Corollary 3.2.2, equation (43) becomes
ll−1 a + = , l ≥ 1. (44) l 2,l (l − 1)!
397 The above equation is true for l = 1, since we know it is a3,1 = 1. Assuming it is 398 true for l = 1, 2, . . . , j, from (42) we get the identity
− j 1 (−1)j−l−1lj−2 jj−1 j = , (45) ∑ ( − ) ( − ) ( − ) l=1 j l ! l 1 ! j 1 !
399 where the term on the left-hand side of the equation is the value of aj+2,j calculated 400 through (42), the one on the right-hand side is the value of the same coefficient given 401 by (44). Then the same identity holds for the next head of column coefficient aj+3,j+1; 402 indeed equation (42) leads to the result
j (−1)j−llj−1 (j + 1)j a = (j + 1) = . j+3,j+1 ∑ ( − + ) ( − ) l=1 j l 1 ! l 1 ! j!
403 The proof of (43) in the case of non-head of column coefficients, follows simply 404 from (41) after assuming (44).
405 The identity (45) deserves some more attention: it expresses the normalization ∞ 0 406 property of the FOS probabilities ∑j=n φn/n(j) = 1 (see forward equation (76) and 407 Remark 4.1). After some simple manipulations it can be written as
j j + 1 (−1)l lj = (−1)j(j + 1)j, (46) ∑ l l=0
408 that appears to be "complementary" of the well-known identity (see [23, equation 409 (1.13)])
j j (−1)l lj = (−1)j j!. (47) ∑ l l=0
410 3.3. Ordered FOS and Stirling numbers of the second kind
411 The solution we have found about the number of elementary sequences of the event 0 412 Si/n(k/Oi), presented in Definition 3.1 as oFOS of length k with i distinct symbols on 413 the ordered choice Oi, allows us to show the equivalence of g(k, i) numbers and Stirling 414 numbers of the second kind. Indeed from equation (37) of Theorem 3.2 it follows
i−2 i−2 k−i+1 k−i+1 g(k, i) = (i − 1) ∑ ai,j − ∑ ai,j j , i ≥ 3, k ≥ i j=1 j=1
415 and remembering (39) of the same theorem
i−2 k−i k−i+1 g(k, i) = ai+1,i−1(i − 1) − ∑ ai,j j . (48) j=1
416 Substituting in the above equation ai,j and ai+1,i−1 as given by (43) and (44) of 417 Theorem 3.3 we get Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 19 of 52
i−1 jk−2 g(k, i) = (−1)i−j−1 . ∑ ( − − ) ( − ) j=1 i j 1 ! j 1 !
418 Since
1 i − 1 j = (i − j − 1)!(j − 1)! j (i − 1)!
419 we finally arrive at
(−1)i−1 i−1 i − 1 g(k, i) = (−1)j jk−1, i ≥ 3, k ≥ i. (49) ( − ) ∑ j i 1 ! j=0
420 The right-hand side of the previous equation is the explicit formula for the Stirling 421 number of the second kind S(k − 1, i − 1) (see [37, equation (9.21)]). The equivalence can 422 be extended to values i = 1 and i = 2, since from equation (34) for i = 2 it follows
S(k − 1, 1) = 1, k ≥ 2
423 and for i = 1 from (33)
S(k − 1, 0) = 0, k > 1,
S(0, 0) = 1.
424 Note that the last equation solves the "bit tricky" case of the Stirling number S(0, 0) 425 [24, p. 258], by calculating it instead of assuming it "by convention" (this assumption is 426 common to all the treatments of the subject, see for example [42, p. 73]). We have thus 427 proved
0 428 Theorem 3.4. Given g(k, i), the cardinality of the oFOS set Si/n(k/Oi) defined in Definition 429 3.1, and the Stirling number of the second kind S(k − 1, i − 1), the following equation holds true
g(k, i) = S(k − 1, i − 1), i ≥ 1 ,k ≥ i. (50)
430 Remark 3.2. Due to the equivalence established by the previous theorem, equation (36) of Lemma 431 3.1 can be rewritten as
k S(k, i) = ∑ S(j − 1, i − 1)ik−j, j=i
432 that can be obtained through the repeated application of the recursive equation of the Stirling 433 numbers of the second kind (see [37, equation (9.1)])
S(k + 1, n) = nS(k, n) + S(k, n − 1). (51)
434 When k = i we know it is (see equation (35)) g(i, i) = 1, hence from the general 435 formula (49) we get
(−1)i−1 i−1 i − 1 g(i, i) = (−1)j ≤ ji−1 = 1 ( − ) ∑ j i 1 ! j=0
436 or
i−1 i − 1 (−1)j ji−1 = (−1)i−1(i − 1)! ∑ j j=0 Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 20 of 52
437 which is the identity (47) mentioned above. This identity is thus closely related 438 to the Stirling numbers and to the general property g(i, i) = S(i − 1, i − 1) = 1, i ≥ 1 439 (see [6, equation (4) and the references therein]). Note that by substituting (31), with 440 g(k + 1, i + 1) given by (50), into equation (27) we obtain for #Si/n(k) the same equation 441 (30). 442 We know Stirling numbers of the second kind have a combinatorial meaning, 443 directly related to the problem of partitioning a set into a fixed number of subsets: S(n, m) 444 counts the number of ways a set of n elements, can be partitioned into m nonempty 445 disjoint subsets ([37, Ch. 9, p. 113], [42, Ch. 1.9, p. 73], see also [24] for a complete 446 combinatorial treatment of the Stirling numbers). The analysis performed in this section 447 highlighted an alternative combinatorial interpretation of these numbers, together with 448 the close connection of oFOS with set partitions, that will be shown to be strictly related 449 to the sequence of primes in the next section.
450 4. First Occurrence Sequences, set partitions and the sequence of primes
451 In this section, we will delve into the FOS sequences as a model of the distribution 452 of prime numbers. The simple oFOS sequences can already act as a first-level model of 453 the prime number distribution, as we will see in the next subsection, while the complete 454 model based on FOS will be developed at the end of this section. In order to develop 455 the tools to build this model, we continue in the following to deepen the combinatorial 456 implications of Theorem 3.4, exploring some interesting identities involving Stirling 457 numbers of the second kind and harmonic numbers, derived through the probabilities 458 of FOS.
459 4.1. Ordered FOS and the Prime Number Theorem
460 The equivalence (50) stated by Theorem 3.4 and the combinatorial meaning asso- 461 ciated with S and g numbers suggest the following analogy with the distribution of 462 primes. Given any integer set of the type
Ik+1 = {1, 2, 3, 4, . . . , k, k + 1}, k ≥ 2, (k + 1) prime,
463 there exists a sequence of (n + 1) successive primes
On+1 = {p1 = 2, p2 = 3, . . . , pn, pn+1 = k + 1} ⊂ Ik+1
464 such that the prime-counting function π(k) = n. The set On+1 can be viewed as 465 an ordered choice of n + 1 integers and the set Ik+1 as an oFOS over On+1 (remember 466 Definition 3.1). Note that the integers i between two consecutive primes, pj < i < pj+1, 467 are multiples of the primes from p1 to pj, hence they can be considered as repetitions 468 of these symbols, thus confirming the oFOS schema. In the light of this analogy, some 469 classical results of the theory of random partitions of finite sets can be reinterpreted as a 470 general form of the PNT, which holds for all oFOS-like sequences. 471 The oFOS model of the distribution of prime numbers can be defined as follows. Let 472 consider the number of primes less than or equal to k, represented by the prime-counting 473 function π(k), as a random variable πo(k) with probability mass function defined by
g(k + 1, n + 1) P [ (k) = n] = n = k k πo k , 1, 2, . . . , , (52) ∑j=0 g(k + 1, j + 1)
474 hence, from Theorem 3.4
S(k, n) P [ (k) = n] = n = k k πo k , 1, 2, . . . , . (53) ∑j=0 S(k, j) Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 21 of 52
475 Equation (53) defines a uniform probability distribution on the class Πk of partitions −1 476 of a set of k elements, obtained by assigning the same probability Bk to each partition 477 σ ∈ Πk, 1 Pk(σ) = , Bk
478 where the total number of partitions Bk is equal to the k-th Bell number defined as 479 [37, 9.4, p. 133]
k Bk = ∑ S(k, j). (54) j=1
3 480 The combinatorics of Πk is studied in [28] and [39] in connection with the asymp- 4 481 totic (k → ∞) distribution of the probability measure Pk. From this approach it follows 482 that the random variable πo(k) with uniform probability (53), defined as a model of 483 π(k), has mathematical expectation and variance asymptotically equal to (see [40, Ch. 4, 484 p. 114-115])
k E[π (k)] = (1 + o(1)) o r
k Var[π (k)] = (1 + o(1)), o r2 r 485 where r = W(k), the unique positive solution of re = k, W(x) being the Lambert 486 function [13] defined in the domain x ≥ −1/e. Since for the function W(k) we know it 487 is r ∼ ln k as k → ∞, the above equations say the average and variance of the random 488 variable πo(k) are asymptotic to
k E[π (k)] ∼ (55) o ln k
k Var[πo(k)] ∼ . (56) ln2 k
489 Equation (55), and the related variance (56), represents the form assumed by the 490 PNT when considering oFOS-like sequences with uniform probability distribution of 491 set partitions. This is not the only admissible distribution. About this problem and the 492 general conditions in order that a uniform distribution is obtained, through random 493 allocation algorithms (or random urn models), see [43] and the references therein. The 494 following result (see [40, Theorem 1.1, p. 115]) completes the application of the methods 495 derived from the theory of random partitions of sets, to the oFOS model of the prime 496 number distribution. The probability distribution of the normalized random variable
π (k) − E[π (k)] η = o o k 1/2 (Var[πo(k)])
497 converges to that of the standard normal distribution as k → ∞, that is:
Z x 1 −u2/2 lim P{ηk < x} = √ e du. k→∞ 2π −∞
3 See also [20, Ch. IX, p. 692] for a brief summary of these results. 4 Harper points out the methods used are based on the combinatorial implications of probability theory and ascribes to W. Feller (see [19, Ch. X, p. 256]) and V. Goncharov (see [28, and the references therein]), the first applications in this field. Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 22 of 52
498 4.2. FOS probabilities and some combinatorial identities involving Stirling numbers 0 499 Let’s collect in this subsection some formulas about the probability function φi/n(k), 500 which will be referred to in the rest of this paper. Under Assumption 2.1, from equation 501 (31) of Theorem 3.1 the probability of FOS is given by
0 n! φ (k) = g(k, i), 1 ≤ i ≤ n, k ≥ i. (57) i/n (n − i)!nk
502 By substituting g(k, i) with equations (37) and (39) of Theorem 3.2, we get the 503 following expressions for 3 ≤ i ≤ n, k ≥ i
i−2 0 n! (k) = a (i − )k−i+1 − jk−i+1 φi/n k ∑ i,j 1 (58) (n − i)!n j=1
i−2 0 n! (k) = (i − )k−ia − a jk−i+1 φi/n k 1 i+1,i−1 ∑ i,j . (59) (n − i)!n j=1
0 504 Finally from (43) of Theorem 3.3 the explicit expression for φi/n(k) follows
i−1 k−2 0 n! j φ (k) = (−1)i−j−1 . (60) i/n k ∑ ( − − ) ( − ) (n − i)!n j=1 i j 1 ! j 1 !
505 If we set l = i − j − 1, the above equation can be written
i−2 k−2 0 n! (i − l − 1) φ (k) = (−1)l . (61) i/n ( − ) k ∑ ( − − ) n i !n l=0 i l 2 !l! 5 506 When i = n (n ≥ 2, k ≥ n), equation (60) becomes
n−1 k−2 0 n! j φ (k) = (−1)n−j−1 , (62) n/n k ∑ ( − − ) ( − ) n j=1 n j 1 ! n 1 !
507 which, observing that
k−1 n! jk−2 n − 1 j = , nk (n − j − 1)!(n − 1)! j n
508 can be written as
n−1 k−1 0 n − 1 j φ (k) = (−1)n−j−1 . (63) n/n ∑ j j=1 n
509 Equation (61) when i = n (n ≥ 2, k ≥ n), becomes
n−2 k−2 0 n! (n − l − 1) φ (k) = (−1)l , (64) n/n k ∑ ( − − ) n l=0 n l 2 !l!
510 which, observing that
k−1 n! (n − l − 1)k−2 n − 1 l + 1 = 1 − nk (n − l − 2)!l! l n
511 can be written as
5 It’s easy to check this equation, and (64) below, applies to the case n = 2 too. Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 23 of 52
n−2 k−1 0 n − 1 l + 1 φ (k) = (−1)l 1 − . (65) n/n ∑ l l=0 n
512 Let’s now derive the sequence probabilities expressed through the Stirling numbers 513 of the second kind. Remembering the equivalence (50), equation (57) can be written as
0 n! φ (k) = S(k − 1, i − 1), 1 ≤ i ≤ n, k ≥ i. (66) i/n (n − i)!nk
514 From this equation and (27), always under the hypothesis of equally probable 515 symbols, hence (19), (20), it is easy to get
n! φ (k) = S(k, i), 1 ≤ i ≤ n, k ≥ i. (67) i/n (n − i)!nk
516 The above equations, combined with the general properties of the probability 517 functions (11)-(18), give rise to a series of combinatorial identities involving the Stirling 518 numbers of the second kind. The following corollary reports the less trivial ones.
519 Corollary 4.1. The following identities hold true. 520 From (14) it follows
n S(k, i) nk = , k ≥ n. (68) ∑ ( − ) i=1 n i ! n!
521 From (15) it follows
∞ S(k − 1, i − 1) (n − i)! = , ≤ i ≤ n. ∑ k 1 (69) k=i n n!
522 From (16) it follows
S(k, n) k S(j − 1, n − 1) = . k ∑ j (70) n j=n n
523 From (17) it follows
n−1 S(k, i) k S(j − 1, n − 1) nk + nk = . (71) ∑ ( − ) ∑ j i=1 n i ! j=n n n!
524 From (18) it follows
∞ S(k − 1, i − 1) 1 n−1 S(n, i) = . (72) ∑ k n ∑ ( − ) k=n+1 n n i=1 n i !
525 Note that identity (68) is the same as (28) and identities (68)(71) imply (70). 0 526 I report in the following two more results about the sum of probabilities φn/n(k), 527 that clarify the probabilistic meaning of identity (46) and the close connection between 528 these probabilities and the Stirling numbers of the second kind.
529 Corollary 4.2. The following equations hold true
k n−2 k 0 n j + 1 φ (s) = 1 − (−1)j 1 − , (73) ∑ n/n ∑ j + 1 s=n j=0 n
k 0 n! (s) = S(k n). ∑ φn/n k , (74) s=n n Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 24 of 52
530 of equation (73). From (65)
k n−2 k s 0 n − 1 n n − j − 1 φ (s) = (−1)j ≤ . ∑ n/n ∑ n − j − 1 − − ∑ s=n j=0 n j 1 s=n n
531 By applying Lemma 3.2
n−1 k + + 1 − j 1 − 1 − j 1 n k n − j − 1 s n n ∑ = n − j − 1 s=n n j+1 n
532 hence
k n−2 " n−1 k# 0 n j + 1 j + 1 φ (s) = (−1)j 1 − − 1 − . (75) ∑ n/n ∑ j + 1 s=n j=0 n n
∞ 0 533 Since we know that ∑s=n φn/n(s) = 1 and
j + 1 k 1 − → 0 as k → ∞ n
534 it is
n−1 n−2 n j + 1 (−1)j 1 − = 1. (76) ∑ j + 1 j=0 n
535 Finally from (75) and (76) equation (73) follows.
536 of equation (74). Remembering (66) we can write
k k 0 S(j − 1, n − 1) ( ) = ∑ φn/n j n! ∑ j j=n j=n n
537 hence through identity (70) we simply get (74).
538 Remark 4.1. Note that equation (76) above is the same as (45), (46). Indeed after some manipu- 539 lations it can be written as
(−1)n−1 n−1 n (−1)j jn−1 = 1. n−1 ∑ j n j=0
540 This identity has thus a probabilistic meaning, connected with the normalization property of FOS 541 probabilities.
542 Remark 4.2. We know [34, Eq. 26.8.42] the Stirling number of the second kind, for fixed value 543 of n, as k → ∞ is asymptotic to
nk S(k, n) ∼ . n!
544 This asymptotic behavior follows very simply from equation (74), due to the connection 545 established between FOS probabilities and S(k, n). Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 25 of 52
546 4.3. Bounds of g(k, i) numbers and Stirling numbers of the second kind
547 We speak of g(k, i) numbers since the results are directly related to their formula- 548 tion through the coefficients ai,j, as stated by Theorem 3.2. The results are obviously 549 applicable to the Stirling numbers of the second kind.
550 Definition 4.1. Let’s define for i ≥ 3, 1 ≤ j ≤ (i − 2)
j j−s σi,j = ∑ (−1) |ai,s| = |ai,j| − |ai,j−1| + · · · ± |ai,1| s=1
551 and
j j−s k−i+1 k−i+1 k−i+1 τi,j(k) = ∑ (−1) |ai,s|s = |ai,j|j − |ai,j−1|(j − 1) + · · · ± |ai,1|, s=1
552 where the sign of |ai,1| is negative if j is an even number, positive if it’s an odd number.
553 Lemma 4.1. The following statements hold true:
a + − lim i 1,i 1 = e ; (77) i→∞ ai,i−2
σ − lim i,i 3 = 1 ; (78) i→∞ ai,i−2
ai,i−2 > σi,i−3, i ≥ 5 ; (79)
σi,i−3 > 0, i ≥ 5 ; (80)
σi,i−2 > ai,i−2 ; (81)
|ai,i−3| > σi,i−4, i ≥ 5 . (82)
554 Proof. Equation (77) follows simply from the values of the coefficients given by (44). 555 Equation (39) of Theorem 3.2 gives
a + − σ − i 1,i 1 = 1 − i,i 3 (i − 1)ai,i−2 ai,i−2
556 from that and (77) equation (78) follows. 557 Considering Definition 4.1, from (39) we can write
ai+1,i−1 a − − σ − = i,i 2 i,i 3 (i − 1)
558 with ai+1,i−1 > 0, hence (79). 559 From equation (44) it is easy to prove that
ai+1,i−1 0 < < a − for i ≥ 4 (i − 1) i,i 2
560 hence the previous equation implies (80) too. 561 Since
σi,i−2 = ai,i−2 − σi,i−3 and Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 26 of 52
σi,i−3 = |ai,i−3| − σi,i−4
562 (80) implies both (81) and (82).
563 Lemma 4.2. Given σi,j and τi,j(k) of Definition 4.1, it is
τi,j(k) > σi,j.
564 Proof. Suppose j even, then we can write
k−i+1 k−i+1 |ai,s|s − |ai,s−1|(s − 1) > |ai,s| − |ai,s−1|, s=2,4,. . . ,j.
565 By adding the pairs of the previous inequality, we obtain the thesis. 566 In the case when j is odd, we can repeat the above procedure considering the pairs for 567 s = 3, 5, 7, ... , j, then adding up these pairs and the last term |ai,1| taken with a positive 568 sign.
569 Theorem 4.1. Given the numbers g(k, i), i ≥ 3, k ≥ i, the following bounds hold
k−i k−i+1 k−i ai+1,i−1(i − 1) − ai,i−2(i − 2) < g(k, i) < ai+1,i−1(i − 1) (83)
570 and, for i fixed
g(k, i) = lim k−i 1 (84) k→∞ ai+1,i−1(i − 1)
571 Proof. From the Definition 4.1 we can write in general
σi,j = |ai,j| − σi,j−1
k−i+1 τi,j(k) = |ai,j|j − τi,j−1(k),
572 and, remembering (48),
k−i g(k, i) = ai+1,i−1(i − 1) − τi,i−2(k)
k−i k−i+1 g(k, i) = ai+1,i−1(i − 1) − ai,i−2(i − 2) + τi,i−3(k).
573 The thesis (83) follows from Lemma 4.2 and (80), (81) of Lemma 4.1. 574 The inequalities (83) can be rewritten as
i k a − (i − 1) (i − 2) g(k, i) − (i − ) i,i 2 < < 1 2 i k k−i 1, ai+1,i−1 (i − 2) (i − 1) ai+1,i−1(i − 1)
575 from that equation (84) follows.
576 Remark 4.3. The closed-form (44) of coefficients allows us to rewrite (83) and (84) as
(i − 1)k−1 (i − 2)k−1 (i − 1)k−1 − < g(k, i) < , (85) (i − 1)! (i − 2)! (i − 1)!
(i − 1)k−1 g(k, i) ∼ . (86) (i − 1)!
577 When speaking of Stirling numbers of the second kind, the previous bounds become
ik (i − 1)k ik − < S(k, i) < . (87) i! (i − 1)! i! Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 27 of 52
578 Note that (87) gives an alternative proof of the asymptotic behavior of S(k, i) for fixed i (see 579 Remark 4.2).
580 Remark 4.4. Assuming the left-hand side of the above inequality is positive both for S(k, i) and 581 S(k − 1, i − 1), that is
ln (i − a) − < = a 1 k, a 1, 2 ln 1 − i−a
582 we get the bounds of the ratio
(i − 1)k−1 (i − 2)k−1 S(k − 1, i − 1) (i − 1)k−1 − (i − 1) < < , ik−1 ik−1 S(k, i) ik−1 − (i − 1)k
583 that for i 1 becomes
− k − k S(k − 1, i − 1) − k 1 e i 1 − ie i < < e i , (88) − k S(k, i) 1 − ie i
584 under condition
i ln i < k.
585 4.4. FOS average length and harmonic numbers
586 As anticipated in the Introduction, calculating the average length of FOS sequences, 0 0 587 that is of elementary events ωi/n(k), k ≥ i, of the countable space Ωi/n, is a simple task 588 leading to the result (2) in the case i = n. This calculation is based on the following 589 well-known results, in particular on Lemma 4.4 expressing the average number of trials 590 one has to wait before obtaining the first success, in a sequence of independent Bernoulli 591 trials with p probability of success.
592 Lemma 4.3. Given q real, |q| < 1, then
∞ q kqk = . ∑ ( − )2 k=1 1 q
593 Lemma 4.4. Given p real, 0 < p ≤ 1, then
∞ 1 ∑ kp(1 − p)k−1 = . k=1 p
0 594 Therefore if we denote in general with Li/n the quantity we are looking for, we can 595 establish the following recursive relation
0 0 n L = L + , i = 1, 2, . . . , n (89) i/n i−1/n n − i + 1
596 starting with
0 L0/n = 0.
597 It is easy to see the above equation leads to
0 Li/n = n(Hn − Hn−i), i = 1, 2, . . . , n, (90)
598 where Hn is the n-th harmonic number 1 1 1 H = 1 + + + ··· + n 2 3 n Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 28 of 52
599 and
H0 = 0.
600 If we consider the set of discrete random variables qi/n with probability mass 601 function
0 φi/n(k) = Prob{qi/n = k}, 1 ≤ i ≤ n, k ≥ i, 0 602 then the quantity Li/n can be also obtained as the mathematical expectation of the 603 corresponding random variable qi/n, hence through
0 ∞ 0 Li/n = E[qi/n] = ∑ kφi/n(k). (91) k=i
604 From (91) and (90), remembering (66), we get the following relation between har- 605 monic numbers and Stirling numbers of the second kind.
Corollary 4.3.
(n − 1)! ∞ k H − H = S(k − 1, i − 1), 1 ≤ i ≤ n. (92) n n−i ( − ) ∑ k n i ! k=i n
606 Other combinatorial identities involving harmonic numbers and finite term se- 607 ries, can be derived from the direct calculation of (91), as established by the following 6 608 proposition.
609 Theorem 4.2. Assuming n and i positive integers, n ≥ 3 and 3 ≤ i ≤ n, the following identity 610 holds true
Hn − Hn−i =
" i−2 # n! a + − j ni + j − ij = i 1,i 1 ni − (i − 1)2 − a (93) ( − ) i ( − + )2 ∑ i,j − − n i !n n i 1 j=1 n j n j
611 where ai,j are the coefficients defined by Theorem 3.2.
0 612 Proof. By substituting in (91) the term φi/n(k) as given by (58), we get
∞ i−2 0 n! L = k a (i − )k−i+1 − jk−i+1 i/n ∑ k ∑ i,j 1 k=i (n − i)!n j=1
613 from which it follows
i−2 " ∞ k−i+1 ∞ k−i+1# 0 n! i − 1 j L = a k − k . (94) i/n ( − ) i−1 ∑ i,j ∑ ∑ n i !n j=1 k=i n k=i n
614 We can write
− + ∞ i − 1 k i 1 ∞ i − 1 k (i − 1)2 ∞ i − 1 k ∑ k = ∑ k + ∑ k=i n k=1 n n k=0 n
6 These identities appear to be new, like the method of deriving them. Of course, it’s not simple to establish the novelty of an identity involving harmonic numbers because of the historical interest of this subject and the great development of theory and applications. For an extensive collection of these identities, see [9], [10], [11] and the references therein. Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 29 of 52
615 and remembering Lemma 4.3 and the sum of geometric series
− + ∞ i − 1 k i 1 (i − 1) k = [ni − (i − 1)2]. ∑ ( − + )2 k=i n n i 1
616 In a similar way the second term becomes
− + ∞ j k i 1 j j + ni − ij k = . ∑ − − k=i n n j n j
617 Inserting the previous formulas into (94) we get
i−2 " # 0 n! (i − 1) j j + ni − ij L = a [ni − (i − 1)2] − i/n ( − ) i−1 ∑ i,j ( − + )2 − − n i !n j=1 n i 1 n j n j
618 from which and equations (39), (90), (93) follows.
619 The following corollary is a simple application of the formulas of Theorem 3.3 to 620 that of the previous theorem.
621 Corollary 4.2.1. For n and i positive integers, n ≥ 3 and 3 ≤ i ≤ n, the following identity 622 holds true
Hn − Hn−i =
" # n! (i − 1)i−2 ni − (i − 1)2 i−2 ji−2 (ni + j − ij) = − (−1)i−j−2 . ( − ) i ( − ) ( − + )2 ∑ ( − − ) ( − ) ( − )2 n i !n i 2 ! n i 1 j=1 i j 1 ! j 1 ! n j
623 When i = n we get for the n-th harmonic number
" # n! (n − 1)n−2 n−2 jn−2 (n2 + j − nj) H = (2n − 1) − (−1)n−j−2 . n n ( − ) ∑ ( − − ) ( − ) ( − )2 n n 2 ! j=1 n j 1 ! j 1 ! n j
624 4.5. FOS as a model of the distribution of primes
625 This subsection aims to analyse analogies and limits of FOS with respect to the true 626 sequence of prime numbers. There is a correspondence between prime numbers and 627 FOS, as stated by the following remark.
628 Remark 4.5. Given any prime pn, n ≥ 2, there exists an event FOS
∞ 0 0 [ 0 Sn/n(pn) ∈ Ωn/n = Sn/n(k) k=n
629 (see Remark 2.7), and the whole sequence of prime numbers corresponds to the element
0 (Sn/n(pn), n = 2, 3, . . . ) ∈ U
630 of the product set
∞ 0 U = ∏ Ωn/n, n=2
631 we can define as FOS universe. Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 30 of 52
0 632 The model based on FOS takes then into consideration the set Ωn/n as the com- 633 binatorial counterpart of the n-th prime pn, considered as a discrete random variable 634 assuming integer values between n and ∞. The statistics of this random variable are 635 then used to define a "counting variable", which is the FOS model equivalent of the 636 prime-counting function. In order to avoid any confusion when dealing with our model, 637 we keep the symbol pn to denote the "true" value of the n-th prime, and adopt the 638 symbol qn for the random variable modeling it. The basic assumption of the model is 639 the following.
640 Definition 4.2. The discrete random variable qn has probability mass function
0 φn/n(k) = Prob{qn = k}, k = n, n + 1, n + 2, . . . , 0 641 with φn/n(k), defined in equation (20).
642 The results of the model are referred to the sequence of random variables {qn} 643 and confronted with the analytical or numerical results we know about the true prime 644 sequence {pn}. These results, some of which are reported in two sample tables, justify a 645 posteriori the model’s adoption and the not completely rigorous approach we follow in 646 deriving it. 647 From (66) and (90) respectively we get the probability
0 n! φ (k) = S(k − 1, n − 1), k ≥ n n/n nk
648 and the mean value of qn
0 E[qn] = Ln/n = nHn ∼ n ln n. (95)
649 Hence, given k ≥ n, k integer, the probability distribution function of qn
Prob{qn ≤ k} = Φn(k) (96)
650 with
k k 0 S(j − 1, n − 1) ( ) = ( ) = Φn k ∑ φn/n j n! ∑ j , (97) j=n j=n n
651 is simply expressed through the number of ways a set of k elements can be parti- 652 tioned into n disjoint nonempty subsets
(n − 1)! Φn(k) = S(k, n), (98) nk−1
653 where the last equation follows from (74). 654 Let’s define the discrete random variable ξk whose probability distribution is in- 655 duced by qn through the following
Definition 4.3. Prob{ξk ≥ n} = Prob{qn ≤ k} = Φn(k) (99)
656 and obviously
Prob{ξk < n} = 1 − Φn(k). (100)
657 In our combinatorial model, ξk is associated with the value π(k) of the prime- 658 counting function, considered as a random variable assuming integer values between 1 659 and k. Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 31 of 52
660 Theorem 4.3. The discrete random variable ξk with probability distribution (99) assumes values 661 between 1 and k, has probability mass function
Prob{ξk = n} = Φn(k) − Φn+1(k), n = 1, 2, . . . , k, (101)
662 and mathematical expectation
k E[ξk] = ∑ Φn(k). (102) n=1
663 Proof. From (98), due to the properties of S(k, n) numbers, we get for n = 1
Prob{ξk ≥ 1} = Φ1(k) = 1
Prob{ξk < 1} = 1 − Φ1(k) = 0;
664 for n = k
(k − 1)! Prob{ξ ≥ k} = Φ (k) = k k kk−1
(k − 1)! Prob{ξ < 1} = 1 − Φ (k) = 1 − ; k k kk−1
665 for n > k
Prob{ξk ≥ n} = Φn(k) = 0
Prob{ξk < n} = 1 − Φn(k) = 1.
666 This proves the first statement. 667 Given n and m positive integers with n < m, from (100) we get
Prob{n ≤ ξk < m} = Φn(k) − Φm(k). (103)
668 From this relation, assuming m = n + 1 the probability mass function (101) follows. 669 The mathematical expectation of ξk is given by
k k E[ξk] = ∑ n(Φn(k) − Φn+1(k)) = ∑ Φn(k) − kΦk+1(k), n=1 n=1
670 that is the same as equation (102) since Φk+1(k) = 0.
671 Remark 4.6. The mean value of the discrete random variable ξk is equal to
k k k 0 E[ξk] = ∑ Prob{qn ≤ k} = ∑ ∑ φn/n(j). n=1 n=1 j=n
672 The previous theorem suggests to assume E[ξk] as an estimate πˆ (k) derived from 673 the combinatorial model, of the prime-counting function π(k), that is, remembering the 674 equality (98)
k (n − 1)! πˆ (k) = E[ξ ] = S(k n) k ∑ k−1 , . (104) n=1 n
675 An alternative expression for πˆ (k) can be derived from Remark 4.6 and equation 676 (73) Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 32 of 52
k k−2 k n j + 1 πˆ (k) = k − (−1)j 1 − . ∑ ∑ j + 1 j=0 n=j+2 n
677 To check the quality of the function πˆ (k) as a numerical estimate of π(k), equation 678 (104) looks more suitable since it can be written in a recursive form. Indeed if we put
k πˆ (k) = ∑ w(k, n) n=1
679 with
(n − 1)! w(k, n) = S(k, n), nk−1
680 from the recursive equation of the Stirling numbers of the second kind (51), we get 681 the following recurrence for the terms w(k, n)
n − 1 k w(k + 1, n) = w(k, n) + w(k, n − 1),k ≥ 1, 1 ≤ n ≤ (k + 1) n
682 initialized as
w(k, 1) = 1, k ≥ 1
w(1, n) = 0, n > 1.
683 Table3 reports a series of values of the recursive estimate πˆ (k), compared with 684 the prime-counting function π(k) and the logarithmic integral function li(k). The last 685 column of the table reports the values obtained through continuous approximation 686 formulas I’ll expose hereafter.
Table 3. πˆ (k) estimate through recursive and continuous formulas
k k R k −ane− n k π(k) li(k) πˆ (k) = ∑n=1 w(k, n) πˆ (k) = 1 e dn 100 25 30.1261 26.9462 23.1623 200 46 50.1921 47.0309 41.3716 300 62 68.3336 65.50659 58.2265 400 78 85.4178 83.06021 74.2987 500 95 101.7939 99.98131 89.8312 600 109 117.6465 116.4275 104.9572 700 125 133.0889 132.4971 119.7596 800 139 148.1967 148.2565 134.2952 900 154 163.0236 163.7537 148.6046 1000 168 177.6097 179.0246 162.7185 2000 303 314.8092 323.4725 296.6907 3000 430 442.7592 458.9438 422.8337 4000 550 565.3645 589.1406 544.3509 5000 669 684.2808 715.6488 662.6328 6000 783 800.4141 778.4514 7000 900 914.3308 892.2944 8000 1007 1026.416 1004.4960 9000 1117 1136.949 1115.2989 10000 1229 1246.137 1224.8866
687 In order to analyze the asymptotic behavior of πˆ (k), we look for a continuous ap- 688 proximation of the probability function Φn(k). After promoting k and n to be continuous Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 33 of 52
689 variables, which is a justifiable assumption for k 1 and considering the relation (98) 690 between Φ and S, equation (97) is written as
Z k n − 1 u−1 Φn(k) = Φn−1(u − 1)du. n n
691 Taking the derivative of the above equation, we get the differential equation for Φ
∂Φn(k) n − 1 k−1 = Φ − (k − 1) (105) ∂k n n 1
692 that can be written as
1 ∂Φ (k) S(k − 1, n − 1) n = . Φn(k) ∂k S(k, n)
693 We set
S(k − 1, n − 1) λn(k) − k = e n (106) S(k, n) n
694 where λn(k) is a parameter function depending on n and k. Then the general 695 solution of the differential equation is
1 R − k λn(k)e n dk Φn(k) = Ce n (107)
696 and the value of the constant is C is determined by the asymptotic condition 697 Φn(k → ∞) = 1.
698 Remark 4.7. It is interesting to note the method, based on probabilistic arguments, that led to 699 equation (107), provides an alternative approach to the problem of exploring the asymptotics 700 (k → ∞) of Stirling numbers of the second kind. Since this topic is outside the scope of this work, 701 I cite only the following example. If we choose
k λ (k) = (n + 1 − ) n n
702 then it is
k ( k −n)e− n Φn(k) = e n
703 and from (98) we get the well known asymptotic approximation of S(k, n) (see [31, and the 704 references therein])
k k n ( k −n)e− n S(k, n) ≈ e n , n! k 705 which is valid in the region n < ln k , coinciding asymptotically with the region of validity 706 of (88), n ln n < k.
707 As a continuous approximation of the probability function Φn(k) we are looking 708 for, let’s choose a parameter function depending on n only through the following simple 709 relation
λn(k) = λn = an, a > 0 constant ,
710 where the constant a varies between 1 and e as suggested by the boundary condi- 711 tions of the Stirling number ratio. Indeed for n = k the ratio (106) is
S(k − 1, k − 1) = 1, S(k, k) Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 34 of 52
712 that implies λn = en, while for k n we know it is
S(k − 1, n − 1) − k ∼ e n , S(k, n)
713 hence λn = n. 714 Therefore (107) becomes
k −ane− n Φn(k) = e (108)
715 and from (102) and (104) we get the continuous approximation of πˆ (k)
Z k Z k k −ane− n πˆ (k) = Φn(k)dn = e dn. (109) 1 1
716 The last column of Table3 reports some examples from this approximation formula 717 with a = (1 + e)/2. Table4 reports the values calculated through (109) with a = e, 27 7 718 compared with the values of π(k) up to k = 10 .
Table 4. πˆ (k) estimate through continuous approximation
k R k −ane− n k π(k) πˆ (k) = 1 e dn πˆ (k)/π(k) 105 9, 592 9, 428.02 0.98291 106 78, 498 78, 480.93 0.99978 107 664, 579 671, 099.45 1.00981 108 5, 761, 455 5, 855, 689.76 1.01636 109 50, 847, 534 51, 900, 660.41 1.02071 1010 455, 052, 511 465, 792, 892.49 1.02360 1011 4, 118, 054, 813 4, 223, 145, 802.17 1.02552 1012 37, 607, 912, 018 38, 614, 679, 105.06 1.02677 1013 346, 065, 536, 839 355, 603, 668, 431.86 1.02756 1014 3, 204, 941, 750, 802 3, 294, 779, 143, 238.6 1.02803 1015 29, 844, 570, 422, 669 30, 688, 289, 307, 555 1.02827 1016 279, 238, 341, 033, 925 287, 153, 196, 808, 146 1.02834 1017 2, 623, 557, 157, 654, 233 2.69779945552531E + 15 1.02830 1018 24, 739, 954, 287, 740, 860 2.54367476772712E + 16 1.02816 1019 234, 057, 667, 276, 344, 607 2.40603623244741E + 17 1.02797 1020 2, 220, 819, 602, 560, 918, 840 2.28238863108907E + 18 1.02772 1021 21, 127, 269, 486, 018, 731, 928 2.17071405049150E + 19 1.02745 1022 201, 467, 286, 689, 315, 906, 290 2.06936330707283E + 20 1.02715 1023 1, 925, 320, 391, 606, 803, 968, 923 1.97697565027884E + 21 1.02683 1024 18, 435, 599, 767, 349, 200, 867, 866 1.89241852576407E + 22 1.02650 1025 176, 846, 309, 399, 143, 769, 411, 680 1.81474179606716E + 23 1.02617 1026 1, 699, 246, 750, 872, 437, 141, 327, 603 1.74314253336239E + 24 1.02583 1027 16, 352, 460, 426, 841, 680, 446, 427, 399 1.676937646378480E + 25 1.02550
719 The following proposition, which is the FOS equivalent of the PNT, can be stated 720 for the function πˆ (k) defined by (109).
721 Theorem 4.4. Given e > 0 however small, it holds
7 Data with π(k) values are taken from N. J. A. Sloane, ’The On-Line Encyclopedia of Integer Sequences. Sequence A006880’, http ://oeis.org Wikipedia contributors, ’Prime-counting function’, Wikipedia, The Free Encyclopedia, https ://en.wikipedia.org/w/index.php?title = Prime − counting f unction&oldid = 987382574 Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 35 of 52
πˆ (k) − < < + , 1 e k 1 e ln k
722 definitively for k > k0.
723 Proof. Φn(k), n > 0, k > 0, is a decreasing function with n. Indeed
∂Φn(k) − k k = −aΦ (k)e n 1 + < 0. ∂n n n
724 Given n0 and n1 such that 1 < n0 < n1 < k, we can write
Z n0 Z n1 Z k πˆ (k) = Φn(k)dn + Φn(k)dn + Φn(k)dn 1 n0 n1
725 and, due to the decreasing of Φn(k) with n,
Φn0 (k)(n0 − 1) + Φn1 (k)(n1 − n0) + Φk(k)(k − n1) ≤
≤ πˆ (k) ≤
Φ1(k)(n0 − 1) + Φn0 (k)(n1 − n0) + Φn1 (k)(k − n1). 1 e 2 726 Given e, 0 < e < 1, let γ = 1 + 2 , hence 3 < γ < 1, and k n = n (k) = 0 0 ln k
k n = n (k) = 1 1 γ ln k 1 727 then the previous inequalities become (note that n1(k) < k for γ < ln k)
k1−γ ln k − a 1 −a( ) 1 − a k 1 − e ln k + − 1 e γ ln k + ln k − e e ≤ k γ γ
πˆ (k) ≤ ≤ k ln k
k1−γ ln k −ae−k 1 − a 1 −a( ) ≤ 1 + e + − 1 e ln k + ln k − e γ ln k . k γ γ
728 The limit for k → ∞ of the left-hand and the right-hand side of the previous 729 inequality, say L(k) and R(k) respectively, gives
lim L(k) = 1, k→∞
730 that is
1 − e < L(k) < 1 + e, for k > kL,
731 and
1 lim R(k) = , k→∞ γ
732 that is
1 e 1 e − < R(k) < + , for k > k . γ 2 γ 2 R Preprints (www.preprints.org) | NOT PEER-REVIEWED | Posted: 1 April 2021 doi:10.20944/preprints202104.0017.v1
Version March 31, 2021 submitted to Mathematics 36 of 52
733 Let k0 = Max{kL, kR}, then for k > k0 we get the thesis.
734 What does the model say about the Riemann Hypothesis [12]? We know the most 8 735 famous millennium problem is equivalent to establish the bound of the prime-counting 736 function π(x) [30] √ |π(x) − li(x)| < M x ln x 9 737 with M > 0 constant, definitively for x sufficiently large. We can prove this 738 bound, I’ll refer to as the RH rule throughout this work, occurs w.p. 0, or better that the 739 RH rule doesn’t hold w.p. 1 for the random variable ξk, defined above as the stochastic 740 counterpart of π(k) in our model.
741 Theorem 4.5. Let ξk then random variable with probability distribution (99), then it is √ Prob{|ξ(k) − li(k)| < M k ln k} → 0 as k → ∞.
742 Proof. Let √ n(k) = li(k) − M k ln k
743 and √ m(k) = li(k) + M k ln k.
744 Since
k k li(k) = + O ln k ln2 k
745 it is easy to see that both Φn(k)(k) and Φm(k)(k)
−e 1 Φn(k)(k) ∼ Φm(k)(k) ∼ e ln k → 1 as k → ∞.
746 From (103) then it follows