A Generalization of the Erd\H {O} S-Kac Theorem

A GENERALIZATION OF THE ERDŐS-KAC THEOREM JOSEPH SQUILLACE UNIVERSITY OF RHODE ISLAND DEPARTMENT OF COMPUTER SCIENCE & STATISTICS Abstract. Given n 2 N, let ! (n) denote the number of distinct prime factors of n, let Z denote a standard normal variable, and let Pn denote the uniform distribution on f1; : : : ; ng. The Erdős-Kac Theorem states that 1=2 Pn m ≤ n : ! (m) − log log n ≤ x (log log n) ! P (Z ≤ x) as n ! 1; i.e., if N (n) is a uniformly distributed variable on f1; : : : ; ng, then ! (N (n)) is asymptotically normally distributed as n ! 1 with both mean and variance equal to log log n. The contribution of this paper is a generalization of the Erdős-Kac Theorem to a larger class of random variables by considering per- 1 turbations of the uniform probability mass n in the following sense. Denote by Pn a probability distribution 1 on f1; : : : ; ng given by Pn (i) = n + "i;n. By providing some constraints on the "i;n’s, sufficient conditions are stated in order to conclude that 1=2 Pn m ≤ n : ! (m) − log log n ≤ x (log log n) ! P (Z ≤ x) as n ! 1: The main result will be applied to prove that the number of distinct prime factors of a positive integer with either the Harmonic(n) distribution or the Zipf(n; s) distribution also tends to the normal distribution N (log log n; log log n) as n ! 1 (and as s ! 1 in the case of a Zipf variable). 1. Introduction Given a natural number n, the number of distinct prime factors of n is denoted ! (n). For example, 3 2 P ! 2 5 7 = 3. The function ! may be written as ! (n) = pjn 1, where the sum is over all prme factors of n. In 1917, Hardy and Ramanujan (p. 270 of [3]) proved that the number of distinct prime factors of a natural number n is about log log n. In particular, they showed that the normal order of ! (n) is log log n−i.e., for every " > 0, the proportion of the natural numbers for which the inequalities (1 − ") log log n ≤ ! (n) ≤ (1 + ") log log n do not hold tends to 0 as n ! 1. Informally speaking, the Erdős-Kacp Theorem generalizes the Hardy- Ramanujan Theorem by showing that ! (n) is about log log n + Z · log log n, where Z ∼ N (0; 1). More precisely, the Erdős-Kac Theorem is the following result (p. 738 of [2]). 1 Theorem 1. Let Pn denote the uniform distribution on f1; 2; ; : : : ; ng. As n ! 1, 1=2 Pn m ≤ n : ! (m) − log log n ≤ x (log log n) ! P (Z ≤ x) : arXiv:2011.00152v1 [math.NT] 31 Oct 2020 Figure 1 shows plots of the values of ! (n) for n between, respectively, 1-100, 1-1000, and 1-10000. Figure 1. Some values of ! (n) plotted using Mathematica. 1 I.e., if U is uniformly distributed on f1; 2; : : : ; ng, then for any subset A ⊆ f1; 2; : : : ; ng, Pn (A) = P (U 2 A). 1 A GENERALIZATION OF THE ERDŐS-KAC THEOREM 2 Furthermore, Figure 1 illustrates how slowly the values of ! (n) grow along with their variation, and this is consistent with the parameters µ = σ2 = log log n appearing in the limiting normal distribution. Theorem 1 4 4 suggests that, for example, since log log e(e ) = 4, integers near e(e ) ≈ 514843556263457212366848 have, on average, 4 distinct prime factors. Using Mathematica, computing the mean of ! (n) for n within 1000000 of 514843556263457212366848 yields N[Mean[Table[PrimeNu[n], n, 514843556263457211366848, 514843556263457213366848]], 3] = 4:27: Several generalizations of Theorem 1 exist. For example, Liu [4] extends Theorem 1 to the setting of free abelian modules other than the positive integers. A recent generalization of Theorem 1 was considered in [7], where Sun and Wu showed that Theorem 1 corresponds to a special case of Theorem 1 of [7] once the z −v2 R l 2 parameter l is set to 0. In particular, they considered integrals of the form −∞ v e dv and bounds for sums of the form l 1 X ! (n) − log log x p ; x log log x !(n)−log log n n≤x; p ≤z log log n upon setting l = 0, these expressions become the CDF of the standard normal distribution and the propor- !(n)−log log n p tion of n satisfying log log n ≤ z, respectively. A generalization given by Saidak [6], assumes a quasi Generalized Riemann Hypothesis for Dedekind zeta functions and concerns a function fa (p) defined to be the minimal e for which ae ≡ 1 mod p. Saidak shows that the limit of n !(f (p))−log log p o ap p ≤ x : A ≤ log log p ≤ B π (x) as x ! 1 tends to the standard normal CDF Φ(B) − Φ(A). There are several generalizations of Theorem 1 to algebraic number fields, such as [5] where Pollack considers the number of principal ideals. A similarity in the generalizations mentioned is that they consider limiting distributions of ! (·) for uniformly distributed random integers. The contribution of this paper is to extend the Erdős-Kac Theorem to a larger class of random variables, other thanp a uniformly distributed variable, on [n] := f1; 2; : : : ; ng which also have, asymptotically, log log n + Z · log log n many distinct prime factors. 1.1. The Main Theorem. Define a probability distribution Pn on [n] given by 1 (1) (i) = + " ; Pn n i;n and impose the constraint that for each k-tuple (p1; : : : ; pk) consisting of distinct primes, j n k p1···pk X (2) lim "lp ···p ;n = 0: n!1 1 k l=1 The constraint (2) is applied in x2 where an analogue of Kac’s heuristic for Pn is provided, and the analogue of Kac’s heuristic suggests that for a Pn-distributed variable X, the events fp1 divides Xg ;:::; fpk divides Xg are independent when p1; : : : pk are distinct primes. Analogous to the case of the development of Theorem 1, it was the independence of these events that suggested a Gaussian law of errors. Moreover, it will be shown 1 1 that (2) implies limn!1 Pn (p divides X) = p (and it is easy to show that limn!1 Pn (p divides X) = p ). Due to the axioms of probability, the "i;n’s satisfy n X (3) "i;n =0; i=1 1 1 (4) " 2 − ; 1 − : i;n n n The motivation for defining Pn in terms of the uniform distribution is due to Durrett’s proof (Theorem 3.4.16 in [1]) of the Erdős-Kac Theorem. Replacing the uniform distribution Pn with the new distribution Pn in Durrett’s proof naturally yields some constraints that the terms "i;n; i ≤ n; must satisfy in order to conclude A GENERALIZATION OF THE ERDŐS-KAC THEOREM 3 p that an integer-valued random variable with the Pn distribution has about log log n + Z · log log n distinct prime factors. Our main result is the following theorem, where b·c denotes the floor function. Theorem 2. Let Z ∼ N (0; 1). Suppose the following statements are true. • There exists a constant C such that for all n and for all primes p with p > n1= log log n, n b p c X C (5) " ≤ : lp;n p l=1 • There exists a constant D such that j n k p1···pk X D (6) 0 ≤ " ≤ lp1···pk;n n l=1 for all n and, for each k, all k-tuples (p1; : : : ; pk) consisting of distinct primes of size at most n1= log log n. ∗ Let Pn denote a probability distribution obtained by imposing the constraints (5) and (6) on Pn. As n ! 1; ∗ 1=2 Pn m ≤ n : ! (m) − log log n ≤ x (log log n) ! P (Z ≤ x) : ∗ Remark. If "i;n = 0 for all i ≤ n, then Pn = Pn and Theorem 1 is obtained. 1.2. Outline. In x2, Kac’s heuristic for the number of distinct prime factors of a uniformly distributed variable on N is stated, and then an analogue of the heuristic is provided for any Pn-distributed random variable. Just as Kac’s heuristic suggested the appearance of the normal distribution in Theorem 1 along with 2 the parameters µ = σ = log log n, the analogue of Kac’s heuristic for Pn is the motivation for the conclusion of Theorem 2. The proof of Theorem 2 is provided in x3; the proof applies the method of moments and is motivated by Durrett’s proof of the Erdős-Kac Theorem (Theorem 3.4.16 in [1]). Moreover, in x3, the constraints (5) and (6) are applied to ensure that Pn also satisfies Durrett’s method of moment bounds. In x4, the values "i;n; i ≤ n; are given for the Harmonic distribution on [n], and it is shown that the "i;n’s ∗ satisfy constraints (2), (5), and (6). Therefore, the conclusion of Theorem 2 is true if Pn is replaced with ∗ the harmonic distribution. In a similar fashion, it is shown that a Zipf(n; s)-distribution can also replace Pn in Theorem 2, as long as s ! 1 as n ! 1. I.e., the number of distinct prime factors of an integer-valued random variable from the Harmonic(n) distribution distribution tends to N (log log n; log log n) as n ! 1; and the number of distinct prime factors of an integer-valued random variable from the Zipf(n; s) distribution tends to N (log log n; log log n) as n ! 1 and s ! 1.

A Generalization of the Erd\H {O} S-Kac Theorem

The Halphen Family of Distributions for Modeling Maximum

Harmonic Measurement and Reduction in Power Systems. Rutisurhata Kurniawan Hartana Louisiana State University and Agricultural & Mechanical College

Field Guide to Continuous Probability Distributions

Research of Harmonics in Power System Signal Using Gaussian's

Supplementary Information for “Unraveling Electronic Absorption

Harmonic Management of Transmission and Distribution Systems Maty Ghezelayagh University of Wollongong

A Framework to Analyze the Stochastic Harmonics and Resonance of Wind Energy Grid Interconnection

Exact Solution for the Singlet Density Distributions and Second-Order Correlations of Normal-Mode Coordinates for Hard Rods in One Dimension C

Introducing Harmonic Distribution in Wikipedia

Estimation of Harmonics, Interharmonics and Sub-Harmonics in Motor Drive Systems

Second Harmonic Revisited: an Analytic Quantum Approach

Arxiv:1805.04695V2 [Nucl-Th] 7 Mar 2019 an Analysis More Sophisticated Than a Fourier Analysis