arXiv:1906.08671v1 [math.CO] 20 Jun 2019 facmo itdyt be to birthday common a of ag ausof values large itdy sa es ?Fraya otiig35dy,agopo 28 of of group group a a grea 1. In days, is least birthdays O 365 at of be be common containing to number to of year birthday needs expected number a group the expected For the that the general such before 1? group needed least a is at of size is minimum birthdays the is What ar is people birthdays 23 same of the having minimum people a two days of 365 the having that year order a for earlier, mentioned itdywt oen lei h ru?Ti skona h Strong the as known is This group? the in else someone with birthday a hav two least at of probability size the the people and resu year 23 known least well 1 at around most is of the birthday leas group Perhaps at a birthday. that in same probability that the the have for group asks a Problem Birthday standard The Introduction 1 rbblt fa es w epei h ru aigtesm birthda same the having group the in people t two least of at context of the probability having in year discussed a For been have which are: questions problem the of mentioned been Some have Davenport Harold problem. and the 1932 of in Mises von Richard ne aiu osrit on constraints various under te vnsi aiu om fteBrha Problem. Birthday the of forms various in events other onigbrha olsosuigpartitions using collisions birthday Counting euepriin opoiesm omlefrcutn s-co counting for formulae some provide to partitions use We n q h ieo h ru ed ob O be to needs group the of size the epe hti h rbblt hteeyn ntegopshares group the in everyone that probability the is what people, q n ftegopaetetda aibe n h ucmsaestudie are outcomes the and variables as treated are group the of / .I h eea omo h rbe h ubro days of number the problem the of form general the In 2. as hti h iiu ieo ru oesr htthe that ensure to group a of size minimum the is what days, o un n e McKenzie Jen and Burns Rob 1 2 n e,freape [ example, for See, . and 1tJn 2019 June 21st √ n q  h itr ftepolmi nla.Both unclear. is problem the of history The . Abstract nodrfrteepce au facommon a of value expected the for order in 1 2 ,[ ], √ 7 n ].  nodrfrteprobability the for order in lsosand llisions tlat1 least at ti hsae is area this in lt w epein people two t e hn1 In 1. than ter n h same the ing eBirthday he sinitiators as eddin needed e is y Birthday common rmore or n / 2 1 .For 2. nthe in As ? d Problem. In his survey article [8] DasGupta states that for a year having 365 days, the group having 3064 members is the smallest such that the probability of everyone 1 sharing a birthday is 2 . What can be said about the distribution of outcomes as the size q of the group and the number n of possible birthdays approach ? See for example [4], [3], [8], [10]. ∞ The problem arises in a number of scenarios and lends itself to many variations. It appears in cryptography in the form of what is called the ””. In this situation, messages are mapped to a and for security reasons it is important to know how many messages need to be hashed before two are found with the same hash value (see e.g. [18], [19], [11], [15]). The problem arises in the study of colourings of complete graphs ([6], [9]). It is also related to the behaviour of certain Markov Chains ([5], [12], [14]). In this paper we will picture the problem in terms of arranging q balls inside n tubes or buckets and counting various types of outcomes.

2 Terminology

Suppose we have q numbered balls which are arranged inside n numbered tubes. The tubes have the same width as the diameter of the balls so that when more than one ball is located within the same tube a column of balls forms. We denote the number of arrangements of q balls into n tubes by T (n, q). The order of the balls within each tube is important and is taken into account when counting the number of arrangements. We may also arrange the balls in buckets instead of tubes. For our purposes a bucket is a tube in which the order of the balls is not important. We denote the number of arrangements of q balls into n buckets by B(n, q). Whether dealing with tubes or buckets we will generally assume that the balls are numbered and therefore distinguishable. The case of indistinguishable balls, which are called bosons, has been studied. For example, the asymptotic behaviour of bosons as n, q approach was studied in [1] and [3]. Formulae obtained in sections 5 and ∞ 6 can be modified to apply to bosons. Let s N with s 2. We say an s-collision has occurred when a tube (or bucket) contains∈ s or more≥ balls. We will be counting the number of s-collisions which occur in an arrangement of balls. We will therefore need to define the number of s-collisions occurring when a tube contains r balls with r s. In the literature two separate definitions have been used to count s-collisions (see≥ [4]). Under one definition a tube containing r s balls contributes r s + 1 collisions to the count of s-collisions. The second definition≥ states that a tube containing− r s balls contributes r collisions to ≥ s  2 the count of s-collisions in an arrangement of balls. We will use the second definition is this paper but the formulae derived here can be easily altered to accomodate the first definition. The floor function will be denoted in the usual way by . . ⌊ ⌋

3 Partitions

We will denote a general partition of a positive integer q by the letter λ. λ is therefore a set of positive integers λ1,λ2,...,λk { } such that k

q = λi i=1 X and q λ1 λ2 λk 1. ≥ ≥ ···≥ ≥ Here k is called the size or the number of parts of the partition λ. It may also be written as λ . k depends on λ but we will not generally make that explicit. | | The following lemma will not be used in this paper but is provided to show that

k λi the term q!/ i=1 i which appears in some of the subsequent formulae is an integer.  Q  k Lemma 3.1. Let q N and λ be a partition of q. Then q! is divisible by iλi . ∈ i=1 In addition q! is divisible by k λ !. i=1 i Q Proof. We use the usual approachQ of taking an arbitrary prime p and showing that the maximum power of p dividing q! is greater than the maximum power dividing either k λi k of i=1 i or i=1 λi!. For an integer n let ordp(n) denote the maximum power of p dividing n. We know that Q Q q ordp(q!) = . ⌊pi ⌋ i≥1 X We have k λi ordp i = λj∗pi . i=1 ! i≥1 j≥1 Y X X Since the λ1,λ2, . . . .λk forms a decreasing sequence, for fixed i we have { } i p λ i λr = q ∗ j∗p ≤ j≥1 r≥1 X X 3 and since j≥1 λj∗pi is an integer it follows that P q λ i . j∗p ≤ ⌊pi ⌋ j≥1 X Therefore, k λi q ordp i = ordp(q!). ≤ ⌊pi ⌋ i=1 ! i≥1 Y X k The approach for i=1 λi! is the same. We have

Q k λi ordp( λi!) = ⌊pj ⌋ i=1 i≥1 j≥1 Y X X

i≥1 λi ≤ ⌊ pj ⌋ j≥1 P X q = = ordp(q). ⌊pj ⌋ j≥1 X

The above lemma shows that the tuple q,λ1,...,λk satisfies { } (q n)! N k ∗ (λi n)! ∈ i=1 ∗ for every n N. We say that the tupleQ has an integral ratio. This is an area ∈ of current research. See for example recent papers by Soundararajan [16], [17].

4 A commutative diagram

Each arrangement of q balls in n tubes can be mapped to a partition λ of q by letting λ1 be the number of tubes holding at least one ball, λ2 be the number of tubes holding at least two balls etc. It is clear that λ defined in this way satisfies the definition of a partition of q. We will call this mapping φT . In the same way, each arrangement of q balls in n buckets can be mapped to a partition q by a map which we will call φB. Both mappings are many to one. If q n then the mappings are surjective, otherwise ≤ the common range of the mappings is the set of partitions λ such that λ1 n. ≤

4 Each arrangement of balls in tubes can be mapped to an arrangement of balls in buckets by ignoring the order of the balls in each tube. We will call this mapping φT B. It is also many to one and surjective.

The mappings φT , φB and φT B satisfy the identity

φT = φB φT B. (1) ◦ This identity represents the fact that the partition associated with an arrangement

of balls in tubes is the same partition associated with the arrangement in buckets obtained by ignoring the order of the balls in each tube.

For a set A, we denote the number of elements in A by A . | | Lemma 4.1. Let λ be a partition of q. Then we have

k −1 n λi−1 φT (λ) = q! (2) λ1 λ | | ∗ ∗ i=2 i   Y   and k k −1 λi n λi−1 φB (λ) = q!/ i . (3) λ1 λ | | i=1 ! ∗ ∗ i=2 i Y   Y   Let a be an arrangement of q balls in n buckets and denote the partition φB(a) by λ. Then k φ−1 (a) = iλi . (4) | T B | i=1 Y Proof. n For a fixed partition λ of q there are λ1 ways of choosing the λ1 tubes λ1 containing at least one ball, ways of choosing λ2 tubes containing at least two λ2  balls etc. Therefore the number of ways that the tubes can be chosen so that the  arrangement matches λ is k n λ 1 i− . λ1 λ ∗ i=2 i   Y   For each choice of tube pattern there are q! ways of arranging the q balls in the tubes so that the balls match the pattern. Equation (2) follows.

Let a be an arrangement of q balls in n buckets and λ = φB(a). Denote the number of balls in the j-th bucket by xj. By definition, for each i 1, ≥

λi = j : xj i . |{ ≥ }|

5 Then n k −1 |{j:xj≥i}| λi φ (a) = xj!= i = i . | T B | j=1 i≥1 i=1 Y Y Y Equation (3) follows from equations (2) and (4) and the identity (1).

5 Tubes and balls

In this section we present a formula for the number of arrangements of q balls in n tubes.

Theorem 5.1. The number T (n, q) of arrangements of q numbered balls in n num- bered tubes is given by the equation

k n λ 1 T (n, q)= q! i− . (5) ∗ λ1 ∗ λi λ:λ1 n i=2 X≤   Y  

where the sum is over all partitions λ of q such that λ1 n. ≤ Proof. Each arrangement of balls in the tubes corresponds to a partition λ of q via the mapping φT . The number of arrangements mapped to the same λ is given by equation (2). In order to count all possible arrangements we take the sum of φ−1(λ) | T | over all partitions of q resulting in equation (5).

Theorem 5.1 can be used to obtain an expression for the number of arrangements having no s-collision by restricting the sum to partitions of q in which λ

Corollary 5.2. The number of arrangements of q numbered balls in n numbered tubes in which there are no 2-collisions is n! (n q)! − Proof. In these arrangements all the balls lie on the bottom level of the tubes so the corresponding partition of q must have λ2 = 0. The only such partition is the trivial one given by λ = q . Equation (5) then reduces to the required formula. { }

6 Corollary 5.3. The number of arrangements of q numbered balls in n numbered tubes in which there are no 3-collisions is

r=⌊ q ⌋ 2 n q r q! − q r r ∗ r=0 ∗ X  −    Proof. In these arrangements all the balls lie on the bottom two levels of the tubes. We therefore have λ 2 for the corresponding partitions of q. Partitions satisfying this constraint are given| | ≤ by λ 0= q and λ r = q r, r for r 1, 2, ..., q . The { } { − } ∈{ ⌊ 2 ⌋} corollary follows from equation (5).

Note that when q 2n + 1 all arrangements have at least one 3-collision. The ≥ expression in Corollary 5.3 still makes sense and sums to 0 when q 2n + 1 taking x ≥ into account the convention that y = 0 when y > x.  Theorem 5.4. Let s 2. The total number CT (n,q,s) of s-collisions in all arrange- ments of q numbered balls≥ in n numbered tubes is equal to

k k−1 n λi−1 k i q! λk + (λi λi+1) (6) ∗ λ1 ∗ λ ∗ ∗ s − ∗ s : : i=2 i ! i=s ! λ λ1≤Xn |λ|≥s   Y     X   where the sum is over all partitions λ of q such that λ1 n and λ s. ≤ | | ≥ Proof. Any arrangement corresponding to a partition of q with λ < s has no s- collisions as all balls lie below the s-th level of the tubes. We therefore| | only need to consider partitions with λ s. We begin with equation (5). Each term in the sum | | ≥ in equation (5) is the number of arrangements corresponding to a particular partition λ. The number of s-collisons is the same for each of arrangement having the same λ. We need to calculate the number of s-collisions occurring for each of these partitions. As mentioned earlier, a tube containing r s balls contributes r s-collisions to the ≥ s count. For the partition λ with k =: λ s there are λk tubes containing exactly k | | ≥  balls, λk−1 λk tubes containing exactly k 1 balls, ... , λs λs+1 tubes containing exactly s balls.− Therefore, each λ with λ −s contributes − | | ≥ k 1 k − i λk + (λi λi+1) ∗ s − ∗ s i=s   X   s-collisions to the total. Combining this with equation (5) yields equation (6).

7 6 Buckets and balls

In this section we replace the tubes by buckets. The results from section 5 can be used with an appropriate adjustment to take into account that the balls are not ordered within each bucket. A number of closed form expressions have been published for the number of arrangements of balls in buckets satisfying various properties. For example, McKinney ([13]) provided an expression for the number of arrangements in which there is no s-collision. Brink ([7] ) provided an exact formula for the least value of q (in terms of n) such that the number of arrangements containing a 2-collision is at least a half of all arrangements. The number of arrangements containing an s-collision can also be calculated using a recursive formula. Such a formula was provided by Suzuki, Tonien, Kurosawa, and Toyota in the paper [19]. In this section we will use partitions to construct formulae for various Birthday events.

Theorem 6.1. The number B(n, q) of arrangements of q numbered balls in n num- bered buckets is given by the equation

k k n λi 1 B(n, q)= − q!/ iλi (7) λ1 ∗ λi ∗ λ:λ1 n i=2 ! i=1 ! X≤   Y   Y

where the sum is over all partitions λ of q such that λ1 n. ≤ Proof. This follows from Theorem 5.1 and equation (4).

Since we know that the number of arrangements of q balls in n buckets is nq ,we have

k k n λi 1 nq = − q!/ iλi . (8) λ1 ∗ λi ∗ λ:λ1 n i=2 ! i=1 ! X≤   Y   Y When n = 2, the relevant partitions of q in equation (8) are of the form

2, 2,..., 2, 1, 1,..., 1 . { } Some algebra then produces the well known formula

q q 2q = . i i=0 X  

8 Corollary 6.2. The number of arrangements of q numbered balls in n numbered buckets in which there are no 2-collisions is n! (n q)! − Proof. The proof is the same as for corollary 5.2.

The following corollary appears as equation (4) in DasGupta’s survey article [8].

Corollary 6.3. The number of arrangements of q numbered balls in n numbered buckets in which there are no 3-collisions is

r=⌊ q ⌋ 2 n q r q! − q r r 2r r=0 ∗ ∗ X  −      Proof. In these arrangements each bucket contains at most 2 balls. We therefore have λ 2 for the corresponding partitions of q. Partitions satisfying this constraint are given| | ≤ by λ 0 = q and λ r = q r, r for r 1, 2, ..., q . For the partition λ r { } { − } ∈{ ⌊ 2 ⌋} we have k iλ ri =2r. i=1 Y The corollary follows from equation (7).

Theorem 6.4. Let s 2. The total number CB(n,q,s) of s-collisions in all arrange- ments of q numbered balls≥ in n numbered buckets is given by

k k−1 k n λi−1 k i λi λk + (λi λi+1) q!/ i λ1 λ s s : : ∗ i=2 i !∗ ∗ i=s − ∗ !∗ i=1 ! λ λ1≤Xn |λ|≥s   Y     X   Y (9) where the sum is over all partitions λ of q such that λ1 n and λ s. ≤ | | ≥ Proof. This follows from Theorem 5.4 and equation (4).

Subsets of arrangements can be counted using equation (7) by restricting the choice of partitions in the sum. For example, to count the number of arrangements in which at least r buckets have an s-collision, the sum in (7) should only include partitions λ such that λs r. The Strong Birthday problem asks for the number of arrangements in which no≥ bucket contains only one ball. This number is obtained from (7) by restricting to partitions λ such that λ1 = λ2.

9 References

[1] Scott Aaronson and Alex Arkhipov. The computational complexity of linear optics. IN PROCEEDINGS OF STOC 2011, 2011. 2

[2] S.E. Ahmed and R.J. McIntosh. An asymptotic approximation for the birthday prob- lem. Crux Math., 26:151–155, 2000. 1

[3] Alex Arkhipov and Greg Kuperberg. The bosonic birthday paradox. arXiv, arXiv:1106.0849:3, 2011. 2

[4] R. Arratia, S. Garibaldi, and J. Kilian. Asymptotic distribution for the birthday problem with multiple coincidences, via an embedding of the collision process. Random Structures and Algorithms, 48(3):480–502, 2016. 2

[5] Itai Benjamini and Ben Morris. The birthday problem and markov chain monte carlo. arXiv, arXiv:math/0701390:7, 2007. 2

[6] Bhaswar B. Bhattacharya, Somabha Mukherjee, and Sumit Mukherjee. Birthday paradox, monochromatic subgraphs, and the second moment phenomenon. arXiv, arXiv:1711.01465:28, 2017. 2

[7] David Brink. A (probably) exact solution to the birthday problem. The Ramanujan Journal, 28(2):223–238, Apr 2012. 1, 8

[8] Anirban DasGupta. The matching, birthday and the strong birthday problem: a contemporary review. Journal of Statistical Planning and Inference, 130(1):377 – 389, 2005. Herman Chernoff: Eightieth Birthday Felicitation Volume. 2, 9

[9] Sukhada Fadnavis. A generalization of the birthday problem and the chromatic poly- nomial. arXiv, arXiv:1105.0698v3:17, 2015. 2

[10] Norbert Henze. A poisson limit law for a generalized birthday problem. Statistics and Probability Letters, 39(4):333 – 336, 1998. 2

[11] Antoine Joux. Multicollisions in iterated hash functions. application to cascaded con- structions. Lecture Notes in Computer Science, pages 306–316, 2004. 2

[12] Jeong Han Kim, Ravi Montenegro, Yuval Peres, and Prasad Tetali. A birthday paradox for markov chains with an optimal bound for collision in the pollard rho algorithm for discrete logarithm. Annals of Applied Probability, 20(2):495–521, 2010. 2

[13] E. H. Mckinney. Generalized birthday problem. The American Mathematical Monthly, 73(4):385, Apr 1966. 8

[14] Ravi Montenegro and Ravi Montenegro. A simple method for precisely determining complexity of many birthday attacks. 2012. 2

[15] Ronald L. Rivest and Adi Shamir. Payword and micromint: Two simple micropayment schemes. Lecture Notes in Computer Science, pages 69–87, 1997. 2

10 [16] K. Soundararajan. Integral factorial ratios. arXiv, arXiv:1901.05133:31, 2019. 4

[17] K. Soundararajan. Integral factorial ratios: Irreducible examples with height larger than 1. arXiv, arXiv:1906.06413:16, 2019. 4

[18] Shenghui Su, Tao Xie, and Shuwang Lu. A new non-mds hash function resisting birth- day attack and meet-in-the-middle attack. Theoretical Computer Science, 654:128–142, Nov 2016. 2

[19] Kazuhiro Suzuki, Dongvu Tonien, Kaoru Kurosawa, and Koji Toyota. Birthday para- dox for multi-collisions. Lecture Notes in Computer Science, pages 29–40, 2006. 2, 8

11