Order Statistics in the Farey Sequences in Sublinear Time

Jakub Pawlewicz

Institute of Informatics, Warsaw University, Banacha 2, 02-097 Warsaw, Poland
[email protected]

Abstract. The paper presents the first sublinear algorithm for computing order statistics in the Farey sequences. The algorithm runs in time O(n^{3/4} log n) and in space O(√n) for the Farey sequence of order n. This is a significant improvement over the algorithm from [1], which runs in time O(n log n).

1 Introduction

The Farey sequence of order n (denoted F_n) is the increasing sequence of all irreducible fractions from the interval [0, 1] with denominators less than or equal to n. The Farey sequences have numerous interesting properties and they are well known in number theory and in combinatorics. They are deeply investigated in [2].

In this paper we study the following algorithmic problem. For given positive integers n and k, compute the k-th element of the Farey sequence of order n. This problem is known as the order statistics problem. The solution to the order statistics problem is based on a solution to a related rank problem, i.e. the problem of finding the rank of a given fraction in the Farey sequence. Both the order statistics problem and the rank problem can be easily solved in quadratic time by listing all elements of the sequence (see [2, Problem 4-61] and Section 2.1). Faster solutions for order statistics are possible by reducing the main problem to the rank problem. A roughly linear time algorithm is presented in [1]. The authors present a solution of the rank problem working in time O(n) and in sublinear space. They also show how to reduce the order statistics problem to the rank problem by calling O(log n) instances of the rank problem. This gives an algorithm running in O(n log n) time. They remark that their solution to the rank problem could run in time O(n^{5/6+o(1)}) if it were possible to compute the sum ∑_{i=1}^{n} ⌊xi⌋, for a rational x, in this time or faster. This sum is related to counting lattice points in right triangles. A simple algorithm for that task running in logarithmic time can be found in [3]. Nevertheless, for completeness we present a simple logarithmic algorithm computing that sum in Section 3. The O(n^{5/6+o(1)}) solution is complicated. For instance, it involves summation of the Möbius function and subexponential integer factorization.

In Section 2.3 we present a simple algorithm for the rank problem with time complexity O(n^{3/4}) and space complexity O(√n). We assume the RAM as the model of computation (in the RAM model a single cell can store arbitrarily large integers, cell access and arithmetic operations on cells are performed in constant time, and memory complexity is measured in cells). What remains is to show a faster reduction in order to find order statistics in sublinear time.

In [1] the reduction was made in two stages. The first stage consists of finding the interval [j/n, (j+1)/n) containing the k-th term of the sequence. The interval is computed by a binary search calling the rank problem O(log n) times. The second stage tracks the searched term by checking all fractions in the interval. Since there are at most n such fractions, in the worst case this stage runs in O(n) time, which dominates the time complexity of the reduction. In [1] the authors proposed to take the smaller interval [j/n^2, (j+1)/n^2), since this interval contains exactly one fraction. However, there is the problem of tracking that fraction, which they do not solve. In Section 2.2 we show a solution to that problem. We also show another, more direct reduction running in logarithmic time and using O(log n) calls to the rank problem. This reduction is obtained by exploring the Stern–Brocot tree in a smart way.

2 Computing order statistics in the Farey sequences

2.1 An O(n^2) time algorithm

We show two O(n^2) time methods. The running time follows from the number of elements in the sequence F_n, which is asymptotically equal to (3/π^2) n^2. We use the following property of the Farey sequences. For two consecutive fractions a/b < c/d in a Farey sequence, the first fraction that appears between them is their mediant (a+c)/(b+d). The mediant is already reduced and it first appears in F_{b+d}. Using this property one can successively compute all fractions. That way the Stern–Brocot tree is obtained (for a description of the Stern–Brocot tree together with its properties we refer to [2]). The Farey fractions form a subtree. In-order traversal gives an O(n^2) time and O(n) space algorithm. The space complexity depends on the depth of the Farey tree. The second O(n^2) method is a straightforward application of a surprising formula. For three consecutive fractions a/b < c/d < e/f the following holds:

    e/f = (tc − a)/(td − b),   where t = ⌊(b + n)/d⌋.

That method works in optimal O(1) space.
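Both methods can be written down in a few lines. The following Python sketch is our own illustration (the function names are ours, not from the paper): farey_by_mediants performs the in-order traversal of the Farey subtree of the Stern–Brocot tree, and farey_next_term applies the three-term formula above; both list F_n in increasing order.

```python
def farey_by_mediants(n):
    """First method: in-order traversal of the Farey subtree.  Between two
    neighbours a/b < c/d the first new fraction is the mediant (a+c)/(b+d),
    already reduced, appearing first in F_{b+d}.  O(n^2) time; the recursion
    stack mirrors the tree depth, i.e. O(n) extra space."""
    def between(a, b, c, d):
        p, q = a + c, b + d
        if q > n:                      # mediant not in F_n: nothing in between
            return
        yield from between(a, b, p, q)
        yield p, q
        yield from between(p, q, c, d)

    yield 0, 1
    if n >= 1:
        yield from between(0, 1, 1, 1)
        yield 1, 1


def farey_next_term(n):
    """Second method: for consecutive a/b < c/d the next term is
    (t*c - a)/(t*d - b) with t = (n + b) // d.  O(n^2) time, O(1) space."""
    a, b, c, d = 0, 1, 1, n
    yield a, b
    while c <= n:
        t = (n + b) // d
        a, b, c, d = c, d, t * c - a, t * d - b
        yield a, b


# Both generators yield F_5 as
# (0,1), (1,5), (1,4), (1,3), (2,5), (1,2), (3,5), (2,3), (3,4), (4,5), (1,1).
```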

2.2 Reduction to the rank problem

In order to achieve a solution faster than the quadratic one we reduce the problem of finding a given term of the Farey sequence to the problem of counting the number of fractions bounded by a real number (the rank problem). To be more precise, for a given positive integer n and a real number x ∈ [0, 1] we want to find the number of fractions a/b belonging to the sequence F_n that are not larger than x. We show how to solve the original problem given an algorithm for the rank problem.
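For reference, the rank problem has an obvious quadratic brute-force solution; the sketch below (ours, not from the paper) is useful only as a test oracle for the faster routines developed later.

```python
from math import gcd

def rank_naive(n, p, q):
    """Brute force: the number of fractions a/b in F_n with a/b <= p/q,
    the fraction 0/1 included.  Quadratic time, for testing only."""
    return sum(1 for b in range(1, n + 1)
                 for a in range(0, b + 1)
                 if gcd(a, b) == 1 and a * q <= p * b)
```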

Reduction in linear time. We recall the reduction from [1]. First, the interval [j/n, (j+1)/n) containing the k-th term is found by a binary search starting from the interval [0/n, n/n], splitting an interval [l/n, r/n) into the two smaller intervals [l/n, m/n) and [m/n, r/n), where m = ⌊(l+r)/2⌋. Next, we track the fraction in [j/n, (j+1)/n). Because the size of the interval is 1/n, for each denominator b ≤ n it contains at most one fraction with that denominator. That fraction can be found in constant time, since its numerator must be ⌊((j+1)b − 1)/n⌋. We check such a candidate for every possible denominator, as in the sketch below. The total tracking time is O(n) and the whole reduction also works in time O(n). The above reduction suffices to construct a roughly linear time solution, but it is not enough if we want to create a sublinear algorithm. Therefore, we need a faster reduction. We show two reductions working in logarithmic time.
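The tracking stage of this reduction might look as follows in Python (a sketch under our own naming; the paper gives no code). It lists every fraction of F_n lying in [j/n, (j+1)/n); the k-th term is then selected among them with one additional rank query for the left endpoint.

```python
from fractions import Fraction
from math import gcd

def farey_fractions_in_slice(n, j):
    """All fractions of F_n inside [j/n, (j+1)/n), in increasing order.
    For each denominator b the only possible numerator is
    ((j+1)*b - 1) // n; keep it if it lies in the interval and is reduced."""
    found = []
    for b in range(1, n + 1):
        a = ((j + 1) * b - 1) // n       # a*n < (j+1)*b holds by this choice
        if j * b <= a * n and gcd(a, b) == 1:
            found.append((a, b))
    found.sort(key=lambda ab: Fraction(ab[0], ab[1]))
    return found
```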

Smaller interval. First, we can tune up the construction using intervals. As it was suggested in [1], we can find the smaller interval [j/n^2, (j+1)/n^2), also by a binary search. Now there is at most one fraction of F_n which belongs to such an interval. This is because the size of the interval is 1/n^2 and because the following inequality holds for every two consecutive fractions a/b < c/d in the sequence F_n:

    c/d − a/b = 1/(bd) ≥ 1/n^2.

The k-th term of the Farey sequence F_n is the only fraction from F_n in the interval [j/n^2, (j+1)/n^2). What is left is to track this fraction. We use the Stern–Brocot tree for this task. The Stern–Brocot tree allows us to explore all irreducible fractions in an organized way. We start from the two fractions 0/1 and 1/0. These fractions represent the interval where the k-th term resides. We repeatedly narrow that interval to enclose the interval [j/n^2, (j+1)/n^2) until we find the fraction. Assume we have already narrowed the interval to fractions a/b and c/d, such that

    a/b < j/n^2 < (j+1)/n^2 ≤ c/d.

Then, in a single iteration we split the interval by the mediant (a+c)/(b+d). If the mediant falls into the interval [j/n^2, (j+1)/n^2), then the k-th term is found and it is the mediant (a+c)/(b+d). Otherwise, we replace one of the fractions a/b and c/d by (a+c)/(b+d). If (a+c)/(b+d) < j/n^2, we replace the fraction a/b; otherwise (a+c)/(b+d) ≥ (j+1)/n^2 and we replace c/d. The above procedure guarantees successful tracking. However, its time complexity is O(n), since in the worst case n iterations are needed. For instance, for k = 1 we replace the right fraction n times consecutively, obtaining 1/1, 1/2, ..., 1/n. This problem can be solved by grouping successive substitutions of the left or the right fraction.

Suppose we replace the left fraction several times. After the first substitution we have the fractions (a+c)/(b+d) and c/d. After the second substitution the fractions become (a+2c)/(b+2d) and c/d, and so on. Generally, after t substitutions the left fraction is equal to (a+tc)/(b+td). In the above procedure we replace the left fraction as long as

    (a+tc)/(b+td) < j/n^2.    (1)

If t is the largest integer satisfying the above inequality, then for the next mediant we have j/n^2 ≤ (a+(t+1)c)/(b+(t+1)d). If there is also (a+(t+1)c)/(b+(t+1)d) < (j+1)/n^2, then the fraction is found and we can finish the search. Otherwise, the next mediant will substitute the right fraction. We see that we can make all successive iterations replacing the left fraction at once. We only need to determine the value of t. After rewriting (1) we get

    (n^2 c − jd) t < jb − n^2 a.    (2)

Because j/n^2 < c/d we know that n^2 c − jd > 0, so (2) is equivalent to

    t < (jb − n^2 a) / (n^2 c − jd).

The largest t satisfying that inequality is

    t = ⌈(jb − n^2 a) / (n^2 c − jd)⌉ − 1.    (3)

Analogously we analyze the situation when we replace the right fraction several times. After t substitutions the right fraction is equal to (ta+c)/(tb+d). The replacement takes place as long as

    (j+1)/n^2 ≤ (ta+c)/(tb+d).

The largest t satisfying the above inequality is

    t = ⌊((j+1)d − n^2 c) / (n^2 a − (j+1)b)⌋.    (4)

We conclude that the procedure of tracking the fraction from the interval [j/n^2, (j+1)/n^2) becomes much faster if we group steps in one direction. In the first iteration we make all steps to the left, replacing the right fraction. Then, in the next iteration, we make all steps to the right, replacing the left fraction. Next, we make all steps to the left, and so on, until we find the fraction from the given interval. In a single iteration, if we are going to the right, we replace the left fraction by (a+tc)/(b+td) where t is given by (3), and if we are going to the left, we replace the right fraction by (ta+c)/(tb+d) where t is given by (4). Excluding the first iteration we know that t is always at least one, since the next mediant has to replace the opposite fraction. It means that each denominator is replaced by at least the sum of the previous two denominators. Therefore, the sequence of successive denominators increases at least as fast as the Fibonacci numbers. Thus, the number of iterations in this procedure is O(log n).
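The grouped tracking procedure is short in code. The following Python sketch is our own illustration (names are ours); it assumes, as guaranteed by the preceding binary search, that the interval [j/n^2, (j+1)/n^2) really contains a term of F_n, and it treats the boundary terms 0/1 and 1/1 separately since they are never produced as mediants.

```python
def track_in_interval(n, j):
    """Return the unique fraction of F_n lying in [j/N, (j+1)/N), N = n*n,
    by Stern-Brocot descent with grouped steps (formulas (3) and (4)).
    O(log n) iterations, exact integer arithmetic only."""
    N = n * n
    if j == 0:
        return 0, 1                      # boundary case: the term is 0/1
    if j == N:
        return 1, 1                      # boundary case: the term is 1/1
    a, b = 0, 1                          # left endpoint:  a/b < j/N
    c, d = 1, 0                          # right endpoint: (j+1)/N <= c/d
    while True:
        p, q = a + c, b + d              # the mediant
        if j * q <= p * N < (j + 1) * q: # mediant inside the interval: done
            return p, q
        if p * N < j * q:
            # mediant left of the interval: replace the left fraction t times,
            # t = ceil((j*b - N*a)/(N*c - j*d)) - 1            -- formula (3)
            num, den = j * b - N * a, N * c - j * d
            t = -(-num // den) - 1
            a, b = a + t * c, b + t * d
        else:
            # mediant at or right of the interval: replace the right fraction
            # t times, t = floor(((j+1)*d - N*c)/(N*a - (j+1)*b)) -- formula (4)
            t = ((j + 1) * d - N * c) // (N * a - (j + 1) * b)
            c, d = t * a + c, t * b + d
```

Together with the binary search over j, this yields a reduction that calls the rank problem only O(log n) times.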

Exploring the Stern–Brocot tree directly. Suppose we are able to solve the rank problem in "reasonable" time. This means that for every fraction a/b ∈ F_n we can compare it with the k-th term of the sequence F_n. Using only comparisons we can descend the Stern–Brocot tree down to the searched fraction. We start from the interval [0/1, 1/0]. Then, we repeatedly split the interval by the mediant and choose the interval containing the searched fraction. That is, if we have the interval [a/b, c/d], we take the mediant (a+c)/(b+d) and compare it with the k-th term. If the number of fractions in F_n not larger than (a+c)/(b+d) equals k, then we get the result; if it is larger than k, then the term lies in the interval [a/b, (a+c)/(b+d)]; and if it is less than k, then the term lies in [(a+c)/(b+d), c/d].

As in the previous reduction, in the worst case we have to call the rank problem O(n) times and, as previously, we have to optimize the search by grouping moves in a single direction. However, here we cannot give an explicit formula for the number of steps t, because we can only ask on which side of a given fraction the searched term lies. Fortunately, there is a technique for finding t using at most O(log t) questions. The technique can be used when for any integer s we can ask whether s < t, s = t or s > t. First, for successive i = 0, 1, 2, ... we check whether 2^i < t. If it turns out that t = 2^i for some i, then we find t in i + 1 questions. Otherwise, we find the smallest positive integer l such that 2^{l−1} < t < 2^l after l + 1 questions. In this case we perform a binary search for t in the interval (2^{l−1}, 2^l). The binary search takes at most l − 1 questions, so the whole procedure uses at most 2l ≤ 2 log t questions. Using the above technique for grouping steps in a single direction we obtain a method which asks at most O(log n) questions.

We formalize this method for clearer analysis. Let P_0/Q_0 = 1/0 and P_1/Q_1 = 0/1, so that the starting interval is [P_1/Q_1, P_0/Q_0]. In the i-th iteration, for i = 2, 3, ..., we construct the fraction P_i/Q_i. For even i we move to the left and replace the right fraction. For odd i we move to the right and replace the left fraction. Assume i is even. In that case the interval is [P_{i−1}/Q_{i−1}, P_{i−2}/Q_{i−2}]. Here we are moving to the left, adding the left fraction to the right one as many times as possible. We search for the largest t_i such that the searched fraction is not larger than

    (t_i P_{i−1} + P_{i−2}) / (t_i Q_{i−1} + Q_{i−2}).

Then we replace the right fraction by it, thus P_i = t_i P_{i−1} + P_{i−2} and Q_i = t_i Q_{i−1} + Q_{i−2}. When i is odd we proceed analogously but in the opposite direction. We repeat calculating successive P_i/Q_i until for some i that fraction is the k-th term of the sequence F_n. The above procedure is nothing new. In fact it has a strict connection with continued fractions. One may prove that

    P_i/Q_i = 1/(t_2 + 1/(⋯ + 1/(t_{i−1} + 1/t_i))).

Let us analyze the time complexity. First, observe that every t_i is positive. For each i = 2, 3, ... we ask at most 2 l_i questions in the i-th iteration, where 2^{l_i−1} ≤ t_i < 2^{l_i}. Suppose we made h iterations, so the searched fraction is P_h/Q_h and Q_h ≤ n. From the recursive formula Q_i = t_i Q_{i−1} + Q_{i−2} we conclude that the sequence Q_i increases at least as fast as the Fibonacci numbers, thus h = O(log n). By the inequality Q_i ≥ t_i Q_{i−1} we have

    n ≥ Q_h ≥ t_h Q_{h−1} ≥ t_h t_{h−1} Q_{h−2} ≥ ⋯ ≥ t_h ⋯ t_2 Q_1 = t_h ⋯ t_2,

which, together with the inequality t_i ≥ 2^{l_i−1}, gives n ≥ 2^{l_2 + ⋯ + l_h − (h−1)}, and hence

    l_2 + ⋯ + l_h ≤ log n + (h − 1) = O(log n).

Therefore, the total number of questions is O(log n), since over all iterations we ask at most 2(l_2 + ⋯ + l_h) questions.
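A possible implementation of this reduction, driven purely by rank queries, is sketched below in Python (our own illustration; the grouping is realized with the exponential-plus-binary search described above). Any rank routine can be plugged in, e.g. the brute-force rank_naive sketched earlier or the sublinear algorithm of Section 2.3; here rank is taken to count 0/1 as well, so ranks are 1-based.

```python
def last_true(pred):
    """Largest t >= 0 satisfying a monotone predicate pred (true up to some
    point, false afterwards); assumes pred(0) holds and pred eventually fails.
    Exponential ("galloping") search plus binary search, O(log t) calls."""
    hi = 1
    while pred(hi):
        hi *= 2
    lo = hi // 2                       # pred(lo) holds (or lo == 0)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        lo, hi = (mid, hi) if pred(mid) else (lo, mid)
    return lo


def kth_farey(n, k, rank):
    """The k-th term of F_n (1-based, so k = 1 gives 0/1), using only calls
    rank(p, q) = |{f in F_n : f <= p/q}|.  Assumes 1 <= k <= |F_n|;
    makes O(log n) rank queries in total."""
    if k == 1:
        return 0, 1
    a, b, c, d = 0, 1, 1, 0            # invariant: a/b < k-th term <= c/d
    while True:
        # pull the right endpoint to the left as far as the target allows
        t = last_true(lambda t: rank(t * a + c, t * b + d) >= k)
        c, d = t * a + c, t * b + d
        if rank(c, d) == k:            # endpoints stay inside F_n, so found
            return c, d
        # push the left endpoint to the right as far as the target allows
        t = last_true(lambda t: rank(a + t * c, b + t * d) < k)
        a, b = a + t * c, b + t * d

# Example with the brute-force oracle from above:
#   kth_farey(8, 6, lambda p, q: rank_naive(8, p, q))  ->  (1, 4),
#   since the sixth term of F_8 is 1/4.
```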

2.3 Solution to the rank problem

Let S_n(x) be the number we are searching for, i.e. the number of irreducible fractions a/b such that a/b ≤ x and b ≤ n. For simplicity we will sometimes write S_n instead of S_n(x), because in fact x is fixed. Playing with the symbol S_n(x) and grouping by the greatest common divisor we can get a recursive formula, which will be the starting point of our algorithm:

    S_n(x) = |{a/b : b ≤ n ∧ a/b ≤ x ∧ gcd(a, b) = 1}|
           = |{a/b : b ≤ n ∧ a/b ≤ x}| − ∑_{d≥2} |{a/b : b ≤ n ∧ a/b ≤ x ∧ gcd(a, b) = d}|
           = ∑_{b=1}^{n} ⌊bx⌋ − ∑_{d≥2} S_{⌊n/d⌋}(x).

We explain each step. For the given constraints the number of irreducible fractions is the total number of fractions minus the number of fractions with gcd of numerator and denominator equal to or larger than 2. This is written in the second line of the equation. The number of fractions with a given denominator b that are less than or equal to x is ⌊bx⌋, so the number of all fractions less than or equal to x is the sum:

    ∑_{b=1}^{n} ⌊bx⌋.

We should also explain the equality

    |{a/b : b ≤ n ∧ a/b ≤ x ∧ gcd(a, b) = d}| = S_{⌊n/d⌋}(x).

Every fraction a/b with gcd(a, b) = d has the form a′d/(b′d), where a′d = a, b′d = b and gcd(a′, b′) = 1. It means that the fraction a′/b′ is irreducible and b′ ≤ ⌊n/d⌋, since b ≤ n. The number of such irreducible fractions with a′/b′ = a/b ≤ x is exactly S_{⌊n/d⌋}(x). Let us look again at the recursive formula:

    S_n = ∑_{b=1}^{n} ⌊bx⌋ − ∑_{d≥2} S_{⌊n/d⌋}.    (5)

In fact, x is always a rational number in our algorithm. In that case the sum ∑_{b=1}^{n} ⌊bx⌋ can be calculated in O(polylog(n)) time; this is shown in Section 3. So the only problem is to calculate the sum ∑_{d≥2} S_{⌊n/d⌋}. Let us focus on how many different summands there are. For d ≤ √n all expressions S_{⌊n/d⌋}(x) are distinct. If d > √n, then n/d < √n, so for d > √n there are at most √n different summands. Therefore, on the right hand side of formula (5) there are O(√n) summands. Moreover, in deeper levels of the recursion, an occurrence of the symbol S_i is only possible if i = ⌊n/d⌋ for some positive integer d. This property follows from the equality

    ⌊⌊n/d_1⌋ / d_2⌋ = ⌊n / (d_1 d_2)⌋.

We are left with computing S_i, where i is from the set I = { ⌊n/d⌋ : d ≥ 1 }. We split this set into the two sets I_1 = {1, 2, ..., ⌊√n⌋} and I_2 = { ⌊n/⌊√n⌋⌋, ..., ⌊n/2⌋, ⌊n/1⌋ }. Each of these sets has ⌊√n⌋ elements and together they cover I. We use dynamic programming to calculate successive S_i for increasing i ∈ I. To calculate S_i we use the formula

    S_i = ∑_{b=1}^{i} ⌊bx⌋ − ∑_{d≥2} S_{⌊i/d⌋}.    (6)

As was already mentioned, the actual number of summands in ∑_{d≥2} S_{⌊i/d⌋} is O(√i). For each symbol S_j occurring in the sum we can find its multiplicity in constant time, simply by finding the interval of d values for which ⌊i/d⌋ = j. Therefore, the time complexity of calculating the right hand side of (6) is O(√i). The memory complexity is O(√n), since we have to store only S_i for i ∈ I. Surprisingly, the above algorithm for calculating all S_i works in O(n^{3/4}) time. We prove it in two parts. In the first part let us determine the time of calculating S_i for all i ∈ I_1:

    O( ∑_{1≤i≤√n} √i ) ⊆ O( ∑_{1≤i≤√n} √(√n) ) = O( √n · √(√n) ) = O(n^{3/4}).

For the second part observe that I_2 = { ⌊n/d⌋ : 1 ≤ d ≤ √n }. Thus, the time complexity of calculating S_i for all i ∈ I_2 is

    O( ∑_{1≤d≤√n} √(⌊n/d⌋) ) = O( √n · ∑_{1≤d≤√n} 1/√d ).

Using the asymptotic equality

    ∑_{1≤i≤x} 1/√i = O(√x)

we get the result

    O( √n · √(√n) ) = O(n^{3/4}).
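The dynamic programming described above fits in a few lines of Python. The sketch below is our own illustration (names are ours); it computes S_n(x) for a rational x = p/q. For brevity the inner sum ∑_{b≤m} ⌊bx⌋ is evaluated by a naive loop here; substituting the logarithmic-time routine T(m, p, q) of Section 3 gives the stated O(n^{3/4}) bound. Note that S_n(x) does not count the fraction 0/1, so the rank used in Section 2.2 is S_n(x) + 1.

```python
from math import isqrt

def rank_S(n, p, q):
    """S_n(p/q): the number of irreducible fractions a/b with a >= 1,
    b <= n and a/b <= p/q, via formula (6) evaluated over I = {n // d}.
    O(sqrt(n)) values are stored; with a fast floor-sum the time is O(n^{3/4})."""

    def floor_sum(m):
        # sum_{b=1..m} floor(b*p/q); naive here -- replace with Section 3's
        # T(m, p, q) to keep the whole algorithm sublinear
        return sum(b * p // q for b in range(1, m + 1))

    r = isqrt(n)
    # the set I = I_1 U I_2 of all distinct values floor(n/d), increasing
    values = sorted(set(range(1, r + 1)) | {n // d for d in range(1, r + 1)})
    S = {}
    for m in values:
        total = floor_sum(m)
        d = 2
        while d <= m:
            v = m // d
            d_hi = m // v              # largest d' with m // d' == v
            total -= (d_hi - d + 1) * S[v]
            d = d_hi + 1
        S[m] = total
    return S[n]

# Sanity check against the brute force of Section 2.2:
#   rank_S(100, 1, 3) + 1 == rank_naive(100, 1, 3)
```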

3 Computing ∑_{i=1}^{n} ⌊(a/b) i⌋

In this section we present a simple algorithm for computing the sum ∑_{i=1}^{n} ⌊(a/b) i⌋, where a/b is a non-negative irreducible fraction. We remark that a polynomial time algorithm (in the size of a, b and n) was previously presented in [4] and in [5]. However, the methods used in these papers are rather complicated for easy implementation. A much simpler algorithm for counting lattice points in a rational right triangle was presented in [3]. Although our algorithm is very similar, we decided to include a description for this specific task for two reasons. First, since a solution to the rank problem needs to calculate this sum, we want the whole procedure to be complete. Second, the sum is a special case of counting points in a rational right triangle. Therefore, the formulas used in the algorithm can be determined more easily and are slightly simpler than the formulas presented in [3]. A graphical representation of the given sum is shown in Fig. 1.


Fig. 1: Graphical representation of the sum ∑_{i=1}^{n} ⌊(a/b) i⌋ (lattice points below the line y = (a/b)x for x = 0, 1, 2, ..., n).

The value of the sum is the number of lattice points in the triangle bounded by the X-axis and the lines x = n and y = (a/b)x, excluding the lattice points on the X-axis. That representation will help to see properties of the sum.

Let us denote

    T(n, a, b) = ∑_{i=1}^{n} ⌊(a/b) i⌋.

We develop recursive formulas for T(n, a, b). These formulas lead to a straightforward polynomial time algorithm (in the size of n, a, b).

3.1 Case n ≥ b

If n is divisible by b, we can derive a closed form. Let n = qb and look at Fig. 2. We can easily calculate the number of lattice points in the lower right triangle.


Fig. 2: Case when n is divisible by b.

Observe that the lower right triangle is identical to the upper left triangle, so it contains the same number of lattice points. Summing up both triangles we get the rectangle with the diagonal counted twice. The number of lattice points in the rectangle is (qb + 1)(qa + 1) and the number of points on the diagonal is equal to q + 1. The sum of both values divided by two gives the number of lattice points in the triangle. Now, we subtract the number of lattice points on the X-axis, getting the result:

    T(qb, a, b) = ((qa + 1)(qb + 1) + q + 1)/2 − (qb + 1) = q(qab − b + a + 1)/2.    (7)

More generally, suppose n ≥ b and let n = qb + r, where q ≥ 1 and 0 ≤ r < b. The sum can be split into three parts:

    ∑_{i=1}^{qb+r} ⌊(a/b) i⌋ = ∑_{i=1}^{qb} ⌊(a/b) i⌋ + ∑_{i=qb+1}^{qb+r} ⌊(a/b) i⌋
                             = ∑_{i=1}^{qb} ⌊(a/b) i⌋ + ∑_{i=1}^{r} ⌊(a/b)(qb + i)⌋
                             = ∑_{i=1}^{qb} ⌊(a/b) i⌋ + r·qa + ∑_{i=1}^{r} ⌊(a/b) i⌋.


Fig. 3: Case n ≥ b.

See Fig. 3 for intuition. As a result we get the equation:

    T(qb + r, a, b) = T(qb, a, b) + rqa + T(r, a, b).    (8)

As a consequence of the above formula together with equation (7) we can reduce n below b in a single step. Therefore, in the succeeding sections we assume that n < b. Notice that it also means that there is no integral point on the line y = (a/b)x for x = 1, 2, ..., n.

3.2 Case a ≥ b

If a = qb + r for some q ≥ 1 and 0 ≤ r < b, we can rewrite:

    ∑_{i=1}^{n} ⌊(a/b) i⌋ = ∑_{i=1}^{n} ⌊((qb + r)/b) i⌋ = ∑_{i=1}^{n} ( qi + ⌊(r/b) i⌋ ) = q·n(n + 1)/2 + ∑_{i=1}^{n} ⌊(r/b) i⌋.

Thus in this case we have the formula:

    T(n, qb + r, b) = q·n(n + 1)/2 + T(n, r, b).    (9)

3.3 Inverting a/b

We use the graphical representation to relate the sums ∑ ⌊(a/b) i⌋ and ∑ ⌊(b/a) i⌋ in one equation. In Fig. 4 the area labelled S_1 represents the sum ∑_{i=1}^{n} ⌊(a/b) i⌋. The largest x and y coordinates of lattice points in this area are n and ⌊(a/b) n⌋ respectively. Consider the rectangular set R of lattice points with x coordinates spanning from 1 to n and with y coordinates spanning from 1 to ⌊(a/b) n⌋. This set has size n·⌊(a/b) n⌋. Let S_2 be the complement of S_1 in R. We assumed that n < b, so there is no element of R lying on the line y = (a/b)x. Therefore, for a given j = 1, ..., ⌊(a/b) n⌋, the number of lattice points in the area S_2 with y coordinate equal to j is ⌊(b/a) j⌋. Hence, the size of S_2 is ∑_{j=1}^{⌊(a/b)n⌋} ⌊(b/a) j⌋. Since |S_1| + |S_2| = |R| we have

    ∑_{i=1}^{n} ⌊(a/b) i⌋ + ∑_{j=1}^{⌊(a/b)n⌋} ⌊(b/a) j⌋ = n ⌊(a/b) n⌋.


Fig. 4: Graphical representation of the sums ∑ ⌊(a/b) i⌋ and ∑ ⌊(b/a) j⌋ (the areas S_1 and S_2 below and above the line, respectively).

Thus, the last recursive formula is

    T(n, a, b) = n ⌊(a/b) n⌋ − T(⌊(a/b) n⌋, b, a).    (10)

It allows us to swap a with b in T(·, a, b). It can be used to make a ≥ b. Notice that after swapping a with b our assumption that n < b still holds, since if n < b, then (a/b)n < a and ⌊(a/b)n⌋ < a.

3.4 Final algorithm

Combining the presented recursive formulas for T(n, a, b) we can design the final algorithm. The procedure is similar to the Euclidean algorithm. First, if n ≥ b, reduce n using (8), making n < b. Then repeat the following steps until n or a reaches zero. If a < b, use (10) to exchange a with b. Next, use (9) to reduce a to a mod b. The number of steps in the above procedure is O(log max(n, a, b)), as in the Euclidean algorithm. The algorithm is fairly simple and it can be written in a recursive fashion.
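A direct recursive transcription of formulas (7)-(10) might look as follows (a Python sketch of our own; it assumes, as in the text, that a/b is given in lowest terms):

```python
def T(n, a, b):
    """sum_{i=1..n} floor(a*i/b) for integers n, a >= 0, b >= 1 with
    gcd(a, b) = 1, computed with recursions (7)-(10).
    O(log max(n, a, b)) recursion depth, as in the Euclidean algorithm."""
    if n == 0 or a == 0:
        return 0
    if n >= b:
        q, r = divmod(n, b)
        # (7) for the q full blocks, (8) to add the remaining r columns
        return q * (q * a * b - b + a + 1) // 2 + r * q * a + T(r, a, b)
    if a >= b:
        q, r = divmod(a, b)
        # (9): split off the integer part q of a/b
        return q * n * (n + 1) // 2 + T(n, r, b)
    # here n < b and a < b: (10) swaps the roles of a and b
    m = a * n // b
    return n * m - T(m, b, a)

# e.g. T(10**6, 355, 113) == sum(355 * i // 113 for i in range(1, 10**6 + 1))
```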

4 Summary and remarks

We presented a simple sublinear algorithm for the rank problem. We showed that this algorithm has O(n^{3/4}) time complexity and needs O(√n) space. The order statistics problem was reduced to the rank problem. We included two reductions. Both call the rank problem O(log n) times and run in O(log n) time. Therefore, we showed that the order statistics in the Farey sequences can be computed in O(n^{3/4} log n) time.

In the reduction exploring the Stern–Brocot tree, we showed how to find a rational number if we are only allowed to compare it with fractions. We remark that this technique can be used in other fields. For instance, it can be used to expand a real number into a continued fraction. We do not need the value of this number; we only need a comparison procedure between that number and an arbitrary fraction. For instance, numbers with such a property are algebraic numbers. However, in this case there are other methods of expanding them into continued fractions [6]. The usefulness of the presented technique should be further investigated.

References

1. Pătraşcu, C.E., Pătraşcu, M.: Computing order statistics in the Farey sequence. In Buell, D.A., ed.: Algorithmic Number Theory. Volume 3076 of LNCS. Springer, Heidelberg (2004) 358–366
2. Graham, R.L., Knuth, D.E., Patashnik, O.: Concrete Mathematics. 2nd edn. Addison-Wesley, London, UK (1994)
3. Yanagisawa, H.: A simple algorithm for lattice point counting in rational polygons. Research report, IBM Research, Tokyo Research Laboratory (August 2005)
4. Barvinok, A.I.: A polynomial time algorithm for counting integral points in polyhedra when the dimension is fixed. Mathematics of Operations Research 19(4) (1994) 769–779
5. Beck, M., Robins, S.: Explicit and efficient formulas for the lattice point count in rational polygons using Dedekind–Rademacher sums. Discrete and Computational Geometry 27(4) (2002) 443–459
6. Brent, R.P., van der Poorten, A.J., te Riele, H.: A comparative study of algorithms for computing continued fractions of algebraic numbers. In Cohen, H., ed.: Algorithmic Number Theory. Volume 1122 of LNCS. Springer, Heidelberg (1996) 35–47