Optimizing Integer Sorting in O(N Log Log N) Expected Time in Linear Space
Total Page:16
File Type:pdf, Size:1020Kb
Optimizing Integer Sorting in 푂(푛 푙표푔 푙표푔 푛) Expected Time in Linear Space Ajit Singh Dr. Deepak Garg M. E. (Computer Science and Engineering) Asst. Prof. Department of computer Science and Engineering, Department of Computer Science and Engineering, Thapar University, Patiala Thapar University, Patiala [email protected] [email protected] Abstract-The traditional algorithms for integer space. The table-1 shows the comparative analysis sorting give a bound of 푶(풏 풍풐품 풏) expected of various traditional algorithms: time without randomization and 푶(풏) with randomization. Recent researches have Table-1: Complexity Comparison optimized lower bound for deterministic Name Best Average Worst Memory algorithms for integer sorting [4, 5, 7]. We Insertion Sort 푂(푛) 푂(푛2) 푂(푛2) 푂(1) present a fast deterministic algorithm for Shell Sort 푂(푛) 퐷푒푝푒푛푑푠 푂(푛 푙표푔푛 2) 푂(푛) integer sorting in linear space. The algorithm Binary tree 푂(푛) 푂(푛푙표푔푛) 푂(푛푙표푔푛) 푂(푛) discussed in this paper sorts 풏 integers in the sort range {ퟎ, ퟏ, ퟐ … 풎 − ퟏ} in linear space in Selection sort 푂(푛2) 푂(푛2) 푂(푛2) 푂(1) 푶(풏 풍풐품 풍풐품 풏) expected time. This improves Heap sort 푂(푛푙표푔푛) 푂(푛푙표푔푛) 푂(푛푙표푔푛) 푂(1) the traditional deterministic algorithms. The Bubble sort 푂 푛2 푂(푛2) 푂(푛2) 푂(1) algorithm can be compared with the result of Merge sort 푂(푛푙표푔푛) 푂(푛푙표푔푛) 푂(푛푙표푔푛) Depends Andersson, Hagerup, Nilson and Raman which Quick sort 푂(푛푙표푔푛) 푂(푛푙표푔푛) 푂(푛2) 푂(log 푛) sorts n integers in 푶(풏 풍풐품 풍풐품 풏) expected time Randomized but uses 푶(풎ᵋ ) space [2, 14]. The algorithm can 푂(푛푙표푔푛) 푂(푛푙표푔푛) 푂(푛푙표푔푛) 푂(log 푛) Quick sort also be compared with the result of Andersson Signature Sort 푂(푛) 푂(푛) 푂(푛) Linear which sort 풏 integers in 푶( 풏 풍풐품 풍풐품 풏) expected time and linear space but uses For sorting integers in [0, m – 1] range 푂(푚ᵋ ) randomization [2, 5]. The result of this paper space is used in many algorithms. When m is large, can also be compared with the result of Yijie the space used is excessive. Thus integer sorting Han which sort 풏 integers in 푶(풏 풍풐품 풍풐품 풏) using linear space is more important and therefore expected time and linear space but passes extensively studied by researchers. integers in a batch i.e. all integers at a time [8]. This paper also gives an implementation view of Fredman and Willard showed that n integers can be integer sorting in 푶(풏 풍풐품 풍풐품 풏) in linear sorted in 푂 푛 푙표푔푛 / 푙표푔 푙표푔푛 time in linear space. space [11]. Raman showed that sorting can be done Keywords-Deterministic Algorithms; Sorting; Integer in 푂 푛 log 푛 log log 푛 time in linear space [15]. Sorting; Complexity; Space Requirement. Later Andersson improved the time bound to 푂 푛 log 푛 [3]. Then Thorup improved the 1. INTRODUCTION time bound to 푂(푛(log log 푛)2) [16]. Later Yijei Han showed 푂(푛 푙표푔 푙표푔 푛 푙표푔 푙표푔 푙표푔 푛) time In modern computer world most of the problem is for deterministic linear space integer sorting [6]. being solved by sorting. It is a classical problem Yijei Han again showed improved result with which has been studied by many researchers. The 푂(푛 푙표푔 푙표푔 푛) time and linear space [8]. traditional algorithms for sorting give a clear picture for complexity. Although the complexity In these algorithms expected time is achieved by for comparison sorting is now well understood, the using Andersson’s exponential tree [3]. The height picture for integer sorting is still not clear. The only of such a tree is 푂(푙표푔 푙표푔 푛). Also the balancing known lower bound for integer sorting is the trivial of the tree will take only 푂(푙표푔 푙표푔 푛) time. Ω(푛) bound. Many continuous research efforts Balancing does not take much time. The major time have been made by many researchers on integer is taken by the insertion of integers as proved by sorting [2, 3, 5-16]. Recent advances in the design the Andresson [3]. Thus we only need to reduce the of algorithms for integers sorting have resulted in insertion time of integers. fast algorithms [5-12, 15, 16]. However, many of these algorithms use randomization or super-linear Andersson has shown that if we pass down integers The tree itself is very complex to handle as number in exponential tree one by one than the insertion of integer increases. We are using modified concept takes 푂 log 푛 for each integer i.e. total of exponential tree for our convenience for integer sorting. Here we will say that a tree with properties complexity will be 푂 푛 log 푛 [3]. This is of binary search tree will be the exponential tree if improvement over the result of Raman which takes it has following properties: 푂 푛 log 푛 log log 푛 expected time [15]. 1. Each node at level 푘 will hold 푘 number Yijie Han has given an idea which reduces the of keys (or integers in our case) i.e. at complexity to 푂(푛 푙표푔 푙표푔 푛) expected time in depth 푘 the number of key in any node linear space [8]. The technique used by him is will be 푘 keeping root at level 1. coordinated pass down of integers on the 2. Each node at level 푘 will be having 푘 + 1 Andersson’s exponential search tree [3] and the children i.e. at depth 푘 the number of linear time multi-dividing of the bits of integers. children will be 푘 + 1. Instead of inserting integer one at a time into the 3. All the keys in any node must be sorted. exponential search tree he passed down all integers 4. An integer in child 푖 must be greater than one level of the exponential search tree at a time. key 푖 − 1 and less than key 푖. Such coordinated passing down provides the chance of performing multi-dividing in linear time Thus the node at root will look like as depicted in and therefore speeding up the algorithm. figure 1. This paper will present a different idea to achieve the complexity of deterministic algorithm for integer sorting in 푂(푛 푙표푔 푙표푔 푛) expected time and linear time which is easy to implement and very simple enough. We are also using Andersson exponential tree [3] to perform the sorting. The exponential tree can’t be used as it is for our idea, so we need to modify the exponential tree. We will Figure 1: Root Node of Exponential Tree pass down integers one at a time but limit the And the node at 푖푡푕 level or depth will look like as comparison one per level. Thus total number of shown in figure 2. comparison for any integer will be 푂(푙표푔 푙표푔 푛) i.e. total time taken for all integers insertion will be 푂( 푛 푙표푔 푙표푔 푛) as the height of the tree will be 푂(푙표푔 푙표푔 푛). In order to perform this we introduce a new concept of Node word. Each node will be containing one extra word which will represent all the keys at that node. We will perform comparison of integer with this word. 2. EXPONENTIAL TREE The exponential tree was first introduced by Andersson in his research for fast deterministic Figure 2: Node at 푖푡푕 level in Exponential Tree algorithms for integer sorting [3]. In such a tree the number of children increases exponentially. The height of the tree will remain 푂(푙표푔 푙표푔 푛) which can be proved by using induction. The An exponential tree is almost identical to a binary modification will not only reduce the complexity of search tree, with the exception that the dimension exponential tree involved in implementing it but of the tree is not the same at all levels. In a normal also improves the balancing method as well as binary search tree, each node has a dimension (d) sorting technique. Integer sorting will be more of 1, and has 2d children. In an exponential tree, the convenient and fast with this modification. dimension equals the depth of the node, with the root node having a d = 1. So the second level can 2.1 Balancing hold two nodes, the third can hold eight nodes, the fourth 64 nodes, and so on. This shows that number The balancing of exponential tree is one of the of children at each level increased by a most important and necessary task in order to multiplicative factor of 2 i.e. exponential increases achieve the 푂(푛 푙표푔 푙표푔 푛) expected time in linear in number of children at each level. space [3]. If the tree is not balance then the worst case will happen and the expected running time will increase to undesired level. The balancing will guarantee that the depth of tree will remain split the integers into smaller lists. He suggested 푂(푙표푔 푙표푔 푛) which will lead us to our target. We that instead of inserting integer one at a time into will say that tree is balanced if the difference the exponential search tree we can pass down all between depths (or heights) of both sided children integers to one level of the exponential search tree of a key will not exceed by 2. The balancing will at a time. Such coordinated passing down provides happen on all the children of the node. Each key the chance of performing multi-dividing in linear will play a role of a node like in binary search tree time and therefore speeding up the algorithm.