IJCST Vo l . 3, Iss u e 2, Ap r i l - Ju n e 2012 ISSN : 0976-8491 (Online) | ISSN : 2229-4333 (Print) Advanced Pattern Based With Pooling 1Shanti Swaroop Moharana, 2Satya Prakash Sahoo, 3Dr. Manas Ranjan Kabat 1,2,3Dept. of Computer Science & Engineering, Veer Surendra Sai University of Tech. Burla, India

Abstract in natural merge sort [10], which may increase the execution Sorting is the arrangement of records or elements in some time. Bottom-Up merge sort [11], uses a non recursive procedure mathematical or logical order. Merge sort is one among many for sorting numbers which eliminates the divide procedure sorting techniques. Several improvements on the merge sort completely. However, although elimination of recursive function procedure are done by different researchers that require less space calls decreases time but actually the sorting time increases. This for auxiliary memory and less number of conditions checking. In increase in sorting time occurs because the algorithm actually this paper, a new merge-sort technique is proposed which uses the works similar to the worst case merge sort algorithm, comparing design patterns to reduce computational complexity of swapping each and every number with the other number. Heuristic Merge operation and memory usage. The order of settlement of elements Sort [12], algorithm searches a specific pattern(consecutive is recorded by various design patterns and then merging of one ascending or consecutive descending) of numbers in the given phase of elements with another is replaced with chunk merging. list of numbers. It extracts sub lists of sorted data, and then The proposed technique uses the design patterns to create a memory uses the merge procedure to merge all the extracted sub lists. pool of data to sort data. This algorithm has been implemented However the list extracted may contain very small chunk(group in JAVA and it is observed that the proposed algorithm performs of consecutive sorted data) because the procedure extracts only better than all existing merge sort algorithms. consecutive numbers which are already sorted. Finding sorted data in a random list is very sparse and hence the chunks formed are Keywords of small size. Since the sublists are of small size more number of Sorting, Pooling, Heuristic, Pattern Based merge calls are required. Hence there is no significant decrease in execution time in case of Heuristic merge sort [12]. I. Introduction In this paper, we present a proposal to form larger pool of sorted Sorting [7, 16], is the arrangement of records or element in some data for merging thereby reducing the number of merge function mathematical or logical form. Proper arrangement of elements calls. Our proposed technique uses the design patterns to create makes searching and selecting a data element easy and hence a memory pool of data to sort data more efficiently. The order it saves time. There are many fundamental and advance sorting of Settlement of elements is recorded by a design pattern and algorithms in computer science. These sorting algorithms are then merging of one phase of elements with another is done. The classified by various research papers [1-3], which are listed as design pattern reduces computational complexity of swapping follows :- operation and memory usage. We have also combined the various • Swap based- Sorts begin conceptually with the entire list, and approaches which were used in heuristic merge sort [12], and exchange of particular pairs of elements (adjacent elements or bottom up merge sort [11], to form the merge sort procedure a elements with certain step like in Shell sort) takes place. more efficient one. We have reduced the necessity of dividing the • Merge Based Swap.eg Merge Sort. list or the step of splitting (until a single element) and thereby • Tree based sorts.eg Heap Sort. shown an improvement in running time requiring less number • Other.eg or . of recursive calls to the divide and conquer procedure. Although many people consider sorting as a solved problem, but useful new sorting algorithms are still being invented for example, II. Motivation library sort in 2004, Bottom up merge sort in 2008 and Heuristic A pool in computer science is a set of initialized resources that are merge Sort in 2010. Mergesort [4-5], is a which kept ready to use rather than allocated and destroyed on demand. is used for sorting data stored either in internal or external storage A client of the pool requests an object from the pool and performs device. This algorithm is based on divide and conquer theory. various operations on the returned object. When the client has • Merge-sort technique recursively divides a list of ‘n’ elements finished with an object (or resource), it returns it to the pool, rather into two sub-lists n1 and n2 until it gets a single element. than destroying it. Pooling of resources can offer a significant • Then it merges two single elements into one sorted list and performance boost in situations where the cost of initializing a then two elements and proceeds in this way. Two such lists class instance is high, the rate of instantiation of a class is high, and are then merged to form a sorted list of four elements. the number of instances in use at any one time is low. The pooled The above procedure is repeated with larger number of elements resource is obtained in predictable time when creation of the new through the so called conquer steps until it merges two n/2 sized objects (especially over network) may take variable time. sorted lists into a sorted list of n elements. Example of Pooling- Several approaches have been proposed to improve the performance The idea of object (or resource) pooling is similar to the operation of the merge sort algorithm such as top down merge sort [8], natural of your local library. When you want to read a book, you know that merge sort [10], queue merge sort [9], in place merge sort [6], it’s cheaper to borrow a copy from the library rather than purchase bottom up merge sort [11], heuristic merge sort [12], etc. your own copy. Likewise, it is cheaper (in relation to memory and Top-Down merge sort [8], uses a technique to cut the auxiliary speed) for a process to borrow an object rather than create its own array down to half. However no significant effort has been made copy. In other words, the books in the library represent objects in top down merge sort [8], to reduce the recursive merge function and the library patrons represent the processes. When a process calls which may increase the execution time. Another sorting needs an object, it checks out a copy from an object pool rather technique natural merge sort [10], reduces some condition checks than instantiate a new one. The process then returns the object to in loops. However the recursive merge function calls still exist the pool when it is no longer needed.

218 In t e r n a t i o n a l Jo u r n a l o f Co m p u t e r Sci e n c e An d Te c h n o l o g y www.ijcst.com ISSN : 0976-8491 (Online) | ISSN : 2229-4333 (Print) IJCST Vo l . 3, Iss u e 2, Ap r i l - Ju n e 2012

However, these benefits are mostly true for objects which are set of numbers is sorted. In our algorithm we have used scanning expensive with respect to time, such as database connections, alternately from both ends. The end from which scanning is to be socket connections, threads and large graphic objects like fonts done is decided by using modulus operator function. We have used or bitmaps. a simple even odd logic for deciding the end from which scanning Since the merging process is a high cost operation, we have tried is done. By adding a new counter variable and incrementing it each to reduce the number of merge function calls. We noticed that time when scanning from one end is completed and then applying instead of extracting smaller sub list again and again from the the modulus operator on that counter. If the modulus condition is given list, if we can create a larger pool of sorted data then larger satisfied then the scanning is done from the last element otherwise pool of sorted data is provided as input to the merge procedure from first elements. Our algorithm uses the first element in the which may decrease the execution time significantly. given list a as the key of a new list called the sub-list b. Then the remaining elements of the original list a is scanned, and every III. Advanced Pattern Based Merge Sort time an element is found which is greater than the last element Our proposed algorithm targets the pattern of input data which of the sub-list b, it is appended to the sub-list b, and removed consists of successive sorted numbers either in ascending or from the original list a. After extracting a sorted sub-list b from descending order. The proposed algorithm scans the numbers the original list, putting that list aside, we then extract another in two directions that is from forward direction as well as from sub-list in the same manner. Then the two sub-lists are merged. backward direction. By alternating the scanning direction we The process of extracting a sub-list and merging with the previous ensure that we always get the maximum number of numbers. Our sublist is continued until the original list is exhausted. If the data algorithm uses the modulus operator for alternating the scanning is already sorted it gives theoretical as well as computational time direction. The numbers which are extracted are then stored in complexity to be O(n). a larger memory pool. Finally an efficient merging procedure is applied for merging the numbers present in the pooled set of IV. Improving Merge Procedure for Patterns numbers. The complexity of mergesort algorithm is O(nlog(n)). However it includes only the number of comparisons among the elements being sorted. In this paper we focus on some other factors that Algorithm merge_sort_pooling(a) are not ignorable. For a single processor based system the key 1. Input a points, which have drawn attention of most researchers, is to /*a is the given set of numbers to be sorted*/ improve the performance of the merge sort algorithm which mainly 2. Input n,lenc,key,middle includes reducing the number of comparisons and cutting down /* n represents the number of elements that are sorted till now. the size of required auxiliary array. Earlier top down merge sort lenc represents the total number of numbers to be sorted. Key proposed a method to cut the auxiliary array down to half while represents the element which is currently selected. middle natural merge-sort deals with reducing some condition checking represents the number of elements in the current sorted list. */ in loops. However, to the best of our knowledge, little attention 3. Input c is given to reduce the number of recursive function calls, which /*c is the completely sorted set of numbers*/ certainly adds some overhead on the performance of merge sort. In 4. while(n=0;i--) proposed merge sort reduces the number of merge function calls 8. if((a[i]==-1)||(key>a[i])) and hence performs better. 9. continue; 10. endif A. Reducing Auxiliary Memory Size 11. c[n]=a[i]; We noticed that it is not necessary to copy the second half of 12. key=a[i]; array a to the auxiliary array b. Since, if all elements of the first 13. a[i]=-1; n++; middle++; half have been copied back to a, the remaining elements of the 14. end for second half need not be moved anymore since they are already 15. else at their proper places. Therefore, it cuts the required auxiliary 16. key=a[0]; middle=0; array size as well as the copy operations to half of that needed in 17. for( i=0;ia[i])) 19. continue; B. Reducing the Loop Condition Checking 20. end if We also noticed that in top down merge sort [8] the second while- 21. c[n]=a[i]; key=a[i]; loop of the merging algorithm checks whether any of the two lists 22. a[i]=-1; n++; middle++; are ended. However, both the list will never end at the same time. 23. end for Hence, checking only the list that will end earlier is sufficient. This 24. merge_pool(c,b,0,(n-middle),n-1); cuts down almost half of the CPU time which is spent in checking 25. end while the while loop condition. The improved algorithm, along with the improvement done earlier now looks like this

Initially a given set of numbers (array a) is given as input to the algorithm and the algorithm returns another set of numbers (array c) as output. The algorithm continues execution until the entire www.ijcst.com In t e r n a t i o n a l Jo u r n a l o f Co m p u t e r Sci e n c e An d Te c h n o l o g y 219 IJCST Vo l . 3, Iss u e 2, Ap r i l - Ju n e 2012 ISSN : 0976-8491 (Online) | ISSN : 2229-4333 (Print)

Algorithm natural_merge_sort( a, low,mid,high) Algorithm merge_pool( a,b,low,mid,high){ 1. Input a /*a is the given set of numbers */ 1. Input a 2. Input low,mid,high 2. /*a is the given set of numbers */ /*low represents the first element postion in the given list 3. Input b which is to be merged. Mid represents the middle element 4. /*b is the given sublist which is to be merged with at which the split is to be performed. High represents the original list*/ last element in the given list.*/ 5. Input low,mid,high 3. if (a[mid] > a[high]) 6. /*low represents the first element postion in the 4. i=0; j=low; given list a which is to be merged. Mid represents the 5. while (j<=mid) middle element at which the split is to be performed. 6. b[i]=a[j]; i=i+1; j = j+1; High represents the last element in the given list a.*/ 7. end while 7. if(mid==0) 8. i=0; k=low; //Storing Initial elements into array 9. while (j<=high) 8. return; 10. // no need to check whether the left list is finished 9. end if 11. if (b[i]<=a[j]) 10. if(a[mid-1]a[high]) 21. while (j ≤ mid) 19. while(j<=high) 22. b[i] = a[j]; i = i+1; j = j+1; 20. if(b[k]

220 In t e r n a t i o n a l Jo u r n a l o f Co m p u t e r Sci e n c e An d Te c h n o l o g y www.ijcst.com ISSN : 0976-8491 (Online) | ISSN : 2229-4333 (Print) IJCST Vo l . 3, Iss u e 2, Ap r i l - Ju n e 2012 algorithms on them. After running the algorithm on different [4] Thomas H. Cormen; Leiserson, Charles E., Rivest, Ronald datasets of a database we have taken the average execution time L., Stein, Clifford,"2.3: Designing algorithms", Introduction needed to sort the sets. The average execution time required to to Algorithms (2nd ed.). MIT Press and McGraw-Hill. pp. sort the records is given in Table 1. 27-37, (2001) [1990]. [5] Han Y, Thorup M.,“ in O(n(log(logn))1/2) Table 1: Average Execution Time of Various Merge Sort Expected time and linear space”, In Proceedings of the 43rd Algorithms Symposium on Foundations of Computer Science (November Pattern Based 16-19, 2002). FOCS IEEE Computer Society, Washington Bottom-up Heuristic Merge Sort DC. Database Merge Sort Pattern Sort with Pooling [6] Katajainen, Jyrki; Pasanen, Tomi; Teuhola, Jukka,"Practical Size (time in nano (time in nano (time in nano in-place mergesort", Nordic Journal of Computing 3, seconds) seconds) seconds) 1996. 10 52860 24633.8 16679.1 [7] Knuth, Donald,"Section 5.2.4: Sorting by Merging", Sorting 100 140618 630681.3 170846.1 and Searching. The Art of Computer Programming. 3 (2nd 1000 2321744 6423269 261366.2 ed.). Addison-Wesley, 1998. 10000 117372283 383049384 29798303 [8] Sommerlad P Rapperswil H.f.T Schweiz, 2004. 30000 981015994 2678122785 130551310 [9] Golin M.J. Sedgewick R.,“Queue Mergesort”, Information Processing Letter, pp. 253-259, 1993. 50000 2650592882 7100290014 275109618 [10] Katajainen J Traff J.L.,"A meticulous Analysis of Mergesort 70000 5342830896 13690772148 464356275 Programs”, Proceedings of the third Italian Conference on 100000 10539018288 27848741279 794770350 Algorithms and Third Italian Conference on Algorithms and Complexity, Vol. 1203 Lecture Notes in Computer Science Finally, it is observed that our algorithm performs better as 1997 pp. 217-228. compared to other existing mergesort techniques. Moreover, it [11] M. Abdullah-Al-Wadud,Md. Amiruzzaman, Oksam Chae,“A was also noticed that when larger datasets are given as input, Bottom Up Merge Sort Elminating Recursion”, Daffodil the algorithm performs better as compared to other existing International University Journal Of Science and Technology, algorithms. This also advocates for the proposed method. 2008. [12] Manouchehr Zadahmad jafarlou, Parisa Yousefzadeh VI. Conclusion fard,“Heursitic and Pattern Based Merge Sort” Elseveir In this paper, we have presented a new advanced pattern based WCIT-2010. merge-sort technique which targets larger pattern of successive [13] Azad A.K.M. Kaykobad M.,“A Variation of mergesort sorted numbers. It greatly enhances the performance of merging by algorithm requiring fewer comparisons”, National Conference replacing small list element merging with larger chunk merging. Computer and Info Systems, Dhaka, Bangladesh. We have also studied some of the approaches which were made [14] Sun Microsystems, Inc..,"Arrays API", 2007. earlier in order to optimize the merging procedure and found [15] Sun Microsystems, Inc.,"java.util.Arrays.java" 2007. some flaws which may reduce the actual execution time. The [16] Horowitz Sahni,“Fundamentals of Computer Algorithms”, technique of Memory Pooling greatly enhances the speed with Galgotia Publication. which merging is done and thereby reduces the execution time practically. Shanti Swaroop Moharana is pursuing his Though the computational complexity increases during the first B.Tech final year in Computer Science phase because of increase in the number of comparisons while & Engineering from Veer Surendra extracting the sub lists, but it surely runs faster during the second Sai University of Technology, Burla phase when merging of sub lists is done and this compensates formerly known as University college for increase in the complexity during the first phase. Since larger of engineering Burla. His research numbers of sub lists are already sorted the number of comparisons interests include Sorting Algorithms, required during merging as well as the number of merges function Networking (specially routing in Ad- call decreases. Hence our algorithm performs better and the hoc networks). experimental results also support this claim.

References [1] Bradford Philip G Rawlins Gregory J.E Shannon Gregory, Satya Prakash Sahoo received his “Efficient Matrix Chain Ordering in Polylog time”, SIAM M.Tech in Computer Science & Engg. Journal on Computing (Philadelphia:Society for Industrial and pursuing Ph.d in CSE. He is having and Applied Mathematics), 1998. 8 years teaching experience and 5 years [2] Papadmitriou,Christos H.,"Computational Complexity", industry experience. At present he is Reading Mass Addison-Wesley, 1994. working as a Senior Lecturer in Veer [3] Grotschel, Martin Laszlo Loovasz, Alexander Schrijver, Surendra Sai University of Technology, “Complexity, Oracle and Numerical Compuutation”, Burla. His research interests include Geometric Algorithms and Combinatorial Opttimization Computer Networking, Ad-Hoc Springer, 1988. networks, Database Management and Mobile Computing. He published 2 research papers and attended 2 international conferences. www.ijcst.com In t e r n a t i o n a l Jo u r n a l o f Co m p u t e r Sci e n c e An d Te c h n o l o g y 221 IJCST Vo l . 3, Iss u e 2, Ap r i l - Ju n e 2012 ISSN : 0976-8491 (Online) | ISSN : 2229-4333 (Print)

Manas Ranjan Kabat received his M.Tech in Machine Intelligence from Bengal Engineering & Science University. He is also a Ph.d Holder. He is having 11 years of teaching experience. He is pursuing research from last 8 years. He is currently working as a Senior Lecturer in Veer Surendra Sai University of Technology, Burla. His research areas include Algorithms, Computer Networking, Real Time Systems and Internet and Web Technology. He is the author of various papers which are published in various journals, 5 papers in international journals, 15 research papers and attended 10 International Conferences.

222 In t e r n a t i o n a l Jo u r n a l o f Co m p u t e r Sci e n c e An d Te c h n o l o g y www.ijcst.com