Proceedings of the 8th WSEAS International Conference on APPLIED COMPUTER SCIENCE (ACS'08)

Investigating Design Choices between Bitmap index and B-tree index for a Large System

Morteza Zaker, Somnuk Phon-Amnuaisuk, Su-Cheng Haw
Faculty of Information Technology, Multimedia University, Malaysia

Abstract: Building indexes on a database is common, but the choice of index has an important impact on query performance, especially in large databases such as a Data Warehouse, where the queries are usually very complex and ad hoc. If a proper index structure is chosen, the query response time can be reduced. Until now, there has been no definite guideline for Data Warehouse analysts to choose the appropriate index. According to conventional wisdom, Bitmap index is the preferred indexing technique for cases where the indexed attributes have few distinct values (i.e., low cardinality). The query response time is expected to degrade as the cardinality of the indexed columns increases, owing to a larger index size. On the other hand, B-tree index is considered good when the column values are of high cardinality, owing to its indexing and retrieval mechanisms. In this paper, we show that this may not be true under certain circumstances. Our experimental results indicate that although the level of column cardinality determines the index file size, it does not determine the query processing time. Moreover, our results indicate that the Bitmap index is faster than the B-tree index on a large dataset with multi-billion records.

Key–Words: Data warehouse, Bitmap index, B-tree index, Query processing

1 Introduction

A Data Warehouse (DW) is the foundation for Decision Support Systems (DSS), holding a large collection of information that can be accessed through an On-line Analytical Processing (OLAP) application. This large database stores current and historical data that come from several external data sources [1, 2, 3]. The queries built on a DW system are complex and usually include join operations that incur computational overhead. This overhead increases the response time, especially when queries are performed on a large dataset. To increase performance, DW analysts commonly use solutions such as indexes, summary tables and partitioning mechanisms [4].

Database vendors support various index techniques, such as Bitmap index [4], B-tree [3, 5, 6], Projection index [7], Join bitmap index [8], Range-based bitmap index [9] and so on. Bitmap index, for example, is advisable for a system whose data are not frequently updated by many concurrent processes [10, 11, 12]. This is mainly because a Bitmap index stores a large amount of row information in each block of the index structure. In addition, since Bitmap index locking is at the block level, any insert, update, or delete activity may result in locking an entire range of values [13]. On the other hand, B-tree index is good for a system which is frequently updated, because it does not need re-balancing as frequently as other self-balancing search trees. In addition, all leaf blocks of the tree are at the same depth [6]. Thus, choosing the proper type of index structure has a big impact on the DW environment.

The main problem is that there is no definite guideline for DW analysts to choose appropriate indexing methods. According to common practice, Bitmap index is best suited for columns having low cardinality and should only be considered for low-cardinality data [1, 3, 6]. Strohm [6] concludes that the advantages of using Bitmap indexes are greatest for low-cardinality columns, i.e., columns which have a small number of distinct values compared to the number of rows in the table. If the number of distinct values of a column is less than 1% of the number of rows in the table, then the column is a candidate for a Bitmap index. This assumption may have been correct to some extent for the older algorithms used by database software and the older hardware they ran on, but as the usage of data is exploding, it may no longer be applicable.

In this paper, we demonstrate that: (i) a Bitmap index on a column with high cardinality is more efficient than a B-tree index, (ii) the query response time of multi-dimensional queries is not determined by the time needed for one-dimensional queries, on both Bitmap index and B-tree index, and (iii) queries utilizing a Bitmap index which are executed within a range of predicates are affected by the distribution of data, but not by the cardinality conditions.

The rest of the paper is organized as follows. Section 2 presents the background on Bitmap index, B-tree index and cardinality concepts. Section 3 defines a case study and performance methodology with a set of queries to compare the performance of Bitmap index and B-tree index. Section 4 discusses the experimental results. Lastly, Section 5 concludes the paper.

Table 1: Basic Bitmap index, adapted from [10]

RowId   C   B0   B1   B2   B3
0       2   0    0    1    0
1       1   0    1    0    0
2       3   0    0    0    1
3       0   1    0    0    0
4       3   0    0    0    1
5       1   0    1    0    0
6       0   1    0    0    0
7       0   1    0    0    0
8       2   0    0    1    0
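The construction behind Table 1 can be sketched in a few lines of Python. This is only an illustrative, uncompressed model (one arbitrary-precision integer per distinct value, with bit i standing for RowId i); it is not how Oracle or any other product stores its bitmaps, and the helper names build_bitmap_index, bits and count_rows are hypothetical.

```python
from collections import defaultdict


def build_bitmap_index(column):
    """Map each distinct value to an integer bitmap; bit i is 1 if row i holds the value."""
    index = defaultdict(int)
    for row_id, value in enumerate(column):
        index[value] |= 1 << row_id
    return dict(index)


def bits(bitmap, n_rows):
    """Expand an integer bitmap into a list of 0/1 flags ordered by RowId."""
    return [(bitmap >> row_id) & 1 for row_id in range(n_rows)]


def count_rows(bitmap):
    """COUNT(*) over a bitmap is simply the number of set bits."""
    return bin(bitmap).count("1")


# Column C from Table 1 (RowId 0 to 8).
C = [2, 1, 3, 0, 3, 1, 0, 0, 2]
index = build_bitmap_index(C)

# Print the four bitmaps B0..B3 as columns of Table 1.
for value in sorted(index):
    print(f"B{value}", bits(index[value], len(C)))

# SELECT COUNT(*) ... WHERE C = 0
equal_zero = count_rows(index[0])

# SELECT COUNT(*) ... WHERE C IN (1, 3): OR the two bitmaps, then count.
in_one_three = count_rows(index[1] | index[3])

# SELECT COUNT(*) ... WHERE C = 2 AND NOT C = 3: AND with the complement.
all_rows = (1 << len(C)) - 1
two_not_three = count_rows(index[2] & (all_rows & ~index[3]))

print(equal_zero, in_one_three, two_not_three)
```

The point of the sketch is that equality, IN and NOT predicates reduce to bitwise OR, AND and complement over whole bitmaps, followed by a bit count, without touching the table rows.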

2 Background and related work

2.1 Bitmap index

Bitmap index is built to enhance the performance of various query types, including range, aggregation and join queries. It is used to index the values of a single column in a table. A Bitmap index is derived from a sequence of the key values, which depicts the number of distinct values of the column. Each row in a Bitmap index is sequentially numbered starting from integer 0. If the bit is set to "1", it indicates that the row with the corresponding RowId contains the key value; otherwise the bit is set to "0".

To illustrate how Bitmap indexes work, we show an example based on the one given by E. O'Neil and P. O'Neil [10]. "Table 1 shows a basic Bitmap index on a table containing 9 rows, where Bitmap index is to be created on column C with integer values ranging from 0 to 3. We say that the column cardinality of C is 4 because it has 4 distinct values. Bitmap index for C contains 4 bitmaps, shown as B0, B1, B2 and B3 corresponding to the value represented. For instance, in the first row where RowId = 0, column C has the value 2. Hence, in column B2, the bit is set to "1", while the rest of bitmaps are set to "0". Similarly, for the second row, bit of B1 is "1" because the second row of C has the value 1, while the corresponding bits of B0, B2 and B3 are all "0". This process repeats for the rest of the rows [10]."

2.2 B-tree index

A B-tree [5] stores index values and pointers to other index nodes in a recursive tree structure. Data can be retrieved easily by following the pointers. The top-most level of the index is known as the root, while the lowest level consists of the leaf nodes. All the other levels in between are called branches (internal nodes). Both the root and the branches contain entries that point to the next level in the index. Leaf nodes consist of the index key and pointers to the physical location in which the corresponding records are stored. According to some research studies [3, 11], B-tree index has features that make it a good choice for columns with high-cardinality values, especially in DW design.

2.3 Cardinality

In set theory, cardinality refers to the number of members in a set. In database theory, the cardinality of a table refers to the number of rows contained in that table. In terms of OLAP systems, cardinality likewise refers to the number of rows in a table. From a data warehousing point of view, on the other hand, cardinality usually refers to the number of distinct values in a column. Generally, there are four levels of cardinality: Low, Normal, High and Very high cardinality (also known as Full cardinality). Low cardinality refers to columns which have very few unique values. Normal cardinality refers to columns which have sporadic unique values. High cardinality refers to columns which have a large number of distinct values. Very high cardinality refers to columns which have a very large number of distinct values. Recently, very high cardinality has also become known as Full cardinality in the database community.

2.4 Related Work

Recently, several significant research studies have investigated the main limitation of Bitmap index. New indexing strategies that apply bitmap compression schemes require less space and provide performance gains [10, 12, 14, 17, 18, 19].

The studies in [16, 17] found that WAH (Word-Aligned Hybrid) compression is effective in reducing Bitmap index size. They showed that query processing time grows linearly as the index size increases, and that the query processing time is linear in the number of hits when using a WAH-compressed bitmap index. They proved that WAH-compressed bitmap indexes are optimal for both low and high cardinality, and that the techniques for compressing bitmap indexes increase the efficiency of in-memory logical operations.

The authors of [12] investigated recent developments in bitmap indexing technology under three categories, i.e., encoding, compression, and binning. They discuss how various encoding methods can reduce the index size and improve the query response time. On the other hand, although several indexing methods, including B*-tree and B+-tree (extensions of B-tree), are theoretically best suited for single-dimensional range queries, most of them cannot be used to efficiently answer arbitrary multi-dimensional range queries.

FastBit [18] is a compressed Bitmap index implemented with a particular compression scheme. This indexing scheme can answer range queries many times faster than the well-known indexing schemes. The authors of [19] claim that FastBit is efficient in terms of both speed and compression among data management techniques. The authors of [10] present an efficient bitmap index design for modern processors by analyzing RIDBit and FastBit together with the physical design aspects of the two packages. They show that FastBit indexes are usually larger than RIDBit indexes, but that FastBit can answer many queries in less time because it accesses the needed bitmaps in fewer I/O operations.

In fact, the optimizer of database software cannot make use of any index to execute some kinds of queries. Instead, these databases prefer to do a full table scan. As the data grow abnormally, a table scan requires more physical disk reads to avoid insufficient memory allocation. FastBit can support these queries directly [18, 19, 14], whereas Oracle 11G has not utilized this method of implementation.

3 Methodology

3.1 Query Set

In order to compare the efficiency of Bitmap index and B-tree index, we build a series of queries on selected columns for evaluation. In our dataset, there are 3 tables, namely Order, Sales and Product. Table 2 depicts these tables with their column cardinalities. Each table has approximately 1.6 billion records. These records are generated randomly using a PL/SQL block with Oracle11G tools. The Sales table involves low-cardinality columns, while the Order and Product tables involve normal- and high-cardinality columns respectively. All tables have the Id-Bit and Name-Bit columns, while the Active-Bit column is present only in the Product table and involves a Bitmap index with low cardinality. Likewise, the Id-Bt and Name-Bt columns are present in all tables, while the Active-Bt column in the Product table involves a B-tree index with low cardinality. We use a number of queries to study the performance of B-tree and Bitmap indexes. In each column, C1K has 1000 distinct values, each appearing randomly approximately 1,600,000 times, C1M has 1,000,000 distinct values and C120M has 120,000,000 distinct values. The columns Id-Bit and Name-Bit carry a Bitmap index, and Id-Bt and Name-Bt carry a B-tree index, in all tables.

Table 2: Various columns with their associated data types and column cardinalities

            Id-Bit    Id-Bt     Name-Bit   Name-Bt   Active-Bt   Active-Bit
Data type   Numeric   Numeric   Varchar    Varchar   Number      Number
Size        8 Byte    8 Byte    8 Byte     8 Byte    1 Byte      1 Byte
Sales       C1K       C1K       C1K        C1K       -           -
Order       C1M       C1M       C1M        C1M       -           -
Product     C120M     C120M     C120M      C120M     2           2

The Set Query Benchmark has been used for frequent-query applications, much like the Star-Schema within data-warehouse design [22, 23]. The queries of the Set Query Benchmark have been designed around business analysis missions. In order to evaluate the time required to answer different query types, including range, aggregation and join queries, we implemented six queries adopted from the Set Query Benchmark. The selected SQL queries used for our performance measurements are described briefly in Listings 1 to 6.
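As a quick arithmetic check of the column definitions above against the 1% rule of thumb quoted in the Introduction, the following sketch computes the share of distinct values and the average number of rows per value for each column. It assumes exactly 1.6 billion rows per table and uniformly distributed values; the paper states both only approximately, and the function names are hypothetical.

```python
# Assumed round figure for "approximately 1.6 billion" records per table.
ROWS_PER_TABLE = 1_600_000_000

# Distinct values per indexed column, using the names from Table 2.
DISTINCT_VALUES = {"C1K": 1_000, "C1M": 1_000_000, "C120M": 120_000_000}


def rows_per_value(distinct, rows=ROWS_PER_TABLE):
    """Average number of rows sharing one key value (uniform data assumed)."""
    return rows / distinct


def is_bitmap_candidate(distinct, rows=ROWS_PER_TABLE, percent=1):
    """Rule of thumb: distinct values below `percent` % of the row count."""
    # Integer arithmetic avoids floating-point edge cases at the threshold.
    return distinct * 100 < rows * percent


for name, distinct in DISTINCT_VALUES.items():
    share = 100 * distinct / ROWS_PER_TABLE
    print(
        f"{name}: {share:.7f}% distinct, "
        f"{rows_per_value(distinct):,.1f} rows per value, "
        f"candidate under the 1% rule: {is_bitmap_candidate(distinct)}"
    )
```

Under these assumed figures, C1K and C1M fall well below the 1% threshold, while C120M works out to 7.5% distinct values. The Product table is therefore the case in which conventional wisdom would not suggest a Bitmap index.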

Query1A: SELECT COUNT(*) FROM table WHERE ColumnX = 10;
ColumnX is one of Id-Bit and Id-Bt, and table is one of Sales (C1K cardinality on its columns), Order (C1M) and Product (C120M).
Query1B: SELECT COUNT(*) FROM table WHERE ColumnY = 'ABCDEFGH';
ColumnY is one of Name-Bit and Name-Bt.
Following E. O'Neil and P. O'Neil [10], since these queries involve only one column at a time in the WHERE clause, we call Query1 a one-dimensional (1-D) query. There are 12 different instances of Query1B.
Listing 1: Description for Query 1

Query2A: SELECT COUNT(*) FROM tables WHERE ColumnX IN (100000, 100000000);
ColumnX is one of Id-Bit and Id-Bt.
Query2B0: SELECT COUNT(*) FROM tables WHERE Id-Bit = 1000 AND NOT Id-Bit = 1000000;
Query2B1: SELECT COUNT(*) FROM tables WHERE Id-Bt = 1000 AND NOT Id-Bt = 1000000;
Query2B0 and Query2B1 are two-dimensional queries where each WHERE clause involves two conditions. There are 12 different instances of Query2B.
Listing 2: Description for Query 2

Query3A: SELECT SUM(ColumnM) FROM tables WHERE ColumnN BETWEEN 100000 AND 100000000;
ColumnM and ColumnN are each one of Id-Bit and Id-Bt, with M = N = Id-Bit or M = N = Id-Bt. There are 6 instances of Query3A.
Query3B: SELECT SUM(ColumnM) FROM tables
WHERE (ColumnN BETWEEN 100000 AND 1000000
OR ColumnN BETWEEN 1000000 AND 10000000
OR ColumnN BETWEEN 10000000 AND 30000000
OR ColumnN BETWEEN 30000000 AND 60000000
OR ColumnN BETWEEN 60000000 AND 100000000);
ColumnM and ColumnN are each one of Id-Bit and Id-Bt, with M = N = Id-Bit or M = N = Id-Bt. There are 6 instances of Query3B.
Listing 3: Description for Query 3

Query4A: SELECT * FROM tables WHERE ColumnX IN (1000, 100000, 1000000, 100000000, 1000000000);
ColumnX is one of Id-Bit and Id-Bt. There are 8 instances of Query4A.
In the Product table we have 2 other columns, namely Active-Bit and Active-Bt, each with a cardinality of 2. Active-Bit carries a Bitmap index and Active-Bt carries a B-tree index in the same table.
Query4B: SELECT * FROM Product WHERE ColumnX IN (1000, 100000, 1000000, 100000000, 1000000000) AND Active-Bit = 1;
ColumnX is one of Id-Bit and Id-Bt. There are 2 instances of Query4B.
Listing 4: Description for Query 4

Query5A: SELECT Id-Bit, Name-Bit, COUNT(*) FROM tables GROUP BY Id-Bit, Name-Bit;
Query5B: SELECT Id-Bt, Name-Bt, COUNT(*) FROM tables GROUP BY Id-Bt, Name-Bt;
tables is one of the three existing tables. There are 8 instances of Query5A and Query5B.
Query5C: SELECT SUM(ColumnM) FROM tables WHERE ColumnN > 9000 AND ColumnN < 9100;
ColumnM and ColumnN are each one of Id-Bit and Id-Bt, with M = N = Id-Bit or M = N = Id-Bt. There are 6 instances of Query5C.
Listing 5: Description for Query 5

Query6: SELECT SUM(D.ColumnM) FROM Sales E, tables D WHERE E.ColumnN = D.ColumnP GROUP BY D.ColumnM;
Here ColumnM, ColumnN and ColumnP are each one of Id-Bit and Id-Bt, with M = N = P = Id-Bit or M = N = P = Id-Bt, and tables is one of the three existing tables except the Sales table. There are 6 instances of Query6.
Listing 6: Description for Query 6

3.2 Experimental Setup

We performed our tests on a Microsoft Windows Server 2003 machine with the Oracle11G database system. Table 3 shows some basic information about the test machine and the disk system. To make sure the full disk access time is accounted for, we disabled all unnecessary services in the system and kept the same conditions for each query. To avoid inaccuracy, all queries were run 4 consecutive times to give an average elapsed time.

Table 3: Information about the test system

CPU        Pentium 4 (2.6 GHz)
Disk       7200 RPM, 500 GB
Memory     1 GB
Database   Oracle11G

Table 4: Index file size and index construction time

             Sales                 Order                 Product
             Size(MB)   Time(S)    Size(MB)   Time(S)    Size(MB)   Time(S)
Id-Bit       326        1580       1222       2805       3012       3534
Id-Bt        26211      21090      26532      21319      26568      21580
Name-Bit     418        1673       1341       2605       3215       3892
Name-Bt      26911      21638      26821      21430      27190      21802
Active-Bit   -          -          -          -          288        1544
Active-Bt    -          -          -          -          0.06       4678
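The measurement protocol of Section 3.2, in which each query is run four consecutive times and the elapsed times are averaged, could be sketched as follows. This is a minimal illustration rather than the authors' actual test harness: average_elapsed and run_query are hypothetical names, and run_query stands in for whatever call submits a query to the database and fetches its result.

```python
import time


def average_elapsed(run_query, repeats=4, timer=time.perf_counter):
    """Run `run_query` `repeats` times back to back and return the mean elapsed time.

    `timer` is injectable so the function can be checked with a fake clock.
    """
    elapsed = []
    for _ in range(repeats):
        start = timer()
        run_query()  # hypothetical callable that executes one query instance
        elapsed.append(timer() - start)
    return sum(elapsed) / len(elapsed)


# Deterministic demonstration with a fake clock and a no-op "query".
fake_ticks = iter([0.0, 1.0, 1.0, 3.0, 3.0, 4.0, 4.0, 6.0])
calls = []
demo_average = average_elapsed(
    lambda: calls.append("ran"),
    repeats=4,
    timer=lambda: next(fake_ticks),
)
print(demo_average)
```

Averaging consecutive runs smooths out run-to-run noise, but later runs may benefit from caching of index and data blocks, so an average over warm runs is not the same thing as a cold-cache measurement.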

4 Results and discussions

We present the performance measurements in two main parts, namely (i) the index file size and index construction time and (ii) the query retrieval time.

4.1 Index File Size and Index Construction Time

The time taken to construct the B-tree and Bitmap indexes is shown in Table 4. We see that the Bitmap index requires more time to build on high-cardinality columns (Product table) than on low-cardinality columns (Sales table); for Id-Bit, the build takes 3534 s versus 1580 s. B-tree, on the other hand, requires considerably more time to build all indexes regardless of the column's cardinality (about 21,000 to 22,000 s for the Id and Name columns). Table 4 also summarizes the index sizes over the various levels of data cardinality. In Figure 1, we consider only the size of the two columns on Bitmap and B-tree indexes.

Figure 1: Index file size of bitmap with various cardinality

For high-cardinality cases, Bitmap index generates a large number of small bitmap objects and spends much time allocating memory for these bitmaps. Although the index file size of a Bitmap index depends on the cardinality of the column, the index on these columns ultimately remains smaller than a B-tree, even for full cardinality (100% distinct values) on the same column. One study [15] reports that the index file size of a Bitmap index on a column that would be a candidate for a primary key will be much larger than that of a B-tree index on the same column. In contrast, according to our test results, the index file size of a Bitmap index on such a column is not larger than that of a B-tree index. Similarly, in terms of index construction time, Bitmap index outperforms B-tree significantly. Table 4 and Figure 1 show that building a B-tree index on a large column is prohibitively expensive in terms of space and creation time. In other words, the index file of a column indexed with a Bitmap index is significantly smaller than that of the same column indexed with a B-tree index.

4.2 Query Response Time

In this section, we evaluate the time required to answer the queries. These timing measurements directly reflect the performance of the indexing methods. A summary of all the timing measurements for the queries in Listings 1 to 6 is shown in Table 5.

Table 5: Query response time (seconds)

             Sales                  Order                   Product
             (Low Cardinality)      (Normal Cardinality)    (High Cardinality)
             Bitmap     B-tree      Bitmap     B-tree       Bitmap     B-tree
Query1A      0.018      0.051       0.019      0.053        0.020      0.052
Query1B      0.023      0.056       0.023      0.055        0.024      0.057
Query2A      0.017      0.078       0.017      0.075        0.023      0.076
Query2B      0.021      0.097       0.024      0.101        0.022      0.090
Query3A      21.21      113.52      22.12      112.39       21.20      115.68
Query3B      307.61     1230        308.56     1246.2       308.54     1243.9
Query4A      0.081      0.140       0.081      0.138        0.097      0.151
Query4B      -          -           -          -            0.044      0.110
Query5C      1.15       5.21        0.92       5.20         0.92       5.23
Query5A,B    1560.6     1554.3      1730.21    1701.52      1846.98    1840.03
Query6       -          -           1108.87    1400.3       1113.39    1440.12

Now, we examine the performance of the count queries (Query1 and Query2) in detail. In Query1, when the cardinality of the column is high, it takes slightly more time to execute the queries. Across the tables with cardinalities of 1K, 1M and 120M, the average time used by Bitmap index to read in the index blocks is nearly 0.021 s (21 ms). However, in most cases the average time used by B-tree index is more than 52 ms. Hence, we show that Bitmap index could be best suited for one-dimensional count queries.

In Query2A and Query2B (which are two-dimensional queries involving two condition clauses of the same structure as Query1), we would generally expect the response time of both indexes to be about twice that of Query1. However, that estimate does not hold for Bitmap index. Therefore, the time needed for multi-dimensional queries is not determined by the time needed for one-dimensional queries. B-tree index, on the other hand, shows a much larger growth in response time (to about 90 ms). We also observed that the time used by Bitmap index is less than the time used by B-tree index.

Next, we focus on Query3. Its query response time differs from that of Query1 and Query2. Overall, we see that the time required by both indexes has risen significantly. Since the Bitmap and B-tree indexes use different mechanisms to organize the table data, the time to resolve the conditions of Query3 determines the total query response time. The query response times for Query3A and Query3B behave in the same way. The records that have to be selected by these queries are uniformly scattered among rows 100,000 to 100,000,000. Consequently, the elapsed time that both indexes need to answer queries executed within a range of predicates is affected by the distribution of data and does not follow the cardinality conditions.

The response time required to retrieve the data for Query4 has a similar trend to that for Query3, with just one difference. The difference stems from the column under the second condition, which has extremely low cardinality. Here, with a Bitmap index on the Active-Bit column (cardinality = 2) in place, we created another Bitmap index on the Id-Bit column containing equal values between 1000 and 1000000000 and then executed Query4A. Subsequently, Query4B was re-executed with B-tree indexes under the same conditions. In previous versions of the Oracle database software, the Oracle optimizer would choose a full table scan rather than use the B-tree index [15]. Although the query response times show that the B-tree index takes about twice as much time as the Bitmap index, we did not see the above-mentioned trend during execution tracing in our test system. Thus, we can conclude that with Bitmap indexes, the optimizer of Oracle11G answers these queries, which involve AND, OR and similar operators, at least as fast as with B-tree indexes.

Another query that exercises the indexing performance of Bitmap and B-tree is Query5. In Query5A and Query5B, we see that the response time of B-tree is slightly less than that of Bitmap index. On the other hand, the time required to answer these queries is far greater than for the others. That is because, to execute this type of query, the optimizer will not make use of any index. Rather, it will prefer to do a full table scan. As the data grow abnormally, a table scan requires more physical disk reads to avoid insufficient memory allocation, so this does not scale very well as data volumes increase. There is one implementation of Bitmap indexes (FastBit) which can support these queries directly [18, 19, 14], but Oracle 11G has not utilized this method of implementation. The time required to answer Query5C with a Bitmap index is slightly unusual: it decreases for a column with high cardinality compared to a column with low cardinality.

Since more general join queries are often submitted interactively, reducing their response time is a critical issue in the DW environment [12]. Thus, the ability to answer Query6 has a strong impact on the query processing performance. Although Oracle 11G [6] has implemented the Join Bitmap index to join columns, this is not always possible for ad hoc queries, so it is important to know which indexes are best suited. Nevertheless, we see that this type of query, which involves join operations, is answered faster with a Bitmap index than with a B-tree index for both high and low cardinality.

In summary, Figure 2 shows the query elapsed time for the Product table (the table with high cardinality). This figure shows that Bitmap index is much faster than B-tree index. Thus, we say that Bitmap index is good for all levels of column cardinality, as shown in Figure 3, where the query elapsed time is about constant for each query type.

Figure 2: Query elapse time for Bitmap and B-tree index on high cardinality

Figure 3: Query elapse time for Bitmap index on various level of column cardinality

5 Conclusions

It is commonly accepted that Bitmap index is more efficient for low-cardinality attributes. Our experiments show that Bitmap index effectively reduces the query response time for a column with high cardinality compared to B-tree index. We have also shown that the Bitmap index file size and index creation time grow gradually as the column cardinality increases, whereas those of B-tree are significantly larger. In addition, we have demonstrated that although the index file size of a Bitmap index is affected by column cardinality, the query processing time stays constant as the column cardinality increases. Besides, Bitmap index is also efficient for other types of queries, such as joins on keys, multi-dimensional range queries and computations of aggregates. Thus, we conclude that Bitmap index is the conclusive choice for DW design, for columns with either high or low cardinality. It is often considered that I/O cost dominates the query response time. Moreover, main memory size may play a role in index performance, as a small memory size might trigger a lot of paging activity, which could then change the query performance of both indexes. Thus, our future work includes extending the evaluation of I/O costs on an upgraded hardware system.

Acknowledgements: Thanks God.

References:

[1] S. Chaudhuri and U. Dayal, An Overview of Data Warehousing and OLAP Technology, ACM SIGMOD Record, 1997.
[2] P. O'Neil, Model 204 Architecture and Performance, In Proceedings of the 2nd International Workshop on High Performance Transaction Systems, Lecture Notes in Computer Science, vol. 359, Springer-Verlag, London, (September 28 - 30, 1987), pp. 40-59.
[3] R. Kimball, L. Reeves and M. Ross, The Data Warehouse Toolkit, John Wiley & Sons, New York, 2nd edition, 2002.
[4] W. Inmon, Building the Data Warehouse, John Wiley & Sons, fourth edition, 2005.
[5] D. Comer, Ubiquitous B-tree, ACM Comput. Surv. 11, 2, 1979, pp. 121-13
[6] R. Strohm, Oracle Database Concepts 11g, Oracle, Redwood City, CA 94065, 2007.
[7] P. O'Neil and D. Quass, Improved query performance with variant indexes, In SIGMOD: Proceedings of the 1997 ACM SIGMOD International Conference on Management of Data, 1997.
[8] P. O'Neil and G. Graefe, Multi-table joins through bitmapped join indices, ACM SIGMOD Record 24, number 3, Sep 1995, pp. 8-11.
[9] K. Wu and P. Yu, Range-based bitmap indexing for high cardinality attributes with skew, In COMPSAC 98: Proceedings of the 22nd International Computer Software and Applications Conference, IEEE Computer Society, Washington, DC, USA, 1998, pp. 61-67.
[10] E. O'Neil and P. O'Neil, Bitmap index design choices and their performance implications, Database Engineering and Applications Symposium, IDEAS 2007, 11th International, pp. 72-84.
[11] C. Imhoff, N. Galemmo and J. Geiger, Mastering Data Warehouse Design: Relational and Dimensional Techniques, John Wiley and Sons, New York, 2003.
[12] K. Stockinger and K. Wu, Bitmap indices for data warehouses, In Data Warehouses and OLAP, IRM Press, 2007, Chapter 7.
[13] J. Lewis, Oracle index management secrets, BMC Software (http://www.dbazine.com), 2006, pp. 37-47.
[14] K. Stockinger, E. Bethel, S. Campbell, E. Dart and K. Wu, Detecting distributed scans using high-performance query-driven visualization, In SC '06: Proceedings of the 2006 ACM/IEEE Conference on Supercomputing, 2006.
[15] V. Sharma, Bitmap index vs. B-tree index: Which and when, http://www.oracle.com, 2006.
[16] K. Wu, E. Otoo and A. Shoshani, Optimizing bitmap indices with efficient compression, ACM Trans. Database Syst. 31, 1 (Mar. 2006), pp. 1-38. DOI= http://doi.acm.org/10.1145/1132863.1132864
[17] K. Stockinger, K. Wu and A. Shoshani, A performance comparison of bitmap indexes, In CIKM 01: Proceedings of the Tenth International Conference on Information and Knowledge Management, 2001.
[18] K. Wu, An Efficient Compressed Bitmap Index Technology, http://sdm.lbl.gov/fastbit/, 2008.
[19] L. Gosink, J. Anderson, W. Bethel and K. Joy, Variable Interactions in Query Driven Visualization, The Visualization and Graphics Research Group of the Institute for Data Analysis and Visualization (IDAV), 2007.
[20] L. Gosink, J. Anderson, W. Bethel and K. Joy, Bin-Hash Indexing: A Parallel GPU-Based Method For Fast Query Processing, The Visualization and Graphics Research Group of the Institute for Data Analysis and Visualization (IDAV), 2007.
[21] K. Stockinger, K. Wu and A. Shoshani, Strategies for processing ad hoc queries on large data warehouses, In Proceedings of the 5th ACM International Workshop on Data Warehousing and OLAP (McLean, Virginia, USA, November 08 - 08, 2002), DOLAP '02, ACM, New York, NY, 2002, pp. 72-79. DOI= http://doi.acm.org/10.1145/583890.583901
[22] P. O'Neil, The Set Query Benchmark, In The Benchmark Handbook For Database and Transaction Processing Benchmarks, Jim Gray, Editor, Morgan Kaufmann, 1993.
[23] P. O'Neil and E. O'Neil, Database Principles, Programming, and Performance, 2nd Ed., Morgan Kaufmann Publishers, 2001.
