Scalable Cooperative Caching Algorithm Based on Bloom Filters Nodirjon Siddikov

Rochester Institute of Technology RIT Scholar Works Theses Thesis/Dissertation Collections 2011 Scalable cooperative caching algorithm based on bloom filters Nodirjon Siddikov Follow this and additional works at: http://scholarworks.rit.edu/theses Recommended Citation Siddikov, Nodirjon, "Scalable cooperative caching algorithm based on bloom filters" (2011). Thesis. Rochester Institute of Technology. Accessed from This Thesis is brought to you for free and open access by the Thesis/Dissertation Collections at RIT Scholar Works. It has been accepted for inclusion in Theses by an authorized administrator of RIT Scholar Works. For more information, please contact [email protected]. Scalable Cooperative Caching Algorithm based on Bloom Filters Nodirjon Siddikov Email: [email protected] M.S. in Computer Science Department of Computer Science B. Thomas Golisano College of Computing and Information Sciences Rochester Institute of Technology Rochester, New York August, 2011 A Thesis submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science in Computer Science COMMITEE MEMBERS: _____________________________________________________________________________ CHAIR: Hans-Peter Bischof DATE Professor, Department of Computer Science _____________________________________________________________________________ READER: Minseok Kwon DATE Associate Professor, Department of Computer Science _____________________________________________________________________________ OBSERVER: James Heliotis DATE Professor, Department of Computer Science i Table of Contents 1. Abstract .................................................................................................................................................... 1 2. Introduction ............................................................................................................................................. 1 2.1. Cooperative caching ...................................................................................................................... 1 2.2. Research problem .......................................................................................................................... 2 2.3. Proposed solution ........................................................................................................................... 2 2.4. Motivation ...................................................................................................................................... 3 2.5. Organization of paper .................................................................................................................... 4 3. Research process ..................................................................................................................................... 4 3.1. Research questions ......................................................................................................................... 4 3.2. Hypothesis ..................................................................................................................................... 5 3.3. Research goal ................................................................................................................................. 5 3.4. Research scope and limitations ...................................................................................................... 6 4. Background review ................................................................................................................................. 6 4.1. Cache memory ............................................................................................................................... 6 4.2. Elements of cache design ............................................................................................................... 8 4.2.1. Cache size ........................................................................................................................... 8 4.2.2. Mapping function ................................................................................................................ 8 4.2.3. Cache replacement policies ................................................................................................ 9 4.2.4. Write policy ....................................................................................................................... 10 4.2.5. Block size .......................................................................................................................... 10 4.3. Architecture of cooperative caching system ................................................................................ 11 5. Existing algorithms ............................................................................................................................... 13 5.1. N-chance algorithm...................................................................................................................... 13 5.2. Hint-based algorithm ................................................................................................................... 14 6. New algorithm ....................................................................................................................................... 15 6.1. Problems with existing algorithms .............................................................................................. 15 6.2. Cache summary ........................................................................................................................... 16 6.2.1. Data compression ............................................................................................................. 17 6.2.2. Data approximation .......................................................................................................... 18 6.3. Bloom filter .................................................................................................................................. 18 6.3.1. Mathematical description ................................................................................................. 18 6.3.2. Tradeoff between bloom filter parameters........................................................................ 19 ii 6.3.3. Comparison of a bloom filter and a hash table ................................................................ 21 6.4. Proposed solution ......................................................................................................................... 23 6.5. New algorithm ............................................................................................................................. 26 6.5.1. Algorithm environment ..................................................................................................... 26 6.5.2. Algorithm steps ................................................................................................................. 28 6.5.3. Algorithm scope and limitations ....................................................................................... 29 7. Evaluation .............................................................................................................................................. 30 7.1. Software simulator ....................................................................................................................... 30 7.2. Experimental setup ...................................................................................................................... 31 7.2.1. Assumptions ...................................................................................................................... 31 7.2.2. Trace file ........................................................................................................................... 32 7.2.3. Simulator parameters ....................................................................................................... 32 7.3. Evaluation metrics ....................................................................................................................... 33 7.4. Experiment results ....................................................................................................................... 34 7.4.1. Block access time .............................................................................................................. 35 7.4.2. Manager load ................................................................................................................... 39 7.4.3. Cache hit/miss rates .......................................................................................................... 42 7.4.4. Memory overhead ............................................................................................................. 46 7.4.5. Communication overhead ................................................................................................. 49 7.5. User manual ................................................................................................................................. 51 8. Deliverables ........................................................................................................................................... 52 9. Thesis limitations .................................................................................................................................. 52 10. Future work ......................................................................................................................................... 53 11. Conclusion ..........................................................................................................................................

Scalable Cooperative Caching Algorithm Based on Bloom Filters Nodirjon Siddikov

Implementing Signatures for Transactional Memory

Fast Candidate Generation for Real-Time Tweet Search with Bloom Filter Chains

An Optimal Bloom Filter Replacement∗

Comparison on Search Failure Between Hash Tables and a Functional Bloom Filter

Invertible Bloom Lookup Tables

Bloom Filter-2

Bloom Filter Encryption and Applications to Efficient Forward

Data Mining Locality Sensitive Hashing

Pay for a Sliding Bloom Filter and Get Counting, Distinct Elements, and Entropy for Free

Scalable Data Center Multicast Using Multi-Class Bloom Filter

Similarity Search Using Locality Sensitive Hashing and Bloom Filter

Improved Streaming Quotient Filter: a Duplicate Detection Approach for Data Streams