Background on Bloom Filter

CSE 535 : Lecture 5
String Matching with Bloom Filters
Washington University, Fall 2003
http://www.arl.wustl.edu/arl/projects/fpx/cse535/
Copyright 2003, Sarang Dharmapurikar [Guest Lecture]

Background on Bloom Filter
• Data structure proposed by Burton Bloom
• Randomized data structure
  – Strings are stored using multiple hash functions
  – It can be queried to check the presence of a string
• Membership queries result in rare false positives but never false negatives
• Originally used for UNIX spell check
• Modern applications include:
  – Content networks
  – Summary caches
  – Route trace-back
  – Network measurements
  – Intrusion detection

Hash Functions
• Input: x
• Output: H[x]
• Properties
  – Each value of x maps to a value of H[x]
  – Typically: Size of (x) >> Size of (H[x])
  – H[x] evenly distributed over values of x
• Implementation
  – Hash function
    • XOR of bits, shifts, rotates, ...
[Figure: input X passes through hash function H to produce H[X]]

Programming a Bloom Filter
• Bloom filter computes k hash functions on input
[Figure: string X is hashed by H1, H2, H3, ..., Hk; each hash value selects one bit of the m-bit vector, and that bit is set to 1]
[Figure: a second string Y is programmed the same way, setting further bits in the same m-bit vector]

Querying a Bloom Filter
[Figure: X is hashed by H1, H2, H3, ..., Hk; every selected bit of the m-bit vector is 1, so the filter reports a match]
[Figure: W is hashed by H1, H2, H3, ..., Hk; every selected bit happens to be 1 because of other strings, so the filter reports a match (false positive)]

Optimal Parameters of a Bloom filter
• n: number of strings to be stored
• k: number of hash functions
• m: the size of the bit-array (memory)
• The false positive probability, when k is chosen optimally, is f = (1/2)^k
• The optimal number of hash functions, k, is k = ln 2 × m/n = 0.693 × m/n
Key Point: False positive probability decreases exponentially with linear increase in the number of hash functions & memory
[Figure: string Y hashed by H1, H2, H3, H4, ..., Hk into an m-bit array]

Counting Bloom Filters
• A message once programmed in the Bloom filter cannot be deleted
  – Deletion of a message requires clearing the corresponding bits
  – Since a bit can be set by multiple messages, clearing it will disturb other messages
• Counting Bloom filters solve the problem
  – Array of counters instead of array of bits
  – Increment the corresponding counters when a message is added, decrement when deleted
[Figure: messages A and B programmed into a bit array, next to the off-chip counter array; a position shared by both messages has bit 1 and counter 2]

Counting Bloom Filters
• Maintain Bloom filters on the chip and corresponding counters off the chip
  – Saves on the on-chip resources to implement counters
  – Addition and deletion of messages are rare
  – Set the bit when the corresponding counter changes 0 to 1, clear it when the counter changes 1 to 0
[Figure: on-chip bit array beside the off-chip counter array]

Using Bloom filters for String Matching
[Figure: a window of bytes b1, b2, b3, b4, b5, ..., bW, with the entering byte at bW and the leaving byte at b1; Bloom filters BF3, BF4, BF5, ..., BFW examine the window, and a hash table acts as the false positives resolver]

Bloom filter for cs535
[Figure: the same byte window b1 ... bW feeding a single Bloom filter, BF16, backed by the hash table]

System Overview
• Off-chip SDRAM: 64 megabytes
• SDRAM Controller: a component that reads and writes data given by the user component in the off-chip SDRAM
• Hash Table Interface
  – Implements the hash table around SDRAM
  – Communicates with the SDRAM controller through a request-grant protocol
• Control Packet Processor (CPP): receives the control packets, decodes the commands in them, and accordingly either updates the Bloom filter or updates the hash table
• Input Controller: sends control packets to the CPP and data packets to the Bloom filter
• Output Controller: when the hash table instructs, it sends a notification packet out
• Protocol Wrappers: process the packet headers
• Bloom Filter

Bloom Filters on the FPX Platform
• Xilinx XCV2000E FPGA
  – Implements the Reconfigurable Application Device (RAD) on the Field-programmable Port Extender (FPX)
  – Contains 160 embedded RAMs (BlockRAMs)
    • Each BlockRAM has dual (2) ports
    • Each BlockRAM stores 4096 bits
  – Enables MP2 to implement large, fast, parallel Bloom filters
[Figure: Field-programmable Port Extender (FPX) Platform; Bloom filters implemented on the Reconfigurable Application Device]

Partial Bloom Filter
[Figure: X enters a Hash Value Calculator that produces H1(X) and H2(X); these drive addrA and addrB of a dual-port BlockRAM organized as 4096 bits × 1 bit; weA and weB are tied to '0'; doutA and doutB produce the output (match/no match)]

Partial Bloom Filter
[Figure: the same dual-port BlockRAM, with a decoder added on port A; the decoder takes Address, Valid, PBF BRAM #, Bit, and Request inputs and drives dinA and weA, while weB remains tied to '0']

Bloom Filter Control Interface
[Figure: a Hash Value Calculator computes H1 through H10 for input X; each pair of hash values feeds one partial Bloom filter (H1, H2 to PBF 1; H3, H4 to PBF 2; H5, H6 to PBF 3; H7, H8 to PBF 4; H9, H10 to PBF 5), and the PBF outputs are combined into Match]
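
The programming, querying, parameter, and counting slides above can be illustrated with a minimal software sketch. This is not the FPGA design from the lecture: the class name `CountingBloomFilter`, the helper `optimal_k`, and the use of salted SHA-256 to derive the k hash functions are assumptions made for illustration (the slides only mention XOR, shift, and rotate based hashes). The `bits` list plays the role of the on-chip bit array and the `counters` list the off-chip counter array.

```python
import hashlib
import math


def optimal_k(m, n):
    """Number of hash functions suggested by k = ln 2 * m / n (at least 1)."""
    return max(1, round(math.log(2) * m / n))


class CountingBloomFilter:
    """Bit vector used for queries, plus counters so strings can be deleted."""

    def __init__(self, m, k):
        self.m = m
        self.k = k
        self.bits = [0] * m      # stands in for the on-chip m-bit vector
        self.counters = [0] * m  # stands in for the off-chip counter array

    def _positions(self, item):
        # Assumption: the i-th hash function is SHA-256 salted with i,
        # reduced modulo m. Any k well-distributed hashes would do.
        positions = []
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            positions.append(int.from_bytes(digest[:8], "big") % self.m)
        return positions

    def add(self, item):
        """Program a string: increment its counters, set bits on 0 -> 1."""
        for p in self._positions(item):
            self.counters[p] += 1
            if self.counters[p] == 1:
                self.bits[p] = 1

    def remove(self, item):
        """Delete a previously added string: decrement, clear bits on 1 -> 0."""
        for p in self._positions(item):
            self.counters[p] -= 1
            if self.counters[p] == 0:
                self.bits[p] = 0

    def query(self, item):
        """Match only if every selected bit is 1 (may be a false positive)."""
        return all(self.bits[p] for p in self._positions(item))


if __name__ == "__main__":
    m, n = 1024, 100                 # hypothetical sizes for the example
    bf = CountingBloomFilter(m, optimal_k(m, n))
    bf.add("cs535")
    print(bf.query("cs535"))         # True: programmed strings always match
    bf.remove("cs535")
    print(bf.query("cs535"))         # False: the filter is empty again
```

Only `query` touches the bit vector, which mirrors the slides' point that additions and deletions are rare and can afford the slower counter array. `remove` must only be called for strings that were actually added; otherwise counters go negative and later queries become unreliable.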
