Conditional Cuckoo Filters

Description: 

Bloom filters, cuckoo filters, and other approximate set membership sketches have a wide range of applications. Oftentimes, expensive operations can be skipped if an item is not in a data set. These filters provide an inexpensive, memory-efficient way to test if an item is in a set and avoid unnecessary operations. Existing sketches only allow membership testing for a single set. However, in some applications such as join processing, the relevant set is not fixed and is determined by a set of predicates.

We propose the Conditional Cuckoo Filter, a simple modification of the cuckoo filter that allows for set membership testing given predicates on a pre-computed sketch. This filter also introduces a novel chaining technique that enables cuckoo filters to handle insertion of duplicate keys. We evaluate our methods on a join processing application and show that they significantly reduce the number of tuples that a join must process.
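
To make the idea of predicate-aware membership testing concrete, the following is a minimal illustrative sketch in Python, not the authors' implementation. It assumes one simple design: each slot of a standard cuckoo filter stores a short attribute fingerprint alongside the key fingerprint, so a lookup can require both to match. The class name `ToyConditionalCuckooFilter`, its parameters, and the fingerprint widths are all hypothetical choices made for illustration, and the chaining technique for heavily duplicated keys is omitted.

```python
import hashlib
import random


def _hash(value, salt):
    """Deterministic 64-bit hash of any repr()-able value."""
    digest = hashlib.blake2b(repr(value).encode(), digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "big")


class ToyConditionalCuckooFilter:
    """Toy filter storing (key fingerprint, attribute fingerprint) pairs.

    Illustrative assumptions: 16-bit key fingerprints, 8-bit attribute
    fingerprints, two candidate buckets per key, no chaining.
    """

    def __init__(self, num_buckets=64, bucket_size=4, max_kicks=500, seed=0):
        # A power-of-two table keeps the XOR partner bucket in range.
        assert num_buckets > 0 and num_buckets & (num_buckets - 1) == 0
        self.num_buckets = num_buckets
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]
        self.rng = random.Random(seed)

    def _key_fp(self, key):
        return _hash(key, b"kfp") & 0xFFFF

    def _attr_fp(self, attr):
        return _hash(attr, b"afp") & 0xFF

    def _index(self, key):
        return _hash(key, b"idx") % self.num_buckets

    def _alt(self, index, key_fp):
        # Partial-key cuckoo hashing: the partner bucket depends only on
        # the current bucket and the stored key fingerprint.
        return (index ^ _hash(key_fp, b"alt")) % self.num_buckets

    def insert(self, key, attr):
        entry = (self._key_fp(key), self._attr_fp(attr))
        i1 = self._index(key)
        i2 = self._alt(i1, entry[0])
        for i in (i1, i2):
            if entry in self.buckets[i]:
                return True  # this (key, attribute) pair is already recorded
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(entry)
                return True
        # Both buckets are full: evict random entries to their partner buckets.
        i = self.rng.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = self.rng.randrange(self.bucket_size)
            entry, self.buckets[i][slot] = self.buckets[i][slot], entry
            i = self._alt(i, entry[0])
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(entry)
                return True
        return False  # table too full; the last evicted entry is dropped

    def contains(self, key, attr=None):
        """Test membership of key; if attr is given, also require attr to match."""
        key_fp = self._key_fp(key)
        attr_fp = None if attr is None else self._attr_fp(attr)
        i1 = self._index(key)
        i2 = self._alt(i1, key_fp)
        for i in (i1, i2):
            for stored_key_fp, stored_attr_fp in self.buckets[i]:
                if stored_key_fp == key_fp and attr_fp in (None, stored_attr_fp):
                    return True
        return False


# Hypothetical usage: rows of a movie table keyed by id, with a genre attribute.
ccf = ToyConditionalCuckooFilter()
ccf.insert("movie1", "comedy")
ccf.insert("movie1", "drama")
ccf.insert("movie2", "drama")
print(ccf.contains("movie1", "comedy"))  # True: this pair was inserted
print(ccf.contains("movie2", "comedy"))  # normally False; a false positive is possible
```

In a join setting, a probe such as `contains(key, attr)` would let the other side of the join discard tuples whose key has no matching row satisfying the predicate. In this sketch a key with many distinct attribute values needs one slot per value and can exhaust its two candidate buckets, which is the duplicate-key situation the chaining technique is meant to address.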

Authors: 
Daniel Ting
Rick Cole
Publication Date: 
Tuesday, May 5, 2020
Publication Information: 
arXiv:2005.02537 [cs.DS] 5 May 2020