![Page 1: Bloom Filter & Hashing · Bloom Filter • Checks for SET MEMBERSHIP efficiently Is element x in the set? MoAvang Example • Spam Filtering Ø We have a set of 1 billion email addresses](https://reader033.vdocuments.net/reader033/viewer/2022060400/5f0dfb3c7e708231d43d0902/html5/thumbnails/1.jpg)
Bloom Filter & Hashing
Barna Saha
![Page 2: Bloom Filter & Hashing · Bloom Filter • Checks for SET MEMBERSHIP efficiently Is element x in the set? MoAvang Example • Spam Filtering Ø We have a set of 1 billion email addresses](https://reader033.vdocuments.net/reader033/viewer/2022060400/5f0dfb3c7e708231d43d0902/html5/thumbnails/2.jpg)
Bloom Filter
• Checks for SET MEMBERSHIP efficiently
Is element x in the set?
![Page 3: Bloom Filter & Hashing · Bloom Filter • Checks for SET MEMBERSHIP efficiently Is element x in the set? MoAvang Example • Spam Filtering Ø We have a set of 1 billion email addresses](https://reader033.vdocuments.net/reader033/viewer/2022060400/5f0dfb3c7e708231d43d0902/html5/thumbnails/3.jpg)
Motivating Example
• Spam Filtering
– We have a set S of 1 billion email addresses that we consider to be non-spam.
– Each stream element is of the form (email address, email).
– Before accepting the email, a mail client needs to check whether this address belongs to the set S.
– A typical email address requires 20 bytes of storage, whereas in main memory we have only, say, 1 billion bytes (roughly 1 gigabyte), or 8 billion bits.
– Storing all the valid addresses would take about 20 billion bytes, so we cannot keep them all in main memory.
![Page 4: Bloom Filter & Hashing · Bloom Filter • Checks for SET MEMBERSHIP efficiently Is element x in the set? MoAvang Example • Spam Filtering Ø We have a set of 1 billion email addresses](https://reader033.vdocuments.net/reader033/viewer/2022060400/5f0dfb3c7e708231d43d0902/html5/thumbnails/4.jpg)
Motivating Example
• Spam Filtering
– All valid emails must be delivered
– Number of spam emails delivered should be as low as possible
![Page 5: Bloom Filter & Hashing · Bloom Filter • Checks for SET MEMBERSHIP efficiently Is element x in the set? MoAvang Example • Spam Filtering Ø We have a set of 1 billion email addresses](https://reader033.vdocuments.net/reader033/viewer/2022060400/5f0dfb3c7e708231d43d0902/html5/thumbnails/5.jpg)
Bloom Filter
![Page 6: Bloom Filter & Hashing · Bloom Filter • Checks for SET MEMBERSHIP efficiently Is element x in the set? MoAvang Example • Spam Filtering Ø We have a set of 1 billion email addresses](https://reader033.vdocuments.net/reader033/viewer/2022060400/5f0dfb3c7e708231d43d0902/html5/thumbnails/6.jpg)
Bloom Filter
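The structure on these slides can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the class name `BloomFilter`, the parameters `m` (bits) and `k` (hash functions), and the use of salted SHA-256 as a stand-in for k independent hash functions are all assumptions of the sketch.

```python
import hashlib


class BloomFilter:
    """Bit array of m bits probed by k hash functions (illustrative sketch)."""

    def __init__(self, m: int, k: int) -> None:
        self.m = m                      # number of bits in the array
        self.k = k                      # number of hash functions
        self.bits = bytearray((m + 7) // 8)

    def _positions(self, item: str):
        # Derive k bit positions by salting SHA-256 with the hash index.
        # This stands in for k independent hash functions; it is an
        # assumption of this sketch, not something the slides specify.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: str) -> None:
        # Set every probed bit to 1.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        # True means "possibly in the set"; False means "definitely not".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


allowed = BloomFilter(m=8_000, k=5)
for address in ("alice@example.com", "bob@example.com"):
    allowed.add(address)

print("alice@example.com" in allowed)   # True: inserted items are never missed
```

A lookup that returns False is always correct, which is why every valid email is delivered; a lookup that returns True may be a false positive, which is how some spam can slip through.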
![Page 7: Bloom Filter & Hashing · Bloom Filter • Checks for SET MEMBERSHIP efficiently Is element x in the set? MoAvang Example • Spam Filtering Ø We have a set of 1 billion email addresses](https://reader033.vdocuments.net/reader033/viewer/2022060400/5f0dfb3c7e708231d43d0902/html5/thumbnails/7.jpg)
Analysis of Bloom Filter
![Page 8: Bloom Filter & Hashing · Bloom Filter • Checks for SET MEMBERSHIP efficiently Is element x in the set? MoAvang Example • Spam Filtering Ø We have a set of 1 billion email addresses](https://reader033.vdocuments.net/reader033/viewer/2022060400/5f0dfb3c7e708231d43d0902/html5/thumbnails/8.jpg)
Analysis of Bloom Filter
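For reference, the usual false-positive analysis can be written out as below. The symbols are assumptions and may differ from the lecture's notation: n is the number of elements inserted, m is the number of bits in the array, and k is the number of hash functions, each assumed to be fully random.

```latex
\Pr[\text{a given bit is still } 0]
  = \left(1 - \frac{1}{m}\right)^{kn} \approx e^{-kn/m},
\qquad
\Pr[\text{false positive}]
  \approx \left(1 - e^{-kn/m}\right)^{k}
```

The first expression is the chance that none of the kn hash evaluations lands on a particular bit; a false positive needs all k probes of a non-member to land on bits that are already 1.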
![Page 9: Bloom Filter & Hashing · Bloom Filter • Checks for SET MEMBERSHIP efficiently Is element x in the set? MoAvang Example • Spam Filtering Ø We have a set of 1 billion email addresses](https://reader033.vdocuments.net/reader033/viewer/2022060400/5f0dfb3c7e708231d43d0902/html5/thumbnails/9.jpg)
Spam Filtering Example
• We have
![Page 10: Bloom Filter & Hashing · Bloom Filter • Checks for SET MEMBERSHIP efficiently Is element x in the set? MoAvang Example • Spam Filtering Ø We have a set of 1 billion email addresses](https://reader033.vdocuments.net/reader033/viewer/2022060400/5f0dfb3c7e708231d43d0902/html5/thumbnails/10.jpg)
Optimum Value of k
• As the number of hash functions increases, a non-member is probed at more cells, so the chance of finding a 0-bit cell is higher
• But with an increasing number of hash functions, the number of cells with 0 bits decreases
• The optimum value of k is obtained by differentiation
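The trade-off above can be explored numerically. The sketch below assumes the standard approximation for the false-positive rate, (1 - e^(-kn/m))^k, whose minimum over k is at k = (m/n) ln 2; the function names are hypothetical, and the values of n and m are taken from the spam-filtering example.

```python
import math


def false_positive_rate(n: int, m: int, k: int) -> float:
    """Approximate false-positive rate (1 - e^(-kn/m))^k for fully random hashes."""
    return (1.0 - math.exp(-k * n / m)) ** k


def optimal_k(n: int, m: int) -> float:
    """Real-valued k that minimises the rate above: (m / n) * ln 2."""
    return (m / n) * math.log(2)


n = 1_000_000_000      # non-spam addresses, from the motivating example
m = 8_000_000_000      # bits of main memory, from the motivating example

k_star = optimal_k(n, m)
print(f"optimal k is about {k_star:.2f}")
for k in (1, 2, 5, 6, 10):
    print(k, f"{false_positive_rate(n, m, k):.4f}")
```

With eight bits per address the optimum is 8 ln 2, roughly 5.5, so k = 5 or k = 6 would be used in practice; the rate rises again for larger k because too many bits are set to 1.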
![Page 11: Bloom Filter & Hashing · Bloom Filter • Checks for SET MEMBERSHIP efficiently Is element x in the set? MoAvang Example • Spam Filtering Ø We have a set of 1 billion email addresses](https://reader033.vdocuments.net/reader033/viewer/2022060400/5f0dfb3c7e708231d43d0902/html5/thumbnails/11.jpg)
Applications of Bloom Filter
• Bloom Filter has found innumerable applications in networking and web technology
![Page 12: Bloom Filter & Hashing · Bloom Filter • Checks for SET MEMBERSHIP efficiently Is element x in the set? MoAvang Example • Spam Filtering Ø We have a set of 1 billion email addresses](https://reader033.vdocuments.net/reader033/viewer/2022060400/5f0dfb3c7e708231d43d0902/html5/thumbnails/12.jpg)
![Page 13: Bloom Filter & Hashing · Bloom Filter • Checks for SET MEMBERSHIP efficiently Is element x in the set? MoAvang Example • Spam Filtering Ø We have a set of 1 billion email addresses](https://reader033.vdocuments.net/reader033/viewer/2022060400/5f0dfb3c7e708231d43d0902/html5/thumbnails/13.jpg)
![Page 14: Bloom Filter & Hashing · Bloom Filter • Checks for SET MEMBERSHIP efficiently Is element x in the set? MoAvang Example • Spam Filtering Ø We have a set of 1 billion email addresses](https://reader033.vdocuments.net/reader033/viewer/2022060400/5f0dfb3c7e708231d43d0902/html5/thumbnails/14.jpg)
![Page 15: Bloom Filter & Hashing · Bloom Filter • Checks for SET MEMBERSHIP efficiently Is element x in the set? MoAvang Example • Spam Filtering Ø We have a set of 1 billion email addresses](https://reader033.vdocuments.net/reader033/viewer/2022060400/5f0dfb3c7e708231d43d0902/html5/thumbnails/15.jpg)
![Page 16: Bloom Filter & Hashing · Bloom Filter • Checks for SET MEMBERSHIP efficiently Is element x in the set? MoAvang Example • Spam Filtering Ø We have a set of 1 billion email addresses](https://reader033.vdocuments.net/reader033/viewer/2022060400/5f0dfb3c7e708231d43d0902/html5/thumbnails/16.jpg)
![Page 17: Bloom Filter & Hashing · Bloom Filter • Checks for SET MEMBERSHIP efficiently Is element x in the set? MoAvang Example • Spam Filtering Ø We have a set of 1 billion email addresses](https://reader033.vdocuments.net/reader033/viewer/2022060400/5f0dfb3c7e708231d43d0902/html5/thumbnails/17.jpg)
Analysis of Bloom Filter
The analysis assumes fully random hash functions, which are difficult to obtain because of their high space and computing requirements
![Page 18: Bloom Filter & Hashing · Bloom Filter • Checks for SET MEMBERSHIP efficiently Is element x in the set? MoAvang Example • Spam Filtering Ø We have a set of 1 billion email addresses](https://reader033.vdocuments.net/reader033/viewer/2022060400/5f0dfb3c7e708231d43d0902/html5/thumbnails/18.jpg)
Strongly 2-wise Universal Hash Function
• Mapping set of keys U = [0, 1, 2, …, m-1] to range R = [0, 1, 2, …, n-1]
– H = {h_{a,b}(x) = [(ax + b) mod p] mod n}
• p >= m is a prime, 1 <= a <= p-1, 0 <= b <= p-1
• Easy to compute and store: O(1)
• Satisfies (almost) for all ,
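The property in the last bullet is usually stated as follows. This is the textbook definition of strong 2-wise universality, offered here as an assumption about what the slide intends; the symbols x_1, x_2, y_1, y_2 are introduced only for this statement.

```latex
\forall\, x_1 \neq x_2 \in U,\ \forall\, y_1, y_2 \in R:\quad
\Pr_{h \in H}\bigl[\, h(x_1) = y_1 \;\wedge\; h(x_2) = y_2 \,\bigr] = \frac{1}{n^2}
```

The "(almost)" reflects the final reduction mod n: when n does not divide p, some values in R receive slightly more preimages than others, so the equality holds only approximately.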
![Page 19: Bloom Filter & Hashing · Bloom Filter • Checks for SET MEMBERSHIP efficiently Is element x in the set? MoAvang Example • Spam Filtering Ø We have a set of 1 billion email addresses](https://reader033.vdocuments.net/reader033/viewer/2022060400/5f0dfb3c7e708231d43d0902/html5/thumbnails/19.jpg)
Strongly 3-wise Universal Hash Function
• Mapping set of keys U = [0, 1, 2, …, m-1] to range R = [0, 1, 2, …, n-1]
– H = {h_{a,b,c}(x) = [(ax^2 + bx + c) mod p] mod n}
• p >= m is a prime, 1 <= a <= p-1, 0 <= b, c <= p-1
• Easy to compute and store: O(1)
• Satisfies (almost)
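Drawing one function from this family takes only three random coefficients, which is what makes it O(1) to store. The following is a minimal sketch under stated assumptions: the prime `P` is fixed to 2^31 - 1 (so keys must be smaller than that), and the helper name `sample_hash` is hypothetical.

```python
import random

P = 2_147_483_647  # the Mersenne prime 2**31 - 1; any prime p >= m would do


def sample_hash(n: int, rng: random.Random):
    """Draw h(x) = ((a*x^2 + b*x + c) mod P) mod n with random a, b, c."""
    a = rng.randrange(1, P)   # 1 <= a <= p-1
    b = rng.randrange(0, P)   # 0 <= b <= p-1
    c = rng.randrange(0, P)   # 0 <= c <= p-1

    def h(x: int) -> int:
        # Horner's rule: (a*x + b)*x + c equals a*x^2 + b*x + c.
        return (((a * x + b) * x + c) % P) % n

    return h


rng = random.Random(0)        # fixed seed so the sketch is repeatable
h = sample_hash(n=1000, rng=rng)
print([h(x) for x in (0, 1, 2, 12345)])
```

Dropping the quadratic term (keeping only a and b) gives the 2-wise family h(x) = [(ax + b) mod p] mod n in the same way.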
![Page 20: Bloom Filter & Hashing · Bloom Filter • Checks for SET MEMBERSHIP efficiently Is element x in the set? MoAvang Example • Spam Filtering Ø We have a set of 1 billion email addresses](https://reader033.vdocuments.net/reader033/viewer/2022060400/5f0dfb3c7e708231d43d0902/html5/thumbnails/20.jpg)
Strongly 2-Universal
• Mapping set of keys U = [0, 1, 2, …, p-1] to range R = [0, 1, 2, …, p-1]
– H = {h_{a,b}(x) = (ax + b) mod p}, 0 <= a, b <= p-1
• Fix .
– What is ?
– Number of hash functions
– Number of solutions for “a” and “b” = 1
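The counting argument for h_{a,b}(x) = (ax + b) mod p can be checked by brute force for a small prime. The sketch below assumes the standard reading of the argument: for any two distinct keys x1, x2 and any two targets y1, y2, exactly one pair (a, b) out of the p^2 possible pairs maps x1 to y1 and x2 to y2, giving probability 1/p^2. The names `x1`, `x2`, `y1`, `y2` and `count_matching_pairs` are introduced only for this illustration.

```python
from itertools import product

p = 7  # small prime chosen only to keep the enumeration tiny


def count_matching_pairs(x1: int, y1: int, x2: int, y2: int) -> int:
    """Count (a, b) with (a*x1 + b) % p == y1 and (a*x2 + b) % p == y2."""
    return sum(
        1
        for a, b in product(range(p), repeat=2)
        if (a * x1 + b) % p == y1 and (a * x2 + b) % p == y2
    )


# Collect the distinct counts over every choice of x1 != x2 and y1, y2.
counts = {
    count_matching_pairs(x1, y1, x2, y2)
    for x1, x2, y1, y2 in product(range(p), repeat=4)
    if x1 != x2
}
print(counts)  # {1}
```

The single solution exists because x1 - x2 is invertible modulo a prime, so the two linear congruences determine a and then b uniquely.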