Privacy-Preserving Search for Chemical Compound Databases
Kana Shimizu, Koji Nuida, Hiromi Arai, Shigeo Mitsunari, Nuttapong Attrapadung, Michiaki Hamada, Koji Tsuda, Takatsugu
Hirokawa, Jun Sakuma, Goichiro Hanaoka, Kiyoshi Asai
BMC Bioinformatics 2015
June 1, 2016
Mateus Cruz
Introduction Method Experiments Conclusion
OUTLINE
1 Introduction
2 Method
3 Experiments
4 Conclusion
OVERVIEW
- Protocol for searching chemical databases
- Checks if items are similar
  - Tversky index
- Encrypts items to preserve privacy
  - Additively homomorphic encryption
SECURITY REQUIREMENTS
- User privacy
  - The database shouldn't learn about the query
- Database privacy
  - The user shouldn't learn about the DB contents
- The similarity value cannot be disclosed
  - Disclosing it would allow regression attacks
MODEL
The user is a private chemical compound holder, and the server is a private database holder. The user learns nothing but the number of similar compounds in the server's database, and the server learns nothing about the user's query compound.
PROPOSAL
- Secure similar compounds counter
- Tolerant against regression attacks
- Efficient
  - Computation
  - Communication
- Scalable
SIMILARITY CALCULATION
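- Compounds are modeled as $p \in \{0, 1\}^{\ell}$
  - Bit array of size $\ell$
- Similarity given by Tversky index
  - $\mathrm{TI}_{1,1}$ gives the Jaccard index
  - $\mathrm{TI}_{1/2,1/2}$ gives the Dice index

$\mathrm{TI}_{\alpha,\beta}(p, q) = \frac{|p \cap q|}{|p \cap q| + \alpha |p \setminus q| + \beta |q \setminus p|}$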
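The Tversky index on bit-array fingerprints can be sketched in a few lines of Python. The function name `tversky_index`, the example fingerprints, and the convention of returning 0.0 for two all-zero fingerprints are assumptions made for illustration and do not come from the slides.

```python
def tversky_index(p, q, alpha, beta):
    """Tversky index of two equal-length 0/1 sequences."""
    inter = sum(1 for a, b in zip(p, q) if a and b)        # |p ∩ q|
    only_p = sum(1 for a, b in zip(p, q) if a and not b)   # |p \ q|
    only_q = sum(1 for a, b in zip(p, q) if b and not a)   # |q \ p|
    denom = inter + alpha * only_p + beta * only_q
    # Assumed convention: two empty fingerprints have similarity 0.
    return inter / denom if denom else 0.0


p = [1, 1, 0, 1, 0]
q = [1, 0, 0, 1, 1]
jaccard = tversky_index(p, q, 1, 1)      # alpha = beta = 1
dice = tversky_index(p, q, 0.5, 0.5)     # alpha = beta = 1/2
print(jaccard, dice)
```

Here the two fingerprints share 2 set bits and each has 1 bit the other lacks, so the Jaccard index is 2/4 and the Dice index is 2/3.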
PROTOCOL OVERVIEW
- Assume there is only one compound q in the database
- Two-party protocol
  - Client Alice holds query item p
  - Server Bob holds q
- Objective
  - Check if TI(p, q) ≥ θ
- Security issues
  - Alice should not know what q is
  - Bob should not know what p is
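THRESHOLD TVERSKY INDEX
Tversky index: $\mathrm{TI}_{\alpha,\beta}(p, q) = \frac{|p \cap q|}{|p \cap q| + \alpha |p \setminus q| + \beta |q \setminus p|}$
Let $\alpha = \mu_a/\gamma$, $\beta = \mu_b/\gamma$, $\theta = \theta_n/\theta_d$, with $\gamma$ and $\theta_d$ positive
$\mathrm{TI}(p, q) \ge \theta \Longrightarrow \frac{|p \cap q|}{|p \cap q| + (\mu_a/\gamma) |p \setminus q| + (\mu_b/\gamma) |q \setminus p|} \ge \frac{\theta_n}{\theta_d}$
Substitute $|p \setminus q| = |p| - |p \cap q|$ and $|q \setminus p| = |q| - |p \cap q|$, then clear the denominators
Let $\Gamma = (\theta_d - \theta_n)\gamma + \theta_n(\mu_a + \mu_b)$
Threshold Tversky index: $\overline{\mathrm{TI}}(p, q) = \Gamma |p \cap q| - \theta_n(\mu_a |p| + \mu_b |q|) \ge 0$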
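The rearrangement above can be checked numerically. The sketch below compares the rational test TI(p, q) ≥ θ with the integer-only threshold form over every pair of 4-bit fingerprints, using exact fractions. The function names and the parameter choice (Dice index with θ = 2/3) are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import product


def tversky(p, q, alpha, beta):
    """Exact Tversky index of two 0/1 tuples (not both all-zero)."""
    inter = sum(a & b for a, b in zip(p, q))
    only_p = sum(p) - inter
    only_q = sum(q) - inter
    return Fraction(inter) / (inter + alpha * only_p + beta * only_q)


def threshold_ti(p, q, mu_a, mu_b, gamma, theta_n, theta_d):
    """Integer-only threshold form: non-negative iff TI(p, q) >= theta."""
    inter = sum(a & b for a, b in zip(p, q))
    Gamma = (theta_d - theta_n) * gamma + theta_n * (mu_a + mu_b)
    return Gamma * inter - theta_n * (mu_a * sum(p) + mu_b * sum(q))


# Assumed example parameters: Dice index (alpha = beta = 1/2), theta = 2/3.
mu_a, mu_b, gamma = 1, 1, 2
theta_n, theta_d = 2, 3

mismatches = 0
for p in product((0, 1), repeat=4):
    for q in product((0, 1), repeat=4):
        if not any(p) and not any(q):
            continue  # TI is undefined when both fingerprints are empty
        ti = tversky(p, q, Fraction(mu_a, gamma), Fraction(mu_b, gamma))
        by_ratio = ti >= Fraction(theta_n, theta_d)
        by_threshold = threshold_ti(p, q, mu_a, mu_b, gamma, theta_n, theta_d) >= 0
        mismatches += by_ratio != by_threshold

print(mismatches)
```

The threshold form matters for the protocol because it uses only integer additions and multiplications by public constants, which is what an additively homomorphic scheme can evaluate on ciphertexts.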
PROTOCOL STEPS
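1. Alice generates a key pair (pk, sk)
2. Alice sends $\langle c_A, \Gamma, \mu_a, \mu_b, \theta_n \rangle$ to Bob
   - $c_A := \mathrm{Enc}(pk, p)$
3. Bob encrypts q
   - $c_{B,q} := \mathrm{Enc}(pk, q)$
4. Bob calculates $c_{TI}$ and sends it to Alice
   - $c_{TI} := \Gamma c_{B,\cap} - \theta_n(\mu_a c_{B,p} + \mu_b c_{B,q})$
5. Alice decrypts $c_{TI}$ and checks that the threshold Tversky index is non-negative, i.e. $T \ge 0$
   - $T := \mathrm{Dec}(sk, c_{TI})$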
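The five steps can be sketched end to end with a toy additively homomorphic scheme. Several things here are assumptions, not details from the slides: the scheme is a textbook Paillier variant with small fixed primes (insecure, for illustration only); Alice is assumed to encrypt p bit by bit so that Bob can form encryptions of |p| and |p ∩ q| by multiplying ciphertexts; and all function names are hypothetical.

```python
import math
import secrets

# Toy Paillier parameters: two Mersenne primes, far too small for real use.
P, Q = 2**31 - 1, 2**61 - 1


def keygen():
    n = P * Q
    phi = (P - 1) * (Q - 1)
    return n, (n, phi, pow(phi, -1, n))  # public key n, secret key


def enc(n, m):
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    # With generator g = n + 1, g^m mod n^2 equals 1 + m*n.
    return (1 + (m % n) * n) % n2 * pow(r, n, n2) % n2


def dec(sk, c):
    n, phi, mu = sk
    m = (pow(c, phi, n * n) - 1) // n * mu % n
    return m - n if m > n // 2 else m  # map back to signed integers


def add(n, c1, c2):
    return c1 * c2 % (n * n)           # Enc(a) * Enc(b) = Enc(a + b)


def scale(n, c, k):
    return pow(c, k % n, n * n)        # Enc(a)^k = Enc(k * a)


def bob_threshold_ciphertext(n, c_a, q, Gamma, mu_a, mu_b, theta_n):
    """Step 4: Bob combines Alice's bit ciphertexts with his plaintext q."""
    c_p, c_int = enc(n, 0), enc(n, 0)
    for c_i, q_i in zip(c_a, q):
        c_p = add(n, c_p, c_i)               # accumulates Enc(|p|)
        if q_i:
            c_int = add(n, c_int, c_i)       # accumulates Enc(|p ∩ q|)
    c_q = enc(n, sum(q))                     # Enc(|q|)
    inner = add(n, scale(n, c_p, mu_a), scale(n, c_q, mu_b))
    return add(n, scale(n, c_int, Gamma), scale(n, inner, -theta_n))


# Assumed parameters: Jaccard index (mu_a = mu_b = gamma = 1), theta = 1/2,
# so Gamma = (2 - 1) * 1 + 1 * (1 + 1) = 3.
pk, sk = keygen()                                     # step 1
p = [1, 1, 0, 1, 0, 1, 1, 0]
c_a = [enc(pk, bit) for bit in p]                     # step 2
q_similar = [1, 1, 0, 0, 0, 1, 1, 1]
q_dissimilar = [0, 0, 1, 0, 1, 0, 0, 1]
t_similar = dec(sk, bob_threshold_ciphertext(pk, c_a, q_similar, 3, 1, 1, 1))
t_dissimilar = dec(sk, bob_threshold_ciphertext(pk, c_a, q_dissimilar, 3, 1, 1, 1))
print(t_similar >= 0, t_dissimilar >= 0)              # step 5
```

For the similar pair, |p| = |q| = 5 and |p ∩ q| = 4, so T = 3·4 − (5 + 5) = 2; for the dissimilar pair the intersection is empty and T = −(5 + 3) = −8.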
IMPROVING SECURITY
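- Alice should not learn the value T, the decrypted threshold Tversky index
- Bob inserts encrypted dummies into the result
  - $c'_1, \ldots, c'_d$
  - Shuffled with the result $c_{TI}$
  - Bob sends the shuffled set to Alice
  - Bob also sends $N_{p,\mathrm{dummy}}$
    – Number of non-negative dummies
- Alice...
  - Decrypts the ciphertexts
  - Counts the non-negative values: $N_{p,\mathrm{all}}$
- If $N_{p,\mathrm{all}} - N_{p,\mathrm{dummy}} = 1$, then the threshold Tversky index is non-negative, i.e. TI(p, q) ≥ θ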
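The counting logic of the dummy mechanism can be sketched as follows. To keep the example short, plain integers stand in for ciphertexts, so this shows only how the counts combine and not any hiding property. The function names, the dummy value range, and the use of a seeded `random.Random` are assumptions made for illustration.

```python
import random


def bob_mix(t, num_dummies, rng, bound=1000):
    """Bob hides the real value t among random dummies and shuffles them.

    Returns the shuffled list and the number of non-negative dummies.
    In the protocol every list entry would be a ciphertext.
    """
    dummies = [rng.randint(-bound, bound) for _ in range(num_dummies)]
    n_dummy = sum(1 for d in dummies if d >= 0)
    mixed = dummies + [t]
    rng.shuffle(mixed)
    return mixed, n_dummy


def alice_decide(mixed, n_dummy):
    """Alice counts non-negative values and subtracts the dummy count."""
    n_all = sum(1 for v in mixed if v >= 0)
    return n_all - n_dummy == 1


rng = random.Random(0)  # fixed seed only to make the sketch reproducible
similar = alice_decide(*bob_mix(2, 10, rng))       # real value T = 2
dissimilar = alice_decide(*bob_mix(-8, 10, rng))   # real value T = -8
print(similar, dissimilar)
```

The difference of the two counts is 1 exactly when the real value is non-negative, whatever the dummies are, so the decision does not depend on the random draw.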
PERFORMANCE
- 36,000 times faster computation
- 12,000 times faster communication
SUMMARY
- Checks similarities between compounds
- Operates over encrypted data
- Low computation and communication
- Faster than MPC and FHE
  - Multi-party Computation
  - Fully Homomorphic Encryption