hello from cs - jaist...
TRANSCRIPT
![Page 1: Hello from CS - JAIST 北陸先端科学技術大学院大学bao/IEEE-RIVF2010/IEEE-RIVF-DavidCheung.pdf · Security on Cloud Computing, Query Computation and Data Mining on Encrypted](https://reader031.vdocuments.net/reader031/viewer/2022022508/5ad002367f8b9a6c6c8dcd84/html5/thumbnails/1.jpg)
Hello from CS
Security on Cloud Computing, Query Computation and Data Mining on Encrypted Database
Professor David CHEUNG
Head, Department of Computer Science, The University of Hong Kong
RIVF 2010 Hanoi
What is Cloud Computing ???
From Youtube
Song from Youtube
What can the Cloud offer?
• An unlimited supply of computing power (like electricity)
What do they say about Cloud Computing?
• 27.5 M hits in Google Search on “cloud computing”
• “The term cloud is used as a metaphor for Internet, based on how Internet is depicted in the network diagrams” – Wikipedia
• “The ability to connect to software and data on the Internet (the cloud) instead of your hard drive or local area network” – ComputerWorld
Some examples of Cloud Computing?
Gmail – Web mail
Flickr – Photo Album
Google Docs – Internet Docs
Amazon Web Services – Infrastructure as a Service
The three pillars of Cloud Computing?
Google Docs – Internet Docs
Virtualization
Separation of applications and infrastructure
Software as a Service
Utility Computing
Impacts and Risks of Cloud Computing?
Survey by World Economic Forum (Dec 2009)
Barriers – from the World Economic Forum
• Lack of understanding
• Interoperability
• Fear of vendor lock-in
• Privacy
• Security
• Governance
• Integration and migration
How about security in Cloud Computing?
Trust me, it is very secure.
How about that 0.001% chance ???
Can we run query applications on the Cloud if the provider cannot be trusted??
Secure Computation on Encrypted Database
Secure Computation on Encrypted Data – What?
[Diagram: data items X, Y, Z are encrypted. Can a computation such as x + y, or "find a such that a > 0", be carried out on the encrypted data?]
Security on Cloud Computing
[Diagram: Company A (Hong Kong) submits its encrypted customer data, E(DB), to a database on Amazon's EC2 (U.S.) for query services; A's customers (U.K.) send customer queries to it.]
Gartner listed SEVEN security issues in Cloud Computing:
4. Data Segregation: an encryption scheme must be available to protect the corporate data
Challenge: How to perform computation (queries) on encrypted data??
Computation on Encrypted Data – what is the key issue??
• Bob: the master, holding x, y, …
• Alice: the untrusted slave, holding only E(x), E(y)
• Bob: "I want to find x + y; I want Alice to do it for me. But I don't trust Alice."
• Alice computes the result R = E(x) + E(y)
• Can I use R to recover (x + y)??
Privacy homomorphism (additive):
$E(x) + E(y) = E(x + y)$
$E^{-1}(R) = E^{-1}(E(x) + E(y)) = E^{-1}(E(x + y)) = x + y$
[Rivest, Adleman, Dertouzos, Foundations of Secure Computation 1978]
Key issue?? We need a nice and secure encryption.
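The additive property on this slide can be tried out with a small sketch. The slides do not name a concrete scheme, so the example below swaps in textbook Paillier encryption, in which the "sum" of two ciphertexts is realized as a multiplication modulo N². The primes, function names and parameters are illustrative assumptions, and the key is far too small to be secure.

```python
import math
import random

# Toy parameters (assumption): tiny primes, chosen only to keep the arithmetic readable.
P, Q = 293, 433
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)                               # modular inverse (Python 3.8+)

def encrypt(m, rng=random):
    """Bob: E(m) = (1 + N)^m * r^N mod N^2 for a random r coprime to N."""
    while True:
        r = rng.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(1 + N, m, N2) * pow(r, N, N2)) % N2

def decrypt(c):
    """Bob: E^{-1}(c) = L(c^LAM mod N^2) * MU mod N, with L(u) = (u - 1) // N."""
    return ((pow(c, LAM, N2) - 1) // N * MU) % N

def add_encrypted(c1, c2):
    """Alice: combine two ciphertexts without any key; the result decrypts to the sum."""
    return (c1 * c2) % N2

R = add_encrypted(encrypt(17), encrypt(25))  # computed by Alice on ciphertexts only
print(decrypt(R))                            # Bob recovers x + y
```

Encryption is randomized, so the same plaintext yields different ciphertexts on each call, while the decrypted sum stays the same as long as it is smaller than N.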
What is a secure encryption scheme?
A secure scheme: it can stop the attacker and achieve the defined security goal.
Two ingredients: the security goal and the adversary model.
Attack Model
• What can the adversary do?
– e.g., he can access some data (plain text) and encrypt it
• What does the adversary know?
– e.g., he knows some background knowledge
Security Goal
• What is the primary goal?
– e.g., prevent the adversary (service provider) from seeing the data
• What is the additional requirement?
– e.g., prevent the adversary from finding any statistical information on the protected data
A model for Secure Queries on Encrypted Database (SCONEDB)
• Encrypted DBMS (EDBMS) hosted at an untrusted service provider
– Stores encrypted data
– Processes queries
• A Three-Player Game
– Player 1: Database owner – encrypts the data and sends them to the database at the service provider
– Player 2: User of the database – issues queries to the EDBMS
– Player 3: Attacker (service provider) – tries to break into the encrypted database
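SCONEDB Model
[Diagram: Player 1 (data owner) encrypts each tuple t of DB with ET() under key K and sends ET(t) to the EDBMS, which the service provider hosts as E(DB). Player 2 (query issuer) encrypts a query q with EQ() under key K, submits EQ(q) for query processing, receives the result R and recovers the answer as D(R). Player 3 (attacker) has full access to E(DB), holds H, the knowledge known to the attacker on the data, and through cryptanalysis produces DBA, the attacker's guess in his attempt to break the encryption.]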
Problem definition: SCONEDB Model
[Diagram: the SCONEDB model with Player 1 (ET(), key K, tuple t), Player 2 (EQ(), D(), key K, query q, result R), the EDBMS holding E(DB) and doing query processing at the service provider, and Player 3 (attacker) running cryptanalysis with H to obtain DBA.]
Define an encryption scheme (ET, EQ and D) and a query processing method on E(DB) such that the query results returned are correct and the attacker cannot compromise E(DB), i.e., DBA is empty, given background knowledge H.
Attack Model: Three levels of background knowledge
• Basic capability: attacker has full access to encrypted data
• Background knowledge (a three-level model):
– Level 1: no background knowledge
– Level 2: attacker knows some records in DB (plain text)
– Level 3: attacker knows some records in DB and the encrypted values of these records, i.e., knows some (x, E(x)) pairs
[Diagram: E maps the points x1, x2, x3 in DB to x1', x2', x3' in E(DB); the level 1, level 2 and level 3 attackers each try to break the encryption.]
Attack model – different background knowledge levels
• The attacker has full access to the EDBMS (encrypted database, encrypted queries)
• Three levels of attacker model
– Level 1: has no background knowledge (i.e., H = empty set) (basic level)
  • e.g., the service provider knows nothing about the business of the data owner
– Level 2: knows a set of points P in DB (practical level)
  • e.g., the adversary is a customer of the bank
– Level 3: knows a set of points P in DB and their corresponding encrypted values in E(DB) (avoidable)
  • e.g., the adversary created a new account yesterday and observes that only one new encrypted account has been added since yesterday
If a scheme survives a higher level attack, it survives a lower level one.
SCONEDB on an important query type: Secure kNN Computation
• We develop an encryption scheme for kNN queries on SCONEDB to explore its applicability
• k nearest neighbor query (kNN)
– Database DB: a set of d-dimensional points
– Given a query point q, find the k nearest points to q in the database
[Diagram: DB holds points x1 and x2 at distances d1 and d2 from the query point q; x2 is the 1-nearest neighbor of q.]
Is Distance Preserving Transformation (DPT) an answer to the kNN problem?
• E is a DPT if
– d(x, y) = d'(E(x), E(y))
• kNN can be computed on E(DB) if E is a DPT
[Diagram: E (DPT) maps q, x1, x2 in DB, with distances d1 and d2, to q', x1', x2' in E(DB), with distances d1' and d2'.]
Nice property?? Is this a real solution? Can it survive attack?
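Why a DPT lets the server answer kNN queries can be seen in a few lines. The sketch below is an illustration only: it assumes a 2-D rotation plus translation as the DPT, and the names `knn`, `dpt` and the key values are hypothetical.

```python
import math

def knn(points, q, k):
    """Indices of the k points nearest to q (brute force, Euclidean distance)."""
    return sorted(range(len(points)), key=lambda i: math.dist(points[i], q))[:k]

def dpt(point, theta, shift):
    """A 2-D distance-preserving transformation: rotate by theta, then translate."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y + shift[0], s * x + c * y + shift[1])

db = [(1.0, 2.0), (4.0, 4.0), (0.0, 5.0), (7.0, 1.0)]
q = (3.0, 3.0)
key = (0.9, (10.0, -4.0))  # hypothetical secret key: rotation angle and translation

enc_db = [dpt(p, *key) for p in db]  # E(DB), stored at the service provider
enc_q = dpt(q, *key)                 # encrypted query

# The server sees only enc_db and enc_q, yet finds the same neighbors.
print(knn(db, q, 2), knn(enc_db, enc_q, 2))
```

Because every pairwise distance is unchanged, the ranking by distance is unchanged too, which is exactly the property an attacker can also exploit.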
DPT fails at level 2 and level 3 attacks
• DPT fails at a level 3 attack
• DPT also fails at a level 2 attack (signature attack)
[Diagram: the attacker's background knowledge (level 3) is the points x1, x2, x3 in DB together with x1', x2', x3' in E(DB). The attacker wants to compromise the encryption of y'. Since E is a DPT, the distances d1', d2', d3' from y' to x1', x2', x3' equal the distances d1, d2, d3 from the unknown y to x1, x2, x3.]
DPT is not a good solution; it is not secure enough.
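The level 3 attack on a DPT amounts to trilateration. The following sketch assumes the 2-D case with three known, non-collinear points; the function names and the sample key are hypothetical and only illustrate the idea.

```python
import math

def dpt(point, theta, shift):
    """The victim's DPT (rotation + translation); its key is unknown to the attacker."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y + shift[0], s * x + c * y + shift[1])

def recover_point(known_plain, dists):
    """Find y from three non-collinear points x1..x3 and the distances d_i = d(x_i, y)."""
    (x1, y1), (x2, y2), (x3, y3) = known_plain
    d1, d2, d3 = dists
    # Subtracting the circle equations pairwise leaves two linear equations in y.
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 + y2**2 - x1**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 + y3**2 - x1**2 - y1**2
    det = a11 * a22 - a12 * a21
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - b1 * a21) / det)

key = (0.9, (10.0, -4.0))                     # secret, never used by the attacker
known = [(1.0, 2.0), (4.0, 4.0), (0.0, 5.0)]  # plaintext points the attacker knows
enc_known = [dpt(p, *key) for p in known]     # ... and their encrypted values
enc_y = dpt((2.0, 6.0), *key)                 # target ciphertext y'

# Distances measured in E(DB) equal the distances in DB.
dists = [math.dist(e, enc_y) for e in enc_known]
guess = recover_point(known, dists)
print(guess)
```

In d dimensions the same idea needs d + 1 known points in general position.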
An Asymmetric Scalar Product Preservation Encryption
Scheme 1: Asymmetric Scalar Product Preservation Encryption [SIGMOD 2009]
$x' = E_T(x) = M^T (x^T, -0.5\,\|x\|^2)^T$
$q' = E_Q(q) = M^{-1} (r\,(q^T, 1)^T)$
$x = D(x') = \Pi_d((M^T)^{-1} x')$
Encryption key: M is a (d+1) × (d+1) random invertible matrix. Here r > 0 is a random number, and $\Pi_d$ keeps the first d coordinates.
Scheme 1: An ASPE encryption scheme
Theorem: Let x1', x2', and q' be the encrypted values of x1, x2, and q with scheme 1, then
$\|x_2 - q\| < \|x_1 - q\|$ iff $x_2' \cdot q' > x_1' \cdot q'$
and the scheme is not a DPT. (Very nice!!!!)
[Diagram: E (Scheme 1) maps q, x1, x2 in d dimensions, with distances d1 and d2, to q', x1', x2' in d+1 dimensions.]
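The theorem follows from $x' \cdot q' = r\,(x \cdot q - 0.5\,\|x\|^2)$, which grows as $\|x - q\|$ shrinks when r > 0. The sketch below checks this on a tiny example. It is an illustration under assumptions, not the authors' implementation: the helper names are hypothetical, the key entries are small random integers, and exact `Fraction` arithmetic replaces floating point.

```python
import random
from fractions import Fraction

def transpose(A):
    return [list(col) for col in zip(*A)]

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def inverse(A):
    """Gauss-Jordan inverse with exact Fractions; raises ValueError if A is singular."""
    n = len(A)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(A)]
    for c in range(n):
        pivot = next((r for r in range(c, n) if aug[r][c] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def keygen(d, rng):
    """Secret key: a random invertible (d+1) x (d+1) matrix M, returned with its inverse."""
    while True:
        M = [[rng.randint(-9, 9) for _ in range(d + 1)] for _ in range(d + 1)]
        try:
            return M, inverse(M)
        except ValueError:
            pass  # singular draw, try again

def enc_point(M, x):
    """E_T(x) = M^T (x, -0.5 ||x||^2)^T"""
    return matvec(transpose(M), list(x) + [Fraction(-1, 2) * dot(x, x)])

def enc_query(M_inv, q, r):
    """E_Q(q) = M^{-1} (r (q, 1)^T), with r > 0."""
    return matvec(M_inv, [r * v for v in list(q) + [1]])

def dec_point(M_inv, x_enc, d):
    """D(x') = first d coordinates of (M^T)^{-1} x'"""
    return matvec(transpose(M_inv), x_enc)[:d]

def knn_encrypted(enc_db, q_enc, k):
    """Server side: rank by scalar product only; a larger product means a nearer point."""
    return sorted(range(len(enc_db)), key=lambda i: dot(enc_db[i], q_enc), reverse=True)[:k]

rng = random.Random(2010)
db = [(1, 2), (4, 4), (0, 5), (7, 1)]
M, M_inv = keygen(2, rng)
enc_db = [enc_point(M, x) for x in db]
q_enc = enc_query(M_inv, (3, 3), Fraction(rng.randint(1, 50), 7))
print(knn_encrypted(enc_db, q_enc, 2))
```

The server never computes a distance; it only compares scalar products between encrypted points and the encrypted query.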
Scheme 1: how safe ?
• Scheme 1 resists a level 2 attack – it preserves a useful asymmetric scalar product and breaks the curse of distance preservation
• Unfortunately, Scheme 1 fails a level 3 attack – if enough (x, ET(x)) pairs are known to the attacker, the random matrix M can be solved (compromised)
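The level 3 attack on Scheme 1 is plain linear algebra: with d + 1 known pairs whose lifted plaintexts $(x^T, -0.5\,\|x\|^2)^T$ are linearly independent, $M^T = X' \hat{X}^{-1}$. The sketch below assumes d = 2 and a hand-picked secret matrix; all names are hypothetical.

```python
from fractions import Fraction

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def inverse(A):
    """Gauss-Jordan inverse with exact Fractions; raises ValueError if A is singular."""
    n = len(A)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(A)]
    for c in range(n):
        pivot = next((r for r in range(c, n) if aug[r][c] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        p = aug[c][c]
        aug[c] = [v / p for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def lift(x):
    """(x, -0.5 ||x||^2): the (d+1)-dimensional vector that Scheme 1 encrypts."""
    return list(x) + [Fraction(-1, 2) * sum(v * v for v in x)]

def enc_point(M, x):
    return matvec(transpose(M), lift(x))

def recover_key(pairs):
    """Level 3 attack: solve M from d+1 known (x, E_T(x)) pairs."""
    X_hat = transpose([lift(x) for x, _ in pairs])   # columns: lifted plaintexts
    X_enc = transpose([list(e) for _, e in pairs])   # columns: ciphertexts
    return transpose(matmul(X_enc, inverse(X_hat)))  # M^T = X_enc * X_hat^{-1}

def decrypt_with(M, x_enc, d):
    return matvec(inverse(transpose(M)), x_enc)[:d]

secret_M = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]   # owner's key (unknown to the attacker)
known = [(1, 2), (4, 4), (0, 5)]               # records the attacker knows
pairs = [(x, enc_point(secret_M, x)) for x in known]

guess = recover_key(pairs)
victim = enc_point(secret_M, (7, 1))           # some other encrypted record
print(decrypt_with(guess, victim, 2))
```

Once M is recovered, every record in E(DB) can be decrypted, which is why a stronger scheme is needed at level 3.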
Scheme 2: Asymmetric Random Splitting
• The scheme is based on a splitting technique done independently and randomly on each dimension [SIGMOD 2009]
• $2^{d+1}$ possible splitting configurations for the (d+1)-dimensional case – exponentially many configurations for the adversary to guess
• Scheme 2 resists a level 3 attack if the attacker cannot derive the splitting configuration
• If the number of dimensions is over 80, the scheme is as safe as a 1024-bit RSA key
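The slides give no details of the splitting itself, so the following is only a sketch of one way an asymmetric random split can keep scalar products intact: per dimension, a secret bit decides whether the data coordinate or the query coordinate is split into two random shares, while the other side is copied. The function names, the share range and the convention for the bit are assumptions.

```python
import random

def split_data(p, config, rng):
    """Split a data vector into two halves according to the secret bit vector."""
    pa, pb = [], []
    for value, bit in zip(p, config):
        if bit:   # split this coordinate into two random shares that sum to it
            share = rng.randint(-1000, 1000)
            pa.append(share)
            pb.append(value - share)
        else:     # copy this coordinate into both halves
            pa.append(value)
            pb.append(value)
    return pa, pb

def split_query(q, config, rng):
    """Asymmetric counterpart: the query is split exactly where the data was copied."""
    return split_data(q, [not bit for bit in config], rng)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

rng = random.Random(7)
d = 4
config = [rng.randint(0, 1) for _ in range(d + 1)]  # one of 2**(d+1) configurations
p = [3, -1, 4, 1, 5]  # a (d+1)-dimensional data vector
q = [2, 7, 1, 8, 2]   # a (d+1)-dimensional query vector

pa, pb = split_data(p, config, rng)
qa, qb = split_query(q, config, rng)
print(dot(pa, qa) + dot(pb, qb), dot(p, q))  # the two scalar products agree
```

An attacker who knows (x, E(x)) pairs but not `config` does not know which coordinates were split, which is where the $2^{d+1}$ guesses come from.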
What have we shown you ?
• It is possible to process queries on encrypted data – so query processing at a cloud service provider can be done, but it needs new techniques
• This is a practical approach – the security level is not as strong as the conventional goal (e.g., semantic security), but it is nevertheless practical and its performance makes it implementable
Can we do data mining on the Cloud if the provider cannot be trusted??
Data Mining on Encrypted Data
Example: Association rules mining
• In the form of X => Y
• Meaning
– If a transaction contains itemset X, the transaction will probably contain itemset Y
• A rule must have
– High support: the number of transactions that include XY
– High confidence: the ratio of the number of transactions containing XY to the number of transactions containing X
• Key issue: compute large itemsets
– X is large if the percentage of transactions containing X is larger than a threshold [Agrawal VLDB 94]
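Support, confidence and the "large itemset" test can be written down directly from these definitions. The transactions and the helper names below are made up for illustration.

```python
def support(transactions, itemset):
    """Number of transactions that contain every item of the itemset."""
    wanted = set(itemset)
    return sum(1 for t in transactions if wanted <= set(t))

def confidence(transactions, x, y):
    """support(X and Y together) / support(X); raises ZeroDivisionError if X never occurs."""
    return support(transactions, set(x) | set(y)) / support(transactions, x)

def is_large(transactions, itemset, threshold):
    """X is large if the fraction of transactions containing X exceeds the threshold."""
    return support(transactions, itemset) / len(transactions) > threshold

transactions = [
    {"bread", "chocolate", "cheese"},
    {"bread", "chocolate"},
    {"bread", "book"},
    {"cheese", "book"},
]

print(support(transactions, {"bread", "chocolate"}))       # transactions containing both
print(confidence(transactions, {"bread"}, {"chocolate"}))  # for the rule bread => chocolate
print(is_large(transactions, {"bread"}, 0.5))
```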
Data mining by a Service Provider (cloud)
[Diagram: the data owner's Transformer encrypts DB into DB' and sends it to the data mining service provider (SP, cloud computing); the SP returns AssociationRules', which the data owner decrypts into AssociationRules.]
Solution: Item mapping (encryption cipher)
• T is a set of transactions, e.g., t = {cheese, book, bread, chocolate, ..}
• bread -> 54 (item mapping)
• chocolate -> 165
• t = <cheese, book, bread, chocolate> becomes t' = <8, 69, 54, 165>
• <54, 165> is large to the miner, but what is it? <cheese, book> or <bread, chocolate>???
• Similar to the substitution cipher used in encryption of text
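A one-to-one item mapping is a lookup table that only the data owner holds. The sketch below uses the four codes shown on the slide; the function names are hypothetical.

```python
# Secret item mapping kept by the data owner (codes taken from the slide).
mapping = {"cheese": 8, "book": 69, "bread": 54, "chocolate": 165}

def encrypt_transaction(transaction, mapping):
    """Replace every item by its code before sending the data to the miner."""
    return [mapping[item] for item in transaction]

def decrypt_itemset(codes, mapping):
    """Owner side: translate a mined itemset back to item names."""
    reverse = {code: item for item, code in mapping.items()}
    return [reverse[code] for code in codes]

t = ["cheese", "book", "bread", "chocolate"]
t_enc = encrypt_transaction(t, mapping)
print(t_enc)
print(decrypt_itemset([54, 165], mapping))
```

The miner sees only the codes; as with a substitution cipher in text, item frequencies still leak through.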
A More Secure Mapping??
• A one-to-n item mapping (more secure)
– B: a set of items
– $m: I \to 2^B$
• Example
– m(a) = {1, 4, 5}
– m(b) = {2}
– m(c) = {3, 5}
Itemset mapping using one-to-n item mapping (encryption)
• $m: I \to 2^B$: one-to-n item mapping
• $M: 2^I \to 2^B$: itemset mapping
• m:
– a -> {1, 4, 5}
– b -> {2}
– c -> {3, 5}
• Example:
– $M$(<a, c>) = <1, 3, 4, 5>
– $M$(<b, c>) = <2, 3, 5>
– $M^{-1}$(<1, 2, 4, 5>) = <a, b>
– $M^{-1}$(<1, 2, 3, 4, 5>) = <a, b, c>
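The four examples on this slide are consistent with reading M as the union of the item images and $M^{-1}$ as "every item whose whole image is present". That reading is an assumption made for this sketch, as are the function names.

```python
# One-to-n item mapping m from the slide.
m = {"a": {1, 4, 5}, "b": {2}, "c": {3, 5}}

def itemset_map(itemset, m):
    """M: union of the images of all items in the itemset."""
    return set().union(*(m[item] for item in itemset))

def itemset_unmap(codes, m):
    """M^{-1}: every item whose complete image is contained in the code set."""
    present = set(codes)
    return {item for item, image in m.items() if image <= present}

print(sorted(itemset_map({"a", "c"}, m)))
print(sorted(itemset_unmap({1, 2, 4, 5}, m)))
```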
Correctness – collision on one-to-n mapping
• m: a -> {1, 2}, b -> {2, 3}, c -> {1, 3}
– <a, b> => <1, 2, 3>
– <a, b, c> => <1, 2, 3>
– Collisions! Decryption failure!
• m': a -> {1, 2}, b -> {2, 3}, c -> {2, 4}
– <a> => <1, 2>
– ...
– <a, b> => <1, 2, 3>
– …
– <a, b, c> => <1, 2, 3, 4>
A good one-to-n mapping: the mapping of each item must contain something unique, i.e., a value that appears in no other item's mapping
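Both mappings on this slide can be checked mechanically. The sketch below enumerates all itemsets to find collisions and tests the "unique value per item" condition; the function names are hypothetical and the brute-force search is only meant for tiny examples.

```python
from itertools import combinations

def image(itemset, m):
    """Itemset mapping: union of the images of the items."""
    return frozenset().union(*(m[item] for item in itemset))

def collisions(m):
    """All pairs of distinct itemsets that are mapped to the same code set."""
    seen, found = {}, []
    items = sorted(m)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            key = image(combo, m)
            if key in seen:
                found.append((seen[key], combo))
            else:
                seen[key] = combo
    return found

def has_unique_codes(m):
    """True if every item maps to at least one code that no other item maps to."""
    for item, img in m.items():
        others = set().union(*(o for other, o in m.items() if other != item))
        if not set(img) - others:
            return False
    return True

bad = {"a": {1, 2}, "b": {2, 3}, "c": {1, 3}}    # m on the slide
good = {"a": {1, 2}, "b": {2, 3}, "c": {2, 4}}   # m' on the slide

print(collisions(bad))
print(collisions(good), has_unique_codes(good))
```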
One-to-n vs one-to-one
• one-to-n vs one-to-one?
– Intuitively, one-to-n should be more secure
• Unfortunate scenario:
– one-to-n + item mapping = one-to-one + item mapping
• Our solution:
– Add a random component to the transaction transformation
– It will make one-to-n always better (more secure) than one-to-one [VLDB 2007]
Criteria for a valid transformation
• Correctness ‐ The randomly added items must not affect the decryption
• Completeness - A good random transformation algorithm should be able to generate all possible transformations, so that the one actually used is not easy to guess
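Algorithm to perform valid and complete transformation
[Flowchart; recoverable labels: Start with a transaction t = <…> and the mappings (a -> …, b -> …, … -> …); compute N(t); E = Ø at start; "Meet quota?"; if No, pick one mapping x -> x1, …, xn and pass xi, …, xj through a filter against the history, which stores items we must not add; some are added to E, others to the history; Next; finally, add E to the result.]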
Integrity concerns
• The mining results returned from the service provider could be wrong
– Bugs in their mining program
– Only compute part of the answer (laziness)
– Adversely modify the results (malicious)
• Two problems
– Soundness
  • All rules are correct (no false positive)
  • All rules are with correct support and confidence
– Completeness
  • All correct rules are included (no false negative)
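Soundness and completeness are easy to state as set comparisons when a trusted reference result is available. The sketch below assumes such a reference purely for illustration; an audit in practice has to avoid redoing the mining, and this is not the verification method of [VLDB 2009]. All names and numbers are made up.

```python
def audit(returned, truth):
    """Compare returned {itemset: support} against a trusted reference result."""
    # Unsound entries: itemsets that are not in the reference or carry a wrong support.
    wrong = {s for s, sup in returned.items() if truth.get(s) != sup}
    # Incomplete result: reference itemsets that the provider left out.
    missing = set(truth) - set(returned)
    return {"sound": not wrong, "complete": not missing, "wrong": wrong, "missing": missing}

truth = {frozenset({"bread"}): 3, frozenset({"bread", "chocolate"}): 2}

lazy = {frozenset({"bread"}): 3}                 # only part of the answer
malicious = dict(truth)
malicious[frozenset({"book"})] = 5               # a fabricated itemset

print(audit(lazy, truth)["complete"])
print(audit(malicious, truth)["sound"])
```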
Light weight audit of mining result
• Pre-processing + light-weight verification [VLDB 2009]
[Diagram: within the audit environment, the data owner transforms T before sending it to the service provider and keeps auxiliary data; the service provider returns the frequent itemsets (FI), which the data owner then verifies.]
What have we shown you ?
• It is possible to do data mining on encrypted data –at least it can be done on association mining; also the mining result is not visible to the service provider
• We can audit the mining result returned by a service provider; therefore, the integrity is protected against the service provider
Publications on Computation and Mining on Encrypted Data
• Data mining on encrypted data
– [Wong, Cheung, Hung, Liu] Protecting Privacy in Incremental Maintenance for Distributed Association Rule Mining, PAKDD 2008.
– [Wong, Cheung, Hung, Kao and Mamoulis] Security in Outsourcing of Association Rule Mining, VLDB 2007.
• Integrity of data mining on encrypted data
– [Wong, Cheung, Hung, Kao and Mamoulis] An Audit Environment for Outsourcing of Frequent Itemset Mining, Proceedings of PVLDB 2009 (VLDB).
• Secure kNN computation on encrypted data
– [Wong, Cheung, Kao and Mamoulis] Secure k-NN Computation on Encrypted Databases, SIGMOD 2009.
• Privacy preservation
– [Wong, Mamoulis and Cheung] Non-homogeneous Generalization in Privacy Preserving Data Publishing, SIGMOD 2010.
Acknowledgement
• Wong Wai Kit
• Ben Kao
• Nikos Mamoulis
• Edward Hung
The End
Thank you. Questions, please!!!!