Relational Semantic Hiding Databases (RSHDB): Protecting data privacy and integrity in clouds. By Jyh-haw Yeh, Computer Science, Boise State University


Post on 25-Dec-2015


Page 1: Relational Semantic Hiding Databases (RSHDB)

Protecting data privacy and integrity in clouds

By Jyh-haw Yeh
Computer Science
Boise State University

Page 2: Cloud Computing

The cloud computing paradigm provides a new concept of IT management.
- Businesses purchase IT services from clouds
- Cost savings
- Unlimited computing power
- Charged by usage
- More secure?
- Better resource utilization, thus green computing

Page 3: Cloud Computing

Cloud computing also has some known problems.
- Trust issues
- Data privacy and integrity
- Non-transparency of data locations
- Liability issues

Page 4: Outsourcing Databases

- Database-as-a-service is an emerging service that is starting to appear in the cloud industry.
- Clients have the flexibility to design an application as a database that suits their business.
- The database is then outsourced to clouds.
- Clouds are able to execute queries over the database upon the client's requests.
- Clouds (which may not be trusted) have total control of the data.
- Data privacy and integrity are a big concern.

Page 5: Encrypted Databases

An extreme approach to protect data privacy:
- Encrypt the whole database and then outsource the encrypted database to clouds.
- This approach works if a practical fully homomorphic encryption (FHE) algorithm exists.
- FHE: arithmetic and relational comparisons can be applied directly to ciphertexts.
- No practical and efficient FHE exists.
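To make "operations applied directly to ciphertexts" concrete, here is a minimal sketch using textbook RSA, which is only multiplicatively homomorphic. It is not FHE and is not part of RSHDB; the tiny parameters are a classic classroom example and are completely insecure. All names below are illustrative assumptions.

```python
# Toy textbook-RSA parameters (insecure, for illustration only).
P, Q = 61, 53
N = P * Q          # modulus, 3233
E = 17             # public exponent
D = 2753           # private exponent: (E * D) % ((P - 1) * (Q - 1)) == 1


def encrypt(m):
    """Encrypt an integer 0 <= m < N with textbook RSA (no padding)."""
    return pow(m, E, N)


def decrypt(c):
    """Decrypt a textbook-RSA ciphertext."""
    return pow(c, D, N)


def multiply_ciphers(c1, c2):
    """Multiply two plaintexts by operating on their ciphertexts only."""
    return (c1 * c2) % N


a, b = 7, 12
product_cipher = multiply_ciphers(encrypt(a), encrypt(b))
recovered = decrypt(product_cipher)   # equals (a * b) % N
print(recovered)
```

The party doing the multiplication never sees 7 or 12. A fully homomorphic scheme would support addition as well, and therefore arbitrary computation, which is where the practicality problem noted above comes in.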

Page 6: RSHDB

- RSHDB (relational semantic hiding databases) is a proposed database system that is able to hide semantics from DBAs.
- It is suitable for businesses that want to outsource their business applications as an RSHDB instance to clouds.
- It enables the DBAs or DBMS in clouds to operate on RSHDB databases without knowing private business information.

Page 7: RSHDB: Idea of Hiding Semantics

Idea of semantic hiding in RSHDB:
- An XYZ company has a PAYROLL database, in which a record in the table EMPLOYEE shows that John Smith's SALARY is 63,000.
- An ? company has a ? database, in which a record in the table ? shows that ? ? is 63,000.

Page 8: RSHDB: Basic Operations

Basic database operations:
- Arithmetic: add or multiply numeric data.
- Equality test: test the equality of two data items.
- Relational comparison: decide A > B or A < B.
- Substring matching: decide whether a string A is a substring of another string B.

Other database operations (sorting, searching, aggregate functions, set operations) are extensions or combinations of the basic operations.

Page 9: RSHDB: Data Types

Data types:
- NC-type: Numeric with Comparison only.
- NCA-type: Numeric with both Comparison and Arithmetic.
- SC-type: String with Comparison only.
- SCS-type: String with both Comparison and Substring matching.
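The four types can be read as a table of which operations a column must support. The sketch below encodes that table in Python; the names `Op`, `ALLOWED_OPS`, and `is_allowed` are hypothetical and not part of any RSHDB implementation.

```python
from enum import Enum


class Op(Enum):
    COMPARE = "comparison"
    ARITHMETIC = "arithmetic"
    SUBSTRING = "substring matching"


# Operations each RSHDB data type has to support, as listed on the slide.
ALLOWED_OPS = {
    "NC": {Op.COMPARE},
    "NCA": {Op.COMPARE, Op.ARITHMETIC},
    "SC": {Op.COMPARE},
    "SCS": {Op.COMPARE, Op.SUBSTRING},
}


def is_allowed(data_type, op):
    """Return True if a column of the given type supports the operation."""
    return op in ALLOWED_OPS[data_type]


print(is_allowed("NC", Op.ARITHMETIC))   # NC columns cannot be summed
print(is_allowed("NCA", Op.ARITHMETIC))  # NCA columns can
```

Choosing the narrowest type that still covers the queries a column needs matters because, as the later slides explain, the types that allow more operations get weaker protection.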

Page 10: RSHDB: Design Goal

- Partially encrypt the database so that the cloud is able to execute queries over encrypted data.
- Encrypt enough information (but not all) to hide semantics from data operators.
- Minimize the impact on the DBMS, SQL, the hosting clouds, and the clients.

Page 11: RSHDB: Encryption Strategy

- Use a secure deterministic encryption for all semantic-telling information: database, table, and attribute names.
- String-type data is also semantic-telling: always encrypted.
  - SC-type: order-preserved encryption (less secure).
  - SCS-type:
    - char-by-char (less secure) order-preserved encryption.
    - word-by-word order-preserved encryption.
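The char-by-char option for SCS-type data can be sketched as follows. This is a toy stand-in, not the scheme used in RSHDB: each ASCII character is mapped through a keyed, strictly increasing table, so both lexicographic order and substring matching still work on the encoded form. The helper names and the use of `random.Random` as the keyed generator are assumptions for illustration; a real system would need a cryptographic construction.

```python
import random


def build_char_table(key, alphabet_size=128):
    """Build a strictly increasing code table for ASCII characters from a key."""
    rng = random.Random(key)
    table, code = [], 0
    for _ in range(alphabet_size):
        code += rng.randint(1, 1000)   # strictly increasing keeps the order
        table.append(code)
    return table


def encrypt_chars(text, table):
    """Encode an ASCII string character by character."""
    return tuple(table[b] for b in text.encode("ascii"))


def contains(cipher_a, cipher_b):
    """Substring test carried out on encoded strings only."""
    n = len(cipher_a)
    return any(cipher_b[i:i + n] == cipher_a
               for i in range(len(cipher_b) - n + 1))


table = build_char_table(key=42)
smith = encrypt_chars("Smith", table)
john_smith = encrypt_chars("John Smith", table)
print(contains(smith, john_smith))
# Tuple comparison is lexicographic, so the plaintext order is preserved.
print(encrypt_chars("Joe Johnson", table) < encrypt_chars("Joey English", table))
```

Because every occurrence of a character always maps to the same code, this form is open to frequency analysis, which is one reason the char-by-char variant is the less secure one. Word-by-word encoding hides more but can only match whole words.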

Page 12: RSHDB: Encryption Strategy

- Numeric data itself reveals less semantics.
  - NC-type: order-preserved encryption.
    - Example: bdate data
  - NCA-type: no practical homomorphic encryption is available for this type of data.
    - Leave the data in clear
    - Homomorphic encoding (not too much help for security)
    - Example: salary data
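A minimal sketch of what an order-preserving, additively homomorphic encoding for numeric data could look like: multiply by a secret constant. The DEPT_NO values 1, 2, 3 in the example database appear as 25,300, 50,600, 75,900, which is consistent with such a scaling, but treating 25,300 as the key is an assumption made here for illustration, not something the slides state.

```python
SCALE = 25_300   # hypothetical secret multiplier (assumption, see lead-in)


def encode(value):
    """Encode an integer; larger inputs always give larger codes."""
    return value * SCALE


def decode(code):
    """Invert encode(); rejects values that are not valid codes."""
    if code % SCALE != 0:
        raise ValueError("not a valid code")
    return code // SCALE


dept_numbers = [1, 3, 2, 2]
codes = [encode(d) for d in dept_numbers]
print(codes)

# Comparison works on codes: sorting codes sorts the hidden values.
print([decode(c) for c in sorted(codes)])

# Addition works on codes: encode(a) + encode(b) == encode(a + b).
print(decode(encode(2) + encode(3)))
```

A linear map like this is easy to undo (the greatest common divisor of a few codes reveals the multiplier), which matches the slide's remark that homomorphic encoding is not much help for security.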

Page 13: Impacts

- The DBMS: needs to be semantic-hiding aware.
- The SQL: new data types for DDL.
- The hosting clouds:
  - More storage space for encrypted data.
  - Install a semantic-hiding-aware DBMS.
- The clients: install a query API that will
  - Perform encryption
  - Convert the SQL query to a semantic hiding query
  - Perform decryption
  - Return the result to the clients
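The "convert SQL query to semantic hiding query" step of the client-side API could be sketched as a simple identifier and literal rewrite. The mapping below is taken from the example database in this deck (EMPLOYEE becomes T, DEPARTMENT becomes R, and so on); the function name `to_shq`, the regex approach, and the requirement that all columns be table-qualified are assumptions for illustration. Encryption of the names and decryption of results are omitted.

```python
import re

# Hidden names for tables, columns, and string literals (client-side secret).
TABLES = {"EMPLOYEE": "T", "DEPARTMENT": "R"}
COLUMNS = {
    ("EMPLOYEE", "NAME"): "A1",
    ("EMPLOYEE", "DEPT_NO"): "A3",
    ("EMPLOYEE", "SALARY"): "A6",
    ("DEPARTMENT", "DEPT_NAME"): "B1",
    ("DEPARTMENT", "DEPT_NO"): "B2",
}
LITERALS = {"'Research'": "'Y21'"}   # Y21 is a placeholder ciphertext

_TABLE_ALT = "|".join(re.escape(name) for name in TABLES)
_QUALIFIED = re.compile(rf"\b({_TABLE_ALT})\.(\w+)")
_BARE_TABLE = re.compile(rf"\b({_TABLE_ALT})\b")


def to_shq(sql):
    """Rewrite a plain SQL query with table-qualified columns into an SHQ."""
    def hide_column(match):
        table, column = match.group(1), match.group(2)
        return f"{TABLES[table]}.{COLUMNS[(table, column)]}"

    sql = _QUALIFIED.sub(hide_column, sql)                       # TABLE.COLUMN
    sql = _BARE_TABLE.sub(lambda m: TABLES[m.group(1)], sql)     # FROM list
    for plain, hidden in LITERALS.items():                       # string data
        sql = sql.replace(plain, hidden)
    return sql


query = (
    "select EMPLOYEE.NAME, EMPLOYEE.SALARY from EMPLOYEE, DEPARTMENT "
    "where EMPLOYEE.DEPT_NO = DEPARTMENT.DEPT_NO "
    "and DEPARTMENT.DEPT_NAME = 'Research' "
    "and EMPLOYEE.SALARY > 50000 order by EMPLOYEE.NAME asc"
)
print(to_shq(query))
```

The salary constant 50000 is left untouched because salary is NCA-type data kept in clear. A production API would parse the SQL instead of using regular expressions.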

Page 14: Example Database

EMPLOYEE

NAME          SSN        DEPT_NO  JOB_TYPE  BDATE       SALARY
John Smith    123456789  1        Manager   1966-05-04  83,000
Frank Wong    333445555  3        Staff     1985-07-26  48,000
Joey English  453453453  2        Engineer  1978-10-03  72,000
Joe Johnson   999887777  2        Engineer  1982-03-29  70,500

DEPARTMENT

DEPT_NAME    DEPT_NO  LOCATION
Headquarter  1        Houston
Research     2        Boise
Finance      3        Houston

Page 15: Example Database

T

A1   A2   A3      A4   A5             A6
X11  X12  25,300  X14  2,418,241,992  83,000
X21  X22  75,900  X24  2,441,639,298  48,000
X31  X32  50,600  X34  2,437,900,467  72,000
X41  X42  50,600  X44  2,433,063,369  70,500

R

B1   B2      B3
Y11  25,300  Y13
Y21  50,600  Y23
Y31  75,900  Y33

Page 16: Semantic Hiding Query (SHQ)

- The sensitive information or data is encrypted in an SHQ.
- To query an RSHDB, the SQL query must be an SHQ.

Example: Retrieve the name and salary of each employee in the 'Research' department whose salary is more than $50,000, and sort the report in ascending order of names.

Page 17: SHQ Example

select EMPLOYEE.NAME, EMPLOYEE.SALARY
from EMPLOYEE, DEPARTMENT
where EMPLOYEE.DEPT_NO = DEPARTMENT.DEPT_NO AND DEPT_NAME = 'Research' AND EMPLOYEE.SALARY > 50000
order by EMPLOYEE.NAME asc;
---------------------------------------------------------------------------
select T.A1, T.A6
from T, R
where T.A3 = R.B2 AND R.B1 = Y21 AND T.A6 > 50000
order by T.A1 asc;
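The hidden query can be run against the hidden tables T and R from the example database using an ordinary SQL engine. The sketch below uses Python's built-in `sqlite3` as a stand-in for the cloud DBMS, which is an assumption for illustration. The X and Y values are the symbolic placeholders from the slides, stored as plain strings, and `Y21` is quoted so the statement is valid SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table T (A1 text, A2 text, A3 integer, A4 text, A5 integer, A6 integer);
    create table R (B1 text, B2 integer, B3 text);
""")
conn.executemany(
    "insert into T values (?, ?, ?, ?, ?, ?)",
    [
        ("X11", "X12", 25300, "X14", 2418241992, 83000),
        ("X21", "X22", 75900, "X24", 2441639298, 48000),
        ("X31", "X32", 50600, "X34", 2437900467, 72000),
        ("X41", "X42", 50600, "X44", 2433063369, 70500),
    ],
)
conn.executemany(
    "insert into R values (?, ?, ?)",
    [("Y11", 25300, "Y13"), ("Y21", 50600, "Y23"), ("Y31", 75900, "Y33")],
)

# The semantic hiding query: no table, column, or department name is visible.
rows = conn.execute("""
    select T.A1, T.A6
    from T, R
    where T.A3 = R.B2 and R.B1 = 'Y21' and T.A6 > 50000
    order by T.A1 asc
""").fetchall()
print(rows)
conn.close()
```

The engine returns the two rows for X31 and X41 without learning what they mean. Here the placeholders sort by their labels; with real order-preserved ciphertexts the rows would come back in the order of the underlying names.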

Page 18: SHQ Result

T.A1  T.A6
X41   70,500
X31   72,000

The query API decrypts the result and returns it to the clients:

EMPLOYEE.NAME  EMPLOYEE.SALARY
Joe Johnson    70,500
Joey English   72,000

Page 19: Research Issues

- Storage requirement.
- Is order-preserved encryption secure enough?
  - More secure encryption + order-preserved hashing?
- Guessing the semantics from the range and format of NCA-type data in clear.
  - Adding noise?
- RSHDB's DBMS has weaker domain constraint enforcement.
  - All encrypted data are of bit-string type.

Page 20: Research Issues

- Char-by-char versus word-by-word encryption for SCS-type data.
  - Flexibility, security, and space.
- Who should develop the query API?
- Performance downgrade:
  - Implementation and simulation
  - Real-world databases and queries

Page 21: Future Work

- Designing algorithms for data integrity protection for outsourced databases.
  - Completeness
  - Non-forgery
  - Freshness
- Adding data integrity protection to RSHDB is challenging.