Privacy and Auditing in Clouds
TRANSCRIPT
All opinions expressed herein are my own and do not reflect the opinions of anyone that I work with (or have worked with) or any organization that I am or have been affiliated with.
• Jamaican
Education
• BSc Hons Computer Studies, UWI-Mona.
• MSc Software Engineering, UWI-Mona
• PhD Computer Science, Imperial College – London
• MBA Finance, IBM Academy
Experience
• 10 years leading Quest team at IBM
• 2 years working in startups
• 3 years running companies and consulting
• Now, working for the White House
Recognition
• Fellow, British Computer Society (BCS)
• Fellow, Healthcare Information and Management Systems Society (HIMSS)
• Pioneer of the Year (2009), National Society of Black Engineers (NSBE)
• IEEE Technical Achievement Award (2010) for “Pioneering Contributions to Secure and Private Data Management”.
• Modern Day Technology Leader (2009), Minority in Science Trailblazer (2010), Science Spectrum Trailblazer (2012, 2013). Black Engineer of the Year Award Board
• IBM Master Inventor
• Distinguished Engineer, Association for Computing Machinery (ACM)
• Senior Member, Institute of Electrical and Electronics Engineers (IEEE)
Record
• Over 100 technical papers, over 47 patents and 2 books.
• The Fundamentals
• Auditing
• Privacy
• Cloud Computing
• Why Do We Need A&P in Clouds
• The Current State of the World
• Potential Research Areas
• Guiding Principles
• Considerations
• Research Roadmap
• Task 1
• Task 2
• Starting Point
• Small step 1
• Other Steps
• Conclusion
The process of collecting and evaluating evidence to determine whether a computer system safeguards assets, maintains data integrity, achieves
organizational goals effectively and consumes resources efficiently
- Information Systems Control and Audit, Ron Weber (1998).
[Figure: an audit log/trail is generated and then examined by an auditor.]
An individual’s right to control, edit, manage, and delete information about them[selves] and decide when, how, and to what extent
information is communicated to others
- Privacy and Freedom, Alan F. Westin (1967).
[Figure: "My Data", which I create]
I authorize my doctor to view my test results for diagnosis purposes only.
My insurance company is not authorized to see any of my data.
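The two statements above can be read as a purpose-bound access policy. The following is a minimal sketch of that reading, not a mechanism described in the slides: the `Grant` type, the `GRANTS` set, and the string labels are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    recipient: str  # who may see the data
    data: str       # which data item the grant covers
    purpose: str    # the only purpose the grant covers


# Hypothetical policy mirroring the example: the doctor may view test
# results for diagnosis only; the insurance company has no grant at all.
GRANTS = {Grant("doctor", "test_results", "diagnosis")}


def is_authorized(recipient: str, data: str, purpose: str) -> bool:
    """Default deny: access is allowed only if an exact grant exists."""
    return Grant(recipient, data, purpose) in GRANTS


print(is_authorized("doctor", "test_results", "diagnosis"))   # True
print(is_authorized("doctor", "test_results", "marketing"))   # False
print(is_authorized("insurer", "test_results", "diagnosis"))  # False
```

The default-deny choice reflects the example: the insurance company is refused because nothing authorizes it, not because a rule names it.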
Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing
resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management
effort or service provider interaction.
- NIST Special Publication 800-145, Mell & Grance (2011).
Currently, cloud clients trust too much
Real-time detection of an attack is only possible in the simplest, most obvious cases
Real-time notification is the exception (when possible), not the rule
Because cloud delivery models and cloud deployment models differ, the artifact that any particular person is using may be different.
Cloudy specifics on the cloud, e.g. location of instances, mechanisms in place, etc.
For advanced auditing scenarios, details of the cloud operations, communications with clients, and client-based cloud operations need to be known
1. Creating Privacy-Preserving Logs: Assumes that the cloud user does not have full confidence in the cloud provider or their affiliated ecosystem.
2. Enabling Auditing in a Privacy-Preserving Manner: Assumes there is not complete trust in the auditor and the service provider.
Seamless: Integrate into the current mode of operation with minimal to no significant changes.
Transparent: It should be clear to the cloud service user what the purpose of the mechanism is and when it is functioning.
Elastic: Be able to scale to dynamically handle the request loads placed on the cloud service provider.
Low Impact: Inclusion of the mechanism should have a minor impact on the storage and performance of the cloud environment.
Verifiable: An independent third party should be able to prove the veracity of the actions of the mechanism.
The Mechanism Injection Point (MIP): The mechanism injection point refers to the location of the A&P controls. This is the location where enforcement of the auditing and privacy rules will be performed and where the supplementary mechanisms, such as data structures, are situated.
The Nature of the Cloud Service Employed: The cloud model being used, i.e. Software-as-a-Service (SaaS), Platform-as-a-Service (PaaS), Infrastructure-as-a-Service (IaaS), etc.
The Transaction Attack Vector: The transaction attack vector refers to the class of transactions that are evaluated in the process of assessing a possible threat. There are two types of transaction attack vectors: Requests and Consequences.
The Threat Determination Point: The threat determination point refers to the location where the analysis of the recorded privacy and audit events occurs, i.e. the location where breach detection and notification happens.
Create the big picture
Identify the basic problems
Efficient Auditing Mechanisms
Time Synchronization of Logs
Creating Processing-Friendly, Privacy-Preserving Data
Processing of Encrypted Log Data
Mechanisms for Basic Cloud Forensics
Solve the core problems
Scale up to the big picture
[Figure: architecture spanning the User and the Cloud Service Provider (CSP). Components shown: App1 … Appn, Privacy-Preserving API (shown twice), Public Key Infrastructure, Native API, Pseudonym, Request/Consequence Parser, Resources. Labeled flows: "C2: signed API request, with user ID" and "C2: API response/consequence".]
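One way to picture a log entry in such an architecture is a record that stores a pseudonym instead of the user ID, together with the request, its consequence, and a signature. The sketch below is only an illustration under stated assumptions: the figure names a Public Key Infrastructure, whereas this example swaps in HMAC from the Python standard library as a stand-in for public-key signatures, and every function and field name here is hypothetical.

```python
import hashlib
import hmac
import json


def pseudonym(user_id: str, key: bytes) -> str:
    """Keyed hash of the user ID: stable per user, not reversible without the key."""
    return hmac.new(key, user_id.encode(), hashlib.sha256).hexdigest()[:16]


def sign(message: bytes, key: bytes) -> str:
    """HMAC tag used here in place of a PKI signature (assumption)."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def log_entry(user_id, request, consequence, pseudonym_key, signing_key):
    """Build a log record that never stores the raw user ID."""
    record = {
        "user": pseudonym(user_id, pseudonym_key),
        "request": request,
        "consequence": consequence,
    }
    payload = json.dumps(record, sort_keys=True).encode()
    return {"record": record, "signature": sign(payload, signing_key)}


def verify(entry, signing_key) -> bool:
    """Recompute the tag over the record and compare in constant time."""
    payload = json.dumps(entry["record"], sort_keys=True).encode()
    return hmac.compare_digest(entry["signature"], sign(payload, signing_key))


entry = log_entry("jane", "GET /records/42", "200 OK", b"pseudonym-key", b"signing-key")
print(verify(entry, b"signing-key"))  # True
```

Because the pseudonym is deterministic for a given key, an auditor can still group entries by user without learning who the user is.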
[Figure: audit architecture]
Query Audit Log
ID | Timestamp | Query | User | Purpose | Recipient
1 | 2004-02… | Select … | Jane | Current | Ours
2 | 2004-02… | Select … | John | Telemarketing | public
Database Layer: Query with purpose, recipient. Generate audit record for each query.
Data Tables: Updates, inserts, deletes.
Backlog: Database triggers track updates to base tables.
Audit Database Layer: Audit query. Returns IDs of log queries having accessed data specified by the audit query.
• Audits whether particular data has been disclosed in violation of the specified policies
• Audit expression specifies what potential data disclosures need monitoring
• Identifies logged queries that accessed the specified data
• Analyze circumstances of the violation
• Make necessary corrections to procedures, policies, security
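The figure above mentions database triggers that track updates to base tables in a backlog. The following is a minimal sketch of that idea using SQLite from the Python standard library. The table and column names (`treatment`, `treatment_backlog`, `pcid`, `disease`) are hypothetical, and a sequence number stands in for the timestamp a real backlog would carry.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE treatment (pcid INTEGER PRIMARY KEY, disease TEXT);

-- Backlog: one row per version of a base-table row.
CREATE TABLE treatment_backlog (
    seq INTEGER PRIMARY KEY AUTOINCREMENT,
    op TEXT, pcid INTEGER, disease TEXT
);

CREATE TRIGGER treatment_ins AFTER INSERT ON treatment BEGIN
    INSERT INTO treatment_backlog (op, pcid, disease)
    VALUES ('insert', NEW.pcid, NEW.disease);
END;
CREATE TRIGGER treatment_upd AFTER UPDATE ON treatment BEGIN
    INSERT INTO treatment_backlog (op, pcid, disease)
    VALUES ('update', NEW.pcid, NEW.disease);
END;
CREATE TRIGGER treatment_del AFTER DELETE ON treatment BEGIN
    INSERT INTO treatment_backlog (op, pcid, disease)
    VALUES ('delete', OLD.pcid, OLD.disease);
END;
""")

# Every change to the base table leaves a trace in the backlog.
conn.execute("INSERT INTO treatment VALUES (1, 'flu')")
conn.execute("UPDATE treatment SET disease = 'diabetes' WHERE pcid = 1")
conn.execute("DELETE FROM treatment WHERE pcid = 1")

backlog = conn.execute(
    "SELECT op, pcid, disease FROM treatment_backlog ORDER BY seq"
).fetchall()
print(backlog)
```

The base table ends up empty, yet the backlog still holds all three versions, which is what lets an audit look at the data as it stood when a logged query ran.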
Jane has not been feeling well and decides to consult her doctor.
The doctor uncovers that Jane’s blood sugar level is high and suspects diabetes.
Sometime later, Jane receives promotional literature from a pharmaceutical company, proposing over-the-counter diabetes tests.
Jane complains to the Department of Health and Human Services, saying that she had opted out of the doctor sharing her medical information with pharmaceutical companies for marketing purposes.
The doctor must now review disclosures of Jane’s information in order to understand the circumstances of the disclosure, and take appropriate action.
audit T.disease
from Customer C, Treatment T
where C.cid=T.pcid and C.name =‘Jane’
Who has accessed Jane’s disease information?
Given:
A log of queries executed over a data system
An audit expression specifying sensitive data
Precisely identify:
Those queries that accessed the data specified by the audit expression
“Candidate” query: Logged query that accesses all columns specified by the audit expression
“Indispensable” tuple (for a query): A tuple whose omission makes a difference to the result of a query
“Suspicious” query: A candidate query that shares an indispensable tuple with the audit expression
Query Q: Addresses of people with diabetes
Audit A: Jane’s diagnosis
Jane’s tuple is indispensable for both; hence query Q is “suspicious” with respect to A
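The definitions above can be traced on a toy data set. This is a simplified sketch, not the system's implementation: the sample rows, column names, and helper functions are hypothetical, and it treats a joined row that passes a query's predicate as indispensable, which assumes a plain select-project-join query without aggregation or duplicate elimination.

```python
from itertools import product

# Hypothetical sample data.
customers = [
    {"cid": 1, "name": "Jane", "address": "12 Elm St"},
    {"cid": 2, "name": "John", "address": "9 Oak Ave"},
]
treatments = [
    {"pcid": 1, "disease": "diabetes"},
    {"pcid": 2, "disease": "flu"},
]


def joined():
    """Customer x Treatment rows that satisfy the join C.cid = T.pcid."""
    for c, t in product(customers, treatments):
        if c["cid"] == t["pcid"]:
            yield {**c, **t}


def indispensable(predicate):
    """Joined rows whose omission would change the query result (simplified)."""
    return [row for row in joined() if predicate(row)]


def is_suspicious(query_cols, query_pred, audit_cols, audit_pred):
    # Candidate: the query accesses all columns named by the audit expression.
    if not set(audit_cols) <= set(query_cols):
        return False
    # Suspicious: the query and the audit share an indispensable tuple.
    audit_rows = indispensable(audit_pred)
    return any(row in audit_rows for row in indispensable(query_pred))


# Query Q: addresses of people with diabetes. Audit A: Jane's diagnosis.
q_suspicious = is_suspicious(
    {"address", "disease"}, lambda r: r["disease"] == "diabetes",
    {"disease"}, lambda r: r["name"] == "Jane",
)
print(q_suspicious)  # True
```

A query for addresses of people with flu accesses the same columns, so it is still a candidate, but it shares no tuple with the audit and is not suspicious.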
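Theorem - A candidate query Q is suspicious with respect to an audit expression A iff:
$\sigma_{P_A}(\sigma_{P_Q}(T \times R \times S)) \neq \emptyset$
The candidate query Q and the audit expression A are of the form:
$Q = \pi_{C_{OQ}}(\sigma_{P_Q}(T \times R))$
$A = \pi_{C_{OA}}(\sigma_{P_A}(T \times S))$
Query Graph Modeler (QGM) rewrites Q and A into:
$\pi_{"Q_i"}(\sigma_{P_A}(\sigma_{P_Q}(T \times R \times S)))$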
[Figure: the audit architecture again (Data Tables, Query Audit Log, Backlog, Database Layer, Audit Database Layer), with the audit step detailed: an audit expression goes through static analysis, an audit query is generated, and the output is the IDs of log queries having accessed data specified by the audit query.]
Query Log
ID | Timestamp | Query | User | Purpose | Recipient
1 | 2004-02… | Select … | James | Current | Ours
2 | 2004-02… | Select … | John | Telemarketing | public
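[Figure: Query Log and Audit expression feed "Filter Queries", which outputs Candidate queries]
Eliminate queries that could not possibly have violated the audit expression
Accomplished by examining only the queries themselves (i.e., without running the queries)
$C_Q \supseteq C_{OA}$ (the columns accessed by Q include all columns specified by the audit expression)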
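The filtering step can be sketched as a column-containment test over the query log, with no query ever being executed. This is an illustrative sketch only: it assumes each logged query has already been parsed into the set of columns it accesses, and the `columns` field and the sample log are hypothetical.

```python
def candidate_queries(query_log, audit_columns):
    """Return IDs of logged queries that access every audited column."""
    audit_columns = set(audit_columns)
    return [q["id"] for q in query_log if audit_columns <= set(q["columns"])]


# Hypothetical log entries with pre-extracted column sets.
query_log = [
    {"id": 1, "columns": {"C.cid", "C.name", "C.address", "T.pcid", "T.disease"}},
    {"id": 2, "columns": {"C.name", "C.zip"}},
]

candidates = candidate_queries(query_log, {"T.disease"})
print(candidates)  # [1]
```

Query 2 never touches T.disease, so it is eliminated before any data is examined.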
Merge logged queries and audit expression into a single query graph
[Figure: query graphs before merging. Base tables: Customer (c, n, …, t), ranged over by C, and Treatment (p, r, …, t), ranged over by T. Logged query: Select C.n, C.a, C.z := T.s=‘diabetes’ and T.p=C.c. Audit expression: T.s := T.p=C.c and C.n=‘Jane’.]
[Figure: merged query graph. Logged query ‘Q1’: Select C.n := T.s=‘diabetes’ and C.c=T.p, over Customer (c, n, …, t), ranged over by C, and Treatment (p, r, ..., t), ranged over by T. Audit expression := X.n=‘Jane’, where X ranges over the logged query.]
The view of Customer (Treatment) is a temporal view as of the time the query was executed.
The audit expression now ranges over the logged query. If the logged query is suspicious, the audit query will output the id of the logged query.
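A temporal view as of the time a logged query ran can be pictured as a replay of backlog entries up to that time. The sketch below is an assumption-laden illustration, not the system's method: the backlog is modeled as a list of hypothetical `(timestamp, op, key, row)` tuples sorted by integer timestamp.

```python
def view_as_of(backlog, as_of):
    """Rebuild table state at time `as_of` by replaying earlier backlog entries."""
    state = {}
    for ts, op, key, row in backlog:
        if ts > as_of:
            break  # entries are sorted, so later changes are not visible yet
        if op == "delete":
            state.pop(key, None)
        else:  # insert or update: keep the latest version of the row
            state[key] = row
    return state


# Hypothetical backlog for one Treatment row.
backlog = [
    (1, "insert", 1, {"pcid": 1, "disease": "flu"}),
    (5, "update", 1, {"pcid": 1, "disease": "diabetes"}),
    (9, "delete", 1, None),
]

print(view_as_of(backlog, 6))
```

A query logged at time 6 is audited against the row as it stood then, even though the row has since been deleted.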
[Figure: Time (minutes), axis from 0 to 250, versus # of versions per tuple (5, 20, 35, 50). Series: Composite, Simple, No Index.]
[Figure: overhead compared with No Triggers: 7x if all tuples are updated, 3x if a single tuple is updated. Negligible by using the Recovery Log to build Backlog tables.]
Complete initial solutions for basic problems
Show their importance (in other domains)
Integrate into bigger picture.
Demonstrate applicability to cloud environment
Partner with Cloud providers to prototype and iron out kinks.
Focus on Cloud Forensics Privacy-Preserving Protocols
Chain of Evidence
Authenticity
Iterate on initial vision given the current state.
This space has a lot of difficult (and fundamental) problems.
These specific questions need more researchers focusing on them
Applicable not only to privacy and auditing in clouds
Translate to fundamental impact on basic Computer Systems Research.
This is just my view and should never be thought to be complete and definitive.