Representing Chains of Custody Along a Forensic Process: A Case
Study on Kruse Model
Tamer Fares Gayed, UQAM
Hakim Lounis, UQAM
Moncef Bari, UQAM
Presentation outline
• Introduction
• Problem definition
• Why Linked Data for representing chain of custody
• Solution framework
• Conclusions and perspectives
Introduction
• The semantic web is the web of data
• Tim Berners-Lee outlined a set of rules for publishing data on the web:
– Use URIs as names for things
– Use HTTP URIs to enable people to look up those names
– Provide useful RDF information related to URIs that are looked up by machines or people
– Include RDF statements that link to other URIs
• Publishing data in a structured way facilitates its consumption and helps the consumer make the proper decision.
Introduction: CoC?
• A chain of custody (CoC) is a chronological document that accompanies all digital evidence in order to avoid later allegations of tampering with that evidence
• A forensic process consists of a set of phases, and each phase has its own CoC document
• Each CoC answers the 5Ws and 1H questions (who, when, why, where, what, and how)
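[Figure: a forensic process as a sequence of phases (Phase 1, Phase 2, ..., Phase x), each with its own CoC document (CoC1, CoC2, ..., CoCx)]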
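The idea that each phase carries a CoC answering the 5Ws and 1H can be pictured as a simple record. The following is a minimal sketch, not the authors' implementation: the class name `CoCRecord`, its fields, and the sample values are hypothetical and only illustrate how missing answers could be detected.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class CoCRecord:
    """One CoC document for a single forensic phase (5Ws and 1H)."""
    phase: str
    who: Optional[str] = None
    when: Optional[str] = None
    why: Optional[str] = None
    where: Optional[str] = None
    what: Optional[str] = None
    how: Optional[str] = None

    def unanswered(self) -> List[str]:
        """Return the names of the questions that still lack an answer."""
        return [f.name for f in fields(self)
                if f.name != "phase" and getattr(self, f.name) is None]


# Hypothetical sample values, for illustration only.
coc1 = CoCRecord(phase="Phase 1", who="Investigator A",
                 when="2011-03-20", what="Disk image")
print(coc1.unanswered())  # ['why', 'where', 'how']
```

A record with unanswered questions is exactly the situation that provenance metadata is later meant to complement.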
Introduction: forensic process
• The most common forensic process is the Kruse model: it includes the three essential steps required by any cyber forensic investigation.
• The 3 phases are: acquisition, authentication, and analysis
[Figure: the three phases of the Kruse model (Acquisition, Authentication, Analysis); each phase answers Who, When, Why, Where, What, and How, and produces its own CoC document (CoCAcqui, CoCAuth, CoCAnaly)]
Problem definition
• Cyber forensics (CF) is a growing field that needs to keep pace with digital technologies:
– Semantic web standards (RDF, URI, SPARQL) are fertile ground for representing CoCs
• Judges' awareness and understanding of digital evidence are not sufficient for them to evaluate it and make the proper decision about it:
– Representation using the Linked Data Principles (LDP) makes the represented resources dereferenceable and allows queries to be executed over them
• CoCs should be managed only by authorized people, and their integrity should be maintained throughout the investigation process:
– A security mechanism should be integrated with the represented data to preserve its integrity and control access to it
Why LDP for representing CoCs?
• CoC and LDP are metaphors for each other: both interlink entities
• Interpretation of terms and resources
• Inference capabilities (human or automated)
• Semantic vocabularies: a mixture of vocabularies (schemas) for representing forensic data
• Provenance metadata: to describe the provenance and complement missing answers about forensics data
• Knowledge representation, definition (reuse) of concepts, collaboration between different role players.
Solution Framework
Semantic Web Vocabularies
• Built-in vocabularies
– RDFS, OWL, DC, FOAF, etc.
• Custom vocabularies
– Created to describe a particular domain
– Used when the built-in vocabularies do not provide all the terms needed to describe the content of a data set
– Created using lightweight ontologies
Ex.: Definition of terms
[Figure: RDF graph defining custom terms in the namespace http://cyberforensics-coc.com/vocab/authentication#. Nodes: the classes Authentication (label "The Authentication Phase", comment "The Class of all authentication tasks") and Investigator (label "The Who question", comment "Class of all investigation"), the properties investigator and investigated, and foaf:Person. Edges: rdf:type (to rdfs:Class and owl:ObjectProperty), rdfs:subClassOf, rdfs:domain, rdfs:range, rdfs:label, rdfs:comment, and owl:inverseOf.]
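A custom vocabulary of this kind is just a set of RDF triples about the new terms. The sketch below writes a few such triples as N-Triples using plain Python. The namespace and term names come from the slide, but the exact pairing of subjects, predicates, and objects (in particular the domain and range of `investigator`) is an assumption, since the figure's edges are not fully recoverable from the transcript.

```python
AUTH = "http://cyberforensics-coc.com/vocab/authentication#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
FOAF = "http://xmlns.com/foaf/0.1/"

# (subject, predicate, object, object_is_literal)
# ASSUMPTION: the domain/range pairing is a guess, not taken from the slide.
VOCAB = [
    (AUTH + "Authentication", RDF + "type", RDFS + "Class", False),
    (AUTH + "Authentication", RDFS + "label", "The Authentication Phase", True),
    (AUTH + "Authentication", RDFS + "comment",
     "The Class of all authentication tasks", True),
    (AUTH + "Investigator", RDF + "type", RDFS + "Class", False),
    (AUTH + "Investigator", RDFS + "subClassOf", FOAF + "Person", False),
    (AUTH + "investigator", RDF + "type", OWL + "ObjectProperty", False),
    (AUTH + "investigator", RDFS + "domain", AUTH + "Authentication", False),
    (AUTH + "investigator", RDFS + "range", AUTH + "Investigator", False),
    (AUTH + "investigator", OWL + "inverseOf", AUTH + "investigated", False),
]


def to_ntriples(triples):
    """Serialize (s, p, o, is_literal) tuples as N-Triples lines."""
    lines = []
    for s, p, o, is_literal in triples:
        if is_literal:
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        else:
            obj = f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


print(to_ntriples(VOCAB))
```

Reusing `foaf:Person` as a superclass shows the mix of built-in and custom terms that the previous slide describes.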
Victim and Forensic Part
• This layer describes how the resources of the victim and forensic parts are represented: 303 URIs or hash URIs
• Each concept is described through several URIs:
– A URI identifying the concept itself
– A URI identifying the RDF/XML document describing the concept
– A URI identifying the HTML document describing the concept
• Forensic formats can also be represented in the same unified framework (e.g. AFF4, an open format for storing and processing digital evidence that represents forensic data as RDF triples)
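The difference between hash URIs and 303 URIs can be sketched in a few lines. With a hash URI, the client strips the fragment and fetches the document. With a 303 URI, the server answers "303 See Other" and redirects to the RDF/XML or HTML document depending on the Accept header. The helper names and the `.rdf`/`.html` suffix convention below are hypothetical, chosen only for illustration.

```python
from urllib.parse import urldefrag


def classify_uri(uri):
    """Return ("hash", document_url) for hash URIs, else ("303", uri)."""
    document, fragment = urldefrag(uri)
    if fragment:
        # The fragment is never sent to the server; the document is fetched.
        return "hash", document
    return "303", uri


def see_other(concept_uri, accept):
    """Pick the document a server could redirect to with 303 See Other.

    ASSUMPTION: documents live at the concept URI plus a file suffix.
    """
    suffix = ".rdf" if "application/rdf+xml" in accept else ".html"
    return 303, concept_uri + suffix


print(classify_uri(
    "http://cyberforensics-coc.com/vocab/authentication#Investigator"))
print(see_other("http://example.org/id/evidence01", "application/rdf+xml"))
print(see_other("http://example.org/id/evidence01", "text/html"))
```

Either way, the concept, its RDF/XML description, and its HTML description end up with three distinct URIs, matching the list above.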
CF-CoC Web Application form
• The CF-CoC web application form should be designed to:
– Import resources from the forensic parts
– Import resources from the victim parts
– Create and describe resources with the support of:
  • Existing terms imported from well-established vocabularies
  • New terms imported from the custom vocabulary created to describe the CoC of each forensic phase
– Add provenance metadata to the forensic data
Pattern Consumption Applications
• Juries can consume the CoC information through three main patterns:
– Browsing
– Searching
– Querying
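The three consumption patterns can be illustrated over a small in-memory set of triples. This is only a sketch: a real deployment would use a Linked Data browser, a search engine, and a SPARQL endpoint. The property names and values are taken from the AFF4 example in this deck, while attaching them all to one blank-node subject is an assumption.

```python
AFF4 = "http://74.208.87.195/evidence01/docs/aff4.xml#"
EVIDENCE = "_:A14471"  # ASSUMPTION: blank node standing for the evidence item

TRIPLES = [
    (EVIDENCE, AFF4 + "name", "Evidence01"),
    (EVIDENCE, AFF4 + "algorithm", "MD5"),
    (EVIDENCE, AFF4 + "location", "Machine1"),
    (EVIDENCE, AFF4 + "hash", "db64e67f5b41bbc0f3728c2eae4f07eb"),
]


def browse(subject):
    """Browsing: follow one resource and list everything stated about it."""
    return [(p, o) for s, p, o in TRIPLES if s == subject]


def search(keyword):
    """Searching: case-insensitive keyword lookup over the object values."""
    return [t for t in TRIPLES if keyword.lower() in t[2].lower()]


def query(s=None, p=None, o=None):
    """Querying: match a triple pattern; None acts as a wildcard."""
    return [t for t in TRIPLES
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


print(browse(EVIDENCE))
print(search("machine"))
print(query(p=AFF4 + "hash"))
```

The `query` function plays the role of a single SPARQL triple pattern such as `?s aff4:hash ?o`.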
Provenance Metadata
• The ability to track the origin of data is a key component in building trust, which is required for the admissibility of digital evidence
• Provenance information can be integrated within the forensic process using three different methods:
– Provenance vocabularies
– Open Provenance Model
– Named graphs: a named graph (NG) denotes a collection of triples together with relevant provenance information. The set of RDF triples is considered as one graph and is assigned a URI reference.
Ex.: Abstract NG of Kruse Model
[Figure: three named graphs, NGAcqui, NGAuth, and NGAnaly, annotated with provenance metadata: dc:date "20 Mar 2011", dc:creator "Jean Pierre", dc:publisher "Ann Marie"]
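A named graph can be modelled as quads: each triple carries the URI of the graph it belongs to, and provenance statements are then made about that graph URI. The sketch below assumes the metadata in the figure describes NGAuth and uses a hypothetical graph URI under example.org. The Dublin Core terms namespace is the one that appears elsewhere in this deck.

```python
DCT = "http://purl.org/dc/terms/"
AFF4 = "http://74.208.87.195/evidence01/docs/aff4.xml#"
NG_AUTH = "http://example.org/graphs/NGAuth"  # hypothetical graph URI

# (graph, subject, predicate, object); graph None means the default graph.
QUADS = [
    (NG_AUTH, "_:A14471", AFF4 + "name", "Evidence01"),
    (NG_AUTH, "_:A14471", AFF4 + "algorithm", "MD5"),
    # Provenance is stated about the graph URI itself.
    # ASSUMPTION: these values from the figure are attached to NGAuth.
    (None, NG_AUTH, DCT + "creator", "Jean Pierre"),
    (None, NG_AUTH, DCT + "publisher", "Ann Marie"),
    (None, NG_AUTH, DCT + "date", "20 Mar 2011"),
]


def graph_contents(graph):
    """Return the triples that belong to one named graph."""
    return [(s, p, o) for g, s, p, o in QUADS if g == graph]


def provenance_of(graph):
    """Return the provenance statements made about a named graph."""
    return {p: o for g, s, p, o in QUADS if g is None and s == graph}


print(graph_contents(NG_AUTH))
print(provenance_of(NG_AUTH))
```

Keeping the provenance outside the graph it describes lets each phase's graph be checked or signed as a unit.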
Ex.: Usage of custom term & metadata
[Figure: RDF graph inside the named graph NGAuth. A blank node (Genid:A14471) of rdf:type http://74.208.87.195/evidence01/docs/aff4.xml#evidence carries the AFF4 properties name "Evidence01", type "AFF4", algorithm "MD5", location "Machine1", and hash "db64e67f5b41bbc0f3728c2eae4f07eb", plus http://purl.org/dc/terms/date "2010-11-10 16:34:15Z". The resource http://digitaltest.ca/employee/Jean-Pierre, of rdf:type http://cyberforensics-coc.com/vocab/authentication#Investigator, is connected to the evidence by the custom property http://www.cyberforensics-coc.com/vocab/authentication#investigated.]
Public Key Infrastructure
• Applying PKI to Linked Open Data (LOD) transforms it into LCD
• It allows juries to verify the identity of the role players who participated in the forensic investigation
• The main idea behind applying PKI to LOD is public-key cryptography: the senders (role players and the CA) create signatures with their private keys, and the jury verifies these signatures with the senders' corresponding public keys.
PKI Scenario
[Figure: PKI scenario. Labelled steps: 1. KR-P; 2. KU-P; 3. R-CA{P, KU-P}; 4. KU-CA; 5. R-CA{P, KU-P}; the NG is signed ("Sign NG")]
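The signing and verification flow can be sketched with textbook RSA. This is a toy and is not secure: it has no padding and uses fixed primes, and a real system would rely on X.509 certificates and a vetted cryptographic library. The reading of the figure is also an assumption: KR-P and KU-P are taken to be the private and public keys of role player P, and R-CA{P, KU-P} a certificate in which the CA signs the binding between P and KU-P.

```python
import hashlib


def make_keypair(p, q, e=65537):
    """Toy RSA key pair from two primes. NOT secure; illustration only."""
    n, phi = p * q, (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse, Python 3.8+
    return (e, n), (d, n)  # (public key, private key)


def digest(message, n):
    """Hash a string with SHA-256 and reduce it modulo n."""
    h = hashlib.sha256(message.encode("utf-8")).digest()
    return int.from_bytes(h, "big") % n


def sign(message, private_key):
    d, n = private_key
    return pow(digest(message, n), d, n)


def verify(message, signature, public_key):
    e, n = public_key
    return pow(signature, e, n) == digest(message, n)


# Key pairs for role player P and for the CA (Mersenne primes, demo only).
ku_p, kr_p = make_keypair(2**61 - 1, 2**89 - 1)
ku_ca, kr_ca = make_keypair(2**107 - 1, 2**127 - 1)

# The CA certifies the binding between P's identity and P's public key.
player = "http://digitaltest.ca/employee/Jean-Pierre"
binding = f"{player}|{ku_p[0]}|{ku_p[1]}"
certificate = sign(binding, kr_ca)

# P signs a serialized named graph (a hypothetical one-triple graph here).
ng_auth = ('_:A14471 <http://74.208.87.195/evidence01/docs/aff4.xml#name> '
           '"Evidence01" .')
ng_signature = sign(ng_auth, kr_p)

# The jury checks the certificate with KU-CA, then the graph with KU-P.
print(verify(binding, certificate, ku_ca))
print(verify(ng_auth, ng_signature, ku_p))
print(verify(ng_auth + " tampered", ng_signature, ku_p))
```

The jury needs to trust only the CA's public key: the certificate vouches for P's key, and P's key vouches for the graph.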
Conclusion and Perspective
1. Combines several fields in one framework: cyber forensics, the semantic web, provenance vocabularies, the PKI approach, and LDP.
2. Underlines that each phase of a forensic process should have its own CoC, whatever the forensic model.
3. Provides a framework that leads to the creation of an assistance system for juries in a court of law.
4. Integrates provenance metadata with the victim/forensic data in order to answer questions about the origin of the information published by the role players during the forensic investigation.
5. Uses the PKI approach to verify the identity of each player participating in the forensic process.
Transforms the tangible CoC into an electronic one consumable by people and machines.
Future Work
• The current framework will be extended with educational resources that serve as aids
• These resources will help the role players publish the represented data and help juries consume it