camflow: managed data-sharing for cloud services · camflow: managed data-sharing for cloud...

14
1 CamFlow: Managed data-sharing for cloud services Thomas F. J.-M. Pasquier, Member, IEEE, Jatinder Singh, Member, IEEE, David Eyers, Member, IEEE and Jean Bacon Fellow, IEEE, Abstract—A model of cloud services is emerging whereby a few trusted providers manage the underlying hardware and communica- tions whereas many companies build on this infrastructure to offer higher level, cloud-hosted PaaS services and/or SaaS applications. From the start, strong isolation between cloud tenants was seen to be of paramount importance, provided first by virtual machines (VM) and later by containers, which share the operating system (OS) kernel. Increasingly it is the case that applications also require facilities to effect isolation and protection of data managed by those applications. They also require flexible data sharing with other applications, often across the traditional cloud-isolation boundaries; for example, when government, consisting of different departments, provides services to its citizens through a common platform. These concerns relate to the management of data. Traditional access control is application and principal/role specific, applied at policy enforcement points, after which there is no subsequent control over where data flows; a crucial issue once data has left its owner’s control by cloud-hosted applications and within cloud-services. Information Flow Control (IFC), in addition, offers system-wide, end-to- end, flow control based on the properties of the data. We discuss the potential of cloud-deployed IFC for enforcing owners’ data flow policy with regard to protection and sharing, as well as safeguarding against malicious or buggy software. In addition, the audit log associated with IFC provides transparency and offers system-wide visibility over data flows. This helps those responsible to meet their data management obligations, providing evidence of compliance, and aids in the identification of policy errors and misconfigurations. We present our IFC model and describe and evaluate our IFC architecture and implementation (CamFlow). This comprises an OS level implementation of IFC with support for application management, together with an IFC-enabled middleware. Index Terms—Compliance, Security, Audit, Cloud Computing, Information Flow Control, Middleware, PaaS 1 I NTRODUCTION AND MOTIVATION A MODEL of cloud services is emerging whereby a few trusted providers manage the underlying hard- ware and communications infrastructure—datacenters with worldwide replication to achieve high data in- tegrity and availability at low latency. Many compa- nies build on this infrastructure to offer higher level cloud services, for example Heroku is a PaaS built on Amazon’s EC2, above which SaaS offerings can be built (e.g. the LIFX smart lightbulb cloud service on top of the Heroku platform). From the start, protection was a paramount concern for the cloud as infrastructure is shared between tenants. Strong tenant isolation was provided by means of totally separated virtual machines (VMs) [1], [2] and more recently, isolated containers have been provided that share a common OS kernel [3]. Increasingly, cloud-hosted applications may need not only protection (and isolation) from other applications but also have requirements for flexible data sharing, often across VM and container boundaries. An example is the Thomas F. J.-M. Pasquier, Jatinder Singh and Jean Bacon are with the Computer Laboratory, University of Cambridge, UK. E-mail: fi[email protected] David Eyers is with the Department of Computer Science, University of Otago, New Zealand. E-mail: [email protected] Manuscript received 31 Mar. 2015; revised 25 Aug. 2015; accepted 11 Sept. 2015; current version 6 Oct. 2015. UK GCloud 1 initiative, a government platform designed to encourage small companies to provide cloud-hosted applications. These applications need to be composed and made to interoperate to support citizens’ needs for online services. Similarly, the Massachusetts Open Cloud [4] is a marketplace (Open Cloud Exchange (OCX)) to encourage small business development. Solutions are open and one may build on the services of another. The aim is to create a catalyst for the economic development of business clusters. End-users of cloud services still need to be assured that their data is protected from leakage to other parties by their cloud hosts, due to software bugs or mis- configurations, also safeguarded to the extent possible against insider attacks and external threats. But increas- ingly, they also need to be able to access their own data across applications and to share their data with others, according to the policies they specify. Contain- ment mechanisms, such as VMs and containers, provide strong isolation between applications, but do not support these sharing requirements. The incorporation of cloud services within ‘Internet of Things’ (IoT) architectures [5] is another driver of the requirement for both protection and cross-application data sharing, given these IoT ar- chitectures’ strong emphasis on (safe) interaction. For example, a patient being monitored at home may store sensor-gathered medical data in the cloud and share it 1. https://www.gov.uk/digital-marketplace arXiv:1506.04391v3 [cs.CR] 21 Dec 2015

Upload: others

Post on 03-May-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: CamFlow: Managed data-sharing for cloud services · CamFlow: Managed data-sharing for cloud services Thomas F. J.-M. Pasquier, Member, IEEE, Jatinder Singh, Member, IEEE, David Eyers,

1

CamFlow Managed data-sharingfor cloud services

Thomas F J-M Pasquier Member IEEE Jatinder Singh Member IEEE David Eyers Member IEEEand Jean Bacon Fellow IEEE

AbstractmdashA model of cloud services is emerging whereby a few trusted providers manage the underlying hardware and communica-tions whereas many companies build on this infrastructure to offer higher level cloud-hosted PaaS services andor SaaS applicationsFrom the start strong isolation between cloud tenants was seen to be of paramount importance provided first by virtual machines (VM)and later by containers which share the operating system (OS) kernel Increasingly it is the case that applications also require facilitiesto effect isolation and protection of data managed by those applications They also require flexible data sharing with other applicationsoften across the traditional cloud-isolation boundaries for example when government consisting of different departments providesservices to its citizens through a common platformThese concerns relate to the management of data Traditional access control is application and principalrole specific applied at policyenforcement points after which there is no subsequent control over where data flows a crucial issue once data has left its ownerrsquoscontrol by cloud-hosted applications and within cloud-services Information Flow Control (IFC) in addition offers system-wide end-to-end flow control based on the properties of the data We discuss the potential of cloud-deployed IFC for enforcing ownersrsquo data flowpolicy with regard to protection and sharing as well as safeguarding against malicious or buggy software In addition the audit logassociated with IFC provides transparency and offers system-wide visibility over data flows This helps those responsible to meet theirdata management obligations providing evidence of compliance and aids in the identification of policy errors and misconfigurationsWe present our IFC model and describe and evaluate our IFC architecture and implementation (CamFlow) This comprises an OS levelimplementation of IFC with support for application management together with an IFC-enabled middleware

Index TermsmdashCompliance Security Audit Cloud Computing Information Flow Control Middleware PaaS

F

1 INTRODUCTION AND MOTIVATION

A MODEL of cloud services is emerging whereby afew trusted providers manage the underlying hard-

ware and communications infrastructuremdashdatacenterswith worldwide replication to achieve high data in-tegrity and availability at low latency Many compa-nies build on this infrastructure to offer higher levelcloud services for example Heroku is a PaaS built onAmazonrsquos EC2 above which SaaS offerings can be built(eg the LIFX smart lightbulb cloud service on top ofthe Heroku platform) From the start protection wasa paramount concern for the cloud as infrastructureis shared between tenants Strong tenant isolation wasprovided by means of totally separated virtual machines(VMs) [1] [2] and more recently isolated containers havebeen provided that share a common OS kernel [3]

Increasingly cloud-hosted applications may need notonly protection (and isolation) from other applicationsbut also have requirements for flexible data sharing oftenacross VM and container boundaries An example is the

bull Thomas F J-M Pasquier Jatinder Singh and Jean Bacon are with theComputer Laboratory University of Cambridge UKE-mail firstnamelastnameclamacuk

bull David Eyers is with the Department of Computer Science University ofOtago New ZealandE-mail dmecsotagoacnz

Manuscript received 31 Mar 2015 revised 25 Aug 2015 accepted 11 Sept2015 current version 6 Oct 2015

UK GCloud1 initiative a government platform designedto encourage small companies to provide cloud-hostedapplications These applications need to be composedand made to interoperate to support citizensrsquo needs foronline services Similarly the Massachusetts Open Cloud[4] is a marketplace (Open Cloud Exchange (OCX)) toencourage small business development Solutions areopen and one may build on the services of another Theaim is to create a catalyst for the economic developmentof business clusters

End-users of cloud services still need to be assuredthat their data is protected from leakage to other partiesby their cloud hosts due to software bugs or mis-configurations also safeguarded to the extent possibleagainst insider attacks and external threats But increas-ingly they also need to be able to access their owndata across applications and to share their data withothers according to the policies they specify Contain-ment mechanisms such as VMs and containers providestrong isolation between applications but do not supportthese sharing requirements The incorporation of cloudservices within lsquoInternet of Thingsrsquo (IoT) architectures [5]is another driver of the requirement for both protectionand cross-application data sharing given these IoT ar-chitecturesrsquo strong emphasis on (safe) interaction Forexample a patient being monitored at home may storesensor-gathered medical data in the cloud and share it

1 httpswwwgovukdigital-marketplace

arX

iv1

506

0439

1v3

[cs

CR

] 2

1 D

ec 2

015

2

with selected carers medical practitioners and medicalresearch (big-data) repositories via cloud-hosted andmediated services Once data has left end-usersrsquo homesfor cloud services they need to be assured that it is onlyaccessed as they specify

Traditional access control tends to be principalrolespecific and apply only within the context of a partic-ular applicationservice Controls are applied at policyenforcement points after which there is no subsequentcontrol over where data flows Once data has left thedirect control of its owner for example after beingshared with others it is difficult using traditional accesscontrols to ensure and demonstrate that it is not leakedIf a leak is suspected it often cannot be establishedwhether this is a breach of confidentiality by a person ordue to buggy or misconfigured cloud service software

Encryption offers protection by restricting access tointelligible data even beyond the boundary of onersquostechnical control However encryption hinders flexiblenuanced data sharing in that key management (dis-tribution revocation) is difficult Further traceability islimited as being mathematically based there is generallyno feedback as to whenwhere decryption occurs anda compromised key or broken encryption scheme at anytime in the future places data at risk As such it isimportant that data flows are managed and auditedeven if data items are encrypted

Although contracts exist between cloud providers andtenants and cloud services are increasingly subject toregulation [6] there is at present no way to establish thatproviders remain in compliance with these agreementsand requirements Also there are often requirementsthat data should pass through certain processes egencryption or anonymisation There is currently no clearmechanism to express such requirements and demon-strate they have been consistently enforced

An approach to maintaining the association of datawith policy is to use ldquosticky policiesrdquo [7] Here owner-specified management constraints are attached to en-crypted data Decryption is only allowed by parties ac-cepting the management constraints and able to enforcethem This forms the basis for establishing contractualrelationships between data owners and service providersor other applications However this approach requirestrust in a (relatively large amount of) software Furtherthe enforcement is either at too coarse a granularity orprohibitively expensive This is further explored in sect24

As an alternative Information Flow Control (IFC)augments traditional access control by offering contin-uous system-wide end-to-end flow control based onproperties of the datamdashfor example ldquomedical data mayonly be used for research purposes after going throughconsent checking and anonymisationrdquo IFC allows secu-rity contexts to be defined system-wide and guaranteesnon-interference between them This is achieved by tagsapplied to entities (eg processes files database en-tries) inseparable from the entities they are associatedwith Every exchange of data between entities is verified

against security-context-domain relationships created bythe tags thus allowing tight control over any subsequenttransfers of the data

In this paper we present CamFlow (Cambridge FlowControl Architecture) We outline CamFlowrsquos IFC modeland implementation which comprises a new operatingsystem (OS) level implementation of IFC as a LinuxSecurity Module (LSM) with support for applicationmanagement together with an IFC-enabled middlewareIFC tags are checked on OS system calls and on mes-sage passing by the middleware to determine whetherdata flows are permissible Log records can be madeefficiently of all attempted flows whether permittedor rejected and this log provides a possible basis foraudit data provenance and compliance checking By thismeans it can be checked whether application level policyhas been enforced and whether cloud service provisionhas complied with contractual obligations

We argue that incorporating IFC into the underlyingPaaS-provided OSs as a small trusted computing basewould greatly enhance the trustworthiness of cloudservices whether public or private and hence all theirhosted servicesapplications Our evaluation shows thatIFC would incur acceptable overhead and our IFCmodel is designed to ensure that application developersneed not be aware of IFC although some applicationproviders may wish to take explicit advantage of IFCWe demonstrate the feasibility of our approach via anIFC-enabled framework for web services see sect7Contributions Our main contribution is to demonstratethe feasibility of providing IFC as part of cloud softwareinfrastructure and showing how IFC can be made towork end-to-end system-wide In addition to discussingthe rsquobig picturersquo in this paper we also present a newkernel implementation of IFC and a new audit functionOur approach enables (1) protection of applicationsfrom each other (non-interference) (2) flexible manageddata sharing across isolation boundaries (3) preventionof data leakage due to bugsmisconfigurations (4) ex-tension of access control beyond application boundaries(5) increased transparency through detailed logs of in-formation flow decisionssect2 gives background in protection and IFC then sect3

presents the essentials of the CamFlow IFC model withexamples sect4 and sect5 describe our new OS-level im-plementation of IFC as a LSM and its integration viatrusted processes with an IFC-enabled middleware stor-age services etc sect6 emphasises that audit in IFC systemsproduces logs capable of being processed by lsquobig-datarsquoanalytics tools Audit is central to establishing prove-nance and for providers to demonstrate compliance withcontract and regulation sect7 shows how standard webservices are supported transparently by the CamFlowarchitecture only a privileged application managementframework need be aware of IFC and unprivilegedapplication instances can run unchanged In all casesevaluation is included within the section sect8 summarisesconcludes and suggests future work

3

2 BACKGROUND

We first define the scope of current isolation mech-anisms highlighting the need for flexible data shar-ing at application-level granularity ie where applicationsmanage their own security concerns as well as strongisolation between tenants andor applications As anintroduction to IFC we outline the evolution of IFCmodels Related work on IFC implementation at the OSlevel and within distributed systems is given with therelevant sections We end with a brief comparison of IFCwith taint tracking (TT) and sticky policies

21 IFC Models

In 1976 Denning [8] proposed a Mandatory AccessControl (MAC) model to track and enforce rules oninformation flow in computer systems In this modelentities are associated with security classes The flow ofinformation from an entity a to an entity b is allowedonly if the security class of b (denoted b) is equal to orhigher than a This allows the no-read up no-write downprinciple of Bell and LaPadula [9] to be implementedto enforce secrecy By this means a traditional militaryclassification public secret top secret can be implementedA second security class can be associated with eachentity to track and enforce integrity (quality of data)no read down no write up as proposed by Biba [10] Acurrent example might allow input of information froma government website in the govuk domain but forbidthat from ldquoJoersquos Blogrdquo Using this model we are ableto control and monitor information flow to ensure datasecrecy and integrity

In 1997 Myers [11] introduced a Decentralised IFCmodel (DIFC) that has inspired most later work Thismodel was designed to meet the changing needs ofsystems from global static hierarchical security levelsto a more flexible system able to capture the needsof different applications In this model each entity isassociated with two labels a secrecy label and an integritylabel to capture respectively the privacyconfidentialityof the data and the reliability of a source of data Eachlabel comprises a set of tags each of which representssome security concern Data is allowed to flow if thesecurity label of the sender is a subset of the label of thereceiver and conversely for integrity

Implementations of a decentralised model akin toMyersrsquo include a sensitive embedded system for BMWcars [12] and XBook [13] in a social media contextOur own model is described in sect3 When implementedfrom the OS kernel level applications running underIFC enforcement do not need to be trusted for the datamanagement policy to be properly enforced [14]

22 Protection via VMs and Containers

Isolation of tenants in cloud platforms is throughhypervisor-supported virtual machines [1] [2] or OS-provided containers [3] However flexible sharing mech-anisms are also required to manage data exchange be-tween applications contributing to more complex sys-

tems or to achieve end-user goals For example gov-ernment applications might access citizensrsquo records forvarious purposes a userrsquos data from different applica-tions might together contribute to evidence related tohealth or wellbeing

At present the sharing of information between appli-cations tends to involve a binary decision (ie to shareor not) as for example in Google pods (containers)2

Whole resources can be shared but no control over datausage between applications is provided Furthermorethere are no means for preventing leakage outside ofthe mechanisms implemented by the individual appli-cationsservices

Solutions have been proposed to provide intra-application sandboxes (down to individual end-users)[15] but such schemes are difficult to scale requirechanges in application logic and still do not providecontrol beyond isolation boundaries (ie again loss ofcontrol once the data is shared)

IFC has been proposed to guarantee the proper usageof data by social network applications [13] The aim is toprovide purpose-based disclosure via IFC [16] betweenisolated components thus guaranteeing that shared datacan only be used for a well-defined and agreed-uponpurpose

IFC is by no means proposed as a replacement foraccess control VMs or containers but rather as a com-plement to those techniques to provide flexible manageddata-sharing IFC would allow tenants and end-users tomaintain control (within an IFC-enforcing world) anddefine policy applying to their data consistently andbeyond isolation and application borders

23 Taint Tracking (TT) Systems

Runtime dynamic TT is similar to IFC but with less func-tionality TT systems use one tag type ldquotaintrdquo instead ofsecrecy and integrity tags Tags propagate with data anddata flows may be logged An entity that inputs taggeddata acquires the datarsquos tag(s) Data flow constraintsare only enforced at specified sink points for examplewhen data attempts to leave a mobile phone [17] Policyis applied at sink points such as preventing privateunanonymised or unencrypted data from flowing orstrictly controlling to where data may flow

An example of TT used for integrity purposes is totaint data from untrusted sources eg user input froma TCP stream in a web application environment andenforce that it is sanitised before being processed [18]This simple mechanism prevents injection attacks thatplague badly designed web applications An example ofTT used for confidentiality purposes is to taint sensitiveinformation eg a list of contacts in a mobile phone andtrack it through this closed system [17] Data leaving thesystem (ie the phone) is analysed to ensure it does notcontain sensitive information Data containing sensitiveinformation should only leave to a number of closely

2 httpscloudgooglecomcontainer-enginedocspods

4

controlled destinations such as the cloud backup contactlist This approach aids the detection of malicious ap-plications attempting to steal user-sensitive informationand send it to third parties Equally this type of concerncan be captured through the use of IFC policies

One concern with TT systems is that there is a gap intime between the occurrence of the issue (eg a leak anattack) and when it is detected [19] ie problems becomeevident only when the tainted data reaches a sink (en-forcement point) Depending on the degree of isolationbetween the different parts of the system and the num-ber of system components involved this tainted datamay have lsquocontaminatedrsquo much of the system Whilethis can be managed in smaller closed environmentsit is less appropriate for cloud services in general IFCpolicies present the clear advantage to prevent problemsas they occur and to stop their effects propagating to apotentially large part of the system

Some argue that TT is simpler to use than IFC andincurs lower overhead but when the enforcement issystemic and the granularity identical the overheadsare similar (compare [17] and the evaluation in sect4 andsect5) Indeed the complexity of verifying IFC policy (seesect3) is comparable to the cost of propagating taint Forboth techniques most of the overhead comes from themechanism for intercepting data exchange

24 Sticky Policies

IFC can be seen as a mechanism for enforcing policythe labels associated with entities represent applicationpolicies IFC is a simple low-level mechanism Stickypolicy approaches also consider the enforcement of data-bound policy but at a higher-level

Casassa-Mont et al [20] first introduced sticky policieswhich involves encrypting data along with a list of poli-cies to be enforced on that data To obtain the decryptionkey from a Trusted Authority (TA) a party must agree toenforce the policies associated with the data This agree-ment may be considered as part of forming a contractuallink between the data owner and the service providerWork has continued in the area [21]ndash[23] Sticky policiestypically enforced at the application-level are generallymore complex and heavyweight than the simple secrecyand integrity constraints of IFC As such sticky policiestend only to be enforced at particular points eg atadministrative boundaries IFC on the other hand aswe show in sect412 can be enforced continuously at areasonable cost In sect3 we discuss how complex policiesmight be built from IFC labels

Further the sticky policy approach builds upon thetrust established between the data owner the TAs andservices that use the data A non-compliant service couldbe black-listed but only if and when a breach of agree-ment is detected and the TA updated Our IFC approachbuilds only upon the trust between the data owner andthe cloud provider Services and applications running ontop of the cloud provider platform need not be trustedWe believe this to be a great improvement to the overall

Alice RecordS(A) = alicemedical

I(A) = hosp-dev consent

Bob RecordS(B) = bobmedical

I(B) = hosp-dev consent

Alicersquos app instanceS(C) = alicemedical

I(C) = consentAllowed FlowPrevented Flow

Fig 1 An allowed safe flow and prevented flowstrustworthiness of the system

3 CAMFLOW-MODEL IFC FOR THE CLOUD

IFC operates to ensure that only permitted flows ofinformation can occur by enforcing data flow policy dy-namically end-to-end within and across applicationsservices Entities to which IFC constraints are appliedcan include a MapReduce worker instance [24] a file aprocess a database entry [25] etc In CamFlow IFC is ap-plied continuously typically on every system call for anIFC-enabled OS and on communication mechanisms forenforcement across applicationsruntime environmentsIFC policy should therefore be as simple as possibleto allow verification human understanding and to min-imise runtime overhead Indeed there is no need for IFCto encapsulate every possible policy rather it augmentsother control mechanisms and can help enforce theirpolicies

31 Tags and Labels

We define tags that are tokens each representing somesecurity concern over secrecy or integrity The tagbob-private could for example represent Bobrsquos personaldata We associate every entity in the system with twolabels (sets of tags) an entity A has a secrecy label S(A)and an integrity label I(A) The state of these labels isthe security context of the entity The power of IFC is thatit guarantees non-interference between security contexts[26] [27]Example ndash secrecy Suppose a patient Bob is dischargedfrom hospital to be medically monitored at home Thedata streams from his sensors are transferred to a cloudservice and are to be shared with his medical team atthe hospital The data items from his devices are taggedwith medical bob in their secrecy labelsExample ndash integrity The cloud-based home monitoringsupport service needs to be assured that the data itreceives is from a hospital-issued device Each sensingdevice is checked and issued with the tag hospital-devicein its integrity label

Fig 1 illustrates information flow constraints beingapplied over both secrecy and integrity dimensions

32 Decentralised Privileges and Security Contexts

In decentralised IFC (DIFC) any active entity can createnew tags Tag creation is typically carried out by appli-cation managers when setting up application instancesWhen an active entity creates a new tag either for secrecyor integrity this process is given the correspondingprivilege to add and remove the tag to its secrecy orintegrity label respectively If an active entity A has a

5

Medical RecordS = personal

I = empty

Consent Checker

S = personalI = cons

Anonymiser

S = researchI = cons anon

Research DatabaseS = research

I = cons anon

Researcher PortalS = research

I = cons anonAllowed FlowPrevented Flow

Research Project XXNHS Cloud S = personalI = empty

S = personalI = cons

Context change

Fig 2 Medical data declassified and endorsed for research purposes

privilege to add t to its secrecy label we denote thist isin P+

S (A) and to remove t from its secrecy labelt isin PminusS (A) (and similarly P+

I (A) and PminusI (A) are the priv-ileges for integrity) An active entity may therefore havefour privilege sets in addition to its security contextApplication managers will normally set up applicationinstances in security contexts without the privileges tochange them An example is given in sect7

33 Creating a New Entity

We define A rArr B as the operation of the entity Acreating the entity B An example is creating a processin a Unix-style OS by clone We have the following rulesfor creation

if ArArr B then

S(B) = S(A)

I(B) = I(A)

That is the created entity inherits the security contextof its creator These rules force the creating entity toexplicitly change its security context to that required forthe entity to be created We motivate this below in sect342Note that only labels pass to the created entity privilegeshave to be passed explicitly

34 Security

The purpose of IFC models is to regulate flows betweenentities and effect label changes and privilege delega-tion

Definition 1 A system is secure in the CamFlow IFC modelif and only if all allowed messages are safe (Definition 2) allallowed label changes are safe (Definition 3) and all privilegedelegation is safe (Definitions 4 and 5)

341 Information Exchange

IFC prevents data leakage by controlling the exchangeof information We follow the classic pattern for IFC-guaranteed secrecy (no read up no write down [9]) andintegrity (no read down no write up [10])

Definition 2 A flow of information A rarr B is safe if andonly if

Ararr B iff S(A) sube S(B) and I(B) sube I(A)Example ndash secrecy enforcement Consider our exam-ple of patient monitoring after discharge from hospital

where the patientrsquos devices are tagged with medical bobin their secrecy labels In order for the cloud service tobe able to receive this data it must also include the tagsmedical bob in its secrecy label Therefore an applicationinstance accessing Bobrsquos medical data must be labelled assuch In sect7 we describe how applications can be designedto meet such requirementsExample ndash integrity enforcement The cloud-basedhome monitoring support service needs to be assuredthat the data it receives is from a hospital-issued de-vice To achieve this the service has an integrity taghospital-issued in its integrity label and will only acceptdata from devices with tags hospital-issued

342 Label Change

Under the above constraints information flows arerestricted to equal or increasing secrecy constraintsand equal or decreasing integrity constraints Howeverdata may undergo transformations andor checks thatchange its security properties For example moving datathrough an anonymisation engine renders the data lesssensitive so less strict secrecy constraints can applyto the anonymised output In the integrity dimensiondata may go through a validation process on inputthus becoming more trustworthy In CamFlow only theprocess itself is able to change its secrecy and integritylabels which requires the appropriate privileges andmust be explicitly requested

Definition 3 A label change noted A Aprime is safe if andonly if for a label X (either S or I) and a tag t

X(Aprime) = X(A) cup t if t isin P+X (A)

ORX(Aprime) = X(A) t if t isin PminusX (A)

Declassifiers and endorsers are the entities with theprivileges to perform security context transformationsDeclassifiers change the secrecy properties and endorserschange the integrity propertiesExample ndash declassification A medical record systemis held in a private cloud Research datasets may becreated from these records but only from records wherethe patients have given consent Also only anonymised

6

data may leave the private protected environment Weassume a health service approved anonymisation proce-dure Fig 2 shows the anonymiser inputting data taggedas personal and declassifying the data by outputting datawith secrecy tag researchExample ndash endorsement In the same example theResearch Database is on a public cloud and may onlyreceive research data tagged with consent anon in itsintegrity label In the private cloud we see a pro-cess that selects appropriate records for specific re-search purposes checks for patient consent and addsthe tag consent to the integrity label of its output Theanonymiser process can only input data with this tag itanonymises the data and outputs data with the tag anonin its integrity label

Some previous work [14] [28] allows implicit declas-sification and endorsement That is if an active entity hasthe privilege to declassifyendorse and the privilegeto return to its original state (ie for declassificationendorsement over t the entity has privilege tminus and t+)the declassificationendorsement may occur implicitlywithout the need for the entity to make the label changesWe believe that this could in practice lead to unintentionaldata disclosure Suppose an entity has the privilege todeclassify top-secret information The requirement forexplicit label change makes it unlikely that the entity willsend such data accidentally to an unintended recipientOur model has stronger constraints that require endorse-ment and declassification operations to be programmedexplicitly

343 Privilege delegation

An entity is only able to delegate a privilege it owns

Definition 4 A privilege delegation is safe if and only ift isin PplusmnX (A)

35 Conflict of Interest

In CamFlow alone among IFC systems privilege dele-gation is further restricted by Conflict of Interest (CoI)(or Separation of Duty (SoD)) enforcement The receivingentity A must not be put in a situation where it wouldbreak a CoI constraint By this means an applicationmanager is prevented from creating an application in-stance with access to conflicting data

Definition 5 An entity A does not violate a CoI C if andonly if∣∣∣(S(A) cup I(A) cup P+

S (A) cup P+I (A) cup Pminus

S (A) cup PminusI (A)

)cap C

∣∣∣ le 1

Example ndash conflict of interest A CoI might arise whendata relating to competing companies is available in asystem In a hospital context this might involve theresults of analysis of the usage and effects of drugs fromcompeting pharmaceutical companies The companiesmight agree to this analysis only if their data is guar-anteed to be isolated ie not leaked to other companies

The hospital may be participating in drug trials andwant to ensure that information does not leak between

Camflow-LSM

IFC Library

User Space

Kernel Space

Application Process

Kernel

DIFC Management APISystem Calls

Hardware

IFC LibraryTrusted Process

Authorise and Audit

Fig 3 The interactions of the IFC Security Module (LSM)and a Trusted Process within an OStrials suppose a conflict is C = PfizerGSKRoche and some data (eg files) are labelled PfizerData[S =Pfizer I = empty] and RocheData[S = Roche I = empty] TheCoI described ensures that it is not possible for a singleentity (eg an application instance) to have access toboth RocheData and PfizerData either simultaneously orsequentially ie enforcing that Roche-owned data andPfizer-owned data are processed in isolation

The next sections describe the CamFlow platform thatenforces the IFC constraints described

4 OS ENFORCEMENT

At the heart of the architecture is a minimal kernel mod-ule dedicated solely to OS-level IFC enforcement Themodule is trusted to enforce IFC transparently acrossall flows between entities within the OS User spaceprocesses can directly interact with the kernel moduleeg to delegate privileges (sect34) through a pseudo-filesystem abstracted through a high level API Higher levelconsiderations and policies can be managed throughspecifically defined Trusted Processes (see sect42) The localmachine architecture is represented in Fig 3

Note that IFC operates alongside and complementsother security technologies It is not a cloud securitypanacea challenges regarding covert and side channelsand direct access to hardware by an attacker remain asthey do for systems in general There are approachesthat can help address these security threats but manyare highly disruptive (eg synchronisation approachesto reducing timing channels) and are infrequently usedOther threats may be easier to mitigate and solutionsmay be used when appropriate (eg on-disk encryption)

41 CamFlow-LSM

Our kernel module CamFlow-LSM is implemented as aLinux Security Module (LSM) [29] Although our work isLinux-specific a similar approach could be used on anysystem providing LSM-like security hooks Unlike otherDIFC OS implementations [14] [28] our kernel patch isself-contained strictly limited to the security moduledoes not modify any existing system calls and followsLSM implementation best practice This allows amongother things LSM stacking [30] [31] and coexistencewith other security modules such as eg SELinux [32]or AppArmor [33] and complements their MAC enforce-ment with decentralised information flow policies

We assume that the rest of the kernel can be trustedand does not interfere with the IFC enforcement mech-anism LSM system hooks have been statically anddynamically verified [34]ndash[36] and our implementation

7

inherits from LSM the formal assurance of IFCrsquos correctplacement on the path to any controlled kernel objectThis is sufficient to guarantee that we control flow andrecord audit on any operation on a controlled kernelobject

Since applications running on SELinux [32] or Ap-pArmor [33] need not be aware of the MAC policybeing enforced we see no reason to force applicationsrunning on an IFC system to be aware of IFC only thoseperforming declassification or endorsement operationsare necessarily aware This implementation choice isimportant cloud providers can incorporate IFC withoutrequiring changes in the software deployed by ten-ants Alternatively applications that wish to managetheir own IFC constraints can declare policy through apseudo-filesystem (as is typical for LSMs) abstracted bya user space library and enforced transparently by theIFC mechanism

The LSM framework calls security hooks when accessto a kernel object is attempted Security metadata can beassociated with kernel objects and is used by the LSMmodule to make access decisions Tags and privilegesare represented by 64-bit opaque nonces associated withkernel objects such as processes inodes files sharedmemory objects messages etc On interaction betweenkernel objects CamFlow-LSM security hooks are calledto enforce data-flow policy (sect341) or propagate tags onentity creation (sect33) as appropriate

Only active entities (processes) have mutable labelsand privileges all other (passive) entities have im-mutable labels and no privileges

Privileges are allocated by the kernel and owned bythe creating process (any process can create tags and theassociated privileges in a decentralised fashion) Privi-leges can be passed to other processes users or groupsCamFlow-LSM verifying that constraints on privilegedelegation (sect343) and conflict of interest (sect35) are notviolated A process can add or remove a tag from itslabel if it owns the appropriate privilege (following IFCconstraints described in sect342) if the current user ownsthe privilege or if the current group owns the privilegeHow tags are shared and managed must be consideredwith care when designing an application and the systemmust be administered accordingly

411 Checkpointing and Restoration

Checkpointing a process involves halting its executionallowing it to be restarted at a later stage and enablingmigration eg [37] LSM state is normally saved andrestored by the checkpointing system eg [38] andour module further exports an API to more efficientlyserialise and restore security context

Furthermore self-checkpointing and restoring the pre-vious state of a process has been demonstrated [39] to bea beneficial feature for IFC systems This is particularlyuseful for processes serving requests In such a scenariothe state of the process is saved after initialisation Whena request is received the serving process sets itself up

0 10 20 30 40 50 60 70 80 90

sys pipe

sys write

sys read

sys clone

microsdyn label IFC LSM Native

Fig 4 Overhead introduced into the OS by CamFlowLSM (x-axis time in micros)in the security context appropriate to serve the requestAfter the request is served (or a series of requests if thesystem is session-based as described in sect7) the processrestores its memory state and security context to whatthey were immediately after initialisation This improvesperformance and prevents data leaks between securitycontexts

412 OS Evaluation

We tested the CamFlow-LSM module on Linux Kernelversion 3178 (012015) from the Fedora distribution3

The tests are run on an Intel 22Ghz i7 CPU and 6GiBRAM machine

Measurements are done using the Linux tool ftrace [40]to provide a microbenchmark Two processes read fromand write to a pipe respectively Each has 20 tags in itssecurity label substantially more than we have seen aneed for in current use cases We measure the overheadinduced by creating a new process (sys clone) creatinga new pipe (sys pipe) writing to the pipe (sys write) andreading from the pipe (sys read) The results are givenin Fig 4

We can distinguish two types of induced overheadverifying an IFC constraint (sys read sys write) and allo-cating labels (sys clone sys pipe) The sys clone overheadis roughly twice that of sys pipe as memory is allocateddynamically for the active entityrsquos labels and privilegesRecall that passive entities have no privileges Overheadmeasurements for other system callsdata structures areessentially identical as they rely on the same underlyingenforcement mechanism and are not included

The CamFlow-LSM overhead is a few percent seeFig 4 We provide a build option that further improvesperformance by declaring labels and privileges with afixed maximum size (by default label size can increasedynamically to meet application requirements) This re-duces the overhead of the system calls that create newentities (the dynamic label component in Fig 4) This isan acceptable trade-off as in practical scenarios labelsrarely exceed more than five tags However for mostapplications the overhead is imperceptible and lost insystem noise it is hard to measure without using kernel

3 It is not feasible to provide a comparison with the Laminarimplementation [28] that is closest in technical terms to our workas the implementation available httpsgithubcomut-osalaminaris for an obsolete kernel version 2622 (072007)

8

tools as the variation between two executions may begreater than the overhead

42 Trusted Processes

The CamFlow-LSM is trusted to enforce IFC at the kernellevel Its functionality is minimal strictly confined tothe enforcement of IFC policies as described in sect3 Thisguarantees easier maintainability and a system that isagnostic to higher level application requirements thusminimising the constraints imposed on user-space ap-plication design

We introduce the concept of a trusted process that al-lows applicationplatform-specific concerns to be man-aged in user space by bypassing some LSM-enforced IFCconstraints For example a trusted process might serveas a proxy for external connections as in the Trusted IFCGateway in the example in sect7 setting up and managingapplication componentsrsquo labels Trusted processes areused to interact with persistent storage (see sect53) forcheckpointing and restoring processes (see sect411) andfor managing inter-process and external communication(see sect5)

Fig 5 shows OS instances running the CamFlow-LSM hosting a number of application processes thatmay be grouped in containers Each OS instance hasa single trusted process (Security Context Manager) tomanage its hosted processesrsquo IFC labels and privileges Inaddition each process has an associated trusted middle-ware process to handle inter-process and inter-machinecommunication Such communication may be within orbetween containers OSs or clouds

In this example S represents a particular set of secrecytags and I a particular set of integrity tags both ofwhich remain the same throughout The application pro-cesses and other OS objects such as pipes and files arelabelled [S I] The process labelled [empty I] writes lsquopublicrsquodata to a pipe which is read by a process labelled [S I]assuming all the I tags match correctly Similarly twoprocesses are shown writing to and reading from a file

The Security Context Manager maps between thekernel-level representation of tags (as 64-bit integers)and the representation of tags in user space Within acloud or other trusted environment tags may be simplestrings When tags need to cross domain boundarieseg when cloud services form part of a wider archi-tecture as in IoT tags may need to be protected bycryptographic means (see sect51)

Trusted processes are either set up through staticconfiguration read at boot time by the CamFlow-LSMmodule or created at runtime by another trusted pro-cess Trusted processes must either be managed by atrusted party (in our current approach the underlying in-frastructure provider) andor the code must be auditableand a means to verify the current version running on theplatform must be provided (see sect43)

43 Leveraging Hardware Roots of Trust

Incorporating IFC into cloud-provider OSs would en-hance the trustworthiness of the platform HoweverIFC only guarantees protection above the technical layerin which it is enforced Recent hardware and softwaredevelopments make it possible to attest that the softwarelayers on which our platform runs have been audited

The Trusted Platform Module (TPM) [41] as used forremote attestation [42] is one such hardware mecha-nism TPM is used to generate a nearly unforgeablehash representing the state of the hardware and soft-ware of a given platform that can be remotely verifiedTherefore a company could audit the implementationof our IFC enforcement mechanism and ensure that ourkernel security module messaging middleware and theconfiguration they provide are indeed running on theplatform Any difference between the expected state ofthe software stack and the platform could be considereda breach of trust such considerations can easily beembedded in the contractual obligations of the cloudprovider

TPM and remote attestation for cloud computing [43]are reaching maturity with IBM rolling out an opensource scalable trusted platform based on virtual TPMs[44] Indeed Berger et al [44] describe a mechanismallowing the TPM and remote attestation to be providedfor virtual machine offerings and container-based solu-tions covering the whole range of contemporary cloudofferings Furthermore the approach not only allows thestate of the software stack to be verified at boot time butalso during execution and can thus prevent run-timemodification of the system configuration

5 CROSS-MACHINE ENFORCEMENT

CamFlow-LSM operates to protect flows within the OSHowever it is also important that flows are protectedacross OS instances

Generally in order to guarantee flow constraints onlyprocesses P such that S(P ) = empty and I(P ) = empty ie notsubject to IFC constraints are allowed to directly connectto or receive messages from connections on remote OS(eg through a socket) In order to connect to anothermachine a process must either 1) be able to declassifyto change its security context to S(P ) = empty and I(P ) = empty2) communicate through an intermediate trusted process

As such CamFlow contains an IFC-enabled fully-featured messaging middleware (CamFlow-MW) to bothfacilitate communication and guarantee enforcementacross machines For want of space we only considerthe middleware concepts relevant to IFC details on thegeneral middleware (as it was prior to IFCCamFlowintegration) can be found in [45] In short the middle-ware supports strongly-typed messages a range of in-teraction paradigms including request-reply broadcastand streams flexible resource discovery and securitymechanisms including access controls and encryptedcommunication A particular feature is its support for

9

CamFlow-MW CamFlow-MW CamFlow-MW CamFlow-MW CamFlow-MW

Process

CamFlow-LSM

Enforcement Point Pipe [empty I]

Process Process

SecurityContextManager

SecurityContextManager

Container

Container

Container

Process[S I]

CamFlow-LSM

[S I]

File[S I]

[S empty][S empty]Process[empty I]

Fig 5 CamFlow Architecture Labelled OS objects trusted processes and communication middleware

dynamic reconfiguration based on event-driven policyThis simplifies both application development and de-ployment as concerns can be abstracted and tailoredto the particular environment rather than embeddedwithin application code

The role of the middleware is to move towards contin-uous end-to-end data flow management such that IFCcan be enforced across applicationsmachines (kernels)There is work on IFC enforcement across machineshowever these impose specific requirements such asdesign-time considerations [46] a particular languageruntime [47] or constraints on system architectureimplementation [48] In contrast we integrate IFC func-tionality into the general fully featured distributed sys-tems middleware mentioned above (see [49]) to provideflexibility and be more generally applicable We delib-erately avoid imposing a structure on system designinstead integrating IFC functionality into the sort of com-munications infrastructure common to current enterpriseand cloud systems

51 Remote Interactions

CamFlow-MW operates by associating a trusted process(see sect42) with an entity that seeks to communicate viathe messaging system Its fit within the broader architec-ture is depicted in Fig 5 The process is responsible forhandling the communication of messages and enforcesIFC based on the current runtime labels of the entity onwhose behalf it operates

It follows that for IFC to be enforced across machinestags require system-wide management ie throughoutthe cloud service In [50] we proposed that the widelyused and available X509 certificates could be used Theapproach relies on public key certificates and attributecertificates [51] to respectively identify the applicationassociated with the CamFlow-MW instance and the tagsassociated with this application

As part of establishing a connection CamFlow-MWensures that each entity authorises communication withthe other according to a local access control policy De-cisions are based on component metadata the relevantauthentication aspects secured through PKI (certificates)Similarly IFC policy must also be verified to ensure dataflows are authorised according to IFC policy Attributecertificates provide cryptographic means to determineand verify the tags associated with the remote entity

(see [50]) on which policy can be enforced If tags donot accord the connection will not be established

We also see potential for remote attestation based onhardware integrity measures (see sect43) to be integratedinto this authorisation phase to ensure the remote ma-chine operates a reliable IFC enforcement regime

52 Message-Level Enforcement

CamFlow messages are strongly typed where a mes-sage type is defined by a schema describing its set ofattributes For an instance of a message an attributeconsists of a name type and value The support forIFC within messages is fine-grained in that individualattributes within messages can also be labelled Theseattribute labels introduce additional IFC constraints overand above those already applying to the entity ie asrecognised by the kernel-LSM and validated on connec-tion establishment

Labels can be defined within message type schemawhich sets the attributesrsquo IFC labels for all messageinstances of the type These labels cannot be changed byentities dealing in such messages and the entities musthold the requisite labels to interact with the attributesOtherwise the entity producingpublishing a messagecan set the security labels for the attributes (for those notpredefined) if the entity holds the associated privilegesEnforcement occurs as followsReceiving If the receiving entityrsquos labels do not agreewith those of an attribute value the attribute value (andany sub-attributes) are removed from (made null in) themessage This is enforced on message receipt before itis delivered to the entitySending An entity cannot send values for attributeswhere its labels do not agree with those of the attributeThis is enforced when an entity attempts to send amessage ensuring values for any attributes violating thispolicy are removed before message propagation

Enforcement is automatic meaning that applicationsusing the messaging system can be subject to IFC en-forcement completely transparently (ie without their di-rect involvement) though again there is the interface forthe application to actively manage IFC where requiredIn addition the general reconfiguration capabilities ofthe middleware enable connections between componentsto be defined and managed at runtime providing an-other mechanism for controlling communication [45]

10

0 100 200 300 400 500 600 700

Time in msIFC-overhead

Fig 6 IFC overhead of CamFlow-MW for a workloadtransmission of 5000 messages (x-axis in ms)53 Integrating With Persistent Storage

One technique to provide IFC with persistent data storesis to store the tags alongside the data and a trustedsoftware component ensures that when information isread from the store the corresponding labels are appliedIn Flume [14] a trusted process provides the interfacebetween untrusted applications and persistent storageMore recent work has seen the emergence of databasesthat natively understand IFC concepts and can enforceIFC policies [25]

We see much promise in having the middleware me-diate between persistence systems and the kernel toensure consistent IFC application

54 Evaluation

As shown in Fig 6 the results indicate that IFC en-forcement introduces an overhead of sim13 in perfor-mance time compared to the standard non IFC-enabledmiddleware (see [49] for details) Note that these resultswere measured in the context of a particular workloaddeliberately designed to highlight the impact of IFCenforcement It follows that the overheads associatedwith real-world usage are most likely less onerous

6 AUDIT DATA-CENTRIC LOGS

IFC in addition to providing strong assurances that pol-icy is being enforced can also provide a data-centric log[52] detailing the information flows within and betweensystem components In addition to enforcing IFC via ourLSM module we log the data flows of labelled processespolicy decisions privileges and IFC security contextmanipulations Equivalent inter-machine operations viathe middleware are also recorded

Cloud logging systems are generally based on legacylogging systems (OS web-server database etc) thateither fail to capture the needed information or areextremely complicated to interpret in a useful manner[53] More importantly such logs tend to be relevant onlyto the particular service or component which makes itdifficult if not impossible to audit across a range ofapplications clouds etc

IFC logs as provided by our platform allow us to cap-ture information on application-level data flows both at-tempted and permitted allowing the correct expressionand implementation of data flow policy to be checkedThis provides transparency allows for meaningful auditin terms of investigating the circumstances in which dataleakage occurs and provides evidence of complianceeg with legal obligations [54]

Information Exchange

CreateSecurity Context Change

e0

e1

e3e4

e7

e2 e5e6

P3[Sprimeprime I]

P2[Sprimeprime empty]

P3[S I]F1[S I]

P1[S I] P1[empty I]

Public[empty empty]F2[Sprime I prime]

Fig 7 Simplified audit graph from IFC OS execution (weomit metadata for readability) The path to disclosure isshown in bluepale61 Analysing Paths to Disclosure

To assist in interpreting log information we build a di-rected graph corresponding to the allowed flows duringthe execution of our system as shown in Fig 7 Theflows defined in our IFC model (see sect3) namely dataflow creation flow security context change and privi-lege delegation correspond to the edges of the directedgraph Entities (such as processes files messages etc)are represented as nodes in the graph

In addition to information necessary to build the graph(as shown in the figure) additional metadata is collectedfor forensic purposes which is contextentityevent-dependent These audit entries provided by the LSMand middleware can be exploited by a dedicated serviceimplemented in user space connected to the kernelcollection mechanism via relayfs [55] This service feedsfor example a graph visualisation tool such as Cytoscape[56] or a graph database such as Neo4J4

Such a directed graph helps one identify data leaksFor example a tenant might discover that some sensi-tive medical data leaked into a data store where onlyanonymised research data were supposed to be storedIFC is enforced in line with the policy encapsulated inlabels thus data may leak if such policy is improperly ex-pressed andor declassificationendorsement processesare not correctly implemented (eg if the anonymisationprocess in Fig 2 allows re-identification)

Suppose that an information leak is suspected be-tween different security contexts L1[S I] and L2[S

prime I prime]Determining whether such a leak can occur is equiv-alent to discovering whether there is a path in thegraph between the two contexts If the leak occurredthere must be a path between some entity Ei such thatS(Ei) = S and I(Ei) = I and another entity Fi such thatS(Fi) = Sprime and I(Fi) = I prime

The existence of such a path demonstrates that a leakis possible To investigate whether a leak occurred it isessential to consider the event ID associated with theedges comprising the path We denote by ei the last in-coming edge to the entity under investigation with labels[Sprime I prime] only edges such that e lt ei should be consideredWhen applied to all nodes along a path this rule ensuresstrictly monotonically increasing timestamps from thefirst node to the last Fig 7 shows in bluepale a possibledata disclosure path from file F2 from a very simple

4 httpneo4jcom

11

audit graph We know from the event IDs e0 and e1 thatthe data disclosure did not occur through file F1 andprocess P3 but through P1rsquos declassification

62 Demonstrating Compliance

Compliance with certain requirements can be demon-strated through queries over the graph We assume theaudit data is stored in a graph database that we canquery For example the following plain English policyldquoEuropean personal data sent to the US must be anonymisedrdquo[57] is equivalent to writing a query that verifies thatthere is no path between EU- and US-labelled datawithout an anonymisation processThe policy ldquoMedical data stored in database X must havereceived proper consent and be anonymisedrdquo [54] can beexpressed as a query verifying that there is no path be-tween data labelled as medical and the database withoutconsent-checking and anonymiser processes In additionan investigator may want to know which anonymisationalgorithm has been run which data has been used togenerate the anonymised records etc Our audit graphassists in answering such questions

Note that IFC only applies guarantees with respectto flows Demonstrating the overall effectiveness of themanagement regime eg the quality and suitability ofthe anonymisation algorithm is out-of-scope for a flow-based enforcement mechanism

63 Audit as lsquoBig Datarsquo

We are potentially generating a vast amount of data inour IFC logs However unlike standard system logs thatare complex to analyse our logs generate graphs thatare ideal for analysis by ldquobig datardquo tools that have beendeveloped for this purpose [58]

Since the amount of data is potentially huge theamount of data being logged can be fine-tuned to meetthe requirements of the platformtenant eg by reduc-ing the amount of metadata being stored by loggingonly security context changing operations by loggingonly information corresponding to some target securitycontext keeping operations on unlabelled entities out-side of the log etc The decision on what needs to belogged then becomes a tradeoff between utility and thevolume (cost) of log generated which can be decided inorder to correspond to legal or contractual requirements(for example a regulated sector may need to have afine-grained log to satisfy data forensic requirements)Indeed as such an approach is new to the cloud suchconsiderations will be refined by experience with bestpractices developing over time

64 Audit Access

Logs can contain sensitive information and access tothem should be controlled This represents an area ofour ongoing work Traditional access controls clearlyplay a role however secrecy tags could also be lever-aged For example an auditor before being grantedaccess to audit logs could be forced to demonstrate

ownership of the corresponding secrecy IFC tags (forexample through cryptographic means as in sect51) Theauditor may be granted access to a log entry only ifS(origin) cup S(destination) sube S(auditor)7 EXAMPLE SUPPORT FOR WEB SERVICES

One of the most common uses of PaaS is to host web ap-plications In this section we present the implementationof such a solution built on the infrastructure described insect4 in order to evaluate and demonstrate the feasibility ofour proposed approach This is illustrated in Fig 8 Werun standard and unmodified Ruby web applications

Interaction with end-users is achieved through aldquogatewayrdquo between the IFC and non-IFC worlds Simi-larly interaction with cloud services (such as data stores)is also achieved through our messaging middleware asdiscussed in sect5 The requirement for this gateway canbe removed if a trustworthy IFC implementation can beprovided at the client side consistent with the cloud im-plementation with respect to tag naming enforcementetc Tag naming in general system-wide is an issuebeyond the scope of this paper see further sect8 In ourproof of concept implementation the gateway is a simpleApache server running a custom-built module

The role of the gateway is to authenticate the end-userwhen a session is created and to associate this sessionwith an application instance running within the securitycontext corresponding to the user Recall that a securitycontext comprises the S and I labels Any further re-quests to the gateway in that session are routed to thecorresponding application instance Once an instance nolonger has an associated session it can be recycled usingself-checkpointing as described in sect411

Several application types are running over our cloud-based web services platform For example in a medicalcontext these might be medical record editing pharmacyordering social services etc A single shared identityservice for the end-user is part of the cloud provideroffering (in our proof of concept implementation weused OAuth [59])

The GP authenticates is authorised as treating doc-tor for Alice and selects the lsquoidentityrsquo that correspondsto Alice A new session is created server-side by thegateway with the requested application instance run-ning in the corresponding security context with S =[medical alice] I = [empty] When the GP wants to accessapplications on behalf of a new patient he needs to closeAlicersquos session authorise as treating doctor for Bob andopen a new session for Bob

The control described above is not achieved by theapplication but by the platform itself and can be con-trolled by the end-user subject to access control Thatis a medical application used on Alicersquos behalf runs ina security context in which data cannot flow to that ofanother patient Furthermore applications running onbehalf of a given user can share the data of that userwithout the risk of seeing a buggy application leakingdata between end-users

12

App Instance App InstanceApp Instance

CamFlow-MW

IFC Constrained IFC Constrained IFC Constrained

ContextA

ContextB

Public

Trusted IFC Gateway

Enforcing IFCData-store

Data-store Application

CamFlow-MW CamFlow-MWCamFlow-MW

IFC Cloud Platform

Dr Yellow Dr PinkDr MauveRemote Clients

Fig 8 PaaS Architecture on top of IFC-OS

As described in sect4 we assume the middleware andthe OS enforcement are provided as a service by theunderlying platform A tenant wanting to use the third-party web-service offering once his trust in the under-lying platform is established needs only to audit thegateway again the underlying infrastructure providercould either provide such a gateway or audit it Therest of the software stack of the third-party web-serviceprovider is bound by the IFC enforcement mechanismand therefore need not be trusted

8 CONCLUSION amp FUTURE WORK

IFC allows data flows to be controlled continuouslythroughout a system by providing an information-centric MAC scheme that continuously ensures non-interference between security contexts This paper pre-sented the CamFlow platform that demonstrates thepotential of cloud-deployed IFC as supporting (1) pro-tection of applications from each other (2) flexible datasharing across isolation boundaries (3) prevention ofdata leakage due to bugsmisconfigurations (4) exten-sion of access control beyond application boundaries (5)data flow transparency

Specifically we detailed a new kernel implementationof IFC as an LSM demonstrating low overhead evenfor worst-case scenarios where processes continuouslymake readwrite system calls We also described theintegration of a messaging middleware to enforcing IFCacross machines This combination makes it possible toprovide whole-system IFC to PaaS cloud services andtherefore also SaaS Our approach and implementationwere designed so that applications can run unchangedover IFC thus making cloud adoption feasible

We also indicated how the data-centric logs based onIFC enforcement could provide the means to audit anIFC-enabled system whereby a log can be processed asa directed graph to investigate leaks and attacks andshow compliance with data management requirementsThough this represents our initial work in the area thereappears much promise

In light of the above we believe that IFC has greatpotential as a security mechanism for the cloud wherebytrust in a few major cloud providers deploying IFCcan be built on to provide a demonstrably trustworthycomputing environment

CamFlow was developed with cloud deployment inmind Our future work will investigate the challenges ofa broader distributed context

It is already feasible to extend CamFlow to sup-port mobile environments Android supports the fullSELinux enforcement5 and an Android-LSM integrationhas been demonstrated [60] But when dealing withmultiple cloud services particularly as they become partof a wider distributed architecture such as in the IoTa trustworthy system-wide deployment of IFC can nolonger be assumed Much work remains on establishingtrust in (and the trustworthiness of) the IFC enforcementmechanism within end-usersrsquo devices Outside a cloudcontext all partiesrsquo trust in a common third partyrsquos en-forcement of IFC constraints cannot be assumed (unlikethe cloud provider for cloud services) We intend toexplore leveraging hardware roots of trust and remoteattestation to ensure the integrity and trustworthiness ofIFC enforcement mechanisms

Another area of investigation concerns the represen-tation of tags across administrative domains In a cloudcontext a federated approach can be envisaged wherea common understanding of tags could be negotiatedacross multiple domains For example in [54] we dis-cussed initial thoughts on managing data according tospecific obligation regimes However as the numberof administrative domains increases both a global tagnaming scheme and mechanisms for ad-hoc negotiationbecome necessary

Related is the sensitivity of the tags themselvesKnowledge of the meaning of a tag can indicate thatthe associated entity contains or deals with certain infor-mation If a relationship can be established between anentity and the information owner this may disclose pri-vate information about the information owner This maylead to work on a need-to-know negotiation mechanismto establish a secure channel between hosts especiallyin a wide-scale distributed system A promising mech-anism is private set intersection to determine tagsrsquo subsetrelationship (sect341)

Another challenge concerns extending audit to dis-tributed architectures both in terms of resource man-agement and regulating access to log data

5 httpssourceandroidcomdevicestechsecurityselinux

13

ACKNOWLEDGMENTS

This work was supported by UK Engineering andPhysical Sciences Research Council grant EPK011510CloudSafetyNet End-to-End Application Security inthe Cloud We acknowledge the support of Microsoftthrough the Microsoft Cloud Computing Research Cen-tre

REFERENCES

[1] P Barham B Dragovic K Fraser S Hand T Harris A HoR Neugebauer I Pratt and A Warfield ldquoXen and the art ofvirtualizationrdquo SIGOPS Operating Systems Review vol 37 no 5pp 164ndash177 2003

[2] A Kivity Y Kamay D Laor U Lublin and A Liguori ldquokvmthe Linux Virtual Machine Monitorrdquo in Proceedings of the LinuxSymposium vol 1 2007 pp 225ndash230

[3] D Bernstein ldquoContainers and Cloud From LXC to Docker toKubernetesrdquo IEEE Cloud Computing Magazine no 3 pp 81ndash842014

[4] P Desnoyers O Krieger B Holden and J Hennessey ldquoUsingOpenStack for an Open Cloud eXchange (OCX)rdquo in InternationalConference on Cloud Engineering (IC2E) IEEE 2015

[5] J Mineraud O Mazhelis X Su and S Tarkoma ldquoA Gap Analysisof Internet-of-Things Platformsrdquo 2015 Arxiv arXiv150201181[Online] Available httparxivorgabs150201181

[6] C J Millard Ed Cloud Computing Law Oxford University Press2013

[7] S Pearson and M Casassa-Mont ldquoSticky Policies An Approachfor Managing Privacy across Multiple Partiesrdquo Computer vol 44July 2011

[8] D E Denning ldquoA lattice model of secure information flowrdquoCommunication of the ACM vol 19 no 5 pp 236ndash243 1976

[9] D E Bell and L J LaPadula ldquoSecure Computer Systems Mathe-matical Foundations and Modelrdquo The MITRE Corp Bedford MATech Rep M74-244 1973

[10] K J Biba ldquoIntegrity Considerations for Secure Computer Sys-temsrdquo MITRE Corp Tech Rep ESD-TR 76-372 1977

[11] A C Myers and B Liskov ldquoA Decentralized Model for Infor-mation Flow Controlrdquo in 17th Symposium on Operating SystemsPrinciples (SOSP) ACM 1997 pp 129ndash142

[12] A Bouard B Weyl and C Eckert ldquoPractical information-flowaware middleware for in-car communicationrdquo in Workshop onSecurity privacy amp dependability for cyber vehicles ACM 2013pp 3ndash8

[13] K Singh S Bhola and W Lee ldquoxBook Redesigning PrivacyControl in Social Networking Platformsrdquo in Security SymposiumUSENIX 2009 pp 249ndash266

[14] M Krohn A Yip M Brodsky N Cliffer M F KaashoekE Kohler and R Morris ldquoInformation Flow Control for StandardOS Abstractionsrdquo in Symposium on Operating Systems PrinciplesACM 2007 pp 321ndash334

[15] S Lee E L Wong D Goel M Dahlin and V ShmatikovldquoπBox A Platform for Privacy-Preserving Appsrdquo in Symposiumon Networked System Design and Implementation USENIX 2013pp 501ndash514

[16] N Kumar and R Shyamasundar ldquoRealizing Purpose-Based Pri-vacy Policies Succinctly via Information-Flow Labelsrdquo in Big Dataand Cloud Computing (BDCloudrsquo14) IEEE 2014 pp 753ndash760

[17] W Enck P Gilbert B-G Chun L P Cox J Jung P McDaniel andA N Sheth ldquoTaintDroid An information-flow tracking systemfor realtime privacy monitoring on smartphonesrdquo in Conference onOperating systems design and implementation (OSDIrsquo10) USENIX2010 pp 1ndash6

[18] I Papagiannis M Migliavacca and P Pietzuch ldquoPHP AspisUsing partial taint tracking to protect against injection attacksrdquoin 2nd USENIX Conference on Web Application Development 2011p 13

[19] E Schwartz T Avgerinos and D Brumley ldquoAll you ever wantedto know about dynamic taint analysis and forward symbolicexecution (but might have been afraid to ask)rdquo in Symposium onSecurity and Privacy IEEE 2010

[20] M Casassa-Mont S Pearson and P Bramhall ldquoTowards account-able management of identity and privacy Sticky policies andenforceable tracing servicesrdquo in Database and Expert Systems Ap-plications 2003 Proceedings 14th International Workshop on IEEE2003 pp 377ndash382

[21] S Bandhakavi C C Zhang and M Winslett ldquoSuper-sticky anddeclassifiable release policies for flexible information dissemina-tion controlrdquo in Proceedings of the 5th ACM workshop on Privacy inelectronic society ACM 2006 pp 51ndash58

[22] D W Chadwick and S F Lievens ldquoEnforcing sticky securitypolicies throughout a distributed applicationrdquo in Proceedings ofthe 2008 workshop on Middleware security ACM 2008 pp 1ndash6

[23] M Casassa-Mont I Matteucci M Petrocchi and M L SbodioldquoTowards Safer Information Sharing in the Cloudrdquo InternationalJournal of Information Security pp 1ndash16 2014

[24] S Akoush L Carata R Sohan and A Hopper ldquoMrLazy LazyRuntime Label Propagation for MapReducerdquo in 6th Workshop onHot Topics in Cloud Computing (HotCloud) USENIX 2014

[25] D Schultz and B Liskov ldquoIFDB Decentralized Information FlowControl for Databasesrdquo in European Conference on Computer Sys-tems (Eurosysrsquo13) ACM 2013 pp 43ndash56

[26] D Von Oheimb ldquoInformation Flow Control Revisited Nonin-fluence = Noninterference + Nonleakagerdquo in Computer SecurityndashESORICS 2004 Springer 2004 pp 225ndash243

[27] D Hedin and A Sabelfeld ldquoA perspective on Information-FlowControlrdquo NATO 2012

[28] D E Porter M D Bond I Roy K S McKinley and E WitchelldquoPractical Fine-Grained Information Flow Control Using Lam-inarrdquo ACM Transactions on Programming Languages and Systems(TOPLAS) vol 37 no 1 p 4 2014

[29] C Wright C Cowan S Smalley J Morris and G Kroah-Hartman ldquoLinux Security Modules General security support forthe Linux kernelrdquo in Foundations of Intrusion Tolerant SystemsIEEE 2003 pp 213ndash213

[30] M Quaritsch and T Winkler ldquoLinux Security Modules Enhance-ments Module Stacking Framework and TCP State TransitionHooks for State-Driven NIDSrdquo Secure Information and Communi-cation vol 7 pp 7ndash13 2004

[31] C Schaufler ldquoLSM Generalize existing module stackingrdquo LinuxWeekly News 2014

[32] S Smalley C Vance and W Salamon ldquoImplementing SELinux asa Linux Security Modulerdquo NAI Labs Report vol 1 p 43 2001

[33] M Bauer ldquoParanoid Penguin an Introduction to Novell AppAr-morrdquo Linux Journal vol 2006 no 148 p 13 2006

[34] A Edwards T Jaeger and X Zhang ldquoRuntime verification ofauthorization hook placement for the Linux Security Modulesframeworkrdquo in Conference on Computer and Communications Secu-rity ACM 2002 pp 225ndash234

[35] T Jaeger A Edwards and X Zhang ldquoConsistency analysis ofauthorization hook placement in the Linux security modulesframeworkrdquo ACM Transactions on Information and System Security(TISSEC) vol 7 no 2 pp 175ndash205 2004

[36] V Ganapathy T Jaeger and S Jha ldquoAutomatic placement ofauthorization hooks in the Linux security modules frameworkrdquoin Conference on Computer and Communications Security ACM2005 pp 330ndash339

[37] I P Egwutuoha D Levy B Selic and S Chen ldquoA survey of faulttolerance mechanisms and checkpointrestart implementationsfor high performance computing systemsrdquo Springer 2013vol 65 no 3 pp 1302ndash1326

[38] O Laadan and S E Hallyn ldquoLinux-CR Transparent applicationcheckpoint-restart in Linuxrdquo in Linux Symposium 2010 p 159

[39] B Niu and G Tan ldquoEfficient User-space Information Flow Con-trolrdquo in Symposium on Information Computer and CommunicationsSecurity (SIGSACrsquo13) ACM 2013 pp 131ndash142

[40] T Bird ldquoMeasuring Function Duration with ftracerdquo in Japan LinuxSymposium 2009 pp 47ndash54

[41] B Parno ldquoBootstrapping Trust in ardquo Trustedrdquo Platformrdquo in Con-ference on Hot Topics in Security (HotSecrsquo08) USENIX 2008

[42] N Santos K P Gummadi and R Rodrigues ldquoTowards TrustedCloud Computingrdquo in Conference on Hot Topics in Cloud Comput-ing USENIX 2009 pp 3ndash3

[43] S Berger R Caceres K A Goldman R Perez R Sailer andL van Doorn ldquovTPM Virtualizing the Trusted Platform Modulerdquoin Security Symposium USENIX 2006 pp 305ndash320

14

[44] S Berger K Goldman D Pendarakis D Safford E Valdezand M Zohar ldquoScalable Attestation A Step Toward Secure andTrusted Cloudsrdquo in International Conference on Cloud Engineering(IC2E) IEEE 2015

[45] J Singh D Eyers and J Bacon ldquoPolicy Enforcement withinEmerging Distributed Event-Based Systemsrdquo in ACM DistributedEvent-Based Systems (DEBSrsquo14) 2014

[46] L Sfaxi T Abdellatif R Robbana and Y Lakhnech ldquoInformationFlow Control of Component-based Distributed Systemsrdquo Concur-rency and Computation Practice and Experience vol 25 no 2 pp161ndash179 2013

[47] S Yoshihama T Yoshizawa Y Watanabe M Kudoh and K Oy-anagi ldquoDynamic Information Flow Control Architecture for WebApplicationsrdquo in ESORICS 2007 J Biskup and J Lopez EdsSpringer 2007 vol LNCS 4734 pp 267ndash282

[48] W Cheng D R K Ports D Schultz V Popic A BlanksteinJ Cowling D Curtis L Shrira and B Liskov ldquoAbstractions forUsable Information Flow Control in Aeolusrdquo in USENIX AnnualTechnical Conference Boston 2012

[49] J Singh T Pasquier J Bacon and D Eyers ldquoIntegrating Middle-ware and Information Flow Controlrdquo in International Conferenceon Cloud Engineering (IC2E) IEEE 2015 pp 54ndash59

[50] J Singh T F J-M Pasquier and J Bacon ldquoSecuring Tags toControl Information Flows within the Internet of Thingsrdquo inInternational Conference on Recent Advances in Internet of Things(RIoTrsquo15) IEEE 2015

[51] J S Park and R Sandhu ldquoBinding Identities and Attributes UsingDigitally Signed Certificatesrdquo in Annual Conference on ComputerSecurity Applications IEEE 2000 pp 120ndash127

[52] A Ganjali and D Lie ldquoAuditing cloud management using infor-mation flow trackingrdquo in Workshop on Scalable Trusted ComputingACM 2012 pp 79ndash84

[53] R K Ko M Kirchberg and B S Lee ldquoFrom System-centric toData-centric Logging-accountability Trust amp Security in CloudComputingrdquo in Defense Science Research Conference and Expo (DSR)2011 IEEE 2011 pp 1ndash4

[54] J Singh J Powles T Pasquier and J Bacon ldquoData Flow Man-agement and Compliance in Cloud Computingrdquo IEEE CloudComputing Magazine SI on Legal Clouds 2015

[55] T Zanussi K Yaghmour R Wisniewski R Moore and M Dage-nais ldquorelayfs An efficient unified approach for transmitting datafrom kernel to user spacerdquo in Linux Symposium 2003 p 494

[56] M E Smoot K Ono J Ruscheinski P-L Wang and T IdekerldquoCytoscape 28 New features for data integration and networkvisualizationrdquo Oxford University Press 2011 vol 27 no 3 pp431ndash432

[57] T Pasquier and J Powles ldquoExpressing and Enforcing LocationRequirements in the Cloud using Information Flow Controlrdquo inIC2E International Workshop on Legal and Technical Issues in CloudComputing (CLawrsquo15) IEEE 2015

[58] R Angles and C Gutierrez ldquoSurvey of Graph Database ModelsrdquoACM Computing Surveys (CSUR) vol 40 no 1 p 1 2008

[59] D Hardt ldquoThe OAuth 20 Authorization Frameworkrdquo IETF TechRep RFC 6749 2012

[60] S Smalley and R Craig ldquoSecurity Enhanced (SE) Android Bring-ing Flexible MAC to Androidrdquo in Network and Distributed SystemSecurity Symposium Internet Society 2013

  • 1 Introduction and Motivation
  • 2 Background
    • 21 IFC Models
    • 22 Protection via VMs and Containers
    • 23 Taint Tracking (TT) Systems
    • 24 Sticky Policies
      • 3 CamFlow-Model IFC for the Cloud
        • 31 Tags and Labels
        • 32 Decentralised Privileges and Security Contexts
        • 33 Creating a New Entity
        • 34 Security
          • 341 Information Exchange
          • 342 Label Change
          • 343 Privilege delegation
            • 35 Conflict of Interest
              • 4 OS Enforcement
                • 41 CamFlow-LSM
                  • 411 Checkpointing and Restoration
                  • 412 OS Evaluation
                    • 42 Trusted Processes
                    • 43 Leveraging Hardware Roots of Trust
                      • 5 Cross-Machine Enforcement
                        • 51 Remote Interactions
                        • 52 Message-Level Enforcement
                        • 53 Integrating With Persistent Storage
                        • 54 Evaluation
                          • 6 Audit Data-Centric Logs
                            • 61 Analysing Paths to Disclosure
                            • 62 Demonstrating Compliance
                            • 63 Audit as `Big Data
                            • 64 Audit Access
                              • 7 Example Support for Web Services
                              • 8 Conclusion amp Future Work
                              • References
Page 2: CamFlow: Managed data-sharing for cloud services · CamFlow: Managed data-sharing for cloud services Thomas F. J.-M. Pasquier, Member, IEEE, Jatinder Singh, Member, IEEE, David Eyers,

2

with selected carers medical practitioners and medicalresearch (big-data) repositories via cloud-hosted andmediated services Once data has left end-usersrsquo homesfor cloud services they need to be assured that it is onlyaccessed as they specify

Traditional access control tends to be principalrolespecific and apply only within the context of a partic-ular applicationservice Controls are applied at policyenforcement points after which there is no subsequentcontrol over where data flows Once data has left thedirect control of its owner for example after beingshared with others it is difficult using traditional accesscontrols to ensure and demonstrate that it is not leakedIf a leak is suspected it often cannot be establishedwhether this is a breach of confidentiality by a person ordue to buggy or misconfigured cloud service software

Encryption offers protection by restricting access tointelligible data even beyond the boundary of onersquostechnical control However encryption hinders flexiblenuanced data sharing in that key management (dis-tribution revocation) is difficult Further traceability islimited as being mathematically based there is generallyno feedback as to whenwhere decryption occurs anda compromised key or broken encryption scheme at anytime in the future places data at risk As such it isimportant that data flows are managed and auditedeven if data items are encrypted

Although contracts exist between cloud providers andtenants and cloud services are increasingly subject toregulation [6] there is at present no way to establish thatproviders remain in compliance with these agreementsand requirements Also there are often requirementsthat data should pass through certain processes egencryption or anonymisation There is currently no clearmechanism to express such requirements and demon-strate they have been consistently enforced

An approach to maintaining the association of datawith policy is to use ldquosticky policiesrdquo [7] Here owner-specified management constraints are attached to en-crypted data Decryption is only allowed by parties ac-cepting the management constraints and able to enforcethem This forms the basis for establishing contractualrelationships between data owners and service providersor other applications However this approach requirestrust in a (relatively large amount of) software Furtherthe enforcement is either at too coarse a granularity orprohibitively expensive This is further explored in sect24

As an alternative Information Flow Control (IFC)augments traditional access control by offering contin-uous system-wide end-to-end flow control based onproperties of the datamdashfor example ldquomedical data mayonly be used for research purposes after going throughconsent checking and anonymisationrdquo IFC allows secu-rity contexts to be defined system-wide and guaranteesnon-interference between them This is achieved by tagsapplied to entities (eg processes files database en-tries) inseparable from the entities they are associatedwith Every exchange of data between entities is verified

against security-context-domain relationships created bythe tags thus allowing tight control over any subsequenttransfers of the data

In this paper we present CamFlow (Cambridge FlowControl Architecture) We outline CamFlowrsquos IFC modeland implementation which comprises a new operatingsystem (OS) level implementation of IFC as a LinuxSecurity Module (LSM) with support for applicationmanagement together with an IFC-enabled middlewareIFC tags are checked on OS system calls and on mes-sage passing by the middleware to determine whetherdata flows are permissible Log records can be madeefficiently of all attempted flows whether permittedor rejected and this log provides a possible basis foraudit data provenance and compliance checking By thismeans it can be checked whether application level policyhas been enforced and whether cloud service provisionhas complied with contractual obligations

We argue that incorporating IFC into the underlyingPaaS-provided OSs as a small trusted computing basewould greatly enhance the trustworthiness of cloudservices whether public or private and hence all theirhosted servicesapplications Our evaluation shows thatIFC would incur acceptable overhead and our IFCmodel is designed to ensure that application developersneed not be aware of IFC although some applicationproviders may wish to take explicit advantage of IFCWe demonstrate the feasibility of our approach via anIFC-enabled framework for web services see sect7Contributions Our main contribution is to demonstratethe feasibility of providing IFC as part of cloud softwareinfrastructure and showing how IFC can be made towork end-to-end system-wide In addition to discussingthe rsquobig picturersquo in this paper we also present a newkernel implementation of IFC and a new audit functionOur approach enables (1) protection of applicationsfrom each other (non-interference) (2) flexible manageddata sharing across isolation boundaries (3) preventionof data leakage due to bugsmisconfigurations (4) ex-tension of access control beyond application boundaries(5) increased transparency through detailed logs of in-formation flow decisionssect2 gives background in protection and IFC then sect3

presents the essentials of the CamFlow IFC model withexamples sect4 and sect5 describe our new OS-level im-plementation of IFC as a LSM and its integration viatrusted processes with an IFC-enabled middleware stor-age services etc sect6 emphasises that audit in IFC systemsproduces logs capable of being processed by lsquobig-datarsquoanalytics tools Audit is central to establishing prove-nance and for providers to demonstrate compliance withcontract and regulation sect7 shows how standard webservices are supported transparently by the CamFlowarchitecture only a privileged application managementframework need be aware of IFC and unprivilegedapplication instances can run unchanged In all casesevaluation is included within the section sect8 summarisesconcludes and suggests future work

3

2 BACKGROUND

We first define the scope of current isolation mech-anisms highlighting the need for flexible data shar-ing at application-level granularity ie where applicationsmanage their own security concerns as well as strongisolation between tenants andor applications As anintroduction to IFC we outline the evolution of IFCmodels Related work on IFC implementation at the OSlevel and within distributed systems is given with therelevant sections We end with a brief comparison of IFCwith taint tracking (TT) and sticky policies

21 IFC Models

In 1976 Denning [8] proposed a Mandatory AccessControl (MAC) model to track and enforce rules oninformation flow in computer systems In this modelentities are associated with security classes The flow ofinformation from an entity a to an entity b is allowedonly if the security class of b (denoted b) is equal to orhigher than a This allows the no-read up no-write downprinciple of Bell and LaPadula [9] to be implementedto enforce secrecy By this means a traditional militaryclassification public secret top secret can be implementedA second security class can be associated with eachentity to track and enforce integrity (quality of data)no read down no write up as proposed by Biba [10] Acurrent example might allow input of information froma government website in the govuk domain but forbidthat from ldquoJoersquos Blogrdquo Using this model we are ableto control and monitor information flow to ensure datasecrecy and integrity

In 1997 Myers [11] introduced a Decentralised IFCmodel (DIFC) that has inspired most later work Thismodel was designed to meet the changing needs ofsystems from global static hierarchical security levelsto a more flexible system able to capture the needsof different applications In this model each entity isassociated with two labels a secrecy label and an integritylabel to capture respectively the privacyconfidentialityof the data and the reliability of a source of data Eachlabel comprises a set of tags each of which representssome security concern Data is allowed to flow if thesecurity label of the sender is a subset of the label of thereceiver and conversely for integrity

Implementations of a decentralised model akin toMyersrsquo include a sensitive embedded system for BMWcars [12] and XBook [13] in a social media contextOur own model is described in sect3 When implementedfrom the OS kernel level applications running underIFC enforcement do not need to be trusted for the datamanagement policy to be properly enforced [14]

22 Protection via VMs and Containers

Isolation of tenants in cloud platforms is throughhypervisor-supported virtual machines [1] [2] or OS-provided containers [3] However flexible sharing mech-anisms are also required to manage data exchange be-tween applications contributing to more complex sys-

tems or to achieve end-user goals For example gov-ernment applications might access citizensrsquo records forvarious purposes a userrsquos data from different applica-tions might together contribute to evidence related tohealth or wellbeing

At present the sharing of information between appli-cations tends to involve a binary decision (ie to shareor not) as for example in Google pods (containers)2

Whole resources can be shared but no control over datausage between applications is provided Furthermorethere are no means for preventing leakage outside ofthe mechanisms implemented by the individual appli-cationsservices

Solutions have been proposed to provide intra-application sandboxes (down to individual end-users)[15] but such schemes are difficult to scale requirechanges in application logic and still do not providecontrol beyond isolation boundaries (ie again loss ofcontrol once the data is shared)

IFC has been proposed to guarantee the proper usageof data by social network applications [13] The aim is toprovide purpose-based disclosure via IFC [16] betweenisolated components thus guaranteeing that shared datacan only be used for a well-defined and agreed-uponpurpose

IFC is by no means proposed as a replacement foraccess control VMs or containers but rather as a com-plement to those techniques to provide flexible manageddata-sharing IFC would allow tenants and end-users tomaintain control (within an IFC-enforcing world) anddefine policy applying to their data consistently andbeyond isolation and application borders

23 Taint Tracking (TT) Systems

Runtime dynamic TT is similar to IFC but with less func-tionality TT systems use one tag type ldquotaintrdquo instead ofsecrecy and integrity tags Tags propagate with data anddata flows may be logged An entity that inputs taggeddata acquires the datarsquos tag(s) Data flow constraintsare only enforced at specified sink points for examplewhen data attempts to leave a mobile phone [17] Policyis applied at sink points such as preventing privateunanonymised or unencrypted data from flowing orstrictly controlling to where data may flow

An example of TT used for integrity purposes is totaint data from untrusted sources eg user input froma TCP stream in a web application environment andenforce that it is sanitised before being processed [18]This simple mechanism prevents injection attacks thatplague badly designed web applications An example ofTT used for confidentiality purposes is to taint sensitiveinformation eg a list of contacts in a mobile phone andtrack it through this closed system [17] Data leaving thesystem (ie the phone) is analysed to ensure it does notcontain sensitive information Data containing sensitiveinformation should only leave to a number of closely

2 httpscloudgooglecomcontainer-enginedocspods

4

controlled destinations such as the cloud backup contactlist This approach aids the detection of malicious ap-plications attempting to steal user-sensitive informationand send it to third parties Equally this type of concerncan be captured through the use of IFC policies

One concern with TT systems is that there is a gap intime between the occurrence of the issue (eg a leak anattack) and when it is detected [19] ie problems becomeevident only when the tainted data reaches a sink (en-forcement point) Depending on the degree of isolationbetween the different parts of the system and the num-ber of system components involved this tainted datamay have lsquocontaminatedrsquo much of the system Whilethis can be managed in smaller closed environmentsit is less appropriate for cloud services in general IFCpolicies present the clear advantage to prevent problemsas they occur and to stop their effects propagating to apotentially large part of the system

Some argue that TT is simpler to use than IFC andincurs lower overhead but when the enforcement issystemic and the granularity identical the overheadsare similar (compare [17] and the evaluation in sect4 andsect5) Indeed the complexity of verifying IFC policy (seesect3) is comparable to the cost of propagating taint Forboth techniques most of the overhead comes from themechanism for intercepting data exchange

24 Sticky Policies

IFC can be seen as a mechanism for enforcing policythe labels associated with entities represent applicationpolicies IFC is a simple low-level mechanism Stickypolicy approaches also consider the enforcement of data-bound policy but at a higher-level

Casassa-Mont et al [20] first introduced sticky policieswhich involves encrypting data along with a list of poli-cies to be enforced on that data To obtain the decryptionkey from a Trusted Authority (TA) a party must agree toenforce the policies associated with the data This agree-ment may be considered as part of forming a contractuallink between the data owner and the service providerWork has continued in the area [21]ndash[23] Sticky policiestypically enforced at the application-level are generallymore complex and heavyweight than the simple secrecyand integrity constraints of IFC As such sticky policiestend only to be enforced at particular points eg atadministrative boundaries IFC on the other hand aswe show in sect412 can be enforced continuously at areasonable cost In sect3 we discuss how complex policiesmight be built from IFC labels

Further the sticky policy approach builds upon thetrust established between the data owner the TAs andservices that use the data A non-compliant service couldbe black-listed but only if and when a breach of agree-ment is detected and the TA updated Our IFC approachbuilds only upon the trust between the data owner andthe cloud provider Services and applications running ontop of the cloud provider platform need not be trustedWe believe this to be a great improvement to the overall

Alice RecordS(A) = alicemedical

I(A) = hosp-dev consent

Bob RecordS(B) = bobmedical

I(B) = hosp-dev consent

Alicersquos app instanceS(C) = alicemedical

I(C) = consentAllowed FlowPrevented Flow

Fig 1 An allowed safe flow and prevented flowstrustworthiness of the system

3 CAMFLOW-MODEL IFC FOR THE CLOUD

IFC operates to ensure that only permitted flows ofinformation can occur by enforcing data flow policy dy-namically end-to-end within and across applicationsservices Entities to which IFC constraints are appliedcan include a MapReduce worker instance [24] a file aprocess a database entry [25] etc In CamFlow IFC is ap-plied continuously typically on every system call for anIFC-enabled OS and on communication mechanisms forenforcement across applicationsruntime environmentsIFC policy should therefore be as simple as possibleto allow verification human understanding and to min-imise runtime overhead Indeed there is no need for IFCto encapsulate every possible policy rather it augmentsother control mechanisms and can help enforce theirpolicies

31 Tags and Labels

We define tags that are tokens each representing somesecurity concern over secrecy or integrity The tagbob-private could for example represent Bobrsquos personaldata We associate every entity in the system with twolabels (sets of tags) an entity A has a secrecy label S(A)and an integrity label I(A) The state of these labels isthe security context of the entity The power of IFC is thatit guarantees non-interference between security contexts[26] [27]Example ndash secrecy Suppose a patient Bob is dischargedfrom hospital to be medically monitored at home Thedata streams from his sensors are transferred to a cloudservice and are to be shared with his medical team atthe hospital The data items from his devices are taggedwith medical bob in their secrecy labelsExample ndash integrity The cloud-based home monitoringsupport service needs to be assured that the data itreceives is from a hospital-issued device Each sensingdevice is checked and issued with the tag hospital-devicein its integrity label

Fig 1 illustrates information flow constraints beingapplied over both secrecy and integrity dimensions

32 Decentralised Privileges and Security Contexts

In decentralised IFC (DIFC) any active entity can createnew tags Tag creation is typically carried out by appli-cation managers when setting up application instancesWhen an active entity creates a new tag either for secrecyor integrity this process is given the correspondingprivilege to add and remove the tag to its secrecy orintegrity label respectively If an active entity A has a

5

Medical RecordS = personal

I = empty

Consent Checker

S = personalI = cons

Anonymiser

S = researchI = cons anon

Research DatabaseS = research

I = cons anon

Researcher PortalS = research

I = cons anonAllowed FlowPrevented Flow

Research Project XXNHS Cloud S = personalI = empty

S = personalI = cons

Context change

Fig 2 Medical data declassified and endorsed for research purposes

privilege to add t to its secrecy label we denote thist isin P+

S (A) and to remove t from its secrecy labelt isin PminusS (A) (and similarly P+

I (A) and PminusI (A) are the priv-ileges for integrity) An active entity may therefore havefour privilege sets in addition to its security contextApplication managers will normally set up applicationinstances in security contexts without the privileges tochange them An example is given in sect7

33 Creating a New Entity

We define A rArr B as the operation of the entity Acreating the entity B An example is creating a processin a Unix-style OS by clone We have the following rulesfor creation

if ArArr B then

S(B) = S(A)

I(B) = I(A)

That is the created entity inherits the security contextof its creator These rules force the creating entity toexplicitly change its security context to that required forthe entity to be created We motivate this below in sect342Note that only labels pass to the created entity privilegeshave to be passed explicitly

34 Security

The purpose of IFC models is to regulate flows betweenentities and effect label changes and privilege delega-tion

Definition 1 A system is secure in the CamFlow IFC modelif and only if all allowed messages are safe (Definition 2) allallowed label changes are safe (Definition 3) and all privilegedelegation is safe (Definitions 4 and 5)

341 Information Exchange

IFC prevents data leakage by controlling the exchangeof information We follow the classic pattern for IFC-guaranteed secrecy (no read up no write down [9]) andintegrity (no read down no write up [10])

Definition 2 A flow of information A rarr B is safe if andonly if

Ararr B iff S(A) sube S(B) and I(B) sube I(A)Example ndash secrecy enforcement Consider our exam-ple of patient monitoring after discharge from hospital

where the patientrsquos devices are tagged with medical bobin their secrecy labels In order for the cloud service tobe able to receive this data it must also include the tagsmedical bob in its secrecy label Therefore an applicationinstance accessing Bobrsquos medical data must be labelled assuch In sect7 we describe how applications can be designedto meet such requirementsExample ndash integrity enforcement The cloud-basedhome monitoring support service needs to be assuredthat the data it receives is from a hospital-issued de-vice To achieve this the service has an integrity taghospital-issued in its integrity label and will only acceptdata from devices with tags hospital-issued

342 Label Change

Under the above constraints information flows arerestricted to equal or increasing secrecy constraintsand equal or decreasing integrity constraints Howeverdata may undergo transformations andor checks thatchange its security properties For example moving datathrough an anonymisation engine renders the data lesssensitive so less strict secrecy constraints can applyto the anonymised output In the integrity dimensiondata may go through a validation process on inputthus becoming more trustworthy In CamFlow only theprocess itself is able to change its secrecy and integritylabels which requires the appropriate privileges andmust be explicitly requested

Definition 3 A label change noted A Aprime is safe if andonly if for a label X (either S or I) and a tag t

X(Aprime) = X(A) cup t if t isin P+X (A)

ORX(Aprime) = X(A) t if t isin PminusX (A)

Declassifiers and endorsers are the entities with theprivileges to perform security context transformationsDeclassifiers change the secrecy properties and endorserschange the integrity propertiesExample ndash declassification A medical record systemis held in a private cloud Research datasets may becreated from these records but only from records wherethe patients have given consent Also only anonymised

6

data may leave the private protected environment Weassume a health service approved anonymisation proce-dure Fig 2 shows the anonymiser inputting data taggedas personal and declassifying the data by outputting datawith secrecy tag researchExample ndash endorsement In the same example theResearch Database is on a public cloud and may onlyreceive research data tagged with consent anon in itsintegrity label In the private cloud we see a pro-cess that selects appropriate records for specific re-search purposes checks for patient consent and addsthe tag consent to the integrity label of its output Theanonymiser process can only input data with this tag itanonymises the data and outputs data with the tag anonin its integrity label

Some previous work [14] [28] allows implicit declas-sification and endorsement That is if an active entity hasthe privilege to declassifyendorse and the privilegeto return to its original state (ie for declassificationendorsement over t the entity has privilege tminus and t+)the declassificationendorsement may occur implicitlywithout the need for the entity to make the label changesWe believe that this could in practice lead to unintentionaldata disclosure Suppose an entity has the privilege todeclassify top-secret information The requirement forexplicit label change makes it unlikely that the entity willsend such data accidentally to an unintended recipientOur model has stronger constraints that require endorse-ment and declassification operations to be programmedexplicitly

343 Privilege delegation

An entity is only able to delegate a privilege it owns

Definition 4 A privilege delegation is safe if and only ift isin PplusmnX (A)

35 Conflict of Interest

In CamFlow alone among IFC systems privilege dele-gation is further restricted by Conflict of Interest (CoI)(or Separation of Duty (SoD)) enforcement The receivingentity A must not be put in a situation where it wouldbreak a CoI constraint By this means an applicationmanager is prevented from creating an application in-stance with access to conflicting data

Definition 5 An entity A does not violate a CoI C if andonly if∣∣∣(S(A) cup I(A) cup P+

S (A) cup P+I (A) cup Pminus

S (A) cup PminusI (A)

)cap C

∣∣∣ le 1

Example ndash conflict of interest A CoI might arise whendata relating to competing companies is available in asystem In a hospital context this might involve theresults of analysis of the usage and effects of drugs fromcompeting pharmaceutical companies The companiesmight agree to this analysis only if their data is guar-anteed to be isolated ie not leaked to other companies

The hospital may be participating in drug trials andwant to ensure that information does not leak between

Camflow-LSM

IFC Library

User Space

Kernel Space

Application Process

Kernel

DIFC Management APISystem Calls

Hardware

IFC LibraryTrusted Process

Authorise and Audit

Fig 3 The interactions of the IFC Security Module (LSM)and a Trusted Process within an OStrials suppose a conflict is C = PfizerGSKRoche and some data (eg files) are labelled PfizerData[S =Pfizer I = empty] and RocheData[S = Roche I = empty] TheCoI described ensures that it is not possible for a singleentity (eg an application instance) to have access toboth RocheData and PfizerData either simultaneously orsequentially ie enforcing that Roche-owned data andPfizer-owned data are processed in isolation

The next sections describe the CamFlow platform thatenforces the IFC constraints described

4 OS ENFORCEMENT

At the heart of the architecture is a minimal kernel mod-ule dedicated solely to OS-level IFC enforcement Themodule is trusted to enforce IFC transparently acrossall flows between entities within the OS User spaceprocesses can directly interact with the kernel moduleeg to delegate privileges (sect34) through a pseudo-filesystem abstracted through a high level API Higher levelconsiderations and policies can be managed throughspecifically defined Trusted Processes (see sect42) The localmachine architecture is represented in Fig 3

Note that IFC operates alongside and complementsother security technologies It is not a cloud securitypanacea challenges regarding covert and side channelsand direct access to hardware by an attacker remain asthey do for systems in general There are approachesthat can help address these security threats but manyare highly disruptive (eg synchronisation approachesto reducing timing channels) and are infrequently usedOther threats may be easier to mitigate and solutionsmay be used when appropriate (eg on-disk encryption)

41 CamFlow-LSM

Our kernel module CamFlow-LSM is implemented as aLinux Security Module (LSM) [29] Although our work isLinux-specific a similar approach could be used on anysystem providing LSM-like security hooks Unlike otherDIFC OS implementations [14] [28] our kernel patch isself-contained strictly limited to the security moduledoes not modify any existing system calls and followsLSM implementation best practice This allows amongother things LSM stacking [30] [31] and coexistencewith other security modules such as eg SELinux [32]or AppArmor [33] and complements their MAC enforce-ment with decentralised information flow policies

We assume that the rest of the kernel can be trustedand does not interfere with the IFC enforcement mech-anism LSM system hooks have been statically anddynamically verified [34]ndash[36] and our implementation

7

inherits from LSM the formal assurance of IFCrsquos correctplacement on the path to any controlled kernel objectThis is sufficient to guarantee that we control flow andrecord audit on any operation on a controlled kernelobject

Since applications running on SELinux [32] or Ap-pArmor [33] need not be aware of the MAC policybeing enforced we see no reason to force applicationsrunning on an IFC system to be aware of IFC only thoseperforming declassification or endorsement operationsare necessarily aware This implementation choice isimportant cloud providers can incorporate IFC withoutrequiring changes in the software deployed by ten-ants Alternatively applications that wish to managetheir own IFC constraints can declare policy through apseudo-filesystem (as is typical for LSMs) abstracted bya user space library and enforced transparently by theIFC mechanism

The LSM framework calls security hooks when accessto a kernel object is attempted Security metadata can beassociated with kernel objects and is used by the LSMmodule to make access decisions Tags and privilegesare represented by 64-bit opaque nonces associated withkernel objects such as processes inodes files sharedmemory objects messages etc On interaction betweenkernel objects CamFlow-LSM security hooks are calledto enforce data-flow policy (sect341) or propagate tags onentity creation (sect33) as appropriate

Only active entities (processes) have mutable labelsand privileges all other (passive) entities have im-mutable labels and no privileges

Privileges are allocated by the kernel and owned bythe creating process (any process can create tags and theassociated privileges in a decentralised fashion) Privi-leges can be passed to other processes users or groupsCamFlow-LSM verifying that constraints on privilegedelegation (sect343) and conflict of interest (sect35) are notviolated A process can add or remove a tag from itslabel if it owns the appropriate privilege (following IFCconstraints described in sect342) if the current user ownsthe privilege or if the current group owns the privilegeHow tags are shared and managed must be consideredwith care when designing an application and the systemmust be administered accordingly

411 Checkpointing and Restoration

Checkpointing a process involves halting its executionallowing it to be restarted at a later stage and enablingmigration eg [37] LSM state is normally saved andrestored by the checkpointing system eg [38] andour module further exports an API to more efficientlyserialise and restore security context

Furthermore self-checkpointing and restoring the pre-vious state of a process has been demonstrated [39] to bea beneficial feature for IFC systems This is particularlyuseful for processes serving requests In such a scenariothe state of the process is saved after initialisation Whena request is received the serving process sets itself up

0 10 20 30 40 50 60 70 80 90

sys pipe

sys write

sys read

sys clone

microsdyn label IFC LSM Native

Fig 4 Overhead introduced into the OS by CamFlowLSM (x-axis time in micros)in the security context appropriate to serve the requestAfter the request is served (or a series of requests if thesystem is session-based as described in sect7) the processrestores its memory state and security context to whatthey were immediately after initialisation This improvesperformance and prevents data leaks between securitycontexts

412 OS Evaluation

We tested the CamFlow-LSM module on Linux Kernelversion 3178 (012015) from the Fedora distribution3

The tests are run on an Intel 22Ghz i7 CPU and 6GiBRAM machine

Measurements are done using the Linux tool ftrace [40]to provide a microbenchmark Two processes read fromand write to a pipe respectively Each has 20 tags in itssecurity label substantially more than we have seen aneed for in current use cases We measure the overheadinduced by creating a new process (sys clone) creatinga new pipe (sys pipe) writing to the pipe (sys write) andreading from the pipe (sys read) The results are givenin Fig 4

We can distinguish two types of induced overheadverifying an IFC constraint (sys read sys write) and allo-cating labels (sys clone sys pipe) The sys clone overheadis roughly twice that of sys pipe as memory is allocateddynamically for the active entityrsquos labels and privilegesRecall that passive entities have no privileges Overheadmeasurements for other system callsdata structures areessentially identical as they rely on the same underlyingenforcement mechanism and are not included

The CamFlow-LSM overhead is a few percent seeFig 4 We provide a build option that further improvesperformance by declaring labels and privileges with afixed maximum size (by default label size can increasedynamically to meet application requirements) This re-duces the overhead of the system calls that create newentities (the dynamic label component in Fig 4) This isan acceptable trade-off as in practical scenarios labelsrarely exceed more than five tags However for mostapplications the overhead is imperceptible and lost insystem noise it is hard to measure without using kernel

3 It is not feasible to provide a comparison with the Laminarimplementation [28] that is closest in technical terms to our workas the implementation available httpsgithubcomut-osalaminaris for an obsolete kernel version 2622 (072007)

8

tools as the variation between two executions may begreater than the overhead

42 Trusted Processes

The CamFlow-LSM is trusted to enforce IFC at the kernellevel Its functionality is minimal strictly confined tothe enforcement of IFC policies as described in sect3 Thisguarantees easier maintainability and a system that isagnostic to higher level application requirements thusminimising the constraints imposed on user-space ap-plication design

We introduce the concept of a trusted process that al-lows applicationplatform-specific concerns to be man-aged in user space by bypassing some LSM-enforced IFCconstraints For example a trusted process might serveas a proxy for external connections as in the Trusted IFCGateway in the example in sect7 setting up and managingapplication componentsrsquo labels Trusted processes areused to interact with persistent storage (see sect53) forcheckpointing and restoring processes (see sect411) andfor managing inter-process and external communication(see sect5)

Fig 5 shows OS instances running the CamFlow-LSM hosting a number of application processes thatmay be grouped in containers Each OS instance hasa single trusted process (Security Context Manager) tomanage its hosted processesrsquo IFC labels and privileges Inaddition each process has an associated trusted middle-ware process to handle inter-process and inter-machinecommunication Such communication may be within orbetween containers OSs or clouds

In this example S represents a particular set of secrecytags and I a particular set of integrity tags both ofwhich remain the same throughout The application pro-cesses and other OS objects such as pipes and files arelabelled [S I] The process labelled [empty I] writes lsquopublicrsquodata to a pipe which is read by a process labelled [S I]assuming all the I tags match correctly Similarly twoprocesses are shown writing to and reading from a file

The Security Context Manager maps between thekernel-level representation of tags (as 64-bit integers)and the representation of tags in user space Within acloud or other trusted environment tags may be simplestrings When tags need to cross domain boundarieseg when cloud services form part of a wider archi-tecture as in IoT tags may need to be protected bycryptographic means (see sect51)

Trusted processes are either set up through staticconfiguration read at boot time by the CamFlow-LSMmodule or created at runtime by another trusted pro-cess Trusted processes must either be managed by atrusted party (in our current approach the underlying in-frastructure provider) andor the code must be auditableand a means to verify the current version running on theplatform must be provided (see sect43)

43 Leveraging Hardware Roots of Trust

Incorporating IFC into cloud-provider OSs would en-hance the trustworthiness of the platform HoweverIFC only guarantees protection above the technical layerin which it is enforced Recent hardware and softwaredevelopments make it possible to attest that the softwarelayers on which our platform runs have been audited

The Trusted Platform Module (TPM) [41] as used forremote attestation [42] is one such hardware mecha-nism TPM is used to generate a nearly unforgeablehash representing the state of the hardware and soft-ware of a given platform that can be remotely verifiedTherefore a company could audit the implementationof our IFC enforcement mechanism and ensure that ourkernel security module messaging middleware and theconfiguration they provide are indeed running on theplatform Any difference between the expected state ofthe software stack and the platform could be considereda breach of trust such considerations can easily beembedded in the contractual obligations of the cloudprovider

TPM and remote attestation for cloud computing [43]are reaching maturity with IBM rolling out an opensource scalable trusted platform based on virtual TPMs[44] Indeed Berger et al [44] describe a mechanismallowing the TPM and remote attestation to be providedfor virtual machine offerings and container-based solu-tions covering the whole range of contemporary cloudofferings Furthermore the approach not only allows thestate of the software stack to be verified at boot time butalso during execution and can thus prevent run-timemodification of the system configuration

5 CROSS-MACHINE ENFORCEMENT

CamFlow-LSM operates to protect flows within the OSHowever it is also important that flows are protectedacross OS instances

Generally in order to guarantee flow constraints onlyprocesses P such that S(P ) = empty and I(P ) = empty ie notsubject to IFC constraints are allowed to directly connectto or receive messages from connections on remote OS(eg through a socket) In order to connect to anothermachine a process must either 1) be able to declassifyto change its security context to S(P ) = empty and I(P ) = empty2) communicate through an intermediate trusted process

As such CamFlow contains an IFC-enabled fully-featured messaging middleware (CamFlow-MW) to bothfacilitate communication and guarantee enforcementacross machines For want of space we only considerthe middleware concepts relevant to IFC details on thegeneral middleware (as it was prior to IFCCamFlowintegration) can be found in [45] In short the middle-ware supports strongly-typed messages a range of in-teraction paradigms including request-reply broadcastand streams flexible resource discovery and securitymechanisms including access controls and encryptedcommunication A particular feature is its support for

9

CamFlow-MW CamFlow-MW CamFlow-MW CamFlow-MW CamFlow-MW

Process

CamFlow-LSM

Enforcement Point Pipe [empty I]

Process Process

SecurityContextManager

SecurityContextManager

Container

Container

Container

Process[S I]

CamFlow-LSM

[S I]

File[S I]

[S empty][S empty]Process[empty I]

Fig 5 CamFlow Architecture Labelled OS objects trusted processes and communication middleware

dynamic reconfiguration based on event-driven policyThis simplifies both application development and de-ployment as concerns can be abstracted and tailoredto the particular environment rather than embeddedwithin application code

The role of the middleware is to move towards contin-uous end-to-end data flow management such that IFCcan be enforced across applicationsmachines (kernels)There is work on IFC enforcement across machineshowever these impose specific requirements such asdesign-time considerations [46] a particular languageruntime [47] or constraints on system architectureimplementation [48] In contrast we integrate IFC func-tionality into the general fully featured distributed sys-tems middleware mentioned above (see [49]) to provideflexibility and be more generally applicable We delib-erately avoid imposing a structure on system designinstead integrating IFC functionality into the sort of com-munications infrastructure common to current enterpriseand cloud systems

51 Remote Interactions

CamFlow-MW operates by associating a trusted process(see sect42) with an entity that seeks to communicate viathe messaging system Its fit within the broader architec-ture is depicted in Fig 5 The process is responsible forhandling the communication of messages and enforcesIFC based on the current runtime labels of the entity onwhose behalf it operates

It follows that for IFC to be enforced across machinestags require system-wide management ie throughoutthe cloud service In [50] we proposed that the widelyused and available X509 certificates could be used Theapproach relies on public key certificates and attributecertificates [51] to respectively identify the applicationassociated with the CamFlow-MW instance and the tagsassociated with this application

As part of establishing a connection CamFlow-MWensures that each entity authorises communication withthe other according to a local access control policy De-cisions are based on component metadata the relevantauthentication aspects secured through PKI (certificates)Similarly IFC policy must also be verified to ensure dataflows are authorised according to IFC policy Attributecertificates provide cryptographic means to determineand verify the tags associated with the remote entity

(see [50]) on which policy can be enforced If tags donot accord the connection will not be established

We also see potential for remote attestation based onhardware integrity measures (see sect43) to be integratedinto this authorisation phase to ensure the remote ma-chine operates a reliable IFC enforcement regime

52 Message-Level Enforcement

CamFlow messages are strongly typed where a mes-sage type is defined by a schema describing its set ofattributes For an instance of a message an attributeconsists of a name type and value The support forIFC within messages is fine-grained in that individualattributes within messages can also be labelled Theseattribute labels introduce additional IFC constraints overand above those already applying to the entity ie asrecognised by the kernel-LSM and validated on connec-tion establishment

Labels can be defined within message type schemawhich sets the attributesrsquo IFC labels for all messageinstances of the type These labels cannot be changed byentities dealing in such messages and the entities musthold the requisite labels to interact with the attributesOtherwise the entity producingpublishing a messagecan set the security labels for the attributes (for those notpredefined) if the entity holds the associated privilegesEnforcement occurs as followsReceiving If the receiving entityrsquos labels do not agreewith those of an attribute value the attribute value (andany sub-attributes) are removed from (made null in) themessage This is enforced on message receipt before itis delivered to the entitySending An entity cannot send values for attributeswhere its labels do not agree with those of the attributeThis is enforced when an entity attempts to send amessage ensuring values for any attributes violating thispolicy are removed before message propagation

Enforcement is automatic meaning that applicationsusing the messaging system can be subject to IFC en-forcement completely transparently (ie without their di-rect involvement) though again there is the interface forthe application to actively manage IFC where requiredIn addition the general reconfiguration capabilities ofthe middleware enable connections between componentsto be defined and managed at runtime providing an-other mechanism for controlling communication [45]

10

0 100 200 300 400 500 600 700

Time in msIFC-overhead

Fig 6 IFC overhead of CamFlow-MW for a workloadtransmission of 5000 messages (x-axis in ms)53 Integrating With Persistent Storage

One technique to provide IFC with persistent data storesis to store the tags alongside the data and a trustedsoftware component ensures that when information isread from the store the corresponding labels are appliedIn Flume [14] a trusted process provides the interfacebetween untrusted applications and persistent storageMore recent work has seen the emergence of databasesthat natively understand IFC concepts and can enforceIFC policies [25]

We see much promise in having the middleware me-diate between persistence systems and the kernel toensure consistent IFC application

54 Evaluation

As shown in Fig 6 the results indicate that IFC en-forcement introduces an overhead of sim13 in perfor-mance time compared to the standard non IFC-enabledmiddleware (see [49] for details) Note that these resultswere measured in the context of a particular workloaddeliberately designed to highlight the impact of IFCenforcement It follows that the overheads associatedwith real-world usage are most likely less onerous

6 AUDIT DATA-CENTRIC LOGS

IFC in addition to providing strong assurances that pol-icy is being enforced can also provide a data-centric log[52] detailing the information flows within and betweensystem components In addition to enforcing IFC via ourLSM module we log the data flows of labelled processespolicy decisions privileges and IFC security contextmanipulations Equivalent inter-machine operations viathe middleware are also recorded

Cloud logging systems are generally based on legacylogging systems (OS web-server database etc) thateither fail to capture the needed information or areextremely complicated to interpret in a useful manner[53] More importantly such logs tend to be relevant onlyto the particular service or component which makes itdifficult if not impossible to audit across a range ofapplications clouds etc

IFC logs as provided by our platform allow us to cap-ture information on application-level data flows both at-tempted and permitted allowing the correct expressionand implementation of data flow policy to be checkedThis provides transparency allows for meaningful auditin terms of investigating the circumstances in which dataleakage occurs and provides evidence of complianceeg with legal obligations [54]

Information Exchange

CreateSecurity Context Change

e0

e1

e3e4

e7

e2 e5e6

P3[Sprimeprime I]

P2[Sprimeprime empty]

P3[S I]F1[S I]

P1[S I] P1[empty I]

Public[empty empty]F2[Sprime I prime]

Fig 7 Simplified audit graph from IFC OS execution (weomit metadata for readability) The path to disclosure isshown in bluepale61 Analysing Paths to Disclosure

To assist in interpreting log information we build a di-rected graph corresponding to the allowed flows duringthe execution of our system as shown in Fig 7 Theflows defined in our IFC model (see sect3) namely dataflow creation flow security context change and privi-lege delegation correspond to the edges of the directedgraph Entities (such as processes files messages etc)are represented as nodes in the graph

In addition to information necessary to build the graph(as shown in the figure) additional metadata is collectedfor forensic purposes which is contextentityevent-dependent These audit entries provided by the LSMand middleware can be exploited by a dedicated serviceimplemented in user space connected to the kernelcollection mechanism via relayfs [55] This service feedsfor example a graph visualisation tool such as Cytoscape[56] or a graph database such as Neo4J4

Such a directed graph helps one identify data leaksFor example a tenant might discover that some sensi-tive medical data leaked into a data store where onlyanonymised research data were supposed to be storedIFC is enforced in line with the policy encapsulated inlabels thus data may leak if such policy is improperly ex-pressed andor declassificationendorsement processesare not correctly implemented (eg if the anonymisationprocess in Fig 2 allows re-identification)

Suppose that an information leak is suspected be-tween different security contexts L1[S I] and L2[S

prime I prime]Determining whether such a leak can occur is equiv-alent to discovering whether there is a path in thegraph between the two contexts If the leak occurredthere must be a path between some entity Ei such thatS(Ei) = S and I(Ei) = I and another entity Fi such thatS(Fi) = Sprime and I(Fi) = I prime

The existence of such a path demonstrates that a leakis possible To investigate whether a leak occurred it isessential to consider the event ID associated with theedges comprising the path We denote by ei the last in-coming edge to the entity under investigation with labels[Sprime I prime] only edges such that e lt ei should be consideredWhen applied to all nodes along a path this rule ensuresstrictly monotonically increasing timestamps from thefirst node to the last Fig 7 shows in bluepale a possibledata disclosure path from file F2 from a very simple

4 httpneo4jcom

11

audit graph We know from the event IDs e0 and e1 thatthe data disclosure did not occur through file F1 andprocess P3 but through P1rsquos declassification

62 Demonstrating Compliance

Compliance with certain requirements can be demon-strated through queries over the graph We assume theaudit data is stored in a graph database that we canquery For example the following plain English policyldquoEuropean personal data sent to the US must be anonymisedrdquo[57] is equivalent to writing a query that verifies thatthere is no path between EU- and US-labelled datawithout an anonymisation processThe policy ldquoMedical data stored in database X must havereceived proper consent and be anonymisedrdquo [54] can beexpressed as a query verifying that there is no path be-tween data labelled as medical and the database withoutconsent-checking and anonymiser processes In additionan investigator may want to know which anonymisationalgorithm has been run which data has been used togenerate the anonymised records etc Our audit graphassists in answering such questions

Note that IFC only applies guarantees with respectto flows Demonstrating the overall effectiveness of themanagement regime eg the quality and suitability ofthe anonymisation algorithm is out-of-scope for a flow-based enforcement mechanism

63 Audit as lsquoBig Datarsquo

We are potentially generating a vast amount of data inour IFC logs However unlike standard system logs thatare complex to analyse our logs generate graphs thatare ideal for analysis by ldquobig datardquo tools that have beendeveloped for this purpose [58]

Since the amount of data is potentially huge theamount of data being logged can be fine-tuned to meetthe requirements of the platformtenant eg by reduc-ing the amount of metadata being stored by loggingonly security context changing operations by loggingonly information corresponding to some target securitycontext keeping operations on unlabelled entities out-side of the log etc The decision on what needs to belogged then becomes a tradeoff between utility and thevolume (cost) of log generated which can be decided inorder to correspond to legal or contractual requirements(for example a regulated sector may need to have afine-grained log to satisfy data forensic requirements)Indeed as such an approach is new to the cloud suchconsiderations will be refined by experience with bestpractices developing over time

64 Audit Access

Logs can contain sensitive information and access tothem should be controlled This represents an area ofour ongoing work Traditional access controls clearlyplay a role however secrecy tags could also be lever-aged For example an auditor before being grantedaccess to audit logs could be forced to demonstrate

ownership of the corresponding secrecy IFC tags (forexample through cryptographic means as in sect51) Theauditor may be granted access to a log entry only ifS(origin) cup S(destination) sube S(auditor)7 EXAMPLE SUPPORT FOR WEB SERVICES

One of the most common uses of PaaS is to host web ap-plications In this section we present the implementationof such a solution built on the infrastructure described insect4 in order to evaluate and demonstrate the feasibility ofour proposed approach This is illustrated in Fig 8 Werun standard and unmodified Ruby web applications

Interaction with end-users is achieved through aldquogatewayrdquo between the IFC and non-IFC worlds Simi-larly interaction with cloud services (such as data stores)is also achieved through our messaging middleware asdiscussed in sect5 The requirement for this gateway canbe removed if a trustworthy IFC implementation can beprovided at the client side consistent with the cloud im-plementation with respect to tag naming enforcementetc Tag naming in general system-wide is an issuebeyond the scope of this paper see further sect8 In ourproof of concept implementation the gateway is a simpleApache server running a custom-built module

The role of the gateway is to authenticate the end-userwhen a session is created and to associate this sessionwith an application instance running within the securitycontext corresponding to the user Recall that a securitycontext comprises the S and I labels Any further re-quests to the gateway in that session are routed to thecorresponding application instance Once an instance nolonger has an associated session it can be recycled usingself-checkpointing as described in sect411

Several application types are running over our cloud-based web services platform For example in a medicalcontext these might be medical record editing pharmacyordering social services etc A single shared identityservice for the end-user is part of the cloud provideroffering (in our proof of concept implementation weused OAuth [59])

The GP authenticates is authorised as treating doc-tor for Alice and selects the lsquoidentityrsquo that correspondsto Alice A new session is created server-side by thegateway with the requested application instance run-ning in the corresponding security context with S =[medical alice] I = [empty] When the GP wants to accessapplications on behalf of a new patient he needs to closeAlicersquos session authorise as treating doctor for Bob andopen a new session for Bob

The control described above is not achieved by theapplication but by the platform itself and can be con-trolled by the end-user subject to access control Thatis a medical application used on Alicersquos behalf runs ina security context in which data cannot flow to that ofanother patient Furthermore applications running onbehalf of a given user can share the data of that userwithout the risk of seeing a buggy application leakingdata between end-users

12

App Instance App InstanceApp Instance

CamFlow-MW

IFC Constrained IFC Constrained IFC Constrained

ContextA

ContextB

Public

Trusted IFC Gateway

Enforcing IFCData-store

Data-store Application

CamFlow-MW CamFlow-MWCamFlow-MW

IFC Cloud Platform

Dr Yellow Dr PinkDr MauveRemote Clients

Fig 8 PaaS Architecture on top of IFC-OS

As described in sect4 we assume the middleware andthe OS enforcement are provided as a service by theunderlying platform A tenant wanting to use the third-party web-service offering once his trust in the under-lying platform is established needs only to audit thegateway again the underlying infrastructure providercould either provide such a gateway or audit it Therest of the software stack of the third-party web-serviceprovider is bound by the IFC enforcement mechanismand therefore need not be trusted

8 CONCLUSION amp FUTURE WORK

IFC allows data flows to be controlled continuouslythroughout a system by providing an information-centric MAC scheme that continuously ensures non-interference between security contexts This paper pre-sented the CamFlow platform that demonstrates thepotential of cloud-deployed IFC as supporting (1) pro-tection of applications from each other (2) flexible datasharing across isolation boundaries (3) prevention ofdata leakage due to bugsmisconfigurations (4) exten-sion of access control beyond application boundaries (5)data flow transparency

Specifically we detailed a new kernel implementationof IFC as an LSM demonstrating low overhead evenfor worst-case scenarios where processes continuouslymake readwrite system calls We also described theintegration of a messaging middleware to enforcing IFCacross machines This combination makes it possible toprovide whole-system IFC to PaaS cloud services andtherefore also SaaS Our approach and implementationwere designed so that applications can run unchangedover IFC thus making cloud adoption feasible

We also indicated how the data-centric logs based onIFC enforcement could provide the means to audit anIFC-enabled system whereby a log can be processed asa directed graph to investigate leaks and attacks andshow compliance with data management requirementsThough this represents our initial work in the area thereappears much promise

In light of the above we believe that IFC has greatpotential as a security mechanism for the cloud wherebytrust in a few major cloud providers deploying IFCcan be built on to provide a demonstrably trustworthycomputing environment

CamFlow was developed with cloud deployment inmind Our future work will investigate the challenges ofa broader distributed context

It is already feasible to extend CamFlow to sup-port mobile environments Android supports the fullSELinux enforcement5 and an Android-LSM integrationhas been demonstrated [60] But when dealing withmultiple cloud services particularly as they become partof a wider distributed architecture such as in the IoTa trustworthy system-wide deployment of IFC can nolonger be assumed Much work remains on establishingtrust in (and the trustworthiness of) the IFC enforcementmechanism within end-usersrsquo devices Outside a cloudcontext all partiesrsquo trust in a common third partyrsquos en-forcement of IFC constraints cannot be assumed (unlikethe cloud provider for cloud services) We intend toexplore leveraging hardware roots of trust and remoteattestation to ensure the integrity and trustworthiness ofIFC enforcement mechanisms

Another area of investigation concerns the represen-tation of tags across administrative domains In a cloudcontext a federated approach can be envisaged wherea common understanding of tags could be negotiatedacross multiple domains For example in [54] we dis-cussed initial thoughts on managing data according tospecific obligation regimes However as the numberof administrative domains increases both a global tagnaming scheme and mechanisms for ad-hoc negotiationbecome necessary

Related is the sensitivity of the tags themselvesKnowledge of the meaning of a tag can indicate thatthe associated entity contains or deals with certain infor-mation If a relationship can be established between anentity and the information owner this may disclose pri-vate information about the information owner This maylead to work on a need-to-know negotiation mechanismto establish a secure channel between hosts especiallyin a wide-scale distributed system A promising mech-anism is private set intersection to determine tagsrsquo subsetrelationship (sect341)

Another challenge concerns extending audit to dis-tributed architectures both in terms of resource man-agement and regulating access to log data

5 httpssourceandroidcomdevicestechsecurityselinux

13

ACKNOWLEDGMENTS

This work was supported by UK Engineering andPhysical Sciences Research Council grant EPK011510CloudSafetyNet End-to-End Application Security inthe Cloud We acknowledge the support of Microsoftthrough the Microsoft Cloud Computing Research Cen-tre

REFERENCES

[1] P Barham B Dragovic K Fraser S Hand T Harris A HoR Neugebauer I Pratt and A Warfield ldquoXen and the art ofvirtualizationrdquo SIGOPS Operating Systems Review vol 37 no 5pp 164ndash177 2003

[2] A Kivity Y Kamay D Laor U Lublin and A Liguori ldquokvmthe Linux Virtual Machine Monitorrdquo in Proceedings of the LinuxSymposium vol 1 2007 pp 225ndash230

[3] D Bernstein ldquoContainers and Cloud From LXC to Docker toKubernetesrdquo IEEE Cloud Computing Magazine no 3 pp 81ndash842014

[4] P Desnoyers O Krieger B Holden and J Hennessey ldquoUsingOpenStack for an Open Cloud eXchange (OCX)rdquo in InternationalConference on Cloud Engineering (IC2E) IEEE 2015

[5] J Mineraud O Mazhelis X Su and S Tarkoma ldquoA Gap Analysisof Internet-of-Things Platformsrdquo 2015 Arxiv arXiv150201181[Online] Available httparxivorgabs150201181

[6] C J Millard Ed Cloud Computing Law Oxford University Press2013

[7] S Pearson and M Casassa-Mont ldquoSticky Policies An Approachfor Managing Privacy across Multiple Partiesrdquo Computer vol 44July 2011

[8] D E Denning ldquoA lattice model of secure information flowrdquoCommunication of the ACM vol 19 no 5 pp 236ndash243 1976

[9] D E Bell and L J LaPadula ldquoSecure Computer Systems Mathe-matical Foundations and Modelrdquo The MITRE Corp Bedford MATech Rep M74-244 1973

[10] K J Biba ldquoIntegrity Considerations for Secure Computer Sys-temsrdquo MITRE Corp Tech Rep ESD-TR 76-372 1977

[11] A C Myers and B Liskov ldquoA Decentralized Model for Infor-mation Flow Controlrdquo in 17th Symposium on Operating SystemsPrinciples (SOSP) ACM 1997 pp 129ndash142

[12] A Bouard B Weyl and C Eckert ldquoPractical information-flowaware middleware for in-car communicationrdquo in Workshop onSecurity privacy amp dependability for cyber vehicles ACM 2013pp 3ndash8

[13] K Singh S Bhola and W Lee ldquoxBook Redesigning PrivacyControl in Social Networking Platformsrdquo in Security SymposiumUSENIX 2009 pp 249ndash266

[14] M Krohn A Yip M Brodsky N Cliffer M F KaashoekE Kohler and R Morris ldquoInformation Flow Control for StandardOS Abstractionsrdquo in Symposium on Operating Systems PrinciplesACM 2007 pp 321ndash334

[15] S Lee E L Wong D Goel M Dahlin and V ShmatikovldquoπBox A Platform for Privacy-Preserving Appsrdquo in Symposiumon Networked System Design and Implementation USENIX 2013pp 501ndash514

[16] N Kumar and R Shyamasundar ldquoRealizing Purpose-Based Pri-vacy Policies Succinctly via Information-Flow Labelsrdquo in Big Dataand Cloud Computing (BDCloudrsquo14) IEEE 2014 pp 753ndash760

[17] W Enck P Gilbert B-G Chun L P Cox J Jung P McDaniel andA N Sheth ldquoTaintDroid An information-flow tracking systemfor realtime privacy monitoring on smartphonesrdquo in Conference onOperating systems design and implementation (OSDIrsquo10) USENIX2010 pp 1ndash6

[18] I Papagiannis M Migliavacca and P Pietzuch ldquoPHP AspisUsing partial taint tracking to protect against injection attacksrdquoin 2nd USENIX Conference on Web Application Development 2011p 13

[19] E Schwartz T Avgerinos and D Brumley ldquoAll you ever wantedto know about dynamic taint analysis and forward symbolicexecution (but might have been afraid to ask)rdquo in Symposium onSecurity and Privacy IEEE 2010

[20] M Casassa-Mont S Pearson and P Bramhall ldquoTowards account-able management of identity and privacy Sticky policies andenforceable tracing servicesrdquo in Database and Expert Systems Ap-plications 2003 Proceedings 14th International Workshop on IEEE2003 pp 377ndash382

[21] S Bandhakavi C C Zhang and M Winslett ldquoSuper-sticky anddeclassifiable release policies for flexible information dissemina-tion controlrdquo in Proceedings of the 5th ACM workshop on Privacy inelectronic society ACM 2006 pp 51ndash58

[22] D W Chadwick and S F Lievens ldquoEnforcing sticky securitypolicies throughout a distributed applicationrdquo in Proceedings ofthe 2008 workshop on Middleware security ACM 2008 pp 1ndash6

[23] M Casassa-Mont I Matteucci M Petrocchi and M L SbodioldquoTowards Safer Information Sharing in the Cloudrdquo InternationalJournal of Information Security pp 1ndash16 2014

[24] S Akoush L Carata R Sohan and A Hopper ldquoMrLazy LazyRuntime Label Propagation for MapReducerdquo in 6th Workshop onHot Topics in Cloud Computing (HotCloud) USENIX 2014

[25] D Schultz and B Liskov ldquoIFDB Decentralized Information FlowControl for Databasesrdquo in European Conference on Computer Sys-tems (Eurosysrsquo13) ACM 2013 pp 43ndash56

[26] D Von Oheimb ldquoInformation Flow Control Revisited Nonin-fluence = Noninterference + Nonleakagerdquo in Computer SecurityndashESORICS 2004 Springer 2004 pp 225ndash243

[27] D Hedin and A Sabelfeld ldquoA perspective on Information-FlowControlrdquo NATO 2012

[28] D E Porter M D Bond I Roy K S McKinley and E WitchelldquoPractical Fine-Grained Information Flow Control Using Lam-inarrdquo ACM Transactions on Programming Languages and Systems(TOPLAS) vol 37 no 1 p 4 2014

[29] C Wright C Cowan S Smalley J Morris and G Kroah-Hartman ldquoLinux Security Modules General security support forthe Linux kernelrdquo in Foundations of Intrusion Tolerant SystemsIEEE 2003 pp 213ndash213

[30] M Quaritsch and T Winkler ldquoLinux Security Modules Enhance-ments Module Stacking Framework and TCP State TransitionHooks for State-Driven NIDSrdquo Secure Information and Communi-cation vol 7 pp 7ndash13 2004

[31] C Schaufler ldquoLSM Generalize existing module stackingrdquo LinuxWeekly News 2014

[32] S Smalley C Vance and W Salamon ldquoImplementing SELinux asa Linux Security Modulerdquo NAI Labs Report vol 1 p 43 2001

[33] M Bauer ldquoParanoid Penguin an Introduction to Novell AppAr-morrdquo Linux Journal vol 2006 no 148 p 13 2006

[34] A Edwards T Jaeger and X Zhang ldquoRuntime verification ofauthorization hook placement for the Linux Security Modulesframeworkrdquo in Conference on Computer and Communications Secu-rity ACM 2002 pp 225ndash234

[35] T Jaeger A Edwards and X Zhang ldquoConsistency analysis ofauthorization hook placement in the Linux security modulesframeworkrdquo ACM Transactions on Information and System Security(TISSEC) vol 7 no 2 pp 175ndash205 2004

[36] V Ganapathy T Jaeger and S Jha ldquoAutomatic placement ofauthorization hooks in the Linux security modules frameworkrdquoin Conference on Computer and Communications Security ACM2005 pp 330ndash339

[37] I P Egwutuoha D Levy B Selic and S Chen ldquoA survey of faulttolerance mechanisms and checkpointrestart implementationsfor high performance computing systemsrdquo Springer 2013vol 65 no 3 pp 1302ndash1326

[38] O Laadan and S E Hallyn ldquoLinux-CR Transparent applicationcheckpoint-restart in Linuxrdquo in Linux Symposium 2010 p 159

[39] B Niu and G Tan ldquoEfficient User-space Information Flow Con-trolrdquo in Symposium on Information Computer and CommunicationsSecurity (SIGSACrsquo13) ACM 2013 pp 131ndash142

[40] T Bird ldquoMeasuring Function Duration with ftracerdquo in Japan LinuxSymposium 2009 pp 47ndash54

[41] B Parno ldquoBootstrapping Trust in ardquo Trustedrdquo Platformrdquo in Con-ference on Hot Topics in Security (HotSecrsquo08) USENIX 2008

[42] N Santos K P Gummadi and R Rodrigues ldquoTowards TrustedCloud Computingrdquo in Conference on Hot Topics in Cloud Comput-ing USENIX 2009 pp 3ndash3

[43] S Berger R Caceres K A Goldman R Perez R Sailer andL van Doorn ldquovTPM Virtualizing the Trusted Platform Modulerdquoin Security Symposium USENIX 2006 pp 305ndash320

14

[44] S Berger K Goldman D Pendarakis D Safford E Valdezand M Zohar ldquoScalable Attestation A Step Toward Secure andTrusted Cloudsrdquo in International Conference on Cloud Engineering(IC2E) IEEE 2015

[45] J Singh D Eyers and J Bacon ldquoPolicy Enforcement withinEmerging Distributed Event-Based Systemsrdquo in ACM DistributedEvent-Based Systems (DEBSrsquo14) 2014

[46] L Sfaxi T Abdellatif R Robbana and Y Lakhnech ldquoInformationFlow Control of Component-based Distributed Systemsrdquo Concur-rency and Computation Practice and Experience vol 25 no 2 pp161ndash179 2013

[47] S Yoshihama T Yoshizawa Y Watanabe M Kudoh and K Oy-anagi ldquoDynamic Information Flow Control Architecture for WebApplicationsrdquo in ESORICS 2007 J Biskup and J Lopez EdsSpringer 2007 vol LNCS 4734 pp 267ndash282

[48] W Cheng D R K Ports D Schultz V Popic A BlanksteinJ Cowling D Curtis L Shrira and B Liskov ldquoAbstractions forUsable Information Flow Control in Aeolusrdquo in USENIX AnnualTechnical Conference Boston 2012

[49] J Singh T Pasquier J Bacon and D Eyers ldquoIntegrating Middle-ware and Information Flow Controlrdquo in International Conferenceon Cloud Engineering (IC2E) IEEE 2015 pp 54ndash59

[50] J Singh T F J-M Pasquier and J Bacon ldquoSecuring Tags toControl Information Flows within the Internet of Thingsrdquo inInternational Conference on Recent Advances in Internet of Things(RIoTrsquo15) IEEE 2015

[51] J S Park and R Sandhu ldquoBinding Identities and Attributes UsingDigitally Signed Certificatesrdquo in Annual Conference on ComputerSecurity Applications IEEE 2000 pp 120ndash127

[52] A Ganjali and D Lie ldquoAuditing cloud management using infor-mation flow trackingrdquo in Workshop on Scalable Trusted ComputingACM 2012 pp 79ndash84

[53] R K Ko M Kirchberg and B S Lee ldquoFrom System-centric toData-centric Logging-accountability Trust amp Security in CloudComputingrdquo in Defense Science Research Conference and Expo (DSR)2011 IEEE 2011 pp 1ndash4

[54] J Singh J Powles T Pasquier and J Bacon ldquoData Flow Man-agement and Compliance in Cloud Computingrdquo IEEE CloudComputing Magazine SI on Legal Clouds 2015

[55] T Zanussi K Yaghmour R Wisniewski R Moore and M Dage-nais ldquorelayfs An efficient unified approach for transmitting datafrom kernel to user spacerdquo in Linux Symposium 2003 p 494

[56] M E Smoot K Ono J Ruscheinski P-L Wang and T IdekerldquoCytoscape 28 New features for data integration and networkvisualizationrdquo Oxford University Press 2011 vol 27 no 3 pp431ndash432

[57] T Pasquier and J Powles ldquoExpressing and Enforcing LocationRequirements in the Cloud using Information Flow Controlrdquo inIC2E International Workshop on Legal and Technical Issues in CloudComputing (CLawrsquo15) IEEE 2015

[58] R Angles and C Gutierrez ldquoSurvey of Graph Database ModelsrdquoACM Computing Surveys (CSUR) vol 40 no 1 p 1 2008

[59] D Hardt ldquoThe OAuth 20 Authorization Frameworkrdquo IETF TechRep RFC 6749 2012

[60] S Smalley and R Craig ldquoSecurity Enhanced (SE) Android Bring-ing Flexible MAC to Androidrdquo in Network and Distributed SystemSecurity Symposium Internet Society 2013

  • 1 Introduction and Motivation
  • 2 Background
    • 21 IFC Models
    • 22 Protection via VMs and Containers
    • 23 Taint Tracking (TT) Systems
    • 24 Sticky Policies
      • 3 CamFlow-Model IFC for the Cloud
        • 31 Tags and Labels
        • 32 Decentralised Privileges and Security Contexts
        • 33 Creating a New Entity
        • 34 Security
          • 341 Information Exchange
          • 342 Label Change
          • 343 Privilege delegation
            • 35 Conflict of Interest
              • 4 OS Enforcement
                • 41 CamFlow-LSM
                  • 411 Checkpointing and Restoration
                  • 412 OS Evaluation
                    • 42 Trusted Processes
                    • 43 Leveraging Hardware Roots of Trust
                      • 5 Cross-Machine Enforcement
                        • 51 Remote Interactions
                        • 52 Message-Level Enforcement
                        • 53 Integrating With Persistent Storage
                        • 54 Evaluation
                          • 6 Audit Data-Centric Logs
                            • 61 Analysing Paths to Disclosure
                            • 62 Demonstrating Compliance
                            • 63 Audit as `Big Data
                            • 64 Audit Access
                              • 7 Example Support for Web Services
                              • 8 Conclusion amp Future Work
                              • References
Page 3: CamFlow: Managed data-sharing for cloud services · CamFlow: Managed data-sharing for cloud services Thomas F. J.-M. Pasquier, Member, IEEE, Jatinder Singh, Member, IEEE, David Eyers,

3

2 BACKGROUND

We first define the scope of current isolation mech-anisms highlighting the need for flexible data shar-ing at application-level granularity ie where applicationsmanage their own security concerns as well as strongisolation between tenants andor applications As anintroduction to IFC we outline the evolution of IFCmodels Related work on IFC implementation at the OSlevel and within distributed systems is given with therelevant sections We end with a brief comparison of IFCwith taint tracking (TT) and sticky policies

21 IFC Models

In 1976 Denning [8] proposed a Mandatory AccessControl (MAC) model to track and enforce rules oninformation flow in computer systems In this modelentities are associated with security classes The flow ofinformation from an entity a to an entity b is allowedonly if the security class of b (denoted b) is equal to orhigher than a This allows the no-read up no-write downprinciple of Bell and LaPadula [9] to be implementedto enforce secrecy By this means a traditional militaryclassification public secret top secret can be implementedA second security class can be associated with eachentity to track and enforce integrity (quality of data)no read down no write up as proposed by Biba [10] Acurrent example might allow input of information froma government website in the govuk domain but forbidthat from ldquoJoersquos Blogrdquo Using this model we are ableto control and monitor information flow to ensure datasecrecy and integrity

In 1997 Myers [11] introduced a Decentralised IFCmodel (DIFC) that has inspired most later work Thismodel was designed to meet the changing needs ofsystems from global static hierarchical security levelsto a more flexible system able to capture the needsof different applications In this model each entity isassociated with two labels a secrecy label and an integritylabel to capture respectively the privacyconfidentialityof the data and the reliability of a source of data Eachlabel comprises a set of tags each of which representssome security concern Data is allowed to flow if thesecurity label of the sender is a subset of the label of thereceiver and conversely for integrity

Implementations of a decentralised model akin toMyersrsquo include a sensitive embedded system for BMWcars [12] and XBook [13] in a social media contextOur own model is described in sect3 When implementedfrom the OS kernel level applications running underIFC enforcement do not need to be trusted for the datamanagement policy to be properly enforced [14]

22 Protection via VMs and Containers

Isolation of tenants in cloud platforms is throughhypervisor-supported virtual machines [1] [2] or OS-provided containers [3] However flexible sharing mech-anisms are also required to manage data exchange be-tween applications contributing to more complex sys-

tems or to achieve end-user goals For example gov-ernment applications might access citizensrsquo records forvarious purposes a userrsquos data from different applica-tions might together contribute to evidence related tohealth or wellbeing

At present the sharing of information between appli-cations tends to involve a binary decision (ie to shareor not) as for example in Google pods (containers)2

Whole resources can be shared but no control over datausage between applications is provided Furthermorethere are no means for preventing leakage outside ofthe mechanisms implemented by the individual appli-cationsservices

Solutions have been proposed to provide intra-application sandboxes (down to individual end-users)[15] but such schemes are difficult to scale requirechanges in application logic and still do not providecontrol beyond isolation boundaries (ie again loss ofcontrol once the data is shared)

IFC has been proposed to guarantee the proper usageof data by social network applications [13] The aim is toprovide purpose-based disclosure via IFC [16] betweenisolated components thus guaranteeing that shared datacan only be used for a well-defined and agreed-uponpurpose

IFC is by no means proposed as a replacement foraccess control VMs or containers but rather as a com-plement to those techniques to provide flexible manageddata-sharing IFC would allow tenants and end-users tomaintain control (within an IFC-enforcing world) anddefine policy applying to their data consistently andbeyond isolation and application borders

23 Taint Tracking (TT) Systems

Runtime dynamic TT is similar to IFC but with less func-tionality TT systems use one tag type ldquotaintrdquo instead ofsecrecy and integrity tags Tags propagate with data anddata flows may be logged An entity that inputs taggeddata acquires the datarsquos tag(s) Data flow constraintsare only enforced at specified sink points for examplewhen data attempts to leave a mobile phone [17] Policyis applied at sink points such as preventing privateunanonymised or unencrypted data from flowing orstrictly controlling to where data may flow

An example of TT used for integrity purposes is totaint data from untrusted sources eg user input froma TCP stream in a web application environment andenforce that it is sanitised before being processed [18]This simple mechanism prevents injection attacks thatplague badly designed web applications An example ofTT used for confidentiality purposes is to taint sensitiveinformation eg a list of contacts in a mobile phone andtrack it through this closed system [17] Data leaving thesystem (ie the phone) is analysed to ensure it does notcontain sensitive information Data containing sensitiveinformation should only leave to a number of closely

2 httpscloudgooglecomcontainer-enginedocspods

4

controlled destinations such as the cloud backup contactlist This approach aids the detection of malicious ap-plications attempting to steal user-sensitive informationand send it to third parties Equally this type of concerncan be captured through the use of IFC policies

One concern with TT systems is that there is a gap intime between the occurrence of the issue (eg a leak anattack) and when it is detected [19] ie problems becomeevident only when the tainted data reaches a sink (en-forcement point) Depending on the degree of isolationbetween the different parts of the system and the num-ber of system components involved this tainted datamay have lsquocontaminatedrsquo much of the system Whilethis can be managed in smaller closed environmentsit is less appropriate for cloud services in general IFCpolicies present the clear advantage to prevent problemsas they occur and to stop their effects propagating to apotentially large part of the system

Some argue that TT is simpler to use than IFC andincurs lower overhead but when the enforcement issystemic and the granularity identical the overheadsare similar (compare [17] and the evaluation in sect4 andsect5) Indeed the complexity of verifying IFC policy (seesect3) is comparable to the cost of propagating taint Forboth techniques most of the overhead comes from themechanism for intercepting data exchange

24 Sticky Policies

IFC can be seen as a mechanism for enforcing policythe labels associated with entities represent applicationpolicies IFC is a simple low-level mechanism Stickypolicy approaches also consider the enforcement of data-bound policy but at a higher-level

Casassa-Mont et al [20] first introduced sticky policieswhich involves encrypting data along with a list of poli-cies to be enforced on that data To obtain the decryptionkey from a Trusted Authority (TA) a party must agree toenforce the policies associated with the data This agree-ment may be considered as part of forming a contractuallink between the data owner and the service providerWork has continued in the area [21]ndash[23] Sticky policiestypically enforced at the application-level are generallymore complex and heavyweight than the simple secrecyand integrity constraints of IFC As such sticky policiestend only to be enforced at particular points eg atadministrative boundaries IFC on the other hand aswe show in sect412 can be enforced continuously at areasonable cost In sect3 we discuss how complex policiesmight be built from IFC labels

Further the sticky policy approach builds upon thetrust established between the data owner the TAs andservices that use the data A non-compliant service couldbe black-listed but only if and when a breach of agree-ment is detected and the TA updated Our IFC approachbuilds only upon the trust between the data owner andthe cloud provider Services and applications running ontop of the cloud provider platform need not be trustedWe believe this to be a great improvement to the overall

Alice RecordS(A) = alicemedical

I(A) = hosp-dev consent

Bob RecordS(B) = bobmedical

I(B) = hosp-dev consent

Alicersquos app instanceS(C) = alicemedical

I(C) = consentAllowed FlowPrevented Flow

Fig 1 An allowed safe flow and prevented flowstrustworthiness of the system

3 CAMFLOW-MODEL IFC FOR THE CLOUD

IFC operates to ensure that only permitted flows ofinformation can occur by enforcing data flow policy dy-namically end-to-end within and across applicationsservices Entities to which IFC constraints are appliedcan include a MapReduce worker instance [24] a file aprocess a database entry [25] etc In CamFlow IFC is ap-plied continuously typically on every system call for anIFC-enabled OS and on communication mechanisms forenforcement across applicationsruntime environmentsIFC policy should therefore be as simple as possibleto allow verification human understanding and to min-imise runtime overhead Indeed there is no need for IFCto encapsulate every possible policy rather it augmentsother control mechanisms and can help enforce theirpolicies

31 Tags and Labels

We define tags that are tokens each representing somesecurity concern over secrecy or integrity The tagbob-private could for example represent Bobrsquos personaldata We associate every entity in the system with twolabels (sets of tags) an entity A has a secrecy label S(A)and an integrity label I(A) The state of these labels isthe security context of the entity The power of IFC is thatit guarantees non-interference between security contexts[26] [27]Example ndash secrecy Suppose a patient Bob is dischargedfrom hospital to be medically monitored at home Thedata streams from his sensors are transferred to a cloudservice and are to be shared with his medical team atthe hospital The data items from his devices are taggedwith medical bob in their secrecy labelsExample ndash integrity The cloud-based home monitoringsupport service needs to be assured that the data itreceives is from a hospital-issued device Each sensingdevice is checked and issued with the tag hospital-devicein its integrity label

Fig 1 illustrates information flow constraints beingapplied over both secrecy and integrity dimensions

32 Decentralised Privileges and Security Contexts

In decentralised IFC (DIFC) any active entity can createnew tags Tag creation is typically carried out by appli-cation managers when setting up application instancesWhen an active entity creates a new tag either for secrecyor integrity this process is given the correspondingprivilege to add and remove the tag to its secrecy orintegrity label respectively If an active entity A has a

5

Medical RecordS = personal

I = empty

Consent Checker

S = personalI = cons

Anonymiser

S = researchI = cons anon

Research DatabaseS = research

I = cons anon

Researcher PortalS = research

I = cons anonAllowed FlowPrevented Flow

Research Project XXNHS Cloud S = personalI = empty

S = personalI = cons

Context change

Fig 2 Medical data declassified and endorsed for research purposes

privilege to add t to its secrecy label we denote thist isin P+

S (A) and to remove t from its secrecy labelt isin PminusS (A) (and similarly P+

I (A) and PminusI (A) are the priv-ileges for integrity) An active entity may therefore havefour privilege sets in addition to its security contextApplication managers will normally set up applicationinstances in security contexts without the privileges tochange them An example is given in sect7

33 Creating a New Entity

We define A rArr B as the operation of the entity Acreating the entity B An example is creating a processin a Unix-style OS by clone We have the following rulesfor creation

if ArArr B then

S(B) = S(A)

I(B) = I(A)

That is the created entity inherits the security contextof its creator These rules force the creating entity toexplicitly change its security context to that required forthe entity to be created We motivate this below in sect342Note that only labels pass to the created entity privilegeshave to be passed explicitly

34 Security

The purpose of IFC models is to regulate flows betweenentities and effect label changes and privilege delega-tion

Definition 1 A system is secure in the CamFlow IFC modelif and only if all allowed messages are safe (Definition 2) allallowed label changes are safe (Definition 3) and all privilegedelegation is safe (Definitions 4 and 5)

341 Information Exchange

IFC prevents data leakage by controlling the exchangeof information We follow the classic pattern for IFC-guaranteed secrecy (no read up no write down [9]) andintegrity (no read down no write up [10])

Definition 2 A flow of information A rarr B is safe if andonly if

Ararr B iff S(A) sube S(B) and I(B) sube I(A)Example ndash secrecy enforcement Consider our exam-ple of patient monitoring after discharge from hospital

where the patientrsquos devices are tagged with medical bobin their secrecy labels In order for the cloud service tobe able to receive this data it must also include the tagsmedical bob in its secrecy label Therefore an applicationinstance accessing Bobrsquos medical data must be labelled assuch In sect7 we describe how applications can be designedto meet such requirementsExample ndash integrity enforcement The cloud-basedhome monitoring support service needs to be assuredthat the data it receives is from a hospital-issued de-vice To achieve this the service has an integrity taghospital-issued in its integrity label and will only acceptdata from devices with tags hospital-issued

342 Label Change

Under the above constraints information flows arerestricted to equal or increasing secrecy constraintsand equal or decreasing integrity constraints Howeverdata may undergo transformations andor checks thatchange its security properties For example moving datathrough an anonymisation engine renders the data lesssensitive so less strict secrecy constraints can applyto the anonymised output In the integrity dimensiondata may go through a validation process on inputthus becoming more trustworthy In CamFlow only theprocess itself is able to change its secrecy and integritylabels which requires the appropriate privileges andmust be explicitly requested

Definition 3 A label change noted A Aprime is safe if andonly if for a label X (either S or I) and a tag t

X(Aprime) = X(A) cup t if t isin P+X (A)

ORX(Aprime) = X(A) t if t isin PminusX (A)

Declassifiers and endorsers are the entities with theprivileges to perform security context transformationsDeclassifiers change the secrecy properties and endorserschange the integrity propertiesExample ndash declassification A medical record systemis held in a private cloud Research datasets may becreated from these records but only from records wherethe patients have given consent Also only anonymised

6

data may leave the private protected environment Weassume a health service approved anonymisation proce-dure Fig 2 shows the anonymiser inputting data taggedas personal and declassifying the data by outputting datawith secrecy tag researchExample ndash endorsement In the same example theResearch Database is on a public cloud and may onlyreceive research data tagged with consent anon in itsintegrity label In the private cloud we see a pro-cess that selects appropriate records for specific re-search purposes checks for patient consent and addsthe tag consent to the integrity label of its output Theanonymiser process can only input data with this tag itanonymises the data and outputs data with the tag anonin its integrity label

Some previous work [14] [28] allows implicit declas-sification and endorsement That is if an active entity hasthe privilege to declassifyendorse and the privilegeto return to its original state (ie for declassificationendorsement over t the entity has privilege tminus and t+)the declassificationendorsement may occur implicitlywithout the need for the entity to make the label changesWe believe that this could in practice lead to unintentionaldata disclosure Suppose an entity has the privilege todeclassify top-secret information The requirement forexplicit label change makes it unlikely that the entity willsend such data accidentally to an unintended recipientOur model has stronger constraints that require endorse-ment and declassification operations to be programmedexplicitly

343 Privilege delegation

An entity is only able to delegate a privilege it owns

Definition 4 A privilege delegation is safe if and only ift isin PplusmnX (A)

35 Conflict of Interest

In CamFlow alone among IFC systems privilege dele-gation is further restricted by Conflict of Interest (CoI)(or Separation of Duty (SoD)) enforcement The receivingentity A must not be put in a situation where it wouldbreak a CoI constraint By this means an applicationmanager is prevented from creating an application in-stance with access to conflicting data

Definition 5 An entity A does not violate a CoI C if andonly if∣∣∣(S(A) cup I(A) cup P+

S (A) cup P+I (A) cup Pminus

S (A) cup PminusI (A)

)cap C

∣∣∣ le 1

Example ndash conflict of interest A CoI might arise whendata relating to competing companies is available in asystem In a hospital context this might involve theresults of analysis of the usage and effects of drugs fromcompeting pharmaceutical companies The companiesmight agree to this analysis only if their data is guar-anteed to be isolated ie not leaked to other companies

The hospital may be participating in drug trials andwant to ensure that information does not leak between

Camflow-LSM

IFC Library

User Space

Kernel Space

Application Process

Kernel

DIFC Management APISystem Calls

Hardware

IFC LibraryTrusted Process

Authorise and Audit

Fig 3 The interactions of the IFC Security Module (LSM)and a Trusted Process within an OStrials suppose a conflict is C = PfizerGSKRoche and some data (eg files) are labelled PfizerData[S =Pfizer I = empty] and RocheData[S = Roche I = empty] TheCoI described ensures that it is not possible for a singleentity (eg an application instance) to have access toboth RocheData and PfizerData either simultaneously orsequentially ie enforcing that Roche-owned data andPfizer-owned data are processed in isolation

The next sections describe the CamFlow platform thatenforces the IFC constraints described

4 OS ENFORCEMENT

At the heart of the architecture is a minimal kernel mod-ule dedicated solely to OS-level IFC enforcement Themodule is trusted to enforce IFC transparently acrossall flows between entities within the OS User spaceprocesses can directly interact with the kernel moduleeg to delegate privileges (sect34) through a pseudo-filesystem abstracted through a high level API Higher levelconsiderations and policies can be managed throughspecifically defined Trusted Processes (see sect42) The localmachine architecture is represented in Fig 3

Note that IFC operates alongside and complementsother security technologies It is not a cloud securitypanacea challenges regarding covert and side channelsand direct access to hardware by an attacker remain asthey do for systems in general There are approachesthat can help address these security threats but manyare highly disruptive (eg synchronisation approachesto reducing timing channels) and are infrequently usedOther threats may be easier to mitigate and solutionsmay be used when appropriate (eg on-disk encryption)

41 CamFlow-LSM

Our kernel module CamFlow-LSM is implemented as aLinux Security Module (LSM) [29] Although our work isLinux-specific a similar approach could be used on anysystem providing LSM-like security hooks Unlike otherDIFC OS implementations [14] [28] our kernel patch isself-contained strictly limited to the security moduledoes not modify any existing system calls and followsLSM implementation best practice This allows amongother things LSM stacking [30] [31] and coexistencewith other security modules such as eg SELinux [32]or AppArmor [33] and complements their MAC enforce-ment with decentralised information flow policies

We assume that the rest of the kernel can be trustedand does not interfere with the IFC enforcement mech-anism LSM system hooks have been statically anddynamically verified [34]ndash[36] and our implementation

7

inherits from LSM the formal assurance of IFCrsquos correctplacement on the path to any controlled kernel objectThis is sufficient to guarantee that we control flow andrecord audit on any operation on a controlled kernelobject

Since applications running on SELinux [32] or Ap-pArmor [33] need not be aware of the MAC policybeing enforced we see no reason to force applicationsrunning on an IFC system to be aware of IFC only thoseperforming declassification or endorsement operationsare necessarily aware This implementation choice isimportant cloud providers can incorporate IFC withoutrequiring changes in the software deployed by ten-ants Alternatively applications that wish to managetheir own IFC constraints can declare policy through apseudo-filesystem (as is typical for LSMs) abstracted bya user space library and enforced transparently by theIFC mechanism

The LSM framework calls security hooks when accessto a kernel object is attempted Security metadata can beassociated with kernel objects and is used by the LSMmodule to make access decisions Tags and privilegesare represented by 64-bit opaque nonces associated withkernel objects such as processes inodes files sharedmemory objects messages etc On interaction betweenkernel objects CamFlow-LSM security hooks are calledto enforce data-flow policy (sect341) or propagate tags onentity creation (sect33) as appropriate

Only active entities (processes) have mutable labelsand privileges all other (passive) entities have im-mutable labels and no privileges

Privileges are allocated by the kernel and owned bythe creating process (any process can create tags and theassociated privileges in a decentralised fashion) Privi-leges can be passed to other processes users or groupsCamFlow-LSM verifying that constraints on privilegedelegation (sect343) and conflict of interest (sect35) are notviolated A process can add or remove a tag from itslabel if it owns the appropriate privilege (following IFCconstraints described in sect342) if the current user ownsthe privilege or if the current group owns the privilegeHow tags are shared and managed must be consideredwith care when designing an application and the systemmust be administered accordingly

411 Checkpointing and Restoration

Checkpointing a process involves halting its executionallowing it to be restarted at a later stage and enablingmigration eg [37] LSM state is normally saved andrestored by the checkpointing system eg [38] andour module further exports an API to more efficientlyserialise and restore security context

Furthermore self-checkpointing and restoring the pre-vious state of a process has been demonstrated [39] to bea beneficial feature for IFC systems This is particularlyuseful for processes serving requests In such a scenariothe state of the process is saved after initialisation Whena request is received the serving process sets itself up

0 10 20 30 40 50 60 70 80 90

sys pipe

sys write

sys read

sys clone

microsdyn label IFC LSM Native

Fig 4 Overhead introduced into the OS by CamFlowLSM (x-axis time in micros)in the security context appropriate to serve the requestAfter the request is served (or a series of requests if thesystem is session-based as described in sect7) the processrestores its memory state and security context to whatthey were immediately after initialisation This improvesperformance and prevents data leaks between securitycontexts

412 OS Evaluation

We tested the CamFlow-LSM module on Linux Kernelversion 3178 (012015) from the Fedora distribution3

The tests are run on an Intel 22Ghz i7 CPU and 6GiBRAM machine

Measurements are done using the Linux tool ftrace [40]to provide a microbenchmark Two processes read fromand write to a pipe respectively Each has 20 tags in itssecurity label substantially more than we have seen aneed for in current use cases We measure the overheadinduced by creating a new process (sys clone) creatinga new pipe (sys pipe) writing to the pipe (sys write) andreading from the pipe (sys read) The results are givenin Fig 4

We can distinguish two types of induced overheadverifying an IFC constraint (sys read sys write) and allo-cating labels (sys clone sys pipe) The sys clone overheadis roughly twice that of sys pipe as memory is allocateddynamically for the active entityrsquos labels and privilegesRecall that passive entities have no privileges Overheadmeasurements for other system callsdata structures areessentially identical as they rely on the same underlyingenforcement mechanism and are not included

The CamFlow-LSM overhead is a few percent seeFig 4 We provide a build option that further improvesperformance by declaring labels and privileges with afixed maximum size (by default label size can increasedynamically to meet application requirements) This re-duces the overhead of the system calls that create newentities (the dynamic label component in Fig 4) This isan acceptable trade-off as in practical scenarios labelsrarely exceed more than five tags However for mostapplications the overhead is imperceptible and lost insystem noise it is hard to measure without using kernel

3 It is not feasible to provide a comparison with the Laminarimplementation [28] that is closest in technical terms to our workas the implementation available httpsgithubcomut-osalaminaris for an obsolete kernel version 2622 (072007)

8

tools as the variation between two executions may begreater than the overhead

42 Trusted Processes

The CamFlow-LSM is trusted to enforce IFC at the kernellevel Its functionality is minimal strictly confined tothe enforcement of IFC policies as described in sect3 Thisguarantees easier maintainability and a system that isagnostic to higher level application requirements thusminimising the constraints imposed on user-space ap-plication design

We introduce the concept of a trusted process that al-lows applicationplatform-specific concerns to be man-aged in user space by bypassing some LSM-enforced IFCconstraints For example a trusted process might serveas a proxy for external connections as in the Trusted IFCGateway in the example in sect7 setting up and managingapplication componentsrsquo labels Trusted processes areused to interact with persistent storage (see sect53) forcheckpointing and restoring processes (see sect411) andfor managing inter-process and external communication(see sect5)

Fig 5 shows OS instances running the CamFlow-LSM hosting a number of application processes thatmay be grouped in containers Each OS instance hasa single trusted process (Security Context Manager) tomanage its hosted processesrsquo IFC labels and privileges Inaddition each process has an associated trusted middle-ware process to handle inter-process and inter-machinecommunication Such communication may be within orbetween containers OSs or clouds

In this example S represents a particular set of secrecytags and I a particular set of integrity tags both ofwhich remain the same throughout The application pro-cesses and other OS objects such as pipes and files arelabelled [S I] The process labelled [empty I] writes lsquopublicrsquodata to a pipe which is read by a process labelled [S I]assuming all the I tags match correctly Similarly twoprocesses are shown writing to and reading from a file

The Security Context Manager maps between thekernel-level representation of tags (as 64-bit integers)and the representation of tags in user space Within acloud or other trusted environment tags may be simplestrings When tags need to cross domain boundarieseg when cloud services form part of a wider archi-tecture as in IoT tags may need to be protected bycryptographic means (see sect51)

Trusted processes are either set up through staticconfiguration read at boot time by the CamFlow-LSMmodule or created at runtime by another trusted pro-cess Trusted processes must either be managed by atrusted party (in our current approach the underlying in-frastructure provider) andor the code must be auditableand a means to verify the current version running on theplatform must be provided (see sect43)

43 Leveraging Hardware Roots of Trust

Incorporating IFC into cloud-provider OSs would en-hance the trustworthiness of the platform HoweverIFC only guarantees protection above the technical layerin which it is enforced Recent hardware and softwaredevelopments make it possible to attest that the softwarelayers on which our platform runs have been audited

The Trusted Platform Module (TPM) [41] as used forremote attestation [42] is one such hardware mecha-nism TPM is used to generate a nearly unforgeablehash representing the state of the hardware and soft-ware of a given platform that can be remotely verifiedTherefore a company could audit the implementationof our IFC enforcement mechanism and ensure that ourkernel security module messaging middleware and theconfiguration they provide are indeed running on theplatform Any difference between the expected state ofthe software stack and the platform could be considereda breach of trust such considerations can easily beembedded in the contractual obligations of the cloudprovider

TPM and remote attestation for cloud computing [43]are reaching maturity with IBM rolling out an opensource scalable trusted platform based on virtual TPMs[44] Indeed Berger et al [44] describe a mechanismallowing the TPM and remote attestation to be providedfor virtual machine offerings and container-based solu-tions covering the whole range of contemporary cloudofferings Furthermore the approach not only allows thestate of the software stack to be verified at boot time butalso during execution and can thus prevent run-timemodification of the system configuration

5 CROSS-MACHINE ENFORCEMENT

CamFlow-LSM operates to protect flows within the OSHowever it is also important that flows are protectedacross OS instances

Generally in order to guarantee flow constraints onlyprocesses P such that S(P ) = empty and I(P ) = empty ie notsubject to IFC constraints are allowed to directly connectto or receive messages from connections on remote OS(eg through a socket) In order to connect to anothermachine a process must either 1) be able to declassifyto change its security context to S(P ) = empty and I(P ) = empty2) communicate through an intermediate trusted process

As such CamFlow contains an IFC-enabled fully-featured messaging middleware (CamFlow-MW) to bothfacilitate communication and guarantee enforcementacross machines For want of space we only considerthe middleware concepts relevant to IFC details on thegeneral middleware (as it was prior to IFCCamFlowintegration) can be found in [45] In short the middle-ware supports strongly-typed messages a range of in-teraction paradigms including request-reply broadcastand streams flexible resource discovery and securitymechanisms including access controls and encryptedcommunication A particular feature is its support for

9

CamFlow-MW CamFlow-MW CamFlow-MW CamFlow-MW CamFlow-MW

Process

CamFlow-LSM

Enforcement Point Pipe [empty I]

Process Process

SecurityContextManager

SecurityContextManager

Container

Container

Container

Process[S I]

CamFlow-LSM

[S I]

File[S I]

[S empty][S empty]Process[empty I]

Fig 5 CamFlow Architecture Labelled OS objects trusted processes and communication middleware

dynamic reconfiguration based on event-driven policyThis simplifies both application development and de-ployment as concerns can be abstracted and tailoredto the particular environment rather than embeddedwithin application code

The role of the middleware is to move towards contin-uous end-to-end data flow management such that IFCcan be enforced across applicationsmachines (kernels)There is work on IFC enforcement across machineshowever these impose specific requirements such asdesign-time considerations [46] a particular languageruntime [47] or constraints on system architectureimplementation [48] In contrast we integrate IFC func-tionality into the general fully featured distributed sys-tems middleware mentioned above (see [49]) to provideflexibility and be more generally applicable We delib-erately avoid imposing a structure on system designinstead integrating IFC functionality into the sort of com-munications infrastructure common to current enterpriseand cloud systems

51 Remote Interactions

CamFlow-MW operates by associating a trusted process(see sect42) with an entity that seeks to communicate viathe messaging system Its fit within the broader architec-ture is depicted in Fig 5 The process is responsible forhandling the communication of messages and enforcesIFC based on the current runtime labels of the entity onwhose behalf it operates

It follows that for IFC to be enforced across machinestags require system-wide management ie throughoutthe cloud service In [50] we proposed that the widelyused and available X509 certificates could be used Theapproach relies on public key certificates and attributecertificates [51] to respectively identify the applicationassociated with the CamFlow-MW instance and the tagsassociated with this application

As part of establishing a connection CamFlow-MWensures that each entity authorises communication withthe other according to a local access control policy De-cisions are based on component metadata the relevantauthentication aspects secured through PKI (certificates)Similarly IFC policy must also be verified to ensure dataflows are authorised according to IFC policy Attributecertificates provide cryptographic means to determineand verify the tags associated with the remote entity

(see [50]) on which policy can be enforced If tags donot accord the connection will not be established

We also see potential for remote attestation based onhardware integrity measures (see sect43) to be integratedinto this authorisation phase to ensure the remote ma-chine operates a reliable IFC enforcement regime

52 Message-Level Enforcement

CamFlow messages are strongly typed where a mes-sage type is defined by a schema describing its set ofattributes For an instance of a message an attributeconsists of a name type and value The support forIFC within messages is fine-grained in that individualattributes within messages can also be labelled Theseattribute labels introduce additional IFC constraints overand above those already applying to the entity ie asrecognised by the kernel-LSM and validated on connec-tion establishment

Labels can be defined within message type schemawhich sets the attributesrsquo IFC labels for all messageinstances of the type These labels cannot be changed byentities dealing in such messages and the entities musthold the requisite labels to interact with the attributesOtherwise the entity producingpublishing a messagecan set the security labels for the attributes (for those notpredefined) if the entity holds the associated privilegesEnforcement occurs as followsReceiving If the receiving entityrsquos labels do not agreewith those of an attribute value the attribute value (andany sub-attributes) are removed from (made null in) themessage This is enforced on message receipt before itis delivered to the entitySending An entity cannot send values for attributeswhere its labels do not agree with those of the attributeThis is enforced when an entity attempts to send amessage ensuring values for any attributes violating thispolicy are removed before message propagation

Enforcement is automatic meaning that applicationsusing the messaging system can be subject to IFC en-forcement completely transparently (ie without their di-rect involvement) though again there is the interface forthe application to actively manage IFC where requiredIn addition the general reconfiguration capabilities ofthe middleware enable connections between componentsto be defined and managed at runtime providing an-other mechanism for controlling communication [45]

10

0 100 200 300 400 500 600 700

Time in msIFC-overhead

Fig 6 IFC overhead of CamFlow-MW for a workloadtransmission of 5000 messages (x-axis in ms)53 Integrating With Persistent Storage

One technique to provide IFC with persistent data storesis to store the tags alongside the data and a trustedsoftware component ensures that when information isread from the store the corresponding labels are appliedIn Flume [14] a trusted process provides the interfacebetween untrusted applications and persistent storageMore recent work has seen the emergence of databasesthat natively understand IFC concepts and can enforceIFC policies [25]

We see much promise in having the middleware me-diate between persistence systems and the kernel toensure consistent IFC application

54 Evaluation

As shown in Fig 6 the results indicate that IFC en-forcement introduces an overhead of sim13 in perfor-mance time compared to the standard non IFC-enabledmiddleware (see [49] for details) Note that these resultswere measured in the context of a particular workloaddeliberately designed to highlight the impact of IFCenforcement It follows that the overheads associatedwith real-world usage are most likely less onerous

6 AUDIT DATA-CENTRIC LOGS

IFC in addition to providing strong assurances that pol-icy is being enforced can also provide a data-centric log[52] detailing the information flows within and betweensystem components In addition to enforcing IFC via ourLSM module we log the data flows of labelled processespolicy decisions privileges and IFC security contextmanipulations Equivalent inter-machine operations viathe middleware are also recorded

Cloud logging systems are generally based on legacylogging systems (OS web-server database etc) thateither fail to capture the needed information or areextremely complicated to interpret in a useful manner[53] More importantly such logs tend to be relevant onlyto the particular service or component which makes itdifficult if not impossible to audit across a range ofapplications clouds etc

IFC logs as provided by our platform allow us to cap-ture information on application-level data flows both at-tempted and permitted allowing the correct expressionand implementation of data flow policy to be checkedThis provides transparency allows for meaningful auditin terms of investigating the circumstances in which dataleakage occurs and provides evidence of complianceeg with legal obligations [54]

Information Exchange

CreateSecurity Context Change

e0

e1

e3e4

e7

e2 e5e6

P3[Sprimeprime I]

P2[Sprimeprime empty]

P3[S I]F1[S I]

P1[S I] P1[empty I]

Public[empty empty]F2[Sprime I prime]

Fig 7 Simplified audit graph from IFC OS execution (weomit metadata for readability) The path to disclosure isshown in bluepale61 Analysing Paths to Disclosure

To assist in interpreting log information we build a di-rected graph corresponding to the allowed flows duringthe execution of our system as shown in Fig 7 Theflows defined in our IFC model (see sect3) namely dataflow creation flow security context change and privi-lege delegation correspond to the edges of the directedgraph Entities (such as processes files messages etc)are represented as nodes in the graph

In addition to information necessary to build the graph(as shown in the figure) additional metadata is collectedfor forensic purposes which is contextentityevent-dependent These audit entries provided by the LSMand middleware can be exploited by a dedicated serviceimplemented in user space connected to the kernelcollection mechanism via relayfs [55] This service feedsfor example a graph visualisation tool such as Cytoscape[56] or a graph database such as Neo4J4

Such a directed graph helps one identify data leaksFor example a tenant might discover that some sensi-tive medical data leaked into a data store where onlyanonymised research data were supposed to be storedIFC is enforced in line with the policy encapsulated inlabels thus data may leak if such policy is improperly ex-pressed andor declassificationendorsement processesare not correctly implemented (eg if the anonymisationprocess in Fig 2 allows re-identification)

Suppose that an information leak is suspected be-tween different security contexts L1[S I] and L2[S

prime I prime]Determining whether such a leak can occur is equiv-alent to discovering whether there is a path in thegraph between the two contexts If the leak occurredthere must be a path between some entity Ei such thatS(Ei) = S and I(Ei) = I and another entity Fi such thatS(Fi) = Sprime and I(Fi) = I prime

The existence of such a path demonstrates that a leakis possible To investigate whether a leak occurred it isessential to consider the event ID associated with theedges comprising the path We denote by ei the last in-coming edge to the entity under investigation with labels[Sprime I prime] only edges such that e lt ei should be consideredWhen applied to all nodes along a path this rule ensuresstrictly monotonically increasing timestamps from thefirst node to the last Fig 7 shows in bluepale a possibledata disclosure path from file F2 from a very simple

4 httpneo4jcom

11

audit graph We know from the event IDs e0 and e1 thatthe data disclosure did not occur through file F1 andprocess P3 but through P1rsquos declassification

62 Demonstrating Compliance

Compliance with certain requirements can be demon-strated through queries over the graph We assume theaudit data is stored in a graph database that we canquery For example the following plain English policyldquoEuropean personal data sent to the US must be anonymisedrdquo[57] is equivalent to writing a query that verifies thatthere is no path between EU- and US-labelled datawithout an anonymisation processThe policy ldquoMedical data stored in database X must havereceived proper consent and be anonymisedrdquo [54] can beexpressed as a query verifying that there is no path be-tween data labelled as medical and the database withoutconsent-checking and anonymiser processes In additionan investigator may want to know which anonymisationalgorithm has been run which data has been used togenerate the anonymised records etc Our audit graphassists in answering such questions

Note that IFC only applies guarantees with respectto flows Demonstrating the overall effectiveness of themanagement regime eg the quality and suitability ofthe anonymisation algorithm is out-of-scope for a flow-based enforcement mechanism

63 Audit as lsquoBig Datarsquo

We are potentially generating a vast amount of data inour IFC logs However unlike standard system logs thatare complex to analyse our logs generate graphs thatare ideal for analysis by ldquobig datardquo tools that have beendeveloped for this purpose [58]

Since the amount of data is potentially huge theamount of data being logged can be fine-tuned to meetthe requirements of the platformtenant eg by reduc-ing the amount of metadata being stored by loggingonly security context changing operations by loggingonly information corresponding to some target securitycontext keeping operations on unlabelled entities out-side of the log etc The decision on what needs to belogged then becomes a tradeoff between utility and thevolume (cost) of log generated which can be decided inorder to correspond to legal or contractual requirements(for example a regulated sector may need to have afine-grained log to satisfy data forensic requirements)Indeed as such an approach is new to the cloud suchconsiderations will be refined by experience with bestpractices developing over time

64 Audit Access

Logs can contain sensitive information and access tothem should be controlled This represents an area ofour ongoing work Traditional access controls clearlyplay a role however secrecy tags could also be lever-aged For example an auditor before being grantedaccess to audit logs could be forced to demonstrate

ownership of the corresponding secrecy IFC tags (forexample through cryptographic means as in sect51) Theauditor may be granted access to a log entry only ifS(origin) cup S(destination) sube S(auditor)7 EXAMPLE SUPPORT FOR WEB SERVICES

One of the most common uses of PaaS is to host web ap-plications In this section we present the implementationof such a solution built on the infrastructure described insect4 in order to evaluate and demonstrate the feasibility ofour proposed approach This is illustrated in Fig 8 Werun standard and unmodified Ruby web applications

Interaction with end-users is achieved through aldquogatewayrdquo between the IFC and non-IFC worlds Simi-larly interaction with cloud services (such as data stores)is also achieved through our messaging middleware asdiscussed in sect5 The requirement for this gateway canbe removed if a trustworthy IFC implementation can beprovided at the client side consistent with the cloud im-plementation with respect to tag naming enforcementetc Tag naming in general system-wide is an issuebeyond the scope of this paper see further sect8 In ourproof of concept implementation the gateway is a simpleApache server running a custom-built module

The role of the gateway is to authenticate the end-userwhen a session is created and to associate this sessionwith an application instance running within the securitycontext corresponding to the user Recall that a securitycontext comprises the S and I labels Any further re-quests to the gateway in that session are routed to thecorresponding application instance Once an instance nolonger has an associated session it can be recycled usingself-checkpointing as described in sect411

Several application types are running over our cloud-based web services platform For example in a medicalcontext these might be medical record editing pharmacyordering social services etc A single shared identityservice for the end-user is part of the cloud provideroffering (in our proof of concept implementation weused OAuth [59])

The GP authenticates is authorised as treating doc-tor for Alice and selects the lsquoidentityrsquo that correspondsto Alice A new session is created server-side by thegateway with the requested application instance run-ning in the corresponding security context with S =[medical alice] I = [empty] When the GP wants to accessapplications on behalf of a new patient he needs to closeAlicersquos session authorise as treating doctor for Bob andopen a new session for Bob

The control described above is not achieved by theapplication but by the platform itself and can be con-trolled by the end-user subject to access control Thatis a medical application used on Alicersquos behalf runs ina security context in which data cannot flow to that ofanother patient Furthermore applications running onbehalf of a given user can share the data of that userwithout the risk of seeing a buggy application leakingdata between end-users

12

App Instance App InstanceApp Instance

CamFlow-MW

IFC Constrained IFC Constrained IFC Constrained

ContextA

ContextB

Public

Trusted IFC Gateway

Enforcing IFCData-store

Data-store Application

CamFlow-MW CamFlow-MWCamFlow-MW

IFC Cloud Platform

Dr Yellow Dr PinkDr MauveRemote Clients

Fig 8 PaaS Architecture on top of IFC-OS

As described in sect4 we assume the middleware andthe OS enforcement are provided as a service by theunderlying platform A tenant wanting to use the third-party web-service offering once his trust in the under-lying platform is established needs only to audit thegateway again the underlying infrastructure providercould either provide such a gateway or audit it Therest of the software stack of the third-party web-serviceprovider is bound by the IFC enforcement mechanismand therefore need not be trusted

8 CONCLUSION amp FUTURE WORK

IFC allows data flows to be controlled continuouslythroughout a system by providing an information-centric MAC scheme that continuously ensures non-interference between security contexts This paper pre-sented the CamFlow platform that demonstrates thepotential of cloud-deployed IFC as supporting (1) pro-tection of applications from each other (2) flexible datasharing across isolation boundaries (3) prevention ofdata leakage due to bugsmisconfigurations (4) exten-sion of access control beyond application boundaries (5)data flow transparency

Specifically we detailed a new kernel implementationof IFC as an LSM demonstrating low overhead evenfor worst-case scenarios where processes continuouslymake readwrite system calls We also described theintegration of a messaging middleware to enforcing IFCacross machines This combination makes it possible toprovide whole-system IFC to PaaS cloud services andtherefore also SaaS Our approach and implementationwere designed so that applications can run unchangedover IFC thus making cloud adoption feasible

We also indicated how the data-centric logs based onIFC enforcement could provide the means to audit anIFC-enabled system whereby a log can be processed asa directed graph to investigate leaks and attacks andshow compliance with data management requirementsThough this represents our initial work in the area thereappears much promise

In light of the above we believe that IFC has greatpotential as a security mechanism for the cloud wherebytrust in a few major cloud providers deploying IFCcan be built on to provide a demonstrably trustworthycomputing environment

CamFlow was developed with cloud deployment inmind Our future work will investigate the challenges ofa broader distributed context

It is already feasible to extend CamFlow to sup-port mobile environments Android supports the fullSELinux enforcement5 and an Android-LSM integrationhas been demonstrated [60] But when dealing withmultiple cloud services particularly as they become partof a wider distributed architecture such as in the IoTa trustworthy system-wide deployment of IFC can nolonger be assumed Much work remains on establishingtrust in (and the trustworthiness of) the IFC enforcementmechanism within end-usersrsquo devices Outside a cloudcontext all partiesrsquo trust in a common third partyrsquos en-forcement of IFC constraints cannot be assumed (unlikethe cloud provider for cloud services) We intend toexplore leveraging hardware roots of trust and remoteattestation to ensure the integrity and trustworthiness ofIFC enforcement mechanisms

Another area of investigation concerns the represen-tation of tags across administrative domains In a cloudcontext a federated approach can be envisaged wherea common understanding of tags could be negotiatedacross multiple domains For example in [54] we dis-cussed initial thoughts on managing data according tospecific obligation regimes However as the numberof administrative domains increases both a global tagnaming scheme and mechanisms for ad-hoc negotiationbecome necessary

Related is the sensitivity of the tags themselvesKnowledge of the meaning of a tag can indicate thatthe associated entity contains or deals with certain infor-mation If a relationship can be established between anentity and the information owner this may disclose pri-vate information about the information owner This maylead to work on a need-to-know negotiation mechanismto establish a secure channel between hosts especiallyin a wide-scale distributed system A promising mech-anism is private set intersection to determine tagsrsquo subsetrelationship (sect341)

Another challenge concerns extending audit to dis-tributed architectures both in terms of resource man-agement and regulating access to log data

5 httpssourceandroidcomdevicestechsecurityselinux

13

ACKNOWLEDGMENTS

This work was supported by UK Engineering andPhysical Sciences Research Council grant EPK011510CloudSafetyNet End-to-End Application Security inthe Cloud We acknowledge the support of Microsoftthrough the Microsoft Cloud Computing Research Cen-tre

REFERENCES

[1] P Barham B Dragovic K Fraser S Hand T Harris A HoR Neugebauer I Pratt and A Warfield ldquoXen and the art ofvirtualizationrdquo SIGOPS Operating Systems Review vol 37 no 5pp 164ndash177 2003

[2] A Kivity Y Kamay D Laor U Lublin and A Liguori ldquokvmthe Linux Virtual Machine Monitorrdquo in Proceedings of the LinuxSymposium vol 1 2007 pp 225ndash230

[3] D Bernstein ldquoContainers and Cloud From LXC to Docker toKubernetesrdquo IEEE Cloud Computing Magazine no 3 pp 81ndash842014

[4] P Desnoyers O Krieger B Holden and J Hennessey ldquoUsingOpenStack for an Open Cloud eXchange (OCX)rdquo in InternationalConference on Cloud Engineering (IC2E) IEEE 2015

[5] J Mineraud O Mazhelis X Su and S Tarkoma ldquoA Gap Analysisof Internet-of-Things Platformsrdquo 2015 Arxiv arXiv150201181[Online] Available httparxivorgabs150201181

[6] C J Millard Ed Cloud Computing Law Oxford University Press2013

[7] S Pearson and M Casassa-Mont ldquoSticky Policies An Approachfor Managing Privacy across Multiple Partiesrdquo Computer vol 44July 2011

[8] D E Denning ldquoA lattice model of secure information flowrdquoCommunication of the ACM vol 19 no 5 pp 236ndash243 1976

[9] D E Bell and L J LaPadula ldquoSecure Computer Systems Mathe-matical Foundations and Modelrdquo The MITRE Corp Bedford MATech Rep M74-244 1973

[10] K J Biba ldquoIntegrity Considerations for Secure Computer Sys-temsrdquo MITRE Corp Tech Rep ESD-TR 76-372 1977

[11] A C Myers and B Liskov ldquoA Decentralized Model for Infor-mation Flow Controlrdquo in 17th Symposium on Operating SystemsPrinciples (SOSP) ACM 1997 pp 129ndash142

[12] A Bouard B Weyl and C Eckert ldquoPractical information-flowaware middleware for in-car communicationrdquo in Workshop onSecurity privacy amp dependability for cyber vehicles ACM 2013pp 3ndash8

[13] K Singh S Bhola and W Lee ldquoxBook Redesigning PrivacyControl in Social Networking Platformsrdquo in Security SymposiumUSENIX 2009 pp 249ndash266

[14] M Krohn A Yip M Brodsky N Cliffer M F KaashoekE Kohler and R Morris ldquoInformation Flow Control for StandardOS Abstractionsrdquo in Symposium on Operating Systems PrinciplesACM 2007 pp 321ndash334

[15] S Lee E L Wong D Goel M Dahlin and V ShmatikovldquoπBox A Platform for Privacy-Preserving Appsrdquo in Symposiumon Networked System Design and Implementation USENIX 2013pp 501ndash514

[16] N Kumar and R Shyamasundar ldquoRealizing Purpose-Based Pri-vacy Policies Succinctly via Information-Flow Labelsrdquo in Big Dataand Cloud Computing (BDCloudrsquo14) IEEE 2014 pp 753ndash760

[17] W Enck P Gilbert B-G Chun L P Cox J Jung P McDaniel andA N Sheth ldquoTaintDroid An information-flow tracking systemfor realtime privacy monitoring on smartphonesrdquo in Conference onOperating systems design and implementation (OSDIrsquo10) USENIX2010 pp 1ndash6

[18] I Papagiannis M Migliavacca and P Pietzuch ldquoPHP AspisUsing partial taint tracking to protect against injection attacksrdquoin 2nd USENIX Conference on Web Application Development 2011p 13

[19] E Schwartz T Avgerinos and D Brumley ldquoAll you ever wantedto know about dynamic taint analysis and forward symbolicexecution (but might have been afraid to ask)rdquo in Symposium onSecurity and Privacy IEEE 2010

[20] M Casassa-Mont S Pearson and P Bramhall ldquoTowards account-able management of identity and privacy Sticky policies andenforceable tracing servicesrdquo in Database and Expert Systems Ap-plications 2003 Proceedings 14th International Workshop on IEEE2003 pp 377ndash382

[21] S Bandhakavi C C Zhang and M Winslett ldquoSuper-sticky anddeclassifiable release policies for flexible information dissemina-tion controlrdquo in Proceedings of the 5th ACM workshop on Privacy inelectronic society ACM 2006 pp 51ndash58

[22] D W Chadwick and S F Lievens ldquoEnforcing sticky securitypolicies throughout a distributed applicationrdquo in Proceedings ofthe 2008 workshop on Middleware security ACM 2008 pp 1ndash6

[23] M Casassa-Mont I Matteucci M Petrocchi and M L SbodioldquoTowards Safer Information Sharing in the Cloudrdquo InternationalJournal of Information Security pp 1ndash16 2014

[24] S Akoush L Carata R Sohan and A Hopper ldquoMrLazy LazyRuntime Label Propagation for MapReducerdquo in 6th Workshop onHot Topics in Cloud Computing (HotCloud) USENIX 2014

[25] D Schultz and B Liskov ldquoIFDB Decentralized Information FlowControl for Databasesrdquo in European Conference on Computer Sys-tems (Eurosysrsquo13) ACM 2013 pp 43ndash56

[26] D Von Oheimb ldquoInformation Flow Control Revisited Nonin-fluence = Noninterference + Nonleakagerdquo in Computer SecurityndashESORICS 2004 Springer 2004 pp 225ndash243

[27] D Hedin and A Sabelfeld ldquoA perspective on Information-FlowControlrdquo NATO 2012

[28] D E Porter M D Bond I Roy K S McKinley and E WitchelldquoPractical Fine-Grained Information Flow Control Using Lam-inarrdquo ACM Transactions on Programming Languages and Systems(TOPLAS) vol 37 no 1 p 4 2014

[29] C Wright C Cowan S Smalley J Morris and G Kroah-Hartman ldquoLinux Security Modules General security support forthe Linux kernelrdquo in Foundations of Intrusion Tolerant SystemsIEEE 2003 pp 213ndash213

[30] M Quaritsch and T Winkler ldquoLinux Security Modules Enhance-ments Module Stacking Framework and TCP State TransitionHooks for State-Driven NIDSrdquo Secure Information and Communi-cation vol 7 pp 7ndash13 2004

[31] C Schaufler ldquoLSM Generalize existing module stackingrdquo LinuxWeekly News 2014

[32] S Smalley C Vance and W Salamon ldquoImplementing SELinux asa Linux Security Modulerdquo NAI Labs Report vol 1 p 43 2001

[33] M Bauer ldquoParanoid Penguin an Introduction to Novell AppAr-morrdquo Linux Journal vol 2006 no 148 p 13 2006

[34] A Edwards T Jaeger and X Zhang ldquoRuntime verification ofauthorization hook placement for the Linux Security Modulesframeworkrdquo in Conference on Computer and Communications Secu-rity ACM 2002 pp 225ndash234

[35] T Jaeger A Edwards and X Zhang ldquoConsistency analysis ofauthorization hook placement in the Linux security modulesframeworkrdquo ACM Transactions on Information and System Security(TISSEC) vol 7 no 2 pp 175ndash205 2004

[36] V Ganapathy T Jaeger and S Jha ldquoAutomatic placement ofauthorization hooks in the Linux security modules frameworkrdquoin Conference on Computer and Communications Security ACM2005 pp 330ndash339

[37] I P Egwutuoha D Levy B Selic and S Chen ldquoA survey of faulttolerance mechanisms and checkpointrestart implementationsfor high performance computing systemsrdquo Springer 2013vol 65 no 3 pp 1302ndash1326

[38] O Laadan and S E Hallyn ldquoLinux-CR Transparent applicationcheckpoint-restart in Linuxrdquo in Linux Symposium 2010 p 159

[39] B Niu and G Tan ldquoEfficient User-space Information Flow Con-trolrdquo in Symposium on Information Computer and CommunicationsSecurity (SIGSACrsquo13) ACM 2013 pp 131ndash142

[40] T Bird ldquoMeasuring Function Duration with ftracerdquo in Japan LinuxSymposium 2009 pp 47ndash54

[41] B Parno ldquoBootstrapping Trust in ardquo Trustedrdquo Platformrdquo in Con-ference on Hot Topics in Security (HotSecrsquo08) USENIX 2008

[42] N Santos K P Gummadi and R Rodrigues ldquoTowards TrustedCloud Computingrdquo in Conference on Hot Topics in Cloud Comput-ing USENIX 2009 pp 3ndash3

[43] S Berger R Caceres K A Goldman R Perez R Sailer andL van Doorn ldquovTPM Virtualizing the Trusted Platform Modulerdquoin Security Symposium USENIX 2006 pp 305ndash320

14

[44] S Berger K Goldman D Pendarakis D Safford E Valdezand M Zohar ldquoScalable Attestation A Step Toward Secure andTrusted Cloudsrdquo in International Conference on Cloud Engineering(IC2E) IEEE 2015

[45] J Singh D Eyers and J Bacon ldquoPolicy Enforcement withinEmerging Distributed Event-Based Systemsrdquo in ACM DistributedEvent-Based Systems (DEBSrsquo14) 2014

[46] L Sfaxi T Abdellatif R Robbana and Y Lakhnech ldquoInformationFlow Control of Component-based Distributed Systemsrdquo Concur-rency and Computation Practice and Experience vol 25 no 2 pp161ndash179 2013

[47] S Yoshihama T Yoshizawa Y Watanabe M Kudoh and K Oy-anagi ldquoDynamic Information Flow Control Architecture for WebApplicationsrdquo in ESORICS 2007 J Biskup and J Lopez EdsSpringer 2007 vol LNCS 4734 pp 267ndash282

[48] W Cheng D R K Ports D Schultz V Popic A BlanksteinJ Cowling D Curtis L Shrira and B Liskov ldquoAbstractions forUsable Information Flow Control in Aeolusrdquo in USENIX AnnualTechnical Conference Boston 2012

[49] J Singh T Pasquier J Bacon and D Eyers ldquoIntegrating Middle-ware and Information Flow Controlrdquo in International Conferenceon Cloud Engineering (IC2E) IEEE 2015 pp 54ndash59

[50] J Singh T F J-M Pasquier and J Bacon ldquoSecuring Tags toControl Information Flows within the Internet of Thingsrdquo inInternational Conference on Recent Advances in Internet of Things(RIoTrsquo15) IEEE 2015

[51] J S Park and R Sandhu ldquoBinding Identities and Attributes UsingDigitally Signed Certificatesrdquo in Annual Conference on ComputerSecurity Applications IEEE 2000 pp 120ndash127

[52] A Ganjali and D Lie ldquoAuditing cloud management using infor-mation flow trackingrdquo in Workshop on Scalable Trusted ComputingACM 2012 pp 79ndash84

[53] R K Ko M Kirchberg and B S Lee ldquoFrom System-centric toData-centric Logging-accountability Trust amp Security in CloudComputingrdquo in Defense Science Research Conference and Expo (DSR)2011 IEEE 2011 pp 1ndash4

[54] J Singh J Powles T Pasquier and J Bacon ldquoData Flow Man-agement and Compliance in Cloud Computingrdquo IEEE CloudComputing Magazine SI on Legal Clouds 2015

[55] T Zanussi K Yaghmour R Wisniewski R Moore and M Dage-nais ldquorelayfs An efficient unified approach for transmitting datafrom kernel to user spacerdquo in Linux Symposium 2003 p 494

[56] M E Smoot K Ono J Ruscheinski P-L Wang and T IdekerldquoCytoscape 28 New features for data integration and networkvisualizationrdquo Oxford University Press 2011 vol 27 no 3 pp431ndash432

[57] T Pasquier and J Powles ldquoExpressing and Enforcing LocationRequirements in the Cloud using Information Flow Controlrdquo inIC2E International Workshop on Legal and Technical Issues in CloudComputing (CLawrsquo15) IEEE 2015

[58] R Angles and C Gutierrez ldquoSurvey of Graph Database ModelsrdquoACM Computing Surveys (CSUR) vol 40 no 1 p 1 2008

[59] D Hardt ldquoThe OAuth 20 Authorization Frameworkrdquo IETF TechRep RFC 6749 2012

[60] S Smalley and R Craig ldquoSecurity Enhanced (SE) Android Bring-ing Flexible MAC to Androidrdquo in Network and Distributed SystemSecurity Symposium Internet Society 2013

  • 1 Introduction and Motivation
  • 2 Background
    • 21 IFC Models
    • 22 Protection via VMs and Containers
    • 23 Taint Tracking (TT) Systems
    • 24 Sticky Policies
      • 3 CamFlow-Model IFC for the Cloud
        • 31 Tags and Labels
        • 32 Decentralised Privileges and Security Contexts
        • 33 Creating a New Entity
        • 34 Security
          • 341 Information Exchange
          • 342 Label Change
          • 343 Privilege delegation
            • 35 Conflict of Interest
              • 4 OS Enforcement
                • 41 CamFlow-LSM
                  • 411 Checkpointing and Restoration
                  • 412 OS Evaluation
                    • 42 Trusted Processes
                    • 43 Leveraging Hardware Roots of Trust
                      • 5 Cross-Machine Enforcement
                        • 51 Remote Interactions
                        • 52 Message-Level Enforcement
                        • 53 Integrating With Persistent Storage
                        • 54 Evaluation
                          • 6 Audit Data-Centric Logs
                            • 61 Analysing Paths to Disclosure
                            • 62 Demonstrating Compliance
                            • 63 Audit as `Big Data
                            • 64 Audit Access
                              • 7 Example Support for Web Services
                              • 8 Conclusion amp Future Work
                              • References
Page 4: CamFlow: Managed data-sharing for cloud services · CamFlow: Managed data-sharing for cloud services Thomas F. J.-M. Pasquier, Member, IEEE, Jatinder Singh, Member, IEEE, David Eyers,

4

controlled destinations such as the cloud backup contactlist This approach aids the detection of malicious ap-plications attempting to steal user-sensitive informationand send it to third parties Equally this type of concerncan be captured through the use of IFC policies

One concern with TT systems is that there is a gap intime between the occurrence of the issue (eg a leak anattack) and when it is detected [19] ie problems becomeevident only when the tainted data reaches a sink (en-forcement point) Depending on the degree of isolationbetween the different parts of the system and the num-ber of system components involved this tainted datamay have lsquocontaminatedrsquo much of the system Whilethis can be managed in smaller closed environmentsit is less appropriate for cloud services in general IFCpolicies present the clear advantage to prevent problemsas they occur and to stop their effects propagating to apotentially large part of the system

Some argue that TT is simpler to use than IFC andincurs lower overhead but when the enforcement issystemic and the granularity identical the overheadsare similar (compare [17] and the evaluation in sect4 andsect5) Indeed the complexity of verifying IFC policy (seesect3) is comparable to the cost of propagating taint Forboth techniques most of the overhead comes from themechanism for intercepting data exchange

24 Sticky Policies

IFC can be seen as a mechanism for enforcing policythe labels associated with entities represent applicationpolicies IFC is a simple low-level mechanism Stickypolicy approaches also consider the enforcement of data-bound policy but at a higher-level

Casassa-Mont et al [20] first introduced sticky policieswhich involves encrypting data along with a list of poli-cies to be enforced on that data To obtain the decryptionkey from a Trusted Authority (TA) a party must agree toenforce the policies associated with the data This agree-ment may be considered as part of forming a contractuallink between the data owner and the service providerWork has continued in the area [21]ndash[23] Sticky policiestypically enforced at the application-level are generallymore complex and heavyweight than the simple secrecyand integrity constraints of IFC As such sticky policiestend only to be enforced at particular points eg atadministrative boundaries IFC on the other hand aswe show in sect412 can be enforced continuously at areasonable cost In sect3 we discuss how complex policiesmight be built from IFC labels

Further the sticky policy approach builds upon thetrust established between the data owner the TAs andservices that use the data A non-compliant service couldbe black-listed but only if and when a breach of agree-ment is detected and the TA updated Our IFC approachbuilds only upon the trust between the data owner andthe cloud provider Services and applications running ontop of the cloud provider platform need not be trustedWe believe this to be a great improvement to the overall

Alice RecordS(A) = alicemedical

I(A) = hosp-dev consent

Bob RecordS(B) = bobmedical

I(B) = hosp-dev consent

Alicersquos app instanceS(C) = alicemedical

I(C) = consentAllowed FlowPrevented Flow

Fig 1 An allowed safe flow and prevented flowstrustworthiness of the system

3 CAMFLOW-MODEL IFC FOR THE CLOUD

IFC operates to ensure that only permitted flows ofinformation can occur by enforcing data flow policy dy-namically end-to-end within and across applicationsservices Entities to which IFC constraints are appliedcan include a MapReduce worker instance [24] a file aprocess a database entry [25] etc In CamFlow IFC is ap-plied continuously typically on every system call for anIFC-enabled OS and on communication mechanisms forenforcement across applicationsruntime environmentsIFC policy should therefore be as simple as possibleto allow verification human understanding and to min-imise runtime overhead Indeed there is no need for IFCto encapsulate every possible policy rather it augmentsother control mechanisms and can help enforce theirpolicies

31 Tags and Labels

We define tags that are tokens each representing somesecurity concern over secrecy or integrity The tagbob-private could for example represent Bobrsquos personaldata We associate every entity in the system with twolabels (sets of tags) an entity A has a secrecy label S(A)and an integrity label I(A) The state of these labels isthe security context of the entity The power of IFC is thatit guarantees non-interference between security contexts[26] [27]Example ndash secrecy Suppose a patient Bob is dischargedfrom hospital to be medically monitored at home Thedata streams from his sensors are transferred to a cloudservice and are to be shared with his medical team atthe hospital The data items from his devices are taggedwith medical bob in their secrecy labelsExample ndash integrity The cloud-based home monitoringsupport service needs to be assured that the data itreceives is from a hospital-issued device Each sensingdevice is checked and issued with the tag hospital-devicein its integrity label

Fig 1 illustrates information flow constraints beingapplied over both secrecy and integrity dimensions

32 Decentralised Privileges and Security Contexts

In decentralised IFC (DIFC) any active entity can createnew tags Tag creation is typically carried out by appli-cation managers when setting up application instancesWhen an active entity creates a new tag either for secrecyor integrity this process is given the correspondingprivilege to add and remove the tag to its secrecy orintegrity label respectively If an active entity A has a

5

Medical RecordS = personal

I = empty

Consent Checker

S = personalI = cons

Anonymiser

S = researchI = cons anon

Research DatabaseS = research

I = cons anon

Researcher PortalS = research

I = cons anonAllowed FlowPrevented Flow

Research Project XXNHS Cloud S = personalI = empty

S = personalI = cons

Context change

Fig 2 Medical data declassified and endorsed for research purposes

privilege to add t to its secrecy label we denote thist isin P+

S (A) and to remove t from its secrecy labelt isin PminusS (A) (and similarly P+

I (A) and PminusI (A) are the priv-ileges for integrity) An active entity may therefore havefour privilege sets in addition to its security contextApplication managers will normally set up applicationinstances in security contexts without the privileges tochange them An example is given in sect7

33 Creating a New Entity

We define A rArr B as the operation of the entity Acreating the entity B An example is creating a processin a Unix-style OS by clone We have the following rulesfor creation

if ArArr B then

S(B) = S(A)

I(B) = I(A)

That is the created entity inherits the security contextof its creator These rules force the creating entity toexplicitly change its security context to that required forthe entity to be created We motivate this below in sect342Note that only labels pass to the created entity privilegeshave to be passed explicitly

34 Security

The purpose of IFC models is to regulate flows betweenentities and effect label changes and privilege delega-tion

Definition 1 A system is secure in the CamFlow IFC modelif and only if all allowed messages are safe (Definition 2) allallowed label changes are safe (Definition 3) and all privilegedelegation is safe (Definitions 4 and 5)

341 Information Exchange

IFC prevents data leakage by controlling the exchangeof information We follow the classic pattern for IFC-guaranteed secrecy (no read up no write down [9]) andintegrity (no read down no write up [10])

Definition 2 A flow of information A rarr B is safe if andonly if

Ararr B iff S(A) sube S(B) and I(B) sube I(A)Example ndash secrecy enforcement Consider our exam-ple of patient monitoring after discharge from hospital

where the patientrsquos devices are tagged with medical bobin their secrecy labels In order for the cloud service tobe able to receive this data it must also include the tagsmedical bob in its secrecy label Therefore an applicationinstance accessing Bobrsquos medical data must be labelled assuch In sect7 we describe how applications can be designedto meet such requirementsExample ndash integrity enforcement The cloud-basedhome monitoring support service needs to be assuredthat the data it receives is from a hospital-issued de-vice To achieve this the service has an integrity taghospital-issued in its integrity label and will only acceptdata from devices with tags hospital-issued

342 Label Change

Under the above constraints information flows arerestricted to equal or increasing secrecy constraintsand equal or decreasing integrity constraints Howeverdata may undergo transformations andor checks thatchange its security properties For example moving datathrough an anonymisation engine renders the data lesssensitive so less strict secrecy constraints can applyto the anonymised output In the integrity dimensiondata may go through a validation process on inputthus becoming more trustworthy In CamFlow only theprocess itself is able to change its secrecy and integritylabels which requires the appropriate privileges andmust be explicitly requested

Definition 3 A label change noted A Aprime is safe if andonly if for a label X (either S or I) and a tag t

X(Aprime) = X(A) cup t if t isin P+X (A)

ORX(Aprime) = X(A) t if t isin PminusX (A)

Declassifiers and endorsers are the entities with theprivileges to perform security context transformationsDeclassifiers change the secrecy properties and endorserschange the integrity propertiesExample ndash declassification A medical record systemis held in a private cloud Research datasets may becreated from these records but only from records wherethe patients have given consent Also only anonymised

6

data may leave the private protected environment Weassume a health service approved anonymisation proce-dure Fig 2 shows the anonymiser inputting data taggedas personal and declassifying the data by outputting datawith secrecy tag researchExample ndash endorsement In the same example theResearch Database is on a public cloud and may onlyreceive research data tagged with consent anon in itsintegrity label In the private cloud we see a pro-cess that selects appropriate records for specific re-search purposes checks for patient consent and addsthe tag consent to the integrity label of its output Theanonymiser process can only input data with this tag itanonymises the data and outputs data with the tag anonin its integrity label

Some previous work [14] [28] allows implicit declas-sification and endorsement That is if an active entity hasthe privilege to declassifyendorse and the privilegeto return to its original state (ie for declassificationendorsement over t the entity has privilege tminus and t+)the declassificationendorsement may occur implicitlywithout the need for the entity to make the label changesWe believe that this could in practice lead to unintentionaldata disclosure Suppose an entity has the privilege todeclassify top-secret information The requirement forexplicit label change makes it unlikely that the entity willsend such data accidentally to an unintended recipientOur model has stronger constraints that require endorse-ment and declassification operations to be programmedexplicitly

343 Privilege delegation

An entity is only able to delegate a privilege it owns

Definition 4 A privilege delegation is safe if and only ift isin PplusmnX (A)

35 Conflict of Interest

In CamFlow alone among IFC systems privilege dele-gation is further restricted by Conflict of Interest (CoI)(or Separation of Duty (SoD)) enforcement The receivingentity A must not be put in a situation where it wouldbreak a CoI constraint By this means an applicationmanager is prevented from creating an application in-stance with access to conflicting data

Definition 5 An entity A does not violate a CoI C if andonly if∣∣∣(S(A) cup I(A) cup P+

S (A) cup P+I (A) cup Pminus

S (A) cup PminusI (A)

)cap C

∣∣∣ le 1

Example ndash conflict of interest A CoI might arise whendata relating to competing companies is available in asystem In a hospital context this might involve theresults of analysis of the usage and effects of drugs fromcompeting pharmaceutical companies The companiesmight agree to this analysis only if their data is guar-anteed to be isolated ie not leaked to other companies

The hospital may be participating in drug trials andwant to ensure that information does not leak between

Camflow-LSM

IFC Library

User Space

Kernel Space

Application Process

Kernel

DIFC Management APISystem Calls

Hardware

IFC LibraryTrusted Process

Authorise and Audit

Fig 3 The interactions of the IFC Security Module (LSM)and a Trusted Process within an OStrials suppose a conflict is C = PfizerGSKRoche and some data (eg files) are labelled PfizerData[S =Pfizer I = empty] and RocheData[S = Roche I = empty] TheCoI described ensures that it is not possible for a singleentity (eg an application instance) to have access toboth RocheData and PfizerData either simultaneously orsequentially ie enforcing that Roche-owned data andPfizer-owned data are processed in isolation

The next sections describe the CamFlow platform thatenforces the IFC constraints described

4 OS ENFORCEMENT

At the heart of the architecture is a minimal kernel mod-ule dedicated solely to OS-level IFC enforcement Themodule is trusted to enforce IFC transparently acrossall flows between entities within the OS User spaceprocesses can directly interact with the kernel moduleeg to delegate privileges (sect34) through a pseudo-filesystem abstracted through a high level API Higher levelconsiderations and policies can be managed throughspecifically defined Trusted Processes (see sect42) The localmachine architecture is represented in Fig 3

Note that IFC operates alongside and complementsother security technologies It is not a cloud securitypanacea challenges regarding covert and side channelsand direct access to hardware by an attacker remain asthey do for systems in general There are approachesthat can help address these security threats but manyare highly disruptive (eg synchronisation approachesto reducing timing channels) and are infrequently usedOther threats may be easier to mitigate and solutionsmay be used when appropriate (eg on-disk encryption)

41 CamFlow-LSM

Our kernel module CamFlow-LSM is implemented as aLinux Security Module (LSM) [29] Although our work isLinux-specific a similar approach could be used on anysystem providing LSM-like security hooks Unlike otherDIFC OS implementations [14] [28] our kernel patch isself-contained strictly limited to the security moduledoes not modify any existing system calls and followsLSM implementation best practice This allows amongother things LSM stacking [30] [31] and coexistencewith other security modules such as eg SELinux [32]or AppArmor [33] and complements their MAC enforce-ment with decentralised information flow policies

We assume that the rest of the kernel can be trustedand does not interfere with the IFC enforcement mech-anism LSM system hooks have been statically anddynamically verified [34]ndash[36] and our implementation

7

inherits from LSM the formal assurance of IFCrsquos correctplacement on the path to any controlled kernel objectThis is sufficient to guarantee that we control flow andrecord audit on any operation on a controlled kernelobject

Since applications running on SELinux [32] or Ap-pArmor [33] need not be aware of the MAC policybeing enforced we see no reason to force applicationsrunning on an IFC system to be aware of IFC only thoseperforming declassification or endorsement operationsare necessarily aware This implementation choice isimportant cloud providers can incorporate IFC withoutrequiring changes in the software deployed by ten-ants Alternatively applications that wish to managetheir own IFC constraints can declare policy through apseudo-filesystem (as is typical for LSMs) abstracted bya user space library and enforced transparently by theIFC mechanism

The LSM framework calls security hooks when accessto a kernel object is attempted Security metadata can beassociated with kernel objects and is used by the LSMmodule to make access decisions Tags and privilegesare represented by 64-bit opaque nonces associated withkernel objects such as processes inodes files sharedmemory objects messages etc On interaction betweenkernel objects CamFlow-LSM security hooks are calledto enforce data-flow policy (sect341) or propagate tags onentity creation (sect33) as appropriate

Only active entities (processes) have mutable labelsand privileges all other (passive) entities have im-mutable labels and no privileges

Privileges are allocated by the kernel and owned bythe creating process (any process can create tags and theassociated privileges in a decentralised fashion) Privi-leges can be passed to other processes users or groupsCamFlow-LSM verifying that constraints on privilegedelegation (sect343) and conflict of interest (sect35) are notviolated A process can add or remove a tag from itslabel if it owns the appropriate privilege (following IFCconstraints described in sect342) if the current user ownsthe privilege or if the current group owns the privilegeHow tags are shared and managed must be consideredwith care when designing an application and the systemmust be administered accordingly

411 Checkpointing and Restoration

Checkpointing a process involves halting its executionallowing it to be restarted at a later stage and enablingmigration eg [37] LSM state is normally saved andrestored by the checkpointing system eg [38] andour module further exports an API to more efficientlyserialise and restore security context

Furthermore self-checkpointing and restoring the pre-vious state of a process has been demonstrated [39] to bea beneficial feature for IFC systems This is particularlyuseful for processes serving requests In such a scenariothe state of the process is saved after initialisation Whena request is received the serving process sets itself up

0 10 20 30 40 50 60 70 80 90

sys pipe

sys write

sys read

sys clone

microsdyn label IFC LSM Native

Fig 4 Overhead introduced into the OS by CamFlowLSM (x-axis time in micros)in the security context appropriate to serve the requestAfter the request is served (or a series of requests if thesystem is session-based as described in sect7) the processrestores its memory state and security context to whatthey were immediately after initialisation This improvesperformance and prevents data leaks between securitycontexts

412 OS Evaluation

We tested the CamFlow-LSM module on Linux Kernelversion 3178 (012015) from the Fedora distribution3

The tests are run on an Intel 22Ghz i7 CPU and 6GiBRAM machine

Measurements are done using the Linux tool ftrace [40]to provide a microbenchmark Two processes read fromand write to a pipe respectively Each has 20 tags in itssecurity label substantially more than we have seen aneed for in current use cases We measure the overheadinduced by creating a new process (sys clone) creatinga new pipe (sys pipe) writing to the pipe (sys write) andreading from the pipe (sys read) The results are givenin Fig 4

We can distinguish two types of induced overheadverifying an IFC constraint (sys read sys write) and allo-cating labels (sys clone sys pipe) The sys clone overheadis roughly twice that of sys pipe as memory is allocateddynamically for the active entityrsquos labels and privilegesRecall that passive entities have no privileges Overheadmeasurements for other system callsdata structures areessentially identical as they rely on the same underlyingenforcement mechanism and are not included

The CamFlow-LSM overhead is a few percent seeFig 4 We provide a build option that further improvesperformance by declaring labels and privileges with afixed maximum size (by default label size can increasedynamically to meet application requirements) This re-duces the overhead of the system calls that create newentities (the dynamic label component in Fig 4) This isan acceptable trade-off as in practical scenarios labelsrarely exceed more than five tags However for mostapplications the overhead is imperceptible and lost insystem noise it is hard to measure without using kernel

3 It is not feasible to provide a comparison with the Laminarimplementation [28] that is closest in technical terms to our workas the implementation available httpsgithubcomut-osalaminaris for an obsolete kernel version 2622 (072007)

8

tools as the variation between two executions may begreater than the overhead

42 Trusted Processes

The CamFlow-LSM is trusted to enforce IFC at the kernellevel Its functionality is minimal strictly confined tothe enforcement of IFC policies as described in sect3 Thisguarantees easier maintainability and a system that isagnostic to higher level application requirements thusminimising the constraints imposed on user-space ap-plication design

We introduce the concept of a trusted process that al-lows applicationplatform-specific concerns to be man-aged in user space by bypassing some LSM-enforced IFCconstraints For example a trusted process might serveas a proxy for external connections as in the Trusted IFCGateway in the example in sect7 setting up and managingapplication componentsrsquo labels Trusted processes areused to interact with persistent storage (see sect53) forcheckpointing and restoring processes (see sect411) andfor managing inter-process and external communication(see sect5)

Fig 5 shows OS instances running the CamFlow-LSM hosting a number of application processes thatmay be grouped in containers Each OS instance hasa single trusted process (Security Context Manager) tomanage its hosted processesrsquo IFC labels and privileges Inaddition each process has an associated trusted middle-ware process to handle inter-process and inter-machinecommunication Such communication may be within orbetween containers OSs or clouds

In this example S represents a particular set of secrecytags and I a particular set of integrity tags both ofwhich remain the same throughout The application pro-cesses and other OS objects such as pipes and files arelabelled [S I] The process labelled [empty I] writes lsquopublicrsquodata to a pipe which is read by a process labelled [S I]assuming all the I tags match correctly Similarly twoprocesses are shown writing to and reading from a file

The Security Context Manager maps between thekernel-level representation of tags (as 64-bit integers)and the representation of tags in user space Within acloud or other trusted environment tags may be simplestrings When tags need to cross domain boundarieseg when cloud services form part of a wider archi-tecture as in IoT tags may need to be protected bycryptographic means (see sect51)

Trusted processes are either set up through staticconfiguration read at boot time by the CamFlow-LSMmodule or created at runtime by another trusted pro-cess Trusted processes must either be managed by atrusted party (in our current approach the underlying in-frastructure provider) andor the code must be auditableand a means to verify the current version running on theplatform must be provided (see sect43)

43 Leveraging Hardware Roots of Trust

Incorporating IFC into cloud-provider OSs would en-hance the trustworthiness of the platform HoweverIFC only guarantees protection above the technical layerin which it is enforced Recent hardware and softwaredevelopments make it possible to attest that the softwarelayers on which our platform runs have been audited

The Trusted Platform Module (TPM) [41] as used forremote attestation [42] is one such hardware mecha-nism TPM is used to generate a nearly unforgeablehash representing the state of the hardware and soft-ware of a given platform that can be remotely verifiedTherefore a company could audit the implementationof our IFC enforcement mechanism and ensure that ourkernel security module messaging middleware and theconfiguration they provide are indeed running on theplatform Any difference between the expected state ofthe software stack and the platform could be considereda breach of trust such considerations can easily beembedded in the contractual obligations of the cloudprovider

TPM and remote attestation for cloud computing [43]are reaching maturity with IBM rolling out an opensource scalable trusted platform based on virtual TPMs[44] Indeed Berger et al [44] describe a mechanismallowing the TPM and remote attestation to be providedfor virtual machine offerings and container-based solu-tions covering the whole range of contemporary cloudofferings Furthermore the approach not only allows thestate of the software stack to be verified at boot time butalso during execution and can thus prevent run-timemodification of the system configuration

5 CROSS-MACHINE ENFORCEMENT

CamFlow-LSM operates to protect flows within the OSHowever it is also important that flows are protectedacross OS instances

Generally in order to guarantee flow constraints onlyprocesses P such that S(P ) = empty and I(P ) = empty ie notsubject to IFC constraints are allowed to directly connectto or receive messages from connections on remote OS(eg through a socket) In order to connect to anothermachine a process must either 1) be able to declassifyto change its security context to S(P ) = empty and I(P ) = empty2) communicate through an intermediate trusted process

As such CamFlow contains an IFC-enabled fully-featured messaging middleware (CamFlow-MW) to bothfacilitate communication and guarantee enforcementacross machines For want of space we only considerthe middleware concepts relevant to IFC details on thegeneral middleware (as it was prior to IFCCamFlowintegration) can be found in [45] In short the middle-ware supports strongly-typed messages a range of in-teraction paradigms including request-reply broadcastand streams flexible resource discovery and securitymechanisms including access controls and encryptedcommunication A particular feature is its support for

9

CamFlow-MW CamFlow-MW CamFlow-MW CamFlow-MW CamFlow-MW

Process

CamFlow-LSM

Enforcement Point Pipe [empty I]

Process Process

SecurityContextManager

SecurityContextManager

Container

Container

Container

Process[S I]

CamFlow-LSM

[S I]

File[S I]

[S empty][S empty]Process[empty I]

Fig 5 CamFlow Architecture Labelled OS objects trusted processes and communication middleware

dynamic reconfiguration based on event-driven policyThis simplifies both application development and de-ployment as concerns can be abstracted and tailoredto the particular environment rather than embeddedwithin application code

The role of the middleware is to move towards contin-uous end-to-end data flow management such that IFCcan be enforced across applicationsmachines (kernels)There is work on IFC enforcement across machineshowever these impose specific requirements such asdesign-time considerations [46] a particular languageruntime [47] or constraints on system architectureimplementation [48] In contrast we integrate IFC func-tionality into the general fully featured distributed sys-tems middleware mentioned above (see [49]) to provideflexibility and be more generally applicable We delib-erately avoid imposing a structure on system designinstead integrating IFC functionality into the sort of com-munications infrastructure common to current enterpriseand cloud systems

51 Remote Interactions

CamFlow-MW operates by associating a trusted process(see sect42) with an entity that seeks to communicate viathe messaging system Its fit within the broader architec-ture is depicted in Fig 5 The process is responsible forhandling the communication of messages and enforcesIFC based on the current runtime labels of the entity onwhose behalf it operates

It follows that for IFC to be enforced across machinestags require system-wide management ie throughoutthe cloud service In [50] we proposed that the widelyused and available X509 certificates could be used Theapproach relies on public key certificates and attributecertificates [51] to respectively identify the applicationassociated with the CamFlow-MW instance and the tagsassociated with this application

As part of establishing a connection CamFlow-MWensures that each entity authorises communication withthe other according to a local access control policy De-cisions are based on component metadata the relevantauthentication aspects secured through PKI (certificates)Similarly IFC policy must also be verified to ensure dataflows are authorised according to IFC policy Attributecertificates provide cryptographic means to determineand verify the tags associated with the remote entity

(see [50]) on which policy can be enforced If tags donot accord the connection will not be established

We also see potential for remote attestation based onhardware integrity measures (see sect43) to be integratedinto this authorisation phase to ensure the remote ma-chine operates a reliable IFC enforcement regime

52 Message-Level Enforcement

CamFlow messages are strongly typed where a mes-sage type is defined by a schema describing its set ofattributes For an instance of a message an attributeconsists of a name type and value The support forIFC within messages is fine-grained in that individualattributes within messages can also be labelled Theseattribute labels introduce additional IFC constraints overand above those already applying to the entity ie asrecognised by the kernel-LSM and validated on connec-tion establishment

Labels can be defined within message type schemawhich sets the attributesrsquo IFC labels for all messageinstances of the type These labels cannot be changed byentities dealing in such messages and the entities musthold the requisite labels to interact with the attributesOtherwise the entity producingpublishing a messagecan set the security labels for the attributes (for those notpredefined) if the entity holds the associated privilegesEnforcement occurs as followsReceiving If the receiving entityrsquos labels do not agreewith those of an attribute value the attribute value (andany sub-attributes) are removed from (made null in) themessage This is enforced on message receipt before itis delivered to the entitySending An entity cannot send values for attributeswhere its labels do not agree with those of the attributeThis is enforced when an entity attempts to send amessage ensuring values for any attributes violating thispolicy are removed before message propagation

Enforcement is automatic meaning that applicationsusing the messaging system can be subject to IFC en-forcement completely transparently (ie without their di-rect involvement) though again there is the interface forthe application to actively manage IFC where requiredIn addition the general reconfiguration capabilities ofthe middleware enable connections between componentsto be defined and managed at runtime providing an-other mechanism for controlling communication [45]

10

0 100 200 300 400 500 600 700

Time in msIFC-overhead

Fig 6 IFC overhead of CamFlow-MW for a workloadtransmission of 5000 messages (x-axis in ms)53 Integrating With Persistent Storage

One technique to provide IFC with persistent data storesis to store the tags alongside the data and a trustedsoftware component ensures that when information isread from the store the corresponding labels are appliedIn Flume [14] a trusted process provides the interfacebetween untrusted applications and persistent storageMore recent work has seen the emergence of databasesthat natively understand IFC concepts and can enforceIFC policies [25]

We see much promise in having the middleware me-diate between persistence systems and the kernel toensure consistent IFC application

54 Evaluation

As shown in Fig 6 the results indicate that IFC en-forcement introduces an overhead of sim13 in perfor-mance time compared to the standard non IFC-enabledmiddleware (see [49] for details) Note that these resultswere measured in the context of a particular workloaddeliberately designed to highlight the impact of IFCenforcement It follows that the overheads associatedwith real-world usage are most likely less onerous

6 AUDIT DATA-CENTRIC LOGS

IFC in addition to providing strong assurances that pol-icy is being enforced can also provide a data-centric log[52] detailing the information flows within and betweensystem components In addition to enforcing IFC via ourLSM module we log the data flows of labelled processespolicy decisions privileges and IFC security contextmanipulations Equivalent inter-machine operations viathe middleware are also recorded

Cloud logging systems are generally based on legacylogging systems (OS web-server database etc) thateither fail to capture the needed information or areextremely complicated to interpret in a useful manner[53] More importantly such logs tend to be relevant onlyto the particular service or component which makes itdifficult if not impossible to audit across a range ofapplications clouds etc

IFC logs as provided by our platform allow us to cap-ture information on application-level data flows both at-tempted and permitted allowing the correct expressionand implementation of data flow policy to be checkedThis provides transparency allows for meaningful auditin terms of investigating the circumstances in which dataleakage occurs and provides evidence of complianceeg with legal obligations [54]

Information Exchange

CreateSecurity Context Change

e0

e1

e3e4

e7

e2 e5e6

P3[Sprimeprime I]

P2[Sprimeprime empty]

P3[S I]F1[S I]

P1[S I] P1[empty I]

Public[empty empty]F2[Sprime I prime]

Fig 7 Simplified audit graph from IFC OS execution (weomit metadata for readability) The path to disclosure isshown in bluepale61 Analysing Paths to Disclosure

To assist in interpreting log information we build a di-rected graph corresponding to the allowed flows duringthe execution of our system as shown in Fig 7 Theflows defined in our IFC model (see sect3) namely dataflow creation flow security context change and privi-lege delegation correspond to the edges of the directedgraph Entities (such as processes files messages etc)are represented as nodes in the graph

In addition to information necessary to build the graph(as shown in the figure) additional metadata is collectedfor forensic purposes which is contextentityevent-dependent These audit entries provided by the LSMand middleware can be exploited by a dedicated serviceimplemented in user space connected to the kernelcollection mechanism via relayfs [55] This service feedsfor example a graph visualisation tool such as Cytoscape[56] or a graph database such as Neo4J4

Such a directed graph helps one identify data leaksFor example a tenant might discover that some sensi-tive medical data leaked into a data store where onlyanonymised research data were supposed to be storedIFC is enforced in line with the policy encapsulated inlabels thus data may leak if such policy is improperly ex-pressed andor declassificationendorsement processesare not correctly implemented (eg if the anonymisationprocess in Fig 2 allows re-identification)

Suppose that an information leak is suspected be-tween different security contexts L1[S I] and L2[S

prime I prime]Determining whether such a leak can occur is equiv-alent to discovering whether there is a path in thegraph between the two contexts If the leak occurredthere must be a path between some entity Ei such thatS(Ei) = S and I(Ei) = I and another entity Fi such thatS(Fi) = Sprime and I(Fi) = I prime

The existence of such a path demonstrates that a leakis possible To investigate whether a leak occurred it isessential to consider the event ID associated with theedges comprising the path We denote by ei the last in-coming edge to the entity under investigation with labels[Sprime I prime] only edges such that e lt ei should be consideredWhen applied to all nodes along a path this rule ensuresstrictly monotonically increasing timestamps from thefirst node to the last Fig 7 shows in bluepale a possibledata disclosure path from file F2 from a very simple

4 httpneo4jcom

11

audit graph We know from the event IDs e0 and e1 thatthe data disclosure did not occur through file F1 andprocess P3 but through P1rsquos declassification

62 Demonstrating Compliance

Compliance with certain requirements can be demon-strated through queries over the graph We assume theaudit data is stored in a graph database that we canquery For example the following plain English policyldquoEuropean personal data sent to the US must be anonymisedrdquo[57] is equivalent to writing a query that verifies thatthere is no path between EU- and US-labelled datawithout an anonymisation processThe policy ldquoMedical data stored in database X must havereceived proper consent and be anonymisedrdquo [54] can beexpressed as a query verifying that there is no path be-tween data labelled as medical and the database withoutconsent-checking and anonymiser processes In additionan investigator may want to know which anonymisationalgorithm has been run which data has been used togenerate the anonymised records etc Our audit graphassists in answering such questions

Note that IFC only applies guarantees with respectto flows Demonstrating the overall effectiveness of themanagement regime eg the quality and suitability ofthe anonymisation algorithm is out-of-scope for a flow-based enforcement mechanism

63 Audit as lsquoBig Datarsquo

We are potentially generating a vast amount of data inour IFC logs However unlike standard system logs thatare complex to analyse our logs generate graphs thatare ideal for analysis by ldquobig datardquo tools that have beendeveloped for this purpose [58]

Since the amount of data is potentially huge theamount of data being logged can be fine-tuned to meetthe requirements of the platformtenant eg by reduc-ing the amount of metadata being stored by loggingonly security context changing operations by loggingonly information corresponding to some target securitycontext keeping operations on unlabelled entities out-side of the log etc The decision on what needs to belogged then becomes a tradeoff between utility and thevolume (cost) of log generated which can be decided inorder to correspond to legal or contractual requirements(for example a regulated sector may need to have afine-grained log to satisfy data forensic requirements)Indeed as such an approach is new to the cloud suchconsiderations will be refined by experience with bestpractices developing over time

64 Audit Access

Logs can contain sensitive information and access tothem should be controlled This represents an area ofour ongoing work Traditional access controls clearlyplay a role however secrecy tags could also be lever-aged For example an auditor before being grantedaccess to audit logs could be forced to demonstrate

ownership of the corresponding secrecy IFC tags (forexample through cryptographic means as in sect51) Theauditor may be granted access to a log entry only ifS(origin) cup S(destination) sube S(auditor)7 EXAMPLE SUPPORT FOR WEB SERVICES

One of the most common uses of PaaS is to host web ap-plications In this section we present the implementationof such a solution built on the infrastructure described insect4 in order to evaluate and demonstrate the feasibility ofour proposed approach This is illustrated in Fig 8 Werun standard and unmodified Ruby web applications

Interaction with end-users is achieved through aldquogatewayrdquo between the IFC and non-IFC worlds Simi-larly interaction with cloud services (such as data stores)is also achieved through our messaging middleware asdiscussed in sect5 The requirement for this gateway canbe removed if a trustworthy IFC implementation can beprovided at the client side consistent with the cloud im-plementation with respect to tag naming enforcementetc Tag naming in general system-wide is an issuebeyond the scope of this paper see further sect8 In ourproof of concept implementation the gateway is a simpleApache server running a custom-built module

The role of the gateway is to authenticate the end-userwhen a session is created and to associate this sessionwith an application instance running within the securitycontext corresponding to the user Recall that a securitycontext comprises the S and I labels Any further re-quests to the gateway in that session are routed to thecorresponding application instance Once an instance nolonger has an associated session it can be recycled usingself-checkpointing as described in sect411

Several application types are running over our cloud-based web services platform For example in a medicalcontext these might be medical record editing pharmacyordering social services etc A single shared identityservice for the end-user is part of the cloud provideroffering (in our proof of concept implementation weused OAuth [59])

The GP authenticates is authorised as treating doc-tor for Alice and selects the lsquoidentityrsquo that correspondsto Alice A new session is created server-side by thegateway with the requested application instance run-ning in the corresponding security context with S =[medical alice] I = [empty] When the GP wants to accessapplications on behalf of a new patient he needs to closeAlicersquos session authorise as treating doctor for Bob andopen a new session for Bob

The control described above is not achieved by theapplication but by the platform itself and can be con-trolled by the end-user subject to access control Thatis a medical application used on Alicersquos behalf runs ina security context in which data cannot flow to that ofanother patient Furthermore applications running onbehalf of a given user can share the data of that userwithout the risk of seeing a buggy application leakingdata between end-users

12

App Instance App InstanceApp Instance

CamFlow-MW

IFC Constrained IFC Constrained IFC Constrained

ContextA

ContextB

Public

Trusted IFC Gateway

Enforcing IFCData-store

Data-store Application

CamFlow-MW CamFlow-MWCamFlow-MW

IFC Cloud Platform

Dr Yellow Dr PinkDr MauveRemote Clients

Fig 8 PaaS Architecture on top of IFC-OS

As described in sect4 we assume the middleware andthe OS enforcement are provided as a service by theunderlying platform A tenant wanting to use the third-party web-service offering once his trust in the under-lying platform is established needs only to audit thegateway again the underlying infrastructure providercould either provide such a gateway or audit it Therest of the software stack of the third-party web-serviceprovider is bound by the IFC enforcement mechanismand therefore need not be trusted

8 CONCLUSION amp FUTURE WORK

IFC allows data flows to be controlled continuouslythroughout a system by providing an information-centric MAC scheme that continuously ensures non-interference between security contexts This paper pre-sented the CamFlow platform that demonstrates thepotential of cloud-deployed IFC as supporting (1) pro-tection of applications from each other (2) flexible datasharing across isolation boundaries (3) prevention ofdata leakage due to bugsmisconfigurations (4) exten-sion of access control beyond application boundaries (5)data flow transparency

Specifically we detailed a new kernel implementationof IFC as an LSM demonstrating low overhead evenfor worst-case scenarios where processes continuouslymake readwrite system calls We also described theintegration of a messaging middleware to enforcing IFCacross machines This combination makes it possible toprovide whole-system IFC to PaaS cloud services andtherefore also SaaS Our approach and implementationwere designed so that applications can run unchangedover IFC thus making cloud adoption feasible

We also indicated how the data-centric logs based onIFC enforcement could provide the means to audit anIFC-enabled system whereby a log can be processed asa directed graph to investigate leaks and attacks andshow compliance with data management requirementsThough this represents our initial work in the area thereappears much promise

In light of the above we believe that IFC has greatpotential as a security mechanism for the cloud wherebytrust in a few major cloud providers deploying IFCcan be built on to provide a demonstrably trustworthycomputing environment

CamFlow was developed with cloud deployment inmind Our future work will investigate the challenges ofa broader distributed context

It is already feasible to extend CamFlow to sup-port mobile environments Android supports the fullSELinux enforcement5 and an Android-LSM integrationhas been demonstrated [60] But when dealing withmultiple cloud services particularly as they become partof a wider distributed architecture such as in the IoTa trustworthy system-wide deployment of IFC can nolonger be assumed Much work remains on establishingtrust in (and the trustworthiness of) the IFC enforcementmechanism within end-usersrsquo devices Outside a cloudcontext all partiesrsquo trust in a common third partyrsquos en-forcement of IFC constraints cannot be assumed (unlikethe cloud provider for cloud services) We intend toexplore leveraging hardware roots of trust and remoteattestation to ensure the integrity and trustworthiness ofIFC enforcement mechanisms

Another area of investigation concerns the represen-tation of tags across administrative domains In a cloudcontext a federated approach can be envisaged wherea common understanding of tags could be negotiatedacross multiple domains For example in [54] we dis-cussed initial thoughts on managing data according tospecific obligation regimes However as the numberof administrative domains increases both a global tagnaming scheme and mechanisms for ad-hoc negotiationbecome necessary

Related is the sensitivity of the tags themselvesKnowledge of the meaning of a tag can indicate thatthe associated entity contains or deals with certain infor-mation If a relationship can be established between anentity and the information owner this may disclose pri-vate information about the information owner This maylead to work on a need-to-know negotiation mechanismto establish a secure channel between hosts especiallyin a wide-scale distributed system A promising mech-anism is private set intersection to determine tagsrsquo subsetrelationship (sect341)

Another challenge concerns extending audit to dis-tributed architectures both in terms of resource man-agement and regulating access to log data

5 httpssourceandroidcomdevicestechsecurityselinux

13

ACKNOWLEDGMENTS

This work was supported by UK Engineering andPhysical Sciences Research Council grant EPK011510CloudSafetyNet End-to-End Application Security inthe Cloud We acknowledge the support of Microsoftthrough the Microsoft Cloud Computing Research Cen-tre

REFERENCES

[1] P Barham B Dragovic K Fraser S Hand T Harris A HoR Neugebauer I Pratt and A Warfield ldquoXen and the art ofvirtualizationrdquo SIGOPS Operating Systems Review vol 37 no 5pp 164ndash177 2003

[2] A Kivity Y Kamay D Laor U Lublin and A Liguori ldquokvmthe Linux Virtual Machine Monitorrdquo in Proceedings of the LinuxSymposium vol 1 2007 pp 225ndash230

[3] D Bernstein ldquoContainers and Cloud From LXC to Docker toKubernetesrdquo IEEE Cloud Computing Magazine no 3 pp 81ndash842014

[4] P Desnoyers O Krieger B Holden and J Hennessey ldquoUsingOpenStack for an Open Cloud eXchange (OCX)rdquo in InternationalConference on Cloud Engineering (IC2E) IEEE 2015

[5] J Mineraud O Mazhelis X Su and S Tarkoma ldquoA Gap Analysisof Internet-of-Things Platformsrdquo 2015 Arxiv arXiv150201181[Online] Available httparxivorgabs150201181

[6] C J Millard Ed Cloud Computing Law Oxford University Press2013

[7] S Pearson and M Casassa-Mont ldquoSticky Policies An Approachfor Managing Privacy across Multiple Partiesrdquo Computer vol 44July 2011

[8] D E Denning ldquoA lattice model of secure information flowrdquoCommunication of the ACM vol 19 no 5 pp 236ndash243 1976

[9] D E Bell and L J LaPadula ldquoSecure Computer Systems Mathe-matical Foundations and Modelrdquo The MITRE Corp Bedford MATech Rep M74-244 1973

[10] K J Biba ldquoIntegrity Considerations for Secure Computer Sys-temsrdquo MITRE Corp Tech Rep ESD-TR 76-372 1977

[11] A C Myers and B Liskov ldquoA Decentralized Model for Infor-mation Flow Controlrdquo in 17th Symposium on Operating SystemsPrinciples (SOSP) ACM 1997 pp 129ndash142

[12] A Bouard B Weyl and C Eckert ldquoPractical information-flowaware middleware for in-car communicationrdquo in Workshop onSecurity privacy amp dependability for cyber vehicles ACM 2013pp 3ndash8

[13] K Singh S Bhola and W Lee ldquoxBook Redesigning PrivacyControl in Social Networking Platformsrdquo in Security SymposiumUSENIX 2009 pp 249ndash266

[14] M Krohn A Yip M Brodsky N Cliffer M F KaashoekE Kohler and R Morris ldquoInformation Flow Control for StandardOS Abstractionsrdquo in Symposium on Operating Systems PrinciplesACM 2007 pp 321ndash334

[15] S Lee E L Wong D Goel M Dahlin and V ShmatikovldquoπBox A Platform for Privacy-Preserving Appsrdquo in Symposiumon Networked System Design and Implementation USENIX 2013pp 501ndash514

[16] N Kumar and R Shyamasundar ldquoRealizing Purpose-Based Pri-vacy Policies Succinctly via Information-Flow Labelsrdquo in Big Dataand Cloud Computing (BDCloudrsquo14) IEEE 2014 pp 753ndash760

[17] W Enck P Gilbert B-G Chun L P Cox J Jung P McDaniel andA N Sheth ldquoTaintDroid An information-flow tracking systemfor realtime privacy monitoring on smartphonesrdquo in Conference onOperating systems design and implementation (OSDIrsquo10) USENIX2010 pp 1ndash6

[18] I Papagiannis M Migliavacca and P Pietzuch ldquoPHP AspisUsing partial taint tracking to protect against injection attacksrdquoin 2nd USENIX Conference on Web Application Development 2011p 13

[19] E Schwartz T Avgerinos and D Brumley ldquoAll you ever wantedto know about dynamic taint analysis and forward symbolicexecution (but might have been afraid to ask)rdquo in Symposium onSecurity and Privacy IEEE 2010

[20] M Casassa-Mont S Pearson and P Bramhall ldquoTowards account-able management of identity and privacy Sticky policies andenforceable tracing servicesrdquo in Database and Expert Systems Ap-plications 2003 Proceedings 14th International Workshop on IEEE2003 pp 377ndash382

[21] S Bandhakavi C C Zhang and M Winslett ldquoSuper-sticky anddeclassifiable release policies for flexible information dissemina-tion controlrdquo in Proceedings of the 5th ACM workshop on Privacy inelectronic society ACM 2006 pp 51ndash58

[22] D W Chadwick and S F Lievens ldquoEnforcing sticky securitypolicies throughout a distributed applicationrdquo in Proceedings ofthe 2008 workshop on Middleware security ACM 2008 pp 1ndash6

[23] M Casassa-Mont I Matteucci M Petrocchi and M L SbodioldquoTowards Safer Information Sharing in the Cloudrdquo InternationalJournal of Information Security pp 1ndash16 2014

[24] S Akoush L Carata R Sohan and A Hopper ldquoMrLazy LazyRuntime Label Propagation for MapReducerdquo in 6th Workshop onHot Topics in Cloud Computing (HotCloud) USENIX 2014

[25] D Schultz and B Liskov ldquoIFDB Decentralized Information FlowControl for Databasesrdquo in European Conference on Computer Sys-tems (Eurosysrsquo13) ACM 2013 pp 43ndash56

[26] D Von Oheimb ldquoInformation Flow Control Revisited Nonin-fluence = Noninterference + Nonleakagerdquo in Computer SecurityndashESORICS 2004 Springer 2004 pp 225ndash243

[27] D Hedin and A Sabelfeld ldquoA perspective on Information-FlowControlrdquo NATO 2012

[28] D E Porter M D Bond I Roy K S McKinley and E WitchelldquoPractical Fine-Grained Information Flow Control Using Lam-inarrdquo ACM Transactions on Programming Languages and Systems(TOPLAS) vol 37 no 1 p 4 2014

[29] C Wright C Cowan S Smalley J Morris and G Kroah-Hartman ldquoLinux Security Modules General security support forthe Linux kernelrdquo in Foundations of Intrusion Tolerant SystemsIEEE 2003 pp 213ndash213

[30] M Quaritsch and T Winkler ldquoLinux Security Modules Enhance-ments Module Stacking Framework and TCP State TransitionHooks for State-Driven NIDSrdquo Secure Information and Communi-cation vol 7 pp 7ndash13 2004

[31] C Schaufler ldquoLSM Generalize existing module stackingrdquo LinuxWeekly News 2014

[32] S Smalley C Vance and W Salamon ldquoImplementing SELinux asa Linux Security Modulerdquo NAI Labs Report vol 1 p 43 2001

[33] M Bauer ldquoParanoid Penguin an Introduction to Novell AppAr-morrdquo Linux Journal vol 2006 no 148 p 13 2006

[34] A Edwards T Jaeger and X Zhang ldquoRuntime verification ofauthorization hook placement for the Linux Security Modulesframeworkrdquo in Conference on Computer and Communications Secu-rity ACM 2002 pp 225ndash234

[35] T Jaeger A Edwards and X Zhang ldquoConsistency analysis ofauthorization hook placement in the Linux security modulesframeworkrdquo ACM Transactions on Information and System Security(TISSEC) vol 7 no 2 pp 175ndash205 2004

[36] V Ganapathy T Jaeger and S Jha ldquoAutomatic placement ofauthorization hooks in the Linux security modules frameworkrdquoin Conference on Computer and Communications Security ACM2005 pp 330ndash339

[37] I P Egwutuoha D Levy B Selic and S Chen ldquoA survey of faulttolerance mechanisms and checkpointrestart implementationsfor high performance computing systemsrdquo Springer 2013vol 65 no 3 pp 1302ndash1326

[38] O Laadan and S E Hallyn ldquoLinux-CR Transparent applicationcheckpoint-restart in Linuxrdquo in Linux Symposium 2010 p 159

[39] B Niu and G Tan ldquoEfficient User-space Information Flow Con-trolrdquo in Symposium on Information Computer and CommunicationsSecurity (SIGSACrsquo13) ACM 2013 pp 131ndash142

[40] T Bird ldquoMeasuring Function Duration with ftracerdquo in Japan LinuxSymposium 2009 pp 47ndash54

[41] B Parno ldquoBootstrapping Trust in ardquo Trustedrdquo Platformrdquo in Con-ference on Hot Topics in Security (HotSecrsquo08) USENIX 2008

[42] N Santos K P Gummadi and R Rodrigues ldquoTowards TrustedCloud Computingrdquo in Conference on Hot Topics in Cloud Comput-ing USENIX 2009 pp 3ndash3

[43] S Berger R Caceres K A Goldman R Perez R Sailer andL van Doorn ldquovTPM Virtualizing the Trusted Platform Modulerdquoin Security Symposium USENIX 2006 pp 305ndash320

14

[44] S Berger K Goldman D Pendarakis D Safford E Valdezand M Zohar ldquoScalable Attestation A Step Toward Secure andTrusted Cloudsrdquo in International Conference on Cloud Engineering(IC2E) IEEE 2015

[45] J Singh D Eyers and J Bacon ldquoPolicy Enforcement withinEmerging Distributed Event-Based Systemsrdquo in ACM DistributedEvent-Based Systems (DEBSrsquo14) 2014

[46] L Sfaxi T Abdellatif R Robbana and Y Lakhnech ldquoInformationFlow Control of Component-based Distributed Systemsrdquo Concur-rency and Computation Practice and Experience vol 25 no 2 pp161ndash179 2013

[47] S Yoshihama T Yoshizawa Y Watanabe M Kudoh and K Oy-anagi ldquoDynamic Information Flow Control Architecture for WebApplicationsrdquo in ESORICS 2007 J Biskup and J Lopez EdsSpringer 2007 vol LNCS 4734 pp 267ndash282

[48] W Cheng D R K Ports D Schultz V Popic A BlanksteinJ Cowling D Curtis L Shrira and B Liskov ldquoAbstractions forUsable Information Flow Control in Aeolusrdquo in USENIX AnnualTechnical Conference Boston 2012

[49] J Singh T Pasquier J Bacon and D Eyers ldquoIntegrating Middle-ware and Information Flow Controlrdquo in International Conferenceon Cloud Engineering (IC2E) IEEE 2015 pp 54ndash59

[50] J Singh T F J-M Pasquier and J Bacon ldquoSecuring Tags toControl Information Flows within the Internet of Thingsrdquo inInternational Conference on Recent Advances in Internet of Things(RIoTrsquo15) IEEE 2015

[51] J S Park and R Sandhu ldquoBinding Identities and Attributes UsingDigitally Signed Certificatesrdquo in Annual Conference on ComputerSecurity Applications IEEE 2000 pp 120ndash127

[52] A Ganjali and D Lie ldquoAuditing cloud management using infor-mation flow trackingrdquo in Workshop on Scalable Trusted ComputingACM 2012 pp 79ndash84

[53] R K Ko M Kirchberg and B S Lee ldquoFrom System-centric toData-centric Logging-accountability Trust amp Security in CloudComputingrdquo in Defense Science Research Conference and Expo (DSR)2011 IEEE 2011 pp 1ndash4

[54] J Singh J Powles T Pasquier and J Bacon ldquoData Flow Man-agement and Compliance in Cloud Computingrdquo IEEE CloudComputing Magazine SI on Legal Clouds 2015

[55] T Zanussi K Yaghmour R Wisniewski R Moore and M Dage-nais ldquorelayfs An efficient unified approach for transmitting datafrom kernel to user spacerdquo in Linux Symposium 2003 p 494

[56] M E Smoot K Ono J Ruscheinski P-L Wang and T IdekerldquoCytoscape 28 New features for data integration and networkvisualizationrdquo Oxford University Press 2011 vol 27 no 3 pp431ndash432

[57] T Pasquier and J Powles ldquoExpressing and Enforcing LocationRequirements in the Cloud using Information Flow Controlrdquo inIC2E International Workshop on Legal and Technical Issues in CloudComputing (CLawrsquo15) IEEE 2015

[58] R Angles and C Gutierrez ldquoSurvey of Graph Database ModelsrdquoACM Computing Surveys (CSUR) vol 40 no 1 p 1 2008

[59] D Hardt ldquoThe OAuth 20 Authorization Frameworkrdquo IETF TechRep RFC 6749 2012

[60] S Smalley and R Craig ldquoSecurity Enhanced (SE) Android Bring-ing Flexible MAC to Androidrdquo in Network and Distributed SystemSecurity Symposium Internet Society 2013

  • 1 Introduction and Motivation
  • 2 Background
    • 21 IFC Models
    • 22 Protection via VMs and Containers
    • 23 Taint Tracking (TT) Systems
    • 24 Sticky Policies
      • 3 CamFlow-Model IFC for the Cloud
        • 31 Tags and Labels
        • 32 Decentralised Privileges and Security Contexts
        • 33 Creating a New Entity
        • 34 Security
          • 341 Information Exchange
          • 342 Label Change
          • 343 Privilege delegation
            • 35 Conflict of Interest
              • 4 OS Enforcement
                • 41 CamFlow-LSM
                  • 411 Checkpointing and Restoration
                  • 412 OS Evaluation
                    • 42 Trusted Processes
                    • 43 Leveraging Hardware Roots of Trust
                      • 5 Cross-Machine Enforcement
                        • 51 Remote Interactions
                        • 52 Message-Level Enforcement
                        • 53 Integrating With Persistent Storage
                        • 54 Evaluation
                          • 6 Audit Data-Centric Logs
                            • 61 Analysing Paths to Disclosure
                            • 62 Demonstrating Compliance
                            • 63 Audit as `Big Data
                            • 64 Audit Access
                              • 7 Example Support for Web Services
                              • 8 Conclusion amp Future Work
                              • References
Page 5: CamFlow: Managed data-sharing for cloud services · CamFlow: Managed data-sharing for cloud services Thomas F. J.-M. Pasquier, Member, IEEE, Jatinder Singh, Member, IEEE, David Eyers,

5

Medical RecordS = personal

I = empty

Consent Checker

S = personalI = cons

Anonymiser

S = researchI = cons anon

Research DatabaseS = research

I = cons anon

Researcher PortalS = research

I = cons anonAllowed FlowPrevented Flow

Research Project XXNHS Cloud S = personalI = empty

S = personalI = cons

Context change

Fig 2 Medical data declassified and endorsed for research purposes

privilege to add t to its secrecy label we denote thist isin P+

S (A) and to remove t from its secrecy labelt isin PminusS (A) (and similarly P+

I (A) and PminusI (A) are the priv-ileges for integrity) An active entity may therefore havefour privilege sets in addition to its security contextApplication managers will normally set up applicationinstances in security contexts without the privileges tochange them An example is given in sect7

33 Creating a New Entity

We define A rArr B as the operation of the entity Acreating the entity B An example is creating a processin a Unix-style OS by clone We have the following rulesfor creation

if ArArr B then

S(B) = S(A)

I(B) = I(A)

That is the created entity inherits the security contextof its creator These rules force the creating entity toexplicitly change its security context to that required forthe entity to be created We motivate this below in sect342Note that only labels pass to the created entity privilegeshave to be passed explicitly

34 Security

The purpose of IFC models is to regulate flows betweenentities and effect label changes and privilege delega-tion

Definition 1 A system is secure in the CamFlow IFC modelif and only if all allowed messages are safe (Definition 2) allallowed label changes are safe (Definition 3) and all privilegedelegation is safe (Definitions 4 and 5)

341 Information Exchange

IFC prevents data leakage by controlling the exchangeof information We follow the classic pattern for IFC-guaranteed secrecy (no read up no write down [9]) andintegrity (no read down no write up [10])

Definition 2 A flow of information A rarr B is safe if andonly if

Ararr B iff S(A) sube S(B) and I(B) sube I(A)Example ndash secrecy enforcement Consider our exam-ple of patient monitoring after discharge from hospital

where the patientrsquos devices are tagged with medical bobin their secrecy labels In order for the cloud service tobe able to receive this data it must also include the tagsmedical bob in its secrecy label Therefore an applicationinstance accessing Bobrsquos medical data must be labelled assuch In sect7 we describe how applications can be designedto meet such requirementsExample ndash integrity enforcement The cloud-basedhome monitoring support service needs to be assuredthat the data it receives is from a hospital-issued de-vice To achieve this the service has an integrity taghospital-issued in its integrity label and will only acceptdata from devices with tags hospital-issued

342 Label Change

Under the above constraints information flows arerestricted to equal or increasing secrecy constraintsand equal or decreasing integrity constraints Howeverdata may undergo transformations andor checks thatchange its security properties For example moving datathrough an anonymisation engine renders the data lesssensitive so less strict secrecy constraints can applyto the anonymised output In the integrity dimensiondata may go through a validation process on inputthus becoming more trustworthy In CamFlow only theprocess itself is able to change its secrecy and integritylabels which requires the appropriate privileges andmust be explicitly requested

Definition 3 A label change noted A Aprime is safe if andonly if for a label X (either S or I) and a tag t

X(Aprime) = X(A) cup t if t isin P+X (A)

ORX(Aprime) = X(A) t if t isin PminusX (A)

Declassifiers and endorsers are the entities with theprivileges to perform security context transformationsDeclassifiers change the secrecy properties and endorserschange the integrity propertiesExample ndash declassification A medical record systemis held in a private cloud Research datasets may becreated from these records but only from records wherethe patients have given consent Also only anonymised

6

data may leave the private protected environment Weassume a health service approved anonymisation proce-dure Fig 2 shows the anonymiser inputting data taggedas personal and declassifying the data by outputting datawith secrecy tag researchExample ndash endorsement In the same example theResearch Database is on a public cloud and may onlyreceive research data tagged with consent anon in itsintegrity label In the private cloud we see a pro-cess that selects appropriate records for specific re-search purposes checks for patient consent and addsthe tag consent to the integrity label of its output Theanonymiser process can only input data with this tag itanonymises the data and outputs data with the tag anonin its integrity label

Some previous work [14] [28] allows implicit declas-sification and endorsement That is if an active entity hasthe privilege to declassifyendorse and the privilegeto return to its original state (ie for declassificationendorsement over t the entity has privilege tminus and t+)the declassificationendorsement may occur implicitlywithout the need for the entity to make the label changesWe believe that this could in practice lead to unintentionaldata disclosure Suppose an entity has the privilege todeclassify top-secret information The requirement forexplicit label change makes it unlikely that the entity willsend such data accidentally to an unintended recipientOur model has stronger constraints that require endorse-ment and declassification operations to be programmedexplicitly

343 Privilege delegation

An entity is only able to delegate a privilege it owns

Definition 4 A privilege delegation is safe if and only ift isin PplusmnX (A)

35 Conflict of Interest

In CamFlow alone among IFC systems privilege dele-gation is further restricted by Conflict of Interest (CoI)(or Separation of Duty (SoD)) enforcement The receivingentity A must not be put in a situation where it wouldbreak a CoI constraint By this means an applicationmanager is prevented from creating an application in-stance with access to conflicting data

Definition 5 An entity A does not violate a CoI C if andonly if∣∣∣(S(A) cup I(A) cup P+

S (A) cup P+I (A) cup Pminus

S (A) cup PminusI (A)

)cap C

∣∣∣ le 1

Example ndash conflict of interest A CoI might arise whendata relating to competing companies is available in asystem In a hospital context this might involve theresults of analysis of the usage and effects of drugs fromcompeting pharmaceutical companies The companiesmight agree to this analysis only if their data is guar-anteed to be isolated ie not leaked to other companies

The hospital may be participating in drug trials andwant to ensure that information does not leak between

Camflow-LSM

IFC Library

User Space

Kernel Space

Application Process

Kernel

DIFC Management APISystem Calls

Hardware

IFC LibraryTrusted Process

Authorise and Audit

Fig 3 The interactions of the IFC Security Module (LSM)and a Trusted Process within an OStrials suppose a conflict is C = PfizerGSKRoche and some data (eg files) are labelled PfizerData[S =Pfizer I = empty] and RocheData[S = Roche I = empty] TheCoI described ensures that it is not possible for a singleentity (eg an application instance) to have access toboth RocheData and PfizerData either simultaneously orsequentially ie enforcing that Roche-owned data andPfizer-owned data are processed in isolation

The next sections describe the CamFlow platform thatenforces the IFC constraints described

4 OS ENFORCEMENT

At the heart of the architecture is a minimal kernel mod-ule dedicated solely to OS-level IFC enforcement Themodule is trusted to enforce IFC transparently acrossall flows between entities within the OS User spaceprocesses can directly interact with the kernel moduleeg to delegate privileges (sect34) through a pseudo-filesystem abstracted through a high level API Higher levelconsiderations and policies can be managed throughspecifically defined Trusted Processes (see sect42) The localmachine architecture is represented in Fig 3

Note that IFC operates alongside and complementsother security technologies It is not a cloud securitypanacea challenges regarding covert and side channelsand direct access to hardware by an attacker remain asthey do for systems in general There are approachesthat can help address these security threats but manyare highly disruptive (eg synchronisation approachesto reducing timing channels) and are infrequently usedOther threats may be easier to mitigate and solutionsmay be used when appropriate (eg on-disk encryption)

41 CamFlow-LSM

Our kernel module CamFlow-LSM is implemented as aLinux Security Module (LSM) [29] Although our work isLinux-specific a similar approach could be used on anysystem providing LSM-like security hooks Unlike otherDIFC OS implementations [14] [28] our kernel patch isself-contained strictly limited to the security moduledoes not modify any existing system calls and followsLSM implementation best practice This allows amongother things LSM stacking [30] [31] and coexistencewith other security modules such as eg SELinux [32]or AppArmor [33] and complements their MAC enforce-ment with decentralised information flow policies

We assume that the rest of the kernel can be trustedand does not interfere with the IFC enforcement mech-anism LSM system hooks have been statically anddynamically verified [34]ndash[36] and our implementation

7

inherits from LSM the formal assurance of IFCrsquos correctplacement on the path to any controlled kernel objectThis is sufficient to guarantee that we control flow andrecord audit on any operation on a controlled kernelobject

Since applications running on SELinux [32] or Ap-pArmor [33] need not be aware of the MAC policybeing enforced we see no reason to force applicationsrunning on an IFC system to be aware of IFC only thoseperforming declassification or endorsement operationsare necessarily aware This implementation choice isimportant cloud providers can incorporate IFC withoutrequiring changes in the software deployed by ten-ants Alternatively applications that wish to managetheir own IFC constraints can declare policy through apseudo-filesystem (as is typical for LSMs) abstracted bya user space library and enforced transparently by theIFC mechanism

The LSM framework calls security hooks when accessto a kernel object is attempted Security metadata can beassociated with kernel objects and is used by the LSMmodule to make access decisions Tags and privilegesare represented by 64-bit opaque nonces associated withkernel objects such as processes inodes files sharedmemory objects messages etc On interaction betweenkernel objects CamFlow-LSM security hooks are calledto enforce data-flow policy (sect341) or propagate tags onentity creation (sect33) as appropriate

Only active entities (processes) have mutable labelsand privileges all other (passive) entities have im-mutable labels and no privileges

Privileges are allocated by the kernel and owned bythe creating process (any process can create tags and theassociated privileges in a decentralised fashion) Privi-leges can be passed to other processes users or groupsCamFlow-LSM verifying that constraints on privilegedelegation (sect343) and conflict of interest (sect35) are notviolated A process can add or remove a tag from itslabel if it owns the appropriate privilege (following IFCconstraints described in sect342) if the current user ownsthe privilege or if the current group owns the privilegeHow tags are shared and managed must be consideredwith care when designing an application and the systemmust be administered accordingly

411 Checkpointing and Restoration

Checkpointing a process involves halting its executionallowing it to be restarted at a later stage and enablingmigration eg [37] LSM state is normally saved andrestored by the checkpointing system eg [38] andour module further exports an API to more efficientlyserialise and restore security context

Furthermore self-checkpointing and restoring the pre-vious state of a process has been demonstrated [39] to bea beneficial feature for IFC systems This is particularlyuseful for processes serving requests In such a scenariothe state of the process is saved after initialisation Whena request is received the serving process sets itself up

0 10 20 30 40 50 60 70 80 90

sys pipe

sys write

sys read

sys clone

microsdyn label IFC LSM Native

Fig 4 Overhead introduced into the OS by CamFlowLSM (x-axis time in micros)in the security context appropriate to serve the requestAfter the request is served (or a series of requests if thesystem is session-based as described in sect7) the processrestores its memory state and security context to whatthey were immediately after initialisation This improvesperformance and prevents data leaks between securitycontexts

412 OS Evaluation

We tested the CamFlow-LSM module on Linux Kernelversion 3178 (012015) from the Fedora distribution3

The tests are run on an Intel 22Ghz i7 CPU and 6GiBRAM machine

Measurements are done using the Linux tool ftrace [40]to provide a microbenchmark Two processes read fromand write to a pipe respectively Each has 20 tags in itssecurity label substantially more than we have seen aneed for in current use cases We measure the overheadinduced by creating a new process (sys clone) creatinga new pipe (sys pipe) writing to the pipe (sys write) andreading from the pipe (sys read) The results are givenin Fig 4

We can distinguish two types of induced overheadverifying an IFC constraint (sys read sys write) and allo-cating labels (sys clone sys pipe) The sys clone overheadis roughly twice that of sys pipe as memory is allocateddynamically for the active entityrsquos labels and privilegesRecall that passive entities have no privileges Overheadmeasurements for other system callsdata structures areessentially identical as they rely on the same underlyingenforcement mechanism and are not included

The CamFlow-LSM overhead is a few percent seeFig 4 We provide a build option that further improvesperformance by declaring labels and privileges with afixed maximum size (by default label size can increasedynamically to meet application requirements) This re-duces the overhead of the system calls that create newentities (the dynamic label component in Fig 4) This isan acceptable trade-off as in practical scenarios labelsrarely exceed more than five tags However for mostapplications the overhead is imperceptible and lost insystem noise it is hard to measure without using kernel

3 It is not feasible to provide a comparison with the Laminarimplementation [28] that is closest in technical terms to our workas the implementation available httpsgithubcomut-osalaminaris for an obsolete kernel version 2622 (072007)

8

tools as the variation between two executions may begreater than the overhead

42 Trusted Processes

The CamFlow-LSM is trusted to enforce IFC at the kernellevel Its functionality is minimal strictly confined tothe enforcement of IFC policies as described in sect3 Thisguarantees easier maintainability and a system that isagnostic to higher level application requirements thusminimising the constraints imposed on user-space ap-plication design

We introduce the concept of a trusted process that al-lows applicationplatform-specific concerns to be man-aged in user space by bypassing some LSM-enforced IFCconstraints For example a trusted process might serveas a proxy for external connections as in the Trusted IFCGateway in the example in sect7 setting up and managingapplication componentsrsquo labels Trusted processes areused to interact with persistent storage (see sect53) forcheckpointing and restoring processes (see sect411) andfor managing inter-process and external communication(see sect5)

Fig 5 shows OS instances running the CamFlow-LSM hosting a number of application processes thatmay be grouped in containers Each OS instance hasa single trusted process (Security Context Manager) tomanage its hosted processesrsquo IFC labels and privileges Inaddition each process has an associated trusted middle-ware process to handle inter-process and inter-machinecommunication Such communication may be within orbetween containers OSs or clouds

In this example S represents a particular set of secrecytags and I a particular set of integrity tags both ofwhich remain the same throughout The application pro-cesses and other OS objects such as pipes and files arelabelled [S I] The process labelled [empty I] writes lsquopublicrsquodata to a pipe which is read by a process labelled [S I]assuming all the I tags match correctly Similarly twoprocesses are shown writing to and reading from a file

The Security Context Manager maps between thekernel-level representation of tags (as 64-bit integers)and the representation of tags in user space Within acloud or other trusted environment tags may be simplestrings When tags need to cross domain boundarieseg when cloud services form part of a wider archi-tecture as in IoT tags may need to be protected bycryptographic means (see sect51)

Trusted processes are either set up through staticconfiguration read at boot time by the CamFlow-LSMmodule or created at runtime by another trusted pro-cess Trusted processes must either be managed by atrusted party (in our current approach the underlying in-frastructure provider) andor the code must be auditableand a means to verify the current version running on theplatform must be provided (see sect43)

43 Leveraging Hardware Roots of Trust

Incorporating IFC into cloud-provider OSs would en-hance the trustworthiness of the platform HoweverIFC only guarantees protection above the technical layerin which it is enforced Recent hardware and softwaredevelopments make it possible to attest that the softwarelayers on which our platform runs have been audited

The Trusted Platform Module (TPM) [41] as used forremote attestation [42] is one such hardware mecha-nism TPM is used to generate a nearly unforgeablehash representing the state of the hardware and soft-ware of a given platform that can be remotely verifiedTherefore a company could audit the implementationof our IFC enforcement mechanism and ensure that ourkernel security module messaging middleware and theconfiguration they provide are indeed running on theplatform Any difference between the expected state ofthe software stack and the platform could be considereda breach of trust such considerations can easily beembedded in the contractual obligations of the cloudprovider

TPM and remote attestation for cloud computing [43]are reaching maturity with IBM rolling out an opensource scalable trusted platform based on virtual TPMs[44] Indeed Berger et al [44] describe a mechanismallowing the TPM and remote attestation to be providedfor virtual machine offerings and container-based solu-tions covering the whole range of contemporary cloudofferings Furthermore the approach not only allows thestate of the software stack to be verified at boot time butalso during execution and can thus prevent run-timemodification of the system configuration

5 CROSS-MACHINE ENFORCEMENT

CamFlow-LSM operates to protect flows within the OSHowever it is also important that flows are protectedacross OS instances

Generally in order to guarantee flow constraints onlyprocesses P such that S(P ) = empty and I(P ) = empty ie notsubject to IFC constraints are allowed to directly connectto or receive messages from connections on remote OS(eg through a socket) In order to connect to anothermachine a process must either 1) be able to declassifyto change its security context to S(P ) = empty and I(P ) = empty2) communicate through an intermediate trusted process

As such CamFlow contains an IFC-enabled fully-featured messaging middleware (CamFlow-MW) to bothfacilitate communication and guarantee enforcementacross machines For want of space we only considerthe middleware concepts relevant to IFC details on thegeneral middleware (as it was prior to IFCCamFlowintegration) can be found in [45] In short the middle-ware supports strongly-typed messages a range of in-teraction paradigms including request-reply broadcastand streams flexible resource discovery and securitymechanisms including access controls and encryptedcommunication A particular feature is its support for

9

CamFlow-MW CamFlow-MW CamFlow-MW CamFlow-MW CamFlow-MW

Process

CamFlow-LSM

Enforcement Point Pipe [empty I]

Process Process

SecurityContextManager

SecurityContextManager

Container

Container

Container

Process[S I]

CamFlow-LSM

[S I]

File[S I]

[S empty][S empty]Process[empty I]

Fig 5 CamFlow Architecture Labelled OS objects trusted processes and communication middleware

dynamic reconfiguration based on event-driven policyThis simplifies both application development and de-ployment as concerns can be abstracted and tailoredto the particular environment rather than embeddedwithin application code

The role of the middleware is to move towards contin-uous end-to-end data flow management such that IFCcan be enforced across applicationsmachines (kernels)There is work on IFC enforcement across machineshowever these impose specific requirements such asdesign-time considerations [46] a particular languageruntime [47] or constraints on system architectureimplementation [48] In contrast we integrate IFC func-tionality into the general fully featured distributed sys-tems middleware mentioned above (see [49]) to provideflexibility and be more generally applicable We delib-erately avoid imposing a structure on system designinstead integrating IFC functionality into the sort of com-munications infrastructure common to current enterpriseand cloud systems

51 Remote Interactions

CamFlow-MW operates by associating a trusted process(see sect42) with an entity that seeks to communicate viathe messaging system Its fit within the broader architec-ture is depicted in Fig 5 The process is responsible forhandling the communication of messages and enforcesIFC based on the current runtime labels of the entity onwhose behalf it operates

It follows that for IFC to be enforced across machinestags require system-wide management ie throughoutthe cloud service In [50] we proposed that the widelyused and available X509 certificates could be used Theapproach relies on public key certificates and attributecertificates [51] to respectively identify the applicationassociated with the CamFlow-MW instance and the tagsassociated with this application

As part of establishing a connection CamFlow-MWensures that each entity authorises communication withthe other according to a local access control policy De-cisions are based on component metadata the relevantauthentication aspects secured through PKI (certificates)Similarly IFC policy must also be verified to ensure dataflows are authorised according to IFC policy Attributecertificates provide cryptographic means to determineand verify the tags associated with the remote entity

(see [50]) on which policy can be enforced If tags donot accord the connection will not be established

We also see potential for remote attestation based onhardware integrity measures (see sect43) to be integratedinto this authorisation phase to ensure the remote ma-chine operates a reliable IFC enforcement regime

52 Message-Level Enforcement

CamFlow messages are strongly typed where a mes-sage type is defined by a schema describing its set ofattributes For an instance of a message an attributeconsists of a name type and value The support forIFC within messages is fine-grained in that individualattributes within messages can also be labelled Theseattribute labels introduce additional IFC constraints overand above those already applying to the entity ie asrecognised by the kernel-LSM and validated on connec-tion establishment

Labels can be defined within message type schemawhich sets the attributesrsquo IFC labels for all messageinstances of the type These labels cannot be changed byentities dealing in such messages and the entities musthold the requisite labels to interact with the attributesOtherwise the entity producingpublishing a messagecan set the security labels for the attributes (for those notpredefined) if the entity holds the associated privilegesEnforcement occurs as followsReceiving If the receiving entityrsquos labels do not agreewith those of an attribute value the attribute value (andany sub-attributes) are removed from (made null in) themessage This is enforced on message receipt before itis delivered to the entitySending An entity cannot send values for attributeswhere its labels do not agree with those of the attributeThis is enforced when an entity attempts to send amessage ensuring values for any attributes violating thispolicy are removed before message propagation

Enforcement is automatic meaning that applicationsusing the messaging system can be subject to IFC en-forcement completely transparently (ie without their di-rect involvement) though again there is the interface forthe application to actively manage IFC where requiredIn addition the general reconfiguration capabilities ofthe middleware enable connections between componentsto be defined and managed at runtime providing an-other mechanism for controlling communication [45]

10

0 100 200 300 400 500 600 700

Time in msIFC-overhead

Fig 6 IFC overhead of CamFlow-MW for a workloadtransmission of 5000 messages (x-axis in ms)53 Integrating With Persistent Storage

One technique to provide IFC with persistent data storesis to store the tags alongside the data and a trustedsoftware component ensures that when information isread from the store the corresponding labels are appliedIn Flume [14] a trusted process provides the interfacebetween untrusted applications and persistent storageMore recent work has seen the emergence of databasesthat natively understand IFC concepts and can enforceIFC policies [25]

We see much promise in having the middleware me-diate between persistence systems and the kernel toensure consistent IFC application

54 Evaluation

As shown in Fig 6 the results indicate that IFC en-forcement introduces an overhead of sim13 in perfor-mance time compared to the standard non IFC-enabledmiddleware (see [49] for details) Note that these resultswere measured in the context of a particular workloaddeliberately designed to highlight the impact of IFCenforcement It follows that the overheads associatedwith real-world usage are most likely less onerous

6 AUDIT DATA-CENTRIC LOGS

IFC in addition to providing strong assurances that pol-icy is being enforced can also provide a data-centric log[52] detailing the information flows within and betweensystem components In addition to enforcing IFC via ourLSM module we log the data flows of labelled processespolicy decisions privileges and IFC security contextmanipulations Equivalent inter-machine operations viathe middleware are also recorded

Cloud logging systems are generally based on legacylogging systems (OS web-server database etc) thateither fail to capture the needed information or areextremely complicated to interpret in a useful manner[53] More importantly such logs tend to be relevant onlyto the particular service or component which makes itdifficult if not impossible to audit across a range ofapplications clouds etc

IFC logs as provided by our platform allow us to cap-ture information on application-level data flows both at-tempted and permitted allowing the correct expressionand implementation of data flow policy to be checkedThis provides transparency allows for meaningful auditin terms of investigating the circumstances in which dataleakage occurs and provides evidence of complianceeg with legal obligations [54]

Information Exchange

CreateSecurity Context Change

e0

e1

e3e4

e7

e2 e5e6

P3[Sprimeprime I]

P2[Sprimeprime empty]

P3[S I]F1[S I]

P1[S I] P1[empty I]

Public[empty empty]F2[Sprime I prime]

Fig 7 Simplified audit graph from IFC OS execution (weomit metadata for readability) The path to disclosure isshown in bluepale61 Analysing Paths to Disclosure

To assist in interpreting log information we build a di-rected graph corresponding to the allowed flows duringthe execution of our system as shown in Fig 7 Theflows defined in our IFC model (see sect3) namely dataflow creation flow security context change and privi-lege delegation correspond to the edges of the directedgraph Entities (such as processes files messages etc)are represented as nodes in the graph

In addition to information necessary to build the graph(as shown in the figure) additional metadata is collectedfor forensic purposes which is contextentityevent-dependent These audit entries provided by the LSMand middleware can be exploited by a dedicated serviceimplemented in user space connected to the kernelcollection mechanism via relayfs [55] This service feedsfor example a graph visualisation tool such as Cytoscape[56] or a graph database such as Neo4J4

Such a directed graph helps one identify data leaksFor example a tenant might discover that some sensi-tive medical data leaked into a data store where onlyanonymised research data were supposed to be storedIFC is enforced in line with the policy encapsulated inlabels thus data may leak if such policy is improperly ex-pressed andor declassificationendorsement processesare not correctly implemented (eg if the anonymisationprocess in Fig 2 allows re-identification)

Suppose that an information leak is suspected be-tween different security contexts L1[S I] and L2[S

prime I prime]Determining whether such a leak can occur is equiv-alent to discovering whether there is a path in thegraph between the two contexts If the leak occurredthere must be a path between some entity Ei such thatS(Ei) = S and I(Ei) = I and another entity Fi such thatS(Fi) = Sprime and I(Fi) = I prime

The existence of such a path demonstrates that a leakis possible To investigate whether a leak occurred it isessential to consider the event ID associated with theedges comprising the path We denote by ei the last in-coming edge to the entity under investigation with labels[Sprime I prime] only edges such that e lt ei should be consideredWhen applied to all nodes along a path this rule ensuresstrictly monotonically increasing timestamps from thefirst node to the last Fig 7 shows in bluepale a possibledata disclosure path from file F2 from a very simple

4 httpneo4jcom

11

audit graph We know from the event IDs e0 and e1 thatthe data disclosure did not occur through file F1 andprocess P3 but through P1rsquos declassification

62 Demonstrating Compliance

Compliance with certain requirements can be demon-strated through queries over the graph We assume theaudit data is stored in a graph database that we canquery For example the following plain English policyldquoEuropean personal data sent to the US must be anonymisedrdquo[57] is equivalent to writing a query that verifies thatthere is no path between EU- and US-labelled datawithout an anonymisation processThe policy ldquoMedical data stored in database X must havereceived proper consent and be anonymisedrdquo [54] can beexpressed as a query verifying that there is no path be-tween data labelled as medical and the database withoutconsent-checking and anonymiser processes In additionan investigator may want to know which anonymisationalgorithm has been run which data has been used togenerate the anonymised records etc Our audit graphassists in answering such questions

Note that IFC only applies guarantees with respectto flows Demonstrating the overall effectiveness of themanagement regime eg the quality and suitability ofthe anonymisation algorithm is out-of-scope for a flow-based enforcement mechanism

63 Audit as lsquoBig Datarsquo

We are potentially generating a vast amount of data inour IFC logs However unlike standard system logs thatare complex to analyse our logs generate graphs thatare ideal for analysis by ldquobig datardquo tools that have beendeveloped for this purpose [58]

Since the amount of data is potentially huge theamount of data being logged can be fine-tuned to meetthe requirements of the platformtenant eg by reduc-ing the amount of metadata being stored by loggingonly security context changing operations by loggingonly information corresponding to some target securitycontext keeping operations on unlabelled entities out-side of the log etc The decision on what needs to belogged then becomes a tradeoff between utility and thevolume (cost) of log generated which can be decided inorder to correspond to legal or contractual requirements(for example a regulated sector may need to have afine-grained log to satisfy data forensic requirements)Indeed as such an approach is new to the cloud suchconsiderations will be refined by experience with bestpractices developing over time

64 Audit Access

Logs can contain sensitive information and access tothem should be controlled This represents an area ofour ongoing work Traditional access controls clearlyplay a role however secrecy tags could also be lever-aged For example an auditor before being grantedaccess to audit logs could be forced to demonstrate

ownership of the corresponding secrecy IFC tags (forexample through cryptographic means as in sect51) Theauditor may be granted access to a log entry only ifS(origin) cup S(destination) sube S(auditor)7 EXAMPLE SUPPORT FOR WEB SERVICES

One of the most common uses of PaaS is to host web ap-plications In this section we present the implementationof such a solution built on the infrastructure described insect4 in order to evaluate and demonstrate the feasibility ofour proposed approach This is illustrated in Fig 8 Werun standard and unmodified Ruby web applications

Interaction with end-users is achieved through aldquogatewayrdquo between the IFC and non-IFC worlds Simi-larly interaction with cloud services (such as data stores)is also achieved through our messaging middleware asdiscussed in sect5 The requirement for this gateway canbe removed if a trustworthy IFC implementation can beprovided at the client side consistent with the cloud im-plementation with respect to tag naming enforcementetc Tag naming in general system-wide is an issuebeyond the scope of this paper see further sect8 In ourproof of concept implementation the gateway is a simpleApache server running a custom-built module

The role of the gateway is to authenticate the end-userwhen a session is created and to associate this sessionwith an application instance running within the securitycontext corresponding to the user Recall that a securitycontext comprises the S and I labels Any further re-quests to the gateway in that session are routed to thecorresponding application instance Once an instance nolonger has an associated session it can be recycled usingself-checkpointing as described in sect411

Several application types are running over our cloud-based web services platform For example in a medicalcontext these might be medical record editing pharmacyordering social services etc A single shared identityservice for the end-user is part of the cloud provideroffering (in our proof of concept implementation weused OAuth [59])

The GP authenticates is authorised as treating doc-tor for Alice and selects the lsquoidentityrsquo that correspondsto Alice A new session is created server-side by thegateway with the requested application instance run-ning in the corresponding security context with S =[medical alice] I = [empty] When the GP wants to accessapplications on behalf of a new patient he needs to closeAlicersquos session authorise as treating doctor for Bob andopen a new session for Bob

The control described above is not achieved by theapplication but by the platform itself and can be con-trolled by the end-user subject to access control Thatis a medical application used on Alicersquos behalf runs ina security context in which data cannot flow to that ofanother patient Furthermore applications running onbehalf of a given user can share the data of that userwithout the risk of seeing a buggy application leakingdata between end-users

12

App Instance App InstanceApp Instance

CamFlow-MW

IFC Constrained IFC Constrained IFC Constrained

ContextA

ContextB

Public

Trusted IFC Gateway

Enforcing IFCData-store

Data-store Application

CamFlow-MW CamFlow-MWCamFlow-MW

IFC Cloud Platform

Dr Yellow Dr PinkDr MauveRemote Clients

Fig 8 PaaS Architecture on top of IFC-OS

As described in sect4 we assume the middleware andthe OS enforcement are provided as a service by theunderlying platform A tenant wanting to use the third-party web-service offering once his trust in the under-lying platform is established needs only to audit thegateway again the underlying infrastructure providercould either provide such a gateway or audit it Therest of the software stack of the third-party web-serviceprovider is bound by the IFC enforcement mechanismand therefore need not be trusted

8 CONCLUSION amp FUTURE WORK

IFC allows data flows to be controlled continuouslythroughout a system by providing an information-centric MAC scheme that continuously ensures non-interference between security contexts This paper pre-sented the CamFlow platform that demonstrates thepotential of cloud-deployed IFC as supporting (1) pro-tection of applications from each other (2) flexible datasharing across isolation boundaries (3) prevention ofdata leakage due to bugsmisconfigurations (4) exten-sion of access control beyond application boundaries (5)data flow transparency

Specifically we detailed a new kernel implementationof IFC as an LSM demonstrating low overhead evenfor worst-case scenarios where processes continuouslymake readwrite system calls We also described theintegration of a messaging middleware to enforcing IFCacross machines This combination makes it possible toprovide whole-system IFC to PaaS cloud services andtherefore also SaaS Our approach and implementationwere designed so that applications can run unchangedover IFC thus making cloud adoption feasible

We also indicated how the data-centric logs based onIFC enforcement could provide the means to audit anIFC-enabled system whereby a log can be processed asa directed graph to investigate leaks and attacks andshow compliance with data management requirementsThough this represents our initial work in the area thereappears much promise

In light of the above we believe that IFC has greatpotential as a security mechanism for the cloud wherebytrust in a few major cloud providers deploying IFCcan be built on to provide a demonstrably trustworthycomputing environment

CamFlow was developed with cloud deployment inmind Our future work will investigate the challenges ofa broader distributed context

It is already feasible to extend CamFlow to sup-port mobile environments Android supports the fullSELinux enforcement5 and an Android-LSM integrationhas been demonstrated [60] But when dealing withmultiple cloud services particularly as they become partof a wider distributed architecture such as in the IoTa trustworthy system-wide deployment of IFC can nolonger be assumed Much work remains on establishingtrust in (and the trustworthiness of) the IFC enforcementmechanism within end-usersrsquo devices Outside a cloudcontext all partiesrsquo trust in a common third partyrsquos en-forcement of IFC constraints cannot be assumed (unlikethe cloud provider for cloud services) We intend toexplore leveraging hardware roots of trust and remoteattestation to ensure the integrity and trustworthiness ofIFC enforcement mechanisms

Another area of investigation concerns the represen-tation of tags across administrative domains In a cloudcontext a federated approach can be envisaged wherea common understanding of tags could be negotiatedacross multiple domains For example in [54] we dis-cussed initial thoughts on managing data according tospecific obligation regimes However as the numberof administrative domains increases both a global tagnaming scheme and mechanisms for ad-hoc negotiationbecome necessary

Related is the sensitivity of the tags themselvesKnowledge of the meaning of a tag can indicate thatthe associated entity contains or deals with certain infor-mation If a relationship can be established between anentity and the information owner this may disclose pri-vate information about the information owner This maylead to work on a need-to-know negotiation mechanismto establish a secure channel between hosts especiallyin a wide-scale distributed system A promising mech-anism is private set intersection to determine tagsrsquo subsetrelationship (sect341)

Another challenge concerns extending audit to dis-tributed architectures both in terms of resource man-agement and regulating access to log data

5 httpssourceandroidcomdevicestechsecurityselinux

13

ACKNOWLEDGMENTS

This work was supported by UK Engineering andPhysical Sciences Research Council grant EPK011510CloudSafetyNet End-to-End Application Security inthe Cloud We acknowledge the support of Microsoftthrough the Microsoft Cloud Computing Research Cen-tre

REFERENCES

[1] P Barham B Dragovic K Fraser S Hand T Harris A HoR Neugebauer I Pratt and A Warfield ldquoXen and the art ofvirtualizationrdquo SIGOPS Operating Systems Review vol 37 no 5pp 164ndash177 2003

[2] A Kivity Y Kamay D Laor U Lublin and A Liguori ldquokvmthe Linux Virtual Machine Monitorrdquo in Proceedings of the LinuxSymposium vol 1 2007 pp 225ndash230

[3] D Bernstein ldquoContainers and Cloud From LXC to Docker toKubernetesrdquo IEEE Cloud Computing Magazine no 3 pp 81ndash842014

[4] P Desnoyers O Krieger B Holden and J Hennessey ldquoUsingOpenStack for an Open Cloud eXchange (OCX)rdquo in InternationalConference on Cloud Engineering (IC2E) IEEE 2015

[5] J Mineraud O Mazhelis X Su and S Tarkoma ldquoA Gap Analysisof Internet-of-Things Platformsrdquo 2015 Arxiv arXiv150201181[Online] Available httparxivorgabs150201181

[6] C J Millard Ed Cloud Computing Law Oxford University Press2013

[7] S Pearson and M Casassa-Mont ldquoSticky Policies An Approachfor Managing Privacy across Multiple Partiesrdquo Computer vol 44July 2011

[8] D E Denning ldquoA lattice model of secure information flowrdquoCommunication of the ACM vol 19 no 5 pp 236ndash243 1976

[9] D E Bell and L J LaPadula ldquoSecure Computer Systems Mathe-matical Foundations and Modelrdquo The MITRE Corp Bedford MATech Rep M74-244 1973

[10] K J Biba ldquoIntegrity Considerations for Secure Computer Sys-temsrdquo MITRE Corp Tech Rep ESD-TR 76-372 1977

[11] A C Myers and B Liskov ldquoA Decentralized Model for Infor-mation Flow Controlrdquo in 17th Symposium on Operating SystemsPrinciples (SOSP) ACM 1997 pp 129ndash142

[12] A Bouard B Weyl and C Eckert ldquoPractical information-flowaware middleware for in-car communicationrdquo in Workshop onSecurity privacy amp dependability for cyber vehicles ACM 2013pp 3ndash8

[13] K Singh S Bhola and W Lee ldquoxBook Redesigning PrivacyControl in Social Networking Platformsrdquo in Security SymposiumUSENIX 2009 pp 249ndash266

[14] M Krohn A Yip M Brodsky N Cliffer M F KaashoekE Kohler and R Morris ldquoInformation Flow Control for StandardOS Abstractionsrdquo in Symposium on Operating Systems PrinciplesACM 2007 pp 321ndash334

[15] S Lee E L Wong D Goel M Dahlin and V ShmatikovldquoπBox A Platform for Privacy-Preserving Appsrdquo in Symposiumon Networked System Design and Implementation USENIX 2013pp 501ndash514

[16] N Kumar and R Shyamasundar ldquoRealizing Purpose-Based Pri-vacy Policies Succinctly via Information-Flow Labelsrdquo in Big Dataand Cloud Computing (BDCloudrsquo14) IEEE 2014 pp 753ndash760

[17] W Enck P Gilbert B-G Chun L P Cox J Jung P McDaniel andA N Sheth ldquoTaintDroid An information-flow tracking systemfor realtime privacy monitoring on smartphonesrdquo in Conference onOperating systems design and implementation (OSDIrsquo10) USENIX2010 pp 1ndash6

[18] I Papagiannis M Migliavacca and P Pietzuch ldquoPHP AspisUsing partial taint tracking to protect against injection attacksrdquoin 2nd USENIX Conference on Web Application Development 2011p 13

[19] E Schwartz T Avgerinos and D Brumley ldquoAll you ever wantedto know about dynamic taint analysis and forward symbolicexecution (but might have been afraid to ask)rdquo in Symposium onSecurity and Privacy IEEE 2010

[20] M Casassa-Mont S Pearson and P Bramhall ldquoTowards account-able management of identity and privacy Sticky policies andenforceable tracing servicesrdquo in Database and Expert Systems Ap-plications 2003 Proceedings 14th International Workshop on IEEE2003 pp 377ndash382

[21] S Bandhakavi C C Zhang and M Winslett ldquoSuper-sticky anddeclassifiable release policies for flexible information dissemina-tion controlrdquo in Proceedings of the 5th ACM workshop on Privacy inelectronic society ACM 2006 pp 51ndash58

[22] D W Chadwick and S F Lievens ldquoEnforcing sticky securitypolicies throughout a distributed applicationrdquo in Proceedings ofthe 2008 workshop on Middleware security ACM 2008 pp 1ndash6

[23] M Casassa-Mont I Matteucci M Petrocchi and M L SbodioldquoTowards Safer Information Sharing in the Cloudrdquo InternationalJournal of Information Security pp 1ndash16 2014

[24] S Akoush L Carata R Sohan and A Hopper ldquoMrLazy LazyRuntime Label Propagation for MapReducerdquo in 6th Workshop onHot Topics in Cloud Computing (HotCloud) USENIX 2014

[25] D Schultz and B Liskov ldquoIFDB Decentralized Information FlowControl for Databasesrdquo in European Conference on Computer Sys-tems (Eurosysrsquo13) ACM 2013 pp 43ndash56

[26] D Von Oheimb ldquoInformation Flow Control Revisited Nonin-fluence = Noninterference + Nonleakagerdquo in Computer SecurityndashESORICS 2004 Springer 2004 pp 225ndash243

[27] D Hedin and A Sabelfeld ldquoA perspective on Information-FlowControlrdquo NATO 2012

[28] D E Porter M D Bond I Roy K S McKinley and E WitchelldquoPractical Fine-Grained Information Flow Control Using Lam-inarrdquo ACM Transactions on Programming Languages and Systems(TOPLAS) vol 37 no 1 p 4 2014

[29] C Wright C Cowan S Smalley J Morris and G Kroah-Hartman ldquoLinux Security Modules General security support forthe Linux kernelrdquo in Foundations of Intrusion Tolerant SystemsIEEE 2003 pp 213ndash213

[30] M Quaritsch and T Winkler ldquoLinux Security Modules Enhance-ments Module Stacking Framework and TCP State TransitionHooks for State-Driven NIDSrdquo Secure Information and Communi-cation vol 7 pp 7ndash13 2004

[31] C Schaufler ldquoLSM Generalize existing module stackingrdquo LinuxWeekly News 2014

[32] S Smalley C Vance and W Salamon ldquoImplementing SELinux asa Linux Security Modulerdquo NAI Labs Report vol 1 p 43 2001

[33] M Bauer ldquoParanoid Penguin an Introduction to Novell AppAr-morrdquo Linux Journal vol 2006 no 148 p 13 2006

[34] A Edwards T Jaeger and X Zhang ldquoRuntime verification ofauthorization hook placement for the Linux Security Modulesframeworkrdquo in Conference on Computer and Communications Secu-rity ACM 2002 pp 225ndash234

[35] T Jaeger A Edwards and X Zhang ldquoConsistency analysis ofauthorization hook placement in the Linux security modulesframeworkrdquo ACM Transactions on Information and System Security(TISSEC) vol 7 no 2 pp 175ndash205 2004

[36] V Ganapathy T Jaeger and S Jha ldquoAutomatic placement ofauthorization hooks in the Linux security modules frameworkrdquoin Conference on Computer and Communications Security ACM2005 pp 330ndash339

[37] I P Egwutuoha D Levy B Selic and S Chen ldquoA survey of faulttolerance mechanisms and checkpointrestart implementationsfor high performance computing systemsrdquo Springer 2013vol 65 no 3 pp 1302ndash1326

[38] O Laadan and S E Hallyn ldquoLinux-CR Transparent applicationcheckpoint-restart in Linuxrdquo in Linux Symposium 2010 p 159

[39] B Niu and G Tan ldquoEfficient User-space Information Flow Con-trolrdquo in Symposium on Information Computer and CommunicationsSecurity (SIGSACrsquo13) ACM 2013 pp 131ndash142

[40] T Bird ldquoMeasuring Function Duration with ftracerdquo in Japan LinuxSymposium 2009 pp 47ndash54

[41] B Parno ldquoBootstrapping Trust in ardquo Trustedrdquo Platformrdquo in Con-ference on Hot Topics in Security (HotSecrsquo08) USENIX 2008

[42] N Santos K P Gummadi and R Rodrigues ldquoTowards TrustedCloud Computingrdquo in Conference on Hot Topics in Cloud Comput-ing USENIX 2009 pp 3ndash3

[43] S Berger R Caceres K A Goldman R Perez R Sailer andL van Doorn ldquovTPM Virtualizing the Trusted Platform Modulerdquoin Security Symposium USENIX 2006 pp 305ndash320

14

[44] S Berger K Goldman D Pendarakis D Safford E Valdezand M Zohar ldquoScalable Attestation A Step Toward Secure andTrusted Cloudsrdquo in International Conference on Cloud Engineering(IC2E) IEEE 2015

[45] J Singh D Eyers and J Bacon ldquoPolicy Enforcement withinEmerging Distributed Event-Based Systemsrdquo in ACM DistributedEvent-Based Systems (DEBSrsquo14) 2014

[46] L Sfaxi T Abdellatif R Robbana and Y Lakhnech ldquoInformationFlow Control of Component-based Distributed Systemsrdquo Concur-rency and Computation Practice and Experience vol 25 no 2 pp161ndash179 2013

[47] S Yoshihama T Yoshizawa Y Watanabe M Kudoh and K Oy-anagi ldquoDynamic Information Flow Control Architecture for WebApplicationsrdquo in ESORICS 2007 J Biskup and J Lopez EdsSpringer 2007 vol LNCS 4734 pp 267ndash282

[48] W Cheng D R K Ports D Schultz V Popic A BlanksteinJ Cowling D Curtis L Shrira and B Liskov ldquoAbstractions forUsable Information Flow Control in Aeolusrdquo in USENIX AnnualTechnical Conference Boston 2012

[49] J Singh T Pasquier J Bacon and D Eyers ldquoIntegrating Middle-ware and Information Flow Controlrdquo in International Conferenceon Cloud Engineering (IC2E) IEEE 2015 pp 54ndash59

[50] J Singh T F J-M Pasquier and J Bacon ldquoSecuring Tags toControl Information Flows within the Internet of Thingsrdquo inInternational Conference on Recent Advances in Internet of Things(RIoTrsquo15) IEEE 2015

[51] J S Park and R Sandhu ldquoBinding Identities and Attributes UsingDigitally Signed Certificatesrdquo in Annual Conference on ComputerSecurity Applications IEEE 2000 pp 120ndash127

[52] A Ganjali and D Lie ldquoAuditing cloud management using infor-mation flow trackingrdquo in Workshop on Scalable Trusted ComputingACM 2012 pp 79ndash84

[53] R K Ko M Kirchberg and B S Lee ldquoFrom System-centric toData-centric Logging-accountability Trust amp Security in CloudComputingrdquo in Defense Science Research Conference and Expo (DSR)2011 IEEE 2011 pp 1ndash4

[54] J Singh J Powles T Pasquier and J Bacon ldquoData Flow Man-agement and Compliance in Cloud Computingrdquo IEEE CloudComputing Magazine SI on Legal Clouds 2015

[55] T Zanussi K Yaghmour R Wisniewski R Moore and M Dage-nais ldquorelayfs An efficient unified approach for transmitting datafrom kernel to user spacerdquo in Linux Symposium 2003 p 494

[56] M E Smoot K Ono J Ruscheinski P-L Wang and T IdekerldquoCytoscape 28 New features for data integration and networkvisualizationrdquo Oxford University Press 2011 vol 27 no 3 pp431ndash432

[57] T Pasquier and J Powles ldquoExpressing and Enforcing LocationRequirements in the Cloud using Information Flow Controlrdquo inIC2E International Workshop on Legal and Technical Issues in CloudComputing (CLawrsquo15) IEEE 2015

[58] R Angles and C Gutierrez ldquoSurvey of Graph Database ModelsrdquoACM Computing Surveys (CSUR) vol 40 no 1 p 1 2008

[59] D Hardt ldquoThe OAuth 20 Authorization Frameworkrdquo IETF TechRep RFC 6749 2012

[60] S Smalley and R Craig ldquoSecurity Enhanced (SE) Android Bring-ing Flexible MAC to Androidrdquo in Network and Distributed SystemSecurity Symposium Internet Society 2013

  • 1 Introduction and Motivation
  • 2 Background
    • 21 IFC Models
    • 22 Protection via VMs and Containers
    • 23 Taint Tracking (TT) Systems
    • 24 Sticky Policies
      • 3 CamFlow-Model IFC for the Cloud
        • 31 Tags and Labels
        • 32 Decentralised Privileges and Security Contexts
        • 33 Creating a New Entity
        • 34 Security
          • 341 Information Exchange
          • 342 Label Change
          • 343 Privilege delegation
            • 35 Conflict of Interest
              • 4 OS Enforcement
                • 41 CamFlow-LSM
                  • 411 Checkpointing and Restoration
                  • 412 OS Evaluation
                    • 42 Trusted Processes
                    • 43 Leveraging Hardware Roots of Trust
                      • 5 Cross-Machine Enforcement
                        • 51 Remote Interactions
                        • 52 Message-Level Enforcement
                        • 53 Integrating With Persistent Storage
                        • 54 Evaluation
                          • 6 Audit Data-Centric Logs
                            • 61 Analysing Paths to Disclosure
                            • 62 Demonstrating Compliance
                            • 63 Audit as `Big Data
                            • 64 Audit Access
                              • 7 Example Support for Web Services
                              • 8 Conclusion amp Future Work
                              • References
Page 6: CamFlow: Managed data-sharing for cloud services · CamFlow: Managed data-sharing for cloud services Thomas F. J.-M. Pasquier, Member, IEEE, Jatinder Singh, Member, IEEE, David Eyers,

6

data may leave the private protected environment Weassume a health service approved anonymisation proce-dure Fig 2 shows the anonymiser inputting data taggedas personal and declassifying the data by outputting datawith secrecy tag researchExample ndash endorsement In the same example theResearch Database is on a public cloud and may onlyreceive research data tagged with consent anon in itsintegrity label In the private cloud we see a pro-cess that selects appropriate records for specific re-search purposes checks for patient consent and addsthe tag consent to the integrity label of its output Theanonymiser process can only input data with this tag itanonymises the data and outputs data with the tag anonin its integrity label

Some previous work [14] [28] allows implicit declas-sification and endorsement That is if an active entity hasthe privilege to declassifyendorse and the privilegeto return to its original state (ie for declassificationendorsement over t the entity has privilege tminus and t+)the declassificationendorsement may occur implicitlywithout the need for the entity to make the label changesWe believe that this could in practice lead to unintentionaldata disclosure Suppose an entity has the privilege todeclassify top-secret information The requirement forexplicit label change makes it unlikely that the entity willsend such data accidentally to an unintended recipientOur model has stronger constraints that require endorse-ment and declassification operations to be programmedexplicitly

343 Privilege delegation

An entity is only able to delegate a privilege it owns

Definition 4 A privilege delegation is safe if and only ift isin PplusmnX (A)

35 Conflict of Interest

In CamFlow alone among IFC systems privilege dele-gation is further restricted by Conflict of Interest (CoI)(or Separation of Duty (SoD)) enforcement The receivingentity A must not be put in a situation where it wouldbreak a CoI constraint By this means an applicationmanager is prevented from creating an application in-stance with access to conflicting data

Definition 5 An entity A does not violate a CoI C if andonly if∣∣∣(S(A) cup I(A) cup P+

S (A) cup P+I (A) cup Pminus

S (A) cup PminusI (A)

)cap C

∣∣∣ le 1

Example ndash conflict of interest A CoI might arise whendata relating to competing companies is available in asystem In a hospital context this might involve theresults of analysis of the usage and effects of drugs fromcompeting pharmaceutical companies The companiesmight agree to this analysis only if their data is guar-anteed to be isolated ie not leaked to other companies

The hospital may be participating in drug trials andwant to ensure that information does not leak between

Camflow-LSM

IFC Library

User Space

Kernel Space

Application Process

Kernel

DIFC Management APISystem Calls

Hardware

IFC LibraryTrusted Process

Authorise and Audit

Fig 3 The interactions of the IFC Security Module (LSM)and a Trusted Process within an OStrials suppose a conflict is C = PfizerGSKRoche and some data (eg files) are labelled PfizerData[S =Pfizer I = empty] and RocheData[S = Roche I = empty] TheCoI described ensures that it is not possible for a singleentity (eg an application instance) to have access toboth RocheData and PfizerData either simultaneously orsequentially ie enforcing that Roche-owned data andPfizer-owned data are processed in isolation

The next sections describe the CamFlow platform thatenforces the IFC constraints described

4 OS ENFORCEMENT

At the heart of the architecture is a minimal kernel mod-ule dedicated solely to OS-level IFC enforcement Themodule is trusted to enforce IFC transparently acrossall flows between entities within the OS User spaceprocesses can directly interact with the kernel moduleeg to delegate privileges (sect34) through a pseudo-filesystem abstracted through a high level API Higher levelconsiderations and policies can be managed throughspecifically defined Trusted Processes (see sect42) The localmachine architecture is represented in Fig 3

Note that IFC operates alongside and complementsother security technologies It is not a cloud securitypanacea challenges regarding covert and side channelsand direct access to hardware by an attacker remain asthey do for systems in general There are approachesthat can help address these security threats but manyare highly disruptive (eg synchronisation approachesto reducing timing channels) and are infrequently usedOther threats may be easier to mitigate and solutionsmay be used when appropriate (eg on-disk encryption)

41 CamFlow-LSM

Our kernel module CamFlow-LSM is implemented as aLinux Security Module (LSM) [29] Although our work isLinux-specific a similar approach could be used on anysystem providing LSM-like security hooks Unlike otherDIFC OS implementations [14] [28] our kernel patch isself-contained strictly limited to the security moduledoes not modify any existing system calls and followsLSM implementation best practice This allows amongother things LSM stacking [30] [31] and coexistencewith other security modules such as eg SELinux [32]or AppArmor [33] and complements their MAC enforce-ment with decentralised information flow policies

We assume that the rest of the kernel can be trustedand does not interfere with the IFC enforcement mech-anism LSM system hooks have been statically anddynamically verified [34]ndash[36] and our implementation

7

inherits from LSM the formal assurance of IFCrsquos correctplacement on the path to any controlled kernel objectThis is sufficient to guarantee that we control flow andrecord audit on any operation on a controlled kernelobject

Since applications running on SELinux [32] or Ap-pArmor [33] need not be aware of the MAC policybeing enforced we see no reason to force applicationsrunning on an IFC system to be aware of IFC only thoseperforming declassification or endorsement operationsare necessarily aware This implementation choice isimportant cloud providers can incorporate IFC withoutrequiring changes in the software deployed by ten-ants Alternatively applications that wish to managetheir own IFC constraints can declare policy through apseudo-filesystem (as is typical for LSMs) abstracted bya user space library and enforced transparently by theIFC mechanism

The LSM framework calls security hooks when accessto a kernel object is attempted Security metadata can beassociated with kernel objects and is used by the LSMmodule to make access decisions Tags and privilegesare represented by 64-bit opaque nonces associated withkernel objects such as processes inodes files sharedmemory objects messages etc On interaction betweenkernel objects CamFlow-LSM security hooks are calledto enforce data-flow policy (sect341) or propagate tags onentity creation (sect33) as appropriate

Only active entities (processes) have mutable labelsand privileges all other (passive) entities have im-mutable labels and no privileges

Privileges are allocated by the kernel and owned bythe creating process (any process can create tags and theassociated privileges in a decentralised fashion) Privi-leges can be passed to other processes users or groupsCamFlow-LSM verifying that constraints on privilegedelegation (sect343) and conflict of interest (sect35) are notviolated A process can add or remove a tag from itslabel if it owns the appropriate privilege (following IFCconstraints described in sect342) if the current user ownsthe privilege or if the current group owns the privilegeHow tags are shared and managed must be consideredwith care when designing an application and the systemmust be administered accordingly

411 Checkpointing and Restoration

Checkpointing a process involves halting its executionallowing it to be restarted at a later stage and enablingmigration eg [37] LSM state is normally saved andrestored by the checkpointing system eg [38] andour module further exports an API to more efficientlyserialise and restore security context

Furthermore self-checkpointing and restoring the pre-vious state of a process has been demonstrated [39] to bea beneficial feature for IFC systems This is particularlyuseful for processes serving requests In such a scenariothe state of the process is saved after initialisation Whena request is received the serving process sets itself up

0 10 20 30 40 50 60 70 80 90

sys pipe

sys write

sys read

sys clone

microsdyn label IFC LSM Native

Fig 4 Overhead introduced into the OS by CamFlowLSM (x-axis time in micros)in the security context appropriate to serve the requestAfter the request is served (or a series of requests if thesystem is session-based as described in sect7) the processrestores its memory state and security context to whatthey were immediately after initialisation This improvesperformance and prevents data leaks between securitycontexts

412 OS Evaluation

We tested the CamFlow-LSM module on Linux Kernelversion 3178 (012015) from the Fedora distribution3

The tests are run on an Intel 22Ghz i7 CPU and 6GiBRAM machine

Measurements are done using the Linux tool ftrace [40]to provide a microbenchmark Two processes read fromand write to a pipe respectively Each has 20 tags in itssecurity label substantially more than we have seen aneed for in current use cases We measure the overheadinduced by creating a new process (sys clone) creatinga new pipe (sys pipe) writing to the pipe (sys write) andreading from the pipe (sys read) The results are givenin Fig 4

We can distinguish two types of induced overheadverifying an IFC constraint (sys read sys write) and allo-cating labels (sys clone sys pipe) The sys clone overheadis roughly twice that of sys pipe as memory is allocateddynamically for the active entityrsquos labels and privilegesRecall that passive entities have no privileges Overheadmeasurements for other system callsdata structures areessentially identical as they rely on the same underlyingenforcement mechanism and are not included

The CamFlow-LSM overhead is a few percent seeFig 4 We provide a build option that further improvesperformance by declaring labels and privileges with afixed maximum size (by default label size can increasedynamically to meet application requirements) This re-duces the overhead of the system calls that create newentities (the dynamic label component in Fig 4) This isan acceptable trade-off as in practical scenarios labelsrarely exceed more than five tags However for mostapplications the overhead is imperceptible and lost insystem noise it is hard to measure without using kernel

3 It is not feasible to provide a comparison with the Laminarimplementation [28] that is closest in technical terms to our workas the implementation available httpsgithubcomut-osalaminaris for an obsolete kernel version 2622 (072007)

8

tools as the variation between two executions may begreater than the overhead

42 Trusted Processes

The CamFlow-LSM is trusted to enforce IFC at the kernellevel Its functionality is minimal strictly confined tothe enforcement of IFC policies as described in sect3 Thisguarantees easier maintainability and a system that isagnostic to higher level application requirements thusminimising the constraints imposed on user-space ap-plication design

We introduce the concept of a trusted process that al-lows applicationplatform-specific concerns to be man-aged in user space by bypassing some LSM-enforced IFCconstraints For example a trusted process might serveas a proxy for external connections as in the Trusted IFCGateway in the example in sect7 setting up and managingapplication componentsrsquo labels Trusted processes areused to interact with persistent storage (see sect53) forcheckpointing and restoring processes (see sect411) andfor managing inter-process and external communication(see sect5)

Fig 5 shows OS instances running the CamFlow-LSM hosting a number of application processes thatmay be grouped in containers Each OS instance hasa single trusted process (Security Context Manager) tomanage its hosted processesrsquo IFC labels and privileges Inaddition each process has an associated trusted middle-ware process to handle inter-process and inter-machinecommunication Such communication may be within orbetween containers OSs or clouds

In this example S represents a particular set of secrecytags and I a particular set of integrity tags both ofwhich remain the same throughout The application pro-cesses and other OS objects such as pipes and files arelabelled [S I] The process labelled [empty I] writes lsquopublicrsquodata to a pipe which is read by a process labelled [S I]assuming all the I tags match correctly Similarly twoprocesses are shown writing to and reading from a file

The Security Context Manager maps between thekernel-level representation of tags (as 64-bit integers)and the representation of tags in user space Within acloud or other trusted environment tags may be simplestrings When tags need to cross domain boundarieseg when cloud services form part of a wider archi-tecture as in IoT tags may need to be protected bycryptographic means (see sect51)

Trusted processes are either set up through staticconfiguration read at boot time by the CamFlow-LSMmodule or created at runtime by another trusted pro-cess Trusted processes must either be managed by atrusted party (in our current approach the underlying in-frastructure provider) andor the code must be auditableand a means to verify the current version running on theplatform must be provided (see sect43)

43 Leveraging Hardware Roots of Trust

Incorporating IFC into cloud-provider OSs would en-hance the trustworthiness of the platform HoweverIFC only guarantees protection above the technical layerin which it is enforced Recent hardware and softwaredevelopments make it possible to attest that the softwarelayers on which our platform runs have been audited

The Trusted Platform Module (TPM) [41] as used forremote attestation [42] is one such hardware mecha-nism TPM is used to generate a nearly unforgeablehash representing the state of the hardware and soft-ware of a given platform that can be remotely verifiedTherefore a company could audit the implementationof our IFC enforcement mechanism and ensure that ourkernel security module messaging middleware and theconfiguration they provide are indeed running on theplatform Any difference between the expected state ofthe software stack and the platform could be considereda breach of trust such considerations can easily beembedded in the contractual obligations of the cloudprovider

TPM and remote attestation for cloud computing [43]are reaching maturity with IBM rolling out an opensource scalable trusted platform based on virtual TPMs[44] Indeed Berger et al [44] describe a mechanismallowing the TPM and remote attestation to be providedfor virtual machine offerings and container-based solu-tions covering the whole range of contemporary cloudofferings Furthermore the approach not only allows thestate of the software stack to be verified at boot time butalso during execution and can thus prevent run-timemodification of the system configuration

5 CROSS-MACHINE ENFORCEMENT

CamFlow-LSM operates to protect flows within the OSHowever it is also important that flows are protectedacross OS instances

Generally in order to guarantee flow constraints onlyprocesses P such that S(P ) = empty and I(P ) = empty ie notsubject to IFC constraints are allowed to directly connectto or receive messages from connections on remote OS(eg through a socket) In order to connect to anothermachine a process must either 1) be able to declassifyto change its security context to S(P ) = empty and I(P ) = empty2) communicate through an intermediate trusted process

As such CamFlow contains an IFC-enabled fully-featured messaging middleware (CamFlow-MW) to bothfacilitate communication and guarantee enforcementacross machines For want of space we only considerthe middleware concepts relevant to IFC details on thegeneral middleware (as it was prior to IFCCamFlowintegration) can be found in [45] In short the middle-ware supports strongly-typed messages a range of in-teraction paradigms including request-reply broadcastand streams flexible resource discovery and securitymechanisms including access controls and encryptedcommunication A particular feature is its support for

9

CamFlow-MW CamFlow-MW CamFlow-MW CamFlow-MW CamFlow-MW

Process

CamFlow-LSM

Enforcement Point Pipe [empty I]

Process Process

SecurityContextManager

SecurityContextManager

Container

Container

Container

Process[S I]

CamFlow-LSM

[S I]

File[S I]

[S empty][S empty]Process[empty I]

Fig 5 CamFlow Architecture Labelled OS objects trusted processes and communication middleware

dynamic reconfiguration based on event-driven policyThis simplifies both application development and de-ployment as concerns can be abstracted and tailoredto the particular environment rather than embeddedwithin application code

The role of the middleware is to move towards contin-uous end-to-end data flow management such that IFCcan be enforced across applicationsmachines (kernels)There is work on IFC enforcement across machineshowever these impose specific requirements such asdesign-time considerations [46] a particular languageruntime [47] or constraints on system architectureimplementation [48] In contrast we integrate IFC func-tionality into the general fully featured distributed sys-tems middleware mentioned above (see [49]) to provideflexibility and be more generally applicable We delib-erately avoid imposing a structure on system designinstead integrating IFC functionality into the sort of com-munications infrastructure common to current enterpriseand cloud systems

51 Remote Interactions

CamFlow-MW operates by associating a trusted process(see sect42) with an entity that seeks to communicate viathe messaging system Its fit within the broader architec-ture is depicted in Fig 5 The process is responsible forhandling the communication of messages and enforcesIFC based on the current runtime labels of the entity onwhose behalf it operates

It follows that for IFC to be enforced across machinestags require system-wide management ie throughoutthe cloud service In [50] we proposed that the widelyused and available X509 certificates could be used Theapproach relies on public key certificates and attributecertificates [51] to respectively identify the applicationassociated with the CamFlow-MW instance and the tagsassociated with this application

As part of establishing a connection CamFlow-MWensures that each entity authorises communication withthe other according to a local access control policy De-cisions are based on component metadata the relevantauthentication aspects secured through PKI (certificates)Similarly IFC policy must also be verified to ensure dataflows are authorised according to IFC policy Attributecertificates provide cryptographic means to determineand verify the tags associated with the remote entity

(see [50]) on which policy can be enforced If tags donot accord the connection will not be established

We also see potential for remote attestation based onhardware integrity measures (see sect43) to be integratedinto this authorisation phase to ensure the remote ma-chine operates a reliable IFC enforcement regime

52 Message-Level Enforcement

CamFlow messages are strongly typed where a mes-sage type is defined by a schema describing its set ofattributes For an instance of a message an attributeconsists of a name type and value The support forIFC within messages is fine-grained in that individualattributes within messages can also be labelled Theseattribute labels introduce additional IFC constraints overand above those already applying to the entity ie asrecognised by the kernel-LSM and validated on connec-tion establishment

Labels can be defined within message type schemawhich sets the attributesrsquo IFC labels for all messageinstances of the type These labels cannot be changed byentities dealing in such messages and the entities musthold the requisite labels to interact with the attributesOtherwise the entity producingpublishing a messagecan set the security labels for the attributes (for those notpredefined) if the entity holds the associated privilegesEnforcement occurs as followsReceiving If the receiving entityrsquos labels do not agreewith those of an attribute value the attribute value (andany sub-attributes) are removed from (made null in) themessage This is enforced on message receipt before itis delivered to the entitySending An entity cannot send values for attributeswhere its labels do not agree with those of the attributeThis is enforced when an entity attempts to send amessage ensuring values for any attributes violating thispolicy are removed before message propagation

Enforcement is automatic meaning that applicationsusing the messaging system can be subject to IFC en-forcement completely transparently (ie without their di-rect involvement) though again there is the interface forthe application to actively manage IFC where requiredIn addition the general reconfiguration capabilities ofthe middleware enable connections between componentsto be defined and managed at runtime providing an-other mechanism for controlling communication [45]

10

0 100 200 300 400 500 600 700

Time in msIFC-overhead

Fig 6 IFC overhead of CamFlow-MW for a workloadtransmission of 5000 messages (x-axis in ms)53 Integrating With Persistent Storage

One technique to provide IFC with persistent data storesis to store the tags alongside the data and a trustedsoftware component ensures that when information isread from the store the corresponding labels are appliedIn Flume [14] a trusted process provides the interfacebetween untrusted applications and persistent storageMore recent work has seen the emergence of databasesthat natively understand IFC concepts and can enforceIFC policies [25]

We see much promise in having the middleware me-diate between persistence systems and the kernel toensure consistent IFC application

54 Evaluation

As shown in Fig 6 the results indicate that IFC en-forcement introduces an overhead of sim13 in perfor-mance time compared to the standard non IFC-enabledmiddleware (see [49] for details) Note that these resultswere measured in the context of a particular workloaddeliberately designed to highlight the impact of IFCenforcement It follows that the overheads associatedwith real-world usage are most likely less onerous

6 AUDIT DATA-CENTRIC LOGS

IFC in addition to providing strong assurances that pol-icy is being enforced can also provide a data-centric log[52] detailing the information flows within and betweensystem components In addition to enforcing IFC via ourLSM module we log the data flows of labelled processespolicy decisions privileges and IFC security contextmanipulations Equivalent inter-machine operations viathe middleware are also recorded

Cloud logging systems are generally based on legacylogging systems (OS web-server database etc) thateither fail to capture the needed information or areextremely complicated to interpret in a useful manner[53] More importantly such logs tend to be relevant onlyto the particular service or component which makes itdifficult if not impossible to audit across a range ofapplications clouds etc

IFC logs as provided by our platform allow us to cap-ture information on application-level data flows both at-tempted and permitted allowing the correct expressionand implementation of data flow policy to be checkedThis provides transparency allows for meaningful auditin terms of investigating the circumstances in which dataleakage occurs and provides evidence of complianceeg with legal obligations [54]

Information Exchange

CreateSecurity Context Change

e0

e1

e3e4

e7

e2 e5e6

P3[Sprimeprime I]

P2[Sprimeprime empty]

P3[S I]F1[S I]

P1[S I] P1[empty I]

Public[empty empty]F2[Sprime I prime]

Fig 7 Simplified audit graph from IFC OS execution (weomit metadata for readability) The path to disclosure isshown in bluepale61 Analysing Paths to Disclosure

To assist in interpreting log information we build a di-rected graph corresponding to the allowed flows duringthe execution of our system as shown in Fig 7 Theflows defined in our IFC model (see sect3) namely dataflow creation flow security context change and privi-lege delegation correspond to the edges of the directedgraph Entities (such as processes files messages etc)are represented as nodes in the graph

In addition to information necessary to build the graph(as shown in the figure) additional metadata is collectedfor forensic purposes which is contextentityevent-dependent These audit entries provided by the LSMand middleware can be exploited by a dedicated serviceimplemented in user space connected to the kernelcollection mechanism via relayfs [55] This service feedsfor example a graph visualisation tool such as Cytoscape[56] or a graph database such as Neo4J4

Such a directed graph helps one identify data leaksFor example a tenant might discover that some sensi-tive medical data leaked into a data store where onlyanonymised research data were supposed to be storedIFC is enforced in line with the policy encapsulated inlabels thus data may leak if such policy is improperly ex-pressed andor declassificationendorsement processesare not correctly implemented (eg if the anonymisationprocess in Fig 2 allows re-identification)

Suppose that an information leak is suspected be-tween different security contexts L1[S I] and L2[S

prime I prime]Determining whether such a leak can occur is equiv-alent to discovering whether there is a path in thegraph between the two contexts If the leak occurredthere must be a path between some entity Ei such thatS(Ei) = S and I(Ei) = I and another entity Fi such thatS(Fi) = Sprime and I(Fi) = I prime

The existence of such a path demonstrates that a leakis possible To investigate whether a leak occurred it isessential to consider the event ID associated with theedges comprising the path We denote by ei the last in-coming edge to the entity under investigation with labels[Sprime I prime] only edges such that e lt ei should be consideredWhen applied to all nodes along a path this rule ensuresstrictly monotonically increasing timestamps from thefirst node to the last Fig 7 shows in bluepale a possibledata disclosure path from file F2 from a very simple

4 httpneo4jcom

11

audit graph We know from the event IDs e0 and e1 thatthe data disclosure did not occur through file F1 andprocess P3 but through P1rsquos declassification

62 Demonstrating Compliance

Compliance with certain requirements can be demon-strated through queries over the graph We assume theaudit data is stored in a graph database that we canquery For example the following plain English policyldquoEuropean personal data sent to the US must be anonymisedrdquo[57] is equivalent to writing a query that verifies thatthere is no path between EU- and US-labelled datawithout an anonymisation processThe policy ldquoMedical data stored in database X must havereceived proper consent and be anonymisedrdquo [54] can beexpressed as a query verifying that there is no path be-tween data labelled as medical and the database withoutconsent-checking and anonymiser processes In additionan investigator may want to know which anonymisationalgorithm has been run which data has been used togenerate the anonymised records etc Our audit graphassists in answering such questions

Note that IFC only applies guarantees with respectto flows Demonstrating the overall effectiveness of themanagement regime eg the quality and suitability ofthe anonymisation algorithm is out-of-scope for a flow-based enforcement mechanism

63 Audit as lsquoBig Datarsquo

We are potentially generating a vast amount of data inour IFC logs However unlike standard system logs thatare complex to analyse our logs generate graphs thatare ideal for analysis by ldquobig datardquo tools that have beendeveloped for this purpose [58]

Since the amount of data is potentially huge theamount of data being logged can be fine-tuned to meetthe requirements of the platformtenant eg by reduc-ing the amount of metadata being stored by loggingonly security context changing operations by loggingonly information corresponding to some target securitycontext keeping operations on unlabelled entities out-side of the log etc The decision on what needs to belogged then becomes a tradeoff between utility and thevolume (cost) of log generated which can be decided inorder to correspond to legal or contractual requirements(for example a regulated sector may need to have afine-grained log to satisfy data forensic requirements)Indeed as such an approach is new to the cloud suchconsiderations will be refined by experience with bestpractices developing over time

64 Audit Access

Logs can contain sensitive information and access tothem should be controlled This represents an area ofour ongoing work Traditional access controls clearlyplay a role however secrecy tags could also be lever-aged For example an auditor before being grantedaccess to audit logs could be forced to demonstrate

ownership of the corresponding secrecy IFC tags (forexample through cryptographic means as in sect51) Theauditor may be granted access to a log entry only ifS(origin) cup S(destination) sube S(auditor)7 EXAMPLE SUPPORT FOR WEB SERVICES

One of the most common uses of PaaS is to host web ap-plications In this section we present the implementationof such a solution built on the infrastructure described insect4 in order to evaluate and demonstrate the feasibility ofour proposed approach This is illustrated in Fig 8 Werun standard and unmodified Ruby web applications

Interaction with end-users is achieved through aldquogatewayrdquo between the IFC and non-IFC worlds Simi-larly interaction with cloud services (such as data stores)is also achieved through our messaging middleware asdiscussed in sect5 The requirement for this gateway canbe removed if a trustworthy IFC implementation can beprovided at the client side consistent with the cloud im-plementation with respect to tag naming enforcementetc Tag naming in general system-wide is an issuebeyond the scope of this paper see further sect8 In ourproof of concept implementation the gateway is a simpleApache server running a custom-built module

The role of the gateway is to authenticate the end-userwhen a session is created and to associate this sessionwith an application instance running within the securitycontext corresponding to the user Recall that a securitycontext comprises the S and I labels Any further re-quests to the gateway in that session are routed to thecorresponding application instance Once an instance nolonger has an associated session it can be recycled usingself-checkpointing as described in sect411

Several application types are running over our cloud-based web services platform For example in a medicalcontext these might be medical record editing pharmacyordering social services etc A single shared identityservice for the end-user is part of the cloud provideroffering (in our proof of concept implementation weused OAuth [59])

The GP authenticates is authorised as treating doc-tor for Alice and selects the lsquoidentityrsquo that correspondsto Alice A new session is created server-side by thegateway with the requested application instance run-ning in the corresponding security context with S =[medical alice] I = [empty] When the GP wants to accessapplications on behalf of a new patient he needs to closeAlicersquos session authorise as treating doctor for Bob andopen a new session for Bob

The control described above is not achieved by theapplication but by the platform itself and can be con-trolled by the end-user subject to access control Thatis a medical application used on Alicersquos behalf runs ina security context in which data cannot flow to that ofanother patient Furthermore applications running onbehalf of a given user can share the data of that userwithout the risk of seeing a buggy application leakingdata between end-users

12

App Instance App InstanceApp Instance

CamFlow-MW

IFC Constrained IFC Constrained IFC Constrained

ContextA

ContextB

Public

Trusted IFC Gateway

Enforcing IFCData-store

Data-store Application

CamFlow-MW CamFlow-MWCamFlow-MW

IFC Cloud Platform

Dr Yellow Dr PinkDr MauveRemote Clients

Fig 8 PaaS Architecture on top of IFC-OS

As described in sect4 we assume the middleware andthe OS enforcement are provided as a service by theunderlying platform A tenant wanting to use the third-party web-service offering once his trust in the under-lying platform is established needs only to audit thegateway again the underlying infrastructure providercould either provide such a gateway or audit it Therest of the software stack of the third-party web-serviceprovider is bound by the IFC enforcement mechanismand therefore need not be trusted

8 CONCLUSION amp FUTURE WORK

IFC allows data flows to be controlled continuouslythroughout a system by providing an information-centric MAC scheme that continuously ensures non-interference between security contexts This paper pre-sented the CamFlow platform that demonstrates thepotential of cloud-deployed IFC as supporting (1) pro-tection of applications from each other (2) flexible datasharing across isolation boundaries (3) prevention ofdata leakage due to bugsmisconfigurations (4) exten-sion of access control beyond application boundaries (5)data flow transparency

Specifically we detailed a new kernel implementationof IFC as an LSM demonstrating low overhead evenfor worst-case scenarios where processes continuouslymake readwrite system calls We also described theintegration of a messaging middleware to enforcing IFCacross machines This combination makes it possible toprovide whole-system IFC to PaaS cloud services andtherefore also SaaS Our approach and implementationwere designed so that applications can run unchangedover IFC thus making cloud adoption feasible

We also indicated how the data-centric logs based onIFC enforcement could provide the means to audit anIFC-enabled system whereby a log can be processed asa directed graph to investigate leaks and attacks andshow compliance with data management requirementsThough this represents our initial work in the area thereappears much promise

In light of the above we believe that IFC has greatpotential as a security mechanism for the cloud wherebytrust in a few major cloud providers deploying IFCcan be built on to provide a demonstrably trustworthycomputing environment

CamFlow was developed with cloud deployment inmind Our future work will investigate the challenges ofa broader distributed context

It is already feasible to extend CamFlow to sup-port mobile environments Android supports the fullSELinux enforcement5 and an Android-LSM integrationhas been demonstrated [60] But when dealing withmultiple cloud services particularly as they become partof a wider distributed architecture such as in the IoTa trustworthy system-wide deployment of IFC can nolonger be assumed Much work remains on establishingtrust in (and the trustworthiness of) the IFC enforcementmechanism within end-usersrsquo devices Outside a cloudcontext all partiesrsquo trust in a common third partyrsquos en-forcement of IFC constraints cannot be assumed (unlikethe cloud provider for cloud services) We intend toexplore leveraging hardware roots of trust and remoteattestation to ensure the integrity and trustworthiness ofIFC enforcement mechanisms

Another area of investigation concerns the represen-tation of tags across administrative domains In a cloudcontext a federated approach can be envisaged wherea common understanding of tags could be negotiatedacross multiple domains For example in [54] we dis-cussed initial thoughts on managing data according tospecific obligation regimes However as the numberof administrative domains increases both a global tagnaming scheme and mechanisms for ad-hoc negotiationbecome necessary

Related is the sensitivity of the tags themselvesKnowledge of the meaning of a tag can indicate thatthe associated entity contains or deals with certain infor-mation If a relationship can be established between anentity and the information owner this may disclose pri-vate information about the information owner This maylead to work on a need-to-know negotiation mechanismto establish a secure channel between hosts especiallyin a wide-scale distributed system A promising mech-anism is private set intersection to determine tagsrsquo subsetrelationship (sect341)

Another challenge concerns extending audit to dis-tributed architectures both in terms of resource man-agement and regulating access to log data

5 httpssourceandroidcomdevicestechsecurityselinux

13

ACKNOWLEDGMENTS

This work was supported by UK Engineering andPhysical Sciences Research Council grant EPK011510CloudSafetyNet End-to-End Application Security inthe Cloud We acknowledge the support of Microsoftthrough the Microsoft Cloud Computing Research Cen-tre

REFERENCES

[1] P Barham B Dragovic K Fraser S Hand T Harris A HoR Neugebauer I Pratt and A Warfield ldquoXen and the art ofvirtualizationrdquo SIGOPS Operating Systems Review vol 37 no 5pp 164ndash177 2003

[2] A Kivity Y Kamay D Laor U Lublin and A Liguori ldquokvmthe Linux Virtual Machine Monitorrdquo in Proceedings of the LinuxSymposium vol 1 2007 pp 225ndash230

[3] D Bernstein ldquoContainers and Cloud From LXC to Docker toKubernetesrdquo IEEE Cloud Computing Magazine no 3 pp 81ndash842014

[4] P Desnoyers O Krieger B Holden and J Hennessey ldquoUsingOpenStack for an Open Cloud eXchange (OCX)rdquo in InternationalConference on Cloud Engineering (IC2E) IEEE 2015

[5] J Mineraud O Mazhelis X Su and S Tarkoma ldquoA Gap Analysisof Internet-of-Things Platformsrdquo 2015 Arxiv arXiv150201181[Online] Available httparxivorgabs150201181

[6] C J Millard Ed Cloud Computing Law Oxford University Press2013

[7] S Pearson and M Casassa-Mont ldquoSticky Policies An Approachfor Managing Privacy across Multiple Partiesrdquo Computer vol 44July 2011

[8] D E Denning ldquoA lattice model of secure information flowrdquoCommunication of the ACM vol 19 no 5 pp 236ndash243 1976

[9] D E Bell and L J LaPadula ldquoSecure Computer Systems Mathe-matical Foundations and Modelrdquo The MITRE Corp Bedford MATech Rep M74-244 1973

[10] K J Biba ldquoIntegrity Considerations for Secure Computer Sys-temsrdquo MITRE Corp Tech Rep ESD-TR 76-372 1977

[11] A C Myers and B Liskov ldquoA Decentralized Model for Infor-mation Flow Controlrdquo in 17th Symposium on Operating SystemsPrinciples (SOSP) ACM 1997 pp 129ndash142

[12] A Bouard B Weyl and C Eckert ldquoPractical information-flowaware middleware for in-car communicationrdquo in Workshop onSecurity privacy amp dependability for cyber vehicles ACM 2013pp 3ndash8

[13] K Singh S Bhola and W Lee ldquoxBook Redesigning PrivacyControl in Social Networking Platformsrdquo in Security SymposiumUSENIX 2009 pp 249ndash266

[14] M Krohn A Yip M Brodsky N Cliffer M F KaashoekE Kohler and R Morris ldquoInformation Flow Control for StandardOS Abstractionsrdquo in Symposium on Operating Systems PrinciplesACM 2007 pp 321ndash334

[15] S Lee E L Wong D Goel M Dahlin and V ShmatikovldquoπBox A Platform for Privacy-Preserving Appsrdquo in Symposiumon Networked System Design and Implementation USENIX 2013pp 501ndash514

[16] N Kumar and R Shyamasundar ldquoRealizing Purpose-Based Pri-vacy Policies Succinctly via Information-Flow Labelsrdquo in Big Dataand Cloud Computing (BDCloudrsquo14) IEEE 2014 pp 753ndash760

[17] W Enck P Gilbert B-G Chun L P Cox J Jung P McDaniel andA N Sheth ldquoTaintDroid An information-flow tracking systemfor realtime privacy monitoring on smartphonesrdquo in Conference onOperating systems design and implementation (OSDIrsquo10) USENIX2010 pp 1ndash6

[18] I Papagiannis M Migliavacca and P Pietzuch ldquoPHP AspisUsing partial taint tracking to protect against injection attacksrdquoin 2nd USENIX Conference on Web Application Development 2011p 13

[19] E Schwartz T Avgerinos and D Brumley ldquoAll you ever wantedto know about dynamic taint analysis and forward symbolicexecution (but might have been afraid to ask)rdquo in Symposium onSecurity and Privacy IEEE 2010

[20] M Casassa-Mont S Pearson and P Bramhall ldquoTowards account-able management of identity and privacy Sticky policies andenforceable tracing servicesrdquo in Database and Expert Systems Ap-plications 2003 Proceedings 14th International Workshop on IEEE2003 pp 377ndash382

[21] S Bandhakavi C C Zhang and M Winslett ldquoSuper-sticky anddeclassifiable release policies for flexible information dissemina-tion controlrdquo in Proceedings of the 5th ACM workshop on Privacy inelectronic society ACM 2006 pp 51ndash58

[22] D W Chadwick and S F Lievens ldquoEnforcing sticky securitypolicies throughout a distributed applicationrdquo in Proceedings ofthe 2008 workshop on Middleware security ACM 2008 pp 1ndash6

[23] M Casassa-Mont I Matteucci M Petrocchi and M L SbodioldquoTowards Safer Information Sharing in the Cloudrdquo InternationalJournal of Information Security pp 1ndash16 2014

[24] S Akoush L Carata R Sohan and A Hopper ldquoMrLazy LazyRuntime Label Propagation for MapReducerdquo in 6th Workshop onHot Topics in Cloud Computing (HotCloud) USENIX 2014

[25] D Schultz and B Liskov ldquoIFDB Decentralized Information FlowControl for Databasesrdquo in European Conference on Computer Sys-tems (Eurosysrsquo13) ACM 2013 pp 43ndash56

[26] D Von Oheimb ldquoInformation Flow Control Revisited Nonin-fluence = Noninterference + Nonleakagerdquo in Computer SecurityndashESORICS 2004 Springer 2004 pp 225ndash243

[27] D Hedin and A Sabelfeld ldquoA perspective on Information-FlowControlrdquo NATO 2012

[28] D E Porter M D Bond I Roy K S McKinley and E WitchelldquoPractical Fine-Grained Information Flow Control Using Lam-inarrdquo ACM Transactions on Programming Languages and Systems(TOPLAS) vol 37 no 1 p 4 2014

[29] C Wright C Cowan S Smalley J Morris and G Kroah-Hartman ldquoLinux Security Modules General security support forthe Linux kernelrdquo in Foundations of Intrusion Tolerant SystemsIEEE 2003 pp 213ndash213

[30] M Quaritsch and T Winkler ldquoLinux Security Modules Enhance-ments Module Stacking Framework and TCP State TransitionHooks for State-Driven NIDSrdquo Secure Information and Communi-cation vol 7 pp 7ndash13 2004

[31] C Schaufler ldquoLSM Generalize existing module stackingrdquo LinuxWeekly News 2014

[32] S Smalley C Vance and W Salamon ldquoImplementing SELinux asa Linux Security Modulerdquo NAI Labs Report vol 1 p 43 2001

[33] M Bauer ldquoParanoid Penguin an Introduction to Novell AppAr-morrdquo Linux Journal vol 2006 no 148 p 13 2006

[34] A Edwards T Jaeger and X Zhang ldquoRuntime verification ofauthorization hook placement for the Linux Security Modulesframeworkrdquo in Conference on Computer and Communications Secu-rity ACM 2002 pp 225ndash234

[35] T Jaeger A Edwards and X Zhang ldquoConsistency analysis ofauthorization hook placement in the Linux security modulesframeworkrdquo ACM Transactions on Information and System Security(TISSEC) vol 7 no 2 pp 175ndash205 2004

[36] V Ganapathy T Jaeger and S Jha ldquoAutomatic placement ofauthorization hooks in the Linux security modules frameworkrdquoin Conference on Computer and Communications Security ACM2005 pp 330ndash339

[37] I P Egwutuoha D Levy B Selic and S Chen ldquoA survey of faulttolerance mechanisms and checkpointrestart implementationsfor high performance computing systemsrdquo Springer 2013vol 65 no 3 pp 1302ndash1326

[38] O Laadan and S E Hallyn ldquoLinux-CR Transparent applicationcheckpoint-restart in Linuxrdquo in Linux Symposium 2010 p 159

[39] B Niu and G Tan ldquoEfficient User-space Information Flow Con-trolrdquo in Symposium on Information Computer and CommunicationsSecurity (SIGSACrsquo13) ACM 2013 pp 131ndash142

[40] T Bird ldquoMeasuring Function Duration with ftracerdquo in Japan LinuxSymposium 2009 pp 47ndash54

[41] B Parno ldquoBootstrapping Trust in ardquo Trustedrdquo Platformrdquo in Con-ference on Hot Topics in Security (HotSecrsquo08) USENIX 2008

[42] N Santos K P Gummadi and R Rodrigues ldquoTowards TrustedCloud Computingrdquo in Conference on Hot Topics in Cloud Comput-ing USENIX 2009 pp 3ndash3

[43] S Berger R Caceres K A Goldman R Perez R Sailer andL van Doorn ldquovTPM Virtualizing the Trusted Platform Modulerdquoin Security Symposium USENIX 2006 pp 305ndash320

14

[44] S Berger K Goldman D Pendarakis D Safford E Valdezand M Zohar ldquoScalable Attestation A Step Toward Secure andTrusted Cloudsrdquo in International Conference on Cloud Engineering(IC2E) IEEE 2015

[45] J Singh D Eyers and J Bacon ldquoPolicy Enforcement withinEmerging Distributed Event-Based Systemsrdquo in ACM DistributedEvent-Based Systems (DEBSrsquo14) 2014

[46] L Sfaxi T Abdellatif R Robbana and Y Lakhnech ldquoInformationFlow Control of Component-based Distributed Systemsrdquo Concur-rency and Computation Practice and Experience vol 25 no 2 pp161ndash179 2013

[47] S Yoshihama T Yoshizawa Y Watanabe M Kudoh and K Oy-anagi ldquoDynamic Information Flow Control Architecture for WebApplicationsrdquo in ESORICS 2007 J Biskup and J Lopez EdsSpringer 2007 vol LNCS 4734 pp 267ndash282

[48] W Cheng D R K Ports D Schultz V Popic A BlanksteinJ Cowling D Curtis L Shrira and B Liskov ldquoAbstractions forUsable Information Flow Control in Aeolusrdquo in USENIX AnnualTechnical Conference Boston 2012

[49] J Singh T Pasquier J Bacon and D Eyers ldquoIntegrating Middle-ware and Information Flow Controlrdquo in International Conferenceon Cloud Engineering (IC2E) IEEE 2015 pp 54ndash59

[50] J Singh T F J-M Pasquier and J Bacon ldquoSecuring Tags toControl Information Flows within the Internet of Thingsrdquo inInternational Conference on Recent Advances in Internet of Things(RIoTrsquo15) IEEE 2015

[51] J S Park and R Sandhu ldquoBinding Identities and Attributes UsingDigitally Signed Certificatesrdquo in Annual Conference on ComputerSecurity Applications IEEE 2000 pp 120ndash127

[52] A Ganjali and D Lie ldquoAuditing cloud management using infor-mation flow trackingrdquo in Workshop on Scalable Trusted ComputingACM 2012 pp 79ndash84

[53] R K Ko M Kirchberg and B S Lee ldquoFrom System-centric toData-centric Logging-accountability Trust amp Security in CloudComputingrdquo in Defense Science Research Conference and Expo (DSR)2011 IEEE 2011 pp 1ndash4

[54] J Singh J Powles T Pasquier and J Bacon ldquoData Flow Man-agement and Compliance in Cloud Computingrdquo IEEE CloudComputing Magazine SI on Legal Clouds 2015

[55] T Zanussi K Yaghmour R Wisniewski R Moore and M Dage-nais ldquorelayfs An efficient unified approach for transmitting datafrom kernel to user spacerdquo in Linux Symposium 2003 p 494

[56] M E Smoot K Ono J Ruscheinski P-L Wang and T IdekerldquoCytoscape 28 New features for data integration and networkvisualizationrdquo Oxford University Press 2011 vol 27 no 3 pp431ndash432

[57] T Pasquier and J Powles ldquoExpressing and Enforcing LocationRequirements in the Cloud using Information Flow Controlrdquo inIC2E International Workshop on Legal and Technical Issues in CloudComputing (CLawrsquo15) IEEE 2015

[58] R Angles and C Gutierrez ldquoSurvey of Graph Database ModelsrdquoACM Computing Surveys (CSUR) vol 40 no 1 p 1 2008

[59] D Hardt ldquoThe OAuth 20 Authorization Frameworkrdquo IETF TechRep RFC 6749 2012

[60] S Smalley and R Craig ldquoSecurity Enhanced (SE) Android Bring-ing Flexible MAC to Androidrdquo in Network and Distributed SystemSecurity Symposium Internet Society 2013

  • 1 Introduction and Motivation
  • 2 Background
    • 21 IFC Models
    • 22 Protection via VMs and Containers
    • 23 Taint Tracking (TT) Systems
    • 24 Sticky Policies
      • 3 CamFlow-Model IFC for the Cloud
        • 31 Tags and Labels
        • 32 Decentralised Privileges and Security Contexts
        • 33 Creating a New Entity
        • 34 Security
          • 341 Information Exchange
          • 342 Label Change
          • 343 Privilege delegation
            • 35 Conflict of Interest
              • 4 OS Enforcement
                • 41 CamFlow-LSM
                  • 411 Checkpointing and Restoration
                  • 412 OS Evaluation
                    • 42 Trusted Processes
                    • 43 Leveraging Hardware Roots of Trust
                      • 5 Cross-Machine Enforcement
                        • 51 Remote Interactions
                        • 52 Message-Level Enforcement
                        • 53 Integrating With Persistent Storage
                        • 54 Evaluation
                          • 6 Audit Data-Centric Logs
                            • 61 Analysing Paths to Disclosure
                            • 62 Demonstrating Compliance
                            • 63 Audit as `Big Data
                            • 64 Audit Access
                              • 7 Example Support for Web Services
                              • 8 Conclusion amp Future Work
                              • References
Page 7: CamFlow: Managed data-sharing for cloud services · CamFlow: Managed data-sharing for cloud services Thomas F. J.-M. Pasquier, Member, IEEE, Jatinder Singh, Member, IEEE, David Eyers,

7

inherits from LSM the formal assurance of IFCrsquos correctplacement on the path to any controlled kernel objectThis is sufficient to guarantee that we control flow andrecord audit on any operation on a controlled kernelobject

Since applications running on SELinux [32] or Ap-pArmor [33] need not be aware of the MAC policybeing enforced we see no reason to force applicationsrunning on an IFC system to be aware of IFC only thoseperforming declassification or endorsement operationsare necessarily aware This implementation choice isimportant cloud providers can incorporate IFC withoutrequiring changes in the software deployed by ten-ants Alternatively applications that wish to managetheir own IFC constraints can declare policy through apseudo-filesystem (as is typical for LSMs) abstracted bya user space library and enforced transparently by theIFC mechanism

The LSM framework calls security hooks when accessto a kernel object is attempted Security metadata can beassociated with kernel objects and is used by the LSMmodule to make access decisions Tags and privilegesare represented by 64-bit opaque nonces associated withkernel objects such as processes inodes files sharedmemory objects messages etc On interaction betweenkernel objects CamFlow-LSM security hooks are calledto enforce data-flow policy (sect341) or propagate tags onentity creation (sect33) as appropriate

Only active entities (processes) have mutable labelsand privileges all other (passive) entities have im-mutable labels and no privileges

Privileges are allocated by the kernel and owned bythe creating process (any process can create tags and theassociated privileges in a decentralised fashion) Privi-leges can be passed to other processes users or groupsCamFlow-LSM verifying that constraints on privilegedelegation (sect343) and conflict of interest (sect35) are notviolated A process can add or remove a tag from itslabel if it owns the appropriate privilege (following IFCconstraints described in sect342) if the current user ownsthe privilege or if the current group owns the privilegeHow tags are shared and managed must be consideredwith care when designing an application and the systemmust be administered accordingly

411 Checkpointing and Restoration

Checkpointing a process involves halting its executionallowing it to be restarted at a later stage and enablingmigration eg [37] LSM state is normally saved andrestored by the checkpointing system eg [38] andour module further exports an API to more efficientlyserialise and restore security context

Furthermore self-checkpointing and restoring the pre-vious state of a process has been demonstrated [39] to bea beneficial feature for IFC systems This is particularlyuseful for processes serving requests In such a scenariothe state of the process is saved after initialisation Whena request is received the serving process sets itself up

0 10 20 30 40 50 60 70 80 90

sys pipe

sys write

sys read

sys clone

microsdyn label IFC LSM Native

Fig 4 Overhead introduced into the OS by CamFlowLSM (x-axis time in micros)in the security context appropriate to serve the requestAfter the request is served (or a series of requests if thesystem is session-based as described in sect7) the processrestores its memory state and security context to whatthey were immediately after initialisation This improvesperformance and prevents data leaks between securitycontexts

412 OS Evaluation

We tested the CamFlow-LSM module on Linux Kernelversion 3178 (012015) from the Fedora distribution3

The tests are run on an Intel 22Ghz i7 CPU and 6GiBRAM machine

Measurements are done using the Linux tool ftrace [40]to provide a microbenchmark Two processes read fromand write to a pipe respectively Each has 20 tags in itssecurity label substantially more than we have seen aneed for in current use cases We measure the overheadinduced by creating a new process (sys clone) creatinga new pipe (sys pipe) writing to the pipe (sys write) andreading from the pipe (sys read) The results are givenin Fig 4

We can distinguish two types of induced overheadverifying an IFC constraint (sys read sys write) and allo-cating labels (sys clone sys pipe) The sys clone overheadis roughly twice that of sys pipe as memory is allocateddynamically for the active entityrsquos labels and privilegesRecall that passive entities have no privileges Overheadmeasurements for other system callsdata structures areessentially identical as they rely on the same underlyingenforcement mechanism and are not included

The CamFlow-LSM overhead is a few percent seeFig 4 We provide a build option that further improvesperformance by declaring labels and privileges with afixed maximum size (by default label size can increasedynamically to meet application requirements) This re-duces the overhead of the system calls that create newentities (the dynamic label component in Fig 4) This isan acceptable trade-off as in practical scenarios labelsrarely exceed more than five tags However for mostapplications the overhead is imperceptible and lost insystem noise it is hard to measure without using kernel

3 It is not feasible to provide a comparison with the Laminarimplementation [28] that is closest in technical terms to our workas the implementation available httpsgithubcomut-osalaminaris for an obsolete kernel version 2622 (072007)

8

tools as the variation between two executions may begreater than the overhead

42 Trusted Processes

The CamFlow-LSM is trusted to enforce IFC at the kernellevel Its functionality is minimal strictly confined tothe enforcement of IFC policies as described in sect3 Thisguarantees easier maintainability and a system that isagnostic to higher level application requirements thusminimising the constraints imposed on user-space ap-plication design

We introduce the concept of a trusted process that al-lows applicationplatform-specific concerns to be man-aged in user space by bypassing some LSM-enforced IFCconstraints For example a trusted process might serveas a proxy for external connections as in the Trusted IFCGateway in the example in sect7 setting up and managingapplication componentsrsquo labels Trusted processes areused to interact with persistent storage (see sect53) forcheckpointing and restoring processes (see sect411) andfor managing inter-process and external communication(see sect5)

Fig 5 shows OS instances running the CamFlow-LSM hosting a number of application processes thatmay be grouped in containers Each OS instance hasa single trusted process (Security Context Manager) tomanage its hosted processesrsquo IFC labels and privileges Inaddition each process has an associated trusted middle-ware process to handle inter-process and inter-machinecommunication Such communication may be within orbetween containers OSs or clouds

In this example S represents a particular set of secrecytags and I a particular set of integrity tags both ofwhich remain the same throughout The application pro-cesses and other OS objects such as pipes and files arelabelled [S I] The process labelled [empty I] writes lsquopublicrsquodata to a pipe which is read by a process labelled [S I]assuming all the I tags match correctly Similarly twoprocesses are shown writing to and reading from a file

The Security Context Manager maps between thekernel-level representation of tags (as 64-bit integers)and the representation of tags in user space Within acloud or other trusted environment tags may be simplestrings When tags need to cross domain boundarieseg when cloud services form part of a wider archi-tecture as in IoT tags may need to be protected bycryptographic means (see sect51)

Trusted processes are either set up through staticconfiguration read at boot time by the CamFlow-LSMmodule or created at runtime by another trusted pro-cess Trusted processes must either be managed by atrusted party (in our current approach the underlying in-frastructure provider) andor the code must be auditableand a means to verify the current version running on theplatform must be provided (see sect43)

43 Leveraging Hardware Roots of Trust

Incorporating IFC into cloud-provider OSs would en-hance the trustworthiness of the platform HoweverIFC only guarantees protection above the technical layerin which it is enforced Recent hardware and softwaredevelopments make it possible to attest that the softwarelayers on which our platform runs have been audited

The Trusted Platform Module (TPM) [41] as used forremote attestation [42] is one such hardware mecha-nism TPM is used to generate a nearly unforgeablehash representing the state of the hardware and soft-ware of a given platform that can be remotely verifiedTherefore a company could audit the implementationof our IFC enforcement mechanism and ensure that ourkernel security module messaging middleware and theconfiguration they provide are indeed running on theplatform Any difference between the expected state ofthe software stack and the platform could be considereda breach of trust such considerations can easily beembedded in the contractual obligations of the cloudprovider

TPM and remote attestation for cloud computing [43]are reaching maturity with IBM rolling out an opensource scalable trusted platform based on virtual TPMs[44] Indeed Berger et al [44] describe a mechanismallowing the TPM and remote attestation to be providedfor virtual machine offerings and container-based solu-tions covering the whole range of contemporary cloudofferings Furthermore the approach not only allows thestate of the software stack to be verified at boot time butalso during execution and can thus prevent run-timemodification of the system configuration

5 CROSS-MACHINE ENFORCEMENT

CamFlow-LSM operates to protect flows within the OSHowever it is also important that flows are protectedacross OS instances

Generally in order to guarantee flow constraints onlyprocesses P such that S(P ) = empty and I(P ) = empty ie notsubject to IFC constraints are allowed to directly connectto or receive messages from connections on remote OS(eg through a socket) In order to connect to anothermachine a process must either 1) be able to declassifyto change its security context to S(P ) = empty and I(P ) = empty2) communicate through an intermediate trusted process

As such CamFlow contains an IFC-enabled fully-featured messaging middleware (CamFlow-MW) to bothfacilitate communication and guarantee enforcementacross machines For want of space we only considerthe middleware concepts relevant to IFC details on thegeneral middleware (as it was prior to IFCCamFlowintegration) can be found in [45] In short the middle-ware supports strongly-typed messages a range of in-teraction paradigms including request-reply broadcastand streams flexible resource discovery and securitymechanisms including access controls and encryptedcommunication A particular feature is its support for

9

CamFlow-MW CamFlow-MW CamFlow-MW CamFlow-MW CamFlow-MW

Process

CamFlow-LSM

Enforcement Point Pipe [empty I]

Process Process

SecurityContextManager

SecurityContextManager

Container

Container

Container

Process[S I]

CamFlow-LSM

[S I]

File[S I]

[S empty][S empty]Process[empty I]

Fig 5 CamFlow Architecture Labelled OS objects trusted processes and communication middleware

dynamic reconfiguration based on event-driven policyThis simplifies both application development and de-ployment as concerns can be abstracted and tailoredto the particular environment rather than embeddedwithin application code

The role of the middleware is to move towards contin-uous end-to-end data flow management such that IFCcan be enforced across applicationsmachines (kernels)There is work on IFC enforcement across machineshowever these impose specific requirements such asdesign-time considerations [46] a particular languageruntime [47] or constraints on system architectureimplementation [48] In contrast we integrate IFC func-tionality into the general fully featured distributed sys-tems middleware mentioned above (see [49]) to provideflexibility and be more generally applicable We delib-erately avoid imposing a structure on system designinstead integrating IFC functionality into the sort of com-munications infrastructure common to current enterpriseand cloud systems

51 Remote Interactions

CamFlow-MW operates by associating a trusted process(see sect42) with an entity that seeks to communicate viathe messaging system Its fit within the broader architec-ture is depicted in Fig 5 The process is responsible forhandling the communication of messages and enforcesIFC based on the current runtime labels of the entity onwhose behalf it operates

It follows that for IFC to be enforced across machinestags require system-wide management ie throughoutthe cloud service In [50] we proposed that the widelyused and available X509 certificates could be used Theapproach relies on public key certificates and attributecertificates [51] to respectively identify the applicationassociated with the CamFlow-MW instance and the tagsassociated with this application

As part of establishing a connection CamFlow-MWensures that each entity authorises communication withthe other according to a local access control policy De-cisions are based on component metadata the relevantauthentication aspects secured through PKI (certificates)Similarly IFC policy must also be verified to ensure dataflows are authorised according to IFC policy Attributecertificates provide cryptographic means to determineand verify the tags associated with the remote entity

(see [50]) on which policy can be enforced If tags donot accord the connection will not be established

We also see potential for remote attestation based onhardware integrity measures (see sect43) to be integratedinto this authorisation phase to ensure the remote ma-chine operates a reliable IFC enforcement regime

52 Message-Level Enforcement

CamFlow messages are strongly typed where a mes-sage type is defined by a schema describing its set ofattributes For an instance of a message an attributeconsists of a name type and value The support forIFC within messages is fine-grained in that individualattributes within messages can also be labelled Theseattribute labels introduce additional IFC constraints overand above those already applying to the entity ie asrecognised by the kernel-LSM and validated on connec-tion establishment

Labels can be defined within message type schemawhich sets the attributesrsquo IFC labels for all messageinstances of the type These labels cannot be changed byentities dealing in such messages and the entities musthold the requisite labels to interact with the attributesOtherwise the entity producingpublishing a messagecan set the security labels for the attributes (for those notpredefined) if the entity holds the associated privilegesEnforcement occurs as followsReceiving If the receiving entityrsquos labels do not agreewith those of an attribute value the attribute value (andany sub-attributes) are removed from (made null in) themessage This is enforced on message receipt before itis delivered to the entitySending An entity cannot send values for attributeswhere its labels do not agree with those of the attributeThis is enforced when an entity attempts to send amessage ensuring values for any attributes violating thispolicy are removed before message propagation

Enforcement is automatic meaning that applicationsusing the messaging system can be subject to IFC en-forcement completely transparently (ie without their di-rect involvement) though again there is the interface forthe application to actively manage IFC where requiredIn addition the general reconfiguration capabilities ofthe middleware enable connections between componentsto be defined and managed at runtime providing an-other mechanism for controlling communication [45]

10

0 100 200 300 400 500 600 700

Time in msIFC-overhead

Fig 6 IFC overhead of CamFlow-MW for a workloadtransmission of 5000 messages (x-axis in ms)53 Integrating With Persistent Storage

One technique to provide IFC with persistent data storesis to store the tags alongside the data and a trustedsoftware component ensures that when information isread from the store the corresponding labels are appliedIn Flume [14] a trusted process provides the interfacebetween untrusted applications and persistent storageMore recent work has seen the emergence of databasesthat natively understand IFC concepts and can enforceIFC policies [25]

We see much promise in having the middleware me-diate between persistence systems and the kernel toensure consistent IFC application

54 Evaluation

As shown in Fig 6 the results indicate that IFC en-forcement introduces an overhead of sim13 in perfor-mance time compared to the standard non IFC-enabledmiddleware (see [49] for details) Note that these resultswere measured in the context of a particular workloaddeliberately designed to highlight the impact of IFCenforcement It follows that the overheads associatedwith real-world usage are most likely less onerous

6 AUDIT DATA-CENTRIC LOGS

IFC in addition to providing strong assurances that pol-icy is being enforced can also provide a data-centric log[52] detailing the information flows within and betweensystem components In addition to enforcing IFC via ourLSM module we log the data flows of labelled processespolicy decisions privileges and IFC security contextmanipulations Equivalent inter-machine operations viathe middleware are also recorded

Cloud logging systems are generally based on legacylogging systems (OS web-server database etc) thateither fail to capture the needed information or areextremely complicated to interpret in a useful manner[53] More importantly such logs tend to be relevant onlyto the particular service or component which makes itdifficult if not impossible to audit across a range ofapplications clouds etc

IFC logs as provided by our platform allow us to cap-ture information on application-level data flows both at-tempted and permitted allowing the correct expressionand implementation of data flow policy to be checkedThis provides transparency allows for meaningful auditin terms of investigating the circumstances in which dataleakage occurs and provides evidence of complianceeg with legal obligations [54]

Information Exchange

CreateSecurity Context Change

e0

e1

e3e4

e7

e2 e5e6

P3[Sprimeprime I]

P2[Sprimeprime empty]

P3[S I]F1[S I]

P1[S I] P1[empty I]

Public[empty empty]F2[Sprime I prime]

Fig 7 Simplified audit graph from IFC OS execution (weomit metadata for readability) The path to disclosure isshown in bluepale61 Analysing Paths to Disclosure

To assist in interpreting log information we build a di-rected graph corresponding to the allowed flows duringthe execution of our system as shown in Fig 7 Theflows defined in our IFC model (see sect3) namely dataflow creation flow security context change and privi-lege delegation correspond to the edges of the directedgraph Entities (such as processes files messages etc)are represented as nodes in the graph

In addition to information necessary to build the graph(as shown in the figure) additional metadata is collectedfor forensic purposes which is contextentityevent-dependent These audit entries provided by the LSMand middleware can be exploited by a dedicated serviceimplemented in user space connected to the kernelcollection mechanism via relayfs [55] This service feedsfor example a graph visualisation tool such as Cytoscape[56] or a graph database such as Neo4J4

Such a directed graph helps one identify data leaksFor example a tenant might discover that some sensi-tive medical data leaked into a data store where onlyanonymised research data were supposed to be storedIFC is enforced in line with the policy encapsulated inlabels thus data may leak if such policy is improperly ex-pressed andor declassificationendorsement processesare not correctly implemented (eg if the anonymisationprocess in Fig 2 allows re-identification)

Suppose that an information leak is suspected be-tween different security contexts L1[S I] and L2[S

prime I prime]Determining whether such a leak can occur is equiv-alent to discovering whether there is a path in thegraph between the two contexts If the leak occurredthere must be a path between some entity Ei such thatS(Ei) = S and I(Ei) = I and another entity Fi such thatS(Fi) = Sprime and I(Fi) = I prime

The existence of such a path demonstrates that a leakis possible To investigate whether a leak occurred it isessential to consider the event ID associated with theedges comprising the path We denote by ei the last in-coming edge to the entity under investigation with labels[Sprime I prime] only edges such that e lt ei should be consideredWhen applied to all nodes along a path this rule ensuresstrictly monotonically increasing timestamps from thefirst node to the last Fig 7 shows in bluepale a possibledata disclosure path from file F2 from a very simple

4 httpneo4jcom

11

audit graph We know from the event IDs e0 and e1 thatthe data disclosure did not occur through file F1 andprocess P3 but through P1rsquos declassification

62 Demonstrating Compliance

Compliance with certain requirements can be demon-strated through queries over the graph We assume theaudit data is stored in a graph database that we canquery For example the following plain English policyldquoEuropean personal data sent to the US must be anonymisedrdquo[57] is equivalent to writing a query that verifies thatthere is no path between EU- and US-labelled datawithout an anonymisation processThe policy ldquoMedical data stored in database X must havereceived proper consent and be anonymisedrdquo [54] can beexpressed as a query verifying that there is no path be-tween data labelled as medical and the database withoutconsent-checking and anonymiser processes In additionan investigator may want to know which anonymisationalgorithm has been run which data has been used togenerate the anonymised records etc Our audit graphassists in answering such questions

Note that IFC only applies guarantees with respectto flows Demonstrating the overall effectiveness of themanagement regime eg the quality and suitability ofthe anonymisation algorithm is out-of-scope for a flow-based enforcement mechanism

63 Audit as lsquoBig Datarsquo

We are potentially generating a vast amount of data inour IFC logs However unlike standard system logs thatare complex to analyse our logs generate graphs thatare ideal for analysis by ldquobig datardquo tools that have beendeveloped for this purpose [58]

Since the amount of data is potentially huge theamount of data being logged can be fine-tuned to meetthe requirements of the platformtenant eg by reduc-ing the amount of metadata being stored by loggingonly security context changing operations by loggingonly information corresponding to some target securitycontext keeping operations on unlabelled entities out-side of the log etc The decision on what needs to belogged then becomes a tradeoff between utility and thevolume (cost) of log generated which can be decided inorder to correspond to legal or contractual requirements(for example a regulated sector may need to have afine-grained log to satisfy data forensic requirements)Indeed as such an approach is new to the cloud suchconsiderations will be refined by experience with bestpractices developing over time

64 Audit Access

Logs can contain sensitive information and access tothem should be controlled This represents an area ofour ongoing work Traditional access controls clearlyplay a role however secrecy tags could also be lever-aged For example an auditor before being grantedaccess to audit logs could be forced to demonstrate

ownership of the corresponding secrecy IFC tags (forexample through cryptographic means as in sect51) Theauditor may be granted access to a log entry only ifS(origin) cup S(destination) sube S(auditor)7 EXAMPLE SUPPORT FOR WEB SERVICES

One of the most common uses of PaaS is to host web ap-plications In this section we present the implementationof such a solution built on the infrastructure described insect4 in order to evaluate and demonstrate the feasibility ofour proposed approach This is illustrated in Fig 8 Werun standard and unmodified Ruby web applications

Interaction with end-users is achieved through aldquogatewayrdquo between the IFC and non-IFC worlds Simi-larly interaction with cloud services (such as data stores)is also achieved through our messaging middleware asdiscussed in sect5 The requirement for this gateway canbe removed if a trustworthy IFC implementation can beprovided at the client side consistent with the cloud im-plementation with respect to tag naming enforcementetc Tag naming in general system-wide is an issuebeyond the scope of this paper see further sect8 In ourproof of concept implementation the gateway is a simpleApache server running a custom-built module

The role of the gateway is to authenticate the end-userwhen a session is created and to associate this sessionwith an application instance running within the securitycontext corresponding to the user Recall that a securitycontext comprises the S and I labels Any further re-quests to the gateway in that session are routed to thecorresponding application instance Once an instance nolonger has an associated session it can be recycled usingself-checkpointing as described in sect411

Several application types are running over our cloud-based web services platform For example in a medicalcontext these might be medical record editing pharmacyordering social services etc A single shared identityservice for the end-user is part of the cloud provideroffering (in our proof of concept implementation weused OAuth [59])

The GP authenticates is authorised as treating doc-tor for Alice and selects the lsquoidentityrsquo that correspondsto Alice A new session is created server-side by thegateway with the requested application instance run-ning in the corresponding security context with S =[medical alice] I = [empty] When the GP wants to accessapplications on behalf of a new patient he needs to closeAlicersquos session authorise as treating doctor for Bob andopen a new session for Bob

The control described above is not achieved by theapplication but by the platform itself and can be con-trolled by the end-user subject to access control Thatis a medical application used on Alicersquos behalf runs ina security context in which data cannot flow to that ofanother patient Furthermore applications running onbehalf of a given user can share the data of that userwithout the risk of seeing a buggy application leakingdata between end-users

12

App Instance App InstanceApp Instance

CamFlow-MW

IFC Constrained IFC Constrained IFC Constrained

ContextA

ContextB

Public

Trusted IFC Gateway

Enforcing IFCData-store

Data-store Application

CamFlow-MW CamFlow-MWCamFlow-MW

IFC Cloud Platform

Dr Yellow Dr PinkDr MauveRemote Clients

Fig 8 PaaS Architecture on top of IFC-OS

As described in sect4 we assume the middleware andthe OS enforcement are provided as a service by theunderlying platform A tenant wanting to use the third-party web-service offering once his trust in the under-lying platform is established needs only to audit thegateway again the underlying infrastructure providercould either provide such a gateway or audit it Therest of the software stack of the third-party web-serviceprovider is bound by the IFC enforcement mechanismand therefore need not be trusted

8 CONCLUSION amp FUTURE WORK

IFC allows data flows to be controlled continuouslythroughout a system by providing an information-centric MAC scheme that continuously ensures non-interference between security contexts This paper pre-sented the CamFlow platform that demonstrates thepotential of cloud-deployed IFC as supporting (1) pro-tection of applications from each other (2) flexible datasharing across isolation boundaries (3) prevention ofdata leakage due to bugsmisconfigurations (4) exten-sion of access control beyond application boundaries (5)data flow transparency

Specifically we detailed a new kernel implementationof IFC as an LSM demonstrating low overhead evenfor worst-case scenarios where processes continuouslymake readwrite system calls We also described theintegration of a messaging middleware to enforcing IFCacross machines This combination makes it possible toprovide whole-system IFC to PaaS cloud services andtherefore also SaaS Our approach and implementationwere designed so that applications can run unchangedover IFC thus making cloud adoption feasible

We also indicated how the data-centric logs based onIFC enforcement could provide the means to audit anIFC-enabled system whereby a log can be processed asa directed graph to investigate leaks and attacks andshow compliance with data management requirementsThough this represents our initial work in the area thereappears much promise

In light of the above we believe that IFC has greatpotential as a security mechanism for the cloud wherebytrust in a few major cloud providers deploying IFCcan be built on to provide a demonstrably trustworthycomputing environment

CamFlow was developed with cloud deployment inmind Our future work will investigate the challenges ofa broader distributed context

It is already feasible to extend CamFlow to sup-port mobile environments Android supports the fullSELinux enforcement5 and an Android-LSM integrationhas been demonstrated [60] But when dealing withmultiple cloud services particularly as they become partof a wider distributed architecture such as in the IoTa trustworthy system-wide deployment of IFC can nolonger be assumed Much work remains on establishingtrust in (and the trustworthiness of) the IFC enforcementmechanism within end-usersrsquo devices Outside a cloudcontext all partiesrsquo trust in a common third partyrsquos en-forcement of IFC constraints cannot be assumed (unlikethe cloud provider for cloud services) We intend toexplore leveraging hardware roots of trust and remoteattestation to ensure the integrity and trustworthiness ofIFC enforcement mechanisms

Another area of investigation concerns the represen-tation of tags across administrative domains In a cloudcontext a federated approach can be envisaged wherea common understanding of tags could be negotiatedacross multiple domains For example in [54] we dis-cussed initial thoughts on managing data according tospecific obligation regimes However as the numberof administrative domains increases both a global tagnaming scheme and mechanisms for ad-hoc negotiationbecome necessary

Related is the sensitivity of the tags themselvesKnowledge of the meaning of a tag can indicate thatthe associated entity contains or deals with certain infor-mation If a relationship can be established between anentity and the information owner this may disclose pri-vate information about the information owner This maylead to work on a need-to-know negotiation mechanismto establish a secure channel between hosts especiallyin a wide-scale distributed system A promising mech-anism is private set intersection to determine tagsrsquo subsetrelationship (sect341)

Another challenge concerns extending audit to dis-tributed architectures both in terms of resource man-agement and regulating access to log data

5 httpssourceandroidcomdevicestechsecurityselinux

13

ACKNOWLEDGMENTS

This work was supported by UK Engineering andPhysical Sciences Research Council grant EPK011510CloudSafetyNet End-to-End Application Security inthe Cloud We acknowledge the support of Microsoftthrough the Microsoft Cloud Computing Research Cen-tre

REFERENCES

[1] P Barham B Dragovic K Fraser S Hand T Harris A HoR Neugebauer I Pratt and A Warfield ldquoXen and the art ofvirtualizationrdquo SIGOPS Operating Systems Review vol 37 no 5pp 164ndash177 2003

[2] A Kivity Y Kamay D Laor U Lublin and A Liguori ldquokvmthe Linux Virtual Machine Monitorrdquo in Proceedings of the LinuxSymposium vol 1 2007 pp 225ndash230

[3] D Bernstein ldquoContainers and Cloud From LXC to Docker toKubernetesrdquo IEEE Cloud Computing Magazine no 3 pp 81ndash842014

[4] P Desnoyers O Krieger B Holden and J Hennessey ldquoUsingOpenStack for an Open Cloud eXchange (OCX)rdquo in InternationalConference on Cloud Engineering (IC2E) IEEE 2015

[5] J Mineraud O Mazhelis X Su and S Tarkoma ldquoA Gap Analysisof Internet-of-Things Platformsrdquo 2015 Arxiv arXiv150201181[Online] Available httparxivorgabs150201181

[6] C J Millard Ed Cloud Computing Law Oxford University Press2013

[7] S Pearson and M Casassa-Mont ldquoSticky Policies An Approachfor Managing Privacy across Multiple Partiesrdquo Computer vol 44July 2011

[8] D E Denning ldquoA lattice model of secure information flowrdquoCommunication of the ACM vol 19 no 5 pp 236ndash243 1976

[9] D E Bell and L J LaPadula ldquoSecure Computer Systems Mathe-matical Foundations and Modelrdquo The MITRE Corp Bedford MATech Rep M74-244 1973

[10] K J Biba ldquoIntegrity Considerations for Secure Computer Sys-temsrdquo MITRE Corp Tech Rep ESD-TR 76-372 1977

[11] A C Myers and B Liskov ldquoA Decentralized Model for Infor-mation Flow Controlrdquo in 17th Symposium on Operating SystemsPrinciples (SOSP) ACM 1997 pp 129ndash142

[12] A Bouard B Weyl and C Eckert ldquoPractical information-flowaware middleware for in-car communicationrdquo in Workshop onSecurity privacy amp dependability for cyber vehicles ACM 2013pp 3ndash8

[13] K Singh S Bhola and W Lee ldquoxBook Redesigning PrivacyControl in Social Networking Platformsrdquo in Security SymposiumUSENIX 2009 pp 249ndash266

[14] M Krohn A Yip M Brodsky N Cliffer M F KaashoekE Kohler and R Morris ldquoInformation Flow Control for StandardOS Abstractionsrdquo in Symposium on Operating Systems PrinciplesACM 2007 pp 321ndash334

[15] S Lee E L Wong D Goel M Dahlin and V ShmatikovldquoπBox A Platform for Privacy-Preserving Appsrdquo in Symposiumon Networked System Design and Implementation USENIX 2013pp 501ndash514

[16] N Kumar and R Shyamasundar ldquoRealizing Purpose-Based Pri-vacy Policies Succinctly via Information-Flow Labelsrdquo in Big Dataand Cloud Computing (BDCloudrsquo14) IEEE 2014 pp 753ndash760

[17] W Enck P Gilbert B-G Chun L P Cox J Jung P McDaniel andA N Sheth ldquoTaintDroid An information-flow tracking systemfor realtime privacy monitoring on smartphonesrdquo in Conference onOperating systems design and implementation (OSDIrsquo10) USENIX2010 pp 1ndash6

[18] I Papagiannis M Migliavacca and P Pietzuch ldquoPHP AspisUsing partial taint tracking to protect against injection attacksrdquoin 2nd USENIX Conference on Web Application Development 2011p 13

[19] E Schwartz T Avgerinos and D Brumley ldquoAll you ever wantedto know about dynamic taint analysis and forward symbolicexecution (but might have been afraid to ask)rdquo in Symposium onSecurity and Privacy IEEE 2010

[20] M Casassa-Mont S Pearson and P Bramhall ldquoTowards account-able management of identity and privacy Sticky policies andenforceable tracing servicesrdquo in Database and Expert Systems Ap-plications 2003 Proceedings 14th International Workshop on IEEE2003 pp 377ndash382

[21] S Bandhakavi C C Zhang and M Winslett ldquoSuper-sticky anddeclassifiable release policies for flexible information dissemina-tion controlrdquo in Proceedings of the 5th ACM workshop on Privacy inelectronic society ACM 2006 pp 51ndash58

[22] D W Chadwick and S F Lievens ldquoEnforcing sticky securitypolicies throughout a distributed applicationrdquo in Proceedings ofthe 2008 workshop on Middleware security ACM 2008 pp 1ndash6

[23] M Casassa-Mont I Matteucci M Petrocchi and M L SbodioldquoTowards Safer Information Sharing in the Cloudrdquo InternationalJournal of Information Security pp 1ndash16 2014

[24] S Akoush L Carata R Sohan and A Hopper ldquoMrLazy LazyRuntime Label Propagation for MapReducerdquo in 6th Workshop onHot Topics in Cloud Computing (HotCloud) USENIX 2014

[25] D Schultz and B Liskov ldquoIFDB Decentralized Information FlowControl for Databasesrdquo in European Conference on Computer Sys-tems (Eurosysrsquo13) ACM 2013 pp 43ndash56

[26] D Von Oheimb ldquoInformation Flow Control Revisited Nonin-fluence = Noninterference + Nonleakagerdquo in Computer SecurityndashESORICS 2004 Springer 2004 pp 225ndash243

[27] D Hedin and A Sabelfeld ldquoA perspective on Information-FlowControlrdquo NATO 2012

[28] D E Porter M D Bond I Roy K S McKinley and E WitchelldquoPractical Fine-Grained Information Flow Control Using Lam-inarrdquo ACM Transactions on Programming Languages and Systems(TOPLAS) vol 37 no 1 p 4 2014

[29] C Wright C Cowan S Smalley J Morris and G Kroah-Hartman ldquoLinux Security Modules General security support forthe Linux kernelrdquo in Foundations of Intrusion Tolerant SystemsIEEE 2003 pp 213ndash213

[30] M Quaritsch and T Winkler ldquoLinux Security Modules Enhance-ments Module Stacking Framework and TCP State TransitionHooks for State-Driven NIDSrdquo Secure Information and Communi-cation vol 7 pp 7ndash13 2004

[31] C Schaufler ldquoLSM Generalize existing module stackingrdquo LinuxWeekly News 2014

[32] S Smalley C Vance and W Salamon ldquoImplementing SELinux asa Linux Security Modulerdquo NAI Labs Report vol 1 p 43 2001

[33] M Bauer ldquoParanoid Penguin an Introduction to Novell AppAr-morrdquo Linux Journal vol 2006 no 148 p 13 2006

[34] A Edwards T Jaeger and X Zhang ldquoRuntime verification ofauthorization hook placement for the Linux Security Modulesframeworkrdquo in Conference on Computer and Communications Secu-rity ACM 2002 pp 225ndash234

[35] T Jaeger A Edwards and X Zhang ldquoConsistency analysis ofauthorization hook placement in the Linux security modulesframeworkrdquo ACM Transactions on Information and System Security(TISSEC) vol 7 no 2 pp 175ndash205 2004

[36] V Ganapathy T Jaeger and S Jha ldquoAutomatic placement ofauthorization hooks in the Linux security modules frameworkrdquoin Conference on Computer and Communications Security ACM2005 pp 330ndash339

[37] I P Egwutuoha D Levy B Selic and S Chen ldquoA survey of faulttolerance mechanisms and checkpointrestart implementationsfor high performance computing systemsrdquo Springer 2013vol 65 no 3 pp 1302ndash1326

[38] O Laadan and S E Hallyn ldquoLinux-CR Transparent applicationcheckpoint-restart in Linuxrdquo in Linux Symposium 2010 p 159

[39] B Niu and G Tan ldquoEfficient User-space Information Flow Con-trolrdquo in Symposium on Information Computer and CommunicationsSecurity (SIGSACrsquo13) ACM 2013 pp 131ndash142

[40] T Bird ldquoMeasuring Function Duration with ftracerdquo in Japan LinuxSymposium 2009 pp 47ndash54

[41] B Parno ldquoBootstrapping Trust in ardquo Trustedrdquo Platformrdquo in Con-ference on Hot Topics in Security (HotSecrsquo08) USENIX 2008

[42] N Santos K P Gummadi and R Rodrigues ldquoTowards TrustedCloud Computingrdquo in Conference on Hot Topics in Cloud Comput-ing USENIX 2009 pp 3ndash3

[43] S Berger R Caceres K A Goldman R Perez R Sailer andL van Doorn ldquovTPM Virtualizing the Trusted Platform Modulerdquoin Security Symposium USENIX 2006 pp 305ndash320

14

[44] S Berger K Goldman D Pendarakis D Safford E Valdezand M Zohar ldquoScalable Attestation A Step Toward Secure andTrusted Cloudsrdquo in International Conference on Cloud Engineering(IC2E) IEEE 2015

[45] J Singh D Eyers and J Bacon ldquoPolicy Enforcement withinEmerging Distributed Event-Based Systemsrdquo in ACM DistributedEvent-Based Systems (DEBSrsquo14) 2014

[46] L Sfaxi T Abdellatif R Robbana and Y Lakhnech ldquoInformationFlow Control of Component-based Distributed Systemsrdquo Concur-rency and Computation Practice and Experience vol 25 no 2 pp161ndash179 2013

[47] S Yoshihama T Yoshizawa Y Watanabe M Kudoh and K Oy-anagi ldquoDynamic Information Flow Control Architecture for WebApplicationsrdquo in ESORICS 2007 J Biskup and J Lopez EdsSpringer 2007 vol LNCS 4734 pp 267ndash282

[48] W Cheng D R K Ports D Schultz V Popic A BlanksteinJ Cowling D Curtis L Shrira and B Liskov ldquoAbstractions forUsable Information Flow Control in Aeolusrdquo in USENIX AnnualTechnical Conference Boston 2012

[49] J Singh T Pasquier J Bacon and D Eyers ldquoIntegrating Middle-ware and Information Flow Controlrdquo in International Conferenceon Cloud Engineering (IC2E) IEEE 2015 pp 54ndash59

[50] J Singh T F J-M Pasquier and J Bacon ldquoSecuring Tags toControl Information Flows within the Internet of Thingsrdquo inInternational Conference on Recent Advances in Internet of Things(RIoTrsquo15) IEEE 2015

[51] J S Park and R Sandhu ldquoBinding Identities and Attributes UsingDigitally Signed Certificatesrdquo in Annual Conference on ComputerSecurity Applications IEEE 2000 pp 120ndash127

[52] A Ganjali and D Lie ldquoAuditing cloud management using infor-mation flow trackingrdquo in Workshop on Scalable Trusted ComputingACM 2012 pp 79ndash84

[53] R K Ko M Kirchberg and B S Lee ldquoFrom System-centric toData-centric Logging-accountability Trust amp Security in CloudComputingrdquo in Defense Science Research Conference and Expo (DSR)2011 IEEE 2011 pp 1ndash4

[54] J Singh J Powles T Pasquier and J Bacon ldquoData Flow Man-agement and Compliance in Cloud Computingrdquo IEEE CloudComputing Magazine SI on Legal Clouds 2015

[55] T Zanussi K Yaghmour R Wisniewski R Moore and M Dage-nais ldquorelayfs An efficient unified approach for transmitting datafrom kernel to user spacerdquo in Linux Symposium 2003 p 494

[56] M E Smoot K Ono J Ruscheinski P-L Wang and T IdekerldquoCytoscape 28 New features for data integration and networkvisualizationrdquo Oxford University Press 2011 vol 27 no 3 pp431ndash432

[57] T Pasquier and J Powles ldquoExpressing and Enforcing LocationRequirements in the Cloud using Information Flow Controlrdquo inIC2E International Workshop on Legal and Technical Issues in CloudComputing (CLawrsquo15) IEEE 2015

[58] R Angles and C Gutierrez ldquoSurvey of Graph Database ModelsrdquoACM Computing Surveys (CSUR) vol 40 no 1 p 1 2008

[59] D Hardt ldquoThe OAuth 20 Authorization Frameworkrdquo IETF TechRep RFC 6749 2012

[60] S Smalley and R Craig ldquoSecurity Enhanced (SE) Android Bring-ing Flexible MAC to Androidrdquo in Network and Distributed SystemSecurity Symposium Internet Society 2013

  • 1 Introduction and Motivation
  • 2 Background
    • 21 IFC Models
    • 22 Protection via VMs and Containers
    • 23 Taint Tracking (TT) Systems
    • 24 Sticky Policies
      • 3 CamFlow-Model IFC for the Cloud
        • 31 Tags and Labels
        • 32 Decentralised Privileges and Security Contexts
        • 33 Creating a New Entity
        • 34 Security
          • 341 Information Exchange
          • 342 Label Change
          • 343 Privilege delegation
            • 35 Conflict of Interest
              • 4 OS Enforcement
                • 41 CamFlow-LSM
                  • 411 Checkpointing and Restoration
                  • 412 OS Evaluation
                    • 42 Trusted Processes
                    • 43 Leveraging Hardware Roots of Trust
                      • 5 Cross-Machine Enforcement
                        • 51 Remote Interactions
                        • 52 Message-Level Enforcement
                        • 53 Integrating With Persistent Storage
                        • 54 Evaluation
                          • 6 Audit Data-Centric Logs
                            • 61 Analysing Paths to Disclosure
                            • 62 Demonstrating Compliance
                            • 63 Audit as `Big Data
                            • 64 Audit Access
                              • 7 Example Support for Web Services
                              • 8 Conclusion amp Future Work
                              • References
Page 8: CamFlow: Managed data-sharing for cloud services · CamFlow: Managed data-sharing for cloud services Thomas F. J.-M. Pasquier, Member, IEEE, Jatinder Singh, Member, IEEE, David Eyers,

8

tools as the variation between two executions may begreater than the overhead

42 Trusted Processes

The CamFlow-LSM is trusted to enforce IFC at the kernellevel Its functionality is minimal strictly confined tothe enforcement of IFC policies as described in sect3 Thisguarantees easier maintainability and a system that isagnostic to higher level application requirements thusminimising the constraints imposed on user-space ap-plication design

We introduce the concept of a trusted process that al-lows applicationplatform-specific concerns to be man-aged in user space by bypassing some LSM-enforced IFCconstraints For example a trusted process might serveas a proxy for external connections as in the Trusted IFCGateway in the example in sect7 setting up and managingapplication componentsrsquo labels Trusted processes areused to interact with persistent storage (see sect53) forcheckpointing and restoring processes (see sect411) andfor managing inter-process and external communication(see sect5)

Fig 5 shows OS instances running the CamFlow-LSM hosting a number of application processes thatmay be grouped in containers Each OS instance hasa single trusted process (Security Context Manager) tomanage its hosted processesrsquo IFC labels and privileges Inaddition each process has an associated trusted middle-ware process to handle inter-process and inter-machinecommunication Such communication may be within orbetween containers OSs or clouds

In this example S represents a particular set of secrecytags and I a particular set of integrity tags both ofwhich remain the same throughout The application pro-cesses and other OS objects such as pipes and files arelabelled [S I] The process labelled [empty I] writes lsquopublicrsquodata to a pipe which is read by a process labelled [S I]assuming all the I tags match correctly Similarly twoprocesses are shown writing to and reading from a file

The Security Context Manager maps between thekernel-level representation of tags (as 64-bit integers)and the representation of tags in user space Within acloud or other trusted environment tags may be simplestrings When tags need to cross domain boundarieseg when cloud services form part of a wider archi-tecture as in IoT tags may need to be protected bycryptographic means (see sect51)

Trusted processes are either set up through staticconfiguration read at boot time by the CamFlow-LSMmodule or created at runtime by another trusted pro-cess Trusted processes must either be managed by atrusted party (in our current approach the underlying in-frastructure provider) andor the code must be auditableand a means to verify the current version running on theplatform must be provided (see sect43)

43 Leveraging Hardware Roots of Trust

Incorporating IFC into cloud-provider OSs would en-hance the trustworthiness of the platform HoweverIFC only guarantees protection above the technical layerin which it is enforced Recent hardware and softwaredevelopments make it possible to attest that the softwarelayers on which our platform runs have been audited

The Trusted Platform Module (TPM) [41] as used forremote attestation [42] is one such hardware mecha-nism TPM is used to generate a nearly unforgeablehash representing the state of the hardware and soft-ware of a given platform that can be remotely verifiedTherefore a company could audit the implementationof our IFC enforcement mechanism and ensure that ourkernel security module messaging middleware and theconfiguration they provide are indeed running on theplatform Any difference between the expected state ofthe software stack and the platform could be considereda breach of trust such considerations can easily beembedded in the contractual obligations of the cloudprovider

TPM and remote attestation for cloud computing [43]are reaching maturity with IBM rolling out an opensource scalable trusted platform based on virtual TPMs[44] Indeed Berger et al [44] describe a mechanismallowing the TPM and remote attestation to be providedfor virtual machine offerings and container-based solu-tions covering the whole range of contemporary cloudofferings Furthermore the approach not only allows thestate of the software stack to be verified at boot time butalso during execution and can thus prevent run-timemodification of the system configuration

5 CROSS-MACHINE ENFORCEMENT

CamFlow-LSM operates to protect flows within the OSHowever it is also important that flows are protectedacross OS instances

Generally in order to guarantee flow constraints onlyprocesses P such that S(P ) = empty and I(P ) = empty ie notsubject to IFC constraints are allowed to directly connectto or receive messages from connections on remote OS(eg through a socket) In order to connect to anothermachine a process must either 1) be able to declassifyto change its security context to S(P ) = empty and I(P ) = empty2) communicate through an intermediate trusted process

As such CamFlow contains an IFC-enabled fully-featured messaging middleware (CamFlow-MW) to bothfacilitate communication and guarantee enforcementacross machines For want of space we only considerthe middleware concepts relevant to IFC details on thegeneral middleware (as it was prior to IFCCamFlowintegration) can be found in [45] In short the middle-ware supports strongly-typed messages a range of in-teraction paradigms including request-reply broadcastand streams flexible resource discovery and securitymechanisms including access controls and encryptedcommunication A particular feature is its support for

9

CamFlow-MW CamFlow-MW CamFlow-MW CamFlow-MW CamFlow-MW

Process

CamFlow-LSM

Enforcement Point Pipe [empty I]

Process Process

SecurityContextManager

SecurityContextManager

Container

Container

Container

Process[S I]

CamFlow-LSM

[S I]

File[S I]

[S empty][S empty]Process[empty I]

Fig 5 CamFlow Architecture Labelled OS objects trusted processes and communication middleware

dynamic reconfiguration based on event-driven policyThis simplifies both application development and de-ployment as concerns can be abstracted and tailoredto the particular environment rather than embeddedwithin application code

The role of the middleware is to move towards contin-uous end-to-end data flow management such that IFCcan be enforced across applicationsmachines (kernels)There is work on IFC enforcement across machineshowever these impose specific requirements such asdesign-time considerations [46] a particular languageruntime [47] or constraints on system architectureimplementation [48] In contrast we integrate IFC func-tionality into the general fully featured distributed sys-tems middleware mentioned above (see [49]) to provideflexibility and be more generally applicable We delib-erately avoid imposing a structure on system designinstead integrating IFC functionality into the sort of com-munications infrastructure common to current enterpriseand cloud systems

51 Remote Interactions

CamFlow-MW operates by associating a trusted process(see sect42) with an entity that seeks to communicate viathe messaging system Its fit within the broader architec-ture is depicted in Fig 5 The process is responsible forhandling the communication of messages and enforcesIFC based on the current runtime labels of the entity onwhose behalf it operates

It follows that for IFC to be enforced across machinestags require system-wide management ie throughoutthe cloud service In [50] we proposed that the widelyused and available X509 certificates could be used Theapproach relies on public key certificates and attributecertificates [51] to respectively identify the applicationassociated with the CamFlow-MW instance and the tagsassociated with this application

As part of establishing a connection CamFlow-MWensures that each entity authorises communication withthe other according to a local access control policy De-cisions are based on component metadata the relevantauthentication aspects secured through PKI (certificates)Similarly IFC policy must also be verified to ensure dataflows are authorised according to IFC policy Attributecertificates provide cryptographic means to determineand verify the tags associated with the remote entity

(see [50]) on which policy can be enforced If tags donot accord the connection will not be established

We also see potential for remote attestation based onhardware integrity measures (see sect43) to be integratedinto this authorisation phase to ensure the remote ma-chine operates a reliable IFC enforcement regime

52 Message-Level Enforcement

CamFlow messages are strongly typed where a mes-sage type is defined by a schema describing its set ofattributes For an instance of a message an attributeconsists of a name type and value The support forIFC within messages is fine-grained in that individualattributes within messages can also be labelled Theseattribute labels introduce additional IFC constraints overand above those already applying to the entity ie asrecognised by the kernel-LSM and validated on connec-tion establishment

Labels can be defined within message type schemawhich sets the attributesrsquo IFC labels for all messageinstances of the type These labels cannot be changed byentities dealing in such messages and the entities musthold the requisite labels to interact with the attributesOtherwise the entity producingpublishing a messagecan set the security labels for the attributes (for those notpredefined) if the entity holds the associated privilegesEnforcement occurs as followsReceiving If the receiving entityrsquos labels do not agreewith those of an attribute value the attribute value (andany sub-attributes) are removed from (made null in) themessage This is enforced on message receipt before itis delivered to the entitySending An entity cannot send values for attributeswhere its labels do not agree with those of the attributeThis is enforced when an entity attempts to send amessage ensuring values for any attributes violating thispolicy are removed before message propagation

Enforcement is automatic meaning that applicationsusing the messaging system can be subject to IFC en-forcement completely transparently (ie without their di-rect involvement) though again there is the interface forthe application to actively manage IFC where requiredIn addition the general reconfiguration capabilities ofthe middleware enable connections between componentsto be defined and managed at runtime providing an-other mechanism for controlling communication [45]

10

0 100 200 300 400 500 600 700

Time in msIFC-overhead

Fig 6 IFC overhead of CamFlow-MW for a workloadtransmission of 5000 messages (x-axis in ms)53 Integrating With Persistent Storage

One technique to provide IFC with persistent data storesis to store the tags alongside the data and a trustedsoftware component ensures that when information isread from the store the corresponding labels are appliedIn Flume [14] a trusted process provides the interfacebetween untrusted applications and persistent storageMore recent work has seen the emergence of databasesthat natively understand IFC concepts and can enforceIFC policies [25]

We see much promise in having the middleware me-diate between persistence systems and the kernel toensure consistent IFC application

54 Evaluation

As shown in Fig 6 the results indicate that IFC en-forcement introduces an overhead of sim13 in perfor-mance time compared to the standard non IFC-enabledmiddleware (see [49] for details) Note that these resultswere measured in the context of a particular workloaddeliberately designed to highlight the impact of IFCenforcement It follows that the overheads associatedwith real-world usage are most likely less onerous

6 AUDIT DATA-CENTRIC LOGS

IFC in addition to providing strong assurances that pol-icy is being enforced can also provide a data-centric log[52] detailing the information flows within and betweensystem components In addition to enforcing IFC via ourLSM module we log the data flows of labelled processespolicy decisions privileges and IFC security contextmanipulations Equivalent inter-machine operations viathe middleware are also recorded

Cloud logging systems are generally based on legacylogging systems (OS web-server database etc) thateither fail to capture the needed information or areextremely complicated to interpret in a useful manner[53] More importantly such logs tend to be relevant onlyto the particular service or component which makes itdifficult if not impossible to audit across a range ofapplications clouds etc

IFC logs as provided by our platform allow us to cap-ture information on application-level data flows both at-tempted and permitted allowing the correct expressionand implementation of data flow policy to be checkedThis provides transparency allows for meaningful auditin terms of investigating the circumstances in which dataleakage occurs and provides evidence of complianceeg with legal obligations [54]

Information Exchange

CreateSecurity Context Change

e0

e1

e3e4

e7

e2 e5e6

P3[Sprimeprime I]

P2[Sprimeprime empty]

P3[S I]F1[S I]

P1[S I] P1[empty I]

Public[empty empty]F2[Sprime I prime]

Fig 7 Simplified audit graph from IFC OS execution (weomit metadata for readability) The path to disclosure isshown in bluepale61 Analysing Paths to Disclosure

To assist in interpreting log information we build a di-rected graph corresponding to the allowed flows duringthe execution of our system as shown in Fig 7 Theflows defined in our IFC model (see sect3) namely dataflow creation flow security context change and privi-lege delegation correspond to the edges of the directedgraph Entities (such as processes files messages etc)are represented as nodes in the graph

In addition to information necessary to build the graph(as shown in the figure) additional metadata is collectedfor forensic purposes which is contextentityevent-dependent These audit entries provided by the LSMand middleware can be exploited by a dedicated serviceimplemented in user space connected to the kernelcollection mechanism via relayfs [55] This service feedsfor example a graph visualisation tool such as Cytoscape[56] or a graph database such as Neo4J4

Such a directed graph helps one identify data leaksFor example a tenant might discover that some sensi-tive medical data leaked into a data store where onlyanonymised research data were supposed to be storedIFC is enforced in line with the policy encapsulated inlabels thus data may leak if such policy is improperly ex-pressed andor declassificationendorsement processesare not correctly implemented (eg if the anonymisationprocess in Fig 2 allows re-identification)

Suppose that an information leak is suspected be-tween different security contexts L1[S I] and L2[S

prime I prime]Determining whether such a leak can occur is equiv-alent to discovering whether there is a path in thegraph between the two contexts If the leak occurredthere must be a path between some entity Ei such thatS(Ei) = S and I(Ei) = I and another entity Fi such thatS(Fi) = Sprime and I(Fi) = I prime

The existence of such a path demonstrates that a leakis possible To investigate whether a leak occurred it isessential to consider the event ID associated with theedges comprising the path We denote by ei the last in-coming edge to the entity under investigation with labels[Sprime I prime] only edges such that e lt ei should be consideredWhen applied to all nodes along a path this rule ensuresstrictly monotonically increasing timestamps from thefirst node to the last Fig 7 shows in bluepale a possibledata disclosure path from file F2 from a very simple

4 httpneo4jcom

11

audit graph We know from the event IDs e0 and e1 thatthe data disclosure did not occur through file F1 andprocess P3 but through P1rsquos declassification

62 Demonstrating Compliance

Compliance with certain requirements can be demon-strated through queries over the graph We assume theaudit data is stored in a graph database that we canquery For example the following plain English policyldquoEuropean personal data sent to the US must be anonymisedrdquo[57] is equivalent to writing a query that verifies thatthere is no path between EU- and US-labelled datawithout an anonymisation processThe policy ldquoMedical data stored in database X must havereceived proper consent and be anonymisedrdquo [54] can beexpressed as a query verifying that there is no path be-tween data labelled as medical and the database withoutconsent-checking and anonymiser processes In additionan investigator may want to know which anonymisationalgorithm has been run which data has been used togenerate the anonymised records etc Our audit graphassists in answering such questions

Note that IFC only applies guarantees with respectto flows Demonstrating the overall effectiveness of themanagement regime eg the quality and suitability ofthe anonymisation algorithm is out-of-scope for a flow-based enforcement mechanism

63 Audit as lsquoBig Datarsquo

We are potentially generating a vast amount of data inour IFC logs However unlike standard system logs thatare complex to analyse our logs generate graphs thatare ideal for analysis by ldquobig datardquo tools that have beendeveloped for this purpose [58]

Since the amount of data is potentially huge theamount of data being logged can be fine-tuned to meetthe requirements of the platformtenant eg by reduc-ing the amount of metadata being stored by loggingonly security context changing operations by loggingonly information corresponding to some target securitycontext keeping operations on unlabelled entities out-side of the log etc The decision on what needs to belogged then becomes a tradeoff between utility and thevolume (cost) of log generated which can be decided inorder to correspond to legal or contractual requirements(for example a regulated sector may need to have afine-grained log to satisfy data forensic requirements)Indeed as such an approach is new to the cloud suchconsiderations will be refined by experience with bestpractices developing over time

64 Audit Access

Logs can contain sensitive information and access tothem should be controlled This represents an area ofour ongoing work Traditional access controls clearlyplay a role however secrecy tags could also be lever-aged For example an auditor before being grantedaccess to audit logs could be forced to demonstrate

ownership of the corresponding secrecy IFC tags (forexample through cryptographic means as in sect51) Theauditor may be granted access to a log entry only ifS(origin) cup S(destination) sube S(auditor)7 EXAMPLE SUPPORT FOR WEB SERVICES

One of the most common uses of PaaS is to host web ap-plications In this section we present the implementationof such a solution built on the infrastructure described insect4 in order to evaluate and demonstrate the feasibility ofour proposed approach This is illustrated in Fig 8 Werun standard and unmodified Ruby web applications

Interaction with end-users is achieved through aldquogatewayrdquo between the IFC and non-IFC worlds Simi-larly interaction with cloud services (such as data stores)is also achieved through our messaging middleware asdiscussed in sect5 The requirement for this gateway canbe removed if a trustworthy IFC implementation can beprovided at the client side consistent with the cloud im-plementation with respect to tag naming enforcementetc Tag naming in general system-wide is an issuebeyond the scope of this paper see further sect8 In ourproof of concept implementation the gateway is a simpleApache server running a custom-built module

The role of the gateway is to authenticate the end-userwhen a session is created and to associate this sessionwith an application instance running within the securitycontext corresponding to the user Recall that a securitycontext comprises the S and I labels Any further re-quests to the gateway in that session are routed to thecorresponding application instance Once an instance nolonger has an associated session it can be recycled usingself-checkpointing as described in sect411

Several application types are running over our cloud-based web services platform For example in a medicalcontext these might be medical record editing pharmacyordering social services etc A single shared identityservice for the end-user is part of the cloud provideroffering (in our proof of concept implementation weused OAuth [59])

The GP authenticates is authorised as treating doc-tor for Alice and selects the lsquoidentityrsquo that correspondsto Alice A new session is created server-side by thegateway with the requested application instance run-ning in the corresponding security context with S =[medical alice] I = [empty] When the GP wants to accessapplications on behalf of a new patient he needs to closeAlicersquos session authorise as treating doctor for Bob andopen a new session for Bob

The control described above is not achieved by theapplication but by the platform itself and can be con-trolled by the end-user subject to access control Thatis a medical application used on Alicersquos behalf runs ina security context in which data cannot flow to that ofanother patient Furthermore applications running onbehalf of a given user can share the data of that userwithout the risk of seeing a buggy application leakingdata between end-users

12

App Instance App InstanceApp Instance

CamFlow-MW

IFC Constrained IFC Constrained IFC Constrained

ContextA

ContextB

Public

Trusted IFC Gateway

Enforcing IFCData-store

Data-store Application

CamFlow-MW CamFlow-MWCamFlow-MW

IFC Cloud Platform

Dr Yellow Dr PinkDr MauveRemote Clients

Fig 8 PaaS Architecture on top of IFC-OS

As described in sect4 we assume the middleware andthe OS enforcement are provided as a service by theunderlying platform A tenant wanting to use the third-party web-service offering once his trust in the under-lying platform is established needs only to audit thegateway again the underlying infrastructure providercould either provide such a gateway or audit it Therest of the software stack of the third-party web-serviceprovider is bound by the IFC enforcement mechanismand therefore need not be trusted

8 CONCLUSION amp FUTURE WORK

IFC allows data flows to be controlled continuouslythroughout a system by providing an information-centric MAC scheme that continuously ensures non-interference between security contexts This paper pre-sented the CamFlow platform that demonstrates thepotential of cloud-deployed IFC as supporting (1) pro-tection of applications from each other (2) flexible datasharing across isolation boundaries (3) prevention ofdata leakage due to bugsmisconfigurations (4) exten-sion of access control beyond application boundaries (5)data flow transparency

Specifically we detailed a new kernel implementationof IFC as an LSM demonstrating low overhead evenfor worst-case scenarios where processes continuouslymake readwrite system calls We also described theintegration of a messaging middleware to enforcing IFCacross machines This combination makes it possible toprovide whole-system IFC to PaaS cloud services andtherefore also SaaS Our approach and implementationwere designed so that applications can run unchangedover IFC thus making cloud adoption feasible

We also indicated how the data-centric logs based onIFC enforcement could provide the means to audit anIFC-enabled system whereby a log can be processed asa directed graph to investigate leaks and attacks andshow compliance with data management requirementsThough this represents our initial work in the area thereappears much promise

In light of the above we believe that IFC has greatpotential as a security mechanism for the cloud wherebytrust in a few major cloud providers deploying IFCcan be built on to provide a demonstrably trustworthycomputing environment

CamFlow was developed with cloud deployment inmind Our future work will investigate the challenges ofa broader distributed context

It is already feasible to extend CamFlow to sup-port mobile environments Android supports the fullSELinux enforcement5 and an Android-LSM integrationhas been demonstrated [60] But when dealing withmultiple cloud services particularly as they become partof a wider distributed architecture such as in the IoTa trustworthy system-wide deployment of IFC can nolonger be assumed Much work remains on establishingtrust in (and the trustworthiness of) the IFC enforcementmechanism within end-usersrsquo devices Outside a cloudcontext all partiesrsquo trust in a common third partyrsquos en-forcement of IFC constraints cannot be assumed (unlikethe cloud provider for cloud services) We intend toexplore leveraging hardware roots of trust and remoteattestation to ensure the integrity and trustworthiness ofIFC enforcement mechanisms

Another area of investigation concerns the represen-tation of tags across administrative domains In a cloudcontext a federated approach can be envisaged wherea common understanding of tags could be negotiatedacross multiple domains For example in [54] we dis-cussed initial thoughts on managing data according tospecific obligation regimes However as the numberof administrative domains increases both a global tagnaming scheme and mechanisms for ad-hoc negotiationbecome necessary

Related is the sensitivity of the tags themselvesKnowledge of the meaning of a tag can indicate thatthe associated entity contains or deals with certain infor-mation If a relationship can be established between anentity and the information owner this may disclose pri-vate information about the information owner This maylead to work on a need-to-know negotiation mechanismto establish a secure channel between hosts especiallyin a wide-scale distributed system A promising mech-anism is private set intersection to determine tagsrsquo subsetrelationship (sect341)

Another challenge concerns extending audit to dis-tributed architectures both in terms of resource man-agement and regulating access to log data

5 httpssourceandroidcomdevicestechsecurityselinux

13

ACKNOWLEDGMENTS

This work was supported by UK Engineering andPhysical Sciences Research Council grant EPK011510CloudSafetyNet End-to-End Application Security inthe Cloud We acknowledge the support of Microsoftthrough the Microsoft Cloud Computing Research Cen-tre

REFERENCES

[1] P Barham B Dragovic K Fraser S Hand T Harris A HoR Neugebauer I Pratt and A Warfield ldquoXen and the art ofvirtualizationrdquo SIGOPS Operating Systems Review vol 37 no 5pp 164ndash177 2003

[2] A Kivity Y Kamay D Laor U Lublin and A Liguori ldquokvmthe Linux Virtual Machine Monitorrdquo in Proceedings of the LinuxSymposium vol 1 2007 pp 225ndash230

[3] D Bernstein ldquoContainers and Cloud From LXC to Docker toKubernetesrdquo IEEE Cloud Computing Magazine no 3 pp 81ndash842014

[4] P Desnoyers O Krieger B Holden and J Hennessey ldquoUsingOpenStack for an Open Cloud eXchange (OCX)rdquo in InternationalConference on Cloud Engineering (IC2E) IEEE 2015

[5] J Mineraud O Mazhelis X Su and S Tarkoma ldquoA Gap Analysisof Internet-of-Things Platformsrdquo 2015 Arxiv arXiv150201181[Online] Available httparxivorgabs150201181

[6] C J Millard Ed Cloud Computing Law Oxford University Press2013

[7] S Pearson and M Casassa-Mont ldquoSticky Policies An Approachfor Managing Privacy across Multiple Partiesrdquo Computer vol 44July 2011

[8] D E Denning ldquoA lattice model of secure information flowrdquoCommunication of the ACM vol 19 no 5 pp 236ndash243 1976

[9] D E Bell and L J LaPadula ldquoSecure Computer Systems Mathe-matical Foundations and Modelrdquo The MITRE Corp Bedford MATech Rep M74-244 1973

[10] K J Biba ldquoIntegrity Considerations for Secure Computer Sys-temsrdquo MITRE Corp Tech Rep ESD-TR 76-372 1977

[11] A C Myers and B Liskov ldquoA Decentralized Model for Infor-mation Flow Controlrdquo in 17th Symposium on Operating SystemsPrinciples (SOSP) ACM 1997 pp 129ndash142

[12] A Bouard B Weyl and C Eckert ldquoPractical information-flowaware middleware for in-car communicationrdquo in Workshop onSecurity privacy amp dependability for cyber vehicles ACM 2013pp 3ndash8

[13] K Singh S Bhola and W Lee ldquoxBook Redesigning PrivacyControl in Social Networking Platformsrdquo in Security SymposiumUSENIX 2009 pp 249ndash266

[14] M Krohn A Yip M Brodsky N Cliffer M F KaashoekE Kohler and R Morris ldquoInformation Flow Control for StandardOS Abstractionsrdquo in Symposium on Operating Systems PrinciplesACM 2007 pp 321ndash334

[15] S Lee E L Wong D Goel M Dahlin and V ShmatikovldquoπBox A Platform for Privacy-Preserving Appsrdquo in Symposiumon Networked System Design and Implementation USENIX 2013pp 501ndash514

[16] N Kumar and R Shyamasundar ldquoRealizing Purpose-Based Pri-vacy Policies Succinctly via Information-Flow Labelsrdquo in Big Dataand Cloud Computing (BDCloudrsquo14) IEEE 2014 pp 753ndash760

[17] W Enck P Gilbert B-G Chun L P Cox J Jung P McDaniel andA N Sheth ldquoTaintDroid An information-flow tracking systemfor realtime privacy monitoring on smartphonesrdquo in Conference onOperating systems design and implementation (OSDIrsquo10) USENIX2010 pp 1ndash6

[18] I Papagiannis M Migliavacca and P Pietzuch ldquoPHP AspisUsing partial taint tracking to protect against injection attacksrdquoin 2nd USENIX Conference on Web Application Development 2011p 13

[19] E Schwartz T Avgerinos and D Brumley ldquoAll you ever wantedto know about dynamic taint analysis and forward symbolicexecution (but might have been afraid to ask)rdquo in Symposium onSecurity and Privacy IEEE 2010

[20] M Casassa-Mont S Pearson and P Bramhall ldquoTowards account-able management of identity and privacy Sticky policies andenforceable tracing servicesrdquo in Database and Expert Systems Ap-plications 2003 Proceedings 14th International Workshop on IEEE2003 pp 377ndash382

[21] S Bandhakavi C C Zhang and M Winslett ldquoSuper-sticky anddeclassifiable release policies for flexible information dissemina-tion controlrdquo in Proceedings of the 5th ACM workshop on Privacy inelectronic society ACM 2006 pp 51ndash58

[22] D W Chadwick and S F Lievens ldquoEnforcing sticky securitypolicies throughout a distributed applicationrdquo in Proceedings ofthe 2008 workshop on Middleware security ACM 2008 pp 1ndash6

[23] M Casassa-Mont I Matteucci M Petrocchi and M L SbodioldquoTowards Safer Information Sharing in the Cloudrdquo InternationalJournal of Information Security pp 1ndash16 2014

[24] S Akoush L Carata R Sohan and A Hopper ldquoMrLazy LazyRuntime Label Propagation for MapReducerdquo in 6th Workshop onHot Topics in Cloud Computing (HotCloud) USENIX 2014

[25] D Schultz and B Liskov ldquoIFDB Decentralized Information FlowControl for Databasesrdquo in European Conference on Computer Sys-tems (Eurosysrsquo13) ACM 2013 pp 43ndash56

[26] D Von Oheimb ldquoInformation Flow Control Revisited Nonin-fluence = Noninterference + Nonleakagerdquo in Computer SecurityndashESORICS 2004 Springer 2004 pp 225ndash243

[27] D Hedin and A Sabelfeld ldquoA perspective on Information-FlowControlrdquo NATO 2012

[28] D E Porter M D Bond I Roy K S McKinley and E WitchelldquoPractical Fine-Grained Information Flow Control Using Lam-inarrdquo ACM Transactions on Programming Languages and Systems(TOPLAS) vol 37 no 1 p 4 2014

[29] C Wright C Cowan S Smalley J Morris and G Kroah-Hartman ldquoLinux Security Modules General security support forthe Linux kernelrdquo in Foundations of Intrusion Tolerant SystemsIEEE 2003 pp 213ndash213

[30] M Quaritsch and T Winkler ldquoLinux Security Modules Enhance-ments Module Stacking Framework and TCP State TransitionHooks for State-Driven NIDSrdquo Secure Information and Communi-cation vol 7 pp 7ndash13 2004

[31] C Schaufler ldquoLSM Generalize existing module stackingrdquo LinuxWeekly News 2014

[32] S Smalley C Vance and W Salamon ldquoImplementing SELinux asa Linux Security Modulerdquo NAI Labs Report vol 1 p 43 2001

[33] M Bauer ldquoParanoid Penguin an Introduction to Novell AppAr-morrdquo Linux Journal vol 2006 no 148 p 13 2006

[34] A Edwards T Jaeger and X Zhang ldquoRuntime verification ofauthorization hook placement for the Linux Security Modulesframeworkrdquo in Conference on Computer and Communications Secu-rity ACM 2002 pp 225ndash234

[35] T Jaeger A Edwards and X Zhang ldquoConsistency analysis ofauthorization hook placement in the Linux security modulesframeworkrdquo ACM Transactions on Information and System Security(TISSEC) vol 7 no 2 pp 175ndash205 2004

[36] V Ganapathy T Jaeger and S Jha ldquoAutomatic placement ofauthorization hooks in the Linux security modules frameworkrdquoin Conference on Computer and Communications Security ACM2005 pp 330ndash339

[37] I P Egwutuoha D Levy B Selic and S Chen ldquoA survey of faulttolerance mechanisms and checkpointrestart implementationsfor high performance computing systemsrdquo Springer 2013vol 65 no 3 pp 1302ndash1326

[38] O Laadan and S E Hallyn ldquoLinux-CR Transparent applicationcheckpoint-restart in Linuxrdquo in Linux Symposium 2010 p 159

[39] B Niu and G Tan ldquoEfficient User-space Information Flow Con-trolrdquo in Symposium on Information Computer and CommunicationsSecurity (SIGSACrsquo13) ACM 2013 pp 131ndash142

[40] T Bird ldquoMeasuring Function Duration with ftracerdquo in Japan LinuxSymposium 2009 pp 47ndash54

[41] B Parno ldquoBootstrapping Trust in ardquo Trustedrdquo Platformrdquo in Con-ference on Hot Topics in Security (HotSecrsquo08) USENIX 2008

[42] N Santos K P Gummadi and R Rodrigues ldquoTowards TrustedCloud Computingrdquo in Conference on Hot Topics in Cloud Comput-ing USENIX 2009 pp 3ndash3

[43] S Berger R Caceres K A Goldman R Perez R Sailer andL van Doorn ldquovTPM Virtualizing the Trusted Platform Modulerdquoin Security Symposium USENIX 2006 pp 305ndash320

14

[44] S Berger K Goldman D Pendarakis D Safford E Valdezand M Zohar ldquoScalable Attestation A Step Toward Secure andTrusted Cloudsrdquo in International Conference on Cloud Engineering(IC2E) IEEE 2015

[45] J Singh D Eyers and J Bacon ldquoPolicy Enforcement withinEmerging Distributed Event-Based Systemsrdquo in ACM DistributedEvent-Based Systems (DEBSrsquo14) 2014

[46] L Sfaxi T Abdellatif R Robbana and Y Lakhnech ldquoInformationFlow Control of Component-based Distributed Systemsrdquo Concur-rency and Computation Practice and Experience vol 25 no 2 pp161ndash179 2013

[47] S Yoshihama T Yoshizawa Y Watanabe M Kudoh and K Oy-anagi ldquoDynamic Information Flow Control Architecture for WebApplicationsrdquo in ESORICS 2007 J Biskup and J Lopez EdsSpringer 2007 vol LNCS 4734 pp 267ndash282

[48] W Cheng D R K Ports D Schultz V Popic A BlanksteinJ Cowling D Curtis L Shrira and B Liskov ldquoAbstractions forUsable Information Flow Control in Aeolusrdquo in USENIX AnnualTechnical Conference Boston 2012

[49] J Singh T Pasquier J Bacon and D Eyers ldquoIntegrating Middle-ware and Information Flow Controlrdquo in International Conferenceon Cloud Engineering (IC2E) IEEE 2015 pp 54ndash59

[50] J Singh T F J-M Pasquier and J Bacon ldquoSecuring Tags toControl Information Flows within the Internet of Thingsrdquo inInternational Conference on Recent Advances in Internet of Things(RIoTrsquo15) IEEE 2015

[51] J S Park and R Sandhu ldquoBinding Identities and Attributes UsingDigitally Signed Certificatesrdquo in Annual Conference on ComputerSecurity Applications IEEE 2000 pp 120ndash127

[52] A Ganjali and D Lie ldquoAuditing cloud management using infor-mation flow trackingrdquo in Workshop on Scalable Trusted ComputingACM 2012 pp 79ndash84

[53] R K Ko M Kirchberg and B S Lee ldquoFrom System-centric toData-centric Logging-accountability Trust amp Security in CloudComputingrdquo in Defense Science Research Conference and Expo (DSR)2011 IEEE 2011 pp 1ndash4

[54] J Singh J Powles T Pasquier and J Bacon ldquoData Flow Man-agement and Compliance in Cloud Computingrdquo IEEE CloudComputing Magazine SI on Legal Clouds 2015

[55] T Zanussi K Yaghmour R Wisniewski R Moore and M Dage-nais ldquorelayfs An efficient unified approach for transmitting datafrom kernel to user spacerdquo in Linux Symposium 2003 p 494

[56] M E Smoot K Ono J Ruscheinski P-L Wang and T IdekerldquoCytoscape 28 New features for data integration and networkvisualizationrdquo Oxford University Press 2011 vol 27 no 3 pp431ndash432

[57] T Pasquier and J Powles ldquoExpressing and Enforcing LocationRequirements in the Cloud using Information Flow Controlrdquo inIC2E International Workshop on Legal and Technical Issues in CloudComputing (CLawrsquo15) IEEE 2015

[58] R Angles and C Gutierrez ldquoSurvey of Graph Database ModelsrdquoACM Computing Surveys (CSUR) vol 40 no 1 p 1 2008

[59] D Hardt ldquoThe OAuth 20 Authorization Frameworkrdquo IETF TechRep RFC 6749 2012

[60] S Smalley and R Craig ldquoSecurity Enhanced (SE) Android Bring-ing Flexible MAC to Androidrdquo in Network and Distributed SystemSecurity Symposium Internet Society 2013

  • 1 Introduction and Motivation
  • 2 Background
    • 21 IFC Models
    • 22 Protection via VMs and Containers
    • 23 Taint Tracking (TT) Systems
    • 24 Sticky Policies
      • 3 CamFlow-Model IFC for the Cloud
        • 31 Tags and Labels
        • 32 Decentralised Privileges and Security Contexts
        • 33 Creating a New Entity
        • 34 Security
          • 341 Information Exchange
          • 342 Label Change
          • 343 Privilege delegation
            • 35 Conflict of Interest
              • 4 OS Enforcement
                • 41 CamFlow-LSM
                  • 411 Checkpointing and Restoration
                  • 412 OS Evaluation
                    • 42 Trusted Processes
                    • 43 Leveraging Hardware Roots of Trust
                      • 5 Cross-Machine Enforcement
                        • 51 Remote Interactions
                        • 52 Message-Level Enforcement
                        • 53 Integrating With Persistent Storage
                        • 54 Evaluation
                          • 6 Audit Data-Centric Logs
                            • 61 Analysing Paths to Disclosure
                            • 62 Demonstrating Compliance
                            • 63 Audit as `Big Data
                            • 64 Audit Access
                              • 7 Example Support for Web Services
                              • 8 Conclusion amp Future Work
                              • References
Page 9: CamFlow: Managed data-sharing for cloud services · CamFlow: Managed data-sharing for cloud services Thomas F. J.-M. Pasquier, Member, IEEE, Jatinder Singh, Member, IEEE, David Eyers,

9

CamFlow-MW CamFlow-MW CamFlow-MW CamFlow-MW CamFlow-MW

Process

CamFlow-LSM

Enforcement Point Pipe [empty I]

Process Process

SecurityContextManager

SecurityContextManager

Container

Container

Container

Process[S I]

CamFlow-LSM

[S I]

File[S I]

[S empty][S empty]Process[empty I]

Fig 5 CamFlow Architecture Labelled OS objects trusted processes and communication middleware

dynamic reconfiguration based on event-driven policyThis simplifies both application development and de-ployment as concerns can be abstracted and tailoredto the particular environment rather than embeddedwithin application code

The role of the middleware is to move towards contin-uous end-to-end data flow management such that IFCcan be enforced across applicationsmachines (kernels)There is work on IFC enforcement across machineshowever these impose specific requirements such asdesign-time considerations [46] a particular languageruntime [47] or constraints on system architectureimplementation [48] In contrast we integrate IFC func-tionality into the general fully featured distributed sys-tems middleware mentioned above (see [49]) to provideflexibility and be more generally applicable We delib-erately avoid imposing a structure on system designinstead integrating IFC functionality into the sort of com-munications infrastructure common to current enterpriseand cloud systems

51 Remote Interactions

CamFlow-MW operates by associating a trusted process(see sect42) with an entity that seeks to communicate viathe messaging system Its fit within the broader architec-ture is depicted in Fig 5 The process is responsible forhandling the communication of messages and enforcesIFC based on the current runtime labels of the entity onwhose behalf it operates

It follows that for IFC to be enforced across machinestags require system-wide management ie throughoutthe cloud service In [50] we proposed that the widelyused and available X509 certificates could be used Theapproach relies on public key certificates and attributecertificates [51] to respectively identify the applicationassociated with the CamFlow-MW instance and the tagsassociated with this application

As part of establishing a connection CamFlow-MWensures that each entity authorises communication withthe other according to a local access control policy De-cisions are based on component metadata the relevantauthentication aspects secured through PKI (certificates)Similarly IFC policy must also be verified to ensure dataflows are authorised according to IFC policy Attributecertificates provide cryptographic means to determineand verify the tags associated with the remote entity

(see [50]) on which policy can be enforced If tags donot accord the connection will not be established

We also see potential for remote attestation based onhardware integrity measures (see sect43) to be integratedinto this authorisation phase to ensure the remote ma-chine operates a reliable IFC enforcement regime

52 Message-Level Enforcement

CamFlow messages are strongly typed where a mes-sage type is defined by a schema describing its set ofattributes For an instance of a message an attributeconsists of a name type and value The support forIFC within messages is fine-grained in that individualattributes within messages can also be labelled Theseattribute labels introduce additional IFC constraints overand above those already applying to the entity ie asrecognised by the kernel-LSM and validated on connec-tion establishment

Labels can be defined within message type schemawhich sets the attributesrsquo IFC labels for all messageinstances of the type These labels cannot be changed byentities dealing in such messages and the entities musthold the requisite labels to interact with the attributesOtherwise the entity producingpublishing a messagecan set the security labels for the attributes (for those notpredefined) if the entity holds the associated privilegesEnforcement occurs as followsReceiving If the receiving entityrsquos labels do not agreewith those of an attribute value the attribute value (andany sub-attributes) are removed from (made null in) themessage This is enforced on message receipt before itis delivered to the entitySending An entity cannot send values for attributeswhere its labels do not agree with those of the attributeThis is enforced when an entity attempts to send amessage ensuring values for any attributes violating thispolicy are removed before message propagation

Enforcement is automatic meaning that applicationsusing the messaging system can be subject to IFC en-forcement completely transparently (ie without their di-rect involvement) though again there is the interface forthe application to actively manage IFC where requiredIn addition the general reconfiguration capabilities ofthe middleware enable connections between componentsto be defined and managed at runtime providing an-other mechanism for controlling communication [45]

10

0 100 200 300 400 500 600 700

Time in msIFC-overhead

Fig 6 IFC overhead of CamFlow-MW for a workloadtransmission of 5000 messages (x-axis in ms)53 Integrating With Persistent Storage

One technique to provide IFC with persistent data storesis to store the tags alongside the data and a trustedsoftware component ensures that when information isread from the store the corresponding labels are appliedIn Flume [14] a trusted process provides the interfacebetween untrusted applications and persistent storageMore recent work has seen the emergence of databasesthat natively understand IFC concepts and can enforceIFC policies [25]

We see much promise in having the middleware me-diate between persistence systems and the kernel toensure consistent IFC application

54 Evaluation

As shown in Fig 6 the results indicate that IFC en-forcement introduces an overhead of sim13 in perfor-mance time compared to the standard non IFC-enabledmiddleware (see [49] for details) Note that these resultswere measured in the context of a particular workloaddeliberately designed to highlight the impact of IFCenforcement It follows that the overheads associatedwith real-world usage are most likely less onerous

6 AUDIT DATA-CENTRIC LOGS

IFC in addition to providing strong assurances that pol-icy is being enforced can also provide a data-centric log[52] detailing the information flows within and betweensystem components In addition to enforcing IFC via ourLSM module we log the data flows of labelled processespolicy decisions privileges and IFC security contextmanipulations Equivalent inter-machine operations viathe middleware are also recorded

Cloud logging systems are generally based on legacylogging systems (OS web-server database etc) thateither fail to capture the needed information or areextremely complicated to interpret in a useful manner[53] More importantly such logs tend to be relevant onlyto the particular service or component which makes itdifficult if not impossible to audit across a range ofapplications clouds etc

IFC logs as provided by our platform allow us to cap-ture information on application-level data flows both at-tempted and permitted allowing the correct expressionand implementation of data flow policy to be checkedThis provides transparency allows for meaningful auditin terms of investigating the circumstances in which dataleakage occurs and provides evidence of complianceeg with legal obligations [54]

Information Exchange

CreateSecurity Context Change

e0

e1

e3e4

e7

e2 e5e6

P3[Sprimeprime I]

P2[Sprimeprime empty]

P3[S I]F1[S I]

P1[S I] P1[empty I]

Public[empty empty]F2[Sprime I prime]

Fig 7 Simplified audit graph from IFC OS execution (weomit metadata for readability) The path to disclosure isshown in bluepale61 Analysing Paths to Disclosure

To assist in interpreting log information we build a di-rected graph corresponding to the allowed flows duringthe execution of our system as shown in Fig 7 Theflows defined in our IFC model (see sect3) namely dataflow creation flow security context change and privi-lege delegation correspond to the edges of the directedgraph Entities (such as processes files messages etc)are represented as nodes in the graph

In addition to information necessary to build the graph(as shown in the figure) additional metadata is collectedfor forensic purposes which is contextentityevent-dependent These audit entries provided by the LSMand middleware can be exploited by a dedicated serviceimplemented in user space connected to the kernelcollection mechanism via relayfs [55] This service feedsfor example a graph visualisation tool such as Cytoscape[56] or a graph database such as Neo4J4

Such a directed graph helps one identify data leaksFor example a tenant might discover that some sensi-tive medical data leaked into a data store where onlyanonymised research data were supposed to be storedIFC is enforced in line with the policy encapsulated inlabels thus data may leak if such policy is improperly ex-pressed andor declassificationendorsement processesare not correctly implemented (eg if the anonymisationprocess in Fig 2 allows re-identification)

Suppose that an information leak is suspected be-tween different security contexts L1[S I] and L2[S

prime I prime]Determining whether such a leak can occur is equiv-alent to discovering whether there is a path in thegraph between the two contexts If the leak occurredthere must be a path between some entity Ei such thatS(Ei) = S and I(Ei) = I and another entity Fi such thatS(Fi) = Sprime and I(Fi) = I prime

The existence of such a path demonstrates that a leakis possible To investigate whether a leak occurred it isessential to consider the event ID associated with theedges comprising the path We denote by ei the last in-coming edge to the entity under investigation with labels[Sprime I prime] only edges such that e lt ei should be consideredWhen applied to all nodes along a path this rule ensuresstrictly monotonically increasing timestamps from thefirst node to the last Fig 7 shows in bluepale a possibledata disclosure path from file F2 from a very simple

4 httpneo4jcom

11

audit graph We know from the event IDs e0 and e1 thatthe data disclosure did not occur through file F1 andprocess P3 but through P1rsquos declassification

62 Demonstrating Compliance

Compliance with certain requirements can be demon-strated through queries over the graph We assume theaudit data is stored in a graph database that we canquery For example the following plain English policyldquoEuropean personal data sent to the US must be anonymisedrdquo[57] is equivalent to writing a query that verifies thatthere is no path between EU- and US-labelled datawithout an anonymisation processThe policy ldquoMedical data stored in database X must havereceived proper consent and be anonymisedrdquo [54] can beexpressed as a query verifying that there is no path be-tween data labelled as medical and the database withoutconsent-checking and anonymiser processes In additionan investigator may want to know which anonymisationalgorithm has been run which data has been used togenerate the anonymised records etc Our audit graphassists in answering such questions

Note that IFC only applies guarantees with respectto flows Demonstrating the overall effectiveness of themanagement regime eg the quality and suitability ofthe anonymisation algorithm is out-of-scope for a flow-based enforcement mechanism

63 Audit as lsquoBig Datarsquo

We are potentially generating a vast amount of data inour IFC logs However unlike standard system logs thatare complex to analyse our logs generate graphs thatare ideal for analysis by ldquobig datardquo tools that have beendeveloped for this purpose [58]

Since the amount of data is potentially huge theamount of data being logged can be fine-tuned to meetthe requirements of the platformtenant eg by reduc-ing the amount of metadata being stored by loggingonly security context changing operations by loggingonly information corresponding to some target securitycontext keeping operations on unlabelled entities out-side of the log etc The decision on what needs to belogged then becomes a tradeoff between utility and thevolume (cost) of log generated which can be decided inorder to correspond to legal or contractual requirements(for example a regulated sector may need to have afine-grained log to satisfy data forensic requirements)Indeed as such an approach is new to the cloud suchconsiderations will be refined by experience with bestpractices developing over time

64 Audit Access

Logs can contain sensitive information and access tothem should be controlled This represents an area ofour ongoing work Traditional access controls clearlyplay a role however secrecy tags could also be lever-aged For example an auditor before being grantedaccess to audit logs could be forced to demonstrate

ownership of the corresponding secrecy IFC tags (forexample through cryptographic means as in sect51) Theauditor may be granted access to a log entry only ifS(origin) cup S(destination) sube S(auditor)7 EXAMPLE SUPPORT FOR WEB SERVICES

One of the most common uses of PaaS is to host web ap-plications In this section we present the implementationof such a solution built on the infrastructure described insect4 in order to evaluate and demonstrate the feasibility ofour proposed approach This is illustrated in Fig 8 Werun standard and unmodified Ruby web applications

Interaction with end-users is achieved through aldquogatewayrdquo between the IFC and non-IFC worlds Simi-larly interaction with cloud services (such as data stores)is also achieved through our messaging middleware asdiscussed in sect5 The requirement for this gateway canbe removed if a trustworthy IFC implementation can beprovided at the client side consistent with the cloud im-plementation with respect to tag naming enforcementetc Tag naming in general system-wide is an issuebeyond the scope of this paper see further sect8 In ourproof of concept implementation the gateway is a simpleApache server running a custom-built module

The role of the gateway is to authenticate the end-userwhen a session is created and to associate this sessionwith an application instance running within the securitycontext corresponding to the user Recall that a securitycontext comprises the S and I labels Any further re-quests to the gateway in that session are routed to thecorresponding application instance Once an instance nolonger has an associated session it can be recycled usingself-checkpointing as described in sect411

Several application types are running over our cloud-based web services platform For example in a medicalcontext these might be medical record editing pharmacyordering social services etc A single shared identityservice for the end-user is part of the cloud provideroffering (in our proof of concept implementation weused OAuth [59])

The GP authenticates is authorised as treating doc-tor for Alice and selects the lsquoidentityrsquo that correspondsto Alice A new session is created server-side by thegateway with the requested application instance run-ning in the corresponding security context with S =[medical alice] I = [empty] When the GP wants to accessapplications on behalf of a new patient he needs to closeAlicersquos session authorise as treating doctor for Bob andopen a new session for Bob

The control described above is not achieved by theapplication but by the platform itself and can be con-trolled by the end-user subject to access control Thatis a medical application used on Alicersquos behalf runs ina security context in which data cannot flow to that ofanother patient Furthermore applications running onbehalf of a given user can share the data of that userwithout the risk of seeing a buggy application leakingdata between end-users

12

App Instance App InstanceApp Instance

CamFlow-MW

IFC Constrained IFC Constrained IFC Constrained

ContextA

ContextB

Public

Trusted IFC Gateway

Enforcing IFCData-store

Data-store Application

CamFlow-MW CamFlow-MWCamFlow-MW

IFC Cloud Platform

Dr Yellow Dr PinkDr MauveRemote Clients

Fig 8 PaaS Architecture on top of IFC-OS

As described in sect4 we assume the middleware andthe OS enforcement are provided as a service by theunderlying platform A tenant wanting to use the third-party web-service offering once his trust in the under-lying platform is established needs only to audit thegateway again the underlying infrastructure providercould either provide such a gateway or audit it Therest of the software stack of the third-party web-serviceprovider is bound by the IFC enforcement mechanismand therefore need not be trusted

8 CONCLUSION amp FUTURE WORK

IFC allows data flows to be controlled continuouslythroughout a system by providing an information-centric MAC scheme that continuously ensures non-interference between security contexts This paper pre-sented the CamFlow platform that demonstrates thepotential of cloud-deployed IFC as supporting (1) pro-tection of applications from each other (2) flexible datasharing across isolation boundaries (3) prevention ofdata leakage due to bugsmisconfigurations (4) exten-sion of access control beyond application boundaries (5)data flow transparency

Specifically we detailed a new kernel implementationof IFC as an LSM demonstrating low overhead evenfor worst-case scenarios where processes continuouslymake readwrite system calls We also described theintegration of a messaging middleware to enforcing IFCacross machines This combination makes it possible toprovide whole-system IFC to PaaS cloud services andtherefore also SaaS Our approach and implementationwere designed so that applications can run unchangedover IFC thus making cloud adoption feasible

We also indicated how the data-centric logs based onIFC enforcement could provide the means to audit anIFC-enabled system whereby a log can be processed asa directed graph to investigate leaks and attacks andshow compliance with data management requirementsThough this represents our initial work in the area thereappears much promise

In light of the above we believe that IFC has greatpotential as a security mechanism for the cloud wherebytrust in a few major cloud providers deploying IFCcan be built on to provide a demonstrably trustworthycomputing environment

CamFlow was developed with cloud deployment inmind Our future work will investigate the challenges ofa broader distributed context

It is already feasible to extend CamFlow to sup-port mobile environments Android supports the fullSELinux enforcement5 and an Android-LSM integrationhas been demonstrated [60] But when dealing withmultiple cloud services particularly as they become partof a wider distributed architecture such as in the IoTa trustworthy system-wide deployment of IFC can nolonger be assumed Much work remains on establishingtrust in (and the trustworthiness of) the IFC enforcementmechanism within end-usersrsquo devices Outside a cloudcontext all partiesrsquo trust in a common third partyrsquos en-forcement of IFC constraints cannot be assumed (unlikethe cloud provider for cloud services) We intend toexplore leveraging hardware roots of trust and remoteattestation to ensure the integrity and trustworthiness ofIFC enforcement mechanisms

Another area of investigation concerns the represen-tation of tags across administrative domains In a cloudcontext a federated approach can be envisaged wherea common understanding of tags could be negotiatedacross multiple domains For example in [54] we dis-cussed initial thoughts on managing data according tospecific obligation regimes However as the numberof administrative domains increases both a global tagnaming scheme and mechanisms for ad-hoc negotiationbecome necessary

Related is the sensitivity of the tags themselvesKnowledge of the meaning of a tag can indicate thatthe associated entity contains or deals with certain infor-mation If a relationship can be established between anentity and the information owner this may disclose pri-vate information about the information owner This maylead to work on a need-to-know negotiation mechanismto establish a secure channel between hosts especiallyin a wide-scale distributed system A promising mech-anism is private set intersection to determine tagsrsquo subsetrelationship (sect341)

Another challenge concerns extending audit to dis-tributed architectures both in terms of resource man-agement and regulating access to log data

5 httpssourceandroidcomdevicestechsecurityselinux

13

ACKNOWLEDGMENTS

This work was supported by UK Engineering andPhysical Sciences Research Council grant EPK011510CloudSafetyNet End-to-End Application Security inthe Cloud We acknowledge the support of Microsoftthrough the Microsoft Cloud Computing Research Cen-tre

REFERENCES

[1] P Barham B Dragovic K Fraser S Hand T Harris A HoR Neugebauer I Pratt and A Warfield ldquoXen and the art ofvirtualizationrdquo SIGOPS Operating Systems Review vol 37 no 5pp 164ndash177 2003

[2] A Kivity Y Kamay D Laor U Lublin and A Liguori ldquokvmthe Linux Virtual Machine Monitorrdquo in Proceedings of the LinuxSymposium vol 1 2007 pp 225ndash230

[3] D Bernstein ldquoContainers and Cloud From LXC to Docker toKubernetesrdquo IEEE Cloud Computing Magazine no 3 pp 81ndash842014

[4] P Desnoyers O Krieger B Holden and J Hennessey ldquoUsingOpenStack for an Open Cloud eXchange (OCX)rdquo in InternationalConference on Cloud Engineering (IC2E) IEEE 2015

[5] J Mineraud O Mazhelis X Su and S Tarkoma ldquoA Gap Analysisof Internet-of-Things Platformsrdquo 2015 Arxiv arXiv150201181[Online] Available httparxivorgabs150201181

[6] C J Millard Ed Cloud Computing Law Oxford University Press2013

[7] S Pearson and M Casassa-Mont ldquoSticky Policies An Approachfor Managing Privacy across Multiple Partiesrdquo Computer vol 44July 2011

[8] D E Denning ldquoA lattice model of secure information flowrdquoCommunication of the ACM vol 19 no 5 pp 236ndash243 1976

[9] D E Bell and L J LaPadula ldquoSecure Computer Systems Mathe-matical Foundations and Modelrdquo The MITRE Corp Bedford MATech Rep M74-244 1973

[10] K J Biba ldquoIntegrity Considerations for Secure Computer Sys-temsrdquo MITRE Corp Tech Rep ESD-TR 76-372 1977

[11] A C Myers and B Liskov ldquoA Decentralized Model for Infor-mation Flow Controlrdquo in 17th Symposium on Operating SystemsPrinciples (SOSP) ACM 1997 pp 129ndash142

[12] A Bouard B Weyl and C Eckert ldquoPractical information-flowaware middleware for in-car communicationrdquo in Workshop onSecurity privacy amp dependability for cyber vehicles ACM 2013pp 3ndash8

[13] K Singh S Bhola and W Lee ldquoxBook Redesigning PrivacyControl in Social Networking Platformsrdquo in Security SymposiumUSENIX 2009 pp 249ndash266

[14] M Krohn A Yip M Brodsky N Cliffer M F KaashoekE Kohler and R Morris ldquoInformation Flow Control for StandardOS Abstractionsrdquo in Symposium on Operating Systems PrinciplesACM 2007 pp 321ndash334

[15] S Lee E L Wong D Goel M Dahlin and V ShmatikovldquoπBox A Platform for Privacy-Preserving Appsrdquo in Symposiumon Networked System Design and Implementation USENIX 2013pp 501ndash514

[16] N Kumar and R Shyamasundar ldquoRealizing Purpose-Based Pri-vacy Policies Succinctly via Information-Flow Labelsrdquo in Big Dataand Cloud Computing (BDCloudrsquo14) IEEE 2014 pp 753ndash760

[17] W Enck P Gilbert B-G Chun L P Cox J Jung P McDaniel andA N Sheth ldquoTaintDroid An information-flow tracking systemfor realtime privacy monitoring on smartphonesrdquo in Conference onOperating systems design and implementation (OSDIrsquo10) USENIX2010 pp 1ndash6

[18] I Papagiannis M Migliavacca and P Pietzuch ldquoPHP AspisUsing partial taint tracking to protect against injection attacksrdquoin 2nd USENIX Conference on Web Application Development 2011p 13

[19] E Schwartz T Avgerinos and D Brumley ldquoAll you ever wantedto know about dynamic taint analysis and forward symbolicexecution (but might have been afraid to ask)rdquo in Symposium onSecurity and Privacy IEEE 2010

[20] M Casassa-Mont S Pearson and P Bramhall ldquoTowards account-able management of identity and privacy Sticky policies andenforceable tracing servicesrdquo in Database and Expert Systems Ap-plications 2003 Proceedings 14th International Workshop on IEEE2003 pp 377ndash382

[21] S Bandhakavi C C Zhang and M Winslett ldquoSuper-sticky anddeclassifiable release policies for flexible information dissemina-tion controlrdquo in Proceedings of the 5th ACM workshop on Privacy inelectronic society ACM 2006 pp 51ndash58

[22] D W Chadwick and S F Lievens ldquoEnforcing sticky securitypolicies throughout a distributed applicationrdquo in Proceedings ofthe 2008 workshop on Middleware security ACM 2008 pp 1ndash6

[23] M Casassa-Mont I Matteucci M Petrocchi and M L SbodioldquoTowards Safer Information Sharing in the Cloudrdquo InternationalJournal of Information Security pp 1ndash16 2014

[24] S Akoush L Carata R Sohan and A Hopper ldquoMrLazy LazyRuntime Label Propagation for MapReducerdquo in 6th Workshop onHot Topics in Cloud Computing (HotCloud) USENIX 2014

[25] D Schultz and B Liskov ldquoIFDB Decentralized Information FlowControl for Databasesrdquo in European Conference on Computer Sys-tems (Eurosysrsquo13) ACM 2013 pp 43ndash56

[26] D Von Oheimb ldquoInformation Flow Control Revisited Nonin-fluence = Noninterference + Nonleakagerdquo in Computer SecurityndashESORICS 2004 Springer 2004 pp 225ndash243

[27] D Hedin and A Sabelfeld ldquoA perspective on Information-FlowControlrdquo NATO 2012

[28] D E Porter M D Bond I Roy K S McKinley and E WitchelldquoPractical Fine-Grained Information Flow Control Using Lam-inarrdquo ACM Transactions on Programming Languages and Systems(TOPLAS) vol 37 no 1 p 4 2014

[29] C Wright C Cowan S Smalley J Morris and G Kroah-Hartman ldquoLinux Security Modules General security support forthe Linux kernelrdquo in Foundations of Intrusion Tolerant SystemsIEEE 2003 pp 213ndash213

[30] M Quaritsch and T Winkler ldquoLinux Security Modules Enhance-ments Module Stacking Framework and TCP State TransitionHooks for State-Driven NIDSrdquo Secure Information and Communi-cation vol 7 pp 7ndash13 2004

[31] C Schaufler ldquoLSM Generalize existing module stackingrdquo LinuxWeekly News 2014

[32] S Smalley C Vance and W Salamon ldquoImplementing SELinux asa Linux Security Modulerdquo NAI Labs Report vol 1 p 43 2001

[33] M Bauer ldquoParanoid Penguin an Introduction to Novell AppAr-morrdquo Linux Journal vol 2006 no 148 p 13 2006

[34] A Edwards T Jaeger and X Zhang ldquoRuntime verification ofauthorization hook placement for the Linux Security Modulesframeworkrdquo in Conference on Computer and Communications Secu-rity ACM 2002 pp 225ndash234

[35] T Jaeger A Edwards and X Zhang ldquoConsistency analysis ofauthorization hook placement in the Linux security modulesframeworkrdquo ACM Transactions on Information and System Security(TISSEC) vol 7 no 2 pp 175ndash205 2004

[36] V Ganapathy T Jaeger and S Jha ldquoAutomatic placement ofauthorization hooks in the Linux security modules frameworkrdquoin Conference on Computer and Communications Security ACM2005 pp 330ndash339

[37] I P Egwutuoha D Levy B Selic and S Chen ldquoA survey of faulttolerance mechanisms and checkpointrestart implementationsfor high performance computing systemsrdquo Springer 2013vol 65 no 3 pp 1302ndash1326

[38] O Laadan and S E Hallyn ldquoLinux-CR Transparent applicationcheckpoint-restart in Linuxrdquo in Linux Symposium 2010 p 159

[39] B Niu and G Tan ldquoEfficient User-space Information Flow Con-trolrdquo in Symposium on Information Computer and CommunicationsSecurity (SIGSACrsquo13) ACM 2013 pp 131ndash142

[40] T Bird ldquoMeasuring Function Duration with ftracerdquo in Japan LinuxSymposium 2009 pp 47ndash54

[41] B Parno ldquoBootstrapping Trust in ardquo Trustedrdquo Platformrdquo in Con-ference on Hot Topics in Security (HotSecrsquo08) USENIX 2008

[42] N Santos K P Gummadi and R Rodrigues ldquoTowards TrustedCloud Computingrdquo in Conference on Hot Topics in Cloud Comput-ing USENIX 2009 pp 3ndash3

[43] S Berger R Caceres K A Goldman R Perez R Sailer andL van Doorn ldquovTPM Virtualizing the Trusted Platform Modulerdquoin Security Symposium USENIX 2006 pp 305ndash320

14

[44] S Berger K Goldman D Pendarakis D Safford E Valdezand M Zohar ldquoScalable Attestation A Step Toward Secure andTrusted Cloudsrdquo in International Conference on Cloud Engineering(IC2E) IEEE 2015

[45] J Singh D Eyers and J Bacon ldquoPolicy Enforcement withinEmerging Distributed Event-Based Systemsrdquo in ACM DistributedEvent-Based Systems (DEBSrsquo14) 2014

[46] L Sfaxi T Abdellatif R Robbana and Y Lakhnech ldquoInformationFlow Control of Component-based Distributed Systemsrdquo Concur-rency and Computation Practice and Experience vol 25 no 2 pp161ndash179 2013

[47] S Yoshihama T Yoshizawa Y Watanabe M Kudoh and K Oy-anagi ldquoDynamic Information Flow Control Architecture for WebApplicationsrdquo in ESORICS 2007 J Biskup and J Lopez EdsSpringer 2007 vol LNCS 4734 pp 267ndash282

[48] W Cheng D R K Ports D Schultz V Popic A BlanksteinJ Cowling D Curtis L Shrira and B Liskov ldquoAbstractions forUsable Information Flow Control in Aeolusrdquo in USENIX AnnualTechnical Conference Boston 2012

[49] J Singh T Pasquier J Bacon and D Eyers ldquoIntegrating Middle-ware and Information Flow Controlrdquo in International Conferenceon Cloud Engineering (IC2E) IEEE 2015 pp 54ndash59

[50] J Singh T F J-M Pasquier and J Bacon ldquoSecuring Tags toControl Information Flows within the Internet of Thingsrdquo inInternational Conference on Recent Advances in Internet of Things(RIoTrsquo15) IEEE 2015

[51] J S Park and R Sandhu ldquoBinding Identities and Attributes UsingDigitally Signed Certificatesrdquo in Annual Conference on ComputerSecurity Applications IEEE 2000 pp 120ndash127

[52] A Ganjali and D Lie ldquoAuditing cloud management using infor-mation flow trackingrdquo in Workshop on Scalable Trusted ComputingACM 2012 pp 79ndash84

[53] R K Ko M Kirchberg and B S Lee ldquoFrom System-centric toData-centric Logging-accountability Trust amp Security in CloudComputingrdquo in Defense Science Research Conference and Expo (DSR)2011 IEEE 2011 pp 1ndash4

[54] J Singh J Powles T Pasquier and J Bacon ldquoData Flow Man-agement and Compliance in Cloud Computingrdquo IEEE CloudComputing Magazine SI on Legal Clouds 2015

[55] T Zanussi K Yaghmour R Wisniewski R Moore and M Dage-nais ldquorelayfs An efficient unified approach for transmitting datafrom kernel to user spacerdquo in Linux Symposium 2003 p 494

[56] M E Smoot K Ono J Ruscheinski P-L Wang and T IdekerldquoCytoscape 28 New features for data integration and networkvisualizationrdquo Oxford University Press 2011 vol 27 no 3 pp431ndash432

[57] T Pasquier and J Powles ldquoExpressing and Enforcing LocationRequirements in the Cloud using Information Flow Controlrdquo inIC2E International Workshop on Legal and Technical Issues in CloudComputing (CLawrsquo15) IEEE 2015

[58] R Angles and C Gutierrez ldquoSurvey of Graph Database ModelsrdquoACM Computing Surveys (CSUR) vol 40 no 1 p 1 2008

[59] D Hardt ldquoThe OAuth 20 Authorization Frameworkrdquo IETF TechRep RFC 6749 2012

[60] S Smalley and R Craig ldquoSecurity Enhanced (SE) Android Bring-ing Flexible MAC to Androidrdquo in Network and Distributed SystemSecurity Symposium Internet Society 2013

  • 1 Introduction and Motivation
  • 2 Background
    • 21 IFC Models
    • 22 Protection via VMs and Containers
    • 23 Taint Tracking (TT) Systems
    • 24 Sticky Policies
      • 3 CamFlow-Model IFC for the Cloud
        • 31 Tags and Labels
        • 32 Decentralised Privileges and Security Contexts
        • 33 Creating a New Entity
        • 34 Security
          • 341 Information Exchange
          • 342 Label Change
          • 343 Privilege delegation
            • 35 Conflict of Interest
              • 4 OS Enforcement
                • 41 CamFlow-LSM
                  • 411 Checkpointing and Restoration
                  • 412 OS Evaluation
                    • 42 Trusted Processes
                    • 43 Leveraging Hardware Roots of Trust
                      • 5 Cross-Machine Enforcement
                        • 51 Remote Interactions
                        • 52 Message-Level Enforcement
                        • 53 Integrating With Persistent Storage
                        • 54 Evaluation
                          • 6 Audit Data-Centric Logs
                            • 61 Analysing Paths to Disclosure
                            • 62 Demonstrating Compliance
                            • 63 Audit as `Big Data
                            • 64 Audit Access
                              • 7 Example Support for Web Services
                              • 8 Conclusion amp Future Work
                              • References
Page 10: CamFlow: Managed data-sharing for cloud services · CamFlow: Managed data-sharing for cloud services Thomas F. J.-M. Pasquier, Member, IEEE, Jatinder Singh, Member, IEEE, David Eyers,

10

0 100 200 300 400 500 600 700

Time in msIFC-overhead

Fig 6 IFC overhead of CamFlow-MW for a workloadtransmission of 5000 messages (x-axis in ms)53 Integrating With Persistent Storage

One technique to provide IFC with persistent data storesis to store the tags alongside the data and a trustedsoftware component ensures that when information isread from the store the corresponding labels are appliedIn Flume [14] a trusted process provides the interfacebetween untrusted applications and persistent storageMore recent work has seen the emergence of databasesthat natively understand IFC concepts and can enforceIFC policies [25]

We see much promise in having the middleware me-diate between persistence systems and the kernel toensure consistent IFC application

54 Evaluation

As shown in Fig 6 the results indicate that IFC en-forcement introduces an overhead of sim13 in perfor-mance time compared to the standard non IFC-enabledmiddleware (see [49] for details) Note that these resultswere measured in the context of a particular workloaddeliberately designed to highlight the impact of IFCenforcement It follows that the overheads associatedwith real-world usage are most likely less onerous

6 AUDIT DATA-CENTRIC LOGS

IFC in addition to providing strong assurances that pol-icy is being enforced can also provide a data-centric log[52] detailing the information flows within and betweensystem components In addition to enforcing IFC via ourLSM module we log the data flows of labelled processespolicy decisions privileges and IFC security contextmanipulations Equivalent inter-machine operations viathe middleware are also recorded

Cloud logging systems are generally based on legacylogging systems (OS web-server database etc) thateither fail to capture the needed information or areextremely complicated to interpret in a useful manner[53] More importantly such logs tend to be relevant onlyto the particular service or component which makes itdifficult if not impossible to audit across a range ofapplications clouds etc

IFC logs as provided by our platform allow us to cap-ture information on application-level data flows both at-tempted and permitted allowing the correct expressionand implementation of data flow policy to be checkedThis provides transparency allows for meaningful auditin terms of investigating the circumstances in which dataleakage occurs and provides evidence of complianceeg with legal obligations [54]

Information Exchange

CreateSecurity Context Change

e0

e1

e3e4

e7

e2 e5e6

P3[Sprimeprime I]

P2[Sprimeprime empty]

P3[S I]F1[S I]

P1[S I] P1[empty I]

Public[empty empty]F2[Sprime I prime]

Fig 7 Simplified audit graph from IFC OS execution (weomit metadata for readability) The path to disclosure isshown in bluepale61 Analysing Paths to Disclosure

To assist in interpreting log information we build a di-rected graph corresponding to the allowed flows duringthe execution of our system as shown in Fig 7 Theflows defined in our IFC model (see sect3) namely dataflow creation flow security context change and privi-lege delegation correspond to the edges of the directedgraph Entities (such as processes files messages etc)are represented as nodes in the graph

In addition to information necessary to build the graph(as shown in the figure) additional metadata is collectedfor forensic purposes which is contextentityevent-dependent These audit entries provided by the LSMand middleware can be exploited by a dedicated serviceimplemented in user space connected to the kernelcollection mechanism via relayfs [55] This service feedsfor example a graph visualisation tool such as Cytoscape[56] or a graph database such as Neo4J4

Such a directed graph helps one identify data leaksFor example a tenant might discover that some sensi-tive medical data leaked into a data store where onlyanonymised research data were supposed to be storedIFC is enforced in line with the policy encapsulated inlabels thus data may leak if such policy is improperly ex-pressed andor declassificationendorsement processesare not correctly implemented (eg if the anonymisationprocess in Fig 2 allows re-identification)

Suppose that an information leak is suspected be-tween different security contexts L1[S I] and L2[S

prime I prime]Determining whether such a leak can occur is equiv-alent to discovering whether there is a path in thegraph between the two contexts If the leak occurredthere must be a path between some entity Ei such thatS(Ei) = S and I(Ei) = I and another entity Fi such thatS(Fi) = Sprime and I(Fi) = I prime

The existence of such a path demonstrates that a leakis possible To investigate whether a leak occurred it isessential to consider the event ID associated with theedges comprising the path We denote by ei the last in-coming edge to the entity under investigation with labels[Sprime I prime] only edges such that e lt ei should be consideredWhen applied to all nodes along a path this rule ensuresstrictly monotonically increasing timestamps from thefirst node to the last Fig 7 shows in bluepale a possibledata disclosure path from file F2 from a very simple

4 httpneo4jcom

11

audit graph We know from the event IDs e0 and e1 thatthe data disclosure did not occur through file F1 andprocess P3 but through P1rsquos declassification

62 Demonstrating Compliance

Compliance with certain requirements can be demon-strated through queries over the graph We assume theaudit data is stored in a graph database that we canquery For example the following plain English policyldquoEuropean personal data sent to the US must be anonymisedrdquo[57] is equivalent to writing a query that verifies thatthere is no path between EU- and US-labelled datawithout an anonymisation processThe policy ldquoMedical data stored in database X must havereceived proper consent and be anonymisedrdquo [54] can beexpressed as a query verifying that there is no path be-tween data labelled as medical and the database withoutconsent-checking and anonymiser processes In additionan investigator may want to know which anonymisationalgorithm has been run which data has been used togenerate the anonymised records etc Our audit graphassists in answering such questions

Note that IFC only applies guarantees with respectto flows Demonstrating the overall effectiveness of themanagement regime eg the quality and suitability ofthe anonymisation algorithm is out-of-scope for a flow-based enforcement mechanism

63 Audit as lsquoBig Datarsquo

We are potentially generating a vast amount of data inour IFC logs However unlike standard system logs thatare complex to analyse our logs generate graphs thatare ideal for analysis by ldquobig datardquo tools that have beendeveloped for this purpose [58]

Since the amount of data is potentially huge theamount of data being logged can be fine-tuned to meetthe requirements of the platformtenant eg by reduc-ing the amount of metadata being stored by loggingonly security context changing operations by loggingonly information corresponding to some target securitycontext keeping operations on unlabelled entities out-side of the log etc The decision on what needs to belogged then becomes a tradeoff between utility and thevolume (cost) of log generated which can be decided inorder to correspond to legal or contractual requirements(for example a regulated sector may need to have afine-grained log to satisfy data forensic requirements)Indeed as such an approach is new to the cloud suchconsiderations will be refined by experience with bestpractices developing over time

64 Audit Access

Logs can contain sensitive information and access tothem should be controlled This represents an area ofour ongoing work Traditional access controls clearlyplay a role however secrecy tags could also be lever-aged For example an auditor before being grantedaccess to audit logs could be forced to demonstrate

ownership of the corresponding secrecy IFC tags (forexample through cryptographic means as in sect51) Theauditor may be granted access to a log entry only ifS(origin) cup S(destination) sube S(auditor)7 EXAMPLE SUPPORT FOR WEB SERVICES

One of the most common uses of PaaS is to host web ap-plications In this section we present the implementationof such a solution built on the infrastructure described insect4 in order to evaluate and demonstrate the feasibility ofour proposed approach This is illustrated in Fig 8 Werun standard and unmodified Ruby web applications

Interaction with end-users is achieved through aldquogatewayrdquo between the IFC and non-IFC worlds Simi-larly interaction with cloud services (such as data stores)is also achieved through our messaging middleware asdiscussed in sect5 The requirement for this gateway canbe removed if a trustworthy IFC implementation can beprovided at the client side consistent with the cloud im-plementation with respect to tag naming enforcementetc Tag naming in general system-wide is an issuebeyond the scope of this paper see further sect8 In ourproof of concept implementation the gateway is a simpleApache server running a custom-built module

The role of the gateway is to authenticate the end-userwhen a session is created and to associate this sessionwith an application instance running within the securitycontext corresponding to the user Recall that a securitycontext comprises the S and I labels Any further re-quests to the gateway in that session are routed to thecorresponding application instance Once an instance nolonger has an associated session it can be recycled usingself-checkpointing as described in sect411

Several application types are running over our cloud-based web services platform For example in a medicalcontext these might be medical record editing pharmacyordering social services etc A single shared identityservice for the end-user is part of the cloud provideroffering (in our proof of concept implementation weused OAuth [59])

The GP authenticates is authorised as treating doc-tor for Alice and selects the lsquoidentityrsquo that correspondsto Alice A new session is created server-side by thegateway with the requested application instance run-ning in the corresponding security context with S =[medical alice] I = [empty] When the GP wants to accessapplications on behalf of a new patient he needs to closeAlicersquos session authorise as treating doctor for Bob andopen a new session for Bob

The control described above is not achieved by theapplication but by the platform itself and can be con-trolled by the end-user subject to access control Thatis a medical application used on Alicersquos behalf runs ina security context in which data cannot flow to that ofanother patient Furthermore applications running onbehalf of a given user can share the data of that userwithout the risk of seeing a buggy application leakingdata between end-users

12

App Instance App InstanceApp Instance

CamFlow-MW

IFC Constrained IFC Constrained IFC Constrained

ContextA

ContextB

Public

Trusted IFC Gateway

Enforcing IFCData-store

Data-store Application

CamFlow-MW CamFlow-MWCamFlow-MW

IFC Cloud Platform

Dr Yellow Dr PinkDr MauveRemote Clients

Fig 8 PaaS Architecture on top of IFC-OS

As described in sect4 we assume the middleware andthe OS enforcement are provided as a service by theunderlying platform A tenant wanting to use the third-party web-service offering once his trust in the under-lying platform is established needs only to audit thegateway again the underlying infrastructure providercould either provide such a gateway or audit it Therest of the software stack of the third-party web-serviceprovider is bound by the IFC enforcement mechanismand therefore need not be trusted

8 CONCLUSION amp FUTURE WORK

IFC allows data flows to be controlled continuouslythroughout a system by providing an information-centric MAC scheme that continuously ensures non-interference between security contexts This paper pre-sented the CamFlow platform that demonstrates thepotential of cloud-deployed IFC as supporting (1) pro-tection of applications from each other (2) flexible datasharing across isolation boundaries (3) prevention ofdata leakage due to bugsmisconfigurations (4) exten-sion of access control beyond application boundaries (5)data flow transparency

Specifically we detailed a new kernel implementationof IFC as an LSM demonstrating low overhead evenfor worst-case scenarios where processes continuouslymake readwrite system calls We also described theintegration of a messaging middleware to enforcing IFCacross machines This combination makes it possible toprovide whole-system IFC to PaaS cloud services andtherefore also SaaS Our approach and implementationwere designed so that applications can run unchangedover IFC thus making cloud adoption feasible

We also indicated how the data-centric logs based onIFC enforcement could provide the means to audit anIFC-enabled system whereby a log can be processed asa directed graph to investigate leaks and attacks andshow compliance with data management requirementsThough this represents our initial work in the area thereappears much promise

In light of the above we believe that IFC has greatpotential as a security mechanism for the cloud wherebytrust in a few major cloud providers deploying IFCcan be built on to provide a demonstrably trustworthycomputing environment

CamFlow was developed with cloud deployment inmind Our future work will investigate the challenges ofa broader distributed context

It is already feasible to extend CamFlow to sup-port mobile environments Android supports the fullSELinux enforcement5 and an Android-LSM integrationhas been demonstrated [60] But when dealing withmultiple cloud services particularly as they become partof a wider distributed architecture such as in the IoTa trustworthy system-wide deployment of IFC can nolonger be assumed Much work remains on establishingtrust in (and the trustworthiness of) the IFC enforcementmechanism within end-usersrsquo devices Outside a cloudcontext all partiesrsquo trust in a common third partyrsquos en-forcement of IFC constraints cannot be assumed (unlikethe cloud provider for cloud services) We intend toexplore leveraging hardware roots of trust and remoteattestation to ensure the integrity and trustworthiness ofIFC enforcement mechanisms

Another area of investigation concerns the represen-tation of tags across administrative domains In a cloudcontext a federated approach can be envisaged wherea common understanding of tags could be negotiatedacross multiple domains For example in [54] we dis-cussed initial thoughts on managing data according tospecific obligation regimes However as the numberof administrative domains increases both a global tagnaming scheme and mechanisms for ad-hoc negotiationbecome necessary

Related is the sensitivity of the tags themselvesKnowledge of the meaning of a tag can indicate thatthe associated entity contains or deals with certain infor-mation If a relationship can be established between anentity and the information owner this may disclose pri-vate information about the information owner This maylead to work on a need-to-know negotiation mechanismto establish a secure channel between hosts especiallyin a wide-scale distributed system A promising mech-anism is private set intersection to determine tagsrsquo subsetrelationship (sect341)

Another challenge concerns extending audit to dis-tributed architectures both in terms of resource man-agement and regulating access to log data

5 httpssourceandroidcomdevicestechsecurityselinux

13

ACKNOWLEDGMENTS

This work was supported by UK Engineering andPhysical Sciences Research Council grant EPK011510CloudSafetyNet End-to-End Application Security inthe Cloud We acknowledge the support of Microsoftthrough the Microsoft Cloud Computing Research Cen-tre

REFERENCES

[1] P Barham B Dragovic K Fraser S Hand T Harris A HoR Neugebauer I Pratt and A Warfield ldquoXen and the art ofvirtualizationrdquo SIGOPS Operating Systems Review vol 37 no 5pp 164ndash177 2003

[2] A Kivity Y Kamay D Laor U Lublin and A Liguori ldquokvmthe Linux Virtual Machine Monitorrdquo in Proceedings of the LinuxSymposium vol 1 2007 pp 225ndash230

[3] D Bernstein ldquoContainers and Cloud From LXC to Docker toKubernetesrdquo IEEE Cloud Computing Magazine no 3 pp 81ndash842014

[4] P Desnoyers O Krieger B Holden and J Hennessey ldquoUsingOpenStack for an Open Cloud eXchange (OCX)rdquo in InternationalConference on Cloud Engineering (IC2E) IEEE 2015

[5] J Mineraud O Mazhelis X Su and S Tarkoma ldquoA Gap Analysisof Internet-of-Things Platformsrdquo 2015 Arxiv arXiv150201181[Online] Available httparxivorgabs150201181

[6] C J Millard Ed Cloud Computing Law Oxford University Press2013

[7] S Pearson and M Casassa-Mont ldquoSticky Policies An Approachfor Managing Privacy across Multiple Partiesrdquo Computer vol 44July 2011

[8] D E Denning ldquoA lattice model of secure information flowrdquoCommunication of the ACM vol 19 no 5 pp 236ndash243 1976

[9] D E Bell and L J LaPadula ldquoSecure Computer Systems Mathe-matical Foundations and Modelrdquo The MITRE Corp Bedford MATech Rep M74-244 1973

[10] K J Biba ldquoIntegrity Considerations for Secure Computer Sys-temsrdquo MITRE Corp Tech Rep ESD-TR 76-372 1977

[11] A C Myers and B Liskov ldquoA Decentralized Model for Infor-mation Flow Controlrdquo in 17th Symposium on Operating SystemsPrinciples (SOSP) ACM 1997 pp 129ndash142

[12] A Bouard B Weyl and C Eckert ldquoPractical information-flowaware middleware for in-car communicationrdquo in Workshop onSecurity privacy amp dependability for cyber vehicles ACM 2013pp 3ndash8

[13] K Singh S Bhola and W Lee ldquoxBook Redesigning PrivacyControl in Social Networking Platformsrdquo in Security SymposiumUSENIX 2009 pp 249ndash266

[14] M Krohn A Yip M Brodsky N Cliffer M F KaashoekE Kohler and R Morris ldquoInformation Flow Control for StandardOS Abstractionsrdquo in Symposium on Operating Systems PrinciplesACM 2007 pp 321ndash334

[15] S Lee E L Wong D Goel M Dahlin and V ShmatikovldquoπBox A Platform for Privacy-Preserving Appsrdquo in Symposiumon Networked System Design and Implementation USENIX 2013pp 501ndash514

[16] N Kumar and R Shyamasundar ldquoRealizing Purpose-Based Pri-vacy Policies Succinctly via Information-Flow Labelsrdquo in Big Dataand Cloud Computing (BDCloudrsquo14) IEEE 2014 pp 753ndash760

[17] W Enck P Gilbert B-G Chun L P Cox J Jung P McDaniel andA N Sheth ldquoTaintDroid An information-flow tracking systemfor realtime privacy monitoring on smartphonesrdquo in Conference onOperating systems design and implementation (OSDIrsquo10) USENIX2010 pp 1ndash6

[18] I Papagiannis M Migliavacca and P Pietzuch ldquoPHP AspisUsing partial taint tracking to protect against injection attacksrdquoin 2nd USENIX Conference on Web Application Development 2011p 13

[19] E Schwartz T Avgerinos and D Brumley ldquoAll you ever wantedto know about dynamic taint analysis and forward symbolicexecution (but might have been afraid to ask)rdquo in Symposium onSecurity and Privacy IEEE 2010

[20] M Casassa-Mont S Pearson and P Bramhall ldquoTowards account-able management of identity and privacy Sticky policies andenforceable tracing servicesrdquo in Database and Expert Systems Ap-plications 2003 Proceedings 14th International Workshop on IEEE2003 pp 377ndash382

[21] S Bandhakavi C C Zhang and M Winslett ldquoSuper-sticky anddeclassifiable release policies for flexible information dissemina-tion controlrdquo in Proceedings of the 5th ACM workshop on Privacy inelectronic society ACM 2006 pp 51ndash58

[22] D W Chadwick and S F Lievens ldquoEnforcing sticky securitypolicies throughout a distributed applicationrdquo in Proceedings ofthe 2008 workshop on Middleware security ACM 2008 pp 1ndash6

[23] M Casassa-Mont I Matteucci M Petrocchi and M L SbodioldquoTowards Safer Information Sharing in the Cloudrdquo InternationalJournal of Information Security pp 1ndash16 2014

[24] S Akoush L Carata R Sohan and A Hopper ldquoMrLazy LazyRuntime Label Propagation for MapReducerdquo in 6th Workshop onHot Topics in Cloud Computing (HotCloud) USENIX 2014

[25] D Schultz and B Liskov ldquoIFDB Decentralized Information FlowControl for Databasesrdquo in European Conference on Computer Sys-tems (Eurosysrsquo13) ACM 2013 pp 43ndash56

[26] D Von Oheimb ldquoInformation Flow Control Revisited Nonin-fluence = Noninterference + Nonleakagerdquo in Computer SecurityndashESORICS 2004 Springer 2004 pp 225ndash243

[27] D Hedin and A Sabelfeld ldquoA perspective on Information-FlowControlrdquo NATO 2012

[28] D E Porter M D Bond I Roy K S McKinley and E WitchelldquoPractical Fine-Grained Information Flow Control Using Lam-inarrdquo ACM Transactions on Programming Languages and Systems(TOPLAS) vol 37 no 1 p 4 2014

[29] C Wright C Cowan S Smalley J Morris and G Kroah-Hartman ldquoLinux Security Modules General security support forthe Linux kernelrdquo in Foundations of Intrusion Tolerant SystemsIEEE 2003 pp 213ndash213

[30] M Quaritsch and T Winkler ldquoLinux Security Modules Enhance-ments Module Stacking Framework and TCP State TransitionHooks for State-Driven NIDSrdquo Secure Information and Communi-cation vol 7 pp 7ndash13 2004

[31] C Schaufler ldquoLSM Generalize existing module stackingrdquo LinuxWeekly News 2014

[32] S Smalley C Vance and W Salamon ldquoImplementing SELinux asa Linux Security Modulerdquo NAI Labs Report vol 1 p 43 2001

[33] M Bauer ldquoParanoid Penguin an Introduction to Novell AppAr-morrdquo Linux Journal vol 2006 no 148 p 13 2006

[34] A Edwards T Jaeger and X Zhang ldquoRuntime verification ofauthorization hook placement for the Linux Security Modulesframeworkrdquo in Conference on Computer and Communications Secu-rity ACM 2002 pp 225ndash234

[35] T Jaeger A Edwards and X Zhang ldquoConsistency analysis ofauthorization hook placement in the Linux security modulesframeworkrdquo ACM Transactions on Information and System Security(TISSEC) vol 7 no 2 pp 175ndash205 2004

[36] V Ganapathy T Jaeger and S Jha ldquoAutomatic placement ofauthorization hooks in the Linux security modules frameworkrdquoin Conference on Computer and Communications Security ACM2005 pp 330ndash339

[37] I P Egwutuoha D Levy B Selic and S Chen ldquoA survey of faulttolerance mechanisms and checkpointrestart implementationsfor high performance computing systemsrdquo Springer 2013vol 65 no 3 pp 1302ndash1326

[38] O Laadan and S E Hallyn ldquoLinux-CR Transparent applicationcheckpoint-restart in Linuxrdquo in Linux Symposium 2010 p 159

[39] B Niu and G Tan ldquoEfficient User-space Information Flow Con-trolrdquo in Symposium on Information Computer and CommunicationsSecurity (SIGSACrsquo13) ACM 2013 pp 131ndash142

[40] T Bird ldquoMeasuring Function Duration with ftracerdquo in Japan LinuxSymposium 2009 pp 47ndash54

[41] B Parno ldquoBootstrapping Trust in ardquo Trustedrdquo Platformrdquo in Con-ference on Hot Topics in Security (HotSecrsquo08) USENIX 2008

[42] N Santos K P Gummadi and R Rodrigues ldquoTowards TrustedCloud Computingrdquo in Conference on Hot Topics in Cloud Comput-ing USENIX 2009 pp 3ndash3

[43] S Berger R Caceres K A Goldman R Perez R Sailer andL van Doorn ldquovTPM Virtualizing the Trusted Platform Modulerdquoin Security Symposium USENIX 2006 pp 305ndash320

14

[44] S Berger K Goldman D Pendarakis D Safford E Valdezand M Zohar ldquoScalable Attestation A Step Toward Secure andTrusted Cloudsrdquo in International Conference on Cloud Engineering(IC2E) IEEE 2015

[45] J Singh D Eyers and J Bacon ldquoPolicy Enforcement withinEmerging Distributed Event-Based Systemsrdquo in ACM DistributedEvent-Based Systems (DEBSrsquo14) 2014

[46] L Sfaxi T Abdellatif R Robbana and Y Lakhnech ldquoInformationFlow Control of Component-based Distributed Systemsrdquo Concur-rency and Computation Practice and Experience vol 25 no 2 pp161ndash179 2013

[47] S Yoshihama T Yoshizawa Y Watanabe M Kudoh and K Oy-anagi ldquoDynamic Information Flow Control Architecture for WebApplicationsrdquo in ESORICS 2007 J Biskup and J Lopez EdsSpringer 2007 vol LNCS 4734 pp 267ndash282

[48] W Cheng D R K Ports D Schultz V Popic A BlanksteinJ Cowling D Curtis L Shrira and B Liskov ldquoAbstractions forUsable Information Flow Control in Aeolusrdquo in USENIX AnnualTechnical Conference Boston 2012

[49] J Singh T Pasquier J Bacon and D Eyers ldquoIntegrating Middle-ware and Information Flow Controlrdquo in International Conferenceon Cloud Engineering (IC2E) IEEE 2015 pp 54ndash59

[50] J Singh T F J-M Pasquier and J Bacon ldquoSecuring Tags toControl Information Flows within the Internet of Thingsrdquo inInternational Conference on Recent Advances in Internet of Things(RIoTrsquo15) IEEE 2015

[51] J S Park and R Sandhu ldquoBinding Identities and Attributes UsingDigitally Signed Certificatesrdquo in Annual Conference on ComputerSecurity Applications IEEE 2000 pp 120ndash127

[52] A Ganjali and D Lie ldquoAuditing cloud management using infor-mation flow trackingrdquo in Workshop on Scalable Trusted ComputingACM 2012 pp 79ndash84

[53] R K Ko M Kirchberg and B S Lee ldquoFrom System-centric toData-centric Logging-accountability Trust amp Security in CloudComputingrdquo in Defense Science Research Conference and Expo (DSR)2011 IEEE 2011 pp 1ndash4

[54] J Singh J Powles T Pasquier and J Bacon ldquoData Flow Man-agement and Compliance in Cloud Computingrdquo IEEE CloudComputing Magazine SI on Legal Clouds 2015

[55] T Zanussi K Yaghmour R Wisniewski R Moore and M Dage-nais ldquorelayfs An efficient unified approach for transmitting datafrom kernel to user spacerdquo in Linux Symposium 2003 p 494

[56] M E Smoot K Ono J Ruscheinski P-L Wang and T IdekerldquoCytoscape 28 New features for data integration and networkvisualizationrdquo Oxford University Press 2011 vol 27 no 3 pp431ndash432

[57] T Pasquier and J Powles ldquoExpressing and Enforcing LocationRequirements in the Cloud using Information Flow Controlrdquo inIC2E International Workshop on Legal and Technical Issues in CloudComputing (CLawrsquo15) IEEE 2015

[58] R Angles and C Gutierrez ldquoSurvey of Graph Database ModelsrdquoACM Computing Surveys (CSUR) vol 40 no 1 p 1 2008

[59] D Hardt ldquoThe OAuth 20 Authorization Frameworkrdquo IETF TechRep RFC 6749 2012

[60] S Smalley and R Craig ldquoSecurity Enhanced (SE) Android Bring-ing Flexible MAC to Androidrdquo in Network and Distributed SystemSecurity Symposium Internet Society 2013

  • 1 Introduction and Motivation
  • 2 Background
    • 21 IFC Models
    • 22 Protection via VMs and Containers
    • 23 Taint Tracking (TT) Systems
    • 24 Sticky Policies
      • 3 CamFlow-Model IFC for the Cloud
        • 31 Tags and Labels
        • 32 Decentralised Privileges and Security Contexts
        • 33 Creating a New Entity
        • 34 Security
          • 341 Information Exchange
          • 342 Label Change
          • 343 Privilege delegation
            • 35 Conflict of Interest
              • 4 OS Enforcement
                • 41 CamFlow-LSM
                  • 411 Checkpointing and Restoration
                  • 412 OS Evaluation
                    • 42 Trusted Processes
                    • 43 Leveraging Hardware Roots of Trust
                      • 5 Cross-Machine Enforcement
                        • 51 Remote Interactions
                        • 52 Message-Level Enforcement
                        • 53 Integrating With Persistent Storage
                        • 54 Evaluation
                          • 6 Audit Data-Centric Logs
                            • 61 Analysing Paths to Disclosure
                            • 62 Demonstrating Compliance
                            • 63 Audit as `Big Data
                            • 64 Audit Access
                              • 7 Example Support for Web Services
                              • 8 Conclusion amp Future Work
                              • References
Page 11: CamFlow: Managed data-sharing for cloud services · CamFlow: Managed data-sharing for cloud services Thomas F. J.-M. Pasquier, Member, IEEE, Jatinder Singh, Member, IEEE, David Eyers,

11

audit graph We know from the event IDs e0 and e1 thatthe data disclosure did not occur through file F1 andprocess P3 but through P1rsquos declassification

62 Demonstrating Compliance

Compliance with certain requirements can be demon-strated through queries over the graph We assume theaudit data is stored in a graph database that we canquery For example the following plain English policyldquoEuropean personal data sent to the US must be anonymisedrdquo[57] is equivalent to writing a query that verifies thatthere is no path between EU- and US-labelled datawithout an anonymisation processThe policy ldquoMedical data stored in database X must havereceived proper consent and be anonymisedrdquo [54] can beexpressed as a query verifying that there is no path be-tween data labelled as medical and the database withoutconsent-checking and anonymiser processes In additionan investigator may want to know which anonymisationalgorithm has been run which data has been used togenerate the anonymised records etc Our audit graphassists in answering such questions

Note that IFC only applies guarantees with respectto flows Demonstrating the overall effectiveness of themanagement regime eg the quality and suitability ofthe anonymisation algorithm is out-of-scope for a flow-based enforcement mechanism

63 Audit as lsquoBig Datarsquo

We are potentially generating a vast amount of data inour IFC logs However unlike standard system logs thatare complex to analyse our logs generate graphs thatare ideal for analysis by ldquobig datardquo tools that have beendeveloped for this purpose [58]

Since the amount of data is potentially huge theamount of data being logged can be fine-tuned to meetthe requirements of the platformtenant eg by reduc-ing the amount of metadata being stored by loggingonly security context changing operations by loggingonly information corresponding to some target securitycontext keeping operations on unlabelled entities out-side of the log etc The decision on what needs to belogged then becomes a tradeoff between utility and thevolume (cost) of log generated which can be decided inorder to correspond to legal or contractual requirements(for example a regulated sector may need to have afine-grained log to satisfy data forensic requirements)Indeed as such an approach is new to the cloud suchconsiderations will be refined by experience with bestpractices developing over time

64 Audit Access

Logs can contain sensitive information and access tothem should be controlled This represents an area ofour ongoing work Traditional access controls clearlyplay a role however secrecy tags could also be lever-aged For example an auditor before being grantedaccess to audit logs could be forced to demonstrate

ownership of the corresponding secrecy IFC tags (forexample through cryptographic means as in sect51) Theauditor may be granted access to a log entry only ifS(origin) cup S(destination) sube S(auditor)7 EXAMPLE SUPPORT FOR WEB SERVICES

One of the most common uses of PaaS is to host web ap-plications In this section we present the implementationof such a solution built on the infrastructure described insect4 in order to evaluate and demonstrate the feasibility ofour proposed approach This is illustrated in Fig 8 Werun standard and unmodified Ruby web applications

Interaction with end-users is achieved through aldquogatewayrdquo between the IFC and non-IFC worlds Simi-larly interaction with cloud services (such as data stores)is also achieved through our messaging middleware asdiscussed in sect5 The requirement for this gateway canbe removed if a trustworthy IFC implementation can beprovided at the client side consistent with the cloud im-plementation with respect to tag naming enforcementetc Tag naming in general system-wide is an issuebeyond the scope of this paper see further sect8 In ourproof of concept implementation the gateway is a simpleApache server running a custom-built module

The role of the gateway is to authenticate the end-userwhen a session is created and to associate this sessionwith an application instance running within the securitycontext corresponding to the user Recall that a securitycontext comprises the S and I labels Any further re-quests to the gateway in that session are routed to thecorresponding application instance Once an instance nolonger has an associated session it can be recycled usingself-checkpointing as described in sect411

Several application types are running over our cloud-based web services platform For example in a medicalcontext these might be medical record editing pharmacyordering social services etc A single shared identityservice for the end-user is part of the cloud provideroffering (in our proof of concept implementation weused OAuth [59])

The GP authenticates is authorised as treating doc-tor for Alice and selects the lsquoidentityrsquo that correspondsto Alice A new session is created server-side by thegateway with the requested application instance run-ning in the corresponding security context with S =[medical alice] I = [empty] When the GP wants to accessapplications on behalf of a new patient he needs to closeAlicersquos session authorise as treating doctor for Bob andopen a new session for Bob

The control described above is not achieved by theapplication but by the platform itself and can be con-trolled by the end-user subject to access control Thatis a medical application used on Alicersquos behalf runs ina security context in which data cannot flow to that ofanother patient Furthermore applications running onbehalf of a given user can share the data of that userwithout the risk of seeing a buggy application leakingdata between end-users

12

App Instance App InstanceApp Instance

CamFlow-MW

IFC Constrained IFC Constrained IFC Constrained

ContextA

ContextB

Public

Trusted IFC Gateway

Enforcing IFCData-store

Data-store Application

CamFlow-MW CamFlow-MWCamFlow-MW

IFC Cloud Platform

Dr Yellow Dr PinkDr MauveRemote Clients

Fig 8 PaaS Architecture on top of IFC-OS

As described in sect4 we assume the middleware andthe OS enforcement are provided as a service by theunderlying platform A tenant wanting to use the third-party web-service offering once his trust in the under-lying platform is established needs only to audit thegateway again the underlying infrastructure providercould either provide such a gateway or audit it Therest of the software stack of the third-party web-serviceprovider is bound by the IFC enforcement mechanismand therefore need not be trusted

8 CONCLUSION amp FUTURE WORK

IFC allows data flows to be controlled continuouslythroughout a system by providing an information-centric MAC scheme that continuously ensures non-interference between security contexts This paper pre-sented the CamFlow platform that demonstrates thepotential of cloud-deployed IFC as supporting (1) pro-tection of applications from each other (2) flexible datasharing across isolation boundaries (3) prevention ofdata leakage due to bugsmisconfigurations (4) exten-sion of access control beyond application boundaries (5)data flow transparency

Specifically we detailed a new kernel implementationof IFC as an LSM demonstrating low overhead evenfor worst-case scenarios where processes continuouslymake readwrite system calls We also described theintegration of a messaging middleware to enforcing IFCacross machines This combination makes it possible toprovide whole-system IFC to PaaS cloud services andtherefore also SaaS Our approach and implementationwere designed so that applications can run unchangedover IFC thus making cloud adoption feasible

We also indicated how the data-centric logs based onIFC enforcement could provide the means to audit anIFC-enabled system whereby a log can be processed asa directed graph to investigate leaks and attacks andshow compliance with data management requirementsThough this represents our initial work in the area thereappears much promise

In light of the above we believe that IFC has greatpotential as a security mechanism for the cloud wherebytrust in a few major cloud providers deploying IFCcan be built on to provide a demonstrably trustworthycomputing environment

CamFlow was developed with cloud deployment inmind Our future work will investigate the challenges ofa broader distributed context

It is already feasible to extend CamFlow to sup-port mobile environments Android supports the fullSELinux enforcement5 and an Android-LSM integrationhas been demonstrated [60] But when dealing withmultiple cloud services particularly as they become partof a wider distributed architecture such as in the IoTa trustworthy system-wide deployment of IFC can nolonger be assumed Much work remains on establishingtrust in (and the trustworthiness of) the IFC enforcementmechanism within end-usersrsquo devices Outside a cloudcontext all partiesrsquo trust in a common third partyrsquos en-forcement of IFC constraints cannot be assumed (unlikethe cloud provider for cloud services) We intend toexplore leveraging hardware roots of trust and remoteattestation to ensure the integrity and trustworthiness ofIFC enforcement mechanisms

Another area of investigation concerns the represen-tation of tags across administrative domains In a cloudcontext a federated approach can be envisaged wherea common understanding of tags could be negotiatedacross multiple domains For example in [54] we dis-cussed initial thoughts on managing data according tospecific obligation regimes However as the numberof administrative domains increases both a global tagnaming scheme and mechanisms for ad-hoc negotiationbecome necessary

Related is the sensitivity of the tags themselvesKnowledge of the meaning of a tag can indicate thatthe associated entity contains or deals with certain infor-mation If a relationship can be established between anentity and the information owner this may disclose pri-vate information about the information owner This maylead to work on a need-to-know negotiation mechanismto establish a secure channel between hosts especiallyin a wide-scale distributed system A promising mech-anism is private set intersection to determine tagsrsquo subsetrelationship (sect341)

Another challenge concerns extending audit to dis-tributed architectures both in terms of resource man-agement and regulating access to log data

5 httpssourceandroidcomdevicestechsecurityselinux

13

ACKNOWLEDGMENTS

This work was supported by UK Engineering andPhysical Sciences Research Council grant EPK011510CloudSafetyNet End-to-End Application Security inthe Cloud We acknowledge the support of Microsoftthrough the Microsoft Cloud Computing Research Cen-tre

REFERENCES

[1] P Barham B Dragovic K Fraser S Hand T Harris A HoR Neugebauer I Pratt and A Warfield ldquoXen and the art ofvirtualizationrdquo SIGOPS Operating Systems Review vol 37 no 5pp 164ndash177 2003

[2] A Kivity Y Kamay D Laor U Lublin and A Liguori ldquokvmthe Linux Virtual Machine Monitorrdquo in Proceedings of the LinuxSymposium vol 1 2007 pp 225ndash230

[3] D Bernstein ldquoContainers and Cloud From LXC to Docker toKubernetesrdquo IEEE Cloud Computing Magazine no 3 pp 81ndash842014

[4] P Desnoyers O Krieger B Holden and J Hennessey ldquoUsingOpenStack for an Open Cloud eXchange (OCX)rdquo in InternationalConference on Cloud Engineering (IC2E) IEEE 2015

[5] J Mineraud O Mazhelis X Su and S Tarkoma ldquoA Gap Analysisof Internet-of-Things Platformsrdquo 2015 Arxiv arXiv150201181[Online] Available httparxivorgabs150201181

[6] C J Millard Ed Cloud Computing Law Oxford University Press2013

[7] S Pearson and M Casassa-Mont ldquoSticky Policies An Approachfor Managing Privacy across Multiple Partiesrdquo Computer vol 44July 2011

[8] D E Denning ldquoA lattice model of secure information flowrdquoCommunication of the ACM vol 19 no 5 pp 236ndash243 1976

[9] D E Bell and L J LaPadula ldquoSecure Computer Systems Mathe-matical Foundations and Modelrdquo The MITRE Corp Bedford MATech Rep M74-244 1973

[10] K J Biba ldquoIntegrity Considerations for Secure Computer Sys-temsrdquo MITRE Corp Tech Rep ESD-TR 76-372 1977

[11] A C Myers and B Liskov ldquoA Decentralized Model for Infor-mation Flow Controlrdquo in 17th Symposium on Operating SystemsPrinciples (SOSP) ACM 1997 pp 129ndash142

[12] A Bouard B Weyl and C Eckert ldquoPractical information-flowaware middleware for in-car communicationrdquo in Workshop onSecurity privacy amp dependability for cyber vehicles ACM 2013pp 3ndash8

[13] K Singh S Bhola and W Lee ldquoxBook Redesigning PrivacyControl in Social Networking Platformsrdquo in Security SymposiumUSENIX 2009 pp 249ndash266

[14] M Krohn A Yip M Brodsky N Cliffer M F KaashoekE Kohler and R Morris ldquoInformation Flow Control for StandardOS Abstractionsrdquo in Symposium on Operating Systems PrinciplesACM 2007 pp 321ndash334

[15] S Lee E L Wong D Goel M Dahlin and V ShmatikovldquoπBox A Platform for Privacy-Preserving Appsrdquo in Symposiumon Networked System Design and Implementation USENIX 2013pp 501ndash514

[16] N Kumar and R Shyamasundar ldquoRealizing Purpose-Based Pri-vacy Policies Succinctly via Information-Flow Labelsrdquo in Big Dataand Cloud Computing (BDCloudrsquo14) IEEE 2014 pp 753ndash760

[17] W Enck P Gilbert B-G Chun L P Cox J Jung P McDaniel andA N Sheth ldquoTaintDroid An information-flow tracking systemfor realtime privacy monitoring on smartphonesrdquo in Conference onOperating systems design and implementation (OSDIrsquo10) USENIX2010 pp 1ndash6

[18] I Papagiannis M Migliavacca and P Pietzuch ldquoPHP AspisUsing partial taint tracking to protect against injection attacksrdquoin 2nd USENIX Conference on Web Application Development 2011p 13

[19] E Schwartz T Avgerinos and D Brumley ldquoAll you ever wantedto know about dynamic taint analysis and forward symbolicexecution (but might have been afraid to ask)rdquo in Symposium onSecurity and Privacy IEEE 2010

[20] M Casassa-Mont S Pearson and P Bramhall ldquoTowards account-able management of identity and privacy Sticky policies andenforceable tracing servicesrdquo in Database and Expert Systems Ap-plications 2003 Proceedings 14th International Workshop on IEEE2003 pp 377ndash382

[21] S Bandhakavi C C Zhang and M Winslett ldquoSuper-sticky anddeclassifiable release policies for flexible information dissemina-tion controlrdquo in Proceedings of the 5th ACM workshop on Privacy inelectronic society ACM 2006 pp 51ndash58

[22] D W Chadwick and S F Lievens ldquoEnforcing sticky securitypolicies throughout a distributed applicationrdquo in Proceedings ofthe 2008 workshop on Middleware security ACM 2008 pp 1ndash6

[23] M Casassa-Mont I Matteucci M Petrocchi and M L SbodioldquoTowards Safer Information Sharing in the Cloudrdquo InternationalJournal of Information Security pp 1ndash16 2014

[24] S Akoush L Carata R Sohan and A Hopper ldquoMrLazy LazyRuntime Label Propagation for MapReducerdquo in 6th Workshop onHot Topics in Cloud Computing (HotCloud) USENIX 2014

[25] D Schultz and B Liskov ldquoIFDB Decentralized Information FlowControl for Databasesrdquo in European Conference on Computer Sys-tems (Eurosysrsquo13) ACM 2013 pp 43ndash56

[26] D Von Oheimb ldquoInformation Flow Control Revisited Nonin-fluence = Noninterference + Nonleakagerdquo in Computer SecurityndashESORICS 2004 Springer 2004 pp 225ndash243

[27] D Hedin and A Sabelfeld ldquoA perspective on Information-FlowControlrdquo NATO 2012

[28] D E Porter M D Bond I Roy K S McKinley and E WitchelldquoPractical Fine-Grained Information Flow Control Using Lam-inarrdquo ACM Transactions on Programming Languages and Systems(TOPLAS) vol 37 no 1 p 4 2014

[29] C Wright C Cowan S Smalley J Morris and G Kroah-Hartman ldquoLinux Security Modules General security support forthe Linux kernelrdquo in Foundations of Intrusion Tolerant SystemsIEEE 2003 pp 213ndash213

[30] M Quaritsch and T Winkler ldquoLinux Security Modules Enhance-ments Module Stacking Framework and TCP State TransitionHooks for State-Driven NIDSrdquo Secure Information and Communi-cation vol 7 pp 7ndash13 2004

[31] C Schaufler ldquoLSM Generalize existing module stackingrdquo LinuxWeekly News 2014

[32] S Smalley C Vance and W Salamon ldquoImplementing SELinux asa Linux Security Modulerdquo NAI Labs Report vol 1 p 43 2001

[33] M Bauer ldquoParanoid Penguin an Introduction to Novell AppAr-morrdquo Linux Journal vol 2006 no 148 p 13 2006

[34] A Edwards T Jaeger and X Zhang ldquoRuntime verification ofauthorization hook placement for the Linux Security Modulesframeworkrdquo in Conference on Computer and Communications Secu-rity ACM 2002 pp 225ndash234

[35] T Jaeger A Edwards and X Zhang ldquoConsistency analysis ofauthorization hook placement in the Linux security modulesframeworkrdquo ACM Transactions on Information and System Security(TISSEC) vol 7 no 2 pp 175ndash205 2004

[36] V Ganapathy T Jaeger and S Jha ldquoAutomatic placement ofauthorization hooks in the Linux security modules frameworkrdquoin Conference on Computer and Communications Security ACM2005 pp 330ndash339

[37] I P Egwutuoha D Levy B Selic and S Chen ldquoA survey of faulttolerance mechanisms and checkpointrestart implementationsfor high performance computing systemsrdquo Springer 2013vol 65 no 3 pp 1302ndash1326

[38] O Laadan and S E Hallyn ldquoLinux-CR Transparent applicationcheckpoint-restart in Linuxrdquo in Linux Symposium 2010 p 159

[39] B Niu and G Tan ldquoEfficient User-space Information Flow Con-trolrdquo in Symposium on Information Computer and CommunicationsSecurity (SIGSACrsquo13) ACM 2013 pp 131ndash142

[40] T Bird ldquoMeasuring Function Duration with ftracerdquo in Japan LinuxSymposium 2009 pp 47ndash54

[41] B Parno ldquoBootstrapping Trust in ardquo Trustedrdquo Platformrdquo in Con-ference on Hot Topics in Security (HotSecrsquo08) USENIX 2008

[42] N Santos K P Gummadi and R Rodrigues ldquoTowards TrustedCloud Computingrdquo in Conference on Hot Topics in Cloud Comput-ing USENIX 2009 pp 3ndash3

[43] S Berger R Caceres K A Goldman R Perez R Sailer andL van Doorn ldquovTPM Virtualizing the Trusted Platform Modulerdquoin Security Symposium USENIX 2006 pp 305ndash320

14

[44] S Berger K Goldman D Pendarakis D Safford E Valdezand M Zohar ldquoScalable Attestation A Step Toward Secure andTrusted Cloudsrdquo in International Conference on Cloud Engineering(IC2E) IEEE 2015

[45] J Singh D Eyers and J Bacon ldquoPolicy Enforcement withinEmerging Distributed Event-Based Systemsrdquo in ACM DistributedEvent-Based Systems (DEBSrsquo14) 2014

[46] L Sfaxi T Abdellatif R Robbana and Y Lakhnech ldquoInformationFlow Control of Component-based Distributed Systemsrdquo Concur-rency and Computation Practice and Experience vol 25 no 2 pp161ndash179 2013

[47] S Yoshihama T Yoshizawa Y Watanabe M Kudoh and K Oy-anagi ldquoDynamic Information Flow Control Architecture for WebApplicationsrdquo in ESORICS 2007 J Biskup and J Lopez EdsSpringer 2007 vol LNCS 4734 pp 267ndash282

[48] W Cheng D R K Ports D Schultz V Popic A BlanksteinJ Cowling D Curtis L Shrira and B Liskov ldquoAbstractions forUsable Information Flow Control in Aeolusrdquo in USENIX AnnualTechnical Conference Boston 2012

[49] J Singh T Pasquier J Bacon and D Eyers ldquoIntegrating Middle-ware and Information Flow Controlrdquo in International Conferenceon Cloud Engineering (IC2E) IEEE 2015 pp 54ndash59

[50] J Singh T F J-M Pasquier and J Bacon ldquoSecuring Tags toControl Information Flows within the Internet of Thingsrdquo inInternational Conference on Recent Advances in Internet of Things(RIoTrsquo15) IEEE 2015

[51] J S Park and R Sandhu ldquoBinding Identities and Attributes UsingDigitally Signed Certificatesrdquo in Annual Conference on ComputerSecurity Applications IEEE 2000 pp 120ndash127

[52] A Ganjali and D Lie ldquoAuditing cloud management using infor-mation flow trackingrdquo in Workshop on Scalable Trusted ComputingACM 2012 pp 79ndash84

[53] R K Ko M Kirchberg and B S Lee ldquoFrom System-centric toData-centric Logging-accountability Trust amp Security in CloudComputingrdquo in Defense Science Research Conference and Expo (DSR)2011 IEEE 2011 pp 1ndash4

[54] J Singh J Powles T Pasquier and J Bacon ldquoData Flow Man-agement and Compliance in Cloud Computingrdquo IEEE CloudComputing Magazine SI on Legal Clouds 2015

[55] T Zanussi K Yaghmour R Wisniewski R Moore and M Dage-nais ldquorelayfs An efficient unified approach for transmitting datafrom kernel to user spacerdquo in Linux Symposium 2003 p 494

[56] M E Smoot K Ono J Ruscheinski P-L Wang and T IdekerldquoCytoscape 28 New features for data integration and networkvisualizationrdquo Oxford University Press 2011 vol 27 no 3 pp431ndash432

[57] T Pasquier and J Powles ldquoExpressing and Enforcing LocationRequirements in the Cloud using Information Flow Controlrdquo inIC2E International Workshop on Legal and Technical Issues in CloudComputing (CLawrsquo15) IEEE 2015

[58] R Angles and C Gutierrez ldquoSurvey of Graph Database ModelsrdquoACM Computing Surveys (CSUR) vol 40 no 1 p 1 2008

[59] D Hardt ldquoThe OAuth 20 Authorization Frameworkrdquo IETF TechRep RFC 6749 2012

[60] S Smalley and R Craig ldquoSecurity Enhanced (SE) Android Bring-ing Flexible MAC to Androidrdquo in Network and Distributed SystemSecurity Symposium Internet Society 2013

  • 1 Introduction and Motivation
  • 2 Background
    • 21 IFC Models
    • 22 Protection via VMs and Containers
    • 23 Taint Tracking (TT) Systems
    • 24 Sticky Policies
      • 3 CamFlow-Model IFC for the Cloud
        • 31 Tags and Labels
        • 32 Decentralised Privileges and Security Contexts
        • 33 Creating a New Entity
        • 34 Security
          • 341 Information Exchange
          • 342 Label Change
          • 343 Privilege delegation
            • 35 Conflict of Interest
              • 4 OS Enforcement
                • 41 CamFlow-LSM
                  • 411 Checkpointing and Restoration
                  • 412 OS Evaluation
                    • 42 Trusted Processes
                    • 43 Leveraging Hardware Roots of Trust
                      • 5 Cross-Machine Enforcement
                        • 51 Remote Interactions
                        • 52 Message-Level Enforcement
                        • 53 Integrating With Persistent Storage
                        • 54 Evaluation
                          • 6 Audit Data-Centric Logs
                            • 61 Analysing Paths to Disclosure
                            • 62 Demonstrating Compliance
                            • 63 Audit as `Big Data
                            • 64 Audit Access
                              • 7 Example Support for Web Services
                              • 8 Conclusion amp Future Work
                              • References
Page 12: CamFlow: Managed data-sharing for cloud services · CamFlow: Managed data-sharing for cloud services Thomas F. J.-M. Pasquier, Member, IEEE, Jatinder Singh, Member, IEEE, David Eyers,

12

App Instance App InstanceApp Instance

CamFlow-MW

IFC Constrained IFC Constrained IFC Constrained

ContextA

ContextB

Public

Trusted IFC Gateway

Enforcing IFCData-store

Data-store Application

CamFlow-MW CamFlow-MWCamFlow-MW

IFC Cloud Platform

Dr Yellow Dr PinkDr MauveRemote Clients

Fig 8 PaaS Architecture on top of IFC-OS

As described in sect4 we assume the middleware andthe OS enforcement are provided as a service by theunderlying platform A tenant wanting to use the third-party web-service offering once his trust in the under-lying platform is established needs only to audit thegateway again the underlying infrastructure providercould either provide such a gateway or audit it Therest of the software stack of the third-party web-serviceprovider is bound by the IFC enforcement mechanismand therefore need not be trusted

8 CONCLUSION amp FUTURE WORK

IFC allows data flows to be controlled continuouslythroughout a system by providing an information-centric MAC scheme that continuously ensures non-interference between security contexts This paper pre-sented the CamFlow platform that demonstrates thepotential of cloud-deployed IFC as supporting (1) pro-tection of applications from each other (2) flexible datasharing across isolation boundaries (3) prevention ofdata leakage due to bugsmisconfigurations (4) exten-sion of access control beyond application boundaries (5)data flow transparency

Specifically we detailed a new kernel implementationof IFC as an LSM demonstrating low overhead evenfor worst-case scenarios where processes continuouslymake readwrite system calls We also described theintegration of a messaging middleware to enforcing IFCacross machines This combination makes it possible toprovide whole-system IFC to PaaS cloud services andtherefore also SaaS Our approach and implementationwere designed so that applications can run unchangedover IFC thus making cloud adoption feasible

We also indicated how the data-centric logs based onIFC enforcement could provide the means to audit anIFC-enabled system whereby a log can be processed asa directed graph to investigate leaks and attacks andshow compliance with data management requirementsThough this represents our initial work in the area thereappears much promise

In light of the above we believe that IFC has greatpotential as a security mechanism for the cloud wherebytrust in a few major cloud providers deploying IFCcan be built on to provide a demonstrably trustworthycomputing environment

CamFlow was developed with cloud deployment inmind Our future work will investigate the challenges ofa broader distributed context

It is already feasible to extend CamFlow to sup-port mobile environments Android supports the fullSELinux enforcement5 and an Android-LSM integrationhas been demonstrated [60] But when dealing withmultiple cloud services particularly as they become partof a wider distributed architecture such as in the IoTa trustworthy system-wide deployment of IFC can nolonger be assumed Much work remains on establishingtrust in (and the trustworthiness of) the IFC enforcementmechanism within end-usersrsquo devices Outside a cloudcontext all partiesrsquo trust in a common third partyrsquos en-forcement of IFC constraints cannot be assumed (unlikethe cloud provider for cloud services) We intend toexplore leveraging hardware roots of trust and remoteattestation to ensure the integrity and trustworthiness ofIFC enforcement mechanisms

Another area of investigation concerns the represen-tation of tags across administrative domains In a cloudcontext a federated approach can be envisaged wherea common understanding of tags could be negotiatedacross multiple domains For example in [54] we dis-cussed initial thoughts on managing data according tospecific obligation regimes However as the numberof administrative domains increases both a global tagnaming scheme and mechanisms for ad-hoc negotiationbecome necessary

Related is the sensitivity of the tags themselvesKnowledge of the meaning of a tag can indicate thatthe associated entity contains or deals with certain infor-mation If a relationship can be established between anentity and the information owner this may disclose pri-vate information about the information owner This maylead to work on a need-to-know negotiation mechanismto establish a secure channel between hosts especiallyin a wide-scale distributed system A promising mech-anism is private set intersection to determine tagsrsquo subsetrelationship (sect341)

Another challenge concerns extending audit to dis-tributed architectures both in terms of resource man-agement and regulating access to log data

5 httpssourceandroidcomdevicestechsecurityselinux

13

ACKNOWLEDGMENTS

This work was supported by UK Engineering andPhysical Sciences Research Council grant EPK011510CloudSafetyNet End-to-End Application Security inthe Cloud We acknowledge the support of Microsoftthrough the Microsoft Cloud Computing Research Cen-tre

REFERENCES

[1] P Barham B Dragovic K Fraser S Hand T Harris A HoR Neugebauer I Pratt and A Warfield ldquoXen and the art ofvirtualizationrdquo SIGOPS Operating Systems Review vol 37 no 5pp 164ndash177 2003

[2] A Kivity Y Kamay D Laor U Lublin and A Liguori ldquokvmthe Linux Virtual Machine Monitorrdquo in Proceedings of the LinuxSymposium vol 1 2007 pp 225ndash230

[3] D Bernstein ldquoContainers and Cloud From LXC to Docker toKubernetesrdquo IEEE Cloud Computing Magazine no 3 pp 81ndash842014

[4] P Desnoyers O Krieger B Holden and J Hennessey ldquoUsingOpenStack for an Open Cloud eXchange (OCX)rdquo in InternationalConference on Cloud Engineering (IC2E) IEEE 2015

[5] J Mineraud O Mazhelis X Su and S Tarkoma ldquoA Gap Analysisof Internet-of-Things Platformsrdquo 2015 Arxiv arXiv150201181[Online] Available httparxivorgabs150201181

[6] C J Millard Ed Cloud Computing Law Oxford University Press2013

[7] S Pearson and M Casassa-Mont ldquoSticky Policies An Approachfor Managing Privacy across Multiple Partiesrdquo Computer vol 44July 2011

[8] D E Denning ldquoA lattice model of secure information flowrdquoCommunication of the ACM vol 19 no 5 pp 236ndash243 1976

[9] D E Bell and L J LaPadula ldquoSecure Computer Systems Mathe-matical Foundations and Modelrdquo The MITRE Corp Bedford MATech Rep M74-244 1973

[10] K J Biba ldquoIntegrity Considerations for Secure Computer Sys-temsrdquo MITRE Corp Tech Rep ESD-TR 76-372 1977

[11] A C Myers and B Liskov ldquoA Decentralized Model for Infor-mation Flow Controlrdquo in 17th Symposium on Operating SystemsPrinciples (SOSP) ACM 1997 pp 129ndash142

[12] A Bouard B Weyl and C Eckert ldquoPractical information-flowaware middleware for in-car communicationrdquo in Workshop onSecurity privacy amp dependability for cyber vehicles ACM 2013pp 3ndash8

[13] K Singh S Bhola and W Lee ldquoxBook Redesigning PrivacyControl in Social Networking Platformsrdquo in Security SymposiumUSENIX 2009 pp 249ndash266

[14] M Krohn A Yip M Brodsky N Cliffer M F KaashoekE Kohler and R Morris ldquoInformation Flow Control for StandardOS Abstractionsrdquo in Symposium on Operating Systems PrinciplesACM 2007 pp 321ndash334

[15] S Lee E L Wong D Goel M Dahlin and V ShmatikovldquoπBox A Platform for Privacy-Preserving Appsrdquo in Symposiumon Networked System Design and Implementation USENIX 2013pp 501ndash514

[16] N Kumar and R Shyamasundar ldquoRealizing Purpose-Based Pri-vacy Policies Succinctly via Information-Flow Labelsrdquo in Big Dataand Cloud Computing (BDCloudrsquo14) IEEE 2014 pp 753ndash760

[17] W Enck P Gilbert B-G Chun L P Cox J Jung P McDaniel andA N Sheth ldquoTaintDroid An information-flow tracking systemfor realtime privacy monitoring on smartphonesrdquo in Conference onOperating systems design and implementation (OSDIrsquo10) USENIX2010 pp 1ndash6

[18] I Papagiannis M Migliavacca and P Pietzuch ldquoPHP AspisUsing partial taint tracking to protect against injection attacksrdquoin 2nd USENIX Conference on Web Application Development 2011p 13

[19] E Schwartz T Avgerinos and D Brumley ldquoAll you ever wantedto know about dynamic taint analysis and forward symbolicexecution (but might have been afraid to ask)rdquo in Symposium onSecurity and Privacy IEEE 2010

[20] M Casassa-Mont S Pearson and P Bramhall ldquoTowards account-able management of identity and privacy Sticky policies andenforceable tracing servicesrdquo in Database and Expert Systems Ap-plications 2003 Proceedings 14th International Workshop on IEEE2003 pp 377ndash382

[21] S Bandhakavi C C Zhang and M Winslett ldquoSuper-sticky anddeclassifiable release policies for flexible information dissemina-tion controlrdquo in Proceedings of the 5th ACM workshop on Privacy inelectronic society ACM 2006 pp 51ndash58

[22] D W Chadwick and S F Lievens ldquoEnforcing sticky securitypolicies throughout a distributed applicationrdquo in Proceedings ofthe 2008 workshop on Middleware security ACM 2008 pp 1ndash6

[23] M Casassa-Mont I Matteucci M Petrocchi and M L SbodioldquoTowards Safer Information Sharing in the Cloudrdquo InternationalJournal of Information Security pp 1ndash16 2014

[24] S Akoush L Carata R Sohan and A Hopper ldquoMrLazy LazyRuntime Label Propagation for MapReducerdquo in 6th Workshop onHot Topics in Cloud Computing (HotCloud) USENIX 2014

[25] D Schultz and B Liskov ldquoIFDB Decentralized Information FlowControl for Databasesrdquo in European Conference on Computer Sys-tems (Eurosysrsquo13) ACM 2013 pp 43ndash56

[26] D Von Oheimb ldquoInformation Flow Control Revisited Nonin-fluence = Noninterference + Nonleakagerdquo in Computer SecurityndashESORICS 2004 Springer 2004 pp 225ndash243

[27] D Hedin and A Sabelfeld ldquoA perspective on Information-FlowControlrdquo NATO 2012

[28] D E Porter M D Bond I Roy K S McKinley and E WitchelldquoPractical Fine-Grained Information Flow Control Using Lam-inarrdquo ACM Transactions on Programming Languages and Systems(TOPLAS) vol 37 no 1 p 4 2014

[29] C Wright C Cowan S Smalley J Morris and G Kroah-Hartman ldquoLinux Security Modules General security support forthe Linux kernelrdquo in Foundations of Intrusion Tolerant SystemsIEEE 2003 pp 213ndash213

[30] M Quaritsch and T Winkler ldquoLinux Security Modules Enhance-ments Module Stacking Framework and TCP State TransitionHooks for State-Driven NIDSrdquo Secure Information and Communi-cation vol 7 pp 7ndash13 2004

[31] C Schaufler ldquoLSM Generalize existing module stackingrdquo LinuxWeekly News 2014

[32] S Smalley C Vance and W Salamon ldquoImplementing SELinux asa Linux Security Modulerdquo NAI Labs Report vol 1 p 43 2001

[33] M Bauer ldquoParanoid Penguin an Introduction to Novell AppAr-morrdquo Linux Journal vol 2006 no 148 p 13 2006

[34] A Edwards T Jaeger and X Zhang ldquoRuntime verification ofauthorization hook placement for the Linux Security Modulesframeworkrdquo in Conference on Computer and Communications Secu-rity ACM 2002 pp 225ndash234

[35] T Jaeger A Edwards and X Zhang ldquoConsistency analysis ofauthorization hook placement in the Linux security modulesframeworkrdquo ACM Transactions on Information and System Security(TISSEC) vol 7 no 2 pp 175ndash205 2004

[36] V Ganapathy T Jaeger and S Jha ldquoAutomatic placement ofauthorization hooks in the Linux security modules frameworkrdquoin Conference on Computer and Communications Security ACM2005 pp 330ndash339

[37] I P Egwutuoha D Levy B Selic and S Chen ldquoA survey of faulttolerance mechanisms and checkpointrestart implementationsfor high performance computing systemsrdquo Springer 2013vol 65 no 3 pp 1302ndash1326

[38] O Laadan and S E Hallyn ldquoLinux-CR Transparent applicationcheckpoint-restart in Linuxrdquo in Linux Symposium 2010 p 159

[39] B Niu and G Tan ldquoEfficient User-space Information Flow Con-trolrdquo in Symposium on Information Computer and CommunicationsSecurity (SIGSACrsquo13) ACM 2013 pp 131ndash142

[40] T Bird ldquoMeasuring Function Duration with ftracerdquo in Japan LinuxSymposium 2009 pp 47ndash54

[41] B Parno ldquoBootstrapping Trust in ardquo Trustedrdquo Platformrdquo in Con-ference on Hot Topics in Security (HotSecrsquo08) USENIX 2008

[42] N Santos K P Gummadi and R Rodrigues ldquoTowards TrustedCloud Computingrdquo in Conference on Hot Topics in Cloud Comput-ing USENIX 2009 pp 3ndash3

[43] S Berger R Caceres K A Goldman R Perez R Sailer andL van Doorn ldquovTPM Virtualizing the Trusted Platform Modulerdquoin Security Symposium USENIX 2006 pp 305ndash320

14

[44] S Berger K Goldman D Pendarakis D Safford E Valdezand M Zohar ldquoScalable Attestation A Step Toward Secure andTrusted Cloudsrdquo in International Conference on Cloud Engineering(IC2E) IEEE 2015

[45] J Singh D Eyers and J Bacon ldquoPolicy Enforcement withinEmerging Distributed Event-Based Systemsrdquo in ACM DistributedEvent-Based Systems (DEBSrsquo14) 2014

[46] L Sfaxi T Abdellatif R Robbana and Y Lakhnech ldquoInformationFlow Control of Component-based Distributed Systemsrdquo Concur-rency and Computation Practice and Experience vol 25 no 2 pp161ndash179 2013

[47] S Yoshihama T Yoshizawa Y Watanabe M Kudoh and K Oy-anagi ldquoDynamic Information Flow Control Architecture for WebApplicationsrdquo in ESORICS 2007 J Biskup and J Lopez EdsSpringer 2007 vol LNCS 4734 pp 267ndash282

[48] W Cheng D R K Ports D Schultz V Popic A BlanksteinJ Cowling D Curtis L Shrira and B Liskov ldquoAbstractions forUsable Information Flow Control in Aeolusrdquo in USENIX AnnualTechnical Conference Boston 2012

[49] J Singh T Pasquier J Bacon and D Eyers ldquoIntegrating Middle-ware and Information Flow Controlrdquo in International Conferenceon Cloud Engineering (IC2E) IEEE 2015 pp 54ndash59

[50] J Singh T F J-M Pasquier and J Bacon ldquoSecuring Tags toControl Information Flows within the Internet of Thingsrdquo inInternational Conference on Recent Advances in Internet of Things(RIoTrsquo15) IEEE 2015

[51] J S Park and R Sandhu ldquoBinding Identities and Attributes UsingDigitally Signed Certificatesrdquo in Annual Conference on ComputerSecurity Applications IEEE 2000 pp 120ndash127

[52] A Ganjali and D Lie ldquoAuditing cloud management using infor-mation flow trackingrdquo in Workshop on Scalable Trusted ComputingACM 2012 pp 79ndash84

[53] R K Ko M Kirchberg and B S Lee ldquoFrom System-centric toData-centric Logging-accountability Trust amp Security in CloudComputingrdquo in Defense Science Research Conference and Expo (DSR)2011 IEEE 2011 pp 1ndash4

[54] J Singh J Powles T Pasquier and J Bacon ldquoData Flow Man-agement and Compliance in Cloud Computingrdquo IEEE CloudComputing Magazine SI on Legal Clouds 2015

[55] T Zanussi K Yaghmour R Wisniewski R Moore and M Dage-nais ldquorelayfs An efficient unified approach for transmitting datafrom kernel to user spacerdquo in Linux Symposium 2003 p 494

[56] M E Smoot K Ono J Ruscheinski P-L Wang and T IdekerldquoCytoscape 28 New features for data integration and networkvisualizationrdquo Oxford University Press 2011 vol 27 no 3 pp431ndash432

[57] T Pasquier and J Powles ldquoExpressing and Enforcing LocationRequirements in the Cloud using Information Flow Controlrdquo inIC2E International Workshop on Legal and Technical Issues in CloudComputing (CLawrsquo15) IEEE 2015

[58] R Angles and C Gutierrez ldquoSurvey of Graph Database ModelsrdquoACM Computing Surveys (CSUR) vol 40 no 1 p 1 2008

[59] D Hardt ldquoThe OAuth 20 Authorization Frameworkrdquo IETF TechRep RFC 6749 2012

[60] S Smalley and R Craig ldquoSecurity Enhanced (SE) Android Bring-ing Flexible MAC to Androidrdquo in Network and Distributed SystemSecurity Symposium Internet Society 2013

  • 1 Introduction and Motivation
  • 2 Background
    • 21 IFC Models
    • 22 Protection via VMs and Containers
    • 23 Taint Tracking (TT) Systems
    • 24 Sticky Policies
      • 3 CamFlow-Model IFC for the Cloud
        • 31 Tags and Labels
        • 32 Decentralised Privileges and Security Contexts
        • 33 Creating a New Entity
        • 34 Security
          • 341 Information Exchange
          • 342 Label Change
          • 343 Privilege delegation
            • 35 Conflict of Interest
              • 4 OS Enforcement
                • 41 CamFlow-LSM
                  • 411 Checkpointing and Restoration
                  • 412 OS Evaluation
                    • 42 Trusted Processes
                    • 43 Leveraging Hardware Roots of Trust
                      • 5 Cross-Machine Enforcement
                        • 51 Remote Interactions
                        • 52 Message-Level Enforcement
                        • 53 Integrating With Persistent Storage
                        • 54 Evaluation
                          • 6 Audit Data-Centric Logs
                            • 61 Analysing Paths to Disclosure
                            • 62 Demonstrating Compliance
                            • 63 Audit as `Big Data
                            • 64 Audit Access
                              • 7 Example Support for Web Services
                              • 8 Conclusion amp Future Work
                              • References
Page 13: CamFlow: Managed data-sharing for cloud services · CamFlow: Managed data-sharing for cloud services Thomas F. J.-M. Pasquier, Member, IEEE, Jatinder Singh, Member, IEEE, David Eyers,

13

ACKNOWLEDGMENTS

This work was supported by UK Engineering andPhysical Sciences Research Council grant EPK011510CloudSafetyNet End-to-End Application Security inthe Cloud We acknowledge the support of Microsoftthrough the Microsoft Cloud Computing Research Cen-tre

REFERENCES

[1] P Barham B Dragovic K Fraser S Hand T Harris A HoR Neugebauer I Pratt and A Warfield ldquoXen and the art ofvirtualizationrdquo SIGOPS Operating Systems Review vol 37 no 5pp 164ndash177 2003

[2] A Kivity Y Kamay D Laor U Lublin and A Liguori ldquokvmthe Linux Virtual Machine Monitorrdquo in Proceedings of the LinuxSymposium vol 1 2007 pp 225ndash230

[3] D Bernstein ldquoContainers and Cloud From LXC to Docker toKubernetesrdquo IEEE Cloud Computing Magazine no 3 pp 81ndash842014

[4] P Desnoyers O Krieger B Holden and J Hennessey ldquoUsingOpenStack for an Open Cloud eXchange (OCX)rdquo in InternationalConference on Cloud Engineering (IC2E) IEEE 2015

[5] J Mineraud O Mazhelis X Su and S Tarkoma ldquoA Gap Analysisof Internet-of-Things Platformsrdquo 2015 Arxiv arXiv150201181[Online] Available httparxivorgabs150201181

[6] C J Millard Ed Cloud Computing Law Oxford University Press2013

[7] S Pearson and M Casassa-Mont ldquoSticky Policies An Approachfor Managing Privacy across Multiple Partiesrdquo Computer vol 44July 2011

[8] D E Denning ldquoA lattice model of secure information flowrdquoCommunication of the ACM vol 19 no 5 pp 236ndash243 1976

[9] D E Bell and L J LaPadula ldquoSecure Computer Systems Mathe-matical Foundations and Modelrdquo The MITRE Corp Bedford MATech Rep M74-244 1973

[10] K J Biba ldquoIntegrity Considerations for Secure Computer Sys-temsrdquo MITRE Corp Tech Rep ESD-TR 76-372 1977

[11] A C Myers and B Liskov ldquoA Decentralized Model for Infor-mation Flow Controlrdquo in 17th Symposium on Operating SystemsPrinciples (SOSP) ACM 1997 pp 129ndash142

[12] A Bouard B Weyl and C Eckert ldquoPractical information-flowaware middleware for in-car communicationrdquo in Workshop onSecurity privacy amp dependability for cyber vehicles ACM 2013pp 3ndash8

[13] K Singh S Bhola and W Lee ldquoxBook Redesigning PrivacyControl in Social Networking Platformsrdquo in Security SymposiumUSENIX 2009 pp 249ndash266

[14] M Krohn A Yip M Brodsky N Cliffer M F KaashoekE Kohler and R Morris ldquoInformation Flow Control for StandardOS Abstractionsrdquo in Symposium on Operating Systems PrinciplesACM 2007 pp 321ndash334

[15] S Lee E L Wong D Goel M Dahlin and V ShmatikovldquoπBox A Platform for Privacy-Preserving Appsrdquo in Symposiumon Networked System Design and Implementation USENIX 2013pp 501ndash514

[16] N Kumar and R Shyamasundar ldquoRealizing Purpose-Based Pri-vacy Policies Succinctly via Information-Flow Labelsrdquo in Big Dataand Cloud Computing (BDCloudrsquo14) IEEE 2014 pp 753ndash760

[17] W Enck P Gilbert B-G Chun L P Cox J Jung P McDaniel andA N Sheth ldquoTaintDroid An information-flow tracking systemfor realtime privacy monitoring on smartphonesrdquo in Conference onOperating systems design and implementation (OSDIrsquo10) USENIX2010 pp 1ndash6

[18] I Papagiannis M Migliavacca and P Pietzuch ldquoPHP AspisUsing partial taint tracking to protect against injection attacksrdquoin 2nd USENIX Conference on Web Application Development 2011p 13

[19] E Schwartz T Avgerinos and D Brumley ldquoAll you ever wantedto know about dynamic taint analysis and forward symbolicexecution (but might have been afraid to ask)rdquo in Symposium onSecurity and Privacy IEEE 2010

[20] M Casassa-Mont S Pearson and P Bramhall ldquoTowards account-able management of identity and privacy Sticky policies andenforceable tracing servicesrdquo in Database and Expert Systems Ap-plications 2003 Proceedings 14th International Workshop on IEEE2003 pp 377ndash382

[21] S Bandhakavi C C Zhang and M Winslett ldquoSuper-sticky anddeclassifiable release policies for flexible information dissemina-tion controlrdquo in Proceedings of the 5th ACM workshop on Privacy inelectronic society ACM 2006 pp 51ndash58

[22] D W Chadwick and S F Lievens ldquoEnforcing sticky securitypolicies throughout a distributed applicationrdquo in Proceedings ofthe 2008 workshop on Middleware security ACM 2008 pp 1ndash6

[23] M Casassa-Mont I Matteucci M Petrocchi and M L SbodioldquoTowards Safer Information Sharing in the Cloudrdquo InternationalJournal of Information Security pp 1ndash16 2014

[24] S Akoush L Carata R Sohan and A Hopper ldquoMrLazy LazyRuntime Label Propagation for MapReducerdquo in 6th Workshop onHot Topics in Cloud Computing (HotCloud) USENIX 2014

[25] D Schultz and B Liskov ldquoIFDB Decentralized Information FlowControl for Databasesrdquo in European Conference on Computer Sys-tems (Eurosysrsquo13) ACM 2013 pp 43ndash56

[26] D Von Oheimb ldquoInformation Flow Control Revisited Nonin-fluence = Noninterference + Nonleakagerdquo in Computer SecurityndashESORICS 2004 Springer 2004 pp 225ndash243

[27] D Hedin and A Sabelfeld ldquoA perspective on Information-FlowControlrdquo NATO 2012

[28] D E Porter M D Bond I Roy K S McKinley and E WitchelldquoPractical Fine-Grained Information Flow Control Using Lam-inarrdquo ACM Transactions on Programming Languages and Systems(TOPLAS) vol 37 no 1 p 4 2014

[29] C Wright C Cowan S Smalley J Morris and G Kroah-Hartman ldquoLinux Security Modules General security support forthe Linux kernelrdquo in Foundations of Intrusion Tolerant SystemsIEEE 2003 pp 213ndash213

[30] M Quaritsch and T Winkler ldquoLinux Security Modules Enhance-ments Module Stacking Framework and TCP State TransitionHooks for State-Driven NIDSrdquo Secure Information and Communi-cation vol 7 pp 7ndash13 2004

[31] C Schaufler ldquoLSM Generalize existing module stackingrdquo LinuxWeekly News 2014

[32] S Smalley C Vance and W Salamon ldquoImplementing SELinux asa Linux Security Modulerdquo NAI Labs Report vol 1 p 43 2001

[33] M Bauer ldquoParanoid Penguin an Introduction to Novell AppAr-morrdquo Linux Journal vol 2006 no 148 p 13 2006

[34] A Edwards T Jaeger and X Zhang ldquoRuntime verification ofauthorization hook placement for the Linux Security Modulesframeworkrdquo in Conference on Computer and Communications Secu-rity ACM 2002 pp 225ndash234

[35] T Jaeger A Edwards and X Zhang ldquoConsistency analysis ofauthorization hook placement in the Linux security modulesframeworkrdquo ACM Transactions on Information and System Security(TISSEC) vol 7 no 2 pp 175ndash205 2004

[36] V Ganapathy T Jaeger and S Jha ldquoAutomatic placement ofauthorization hooks in the Linux security modules frameworkrdquoin Conference on Computer and Communications Security ACM2005 pp 330ndash339

[37] I P Egwutuoha D Levy B Selic and S Chen ldquoA survey of faulttolerance mechanisms and checkpointrestart implementationsfor high performance computing systemsrdquo Springer 2013vol 65 no 3 pp 1302ndash1326

[38] O Laadan and S E Hallyn ldquoLinux-CR Transparent applicationcheckpoint-restart in Linuxrdquo in Linux Symposium 2010 p 159

[39] B Niu and G Tan ldquoEfficient User-space Information Flow Con-trolrdquo in Symposium on Information Computer and CommunicationsSecurity (SIGSACrsquo13) ACM 2013 pp 131ndash142

[40] T Bird ldquoMeasuring Function Duration with ftracerdquo in Japan LinuxSymposium 2009 pp 47ndash54

[41] B Parno ldquoBootstrapping Trust in ardquo Trustedrdquo Platformrdquo in Con-ference on Hot Topics in Security (HotSecrsquo08) USENIX 2008

[42] N Santos K P Gummadi and R Rodrigues ldquoTowards TrustedCloud Computingrdquo in Conference on Hot Topics in Cloud Comput-ing USENIX 2009 pp 3ndash3

[43] S Berger R Caceres K A Goldman R Perez R Sailer andL van Doorn ldquovTPM Virtualizing the Trusted Platform Modulerdquoin Security Symposium USENIX 2006 pp 305ndash320

14

[44] S Berger K Goldman D Pendarakis D Safford E Valdezand M Zohar ldquoScalable Attestation A Step Toward Secure andTrusted Cloudsrdquo in International Conference on Cloud Engineering(IC2E) IEEE 2015

[45] J Singh D Eyers and J Bacon ldquoPolicy Enforcement withinEmerging Distributed Event-Based Systemsrdquo in ACM DistributedEvent-Based Systems (DEBSrsquo14) 2014

[46] L Sfaxi T Abdellatif R Robbana and Y Lakhnech ldquoInformationFlow Control of Component-based Distributed Systemsrdquo Concur-rency and Computation Practice and Experience vol 25 no 2 pp161ndash179 2013

[47] S Yoshihama T Yoshizawa Y Watanabe M Kudoh and K Oy-anagi ldquoDynamic Information Flow Control Architecture for WebApplicationsrdquo in ESORICS 2007 J Biskup and J Lopez EdsSpringer 2007 vol LNCS 4734 pp 267ndash282

[48] W Cheng D R K Ports D Schultz V Popic A BlanksteinJ Cowling D Curtis L Shrira and B Liskov ldquoAbstractions forUsable Information Flow Control in Aeolusrdquo in USENIX AnnualTechnical Conference Boston 2012

[49] J Singh T Pasquier J Bacon and D Eyers ldquoIntegrating Middle-ware and Information Flow Controlrdquo in International Conferenceon Cloud Engineering (IC2E) IEEE 2015 pp 54ndash59

[50] J Singh T F J-M Pasquier and J Bacon ldquoSecuring Tags toControl Information Flows within the Internet of Thingsrdquo inInternational Conference on Recent Advances in Internet of Things(RIoTrsquo15) IEEE 2015

[51] J S Park and R Sandhu ldquoBinding Identities and Attributes UsingDigitally Signed Certificatesrdquo in Annual Conference on ComputerSecurity Applications IEEE 2000 pp 120ndash127

[52] A Ganjali and D Lie ldquoAuditing cloud management using infor-mation flow trackingrdquo in Workshop on Scalable Trusted ComputingACM 2012 pp 79ndash84

[53] R K Ko M Kirchberg and B S Lee ldquoFrom System-centric toData-centric Logging-accountability Trust amp Security in CloudComputingrdquo in Defense Science Research Conference and Expo (DSR)2011 IEEE 2011 pp 1ndash4

[54] J Singh J Powles T Pasquier and J Bacon ldquoData Flow Man-agement and Compliance in Cloud Computingrdquo IEEE CloudComputing Magazine SI on Legal Clouds 2015

[55] T Zanussi K Yaghmour R Wisniewski R Moore and M Dage-nais ldquorelayfs An efficient unified approach for transmitting datafrom kernel to user spacerdquo in Linux Symposium 2003 p 494

[56] M E Smoot K Ono J Ruscheinski P-L Wang and T IdekerldquoCytoscape 28 New features for data integration and networkvisualizationrdquo Oxford University Press 2011 vol 27 no 3 pp431ndash432

[57] T Pasquier and J Powles ldquoExpressing and Enforcing LocationRequirements in the Cloud using Information Flow Controlrdquo inIC2E International Workshop on Legal and Technical Issues in CloudComputing (CLawrsquo15) IEEE 2015

[58] R Angles and C Gutierrez ldquoSurvey of Graph Database ModelsrdquoACM Computing Surveys (CSUR) vol 40 no 1 p 1 2008

[59] D Hardt ldquoThe OAuth 20 Authorization Frameworkrdquo IETF TechRep RFC 6749 2012

[60] S Smalley and R Craig ldquoSecurity Enhanced (SE) Android Bring-ing Flexible MAC to Androidrdquo in Network and Distributed SystemSecurity Symposium Internet Society 2013

  • 1 Introduction and Motivation
  • 2 Background
    • 21 IFC Models
    • 22 Protection via VMs and Containers
    • 23 Taint Tracking (TT) Systems
    • 24 Sticky Policies
      • 3 CamFlow-Model IFC for the Cloud
        • 31 Tags and Labels
        • 32 Decentralised Privileges and Security Contexts
        • 33 Creating a New Entity
        • 34 Security
          • 341 Information Exchange
          • 342 Label Change
          • 343 Privilege delegation
            • 35 Conflict of Interest
              • 4 OS Enforcement
                • 41 CamFlow-LSM
                  • 411 Checkpointing and Restoration
                  • 412 OS Evaluation
                    • 42 Trusted Processes
                    • 43 Leveraging Hardware Roots of Trust
                      • 5 Cross-Machine Enforcement
                        • 51 Remote Interactions
                        • 52 Message-Level Enforcement
                        • 53 Integrating With Persistent Storage
                        • 54 Evaluation
                          • 6 Audit Data-Centric Logs
                            • 61 Analysing Paths to Disclosure
                            • 62 Demonstrating Compliance
                            • 63 Audit as `Big Data
                            • 64 Audit Access
                              • 7 Example Support for Web Services
                              • 8 Conclusion amp Future Work
                              • References
Page 14: CamFlow: Managed data-sharing for cloud services · CamFlow: Managed data-sharing for cloud services Thomas F. J.-M. Pasquier, Member, IEEE, Jatinder Singh, Member, IEEE, David Eyers,

14

[44] S Berger K Goldman D Pendarakis D Safford E Valdezand M Zohar ldquoScalable Attestation A Step Toward Secure andTrusted Cloudsrdquo in International Conference on Cloud Engineering(IC2E) IEEE 2015

[45] J Singh D Eyers and J Bacon ldquoPolicy Enforcement withinEmerging Distributed Event-Based Systemsrdquo in ACM DistributedEvent-Based Systems (DEBSrsquo14) 2014

[46] L Sfaxi T Abdellatif R Robbana and Y Lakhnech ldquoInformationFlow Control of Component-based Distributed Systemsrdquo Concur-rency and Computation Practice and Experience vol 25 no 2 pp161ndash179 2013

[47] S Yoshihama T Yoshizawa Y Watanabe M Kudoh and K Oy-anagi ldquoDynamic Information Flow Control Architecture for WebApplicationsrdquo in ESORICS 2007 J Biskup and J Lopez EdsSpringer 2007 vol LNCS 4734 pp 267ndash282

[48] W Cheng D R K Ports D Schultz V Popic A BlanksteinJ Cowling D Curtis L Shrira and B Liskov ldquoAbstractions forUsable Information Flow Control in Aeolusrdquo in USENIX AnnualTechnical Conference Boston 2012

[49] J Singh T Pasquier J Bacon and D Eyers ldquoIntegrating Middle-ware and Information Flow Controlrdquo in International Conferenceon Cloud Engineering (IC2E) IEEE 2015 pp 54ndash59

[50] J Singh T F J-M Pasquier and J Bacon ldquoSecuring Tags toControl Information Flows within the Internet of Thingsrdquo inInternational Conference on Recent Advances in Internet of Things(RIoTrsquo15) IEEE 2015

[51] J S Park and R Sandhu ldquoBinding Identities and Attributes UsingDigitally Signed Certificatesrdquo in Annual Conference on ComputerSecurity Applications IEEE 2000 pp 120ndash127

[52] A Ganjali and D Lie ldquoAuditing cloud management using infor-mation flow trackingrdquo in Workshop on Scalable Trusted ComputingACM 2012 pp 79ndash84

[53] R K Ko M Kirchberg and B S Lee ldquoFrom System-centric toData-centric Logging-accountability Trust amp Security in CloudComputingrdquo in Defense Science Research Conference and Expo (DSR)2011 IEEE 2011 pp 1ndash4

[54] J Singh J Powles T Pasquier and J Bacon ldquoData Flow Man-agement and Compliance in Cloud Computingrdquo IEEE CloudComputing Magazine SI on Legal Clouds 2015

[55] T Zanussi K Yaghmour R Wisniewski R Moore and M Dage-nais ldquorelayfs An efficient unified approach for transmitting datafrom kernel to user spacerdquo in Linux Symposium 2003 p 494

[56] M E Smoot K Ono J Ruscheinski P-L Wang and T IdekerldquoCytoscape 28 New features for data integration and networkvisualizationrdquo Oxford University Press 2011 vol 27 no 3 pp431ndash432

[57] T Pasquier and J Powles ldquoExpressing and Enforcing LocationRequirements in the Cloud using Information Flow Controlrdquo inIC2E International Workshop on Legal and Technical Issues in CloudComputing (CLawrsquo15) IEEE 2015

[58] R Angles and C Gutierrez ldquoSurvey of Graph Database ModelsrdquoACM Computing Surveys (CSUR) vol 40 no 1 p 1 2008

[59] D Hardt ldquoThe OAuth 20 Authorization Frameworkrdquo IETF TechRep RFC 6749 2012

[60] S Smalley and R Craig ldquoSecurity Enhanced (SE) Android Bring-ing Flexible MAC to Androidrdquo in Network and Distributed SystemSecurity Symposium Internet Society 2013

  • 1 Introduction and Motivation
  • 2 Background
    • 21 IFC Models
    • 22 Protection via VMs and Containers
    • 23 Taint Tracking (TT) Systems
    • 24 Sticky Policies
      • 3 CamFlow-Model IFC for the Cloud
        • 31 Tags and Labels
        • 32 Decentralised Privileges and Security Contexts
        • 33 Creating a New Entity
        • 34 Security
          • 341 Information Exchange
          • 342 Label Change
          • 343 Privilege delegation
            • 35 Conflict of Interest
              • 4 OS Enforcement
                • 41 CamFlow-LSM
                  • 411 Checkpointing and Restoration
                  • 412 OS Evaluation
                    • 42 Trusted Processes
                    • 43 Leveraging Hardware Roots of Trust
                      • 5 Cross-Machine Enforcement
                        • 51 Remote Interactions
                        • 52 Message-Level Enforcement
                        • 53 Integrating With Persistent Storage
                        • 54 Evaluation
                          • 6 Audit Data-Centric Logs
                            • 61 Analysing Paths to Disclosure
                            • 62 Demonstrating Compliance
                            • 63 Audit as `Big Data
                            • 64 Audit Access
                              • 7 Example Support for Web Services
                              • 8 Conclusion amp Future Work
                              • References