Making Cloud Storage Provenance-Aware
DESCRIPTION
Making Cloud Storage Provenance-Aware. Kiran-Kumar Muniswamy-Reddy, Peter Macko, and Margo Seltzer. Harvard School of Engineering and Applied Sciences. The Cloud. Next-generation computing environment. Cheap: pay as you go. Provision resources (storage, CPU) on an as-needed basis.
TRANSCRIPT
![Page 1: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/1.jpg)
Making Cloud Storage Provenance-Aware
Kiran-Kumar Muniswamy-Reddy, Peter Macko, and Margo Seltzer
Harvard School of Engineering and Applied Sciences
![Page 2: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/2.jpg)
2/23/2009 Making a cloud Provenance-Aware - TaPP'09
The Cloud
- Next-generation computing environment
  - Cheap: pay as you go
  - Provision resources (storage, CPU) on an as-needed basis
  - Provides the illusion of infinite resources
  - Companies with large batch-oriented tasks can get results quickly
- Cloud providers
  - Amazon Web Services (AWS)
  - Google AppEngine
  - Microsoft Azure
![Page 3: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/3.jpg)
Provenance for the Cloud
- As apps move to the cloud, so will the data
  - Amazon hosts scientific data for free
- However, most cloud services are not designed to store provenance
- Why provenance?
  - Debug application results
  - Validate data sets
  - Improve search results
  - Regulatory compliance
![Page 4: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/4.jpg)
Provenance Properties
- We identified the following properties
  - Read Correctness
  - Causal Ancestry Ordering
  - Queryable
![Page 5: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/5.jpg)
Read Correctness
- Data must be what is described by provenance
- Provenance accurately describes the data object
- Mechanisms
  - Atomicity: At storage time, both provenance and data should be stored or neither should be stored
  - Consistency: At retrieval time, data returned should be consistent with provenance
![Page 6: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/6.jpg)
Causal Ancestry Ordering
- The provenance and data of an ancestor object must be recorded in the provenance system
- No dangling references
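One way to read the "no dangling references" requirement is as a check over stored provenance records: every ancestor named by a record must itself have a record. The following is a minimal sketch, assuming a hypothetical in-memory representation in which each object maps to the list of ancestors it depends on. The function name `find_dangling` and the sample records are illustrative and do not come from the talk.

```python
def find_dangling(records):
    """Return sorted (object, ancestor) pairs whose ancestor has no record.

    `records` maps an object name to the list of ancestors it depends on.
    An object with no ancestors still needs an entry (an empty list).
    """
    dangling = []
    for obj, ancestors in records.items():
        for ancestor in ancestors:
            if ancestor not in records:
                dangling.append((obj, ancestor))
    return sorted(dangling)


# Hypothetical store: B depends on P, P depends on A, but A was never recorded.
incomplete = {"B": ["P"], "P": ["A"]}
# The same store once A's record is present.
complete = {"B": ["P"], "P": ["A"], "A": []}

print(find_dangling(incomplete))
print(find_dangling(complete))
```

A store satisfies the property in this sketch when `find_dangling` returns an empty list.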
![Page 7: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/7.jpg)
Efficient Query
- Provenance must be accessible to users who want to verify properties of their data or simply be aware of its lineage
- If provenance is not readily accessible, the provenance is of questionable value.
![Page 8: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/8.jpg)
Goal
- How do we design protocols around current cloud services such that these properties are satisfied?
- Setting
  - Provenance-Aware Storage System (PASS) tracks and collects provenance
  - Primarily considered AWS
  - Used 3 services: S3, SimpleDB, SQS
![Page 9: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/9.jpg)
Outline
- Introduction
- PASS Background
- Protocol 1: Standalone S3
- Protocol 2: S3 + SimpleDB
- Protocol 3: S3 + SimpleDB + SQS
- Analysis
- Conclusion and Status
![Page 10: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/10.jpg)
Provenance-Aware Storage System
- Observes system calls that applications make and captures relationships between objects
  - P: read A
    - Generates record: P depends on A
    - Cache the record
  - P: write B
    - Generates record: B depends on P
    - Store both 'B depends on P' and 'P depends on A'
- Mirrors data locally and caches provenance until it needs to be sent to AWS
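The read/write example on this slide can be sketched as a small event tracker. This is only an illustration of the record-generation rule, assuming records are `(dependent, ancestor)` tuples. The class `ProvenanceTracker` is hypothetical and is not the PASS implementation, which observes real system calls.

```python
class ProvenanceTracker:
    """Toy stand-in for PASS: turns read/write events into dependency records."""

    def __init__(self):
        self.cached = []   # records held in memory, not yet stored
        self.stored = []   # records that have been written out

    def read(self, process, obj):
        # A process that reads an object depends on it; cache the record.
        self.cached.append((process, obj))

    def write(self, process, obj):
        # The written object depends on the process that wrote it.
        self.stored.append((obj, process))
        # Store the writer's cached ancestry along with the new record.
        keep = []
        for record in self.cached:
            if record[0] == process:
                self.stored.append(record)
            else:
                keep.append(record)
        self.cached = keep


tracker = ProvenanceTracker()
tracker.read("P", "A")    # P: read A  -> cache 'P depends on A'
tracker.write("P", "B")   # P: write B -> store 'B depends on P' and 'P depends on A'
print(tracker.stored)
```

Deferring the read record until a write happens mirrors the slide: ancestry is cached until an object that depends on it is actually stored.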
![Page 11: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/11.jpg)
![Page 12: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/12.jpg)
Simple Storage Service (S3)
- Object store: sizes from 1 byte to 5GB
- Objects identified by URI
- SOAP or REST interface
- Operations: PUT, GET, HEAD, COPY, DELETE
  - PUT: store an object and its metadata (2KB limit)
  - HEAD: retrieves metadata of an object
- Cost: data storage + bandwidth + num ops
- Eventual consistency
![Page 13: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/13.jpg)
Architecture 1: Standalone S3
[Diagram: the Application runs at user level on top of PASS at system level; PASS sends provenance and data together (Prov+Data) to S3.]
![Page 14: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/14.jpg)
Protocol 1: Standalone S3
[Sequence diagram: PASS sends PUT:(Prov > 1KB) to S3; S3 replies OK.]
![Page 15: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/15.jpg)
Protocol 1: Standalone S3
[Sequence diagram: PASS sends PUT:Data to S3; S3 replies OK.]
![Page 16: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/16.jpg)
Properties

| Arch | Read Correctness (Atomicity) | Read Correctness (Consistency) | Causal Ordering | Efficient Query |
| --- | --- | --- | --- | --- |
| S3 | | | | |

[Cell marks not preserved in the transcript.]
![Page 17: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/17.jpg)
![Page 18: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/18.jpg)
SimpleDB
- Service providing database functionality
- Data model: items described by attribute-value pairs
  - 256 attrs maximum, name/value < 1KB
- Operations: PutAttributes, Query, QueryWithAttributes, and SELECT
  - Query returns items
  - QueryWithAttributes returns both items and attributes
- Cost: bandwidth + storage + num ops + machine hrs
![Page 19: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/19.jpg)
Architecture 2: S3 + SimpleDB
[Diagram: the Application runs at user level on top of PASS at system level; PASS sends data (Data) to S3 and provenance (Prov) to SimpleDB.]
![Page 20: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/20.jpg)
Protocol 2: S3 + SimpleDB
[Sequence diagram: PASS sends PUT:(rec > 1KB) to S3, which replies OK; PASS sends PutAttrs+ to SimpleDB, which replies OK.]
![Page 21: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/21.jpg)
Protocol 2: S3 + SimpleDB
[Sequence diagram: PASS sends PUT:Data to S3; S3 replies OK. SimpleDB is not involved in this step.]
![Page 22: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/22.jpg)
Properties

| Arch | Read Correctness (Atomicity) | Read Correctness (Consistency) | Causal Ordering | Efficient Query |
| --- | --- | --- | --- | --- |
| S3 | | | | |
| SimpleDB | | | | |

[Cell marks not preserved in the transcript.]
![Page 23: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/23.jpg)
![Page 24: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/24.jpg)
Simple Queuing Service (SQS)
- Distributed messaging system
- Queues are identified by URL
- Operations: SendMessage, ReceiveMessage, DeleteMessage
- VisibilityTimeout: message will not be available for x seconds after a ReceiveMessage
- Limits: 8KB message size, max 10 msgs can be received
![Page 25: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/25.jpg)
Architecture 3: S3 + SimpleDB + SQS
[Diagram: the Application runs at user level on top of PASS at system level; the components shown are S3, SimpleDB, and a queue (Queue1), with data (Data) and provenance (Prov) flows.]
![Page 26: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/26.jpg)
Protocol 3: S3 + SimpleDB + SQS
[Sequence diagram among PASS, SQS, Commitd, SimpleDB, and S3. Labeled messages: PUT: Temp copy to S3 (OK), SndMsg+ to SQS (OK), RecvMsg+ from SQS, PutAttrs+ to SimpleDB (OK), COPY on S3 (OK).]
![Page 27: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/27.jpg)
Protocol 3: S3 + SimpleDB + SQS
[Sequence diagram among PASS, SQS, Commitd, SimpleDB, and S3. Labeled messages: DelMsg+ to SQS, DEL:CPY to S3 (OK).]
![Page 28: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/28.jpg)
Idempotency
- SimpleDB, S3, and SQS are idempotent
- If a commit daemon crashes, comes back up, and processes a transaction again, there will not be errors
![Page 29: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/29.jpg)
Properties

| Arch | Read Correctness (Atomicity) | Read Correctness (Consistency) | Causal Ordering | Efficient Query |
| --- | --- | --- | --- | --- |
| S3 | | | | |
| SimpleDB | | | | |
| SQS | | | | |

[Cell marks not preserved in the transcript.]
![Page 30: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/30.jpg)
![Page 31: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/31.jpg)
Analysis
- Extracted provenance by running three workloads
  - Linux compile
  - Blast
  - Provenance challenge
- Compute cost to store and query provenance
  - Number of ops
  - Bandwidth
![Page 32: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/32.jpg)
Storage Cost

| | Raw | P1 | P2 | P3 |
| --- | --- | --- | --- | --- |
| Data | 1.27GB | 121.8MB (9.3%) | 167.8MB (13.6%) | 421.4MB (32.2%) |
| Ops | 31,180 | 24,952 (80.0%) | 168,514 (540.5%) | 231,287 (741.7%) |

P1 = S3
P2 = S3 + SimpleDB
P3 = S3 + SimpleDB + SQS
![Page 33: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/33.jpg)
Query Cost
1. Dump the provenance of a given object.
   - Ran it on all objects for statistical significance
2. Find all the files that were outputs of blast.
3. Find all the descendants of files derived from blast.
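Queries 2 and 3 amount to a one-hop lookup and a transitive traversal over dependency records. The sketch below assumes records are `(dependent, ancestor)` pairs held in memory. The sample records and the `descendants` helper are hypothetical and say nothing about how the queries were issued against S3 or SimpleDB.

```python
from collections import deque


def descendants(records, roots):
    """Return every object that transitively depends on any object in `roots`.

    `records` is a list of (dependent, ancestor) pairs.
    """
    # Index the records by ancestor so each hop is a dictionary lookup.
    children = {}
    for dependent, ancestor in records:
        children.setdefault(ancestor, []).append(dependent)

    seen = set()
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical records: blast wrote out.txt, a later process read it, and so on.
records = [
    ("out.txt", "blast"),
    ("summarize", "out.txt"),
    ("report.txt", "summarize"),
    ("unrelated.txt", "cp"),
]
blast_outputs = {dep for dep, anc in records if anc == "blast"}   # query 2
derived = descendants(records, blast_outputs)                     # query 3
print(sorted(blast_outputs), sorted(derived))
```

In this sketch processes and files are both nodes, so the result of query 3 includes the intermediate process as well as the files it wrote.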
![Page 34: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/34.jpg)
Query results

| Query | S3 Data | S3 Ops | SimpleDB Data | SimpleDB Ops |
| --- | --- | --- | --- | --- |
| 1 | 121.8MB | 56,132 | 51.24MB | 71,825 |
| 2 | 121.8MB | 56,132 | 2.8KB | 6 |
| 3 | 121.8MB | 56,132 | 13.8KB | 31 |
![Page 35: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/35.jpg)
![Page 36: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/36.jpg)
Conclusions
- Identified the properties that need to be satisfied for storing provenance in the cloud
- Presented various protocols for storing provenance and data on the cloud
- The cost of storing provenance is reasonable
![Page 37: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/37.jpg)
Status
- System almost ready
  - Plan to submit it to Symposium on Operating Systems Principles (SOSP)
- Really hard to drive up the cost
  - Jan Bill = $1.95
  - Feb Bill = $9.38
![Page 38: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/38.jpg)
Extra
![Page 39: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/39.jpg)
Protocol 1: Standalone S3
On file close:
- Convert the provenance into attribute-value pairs as required by S3
- If (sizeof(record) > 1KB)
  - Store the record in a separate S3 object
  - Replace attribute-value pair with pointer to this object
- Upload the file using PUT
  - Arguments: object, attributes
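The file-close steps above can be sketched with an in-memory stand-in for the object store. Everything named here is an assumption made for illustration: `FakeS3` is not the real S3 API, and the overflow key format and `ptr:` prefix are invented.

```python
LIMIT = 1024  # 1KB threshold from the slide


class FakeS3:
    """In-memory stand-in for an object store; not the real S3 API."""

    def __init__(self):
        self.objects = {}

    def put(self, key, data, metadata=None):
        # Store the object body together with a copy of its metadata.
        self.objects[key] = (data, dict(metadata or {}))


def store_with_provenance(s3, key, data, provenance):
    """Upload `data` with its provenance attached as object metadata."""
    attributes = {}
    for name, value in provenance.items():
        if len(value.encode()) > LIMIT:
            # Oversized record: store it separately and keep only a pointer.
            overflow_key = f"{key}.prov.{name}"
            s3.put(overflow_key, value.encode())
            attributes[name] = f"ptr:{overflow_key}"
        else:
            attributes[name] = value
    # A single PUT carries both the object and its attributes.
    s3.put(key, data, attributes)


s3 = FakeS3()
store_with_provenance(s3, "B", b"contents", {"input": "A", "env": "x" * 2000})
print(s3.objects["B"][1])
```

Because the data and its attributes travel in one PUT, they are stored together or not at all in this sketch. Only the overflow objects are written in separate operations.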
![Page 40: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/40.jpg)
Protocol 2: S3 + SimpleDB
On file close:
1. Convert the provenance into attribute-value pairs as required by SimpleDB
   - Additional record: md5sum of (file contents + version)
2. If (sizeof(record) > 1KB)
   - Store the record in a separate S3 object
   - Replace attribute-value pair with pointer to this object
3. Issue PutAttributes: store the provenance
   - One item (= one PutAttributes) per version of the object
4. Upload the file to S3 using PUT
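The extra md5sum record suggests how a reader could detect data that does not match its provenance. The sketch below uses plain dictionaries in place of SimpleDB and S3, assumes the version is kept alongside the data, and omits the 1KB overflow step. The item naming scheme and the helper names are hypothetical.

```python
import hashlib


def digest(data, version):
    """md5 over the file contents plus the version, as on the slide."""
    return hashlib.md5(data + str(version).encode()).hexdigest()


def store(simpledb, s3, name, version, data, provenance):
    item = dict(provenance)
    item["md5"] = digest(data, version)
    simpledb[f"{name}_{version}"] = item   # stands in for PutAttributes
    s3[name] = (data, version)             # stands in for the S3 PUT


def is_consistent(simpledb, s3, name):
    """True if the stored data matches the provenance item for its version."""
    data, version = s3[name]
    item = simpledb.get(f"{name}_{version}")
    return item is not None and item["md5"] == digest(data, version)


simpledb, s3 = {}, {}
store(simpledb, s3, "B", 1, b"first", {"input": "A"})
ok_before = is_consistent(simpledb, s3, "B")

# Simulate data for version 2 arriving without its provenance item.
s3["B"] = (b"second", 2)
ok_after = is_consistent(simpledb, s3, "B")
print(ok_before, ok_after)
```

The check detects a mismatch at read time. It does not make the two writes atomic, since provenance and data still go to separate services in separate calls.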
![Page 41: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/41.jpg)
Protocol 3: S3 + SimpleDB + SQS (I)
Log phase: Log data on a queue
1. Store a copy of the file in a temporary location on S3
2. Allocate a transaction ID (uuid)
3. Split provenance into chunks of 8KB and enqueue them on an SQS queue
   - Tag each message with the transaction ID
   - One additional record that has a pointer to the temp S3 object
![Page 42: Making Cloud Storage Provenance-Aware](https://reader036.vdocuments.net/reader036/viewer/2022062809/5681587d550346895dc5dedb/html5/thumbnails/42.jpg)
Protocol 3: S3 + SimpleDB + SQS (II)
Commit phase: move data from SQS to S3 and SimpleDB
1. ReceiveMessage: get messages from the queue and assemble the packets
2. Store the provenance in SimpleDB using PutAttributes call
   - Take care of overflows
3. Execute an S3 COPY and copy the object from its temporary location to permanent
4. Delete messages from SQS
5. Delete temporary file copy
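The log and commit phases can be sketched end to end with in-memory stand-ins: dictionaries for S3 and SimpleDB, and a list for the SQS queue. The function names, message layout, and `tmp/` key prefix are assumptions for illustration. Provenance is treated as an ASCII string so that character counts equal byte counts, and SimpleDB overflow handling is left out.

```python
import uuid

CHUNK = 8 * 1024  # 8KB SQS message limit from the slides


def log_phase(s3, queue, name, data, provenance):
    """Stage the file on S3 and log provenance chunks on the queue."""
    txn = str(uuid.uuid4())
    temp_key = f"tmp/{txn}"
    s3[temp_key] = data
    chunks = [provenance[i:i + CHUNK] for i in range(0, len(provenance), CHUNK)]
    for seq, chunk in enumerate(chunks):
        queue.append({"txn": txn, "seq": seq, "prov": chunk})
    # One additional record points at the temporary object.
    queue.append({"txn": txn, "temp": temp_key, "dest": name})
    return txn


def commit_phase(s3, simpledb, queue, txn):
    """Move one logged transaction into SimpleDB and its final S3 location."""
    msgs = [m for m in queue if m["txn"] == txn]
    parts = sorted((m for m in msgs if "prov" in m), key=lambda m: m["seq"])
    pointer = next(m for m in msgs if "temp" in m)
    # Stands in for PutAttributes: rewriting the same value is harmless.
    simpledb[pointer["dest"]] = "".join(p["prov"] for p in parts)
    # Stands in for COPY: copying the same bytes twice is harmless.
    if pointer["temp"] in s3:
        s3[pointer["dest"]] = s3[pointer["temp"]]
    return msgs, pointer


def cleanup(s3, queue, msgs, pointer):
    """Delete the logged messages, then the temporary copy."""
    for m in msgs:
        if m in queue:
            queue.remove(m)
    s3.pop(pointer["temp"], None)


s3, simpledb, queue = {}, {}, []
txn = log_phase(s3, queue, "B", b"contents", "p" * 20000)
msgs, pointer = commit_phase(s3, simpledb, queue, txn)
# Simulate a crash before cleanup: the daemon restarts and replays the commit.
msgs, pointer = commit_phase(s3, simpledb, queue, txn)
cleanup(s3, queue, msgs, pointer)
print(sorted(s3), len(simpledb["B"]), queue)
```

Replaying `commit_phase` leaves the same final state, which is the behavior the Idempotency slide relies on. Messages are deleted only after both the provenance and the data are in place, so a crash at any earlier point leaves the transaction on the queue to be retried.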