2018 Storage Developer Conference. © LOCKSS Program, Stanford University. All Rights Reserved. 1
Distributed Data Integrity Assurance and Repair Using the
LOCKSS Content Audit Protocol (LCAP)
Thib Guicherd-Callin
LOCKSS Program, Stanford University
Outline
1. Context and Use Cases
2. Threat Models
3. LCAP in Action
4. Unlocking LOCKSS for Developers
5. Q&A
1. Context and Use Cases
Traditional Research Libraries
❒ Ownership model
❒ Many independent replicas
❒ Features
❒ Disaster resistance
❒ Disaster recovery
❒ Tamper evident
❒ Permanent access
Research Libraries in the Web Era
❒ Leasing model
❒ One master copy
❒ Misfeatures
❒ Disaster resistance?
❒ Disaster recovery?
❒ Tamper evident?
❒ Permanent access?
LOCKSS Technology Use Cases
❒ "Lots Of Copies Keep Stuff Safe"
❒ Global LOCKSS Network (GLN)
❒ CLOCKSS Archive
❒ Government documents networks
❒ Regional and national networks
Key Publications
❒ "Founding paper"
❒ David S.H. Rosenthal, Vicky Reich. "Permanent Web Publishing." Proceedings of the 2000 USENIX Annual Technical Conference, FREENIX Track, pg. 129-140, 2000. URL: https://www.usenix.org/legacy/publications/library/proceedings/usenix2000/freenix/rosenthal.html
❒ "Protocol paper"
❒ Petros Maniatis, Mema Roussopoulos, TJ Giuli, David S.H. Rosenthal, Mary Baker, and Yanto Muliadi. "Preserving Peer Replicas By Rate-Limited Sampled Voting." Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles (SOSP '03), pg. 44-59, 2003. DOI: 10.1145/945445.945451
❒ Petros Maniatis, Mema Roussopoulos, TJ Giuli, David S.H. Rosenthal, Mary Baker, and Yanto Muliadi. "LOCKSS: A Peer-To-Peer Digital Preservation System." Technical report cs.CR/0303026, Stanford University, 2003. URL: http://www.eecs.harvard.edu/~mema/publications/SOSP2003-long.pdf
❒ "Threat models paper"
❒ David S.H. Rosenthal, Thomas S. Robertson, Tom Lipkis, Vicky Reich, Seth Morabito. "Requirements for Digital Preservation Systems: A Bottom-Up Approach." D-Lib Magazine, vol. 11, iss. 11, November 2005. DOI: 10.1045/november2005-rosenthal
2. Threat Models
Goal of Digital Preservation
The goal of a digital preservation system is that the
information it contains remains accessible to users
over a period of time much longer than the lifetime
of individual storage media, hardware and
software components
Key Properties
❒ No single point of failure
❒ Media, hardware and software flow through as
they fail or are replaced
❒ Regular audits frequent enough to keep
probability of irrecoverable failure acceptable
Threat Taxonomy (1)
❒ Media failure
❒ Hardware failure
❒ Software failure
❒ Communication errors
❒ Failure of network services
❒ Natural disaster
Threat Taxonomy (2)
❒ Media and hardware obsolescence
❒ Software obsolescence
Threat Taxonomy (3)
❒ Operator error
❒ Economic failure
❒ Organizational failure
Threat Taxonomy (4)
❒ External attack
❒ Internal attack
3. LCAP in Action
Basic Principle
❒ Peers P1, P2, P3, P4, P5 and P6 hold identical replicas of X
❒ Peer P1 calls a poll on content X: "What is hash(X)?"
❒ P2, P3, P4, P5 and P6 each answer hash(X) = h1
❒ P1: "P2, P3, P4, P5, P6 agreed with me on X"
❒ Landslide agreement
❒ Peer P2 calls a poll on content X: "What is hash(X)?"
❒ P1, P3, P4, P5 and P6 each answer hash(X) = h1
❒ P2: "P1, P3, P4, P5, P6 agreed with me on X"
❒ Landslide agreement
❒ Peer P1 incurs damage on content X
❒ Peer P1 later calls a poll on content X: "What is hash(X)?"
❒ P2, P3, P4, P5 and P6 each answer hash(X) = h1
❒ P1's damaged replica gives hash(X) = h2
❒ Landslide disagreement
❒ Repair request: P1 asks a peer "Help me repair X"
❒ Repair: the peer checks that "P1 agreed with me on X" and sends P1 a copy of X
❒ The peers hold identical replicas of X again
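The poll and repair sequence above can be sketched in a few lines of Python. This is a toy model, not the LCAP implementation: the `Peer` class, its method names and the choice of SHA-256 are assumptions made for illustration, and every peer votes in every poll (no sampling, no nonces).

```python
import hashlib


class Peer:
    """Hypothetical peer holding one replica of content X."""

    def __init__(self, name, content):
        self.name = name
        self.content = content
        self.agreed_with_me = set()  # peers that agreed with us in a past poll

    def digest(self):
        # SHA-256 is an assumption; the slides only say hash(X).
        return hashlib.sha256(self.content).hexdigest()

    def call_poll(self, voters):
        """Ask each voter "What is hash(X)?" and return the names that agree."""
        mine = self.digest()
        agreeing = {v.name for v in voters if v.digest() == mine}
        self.agreed_with_me |= agreeing
        return agreeing

    def serve_repair(self, requester):
        """Send X only to a peer that agreed with us in an earlier poll."""
        if requester.name in self.agreed_with_me:
            return self.content
        return None


peers = [Peer(f"P{i}", b"content X") for i in range(1, 7)]
p1, p2 = peers[0], peers[1]

first = p1.call_poll(peers[1:])            # P1 polls: landslide agreement
second = p2.call_poll([p1] + peers[2:])    # P2 polls: P2 records that P1 agreed

p1.content = b"c0ntent X"                  # P1 incurs damage on X
third = p1.call_poll(peers[1:])            # P1 polls again: nobody agrees

repaired = p2.serve_repair(p1)             # "P1 agreed with me on X"
if not third and repaired is not None:
    p1.content = repaired

final = p1.call_poll(peers[1:])            # replicas are identical again
```

The `agreed_with_me` set stands in for the "P1 agreed with me on X" check: in this sketch a peer only hands out a repair to a requester it has seen agree in the past.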
Stealth Modification Gap
❒ Landslide agreement: take no action
(high confidence in outcome)
❒ Inconclusive agreement: take no
action and raise alarm (low confidence
in outcome)
❒ Landslide disagreement: seek repair
and notify (high confidence in
outcome)
Attacker's goal
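The three outcomes can be expressed as a small classifier over the vote tally. The sketch below assumes a single hypothetical `landslide` fraction of 0.8; the slide does not give the actual thresholds, and the function and dictionary names are illustrative only.

```python
def classify_poll(agree: int, disagree: int, landslide: float = 0.8) -> str:
    """Classify a tally; `landslide` is an assumed threshold, not an LCAP constant."""
    total = agree + disagree
    if total == 0:
        raise ValueError("a poll needs at least one vote")
    if agree / total >= landslide:
        return "landslide agreement"
    if disagree / total >= landslide:
        return "landslide disagreement"
    return "inconclusive"


# Actions taken from the slide for each outcome.
ACTIONS = {
    "landslide agreement": "take no action",
    "inconclusive": "take no action and raise alarm",
    "landslide disagreement": "seek repair and notify",
}

outcome = classify_poll(agree=3, disagree=2)
action = ACTIONS[outcome]
```

With five voters, a 3 to 2 split falls between the two landslide regions, so the poll is inconclusive and an alarm is raised instead of a repair being sought.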
Nonces
❒ For each voter in a poll over X, the poller
supplies a poller nonce P and the voter a voter
nonce V
❒ Rather than hash(X), it is the value of
hash(PVX) that is computed
❒ Nonces must be fresh
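A minimal sketch of the nonce scheme, assuming SHA-256, 20-byte random nonces and plain concatenation in the order P, V, X. None of these choices is stated on the slide, and the helper names are hypothetical.

```python
import hashlib
import secrets

NONCE_BYTES = 20  # assumed length; fixed-length nonces keep the concatenation unambiguous


def fresh_nonce() -> bytes:
    """Return a new random nonce for a single poll."""
    return secrets.token_bytes(NONCE_BYTES)


def vote_hash(poller_nonce: bytes, voter_nonce: bytes, content: bytes) -> str:
    """Compute hash(PVX): the hash of poller nonce, voter nonce, then content."""
    h = hashlib.sha256()
    h.update(poller_nonce)
    h.update(voter_nonce)
    h.update(content)
    return h.hexdigest()


content_x = b"content X"
p_nonce = fresh_nonce()   # supplied by the poller
v_nonce = fresh_nonce()   # supplied by the voter

vote = vote_hash(p_nonce, v_nonce, content_x)    # computed by the voter
check = vote_hash(p_nonce, v_nonce, content_x)   # recomputed by the poller on its own replica
agrees = vote == check

# A vote recorded in an earlier poll no longer matches once the poller uses a fresh nonce.
new_p_nonce = fresh_nonce()
replay_accepted = vote == vote_hash(new_p_nonce, v_nonce, content_x)
```

Because the hash covers both nonces, a voter has to hash its replica for each poll; it cannot answer from a stored hash(X) or replay an old vote.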
Repair Verification
❒ Byzantine fault
❒ Bait and switch
Physical Fixity vs. Logical Fixity
❒ What if the peers hold the same content even
though not all of, or even none of, the replicas
are byte-identical?
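One possible way to picture the difference is to hash a normalized form of the content instead of the raw bytes. The normalization rule below (unify line endings, drop trailing whitespace) is purely hypothetical and is not taken from the talk; the function names and SHA-256 are also assumptions.

```python
import hashlib


def physical_digest(data: bytes) -> str:
    """Hash the replica byte for byte."""
    return hashlib.sha256(data).hexdigest()


def logical_digest(data: bytes) -> str:
    """Hash a normalized form of the replica (hypothetical normalization rule)."""
    normalized = b"\n".join(line.rstrip() for line in data.splitlines())
    return hashlib.sha256(normalized).hexdigest()


replica_a = b"<p>Hello</p>\r\n<p>World</p>\r\n"
replica_b = b"<p>Hello</p>\n<p>World</p>  \n"

same_bytes = physical_digest(replica_a) == physical_digest(replica_b)
same_content = logical_digest(replica_a) == logical_digest(replica_b)
```

Here the two replicas differ physically but agree logically, so a poll over the normalized hash would count them as agreeing.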
LCAP vs. Threats
❒ Media failure
❒ Hardware failure
❒ Software failure
❒ Communication errors
❒ Failure of network services
❒ Natural disaster
❒ Media and hardware obsolescence
❒ Software obsolescence
❒ Operator error
❒ Economic failure
❒ Organizational failure
❒ External attack
❒ Internal attack
4. Unlocking LOCKSS for Developers
Narrow Origins
❒ Audience: research libraries
❒ Target: Web content
❒ Context: appliance model
❒ "Monolithic stack"
LAAWS Initiative
❒ "LOCKSS Architected As Web Services"
❒ Two-year Mellon Foundation grant
❒ Modernization effort
Re-Architecture
REST APIs and Software Assets
❒ Repository service
❒ https://github.com/lockss/laaws-repository-service
❒ Configuration service
❒ https://github.com/lockss/laaws-configuration-service
❒ Poller service
❒ https://github.com/lockss/laaws-poller
❒ Metadata extraction service
❒ https://github.com/lockss/laaws-metadataextractor
❒ Metadata service
❒ https://github.com/lockss/laaws-metadataservice
❒ develop branch → docs/swagger.yaml
Dev/Demo Environment
❒ https://github.com/lockss/laaws-demo
❒ feature-mgdemo branch
❒ Contains:
❒ Docker support infrastructure
❒ Docker flavor
❒ JAR flavor
5. Q&A
Thank you