Don’t Give Up on Distributed File Systems
Jeremy Stribling, Emil Sit, Frans Kaashoek, Jinyang Li, and Robert Morris
MIT CSAIL and NYU
Reinventing the Storage Wheel
• New apps tend to use new storage layers
• Examples:
– BLAST
• Can we invent this layer once?
What About a File System?
• A FS enables quick-prototyping for apps
– A familiar interface
– Language-independent usage model
– Hierarchical namespace useful for apps
– Write distributed apps in shell scripts
if [ -f "/fs/cwc/$URL" ]; then
  if notexpired "/fs/cwc/$URL"; then
    cat "/fs/cwc/$URL"
    exit
  fi
fi
wget "$URL" -O - | tee "/fs/cwc/$URL"
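The script above calls `notexpired` without defining it. A minimal sketch of such a helper follows, assuming that "not expired" simply means "modified within the last `MAX_AGE_MIN` minutes". Both the function body and the `MAX_AGE_MIN` variable are hypothetical and not from the slides; a real web cache would more likely consult HTTP expiry headers stored alongside the object.

```shell
#!/bin/sh
# Hypothetical helper: the slide uses `notexpired` but does not define it.
# Assumption: a cached file is fresh if it was modified less than
# MAX_AGE_MIN minutes ago (default 60).
MAX_AGE_MIN=${MAX_AGE_MIN:-60}

notexpired() {
  # find prints the path only when the file's mtime is newer than the
  # cutoff, so non-empty output means "still fresh".
  # -mmin is not in POSIX but is supported by GNU and BSD find.
  [ -n "$(find "$1" -mmin "-$MAX_AGE_MIN" 2>/dev/null)" ]
}
```

With this definition in scope, `notexpired /fs/cwc/somefile` returns success for a recently written file and failure for a stale or missing one, which is all the cache script needs.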
Why Won’t That Work Today?
• Needs of distributed apps:
– Control over consistency and delays
– Efficient data sharing between peers
• Current systems focus on FS transparency
– Hide faults with long timeouts
– Centralized file servers
Example: Cooperative Web Cache
• Would rather fail and refetch than wait
• Perfect consistency isn’t crucial
• Avoid hotspots
if [ -f "/fs/cwc/$URL" ]; then
  if notexpired "/fs/cwc/$URL"; then
    cat "/fs/cwc/$URL"
    exit
  fi
fi
wget "$URL" -O - | tee "/fs/cwc/$URL"
Our Proposal: WheelFS
• A distributed wide-area FS to simplify apps
• Main contributions:
1) Give apps control with semantic cues
2) Provide good performance according to Read Globally, Write Locally
Basic Design: Reading and Writing
[Figure: nodes 076, 150, 257, 402, 554, and 653; a request for File 135; File 135 and its later versions File 135v2 and File 135v3, with copies labeled 135, 135v2, and 135v3; a cached copy of 135; "Create foo/bar" producing File 550 (bar) and the entry bar = 550 in Dir 209 (foo).]
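The figure labels do not spell out how a file ID such as 135 is matched to a node. One common scheme, assumed here purely for illustration and not confirmed by the slides, is successor-style placement as in consistent hashing: a file belongs to the first node whose ID is greater than or equal to the file ID, wrapping around to the smallest node ID. The `successor` function below is a hypothetical sketch of that rule using the node IDs from the figure.

```shell
#!/bin/sh
# Sketch under an assumption: successor-style placement of file IDs on
# node IDs. The slides do not state that WheelFS uses exactly this rule.

# successor FILE_ID NODE_ID...
# Prints the first node ID >= FILE_ID, or the smallest node ID if none is.
# Node IDs must be passed in ascending order.
successor() {
  file_id=$1
  shift
  first=$1
  for node in "$@"; do
    # [ ... -ge ... ] compares as decimal, so an ID like 076 is safe here.
    if [ "$node" -ge "$file_id" ]; then
      echo "$node"
      return
    fi
  done
  # No node ID is large enough: wrap around to the smallest one.
  echo "$first"
}
```

For example, `successor 135 076 150 257 402 554 653` picks node 150 under this assumed rule, and a file ID above 653 wraps around to node 076.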
Explicit Semantic Cues
• Allow direct control over system behavior
• Meta-data that attach to files, dirs, or refs
• Apply recursively down dir tree
• Possible impl: intra-path component
– /wfs/cwc/.cue/foo/bar
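To make the intra-path idea concrete, here is a rough sketch of how a path could be separated into its cues and the underlying file path. It assumes that any component starting with a dot (other than `.` and `..`) is a cue component and that several cues in one component are comma-separated, as in the cue strings used elsewhere in these slides. The `split_cues` function is hypothetical; the slides only propose the path syntax, not a parser.

```shell
#!/bin/sh
# Sketch only: split_cues and its output format are illustrative assumptions.

# split_cues PATH
# Prints two lines: "cues=<comma-separated cues>" and "path=<PATH without
# its cue components>".
split_cues() {
  cues=""
  clean=""
  old_ifs=$IFS
  IFS=/          # split the path into components on "/"
  set -f         # disable globbing while the path is unquoted
  for part in $1; do
    case $part in
      "")   ;;                                   # empty piece before the leading "/"
      .|..) clean="$clean/$part" ;;              # ordinary dot entries, not cues
      .?*)  cues="${cues:+$cues,}${part#.}" ;;   # cue component: drop the leading dot
      *)    clean="$clean/$part" ;;
    esac
  done
  set +f
  IFS=$old_ifs
  printf 'cues=%s\npath=%s\n' "$cues" "$clean"
}
```

Running `split_cues /wfs/cwc/.cue/foo/bar` reports the cue `cue` and the plain path `/wfs/cwc/foo/bar`. Because the cue sits in the middle of the path, everything below it (`foo/bar`) naturally falls under the cue, which matches the "apply recursively down dir tree" bullet.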
Semantic Cues: Writability
• Applies to files
• WriteMany (default)
• WriteOnce
[Figure: nodes 076, 150, 257, 402, 554, and 653; a request for File 135; File 135 with later versions File 135v2 and File 135v3; cached copies labeled Cached 135 and Cached 135v3.]
Semantic Cues: Freshness
• Applies to file references
• LatestVersion (default)
• AnyVersion
• BestVersion
[Figure: nodes 076, 150, 257, 402, 554, and 653; a request for File 135; File 135 and a cached copy labeled Cached 135.]
Semantic Cues: Write Consistency
• Applies to files or directories
• Strict (default)
• Lax
[Figure: nodes 076, 150, 257, 402, 554, and 653; two "Write File 135" requests; File 135 and File 135v2 with copies labeled 135 and 135v2.]
Example: Cooperative Web Cache
• Reading an older version is ok:
– cat /wfs/cwc/.bestversion,maxtime=250/foo
• Writing conflicting versions is ok:
– wget http://foo -O - > /wfs/cwc/.lax,writemany/foo
if [ -f "/wfs/cwc/.maxtime=250,bestversion/$URL" ]; then
  if notexpired "/wfs/cwc/.maxtime=250,bestversion/$URL"; then
    cat "/wfs/cwc/.maxtime=250,bestversion/$URL"
    exit
  fi
fi
wget "$URL" -O - | tee "/wfs/cwc/.lax,writemany/$URL"
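The cue-laden paths in the script are repeated several times, so it can help to name them once. The sketch below does that and wraps the read-or-fetch logic in a function that can be tried on a local disk. Several parts are assumptions made only so the sketch runs without WheelFS: `ROOT` is a temporary directory standing in for `/wfs/cwc`, `fetch_url` is a stub standing in for `wget "$URL" -O -`, and each cue component is a symlink to `.` so that the local file system ignores it (WheelFS would interpret it instead). The expiry check is left out to keep the sketch short.

```shell
#!/bin/sh
# Sketch only: ROOT, LOG, fetch_url, cwc_get, and the symlink trick are
# illustrative assumptions, not part of WheelFS.

ROOT=$(mktemp -d)                      # stands in for /wfs/cwc
LOG=$(mktemp)                          # records every real fetch
READ_CUES=".maxtime=250,bestversion"   # cue strings taken from the slide
WRITE_CUES=".lax,writemany"

# A local disk does not understand cues, so make each cue component a
# symlink back to the directory itself.
ln -s . "$ROOT/$READ_CUES"
ln -s . "$ROOT/$WRITE_CUES"

# Hypothetical stand-in for `wget "$1" -O -`; logs each call.
fetch_url() {
  echo "$1" >> "$LOG"
  printf 'body of %s\n' "$1"
}

# cwc_get NAME: serve NAME from the cache if present, else fetch and store it.
cwc_get() {
  if [ -f "$ROOT/$READ_CUES/$1" ]; then
    cat "$ROOT/$READ_CUES/$1"
    return
  fi
  fetch_url "$1" | tee "$ROOT/$WRITE_CUES/$1"
}
```

Calling `cwc_get foo` twice should trigger only one fetch: the first call stores the body through the write-cue path, and the second call finds it through the read-cue path.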
Example: Cooperative Web Cache
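[Figure: the cooperative web cache on nodes 076, 150, 257, 402, 554, and 653. Labels: Client, $URL; Dir 070 (/wfs/cwc); a lookup of "$URL" answered with 135; a lookup of 135 answered with 135 = v1; File 135, Cached 135, and three Chunk transfers; a "No!" reply followed by $URL, File 550, and "$URL" == 550.]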
Discussion
• Current set of cues enough for many apps
– All-sites-pings
– Grid computations
– OverCite
• Stuff we swept under the rug:
– Security
– Atomic renames across dirs
– Storage load-balancing
– Unreferenced files
Related Work
• Every FS paper ever written
• Specifically:
– Cluster FS: Farsite, GFS, xFS, Ceph
– Wide-area FS: JetFile, CFS, Shark
– Grid: LegionFS, GridFTP, IBP
– POSIX I/O High Performance Computing Extensions