(speaker notes version) architecting an enterprise storage platform using object stores

Architecting an Enterprise Storage Platform Using Object Stores © mekuria getinet / www.mekuriageti.net Niraj Tolia, Chief Architect, Maginatics, @nirajtolia

Upload: niraj-tolia

Post on 15-Jan-2015


DESCRIPTION

Presented at SNIA SDC 2013. This deck adds speaker notes.

TRANSCRIPT

Page 1: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Architecting an Enterprise Storage

Platform Using Object Stores

© mekuria getinet / www.mekuriageti.net

Niraj Tolia

Chief Architect, Maginatics

@nirajtolia

Page 2: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

These gray slides are equivalent to speaker notes

Normally invisible, they are provided for non-presentation settings

Hope they help

Page 3: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

A Whirlwind Tour

Page 4: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

This presentation provides an end-to-end overview of MagFS and therefore might not be deep enough in

certain areas

Contact @nirajtolia for Comments, Questions, Flames

Page 5: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Awesome Questions == Awesome T-shirts

Page 6: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Hacker T-shirts were handed out for “awesome” questions during the SNIA SDC talk.

If you asked one but didn’t get one, get in touch with us and we will ship one.

If you missed the talk and still want a T-shirt, come to a future talk or try MagFS out.

Page 7: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

80% YoY Growth in Unstructured Data

41% Growth in IaaS Systems through 2016

Sources:

Gartner, IT Market Clock for Storage, Sep 2011

Gartner, Forecast Overview: Public Cloud Services, Worldwide, 2011-2016, Feb 2013

Page 8: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Data growth is impressive! Requires centralization for protection, analysis and cost management.

Infrastructure-as-a-Service systems are rapidly growing. Beyond adopting new storage paradigms (object storage) to deal with this data growth, storage systems must follow workloads as they migrate to the cloud and need to use cloud storage. Storage systems also need to support elastic workloads (capacity and scale).

Page 9: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

MagFS – The File System for the Cloud

Consistent, Elastic, Secure, Mobile-Enabled

Layered on Object Stores

“Software-Defined”

Page 10: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

To respond to the earlier trends, we built a system that, at its core, is a distributed file system

It differs from legacy systems in a number of ways, primarily in its end-to-end (E2E) security perspective, its ability to both be elastic and support elastic workloads, its elevation of mobility to a first-class citizen, and its exploitation of object stores

Further, while “software-defined” is an oft-abused buzzword, MagFS does fit the definition: software-only, packaged as VMs, and with a clean separation of data and control planes

Page 11: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

No (Initial) Legacy Support (NFS/CIFS)

Native Clients: Push Intelligence to Edges

Strong Consistency w/ Full-Spectrum Caching

Page 12: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Three Early Decisions:

1. No legacy (NFS, CIFS) support, on purpose: file systems must evolve (e.g., dedup, caching, scaling). MagFS transparently replaces legacy distributed file systems though.

2. Client agents allow MagFS to push smarts to the edges. No significant IT pushback anymore. A common codebase reduces development costs.

3. Enable data & metadata caching with strong consistency

Page 13: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

File System Design Goals

Low Cost, High Scale

Intelligent Clients

Span Devices and Networks

Support Rapid Iteration

Page 14: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Design Goals:

1. Deliver scale at a cost-effective point

2. Make clients intelligent: modern computing platforms have enough horsepower

3. Span server-grade hardware to mobile clients and from fast to bandwidth-challenged networks

4. Rapidly iterate on our product and add new features without disruption to users

Page 15: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

In-Cloud File System

NAS Replacement and Consolidation

Enterprise File Sharing

Use Cases

Page 16: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

MagFS, a general purpose system, is used for many different use cases. The majority are Tier 2/3 workloads (e.g., home directory, media, nearline storage, etc.).

In-Cloud File System: Allow unmodified applications to Just Work™ in the cloud. Provide a distributed file system where no filer can be racked in.

NAS: Both serve as a more cost-effective filer and allow globally distributed workforces to leverage our WAN optimization.

Enterprise File Sharing: Related to NAS, secure file sharing that meets compliance and regulatory concerns as MagFS is a product and not a service.

Page 17: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Object Storage (public, on-premises, or hybrid)

Data

Metadata

Metadata Servers

Clients

10,000 Foot View

Page 18: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

The previous slide presents a very high-level overview of MagFS

Note the split data and metadata planes: MagFS does not try to resolve scalability issues already tackled by

the object storage system and therefore will not intercept data on the fast path

The metadata servers provide a single pane-of-glass for admins, integrate with native AD or LDAP setups, and

also store encryption keys

Page 19: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Koukouvaya / flickr.com/photos/jackoughton/6535137981/

Heavy (Data) Lifting via Clients

Encryption

Inline Deduplication

Compression

Persistent Data Caching

Bulk Data Transfers

Page 20: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Push a lot of smarts to increasingly-powerful clients

Clients do heavy data lifting: Chunking for deduplication, encryption, optional compression, on-disk caching, etc.

Available resources generally proportional to workloads for different device types

Server doesn’t see data on read OR write path!

Page 21: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Cloud Object Storage

Scale Out, Low Cost

Handles Placement + Replication

Tolerates Failures

High Aggregate Performance

Page 22: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Object Storage has a number of very useful properties: Cost, Commodity, Scale Out (aggregate performance,

fault tolerance, etc.)

We directly expose clients to the object store

Similar to clients, we also push functionality to the object storage system: data placement and replication,

fault-tolerance, repairs, etc. as we do not want to reinvent the wheel

Page 23: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Virtualized Metadata Servers

Enforce Strong Consistency

Enforce Authentication and Integrity

Runtime Performance Optimization

Share-level Deduplication

Data Scrubbing & Garbage Collection

Page 24: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

The VM-based metadata servers are where consistency and user authentication are enforced

They also allow clients to dynamically cache read and write data, lock objects and byte ranges, etc.

Works with clients to prevent duplicated data transfers or redundant data copies

Data is scrubbed and unused data deleted in the background

Page 25: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Architecture

Page 26: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

We will now branch off into details about the client and server architecture and how they interact with object

storage

Page 27: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Client

Architecture

Page 28: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

MagFS supports different Linux, Windows, OS X, Android, and iOS versions

Majority of code is shared across platforms with platform-specific glue layers

The next few slides talk about desktop/server platforms but the same structure applies to all.

Page 29: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Client Architecture

Application

Redirector

(e.g., FUSE)

File System

OS Glue

Data Manager

Metadata Transport

Layer

Local Remote

Userspace

Kernel

Deduplication Encryption Compression

Locking Leases

Page 30: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Traditional platforms have a thin in-kernel redirector (FUSE on Linux; we ship the equivalent on Windows and OS X)

Modulo glue, the file system layer contains core functionality

Data manager used for local persistent data caching and optimized remote object store fetches

Metadata transport layer manages the MagFS control plane

Page 31: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Data Manager

File System Layer

Simplified Write: Deduplication + Encryption

Write Request

Plaintext

Variable-Length

Chunking

Encrypted Text (E)

AES-256 (K)

Object Name (N) SHA-256

Local Cache Remote Transfer

Encryption Key (K) SHA-256

Page 32: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Very simple example! In reality, most operations are not synchronous, are batched, and clients get ack early

Incoming data is broken up into smaller variable-length chunks for deduplication

Per-chunk encryption used where the per-chunk key is derived from a cryptographic hash of unencrypted data

Chunk name derived from hash of encrypted data
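The steps above can be sketched roughly as follows. This is a minimal illustration, not the MagFS implementation: the chunking parameters (`WINDOW`, `MASK`, `MIN_CHUNK`, `MAX_CHUNK`) and the function names are assumptions, and `toy_stream_cipher` is a deliberately insecure stand-in for AES-256 so the example runs with the Python standard library alone.

```python
import hashlib

# All constants below are illustrative assumptions, not MagFS values.
WINDOW = 16                      # bytes examined when testing for a boundary
MASK = 0x3F                      # cut when the low 6 fingerprint bits are zero
MIN_CHUNK, MAX_CHUNK = 32, 256   # bounds on chunk size, in bytes


def chunk(data: bytes):
    """Split data at content-defined boundaries (simple, non-rolling variant)."""
    chunks, start = [], 0
    for i in range(len(data)):
        length = i - start + 1
        if length < MIN_CHUNK:
            continue
        window = data[i + 1 - WINDOW:i + 1]
        fingerprint = int.from_bytes(hashlib.sha256(window).digest()[:4], "big")
        if (fingerprint & MASK) == 0 or length >= MAX_CHUNK:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def toy_stream_cipher(key: bytes, data: bytes) -> bytes:
    """Stand-in for AES-256: XOR with a SHA-256 counter keystream. NOT secure."""
    out = bytearray()
    for block in range(0, len(data), 32):
        pad = hashlib.sha256(key + (block // 32).to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[block:block + 32], pad))
    return bytes(out)


def seal_chunk(plaintext: bytes):
    """Derive the per-chunk key from the plaintext and the name from the ciphertext."""
    key = hashlib.sha256(plaintext).digest()         # K = SHA-256(plaintext)
    ciphertext = toy_stream_cipher(key, plaintext)   # E = Encrypt(K, plaintext)
    name = hashlib.sha256(ciphertext).hexdigest()    # N = SHA-256(E)
    return name, key, ciphertext
```

Because both K and N are functions of the chunk content only, two clients writing the same chunk produce the same object name, which is what makes deduplication of encrypted data possible.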

Page 33: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Data Manager

File System Layer

Simplified Write: Deduplication + Encryption

Write Request

Plaintext

Variable-Length

Chunking

Encrypted Text (E)

AES-256 (K)

Object Name (N) SHA-256

<File, Offset, N, K>

Optional(<URI>)

Local Cache Remote Transfer

<N, E>

<URI, E>

No Encryption Keys

in the Cloud

No Encryption Keys

in Local Cache

Encryption Key (K) SHA-256

<E>

Page 34: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Encrypted data (but not key) is written to local cache

Write request with offset, chunk name, and encryption key is made to the server

If new chunk, a secure write URI is sent to the client

Data manager queues and writes chunk to the cloud

No encryption keys in local cache or object store
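A rough sketch of the server side of this exchange is shown below. The class name, the in-memory dictionaries, and the placeholder URI format are all assumptions made for illustration; the real server also batches requests, signs the URI, and scrubs writes before trusting them.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataServer:
    """Hypothetical in-memory model of the <File, Offset, N, K> write request."""
    known_chunks: set = field(default_factory=set)   # chunk names already stored
    file_map: dict = field(default_factory=dict)     # (file, offset) -> (name, key)

    def write(self, file, offset, name, key):
        """Record the mapping; return an upload URI only if the chunk is new."""
        # The key stays in server-side metadata, never in the object store.
        self.file_map[(file, offset)] = (name, key)
        if name in self.known_chunks:
            return None            # duplicate chunk: no data transfer needed
        self.known_chunks.add(name)
        # Placeholder URI; a real one would carry a request signature.
        return f"https://objectstore.example/{name}?sig=PLACEHOLDER"
```

A `None` reply tells the client that the chunk is already in the object store, so the write completes without sending any data.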

Page 35: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Data Manager

File System Layer

Simplified Read: Deduplication + Encryption

Read Request

<File, Offset, Range>

Local Cache Remote Transfer

<N, URI>

Encryption Key (K)

<N, K, URI>

Encrypted Text (E)

<E>

<URI>

<E>

<URI>

<E>

Plaintext

AES-256 (K)

Page 36: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Another very simple example. Does not include metadata caching either.

Server responds to a read request with the chunk name, decryption key, and secure read URI

A local cache miss causes an object storage fetch.

The encrypted chunk is decrypted using the server-provided key and the unencrypted data is returned to the application. All deduplication and encryption is always transparent to the application.

Page 37: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

The Client in Real Life Does a Lot More!

• File and Directory Leases (data and metadata caching)

• Asynchronous Operations (including writes)

• Operation Compounding

• Runtime Optimizations (e.g., read ahead)

• Optimizing for High Bandwidth Delay Product (BDP)

• …

Page 38: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

There is a separate discussion on leases later when we talk about how clients and servers optimize

performance at runtime

Page 39: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Object Storage (public, on-premises, or hybrid)

Data

Metadata

Metadata Servers

Clients

Communication Details

Thrift

(HTTPS)

REST

(HTTPS)

Page 40: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Important: Split Data and Metadata paths (always, not optional). Clients directly access the object store. MagFS

does not need to scale the data plane.

Client technically speaks REST over HTTPS to the object store but has no knowledge of the actual API (server-

provided URIs)

The MagFS protocol uses Thrift over HTTPS (firewall and proxy friendly). Enables efficient encoding and easy protocol

extension without breaking compatibility.

Page 41: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Server

Architecture

Page 42: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

The next few slides cover how we virtualize file namespaces, the distributed system deployment, a view into internals, and a brief overview of leases

Page 43: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Metadata Server Internals

Metadata Storage Layer

Storage Core

Backups

Production Development

GC

Scrubbing

Quotas Dedup Leases Security

HA

MagFS

Ext. Sharing

Multi-Cloud Versioning Offline Mode

Cloud Abstraction Layer

Legend

Page 44: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

The metadata server internals have been modularized to provide both development and runtime agility

For example, adding support for a new object storage system doesn’t impact the rest of the code

Runtime background operations (e.g., hot backups, garbage collection, scrubbing) do not impact clients.

The file system protocol is separate from file system-agnostic features (e.g., quotas, lease, and lock management)

Page 45: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Bootstrapping: Virtualized Namespaces

\\server.example.com\share

HOST FQDN FOLDER

Legacy

\\server.example.com\share

MagFS

Dynamic mapping to host:port

Page 46: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

With either Windows UNC paths or NFS server/share exports, the exported file system would be tied to a DNS name.

Instead, MagFS virtualizes the access path. Nothing changes with respect to applications but a virtualized server:share combination can map to any host:port

This is extremely useful for High Availability Failover and Disaster Recovery
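As a minimal sketch of the idea, the mapping can be thought of as a lookup table keyed by the virtual server and share. The `DiscoveryService` class, its method names, and the addresses used with it are hypothetical; the deck does not describe the actual data structures.

```python
class DiscoveryService:
    """Hypothetical lookup from a virtual server:share to the current host:port."""

    def __init__(self):
        self._mapping = {}   # (virtual server, share) -> (host, port)

    def register(self, server, share, host, port):
        """Point a virtual server:share at a host:port (e.g., after failover)."""
        self._mapping[(server.lower(), share.lower())] = (host, port)

    def resolve(self, unc_path):
        r"""Resolve \\server\share[\...] to the host:port currently serving it."""
        parts = unc_path.lstrip("\\").split("\\")
        server, share = parts[0], parts[1]
        return self._mapping[(server.lower(), share.lower())]
```

On failover, only the registered host:port changes; the path that applications use stays the same.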

Page 47: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Discovery Service

Metadata

Server

Metadata

Server (HA)

Metadata

Server

ZooKeeper

ZooKeeper

ZooKeeper

Monitoring

Management Console

Config +

Scheduler

Virtual Filer Host:Port Mapping

Page 48: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

MagFS is a distributed system. It has a number of backend services: VM and Service Monitoring,

ZooKeeper for server registration and discovery, Admin management console, job scheduler, AD integration, etc.

Shares are deployed in HA or non-HA configuration. HA comes with automatic failover.

Clients use a discovery service to map namespace to server

Page 49: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores
Page 50: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

One of the big challenges in any distributed file system is the tradeoff between consistency and performance.

In a naïve strongly consistent system, every operation needs to be centralized on a server. This is obviously

bad for performance.

The MagFS metadata server therefore hands leases out to clients for data and metadata caching (including

caching writes and updates)

Page 51: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Leases: Performance and Strong Consistency

Read Write Handle

Lease Types

Read

Read + Handle

Read + Write + Handle

Lease States

Valid File Leases

Valid Directory Leases

Page 52: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Lease Types: READ allows the client to cache reads locally, WRITE allows local write caching, and HANDLE allows files to be closed and reopened locally

Valid Lease Type combinations are: READ, READ + HANDLE, READ + WRITE + HANDLE. Others don’t really apply (e.g., WRITE is exclusive and READ + HANDLE come for free if a WRITE lease is held)

MagFS also supports WRITE directory leases
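The valid file lease combinations can be written down as a small flag type. The sketch below is an assumption about how such a check might look: the `can_grant` policy (a WRITE lease needs no other holders, read-style leases can be shared) follows the "WRITE is exclusive" note, but a real server would typically recall conflicting leases rather than simply refuse.

```python
from enum import Flag, auto


class Lease(Flag):
    READ = auto()     # cache reads locally
    WRITE = auto()    # cache writes locally
    HANDLE = auto()   # close and reopen files locally


# The three combinations listed as valid for file leases.
VALID_FILE_LEASES = {
    Lease.READ,
    Lease.READ | Lease.HANDLE,
    Lease.READ | Lease.WRITE | Lease.HANDLE,
}


def can_grant(requested, held_by_others):
    """Assumed policy: WRITE is exclusive; read-style leases may be shared."""
    if requested not in VALID_FILE_LEASES:
        return False
    if Lease.WRITE in requested:
        return not held_by_others
    return all(Lease.WRITE not in other for other in held_by_others)
```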

Page 53: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Cloud Storage

Interaction

Page 54: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

While Maginatics does not provide an Object Storage system itself, it works with a number of different products. The next few slides will talk about the challenges of interoperating with a large number of systems as well as the technical challenges of layering a file system on top of them.

Page 55: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Object Storage (public, on-premises, or hybrid)

Page 56: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Today, MagFS supports a large number of object storage systems: private and public Swift and Atmos

deployments, AWS S3, public and private S3 clones, Azure, and others not mentioned here

We are seeing an increasing shift towards vendors providing S3 and Swift API compatibility layers even if

they originally had their own REST-style protocols

Page 57: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Object Storage systems

are like snowflakes!

Page 58: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

MagFS also works hard to address inter-object store variance and hide the complexity from the end user.

MagFS uses very basic API calls (GET/PUT/DELETE object/bucket and Signed URLs) and we discovered a

number of differences in vendor implementations

MagFS also optimizes data layout for different object stores to obtain the best performance. For example,

data layout on S3, Atmos, and Swift differs to match the underlying platform.

Page 59: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Object Store API Compatibility

Q: Has anyone come across a near 100%

Amazon S3 API compatible object storage

system?

A: It is hard to find a near-100% compatible

product…

- Vendor w/ S3 Compatible Product

Page 60: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Even vendors claiming to support the same API have differences, bugs, or interpretation differences. For example, most of the S3-compatible systems we have added support for differ from one another (e.g., subsets of API supported, differing API interpretations, bugs, etc.).

Swift is similar. The same code cannot be used with a generic Swift setup and the public cloud providers that

are based on Swift. Swift authentication (Keystone, TempAuth, etc.) also differs between vendors.

Page 61: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Object Storage (public, on-premises, or hybrid)

Data

Metadata

Metadata Servers

Clients

Direct Client Access: Security Problem?

Page 62: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

One of the challenges with providing clients direct object store access is security. There is generally one (or few) master API key(s) that can delete or read

arbitrary data.

However, as different MagFS users have different access rights to files, we should not provide the master key to

clients (even though the data is encrypted).

Further, a malicious client would be able to wipe all data with the master key!

Page 63: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Request Signing

Page 64: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

The solution to providing secure and time-limited data access to clients is to use Request Signing, a feature found in all mature object storage systems today.

The next few slides will walk through an example of how Request Signing works for a write.

Page 65: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Server-Driven Request Signing

SignString = HTTP-Verb + "\n"

+ Content-MD5 + "\n"

+ Content-Type + "\n"

+ Date + "\n"

+ Resource + "\n"

+ ...

Page 66: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Client read or write requests are authorized by the MagFS server that shares the master key with the

object storage system

Signing is done by the metadata server creating a request string in a pre-defined order

Page 67: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Server-Driven Request Signing

SignString = PUT + "\n"

+ Content-MD5 + "\n"

+ Content-Type + "\n"

+ Date + "\n"

+ Resource + "\n"

+ ...

Page 68: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

The first component of the signature string is the HTTP verb used. This would be GET for a read and generally

PUT for a write (some providers like Atmos use POST). DELETEs are never performed by the client.

Page 69: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Server-Driven Request Signing

SignString = PUT + "\n"

+ 07BzhNET7exJ6qYjitX/AA== + "\n"

+ Content-Type + "\n"

+ Date + "\n"

+ Resource + "\n"

+ ...

Page 70: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

The second component is a cryptographic hash of the data. A number of object storage systems will reject data whose cryptographic hash doesn’t match the

request. This is useful to protect against TCP errors that the TCP checksum doesn’t catch, buggy clients, and

even malicious clients.

A common hash algorithm used at this step is MD5 but some object storage systems are now supporting

stronger cryptographic algorithms
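For the MD5 case, the value placed in the signature string is the Base64 encoding of the raw 16-byte digest, which is why the example value on the slide is 24 characters long. A minimal sketch (the function name is an assumption):

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Base64 of the raw MD5 digest of the request body (Content-MD5 style)."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")
```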

Page 71: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Server-Driven Request Signing

SignString = PUT + "\n"

+ 07BzhNET7exJ6qYjitX/AA== + "\n"

+ image/jpeg + "\n"

+ Date + "\n"

+ Resource + "\n"

+ ...

Page 72: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

The next component is the content-type of the object. We are using the JPEG type in this example but, in

MagFS, this would be “application/octet-stream” for all our objects as they are encrypted binary data.

Page 73: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Server-Driven Request Signing

SignString = PUT + "\n"

+ 07BzhNET7exJ6qYjitX/AA== + "\n"

+ image/jpeg + "\n"

+ Tue, 11 Jun 2013 00:27:41 + "\n"

+ Resource + "\n"

+ ...

Page 74: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Following the content-type, we now add a timestamp field. This is very useful because it puts a time limit on

this request to prevent replay attacks.

Most object stores place a reasonable time limit on request validity (e.g., 15 minutes) but a number also

allow configurable values. MagFS supports both.

Page 75: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Server-Driven Request Signing

SignString = PUT + "\n"

+ 07BzhNET7exJ6qYjitX/AA== + "\n"

+ image/jpeg + "\n"

+ Tue, 11 Jun 2013 00:27:41 + "\n"

+ /container/example.jpeg + "\n"

+ ...

Page 76: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

The final component in this example is the resource name and this includes both the container name and

the object name within the container

More options are possible in signature strings and these options differ from provider to provider

Page 77: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Server-Driven Request Signing

SignString = PUT + "\n"

+ 07BzhNET7exJ6qYjitX/AA== + "\n"

+ image/jpeg + "\n"

+ Tue, 11 Jun 2013 00:27:41 + "\n"

+ /container/example.jpeg + "\n"

+ ...

HMAC-SHA1(MasterKey, SignString)

Page 78: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Following the construction of the signature string, a keyed hash message authentication code (HMAC) is

generated using the signature string and the master key

This is a one-way transform and obtaining the HMAC value does not leak information about the master key

Page 79: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Server-Driven Request Signing

SignString = PUT + "\n"

+ 07BzhNET7exJ6qYjitX/AA== + "\n"

+ image/jpeg + "\n"

+ Tue, 11 Jun 2013 00:27:41 + "\n"

+ /container/example.jpeg + "\n"

+ ...

Signature = Base64(HMAC-SHA1(MasterKey, SignString))

Page 80: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

A Base64 encoded representation (signature) of this HMAC is sent to the client to prove that this request

was authorized by the server
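The whole walkthrough fits in a few lines of code. This is a simplified sketch: the function name and parameter list are assumptions, only the five fields shown on the slides are included, and the exact field list and newline placement differ from provider to provider.

```python
import base64
import hashlib
import hmac


def sign_request(master_key: bytes, verb, content_md5, content_type, date, resource):
    """Build the signature string in a fixed order and return Base64(HMAC-SHA1)."""
    # Simplified layout: the five fields from the slides, newline-separated.
    sign_string = "\n".join([verb, content_md5, content_type, date, resource])
    digest = hmac.new(master_key, sign_string.encode("utf-8"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")
```

The client receives only the resulting signature, never `master_key`, and changing any signed field (for example PUT to GET) yields a different signature.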

Page 81: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Object Storage (public, on-premises, or hybrid)

Data

Metadata

Metadata Servers

Clients

Safe Direct Client Access via Request Signing

1. Read/Write Request

3. HTTP Request +

Signature +

Encrypted Data

2. HTTP Request + Signature

Page 82: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

To summarize, read or write operations not serviced from the local cache require server authorization

Using the server-provided request and signature, a client can safely read and write data but only for the

specified object

The object store recalculates the signature based on the request, compares it to the received signature, and rejects the request in case of a mismatch (e.g., wrong HTTP verb, stale/old request, swapped object names)

Page 83: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Dealing with Lost Client Writes

• Clients can lose connectivity or, in the worst case, be malicious

• Naïvely trusting client writes can “corrupt” w/ global dedup

• MagFS server scrubs all writes:

• Client acknowledges write

• Server verifies object existence (object store performed MD5 at PUT)

• Server can also read and verify object data (stronger SHA-256 check)

• The object will be available for deduplication only after scrubbing

Page 84: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

MagFS exposes global deduplication and therefore needs to handle buggy or malicious clients that might have claimed to

have written data but did not

The server therefore waits for a client to acknowledge the write, checks the object store to verify that the object was written (implies success for the cryptographic hash check),

and can optionally scrub the data using a stronger cryptographic hash.

Modulo optimizations for the same client (really user), the data is only used for deduplication after scrubbing.
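A minimal sketch of this gating logic follows. The `ScrubbingIndex` class and its methods are hypothetical, a plain dict stands in for the object store, and chunk names are assumed to be the SHA-256 hex digest of the stored bytes.

```python
import hashlib


class ScrubbingIndex:
    """Hypothetical tracker: chunks become dedup candidates only after scrubbing."""

    def __init__(self, object_store):
        self.store = object_store    # dict-like stand-in: name -> stored bytes
        self.pending = set()         # acknowledged by a client, not yet verified
        self.dedup_ready = set()     # verified, safe to deduplicate against

    def client_acked(self, name):
        self.pending.add(name)

    def scrub(self, deep=False):
        for name in sorted(self.pending):
            data = self.store.get(name)
            if data is None:
                continue             # claimed write never reached the store
            if deep and hashlib.sha256(data).hexdigest() != name:
                continue             # stored content does not match its name
            self.dedup_ready.add(name)
        self.pending -= self.dedup_ready

    def can_dedup_against(self, name):
        return name in self.dedup_ready
```

In this sketch, chunks that fail verification simply stay pending; a real system would also need to expire or repair them.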

Page 85: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Handling Object Store Eventual Consistency

• Treat objects as immutable (even if modifications are allowed)

• Use content-based names (generated using cryptographic hashes)

• Tombstone names after Garbage Collection

• Suffix generation number to content-based names in case of resurrection

Page 86: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Some object stores are eventually consistent, which can lead to interesting read-after-write behaviors where what you read might not be the most recent write.

To address this, we treat all objects as immutable, use content-based names, and use a suffix-based method to tombstone names so that they are never reused

AWS S3 supporting read-after-first-put consistency in most regions also really helps with the above scheme
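One way to picture the naming scheme is an allocator that appends a generation number to the content hash and skips any name that has been tombstoned. The `NameAllocator` class and the `hash.generation` name format are assumptions; the deck does not give the actual format.

```python
import hashlib


class NameAllocator:
    """Hypothetical content-based naming with generation suffixes."""

    def __init__(self):
        self.generation = {}     # content hash -> generation currently in use
        self.tombstoned = set()  # object names that must never be reused

    def allocate(self, ciphertext: bytes) -> str:
        """Return the object name for this content, skipping tombstoned names."""
        digest = hashlib.sha256(ciphertext).hexdigest()
        gen = self.generation.get(digest, 0)
        while f"{digest}.{gen}" in self.tombstoned:
            gen += 1             # content resurrected after GC: new generation
        self.generation[digest] = gen
        return f"{digest}.{gen}"

    def garbage_collect(self, name: str):
        """Tombstone a deleted object's name."""
        self.tombstoned.add(name)
```

Since a name is written at most once and never reused, a reader can never observe an older version of an object under the same name, only its absence.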

Page 87: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Security

Architecture

Page 88: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

In theory, this is where we would discuss MagFS’s security architecture. However, as you observed, security is baked into the product at every level and has been covered throughout the deck. We will therefore only recap here.

Page 89: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Recap: On-Premises Security Model

• User authentication and permissions derived from native Active Directory setup

• Encryption keys are never exposed to the cloud

• Data and metadata are always encrypted: At-Rest and In-Flight

Page 90: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Quick point about Active Directory (AD): The fact that all our user permissions, group membership information, and other authentication information is derived from AD makes it very simple for admins, and using MagFS does not change their workflows.

Page 91: (Speaker Notes Version) Architecting An Enterprise Storage Platform Using Object Stores

Slides (with speaker notes) at http://tolia.org

Try MagFS at http://maginatics.com