SMORE: A Cold Data Object Store for SMR Drives David Slik NetApp, Inc.

Page 1

2017 Storage Developer Conference. © NetApp, Inc. All Rights Reserved. 1

SMORE: A Cold Data Object Store for SMR Drives

David Slik, NetApp, Inc.

Page 2

Session Overview

A brief overview of SMR: Slides 3 - 7
What is SMORE: Slides 8 - 10
Architectural elements: Slides 11 - 18
Implementation results: Slides 19 - 22
Lessons learned: Slides 23 - 24

Questions & Answers

Page 3

Shingled Magnetic Recording (SMR)

Technique to increase hard drive density; 2.3x originally cited [1], but real-world gains have been around 1.5x
Trades reduced write flexibility for increased capacity
Concept has been around for over a decade
Widely productized (many millions of drives shipped)
Products often implement SMR “under the covers”, with firmware hiding limitations from higher-level software (drive-managed SMR)

[1] “High density data storage using shingle-write”, Proceedings of the IEEE International Magnetics Conference, 2009

Not a new technology

Page 4

Shingled Magnetic Recording (SMR)

Tracks on conventional hard drives are separated by “guard bands”
Guard band prevents writes from affecting adjacent tracks
Takes up physical space on the hard drive

How does it work?

[Figure: Track N and Track N+1 separated by a Guard Band, with the width of the write head and the width of the read head marked.]

Page 5

Shingled Magnetic Recording (SMR)

SMR hard drives eliminate guard bands and “shingle” data writes
Allows tracks to be packed closer together, increasing density
Takes advantage of read heads being narrower than write heads
Write region of track N+1 overlaps write region of track N

How does it work?

[Figure: overlapping Track N, Track N+1 and Track N+2, with the width of the write head and the width of the read head marked.]

Page 6

Shingled Magnetic Recording (SMR)

Primary tradeoff is that rewriting data overwrites adjacent track(s)
Drive is divided into “zones”, each of which can be independently appended to and independently erased (“trimmed”)
Requires append-only data writes. Current write position is known as the “write pointer”

How does it work?

[Figure: Track N, Track N+1 and Track N+2, with the width of the write head and the width of the read head marked.]
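
The zone and write pointer rules above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the `Zone` class, its method names and the block-granular addressing are hypothetical, and real host-managed drives expose this behavior through drive commands rather than a Python object.

```python
class Zone:
    """Toy model of one host-managed SMR zone (addresses are block numbers)."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.write_pointer = 0   # next block that may legally be written
        self.blocks = []         # blocks appended so far

    def append(self, block):
        """Write one block at the write pointer and return its address."""
        if self.write_pointer >= self.capacity:
            raise OSError("zone is full")
        self.blocks.append(block)
        address = self.write_pointer
        self.write_pointer += 1
        return address

    def write_at(self, address, block):
        """Reject any write that does not land exactly on the write pointer."""
        if address != self.write_pointer:
            raise OSError("non-sequential write rejected")
        return self.append(block)

    def read(self, address):
        """Reads may target any block below the write pointer."""
        if not 0 <= address < self.write_pointer:
            raise OSError("read outside written region")
        return self.blocks[address]

    def reset(self):
        """Erase ("trim") the whole zone and rewind the write pointer."""
        self.blocks.clear()
        self.write_pointer = 0
```

In this model, the only way to change data that has already been written is to reset the entire zone and append again, which is the constraint the rest of the design works around.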

Page 7

Shingled Magnetic Recording (SMR)

Three general classes of SMR drives are defined:
  Drive managed (where the drive hides SMR complexity)
  Host managed (where a host is responsible for SMR complexity)
  Host aware (hybrid of the two, where the drive manages violations)

SMORE (SMR Object Repository) assumes Host managed SMR

In the context of this presentation

Page 8

SMR Object Repository (SMORE)

A project from the Advanced Technology Group at NetApp
Primary research objectives included:
  Determine whether SMR drives can achieve hardware limits for cold storage workloads
  Provide an experimental platform for investigating novel fault and error recovery techniques

Results published at: https://arxiv.org/pdf/1705.09701.pdf

Project Overview

Page 9

SMR Object Repository (SMORE)

Project assumptions
  Flash is for hot data, disk is for cool data, tape is for cold data
  Hot/warm data will be served from flash tier, so cool reads are random
  Cool data workloads are primarily object-based
    Write (PUT) once, complete object replacement (versioning)
    Read (GET) infrequently, with range requests
    Delete (DELETE) infrequently, long-lived objects
  Streaming reads and writes at aggregate drive throughput
  Flash enables new architectures, so can be used judiciously
  User-space software

Project Overview

Page 10

SMR Object Repository (SMORE)

Project Overview

[Figure: architecture diagram with the following components: User Requests, NVRAM FIFO Buffer, SMORE, User-space SMR Driver, SCSI Generic Driver, Array of SMR Drives, STDLIB, VFS, Flash.]

Page 11

SMR Object Repository (SMORE)

Index
  Each object has an identifier
  Need a place to translate ID to location(s) on disk
  This mapping information is stored in a B+ Tree on flash
  Index information is stored as part of each object to enable reconstruction

Architectural Elements

Page 12

SMR Object Repository (SMORE)

Segmenting
  Objects are split into segments
  Allows arbitrarily large objects
Fragmenting
  Each segment is erasure-coded into N fragments
  Each fragment is stored on a different physical hard drive

Architectural Elements

[Figure: zone set layout. Rows: Disk 1, Zone α; Disk 2, Zone β; Disk 3, Zone γ; ... Disk 14, Zone ν. Each row holds one fragment of Segment 1 (1A, 1B, 1C, ... 1N) and one fragment of Segment 2 (2A, 2B, 2C, ... 2N), followed by a Digest.]
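
The segmenting and fragmenting steps can be illustrated with a deliberately simplified sketch. Note the substitution: the slides describe Reed-Solomon erasure coding, whereas this example uses a single XOR parity fragment, which tolerates only one lost fragment. The function names and the fixed segment size are assumptions made for illustration.

```python
from functools import reduce


def split_segments(data, segment_size):
    """Cut an object into fixed-size segments (the last one may be shorter)."""
    return [data[i:i + segment_size] for i in range(0, len(data), segment_size)]


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_segment(segment, k):
    """Return k equal-length data fragments plus one XOR parity fragment.

    XOR parity is a stand-in for Reed-Solomon, not the code SMORE uses.
    """
    frag_len = -(-len(segment) // k)              # ceiling division
    padded = segment.ljust(frag_len * k, b"\0")   # zero-pad the last fragment
    data_frags = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    return data_frags + [reduce(xor_bytes, data_frags)]


def rebuild_fragment(fragments):
    """Rebuild the single fragment marked None by XOR-ing the survivors."""
    missing = [i for i, frag in enumerate(fragments) if frag is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can rebuild exactly one lost fragment")
    repaired = list(fragments)
    repaired[missing[0]] = reduce(xor_bytes, [f for f in fragments if f is not None])
    return repaired
```

Each fragment returned by `encode_segment` would be written to a zone on a different drive, so losing one drive costs at most one fragment per segment.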

Page 13

SMR Object Repository (SMORE)

Each SMR drive has M zones
A system with X drives has a total of X * M zones
Zones are grouped into sets such that no two zones are on the same drive
  E.g. {α, β, γ, … ν}
These are “Zone Sets”
Zone set width <= X

Architectural Elements

[Figure: zone set layout. Rows: Disk 1, Zone α; Disk 2, Zone β; Disk 3, Zone γ; ... Disk 14, Zone ν. Each row holds one fragment of Segment 1 (1A, 1B, 1C, ... 1N) and one fragment of Segment 2 (2A, 2B, 2C, ... 2N), followed by a Digest.]
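
The zone set constraint (no two zones on the same drive, width no larger than the number of drives) can be sketched as follows. The random drive selection, the `free_zones` structure and the function name are assumptions for illustration; the slides do not say how SMORE chooses the member drives.

```python
import random


def allocate_zone_set(free_zones, width, rng=random):
    """Pick `width` free zones, each on a different drive.

    free_zones maps drive id -> list of free zone ids on that drive.
    Returns a list of (drive, zone) tuples and removes them from free_zones.
    """
    candidates = [drive for drive, zones in free_zones.items() if zones]
    if width > len(candidates):
        raise ValueError("zone set width exceeds the drives with free zones")
    drives = rng.sample(candidates, width)   # distinct drives by construction
    return [(drive, free_zones[drive].pop()) for drive in sorted(drives)]


# Example: X = 6 drives with M = 4 zones each, zone sets of width 4.
rng = random.Random(0)
free_zones = {drive: list(range(4)) for drive in range(1, 7)}
zone_set_table = [allocate_zone_set(free_zones, 4, rng) for _ in range(3)]
```

Because every zone set draws its drives independently, different zone sets end up spread over different drive combinations, which is what the later de-clustering discussion relies on.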

Page 14

SMR Object Repository (SMORE)

Zone sets are allocated
Stored in a zone set table
Array of {disk, zone} tuples
  E.g. Zone Set 1 includes zones on disks {1,3,6,7, … 47}
  Zone Set 2 includes zones on disks {2,3,5,7, … 46,48}
  Etc.

Architectural Elements

Page 15

SMR Object Repository (SMORE)

Assume disk 5 fails
  Affects Zone Sets 2, 3
  Erasure coding (EC) prevents data loss
All Zone Sets with a zone on disk 5 can be easily identified
New zones can be allocated
Data can be rebuilt
Only the zone set table is updated

Architectural Elements
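
The failure handling described above can be sketched against a toy zone set table. The table layout (a dict of zone set id to a list of (disk, zone) tuples), the function names and the "first spare disk" policy are all assumptions for illustration, and the erasure-coded rebuild of the lost fragments is left out.

```python
def affected_zone_sets(zone_set_table, failed_disk):
    """Return the ids of all zone sets that have a zone on failed_disk."""
    return [zs_id for zs_id, members in zone_set_table.items()
            if any(disk == failed_disk for disk, _ in members)]


def reallocate(zone_set_table, free_zones, failed_disk):
    """Point each affected zone set at a fresh zone on another disk.

    Only the zone set table changes. Returns (zone_set_id, old, new)
    tuples describing the fragments that still need to be rebuilt.
    """
    rebuilds = []
    for zs_id in affected_zone_sets(zone_set_table, failed_disk):
        members = zone_set_table[zs_id]
        in_use = {disk for disk, _ in members}   # includes failed_disk
        spare = next((d for d in sorted(free_zones)
                      if d not in in_use and free_zones[d]), None)
        if spare is None:
            raise RuntimeError("no spare zone on an unused disk")
        slot = next(i for i, (disk, _) in enumerate(members)
                    if disk == failed_disk)
        old, new = members[slot], (spare, free_zones[spare].pop(0))
        members[slot] = new
        rebuilds.append((zs_id, old, new))
    return rebuilds


# Toy table: disk 5 appears in zone sets 2 and 3, as in the example above.
zone_set_table = {
    1: [(1, 0), (3, 0), (6, 0)],
    2: [(2, 0), (3, 1), (5, 0)],
    3: [(4, 0), (5, 1), (6, 1)],
}
free_zones = {1: [1, 2], 2: [1], 4: [1], 7: [0, 1]}
damaged = affected_zone_sets(zone_set_table, failed_disk=5)
rebuilds = reallocate(zone_set_table, free_zones, failed_disk=5)
```

Since the object index refers to zone sets rather than to physical zones, swapping a member in the table is enough to redirect every object stored in that zone set.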

Page 16

SMR Object Repository (SMORE)

De-clustering reduces data loss in multi-fault scenarios
E.g. triple disk fault with 16/18 RS EC:
  23.5% probability a given zone set has 0 zones lost
  45.3% probability a given zone set has 1 zone lost
  26.5% probability a given zone set has 2 zones lost
  4.7% probability a given zone set has 3 zones lost
Prioritize rebuilding of Zone Sets with 2 lost zones

Architectural Elements
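
The percentages above are consistent with a hypergeometric model, sketched below. Two assumptions are made here that the slide does not state: that the array has 48 drives (suggested by the disk numbers in the zone set table example) and that each zone set's 18 drives are chosen uniformly at random. The function name is illustrative.

```python
from math import comb


def lost_zone_distribution(total_drives, zone_set_width, failed_drives):
    """P(a given zone set loses k zones) for k = 0 .. failed_drives.

    Hypergeometric: the zone set occupies zone_set_width of the
    total_drives, and failed_drives of them fail at random.
    """
    outcomes = comb(total_drives, failed_drives)
    return [
        comb(zone_set_width, k)
        * comb(total_drives - zone_set_width, failed_drives - k)
        / outcomes
        for k in range(failed_drives + 1)
    ]


# Assumed parameters: 48 drives, width 18 (16 data + 2 parity), 3 failures.
for lost, probability in enumerate(lost_zone_distribution(48, 18, 3)):
    print(f"{lost} lost: {probability:.1%}")
```

Under these assumptions the four values come out as 23.5%, 45.3%, 26.5% and 4.7%, matching the slide. With 16/18 coding a zone set survives two lost zones, which is why zone sets that have already lost two are rebuilt first.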

Page 17

SMR Object Repository (SMORE)

Layout Marker Blocks at beginning of each fragment contain recovery information about the segment, used for recovering partially written zone sets
Digests at end of zone contain summary of all segments in the zone, used for recovering filled zone sets

Architectural Elements

[Figure: zone set layout with Layout Marker Blocks highlighted. Rows: Disk 1, Zone α; Disk 2, Zone β; Disk 3, Zone γ; ... Disk 14, Zone ν. Each row holds one fragment of Segment 1 (1A, 1B, 1C, ... 1N) and one fragment of Segment 2 (2A, 2B, 2C, ... 2N), followed by a Digest.]

Page 18

SMR Object Repository (SMORE)

Index Recovery
  Index is discarded and rebuilt on a crash
  Index does not have to be replicated, which lowers cost
  Design approach is to make rebuilding the index fast & simple
  Indexes are checkpointed and stored in dedicated zone sets
  Any zone sets closed since the checkpoint are replayed using the zone set digest
  Any zone sets opened since the checkpoint are replayed by reading through a zone to replay layout marker blocks

Architectural Elements
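
The recovery procedure above can be sketched as a replay over the zone sets dirtied since the last checkpoint. The data shapes are assumptions for illustration: each dirty zone set is a dict with a `closed` flag plus either a `digest` or a list of `layout_markers`, each holding (object id, location) records in write order. Deletes and the B+ tree itself are not modeled; a plain dict stands in for the index.

```python
def recover_index(checkpoint_index, dirty_zone_sets):
    """Rebuild the object index from a checkpoint plus dirty zone sets.

    dirty_zone_sets must be ordered oldest first so that a newer
    version of an object overrides an older one.
    """
    index = dict(checkpoint_index)   # start from the last checkpoint
    for zone_set in dirty_zone_sets:
        if zone_set["closed"]:
            # Filled zone set: one read of the digest at the end of the zone.
            records = zone_set["digest"]
        else:
            # Partially written zone set: walk the layout marker blocks.
            records = zone_set["layout_markers"]
        for object_id, location in records:
            index[object_id] = location
    return index


# Example: one zone set closed and one still open since the checkpoint.
checkpoint = {"a": ("zone set 1", 0)}
dirty = [
    {"closed": True, "digest": [("b", ("zone set 2", 0)), ("a", ("zone set 2", 8))]},
    {"closed": False, "layout_markers": [("c", ("zone set 3", 0))]},
]
recovered = recover_index(checkpoint, dirty)
```

The cost of this loop grows with the number of dirty zone sets, which is the dependency the implementation results report.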

Page 19

SMR Object Repository (SMORE)

Implementation shows that index recovery time depends on the number of dirty zone sets
Trade extra I/O for faster recovery
Implementation numbers are for serialized recovery, which can easily be parallelized so that performance increases with the number of drives

Implementation Results

Page 20

SMR Object Repository (SMORE)

Even in the worst-case scenario where every zone set has been dirtied, rebuild is bounded
In real-world deployments, more frequent checkpoints would bound the number of dirty zone sets in a rebuild
If the index is lost, rebuild must read the digest from every zone set

Implementation Results

Page 21

SMR Object Repository (SMORE)

For write I/Os, we achieved 100% of theoretical bandwidth, regardless of object size
For random read I/Os, seek latency constrains throughput for small objects
For sufficiently large objects, we achieved close to 100% of theoretical bandwidth

Implementation Results

[Figure: Aggregate Read Performance]
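
A simple back-of-the-envelope model shows why seek latency dominates small random reads. This is not the measurement methodology from the talk: the model (one seek per object followed by a streaming transfer) and the example parameters in the comment are hypothetical placeholders, not measured values.

```python
def read_efficiency(object_bytes, seek_seconds, streaming_bytes_per_second):
    """Fraction of streaming bandwidth achieved for one random object read.

    Assumes each read pays one seek and then transfers the object at
    the drive's streaming rate.
    """
    transfer_seconds = object_bytes / streaming_bytes_per_second
    return transfer_seconds / (seek_seconds + transfer_seconds)


# Placeholder drive parameters (assumed, not taken from the talk):
# 15 ms per seek and 150 MB/s streaming bandwidth.
small = read_efficiency(64 * 1024, 0.015, 150e6)   # 64 KiB object
large = read_efficiency(1024 ** 3, 0.015, 150e6)   # 1 GiB object
```

With these placeholder numbers the small object spends almost all of its time seeking, while the large object amortizes the seek and approaches full streaming bandwidth, which matches the qualitative shape described above.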

Page 22

SMR Object Repository (SMORE)

Staying within SMR workload limits requires low write amplification
The implementation shows low write amplification, even with worst-case workloads
For cold data storage, deletes are infrequent, resulting in far lower write amplification levels

Implementation Results

Page 23

SMR Object Repository (SMORE)

General lessons
  Parallelism is critical for performance
  Data spreading is critical for fault tolerance
  Optimal Zone Set width changes based on number of drives
  Simplifying fault recovery dramatically reduces implementation and testing complexity
  Use of flash storage for indexes, metadata, etc., dramatically improves performance while reducing complexity

Lessons Learned

Page 24

SMR Object Repository (SMORE)

SMR-specific lessons
  Append-only model simplifies implementation and increases write performance
  NVRAM needed to reliably buffer data due to SMR disk block sizes (can be omitted if a minimum object size is enforced)
  NVRAM improves mixed I/O performance by reducing seeks
  SMR disk write performance often significantly lower than read performance. Easy to exceed SMR disk workload limits

Lessons Learned

Page 25

Questions & Answers

Questions from the audience

My contact information: [email protected]