better i/o through byte-addressable, persistent memory€¦ · better i/o through byte-addressable,...

Post on 22-Aug-2020

21 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Better I/O ThroughByte-Addressable, Persistent Memory

Jeremy Condit, Ed Nightingale, Chris Frost,

Engin Ipek, Ben Lee, Doug Burger, Derrick Coetzee

A New World of Storage

+ Fast+ Byte-addressable- Volatile

+ Non-volatile- Slow- Block-addressable

Disk / Flash

DRAM

2

A New World of Storage

BPRAM

Byte-addressable, Persistent RAM

3

+ Fast+ Byte-addressable+ Non-volatile

A New World of Storage

BPRAM

How do we build fast, reliable systems with BPRAM?

Byte-addressable, Persistent RAM

4

+ Fast+ Byte-addressable+ Non-volatile

Phase Change Memory

• Most promising formof BPRAM

• “Melting memory chips in mass production”– Nature, 9/25/09

5

Phase Change Memory

phase change material(chalcogenide)

electrode

slow cooling -> crystalline state (1)

fast cooling -> amorphous state (0)

PropertiesReads: 2-4x DRAMWrites: 5-10x DRAMEndurance: 108+

6

A New World of Storage

+ Non-volatile- Slow- Block-addressable

BPRAM

Disk / FlashHow do we build fast, reliable systems with BPRAM?

Byte-addressable, Persistent RAM

This talk: BPFS, a file system for BPRAMResult: Improved performance and reliability

7

+ Fast+ Byte-addressable+ Non-volatile

Goal

New mechanism:short-circuit shadow paging

8

New guarantees for applications

• File system operations will commit atomically and in program order

• Your data is durable as soon as the cache is flushed

Design Principles

1. Eliminate the DRAM buffer cache;use the L1/L2 cache instead

3. Provide atomicity andordering in hardware

Write A Write B

2. Put BPRAM on the memory bus

9

Outline

• Intro

• File System

• Hardware Support

• Evaluation

• Conclusion

10

BPRAM in the PC

L1

L2

DRAM

HD / Flash

PCI/IDE bus

Memory bus

11

BPRAM in the PC

L1

L2

DRAM

HD / Flash

PCI/IDE bus

Memory bus

BPRAM

• BPRAM and DRAM are addressable by the CPU

• Physical address space is partitioned

• BPRAM data may be cached in L1/L2

12

BPRAM in the PC

L1

L2

DRAM

Memory bus

BPRAM

• BPRAM and DRAM are addressable by the CPU

• Physical address space is partitioned

• BPRAM data may be cached in L1/L2

13

BPFS: A BPRAM File System

• Guarantees that all file operations execute atomically and in program order

• Despite guarantees, significant performance improvements over NTFS on the same media

• Short-circuit shadow paging often allowsatomic, in-place updates

14

file directory

inodefile

root pointer

indirect blocks

inodes

BPFS: A BPRAM File System

file

15

file directory

inodefile

root pointer

indirect blocks

inodes

BPFS: A BPRAM File System

file

16

Enforcing FS Consistency Guarantees

• What happens if we crash during an update?

17

Enforcing FS Consistency Guarantees

• What happens if we crash during an update?

18

Enforcing FS Consistency Guarantees

• What happens if we crash during an update?

19

Enforcing FS Consistency Guarantees

• What happens if we crash during an update?

– Disk: Use journaling or shadow paging

– BPRAM: Use short-circuit shadow paging

20

Review 1: Journaling

• Write to journal, then write to file system

A B

file system

journal

21

Review 1: Journaling

• Write to journal, then write to file system

A B

file system

journal A’ B’

22

Review 1: Journaling

• Write to journal, then write to file system

A B

file system

journal A’ B’

B’A’

23

Review 1: Journaling

• Write to journal, then write to file system

A B

file system

journal A’ B’

B’A’

• Reliable, but all data is written twice

24

Review 2: Shadow Paging

• Use copy-on-write up to root of file system

BA

file’s root pointer

25

Review 2: Shadow Paging

• Use copy-on-write up to root of file system

BA A’ B’

file’s root pointer

26

Review 2: Shadow Paging

• Use copy-on-write up to root of file system

BA A’ B’

file’s root pointer

27

Review 2: Shadow Paging

• Use copy-on-write up to root of file system

BA A’ B’

file’s root pointer

28

Review 2: Shadow Paging

• Use copy-on-write up to root of file system

BA A’ B’

file’s root pointer

29

Review 2: Shadow Paging

• Use copy-on-write up to root of file system

BA A’ B’

file’s root pointer

• Any change requires bubbling to the FS root

• Small writes require large copying overhead30

Short-Circuit Shadow Paging

• Uses byte-addressability and atomic 64b writes

BA

file’s root pointer

31

• Inspired by shadow paging– Optimization: In-place update when possible

Short-Circuit Shadow Paging

• Uses byte-addressability and atomic 64b writes

BA A’ B’

file’s root pointer

32

• Inspired by shadow paging– Optimization: In-place update when possible

Short-Circuit Shadow Paging

• Uses byte-addressability and atomic 64b writes

BA A’ B’

file’s root pointer

33

• Inspired by shadow paging– Optimization: In-place update when possible

Short-Circuit Shadow Paging

• Uses byte-addressability and atomic 64b writes

BA A’ B’

file’s root pointer

34

• Inspired by shadow paging– Optimization: In-place update when possible

Opt. 1: In-Place Writes

• Aligned 64-bit writes are performed in place

– Data and metadata

file’s root pointer

35

Opt. 1: In-Place Writes

• Aligned 64-bit writes are performed in place

– Data and metadata

file’s root pointer

in-place write

36

Opt. 1: In-Place Writes

• Aligned 64-bit writes are performed in place

– Data and metadata

file’s root pointer

37

Opt. 1: In-Place Writes

• Aligned 64-bit writes are performed in place

– Data and metadata

file’s root pointer

38

Opt. 1: In-Place Writes

• Aligned 64-bit writes are performed in place

– Data and metadata

file’s root pointer

39

• Appends committed by updating file size

file’s root pointer + size

40

Opt. 2: Exploit Data-Metadata Invariants

• Appends committed by updating file size

file’s root pointer + size

in-place append

41

Opt. 2: Exploit Data-Metadata Invariants

• Appends committed by updating file size

file’s root pointer + size

in-place append

file size update

42

Opt. 2: Exploit Data-Metadata Invariants

BPFS Example

directory filedirectory

inodefile

root pointer

indirect blocks

inodes

43

BPFS Example

directory filedirectory

inodefile

root pointer

indirect blocks

inodes

add entry

remove entry

44

• Cross-directory rename bubbles to common ancestor

BPFS Example

directory filedirectory

inodefile

root pointer

indirect blocks

inodes

45

Outline

• Intro

• File System

• Hardware Support

• Evaluation

• Conclusion

46

BPRAM

L1 / L2

...

CoW

Commit

...

Problem 1: Ordering

47

BPRAM

L1 / L2

...

CoW

Commit

...

Problem 1: Ordering

48

BPRAM

L1 / L2

...

CoW

Commit

...

Problem 1: Ordering

49

BPRAM

L1 / L2

...

CoW

Commit

...

Problem 1: Ordering

50

BPRAM

L1 / L2

...

CoW

Commit

...

Problem 1: Ordering

51

...

CoW

Commit

...

Problem 2: Atomicity

L1 / L2

BPRAM

52

...

CoW

Commit

...

Problem 2: Atomicity

L1 / L2

BPRAM

53

...

CoW

Commit

...

Problem 2: Atomicity

L1 / L2

BPRAM

54

...

CoW

Commit

...

Problem 2: Atomicity

L1 / L2

BPRAM

55

Enforcing Ordering and Atomicity

• Ordering

– Solution: Epoch barriers to declare constraints

– Faster than write-through

– Important hardware primitive (cf. SCSI TCQ)

• Atomicity

– Solution: Capacitor on DIMM

– Simple and cheap!

56

...

CoW

Barrier

Commit

...

Ordering and Atomicity

L1 / L2

BPRAM

57

...

CoW

Barrier

Commit

...

Ordering and Atomicity

L1 / L2

BPRAM

1

1 1

58

...

CoW

Barrier

Commit

...

Ordering and Atomicity

L1 / L2

BPRAM

1

1 1

59

...

CoW

Barrier

Commit

...

Ordering and Atomicity

L1 / L2

BPRAM

1

1 1

2

60

...

CoW

Barrier

Commit

...

Ordering and Atomicity

L1 / L2

BPRAM

1

1 1

2

Ineligible for eviction!

61

...

CoW

Barrier

Commit

...

Ordering and Atomicity

L1 / L2

BPRAM

2

Ineligible for eviction!

62

...

CoW

Barrier

Commit

...

Ordering and Atomicity

L1 / L2

BPRAM

2

63

...

CoW

Barrier

Commit

...

Ordering and Atomicity

L1 / L2

BPRAM

64

...

CoW

Barrier

Commit

...

Ordering and Atomicity

L1 / L2

BPRAM

65

MP works too(see paper)

Outline

• Intro

• File System

• Hardware Support

• Evaluation

• Conclusion

66

Methodology

• Built and evaluated BPFS in Windows

• Three parts:

– Experimental: BPFS vs. NTFS on DRAM

– Simulation: Epoch barrier evaluation

– Analytical: BPFS on PCM

67

0

2

4

6

8

10

8 64 512 4096

Random n Byte Write

Microbenchmarks

0

0.4

0.8

1.2

1.6

2

8 64 512 4096

Tim

e (s

)

Append n Bytes

NTFS - DiskNTFS - RAMBPFS - RAM

68

NOT DURABLE!

NOT DURABLE!

DURABLE!

DURABLE!

BPFS Throughput On PCM

0

0.25

0.5

0.75

1

Exec

uti

on

Tim

e (v

s. N

TFS

/ D

isk)

NTFSDisk

NTFSRAM

BPFSRAM

69

BPFSPCM(Proj)

BPFS Throughput On PCM

0

0.25

0.5

0.75

1

Exec

uti

on

Tim

e (v

s. N

TFS

/ D

isk)

0

0.25

0.5

0.75

1

0 200 400 600 800

Sustained Throughput of PCM (MB/s)

ProjectedThroughput

BPFS - PCM

NTFSDisk

NTFSRAM

BPFSRAM

70

BPFSPCM(Proj)

Conclusions

• BPRAM changes the trade-offs for storage– Use consistency technique designed for medium

• Short-circuit shadow paging:– improves performance

– improves reliability

Bonus: PCM chips on display at poster session!

71

top related