PEN: Design and Evaluation of Partial-Erase for 3D NAND-Based High Density SSDs

Chun-Yi Liu, Jagadish B. Kotra, *Myoungsoo Jung, Mahmut T. Kandemir

The Pennsylvania State University, *Yonsei University


Outline

• Introduction and Background

• Motivation: Impact on Block Size

• Controller Hardware for Partial-Erase

• FTL for Partial-Erase

• Evaluation

• Conclusion


The trend of NAND Flash

• The issues of increasing 2D NAND flash density (smaller cells)
  - Reliability: various serious disturbances
    Program (write) disturbances, read disturbances, and retention errors.
  - Performance: longer program (write) time
    Cells are more sensitive to the program voltage.
• 3D NAND can largely mitigate those issues.
  - Larger, stackable cells
  - Increasing layers to increase density: 32 -> 48 -> 64 -> 96 layers
• 3D NAND will dominate the MLC / TLC NAND market.

[Figure: chart with a 0 to 100 axis and series 2D >2Xnm, 2D 20nm, 2D <1Xnm, and 3D NAND.]

Overview of 2D/3D NAND-based SSDs

[Figure: SSD organization. The host interface (NVMe, PCIe, SATA) connects to the SSD controller, which has DRAM and drives Channel 0 to Channel X; the channels connect Flash Package 0 to Flash Package Y; a flash package contains dies (Die 0, ...), a die contains planes (Plane 0, Plane 1), a plane contains Block 0 to Block W, and a block contains pages P-0 to P-Z.]

2D NAND dies are replaced by 3D NAND dies.

[Figure: a 2D NAND block (pages P-0 to P-3 across the bitlines) next to a 3D NAND block (pages 0 to 15 across the bitlines).]

The Big Block Problem

• Cross-layer signals: they are difficult to shrink.
• All read/write/erase operations are controlled by the top layer.
• Multiple slices have to share the control circuits due to cross-layer signals.
• As the number of layers (n) increases, the number of pages per block in 3D NAND increases in O(n^2), not only O(n).
• To read/write P-4: CG 1 and USG 0 have to be set to the proper values.

[Figure: abstract plane view (Block 0 to Block N, pages P-0 to P-15), block layer view (upper ST layer, cell layers 3 to 0, lower ST layer), and block physical view (Slice 0 with the page decoder, USG, CG 0 to CG 3, and LSG lines). Legend: SG = select gate, CG = control gate, ST = select transistor.]

SSD Management Software: Flash Translation Layers (FTLs)

• NAND flash properties:
  - Read/write operation: at a page unit
    Typically hundreds of microseconds (us)
  - Erase: at a block unit
    Typically milliseconds (ms) or tens of ms
• Out-place-update:
  - In-place-update: 1 erase (10 ms) + 1 write (900 us)
  - Out-place-update: 1 write (900 us); the old page is invalidated and a mapping table is required
• Flash Translation Layers (FTLs):
  - Address mapping: maintain a mapping table
  - Garbage Collection (GC): different FTLs have their own GC algorithms

[Figure: updating Page-0 of Block 0 (Page-0 to Page-15) with in-place-update versus out-place-update.]
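The out-place-update scheme from the FTL slide can be illustrated with a minimal Python sketch. The latencies (10 ms erase, 900 us write) come from the slide; the `Flash` class, its fields, and the append-only allocation policy are assumptions made for illustration and are not the authors' implementation.

```python
from dataclasses import dataclass, field

ERASE_US = 10_000  # 1 erase (10 ms), as quoted on the slide
WRITE_US = 900     # 1 write (900 us), as quoted on the slide


@dataclass
class Flash:
    """Hypothetical page-mapped flash with append-only allocation."""
    pages_per_block: int
    num_blocks: int
    mapping: dict = field(default_factory=dict)  # logical page -> (block, page)
    invalid: set = field(default_factory=set)    # physical pages with stale data
    next_free: int = 0                           # next free physical page

    def write(self, lpn: int) -> int:
        """Out-place-update: write to a free page, invalidate the old copy."""
        if self.next_free >= self.pages_per_block * self.num_blocks:
            raise RuntimeError("no free page left: garbage collection needed")
        old = self.mapping.get(lpn)
        if old is not None:
            self.invalid.add(old)  # the old page is only marked, not erased
        self.mapping[lpn] = divmod(self.next_free, self.pages_per_block)
        self.next_free += 1
        return WRITE_US


def in_place_update_latency() -> int:
    """In-place-update needs an erase before the write."""
    return ERASE_US + WRITE_US


flash = Flash(pages_per_block=16, num_blocks=2)
flash.write(0)            # first write of logical page 0
latency = flash.write(0)  # update: goes to a new physical page
```

The invalid pages that accumulate in this model are what garbage collection later has to reclaim, which is where the block size starts to matter.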

Flash Translation Layers (FTLs)

• Three categories of FTLs:
  - Page-level mapping:
    Pro: provides the best performance
    Con: requires a huge mapping table (a 1 TB SSD requires a 1 GB mapping table)
  - Block-level mapping:
    Pro: small mapping table
    Con: performance degradation
    A well-known implementation: NFTL
  - Hybrid mapping:
    Combines the best of the previous two mappings
    We focus on the Superblock FTL.

[Figure: page-level mapping (logical address to a huge mapping table with one entry per page of Block 0 to Block N) versus block-level mapping (logical address to a small mapping table with one entry per block).]
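The "1 TB needs a 1 GB table" figure for page-level mapping can be reproduced with simple arithmetic. The slide does not state the page size or entry size; the sketch below assumes 4 KiB pages and 4-byte mapping entries, and the 256 pages per block in the block-level example is also only an illustrative value.

```python
def page_mapping_table_bytes(capacity_bytes: int,
                             page_bytes: int = 4096,   # assumed page size
                             entry_bytes: int = 4) -> int:  # assumed entry size
    """One mapping entry per flash page."""
    return (capacity_bytes // page_bytes) * entry_bytes


def block_mapping_table_bytes(capacity_bytes: int,
                              pages_per_block: int,
                              page_bytes: int = 4096,
                              entry_bytes: int = 4) -> int:
    """One mapping entry per flash block."""
    block_bytes = pages_per_block * page_bytes
    return (capacity_bytes // block_bytes) * entry_bytes


TIB = 1 << 40
page_level = page_mapping_table_bytes(TIB)          # 2**30 bytes = 1 GiB
block_level = block_mapping_table_bytes(TIB, 256)   # 2**22 bytes = 4 MiB
```

Under these assumptions the block-level table is smaller by a factor equal to the number of pages per block, which is the trade-off the slide describes.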

Introduction of NFTL

• Address mapping: a block index (block x) is calculated from the logical address and looked up in the mapping table to find the D-block (data block).
• Write request to a new address: the data is written to the D-block.
• Update request to the same address: the data is updated (logged) in the U-block (update block).
• Read request to the same address: we may need to sequentially search the U-block.
• GC scenario 1: fully-utilized U-block.
  A write request arrives while the U-block paired with the D-block is full.
• GC scenario 2: unpaired D-block.
  A write request has to allocate a U-block. (The number of U-blocks is much smaller than that of D-blocks.)
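The D-block/U-block behavior described for NFTL can be sketched in a few lines of Python. This is a simplified model of one block pair, not the NFTL implementation: the class name `NftlPair`, the in-memory lists, and the choice to merge as soon as the U-block is full (GC scenario 1) are assumptions for illustration.

```python
class NftlPair:
    """Hypothetical model of one NFTL D-block/U-block pair."""

    def __init__(self, pages_per_block: int):
        self.ppb = pages_per_block
        self.d_block = [None] * pages_per_block  # indexed by page offset
        self.u_block = []                        # log of (offset, data) updates
        self.copied_pages = 0                    # valid pages copied by merges

    def write(self, offset: int, data) -> None:
        if self.d_block[offset] is None:
            self.d_block[offset] = data          # write to a new address
            return
        if len(self.u_block) == self.ppb:        # GC scenario 1: U-block is full
            self.merge()
        self.u_block.append((offset, data))      # update is logged in the U-block

    def read(self, offset: int):
        # Sequentially search the U-block, newest entry first.
        for logged_offset, data in reversed(self.u_block):
            if logged_offset == offset:
                return data
        return self.d_block[offset]

    def merge(self) -> None:
        # Copy the latest version of every valid page into a new D-block.
        new_d = [self.read(offset) for offset in range(self.ppb)]
        self.copied_pages += sum(page is not None for page in new_d)
        self.d_block, self.u_block = new_d, []


pair = NftlPair(pages_per_block=4)
for value in "abcdef":       # one initial write, then five updates
    pair.write(0, value)
```

In this model the number of pages a merge may have to copy is bounded by the pages per block, so larger blocks make each merge potentially more expensive.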

Introduction of NFTL (cont'd)

• NFTL GC (Merge) overview
  Step 1: Read a valid page from the D-/U-block (block pair) to the controller.
  Step 2: Write the valid page to a free block and invalidate the old page.
  Repeat steps (1) and (2) until no valid pages remain in the D-/U-blocks.
  Step 3: Erase the D-/U-blocks.
  Step 4: Update the mapping table (FTL); the free block becomes the new D-block.
• The number of copied valid pages increases as the number of pages per block increases.

[Figure: the SSD controller merges the victim blocks of Chip 0 into a free block over the chip interface, while the requests in the I/O request queue (W R W W W W) are stalled. Legend: free page, valid page, invalid page.]

Introduction of Superblock FTL

• Superblock FTL GC overview
  - Intra-superblock GC (similar to fully-utilized U-block in NFTL)
  - Inter-superblock GC (similar to unpaired D-block in NFTL)

[Figure: two superblocks; a block is allocated to a superblock.]


Impacts on block size

• We use four iso-capacity SSD configurations to show the impacts.
  - Keeping the capacity fixed prevents it from affecting the GC overheads (GC scenario 2 in NFTL).
• (blocks per plane, pages per block): (15104, 72), (7552, 144), (3776, 288), (1888, 576)
• As the number of pages per block increases, the time spent reading/writing valid pages during GC increases.

[Figure: write latency (in msec, axis 0 to 60) for 72, 144, 288, and 576 pages per block on the workloads prn_1, proj_0, proj_1, proj_2, usr_1, and usr_2. Latency breakdown: Spare Area Read, Write, GC Spare Area Read, GC Read, GC Write, GC Erase.]
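The iso-capacity property of the four configurations is easy to check numerically. In the sketch below the first pair is taken as (15104, 72), that is, twice 7552 blocks, which is the value that keeps the number of pages per plane constant; the variable names are illustrative only.

```python
# (blocks per plane, pages per block) for the four configurations
CONFIGS = [(15104, 72), (7552, 144), (3776, 288), (1888, 576)]

# Pages per plane for each configuration: blocks * pages per block
pages_per_plane = [blocks * pages for blocks, pages in CONFIGS]

# Relative block count compared with the largest-block configuration
fewest_blocks = min(blocks for blocks, _ in CONFIGS)
block_ratio = [blocks // fewest_blocks for blocks, _ in CONFIGS]

for (blocks, pages), total in zip(CONFIGS, pages_per_plane):
    print(f"{blocks:>6} blocks x {pages:>3} pages = {total} pages per plane")
```

Every configuration yields the same number of pages per plane, so differences in GC behavior can be attributed to the block size rather than to capacity.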

Impacts on block size (cont’d)

• The write amplification increases slightly.
• GC is triggered more frequently because there are fewer blocks.

[Figure: write amplification (axis 0 to 18) and number of GCs (in millions, axis 0 to 0.6) for Blk_72, Blk_144, Blk_288, and Blk_576.]

Overview of Partial-Erase Operation

• We propose PEN (Partial-Erase for 3D NAND flash) to address the 3D NAND performance degradation.
  - The number of copied valid pages can be reduced.
• The PEN system contains two parts:
  - Hardware part:
    Partial-erase operation
    Indexing the partial-blocks
  - Software part:
    FTL for partial-erase
    M-Merge
    Program disturbance
    Wear-leveling

[Figure: baseline (block-erase) versus the proposed PEN system (partial-erase) on a block pair with regions of 72, 72, 432, 144, 36, 36, and 360 pages; labels: 576 valid pages, 72 valid pages. Legend: free page, valid page, invalid page.]


Controller Hardware for Partial-Erase

• PEN enables partial-erase by setting the proper control signals while the erase operation is performed.
  - E.g., CG0~CG3 in the figure.
• The minimum unit is a layer of pages.
• In the bottom figure, the layer (Page-4 ~ Page-7) controlled by CG1 is erased.
• Partial-erase introduces additional program disturbance.
  - E.g., on the upper layer (Page-0 ~ Page-3) and the lower layer (Page-8 ~ Page-11).

[Figure: control-line settings for Slice 0 of Block 0. Block erase: 0V on CG0 through CG3, remaining lines floating. Partial erase: 0V on CG1 only, remaining lines floating. Program disturbance is marked in the figure.]

Hardware Overheads & Indexing the Partial Blocks

• Partial-erase command format.
  [Figure: a command consists of Start (1 byte), three address cycles (24 bits), and End (1 byte). For block-erase (baseline) the address holds ignored bits and the block index; for partial-erase it holds the PB index and the block index.]
• Partial-block index mechanism.
  [Figure: the partial-blocks (PBs) of a block are indexed as a binary tree: PB 1 at the root, and the children of PB i are PB i*2 and PB i*2+1 (PBs 2 and 3, then 4 to 7, then 8 to 15). Example: (block index, PB index) = (426, 5) selects PB 5 of Block 426.]
• Control logic in the page decoders (PD) needs to be duplicated.
  [Figure: top view of a 3D NAND flash chip: 3D NAND flash cell arrays (plane 0, plane 1), block decoder, page decoders (PD) with control logic and high-voltage switches, page buffers and column decoder, peripheral circuits, charge pumps, command logic and analog circuits.]
• The partial-erase operation is only slightly faster than the block erase.
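The binary-tree PB index can be turned into a page range with a little bit arithmetic. The sketch below assumes that the PBs at one tree depth split the block evenly and are laid out in increasing page order, which is consistent with PB 5 covering Page-4 to Page-7 in the 16-page example; the function names are hypothetical.

```python
def pb_page_range(pb_index: int, pages_per_block: int) -> range:
    """Pages covered by a partial-block, assuming an even, in-order split."""
    if pb_index < 1:
        raise ValueError("PB indices start at 1 (the whole block)")
    depth = pb_index.bit_length() - 1           # root PB 1 has depth 0
    size, remainder = divmod(pages_per_block, 1 << depth)
    if size == 0 or remainder:
        raise ValueError("block cannot be split evenly at this depth")
    first = (pb_index - (1 << depth)) * size    # position within its depth
    return range(first, first + size)


def pb_children(pb_index: int) -> tuple:
    """Children in the index tree: *2 and *2+1."""
    return 2 * pb_index, 2 * pb_index + 1


layer_pages = list(pb_page_range(5, 16))   # 16-page block from the figure
pb9_pages = pb_page_range(9, 576)          # 576-page block, 72-page PBs
```

With this encoding a (block index, PB index) pair such as (426, 5) is enough to name any aligned power-of-two fraction of a block.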


Partial Block vs. Smaller Block

• One option is to pretend that the block size is 72 pages instead of 576 pages.
• Drawbacks of emulating a smaller block (72 pages) through partial-erase:
  - 8x larger mapping table: reducing the block size increases the number of blocks.
  - Inefficient partial-erase operations: erasing 576 pages takes 8 partial-erases (8 * 72 pages) instead of 1 block erase.
  - Not aware of disturbance.

Different Block Pair Scenarios

• How, and whether, should we use partial-erase in the following scenarios?
• Assume that the minimum partial-block has 18 pages.
• Partial-erases may not be beneficial.
• Partial-erase may still introduce valid page copies.
• Deciding the partial-block size is important: we should select 72 + 36 pages, not 36 + 36 + 36 pages.

[Figure: four block-pair scenarios, (1) to (4), annotated with partial-block and valid-page sizes (18 pages, 2 pages, 6 pages, 36 pages, 72, 36, 108).]

M-Merge (Modified-Merge) Algorithm

• Restore operation:
  (1) copy out valid pages
  (2) partial-erase operation
  (3) copy back valid pages

[Figure: restore of PB 9 and restore of PB 14 on a D-/U-block pair, showing (1) copy out valid pages, (2) partial-erase, and (3) copy back valid pages, next to the PB index tree (1 to 15).]

M-Merge Algorithm (cont'd)

• Recursive relation to decide the restored partial-blocks (pb):

  $$\mathrm{cost}[pb] = \min\big(\mathrm{restore}[pb],\ \mathrm{cost}[pb \cdot 2] + \mathrm{cost}[pb \cdot 2 + 1]\big)$$

  If pb is the smallest PB: $\mathrm{cost}[pb] = \mathrm{restore}[pb]$

• Check whether cost(M-Merge) < cost(Merge).

Example (16-page block, PBs 1 to 7):
  Calculate the restore costs:
    Restore(PB 1) = 12 copies + 1 erase + 16 copies
    Restore(PB 2) = 4 copies + 1 erase + 8 copies
    Restore(PB 5) = 1 erase + 4 copies
    Restore(other PBs) = 0
  Restore(PB 2) > cost[4] + cost[5]
  Restore(PB 1) > cost[2] + cost[3]
  cost[PB 1] is the final cost of the M-Merge. The algorithm will restore PBs 3, 4, and 5 to complete the merge. Note that restore(PB 3) and restore(PB 4) are 0.
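The recursive relation for M-Merge can be written as a short recursive function. The sketch below is an illustration, not the authors' code: the unit costs (`COPY = 1`, `ERASE = 10`) are arbitrary assumed values, the function name is hypothetical, and ties are resolved in favor of restoring the larger partial-block. The restore table mirrors the slide's example.

```python
COPY = 1    # assumed cost of copying one page (arbitrary units)
ERASE = 10  # assumed cost of one partial-erase (arbitrary units)


def m_merge_plan(restore: dict, num_pbs: int, pb: int = 1) -> tuple:
    """Return (cost, PBs to restore) for the index subtree rooted at pb.

    cost[pb] = min(restore[pb], cost[2*pb] + cost[2*pb + 1]);
    for the smallest PBs (no children), cost[pb] = restore[pb].
    """
    own = restore.get(pb, 0)
    if 2 * pb > num_pbs:                       # smallest PB: no children
        return own, [pb]
    left_cost, left_pbs = m_merge_plan(restore, num_pbs, 2 * pb)
    right_cost, right_pbs = m_merge_plan(restore, num_pbs, 2 * pb + 1)
    if own <= left_cost + right_cost:          # tie: keep the larger PB
        return own, [pb]
    return left_cost + right_cost, left_pbs + right_pbs


# Restore costs from the slide's example; unlisted PBs cost 0.
restore_costs = {
    1: 12 * COPY + ERASE + 16 * COPY,
    2: 4 * COPY + ERASE + 8 * COPY,
    5: ERASE + 4 * COPY,
}
total_cost, chosen = m_merge_plan(restore_costs, num_pbs=7)
```

Under these unit costs the plan selects PBs 3, 4, and 5, matching the example; the resulting total would then be compared against the cost of a regular merge.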

M-Merge for Block-level Mapping (NFTL)

• M-Merge uses the recursive relation to decide the restored partial-blocks.
  - PBs 5, 6, 8, 9, 14, and 15
• Apply restore to those PBs.
  - PBs 5, 6, 8, and 15 can be skipped.

[Figure: M-Merge on the victim D-/U-blocks: (1) skipped PBs (5, 6, 8, 15), (2) restore PB 9, (3) restore PB 14. Legend: free page, valid page, invalid page.]

M-Merge with Program Disturbance

• To prevent data corruption, M-Merge restores the previously disturbed pages.
• At times t2 and t4, neighboring partial-blocks are erased.
• To prevent uneven wear, a block can be M-Merged only a limited number of times.

[Figure: timeline t1 to t5 for a D-block containing PB 9, ending with a block erase (Merge) at t5. Legend: to be disturbed, valid page, to be erased (PB 9), to be erased in case of data corruption.]

M-Merge for Hybrid Mapping (Superblock FTL)

• Superblock FTL has no D-/U-block concept.
  - Before M-Merge, we assign D-/U-block-sets based on the number of valid pages.

[Figure: a superblock divided into a D-block-set and a U-block-set.]


Experimental Setup

• We use 12 write-dominant workload traces.
• Four metrics:
  - Average write latency
  - Write amplification
  - AEP: the average number of erase operations per page
  - VEP: the variance of the number of erase operations per page
• Typically, AEB (the average number of erase operations per block) is used. Due to the partial-erase operation, a finer measurement is required.
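The per-page wear metrics can be computed directly from per-page erase counters. The sketch below is illustrative: the helper names are hypothetical, and population variance is assumed for VEP because the slides do not say whether population or sample variance is used.

```python
from statistics import fmean, pvariance


def apply_erase(counts: list, first_page: int, num_pages: int) -> None:
    """Record one erase covering num_pages pages starting at first_page."""
    for page in range(first_page, first_page + num_pages):
        counts[page] += 1


def aep_vep(counts: list) -> tuple:
    """AEP = mean erases per page; VEP = (population) variance per page."""
    return fmean(counts), pvariance(counts)


erase_counts = [0] * 16            # one 16-page block
apply_erase(erase_counts, 0, 16)   # a block erase touches every page
apply_erase(erase_counts, 4, 4)    # a partial-erase touches one layer only
aep, vep = aep_vep(erase_counts)
```

A per-block counter would record two erases for the whole block here, while the per-page view shows that only four pages were erased twice, which is why a finer metric than AEB is needed.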

NFTL Results

[Figure: four panels comparing Baseline and PEN: normalized write latency, write amplification, AEP, and VEP.]

Sensitivity Results

[Figure: four panels. Normalized write latency for Baseline and L = 1 to 6; normalized write latency for Baseline, PEN without disturbance, and PEN; normalized write latency for Baseline, Unlimited, 16, and 4; VEP for Baseline, Unlimited, 16, and 4.]

Superblock FTL Results

[Figure: four panels (normalized write latency, write amplification, AEP, VEP) for the configurations 4_5_BSL, 4_5_PEN, 4_8_BSL, and 4_8_PEN.]

Related Works

• Partial-erase proposals
  - Hardware
    2D NAND partial-erase proposal: not beneficial due to a smaller block
    Partial-block erase (PBE): reduces the whole capacity
    Subblock management: provides only three partial-blocks
  - Software
    Subblock-erase: designed for page-level mapping
• Partial-GC proposals
  - Those redistribute the GC overheads to idle time but cannot reduce the GC overheads.
  - Those can be combined with our partial-erase operation.

Conclusion

• We propose and evaluate a novel partial-erase based PEN architecture for emerging 3D NAND flash, which minimizes the number of valid pages copied during a GC operation.

• To show the effectiveness of our proposed partial-erase operation, we introduce our M-Merge algorithm, which employs the partial-erase operation for NFTL and Superblock FTL.

• Our extensive experimental evaluations show that the average write latency under the proposed PEN system is reduced by up to 47.9%.

Q & A


Thank You!
