K2: Work-Constraining Scheduling of NVMe-Attached Storage

Till Miemietz, Hannes Weisbach, Michael Roitzsch and Hermann Härtig

Presentation at the 40th IEEE Real-Time Systems Symposium

Hong Kong, 4th of December 2019



What are the implications of fast storage devices for

real-time systems?


What May a Modern Storage Stack Look Like?

[Figure: layered storage stack. Apps (e.g., file systems) running on CPUs 0 to 2 submit bios to the block layer, which holds them in an I/O scheduler-specific staging queue scheme and then in dispatching queues (FIFO only). The block layer passes requests to the driver, which issues NVMe commands to the SSD through NVMe queue pairs.]

What May a Modern Storage Stack Look Like?

[Figure: the same stack (apps, bios, block layer, requests, driver, NVMe commands, SSD) with the SSD internals expanded. Submission/completion queue pairs (SQ/CQ) lead to a controller with a Flash Translation Layer (FTL), which manages flash packages made up of blocks and pages (P).]

Latency Characteristics of SSDs – Motivation

● Gap between CPU and storage devices is shrinking

→ Speedup of 1000x compared to HDDs

→ Up to 40% of storage latency caused by software

● High degree of abstraction (FTL) is source of non-determinism

→ Garbage Collection

→ Caching

→ Scheduling of multiple NVMe queue pairs

● Can host-side I/O schedulers still be used to enforce latency goals?


Dissecting Linux' Block I/O Schedulers

● Performing micro benchmarks in a simulated real-time scenario

→ Samsung 970 EVO (250 GB, NVMe 1.3) on a desktop system

● Multiple background processes used to create load on the drive

→ Goal is bandwidth maximization

● Single foreground high-priority process that periodically issues requests

→ Goal is fast access to the drive

● Analyse latency characteristics of RT process using different I/O schedulers
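The foreground side of such a setup can be sketched as follows. The slides do not name the benchmark tool, so the function name `measure_periodic_reads`, the period, and the temporary test file are assumptions made for illustration only. A real measurement would target the drive itself and bypass the page cache (for example with O_DIRECT), which this sketch does not do.

```python
import os
import random
import tempfile
import time

BLOCK_SIZE = 4096  # 4K requests, as in the random-read experiment


def measure_periodic_reads(path, count, period_s):
    """Issue `count` random block-aligned reads, one per period.

    Returns the per-request latencies in seconds.
    """
    blocks = os.path.getsize(path) // BLOCK_SIZE
    latencies = []
    fd = os.open(path, os.O_RDONLY)
    try:
        next_release = time.monotonic()
        for _ in range(count):
            offset = random.randrange(blocks) * BLOCK_SIZE
            start = time.monotonic()
            os.pread(fd, BLOCK_SIZE, offset)
            latencies.append(time.monotonic() - start)
            # Sleep until the next period starts (no sleep if we overran).
            next_release += period_s
            time.sleep(max(0.0, next_release - time.monotonic()))
    finally:
        os.close(fd)
    return latencies


# Demo on a small temporary file; these reads are served from the page cache.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(os.urandom(64 * BLOCK_SIZE))
latencies = measure_periodic_reads(tmp.name, count=20, period_s=0.001)
os.unlink(tmp.name)
print(f"max latency: {max(latencies) * 1e6:.1f} us")
```

The background processes would run a similar loop without the sleep, issuing requests as fast as the drive accepts them.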


Block I/O Schedulers in Linux 4.15

● None (default)

● mq-deadline

→ Orders requests by target block address

● Kyber

→ Core-local, balances latencies of coarse-grained request classes

● Budget Fair Queueing (BFQ)

→ Bandwidth control per process

→ Aware of I/O priorities
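On Linux, the scheduler of a block device is exposed through the sysfs file `/sys/block/<device>/queue/scheduler`, which lists the available schedulers and marks the active one with square brackets. The sketch below parses such a line. The helper names and the device name `nvme0n1` in the docstring are illustrative assumptions, and the set of schedulers listed depends on the kernel configuration.

```python
from pathlib import Path


def parse_schedulers(text):
    """Split a sysfs scheduler line into (active, available)."""
    available = []
    active = None
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]  # the bracketed entry is the active one
            active = token
        available.append(token)
    return active, available


def read_schedulers(device):
    """Read the scheduler line of a block device such as 'nvme0n1'."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_schedulers(path.read_text())


# Example line in the format the kernel uses.
active, available = parse_schedulers("[none] mq-deadline kyber bfq")
print(active, available)
```

Writing one of the listed names into the same file (as root) switches the scheduler for that device.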


Latency Characteristics for Random Reads

● Both real-time and background processes issuing small random reads (4K)

→ The x-axis shows the achieved bandwidth; the color of the plotting symbols indicates the targeted throughput

● Plots look similar for larger block sizes


Latency Characteristics for Write-Only Workloads

● Writing is very fast as long as drive-internal SLC cache is accessed

● Garbage collection drastically increases storage latency of RT process


Dissecting Linux' I/O Schedulers – Lessons Learned

● Reading is often slower than writing

→ Caching can avoid synchronous access of second-level flash cells when writing

● No performance isolation

→ Real-time process faces high latency when SSD is fully loaded

→ BFQ is unable to enforce priorities correctly

● From a latency viewpoint, the I/O schedulers show little difference

→ However, complex implementations have latency penalties of up to 10%


K2: Work-Constraining I/O Scheduling

● Work-conserving behavior of current schedulers is not optimal w.r.t. latencies

→ Stalls high-priority read requests

→ Amplifies effects of garbage collection

● Concept: Limit the number of requests that are served in parallel

→ Device-wide limit of inflight requests

→ Requests are stored in per-priority FIFO queues

→ Submit new requests on completion of previous ones

● Queue length as a tunable parameter

→ Trade global throughput for softly bounded I/O latency of high-priority processes
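The concept above can be illustrated with a small toy model. This is a sketch of the idea as stated on the slide, not the authors' kernel implementation: the class name, the method names, and the convention that priority 0 is the highest are assumptions, and the `dispatched` list merely stands in for the device queue.

```python
from collections import deque


class WorkConstrainingScheduler:
    """Toy model: bounded inflight count, one FIFO queue per priority."""

    def __init__(self, max_inflight, num_priorities=2):
        self.max_inflight = max_inflight  # device-wide limit (tunable)
        self.inflight = 0
        # Index 0 is the highest priority (an assumption of this sketch).
        self.queues = [deque() for _ in range(num_priorities)]
        self.dispatched = []  # stands in for the device queue

    def submit(self, request, priority):
        self.queues[priority].append(request)
        self._dispatch()

    def complete(self, request):
        # A completion frees one slot; `request` is unused in this toy model.
        self.inflight -= 1
        self._dispatch()

    def _dispatch(self):
        # Fill free slots from the highest-priority non-empty queue.
        while self.inflight < self.max_inflight:
            queue = next((q for q in self.queues if q), None)
            if queue is None:
                return
            self.dispatched.append(queue.popleft())
            self.inflight += 1


sched = WorkConstrainingScheduler(max_inflight=2)
for name in ("bg0", "bg1", "bg2", "bg3"):
    sched.submit(name, priority=1)
sched.submit("rt0", priority=0)  # arrives last, while the device is full
sched.complete("bg0")            # one slot frees up
print(sched.dispatched)          # ['bg0', 'bg1', 'rt0']
```

Because only two requests may be inflight, the late high-priority request overtakes the queued background requests as soon as a slot frees up. A work-conserving scheduler would already have handed all four background requests to the device, where the host can no longer reorder them.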


Evaluation of K2 – Random Reads (64K)

● Limiting the length of the device queues enforces correct service order

→ Tested K2 with a queue length of 8, 16, and 32

→ Note the different scales of the y-axis!

● Host gains flexibility to enforce quick submission of real-time requests


Evaluation of K2 – Sequential Write Operations (64K)

● Performance for very fast operations similar to mq-deadline and Kyber

● K2 cannot avoid garbage collection but mitigates its impact on RT applications


Evaluation of K2 – Application Benchmark

● Tested read-only OLTP benchmarks of sysbench with read/write background load


Summary

● Fast SSDs impose new challenges on the OS to enforce timely access to storage

→ Complex abstractions cause non-determinism of performance parameters

● Current I/O schedulers are not suitable for real-time demands

→ No performance isolation

→ Overridden by drive-internal scheduler

● K2: work-constraining I/O scheduling to limit storage access latency

→ Trade throughput for lower tail latencies

→ Improve worst-case latency by up to 10x for reading, 6.8x for writing

→ Works with off-the-shelf components


Additional Slides


Latency Characteristics for Random Reads (All Percentiles)

● Both real-time and background processes issuing small random reads (4K)


Latency Characteristics for Write-Only Workloads

● Garbage collection also affects lower percentiles


Latency Characteristics for Mixed Workloads (64K)

● Access times of real-time application are similar to reading


Impact of Scheduler Complexity

● For small random writes, complex policies have a notable impact on overall latency


Evaluation of K2 – Random Reads (64K, All Percentiles)

● Limiting the length of the device queues enforces correct service order


Evaluation of K2 – Mixed Workloads (64K)

● Reduction of tail latencies is also present for mixed read / write requests

● Bandwidth penalty reduced by fast write operations


Evaluation – Comparison of I/O Schedulers

● Table shows 99.9th percentile of latency at maximum throughput

→ Throughput loss of K2 is mitigated by large block sizes
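For readers unfamiliar with tail-latency metrics, the following sketch shows one common way to compute a 99.9th percentile, the nearest-rank method. The slides do not state which percentile definition was used, so this choice is an assumption, and the latency values are synthetic, not measurements from the talk.

```python
import math
from fractions import Fraction


def percentile_nearest_rank(samples, p):
    """Return the p-th percentile (0 < p <= 100) by the nearest-rank method."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Exact arithmetic avoids floating-point surprises for values like 99.9.
    rank = math.ceil(Fraction(str(p)) * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Synthetic latencies in microseconds: 9,980 fast requests and 20 slow ones.
latencies_us = [100] * 9980 + [5000] * 20
print(percentile_nearest_rank(latencies_us, 50))    # 100
print(percentile_nearest_rank(latencies_us, 99.9))  # 5000
```

The median hides the slow requests entirely, whereas the 99.9th percentile exposes them, which is why a high percentile is the more meaningful figure for real-time workloads.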


Evaluation – Comparison of I/O Schedulers

● Table shows throughput at maximum 99.9th percentile of latency
