enlightening the i/o path: a holistic approach for application performance (usenix fast'17)
TRANSCRIPT
![Page 1: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/1.jpg)
Enlightening the I/O Path:A Holistic Approach for Application Performance
Sangwook Kim13, Hwanju Kim2, Joonwon Lee3, and Jinkyu Jeong3
Apposha1
Dell EMC2
Sungkyunkwan University3
![Page 2: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/2.jpg)
Data-Intensive Applications
2
Relational
Key-value
SearchColumn
Document
![Page 3: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/3.jpg)
Data-Intensive Applications
• Common structure
3
Storage Device
Operating System
T1
Client
T2
I/O
T3 T4
Request Response
I/O I/O I/O
Application
Application
performance
* Example: MongoDB
- Client (foreground)
- Checkpointer
- Log writer
- Eviction worker
- …
![Page 4: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/4.jpg)
Data-Intensive Applications
• Common structure
4
Storage Device
Operating System
T1
Client
T2
I/O
T3 T4
Request Response
I/O I/O I/O
Application
Application
performance
* Example: MongoDB
- Server (client)
- Checkpointer
- Log writer
- Evict worker
- …
Background tasks are problematic
for application performance
![Page 5: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/5.jpg)
Application Impact
5
• Illustrative experiment
• YCSB update-heavy workload against MongoDB
![Page 6: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/6.jpg)
Application Impact
6
• Illustrative experiment
• YCSB update-heavy workload against MongoDB
0
5000
10000
15000
20000
25000
30000
0 200 400 600 800 1000 1200 1400 1600 1800
Opera
tion thro
ughput
(ops/
sec)
Elapsed time (sec)
CFQ
Regular checkpoint task
30 seconds latency at 99.99th percentile
![Page 7: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/7.jpg)
0
5000
10000
15000
20000
25000
30000
0 200 400 600 800 1000 1200 1400 1600 1800
Opera
tion thro
ughput
(ops/
sec)
Elapsed time (sec)
CFQ CFQ-IDLE
Application Impact
7
• Illustrative experiment
• YCSB update-heavy workload against MongoDB
I/O priority does not help
![Page 8: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/8.jpg)
0
5000
10000
15000
20000
25000
30000
0 200 400 600 800 1000 1200 1400 1600 1800
Opera
tion thro
ughput
(ops/
sec)
Elapsed time (sec)
CFQ CFQ-IDLE SPLIT-A SPLIT-D QASIO
Application Impact
8
• Illustrative experiment
• YCSB update-heavy workload against MongoDB
State-of-the-art schedulers do not help much
![Page 9: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/9.jpg)
What’s the Problem?
• Independent policies in multiple layers
• Each layer processes I/Os w/ limited information
• I/O priority inversion
• Background I/Os can arbitrarily delay foreground tasks
9
![Page 10: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/10.jpg)
What’s the Problem?
• Independent policies in multiple layers
• Each layer processes I/Os w/ limited information
• I/O priority inversion
• Background I/Os can arbitrarily delay foreground tasks
10
![Page 11: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/11.jpg)
Multiple Independent Layers
• Independent I/O processing
11
Storage Device
Caching Layer
Application
File System Layer
Block LayerAb
str
actio
n
![Page 12: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/12.jpg)
Multiple Independent Layers
12
Storage Device
Caching Layer
Application
File System Layer
Block Layer
Buffer Cache
read() write()admission
control
Ab
str
actio
n
• Independent I/O processing
![Page 13: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/13.jpg)
Multiple Independent Layers
• Independent I/O processing
13
Storage Device
Caching Layer
Application
File System Layer
Block Layer
Buffer Cache
read() write()
Block-level Q
admission control
Ab
str
actio
n
![Page 14: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/14.jpg)
Multiple Independent Layers
• Independent I/O processing
14
Storage Device
Caching Layer
Application
File System Layer
Block LayerAb
str
actio
n
Buffer Cache
read() write()
reorder
FG FG BGBG
![Page 15: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/15.jpg)
Multiple Independent Layers
• Independent I/O processing
15
Storage Device
Caching Layer
Application
File System Layer
Block LayerAb
str
actio
n
Buffer Cache
read() write()
FG FGBG
Device-internal Q
admission control
![Page 16: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/16.jpg)
Multiple Independent Layers
• Independent I/O processing
16
Storage Device
Caching Layer
Application
File System Layer
Block LayerAb
str
actio
n
Buffer Cache
read() write()
FG FGBG
BG FG BGBG
reorder
![Page 17: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/17.jpg)
What’s the Problem?
• Independent policies in multiple layers
• Each layer processes I/Os w/ limited information
• I/O priority inversion
• Background I/Os can arbitrarily delay foreground tasks
17
![Page 18: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/18.jpg)
I/O Priority Inversion
• Task dependency
18
Storage Device
Caching Layer
Application
File System Layer
Block Layer
Locks
Condition variables
![Page 19: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/19.jpg)
I/O Priority Inversion
• Task dependency
19
Storage Device
Caching Layer
Application
File System Layer
Block Layer
Condition variables
I/OFGlock
BGwait
![Page 20: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/20.jpg)
I/O Priority Inversion
• Task dependency
20
Storage Device
Caching Layer
Application
File System Layer
Block Layer
I/OFGlock
BGwait
FGwait
wait
BGvarwake
![Page 21: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/21.jpg)
I/O Priority Inversion
• Task dependency
21
Storage Device
Caching Layer
Application
File System Layer
Block Layer
I/O
FGwait
wait
BGuser var
wake
FGwait
![Page 22: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/22.jpg)
I/O Priority Inversion
• I/O dependency
22
Storage Device
Caching Layer
Application
File System Layer
Block Layer
Outstanding I/Os
![Page 23: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/23.jpg)
I/O Priority Inversion
• I/O dependency
23
Storage Device
Caching Layer
Application
File System Layer
Block Layer
I/OFGwait
For ensuring consistency and/or durability
![Page 24: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/24.jpg)
24
100 ms latency at 99.99th percentile
0
5000
10000
15000
20000
25000
30000
35000
0 200 400 600 800 1000 1200 1400 1600 1800
Opera
tion thro
ughput
(ops/
sec)
Elapsed time (sec)
CFQ CFQ-IDLE SPLIT-A SPLIT-D QASIO RCP
• Request-centric I/O prioritization (RCP)
• Critical I/O: I/O in the critical path of request handling
• Policy: holistically prioritizes critical I/Os along the I/O path
Our Approach
![Page 25: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/25.jpg)
Challenges
• How to accurately identify I/O criticality
• How to effectively enforce I/O criticality
25
![Page 26: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/26.jpg)
Critical I/O Detection
• Enlightenment API
• Interface for tagging foreground tasks
• I/O priority inheritance
• Handling task dependency
• Handling I/O dependency
26
![Page 27: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/27.jpg)
I/O Priority Inheritance
• Handling task dependency
• Locks
• Condition variables
27
FGlock
BG I/OFG
inherit
BG
submit
complete
FG BG
unlock
FGwait
BG
register
BG
inherit
FG BGI/O
submit
complete
wake
CV CV CV
![Page 28: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/28.jpg)
I/O Priority Inheritance
• Handling I/O dependency
28
![Page 29: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/29.jpg)
I/O Priority Inheritance
• Handling I/O dependency
29
Block Layer
I/OI/O
![Page 30: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/30.jpg)
I/O Priority Inheritance
• Handling I/O dependency
30
Block Layer
Q admission stage
I/OI/O
Sched queueing stage
![Page 31: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/31.jpg)
I/O Priority Inheritance
• Handling I/O dependency
31
Block Layer
PER-DEV ROOT
NCIO NCIO
NCIO NCIO
Q admission stage
I/OI/O
Sched queueing stage
Non-critical I/O tracking
Descriptor
Location
Resolver
Sector #
![Page 32: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/32.jpg)
I/O Priority Inheritance
• Handling I/O dependency
32
Block Layer
Descriptor Location Resolver Sector #
allocate
Q admission stage
I/O
I/O
I/O
Sched queueing stage
Non-critical I/O tracking
PER-DEV ROOT
NCIO NCIO
NCIO NCIO
![Page 33: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/33.jpg)
I/O Priority Inheritance
• Handling I/O dependency
33
Block Layer
Q admission stage
I/O
I/O
I/O
Sched queueing stage
Non-critical I/O tracking
update
Descriptor Location Resolver Sector #
PER-DEV ROOT
NCIO NCIO
NCIO NCIO
![Page 34: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/34.jpg)
I/O Priority Inheritance
• Handling I/O dependency
34
Block Layer
Q admission stage
I/O
I/O
Sched queueing stage
I/O
Non-critical I/O tracking
update Descriptor Location Resolver Sector #
PER-DEV ROOT
NCIO NCIO
NCIO NCIO
![Page 35: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/35.jpg)
I/O Priority Inheritance
• Handling I/O dependency
35
Block Layer
Q admission stage
I/O
I/O
Sched queueing stage
I/O
Non-critical I/O tracking
updateDescriptor Location Resolver Sector #
PER-DEV ROOT
NCIO NCIO
NCIO NCIO
![Page 36: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/36.jpg)
I/O Priority Inheritance
• Handling I/O dependency
36
Block Layer
Q admission stage
I/O
I/O
Sched queueing stage
I/O
Non-critical I/O tracking
update
Descriptor Location Resolver Sector #
PER-DEV ROOT
NCIO NCIO
NCIO NCIO
![Page 37: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/37.jpg)
I/O Priority Inheritance
• Handling I/O dependency
37
Block Layer
Q admission stage
I/O
I/O
Sched queueing stage
I/O
Non-critical I/O tracking
Descriptor Location Resolver Sector #
PER-DEV ROOT
NCIO NCIO
NCIO NCIO
delete on completion
![Page 38: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/38.jpg)
Handling Transitive Dependency
• Possible states of dependent task
38
FG
inherit
BG BG
Blocked on task
I/OFG
inherit
BG
wait wait
Blocked on I/O
FG
inherit
BG
wait
Blocked at admission stage
![Page 39: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/39.jpg)
Handling Transitive Dependency
• Recording blocking status
39
FG
inherit
BG BG
Blocked on task
I/OFG
inherit
BG
wait wait
Blocked on I/O
FG
inherit
BG
wait
Blocked at admission stage
![Page 40: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/40.jpg)
Handling Transitive Dependency
• Recording blocking status
40
FG
inherit
BG BG
Task is recorded
I/OFG
inherit
BG
wait
Blocked on I/O
FG
inherit
BG
wait
Blocked at admission stage
inherit
![Page 41: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/41.jpg)
Handling Transitive Dependency
• Recording blocking status
41
FG
inherit
BG BG I/OFG
inherit
BG
reprio
I/O is recorded
FG
inherit
BG
wait
Blocked at admission stage
Task is recorded
inherit
![Page 42: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/42.jpg)
Handling Transitive Dependency
• Recording blocking status
42
FG
inherit
BG BG I/OFG
inherit
BG FG
inherit
BG
retryreprio
I/O is recorded
Task is recorded
inherit
![Page 43: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/43.jpg)
Challenges
• How to accurately identify I/O criticality
• Enlightenment API
• I/O priority inheritance
• Recording blocking status
• How to effectively enforce I/O criticality
43
![Page 44: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/44.jpg)
Criticality-Aware I/O Prioritization
• Caching layer
• Apply low dirty ratio for non-critical writes (1% by default)
• Block layer
• Isolate allocation of block queue slots
• Maintain 2 FIFO queues
• Schedule critical I/O first
• Limit # of outstanding non-critical I/Os (1 by default)
• Support queue promotion to resolve I/O dependency
44
![Page 45: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/45.jpg)
Evaluation
• Implementation on Linux 3.13 w/ ext4
• Application studies
• PostgreSQL relational database
• Backend processes as foreground tasks
• I/O priority inheritance on LWLocks (semop)
• MongoDB document store
• Client threads as foreground tasks
• I/O priority inheritance on Pthread mutex and condition vars (futex)
• Redis key-value store
• Master process as foreground task
45
![Page 46: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/46.jpg)
Evaluation
• Experimental setup
• 2 Dell PowerEdge R530 (server & client)
• 1TB Micron MX200 SSD
• I/O prioritization schemes
• CFQ (default), CFQ-IDLE
• SPLIT-A (priority), SPLIT-D (deadline) [SOSP’15]
• QASIO [FAST’15]
• RCP
46
![Page 47: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/47.jpg)
Application Throughput
• PostgreSQL w/ TPC-C workload
47
0
1000
2000
3000
4000
5000
6000
7000
10GB dataset 60GB dataset 200GB dataset
Transa
ctio
n thro
ughput
(trx
/sec)
CFQ CFQ-IDLE
![Page 48: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/48.jpg)
Application Throughput
• PostgreSQL w/ TPC-C workload
48
0
1000
2000
3000
4000
5000
6000
7000
10GB dataset 60GB dataset 200GB dataset
Transa
ctio
n thro
ughput
(trx
/sec)
CFQ CFQ-IDLE SPLIT-A
![Page 49: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/49.jpg)
Application Throughput
• PostgreSQL w/ TPC-C workload
49
0
1000
2000
3000
4000
5000
6000
7000
10GB dataset 60GB dataset 200GB dataset
Transa
ctio
n thro
ughput
(trx
/sec)
CFQ CFQ-IDLE SPLIT-A SPLIT-D
![Page 50: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/50.jpg)
Application Throughput
• PostgreSQL w/ TPC-C workload
50
0
1000
2000
3000
4000
5000
6000
7000
10GB dataset 60GB dataset 200GB dataset
Transa
ctio
n thro
ughput
(trx
/sec)
CFQ CFQ-IDLE SPLIT-A SPLIT-D QASIO
![Page 51: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/51.jpg)
Application Throughput
• PostgreSQL w/ TPC-C workload
51
0
1000
2000
3000
4000
5000
6000
7000
10GB dataset 60GB dataset 200GB dataset
Transa
ctio
n thro
ughput
(trx
/sec)
CFQ CFQ-IDLE SPLIT-A SPLIT-D QASIO RCP
37%
31%28%
![Page 52: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/52.jpg)
Application Throughput
• Impact on background task
52
0
5
10
15
20
25
30
35
0 200 400 600 800 1000 1200 1400 1600 1800
Transa
ctio
n log s
ize (GB)
Elapsed time (sec)
CFQ CFQ-IDLE QASIO
![Page 53: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/53.jpg)
Application Throughput
• Impact on background task
53
0
5
10
15
20
25
30
35
0 200 400 600 800 1000 1200 1400 1600 1800
Transa
ctio
n log s
ize (GB)
Elapsed time (sec)
CFQ CFQ-IDLE SPLIT-A SPLIT-D QASIO
![Page 54: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/54.jpg)
Application Throughput
• Impact on background task
54
Our scheme improvesapplication throughput w/o penalizing background tasks
0
5
10
15
20
25
30
35
0 200 400 600 800 1000 1200 1400 1600 1800
Transa
ctio
n log s
ize (GB)
Elapsed time (sec)
CFQ CFQ-IDLE SPLIT-A SPLIT-D QASIO RCP
![Page 55: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/55.jpg)
Application Latency
• PostgreSQL w/ TPC-C workload
55
1.E-05
1.E-04
1.E-03
1.E-02
1.E-01
1.E+00
0 1000 2000 3000 4000 5000 6000
CCD
F P[X
>=x]
Transaction latency (msec)
CFQ CFQ-IDLE SPLIT-A SPLIT-D QASIO
100
10-1
10-2
10-3
10-4
10-5
Over 2 sec at 99.9th
0th
90th
99th
99.9th
99.99th
99.999th
![Page 56: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/56.jpg)
Application Latency
• PostgreSQL w/ TPC-C workload
56
1.E-05
1.E-04
1.E-03
1.E-02
1.E-01
1.E+00
0 1000 2000 3000 4000 5000 6000
CCD
F P[X
>=x]
Transaction latency (msec)
CFQ CFQ-IDLE SPLIT-A SPLIT-D QASIO RCP
100
10-1
10-2
10-3
10-4
10-5
300 msecat 99.999th
Our scheme is effective for improving tail latency
Over 2 sec at 99.9th
100
10-1
10-2
10-3
10-4
10-5
0th
90th
99th
99.9th
99.99th
99.999th
![Page 57: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/57.jpg)
Summary of Other Results
• Performance results
• MongoDB: 12%-201% throughput, 5x-20x latency at 99.9th
• Redis: 7%-49% throughput, 2x-20x latency at 99.9th
• Analysis results
• System latency analysis using LatencyTOP
• System throughput vs. Application latency
• Need for holistic approach
57
![Page 58: Enlightening the I/O Path: A Holistic Approach for Application Performance (USENIX FAST'17)](https://reader030.vdocuments.net/reader030/viewer/2022020119/58cea9341a28abb26e8b6a6d/html5/thumbnails/58.jpg)
Conclusions
• Key observation• All the layers in the I/O path should be considered as a whole with I/O
priority inversion in mind for effective I/O prioritization
• Request-centric I/O prioritization• Enlightens the I/O path solely for application performance
• Improves throughput and latency of real applications
• Ongoing work• Practicalizing implementation
• Applying RCP to database cluster with multiple replicas
58