park-
TRANSCRIPT
8/10/2019 Park-
http://slidepdf.com/reader/full/park- 1/22
FIOS: A Fair, Efficient Flash I/O Scheduler
Stan Park Kai Shen
University of Rochester
1/21
8/10/2019 Park-
http://slidepdf.com/reader/full/park- 2/22
Background
Flash is widely available as mass storage, e.g. SSD
$/GB still dropping, affordable high-performance I/ODeployed in data centers as well low-power platforms
Adoption continues to grow but very little work in robust I/Oscheduling for Flash I/O
Synchronous writes are still a major factor in I/O bottlenecks
2/21
8/10/2019 Park-
http://slidepdf.com/reader/full/park- 3/22
Flash: Characteristics & Challenges
No seek latency, low latency variance
I/O granularity: Flash page, 2–8KB
Large erase granularity: Flash block, 64–256 pagesArchitecture parallelism
Erase-before-write limitation
I/O asymmetryWide variation in performance across vendors!
3/21
8/10/2019 Park-
http://slidepdf.com/reader/full/park- 4/22
Motivation
Disk is slow → scheduling has largely beenperformance-oriented
Flash scheduling for high performance ALONE is easy (justuse noop)
Now fairness can be a first-class concern
Fairness must account for unique Flash characteristics
4/21
8/10/2019 Park-
http://slidepdf.com/reader/full/park- 5/22
Motivation: Prior Schedulers
Fairness-oriented Schedulers: Linux CFQ, SFQ(D), Argon
Lack Flash-awareness and appropriate anticipation support
Linux CFQ, SFQ(D): fail to recognize the need for anticipationArgon: overly aggressive anticipation support
Flash I/O Scheduling: write bundling, write block preferential, andpage-aligned request merging/splitting
Limited applicability to modern SSDs, performance-oriented
5/21
8/10/2019 Park-
http://slidepdf.com/reader/full/park- 6/22
Motivation: Read-Write Interference
0 1 2
P r o b a b i l i t y d e n s i t y
I/O response time (in msecs)
Intel SSD read (alone)
0 0.2 0.4 0.6
P r o b a b i l i t y d e n s i t y
I/O response time (in msecs)
Vertex SSD read (alone)
0 100 200 300
P r o b a b i l i t y d e n s i t y
← all respond quickly
I/O response time (in msecs)
CompactFlash read (alone)
6/21
8/10/2019 Park-
http://slidepdf.com/reader/full/park- 9/22
Motivation: I/O Anticipation Support
Reduces potential seek cost for mechanical disks
...but largely negative performance effect on FlashFlash has no seek latency: no need for anticipation?
No anticipation can result in unfairness: premature servicechange, read-write interference
8/21
8/10/2019 Park-
http://slidepdf.com/reader/full/park- 10/22
Motivation: I/O Anticipation Support
0
1
2
3
4
I / O s
l o w
d o w n r a t i o
L i n u x C F Q , n o a n t i c .
S F Q ( D ) , n o a n t i c .
F u l l − q u a n t u m a n t i c .
Read slowdown
Write slowdown
Lack of anticipation can lead to unfairness; aggressive anticipationmakes fairness costly.
9/21
8/10/2019 Park-
http://slidepdf.com/reader/full/park- 11/22
FIOS: Policy
Fair timeslice management: Basis of fairness
Read-write interference management: Account for Flash I/Oasymmetry
I/O parallelism: Recognize and exploit SSD internalparallelism while maintaining fairness
I/O anticipation: Prevent disruption to fairness mechanisms
10/21
8/10/2019 Park-
http://slidepdf.com/reader/full/park- 12/22
FIOS: Timeslice Management
Equal timeslices: amount of time to access device
Non-contiguous usageMultiple tasks can be serviced simultaneously
Collection of timeslices = epoch; Epoch ends when:
No task with a remaining timeslice issues a request, orNo task has a remaining timeslice
11/21
8/10/2019 Park-
http://slidepdf.com/reader/full/park- 13/22
FIOS: Interference Management
0 1 2
P r o b a b i l i t y d e n s i t y
I/O response time (in msecs)
Intel SSD read (with write)
0 0.2 0.4 0.6
P r o b a b i l i t y d e n s i t y
I/O response time (in msecs)
Vertex SSD read (with write)
0 100 200 300
P r o b a b i l i t y d e n s i t y
I/O response time (in msecs)
CompactFlash read (with write)
Reads are faster than writes → interference penalizes readsmore
Preference for servicing reads
Delay writes until reads complete
12/21
8/10/2019 Park-
http://slidepdf.com/reader/full/park- 14/22
FIOS: I/O Parallelism
SSDs utilize multiple independent channels
Exploit internal parallelism when possible, minding timesliceand interference management
Parallel cost accounting: New problem in Flash schedulingLinear cost model, using time to service a given request sizeProbabilistic fair sharing: Share perceived device time usageamong concurrent users/tasks
Cost = T elapsed P issuance
T elapsed is the requests elapsed time from its issuance to itscompletion, and P issuance is the number of outstanding requests(including the new request) at the issuance time
13/21
8/10/2019 Park-
http://slidepdf.com/reader/full/park- 15/22
FIOS: I/O Anticipation - When to anticipate?
Anticipatory I/O originally used for improving performance on diskto handle deceptive idleness: wait for a desirable request.Anticipatory I/O on Flash used to preserve fairness.
Deceptive idleness may break:
timeslice management
interference management
14/21
8/10/2019 Park-
http://slidepdf.com/reader/full/park- 16/22
FIOS: I/O Anticipation - How long to anticipate?
Must be much shorter than the typical idle period for disks
Relative anticipation cost is bounded by α
, where idle period isT service ∗
α
1−αwhere T service is per-task exponentially-weighted
moving average of per-request service time(Default α = 0.5)
ex. I/O → anticipation → I/O → anticipation → I/O → · · ·
15/21
8/10/2019 Park-
http://slidepdf.com/reader/full/park- 17/22
Implementation Issues
Linux coarse tick timer → High resolution timer for I/Oanticipation
Inconsistent synchronous write handling across file system andI/O layers
ext4 nanosecond timestamps lead to excessive metadataupdates for write-intensive applications
16/21
8/10/2019 Park-
http://slidepdf.com/reader/full/park- 19/22
Results: Beyond Read-Write Fairness
0
8
16
proportionalslowdown ←
I / O s
l o w d o w n r
a t i o
4−reader 4−writer on Vertex SSD
R a w d e v i c e I / O
L i n u x C F Q S F Q ( D )
Q u a n t a F I O S
Mean read latency
Mean write latency
0
2
4
6
8
proportionalslowdown
← I/O
slowdownr
atio
4KB−reader and 128KB−reader on Vertex SSD
R a w d e v i c e I / O
L i n u x C F Q
S F Q ( D ) Q u a n t a
F I O S
Mean latency 4KB read
Mean latency 128KB read
FIOS achieves fairness not only with read-write asymmetry butalso requests of varying cost.
18/21
8/10/2019 Park-
http://slidepdf.com/reader/full/park- 20/22
Results: SPECweb co-run TPC-C
0
2
4
6
8
10
12
R e s p o n s e t i m e s l o w d o w n r a t i o
SPECweb and TPC−C on Intel SSD
28
R a w d e v i c e I / O
L i n u x C F Q
S F Q ( D ) Q u a n t a
F I O S
SPECweb
TPC−C
FIOS exhibits the best fairness compared to the alternatives.
19/21
8/10/2019 Park-
http://slidepdf.com/reader/full/park- 21/22
Results: FAWNDS (CMU, SOSP’09) on CompactFlash
0
1
2
3
T a s k s l o w d o w n r a t i o
← proportional slowdown
R a w d e v i c e I / O
L i n u x C F Q
S F Q ( D )
Q u a n t a
F I O S
FAWNDS hash gets FAWNDS hash puts
FIOS also applies to low-power Flash and provides efficient fairness.
20/21
8/10/2019 Park-
http://slidepdf.com/reader/full/park- 22/22
Conclusion
Fairness and efficiency in Flash I/O scheduling
Fairness is a primary concernNew challenge for fairness AND high efficiency (parallelism)
I/O anticipation is ALSO important for fairness
I/O scheduler support must be robust in the face of variedperformance and evolving hardware
Read/write fairness and BEYOND
May support other resource principals (VMs in cloud).
21/21