StorScore: Microsoft’s System for SSD Qualification
Dr. Laura Caulfield, Mark Santaniello, Dr. Bikash Sharma
Cloud Server Infrastructure Engineering (CSI)
Who are we?
System Development (Web Search, Hosted Cloud, etc.)
SSD Vendors
Unique Needs & Opportunities
• Microsoft’s platform
• Workloads: Variety and Quantity
• Flexibility to modify stack
• Iterate on designs with vendors
• Wide variety of expertise
• Additional metrics
Many Resources & Concepts
• Workload Generators
• Matrix of Workloads
• Workload-Dependent Preconditioning (SNIA)
• Workload-Independent Preconditioning (SNIA)
• Independent Threads (“Workers”)
• Performance Monitors
• Spreadsheets, Pivot Tables, Statistics
What is StorScore?
StorScore is a script wrapper that automates industry-wide best practices for SSD performance testing, existing tools that are under active development for Windows, and modern tools and techniques for data analysis.
StorScore runs in three phases:
• Inputs & Initialization
• Executing Each Workload
• Final Analysis
These phases draw on: Matrix of Workloads, Workload-Independent Preconditioning (SNIA), Workload-Dependent Preconditioning (SNIA), Workload Generators, Independent Threads (“Workers”), Performance Monitors, Spreadsheets, Pivot Tables, and Statistics.
• Automation == Minimal Engineering Time
• Scripted == Quick & Easy to Modify
Outline
• Recipes: Defining the Test Suite
• Scores: Managing the Output
• Endurance: Quantifying the Consumable
Recipes: Defining the Test Suite
A Single Test
• The entire contents of single.rcp
• Reference the file from the command line:
  $> StorScore --recipe=single.rcp
• Reads like English
A Matrix of Tests
• Mimics Test designer’s whiteboard sketch
• “include” statements combine test files
• Full functionality of Perl
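A matrix of tests can be pictured as the cross product of a few workload parameters. The sketch below is written in Python, not in StorScore's Perl recipe syntax, which this transcript does not show. The field names and the `build_matrix` helper are hypothetical, and the parameter values are borrowed from the axes of the endurance charts later in this deck.

```python
from itertools import product

# Parameter values taken from the chart axes later in this deck.
ACCESS_PATTERNS = ["random", "sequential"]
WRITE_SIZES_KB = [4, 8, 16, 64, 1024, 2048]
QUEUE_DEPTHS = [1, 2, 4, 8, 16, 32]


def build_matrix(patterns, sizes_kb, queue_depths, write_percentage=100):
    """Return one dict per workload: the cross product of all parameters.

    The dictionary keys are hypothetical names chosen for this sketch.
    """
    return [
        {
            "access_pattern": pattern,
            "block_size_kb": size,
            "queue_depth": qd,
            "write_percentage": write_percentage,
        }
        for pattern, size, qd in product(patterns, sizes_kb, queue_depths)
    ]


matrix = build_matrix(ACCESS_PATTERNS, WRITE_SIZES_KB, QUEUE_DEPTHS)
print(len(matrix))  # 2 patterns x 6 sizes x 6 queue depths = 72 workloads
```

Adding one more value to any axis grows the whole suite at once, which is presumably why a matrix-style definition stays close to the whiteboard sketch.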
Scores: Managing the Output
Results Parser
• Raw Output Files → One Excel File (24 SSDs x 218 Workloads = 5,232 Files)
• Detects and highlights outliers
• Generates Pivot Tables & Graphs
• Still too much data (5,232 Files x 23 Metrics = 120k Data Pts.)
| DisplayName | WriteMix | Access Size (kB) | AccessType | Queue Depth | Bandwidth (MB/s) | Average Latency (ms) |
|---|---|---|---|---|---|---|
| Device A | 100% | 16 | random | 1 | 54.32 | 1.04 |
| Device B | 100% | 16 | random | 1 | 15.05 | 0.29 |
| Device A | 30% | 16 | random | 1 | 20.01 | 1.39 |
Example Policy: Bandwidth matters a lot, latency matters a little
Device A scores 72/100
Device B scores 65/100
Putting the “Score” in StorScore
• Goal: Enable data-driven decisions throughout the company
• Reduce data to one score per drive
  • Explainable
  • Repeatable
  • Representative
• Method: a weighted average of all the metrics for each workload
Step 1: Convert each value to z-score

| DisplayName | WriteMix | Access Size (kB) | AccessType | Queue Depth | Bandwidth (MB/s) | Average Latency (ms) |
|---|---|---|---|---|---|---|
| Device A | 100% | 16 | random | 1 | 54.32 | 1.04 |
| Device B | 100% | 16 | random | 1 | 15.05 | 0.29 |
| Device A | 30% | 16 | random | 1 | 20.01 | 1.39 |

Each measured value is replaced by its z-score:

| DisplayName | WriteMix | Bandwidth (MB/s) | Average Latency (ms) |
|---|---|---|---|
| Device A | 100% | Z_AX0 | Z_AX1 |
| Device B | 100% | Z_BX0 | Z_BX1 |
| Device A | 30% | Z_AY0 | Z_AY1 |
Calculating Each Z-Score
• One z-score for each data point
• Positive = better than average
• Negative = worse than average
• Based on cohort of drives
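Drive: A
Wkld: X (4k, rand, QD = 1, 100% writes)
Metric: 0 (Read Latency)
[Figure: distribution of metric 0 for workload X across all drives, with the Good and Bad ends of the axis labeled; Drive A's position in the distribution is its z-score, zAX0.]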
A z-score (or standard score) is the number of standard deviations from the mean.
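As a rough illustration of this step, the sketch below computes z-scores for one workload and one metric across a cohort of drives, using the two 100%-write rows from the results table. It assumes a population standard deviation and assumes that the sign is flipped for metrics where lower is better (such as latency), so that positive always means better than average; the slides do not state either detail. The function name `z_scores` is hypothetical.

```python
from statistics import mean, pstdev


def z_scores(values_by_drive, higher_is_better=True):
    """Return {drive: z-score} for one workload and one metric.

    Assumption: population standard deviation over the cohort.
    Assumption: sign flipped when lower raw values are better.
    """
    values = list(values_by_drive.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Every drive measured the same value, so none is better or worse.
        return {drive: 0.0 for drive in values_by_drive}
    sign = 1.0 if higher_is_better else -1.0
    return {
        drive: sign * (value - mu) / sigma
        for drive, value in values_by_drive.items()
    }


# Values from the results table: 16 kB random, QD 1, 100% writes.
bandwidth_mb_s = {"Device A": 54.32, "Device B": 15.05}
latency_ms = {"Device A": 1.04, "Device B": 0.29}

z_bandwidth = z_scores(bandwidth_mb_s, higher_is_better=True)
z_latency = z_scores(latency_ms, higher_is_better=False)
print(z_bandwidth)
print(z_latency)
```

With only two drives in the cohort, each z-score comes out at roughly plus or minus one: Device A is ahead on bandwidth and behind on latency. A real cohort would have many more drives, which is what makes the distribution meaningful.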
Calculating the Weighted Average
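Notation: $z_{A,(n+m),i}$ covers Drive A, workloads 0 to (n+m), and metrics 0 to i.
General Policy (Score for Drive A):
$$50\% \times \underbrace{z_{A,(n+m),i}}_{\text{Throughput Metrics}} + 50\% \times \underbrace{z_{A,(n+m),j}}_{\text{Latency Metrics}} = 70 / 100$$
Policy to Favor Mixed Workloads (Score for Drive A):
$$5 \times \underbrace{z_{A,n,(i+j)}}_{\text{70/30 Read/Write Mix Workloads}} + 1 \times \underbrace{z_{A,m,(i+j)}}_{\text{100\% Read \& 100\% Write Workloads}} = 65 / 100$$
• Can apply multiple policies at once
• Can use any kind of weight system (stay consistent within single policy)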
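The two policies above can be sketched as weighted means over one drive's z-scores. In the Python sketch below, the z-score values are made up purely for illustration, and `policy_score` and the workload and metric names are hypothetical. The slides do not say how the weighted result is mapped onto the 0 to 100 scale, so the sketch stops at the weighted mean.

```python
def policy_score(z_by_key, weight_by_key):
    """Weighted mean of z-scores; keys are (workload, metric) pairs."""
    total_weight = sum(weight_by_key[key] for key in z_by_key)
    if total_weight == 0:
        raise ValueError("policy assigns zero total weight")
    weighted_sum = sum(weight_by_key[key] * z for key, z in z_by_key.items())
    return weighted_sum / total_weight


# Hypothetical z-scores for one drive (illustration only, not measured data).
z_drive_a = {
    ("mix_70_30", "bandwidth"): 0.8,
    ("mix_70_30", "latency"): -0.2,
    ("write_100", "bandwidth"): 1.0,
    ("write_100", "latency"): -1.0,
}

# General policy: throughput and latency metrics weighted equally.
general = {key: 0.5 for key in z_drive_a}

# Policy to favor mixed workloads: 5x for the 70/30 mix, 1x for the rest.
favor_mixed = {
    key: (5 if key[0] == "mix_70_30" else 1) for key in z_drive_a
}

general_score = policy_score(z_drive_a, general)
mixed_score = policy_score(z_drive_a, favor_mixed)
print(general_score, mixed_score)
```

Because each policy divides by its own total weight, the absolute size of the weights does not matter, only their ratios within one policy. That matches the slide's advice to stay consistent within a single policy.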
Scores
[Figure: bar chart of Score (0 to 100) by Device, with devices in the order A, K, M, L, I, H, J, F, D, G, B, C, E, N.]
A is best-in-class
E & N are stragglers
If you’ve been using L, A-I will be comparable or better
H is so close, and they’ve got a great price. How do we tweak the drive or application to make it work?
Scores’ Breakdown
[Figure: bar chart of Score (0 to 100) by Device (A, K, M, L, I, H, J, F, D, G, B, C, E, N), with each score broken down into Random Writes, Random Mix, Random Reads, Sequential Writes, Sequential Mix, and Sequential Reads.]
H is so close, and they’ve got a great price. How do we tweak the drive or application to make it work?
Answer: Drive should improve random mix (not seq. mix), or app should favor sequential mix (not random mix)
Endurance: Quantifying the Consumable
SSD Failure Mechanism: Writes
• Drive Writes Per Day (DWPD)
• Total Bytes Written (TBW)
• Drive Writes (DW)
• Program / Erase Cycles (P/E Cycles, or PEC)
• Write / Erase Cycles (W/E Cycles)
Write Amplification Factor: Workload Dependent, Vendor Reported, Implementation Specific
[Figure: Host Writes enter the SSD; inside it, the Controller issues Controller Writes to the NAND.]
$$\frac{\text{Host Writes}}{\text{Controller Writes}} \times \text{P/E Cycles} = \text{Total Drive Writes}$$
Previously Available
• SMART “Media Wear Indicator”
• Reported in units of 1% (300 TB for a 30k P/E cycle, 1 TB drive)
• 4.7 months for 1 workload
New Telemetry
• SMART “Controller Writes”
• Reported in units of sectors or GB
• 1,700 workloads in 4.7 months
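The relationship on this slide can be worked through numerically. The Python sketch below applies Total Drive Writes = (Host Writes / Controller Writes) x P/E Cycles and then spreads the result over a three-year life to get DWPD. The function names are hypothetical, the 365-day year is an assumption, and the write amplification of 2 in the example is an invented input, not a measurement from the talk.

```python
def total_drive_writes(pe_cycles, host_writes, controller_writes):
    """Total Drive Writes = (Host Writes / Controller Writes) x P/E Cycles.

    host_writes and controller_writes must use the same unit
    (for example sectors or GB); only their ratio matters.
    """
    if controller_writes <= 0:
        raise ValueError("controller_writes must be positive")
    return pe_cycles * host_writes / controller_writes


def drive_writes_per_day(pe_cycles, host_writes, controller_writes, years=3):
    """Spread the total drive writes evenly over the drive's service life."""
    days = years * 365  # assumption: 365-day years
    return total_drive_writes(pe_cycles, host_writes, controller_writes) / days


# Granularity of the older "Media Wear Indicator": 1% of a 30k P/E cycle,
# 1 TB drive corresponds to 300 TB written to the NAND.
wear_step_tb = 30_000 * 1 / 100
print(wear_step_tb)

# Hypothetical workload whose controller writes are twice its host writes.
dw = total_drive_writes(pe_cycles=30_000, host_writes=1.0, controller_writes=2.0)
dwpd = drive_writes_per_day(pe_cycles=30_000, host_writes=1.0, controller_writes=2.0)
print(dw, round(dwpd, 2))
```

Because only the ratio of host to controller writes is needed, a short run that reads the fine-grained "Controller Writes" counter before and after a workload is enough to estimate endurance for that workload, rather than waiting for the 1% wear indicator to tick.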
Endurance Results
[Figure: Drive Writes Per Day (3 years), on a 0 to 20 scale, for Devices D, E, F, H, and J, plotted by Access Pattern (random, sequential), Write Size (4, 8, 16, 64, 1024, 2048 kB), and Queue Depth (1, 2, 4, 8, 16, 32), with the Reported Range marked.]
• Not all sequential workloads achieve high endurance
• Identify problem workloads
Conclusion
• How StorScore brings together existing work & concepts
• Simplicity of defining the inputs
• Spectrum of analysis tools
  • Directly and interactively with Excel & pivot charts
  • Automated score generation
  • Burrowing down into portions of the score
• Measuring endurance on many workloads
StorScore enables a data-driven decision-making process for Microsoft cloud applications
Thanks! Questions?
You may download StorScore for free at:
http://aka.ms/storScore