Enterprise Application of SSD
曹庆玲 qingling1220@sina
• Towards SSD-Ready Enterprise Platforms
• Building Large Storage Based On Flash Disks
Outline
• Motivation
• Platform and methodology
• Platform bottleneck analysis: platform latency, I/O processing, and performance scaling bottlenecks
• Conclusion
Motivation
• SSDs deliver an increase of 2-3 orders of magnitude in IOPS over HDDs
• Platforms have long been optimized for HDDs
• Are they ready for SSDs?
Platform and methodology
• Use Linux* as the reference OS for the experiments
• Focus on fixed-size 4KB random reads
Random reads are used to avoid the effect of I/O merging policies. If the platform is ready for reads, then it must also be ready for writes.
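As a rough illustration of the access pattern described above, the following sketch issues fixed-size 4KB reads at random block-aligned offsets. It is not the authors' benchmark: the helper names and the scratch file are hypothetical stand-ins. Reads go through the page cache here, so a real measurement would more likely target the raw device and bypass caching.

```python
import os
import random
import tempfile

BLOCK_SIZE = 4096  # fixed 4KB request size, as in the slides


def random_read_offsets(file_size, count, seed=0):
    """Return `count` block-aligned offsets chosen uniformly at random."""
    num_blocks = file_size // BLOCK_SIZE
    rng = random.Random(seed)
    return [rng.randrange(num_blocks) * BLOCK_SIZE for _ in range(count)]


def run_random_reads(path, count, seed=0):
    """Issue `count` 4KB reads at random offsets; return total bytes read."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        total = 0
        for offset in random_read_offsets(size, count, seed):
            # pread reads at an explicit offset without moving the file position
            total += len(os.pread(fd, BLOCK_SIZE, offset))
        return total
    finally:
        os.close(fd)


def make_test_file(num_blocks):
    """Create a scratch file of `num_blocks` 4KB blocks (hypothetical stand-in for a device)."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(os.urandom(num_blocks * BLOCK_SIZE))
        return f.name
```

Because every offset is an independent random block, adjacent requests are rarely contiguous, which is what keeps the I/O scheduler from merging them.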
Platform bottleneck analysis
• Platform latency bottlenecks: determine which component dominates I/O latency
• I/O processing bottlenecks: determine which software contributes the most CPU overhead for I/O processing
• Performance scaling bottlenecks: determine which component limits performance scaling
Platform bottleneck analysis
—Platform latency
Total I/O latency is the time from when the application issues an I/O to when it receives the completion. It has two components:
• Time due to media
• Time due to platform
The platform contributes only 26% of the total latency.
The media accounts for the remaining 74%, so optimizing the media is necessary.
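The 26% figure bounds how much platform tuning alone can help. A minimal sketch of that bound, using an Amdahl-style calculation (the function name is hypothetical; only the 26% share comes from the slides):

```python
def max_latency_speedup(platform_fraction, platform_reduction=1.0):
    """Upper bound on total-latency speedup when only the platform part shrinks.

    platform_fraction: share of total latency due to the platform (0.26 in the slides)
    platform_reduction: fraction of platform time removed (1.0 means all of it)
    """
    remaining = 1.0 - platform_fraction * platform_reduction
    return 1.0 / remaining


# Even if the platform's share were eliminated entirely, total latency
# would improve by a factor of 1 / 0.74, roughly 1.35.
best_case = max_latency_speedup(0.26)
```

Under this simple model, the media's 74% share sets the floor, which is why the slides point at the media rather than the platform for latency.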
Platform bottleneck analysis —I/O processing cost
[Figure: I/O processing cost; only the value 35000 survives from the original chart]
• ahci_interrupt() and ahci_scr_read() execute uncacheable (UC) reads, which cost 2,100 clocks per UC read on average.
• Device interfaces that adopt message signaled interrupts (MSI), with added intelligence to push status to drivers, can eliminate such UC reads.
• This can reduce overhead by about 8,400 clocks/IO.
• I/O processing through an MSI-based interface such as LSI's incurs 25,000 clocks/IO.
• The LSI driver's return path (5,250 clocks/IO) is still substantial. It can be reduced by employing interrupt coalescing, after which only 650 clocks remain in the driver return path, resulting in about 20,000 clocks/IO.
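The clock counts quoted in this section can be cross-checked with simple arithmetic. The sketch below uses only numbers from the slides; the constant and function names are illustrative.

```python
UC_READ_COST = 2_100        # clocks per uncacheable read
UC_OVERHEAD_PER_IO = 8_400  # clocks/IO removable by eliminating UC reads
MSI_PATH_COST = 25_000      # clocks/IO through the MSI-based interface
RETURN_PATH_BEFORE = 5_250  # driver return path, clocks/IO
RETURN_PATH_AFTER = 650     # driver return path with interrupt coalescing


def implied_uc_reads_per_io():
    """Number of UC reads per I/O implied by the two UC figures."""
    return UC_OVERHEAD_PER_IO // UC_READ_COST


def cost_with_coalescing():
    """Clocks/IO after shrinking the driver return path."""
    return MSI_PATH_COST - (RETURN_PATH_BEFORE - RETURN_PATH_AFTER)
```

The two UC figures imply four UC reads per I/O, and the coalescing saving of 4,600 clocks brings 25,000 down to 20,400, consistent with the "about 20,000 clocks/IO" stated above.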
Platform bottleneck analysis —Performance scaling
The goal is to ensure that I/O processing scales with cores and SSDs.
A single core with 3 SSDs is fully saturated, so more cores are required.
One adapter enables 177K IOPS.
Scaled up further, throughput reached 445K IOPS.
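To see why core count matters, a per-I/O clock cost can be converted into an idealized IOPS ceiling per core. This is a back-of-the-envelope sketch only: the 3 GHz clock is a hypothetical value not given in the slides, and the model assumes a core spends every cycle on I/O processing.

```python
ASSUMED_CLOCK_HZ = 3_000_000_000  # hypothetical core frequency, not from the slides


def iops_per_core(clock_hz, clocks_per_io):
    """Idealized ceiling: I/Os one core can process per second."""
    return clock_hz // clocks_per_io


def cores_needed(target_iops, clock_hz, clocks_per_io):
    """Smallest whole number of cores whose combined ceiling covers target_iops."""
    per_core = iops_per_core(clock_hz, clocks_per_io)
    return -(-target_iops // per_core)  # ceiling division on integers


# At about 20,000 clocks/IO, the assumed core tops out at 150,000 IOPS.
ceiling = iops_per_core(ASSUMED_CLOCK_HZ, 20_000)
```

Under these assumptions, `cores_needed(445_000, ASSUMED_CLOCK_HZ, 20_000)` gives 3, while at 25,000 clocks/IO the same target needs 4 cores. The point is the trend, not the exact numbers: lowering clocks/IO directly lowers the number of cores needed for a given IOPS target.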
Conclusion
• Existing platforms to be ready for SSDs.
• Scalability of file systems
• I/O behavior of real applications
• Implementation of RAID
Building Large Storage Based On Flash Disks
Outline
• Introduction
• SSD RAID configuration
• Scalability
• Solution alternatives
• Conclusion
RAID Levels
• RAID 0: the RAID controller stripes the input data stream across SSD1 to SSD5 in parallel.
• RAID 10: the RAID controller distributes the input data stream in parallel across two RAID 1 groups, Group 1 (SSD1, SSD2) and Group 2 (SSD3, SSD4). Each group pairs a work disk with a mirror disk, and the two RAID 1 groups are striped.
• RAID 5: the RAID controller spreads the input data stream across the SSDs, with parity distributed among them.
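The three RAID levels shown in these diagrams can be sketched in a few lines. This is a generic textbook illustration, not the behavior of any particular controller. The function names are hypothetical, the stripe unit is one block, and RAID 5 parity rotation across disks is left out for brevity.

```python
from functools import reduce


def raid0_locate(block, num_disks):
    """RAID 0: map a logical block to (disk index, block index on that disk)."""
    return block % num_disks, block // num_disks


def raid10_locate(block, num_groups):
    """RAID 10: stripe across mirrored pairs; both disks of the pair hold the block."""
    group, offset = block % num_groups, block // num_groups
    return (2 * group, 2 * group + 1), offset


def xor_parity(chunks):
    """RAID 5: parity is the bytewise XOR of the equal-length chunks in a stripe."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def rebuild_missing(surviving_chunks, parity):
    """Recover one lost chunk by XOR-ing the parity with the surviving chunks."""
    return xor_parity(list(surviving_chunks) + [parity])
```

For example, with five disks `raid0_locate(7, 5)` places logical block 7 on disk 2 at block 1, and a chunk lost from a RAID 5 stripe can be recovered from the remaining chunks plus the parity.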
Introduction
SSD RAID shows a performance loss.
Test setup and workload
Test setup:
• 16-core server with 64GB RAM
• 3 RAID controllers with 512MB cache
• Intel 64GB SSDs
Workloads:
• Workload light: one worker, queue depth 32
• Workload heavy: ten workers, queue depth 16
• Workload latency: single request, one worker, queue depth 1
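The three workloads differ mainly in how many requests they can keep in flight. A small sketch of that comparison follows. The class and field names are hypothetical, "32 queue" is read as queue depth 32, and each worker is assumed to keep its queue full.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    workers: int
    queue_depth: int

    @property
    def outstanding_ios(self):
        """Maximum requests in flight, assuming every worker fills its queue."""
        return self.workers * self.queue_depth


WORKLOADS = [
    Workload("light", workers=1, queue_depth=32),
    Workload("heavy", workers=10, queue_depth=16),
    Workload("latency", workers=1, queue_depth=1),
]

in_flight = {w.name: w.outstanding_ios for w in WORKLOADS}
```

Under these assumptions the heavy workload can keep up to 160 requests outstanding, against 32 for light and 1 for latency.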
SSD RAID Configurations —throughput(workload heavy)
RAID 0, 5, and 10 with 8 SSDs on a single controller
SSD RAID Configurations—throughput(workload light)
Volume = 240GB. Single-SSD data is shown for comparison.
[Figure: throughput under workload light; chart annotation: "saturate"]
Scalability
The experimental data above indicate that a bottleneck exists along the I/O chain.
Is it the RAID controller or the PCIe bus?
Even at the best throughput, PCIe bus utilization is less than 50%.
The RAID controller is therefore the bottleneck.
Two SSDs are enough to saturate the controller!
[Figures: scalability results with read-ahead, with write cache, and without write cache]
Solution alternatives
Combine hardware and software:
A. No controller: devices are connected directly, with software RAID on top.
B. Use the controller only as a simple device aggregator, with software RAID on top.
C. Use a simple RAID level on multiple RAID controllers, with software RAID on top.
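One way to picture option C is as two layers of striping: software stripes across controllers, and each controller stripes across its own SSDs. The sketch below assumes RAID 0 at both layers and a one-block stripe unit. Neither detail is stated in the slides, and the function name is hypothetical.

```python
def locate_option_c(block, num_controllers, ssds_per_controller):
    """Map a logical block to (controller, SSD on that controller, block on that SSD).

    Outer layer: software RAID 0 across controllers (assumed).
    Inner layer: hardware RAID 0 across each controller's SSDs (assumed).
    """
    controller = block % num_controllers
    block_on_controller = block // num_controllers
    ssd = block_on_controller % ssds_per_controller
    block_on_ssd = block_on_controller // ssds_per_controller
    return controller, ssd, block_on_ssd
```

With two controllers of two SSDs each, consecutive logical blocks alternate between controllers first, so a sequential stream spreads its load across controllers instead of queuing behind a single one.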
Comparing options A and B: RAID with 2 SSDs.
A second controller has a profound effect on performance.
[Figure: comparison of options B and C]
Conclusion
• Software RAID approaches
• Multiple block sizes
• RAID controllers are not designed for the characteristics of SSDs
Thank you~