Using Arm® Processing for Efficient Hyperscale Storage
Post on 19-Apr-2018
TRANSCRIPT
1
Demand for Greater Efficiency
Using Arm® Processing for Efficient Hyperscale Storage
Scott Furey, Marvell, Associate VP - Enterprise Storage Business
Storage Demand in Data Centers
2
Increased Capacity
Increased Performance
Demand for Greater Efficiency
[Chart: power density vs. performance, with latency in µs and throughput in GB/s]
Balance Performance, Power, Cost
Architectural tradeoffs on core components
Storage + Networking + Processing
3
Hyperconverged
[Diagram: hyperconverged nodes on a shared network, each combining CPUs and SSDs; adding capacity adds incremental compute + storage together]
Hyperscale (Disaggregated Compute/Storage)
5
[Diagram: compute nodes (CPUs) and storage shelves (SSDs) attach to the network independently, so incremental compute and incremental storage can be added separately]
Current Disaggregated All Flash Array Architecture
6
Substantial CPU capability
Enormous network capacity
Performance at all cost
[Diagram: dual-CPU server with PCIe-attached SSDs, 12 - 24 drives, and Nx 100 G network links]
Cost of Adding Capacity
Expensive compute/network replicated per instance
Next tiers of storage tend to have a very similar server architecture
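The cost argument above can be sketched with a toy model. All prices and function names here are hypothetical, chosen only to illustrate why replicating a full server front end per storage shelf scales worse than adding a low-cost embedded controller per shelf:

```python
# Illustrative cost model (all prices hypothetical): compare scaling
# capacity with full servers (CPU + NIC replicated per shelf) against
# embedded Arm controllers that avoid duplicating the expensive front end.

SERVER_COST = 6000      # x86 CPUs + 100G NICs + chassis, per storage shelf
CONTROLLER_COST = 800   # embedded Arm storage controller, per shelf
SSD_SHELF_COST = 9000   # 24 drives, identical in both designs

def disaggregated_afa_cost(shelves: int) -> int:
    """Every added shelf replicates the expensive server front end."""
    return shelves * (SERVER_COST + SSD_SHELF_COST)

def embedded_controller_cost(shelves: int) -> int:
    """Each shelf only adds a low-power controller; drives dominate."""
    return shelves * (CONTROLLER_COST + SSD_SHELF_COST)

for n in (4, 16, 64):
    saved = disaggregated_afa_cost(n) - embedded_controller_cost(n)
    print(f"{n:3d} shelves: save ${saved:,}")
```

Under these assumed prices the per-shelf saving is fixed, so the gap grows linearly with capacity, which is the point the slide makes about replicated compute/network cost.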
7
Can embedded storage controllers with Arm® processors better address this?
[Diagram: embedded Arm storage controllers replace the server CPUs in front of each SSD shelf, attaching storage directly to the network]
Embedded ARM Processor in Storage
8
Evolving storage architectures are removing several performance barriers
Compute + HW Acceleration = Reduced Cost / Power / Latencies
Improved Efficiencies in Storage IO Stacks
[Diagram: storage IO stacks compared. Traditional SCSI path (access time in ms): Application → VFS → OS Scheduler → Block Driver → SCSI/SATA Translation → Device Driver. NVMe SSD path (access time in µs): Application → VFS → OS Scheduler → Block Driver → Device Driver. Removing the SCSI translation layer cuts stack overhead by ~50%]
Reduced Networking Overhead with RoCEv2
10
[Diagram: NVMe-oF layering over an RDMA fabric. Host side: NVMe™ Host Software → Host Side Transport Abstraction → NVMe RDMA → RDMA Verbs → RoCE. Controller side mirrors the stack: RoCE → RDMA Verbs → NVMe RDMA → Controller Side Transport Abstraction → NVMe Controller. RoCEv2 carries far less protocol overhead than TCP]
Reduced Complexity
Adoption of Linux User Space Drivers
Eliminates Interrupts
Fast Scheduling of Threads
Lower Latency / Higher Performance
11
[Diagram: user-space storage stack. Storage applications run in user space over a runtime (VM or RTOS), with an NVMe front end, a mapper driver, and a polled SAS back end; hardware accelerators and DDR sit alongside. The kernel holds only an NVMe stub driver. Fast-path IO stays entirely in user space; the slow path (initialization, interrupts) goes through the kernel]
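The polled, interrupt-free fast path described above can be sketched in a few lines. This is a minimal illustration, not a real driver; the class and function names are hypothetical stand-ins for a completion ring mapped into user space:

```python
# Minimal sketch of the polled completion model used by user-space
# storage drivers: instead of sleeping until an interrupt, a thread
# spins on the completion queue, so completions are handled inline
# with no context switch on the fast path.

from collections import deque

class CompletionQueue:
    """Stands in for a hardware completion ring mapped into user space."""
    def __init__(self):
        self._ring = deque()

    def post(self, io_id):
        """Device side: record a finished IO."""
        self._ring.append(io_id)

    def poll(self):
        """Host side: non-blocking check, no interrupt needed."""
        return self._ring.popleft() if self._ring else None

def run_to_completion(cq, submitted):
    """Busy-poll until every submitted IO has completed."""
    done = []
    while len(done) < len(submitted):
        io_id = cq.poll()
        if io_id is not None:
            done.append(io_id)   # fast path: completion handled inline
        # slow path (initialization, errors) would go through the kernel
    return done

cq = CompletionQueue()
for i in range(3):
    cq.post(i)                   # pretend the device completed IOs 0-2
print(run_to_completion(cq, [0, 1, 2]))   # -> [0, 1, 2]
```

Real user-space frameworks dedicate a core to this loop; the win is exactly what the bullets list: no interrupts, fast thread scheduling, lower latency.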
Efficiency Through Integration/Optimization
Application Optimized Network Capability
Hardware Acceleration
Scale CPU Cores for the Application
Eliminate Additional Storage Fan-Out Devices
12
[Diagram: integrated unit combining network (Nx 25G), Arm compute, and storage (SSDs)]
Hardware Optimizations for Hyperscale Storage
13
Storage
• Configurable IO to support any storage service
• Virtualize any storage device as an NVMe namespace
Networking
• Optimized for full datapath offload of NVMe-oF
• Zero copy
Hash / Compression / Encryption / Erasure Codes
• Line rate throughput / concurrent operation
• Memory utilization reduced by 60%
Embedded Scale-Out Storage Controller
Multi-Core 64-bit Arm®
4x25G Target RDMA Ports
24 Flexible Storage Ports
Line Rate Hardware Offload
Power < 25W
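The bullet "virtualize any storage device as an NVMe namespace" amounts to address translation: the controller maps a namespace-relative LBA onto whatever backend device actually holds the block. The sketch below is a hypothetical illustration of that mapping, with invented device names and extent sizes:

```python
# Hypothetical sketch of presenting any backend (NVMe, SAS, or SATA)
# as an NVMe namespace: a mapper translates a namespace LBA into a
# (backend device, physical LBA) pair, so initiators only ever see
# NVMe namespaces regardless of what sits behind the controller.

from dataclasses import dataclass

@dataclass
class Extent:
    backend: str      # e.g. "sas0", "nvme1" (illustrative names)
    start_lba: int    # first physical LBA of this extent
    length: int       # extent size in blocks

class Namespace:
    def __init__(self, extents):
        self.extents = extents

    def translate(self, lba):
        """Map a namespace LBA to (backend device, physical LBA)."""
        offset = lba
        for ext in self.extents:
            if offset < ext.length:
                return ext.backend, ext.start_lba + offset
            offset -= ext.length
        raise ValueError("LBA out of range")

# One namespace stitched from a SAS extent and an NVMe extent.
ns1 = Namespace([Extent("sas0", 0, 1000), Extent("nvme1", 500, 1000)])
print(ns1.translate(250))    # -> ('sas0', 250)
print(ns1.translate(1200))   # -> ('nvme1', 700)
```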
14
[Block diagram: controller SoC with multi-core Arm v8 processors and cache; dual DDR3/DDR3L/DDR4 64-bit DRAM with ECC; storage accelerators (NVMe/SAS/SATA) behind 24 high-speed SERDES; network accelerators (RoCEv2/NVMe-oF) behind 4 x 25 G Ethernet; plus encryption/security, RAID/erasure code, and SHA/compression engines]
Single Controller All Flash Array
100W Fully Populated
Compact Scalable Unit
Integrate Several Arrays into a Single Chassis
15
Up to 12 direct attached M.2
Single Controller Hybrid Array
HDD Cost/Capacity with Improved Performance
Attach HDDs to Unified NVMe-oF Interconnect
16
2x4 SSD and 16 direct attached SAS
NVMe-oF JBOD/JBOF - Appliances
17
[Diagram: NVMe-oF JBOD/JBOF appliances attach directly to the network alongside compute (CPU) nodes]
Low power footprint for flexible scaling of capacity vs. performance, attached over NVMe-oF
Distributed Storage Cluster
18
[Diagram: compute (CPU) nodes connected over the network to NVMe-oF storage nodes]
Embedded CPUs enable clustering applications; HW acceleration for erasure code generation
Hierarchical / Hybrid – self replicating / self healing
[Diagram: distributed cluster mixing storage CPU nodes and storage controller nodes, each fronting its own SSDs, all joined over NVMe-oF]
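The erasure-code generation that the slide says is hardware-accelerated can be illustrated in its simplest form: single-parity XOR across data blocks, which lets a cluster rebuild any one lost block from the survivors. This is only a minimal sketch of the principle, not the controller's actual code:

```python
# Tiny erasure-code illustration: RAID-4-style XOR parity across data
# blocks allows any single lost block to be rebuilt, which is the basis
# of the self-healing behavior described above. Controllers offload
# exactly this kind of bulk XOR/Galois-field math to hardware engines.

def xor_blocks(blocks):
    """XOR byte-wise across equal-length blocks to form parity."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for i, b in enumerate(block):
            parity[i] ^= b
    return bytes(parity)

data = [b"stor", b"age!", b"node"]
parity = xor_blocks(data)

# Lose one data block, then rebuild it from the survivors + parity.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
print(rebuilt)   # -> b'age!'
```

Production systems use multi-parity codes (e.g. Reed-Solomon) for tolerance to more than one failure, but the recovery-by-recombination idea is the same.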
Conclusion
An extension of the storage controller, not another server
Optimize the attachment of storage to the server over the network
Integration ensures storage isn't disproportionately burdened by the cost/power of disaggregation hardware
Embedded compute enables Software Defined Storage, while hardware offload improves overall efficiency/performance
19