Diablo Technologies - Flash Memory Summit
TRANSCRIPT
Diablo Technologies Slide 1
Diablo Technologies
Memory1™
Maher Amer
CTO Diablo Technologies - Highly Confidential
Diablo Technologies Slide 2
DRAM capacity is limited, while cost is prohibitive
THE NEED: Big Data applications need Big Memory
THE SOLUTION: Expand system memory with Memory1
Big Memory for Big Data
Diablo Technologies Slide 3
Two Tier Memory Sub-Systems are Here to Stay
Load/Store (Application Direct Access)
Block (Application Indirect Access)
CPU
Registers
Cache
DRAM
Storage
Byte-addressable with Flash Capacity and Economics
Diablo Technologies Slide 4
Memory1™ At A Glance
• DDR4 memory DIMM with up to 128 GB per module
• Economically expands application memory by terabytes
• No changes required to servers, applications
• Industry standard, JEDEC-compliant LRDIMM/RDIMM
Diablo Technologies Slide 5
How It Works: High-Performance Hardware Solution
Software Intelligently Manages Memory Access
• Acts as extension of OS Virtual Memory Manager
• Implements intelligent paging algorithms
• Optimizes performance and extends Flash endurance
Modules Plug Into DDR4 Memory Slots
• Memory bus is highest performing interface to CPU
• Over 17GB/s per memory channel @ 2133 MT/s
• Lower latency than PCIe/NVMe
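The "over 17GB/s" figure can be checked with a little arithmetic. The sketch below assumes a standard 64-bit (8-byte) data bus per DDR4 channel, which the slide does not state, and uses decimal gigabytes; the function name is illustrative only.

```python
# Peak theoretical bandwidth of one DDR4 memory channel.
# Assumption (not on the slide): 64-bit data bus, i.e. 8 bytes per transfer.
BYTES_PER_TRANSFER = 8


def peak_bandwidth_gb_s(mega_transfers_per_s: float) -> float:
    """Return peak bandwidth in GB/s, with 1 GB = 1e9 bytes."""
    bytes_per_s = mega_transfers_per_s * 1e6 * BYTES_PER_TRANSFER
    return bytes_per_s / 1e9


ddr4_2133 = peak_bandwidth_gb_s(2133)
print(f"DDR4-2133: {ddr4_2133:.3f} GB/s per channel")
```

This is a peak rate per channel; sustained throughput in a real system is lower.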
Innovative All-Flash DDR4 DIMM Hardware
• Deployed in parallel with standard DRAM
• Leverages Flash capacity, power, and cost advantages
[Diagram labels: QPI, Memory Bus]
Diablo Technologies Slide 6
Target Applications and Workloads
BIG DATA PROCESSING
Real-time analysis
Distributed caching
Predictive analytics
Caching
Paid search
Key-value lookup
CLOUD
Distributed database
In-memory database
Relational database
DATABASE
Application data doesn’t fit in one machine
Application data doesn’t fit in one machine but DRAM constrained
Application data doesn’t fit in one machine
Diablo Technologies Slide 7
Diablo Memory Expansion
Diablo Technologies Slide 8
Hardware
Combination of firmware and software
• Intelligently manages application memory access
• Leverages CPU hardware and special statistics
• Manages performance and endurance
No Application Changes
• Loads like a driver
Two Major Components
• Data Management
• Media Management
Diablo Memory Expansion (DMX) Software
Application
Operating System
DMX Software
RAM
Diablo Technologies Slide 9
DMX Software Intelligently Manages Expanded Memory
DATA MANAGEMENT: Keeps track of all application page sequences and ensures correct pages are in DRAM
MEDIA MANAGEMENT: Organizes and optimizes the flash layer for maximum performance and endurance
• DMX Software loads and operates as an OS-level driver
• Application Software requires no changes
APPLICATION SOFTWARE
DRAM
Memory1 Flash Memory (Total Available System Memory)
Diablo Technologies Slide 10
Data Management Details
Data Tiering
• DMX keeps hot data in DRAM
• High priority data maintained in DRAM
• Cold data evicted to flash
Quality of Service
• Priority Associated Data Placement
• Additional DRAM allocated per application increasing hot data in DRAM
• Keeps data with response time requirements in DRAM
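The tiering and priority ideas above can be illustrated with a toy model. This is a minimal sketch, not the DMX implementation: the class `TwoTierMemory` and its methods are hypothetical, and plain LRU stands in for whatever hot/cold policy DMX actually uses.

```python
from collections import OrderedDict


class TwoTierMemory:
    """Toy model: a small DRAM tier backed by a large flash tier.

    Hypothetical illustration only; LRU is an assumed stand-in policy.
    """

    def __init__(self, dram_pages: int):
        self.dram_pages = dram_pages
        self.dram = OrderedDict()  # page id -> data, least recently used first
        self.flash = {}            # page id -> data (cold tier)
        self.pinned = set()        # high-priority pages that stay in DRAM

    def pin(self, page):
        """Mark a page as high priority so it is never evicted."""
        self.pinned.add(page)

    def access(self, page, data=None):
        """Touch a page, promoting it to DRAM; return the tier it was found in."""
        if page in self.dram:
            tier = "dram"
            self.dram.move_to_end(page)
            if data is not None:
                self.dram[page] = data
        else:
            tier = "flash" if page in self.flash else "new"
            value = self.flash.pop(page, None)
            self.dram[page] = data if data is not None else value
            self._evict_if_needed()
        return tier

    def _evict_if_needed(self):
        while len(self.dram) > self.dram_pages:
            # Coldest page that is not pinned becomes the victim.
            victim = next((p for p in self.dram if p not in self.pinned), None)
            if victim is None:
                break  # everything is pinned; this toy model lets DRAM overflow
            self.flash[victim] = self.dram.pop(victim)


mem = TwoTierMemory(dram_pages=2)
mem.pin("index")
mem.access("index", "i")
mem.access("a", "A")
mem.access("b", "B")  # DRAM is over capacity: "a" is the coldest unpinned page
print(sorted(mem.dram), sorted(mem.flash))
```

Pinning models the "high priority data maintained in DRAM" bullet; a real system would bound how much DRAM can be pinned.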
Diablo Technologies Slide 11
Data Management Details
[Diagram labels: CPU, DRAM, NUMA Node, Amortized PF, Data Pages Clustering, Smart Prefetch]
Learning Engine
• Application Profiling and Analytics
- Monitors application data access behaviors
• Data access prediction
- Predicts next or additional pages required
• Smart Data pre-fetch
- Pre-fetches pages to DRAM based on profiling, history, and data access patterns
Clustered Pages
• Prefetches grouped data typical in many applications
Data Locality
• Movement between DRAM and Memory1 ensures data local to associated node
Amortized Page Faults
• Groups page requests together, fully leveraging each page fault
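To see why grouping page requests pays off, the sketch below counts faults when each fault brings in an aligned cluster of neighbouring pages. It is an illustration under simplifying assumptions (unbounded DRAM, fixed cluster size, hypothetical function name), not the DMX prediction logic.

```python
def count_faults(accesses, cluster_pages=1):
    """Count page faults when each fault fetches a whole aligned cluster.

    cluster_pages=1 models plain demand paging; larger values model
    pulling in neighbouring pages on the same fault. DRAM is treated
    as unbounded to keep the sketch short.
    """
    resident = set()
    faults = 0
    for page in accesses:
        if page not in resident:
            faults += 1
            start = (page // cluster_pages) * cluster_pages
            resident.update(range(start, start + cluster_pages))
    return faults


sequential = list(range(64))
demand = count_faults(sequential)
clustered = count_faults(sequential, cluster_pages=8)
print(demand, clustered)
```

The benefit depends on locality: an access pattern that rarely touches neighbouring pages would see little reduction and would waste DRAM on unused pages.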
Diablo Technologies Slide 12
Media Management Details
Flash Management
• Low-level media management (handled in firmware)
• SoftFTL – Adaptation for 4K pages
• Tuneable cache ratio (4:1, 8:1)
• Device Striping
Intelligent Traffic Management
• Dirty Page Writes
- Avoids premature writes to flash for frequently written pages
- Minimizes Read/Modify/Write operations
• Traffic Sequentialization
- Pages evicted are written sequentially to flash
[Diagram labels: DRAM; Evictions Written Sequentially; Clean Pages Dropped; Dirty Pages Remain in DRAM Until Writes Complete; Pages Naturally Pulled in by Application]
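The eviction behaviour described on this slide can be sketched as follows: clean pages are dropped, and dirty pages are appended to flash in eviction order. The function and its arguments are hypothetical, and the write is modelled as completing immediately, whereas the slide indicates dirty pages stay in DRAM until the write completes.

```python
def evict(victims, dirty, flash_log):
    """Evict pages from DRAM in a toy model.

    victims   -- page ids chosen for eviction, in eviction order
    dirty     -- set of page ids modified since their last flash write
    flash_log -- list used as an append-only flash log
    Returns the number of clean pages dropped without a flash write.
    """
    dropped = 0
    for page in victims:
        if page in dirty:
            flash_log.append(page)  # sequential append, no in-place update
            dirty.discard(page)     # page counts as clean once written
        else:
            dropped += 1            # flash already holds a valid copy
    return dropped


log = []
dirty_pages = {7, 3}
dropped = evict([7, 1, 3, 9], dirty_pages, log)
print(log, dropped)
```

Appending in eviction order is what turns scattered page writes into a sequential stream on the flash side.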
Diablo Technologies Slide 13
Application Benchmarks
Diablo Technologies Slide 14
Kdb+ Time Series Software: Memory1 Use Case
Generate Stock Data
Generate Returns
Generate Moving Averages
Generate Statistics
Stock Ticker Analysis and Regression:
300GB of in-memory data generated (after Garbage Collection)
2HRs of Execution Time on 2S EP Platform w/ 512GB DRAM
Kdb+ time series data software | Kx Systems
The world’s most powerful number cruncher, kdb+ offers unparalleled performance for time-series data and analytics
https://kx.com/software.php
Diablo Technologies Slide 15
Kdb+: 2TB Memory1 Configuration
Memory1 2TB KDB+ Appliance
CPU: Haswell/Broadwell
Cores: 14-18C/socket
DRAM: 256GB
MEMORY1: 1900GB
NIC: built-in GE + Add-on 10GE
Storage: N x HDD
Increase dataset size and instances per machine by 5X and avoid EX Platforms
[Bar chart: Execution Time (Hrs), axis 0 to 3.5, for DRAM Instance and M1 Instance1 through M1 Instance5]
Diablo Technologies Slide 16
Memory1 Advantage: Economically Increases MySQL Buffer Pool
[Diagram: MySQL Architecture. Tiers: RAM, MEMORY1, STORAGE CAPACITY. Size labels: xMB, yGB, zGB. Components: MySQL Instance, Buffer Pool, Instance N, Buffer Pools]
Main and Virtual Memory (includes all buffers)
Physical Disk/Secondary Storage (log files, databases and relative statistics, index, data, and meta-data files)
Significantly Expand Cache
Increase TPS
Improve Response Time
Diablo Technologies Slide 17
Test Configuration
Random, Normal Workload Distribution
“DRAM-Only” Server
• 10GE Optical NIC
• Dual 14-Core CPUs
• 128GB DRAM
• CentOS 7
• MySQL:
Memory1 Server
• 10GE Optical NIC
• Dual 14-Core CPUs
• 128GB DRAM
• 1TB Memory1
• CentOS 7
• MySQL:
Client (Load Generator)
• 10GE Optical NIC
• Linux:
• Sysbench:
10GE Optical Switch
[Diagram labels: RAM, MEMORY1, STORAGE CAPACITY]
Increasing Buffer Pool size with MEMORY1 should:
• Remove Bottleneck imposed by Storage Performance
• Increase TPS and Reduce Response Time
Diablo Technologies Slide 18
Hardware Configuration
DRAM / NVMe Server
• Dual 14-Core CPUs
• 128GB DRAM
• NVMe SSD as storage
Memory1 Server
• Dual 14-Core CPUs
• 128GB DRAM
• 1TB Memory1
Client (Load Generator)
• Sysbench
Increasing Buffer Pool size with Memory1 will:
• Remove bottleneck imposed by storage performance
• Increase TPS and reduce response time
• Allow top performance with any storage solution
All Servers Include: 10GE Optical NIC, CentOS 7
10GE Optical Switch
Random, Normal Workload Distribution
[Diagram labels: RAM, MEMORY1, STORAGE CAPACITY]
Diablo Technologies Slide 19
Benchmark Scenarios
Scenario “A”: 75 GB DB, 80 GB Buffer-Pool
Scenario “B”: 75 GB DB, 15 GB Buffer-Pool
DMX:
• 6x Scenario A
• Total RAM used = 128G
• Sysbench Mixed Traffic
RAM:
• 6x Scenario B
• Total RAM used = 128G
• Sysbench Mixed Traffic
• 15 GB Cache
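The two scenarios can be compared with simple arithmetic. The sketch below assumes that "6x" means six MySQL instances per server and that "1TB Memory1" is 1024 GB; neither is spelled out on the slide, and the names are illustrative.

```python
INSTANCES = 6        # assumption: "6x" means six MySQL instances per server
DB_GB = 75           # database size per instance
DRAM_GB = 128        # DRAM per server
MEMORY1_GB = 1024    # assumption: "1TB Memory1" taken as 1024 GB


def total_buffer_pool_gb(per_instance_gb: int) -> int:
    """Aggregate buffer pool across all instances on one server."""
    return INSTANCES * per_instance_gb


scenario_a = total_buffer_pool_gb(80)  # DMX (Memory1) server
scenario_b = total_buffer_pool_gb(15)  # DRAM-only server
db_total = INSTANCES * DB_GB

print(f"A: {scenario_a} GB of buffer pool for {db_total} GB of data")
print(f"B: {scenario_b} GB of buffer pool for {db_total} GB of data")
print(f"A fits in DRAM alone: {scenario_a <= DRAM_GB}")
print(f"A fits in DRAM + Memory1: {scenario_a <= DRAM_GB + MEMORY1_GB}")
```

Under these assumptions, Scenario A can hold each database entirely in its buffer pool, while Scenario B caches only a fifth of each database and must go to storage for the rest.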
Diablo Technologies Slide 20
DMX vs RAM
[Chart: TPS Comparison. TPS (axis 0 to 1800) vs. Time (sec); series: RAM, DMX]
[Chart: Latency Comparison. Latency (ms) (axis 0 to 80) vs. Time (sec); series: RAM Latency, DMX Latency]
Diablo Technologies Slide 21
Storage Read BW