![Page 1: map-D - NVIDIA… · map-D data refined @datarefined Todd Mostak Steven Stewart todd@map-d.com steve@map-d.com 245 First St. Suite 1832 Cambridge, MA 02148](https://reader034.vdocuments.net/reader034/viewer/2022051815/60404c29b7b537780038ef83/html5/thumbnails/1.jpg)
map-D: data refined
www.map-d.com @datarefined
Todd Mostak, Steven Stewart
todd@map-d.com, steve@map-d.com
245 First St., Suite 1832, Cambridge, MA 02148
#mapd @datarefined
![Page 2](https://reader034.vdocuments.net/reader034/viewer/2022051815/60404c29b7b537780038ef83/html5/thumbnails/2.jpg)
map-D? A super-fast database built into GPU memory.
Do? World’s fastest real-time big data analytics and interactive visualization.
Demo? Twitter analytics platform: 1 billion+ tweets, queried in milliseconds.
![Page 3](https://reader034.vdocuments.net/reader034/viewer/2022051815/60404c29b7b537780038ef83/html5/thumbnails/3.jpg)
The importance of interactivity
People have long struggled to build interactive visualizations of big data that can deliver insight.
Interactivity means:
• Hypothesis testing can occur at “speed of thought”
How interactive is interactive enough?
• According to a study by Jeffrey Heer and Zhicheng Liu, “an injected delay of half a second per operation adversely affects user performance in exploratory data analysis.”
• Some types of latency are more detrimental than others:
  • For example, linking and brushing are more sensitive to latency than zooming
![Page 4](https://reader034.vdocuments.net/reader034/viewer/2022051815/60404c29b7b537780038ef83/html5/thumbnails/4.jpg)
Strategies for interactivity
• Sampling
  • Ex. BlinkDB
  • Issues:
    • Needs a statistically robust method for sampling
    • Sampling can miss “long-tail” phenomena
• Pre-computation
  • Ex. imMens (data cubing)
  • Issues:
    • Can only show what the curator thought was relevant
    • Can only store a certain number of binned attributes
    • Must be curated!
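The long-tail issue with sampling can be illustrated with a small sketch. This is not BlinkDB's method; it is plain uniform sampling over a made-up dataset, and the names `build_rows` and `sampled_counts`, the values, and the 1% sampling fraction are all assumptions chosen for illustration.

```python
import random
from collections import Counter

def build_rows(n_common=100_000, rare=("blizzard", "tornado", "hail")):
    """Hypothetical dataset: one dominant value plus a few one-off rare values."""
    return ["rain"] * n_common + list(rare)

def sampled_counts(rows, fraction, seed=0):
    """Count values in a uniform random sample drawn without replacement."""
    rng = random.Random(seed)
    k = max(1, int(len(rows) * fraction))
    return Counter(rng.sample(rows, k))

rows = build_rows()
exact = Counter(rows)                          # full scan: sees every value
approx = sampled_counts(rows, fraction=0.01)   # 1% sample (assumed fraction)

# Each rare row has roughly a 1% chance of landing in a 1% sample,
# so most of the long tail is usually absent from the sampled answer.
missed = sorted(set(exact) - set(approx))
print("distinct values (exact): ", len(exact))
print("distinct values (sample):", len(approx))
print("values missed by sample: ", missed)
```

A full scan always reports all four values; the sample can only report a subset of them, which is the trade-off the slide points at.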
At the same time, Map-D also rendered HD data visualizations and sent them to Tweetmap’s interactive analytics GUI.
Live demo: www.mapd.csail.mit.edu
SC13 video and write up:
![Page 5](https://reader034.vdocuments.net/reader034/viewer/2022051815/60404c29b7b537780038ef83/html5/thumbnails/5.jpg)
The Arrival of In-Memory Systems
• Traditional RDBMSs used to be too slow to serve as a back end for interactive visualizations.
  • Queries over a billion records could take minutes, if not hours.
• In-memory systems can execute such queries in a fraction of the time.
  • Both full DBMS and “pseudo”-DBMS solutions exist.
  • But they are still often too slow for interactivity.
![Page 6](https://reader034.vdocuments.net/reader034/viewer/2022051815/60404c29b7b537780038ef83/html5/thumbnails/6.jpg)
Enter Map-D
![Page 7](https://reader034.vdocuments.net/reader034/viewer/2022051815/60404c29b7b537780038ef83/html5/thumbnails/7.jpg)
the technology
![Page 8](https://reader034.vdocuments.net/reader034/viewer/2022051815/60404c29b7b537780038ef83/html5/thumbnails/8.jpg)
Core Innovation
SQL-enabled column store database built into the memory architecture on GPUs and CPUs
Code developed from scratch to take advantage of:
• Memory bandwidth
• Massive parallelism across multiple GPUs
• Systems with both GPU and CPU memory
• Near-linear scaling to clusters of GPU nodes
Double-level buffer pool across GPU and CPU memory
Shared scans: multiple queries of the same data can share memory bandwidth
System can process > 2TB/sec per node, with > 10TB/sec per node logical throughput with shared scans
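The shared-scan idea can be sketched in a few lines: the column is read once, and every pending query tests each value as it streams past. This is a CPU-side illustration only, not Map-D's GPU implementation; `shared_scan`, the sample tweets, and the figure of five queries per scan are assumptions made for the example.

```python
from typing import Callable, Dict, Iterable

Predicate = Callable[[str], bool]

def shared_scan(column: Iterable[str], queries: Dict[str, Predicate]) -> Dict[str, int]:
    """Answer several counting queries with a single pass over the column."""
    counts = {name: 0 for name in queries}
    for value in column:                         # each value is read once...
        for name, predicate in queries.items():  # ...and tested by every query
            if predicate(value):
                counts[name] += 1
    return counts

tweets = ["Rain again", "sunny day", "snow and rain", "SNOW!"]
queries = {
    "rain": lambda t: "rain" in t.lower(),
    "snow": lambda t: "snow" in t.lower(),
}
result = shared_scan(tweets, queries)
print(result)  # {'rain': 2, 'snow': 2}

# Logical throughput = physical scan rate x number of queries sharing the scan.
physical_tb_per_s = 2.0   # from the slide: > 2TB/sec per node
sharing_queries = 5       # assumption: five queries share each scan
logical_tb_per_s = physical_tb_per_s * sharing_queries
print(logical_tb_per_s)   # 10.0
```

Under that assumption, the two throughput figures on the slide are consistent with each other: the data moves through memory at the physical rate, while each byte is counted once per query it serves.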
![Page 9](https://reader034.vdocuments.net/reader034/viewer/2022051815/60404c29b7b537780038ef83/html5/thumbnails/9.jpg)
Shared Nothing Processing
Multiple GPUs, with data partitioned between them
Node 1: Filter text ILIKE 'rain'
Node 2: Filter text ILIKE 'rain'
Node 3: Filter text ILIKE 'rain'
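A minimal sketch of the shared-nothing pattern on this slide: rows are split across three nodes, each node runs the same filter on its own partition, and the partial results are merged. The nodes are simulated as plain lists; `partition` and `local_filter` are hypothetical helpers, round-robin partitioning is an assumption, and the filter is read as a case-insensitive substring match (as `ILIKE '%rain%'` would be), which is also an assumption about what the slide intends.

```python
from typing import List

def partition(rows: List[str], n_nodes: int) -> List[List[str]]:
    """Round-robin the rows across nodes; no row lives on more than one node."""
    return [rows[i::n_nodes] for i in range(n_nodes)]

def local_filter(rows: List[str], needle: str) -> List[str]:
    """Case-insensitive substring filter, run independently on each node."""
    needle = needle.lower()
    return [r for r in rows if needle in r.lower()]

tweets = ["Rain in Boston", "sunny", "RAIN delay", "snow", "light rain", "clear skies"]

nodes = partition(tweets, 3)                                   # Node 1, Node 2, Node 3
partials = [local_filter(node_rows, "rain") for node_rows in nodes]
merged = [row for part in partials for row in part]            # combine partial results

print(partials)
print(len(merged))  # 3
```

Because no node needs another node's data to evaluate the filter, the per-node work can run fully in parallel and only the (small) partial results have to be exchanged.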
![Page 10](https://reader034.vdocuments.net/reader034/viewer/2022051815/60404c29b7b537780038ef83/html5/thumbnails/10.jpg)
the product
![Page 11](https://reader034.vdocuments.net/reader034/viewer/2022051815/60404c29b7b537780038ef83/html5/thumbnails/11.jpg)
Product: GPU-powered end-to-end big data analytics and visualization platform
Complex Analytics
• Machine learning
• Graph analytics
GPU in-memory SQL database
• Scale to cluster of GPU nodes
• SQL compiler
• Shared scans
• User defined functions
• Hybrid GPU/CPU execution
• OpenCL and CUDA
Visualization
• Image processing
• OpenGL
• H.264/VP8 streaming
• GPU pipeline
License
• Simple
• # of GPUs
• Mobile/server versions
![Page 12](https://reader034.vdocuments.net/reader034/viewer/2022051815/60404c29b7b537780038ef83/html5/thumbnails/12.jpg)
Map-D hardware architecture
Small Data, Large Data, Big Data
• Single GPU: 12GB memory. Map-D code integrated into GPU memory. 8 cards = 4U box.
• Single CPU: 768GB memory. Map-D code integrated into CPU memory. 4 sockets = 4U box.
• NVIDIA Tegra mobile chip: 4GB memory. Map-D code integrated into chip memory.
• 36U rack: ~400GB GPU memory, ~12TB CPU memory. Map-D code runs on GPU + CPU memory.
• Next-gen flash: 40TB, 100GB/s.
• Mobile Map-D running small datasets
• Native app
• Web-based service
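The rack-level figures can be checked against the per-device figures with a little arithmetic. The slide does not say how many boxes make up the 36U rack; four 8-GPU boxes' worth of GPU memory and four 4-socket boxes' worth of CPU memory is an assumption, chosen only because it lands close to the slide's "~400GB GPU, ~12TB CPU".

```python
# Per-device figures taken from the slide.
GPU_MEM_GB = 12        # single GPU
GPUS_PER_BOX = 8       # 8 cards = 4U box
CPU_MEM_GB = 768       # single CPU
SOCKETS_PER_BOX = 4    # 4 sockets = 4U box

# Assumption (not stated on the slide): number of boxes of each kind per rack.
GPU_BOXES = 4
CPU_BOXES = 4

gpu_box_gb = GPU_MEM_GB * GPUS_PER_BOX        # 96 GB of GPU memory per box
cpu_box_gb = CPU_MEM_GB * SOCKETS_PER_BOX     # 3072 GB of CPU memory per box

rack_gpu_gb = gpu_box_gb * GPU_BOXES          # 384 GB, close to the slide's ~400GB
rack_cpu_tb = cpu_box_gb * CPU_BOXES / 1024   # 12.0 TB, matching the slide's ~12TB

# Flash tier from the slide: 40TB at 100GB/s (decimal units assumed).
flash_full_scan_s = 40 * 1000 / 100           # 400 seconds to read the whole tier

print(rack_gpu_gb, rack_cpu_tb, flash_full_scan_s)
```

The last line also shows why the memory tiers matter for interactivity: at the stated flash bandwidth, a full pass over 40TB would take minutes, so sub-second queries have to be served from GPU and CPU memory.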