AKA. Why We Got Tired of Hearing Talks Like Katie Gave and What We Are Doing About It!!

The Fusion of Supercomputing and Big Data

Peter Ungaro, President & CEO


Post on 28-May-2020



The first wave of “big” data…

“Simulation was the first big data market” -- IDC

The current wave of Big Data….

2 Questions...

1) Will analytics in the era of “big data” look like analytics has looked since the relational database was developed in 1969 by Dr. Codd?

2) Will “big data” look the same in the world of engineering and scientific research as it will in the world of business?

Cray’s Vision: The Fusion of Supercomputing and Big & Fast Data

Modeling The World

Cray Supercomputers solving “grand challenges” in science, engineering and analytics

Compute Store Analyze

Data Models: Integration of datasets and math models for search, analysis, predictive modeling and knowledge discovery

Math Models: Modeling and simulation augmented with data to provide the highest fidelity virtual reality results

Data-Intensive Processing: High throughput event processing & data capture from sensors, data feeds and instruments

Cray Technology Innovations

System Interconnect: Interconnect and optimization software to address the data transfer bottleneck at large scale

Systems Management & Performance Software: Software to productively manage and extract performance out of thousands of processors used as a single system

Packaging: Greenest x86 supercomputers with innovative cooling and upgradability to improve TCO

Combines multiple processing architectures into a single, scalable system

Extending Adaptive Supercomputing to Big Data Workloads

Big Data’s Methods for Analysis

Solutions for Advanced Analytics

Big Data Solutions

• Data Warehouses + Extensions
• NoSQL Databases
• Hadoop / MapReduce
• Graph Analytics

These solutions can overlap, but they can also be very complementary, as each has strengths & weaknesses

System Architecture Differences…

Supercomputing

• Scalable computing w/high BW, low-latency, global memory architectures
• Tightly integrated processor-memory-interconnect & network storage
• Minimize data movement – load the “mesh” into memory
• Move data for loading, check-pointing or archiving
• “Basketball court sized” systems

Large-scale Data Analytics

• Distributed computing at largest scale
• Divide-and-conquer approaches on Service-Oriented Architectures
• Maximize data movement – scan/sort/stream all the data all the time
• Lowest cost processor-memory-interconnect & local storage
• “Warehouse sized” clouds

Can’t “The Cloud” Do This?

Supercomputers & Analytic Appliances

Public & Private Clouds

Step 2: Enabling More Complexity & Capability …Big Data Fast Data

[Figure labels: Global Memory + Fast Interconnects; Fast Data; SAN Interconnects; Enterprise Data (structured); GRID; Big Data; CLOUD; LAN/WAN Interconnects]

Big Data Appliance for Real-Time Data Discovery

• Detecting cyber threats
• Analyze customer churn
• Find new fraud patterns
• Discover new drug re-purposing opportunities

[Figure labels: Customers; Call Center Events; Work Orders; Call Escalations; Truck Rolls; Set-Top Box feeds; Supervisor Intervention; 3rd Party Service Tech; AVR Failure; CSR Resolution; Cabinet Failure; Residential Accounts; Web Service]

The Baseball World Has Changed…

Box Score

Play-by-Play

Pitch f/x

Source: MLB.com and Baseball-Reference

Big Data Era of Baseball Analytics

New (Big) Data + New Technology

• New Data
– 20X from Moneyball
– Pitch f/x (20/Pitch)
– Hit f/x (5/hit)

+

• New Technology
– Graph Analytics
– Urika (aka. supercomputer)

Evaluating Batter/Pitcher Match Ups
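As a rough illustration of what treating batter/pitcher match-ups as a graph problem could look like, the sketch below stores pitch-level records as edges between pitchers and batters and then walks those edges. It is only a toy under stated assumptions: the record layout, the player names and the helper functions are all hypothetical, and nothing here reflects Urika's actual interface or the real Pitch f/x schema.

```python
from collections import defaultdict

# Hypothetical pitch-level records: (pitcher, batter, pitch_type, outcome).
# Names and values are invented for illustration only.
PITCHES = [
    ("pitcher_A", "batter_X", "slider", "swinging_strike"),
    ("pitcher_A", "batter_X", "fastball", "hit"),
    ("pitcher_A", "batter_Y", "slider", "swinging_strike"),
    ("pitcher_B", "batter_X", "slider", "hit"),
    ("pitcher_B", "batter_Y", "curveball", "ball"),
]


def build_matchup_graph(pitches):
    """Build a bipartite graph: graph[pitcher][batter] lists (pitch_type, outcome) edges."""
    graph = defaultdict(lambda: defaultdict(list))
    for pitcher, batter, pitch_type, outcome in pitches:
        graph[pitcher][batter].append((pitch_type, outcome))
    return graph


def outcome_counts(graph, pitcher, batter):
    """Count outcomes on the edges between one pitcher and one batter."""
    counts = defaultdict(int)
    # .get() avoids creating empty entries for pairs that never met.
    for _pitch_type, outcome in graph.get(pitcher, {}).get(batter, []):
        counts[outcome] += 1
    return dict(counts)


def batters_sharing_pitchers(graph, batter):
    """Two-hop traversal: batter -> pitchers faced -> other batters they faced."""
    related = set()
    for faced in graph.values():
        if batter in faced:
            related.update(b for b in faced if b != batter)
    return related


graph = build_matchup_graph(PITCHES)
print(outcome_counts(graph, "pitcher_A", "batter_X"))
print(batters_sharing_pitchers(graph, "batter_X"))
```

The two-hop query is the part that becomes expensive at scale, since each additional hop multiplies the number of edges to follow.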

Evolution of Science and Knowledge Discovery

• Experimental Research
• Theoretical Research
• Computational Simulation of Complex Phenomena
• Data Intensive / Data Driven Science
– Multi-disciplinary, multi-institutional
– Sensors, devices, simulations, social…

NERSC-8 “Cori” System

● To be installed in 2Q16
● KNL many-core processor
● > 9,300 Compute nodes
● > 27 PF
● Data partition of 2,000 Haswell nodes
● Cray Lustre Filesystem
– 28 Petabytes
– 430 GB/sec performance
● Focus on design is sustained performance and perf/watt
● Cray-NERSC-Intel COE will focus on science applications

Trinity System at LANL/Sandia

● 2015 Delivery: ~9,500 Haswell nodes
● Large Sonexion/Lustre solution
– 80 PB capacity
– 1.7 TB/sec bandwidth
● Tightly-integrated Burst Buffer SSD nodes
– 3.7 PB capacity
– 3.3 TB/sec bandwidth
● 2016 Delivery: ~9,500 nodes of KNL
– Each with 16GB of fast on-package memory
● All integrated into a single 42 PF system
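To put the Trinity storage figures in perspective, the following back-of-the-envelope sketch converts the quoted capacities and bandwidths into idealized transfer times. It assumes decimal units (1 PB = 1,000 TB) and that the quoted peak bandwidth is sustained for the whole transfer, which real workloads will not achieve; the function and variable names are illustrative only.

```python
# Figures quoted on the Trinity slide.
LUSTRE_CAPACITY_PB = 80.0
LUSTRE_BANDWIDTH_TB_S = 1.7
BURST_CAPACITY_PB = 3.7
BURST_BANDWIDTH_TB_S = 3.3

TB_PER_PB = 1000.0  # assumption: decimal units


def transfer_seconds(data_pb: float, bandwidth_tb_per_s: float) -> float:
    """Idealized time to move data_pb petabytes at a constant bandwidth."""
    return data_pb * TB_PER_PB / bandwidth_tb_per_s


# Fill the burst buffer at its own peak bandwidth.
fill_burst = transfer_seconds(BURST_CAPACITY_PB, BURST_BANDWIDTH_TB_S)
# Move a full burst buffer's worth of data at the Lustre bandwidth.
drain_to_lustre = transfer_seconds(BURST_CAPACITY_PB, LUSTRE_BANDWIDTH_TB_S)
# Write the entire Lustre capacity once.
fill_lustre = transfer_seconds(LUSTRE_CAPACITY_PB, LUSTRE_BANDWIDTH_TB_S)

print(f"Fill burst buffer:         {fill_burst / 60:.1f} min")
print(f"Drain burst buffer:        {drain_to_lustre / 60:.1f} min")
print(f"Fill Lustre file system:   {fill_lustre / 3600:.1f} h")
```

Under these assumptions the burst buffer fills in roughly 19 minutes and takes roughly 36 minutes to drain at the Lustre bandwidth, while writing the full 80 PB once would take about 13 hours.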

Integrated HPC Environments are the capability that will turn data into insight and discovery

Analyze

Compute

Store

Our Vision…

Build a world-class integrated supercomputing environment that enables transformational computing across a broad set of science, engineering and advanced analytics (big data) applications

Remember, some great things never change…