david turek vp high performance and cognitive computing · 2020-01-15 · (bluemix local) ibm cloud...

Data Centric Systems David Turek VP High Performance and Cognitive Computing


Post on 23-May-2020


Page 1: David Turek VP High Performance and Cognitive Computing · 2020-01-15 · (Bluemix Local) IBM Cloud private X86, Power & Z X86 based systems On-prem, IBM managed Off-prem, IBM managed

Data Centric Systems

David Turek

VP High Performance and Cognitive Computing


© 2016 International Business Machines Corporation

R&D in the IT Era

Theory/Knowledge

Experiment

Simulation

Massive improvements:

• Applicability & ease of use

• Simulation fidelity

• Scalability

• Throughput

thanks to Supercomputing


Source: Top500.org

Implementing Exact-Exchange in CPMD

>99% Parallel Efficiency to over 6.2M threads

Studying Li-Air Batteries, 1736 atoms, 70 Ry cutoff

V. Weber, T. Laino, C. Bekas, A. Curioni, A. Bertsch, S. Futral IPDPS 13


ACM Gordon Bell Prize 2013

14.4 PFLOP/s at 73% of peak performance, with I/O

2 orders of magnitude improvement in

• scale of the problem (from 128 to 15K bubbles)

• time to solution

Compute specifics:

13 trillion elements, 1.2 TB compressed I/O per time step, 6.4M threads

IBM, ETHZ, TUM, LLNL

Success in Petascale computing: CFD can achieve Linpack-like sustained performance
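The figures on this slide can be cross-checked with a few lines of arithmetic. The sketch below only derives numbers implied by the slide's own values; it assumes "15K" means 15,000 and that the work is spread evenly over threads, which the slide does not state.

```python
import math

# Values quoted on the slide
sustained_pflops = 14.4        # sustained performance, PFLOP/s
fraction_of_peak = 0.73        # 73% of peak
bubbles_before = 128
bubbles_after = 15_000         # assumption: "15K" read as 15,000
elements = 13e12               # 13 trillion elements
threads = 6.4e6                # 6.4M threads

# Peak performance implied by the two quoted numbers
implied_peak_pflops = sustained_pflops / fraction_of_peak

# Growth in problem scale, expressed in orders of magnitude
scale_factor = bubbles_after / bubbles_before
orders_of_magnitude = math.log10(scale_factor)

# Average number of elements per thread (assumes an even split)
elements_per_thread = elements / threads

print(f"implied peak: {implied_peak_pflops:.1f} PFLOP/s")
print(f"scale factor: {scale_factor:.0f}x ({orders_of_magnitude:.2f} orders of magnitude)")
print(f"elements per thread: {elements_per_thread:,.0f}")
```

The scale factor of roughly 117x is what the slide rounds to "2 orders of magnitude".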


ACM Gordon Bell Prize 2015

97% sustained scalability for a fully implicit solver

1.6M cores, 3.2M MPI processes, 602B DoF

IBM, UT Austin, NYU, Caltech

Success in Petascale computing: Implicit linear solvers do scale!
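As a rough sense of scale, the quoted core, process, and degree-of-freedom counts can be combined into per-process figures. This is a back-of-the-envelope sketch that assumes an even distribution of unknowns across MPI processes, which the slide does not state.

```python
# Values quoted on the slide
cores = 1.6e6            # 1.6M cores
mpi_processes = 3.2e6    # 3.2M MPI processes
dof = 602e9              # 602B degrees of freedom

# Ratios implied by the quoted values (assumes an even split)
processes_per_core = mpi_processes / cores
dof_per_process = dof / mpi_processes
dof_per_core = dof / cores

print(f"MPI processes per core: {processes_per_core:.0f}")
print(f"DoF per MPI process:    {dof_per_process:,.0f}")
print(f"DoF per core:           {dof_per_core:,.0f}")
```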


R&D Today

Diagram labels: Opportunistic Discovery by Humans; Simulation; Experiments; New Product

But we cannot beat complexity with brute-force simulation. Traditional discovery has limits: we need a new, data-driven, holistic approach.


Knowledge

Companies need easy access to quickly growing and widely diverse information sources.

• Highly unstructured/dark

• Current human-based approach is not scalable

Inference & Simulation

Domain-related inference is largely missing. Setting up and deploying the right simulations is very hard.

• Human-capital intensive, not scalable

Evidence & Experiments

Internal evidence and experiments are driven primarily empirically, often by brute force, and their results are isolated from the wider knowledge space.


Create a technical-area-specific knowledge space from all relevant sources. Link with company data.

Use the knowledge space to

• Drastically augment internal know-how & modeling

• Focus on which experiment is relevant

• Embed results in the knowledge base

Use inference on the knowledge space & simulation on the models to

• Augment the knowledge space

• Sharpen simulation models

• Make precise decisions

Cognitive Discovery

Drastically accelerate the pace of systematic discovery and maximize ROI for R&D

Rapid and Precise Materials R&D drives new value for our clients

Diagram labels: Pharma; Materials; Engineering & Manufacturing; Science, Products & Economics; Experimental Results; Knowledge; Inference & Simulation; Evidence & Experiments; Simulation


Document Ingestion: PDF

Domain Specific Knowledge Graphs

Domain Specific ML + Inference

NLQ + ML Driven Simulations

Automatic Hypothesis Discovery

Fully Automated Reasoning

Fully Automated Discovery

Axis labels: mature; Ideation

KNOWLEDGE EXTRACTION & REPRESENTATION

INFERENCE DRIVEN SIMULATIONS

AUTOMATED TECHNICAL REASONING


1. Literature review

Non-scalable, human-based outsourcing:

• Limited sources

• Non-systematic; limited re-use

2. Chemical/physical/engineering modeling & simulations

• Expert material scientists

• Empirical: no inference

• Trial-and-error based: no systematic knowledge buildup

3. Lab tests

Costly in time and money

• Empirical (slow: many tests)

• No systematic knowledge buildup & connection

Timeline labels: Years; Months, Months, Months

Stage labels: INGESTION, SIMULATION, ANALYSIS


Pdf-parser:

• Parses the PDF code and presents the raw data of the PDF (text cells, embedded images and vector graphics) in a consumable format

Pdf-interpreter:

• Captures ground truth through a massive crowd-sourcing Big Data system

• Uses HPC for ML techniques (Deep Learning) to train automatic annotation models

Semantic-representation:

• Uses HPC & Big Data systems to obtain a semantic representation of the original text in JSON format

Billions of documents

Millions of concurrent users
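The three stages above (parse, interpret, represent) can be illustrated with a minimal sketch. Everything here is hypothetical: the `TextCell` structure, the field names in the JSON output, and especially `annotate`, which is a hand-written rule standing in for the trained annotation models the slide describes. It is not the actual IBM pipeline or its schema.

```python
import json
from dataclasses import dataclass


@dataclass
class TextCell:
    """Hypothetical raw output of a PDF parser for one piece of text."""
    text: str
    page: int
    bbox: tuple        # (x0, y0, x1, y1) in page coordinates
    font_size: float


def annotate(cell, body_font_size=10.0):
    """Toy stand-in for a trained annotation model: assigns a label by rule."""
    if cell.font_size >= 1.5 * body_font_size:
        return "title"
    if cell.text.lstrip().startswith("•"):
        return "list_item"
    return "paragraph"


def to_semantic_json(cells):
    """Turn labelled text cells into a JSON document (hypothetical schema)."""
    doc = {
        "pages": max((c.page for c in cells), default=0),
        "elements": [
            {
                "label": annotate(c),
                "text": c.text.strip(),
                "page": c.page,
                "bbox": list(c.bbox),
            }
            for c in cells
        ],
    }
    return json.dumps(doc, indent=2, ensure_ascii=False)


cells = [
    TextCell("Data Centric Systems", 1, (50, 700, 400, 730), 24.0),
    TextCell("• Corrosion", 1, (60, 600, 300, 612), 10.0),
    TextCell("Design alloys to avoid catastrophic failure", 1, (50, 640, 500, 652), 10.0),
]
print(to_semantic_json(cells))
```

Keeping the label next to the text and its bounding box is one plausible way to let downstream search and knowledge-graph steps work on structure rather than on raw page layout.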


Design alloys to avoid catastrophic failure that can lead to huge liabilities

• Corrosion

• Cracks

• Special environmental and deployment conditions

Diagram labels: Scientific literature & internal reports; Lab tests experiments data; Knowledge space; Deep Search; Simulation & Inference; Weeks; DAYS

• Atomistic simulations

• Deep Learning based property prediction


• Typically HPC development is focused on increased speed.

• The fastest calculation is the one which you don’t run!

• Can we use machine learning to make better decisions on which simulations give the most value?

• Can we use machine learning to improve resolution of information?

‘Cognitive’ workflow uses 1/3 of the calculations to achieve 4 orders of magnitude resolution increase
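The idea of running only the simulations that add the most value can be sketched as a small selection loop. This is an illustration under stated assumptions, not the workflow behind the slide's numbers: `expensive_simulation` is a made-up stand-in function, the surrogate is plain piecewise-linear interpolation rather than a trained ML model, and the selection score is a simple heuristic. The budget of one third of the candidates merely echoes the "1/3 of the calculations" figure.

```python
import math


def expensive_simulation(x):
    """Hypothetical stand-in for a costly simulation."""
    return math.sin(3.0 * x) + 0.5 * x


def interpolate(x, samples):
    """Piecewise-linear surrogate built from the simulations run so far."""
    pts = sorted(samples.items())
    if x <= pts[0][0]:
        return pts[0][1]
    if x >= pts[-1][0]:
        return pts[-1][1]
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= x <= x1:
            t = (x - x0) / (x1 - x0)
            return y0 + t * (y1 - y0)


def score(x, samples, eps=1e-3):
    """Heuristic value of simulating x: large gaps with large change score high."""
    pts = sorted(samples.items())
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 < x < x1:
            return min(x - x0, x1 - x) * (abs(y1 - y0) + eps)
    return 0.0


def run_cognitive_workflow(candidates, budget):
    """Run at most `budget` simulations, choosing each one by its score."""
    samples = {x: expensive_simulation(x) for x in (candidates[0], candidates[-1])}
    while len(samples) < budget:
        remaining = [x for x in candidates if x not in samples]
        x_next = max(remaining, key=lambda x: score(x, samples))
        samples[x_next] = expensive_simulation(x_next)
    return samples


candidates = [2.0 * i / 29 for i in range(30)]
samples = run_cognitive_workflow(candidates, budget=len(candidates) // 3)

# The surrogate can then be queried on a much finer grid at negligible cost.
fine_grid = [2.0 * i / 999 for i in range(1000)]
predictions = [interpolate(x, samples) for x in fine_grid]
print(f"simulations run: {len(samples)} of {len(candidates)} candidates")
```

Querying the surrogate on the finer grid is the sense in which a learned model can raise the resolution of the result without running more simulations; how accurate those predictions are depends entirely on the surrogate and on the problem.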


On-prem, customer managed

(Bluemix Local)

IBM Cloud private

X86, Power & Z X86 based systems

On-prem, IBM managed

Off-prem, IBM managed (Bluemix Public or Dedicated)

Linux


kube-arbitrator

GPFS/Parallel object store

Spectrum MPI

Spectrum LSF, Conductor w/Spark, Symphony

XLc/C/Fortran

Compute Accelerators (GPUs, AI, FPGA, etc.) // High Performance Network (RoCE, IB, RRC) // NVMe, Flash

Math libraries: ESSL, GPU, AI

AI frameworks (PowerAI,DLaaS)

Workflow Managers (TCaaS)

HPC, AI Applications

xCAT Provisioning

Ubiquity Storage drivers


Knowledge Space

Simulation

Weeks

Evidence/Experiments

• Supercomputing

• Quantum and new computing paradigms

• Inference (ML)

Ingest data and create massive knowledge spaces

Link evidence with knowledge spaces. Drive deep search