Research with Ocelot

SCHOOL OF ELECTRICAL AND COMPUTER ENGINEERING | GEORGIA INSTITUTE OF TECHNOLOGY

Upload: ogden

Post on 23-Feb-2016


DESCRIPTION

Research with Ocelot. Workload Characterization and Analysis. SM Load Imbalance (Mandelbrot). Intra-Thread Data Sharing. Activity Factor. Constructing Performance Models: Eiger. Develop a portable methodology to discover relationships between architectures and applications.

TRANSCRIPT

Page 1: Research with ocelot


RESEARCH WITH OCELOT

Page 2: Research with ocelot


Workload Characterization and Analysis

SM Load Imbalance (Mandelbrot)

Intra-Thread Data Sharing

Activity Factor
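Two of the metrics named on this slide can be illustrated with a small sketch. The definitions are assumptions, since the slide does not spell them out: activity factor is taken here as the mean fraction of active threads per warp over all dynamic instructions, and SM load imbalance as the ratio of the busiest SM's work to the mean work per SM. The function names and the sample data are hypothetical and are not part of Ocelot's API.

```python
WARP_SIZE = 32  # assumed warp width


def activity_factor(active_masks, warp_size=WARP_SIZE):
    """Mean fraction of active threads over a trace of per-instruction active masks."""
    if not active_masks:
        return 0.0
    # Count the set bits in each mask: one bit per active thread.
    active = sum(bin(mask).count("1") for mask in active_masks)
    return active / (len(active_masks) * warp_size)


def load_imbalance(work_per_sm):
    """Ratio of the maximum to the mean work per SM; 1.0 means perfectly balanced."""
    mean = sum(work_per_sm) / len(work_per_sm)
    return max(work_per_sm) / mean


# Hypothetical trace: four dynamic instructions with 32, 16, 8, and 32 active threads.
trace = [0xFFFFFFFF, 0x0000FFFF, 0x000000FF, 0xFFFFFFFF]
print(activity_factor(trace))                # 0.6875
print(load_imbalance([100, 100, 100, 300]))  # 2.0
```

In a Mandelbrot kernel, pixels inside the set run to the iteration cap while pixels far outside exit early, so SMs that receive blocks covering the set would be expected to show more work under a metric like this.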


Page 3: Research with ocelot


Constructing Performance Models: Eiger

Develop a portable methodology to discover relationships between architectures and applications

[Image: Adapteva’s multicore, from electronicdesign.com]

Extensions to Ocelot for the synthesis of performance models

Used in macroscale simulation models

Used in JIT compilers to make optimization decisions

Used in run-times to make scheduling decisions

Page 4: Research with ocelot


Eiger Methodology

Use data analysis techniques to uncover application-architecture relationships

Discover and synthesize analytic models

Extensible in source data, analysis passes, model construction techniques, and destination/use


[Diagram labels: Ocelot, JIT, SST/Macro]
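The idea of synthesizing an analytic model from measurements can be sketched with the simplest possible case: a one-variable linear fit. This is a stand-in chosen for illustration. The slides do not say which analysis passes or model forms Eiger uses, and the feature (dynamic instruction count), the measurements, and the function names below are all hypothetical.

```python
def fit_linear(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(model, x):
    """Evaluate the fitted model at a new feature value."""
    intercept, slope = model
    return intercept + slope * x


# Hypothetical training data: dynamic instruction count -> measured runtime (ms).
instructions = [1.0e6, 2.0e6, 3.0e6, 4.0e6]
runtime_ms = [2.5, 4.5, 6.5, 8.5]

model = fit_linear(instructions, runtime_ms)
estimate = predict(model, 5.0e6)  # predicted runtime for an unseen kernel size
```

A model of this kind is cheap to evaluate, which is what would make it usable from a simulator, a JIT compiler, or a run-time scheduler as the preceding slide suggests.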

Page 5: Research with ocelot


Feedback-Driven Optimization: Autotuning

Use Ocelot’s dynamic instrumentation capability

Real-time feedback drives the Ocelot kernel JIT

Decision models to drive existing/new auto-tuners

Change data layout to improve memory efficiency

Use different algorithms

Selective invocation, hot path profiling, algorithm selection

[Diagram labels: Decision Models, Measurements, Code Generation, Workload Characterization. Annotation: Not available with CUPTI]
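The loop of measuring, deciding, and regenerating code can be sketched as a tiny decision model that picks a data layout by its memory efficiency. Everything here is an assumption made for illustration: the 128-byte transaction size, the 16-byte record, the cost metric (memory segments touched by one warp), and all function names. None of it is taken from Ocelot.

```python
SEGMENT_BYTES = 128  # assumed memory transaction size


def segments_touched(addresses, segment_bytes=SEGMENT_BYTES):
    """Number of distinct memory segments covered by one warp's accesses."""
    return len({addr // segment_bytes for addr in addresses})


def aos_addresses(num_threads, record_bytes=16):
    """Array-of-structures: thread t reads the first field of record t."""
    return [t * record_bytes for t in range(num_threads)]


def soa_addresses(num_threads, field_bytes=4):
    """Structure-of-arrays: thread t reads element t of a packed field array."""
    return [t * field_bytes for t in range(num_threads)]


def select_layout(layouts, num_threads=32):
    """Measure every candidate layout and return the cheapest one with all costs."""
    costs = {name: segments_touched(gen(num_threads)) for name, gen in layouts.items()}
    best = min(costs, key=costs.get)
    return best, costs


best, costs = select_layout({"AoS": aos_addresses, "SoA": soa_addresses})
print(best, costs)  # SoA {'AoS': 4, 'SoA': 1}
```

In a real autotuner the cost would come from instrumentation of the running kernel rather than from an address model, and the chosen variant would be handed back to the JIT for code generation.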

Page 6: Research with ocelot

Feedback-Driven Resource Management

Real-time customized information available about GPU usage

Can drive scheduling decisions

Can drive management policies, e.g., power, throughput, etc.

[Diagram: Ocelot’s Lynx. Labels: Applications, Management Layer, GPU Clusters, Ocelot, PTX, Instrumented PTX, Instrumentation APIs, Instrumentor, C-on-Demand JIT, C-PTX Translator, PTX-PTX Transformer, Instrumentation]
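How usage feedback might drive a scheduling decision can be sketched as follows. This is a hypothetical management-layer policy, not Lynx's interface: the class name, the smoothing factor, and the utilization samples are all assumed for the example.

```python
class DeviceMonitor:
    """Tracks a smoothed utilization estimate per device and picks the least busy one."""

    def __init__(self, device_ids, alpha=0.5):
        self.alpha = alpha  # weight given to the newest sample
        self.utilization = {device: 0.0 for device in device_ids}

    def report(self, device_id, sample):
        """Fold a new utilization sample (0.0 to 1.0) into the running estimate."""
        old = self.utilization[device_id]
        self.utilization[device_id] = self.alpha * sample + (1 - self.alpha) * old

    def pick_device(self):
        """Scheduling decision: the device with the lowest estimated utilization."""
        return min(self.utilization, key=self.utilization.get)


monitor = DeviceMonitor(["gpu0", "gpu1"])
monitor.report("gpu0", 0.8)  # samples as they might arrive from instrumented kernels
monitor.report("gpu1", 0.2)
first_choice = monitor.pick_device()

monitor.report("gpu1", 1.0)  # gpu1 becomes busy, so the decision flips
second_choice = monitor.pick_device()
```

The same structure would apply to other policies mentioned on the slide, such as power or throughput, by changing what is reported and what is minimized.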

Page 7: Research with ocelot


Domain Specific Compilation: Red Fox

Targeting accelerator clouds to meet the demands of data warehousing applications

Joint with LogicBlox Inc.

[Diagram: Red Fox compilation flow. Labels: Datalog Queries, Language Front-End, LogicBlox Front-End, Translation Layer, Datalog-to-RA (nvcc + RA-Lib), RA Primitives, Harmony, src-src Optimization, Harmony Kernel IR, IR Optimization, Ocelot, Machine Neutral Back-End]
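The Datalog-to-RA step can be illustrated by hand-translating one rule into relational algebra (RA) primitives. This is a minimal sketch on Python sets, not Red Fox's translator or RA-Lib: the rule, the relation contents, and the `join` and `project` helpers are hypothetical.

```python
def join(left, right, left_col, right_col):
    """Equijoin: concatenate matching tuples, dropping the right-hand join column."""
    index = {}
    for rrow in right:
        index.setdefault(rrow[right_col], []).append(rrow)
    result = set()
    for lrow in left:
        for rrow in index.get(lrow[left_col], []):
            result.add(lrow + rrow[:right_col] + rrow[right_col + 1:])
    return result


def project(relation, columns):
    """Keep only the listed column positions; duplicates collapse in the set."""
    return {tuple(row[c] for c in columns) for row in relation}


# Illustrative Datalog rule:
#   two_hop(x, z) :- edge(x, y), edge(y, z).
# Equivalent RA: project[x, z]( edge join[edge.dst = edge.src] edge )
edge = {(1, 2), (2, 3), (2, 4), (3, 5)}
joined = join(edge, edge, left_col=1, right_col=0)  # tuples of (x, y, z)
two_hop = project(joined, [0, 2])
print(sorted(two_hop))  # [(1, 3), (1, 4), (2, 5)]
```

Each primitive operates on whole relations, which is the property that makes operators like join and project candidates for data-parallel GPU kernels.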

Page 8: Research with ocelot


Thank You

Questions?
