Revisiting Kirchhoff Migration on GPUs 2015 Rice Oil & Gas HPC Workshop Rajesh Gandham, Rice University & Hess Corporation (intern) Thomas Cullison, Hess Corporation Scott Morton, Hess Corporation

Page 1: Revisiting Kirchhoff Migration on GPUs 2015 Rice Oil & Gas HPC Workshop Rajesh Gandham, Rice University & Hess Corporation (intern) Thomas Cullison, Hess

Revisiting Kirchhoff Migration on GPUs

2015 Rice Oil & Gas HPC Workshop

Rajesh Gandham, Rice University & Hess Corporation (intern)

Thomas Cullison, Hess Corporation

Scott Morton, Hess Corporation

Page 2:

Seismic Experiment

http://www.chevron.pl/images/timeline/rsImgSeismicImaging1.jpg

Page 3:

Kirchhoff Migration

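[Figure: Kirchhoff migration geometry on x, y, z axes. A source and a receiver are connected to an image point by travel times Ts and Tr. The data trace sample at t = Ts + Tr is added to the image trace.]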
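
The accumulation rule on this slide (t = Ts + Tr, add to image) can be sketched in a few lines. This is a minimal illustration, not the production kernel: the function name `migrate_trace`, the nearest-sample lookup, and the toy numbers are all assumptions made for the example.

```python
def migrate_trace(image, trace, dt, ts, tr):
    """Add one data trace into one image trace (nearest-sample version).

    image  : list of floats, one value per image point
    trace  : list of floats, recorded samples spaced dt seconds apart
    ts, tr : travel times (s) from the source / receiver to each image point
    All names are hypothetical; this is a sketch, not the authors' code.
    """
    for i in range(len(image)):
        t = ts[i] + tr[i]            # total travel time, t = Ts + Tr
        k = int(round(t / dt))       # nearest data sample index
        if 0 <= k < len(trace):
            image[i] += trace[k]     # "add to image"
    return image


# Toy example: a data trace with a single spike at sample 5.
dt = 0.004
trace = [0.0] * 10
trace[5] = 1.0
ts = [0.004, 0.008, 0.012]
tr = [0.004, 0.012, 0.016]
image = migrate_trace([0.0, 0.0, 0.0], trace, dt, ts, tr)
```

Only the second image point has Ts + Tr equal to the spike time (0.020 s), so only that point receives a contribution.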

Page 4:

Seismic Image

Page 5:

Project Goals

• Hardware portability

• General image gathers

• Improve migration performance

Page 6:

Project Goals

• Hardware portability

• General image gathers

• Improve migration performance

Page 7:

OCCA for Portability

Page 8:

Portability Results

• Ported and tested production kernel from CUDA to OCCA in ~3 weeks
  • Tested and verified kernel results on CPU and GPU
  • Tested production migration on GPUs

• Performance
  • Greater kernel performance because of runtime compilation
  • Kernels still need some tuning for best performance on various architectures

Page 9:

Project Goals

• Hardware portability

• General image gathers

• Improve migration performance

Page 10:

Standard Kirchhoff Imaging

• Pre-compute coarse travel times from surface locations to image points

• 4D surface integral through a 5D data set to 3D image

• Computational complexity:

• $N_I \sim 10^{10}$, number of output image points

• $N_D \sim 10^{9}$, number of input data traces

• $f \sim 10$, number of cycles/image-point/trace

• $f N_I N_D \sim 10^{20}$, number of cycles $\sim 10^{3}$ CPU core years
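
The conversion from cycle count to core years can be checked with a few lines of arithmetic. The 3 GHz clock rate below is an assumption chosen for illustration; the slide does not state one.

```python
import math

n_image = 1e10                  # N_I, output image points (from the slide)
n_data = 1e9                    # N_D, input data traces (from the slide)
cycles_per_contribution = 10    # f, cycles per image point per trace

total_cycles = cycles_per_contribution * n_image * n_data   # about 1e20

clock_hz = 3e9                            # assumption: one 3 GHz CPU core
seconds_per_year = 365.25 * 24 * 3600
core_years = total_cycles / (clock_hz * seconds_per_year)

print(f"{total_cycles:.1e} cycles is roughly {core_years:.0f} core years")
```

Under that assumption the result is on the order of a thousand core years, which matches the slide's estimate.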

Page 11:

Kirchhoff Gather Imaging

• Pre-compute coarse travel times from surface locations to image points

• 4D surface integral through a 5D data set to 4D/5D image

• Image gathers
  • Offset
  • Offset vector tile (OVT)
  • Subsurface angles
  • etc…
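
As an illustration of how a trace might be assigned to a gather bin, the sketch below computes an offset bin and an offset-vector-tile bin from source and receiver surface coordinates. The function names, the bin widths, and the floor-based binning rule are assumptions for the example; the slides do not describe the binning actually used.

```python
import math


def offset_bin(src, rec, bin_width):
    """Scalar-offset bin index from (x, y) source/receiver positions in metres."""
    hx, hy = rec[0] - src[0], rec[1] - src[1]
    return int(math.hypot(hx, hy) // bin_width)


def ovt_bin(src, rec, tile_width):
    """Offset-vector-tile index: bin each offset component separately."""
    hx, hy = rec[0] - src[0], rec[1] - src[1]
    return (math.floor(hx / tile_width), math.floor(hy / tile_width))


src, rec = (0.0, 0.0), (300.0, -400.0)   # offset vector (300, -400), length 500 m
scalar_bin = offset_bin(src, rec, 200.0)
tile = ovt_bin(src, rec, 200.0)
```

An offset gather keeps only the length of the offset vector, while an OVT gather keeps its direction as well, which is why the tile index has two components.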

Page 12:

Project Goals

• Hardware portability

• General image gathers

• Improve migration performance

Page 13:

Previous Approach

• Define tasks that can be run in parallel

• Task should be small enough to fit on a GPU

• Copying data to and from the GPU is expensive

• Global memory access can be a bottleneck

Page 14:

Previous Approach

Page 15:

Previous Approach

Page 16:

Previous Approach Overview

• ~32 traces per task

• Big image block per task

• One gather bin per task

• Pre-filter the data

• Resample the data

• CUDA programming model

Page 17:

New Approach for Performance

• Define tasks that can be run in parallel

• Task should be small enough to fit on a GPU

• Copying data to and from the GPU is expensive

• Global memory access can be a bottleneck

• Improve FLOPs/load

Page 18:

New Approach

Parallelism in image volume

Parallelism in data traces

Page 19:

Parameter Analysis

[Figure: Computation-to-Memory Efficiency. Migration contributions per byte plotted against input trace block (m) and output image block (m).]
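
To see why the ratio of migration contributions to bytes depends on both block sizes, consider the toy model below. It is not the model behind the figure: it assumes every trace in a task contributes once to every image point in the task, that each trace and each image point is moved through memory once, and that values are 4-byte floats. The function name and default parameters are hypothetical.

```python
def contributions_per_byte(n_traces, n_image_points,
                           samples_per_trace=1000, bytes_per_value=4):
    """Toy estimate of migration contributions per byte moved for one task."""
    contributions = n_traces * n_image_points
    bytes_moved = bytes_per_value * (n_traces * samples_per_trace
                                     + n_image_points)
    return contributions / bytes_moved


few_traces = contributions_per_byte(100, 1_000_000)
many_traces = contributions_per_byte(10_000, 1_000_000)
small_image = contributions_per_byte(10_000, 10_000)
```

In this simplified picture, enlarging either the trace block or the image block raises the number of contributions per byte, because the data loaded for a task is reused more often.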

Page 20:

New Approach

• Implementation for general image gathers
  • Offset gather, OVT gather, reflection angle gather, etc.

• Produce a small chunk of image quickly
  • See imaging results as each task finishes

• Improve the overall performance on new hardware
  • The production code was optimized for CUDA and NVIDIA GPUs in 2008/2009

• Develop portable software
  • Hardware architectures change relatively fast
  • Several vendors and varieties of accelerators
  • Several parallel models for various languages

Page 21:

Production vs New Approach

Production approach:

• ~32 traces per task
• Big image block per task
• One gather bin per task
• Pre-filter the data
• Resample the data
• CUDA programming model

New approach:

• ~200k traces per task
• Small image block per task
• Multiple gather bins per task
• Filter on the fly
• Interpolate on the fly
• OCCA programming approach

• Avoiding pre-filtering (for anti-aliasing) and resampling
  – Reduces memory overhead
  – Increases the number of computations per migration contribution
  – Gives greater FLOPs/byte
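
"Interpolate on the fly" can be illustrated with a linear interpolation of the stored trace at the exact travel time, in place of a lookup into a finely resampled copy. This is a minimal sketch under that assumption: the slides do not say which interpolation scheme is used, `sample_at` is a hypothetical name, and the on-the-fly anti-alias filter is not shown.

```python
def sample_at(trace, dt, t):
    """Linearly interpolate `trace` (sample spacing dt, in seconds) at time t.

    Returns 0.0 when t is negative or beyond the last full sample interval.
    """
    if t < 0.0:
        return 0.0
    x = t / dt
    k = int(x)                    # index of the sample at or before t
    if k + 1 >= len(trace):
        return 0.0
    w = x - k                     # fractional position between samples
    return (1.0 - w) * trace[k] + w * trace[k + 1]


trace = [0.0, 2.0, 4.0, 6.0]
value = sample_at(trace, 0.5, 0.75)   # halfway between samples 1 and 2
```

The trade-off matches the slide: a few extra arithmetic operations per contribution, in exchange for not storing or moving a resampled trace.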

Page 22:

Production vs New Approach

[Figure: million migration contributions/s versus output image block length (m), comparing the production approach with the new approach.]

• Input traces are fixed at ~177,000 (NVIDIA K10)

• Pre-filtering and resampling in the production code are not included

Page 23:

New Approach Outcomes

• Improved performance over the production code (best guess: ~2X)

• Generalized gather kernel framework

• Portable implementation
  • Tested and verified CPU vs GPU results
  • Tested and compared OpenCL vs CUDA
  • Performance on AMD GPUs is similar to NVIDIA GPUs

Page 24:

New Approach Kernel: NVIDIA vs AMD

[Figure: million migration contributions/s versus output image volume size (m) for OpenCL + K40, CUDA + K40, and OpenCL + Tahiti.]

• Number of input traces: ~177,000

Page 25:

Project Goals Review

Hardware portability

General image gathers

Improve migration performance

Future Work

• Finish integration of the new kernel into production

• More testing on various accelerators

• Explore using mixed-architecture migrations

Page 26:

Acknowledgements

• Hess Corporation

• CAAM @ Rice University
  • Tim Warburton
  • David Medina