
Post on 20-Dec-2015

Gaussian Elimination

By Yequn Zhang, Yu Zhang

Contents

Introduction
Problem Analysis
Proposed Algorithm
Evaluation

Gaussian Elimination

Forward Elimination
Back Substitution
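The two phases named on this slide can be sketched on the CPU alone. This is a minimal textbook version with partial pivoting, not the authors' implementation; the function names `forward_eliminate`, `back_substitute` and `solve` are chosen here for illustration.

```python
def forward_eliminate(a, b):
    """Reduce the system a x = b in place to upper-triangular form."""
    n = len(a)
    for k in range(n):
        # Partial pivoting: pick the row with the largest |a[i][k]| for i >= k.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            a[k], a[p] = a[p], a[k]
            b[k], b[p] = b[p], b[k]
        # Eliminate column k from every row below the pivot row.
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
            b[i] -= m * b[k]


def back_substitute(a, b):
    """Solve an upper-triangular system, last row first."""
    n = len(a)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def solve(a, b):
    """Solve a x = b without modifying the caller's data."""
    a = [row[:] for row in a]
    b = b[:]
    forward_eliminate(a, b)
    return back_substitute(a, b)
```

For example, `solve([[2.0, 1.0, -1.0], [-3.0, -1.0, 2.0], [-2.0, 1.0, 2.0]], [8.0, -11.0, -3.0])` returns a vector close to `[2.0, 3.0, -1.0]`, up to floating-point rounding.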

Problem Analysis

Data size used by kernels changes continuously
Difficult to find an appropriate block size to avoid divergence

Block-based approach
Assign a certain part of the computation to the CPU, leaving the irregularity to the CPU
Manually make the data size change in steps of the block size
Block number per grid is then easy to set
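One way to read the block-based approach is sketched below. It assumes that at pivot column k the columns up to the end of the current block form the irregular part handled by the CPU, and that the remaining columns, always a whole number of blocks when n is a multiple of the block size, go to the GPU. The helper `split_columns` and the default block size of 64 (the value quoted on the back-substitution slide) are assumptions for illustration only.

```python
def split_columns(n, k, block_size=64):
    """Split the active columns k..n-1 at pivot k into a CPU part and a GPU part.

    Assumed scheme: the CPU takes the ragged columns up to the end of the
    block containing k; the GPU takes the trailing columns, whose count only
    changes in steps of block_size.
    """
    block_end = min(n, (k // block_size + 1) * block_size)
    cpu_cols = range(k, block_end)
    gpu_cols = range(block_end, n)
    # Ceiling division: number of blocks needed to cover the GPU columns.
    blocks_per_grid = -(-len(gpu_cols) // block_size)
    return cpu_cols, gpu_cols, blocks_per_grid
```

With n = 256, pivots 0 through 63 all give three GPU blocks, pivots 64 through 127 give two, and pivots in the last block give none, so the grid size changes only at block boundaries.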

Forward Elimination

A block-based approach
Try to avoid divergence
Try to use GPU
Try to be fine-grained

[Figure: forward-elimination flow diagram. Recoverable labels: kernels K1 to K5; Find Max Row; Swap (CPU); Swap on GPU; Calculate Coefficients; Elimination on CPU; Elimination on GPU; intra-block loop; inter-block loop; last inter-block loop processed on CPU. Annotation: "Now start to eliminate the block of data on CPU".]
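The loop structure named in the diagram can be sketched as follows. This is a CPU-only simulation under stated assumptions, not the authors' kernels: the "GPU" part is an ordinary function call standing in for a kernel launch, the mapping of steps to kernels K1 to K5 is not reproduced, and the names `eliminate_columns` and `blocked_forward_eliminate` are hypothetical.

```python
def eliminate_columns(a, k, m, cols):
    """Apply row_i -= m[i] * row_k over the given columns for all rows below k."""
    for i in range(k + 1, len(a)):
        for j in cols:
            a[i][j] -= m[i] * a[k][j]


def blocked_forward_eliminate(a, b, block_size=64):
    """Forward elimination with an inter-block and an intra-block loop.

    Returns the number of simulated GPU launches. In the last inter-block
    iteration no columns remain for the GPU, so it runs entirely on the CPU.
    """
    n = len(a)
    gpu_launches = 0
    for start in range(0, n, block_size):            # inter-block loop
        end = min(n, start + block_size)
        for k in range(start, end):                  # intra-block loop
            # Find max row and swap it into the pivot position.
            p = max(range(k, n), key=lambda i: abs(a[i][k]))
            if a[p][k] == 0.0:
                raise ValueError("matrix is singular")
            a[k], a[p] = a[p], a[k]
            b[k], b[p] = b[p], b[k]
            # Calculate the elimination coefficients for the rows below k.
            m = [0.0] * n
            for i in range(k + 1, n):
                m[i] = a[i][k] / a[k][k]
            # Irregular part inside the current block: CPU.
            eliminate_columns(a, k, m, range(k, end))
            # Trailing whole blocks: stand-in for a GPU kernel launch.
            if end < n:
                eliminate_columns(a, k, m, range(end, n))
                gpu_launches += 1
            for i in range(k + 1, n):
                b[i] -= m[i] * b[k]
    return gpu_launches


def back_substitute(a, b):
    """Plain CPU back substitution, used here only to check the result."""
    n = len(a)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x
```

Splitting each row update by column range leaves the arithmetic identical to unblocked elimination; only the place where each part would run changes.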

Back Substitution

Launch a kernel when the number of coefficients per row exceeds four block sizes (64*4=256)
Fine-grained, in a similar way to forward elimination: part on CPU and part on GPU
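The threshold rule above can be sketched as a dispatch inside ordinary back substitution. The sketch assumes that "coefficients per row" means the already-solved unknowns to the right of the diagonal in that row; the function names are hypothetical, and the "GPU" path is a plain function standing in for a kernel.

```python
BLOCK_SIZE = 64
GPU_THRESHOLD = 4 * BLOCK_SIZE  # 256, as quoted on the slide


def dot_cpu(row, x, lo, hi):
    """Sum row[j] * x[j] for j in [lo, hi) on the CPU."""
    return sum(row[j] * x[j] for j in range(lo, hi))


def dot_gpu_stand_in(row, x, lo, hi):
    """Placeholder for a GPU reduction kernel; computes the same sum on the CPU."""
    return sum(row[j] * x[j] for j in range(lo, hi))


def back_substitute_hybrid(a, b, threshold=GPU_THRESHOLD):
    """Back substitution that uses the GPU stand-in only for long rows.

    Returns the solution and the number of rows sent to the GPU stand-in.
    """
    n = len(a)
    x = [0.0] * n
    gpu_rows = 0
    for i in range(n - 1, -1, -1):
        count = n - 1 - i  # coefficients to the right of the diagonal
        if count > threshold:
            s = dot_gpu_stand_in(a[i], x, i + 1, n)
            gpu_rows += 1
        else:
            s = dot_cpu(a[i], x, i + 1, n)
        x[i] = (b[i] - s) / a[i][i]
    return x, gpu_rows
```

Rows near the bottom of the matrix have few coefficients and stay on the CPU, where a kernel launch would cost more than the work it does; only the long rows near the top cross the threshold.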

Block size effect

The contribution of swap and find max row

Is it necessary to implement every part on GPU?

Performance breakdown

Contribution of each part to the total performance, including kernels as well as CPU part

Speedup
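The charts for these slides did not survive in the transcript, so no measured values are available here. The sketch below only shows the arithmetic usually behind such charts: each part's share of the total time, and speedup as baseline time divided by accelerated time. The helper names and the part names passed to them are hypothetical.

```python
def breakdown(times):
    """Return each part's fraction of the total time.

    `times` maps a part name (a kernel or a CPU stage) to its elapsed time.
    """
    total = sum(times.values())
    return {name: t / total for name, t in times.items()}


def speedup(baseline_time, accelerated_time):
    """Speedup of the accelerated run relative to the baseline run."""
    return baseline_time / accelerated_time
```

Both functions expect timings in the same unit; the results are unitless.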

Questions?
