optimizations in mlir lto and data layout compiler tree

30
LTO and Data Layout Optimizations in MLIR Ranjith/Prashantha Compiler Tree Technologies

Upload: others

Post on 06-Oct-2021

12 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Optimizations in MLIR LTO and Data Layout Compiler Tree

LTO and Data Layout Optimizations in MLIR

Ranjith/PrashanthaCompiler Tree Technologies

Page 2: Optimizations in MLIR LTO and Data Layout Compiler Tree

Agenda● Motivation for LTO in MLIR ● CIL - LTO ● Data layout optimisations ● Instance interleaving ● Dead field elimination

Page 3: Optimizations in MLIR LTO and Data Layout Compiler Tree

Motivation

FC

CML

CLANG

CIL Module

LLVM Module

CIL Pass Manager

STL Optimisations

LNODLO

Inliner

llvm-opt bitcodec

c

fortran

Runs only on single module

Page 4: Optimizations in MLIR LTO and Data Layout Compiler Tree

Motivation

bitcode

bitcode

bitcode

LLVM Module

Target lowering

LLVM LTO Pass managerllvm-lto

Page 5: Optimizations in MLIR LTO and Data Layout Compiler Tree

Motivation

MLIR Module CIL Module LLVM Module

CIL Pass Manager

STL OptimisationsLNODLO

Inliner

mlir-opt

Page 6: Optimizations in MLIR LTO and Data Layout Compiler Tree

Motivation

MLIR Module

MLIR Module

MLIR Module

CIL Module LLVM Module

CIL Pass Manager

STL OptimisationsLNODLO

Inliner

Page 7: Optimizations in MLIR LTO and Data Layout Compiler Tree

Why LTO in MLIR?● MLIR can represent high level source constructs like Multi-dimensional

arrays.● This reduces the analysis required during optimisations. ● Optimisations like LNO, DLO are best performed at MLIR. ● Coverage of these optimisations is limited if entire application is not

represented using single MLIR module.● Running LNO/DLO at MLIR on whole application requires a LTO support

Page 8: Optimizations in MLIR LTO and Data Layout Compiler Tree

Example - C

f1.c f2.c

Page 9: Optimizations in MLIR LTO and Data Layout Compiler Tree

CIL - LTO● CIL is a MLIR dialect designed to represent language constructs of C, C++

and Fortran.● CIL LTO is a framework to link multiple MLIR modules into one.● There is very less dependency on dialect and can be easily extended to

work on any dialect.● Multiple SPEC CPU 2017 benchmarks can be compiled with the framework● https://llvm.org/devmtg/2020-09/slides/CIL_Common_MLIR_Abstraction.pdf

Page 10: Optimizations in MLIR LTO and Data Layout Compiler Tree

LTO Framework

MLIR Module

MLIR Module

MLIR Module

CIL Module LLVM Module

CIL Pass Manager

STL OptimisationsLNODLO

Inliner

CIL LTO Framework

Page 11: Optimizations in MLIR LTO and Data Layout Compiler Tree

LTO - Process

Read the module files

All the input modules are parsed and stored in a list

Symbol Resolution

Symbol resolution happens for top level operations like global variables operations and function operations

Create new module

Single MLIR module is created and fed to the pipeline

Page 12: Optimizations in MLIR LTO and Data Layout Compiler Tree

LTO Example - C

Page 13: Optimizations in MLIR LTO and Data Layout Compiler Tree

LTO Example - C

Page 14: Optimizations in MLIR LTO and Data Layout Compiler Tree

LTO Example

Page 15: Optimizations in MLIR LTO and Data Layout Compiler Tree

Data Layout Optimisations

Page 16: Optimizations in MLIR LTO and Data Layout Compiler Tree

Data Layout Optimisations● Modifying structure/array patterns for better cache utilization. ● Implemented in both LLVM and MLIR

https://llvm.org/devmtg/2014-10/Slides/Prashanth-DLO.pdf

Page 17: Optimizations in MLIR LTO and Data Layout Compiler Tree

Structure Splitting/Peeling

Page 18: Optimizations in MLIR LTO and Data Layout Compiler Tree

DLO - Instance Interleaving

https://llvm.org/devmtg/2014-10/Slides/Prashanth-DLO.pdf

Array[0].a Array[0].b Array[0].c Array[0].d

Array[1].a Array[1].b Array[1].c Array[1].d

Array[2].a Array[2].b Array[2].c Array[2].d

Page 19: Optimizations in MLIR LTO and Data Layout Compiler Tree

DLO - Instance Interleaving

https://llvm.org/devmtg/2014-10/Slides/Prashanth-DLO.pdf

Page 20: Optimizations in MLIR LTO and Data Layout Compiler Tree

DLO - Instance Interleaving

https://llvm.org/devmtg/2014-10/Slides/Prashanth-DLO.pdf

Array.a[0] Array.a[1] Array.a[2] Array.a[3]

Array.a[4] Array.a[5] Array.a[6] Array.a[7]

Page 21: Optimizations in MLIR LTO and Data Layout Compiler Tree

Data Layout Optimisations in MLIR● Cross module optimization ● Instance interleaving and Dead field elimination optimisations are

implemented in MLIR● Runs as LTO passes● Identification of struct access is simpler as compared to LLVM because

there is separate operation for struct access. ● Approximately 35% improvement is seen in one of SPEC CPU 2017

benchmark.

Page 22: Optimizations in MLIR LTO and Data Layout Compiler Tree

DLO - Instance Interleaving● Identify the profitable and legal structs to transform● Identify arrays of structures whose different fields are accessed in

different loops● Create and allocate new structure type. ● Rewrite rewrite old accesses.

Page 23: Optimizations in MLIR LTO and Data Layout Compiler Tree

DLO - Dead field elimination

b is write only field and c not at all accessed

Page 24: Optimizations in MLIR LTO and Data Layout Compiler Tree

DLO - Dead field elimination● Classify the struct fields

○ READ - Field is loaded in the use○ WRITE - Some value is being written to the field○ UNKNOWN - Any use other than read/write○ NOACCESS - Field is not at all used.

● Remove the NOACCESS and WRITE only fields.

● Rewrite the uses

Page 25: Optimizations in MLIR LTO and Data Layout Compiler Tree

Dead field elimination - Transformation

Page 26: Optimizations in MLIR LTO and Data Layout Compiler Tree

Dead field elimination - Unknown Access

Page 27: Optimizations in MLIR LTO and Data Layout Compiler Tree

Dead field elimination - 505.mcf_r

defines.h

Page 28: Optimizations in MLIR LTO and Data Layout Compiler Tree

Dead field elimination - 505.mcf_r

defines.h

Page 29: Optimizations in MLIR LTO and Data Layout Compiler Tree

Future works● Improve coverage of DLO● Implement other data layout optimisations like structure peeling,

structure splitting etc.

Page 30: Optimizations in MLIR LTO and Data Layout Compiler Tree

Thank You