automatic differentiation algorithms and data structures

30
Automatic Differentiation Algorithms and Data Structures Chen Fang PhD Candidate Joined: January 2013 FuRSST Inaugural Annual Meeting May 8 th , 2014 1

Upload: others

Post on 19-Feb-2022

2 views

Category:

Documents


0 download

TRANSCRIPT

Automatic Differentiation Algorithms and Data Structures

Chen FangPhD Candidate

Joined: January 2013FuRSST Inaugural Annual Meeting

May 8th, 2014

1

About 𝟐 πŸ‘ of the code for physics in simulators computes

derivatives

2

Automatic Differentiation (AD)

1. AD data type

π‘₯1 β†’ π‘₯1,πœ• π‘₯1πœ•π‘₯1

,πœ• π‘₯1πœ•π‘₯2

2. Operator overloading

π‘₯1 βˆ— π‘₯2 = π‘₯1 βˆ— π‘₯2, π‘₯1 βˆ—πœ• π‘₯2πœ•π‘₯1

,πœ• π‘₯2πœ•π‘₯2

+ π‘₯2 βˆ—πœ• π‘₯1πœ•π‘₯1

,πœ• π‘₯1πœ•π‘₯2

3

Value Gradient

𝛻 π‘₯ βˆ— 𝑦 = x𝛻𝑦 + y𝛻π‘₯

Example of Automatic Differentiation

4

𝑓 π‘₯1, π‘₯2 = cos π‘₯1 + π‘₯1 βˆ— 𝑒π‘₯𝑝 π‘₯2

Procedures:

π‘₯1 = π‘₯1 , 1, 0 π‘₯2 = π‘₯2 , 0, 1

cos π‘₯1 = cos π‘₯1 , βˆ’ sin π‘₯1 , 0𝑒π‘₯𝑝 π‘₯2 = 𝑒π‘₯𝑝 π‘₯2 , 0 , 𝑒π‘₯𝑝 π‘₯2 π‘₯1 βˆ— 𝑒π‘₯𝑝 π‘₯2 = π‘₯1𝑒π‘₯𝑝 π‘₯2 , 𝑒π‘₯𝑝 π‘₯2 , π‘₯1𝑒π‘₯𝑝 π‘₯2

𝑓 π‘₯1, π‘₯2 = cos π‘₯1 + π‘₯1𝑒π‘₯𝑝 π‘₯2 , βˆ’ sin π‘₯1 +𝑒π‘₯𝑝 π‘₯2 , π‘₯1𝑒π‘₯𝑝 π‘₯2

πœ•π‘“

πœ•π‘₯1

πœ•π‘“

πœ•π‘₯2

Popular AD Packages

β€’ There are many AD packages available: www.autodiff.org– OpenAD– ADOL-C

β€’ Various implementation methods– Source Transformation– Operator Overloading

β€’ Various implementation language– Fortran– C/C++– MATLAB– Python

β€’ Applications that use AD:– Ocean circulation– Optimization of dynamic systems governed by PDEs– Nuclear reactor applications

5

Challenges for AD in Reservoir Simulation

1. Variable sparsity pattern

6

Diagonal StencilingBlock diagonal

Evaluation

Challenges for AD in Reservoir Simulation

2. Variable switching

7

Automatically Differentiable Expression Templates Library (ADETL)

β€’ Developed by Rami Younis at Stanford University

β€’ Initial prototype to prove viability of AD for reservoir simulation

β€’ ADETL solves some reservoir simulation chanllenges:

– Block sparse gradient data structure

– Two algorithms to compute derivative operations

– Variable switching and adaptive implicit problems

– Builds runtime expression for derivatives

8

Sparse Algorithms in ADETL

9

Algorithm 1: axpy

Phase 1 Phase 2

Sparse Algorithms in ADETL

10

Algorithm 2: running pointers

Variable Switching in ADETL

Independent variable set:

[ Po, Pg, Pw, So, Sg, x1, x2, x3, x4, y1, y2, y3, y4]

Gas phase disappears:

[ Po, Pg, Pw, So, Sg, x1, x2, x3, x4, y1, y2, y3, y4]

11

Components in liquid phase

Components in gas phase

Auto (de)activation

Application of ADETL

β€’ ADETL has been successfully applied in thedevelopment of reservoir simulators:– AD-GPRS (Stanford University)

– Unconventional shale gas reservoir simulator (FuRSST)

β€’ We are improving ADETL towards ADETL 2.0

12

So what’s next?

1. Can we deal with various degrees of sparsity patterns?

2. Can we avoid the cost of runtime sparsity detection?

13

Non-zero Call Stack

14

15

1.2 2.3 1.4

0.5 2.8 2.2

1.2 0.5 0 2.3 4.2 0 2.2 0

0.1 0.2 0.3 0.4

1.2 1.5 1.4 1.8

1.3 1.7 1.7 2.2

1.2

0.7

1.9

Univariate

Dense Multivariate

Sparse Multivariate

ADETL

Stage 1 Univariate terms

Test case: product of sequence

16

𝐹 π‘₯ =

𝑖=1

𝑁

𝑓𝑖 π‘₯

𝐹′ π‘₯ =

𝑗=1

𝑁𝑑𝑓𝑗 π‘₯

𝑑π‘₯

𝑖≠𝑗

𝑓𝑖 π‘₯

Case # # of arguments (N)

1 3

2 6

3 9

Univariate Test 1

1. Hand differentiation

17

N Value & Derivative Expressions

1 𝐹 π‘₯ = 𝑓1 π‘₯𝐹′ π‘₯ = 𝑓1

β€² π‘₯

2 𝐹 π‘₯ = 𝑓1 π‘₯ βˆ— 𝑓2 π‘₯𝐹′ π‘₯ = 𝑓1

β€² π‘₯ βˆ— 𝑓2 π‘₯ + 𝑓1 π‘₯ βˆ— 𝑓2β€² π‘₯

3 𝐹 π‘₯ = 𝑓1 π‘₯ βˆ— 𝑓2 π‘₯ βˆ— 𝑓3 π‘₯𝐹′ π‘₯= 𝑓1

β€² π‘₯ βˆ— 𝑓2 π‘₯ βˆ— 𝑓3 π‘₯ + 𝑓1 π‘₯ βˆ— 𝑓2β€² π‘₯ βˆ— 𝑓3 π‘₯ + 𝑓1 π‘₯

βˆ— 𝑓2 π‘₯ βˆ— 𝑓3β€² π‘₯

𝐹 π‘₯ =

𝑖=1

𝑁

𝑓𝑖 π‘₯

𝐹′ π‘₯ =

𝑗=1

𝑁𝑑𝑓𝑗 π‘₯

𝑑π‘₯

𝑖≠𝑗

𝑓𝑖 π‘₯

Univariate Test 2

2. ADunivariate (new datatype)

18

N ADunivariate Expression

1 R = a

2 R = a * b

3 R = a * b * c

Univariate Test 3

3. ADETL

(block sparse gradient)

19

N ADETL Expression

1 R = a

2 R = a * b

3 R = a * b * c

Univariate Test Result

20

𝐹 π‘₯ =

𝑖=1

𝑁

𝑓𝑖 π‘₯

𝐹′ π‘₯ =

𝑗=1

𝑁𝑑𝑓𝑗 π‘₯

𝑑π‘₯

𝑖≠𝑗

𝑓𝑖 π‘₯

Stage 2 Dense Multivariate

Test case: summation

21

𝐹 π‘₯1, π‘₯2…π‘₯𝑛 =

𝑖=1

4

𝑓𝑖 π‘₯1, π‘₯2…π‘₯𝑛

πœ•πΉ π‘₯1, π‘₯2…π‘₯π‘›πœ•π‘₯1

=

𝑖=1

4πœ•π‘“π‘– π‘₯1, π‘₯2…π‘₯𝑛

πœ•π‘₯1

…………

πœ•πΉ π‘₯1, π‘₯2…π‘₯π‘›πœ•π‘₯𝑛

=

𝑖=1

4πœ•π‘“π‘– π‘₯1, π‘₯2…π‘₯𝑛

πœ•π‘₯𝑛

Case # # of independent variables

1 5

2 20

3 80

Dense Multivariate Test 1

1. Manual implementation

22

1 2 3 4 5

1 2 3 4 5 … … 20

1 2 3 4 5 … … … … … 79 80

Case 1

Case 2

Case 3

Dense Multivariate Test 2

23

Vector1 Vector2 Vector3

Vector1 + Vector2

Vector1 + Vector2 + Vector3

2. Expression Templates with dense gradient

Vector4

Vector1 + Vector2 + Vector3+Vector4

Dense Multivariate Test 3

3. ADETL (block sparse gradient)

24

1 2 3 4 5

1 2 3 4 5 … … 20

1 2 3 4 5 … … … … … 79 80

Case 1

Case 2

Case 3

Problem 2 - Dense Multivariate

25

𝐹 π‘₯1, π‘₯2…π‘₯𝑛 =

𝑖=1

𝑁

𝑓𝑖 π‘₯1, π‘₯2…π‘₯𝑛

0

50000

100000

150000

200000

250000

300000

350000

400000

5 20 80

Tim

e(m

icro

seco

nd

s)

Number of elements in dense vector

Manual Expression Template ADETL

11.4X

4.1X

7.5X

1. Manual2. Expression Template3. ADETL

1X

1X

1X

Avoiding sparse operations

1 2 3 4 5 6 7 8 9 d1 d2 d3 1 2 3

1 x x 1 x x

2 x x x 1 x x x

3 x x x 1 x x x

4 x x x 1 x x x

5 x x x 1 x x x

6 x x x 1 x x x

7 x x x 1 x x x

8 x x x 1 x x x

9 x x 1 x x

x =

26

sparse

Sparse Jacobian Seed matrix Compressed Jacobian

dense

Compressed AD

1. Variable activation

2. Evaluation

27

1

1

1

1

1

1

1

1

1

1 1

2 1

3 1

4 1

5 1

6 1

7 1

8 1

9 1

x x

x x x

x x x

x x x

x x x

x x x

x x x

x x x

x x

x x

x x x

x x x

x x x

x x x

x x x

x x x

x x x

x x

recover

(Dense Jacobian)

𝑱 =

𝑼 =

𝑼 𝑺

𝑱 =

Challenge with Compressed AD

1 2 3 4 5 6 7 8 9 d1 d2 d3 1 2 3

1 x x 1 x x

2 x x x 1 x x x

3 x x x 1 x x x

4 x x x 1 x x x

5 x x x 1 x x x

6 x x x 1 x x x

7 x x x 1 x x x

8 x x x 1 x x x

9 x x 1 x x

x =

28

1. Sparsity pattern is unknown2. Variable switching

Sparse Jacobian Seed matrix Compressed Jacobian

ADETL 2.0

β€’ Full range of data types and efficient algorithms

β€’ Ability to convert from one type to the other

β€’ Compressed AD

β€’ Automatic variable set

β€’ Seed matrix will be incorporated

β€’ Benchmark kernel to test AD packages for reservoir simulation calculations

29

Automatic Differentiation Algorithms and Data Structures

Chen FangPhD Candidate

Joined: January 2013FuRSST Inaugural Annual Meeting

May 8th, 2014

30