Source Transformation via Operator Overloading in MATLAB
Matthew J. Weinstein, Anil V. Rao
Department of Mechanical and Aerospace Engineering, University of Florida
Gainesville, FL 32611-6250
15th EuroAD Workshop, June 16th and 17th, 2014
1 / 16
Motivation
• Target Applications - all require repeated evaluation of derivatives
  • First derivatives of stiff ODEs
  • First and second derivatives of NLPs
  • Particular emphasis on vectorized functions which result from collocation methods - direct collocation optimal control
• Generate Stand-Alone Derivative Files
  • Efficient at run-time
  • Can apply method recursively - allows for second and higher order derivative computation without complexity of coding second and higher order derivative rules
• Pure Source Transformation in MATLAB is Hard
  • Interpreted nature of the language makes it difficult to determine what is happening from a purely lexical analysis
2 / 16
Key Points of the Method
1 Fix the following input information:
  • All variable sizes
  • All derivative sparsity patterns
  • Any reference/assignment indices
2 Offload as Much Information as Possible to File Generation Steps
3 Evaluate Basic Blocks on Overloaded CADA Objects
  • Utilizes sparse forward mode AD
  • Objects do not contain numeric values but rather
    • Variable sizes
    • Derivative sparsity patterns
    • Symbolic identifiers
  • Overloaded evaluations result in function and derivative calculations being printed to a file as well as propagation of sparsity patterns within the objects.
4 Transcribe Flow Control from Function Program to Derivative Program
  • Requires an initial source to source transformation
  • Pseudo-overloading of flow control statements
5 Vectorized derivative computation of vectorized functions
3 / 16
Basics of Overloaded Class
Sparse Notation:
• $f = f(x) : \mathbb{R}^n \to \mathbb{R}^m$
• $d^f_x \in \mathbb{R}^q$, $q \le mn$: the non-zero elements of $\nabla_x f(x) \in \mathbb{R}^{m \times n}$
• $i^f_x \in \mathbb{R}^q$: the row locations of $d^f_x$ in $\nabla_x f(x)$
• $j^f_x \in \mathbb{R}^q$: the column locations of $d^f_x$ in $\nabla_x f(x)$
Basic Object Properties:
• F.func.name = 'f.f' = string identifier of what f is written to in generated file
• F.func.size = [m 1]
• F.deriv.name = 'f.dx' = string identifier of what $d^f_x$ is written to in generated file
• F.deriv.nzlocs = [$i^f_x$ $j^f_x$]
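As a concrete illustration of the sparse notation, the following MATLAB sketch stores the non-zero Jacobian entries of a small function as a triplet and rebuilds the full Jacobian from it. The function f(x) = [x1*x2; sin(x3)] and all variable names are assumptions chosen for the example; they do not come from the slides.

```matlab
% Illustrative function (an assumption): f(x) = [x1*x2; sin(x3)], n = 3, m = 2
x = [2; 3; 0];
m = 2;
n = 3;

% Non-zero Jacobian entries and their row/column locations
ifx = [1; 1; 2];                 % row locations of the non-zeros
jfx = [1; 2; 3];                 % column locations of the non-zeros
dfx = [x(2); x(1); cos(x(3))];   % values: df1/dx1, df1/dx2, df2/dx3

% Rebuild the full m-by-n Jacobian from the sparse triplet
J = full(sparse(ifx, jfx, dfx, m, n));
disp(J)
```

Only dfx changes with x; ifx and jfx are fixed once the function is known, which is why an object can carry them without carrying numeric values.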
4 / 16
Basic Overloaded Operation Example
Consider $g = g(f) : \mathbb{R}^m \to \mathbb{R}^p$, where $f = f(x)$, and the corresponding overloaded operation G = G(F).
Procedure of G:
1 Determine size(g) given m and assign to G.func.size
2 Assign a function variable name to G.func.name, call this 'g.f'
3 Print calculation to file which will compute g(f)
  Ex: g.f = sin(f.f);
4 Determine $i^g_x$ and $j^g_x$ using $i^f_x$ and $j^f_x$ and assign to G.deriv.nzlocs
5 Assign a derivative variable name to G.deriv.name, call this 'g.dx'
6 Print derivative calculations to file which will compute $d^g_x(f, d^f_x)$ - typically dependent upon one or more reference or assignment indices
  Ex: g.dx = cos(f.f(ifx)).*f.dx
NOTE: Non-zero derivative computations are typically only valid for the fixed values of $i^f_x$
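The six steps can be imitated with plain structs. The sketch below is a hypothetical stand-in for the overloaded sin operation, not the actual CADA class: it propagates size and sparsity, builds the two lines that would be printed, and then runs those lines on assumed numeric data to check them. The sizes, the sparsity pattern, and the numeric values are all made up for illustration.

```matlab
% Hypothetical stand-in for the overloaded object F (sizes and patterns only)
F.func.name    = 'f.f';
F.func.size    = [3 1];
F.deriv.name   = 'f.dx';
F.deriv.nzlocs = [1 1; 2 1; 2 2; 3 2];   % [ifx jfx]

% Overloaded G = sin(F): elementwise, so size and sparsity pattern carry over
G.func.name    = 'g.f';
G.func.size    = F.func.size;
G.deriv.name   = 'g.dx';
G.deriv.nzlocs = F.deriv.nzlocs;

% Lines that would be printed to the derivative file (fixed index ifx is baked in)
ifx = F.deriv.nzlocs(:,1);
funcLine  = sprintf('%s = sin(%s);', G.func.name, F.func.name);
derivLine = sprintf('%s = cos(%s(%s)).*%s;', ...
    G.deriv.name, F.func.name, mat2str(ifx), F.deriv.name);
disp(funcLine);
disp(derivLine);

% Run the printed lines on assumed numeric data to check them
f.f  = [0; pi; pi/2];
f.dx = [1; 2; 3; 4];
eval(funcLine);
eval(derivLine);
```

Because the index vector is written into the printed line as a constant, the line is only correct for this particular sparsity pattern, which is the point of the NOTE above.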
5 / 16
Issues
If simply evaluating a function file on the overloaded class
1 In general, cannot evaluate branching statements (numeric values are free) - multiple possible branches
  • is Y > 0? - maybe
2 If loops are unrollable, unrolled loop will be printed - derivative code can get quite large
3 Printed derivative code hard to read, no variable names copied over
4 Redundant 1st through (n−1)th derivative calculations printed in nth derivative file
6 / 16
Handling Flow Control
Branching Statements:
• Evaluate each possible branch independently on overloaded objects
Loops:
• Want to evaluate single loop iteration on overloaded objects and print calculations valid for all loop iterations
Issues:
1 Calculations printed by overloaded evaluations only valid for fixed values of function size and non-zero derivative locations
  • Different branches may result in objects containing different non-zero derivative locations
  • Inputs to loops may be iteration dependent
2 Iteration dependent organizational operations
  • Consider the reference g = f(k), derivative rule is then $d^g_x = d^f_x(k_x)$, but there does not, in general, exist a mapping from function reference index k to the derivative reference index $k_x$
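To see why the derivative reference index cannot be derived from k alone, consider the small sketch below. The sparsity data are assumed for illustration. For each function index k, the derivative index kx is found by searching the row locations, and both its values and its length change from one k to the next.

```matlab
% Assumed sparsity data of f: row locations, column locations, non-zero values
ifx = [1; 2; 2; 3];
jfx = [1; 1; 2; 2];
dfx = [10; 20; 30; 40];

kxAll = cell(3, 1);
for k = 1:3
    % Derivative reference index for g = f(k): all non-zeros lying in row k
    kx = find(ifx == k);
    kxAll{k} = kx;
    fprintf('k = %d -> kx = %s, columns = %s, dgx = %s\n', ...
        k, mat2str(kx), mat2str(jfx(kx)), mat2str(dfx(kx)));
end
```

Here k = 2 yields two derivative elements while k = 1 and k = 3 yield one each, so a single printed line of the form dgx = dfx(kx) with a constant kx cannot serve every iteration.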
7 / 16
Unions and Re-Maps of Overloaded Objects
• W = U ∪ V (W is the union of U and V) has the following properties:
  1 Row/column dimension of W is considered to be maximum row/column dimension of U and V
  2 Derivative element of W is only considered to be zero if the corresponding derivative elements of U and V are also zero
• W = U → W (re-map of U to W)
  1 Append zeros to rows/columns of u to get w
  2 Initialize $d^w_x$ to be vector of zeros
  3 Map $d^u_x$ into proper elements of $d^w_x$
• U = W → U (re-map of W to U)
  1 Remove rows/columns of w to get u
  2 Reference proper elements of $d^w_x$ to get $d^u_x$
• Assume that any calculations printed by the operation g(W) are valid for g(U) and g(V)
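A minimal sketch of the union and the two re-maps, using plain structs in place of overloaded objects, might look as follows. The field names, the example patterns, and the row-sorted ordering of the non-zeros produced by union are assumptions of this sketch and need not match the tool's internal ordering.

```matlab
% Assumed objects: function size and [row col] locations of non-zero derivatives
U.size = [2 1];  U.nzlocs = [1 1; 2 2];
V.size = [3 1];  V.nzlocs = [1 1; 3 2];

% Union: largest dimensions, and every location that is non-zero in U or V
W.size   = max(U.size, V.size);
W.nzlocs = union(U.nzlocs, V.nzlocs, 'rows');

% Positions of U's non-zeros inside W's non-zero list
[~, uInW] = ismember(U.nzlocs, W.nzlocs, 'rows');

% Re-map U -> W: pad the function value, scatter dux into a zero vector
u   = [4; 5];
dux = [5; 7];
w   = [u; zeros(W.size(1) - U.size(1), 1)];
dwx = zeros(size(W.nzlocs, 1), 1);
dwx(uInW) = dux;

% Re-map W -> U: drop the padding, gather the proper elements of dwx
uBack   = w(1:U.size(1));
duxBack = dwx(uInW);
```

The index vector uInW plays the role of the fixed reference/assignment index that would be written into the printed re-map code.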
8 / 16
Overmapped Objects
Consider a variable y which can be assigned n different objects $Y^{(1)}, \ldots, Y^{(n)}$
• Define the overmap of y to be $Y = \bigcup_{i=1}^{n} Y^{(i)}$
• $Y^{(i)}$ can be thought of as the output of the i-th branch of a conditional fragment or the input to the i-th iteration of a loop
• Ignoring iteration/branch dependent organizational operations, can assume
  1 If y is a loop input, then evaluating a single iteration of a loop on Y will print valid calculations for all iterations
  2 If y is a conditional fragment output, then evaluating the rest of the program on Y will print valid calculations for any branch
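The overmap can be pictured as an element-wise OR over the sparsity patterns that y takes on. The sketch below represents each pattern as a logical mask, which is a representation assumed for illustration only, and accumulates the overmap across three hypothetical iterations or branches.

```matlab
% Assumed Jacobian sparsity masks of y in three iterations/branches (3-by-2)
patterns = { logical([1 0; 0 0; 0 0]), ...
             logical([1 0; 0 1; 0 0]), ...
             logical([0 0; 0 1; 1 0]) };

% Overmap: an element is zero only if it is zero in every pattern
overmap = false(3, 2);
for i = 1:numel(patterns)
    overmap = overmap | patterns{i};
end

% Row/column locations of the overmapped non-zeros (column-major order)
[iy, jy] = find(overmap);

% Every individual pattern is contained in the overmap
contained = cellfun(@(p) all(overmap(p)), patterns);
```

Since each individual pattern is a subset of the overmap, code printed for the overmapped pattern treats some structural zeros as non-zeros but never drops a true non-zero.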
9 / 16
The ADiGator Algorithm
Inputs: user program P(x) together with information to instantiate X
1 Transform user program P to intermediate program P′
  • Augment the original program with calls to transformation routines
  • Replace flow control statements with calls to transformation routines
2 Evaluate intermediate program on X three times
  1 Empty Parsing Evaluation
    • Collect information on data and control flow (similar to control flow graphs and data flow graphs of compilers) - no derivative computations performed, no calculations printed to derivative file
    • Use this to determine what overmapped objects must be built and which objects belong to them
  2 Overmapping Evaluation
    • Build/store required overmapped objects and collect loop organizational operation data
  3 Printing Evaluation
    • Use the collected data from previous two evaluations to print the derivative file
10 / 16
Transformation of Basic Blocks
Original basic block A (between Pred(A) and Succ(A)):
  y1 = s1;
  y2(i) = s2;
Transformed basic block A′ (between Pred(A′) and Succ(A′)):
  y1 = s1;
  y1 = VarAnalyzer('y1 = s1',y1,'y1',0);
  y2(i) = s2;
  y2 = VarAnalyzer('y2(i) = s2',y2,'y2',1);
VarAnalyzer has control over evaluating workspace
• VarAnalyzer used to track variables and manipulate evaluating workspace
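The augmentation of a basic block can be imitated with a few lines of string processing. The sketch below is a toy, not the tool's transformation routine: it handles only simple single-assignment statements, and it assumes that the final VarAnalyzer argument flags a subscripted assignment, which is inferred from the two lines shown above.

```matlab
% Statements of the original basic block
src = {'y1 = s1;', 'y2(i) = s2;'};

out = {};
for n = 1:numel(src)
    stmt   = regexprep(strtrim(src{n}), ';$', '');               % drop trailing semicolon
    name   = regexp(stmt, '^\w+', 'match', 'once');              % assigned variable name
    isSubs = ~isempty(regexp(stmt, '^\w+\s*\(', 'once'));        % subscripted assignment?
    out{end+1} = src{n};                                         % keep original statement
    out{end+1} = sprintf('%s = VarAnalyzer(''%s'',%s,''%s'',%d);', ...
        name, stmt, name, name, double(isSubs));                 % add the analyzer call
end
fprintf('%s\n', out{:});
```

Each original statement is kept and followed by one added call, so the intermediate program still performs the assignment before the analyzer sees the result.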
11 / 16
Transformation of Conditional Fragments
Original conditional fragment C_k (between Pred(C_k) and Succ(C_k)):
  if s1
    B_{k,1}
  elseif s2
    B_{k,2}
  ...
  else
    B_{k,n}
  end
• Each branch evaluated independently
• Overmapped outputs are built and brought into evaluating workspace
• In printing evaluation, perform re-maps after evaluating each branch
Transformed conditional fragment C′_k (between Pred(C′_k) and Succ(C′_k)):
  cadacond1 = s1;
  cadacond1 = VarAnalyzer('s1',cadacond1,'cadacond1',0);
  ...
  cadacondn-1 = sn-1;
  cadacondn-1 = VarAnalyzer('sn-1',cadacondn-1,'cadacondn-1',0);
  IfIterStart(k,1);
  B′_{k,1}
  IfIterEnd(k,1);
  [IfEvalStr,IfEvalVar] = IfIterStart(k,2);
  if not(isempty(IfEvalStr))
    cellfun(@eval,IfEvalStr);
  end
  B′_{k,2}
  IfIterEnd(k,2);
  [IfEvalStr,IfEvalVar] = IfIterStart(k,3);
  if not(isempty(IfEvalStr))
    cellfun(@eval,IfEvalStr);
  end
  ...
  B′_{k,n}
  [IfEvalStr,IfEvalVar] = IfIterEnd(k,n);
  if not(isempty(IfEvalStr))
    cellfun(@eval,IfEvalStr);
  end
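The effect of re-mapping after each branch can be seen in a hand-written imitation of a derivative file. The code below is not actual tool output. The function (y = x1*x2 if x1 > 0, otherwise y = x2^2, followed by z = 3*y) and all names are assumptions. Each branch computes its own non-zeros and then re-maps them into the overmapped pattern, so the code after the conditional is valid whichever branch ran.

```matlab
% Assumed input point
x = [-1.5; 2];

if x(1) > 0
    y.f   = x(1)*x(2);
    cada1 = [x(2); x(1)];     % branch non-zeros: dy/dx1, dy/dx2
    y.dx  = zeros(2, 1);      % re-map into overmapped pattern (x1 and x2)
    y.dx([1 2]) = cada1;
else
    y.f   = x(2)^2;
    cada1 = 2*x(2);           % branch non-zero: dy/dx2 only
    y.dx  = zeros(2, 1);      % re-map into overmapped pattern (x1 and x2)
    y.dx(2) = cada1;
end

% Code after the conditional is written once, for the overmapped pattern
z.f  = 3*y.f;
z.dx = 3*y.dx;
```

The second branch carries one explicit zero that a branch-specific file would have omitted; that is the price paid for a single set of downstream calculations.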
12 / 16
Transformation of Loops
Original loop L_k (between Pred(L_k) and Succ(L_k)):
  for i = sf
    I_k
  end
• Build overmapped inputs and final output, collect organizational operation data by evaluating all iterations in overmapping eval
• On printing eval, re-map inputs to overmapped inputs, evaluate single iteration on overmaps, re-map overmapped outputs to true outputs
Transformed loop L′_k (between Pred(L′_k) and Succ(L′_k)):
  cadaLoopVar_k = sf;
  cadaLoopVar_k = ...
    VarAnalyzer('sf',cadaLoopVar_k,'cadaLoopVar_k',0);
  [adigatorForVar_k, ForEvalStr, ForEvalVar] = ...
    ForInitialize(k,cadaLoopVar_k);
  if not(isempty(ForEvalStr))
    cellfun(@eval,ForEvalStr)
  end
  for adigatorForVar_k_i = adigatorForVar_k
    cadaForCount_k = ForIterStart(k,adigatorForVar_k_i);
    i = cadaLoopVar_k(:,cadaForCount_k);
    i = VarAnalyzer('cadaLoopVar_k(:,cadaForCount_k)',i,'i',0);
    I′_k
    [ForEvalStr, ForEvalVar] = ...
      ForIterEnd(k,adigatorForVar_k_i);
  end
  if not(isempty(ForEvalStr))
    cellfun(@eval,ForEvalStr)
  end
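One way to picture the collected organizational operation data is as a table with one row of reference indices per iteration. The sketch below is an assumed illustration of that idea for the reference g = f(k) inside a loop: the overmapped g has two possible non-zeros, and a zero entry in the table marks an element that is structurally zero in that iteration. The table layout and the names are hypothetical.

```matlab
% Assumed non-zero derivative values of f
dfx = [10; 20; 30; 40];

% Hypothetical data gathered in the overmapping evaluation: row k lists, for each
% overmapped non-zero of g, which element of dfx to read in iteration k (0 = none)
refIdx = [1 0;
          2 3;
          0 4];

dgxAll = zeros(2, 3);
for k = 1:3
    % Single loop body valid for every iteration
    idx = refIdx(k, :);
    nz  = idx > 0;
    dgx = zeros(2, 1);
    dgx(nz) = dfx(idx(nz));
    dgxAll(:, k) = dgx;
end
```

The loop body is printed once and stays rolled; only the row of the table that is read changes with k.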
13 / 16
Questions?
https://sourceforge.net/projects/adigator/
We gratefully acknowledge support for this research from Office of Naval Research Grant N00014-11-1-0068 and from the U.S. Defense Advanced Research Projects Agency (DARPA) under Contract HR0011-12-0011.
14 / 16
Vectorized Functions
Continuous Function:
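• $f = f(x(t)) : \mathbb{R}^n \to \mathbb{R}^m$
• $d^f_x \in \mathbb{R}^q$, $q \le mn$: the non-zero elements of $\nabla_x f(x) \in \mathbb{R}^{m \times n}$
• $i^f_x \in \mathbb{R}^q$: the row locations of $d^f_x$ in $\nabla_x f(x)$
• $j^f_x \in \mathbb{R}^q$: the column locations of $d^f_x$ in $\nabla_x f(x)$
Discretized Function:
• Let $X = [X_1 \; X_2 \; \cdots \; X_N] \in \mathbb{R}^{n \times N}$, where $X_i = x(t_i)$
• $F(X) = [f(X_1) \; f(X_2) \; \cdots \; f(X_N)] \in \mathbb{R}^{m \times N}$
$\nabla_X F(X) \in \mathbb{R}^{n \times N \times m \times N}$ defined by:
• dimensions n, m, N
• row and column locations $i^f_x \in \mathbb{R}^q$ and $j^f_x \in \mathbb{R}^q$
• Non-zero derivatives $D^F_X = [d^f_{X_1} \; d^f_{X_2} \; \cdots \; d^f_{X_N}] \in \mathbb{R}^{q \times N}$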
15 / 16
Vectorized Differentiation
• Allow vectorized dimension N to be free
• Store non-zero locations (e.g. $i^f_x$ and $j^f_x$) and size (e.g. n) of continuous function (e.g. f(x(t)))
• Print calculations which compute $D^F_X$
Example:
• Scalar: g.dx = cos(f.f(ifx)).*f.dx $\in \mathbb{R}^q$
• Vectorized: g.dX = cos(f.f(:,ifx)).*f.dX $\in \mathbb{R}^{q \times N}$
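The vectorized rule can be checked against the scalar rule applied one point at a time. In the sketch below the function f(x) = [x1*x2; x2^2], the sample points, and the storage layout are assumptions. Following the (:,ifx) indexing in the printed line, arrays are stored with the vectorized dimension first, so f.f is N-by-m and f.dX is N-by-q in code.

```matlab
% Assumed sample points, one row per time point (N = 4, n = 2)
N = 4;
X = [0.1 0.2; 0.3 0.4; 0.5 0.6; 0.7 0.8];

% Assumed function f(x) = [x1*x2; x2^2] and its non-zero Jacobian entries
f.f  = [X(:,1).*X(:,2), X(:,2).^2];     % N-by-m
ifx  = [1 1 2];                         % row locations of df1/dx1, df1/dx2, df2/dx2
f.dX = [X(:,2), X(:,1), 2*X(:,2)];      % N-by-q

% Vectorized printed calculations for g = sin(f)
g.f  = sin(f.f);
g.dX = cos(f.f(:,ifx)).*f.dX;

% Reference: scalar rule g.dx = cos(f.f(ifx)).*f.dx applied point by point
ref = zeros(N, numel(ifx));
for i = 1:N
    fi  = f.f(i,:);
    dfi = f.dX(i,:);
    ref(i,:) = cos(fi(ifx)).*dfi;
end
err = max(abs(g.dX(:) - ref(:)));
```

Nothing in the vectorized line depends on N, so the same printed file serves any number of collocation points.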
16 / 16