Common Subexpression Elimination Involving Multiple Variables for Linear DSP Synthesis

15th IEEE International Conference on Application Specific Architectures and Processors (ASAP)

Farzan Fallah, Advanced CAD Research, Fujitsu Labs. of America
Anup Hosangadi and Ryan Kastner, ECE Department, UCSB

Post on 19-Dec-2015


Page 1: Common Subexpression Elimination Involving Multiple Variables for Linear DSP Synthesis 15 th IEEE International Conference on Application Specific Architectures

Common Subexpression Elimination Involving Multiple Variables for Linear DSP Synthesis

15th IEEE International Conference on Application Specific Architectures and Processors (ASAP)

Farzan Fallah
Advanced CAD Research
Fujitsu Labs. of America

Anup Hosangadi
Ryan Kastner
ECE Department, UCSB

Page 2: Outline

Introduction
Arithmetic expressions and polynomial formulation
Eliminating multiple variable common subexpressions
Results
Limitations of proposed technique
Conclusions

Page 3: Introduction

Multiplications by constants are encountered in many application areas
– DSP transforms in audio, video, and image processing (DFT, DCT, IDCT, etc.)
– Filtering operations in communication (FIR, IIR filters)
– Multiple Input Multiple Output (MIMO) systems
– Polynomials in computer graphics

Page 4: Introduction

Multiplication is expensive in hardware
Decompose constant multiplications into shifts and additions
– 13*X = (1101)_2*X = X + X<<2 + X<<3

Signed digits can reduce the number of additions/subtractions
– Canonical Signed Digits (CSD) (Knuth'74)
– (57)_10 = (0111001)_2 = (100-1001)_CSD

Further reduction possible by common subexpression elimination
– Up to 50% reduction (R. Hartley, TCS'96)
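The CSD recoding mentioned above can be sketched in a few lines of Python. This is a minimal illustration, assuming non-negative integer constants; the names `to_csd` and `nonzero_digits` are hypothetical and do not come from the slides.

```python
def to_csd(n):
    """Return the CSD digits (+1, 0, -1) of a non-negative integer, MSB first."""
    digits = []
    while n != 0:
        if n & 1:
            # n % 4 == 1 gives digit +1, n % 4 == 3 gives digit -1,
            # which guarantees the next digit is 0 (no adjacent non-zeros).
            d = 2 - (n & 3)
            n -= d
        else:
            d = 0
        digits.append(d)
        n >>= 1
    return digits[::-1]


def nonzero_digits(digits):
    """Count non-zero digits; a constant multiply needs one fewer add/sub."""
    return sum(1 for d in digits if d != 0)


csd_57 = to_csd(57)                          # 64 - 8 + 1
binary_57 = [int(b) for b in bin(57)[2:]]    # 32 + 16 + 8 + 1
print(csd_57, nonzero_digits(csd_57), nonzero_digits(binary_57))
```

For 57 the binary form has four non-zero digits and the CSD form has three, so the shift-and-add implementation drops from three additions to two additions/subtractions.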

Page 5: Introduction

Common subexpressions = common digit patterns
– F1 = 7*X = (0111)*X = X + X<<1 + X<<2
  F2 = 13*X = (1101)*X = X + X<<2 + X<<3
  (4 +, 4 <<)
– Common pattern "0101" => D1 = X + X<<2
  F1 = D1 + X<<1
  F2 = D1 + X<<3
  (3 +, 3 <<)
– Good for single variable: FIR filters (transposed form)
– Multiple variables? (DFT, DCT, etc.)

Page 6: Introduction

Matrix form of linear systems

[Y1]   [a11 a12 a13]   [X1]
[Y2] = [a21 a22 a23] x [X2]
[Y3]   [a31 a32 a33]   [X3]

$Y_i = \sum_j S_{ij} X_j + \sum_k C_{ik} D_k$

Columns: all distinct S_ij X_j and C_ik D_k

Y1   1 0 1 1 0 0
Y2   0 1 1 1 0 1
Y3   1 0 0 1 0 1

(Potkonjak TCAD'95)

Page 7: Arithmetic expressions & polynomial formulation

View linear systems as a set of arithmetic expressions
– Expressions consisting of +, -, << operators
– Develop methodology for extracting common subexpressions

Polynomial formulation

$C \times X = \sum_i (\pm X \times L^i)$

(14)_10 × X = (1110)_2 × X
= X<<3 + X<<2 + X<<1
= XL^3 + XL^2 + XL
= (100-10)_CSD × X = XL^4 - XL
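As a rough sketch of the polynomial formulation, each constant multiplication can be stored as a list of signed, shifted terms, where the exponent of L is the left-shift amount. The `(sign, variable, shift)` tuple layout and the function names are assumptions made for this example only.

```python
def constant_to_terms(c, var="X"):
    """Expand c*var into (sign, var, shift) terms from the binary digits of c."""
    terms = []
    shift = 0
    while c:
        if c & 1:
            terms.append((+1, var, shift))
        c >>= 1
        shift += 1
    return terms


def evaluate(terms, values):
    """Evaluate a list of (sign, var, shift) terms for integer variable values."""
    return sum(sign * (values[var] << shift) for sign, var, shift in terms)


terms_14 = constant_to_terms(14)           # XL + XL^2 + XL^3
csd_14 = [(+1, "X", 4), (-1, "X", 1)]      # XL^4 - XL, from (100-10) in CSD
print(evaluate(terms_14, {"X": 3}), evaluate(csd_14, {"X": 3}))
```

Both term lists evaluate to 14*X, but the CSD list has two terms instead of three.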

Page 8: Arithmetic expressions and polynomial formulation

[Y1]   [5  7]   [X1]
[Y2] = [4 12] x [X2]

5 = 0101
7 = 0111
4 = 0100
12 = 1100

Polynomial formulation

Y1 = (1)X1 + (2)X1L^2 + (3)X2 + (4)X2L + (5)X2L^2
Y2 = (6)X1L^2 + (7)X2L^2 + (8)X2L^3

6 <<, 6 +
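The expansion of the 2x2 example and its operation count can be reproduced with a short Python sketch. It assumes positive integer constants in plain binary; `system_to_polynomials` and `count_ops` are illustrative names, not part of the original work.

```python
def system_to_polynomials(matrix, variables):
    """Turn each row of a constant matrix into a list of (var, shift) terms."""
    polys = []
    for row in matrix:
        terms = []
        for const, var in zip(row, variables):
            shift = 0
            while const:
                if const & 1:
                    terms.append((var, shift))
                const >>= 1
                shift += 1
        polys.append(terms)
    return polys


def count_ops(polys):
    """Return (shifts, additions) for a direct shift-and-add implementation."""
    shifts = sum(1 for terms in polys for _, s in terms if s > 0)
    adds = sum(len(terms) - 1 for terms in polys)
    return shifts, adds


polys = system_to_polynomials([[5, 7], [4, 12]], ["X1", "X2"])
print(polys[0])          # terms of Y1
print(polys[1])          # terms of Y2
print(count_ops(polys))  # shifts and additions before any sharing
```

Every term with a non-zero exponent of L costs one shift, and an expression with n terms costs n - 1 additions, which gives the 6 shifts and 6 additions quoted on the slide.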

Page 9: Digit pattern matching techniques

        X1        X2
Y1   0 1 0 1   0 1 1 1
Y2   0 1 0 0   1 1 0 0

D1 = X2 + X2<<1
Y1 = X1 + X1<<2 + D1 + X2<<2
Y2 = X1<<2 + D1<<2

5 <<, 5 +

Page 10: Algebraic techniques for factoring and eliminating common subexpressions

Algebraic methods in multi-level logic synthesis (MLLS)
– Reducing literal count in a set of Boolean expressions
– Factoring, decomposition: established algebraic techniques

Can be applied to linear arithmetic expressions as well

D1 = X1 + X2<<2
Y1 = D1 + D1<<3 + X1<<3
Y2 = D1 + X2<<2

Page 11: Finding candidate common subexpressions (kernels)

Terminology
– Divisor: An expression having at least one term with a zero exponent of L
  e.g. X1 + X2L + X3L^2 is a divisor
  X1L + X2L^2 + X3L^2 is not a divisor
– Kernel: Divisor obtained from the original expression by division by an exponent of L
– Co-kernel: Exponent of L that is used to obtain the kernel

Example
– P = X1L^3 + X2L^3 + X2L^2 + X3
– Division by L^2: kernel = X1L + X2L + X2; co-kernel = L^2
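Division by a power of L can be sketched as follows, taking a divisor to be an expression with at least one term free of L, as the slide's two examples indicate. Terms are stored as hypothetical `(variable, exponent)` pairs; terms whose exponent is below the divisor exponent are dropped, which is how X3 disappears in the example.

```python
def divide_by_L(terms, k):
    """Divide a polynomial by L**k, dropping terms whose exponent is below k."""
    return [(var, e - k) for var, e in terms if e >= k]


def is_divisor(terms):
    """True if at least one term has a zero exponent of L."""
    return any(e == 0 for _, e in terms)


P = [("X1", 3), ("X2", 3), ("X2", 2), ("X3", 0)]
kernel = divide_by_L(P, 2)      # X1L + X2L + X2, co-kernel L^2
print(kernel, is_divisor(kernel))
```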

Page 12: Kernel generation algorithm

Y1 = (1)X1 + (2)X1L^2 + (3)X2 + (4)X2L + (5)X2L^2
Y2 = (6)X1L^2 + (7)X2L^2 + (8)X2L^3

Recursively divide by the smallest non-zero exponent of L
» Divide Y1 by L: Y1/L = (2)X1L + (4)X2 + (5)X2L
» Divide again by L: Y1/L^2 = (2)X1 + (5)X2
» Divide Y2 by L^2: Y2/L^2 = (6)X1 + (7)X2 + (8)X2L
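The recursive division can be sketched as a loop over one expression. This is only an illustration of the rule stated on the slide: the `(label, variable, exponent)` term layout is an assumption, the expression itself is listed with co-kernel L^0, and the loop stops once fewer than two terms remain.

```python
def generate_kernels(terms):
    """Return (co_kernel_exponent, kernel) pairs for one expression.

    Each term is (label, var, exponent of L). The expression is repeatedly
    divided by the smallest non-zero exponent of L among its terms.
    """
    results = [(0, list(terms))]
    co_kernel, current = 0, list(terms)
    while True:
        nonzero = [e for _, _, e in current if e > 0]
        if not nonzero:
            break
        k = min(nonzero)
        current = [(lab, var, e - k) for lab, var, e in current if e >= k]
        co_kernel += k
        if len(current) < 2:    # a single term cannot be shared
            break
        results.append((co_kernel, current))
    return results


Y1 = [(1, "X1", 0), (2, "X1", 2), (3, "X2", 0), (4, "X2", 1), (5, "X2", 2)]
Y2 = [(6, "X1", 2), (7, "X2", 2), (8, "X2", 3)]
for expr in (Y1, Y2):
    for co_kernel, kernel in generate_kernels(expr):
        print(f"L^{co_kernel}:", kernel)
```

For Y1 this yields co-kernels 1, L, and L^2; for Y2 it yields 1 and L^2.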

Page 13: Kernel generation

All kernels and co-kernels for the example linear system

((1)X1 + (2)X1L^2 + (3)X2 + (4)X2L + (5)X2L^2)[1]
((2)X1L + (4)X2 + (5)X2L)[L]
((2)X1 + (5)X2)[L^2]
((6)X1L^2 + (7)X2L^2 + (8)X2L^3)[1]
((6)X1 + (7)X2 + (8)X2L)[L^2]

Page 14: Importance of kernels

Theorem: There exists a k-term common subexpression iff there is a k-term "non-overlapping" intersection between at least two kernels

Proof
– If: Non-overlapping k-term intersection => k-term common subexpression
– Only if: If there are 2 instances of a k-term subexpression
  Case 1: "divisor" => each instance will be a part of some kernel expression
  Case 2: "non-divisor" => dividing by the smallest non-zero exponent of L will convert it into a "divisor"

Page 15: Kernel generation

e.g. 10*X = (1010)*X = (1)XL + (2)XL^3
     14*X = (1110)*X = (3)XL + (4)XL^2 + (5)XL^3
– common subexpression = XL + XL^3 = (X + XL^2)L
– kernels involved in intersection:
  ((1)X + (2)XL^2)
  ((3)X + (4)XL + (5)XL^2)

Page 16: Overlapping kernels

Consider (1001001)*X

(1001001)*X = (1)XL^6 + (2)XL^3 + (3)X

– Kernels
  [1] ((1)XL^6 + (2)XL^3 + (3)X)
  [L^3] ((1)XL^3 + (2)X)

1 0 0 1 0 0 1

The two kernels intersect in XL^3 + X, but both instances of that pattern use term (2), the middle digit of 1001001, so the intersection is overlapping.

Page 17: Finding kernel intersections

Form Kernel Cube Matrix (KCM)
– One row for each kernel generated
– One column for each distinct kernel cube
– Each non-zero element represents a term

Y1 = X1 + X1L^2 + X2 + X2L + X2L^2
Y2 = X1L^2 + X2L^2 + X2L^3

                 1     2      3     4     5      6
     Co-kernel   X1    X1L^2  X2    X2L   X2L^2  X1L
1    1           1(1)  1(2)   1(3)  1(4)  1(5)   0
2    L           0     0      1(4)  1(5)  0      1(2)
3    L^2         1(2)  0      1(5)  0     0      0
4    L^2         1(6)  0      1(7)  1(8)  0      0
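A KCM like the one above can be assembled from the kernel list with a small Python sketch. The data layout is an assumption for illustration: each kernel is a `(co_kernel_exponent, terms)` pair, each term is `(label, variable, exponent)`, and a matrix entry holds the term label or 0.

```python
def build_kcm(kernels):
    """Build a Kernel Cube Matrix.

    Returns (columns, rows): columns lists the distinct cubes (var, exponent)
    in order of first appearance; each row holds the term label found at a
    cube's column, or 0 where the kernel has no such cube.
    """
    columns = []
    for _, terms in kernels:
        for _, var, e in terms:
            if (var, e) not in columns:
                columns.append((var, e))
    rows = []
    for _, terms in kernels:
        by_cube = {(var, e): label for label, var, e in terms}
        rows.append([by_cube.get(cube, 0) for cube in columns])
    return columns, rows


kernels = [
    (0, [(1, "X1", 0), (2, "X1", 2), (3, "X2", 0), (4, "X2", 1), (5, "X2", 2)]),
    (1, [(2, "X1", 1), (4, "X2", 0), (5, "X2", 1)]),
    (2, [(2, "X1", 0), (5, "X2", 0)]),
    (2, [(6, "X1", 0), (7, "X2", 0), (8, "X2", 1)]),
]
columns, rows = build_kcm(kernels)
for row in rows:
    print(row)
```

The four kernels listed here are the ones that appear as rows in the slide's matrix; the column order falls out of the order in which cubes are first seen.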

Page 18: Finding kernel intersections

Each rectangle with non-overlapping terms = a common subexpression
– Rectangle: Set of rows and columns such that all elements are '1'

Search only for prime rectangles
– Prime rectangle: Rectangle that is not covered by any other rectangle

Prime rectangle may have overlapping terms
– Find a non-overlapping rectangle within the prime rectangle (MIR = Maximum Irredundant Rectangle)

Value of a rectangle (R = #Rows, C = #Cols)
– Value = # of additions/subtractions saved by selecting the rectangle
– V(R,C) = (R-1)*(C-1)
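The value formula follows from a simple count: R separate copies of a C-term sum cost R*(C-1) additions, while computing it once costs C-1, so (R-1)*(C-1) are saved. A small Python sketch, using 0-based row and column indices into the example KCM (the matrix literal and function names are illustrative assumptions):

```python
def is_rectangle(kcm, rows, cols):
    """True if every selected (row, column) entry of the KCM is non-zero."""
    return all(kcm[r][c] != 0 for r in rows for c in cols)


def rectangle_value(num_rows, num_cols):
    """Additions/subtractions saved by extracting an R x C rectangle."""
    return (num_rows - 1) * (num_cols - 1)


# Example KCM entries are term labels; 0 means the cube is absent.
kcm = [
    [1, 2, 3, 4, 5, 0],
    [0, 0, 4, 5, 0, 2],
    [2, 0, 5, 0, 0, 0],
    [6, 0, 7, 8, 0, 0],
]
rows, cols = (0, 3), (0, 2, 3)      # kernels 1 and 4; cubes X1, X2, X2L
print(is_rectangle(kcm, rows, cols), rectangle_value(len(rows), len(cols)))
```

This rectangle corresponds to X1 + X2 + X2L shared by two kernels, with a value of 2.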

Page 19: Finding kernel intersections

Selecting common subexpressions
– Greedy selection of the most valued non-overlapping rectangle in each iteration
– This is very expensive
  Worst case O(2^(MN)) prime rectangles to be considered
  M = # of expressions; N = bit-width
– Heuristic required (ping-pong)
  Start with a seed row/column
  Build rectangle by intersections with other rows/cols
  Complexity = linear in #rows/columns

Page 20: Finding kernel intersections

                 1     2      3     4     5      6
     Co-kernel   X1    X1L^2  X2    X2L   X2L^2  X1L
1    1           1(1)  1(2)   1(3)  1(4)  1(5)   0
2    L           0     0      1(4)  1(5)  0      1(2)
3    L^2         1(2)  0      1(5)  0     0      0
4    L^2         1(6)  0      1(7)  1(8)  0      0

Rows 1, 2, and 4 with columns 3 and 4 form a prime rectangle, but term (4) appears in both row 1 and row 2, so the rectangle overlaps.

MIR =   3 4    OR    4 5
        7 8          7 8
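One way to picture the MIR step is a brute-force search for the largest sets of rows in a rectangle whose term labels do not repeat. This is a sketch under that assumption only (dropping rows, never columns); the slides do not say how the MIR is computed, and `max_irredundant_rows` is a hypothetical name.

```python
from itertools import combinations


def max_irredundant_rows(kcm, rows, cols):
    """Return all largest row subsets of a rectangle with no repeated term label."""
    for size in range(len(rows), 0, -1):
        found = []
        for subset in combinations(rows, size):
            labels = [kcm[r][c] for r in subset for c in cols]
            if len(labels) == len(set(labels)):
                found.append(subset)
        if found:
            return found
    return []


# Example KCM entries are term labels; 0 means the cube is absent.
kcm = [
    [1, 2, 3, 4, 5, 0],
    [0, 0, 4, 5, 0, 2],
    [2, 0, 5, 0, 0, 0],
    [6, 0, 7, 8, 0, 0],
]
# Rectangle over kernels 1, 2, 4 and cubes X2, X2L (0-based indices).
print(max_irredundant_rows(kcm, (0, 1, 3), (2, 3)))
```

On the example, the search returns the two row pairs whose terms are (3), (4), (7), (8) and (4), (5), (7), (8), matching the two MIR choices shown on the slide.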

Page 21: Extracting kernel intersections (1st iteration)

                 1     2      3     4     5      6
     Co-kernel   X1    X1L^2  X2    X2L   X2L^2  X1L
1    1           1(1)  1(2)   1(3)  1(4)  1(5)   0
2    L           0     0      1(4)  1(5)  0      1(2)
3    L^2         1(2)  0      1(5)  0     0      0
4    L^2         1(6)  0      1(7)  1(8)  0      0

Select D1 = X1 + X2 + X2L, saves 2 additions!

Page 22: Extracting kernel intersections (2nd iteration)

                 1     2      3      4     5     6
     Co-kernel   D1    X1L^2  X2L^2  X1    X2    X2L
1    1           1(1)  1(2)   1(3)   0     0     0
2    L^2         0     0      0      1(2)  1(3)  0
3    1           0     0      0      1(5)  1(6)  1(7)

Select D2 = X1 + X2

Final implementation
D2 = X1 + X2
D1 = D2 + X2<<1
Y1 = D1 + D2<<2
Y2 = D1<<2

3 <<, 3 +
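The final implementation can be checked against the original system Y1 = 5*X1 + 7*X2, Y2 = 4*X1 + 12*X2 with a few lines of Python. In this sketch Y2 is taken as D1<<2, which is what 4*X1 + 12*X2 = 4*D1 requires; the function names are illustrative.

```python
def optimized(x1, x2):
    """Shift-and-add implementation using the shared subexpressions D1 and D2."""
    d2 = x1 + x2            # X1 + X2
    d1 = d2 + (x2 << 1)     # X1 + 3*X2
    y1 = d1 + (d2 << 2)     # 5*X1 + 7*X2
    y2 = d1 << 2            # 4*X1 + 12*X2
    return y1, y2


def reference(x1, x2):
    """Direct evaluation of the 2x2 constant matrix product."""
    return 5 * x1 + 7 * x2, 4 * x1 + 12 * x2


mismatches = [
    (a, b)
    for a in range(-8, 9)
    for b in range(-8, 9)
    if optimized(a, b) != reference(a, b)
]
print(len(mismatches))
```

The optimized version uses three additions and three shifts, compared with six of each in the direct expansion.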

Page 23: Experimental setup

Goal
– Reduction in #additions/subtractions
– Effect on area/latency on synthesis

Transforms: DCT, IDCT, DFT, DST, DHT
8x8 constant matrices
16 digits precision (CSD representation)

Compare with
– Potkonjak (TCAD'95)
– RESANDS (Nguyen et al., TVLSI'2000)

Page 24: Experimental results

           # of additions/subtractions                     % improvement over
Example    Original  RESANDS  Potkonjak  Our technique     (I)    (II)   (III)
           (I)       (II)     (III)      (IV)
DCT        274       202      227        174               36.5   13.1   23.3
IDCT       242       183      222        162               33.0   11.5   27.0
R-DFT      253       193      208        165               34.8   14.5   20.7
I-DFT      207       178      198        134               35.3   24.7   32.3
DST        320       238      252        200               37.5   16.0   20.6
DHT        284       209      211        175               38.4   16.3   17.0
Average    263.3     200.5    219.7      168.3             35.9   16.0   23.5

Page 25: Experimental results

Synthesis results (minimum latency constraints)

           Area (library units)          Latency (clock cycles)
Example    (II)     (III)    (IV)        (II)   (III)  (IV)
DCT        90667    96375    73311       10     11     11
IDCT       81868    99771    66864       10     11     11
R-DFT      90496    84770    69827       10     12     11
I-DFT      75140    84864    55940       10     11     10
DST        108101   106498   84715       11     12     11
DHT        93939    79409    71272       11     11     11
Average    90110    91948    70322       10.3   11.3   10.8

Page 26: Limitations of this technique

Results dependent on initial representation of constants
– Mixed representation
  Too many: O(3^N) per constant

Factoring of constants
– e.g. 105*X = 15*7*X = (16-1)*(8-1)*X
  With T = X<<4 - X: 105*X = T<<3 - T
– Factoring in general is very hard

Common subexpressions with reversed signs
– e.g. (X1 - X2) = -(X2 - X1) cannot be detected
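The factoring limitation can be made concrete with a short Python comparison for the constant 105. The CSD digits used here (128 - 32 + 8 + 1) are worked out for this example and are not taken from the slides; both function names are illustrative.

```python
def mul105_factored(x):
    """105*x via the factorization (16 - 1) * (8 - 1): two subtractions."""
    t = (x << 4) - x        # 15*x
    return (t << 3) - t     # 7 * (15*x)


def mul105_csd(x):
    """105*x from its CSD digits 128 - 32 + 8 + 1: three additions/subtractions."""
    return (x << 7) - (x << 5) + (x << 3) + x


agree = all(
    mul105_factored(x) == mul105_csd(x) == 105 * x for x in range(-50, 51)
)
print(agree)
```

The factored form needs one operation fewer, but it only becomes visible after factoring the constant, which digit-pattern matching on a single representation does not attempt.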

Page 27: Conclusions

Contributions
– Novel polynomial transformation
– Adapting rectangle covering methods
– Single-variable and multi-variable subexpressions eliminated together => better results

Future work
– Addressing shortcomings of current method
– Optimization for timing, power

Page 28: Conclusions

Thank you!!

Questions??