Quasi Regression and black boxes 1

Finding important variables and interactions in black boxes

Art B. Owen, Stanford University, [email protected]

Tao Jiang, Stanford University, [email protected]

Quasi Regression and black boxes 2

Theme

As dimension increases, many numerical problems become more statistical.

Because:

1. the sample is inevitably sparse,

2. error depends on the unsampled part of the space,

3. worst case error bounds are inapplicable.

Quasi Regression and black boxes 3

Example: integration

I = \int_{[0,1]^d} f(x) \, dx

Sampling Methods

1. Monte Carlo: error n^{-1/2}

2. Quasi-Monte Carlo: error n^{-1} (\log n)^d, but no practical error estimate

3. Randomized Quasi-Monte Carlo: replication based error estimates, and error n^{-3/2} (\log n)^{(d-1)/2}

Rates are asymptotic under mild conditions on f.

Also statistical: approximation.
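The rate contrast above is easy to reproduce numerically. Below is a minimal sketch, assuming SciPy's scipy.stats.qmc module is available; the integrand, dimension, and replicate count are illustrative choices rather than anything from the slides. It compares a plain Monte Carlo estimate with a randomized quasi-Monte Carlo estimate built from scrambled Sobol' points, whose replicates also supply the error estimate mentioned in item 3.

import numpy as np
from scipy.stats import qmc

d = 5
n = 2 ** 12                       # points per estimate
exact = (np.e - 1.0) ** d         # integral of prod_j exp(x_j) over [0,1]^d

def f(x):
    # smooth illustrative integrand with a known integral
    return np.exp(x).prod(axis=1)

rng = np.random.default_rng(0)

# Plain Monte Carlo: root mean squared error is O(n^{-1/2})
mc_est = f(rng.random((n, d))).mean()

# Randomized QMC: scrambled Sobol' points, replicated for an error estimate
reps = 10
rqmc = np.array([f(qmc.Sobol(d=d, scramble=True, seed=r).random(n)).mean()
                 for r in range(reps)])

print("exact:", exact)
print("MC estimate:", mc_est, "abs error:", abs(mc_est - exact))
print("RQMC estimate:", rqmc.mean(), "replication std error:", rqmc.std(ddof=1) / np.sqrt(reps))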

Quasi Regression and black boxes 4

Mortgage backed securities integrand

Paskov & Traub; Caflisch, Morokoff, & Owen

f(x_1, \dots, x_{360}): present value of 30 years of monthly cash flows, x \sim U[0,1]^{360}.

Prepayment:

1. puts lumps into the payment stream

2. more common when interest rates are low

MBS Model (from Goldman-Sachs)

Interest rates: r_1, \dots, r_{360} follow a geometric Brownian motion driven by x.

Prepayment fraction: K_1 + K_2 \arctan( K_3 + K_4 \, r_t ).


Quasi Regression and black boxes 5

QMC super on MBS

But f(x_1, \dots, x_{360}) is about ...% additive:

Latin hypercube sampling variance is about 0.0...% of MC.

Also f(x_1, \dots, x_{360}) is about ...% odd (antisymmetric):

antithetic sampling variance is about 0.02% of MC.

Additive and odd: upon further investigation, f was virtually linear in x.

The curse of dimensionality was not broken by QMC; we just had an easy integrand.

QMC requires low "effective dimension" to trounce MC.

Quasi Regression and black boxes 6

ANOVA of L^2[0,1]^d

Hoeffding, Efron & Stein, Sobol'

Main effects and k-factor interactions generalizing familiar discrete ANOVA:

f(x) = \sum_{u \subseteq \{1, 2, \dots, d\}} f_u(x)

f_u depends only on the x-components in the set u.

f_\emptyset = I = \int f(x) \, dx, the "grand mean"

\sigma^2(f) = \sum_{u \ne \emptyset} \int f_u(x)^2 \, dx

\int f_u(x) f_v(x) \, dx = 0 for u \ne v

\frac{1}{n} \sum_{i=1}^{n} f(x_i) = \sum_u \frac{1}{n} \sum_{i=1}^{n} f_u(x_i)

QMC points x_i are very uniform in low dimensional projections.

Great for functions dominated by f_u with small |u|.
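One concrete way to get at this decomposition for a black box is to estimate the main effect (first order) variance contributions Var(E[f | x_j]). A minimal pick-freeze sketch in the style of Sobol'/Saltelli follows; the test function, sample size, and estimator choice are illustrative assumptions, not taken from the slides.

import numpy as np

def f(x):
    # illustrative test function: strong x1 main effect plus an x2*x3 interaction
    return x[:, 0] + 0.5 * x[:, 1] * x[:, 2] + 0.1 * x[:, 3]

d, n = 4, 100_000
rng = np.random.default_rng(1)
A, B = rng.random((n, d)), rng.random((n, d))
fA, fB = f(A), f(B)
var_f = np.var(np.concatenate([fA, fB]))

# First order index S_j = Var(E[f | x_j]) / Var(f), pick-freeze estimate:
for j in range(d):
    ABj = A.copy()
    ABj[:, j] = B[:, j]                     # A with column j taken from B
    Vj = np.mean(fB * (f(ABj) - fA))        # estimates Var(E[f | x_j])
    print(f"x{j + 1}: first order index about {Vj / var_f:.3f}")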

Quasi Regression and black boxes 7

Isotropic integrand

Capstick & Keister; Papageorgiou & Traub; Owen

f(x) = \cos\Bigl( \bigl( \tfrac{1}{2} \sum_{j=1}^{d} \Phi^{-1}(x_j)^2 \bigr)^{1/2} \Bigr), \qquad x \sim U(0,1)^d

I = \int f(x) \, dx: a closed form (Mathematica) aids comparison of methods.

f varies equally in all directions, and QMC does well.

For d = 25, over 99% of the variance comes from the 1, 2, and 3 dimensional ANOVA effects (after numerical investigation exploiting symmetry and Gaussianity).

Quasi Regression and black boxes 8

The borehole function

Morris, Mitchell, Ylvisaker

Flow from the upper to the lower aquifer:

f = \frac{ 2 \pi T_u ( H_u - H_l ) }
         { \ln(r / r_w) \Bigl[ 1 + \dfrac{2 L T_u}{\ln(r / r_w) \, r_w^2 K_w} + \dfrac{T_u}{T_l} \Bigr] }

r, r_w: radii of the basin and the borehole
T_u, T_l: transmissivities, upper and lower
H_u, H_l: potentiometric heads, upper and lower
L, K_w: length and conductivity

Diaconis: closed form \ne understanding.

Which variables are important? Which interact?
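For reference, here is the borehole formula as code, so it can be probed numerically. The input ranges below are the ones commonly quoted for this test function in the computer experiments literature, not something stated on the slides.

import numpy as np

# Commonly quoted ranges (an assumption, not from the slides):
# r_w, r, T_u, H_u, T_l, H_l, L, K_w
LO = np.array([0.05,   100.0,  63070.0,  990.0,  63.1, 700.0, 1120.0,  9855.0])
HI = np.array([0.15, 50000.0, 115600.0, 1110.0, 116.0, 820.0, 1680.0, 12045.0])

def borehole(u):
    """Flow from the upper to the lower aquifer; u is an (n, 8) array in [0,1]^8."""
    x = LO + (HI - LO) * u                       # rescale to physical units
    rw, r, Tu, Hu, Tl, Hl, L, Kw = x.T
    log_rrw = np.log(r / rw)
    return (2.0 * np.pi * Tu * (Hu - Hl)
            / (log_rrw * (1.0 + 2.0 * L * Tu / (log_rrw * rw**2 * Kw) + Tu / Tl)))

rng = np.random.default_rng(2)
y = borehole(rng.random((1000, 8)))
print(y.mean(), y.std())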


Quasi Regression and black boxes 9

Black box functions

y = f(x_1, \dots, x_d), without "+ \epsilon".

Examples:

Field           Inputs          Outputs
Semiconductors  Device design   Speed, heat
Aerospace       Wing shape      Lift, drag
Automotive      Auto frame      Strength, weight
Statistics      Predictors      Responses

Used to design products. Cheaper than physical experiments. Costs range from milliseconds to hours. Dimension ranges from 3 to 300. Accuracy varies too.

Kriging is widely used: Journel, Huijbregts, Sacks, Ylvisaker, Welch, Wynn, Mitchell.

Quasi Regression and black boxes 10

A small neural net

Venables, Ripley

Predict log10(perf) from the others:

perf    published performance of the computer
syct    cycle time in nanoseconds
mmin    minimum main memory in kilobytes
mmax    maximum main memory in kilobytes
cach    cache size in kilobytes
chmin   minimum number of channels
chmax   maximum number of channels

Function found by training on 209 examples.

Quasi Regression and black boxes 11

The n-net function

[The slide writes out the fitted network explicitly: a constant plus a weighted sum of sigmoidal terms S(a_k + b_{k1} x_1 + ... + b_{k6} x_6) in the six inputs, with its numeric coefficients, where S(z) = [1 + \exp(-z)]^{-1} is a sigmoidal function.]

Quasi Regression and black boxes 12

Given f(x) on [0,1]^d

How can we tell whether f is:

1. nearly linear?

2. nearly additive?

3. nearly quadratic?

4. mostly 3-factor interactions or less?

And:

5. which variables matter most?

6. which interactions matter most?

We would like:

1. a systematic approach

2. that also predicts f


Quasi Regression and black boxes 17

Interpretation

The variance of f is \sum_{r \ne 0} \beta_r^2 + \int \epsilon(x)^2 \, dx, where \epsilon(x) = f(x) - g(x)^T \beta is the part of f outside the basis.

The importance of a set S of indices is \sum_{r \in S} \beta_r^2.

Estimate it by \sum_{r \in S} \bigl( \hat\beta_r^2 - \widehat{\mathrm{Var}}(\hat\beta_r) \bigr).

Subsets of interest include:

\{ r : r(1) \ne 0 \}             involves x_1
\{ r : r(1) = 0 \}               does not involve x_1
\{ r : \|r\|_0 = 1 \}            additive part
\{ r : 0 < \|r\|_0 \le k \}      interactions up to order k
\{ r : 0 < \|r\|_1 \le k \}      terms of degree at most k
\{ r : r(j) = 0, \; j > 3 \}     uses only the first 3 inputs
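In code, once quasi-regression has produced coefficient and variance estimates, each of these subsets is just a boolean mask over the multi-indices. A small sketch follows; the array names multi_index, beta_hat, and var_hat are hypothetical, chosen here for illustration.

import numpy as np

def importance(mask, beta_hat, var_hat):
    """Estimated variance explained by the basis functions selected by mask,
    using sum(beta_hat^2 - var_hat) to reduce the upward bias of beta_hat^2."""
    return np.sum(beta_hat[mask] ** 2 - var_hat[mask])

# Subset selectors over a (p, d) integer array of multi-indices r:
def involves(multi_index, j):                 # { r : r(j) != 0 }
    return multi_index[:, j] != 0

def additive_part(multi_index):               # { r : ||r||_0 = 1 }
    return (multi_index != 0).sum(axis=1) == 1

def interactions_up_to(multi_index, k):       # { r : 0 < ||r||_0 <= k }
    nz = (multi_index != 0).sum(axis=1)
    return (nz > 0) & (nz <= k)

def degree_at_most(multi_index, k):           # { r : 0 < ||r||_1 <= k }
    deg = multi_index.sum(axis=1)
    return (deg > 0) & (deg <= k)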

Quasi Regression and black boxes 18

Approximation through integration

Define g(x) = \bigl( \psi_0(x), \dots, \psi_{p-1}(x) \bigr)^T.

The optimal \beta is

\beta^* = \arg\min_\beta \int \bigl( f(x) - g(x)^T \beta \bigr)^2 \, dx
        = \Bigl[ \int g(x) g(x)^T \, dx \Bigr]^{-1} \int g(x) f(x) \, dx.

Also,

ISE = \int \bigl( f(x) - g(x)^T \beta \bigr)^2 \, dx.

Quasi Regression and black boxes 19

Regression and quasi-regression

\beta^* = \Bigl[ \int g(x) g(x)^T \, dx \Bigr]^{-1} \int g(x) f(x) \, dx
        = \int g(x) f(x) \, dx \quad by orthogonality.

Observations: x_i \sim U[0,1]^d, i = 1, \dots, n, IID.

Regression: \hat\beta = ( Z^T Z )^{-1} Z^T Y, where Z_{ir} = \psi_r(x_i) is n x p and Y_i = f(x_i) is n x 1.

Quasi-regression: \tilde\beta = \frac{1}{n} Z^T Y.

Quasi Regression and black boxes 20

Precursors of quasi-regression

Quasi-interpolation: Chui & Diamond, Wang. "Ignore the denominator" (Z^T Z) to get fast approximate interpolation.

Computer experiments: Koehler and Owen (1996) advocate quasi-regression for computer experiments.

Efromovich (1992) applies quasi-regression to sinusoids on [0,1].

Owen (1992) describes quasi-regression for Latin hypercube sampling.
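The regression versus quasi-regression contrast of the previous slide is easy to try out. A minimal one-dimensional sketch with an orthonormal Legendre basis on [0,1] follows; the integrand, sample size, and basis size are illustrative assumptions. With an orthonormal basis and IID uniform inputs the two coefficient vectors agree closely, while the quasi-regression step is only a matrix-vector product.

import numpy as np
from numpy.polynomial import legendre

def design(x, p):
    # Orthonormal Legendre basis on [0,1]: psi_r(x) = sqrt(2r + 1) * P_r(2x - 1)
    cols = [np.sqrt(2 * r + 1) * legendre.Legendre.basis(r)(2 * x - 1) for r in range(p)]
    return np.column_stack(cols)

def f(x):
    # illustrative black box; in practice f is expensive and x is high dimensional
    return np.exp(x) * np.sin(3 * x)

rng = np.random.default_rng(3)
n, p = 10_000, 8
x = rng.random(n)
Z, Y = design(x, p), f(x)

beta_reg = np.linalg.solve(Z.T @ Z, Z.T @ Y)   # regression: O(n p^2) time, O(p^2) space
beta_qr = (Z.T @ Y) / n                        # quasi-regression: O(n p) time, O(p) space

print(np.round(beta_reg, 4))
print(np.round(beta_qr, 4))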


Quasi Regression and black boxes 21

Accuracy in Monte Carlo sampling

Define:

\epsilon_i = f(x_i) - g(x_i)^T \beta

N \; (p \times 1) = \frac{1}{n} \sum_{i=1}^{n} g(x_i) \, \epsilon_i

E \; (p \times p) = \frac{1}{n} \sum_{i=1}^{n} g(x_i) g(x_i)^T - I

Now

\tilde\beta - \beta = \frac{1}{n} \sum_{i} g(x_i) f(x_i) - \beta
                    = \frac{1}{n} \sum_{i} g(x_i) \bigl( g(x_i)^T \beta + \epsilon_i \bigr) - \beta
                    = N + E \beta

\hat\beta - \beta = \Bigl( \tfrac{1}{n} Z^T Z \Bigr)^{-1} \tfrac{1}{n} Z^T ( Z \beta + \epsilon ) - \beta
                  = ( I + E )^{-1} N
                  = ( I - E + E^2 - E^3 + \cdots ) N
                  \approx N - E N

Quasi Regression and black boxes 22

Fast stable updates

Define:

\hat\beta_r^{(n)} = \frac{1}{n} \sum_{i=1}^{n} \psi_r(x_i) f(x_i)

S_r^{(n)} = \frac{1}{n} \sum_{i=1}^{n} \bigl[ \psi_r(x_i) f(x_i) - \hat\beta_r^{(n)} \bigr]^2

Then:

\hat\beta_r^{(n)} = \hat\beta_r^{(n-1)} + \frac{1}{n} \bigl[ \psi_r(x_n) f(x_n) - \hat\beta_r^{(n-1)} \bigr]

S_r^{(n)} = \frac{n-1}{n} S_r^{(n-1)} + \frac{n-1}{n^2} \bigl[ \psi_r(x_n) f(x_n) - \hat\beta_r^{(n-1)} \bigr]^2

Chan, Golub, LeVeque (who use n S_r^{(n)}).

\frac{1}{n-1} S_r^{(n)} \approx \widehat{\mathrm{Var}}\bigl( \hat\beta_r^{(n)} \bigr)
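These one-pass updates are straightforward to code. Below is a minimal streaming sketch; the class name and the basis callback are illustrative, and the variance estimate uses S_r^{(n)}/(n-1) as above. Feeding it points x_i and values f(x_i) one at a time reproduces the running coefficients without storing the sample.

import numpy as np

class QuasiRegressionStream:
    """One pass quasi-regression: O(p) state, numerically stable updates."""

    def __init__(self, basis, p):
        self.basis = basis              # basis(x) returns (psi_0(x), ..., psi_{p-1}(x))
        self.n = 0
        self.beta = np.zeros(p)         # running beta_hat_r^{(n)}
        self.S = np.zeros(p)            # running S_r^{(n)}

    def update(self, x, y):
        z = self.basis(x) * y           # psi_r(x_n) f(x_n) for all r
        self.n += 1
        n = self.n
        delta = z - self.beta           # uses beta_hat^{(n-1)}
        self.beta += delta / n
        self.S = (n - 1) / n * self.S + (n - 1) / n**2 * delta**2

    def var_beta(self):
        # Var-hat(beta_hat_r^{(n)}) is approximately S_r^{(n)} / (n - 1)
        return self.S / max(self.n - 1, 1)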

Quasi Regression and black boxes 23

Updatable accuracy estimates

Predict f(x_n) by \hat f_{n-1}(x_n); x_n is independent of \hat f_{n-1}.

Average recent squared errors:

\widehat{ISE}(n_m) = \frac{1}{n_m - n_{m-1}} \sum_{i = n_{m-1}+1}^{n_m} \bigl[ f(x_i) - \hat f_{i-1}(x_i) \bigr]^2

on a subsequence n_m with n_{m+1} = 2 n_m; this estimates the average ISE over roughly the most recent n/2 values.

Diagnostic: a large LOF together with small \sum_r \widehat{\mathrm{Var}}(\hat\beta_r) \Rightarrow need a bigger basis.

Quasi Regression and black boxes 24

Presented as lack-of-fit:

1 - R^2 \equiv \mathrm{LOF} = \widehat{ISE} \,/\, \widehat{\mathrm{Var}}, \qquad
\mathrm{LOF} = \frac{ \mathrm{AVG}\,( f - \hat f )^2 }{ \mathrm{AVG}\,( f - \hat\beta_0 )^2 }

\log_{10}(\mathrm{LOF}) translates directly into R^2: LOF = 10^{-1} means R^2 = 90%, LOF = 10^{-2} means R^2 = 99%, LOF = 10^{-3} means R^2 = 99.9%, and so on.


Quasi Regression and black boxes 25

Costs of algebra

Method              Time            Space           Footprint
Kriging             O(n^3 + p^3)    O(n^2 + p^2)    O(n + p)
Regression          O(n p^2)        O(p^2)          O(n p)
Quasi-regression    O(n p)          O(p)            O(n p^2)

Quasi-regression allows larger n, or much larger p: p = 1,000 to ...,000 is doable by quasi-regression, not by regression (Owen, Ann Stat 2000).

Cost of f \ Dimension    Low         High
Low                      Easy        (Quasi-)regression
High                     Kriging     [good luck]

Quasi Regression and black boxes 26

Incorporating shrinkage

Hoerl, Kennard, Efromovich, Donoho, Johnstone, Beran

\hat f_n(x) = \sum_r \lambda_{rn} \hat\beta_{rn} \psi_r(x), \qquad \lambda_{rn} \in [0, 1]

Optimally,

\lambda_{rn} = \frac{ \beta_r^2 }{ \beta_r^2 + \mathrm{Var}( \hat\beta_{rn} ) }.

Shrinkage can reduce prediction variance.

We use the data to estimate \lambda_{rn}, e.g.

\hat\lambda_{rn} = \frac{ \bigl( \hat\beta_r^{(n-1)} \bigr)^2 }{ \bigl( \hat\beta_r^{(n-1)} \bigr)^2 + S_r^{(n-1)} }.
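As code, the shrinkage step is one line on top of the streaming estimates. A small sketch, assuming arrays beta_hat and var_hat of per-coefficient estimates; these names, and the use of a generic variance estimate in the denominator, are illustrative rather than the slides' exact choice.

import numpy as np

def shrinkage_weights(beta_hat, var_hat):
    """Per-coefficient weights lambda_r = beta_r^2 / (beta_r^2 + Var(beta_hat_r)),
    with beta_r^2 replaced by its estimate; weights lie in [0, 1]."""
    denom = beta_hat**2 + var_hat
    return np.divide(beta_hat**2, denom, out=np.zeros_like(denom), where=denom > 0)

def predict(x, basis, beta_hat, var_hat):
    """Shrunken quasi-regression prediction: f_hat(x) = sum_r lambda_r beta_hat_r psi_r(x)."""
    return basis(x) @ (shrinkage_weights(beta_hat, var_hat) * beta_hat)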

Quasi Regression and black boxes 27

Exploiting residuals

For r \ne 0:  \beta_r = \int \psi_r(x) f(x) \, dx = \int \psi_r(x) \bigl( f(x) - c \bigr) dx, for any c \in \mathbb{R}.

But

\mathrm{Var}\Bigl( \frac{1}{n} \sum_{i=1}^{n} \psi_r(x_i) \bigl( f(x_i) - c \bigr) \Bigr)

depends on c. Try c = \hat\beta_0. More generally,

\hat\beta_r^{(n)} = \frac{1}{n} \sum_{i=1}^{n} \psi_r(x_i) \Bigl[ f(x_i) - \sum_{s \ne r} \lambda_s \hat\beta_s^{(i-1)} \psi_s(x_i) \Bigr].

Original quasi-regression: weights of 0 or 1.

Self-consistent quasi-regression: weights \lambda_s \in [0, 1].

Bounding \int \hat f^2 by the sample variance eliminates explosive feedback.

\hat\beta_r and S_r are still updatable.

NB: n \bigl( \hat\beta_r^{(n)} - \beta_r \bigr) is a martingale in n.

Quasi Regression and black boxes 28

N-net example

f(x) is the prediction of log10(perf).

\phi_r are Legendre polynomials; \psi_r are their tensor products.

Basis: \{ r : \|r\|_0 \le 3, \; \|r\|_1 \le \dots \}, giving p = 1145.

The net is fast, so n = 500,000 (about 3 minutes on an 800 MHz PC, in Java).
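A sketch of how such a tensor product Legendre basis can be generated and evaluated is below. The degree caps used here are illustrative placeholders; the slides' exact caps, which yield p = 1145, did not survive transcription.

import numpy as np
from itertools import product
from numpy.polynomial import legendre

def multi_indices(d, max_active=3, max_degree=4):
    """All r in {0, ..., max_degree}^d with at most max_active nonzero entries,
    i.e. interactions of order <= max_active. Caps are illustrative."""
    keep = []
    for r in product(range(max_degree + 1), repeat=d):
        if sum(k > 0 for k in r) <= max_active:
            keep.append(r)
    return np.array(keep)

def basis(x, R):
    """Tensor product orthonormal Legendre functions psi_r on [0,1]^d.
    x: (n, d) array, R: (p, d) array of multi-indices; returns an (n, p) matrix."""
    n, d = x.shape
    t = 2.0 * x - 1.0                                      # map [0,1] to [-1,1]
    max_deg = int(R.max())
    vals = [np.sqrt(2 * k + 1) * legendre.Legendre.basis(k)(t) for k in range(max_deg + 1)]
    Z = np.ones((n, len(R)))
    for col, r in enumerate(R):
        for j, k in enumerate(r):
            if k > 0:
                Z[:, col] *= vals[k][:, j]
    return Z

R = multi_indices(d=6)
print(len(R), "basis functions")       # p depends on the caps chosen above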


Quasi Regression and black boxes 29

[Figure: lack-of-fit (LOF) versus sample size, with n running from 100 to 100,000 and LOF falling from about 1 toward 10^-3.]

Quasi Regression and black boxes 30

[The slide writes out terms of the fitted approximation (truncated in this transcript) and begins a table of additive components as fractions of the sample variance: syct 0.520, mmin 0.011, mmax 0.088; the remaining columns are cut off.]

Quasi Regression and black boxes 31

Neural net results

Number of bases is 1145.

###### Anova at Iteration 500000 ######
1-RSquare (LOF) is 0.0011707 at iteration 499500
Beta[0] (constant factor) is 2.0717
Sample mean is 2.0719, sample variance is 0.14359
Unbiased estimates of dimension variances
0.11441 0.026592 0.0027723 0.0 0.0 0.0
Dimension Probabilities
(Ratios of dimension variances to sample variance)
0.79676 0.18518 0.019307 0.0 0.0 0.0

Quasi Regression and black boxes 32

Neural net results, ctd

Variances on one and two variables / sample variance:

        syct         mmin         mmax         cach         chmin
        0.5177106
        9.292114E-4  0.01069175
        0.008898125  0.02590950   0.08782891
        0.05507833   0.006469443  0.05429608   0.1301971
        0.01091619   6.212815E-4  0.008541468  0.01008703   0.03679156
        2.480628E-4  4.889575E-4  2.725553E-4  0.001473632  2.348261E-4

Biggest main effect: syct, at 52%.

Biggest interaction: syct × cach, at 5.5%.


Quasi Regression and black boxes 37

Biggest interaction

Cycle Time × Cache Size Interaction

[Figure: the fitted syct × cach interaction plotted over the unit square, syct on the horizontal axis and cach on the vertical axis.]

Quasi Regression and black boxes 38

2nd biggest interaction

Cycle time × Main Memory Max: 5.4% of \sigma^2.

[Figure: perspective plot of the fitted syct × mmax interaction, with values ranging from about -0.06 to 0.04.]

Quasi Regression and black boxes 39

2nd biggest interaction

Cycle Time × Max Main Memory Interaction

[Figure: the fitted syct × mmax interaction plotted over the unit square, syct on the horizontal axis and mmax on the vertical axis.]

Quasi Regression and black boxes 40

N-net conclusions

1. \hat f is a fairly simple function with respect to U[0,1]^6.

2. x_1 (syct) is the most important input, and its effect is nearly linear.

3. At least one interaction is not supported by the data.

4. Non-random cross-validation (leaving out clusters) might help.


Quasi Regression and black boxes 41

Next directions

1. MARS-like dynamic choice of basis

2. Comparisons of f and \hat f on training data

3. Decompositions of \hat f under empirical measures

4. Distinguishing f structure from \hat f artifacts

5. More types of statistical/ML black boxes

6. Missing data (arise in function mining too)

7. Stopping rules

8. More basis function choices

9. Block diagonal or banded \int g(x) g(x)^T \, dx (e.g. B-splines)

10. Examples with noise (\Rightarrow unusable basis functions)

Quasi Regression and black boxes 42

Robot arm function

A robot arm has 4 joints, with segment lengths L_j and angles \theta_j. The shoulder is at (0, 0) and the hand is at (u, v):

u = \sum_{j=1}^{4} L_j \cos\Bigl( \sum_{k=1}^{j} \theta_k \Bigr)

v = \sum_{j=1}^{4} L_j \sin\Bigl( \sum_{k=1}^{j} \theta_k \Bigr)

f is the shoulder to hand distance:

f( L_1, L_2, L_3, L_4, \theta_1, \theta_2, \theta_3, \theta_4 ) = \sqrt{ u^2 + v^2 },
\qquad 0 \le L_j \le 1, \quad 0 \le \theta_j \le 2\pi.
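As code, the robot arm function is a few lines; this sketch vectorizes it over a sample of inputs, and the sample size shown is arbitrary.

import numpy as np

def robot_arm(L, theta):
    """Shoulder to hand distance for a planar arm with 4 segments.
    L, theta: arrays of shape (n, 4), with L_j in [0, 1] and theta_j in [0, 2*pi]."""
    cum = np.cumsum(theta, axis=1)             # theta_1, theta_1 + theta_2, ...
    u = np.sum(L * np.cos(cum), axis=1)
    v = np.sum(L * np.sin(cum), axis=1)
    return np.sqrt(u**2 + v**2)

rng = np.random.default_rng(4)
L = rng.random((5, 4))
theta = 2 * np.pi * rng.random((5, 4))
print(robot_arm(L, theta))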