A personal and historical perspective on some uses of least squares

Antoine de Falguerolles
Université de Toulouse III - Paul Sabatier (France), Dr hab. (retired)
antoine @ falguerolles.net




    Summary

    As observed by Stephen Stigler in one of his influential books, “The method of least squares is the automobile of modern statistical analysis [...] this method and its numerous variations [...] carry the bulk of statistical analyses”. In my presentation, I will consider a series of situations (either regression or singular value decomposition) in which least squares or some of its variations have actually been used or would be used nowadays. In all cases, some light modeling is used as a safeguard against separate evolutions of statistical “Schools” (e.g. use of the R package gnm for correspondence analysis).


    Stephen Stigler’s full citation

    Chapter 17, “Gauss and the Invention of Least Squares”, p. 320

    The method of least squares is the automobile of modern statistical analysis: [despite its limitations, occasional accidents, and incidental pollution,] this method and its numerous variations[, extensions and related conveyances] carry the bulk of statistical analyses[, and are known and valued by nearly all].

    Stephen Stigler (1999): Statistics on the Table, Cambridge: Harvard University Press.


    George Box and Norman Draper, 1987

    Essentially, all models are wrong, but some are useful. However, the approximate nature of the model must always be borne in mind.

    George Box and Norman Draper (1987): Empirical Model-Building and Response Surfaces, New York: John Wiley and Sons (p. 424?).


    Data structure
    - Y: an r × c response matrix
    - X_r: information on the rows (and their names)
    - X′_c: information on the columns (and their names)
    Missing data. Weights. Impact of the coding.

    [Diagram: block layout of the matrices Y, X_r and X′_c.]


    A hint at Pierre Legendre (1995)’s fourth-corner problem, where rows are species and columns are sites.

    [Diagram: block layout of Y, X_r and X′_c, with the fourth corner marked “?”.]


    Table of contents
    - Stephen Stigler, 1999
    - Box and Draper, 1987
    - Data structure
    - Regression: Laplace, 1796; Tobias Mayer, 1748; Legendre, 1805; Georges Müntz, 1834; Augustin-Louis Cauchy, 1836; Ibáñez and Saavedra, 1859; Pareto, 1897
    - SVD: La Armada Invencible, 1588; Principal coordinates analyses; Horse kicks, 1898; Correspondence Analysis
    - Perspectives? IW-SVD; A ghost model for MCA
    - The end


    Least squares in what we call a regression context appeared between 1795 (Gauss’s claims, but no published evidence) and 1805 (Legendre’s publication). Still:
    - Before? (18th century)
    - After? (19th century)
      - generalised least squares
      - qualitative explanatory variables
      - generalised linear models


    Laplace (An IV)
    (23 September 1795 - 21 September 1796)

    Mathématiques

    a lesson taught on 21 Floréal An IV (10 May 1796), pp. 67-68



    An enigmatic assertion on the computing of a central value. Noticeable ideas:
    - weight (poids): w_i
    - influence of the error (influence de l’erreur), computed by differential calculus: h_i
    - w_i = 1/h_i
    - gravity centre (un centre de gravité)

    The context is astronomical (and my language surely anachronistic).


    Let y_i (i ∈ {1, ..., n}) be observed values and x_i (x_i > 0) the corresponding computed values, such that y_i ≈ b x_i, b being an unknown correction coefficient. How do we estimate b? A “middle” of the results of several observations (fixer un milieu entre les résultats de plusieurs observations)?

        y_i / x_i,  i ∈ {1, ..., n}

    Take b as the centre of gravity of the weighted y_i/x_i:

        b = ∑_{i=1,...,n} w_i(x_1, ..., x_n) (y_i/x_i) / ∑_{i=1,...,n} w_i(x_1, ..., x_n)

    Which weights w_i(x_1, ..., x_n) for the y_i/x_i? The weights could also depend on the y_i’s, but that is not really considered here.


    Remember, without much loss of generality, x_i > 0. Examples:
    - w_i(x_1, ..., x_n) = 1:

        b = (1/n) ∑_{i=1,...,n} y_i/x_i

    - w_i(x_1, ..., x_n) = x_i:

        b = ∑_{i=1,...,n} x_i (y_i/x_i) / ∑_{i=1,...,n} x_i

    - w_i(x_1, ..., x_n) = x_i²:

        b = ∑_{i=1,...,n} x_i² (y_i/x_i) / ∑_{i=1,...,n} x_i²


    The same three choices, simplified:
    - w_i(x_1, ..., x_n) = 1:

        b = (1/n) ∑_{i=1,...,n} y_i/x_i

    - w_i(x_1, ..., x_n) = x_i:

        b = ∑_{i=1,...,n} y_i / ∑_{i=1,...,n} x_i

    - w_i(x_1, ..., x_n) = x_i²:

        b = ∑_{i=1,...,n} x_i y_i / ∑_{i=1,...,n} x_i²
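The three weightings and their simplified forms can be checked numerically. A minimal Python sketch with hypothetical data (the deck’s own snippets are in R):

```python
# Laplace's centre-of-gravity rule: b = sum(w_i * y_i/x_i) / sum(w_i).
def b_hat(x, y, w):
    return sum(wi * yi / xi for wi, xi, yi in zip(w, x, y)) / sum(w)

x = [1.0, 2.0, 4.0]          # computed values (x_i > 0), hypothetical
y = [2.1, 3.9, 8.2]          # observed values, roughly y_i = 2 * x_i

b1 = b_hat(x, y, [1.0] * len(x))         # w_i = 1:     mean of the ratios y_i/x_i
b2 = b_hat(x, y, x)                      # w_i = x_i:   reduces to sum(y)/sum(x)
b3 = b_hat(x, y, [xi ** 2 for xi in x])  # w_i = x_i^2: reduces to sum(x*y)/sum(x^2)

# Check the simplified closed forms against the general rule.
assert abs(b2 - sum(y) / sum(x)) < 1e-12
assert abs(b3 - sum(xi * yi for xi, yi in zip(x, y))
               / sum(xi ** 2 for xi in x)) < 1e-12
```

Note that the third choice is exactly the least-squares slope of y ≈ bx through the origin.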


    Caveat: an interpretation extracted from later publications by Laplace (and from discussions with patient colleagues)!

    Influence of the error in independent observations on the unknown coefficient?
    - Model 1: e = b − y/x, h = db/de = 1 and w = 1
    - Model 2: e = bx − y, h = db/de = 1/x and w = x

    Model 2 and Laplace’s suggestion lead to the so-called estimator of Roger Cotes (1682 - 1716):

        a = ∑_{i=1,...,n} x_i (y_i/x_i) / ∑_{i=1,...,n} x_i = ∑_{i=1,...,n} y_i / ∑_{i=1,...,n} x_i

    a linear estimator, but not the yet-unknown least-squares solution for the regression model y ≈ bx.

    Note: if the y_i are observed values of independent binomial B(x_i, π) variables, then the rule of Roger Cotes gives the maximum likelihood estimator of π.
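The closing note can be illustrated directly: for independent Y_i ~ B(x_i, π) the log-likelihood is ∑_i [y_i log π + (x_i − y_i) log(1 − π)] up to constants, maximised at π̂ = ∑y_i / ∑x_i, i.e. Cotes’s rule. A small Python sketch with hypothetical counts:

```python
import math

# Negative binomial log-likelihood (up to additive constants).
def neg_loglik(p, x, y):
    return -sum(yi * math.log(p) + (xi - yi) * math.log(1 - p)
                for xi, yi in zip(x, y))

x = [10, 20, 30]   # numbers of trials x_i (hypothetical)
y = [3, 7, 11]     # observed success counts y_i

pi_cotes = sum(y) / sum(x)   # Cotes's rule: 21/60 = 0.35

# A crude grid search over (0, 1) locates the same maximiser.
grid = [i / 1000 for i in range(1, 1000)]
pi_grid = min(grid, key=lambda p: neg_loglik(p, x, y))
assert abs(pi_grid - pi_cotes) < 1e-3
```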


    Exercise: Let Z_i ∼ L(β, σ²/x_i²) be n independent r.v. Compare the two following unbiased estimators of β:

        (1/n) ∑_{i=1,...,n} Z_i   and   (1 / ∑_{i=1,...,n} |x_i|) ∑_{i=1,...,n} |x_i| Z_i

    Hint: x_i > 0, m_1^A ≥ m_1^H > 0, 1/(m_1^A)² ≤ 1/(m_1^H)² ≤ 1/m_2^H, where m_k^A is the arithmetic mean of the kth powers of the values and m_k^H the harmonic mean of the kth powers.


    And multiple regression?

    The situation had already been considered!


    Tobias Mayer, 1748

    The ordinary multiple regression model (in anachronistic vector and matrix form):

        E[Y] = Xβ and Var(Y) = σ²I

    Let U be any conformable matrix (U′ and β, U and X have the same number of rows). An analogue of the normal equations is the system:

        U′Xβ = U′Y

    If U′X is regular, β̂_M = (U′X)⁻¹U′Y is
    - a linear and unbiased estimator of β
    - with Var(β̂_M) = σ²(U′X)⁻¹U′U(X′U)⁻¹
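Mayer’s estimator is easy to exercise numerically. A pure-Python sketch for p = 2 parameters, choosing U as cluster indicators (Mayer’s method of averages) and solving the 2 × 2 system by Cramer’s rule; the data are made up, not from the talk:

```python
# Mayer-type linear estimator beta_hat = (U'X)^{-1} U'Y for a 2-parameter model.
def mayer(X, Y, U):
    # Build the 2x2 system (U'X) b = U'Y and solve by Cramer's rule.
    n = len(X)
    UtX = [[sum(U[i][r] * X[i][c] for i in range(n)) for c in range(2)]
           for r in range(2)]
    UtY = [sum(U[i][r] * Y[i] for i in range(n)) for r in range(2)]
    det = UtX[0][0] * UtX[1][1] - UtX[0][1] * UtX[1][0]
    b0 = (UtY[0] * UtX[1][1] - UtX[0][1] * UtY[1]) / det
    b1 = (UtX[0][0] * UtY[1] - UtY[0] * UtX[1][0]) / det
    return b0, b1

xs = [1.0, 2.0, 3.0, 4.0]
X = [[1.0, x] for x in xs]          # intercept column plus one covariate
Y = [1 + 2 * x for x in xs]         # noiseless y = 1 + 2x
# Mayer's choice: split the data into two clusters; U columns are indicators.
U = [[1, 0], [1, 0], [0, 1], [0, 1]]
b0, b1 = mayer(X, Y, U)             # recovers (1, 2) for any regular U
```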


    Several ways of constructing the analogues of matrix U and of the system (U′X)β̂_M = U′Y:
    - Mayer: with p regression coefficients, split the data into p clusters; aggregate the data in each cluster to obtain a system of p linear equations in p unknowns.
    - Ideas from Laplace’s lesson: E[Y] = Xβ = [x_i^j]β; for each column j of X, multiply every line i of Y − Xβ = 0 by sign(x_i^j) and add all the revised lines into one linear equation.
    - Laplace: overlapping clusters; linear combinations with coefficients in {−1, 0, 1}.

    Potential computational problem: solving the p × p linear system by hand,

        (U′X)β̂_M = U′Y,  β̂_M = (U′X)⁻¹U′Y

    Triangularisation.


    Legendre, 1805

    The first page of Legendre’s appendix on least squares. The unknown coefficients are x, y, z, ...; the variables are a, b, c, f, ... Note that a (or rather −a) is the response variable, and that there is no formal intercept (preliminary centering?). E, the error, is zero or a very small quantity.


    Measuring the Paris meridian arc

    A famous data set in metrology!

    Place         Latitude       Arc between (in modules)   L′−L         L′+L
    Dunkerque     51° 2′10″.50
                                 DP 62472.59                2°11′20″.75  99°53′ 0″
    Panthéon      48°50′49″.75
                                 PE 76145.74                2°40′ 7″.25  95° 1′32″
    Evaux         46°10′42″.50
                                 EC 84424.55                2°57′48″.10  89°23′37″
    Carcassonne   43°12′54″.40
                                 CM 52749.48                1°51′ 9″.60  84°34′39″
    Montjouy      41°21′44″.80

    See Stephen M. Stigler, “Gauss and the invention of least squares”, The Annals of Statistics, 1981, 9(3), 465-474.

    - i (i = 0, ..., 4): the places of observation.
    - L_i: their latitudes.
    - S_i: the lengths of the 4 arcs i (i = 1, ..., 4) between the places of latitudes L_{i−1} and L_i.


    The equations to match: two potential models under consideration, one preferred by Legendre (number 2, shown in red on the slide).

    1. Response variable: y_i = S_i. Linear predictor:

        µ_i(β) = β1 (L_i − L_{i−1}) + β2 K sin(L_i − L_{i−1}) cos(L_i + L_{i−1})

    where K is a known coefficient, and β1 and β2 denote the unknown regression coefficients.

    2. Response variable: y_i = L_i − L_{i−1}. Linear predictor:

        µ_i(β) = S_i/K′ + β1 S_i/K′ + β2 K″ sin(L_i − L_{i−1}) cos(L_i + L_{i−1})

    where K′ and K″ are known coefficients, S_i/K′ is an “offset”, and β1 and β2 again denote the unknown regression coefficients.


    Matching (loss) criterion for the equations y_i − µ_i(β) = 0? Ordinary least squares:

        arg min_β ||y − µ(β)||²

    Not quite, since the errors are likely to be correlated along the meridian. Legendre’s proposal:

        u_i = e_i − e_{i−1}

    Generalised least squares:

        arg min_β ||y − µ(β)||²_W

    Solved by OLS on augmented data! EM?

    Obvious extensions: MA(1) and AR(1) disturbances, u_i = e_i + θ e_{i−1} (θ ∈ [−1, 1]) and u_i = θ u_{i−1} + e_i (θ ∈ ]−1, 1[). (For the latter, remember Durbin-Watson!) RegArima?
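For a known diagonal weight matrix W, the generalised criterion ||y − µ(β)||²_W leads to weighted normal equations that are as easy to solve as the ordinary ones. A minimal Python sketch of the simple-regression case (illustrative data and weights, not from the slides):

```python
# Weighted least squares for y_i ~ b0 + b1*x_i with known weights w_i:
# solve the normal equations (X'WX) b = X'Wy in closed form.
def wls(x, y, w):
    Sw   = sum(w)
    Swx  = sum(wi * xi for wi, xi in zip(w, x))
    Swy  = sum(wi * yi for wi, yi in zip(w, y))
    Swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    Swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    b1 = (Sw * Swxy - Swx * Swy) / (Sw * Swxx - Swx ** 2)
    b0 = (Swy - b1 * Swx) / Sw
    return b0, b1

x = [1.0, 2.0, 3.0, 4.0]
y = [1.9, 4.1, 5.9, 8.1]
b0, b1 = wls(x, y, [1.0] * 4)                # unit weights: ordinary least squares
b0w, b1w = wls(x, y, [4.0, 1.0, 1.0, 1.0])   # up-weight the first observation
```

With a full covariance matrix W one would instead whiten the data first and run OLS on the transformed observations, which is one reading of “solved by OLS on augmented data”.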


    Müntz’s data

    Group  Unit  Distance (x)  Cost (y)
    1       1    18.8          5.50
    1       2     0.8          0.55
    1       3     2.2          1.00
    1       4    12.3          4.00
    1       5     9.9          3.50
    1       6     9.0          3.00
    1       7    10.2          4.00
    1       8     8.4          3.00
    1       9     5.0          2.70
    1      10     2.2          1.20
    1      11     2.7          1.50
    1      12     2.1          1.00
    1      13     1.6          0.80
    1      14     1.5          1.00
    1      15    13.3          4.10
    2       1     7.6          3.40
    2       2     6.7          3.35
    2       3     8.9          4.50

    Georges Müntz (1807-1887), a road engineer, on transportation costs of stones.


    Covariance analysis

    [Scatterplot: transportation cost versus distance for the two groups.]


    Covariance analysis

    Y = c(Y1, Y2); X = c(X1, X2)
    G = factor(rep(c(1, 2), c(length(Y1), length(Y2))))

    > lm(Y ~ G:X)$coefficients
    (Intercept)        G1:X        G2:X
      0.5966443   0.2785730   0.4096165

    > lm(Y1 ~ X1)$coefficients
    (Intercept)          X1
      0.6040630   0.2778905

    > mean((Y2 - lm(Y1 ~ X1)$coefficients[1]) / X2)
    [1] 0.4051578

    Optimality? The Gauss-Markov theorem gives the answer.


    Augustin-Louis Cauchy, 1836

    Augustin-Louis Cauchy (1789-1857) is one of the greatest French mathematicians of the 19th century. He wrote about 800 papers in an astonishing variety of areas. His uncompromising and reactionary (légitimiste) personality hindered his academic career. He published his regression method while in exile in Prague.

    Cauchy’s motivation: alleviate the computational burden of least squares (cross products and systems of linear equations).

    The methods of Cauchy and Legendre were often simultaneously considered in French applications until the early 20th century.


    Estimation: Cauchy? Cauchy vs Legendre?
    - the y_i, i ∈ {1, ..., n}, are observed values of independent random variables Y_i, Y_i ∼ L(µ_i, σ²)
    - a linear model for µ_i:
      simple regression: µ_i = β0 + β1 x_i
      multiple regression: µ_i = β0 + β1 x_i¹ + ... + βK x_iᴷ
    - a stepwise inclination: introduce x¹, x², ... in turn (with some idea of the order of introduction)


    Simple regression: µ_i = β0 + β1 x_i

    Legendre:

        β̂1 = ∑_{i=1}^n (x_i − x̄)(Y_i − Ȳ) / ∑_{i=1}^n (x_i − x̄)²
        β̂0 = Ȳ − β̂1 x̄

    Cauchy:

        β̂1 = ∑_{i=1}^n sign(x_i − x̄)(Y_i − Ȳ) / ∑_{i=1}^n |x_i − x̄|
        β̂0 = Ȳ − β̂1 x̄

    where Ȳ = (1/n) ∑_{i=1}^n Y_i, x̄ = (1/n) ∑_{i=1}^n x_i, and sign(x_i − x̄) = 1(x_i − x̄ ≥ 0) − 1(x_i − x̄ ≤ 0).

    Property: both methods provide linear unbiased estimators.
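Both estimators are one-liners; Cauchy’s replaces the cross-products by signs, which was the whole computational point. A Python sketch on hypothetical data lying exactly on y = 1 + 2x, where both recover the slope 2:

```python
# Legendre's least-squares slope vs Cauchy's sign-based slope for
# y_i = b0 + b1*x_i. Both are linear and unbiased.
def slopes(x, y):
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    dx = [xi - xbar for xi in x]
    dy = [yi - ybar for yi in y]
    b1_leg = (sum(a * b for a, b in zip(dx, dy))
              / sum(a ** 2 for a in dx))                 # cross-products
    sgn = lambda t: 1.0 if t >= 0 else -1.0
    b1_cau = (sum(sgn(a) * b for a, b in zip(dx, dy))
              / sum(abs(a) for a in dx))                 # signs only
    return b1_leg, b1_cau

x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]       # exactly y = 1 + 2x (hypothetical)
b_leg, b_cau = slopes(x, y)    # both equal 2 on exact data
```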


    Cauchy as weighted Legendre

        β̂1 = ∑_{i=1}^n sign(x_i − x̄)(Y_i − Ȳ) / ∑_{i=1}^n |x_i − x̄|
            = ∑_{i=1}^n w_i (x_i − x̄)(Y_i − Ȳ) / ∑_{i=1}^n w_i (x_i − x̄)²,  with w_i = 1/|x_i − x̄|
        β̂0 = Ȳ − β̂1 x̄

    Cauchy and Mayer’s method of means:

        β̂1 = [ (1/∑_{i=1}^n 1(x_i > x̄)) ∑_{i=1}^n Y_i 1(x_i > x̄) − (1/∑_{i=1}^n 1(x_i < x̄)) ∑_{i=1}^n Y_i 1(x_i < x̄) ]
             / [ (1/∑_{i=1}^n 1(x_i > x̄)) ∑_{i=1}^n x_i 1(x_i > x̄) − (1/∑_{i=1}^n 1(x_i < x̄)) ∑_{i=1}^n x_i 1(x_i < x̄) ]


    Multiple regression: µ_i = β0 + β1 x_i¹ + β2 x_i²

    Common procedure with specific estimation (Legendre, or Cauchy, or whatever):
    - Step 0. All variables are mean-centered: y_i^(0) = y_i − ȳ, x_i^{1,(0)} = x_i¹ − x̄¹, x_i^{2,(0)} = x_i² − x̄².
    - Step 1. Simple regressions of y^(0) on x^{1,(0)} with slope b^{y1}, and of x^{2,(0)} on x^{1,(0)} with slope b^{21}. Compute the residuals y_i^(1) = y_i^(0) − b^{y1} x_i^{1,(0)} and x_i^{2,(1)} = x_i^{2,(0)} − b^{21} x_i^{1,(0)}.
    - Step 2. Simple regression of y^(1) on x^{2,(1)} with slope b^{y2}. Compute the residuals y_i^(2) = y_i^(1) − b^{y2} x_i^{2,(1)}.

    Finally: b2 = b^{y2}, b1 = b^{y1} − b^{y2} b^{21}, b0 = ȳ − b1 x̄¹ − b2 x̄² (triangularisation).

    If LS is used in each step, the final result is independent of the entering order of the explanatory variables.
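The three steps can be checked against exact data. A Python sketch with hypothetical values generated from b = (1, 2, 3) and no noise, where the back-substitution recovers the coefficients:

```python
# Successive simple regressions (triangularisation) for
# mu_i = b0 + b1*x1_i + b2*x2_i, following the steps on the slide.
def center(v):
    m = sum(v) / len(v)
    return [vi - m for vi in v], m

def slope(x, y):  # simple-regression slope on centred data
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y  = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]   # exact, no noise

y0,  ybar  = center(y)    # Step 0: mean-center everything
x10, x1bar = center(x1)
x20, x2bar = center(x2)

by1 = slope(x10, y0)      # Step 1: y on x1 ...
b21 = slope(x10, x20)     #         ... and x2 on x1
y1  = [a - by1 * b for a, b in zip(y0, x10)]
x21 = [a - b21 * b for a, b in zip(x20, x10)]
by2 = slope(x21, y1)      # Step 2: residuals on residuals

b2 = by2                  # back-substitution
b1 = by1 - by2 * b21
b0 = ybar - b1 * x1bar - b2 * x2bar
```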


    One slope, several intercepts
    - Carlos Ibáñez e Ibáñez de Ibero (Barcelona, 1825 - Nice, 1891)
    - Frutos Saavedra Meneses (1823-1868)
    - Aimé Laussedat (1819-1907) (1860)


    One slope, several intercepts

        µ_{i,g(i)} = β_{g(i),0} + β1 x_{i,g(i)}

    This problem arose in the calibration of some parts of an apparatus to be used in geographical triangulation. The number of levels of the group factor g is large and the design is balanced. Least squares are used. The block pattern of the design matrix and of the normal equations is recognised, and closed-form estimators are obtained. Residuals are carefully checked.
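The closed form in question is the pooled within-group slope, with one intercept per group recovered from the group means. A Python sketch on two hypothetical groups sharing slope 2:

```python
# One common slope, one intercept per group: the LS solution pools the
# within-group cross-products for the slope, then sets each intercept
# from its group means.
def common_slope(groups):
    num = den = 0.0
    for x, y in groups:
        xbar, ybar = sum(x) / len(x), sum(y) / len(y)
        num += sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
        den += sum((xi - xbar) ** 2 for xi in x)
    b1 = num / den
    intercepts = [sum(y) / len(y) - b1 * sum(x) / len(x) for x, y in groups]
    return b1, intercepts

g1 = ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])     # y = 0 + 2x (hypothetical)
g2 = ([1.0, 2.0, 3.0], [7.0, 9.0, 11.0])    # y = 5 + 2x (hypothetical)
b1, (a1, a2) = common_slope([g1, g2])       # recovers 2, 0 and 5
```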


    Pareto, 1897

    Pareto (1848-1923) was born in Paris to an exiled Italian father and a French mother. A graduate of the Politecnico di Torino, he was appointed lecturer in economics at the University of Lausanne in Switzerland in 1893. He died in Switzerland.

    In an 1897 paper in the Journal de la Société de statistique de Paris:
    - Artificial data
    - Example 1: annual variation of population size over time in England and Wales (regression issues)
    - Example 2: annual covariation over time of the number of weddings and economic prosperity in England; prosperity is measured by two covariates, coal extraction and exports (differencing of time-series issues)


    Population growth

    Year (i)   Population (yi)
       1          8893
       2         10164
       3         12000
       4         13897
       5         15914
       6         17928
       7         20066
       8         22712
       9         25974
      10         29003

    Notation: i = 1, . . . , 10; yi is the population size in year xi = i.

    Modeling: Yi ∼ L(µi, σ²)

    I Arithmetic progression: µi = β0 + β1 xi (identity link)

    I Geometric progression: log(µi) = β0 + β1 xi (log link)
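The two progressions can be fitted by ordinary least squares, the geometric one on the log scale. A minimal sketch in Python (the deck's own code is R; the population values are those of the table above):

```python
import numpy as np

# Population data from the slide: years i = 1..10
years = np.arange(1, 11)
pop = np.array([8893, 10164, 12000, 13897, 15914,
                17928, 20066, 22712, 25974, 29003], dtype=float)

X = np.column_stack([np.ones(10), years])

# Arithmetic progression: LS fit of y on x (identity link)
b_arith, *_ = np.linalg.lstsq(X, pop, rcond=None)

# Geometric progression: LS fit of log(y) on x (log link,
# approximated here by a plain log transform of the response)
b_geom, *_ = np.linalg.lstsq(X, np.log(pop), rcond=None)
```

On these data the log-scale fit is nearly exact, which is what makes the geometric model attractive.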


    Minimization of S(β0, β1) = Σ_{i=1}^n (yi − exp{β0 + β1 xi})²

    Write µi = exp{β0 + β1 xi}; constant factors of 2 are dropped below.

    Gradient:

    ∇S(β0, β1) = − [ Σ_{i=1}^n (yi − µi) µi ,  Σ_{i=1}^n (yi − µi) µi xi ]′

    Hessian:

    HS(β0, β1) = [ Σi (µi² − (yi − µi)µi)        Σi (µi² xi − (yi − µi)µi xi)   ]
                 [ Σi (µi² xi − (yi − µi)µi xi)  Σi (µi² xi² − (yi − µi)µi xi²) ]

    At a stationary point (β*0, β*1)′, ∇S(β*0, β*1) = (0, 0)′, and then

    HS(β*0, β*1) = Σ_{i=1}^n µ*i² [ 1   xi  ]  −  Σ_{i=1}^n (yi − µ*i) µ*i xi² [ 0  0 ]
                                  [ xi  xi² ]                                  [ 0  1 ]


    Minimization of Σ_{i=1}^n (yi − exp{β0 + β1 xi})²

    Pareto searches for a stationary point using a Newton-Raphson algorithm in which a modified Hessian is used:

    [ β0^{(k+1)} ]   [ β0^{(k)} ]
    [ β1^{(k+1)} ] = [ β1^{(k)} ] − H̃S(β0^{(k)}, β1^{(k)})^{−1} ∇S(β0^{(k)}, β1^{(k)})

    where

    H̃S(β0, β1) = Σ_{i=1}^n µi² [ 1   xi  ]
                                [ xi  xi² ]

    Pareto has, in effect, realized that E[(Yi − µi)µi xi²] = 0 and that he can use what would later be called Fisher's scoring method (Ronald Fisher, 1890-1962)! Note that Pareto's simplified Hessian is, under Gaussian assumptions, Fisher's information matrix for β0 and β1.
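A minimal numpy sketch of such a scoring iteration on the population data above (my reconstruction, not Pareto's own arithmetic; starting values come from the log-scale LS fit):

```python
import numpy as np

# Population data from the earlier slide
x = np.arange(1, 11)
y = np.array([8893, 10164, 12000, 13897, 15914,
              17928, 20066, 22712, 25974, 29003], dtype=float)
X = np.column_stack([np.ones_like(x, dtype=float), x])

# Start from the ordinary LS fit of log(y) on x
beta = np.linalg.lstsq(X, np.log(y), rcond=None)[0]

for _ in range(50):
    mu = np.exp(X @ beta)
    grad = X.T @ ((y - mu) * mu)            # -(1/2) * gradient of S
    info = X.T @ ((mu ** 2)[:, None] * X)   # (1/2) * modified Hessian
    step = np.linalg.solve(info, grad)
    beta = beta + step

mu = np.exp(X @ beta)                       # fitted means at convergence
```

This scoring step coincides with Gauss-Newton for the model µ = exp{β0 + β1 x}; with a good starting point it converges in a few iterations.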


    Weighted least-squares for the log-transformed response

    Weighting the log-transformed observations might provide a fair approximation. Pareto suggests weighting each relation log(yi) ≈ β0 + β1 xi by yi, i.e. taking weights wi = yi² in the loss function:

    Σ_{i=1}^n yi² (log(yi) − (β0 + β1 xi))²

    How does this relate to IWLS?

    I Some GLM notation: Yi ∼ L(µi, σ²) with link function g, g(µi) = ηi = β0 + β1 xi. Here g = log.

    I Associated loss function: Σ_{i=1}^n (yi − exp{β0 + β1 xi})²

    I Loss function at iteration k:

      Σ_{i=1}^n (exp{ηi^{(k−1)}})² (zi^{(k)} − (β0 + β1 xi))²

      where zi^{(k)} = (yi − exp{ηi^{(k−1)}}) / exp{ηi^{(k−1)}} + ηi^{(k−1)}


    Weighted least-squares for the log-transformed response

    I Initial values: zi^{(0)} = ηi^{(0)} = log(yi)

    I Loss function at iteration 1: Σ_{i=1}^n yi² (log(yi) − (β0 + β1 xi))²

    So the first IWLS iteration is exactly Pareto's weighted least-squares fit.
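As a numeric illustration (Python rather than the deck's R; the data are those of the population slide), Pareto's weighted fit is a single weighted least-squares solve; numpy.polyfit with residual weights wi = yi minimizes the same loss Σ yi²(log yi − (β0 + β1 xi))²:

```python
import numpy as np

x = np.arange(1, 11).astype(float)
y = np.array([8893, 10164, 12000, 13897, 15914,
              17928, 20066, 22712, 25974, 29003], dtype=float)
X = np.column_stack([np.ones(10), x])

w = y ** 2                          # Pareto's weights on the squared scale
# weighted normal equations: (X' W X) b = X' W log(y)
b = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * np.log(y)))

# cross-check: polyfit's w multiplies the residuals, so w = y here
b_poly = np.polyfit(x, np.log(y), 1, w=y)   # returns [slope, intercept]
```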


    Eigen vs Singular

    SVD came into the statistical landscape in the 1980s and replaced EVD in many statistical uses. Both extend to complex matrices.

    Y is a #r × #c real matrix of rank K. Consider the optimal approximation of Y by a matrix M of rank k (k ≤ K):

    arg min_{M ∈ R_k} ||Y − M||²

    SVD provides a solution: Σ_{ℓ=1}^k d_ℓ u_ℓ {v_ℓ}′

    I SVD is well documented and implemented
    I useful for factorisation
    I useful for visualisation
      I distances between rows of Y, inner products for Y′Y
      I biplot
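A quick numpy check of this Eckart-Young property on a random matrix (hypothetical sizes): the squared Frobenius error of the rank-k truncation equals the sum of the discarded squared singular values.

```python
import numpy as np

rng = np.random.default_rng(0)
Y = rng.normal(size=(10, 6))               # hypothetical #r x #c matrix

U, d, Vt = np.linalg.svd(Y, full_matrices=False)
k = 2
M = (U[:, :k] * d[:k]) @ Vt[:k]            # rank-k LS approximation

err2 = np.linalg.norm(Y - M, "fro") ** 2   # squared approximation error
```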


    La Armada Invencible, 1558

          V1     V2    V3    V4    V5   V6     V7   V8   V9  V10
    A1    12   7737  3330  1293  4623  347  18450  789  186  150
    A2    14   6567  1937   863  2800  238  11900  477  140   87
    A3    16   8714  2458  1719  4171  384  23040  710  290  309
    A4    11   8762  2325   780  3105  240  10200  415   63  119
    A5    14   6991  1992   616  2608  247  12150  518  139  109
    A6    10   7705  2780   767  3523  280  14000  584  177  141
    A7    23  10271  3121   608  3729  384  19200  258  142  215
    A8    22   1221   479   574  1093   91   4550   66   20   13
    A9     4      0   873   468  1341  200  10000  498   61   88
    A10    4      0     0   362   362   20   1200   60   20   20

    I Rows (10): fleets forming the Armada: Armada de Galeones de Portugal (A1), Armada de Viscaya (A2), Galeones de la Armada de Castilla (A3), . . . , Galeras (A10)

    I Columns (10 items): Numero de navios (V1), Toneladas (V2), Gente de guerra (V3), Gente de mar (V4), Numero de todos (V5 = V3 + V4), Pieças di artilleria (V6), Peloteria (V7), Poluora (V8), Plomo qui tales (V9), Cuerda qui tales (V10)

    I Column totals sometimes differ from the published totals (typographic errors rather than gross errors).

    As many columns as rows. Coincidence?


    SVD

    mat.Y=as.matrix(read.csv('A.csv', header = TRUE, sep = ","))
    vec.m=apply(mat.Y,2,mean)
    mat.V=cov.wt(mat.Y)$cov
    mat.R=cor(mat.Y)
    mat.Ys=sweep(mat.Y,2,sqrt(diag(mat.V)),'/')
    mat.Yc=sweep(mat.Y,2,vec.m,'-')
    mat.Z=sweep(mat.Yc,2,sqrt(diag(mat.V)),'/')

    > round(sum(mat.Z*mat.Z),3)
    [1] 90

    > round(sum(mat.Z*mat.Z)-cumsum(svd(mat.Z)$d^2),3)
     [1] 22.185 11.089  5.590  2.912  1.264
     [6]  0.431  0.063  0.003  0.000  0.000

    Two principal dimensions? The rounded third squared sv is 5.590.
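The printed total of 90 is no accident: with 10 rows, each centred and standardised column (sample variance with divisor n − 1, as in cov.wt) contributes n − 1 = 9 to the sum of squares, and there are 10 columns. A quick check in Python on arbitrary data:

```python
import numpy as np

rng = np.random.default_rng(2)
Y = rng.normal(size=(10, 10))              # any 10 x 10 data matrix

# column centre and standardise, variance with divisor n - 1
Z = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)

total = (Z ** 2).sum()                     # always (n - 1) * #columns = 90
```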


    gnm

    library(gnm)
    Ys=as.vector(mat.Ys)
    F.R=factor(row(mat.Y), labels=rownames(mat.Y))
    F.C=factor(col(mat.Y), labels=colnames(mat.Y))

    > deviance(gnm(Ys~F.C))
    [1] 90

    > deviance(gnm(Ys~F.C+instances(Mult(F.R,F.C),3)))
    Initialising
    Running start-up iterations..
    Running main iterations...........................
    Done
    [1] 5.589685


    SVD

    I Data matrix Y (#r × #c) with rank K, column mean vector ȳ, covariance matrix V and correlation matrix R

    I Preprocessing of Y: Ys column standardised; Yc column centered; Z = [z_rc] column centered and standardised

    I SVD of Z: Z = U diag(d) V′, where U′1_{#r} = 0 and U′U = V′V = I_K

    I LS approximation of Z of order k (k ≤ K):

      Σ_{ℓ=1}^k d_ℓ u_ℓ {v_ℓ}′


    gnm

    Reconstitution formula for Ys = [y_rc]:

    y_rc = ȳ_c + Σ_{ℓ=1}^k d_ℓ u_{ℓ,r} v_{ℓ,c}

    Sum of main column effects and a multiplicative (bilinear) interaction.

    Note: identification constraints other than the ones used in SVD are possible.


    Death by horse kick in the Prussian cavalry, 1898

    Year    G   I  II III  IV   V  VI VII VIII IX   X  XI XIV  XV
    1875    0   0   0   0   0   0   0   1   1   0   0   0   1   0
    1876    2   0   0   0   1   0   0   0   0   0   0   0   1   1
    1877    2   0   0   0   0   0   1   1   0   0   1   0   2   0
    1878    1   2   2   1   1   0   0   0   0   0   1   0   1   0
    1879    0   0   0   1   1   2   2   0   1   0   0   2   1   0
    1880    0   3   2   1   1   1   0   0   0   2   1   4   3   0
    1881    1   0   0   2   1   0   0   1   0   1   0   0   0   0
    1882    1   2   0   0   0   0   1   0   1   1   2   1   4   1
    1883    0   0   1   2   0   1   2   1   0   1   0   3   0   0
    1884    3   0   1   0   0   0   0   1   0   0   2   0   1   1
    1885    0   0   0   0   0   0   1   0   0   2   0   1   0   1
    1886    2   1   0   0   1   1   1   0   0   1   0   1   3   0
    1887    1   1   2   1   0   0   3   2   1   1   0   1   2   0
    1888    0   1   1   0   0   1   1   0   0   0   0   1   1   0
    1889    0   0   1   1   0   1   1   0   0   1   2   2   0   2
    1890    1   2   0   2   0   1   1   2   0   2   1   1   2   2
    1891    0   0   0   1   1   1   0   1   1   0   3   3   1   0
    1892    1   3   2   0   1   1   3   0   1   1   0   1   1   0
    1893    0   1   0   0   0   1   0   2   0   0   1   3   0   0
    1894    1   0   0   0   0   0   0   0   1   0   1   1   0   0


    CA’s formulas, miracles and confusion

    (Schematic on the slide: the table [y^{RC}_{rc}] bordered by its row margins y^R_r, column margins y^C_c and grand total y^∅, next to the corresponding table of residuals [e^{RC}_{rc}].)

    I A two-way table of counts Y^{RC} = [y^{RC}_{rc}], marginal vectors Y^R = [y^R_r] and Y^C = [y^C_c], and grand total y^∅

    I Table of expected values under additivity of effects (independence): (1/y^∅) Y^R {Y^C}′ = [y^R_r y^C_c / y^∅]

    I Associated two-way table E of Pearson’s residuals:

      E = [e^{RC}_{rc}] = [ (y^{RC}_{rc} − y^R_r y^C_c / y^∅) / √(y^R_r y^C_c / y^∅) ]

    Light taste of Poisson or multinomial
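A numpy sketch of these quantities on a small hypothetical table of counts; the Pearson residuals, weighted by the square roots of the expected counts, sum to zero along each margin:

```python
import numpy as np

Y = np.array([[12.0, 7.0, 5.0],
              [4.0, 9.0, 11.0]])           # hypothetical 2 x 3 counts

y_tot = Y.sum()                            # grand total
yr = Y.sum(axis=1, keepdims=True)          # row margins
yc = Y.sum(axis=0, keepdims=True)          # column margins

expected = yr * yc / y_tot                 # independence fit
E = (Y - expected) / np.sqrt(expected)     # Pearson residuals
chi2 = (E ** 2).sum()                      # Pearson chi-square statistic
```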


    CA formulas, miracles and confusion

    I SVD of E: E = U diag(d) V′

    I Row and column metrics W_R = (1/y^∅) diag(Y^R) and W_C = (1/y^∅) diag(Y^C)

    I Reconstruction formula (the product with (1 + ·) is elementwise):

      Y = [y^{RC}_{rc}] = (1/y^∅) Y^R {Y^C}′ ∘ ( 1 + (1/√y^∅) {W_R^{−1/2} U} diag(d) {W_C^{−1/2} V}′ )

    Light taste of Gaussian (LS)
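A numpy check of the reconstruction formula on a small hypothetical table (full-rank SVD, the product with (1 + ·) taken elementwise): the counts are recovered exactly.

```python
import numpy as np

Y = np.array([[12.0, 7.0, 5.0],
              [4.0, 9.0, 11.0]])                 # hypothetical counts

N = Y.sum()
yr = Y.sum(axis=1)
yc = Y.sum(axis=0)
expected = np.outer(yr, yc) / N                  # independence fit

E = (Y - expected) / np.sqrt(expected)           # Pearson residuals
U, d, Vt = np.linalg.svd(E, full_matrices=False)

WR_m = np.diag(np.sqrt(N / yr))                  # W_R^(-1/2)
WC_m = np.diag(np.sqrt(N / yc))                  # W_C^(-1/2)

interaction = WR_m @ U @ np.diag(d) @ Vt @ WC_m
recon = expected * (1 + interaction / np.sqrt(N))
```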


    CA as a gnm

    I y^{RC}_{rc} observations of independent r.v. Y^{RC}_{rc}

    I E[Y^{RC}_{rc}] = µ^{RC}_{rc}

    I Var(Y^{RC}_{rc}) = (1/y^∅) y^R_r y^C_c

    I data-driven mean (and link) function: µ(η^{RC}_{rc}) = (1/y^∅) y^R_r y^C_c (1 + η^{RC}_{rc})

    I rank([η^{RC}_{rc}]) = k (k small)

    I some processing of [η^{RC}_{rc}] for visualisation

    Note: a “data-driven” rather than “canonical” link function. Therefore more trouble in handling missing data!

    I am indebted to David Firth and Heather Turner for having helped me to implement CA as a Gaussian model in gnm.


    Perspectives?

    I Flexibility of gnm for implementing standard analyses and many more (possibility of introducing explanatory variables other than the row and column effects). Modeling with a menagerie of interesting forms of interactions.

    I IW-SVD (are there simple solutions to K. Ruben Gabriel and Shmuel Zamir's (1979) element-wise weighted low-rank approximation?)

    I Multiway tables: why MCA? A ghost model for MCA


    IW-SVD: Notation and problem

    I Triple (Y, W_R, W_C) where

      I Y = [y^{RC}_{rc}] is #r × #c
      I W_R metric for the rows: W_R positive definite; 1′_{#r} W_R 1_{#r} = 1
      I W_C metric for the columns: W_C positive definite; 1′_{#c} W_C 1_{#c} = 1

    I y^{RC}_{rc} observed values of independent r.v. Y^{RC}_{rc}

    I E[Y^{RC}_{rc}] = µ^{RC}_{rc} = m(η^{RC}_{rc})

    I Var(Y^{RC}_{rc}) = φ V(µ^{RC}_{rc})

    I [η^{RC}_{rc}] is a rank k + 1 matrix: η^{RC}_{rc} = β^∅ + β^R_{0,r} + β^C_{0,c} + Σ_{ℓ=1}^k β^R_{ℓ,r} β^C_{ℓ,c}

    Minimisation problem:

    Σ_{r=1}^{#r} Σ_{c=1}^{#c} (y^{RC}_{rc} − m(η^{RC}_{rc}))² / V(µ^{RC}_{rc})


    IW-SVD iteration

    I At step i, the matrix [n^{(i)RC}_{rc}] is the current estimate of [η^{RC}_{rc}]

    I Working response matrix:

      Z = [z^{RC}_{rc}] = [ (y^{RC}_{rc} − m(n^{(i)RC}_{rc})) / m′(n^{(i)RC}_{rc}) + n^{(i)RC}_{rc} ]

    I Matrix of working weights [ {m′(n^{(i)RC}_{rc})}² / V(m(n^{(i)RC}_{rc})) ], with

      W = [ ( {m′(n^{(i)RC}_{rc})}² / V(m(n^{(i)RC}_{rc})) )^{1/2} ]  and  W̃ = [ ( V(m(n^{(i)RC}_{rc})) / {m′(n^{(i)RC}_{rc})}² )^{1/2} ]

    Minimisation problem:

    Σ_{r=1}^{#r} Σ_{c=1}^{#c} [ {m′(n^{(i)RC}_{rc})}² / V(m(n^{(i)RC}_{rc})) ] ( z^{RC}_{rc} − (β^∅ + β^R_{0,r} + β^C_{0,c} + Σ_{ℓ=1}^k β^R_{ℓ,r} β^C_{ℓ,c}) )²

    This is exactly the problem raised by Gabriel and Zamir (1979). It was then solved by alternating regressions.
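A minimal sketch of such alternating regressions, reduced to a pure rank-one term a_r b_c (no additive effects) for brevity; each half-step is an exact weighted LS regression, so the weighted loss can never increase. The data and weights here are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
Z = rng.normal(size=(8, 5))                    # hypothetical working response
W = rng.uniform(0.5, 2.0, size=(8, 5))         # hypothetical working weights

def loss(a, b):
    return (W * (Z - np.outer(a, b)) ** 2).sum()

a = np.ones(8)
b = Z.mean(axis=0)
losses = [loss(a, b)]

for _ in range(100):
    # weighted regression of each row of Z on b, then of each column on a
    a = (W * Z * b).sum(axis=1) / (W * b ** 2).sum(axis=1)
    b = (W * Z * a[:, None]).sum(axis=0) / (W * a[:, None] ** 2).sum(axis=0)
    losses.append(loss(a, b))
```

Convergence is to a stationary point only; unlike the unweighted case, no single SVD delivers the global optimum.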


    IW-SVD troubles

    The Schur (elementwise) product by W̃ of a low-rank approximation of W ◦ Z does not preserve the rank.

    Exceptions are sometimes possible, e.g. the miracle in CA: Var(Y^{RC}_{rc}) = m′(η^{RC}_{rc}) = (1/y^∅) y^R_r y^C_c, since it factorizes over rows and columns.

    The gnm package offers a practical solution to Gabriel and Zamir's problem.

    Is there an algorithm based on SVD? Working on augmented data?


    A ghost model for MCA

    Is MCA, the CA of a Burt matrix, a trick? Is there a model behind it?

    I if Y^{ABC}_{abc} ∼ L(µ^{ABC}_{abc}, y^∅ (y^A_a/y^∅)(y^B_b/y^∅)(y^C_c/y^∅)) where:

      I µ^{ABC}_{abc} = y^∅ (y^A_a/y^∅)(y^B_b/y^∅)(y^C_c/y^∅) (1 + η^{ABC}_{abc})
      I η^{ABC}_{abc} has model: 1 + A + B + C + [A×B]_{k(A,B)} + [A×C]_{k(A,C)} + [B×C]_{k(B,C)}

    I then Y^{AB}_{ab} ∼ L(µ^{AB}_{ab}, y^∅ (y^A_a/y^∅)(y^B_b/y^∅)) where:

      I µ^{AB}_{ab} = y^∅ (y^A_a/y^∅)(y^B_b/y^∅) (1 + η^{AB}_{ab})
      I η^{AB}_{ab} has model: 1 + A + B + [A×B]_{k(A,B)}


    I have reached the end of my tale.

    Thank you for your attention
