A SAS/IML PROGRAM FOR GENERALISED PROCRUSTES ANALYSIS

Pascal SCHLICH, MINISTRY DE L'AGRICULTURE

ABSTRACT

GPA is a fairly well-known multivariate statistical method. GENSTAT software includes a GPA macro, but SAS does not. The aim of this communication is to announce a GPA program written in SAS/IML.

GPA is a three-way data analysis: variables have been recorded on the same n samples in k different situations. Each situation defines a configuration of n points in a multidimensional space. The purpose of GPA is to match the k configurations to a common consensus configuration by translation, scale change and iterative rotation/reflection. When the transformed configurations are as close as possible, their mean defines the consensus.

The output of the program allows interpretation in several ways. Procrustes statistics quantify the agreement of each configuration or sample with the consensus. Graphical representations of the consensus are obtained by principal component analysis. The separation of samples on the principal plot is explained by correlations between the principal components and all of the variables. Relations between configurations are studied by principal co-ordinate analyses.

An application of GPA to Free-choice Profiling (FCP) of 6 strawberry jam samples by 15 assessors (configurations) is detailed. FCP is a method of sensory analysis in which each assessor chooses and scores his own attributes to describe the samples. Taking these vocabulary differences into account is precisely the goal of Procrustes rotations.

GPA can be applied generally each time individuals are described by several sets of variables.

INTRODUCTION

In Greek mythology, Procrustes was a highwayman supposed to have made all his victims fit his bed, cruelly stretching those who were too small and cutting down to size those who were too tall. HURLEY and CATTELL (1962) gave the code name Procrustes to their program to refer to its ability to fit almost any data to any other "for better or worse". As described by HARMAN (1976), Procrustes is the common term used to refer to any forced transformation in factor analysis.

In this paper, Procrustes analysis means the matching of two configurations of n points in a p-dimensional space by translation, scale change and rotation/reflection. Generalised Procrustes Analysis (GPA), introduced by KRISTOF and WINGERSKY (1971) and popularized by GOWER (1975), makes it possible to match iteratively, in a Procrustes sense, k configurations of n points to their mean, which defines a consensus configuration.

GPA is a fairly well-known multivariate statistical method. For instance, the GENSTAT statistical software includes a GPA macro, but SAS does not. The aim of this communication is to announce a GPA program written in SAS/IML.

The first part of this communication gives the broad outlines of GPA. The second one, which is an application of GPA to a sensory analysis, shows and interprets the computer output obtained from these real data.

GPA METHOD

Assume n samples have been described by k sets of p_i variables (i = 1, ..., k). Each set of variables is arranged into an n × p_i data matrix called X_i. The rows of X_i denote the samples and the columns denote the variables. The samples are the same in each X_i, but the variables may differ. The samples can be seen geometrically as k configurations of n points, each in a p_i-dimensional space. If p = max_i(p_i), then p − p_i zero columns are appended to X_i, so that the k configurations can be viewed in the same p-dimensional space. In this common representation, each configuration has its own axis meanings, unless the variables are identical in each X_i.
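The zero-padding step can be illustrated with a minimal sketch. It is written in plain Python rather than SAS/IML, the helper name `pad_to_common_width` is hypothetical, and the two small matrices are made-up data, not the jam scores.

```python
def pad_to_common_width(configs):
    """Append zero columns so that every n x p_i matrix becomes n x p, with p = max(p_i)."""
    p = max(len(rows[0]) for rows in configs)
    return [[list(row) + [0.0] * (p - len(row)) for row in rows] for rows in configs]


# Two hypothetical assessors scoring the same n = 3 samples.
x1 = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]  # p_1 = 3 attributes
x2 = [[0.5], [1.5], [2.5]]                                # p_2 = 1 attribute

padded = pad_to_common_width([x1, x2])
for name, config in zip(("X1", "X2"), padded):
    print(name, config)
```

The padded columns carry no information of their own; they only give the later rotations room to move a low-dimensional configuration inside the common p-dimensional space.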

A usual way to describe one X_i data matrix is to perform a principal component analysis (PCA) (MORRISON, 1976) using the PRINCOMP or FACTOR procedures of SAS/STAT. With several X_i data matrices, one PCA per matrix does not make it easy to define a consensus about sample differences. If and only if the variables are identical in each X_i, it is possible to compute the mean data matrix and then perform a single PCA. But if large differences are found between the means or scales of the configurations, the average configuration may not be a good consensus.

GPA first translates the configurations to a common origin, and applies contractions or dilatations to give a common dispersion to each configuration. These first two stages are in the same spirit as centering and autoscaling the columns of a data matrix in PCA in order to describe correlations rather than covariances. The third stage, which is the most characteristic of Procrustes analysis, consists in rotating the configurations to fit a target configuration defined as the mean configuration. Figure 1 illustrates what the GPA transformations do in the case of two configurations. With more than two configurations, the algorithm (GOWER, 1975) needs to be iterative, as described in figure 2. The sum of squared distances between corresponding samples in the transformed configurations, called the Procrustes statistic and denoted by s, s* and s** in figure 2, is minimized by this algorithm. When iteration is complete, the mean of the transformed configurations defines the consensus, denoted by C in figure 2.
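The first two stages and the Procrustes statistic can be sketched as follows. This is plain Python, not the author's SAS/IML code. As a simplifying assumption, each configuration is brought to a total sum of squares of 1, whereas the algorithm of figure 2 uses a common factor followed by iterative per-configuration factors. All function names and data are illustrative.

```python
def centre_columns(rows):
    """Translate a configuration so that every column has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def scale_to_unit_dispersion(rows):
    """Dilate or contract a centred configuration so its total sum of squares is 1."""
    total = sum(v * v for row in rows for v in row)
    factor = total ** -0.5
    return [[factor * v for v in row] for row in rows]


def procrustes_statistic(a, b):
    """Sum of squared distances between corresponding samples of two configurations."""
    return sum((u - v) ** 2 for ra, rb in zip(a, b) for u, v in zip(ra, rb))


# Hypothetical data: b is a shifted and dilated copy of a (no rotation involved).
a = [[0.0, 0.0], [2.0, 0.0], [0.0, 1.0]]
b = [[10.0 + 3.0 * x, -4.0 + 3.0 * y] for x, y in a]

before = procrustes_statistic(a, b)
a_std = scale_to_unit_dispersion(centre_columns(a))
b_std = scale_to_unit_dispersion(centre_columns(b))
after = procrustes_statistic(a_std, b_std)
print(before, after)
```

Because the two made-up configurations differ only by a shift and a dilatation, translation and scaling alone bring the statistic down to rounding error. With real data a rotation/reflection stage is still needed.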

The sum of squared distances between corresponding samples in the initial configurations is called the total sum of squares. It can be seen as the sum of squares for translation, scaling and rotation/reflection, plus a residual which is the Procrustes statistic defined above. LANGRON and COLLINS (1985) derived an asymptotic Procrustes analysis of variance (PANOVA) from this decomposition. PANOVA shows which transformations are the most important.

[Figure 1. GPA TRANSFORMATIONS. Simulated example for k = 2 configurations of n = 3 samples, symbolised by geometrical shapes, in a p_1 = p_2 = 2-dimensional space. First panel: initial configurations.]

A GPA Algorithm (J.C. GOWER, Psychometrika, 1975)

1. Centre each column of each X_i. Scale each X_i by √λ, where λ = k / Σ_i tr(X_i X_i').

2. Set C = X_1. For i = 2 to k do: rotate X_i to C and take the mean of X_1, ..., X_i as new C. Set s = k.(1 − tr(CC')). For i = 1 to k do: r_i = 1.

3. For i = 1 to k do: rotate r_i X_i to C giving X*_i = r_i X_i H_i. Let C* be the mean of X*_1, ..., X*_k. Set s* = s − k.tr(C*C*' − CC').

4. If scaling is not required, set s** = s*, set C** = C* and go to step 6.

5. For i = 1 to k do: r*_i / r_i = tr(X*_i C*') / (tr(X*_i X*_i').tr(C*C*')). Set X**_i = (r*_i / r_i) X*_i. Set r_i = r*_i. Let C** be the mean of X**_1, ..., X**_k. Set s** = s − k.tr(C**C**' − CC').

6. If s − s** > 0.0001 then set s = s**, set C = C** and go to step 3, else go to the next step.

7. Iteration is complete. Calculate and print the Procrustes analysis of variance.

8. Perform a principal coordinate analysis (PCO) of the matrices, considered as k samples, before each stage of GPA. Dissimilarity matrices for these analyses are given by the covariance matrices of the transformed configurations considered as n.p vectors.

9. Perform a principal component analysis (PCA) of the consensus C. Refer each configuration to these principal axes. Calculate correlations between variables and principal components. Calculate rotation matrices of variables on principal components.

10. Computation is complete. To return to the original units before the scaling in step 1, the configurations obtained must be divided by √λ and the sums of squares by λ.

Figure 2
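The iterative part of the algorithm of figure 2 can be sketched compactly. The sketch below is in Python with NumPy, not SAS/IML, and is not the author's program. It covers the centring, the common scaling, the initial consensus and the rotation loop, and deliberately leaves out the optional per-configuration scaling. The rotation uses the standard singular value decomposition solution of the orthogonal Procrustes problem, which is an assumption since the paper does not say how its rotations are computed. The names `rotate_to` and `gpa_without_scaling` are illustrative.

```python
import numpy as np


def rotate_to(X, C):
    """Return X @ H, with H the orthogonal matrix minimising ||X H - C|| (rotation/reflection)."""
    U, _, Vt = np.linalg.svd(X.T @ C)
    return X @ (U @ Vt)


def gpa_without_scaling(configs, tol=1e-4, max_iter=100):
    k = len(configs)
    # Centre the columns, then apply a common factor so the total sum of squares equals k.
    X = [np.asarray(Xi, dtype=float) for Xi in configs]
    X = [Xi - Xi.mean(axis=0) for Xi in X]
    lam = k / sum(np.trace(Xi @ Xi.T) for Xi in X)
    X = [np.sqrt(lam) * Xi for Xi in X]
    # Initial consensus: fit each configuration in turn to the running mean.
    C = X[0]
    for i in range(1, k):
        X[i] = rotate_to(X[i], C)
        C = np.mean(X[: i + 1], axis=0)
    s = k * (1.0 - np.trace(C @ C.T))
    # Rotate every configuration to the consensus until the statistic stops decreasing.
    for _ in range(max_iter):
        X = [rotate_to(Xi, C) for Xi in X]
        C_new = np.mean(X, axis=0)
        s_new = s - k * np.trace(C_new @ C_new.T - C @ C.T)
        converged = (s - s_new) <= tol
        s, C = s_new, C_new
        if converged:
            break
    return X, C, s


def _rotation(theta):
    return np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])


# Made-up check: three rotated and shifted copies of one configuration.
base = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 1.0], [3.0, 3.0]])
copies = [base, base @ _rotation(0.7) + 5.0, base @ _rotation(-1.2) - 2.0]
fitted, consensus, s_final = gpa_without_scaling(copies)
print(s_final)
```

On exact copies the residual statistic falls to rounding error and the fitted configurations coincide with the consensus. With real assessor data the residual stays positive and is the quantity reported in the Procrustes statistics.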

To detect possible outlier configurations, groups of configurations, or configurations badly represented by the consensus, a principal co-ordinate analysis (PCO) (GOWER, 1966) of the configurations is performed. The dissimilarity matrix analyzed by PCO is the covariance matrix of the k configurations considered as n.p vectors (LANGRON, 1981). A first PCO is computed with the initial configurations, then one PCO is computed after each stage of GPA. It is convenient to show the first and the last PCO on the same plot (ARNOLD and WILLIAMS, 1986), in order to appreciate the cohesion between configurations brought about by GPA.

Finally, a PCA of the consensus is performed. In addition to the n consensus samples, the n.k transformed samples are located on the principal axes. Each principal component is explained using either its correlations with the Σ_i p_i initial variables, or the rotation coefficients of the variables on it.
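The final stage can be sketched with NumPy. This is an illustration, not the author's SAS/IML code. It assumes an unstandardised PCA computed by singular value decomposition of the centred consensus. The function names and the small consensus matrix are made up.

```python
import numpy as np


def consensus_pca(C):
    """PCA of a consensus matrix C (n x p): sample scores, principal axes, % variation."""
    Cc = C - C.mean(axis=0)
    U, sing, Vt = np.linalg.svd(Cc, full_matrices=False)
    scores = U * sing                  # coordinates of the n consensus samples
    axes = Vt.T                        # principal axes as columns
    explained = 100.0 * sing ** 2 / np.sum(sing ** 2)
    return scores, axes, explained


def attribute_correlations(X, scores):
    """Correlation of every column of X with every principal component."""
    Xc = X - X.mean(axis=0)
    Sc = scores - scores.mean(axis=0)
    norms = np.outer(np.linalg.norm(Xc, axis=0), np.linalg.norm(Sc, axis=0))
    return (Xc.T @ Sc) / norms


# Hypothetical consensus of n = 6 samples in p = 2 dimensions.
C = np.array([[2.0, 1.0], [-1.0, 0.5], [0.0, -2.0], [1.5, 1.5], [-2.5, -1.0], [0.0, 0.0]])
scores, axes, explained = consensus_pca(C)
corr = attribute_correlations(C, scores)
print(explained)
```

A transformed configuration X_i is referred to the same axes by computing `X_i @ axes`, which is how individual assessors can be located on the consensus plot.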

The program presented here is composed of 331 IML instructions, but will be optimized soon. It was developed in IML Version 6 for personal computers, and recently adapted to IML Version 5 for mainframes.

APPLICATION OF GPA IN SENSORY ANALYSIS

Food scientists often use the profile method, which consists in asking assessors to score samples for attributes or descriptive terms. But a few problems appear when a human is used as a measuring instrument. For instance, the understanding of attributes may differ from one assessor to another, or the given attributes may not be the most appropriate to describe the differences between samples detected by some assessors.

WILLIAMS and LANGRON (1984) described Free-choice Profiling (FCP) in order to avoid these problems. Each assessor is asked to establish his own list of attributes, which must mean something to him, but not necessarily to anyone else.

The foundation of FCP is the assumption that assessors perceive the same differences between samples, but describe them with different vocabularies. Thus it is admissible to translate, dilate and rotate the configurations defined by the assessors, as these transformations preserve the ratios of distances.
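That these transformations preserve ratios of distances can be checked numerically on a toy example. The sketch is in plain Python, restricted to two dimensions for brevity, and the points, shift, scale and angle are arbitrary made-up values.

```python
import math


def transform(points, shift, scale, angle):
    """Rotate by `angle`, dilate by `scale`, then translate by `shift` (2-D points)."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (scale * (c * x - s * y) + shift[0], scale * (s * x + c * y) + shift[1])
        for x, y in points
    ]


def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])


points = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]
moved = transform(points, shift=(5.0, -2.0), scale=2.5, angle=0.8)

ratio_before = dist(points[0], points[1]) / dist(points[1], points[2])
ratio_after = dist(moved[0], moved[1]) / dist(moved[1], moved[2])
print(ratio_before, ratio_after)
```

Every distance is multiplied by the same scale factor, so any ratio of two distances is unchanged. This is why the relative differences between samples perceived by an assessor survive the GPA transformations.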

The Procrustes rotations establish a link between the different vocabularies, and finally each principal component of the consensus can be referred to each vocabulary through correlations.

The real data submitted to the program consist of the evaluation of 6 strawberry jams (n = 6 samples) by 15 assessors (k = 15 configurations) for their self-chosen attributes (p_i variables, 1 ≤ p_i ≤ 8). The jams were made with different sugars, but it is beyond the scope of this paper to detail the material and the results of this sensory study. The data are only used here as an example to show the input and output of the GPA program.

GPA ==========================================================================
Command ===>

GENERALISED PROCRUSTES ANALYSIS - Free choice profiling
Input data set observations must be the samples
Variables must be sensory attributes arranged assessor by assessor
A character variable must give sample identifiers

DATA = strawber                          Input data set
SAMP = jam                               Name of sample identifier
m    = 15                                Number of assessors
JUDG = A B C D E F G H I J K L M N O     List of assessor identifiers
ATTR = 8 2 2 3 2 1 7 5 1 3 8 3 4 7 1     List of numbers of attributes for every assessor

OPTIONAL OUTPUT DATA SETS
OUT1 = jam_gpc                           Sample coordinates (GPC)
OUT2 = ass_pco                           Assessors coordinates after each stage (PCO)
OUT3 = att_cor                           Correlation between original attributes and GPC
OUT4 =                                   Rotation of original attributes on GPC
==============================================================================

Figure 3. Input screen for the example. Answers of the user are underlined.

Figure 3 is the input screen displayed by a %WINDOW statement in the program. With the comments given on this screen, any user should be able to understand how to structure his data to run the GPA program.
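The layout expected by the input screen (attributes arranged assessor by assessor, with the ATTR list giving the number of columns per assessor) can be illustrated with a short sketch. It is plain Python, not SAS/IML. The JUDG and ATTR values are those shown in figure 3, while the helper `split_by_assessor` and the placeholder row of scores are hypothetical.

```python
from itertools import accumulate

JUDG = "A B C D E F G H I J K L M N O".split()
ATTR = [8, 2, 2, 3, 2, 1, 7, 5, 1, 3, 8, 3, 4, 7, 1]


def split_by_assessor(row, counts):
    """Cut one observation (all attributes of one sample) into per-assessor blocks."""
    ends = list(accumulate(counts))
    starts = [0] + ends[:-1]
    return [row[a:b] for a, b in zip(starts, ends)]


# Placeholder scores for one sample: one value per attribute column.
row = list(range(sum(ATTR)))
blocks = dict(zip(JUDG, split_by_assessor(row, ATTR)))
print(len(row), {judge: len(block) for judge, block in blocks.items()})
```

Each block is one row of the corresponding X_i. Stacking the blocks of the six samples gives the fifteen configurations that are then zero-padded to the largest attribute count.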

Figures 4a, 4b and 4c are the computer output given by the GPA program for the example data. The most important comments and the meanings of the codes have been superimposed on this computer output in a script font.

The GPA program has no graphic output, as the SAS/GRAPH software includes many capabilities.

Figures 5a, 5b and 5c have been obtained using the GPLOT procedure and the ANNOTATE option.

It is noticeable on figure 5a that the assessors a, k and g, who were very different from the others, are in quite good agreement with the whole panel after GPA. Only assessors B, I and O seem to be quite different from the panel after the GPA transformations, maybe due to their low number of attributes (one or two).

The consensus appears clearly on figure 5b, and more particularly on the first axis (77 %), which locates samples 4 and 5 opposite samples 1, 6 and 3, while sample 2 has an intermediate location on this axis. The second axis (11 %) only separates sample 1 from samples 6 and 3.

Interpretation of figure 5c is not easy and shows the importance of vocabulary specificities in sensory analysis. Anyway, the negative part of the first axis seems to be related to the acid, pungent and lemon attributes, which are all unfavourable to strawberry jams.

ANNOTATED COMPUTER OUTPUT

Number of iterations: 10
Annotation: Quite usual number to obtain convergence.

Scaling factors of assessors:

A   1.2791759
B   1.1292196
C   1.1592943
D   0.9988504
E   1.011963
F   5.84617
G   0.7880535
H   1.0751748
I   0.67073
J   1.5010166
K   0.8226347
L   1.4550544
M   0.7519068
N   0.9166149
O   0.8421115

Annotation: Assessor F gave very small differences between samples, thus needs a high scaling factor.

Procrustes statistics (to minimize) before and after each stage of GPA:

           INITIAL   TRANSLATION    ROTATION     SCALING
A         36416.34     6413.0267   1921.2155   3219.6769
B        13736.623     4311.0294   3000.1134   4154.1939
C        9921.4067     4953.3211    3566.383   3895.2857
D        12938.407     6905.4711   3660.6771   3950.1537
E         36066.49     12593.318   3071.8204   3196.6017
F         10501.94     574.35444   3408.3587   3590.6101
G         46204.39     16068.485   6251.7081   3160.7264
H        32384.507     9701.6378   2866.6101   3126.2126
I          7794.79     2447.1012   5064.5933   4804.0833
J        14870.707     5215.0711   1449.7709   3101.7254
K        39182.807     15412.443   5744.1949   3048.4229
L        10071.007     3840.0318   2660.1084   3150.7644
M        33254.857     19997.913   8445.7008   2872.2058
N        25099.757     10846.529   4237.2809   3368.5456
O        8429.2733     3336.3544   4700.5768   4572.0183
TOTAL     336873.4      122616.1   60049.112   53811.887

Annotation: The configurations are initially very far from their means. After GPA, these distances are largely reduced and almost homogeneous.

Residual Procrustes statistics by sample:

SAC      12299.995
SIS      8165.8064
FGL      8538.2366
BIS       10115.52
XIS       6173.478
PFG        8518.85
TOTAL    53811.887

Annotation: Consensus is better about XIS than about SAC.

Procrustes analysis of variance (PANOVA) table:

SOURCE         DF          SS          MS           F       LEVEL
TRANSLATION   112    214257.3   1913.0116   5.4746972           0
ROTATION      392   62566.988   159.60966   0.4567743           1
SCALING        14   6237.2254    445.5161   1.2749874   0.2287648
RESIDUAL      154   53811.887   349.42784
TOTAL         672    336873.4

Annotation: Rotation is not very important compared to translation. See S.P. Langron and A.J. Collins, J.R. Statist. Soc. B, 1985 for details on PANOVA theory. Anyway, total SS is composed of 65% from translation, 18% from rotation, 2% from scaling and 15% from residual.

Figure 4a
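The arithmetic behind the PANOVA table can be retraced in a few lines. The sketch below is plain Python and is not part of the author's program. It takes the degrees of freedom and sums of squares printed in figure 4a as given and recomputes the mean squares, the F ratios against the residual mean square, and the share of each source in the total sum of squares. The significance levels are not recomputed.

```python
# Degrees of freedom and sums of squares copied from the PANOVA table of figure 4a.
panova = {
    "TRANSLATION": (112, 214257.3),
    "ROTATION": (392, 62566.988),
    "SCALING": (14, 6237.2254),
    "RESIDUAL": (154, 53811.887),
}

total_df = sum(df for df, _ in panova.values())
total_ss = sum(ss for _, ss in panova.values())

mean_square = {src: ss / df for src, (df, ss) in panova.items()}
f_ratio = {
    src: mean_square[src] / mean_square["RESIDUAL"]
    for src in panova
    if src != "RESIDUAL"
}
share = {src: 100.0 * ss / total_ss for src, (_, ss) in panova.items()}

for src in panova:
    print(f"{src:12s} MS={mean_square[src]:10.4f} share={share[src]:5.1f}%")
```

The recomputed mean squares and F ratios agree with the printed table. The shares come out at about 64%, 19%, 2% and 16%, close to the rounded percentages quoted in the annotation.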

PCO of assessors after each stage of GPA:

Variation explained by dimensions (%):

                   PCO1        PCO2        PCO3        PCO4        PCO5        PCO6
INITIAL       24.651132   21.715959   16.956643   11.175066    8.533753   4.5858204
TRANSLATION   30.011192   16.836262   12.708977    9.444043   7.8388453    7.287926
ROTATION      37.691186   15.325926   13.154044   10.205335   6.9733784    4.850638
SCALING       25.876394   19.369966   14.174691   11.358044   9.8814006   6.1371259

Annotation: First plane of each PCO carries about 50% of differences between the assessors.

Coordinates of assessors:

            PCO1         PCO2         PCO3         PCO4         PCO5         PCO6
IA     -43.70515    -155.4087    -65.88697    -10.35936    -1.715786    -32.01655
IB      26.95768    66.047628    26.219952    11.764399    -14.94372    61.042542
...
IN     -80.89212    29.535412    2.1147946    103.07442    -48.15436    2.4300611
IO     50.475741    16.762845    42.195526    -0.151209    36.316638    11.643389
TA     7.0855552    39.779718    -20.06328    -22.86886    28.745078    5.4690434
TB     30.864615    11.874507    -3.720645    -8.661169    -33.60575    19.729797
...
TN     32.453219    5.1927444    72.712674    -13.79231     -20.3595    -58.64504
TO     -7.127786    12.344057    -9.007459    -16.60081    -38.48962    12.861124
RA     -0.622079    32.369433    9.2001449    -4.685353    -8.514518    11.232673
RB     30.666672    -26.31984    0.2579978    -14.26554    -26.49597    -2.875042
...
RN     -21.43246    6.3098449    20.785029    -42.4?704    28.801663    8.0504311
RO      46.82707    -15.49977    -18.73205    -32.77083    -6.681261    13.173352
SA     -26.24148    19.654225    35.807723     -6.06717    -1.096313    1.3401133
SB   [illegible]    19.464426    -4.968019    -20.09792    -18.44648    -4.274713
...
SN      -3.00967    18.365965    6.2832283     -14.9706    41.174821    25.488629
SO     52.621248    -20.94988     9.012825    -25.14263    -3.154922    11.174715

Annotation: Tables are cut to show only results of assessors A, B, N and O. First letter of assessor identifiers denotes the stage of the analysis, following letters denote the assessors. This table is stored in 'ass_pco' data set.

PCA of consensus configuration:

Variation explained by dimensions of individual and consensus PCA (%):

         GPC1        GPC2        GPC3        GPC4        GPC5
A   66.679404    29.90766   2.7336591   0.5272911   0.1519861
B   77.864503   22.135497   9.579E-15   9.413E-16   -1.73E-17
C   99.263743   0.7362566   6.588E-15   4.794E-15   9.031E-16
D   75.337828   23.370458   1.2917138   8.931E-16   -6.41E-16
E   67.842288   32.157712   3.658E-15   2.434E-15   4.612E-16
F         100   1.171E-15   4.608E-15   3.831E-15   -9.65E-16
G   77.811774   14.982047   5.4996584   1.5481041   0.1584165
H   77.915897   18.489711   3.4279184   0.1612059   0.0052674
I         100   1.482E-14   1.485E-15   6.471E-16   -1.26E-15
J   96.720179   2.8624107   0.4174106   3.602E-15   1.779E-15
K   49.532966   18.083898    16.89005   11.297887   4.1951995
L   83.914547   9.8759313    6.209522   1.085E-14   1.171E-15
M   77.842552   21.222954   0.5296829   0.4048104   1.803E-15
N   65.587271    21.21533   12.550197   0.5110385   0.1361642
O         100   4.214E-15   9.451E-16   1.159E-17   -2.54E-11
*   76.609031   11.242371   7.5959023   2.9919233   1.5607728

Annotation: * denotes the consensus. The first two GPC are sufficient to describe the consensus.

Figure 4b

Location of assessor and consensus configurations in the PCA of consensus:

            GPC1        GPC2        GPC3        GPC4        GPC5
SACA   32.891576    8.544264   4.5703624   0.0199406   -0.250363
SISA   1.1904732     -4.3762   -19.66076   5.8352874   5.6202781
FGLA     20.6013   -10.04471   -31.39575    17.17078   7.7069336
BISA   -63.27012   8.2443192   28.885517   -4.611289   5.3660193
XISA   -17.89972   -4.062817   -9.300169   -1.493085   -8.010458
PFGA   26.486489   1.6951469   26.900795   -16.92163   -10.43241
SACB   -15.49931   10.209765    5.699002   1.0748087   2.0580869
SISB   -17.53777   5.8458932   -17.47941   -4.630796   -2.858711
FGLB   32.198832   -16.52187   7.8185514    2.570693   -0.013802
BISB   -14.29476   12.788416   19.395335   4.4463023   4.9634678
XISB   -17.76941   5.3499987   -20.11332    -5.27916   -3.417438
PFGB   32.902417    -17.6722   4.6798381   1.8181517   -0.731602
...
SACN   27.585698   17.375599   11.702926   -5.596397   -0.488523
SISN   6.5394165   4.1687739   -17.15346   -17.72199   -1.741985
FGLN   -9.062346   -34.17709   -3.084387   34.445655   5.7865787
BISN   -34.01505   4.1798869   18.666213   9.5086279   3.9330131
XISN   -35.48625   -0.737164   -15.12429    -2.08379   -4.785134
PFGN   44.438531   9.1899956   4.9929956   -18.55211   -2.703949
SACO   10.174498   -0.713762   6.9597855   0.2757601   -0.063925
SISO   -26.24558   1.8411824   -17.95308   -0.711336   0.1648987
FGLO   1.5030508   -0.105442   1.0281501   0.0407373   -0.009444
BISO   11.908787   -0.835426   8.1461125   0.3227646   -0.074822
XISO   -21.38957   1.5005231   -14.63137   -0.579723   0.1343888
PFGO   24.048813   -1.687075   16.450402   0.6517965   -0.151097
SAC*    28.45404    18.13752     1.82642    2.315538   -0.396968
SIS*   -2.127927   0.5360821   -11.68548   -5.915315   4.4002421
FGL*   14.597859   -11.93277   -2.970367   8.4392896   1.6521043
BIS*   -37.05239   1.8396771   11.403396   0.5721559   3.2365723
XIS*   -29.03164   1.0463151   -6.517053   0.8176847   -6.277461
PFG*   25.160051   -9.626828   7.9430864   -6.229354    -2.61449

Annotation: First letters of identifiers denote the samples, following letters denote the assessors. This table is stored in 'jam_gpc' data set.

Correlations between original attributes and transformed attributes (GPC):

            GPC1        GPC2        GPC3        GPC4        GPC5
unrA   -0.651951   0.0844989   0.6372109   0.0509422   0.3989827
cooA   0.5784237   -0.053352   -0.655945   0.4298042   0.2181394
rotA   0.7884688   0.1973065   0.4083871   -0.319233    -0.26589
carA   0.5511668   -0.128972   -0.752636   0.3254691   -0.084794
droA    -0.88473   0.1130152   0.3592741   -0.057135   0.2685861
lemA   -0.651951   0.0844989   0.6372109   0.0509422   0.3989827
sweA   0.9004448   -0.257184   -0.319613   -0.048558   -0.136182
punA   -0.904491   0.1155874   0.3131543    -0.07559   0.2544716
typB    -0.11815   0.0079746    -0.85572   -0.345626   -0.366225
oveB   0.5545031   -0.781956   0.2245782   0.145007?   -0.098026
...
typN   0.6577716   0.0118537   -0.264518   -0.694342   -0.122944
cooN   -0.468107   -0.485632   -0.281087   0.6577782   0.1826345
rotN   0.2495291   -0.502931   -0.197691   0.7923678     0.13367
milN   0.8822979   0.0776543   0.1409541   -0.441629   -0.024825
musN   0.2568549   -0.548089   -0.165981   0.7513967   0.2036602
sweN   -0.753521   -0.058008   -0.625964   0.0517676   -0.185286
bitN   -0.280592   -0.671936   -0.348993   0.5893351   0.0256642
cryO   [row illegible]

Annotation: First letters of identifiers are the codes of attributes (unr=unripe, coo=cooked, rot=rotten, car=caramel, dro=drop, lem=lemon, swe=sweet, pun=pungent, typ=typical, ove=overripe, mil=milk, mus=musty, bit=bitterness, cry=crystallised), following letters denote the assessors. This table is stored in 'att_cor' data set.

Figure 4c

[Figure 5a. PRINCIPAL COORDINATES ANALYSIS OF ASSESSORS. Plot of PCO2 against PCO1. Lower case letters denote assessors before GPA (PCO1 = 25 %, PCO2 = 22 %); upper case letters denote assessors after GPA (PCO1 = 26 %, PCO2 = 19 %). Plotted from "ass_pco".]

[Figure 5b. GPA SAMPLE PLOT. Plot of GPC2 against GPC1. Numbers denote samples, letters denote assessors, * denotes consensus. Plotted from "jam_gpc" by GPLOT procedure. 1=SAC, 2=SIS, 3=FGL, 4=BIS, 5=XIS, 6=PFG.]

[Figure 5c. CORRELATIONS BETWEEN ATTRIBUTES AND GPC. Plot of GPC2 against GPC1, both axes running from -1.0 to 1.0. Points are labelled with attribute code and assessor letter. Plotted from "att_cor".]

CONCLUSION

GPA can be applied each time individuals are described by several sets of variables and a consensus about sample differences must be derived from these sets of variables. In that situation, GPA is better than either a PCA of each data matrix or a PCA of the mean data matrix.

REFERENCES

Arnold, G.M., and Williams, A.A. 1986. The use of Generalised Procrustes Techniques in Sensory Analysis. In Statistical Procedures in Food Research. Piggott, J.R. Ed. Elsevier Applied Science, London.

Gower, J.C. 1966. Some distance properties of latent root and vector methods used in multivariate analysis. Biometrika, 53, 325-338.

Gower, J.C. 1975. Generalized Procrustes analysis. Psychometrika, 40, 33-50.

Harman, H.H. 1976. Modern Factor Analysis. 3rd Edition. The University of Chicago Press, Chicago.

Hurley, J.R., and Cattell, R.B. 1962. The Procrustes Program: Producing direct rotation to test a hypothesized factor structure. Behav. Sci., 7, 258-262.

Kristof, W., and Wingersky, B. 1971. Generalization of the orthogonal Procrustes rotation procedure to more than two matrices. In Proceedings of the 79th Annual Convention, American Psychological Association, 89-90.

Langron, S.P. 1981. The statistical treatment of sensory analysis data. Ph.D. Thesis, University of Bath.

Langron, S.P., and Collins, A.J. 1985. Perturbation Theory for Generalized Procrustes Analysis. J.R. Statist. Soc. B, 47, 277-284.

Morrison, D.F. 1976. Multivariate Statistical Methods. 2nd Edition. McGraw-Hill Book Co, New York.

Williams, A.A., and Langron, S.P. 1984. The use of free-choice profiling for the evaluation of commercial ports. J. Sci. Food Agric., 35, 558-568.

Thanks are due to Mrs. S. ISSANCHOU for providing sensory data.

Address of the author is I.N.R.A., Laboratoire de Recherches sur les Arômes, 17, Rue Sully, 21034 DIJON CEDEX, FRANCE.
