quick & simple
TRANSCRIPT
-
8/6/2019 Quick & Simple
Quick & Simple Introduction to Multidimensional Scaling
Professor Tony Coxon
Hon. Professorial Research Fellow, University of Edinburgh ( [email protected] )
See www.tonycoxon.com for information on me.
See www.newmdsx.com for an information resource on MDS and NewMDSX programs/doc.
See: The User's Guide to MDS and Key Texts in MDS (readings), Heinemann 1982.
Available as pdf at 15 from newmdsx.
-
What is Multidimensional Scaling?
A student's definition: "If you are interested in how certain objects relate to each other, and if you would like to present these relationships in the form of a map, then MDS is the technique you need" (Mr Gawels, KUB)
A good start!
MDS is a family of models structured by D-T-M:
  (DATA) the empirical information on inter-relationships between a set of objects/variables is given in a set of dis/similarity data,
  (TRANSFORMATION) which are then re-scaled (according to permissible transformations for the data / level of measurement), in terms of
  (MODEL) the assumptions of the model chosen to represent the data
-
MDS Solution
to produce a SOLUTION, consisting of:
1. a CONFIGURATION, which is a
   i. pattern of points representing the objects
   ii. located in a space of a small number of dimensions (hence SSA: Smallest-Space Analysis)
   iii. where the distances between the points represent the dis/similarities between the data-points
   iv. as perfectly as possible (the imperfection/badness of fit is measured by Stress)
Low stress is desirable; no stress is perfection
-
Distances & Maps
Given a map, it's easy to calculate the (Euclidean) distances between the points:
  $d_{jk} = \sqrt{\sum_a (x_{ja} - x_{ka})^2}$
MDS operates the other way round:
Given the distances [data], find the map [configuration] which generated them,
  and MDS can do so when all but ordinal information has been jettisoned (fruit of the non-metric revolution),
  even when there are missing data and in the presence of considerable noise/error (MDS is robust).
MDS thus provides at least [exploratory] a useful and easily-assimilable graphic visualization of a complex data set (Tukey: "A picture is worth a thousand words")
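The two directions on this slide can be sketched in a few lines of Python. Note the assumptions: the recovery step uses classical (Torgerson) metric scaling, swapped in here because it is the simplest way to show "distances in, map out"; it is not the non-metric procedure the slides describe. The function names, the power-iteration eigen-solver and the toy map are all hypothetical illustrations.

```python
import math

def distance_matrix(points):
    """Map -> distances: Euclidean d_jk between every pair of points."""
    return [[math.dist(p, q) for q in points] for p in points]

def top_eigenpair(B, iters=500):
    """Largest eigenvalue and eigenvector of a symmetric matrix (power iteration)."""
    n = len(B)
    v = [float((i + 1) ** 2) for i in range(n)]  # arbitrary non-constant start vector
    for _ in range(iters):
        w = [sum(B[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    lam = sum(v[i] * B[i][j] * v[j] for i in range(n) for j in range(n))
    return lam, v

def classical_mds(D, dims=2):
    """Distances -> map: classical scaling of a full symmetric distance matrix."""
    n = len(D)
    sq = [[D[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    grand_mean = sum(row_mean) / n
    # Double-centre the squared distances to obtain scalar products.
    B = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for a in range(dims):
        lam, v = top_eigenpair(B)
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i][a] = scale * v[i]
        # Deflate so the next pass finds the next-largest eigenpair.
        B = [[B[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords

toy_map = [(-2.0, 2.0), (-1.0, -1.0), (0.0, -2.0), (1.0, -1.0), (2.0, 2.0)]
D = distance_matrix(toy_map)
recovered = classical_mds(D, dims=2)
```

The recovered configuration reproduces the distances but not necessarily the original coordinates: a map obtained from distances alone is only determined up to translation, rotation and reflection.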
-
What is like MDS?
Related and Special-case Models:
Metric Scalar Products Models:
  *PRINCIPAL COMPONENTS ANALYSIS
  FACTOR ANALYSIS (+ communalities)
Metric and Non-Metric Ultrametric Distance, Discrete models:
  *Hierarchical Clustering
  *Partition Clustering (CONPAR)
  Additive Clustering (2- and 3-way)
Metric Chi-squared Distance Model for 2W2M and 3W data / Tables:
  *Simple (2W2M) and Multiple (3W) Correspondence Analysis
BECAUSE OF NON-METRIC (MONOTONE) REGRESSION, MDS ALSO OFFERS ORDINAL EQUIVALENTS OF:
  *ANOVA, other simple composition models, *UNICON
(All models with asterisk * exist as programs within NewMDSX)
-
How does MDS differ from other Multivariate Methods?
Compared to other multivariate methods, MDS models are usually:
  distribution-free (though MLE models do exist: Ramsay's MULTISCALE),
  conservative (non-metric) in the demands they make on the structure of the data,
  relatively unaffected by non-systematic missing data,
  usable with a very wide variety of types of data:
    direct data (pair comparisons, ratings, rankings, triads, sortings),
    derived data (profiles, co-occurrence matrices, textual data, aggregated data),
    measures of association/correlation etc. derived from simpler data, and
    tables of data,
  open to a range of transformations: monotonic (ordinal), linear/metric (interval), but also log-interval, power, smoothness, even maximum variance non-dimensional scaling (Shepard)
-
How does MDS differ from other Multivariate Methods (2)?
Compared to other multivariate methods, MDS models also offer:
  a range of models: chiefly distance (Euclidean, but also City-block), factor/vector (scalar-products), simple composition (additive).
  Also there are hierarchies of models:
    Similarity models: 2W1M METRIC, 3W2M INDSCAL, IDIOSCAL (honest!)
    Preference models: Vector - distance - weighted distance - rotated, weighted (PREFMAP)
    Procrustes rotation for putting configurations into maximum conformity, and then increasingly complex transformations: PINDIS
  solutions that are visually assimilable & readily interpretable
  structure that is not limited to dimensional information: also other simple structures (horseshoes, radex/circumplex, clusters, directions).
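As a rough illustration of "putting configurations into maximum conformity", here is a minimal sketch of a rigid Procrustes fit in two dimensions. It assumes translation and rotation only (no reflection, no rescaling), uses the closed-form angle available in 2-D, and is not the PINDIS procedure; the function names and the two small configurations are hypothetical.

```python
import math

def centroid(points):
    """Mean x and mean y of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def procrustes_rotate_2d(source, target):
    """Translate and rotate `source` to match `target` in the least-squares sense."""
    sx, sy = centroid(source)
    tx, ty = centroid(target)
    a = [(x - sx, y - sy) for x, y in source]
    b = [(x - tx, y - ty) for x, y in target]
    # The best angle maximises the sum of scalar products b . R(theta) a.
    cross = sum(ax * by - ay * bx for (ax, ay), (bx, by) in zip(a, b))
    dot = sum(ax * bx + ay * by for (ax, ay), (bx, by) in zip(a, b))
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Rotate the centred source and move it onto the target's centroid.
    fitted = [(c * ax - s * ay + tx, s * ax + c * ay + ty) for ax, ay in a]
    return fitted, theta

config_a = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0)]
config_b = [(5.0, 5.0), (5.0, 7.0), (4.0, 7.0)]  # config_a turned a quarter-turn and shifted
fitted, theta = procrustes_rotate_2d(config_a, config_b)
```

Because an MDS configuration is arbitrary in orientation, a fit of this kind is what makes two solutions for the same objects directly comparable.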
-
Weaknesses in MDS
There ARE any??!
  Relative ignorance of the sampling properties of stress
  proneness to local minima solutions (but less so, and interactive programs like PERMAP allow thousands of runs to check)
  a few forms of data/models are prone to degeneracies (especially MD Unfolding, but see new PREFSCAL in SPSS)
  difficulty in representing the asymmetry of causal models, though external analysis is very akin to dependent-independent modelling,
  there are convergences with GLM in hybrid models such as CLASCAL (INDSCAL with parameterization of latent classes)
-
CHARACTERIZATION OF BASIC MDS & TERMINOLOGY
Structure of MDS specifiable in terms of D-T-M
DATA (specifies input data shape and content)
DATA MATRIX INPUT:
  WAY: dimensionality of array (2, 3, 4 ...)
  MODALITY: No. of distinct sets (to be represented) (1, 2, 3 ...)
  NB: Modality < or = Way
Common examples:
  2W1M: basic models (LTM, UTM, FSM)
  2W2M: rectangular, joint (conditional) mapping
  3W2M: (stack of 2W1M) Individual Differences Scaling
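A small sketch may help fix the way/modality terms. It assumes, since the slide does not expand the abbreviations, that LTM means a lower-triangular matrix and FSM a full symmetric matrix, and it represents a 3W2M data set as a plain list of 2W1M matrices. The function name and the numbers are hypothetical.

```python
def lower_triangle_to_full(rows):
    """Expand a lower triangle (no diagonal) into a full symmetric 2-way, 1-mode matrix.

    rows[0] holds the single entry for objects (1, 0), rows[1] the entries for
    (2, 0) and (2, 1), and so on.
    """
    n = len(rows) + 1
    full = [[0.0] * n for _ in range(n)]
    for i, row in enumerate(rows, start=1):
        if len(row) != i:
            raise ValueError(f"row {i} should have {i} entries, got {len(row)}")
        for j, value in enumerate(row):
            full[i][j] = full[j][i] = float(value)
    return full

# 2W1M: one objects-by-objects matrix (two ways, one set of objects).
ltm = [[3], [5, 4]]
fsm = lower_triangle_to_full(ltm)

# 3W2M: a stack of 2W1M matrices, one per individual (three ways, two sets:
# objects and individuals).
stack = [fsm, lower_triangle_to_full([[2], [6, 1]])]
ways = 3       # individual x object x object
modality = 2   # objects, individuals
```

The stack illustrates the "Modality < or = Way" rule: the object set is counted once as a mode even though it indexes two of the three ways.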
-
CHARACTERIZATION OF BASIC MDS (2)
TRANSFORMATION (form or type of rescaling performed on data)
o Non-Metric / Ordinal: H = M(d)
  Monotonic Increasing (sims) or Decreasing (dissims)
  Order/inequality
    o Strong / Guttman: (j,k) > (l,m) -> d(j,k) > d(l,m)
    o Weak / Kruskal: (j,k) > (l,m) -> d(j,k) ≥ d(l,m)
  Equality / ties
    o Primary: (j,k) = (l,m) -> d(j,k) = OR ≠ d(l,m)
    o Secondary: (j,k) = (l,m) -> d(j,k) = d(l,m)
o Metric / Linear: H = L(d)
  H = a + b(d)
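The weak (Kruskal) monotonicity condition is usually enforced by monotone regression. The sketch below uses the pool-adjacent-violators idea and assumes the distances have already been sorted into the rank order of the data, with no ties in the data; the function name and the numbers are hypothetical.

```python
def monotone_regression(distances):
    """Least-squares weakly increasing fit to `distances` (pool adjacent violators).

    `distances` must already be listed in ascending order of the corresponding
    data dissimilarities.
    """
    blocks = []  # each block is [sum of values, number of values]
    for d in distances:
        blocks.append([d, 1])
        # Merge backwards while a block mean exceeds the mean of the block after it.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted

# The second and third distances are out of order, so they are pooled to their mean.
fitted = monotone_regression([1.0, 3.0, 2.0, 4.0])
```

The fitted values never decrease, but adjacent values may be equal, which is exactly what the weak condition permits and the strong (Guttman) condition forbids.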
-
CHARACTERIZATION OF BASIC MDS (3)
MODEL: Euclidean Distance
  $d_{jk} = \sqrt{\sum_a (x_{ja} - x_{ka})^2}$
where x(i,a) is the co-ordinate of point i on dimension a in the solution configuration X of low dimension
The basic model is Euclidean distance, but other Minkowski metrics are available, including: City Block Model
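The Minkowski family can be written as one function with an exponent p, where p = 2 gives Euclidean distance and p = 1 gives the City Block model. This is a minimal sketch; the function and parameter names are hypothetical.

```python
def minkowski_distance(x_j, x_k, p=2.0):
    """Minkowski distance between two points given as coordinate sequences."""
    if p < 1:
        raise ValueError("p must be at least 1 for a proper metric")
    return sum(abs(a - b) ** p for a, b in zip(x_j, x_k)) ** (1.0 / p)

point_j = (0.0, 0.0)
point_k = (3.0, 4.0)
euclidean = minkowski_distance(point_j, point_k, p=2)   # straight-line distance
city_block = minkowski_distance(point_j, point_k, p=1)  # sum of absolute differences
```

City Block distance is never smaller than Euclidean distance for the same pair of points, since it adds the coordinate differences instead of combining them along the diagonal.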
-
(Badness of) FIT: Stress
DEFINITIONS OF STRESS
  Raw Stress: $\sum_{j,k} (d_{jk} - \hat{d}_{jk})^2$ (sum of squared residuals from monotone regression)
  Normalising Factors:
    NF1: $\sum_{j,k} d_{jk}^2$ (sum of squared distances)
    NF2: $\sum_{j,k} (d_{jk} - \bar{d})^2$ (sum of squared deviations from mean distance)
STRESS FORMULAE
  $S_1 = \sqrt{\text{rawstress} / NF_1}$
  $S_2 = \sqrt{\text{rawstress} / NF_2}$
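The stress definitions on this slide can be computed directly once the configuration distances and their monotone-regression fitted values are known. The sketch below assumes Kruskal's usual convention of taking the square root of raw stress over the normalising factor; the function name and the numbers are hypothetical.

```python
import math

def stress(distances, fitted):
    """Raw stress, S1 and S2 for matching lists of distances and fitted values."""
    raw = sum((d - f) ** 2 for d, f in zip(distances, fitted))
    nf1 = sum(d ** 2 for d in distances)                 # sum of squared distances
    mean_d = sum(distances) / len(distances)
    nf2 = sum((d - mean_d) ** 2 for d in distances)      # squared deviations from the mean
    # NF2 is zero when all distances are equal, in which case S2 is undefined.
    return {
        "raw": raw,
        "S1": math.sqrt(raw / nf1),
        "S2": math.sqrt(raw / nf2),
    }

distances = [1.0, 3.0, 2.0, 4.0]   # configuration distances, in data rank order
fitted = [1.0, 2.5, 2.5, 4.0]      # their monotone-regression fitted values
result = stress(distances, fitted)
```

S2 is always at least as large as S1 for the same solution, because NF2 can never exceed NF1.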