quick & simple

8/6/2019 Quick & Simple


Quick & Simple Introduction to Multidimensional Scaling

Professor Tony Coxon
Hon. Professorial Research Fellow, University of Edinburgh ( [email protected] )
see www.tonycoxon.com for information on me
see www.newmdsx.com for information resource on MDS and NewMDSX programs/doc.

See: The User's Guide to MDS and Key Texts in MDS (readings), Heinemann 1982
Available as pdf at 15 from newmdsx



What is Multidimensional Scaling?

A student's definition: "If you are interested in how certain objects relate to each other, and if you would like to present these relationships in the form of a map, then MDS is the technique you need" (Mr Gawels, KUB)

A good start!

MDS is a family of models structured by D-T-M:

(DATA) the empirical information on inter-relationships between a set of objects/variables is given in a set of dis/similarity data

(TRANSFORMATION) which are then re-scaled (according to permissible transformations for the data / level of measurement), in terms of

(MODEL) the assumptions of the model chosen to represent the data



MDS Solution

... to produce a SOLUTION, consisting of:

1. a CONFIGURATION, which is a
   i. pattern of points representing the objects
   ii. located in a space of a small number of dimensions (hence SSA: Smallest-Space Analysis)
   iii. where the distances between the points represent the dis/similarities between the data-points
   iv. as perfectly as possible (the imperfection/badness of fit is measured by Stress)

Low stress is desirable; No stress is perfection



Distances & Maps

Given a map, it's easy to calculate the (Euclidean) distances between the points:

$$ d_{jk} = \sqrt{\sum_a (x_{ja} - x_{ka})^2} $$

MDS operates the other way round:

Given the distances [data], find the map [configuration] which generated them

- and MDS can do so when all but ordinal information has been jettisoned (fruit of the "non-metric revolution")
- even when there are missing data and in the presence of considerable noise/error (MDS is robust).

MDS thus provides (at least as an exploratory tool) a useful and easily-assimilable graphic visualization of a complex data set (Tukey: "A picture is worth a thousand words")
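The inverse problem above (distances in, map out) can be sketched in a few lines. This is a minimal illustration using classical (Torgerson) metric scaling, a simpler stand-in for the non-metric procedures these slides discuss, and it is not the algorithm of any NewMDSX program. The function name `classical_mds`, the power-iteration eigen-solver and the rectangle example are all assumptions made for illustration.

```python
import math


def classical_mds(dist, n_dims=2, n_iter=500):
    """Recover a configuration from a full symmetric distance matrix.

    Classical (Torgerson) scaling: double-centre the squared distances and
    take the leading eigenvectors, found here by simple power iteration.
    """
    n = len(dist)
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # Scalar-products matrix: -1/2 times the double-centred squared distances
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    coords = [[0.0] * n_dims for _ in range(n)]
    for a in range(n_dims):
        # Arbitrary non-constant starting vector, normalised to unit length
        v = [float(i + 1) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(n_iter):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to extract on this dimension
                break
            v = [x / norm for x in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        eigval = sum(v[i] * bv[i] for i in range(n))
        if eigval <= 1e-12:
            continue
        for i in range(n):
            coords[i][a] = math.sqrt(eigval) * v[i]
        # Deflate so that the next pass finds the next dimension
        for i in range(n):
            for j in range(n):
                b[i][j] -= eigval * v[i] * v[j]
    return coords


# A small "map": the corners of a 3 x 4 rectangle (illustrative data)
true_map = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0), (3.0, 4.0)]
distances = [[math.dist(p, q) for q in true_map] for p in true_map]

recovered = classical_mds(distances)
recovered_distances = [[math.dist(p, q) for q in recovered] for p in recovered]
```

The recovered points reproduce the input distances, but the map itself is only determined up to rotation, reflection and translation, so `recovered` need not equal `true_map` point for point.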



What is like MDS?

Related and Special-case Models:

Metric Scalar Products Models:
  *PRINCIPAL COMPONENTS ANALYSIS
  FACTOR ANALYSIS (+ communalities)

Metric and Non-Metric Ultrametric Distance, Discrete models:
  *Hierarchical Clustering
  *Partition Clustering (CONPAR)
  Additive Clustering (2 and 3-way)

Metric Chi-squared Distance Model for 2W2M and 3W data / Tables:
  *Simple (2W2M) and Multiple (3W) Correspondence Analysis

BECAUSE OF NON-METRIC (MONOTONE) REGRESSION, MDS ALSO OFFERS ORDINAL EQUIVALENTS OF:
  *ANOVA
  other simple composition models
  *UNICON

(All models with asterisk * exist as programs within NewMDSX)



How does MDS differ from other Multivariate Methods?

Compared to other multivariate methods, MDS models:

- are usually distribution-free (though MLE models do exist: Ramsay's MULTISCALE)
- make conservative (non-metric) demands on the structure of the data,
- are relatively unaffected by non-systematic missing data,
- can be used with a very wide variety of types of data:
    direct data (pair comparisons, ratings, rankings, triads, sortings)
    derived data (profiles, co-occurrence matrices, textual data, aggregated data)
    measures of association/correlation etc. derived from simpler data, and
    tables of data.
- allow a range of transformations: monotonic (ordinal), linear/metric (interval), but also log-interval, power, smoothness, even maximum variance non-dimensional scaling (Shepard)



How does MDS differ from other Multivariate Methods (2)?

Compared to other multivariate methods, MDS models also offer:

- a range of models: chiefly distance (Euclidean, but also City-block), factor/vector (scalar-products), simple composition (additive).
- Also there are hierarchies of models:
    Similarity models: 2W1M METRIC -> 3W2M INDSCAL -> IDIOSCAL (honest!)
    Preference models: Vector -> distance -> weighted distance -> rotated, weighted (PREFMAP)
- Procrustes rotation for putting configurations into maximum conformity, and then increasingly complex transformations: PINDIS
- the solutions are visually assimilable & readily interpretable
- the structure is not limited to dimensional information: also other simple structures (horseshoes, radex/circumplex, clusters, directions).
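The simplest step of putting two configurations "into maximum conformity" can be sketched as follows. This is a minimal illustration under stated assumptions: two dimensions only, translation and rotation only (no reflection or rescaling), and a closed-form angle rather than the PINDIS program itself. The names `centre` and `procrustes_rotate` and the example data are hypothetical.

```python
import math


def centre(config):
    """Translate a 2-D configuration so that its centroid is the origin."""
    n = len(config)
    mx = sum(x for x, _ in config) / n
    my = sum(y for _, y in config) / n
    return [(x - mx, y - my) for x, y in config], (mx, my)


def procrustes_rotate(source, target):
    """Rotate and translate `source` to match `target` in the least-squares sense."""
    src, _ = centre(source)
    tgt, (tx, ty) = centre(target)
    # Closed-form optimal rotation angle for two dimensions
    dot = sum(sx * ux + sy * uy for (sx, sy), (ux, uy) in zip(src, tgt))
    cross = sum(sx * uy - sy * ux for (sx, sy), (ux, uy) in zip(src, tgt))
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    fitted = [(c * x - s * y + tx, s * x + c * y + ty) for x, y in src]
    return fitted, theta


# Illustrative data: config_b is config_a rotated by 30 degrees and shifted
config_a = [(2.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]
angle = math.radians(30)
config_b = [(math.cos(angle) * x - math.sin(angle) * y + 5.0,
             math.sin(angle) * x + math.cos(angle) * y - 2.0)
            for x, y in config_a]

fitted, theta = procrustes_rotate(config_a, config_b)
```

Here `fitted` lands on `config_b` and `theta` comes back as the 30 degree rotation, because the second configuration differs from the first only by a rigid motion; with real solutions the fit would be approximate.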



Weaknesses in MDS

There ARE any??!

- Relative ignorance of the sampling properties of stress
- prone-ness to local minima solutions (but less so, and interactive programs like PERMAP allow thousands of runs to check)
- a few forms of data/models are prone to degeneracies (especially MD Unfolding, but see new PREFSCAL in SPSS)
- difficulty in representing the asymmetry of causal models, though external analysis is very akin to dependent-independent modelling,
- there are convergences with GLM in hybrid models such as CLASCAL (INDSCAL with parameterization of latent classes)



CHARACTERIZATION OF BASIC MDS & TERMINOLOGY

Structure of MDS specifiable in terms of D-T-M

DATA (specifies input data shape and content)

DATA MATRIX INPUT:
WAY: dimensionality of array (2, 3, 4 ...)
MODALITY: No. of distinct sets (to be represented) (1, 2, 3 ...)
NB: Modality <= Way

Common examples:
2W1M: basic models (LTM, UTM, FSM)
2W2M: rectangular, joint (conditional) mapping
3W2M: (stack of 2W1M) Individual Differences Scaling
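The three common data shapes can be illustrated with plain nested lists. This is only a sketch: reading LTM and FSM as "lower-triangular matrix" and "full symmetric matrix" is an assumption, and the helper `ltm_to_fsm` and all the numbers are made up for illustration.

```python
def ltm_to_fsm(lower_rows, diagonal=0.0):
    """Expand a lower-triangular listing (no diagonal) into a full symmetric matrix."""
    n = len(lower_rows) + 1
    full = [[diagonal] * n for _ in range(n)]
    for i, row in enumerate(lower_rows, start=1):
        if len(row) != i:
            raise ValueError("row %d of a lower triangle needs %d entries" % (i, i))
        for j, value in enumerate(row):
            full[i][j] = value
            full[j][i] = value
    return full


# 2W1M: one set of 3 objects, dissimilarities entered as a lower triangle
ltm = [[1.0],
       [2.0, 3.0]]
two_way_one_mode = ltm_to_fsm(ltm)

# 2W2M: rectangular, e.g. 2 subjects (rows) by 3 stimuli (columns)
two_way_two_mode = [[1, 3, 2],
                    [2, 1, 3]]

# 3W2M: a stack of 2W1M matrices, one per individual
three_way_two_mode = [ltm_to_fsm([[1.0], [2.0, 3.0]]),
                      ltm_to_fsm([[2.0], [1.0, 3.0]])]
```

In each case the number of ways is the nesting depth of the array, while the number of modes counts the distinct sets indexed: objects only (2W1M), subjects and stimuli (2W2M), or objects and individuals (3W2M).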



CHARACTERIZATION OF BASIC MDS (2)

TRANSFORMATION (form or type of rescaling performed on data)

- Non-Metric / Ordinal: H = M(d)
  Monotonic Increasing (dissims) or Decreasing (sims)
    Order / inequality
      Strong / Guttman: (j,k) > (l,m) -> d(j,k) > d(l,m)
      Weak / Kruskal:   (j,k) > (l,m) -> d(j,k) ≥ d(l,m)
    Equality / ties
      Primary:   (j,k) = (l,m) -> d(j,k) = OR ≠ d(l,m)
      Secondary: (j,k) = (l,m) -> d(j,k) = d(l,m)
- Metric / Linear: H = L(d)
  H = a + b(d)
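The ordinal constraints above can be checked mechanically. The sketch below counts how many pairs of data values break the chosen rule; it assumes the data are dissimilarities (so distances should rise with the data), and the function name `monotone_violations` and the example values are hypothetical.

```python
from itertools import combinations


def monotone_violations(data, dist, strong=False, secondary_ties=False):
    """Count pairs of entries whose distances break the chosen ordinal constraints.

    data[i] and dist[i] are the dissimilarity and the distance for the same
    pair of objects. strong=True uses the strong (Guttman) rule, otherwise the
    weak (Kruskal) rule; secondary_ties=True also requires tied data to have
    equal distances, otherwise ties are left unconstrained (primary approach).
    """
    violations = 0
    for p, q in combinations(range(len(data)), 2):
        # Orient the pair so that data[hi] >= data[lo]
        lo, hi = (p, q) if data[p] <= data[q] else (q, p)
        if data[hi] > data[lo]:
            if strong and not dist[hi] > dist[lo]:
                violations += 1
            elif not strong and not dist[hi] >= dist[lo]:
                violations += 1
        elif secondary_ties and dist[hi] != dist[lo]:
            violations += 1
    return violations


# Illustrative values: one tie in the data, one tie in the distances
data = [1, 2, 2, 3]
dist = [1.0, 2.0, 2.5, 2.5]

weak_primary = monotone_violations(data, dist)
strong_primary = monotone_violations(data, dist, strong=True)
weak_secondary = monotone_violations(data, dist, secondary_ties=True)
```

With these values the weak/primary combination is satisfied, while the strong rule objects to the tied distances and the secondary approach objects to the tied data receiving different distances.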



CHARACTERIZATION OF BASIC MDS (3)

MODEL: Euclidean Distance

$$ d_{jk} = \sqrt{\sum_a (x_{ja} - x_{ka})^2} $$

where x(i,a) is the co-ordinate of point i on dimension a in the solution configuration X of low dimension

The basic model is Euclidean distance, but other Minkowski metrics are available, including: City Block Model
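The Minkowski family mentioned above can be written as one function with an exponent parameter. A minimal sketch, assuming points are given as coordinate tuples; the name `minkowski_distance` and the example points are illustrative only.

```python
def minkowski_distance(point_j, point_k, p=2.0):
    """Minkowski distance between two points: p=2 is Euclidean, p=1 is city-block."""
    if p < 1:
        raise ValueError("p must be at least 1 for a metric")
    total = sum(abs(xj - xk) ** p for xj, xk in zip(point_j, point_k))
    return total ** (1.0 / p)


# Same pair of points under the two models named on the slide
euclidean = minkowski_distance((0.0, 0.0), (3.0, 4.0), p=2)
city_block = minkowski_distance((0.0, 0.0), (3.0, 4.0), p=1)
```

For this pair the Euclidean distance is the straight-line length of a 3-4-5 triangle, while the city-block distance is the sum of the two coordinate differences.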



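(Badness of) FIT: Stress

STRESS FORMULAE

DEFINITIONS OF STRESS

Raw Stress (sum of squared residuals from monotone regression):

$$ \text{rawstress} = \sum_{j,k} (d_{jk} - d^{o}_{jk})^2 $$

Normalising Factors:

NF1 (sum of squared distances):

$$ NF_1 = \sum_{j,k} d_{jk}^2 $$

NF2 (sum of squared deviations from mean distance):

$$ NF_2 = \sum_{j,k} (d_{jk} - \bar{d})^2 $$

$$ S_1 = \sqrt{\frac{\text{rawstress}}{NF_1}} \qquad S_2 = \sqrt{\frac{\text{rawstress}}{NF_2}} $$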
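The stress definitions above can be traced through a small worked example. This is a sketch under stated assumptions: the data are dissimilarities, the monotone regression is a plain pool-adjacent-violators fit (weak monotonicity, with tied data ordered by distance as in the primary approach), and the names `monotone_regression` and `stress` are hypothetical rather than taken from any MDS program.

```python
import math


def monotone_regression(data, dist):
    """Weakly monotone least-squares fit of the distances to the order of the data."""
    # Sort pairs by data value; tied data are ordered by distance (primary approach)
    order = sorted(range(len(data)), key=lambda i: (data[i], dist[i]))
    blocks = []  # each block holds [sum of distances, number of entries]
    for i in order:
        blocks.append([dist[i], 1])
        # Pool adjacent blocks while their means are out of order
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted_sorted = []
    for total, count in blocks:
        fitted_sorted.extend([total / count] * count)
    fitted = [0.0] * len(data)
    for position, i in enumerate(order):
        fitted[i] = fitted_sorted[position]
    return fitted


def stress(data, dist):
    """Return (S1, S2) for distances fitted to ordinal dissimilarity data."""
    fitted = monotone_regression(data, dist)
    raw = sum((d - f) ** 2 for d, f in zip(dist, fitted))
    mean_d = sum(dist) / len(dist)
    nf1 = sum(d * d for d in dist)
    nf2 = sum((d - mean_d) ** 2 for d in dist)  # zero if all distances are equal
    return math.sqrt(raw / nf1), math.sqrt(raw / nf2)


# Illustrative values: the second and third distances are out of order
data = [1, 2, 3, 4]
dist = [1.0, 3.0, 2.0, 4.0]

fitted = monotone_regression(data, dist)
s1, s2 = stress(data, dist)
```

The two out-of-order distances are pooled to their mean, so the raw stress here is 0.5; dividing by NF1 = 30 or NF2 = 5 before taking the square root shows why S2 is always at least as large as S1 for the same configuration.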

