quick & simple

Upload: atish-chandra

Post on 07-Apr-2018


TRANSCRIPT

  • 8/6/2019 Quick & Simple

    1/13

    Quick & Simple Introduction to Multidimensional Scaling

    Professor Tony Coxon

    Hon. Professorial Research Fellow, University of Edinburgh ( [email protected] )

    see www.tonycoxon.com for information on me

    see www.newmdsx.com for an information resource on MDS and NewMDSX programs/doc.

    See: The User's Guide to MDS and Key Texts in MDS (readings), Heinemann 1982

    Available as pdf at 15 from newmdsx

  • 8/6/2019 Quick & Simple

    2/13

    What is Multidimensional Scaling?

    A student's definition: "If you are interested in how certain objects relate to each other, and if you would like to present these relationships in the form of a map, then MDS is the technique you need" (Mr Gawels, KUB)

    A good start!

    MDS is a family of models structured by D-T-M:

    (DATA) the empirical information on inter-relationships between a set of objects/variables is given in a set of dis/similarity data

    (TRANSFORMATION) which are then re-scaled (according to the permissible transformations for the data / level of measurement), in terms of

    (MODEL) the assumptions of the model chosen to represent the data

  • 8/6/2019 Quick & Simple

    3/13

    MDS Solution

    The D-T-M specification is used to produce a SOLUTION, consisting of:

    1. a CONFIGURATION, which is a
       i. pattern of points representing the objects,
       ii. located in a space of a small number of dimensions (hence SSA: Smallest-Space Analysis),
       iii. where the distances between the points represent the dis/similarities between the data-points
       iv. as perfectly as possible (the imperfection/badness of fit is measured by Stress).

    Low stress is desirable; no stress is perfection.
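As a rough illustration of how such a configuration can be found, the sketch below moves an arbitrary starting configuration towards a set of given dissimilarities using the SMACOF (Guttman transform) update with unit weights, in plain Python. It is a metric sketch only: the data are fitted as they stand, without any monotone re-scaling. The function names `guttman_step` and `raw_stress`, the starting values and the four-point example are assumptions made for illustration; this is not the algorithm of any NewMDSX program.

```python
import math

def distances(X):
    """Euclidean distances between all pairs of points (rows) in X."""
    n = len(X)
    return [[math.sqrt(sum((a - b) ** 2 for a, b in zip(X[j], X[k])))
             for k in range(n)] for j in range(n)]

def raw_stress(delta, X):
    """Sum of squared differences between the data and the distances."""
    d = distances(X)
    n = len(X)
    return sum((delta[j][k] - d[j][k]) ** 2
               for j in range(n) for k in range(j + 1, n))

def guttman_step(delta, X):
    """One SMACOF update (unit weights); it never increases raw stress."""
    n, dims = len(X), len(X[0])
    d = distances(X)
    new_X = []
    for j in range(n):
        row = [0.0] * dims
        for k in range(n):
            if k == j or d[j][k] == 0.0:
                continue  # skip self-pairs and coincident points
            ratio = delta[j][k] / d[j][k]
            for a in range(dims):
                row[a] += ratio * (X[j][a] - X[k][a])
        new_X.append([v / n for v in row])
    return new_X

# Hypothetical data: the distances between the corners of a unit square.
SIDE, DIAG = 1.0, math.sqrt(2.0)
delta = [
    [0.0, SIDE, DIAG, SIDE],
    [SIDE, 0.0, SIDE, DIAG],
    [DIAG, SIDE, 0.0, SIDE],
    [SIDE, DIAG, SIDE, 0.0],
]

# Arbitrary (assumed) starting configuration in two dimensions.
X = [[0.0, 0.1], [0.9, 0.3], [0.7, 1.2], [-0.2, 0.8]]

initial = raw_stress(delta, X)
for _ in range(200):
    X = guttman_step(delta, X)
final = raw_stress(delta, X)

print("raw stress before:", initial)
print("raw stress after: ", final)
```

Because each update cannot increase raw stress, the printed value after iteration is lower than the starting value; how close it gets to zero depends on the starting configuration.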

  • 8/6/2019 Quick & Simple

    4/13

    Distances & Maps

    Given a map, it's easy to calculate the (Euclidean) distances between the points:

    $$d_{jk} = \sqrt{\sum_a (x_{ja} - x_{ka})^2}$$

    MDS operates the other way round:

    Given the distances [data], find the map [configuration] which generated them,

    and MDS can do so when all but ordinal information has been jettisoned (fruit of the "non-metric revolution"),

    even when there are missing data and in the presence of considerable noise/error (MDS is robust).

    MDS thus provides, at least for exploratory purposes, a useful and easily assimilable graphic visualization of a complex data set (Tukey: "A picture is worth a thousand words")
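The "easy" direction, from map to distances, can be sketched in a few lines of plain Python. The three labelled points and the function name `euclidean_distance_matrix` are made up for illustration.

```python
import math

def euclidean_distance_matrix(points):
    """Return the full symmetric matrix of Euclidean distances."""
    n = len(points)
    return [[math.sqrt(sum((a - b) ** 2 for a, b in zip(points[j], points[k])))
             for k in range(n)] for j in range(n)]

# Hypothetical map: three points in two dimensions.
labels = ["A", "B", "C"]
points = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]

D = euclidean_distance_matrix(points)
for label, row in zip(labels, D):
    print(label, [round(v, 3) for v in row])
```

MDS starts from a matrix like `D` (or only from the rank order of its entries) and works back to the coordinates.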

  • 8/6/2019 Quick & Simple

    5/13

    What is like MDS?

    Related and Special-case Models:

    Metric Scalar Products Models:
      *PRINCIPAL COMPONENTS ANALYSIS
      FACTOR ANALYSIS (+ communalities)

    Metric and Non-Metric Ultrametric Distance, Discrete models:
      *Hierarchical Clustering
      *Partition Clustering (CONPAR)
      Additive Clustering (2 and 3-way)

    Metric Chi-squared Distance Model for 2W2M and 3W data / Tables:
      *Simple (2W2M) and Multiple (3W) Correspondence Analysis

    BECAUSE OF NON-METRIC (MONOTONE) REGRESSION, MDS ALSO OFFERS ORDINAL EQUIVALENTS OF:
      *ANOVA, other simple composition models, *UNICON

    (All models with asterisk * exist as programs within NewMDSX)

  • 8/6/2019 Quick & Simple

    6/13

    How does MDS differ from other Multivariate Methods?

    Compared to other multivariate methods, MDS models usually:

    are distribution-free (though MLE models do exist: Ramsay's MULTISCALE),

    make conservative (non-metric) demands on the structure of the data,

    are relatively unaffected by non-systematic missing data,

    can be used with a very wide variety of types of data:
      direct data (pair comparisons, ratings, rankings, triads, sortings)
      derived data (profiles, co-occurrence matrices, textual data, aggregated data)
      measures of association/correlation etc. derived from simpler data, and
      tables of data,

    allow a range of transformations: monotonic (ordinal), linear/metric (interval), but also log-interval, power, smoothness, and even maximum variance non-dimensional scaling (Shepard)

  • 8/6/2019 Quick & Simple

    7/13

    How does MDS differ from other Multivariate Methods (2)?

    Compared to other multivariate methods, MDS models also offer:

    a range of models: chiefly distance (Euclidean, but also City-block), factor/vector (scalar-products), and simple composition (additive).

    Also there are hierarchies of models:
      Similarity models: 2W1M METRIC, then 3W2M INDSCAL, then IDIOSCAL (honest!)
      Preference models: vector, distance, weighted distance, rotated and weighted distance (PREFMAP)
      Procrustes rotation for putting configurations into maximum conformity, and then increasingly complex transformations: PINDIS

    the solutions are visually assimilable & readily interpretable

    the structure is not limited to dimensional information: there are also other simple structures (horseshoes, radex/circumplex, clusters, directions).

  • 8/6/2019 Quick & Simple

    8/13

    Weaknesses in MDS

    There ARE any??!

    Relative ignorance of the sampling properties of stress

    proneness to local minima solutions (but less so, and interactive programs like PERMAP allow thousands of runs to check)

    a few forms of data/models are prone to degeneracies (especially MD Unfolding, but see the new PREFSCAL in SPSS)

    difficulty in representing the asymmetry of causal models, though external analysis is very akin to dependent-independent modelling,

    there are convergences with GLM in hybrid models such as CLASCAL (INDSCAL with parameterization of latent classes)

  • 8/6/2019 Quick & Simple

    9/13

    CHARACTERIZATION OF BASIC MDS & TERMINOLOGY

    Structure of MDS specifiable in terms of D-T-M

    DATA (specifies input data shape and content)

    DATA MATRIX INPUT:

    WAY: dimensionality of array (2, 3, 4 ...)

    MODALITY: No. of distinct sets (to be represented) (1, 2, 3 ...)

    NB: Modality ≤ Way

    Common examples:
      2W1M basic models (LTM, UTM, FSM)
      2W2M rectangular, joint (conditional) mapping
      3W2M (stack of 2W1M) Individual Differences Scaling
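To make the 2W1M input shapes concrete, the sketch below expands a lower-triangular listing of dissimilarities into a full symmetric matrix. It assumes that LTM and FSM stand for "lower triangular matrix" and "full symmetric matrix"; the function name and the three-object data are hypothetical.

```python
def lower_triangle_to_full(rows):
    """Expand a lower triangle (without diagonal) into a full symmetric matrix.

    rows[0] holds the entry for pair (2,1), rows[1] the entries for
    pairs (3,1) and (3,2), and so on.
    """
    n = len(rows) + 1
    full = [[0.0] * n for _ in range(n)]
    for j, row in enumerate(rows, start=1):
        for k, value in enumerate(row):
            full[j][k] = value
            full[k][j] = value  # one mode: the same objects index rows and columns
    return full

# Hypothetical dissimilarities among three objects, given as a lower triangle.
ltm = [
    [2.0],
    [5.0, 3.0],
]

fsm = lower_triangle_to_full(ltm)
for row in fsm:
    print(row)
```

Both shapes are two-way (a matrix) and one-mode (a single set of objects); a 3W2M data set would be a stack of such matrices, one per individual.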

  • 8/6/2019 Quick & Simple

    10/13

    CHARACTERIZATION OF BASIC MDS (2)

    TRANSFORMATION (form or type of rescaling performed on data)

    Non-Metric / Ordinal: δ = M(d)
      Monotonic Increasing (sims) or Decreasing (dissims)
      Order / inequality
        Strong / Guttman: (j,k) > (l,m) -> d(j,k) > d(l,m)
        Weak / Kruskal: (j,k) > (l,m) -> d(j,k) ≥ d(l,m)
      Equality / ties
        Primary: (j,k) = (l,m) -> d(j,k) = or ≠ d(l,m)
        Secondary: (j,k) = (l,m) -> d(j,k) = d(l,m)

    Metric / Linear: δ = L(d)
      δ = a + b(d)
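The ordinal transformation is usually fitted by monotone regression. The sketch below is a minimal pool-adjacent-violators routine in plain Python, corresponding to the weak (Kruskal) form: the fitted values never decrease, but may be equal. It assumes the configuration distances have already been listed in the rank order of the data; the function name and the four values are illustrative. The fitted values are what the MDS literature commonly calls disparities.

```python
def monotone_regression(values):
    """Least-squares weakly increasing fit to a sequence (pool adjacent violators)."""
    blocks = []  # each block is [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the previous block mean exceeds the current one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)  # every member of a block gets the block mean
    return fitted

# Hypothetical distances, ordered from the smallest to the largest dissimilarity.
d = [1.0, 3.0, 2.0, 4.0]
fitted = monotone_regression(d)
print(fitted)  # [1.0, 2.5, 2.5, 4.0]
```

The second and third distances violate the order of the data, so they are pooled and both replaced by their mean.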

  • 8/6/2019 Quick & Simple

    11/13

    CHARACTERIZATION OF BASIC MDS (3)

    MODEL: Euclidean Distance

    $$d_{jk} = \sqrt{\sum_a (x_{ja} - x_{ka})^2}$$

    where x_{ja} is the co-ordinate of point j on dimension a in the solution configuration X of low dimension

    The basic model is Euclidean distance, but other Minkowski metrics are available, including: City Block Model
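The Minkowski family can be sketched with a single exponent r: r = 2 gives the Euclidean distance and r = 1 the City Block distance. The function name `minkowski` and the two example points are assumptions for illustration.

```python
def minkowski(x, y, r):
    """Minkowski distance between two points with exponent r >= 1."""
    return sum(abs(a - b) ** r for a, b in zip(x, y)) ** (1.0 / r)

# Hypothetical pair of points in two dimensions.
p, q = (0.0, 0.0), (3.0, 4.0)

euclidean = minkowski(p, q, 2)   # straight-line distance
city_block = minkowski(p, q, 1)  # sum of the differences on each dimension

print(euclidean, city_block)
```

For these points the City Block distance is the larger of the two, since it adds the separations on the two dimensions instead of combining them as a hypotenuse.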

  • 8/6/2019 Quick & Simple

    12/13

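    (Badness of) FIT: Stress

    DEFINITIONS OF STRESS

    Raw Stress (sum of squared residuals from monotone regression):

    $$\text{rawstress} = \sum_{j,k} (d_{jk} - d^{o}_{jk})^2$$

    Normalising Factors:

    $$NF_1 = \sum_{j,k} d_{jk}^2 \quad \text{(sum of squared distances)}$$

    $$NF_2 = \sum_{j,k} (d_{jk} - \bar{d})^2 \quad \text{(sum of squared deviations from mean distance)}$$

    STRESS FORMULAE

    $$S_1 = \sqrt{\frac{\text{rawstress}}{NF_1}} \qquad S_2 = \sqrt{\frac{\text{rawstress}}{NF_2}}$$

    where d^{o}_{jk} is the value fitted to d_{jk} by the monotone regression and \bar{d} is the mean distance.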
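A small numerical sketch of these definitions in plain Python, assuming the square-root form of both stress coefficients. The function name `stress_coefficients` and the four distance / fitted-value pairs are hypothetical.

```python
import math

def stress_coefficients(distances, fitted):
    """Return (raw stress, S1, S2) for paired distances and fitted values."""
    raw = sum((d - f) ** 2 for d, f in zip(distances, fitted))
    nf1 = sum(d ** 2 for d in distances)                # sum of squared distances
    mean_d = sum(distances) / len(distances)
    nf2 = sum((d - mean_d) ** 2 for d in distances)     # squared deviations from mean
    return raw, math.sqrt(raw / nf1), math.sqrt(raw / nf2)

# Hypothetical distances and their monotone-regression fitted values.
d = [1.0, 3.0, 2.0, 4.0]
d_fit = [1.0, 2.5, 2.5, 4.0]

raw, s1, s2 = stress_coefficients(d, d_fit)
print(raw, round(s1, 4), round(s2, 4))
```

Because NF2 removes the mean distance, it is never larger than NF1, so S2 is never smaller than S1 for the same configuration.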
