Measurement in Market Research

Upload: calmchandan

Post on 07-Apr-2018


  • 8/4/2019 Measurement in Market Research

    1/39

    9-1

    Business Research Methods

    Measurement and Scaling:

    Noncomparative Scaling Techniques


    Noncomparative Scaling Techniques

    Respondents evaluate only one object at a time, and for this reason noncomparative scales are often referred to as monadic scales.

    Noncomparative techniques consist of continuous and itemized rating scales.


    Continuous Rating Scale

    Respondents rate the objects by placing a mark at the appropriate position on a line that runs from one extreme of the criterion variable to the other. The form of the continuous scale may vary considerably.

    How would you rate Sears as a department store?

    Version 1

    Probably the worst - - - - - - - I - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Probably the best

    Version 2

    Probably the worst - - - - - - - I - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Probably the best

    0 10 20 30 40 50 60 70 80 90 100

    Version 3

    Very bad          Neither good nor bad          Very good

    Probably the worst - - - - - - - I - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Probably the best

    0 10 20 30 40 50 60 70 80 90 100
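Scoring a continuous scale amounts to converting the position of the mark into a number. The sketch below is one possible way to do that, assuming the mark position and the line length have been measured in the same unit; the function name `score_continuous_mark` and the millimetre values are illustrative and not part of the slides.

```python
def score_continuous_mark(mark_position, line_length, scale_max=100.0):
    """Convert the position of a mark on the line into a 0..scale_max score.

    mark_position and line_length are measured from the "worst" end of the
    line, in the same unit (for example millimetres). Hypothetical helper.
    """
    if line_length <= 0:
        raise ValueError("line_length must be positive")
    if not 0 <= mark_position <= line_length:
        raise ValueError("the mark must lie on the line")
    return scale_max * mark_position / line_length


# A mark placed 30 mm along a 150 mm line maps to 20 on the 0-100 scale.
print(score_continuous_mark(30.0, 150.0))  # 20.0
```

Automating this conversion is what removes the scoring burden that continuous scales otherwise carry.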


    Itemized Rating Scales

    The respondents are provided with a scale that has a

    number or brief description associated with each

    category.

    The categories are ordered in terms of scale position, and

    the respondents are required to select the specified

    category that best describes the object being rated.

    The commonly used itemized rating scales are the Likert,

    semantic differential, and Stapel scales.


    Likert Scale

    The Likert scale requires the respondents to indicate a degree of agreement or disagreement with each of a series of statements about the stimulus objects.

    Response categories: 1 = Strongly disagree, 2 = Disagree, 3 = Neither agree nor disagree, 4 = Agree, 5 = Strongly agree. An X marks the respondent's answer.

    1. Sears sells high quality merchandise.    1   2X   3    4   5

    2. Sears has poor in-store service.         1   2X   3    4   5

    3. I like to shop at Sears.                 1   2    3X   4   5

    The analysis can be conducted on an item-by-item basis (profile analysis), or a total (summated) score can be calculated.

    When arriving at a total score, the categories assigned to the negative statements by the respondents should be scored by reversing the scale.
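The reverse-scoring rule can be sketched in a few lines. The example below uses the three Sears statements and the marked answers from the slide; the function name `summated_score` and the choice to treat only statement 2 as negatively worded are assumptions made for illustration.

```python
def summated_score(responses, negative_items, n_points=5):
    """Sum Likert responses, reversing the scale for negative statements.

    responses maps item number -> chosen category (1..n_points).
    negative_items is the set of item numbers that are negatively worded.
    """
    total = 0
    for item, answer in responses.items():
        if not 1 <= answer <= n_points:
            raise ValueError(f"item {item}: answer {answer} is off the scale")
        # On a 5-point scale, reversing maps 1<->5, 2<->4, and leaves 3 as is.
        total += (n_points + 1 - answer) if item in negative_items else answer
    return total


# Answers marked on the slide: item 1 -> 2, item 2 -> 2, item 3 -> 3.
# Item 2 ("poor in-store service") is the negative statement.
responses = {1: 2, 2: 2, 3: 3}
print(summated_score(responses, negative_items={2}))  # 2 + 4 + 3 = 9
```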


    Semantic Differential Scale

    The semantic differential is a seven-point rating scale with end points associated with bipolar labels that have semantic meaning.

    SEARS IS:

    Powerful   --:--:--:--:-X-:--:--: Weak

    Unreliable --:--:--:--:--:-X-:--: Reliable

    Modern     --:--:--:--:--:--:-X-: Old-fashioned

    The negative adjective or phrase sometimes appears at the left side of the scale and sometimes at the right.

    This controls the tendency of some respondents, particularly those with very positive or very negative attitudes, to mark the right- or left-hand sides without reading the labels.

    A Semantic Differential Scale for Measuring Self-Concepts, Person Concepts, and Product Concepts

    1) Rugged :---:---:---:---:---:---:---: Delicate

    2) Excitable :---:---:---:---:---:---:---: Calm

    3) Uncomfortable :---:---:---:---:---:---:---: Comfortable

    4) Dominating :---:---:---:---:---:---:---: Submissive

    5) Thrifty :---:---:---:---:---:---:---: Indulgent

    6) Pleasant :---:---:---:---:---:---:---: Unpleasant

    7) Contemporary :---:---:---:---:---:---:---: Obsolete

    8) Organized :---:---:---:---:---:---:---: Unorganized

    9) Rational :---:---:---:---:---:---:---: Emotional

    10) Youthful :---:---:---:---:---:---:---: Mature


    Stapel Scale

    The Stapel scale is a unipolar rating scale with ten categories numbered from -5 to +5, without a neutral point (zero). This scale is usually presented vertically.

    SEARS

         +5               +5
         +4               +4
         +3               +3
         +2               +2X
         +1               +1
    HIGH QUALITY     POOR SERVICE
         -1               -1
         -2               -2
         -3               -3
         -4X              -4
         -5               -5

    The data obtained by using a Stapel scale can be analyzed in the same way as semantic differential data. The scale shows both the intensity and the direction of the attitude.


    Basic Noncomparative Scales

    Scale | Basic Characteristics | Examples | Advantages | Disadvantages

    Continuous Rating Scale | Place a mark on a continuous line | Reaction to TV commercials | Easy to construct | Scoring can be cumbersome unless computerized

    Itemized Rating Scales:

    Likert Scale | Degrees of agreement on a 1 (strongly disagree) to 5 (strongly agree) scale | Measurement of attitudes | Easy to construct, administer, and understand | More time-consuming

    Semantic Differential | Seven-point scale with bipolar labels | Brand, product, and company images | Versatile | Controversy as to whether the data are interval

    Stapel Scale | Unipolar ten-point scale, -5 to +5, without a neutral point (zero) | Measurement of attitudes and images | Easy to construct, administer over telephone | Confusing and difficult to apply


    Summary of Itemized Scale Decisions

    1) Number of categories: Although there is no single, optimal number, traditional guidelines suggest that there should be between five and nine categories.

    2) Balanced vs. unbalanced: In general, the scale should be balanced to obtain objective data.

    3) Odd/even number of categories: If a neutral or indifferent scale response is possible from at least some of the respondents, an odd number of categories should be used.

    4) Forced vs. non-forced: In situations where the respondents are expected to have no opinion, the accuracy of the data may be improved by a non-forced scale.

    5) Verbal description: An argument can be made for labeling all or many scale categories. The category descriptions should be located as close to the response categories as possible.

    6) Physical form: A number of options should be tried and the best selected.


    Figure 9.1: Balanced and Unbalanced Scales

    Balanced scale. Jovan Musk for Men is:
    Extremely good / Very good / Good / Bad / Very bad / Extremely bad

    Unbalanced scale. Jovan Musk for Men is:
    Extremely good / Very good / Good / Somewhat good / Bad / Very bad


    Figure 9.2: Rating Scale Configurations

    A variety of scale configurations may be employed to measure the gentleness of Cheer detergent. Some examples include:

    Cheer detergent is:

    1) Very harsh --- --- --- --- --- --- --- Very gentle

    2) Very harsh 1 2 3 4 5 6 7 Very gentle

    3) . Very harsh
       .
       .
       . Neither harsh nor gentle
       .
       .
       . Very gentle

    4) ____ ____ ____ ____ ____ ____ ____
       Very harsh / Harsh / Somewhat harsh / Neither harsh nor gentle / Somewhat gentle / Gentle / Very gentle

    5) -3 -2 -1 0 +1 +2 +3
       Very harsh (-3) / Neither harsh nor gentle (0) / Very gentle (+3)


    Figure 9.3: Some Unique Rating Scales

    Thermometer Scale

    Instructions: Please indicate how much you like McDonald's hamburgers by coloring in the thermometer. Start at the bottom and color up to the temperature level that best indicates how strong your preference is.

    Form: [thermometer graphic marked 0, 25, 50, 75, 100, running from "Dislike very much" at the bottom to "Like very much" at the top]

    Smiling Face Scale

    Instructions: Please point to the face that shows how much you like the Barbie Doll. If you do not like the Barbie Doll at all, you would point to Face 1. If you liked it very much, you would point to Face 5.

    Form: [five faces, numbered 1 to 5]


    Thurstone Scale

    It is a two-stage procedure.

    In the first stage, the researcher selects 80 to 100 items indicating different degrees of favourable attitude towards the concept under study.

    The items are given to a group of judges, who sort them into categories ranging from favourable to unfavourable, keeping equal intervals between categories.

    All items on which the judges reach consensus are selected and distributed uniformly on a scale of favourability.

    This scale is then administered to respondents to measure their attitude towards a particular concept.

    It is time consuming and costly, and is rarely used in applied business research.


    In psychology, the Thurstone scale was the first formal technique for measuring an attitude. It was developed by Louis Leon Thurstone in 1928, as a means of measuring attitudes towards religion. It is made up of statements about a particular issue, and each statement has a numerical value indicating how favorable or unfavorable it is judged to be. People check each of the statements to which they agree, and a mean score is computed, indicating their attitude.
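The scoring step described above (average the scale values of the statements a respondent agrees with) can be sketched as follows. The statement labels and scale values are invented for illustration; in practice they would come from the judges' ratings.

```python
from statistics import mean


def thurstone_score(scale_values, endorsed):
    """Mean scale value of the statements a respondent agreed with.

    scale_values maps statement id -> favourability value set by the judges.
    endorsed lists the statement ids the respondent checked.
    """
    if not endorsed:
        raise ValueError("respondent endorsed no statements")
    return mean(scale_values[statement] for statement in endorsed)


# Hypothetical favourability values for five statements.
scale_values = {"A": 1.5, "B": 4.0, "C": 6.5, "D": 9.0, "E": 10.5}

# A respondent who checks statements C and D gets (6.5 + 9.0) / 2.
print(thurstone_score(scale_values, ["C", "D"]))  # 7.75
```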


    Measurement Accuracy

    The true score model provides a framework for understanding the accuracy of measurement.

    XO = XT + XS + XR

    where

    XO = the observed score or measurement

    XT = the true score of the characteristic

    XS = systematic error

    XR = random error
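A small numeric sketch may help make the decomposition concrete. The numbers below are invented: a true score of 70, a constant systematic error of 5, and five random errors chosen so that they sum to zero. Under those assumptions the random errors cancel in the average, while the systematic error does not.

```python
def observed_score(true_score, systematic_error, random_error):
    """XO = XT + XS + XR, as in the true score model."""
    return true_score + systematic_error + random_error


true_score = 70.0          # XT (hypothetical)
systematic_error = 5.0     # XS: the same bias in every measurement
random_errors = [-2.0, 1.0, 3.0, -1.0, -1.0]  # XR: chosen to sum to zero

observed = [observed_score(true_score, systematic_error, e) for e in random_errors]
mean_observed = sum(observed) / len(observed)

print(observed)                    # [73.0, 76.0, 78.0, 74.0, 74.0]
print(mean_observed - true_score)  # 5.0, the systematic error remains
```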


    Potential Sources of Error in Measurement

    1) Other relatively stable characteristics of the individual that influence the test score, such as intelligence, social desirability, and education.

    2) Short-term or transient personal factors, such as health, emotions, and fatigue.

    3) Situational factors, such as the presence of other people, noise, and distractions.

    4) Sampling of items included in the scale: addition, deletion, or changes in the scale items.

    5) Lack of clarity of the scale, including the instructions or the items themselves.

    6) Mechanical factors, such as poor printing, overcrowding items in the questionnaire, and poor design.

    7) Administration of the scale, such as differences among interviewers.

    8) Analysis factors, such as differences in scoring and statistical analysis.


    Criteria for Evaluating Measurement

    The criteria for evaluating measurements are:

    Reliability

    Validity

    Sensitivity

    Generalizability

    Relevance


    Reliability

    The degree to which measures are free from random error and therefore yield consistent results across time or situations.

    Perfect reliability requires that there is no random error:

    XR = 0


    Validity

    The ability of a scale to measure what was intended to be measured.

    Perfect validity requires that there is no measurement error, either systematic or random:

    XR = 0 and XS = 0


    Relationship between Validity and Reliability

    If a measure is perfectly valid, it is also perfectly reliable, because perfect validity already requires XR = 0.

    However, if a measure is perfectly reliable, it may or may not be perfectly valid, because systematic error (XS) may still be present.

    If a measure is unreliable, it will not be valid.

    Reliability is a necessary but not a sufficient condition for validity.


    THE GOAL OF

    MEASUREMENT:

    VALIDITY and RELIABILITY


    Reliability and Validity on Target

    Target A: Old rifle. Low reliability.

    Target B: New rifle. High reliability.

    Target C: New rifle with sun glare. Reliable but not valid.


    [Diagram of reliability types. Legible labels: reliability, stability, test-retest, equivalent forms, internal consistency, split halves. Caption: repeatability of index measures.]


    Types of Reliability

    There are two dimensions of reliability: repeatability and internal consistency.

    If the results of the research are the same even when it is conducted a second or third time, this confirms the repeatability aspect.

    Test-Retest Method: An approach for assessing reliability in which respondents are administered identical sets of scale items at two different times under as nearly equivalent conditions as possible.

    This measures repeatability, since the same scale or measure is administered to the same set of respondents at two separate points in time. If the measure is stable over time, it should obtain similar results (for example, 40% satisfied with their jobs both times).

    However, it is difficult to locate all respondents for the second round, their attitudes may change over time, or the first measure may sensitize the respondents.
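One common way to quantify test-retest reliability is to correlate the scores from the two administrations. The sketch below assumes that approach; the helper `pearson_r` and the two score lists are illustrative and do not come from the slides.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / sqrt(sxx * syy)


# Hypothetical scale totals for the same six respondents at two points in time.
time_1 = [12, 15, 9, 20, 17, 11]
time_2 = [13, 14, 10, 19, 18, 10]

test_retest_r = pearson_r(time_1, time_2)
print(round(test_retest_r, 2))  # close to 1 when the measure is stable
```

A value near 1 suggests the measure is stable over time; a low value may reflect either an unreliable scale or a genuine change in attitudes between the two rounds.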


    Equivalent Forms Method

    An approach to assessing reliability that requires two equivalent forms of the scale to be constructed and administered to the same respondents at two different times.

    However, it is difficult, time consuming, and expensive to construct two equivalent forms of a scale.


    Internal Consistency

    This measure of reliability focuses on the internal consistency of the set of items forming the scale.

    It is used to assess the reliability of a summated scale, where several items are summed to form a total score. Each item measures some aspect of the construct, and the items should be consistent in what they indicate about the characteristic.


    Split-Half Method

    Split-half method: A method of measuring internal consistency reliability in which the items constituting the scale are divided into two halves and the resulting scores of the two halves are correlated. High correlation indicates high internal consistency.

    However, the results will depend on how the scale items are split.

    Coefficient alpha: A measure of internal consistency reliability that is the average of all possible split-half coefficients resulting from different splittings of the scale items.
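The sketch below illustrates both ideas on invented data (five respondents, four items). The split-half coefficient is computed as the correlation between the two half-scores; coefficient alpha is computed with the usual variance-based formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), rather than by enumerating every split. All function and variable names are illustrative.

```python
from math import sqrt
from statistics import variance


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_r(rows, first_half, second_half):
    """Correlate respondents' totals on two halves of the item set."""
    half_a = [sum(row[i] for i in first_half) for row in rows]
    half_b = [sum(row[i] for i in second_half) for row in rows]
    return pearson_r(half_a, half_b)


def coefficient_alpha(rows):
    """Cronbach's alpha from item variances and total-score variance."""
    k = len(rows[0])
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers: one row per respondent, one column per scale item.
rows = [
    [4, 5, 4, 5],
    [2, 2, 3, 2],
    [3, 3, 3, 4],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
]

r_first_vs_last = split_half_r(rows, [0, 1], [2, 3])
r_odd_vs_even = split_half_r(rows, [0, 2], [1, 3])
alpha = coefficient_alpha(rows)

# The two splits give different coefficients; alpha does not depend on a split.
print(round(r_first_vs_last, 3), round(r_odd_vs_even, 3), round(alpha, 3))
```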


    Some multi-item scales include several sets of items measuring different dimensions of a multidimensional construct. Since these dimensions are independent, a measure of internal consistency computed across dimensions would be inappropriate, so internal consistency reliability can be computed for each dimension.

    Store image is a multidimensional construct that includes:

    --- quality of goods

    --- variety of goods

    --- returns policy

    --- service

    --- price

    --- location

    --- layout

    --- billing and credit policy


    Validity

    [Diagram of validity types. Legible labels: face or content, criterion (concurrent, predictive), construct.]

    Face: Professional agreement that the measure logically appears valid (subjective).

    Content: Depends on established theories for support (objective).

    Criterion: Does it fit or correlate with other similar measures or constructs? Example: body fat measured by caliper, water displacement, electrical impedance, and BMI.

    Concurrent: Two measures, taken at the same time.

    Predictive: Two measures, taken at different times.

    Construct: Confirmed with a network of hypotheses. It includes convergent validity (high relationship with similar concepts) and divergent or discriminant validity (low relationship with dissimilar concepts).


    Face Validity

    Face validity: Subjective agreement among professionals that a scale logically appears to accurately measure what it is intended to measure. It is the weakest form of validity, since it involves no analysis.

    Face validity is concerned with how a measure or procedure appears. Does it seem like a reasonable way to gain the information the researchers are attempting to obtain? Does it seem well designed? Does it seem as though it will work reliably? Unlike content validity, face validity does not depend on established theories for support.


    Content Validity

    Content validity is based on the extent to which a measurement reflects the specific intended domain of content.

    For example, researchers aim to study mathematical learning and create a survey to test for mathematical skill. If these researchers only tested for multiplication and then drew conclusions from that survey, their study would not show content validity, because it excludes other mathematical functions.

    To measure the adequacy of facilities in schools:

    Not relevant variables: attractiveness of the school name, frequency of old students' meets, eatables in the canteen.

    Relevant variables: number of classrooms, number of qualified teachers, playground, library.


    Criterion-Related Validity

    Criterion-related validity, also referred to as instrumental validity, is used to demonstrate the accuracy of a measure or procedure by comparing it with another measure or procedure which has been demonstrated to be valid.

    For example, imagine a hands-on driving test has been shown to be an accurate test of driving skills. By comparing the scores on the written driving test with the scores from the hands-on driving test, the written test can be validated by using a criterion-related strategy in which the hands-on driving test is compared to the written test.

    The new measure correlates with the criterion measure.


    Predictive Validity

    Predictive validity: A type of criterion validity whereby a new measure correlates with a criterion measure administered at a later time.

    In order for a test to be a valid screening device for some future behaviour, it must have predictive validity. The SAT is used by college screening committees as one way to predict college grades. The GMAT is used to predict success in business. Both uses depend on predictive validity.

    We determine predictive validity by computing a correlation coefficient comparing, for example, SAT scores (the new measure) and college grades (the criterion). If they are directly related, then we can make a prediction regarding college grades based on SAT score. We can show that students who score high on the SAT tend to receive high grades in college.
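The correlation described above can be sketched as follows. The SAT scores and later college grade point averages are invented for six hypothetical students, and `pearson_r` is an illustrative helper, not a reference to any particular statistics package.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# New measure, taken first (hypothetical SAT scores).
sat_scores = [1050, 1200, 1340, 1420, 1510, 980]
# Criterion, observed later (hypothetical college grade point averages).
college_gpa = [2.8, 3.1, 3.4, 3.5, 3.9, 2.6]

predictive_r = pearson_r(sat_scores, college_gpa)
print(round(predictive_r, 2))  # a high positive value supports predictive validity
```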


    Construct Validity

    Construct validity seeks agreement between a theoretical concept and a specific measuring device or procedure. For example, a researcher inventing a new IQ test might spend a great deal of time attempting to "define" intelligence in order to reach an acceptable level of construct validity.

    Construct validity can be broken down into two sub-categories: convergent validity and discriminant validity. Convergent validity is the actual general agreement among ratings, where measures should be theoretically related.

    Discriminant validity is the lack of a relationship among measures which theoretically should not be related.


    To measure: tendency to stay in low-cost hotels.

    Four personality variables: high level of self-confidence, low need for status, low need for distinctiveness, high level of adaptability.

    Not related to: brand loyalty, high level of aggressiveness.

    The scale can be said to have construct validity if it correlates highly with other measures of the tendency to stay in low-cost hotels, such as reported hotels patronised and social class (convergent validity).

    It should also show low correlation with the unrelated constructs of brand loyalty and high level of aggressiveness (divergent validity).
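A convergent and divergent check of this kind can be sketched by correlating the scale with one related and one unrelated measure. All numbers below are invented for six hypothetical respondents, and the thresholds in the final line (above 0.7, below 0.3 in absolute value) are illustrative cut-offs, not values from the slides.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores on the new "tendency to stay in low-cost hotels" scale.
scale_scores = [2, 4, 5, 7, 8, 9]
# Related measure: reported number of stays in low-cost hotels.
low_cost_stays = [1, 3, 4, 6, 8, 8]
# Unrelated construct: brand loyalty score.
brand_loyalty = [5, 3, 6, 4, 6, 4]

convergent_r = pearson_r(scale_scores, low_cost_stays)
divergent_r = pearson_r(scale_scores, brand_loyalty)

print(round(convergent_r, 2), round(divergent_r, 2))
# Illustrative rule of thumb: high convergent, near-zero divergent correlation.
print(convergent_r > 0.7 and abs(divergent_r) < 0.3)  # True
```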


    SENSITIVITY

    A measurement instrument's ability to accurately measure variability in stimuli or responses.

    Two-category responses such as "yes or no" and "agree or disagree" are not very sensitive.

    "Strongly agree", "mildly agree", "indifferent", "mildly disagree", and "strongly disagree" are categories whose inclusion increases a scale's sensitivity.


    Generalizability

    It is the degree to which a study based on a sample applies to a universe of generalization.

    The universe of generalization includes the set of all conditions of measurement: items, interviewers, modes of data collection, etc.

    Examples:

    Generalizing a scale developed for personal interviews to other modes of data collection, such as mail or telephone.

    Generalizing from a sample of items to the universe of items.


    Relevance

    It represents the appropriateness of using a particular scale for measuring a variable.

    Relevance = Reliability x Validity

    If either reliability or validity is low, then the scale will have little relevance.

    If a correlation coefficient is used to analyse both reliability and validity, then the scale can have relevance from 0 to 1.
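As a quick arithmetic sketch of the product rule, the function below multiplies a reliability coefficient by a validity coefficient. It assumes both are expressed as values between 0 and 1, as the slide describes; the function name and the sample coefficients are illustrative.

```python
def relevance(reliability, validity):
    """Relevance = Reliability x Validity, both assumed to lie in [0, 1]."""
    for name, value in (("reliability", reliability), ("validity", validity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    return reliability * validity


print(relevance(0.75, 0.5))  # 0.375
print(relevance(1.0, 0.25))  # 0.25: perfect reliability cannot offset low validity
```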