Measurement in Market Research
TRANSCRIPT
-
8/4/2019 Measurement in Market Research
Business Research Methods
Measurement and Scaling:
Noncomparative Scaling Techniques
-
Noncomparative Scaling Techniques
Respondents evaluate only one object at a time, and for this reason noncomparative scales are often referred to as monadic scales.
Noncomparative techniques consist of continuous and itemized rating scales.
-
Continuous Rating Scale
Respondents rate the objects by placing a mark at the appropriate position
on a line that runs from one extreme of the criterion variable to the other.
The form of the continuous scale may vary considerably.
How would you rate Sears as a department store?
Version 1
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Probably the best
Version 2
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -- - Probably the best
0 10 20 30 40 50 60 70 80 90 100
Version 3
Very bad Neither good Very good
nor bad
Probably the worst - - - - - - -I - - - - - - - - - - - - - - - - - - - - - -- - - - - - - - - - - - - - - - -Probably the best
0 10 20 30 40 50 60 70 80 90 100
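Scoring a continuous scale amounts to converting the position of the respondent's mark into a number. A minimal sketch, assuming a printed line of known length measured in millimetres and the 0 to 100 range shown in Versions 2 and 3. The function name and the 150 mm line length are illustrative assumptions, not part of the original slides.

```python
def score_mark(mark_mm: float, line_mm: float,
               low: float = 0.0, high: float = 100.0) -> float:
    """Convert the position of a mark on a printed line into a scale score.

    mark_mm is the distance of the mark from the left ("Probably the worst")
    end of the line; line_mm is the full length of the line.
    """
    if line_mm <= 0:
        raise ValueError("line length must be positive")
    if not 0 <= mark_mm <= line_mm:
        raise ValueError("mark must lie on the line")
    # Linear interpolation between the two scale endpoints.
    return low + (high - low) * mark_mm / line_mm


# Hypothetical respondent: mark placed 30 mm along a 150 mm line.
print(score_mark(30.0, 150.0))
```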
-
Itemized Rating Scales
The respondents are provided with a scale that has a
number or brief description associated with each
category.
The categories are ordered in terms of scale position, and
the respondents are required to select the specified
category that best describes the object being rated.
The commonly used itemized rating scales are the Likert,
semantic differential, and Stapel scales.
-
Likert Scale
The Likert scale requires the respondents to indicate a degree of agreement or disagreement with each of a series of statements about the stimulus objects.
                                           Strongly  Disagree  Neither    Agree  Strongly
                                           disagree            agree nor         agree
                                                               disagree
1. Sears sells high quality merchandise.      1         2X        3         4       5
2. Sears has poor in-store service.           1         2X        3         4       5
3. I like to shop at Sears.                   1         2         3X        4       5
The analysis can be conducted on an item-by-item basis (profile analysis), or a total (summated) score can be calculated.
When arriving at a total score, the categories assigned to the negative statements by the respondents should be scored by reversing the scale.
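The summated score with reversed negative items can be sketched as follows. This is an illustrative example, not the slide author's procedure: the function name is hypothetical, and the responses are the three marked answers from the Sears example, with statement 2 ("poor in-store service") treated as the negative statement.

```python
def summated_score(responses, negative_items, scale_min=1, scale_max=5):
    """Total Likert score, reverse-scoring the negatively worded items.

    responses maps item number -> category chosen by the respondent.
    negative_items is the set of item numbers that are negative statements.
    """
    total = 0
    for item, answer in responses.items():
        if not scale_min <= answer <= scale_max:
            raise ValueError(f"item {item}: answer {answer} is off the scale")
        if item in negative_items:
            # Reverse the scale: 1 <-> 5, 2 <-> 4, 3 stays 3.
            answer = scale_min + scale_max - answer
        total += answer
    return total


# Marked answers from the slide: items 1 and 2 scored 2, item 3 scored 3.
sears = {1: 2, 2: 2, 3: 3}
print(summated_score(sears, negative_items={2}))  # 2 + 4 + 3 = 9
```

Without the reversal the total would be 7, which would understate how favourably this respondent rated the store.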
-
Semantic Differential Scale
The semantic differential is a seven-point rating scale with end
points associated with bipolar labels that have semantic meaning.
SEARS IS:
Powerful --:--:--:--:-X-:--:--: Weak
Unreliable --:--:--:--:--:-X-:--: Reliable
Modern --:--:--:--:--:--:-X-: Old-fashioned
The negative adjective or phrase sometimes appears at the left side of the scale and sometimes at the right.
This controls the tendency of some respondents, particularly those with very positive or very negative attitudes, to mark the right- or left-hand sides without reading the labels.
-
A Semantic Differential Scale for Measuring Self-Concepts, Person Concepts, and Product Concepts
1) Rugged :---:---:---:---:---:---:---: Delicate
2) Excitable :---:---:---:---:---:---:---: Calm
3) Uncomfortable :---:---:---:---:---:---:---: Comfortable
4) Dominating :---:---:---:---:---:---:---: Submissive
5) Thrifty :---:---:---:---:---:---:---: Indulgent
6) Pleasant :---:---:---:---:---:---:---: Unpleasant
7) Contemporary :---:---:---:---:---:---:---: Obsolete
8) Organized :---:---:---:---:---:---:---: Unorganized
9) Rational :---:---:---:---:---:---:---: Emotional
10) Youthful :---:---:---:---:---:---:---: Mature
-
Stapel Scale
The Stapel scale is a unipolar rating scale with ten categories numbered from -5 to +5, without a neutral point (zero). This scale is usually presented vertically.
               SEARS
     +5                   +5
     +4                   +4
     +3                   +3
     +2                   +2X
     +1                   +1
HIGH QUALITY         POOR SERVICE
     -1                   -1
     -2                   -2
     -3                   -3
     -4X                  -4
     -5                   -5
The data obtained by using a Stapel scale can be analyzed in the same way as semantic differential data. It shows both intensity and direction.
-
Basic Noncomparative Scales
Continuous Rating Scale
  Basic characteristics: Place a mark on a continuous line
  Examples: Reaction to TV commercials
  Advantages: Easy to construct
  Disadvantages: Scoring can be cumbersome unless computerized
Itemized Rating Scales
Likert Scale
  Basic characteristics: Degrees of agreement on a 1 (strongly disagree) to 5 (strongly agree) scale
  Examples: Measurement of attitudes
  Advantages: Easy to construct, administer, and understand
  Disadvantages: More time-consuming
Semantic Differential
  Basic characteristics: Seven-point scale with bipolar labels
  Examples: Brand, product, and company images
  Advantages: Versatile
  Disadvantages: Controversy as to whether the data are interval
Stapel Scale
  Basic characteristics: Unipolar ten-point scale, -5 to +5, without a neutral point (zero)
  Examples: Measurement of attitudes and images
  Advantages: Easy to construct, administer over telephone
  Disadvantages: Confusing and difficult to apply
-
Summary of Itemized Scale Decisions
1) Number of categories: Although there is no single, optimal number, traditional guidelines suggest that there should be between five and nine categories.
2) Balanced vs. unbalanced: In general, the scale should be balanced to obtain objective data.
3) Odd/even number of categories: If a neutral or indifferent scale response is possible from at least some of the respondents, an odd number of categories should be used.
4) Forced vs. non-forced: In situations where the respondents are expected to have no opinion, the accuracy of the data may be improved by a non-forced scale.
5) Verbal description: An argument can be made for labeling all or many scale categories. The category descriptions should be located as close to the response categories as possible.
6) Physical form: A number of options should be tried and the best selected.
-
Figure 9.1 Balanced and Unbalanced Scales
Balanced scale              Unbalanced scale
Jovan Musk for Men is       Jovan Musk for Men is
Extremely good              Extremely good
Very good                   Very good
Good                        Good
Bad                         Somewhat good
Very bad                    Bad
Extremely bad               Very bad
-
Figure 9.2 Rating Scale Configurations
A variety of scale configurations may be employed to measure the gentleness of Cheer detergent. Some examples include:
Cheer detergent is:
1) Very harsh --- --- --- --- --- --- --- Very gentle
2) Very harsh 1 2 3 4 5 6 7 Very gentle
3) . Very harsh
   .
   .
   . Neither harsh nor gentle
   .
   .
   . Very gentle
4) ____        ____    ____       ____            ____       ____     ____
   Very harsh  Harsh   Somewhat   Neither harsh   Somewhat   Gentle   Very gentle
                       harsh      nor gentle      gentle
5) -3           -2   -1   0   +1   +2           +3
   Very harsh        Neither harsh              Very gentle
                     nor gentle
-
Figure 9.3 Some Unique Rating Scale Configurations
Thermometer Scale
Instructions: Please indicate how much you like McDonald's hamburgers by coloring in the thermometer. Start at the bottom and color up to the temperature level that best indicates how strong your preference is.
Form: [thermometer graphic graduated 100, 75, 50, 25, 0, running from "Like very much" at the top to "Dislike very much" at the bottom]
Smiling Face Scale
Instructions: Please point to the face that shows how much you like the Barbie Doll. If you do not like the Barbie Doll at all, you would point to Face 1. If you liked it very much, you would point to Face 5.
Form: [five faces numbered 1 2 3 4 5]
-
Thurstone Scale
It is a two-stage procedure.
In the first stage, the researcher selects 80 to 100 items indicating different degrees of favourable attitude towards the concept under study.
The items are given to a group of judges, who sort them into categories ranging from favourable to unfavourable, keeping equal intervals between categories.
All items on which the judges reach consensus are selected and distributed uniformly on a scale of favourability.
This scale is then administered to respondents to measure their attitude towards a particular concept.
It is time-consuming and costly, and is rarely used in applied business research.
-
In psychology, the Thurstone scale was the first formal technique for measuring an attitude. It was developed by Louis Leon Thurstone in 1928, as a means of measuring attitudes towards religion. It is made up of statements about a particular issue, and each statement has a numerical value indicating how favorable or unfavorable it is judged to be. People check each of the statements with which they agree, and a mean score is computed, indicating their attitude.
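The scoring step described above (mean of the scale values of the endorsed statements) can be sketched as follows. The statement labels and scale values are hypothetical, chosen only to illustrate the arithmetic. In a real Thurstone scale they would come from the judges' ratings.

```python
from statistics import mean


def thurstone_score(scale_values, endorsed):
    """Mean scale value of the statements a respondent agrees with."""
    if not endorsed:
        raise ValueError("respondent endorsed no statements")
    return mean(scale_values[statement] for statement in endorsed)


# Hypothetical favourability values assigned to four statements by judges.
scale_values = {"A": 1.5, "B": 4.0, "C": 6.5, "D": 9.0}

# A respondent who checks statements C and D.
print(thurstone_score(scale_values, ["C", "D"]))  # (6.5 + 9.0) / 2 = 7.75
```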
-
Measurement Accuracy
The true score model provides a framework for understanding the accuracy of measurement.
XO = XT + XS + XR
where
XO = the observed score or measurement
XT = the true score of the characteristic
XS = systematic error
XR = random error
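A small numeric illustration of the true score model, under stated assumptions: the true score, the systematic error, and the list of random errors are all invented values, and the random errors are deliberately chosen to sum to zero so the arithmetic is easy to follow.

```python
def observed_score(true_score, systematic_error, random_error):
    """XO = XT + XS + XR."""
    return true_score + systematic_error + random_error


# Assumed values for one respondent measured four times.
XT = 4.0                              # true score
XS = 0.5                              # systematic error: same in every measurement
random_errors = [-0.5, 0.25, 0.25, 0.0]   # random error: varies, sums to zero here

observations = [observed_score(XT, XS, xr) for xr in random_errors]
mean_observed = sum(observations) / len(observations)

print(observations)        # [4.0, 4.75, 4.75, 4.5]
print(mean_observed - XT)  # 0.5: the systematic error remains after averaging
```

In this constructed example, averaging repeated measurements removes the random component but leaves the systematic component untouched.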
-
Potential Sources of Error on Measurement
1) Other relatively stable characteristics of the individual that influence the test score, such as intelligence, social desirability, and education.
2) Short-term or transient personal factors, such as health, emotions, and fatigue.
3) Situational factors, such as the presence of other people, noise, and distractions.
4) Sampling of items included in the scale: addition, deletion, or changes in the scale items.
5) Lack of clarity of the scale, including the instructions or the items themselves.
6) Mechanical factors, such as poor printing, overcrowding items in the questionnaire, and poor design.
7) Administration of the scale, such as differences among interviewers.
8) Analysis factors, such as differences in scoring and statistical analysis.
-
Criteria for evaluating measurement
The criteria for evaluating measurements are:
Reliability
Validity
Sensitivity
Generalizability
Relevance
-
Reliability
The degree to which measures are free from random error and therefore yield consistent results across time or situations.
Perfect reliability requires that there is no random error:
XR = 0
-
Validity
The ability of a scale to measure what
was intended to be measured.
Perfect validity requires that there is no measurement error, either systematic or random:
XR = 0 and XS = 0
-
Relationship between validity & reliability
If a measure is perfectly valid it is also
perfectly reliable
However if a measure is perfectly reliable
it may or may not be perfectly valid
If a measure is unreliable it will not be valid
Reliability is a necessary but not a
sufficient condition for validity
-
THE GOAL OF
MEASUREMENT:
VALIDITY and RELIABILITY
-
Reliability and Validity on Target
Target A: Old rifle. Low reliability.
Target B: New rifle. High reliability.
Target C: New rifle, sunglare. Reliable but not valid.
-
RELIABILITY
[Diagram: reliability divided into stability (repeatability of measures: test-retest, equivalent forms) and internal consistency (of index measures: split-half)]
-
Types of Reliability
There are two dimensions of reliability: repeatability and internal consistency.
If the results of the research are the same even when it is conducted a second or third time, this confirms the repeatability aspect.
Test-Retest Method: An approach for assessing reliability in which respondents are administered identical sets of scale items at two different times under as nearly equivalent conditions as possible.
This measures repeatability, since the same scale or measure is administered to the same set of respondents at two separate points in time. If the measure is stable over time, it should obtain similar results (for example, 40% satisfied with their jobs both times).
However, it is difficult to locate all respondents for the second round, their attitudes may change over time, or the first measure may sensitize the respondents.
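One common way to quantify test-retest reliability is the correlation between the two administrations. The sketch below assumes a Pearson correlation coefficient is the chosen statistic and uses invented scores for five respondents. The slides themselves do not prescribe a particular statistic here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary in both administrations")
    return sxy / sqrt(sxx * syy)


# Hypothetical scale scores for the same five respondents at two times.
time_1 = [4, 2, 5, 3, 1]
time_2 = [5, 2, 4, 3, 1]

print(pearson_r(time_1, time_2))  # close to 1 means stable over time
```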
-
Equivalent Forms Method
An approach to assess reliability that requires two equivalent forms of the scale to be constructed and administered to the same respondents at two different times.
However, it is difficult, time-consuming, and expensive to construct two equivalent forms of a scale.
-
Internal Consistency
This measure of reliability focuses on the internal consistency of the set of items forming the scale.
It is used to assess the reliability of a summated scale, where several items are summed to form a total score. Each item measures some aspect of the construct, and the items should be consistent in what they indicate about the characteristic.
-
Split half Method
Split-half method: A method of measuring internal consistency reliability in which the items constituting the scale are divided into two halves and the resulting scores of the two halves are correlated. High correlation indicates high consistency.
However, results will depend on how the scale items are split.
Coefficient alpha: A measure of internal consistency reliability that is the average of all possible split-half coefficients resulting from different splittings of the scale items.
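In practice coefficient alpha (Cronbach's alpha) is usually computed from item variances rather than by enumerating every split. The sketch below uses that standard variance form, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), which is a swapped-in computation and not the averaging procedure named on the slide. The response data are invented.

```python
from statistics import pvariance


def cronbach_alpha(rows):
    """Coefficient alpha for a summated scale.

    rows holds one list of item scores per respondent; every respondent
    must answer the same k items.
    """
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    if any(len(row) != k for row in rows):
        raise ValueError("every respondent must answer all items")
    item_variances = [pvariance([row[i] for row in rows]) for i in range(k)]
    total_variance = pvariance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores do not vary")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical answers of four respondents to a three-item scale.
answers = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [5, 4, 5],
]
print(round(cronbach_alpha(answers), 2))  # about 0.95
```

Unlike a single split-half correlation, this value does not depend on how the items happen to be divided.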
-
Some multi-item scales include several sets of items measuring different dimensions of a multidimensional construct. Since these dimensions are independent, a measure of internal consistency computed across dimensions would be inappropriate, so internal consistency reliability should be computed for each dimension separately.
Store image is a multidimensional construct that includes:
--- quality of goods
--- variety of goods
--- returns policy
--- service
--- price
--- location
--- layout
--- billing and credit policy
-
Validity
Face: Professional agreement that the measure logically appears valid (subjective).
Content: Depends on established theories for support (objective).
Criterion: Does it fit or correlate with other similar measures or constructs? Example: body fat measured by caliper, water displacement, electrical impedance, and BMI.
  Concurrent: two measures taken at the same time.
  Predictive: two measures taken at different times.
Construct: Confirmed with a network of hypotheses.
  Convergent: high relationship with similar concepts.
  Divergent or discriminant: low relationship with dissimilar concepts.
-
Face Validity
Face validity: Subjective agreement among professionals that a scale logically appears to accurately measure what it is intended to measure. It is the weakest form of validity, established without any analysis.
Face validity is concerned with how a measure or procedure appears. Does it seem like a reasonable way to gain the information the researchers are attempting to obtain? Does it seem well designed? Does it seem as though it will work reliably? Unlike content validity, face validity does not depend on established theories for support.
-
Content Validity
Content validity is based on the extent to which a measurement reflects the specific intended domain of content.
Researchers aim to study mathematical learning and create a survey to test for mathematical skill. If these researchers only tested for multiplication and then drew conclusions from that survey, their study would not show content validity because it excludes other mathematical functions.
To measure the adequacy of facilities in schools:
Not relevant variables: attractiveness of the school name, frequency of old students' meets, eatables in the canteen.
Relevant variables: number of classrooms, number of qualified teachers, playground, library.
-
Criterion-Related Validity
Criterion-related validity, also referred to as instrumental validity, is used to demonstrate the accuracy of a measure or procedure by comparing it with another measure or procedure which has been demonstrated to be valid.
For example, imagine a hands-on driving test has been shown to be an accurate test of driving skills. By comparing the scores on the written driving test with the scores from the hands-on driving test, the written test can be validated using a criterion-related strategy.
In short, the new measure correlates with the criterion measure.
-
Predictive Validity
Predictive validity: A type of criterion validity whereby a new measure correlates with a criterion measure administered at a later time.
In order for a test to be a valid screening device for some future behaviour, it must have predictive validity. The SAT is used by college screening committees as one way to predict college grades. The GMAT is used to predict success in business. Both uses depend on predictive validity.
We determine predictive validity by computing a correlation coefficient comparing, for example, SAT scores (the new measure) and college grades (the criterion). If they are directly related, then we can make a prediction regarding college grades based on SAT score. We can show that students who score high on the SAT tend to receive high grades in college.
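The slide assesses predictive validity with a correlation coefficient. The sketch below illustrates the follow-on step it mentions, making a prediction of college grades from an SAT score, using an ordinary least-squares line. That technique is an assumption of this example, and the SAT scores and grade point averages are invented.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("predictor scores must vary")
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: SAT score at admission, college GPA measured later.
sat = [1000, 1100, 1200, 1300, 1400]
gpa = [2.4, 2.8, 3.0, 3.4, 3.6]

slope, intercept = fit_line(sat, gpa)
predicted = intercept + slope * 1250   # predicted GPA for a new applicant
print(round(predicted, 2))
```

A positive slope here reflects the direct relationship the slide describes. How much weight such a prediction deserves depends on the strength of the correlation.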
-
Construct Validity
Construct validity seeks agreement between a theoretical concept and a specific measuring device or procedure. For example, a researcher inventing a new IQ test might spend a great deal of time attempting to "define" intelligence in order to reach an acceptable level of construct validity.
Construct validity can be broken down into two sub-categories: convergent validity and discriminant validity. Convergent validity is the actual general agreement among ratings, where measures should be theoretically related.
Discriminant validity is the lack of a relationship among measures which theoretically should not be related.
-
To measure: tendency to stay in low-cost hotels.
Four related personality variables: high level of self-confidence, low need for status, low need for distinctiveness, high level of adaptability.
Not related to: brand loyalty, high level of aggressiveness.
The scale can be said to have construct validity if it correlates highly with other measures of the tendency to stay in low-cost hotels, such as reported hotels patronised and social class (convergent validity).
It should also show low correlation with the unrelated constructs of brand loyalty and high level of aggressiveness (divergent validity).
-
SENSITIVITY
A measurement instrument's ability to accurately measure variability in stimuli or responses.
Dichotomous categories such as "yes or no" and "agree or disagree" are not very sensitive.
Strongly agree, mildly agree, indifferent, mildly disagree, and strongly disagree are categories whose inclusion increases a scale's sensitivity.
-
Generalizability
It is the degree to which a study based on a sample applies to a universe of generalization.
The universe of generalization includes the set of all conditions of measurement: items, interviewers, modes of data collection, etc.
Examples:
To generalize a scale developed for personal interviews to other modes of data collection, such as mail or telephone.
To generalize from a sample of items to the universe of items.
-
Relevance
It represents the appropriateness of using a particular scale for measuring a variable.
Relevance = Reliability x Validity
If either reliability or validity is low, then the scale will have little relevance.
If a correlation coefficient is used to analyse both reliability and validity, then the scale can have relevance from 0 to 1.
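A short worked example of the product rule above, assuming both reliability and validity are expressed as coefficients between 0 and 1. The function name and the coefficient values are hypothetical.

```python
def relevance(reliability: float, validity: float) -> float:
    """Relevance = Reliability x Validity, with both coefficients in [0, 1]."""
    for name, value in (("reliability", reliability), ("validity", validity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    return reliability * validity


# A reliable and valid scale versus a reliable scale with low validity.
print(round(relevance(0.9, 0.8), 2))
print(round(relevance(0.9, 0.2), 2))
```

Because the two coefficients are multiplied, a low value on either one pulls the relevance down regardless of how high the other is.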