basic marketing research v3

Upload: khaleddanaf

Post on 15-Oct-2015


  • 5/25/2018 Basic Marketing Research V3

    1/170

    BasicMarketingResearch

    Analysis and Results

    Scott M. Smith and Gerald S. Albaum

    Copyright 2013, Qualtrics Labs, Inc.


    ISBN: 978-0-9849328-3-2

    2013 Qualtrics Labs Inc.

    All rights reserved. This publication may not be reproduced or transmitted in any form or by any means, electronic or mechanical,

    including photocopy, recording, or any information storage and retrieval system, without written permission from Qualtrics. Errors

    or omissions will be corrected in subsequent editions.

    Scott M. Smith is Founder of Qualtrics and Professor Emeritus of Marketing, Brigham Young University.

    Professor Smith is a Fulbright Scholar and has written numerous articles published in journals such as Journal of Consumer Research, Journal of the Academy of Marketing Science, Journal of Business Ethics, International Journal of Marketing Research,

    Journal of Marketing Research, and Journal of Business Research. He is the author, co-author, or editor of books, chapters, and

    proceedings including An Introduction to Marketing Research. Qualtrics, 2010 (with G. Albaum); Fundamentals of Marketing

    Research. Thousand Oaks, CA: Sage Publishers 2005 (with G. Albaum); Multidimensional Scaling. New York: Allyn and Bacon 1989

    (with F. J. Carmone and P. E. Green), and Computer Assisted Decisions in Marketing. Richard D. Irwin 1988 (with W. Swinyard).

    Gerald S. Albaum is Research Professor in the Marketing Department at the Robert O. Anderson Schools of Management, the University of New Mexico, and Professor Emeritus of Marketing, University of Oregon.

    Professor Albaum has written numerous articles published in journals such as Journal of Marketing Research, Journal of the Academy of Marketing Science, Journal of the Market Research Society, Psychological Reports, Journal of Retailing, Journal of Business and Journal of Business Research. He is the author, co-author, or editor of twenty books including International Marketing and Export Management. Pearson Education Limited (UK), Fourth Edition, 2002 (with J. Strandskov, E. Duerr); Fundamentals of Marketing Research. Thousand Oaks, CA: Sage Publishers 2005 (with S.M. Smith); Research for Marketing Decisions. Englewood Cliffs, NJ: Prentice-Hall, Fifth Edition, 1988 (with P. Green and D. Tull).

    GRAPHIC AND COVER DESIGN: Myntillae Nash

    Published by Qualtrics Labs, Inc.

    2250 N. University Parkway #48C

    Provo, Utah, 84604, USA

    +1.801.374.6682

    Website Address

    www.Qualtrics.com

    Qualtrics and the Qualtrics logos are registered trademarks of Qualtrics Labs, Inc.


    Table of Contents

    Chapter 1: Designing Your Analysis: Measurement and Statistics
        Definitions in Marketing Measurement
        Building Blocks for Measurement and Models
        Concepts and Constructs
        Variables
        Measurement
        Propositions
        Integration into a Systematic Model
        Inaccuracies in Measurement
        Basic Concepts of Measurement and Scaling
        Primary Types of Scales
        Nominal Scales
        Ordinal Scales
        Interval Scales
        Ratio Scales
        The Relationship Between Scaling and Analysis
        Selecting the Correct Statistical Analysis Tool

    Chapter 2: Hypothesis Testing
        An Overview of the Analysis Process
        The Data Tabulation Process
        Defining Categories
        Editing and Coding
        Tabulation: Cleaning the Data
        Tabulation: Basic Analysis
        Formulating Hypotheses
        Making Inferences
        Statistics 101: What is a Population, a Sampling Distribution, and a Sample
        The Population
        Sample Distribution
        The Relationship Between the Sample and the Sampling Distribution
        Hypothesis Testing
        Power of a Test

    Chapter 3: Bivariate Data Analysis: Simple Associative Data Analysis, Cross Tabulation, T-Test and ANOVA
        Basic Concepts of Analyzing Associative Data
        Bivariate Cross-Tabulation
        Percentage
        Absolute Percentages Increase
        Relative Percentages Increase
        Percentages Possible Increase
        Introducing a Third Variable Into the Analysis
        Analysis and Interpretation
        Bivariate Analysis: Difference Between Sample Groups
        Bivariate Cross-Tabulation
        Cross-Tabulation with Chi-Square Analysis
        Cross-Tabulations and Computation of the Chi-Square Statistic
        Bivariate Analysis: Difference in Means and Proportions
        Testing Hypotheses
        Testing Multiple Group Means: Analysis of Variance
        The ANOVA Methodology
        One-Way (Single Factor) Analysis of Variance
        Conducting Follow-Up Tests of Treatment Differences

    Chapter 4: Bivariate Measures of Association: Two Variable Correlation and Regression Analysis
        Correlation Analysis
        Introduction to Bivariate Regression Analysis
        Parameter Estimation
        Assumptions of the Model
        Strength of Association
        Interpretation of Bivariate Regression

    Chapter 5: Multivariate Statistical Analysis I: Analyzing Criterion-Predictor Association
        An Overview of Multivariate Procedures
        The Data Matrix
        A Classification of Techniques for Analyzing Associative Data
        Analysis of Dependence
        Analysis of Interdependence
        Multiple and Partial Regression
        Parameter Estimation
        The Coefficient of Multiple Determination, R²
        Standard Errors and T-Tests
        Other Forms of Regression
        Multicollinearity
        Cross-Validation
        Two-Group Discriminant Analysis
        Objectives of Two-Group Discriminant Analysis
        A Geometric Example
        A Numerical Example
        Plotting the Discriminant Function
        Classifying the Persons
        Testing Statistical Significance
        Multiple Discriminant Analysis
        Other Criterion-Predictor Multivariate Techniques
        Probit and Logit Analysis
        Path Analysis/Causal Modeling

    Chapter 6: Multivariate Statistical Analysis Part II: Factor Analysis and Clustering Methods
        An Introduction to the Basic Concepts of Factor Analysis
        Identifying the Factors
        Interpreting the Factors
        Factor Scores
        Basic Concepts of Cluster Analysis
        Primary Questions
        Choice of Proximity Measure
        Selecting the Clustering Methods
        A Product-Positioning Example of Cluster Analysis
        Foreign Market Analysis
        Computer Analysis

    Chapter 7: Multivariate Statistical Analysis III: Multidimensional Scaling and Conjoint Analysis
        Multidimensional Scaling (MDS) Analysis
        Psychological Versus Physical Distance
        Classifying MDS Techniques
        Data Mode/Way
        Type of Geometric Model
        Collecting Data for MDS
        Marketing Applications of MDS
        An Introduction to the Fundamentals of Conjoint Analysis
        An Example of Full-Profile Conjoint Analysis
        Self-Explicated Conjoint Analysis
        Self-Explicated Data Collection Task
        Choice-Based Conjoint Analysis
        Max-Diff Conjoint Analysis
        Conjoint Reliability and Validity Checks
        Other Models
        Other Aspects of Conjoint Analysis
        Use of Visual Aids in Conjoint Analysis
        Strategic Aspects of Conjoint Analysis
        Applications of Conjoint Analysis
        Recent Developments in Conjoint Analysis

    Chapter 8: Preparing the Research Report
        Communication
        The Research Report
        Criteria for a Good Report
        Completeness
        Accuracy
        Conciseness
        Clarity
        Practitioners' Views
        Perspective #1: Strengthen Your Communications
        Perspective #2: Make It Readable
        Perspective #3: Make Your Results Talk
        Report Format and Organization
        Parts of the Formal Report
        Using Graphics
        Tables
        Charts and Graphs
        Line Chart
        Bar Chart
        Specialty Charts
        Dot Charts
        Other Graphic Aids
        The Oral Report and Presentation

    Appendix 1: Validity and Reliability of Measurement
        Writing Good Questions
        Strength of Question Wording
        Reducing Question Ambiguity
        Validity
        Content Validation
        Criterion Validation
        Construct Validation
        Reliability
        Test-Retest
        Alternative Forms
        Internal Consistency

    Appendix 2: Statistical Probability Tables
        Table A.1
        Table A.2
        Table A.3
        Table A.4


    Chapter 1

    Designing Your Analysis: Measurement and Statistics

    General concepts of measurement determine the specific types of analysis that are appropriate for your data.

    Survey research drives many marketing decisions of practical interest, including testing concepts for new products,

    positioning brand and corporate image, evaluating ad copy, and determining how to satisfy customers. Regardless of the

    research topic, you obtain useful data only when you exercise care and skill in defining the research problem, in developing the measurement instrument, in collecting the data, and in conducting the proper analysis.

    Building Your Survey, the second book in this Basic Marketing Research series, emphasized that survey building was

    much more than just adding questions to a form. In your survey building efforts, you must:

    1. Define the construct to be measured.

    2. Define how to measure the construct.

    3. Develop the measurement question.

    4. Develop the scale for construct measurement.

    5. Develop the data analysis plan.

    Definitions play a significant role in scientific inquiry, especially in marketing research and the behavioral sciences. In this chapter, we first explain conceptual and operational definitions and how they are used in research. We then discuss the importance of measurement scales in selecting appropriate statistical techniques.

    This section is an introduction to the statistical analyses presented in chapters 3-8. The overall quality of your research report depends not only on the appropriateness and adequacy of the research design and sampling techniques, but also on careful planning of your measurement and analysis procedures.

    Definitions in Marketing Measurement

    Marketing success is measured by new product ratings, increased brand awareness, brand likeability ratings, uniqueness,

    purchase intent, and customer satisfaction. In measuring each of these constructs, we often refer to models.

    A sales forecasting model that does not forecast sales accurately is worse than no sales forecasting model at all because of the impact on morale, hiring, and expenditures. Models represent reality and are judged by how well they represent reality on all significant issues. Your models will be judged against the criteria of quality and utility. Quality refers to your model's accuracy in describing and predicting reality, whereas utility refers to the value your model adds to decision making.

    Your model's accuracy further depends on two drivers: completeness and validity. Managers should not expect a model to make decisions for them but should instead view it as an additional piece of information to help make decisions. Managers clearly benefit when models are easy to understand and implement. But models for million-dollar decisions should be more complete than those used to make hundred-dollar decisions. Your model's value is determined by its sophistication and efficiency in helping you make decisions. You should use models only when they can help you get results faster with less expense or more validity.


    BUILDING BLOCKS FOR MEASUREMENT AND MODELS

    We cannot measure attitude, market share, sales, or any other concept without first understanding what we are

    measuring and how it is defined, formed, and related to other marketing variables. With this in mind, we briefly mention

    the building blocks of measurement theory: concepts, constructs, variables, operational definitions, and propositions.

    CONCEPTS AND CONSTRUCTS

    A concept is a theoretical abstraction formed by a generalization about particulars. Mass, strength, and love are concepts, as are advertising effectiveness, consumer attitude, and price elasticity. Constructs are also concepts, but they are observable and measurable and defined in terms of other constructs. For example, we may define the construct "attitude" as a learned tendency to respond consistently with respect to a given object. Attitudes are often measured as a sum of brand attribute performance and importance evaluations.

    VARIABLES

    Researchers loosely give the name variables to the constructs that they study. A question in a survey is a variable representing the construct in measured and quantified form. The answers report the different values that the respondents give to the variable.

    MEASUREMENT

    We can talk about consumer attitudes as if we know what the term means, but the term makes little sense until we define it in a specific, measurable way. An operational definition assigns meaning to a variable by specifying what is measured and how it is measured. It is a set of instructions defining how we are going to treat a variable. For example, expectancy value models use attitudes to predict behavioral intention (intention to try, intention to purchase, intention to recommend, or intention to re-purchase a product or service). First developed in the 1960s, this methodology has become a mainstay of marketing research and performs well in predicting both consumer purchase behavior and consumer satisfaction/dissatisfaction.

    The expectancy value model uses attitudes and beliefs in a mathematical formulation that links attitudes to intentions

    and finally to behavior in the following manner:

    Actual purchase of a Toyota Prius (Behavior, B) is approximated by intention to purchase a Toyota Prius (Behavioral Intention, BI), which, in addition to being measured directly, is approximated by the Overall Attitude toward a Toyota Prius.

    The Overall Attitude toward a brand (Toyota Prius) equals the sum of all relevant and important attitudes about the brand, $\sum_{i=1}^{k} a_i$. These attitudes are weighted by how important each attitude is in the purchase decision process ($e_i$).

    The overall attitude is modeled by multiplying $a_i$ (the person's liking of attribute $i$) by $e_i$ (the importance of attribute $i$ in the purchase decision).

    Mathematically, this is expressed as:

    $$B \approx BI \approx \text{Attitude}_{\text{Prius}} = \sum_{i=1}^{k} a_i e_i$$

    Operationally, $a_i$ is the affective (liking) component of the evaluation of attribute $i$. The evaluation would use a five- or seven-point scale with endpoints ranging from "Poor" to "Excellent" or "Not at all Desirable" to "Very Desirable." (See the far right panel of Figure 1.1.)

    Operationally, $e_i$ (the importance of attribute $i$ in the context of behavior B) is sometimes measured as the probability of attribute $i$ being associated with brand X. At other times, it is measured as the importance of attribute $i$ in achieving behavior B. Using the example of purchasing a Toyota Prius, the attribute "Gets 50 miles per gallon" could be rated on a seven-point scale with endpoints labeled "Very Unlikely" and "Very Likely" or, as in the example below, from "Not at all Important" to "Very Important." (See the left side of Figure 1.1.)

    Assume you are purchasing a Toyota Prius. Please evaluate the following attributes using the IMPORTANCE scale on

    the left and the PERFORMANCE scale on the right

    Figure 1.1 Expectancy Value Importance-Performance Rating Scales

    The expectancy value model predicts behavior by summing the attitudes and beliefs about a product's most important

    attributes (Figure 1.2). This powerful approach to measuring customer attitudes and predicting customer behavior is a

    mainstay in consumer research and the basis of many popular indices and methodologies.

    Figure 1.2 Expectancy Value Importance-Performance Rating Scales
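    As a minimal sketch of the arithmetic behind the expectancy value model, the Python snippet below computes one respondent's overall attitude as the sum of liking times importance across attributes. Only "Gets 50 miles per gallon" comes from the text; the other attribute names, all rating values, and the function name overall_attitude are assumptions invented for illustration.

```python
# Hypothetical ratings from one respondent evaluating a Toyota Prius.
# "performance" plays the role of a_i (liking) and "importance" the role
# of e_i, both on 7-point scales. All values are invented for illustration.
ratings = {
    "Gets 50 miles per gallon": {"importance": 7, "performance": 6},
    "Purchase price": {"importance": 5, "performance": 3},
    "Cargo space": {"importance": 3, "performance": 4},
}


def overall_attitude(ratings):
    """Return the sum of a_i * e_i over all rated attributes."""
    return sum(r["performance"] * r["importance"] for r in ratings.values())


score = overall_attitude(ratings)
max_score = 7 * 7 * len(ratings)  # ceiling when every rating is 7 on both scales
print(f"Overall attitude: {score} of a possible {max_score}")
```

    Because each product is weighted by importance, a poor rating on an unimportant attribute lowers the total far less than a poor rating on an important one.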


    PROPOSITIONS

    In the above example, we used propositions to define the relationship between variables. We must specify both the variables influencing the relationship and the form of the relationship. For example, the attitude toward a Prius was a function of attribute importance and performance, such that $\text{Attitude}_{\text{Prius}} = \sum_{i=1}^{k} a_i e_i$. This relationship can be made more complex by adding other intervening variables (like conformance to social norms and expectations) along with the relevant ranges for the effect, including where we would observe saturation effects, threshold effects, and the mathematical shape of the relationship (linear, curvilinear, etc.).

    INTEGRATION INTO A SYSTEMATIC MODEL

    A model is produced by linking propositions together to provide a meaningful explanation of a system or a process.

    Your survey then reflects your research plan that links concepts, constructs, variables, and propositions into a model.

    Conceptually, we should ask the following questions:

    Are the important concepts and propositions specified?

    Are the concepts relevant to solving the managerial research problem?

    Are the principal parts of the concept clearly defined?

    Is there consensus as to which concepts explain the problem?

    Do we link the concepts through clear assumptions made in the model?

    Can the model be readily quantified?

    Are the concepts properly defined, labeled, and measured?

    Are the concept measures specific enough to be operationally reliable and valid?

    Are the limitations of the model stated?

    Can the measures be analyzed to explain and predict?

    Can the model provide results for managerial decision making?

    Are the outcomes of the model supported by common sense?

    If the model does not meet the relevant criteria, it should be revised: concept definitions made more precise; variables

    redefined, added, or deleted; operational definitions and measurements tested for validity; and/or mathematical forms

    revised.

    Inaccuracies in Measurement

    Before delving into measurement scales and analysis, it is helpful to remember that measurements in marketing research are rarely exact. Inaccuracies in measurement arise from a variety of sources or factors that cause variations in respondent scores. Part of your job as a researcher is to prevent, or at least identify, control, and quantify inaccuracies in your analysis. You might attempt to identify some of these sources of inaccuracy:

    True differences in the characteristic or property

    Relatively stable respondent characteristics that affect scores (intelligence, extent of education, information processed)

    Transient personal factors (health, fatigue, motivation, emotional strain)

    Situational factors (usage occasions, distractions)

    Variations in administering the measuring instrument, such as interviewers

    Sampling items included in the instrument

    Lack of clarity (ambiguity, complexity, interpretation of words and context)

    Mechanical factors (lack of space to record response, appearance of instrument, browser incompatibility)

    Factors in the analysis (scoring, tabulation, statistical compilation)

    Variations not otherwise accounted for (chance), such as guessing an answer

    Ideally, variation within a set of measurements would represent only true differences in the characteristic being measured. Many sources of potential error exist in measurement. Measurement error has a constant (systematic) dimension and a random (variable) dimension. We expect random error to sum to zero, so it is less worrisome than nonrandom measurement error.
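    The difference between the two error dimensions can be illustrated with a small simulation. The sketch below is an assumption-laden toy, not a procedure from this book: the true score, the size of the bias, and the normally distributed random error are all invented. It shows that random error averages out over many measurements while a constant bias does not.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

TRUE_SCORE = 5.0       # hypothetical true value of the characteristic
SYSTEMATIC_BIAS = 0.8  # hypothetical constant error, e.g. a leading question
N = 10_000             # number of simulated measurements

# Measurements affected by random error only.
random_only = [TRUE_SCORE + random.gauss(0, 1) for _ in range(N)]

# Measurements affected by the same kind of random error plus a constant bias.
with_bias = [TRUE_SCORE + SYSTEMATIC_BIAS + random.gauss(0, 1) for _ in range(N)]

random_error_mean = statistics.mean(random_only) - TRUE_SCORE
biased_error_mean = statistics.mean(with_bias) - TRUE_SCORE

print(f"Average error, random only: {random_error_mean:+.3f}")
print(f"Average error, with bias:   {biased_error_mean:+.3f}")
```

    Collecting more data shrinks the first average toward zero, but the second stays near the size of the bias no matter how large the sample is.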

    Systematic error is a flaw in the measurement instrument, the research, or the sampling design. Unless the flaw is corrected, the researcher can do nothing to get valid results after the data is collected. These two subtypes of measurement error affect the validity and reliability of measurement and were discussed in Planning Your Study, the first volume of this series.

    Now that we are aware of the conceptual building blocks, the errors in measurement, and how both relate to developing measurement scales, we will consider the types of measurement and associated questions commonly used in marketing research.

    Basic Concepts of Measurement and Scaling

    Specifying the proper measurement scale is vital to all research. The type of measurement scale dictates not only the accuracy of the data, but also the specific analytical (statistical) techniques that are most appropriate for use in the analysis.

    Measurement is a way of assigning numbers to objects to represent the amounts or degrees of a property possessed by

    the objects. There are three characteristics or features of real number series measurement:

    1. ORDER: Are the numbers ordered?

    2. DISTANCE: Are the differences that exist between the ordered numbers constant or variable?

    3. ORIGIN: Does the series have a unique origin indicated by the number zero?

    A measurement scale allows you to measure and compare the quantities and changes in a variable. However, we measure

    the attributes or characteristics of objects, not the objects themselves.


    PRIMARY TYPES OF SCALES

    We can classify scales into four major categories: nominal, ordinal, interval, and ratio. Each scale possesses its own

    set of underlying assumptions about order, distance, and origin, and how well the numbers correspond with real-world

    entities. As our rigor in conceptualizing and measuring concepts increases, we can correspondingly upgrade the rigor of

    our statistical analysis (Table 1.1).

    For example, in measuring color, we may simply categorize colors (nominal scale), or we can measure the frequency of light waves (ratio scale). Ratio scale data possesses a natural zero and a constant unit of measurement and meets the requirements of more advanced statistical analyses. We in the behavioral sciences must frequently settle for less-precise data.

    Table 1.1 Scales of Measurement

    Scale | Mathematical Group Structure | Permissible Statistics | Typical Elements

    Nominal | Permutation group: y = f(x), where f(x) means any one-to-one correspondence | Mode; contingency coefficient | Numbering of baseball players; assignment of type or model numbers to classes

    Ordinal | Isotonic group: y = f(x), where f(x) means any strictly increasing function | Median; percentile; order correlation; sign test; run test | Hardness of minerals; quality of leather, lumber; Top 10 lists; Good-Better-Best

    Interval | General linear group: y = a + bx, b > 0 | Mean; average deviation; standard deviation; product-moment correlation; t-test; F-test | Temperature (Fahrenheit and centigrade); energy; calendar dates; Net Promoter Score; satisfaction ratings

    Ratio | Similarity group: y = cx, c > 0 | All types of statistical operations | Length, width, density; pitch scale, loudness scale; price utility


    NOMINAL SCALES

    Nominal scales are the simplest. They support only the most basic analyses. A nominal scale serves only as a label or tag to identify objects, properties, or events. A nominal scale does not possess order, distance, or origin. For example, we can assign numbers to baseball players or classify supermarkets into categories that carry our brand versus those that do not carry our brand.

    Using nominal scales, we can only count the stores that carry each brand in a product class and find the modal (highest

    number of mentions) brand carried. The usual statistical operations involving the calculations of means, standard

    deviations, and so on are not appropriate or meaningful for nominal scales.
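    Counting and finding the mode, the operations that nominal data supports, can be sketched in a few lines of Python. The store audit below is invented for illustration; the brand labels and counts are assumptions, not data from the text.

```python
from collections import Counter

# Hypothetical store audit: the brand each of eight stores carries.
# The labels are nominal: they name categories and carry no order or distance.
brands_carried = ["A", "B", "A", "C", "A", "B", "C", "A"]

counts = Counter(brands_carried)                   # frequency of each label
modal_brand, mentions = counts.most_common(1)[0]   # label with the most mentions

print(dict(counts))
print(f"Modal brand: {modal_brand} ({mentions} stores)")
```

    Renaming the brands, or recoding them as arbitrary numbers, would leave the counts and the modal category unchanged, which is why the mode is meaningful here and a mean is not.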

    ORDINAL SCALES

    Ordinal scales are ranking scales and possess the characteristic of order only. These scales require us to distinguish between objects according to a single attribute and direction.

    For example, when ranking a group of floor polish brands according to cleaning ability, we would assign the number 1 to

    the highest-ranking polish, 2 to the second-highest ranking polish, and so on. However, the mere ranking of brands does

    not quantify the differences separating brands with regard to cleaning ability. We do not know if the difference in cleaning ability between the brands ranked 1 and 2 is larger than, smaller than, or equal to the difference between the brands ranked 2 and 3.

    In dealing with ordinal scales, statistical description can employ positional measures such as the median, quartile,

    and percentile, or other summary statistics that deal with order among brands. As with the nominal scale, arithmetic

    averaging is not meaningful for ranked data.
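    As a minimal sketch of the summary statistics that remain meaningful at these two levels, the following Python fragment finds the modal brand for nominal data and the median rank for ordinal data. The store and judge data are hypothetical values invented for illustration, not results from any study.

```python
from collections import Counter
from statistics import median

# Hypothetical nominal data: the brand carried by each of eight stores.
brands = ["A", "B", "A", "C", "A", "B", "C", "A"]
brand_counts = Counter(brands)
modal_brand, modal_count = brand_counts.most_common(1)[0]

# Hypothetical ordinal data: the rank (1 = best) seven judges gave one polish.
ranks = [1, 2, 2, 3, 3, 3, 5]
median_rank = median(ranks)

print("Counts per brand:", dict(brand_counts))
print("Modal brand:", modal_brand, "carried by", modal_count, "stores")
print("Median rank:", median_rank)
```

    Counting and the mode use only the labels, and the median uses only the order of the ranks, so neither assumes equal distances between scale points.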

    INTERVAL SCALES

    Interval scales permit us to make meaningful statements about the differences separating two objects. This type of scale possesses the properties of order and constant units of distance, but the zero point of the scale is arbitrary.

    For example, an arbitrary zero is assigned to the Fahrenheit temperature scale, and equal temperature differences for each

    degree equate to equal volumes of expansion in the liquid used in the thermometer. However, it is not correct to state that

    any value on a specific interval scale is a multiple of another (50°F is not twice as hot as 25°F).

    Most ordinary statistical measures (such as arithmetic mean, standard deviation, and correlation coefficient) require only

    interval scales for their computation.

    RATIO SCALES

    Ratio scales represent the elite of scales and contain all the information of lower-order scales and more. These scales, like length and weight, possess a unique zero point, equal intervals, and the ability to make ratio statements. All types of

    statistical operations can be performed on ratio scales. An example of ratio-scale properties is that 3 yards is three

    times 1 yard.
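    The difference between interval and ratio scales can be checked numerically. The sketch below assumes the usual permissible transformations: a positive linear change of units, y = a + cx, for interval scales (Fahrenheit to Celsius) and a pure rescaling, y = cx with c > 0, for ratio scales (yards to feet). The function names are ours, chosen for illustration.

```python
def fahrenheit_to_celsius(f):
    """Interval-scale change of units: y = a + c*x with c > 0."""
    return (f - 32.0) * 5.0 / 9.0

def yards_to_feet(yards):
    """Ratio-scale change of units: y = c*x with c > 0."""
    return yards * 3.0

# Ratios of interval-scale values change when the units change.
ratio_f = 50.0 / 25.0
ratio_c = fahrenheit_to_celsius(50.0) / fahrenheit_to_celsius(25.0)

# Ratios of differences survive the change of units.
diff_ratio_f = (50.0 - 25.0) / (75.0 - 50.0)
diff_ratio_c = (
    (fahrenheit_to_celsius(50.0) - fahrenheit_to_celsius(25.0))
    / (fahrenheit_to_celsius(75.0) - fahrenheit_to_celsius(50.0))
)

# Ratios of ratio-scale values survive the change of units.
ratio_yards = 3.0 / 1.0
ratio_feet = yards_to_feet(3.0) / yards_to_feet(1.0)

print("50F/25F:", ratio_f, "same temperatures in Celsius:", round(ratio_c, 2))
print("Difference ratios:", diff_ratio_f, round(diff_ratio_c, 2))
print("3 yd / 1 yd:", ratio_yards, "in feet:", ratio_feet)
```

    The temperature ratio is 2 in Fahrenheit but negative in Celsius, which is why "twice as hot" is not a meaningful statement, while the statement that 3 yards is three times 1 yard holds in any unit of length.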

    THE RELATIONSHIP BETWEEN SCALING AND ANALYSIS

    The relationship among nominal, ordinal, interval, and ratio scales matters most when choosing statistics: marketing researchers who use descriptive statistics (arithmetic mean, standard deviation) and tests of significance (t-test, F-test) should require that the data be (at least) interval-scaled.

    From a purely mathematical point of view, you can obviously do arithmetic with any set of numbers and any scale. What is at issue is the interpretation and meaningfulness of the results. As we select more powerful measurement scales, our

    abilities to predict, explain, and otherwise understand respondent ratings also increase.

    The measurement properties of the final scale (nominal, ordinal, interval, or ratio) reflect the data collection task that

    the respondent is asked to perform (rank, rate, compare, fractionate, aggregate), and whether the scale measures the respondent, an object (stimuli), or both. Rating scale construction, when not treated as a serious task, is prone to create

    measurement error. Table 1.2 identifies nine operational issues that should be addressed when constructing a scale.

    Table 1.2 Operational Issues in Constructing a Rating Scale

    1. Should negative numbers be used?

    2. How many categories should be included?

    3. Related to the number of categories: Should there be an odd number or an even number? That is, should a

    neutral alternative be provided?

    4. Should the scale be balanced or unbalanced?

    5. Is it desirable to not force a substantive response by giving respondents an opportunity to indicate "don't know," "no opinion," or something similar?

    6. What should be done about halo effects, that is, raters giving favorable evaluations to all attributes of a stimulus object because they happen to like the particular object in general?

    7. How can raters' biases be examined: for example, the tendency to use extreme values or, perhaps, only the middle range of the response scale, or to overestimate the desirable features of the things they like (i.e., the generosity error)?

    8. How should descriptive adjectives for rating categories be selected?

    9. Should anchoring phrases be chosen for the scale's origin?

    Consider each of these issues during scale construction and again during the preliminary data cleaning and analysis

    phases of your study. Let's consider likely voters as an example. Some research suggests including a neutral/undecided

    option or further gradations of preference unless the researcher has a compelling reason not to do so. As shown in Figure

    1.3, the decision to change from a nominal scale to an interval scale will affect your data coding schema, your method of

    statistical analysis, and the message you are reporting.

    Figure 1.3 Presidential Voting Using a Nominal and Interval Scale

    Answers to questions such as these will vary according to the researcher's approach to the problem being studied. As

    further reading, the effect of research design on the reliability and validity of rating scales is discussed in two excellent review papers (Churchill and Peter, 1984; Peter and Churchill, 1986).

    In summary, the selected rating methods depend on the assumptions of the researcher and, if properly planned, can lead

    from categorical- to ordinal-, interval-, or even ratio-scaled responses. Your choice of data analysis technique is based on

    the scale type of the data available and will be discussed next.

    SELECTING THE CORRECT STATISTICAL ANALYSIS TOOL

    Statistical analysis can be intimidating, especially when it comes to answering the question, "Which statistical

    technique is right for me?" As we mentioned previously, the scale of measurement determines the options that you have

    available for statistical analysis. Table 1.2 is a lookup table that will help you identify which statistical analysis works for

    you.

    We look up the appropriate statistical technique by entering two important pieces of information. On the left side of the

    table, we select the number of variables to be entered in the analysis (1, 2, 3, 4, or more) and the type of scale (discrete

    refers to nominal/ordinal scales, and continuous refers to interval/ratio scales). On the top of the table, we select the

    family of statistical analysis we want to use. Non-parametric analyses are best suited for nominal and ordinal data and

    parametric analyses are best suited for interval and ratio data.

    For example, the voting question from Figure 1.3(a) is scaled as a discrete (categorical) variable. According to the table,

    we cannot use parametric statistics like a mean or standard deviation. Instead, we would report the percentage of

    potential voters favoring each of the candidates.

    If we are interested in analyzing the relationship between candidate preference (Obama vs. Romney) and voter gender (men vs. women), then

    we would look up two variables, both discrete (nominal). In the table, we see a number of choices, including the Chi-square

    statistic, which is applied here to a 2 row x 2 column cross tabulation table. Cross tabulation tables, as discussed in

    Chapter 3, are one of the most common forms of data analysis.
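    To make the lookup concrete, here is a minimal sketch of the Chi-square statistic for a 2 row x 2 column cross tabulation. The counts are hypothetical and the candidates are labeled A and B; they are not survey results. The function name chi_square is ours.

```python
def chi_square(table):
    """Pearson Chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic

# Hypothetical counts. Rows: men, women. Columns: candidate A, candidate B.
observed_counts = [
    [40, 60],
    [60, 40],
]
statistic = chi_square(observed_counts)
degrees_of_freedom = (len(observed_counts) - 1) * (len(observed_counts[0]) - 1)

print("Chi-square:", statistic, "with", degrees_of_freedom, "degree of freedom")
```

    For one degree of freedom, the critical value at the 0.05 level is 3.841, so a statistic above that value would lead us to reject the hypothesis that candidate preference and gender are unrelated in these illustrative data.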

    Extending the analysis to three, four, or more continuous variables (interval or ratio scale) opens a

    variety of parametric statistical options, including multiple regression analysis, factor analysis, and cluster analysis. Each

    of these tools provides a different view of the data and is discussed in later chapters.

    Table 1.2 Statistical Analysis Lookup Table

    Columns: Type of Data; Non-Parametric (Nominal-Ordinal); Parametric (Interval-Ratio)

    One Variable in the Analysis

    1. One Discrete
       Non-Parametric: a. Percentage
       Parametric: (Impossible)

    2. One Continuous
       Non-Parametric: a. Median, Mode; b. Quartile range
       Parametric: a. Mean; b. Std. Deviation

    Two Variables in the Analysis

    1. Two Discrete
       Non-Parametric: a. Chi Square; b. Phi Coefficient (2 rows x 2 columns); c. Contingency Coefficient; d. Tetrachoric Corr. (2x2); e. Yule's Q (2x2); f. Lambda
       Parametric: a. Conjoint Analysis (2 Attribute Tradeoff); b. Correspondence Analysis (2 rows x 2 columns)

    2. One Discrete, One Continuous
       Parametric: a. Student t (if dependent variable is dichotomous); b. One-way ANOVA

    3. Two Continuous
       Non-Parametric: a. Spearman Rank Correlation; b. Kendall's R; c. Gamma
       Parametric: a. Pearson Correlation; b. Eta Curvilinear Correlation; c. Simple Regression

    Three Variables

    1. Three Discrete
       Non-Parametric: a. Cross Tabulation
       Parametric: a. Conjoint Analysis; b. Multi-Dimensional Scaling; c. Correspondence Analysis (Multiple)

    2. Two Discrete, One Continuous
       Parametric: a. ANOVA (2 Way)

    3. One Discrete, Two Continuous
       Non-Parametric: a. Comparison of Proportions
       Parametric: a. Analysis of Covariance; b. Comparison of two Correlations (if dependent variable is multichotomy); c. Multiple Discriminant Analysis

    4. Three Continuous
       Non-Parametric: a. Kendall's W
       Parametric: a. First Order Partial Correlation; b. Multiple Regression; c. Multiple Correlation

    Four Variables

    1. K Discrete
       Non-Parametric: a. Cross Tabulation; b. Non-Metric MDS (Multi-Dimensional Scaling)
       Parametric: a. Conjoint Analysis; b. Quasi Metric MDS (Multi-Dimensional Scaling)

    2. One Discrete, K Continuous
       Parametric: a. Probit Analysis (for normal probability distribution); b. Logit Analysis (for logistic probability distribution); c. Multiple Discriminant Analysis

    3. K Continuous Variables
       Parametric: a. Second-order partial correlation; b. Multiple Regression; c. Multiple Correlation; d. Factor Analysis; e. Canonical Correlation; f. Fully Metric Multi-Dimensional Scaling

    4. One Continuous, K Discrete
       Parametric: a. CART (Classification and Regression Tree); b. Conjoint; c. Multiple Regression (Dummy)

    Summary

    In this chapter, we focused on general concepts of measurement and the relationship to data analysis. We discussed

    philosophy of science and the role of concepts, constructs, variables, operational definitions, and propositions. This broad

    foundation provides a unique perspective on how to conceptualize your study variables.

    We then examined measurement, what it is and how different levels of measurement relate to developing scales for your

    variables. We also presented a variety of scale construction issues that you should consider in your data analysis.

    As a final topic, we demonstrated the link between your choice of measurement scales and the correct statistical analysis

    tools available to you. Nominal and ordinal scales were shown to require a less powerful set of statistical analysis tools

    than interval and ratio scales. You can determine the statistical tool appropriate for your specific problem by identifying

    the number of variables to be analyzed and their level of measurement.

    Statistical analysis tools will help you analyze and predict from attitude scales, preference ratings, and the like. As you

    increase your skill with these tools, you will become more proficient at transforming your measures (attitudes, sales) into

    models, such as estimates of market share, that are of more direct value in making strategic decisions. However, the

    goal of conducting the perfect research study is still to be achieved because we continue to struggle when translating

    verbalized product ratings, attitudes about brands, and stated purchase intentions into estimates of sales, market share,

    and profitability that reflect specific marketing actions.

    Chapter 2

    Hypothesis Testing

    Hypothesis formulation and testing is basic to conducting statistical analysis and making inferences about the population.

    An Overview of the Analysis Process

    Scientific research requires inquiry and testing of alternative explanations of what appears to be fact. For behavioral researchers, this scientific inquiry translates into a desire to ask questions about the nature of relationships that affect behavior within markets. Hypotheses are formulated and tested to determine (1) what relationships exist and (2) when and where these relationships hold.

    The overall process of analyzing and making inferences from sample data is a refinement process involving a number of separate and sequential steps identified as part of three broad stages:

    1. TABULATION: Identifying appropriate categories for the information desired, sorting data by categories, making initial counts of responses, and using summarizing measures to provide economy of description to facilitate understanding.

    2. FORMULATING ADDITIONAL HYPOTHESES: Using the inductions derived from the data concerning the relevant variables, their parameters, their differences, and their relationships to suggest working hypotheses not originally considered.

    3. MAKING INFERENCES: Reaching conclusions about the variables that are important, their parameters, their differences, and the relationships among them.

    The first stage in the analysis process includes editing, coding, and tabulating and cross tabulating responses. In the current chapter, we extend this first stage to include testing relationships and formulating hypotheses. We will address making inferences based on these hypotheses in Chapter 3.

    In formulating hypotheses, researchers use interesting variables and consider their relationships to each other to find suggestions for working hypotheses that, originally, may or may not have been considered.

    THE DATA TABULATION PROCESS

    Data tabulation consists of three steps:

    1. DEFINE THE RESPONSE CATEGORIES: Define appropriate categories for coding the information collected.

    2. EDITING AND CODING DATA: Assign codes to the respondents' answers.

    3. TABULATION, INCLUDING ERROR CHECKING AND HANDLING MISSING DATA: Tabulate first to check the data file for uncoded open-end responses, miscodings, and partial responses in coding or data entry. Once errors and omissions are identified and corrected, further analysis may proceed.

    As simple as these steps are from a technical standpoint, data management is the most important step in assuring a quality analysis and therefore merits an introductory discussion.

    DEFINING CATEGORIES

    The raw input to most data analyses consists of the basic data matrix, as shown in Table 2.1. In most data matrices, each

    row contains a respondent's data, and the columns identify the variables or data fields collected for the respondent. Each

    column of categorical data is analyzed to tabulate data counts in each of the response categories. For interval or ratio

    data, the analysis might involve the computation of the mean and standard deviation. Because we summarize the data

    and make inferences from it, the data must be accurate.

    Table 2.1 Illustration of Data Matrix

    Depending on the data collection method, data code sheets can be prepared and pre-coded. In online data collection,

    Qualtrics automates this entire process by not only defining the questions and response categories in the database but

    also automatically recording the completed responses as they are submitted. Response categories are coded from 1 for

    the first category to the highest value for the last category. Category values can be recoded to assign different numbers

    as desired by the researcher. After responses are collected, data may be analyzed online or exported to Microsoft Excel or a dedicated statistical analysis program such as SPSS.
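    The coding rule just described (1 for the first category up to the highest value for the last) and a later recode can be sketched in a few lines of Python. The five agreement categories and the sample answers are hypothetical, and this is a generic illustration rather than a description of how Qualtrics stores data internally.

```python
# Hypothetical ordered response categories for one question.
categories = ["Strongly disagree", "Disagree", "Neutral", "Agree", "Strongly agree"]

# Code the first category as 1 and the last category with the highest value.
codebook = {label: code for code, label in enumerate(categories, start=1)}

# Hypothetical answers from four respondents.
answers = ["Agree", "Neutral", "Strongly agree", "Agree"]
coded = [codebook[answer] for answer in answers]

# Recode: reverse the scale so that 1 becomes 5, 2 becomes 4, and so on.
highest_code = len(categories)
reverse_coded = [highest_code + 1 - code for code in coded]

print("Codebook:", codebook)
print("Coded answers:", coded)
print("Reverse-coded answers:", reverse_coded)
```

    Keeping the codebook as an explicit mapping makes any later recode easy to audit against the original category labels.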

    Categories can sometimes only be defined after the data have been collected. This is usually the case when researchers

    use open-end text questions, unstructured interviews, and projective techniques.

    The selection of categories is controlled by both the purposes of the study and the nature of the responses. Useful

    classifications meet the following conditions:

    1. CATEGORIES REPRESENT SIMILAR RESPONSES. Each category should contain responses that, for purposes of the study, are sufficiently similar to be considered homogeneous.

    2. RESPONSES DIFFER BETWEEN CATEGORIES. Differences in category descriptions should be great enough to disclose any important distinctions in the characteristic being examined.

    3. CATEGORIES ARE MUTUALLY EXCLUSIVE. Category descriptions should be unambiguous and defined so that any response can be placed in only one category.

    4. CATEGORIES SHOULD BE EXHAUSTIVE. The classification schema should provide categories for all responses.
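    Conditions 3 and 4 can be checked mechanically for numeric categories. The sketch below assumes each category is a half-open range [low, high) and uses hypothetical age brackets; the function names classify and check_scheme are ours.

```python
def classify(value, categories):
    """Return the label of every category whose [low, high) range holds value."""
    return [label for label, (low, high) in categories.items() if low <= value < high]

def check_scheme(values, categories):
    """Return (values matching no category, values matching several categories)."""
    unclassified = [v for v in values if len(classify(v, categories)) == 0]
    ambiguous = [v for v in values if len(classify(v, categories)) > 1]
    return unclassified, ambiguous

# Hypothetical brackets with two deliberate flaws: age 35 falls into two
# brackets, and no bracket covers ages 65 and over.
flawed_scheme = {"18-35": (18, 36), "35-49": (35, 50), "50-64": (50, 65)}

# Corrected brackets: no overlaps, and an open-ended top category.
fixed_scheme = {
    "18-34": (18, 35),
    "35-49": (35, 50),
    "50-64": (50, 65),
    "65 and over": (65, float("inf")),
}

ages = [18, 35, 49, 50, 70]
print("Flawed scheme:", check_scheme(ages, flawed_scheme))
print("Fixed scheme:", check_scheme(ages, fixed_scheme))
```

    An empty first list means the scheme is exhaustive for the values tested, and an empty second list means it is mutually exclusive for them.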

    Using extensive open-ended questions often provides rich contextual and anecdotal information but is often associated

    with fledgling researchers. Open-end questions have their place in marketing research but have inherent difficulties in

    questionnaire coding and tabulation and are more burdensome to the respondent. Any open-end questions should be

    carefully checked to see if a closed-ended question (i.e., check the appropriate box) can be substituted without altering

    the intent of the question.

    EDITING AND CODING

    Editing is the process of reviewing the questionnaire and data to ensure maximum accuracy and clarity. Careful editing during the pre-test process catches misinterpreted instructions, recording errors, and other problems and eliminates them

    from the later stages of the study. Additionally, pretesting allows you to question interviewers while the material is still

    relatively fresh in their minds. Obviously, this has limited application for printed questionnaires, but you can edit online

    surveys while collecting data.

    Qualtrics has an interesting testing wizard feature that completes the survey by making random choices and following the various logic paths found in the survey. The resulting test data can be checked for correctness.

    Once pretesting is done and the data are collected, the actual editing begins. Editing involves judging the quality of responses on the following criteria:

    1. REASONABILITY OF ENTRIES. Data must make sense in order to be used. When sentences or thoughts are not complete or are poorly spelled or written, it may be possible to infer the response from other data collected; however, if any real doubt exists about the meaning of data, that data should not be used.

    2. COMPLETENESS OF ENTRIES. On a fully structured collection form, the absence of an entry is ambiguous. It may mean that the respondent could not or would not provide an answer, that the interviewer failed to ask the question, or that the interviewer failed to record collected data.

    3. CONSISTENCY OF ENTRIES. Inconsistent responses raise questions concerning the validity of every response. (If respondents indicate that they do not watch game shows, for example, but a later entry indicates that they watched Wheel of Fortune twice during the past week, an obvious question arises as to which entry is correct.)

    4. ACCURACY OF ENTRIES. An editor should keep an eye out for any indication of inaccuracy in the data. Of particular importance is detecting any repetitive response patterns in the reports of individual interviews. Such patterns may well be indicative of systematic interviewer or respondent bias or dishonesty.
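    Two of these checks lend themselves to simple automation. The sketch below flags the game-show inconsistency described in item 3 and one kind of repetitive response pattern from item 4 (identical answers on every rating item). The records and field names are hypothetical, and flagged cases still need an editor's judgment.

```python
# Hypothetical respondent records; the field names are illustrative only.
records = [
    {"id": 1, "watches_game_shows": False, "wheel_of_fortune_views": 2,
     "ratings": [3, 4, 2, 5, 4]},
    {"id": 2, "watches_game_shows": True, "wheel_of_fortune_views": 1,
     "ratings": [4, 4, 4, 4, 4]},
    {"id": 3, "watches_game_shows": False, "wheel_of_fortune_views": 0,
     "ratings": [2, 3, 3, 1, 4]},
]

def is_inconsistent(record):
    """Claims not to watch game shows but reports watching one last week."""
    return (not record["watches_game_shows"]) and record["wheel_of_fortune_views"] > 0

def is_repetitive(record):
    """Gave the identical answer to every rating item."""
    return len(set(record["ratings"])) == 1

inconsistent_ids = [r["id"] for r in records if is_inconsistent(r)]
repetitive_ids = [r["id"] for r in records if is_repetitive(r)]

print("Inconsistent entries, respondent ids:", inconsistent_ids)
print("Repetitive rating patterns, respondent ids:", repetitive_ids)
```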

    Coding is the process of assigning respondent answers to data categories. Numbers are assigned to identify the answers with the categories. Pre-coding refers to the practice of assigning codes to categories. Sometimes these codes are part of

    the answer scale that is shown when the data are collected.

    Post-coding is the assignment of codes to responses after the data are collected and is most often required when

    responses are reported in an unstructured format (open-ended text or numeric input). Careful interpretation and good

    judgment are required to ensure that the meaning of the response and the meaning of the category are consistently and

    uniformly matched.

    Like good questionnaire construction, good coding of open-ended text requires training and supervision. Coding is an

    activity that should not be taken lightly. Improper coding leads to poor analyses and may even constrain the types of

    analysis that you can complete.

    TABULATION: CLEANING THE DATA

    The purpose of the initial data cleaning tabulation is to identify outliers, missing data, small category counts, and other

    issues that will lead to data analysis problems. One advantage of online research is that data entry errors are avoided

    and many data cleaning tasks are not required.

    Missing data occurs when respondents do not provide responses for all the questions. The preferred way to handle a

    nonresponse is to treat the nonresponse as missing data in the analysis. Statistical software programs can handle this

    either question-by-question or by deleting the respondent with a missing value from all analyses. Also, the researcher can

    eliminate a respondent from the data set if there is too much missing data. Yet another way is to assign the group's mean

    value to the missing items. Or, when an item is missing from a multi-item measure, you can substitute the respondent's mean value on the remaining items for the missing value.
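    The four treatments of missing data described above can be compared side by side. This is a minimal sketch on hypothetical ratings for three items, with None marking a nonresponse; statistical packages offer the same options under names such as pairwise and listwise deletion.

```python
from statistics import mean

# Hypothetical ratings on a three-item measure; None marks a nonresponse.
data = [
    {"q1": 4, "q2": 5, "q3": 3},
    {"q1": 2, "q2": None, "q3": 4},
    {"q1": 5, "q2": 3, "q3": None},
    {"q1": 3, "q2": 4, "q3": 5},
]
items = ["q1", "q2", "q3"]

# Question-by-question: each item mean uses every respondent who answered it.
item_means = {q: mean(r[q] for r in data if r[q] is not None) for q in items}

# Deleting the respondent from all analyses: keep only complete records.
complete_cases = [r for r in data if all(r[q] is not None for q in items)]

# Group mean substitution: fill a gap with the item's mean across respondents.
group_filled = [
    {q: (r[q] if r[q] is not None else item_means[q]) for q in items}
    for r in data
]

def fill_with_own_mean(record):
    """Fill a gap with the respondent's own mean on the items they answered."""
    own_mean = mean(record[q] for q in items if record[q] is not None)
    return {q: (record[q] if record[q] is not None else own_mean) for q in items}

own_filled = [fill_with_own_mean(r) for r in data]

print("Item means:", item_means)
print("Complete cases:", len(complete_cases))
print("Group-mean fill, respondent 2:", group_filled[1])
print("Own-mean fill, respondent 2:", own_filled[1])
```

    In this illustration the second respondent's missing q2 becomes 4 under group mean substitution but 3 under respondent mean substitution, which shows that the choice of treatment can change the data being analyzed.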

    TABULATION: BASIC ANALYSIS

    Tabulation is the final step in the data collection process and the first step in the analytical process. The most basic

    tabulation consists of counting the number of responses that occur in each of the data categories that comprise a

    variable. Cross-tabulation involves simultaneously counting the number of observations that occur in each of the data

    categories of two or more variables. An example is given in Table 2.2. A cross-tabulation is one of the more commonly

    employed and useful forms of tabulation for analytical purposes.

    Table 2.2 Cross-Tabulation: Energy Drink Purchases by Income Classes of Respondents*

                           Number of liters purchased
    Income Class           Zero    One    Two    Three or more    Total
    Less than $15,000       160     25     15                0      200
    $15,000 - $34,999       120     15     10                5      150
    $35,000 - $54,999        60     20     15                5      100
    $55,000 - $74,999         5     10      5                5       25
    $75,000 and over          5      5      5               10       25
    Total                   350     75     50               25      500
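    The marginal totals in Table 2.2 can be reproduced, and row percentages added, with a short script. The counts below are copied from the table; the row percentages are a derived quantity added here only to illustrate how a cross-tabulation is typically read.

```python
# Counts from Table 2.2. Columns: zero, one, two, three or more liters.
counts = {
    "Less than $15,000": [160, 25, 15, 0],
    "$15,000 - $34,999": [120, 15, 10, 5],
    "$35,000 - $54,999": [60, 20, 15, 5],
    "$55,000 - $74,999": [5, 10, 5, 5],
    "$75,000 and over": [5, 5, 5, 10],
}

row_totals = {income: sum(row) for income, row in counts.items()}
column_totals = [sum(column) for column in zip(*counts.values())]
grand_total = sum(row_totals.values())

# Row percentages: the share of each income class buying each quantity.
row_percentages = {
    income: [100 * cell / row_totals[income] for cell in row]
    for income, row in counts.items()
}

print("Row totals:", row_totals)
print("Column totals:", column_totals, "Grand total:", grand_total)
for income, percentages in row_percentages.items():
    print(income, [round(p, 1) for p in percentages])
```

    Reading across the rows, 80 percent of the lowest income class purchased no energy drinks, while 40 percent of the highest income class purchased three or more liters.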

    FORMULATING HYPOTHESES

    A hypothesis is an assertion that variables (measured concepts) are related in a specific way that explains certain facts or

    phenomena. From a practical standpoint, hypotheses may be developed to solve a problem, answer a question, or imply a

    possible course of action. Outcomes are predicted if a specific course of action is followed. Hypotheses must be empirically

    testable. The hypothesis may be stated informally as a research question, or more formally as an alternative hypothesis, or in a testable form known as a null hypothesis. Today's informality has caused research questions to be used more frequently than formal hypotheses. The relationship between hypothesis and research questions is summarized in Table 2.3.

    Table 2.3 Hypotheses and Research Questions

    Research Question
       Purpose: Express the purpose of the research.
       Example: What is the perception of Mingles customers regarding the price-value of the food?
       Decision: None used.

    Alternative Hypothesis
       Purpose: The alternative hypothesis states the specific nature of the hypothesized relationship; i.e., that there is a difference. The alternative hypothesis is the opposite of the null hypothesis. The alternative hypothesis cannot be falsified because a relationship hypothesized to exist may not have been verified, but may in truth exist in another sample. (You can never reject an alternative hypothesis unless you test all possible samples of the population.)
       Example: Mingles is perceived as having superior food value for the price when compared to an average restaurant.
       Decision: Not tested because we cannot reject. We may only accept that a relationship exists.

    Null Hypothesis
       Purpose: The null hypothesis is testable in the sense that the hypothesized lack of relationship can be tested. If a relationship is found, the null hypothesis is rejected. The null hypothesis states that there is no difference between groups (with respect to some variable) or that a given variable does not predict or otherwise explain an observed phenomenon, effect, or trend.
       Example: There is no difference in perceived food value for the price for Mingles and an average restaurant.
       Decision: We may reject a null hypothesis (find a relationship). We may only tentatively accept that no relationship exists.

    Objectives and hypotheses shape the study, so they must be clearly stated from the beginning; they determine the kinds of questions to be asked, the measurement scales for the data to be collected, and the kinds of analyses that will be

    necessary. However, as the project progresses, new hypotheses will turn up. They especially develop as data collection

    transitions to the final interpretation of the findings.

    In contrast to the strict procedures of the scientific method, where hypothesis formulation must precede the collection of

    data, actual research projects almost always formulate and test new hypotheses during the project. It is both acceptable

    and desirable to expand the analysis to examine new hypotheses to the extent that the data permit. At one extreme, it

    may be possible to show that the new hypotheses are not supported by the data and that no further investigation

    should be considered. At the other extreme, a hypothesis may be supported by both the specific variables tested and by

    other relationships that can be interpreted similarly. The converging results from these separate parts of the analysis

    strengthen the case that the hypothesized relationship is correct. Between these extremes of nonsupport-support are

    outcomes of indeterminacy: the new hypothesis is neither supported nor rejected by the data. Even this result may

    indicate the need for an additional collection of information.

    In a position even further from the scientific method, we recognize that it is rarely possible to formulate precise

    hypotheses independently of the data. This means that most survey research is essentially exploratory. Rather than

    having a single pre-designated hypothesis in mind, the analyst often works with many diffuse variables that provide a

    slightly different approach and perspective on the situation and problem. The added cost of an extra question is so low

    that the same survey can be used to investigate many problems without increasing the total cost. However, researchers

    must resist the syndrome of "just one more question." Often, that "one more question" escalates into many more questions

    of the type "it would be nice to know," that are often unrelated to the research objectives. In a typical survey project,

    the analyst may alternate between searching the data (analyzing) and formulating hypotheses. Obviously, there are

    exceptions to all general rules and phenomena. Selvin and Stuart (1966), therefore, designate three practices of survey

    analysts:

    1. SNOOPING: The process of searching through a body of data and looking at many relations in order to find those

    worth testing (that is, there are no pre-designated hypotheses).

    2. FISHING: The process of using the data to choose which of a number of pre-designated variables to include in

    an explanatory model.

    3. HUNTING: The process of testing from the data all of a pre-designated set of hypotheses.

    This investigative approach is reasonable for basic research but may not be practical for decisional research. Time and

    resource pressures require that directed problem solving be the focus of decision research. Rarely can the decision maker

    afford the luxury of dredging through the data to find all of the relationships that must be present. Again, whether to

    examine all of the relationships simply becomes a question of cost versus value.

    MAKING INFERENCES

    Testing hypotheses is the broad objective that underlies all decisional research. Sometimes the population as a whole

    can be measured and profiled in its entirety. Often, however, we cannot measure everyone in the population but instead

    must estimate the population using a sample of respondents drawn from the population. In this case, we estimate

    the population parameters using the sample statistics. Thus, in both estimation and hypothesis testing, we make

    inferences about the population of interest on the basis of information from a sample. We often will make inferences

    about the nature of the population and ask a multitude of questions, such as: Does the sample's mean satisfaction differ from the mean of the population of all restaurant patrons? Does the magnitude of the observed differences between

    categories indicate that actual differences exist, or are they the result of random variations in the sample?

    In other studies, it may be sufficient to simply estimate the value of certain parameters of the population, such as the amount of our product used per household, the proportion of stores carrying our brand, or the preferences of housewives concerning alternative styles or package designs of a new product. Even in these cases, however, we would want to know about the underlying associated variables that influence preference, purchase, or use (color, ease of opening, accuracy in dispensing the desired quantity, comfort in handling, etc.), if not for purposes of the immediate problem, then for solving later problems. In other case studies, it might be necessary to analyze the relationships between the enabling or situational variables that facilitate or cause behavior. Knowledge of these relationships will enhance our ability to make reliable predictions when decisions involve changes in controllable variables.

    Statistics 101: What is a Population, a Sampling Distribution, and a Sample?

    THE POPULATION

    Suppose there is a population consisting of only seven persons. On a specific topic, these seven persons each have a different opinion, measured on a 7-point scale ranging from "very strongly agree" to "very strongly disagree." The frequency distribution of this population is shown in the bar chart of Figure 2.2.

    Figure 2.2 Population Frequency Distribution

    The parameters describe this population as having a mean of μ = 4 and a standard deviation of σ = 2.

    SAMPLING DISTRIBUTION

    Next, we turn to the sampling distribution for our example data. Assume for a moment that, like most populations, ours is so large that we are not able to measure all persons in this population but must rely instead on a sample. Switching back to our example, the population is not large, and we will assume a sample size of n = 2.

    The sampling distribution is the distribution of sample means from all possible samples of size n = 2. The sampling distribution of means and standard errors is shown in Table 2.4.

    Table 2.4 Computation of Sampling Distribution, Mean, and Standard Error

    ALL POSSIBLE SAMPLES    SAMPLE MEAN    ERROR    SQUARED ERROR
    1, 2                    1.5            -2.5     6.25
    1, 3                    2              -2       4
    1, 4                    2.5            -1.5     2.25
    1, 5                    3              -1       1
    1, 6                    3.5            -0.5     0.25
    1, 7                    4              0        0
    2, 3                    2.5            -1.5     2.25
    2, 4                    3              -1       1
    2, 5                    3.5            -0.5     0.25
    2, 6                    4              0        0
    2, 7                    4.5            0.5      0.25
    3, 4                    3.5            -0.5     0.25
    3, 5                    4              0        0
    3, 6                    4.5            0.5      0.25
    3, 7                    5              1        1
    4, 5                    4.5            0.5      0.25
    4, 6                    5              1        1
    4, 7                    5.5            1.5      2.25
    5, 6                    5.5            1.5      2.25
    5, 7                    6              2        4
    6, 7                    6.5            2.5      6.25

    The mean of all possible two-member sample means is

    $\mu_{\bar{x}} = \frac{\sum \bar{x}}{21} = \frac{84}{21} = 4.0$

    and summing the squared errors in the last column of Table 2.4 gives

    $\sum (\bar{x} - \mu)^2 = 35$

    which gives a standard deviation of

    $\sigma_{\bar{x}} = \sqrt{\frac{\sum (\bar{x} - \mu)^2}{21}} = \sqrt{\frac{35}{21}} = 1.29$

    The sampling distribution becomes more normal as the sample size increases, and even in this simple case, we observe

    a somewhat normal shape (see Exhibit 2.2 part II). Also, the population mean μ = 4 is always equal to the mean of the

    sampling distribution of all possible sample means, $\mu_{\bar{x}}$.
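The enumeration behind Table 2.4 can be checked with a short script. This is a minimal sketch that assumes the seven opinions are the scores 1 through 7, as the sample pairs in the table imply; the variable names are illustrative only.

```python
from itertools import combinations
from math import sqrt

# Assumed population: one person at each point of the 7-point scale
population = [1, 2, 3, 4, 5, 6, 7]
N = len(population)

# Population parameters
mu = sum(population) / N
sigma = sqrt(sum((x - mu) ** 2 for x in population) / N)

# Every possible sample of size n = 2, drawn without replacement
samples = list(combinations(population, 2))
sample_means = [sum(s) / len(s) for s in samples]

# Mean of the sample means, summed squared errors, and the
# standard deviation of the sampling distribution
mean_of_means = sum(sample_means) / len(sample_means)
sum_sq_errors = sum((m - mu) ** 2 for m in sample_means)
sd_of_means = sqrt(sum_sq_errors / len(sample_means))

print(mu, sigma)                                   # 4.0 2.0
print(len(samples), mean_of_means, sum_sq_errors)  # 21 4.0 35.0
print(round(sd_of_means, 2))                       # 1.29
```

The 21 sample means average to exactly the population mean, while their spread (1.29) is smaller than the population standard deviation of 2.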

    THE RELATIONSHIP BETWEEN THE SAMPLE AND THE SAMPLING DISTRIBUTION

    When we draw a sample, we rarely know anything about the population, including its shape, μ, or σ. We must, therefore, compute statistics from the sample ($\bar{x}$ and s). We use these sample statistics to make inferences about the population (μ

    and σ).

    Suppose we were to repeatedly draw samples of n = 2 (without replacement). The relevant statistics for the first of these

    samples, having the values (1, 2), are:

    $\bar{x} = \frac{1 + 2}{2} = 1.5$

    $s = \sqrt{\frac{(1 - 1.5)^2 + (2 - 1.5)^2}{2 - 1}} = 0.707$

    $s_{\bar{x}} = \frac{s}{\sqrt{n}} = \frac{0.707}{\sqrt{2}} = 0.5$

    Given this $\bar{x}$ and $s_{\bar{x}}$, we can now estimate, with a given probability, the intervals that give a range of possible values that

    could include μ, the population mean. Using the t distribution with n - 1 = 1 degree of freedom, for this single sample they are:

    90% = 1.5 ± 6.31 (.5) or -1.655 to 4.655

    95% = 1.5 ± 12.71 (.5) or -4.855 to 7.855

    98% = 1.5 ± 31.82 (.5) or -14.41 to 17.41

    Thus, we could state that we are 98% confident that the population mean would fall within the interval -14.41 to 17.41.

    This range is very wide and includes our population mean of 4.0. The size of the range is large because of the small

    sample size (n = 2). As the sample size increases, the degrees of freedom become larger, the t multipliers shrink, and the t distribution gradually approximates the standard

    normal distribution.
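The interval arithmetic for the sample (1, 2) can be reproduced as a rough sketch. It relies on one fact that the text does not state: with a single degree of freedom the t distribution is the Cauchy distribution, so its critical values have the closed form tan(π(p - 0.5)). Under that form, the multipliers 6.31, 12.71, and 31.82 correspond to two-sided confidence levels of 90%, 95%, and 98%. The helper name `t_critical_df1` is hypothetical.

```python
from math import sqrt, tan, pi

sample = [1, 2]
n = len(sample)

# Sample mean, sample standard deviation (n - 1 divisor), standard error
x_bar = sum(sample) / n
s = sqrt(sum((x - x_bar) ** 2 for x in sample) / (n - 1))
se = s / sqrt(n)


def t_critical_df1(confidence):
    """Two-sided critical t value for exactly 1 degree of freedom.

    Valid only for df = 1, where the t distribution is the Cauchy
    distribution and the quantile is tan(pi * (p - 0.5)).
    """
    upper_p = 1 - (1 - confidence) / 2
    return tan(pi * (upper_p - 0.5))


intervals = {}
for confidence in (0.90, 0.95, 0.98):
    t = t_critical_df1(confidence)
    intervals[confidence] = (x_bar - t * se, x_bar + t * se)
    low, high = intervals[confidence]
    print(f"{confidence:.0%}: t = {t:.2f}, interval = {low:.2f} to {high:.2f}")
```

For samples larger than n = 2 the closed form no longer applies, and the critical values would come from a t table or a statistics library.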

    The above discussion is summarized graphically in Exhibit 2.2. Part I of Exhibit 2.2 shows the relationship between the

    population, the sample, and the sampling distribution, while Part II illustrates the impact of sample size on the shape of

    the sampling distribution for differently shaped population distributions.


    Exhibit 2.1 Population, Sample, and Sampling Distribution

    Part I


    Exhibit 2.1 Population, Sample, and Sampling Distribution

    Part II

    HYPOTHESIS TESTING

    Researchers and management continually ask, "Are my results significant?" The significance level refers to the amount

    of error we are willing to accept in our decisions that are based on the hypothesis test. Hypothesis testing involves

    specifying the value α, which is the allowable amount of Type I error.

    In hypothesis testing the sample results sometimes lead us to reject H0 when it is true. This is a Type I error. On other

    occasions the sample findings may lead us to accept H0 when it is false. This is a Type II error, whose probability is β. The nature of these errors is shown in Exhibit 2.3.

    The amount of Type I error, α, we are willing to accept should be set after considering (a) how much it costs to make such an error and (b) the decision rule used.


    A tradition of conservatism exists in basic research and has resulted in the practice of keeping the Type I error at

    a low level (.05 or .01). Traditionally, Type I errors have been considered more important than Type II errors and,

    correspondingly, it has been considered more important to have a low α than a low β. The basic researcher typically assigns higher costs to a Type I than to a Type II error.

    Exhibit 2.2

    TYPES OF ERROR IN MAKING A WRONG DECISION

    Two types of error result from a mismatch between the conclusion of a research study and reality. In the null-

    hypothesis format, there are two possible research conclusions, to retain H0 and to reject H0. There are also two

    possibilities for the true situation: H0 is true or H0 is false.

    These outcomes provide the definitions of Type I and Type II errors and the confidence level and power of the test:

    1. A Type I error occurs when we incorrectly conclude that a difference exists. This is expressed as α, the probability that we will incorrectly reject H0, the null hypothesis, sometimes called the hypothesis of no difference.

    2. A Type II error occurs when we accept a null hypothesis when it is in reality false (we find no difference when a

    difference really does exist).

    3. Confidence level: we correctly retain the null hypothesis (it is also correct to say we tentatively accept it or that

    it could not be rejected). Geometrically, this is equal to the area under the normal curve less the area occupied by

    α, the significance level.

    4. The power of the test is the ability to reject the null hypothesis when it should be rejected (when false). That

    is, the power to not make a Type II error. Because power increases as α becomes larger, researchers may choose an α of .10 or even .20 to increase power. Alternatively, sample size may be increased to increase power. Increasing

    sample size is the preferred option for most market researchers.

    The four possible combinations are shown in the following diagram:


    In decisional research, the costs of an error are a direct result of the consequences of the errors. The cost of missing

    a market entry by not producing a product and foregoing gain (Type II error) may be even greater than the loss from

    producing a product when we should not (Type I error). The cost depends on the situation and the decision rule being used.

    Of course not all decision situations have errors leading to such consequences. In some situations, making a Type I error

    may lead to an opportunity cost (for example, a foregone gain), and a Type II error may create a direct loss.

    POWER OF A TEST

    The power of a hypothesis test is defined as 1 - β, or 1 minus the probability of a Type II error. The power of a test is the ability to reject the null hypothesis when it is false (or to find a difference when one is present).

    The power of a statistical test is determined by several factors. The main factor is the acceptable amount of discrepancy

    between the tested hypothesis and the true situation. This can be controlled by increasing α, from .01 to .05, for example. Power is also increased by increasing the sample size (which narrows the confidence interval).

    In Figure 2.3, we observe two sampling distributions, the first having a mean μ0 and the second having a mean μ0 + 1σ. For the second sampling distribution, power is the area to the right of μ0 + 1.65σ, i.e., the area in which a difference between means from the two sampling distributions exists and is found to exist.

    Figure 2.3 Two Sampling Distributions

    If the critical value were decreased to, say, μ0 + 1.5σ (that is, if α were increased), the shaded area would move to the left, and the area defining power for curve II would increase.

    The second method of increasing power is to increase sample size. Because increasing sample size directly reduces

    the standard error ($\sigma_{\bar{x}} = \sigma / \sqrt{n}$), any increase in n decreases the absolute width of the confidence interval and narrows sampling distribution I.

    In decisions that imply a greater cost for a Type II error (missing a difference that exists), the analyst may want to consider

    expanding the probability of a Type I error (α) to .10, .15, or even .20, especially when sample sizes are small. Exhibit 2.2 discusses some ramifications of ignoring statistical power.
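Both levers on power, a larger α and a larger sample, can be illustrated with a small calculation. The sketch below assumes a one-sided (upper-tailed) z test with a known population standard deviation; the function name `power_one_sided_z` and the values σ = 2, a true difference of 1, and n = 4 or 16 are illustrative assumptions, not figures from the text.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided_z(effect, sigma, n, alpha):
    """Power of an upper-tailed z test of H0: mu = mu0 when the true
    mean is mu0 + effect and the population sigma is known."""
    z = NormalDist()                  # standard normal distribution
    se = sigma / sqrt(n)              # standard error of the mean
    z_alpha = z.inv_cdf(1 - alpha)    # critical value in standard-error units
    # Probability that the sample mean lands beyond the critical value
    return 1 - z.cdf(z_alpha - effect / se)


# Assumed scenario: sigma = 2, true difference = 1
base = power_one_sided_z(effect=1, sigma=2, n=4, alpha=0.05)
larger_alpha = power_one_sided_z(effect=1, sigma=2, n=4, alpha=0.20)
larger_n = power_one_sided_z(effect=1, sigma=2, n=16, alpha=0.05)

print(f"alpha = .05, n = 4:  power = {base:.2f}")
print(f"alpha = .20, n = 4:  power = {larger_alpha:.2f}")
print(f"alpha = .05, n = 16: power = {larger_n:.2f}")
```

With n = 4 the standard error is 1, so the true mean sits one standard error above μ0 and the critical value sits at about 1.65 standard errors, which gives power of roughly one in four. Raising α or quadrupling n both raise power, but only the larger sample does so without accepting more Type I error.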

    Summary

    In Chapter 2 we introduced the basic concepts of formulating and testing hypotheses and making statistical inferences in the

    context of univariate analysis. In actual research, the analyst may alternate between analyzing the data and formulating

    hypotheses.

    A hypothesis is a statement in which variables (measured constructs) are related in a specific way. The null hypothesis,

    H0, is a statement that no relationship exists between the variables tested or that there is no difference.

    A sample of respondents is used to make inferences about the population of all respondents by means of a sampling

    distribution. The sampling distribution is a distribution of the parameter values (means or variances) that are estimated

    when all possible samples are collected.

    When testing hypotheses, we may correctly identify a relationship as present or absent, or may commit one of two types of

    errors. A Type I Error occurs when a true H0 is rejected (there is no difference, but we find there is). A Type II Error occurs

    when we accept a false H0 (there is a difference, but we find that none exists). The power of a test was explained as the

    ability to reject H0 when it should be rejected.

    In the next chapters we will observe that the appropriate statistical technique is selected, in part, based on the level of

    measurement and on the number of variables and their relationships when included in the analysis. The investigation of complex

    relationships using higher order statistics is not possible without higher levels of measurement and the associated

    characteristics of central tendency, dispersion, and rates of change.

    References

    Selvin, H. and Stuart, A. (1966). Data-Dredging Procedures in Survey Analysis. The American Statistician, June, 20-23.

    Chapter 3

    Bivariate Data Analysis:

    Simple Associative Data

    Analysis, Cross Tabulation,

    T-Test and ANOVA

    Bivariate data analysis, including cross-tabulation and t-tests, is the backbone of most research projects.

    Analysis is the manipulation and interpretation of data to obtain answers to research questions. The process of inter-

    pretation involves taking the results of analysis, making inferences relevant to the research relationships studied, and

    drawing managerially useful conclusions about these relationships.

    Basic Concepts of Analyzing Associative Data

    This chapter begins with a brief discussion of cross-tabulation to mark the beginning of a major topic of this chapter:

    the analysis of associative data. A large part of the rest of the chapter will focus on analyzing how the variation of one

    variable is associated with variation in other variables.

    BIVARIATE CROSS-TABULATION

    Bivariate (two variable) cross-tabulation represents the simplest form of associative data analysis. Starting with two

    categorical variables, such as occupation and education, we assume that each variable is nominal-scaled.

    Bivariate cross-tabulation is the single most widely used bivariate technique in applied settings:

    1. It provides a means of data display and analysis that is clearly interpretable even to the less statistically

    inclined researcher or manager.

    2. A series of bivariate tabulations can provide clear insights into complex marketing phenomena that might be

    lost in a simultaneous analysis with many variables.

    3. The clarity of interpretation affords a more readily constructed link between market research and market action.

    4. Bivariate cross-tabulations may lessen the problems of sparse cell values that often plague the interpretation

    of discrete multivariate analyses because the requisite number of respondents in any table cell is 5.

    The items being cross-classified are usually people, objects, or events. Cross-tabulation, at its simplest, consists of a

    simple count of the number of items that fall into each of the possible categories of the cross-classification. However,

    cross-tabulation includes more than raw frequency data. At the very least, we also compute row or column percentages

    (or both).

    PERCENTAGES

    The simple mechanics of calculating percentages are known to all of us. In cross-tabulation, percentages serve as a

    relative measure indicating the relative size of two or more categories.

    Let's assume that KEN's Original Salad Dressing wants to test the effectiveness of spot TV ads in increasing consumer

    awareness of a new low calorie brand called Life. Two geographic areas are chosen for the test: test area A and control

    area B. The test area receives five 15-second television spots per week over an eight-week period, while the control area

    receives no spot TV ads at all. (Other forms of advertising were equal in each area.)


    Four independent samples of telephone interviews were conducted as before and after tests in each of the areas.

    Respondents were asked to name every brand of salad dressing they could, on an aided basis. If a respondent mentioned

    Life, researchers assumed that this constituted consumer awareness of the brand. However, sample sizes differed across

    all four sets of interviews. In surveys, this common occurrence of variation in sample sizes increases the importance of

    computing percentages.

    Table 3.1 shows a set of frequency tables compiled before and after a TV ad for Life Salad Dressing aired. Interpreting

    Table 3.1 would be difficult if the data were expressed solely as raw frequencies and if different percentage bases were

    reported. Accordingly, Table 3.1 shows the data, with percentages based on column and row totals.

    DIRECTION IN WHICH TO COMPUTE PERCENTAGES

    In examining the relationship between two variables, one variable is the independent or control variable and the other

    the dependent or criterion variable. When this distinction is clear, the rule is to compare percentages within levels of the

    dependent variable.

    In Table 3.1, the control variable is the experimental area (test versus control) and the dependent variable is awareness.

    When comparing awareness in the test and control areas, row percentages are preferred. Before the spot TV campaign,

    the percentage of respondents who are aware of Life is nearly equal between test and control areas: 42 percent and 40

    percent, respectively.

    However, after the campaign the test-area awareness level rises to 66 percent, while the control-area awareness remains

    nearly unchanged at 42 percent. The increase of 2 percentage points reflects either sampling variability or the effect of

    other factors that might be increasing awareness of Life in the control area.

    On the other hand, computing percentages across the independent variable (column percent) makes little sense. While 61

    percent of the aware group (before the spot TV campaign) originates from the test area, this is mainly a reflection of the

    differences in total sample sizes between the test and control areas.

    After the campaign, the percentage of aware respondents in the control area is only 33 percent, versus 39 percent before

    the campaign. This may be erroneously interpreted as indicating that spot TV in the test area depressed awareness in the

    control area. But we know this is false from our earlier examination of row percentages.

    It is not always the case that one variable is clearly the independent or control variable and the other the dependent or

    criterion variable. However, this should not pose a problem as long as we agree, for analysis purposes, which variable

    is the control variable. Indeed, cases often arise in which each of the variables in turn serves as the independent and

    dependent variable.
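The row and column percentages discussed above can be recomputed directly from the cell frequencies of Table 3.1. The following is a minimal sketch; the function name and the dictionary layout are illustrative choices, and the after-campaign row total for the test area is taken as the sum of its two cells (330 + 170).

```python
def row_and_column_percentages(table):
    """Return (row_pct, col_pct) for a nested dict {row: {column: count}}."""
    row_totals = {row: sum(cells.values()) for row, cells in table.items()}
    col_totals = {}
    for cells in table.values():
        for col, count in cells.items():
            col_totals[col] = col_totals.get(col, 0) + count
    row_pct = {row: {col: 100 * count / row_totals[row]
                     for col, count in cells.items()}
               for row, cells in table.items()}
    col_pct = {row: {col: 100 * count / col_totals[col]
                     for col, count in cells.items()}
               for row, cells in table.items()}
    return row_pct, col_pct


# Cell frequencies from Table 3.1 (rows = area, columns = awareness)
before = {"test": {"aware": 250, "not_aware": 350},
          "control": {"aware": 160, "not_aware": 240}}
after = {"test": {"aware": 330, "not_aware": 170},
         "control": {"aware": 160, "not_aware": 220}}

row_before, col_before = row_and_column_percentages(before)
row_after, col_after = row_and_column_percentages(after)

for area in ("test", "control"):
    print(area,
          "row % aware:", round(row_before[area]["aware"]),
          "->", round(row_after[area]["aware"]),
          "| col % aware:", round(col_before[area]["aware"]),
          "->", round(col_after[area]["aware"]))
# test row % aware: 42 -> 66 | col % aware: 61 -> 67
# control row % aware: 40 -> 42 | col % aware: 39 -> 33
```

The row percentages isolate the change in awareness within each area, whereas the column percentages for the control area fall from 39 to 33 only because the test area contributes more of the aware respondents after the campaign.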


    Table 3.1 Aware of Life Salad Dressing: Before and After Spot TV

    BEFORE SPOT TV
    AREA                     AWARE    NOT AWARE    TOTAL AREA
    Test Area      Freq.     250      350          600
                   Col %     61%      59%          60%
                   Row %     42%      58%
    Control Area   Freq.     160      240          400
                   Col %     39%      41%          40%
                   Row %     40%      60%
    Total          Freq.     410      590          1000
                   Row %     41%      59%          100%

    AFTER SPOT TV
    AREA                     AWARE    NOT AWARE    TOTAL AREA
    Test Area      Freq.     330      170          500
                   Col %     67%      44%          57%
                   Row %     66%      34%
    Control Area   Freq.     160      220          380
                   Col %     33%      56%          43%
                   Row %     42%      58%
    Total          Freq.     490      390          880
                   Row %     56%      44%          100%

    INTERPRETATION OF THE PERCENTAGE CHANGE

    A second problem that arises from using percentages in cross-tabulations is choosing which method to use in measuring

    differences. There are three principal ways to portray percentage change:

    1. The absolute difference

    2. The relative difference

    3. The percentage of possible change

    The same example can be used to illustrate the three methods.

    ABSOLUTE PERCENTAGE INCREASE

    Table 3.2 shows the percentage of respondents who were aware of Life before and after the spot TV campaign in the test

    and control areas. First, we note that the test-area respondents displayed a greater absolute increase in awareness. The

    increase for the test-area respondents was 24 percentage points, while the control-area awareness increased by only 2

    percentage points.

    Table 3.2 Aware of Life: Percentages Before and After the Spot TV Campaign

                    Before the Campaign    After the Campaign
    Test area       42%                    66%
    Control area    40%                    42%

    RELATIVE PERCENTAGE INCREASE

    The relative increase in percentage is [(66 - 42)/42] × 100 = 57 percent and [(42 - 40)/40] × 100 = 5 percent,

    respectively, for test- and control-area respondents.

    PERCENTAGE POSSIBLE INCREASE

    For the test area, the maximum percentage-point increase that could have occurred is 100 - 42 = 58 points. The actual

    increase is 24 percentage points, or 100(24/58) = 41 percent of the maximum possible. That of the control area is

    100(2/60) = 3 percent of the maximum possible.

    In terms of the illustrative problem, all three methods give consistent results in the sense that the awareness level in the

    test area undergoes greater change than the control area. However, in other situations conflicts among the measures may

    arise.

    The absolute-difference method is simple to use and requires only that we understand the distinction between percentages

    and percentage points. The relative-difference method can be misleading, particularly if the base for computing the percentage change is small. The percentage-of-possible-difference method takes into account the greater difficulty

    associated with obtaining increases in awareness as the difference between potential-level and realized-level decreases.

    In some studies, all three measures are used, inasmuch as they emphasize different aspects of the relationship.
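The three measures can be computed side by side from the before and after awareness percentages in Table 3.2. This is a small sketch; the function name `change_measures` is an illustrative choice.

```python
def change_measures(before_pct, after_pct):
    """Return the absolute, relative, and percentage-of-possible change
    between two percentages measured on a 0-100 scale."""
    absolute = after_pct - before_pct                 # percentage points
    relative = 100 * absolute / before_pct            # percent of the starting level
    of_possible = 100 * absolute / (100 - before_pct) # percent of the remaining headroom
    return absolute, relative, of_possible


# Awareness of Life before and after the spot TV campaign (Table 3.2)
test_area = change_measures(42, 66)
control_area = change_measures(40, 42)

print("test area:   ", [round(v) for v in test_area])     # [24, 57, 41]
print("control area:", [round(v) for v in control_area])  # [2, 5, 3]
```

The relative measure divides by the starting level and the percentage-of-possible measure divides by the remaining headroom, which is why the two can rank groups differently when starting levels are far apart.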

    INTRODUCING A THIRD VARIABLE INTO THE ANALYSIS

    Using cross-tabulation analysis to investigate relationships need not stop with two variables. Often we can learn much

    about the original two-variable association through introducing a third variable.

    Consider the hypothetical situation facing MCX Company, a specialist in the residential telecommunications equipment

    market. The company has recently test-marketed a new cloud-based service for recording Internet, cable, or satellite TV

    programs without a storage box. Several months after the introduction, the company conducted a telephone survey in

    which respondents in the test area were asked whether they had adopted the innovation. Researchers interviewed 600

    respondents using a quota sample design based on age and gender (Table 3.3).

    Table 3.3 Adoption: Percentage by Gender and Age

                           MEN                                     WOMEN
    FREQUENCY              < 35 YRS   ≥ 35 YRS   TOTAL   %         < 35 YRS   ≥ 35 YRS   TOTAL   %
    ADOPTERS
      Number of Cases      100        60         160     40%       11         9          20      10%
      Column %             50%        30%                          11%        9%
      Row %                62.5%      37.5%                        55%        45%
    NON ADOPTERS
      Number of Cases      100        140        240     60%       89         91         180     90%
      Column %             50%        70%                          89%        91%
      Row %                41.7%      58.3%                        49.4%      50.6%
    TOTAL
      Number of Cases      200        200        400               100        100        200
      Percentage           50%        50%        100%              50%        50%        100%

    Based on earlier studies of the residential market, it appeared that adopters of the firm's new products tended to be less

    than 35 years old. Accordingly, the market analyst performed a cross-tabulation of adoption rate and respondent age.

    Respondents were classified into the categories under 35 years (


    Figure 3.1 Adoption: Percentage by Age and Gender

    From the bar graph we can easily see that adoption differs by age group (37 percent versus 23 percent). Furthermore, the

    size of the difference depends on the gender of the respondent: Men display a relatively higher rate of adoption, compared

    with women, in the younger age category.

    RECAPITULATION

    Representations of three-variable association can involve many possibilities that could be illustrated by the preceding

    adoption-age-gender example:

    1. In the example, adoption and age exhibit initial association. This association is still maintained in the

    aggregate but is refined by the introduction of the third variable, gender.

    2. Adoption and age do not appear to be associated. However, adding the third variable, gender, reveals suppressed association between the first two variables within the separate categories of men and women.

    In the two-variable cases, men and women exhibit opposite patterns, canceling each other out.
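The effect of introducing gender can be seen by computing adoption rates first by age alone and then by age within each gender. The sketch below uses the cell counts from Table 3.3 and assumes that the 400-respondent group is the men and the 200-respondent group is the women, which is how the remark about men's higher adoption rate reads; the names `cells` and `adoption_rate` are illustrative.

```python
# (adopters, respondents) for each gender-by-age cell of Table 3.3.
# Assumption: the 400-respondent group is men, the 200-respondent group women.
cells = {
    ("men", "<35"): (100, 200),
    ("men", ">=35"): (60, 200),
    ("women", "<35"): (11, 100),
    ("women", ">=35"): (9, 100),
}


def adoption_rate(selected):
    """Adoption percentage pooled over a list of (adopters, respondents) pairs."""
    adopters = sum(a for a, _ in selected)
    respondents = sum(n for _, n in selected)
    return 100 * adopters / respondents


# Two-variable view: adoption by age, ignoring gender
by_age = {
    age: adoption_rate([v for (_, a), v in cells.items() if a == age])
    for age in ("<35", ">=35")
}

# Three-variable view: adoption by age within each gender
by_gender_age = {key: adoption_rate([v]) for key, v in cells.items()}

print(by_age)          # {'<35': 37.0, '>=35': 23.0}
for (gender, age), rate in by_gender_age.items():
    print(gender, age, rate)
```

The pooled rates of 37 and 23 percent hide the fact that the age gap is 20 percentage points among men (50 versus 30) but only 2 points among women (11 versus 9).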

    Introducing a third variable can often be useful in interpreting two-variable cross-tabulations. However, we have

    deliberat