doane chapter 02

Upload: thomasmcarter

Post on 14-Apr-2018


TRANSCRIPT

  • 7/30/2019 Doane Chapter 02

    1/82

    2/82

    Chapter 2

    Data Collection

    - Data Vocabulary
    - Level of Measurement
    - Time Series and Cross-sectional Data
    - Sampling Concepts
    - Sampling Methods
    - Data Sources
    - Survey Research

    3/82

    Data Vocabulary

    - Data is the plural form of the Latin datum (a given fact).
    - In scientific research, data arise from experiments whose results are recorded systematically.
    - In business, data usually arise from accounting transactions or management processes.
    - Important decisions may depend on data.

    McGraw-Hill/Irwin © 2007 The McGraw-Hill Companies, Inc. All rights reserved.

    4/82

    Data Vocabulary

    Subjects, Variables, Data Sets
    - We will refer to data as plural and to a data set as a particular collection of data as a whole.
    - Observation: each data value.
    - Subject (or individual): an item for study (e.g., an employee in your company).
    - Variable: a characteristic about the subject or individual (e.g., an employee's income).

    5/82

    Data Vocabulary

    Subjects, Variables, Data Sets
    Three types of data sets:

    Data Set       Variables       Typical Tasks
    Univariate     One             Histograms, descriptive statistics, frequency tallies
    Bivariate      Two             Scatter plots, correlations, simple regression
    Multivariate   More than two   Multiple regression, data mining, econometric modeling

    6/82

    Data Vocabulary

    Subjects, Variables, Data Sets
    Consider a multivariate data set with 5 variables and 8 subjects: 5 x 8 = 40 observations.

    7/82

    Data Vocabulary

    Data Types
    A data set may have a mixture of data types.

    Types of Data
    - Attribute (qualitative)
      - Verbal label: X = economics (your major)
      - Coded: X = 3 (i.e., economics)
    - Numerical (quantitative)
      - Discrete: X = 2 (your siblings)
      - Continuous: X = 3.15 (your GPA)

    8/82

    Data Vocabulary

    Attribute Data
    - Also called categorical, nominal or qualitative data.
    - Values are described by words rather than numbers. For example,
      - Automobile style (e.g., X = full, midsize, compact, subcompact).
      - Mutual fund (e.g., X = load, no-load).

    9/82

    Data Vocabulary

    Data Coding
    - Coding refers to using numbers to represent categories to facilitate statistical analysis.
    - Coding an attribute as a number does not make the data numerical. For example,
      1 = Bachelor's, 2 = Master's, 3 = Doctorate
    - Rankings may exist, for example,
      1 = Liberal, 2 = Moderate, 3 = Conservative
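The coding idea can be sketched in a few lines of Python. This is only an illustration: the list of responses is hypothetical, and the names `DEGREE_CODES` and `encode` are not from the slides.

```python
from collections import Counter

# Coding scheme from the slide's example.
DEGREE_CODES = {"Bachelor's": 1, "Master's": 2, "Doctorate": 3}

def encode(labels, codes):
    """Replace each verbal label with its numeric code."""
    return [codes[label] for label in labels]

# Hypothetical survey responses, for illustration only.
degrees = ["Master's", "Bachelor's", "Bachelor's", "Doctorate", "Bachelor's"]
coded = encode(degrees, DEGREE_CODES)

# Counting codes is meaningful for attribute data; averaging them is not,
# even though the codes look like numbers.
frequencies = Counter(coded)
print(coded, dict(frequencies))
```

The codes only label categories, so a frequency tally is a safe summary while a mean of the codes would have no interpretation.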

    10/82

    Data Vocabulary

    Binary Data
    - A binary variable has only two values, 1 = presence, 0 = absence of a characteristic of interest (codes themselves are arbitrary). For example,
      1 = employed, 0 = not employed
      1 = married, 0 = not married
      1 = male, 0 = female
      1 = female, 0 = male
    - The coding itself has no numerical value, so binary variables are attribute data.

    11/82

    Data Vocabulary

    Numerical Data
    - Numerical or quantitative data arise from counting or some kind of mathematical operation. For example,
      - Number of auto insurance claims filed in March (e.g., X = 114 claims).
      - Ratio of profit to sales for last quarter (e.g., X = 0.0447).
    - Can be broken down into two types: discrete or continuous data.

    12/82

    Data Vocabulary

    Discrete Data
    - A numerical variable with a countable number of values that can be represented by an integer (no fractional values). For example,
      - Number of Medicaid patients (e.g., X = 2).
      - Number of takeoffs at O'Hare (e.g., X = 37).

    13/82

    Data Vocabulary

    Continuous Data
    - A numerical variable that can have any value within an interval (e.g., length, weight, time, sales, price/earnings ratios).
    - Any continuous interval contains infinitely many possible values (e.g., 426

    14/82

    Data Vocabulary

    Rounding
    - Ambiguity is introduced when continuous data are rounded to whole numbers.
    - Underlying measurement scale is continuous.
    - Precision of measurement depends on instrument.
    - Sometimes discrete data are treated as continuous when the range is very large (e.g., SAT scores) and small differences (e.g., 604 or 605) aren't of much importance.

    15/82

    Level of Measurement

    Four levels of measurement for data:

    Level of Measurement   Characteristics          Example
    Nominal                Categories only          Eye color (blue, brown, green, hazel)
    Ordinal                Rank has meaning         Bond ratings (Aaa, Aab, C, D, F, etc.)
    Interval               Distance has meaning     Temperature (57° Celsius)
    Ratio                  Meaningful zero exists   Accounts payable ($21.7 million)

    16/82

    Level of Measurement

    Nominal Measurement
    - Nominal data merely identify a category.
    - Nominal data are qualitative, attribute, categorical or classification data (e.g., Apple, Compaq, Dell, HP).
    - Nominal data are usually coded numerically, but the codes are arbitrary (e.g., 1 = Apple, 2 = Compaq, 3 = Dell, 4 = HP).
    - The only mathematical operations are counting (e.g., frequencies) and simple statistics.

    17/82

    Level of Measurement

    Ordinal Measurement
    - Ordinal data codes can be ranked (e.g., 1 = Frequently, 2 = Sometimes, 3 = Rarely, 4 = Never).
    - Distance between codes is not meaningful (e.g., distance between 1 and 2, or between 2 and 3, or between 3 and 4 lacks meaning).
    - Many useful statistical tests exist for ordinal data. Especially useful in social science, marketing and human resource research.

    18/82

    Level of Measurement

    Interval Measurement
    - Data can not only be ranked, but also have meaningful intervals between scale points (e.g., the difference between 60°F and 70°F is the same as the difference between 20°F and 30°F).
    - Since intervals between numbers represent distances, mathematical operations can be performed (e.g., average).
    - Zero point of interval scales is arbitrary, so ratios are not meaningful (e.g., 60°F is not twice as warm as 30°F).
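A quick numerical check of the interval-scale claims, sketched in Python. The conversion formula is the standard Fahrenheit-to-Celsius one; the variable names are only illustrative.

```python
def fahrenheit_to_celsius(f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (f - 32) * 5 / 9

# Equal differences in Fahrenheit stay equal after a change of units.
diff_warm = fahrenheit_to_celsius(70) - fahrenheit_to_celsius(60)
diff_cold = fahrenheit_to_celsius(30) - fahrenheit_to_celsius(20)

# Ratios do not survive the change of units, because zero is arbitrary.
ratio_f = 60 / 30
ratio_c = fahrenheit_to_celsius(60) / fahrenheit_to_celsius(30)

print(diff_warm, diff_cold)  # the two differences agree
print(ratio_f, ratio_c)      # the two ratios do not
```

The ratio of 2 in Fahrenheit becomes a negative number in Celsius, which is why "twice as warm" has no meaning on an interval scale.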

    19/82

    Level of Measurement

    Likert Scales
    - A special case of interval data frequently used in survey research.
    - The coarseness of a Likert scale refers to the number of scale points (typically 5 or 7).

    College-bound high school students should be required to study a foreign language. (check one)
    Strongly Agree   Somewhat Agree   Neither Agree Nor Disagree   Somewhat Disagree   Strongly Disagree

    20/82

    Level of Measurement

    Likert Scales
    - A neutral midpoint (Neither Agree Nor Disagree) is allowed if an odd number of scale points is used, or omitted to force the respondent to lean one way or the other.
    - Likert data are coded numerically (e.g., 1 to 5) but any equally spaced values will work.

    Likert coding: 1 to 5 scale   Likert coding: -2 to +2 scale
    5 = Help a lot                +2 = Help a lot
    4 = Help a little             +1 = Help a little
    3 = No effect                  0 = No effect
    2 = Hurt a little             -1 = Hurt a little
    1 = Hurt a lot                -2 = Hurt a lot
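To see why any equally spaced coding works, here is a small Python sketch with hypothetical responses. Recoding from the 1 to 5 scale to the -2 to +2 scale only shifts every value by 3, so the spread is unchanged and the mean moves by the same shift.

```python
from statistics import mean, pstdev

# Hypothetical responses on the 1-to-5 coding, for illustration only.
responses_1_to_5 = [5, 4, 4, 3, 2, 5, 1, 4]

# Recode to the -2 to +2 coding by subtracting the midpoint.
recoded = [r - 3 for r in responses_1_to_5]

print(mean(responses_1_to_5), mean(recoded))      # means differ by exactly 3
print(pstdev(responses_1_to_5), pstdev(recoded))  # spread is identical
```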

    21/82

    Level of Measurement

    Likert Scales
    - Careful choice of verbal anchors results in measurable intervals (e.g., the distance from 1 to 2 is the same as the interval, say, from 3 to 4).
    - Ratios are not meaningful (e.g., here 4 is not twice 2).
    - Many statistical calculations can be performed (e.g., averages, correlations, etc.).

    22/82

    Level of Measurement

    Likert Scales
    More variants of Likert scales:

    How would you rate your marketing instructor? (check one)
    Terrible   Poor   Adequate   Good   Excellent

    How would you rate your marketing instructor? (check one)
    Very Bad   Very Good

    23/82

    Level of Measurement

    Ambiguity
    - Grades are usually coded numerically (A = 4, B = 3, C = 2, D = 1, F = 0) and are used to calculate a mean GPA.
    - Is the interval from 3.0 to 4.0 really the same as the interval from 1.0 to 2.0?
    - What is the underlying reality ranging from 0 to 4 that we are measuring?
    - Best to be conservative and limit statistical tests to those for ordinal data.

    24/82

    Level of Measurement

    Ratio Measurement
    - Ratio data have all properties of nominal, ordinal and interval data types and also possess a meaningful zero (absence of quantity being measured).
    - Because of this zero point, ratios of data values are meaningful (e.g., $20 million profit is twice as much as $10 million).
    - Zero does not have to be observable in the data; it is an absolute reference point.

    25/82

    Level of Measurement

    Use the following procedure to recognize data types:

    Question                                             If Yes
    Q1. Is there a meaningful zero point?                Ratio data (all statistical operations are allowed)
    Q2. Are intervals between scale points meaningful?   Interval data (common statistics allowed, e.g., means and standard deviations)
    Q3. Do scale points represent rankings?              Ordinal data (restricted to certain types of nonparametric statistical tests)
    Q4. Are there discrete categories?                   Nominal data (only counting allowed, e.g., finding the mode)
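The four questions can be read as a decision chain, asked in order from Q1 to Q4. A minimal Python sketch follows; the function name and its boolean arguments are hypothetical and simply stand in for the analyst's answers to each question.

```python
def level_of_measurement(meaningful_zero, meaningful_intervals, ranked, categories):
    """Apply Q1 to Q4 in order and return the first level whose answer is yes."""
    if meaningful_zero:
        return "ratio"
    if meaningful_intervals:
        return "interval"
    if ranked:
        return "ordinal"
    if categories:
        return "nominal"
    raise ValueError("none of the four questions was answered yes")

# Accounts payable: a true zero exists.
print(level_of_measurement(True, True, True, True))
# Temperature in Fahrenheit: intervals are meaningful, zero is arbitrary.
print(level_of_measurement(False, True, True, True))
# Eye color: categories only.
print(level_of_measurement(False, False, False, True))
```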

    26/82

    Level of Measurement

    Changing Data by Recoding
    - In order to simplify data or when exact data magnitude is of little interest, ratio data can be recoded downward into ordinal or nominal measurements (but not conversely).
    - For example, recode systolic blood pressure as normal (under 130), elevated (130 to 140), or high (over 140).
    - The above recoded data are ordinal (ranking is preserved) but intervals are unequal and some information is lost.
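The blood pressure recoding can be sketched as a small Python function. The cut points come from the example above; the function name and the sample readings are hypothetical.

```python
def recode_systolic(pressure):
    """Recode a ratio-level reading into an ordinal category."""
    if pressure < 130:
        return "normal"
    if pressure <= 140:
        return "elevated"
    return "high"

readings = [118, 130, 135, 140, 152]  # hypothetical readings
print([recode_systolic(r) for r in readings])
```

The original readings cannot be recovered from the categories, which is the information loss the slide mentions.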

    27/82

    Time Series and Cross-sectional Data

    Time Series Data
    - Each observation in the sample represents a different equally spaced point in time (e.g., years, months, days).
    - Periodicity may be annual, quarterly, monthly, weekly, daily, hourly, etc.
    - We are interested in trends and patterns over time (e.g., annual growth in consumer debit card use from 1999 to 2006).

    28/82

    Time Series and Cross-sectional Data

    Cross-sectional Data
    - Each observation represents a different individual unit (e.g., person) at the same point in time (e.g., monthly VISA balances).
    - We are interested in
      - variation among observations or in
      - relationships.
    - We can combine the two data types to get pooled cross-sectional and time series data.

    29/82

    Sampling Concepts

    Sample or Census?
    - A sample involves looking only at some items selected from the population.
    - A census is an examination of all items in a defined population.
    - Why can't the United States Census survey every person in the population?
      - Mobility
      - Illegal immigrants
      - Budget constraints
      - Incomplete responses or nonresponses

    30/82

    Sampling Concepts

    Situations Where a Sample May Be Preferred:

    Infinite Population
    No census is possible if the population is infinite or of indefinite size (an assembly line can keep producing bolts, a doctor can keep seeing more patients).

    Destructive Testing
    The act of sampling may destroy or devalue the item (measuring battery life, testing auto crashworthiness, or testing aircraft turbofan engine life).

    Timely Results
    Sampling may yield more timely results than a census (checking wheat samples for moisture and protein content, checking peanut butter for aflatoxin contamination).

    31/82

    Sampling Concepts

    Situations Where a Sample May Be Preferred:

    Accuracy
    Sample estimates can be more accurate than a census. Instead of spreading limited resources thinly to attempt a census, our budget of time and money might be better spent to hire experienced staff, improve training of field interviewers, and improve data safeguards.

    Cost
    Even if it is feasible to take a census, the cost, either in time or money, may exceed our budget.

    Sensitive Information
    Some kinds of information are better captured by a well-designed sample, rather than attempting a census. Confidentiality may also be improved in a carefully done sample.

    32/82

    Sampling Concepts

    Situations Where a Census May Be Preferred

    Small Population
    If the population is small, there is little reason to sample, for the effort of data collection may be only a small part of the total cost.

    Large Sample Size
    If the required sample size approaches the population size, we might as well go ahead and take a census.

    Legal Requirements
    Banks must count all the cash in bank teller drawers at the end of each business day. The U.S. Congress forbade sampling in the 2000 decennial census.

    Database Exists
    If the data are on disk we can examine 100% of the cases. But auditing or validating data against physical records may raise the cost.

    33/82

    Sampling Concepts

    Parameters and Statistics
    - Statistics are computed from a sample of n items, chosen from a population of N items.
    - Statistics can be used as estimates of parameters found in the population.
    - Symbols are used to represent population parameters and sample statistics.

    34/82

    Sampling Concepts

    Parameters and Statistics

    Parameter or Statistic?

    Parameter: Any measurement that describes an entire population. Usually, the parameter value is unknown since we rarely can observe the entire population. Parameters are often (but not always) represented by Greek letters.

    Statistic: Any measurement computed from a sample. Usually, the statistic is regarded as an estimate of a population parameter. Sample statistics are often (but not always) represented by Roman letters.

    35/82

    Sampling Concepts

    Parameters and Statistics
    - The population must be carefully specified and the sample must be drawn scientifically so that the sample is representative.

    Target Population
    - The target population is the population we are interested in (e.g., U.S. gasoline prices).
    - The sampling frame is the group from which we take the sample (e.g., 115,000 stations).
    - The frame should not differ from the target population.

    36/82

    Sampling Concepts

    Finite or Infinite?
    - A population is finite if it has a definite size, even if its size is unknown.
    - A population is infinite if it is of arbitrarily large size.
    - Rule of Thumb: A population may be treated as infinite when N is at least 20 times n (i.e., when N/n > 20).

    [Figure: population N and sample n. Here, N/n > 20.]

    37/82

    Sampling Methods

    Probability Samples

    Simple Random Sample   Use random numbers to select items from a list (e.g., VISA cardholders).
    Systematic Sample      Select every kth item from a list or sequence (e.g., restaurant customers).
    Stratified Sample      Select randomly within defined strata (e.g., by age, occupation, gender).
    Cluster Sample         Like stratified sampling except strata are geographical areas (e.g., zip codes).

    38/82

    Sampling Methods

    Nonprobability Samples

    Judgment Sample      Use expert knowledge to choose typical items (e.g., which employees to interview).
    Convenience Sample   Use a sample that happens to be available (e.g., ask co-worker opinions at lunch).

    39/82

    Sampling Methods

    Simple Random Sample
    - Every item in the population of N items has the same chance of being chosen in the sample of n items.
    - We rely on random numbers to select a name.

    =RANDBETWEEN(1,48)

    40/82

    Sampling Methods

    Random Number Tables
    - A table of random digits used to select random numbers between 1 and N.
    - Each digit 0 through 9 is equally likely to be chosen.

    Setting Up a Rule
    - For example, NilCo wants to award cash prizes to 10 of its 875 loyal customers.
    - To get 10 three-digit numbers between 001 and 875, we define any consistent rule for moving through the random number table.

    41/82

    Sampling Methods

    Setting Up a Rule
    - Randomly point at the table to choose a starting point.
    - Choose the first three digits of the selected five-digit block, move to the right one column, down one row, and repeat.
    - When we reach the end of a line, wrap around to the other side of the table and continue.
    - Discard any number greater than 875 and any duplicates.

    Table of 1,000 Random Digits (first row shown; the slide marks a "Start Here" point)
    82134 14458 66716 54269 31928 46241 03052 00260 32367 25783


    42/82

    82134 14458 66716 54269 31928 46241 03052 00260 32367 25783

    07139 16829 76768 11913 42434 91961 92934 18229 15595 02566

    45056 43939 31188 43272 11332 99494 19348 97076 95605 28010

    10244 19093 51678 63463 85568 70034 82811 23261 48794 63984

    12940 84434 50087 20189 58009 66972 05764 10421 36875 64964

    84438 45828 40353 28925 11911 53502 24640 96880 93166 68409

    98681 67871 71735 64113 90139 33466 65312 90655 75444 30845

    43290 96753 18799 49713 39227 15955 46167 63853 03633 19990

    96893 85410 88233 22094 30605 79024 01791 38839 85531 94576

    75403 41227 00192 16814 47054 16814 81349 92264 01028 29071

    78064 92111 51541 76563 69027 67718 06499 71938 17354 12680

    26246 71746 94019 93165 96713 03316 75912 86209 12081 57817

    98766 67312 96358 21351 86448 31828 86113 78868 67243 06763

    37895 51055 11929 44443 15995 72935 99631 18190 85877 31309

    27988 81163 52212 25102 61798 28670 01358 60354 74015 18556

    19216 53008 44498 19262 12196 93947 90162 76337 12646 26838

    28078 86729 69438 24235 35208 48957 53529 76297 41741 54735

    34455 61363 93711 68038 75960 16327 95716 66964 28634 65015

    53510 90412 70438 45932 57815 75144 52472 61817 41562 42084

    30658 18894 88208 97867 30737 94985 18235 02178 39728 66398
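The rule for walking through the table can be sketched in Python. This is an illustration under stated assumptions: the slide's actual "Start Here" cell is not recoverable from the transcript, so the sketch starts at the top-left block; it uses only the first ten rows of the table; and wrapping from the last row back to the first is an added assumption. The name `table_sample` is hypothetical.

```python
# First ten rows of the table above (ten five-digit blocks per row).
TABLE = [
    "82134 14458 66716 54269 31928 46241 03052 00260 32367 25783",
    "07139 16829 76768 11913 42434 91961 92934 18229 15595 02566",
    "45056 43939 31188 43272 11332 99494 19348 97076 95605 28010",
    "10244 19093 51678 63463 85568 70034 82811 23261 48794 63984",
    "12940 84434 50087 20189 58009 66972 05764 10421 36875 64964",
    "84438 45828 40353 28925 11911 53502 24640 96880 93166 68409",
    "98681 67871 71735 64113 90139 33466 65312 90655 75444 30845",
    "43290 96753 18799 49713 39227 15955 46167 63853 03633 19990",
    "96893 85410 88233 22094 30605 79024 01791 38839 85531 94576",
    "75403 41227 00192 16814 47054 16814 81349 92264 01028 29071",
]
BLOCKS = [row.split() for row in TABLE]

def table_sample(blocks, start_row, start_col, n, largest):
    """Collect n distinct numbers in 1..largest by moving diagonally."""
    rows, cols = len(blocks), len(blocks[0])
    chosen = []
    row, col = start_row, start_col
    for _ in range(rows * cols):            # guard against endless cycling
        value = int(blocks[row][col][:3])   # first three digits of the block
        if 1 <= value <= largest and value not in chosen:
            chosen.append(value)
            if len(chosen) == n:
                return chosen
        row = (row + 1) % rows  # down one row (row wrap is an assumption)
        col = (col + 1) % cols  # right one column, wrapping at end of line
    raise ValueError("table exhausted before n numbers were found")

winners = table_sample(BLOCKS, 0, 0, 10, 875)
print(winners)
```

With this assumed starting cell, no number exceeds 875 and none repeats, so the first ten blocks visited supply the ten winners.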


    43/82

    Sampling Methods

    With or Without Replacement
    - If we allow duplicates when sampling, then we are sampling with replacement.
    - If we do not allow duplicates when sampling, then we are sampling without replacement.
    - Duplicates are unlikely when n is much smaller than N.
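In Python's standard `random` module the two schemes correspond to two different calls. The sketch below reuses the 875-customer example; the seed is arbitrary and only makes the run repeatable.

```python
import random

rng = random.Random(42)           # arbitrary seed, for repeatability only
population = list(range(1, 876))  # customers numbered 1 to 875

# With replacement: the same customer may be drawn more than once.
with_replacement = rng.choices(population, k=10)

# Without replacement: every drawn customer is distinct.
without_replacement = rng.sample(population, k=10)

print(with_replacement)
print(without_replacement)
```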

    44/82

    Sampling Methods

    Computer Methods

    Excel - Option A   Enter the Excel function =RANDBETWEEN(1,875) into 10 spreadsheet cells. Press F9 to get a new sample.
    Excel - Option B   Enter the function =INT(1+875*RAND()) into 10 spreadsheet cells. Press F9 to get a new sample.
    Internet           The web site www.random.org will give you many kinds of excellent random numbers (integers, decimals, etc.).
    Minitab            Use Minitab's Random Data menu with the Integer option.

    These are pseudo-random generators because even the best algorithms eventually repeat themselves.

    http://www.random.org/

    45/82

    Using MINITAB to generate random numbers.


    46/82

    Sampling Methods

    Row-Column Data Arrays
    - When the data are arranged in a rectangular array, an item can be chosen at random by selecting a row and column.
    - For example, in the 4 x 3 array, select a random column between 1 and 3 and a random row between 1 and 4.
    - This way, each item has an equal chance of being selected.

    47/82

    Sampling Methods

    Row-Column Data Arrays

    Dillard's               K-Mart            Saks
    Dollar General          Kohl's            Sears Roebuck
    Federated Dept Stores   May Dept Stores   Target
    J. C. Penney            Nordstrom         Wal-Mart Stores

    - Use the =RANDBETWEEN function to choose row 3 and column 3 (Target).
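The row and column selection can be mimicked in Python, with `randint` playing the role of =RANDBETWEEN. The function name `pick_random_cell` is hypothetical, and the seed is arbitrary.

```python
import random

STORES = [
    ["Dillard's", "K-Mart", "Saks"],
    ["Dollar General", "Kohl's", "Sears Roebuck"],
    ["Federated Dept Stores", "May Dept Stores", "Target"],
    ["J. C. Penney", "Nordstrom", "Wal-Mart Stores"],
]

def pick_random_cell(array, rng):
    """Choose a 1-based random row and column and return the item there."""
    row = rng.randint(1, len(array))     # like =RANDBETWEEN(1,4)
    col = rng.randint(1, len(array[0]))  # like =RANDBETWEEN(1,3)
    return row, col, array[row - 1][col - 1]

rng = random.Random(7)  # arbitrary seed
row, col, store = pick_random_cell(STORES, rng)
print(row, col, store)
```

Because the array is rectangular, each of the 12 cells is reached with the same probability, 1/4 times 1/3.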

    48/82

    Sampling Methods

    Randomizing a List
    - In Excel, use function =RAND() beside each row to create a column of random numbers between 0 and 1.
    - Copy and paste these numbers into the same column using Paste Special | Values (to paste only the values and not the formulas).
    - Sort the spreadsheet on the random number column.

    49/82

    Sampling Methods

    Randomizing a List
    - The first n items are a random sample of the entire list (they are as likely as any others).
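The spreadsheet procedure (attach a random number to each row, sort on it, keep the first n rows) translates directly to Python. This is a sketch with a hypothetical list of names and an arbitrary seed; `random_sample_by_sorting` is not a library function.

```python
import random

def random_sample_by_sorting(items, n, rng):
    """Attach a random key to each item, sort by key, keep the first n."""
    keyed = [(rng.random(), item) for item in items]  # like =RAND() per row
    keyed.sort(key=lambda pair: pair[0])              # sort on the random column
    return [item for _, item in keyed[:n]]

names = ["Ann", "Bob", "Cho", "Dee", "Eli", "Fay", "Gus", "Hal"]  # hypothetical
sample = random_sample_by_sorting(names, 3, random.Random(1))
print(sample)
```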

    50/82

    Sampling Methods

    Systematic Sampling
    - Sample by choosing every kth item from a list, starting from a randomly chosen entry on the list.
    - For example, starting at item 2, we sample every k = 4 items to obtain a sample of n = 20 items from a list of N = 78 items.
    - Note that N/n = 78/20 ≈ 4.

    51/82

    Sampling Methods

    Systematic Sampling
    - A systematic sample of n items from a population of N items requires that periodicity k be approximately N/n.
    - Systematic sampling should yield acceptable results unless patterns in the population happen to recur at periodicity k.
    - Can be used with unlistable or infinite populations.
    - Systematic samples are well-suited to linearly organized physical populations.

    52/82

    Sampling Methods

    Systematic Sampling
    - For example, out of 501 companies, we want to obtain a sample of 25. What should the periodicity k be?
      k = N/n = 501/25 ≈ 20.
    - So, we should choose every 20th company from a random starting point.
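A minimal Python sketch of systematic sampling, using the figures from the slides (N = 78 with k = 4 starting at item 2, and N = 501 with n = 25). The function names are hypothetical, and rounding N/n to the nearest integer is one reasonable reading of "approximately N/n".

```python
def periodicity(population_size, sample_size):
    """Periodicity k, taken here as N/n rounded to the nearest integer."""
    return round(population_size / sample_size)

def systematic_sample(items, k, start):
    """Take every kth item, beginning at the 1-based position `start`."""
    return items[start - 1::k]

items = list(range(1, 79))  # a list of N = 78 numbered items
sample = systematic_sample(items, k=4, start=2)

print(periodicity(78, 20), periodicity(501, 25))
print(len(sample), sample[0], sample[-1])
```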

    53/82

    Sampling Methods

    Stratified Sampling
    - Utilizes prior information about the population.
    - Applicable when the population can be divided into relatively homogeneous subgroups of known size (strata).
    - A simple random sample of the desired size is taken within each stratum.
    - For example, from a population containing 55% males and 45% females, randomly sample 120 males and 80 females (n = 200).

    54/82

    Sampling Methods

    Stratified Sampling
    - Or, take a random sample of the entire population and then combine individual strata estimates using appropriate weights.
    - For a population with L strata, the population size N is the sum of the stratum sizes:
      $N = N_1 + N_2 + \dots + N_L$
    - The weight assigned to stratum j is its share of the population:
      $w_j = N_j / N$
    - For example, take a random sample of n = 200 and then weight the responses for males by $w_M = .55$ and for females by $w_F = .45$.
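A short Python sketch of combining stratum estimates with population-share weights. The stratum sizes and stratum means below are hypothetical; only the resulting weights of .55 and .45 match the slide's example.

```python
def stratum_weights(stratum_sizes):
    """Weight each stratum by its share of the population: w_j = N_j / N."""
    total = sum(stratum_sizes.values())
    return {name: size / total for name, size in stratum_sizes.items()}

def weighted_estimate(weights, stratum_means):
    """Combine per-stratum sample means into one population estimate."""
    return sum(weights[name] * stratum_means[name] for name in weights)

sizes = {"male": 5500, "female": 4500}  # hypothetical N_j values
means = {"male": 40.0, "female": 60.0}  # hypothetical stratum sample means

weights = stratum_weights(sizes)
estimate = weighted_estimate(weights, means)
print(weights, estimate)
```

The weights always sum to 1, so the combined estimate lies between the smallest and largest stratum means.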

    55/82

    Sampling Methods

    Cluster Sample
    - Strata consist of geographical regions.
    - One-stage cluster sampling: the sample consists of all elements in each of k randomly chosen subregions (clusters).
    - Two-stage cluster sampling: first choose k subregions (clusters), then choose a random sample of elements within each cluster.

    56/82

    Sampling Methods

    Cluster Sample
    Here is an example of 4 elements sampled from each of 3 randomly chosen clusters (two-stage cluster sampling).
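Two-stage cluster sampling can be sketched in Python as two nested random draws. The six regions and their household labels are hypothetical, the seed is arbitrary, and `two_stage_cluster_sample` is an illustrative name.

```python
import random

def two_stage_cluster_sample(clusters, k, m, rng):
    """Stage 1: choose k clusters. Stage 2: choose m elements in each."""
    chosen = rng.sample(sorted(clusters), k)
    return {name: rng.sample(clusters[name], m) for name in chosen}

# Hypothetical population: 6 regions with 10 households each.
clusters = {
    f"region_{i}": [f"household_{i}_{j}" for j in range(10)]
    for i in range(6)
}

# 4 elements from each of 3 randomly chosen clusters, as on the slide.
sample = two_stage_cluster_sample(clusters, k=3, m=4, rng=random.Random(3))
for region, households in sample.items():
    print(region, households)
```

A one-stage version would simply return every element of each chosen cluster instead of drawing m of them.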

    57/82

    Sampling Methods

    Cluster Sample
    Cluster sampling is useful when
    - Population frame and stratum characteristics are not readily available
    - It is too expensive to obtain a simple or stratified sample
    - The cost of obtaining data increases sharply with distance
    - Some loss of reliability is acceptable

    58/82

    Sampling Methods

    Judgment Sample
    - A nonprobability sampling method that relies on the expertise of the sampler to choose items that are representative of the population.
    - Can be affected by subconscious bias (i.e., nonrandomness in the choice).
    - Quota sampling is a special kind of judgment sampling, in which the interviewer chooses a certain number of people in each category.

    59/82

    Sampling Methods

    Convenience Sample
    - Take advantage of whatever sample is available at that moment. A quick way to sample.

    Sample Size
    - Sample size depends on the inherent variability of the quantity being measured and on the desired precision of the estimate.

    60/82

    Data Sources

    Useful Data Sources

    Type of Data         Examples
    U.S. general data    Statistical Abstract of the U.S.
    U.S. economic data   Economic Report of the President
    Almanacs             World Almanac, Time Almanac
    Periodicals          Economist, Business Week, Fortune
    Indexes              New York Times, Wall Street Journal
    Databases            CompuStat, Citibase, U.S. Census
    World data           CIA World Factbook
    Web                  Google, Yahoo, msn

    61/82

    Survey Research

    Basic Steps of Survey Research
    Step 1: State the goals of the research
    Step 2: Develop the budget (time, money, staff)
    Step 3: Create a research design (target population, frame, sample size)
    Step 4: Choose a survey type and method of administration

    62/82

    Survey Research

    Basic Steps of Survey Research
    Step 5: Design a data collection instrument (questionnaire)
    Step 6: Pretest the survey instrument and revise as needed
    Step 7: Administer the survey (follow up if needed)
    Step 8: Code the data and analyze it

    63/82

    Survey Research

    Survey Types

    Type of Survey: Mail
    Characteristics: You need a well-targeted and current mailing list (people move a lot). Low response rates are typical and nonresponse bias is expected (nonrespondents differ from those who respond). Zip code lists (often costly) are an attractive option to define strata of similar income, education, and attitudes. To encourage participation, a cover letter should clearly explain the uses to which the data will be put. Plan for follow-up mailings.

    64/82

    Survey Research

    Survey Types

    Type of Survey: Telephone
    Characteristics: Random dialing yields very low response and is poorly targeted. Purchased phone lists help reach the target population, though a low response rate still is typical (disconnected phones, caller screening, answering machines, work hours, no-call lists). Other sources of nonresponse bias include the growing number of non-English speakers and distrust caused by scams and spams.

    65/82

    Survey Research

    Survey Types

    Type of Survey: Interviews
    Characteristics: Interviewing is expensive and time-consuming, yet trading sample size for high-quality results may still be worth it. Interviews must be carefully handled, so interviewers must be well-trained, which is an added cost. But you can obtain information on complex or sensitive topics (e.g., gender discrimination in companies, birth control practices, diet and exercise habits).

    66/82

    Survey Research

    Survey Types

    Type of Survey: Web
    Characteristics: Web surveys are growing in popularity, but are subject to nonresponse bias because those who participate may differ from those who feel too busy, don't own computers or distrust your motives (scams and spam are again to blame). This type of survey works best when targeted to a well-defined interest group on a question of self-interest (e.g., views of CPAs on new proposed accounting rules, frequent flyer views on airline security).

    67/82

    Survey Research

    Survey Types

    Type of Survey: Direct Observation
    Characteristics: This can be done in a controlled setting (e.g., psychology lab) but requires informed consent, which can change behavior. Unobtrusive observation is possible in some nonlab settings (e.g., what percentage of airline passengers carry on more than two bags, what percentage of SUVs carry no passengers, what percentage of drivers wear seat belts).

    68/82

    Survey Research

    Survey Guidelines

    Plan      What is the purpose of the survey? Consider staff expertise, needed skills, degree of precision, budget.
    Design    Invest time and money in designing the survey. Use books and references to avoid unnecessary errors.
    Quality   Take care in preparing a quality survey so that people will take you seriously.

    69/82

    Survey Research

    Survey Guidelines

    Pilot Test   Pretest on friends or co-workers to make sure the survey is clear.
    Buy-in       Improve response rates by stating the purpose of the survey, offering a token of appreciation or paving the way with endorsements.
    Expertise    Work with a consultant early on.

    70/82

    Survey Research

    Getting Advice
    - Consider hiring a consultant in the early stages.
    - Many resources are available to help
      - The American Statistical Association (http://www.amstat.org/)
      - The Research Industry Coalition (http://www.researchindustry.org/)
      - The Council of American Survey Research Organizations (http://www.casro.org/)
    71/82

    Survey Research

    Questionnaire Design
    - Use a lot of white space in layout.
    - Begin with short, clear instructions.
    - State the survey purpose.
    - Assure anonymity.
    - Instruct on how to submit the completed survey.

    72/82

    Survey Research

    Questionnaire Design
    - Break survey into naturally occurring sections.
    - Let respondents bypass sections that are not applicable (e.g., if you answered no to Question 7, skip directly to Question 15).
    - Pretest and revise as needed.
    - Keep as short as possible.

    73/82

    Survey Research

    Questionnaire Design

    Type of Question      Example
    Open-ended question   Briefly describe your job goals.
    Fill-in-the-blank     How many times did you attend formal religious services during the last year? ________ times
    Check boxes           Which of these statistics packages have you ever used? SAS, Visual Statistics, SPSS, MegaStat, Systat, Minitab

    74/82

    Survey Research

    Questionnaire Design

    Type of Question   Example
    Ranked choices     Please evaluate your dining experience

                  Excellent   Good   Fair   Poor
    Food
    Service
    Ambiance
    Cleanliness
    Overall

    75/82

    Survey Research

    Questionnaire Design

    Type of Question   Example
    Pictograms         What do you think of the President's economic policies? (circle one)
    Likert scale       Statistics is a difficult subject.
                       Strongly Agree   Slightly Agree   Neither Agree Nor Disagree   Slightly Disagree   Strongly Disagree

    76/82

    Survey Research

    Question Wording
    The way a question is asked has a profound influence on the response. For example,
    1. Shall state taxes be cut?
    2. Shall state taxes be cut, if it means reducing highway maintenance?
    3. Shall state taxes be cut, if it means firing teachers and police?

    77/82

    Survey Research

    Question Wording
    - Make sure you have covered all the possibilities. For example,
      Are you married? Yes / No
    - Overlapping classes or unclear categories are a problem. For example,
      How old is your father? 35-45, 45-55, 55-65, 65 or older

    78/82

    Survey Research

    Coding and Data Screening
    - Responses are usually coded numerically (e.g., 1 = male, 2 = female).
    - Missing values are typically denoted by special characters (e.g., blank, . or *).
    - Discard questionnaires that are flawed or missing many responses.
    - Watch for multiple responses, outrageous or inconsistent replies or range answers.
    - Follow up if necessary and always document your data-coding decisions.
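A minimal Python sketch of screening one raw survey answer, assuming answers arrive as text and that the analyst supplies a plausible range for each question. The function name, the status labels, and the age limits in the usage lines are hypothetical.

```python
MISSING_CODES = {"", ".", "*"}  # blank, period or asterisk, as on the slide

def screen_response(raw, low, high):
    """Classify one raw answer as missing, invalid, or ok."""
    text = raw.strip()
    if text in MISSING_CODES:
        return ("missing", None)
    try:
        value = float(text)
    except ValueError:
        # Range answers such as "30-40" and other non-numeric replies.
        return ("invalid", None)
    if not low <= value <= high:
        return ("invalid", None)  # outrageous reply outside the plausible range
    return ("ok", value)

# Hypothetical answers to "What is your age?" with a plausible range of 0 to 120.
for answer in ["42", " . ", "30-40", "250"]:
    print(repr(answer), screen_response(answer, 0, 120))
```

Keeping the status alongside the value makes it easy to document which coding decision was applied to each response.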

    79/82

    Survey Research

    Sources of Error

    Source of Error     Characteristics
    Nonresponse bias    Respondents differ from nonrespondents
    Selection bias      Self-selected respondents are atypical
    Response error      Respondents give false information
    Coverage error      Incorrect specification of frame or population
    Interviewer error   Responses influenced by interviewer
    Measurement error   Survey instrument wording is biased or unclear
    Sampling error      Random and unavoidable

    80/82

    Survey Research

    Data File Format
    - Enter data into a spreadsheet or database as a flat file (n subjects x m variables matrix).

    81/82

    Advice on Copying Data
    - Using commas (,), dollar signs ($), or percents (%) as part of the values may result in your data being treated as text values.
    - A numerical variable may only contain the digits 0-9, a decimal point, and a minus sign.
    - To avoid round-off errors, format the data column as plain numbers with the desired number of decimal places before you copy the data to a statistical package.
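If formatted values have already been copied as text, they can be cleaned before analysis. The Python sketch below is one possible approach, not a prescribed one; `to_number` is a hypothetical helper, and treating a trailing percent sign as "divide by 100" is an assumption about how the values should be read.

```python
def to_number(text):
    """Strip commas and dollar signs; read a trailing % as a fraction."""
    cleaned = text.strip().replace(",", "").replace("$", "")
    if cleaned.endswith("%"):
        return float(cleaned[:-1]) / 100
    return float(cleaned)

# Hypothetical cell contents copied from a formatted spreadsheet.
for cell in ["$1,234.50", "45%", "-12.5", "1,000,000"]:
    print(repr(cell), to_number(cell))
```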

    82/82

    Applied Statistics in Business and Economics

    End of Chapter 2