SESSION 7: DATA ANALYSIS
Chapter 15: Data Preparation and Description
Chapter 16: Exploring, Displaying, and Examining Data
RESEARCH METHODOLOGY
CHAPTER 15 Data Preparation and Description
Learning Objectives:
• The importance of editing the collected raw data to detect errors and omissions.
• How coding is used to assign numbers and other symbols to answers and to categorize responses.
• The use of content analysis to interpret and summarize open questions.
• Problems with and solutions for “don’t know” responses and handling missing data.
• The options for data entry and manipulation.
15-3
Data Preparation in the Research Process
15-4
Monitoring Online Survey Data
Online surveys need special editing attention. CfMC provides software and support to research suppliers to prevent interruptions from damaging data.
15-5
Editing
Criteria
Consistent
Uniformly entered
Arranged for simplification
Complete
Accurate
15-6
Field Editing
Speed without accuracy won’t help the manager choose the right direction.
• Field editing review
• Entry gaps identified
• Callbacks made
• Validate results
15-7
Central Editing
Be familiar with instructions given to interviewers and coders
Do not destroy the original entry
Make all editing entries identifiable and in standardized form
Initial all answers changed or supplied
Place initials and date of editing on each instrument completed
15-8
Sample Codebook
15-9
Precoding
15-10
Coding Open-Ended Questions
6. What prompted you to purchase your most recent life insurance policy?
_______________________________
15-11
Coding Rules
Categories should be
Appropriate to the research problem
Exhaustive
Mutually exclusive
Derived from one classification principle
15-12
Content Analysis
QSR’s XSight software for content analysis.
15-13
Content Analysis
15-14
Types of Content Analysis
Syntactical
Propositional
Referential
Thematic
15-15
Open-Question Coding
Locus of Responsibility        Mentioned    Not Mentioned
A. Company                     _________    _________
B. Customer                    _________    _________
C. Joint Company-Customer      _________    _________
F. Other                       _________    _________
Locus of Responsibility            Frequency (n = 100)
A. Management
   1. Sales manager                10
   2. Sales process                20
   3. Other                         7
   4. No action area identified     3
B. Management
   1. Training                     15
C. Customer
   1. Buying processes             12
   2. Other                         8
   3. No action area identified     5
D. Environmental conditions
E. Technology
F. Other
                                   20
15-16
Handling “Don’t Know” Responses
Question: Do you have a productive relationship with your present salesperson?
Years of Purchasing    Yes        No         Don’t Know
Less than 1 year       10%        40%        38%
1 – 3 years            30         30         32
4 years or more        60         30         30
Total                  100%       100%       100%
                       n = 650    n = 150    n = 200
15-17
Data Entry
Database Programs
Optical Recognition
Digital/Barcodes
Voice recognition
Keyboarding
15-18
Missing Data
Listwise Deletion
Pairwise Deletion
Replacement
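The three options above can be sketched in a few lines. This is a minimal illustration, not a recommended procedure: the records, the variable names, and the choice of mean substitution for "replacement" are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical survey records; None marks a missing answer.
records = [
    {"age": 34, "income": 52, "satisfaction": 4},
    {"age": 41, "income": None, "satisfaction": 5},
    {"age": None, "income": 61, "satisfaction": 3},
    {"age": 29, "income": 48, "satisfaction": None},
    {"age": 50, "income": 75, "satisfaction": 4},
]
variables = ["age", "income", "satisfaction"]

def listwise(rows, cols):
    """Listwise deletion: keep only cases complete on every variable."""
    return [r for r in rows if all(r[c] is not None for c in cols)]

def pairwise(rows, a, b):
    """Pairwise deletion: keep cases complete on the two variables in use."""
    return [r for r in rows if r[a] is not None and r[b] is not None]

def replace_with_mean(rows, col):
    """Replacement (assumed here to be mean substitution) for one variable."""
    fill = mean(r[col] for r in rows if r[col] is not None)
    return [{**r, col: fill if r[col] is None else r[col]} for r in rows]

complete_cases = listwise(records, variables)
age_income_cases = pairwise(records, "age", "income")
imputed = replace_with_mean(records, "income")

print(len(complete_cases), len(age_income_cases), imputed[1]["income"])
```

Listwise deletion keeps the fewest cases, pairwise deletion keeps more but uses a different subset for each pair of variables, and replacement keeps every case at the cost of inserting an estimated value.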
15-19
Key Terms
• Bar code
• Codebook
• Coding
• Content analysis
• Data entry
• Data field
• Data file
• Data preparation
• Data record
• Database
• Don’t know response
• Editing
• Missing data
• Optical character recognition
• Optical mark recognition
• Precoding
• Spreadsheet
• Voice recognition
Appendix 15a: Describing Data Statistically
McGraw-Hill/Irwin Copyright © 2011 by The McGraw-Hill Companies, Inc. All Rights Reserved.
15-21
Frequencies
A
Unit Sales Increase (%)    Frequency    Percentage    Cumulative Percentage
5                          1            11.1          11.1
6                          2            22.2          33.3
7                          3            33.3          66.7
8                          2            22.2          88.9
9                          1            11.1          100
Total                      9            100.0

B
Unit Sales Increase (%)    Frequency    Percentage    Cumulative Percentage
Origin, foreign (1)
6                          1            11.1          11.1
7                          2            22.2          33.3
8                          2            22.2          55.5
Origin, foreign (2)
5                          1            11.1          66.6
6                          1            11.1          77.7
7                          1            11.1          88.8
9                          1            11.1          100.0
Total                      9            100.0
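A frequency table like table A can be built directly from the raw values. The sketch below assumes the nine underlying observations implied by table A's frequencies; the function name `frequency_table` is made up for the example.

```python
from collections import Counter

# Nine unit sales increases (%), consistent with table A's frequencies.
increases = [5, 6, 6, 7, 7, 7, 8, 8, 9]

def frequency_table(values):
    """Return rows of (value, frequency, percentage, cumulative percentage)."""
    counts = Counter(values)
    n = len(values)
    rows, running = [], 0
    for value in sorted(counts):
        running += counts[value]
        rows.append((
            value,
            counts[value],
            round(100 * counts[value] / n, 1),
            round(100 * running / n, 1),
        ))
    return rows

table = frequency_table(increases)
for value, freq, pct, cum in table:
    print(f"{value:>5} {freq:>9} {pct:>10.1f} {cum:>10.1f}")
```

The cumulative percentage is computed from the running count rather than by summing rounded percentages, which avoids drift from rounding.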
15-22
Distributions
15-23
Characteristics of Distributions
15-24
Measures of Central Tendency
Mean
Mode
Median
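A short sketch of the three measures, using the nine unit sales increases from the Frequencies slide. The second data set, with one extreme value, is a hypothetical addition to show how the mean reacts to an outlier while the median and mode do not.

```python
from statistics import mean, median, multimode

# Unit sales increases (%) from the Frequencies slide.
increases = [5, 6, 6, 7, 7, 7, 8, 8, 9]

center_mean = mean(increases)         # arithmetic average
center_median = median(increases)     # middle value of the sorted data
center_modes = multimode(increases)   # most frequent value(s)

# Hypothetical variant: the largest value is replaced by an outlier.
with_outlier = [5, 6, 6, 7, 7, 7, 8, 8, 27]
outlier_mean = mean(with_outlier)
outlier_median = median(with_outlier)

print(center_mean, center_median, center_modes)
print(outlier_mean, outlier_median)
```

For the symmetric original data all three measures agree at 7. With the outlier, the mean rises to 9 while the median stays at 7.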
15-25
Measures of Variability
Interquartile range
Quartile deviation
Range
Standard deviation
Variance
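The measures listed above can be computed as follows, again using the nine unit sales increases from the Frequencies slide. This is a sketch under two assumptions: the sample (n - 1) formulas are used for variance and standard deviation, and quartiles follow the default method of Python's `statistics.quantiles`. Other quartile conventions can give slightly different values.

```python
from statistics import quantiles, stdev, variance

# Unit sales increases (%) from the Frequencies slide.
increases = [5, 6, 6, 7, 7, 7, 8, 8, 9]

data_range = max(increases) - min(increases)

# Sample variance and standard deviation (divide by n - 1).
sample_variance = variance(increases)
sample_sd = stdev(increases)

# Quartiles: Q1, Q2 (median), Q3.
q1, q2, q3 = quantiles(increases, n=4)
iqr = q3 - q1                   # interquartile range
quartile_deviation = iqr / 2    # also called the semi-interquartile range

print(data_range, sample_variance, round(sample_sd, 3), iqr, quartile_deviation)
```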
15-26
Summarizing Distribution Shape
15-27
Symbols
Variable                            Population    Sample
Mean                                µ             X̄
Proportion                          π             p
Variance                            σ²            s²
Standard deviation                  σ             s
Size                                N             n
Standard error of the mean          σ_X̄           S_X̄
Standard error of the proportion    σ_p           S_p
15-28
Key Terms
• Central tendency
• Descriptive statistics
• Deviation scores
• Frequency distribution
• Interquartile range (IQR)
• Kurtosis
• Median
• Mode
• Normal distribution
• Quartile deviation (Q)
• Skewness
• Standard deviation
• Standard normal distribution
• Standard score (Z score)
• Variability
• Variance
CHAPTER 16 Exploring, Displaying, and Examining Data
Learning Objectives:
• That exploratory data analysis techniques provide insights and data diagnostics by emphasizing visual representations of the data.
• How cross-tabulation is used to examine relationships involving categorical variables, serves as a framework for later statistical testing, and makes an efficient tool for data visualization and later decision-making.
16-30
Exploratory Data Analysis
Confirmatory
Exploratory
16-31
Data Exploration, Examination, and Analysis in the Research Process
16-32
Frequency of Ad Recall
Value Label    Value    Frequency    Percent    Valid Percent    Cumulative Percent
16-33
Bar Chart
16-34
Pie Chart
16-35
Frequency Table
16-36
Histogram
16-37
Stem-and-Leaf Display
[Figure: stem-and-leaf display with stems 5 through 21]
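A stem-and-leaf display can be generated with a few lines of code. The sketch below uses hypothetical values, not the data from the slide, and assumes the stem is the value with its last digit removed and the leaf is that last digit.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return display lines of the form 'stem | leaves' for whole numbers."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)   # stem = tens and above, leaf = units
    lines = []
    # Include empty stems so gaps in the distribution stay visible.
    for stem in range(min(stems), max(stems) + 1):
        leaves = "".join(str(d) for d in stems.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines

# Hypothetical data for illustration only.
sample = [54, 55, 55, 61, 62, 70, 83, 83, 101]
lines = stem_and_leaf(sample)
print("\n".join(lines))
```

Unlike a histogram, the display keeps every original value readable: the row `8 | 33` shows that 83 occurs twice.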
16-38
Pareto Diagram
16-39
Boxplot Components
16-40
Diagnostics with Boxplots
16-41
Boxplot Comparison
16-42
Mapping
16-43
Geograph: Digital Camera Ownership
16-44
SPSS Cross-Tabulation
16-45
Percentages in Cross-Tabulation
16-46
Guidelines for Using Percentages
Averaging percentages
Use of too large percentages
Using too small a base
Percentage decreases can never exceed 100%
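Two of these guidelines can be checked with simple arithmetic. The figures below are hypothetical. The sketch shows that percentages from groups of different sizes should be weighted by their bases before averaging, and that a decrease from a positive value cannot exceed 100% while an increase can.

```python
# Hypothetical groups: (percentage answering "yes", base size n).
groups = [(60, 650), (30, 150)]

# A simple average ignores the different bases.
simple_average = sum(p for p, _ in groups) / len(groups)

# A weighted average uses each group's base as its weight.
weighted_average = sum(p * n for p, n in groups) / sum(n for _, n in groups)

def percent_change(old, new):
    """Percentage change from old to new, relative to the old value."""
    return (new - old) / old * 100

increase = percent_change(50, 200)   # rising from 50 to 200
decrease = percent_change(200, 50)   # falling back from 200 to 50

print(simple_average, weighted_average, increase, decrease)
```

The same move between 50 and 200 is a 300% increase in one direction and a 75% decrease in the other, because the base changes.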
16-47
Cross-Tabulation with Control and Nested Variables
16-48
Automatic Interaction Detection (AID)
16-49
Exploratory Data Analysis
This Booth Research Services ad suggests that the researcher’s role is to make sense of data displays.
Great data exploration and analysis delivers insight from data.
16-50
Key Terms
• Automatic interaction detection (AID)
• Boxplot
• Cell
• Confirmatory data analysis
• Contingency table
• Control variable
• Cross-tabulation
• Exploratory data analysis (EDA)
• Five-number summary
• Frequency table
• Histogram
• Interquartile range (IQR)
• Marginals
• Nonresistant statistics
• Outliers
• Pareto diagram
• Resistant statistics
• Stem-and-leaf display
Working with Data Tables
16-52
Original Data Table
Our grateful appreciation to eMarketer for the use of their table.
16-53
Arranged by Spending
16-54
Arranged by No. of Purchases
16-55
Arranged by Avg. Transaction, Highest
16-56
Arranged by Avg. Transaction, Lowest