Introduction to SPSS with the Interpretation and Steps
DESCRIPTION
Author: Wan Mohamad Asyraf Bin Wan Afthanorhan, Nazim Aimran
This book helps readers conduct their research using SPSS. It covers the basics, application, assumptions, and interpretation of each analysis. The statistical analyses covered are the normality test, independent t-test, one-way ANOVA, two-way ANOVA, association analysis, correlation analysis, and regression analysis.
TRANSCRIPT
Ahmad Nazim & W.M.AsyrafIntroduction to SPSS hands-0n
Toolbar
This toolbar is available in the SPSS Data Editor and provides quick and easy access to
frequently used features. The following are some of the frequently used tools in the Data
Editor.
Item : Description
File Open : allows data files to be opened for analysis.
File Save : saves the file in the active window.
File Print : prints the file in the active window.
Insert Cases : inserts a case above the case containing the active cell.
Insert Variable : inserts a variable to the left of the variable containing the
active cell.
Value Labels : allows toggling between actual values and value labels in
the Data Editor.
Select Cases : provides methods for selecting a subgroup of cases based
on criteria that include variables and complex expressions.
Split Files : splits the data file into separate groups for analysis based on
the values of one or more grouping variables.
The box shown above is the dialogue box that appears when PASW SPSS 18.0 is opened. To
open an existing file, choose a file under “Open an existing data source”. Since we
are not going to use this dialogue box, click the Cancel button to close it.
If you want to open existing data, go to File > Open > Data.
You can now browse to the location where you saved the existing SPSS data file.
Data Editor Window
The window (shown above) is the Data Editor Window. It has 12 pull-down
PASW Statistics menus available to the user: File, Edit, View, Data,
Transform, Analyze, Direct Marketing, Graphs, Utilities, Add-ons, Window and Help. At
the bottom left of the window, there are options for the Data View and Variable View
windows.
The Data View window is where you will type in your data. However, you must first tell
SPSS certain things about your data, and you do this in the Variable View window.
The Variable View window has 10 columns, which tell the program different things about
the measurement values, such as whether the values are qualitative or
quantitative.
Defining Variables
Before PASW SPSS 18.0 can run an analysis, the variables of the research must be defined
in the Data Editor Window, before any data are entered. Click Variable View at the bottom
left of the PASW Statistics Data Editor. The Define Variable dialog box appears, as shown in
the figure below. It consists of:
Name
Type
Width
Decimals
Label
Values
Missing
Measure
o A nominal variable is for mutually exclusive, but not ordered, categories. For
example, your study might compare achievement between genders: male and
female.
o An ordinal variable is one where the order matters but not the difference between
values. For example, you might ask patients to express the amount of pain they
are feeling on a scale of 1 to 10. Another example would be movie ratings, from *
to *****.
o An interval variable is a measurement where the difference between two values is
meaningful. For example, you might ask for the respondents’ salaries in order to
compare their salaries with their expenses.
o Ratio data is interval data with a natural zero point. For example, time is ratio
since 0 time is meaningful. A weight of 4 grams is twice a weight of 2 grams,
because weight is a ratio variable. A temperature of 100 degrees C is not twice
as hot as 50 degrees C, because temperature in degrees C is not a ratio variable. A pH of 3
is not twice as acidic as a pH of 6, because pH is not a ratio variable.
Variable Name | Label | Value Label | Measure
Gender | Respondent’s Sex | 1 = Male, 2 = Female | Nominal
Qualification | Respondent’s Highest Education Background | 1 = SPM, 2 = Diploma, 3 = Bachelor, 4 = Master, 5 = PhD | Ordinal
Income | Respondent’s Monthly Income | 1 = ≤ RM999, 2 = RM1000 – RM1999, 3 = RM2000 – RM2999, 4 = RM3000 – RM3999, 5 = ≥ RM4000 | Interval
Weight | Respondent’s Weight | Any value | Ratio
Value Labels
A value label is a label assigned to a particular value of a variable. For example, for a race
variable, we might use the codes 1 = Malay, 2 = Chinese, 3 = Indian, 4 = Others.
In our case, for the gender labels we will use 1 = Male and 2 = Female.
To enter the codes:
Type “1” in the Value box and “male” in the Label box. Then click “Add”.
Type “2” in the Value box and “female” in the Label box. Then click “Add”. Finish by clicking “OK”.
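For readers who want to see the idea outside the dialog box, the following is a minimal Python sketch of what a value label does: it maps a stored code to a readable label. The function name `apply_value_labels` and the sample responses are made up for illustration and are not part of SPSS.

```python
# Coding scheme taken from the gender example in the text.
GENDER_LABELS = {1: "Male", 2: "Female"}


def apply_value_labels(codes, labels):
    """Return the label for each code; codes without a label are kept as text."""
    return [labels.get(code, str(code)) for code in codes]


raw_gender = [1, 2, 2, 1]  # made-up responses, stored as codes
print(apply_value_labels(raw_gender, GENDER_LABELS))
```

The stored data stay numeric; only the display changes, which is what the Value Labels toolbar button toggles.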
Missing Values
Select the Discrete missing values button. Then type “99” (for example), or any other code
that is not used as a valid value of the variable, to represent missing values.
Repeat these steps to label the other variables. For age, the Measure column should be set to
“Scale” since age is an interval measure. The same measure applies to the visit, serv_prop,
ser_friendly, serv_clean, serv_time and serv_overall variables. For employment and
residence, we will use the “Nominal” measure.
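The effect of declaring a missing-value code can be sketched in a few lines of Python. This is only an illustration under the assumption that 99 is the declared code, as in the example above; the helper name and the ages are hypothetical.

```python
from statistics import mean

MISSING_CODE = 99  # example code from the text; any unused code would do


def mean_excluding_missing(values, missing_code=MISSING_CODE):
    """Average only the valid cases, skipping the declared missing code."""
    valid = [v for v in values if v != missing_code]
    if not valid:
        raise ValueError("no valid cases to average")
    return mean(valid)


ages = [25, 31, 99, 40]  # made-up ages; 99 marks a missing answer
print(mean_excluding_missing(ages))
```

Without the declaration, the 99 would be averaged as if it were a real age, which is why the code must never coincide with a valid value.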
Assumptions of Parametric Tests
1. Data must be normally distributed
2. Data must have equal variances
3. The sample must contain more than 30 cases
Testing Normality
Testing the normality of our data is a prerequisite for inferential statistical techniques.
A normality test is also needed before using parametric tests on our data, because
parametric tests assume normally distributed data.
To check normality for a single variable, follow these steps:
Analyze > Descriptive Statistics > Explore
Select the variable of interest, for example age. Then click the “Plots” button.
Ensure that the “Factor levels together” button is selected in the Boxplots display.
Tick the “Stem-and-leaf”, “Histogram” and “Normality plots with tests” boxes.
Click Continue for the results.
Normality Test (Single Variable) Output
In the diagram above, the histogram shows a bell-shaped distribution with no
skewness to either the left or the right. Therefore, the age variable can be concluded to be normal.
Another way to look at the distribution of our data is to use the Normal Q-Q plot. In our
case, the points lie along the straight line and show no pattern, so the distribution of the
age data can be concluded to be normal.
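To make the Q-Q plot less of a black box, the sketch below computes the pairs of points such a plot draws: each sorted observation against the value expected under a normal distribution with the same mean and standard deviation. The Blom plotting positions used here are an assumed choice for illustration, and the ages are made up.

```python
from statistics import NormalDist, mean, stdev


def qq_points(values):
    """Pair each sorted observation with its expected normal quantile."""
    n = len(values)
    ordered = sorted(values)
    fitted = NormalDist(mean(values), stdev(values))
    # Blom plotting positions (an assumption): (i - 3/8) / (n + 1/4)
    expected = [fitted.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    return list(zip(ordered, expected))


ages = [23, 25, 27, 28, 30, 31, 33, 35, 38]  # made-up ages
for observed, expected in qq_points(ages):
    print(f"observed={observed:5.1f}  expected={expected:5.1f}")
```

When the data are close to normal, the observed and expected values in each pair are similar, which is what "the points lie along the straight line" means.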
To check normality for multiple variables, follow these steps:
Analyze > Regression > Linear
Move “overall quality” into the Dependent box. Move the other observed variables
(demographic profile excluded) into the Independent(s) box. Then click the “Plots”
button.
Normality Test (Multiple Variables) Output
In the diagram above, the histogram shows a bell-shaped distribution with no
skewness to either the left or the right. Therefore, the variables in this case study can be
concluded to be normal.
Another way to look at the distribution of our data is to use the normal probability plot. In our
case, the points lie along the straight line and show no pattern, so the data
distribution can be concluded to be normal.
Recode Into Different Variable
Recode assigns discrete values to a variable, based solely on the present values of the
variable being recoded. You may want to recode a variable for easier interpretation or
decision making.
Step 1: Click Transform > Recode Into Different Variables
Step 2: Type “overall” in the Name box. Then click the “Old and New Values” button.
Step 3: Let us recode the overall perception variable:
1 thru 2 = 1 (low / disagree)
3 = 2 (medium / undecided)
4 thru 5 = 3 (high / agree)
Then click the “Continue” button.
The new recoded variable will now appear as the last column in the Data View window.
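The recoding rule in Step 3 can be written out as a small Python sketch. The function name and the sample ratings are hypothetical; treating values outside 1 to 5 as missing is an assumption made for this illustration.

```python
def recode_overall(score):
    """Map a 1-5 rating to 1 (low), 2 (medium) or 3 (high)."""
    if score in (1, 2):
        return 1
    if score == 3:
        return 2
    if score in (4, 5):
        return 3
    return None  # assumption: anything outside 1-5 is treated as missing


serv_overall = [5, 3, 4, 2, 1]  # made-up ratings
overall = [recode_overall(s) for s in serv_overall]
print(overall)  # [3, 2, 3, 1, 1]
```

The original ratings are left untouched and the recoded values go into a new list, mirroring the "into different variables" option.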
Independent Sample T-Test
The Independent Samples T-test procedure tests the null hypothesis that the population
mean of a variable is the same for two groups of cases. It also displays a confidence
interval for the difference between the population means of the groups.
Step 1: Click Analyze > Compare Means > Independent-Samples T Test
Step 2: Transfer the test variables into the Test Variable(s) box, then transfer the gender
variable into the Grouping Variable box (below).
Step 3: Click the Define Groups button to define the two categories of the gender
variable. In this case, there are only two categories: male for Group 1 and female for
Group 2. These categories are coded with the values 1 and 2. Hence, type the value 1 for
Group 1 and 2 for Group 2.
Step 4: Click Continue, then OK. The following output appears:
Group Statistics
Variable | Respondent's sex | N | Mean | Std. Deviation | Std. Error Mean
infrastructure | male | 7 | 3.5714 | .53452 | .20203
infrastructure | female | 16 | 3.6250 | .95743 | .23936
service quality | male | 7 | 3.4286 | .53452 | .20203
service quality | female | 16 | 3.8125 | 1.04682 | .26171
cleanliness quality | male | 7 | 4.2857 | .75593 | .28571
cleanliness quality | female | 16 | 3.8125 | .75000 | .18750
queue time | male | 7 | 3.1429 | .89974 | .34007
queue time | female | 16 | 2.8750 | .88506 | .22127
overall quality | male | 7 | 4.0000 | .81650 | .30861
overall quality | female | 16 | 4.0000 | .73030 | .18257
The table above shows the means of infrastructure, service quality, cleanliness quality,
queue time and overall quality for male and female respondents. According to the table, the
mean of cleanliness quality for males is the highest (4.2857), followed by overall quality
for both males and females (4.0000). The lowest mean is queue time for females (2.8750).
Independent Samples Test
(F and Levene Sig. belong to Levene's Test for Equality of Variances; the remaining columns belong to the t-test for Equality of Means.)
Variable | Variances | F | Levene Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI of the Difference, Lower | Upper
infrastructure | Equal variances assumed | 1.447 | .242 | -.138 | 21 | .892 | -.05357 | .38888 | -.86228 | .75514
infrastructure | Equal variances not assumed | | | -.171 | 19.387 | .866 | -.05357 | .31322 | -.70827 | .60113
service quality | Equal variances assumed | 3.000 | .098 | -.911 | 21 | .372 | -.38393 | .42131 | -1.26010 | .49224
service quality | Equal variances not assumed | | | -1.161 | 20.237 | .259 | -.38393 | .33061 | -1.07306 | .30520
cleanliness | Equal variances assumed | .000 | .987 | 1.389 | 21 | .179 | .47321 | .34064 | -.23519 | 1.18162
cleanliness | Equal variances not assumed | | | 1.385 | 11.433 | .193 | .47321 | .34174 | -.27550 | 1.22193
queue time | Equal variances assumed | .106 | .748 | .665 | 21 | .513 | .26786 | .40299 | -.57020 | 1.10592
queue time | Equal variances not assumed | | | .660 | 11.342 | .522 | .26786 | .40571 | -.62184 | 1.15755
overall quality | Equal variances assumed | .091 | .765 | .000 | 21 | 1.000 | .00000 | .34256 | -.71239 | .71239
overall quality | Equal variances not assumed | | | .000 | 10.424 | 1.000 | .00000 | .35857 | -.79456 | .79456
The table above contains the results of Levene’s Test for Equality of Variances and the
t-test for Equality of Means. Levene’s test is not significant for any variable (all Sig.
values are greater than 0.05), so the “Equal variances assumed” rows are the ones to read.
Most researchers focus on the Sig. (2-tailed) value of the t-test for Equality of Means to
determine whether differences exist between males and females. In this case, p > 0.05 for
all of the variables, so none of the differences is significant. Hence, the null hypothesis
is not rejected: there is no significant difference between males and females on any of the
variables included.
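The t and df columns can be reproduced from the Group Statistics table alone. The sketch below uses the standard pooled-variance and Welch formulas and the infrastructure row as input; it is an illustration of the arithmetic, not a description of how SPSS computes the output internally, and the function names are made up.

```python
from math import sqrt


def pooled_t(n1, m1, s1, n2, m2, s2):
    """t statistic and df when equal variances are assumed."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    se = sqrt(sp2 * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, n1 + n2 - 2


def welch_t(n1, m1, s1, n2, m2, s2):
    """t statistic and Welch-Satterthwaite df when equal variances are not assumed."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    se = sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return (m1 - m2) / se, df


# (N, Mean, Std. Deviation) for infrastructure, from the Group Statistics table
male = (7, 3.5714, 0.53452)
female = (16, 3.6250, 0.95743)

t_pooled, df_pooled = pooled_t(*male, *female)
t_welch, df_welch = welch_t(*male, *female)
print(round(t_pooled, 3), df_pooled)
print(round(t_welch, 3), round(df_welch, 3))
```

The printed values should agree with the two infrastructure rows of the Independent Samples Test table up to rounding.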
One-Way ANOVA
Step 1: Click Analyze > Compare Means > One-Way ANOVA
Step 2: Transfer infrastructure (serv_prop) from the variable list into the Dependent List,
then transfer work place (employment) into the Factor box.
Step 3: Click Post Hoc > Tick LSD
Step 4: Click Options > Tick Descriptive, Fixed and random effects, and Homogeneity of variance test
Step 5: Click Continue, followed by OK. The following result is produced:
Descriptives
Infrastructure
Group | N | Mean | Std. Deviation | Std. Error | 95% CI for Mean, Lower Bound | Upper Bound | Minimum | Maximum | Between-Component Variance
Government | 6 | 4.1667 | 1.32916 | .54263 | 2.7718 | 5.5615 | 2.00 | 6.00 |
Private Sector | 6 | 3.1667 | .40825 | .16667 | 2.7382 | 3.5951 | 3.00 | 4.00 |
GLC Sector | 5 | 3.6000 | .54772 | .24495 | 2.9199 | 4.2801 | 3.00 | 4.00 |
Self Employed | 6 | 3.5000 | .54772 | .22361 | 2.9252 | 4.0748 | 3.00 | 4.00 |
Total | 23 | 3.6087 | .83878 | .17490 | 3.2460 | 3.9714 | 2.00 | 6.00 |
Model, Fixed Effects | | | .80677 | .16822 | 3.2566 | 3.9608 | | |
Model, Random Effects | | | | .21266 | 2.9319 | 4.2855 | | | .06731
The Descriptives table shows the mean of infrastructure for each category of work place.
Respondents in the government sector give the highest mean rating of infrastructure in
relation to customer satisfaction (4.1667), while respondents in the private sector give
the lowest (3.1667).
Test of Homogeneity of Variances
infrastructure
Levene Statistic df1 df2 Sig.
1.643 3 19 .213
The test of homogeneity of variances is not significant since 0.213 > 0.05. So, the null
hypothesis is not rejected, and the population variances of the groups can be treated as
approximately equal. This test is required to determine whether the group variances are
homogeneous or heterogeneous.
ANOVA
Infrastructure
Sum of Squares df Mean Square F Sig.
Between Groups 3.112 3 1.037 1.594 .224
Within Groups 12.367 19 .651
Total 15.478 22
The significance value in the ANOVA table is 0.224, which is greater than 0.05. Hence,
the null hypothesis is not rejected: the mean rating of infrastructure in relation to customer
satisfaction does not differ significantly across work places. However, the ANOVA result alone
is not enough to identify which work places differ from each other. Thus, the LSD test is used
to determine where any significant differences lie. The result is shown below:
Multiple Comparisons
Infrastructure
LSD
(I) work place | (J) work place | Mean Difference (I-J) | Std. Error | Sig. | 95% CI, Lower Bound | Upper Bound
Government Sector | Private Sector | 1.00000* | .46579 | .045 | .0251 | 1.9749
Government Sector | GLC Sector | .56667 | .48852 | .260 | -.4558 | 1.5892
Government Sector | Self Employed | .66667 | .46579 | .169 | -.3082 | 1.6416
Private Sector | Government Sector | -1.00000* | .46579 | .045 | -1.9749 | -.0251
Private Sector | GLC Sector | -.43333 | .48852 | .386 | -1.4558 | .5892
Private Sector | Self Employed | -.33333 | .46579 | .483 | -1.3082 | .6416
GLC Sector | Government Sector | -.56667 | .48852 | .260 | -1.5892 | .4558
GLC Sector | Private Sector | .43333 | .48852 | .386 | -.5892 | 1.4558
GLC Sector | Self Employed | .10000 | .48852 | .840 | -.9225 | 1.1225
Self Employed | Government Sector | -.66667 | .46579 | .169 | -1.6416 | .3082
Self Employed | Private Sector | .33333 | .46579 | .483 | -.6416 | 1.3082
Self Employed | GLC Sector | -.10000 | .48852 | .840 | -1.1225 | .9225
*. The mean difference is significant at the 0.05 level.
The outcome shows that the government sector and the private sector have significantly
different means for infrastructure in relation to customer satisfaction (p = 0.045).
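The F ratio in the ANOVA table and the standard errors in the LSD table can both be recovered from the N, Mean and Std. Deviation columns of the Descriptives table. The following Python sketch shows that arithmetic using the standard one-way ANOVA formulas; the function names are hypothetical, and p-values are left out because they need the F and t distributions.

```python
from math import sqrt

# (N, Mean, Std. Deviation) per work place, copied from the Descriptives table
groups = {
    "Government": (6, 4.1667, 1.32916),
    "Private Sector": (6, 3.1667, 0.40825),
    "GLC Sector": (5, 3.6000, 0.54772),
    "Self Employed": (6, 3.5000, 0.54772),
}


def one_way_anova(groups):
    """Return (F ratio, mean square within) from per-group summary statistics."""
    n_total = sum(n for n, _, _ in groups.values())
    grand_mean = sum(n * m for n, m, _ in groups.values()) / n_total
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups.values())
    ss_within = sum((n - 1) * s**2 for n, _, s in groups.values())
    ms_between = ss_between / (len(groups) - 1)
    ms_within = ss_within / (n_total - len(groups))
    return ms_between / ms_within, ms_within


def lsd_std_error(ms_within, n_i, n_j):
    """Standard error of a pairwise mean difference used by the LSD test."""
    return sqrt(ms_within * (1 / n_i + 1 / n_j))


f_ratio, ms_within = one_way_anova(groups)
print(round(f_ratio, 3), round(ms_within, 3))
print(round(lsd_std_error(ms_within, 6, 6), 5))  # Government vs Private Sector
print(round(lsd_std_error(ms_within, 6, 5), 5))  # Government vs GLC Sector
```

Because the LSD standard error depends only on the within-groups mean square and the two group sizes, every pair with sizes 6 and 6 shares one value and every pair involving the GLC Sector (size 5) shares another, which matches the pattern in the Std. Error column.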
Association Analysis
Association analysis is the weakest measure of relationship. It is usually used for
categorical data with nominal or ordinal measurement. The measure of
association is obtained together with a cross tabulation of the qualitative variables
(nominal and ordinal). For example, we might want to measure the relationship between
gender (male and female) and attitude towards Mathematics (high,
medium or low).
Step 1: Click Analyze > Descriptive Statistics > Crosstabs
Step 2: Transfer overall quality into the Row(s) box, then transfer respondent’s sex
into the Column(s) box.
Step 3: Click Statistics > Tick Chi-square
Step 4: Click Cell Display > Tick Observed and Unstandardized Residual
Step 5: Click on Continue and then press OK.
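Crosstab
overall quality by respondent's sex
overall quality | Statistic | male | female | Total
neither agree nor disagree | Count | 2 | 4 | 6
neither agree nor disagree | Residual | .2 | -.2 |
agree | Count | 3 | 8 | 11
agree | Residual | -.3 | .3 |
strongly agree | Count | 2 | 4 | 6
strongly agree | Residual | .2 | -.2 |
Total | Count | 7 | 16 | 23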
Chi-Square Tests
Test | Value | df | Asymp. Sig. (2-sided)
Pearson Chi-Square | .100a | 2 | .951
Likelihood Ratio | .100 | 2 | .951
Linear-by-Linear Association | .000 | 1 | 1.000
N of Valid Cases | 23 | |
a. 5 cells (83.3%) have expected count less than 5. The minimum expected count is 1.83.
First, look at the Pearson Chi-Square value. Based on the result, the Chi-Square test
is not significant since the p-value (0.951) is greater than 0.05. This result indicates that
no association exists between gender and the perception of overall quality.
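The Pearson Chi-Square value can be checked by hand from the counts in the Crosstab table. The sketch below computes the expected counts, the statistic and the degrees of freedom in plain Python. The p-value line relies on the fact that, for exactly 2 degrees of freedom, the upper-tail chi-square probability equals exp(-x/2); other degrees of freedom are outside the scope of this illustration. The function name is made up.

```python
from math import exp

# Rows: neither agree nor disagree, agree, strongly agree; columns: male, female
observed = [[2, 4], [3, 8], [2, 4]]


def chi_square(observed):
    """Return (statistic, df, expected counts) for a two-way table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected


stat, df, expected = chi_square(observed)
# Closed form valid only for df == 2
p_value = exp(-stat / 2) if df == 2 else None
print(round(stat, 3), df, p_value)
print(round(min(min(row) for row in expected), 2))  # smallest expected count
```

The smallest expected count and the number of cells below 5 correspond to footnote a under the Chi-Square Tests table.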
Correlation Analysis
Correlation analysis is used to measure the relationship between variables. There are two
types of correlation coefficient for this purpose: the Spearman rank correlation
coefficient and the Pearson product-moment correlation coefficient.
Spearman is appropriate for data that are not normally distributed and is also known as the
non-parametric version of correlation analysis. Pearson is appropriate for normally
distributed data and is calculated from the actual data values, while Spearman
replaces the actual data with ranks.
Step 1: Click Analyze > Correlate > Bivariate
Step 2: Transfer the five variables from the variable list into the Variables box.
Step 3: Click OK.
Correlations
Variable | Statistic | infrastructure | service quality | cleanliness quality | queue time | overall quality
infrastructure | Pearson Correlation | 1 | -.408 | .046 | .111 | -.592**
infrastructure | Sig. (2-tailed) | | .054 | .835 | .613 | .003
infrastructure | N | 23 | 23 | 23 | 23 | 23
service quality | Pearson Correlation | -.408 | 1 | -.147 | -.408 | -.066
service quality | Sig. (2-tailed) | .054 | | .502 | .053 | .763
service quality | N | 23 | 23 | 23 | 23 | 23
cleanliness quality | Pearson Correlation | .046 | -.147 | 1 | -.604** | .401
cleanliness quality | Sig. (2-tailed) | .835 | .502 | | .002 | .058
cleanliness quality | N | 23 | 23 | 23 | 23 | 23
queue time | Pearson Correlation | .111 | -.408 | -.604** | 1 | .210
queue time | Sig. (2-tailed) | .613 | .053 | .002 | | .335
queue time | N | 23 | 23 | 23 | 23 | 23
overall quality | Pearson Correlation | -.592** | -.066 | .401 | .210 | 1
overall quality | Sig. (2-tailed) | .003 | .763 | .058 | .335 |
overall quality | N | 23 | 23 | 23 | 23 | 23
**. Correlation is significant at the 0.01 level (2-tailed).
The table above shows that only two significant relationships exist among the
variables: the relationship between infrastructure and overall quality (r = -.592,
p = 0.003) and the relationship between cleanliness quality and queue time (r = -.604,
p = 0.002), both with p < 0.05. Both are considered moderate negative relationships.
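The difference between the two coefficients described earlier, Pearson on the actual values and Spearman on their ranks, can be illustrated with a short Python sketch. The data are made up, the function names are hypothetical, and tied values receive the average of their ranks, which is a common convention assumed here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson applied to the ranks of the data."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]        # made-up data
y = [1, 4, 9, 16, 25]      # increases with x, but not linearly
print(round(pearson(x, y), 3), round(spearman(x, y), 3))
```

In this made-up example y always rises with x, so the ranks agree perfectly even though the relationship is curved, and the Spearman coefficient comes out higher than the Pearson coefficient.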