TRANSCRIPT
Sample Output
Descriptive Statistics
Introduction. Summary statistics were calculated for each interval and ratio variable.
Frequencies and percentages were calculated for each nominal variable.
Frequencies and Percentages. The most frequently observed category of Gender was Male (n
= 16, 55%). Frequencies and percentages are presented in Table 1.
Table 1
Frequency Table for Nominal Variables
Variable    n    %
Gender
  Male      16   55.17
  Female    13   44.83
  Missing    0    0.00
Note. Due to rounding errors, percentages may not equal 100%.
Summary Statistics. The observations for Time had an average of 27.76 (SD = 11.03, SEM =
2.05, Min = 13.00, Max = 44.00). The observations for Distance had an average of 2.55 (SD =
0.93, SEM = 0.17, Min = 1.00, Max = 4.00). The observations for Accelerate had an average of
0.00 (SD = 0.00, SEM = 0.00, Min = 0.00, Max = 0.01). Skewness and kurtosis are also
presented in Table 2. When the skewness is greater than or equal to 2 or less than or equal to
-2, the variable is considered asymmetrical about its mean. When the kurtosis is greater than or
equal to 3, the variable's distribution differs markedly from a normal distribution in its
tendency to produce outliers (Westfall & Henning, 2013).
Table 2
Summary Statistics Table for Interval and Ratio Variables
Variable     M       SD      n    SEM    Skewness   Kurtosis
Time         27.76   11.03   29   2.05   0.41       -1.41
Distance     2.55    0.93    29   0.17   0.05       -1.24
Accelerate   0.00    0.00    28   0.00   0.05       -1.27
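The skewness and kurtosis cutoffs described above can be computed directly. The following is a minimal sketch in Python with scipy (an assumption for illustration; the report itself was generated by Intellectus Statistics), using a hypothetical sample. Note that scipy reports excess kurtosis (normal = 0) by default, which is the scale used in Table 2.

```python
# Sketch: skewness and kurtosis rules of thumb, on a hypothetical sample.
from scipy.stats import kurtosis, skew

time_values = [13, 15, 18, 20, 21, 22, 27, 36, 40, 43, 44]  # hypothetical data

s = skew(time_values)
k_excess = kurtosis(time_values)                  # Fisher definition: normal = 0
k_pearson = kurtosis(time_values, fisher=False)   # Pearson definition: normal = 3

# Rules of thumb from the text: |skewness| >= 2 suggests marked asymmetry;
# kurtosis >= 3 on the excess scale suggests a strong tendency toward outliers.
asymmetrical = abs(s) >= 2
print(f"skewness = {s:.2f}, excess kurtosis = {k_excess:.2f}, "
      f"asymmetrical = {asymmetrical}")
```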
Linear Regression Analysis
Introduction. A linear regression analysis was conducted to assess whether Time significantly
predicted Distance. The 'Enter' variable selection method was chosen for the linear regression
model, which includes all of the selected predictors.
Assumptions. Prior to conducting the linear regression, the assumptions of normality of
residuals, homoscedasticity (equal variance) of residuals, and the lack of outliers were examined.
A Q-Q scatterplot was used to assess normality, homoscedasticity was assessed with a residuals
scatterplot, and outliers were evaluated using a Studentized residuals plot.
Normality. Normality was evaluated using a Shapiro-Wilk test and a Q-Q scatterplot. The
results of the Shapiro-Wilk test were significant, W = 0.79, p < .001, indicating the assumption of
normality was violated. The normality assumption was also assessed visually with a Q-Q
scatterplot. The Q-Q scatterplot compares the distribution of the residuals (the differences
between observed and predicted values) with a normal distribution (a theoretical distribution
which follows a bell curve). The Q-Q scatterplot for normality is presented in Figure 1.
Figure 1. Q-Q scatterplot for normality for Time predicting Distance
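The Shapiro-Wilk test and the coordinates behind a Q-Q scatterplot can be produced as follows. This is a sketch using Python's scipy (an assumption, not the package that produced this output) on hypothetical residuals.

```python
# Sketch: Shapiro-Wilk normality test plus Q-Q plot coordinates for residuals.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
residuals = rng.normal(0.0, 1.0, size=29)   # hypothetical model residuals, n = 29

w, p = stats.shapiro(residuals)
verdict = "violated" if p < 0.05 else "not rejected"
print(f"W = {w:.2f}, p = {p:.3f}: normality {verdict}")

# Q-Q scatterplot coordinates: theoretical normal quantiles vs. ordered residuals;
# points near a straight line support the normality assumption.
(theoretical_q, ordered_resid), _ = stats.probplot(residuals, dist="norm")
```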
Homoscedasticity. Homoscedasticity was evaluated for each model by plotting the model
residuals against the predicted model values (Osborne & Waters, 2002). The assumption is met
if the points appear randomly distributed with a mean of zero and no apparent curvature. Figure
2 presents a scatterplot of predicted values and model residuals.
Figure 2. Residuals scatterplot for homoscedasticity for Time predicting Distance
Outliers. To identify influential points, Studentized residuals were calculated and the absolute
values were plotted against the observation numbers. An observation with a Studentized residual
greater than three in absolute value has significant influence on the results of the model. Figure
3 presents a Studentized residuals plot of the observations. Observation numbers are specified
next to each point with a Studentized residual greater than three.
Figure 3. Studentized residuals plot for outlier detection.
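The outlier screen described above can be sketched numerically. The Python example below (hypothetical data; an illustration, not the report's own computation) derives externally Studentized residuals from the hat matrix and flags observations exceeding three in absolute value.

```python
# Sketch: externally Studentized residuals for a simple regression,
# on hypothetical predictor/response data.
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(13, 44, 29)                      # hypothetical predictor (Time)
y = 0.34 + 0.08 * x + rng.normal(0, 0.3, 29)     # hypothetical response (Distance)

X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

n, p = X.shape
h = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)    # leverages (hat matrix diagonal)
sse = resid @ resid
# leave-one-out residual variance, then externally Studentized residuals
s2_i = (sse - resid**2 / (1 - h)) / (n - p - 1)
r_student = resid / np.sqrt(s2_i * (1 - h))

outliers = np.flatnonzero(np.abs(r_student) > 3)  # observation indices to label
```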
Results. The results of the linear regression model were significant, F(1,27) = 237.41, p < .001,
R2 = 0.90, indicating that approximately 90% of the variance in Distance is explainable by Time.
Time significantly predicted Distance, B = 0.08, t(27) = 15.41, p < .001. This indicates that on
average, a one-unit increase of Time will increase the value of Distance by 0.08 units. Table 3
summarizes the results of the regression model.
Table 3
Results for Linear Regression with Time predicting Distance
Variable      B      SE     95% CI          β      t       p
(Intercept)   0.34   0.15   [0.02, 0.65]    0.00   2.17    .039
Time          0.08   0.01   [0.07, 0.09]    0.95   15.41   < .001
Note. Results: F(1,27) = 237.41, p < .001, R2 = 0.90
Unstandardized Regression Equation: Distance = 0.34 + 0.08*Time
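A model of this form can be reproduced with standard tools. A minimal sketch in Python with scipy.stats.linregress (an assumption for illustration; the data below are hypothetical, generated to resemble the reported equation):

```python
# Sketch: simple linear regression summary on hypothetical Time/Distance data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
time = rng.uniform(13, 44, 29)                            # hypothetical Time
distance = 0.34 + 0.08 * time + rng.normal(0, 0.3, 29)    # hypothetical Distance

fit = stats.linregress(time, distance)
r2 = fit.rvalue**2                 # R^2: share of variance explained
t_stat = fit.slope / fit.stderr    # t(n - 2) statistic for the slope
print(f"Distance = {fit.intercept:.2f} + {fit.slope:.2f}*Time, "
      f"R^2 = {r2:.2f}, p = {fit.pvalue:.4f}")
```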
Paired Samples t-Test
Introduction. A paired samples t-test was conducted to examine whether the difference
between Distance and Time was significantly different from zero.
Assumptions. Prior to the analysis, the assumptions of normality and homogeneity of variance
were assessed. A Shapiro-Wilk test was conducted to determine whether difference could have
been produced by a normal distribution (Razali & Wah, 2011). The results of the Shapiro-Wilk
test were significant, W = 0.84, p < .001. This suggests that difference is unlikely to have been
produced by a normal distribution; thus normality cannot be assumed. However, the mean of
any random variable will be approximately normally distributed as sample size increases
according to the Central Limit Theorem (CLT). Therefore, with a sufficiently large sample size
(n > 50), deviations from normality will have little effect on the results (Stevens, 2009). An
alternative way to test the assumption of normality is to plot the quantiles of the model
residuals against the quantiles of a normal distribution, also called a Q-Q scatterplot
(DeCarlo, 1997). For the assumption of normality to be met, the quantiles of the residuals must
not strongly deviate from the theoretical quantiles. Strong deviations could indicate that the
parameter estimates are unreliable. Figure 4 presents a Q-Q scatterplot of the difference between
Distance and Time. Levene's test for equality of variance was used to assess whether the
homogeneity of variance assumption was met (Levene, 1960). The homogeneity of variance
assumption requires the variance of the dependent variable be approximately equal in each
group. The result of Levene's test was significant, F(1, 56) = 27.87, p < .001, indicating that the
assumption of homogeneity of variance was violated. Consequently, the results may not be
reliable or generalizable. Since equal variances cannot be assumed, Welch's t-test, which is
more reliable when the two samples have unequal variances and unequal sample sizes, was used
instead of the Student's t-test (Ruxton, 2006).
Figure 4. Q-Q scatterplot for normality for the difference between Distance and Time.
Results. The result of the paired samples t-test was significant, t(28) = -13.37, p < .001,
suggesting that the true difference in the means of Distance and Time was significantly different
from zero. The mean of Distance (M = 2.55) was significantly lower than the mean of Time (M
= 27.76). Table 4 presents the results of the paired samples t-test. Figure 5 presents the mean of
Distance and Time.
Table 4
Paired Samples t-Test for the Difference between Distance and Time
Distance          Time
M      SD        M       SD        t        p        d
2.55   0.93      27.76   11.03     -13.37   < .001   3.22
Note. Degrees of Freedom for the t-statistic = 28. d represents Cohen's d.
Figure 5. The means of Distance and Time.
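The paired comparison above can be sketched as follows, using hypothetical Distance/Time pairs in Python (an illustration, not the report's own computation); Cohen's d for a paired design is the mean difference divided by the standard deviation of the differences.

```python
# Sketch: paired samples t-test and Cohen's d on hypothetical matched pairs.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
time = rng.uniform(13, 44, 29)        # hypothetical Time values
distance = rng.uniform(1, 4, 29)      # hypothetical Distance values

t_stat, p_value = stats.ttest_rel(distance, time)   # df = n - 1 = 28
diff = distance - time
cohens_d = abs(diff.mean()) / diff.std(ddof=1)      # effect size for paired design
print(f"t({len(diff) - 1}) = {t_stat:.2f}, p = {p_value:.4f}, d = {cohens_d:.2f}")
```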
Wilcoxon Signed Rank Test
Introduction. A Wilcoxon signed rank test was conducted to examine whether there was a
significant difference between Distance and Time. The Wilcoxon signed rank test is a non-
parametric alternative to the paired samples t-test and does not share its distributional
assumptions (Conover & Iman, 1981).
Results. The results of the Wilcoxon signed rank test were significant, V = 0.00, p < .001. This
indicates that the differences between Distance and Time are not likely due to random variation.
Table 5 presents the results of the Wilcoxon signed rank test. Figure 6 presents the ranked
values of Distance and Time.
Table 5
Wilcoxon Signed Rank Test for the Differences between Distance and Time.
Median Distance   Median Time   V      z       p
2.50              22.00         0.00   -4.71   < .001
Figure 6. Ranked values of Distance and Time.
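The signed-rank statistic can be illustrated on similar hypothetical pairs: when every Distance value falls below its paired Time value, all signed ranks are negative and V = 0, matching the pattern reported above. A sketch in Python (an assumption for illustration):

```python
# Sketch: Wilcoxon signed rank test on hypothetical matched pairs; V is the
# sum of positive signed ranks (0 when every difference is negative).
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
time = rng.uniform(13, 44, 29)        # hypothetical Time values
distance = rng.uniform(1, 4, 29)      # hypothetical Distance values (all smaller)

v_stat, p_value = stats.wilcoxon(distance, time)
print(f"V = {v_stat:.2f}, p = {p_value:.4f}")
```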
Repeated Measures Analysis of Variance
Introduction. A repeated measures analysis of variance (ANOVA) was conducted to assess if
significant differences exist among Time, Distance, and Accelerate.
Assumptions. Prior to the analysis, the assumptions of multivariate normality, univariate
normality, and sphericity were assessed.
Multivariate normality. To examine the multivariate normality assumption, Mahalanobis
distances were calculated and plotted against the quantiles of a Chi-Square distribution in Figure
7. The assumption is met if the points form a relatively straight line.
Figure 7. Q-Q scatterplot for Mahalanobis distances.
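The Mahalanobis-distance check can be sketched as follows (Python, hypothetical data; an illustration only): squared distances are computed from the sample mean and covariance, then paired with chi-square quantiles whose degrees of freedom equal the number of variables.

```python
# Sketch: Mahalanobis distances vs. chi-square quantiles for a
# multivariate normality check, on hypothetical three-variable data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
data = rng.normal(size=(28, 3))   # hypothetical Time/Distance/Accelerate rows

mean = data.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(data, rowvar=False))
centered = data - mean
d2 = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)  # squared distances

# Q-Q coordinates: sorted d^2 against chi-square(df = 3) quantiles;
# a roughly straight line supports multivariate normality.
n = len(d2)
chi2_q = stats.chi2.ppf((np.arange(1, n + 1) - 0.5) / n, df=data.shape[1])
qq_points = np.column_stack([chi2_q, np.sort(d2)])
```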
Univariate normality. Shapiro-Wilk tests were used to test the assumption of univariate
normality (Razali & Wah, 2011). These tests suggest that the following variables did not come
from a normally distributed population: Accelerate, Distance, and Time (see Table 6).
Table 6
Shapiro-Wilk Tests for Univariate Normality
Variable     p
Time         .001
Distance     .038
Accelerate   .035
Sphericity. Mauchly's Test was used to examine the assumption of sphericity (Mauchly, 1940).
The results showed the variances of the difference scores between each variable were
significantly different, p < .001, indicating the assumption was violated.
Results. The ANOVA was calculated using the Greenhouse-Geisser correction, which
Greenhouse and Geisser (1959) recommend as the appropriate adjustment when the sphericity
assumption is violated. The results of the ANOVA
were significant, F(1.00, 27.04) = 174.64, p < .001, indicating there were significant differences
among the values of Time, Distance, and Accelerate (Table 7). The means are presented in
Table 8 and Figure 8.
Table 7
Repeated Measures ANOVA Table for Time, Distance, and Accelerate
Source          df      SS         MS         F        p        ηp2
Within.factor   1.00    12669.35   12652.54   174.64   < .001   0.87
Residuals       27.04   1958.74    72.45
Table 8
Means Table for Within-Subject Variables
Variable     M       SD
Time         27.21   10.83
Distance     2.50    0.90
Accelerate   0.00    0.00
Note. n = 28.
Figure 8. Within-subject variable means.
Post-hoc. To further examine the differences among the variables, t-tests were calculated
between each pair of measurements. A Bonferroni p-value correction was used to adjust for
multiple testing. Bonferroni corrections are a conservative way to analyze the means of pairwise
comparisons according to Rafter, Abell, and Braselton (2002). All differences were significant.
The mean value of Accelerate (M = 0.00, SD = 0.00) was significantly less than Distance (M =
2.50, SD = 0.90) and Time (M = 27.21, SD = 10.83). The mean value of Distance (M = 2.50, SD
= 0.90) was significantly less than Time (M = 27.21, SD = 10.83).
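The post-hoc procedure amounts to paired t-tests with each p-value multiplied by the number of comparisons. A sketch on hypothetical data for the three measures (Python is assumed for illustration):

```python
# Sketch: pairwise paired t-tests with a Bonferroni correction,
# on hypothetical within-subject measurements.
from itertools import combinations

import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
measures = {                     # hypothetical within-subject data, n = 28
    "Time": rng.uniform(13, 44, 28),
    "Distance": rng.uniform(1, 4, 28),
    "Accelerate": rng.uniform(0, 0.01, 28),
}

pairs = list(combinations(measures, 2))
for name_a, name_b in pairs:
    t_stat, p_raw = stats.ttest_rel(measures[name_a], measures[name_b])
    p_adj = min(p_raw * len(pairs), 1.0)   # Bonferroni: multiply by number of tests
    print(f"{name_a}-{name_b}: t = {t_stat:.2f}, adjusted p = {p_adj:.4f}")
```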
Friedman Rank Sum Test
Introduction. A Friedman rank sum test was conducted to examine whether the medians of
Time, Distance, and Accelerate were equal. The Friedman test is a non-parametric alternative to
the repeated measures one-way ANOVA and does not share the ANOVA's distributional
assumptions (Conover & Iman, 1981; Zimmerman & Zumbo, 1993).
Results. The results of the Friedman test were significant, χ2(2) = 56.00, p < .001, indicating
significant differences in the median values of Time, Distance, and Accelerate. Table 9 presents
the results of the Friedman rank sum test. Figure 9 presents the ranked values of Time, Distance,
and Accelerate.
Table 9
Friedman Rank Sum Test
Time   Distance   Accelerate   χ2      df   p
3.00   2.00       1.00         56.00   2    < .001
Figure 9. Ranked Values of Time, Distance, and Accelerate.
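The Friedman statistic can be illustrated on hypothetical data. Because Accelerate < Distance < Time for every simulated subject below, the within-subject ranks are identical across all 28 subjects, which yields the maximal χ2 for n = 28 and k = 3 — the same value reported above. (Python and scipy are assumptions for illustration.)

```python
# Sketch: Friedman rank sum test on hypothetical repeated measures;
# the statistic is approximately chi-square with k - 1 df.
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
time = rng.uniform(13, 44, 28)         # hypothetical measurements, n = 28
distance = rng.uniform(1, 4, 28)
accelerate = rng.uniform(0, 0.01, 28)

chi2_stat, p_value = stats.friedmanchisquare(time, distance, accelerate)
print(f"chi2(2) = {chi2_stat:.2f}, p = {p_value:.4f}")
```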
Post-hoc. Since the overall test was significant, pairwise comparisons were examined between
each variable level. The results of the multiple comparisons indicated significant differences
between the following variable pairs: Time-Distance, Time-Accelerate, and Distance-Accelerate.
Table 10 presents the results of the pairwise comparisons.
Table 10
Pairwise comparisons for the mean ranks of Time, Distance, and Accelerate
Comparison            Observed Difference   Critical Difference
Time-Distance         28.00                 17.91
Time-Accelerate       56.00                 17.91
Distance-Accelerate   28.00                 17.91
References
Conover, W. J., & Iman, R. L. (1981). Rank transformations as a bridge between parametric and
nonparametric statistics. The American Statistician, 35(3), 124-129.
DeCarlo, L. T. (1997). On the meaning and use of kurtosis. Psychological Methods, 2(3),
292-307.
Greenhouse, S. W., & Geisser, S. (1959). On methods in the analysis of profile data.
Psychometrika, 24(2), 95-112.
Intellectus Statistics [Online computer software]. (2017). Retrieved from
https://analyze.intellectusstatistics.com/
Levene, H. (1960). Contributions to probability and statistics. Essays in Honor of Harold
Hotelling, 278-292.
Mauchly, J. W. (1940). Significance test for sphericity of a normal n-variate distribution. The
Annals of Mathematical Statistics, 11(2), 204-209.
Osborne, J., & Waters, E. (2002). Four assumptions of multiple regression that researchers
should always test. Practical Assessment, Research & Evaluation, 8(2), 1-9.
Rafter, J. A., Abell, M. L., & Braselton, J. P. (2002). Multiple comparison methods for means.
SIAM Review, 44(2), 259-278.
Razali, N. M., & Wah, Y. B. (2011). Power comparisons of Shapiro-Wilk, Kolmogorov-Smirnov,
Lilliefors and Anderson-Darling tests. Journal of Statistical Modeling and Analytics, 2(1),
21-33.
Ruxton, G. D. (2006). The unequal variance t-test is an underused alternative to Student's t-test
and the Mann-Whitney U test. Behavioral Ecology, 17(4), 688-690.
Stevens, J. P. (2009). Applied multivariate statistics for the social sciences (5th ed.). Mahwah,
NJ: Routledge Academic.
Westfall, P. H., & Henning, K. S. S. (2013). Texts in statistical science: Understanding advanced
statistical methods. Boca Raton, FL: Taylor & Francis.
Zimmerman, D. W., & Zumbo, B. D. (1993). Relative power of the Wilcoxon test, the Friedman
test, and repeated-measures ANOVA on ranks. The Journal of Experimental Education,
62(1), 75-86.
Included Analyses
Descriptive Statistics
Linear Regression with Distance predicted by Time
Paired Samples t-Test between Distance and Time
Wilcoxon Signed Rank Test between Distance and Time
Repeated Measures ANOVA for Time, Distance, and Accelerate
Friedman Test for Time, Distance, and Accelerate
Glossaries
Descriptive Statistics
Descriptive statistics are typically used to describe or summarize the data. They serve as an exploratory method for examining the variables of interest, often before inferential statistics are conducted on them. They provide summaries of the data and are used to answer descriptive research questions.
Fun Fact! A GPA is actually a descriptive statistic. It does not tell you how well you performed in a single class, only your average performance across multiple classes.
Kurtosis: The measure of the tail behavior of a distribution. Positive kurtosis signifies a distribution is more prone to outliers, and negative kurtosis implies a distribution is less prone to outliers.
Mean (M): The average value of a scale variable.
Percentage (%): The percentage of the frequency or count of a nominal or ordinal category.
Sample Minimum (Min): The smallest numeric value in a given sample.
Sample Maximum (Max): The largest numeric value in a given sample.
Sample Size (n): The number of observations in the sample (or, in a frequency table, the count for a category).
Skewness: The measure of asymmetry in the distribution of a variable. Positive skewness indicates a long right tail, while negative skewness indicates a long left tail.
Standard Deviation (SD): The spread of the data around the mean of a scale variable.
Standard Error of the Mean (SEM): The estimate of how far the sample mean is likely to differ from the actual population mean.
Multiple Linear Regression
The multiple linear regression is the most common form of linear regression analysis. As a predictive analysis, multiple linear regression is used to explain the relationship between one continuous dependent variable and two or more independent variables. It does this by creating a linear combination of all the independent variables to predict the dependent variable. The independent variables can be continuous or categorical (dummy coded as appropriate). The R2 statistic is used to assess how well the regression predicted the dependent variable, while the unstandardized beta (B) describes the expected change in the dependent variable for a one-unit change in an independent variable.
95% Confidence Interval (95% CI): An interval estimating the range in which one would expect B to lie 95% of the time, given that the samples tested come from the same distribution.
Degrees of Freedom (df): Used with the F ratio to determine the p-value.
Dummy-Code: Performed in order to add a nominal or ordinal independent variable into the regression model; turns the one variable into a series of dichotomous "yes/no" variables, one for each category; one of the categories is left out of the regression as the reference group to which all other categories are compared.
F Ratio (F): Used with the two df values to determine the p value of the overall model.
Homoscedasticity: Refers to the relationship between the residuals and the independent variables; the assumption is met when there is no relationship, i.e., when the residuals plot shows the points randomly distributed (with no pattern) and the distribution line approximately straight.
Normality: Refers to the distribution of the residuals; the assumption is that the residuals follow a bell-shaped curve; the assumption is met when the q-q plot has the points distributed approximately on the normality line.
p-value: The probability of obtaining the observed results if the null hypothesis (no relationship between the independent variable(s) and the dependent variable) is true.
Residuals: Refers to the difference between the predicted value for the dependent variable and the actual value of the dependent variable.
R-Squared Statistic (R2): Tells how much variance in the dependent variable is explained by only the predictor variables.
Standardized Beta (β): Ranges from -1 to 1; gives the strength of the relationship between the predictor and dependent variable.
t-Test Statistic (t): Used with the df to determine the p value; also can show the direction of the relationship between the predictor and dependent variable.
Unstandardized Beta (B): The slope of the predictor with the dependent variable.
Standard Error (SE): How much the B is expected to vary.
Paired Samples t-Test
The paired (dependent) samples t-test is used to assess for significant differences between two scale variables that can be matched. Typically, the scale variables are matched by time (e.g. pretest vs. posttest), but the data can also be matched in other ways (e.g. husband vs. wife). The test uses the average difference between each pair of matched scores to compute the t statistic, which is used with the df to compute the p-value (i.e., significance level). A significant result indicates the observed test statistic would be unlikely under the null hypothesis. The dependent samples t-test assumes that the differences between pairs of matched scores are normally distributed (i.e., normality).
Fun Fact! This test is based on the Student's t distribution. This distribution was named after William Sealy Gosset, who published a paper about the distribution in 1908 under the pseudonym "Student."
Cohen's d: Effect size for the t-test; determines the strength of the differences between the matched scores. The larger the effect size, the greater the differences in the matched scores.
Degrees of Freedom (df): Refers to the number of values used to compute a statistic. The df is determined by the number of observations in the sample and equal the number of observations - 1; used with t to compute the p-value.
Mean (M): The average value of a scale-level variable.
Normality: Refers to the distribution of the data. The assumption is that the data follows the bell-shaped curve.
p-value: The probability of obtaining the observed results if the null hypothesis is true. A result is usually considered statistically significant if the p-value is ≤ .05.
Shapiro-Wilk Test: A test to assess if the assumption of normality is met. If statistical significance is found in this test, the data is not normally distributed.
Standard Deviation (SD): The spread of the data around the mean of a scale-level variable.
t-Test Statistic (t): Used with the df to determine the p value.
Wilcoxon Signed Rank
The Wilcoxon Signed Rank test is a non-parametric test used to assess for significant differences between two scale or ordinal variables that can be matched. Typically, the variables are matched by time (such as pretest vs. posttest), but the data can also be matched by other characteristics (such as husband vs. wife). This test ranks the pairs of scores by the magnitude of the differences between each matched pair, then sums the signed ranks to compute the V statistic. The V statistic is then used to compute z, which in turn is used to compute the p-value (i.e., significance level). A significant result for this test suggests that the two matched variables are reliably different from each other (e.g., pretest scores are significantly different from posttest scores). The Wilcoxon Signed Rank test assumes that the variables under investigation are scale or ordinal level.
Fun Fact! The Wilcoxon Signed Rank test is named after Frank Wilcoxon, a chemist who published more than 70 papers over the course of his career.
Non-Parametric Test: A type of statistical test that does not require the data to follow a particular distribution; typically used when assumptions of a parametric test are violated or when the data do not fit the level of measurement required by a parametric test.
p-value: The probability of obtaining the observed results if the null hypothesis (no relationship between the independent variable(s) and dependent variable) is true; in most social science research, a result is considered statistically significant if this value is ≤ .05.
V-Test Statistic (V): Represents the sum of the signed ranks; used to compute the z.
z-Test Statistic (z): Used to compute the p value.
Repeated Measures ANOVA (Analysis of Variance)
The Repeated Measures ANOVA examines differences among repeated measurements on the same subjects.
Fun Fact! The repeated measures analysis of variance (ANOVA) is commonly mistaken as a multivariate design. This is because in the repeated measures design, each trial represents the measurement of the same characteristic under a different condition.
Degrees of Freedom (df): Refers to the number of values used to compute a statistic; an F-test has two values for df: the first is determined by the number of groups being compared - 1, and the second is approximately the number of observations in the sample; used with the F to determine the p-value.
F Ratio (F): The ratio of explained variance to error variance; used with the two df values to determine the p-value.
Normality: Refers to the distribution of the data. The assumption is that the data follows the bell-shaped curve.
Partial Eta Squared (η2p): Effect size for the ANOVA and determines the strength of the differences among the groups.
p-value: The probability of obtaining the observed results if the null hypothesis is true.
Shapiro-Wilk Test: A test to assess if the assumption of normality is met. If statistical significance is found in this test, the data is not normally distributed.
Sphericity: When there are three or more repeated measurements, the variances of the differences between each pair of measurements must be approximately equal; sphericity is the term for this condition.
Type I Error: Rejection of the null hypothesis when the null hypothesis is true; also referred to as a false positive result.
Friedman Test
The Friedman test is a non-parametric significance test for more than two dependent samples and is also known as the Friedman two-way analysis of variance; it is used as a null hypothesis test. In other words, it tests the null hypothesis that there is no significant difference among the 'k' dependent samples and the population from which they were drawn. The Friedman test statistic is distributed approximately as chi-square, with (k - 1) degrees of freedom.
Fun Fact! The Friedman Test was developed by Milton Friedman, an American economist who was an adviser to President Ronald Reagan and British Prime Minister Margaret Thatcher.
Chi-Square Test Statistic (χ2): The test statistic for the Friedman test; used with the df to compute the p-value.
Degrees of Freedom (df): Refers to the number of values used to compute a statistic; used with χ2 to compute the p-value.
p-value: The probability of obtaining the observed results if the null hypothesis (no relationship between the independent variable(s) and dependent variable) is true; in most social science research, a result is considered statistically significant if this value is ≤ .05.
Descriptive Statistics
Summary of Numeric Variables:
Variable     n    M       SD      Min     Max     Skewness   Kurtosis
Accelerate   28   0.00    0.00    0.00    0.01    0.05       -1.27
Distance     29   2.55    0.93    1.00    4.00    0.05       -1.24
Time         29   27.76   11.03   13.00   44.00   0.41       -1.41
Quantiles of Numeric Variables:
       Accelerate   Distance   Time
10%    0.00         1.50       16.00
20%    0.00         1.80       18.80
25%    0.00         2.00       20.00
30%    0.00         2.00       21.00
40%    0.00         2.00       21.20
50%    0.00         2.50       22.00
60%    0.00         3.00       27.00
70%    0.00         3.30       36.80
75%    0.01         3.50       40.00
80%    0.01         3.50       43.00
90%    0.01         3.60       43.00
Summary of Categorical Variables:
Gender: Level         n    %
        Male          16   0.55
        Female        13   0.45
        Missing(NA)   0    0.00
Linear Regression Output
Linear Regression Results:
Distance ~ Time
              B      SE     95% CI          Std. B   t       p
(Intercept)   0.34   0.15   [0.02, 0.65]    0.00     2.17    .039
Time          0.08   0.01   [0.07, 0.09]    0.95     15.41   < .001
F(1,27) = 237.41, p < .001, R^2 = 0.898, adj. R^2 = 0.894
Paired Samples t-Test for Distance and Time
Assumption Tests:
Shapiro-Wilk Test for Normality
W = 0.84, p < .001
Levene Test for Equality of Variance
F(1,56) = 27.87, p < .001
Paired Samples t-Test:
           M       SD
Distance   2.55    0.93
Time       27.76   11.03
t = -13.37, df = 28, p < .001, Cohen's d = 3.22
95% CI for the difference between Distance and Time: (-29.07, -21.35)
Wilcoxon Signed Rank Test for Distance and Time
Wilcoxon Signed Rank Test
           Median
Distance   2.50
Time       22.00
V = 0.00, z = -4.71, p < .001
Repeated Measures ANOVA Output for Time, Distance, and Accelerate
Assumption Tests:
Shapiro-Wilk Test for Normality
             p
Accelerate   .035
Distance     .038
Time         .001

Mauchly's Test for Sphericity
                p
Within.Factor   < .001
Repeated Measures ANOVA Results:
Source          df      SS         MS         F        p        eta^2
Within.factor   1.00    12669.35   12652.54   174.64   < .001   0.87
Residuals       27.04   1958.74    72.45
Post-hoc Comparisons:
Comparison            M        SD      p
Time-Distance         -2.50    0.90    < .001
Time-Accelerate       -27.21   10.83   < .001
Distance-Accelerate   -24.71   9.98    < .001
Friedman Rank Sum Test for the Differences in Time, Distance, and Accelerate
Friedman Rank Sum Test
             Mean_Rank
Time         3
Distance     2
Accelerate   1
Chi-square = 56, df = 2, p < .001
Pairwise Comparisons:
Comparison            Obs. Diff.   Crit. Diff.
Time-Distance         28.00        17.91
Time-Accelerate       56.00        17.91
Distance-Accelerate   28.00        17.91