Team Bivariate: Chris Bulock, Michael Mackavoy, Jennifer Masunaga, Ann Pan, Joe Pozdol

Post on 22-Dec-2015


Page 1:

Team Bivariate

Chris Bulock, Michael Mackavoy, Jennifer Masunaga, Ann Pan, Joe Pozdol

Page 2:

Independent vs. Dependent

• Independent: The variable manipulated or presumed to affect a dependent variable.

– Alternatively known as a predictor or experimental variable.

• Dependent: The variable that changes in response to the independent variable.

– Also known as outcome or subject variable.

Page 3:

Example

Hypothesis: The more library instruction a college student receives, the more he or she will use the library.

– Independent Variable: Quantity of instruction

– Dependent Variable: Usage of the library

Page 4:

Hypothesis Testing

Null Hypothesis (H0): A hypothesis set up to be nullified or refuted in order to support an alternative hypothesis.

Alternative Hypothesis (HA or H1): The hypothesis supported if the null is rejected.

Alpha Level (α) and P-Values

Page 5:

Hypothesis: The more library instruction a college student receives, the more he or she will use the library.

• What is the Null Hypothesis?

• What is the Alternative Hypothesis?

• If p is smaller than the α level, then the result is said to be “statistically significant.”
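
The decision rule in the last bullet can be sketched in a few lines. This is only an illustration: the default α = 0.05 is a common convention rather than something the slide specifies, and the p-values are hypothetical.

```python
def is_statistically_significant(p_value: float, alpha: float = 0.05) -> bool:
    """Return True when the p-value falls below the chosen alpha level."""
    return p_value < alpha


# Hypothetical p-values, for illustration only.
for p in (0.03, 0.20):
    if is_statistically_significant(p):
        decision = "reject H0"
    else:
        decision = "fail to reject H0"
    print(f"p = {p:.2f}: {decision}")
```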

Page 6:

Kurtosis, Skewness (and other weird sounding words)

Kurtosis refers to the peakedness or flatness of a frequency distribution.

Page 7:

Skewness

Skewness describes data as symmetrical or asymmetrical about a central point.
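
One common way to put numbers on skewness and kurtosis is through central moments. The sketch below assumes the simple population (moment-based) definitions, which the slides do not spell out, and uses made-up data; statistical packages often apply small-sample corrections that give slightly different values.

```python
from statistics import fmean


def central_moment(values, k):
    """k-th central moment: the average of (x - mean) ** k."""
    mean = fmean(values)
    return fmean([(v - mean) ** k for v in values])


def skewness(values):
    """Moment-based skewness: 0 for symmetric data, > 0 for a long right tail."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a normal distribution scores about 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3


# Hypothetical data sets, for illustration only.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]

print("skewness (symmetric):", skewness(symmetric))
print("skewness (right-skewed):", skewness(right_skewed))
print("excess kurtosis (symmetric):", excess_kurtosis(symmetric))
```

A negative excess kurtosis indicates a flatter distribution than the normal curve; a positive value indicates a more peaked one.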

Page 8:

Linear Regression

Analysis technique which predicts one variable from another, with the regression line being the best-fit straight line drawn through paired points

Independent (X-axis) and dependent (Y-axis) variables

Slope may be negative, positive, or 0
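
A minimal sketch of fitting the regression line by ordinary least squares follows. The slide does not say which fitting method is meant, so least squares is an assumption here, and the instruction-hours and library-visit figures are invented for illustration.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for paired points (x, y)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return slope, intercept


# Hypothetical data: hours of library instruction (X) and library visits (Y).
hours = [1, 2, 3, 4, 5]
visits = [2, 4, 5, 4, 5]

slope, intercept = fit_line(hours, visits)
print(f"predicted visits = {intercept:.2f} + {slope:.2f} * hours")
```

A positive slope here would correspond to a direct relationship, a negative slope to an inverse one, and a slope of 0 to no linear relationship.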

Page 9:

Samples of Scatter Diagrams and Variable Relationships:

Page 10:

Correlation

How strongly one variable predicts another

Numerous methods for calculating the correlation coefficient

Relationship can be direct or inverse

Correlation coefficient takes a value from r = -1.00 to r = +1.00

Page 11:

Measurement of Correlation Coefficients

Parametric (used in interval/ratio data measurement)

Nonparametric (for ordinal or nominal data measurement)

Usage of parametric tests requires satisfaction of certain conditions

Page 12:

Pearson Correlation

Parametric method for calculating the coefficient of correlation (requires interval or ratio data)

r = \frac{n \sum XY - \sum X \sum Y}{\sqrt{[n \sum X^2 - (\sum X)^2] [n \sum Y^2 - (\sum Y)^2]}}

From r one can calculate r^2, the coefficient of determination, which gives the proportion of variation in the dependent variable explained by variation in the independent variable
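
The Pearson formula on this slide translates directly into code. The sketch below follows the sums-based form term by term; the paired data are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson r computed from the raw sums, as in the slide's formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_xx = sum(x * x for x in xs)
    sum_yy = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    return numerator / denominator


# Hypothetical data: hours of library instruction (X) and library visits (Y).
hours = [1, 2, 3, 4, 5]
visits = [2, 4, 5, 4, 5]

r = pearson_r(hours, visits)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")
```

For these made-up numbers r^2 comes out at 0.6, which would be read as 60% of the variation in visits being explained by variation in instruction hours.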

Page 13:

Pearson Correlation

It is important to remember that neither the Pearson coefficient (r) nor the coefficient of determination (r^2) indicates causation. Instead, they provide statistical evidence for a relationship between the variables.

Page 14:

Nonparametric methods

Used for data expressed in ordinal or nominal scale measurements

Spearman rank order correlation coefficient, r_s (uses ordinal scale data and assumes n ranked pairs)

r_s tells the strength of the relationship between two variables that are measured on ordinal scales
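
The slides do not give a formula for r_s. A common textbook form, valid when there are no tied ranks, is r_s = 1 - 6 Σd^2 / (n(n^2 - 1)), where d is the difference between the two ranks of each pair. The sketch below assumes that form and no ties; the scores are hypothetical.

```python
def to_ranks(values):
    """Rank values from 1 (smallest) to n. Assumes there are no ties."""
    if len(set(values)) != len(values):
        raise ValueError("this sketch does not handle tied values")
    rank_of = {v: i + 1 for i, v in enumerate(sorted(values))}
    return [rank_of[v] for v in values]


def spearman_rs(xs, ys):
    """Spearman rank correlation via the no-ties shortcut formula."""
    n = len(xs)
    rank_x, rank_y = to_ranks(xs), to_ranks(ys)
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Hypothetical ordinal scores for five respondents.
comfort_with_technology = [10, 20, 30, 40, 50]
preference_for_online = [2, 1, 4, 3, 5]

print("r_s =", spearman_rs(comfort_with_technology, preference_for_online))
```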

Page 15:

Chi-squared (X^2) Test

Nonparametric test (or parametric if normal distribution)

Used for 2 nominal or ordinal variables (or continuous)

Can be used for small samples, but a minimum sample size is required

Tests whether a relationship exists between 2 variables

Column percents show nature of relationship

Page 16:

Research question

Does gender influence library type preference?

Gender: male or female

Library type: academic, corporate, public

Independent variable? Dependent variable?

Null hypothesis? Alternative hypothesis?

Page 17:

Collect data and construct table

Poll class

Make contingency table

Calculate row and column marginals

Calculate expected frequencies
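
The table-building steps above can be sketched as follows. The counts are invented stand-ins for a class poll, and the expected frequency of each cell is taken as (row total × column total) / grand total, the usual definition under the null hypothesis of no relationship.

```python
def marginals(table):
    """Row totals and column totals of a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return row_totals, col_totals


def expected_frequencies(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals, col_totals = marginals(table)
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical poll results. Rows: male, female.
# Columns: academic, corporate, public.
observed = [
    [10, 5, 15],
    [20, 10, 20],
]

print("marginals:", marginals(observed))
for row in expected_frequencies(observed):
    print(row)
```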

Page 18:

Calculate X^2

Check that expected frequencies are >5

(Modify if necessary to illustrate)

X^2 = Σ (O-E)^2 / E
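
A minimal sketch of the calculation, including the check on expected frequencies, is given below. The observed counts are hypothetical, and the helper names are illustrative rather than taken from any library.

```python
def expected_frequencies(table):
    """Expected count per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_squared(table):
    """X^2 = sum over all cells of (O - E)^2 / E."""
    expected = expected_frequencies(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


def expected_all_above(table, minimum=5):
    """True when every expected frequency exceeds the minimum."""
    return all(e > minimum for row in expected_frequencies(table) for e in row)


# Hypothetical poll results. Rows: male, female.
# Columns: academic, corporate, public.
observed = [
    [10, 5, 15],
    [20, 10, 20],
]

print("expected frequencies all > 5:", expected_all_above(observed))
print(f"X^2 = {chi_squared(observed):.3f}")
```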

Page 19:

Determine degrees of freedom

For X^2, degrees of freedom = (#rows - 1) × (#columns - 1)

Page 20:

Are variables related?

Compare calculated X^2 to critical value in table

        p-value
d.f.    0.10     0.05     0.025    0.01     0.005
1       2.706    3.841    5.024    6.635    7.879
2       4.605    5.991    7.378    9.210    10.597
3       6.251    7.815    9.348    11.345   12.838
4       7.779    9.488    11.143   13.277   14.860
5       9.236    11.070   12.833   15.086   16.750
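
The lookup and comparison can be sketched with the critical values from the table above. The function names and the example X^2 values are hypothetical; the rule applied is that the variables are treated as related when the calculated X^2 exceeds the critical value for the chosen p-value and degrees of freedom.

```python
# Critical values copied from the table: CRITICAL_VALUES[d.f.][p-value].
CRITICAL_VALUES = {
    1: {0.10: 2.706, 0.05: 3.841, 0.025: 5.024, 0.01: 6.635, 0.005: 7.879},
    2: {0.10: 4.605, 0.05: 5.991, 0.025: 7.378, 0.01: 9.210, 0.005: 10.597},
    3: {0.10: 6.251, 0.05: 7.815, 0.025: 9.348, 0.01: 11.345, 0.005: 12.838},
    4: {0.10: 7.779, 0.05: 9.488, 0.025: 11.143, 0.01: 13.277, 0.005: 14.860},
    5: {0.10: 9.236, 0.05: 11.070, 0.025: 12.833, 0.01: 15.086, 0.005: 16.750},
}


def degrees_of_freedom(n_rows, n_cols):
    """(#rows - 1) * (#columns - 1) for a contingency table."""
    return (n_rows - 1) * (n_cols - 1)


def is_related(chi_squared, n_rows, n_cols, alpha=0.05):
    """True when the calculated X^2 exceeds the tabled critical value."""
    critical = CRITICAL_VALUES[degrees_of_freedom(n_rows, n_cols)][alpha]
    return chi_squared > critical


# Hypothetical result for a 2 x 3 table (gender by library type).
print("d.f. =", degrees_of_freedom(2, 3))
print("related at 0.05:", is_related(0.762, n_rows=2, n_cols=3))
```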

Page 21:

“Online Workplace Training in Libraries”

By Connie K Haley

Focused on the preference for online training versus traditional face-to-face training

Purpose of the study is to reveal the relationships between variables and preference for online or traditional face-to-face training

Page 22:

“Online Workplace Training in Libraries”

Aims to reveal the relationship between preference for training and variables such as:

Gender, age, education level, years of experience, training locations, training providers, and professional development policies

Page 23:

Methodology

The study took place over a twenty-day period from April 10 to April 30 of 2006.

Library employees were sent online survey questionnaires

The surveys were anonymous and confidential

Consisted of three parts: demographic variables, Likert-scale assessment of training preferences, and open-ended questions

Page 24:

Assumptions

Expectations included:

Younger employees would prefer online training, while older ones would prefer face-to-face training;

Highly educated employees would prefer online training, while less educated employees with fewer skills would prefer face-to-face training;

Employees with more library training would prefer online training while those with less experience would prefer face-to-face training

Page 25:

Findings

Preference for online training shows a correlation to training providers and training locations

The preference for online training was not associated with ethnicity, gender, age, education, or library experience

Training budgets and professional development policies were not related to the preference for online training

Page 26:

Advantages of bivariate models

Quantitative goals:

– Relationships

– Prediction

– Causality

Simplification

– Core Relationship

– Parsimony

Page 27:

Disadvantages of bivariate models

Over-simplification

– Many related variables

– Picking the right pair

False relationships

– May overlook the true relationship

Poor definitions

Page 28:

Bivariate models: when to use

Simple situations

– Interested in single relationship

or

Get a handle on complex situation

– Initial study