statistics. why do we need statistics? to describe data: a-average b-quartile (act tests)...

Statistics

Upload: myron-byrd

Post on 18-Jan-2016


Page 1: Statistics. Why do we need statistics? To describe data: A-average B-quartile (ACT tests) C-percentile

Statistics


Why do we need statistics?

To describe data:

A-average

B-quartile (ACT tests)

C-percentile
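As a minimal sketch of these three descriptive measures, the snippet below uses Python's standard `statistics` module. The list `scores` is a made-up set of test scores used only for illustration; it is not data from these slides.

```python
import statistics

# Hypothetical test scores (illustrative only, not from the slides)
scores = [12, 15, 18, 21, 21, 24, 27, 30, 33, 36]

# A - average (arithmetic mean)
mean = statistics.mean(scores)

# B - quartiles: the three cut points that split the data into four equal parts
q1, q2, q3 = statistics.quantiles(scores, n=4)

# C - percentiles: 99 cut points; index 89 is the 90th percentile
percentiles = statistics.quantiles(scores, n=100)
p90 = percentiles[89]

print(f"mean = {mean}")
print(f"quartiles = {q1}, {q2}, {q3}")
print(f"90th percentile = {p90}")
```

The middle quartile is the median, so half of the scores fall below it.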


Why do we need statistics?

To find relationships

● Is use of 'like' related to age?

● Do people who learn a language earlier learn it better?


Why do we need statistics?

To test a hypothesis

● Is Shakespeare's vocabulary larger than the King James Bible's?

● Do men interrupt more than women?


Null Hypothesis

●Hypothesis: Women talk more than men

●Null hypothesis: There is no difference between women and men in how much they talk

●Hypothesis: Program X classifies parts of speech more accurately than program Y

●Null hypothesis: There is no difference between program X and Y


Statistical Significance

● A difference is statistically significant when there is less than a one-in-twenty probability that it arose by chance alone (p < .05). A difference that is not significant may simply reflect random chance.

● What if a test had only 4 multiple-choice questions, and the only person who took it rolled dice to pick each answer? How often could that person retake the test with dice and score 80% or better? The probability that it will happen is high (over 1/20). If 100 people took the test, the chance of their average being 80% or better by rolling dice drops sharply (below 1/20). If the test has 100 questions, the probability also drops sharply.
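The guessing scenario can be sketched with the binomial distribution. The slide does not say how many answer choices each question has, so the value `P_GUESS = 0.5` (two choices per question) is an assumption, and the result depends heavily on it. The function name `prob_at_least` is also just a label chosen for this sketch.

```python
from math import comb


def prob_at_least(n_items: int, min_correct: int, p_guess: float) -> float:
    """Probability of at least min_correct right answers out of n_items
    when every answer is an independent pure guess."""
    return sum(
        comb(n_items, k) * p_guess**k * (1 - p_guess) ** (n_items - k)
        for k in range(min_correct, n_items + 1)
    )


P_GUESS = 0.5  # assumption: two answer choices per question

# 4 questions: the only score of 80% or better is 4 out of 4
short_test = prob_at_least(4, 4, P_GUESS)

# 100 questions: 80% or better means at least 80 correct
long_test = prob_at_least(100, 80, P_GUESS)

print(f"4 items:   {short_test:.4f}  above 1/20? {short_test > 0.05}")
print(f"100 items: {long_test:.2e}  above 1/20? {long_test > 0.05}")
```

Under this assumption the short test can be passed by luck more often than one time in twenty, while the long test almost never can. With more answer choices per question, both probabilities shrink.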

Page 9: Statistics. Why do we need statistics? To describe data: A-average B-quartile (ACT tests) C-percentile

Statistical Significance

● Consider a commercial that claims that four out of five dentists recommend toothpaste X. If only five dentists were actually consulted would you be impressed? Would you not be more motivated to buy it if 4,000 out of 5,000 dentists recommended toothpaste X, in spite of the fact that 4/5 and 4000/5000 are both 80%? In like manner, statistical formulas take into consideration factors such as the number of subjects, responses, and test items when calculating the statistical significance.

● In other words, an 80% vs. 85% score may not be significant if there are few test takers and few items, but an 80% vs. 81% may be significant if the test is long and many people took the test. Statistics takes this into consideration.
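To make the 80% vs. 85% and 80% vs. 81% comparison concrete, here is a rough sketch using a pooled two-proportion z-test. This test is one common choice, not necessarily the formula the slides have in mind, and the sample sizes (20 responses per group, then 100,000 per group) are invented for illustration. The normal approximation is also crude for the small sample.

```python
from math import erfc, sqrt


def two_proportion_p(successes_a: int, n_a: int, successes_b: int, n_b: int) -> float:
    """Two-sided p-value of a pooled two-proportion z-test."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # erfc(|z| / sqrt(2)) is the two-tailed area under the standard normal curve
    return erfc(abs(z) / sqrt(2))


# Hypothetical small study: 80% vs. 85% with 20 responses in each group
small = two_proportion_p(16, 20, 17, 20)

# Hypothetical large study: 80% vs. 81% with 100,000 responses in each group
large = two_proportion_p(80_000, 100_000, 81_000, 100_000)

print(f"80% vs 85%, n = 20 each:      p = {small:.3f}")
print(f"80% vs 81%, n = 100000 each:  p = {large:.2e}")
```

The five-point gap in the small study is not significant at p < .05, while the one-point gap in the large study is.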


Types of Data

Categorical

Gender: male or female

Country of origin: Korea, Canada, Brazil, France

Education: high school graduate or not

Ethnicity: Hispanic, Caucasian, Asian, Black, Polynesian


Childhood language background: monolingual, bilingual, multilingual

Prodrop: subject pronoun used with verb, subject pronoun not used with verb

Language abilities of participant: native, non-native

Teaching method: total physical response, audiolingual, grammar translation

Which word is used for “large sandwich”?: hoagie, subway, grinder, po boy


Types of Data

Ordinal

The order in which children acquire certain morphemes.

The way a test participant orders a series of five recordings of non-natives from “most fluent” to “least fluent.”


Types of Data

Continuous

●Age

●Number of years of formal schooling

●Months spent living in a foreign country

●Time required to recognize a word during an experiment

●Frequency of a formant

●Duration of consonant closure

●Hours spent sending text messages


Variables

Characteristics that change from situation to situation, object to object, or person to person.

– Biographical variables (What kind are they?)

● age

● number of children

● ethnicity

● state of residence

● birth order among siblings


Dependent and Independent Variables

●What is the effect of X on Y?

– X is independent

– Y is dependent (you measure it)


Dependent and Independent Variables

Idea: People seem to use 'myself' as the non-reflexive object of a preposition rather than as a reflexive a lot more nowadays (e.g. “as for myself”).

Quantified question: What is the effect of time (1950s, 1960s, etc.) on the use of 'myself' as the non-reflexive object of a preposition?

What are the variables?


Variables: Time is a continuous independent variable and number of uses of 'myself' as the object of a preposition is a continuous dependent variable.


Dependent and Independent Variables

Idea: It seems that women always outnumber men in foreign language classes.

Quantified question: What is the effect of gender on enrollment in foreign language classes?

What are the variables?


Variables: Gender is the categorical independent variable and number of students enrolled is the continuous dependent variable.


Dependent and Independent Variables

Idea: I wonder if daily consumption of greasy American-style fast food is likely to shorten my life?

Yes


Correlation

Question answered: What is the relationship between two variables?

Type of variables used: Both continuous.


Correlation

Examples:

1 How are second language proficiency and degree of cultural adaptation related?

2 What is the relationship between vowel backness and how big an object represented by a nonce word with back (or front) vowels is perceived to be?

3 How is word frequency related to the amount of time required to name a word?

4 How does income relate to happiness?


Correlation

Do southerners who move away from the South shift their pronunciation from [a] toward [aɪ] over time?


Correlation

Speaker   % [a]   Years Away
1         98      1
2         82      1
3         99      2
4         65      3
5         90      3
6         85      5
7         75      5
8         50      5
9         75      6
10        55      6
11        85      7
12        70      8
13        30      8
14        55      9
15        80      9
16        25      10


Correlation

●Line slopes down = negative correlation


Correlation

●What is the effect of education on income?

●Line slopes up = positive correlation


Correlation

●What does this correlation tell you?

●Is it positive or negative?


Correlation coefficient

●Called r

●Ranges from +1 to -1

●Shows direction of correlation (negative or positive)

●Shows strength of correlation
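As an illustration of how r is obtained, the sketch below computes Pearson's correlation coefficient for the sixteen speakers tabulated earlier in these slides (% [a] against years away from the South). The slides do not give a formula, so the standard Pearson definition is assumed, and the helper name `pearson_r` is a label chosen here.

```python
from math import sqrt

# Speaker data from the table earlier in the slides
years_away = [1, 1, 2, 3, 3, 5, 5, 5, 6, 6, 7, 8, 8, 9, 9, 10]
pct_a = [98, 82, 99, 65, 90, 85, 75, 50, 75, 55, 85, 70, 30, 55, 80, 25]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


r = pearson_r(years_away, pct_a)
print(f"r = {r:.2f}")
```

For this table r comes out at roughly -0.65: the sign gives the direction (more years away, less [a]) and the size gives the strength.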


Correlation coefficient

●r = .79


Correlation coefficient

●What is the effect of water's volume on its weight?

●What is r?

●r = 1


Regression Line

●The regression line is the straight line that comes closest to all the data points (it minimizes the sum of the squared vertical distances from the points to the line).
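A minimal sketch of fitting such a line by ordinary least squares, using the speaker table from earlier in these slides. Ordinary least squares is assumed as the fitting method, and the function name `least_squares_line` is a label chosen here.

```python
# Speaker data from the table earlier in the slides
years_away = [1, 1, 2, 3, 3, 5, 5, 5, 6, 6, 7, 8, 8, 9, 9, 10]
pct_a = [98, 82, 99, 65, 90, 85, 75, 50, 75, 55, 85, 70, 30, 55, 80, 25]


def least_squares_line(xs, ys):
    """Return (slope, intercept) of the line minimizing squared vertical error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point of means
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares_line(years_away, pct_a)
print(f"% [a] = {intercept:.1f} + ({slope:.2f}) * years away")
```

The fitted slope is about -4.9, meaning roughly five percentage points less [a] for each additional year away, starting from an intercept of about 97.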



What is p?

●The probability of getting the results by chance

– r = 1 with two data points

– r = 1 with 1000 data points

●1 in 20 chance or smaller of getting results by chance is called statistically significant

●1/20 = .05, so p ≤ .05 is significant

●Smaller p is MORE significant
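One way to see what "getting the results by chance" means is a permutation test: shuffle one variable many times and count how often the shuffled data give a correlation at least as strong as the observed one. This is a swapped-in technique for illustration, not necessarily how the p values in these slides were computed. The names `permutation_p`, `n_shuffles`, and `seed` are choices made for this sketch.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def permutation_p(xs, ys, n_shuffles=10_000, seed=0):
    """Share of random re-pairings whose |r| is at least the observed |r|."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return hits / n_shuffles


# r = 1 with only two data points: every re-pairing still gives |r| = 1
p_two_points = permutation_p([1, 2], [3, 5])

# Speaker data from the table earlier in the slides
years_away = [1, 1, 2, 3, 3, 5, 5, 5, 6, 6, 7, 8, 8, 9, 9, 10]
pct_a = [98, 82, 99, 65, 90, 85, 75, 50, 75, 55, 85, 70, 30, 55, 80, 25]
p_speakers = permutation_p(years_away, pct_a)

print(f"two points: p = {p_two_points}")
print(f"speakers:   p = {p_speakers}")
```

With two points the perfect correlation is unremarkable, because any two points lie on a line (p = 1.0 here). For the sixteen speakers, shuffles rarely match the observed strength, so p falls below .05.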


Number of months in a foreign country and linguistic abilities in the country's language (positive or negative?)

●What would this mean? r = 0.56, p < .03

●What would this mean? r = 0.56, p < .07


Number of native dialectal usages and time spent living outside of native dialect area (negative or positive?)

●What would this mean? r = -.23, p < .0001

●What would this mean? r = -.67, p < .0001


What is the past tense of spling? What is the past tense of creeze?

Nonce verb   Response   Computer   People
spling       splung     35%        22%
creeze       croze      12%        6%


Correlation and Causation

●Does wealth cause belief in evolution?

●Does belief in God cause poverty?


Correlation and Causation

●Utah has highest use of antidepressants

●Utah has highest percentage of LDS

●Utah has highest use of thyroid medicine

●Utah has highest autism rate

●Utahns go to doctors more

●Utahns don't self medicate with alcohol (as much)


Correlation and Causation

●Number of drownings is positively correlated with ice-cream sales

●Bad oral health is correlated with Alzheimer's

– What are other reasons for this?
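One possible "other reason" is a third variable that drives both measurements. The simulation below sketches this for the drownings and ice-cream example, assuming (the slides do not say so) that hot weather is the common cause. All numbers are synthetic and generated by the code; none are real data. Once temperature is held constant with a partial correlation, the apparent link largely disappears.

```python
import random
from math import sqrt

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Synthetic data: temperature drives both variables, which never touch each other
temperature = [rng.uniform(0, 35) for _ in range(1000)]
ice_cream_sales = [2.0 * t + rng.gauss(0, 10) for t in temperature]
drownings = [0.1 * t + rng.gauss(0, 1) for t in temperature]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def partial_r(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    r_xy = pearson_r(xs, ys)
    r_xz = pearson_r(xs, zs)
    r_yz = pearson_r(ys, zs)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))


raw = pearson_r(ice_cream_sales, drownings)
controlled = partial_r(ice_cream_sales, drownings, temperature)

print(f"ice cream vs drownings:             r = {raw:.2f}")
print(f"same, holding temperature constant: r = {controlled:.2f}")
```

The raw correlation is clearly positive even though neither variable influences the other in the simulation, which is why a correlation alone cannot establish causation.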