Post on 03-Jan-2016

URBP 204A QUANTITATIVE METHODS I

Statistical Analysis Lecture II

Gregory Newmark
San Jose State University

(This lecture accords with Chapters 6, 7, & 8 of Neil Salkind’s Statistics for People Who (Think They) Hate Statistics)

Populations and Samples
• Populations
  – All the people in a specified group of people
    • The population of Students at SJSU
    • The population of Students in Urban Planning at SJSU
    • The population of Students in 204A this semester
• Samples
  – A portion of a larger population selected for study
    • A 500 person Sample of Students at SJSU
    • A 50 person Sample of Students in Urban Planning
    • A 15 person Sample of Students in 204A this semester

Populations and Samples
• Ideally, research covers entire populations
  – “Medicine X always cures the common cold”
• Financially, research is expensive
  – “We can’t afford to test Medicine X on everyone”
• Practically, we test samples of a population
  – “We can afford to test Medicine X on 1,000 people”
• Hopefully, those samples represent the actual population well
  – “For our results to be generalizable, our 1,000 people should approximate the characteristics of everyone”

Populations and Samples
• Sampling Error
  – A measure of how well a sample approximates the characteristics of the larger population
  – The difference between a sample statistic (i.e., a value computed from the sample) and a population parameter (i.e., the corresponding value in the population)
  – Low sampling error means higher precision
  – Higher precision means more generalizability
  – Valuable research has a high degree of generalizability
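A minimal Python sketch of sampling error, using a made-up population of heights (the numbers, the seed, and the helper names `sampling_error` and `typical_error` are assumptions for illustration, not part of the lecture). It draws repeated samples of 15 and of 500 and reports how far the sample mean typically lands from the population mean.

```python
import random
import statistics

random.seed(204)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 heights in inches (illustrative numbers only).
population = [random.gauss(66, 3) for _ in range(10_000)]
population_mean = statistics.mean(population)  # the population parameter


def sampling_error(sample_size):
    """Sample mean minus population mean for one random sample."""
    sample = random.sample(population, sample_size)
    return statistics.mean(sample) - population_mean


def typical_error(sample_size, repeats=200):
    """Average absolute sampling error over many repeated samples."""
    errors = [abs(sampling_error(sample_size)) for _ in range(repeats)]
    return statistics.mean(errors)


small = typical_error(15)
large = typical_error(500)
print(f"typical error, n = 15:  {small:.2f} inches")
print(f"typical error, n = 500: {large:.2f} inches")
```

Under these assumptions the larger sample should show the smaller typical error, which is the sense in which low sampling error means higher precision.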

Questions and Hypotheses
• Research Questions (Problem Statements)
  – What you are trying to investigate
• Hypotheses
  – Translate the research question into a testable form

Hypotheses
• Null Hypothesis
  – Assumption that no relationship exists in population
  – Statements of equality
  – Examples
    • “There is no relationship between reaction time and problem solving ability”
    • “There is no difference in the average GRE scores of women and men”
  – Purposes (Null Hypothesis cannot be tested directly)
    • Starting point for research
      – Until you prove a difference you have to assume none exists
    • Benchmark to compare observations
      – Defines a range within which observed difference may be due to chance

Hypotheses
• Research Hypothesis
  – Definitive statement that a relationship exists in a sample
  – Statements of inequality
  – Examples
    • “There is a positive relationship between reaction time and problem solving ability”
    • “There is a difference in the average GRE scores of women and men”
  – Two Types
    • Non-directional – there is a difference but its direction is unspecified
    • Directional – there is a difference and its direction is specified
  – Purpose – to provide a hypothesis for direct testing

Hypotheses
• Should be stated in a clear, forceful, declarative form
  – “Students who complete all assignments will get higher grades in 204A than those who do not.”
• Should be expressed succinctly
  – Avoid excessive verbiage that can confuse your readers
• Should posit an expected relationship between variables
  – This will focus the research and avoid a ‘scattershot’ approach
• Should reflect theory or literature
  – This ensures that the researcher has investigated the issue in advance
• Should be testable
  – One can actually carry out the research
  – Defines how measurement will happen

Hypotheses Quotes
• The great tragedy of Science - the slaying of a beautiful hypothesis by an ugly fact.
  – Thomas H. Huxley (1825 - 1895)
• There are two possible outcomes: If the result confirms the hypothesis, then you've made a measurement. If the result is contrary to the hypothesis, then you've made a discovery.
  – Enrico Fermi (1901 - 1954)
• It is a good morning exercise for a research scientist to discard a pet hypothesis every day before breakfast. It keeps him young.
  – Konrad Lorenz (1903 - 1989)
• For every fact there is an infinity of hypotheses.
  – Robert M. Pirsig (1928 - 2017)

Inferential Statistics
• Descriptive Statistics describe a data set
  – “The average height in this class is 5’6” with a standard deviation of 3”.”
• Inferential Statistics are used to make inferences from sample data to populations
  – “Based on our class data, we infer that the average height at SJSU is 5’6” with a standard deviation of 3”.”

The Normal Curve
• Visual representation of a distribution of scores with the following characteristics
  – Mean, median, and mode are the same
  – Symmetry around the mean (or mode or median)
  – Tails of curve approach zero asymptotically

The Normal Curve
• We can exploit these properties of the normal curve to compare distributions with different means and standard deviations, by putting them into standard scores based on the standard deviation
• Basically, we can compare curves by discussing their standard deviations

Z-Scores
• A commonly used standardized score
• Represent the number of standard deviations a raw score falls from the mean
• Result of dividing the amount that a raw score differs from the mean of a distribution by the standard deviation of that distribution
• Z = (X − Xbar) / s
• Z = z score; X = individual score; Xbar = mean; s = standard deviation
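The definition above can be sketched in a few lines of Python. The scores are invented for illustration, and `z_score` is a hypothetical helper name; the calculation is simply the raw score minus the mean, divided by the standard deviation.

```python
import statistics


def z_score(x, mean, sd):
    """Number of standard deviations that raw score x falls from the mean."""
    return (x - mean) / sd


# Hypothetical scores, for illustration only.
scores = [60, 62, 64, 66, 68, 70, 72]
mean = statistics.mean(scores)  # Xbar
sd = statistics.stdev(scores)   # s, the sample standard deviation

for x in (60, 66, 72):
    print(x, round(z_score(x, mean, sd), 2))
```

A score equal to the mean gets a z score of zero, and scores equally far above and below the mean get z scores of equal size and opposite sign.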

Z-Scores
• Characteristics
  – Z scores above the mean are:
    • Positive
    • To the right of the mean
    • In the upper half of the distribution
  – Z scores below the mean are:
    • Negative
    • To the left of the mean
    • In the lower half of the distribution
  – Z scores have associated probabilities

Z-Scores
• Every z score has an associated probability
• We can use that property to test hypotheses
• This property enables inferential statistics
• We can assess whether an event is due to chance or reflects some research finding
• Typically, we reject the null hypothesis if an event has less than a 5% chance of occurring by chance alone
• In that case, the research hypothesis likely makes more sense
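As a rough illustration of how a z score maps to a probability, the sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later) in place of the printed table. The two-tailed framing and the helper name `two_tailed_probability` are assumptions made for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def two_tailed_probability(z):
    """Chance of a z score at least this far from the mean, in either direction."""
    return 2 * (1 - standard_normal.cdf(abs(z)))


for z in (0.5, 1.0, 1.96, 2.5):
    p = two_tailed_probability(z)
    verdict = "reject the null" if p < 0.05 else "stick with the null"
    print(f"z = {z:4.2f}  p = {p:.3f}  {verdict}")
```

Under the 5% rule of thumb, only z scores beyond roughly 1.96 in either direction are rare enough to reject the null hypothesis.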

Class Lab
• Have everyone report their height in inches
• Determine class mean
• Determine class standard deviation
• Calculate z score for your height
• What percentage of the class is taller than you? (see chart in back of book or online)
• Have everyone move the data into SPSS and repeat the experiment
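For anyone who wants to check the lab by hand before opening SPSS, here is one possible Python version. The heights and `my_height` are placeholders, not real class data, and the normal-curve lookup stands in for the chart in the back of the book.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical heights in inches; replace with the heights reported in class.
heights = [61, 63, 64, 65, 66, 66, 67, 68, 69, 70, 71, 72, 62, 66, 67]
my_height = 68  # hypothetical

class_mean = mean(heights)
class_sd = stdev(heights)  # sample standard deviation
my_z = (my_height - class_mean) / class_sd

# Share of a normal curve lying above my z score (the table lookup).
percent_taller = 100 * (1 - NormalDist().cdf(my_z))

# Share of the reported heights that are actually greater than mine.
actual_percent = 100 * sum(h > my_height for h in heights) / len(heights)

print(f"mean = {class_mean:.2f}, sd = {class_sd:.2f}, z = {my_z:.2f}")
print(f"normal-curve estimate: {percent_taller:.1f}% taller")
print(f"actual count:          {actual_percent:.1f}% taller")
```

The two percentages will usually differ somewhat in a small class, because the normal curve is only an approximation of a handful of real heights.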

The Normal Curve
The Normal Law
by W.J. Youden (1900 - 1971)

THE
NORMAL
LAW OF ERROR
STANDS OUT IN THE
EXPERIENCE OF MANKIND
AS ONE OF THE BROADEST
GENERALIZATIONS OF NATURAL
PHILOSOPHY ... IT SERVES AS THE
GUIDING INSTRUMENT IN RESEARCHES
IN THE PHYSICAL AND SOCIAL SCIENCES AND
IN MEDICINE, AGRICULTURE, AND ENGINEERING.
IT IS AN INDISPENSABLE TOOL FOR THE ANALYSIS AND THE
INTERPRETATION OF THE BASIC DATA OBTAINED BY OBSERVATION AND EXPERIMENT

Statistical Significance
• Refers to whether an observed effect is due to chance or to systematic influence.
  – “There is a positive statistically significant relationship between GDP and average life span.”
  – Statistical significance makes the null hypothesis a less attractive explanation than the research hypothesis
• Ideally, research would control for all other factors, but in practice there will be uncontrolled error.
  – “There is a chance that a low GDP nation will have a higher average life span, due to unaccounted for factors.”
• Researchers ultimately define the level of certainty they are willing to accept in determining significance.
  – “There is a 1 in 20 chance that the observed effect is not due to the hypothesized reason, and we can live with that.”
  – This is called the significance level (or critical p-value).

Significance Levels can Vary

Statistical Significance

• To review:
  – First, hypothesize a relationship
    • Null Hypothesis means no relationship (often implied)
    • Research Hypothesis means there is a relationship
  – Second, test the research hypothesis
    • Define your significance level
    • Do your experiment
  – Third, based on your findings either:
    • Reject the null and accept the research hypothesis
    • Accept the null and reject the research hypothesis

Statistical Significance

• Data and Dating
  – Is this enough to reject the null hypothesis?

Statistical Significance
• Null Hypotheses can be either true or false
  – If true, there is an equality
  – If false, there is an inequality
• The Null Hypothesis cannot be directly tested
  – This presents a problem because one might reject the null when it is true (Type I) or accept it when it is false (Type II)
  – Four options:
    • No Problem: Accept the Null Hypothesis when there is truly no difference between groups
    • Type I Error (False Positive): Reject the Null Hypothesis when there is truly no difference between groups
    • Type II Error (False Negative): Accept the Null Hypothesis when there truly are differences between groups
    • No Problem: Reject the Null Hypothesis when there truly are differences between groups
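The four outcomes can be made concrete with a small simulation. This is only a sketch under stated assumptions: two groups of 30 drawn from normal populations with a known standard deviation of 3, a two-sample z test, and a 5% significance level. None of these numbers come from the lecture, and `rejects_null` is a hypothetical helper.

```python
import random
from statistics import NormalDist, mean

random.seed(1)  # fixed seed so the sketch is repeatable

ALPHA = 0.05
CRITICAL_Z = NormalDist().inv_cdf(1 - ALPHA / 2)  # about 1.96, two-tailed
SIGMA = 3.0    # assumed known population standard deviation
N = 30         # people per group
TRIALS = 2000  # simulated experiments per scenario


def rejects_null(mean_a, mean_b):
    """Draw two groups, run a two-sample z test, report whether the null is rejected."""
    group_a = [random.gauss(mean_a, SIGMA) for _ in range(N)]
    group_b = [random.gauss(mean_b, SIGMA) for _ in range(N)]
    standard_error = SIGMA * (2 / N) ** 0.5
    z = (mean(group_a) - mean(group_b)) / standard_error
    return abs(z) > CRITICAL_Z


# Null is true (both groups share a mean): every rejection is a Type I error.
type_1_rate = sum(rejects_null(66, 66) for _ in range(TRIALS)) / TRIALS

# Null is false (means differ by 1 inch): every non-rejection is a Type II error.
type_2_rate = sum(not rejects_null(66, 67) for _ in range(TRIALS)) / TRIALS

print(f"Type I error rate:  {type_1_rate:.3f}")
print(f"Type II error rate: {type_2_rate:.3f}")
```

When the null is true, the Type I error rate should sit near the chosen significance level. The Type II rate depends on how large the real difference is relative to the sample size, so it can be much higher.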

Significant vs. Meaningful
• Statistically significant does not always imply the finding is meaningful
  – “There is a statistically significant ¼ inch difference in the heights of women and men.”
  – “There is a statistically significant $0.50 difference in the per capita tax returns of married couples versus singles.”
• Large samples will almost always find statistically significant differences.
• The researcher needs to assess the meaning of the outcomes by considering their context.
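To see why large samples almost always turn up significant differences, the sketch below holds a quarter-inch height difference fixed and only changes the sample size. The standard deviation of 3 inches, the equal group sizes, and the helper names are assumptions for illustration.

```python
from statistics import NormalDist


def two_sample_z(difference, sd, n_per_group):
    """z statistic for a difference in means between two equal-sized groups."""
    standard_error = sd * (2 / n_per_group) ** 0.5
    return difference / standard_error


def two_tailed_p(z):
    """Two-tailed probability of a z score at least this extreme."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Same quarter-inch difference, assumed standard deviation of 3 inches.
for n in (100, 1_000, 10_000, 100_000):
    z = two_sample_z(0.25, 3.0, n)
    print(f"n = {n:>7,}  z = {z:5.2f}  p = {two_tailed_p(z):.4f}")
```

The difference itself never changes, only the sample size does, so whether a quarter inch matters is a question of context rather than of the p-value.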

Statistical Significance Revisited

• Steps:
  – State hypothesis
  – Set significance level associated with null hypothesis
  – Select statistical test (we will learn these soon)
  – Computation of obtained test statistic value
  – Computation of critical test statistic value
  – Comparison of obtained and critical values
    • If obtained > critical reject the null hypothesis
    • If obtained < critical stick with the null hypothesis
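The steps above can be walked through with a one-sample z test. This is an illustrative choice only, since the lecture has not yet introduced specific tests, and the population mean, the known standard deviation, and the sample heights are all made up.

```python
from statistics import NormalDist, mean

# Step 1. Hypotheses (hypothetical example):
#   null: the class mean height equals an assumed population mean of 66 inches
#   research: the class mean height differs from 66 inches
POPULATION_MEAN = 66.0  # assumed
POPULATION_SD = 3.0     # assumed known, which is what permits a z test
sample = [68, 70, 67, 69, 71, 66, 72, 68, 69, 70]  # made-up heights

# Step 2. Significance level.
alpha = 0.05

# Step 3. Test: a one-sample, two-tailed z test (an illustrative choice).
# Step 4. Obtained value of the test statistic.
standard_error = POPULATION_SD / len(sample) ** 0.5
obtained = (mean(sample) - POPULATION_MEAN) / standard_error

# Step 5. Critical value of the test statistic.
critical = NormalDist().inv_cdf(1 - alpha / 2)

# Step 6. Compare (absolute value, because the test is two-tailed).
decision = "reject the null" if abs(obtained) > critical else "stick with the null"
print(f"obtained = {obtained:.2f}, critical = {critical:.2f}: {decision}")
```

For a two-tailed test the comparison uses the absolute value of the obtained statistic, since a large difference in either direction counts against the null.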

Statistical Significance Revisited

• One Tailed Test

Statistical Significance Revisited

• Two Tailed Test
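A short sketch of how the critical value differs between the two kinds of test, assuming a z test at the 5% significance level (both assumptions are for illustration). A one-tailed test matches a directional research hypothesis; a two-tailed test matches a non-directional one.

```python
from statistics import NormalDist

alpha = 0.05
standard_normal = NormalDist()

# One-tailed (directional): all of alpha sits in one tail.
one_tailed_critical = standard_normal.inv_cdf(1 - alpha)

# Two-tailed (non-directional): alpha is split between both tails.
two_tailed_critical = standard_normal.inv_cdf(1 - alpha / 2)

print(f"one-tailed critical z: {one_tailed_critical:.3f}")
print(f"two-tailed critical z: {two_tailed_critical:.3f}")
```

The two-tailed critical value is larger because each tail holds only half of the 5%.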

Inferential Statistics Revisited

• Inference allows decisions to be made about populations based on information about samples.
• Steps:
  – Take a representative sample
  – Test each member of the sample
  – Analyze data to determine if variation is due to chance (accept null hypothesis) or statistically significant (accept research hypothesis)
  – Conclusions inferred about population
