top 10 concepts of statistics

Upload: ajbala

Post on 04-Apr-2018


TRANSCRIPT

  • 7/30/2019 Top 10 concepts of Statistics

    1/111

    Review of Top 10 Concepts in Statistics (reordered slightly for review in the interactive session)

    NOTE: This PowerPoint file is not an introduction, but rather a checklist of topics to review

  • 7/30/2019 Top 10 concepts of Statistics

    2/111

    Top Ten #10

    Qualitative vs. Quantitative

  • 7/30/2019 Top 10 concepts of Statistics

    3/111

    Qualitative

    Categorical data:

    success vs. failure

    ethnicity

    marital status

    color

    zip code

    4 star hotel in tour guide

  • 7/30/2019 Top 10 concepts of Statistics

    4/111

    Qualitative

    If you need an average, do not calculate the mean

    However, you can compute the mode (average person is married, buys a blue car made in America)

  • 7/30/2019 Top 10 concepts of Statistics

    5/111

    Quantitative

    Two cases

    Case 1: discrete

    Case 2: continuous

  • 7/30/2019 Top 10 concepts of Statistics

    6/111

    Discrete

    (1) integer values (0, 1, 2, ...)

    (2) example: binomial

    (3) finite number of possible values

    (4) counting

    (5) number of brothers

    (6) number of cars arriving at gas station

  • 7/30/2019 Top 10 concepts of Statistics

    7/111

    Continuous

    Real numbers, such as decimal values ($22.22)

    Examples: Z, t

    Infinite number of possible values

    Measurement

    Miles per gallon, distance, duration of time

  • 7/30/2019 Top 10 concepts of Statistics

    8/111

    Graphical Tools

    Pie chart or bar chart: qualitative

    Joint frequency table: qualitative (relate marital status vs zip code)

    Scatter diagram: quantitative (distance from CSUN vs duration of time to reach CSUN)

  • 7/30/2019 Top 10 concepts of Statistics

    9/111

    Hypothesis Testing

    Confidence Intervals

    Quantitative: Mean

    Qualitative: Proportion

  • 7/30/2019 Top 10 concepts of Statistics

    10/111

    Top Ten #9

    Population vs. Sample

  • 7/30/2019 Top 10 concepts of Statistics

    11/111

    Population

    Collection of all items (all light bulbs made at factory)

    Parameter: measure of population

    (1) population mean (average number of hours in life of all bulbs)

    (2) population proportion (% of all bulbs that are defective)

  • 7/30/2019 Top 10 concepts of Statistics

    12/111

    Sample

    Part of population (bulbs tested by inspector)

    Statistic: measure of sample = estimate of parameter

    (1) sample mean (average number of hours in life of bulbs tested by inspector)

    (2) sample proportion (% of bulbs in sample that are defective)

  • 7/30/2019 Top 10 concepts of Statistics

    13/111

    Top Ten #1

    Descriptive Statistics

  • 7/30/2019 Top 10 concepts of Statistics

    14/111

    Measures of Central Location

    Mean

    Median

    Mode

  • 7/30/2019 Top 10 concepts of Statistics

    15/111

    Mean

    Population mean = \(\mu = \sum x / N\) = (5+1+6)/3 = 12/3 = 4

    Algebra: \(\sum x = N\mu\) = 3*4 = 12

    Sample mean = x-bar = \(\sum x / n\)

    Example: the number of hours spent on the Internet: 4, 8, and 9

    x-bar = (4+8+9)/3 = 7 hours

    Do NOT use if the number of observations is small or with extreme values

    Ex: Do NOT use if 3 houses were sold this week, and one was a mansion

  • 7/30/2019 Top 10 concepts of Statistics

    16/111

    Median

    Median = middle value

    Example: 5,1,6

    Step 1: Sort data: 1,5,6

    Step 2: Middle value = 5

    When there is an even number of observations, the median is computed by averaging the two observations in the middle.

    OK even if there are extreme values

    Home sales: 100K, 200K, 900K, so mean = 400K, but median = 200K

  • 7/30/2019 Top 10 concepts of Statistics

    17/111

    Mode

    Mode: most frequent value

    Ex: female, male, female

    Mode = female

    Ex: 1,1,2,3,5,8

    Mode = 1

    It may not be a very good measure; see the following example

  • 7/30/2019 Top 10 concepts of Statistics

    18/111

    Measures of Central Location - Example

    Sample: 0, 0, 5, 7, 8, 9, 12, 14, 22, 23

    Sample Mean = x-bar = \(\sum x / n\) = 100/10 = 10

    Median = (8+9)/2 = 8.5

    Mode = 0
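The three measures for this sample can be checked with a short sketch. It assumes Python's standard `statistics` module, which the slides do not use; the variable names are illustrative only.

```python
import statistics

# Sample from the "Measures of Central Location - Example" slide
sample = [0, 0, 5, 7, 8, 9, 12, 14, 22, 23]

mean = statistics.mean(sample)      # sum of the values divided by n
median = statistics.median(sample)  # n is even, so the two middle values are averaged
mode = statistics.mode(sample)      # most frequent value

print("mean:", mean, "median:", median, "mode:", mode)
```

Here the mode (0) sits far from the mean and the median, which is why the previous slide warns that the mode may not be a very good measure.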

  • 7/30/2019 Top 10 concepts of Statistics

    19/111

    Relationship

    Case 1: if probability distribution is symmetric (ex. bell-shaped, normal distribution), Mean = Median = Mode

    Case 2: if distribution is positively skewed to the right (ex. incomes of employees in a large firm: a large number of relatively low-paid workers and a small number of high-paid executives), Mode < Median < Mean

  • 7/30/2019 Top 10 concepts of Statistics

    20/111

    Relationship (cont'd)

    Case 3: if distribution is negatively skewed to the left (ex. the time taken by students to write exams: few students hand in their exams early and the majority of students turn in their exams at the end of the exam), Mean < Median < Mode

  • 7/30/2019 Top 10 concepts of Statistics

    21/111

    Dispersion - Measures of Variability

    How much spread of data

    How much uncertainty

    Measures:

    Range

    Variance

    Standard deviation

  • 7/30/2019 Top 10 concepts of Statistics

    22/111

    Range

    Range = Max - Min \(\ge\) 0

    But range is affected by unusual values

    Ex: Santa Monica has a high of 105 degrees and a low of 30 once a century, but the range would be 105-30 = 75

  • 7/30/2019 Top 10 concepts of Statistics

    23/111

    Standard Deviation (SD)

    Better than range because all data used

    Population SD = square root of variance = sigma = \(\sigma\)

    SD \(\ge\) 0

  • 7/30/2019 Top 10 concepts of Statistics

    24/111

    Empirical Rule

    Applies to mound or bell-shaped curves

    Ex: normal distribution

    68% of data within \(\pm\) one SD of mean

    95% of data within \(\pm\) two SD of mean

    99.7% of data within \(\pm\) three SD of mean
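The empirical-rule percentages can be reproduced for the normal case. This is a small sketch that assumes the standard library's `statistics.NormalDist`; the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def share_within(k):
    """Share of a normal population lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, "SD:", round(share_within(k), 4))
```

The rounded results agree with the 68%, 95%, and 99.7% figures quoted on the slide.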

  • 7/30/2019 Top 10 concepts of Statistics

    25/111

    Standard Deviation = Square Root of Variance

    \[ s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}} \]

  • 7/30/2019 Top 10 concepts of Statistics

    26/111

    Sample Standard Deviation

    x          \(x - \bar{x}\)        \((x - \bar{x})^2\)

    6          6-8 = -2         (-2)(-2) = 4

    6          6-8 = -2         4

    7          7-8 = -1         (-1)(-1) = 1

    8          8-8 = 0          0

    13         13-8 = 5         (5)(5) = 25

    Sum = 40   Sum = 0          Sum = 34

    Mean = 40/5 = 8

  • 7/30/2019 Top 10 concepts of Statistics

    27/111

    Standard Deviation

    Total variation = 34

    Sample variance = 34/4 = 8.5

    Sample standard deviation = square root of 8.5 = 2.9
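The table-based calculation above can be sketched step by step. The data values come from the slide; the variable names are illustrative, and the final comparison with `statistics.stdev` is only a cross-check, not part of the slides.

```python
import math
import statistics

data = [6, 6, 7, 8, 13]  # x column from the slide
n = len(data)

x_bar = sum(data) / n                                  # mean = 40/5
total_variation = sum((x - x_bar) ** 2 for x in data)  # sum of squared deviations
sample_variance = total_variation / (n - 1)            # divide by n - 1, not n
sample_sd = math.sqrt(sample_variance)

print(x_bar, total_variation, sample_variance, round(sample_sd, 1))

# Cross-check against the library routine, which also divides by n - 1
library_sd = statistics.stdev(data)
```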

  • 7/30/2019 Top 10 concepts of Statistics

    28/111

    Measures of Variability - Example

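    The hourly wages earned by a sample of five students are: $7, $5, $11, $8, and $6

    Range: 11 - 5 = 6

    Variance:

    \[ s^2 = \frac{\sum (X - \bar{X})^2}{n - 1} = \frac{(7 - 7.4)^2 + \dots + (6 - 7.4)^2}{5 - 1} = \frac{21.2}{5 - 1} = 5.30 \]

    Standard deviation:

    \[ s = \sqrt{s^2} = \sqrt{5.30} = 2.30 \]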

  • 7/30/2019 Top 10 concepts of Statistics

    29/111

    Graphical Tools

    Line chart: trend over time

    Scatter diagram: relationship between two variables

    Bar chart: frequency for each category

    Histogram: frequency for each class of measured data (graph of frequency distr.)

    Box plot: graphical display based on quartiles, which divide data into 4 parts

  • 7/30/2019 Top 10 concepts of Statistics

    30/111

    Top Ten #8

    Variation Creates Uncertainty

  • 7/30/2019 Top 10 concepts of Statistics

    31/111

    No Variation

    Certainty, exact prediction

    Standard deviation = 0

    Variance = 0

    All data exactly same

    Example: all workers in minimum wage job

  • 7/30/2019 Top 10 concepts of Statistics

    32/111

    High Variation

    Uncertainty, unpredictable

    High standard deviation

    Ex #1: Workers in downtown L.A. have variation between CEOs and garment workers

    Ex #2: New York temperatures in spring range from below freezing to very hot

  • 7/30/2019 Top 10 concepts of Statistics

    33/111

    Comparing Standard Deviations

    Temperature Example

    Beach city: small standard deviation (single temperature reading close to mean)

    High Desert city: High standard deviation (hot days, cool nights in spring)

  • 7/30/2019 Top 10 concepts of Statistics

    34/111

    Standard Error of the Mean

    Standard deviation of sample mean =

    standard deviation/square root of n

    Ex: standard deviation = 10, n = 4, so standard error of the mean = 10/2 = 5

    Note that 5 < 10

  • 7/30/2019 Top 10 concepts of Statistics

    35/111

    Sampling Distribution

    Expected value of sample mean = population mean, but an individual sample mean could be smaller or larger than the population mean

    Population mean is a constant parameter, but sample mean is a random variable

    Sampling distribution is distribution of sample means

  • 7/30/2019 Top 10 concepts of Statistics

    36/111

    Example

    Mean age of all students in the building is population mean

    Each classroom has a sample mean

    Distribution of sample means from all classrooms is sampling distribution

  • 7/30/2019 Top 10 concepts of Statistics

    37/111

    Central Limit Theorem (CLT)

    If population standard deviation is known, sampling distribution of sample means is approximately normal if n > 30

    CLT applies even if original population is skewed
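A small simulation can make the CLT claim concrete. This sketch is not from the slides: the exponential population, the sample size of 36, the 5,000 repetitions, and the fixed seed are all assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Assumed population: exponential with rate 1 (strongly right-skewed),
# which has mean 1 and standard deviation 1.
POP_MEAN = 1.0
POP_SD = 1.0
n = 36              # sample size, above the n > 30 rule of thumb
num_samples = 5000  # number of repeated samples

sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(n))
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)  # should be near the population mean
sd_of_means = statistics.stdev(sample_means)    # should be near sigma / sqrt(n)
standard_error = POP_SD / n ** 0.5

# Share of sample means within 1.96 standard errors of the population mean;
# for a normal sampling distribution this would be about 95%.
share_within = sum(
    abs(m - POP_MEAN) <= 1.96 * standard_error for m in sample_means
) / num_samples

print(round(mean_of_means, 3), round(sd_of_means, 3), round(standard_error, 3), share_within)
```

Even though individual observations are skewed, the sample means cluster around the population mean with a spread close to the standard error of the mean.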

  • 7/30/2019 Top 10 concepts of Statistics

    38/111

    Top Ten #5

    Expected Value

  • 7/30/2019 Top 10 concepts of Statistics

    39/111

    Expected Value

    Expected Value = \(E(x) = \sum x P(x)\)

    \(= x_1 P(x_1) + x_2 P(x_2) + \dots\)

    Expected value is a weighted average, also a long-run average

  • 7/30/2019 Top 10 concepts of Statistics

    40/111

    Example

    Find the expected age at high school graduation if 11 were 17 years old, 80 were 18 years old, and 5 were 19 years old

    Step 1: 11+80+5=96

  • 7/30/2019 Top 10 concepts of Statistics

    41/111

    Step 2

    x      P(x)             x·P(x)

    17     11/96 = .115     17(.115) = 1.955

    18     80/96 = .833     18(.833) = 14.994

    19     5/96 = .052      19(.052) = .988

                            E(x) = 17.937
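The weighted-average calculation can be sketched with exact fractions. The counts are the ones given in the example; the use of `fractions.Fraction` and the variable names are illustrative choices, not part of the slides.

```python
from fractions import Fraction

# Number of graduates at each age, from the example
counts = {17: 11, 18: 80, 19: 5}

total = sum(counts.values())  # Step 1: 11 + 80 + 5
probabilities = {age: Fraction(c, total) for age, c in counts.items()}

# Step 2: E(x) = sum of x * P(x)
expected_age = sum(age * p for age, p in probabilities.items())

print(total, float(expected_age))
```

Without rounding each P(x) to three decimals, the exact value is 17.9375, which matches the slide's 17.937 up to rounding.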

  • 7/30/2019 Top 10 concepts of Statistics

    42/111

    Top Ten #4

    Linear Regression

  • 7/30/2019 Top 10 concepts of Statistics

    43/111

    Linear Regression

    Regression equation: \(\hat{y} = b_0 + b_1 x\)

    \(\hat{y}\) = dependent variable = predicted value

    x = independent variable

    b0 = y-intercept = predicted value of y if x = 0

    b1 = slope = regression coefficient = change in y per unit change in x

  • 7/30/2019 Top 10 concepts of Statistics

    44/111

    Slope vs Correlation

    Positive slope (b1 > 0): positive correlation between x and y (y increases if x increases)

    Negative slope (b1 < 0): negative correlation between x and y (y decreases if x increases)

  • 7/30/2019 Top 10 concepts of Statistics

    45/111

    Simple Linear Regression

    Simple: one independent variable, one dependent variable

    Linear: graph of regression equation is straight line

  • 7/30/2019 Top 10 concepts of Statistics

    46/111

    Example

    y = salary (female manager, in thousands of dollars)

    x = number of children

    n = number of observations

  • 7/30/2019 Top 10 concepts of Statistics

    47/111

    Given Data

    x y

    2 48

    1 52

    4 33

  • 7/30/2019 Top 10 concepts of Statistics

    48/111

    Totals

    x y

    2 48

    1 52

    4 33 n=3

    Sum=7 Sum=133

  • 7/30/2019 Top 10 concepts of Statistics

    49/111

    Slope (b1) = -6.5

    Method of Least Squares formulas not on BUS 302 exam

    b1 = -6.5 given

    Interpretation: If one female manager has 1 more child than another, salary is $6,500 lower; that is, salary of female managers is expected to decrease by 6.5 (in thousands of dollars) per child

  • 7/30/2019 Top 10 concepts of Statistics

    50/111

    Intercept (b0)

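    \[ \bar{y} = b_0 + b_1 \bar{x} \]

    \[ \bar{x} = \frac{\sum x}{n} = \frac{7}{3} = 2.33 \qquad \bar{y} = \frac{\sum y}{n} = \frac{133}{3} = 44.33 \]

    b0 = 44.33 - (-6.5)(2.33) = 59.5

    If number of children is zero, expected salary is $59,500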

  • 7/30/2019 Top 10 concepts of Statistics

    51/111

    Regression Equation

    \[ \hat{y} = 59.5 - 6.5x \]

  • 7/30/2019 Top 10 concepts of Statistics

    52/111

    Forecast Salary If 3 Children

    59.5 - 6.5(3) = 40

    $40,000 = expected salary
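The slope is simply given in the slides, which note that the least squares formulas are not on the exam. As a hedged cross-check, the sketch below applies the standard textbook least squares formulas (an assumption, since the deck does not show them) to the three data points; all variable names are illustrative.

```python
# Given data: x = number of children, y = salary in thousands of dollars
xs = [2, 1, 4]
ys = [48, 52, 33]
n = len(xs)

x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Standard least squares formulas (not shown in the slides)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)

b1 = s_xy / s_xx         # slope
b0 = y_bar - b1 * x_bar  # intercept: the line passes through (x_bar, y_bar)

def forecast(x):
    """Predicted salary (in thousands) for x children."""
    return b0 + b1 * x

print(round(b1, 2), round(b0, 2), round(forecast(3), 2))
```

The fitted values agree with the slides: slope -6.5, intercept 59.5, and a forecast of 40 (that is, $40,000) for three children.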

  • 7/30/2019 Top 10 concepts of Statistics

    53/111

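    Standard Error of Estimate

    \[ \text{forecast} = \hat{y} = b_0 + b_1 x \]

    \[ \text{error} = y - \hat{y} \]

    \[ S = \sqrt{\frac{SSE}{n - 2}} = \sqrt{\frac{\sum (y - \hat{y})^2}{n - 2}} \]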

  • 7/30/2019 Top 10 concepts of Statistics

    54/111

    Standard Error of Estimate

    (1) = x    (2) = y    (3) = \(\hat{y}\) = 59.5 - 6.5x    (4) = (2) - (3) = \(y - \hat{y}\)    \((y - \hat{y})^2\)

    2          48         46.5                      1.5                        2.25

    1          52         53                        -1                         1

    4          33         33.5                      -.5                        .25

                                                                               SSE = 3.5

  • 7/30/2019 Top 10 concepts of Statistics

    55/111

    Standard Error of Estimate

    \[ S = \sqrt{\frac{3.5}{3 - 2}} = \sqrt{3.5} = 1.9 \]

    Actual salary typically $1,900 away from expected salary

  • 7/30/2019 Top 10 concepts of Statistics

    56/111

    Coefficient of Determination

    \(R^2\) = % of total variation in y that can be explained by variation in x

    Measure of how closely the linear regression line fits the points in a scatter diagram

    \(R^2\) = 1: max. possible value: perfect linear relationship between y and x (straight line)

    \(R^2\) = 0: min. value: no linear relationship

  • 7/30/2019 Top 10 concepts of Statistics

    57/111

    Sources of Variation (V)

    Total V = Explained V + Unexplained V

    SS = Sum of Squares = V

    Total SS = Regression SS + Error SS

    SST = SSR + SSE

    SSR = Explained V, SSE = Unexplained

  • 7/30/2019 Top 10 concepts of Statistics

    58/111

    Coefficient of Determination

    \[ R^2 = \frac{SSR}{SST} = \frac{197}{200.5} = .98 \]

    Interpretation: 98% of total variation in salary can be explained by variation in number of children
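The sums of squares for the salary example can be recomputed from the data and the fitted line \(\hat{y} = 59.5 - 6.5x\). This is a sketch with illustrative variable names; it works with unrounded values, so SST comes out near 200.67 rather than the slide's rounded 200.5, while \(R^2\) still rounds to .98.

```python
import math

xs = [2, 1, 4]        # number of children
ys = [48, 52, 33]     # salary in thousands of dollars
b0, b1 = 59.5, -6.5   # intercept and slope from the slides
n = len(xs)

predicted = [b0 + b1 * x for x in xs]
sse = sum((y - p) ** 2 for y, p in zip(ys, predicted))  # unexplained variation

y_bar = sum(ys) / n
sst = sum((y - y_bar) ** 2 for y in ys)                 # total variation
ssr = sst - sse                                         # explained variation

r_squared = ssr / sst
std_error = math.sqrt(sse / (n - 2))                    # standard error of estimate

# The correlation coefficient takes the sign of the slope
r = -math.sqrt(r_squared) if b1 < 0 else math.sqrt(r_squared)

print(sse, round(std_error, 1), round(r_squared, 2), round(r, 4))
```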

  • 7/30/2019 Top 10 concepts of Statistics

    59/111

    \(0 \le R^2 \le 1\)

    0: No linear relationship since SSR = 0 (explained variation = 0)

    1: Perfect relationship since SSR = SST (unexplained variation = SSE = 0), but does not prove cause and effect

  • 7/30/2019 Top 10 concepts of Statistics

    60/111

    R = Correlation Coefficient

    Case 1: slope (b1) < 0

    R < 0

    R is negative square root of coefficient of determination

    \[ R = -\sqrt{R^2} \]

  • 7/30/2019 Top 10 concepts of Statistics

    61/111

    Our Example

    Slope = b1 = -6.5

    \(R^2\) = .98

    R = -.99

  • 7/30/2019 Top 10 concepts of Statistics

    62/111

    Case 2: Slope > 0

    R is positive square root of coefficient of determination

    Ex: \(R^2\) = .49

    R = .70

    R has no interpretation

    R overstates relationship

  • 7/30/2019 Top 10 concepts of Statistics

    63/111

    Caution

    Nonlinear relationship (parabola, hyperbola, etc.) can NOT be measured by \(R^2\)

    In fact, you could get \(R^2\) = 0 with a nonlinear graph on a scatter diagram

  • 7/30/2019 Top 10 concepts of Statistics

    64/111

    Summary: Correlation Coefficient

    Case 1: If b1 > 0, R is the positive square root of the coefficient of determination

    Ex#1: y = 4+3x, \(R^2\) = .36: R = +.60

    Case 2: If b1 < 0, R is the negative square root of the coefficient of determination

    Ex#2: y = 80-10x, \(R^2\) = .49: R = -.70

    NOTE! Ex#2 has stronger relationship, as measured by coefficient of determination

  • 7/30/2019 Top 10 concepts of Statistics

    65/111

    Extreme Values

    R=+1: perfect positive correlation

    R= -1: perfect negative correlation

    R=0: zero correlation

  • 7/30/2019 Top 10 concepts of Statistics

    66/111

    MS Excel Output

    Correlation Coefficient (-0.9912): Note

    that you need to change the sign because

    the sign of slope (b1) is negative (-6.5)

    Coefficient of Determination

    Standard Error of Estimate

    Regression Coefficient

  • 7/30/2019 Top 10 concepts of Statistics

    67/111

    Top Ten #6

    What Distribution to Use?

  • 7/30/2019 Top 10 concepts of Statistics

    68/111

    Use Binomial Distribution If:

    Random variable (x) is number of successes in n trials

    Each trial is success or failure

    Independent trials

    Constant probability of success (\(\pi\)) on each trial

    Sampling with replacement (in practice, people may use binomial w/o replacement, but theory is with replacement)

  • 7/30/2019 Top 10 concepts of Statistics

    69/111

    Success vs. Failure

    The binomial experiment can result in only one of two possible outcomes:

    Male vs. Female

    Defective vs. Non-defective

    Yes or No

    Pass (8 or more right answers) vs. Fail (fewer than 8)

    Buy drink (21 or over) vs. Cannot buy drink

  • 7/30/2019 Top 10 concepts of Statistics

    70/111

    Binomial Is Discrete

    Integer values

    0, 1, 2, ..., n

    Binomial is often skewed, but may be symmetric
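The remark about skewness can be illustrated by computing binomial probabilities directly. In this sketch the values n = 10 and the success probabilities 0.5 and 0.1 are hypothetical, chosen only to show one symmetric and one skewed case; `binomial_pmf` is an illustrative helper name.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(exactly x successes in n independent trials with success probability p)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n = 10  # hypothetical number of trials

# Success probability 0.5: the distribution is symmetric around n/2
symmetric = [binomial_pmf(x, n, 0.5) for x in range(n + 1)]

# Success probability 0.1: most of the probability piles up near 0 (skewed right)
skewed = [binomial_pmf(x, n, 0.1) for x in range(n + 1)]

for x in range(n + 1):
    print(x, round(symmetric[x], 4), round(skewed[x], 4))
```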

  • 7/30/2019 Top 10 concepts of Statistics

    71/111

    Normal Distribution

    Continuous, bell-shaped, symmetric

    Mean=median=mode

    Measurement (dollars, inches, years)

    Cumulative probability under normal curve: use Z table if you know population mean and population standard deviation

    Sample mean: use Z table if you know population standard deviation and either normal population or n > 30

  • 7/30/2019 Top 10 concepts of Statistics

    72/111

    t Distribution

    Continuous, mound-shaped, symmetric

    Applications similar to normal

    More spread out than normal

    Use t if normal population but population standard deviation not known

    Degrees of freedom = df = n-1 if estimating the mean of one population

    t approaches z as df increases

  • 7/30/2019 Top 10 concepts of Statistics

    73/111

    Normal or t Distribution?

    Use t table if normal population but population standard deviation (\(\sigma\)) is not known

    If you are given the sample standard deviation (s), use t table, assuming normal population

  • 7/30/2019 Top 10 concepts of Statistics

    74/111

    Top Ten #3

    Confidence Intervals: Mean and Proportion

  • 7/30/2019 Top 10 concepts of Statistics

    75/111

    Confidence Interval

    A confidence interval is a range of values within which the population parameter is expected to occur.

  • 7/30/2019 Top 10 concepts of Statistics

    76/111

    Factors for Confidence Interval

    The factors that determine the width of a confidence interval are:

    1. The sample size, n

    2. The variability in the population, usually estimated by standard deviation.

    3. The desired level of confidence.

  • 7/30/2019 Top 10 concepts of Statistics

    77/111

    Confidence Interval: Mean

    Use normal distribution (Z table) if:

    population standard deviation (sigma) known and either (1) or (2):

    (1) Normal population

    (2) Sample size > 30

  • 7/30/2019 Top 10 concepts of Statistics

    78/111

    Confidence Interval: Mean

    If normal table, then

    \[ \frac{\sum x}{n} \pm z \frac{\sigma}{\sqrt{n}} \]

  • 7/30/2019 Top 10 concepts of Statistics

    79/111

    Normal Table

    Tail = .5(1 - confidence level)

    NOTE! Different statistics texts have different normal tables

    This review uses the tail of the bell curve

    Ex: 95% confidence: tail = .5(1-.95) = .025

    \(Z_{.025}\) = 1.96

  • 7/30/2019 Top 10 concepts of Statistics

    80/111

    Example

    n = 49, \(\sum x\) = 490, \(\sigma\) = 2, 95% confidence

    \[ \frac{490}{49} \pm 1.96 \frac{2}{\sqrt{49}} = 10 \pm 0.56 \]

    9.44 < \(\mu\) < 10.56

  • 7/30/2019 Top 10 concepts of Statistics

    81/111

    Another Example

    One of the SOM professors wants to estimate the mean number of hours worked per week by students. A sample of 49 students showed a mean of 24 hours. It is assumed that the population standard deviation is 4 hours. What is the population mean?

  • 7/30/2019 Top 10 concepts of Statistics

    82/111

    Another Example (cont'd)

    95 percent confidence interval for the population mean:

    \[ \bar{X} \pm 1.96 \frac{\sigma}{\sqrt{n}} = 24.00 \pm 1.96 \frac{4}{\sqrt{49}} = 24.00 \pm 1.12 \]

    The confidence limits range from 22.88 to 25.12. We estimate with 95 percent confidence that the average number of hours worked per week by students lies between these two values.
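The interval for the hours-worked example can be reproduced in a few lines. This sketch assumes the standard library's `statistics.NormalDist` in place of a printed Z table; the function name `z_interval` is illustrative.

```python
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence):
    """Confidence interval for a mean when the population SD (sigma) is known."""
    tail = 0.5 * (1 - confidence)       # area in one tail of the bell curve
    z = NormalDist().inv_cdf(1 - tail)  # about 1.96 for 95% confidence
    margin = z * sigma / n ** 0.5       # z times the standard error of the mean
    return x_bar - margin, x_bar + margin

# Hours-worked example: sample mean 24, sigma 4, n = 49, 95% confidence
low, high = z_interval(x_bar=24, sigma=4, n=49, confidence=0.95)
print(round(low, 2), round(high, 2))
```

Rounded to two decimals, the limits match the 22.88 and 25.12 reported in the example.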

  • 7/30/2019 Top 10 concepts of Statistics

    83/111

    Confidence Interval: Mean, t distribution

    Use if normal population but population standard deviation (\(\sigma\)) not known

    If you are given the sample standard deviation (s), use t table, assuming normal population

    If one population, n-1 degrees of freedom

  • 7/30/2019 Top 10 concepts of Statistics

    84/111

    Confidence Interval: Mean, t distribution

    \[ \frac{\sum x}{n} \pm t_{n-1} \frac{s}{\sqrt{n}} \]

  • 7/30/2019 Top 10 concepts of Statistics

    85/111

    Confidence Interval: Proportion

    Use if success or failure (ex: defective or not-defective, satisfactory or unsatisfactory)

    Normal approximation to binomial ok if \(n\pi\) > 5 and \(n(1-\pi)\) > 5, where

    n = sample size

    \(\pi\) = population proportion

    NOTE: NEVER use the t table if proportion!!

  • 7/30/2019 Top 10 concepts of Statistics

    86/111

    Confidence Interval: Proportion

    Ex: 8 defectives out of 100, so p = .08 and n = 100, 95% confidence

    \[ p \pm z \sqrt{\frac{p(1-p)}{n}} = .08 \pm 1.96 \sqrt{\frac{(.08)(.92)}{100}} = .08 \pm .05 \]

  • 7/30/2019 Top 10 concepts of Statistics

    87/111

    Confidence Interval: Proportion

    A sample of 500 people who own their house revealed that 175 planned to sell their homes within five years. Develop a 98% confidence interval for the proportion of people who plan to sell their house within five years.

    \[ p = \frac{175}{500} = .35 \]

    \[ .35 \pm 2.33 \sqrt{\frac{(.35)(.65)}{500}} = .35 \pm .0497 \]
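Both proportion examples follow the same recipe, sketched below. The z values 1.96 and 2.33 are the table values used in the slides and are passed in as arguments; the function name `proportion_interval` is illustrative.

```python
import math

def proportion_interval(successes, n, z):
    """Return the sample proportion and the margin of error z * sqrt(p(1-p)/n)."""
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p, margin

# Home-sale example: 175 of 500, 98% confidence (z = 2.33 from the table)
p, margin = proportion_interval(successes=175, n=500, z=2.33)
print(p, round(margin, 4), round(p - margin, 4), round(p + margin, 4))

# Defectives example: 8 of 100, 95% confidence (z = 1.96)
p_def, margin_def = proportion_interval(successes=8, n=100, z=1.96)
print(p_def, round(margin_def, 2))
```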

  • 7/30/2019 Top 10 concepts of Statistics

    88/111

    Interpretation

    If 95% confidence, then 95% of all confidence intervals will include the true population parameter

    NOTE! Never use the term "probability" when estimating a parameter!! (ex: Do NOT say "Probability that population mean is between 23 and 32 is .95" because parameter is not a random variable. In fact, the population mean is a fixed but unknown quantity.)

  • 7/30/2019 Top 10 concepts of Statistics

    89/111

    Point vs Interval Estimate

    Point estimate: statistic (single number)

    Ex: sample mean, sample proportion

    Each sample gives different point estimate

    Interval estimate: range of values

    Ex: Population mean = sample mean \(\pm\) error

    Parameter = statistic \(\pm\) error

  • 7/30/2019 Top 10 concepts of Statistics

    90/111

    Width of Interval

    Ex: sample mean =23, error = 3

    Point estimate = 23

    Interval estimate = 23 \(\pm\) 3, or (20,26)

    Width of interval = 26-20 = 6

    Wide interval: Point estimate unreliable

  • 7/30/2019 Top 10 concepts of Statistics

    91/111

    Wide Confidence Interval If

    (1) small sample size (n)

    (2) large standard deviation

    (3) high confidence level (ex: 99% confidence interval wider than 95% confidence interval)

    If you want a narrow interval, you need a large sample size or small standard deviation or low confidence level.
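The three effects on interval width can be seen by varying one input at a time. In this sketch the baseline values (standard deviation 10, n = 25, 95% confidence) are hypothetical, and the z-based width formula assumes the population standard deviation is known.

```python
from statistics import NormalDist

def interval_width(sigma, n, confidence):
    """Width of a z-based confidence interval for the mean: 2 * z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - 0.5 * (1 - confidence))
    return 2 * z * sigma / n ** 0.5

# Hypothetical baseline
base = interval_width(sigma=10, n=25, confidence=0.95)

larger_sample = interval_width(sigma=10, n=100, confidence=0.95)    # narrower
larger_sd = interval_width(sigma=20, n=25, confidence=0.95)         # wider
higher_confidence = interval_width(sigma=10, n=25, confidence=0.99) # wider

print(round(base, 2), round(larger_sample, 2), round(larger_sd, 2), round(higher_confidence, 2))
```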

  • 7/30/2019 Top 10 concepts of Statistics

    92/111

    Top Ten #7

    P-value

  • 7/30/2019 Top 10 concepts of Statistics

    93/111

    P-value

    P-value = probability of getting a sample statistic as extreme as (or more extreme than) the sample statistic you got from your sample, given that the null hypothesis is true

  • 7/30/2019 Top 10 concepts of Statistics

    94/111

    P-value Example: one tail test

    H0: \(\mu\) = 40

    HA: \(\mu\) > 40

    Sample mean = 43

    P-value = P(sample mean > 43, given H0 true)

    Meaning: probability of observing a sample mean as large as 43 when the population mean is 40

    How to use it: Reject H0 if p-value < \(\alpha\) (significance level)

  • 7/30/2019 Top 10 concepts of Statistics

    95/111

    Two Cases

    Suppose \(\alpha\) = .05

    Case 1: suppose p-value = .02, then reject H0 (unlikely H0 is true; you believe population mean > 40)

    Case 2: suppose p-value = .08, then do not reject H0 (H0 may be true; you have reason to believe that the population mean may be 40)

  • 7/30/2019 Top 10 concepts of Statistics

    96/111

    P-value Example: two tail test

    H0: \(\mu\) = 70

    HA: \(\mu \neq 70\)

    Sample mean = 72

    If two-tails, then P-value = 2 × P(sample mean > 72) = 2(.04) = .08

    If \(\alpha\) = .05, p-value > \(\alpha\), so do not reject H0

  • 7/30/2019 Top 10 concepts of Statistics

    97/111

    Top Ten #2

    Hypothesis Testing

  • 7/30/2019 Top 10 concepts of Statistics

    98/111

    H0: Null Hypothesis

    Population mean = \(\mu\)

    Population proportion = \(\pi\)

    A statement about the value of a population parameter

    Never include sample statistic (such as x-bar) in hypothesis

  • 7/30/2019 Top 10 concepts of Statistics

    99/111

    HA or H1: Alternative Hypothesis

    ONE TAIL ALTERNATIVE

    Right tail: \(\mu\) > number (smog check)

    \(\pi\) > fraction (% defectives)

    Left tail:

  • 7/30/2019 Top 10 concepts of Statistics

    100/111

    One-Tailed Tests

    A test is one-tailed when the alternate hypothesis, H1 or HA, states a direction, such as:

    H1: The mean yearly salary earned by full-time employees is more than $45,000. (\(\mu\) > $45,000)

    H1: The average speed of cars traveling on the freeway is less than 75 miles per hour. (\(\mu\) < 75)

  • 7/30/2019 Top 10 concepts of Statistics

    101/111

    Two-Tail Alternative

    Population mean not equal to number (too hot or too cold)

    Population proportion not equal to fraction (% alcohol too weak or too strong)

  • 7/30/2019 Top 10 concepts of Statistics

    102/111

    Two Tailed Tests

    A test is two-tailed when no direction is specified in the alternate hypothesis

    H1: The mean amount of time spent on the Internet is not equal to 5 hours. (\(\mu \neq 5\))

    H1: The mean price for a gallon of gasoline is not equal to $2.54. (\(\mu \neq\) $2.54)

  • 7/30/2019 Top 10 concepts of Statistics

    103/111

    Reject Null Hypothesis (H0) If

    Absolute value of test statistic* > critical value*

    Reject H0 if |Z Value| > critical Z

    Reject H0 if | t Value| > critical t

    Reject H0 if p-value < significance level (alpha)

    Note that direction of inequality is reversed!

    Reject H0 if very large difference between sample statistic and population parameter in H0

    * Test statistic: A value, determined from sample information, used to determine whether or not to reject the null hypothesis.

    * Critical value: The dividing point between the region where the null hypothesis is rejected and the region where it is not rejected.

  • 7/30/2019 Top 10 concepts of Statistics

    104/111

    Example: Smog Check

    H0: \(\mu\) = 80

    HA: \(\mu\) > 80

    If test statistic = 2.2 and critical value = 1.96, reject H0, and conclude that the population mean is likely > 80

    If test statistic = 1.6 and critical value = 1.96, do not reject H0, and reserve judgment about H0

  • 7/30/2019 Top 10 concepts of Statistics

    105/111

    Type I vs Type II Error

    Alpha = \(\alpha\) = P(type I error) = Significance level = probability that you reject a true null hypothesis

    Beta = \(\beta\) = P(type II error) = probability you do not reject a null hypothesis, given H0 false

    Ex: H0: Defendant innocent

    \(\alpha\) = P(jury convicts innocent person)

    \(\beta\) = P(jury acquits guilty person)

  • 7/30/2019 Top 10 concepts of Statistics

    106/111

    Type I vs Type II Error

                        H0 true                                  H0 false

    Reject H0           Alpha = \(\alpha\) = P(type I error)         1 - \(\beta\) (Correct Decision)

    Do not reject H0    1 - \(\alpha\) (Correct Decision)            Beta = \(\beta\) = P(type II error)

  • 7/30/2019 Top 10 concepts of Statistics

    107/111

    Example: Smog Check

    H0: \(\mu\) = 80

    HA: \(\mu\) > 80

    If p-value = 0.01 and alpha = 0.05, reject H0, and conclude that the population mean is likely > 80

    If p-value = 0.07 and alpha = 0.05, do not reject H0, and reserve judgment about H0

  • 7/30/2019 Top 10 concepts of Statistics

    108/111

    Test Statistic

    When testing for the population mean from a large sample and the population standard deviation is known, the test statistic is given by:

    \[ z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \]

  • 7/30/2019 Top 10 concepts of Statistics

    109/111

    Example

    The processors of Best Mayo indicate on the label that the bottle contains 16 ounces of mayo. The standard deviation of the process is 0.5 ounces. A sample of 36 bottles from last hour's production showed a mean weight of 16.12 ounces per bottle. At the .05 significance level, can we conclude that the mean amount per bottle is greater than 16 ounces?

  • 7/30/2019 Top 10 concepts of Statistics

    110/111

    Example (cont'd)

    1. State the null and the alternative hypotheses: H0: \(\mu\) = 16, H1: \(\mu\) > 16

    2. Select the level of significance. In this case, we selected the .05 significance level.

    3. Identify the test statistic. Because we know the population standard deviation, the test statistic is z.

    4. State the decision rule. Reject H0 if |z| > 1.645 (= \(z_{0.05}\))

  • 7/30/2019 Top 10 concepts of Statistics

    111/111

    Example (cont'd)

    5. Compute the value of the test statistic

    \[ z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} = \frac{16.12 - 16.00}{0.5 / \sqrt{36}} = 1.44 \]

    6. Conclusion: Do not reject the null hypothesis. We cannot conclude the mean is greater than 16 ounces.
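The six steps of the Best Mayo example can be traced in code. This sketch assumes the standard library's `statistics.NormalDist` instead of a Z table, and it also computes a one-tailed p-value, which the slides do not report for this example; variable names are illustrative.

```python
from statistics import NormalDist

# Values from the example
mu_0 = 16.0    # hypothesized mean under H0 (ounces)
sigma = 0.5    # known process standard deviation
n = 36         # sample size
x_bar = 16.12  # observed sample mean
alpha = 0.05   # significance level

# Step 5: test statistic z = (x_bar - mu_0) / (sigma / sqrt(n))
z = (x_bar - mu_0) / (sigma / n ** 0.5)

# Step 4: critical value for a right-tail test at alpha = .05 (about 1.645)
critical_z = NormalDist().inv_cdf(1 - alpha)

# One-tailed p-value: P(Z > z) when H0 is true (not given in the slides)
p_value = 1 - NormalDist().cdf(z)

# Step 6: decision
reject_h0 = z > critical_z
print(round(z, 2), round(critical_z, 3), round(p_value, 4), reject_h0)
```

Both routes lead to the same conclusion as the slide: the test statistic is below the critical value and the p-value exceeds .05, so the null hypothesis is not rejected.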