

DEPARTMENT OF MARKETING AND STATISTICS AARHUS SCHOOL OF BUSINESS

UNIVERSITY OF AARHUS

INTERNAL TEACHING MATERIAL E309 (REPLACES E281)

Lecture Notes in Business Statistics BSc(B)/(IM) part 1 

 

Steen Andersen and Morten Berg Jensen 

2009 


Table of Contents

1. Expected Value and Variance of Random Variables .......... 1
2. Various Probability Distributions .......... 7
   A) The hypergeometric distribution .......... 9
   B) The k-dimensional hypergeometric distribution .......... 10
   C) The binomial distribution .......... 11
   D) The multinomial distribution .......... 12
   E) The geometric distribution .......... 13
   F) The Poisson distribution .......... 14
   G) The exponential distribution .......... 15
   H) The uniform distribution .......... 16
   I) The normal distribution .......... 17
   J) The T-distribution .......... 19
   K) The χ2-distribution .......... 21
   L) The F-distribution .......... 23
3. Choice of Test Statistic by Test for Expected Values and Proportions .......... 24
4. Sampling Methods .......... 29
5. Poisson Distribution, Confidence Intervals and Hypothesis Testing .......... 31
6. Overview .......... 35
   A) Descriptive measures .......... 35
   B) Construction of confidence intervals .......... 36
   C) Application of hypotheses .......... 41


1

1. Expected Value and Variance of Random Variables

A) Definition
A random variable is a function or rule that assigns a number to each outcome of an experiment. Random variables are separated into discrete and continuous random variables. A throw of a die, for example, has a discrete outcome, whereas a reaction time has a continuous outcome.

Discrete random variables
If we look at the throw of a die, the throw itself is the experiment, the random variable represents the possible outcomes that the throw may result in, the outcome is the number of pips the throw shows, and the probability distribution gives the probabilities of each of the possible outcomes. P(x) = P(X = x) is read as the probability that the random variable X equals x. All probabilities must be greater than or equal to zero, and the probabilities must sum to 1:

$P(x) \ge 0 \quad\text{and}\quad \sum_x P(x) = 1.$

Random variables are written in capital letters and their values in small letters. E(X) is read as the expected value of the random variable X. For a throw of a die the expected value is 3½, expressing the average number of pips you expect to throw in the long run. The expected value is defined as

$E(X) = \mu_X = \sum_x x \cdot P(x),$

the sum of the single outcomes, weighted by their probabilities. V(X) is read as the variance of the random variable X. The variance is an expression of the diversity that the throw of a die may result in. The variance is defined as

$V(X) = \sigma_X^2 = E\big[(X-\mu_X)^2\big] = \sum_x (x-\mu_X)^2 \cdot P(x) = \sum_x x^2 \cdot P(x) - \mu_X^2 = E(X^2) - E(X)^2.$

$E[(X-\mu_X)^2]$ is the expected value of the squared deviations, and $E(X^2)$ is the expected value of the random variable X².


2

Example
Let the random variable X be defined by the following probability distribution:

$P(x) = \frac{x}{10} \quad\text{for } x = 1, 2, 3, 4,$

where {1, 2, 3, 4} is the set of possible outcomes for X. The probability distribution can appropriately be presented in a table where further calculations can be made.

x   | P(x) | x·P(x)     | x²·P(x)      | (x−μ_X)² | (x−μ_X)²·P(x)
1   | 0.1  | 0.1        | 0.1          | 4        | 0.4
2   | 0.2  | 0.4        | 0.8          | 1        | 0.2
3   | 0.3  | 0.9        | 2.7          | 0        | 0.0
4   | 0.4  | 1.6        | 6.4          | 1        | 0.4
sum | 1.0  | 3.0 = E(X) | 10.0 = E(X²) | –        | 1.0 = V(X)

Continuous random variables
In the continuous case the point probability is 0, and it is replaced by f(x), called the probability density function for X, where f(x) must be non-negative and exhaustive, i.e.

$f(x) \ge 0 \quad\text{and}\quad \int_{-\infty}^{\infty} f(x)\,dx = 1.$

The probability that X assumes a value within the interval [a; b] is

$P(a \le X \le b) = \int_a^b f(x)\,dx = F(b) - F(a), \quad\text{where } F(b) = \int_{-\infty}^{b} f(x)\,dx \text{ and } F(a) = \int_{-\infty}^{a} f(x)\,dx.$

The expected value is defined as

$E(X) = \mu_X = \int_{-\infty}^{\infty} x \cdot f(x)\,dx.$

The variance is defined as

$V(X) = \sigma_X^2 = E\big[(X-\mu_X)^2\big] = \int_{-\infty}^{\infty} (x-\mu_X)^2 f(x)\,dx = \int_{-\infty}^{\infty} x^2 f(x)\,dx - \mu_X^2 = E(X^2) - E(X)^2.$

[Figure: bar chart of the probability distribution P(x) for x = 1, 2, 3, 4]


3

B) Laws of expected value and variance

1. Laws of expected value
(1) E(b) = b, where b is an arbitrary constant
(2) E(aX) = a · E(X), where a is an arbitrary constant
(3) E(aX + b) = a · E(X) + b
(4) E(X + Y) = E(X) + E(Y) (general addition)
(5) E(X − Y) = E(X) − E(Y)
(6) E(aX + bY) = a · E(X) + b · E(Y)

2. Laws of variance
(1) V(b) = 0
(2) V(aX) = a² · V(X)
(3) V(aX + b) = a² · V(X)
(4) V(X + Y) = V(X) + V(Y), if X and Y are independent
(5) V(X − Y) = V(X) + V(Y), if X and Y are independent
(6) V(aX + bY) = a² · V(X) + b² · V(Y), if X and Y are independent

3. Covariance and coefficient of correlation
If the random variables are not independent, then
(4a) V(X + Y) = V(X) + V(Y) + 2 · COV(X,Y)
(5a) V(X − Y) = V(X) + V(Y) − 2 · COV(X,Y)
(6a) V(aX + bY) = a² · V(X) + b² · V(Y) + 2ab · COV(X,Y)

Let the random variable R be a linear combination of k random variables. The expected value and variance of R are determined by the following expressions based on the expected values, variances and covariances. R could be the turnover, where a_i is the price of item i and X_i is the number of items i sold.

$R = \sum_{i=1}^{k} a_i X_i$

$E(R) = \sum_{i=1}^{k} a_i \cdot E(X_i)$

$V(R) = \sum_{i=1}^{k} a_i^2 \cdot V(X_i) + 2 \sum_{i=1}^{k} \sum_{j>i}^{k} a_i a_j \cdot COV(X_i, X_j).$
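As a minimal numerical sketch of these formulas, the following computes E(R) and V(R) for a hypothetical turnover example with k = 2 items; the prices, means and covariance matrix are invented for illustration only:

```python
import numpy as np

# Hypothetical example: turnover R = a1*X1 + a2*X2, where a_i is the
# price of item i and X_i is the (random) number of items i sold.
a = np.array([10.0, 25.0])            # assumed prices (illustrative)
mean_X = np.array([100.0, 40.0])      # assumed E(X_i)
cov_X = np.array([[16.0, -3.0],       # assumed covariance matrix:
                  [-3.0,  9.0]])      # V(X1)=16, V(X2)=9, COV(X1,X2)=-3

E_R = a @ mean_X                      # E(R) = sum of a_i * E(X_i)
V_R = a @ cov_X @ a                   # a_i^2 * V(X_i) terms plus 2*a_i*a_j*COV terms
print(E_R, V_R)                       # 2000.0 and 5725.0
```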


4

The random variable X̄ is a linear combination of n independent random variables, all weighted by 1/n:

$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$

$E(\bar{X}) = \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \frac{1}{n} \cdot n \cdot \mu = \mu, \quad\text{when } E(X_i) = \mu$

$V(\bar{X}) = \frac{1}{n^2}\sum_{i=1}^{n} V(X_i) + \frac{2}{n^2}\sum_{i=1}^{n}\sum_{j>i}^{n} COV(X_i, X_j) = \frac{1}{n^2} \cdot n \cdot \sigma^2 + 0 = \frac{\sigma^2}{n},$

when $V(X_i) = \sigma^2$ and $COV(X_i, X_j) = 0$.

The covariance is an expression of linear dependence. If the covariance is 0, the variables may be independent, but there may instead be a non-linear form of dependence.

$COV(X,Y) = E\big[(X-\mu_X)(Y-\mu_Y)\big] = E(X \cdot Y) - \mu_X \cdot \mu_Y, \quad\text{where}$

$E(X \cdot Y) = \sum_x \sum_y x \cdot y \cdot P(x,y) \quad\text{or}\quad E(X \cdot Y) = \int\!\!\int x \cdot y \cdot f(x,y)\,dx\,dy.$

The coefficient of correlation, ρ, is defined from the covariance:

$\rho = \frac{COV(X,Y)}{\sigma_X \cdot \sigma_Y}, \quad\text{where } -1 \le \rho \le 1.$

ρ² is an expression of the proportion of the variation in Y which is explained by X. ρ itself can be regarded as a kind of index figure for the strength and the direction of the relationship. As a supplement to V(X̄) it can be proved that if $V(X_i) = \sigma^2$ and $\rho_{ij} = \rho$, then

$V(\bar{X}) = \frac{1}{n^2}\sum_{i=1}^{n} V(X_i) + \frac{1}{n^2} \cdot 2 \cdot \frac{n(n-1)}{2} \cdot \rho \cdot \sigma^2 = \frac{\sigma^2}{n}\big(1 + (n-1)\rho\big).$


5

C) Example
Let the random variables X and Y denote the number of computers sold per day from store X and store Y, respectively. The two random variables are defined by the following joint probability distribution:

x \ y | 0    | 1    | 2    | P(x)
1     | 0.10 | 0.00 | 0.00 | 0.1
2     | 0.05 | 0.10 | 0.05 | 0.2
3     | 0.05 | 0.10 | 0.15 | 0.3
4     | 0.00 | 0.10 | 0.30 | 0.4
P(y)  | 0.20 | 0.30 | 0.50 | 1.0

The table states the probability of every possible pair (x, y), e.g. P(X = 3 ∩ Y = 2) = P(3,2) = 0.15. This means that there is a 15% probability that on one day 3 computers are sold from store X, while 2 computers are sold from store Y. Also available are the probability distributions for X and Y, called the marginal probability distributions; e.g. there is a 10% probability that on one day 1 computer will be sold from store X.

Calculating the expected value and variance for X and Y, respectively, leads to the following results:

$E(X) = \mu_X = 3.0, \quad V(X) = \sigma_X^2 = 1.0, \quad E(Y) = \mu_Y = 1.3, \quad V(Y) = \sigma_Y^2 = 0.61.$

Let us define a random variable S = X + Y, the total sale of computers from the two stores per day. What are E(S) and V(S)?

$E(S) = E(X + Y) = E(X) + E(Y) = 3.0 + 1.3 = 4.3$

$V(S) = V(X) + V(Y) + 2 \cdot COV(X,Y) = 1.0 + 0.61 + 2 \cdot 0.50 = 2.61$

$COV(X,Y) = E(X \cdot Y) - \mu_X \cdot \mu_Y = 4.40 - 3.0 \cdot 1.3 = 0.50$

$E(X \cdot Y) = \sum_x \sum_y x \cdot y \cdot P(x,y) = 0 \cdot \sum_x x \cdot P(x,0) + 1 \cdot \sum_x x \cdot P(x,1) + 2 \cdot \sum_x x \cdot P(x,2)$
$= 0 + 1 \cdot (1 \cdot 0.00 + 2 \cdot 0.10 + 3 \cdot 0.10 + 4 \cdot 0.10) + 2 \cdot (1 \cdot 0.00 + 2 \cdot 0.05 + 3 \cdot 0.15 + 4 \cdot 0.30)$
$= 0 + 0.90 + 3.50 = 4.40$

$\rho = \frac{COV(X,Y)}{\sigma_X \cdot \sigma_Y} = \frac{0.50}{\sqrt{1.0} \cdot \sqrt{0.61}} = 0.6402 \quad\text{and}\quad \rho^2 = 0.4098.$
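A short sketch of these calculations in Python, covering the joint table above and the profit example further below; nothing is assumed beyond the numbers already given:

```python
import numpy as np

# Joint distribution P(x, y): rows x = 1..4, columns y = 0..2.
P = np.array([[0.10, 0.00, 0.00],
              [0.05, 0.10, 0.05],
              [0.05, 0.10, 0.15],
              [0.00, 0.10, 0.30]])
x = np.array([1, 2, 3, 4]); y = np.array([0, 1, 2])

px, py = P.sum(axis=1), P.sum(axis=0)          # marginal distributions
EX, EY = x @ px, y @ py                        # 3.0 and 1.3
VX = (x**2) @ px - EX**2                       # 1.0
VY = (y**2) @ py - EY**2                       # 0.61
EXY = x @ P @ y                                # E(X*Y) = 4.40
cov = EXY - EX * EY                            # 0.50
ES, VS = EX + EY, VX + VY + 2 * cov            # 4.3 and 2.61
rho = cov / np.sqrt(VX * VY)                   # 0.6402

# Profit F = 1000*X + 1300*Y (DKK), as in the example on the next page:
EF = 1000 * EX + 1300 * EY                     # 4690
VF = 1000**2 * VX + 1300**2 * VY + 2 * 1000 * 1300 * cov   # 3,330,900
print(ES, VS, rho, EF, VF**0.5)                # std. dev. of profit ~ 1825 DKK
```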


6

A more laborious method would be to determine the probability distribution for S = X + Y and carry out the calculations on the basis of this.

s (x + y)          | P(s) | s·P(s)      | s²·P(s)       | (s−μ_S)² | (s−μ_S)²·P(s)
1 (1+0)            | 0.10 | 0.10        | 0.10          | 10.89    | 1.0890
2 (2+0, 1+1)       | 0.05 | 0.10        | 0.20          | 5.29     | 0.2645
3 (3+0, 2+1, 1+2)  | 0.15 | 0.45        | 1.35          | 1.69     | 0.2535
4 (4+0, 3+1, 2+2)  | 0.15 | 0.60        | 2.40          | 0.09     | 0.0135
5 (4+1, 3+2)       | 0.25 | 1.25        | 6.25          | 0.49     | 0.1225
6 (4+2)            | 0.30 | 1.80        | 10.80         | 2.89     | 0.8670
sum                | 1.00 | 4.30 = E(S) | 21.10 = E(S²) | –        | 2.6100 = V(S)

E(S) = 4.30 and V(S) = E(S²) − E(S)² = 21.10 − 4.30² = 21.10 − 18.49 = 2.61, the same results as obtained when applying the laws of expected value and variance.

Suppose that the profit on a computer sold in store X is 1,000 DKK, and that the profit on a computer sold in store Y is 1,300 DKK. What is the expected profit per day for the two stores, and what are the variance and standard deviation of the profit?

$F = a \cdot X + b \cdot Y, \quad\text{where } a = 1000 \text{ and } b = 1300$

$E(F) = a \cdot E(X) + b \cdot E(Y) = 1000 \cdot 3.0 + 1300 \cdot 1.3 = 3000 + 1690 = 4690$

$V(F) = a^2 \cdot V(X) + b^2 \cdot V(Y) + 2ab \cdot COV(X,Y)$
$= 1000^2 \cdot 1.0 + 1300^2 \cdot 0.61 + 2 \cdot 1000 \cdot 1300 \cdot 0.50$
$= 1{,}000{,}000 + 1{,}030{,}900 + 1{,}300{,}000 = 3{,}330{,}900$

The standard deviation of the profit is $\sqrt{V(F)} = \sqrt{3{,}330{,}900} = 1825$ DKK.


7

2. Various Probability Distributions

[Overview figures: example plots of the distributions treated in the following sections]
- Hypergeometric distribution, X = number of clubs, n = 13, N = 52 and S = 13
- Binomial distribution, X = number of sixes, n = 6 throws, p = 1/6
- Geometric distribution, X = number of throws, p = 1/6
- Poisson distribution, μ = 360/198
- Exponential distribution, density function λ = 360/(198 · 90) = 2/99
- Standard normal distribution, N(0;1)
- T-distribution with 9 degrees of freedom
- χ²-distribution with 1, 2, 3 and 4 degrees of freedom
- F-distribution with various degrees of freedom


8

Relationships between different distributions, with some rules of thumb for approximation. The letter in each circle indicates the section below where further information about the distribution can be found.

[Diagram: hypergeometric (A), k-dimensional hypergeometric (B), binomial (C), multinomial (D), geometric (E), Poisson (F), exponential (G), normal (I), T (J), χ² (K) and F (L) distributions connected by approximation arrows]

Rules of thumb shown in the diagram:
- Hypergeometric (A) → binomial (C): N > 50 and n/N < 0.05
- Binomial (C) → Poisson (F): n ≥ 20 and p ≤ 0.05
- Binomial (C) → normal (I): n·p > 5 and n·(1−p) > 5
- Hypergeometric (A) → normal (I): n·(S/N)·(1 − S/N)·((N−n)/(N−1)) > 5
- Poisson (F) → normal (I): μ > 10
- T (J) → normal (I): ν > 30
- χ² (K) → normal (I): ν > 50


9

A) The hypergeometric distribution1
Hypergeometric distribution, X = number of clubs, n = 13, N = 52 and S = 13.

Assumptions
1) Finite population consisting of N elements.
2) Population consisting of 2 alternative groups (A / Ā).
3) A simple random sample consisting of n elements is drawn.

Point probability

$P(x) = P(X = x) = h(x; N, n, S) = \frac{\binom{S}{x}\binom{N-S}{n-x}}{\binom{N}{n}}, \quad\text{where}$

N = number of elements in the population
S = number of elements in the population with the distinctive mark A
N − S = number of elements in the population with the distinctive mark Ā
n = number of elements in the sample
x = number of elements in the sample with the distinctive mark A
n − x = number of elements in the sample with the distinctive mark Ā

$\binom{N}{n}$ is the number of ways in which n elements can be drawn among N elements:

$\binom{N}{n} = \frac{N!}{n! \cdot (N-n)!}, \quad\text{where } n! = n \cdot (n-1) \cdot \ldots \cdot 2 \cdot 1.$

Expected value, variance and skewness

$E(X) = n\frac{S}{N}$

$V(X) = n\frac{S}{N}\Big(1-\frac{S}{N}\Big)\Big(\frac{N-n}{N-1}\Big)$

$\gamma_1 = \frac{1-2\frac{S}{N}}{\sqrt{n\frac{S}{N}\big(1-\frac{S}{N}\big)\big(\frac{N-n}{N-1}\big)}} \cdot \frac{N-2n}{N-2}$

1 Keller, CD appendix F

[Figure: bar chart of the hypergeometric probabilities P(x) for the number of clubs in a 13-card hand, x = 0, …, 13]


10

Approximations

To the binomial distribution, if N > 50 and n/N < 0.05. At the approximation p = S/N.

To the normal distribution, if $n\frac{S}{N}\big(1-\frac{S}{N}\big)\big(\frac{N-n}{N-1}\big) > 5$. At the approximation μ = E(X) and σ² = V(X):

$X \sim h(x; N, n, S) \;\Rightarrow\; Z = \frac{X - n\frac{S}{N}}{\sqrt{n\frac{S}{N}\big(1-\frac{S}{N}\big)\big(\frac{N-n}{N-1}\big)}} \sim N(0,1)$

Correction for the approximation from a discrete to a continuous distribution is done as follows: if P(X ≤ x | N, n, S) is required, then use

$P\Bigg(Z < \frac{x + 0.5 - E(X)}{\sqrt{V(X)}}\Bigg);$

for an illustration, see Applet 13, Keller p. 312. If P(X ≥ x | N, n, S) is required, then use 1 − P(X ≤ x − 1 | N, n, S).
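A quick numerical sketch of the clubs example and the approximation rules with scipy; only the parameters already given above are used:

```python
from scipy.stats import hypergeom, binom

N, n, S = 52, 13, 13          # deck size, hand size, clubs in the deck
X = hypergeom(N, S, n)        # scipy argument order: total, successes, draws
print(X.pmf(3))               # P(3 clubs) ~ 0.286, cf. the figure above

# Rule of thumb for the normal approximation:
p = S / N
print(n * p * (1 - p) * (N - n) / (N - 1) > 5)   # False here -> stay exact

# The binomial approximation requires N > 50 and n/N < 0.05;
# here n/N = 0.25, so b(x; 13, 0.25) is a poor substitute:
print(binom(n, p).pmf(3))     # ~ 0.252 versus the exact ~ 0.286
```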

B) The k-dimensional hypergeometric distribution

Assumptions
1) Finite population consisting of N elements.
2) Population consisting of k alternative groups (A1, A2, …, Ak).
3) A simple random sample consisting of n elements is drawn.

Expected value, variance and covariance

$E(X_i) = n\frac{S_i}{N} \quad\text{for } i = \{1, 2, \ldots, k\}$

$V(X_i) = n\frac{S_i}{N}\Big(1-\frac{S_i}{N}\Big)\Big(\frac{N-n}{N-1}\Big) \quad\text{for } i = \{1, 2, \ldots, k\}$

$Cov(X_i, X_j) = -\,n\frac{S_i}{N}\frac{S_j}{N}\Big(\frac{N-n}{N-1}\Big) \quad\text{for } i \ne j$


11

C) The binomial distribution
Binomial distribution, X = number of sixes, n = 6 throws, p = 1/6.

Assumptions
1) Each trial results in one of two mutually exclusive outcomes (A / Ā).
2) P(A) = p is constant from trial to trial.
3) The individual trials are independent.
4) n trials are carried out.

Point probability2

$P(X = x) = P(x) = b(x; n, p) = \binom{n}{x} p^x (1-p)^{n-x},$

where X = the number of trials with the outcome A.

Expected value, variance and skewness

$E(X) = n \cdot p \qquad V(X) = n \cdot p \cdot (1-p) \qquad \gamma_1 = \frac{1-2p}{\sqrt{n p (1-p)}}$

Approximations3
1) To the normal distribution, if n·p > 5 and n·(1−p) > 5:

$X \sim b(x; n, p) \;\Rightarrow\; Z = \frac{X - n p}{\sqrt{n p (1-p)}} \quad\text{or alternatively}\quad Z = \frac{\hat{P} - p}{\sqrt{p(1-p)/n}}, \quad\text{where } \hat{P} = \frac{X}{n}.$

Regarding correction for the approximation from a discrete to a continuous distribution, see p. 10 and Keller p. 311.

2) To the Poisson distribution, if n > 20 and p < 0.05. At the approximation μ = n·p.

2 Cf. Keller p. 236
3 Cf. Keller p. 310

[Figure: bar chart of the binomial probabilities P(x) for the number of sixes in 6 throws, x = 0, …, 6]
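A sketch of the two rules of thumb with scipy; the second pair of parameters (n = 100, p = 0.02) is invented purely to illustrate the Poisson approximation:

```python
from scipy.stats import binom, poisson

n, p = 6, 1 / 6                     # the sixes example above
print(binom(n, p).pmf(2))           # P(2 sixes) ~ 0.20

# Normal approximation requires n*p > 5 and n*(1-p) > 5:
print(n * p > 5, n * (1 - p) > 5)   # (False, True) -> not justified here

# Poisson approximation requires n > 20 and p < 0.05, e.g. n=100, p=0.02:
n2, p2 = 100, 0.02
print(binom(n2, p2).pmf(3))         # ~ 0.1823
print(poisson(n2 * p2).pmf(3))      # ~ 0.1804, a close match
```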


12

D) The multinomial distribution
The multinomial distribution is a generalization of the binomial distribution where, instead of 2 alternative outcomes, each trial has k possible outcomes (mutually exclusive and collectively exhaustive).

Assumptions
1) Each trial results in one of k mutually exclusive outcomes, Ai, which are collectively exhaustive, i = {1, 2, …, k}.
2) P(Ai) = pi is constant from trial to trial, i = {1, 2, …, k}.
3) The individual trials are independent.
4) n trials are carried out.

Assumptions 1) and 2) mean that $\sum_{i=1}^{k} p_i = 1.$

Joint probability distribution

$P(x_1, x_2, \ldots, x_k) = P(X_1 = x_1 \cap X_2 = x_2 \cap \ldots \cap X_k = x_k) = p(x_1, x_2, \ldots, x_k; n, p_1, p_2, \ldots, p_k) = \frac{n!}{x_1! \, x_2! \cdots x_k!} \, p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k},$

where Xi = number of trials with the outcome Ai. From this it follows that Xi is binomially distributed, b(xi; n, pi).

Expected value, variance and covariance

$E(X_i) = n \cdot p_i \quad\text{for } i = \{1, 2, \ldots, k\}$

$V(X_i) = n \cdot p_i \cdot (1-p_i) \quad\text{for } i = \{1, 2, \ldots, k\}$

$Cov(X_i, X_j) = -\,n \cdot p_i \cdot p_j \quad\text{for } i \ne j$
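A minimal sketch with scipy, using invented probabilities for a k = 3 example:

```python
from scipy.stats import multinomial

n, p = 10, [0.5, 0.3, 0.2]            # assumed example values
M = multinomial(n, p)
print(M.pmf([5, 3, 2]))               # joint probability of this outcome, ~ 0.085
print(M.cov())                        # diagonal: n*p_i*(1-p_i); off-diagonal: -n*p_i*p_j
```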


13

E) The geometric distribution
Geometric distribution, X = number of throws until you get a six, p = 1/6.

In the geometric distribution the random variable X is the number of trials until A is observed for the first time in a Bernoulli process (= binomial experiment).

Assumptions: like the binomial distribution.

Point probability

$P(x) = P(X = x) = p\,(1-p)^{x-1},$

where X = number of trials until the first trial with the outcome A.

Expected value, variance and skewness

$E(X) = \frac{1}{p} \qquad V(X) = \frac{1-p}{p^2} \qquad \gamma_1 = \frac{2-p}{\sqrt{1-p}}$

E-suppl.) The negative binomial distribution
In the negative binomial distribution the random variable X is the number of trials until A is observed for the s'th time in a Bernoulli process (= binomial experiment).

Assumptions: like the binomial distribution.

Point probability

$P(x) = P(X = x) = p \cdot b(s-1; x-1, p) = \binom{x-1}{s-1} p^s (1-p)^{x-s},$

where X = number of trials until the s'th trial with the outcome A.

Expected value, variance and skewness

$E(X) = s \cdot \frac{1}{p} \qquad V(X) = s \cdot \frac{1-p}{p^2} \qquad \gamma_1 = \frac{2-p}{\sqrt{s(1-p)}}$

[Figure: bar chart of the geometric probabilities P(x) for the number of tosses until the first six, x = 1, …, 20]


14

F) The Poisson distribution
The Poisson distribution, μ = 360/198. The random variable X indicates the number of occurrences in a particular interval of time (or space). Example: 360 goals scored in 198 matches, where X is the number of goals scored per match by the home team.

Assumptions
1) The number of occurrences within an interval of time (e.g. one minute) is independent of the number of occurrences within other, non-overlapping intervals of time.
2) The expected number of occurrences within an interval of time (e.g. one minute) is constant during the whole lapse of time (e.g. one hour or one day). The process is said to be stationary.
3) The probability that exactly one occurrence takes place within a very small interval of time is proportional to the length of the interval.
4) The probability that more than one occurrence takes place within a very small interval of time is negligible in relation to the probability that exactly one occurrence takes place.

Point probability4

$P(x) = P(X = x) = p(x; \mu) = \frac{\mu^x}{x!} e^{-\mu},$

where μ = the intensity (average number of occurrences within a certain interval of time). Generally μ = λ·t, where λ is the intensity per unit of time and t is the time.

Expected value, variance and skewness

$E(X) = \mu \qquad V(X) = \mu \qquad \gamma_1 = \frac{1}{\sqrt{\mu}}$

Use
Primarily in connection with queuing-theoretical problems. The Poisson process gives a good description of a series of situations where arrivals occur at random over time.

Approximations
To the normal distribution, if μ > 10:

$Z = \frac{X - \mu}{\sqrt{\mu}}.$

For correction for the approximation from a discrete to a continuous distribution, see p. 10 and Keller p. 311.

4 Cf. Keller p. 243

[Figure: bar chart of the Poisson probabilities P(x) for the number of goals per match, x = 0, …, 10]
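A sketch of the goals example with scipy, using only the μ = 360/198 ≈ 1.82 already given:

```python
from scipy.stats import poisson

mu = 360 / 198                 # average goals per match by the home team
X = poisson(mu)
print(X.pmf(0), X.pmf(2))      # ~ 0.162 and ~ 0.268
print(X.mean(), X.var())       # both equal mu for a Poisson variable
```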


15

G) The exponential distribution
Distribution function and density function, λ = 360/(198 · 90) = 2/99.

Continuous distribution ⇒ point probabilities = 0. In this example T states the time until the first goal is scored by the home team.

Assumptions: like the Poisson distribution.

Density function5

$f(t) = \lambda \cdot e^{-\lambda t},$

where T states the time between 2 occurrences, or the time one activity takes (operation time), and the parameter λ states the intensity (average number of occurrences) per unit of time.

Distribution function (the cumulative probability function)

$F(t) = P(T \le t) = 1 - e^{-\lambda t}$

states the probability that the next activity in a Poisson process occurs at time t at the latest.

The distribution function can be derived from the Poisson distribution, since the expected number of occurrences in t units of time is μ = λ · t. That the next activity occurs before time t in the exponential distribution corresponds to at least one activity occurring during an interval of t units of time in the Poisson distribution:

$P(T \le t) = P(X \ge 1 \mid \mu = \lambda \cdot t) = 1 - P(X = 0 \mid \mu = \lambda \cdot t) = 1 - \frac{(\lambda t)^0}{0!} e^{-\lambda t} = 1 - e^{-\lambda t}$

Expected value, variance and skewness

$E(T) = \frac{1}{\lambda} \qquad V(T) = \frac{1}{\lambda^2} \qquad \gamma_1 = 2$

5 Cf. Keller p. 277
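A sketch tying the exponential distribution to the Poisson example above (λ = 2/99 goals per minute); the choice of t = 45 minutes is an illustrative assumption:

```python
import math

lam = 360 / (198 * 90)          # intensity per minute of play
t = 45.0                        # first half (assumed for illustration)
print(1 - math.exp(-lam * t))   # P(first home goal before minute 45) ~ 0.60
print(1 / lam)                  # expected waiting time: 49.5 minutes
```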


16

H) The uniform distribution
Discrete or continuous. Range of definition and variation: a ≤ X ≤ b.

Discrete version (e.g. the number of pips after a throw of a die)

Point probability

$P(x) = P(X = x) = \frac{1}{b-a+1}, \quad a \le X \le b$

Expected value, variance and skewness

$E(X) = \frac{a+b}{2} \qquad V(X) = \frac{(b-a)^2}{12} + \frac{b-a}{6} = \frac{(b-a)(b-a+2)}{12} \qquad \gamma_1 = 0$

Continuous version (e.g. the waiting time for a bus arriving at intervals of 10 minutes)

Density function6

$f(x) = \frac{1}{b-a}, \quad a \le X \le b$

Expected value, variance and skewness

$E(X) = \frac{a+b}{2} \qquad V(X) = \frac{(b-a)^2}{12} \qquad \gamma_1 = 0$

6 Cf. Keller p. 255


17

I) The normal distribution7
The standard normal distribution, N(0;1).

$X \sim N(\mu, \sigma^2)$, i.e. X follows a normal distribution with the expected value μ and the standard deviation σ.

$Z = \frac{X-\mu}{\sigma}$ follows a standard normal distribution with expected value 0 and standard deviation 1: $Z \sim N(0,1)$.

The normal distribution is continuous. The point probability is always 0. Thus the density function does not indicate the point probability but the density.

Density function8

$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}, \quad -\infty < x < \infty$

Expected value, variance and skewness
Normal distribution: E(X) = μ, V(X) = σ², γ1 = 0
Standard normal distribution: E(Z) = 0, V(Z) = 1, γ1 = 0

Convolution property
If X1 and X2 are independent and normally distributed, the following applies:

$Y = a + b_1 X_1 + b_2 X_2 \sim N(a + b_1\mu_1 + b_2\mu_2, \; b_1^2\sigma_1^2 + b_2^2\sigma_2^2),$

where a, b1 and b2 are arbitrary constants.

7 Keller p. 259 and CD Applet 5
8 Cf. Keller p. 259


18

Use
1) Tests and confidence intervals for μ.
2) Tests and confidence intervals for comparison of expected values in 2 populations (μ1 − μ2).
3) Tests and confidence intervals for p = S/N in the hypergeometric distribution.
4) Tests and confidence intervals for p in the binomial distribution.
5) Tests and confidence intervals for comparison of proportions in 2 populations (S1/N1 − S2/N2 = p1 − p2).
6) Tests and confidence intervals for comparison of processes in 2 populations (p1 − p2).
7) Tests and confidence intervals for μ and (μ1 − μ2) in the Poisson distribution.

Ad 1
The normal distribution can only be applied if σ is known and X̄ is normally distributed, either because the variable (X) is normally distributed, or because of the central limit theorem, where the sample size, n, is assumed to be sufficiently large according to the form (skewness) of X, cf. Keller p. 300. On pp. 24, 36 and 43 the procedure for deciding the test procedure is shown.

Ad 2
The normal distribution can only be applied if X̄1 − X̄2 is normally distributed and σ1² and σ2² are known, or if the samples are sufficiently large (cf. the central limit theorem). On pp. 25, 37 and 44 you can see the choice of test variable/test statistic when 2 population means are to be compared.

Ad 3, 4, 5 and 6
The normal distribution can only be applied if X1 − X2, and P̂1 = X1/n1 and P̂2 = X2/n2 in 5) and 6), are approximately normally distributed, cf. the approximation from the binomial distribution or the hypergeometric distribution to the normal distribution. Correction for the finite population should be included if ni/Ni > 0.05.

Ad 7
See pages 32-33.


19

J) The T-distribution9
T-distribution with 9 degrees of freedom; illustration of various T-distributions and the standard normal distribution.

Continuous distribution ⇒ point probability = 0.

Defined by:

$T = \frac{Z}{\sqrt{\chi_\nu^2/\nu}}, \quad\text{where } Z \sim N(0,1), \; \chi_\nu^2 \text{ is } \chi^2\text{-distributed with } \nu \text{ degrees of freedom, and } \chi_\nu^2 \text{ is independent of } Z.$

Density function

$f(t) = \frac{[(\nu-1)/2]!}{\sqrt{\nu\pi} \cdot [(\nu-2)/2]!} \Big(1+\frac{t^2}{\nu}\Big)^{-(\nu+1)/2}$

$\text{for } \nu \to \infty \;\Rightarrow\; f(t) = \frac{1}{\sqrt{2\pi}} e^{-t^2/2}$

The gamma function Γ(n), which is equal to (n−1)! for positive integers, is defined for all non-negative real numbers. This means that the expression for the density function can be written as

$f(t) = \frac{\Gamma\big((\nu+1)/2\big)}{\sqrt{\nu\pi} \cdot \Gamma(\nu/2)} \Big(1+\frac{t^2}{\nu}\Big)^{-(\nu+1)/2}$

Expected value, variance and skewness

$E(T_\nu) = 0$, only defined for ν > 1

$V(T_\nu) = \frac{\nu}{\nu-2}$, only defined for ν > 2

$\gamma_1 = 0$, only defined for ν > 3

9 Keller p. 281 and CD Applet 6


20

Use
1) Tests and confidence intervals for μ in a normally distributed population where σ is unknown. According to W. C. Guenther, empirical trials have shown that the T-distribution can, with a good approximation, be used if n is assumed to be sufficiently large (cf. Keller p. 389).
2) Comparison of two expected values from normally distributed populations where σ1² and σ2² are unknown.
3) Tests and confidence intervals in regression and correlation analyses.

Approximations
When ν > 30, the standard normal distribution can be applied as an approximation to the T-distribution. It will, however, always give a more precise result to use the T-distribution when σ is unknown, irrespective of the size of n.


21

K) The χ²-distribution10
χ²-distributions with respectively 1, 2, 3 and 4 degrees of freedom; χ²-distributions with respectively 10, 20 and 50 degrees of freedom compared with corresponding normal distributions.

Continuous distribution ⇒ point probability = 0.

Defined by:

$\chi_\nu^2 = Z_1^2 + Z_2^2 + \ldots + Z_\nu^2, \quad\text{where } Z_i \sim N(0,1) \text{ for } i = \{1, \ldots, \nu\} \text{ and } Z_i \text{ is independent of } Z_j \text{ for all } i \ne j.$

Density function

$f(\chi^2) = \frac{1}{2^{\nu/2}\,\Gamma(\nu/2)} \, (\chi^2)^{(\nu-2)/2} \, e^{-\chi^2/2}$

Expected value, variance and skewness

$E(\chi_\nu^2) = \nu \qquad V(\chi_\nu^2) = 2\nu \qquad \gamma_1 = \sqrt{\frac{8}{\nu}}$

Use
1) Confidence intervals for, and tests of, σ_X², when X is normally distributed or the sample is sufficiently large.11
2) Goodness-of-fit (test showing whether or not some given data follow a given distribution).
3) Tests for independence and homogeneity.

10 Keller p. 286 and CD Applet 7
11 Keller pp. 402


22

Approximations
If ν > 50, the normal distribution can be applied as an approximation. This results in $\chi_\nu^2 \approx N(\nu, 2\nu)$. A probability in the χ²ν-distribution can then be calculated as follows:

$P(\chi_\nu^2 > a) \approx P\Bigg(Z > \frac{a - E(\chi_\nu^2)}{\sqrt{V(\chi_\nu^2)}}\Bigg) = P\Bigg(Z > \frac{a-\nu}{\sqrt{2\nu}}\Bigg).$

By construction of the confidence interval for σ² (see page 38) we get:

$\frac{(n-1) \cdot s^2}{(n-1) + z_{\alpha/2}\sqrt{2(n-1)}} \le \sigma^2 \le \frac{(n-1) \cdot s^2}{(n-1) - z_{\alpha/2}\sqrt{2(n-1)}}$

$\frac{s^2}{1 + z_{\alpha/2}\sqrt{\frac{2}{n-1}}} \le \sigma^2 \le \frac{s^2}{1 - z_{\alpha/2}\sqrt{\frac{2}{n-1}}}$

By test of σ² (see p. 45) we get:

$H_0: \sigma^2 = \sigma_0^2 \qquad H_1: \sigma^2 > \sigma_0^2$

$\chi_{Obs}^2 = \frac{(n-1) \cdot s^2}{\sigma_0^2}$

$p\text{-value} = P\big(\chi_{n-1}^2 > \chi_{Obs}^2\big) \approx P\Bigg(Z > \frac{\chi_{Obs}^2 - (n-1)}{\sqrt{2(n-1)}}\Bigg).$


23

L) The F-distribution12
F-distribution with various degrees of freedom.

Continuous distribution ⇒ point probability = 0.

Defined by

$F_{(\nu_1,\nu_2)} = \frac{\chi_{\nu_1}^2/\nu_1}{\chi_{\nu_2}^2/\nu_2},$

provided that there is independence between numerator and denominator.

Density function

$f(f; \nu_1, \nu_2) = \frac{\Gamma\big((\nu_1+\nu_2)/2\big)}{\Gamma(\nu_1/2)\,\Gamma(\nu_2/2)} \Big(\frac{\nu_1}{\nu_2}\Big)^{\nu_1/2} f^{\,\nu_1/2-1} \Big(1+\frac{\nu_1}{\nu_2}f\Big)^{-(\nu_1+\nu_2)/2}$

Expected value, variance and skewness

$E\big(F_{(\nu_1,\nu_2)}\big) = \frac{\nu_2}{\nu_2-2} \quad\text{for } \nu_2 > 2.$

Please note: the expected value is independent of ν1; if ν2 is large, E(F) ≈ 1.

$V\big(F_{(\nu_1,\nu_2)}\big) = \frac{2\nu_2^2(\nu_1+\nu_2-2)}{\nu_1(\nu_2-2)^2(\nu_2-4)} \quad\text{for } \nu_2 > 4;$

if ν1 ≈ ν2 = ν and both are large: V ≈ 4/ν; if ν1 is small and ν2 large: V ≈ 2/ν1.

$\gamma_1 = \frac{(2\nu_1+\nu_2-2)\sqrt{8(\nu_2-4)}}{(\nu_2-6)\sqrt{\nu_1(\nu_1+\nu_2-2)}} \quad\text{for } \nu_2 > 6;$

if ν1 is small and ν2 large: γ1 ≈ √(8/ν1).

Use
1) Comparison of two variances from normally distributed populations.
2) Analysis of variance.

Please note that

$\frac{Z^2}{\chi_\nu^2/\nu} \sim F_{(1,\nu)} \quad\text{and}\quad T_\nu = \frac{Z}{\sqrt{\chi_\nu^2/\nu}}, \quad\text{i.e. } T_\nu^2 \sim F_{(1,\nu)}.$

12 Keller p. 289 and CD Applet 8


24

3. Choice of Test Statistic by Test of Expected Values and Proportions

A) Test for expected values

1. Test for one expected value (one μ-value)

$H_0: \mu = \mu_0 \qquad H_1: \mu \ne \mu_0$

1 According to the central limit theorem – is the sample size sufficiently large for the distribution of X? If f(x) is symmetric: n > 5-10. If f(x) is moderately skewed: n > 20-30. If f(x) on the contrary is extremely skewed, the demand on n can be extremely great. Rule of thumb: n > 25 · γ1² (which requires knowledge of, or an assumption about, the size of γ1 - see p. 35).

2 A stricter demand on the sample size, often referred to as the extended central limit theorem. Rule of thumb: n > 100 · γ1².

If the sample makes up more than 5% of the population, make a correction for the finite population:

$\frac{\bar{X}-\mu_0}{\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}} \sim Z \quad\text{and}\quad \frac{\bar{X}-\mu_0}{\frac{S}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}} \sim T_{n-1}.$

If X can be assumed to follow a Poisson distribution and μ > 10, the following normal test can be used:

$\frac{X-\mu_0}{\sqrt{\mu_0}} \sim Z,$

since V(X) = μ for a Poisson distributed random variable.

Decision diagram:

- Is σ² known?
  - Yes → Is X̄ approximately normally distributed?¹
    - Yes → Normal test: $\frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} \sim Z$
    - No → the normal test and T-test for μ cannot be used.
  - No → Is $\frac{\bar{X}-\mu_0}{S/\sqrt{n}}$ approximately T-distributed?²
    - Yes → T-test: $\frac{\bar{X}-\mu_0}{S/\sqrt{n}} \sim T_{n-1}$
    - No → the normal test and T-test for μ cannot be used.


25

2. Comparison of 2 μ-values

$H_0: \mu_1 - \mu_2 = (\mu_1-\mu_2)_0 \qquad H_1: \mu_1 - \mu_2 \ne (\mu_1-\mu_2)_0$

Decision diagram:

- Paired comparison?
  - Yes → Is $\frac{\bar{D}-\mu_D}{S_D/\sqrt{n}}$ approximately T-distributed?¹
    - Yes → T-test: $\frac{\bar{D}-\mu_{D_0}}{S_D/\sqrt{n}} \sim T_{n-1}$
  - No → Are σ1² and σ2² known?
    - Yes → Is X̄1 − X̄2 approximately normally distributed?²
      - Yes → Normal test: $Z = \frac{(\bar{X}_1-\bar{X}_2) - (\mu_1-\mu_2)_0}{\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}}$
    - No → Is σ1² = σ2²?³
      - Yes → Is the pooled statistic approximately T-distributed?⁴
        - Yes → T-test: $T_{n_1+n_2-2} = \frac{(\bar{X}_1-\bar{X}_2) - (\mu_1-\mu_2)_0}{\sqrt{S_p^2\big(\frac{1}{n_1}+\frac{1}{n_2}\big)}}$
      - No → Is the unpooled statistic approximately T-distributed?⁵
        - Yes → T-test: $T_\nu = \frac{(\bar{X}_1-\bar{X}_2) - (\mu_1-\mu_2)_0}{\sqrt{\frac{S_1^2}{n_1}+\frac{S_2^2}{n_2}}}$

If the relevant statistic is not approximately normally or T-distributed, the normal test or T-test for μ1 − μ2 cannot be used.⁶


26

1 Identical to the test of μ with unknown σ. See Keller chapters 13.1-3 to distinguish between paired and independent samples.

2 If X1 and X2 are not extremely skewed, X̄1 − X̄2 is approximately normally distributed even for relatively small samples. Usually n1 ≥ 30 and n2 ≥ 30 will be sufficient, unless the distributions are very unequal; see the reference to the central limit theorem on page 24.

3 Tested by an F-test, see page 45: $f_{Obs} = \frac{s_1^2}{s_2^2}$ under H0: σ1² = σ2², if X1 and X2 are independent and normally distributed or n1 and n2 are sufficiently large. Sp² is stated on p. 37.

4 If X1 and X2 are independent and not skewed, the test statistic is approximately T-distributed even for relatively small samples. Usually n1 ≥ 30 and n2 ≥ 30 will be sufficient, unless the distributions are very unequal; see the reference to the central limit theorem on page 24. The test statistic will be approximately T-distributed with n1 + n2 − 2 degrees of freedom.

5 If X1 and X2 are independent and not skewed, the test statistic is approximately T-distributed even for relatively small samples. Usually n1 ≥ 30 and n2 ≥ 30 will be sufficient, unless the distributions are very unequal; see the reference to the central limit theorem on page 24. The test statistic will be approximately T-distributed with ν degrees of freedom, where

$\nu = \frac{\Big(\frac{s_1^2}{n_1}+\frac{s_2^2}{n_2}\Big)^2}{\frac{(s_1^2/n_1)^2}{n_1-1}+\frac{(s_2^2/n_2)^2}{n_2-1}}$

For $n_1 = n_2 = n$ we get: $\nu = (n-1) \cdot \frac{(s_1^2+s_2^2)^2}{s_1^4+s_2^4}.$

On page 20 it is stated that when ν > 30, the standard normal distribution can be used as a reasonable approximation to the T-distribution - which means that in this case we can assume that

$\frac{(\bar{X}_1-\bar{X}_2) - (\mu_1-\mu_2)_0}{\sqrt{\frac{S_1^2}{n_1}+\frac{S_2^2}{n_2}}} \approx Z$

and hence omit the calculation of ν.

6 See Keller ch. 19.1.
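A small sketch of the ν calculation from note 5; the sample values are invented for illustration:

```python
def welch_df(s1_sq, n1, s2_sq, n2):
    """Degrees of freedom for the unequal-variance T-test (note 5)."""
    a, b = s1_sq / n1, s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

print(welch_df(4.0, 30, 9.0, 40))   # ~ 67.2 (illustrative numbers)
print(welch_df(4.0, 30, 9.0, 30))   # ~ 50.5; equals (n-1)*(s1^2+s2^2)^2/(s1^4+s2^4)
```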


27

B) Test for proportions

1. Test for one proportion

$H_0: p = p_0 \qquad H_1: p \ne p_0$

1 Estimated on the basis of the approximation from either the hypergeometric distribution or from the binomial distribution to the normal distribution.

2 Examples:
a) Hypergeometric distribution: H0: p = 4/52; N = 52, n = 5, S = 4 and x = 3:

$p\text{-value} = 2 \cdot P(X \ge 3 \mid N = 52, n = 5, S = 4) = 2\Bigg(\frac{\binom{4}{3}\binom{48}{2}}{\binom{52}{5}} + \frac{\binom{4}{4}\binom{48}{1}}{\binom{52}{5}}\Bigg) = 2 \cdot (0.001736 + 0.000018) = 0.0035$

b) Binomial distribution: H0: p ≥ 0.05, n = 50, x = 1:

p-value = P(X ≤ 1 | n = 50, p = 0.05) = 0.279

If the sample makes up more than 5% of a population, a correction for the finite population is made:

$Z = \frac{X - n p_0}{\sqrt{n p_0 (1-p_0)\frac{N-n}{N-1}}} \quad\text{or}\quad Z = \frac{\hat{P} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n} \cdot \frac{N-n}{N-1}}}.$

Decision diagram:

- Is X approximately normally distributed?¹
  - Yes → $Z = \frac{X - n p_0}{\sqrt{n p_0 (1-p_0)}}$ or $Z = \frac{\hat{P} - p_0}{\sqrt{p_0(1-p_0)/n}}$
  - No → solution by exact calculations²
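A sketch reproducing the exact p-values of the two examples with scipy; only the numbers already given are used:

```python
from scipy.stats import binom, hypergeom

# b) H0: p >= 0.05 with n = 50, x = 1 (lower one-sided test):
print(binom(50, 0.05).cdf(1))            # P(X <= 1) = 0.279

# a) two-sided hypergeometric p-value, N=52, n=5, S=4, x=3:
print(2 * hypergeom(52, 4, 5).sf(2))     # 2*P(X >= 3) ~ 0.0035
```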


28

2. Comparison of 2 proportions

$H_0: p_1 - p_2 = (p_1-p_2)_0 \qquad H_1: p_1 - p_2 \ne (p_1-p_2)_0$

1 Estimated on the basis of the approximation from either the hypergeometric distribution or from the binomial distribution to the normal distribution.

2 The pooled estimate:

$\hat{P} = \frac{n_1\hat{P}_1 + n_2\hat{P}_2}{n_1+n_2} = \frac{X_1+X_2}{n_1+n_2}$

If one or both samples make up more than 5% of a population, a correction for the finite population is made:

$Z = \frac{\hat{P}_1 - \hat{P}_2}{\sqrt{\hat{P}(1-\hat{P})\Big(\frac{1}{n_1}\frac{N_1-n_1}{N_1-1} + \frac{1}{n_2}\frac{N_2-n_2}{N_2-1}\Big)}} \quad\text{or}\quad Z = \frac{(\hat{P}_1-\hat{P}_2) - (p_1-p_2)_0}{\sqrt{\frac{\hat{P}_1(1-\hat{P}_1)}{n_1}\frac{N_1-n_1}{N_1-1} + \frac{\hat{P}_2(1-\hat{P}_2)}{n_2}\frac{N_2-n_2}{N_2-1}}}.$

Decision diagram:

- Are X1 − X2 approximately normally distributed?¹
  - Yes → Is (p1 − p2)0 = 0?
    - Yes → $Z = \frac{\hat{P}_1 - \hat{P}_2}{\sqrt{\hat{P}(1-\hat{P})\big(\frac{1}{n_1}+\frac{1}{n_2}\big)}}$ ²
    - No → $Z = \frac{(\hat{P}_1-\hat{P}_2) - (p_1-p_2)_0}{\sqrt{\frac{\hat{P}_1(1-\hat{P}_1)}{n_1} + \frac{\hat{P}_2(1-\hat{P}_2)}{n_2}}}$
  - No → within the curriculum there are no immediate solutions.


29

4. Sampling Methods13

There are various methods of drawing random samples. One of the methods is to draw a simple random sample. You choose n among N elements. All N elements have an equal probability of being drawn (1/N on each draw), and all

$\binom{N}{n} = \frac{N!}{n!(N-n)!}$

sample combinations have the same probability.

Metaphorically speaking, imagine that each of the N elements in a population has a numbered slip put in a hat, from which you draw n at random. Sometimes you can or should apply other sampling methods. In the following we describe when and how.

If the incomes in a given population are assumed to be very different (heterogeneous) across a specific variable, e.g. kind of housing, it is advantageous to apply stratified sampling. The variable, type of housing, is a stratification criterion, and the sub-populations, in this example e.g. house owners and tenants, are the strata. In this context heterogeneous means that the population consists of sub-populations, each of which is internally similar (homogeneous), but across which there are differences. If you divide the population into the strata owners and tenants, you will in each of these strata see a higher degree of homogeneity with respect to income than if you consider the population as a whole. Since each stratum has a high degree of homogeneity, a smaller sample from each stratum is required in order to reach a reliable estimate of the mean income of the sub-population. You may furthermore allocate the sample among the strata so that you are certain that the sample is representative (proportional allocation), or so that the marginal information of the last sample unit drawn is the same for each stratum (optimal allocation). Concerning the calculation of the average and the variance, weights should be included, i.e. the population proportions made up by the sub-populations. The weight for stratum no. 1 is N1/N. Metaphorically speaking, it corresponds to all N1 elements in stratum no. 1 having a numbered slip in hat no. 1 - and that you then draw n1 at random, and so on for the other strata.

If we have the opposite situation, where each of the sub-populations is heterogeneous, and where homogeneity exists across the sub-populations, it is advantageous to use cluster sampling. If we assume that in a lot of 1,000 bags of potatoes there is no great variation from one bag to another, but there are large as well as small potatoes in every bag, we have homogeneity across the bags and heterogeneity within the bags.

Cluster sampling can also be an advantage in some other cases. It is often the case that you do not know the population (there is no register) from which you can draw a sample. In such a situation you can choose some sub-populations and then draw a simple random sample, or maybe make a total count. Large distances between the sub-populations, with a subsequent long transportation time for an interviewer, as well as small marginal costs of choosing larger samples from a sub-population already selected, are indications in favour of cluster sampling.

13 Keller ch. 5.3

Page 33: Business Statistics

30

Metaphorically speaking, cluster sampling means that first you choose from among the hats, and then from among the slips in the selected hats.

The fourth and last method to be discussed here is systematic sampling. Systematic sampling means that if you choose n among N elements, which are numbered from 1 to N, then you choose every (N/n)'th element. Since N/n is seldom an integer, you round off to an integer. All you then have to do is to choose a random element, j, among the first N/n elements. The chosen elements will be j, j+N/n, j+2N/n, j+3N/n, …, j+(n−1)·N/n. If there is no relationship between the order of the elements and the variable to be examined, the arithmetic for simple random sampling can be used. It should be noted, however, that not all sample combinations have the same probability, since only N/n of the $\binom{N}{n}$ possible combinations can occur.

If there is some kind of relationship between the order of the elements and the variable examined, you can, if possible, sort the elements so that the order becomes random with respect to the variable to be examined. If such a sorting is not possible (e.g. moving the passengers about in a bus, or changing the completion times of some mass-produced items), the methods of stratified sampling or of cluster sampling can be applied.

As an example, the Aarhus School of Business may wish to examine the degree of satisfaction with the tutorials in statistics. If the tutorials are the same for all groups, it would be advantageous to use cluster sampling. The tutorial groups are alike (homogeneous) in relation to satisfaction, whereas the students within each group differ (are heterogeneous). This means that first you choose some groups (clusters), and then some, or maybe all, students are chosen from each of the groups chosen.

If the tutorials differ from one group to another (different teachers, times, rooms etc.), it would be advantageous to use stratified sampling. The tutorial groups differ (are heterogeneous) as far as satisfaction is concerned, whereas the students of a single group are more like their fellow students in that group than their other fellow students. This means that a sample is drawn from each tutorial group (stratum). The size of the sample depends on the size of the group: the larger the group, the larger the sample. If the groups differ greatly with regard to variance, the sample sizes should reflect this as well: the greater the variance, the larger the sample.

It should be noted that none of the four sampling methods discussed above is universally superior. For a given problem you may often find that one or more of the methods will be superior to the others with regard to efficiency. By efficiency is meant the least margin of error/uncertainty for a given sampling expenditure.


31

5. Poisson Distribution, Confidence Intervals and Hypothesis Testing

If the random variable X follows a Poisson distribution, where E(X) = μ and V(X) = μ, the best estimate of the parameter μ is the number of occurrences in an interval of time divided by the number of time units (named m in the following, and not necessarily an integer). If you, on the basis of the number of occurrences that have taken place within one time unit, want to construct a confidence interval for μ, the lower (μL) and the upper (μH) limits can be obtained as follows:

$P(X \ge x \mid \mu_L) = \alpha/2 \quad\text{and}\quad P(X \le x \mid \mu_H) = \alpha/2.$

As a solution it can be shown that:14

$\mu_L = \frac{\chi^2_{2x;\,1-\alpha/2}}{2} \quad\text{and}\quad \mu_H = \frac{\chi^2_{2(x+1);\,\alpha/2}}{2}$

Assume that the number of orders a department within a company receives per day is Poisson distributed, and that you one day have received 15 orders. The best estimate of μ would be 15. A 95% confidence interval for μ would be constructed as follows:

$\mu_L = \frac{\chi^2_{30;\,0.975}}{2} = \frac{16.79}{2} = 8.395 \quad\text{and}\quad \mu_H = \frac{\chi^2_{32;\,0.025}}{2} = \frac{49.48}{2} = 24.74.$

Assume instead that the number of daily orders follows a Poisson distribution, and that within a year of 210 (= m) working days you have received in total 3150 (= x) orders; then the best estimate of μ would be:

$\hat{\mu} = \frac{x}{m} = \frac{3150}{210} = 15.$

A 95% confidence interval for μ would then, approximating the χ²-distribution by the normal distribution (see p. 22), be constructed as follows:

$\frac{\chi^2_{6300;\,0.975}}{2} \approx \frac{6300 - 1.96\sqrt{12600}}{2} = 3040 \quad\text{and}\quad \frac{\chi^2_{6302;\,0.025}}{2} \approx \frac{6302 + 1.96\sqrt{12604}}{2} = 3261$

$\mu_L = \frac{3040}{210} = 14.48 \quad\text{and}\quad \mu_H = \frac{3261}{210} = 15.53.$
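A sketch checking the exact interval with scipy's χ² fractiles; only the numbers from the examples above are used:

```python
from scipy.stats import chi2

def poisson_ci(x, m=1.0, alpha=0.05):
    """Exact confidence interval for mu, based on x occurrences in m time units."""
    lo = chi2.ppf(alpha / 2, 2 * x) / 2 / m
    hi = chi2.ppf(1 - alpha / 2, 2 * (x + 1)) / 2 / m
    return lo, hi

print(poisson_ci(15))          # (8.395, 24.74), as in the example above
print(poisson_ci(3150, 210))   # approximately (14.48, 15.53)
```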

14 Johnson NL, Kotz S. Discrete Distributions. Boston: Houghton Mifflin Company 1969 and Stuart A, Ord JK. Kendall's Advanced Theory of Statistics (6th edition). London: Edward Arnold 1994.



32

If m·μ is larger than 10, the interval can alternatively be calculated as follows:

$\mu = \frac{x}{m} \pm z_{\alpha/2}\frac{\sqrt{x}}{m} = \frac{3150}{210} \pm 1.96 \cdot \frac{\sqrt{3150}}{210} = 15 \pm 1.96 \cdot 0.267 = 15 \pm 0.52.$

Testing whether μ = 16 can be done as follows:

$H_0: \mu = 16$

One day (x = 15, m = 1):

$z_{Obs} = \frac{x - \mu_0}{\sqrt{\mu_0}} = \frac{15-16}{\sqrt{16}} = \frac{-1}{4} = -0.25$

One year (x = 3150, m = 210):

$z_{Obs} = \frac{\bar{x} - \mu_0}{\sqrt{\mu_0/m}} = \frac{15-16}{\sqrt{16/210}} = \frac{-1}{0.276} = -3.62$

Alternatively, by a goodness-of-fit test:

$\chi^2_{Obs} = \frac{(x-\mu_0)^2}{\mu_0} \quad\text{and}\quad \chi^2_{Obs} = \frac{(x - m \cdot \mu_0)^2}{m \cdot \mu_0}$

are to be compared to the critical point $\chi^2_{1;\,\alpha}$.

With reference to the source below15, two independent Poisson processes can be compared as follows, when m1·μ1 and m2·μ2 can be assumed to be larger than 10.

Confidence interval:

$(\mu_1 - \mu_2) = \Big(\frac{x_1}{m_1} - \frac{x_2}{m_2}\Big) \pm z_{\alpha/2}\sqrt{\frac{x_1}{m_1^2} + \frac{x_2}{m_2^2}}$

Hypothesis test of $H_0: \mu_1 - \mu_2 = (\mu_1-\mu_2)_0$:

If $(\mu_1-\mu_2)_0 = 0$:

$z_{Obs} = \frac{\frac{x_1}{m_1} - \frac{x_2}{m_2}}{\sqrt{\frac{\hat{\mu}}{m_1} + \frac{\hat{\mu}}{m_2}}}, \quad\text{where } \hat{\mu} = \frac{x_1+x_2}{m_1+m_2}$

If $(\mu_1-\mu_2)_0 \ne 0$:

$z_{Obs} = \frac{\big(\frac{x_1}{m_1} - \frac{x_2}{m_2}\big) - (\mu_1-\mu_2)_0}{\sqrt{\frac{x_1}{m_1^2} + \frac{x_2}{m_2^2}}}$

15 Sahai H, Kurshid A. Statistics in epidemiology: methods techniques and applications. CRC Press 1996


33

Assume that the same assumptions hold for another, independent department of the company. This department has one day received 14 orders and has within 210 working days received 2940 orders in total. Confidence intervals for the difference in the two cases would, by approximation to the normal distribution, be as follows:

One day:

$(\mu_1 - \mu_2) = (15 - 14) \pm 1.96\sqrt{15 + 14} = 1 \pm 1.96 \cdot 5.39 = 1 \pm 10.55$

One year:

$(\mu_1 - \mu_2) = \Big(\frac{3150}{210} - \frac{2940}{210}\Big) \pm 1.96\sqrt{\frac{3150}{210^2} + \frac{2940}{210^2}} = 1 \pm 1.96 \cdot 0.372 = 1 \pm 0.73$

Hypothesis tests for the difference in the two cases are as follows:

$H_0: \mu_1 - \mu_2 = 0$

One day:

$z_{Obs} = \frac{15 - 14}{\sqrt{15 + 14}} = \frac{1}{5.39} = 0.19$

One year:

$z_{Obs} = \frac{\frac{3150}{210} - \frac{2940}{210}}{\sqrt{\frac{3150 + 2940}{210^2}}} = \frac{1}{\sqrt{6090}/210} = \frac{210}{78.04} = 2.69$


34

For testing 3 or more independent Poisson processes, $H_0: \mu_1 = \mu_2 = \ldots = \mu_k$, a χ²-test is applied:

$\chi^2_{Obs} = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}, \quad\text{where } f_i = x_i, \; e_i = \hat{\mu} \cdot m_i \;\text{ and }\; \hat{\mu} = \frac{\sum_{i=1}^{k} x_i}{\sum_{i=1}^{k} m_i}.$

$\chi^2_{Obs}$ is approximately χ²-distributed with k − 1 degrees of freedom. If $m_1 = m_2 = \ldots = m_k$, so that $\hat{\mu} = \bar{x}$, the test statistic can be written as

$\chi^2_{Obs} = \sum_{i=1}^{k} \frac{(x_i - \bar{x})^2}{\bar{x}} = \frac{(k-1) \cdot s_x^2}{\bar{x}} = \frac{k\sum_{i=1}^{k} x_i^2}{\sum_{i=1}^{k} x_i} - \sum_{i=1}^{k} x_i.$

Assume that a third, independent department one day has received 19 orders and within 210 working days has received 3045 orders in total. The three departments can now be compared with each other.

One day:

mi | fi | ei | (fi − ei)²/ei
1  | 15 | 16 | 1/16
1  | 14 | 16 | 4/16
1  | 19 | 16 | 9/16
3  | 48 | 48 | χ²_Obs = 0.875

One year:

mi  | fi   | ei   | (fi − ei)²/ei
210 | 3150 | 3045 | 11025/3045
210 | 2940 | 3045 | 11025/3045
210 | 3045 | 3045 | 0
630 | 9135 | 9135 | χ²_Obs = 7.241

In both cases the χ²_Obs value is to be evaluated against the critical point $\chi^2_{2;\,0.05} = 5.99$. Using the computational formula:

$\chi^2_{Obs} = \frac{3 \cdot (15^2 + 14^2 + 19^2)}{48} - 48 = 0.875 \quad\text{and}\quad \chi^2_{Obs} = \frac{3 \cdot (3150^2 + 2940^2 + 3045^2)}{9135} - 9135 = 7.241.$
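The same test in a short scipy sketch, using the one-year counts from the tables above:

```python
from scipy.stats import chisquare, chi2

f = [3150, 2940, 3045]          # observed orders per department (one year)
stat, p = chisquare(f)          # default expected counts: equal, here 3045 each
print(stat, p)                  # 7.241, p ~ 0.027 -> reject H0 at the 5% level
print(chi2.ppf(0.95, df=2))     # critical point 5.99
```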


35

6. Overview

A) Descriptive measures

Parameter: μ
Estimate: $\bar{x} = \sum_{i=1}^{n} x_i / n$ — Average. Mean, p. 98.

Parameter: σ²
Estimate: $s^2 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1} = \frac{\sum_{i=1}^{n} x_i^2 - n\bar{x}^2}{n-1}$ — Variance. Variance, p. 107.

Parameter: σ
Estimate: $s = \sqrt{s^2}$ — Standard deviation. Std., p. 110.

Parameter: γ1
Estimate: $g_1 \approx \frac{\sum_{i=1}^{n}(x_i-\bar{x})^3}{n \cdot s^3}$ — Skewness: skew to the left ⇒ γ1 < 0; symmetrical ⇒ γ1 = 0; skew to the right ⇒ γ1 > 0. Skewness, p. 36.

Parameter: γ2
Estimate: $g_2 \approx \frac{\sum_{i=1}^{n}(x_i-\bar{x})^4}{n \cdot s^4} - 3$ — Kurtosis: less peaked than the normal distribution ⇒ γ2 < 0; normal distribution ⇒ γ2 = 0; more peaked than the normal distribution ⇒ γ2 > 0. (Kurtosis)

$\sum_{i=1}^{n} x_i^2$ — Sum of squared values.

$SS_X = \sum_{i=1}^{n}(x_i-\bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 = (n-1)s^2$ — Sum of squares of the deviations. Sum of squared deviations, p. 107.

$cv = \frac{s}{\bar{x}}$ — Coefficient of variation. CV, p. 113.

Parameter: σ_X̄
Estimate: $s_{\bar{X}} = \frac{s}{\sqrt{n}}$, or $s_{\bar{X}} = \frac{s}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}$ if $\frac{n}{N} > 5\%$ — Standard error. Std. Error, p. 300.
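A sketch computing these measures for an arbitrary sample; the data values are invented for illustration:

```python
import numpy as np

x = np.array([4.0, 7.0, 6.0, 9.0, 4.0, 8.0])     # illustrative data
n = len(x)
xbar = x.mean()                                  # average
s2 = ((x - xbar) ** 2).sum() / (n - 1)           # variance
s = np.sqrt(s2)                                  # standard deviation
g1 = ((x - xbar) ** 3).sum() / (n * s ** 3)      # skewness, as defined above
g2 = ((x - xbar) ** 4).sum() / (n * s ** 4) - 3  # kurtosis
cv = s / xbar                                    # coefficient of variation
se = s / np.sqrt(n)                              # standard error of the mean
print(xbar, s2, g1, g2, cv, se)
```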


36

B) Construction of confidence intervals

Generally it is assumed that the sample is drawn at random and that the responses are reliable. A confidence interval with a confidence level of 1−α expresses that with a certainty of 1−α you can say that the parameter is included in this interval.

In case the sample makes up more than 5% of a population, the variance of the test statistic is to be corrected by $\frac{N-n}{N-1}$, e.g.

$\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}.$

The parameters and their starting points stated in the grey cells are the most common.

Parameter: θ (generally), known or assumed variance
Estimate: $\hat{\theta}$
Confidence interval: $\theta = \hat{\theta} \pm z_{\alpha/2} \cdot \sigma_{\hat{\Theta}}$
Assumptions: that the test statistic, $\hat{\Theta}$, follows a normal distribution.

Parameter: μ, known or assumed variance
Estimate: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
Confidence interval: $\mu = \bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$
Assumptions: that the test statistic X̄ follows a normal distribution, cf. the central limit theorem: irrespective of the distribution of X, X̄ is approximately normally distributed when the sample is sufficiently large. Use the rule of thumb that the sample should exceed 25 times the square of the skewness.

Parameter: μ, one sample, unknown variance
Estimates: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}, \quad s = \sqrt{\frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}}$
Confidence interval: $\mu = \bar{x} \pm t_{n-1;\,\alpha/2} \cdot \frac{s}{\sqrt{n}}$
Assumptions: that X follows a normal distribution, or that the extended central limit theorem is fulfilled. If X is not extremely skewed, $\frac{\bar{X}-\mu}{S/\sqrt{n}}$ is approximately $T_{n-1}$-distributed when the sample is sufficiently large. Use the rule of thumb that the sample should exceed 100 times the square of the skewness (see pp. 24 and 35).


37

Parameter: μ, occurrences per time unit, X ~ Poisson distributed (see pp. 14, 31 and 32)
Estimates: x is the number of occurrences; m is the number of time units, m > 0
Confidence interval: $\mu = \frac{x}{m} \pm z_{\alpha/2} \cdot \frac{\sqrt{x}}{m}$
Assumptions: that X is approximately normally distributed; m·μ is assumed to be larger than 10. Note that x is used for calculating the margin of error.

Parameter: μD, paired samples, unknown variance
Estimates: $d_i = x_{1i} - x_{2i}, \quad \bar{d} = \frac{\sum_{i=1}^{n} d_i}{n}, \quad s_D = \sqrt{\frac{\sum_{i=1}^{n}(d_i-\bar{d})^2}{n-1}}$
Confidence interval: $\mu_D = \bar{d} \pm t_{n-1;\,\alpha/2} \cdot \frac{s_D}{\sqrt{n}}$
Assumptions: that D follows a normal distribution, or that the extended central limit theorem is fulfilled. If D is not extremely skewed, $\frac{\bar{D}-\mu_D}{S_D/\sqrt{n}}$ is approximately $T_{n-1}$-distributed when the sample is sufficiently large.

Parameter: μ1 − μ2, two independent samples, known variances
Estimates: $\bar{x}_1 = \frac{\sum_{i=1}^{n_1} x_{1i}}{n_1}$ and $\bar{x}_2$ correspondingly
Confidence interval: $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
Assumptions: that X̄1 − X̄2 is approximately normally distributed.

Parameter: μ1 − μ2, two independent samples, unknown but equal variances (see p. 25)
Estimate: $s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}$
Confidence interval: $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_{n_1+n_2-2;\,\alpha/2}\sqrt{s_p^2\Big(\frac{1}{n_1}+\frac{1}{n_2}\Big)}$
Assumptions: that $\frac{(\bar{X}_1-\bar{X}_2)-(\mu_1-\mu_2)}{\sqrt{S_p^2(1/n_1+1/n_2)}}$ is approximately $T_{n_1+n_2-2}$-distributed.

Parameter: μ1 − μ2, two independent samples, unknown variances
Estimates: $s_1^2 = \frac{\sum_{i=1}^{n_1}(x_{1i}-\bar{x}_1)^2}{n_1-1}$ and $s_2^2$ correspondingly
Confidence interval: $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm t_{\nu;\,\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
Assumptions: that $\frac{(\bar{X}_1-\bar{X}_2)-(\mu_1-\mu_2)}{\sqrt{S_1^2/n_1+S_2^2/n_2}}$ is approximately $T_\nu$-distributed; for the calculation of ν see p. 26.


38

Parameter: μ1 − μ2, occurrences per time unit, X1 and X2 ~ Poisson distributed (see pp. 14, 32 and 33)
Estimates: x1 and x2 are the numbers of occurrences in Poisson processes 1 and 2, over two time spans of m1 and m2 time units
Confidence interval: $\mu_1 - \mu_2 = \Big(\frac{x_1}{m_1} - \frac{x_2}{m_2}\Big) \pm z_{\alpha/2}\sqrt{\frac{x_1}{m_1^2} + \frac{x_2}{m_2^2}}$
Assumptions: that X1 − X2 is approximately normally distributed and the processes are independent; m1·μ1 and m2·μ2 are both assumed to be larger than 10. Note that x1 and x2 are used for calculating the margin of error.

Parameter: μi − μj, three or more independent samples, simultaneous confidence intervals, unknown but equal variances
Estimate: $s_p^2 = \frac{\sum_{j=1}^{k}(n_j-1)s_j^2}{n-k}$
Confidence interval: $\mu_i - \mu_j = (\bar{x}_i - \bar{x}_j) \pm t_{n-k;\,\alpha^*/2}\sqrt{s_p^2\Big(\frac{1}{n_i}+\frac{1}{n_j}\Big)}, \quad \alpha^* = \frac{2\alpha}{k(k-1)}$
Assumptions: that $\frac{(\bar{X}_i-\bar{X}_j)-(\mu_i-\mu_j)}{\sqrt{S_p^2(1/n_i+1/n_j)}}$ is approximately $T_{n-k}$-distributed.

Parameter: σ², one sample
Estimate: $s^2 = \frac{\sum_{i=1}^{n}(x_i-\bar{x})^2}{n-1}$
Confidence interval: $\frac{(n-1)s^2}{\chi^2_{n-1;\,\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{n-1;\,1-\alpha/2}}$
Assumptions: that X is normally distributed; alternatively, that the sample is sufficiently large.

Parameter: σ1²/σ2², two independent samples
Confidence interval: $\frac{s_1^2}{s_2^2} \cdot \frac{1}{f_{n_1-1;\,n_2-1;\,\alpha/2}} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{s_1^2}{s_2^2} \cdot \frac{1}{f_{n_1-1;\,n_2-1;\,1-\alpha/2}}$
Assumptions: that X1 and X2 both are normally distributed; alternatively, that the samples are sufficiently large.

Parameter: p, one sample, X binomially or hypergeometrically distributed
Estimate: $\hat{p} = \frac{x}{n}$, x = number of successes
Confidence interval: $p = \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$
Assumptions: that X is approximately normally distributed. This means that n·p and n·(1−p) can be assumed to be larger than 5, or that V(X) can be assumed to be larger than 5.


39

Parameter: p1 − p2, two independent samples, X1 and X2 binomially or hypergeometrically distributed
Estimates: $\hat{p}_1 = \frac{x_1}{n_1}, \quad \hat{p}_2 = \frac{x_2}{n_2}$
Confidence interval: $p_1 - p_2 = (\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$
Assumptions: that X1 − X2 is approximately normally distributed. This means that n1·p1, n1·(1−p1), n2·p2 and n2·(1−p2) can be assumed to be larger than 5, or that V(X1) and V(X2) can be assumed to be larger than 5.

Parameter: pi − pj, one sample, X multinomially or k-dimensionally hypergeometrically distributed
Estimates: $\hat{p}_i = \frac{x_i}{n}, \quad \hat{p}_j = \frac{x_j}{n}$, xi = number of outcomes with characteristic i
Confidence interval: $p_i - p_j = (\hat{p}_i - \hat{p}_j) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_i + \hat{p}_j - (\hat{p}_i-\hat{p}_j)^2}{n}}$
Assumptions: that Xi − Xj is approximately normally distributed. This means that n·pi, n·(1−pi), n·pj and n·(1−pj) can be assumed to be larger than 5, or that V(Xi) and V(Xj) can be assumed to be larger than 5.

Parameter: β1, regression analysis
Estimate: $b_1 = \frac{\mathrm{cov}(X,Y)}{s_X^2}$
Confidence interval: $\beta_1 = b_1 \pm t_{n-2;\,\alpha/2} \cdot s_{b_1}$
Assumptions: that the error component ε is approximately normally distributed, E(ε) = 0 and V(ε) = constant.

Parameter: β0, regression analysis
Estimate: $b_0 = \bar{y} - b_1\bar{x}$
Confidence interval: $\beta_0 = b_0 \pm t_{n-2;\,\alpha/2} \cdot s_{b_0}$
Assumptions: as for β1, and X = 0 should be close to the observed X values.

Parameter: μY|X, regression analysis, confidence interval for an expected value
Confidence interval: $E(Y \mid X = x) = (b_0 + b_1 x) \pm t_{n-2;\,\alpha/2} \cdot s_\varepsilon\sqrt{\frac{1}{n} + \frac{(x-\bar{x})^2}{SS_X}}$
Assumptions: as for β1, and X = x should be close to the observed X values.

Parameter: Y|X, regression analysis, prediction interval for a single value
Prediction interval: $Y \mid X = x: \; (b_0 + b_1 x) \pm t_{n-2;\,\alpha/2} \cdot s_\varepsilon\sqrt{1 + \frac{1}{n} + \frac{(x-\bar{x})^2}{SS_X}}$
Assumptions: as for β1, and X = x should be close to the observed X values.


40

Parameter: ρ, the degree of linear relationship, analysis of correlation, when ρ is assumed to be ≈ 0
Estimate: $r = \frac{\mathrm{cov}(X,Y)}{s_X \cdot s_Y}$
Confidence interval: $\rho \approx r \pm t_{n-2;\,\alpha/2}\sqrt{\frac{1-r^2}{n-2}}$
Assumptions: corresponding to the assumptions concerning β1.

Parameter: ρ, analysis of correlation, when ρ is assumed to be ≠ 0
Estimate: $r = \frac{\mathrm{cov}(X,Y)}{s_X \cdot s_Y}$
Confidence interval: $\rho_Z = r_Z \pm z_{\alpha/2}/\sqrt{n-3}; \quad \rho \in [\text{Lower}; \text{Upper}]$
Assumptions: corresponding to the assumptions concerning β1.

Transformations:

$r_Z = \frac{\ln(1+r) - \ln(1-r)}{2}$

$\text{Lower} = \frac{e^{2(r_Z - z_{\alpha/2}/\sqrt{n-3})} - 1}{e^{2(r_Z - z_{\alpha/2}/\sqrt{n-3})} + 1} \qquad \text{Upper} = \frac{e^{2(r_Z + z_{\alpha/2}/\sqrt{n-3})} - 1}{e^{2(r_Z + z_{\alpha/2}/\sqrt{n-3})} + 1}$
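A sketch of the Fisher-transformation interval; the values r = 0.60 and n = 30 are invented for illustration:

```python
import math
from scipy.stats import norm

def rho_ci(r, n, alpha=0.05):
    """CI for rho via the Fisher transformation, as in the table above."""
    z = norm.ppf(1 - alpha / 2)                  # z_{alpha/2}
    rz = 0.5 * math.log((1 + r) / (1 - r))       # Fisher transformation of r
    lo = rz - z / math.sqrt(n - 3)
    hi = rz + z / math.sqrt(n - 3)
    back = lambda v: (math.exp(2 * v) - 1) / (math.exp(2 * v) + 1)
    return back(lo), back(hi)

print(rho_ci(0.60, 30))    # roughly (0.31, 0.79)
```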


41

C) Application of hypotheses

Generally it is assumed that the sample is a simple random sample and that the responses are reliable. The 7 points approach below refer to a hypothesis on a μ-value. The approach can also be used with another notation for tests of proportions, difference between two means/proportions as well as for tests on one and two variances, respectively. X: Definition of the random variables. Given information: n, x and σ2 or s2. 1) Hypothesis formulation H0: μ = μ0 The null hypothesis is status quo (write down the action taken if H0 is chosen) H1: μ ≠ μ0 The alternative hypothesis is what we wish or fear to show (write down the action taken if H1 is chosen). H1 could possibly be one-sided, μ > μ0 or μ < μ0, given that it can be reasoned for through a theory or earlier survey within the given area. If the test is one-sided it will have an effect on the critical limit (point 5) and the calculation of the p-value (point 6). 2) Choice of significance level, typically 0.05 unless otherwise stated (see Keller p. 346) The maximum uncertainty allowed for a type I error, i.e. choosing H1, when H0 is true. If the consequence by committing a type I error is crucial, the α level should be lowered. 3) Choice of test statistic (cf. pp. 24, 25-28 and 43-48)

If the population variance $\sigma^2$ is known, or assumed known, a Z test statistic is used (see Keller chapter 11.2):

$\frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} \sim Z$

If the variance is estimated ($s^2$) on the basis of a simple random sample, a T test statistic is used (see Keller chapter 12.1):

$\frac{\bar{X} - \mu_0}{S / \sqrt{n}} \sim T_{n-1}$

Assumptions:
• H0 is true - hypothesis testing is conducted under the assumption that H0 is true.
• $\bar{X}$ is approximately normally distributed, or $\frac{\bar{X} - \mu_0}{S/\sqrt{n}} \approx T_{n-1}$ - this must be discussed, see p. 24 and Keller p. 300 and p. 389.
• n/N < 0.05 - otherwise the finite population correction factor must be applied.


4) Calculation of the value of the test statistic (see Keller p. 351 and p. 383)

$z_{Obs} = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$ or $t_{Obs} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$

This is the number of standard deviations of $\bar{X}$ ($\sigma_{\bar{X}}$ or $s_{\bar{X}}$) by which the simple random sample result ($\bar{x}$) deviates from μ0. If the sample represents more than 5% of the population, the variance of the estimator is corrected with $\frac{N-n}{N-1}$, e.g. $\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n} \cdot \frac{N-n}{N-1}$.

5) Determination of critical values and choice between H0 and H1 (see Keller pp. 350-351 and p. 386)

The critical values are $z_{\alpha/2}$ and $z_{1-\alpha/2}$ - add a graphic illustration.
If $|z_{Obs}| \le |z_{\alpha/2}|$, maintain H0 (H0 cannot be rejected; H0 is accepted; H0 is chosen).
If $|z_{Obs}| > |z_{\alpha/2}|$, reject H0 (making H1 plausible; H1 is chosen).
At a lower one-sided hypothesis test, H1: μ < μ0, the critical limit is $-|z_\alpha|$; at an upper one-sided hypothesis test, H1: μ > μ0, the critical limit is $|z_\alpha|$. If the variance is estimated on the basis of the sample, the fractiles of the $T_{n-1}$-distribution must be applied.

6) Calculation of the p-value (see Keller p. 353 and p. 386)

The probability of a sample result just as extreme as or more extreme than the observed one, if the null hypothesis is true, i.e. $2 \cdot P(Z \le -|z_{Obs}|)$ or $2 \cdot P(T_{n-1} \le -|t_{Obs}|)$. At a lower one-sided hypothesis test, H1: μ < μ0, the p-value is $P(Z \le z_{Obs})$; at an upper one-sided hypothesis test, H1: μ > μ0, the p-value is $P(Z \ge z_{Obs})$.

7) Conclusion (see Keller p. 355 and p. 358)

The conclusion must contain a choice, either H0 or H1, and the certainty with which the choice has been made.

• It is a very certain conclusion when you choose H0 and the p-value is >> α.
• It is an uncertain conclusion when you choose H0 and the p-value is > α but close to α.
• It is an uncertain conclusion when you choose H1 and the p-value is < α but close to α.
• It is a very certain conclusion when you choose H1 and the p-value is << α.

Any reservations you may have about the assumptions have hardly any influence on the choice if the conclusion is very certain, but they may matter for the choice between H0 and H1 when the conclusion is uncertain.
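As an illustration of the whole 7-point approach, here is a minimal sketch of a two-sided one-sample t-test in Python (NumPy/SciPy assumed); the data vector and μ0 = 500 are hypothetical values chosen purely for illustration:

import numpy as np
from scipy.stats import t

# 1) H0: mu = 500 vs H1: mu != 500;  2) alpha = 0.05
mu0, alpha = 500.0, 0.05
x = np.array([512, 498, 505, 521, 494, 508, 499, 515, 503, 507], float)
n, x_bar, s = len(x), x.mean(), x.std(ddof=1)

# 3)-4) T test statistic, since the variance is estimated from the sample
t_obs = (x_bar - mu0) / (s / np.sqrt(n))

# 5) Critical values for a two-sided test
t_crit = t.ppf(1 - alpha / 2, n - 1)

# 6) Two-sided p-value: 2 * P(T_{n-1} <= -|t_obs|)
p_value = 2 * t.cdf(-abs(t_obs), n - 1)

# 7) Conclusion
print(f"t_obs = {t_obs:.3f}, critical = +/-{t_crit:.3f}, p-value = {p_value:.4f}")
print("choose H1" if abs(t_obs) > t_crit else "maintain H0")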


The hypotheses stated in the grey cells are the most common.

H0-hypothesis Test statistic The value of the test statistic Assumptions

(Generally)

H0: $\theta = \theta_0$
Test statistic: $\frac{\hat{\Theta} - \theta_0}{\sigma_{\hat{\Theta}}} \sim Z$
Value of the test statistic: $z_{Obs} = \frac{\hat{\theta} - \theta_0}{\sigma_{\hat{\Theta}}}$
Assumptions: That the estimator $\hat{\Theta}$ follows a normal distribution.

H0: $\mu = \mu_0$ ($\sigma^2$ known or assumed)
Test statistic: $\frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} \sim Z$
Value of the test statistic: $z_{Obs} = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}$
Assumptions: That $\bar{X}$ follows a normal distribution, cf. the central limit theorem: irrespective of the distribution of X, $\bar{X}$ is approximately normally distributed when the sample is sufficiently large. Use the rule of thumb that the sample size should exceed 25 times the square of the skewness.

H0: $\mu = \mu_0$ (unknown variance)
Test statistic: $\frac{\bar{X} - \mu_0}{S / \sqrt{n}} \sim T_{n-1}$
Value of the test statistic: $t_{Obs} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$
Assumptions: That X follows a normal distribution or that the extended central limit theorem is fulfilled: if X is not extremely skewed, $\frac{\bar{X} - \mu}{S / \sqrt{n}}$ is approximately $T_{n-1}$-distributed when the sample is sufficiently large. Use the rule of thumb that the sample size should exceed 100 times the estimated square of the skewness (see pp. 24 and 92).

H0: $\mu = \mu_0$, where $X \sim$ Poisson (occurrences per time unit)
Test statistic: $\frac{X/m - \mu_0}{\sqrt{\mu_0 / m}} \sim Z$
Value of the test statistic: $z_{Obs} = \frac{x/m - \mu_0}{\sqrt{\mu_0 / m}}$
Assumptions: That X is approximately normally distributed; this holds if $m \cdot \mu_0$ can be assumed to be larger than 10. Note that $\mu_0$ is used for calculating the standard deviation of the test statistic. m is the number of time units, m > 0.
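A sketch of this one-sample Poisson test (NumPy/SciPy assumed); poisson_rate_test and the example counts are hypothetical:

import numpy as np
from scipy.stats import norm

def poisson_rate_test(x, m, mu0):
    """z_obs and two-sided p-value for H0: mu = mu0, given x occurrences in m time units."""
    z_obs = (x / m - mu0) / np.sqrt(mu0 / m)   # note: mu0 enters the standard deviation
    return z_obs, 2 * norm.cdf(-abs(z_obs))

# Hypothetical example: 64 breakdowns in m = 50 weeks, H0: mu = 1 per week (m*mu0 = 50 > 10)
print(poisson_rate_test(64, 50, 1.0))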


H0-hypothesis Test statistic The value of the test statistic Assumptions

H0: $\mu_D = \mu_{D_0}$
Test statistic: $\frac{\bar{D} - \mu_{D_0}}{S_D / \sqrt{n_D}} \sim T_{n_D - 1}$
Value of the test statistic: $t_{Obs} = \frac{\bar{d} - \mu_{D_0}}{s_D / \sqrt{n_D}}$
Assumptions: Paired samples. That D follows a normal distribution or that the extended central limit theorem is fulfilled: if D is not extremely skewed, $\frac{\bar{D} - \mu_D}{S_D / \sqrt{n_D}}$ is approximately $T_{n_D - 1}$-distributed when the sample is sufficiently large.

H0: $\mu_1 - \mu_2 = (\mu_1 - \mu_2)_0$, the $\sigma_j^2$'s known or assumed
Test statistic: $\frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \sim Z$
Value of the test statistic: $z_{Obs} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
Assumptions: Two independent samples. That $\bar{X}_1 - \bar{X}_2$ is approximately normally distributed.

H0: $\mu_1 - \mu_2 = (\mu_1 - \mu_2)_0$, given $\sigma_1^2 = \sigma_2^2$ (unknown variances)
Test statistic: $\frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{\sqrt{S_p^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \sim T_{n_1 + n_2 - 2}$
Value of the test statistic: $t_{Obs} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{\sqrt{s_p^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
Assumptions: Two independent samples. That $\frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{\sqrt{S_p^2 (1/n_1 + 1/n_2)}}$ is approximately $T_{n_1 + n_2 - 2}$-distributed.

H0: $\mu_1 - \mu_2 = (\mu_1 - \mu_2)_0$ (unknown and unequal variances)
Test statistic: $\frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}} \sim T_\nu$
Value of the test statistic: $t_{Obs} = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
Assumptions: Two independent samples. That $\frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)_0}{\sqrt{S_1^2/n_1 + S_2^2/n_2}}$ is approximately $T_\nu$-distributed; for the calculation of $\nu$ see p. 26.
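Both the pooled and the Welch variant are available in SciPy as scipy.stats.ttest_ind, which covers the common case $(\mu_1 - \mu_2)_0 = 0$; the data below are simulated purely for illustration:

import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(1)          # hypothetical data for illustration
x1 = rng.normal(100, 10, size=30)
x2 = rng.normal(95, 14, size=25)

# Pooled-variance test (assumes sigma_1^2 = sigma_2^2), df = n1 + n2 - 2
t_pooled, p_pooled = ttest_ind(x1, x2, equal_var=True)

# Welch test (unequal variances), df = nu from the Welch-Satterthwaite formula
t_welch, p_welch = ttest_ind(x1, x2, equal_var=False)

print(f"pooled: t = {t_pooled:.3f}, p = {p_pooled:.4f}")
print(f"Welch : t = {t_welch:.3f}, p = {p_welch:.4f}")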


H0-hypothesis Test statistic The value of the test statistic Assumptions

H0: $\mu_1 - \mu_2 = 0$, where $X_j \sim$ Poisson (occurrences per time unit)
Test statistic: $\frac{\frac{X_1}{m_1} - \frac{X_2}{m_2}}{\sqrt{\frac{X_1 + X_2}{m_1 \cdot m_2}}} \sim Z$
Value of the test statistic: $z_{Obs} = \frac{\frac{x_1}{m_1} - \frac{x_2}{m_2}}{\sqrt{\frac{x_1 + x_2}{m_1 \cdot m_2}}}$
Assumptions: That $X_1 - X_2$ is approximately normally distributed, and that $X_1$ and $X_2$ are independent; this holds if $\mu_1 m_1$ and $\mu_2 m_2$ are both assumed to be larger than 10. Note that $x_1$ and $x_2$ are used for calculating the standard deviation of the test statistic. Two time spans of respectively $m_1$ and $m_2$ time units.
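A sketch of the two-sample Poisson test (NumPy/SciPy assumed); two_rate_test and the example counts are hypothetical. Note how the pooled counts $x_1 + x_2$ enter the standard deviation, as in the table:

import numpy as np
from scipy.stats import norm

def two_rate_test(x1, m1, x2, m2):
    """z_obs and two-sided p-value for H0: mu_1 - mu_2 = 0, counts over m1, m2 time units."""
    z_obs = (x1 / m1 - x2 / m2) / np.sqrt((x1 + x2) / (m1 * m2))
    return z_obs, 2 * norm.cdf(-abs(z_obs))

# Hypothetical example: 40 arrivals in 20 hours vs 65 arrivals in 25 hours
print(two_rate_test(40, 20, 65, 25))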

H0: $\mu_1 = \mu_2 = \ldots = \mu_k$
Test statistic: $\frac{MSTR}{MSE} \sim F_{k-1;\,n-k}$
Value of the test statistic: $f_{Obs} = \frac{\sum_{j=1}^{k} n_j (\bar{x}_j - \bar{x})^2 / (k-1)}{\sum_{j=1}^{k} (n_j - 1) s_j^2 / (n-k)}$
Assumptions: That the $X_j$'s are approximately normally distributed, have equal variances and are independent. Alternatively that the samples are sufficiently large.
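A sketch computing $f_{Obs}$ directly from group summaries, exactly as in the formula above (NumPy/SciPy assumed; anova_from_summaries and the example numbers are hypothetical):

import numpy as np
from scipy.stats import f

def anova_from_summaries(ns, means, variances):
    """f_obs and p-value from group sizes, group means and sample variances."""
    ns, means, variances = map(np.asarray, (ns, means, variances))
    k, n = len(ns), ns.sum()
    grand_mean = np.sum(ns * means) / n
    mstr = np.sum(ns * (means - grand_mean) ** 2) / (k - 1)   # between-groups mean square
    mse = np.sum((ns - 1) * variances) / (n - k)              # within-groups mean square
    f_obs = mstr / mse
    return f_obs, f.sf(f_obs, k - 1, n - k)

# Hypothetical example with k = 3 groups
print(anova_from_summaries([10, 12, 11], [5.1, 6.0, 4.7], [1.2, 1.5, 1.1]))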

H0: $\sigma^2 = \sigma_0^2$
Test statistic: $\frac{(n-1) \cdot S^2}{\sigma_0^2} \sim \chi^2_{n-1}$
Value of the test statistic: $\chi^2_{Obs} = \frac{(n-1) \cdot s^2}{\sigma_0^2}$
Assumptions: That X is normally distributed. Alternatively that the sample is sufficiently large.

H0: $\sigma_1^2 = \sigma_2^2$
Test statistic: $\frac{S_1^2}{S_2^2} \sim F_{n_1 - 1;\, n_2 - 1}$
Value of the test statistic: $f_{Obs} = \frac{s_1^2}{s_2^2}$
Assumptions: That both $X_1$ and $X_2$ are normally distributed. Alternatively that the samples are sufficiently large.

H0 (evaluation of): $\sigma_1^2 = \sigma_2^2 = \ldots = \sigma_k^2$ - "the false F-test"
Value of the test statistic: $f^*_{Obs} = \frac{s^2_{Max}}{s^2_{Min}}$, which is compared with the fractile $f_{n_i - 1;\, n_j - 1;\, \alpha^*/2}$, where $\alpha^* = \frac{2\alpha}{k(k-1)}$; choose H1 if $f^*_{Obs} > f_{n_i - 1;\, n_j - 1;\, \alpha^*/2}$.
Assumptions: That $X_i$ and $X_j$ are both normally distributed and that the $n_j$'s are approximately equal. Alternatively that the samples are sufficiently large.
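A sketch of the "false F-test" as reconstructed above (NumPy/SciPy assumed); false_f_test, the Bonferroni-style adjustment $\alpha^* = 2\alpha/(k(k-1))$ and the example values are illustrative only:

import numpy as np
from scipy.stats import f

def false_f_test(variances, ns, alpha=0.05):
    """Compare s2_max/s2_min with an adjusted F fractile across k groups."""
    variances, ns = np.asarray(variances, float), np.asarray(ns)
    k = len(variances)
    i_max, i_min = variances.argmax(), variances.argmin()
    f_obs = variances[i_max] / variances[i_min]
    alpha_star = 2 * alpha / (k * (k - 1))      # adjustment for k(k-1)/2 pairwise ratios
    f_crit = f.ppf(1 - alpha_star / 2, ns[i_max] - 1, ns[i_min] - 1)
    return f_obs, f_crit, f_obs > f_crit

# Hypothetical example: three groups with roughly equal n_j
print(false_f_test([2.1, 3.4, 1.8], [12, 11, 12]))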


H0-hypothesis Test statistic The value of the test statistic Assumptions

H0: $p = p_0$
Test statistic: $\frac{\hat{P} - p_0}{\sqrt{\frac{p_0 (1 - p_0)}{n}}} \sim Z$, where $\hat{P} = \frac{X}{n}$
Value of the test statistic: $z_{Obs} = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0 (1 - p_0)}{n}}}$
Assumptions: That X is binomially or hypergeometrically distributed, and that X is approximately normally distributed. This means that $n p_0$ and $n(1 - p_0)$ are larger than 5, or that $V(X)$ is larger than 5.

H0: $p_1 - p_2 = 0$
Test statistic: $\frac{\hat{P}_1 - \hat{P}_2}{\sqrt{\hat{P}(1 - \hat{P}) \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} \sim Z$
Value of the test statistic: $z_{Obs} = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p}) \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$, where $\hat{p}$ is the pooled proportion $\frac{x_1 + x_2}{n_1 + n_2}$
Assumptions: Two independent samples. That $X_1 - X_2$ is approximately normally distributed.

H0: $p_1 - p_2 = (p_1 - p_2)_0$
Test statistic: $\frac{(\hat{P}_1 - \hat{P}_2) - (p_1 - p_2)_0}{\sqrt{\frac{\hat{P}_1 (1 - \hat{P}_1)}{n_1} + \frac{\hat{P}_2 (1 - \hat{P}_2)}{n_2}}} \sim Z$
Value of the test statistic: $z_{Obs} = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)_0}{\sqrt{\frac{\hat{p}_1 (1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2 (1 - \hat{p}_2)}{n_2}}}$
Assumptions: Two independent samples. That $X_1 - X_2$ is approximately normally distributed.

H0: $p_i - p_j = 0$
Test statistic: $\frac{\hat{P}_i - \hat{P}_j}{\sqrt{\frac{\hat{P}_i + \hat{P}_j}{n}}} \sim Z$
Value of the test statistic: $z_{Obs} = \frac{\hat{p}_i - \hat{p}_j}{\sqrt{\frac{\hat{p}_i + \hat{p}_j}{n}}} = \frac{x_i - x_j}{\sqrt{x_i + x_j}}$
Assumptions: One sample, where each trial has two or more possible outcomes. That $X_i - X_j$ is approximately normally distributed; this means that $n p_i$, $n(1 - p_i)$, $n p_j$ and $n(1 - p_j)$ can be assumed to be larger than 5, or that $V(X_i)$ and $V(X_j)$ can be assumed to be larger than 5.
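A sketch of this within-sample test (NumPy/SciPy assumed); within_sample_prop_test and the counts are hypothetical. The simplification $z_{Obs} = (x_i - x_j)/\sqrt{x_i + x_j}$ makes it a one-liner:

import numpy as np
from scipy.stats import norm

def within_sample_prop_test(x_i, x_j):
    """H0: p_i = p_j for two categories counted in ONE multinomial sample."""
    z_obs = (x_i - x_j) / np.sqrt(x_i + x_j)
    return z_obs, 2 * norm.cdf(-abs(z_obs))

# Hypothetical example: 180 respondents chose brand i, 140 chose brand j
print(within_sample_prop_test(180, 140))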

H0: $p_i - p_j = (p_i - p_j)_0$
Test statistic: $\frac{(\hat{P}_i - \hat{P}_j) - (p_i - p_j)_0}{\sqrt{\frac{\hat{P}_i + \hat{P}_j - (p_i - p_j)_0^2}{n}}} \sim Z$
Value of the test statistic: $z_{Obs} = \frac{(\hat{p}_i - \hat{p}_j) - (p_i - p_j)_0}{\sqrt{\frac{\hat{p}_i + \hat{p}_j - (p_i - p_j)_0^2}{n}}}$
Assumptions: One sample, where each trial has two or more possible outcomes. That $X_i - X_j$ is approximately normally distributed; this means that $n p_i$, $n(1 - p_i)$, $n p_j$ and $n(1 - p_j)$ can be assumed to be larger than 5, or that $V(X_i)$ and $V(X_j)$ can be assumed to be larger than 5.


H0-hypothesis Test statistic The value of the test statistic Assumptions

H0: $p_1 = p_{1_0},\ \ldots,\ p_k = p_{k_0}$, with $\sum_{i=1}^{k} p_i = 1$ (test of distribution)
Test statistic: $\sum_{i=1}^{k} \frac{(F_i - E_i)^2}{E_i} \sim \chi^2_{k-1-m}$
Value of the test statistic: $\chi^2_{Obs} = \sum_{i=1}^{k} \frac{(f_i - e_i)^2}{e_i}$, where $e_i = n \cdot p_{i_0}$
Assumptions: That the $X_i$'s are approximately normally distributed; this means that all $e_i$ must be larger than 5. m = the number of estimated parameters.

H0: Independence or homogeneity
Test statistic: $\sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(F_{ij} - E_{ij})^2}{E_{ij}} \sim \chi^2_{(r-1)(c-1)}$
Value of the test statistic: $\chi^2_{Obs} = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(f_{ij} - e_{ij})^2}{e_{ij}}$, where $e_{ij} = \frac{n_{i.} \cdot n_{.j}}{n}$
Assumptions: That the $X_{ij}$'s are approximately normally distributed; this means that all $e_{ij}$ must be larger than 5. Here $n_{.j} = \sum_{i=1}^{r} f_{ij}$ and $n_{i.} = \sum_{j=1}^{c} f_{ij}$.
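Both chi-square tests are available in SciPy as scipy.stats.chisquare and scipy.stats.chi2_contingency; the observed counts below are hypothetical illustration values:

import numpy as np
from scipy.stats import chisquare, chi2_contingency

# Goodness of fit: H0: p = (0.5, 0.3, 0.2); hypothetical observed counts, n = 200
f_obs = np.array([92, 66, 42])
e = 200 * np.array([0.5, 0.3, 0.2])           # all e_i > 5
chi2_stat, p = chisquare(f_obs, f_exp=e)       # df = k - 1 (pass ddof=m if m params estimated)
print(chi2_stat, p)

# Independence/homogeneity on a hypothetical r x c table of frequencies
table = np.array([[30, 20, 10],
                  [25, 35, 30]])
chi2_stat, p, dof, expected = chi2_contingency(table)  # e_ij = n_i. * n_.j / n
print(chi2_stat, p, dof)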

H0: $\beta_1 = \beta_{1_0}$
Test statistic: $\frac{b_1 - \beta_{1_0}}{S_{b_1}} \sim T_{n-2}$
Value of the test statistic: $t_{Obs} = \frac{b_1 - \beta_{1_0}}{s_{b_1}}$
Assumptions: That the error component $\varepsilon$ is approximately normally distributed, $E(\varepsilon) = 0$ and $V(\varepsilon)$ = constant.

H0: $\beta_0 = \beta_{0_0}$
Test statistic: $\frac{b_0 - \beta_{0_0}}{S_{b_0}} \sim T_{n-2}$
Value of the test statistic: $t_{Obs} = \frac{b_0 - \beta_{0_0}}{s_{b_0}}$
Assumptions: That the error component $\varepsilon$ is approximately normally distributed, $E(\varepsilon) = 0$ and $V(\varepsilon)$ = constant. When X = 0 is close to the X values.

H0: $\mu_{Y|X} = \mu_{Y_0}$
Test statistic: $\frac{(b_0 + b_1 x) - \mu_{Y_0}}{S_\varepsilon \sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{SS_X}}} \sim T_{n-2}$
Value of the test statistic: $t_{Obs} = \frac{(b_0 + b_1 x) - \mu_{Y_0}}{s_\varepsilon \sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{SS_X}}}$
Assumptions: That the error component $\varepsilon$ is approximately normally distributed, $E(\varepsilon) = 0$ and $V(\varepsilon)$ = constant. When X = x is close to the X values.

H0: $Y|X = \mu_{Y_0}$
Test statistic: $\frac{(b_0 + b_1 x) - \mu_{Y_0}}{S_\varepsilon \sqrt{1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{SS_X}}} \sim T_{n-2}$
Value of the test statistic: $t_{Obs} = \frac{(b_0 + b_1 x) - \mu_{Y_0}}{s_\varepsilon \sqrt{1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{SS_X}}}$
Assumptions: That the error component $\varepsilon$ is approximately normally distributed, $E(\varepsilon) = 0$ and $V(\varepsilon)$ = constant. When X = x is close to the X values.


H0-hypothesis Test statistic The value of the test statistic Assumptions

H0: $\rho = 0$
Test statistic: $R \cdot \sqrt{\frac{n-2}{1-R^2}} \sim T_{n-2}$
Value of the test statistic: $t_{Obs} = r \cdot \sqrt{\frac{n-2}{1-r^2}}$
Assumptions: Corresponding to the assumptions concerning $\beta_1$.

H0: $\rho = \rho_0$
Test statistic: $\frac{R_Z - \rho_{Z_0}}{1 / \sqrt{n-3}} \sim Z$
Value of the test statistic: $z_{Obs} = \frac{r_Z - \rho_{Z_0}}{1 / \sqrt{n-3}}$
Assumptions: Corresponding to the assumptions concerning $\beta_1$.
Transformations: $r_Z = \left(\ln(1+r) - \ln(1-r)\right)/2$ and $\rho_{Z_0} = \left(\ln(1+\rho_0) - \ln(1-\rho_0)\right)/2$.

H0: $\rho_1 = \rho_2$
Test statistic: $\frac{R_{Z_1} - R_{Z_2}}{\sqrt{\frac{1}{n_1 - 3} + \frac{1}{n_2 - 3}}} \sim Z$
Value of the test statistic: $z_{Obs} = \frac{r_{Z_1} - r_{Z_2}}{\sqrt{\frac{1}{n_1 - 3} + \frac{1}{n_2 - 3}}}$
Assumptions: Two independent samples. Corresponding to the assumptions concerning $\beta_1$.
Transformations: $r_{Z_1} = \left(\ln(1+r_1) - \ln(1-r_1)\right)/2$ and $r_{Z_2} = \left(\ln(1+r_2) - \ln(1-r_2)\right)/2$.
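A sketch of both Fisher-transformation tests (NumPy/SciPy assumed); the function names and the example correlations are hypothetical:

import numpy as np
from scipy.stats import norm

fisher = lambda r: 0.5 * (np.log(1 + r) - np.log(1 - r))  # the r_Z transformation

def test_rho_equals(r, n, rho0):
    """H0: rho = rho0 via the Fisher transformation."""
    z_obs = (fisher(r) - fisher(rho0)) * np.sqrt(n - 3)
    return z_obs, 2 * norm.cdf(-abs(z_obs))

def test_two_rhos(r1, n1, r2, n2):
    """H0: rho_1 = rho_2 for two independent samples."""
    z_obs = (fisher(r1) - fisher(r2)) / np.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return z_obs, 2 * norm.cdf(-abs(z_obs))

# Hypothetical examples
print(test_rho_equals(0.62, 50, 0.40))
print(test_two_rhos(0.62, 50, 0.38, 45))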

H0: $\rho_{A,C} = \rho_{B,C}$
Test statistic: $\frac{R_{A,C} - R_{B,C}}{\sqrt{\frac{2\left(1 - R_{A,B}^2 - R_{A,C}^2 - R_{B,C}^2 + 2 R_{A,B} R_{A,C} R_{B,C}\right)}{(n-3)(1 + R_{A,B})}}} \sim T_{n-3}$
Value of the test statistic: $t_{Obs} = \frac{r_{A,C} - r_{B,C}}{\sqrt{\frac{2\left(1 - r_{A,B}^2 - r_{A,C}^2 - r_{B,C}^2 + 2 r_{A,B} r_{A,C} r_{B,C}\right)}{(n-3)(1 + r_{A,B})}}}$
Assumptions: Overlapping correlations, where the correlation between A and C is tested against the correlation between B and C, taking the correlation between A and B into consideration.
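A sketch of this test for overlapping correlations (NumPy/SciPy assumed); overlapping_corr_test and the example correlations are hypothetical:

import numpy as np
from scipy.stats import t

def overlapping_corr_test(r_ac, r_bc, r_ab, n):
    """H0: rho_AC = rho_BC when both correlations share variable C (t with n-3 df)."""
    det = 1 - r_ab ** 2 - r_ac ** 2 - r_bc ** 2 + 2 * r_ab * r_ac * r_bc
    t_obs = (r_ac - r_bc) / np.sqrt(2 * det / ((n - 3) * (1 + r_ab)))
    return t_obs, 2 * t.cdf(-abs(t_obs), n - 3)

# Hypothetical example: r_AC = 0.55, r_BC = 0.35, r_AB = 0.60, n = 60
print(overlapping_corr_test(0.55, 0.35, 0.60, 60))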