organizational behavior


Upload: winny-shiru-machira

Post on 12-May-2017


© Copy Right: Rai University 11.556

RESEARCH METHODOLOGY

In this and the next lesson we look at tests of statistical inference for small samples. Broadly, the main theoretical issues underlying tests of statistical inference are similar to those for large samples. Since the previous few lessons have analyzed these issues at length, we shall not spend much time on the theory in this lesson. We will briefly review the main theoretical properties of the t distribution and then apply the principles of statistical inference to various situations.

By the end of this lesson you should be able to:
1. Review the theoretical aspects of the t distribution.
2. Carry out hypothesis testing using the t distribution for small samples.
3. Apply the principles of hypothesis testing of differences between means for small sample sizes.
4. Carry out tests of differences between means for dependent samples.

Theoretical aspects of the t distribution
Theoretical work on the t distribution was done by W.S. Gosset, who published under the pseudonym "Student", in the early 1900s. Student's t distribution is used under two circumstances:
1. The sample size, n, is less than 30.
2. The population standard deviation is not known. In this case t tests may be used even if the sample size is greater than 30.

We also assume that the population underlying a t distribution is normal or approximately normal.

Characteristics of the t distribution
Relationship between the t distribution and the normal distribution:
1. Both distributions are symmetrical. However, as can be seen in Figure 1, the t distribution is flatter than the normal distribution: it is higher in the tails and has proportionately less area around the mean. This implies that we have to go further out from the mean of a t distribution to include the same area under the curve. Thus interval widths are wider for a t distribution.
2. There is a different t distribution for every possible sample size.
3. As sample size increases, the t distribution loses its flatness and becomes approximately equal to the normal distribution. For sample sizes greater than 30 the t distribution is so close to the normal distribution that we can use the normal distribution instead.
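The three characteristics above can be checked numerically. The following is a minimal sketch, assuming the standard textbook density formula for Student's t distribution; the function names `t_pdf` and `normal_pdf` are illustrative and do not come from the lesson.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    # Work on the log scale so that large df values do not overflow gamma().
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log(1 + x * x / df))

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

# Height at the mean (x = 0) and in the tail (x = 3) for several df values.
for df in (2, 5, 30, 1000):
    print("df =", df, round(t_pdf(0, df), 4), round(t_pdf(3, df), 4))
print("normal", round(normal_pdf(0), 4), round(normal_pdf(3), 4))
```

For small degrees of freedom the t density is lower than the normal density at the mean and higher in the tail, and the two converge as the degrees of freedom grow.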

Figure 1

Degrees of Freedom

What is a Degree of Freedom?
The number of degrees of freedom is defined as the number of values we can choose freely. The concept is best illustrated with the help of an example.
Consider the case: (a + b)/2 = 18
Given that the mean of these two numbers has to equal 18, how do we determine values for a and b? We can slot in any two values such that they add up to 36. Suppose a = 10. Then b has to equal 26 given the above constraint. Thus in a sample of two where the value of the mean is specified (i.e., a constraint) we are free to specify only one variable. Therefore we have only one degree of freedom.
Another example: (a + b + c + d + e + f + g)/7 = 16
Now we have 7 variables. Given the mean, we are free to specify 6 variables. The value of the 7th variable is determined automatically.
For a sample of size n we can define a t distribution with n − 1 degrees of freedom.
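The constraint in these examples can be written as a short sketch. The helper name `last_value` is hypothetical and is only meant to show that, once the mean is fixed, choosing n − 1 values determines the remaining one.

```python
def last_value(free_values, mean, n):
    """Return the one value fixed by the mean once n - 1 values are chosen."""
    if len(free_values) != n - 1:
        raise ValueError("exactly n - 1 values can be chosen freely")
    # The n values must sum to n * mean, so the last one is what is left over.
    return n * mean - sum(free_values)

# First example from the text: mean 18, a = 10, so b is forced.
b = last_value([10], mean=18, n=2)
print(b)
```

With a = 10 and a mean of 18, the function returns 26, matching the value of b worked out in the text.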

Using the t Distribution Tables

• The t table differs in construction from the normal table in that it is more compact. It shows areas under the curve and t values for a limited number of levels of significance (usually .01, .05, .10). t values are therefore defined by level of significance and degrees of freedom.

• A second difference is that we must specify the degrees of freedom with which we are dealing. Suppose we are making an estimate for n = 14 at the 90% level of confidence. We would go down vertically to the row for the appropriate degrees of freedom (i.e. 13) and then read off the t value in the column for a level of significance of .10.

• The normal tables focus on the chance that the sample statistic lies within a given number of standard deviations on either side of the population mean. The t distribution tables, on the other hand, measure the chance that the observed sample statistic will lie outside our confidence interval, defined by a given number of standard errors on either side of the mean. A t value of 1.771 shows that if we mark off plus and minus 1.771 $\hat{\sigma}_{\bar{x}}$ on either side of the mean then we enclose 90% of the area under the curve. The area outside these limits, i.e., that of chance error, will be 10%. This is shown in Figure 2 below. Thus if we are making an estimate at the 90% confidence level we would look in the t table under the .10 column (1.0 − .9 = .1). This is α, the probability of error.

LESSON 23: TESTS OF HYPOTHESES – SMALL SAMPLES

Figure 2
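To make the interval interpretation concrete, here is a minimal sketch that marks off plus and minus t standard errors around a sample mean. The sample mean of 50 and standard deviation of 7 are made-up illustration values, not data from the lesson; only n = 14 and t = 1.771 come from the text. The function name is also an assumption.

```python
import math

def t_confidence_interval(sample_mean, sample_sd, n, t_value):
    """Interval of +/- t_value estimated standard errors around the mean."""
    standard_error = sample_sd / math.sqrt(n)  # estimated standard error of the mean
    margin = t_value * standard_error
    return sample_mean - margin, sample_mean + margin

# Hypothetical sample: mean 50, standard deviation 7, n = 14 (13 degrees of freedom).
# 1.771 is the table t value for 13 degrees of freedom in the .10 column.
low, high = t_confidence_interval(sample_mean=50.0, sample_sd=7.0, n=14, t_value=1.771)
print(round(low, 2), round(high, 2))
```

The interval encloses 90% of the area under the t curve, leaving 5% of chance error in each tail.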

Reading the t table
A sample excerpt from the t table is presented below in Table 1. We can use it to read off t values for different levels of significance and degrees of freedom.

Table 1

Example
For the following sample sizes and significance levels find the appropriate t values:
1. n = 28, α = .05 → degrees of freedom = 28 − 1 = 27, t = ±2.052
2. n = 10, 99% confidence → degrees of freedom = 9, t = ±3.250

Exercise

1. Find t values for the following:
• n = 13, 90%
• n = 25, 95%
2. Given the following sample sizes and t values, find the corresponding confidence levels:
• n = 27, t = ±2.056
• n = 5, t = ±2.132
• n = 18, t = ±2.898

t Values for One Tailed Tests
The procedure for using the t distribution in a one-tailed test is conceptually the same as for a one-tailed normal test. However, the t tables usually give the area in both tails combined at a specific level of significance. For a one-tailed t test, we need to determine the area located in only one tail. For example, to find the appropriate t value for a one-tailed test at a level of significance of .05 with 12 degrees of freedom, we look in the table under the .10 column opposite 12 degrees of freedom. The t value is 1.782. This is because the .10 column represents .10 of the area contained in both tails combined, and therefore .05 of the area contained in each tail separately.
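The lookup rule for one-tailed and two-tailed tests can be sketched as a small table lookup. The excerpt below is an assumption: its values are taken from standard published t tables rather than from the lesson's Table 1, and the names `T_TABLE` and `critical_t` are illustrative.

```python
# Two-tailed t values from standard published t tables (assumed excerpt,
# not the lesson's Table 1). Keys: degrees of freedom, then combined tail area.
T_TABLE = {
    9:  {0.10: 1.833, 0.05: 2.262, 0.02: 2.821, 0.01: 3.250},
    12: {0.10: 1.782, 0.05: 2.179, 0.02: 2.681, 0.01: 3.055},
    13: {0.10: 1.771, 0.05: 2.160, 0.02: 2.650, 0.01: 3.012},
    19: {0.10: 1.729, 0.05: 2.093, 0.02: 2.539, 0.01: 2.861},
    25: {0.10: 1.708, 0.05: 2.060, 0.02: 2.485, 0.01: 2.787},
}

def critical_t(n, alpha, tails=2):
    """Look up the critical t value for sample size n at significance alpha."""
    df = n - 1
    # The table columns hold the area in both tails combined, so a
    # one-tailed test at alpha uses the column headed 2 * alpha.
    column = round(alpha if tails == 2 else 2 * alpha, 2)
    return T_TABLE[df][column]

print(critical_t(13, 0.05, tails=1))  # one-tailed, 12 degrees of freedom
print(critical_t(14, 0.10))           # two-tailed, 13 degrees of freedom
```

A lookup for a combination that is not in the excerpt raises a `KeyError`, which mirrors the fact that a printed table only covers a limited set of significance levels.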


Exercise
Find the one-tailed t value for n = 13, α = .05 → degrees of freedom = 12. For a one-tailed test we need to look up the value under the .10 column: t = 1.782 (positive or negative according to the tail being tested).
Find one-tailed t values for the following:

• n = 10, α = .01
• n = 15, α = .05

Hypothesis Testing Using the t Distribution
The procedure for hypothesis testing using the t test is very similar to that followed for the normal test. Instead of calculating the z statistic we calculate a t statistic. The formula for the t statistic is

$$t = \frac{\bar{x} - \mu}{\hat{\sigma}_{\bar{x}}}$$

where $\hat{\sigma}_{\bar{x}}$ is the estimated standard error of the sample mean.

The t test is the appropriate test to use when the population standard deviation is not known and has to be estimated by the sample standard deviation:

$$\hat{\sigma} = s$$

where s is the sample standard deviation, so that

$$\hat{\sigma}_{\bar{x}} = \frac{\hat{\sigma}}{\sqrt{n}}$$

This represents the basic t test. Variants of this formula are developed to meet the requirements of different testing situations. We shall look briefly at the more common types of problems. As the theoretical basis of hypothesis testing is the same as for the normal distribution and has been dealt with in detail in the last chapter, we shall focus on applications of the t test to various situations.

1. Hypothesis testing of means
The t test is used when:
1. the sample size is less than 30, or
2. the population standard deviation is not known and has to be estimated by the sample standard deviation.
3. When the population is finite and the sample accounts for more than 5% of the population, we use the finite population multiplier and the formula for the standard error is modified to:

$$\hat{\sigma}_{\bar{x}} = \frac{\hat{\sigma}}{\sqrt{n}} \sqrt{\frac{N - n}{N - 1}}$$

where N is the population size.
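The finite population adjustment can be sketched as follows. The function name `standard_error` and the population sizes used in the example calls are assumptions for illustration; the 5% threshold is the one stated in the text.

```python
import math

def standard_error(sample_sd, n, population_size=None):
    """Estimated standard error of the mean, with the finite population
    multiplier applied when the sample exceeds 5% of the population."""
    se = sample_sd / math.sqrt(n)
    if population_size is not None and n / population_size > 0.05:
        se *= math.sqrt((population_size - n) / (population_size - 1))
    return se

# Hypothetical populations: 200 (sample is 10% of it) and 10000 (0.2%).
print(round(standard_error(11, 20), 4))
print(round(standard_error(11, 20, population_size=200), 4))
print(round(standard_error(11, 20, population_size=10000), 4))
```

The multiplier is always below 1, so the adjusted standard error is smaller than the unadjusted one whenever it applies.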

Two Tailed Test:
The specification of the null and alternative hypotheses is similar to that for the normal distribution.
Ho: µ = µo
Ha: µ ≠ µo
This is tested at a prespecified level of significance. The t statistic is

$$t = \frac{\bar{x} - \mu}{\hat{\sigma}_{\bar{x}}}$$

with µ set to the hypothesized value µo.

The calculated t value should be compared with the table t value. If the absolute value of the calculated t is less than t critical, we accept the null hypothesis that there is no significant difference between the sample mean and the hypothesized population mean. If the absolute value of the calculated t is greater than t critical, we reject the null hypothesis at the given level of significance.

An example shall make the process clearer:
A personnel specialist in a corporation is recruiting a large number of employees for an overseas assignment. She believes the mean aptitude score is likely to be 90. A management review finds the mean score for 20 test results to be 84, with a standard deviation of 11. Management wishes to test, at the .10 level of significance, the hypothesis that the average aptitude score is 90.

Our data are as follows:
Ho: µ = 90
Ha: µ ≠ 90
α = .10, n = 20
As we can see, this represents a two-tailed test.
Degrees of freedom = 19
To find t critical we look in the t table under the .10 column, which gives the t value for .05 in each tail of the t curve: t = 1.729.
As the population standard deviation is not known, we estimate it:

$$\hat{\sigma} = s = 11$$

where s is the sample standard deviation. The standard error of the sample mean is

$$\hat{\sigma}_{\bar{x}} = \frac{\hat{\sigma}}{\sqrt{n}} = \frac{11}{\sqrt{20}} = 2.46$$

$$t = \frac{\bar{x} - \mu}{\hat{\sigma}_{\bar{x}}} = \frac{84 - 90}{2.46} = -2.44$$

Therefore, since −2.44 < −1.729, we reject the personnel specialist's hypothesis that the true mean score of the employees being tested is 90. This is also illustrated diagrammatically in Figure 3.
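The arithmetic of this example can be reproduced with a short sketch. The function name `one_sample_t` is illustrative; the inputs (sample mean 84, hypothesized mean 90, standard deviation 11, n = 20) and the critical value 1.729 are the ones given in the example.

```python
import math

def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """t statistic for testing a sample mean against a hypothesized mean."""
    standard_error = sample_sd / math.sqrt(n)  # estimated standard error of the mean
    return (sample_mean - hypothesized_mean) / standard_error

t_stat = one_sample_t(sample_mean=84, hypothesized_mean=90, sample_sd=11, n=20)
t_critical = 1.729  # table value: 19 degrees of freedom, .10 column (two-tailed)

# Two-tailed test: reject the null hypothesis when |t| exceeds t critical.
reject_null = abs(t_stat) > t_critical
print(round(t_stat, 2), reject_null)
```

Comparing the absolute value of t with the critical value handles both tails at once, which is why the negative statistic here still leads to rejection.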

Figure 3: t distribution with .05 of the area in each tail, critical values −1.729 and 1.729, and the calculated t value −2.44

Exercises

1. Given a sample mean of 83, a sample standard deviation of 12.5 and a sample size of 22, test the hypothesis that the value of the population mean is 70 against the alternative hypothesis that it is more than 70. Use the 0.025 significance level.

2. If a sample of 25 observations reveals a sample mean of 52 and a sample variance of 4.2, test the hypothesis that the population mean is 65 against the alternative hypothesis that it is some other value. Use the .01 level of significance.

3. Picosoft, Ltd., a supplier of operating system software for personal computers, was planning the initial public offering of its stock in order to raise sufficient working capital to finance the development of a new seventh-generation integrated system. With current earnings of $1.61 a share, Picosoft and its underwriters were contemplating an offering price of $21, or about 13 times earnings. In order to check the appropriateness of this price, they randomly chose seven publicly traded software firms and found that their average price/earnings ratio was 11.6, and the sample standard deviation was 1.3. At α = .02, can Picosoft conclude that the stocks of publicly traded software firms have an average P/E ratio that is significantly different from 13?

4. The data-processing department at a large life insurance company has installed new color video display terminals to replace the monochrome units it previously used. The 95 operators trained to use the new machines averaged 7.2 hours before achieving a satisfactory level of performance. Their sample variance was 16.2 squared hours. Long experience with operators on the old monochrome terminals showed that they averaged 8.1 hours on the machines before their performances were satisfactory. At the 0.01 significance level, should the supervisor of the department conclude that the new terminals are easier to learn to operate?

Tests for Differences Between Means – Small Samples
Again, broadly the procedure for testing whether the means from two different samples are significantly different from each other is the same as for the large sample case. The differences are, first, in the calculation of the standard error and, secondly, in the calculation of the degrees of freedom.

Degrees of Freedom
In the earlier case, where we tested the sample against a hypothesized population value, we used a t distribution with n − 1 degrees of freedom. In this case we have n1 − 1 degrees of freedom for sample 1 and n2 − 1 for sample 2. When we combine the samples to estimate the pooled variance we have n1 + n2 − 2 degrees of freedom. Thus, for example, if n1 = 10 and n2 = 12 the combined degrees of freedom = 20.

Estimation of the Standard Error of the Difference Between Two Means
In the large sample case we estimated the unknown population variances $\sigma_1^2$ and $\sigma_2^2$ by $s_1^2$ and $s_2^2$. This is not appropriate for small samples. Instead, we assume the underlying population variances are equal, $\sigma_1^2 = \sigma_2^2$, and estimate the common population variance as a weighted average of $s_1^2$ and $s_2^2$, where the weights are the numbers of degrees of freedom in each sample:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

Once we have our estimate of the population variance we can use it to determine the standard error of the difference between two sample means, i.e. we get an equation for the estimated standard error of $\bar{x}_1 - \bar{x}_2$:

$$\hat{\sigma}_{\bar{x}_1 - \bar{x}_2} = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

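The null and alternative hypotheses in this case are
Ho: µ1 = µ2
Ha: µ1 ≠ µ2
and the t statistic is

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\hat{\sigma}_{\bar{x}_1 - \bar{x}_2}}$$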

An example will help make this clearer:
A company investigates two programmes for improving the sensitivity of its managers. One was a more informal one, whereas the second involved more formal classroom instruction. The informal programme is more expensive and the president wants to know, at the .05 level of significance, whether this expenditure has resulted in greater sensitivity.
Twelve managers were observed for the first method and 15 for the second. The sample data are as follows:

Programme   Mean sensitivity index   No. of managers observed   Estimated standard deviation of sensitivity
1           92%                      12                         15%
2           84%                      15                         19%

Ho: µ1 = µ2
Ha: µ1 > µ2

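The next step is to calculate the estimate of the population variance:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(12 - 1)(15)^2 + (15 - 1)(19)^2}{12 + 15 - 2} = 301.16, \qquad s_p = 17.35$$

$$\hat{\sigma}_{\bar{x}_1 - \bar{x}_2} = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} = 17.35 \sqrt{\frac{1}{12} + \frac{1}{15}} = 6.72$$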

We then calculate the t statistic for the difference between two means:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\hat{\sigma}_{\bar{x}_1 - \bar{x}_2}} = \frac{92 - 84}{6.72} = 1.19$$

Since it is a one-tailed test at the .05 level of significance, we look in the .10 column against 25 degrees of freedom: t critical at the .05 level of significance = 1.708. Since calculated t < t critical, we accept the null hypothesis: the sample does not show that the first programme results in significantly greater sensitivity than the second.
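The pooled-variance calculation in this example can be checked with a short sketch. The function name `pooled_t` is illustrative; the inputs are the sample sizes and standard deviations from the example, with the means 92 and 84 that appear in the worked calculation, and 1.708 is the critical value quoted in the text.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled variance, standard error, t statistic and degrees of freedom
    for the difference between two small-sample means (equal variances assumed)."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances, weighted by degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var) * math.sqrt(1 / n1 + 1 / n2)
    t = (mean1 - mean2) / se
    return pooled_var, se, t, df

pooled_var, se, t, df = pooled_t(92, 15, 12, 84, 19, 15)
t_critical = 1.708  # table value: 25 degrees of freedom, one-tailed .05

print(round(pooled_var, 2), round(se, 2), round(t, 2), df)
print("reject Ho" if t > t_critical else "accept Ho")
```

Because the alternative hypothesis is µ1 > µ2, only a large positive t counts as evidence against the null hypothesis, so the comparison is made without taking an absolute value.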


Exercises

1. A consumer research organization routinely selects several car models each year and evaluates their fuel efficiency. In this year's study of two small cars it was found that the average mileage for 12 cars of brand A was 27.2 km per litre with a standard deviation of 3.8 km per litre. Nine brand B cars were tested and they averaged 32.1 km per litre with a standard deviation of 4.3 km per litre. At α = .01, should the survey conclude that brand A cars have lower mileage than brand B cars?

2. Connie Rodrigues, the Dean of Students at Mid State College, is wondering about grade distributions at the school. She has heard grumbling that the GPAs in the Business School are about 0.25 lower than those in the College of Arts and Sciences. A quick random sampling produced the following GPAs.

Business: 2.86 2.77 3.18 2.80 3.14 2.87 3.19 3.24 2.91 3.00
Arts & Sciences: 2.83 3.35 3.32 3.36 3.63 3.41 3.37 3.45 3.43 3.44 3.17 3.26 3.18

Do these data indicate that there is a factual basis for the grumbling? State and test appropriate hypotheses at α = 0.02.

3. A credit-insurance organization has developed a new high-tech method of training new sales personnel. The company sampled 16 employees who were trained the original way and found average daily sales to be $688 and the sample standard deviation was $32.63. They also sampled 11 employees who were trained using the new method and found average daily sales to be $706 and the sample standard deviation was $24. At α = 0.05, can the company conclude that average daily sales have increased under the new plan?

4. To celebrate their first anniversary, Randy Nelson decided to buy diamond earrings for his wife Debbie. He was shown nine pairs with marquise gems weighing approximately 2 carats per pair. Because of differences in the colors and qualities of the stones, the prices varied from set to set. The average price was $2,990, and the sample standard deviation was $370. He also looked at six pairs with pear-shaped stones of the same 2-carat approximate weight. These earrings had an average price of $3,065, and the standard deviation was $805. On the basis of this evidence, can Randy conclude (at a significance level of 0.05) that pear-shaped diamonds cost more, on average, than marquise diamonds?

References
Levin and Rubin, Statistics for Management