Chapter 2 Data Presentation Using Descriptive Graphs

Upload: lester-lyons

Post on 02-Jan-2016


2.1 Frequency Distributions

A frequency distribution is a tabulation of data obtained by dividing the data into classes and counting the number of data points (or their fraction of the total) that fall within each class.

Example 2.1.1: Grades on Business Statistics Exam

Classes (Exam Score)    Frequency (Number of Students)
Below 50                 28
50 and under 60          30
60 and under 70          36
70 and under 80          20
80 and under 90          90
90 and over              16
Total                   220

Constructing a Frequency Distribution

1. Gather the sample data.
2. Arrange the data in an ordered array.
   Ascending order: lowest to highest.
   Descending order: highest to lowest.
3. Select the number of classes, K, to be used. There is no “correct” number of classes.
4. Determine the class width, CW:

   $CW = \dfrac{\text{Range}}{K} = \dfrac{H - L}{K}$

   where H = highest value and L = lowest value. CW is rounded up or down to a value that is easy to interpret.
5. Determine the class limits for each class.
6. Count the number of data values in each class (the class frequencies).
7. Summarize the class frequencies in a frequency distribution table.

Constructing a Frequency Distribution (cont.)

Example 2.1.2: Frequency Distribution for Continuous Data

Fifty starting salaries for business majors at Bellaire College

Raw Data

41.5 39.4 40.9 35.9 37.4

39.5 40.3 39.3 41.6 36.6

41.1 35.7 43.7 37.0 41.3

40.6 38.0 42.4 35.7 41.4

39.2 36.8 39.3 43.8 38.5

43.0 36.3 35.6 36.2 38.1

34.8 38.1 35.7 36.5 39.5

37.9 34.3 36.8 33.8 35.0

37.8 38.7 37.2 32.8 38.2

37.0 39.7 38.8 35.2 36.2

Constructing a Frequency Distribution (cont.)

Example 2.1.2: Arrange the data in an ordered array

Ordered Array

32.8 33.8 34.3 34.8 35.0

35.2 35.6 35.7 35.7 35.7

35.9 36.2 36.2 36.3 36.5

36.6 36.8 36.8 37.0 37.0

37.2 37.4 37.8 37.9 38.0

38.1 38.1 38.2 38.5 38.7

38.8 39.2 39.3 39.3 39.4

39.5 39.5 39.7 40.3 40.6

40.9 41.1 41.3 41.4 41.5

41.6 42.4 43.0 43.7 43.8

Constructing a Frequency Distribution (cont.)

Example 2.1.2: Number of classes, K = 6

Class width:

$CW = \dfrac{H - L}{K} = \dfrac{43.8 - 32.8}{6} = 1.83 \approx 2$

Class Number    Class              Frequency
1               32 and under 34     2
2               34 and under 36     9
3               36 and under 38    13
4               38 and under 40    14
5               40 and under 42     8
6               42 and under 44     4
Total                              50
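As a rough check of Example 2.1.2, the construction steps can be sketched in Python. The rounded class width of 2 and the starting lower limit of 32 are taken from the example's table; the variable names are illustrative only and are not part of the original slides.

```python
# Fifty starting salaries from Example 2.1.2 (raw data).
salaries = [
    41.5, 39.4, 40.9, 35.9, 37.4,
    39.5, 40.3, 39.3, 41.6, 36.6,
    41.1, 35.7, 43.7, 37.0, 41.3,
    40.6, 38.0, 42.4, 35.7, 41.4,
    39.2, 36.8, 39.3, 43.8, 38.5,
    43.0, 36.3, 35.6, 36.2, 38.1,
    34.8, 38.1, 35.7, 36.5, 39.5,
    37.9, 34.3, 36.8, 33.8, 35.0,
    37.8, 38.7, 37.2, 32.8, 38.2,
    37.0, 39.7, 38.8, 35.2, 36.2,
]

K = 6                                  # number of classes chosen in the example
low, high = min(salaries), max(salaries)
raw_width = (high - low) / K           # (43.8 - 32.8) / 6, about 1.83
width = 2                              # rounded to an easy-to-interpret value
start = 32                             # lower limit of the first class (from the table)

# "lower and under upper" means lower <= x < upper.
classes = [(start + i * width, start + (i + 1) * width) for i in range(K)]
frequencies = [sum(lo <= x < hi for x in salaries) for lo, hi in classes]

for (lo, hi), f in zip(classes, frequencies):
    print(f"{lo} and under {hi}: {f}")
print("Total:", sum(frequencies))
```

Choosing a different starting limit or width would give a different, equally valid table, which is the sense in which there is no "correct" number of classes.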

Constructing a Frequency Distribution (cont.)

Example 2.1.3: Frequency Distribution for Discrete Data and Categorical Data

Discrete data:

Class Number    Class      Frequency
1               4 - 6       9
2               7 - 9      10
3               10 - 12     8
4               13 - 15     8
5               16 - 18     6
6               19 - 21     9
Total                      50

Categorical data:

Class Number    Class         Frequency
1               Accounting    26
2               IT            10
3               Marketing     14
Total                         50

Frequency Distributions

Relative Frequency Distribution: The ratio of each class frequency to the total number of data points in a frequency distribution.

Cumulative Frequency Distribution: The cumulative frequency corresponding to the upper limit of any class is the total frequency of all values less than that upper limit.

Relative Cumulative Frequency Distribution: The ratio of the cumulative frequency of each class to the total number of data points in a frequency distribution.

Frequency Distributions (cont.)

Example 2.1.4: Frequency Distributions

Class              Frequency    Relative Frequency    Cumulative Frequency    Relative Cumulative Frequency
32 and under 34     2           0.04                   2                      0.04
34 and under 36     9           0.18                  11                      0.22
36 and under 38    13           0.26                  24                      0.48
38 and under 40    14           0.28                  38                      0.76
40 and under 42     8           0.16                  46                      0.92
42 and under 44     4           0.08                  50                      1.00
Total              50           1.00
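The three derived columns of Example 2.1.4 follow directly from the class frequencies. A minimal Python sketch, using the frequencies from Example 2.1.2 and illustrative variable names:

```python
from itertools import accumulate

frequencies = [2, 9, 13, 14, 8, 4]     # class frequencies from Example 2.1.2
n = sum(frequencies)                   # total number of data points (50)

relative = [f / n for f in frequencies]           # relative frequency
cumulative = list(accumulate(frequencies))        # running total of frequencies
cum_relative = [c / n for c in cumulative]        # relative cumulative frequency

for f, r, c, cr in zip(frequencies, relative, cumulative, cum_relative):
    print(f"{f:>2}  {r:.2f}  {c:>2}  {cr:.2f}")
```

The last cumulative frequency always equals the total count, and the last relative cumulative frequency always equals 1.00, which makes a quick sanity check on any such table.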

Comments on Frequency Distribution

Outliers: Very small or very large numbers quite unlike the remaining data values.

Open-ended Classes: Classes with only one stated limit, such as “Below 50” and “90 and over” in the example below.

Example 2.1.1 (Revisited): Grades on Business Statistics Exam

Classes (Exam Score)    Frequency (Number of Students)
Below 50                 28
50 and under 60          30
60 and under 70          36
70 and under 80          20
80 and under 90          90
90 and over              16
Total                   220

Comments on Frequency Distribution (cont.)

Class Limits: The lowest and highest values describing a class (the lower limit and the upper limit).

Class Midpoints (also called Class Marks): Values in the center of the classes.

Example 2.1.5: Finding Class Midpoints

Class              Class Midpoint      Frequency
32 and under 34    (32+34)/2 = 33       2
34 and under 36    (34+36)/2 = 35       9
36 and under 38    (36+38)/2 = 37      13
38 and under 40    (38+40)/2 = 39      14
40 and under 42    (40+42)/2 = 41       8
42 and under 44    (42+44)/2 = 43       4

Class Midpoints - Example

Three midpoints of adjoining classes in a frequency distribution are 16.5, 19.5, and 22.5. How wide are the classes?

Note: In a frequency distribution, all classes usually have the same class width unless we have open-ended classes to accommodate outliers.

The three adjoining classes and their midpoints can be shown in frequency distribution form:

Class            Class Midpoint    Frequency
:                :                 :
A and under B    16.5              :
B and under C    19.5              :
C and under D    22.5              :
:                :                 :

The class limits and midpoints can also be placed on a line:

A    16.5    B    19.5    C    22.5    D

If we know A and B, or B and C, or C and D, we can get the class width. A-B is a class, B-C is a class, and C-D is a class. Since all classes have the same class width, B is equidistant from 16.5 and 19.5, and C is equidistant from 19.5 and 22.5. We use B and C because each one is enclosed by two known midpoints. B is therefore (16.5 + 19.5)/2 = 18, and C is (19.5 + 22.5)/2 = 21. So class width = 21 - 18 = 3. Remember, class width is simply the difference between the upper limit and the lower limit of a class.
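The midpoint argument can be verified with a few lines of Python. This is only a sketch under the stated assumption that all three classes share the same width; the variable names are illustrative.

```python
midpoints = [16.5, 19.5, 22.5]   # midpoints of three adjoining classes

# With equal class widths, each shared class limit lies halfway
# between the two midpoints on either side of it.
limits = [(a + b) / 2 for a, b in zip(midpoints, midpoints[1:])]
B, C = limits                    # 18.0 and 21.0

class_width = C - B              # upper limit minus lower limit of class B-C
print(B, C, class_width)
```

The same width also appears as the spacing between adjacent midpoints (19.5 - 16.5), which gives a second way to reach the answer.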

Using Excel

KPK Data Analysis > Quantitative Data Charts/Tables > Histogram/Freq. Charts.

Frequency Distribution Table

CLASS    CLASS LIMITS       FREQUENCY    RELATIVE FREQ    CUMULATIVE FREQ    CUM REL FREQ
1        32 and under 34     2           0.04              2.00              0.04
2        34 and under 36     9           0.18             11.00              0.22
3        36 and under 38    13           0.26             24.00              0.48
4        38 and under 40    14           0.28             38.00              0.76
5        40 and under 42     8           0.16             46.00              0.92
6        42 and under 44     4           0.08             50.00              1.00
TOTAL                       50

2.2 Histograms and Stem-and-Leaf Diagrams

Histogram

A histogram is a graphical representation of a frequency distribution for continuous data.

It is drawn by putting class limits on the X-axis and frequencies on the Y-axis.

It describes the shape of the data.

Relative Frequency Histogram: Constructed using relative frequencies rather than frequencies.
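As an informal illustration, the frequency histogram for the salary data of Example 2.1.2 can be approximated in plain text, one bar per class. This is only a sketch: the bars run horizontally rather than vertically, and the `#` symbol and variable names are arbitrary choices, not part of the original slides.

```python
labels = [
    "32 and under 34", "34 and under 36", "36 and under 38",
    "38 and under 40", "40 and under 42", "42 and under 44",
]
frequencies = [2, 9, 13, 14, 8, 4]   # class frequencies from Example 2.1.2

# One text bar per class; bar length equals the class frequency.
lines = [f"{label:<16} {'#' * f} ({f})" for label, f in zip(labels, frequencies)]
print("\n".join(lines))

# A relative frequency histogram has the same shape; only the scale changes.
total = sum(frequencies)
relative = [f / total for f in frequencies]
```

Because every relative frequency is the frequency divided by the same total, the two histograms differ only in the labeling of the Y-axis.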

[Figure: Frequency Histogram. X-axis: Class Limits, from "32 and under 34" to "42 and under 44". Y-axis: frequency, 0 to 16.]

[Figure: Relative Frequency Histogram. X-axis: Class Limits, from "32 and under 34" to "42 and under 44". Y-axis: relative frequency, 0.00 to 0.30.]

Stem-and-Leaf Diagrams

A stem-and-leaf diagram summarizes reasonably sized data sets (under 150 values as a general rule) without loss of information.

Each observation is represented by a stem to the left of a vertical line and a leaf to the right of the vertical line.

The leaf for each observation is generally the last digit (or possibly the last two digits) of the data value, with the stem consisting of the remaining first digits.

Example 2.2.1: Constructing Stem-and-Leaf Diagrams

Reports of the after-tax profits of 12 companies (recorded as cents per dollar of revenue) are as follows:

3.4, 4.5, 2.3, 2.7, 3.8, 5.9, 3.4, 4.7, 2.4, 4.1, 3.6, 5.1

Stem | Leaf (unit = 0.1)
2    | 3 4 7
3    | 4 4 6 8
4    | 1 5 7
5    | 1 9

What percentage of the companies have after-tax profits of more than 4.5 cents per dollar of revenue?

What is the range of these data in cents?
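The diagram in Example 2.2.1 and its two follow-up questions can be checked with a short Python sketch. It assumes one decimal place per value, so the stem is the whole-number part and the leaf is the tenths digit; the variable names are illustrative.

```python
from collections import defaultdict

profits = [3.4, 4.5, 2.3, 2.7, 3.8, 5.9, 3.4, 4.7, 2.4, 4.1, 3.6, 5.1]

# Split each value into stem (units digit) and leaf (tenths digit).
stems = defaultdict(list)
for value in sorted(profits):
    tenths = round(value * 10)          # e.g. 3.4 -> 34
    stem, leaf = divmod(tenths, 10)     # 34 -> stem 3, leaf 4
    stems[stem].append(leaf)

for stem in sorted(stems):
    print(f"{stem} | {' '.join(str(leaf) for leaf in stems[stem])}")

# Share of companies strictly above 4.5 cents per dollar of revenue.
share_above = sum(v > 4.5 for v in profits) / len(profits)

# Range: highest value minus lowest value, in cents.
data_range = round(max(profits) - min(profits), 1)

print(f"{share_above:.0%} above 4.5; range = {data_range}")
```

Three of the twelve values (4.7, 5.1, 5.9) exceed 4.5, and the extremes 2.3 and 5.9 can be read straight off the first and last leaves of the diagram.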

2.3 Frequency Polygons

Constructed by connecting the centers of the tops of the histogram bars (located at the class midpoints) with a series of straight lines.

Relative Frequency Polygons use relative frequencies rather than frequencies.
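The points joined by a frequency polygon can be listed explicitly. The sketch below uses the salary classes of Example 2.1.2. Closing the polygon on the X-axis one class width beyond the first and last midpoints (at 31 and 45) is an assumption here; it is a common convention and is consistent with the 31-to-45 midpoint axis of the charts in this section.

```python
lower_limits = [32, 34, 36, 38, 40, 42]   # lower class limits, Example 2.1.2
width = 2                                 # class width
frequencies = [2, 9, 13, 14, 8, 4]

# Class midpoints: 33, 35, ..., 43.
midpoints = [lo + width / 2 for lo in lower_limits]

# Assumption: anchor the polygon at zero frequency one class width
# before the first midpoint and one class width after the last.
vertices = [(midpoints[0] - width, 0)]
vertices += list(zip(midpoints, frequencies))
vertices.append((midpoints[-1] + width, 0))

# Relative frequency polygon: same X values, frequencies divided by the total.
total = sum(frequencies)
relative_vertices = [(x, f / total) for x, f in vertices]

print(vertices)
```

Plotting `vertices` as a connected line gives the frequency polygon; plotting `relative_vertices` gives the relative frequency polygon.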

[Figure: Frequency Polygon. X-axis: Midpoint, 31 to 45. Y-axis: frequency, 0 to 16.]

[Figure: Relative Frequency Polygon. X-axis: Midpoint, 31 to 45. Y-axis: relative frequency, 0 to 0.3.]

Frequency Polygons (cont.)

Better than histograms for comparing the shape of two (or more) different frequency distributions.

[Figure: Frequency polygons for two groups of employees, "College degree" and "No college degree". X-axis: Annual salaries (thousands of dollars), 10 to 100. Y-axis: Number of employees.]

Frequency Polygons (cont.)

For open-ended classes, place a footnote at each open-ended class location indicating the frequency of that class.

[Figure: Frequency polygon with open-ended classes. X-axis: Population (thousands), 10 to 50. Y-axis: Frequency, 10 to 100.]

* 4 cities had populations of less than 10,000
** 5 cities had populations of 50,000 or greater

2.4 Cumulative Frequencies (Ogives)

Constructed by putting upper class limits on X-axis and cumulative frequencies (or cumulative relative frequencies) on the Y-axis.

Useful in determining what percentage of the data lies below a certain value.
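Reading a value off an ogive amounts to linear interpolation between the plotted points. The sketch below uses the relative cumulative frequencies of Example 2.1.4 and assumes the ogive starts at (32, 0); the function name `fraction_below` is hypothetical. Interpolation treats values as evenly spread within each class, so the result is an estimate rather than an exact count.

```python
from bisect import bisect_right

# Ogive points: class limits on the X-axis, relative cumulative
# frequency on the Y-axis. The starting point (32, 0) is an assumption.
limits = [32, 34, 36, 38, 40, 42, 44]
cum_rel = [0.00, 0.04, 0.22, 0.48, 0.76, 0.92, 1.00]

def fraction_below(x):
    """Estimate the fraction of data below x by linear interpolation."""
    if x <= limits[0]:
        return 0.0
    if x >= limits[-1]:
        return 1.0
    i = bisect_right(limits, x) - 1        # class segment containing x
    x0, x1 = limits[i], limits[i + 1]
    y0, y1 = cum_rel[i], cum_rel[i + 1]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

print(f"{fraction_below(39):.0%} of salaries are estimated to lie below 39")
```

For x = 39 the estimate falls halfway between 0.48 (at 38) and 0.76 (at 40).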

[Figure: Relative Frequency Ogive. X-axis: Upper Class Limit, 32 to 44. Y-axis: cumulative relative frequency, 0 to 1.2.]

2.5 Bar Charts

Bar charts are used for graphical representation of nominal and ordinal data.

As with a histogram, the height of each bar is proportional to the number of values in the category.

[Figure: Bar chart of number of majors by department. Bars: Accounting 26, Information systems 10, Marketing 14. Y-axis: Number of majors, 5 to 30.]

2.6 Pie Charts

The pie chart is an alternative to the bar chart for nominal and ordinal data.

The proportion of the pie allotted to each category represents the category’s percentage in the population or sample.

[Figure: Pie chart, "Business Students by Dept." Accounting 35%, Management 33%, Marketing 19%, IT 13%.]

Dept.         # of Students
Accounting    26
IT            10
Marketing     14
Management    25
Total         75
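The percentages on the pie chart follow from the student counts in the table. A minimal Python sketch, with illustrative variable names; the slice angles are an added detail showing how each share maps onto the 360 degrees of the pie.

```python
students = {"Accounting": 26, "IT": 10, "Marketing": 14, "Management": 25}
total = sum(students.values())           # 75 students in all

# Each category's share of the total, as a whole-number percentage.
percentages = {dept: round(100 * n / total) for dept, n in students.items()}

# Central angle of each slice, in degrees.
angles = {dept: 360 * n / total for dept, n in students.items()}

for dept in students:
    print(f"{dept:<11} {percentages[dept]:>3}%  {angles[dept]:6.1f} degrees")
```

Rounded percentages do not always add up to exactly 100; in this example they happen to.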

Bar and Pie Charts


2.7 Deceptive Graphs

If care is not taken in constructing graphs, the graph may not properly present the data.

Also, graphs can be purposely manipulated to provide false impressions of the data.

[Figure: Graph comparing Women and Men, with labels A and B. Y-axis: Number of employees (thousands), 5 to 30.]

Deceptive Graphs (cont.)

[Figure: Two graphs of Salary (thousands of dollars) by Year (1999, 2000, 2001). Y-axis ticks at 30, 31, and 32 in both graphs; one graph also marks 0 on the axis.]