The Diminishing Rhinoceros & the Crescive Cow Exploring, Organizing, and Describing, Qualitative Data

Upload: preston-casey

Post on 28-Dec-2015


Page 1

The Diminishing Rhinoceros & the Crescive Cow

Exploring, Organizing, and Describing, Qualitative Data

Page 2

Essentials: Qualitative Data (Be able to address the following.)

• Characteristics of qualitative variables.

• Building a qualitative frequency table.

• Appropriate charts/graphs for qualitative data (and how to make them).

Page 3

What do these pictographs tell you about rhinos and cows?

Page 4

Proper presentation using a pictograph.

The Crescive Cow Revisited

Page 5

Exploring, Organizing, and Describing Data

• Before beginning to analyze data, it is important to know three things:

1. Did the data come from a sample or a population?

2. Are the data qualitative or quantitative?

3. In what measurement scale are the data reported?

• Knowing the characteristics of a variable allows one to select appropriate presentation formats and analysis procedures.

Page 6

Important Characteristics of a Data Set

• Center – an “average” value that indicates where the middle of the data is located. For qualitative data the “center” is represented by the mode, or most frequently occurring value.

• Variation – a measure of the amount that the values vary among themselves.

• Distribution – the “shape” of the distribution of data.

• Outliers – values that are far away from the majority of values.

• Time – changing characteristics of data over time.

Page 7

FREQUENCY DISTRIBUTIONS

• A Frequency Distribution represents the range over which a variable’s values occur. For qualitative data the distribution is represented by the categories (values) of the variable.

• A Qualitative Frequency Table lists the categories (values) of a variable, along with the frequency (count) of values that fall into each category and the portion of all values that each category's frequency represents, known as the relative frequency. In addition, for ordinal data, a frequency table may show cumulative frequencies and cumulative relative frequencies.

• Frequency Tables are derived from RAW DATA through the use of a TALLY process.

Page 8

Qualitative Frequency Distributions

• Qualitative or Categorical Frequency Distributions present categorical data, such as gender, hair color, or military rank.

• Each category of the variable is presented along with the frequency of its occurrence and a relative frequency.

Page 9

Building and Displaying a Qualitative Frequency Distribution (Nominal Data)

Color of m&m's in One 1.69 Oz. Bag

Color     Freq.   Rel. Freq.
Blue        10      0.175
Brown       12      0.211
Green       11      0.193
Orange       3      0.053
Red          8      0.140
Yellow      13      0.228
Totals      57      1.00

START HERE: The Tally

Data: Tally (count) occurrences of each category of the item.

Blue: 11111 11111 = 10

Brown: 11111 11111 11 = 12

Green: 11111 11111 1 = 11

Orange: 111 = 3

Red: 11111 111 = 8

Yellow: 11111 11111 111 = 13

Next build the table:

List the values of the variable (here, color), followed by columns for the frequency of occurrence and the relative frequency.

Relative freq. = the number of items in a category / the total number of items. For example, the relative frequency of green candies out of all 57 candies is 11/57 = .193.

(NOTE: For ordinal data, add cumulative frequency and cumulative relative frequency columns, as with quantitative tables; example on next slide.)
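The tally-then-table steps on this slide can be sketched in Python. This is only an illustration: the raw data list is rebuilt from the slide's tally counts, and the function name `frequency_table` is an assumption introduced here, not something from the slides.

```python
from collections import Counter

# Raw data: one entry per candy, rebuilt from the tally counts on the slide.
raw_data = (
    ["Blue"] * 10 + ["Brown"] * 12 + ["Green"] * 11
    + ["Orange"] * 3 + ["Red"] * 8 + ["Yellow"] * 13
)

def frequency_table(values):
    """Return (category, frequency, relative frequency) rows, sorted by category."""
    tally = Counter(values)        # the tally step: count each category
    total = sum(tally.values())    # total number of items (57 here)
    return [(cat, n, n / total) for cat, n in sorted(tally.items())]

table = frequency_table(raw_data)

print(f"{'Color':<8}{'Freq.':>6}{'Rel. Freq.':>12}")
for color, freq, rel in table:
    print(f"{color:<8}{freq:>6}{rel:>12.3f}")

total_freq = sum(freq for _, freq, _ in table)
total_rel = sum(rel for _, _, rel in table)
print(f"{'Totals':<8}{total_freq:>6}{total_rel:>12.2f}")
```

Relative frequencies are kept at full precision and rounded only when printed, so the column still sums to 1.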

Page 10

Variations on a Qualitative Frequency Table

The table below presents the same m&m’s data, but in a slightly different format. The relative frequency has been replaced by the percent of all candies that each color represents. A percentage is equal to the relative frequency * 100.

Color of m&m's in One 1.69 Oz. Bag

                 Frequency   Percent
Valid   Blue         10        17.5
        Brown        12        21.1
        Green        11        19.3
        Orange        3         5.3
        Red           8        14.0
        Yellow       13        22.8
        Total        57       100.0
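As a small sketch, the percent column can be computed from the same frequencies (percent = relative frequency * 100). The function name `percent_table` is an assumption made for illustration.

```python
# Frequencies from the slide's m&m's table.
frequencies = {"Blue": 10, "Brown": 12, "Green": 11,
               "Orange": 3, "Red": 8, "Yellow": 13}

def percent_table(freqs):
    """Map each category to its percent of the total (relative frequency * 100)."""
    total = sum(freqs.values())
    return {cat: 100 * n / total for cat, n in freqs.items()}

percents = percent_table(frequencies)

# Print category, frequency, and percent rounded to one decimal place.
for color, pct in percents.items():
    print(f"{color:<8}{frequencies[color]:>4}{pct:>7.1f}")
```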

Because of the “ordering” characteristic of ordinal data, two additional columns, cumulative frequency (cf) and cumulative relative frequency (crf), are added to a table.
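The cf and crf columns can be sketched as running totals over the categories in rank order. The survey ratings below are HYPOTHETICAL data invented for illustration (the slides give no ordinal data set), and `cumulative_table` is an assumed name.

```python
from itertools import accumulate

# HYPOTHETICAL ordinal data (not from the slides): survey ratings, lowest to highest.
ordered_categories = ["Poor", "Fair", "Good", "Excellent"]
frequencies = [4, 6, 7, 3]

def cumulative_table(categories, freqs):
    """Return rows of (category, freq, rel. freq, cum. freq, cum. rel. freq)."""
    total = sum(freqs)
    running = accumulate(freqs)  # running total in category order
    return [
        (cat, f, f / total, cf, cf / total)
        for cat, f, cf in zip(categories, freqs, running)
    ]

rows = cumulative_table(ordered_categories, frequencies)

print(f"{'Rating':<10}{'f':>4}{'rf':>7}{'cf':>5}{'crf':>7}")
for cat, f, rf, cf, crf in rows:
    print(f"{cat:<10}{f:>4}{rf:>7.2f}{cf:>5}{crf:>7.2f}")
```

The running totals are meaningful only because the categories have a natural order. For nominal data such as candy color, the order of the rows is arbitrary, so cumulative columns carry no interpretation.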

Page 11

Bar Chart

The Bar Chart is a commonly used chart for the presentation of qualitative, categorical data. Within a bar chart there will be a bar for each value of the variable. The bar can represent frequency data or relative frequency data.

• In a bar chart the bars do not touch and are equal in width.

• The height of a bar represents the frequency or relative frequency of the value.

BEWARE: Bar Charts and Histograms (used for quantitative data) look similar, but are actually quite different in structure. Knowing what type of data you are working with will lead you to the correct chart type.

Color of m&m's in One 1.69 Oz. Bag

Color     Freq.   Rel. Freq.
Blue        10      0.175
Brown       12      0.211
Green       11      0.193
Orange       3      0.053
Red          8      0.140
Yellow      13      0.228
Totals      57      1.00
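A rough text-only sketch of a bar chart for these data follows. It draws horizontal bars, so bar length stands in for the bar height described above. In practice a plotting library would normally be used. The function name `text_bar_chart` is an assumption.

```python
# Frequencies from the slide's m&m's table.
frequencies = {"Blue": 10, "Brown": 12, "Green": 11,
               "Orange": 3, "Red": 8, "Yellow": 13}

def text_bar_chart(freqs, symbol="#"):
    """Return one line per category; the bar length equals the frequency."""
    width = max(len(cat) for cat in freqs)  # pad labels to a common width
    return [f"{cat:<{width}} | {symbol * n} {n}" for cat, n in freqs.items()]

lines = text_bar_chart(frequencies)
for line in lines:
    print(line)
```

Each category gets its own separate bar of equal thickness, which mirrors the rule that bars in a bar chart do not touch and are equal in width.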

Page 12

Bar Chart Variations: Bar Charts may be used to compare data. A Stacked Bar Chart will divide each bar by some other categorical variable. A Clustered Bar Chart (side by side) will provide separate bars for a variable across the values of a second variable.

[Figure: bar chart. x-axis: Month of Sale (June, July, August, September); y-axis: Count (0 to 80); legend: Location (City) of House (Arlington, Dallas, Fort Worth).]

[Figure: bar chart. x-axis: Month of Sale (June, July, August, September); y-axis: Percent (0 to 30); legend: Location (City) of House (Arlington, Dallas, Fort Worth).]

[Figure: bar chart. x-axis: Month of Sale (June, July, August, September); y-axis: Count (0 to 140); legend: Location (City) of House (Arlington, Dallas, Fort Worth).]

Stacked Bar Chart: Below, the variable month is subdivided by the location of the houses sold during the specified months.

Location (City) of House

                     Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Arlington       109        21.0        21.0              21.0
        Dallas          243        46.9        46.9              68.0
        Fort Worth      166        32.0        32.0             100.0
        Total           518       100.0       100.0

Clustered Bar Charts: Below are two clustered bar charts. Both depict information regarding the number of houses sold by month in three cities. The bar chart on the left shows where sales occurred and is based upon a count. Dallas experienced far more house sales in a selected month than either other city. In contrast, the bar chart on the right is based upon relative frequency. It shows for each city what portion (rel. freq.) of its total sales occurred each month.

Count based: Total Sales for a Month.

Rel. Freq. based: Portion of total sales for a city.
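The difference between the count-based and the relative-frequency-based views can be sketched as follows. The monthly counts are HYPOTHETICAL numbers invented for illustration (the slides give only city totals, not a month-by-city breakdown), and `within_city_percent` is an assumed name.

```python
# HYPOTHETICAL monthly sales counts per city, invented for illustration only.
sales = {
    "Arlington":  {"June": 10, "July": 20, "August": 15, "September": 5},
    "Dallas":     {"June": 40, "July": 30, "August": 20, "September": 10},
    "Fort Worth": {"June": 25, "July": 25, "August": 30, "September": 20},
}

def within_city_percent(counts_by_city):
    """For each city, convert monthly counts to percent of that city's total sales."""
    result = {}
    for city, by_month in counts_by_city.items():
        city_total = sum(by_month.values())
        result[city] = {m: 100 * n / city_total for m, n in by_month.items()}
    return result

percents = within_city_percent(sales)

for city, by_month in percents.items():
    cells = "  ".join(f"{m}: {p:5.1f}%" for m, p in by_month.items())
    print(f"{city:<11}{cells}")
```

In this made-up data, Dallas sold 40 houses in June and Arlington sold 20 in July. The counts differ, yet each is 40% of its city's total, which is the kind of contrast the two clustered charts bring out.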

Page 13

Pareto Chart

A Pareto Chart is similar to a bar chart, except the categories appear from most frequent to least frequent (left to right).

Note that there are two y-axes: one for counts and the other for the corresponding percentages.

[Chart callout: Ogive]

Color of m&m's in One 1.69 Oz. Bag

Color     Freq.   Rel. Freq.
Blue        10      0.175
Brown       12      0.211
Green       11      0.193
Orange       3      0.053
Red          8      0.140
Yellow      13      0.228
Totals      57      1.00

In some instances a Pareto chart will contain a cumulative frequency line, which represents the summed frequencies from left to right. As a separate chart the cumulative frequency line is called an Ogive (pronounced with a long “i” – Oj-Ive).
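The Pareto ordering and its cumulative line can be sketched from the m&m's frequencies: sort the categories from most to least frequent, then keep a running total. The function name `pareto_rows` is an assumption made for illustration.

```python
from itertools import accumulate

# Frequencies from the slide's m&m's table.
frequencies = {"Blue": 10, "Brown": 12, "Green": 11,
               "Orange": 3, "Red": 8, "Yellow": 13}

def pareto_rows(freqs):
    """Return (category, freq, cumulative freq, cumulative percent), most frequent first."""
    ordered = sorted(freqs.items(), key=lambda item: item[1], reverse=True)
    total = sum(freqs.values())
    running = accumulate(n for _, n in ordered)  # summed frequencies, left to right
    return [(cat, n, cf, 100 * cf / total) for (cat, n), cf in zip(ordered, running)]

rows = pareto_rows(frequencies)

print(f"{'Color':<8}{'Freq.':>6}{'Cum.':>6}{'Cum. %':>8}")
for color, freq, cum, cum_pct in rows:
    print(f"{color:<8}{freq:>6}{cum:>6}{cum_pct:>8.1f}")
```

The frequency column corresponds to the bars (read against the count axis), and the cumulative percent column corresponds to the cumulative line (read against the percentage axis).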

Page 14

Pie Chart

Pie Charts present qualitative data or grouped quantitative data. To determine the size of each slice, first find the relative frequency for a value or category. Then multiply the relative frequency by 360°. The result is the arc, measured in degrees (equivalently, the slice's central angle), that will represent the category.

Example: 11 green candies represent .193 (or 19.3%) of all candies. Multiply 360° * .193 = 69.48°, which is the measure of the arc on the circumference representing green candies.

NOTE: Arc degrees are not included on pie charts.

Arc = 69.48°

Color of m&m's in One 1.69 Oz. Bag

Color     Freq.   Rel. Freq.
Blue        10      0.175
Brown       12      0.211
Green       11      0.193
Orange       3      0.053
Red          8      0.140
Yellow      13      0.228
Totals      57      1.00
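The slice-size rule for a pie chart (relative frequency times 360 degrees) can be sketched as follows. The function name `slice_angles` is an assumption made for illustration.

```python
# Frequencies from the slide's m&m's table.
frequencies = {"Blue": 10, "Brown": 12, "Green": 11,
               "Orange": 3, "Red": 8, "Yellow": 13}

def slice_angles(freqs):
    """Map each category to its slice angle in degrees (relative frequency * 360)."""
    total = sum(freqs.values())
    return {cat: 360 * n / total for cat, n in freqs.items()}

angles = slice_angles(frequencies)

for color, angle in angles.items():
    print(f"{color:<8}{angle:>8.2f} degrees")
```

Using the unrounded fraction 11/57 gives about 69.47 degrees for green. The slide's 69.48 comes from rounding the relative frequency to .193 before multiplying. Either way, the angles for all slices add up to 360 degrees.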