1. Representation of Data


Upload: kee-tze-san

Post on 02-Feb-2016


DESCRIPTION

Statistics 1 notes for CIE A-level

TRANSCRIPT

Page 1: 1.Representation of Data

1

Representation of Data

Select a suitable way of presenting raw statistical data, and discuss advantages and/or disadvantages that particular representations may have

1.1 Introduction

Data: pieces of numerical and other information

Variable: a property that is observed or measured for each member of a sample when data are collected

Two types of variable:

a) Qualitative: non-numerical

b) Quantitative: numerical

Two types of quantitative variable:

a) Continuous: a variable which can take any value in a given range (e.g. 150-160, 160-170)

b) Discrete: a variable which has clear steps between its possible values (e.g. 1, 2, 3, 4, 5, ...)

Construct and interpret stem-and-leaf diagrams, box-and-whisker plots, histograms and cumulative frequency graphs

1.2 Stem-and-leaf diagrams

The datafile on cereals gives ratings on a scale of 0-100.

This is known as raw data.

Raw data: data that have not been categorized, arranged or organized

To organize the data, you can use a stem-and-leaf diagram.

Stem = tens digit; Leaf = units digit

The stem-and-leaf diagram for the ratings data is shown below. The key shows what the stems and leaves mean.
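The construction described above can be sketched in a few lines of Python. This is only an illustration: the ratings list is hypothetical (the cereal datafile itself is not reproduced in these notes), and the function name `stem_and_leaf` is made up for this sketch.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return an ordered stem-and-leaf diagram as a list of text rows."""
    leaves_by_stem = defaultdict(list)
    # Sorting first means the leaves in each stem come out in order.
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # stem = tens digit, leaf = units digit
        leaves_by_stem[stem].append(leaf)

    rows = []
    # Include every stem between the smallest and largest, even if it is empty.
    for stem in range(min(leaves_by_stem), max(leaves_by_stem) + 1):
        leaves = leaves_by_stem.get(stem, [])
        leaf_text = " ".join(str(leaf) for leaf in leaves)
        # The number in brackets is the frequency of leaves in the stem.
        rows.append(f"{stem} | {leaf_text} ({len(leaves)})")
    return rows


# Hypothetical ratings on a 0-100 scale (an assumption, not the notes' data).
ratings = [34, 41, 29, 34, 58, 41, 36, 50, 23, 47]

diagram = stem_and_leaf(ratings)
for row in diagram:
    print(row)
print("Key: 3 | 4 means 34")
```

Leaving out the `sorted` call would give the unordered version of the diagram, with leaves in the order the data were recorded.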


Unordered stem-and-leaf diagram

Brackets: the number in brackets is the frequency of leaves in each stem

Ordered stem-and-leaf diagram


1.3 Histograms

Classes: the groups into which the data are divided

Grouped frequency distribution: a table showing the frequency of each class

Class boundaries: the real endpoints of the classes

Example 1: Based on the table above, give the class boundaries for the first and second class

117.5 ≤ 𝑥 < 126.5
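As a rough illustration of how class boundaries follow from class limits, the sketch below assumes a class written as 118-126 for values recorded to the nearest unit. The table for Example 1 is not reproduced in these notes, so these limits are an assumption chosen to match the boundaries quoted above, and `class_boundaries` is a hypothetical helper.

```python
def class_boundaries(lower_limit, upper_limit, precision=1):
    """Return (lower boundary, upper boundary) for a class of rounded values.

    Values recorded to the nearest `precision` could really lie up to half
    a step below the lower limit or half a step above the upper limit.
    """
    half_step = precision / 2
    return lower_limit - half_step, upper_limit + half_step


# Assumed class limits 118-126, recorded to the nearest unit.
low, high = class_boundaries(118, 126)
print(f"{low} <= x < {high}")  # 117.5 <= x < 126.5
```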

Example 2: A group of 40 motorists was asked to state the ages at which they passed their driving test.

Age, a (years)   17-   20-   23-

Frequency        6     11    7

The class 17- means 17 ≤ a < 20.

Histogram

There are no spaces between the bars.

The area of each bar is proportional to the frequency.


Reason:

The area of the bar for the interval 20 ≤ h < 30 is not proportional to the frequency.

The height for the interval 20 ≤ h < 30 was plotted wrongly.

A height of 2.5 is correct.

The areas of the five blocks are 15, 25, 55, 30 and 25. These areas are in the ratio 3 : 5 : 11 : 6 : 5, which is the same as the ratio of the frequencies.


Formula:

$$\text{height} = \frac{\text{frequency}}{\text{width of class}}$$

(Heights are known as frequency densities.)
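A minimal Python sketch of the frequency density calculation follows. It assumes a class 20 ≤ h < 30 with frequency 25; that frequency is an assumption inferred from the block areas quoted earlier in these notes, not a value stated directly, and `frequency_density` is a hypothetical helper name.

```python
def frequency_density(frequency, lower_boundary, upper_boundary):
    """Height of a histogram bar: frequency divided by class width."""
    width = upper_boundary - lower_boundary
    return frequency / width


# Assumed class 20 <= h < 30 with an assumed frequency of 25.
height = frequency_density(25, 20, 30)
print(height)  # 2.5

# Multiplying the height by the class width recovers the frequency,
# which is why the area of the bar is proportional to the frequency.
area = height * (30 - 20)
print(area)  # 25.0
```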

Example:

1.4 Cumulative frequency graphs (an alternative way to represent continuous data)

Cumulative frequencies are plotted against the upper class boundaries.
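The points for a cumulative frequency graph can be worked out as in the sketch below. It uses only the first two classes of Example 2 (17 ≤ a < 20 and 20 ≤ a < 23), because the upper boundary of the third class is not given in these notes. The helper `cumulative_points` is hypothetical.

```python
from itertools import accumulate


def cumulative_points(boundaries, frequencies):
    """Pair each class boundary with the cumulative frequency up to it.

    `boundaries` holds one more value than `frequencies`: the lowest
    boundary (cumulative frequency 0) followed by each upper boundary.
    """
    running_totals = [0] + list(accumulate(frequencies))
    return list(zip(boundaries, running_totals))


# First two classes of Example 2 only.
boundaries = [17, 20, 23]
frequencies = [6, 11]

points = cumulative_points(boundaries, frequencies)
print(points)  # [(17, 0), (20, 6), (23, 17)]
```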


The points on the cumulative frequency graph have been joined with a smooth curve.

If the points are joined with straight lines, we assume that the observations in each class are evenly spread over the range of values in that class.

Extra information about the data could suggest that a curve is appropriate.
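Joining the points with straight lines amounts to linear interpolation within each class. The sketch below reads a value off such a graph for a given cumulative frequency. The points are those for the first two classes of Example 2, and `estimate_value` is a hypothetical helper, so treat this as an illustration of the even-spread assumption rather than a prescribed method.

```python
def estimate_value(points, target):
    """Read off the value at a cumulative frequency using straight lines.

    `points` is a list of (boundary, cumulative frequency) pairs in
    increasing order. Observations are assumed to be evenly spread
    within each class.
    """
    for (x0, cf0), (x1, cf1) in zip(points, points[1:]):
        if cf0 <= target <= cf1 and cf1 > cf0:
            fraction = (target - cf0) / (cf1 - cf0)
            return x0 + fraction * (x1 - x0)
    raise ValueError("target cumulative frequency is outside the graph")


# Points from the first two classes of Example 2.
points = [(17, 0), (20, 6), (23, 17)]

# A cumulative frequency of 3 is halfway through the first class.
estimate = estimate_value(points, 3)
print(estimate)  # 18.5
```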

Past year question