1. Representation of Data
Statistics 1 notes for CIE A-level
1
Representation of Data
Select a suitable way of presenting raw statistical data, and discuss advantages and/or
disadvantages that particular representations may have
1.1 Introduction
Data: pieces of numerical and other information
Variable: a property that is observed or measured for each member of a sample when data are collected
2 types of variable: a) Qualitative: non-numerical
b) Quantitative: numerical
2 types of quantitative variable:
a) Continuous: a variable which can take any value in a given range (e.g. 150-160, 160-170)
b) Discrete: a variable which has clear steps between its possible values (e.g. 1, 2, 3, 4, 5, ...)
Construct and interpret stem-and-leaf diagrams, box-and-whisker plots, histograms and
cumulative frequency graphs
1.2 Stem-and-leaf diagrams
Datafile: cereals rated on a scale of 0-100
This is known as raw data.
Raw data: not categorized, not arranged, scattered, not organized
To organize the data, you can use a stem-and-leaf diagram.
Stem = tens digit, Leaf = units digit
The stem-and-leaf diagram for the ratings data is shown below. The key shows what the stem and
leaves mean.
Unordered stem-and-leaf diagram
Brackets: frequency of leaves in each stem
Ordered stem-and-leaf diagram
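The construction described above (stem = tens digit, leaf = units digit, bracketed frequency per stem) can be sketched in Python. This is only an illustrative sketch: the ratings list is hypothetical, since the cereal datafile is not reproduced in these notes, and the function names `stem_and_leaf` and `render` are made up for this example. It assumes a non-empty list of non-negative whole numbers.

```python
def stem_and_leaf(values, ordered=True):
    """Return {stem: [leaves]}, with stem = tens digits and leaf = units digit.

    Assumes a non-empty list of non-negative integers.
    """
    stems = {}
    for v in values:
        stem, leaf = divmod(v, 10)
        stems.setdefault(stem, []).append(leaf)
    diagram = {}
    # Include every stem from the smallest to the largest, even empty ones.
    for stem in range(min(stems), max(stems) + 1):
        leaves = stems.get(stem, [])
        diagram[stem] = sorted(leaves) if ordered else leaves
    return diagram


def render(diagram):
    """Format the diagram as text, with the leaf count of each stem in brackets."""
    lines = []
    for stem, leaves in diagram.items():
        leaf_text = " ".join(str(leaf) for leaf in leaves)
        lines.append(f"{stem} | {leaf_text} ({len(leaves)})")
    return "\n".join(lines)


# Hypothetical ratings, NOT the cereal data from the notes.
ratings = [34, 21, 45, 29, 33, 40, 37, 21, 58, 46]
print(render(stem_and_leaf(ratings)))
print("Key: 3 | 4 means 34")
```

Passing `ordered=False` keeps the leaves in the order they occur in the raw data, which corresponds to the unordered diagram.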
1.3 Histograms
Classes: groups of data
Grouped frequency distribution: the frequency of each class
Class boundaries: the real endpoints of the classes
Example 1: Based on the table above, give the class boundaries for the first and second class
117.5 ≤ 𝑥 < 126.5
Example 2: A group of 40 motorists was asked to state the ages at which they passed their
driving test
Age, a (years)   17-   20-   23-
Frequency         6    11     7
17 ≤ a < 20
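The two examples use different rules for class boundaries, and a short Python sketch may make the difference clearer. It rests on assumptions: the class label 118-126 is hypothetical, chosen only because it reproduces the interval 117.5 ≤ x < 126.5 (the original table is not shown), and the function names are invented for this illustration.

```python
def boundaries_rounded(lower, upper, unit=1):
    """Boundaries of a class whose values were rounded to the nearest `unit`."""
    half = unit / 2
    return lower - half, upper + half


def boundaries_truncated(lower, next_lower):
    """Boundaries of a class whose values were truncated, such as age in
    completed years: the class runs from its own lower limit up to the
    lower limit of the next class."""
    return lower, next_lower


# Hypothetical class label 118-126, measured to the nearest unit.
print(boundaries_rounded(118, 126))   # (117.5, 126.5)

# Age class "17-" followed by "20-", as in Example 2.
print(boundaries_truncated(17, 20))   # (17, 20)
```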
Histogram
Bars have no spaces between them
The area of each bar is proportional to the frequency
Reason:
The area of the bar for the interval 20 ≤ h < 30 is not proportional to the frequency.
The height for the interval 20 ≤ h < 30 was plotted wrongly.
A height of 2.5 is correct.
The areas of the five blocks are 15, 25, 55, 30 and 25. These areas are in the ratio
3 : 5 : 11 : 6 : 5, the same as the ratio of the frequencies.
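The ratio claim can be checked by dividing every area by their greatest common divisor. The snippet below is a small sketch of that check; the helper name `simplest_ratio` is made up for this illustration.

```python
from functools import reduce
from math import gcd


def simplest_ratio(values):
    """Divide each whole number by the greatest common divisor of them all."""
    common = reduce(gcd, values)
    return [v // common for v in values]


areas = [15, 25, 55, 30, 25]    # block areas quoted in the notes
print(simplest_ratio(areas))    # [3, 5, 11, 6, 5]
```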
Formula:
$\text{height} = \dfrac{\text{frequency}}{\text{width of class}}$ (Heights are known as frequency densities)
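As a quick numerical check of the formula, here is a minimal Python sketch. The first two calls use the motorists' classes 17 ≤ a < 20 and 20 ≤ a < 23 from Example 2. The last call assumes a frequency of 25 for the class 20 ≤ h < 30; that value is not stated in the notes and is chosen only because it agrees with the height of 2.5 quoted earlier. The function name is illustrative.

```python
def frequency_density(frequency, lower, upper):
    """Height of a histogram bar: frequency divided by class width."""
    width = upper - lower
    if width <= 0:
        raise ValueError("upper boundary must exceed lower boundary")
    return frequency / width


# Motorists, Example 2: 6 people in 17 <= a < 20, 11 people in 20 <= a < 23.
print(frequency_density(6, 17, 20))    # 2.0
print(frequency_density(11, 20, 23))   # about 3.67

# Assumed frequency of 25 for 20 <= h < 30 (not stated in the notes).
print(frequency_density(25, 20, 30))   # 2.5
```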
Example:
1.4 Cumulative frequency graphs (an alternative way to represent continuous data)
Cumulative frequencies are plotted against the upper class boundaries
In the cumulative frequency graph, the points have been joined with a smooth curve.
If the points are joined with straight lines, we assume that the observations in each class are
evenly spread over the range of values in that class.
A curve is appropriate when extra information about the data suggests it.
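Joining the points with straight lines is the same as linear interpolation within each class. The sketch below illustrates this using only the first two classes of the motorists' table (17 ≤ a < 20 and 20 ≤ a < 23), because the upper boundary of the last class is not given in these notes. The function names are invented for this example.

```python
from itertools import accumulate


def cumulative_points(lowest_boundary, upper_boundaries, frequencies):
    """Points (upper class boundary, cumulative frequency), starting at zero."""
    totals = list(accumulate(frequencies))
    return [(lowest_boundary, 0)] + list(zip(upper_boundaries, totals))


def estimate_below(points, x):
    """Estimate the cumulative frequency at x by straight-line interpolation,
    i.e. assuming observations are evenly spread within each class."""
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            return c0 + (c1 - c0) * (x - x0) / (x1 - x0)
    raise ValueError("x lies outside the plotted range")


points = cumulative_points(17, [20, 23], [6, 11])
print(points)                        # [(17, 0), (20, 6), (23, 17)]
print(estimate_below(points, 21.5))  # 11.5
```

Here 21.5 is halfway through the class 20 ≤ a < 23, so half of its 11 observations are added to the 6 below 20.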
Past year question