data analysis techniques - nanyang technological university analysis... · 2016-04-15 · data...
Data Analysis Techniques
Dr. G. Roshan Deen
Assistant Professor
Natural Sciences and Science Education Academic Group
National Institute of Education
Nanyang Technological University
Outline
• Definition of Error & Importance
• Examples of Data
• Accuracy and Precision
• Types of Error
• Standard Deviation and Histogram
• Error Propagation
Every measurement has a degree of uncertainty
associated with the measurement. This
uncertainty is called the Error.
The term Error is the difference between the
measured value and true value.
The determination of this uncertainty requires
additional effort on the part of the user. Without
an estimate of the error of these numbers they
are largely useless.
Error
http://phys.columbia.edu/~tutorial/introduction/tut_e_1_1.html
How many apples are there in this
picture?
How many particles are there in
this picture?
EXACT Answer
NOT an EXACT Answer
You will deal mostly with measurements
of this kind in your experiments!
An Introduction to
Error Analysis
A very readable text, but with
enough mathematics to be rigorous.
The cover says it all – emphasizes
how important error analysis can be
in the real world.
Data & Data Analysis
Data Analysis
– process of looking at and summarizing data to extract useful information and develop conclusions.
– Exploratory Data Analysis - discovering new features in the data.
– Confirmatory Data Analysis - confirming or falsifying existing hypotheses.
Data refers to a collection of
organized information, e.g. the set of
numbers collected in an experiment.
It may consist of numbers, words,
or images of measurements or
observations.
Examples of Data
Student examination scores on a given test
Student test answers on a given test
Number of mosquitoes collected in a trap
A robot
Video of shuttlecocks flying through the air
Video of interviews
Measurements of wind speed
The speed of light in vacuum
A graph of intensity versus angle
Qualitative
Quantitative
• There is no general method for treating qualitative
data.
• For quantitative data, there are some standard
techniques.
• Not all standard numerical techniques are applicable
to all types of research.
• Be careful when trying to quantify qualitative data.
Numerical Data Analysis
Accuracy & Precision
Accuracy is the closeness of a result to the true value; it
implies the absence of systematic error.
Precision is the closeness of repeated values to each other.
Systematic Error
Random Error
The influence of Random Error can be minimized by
averaging a large set of data.
Types of Errors
Systematic Error is often due to the measuring
instrument used. For example, calibration error and
zero error.
Random Error arises from accidental errors in
measurement. Examples include estimating readings
to the nearest division, imperfection of the
observer, random fluctuations in the object being
measured, and electronic noise.
Systematic Errors cannot be estimated by
repeating the experiment with the same
equipment.
Consider the example of measuring an oscillation
period with a stopwatch. Suppose that the
stopwatch is running slow. This will lead to
underestimation of all the time results. Systematic
errors, unlike random errors, shift the results
always in one direction.
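The stopwatch example can be illustrated numerically. The sketch below uses made-up numbers that do not come from the slides: a true period of 2.000 s, five hypothetical timing fluctuations that average to zero, and a stopwatch assumed to run 2% slow.

```python
from statistics import fmean

TRUE_PERIOD = 2.000  # seconds (hypothetical true value)
# Hypothetical random timing fluctuations in seconds, chosen to average to zero.
NOISE = [-0.02, 0.01, 0.03, -0.01, -0.01]
SLOW_FACTOR = 0.98   # assumption: the slow stopwatch registers 98% of elapsed time

# Readings from an accurate stopwatch scatter around the true period.
good_readings = [TRUE_PERIOD + e for e in NOISE]
# The slow stopwatch scales every reading down by the same factor.
slow_readings = [SLOW_FACTOR * t for t in good_readings]

mean_good = fmean(good_readings)
mean_slow = fmean(slow_readings)

print(f"accurate stopwatch: mean = {mean_good:.3f} s")
print(f"slow stopwatch:     mean = {mean_slow:.3f} s")
```

Averaging removes the random scatter in both cases, but the mean from the slow stopwatch stays below the true period no matter how many readings are taken.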
Errors that can be reliably estimated by repeating
measurements are called Random.
Random error can be easily quantified using the
standard deviation formula. Most of the error
analysis in your lab will involve the estimation of
random errors.
There are always random fluctuations
(noise) in the system you are trying to
measure.
How to give a numerical value for the error
in a measurement?
The Standard Deviation
Example: Determination of Boiling Point of a Liquid
You measure 32 °C. You repeat the experiment many times and
collect a set of results.
Trial No. 1 2 3 4 5
Measured value 31.9 32.1 31.8 32.2 32.1
The best estimate is the average of the 5 measured values.
$$ \bar{x} = \frac{1}{N} \sum_i x_i $$
$$ \bar{x} = \frac{31.9 + 32.1 + 31.8 + 32.2 + 32.1}{5} = 32.02 $$
What is the error of this measurement?
Taking the difference between the highest measured value and
the average will give the maximum deviation. The error is
overestimated in this method.
$$ 32.2 - 32.02 = 0.18 $$
The best estimate is to get the average deviation. The Standard
Deviation is given as:
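$$ \sigma_x = \sqrt{\frac{1}{N} \sum_i d_i^2} $$
where $d_i = x_i - \bar{x}$ is the deviation of each measured value from the average.
$$ \sigma_x = \sqrt{\frac{0.01 + 0.01 + 0.05 + 0.03 + 0.01}{5}} = 0.15 $$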
The boiling point of the liquid = 32.0 °C ± 0.2 °C
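The arithmetic of this example can be checked with a few lines of Python. This is a minimal sketch using the five tabulated readings. It divides by N, which reproduces the value 0.15 quoted here; many textbooks divide by N - 1 instead, which gives a slightly larger value.

```python
from math import sqrt
from statistics import fmean

readings = [31.9, 32.1, 31.8, 32.2, 32.1]  # boiling point trials, deg C

mean = fmean(readings)
deviations = [x - mean for x in readings]   # d_i = x_i - mean
max_deviation = max(readings) - mean        # overestimates the error
# Standard deviation in the 1/N form: root of the mean squared deviation.
sigma = sqrt(sum(d * d for d in deviations) / len(readings))

print(f"mean               = {mean:.2f}")
print(f"maximum deviation  = {max_deviation:.2f}")
print(f"standard deviation = {sigma:.2f}")
```

The three printed values are 32.02, 0.18 and 0.15, so the standard deviation is indeed smaller than the maximum deviation.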
Histograms
Take a large number of measurements and count the number of
occurrence of each value.
31.9, 32.1, 31.8, 32.2, 32.1, 31.8, 32.1, 32.4, 32.1, 32.2, 32.3
Measured Value 31.9 32.1 31.8 32.2 32.3 32.4
Occurrences, n 1 4 2 2 1 1
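Counting occurrences is easy to automate. The following sketch tallies the eleven readings listed above with the standard-library `Counter` and prints a simple text histogram; the text layout is only an illustrative choice.

```python
from collections import Counter

readings = [31.9, 32.1, 31.8, 32.2, 32.1, 31.8, 32.1, 32.4, 32.1, 32.2, 32.3]

# Map each distinct measured value to its number of occurrences.
counts = Counter(readings)

# Print one row per distinct value, in increasing order of temperature.
for value in sorted(counts):
    print(f"{value:.1f}  {'#' * counts[value]}  ({counts[value]})")
```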
[Figures: histograms of the number of occurrences versus temperature (°C), for values from 31.8 to 32.4]
[Figure: Normal Distribution of the Average of 4 Points, showing G(x) and G(x_av) plotted against x]
The uncertainty of an average decreases as more data points are
averaged.
It is a property of the normal distribution that 68% of the
measurements lie within one standard deviation on either side of
the mean.
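Both statements can be made quantitative. The sketch below assumes the standard result for independent measurements, that the uncertainty of an average of N points is σ/√N (this formula is not stated on the slides), and uses a hypothetical single-measurement standard deviation of 0.15. It also evaluates the fraction of a normal distribution lying within k standard deviations of the mean.

```python
from math import erf, sqrt

def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of the mean of n independent measurements."""
    return sigma / sqrt(n)

def fraction_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return erf(k / sqrt(2))

sigma = 0.15  # hypothetical single-measurement standard deviation
for n in (1, 4, 16, 64):
    print(f"N = {n:2d}: uncertainty of the average = {standard_error(sigma, n):.3f}")

print(f"within 1 sigma: {fraction_within(1):.3f}")
print(f"within 2 sigma: {fraction_within(2):.3f}")
```

Under this assumption, averaging 4 points halves the uncertainty, and each further factor of 4 in N halves it again.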
Even a crude estimate of error is better than no estimate
at all
Significant Digits / Figures
All reported values must be given with the correct number of
significant figures.
Errors are usually reported to either one or two significant digits.
Report the main value to the same decimal place as the error.
Error = 0.83
You are certain about 8 but less certain about 3. Error = 0.8
Heat Capacity = 5.86 kJ K⁻¹ mol⁻¹, Error = 0.4 kJ K⁻¹ mol⁻¹
Heat Capacity = 5.9 ± 0.4 kJ K⁻¹ mol⁻¹
Always include units. If the quantity is dimensionless say so.
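The rounding rule can be sketched as a small helper. The function name `format_result` is hypothetical, and the sketch assumes the error is reported to one significant digit; it does not handle the case where rounding pushes the error up to the next power of ten.

```python
from math import floor, log10

def format_result(value: float, error: float) -> str:
    """Round the error to one significant digit and the value to the same place."""
    if error <= 0:
        raise ValueError("error must be positive")
    exponent = floor(log10(error))   # decimal position of the error's leading digit
    decimals = max(0, -exponent)     # number of decimal places to print
    rounded_value = round(value, -exponent)
    rounded_error = round(error, -exponent)
    return f"{rounded_value:.{decimals}f} ± {rounded_error:.{decimals}f}"

print(format_result(5.86, 0.4) + " kJ K^-1 mol^-1")
print(format_result(5.86, 0.83))
```

The unit string is appended by the caller, since the helper knows nothing about the quantity being reported.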
Error Propagation
Consider the ideal gas equation. We need to determine the
number of moles of gas in a sample.
$$ PV = nRT $$
$$ n = \frac{PV}{RT} $$
We need to measure Pressure, Volume and
Temperature to get the value of n. Each
measured quantity has its own error.
To determine the error in n, we need to
propagate the errors of the individual
measurements.
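We want to measure a quantity V and use it to calculate the result f(V).
The measurement gives $V_0$, with an error $\sigma^{+}$ above $V_0$ and an error $\sigma^{-}$ below $V_0$:
$$ V_0 - \sigma^{-} \le V \le V_0 + \sigma^{+} $$
The desired property f is then within the range (written here for an f that increases with V):
$$ f(V_0 - \sigma^{-}) \le f(V) \le f(V_0 + \sigma^{+}) $$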
If a function only contains addition and subtraction operations,
the error can be calculated as:
$$ \sigma_f = \sqrt{\sigma_x^2 + \sigma_y^2 + \dots} $$
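For sums and differences, the individual errors are combined in quadrature (the square root of the sum of their squares). A minimal sketch follows; the helper name `sigma_sum` and the ruler readings are hypothetical illustrations, not values from the slides.

```python
from math import sqrt

def sigma_sum(*sigmas: float) -> float:
    """Error of a sum or difference: individual errors added in quadrature."""
    return sqrt(sum(s * s for s in sigmas))

# Hypothetical example: a length obtained as the difference of two ruler
# readings, each read to +/- 0.05 cm.
sigma_length = sigma_sum(0.05, 0.05)
print(f"sigma_length = {sigma_length:.3f} cm")
```

The combined error is larger than either individual error but smaller than their plain sum, because independent errors partly cancel.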
If a function only contains multiplication and division operations,
the error can be calculated as:
$$ \frac{\sigma_f}{f} = \sqrt{\left(\frac{\sigma_x}{x_0}\right)^2 + \left(\frac{\sigma_y}{y_0}\right)^2 + \dots} $$
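For products and quotients, it is the relative errors that add in quadrature. As a sketch, the rule can be applied to the ideal gas example n = PV/RT introduced earlier. All measured values and errors below are hypothetical, and the gas constant R is treated as exact.

```python
from math import sqrt

R = 8.314462618  # gas constant, J mol^-1 K^-1 (treated as exact)

def relative_error(*pairs):
    """Relative error of a product or quotient from (value, error) pairs."""
    return sqrt(sum((err / val) ** 2 for val, err in pairs))

# Hypothetical measurements as (value, error); not taken from the slides.
P = (101325.0, 500.0)   # pressure, Pa
V = (1.00e-3, 1.0e-5)   # volume, m^3
T = (298.0, 0.5)        # temperature, K

n = P[0] * V[0] / (R * T[0])       # number of moles
rel = relative_error(P, V, T)      # sigma_n / n
sigma_n = n * rel

print(f"n = {n:.4f} ± {sigma_n:.4f} mol (relative error {rel:.1%})")
```

With these numbers the volume measurement dominates the total, which shows where extra experimental care would pay off most.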
Example:
Let us measure the molar heat of solvation of LiCl in water. This
experiment involves:
(a) Weighing an amount of LiCl
(b) Measuring the volume of water
(c) Measuring the temperature change when the material dissolves.
The heat of solvation (H) is expressed in terms of the three
measurements as:
$$ H = \frac{C V \Delta T}{m / M} $$
The function H contains multiplication and division operations.
The variables we need to consider are mass of LiCl, volume of
water, and temperature.
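$$ \frac{\sigma_H}{H} = \sqrt{\left(\frac{\sigma_m}{m}\right)^2 + \left(\frac{\sigma_V}{V}\right)^2 + \left(\frac{\sigma_{\Delta T}}{\Delta T}\right)^2} $$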
Example:
Let us measure the molar heat of solvation of LiCl in water.
If two separate masses were weighed and added to the solvent the
equation is given as:
$$ H = \frac{C V \Delta T}{(m_1 + m_2) / M} $$
The first step is to calculate the uncertainty in m = m1 + m2 using
the error propagation rule as:
$$ \sigma_m = \sqrt{\sigma_{m_1}^2 + \sigma_{m_2}^2} $$
Then use the total mass and its calculated error to determine H
and its error $\sigma_H$.
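The two-step procedure can be sketched as follows. The slides do not define C and M, so the sketch assumes C is the heat capacity of water per unit volume and M is the molar mass of LiCl, both treated as exact. All measured values and errors are hypothetical.

```python
from math import sqrt

M_LICL = 42.39   # molar mass of LiCl, g/mol (assumed meaning of M)
C = 4.18         # assumed heat capacity of water per unit volume, J mL^-1 K^-1

# Hypothetical measurements and their errors; not taken from the slides.
m1, sigma_m1 = 1.000, 0.002   # first mass, g
m2, sigma_m2 = 1.500, 0.002   # second mass, g
V, sigma_V = 100.0, 0.5       # volume of water, mL
dT, sigma_dT = 5.00, 0.05     # temperature change, K

# Step 1: addition rule for the total mass m = m1 + m2.
m = m1 + m2
sigma_m = sqrt(sigma_m1**2 + sigma_m2**2)

# Step 2: multiplication/division rule for H = C V dT / (m / M).
H = C * V * dT / (m / M_LICL)
rel_H = sqrt((sigma_m / m) ** 2 + (sigma_V / V) ** 2 + (sigma_dT / dT) ** 2)
sigma_H = H * rel_H

print(f"m = {m:.3f} ± {sigma_m:.3f} g")
print(f"H = {H:.0f} ± {sigma_H:.0f} J/mol")
```

Keeping the mass step separate matters because m1 and m2 enter H only through their sum, so their errors must be combined with the addition rule before the relative errors are combined.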
Summary
Error does not mean a mistake or blunder but refers to
imprecision in measurements.
Random or indeterminate errors are inherent to all measurements
and cause a symmetrical spread of data around the mean.
The sources of systematic error are instrumental errors, method
errors, and personal errors.
Numerical results given without error analysis are as good as
useless.
Thank you for your attention!