data analysis techniques - nanyang technological university analysis... · 2016-04-15 · data...

Data Analysis Techniques

Dr. G. Roshan Deen

Assistant Professor

Natural Sciences and Science Education Academic Group

National Institute of Education

Nanyang Technological University

Outline

• Definition of Error & Importance

• Examples of Data

• Accuracy and Precision

• Types of Error

• Standard Deviation and Histogram

• Error Propagation

Every measurement has a degree of uncertainty

associated with the measurement. This

uncertainty is called the Error.

The term Error is the difference between the

measured value and true value.

The determination of this uncertainty requires

additional effort on the part of the user. Without

an estimate of the error of these numbers they

are largely useless.

Error

http://phys.columbia.edu/~tutorial/introduction/tut_e_1_1.html

How many apples are there in this

picture?

How many particles are there in

this picture?

EXACT Answer

NOT an EXACT Answer

You will deal mostly with measurements of this kind in your experiments!

An Introduction to

Error Analysis

A very readable text, but with

enough mathematics to be rigorous.

The cover says it all – emphasizes

how important error analysis can be

in the real world.

Data & Data Analysis

Data Analysis

– process of looking at and summarizing data to extract useful information and develop conclusions.

– Exploratory Data Analysis - discovering new features in the data.

– Confirmatory Data Analysis - confirming or falsifying existing hypotheses.

Data refers to a collection of organized information, e.g. the set of numbers collected in an experiment. It may consist of numbers, words, or images of measurements or observations.

Examples of Data

Student examination scores on a given test

Student test answers on a given test

Number of mosquitoes collected in a trap

A robot

Video of shuttlecocks flying through the air

Video of interviews

Measurements of wind speed

The speed of light in vacuum

A graph of intensity versus angle

Qualitative

Quantitative

• There is no general method for treating qualitative

data.

• For quantitative data, there are some standard

techniques.

• Not all standard numerical techniques are applicable

to all types of research.

• Be careful when trying to quantify qualitative data.

Numerical Data Analysis

Accuracy & Precision

Accuracy is the proximity of answer to the true value and

absence of systematic error.

Precision is the proximity of repeated values to each other

Systematic

Error

Random

Error

The influence of Random Error can be minimized by

averaging a large set of data.

Types of Errors

Systematic Error is often due to the measuring

instrument used. For example, calibration error and

zero error.

Random Error arises due to accidental errors in

measurements. For example, when measurements

are estimated to the nearest division, lack of

perfection in the observer, random fluctuations in the

object being measured, electronic noise, etc.

Systematic Errors cannot be estimated by

repeating the experiment with the same

equipment.

Consider the example of measuring an oscillation

period with a stopwatch. Suppose that the

stopwatch is running slow. This will lead to

underestimation of all the time results. Systematic

errors, unlike random errors, shift the results

always in one direction.
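The difference between the two kinds of error can be illustrated with a small simulation. This is only a sketch: the true period, the noise level, and the 5% slow-running rate are hypothetical numbers chosen for illustration, and `simulate_readings` is a helper invented here.

```python
import random
import statistics

def simulate_readings(true_period, n, noise_sd, rate=1.0, seed=0):
    """Return n simulated stopwatch readings of an oscillation period.

    rate < 1 models a stopwatch that runs slow (a systematic error);
    noise_sd is the spread of the random error on each reading.
    """
    rng = random.Random(seed)
    return [rate * true_period + rng.gauss(0.0, noise_sd) for _ in range(n)]

true_period = 2.00  # seconds (hypothetical value)

# Same random noise in both cases; only the stopwatch rate differs.
good = simulate_readings(true_period, n=10_000, noise_sd=0.05)
slow = simulate_readings(true_period, n=10_000, noise_sd=0.05, rate=0.95)

mean_good = statistics.mean(good)
mean_slow = statistics.mean(slow)

print(f"accurate stopwatch: mean = {mean_good:.3f} s")
print(f"slow stopwatch:     mean = {mean_slow:.3f} s")
```

Averaging many readings brings the first mean close to the true period, while the second mean stays shifted low no matter how many readings are taken.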

Errors that can be reliably estimated by repeating measurements are called Random.

Random error can be easily quantified using the

standard deviation formula. Most of the error

analysis in your lab will involve the estimation of

random errors.

There are always random fluctuations

(noise) in the system you are trying to

measure.

How to give a numerical value for the error

in a measurement?

The Standard Deviation

Example: Determination of Boiling Point of a Liquid

You measure 32 °C. You repeat the experiment many times and collect a set of results.

Trial No.             1     2     3     4     5
Measured value (°C)   31.9  32.1  31.8  32.2  32.1

The best estimate is the average of the 5 measured values.

$$\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$$

$$\bar{x} = \frac{31.9 + 32.1 + 31.8 + 32.2 + 32.1}{5} = 32.02$$

What is the error of this measurement?

Taking the difference between the highest measured value and

the average will give the maximum deviation. The error is

overestimated in this method.

$$32.2 - 32.02 = 0.18$$

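A better estimate is the typical deviation from the average. The Standard Deviation is given as:

$$\sigma_x = \sqrt{\frac{1}{N}\sum_{i=1}^{N} d_i^2}$$

where $d_i = x_i - \bar{x}$ is the deviation of each value from the average. For the five values above:

$$\sigma_x = \sqrt{\frac{0.01 + 0.01 + 0.05 + 0.03 + 0.01}{5}} = 0.15$$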

The boiling point of the liquid = 32.0 °C ± 0.2 °C
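The arithmetic of this example can be checked with a few lines of Python. This is a minimal sketch that uses the five tabulated values and the 1/N form of the standard deviation, which is the form that reproduces the 0.15 quoted above; the variable names are chosen here for illustration.

```python
import math

# The five boiling-point trials from the table above (degrees C).
values = [31.9, 32.1, 31.8, 32.2, 32.1]

n = len(values)
mean = sum(values) / n

# Maximum deviation: highest value minus the average (overestimates the error).
max_dev = max(values) - mean

# Standard deviation: root of the mean squared deviation (1/N form).
deviations = [x - mean for x in values]
sigma = math.sqrt(sum(d * d for d in deviations) / n)

print(f"mean              = {mean:.2f}")
print(f"maximum deviation = {max_dev:.2f}")
print(f"sigma             = {sigma:.2f}")
```

Note that `statistics.stdev` in the Python standard library divides by N - 1 instead of N and therefore returns a slightly larger value; `statistics.pstdev` matches the 1/N form used here.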

Histograms

Take a large number of measurements and count the number of occurrences of each value.

31.9, 32.1, 31.8, 32.2, 32.1, 31.8, 32.1, 32.4, 32.1, 32.2, 32.3

Measured Value    31.9  32.1  31.8  32.2  32.3  32.4
Occurrences, n    1     4     2     2     1     1
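The counting step can be sketched in Python with `collections.Counter`, using the eleven readings listed above. The text-mode bar chart is only an illustration of what the histogram shows.

```python
from collections import Counter

# The eleven readings listed above (degrees C).
readings = [31.9, 32.1, 31.8, 32.2, 32.1, 31.8, 32.1, 32.4, 32.1, 32.2, 32.3]

# Count how many times each distinct value occurs.
counts = Counter(readings)

# Print a crude text histogram, one row per measured value.
for value in sorted(counts):
    print(f"{value:.1f}  {'#' * counts[value]}  ({counts[value]})")
```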

[Figures: two histograms of number of occurrences versus temperature (°C), covering 31.8 to 32.4.]

[Figure: Normal Distribution of the Average of 4 Points, showing G(x) and G(x_av) plotted against x.]

The uncertainty of an average decreases as more data points are

averaged.

It is a property of the normal distribution that 68% of the

measurements lie within one standard deviation on either side of

the mean.
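Both statements can be checked numerically. The sketch below draws normally distributed readings with a hypothetical true value and spread, then compares single readings with averages of 4 points; for independent readings the spread of an average of N points is expected to be the single-reading spread divided by the square root of N.

```python
import random
import statistics

rng = random.Random(1)   # fixed seed so the run is repeatable
mu, sigma = 32.0, 0.15   # hypothetical true value and single-reading spread
trials = 20_000

# Single readings, and averages of 4 independent readings each.
singles = [rng.gauss(mu, sigma) for _ in range(trials)]
averages = [
    sum(rng.gauss(mu, sigma) for _ in range(4)) / 4
    for _ in range(trials)
]

spread_single = statistics.pstdev(singles)
spread_average = statistics.pstdev(averages)

# Fraction of single readings within one standard deviation of the mean.
fraction_within = sum(abs(x - mu) <= sigma for x in singles) / trials

print(f"spread of single readings : {spread_single:.3f}")
print(f"spread of averages of 4   : {spread_average:.3f}")
print(f"fraction within one sigma : {fraction_within:.3f}")
```

The spread of the averages should come out at about half that of the single readings, and the fraction within one standard deviation should be close to 0.68.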

Even a crude estimate of error is better than no estimate

at all

Significant Digits / Figures

All reported values must be given with the correct number of

significant figures.

Errors are usually reported to either one or two significant digits.

Report the main value to the same decimal place as the error.

Error = 0.83

You are certain about 8 but less certain about 3. Error = 0.8

Heat Capacity = 5.86 kJ K⁻¹ mol⁻¹, Error = 0.4

Heat Capacity = 5.9 ± 0.4 kJ K⁻¹ mol⁻¹

Always include units. If the quantity is dimensionless say so.
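The rounding rule can be sketched as a small helper. `format_with_error` is a hypothetical function written for this illustration; it assumes the error is reported to one significant digit and the main value is rounded to the same decimal place.

```python
import math

def format_with_error(value, error, unit=""):
    """Round error to one significant digit and value to the same decimal place."""
    if error <= 0:
        raise ValueError("error must be positive")
    # Position of the leading digit of the error, e.g. -1 for 0.4 or 0.83.
    exponent = math.floor(math.log10(error))
    decimals = max(0, -exponent)
    rounded_error = round(error, -exponent)
    rounded_value = round(value, -exponent)
    text = f"{rounded_value:.{decimals}f} ± {rounded_error:.{decimals}f}"
    return f"{text} {unit}".strip()

heat_capacity = format_with_error(5.86, 0.4, "kJ K^-1 mol^-1")
print(heat_capacity)
```

An error whose leading digit rounds up to the next power of ten (for example 0.96) would need extra handling, which this sketch leaves out.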

Error Propagation

Consider the ideal gas equation. We need to determine the

number of moles of gas in a sample.

$$PV = nRT \quad\Longrightarrow\quad n = \frac{PV}{RT}$$

We need to measure pressure, volume and temperature to get the value of n. Each measured quantity has its own error.

To determine the error in n, we need to

propagate the error in the individual

measurement.

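We want to measure a quantity V. The result is f(V). The measurement gives a best value $V_0$ with an error above and an error below it:

$$V_0 - \Delta V_- \le V \le V_0 + \Delta V_+$$

where $\Delta V_+$ is the error above $V_0$ and $\Delta V_-$ is the error below $V_0$.

The desired property f is then within the range (for f increasing with V):

$$f(V_0 - \Delta V_-) \le f(V) \le f(V_0 + \Delta V_+)$$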

If a function only contains addition and subtraction operations,

the error can be calculated as:

$$\sigma_f = \sqrt{\sigma_x^2 + \sigma_y^2 + \dots}$$

where $\sigma_x$, $\sigma_y$ are the errors in x, y.

If a function only contains multiplication and division operations,

the error can be calculated as:

$$\frac{\sigma_f}{f} = \sqrt{\left(\frac{\sigma_x}{x_0}\right)^2 + \left(\frac{\sigma_y}{y_0}\right)^2 + \dots}$$
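The two rules can be written as short Python helpers and applied to the ideal gas example. This is a sketch under stated assumptions: the function names are invented here, the pressure, volume and temperature readings and their errors are hypothetical, and R is treated as exact.

```python
import math

def error_sum(*errors):
    """Error of f = x + y - ...: absolute errors add in quadrature."""
    return math.sqrt(sum(e * e for e in errors))

def relative_error_product(*pairs):
    """Relative error of f = x * y / z ...: relative errors add in quadrature.

    Each pair is (value, error).
    """
    return math.sqrt(sum((err / val) ** 2 for val, err in pairs))

R = 8.314                    # J K^-1 mol^-1, treated as exact
P, sP = 101_325.0, 500.0     # Pa (hypothetical reading and error)
V, sV = 1.00e-3, 0.02e-3     # m^3 (hypothetical reading and error)
T, sT = 298.0, 0.5           # K (hypothetical reading and error)

# n = PV / (RT) contains only multiplication and division.
n = P * V / (R * T)
rel = relative_error_product((P, sP), (V, sV), (T, sT))
sn = n * rel

print(f"n = {n:.4f} mol, relative error = {rel:.3f}, error = {sn:.4f} mol")
```

With these numbers the volume term dominates, so improving the volume measurement would reduce the error in n the most.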

Example:

Let us measure the molar heat of solvation of LiCl in water. This

experiment involves:

(a) Weighing an amount of LiCl

(b) Measuring the volume of water

(c) Measuring the temperature change when the material dissolves.

The heat of solvation (H) is expressed in terms of the three

measurements as:

$$H = \frac{C\,V\,\Delta T}{m/M}$$

The function H contains multiplication and division operations.

The variables we need to consider are mass of LiCl, volume of

water, and temperature.

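$$\frac{\sigma_H}{H} = \sqrt{\left(\frac{\sigma_m}{m}\right)^2 + \left(\frac{\sigma_V}{V}\right)^2 + \left(\frac{\sigma_{\Delta T}}{\Delta T}\right)^2}$$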

Example:

Let us measure the molar heat of solvation of LiCl in water.

If two separate masses were weighed and added to the solvent the

equation is given as:

$$H = \frac{C\,V\,\Delta T}{(m_1 + m_2)/M}$$

The first step is to calculate the uncertainty in m = m1 + m2 using

the error propagation rule as:

$$\sigma_m = \sqrt{\sigma_{m_1}^2 + \sigma_{m_2}^2}$$

Then use the total mass and its calculated error, and determine H and its error σ_H.
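The two-step calculation can be sketched as follows. All measured values and errors are hypothetical. C is taken here to be a heat capacity per unit volume of solution and M the molar mass of LiCl; both are treated as exact, which is an assumption of this sketch and is not stated on the slides.

```python
import math

C = 4.18     # J K^-1 mL^-1, assumed heat capacity per unit volume (treated as exact)
M = 42.39    # g mol^-1, molar mass of LiCl (treated as exact)

m1, s_m1 = 1.05, 0.01    # g (hypothetical)
m2, s_m2 = 1.10, 0.01    # g (hypothetical)
V, s_V = 100.0, 0.5      # mL (hypothetical)
dT, s_dT = 4.2, 0.1      # K (hypothetical temperature change)

# Step 1: addition rule for the total mass m = m1 + m2.
m = m1 + m2
s_m = math.sqrt(s_m1 ** 2 + s_m2 ** 2)

# Step 2: multiplication/division rule for H = C V dT / (m / M).
H = C * V * dT / (m / M)
rel_H = math.sqrt((s_m / m) ** 2 + (s_V / V) ** 2 + (s_dT / dT) ** 2)
s_H = H * rel_H

print(f"m = {m:.2f} g, error = {s_m:.3f} g")
print(f"H = {H / 1000:.1f} kJ/mol, error = {s_H / 1000:.1f} kJ/mol")
```

In this made-up data set the temperature change contributes most of the relative error, so the mass step has only a small effect on the final result.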

Summary

Error does not mean a mistake or blunder but refers to

imprecision in measurements.

Random or indeterminate errors are inherent to all measurements

and cause symmetrical spread of data around the mean.

The sources of systematic error are instrumental errors, method errors, and personal errors.

Numerical results given without error analysis are as good as useless.

Thank you for your attention!
