Data Domains and Introduction to Statistics Chemistry 243

Upload: melanie-wolfram

Post on 14-Dec-2015


Page 1: Data Domains and Introduction to Statistics Chemistry 243

Data Domains and Introduction to Statistics

Chemistry 243

Page 2: Data Domains and Introduction to Statistics Chemistry 243

Instrumental methods and what they measure

Electromagnetic methods

Electrical methods

Photons are modulated by sample

Page 3: Data Domains and Introduction to Statistics Chemistry 243

Instruments are translators

Convert physical or chemical properties that we cannot directly observe into information that we can interpret.

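$$T = \frac{P}{P_0}$$

$$A = \varepsilon b c = -\log T = \log\frac{P_0}{P}$$

(P_0: incident radiant power; P: transmitted radiant power; b: path length; c: concentration)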
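As an illustration of an instrument acting as a translator, the sketch below turns two detector readings into an absorbance via A = -log T = log(P_0/P), and then into a concentration by inverting Beer's law, A = εbc. The readings, the molar absorptivity, and the path length are hypothetical values chosen only for the example.

```python
import math

def transmittance(p: float, p0: float) -> float:
    """T = P / P0: fraction of the incident radiant power that passes the sample."""
    return p / p0

def absorbance(p: float, p0: float) -> float:
    """A = -log10(T) = log10(P0 / P)."""
    return -math.log10(transmittance(p, p0))

def concentration(a: float, epsilon: float, b: float) -> float:
    """Invert Beer's law, A = epsilon * b * c, for the concentration c."""
    return a / (epsilon * b)

# Hypothetical readings: 100 units reach the detector without sample, 25 with it.
A = absorbance(p=25.0, p0=100.0)

# Assumed molar absorptivity (L mol^-1 cm^-1) and path length (cm).
c = concentration(A, epsilon=1.2e4, b=1.0)

print(f"A = {A:.3f}, c = {c:.2e} mol/L")
```

The detector only ever reports radiant power; absorbance and concentration are interpretations layered on top of that raw signal.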

Page 4: Data Domains and Introduction to Statistics Chemistry 243

Sometimes multiple translations are needed

Thermometer:

Bimetallic coil converts temperature to physical displacement

Scale converts angle of the pointer to an observable value of meaning

adapted from C.G. Enke, The Art and Science of Chemical Analysis, 2001.

http://upload.wikimedia.org/wikipedia/commons/d/d2/Bimetaal.jpg

http://upload.wikimedia.org/wikipedia/commons/2/26/Bimetal_coil_reacts_to_lighter.gif

http://static.howstuffworks.com/gif/home-thermostat-thermometer.jpg

Thermostat: Displacement used to activate switch

Page 5: Data Domains and Introduction to Statistics Chemistry 243

Components in translation

Page 6: Data Domains and Introduction to Statistics Chemistry 243

Data domains

Information is encoded and transferred between domains

Non-electrical domains: beginning and end of a measurement

Electrical domains: intermediate data collection and processing

Page 7: Data Domains and Introduction to Statistics Chemistry 243

Data domains

[Figure: Quantity to be measured → Initial conversion device → Intermediate quantity 1 → Intermediate conversion device → Intermediate quantity 2 → Readout conversion device → Number]

Example: Emission intensity → PMT → Current → Resistor → Voltage (V = iR) → Digital voltmeter → Number

Often viewed on a GUI (graphical user interface)
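The PMT, resistor, and digital voltmeter chain can be sketched as three small functions, one per conversion device. This is only an illustrative model: the linear PMT response, the sensitivity value, the resistance, and the three-decimal readout are all assumptions, not specifications from the slides.

```python
def pmt_current(intensity_w: float, sensitivity_a_per_w: float) -> float:
    """Initial conversion: emission intensity (W) -> current (A), assuming a linear PMT."""
    return intensity_w * sensitivity_a_per_w

def resistor_voltage(current_a: float, resistance_ohm: float) -> float:
    """Intermediate conversion: current -> voltage, V = iR."""
    return current_a * resistance_ohm

def voltmeter_reading(voltage_v: float, digits: int = 3) -> float:
    """Readout conversion: voltage -> the number shown on the display."""
    return round(voltage_v, digits)

# Hypothetical values for each stage of the chain.
intensity = 2.0e-9      # W reaching the PMT
sensitivity = 1.0e3     # A/W, assumed PMT sensitivity
resistance = 1.0e6      # ohm, assumed load resistor

current = pmt_current(intensity, sensitivity)     # non-electrical -> electrical domain
voltage = resistor_voltage(current, resistance)   # stays in the electrical domain
reading = voltmeter_reading(voltage)              # electrical -> number for the user

print(f"i = {current:.2e} A, V = {voltage:.3f} V, displayed: {reading}")
```

Each function boundary corresponds to one transfer of information between data domains.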

Page 8: Data Domains and Introduction to Statistics Chemistry 243

Electrical domains

Analog signals: magnitude of voltage, current, charge, or power; continuous in both amplitude and time.

Time-domain signals: time relationship of signal fluctuations (not amplitudes), such as frequency, pulse width, and phase.

Digital information: data encoded in only two discrete levels; a simplification for transmission and storage of information, which can be recombined with great accuracy and precision. The heart of modern electronics.

Page 9: Data Domains and Introduction to Statistics Chemistry 243

Digital and analog signals

Analog signals: magnitude of voltage, current, charge, or power; continuous in both amplitude and time.

Digital information: data encoded in only discrete levels.

Page 10: Data Domains and Introduction to Statistics Chemistry 243

Analog-to-digital conversion: limited by the bit resolution of the ADC

4-bit card has 2^4 = 16 discrete binary levels

8-bit card has 2^8 = 256 discrete binary levels

32-bit card has 2^32 = 4,294,967,296 discrete binary levels

Common today

Maximum resolution comes from full use of ADC voltage range.

Trade-offs: more bits is usually slower and more expensive.
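A minimal sketch of how bit depth and voltage range set ADC resolution. The 0 to 10 V input range and the simple truncating quantizer are assumptions made for illustration; real converters differ in details such as rounding and bipolar ranges.

```python
def adc_levels(bits: int) -> int:
    """Number of discrete binary levels for an ADC with the given bit depth."""
    return 2 ** bits

def lsb_size(v_min: float, v_max: float, bits: int) -> float:
    """Smallest voltage step the ADC can distinguish over its input range."""
    return (v_max - v_min) / adc_levels(bits)

def quantize(v: float, v_min: float, v_max: float, bits: int) -> int:
    """Map a voltage to an integer code, clipped to the valid code range."""
    levels = adc_levels(bits)
    code = int((v - v_min) / (v_max - v_min) * levels)
    return max(0, min(levels - 1, code))

for bits in (4, 8, 32):
    print(f"{bits}-bit: {adc_levels(bits)} levels, "
          f"step = {lsb_size(0.0, 10.0, bits):.3e} V on an assumed 0-10 V range")

# A signal that fills the range spreads over many codes; a small signal does not.
print(quantize(5.0, 0.0, 10.0, 8))   # mid-range signal
print(quantize(0.1, 0.0, 10.0, 8))   # signal using only 1% of the range
```

The second `quantize` call shows why full use of the ADC voltage range matters: a signal confined to 1% of the range lands on only the lowest few codes, so most of the available levels go unused.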

K.A. Rubinson, J.F. Rubinson, Contemporary Instrumental Analysis, 2000.

Page 11: Data Domains and Introduction to Statistics Chemistry 243

Byte prefixes

Kilo (K): about 1000 (2^10 = 1,024)

Mega (M): about a million (2^20 = 1,048,576)

Giga (G): about a billion (2^30 = 1,073,741,824)

Page 12: Data Domains and Introduction to Statistics Chemistry 243

Serial and parallel binary encoding

(serial): slow; not digital; outdated

"Serial-coded binary" data: fast; used between instruments

Binary parallel ("parallel digital" data): very fast; used within an instrument

Page 13: Data Domains and Introduction to Statistics Chemistry 243

Introductory statistics

Statistical handling of data is essential because it establishes the significance of a result.

The ability or inability to definitively state that two values are statistically different has profound ramifications in data interpretation.

Measurements are not absolute, so robust methods for establishing run-to-run reproducibility and instrument-to-instrument variability are essential.

Page 14: Data Domains and Introduction to Statistics Chemistry 243

Introductory statistics: Mean, median, and mode

Population mean (μ): average value of replicate data

$$\mu = \lim_{N\to\infty} \frac{1}{N}\sum_{i=1}^{N} x_i = \lim_{N\to\infty} \frac{x_1 + x_2 + x_3 + \dots + x_N}{N}$$

Median (μ_1/2): ½ of the observations are greater; ½ are less

Mode (μ_md): most probable value

For a symmetrical distribution:

$$\mu = \mu_{1/2} = \mu_{md}$$

Real distributions are rarely perfectly symmetrical
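For a finite data set, the three measures of central tendency can be computed with Python's standard `statistics` module. The replicate readings below are hypothetical, chosen so that the mean differs slightly from the median and mode, as is typical of a distribution that is not perfectly symmetrical.

```python
import statistics

# Hypothetical replicate readings of the same sample.
data = [0.501, 0.498, 0.502, 0.501, 0.499, 0.501, 0.500]

mean = statistics.mean(data)      # arithmetic average of the replicates
median = statistics.median(data)  # middle value: half above, half below
mode = statistics.mode(data)      # most frequently observed value

print(f"mean = {mean:.4f}, median = {median:.3f}, mode = {mode:.3f}")
```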

Page 15: Data Domains and Introduction to Statistics Chemistry 243

Statistical distribution

Often follows a Gaussian functional form

Page 16: Data Domains and Introduction to Statistics Chemistry 243

Introductory statistics: Standard deviation and variance

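Standard deviation (σ):

$$\sigma = \sqrt{\lim_{N\to\infty} \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$

Variance (σ²):

$$\sigma^2 = \lim_{N\to\infty} \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$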

Page 17: Data Domains and Introduction to Statistics Chemistry 243

Gaussian distribution

Common distribution with well-defined statistics:

68.3% of data is within 1σ of the mean

95.5% within 2σ

99.7% within 3σ

$$y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
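The coverage figures for 1, 2, and 3 standard deviations (roughly 68%, 95%, and 99.7%) follow from integrating the Gaussian curve. A short sketch using the error function from the standard library, where the fraction within k standard deviations is erf(k/√2):

```python
import math

def gaussian(x: float, mu: float, sigma: float) -> float:
    """Gaussian probability density with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def fraction_within(k: float) -> float:
    """Fraction of a Gaussian population lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within {k} sigma: {100 * fraction_within(k):.2f}%")
```

The result does not depend on the particular values of the mean or standard deviation, which is why the same percentages apply to any Gaussian-distributed measurement.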

Page 18: Data Domains and Introduction to Statistics Chemistry 243

Statistical distribution

50 absorbance measurements of an identical sample (Table a1-1, Skoog)

Let's go to Excel

Page 19: Data Domains and Introduction to Statistics Chemistry 243

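But no one has an infinite data set …

$$s = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N-1}}$$

$$s^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N-1}$$

$$\bar{x} = \frac{\sum_{i=1}^{N} x_i}{N}$$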
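A minimal sketch of the finite-sample formulas, written out explicitly so the N − 1 denominator is visible. The data values are hypothetical.

```python
import math

def sample_mean(xs):
    """x-bar: sum of the measurements divided by N."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """s^2: squared deviations from x-bar, divided by N - 1 (not N)."""
    xbar = sample_mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)

def sample_std(xs):
    """s: square root of the sample variance."""
    return math.sqrt(sample_variance(xs))

# Hypothetical replicate measurements.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print(f"mean = {sample_mean(data):.3f}")
print(f"s^2  = {sample_variance(data):.3f}")
print(f"s    = {sample_std(data):.3f}")
```

Dividing by N − 1 rather than N gives a slightly larger value for small N; the difference becomes negligible as the number of measurements grows.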

Page 20: Data Domains and Introduction to Statistics Chemistry 243

Standard deviation and variance, continued

s is a measure of precision (magnitude of indeterminate error)

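Other useful definitions:

Standard error of the mean:

$$\sigma_m = \frac{\sigma}{\sqrt{N}}$$

Additivity of variances from independent sources:

$$\sigma_{total}^2 = \sigma_1^2 + \sigma_2^2 + \sigma_3^2 + \dots + \sigma_n^2$$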
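Two quick numerical illustrations, with hypothetical values: the standard error of the mean shrinks as 1/√N, and independent standard deviations combine as the square root of the sum of their squares (their variances add).

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Standard error of the mean: sigma / sqrt(N)."""
    return sigma / math.sqrt(n)

def total_sigma(*sigmas: float) -> float:
    """Combine independent standard deviations: variances add, then take the root."""
    return math.sqrt(sum(s ** 2 for s in sigmas))

# Quadrupling the number of measurements halves the standard error.
print(standard_error(2.0, 4))
print(standard_error(2.0, 16))

# Two independent noise sources with sigma = 3 and sigma = 4 (arbitrary units).
print(total_sigma(3.0, 4.0))
```

Because the improvement goes as √N, each additional halving of the standard error costs four times as many measurements.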

Page 21: Data Domains and Introduction to Statistics Chemistry 243

Confidence intervals

In most situations μ cannot be determined; it would require an infinite number of measurements.

Statistically, we can establish a confidence interval around x̄ in which μ is expected to lie with a certain level of probability.

Page 22: Data Domains and Introduction to Statistics Chemistry 243

Calculating confidence intervals

We cannot absolutely determine σ, so when s is not a good estimate of it (small number of samples), use:

$$\text{CI for } \mu = \bar{x} \pm \frac{t\,s}{\sqrt{N}}$$

Note that t approaches z as N increases.

 

2-sided t values

Page 23: Data Domains and Introduction to Statistics Chemistry 243

Example of confidence interval determination for a smaller number of samples

Given the following values for serum carcinoembryonic antigen (CEA) measurements, determine the 95% confidence interval: 16.9 ng/mL, 12.7 ng/mL, 15.3 ng/mL, 17.2 ng/mL

Sample mean = 15.525 ng/mL, s = 2.059733 ng/mL

With N = 4 there are N − 1 = 3 degrees of freedom, so the two-sided t value at 95% is 3.182, and ts/√N = 3.182 × 2.059733 / 2 = 3.277.

Answer: 15.525 ± 3.277 ng/mL, but when you consider sig figs you get: 16 ± 3 ng/mL
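The CEA example can be checked with a few lines of Python. The t value is hard-coded from a standard two-sided t table for N − 1 = 3 degrees of freedom at 95% confidence (an assumption of this sketch; the standard library has no t-distribution function). The half-width depends on the t value used; with 3 degrees of freedom it is about 3.28, which rounds to the same 16 ± 3 ng/mL.

```python
import math
import statistics

# CEA measurements from the example, in ng/mL.
cea = [16.9, 12.7, 15.3, 17.2]

n = len(cea)
xbar = statistics.mean(cea)   # sample mean
s = statistics.stdev(cea)     # sample standard deviation (N - 1 denominator)

# Two-sided t for 95% confidence and N - 1 = 3 degrees of freedom,
# taken from a standard t table (assumed value, not computed here).
t_95 = 3.182

half_width = t_95 * s / math.sqrt(n)

print(f"mean = {xbar:.3f} ng/mL, s = {s:.6f} ng/mL")
print(f"95% CI: {xbar:.3f} +/- {half_width:.3f} ng/mL")
```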

Page 24: Data Domains and Introduction to Statistics Chemistry 243

Propagation of errors

How do errors at each step contribute to the final result?

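$$x = f(p, q, r, \dots)$$

$$dx_i = f(dp_i, dq_i, dr_i, \dots)$$

$$dx = \left(\frac{\partial x}{\partial p}\right)_v dp + \left(\frac{\partial x}{\partial q}\right)_v dq + \left(\frac{\partial x}{\partial r}\right)_v dr + \dots$$

$$s_x^2 = \left(\frac{\partial x}{\partial p}\right)^2 s_p^2 + \left(\frac{\partial x}{\partial q}\right)^2 s_q^2 + \left(\frac{\partial x}{\partial r}\right)^2 s_r^2 + \dots$$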
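A minimal numerical sketch of error propagation, assuming the errors in the inputs are independent: each partial derivative is estimated by a central difference, multiplied by the standard deviation of that input, and the squared contributions are summed. The helper name `propagate`, the V = iR test function, and the 1% uncertainties are illustrative choices, not part of the original slides.

```python
import math

def propagate(f, values, sigmas, rel_step=1e-6):
    """Estimate s_x for x = f(p, q, r, ...) from
    s_x^2 = sum_k (df/dp_k)^2 * s_k^2, assuming independent input errors.
    Partial derivatives are approximated by central differences."""
    variance = 0.0
    for k, (v, s) in enumerate(zip(values, sigmas)):
        h = abs(v) * rel_step or rel_step   # step size; fall back if v == 0
        up = list(values)
        dn = list(values)
        up[k] = v + h
        dn[k] = v - h
        dfdp = (f(*up) - f(*dn)) / (2 * h)  # numerical partial derivative
        variance += (dfdp * s) ** 2
    return math.sqrt(variance)

def ohm_voltage(i, r):
    """V = iR."""
    return i * r

# Hypothetical measurement: i = 2.0 uA and R = 1.0 Mohm, each known to 1%.
s_v = propagate(ohm_voltage, [2.0e-6, 1.0e6], [0.02e-6, 0.01e6])

print(f"V = {ohm_voltage(2.0e-6, 1.0e6):.3f} V, s_V = {s_v:.4f} V")
```

For a simple product such as V = iR, this reproduces the familiar rule that relative uncertainties add in quadrature: two 1% inputs give about a 1.4% uncertainty in the result.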