biostatistics - avcr.czbaloun.entu.cas.cz/png/biostatistics.pdf · biostatistics. aims of...

Biostatistics

Upload: lyngoc

Post on 04-Feb-2018


TRANSCRIPT

Page 1: Biostatistics - avcr.czbaloun.entu.cas.cz/png/Biostatistics.pdf · Biostatistics. Aims of statistics • (1) ... Franta Petr 74 Hajžmanová Tereza 72 ... StatIntro.ppt Author: Vojta

Biostatistics


Aims of statistics

• (1) Descriptive statistics: to summarize data, i.e. to condense the information contained in many individual values into a small number of parameters or a diagram


Name Points

Anton Jan 70.5

Balzarová Martina 72.5

Bendová Lenka 65.5

Blabolil Petr 71

Blažek Petr 87

Břendová Veronika 67.5

Čermáková Helena 88

Černíková Zuzana 94

Černý Jiří

Chalupecký František 59

Choma Michal 76.5

Chundelová Daniela 51

Doanová Tereza 69

Dortová Markéta 60.5

Dufek Luboš 69.5

Dvořáková Veronika 72

Effenberková Lenka 62

Franta Petr 74

Hajžmanová Tereza 72

Havlan Luboš 57.5

Hejna Ondřej 76

Holá Hana 81

Horák Jan

Jalovecká Marie 98.5

Jarolímová Zuzana 65

Jarošová Andrea 80

Jenčová

Jerkovičová Diana 69

Jonáková Martina 91

Jůzlová Zuzana 85

Compare:

The average number of points was 74.5, whereas the minimum value was 28 and the maximum value was 100.

[Figure: Frequency diagram (histogram of frequencies). x-axis: No. of points, 20 to 110; y-axis: No. of observations, 0 to 22.]
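The summary and the frequency diagram above can be sketched in Python. This is a minimal illustration that assumes only the 27 scores legible in the table (students without a recorded score are left out). The slide's figures of 74.5, 28 and 100 evidently refer to the whole class, so the results for this excerpt will differ. The bin width of 10 points and the names `points` and `frequency_table` are choices made for this sketch, not part of the original slides.

```python
from collections import Counter
from statistics import mean

# Scores legible in the table above (an excerpt, not the whole class).
points = [
    70.5, 72.5, 65.5, 71, 87, 67.5, 88, 94, 59, 76.5, 51, 69, 60.5, 69.5,
    72, 62, 74, 72, 57.5, 76, 81, 98.5, 65, 80, 69, 91, 85,
]


def frequency_table(values, bin_width=10):
    """Count how many values fall into each bin [lower, lower + bin_width)."""
    counts = Counter(int(v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Three numbers replace the whole table: mean, minimum, maximum.
print("mean:", mean(points), "min:", min(points), "max:", max(points))

# A text version of the frequency diagram: one row per bin.
for lower, count in frequency_table(points).items():
    print(f"{lower:3d}-{lower + 10:3d}: {'#' * count}")
```

The three summary numbers are easy to read but hide the shape of the distribution; the frequency table keeps more of it at the cost of more numbers.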


The fewer parameters I use to summarize the data,

• the more transparent and clearer the result is

• but the greater the loss of information


Aims of statistics

Population and sample

• (2) Inferential statistics: making an inference about a (statistical) population from a sample

• Some (statistical) populations are too large [or potentially infinite]; consequently, I am not able to sample all the individuals (sampling units)

• What can I say about the amount of Cd in the blood of all cuscuses in PNG when I took blood from just 10 specimens?


Inferential statistics is common in biology

I don't want to know whether the number of species caught in the primary forest light trap was on average higher than at the river during the ten nights of our project, but whether there would be a difference any time I did a similar project again

• If this is to be science, the experiments have to be reproducible


Population and random sample

• Sampling; sampling design

• Random sample: every individual (sampling unit) has to have the same probability of being sampled, independently of whether another individual has been sampled

• Tables and generators of (pseudo)random numbers
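Drawing a random sample with a pseudorandom number generator might look like the sketch below. It assumes the population can be enumerated as a list of sampling units. The unit labels and the helper name `draw_random_sample` are hypothetical, and `random.sample` draws without replacement, so every unit has the same chance of ending up in the sample.

```python
import random


def draw_random_sample(units, n, seed=None):
    """Return n distinct sampling units chosen by a pseudorandom generator."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    return rng.sample(units, n)


# Hypothetical population of 100 numbered sampling units.
population = [f"unit_{i}" for i in range(1, 101)]
sample = draw_random_sample(population, 10, seed=42)
print(sample)
```

Recording the seed is one way to document how the sample was chosen, which helps when the sampling design has to be described later.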


Taking a random sample usually isn't trivial. It is never a selection of "typical" individuals. It works reasonably well in agricultural experiments.

[Figure: grid with rows numbered 1 to 3 and columns numbered 1 to 6]


Basic statistical parameters (characteristics)

• We usually denote the size of the population by N and the size of the sample by n

• Parameters of the population are estimated from the sample

• Characteristics of location and variability:

• Means, median and mode

• Means are defined for quantitative data (i.e. data on a ratio or interval scale)


Arithmetic mean of a sample

$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n}$$


Geometric mean

• n-th root of the product of n values (for a sample here)

$$\sqrt[n]{\prod_{i=1}^{n} X_i}$$

Compare with the mean of log(x)
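The link to the mean of log(x) can be checked numerically: the logarithm of the geometric mean equals the arithmetic mean of the logarithms. The sketch below assumes strictly positive values; the three numbers and the function names are made up for illustration.

```python
import math


def geometric_mean(values):
    """n-th root of the product of n positive values."""
    return math.prod(values) ** (1 / len(values))


def geometric_mean_via_logs(values):
    """Back-transformed arithmetic mean of the natural logarithms."""
    mean_log = sum(math.log(v) for v in values) / len(values)
    return math.exp(mean_log)


values = [1.0, 10.0, 100.0]  # hypothetical data
print(geometric_mean(values), geometric_mean_via_logs(values))
```

Both routes give the same result up to rounding error. The log route is usually preferable for long data series, because the raw product can overflow.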


Median [also used for ordinal-scale data]

• One half of the individual values lies below the median and the other half above it (in an infinite population, the probability that a randomly selected value lies above the median is 0.5, and likewise for below).


Upper and lower quartile

• One quarter of the individual observations is above the upper quartile, one quarter is below the lower quartile


Distinguish between the meaning of the mean and the median

Company A   Company B
 8000        7000
 9000        7500
11000        8000
12000        8500    Median
15000       11000
18000       18000
20000       39000
13286       14143    Mean

Example: salaries paid in two companies
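The salary example can be reproduced with Python's standard `statistics` module. The variable names are chosen for this sketch; the figures are those from the table.

```python
from statistics import mean, median

company_a = [8000, 9000, 11000, 12000, 15000, 18000, 20000]
company_b = [7000, 7500, 8000, 8500, 11000, 18000, 39000]

for name, salaries in (("A", company_a), ("B", company_b)):
    # The mean is pulled upwards by a few large salaries; the median is not.
    print(f"Company {name}: mean = {mean(salaries):.0f}, median = {median(salaries)}")
```

Company B has the higher mean but the lower median: one very large salary raises the mean, while the typical employee in B earns less than in A.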


Mode: the most common value in the data. In continuous data it is the "peak" in the frequency diagram.


[Figure: four frequency distributions, each with the positions of its mean and median marked]


Characteristics of variability

• 1. Range is the difference between the maximum and the minimum

• 2. Interquartile range

• 3. Variance and standard deviation
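The first two characteristics can be sketched as follows. Several conventions exist for computing sample quartiles; this example assumes the "inclusive" interpolation method of `statistics.quantiles`, so other software may report slightly different quartiles for the same data. The function names are illustrative, and the data are the Company A and Company B salaries from the earlier slide.

```python
from statistics import quantiles


def value_range(values):
    """Difference between the maximum and the minimum."""
    return max(values) - min(values)


def interquartile_range(values):
    """Upper quartile minus lower quartile (inclusive interpolation)."""
    lower, _median, upper = quantiles(values, n=4, method="inclusive")
    return upper - lower


company_a = [8000, 9000, 11000, 12000, 15000, 18000, 20000]
company_b = [7000, 7500, 8000, 8500, 11000, 18000, 39000]

for name, salaries in (("A", company_a), ("B", company_b)):
    print(name, value_range(salaries), interquartile_range(salaries))
```

The range depends only on the two most extreme values, so the single salary of 39000 inflates it for Company B, whereas the interquartile range is barely affected.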


Variance: the average squared difference between a value and the mean

• population (μ is the population mean):

$$\sigma^2 = \frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}$$

• estimate based on the sample:

$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1}$$

n - 1 = df = degrees of freedom


Standard deviation ($s_x$, often also "s.d." or "S.D.") is the square root of the variance. It is a characteristic of variability.
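The two variance formulas and the standard deviation can be written out directly. This is a sketch with illustrative function names, applied to the Company A salaries from the earlier slide; the standard library's `statistics.pvariance`, `statistics.variance` and `statistics.stdev` implement the same definitions and serve as a cross-check.

```python
import math
import statistics


def population_variance(values):
    """Sum of squared deviations from the mean, divided by N."""
    mu = sum(values) / len(values)
    return sum((x - mu) ** 2 for x in values) / len(values)


def sample_variance(values):
    """Sum of squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


salaries = [8000, 9000, 11000, 12000, 15000, 18000, 20000]
s2 = sample_variance(salaries)
s = math.sqrt(s2)  # standard deviation, in the same units as the data
print(s2, s, statistics.stdev(salaries))
```

Dividing by n - 1 rather than n makes the sample estimate slightly larger, which compensates for the deviations being measured from the sample mean instead of the unknown population mean.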


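Standard error of the mean

• Characteristic of the precision of an estimate: how large the variability of means estimated from samples of this size would be

$$s_{\bar{x}} = \frac{s_x}{\sqrt{n}}$$

Here $s_{\bar{x}}$ expresses the precision of the estimated mean and $s_x$ the variability in the data.

We can increase the precision by increasing the sample size.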
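A short sketch of the formula shows how the precision depends on the sample size. The standard deviation of 10 and the sample sizes are hypothetical round numbers chosen so that the arithmetic is easy to follow; the function names are likewise illustrative.

```python
import math
import statistics


def standard_error_from_sd(sd, n):
    """Standard error of the mean for a sample of size n with standard deviation sd."""
    return sd / math.sqrt(n)


def standard_error(values):
    """Standard error of the mean computed from raw data."""
    return standard_error_from_sd(statistics.stdev(values), len(values))


# Same variability in the data (s = 10), increasing sample size.
for n in (25, 100, 400):
    print(n, standard_error_from_sd(10.0, n))
```

Because of the square root, the sample size has to be quadrupled to halve the standard error.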


Graphical summary: frequency diagram

[Figure: histogram (OHRAZENI 8v*21c) of POČET_SEMENÁČU (number of seedlings; axis label NO_SAPLING) with a fitted normal curve, POČET_SE = 21*100*normal(x, 314.8095, 173.2422). x-axis: 0 to 800; y-axis: No. of observations, 0 to 8.]


Box and whisker plot

[Figure: box plot (8v*21c) of NO_SAPLING. Median = 329; 25%-75% = (196, 363); non-outlier range = (93, 500); outliers and extremes are marked separately. y-axis: 0 to 800.]

Take care: box-and-whisker plots are now also used to display other summaries, such as the mean and standard deviation.