elementary statistics for foresters lecture 2 socrates/erasmus program @ wau spring semester...

Elementary statistics for foresters

Lecture 2

Socrates/Erasmus Program @ WAU

Spring semester 2005/2006

Descriptive statistics

• Data grouping (frequency distribution)

• Graphical data presentation (histogram, polygon, cumulative histogram)

• Measures of location (mean, quadratic mean, weighted mean, median, mode)

• Measures of dispersion (range, variance, standard deviation, coefficient of variation)

• Measures of asymmetry

• Descriptive statistics are used to summarize or describe characteristics of a known set of data.

• They are used when we want to describe or summarize data clearly and concisely, using graphical and/or numerical methods.

• For example: we can consider everybody in the class as a group to be described. Each person can be a source of data for such an analysis.

• Characteristics of these data may be, for example, age, weight, height, sex, country of origin, etc.

• Closer-to-forestry example: we can consider all pine stands in central Poland as a group to be characterized.

• Each stand can be described by its area, age, site index, average height, QMD, volume per hectare, volume increment per hectare per year, amount of carbon sequestered, species composition, damage index, ...

Frequency distribution

x_i     n_i   cumulative n_i    p_i     cumulative p_i
  4      23         23          0.092       0.092
  6      82        105          0.328       0.420
  8      73        178          0.292       0.712
 10      45        223          0.180       0.892
 12      24        247          0.096       0.988
 14       2        249          0.008       0.996
 16       1        250          0.004       1.000
Total   250                     1.000

• A frequency distribution is statistical material (measurements) ordered into classes (bins) that are built according to the values of the investigated variable.

• How to build it?

– determine classes (values/mid-points and class limits), depending on variable type

– classify each unit/measurement to the appropriate class

– sum units in each class

• Practical issues:

– number of classes should be between 6 and 16

– classes should have identical widths

– middle-class values/class mid-points should be chosen in such a way that they are easy to manipulate
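The steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the procedure used in the lecture: the function name `frequency_distribution`, the sample diameters, and the choice of class width 2 with the first mid-point at 4 are all invented for the example.

```python
from collections import Counter


def frequency_distribution(values, first_midpoint, width):
    """Group measurements into equal-width classes.

    Returns rows (mid-point, count, cumulative count, relative frequency,
    cumulative relative frequency), ordered by mid-point.
    Classes that contain no measurement are not listed.
    """
    lower_edge = first_midpoint - width / 2
    counts = Counter()
    for v in values:
        # Class index: number of whole class widths between the lower edge and v.
        k = int((v - lower_edge) // width)
        counts[first_midpoint + k * width] += 1

    total = len(values)
    rows, cumulative = [], 0
    for midpoint in sorted(counts):
        cumulative += counts[midpoint]
        rows.append((midpoint, counts[midpoint], cumulative,
                     counts[midpoint] / total, cumulative / total))
    return rows


# Hypothetical diameters (cm), invented for illustration only.
diameters = [3.4, 4.9, 5.0, 5.8, 6.1, 6.9, 7.2, 8.8, 9.5, 12.3]
table = frequency_distribution(diameters, first_midpoint=4, width=2)
for row in table:
    print(row)
```

A value lying exactly on a class limit (5.0 here) is assigned to the upper class; that is a convention chosen for this sketch and should be fixed in advance for real data.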

Graphical description of data

• Pictures are very informative and can tell the entire story about the data.

• We can use different plots for different sorts of variables, for example bar plots (histograms), pie charts, and box plots.

[Figure: histogram for dk]

[Figure: polygon]

[Figure: cumulative histogram]

Numerical data description

Sums and their properties
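As a reminder, the standard summation rules that the formulas in this lecture rely on are shown here, with c a constant and i running from 1 to N. This is a generic list and may differ from the one presented in the lecture.

```latex
\sum_{i=1}^{N} c = N c, \qquad
\sum_{i=1}^{N} c\,x_i = c \sum_{i=1}^{N} x_i, \qquad
\sum_{i=1}^{N} (x_i + y_i) = \sum_{i=1}^{N} x_i + \sum_{i=1}^{N} y_i
```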

Measures of location

• Arithmetic mean

• Quadratic mean

• Weighted mean

• Median

• Mode

• other

Arithmetic mean
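For reference, the usual definition of the arithmetic mean is given here for individual observations and for a frequency distribution with class mid-points x_i and class counts n_i. The symbol μ follows the notation used later in this lecture, and the numbers are the totals quoted in the sample calculations (Σ n_i x_i = 1950, N = 250).

```latex
\mu = \frac{1}{N} \sum_{i=1}^{N} x_i
\qquad \text{or, for grouped data,} \qquad
\mu = \frac{\sum_i n_i x_i}{N} = \frac{1950}{250} = 7.80
```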

Quadratic mean
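The quadratic mean is the square root of the mean of the squared values; in forestry it appears, for example, as the quadratic mean diameter (QMD). The symbol μ_q is an assumption made here, and the numbers are the totals quoted in the sample calculations (Σ n_i x_i² = 16596, N = 250).

```latex
\mu_q = \sqrt{\frac{1}{N} \sum_{i=1}^{N} x_i^2}
\qquad \text{or, for grouped data,} \qquad
\mu_q = \sqrt{\frac{\sum_i n_i x_i^2}{N}} = \sqrt{\frac{16596}{250}} \approx 8.15
```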

Properties of the mean
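Properties of the arithmetic mean that are commonly listed are shown here, with c a constant and a an arbitrary reference value. This is a generic selection and may differ from the list presented in the lecture.

```latex
\sum_{i=1}^{N} (x_i - \mu) = 0, \qquad
\sum_{i=1}^{N} (x_i - \mu)^2 \le \sum_{i=1}^{N} (x_i - a)^2 \ \text{ for any } a, \qquad
\mu_{x + c} = \mu_x + c, \qquad
\mu_{c x} = c\,\mu_x
```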

Weighted mean
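A weighted mean multiplies each value by a weight and divides the total by the sum of the weights. The sketch below assumes this standard definition; the function name `weighted_mean` and the stand heights and areas are hypothetical values invented for the example.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Hypothetical example: mean height of three stands, weighted by stand area.
heights_m = [18.0, 22.0, 25.0]   # average stand heights (invented)
areas_ha = [2.0, 5.0, 3.0]       # stand areas used as weights (invented)
print(weighted_mean(heights_m, areas_ha))
```

With class counts as weights and class mid-points as values, the same function gives the arithmetic mean of a frequency distribution.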

Median

• If observations of a variable are ordered by value, the median value corresponds to the middle observation in that ordered list.

• The median value corresponds to a cumulative percentage of 50% (i.e., 50% of the values are below the median and 50% of the values are above the median).

Median

• The position of the median is calculated by the following formula:
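One common convention, given here as an assumption because the lecture's own formula may differ, places the median at position (N + 1)/2 in the ordered list. When N is even, this position falls between two observations and their average is taken. Here x_(k) denotes the k-th smallest observation.

```latex
\text{position of } \mu_e = \frac{N + 1}{2}, \qquad
\mu_e =
\begin{cases}
x_{\left(\frac{N+1}{2}\right)} & N \text{ odd} \\[4pt]
\dfrac{x_{\left(\frac{N}{2}\right)} + x_{\left(\frac{N}{2}+1\right)}}{2} & N \text{ even}
\end{cases}
```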

Median

• How to calculate it?

• If the detailed values are available, sort the data and take the middle value

• If the frequency distribution is available, use the following formula:
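Both cases can be sketched in Python. The grouped version uses the usual linear interpolation inside the class that contains the N/2-th observation, which is an assumption about the formula intended here. The function names are hypothetical, and the class limits are assumed to lie half a class width either side of each mid-point.

```python
def median_raw(values):
    """Median of individual observations: middle value of the sorted list."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    # Even number of observations: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


def median_grouped(midpoints, counts, width):
    """Median interpolated inside the class holding the N/2-th observation."""
    total = sum(counts)
    if total == 0:
        raise ValueError("empty frequency distribution")
    half = total / 2
    cumulative = 0
    for x, n in zip(midpoints, counts):
        if n > 0 and cumulative + n >= half:
            lower_limit = x - width / 2  # assumed class limit
            return lower_limit + (half - cumulative) / n * width
        cumulative += n
    raise ValueError("median class not found")


# Class mid-points and counts of the frequency distribution in this lecture.
midpoints = [4, 6, 8, 10, 12, 14, 16]
counts = [23, 82, 73, 45, 24, 2, 1]
print(round(median_grouped(midpoints, counts, width=2), 2))  # 7.55
```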

Mode

• The mode is the most frequently observed data value.

• There may be no mode if no value appears more than any other.

• There may also be two (bimodal), three (trimodal), or more modes (multimodal).

• In the case of grouped frequency distributions, the modal class is the class with the largest frequency.

• If there is no exact mode available in the data file, you can calculate its value by using:

– an approximate Pearson formula

– an interpolation
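Both approaches can be sketched in Python. The versions below are assumptions about which formulas are meant: Pearson's empirical relation mode ≈ 3 · median − 2 · mean, and the common interpolation inside the modal class based on the counts of the two neighbouring classes. The function names are hypothetical, and the class limits are assumed to lie half a class width either side of each mid-point.

```python
def mode_pearson(mean, median):
    """Pearson's empirical approximation: mode ~ 3 * median - 2 * mean."""
    return 3 * median - 2 * mean


def mode_interpolated(midpoints, counts, width):
    """Interpolate the mode inside the modal class from neighbouring counts."""
    m = counts.index(max(counts))  # modal class (the first one if tied)
    before = counts[m - 1] if m > 0 else 0
    after = counts[m + 1] if m + 1 < len(counts) else 0
    d1 = counts[m] - before        # excess over the preceding class
    d2 = counts[m] - after         # excess over the following class
    if d1 + d2 == 0:
        return midpoints[m]
    lower_limit = midpoints[m] - width / 2  # assumed class limit
    return lower_limit + d1 / (d1 + d2) * width


# Class mid-points and counts of the frequency distribution in this lecture.
midpoints = [4, 6, 8, 10, 12, 14, 16]
counts = [23, 82, 73, 45, 24, 2, 1]
print(round(mode_interpolated(midpoints, counts, width=2), 2))  # 6.74
```

The two methods generally give different values. Pearson's relation is only a rough guide and is usually recommended for unimodal, moderately skewed distributions.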

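Relationship between measures

[Figure: distribution curve f(x) with the mode μo, the median μe and the mean μ marked in that order along the x axis]

[Figure: the same measures in the reverse order along the x axis: μ, μe, μo]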

Sample calculations

Measures of dispersion

• Range

• Variance

• Standard deviation

• Coefficient of variation

Range and variance

• The range is the difference between the highest and the lowest value in the data set

• The variance is the average of the squared differences between the data values and the arithmetic mean

Variance

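$$\sigma^2 = \frac{\sum_i (x_i - \mu)^2}{N} = \frac{\sum_i x_i^2 - \frac{\left(\sum_i x_i\right)^2}{N}}{N}$$

$$\sigma^2 = \frac{\sum_i n_i x_i^2 - \frac{\left(\sum_i n_i x_i\right)^2}{N}}{N}$$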

Variance

$$\sum_i (x_i - \mu)^2 = \min$$

$$\sigma^2_{cx} = c^2 \sigma^2_x$$

Standard deviation and coefficient of variation
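The standard deviation is the square root of the variance, and the coefficient of variation expresses the standard deviation as a percentage of the arithmetic mean. The abbreviation CV is an assumption made here; σ and μ follow the notation used elsewhere in this lecture.

```latex
\sigma = \sqrt{\sigma^2}, \qquad
\mathrm{CV} = \frac{\sigma}{\mu} \cdot 100\%
```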

Sample calculations

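$$\sum_i n_i x_i = 1950, \qquad \sum_i n_i x_i^2 = 16596, \qquad N = 250$$

$$\sigma^2 = \frac{16596 - \frac{1950^2}{250}}{250} = \frac{16596 - \frac{3802500}{250}}{250} = \frac{16596 - 15210}{250} = 5.544$$

$$\sigma = \sqrt{5.544} = 2.35, \qquad \frac{\sigma}{\mu} \cdot 100\% = \frac{2.35}{7.80} \cdot 100\% = 30.1\%$$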
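The same calculation can be checked with a short Python sketch. It assumes class mid-points 4, 6, ..., 16 with the class counts of the frequency distribution in this lecture, which reproduce the totals 1950 and 16596. The variable names are chosen for this example only.

```python
import math

# Assumed class mid-points and the class counts of the frequency distribution.
midpoints = [4, 6, 8, 10, 12, 14, 16]
counts = [23, 82, 73, 45, 24, 2, 1]

N = sum(counts)
sum_nx = sum(n * x for x, n in zip(midpoints, counts))
sum_nx2 = sum(n * x * x for x, n in zip(midpoints, counts))

mean = sum_nx / N
variance = (sum_nx2 - sum_nx ** 2 / N) / N   # divides by N, not N - 1
std_dev = math.sqrt(variance)
cv_percent = std_dev / mean * 100

print(N, sum_nx, sum_nx2)                    # 250 1950 16596
print(round(mean, 2), round(variance, 3), round(std_dev, 2))
print(round(cv_percent, 1))
```

With unrounded intermediate values the coefficient of variation comes out at about 30.2%. The figure of 30.1% results from dividing the rounded values 2.35 and 7.80.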

Measures of asymmetry

• Skewness is a measure of the degree of asymmetry of a distribution.

• If the left tail is more pronounced than the right tail, the distribution has negative skewness.

• If the reverse is true, it has positive skewness.

• If the two are equal, it has zero skewness.

Skewness

• Skewness can be calculated as a distance between mean and mode expressed in standard deviations:
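Written out, this is (μ − μo) / σ, with μ the mean, μo the mode and σ the standard deviation. A minimal Python sketch follows. The function name is hypothetical, and the example takes the mode as the mid-point of the most frequent class (6), which is an assumption; an interpolated mode would give a different value.

```python
def skewness_mean_mode(mean, mode, std_dev):
    """Distance between mean and mode, expressed in standard deviations."""
    if std_dev <= 0:
        raise ValueError("standard deviation must be positive")
    return (mean - mode) / std_dev


# Mean and standard deviation from the sample calculations in this lecture;
# the mode is assumed to be the mid-point of the modal class.
skew = skewness_mean_mode(mean=7.80, mode=6, std_dev=2.35)
print(round(skew, 2))  # 0.77
```

A positive result means the mean lies to the right of the mode, so the right tail is the more pronounced one.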

Acknowledgements

• This presentation was made thanks to the support and contribution of dr Lech Wróblewski
